[gmx-users] question about continuation using tpbconv and mdrun -cpi

Thu Jul 30 16:53:13 CEST 2009

Hi Mark,

Sorry to trouble you again!

I ran two tests on the same cluster, one on a single processor and one on
32 processors, using 4000 SPC/E waters (OPLS-AA force field). The
single-processor run gives exactly the same potential. However, the
32-processor run still shows some deviation in the potential. With "-reprod
yes", the potential is -190098 +- 457.199 for "Comparison" and
-190088 +- 467.225 for "Reference". With "-dlb no", it is -190116 +- 476.512
for "Comparison" and -190114 +- 483.749 for "Reference".
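For scale, the differences between the runs can be set against the reported +- values. The following is only a minimal sketch: it assumes the +- figure is a measure of the spread of the potential over the run (the post does not say how it was computed), and the helper name rel_diff is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: difference of two mean potentials, and that
# difference as a fraction of the Reference run's reported +- value.
rel_diff() {
    # $1 = Comparison mean, $2 = Reference mean, $3 = Reference +- value
    awk -v c="$1" -v r="$2" -v s="$3" \
        'BEGIN { printf "diff = %d, diff/spread = %.3f\n", c - r, (c - r) / s }'
}

rel_diff -190098 -190088 467.225   # runs with -reprod yes
rel_diff -190116 -190114 483.749   # runs with -dlb no
```

Under that assumption the means differ by a small fraction of the spread in both cases. This says nothing on its own about whether the two trajectories are identical, which is the property being tested here.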

The following are the job script lines used with "-reprod yes":
********************************************************************************
# Reference
grompp -f full.mdp -c initial.gro -p system.top -o full
mpirun -np 32 mdrun -deffnm full -reprod yes

# Comparison
grompp -f full.mdp -c initial.gro -p system.top -o full2
mpirun -np 32 mdrun -deffnm full2 -reprod yes -cpt 2 -maxh 0.2
mpirun -np 32 mdrun -deffnm full2 -reprod yes -cpi full2.cpt -append
********************************************************************************

Do you have some further ideas?

best wishes,
Baofu Qiao

Mark Abraham wrote:
> Baofu Qiao wrote:
>> Hi Mark,
>>
>> Thanks!
>> Because the maximum time for a single job is 24 hours on the
>> cluster I'm using, I want to find out the best way to continue
>> gmx jobs. How strong an effect does "mdrun -cpi" have? From the
>> mdrun help text, it seems that there may be some EXTRA energy frames,
>> but no extra frames in the trajectory file (.xtc). Am I
>> right?
>>
>> "mdrun -h
>> --> The contents will be binary identical (unless you use dynamic load
>> balancing), but for technical reasons there might be some extra energy
>> frames when using checkpointing (necessary for restarts without
>> appending)."
>
> The intent with GROMACS 4.x is for a user to be able to construct a
> .tpr with a very long simulation time, and perhaps constrain mdrun
> with -maxh (or rely on the cluster killing the job), and to use the
> information in the checkpoint file to restart correctly and perhaps to
> then use mdrun -append so that when the simulation is running
> smoothly, only one set of files needs to exist. Thus one doesn't need
> to trouble with using tpbconv correctly, crashes can restart
> transparently, etc. The old-style approach still works, however.
>
> Obviously you should (be able to) verify with mdrun -reprod that
> whatever approach you use when you construct your job scripts leads to
> simulations that are in principle reproducible. For production, don't
> use -reprod because you will want the better speed from dynamic load
> balancing, etc.
>
> Mark
>
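The restart workflow Mark describes (one long .tpr, a wall-time limit via -maxh, and continuation from the checkpoint with -append) could be sketched as a job-script fragment like the one below. This is an illustration rather than a tested production script: the function name build_mdrun_cmd and the 23.5-hour limit are assumptions chosen to fit a 24-hour queue, while the 32 ranks and the full2 prefix are taken from the job script earlier in this message.

```shell
#!/bin/sh
# Hypothetical helper: build the mdrun command line for either a fresh
# start or a continuation, depending on whether a checkpoint file exists.
build_mdrun_cmd() {
    # $1 = -deffnm prefix, $2 = number of MPI ranks, $3 = wall-time limit (hours)
    cmd="mpirun -np $2 mdrun -deffnm $1 -maxh $3"
    if [ -f "$1.cpt" ]; then
        # A checkpoint is present: continue from it and append to the
        # existing output files instead of starting a new set.
        cmd="$cmd -cpi $1.cpt -append"
    fi
    echo "$cmd"
}

# Print the command that would be run; a real job script would execute it.
build_mdrun_cmd full2 32 23.5
```

Submitting the same script repeatedly then starts the run the first time and continues it on every later submission, without regenerating the .tpr.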