Hi,<br><br>I have a 72-lipid DPPC (MARTINI) system that I ran for 400 ns in GROMACS 3.3.3. I picked a snapshot from the middle (~100 ns), so it should be equilibrated and able to run for several hundred nanoseconds. If I use GROMACS 4.0.3 (or 4.0.5) with 1, 2, or 4 processors, everything runs fine. However, if I use more than 4 processors (either on a single node or on two nodes), I get errors like this:<br>
<br>------------- begin error -------------<br>vol 0.87 imb F 2% step 350900, will finish Sat Jun 27 23:16:51 2009<br>vol 0.84 imb F 4% step 351000, will finish Sat Jun 27 23:16:49 2009<br><br>A list of missing interactions:<br>
G96Angle of 576 missing 1<br><br>Molecule type 'DPP'<br>the first 10 missing interactions, except for exclusions:<br> G96Angle atoms 2 3 5 global 254 255 257<br><br>
-------------------------------------------------------<br>Program mdrun_mpi, VERSION 4.0.5<br>Source code file: domdec_top.c, line: 341<br><br>Fatal error:<br>1 of the 1368 bonded interactions could not be calculated because some atoms involved moved further apart than the multi-body cut-off distance (1.2 nm) or the two-body cut-off distance (1.2 nm), see option -rdd, for pairs and tabulated bonds also see option -ddcheck<br>
-------------------------------------------------------<br><br>"Pump Up the Volume Along With the Tempo" (Jazzy Jeff)<br><br>Error on node 0, will try to stop all the nodes<br>Halting parallel program mdrun_mpi on CPU 0 out of 8<br>
<br>gcq#177: "Pump Up the Volume Along With the Tempo" (Jazzy Jeff)<br><br>h199:0.MPID_Abort: h199:0.MPI Abort by user Aborting program !<br>h199:0.MPID_CH_Abort: h199:0.Aborting program!<br>Abort on node h199 due to MPI_Abort (type 2)<br>
-------------- end error --------------<br><br>This error happens on a different step depending on how many processors I use, whether I use gfortran or ifort, etc. <br><br>Am I understanding correctly that I have a triplet of particles A-B-C where the bond-angle term cannot be calculated because the distance between A and C is greater than 1.2 nm?<br>
<br>Is dynamic load balancing causing the error to happen at different steps for different numbers of processors?<br><br>It appears that I can fix the problem by passing -rdd 1.4 to mdrun on the command line, but I'd like to make sure I'm not just sweeping something else under the rug.<br>
<br>For what it's worth, the equilibrium bond lengths in MARTINI's DPPC model are all either 0.47 or 0.37 nm. In the run with -rdd 1.4, the maximum bond lengths range from 0.68 to 0.71 nm depending on the particular bond, and the A-C distances from the A-B-C triplets range from 1.09 to 1.21 nm.<br>
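<br>As a rough geometric sanity check on those numbers, here is a minimal Python sketch (not GROMACS code). It uses the law of cosines to get the A-C distance of an A-B-C triplet from the two bond lengths and the angle at B, then compares it with the 1.2 nm cut-off from the error message. The function name and the choice of a fully extended 180 degree angle as the worst case are illustrative assumptions; the bond lengths are the ones quoted above.

```python
import math

CUTOFF_NM = 1.2  # multi-body cut-off distance reported in the error message


def a_c_distance(r_ab, r_bc, angle_deg):
    """A-C distance (nm) of an A-B-C triplet via the law of cosines.

    r_ab, r_bc : bond lengths in nm
    angle_deg  : angle at the central bead B, in degrees
    """
    theta = math.radians(angle_deg)
    return math.sqrt(r_ab**2 + r_bc**2 - 2.0 * r_ab * r_bc * math.cos(theta))


# Worst case (assumed): both bonds collinear, angle at B = 180 degrees.
equilibrium = a_c_distance(0.47, 0.47, 180.0)  # two equilibrium-length bonds
stretched = a_c_distance(0.71, 0.71, 180.0)    # two bonds at the observed maximum

for label, d in (("equilibrium", equilibrium), ("stretched", stretched)):
    status = "exceeds" if d > CUTOFF_NM else "is within"
    print(f"{label}: A-C = {d:.2f} nm, {status} the {CUTOFF_NM} nm cut-off")
```

Under these assumptions, two equilibrium-length bonds stay well inside the cut-off even when collinear, while two bonds stretched to the observed maximum could in principle exceed it, which would be consistent with an occasional A-C distance just above 1.2 nm.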
<br>Also, is there any chance that the default settings will get this right for my system in the future?<br><br>Thanks,<br><br>-michael<br clear="all"><br>-- <br>Michael Lerner, Ph.D.<br>IRTA Postdoctoral Fellow<br>Laboratory of Computational Biology NIH/NHLBI<br>
5635 Fishers Lane, Room T909, MSC 9314<br>Rockville, MD 20852 (UPS/FedEx/Reality)<br>Bethesda MD 20892-9314 (USPS)<br>