<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#ffffff">
    On 7/02/2011 9:52 PM, Qiong Zhang wrote:
    <blockquote cite="mid:652775.33884.qm@web53801.mail.re2.yahoo.com"
      type="cite">
      <table border="0" cellpadding="0" cellspacing="0">
        <tbody>
          <tr>
            <td style="font: inherit;" valign="top">
              <p class="MsoNormal"><span lang="EN-US">Dear all
                  gmx-users,</span></p>
              <p class="MsoNormal"><span lang="EN-US"> </span></p>
              <p class="MsoNormal"><span lang="EN-US">I have recently been
                  testing REMD simulations. I ran them on a supercomputer
                  system based on AMD Opteron 12-core (2.1 GHz) processors,
                  using Gromacs version 4.5.3.</span></p>
              <p class="MsoNormal"><span lang="EN-US"> </span></p>
              <p class="MsoNormal"><span lang="EN-US">I have a system of
                  5172 atoms, of which 138 atoms belong to the solute and
                  the rest are water molecules. An exponential
                  distribution of temperatures was generated, ranging
                  either from 276 to 515 K over 42 replicas or from 298
                  to 420 K over 24 replicas, so that the exchange ratio
                  between all adjacent replicas is about 0.25. Replica
                  exchange was attempted every 0.5 ps. The integration
                  time step was 2 fs.</span></p>
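              <p class="MsoNormal"><span lang="EN-US">As a minimal sketch of
                  such a ladder, assuming "exponential distribution" means a
                  constant ratio between adjacent temperatures, the two
                  ranges above could be generated as follows. The function
                  name geometric_ladder is hypothetical and is not part of
                  Gromacs.</span></p>
<pre>
```python
def geometric_ladder(t_min, t_max, n_replicas):
    """Return n_replicas temperatures with a constant ratio between neighbours."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]


# The two ladders described in the message (temperatures in K).
ladder_42 = geometric_ladder(276.0, 515.0, 42)
ladder_24 = geometric_ladder(298.0, 420.0, 24)

# Ratio between adjacent temperatures in each ladder.
ratio_42 = ladder_42[1] / ladder_42[0]
ratio_24 = ladder_24[1] / ladder_24[0]
print(round(ratio_42, 4), round(ratio_24, 4))
```
</pre>
              <p class="MsoNormal"><span lang="EN-US">Both ladders come out
                  with a neighbour ratio close to 1.015, so under this
                  assumption the spacing between adjacent replicas is about
                  the same in the 24-replica and 42-replica
                  setups.</span></p>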
              <p class="MsoNormal"><span lang="EN-US"> </span></p>
              <p class="MsoNormal"><span lang="EN-US">For the above
                  system, when REMD is run with 24 replicas, the
                  simulation speed is reasonably fast. However, with 42
                  replicas, the simulation is extremely slow. Please see
                  the following table for the speeds.<br>
                </span></p>
              <table border="1" cellpadding="4" cellspacing="0">
                <tbody>
                  <tr>
                    <th>Replica number</th>
                    <th>CPU number</th>
                    <th>Speed (steps per 15 minutes)</th>
                  </tr>
                  <tr>
                    <td>24</td>
                    <td>96</td>
                    <td>58015</td>
                  </tr>
                  <tr>
                    <td>42</td>
                    <td>42</td>
                    <td>865</td>
                  </tr>
                  <tr>
                    <td>42</td>
                    <td>84</td>
                    <td>1175</td>
                  </tr>
                  <tr>
                    <td>42</td>
                    <td>168</td>
                    <td>1875</td>
                  </tr>
                  <tr>
                    <td>42</td>
                    <td>336</td>
                    <td>2855</td>
                  </tr>
                </tbody>
              </table>
              <p class="MsoNormal"><span lang="EN-US"> </span></p>
              <p class="MsoNormal"><span lang="EN-US">The mdrun command
                  line is:</span></p>
              <p class="MsoNormal"><span lang="EN-US">aprun -n (CPU
                  number here) mdrun_d -s
                  md.tpr -multi (replica number here) -replex 250</span></p>
              <p class="MsoNormal"><span lang="EN-US"> </span></p>
              <p class="MsoNormal"><span lang="EN-US">My questions are:<br>
                </span></p>
              <p class="MsoNormal"><span lang="EN-US">1) Why is REMD with
                  42 replicas so slow for the same system? <br>
                </span></p>
              <p class="MsoNormal"><span lang="EN-US">2) How can I improve
                  the efficiency of these runs?<br>
                </span></p>
            </td>
          </tr>
        </tbody>
      </table>
    </blockquote>
    <br>
    What's the network hardware? Can load from other jobs on the machine
    affect your network performance?<br>
    <br>
    Are the systems in the NVT ensemble? Use diff to check that the .mdp
    files differ only in the ways you expect.<br>
    <br>
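    <p>If plain diff output is noisy because of comments or spacing, a
    short script can compare the options themselves. This is only a sketch:
    the function names and the file names in the usage note are
    hypothetical, and it assumes simple "name = value" lines with ";"
    starting a comment.</p>
<pre>
```python
def read_mdp(path):
    """Parse an .mdp file into a dict mapping option name to value string."""
    options = {}
    with open(path) as handle:
        for line in handle:
            line = line.split(";", 1)[0].strip()  # drop comments
            if "=" not in line:
                continue
            key, value = line.split("=", 1)
            # Treat '-' and '_' in option names as the same character
            # (assumption: grompp accepts either spelling).
            options[key.strip().replace("-", "_")] = value.strip()
    return options


def mdp_differences(path_a, path_b):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    a = read_mdp(path_a)
    b = read_mdp(path_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }
```
</pre>
    <p>For example, mdp_differences("md_0.mdp", "md_1.mdp") on two replica
    inputs should ideally report only the temperature-related options such
    as ref_t. An option missing from one file shows up with None on that
    side.</p>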
    What are the values of nstlist and nstcalcenergy?<br>
    <br>
    Take a look at the execution time breakdown at the end of the .log
    files, and do so for more than one replica. With the current
    implementation, every simulation has to synchronize and communicate
    every handful of steps, which means that large-scale parallelism
    won't work efficiently unless you have fast network hardware that is
    dedicated to your job. This effect shows up in the "Rest" row of the
    time breakdown. With Infiniband, I'd expect you to lose only about
    10% of the total run time. The 30-fold loss you see on going from 24
    to 42 replicas while keeping 4 CPUs per replica suggests some other
    contribution, however.<br>
    <br>
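    <p>For reference, the 30-fold figure follows from the quoted timings.
    The sketch below redoes the arithmetic, assuming the 2 fs time step
    stated in the original message. The variable and function names are
    only illustrative.</p>
<pre>
```python
# (replicas, CPUs, steps per 15 minutes), copied from the quoted table.
runs = [
    (24, 96, 58015),
    (42, 42, 865),
    (42, 84, 1175),
    (42, 168, 1875),
    (42, 336, 2855),
]

TIMESTEP_PS = 0.002          # 2 fs time step, as stated in the message
INTERVALS_PER_DAY = 24 * 4   # number of 15-minute intervals in a day


def ns_per_day(steps_per_15min):
    """Convert steps per 15 minutes into simulated ns per day per replica."""
    return steps_per_15min * TIMESTEP_PS / 1000.0 * INTERVALS_PER_DAY


for replicas, cpus, steps in runs:
    print(replicas, cpus, cpus // replicas, round(ns_per_day(steps), 3))

# Both of these runs use 4 CPUs per replica (96/24 and 168/42).
slowdown = 58015 / 1875
print(round(slowdown, 1))
```
</pre>
    <p>The last line gives a ratio of about 31 between the 24-replica run
    and the 42-replica run at the same 4 CPUs per replica.</p>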
    Mark<br>
  </body>
</html>