<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#ffffff">
On 7/02/2011 9:52 PM, Qiong Zhang wrote:
<blockquote cite="mid:652775.33884.qm@web53801.mail.re2.yahoo.com"
type="cite">
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td style="font: inherit;" valign="top">
<p class="MsoNormal"><span lang="EN-US">Dear all
gmx-users,</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">I have recently been
testing REMD simulations on a supercomputer based on AMD Opteron
12-core (2.1 GHz) processors, using Gromacs 4.5.3.</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">My system has 5172 atoms,
of which 138 belong to the solute and the rest are water molecules.
I generated an exponential distribution of temperatures, either from
276 to 515 K over 42 replicas or from 298 to 420 K over 24 replicas,
chosen so that the exchange ratio between all adjacent replicas is
about 0.25. Replica exchange was attempted every 0.5 ps, and the
integration time step was 2 fs.</span></p>
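<p class="MsoNormal"><span lang="EN-US">For reference, one common
reading of an "exponential distribution" of temperatures is a geometric
ladder, in which adjacent temperatures have a constant ratio. The
sketch below builds such a ladder for the two ranges given above. The
function name is hypothetical, and the temperatures actually used in
these runs are not given, so this illustrates the idea rather than
reproducing the real input.</span></p>

```python
def geometric_ladder(t_min, t_max, n_replicas):
    """Return n_replicas temperatures from t_min to t_max with a constant ratio.

    Assumes at least two replicas; this is an illustrative helper,
    not part of Gromacs.
    """
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]


# The two temperature ranges described in the message
for t_min, t_max, n in [(276.0, 515.0, 42), (298.0, 420.0, 24)]:
    temps = geometric_ladder(t_min, t_max, n)
    print(n, "replicas:", " ".join(f"{t:.1f}" for t in temps))
```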
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">For this system, REMD over
24 replicas runs reasonably fast, but REMD over 42 replicas is
extremely slow. The following table shows the speeds.<br>
</span></p>
<table border="1" cellpadding="4" cellspacing="0">
<tbody>
<tr>
<th>Number of replicas</th>
<th>Number of CPUs</th>
<th>Speed (steps per 15 minutes)</th>
</tr>
<tr>
<td>24</td>
<td>96</td>
<td>58015</td>
</tr>
<tr>
<td>42</td>
<td>42</td>
<td>865</td>
</tr>
<tr>
<td>42</td>
<td>84</td>
<td>1175</td>
</tr>
<tr>
<td>42</td>
<td>168</td>
<td>1875</td>
</tr>
<tr>
<td>42</td>
<td>336</td>
<td>2855</td>
</tr>
</tbody>
</table>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">The mdrun command line
is:</span></p>
<p class="MsoNormal"><span lang="EN-US">aprun -n (CPU
number here) mdrun_d -s
md.tpr -multi (replica number here) -replex 250</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">My questions are:<br>
</span></p>
<p class="MsoNormal"><span lang="EN-US">1) Why is REMD with 42
replicas so slow for the same system?<br>
</span></p>
<p class="MsoNormal"><span lang="EN-US">2) What can I change to
improve the efficiency?<br>
</span></p>
</td>
</tr>
</tbody>
</table>
</blockquote>
<br>
What's the network hardware? Can load from other jobs on the machine
affect your network performance?<br>
<br>
Are the systems in the NVT ensemble? Use diff to check that the .mdp
files differ only in the ways you think they do.<br>
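<br>
If a parameter-by-parameter view is easier to read than raw diff
output, the same check can be sketched in Python. The snippet below
parses two .mdp texts into key/value pairs and reports only the
parameters that differ. The function names and the sample values are
made up for illustration, and the parser only handles simple
"key = value" lines with ";" comments.<br>

```python
def parse_mdp(text):
    """Parse simple 'key = value' lines into a dict, dropping ';' comments."""
    params = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Collapse runs of whitespace so formatting differences are ignored
        params[key.strip()] = " ".join(value.split())
    return params


def mdp_differences(text_a, text_b):
    """Return {parameter: (value_a, value_b)} for parameters that differ."""
    a, b = parse_mdp(text_a), parse_mdp(text_b)
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Hypothetical contents of two replica .mdp files (values are invented)
replica_0 = """
integrator = md
dt         = 0.002
nstlist    = 10
ref_t      = 276.0 276.0   ; replica 0
"""
replica_1 = """
integrator = md
dt         = 0.002
nstlist    = 5
ref_t      = 280.0 280.0   ; replica 1
"""

for name, (old, new) in mdp_differences(replica_0, replica_1).items():
    print(f"{name}: {old} vs {new}")
```

In a REMD setup you would expect only the temperature-related entries
to show up here; anything else, such as the nstlist mismatch in this
invented example, deserves a closer look.<br>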
<br>
What are the values of nstlist and nstcalcenergy?<br>
<br>
Take a look at the execution time breakdown at the end of the .log
files, and do so for more than one replica. With the current
implementation, every simulation has to synchronize and communicate
every handful of steps, which means that large-scale parallelism
won't work efficiently unless you have fast network hardware that is
dedicated to your job. This effect shows up in the "Rest" row of the
time breakdown. With Infiniband, I'd expect you to lose only about
10% of the total run time. The 30-fold loss you see upon going from
24 to 42 replicas while keeping 4 CPUs/replica suggests some other
contribution, however.<br>
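<br>
To make the size of the slowdown concrete, here is a small Python
sketch that works it out from the step counts quoted in the question.
The numbers are copied from that table; the helper name is just for
illustration.<br>

```python
# (replicas, CPUs, steps completed in 15 minutes), copied from the question
RUNS = [
    (24, 96, 58015),
    (42, 42, 865),
    (42, 84, 1175),
    (42, 168, 1875),
    (42, 336, 2855),
]


def slowdown(baseline_steps, steps):
    """How many times slower a run is than the baseline, by step count."""
    return baseline_steps / steps


baseline_steps = RUNS[0][2]  # the 24-replica run on 96 CPUs
for replicas, cpus, steps in RUNS:
    print(f"{replicas} replicas, {cpus / replicas:g} CPUs/replica: "
          f"{slowdown(baseline_steps, steps):.1f}x slower than the 24-replica run")
```

For the two runs at 4 CPUs/replica this is 58015/1875, or about 30.9,
which is consistent with the 30-fold figure.<br>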
<br>
Mark<br>
</body>
</html>