<table cellspacing="0" cellpadding="0" border="0" ><tr><td valign="top" style="font: inherit;"><P>Hello,</P><P></P><P>Thank you for your answer. I am just wondering, though: how am I supposed to have a system with more than 99999 atoms when the gro file has a fixed format that allows only 5 digits for the number of atoms? <BR></P><P></P><P>What else should I change to get better performance from my hardware if I manage to set up a much bigger system? So you are saying that Ethernet has reached its limits. </P><P></P><P>I was considering using a supercomputing center in Europe and, as far as I know, its nodes are based on 9-core Cell processors. How can someone there get better performance from GROMACS 4 by using more nodes? What might the limit be on such machines? </P><P></P><P>Thank you once again,</P><P>Nikos</P><BR>--- Berk Hess <I>&lt;gmx3@hotmail.com&gt;</I> wrote on <B>Wed, 18.2.2009:<BR><BLOCKQUOTE style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px;">From: Berk Hess &lt;gmx3@hotmail.com&gt;<BR>Subject: RE: [gmx-users] gromacs-4.0.2, parallel performance in two quad core xeon machines<BR>To: lastexile7gr@yahoo.de<BR>Date: Wednesday, 18 February 2009, 19:16<BR><BR><DIV id="yiv278737063">


<STYLE>

#yiv278737063 .hmmessage P

{

margin:0px;padding:0px;}

#yiv278737063 {

font-size:10pt;font-family:Verdana;}

</STYLE>


Hi,<BR><BR>You cannot scale a system of just 7200 atoms<BR>to 16 cores that are connected by Ethernet.<BR>400 atoms per core is already the scaling limit of GROMACS<BR>on current hardware with the fastest available network.<BR><BR>On Ethernet, a system 100 times as large might scale well to two nodes.<BR><BR>Berk<BR><BR><BR><HR id="stopSpelling">Date: Wed, 18 Feb 2009 09:40:28 -0800<BR>From: lastexile7gr@yahoo.de<BR>To: gmx-users@gromacs.org<BR>Subject: [gmx-users] gromacs-4.0.2,        parallel performance in two quad core xeon machines <BR><BR><TABLE border="0" cellpadding="0" cellspacing="0"><TBODY><TR><TD style="font-family:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;font-size:inherit;line-height:inherit;font-size-adjust:inherit;font-stretch:inherit;" valign="top">Hello,<BR><BR>We have built a cluster in which each node has the following: dual core Intel(R) Xeon(R) CPU E3110 @ 3.00GHz and 16 GB of memory. The switch we use is a Dell PowerConnect model. Each node has a Gigabit Ethernet card.<BR><BR>I tested the performance for a system of 7200 atoms on 4 cores of one node, on 8 cores of one node, and on 16 cores across two nodes. Within one node the performance improves.<BR>The problem is that when moving from one node to two, the performance drops dramatically (almost two days for a run that otherwise finishes in less than 3 hours!).<BR><BR>I compiled GROMACS with the --enable-mpi option. I have also read earlier posts in the archives from Mr Kurtzner, but from what I saw they focus on errors in GROMACS 4 or on problems that previous versions of GROMACS had. I get no errors, just low performance.<BR><BR>Is there any option that I must enable to get better performance on more than one node? Or do you think, from your experience, that the switch we use might be the problem? Or do we perhaps have to activate something on the nodes?<BR><BR>Thank you in advance,<BR>Nikos<BR><BR></TD></TR></TBODY></TABLE><BR><BR>

</DIV></BLOCKQUOTE></td></tr></table><br>
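<P>For reference, the atoms-per-core arithmetic behind Berk's reply can be sketched as follows. This is only a rough illustration: the helper name <I>atoms_per_core</I> is hypothetical and not part of GROMACS, and the 400 atoms per core figure is simply the limit quoted in the reply for the fastest available network, not something measured on this cluster.</P>

```python
# Rough atoms-per-core estimate using only the numbers quoted in this thread.
# atoms_per_core is a hypothetical helper, not a GROMACS function.

def atoms_per_core(n_atoms: int, n_cores: int) -> float:
    """Average number of atoms handled by each core."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return n_atoms / n_cores

# Limit quoted in Berk's reply (fastest available network), an assumption here.
SCALING_LIMIT = 400  # atoms per core

small = atoms_per_core(7200, 16)        # the 7200-atom test on 16 cores
large = atoms_per_core(7200 * 100, 16)  # a system 100 times as large on 16 cores

print(f"7200 atoms on 16 cores: {small:.0f} atoms/core")
print(f"720000 atoms on 16 cores: {large:.0f} atoms/core")
print(f"quoted limit on the fastest network: {SCALING_LIMIT} atoms/core")
```

<P>With these numbers the small system sits close to the quoted limit even before the slower Ethernet link is taken into account, while the larger system leaves each core far more work per communication step.</P>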