Thank you Mark, <div>I am trying to implement suggestion 1. </div><div>I exchange the states between the two simulations using a copy, and that part works.</div><div>But on the step after the exchange is executed, I still get errors, which go away only if I set </div>

<div>bExchanged = TRUE.</div><div>I guess dd_partition_system modifies not only the local state but also other variables, so I tried</div><div>running dd_partition_system again with the original states, but still only bExchanged = TRUE</div>

<div>permits me to run without error.</div><div>Here is the new code:</div><div><br></div><div>if (DOMAINDECOMP(cr))</div><div><div>        {</div><div><span class="Apple-tab-span" style="white-space:pre">                </span>  int old_flag = state-&gt;flags;</div>

<div><span class="Apple-tab-span" style="white-space:pre">                </span>  state-&gt;flags=(1&lt;&lt;estX);             //I am interested only in coordinates</div><div><span class="Apple-tab-span" style="white-space:pre">                </span>  state_copy-&gt;flags=(1&lt;&lt;estX);        </div>

<div><span class="Apple-tab-span" style="white-space:pre">                </span>  dd_collect_state(cr-&gt;dd,state,state_global_copy); <span class="Apple-tab-span" style="white-space:pre">                                </span> </div><div><span class="Apple-tab-span" style="white-space:pre">                </span>  state-&gt;flags=old_flag;</div>

<div>        }</div><div><br></div><div>GMX_BARRIER(cr-&gt;mpi_comm_mygroup); </div><div><br></div><div> if (MASTER(cr))</div><div>        {</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>     exchange_state(cr-&gt;ms,  Y,state_global_copy); //Now state_global_copy contains the state_global of Y</div>

<div>        }</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>   </div><div>GMX_BARRIER(cr-&gt;mpi_comm_mygroup);</div><div>if (PAR(cr)) </div><div>{</div><div>    if (DOMAINDECOMP(cr)) </div><div>

<span class="Apple-tab-span" style="white-space:pre">        </span>{</div><div> <span class="Apple-tab-span" style="white-space:pre">        </span>  dd_partition_system(fplog,step,cr,TRUE,1, </div><div>                               state_global_copy,top_global,ir,</div>

<div>                               state_copy,&amp;f,mdatoms,top_copy,fr,       </div><div>                               vsite,shellfc,constr,</div><div>                               nrnb,wcycle,FALSE);</div><div>            }</div>

<div>}<span class="Apple-tab-span" style="white-space:pre">                </span></div></div><div><br></div><div><br></div><div>//DOING SOMETHING</div><div><br></div><div> dd_partition_system(fplog,step,cr,TRUE,1, state_global,top_global,ir,</div>

<div><div>                               state,&amp;f,mdatoms,top,fr,vsite,shellfc,constr,</div><div>                               nrnb,wcycle,FALSE);</div></div><div>//I still get an error here unless bExchanged = TRUE</div>
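<div>The save, narrow and restore handling of state-&gt;flags in the code above can be sketched in isolation. This is a minimal illustration only: toy_state, TOY_EST_X, TOY_EST_V, toy_collect and toy_collect_x_only are hypothetical stand-ins, not GROMACS types or functions, and the real dd_collect_state does far more than this.</div>

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ATOMS 4

/* Hypothetical entry indices; these are NOT the GROMACS est* values. */
enum { TOY_EST_X = 0, TOY_EST_V = 1 };

/* Hypothetical, heavily simplified stand-in for a state structure. */
typedef struct {
    int    flags;              /* bit i set => entry i is in use */
    size_t natoms;
    double x[TOY_MAX_ATOMS];   /* one coordinate per atom, for brevity */
    double v[TOY_MAX_ATOMS];   /* one velocity per atom, for brevity */
} toy_state;

/* Copy only the entries selected by src->flags into dst. */
void toy_collect(const toy_state *src, toy_state *dst)
{
    assert(src->natoms <= TOY_MAX_ATOMS);
    dst->flags  = src->flags;
    dst->natoms = src->natoms;
    for (size_t i = 0; i < src->natoms; i++) {
        if (src->flags & (1 << TOY_EST_X)) {
            dst->x[i] = src->x[i];
        }
        if (src->flags & (1 << TOY_EST_V)) {
            dst->v[i] = src->v[i];
        }
    }
}

/* Collect coordinates only, then restore the caller's original flags. */
void toy_collect_x_only(toy_state *local, toy_state *copy)
{
    int old_flags = local->flags;      /* save */

    local->flags = (1 << TOY_EST_X);   /* narrow to coordinates */
    toy_collect(local, copy);
    local->flags = old_flags;          /* restore */
}
```

<div>In this sketch the copy ends up with only the coordinate bit set, so any later call that takes the copy as its input sees a state that carries coordinates and nothing else.</div>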

<div><br></div><div><br></div><div>Since it is tedious for you to correct my code, is there any documentation on the parameters of the dd_ functions,</div><div>so that I can call them with the right input?</div><div><br>

</div><div><br></div><div>Francesco</div><div><br></div>

<div><br><div class="gmail_quote">2012/8/15 Mark Abraham <span dir="ltr">&lt;<a href="mailto:Mark.Abraham@anu.edu.au" target="_blank">Mark.Abraham@anu.edu.au</a>&gt;</span><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000"><div class="im">

    <div>On 15/08/2012 5:46 AM, francesco oteri

      wrote:<br>

    </div>

    <blockquote type="cite">Dear gromacs users and developers,

      <div>I have a question related to domain decomposition:</div>

      <div><br>

      </div>

      <div>I have to run multiple simulations, and at every step: </div>

      <div>1) sim X needs the state of sim Y (and sim Y needs the state of

        sim X)</div>

      <div>2) get the potential energy</div>

      <div>3) continue</div>

      <div><br>

      </div>

      <div><br>

      </div>

      <div><br>

      </div>

      <div>Right now I am testing point 1. In particular, I exchange

        the state between simulations X and Y twice:</div>

      <div>the first exchange lets simulation X get the state

        of Y (and vice versa), while the second exchange</div>

      <div>restores the original situation.</div>

    </blockquote>

    <br></div>

    Seems you don&#39;t actually want to exchange states, but rather do a

    computation on a copy of the other state. I&#39;d either<br>

    1) do exchange_state on X into a different t_state from the one with

    which X is simulating (so there is no need to exchange back, since

    doing dd_partition_system a second time requires neighbour searching

    twice and that will kill your scaling even harder)<br>

    2) get Y to do the computation on its coordinates, since swapping

    the result is probably much cheaper than collecting the state,

    communicating the state, doing NS and DD on the state and then

    computing on it. That might mean maintaining multiple t_forcerec or

    gmx_mtop_t, but at least those data structures are likely constant,

    so you only have to communicate them rarely (once?).<div><div class="h5"><br>

    <br>
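<div>Suggestion 2 above can be sketched with a toy model. Everything here is an assumption made for illustration: toy_replica, toy_energy, toy_cross_energy and the harmonic potential with a single parameter k are hypothetical and stand in for the real force field data (t_forcerec, gmx_mtop_t). The point is only that each replica evaluates its own coordinates under its partner's (constant) parameters and sends one number back.</div>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical replica: one Hamiltonian parameter plus its own coordinates. */
typedef struct {
    double        k;   /* stand-in for the constant force field data */
    size_t        n;   /* number of coordinates */
    const double *x;   /* coordinates, which never leave this replica */
} toy_replica;

/* Toy potential: U = 0.5 * k * sum_i x_i^2 */
double toy_energy(double k, const double *x, size_t n)
{
    double sum = 0.0;

    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        sum += x[i] * x[i];
    }
    return 0.5 * k * sum;
}

/* Energy of self's coordinates under partner's Hamiltonian parameter.
 * Only this scalar would need to be communicated to the partner. */
double toy_cross_energy(const toy_replica *self, const toy_replica *partner)
{
    return toy_energy(partner->k, self->x, self->n);
}

/* Payload sizes of the two schemes, for comparison only. */
size_t toy_bytes_state(size_t natoms)
{
    return 3 * natoms * sizeof(double);   /* x, y, z per atom */
}

size_t toy_bytes_result(void)
{
    return sizeof(double);                /* a single energy */
}
```

<div>In this sketch the partner's parameter k has to be known locally, which corresponds to communicating the constant data structures once rather than the coordinates at every step.</div>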

    <blockquote type="cite">

      <div><br>

      </div>

      <div>I inserted the following code between lines</div>

      <div><br>

      </div>

      <div>

        <div>   if ((repl_ex_nst &gt; 0) &amp;&amp; (step &gt; 0)

          &amp;&amp; !bLastStep &amp;&amp;</div>

        <div>            do_per_step(step,repl_ex_nst)) </div>

        <div>        {</div>

      </div>

      <div><br>

      </div>

      <div>and </div>

      <div><br>

      </div>

      <div>bExchanged = replica_exchange(fplog, cr, repl_ex,

        state_global, enerd-&gt;term, state,step,t);</div>

      <div><br>

      </div>

      <div><br>

      </div>

      <div>//Performing the first exchange</div>

      <div>

        <div>  if (DOMAINDECOMP(cr))</div>

        <div>        {</div>

        <div><span style="white-space:pre-wrap"> </span>

           dd_collect_state(cr-&gt;dd,state,state_global);</div>

        <div>        }</div>

      </div>

      <div><br>

      </div>

      <div> if (MASTER(cr))</div>

      <div>        {<span style="white-space:pre-wrap">

        </span> <span style="white-space:pre-wrap"> </span> </div>

      <div>          exchange_state(cr-&gt;ms, Y, state_global);<span style="white-space:pre-wrap"> </span> <span style="white-space:pre-wrap"> </span>     </div>

      <div>       }</div>

      <div>        </div>

      <div>

        <div>  if (DOMAINDECOMP(cr)) </div>

        <div>    {                    <span style="white-space:pre-wrap"> </span>  <span style="white-space:pre-wrap"> </span></div>

        <div>           dd_partition_system(fplog,step,cr,TRUE,1,      

             </div>

        <div>                                                 

          state_global,top_global,ir, </div>

        <div>                                                 

          state,NULL,mdatoms,top,fr,</div>

        <div>                                                 

          vsite,shellfc,constr,</div>

        <div>                                               

           nrnb,wcycle,FALSE);                               </div>

        <div>    }</div>

      </div>

      <div><br>

      </div>

      <div>//Now every node should have its part of the Y simulation</div>

      <div>//Getting potential energy</div>

      <div><br>

      </div>

      <div><br>

      </div>

      <div><br>

      </div>

      <div>//Performing the second exchange</div>

      <div>

        <div>if (MASTER(cr))</div>

        <div>  { </div>

        <div>    exchange_state(cr-&gt;ms, Y, state_global);<span style="white-space:pre-wrap"> </span> //

          I don&#39;t need to call dd_collect_state because nothing changed

          state_global</div>

        <div>  }</div>

      </div>

      <div><br>

      </div>

      <div>

        <div>  if (DOMAINDECOMP(cr)) </div>

        <div>    {                    <span style="white-space:pre-wrap"> </span>  <span style="white-space:pre-wrap"> </span></div>

        <div>           dd_partition_system(fplog,step,cr,TRUE,1,      

             </div>

        <div>                                                 

          state_global,top_global,ir, </div>

        <div>                                                 

          state,NULL,mdatoms,top,fr,</div>

        <div>                                                 

          vsite,shellfc,constr,</div>

        <div>                                               

           nrnb,wcycle,FALSE);                               </div>

        <div>    }</div>

      </div>

      <div><br>

      </div>

      <div>//Now state Y is back to simulation Y </div>

      <div><br>

      </div>

      <div><br>

      </div>

      <div>The problem is that this simple code causes errors; in

        particular, it produces LINCS problems</div>

      <div>in do_force on the step after my code is executed.</div>

      <div><br>

      </div>

      <div>Since forcing bNS=TRUE solves the problem, I guess there is

        some issue with the neighbor list update,</div>

      <div>but I don&#39;t understand why.</div>

      <div><br>

      </div>

      <div>I also observed that, after the last dd_partition_system,

        state-&gt;natoms had a different value from the one</div>

      <div>it had before my code was executed.</div>

      <div><br>

      </div>

      <div>What is my error?</div>

    </blockquote>

    <br></div></div>

    Particularly with dynamic load balancing, there is no reason that

    the DD for any replica should resemble the DD for any other replica.

    Each processor can have totally different atoms, and a different

    number of atoms, so blindly copying stuff into those data structures

    will lead to the kinds of problems you see. I&#39;d still expect

    problems even if you disable dynamic load balancing. Hence my

    suggestions above.<br>

    <br>
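<div>The point about differing decompositions can be illustrated with a toy index mapping. This is a sketch under stated assumptions: toy_domain, toy_gather and toy_scatter are hypothetical names, each "domain" is just a list of global atom indices, and nothing here reflects how the GROMACS DD code is actually organised.</div>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one rank's domain: which global atoms it holds. */
typedef struct {
    size_t        nlocal;
    const size_t *global_index;   /* global_index[i] = global id of local atom i */
} toy_domain;

/* Write one domain's local values into their global slots (a "collect"). */
void toy_gather(const toy_domain *d, const double *local,
                double *global, size_t nglobal)
{
    for (size_t i = 0; i < d->nlocal; i++) {
        assert(d->global_index[i] < nglobal);
        global[d->global_index[i]] = local[i];
    }
}

/* Read this domain's atoms back out of the global array (a "partition"). */
void toy_scatter(const toy_domain *d, const double *global,
                 size_t nglobal, double *local)
{
    for (size_t i = 0; i < d->nlocal; i++) {
        assert(d->global_index[i] < nglobal);
        local[i] = global[d->global_index[i]];
    }
}
```

<div>If replica Y holds global atoms {3, 1} on its first rank while replica X holds {0, 1, 2} on its first rank, copying Y's local array straight into X's local array puts atom 3's value into the slot for atom 0, and the two arrays do not even have the same length. Going through the global array and scattering with X's own index list avoids that, at the cost of the collect and repartition steps discussed above.</div>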

    The implementation of replica exchange in GROMACS scales poorly

    because exchanging coordinates requires subsequent NS and DD. So I&#39;d

    encourage you to avoid that route if you can. Exchanging

    Hamiltonians is much cheaper. I have an implementation of that for

    T-REMD, but it won&#39;t see the light of day any time soon.<span class="HOEnZb"><font color="#888888"><br>

    <br>

    Mark<br>

  </font></span></div>

</blockquote></div><br><br clear="all"><div><br></div>-- <br>Kind regards, Dr. Oteri Francesco<br>

</div>