<br /><br /><span>On 04/05/11, <b class="name">Igor Leontyev </b> &lt;ileontyev@ucdavis.edu&gt; wrote:</span><blockquote cite="mid:66C9829632484B329FED01940C4907DD@homecomp" class="iwcQuote" style="border-left: 1px solid rgb(0, 0, 255); padding-left: 13px; margin-left: 0pt;" type="cite"><div class="mimepart text plain">To make partial charges adjustable according to the acting field, I have introduced modifications to GROMACS 4.0.7. The serial (single thread) version seems to be ready and I want to implement parallelization (with particle decomposition). In my current implementation:<br />- values of mdatoms-&gt;chargeA for local atoms are updated in &quot;do_md&quot; at the beginning of each timestep;<br />- 'MPI_Sendrecv' + 'gmx_wait' are used in &quot;do_force&quot; (right after the call to &quot;move_cgcm&quot;) to distribute the new charges over the parallel nodes.<br />After this, the array mdatoms-&gt;chargeA has updated values on all nodes. But some problem arises later in &quot;gmx_pme_do&quot; (a routine I have not modified), hanging execution and even the PC.</div></blockquote><br />The standard procedure is to use a debugger to see which memory access is problematic and where it comes from. I'm not aware of a free parallel debugger, however. Bisecting with printf() calls can work...<br /><br /><blockquote cite="mid:66C9829632484B329FED01940C4907DD@homecomp" class="iwcQuote" style="border-left: 1px solid rgb(0, 0, 255); padding-left: 13px; margin-left: 0pt;" type="cite"><div class="mimepart text plain">Is it possible that the source of the problem is the use of ('MPI_Sendrecv' + 'gmx_wait') in the wrong place in the code?</div></blockquote><br />I doubt it.<br /><br /><blockquote cite="mid:66C9829632484B329FED01940C4907DD@homecomp" class="iwcQuote" style="border-left: 1px solid rgb(0, 0, 255); padding-left: 13px; margin-left: 0pt;" type="cite"><div class="mimepart text plain">Many communications are performed in &quot;gmx_pme_do&quot;, e.g. 
&quot;pmeredist&quot; calls 'MPI_Alltoallv' for charge and coordinate redistribution over the nodes.<br /><br />Is there a particular reason in the GROMACS code why some communications are done with 'MPI_Sendrecv' but others with 'MPI_Alltoallv'? What is the right way (or the right MPI routine) to distribute the locally updated charges over all nodes?</div></blockquote><br />Various parts of the code date from times when different parts of the MPI standard had implementations of varying quality, and some parts are throwbacks (I gather) to the way very early versions of GROMACS were designed to communicate on a parallel machine with ring topology.<br />These days, we should use the collective communication calls rather than introduce maintenance issues by re-implementing wheels.<br /><br />I can't help with clues on how a PD simulation should distribute such information, except that there must be a mapping somewhere from simulation atom to MPI rank, which was used to distribute the data in mdatoms shortly after it was constructed from the .tpr file.<br /><br />Mark