<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    On 2/04/2012 7:13 PM, James Starlight wrote:
    <blockquote
cite="mid:CAALQopx_5RE+AdrExRtuf2mTA5OKCa=MT8YjJ+esqHdEx59jPA@mail.gmail.com"
      type="cite">Mark,<br>
      <br>
      As I mentioned previously, I have problems running the
      simulation in multi-node mode.<br>
    </blockquote>
    <br>
    Yup, and my bet is that you can't run any other software across
    multiple MPI nodes either, because your MPI system is either not
    set up correctly or too old. We can't help with that, since it has
    nothing to do with GROMACS.<br>
    <br>
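    One way to check that, independently of GROMACS, is a minimal MPI
    "hello world" (just a sketch; the hostfile name and rank count
    below are placeholders for whatever your cluster uses):<br>
    <pre>
/* mpi_hello.c -- minimal check that MPI can start ranks on every node.
 * Build:  mpicc mpi_hello.c -o mpi_hello
 * Run:    mpiexec -np 16 -hostfile hosts ./mpi_hello
 */
#include &lt;mpi.h&gt;
#include &lt;stdio.h&gt;

int main(int argc, char **argv)
{
    int rank, size, namelen;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&amp;argc, &amp;argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &amp;rank);   /* this rank's index       */
    MPI_Comm_size(MPI_COMM_WORLD, &amp;size);   /* total number of ranks   */
    MPI_Get_processor_name(name, &amp;namelen); /* which host we landed on */

    printf("rank %d of %d on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}
</pre>
    If every rank on both nodes prints its line, the MPI layer is at
    least basically working; if this hangs or crashes across nodes,
    the problem is below GROMACS.<br>
    <br>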
    <blockquote
cite="mid:CAALQopx_5RE+AdrExRtuf2mTA5OKCa=MT8YjJ+esqHdEx59jPA@mail.gmail.com"
      type="cite"><br>
      I checked the logs of such simulations and found lines like
      this:<br>
      <br>
      Will use 10 particle-particle and 6 PME only nodes<br>
      This is a guess, check the performance at the end of the log file<br>
      Using 6 separate PME nodes<br>
      <br>
      This simulation was run on 2 nodes (2*8 CPUs). I've never
      obtained the same messages about PME nodes when I've launched my
      systems on a single node.</blockquote>
    <br>
    Not surprising. Running in parallel is a lot trickier than running
    in serial, and so there's a lot of software engineering that
    supports it. See manual sections 3.15 and 3.17.5. Running at
    near-maximum efficiency in parallel requires that you understand
    some of that, but by default it will "just run" almost all the
    time.<br>
    <br>
    <blockquote
cite="mid:CAALQopx_5RE+AdrExRtuf2mTA5OKCa=MT8YjJ+esqHdEx59jPA@mail.gmail.com"
      type="cite"> Might it be that some special options for the PME
      nodes are needed in the mdp file to be defined ?<br>
    </blockquote>
    <br>
    Not in the sense you mean. Normally no .mdp changes are necessary
    to support parallelism, and when any are needed you get told about
    them. The traceback below clearly indicates that the problem
    occurs as GROMACS sets up the parallel communication
    infrastructure, which has nothing directly to do with the .mdp
    contents.<br>
    <br>
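    For what it's worth, the crash is inside MPI_Comm_split, which
    GROMACS calls (via gmx_setup_nodecomm) to group ranks into
    communicators. The pattern is roughly the following illustrative
    sketch (not the actual GROMACS code; the split rule here is made
    up). A segfault inside a plain MPI call like this points at the
    MPI installation rather than at your input files:<br>
    <pre>
/* comm_split_demo.c -- illustrative only, not the GROMACS source.
 * Splits the ranks of MPI_COMM_WORLD into two groups with
 * MPI_Comm_split, the same MPI call that fails in the traceback.
 */
#include &lt;mpi.h&gt;
#include &lt;stdio.h&gt;

int main(int argc, char **argv)
{
    int rank, size, subrank;
    MPI_Comm subcomm;

    MPI_Init(&amp;argc, &amp;argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &amp;rank);
    MPI_Comm_size(MPI_COMM_WORLD, &amp;size);

    /* Hypothetical rule: the last third of the ranks form group 1,
     * the rest group 0. GROMACS uses its own rules for PP/PME. */
    int color = (rank &gt;= 2 * size / 3) ? 1 : 0;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &amp;subcomm);

    MPI_Comm_rank(subcomm, &amp;subrank);
    printf("world rank %d -&gt; group %d, local rank %d\n",
           rank, color, subrank);

    MPI_Comm_free(&amp;subcomm);
    MPI_Finalize();
    return 0;
}
</pre>
    If even this fails across your two nodes, the MPI stack itself is
    broken, and no .mdp setting will change that.<br>
    <br>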
    Mark<br>
    <br>
    <blockquote
cite="mid:CAALQopx_5RE+AdrExRtuf2mTA5OKCa=MT8YjJ+esqHdEx59jPA@mail.gmail.com"
      type="cite">
      <br>
      James<br>
      <br>
      <div class="gmail_quote">20 &#1084;&#1072;&#1088;&#1090;&#1072; 2012&nbsp;&#1075;. 18:02 &#1087;&#1086;&#1083;&#1100;&#1079;&#1086;&#1074;&#1072;&#1090;&#1077;&#1083;&#1100; Mark
        Abraham <span dir="ltr">&lt;<a moz-do-not-send="true"
            href="mailto:Mark.Abraham@anu.edu.au">Mark.Abraham@anu.edu.au</a>&gt;</span>
        &#1085;&#1072;&#1087;&#1080;&#1089;&#1072;&#1083;:<br>
        <blockquote class="gmail_quote" style="margin:0 0 0
          .8ex;border-left:1px #ccc solid;padding-left:1ex">
          <div class="HOEnZb">
            <div class="h5">On 20/03/2012 10:35 PM, James Starlight
              wrote:<br>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                Could someone tell me what the error below means?<br>
                <br>
                Getting Loaded...<br>
                Reading file MD_100.tpr, VERSION 4.5.4 (single
                precision)<br>
                Loaded with Money<br>
                <br>
                <br>
                Will use 30 particle-particle and 18 PME only nodes<br>
                This is a guess, check the performance at the end of the
                log file<br>
                [ib02:22825] *** Process received signal ***<br>
                [ib02:22825] Signal: Segmentation fault (11)<br>
                [ib02:22825] Signal code: Address not mapped (1)<br>
                [ib02:22825] Failing at address: 0x10<br>
                [ib02:22825] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030)
                [0x7f535903e03$<br>
                [ib02:22825] [ 1] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x7e23)
                [0x7f535$<br>
                [ib02:22825] [ 2] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x8601)
                [0x7f535$<br>
                [ib02:22825] [ 3] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x8bab)
                [0x7f535$<br>
                [ib02:22825] [ 4] /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so(+0x42af)
                [0x7f5353$<br>
                [ib02:22825] [ 5] /usr/lib/libopen-pal.so.0(opal_progress+0x5b)
                [0x7f535790506b]<br>
                [ib02:22825] [ 6] /usr/lib/libmpi.so.0(+0x37755)
                [0x7f5359282755]<br>
                [ib02:22825] [ 7] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(+0x1c3a)
                [0x7f$<br>
                [ib02:22825] [ 8] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(+0x7fae)
                [0x7f$<br>
                [ib02:22825] [ 9] /usr/lib/libmpi.so.0(ompi_comm_split+0xbf)
                [0x7f535926de8f]<br>
                [ib02:22825] [10] /usr/lib/libmpi.so.0(MPI_Comm_split+0xdb)
                [0x7f535929dc2b]<br>
                [ib02:22825] [11] /usr/lib/libgmx_mpi_d.openmpi.so.6(gmx_setup_nodecomm+0x19b)
                $<br>
                [ib02:22825] [12] mdrun_mpi_d.openmpi(mdrunner+0x46a)
                [0x40be7a]<br>
                [ib02:22825] [13] mdrun_mpi_d.openmpi(main+0x1256)
                [0x407206]<br>
                [ib02:22825] [14] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)
                [0x7f$<br>
                [ib02:22825] [15] mdrun_mpi_d.openmpi() [0x407479]<br>
                [ib02:22825] *** End of error message ***<br>
                --------------------------------------------------------------------------<br>
                mpiexec noticed that process rank 36 with PID 22825 on
                node ib02 exited on sign$<br>
                --------------------------------------------------------------------------<br>
                <br>
                <br>
                I obtained it when I tried to run my system on a
                multi-node station (there is no problem on a single
                node). Is this a problem with the cluster system, or
                is something wrong with the parameters of my
                simulation?<br>
              </blockquote>
              <br>
            </div>
          </div>
          The traceback suggests your MPI system is not configured
          correctly for your hardware.<span class="HOEnZb"><font
              color="#888888"><br>
              <br>
              Mark</font></span>
          <div class="HOEnZb">
            <div class="h5"><br>
              -- <br>
              gmx-users mailing list &nbsp; &nbsp;<a moz-do-not-send="true"
                href="mailto:gmx-users@gromacs.org" target="_blank">gmx-users@gromacs.org</a><br>
              <a moz-do-not-send="true"
                href="http://lists.gromacs.org/mailman/listinfo/gmx-users"
                target="_blank">http://lists.gromacs.org/mailman/listinfo/gmx-users</a><br>
              Please search the archive at <a moz-do-not-send="true"
                href="http://www.gromacs.org/Support/Mailing_Lists/Search"
                target="_blank">http://www.gromacs.org/Support/Mailing_Lists/Search</a>
              before posting!<br>
              Please don't post (un)subscribe requests to the list. Use
              the www interface or send it to <a moz-do-not-send="true"
                href="mailto:gmx-users-request@gromacs.org"
                target="_blank">gmx-users-request@gromacs.org</a>.<br>
              Can't post? Read <a moz-do-not-send="true"
                href="http://www.gromacs.org/Support/Mailing_Lists"
                target="_blank">http://www.gromacs.org/Support/Mailing_Lists</a><br>
            </div>
          </div>
        </blockquote>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>