<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#ffffff">
    On 26/05/2011 12:25 PM, jagannath mondal wrote:
    <blockquote cite="mid:281675.78202.qm@web137402.mail.in.yahoo.com"
      type="cite">
      <table border="0" cellpadding="0" cellspacing="0">
        <tbody>
          <tr>
            <td style="font: inherit;" valign="top">Hi,
              <div>&nbsp;&nbsp;I am having a problem running a replica exchange
                simulation over multiple nodes.&nbsp;</div>
              <div>To run the simulation for 16 replicas over two 8-core
                processors, I generated a hostfile as follows:</div>
              <div>&nbsp;yethiraj30 slots=8 max_slots=8</div>
              <div>&nbsp;&nbsp;yethiraj31 slots=8 max_slots=8</div>
              <div><br>
              </div>
              <div>These two machines are interconnected and I have
                installed Open MPI.&nbsp;</div>
              <div>When I then try to run the replica exchange simulation
                using the following command:</div>
              <div>mpirun -np 16 --hostfile &nbsp;hostfile &nbsp;mdrun_4mpi -s
                topol_.tpr -multi 16 -replex 100 &gt;&amp;
                log_replica_test</div>
              <div><br>
              </div>
              <div>I get the following errors and mdrun does not proceed
                at all:</div>
              <div><br>
              </div>
              <div>
                <div>NNODES=16, MYRANK=0, HOSTNAME=yethiraj30</div>
                <div>NNODES=16, MYRANK=1, HOSTNAME=yethiraj30</div>
                <div>NNODES=16, MYRANK=4, HOSTNAME=yethiraj30</div>
                <div>NNODES=16, MYRANK=2, HOSTNAME=yethiraj30</div>
                <div>NNODES=16, MYRANK=6, HOSTNAME=yethiraj30</div>
                <div>NNODES=16, MYRANK=3, HOSTNAME=yethiraj30</div>
                <div>NNODES=16, MYRANK=5, HOSTNAME=yethiraj30</div>
                <div>NNODES=16, MYRANK=7, HOSTNAME=yethiraj30</div>
                <div>[yethiraj30][[22604,1],0][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
                  connect() to 192.168.0.31 failed: No route to host
                  (113)</div>
                <div>[yethiraj30][[22604,1],4][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
                  connect() to 192.168.0.31 failed: No route to host
                  (113)</div>
                <div>[yethiraj30][[22604,1],6][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
                  connect() to 192.168.0.31 failed: No route to host
                  (113)</div>
                <div>[yethiraj30][[22604,1],1][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
                  connect() to 192.168.0.31 failed: No route to host
                  (113)</div>
                <div>[yethiraj30][[22604,1],3][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
                  connect() to 192.168.0.31 failed: No route to host
                  (113)</div>
                <div>[yethiraj30][[22604,1],2][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
                  connect() to 192.168.0.31 failed: No route to host
                  (113)</div>
                <div>NNODES=16, MYRANK=10, HOSTNAME=yethiraj31</div>
                <div>NNODES=16, MYRANK=12, HOSTNAME=yethiraj31</div>
              </div>
              <div><br>
              </div>
              <div>I am not sure how to resolve this issue. In general,
                I can log in from one machine to the other without any
                problem using ssh. But when I try to run Open MPI across
                both machines, I get this error. Any help will be
                appreciated.</div>
            </td>
          </tr>
        </tbody>
      </table>
    </blockquote>
    <br>
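    Sorry, this is a problem with your MPI configuration, not GROMACS:
    six of the ranks on yethiraj30 report that connect() to 192.168.0.31
    failed with "No route to host", so those processes are not talking
    to the other node properly. You'll need to read the Open MPI
    documentation and/or talk with your sysadmins.<br>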
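    <br>
    As a first check outside MPI, it may help to try a plain TCP
    connection from one node to the other. Working ssh only shows that
    the ssh port is reachable, which does not guarantee that other ports
    are. The following is a minimal sketch, not part of GROMACS or
    Open MPI; the function name can_connect is hypothetical, and it
    assumes something is listening on the chosen port on the target
    host.<br>
    <pre>
```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try to open a TCP connection to host:port; return (ok, message)."""
    try:
        # create_connection resolves the host name and calls connect()
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, timeouts, name resolution failures
        # and unreachable hosts such as "No route to host"
        return False, str(exc)
```
    </pre>
    For example, calling can_connect("192.168.0.31", 5000) on yethiraj30
    uses the address from the error log above; the port 5000 is only a
    placeholder. If this also fails with "No route to host", the cause
    is likely in the network or firewall setup rather than in mdrun.<br>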
    <br>
    Mark<br>
  </body>
</html>