<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#ffffff">
On 26/05/2011 12:25 PM, jagannath mondal wrote:
<blockquote cite="mid:281675.78202.qm@web137402.mail.in.yahoo.com"
type="cite">
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td style="font: inherit;" valign="top">Hi,
            <div>I am having a problem running a replica exchange
              simulation over multiple nodes.</div>
<div>To run the simulation for 16 replicas over two 8-core
processors, I generated a hostfile as follows:</div>
<div> yethiraj30 slots=8 max_slots=8</div>
<div> yethiraj31 slots=8 max_slots=8</div>
<div><br>
</div>
            <div>These two machines are interconnected, and I have
              installed Open MPI.</div>
            <div>I then try to run the replica exchange simulation
              using the following command:</div>
            <div>mpirun -np 16 --hostfile hostfile mdrun_4mpi -s
              topol_.tpr -multi 16 -replex 100 &gt;&amp;
              log_replica_test</div>
<div><br>
</div>
            <div>But I get the following error, and mdrun does not
              proceed at all:</div>
<div><br>
</div>
<div>
<div>NNODES=16, MYRANK=0, HOSTNAME=yethiraj30</div>
<div>NNODES=16, MYRANK=1, HOSTNAME=yethiraj30</div>
<div>NNODES=16, MYRANK=4, HOSTNAME=yethiraj30</div>
<div>NNODES=16, MYRANK=2, HOSTNAME=yethiraj30</div>
<div>NNODES=16, MYRANK=6, HOSTNAME=yethiraj30</div>
<div>NNODES=16, MYRANK=3, HOSTNAME=yethiraj30</div>
<div>NNODES=16, MYRANK=5, HOSTNAME=yethiraj30</div>
<div>NNODES=16, MYRANK=7, HOSTNAME=yethiraj30</div>
<div>[yethiraj30][[22604,1],0][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
connect() to 192.168.0.31 failed: No route to host
(113)</div>
<div>[yethiraj30][[22604,1],4][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
connect() to 192.168.0.31 failed: No route to host
(113)</div>
<div>[yethiraj30][[22604,1],6][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
connect() to 192.168.0.31 failed: No route to host
(113)</div>
<div>[yethiraj30][[22604,1],1][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
connect() to 192.168.0.31 failed: No route to host
(113)</div>
<div>[yethiraj30][[22604,1],3][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
connect() to 192.168.0.31 failed: No route to host
(113)</div>
<div>[yethiraj30][[22604,1],2][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect]
connect() to 192.168.0.31 failed: No route to host
(113)</div>
<div>NNODES=16, MYRANK=10, HOSTNAME=yethiraj31</div>
<div>NNODES=16, MYRANK=12, HOSTNAME=yethiraj31</div>
</div>
<div><br>
</div>
            <div>I am not sure how to resolve this issue. I can ssh
              from one machine to the other without any problem, but
              when I try to run Open MPI across both machines, I get
              this error. Any help would be appreciated.</div>
</td>
</tr>
</tbody>
</table>
</blockquote>
<br>
    Sorry, this is a problem with your MPI configuration, not with
    GROMACS: six of your processes on yethiraj30 cannot connect to
    192.168.0.31. You'll need to read the Open MPI documentation and/or
    talk with your sysadmins.<br>
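    <br>
    To see which ranks are affected, the connect() errors in the quoted
    log can be tallied with a short script. This is only a sketch: it
    assumes each Open MPI error sits on a single line in the real log
    file (the line breaks in the quote above come from the mail
    formatting), and the names SAMPLE_LOG, CONNECT_ERROR and
    failed_connections are made up for illustration.<br>
<pre>
```python
import re

# Error lines copied from the quoted log, each joined onto one line.
SAMPLE_LOG = """
NNODES=16, MYRANK=0, HOSTNAME=yethiraj30
[yethiraj30][[22604,1],0][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
[yethiraj30][[22604,1],4][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
[yethiraj30][[22604,1],6][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
[yethiraj30][[22604,1],1][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
[yethiraj30][[22604,1],3][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
[yethiraj30][[22604,1],2][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
NNODES=16, MYRANK=10, HOSTNAME=yethiraj31
"""

# Shape assumed from the quoted log:
# [host][[jobid,n],rank][source location] connect() to IP failed: reason (errno)
CONNECT_ERROR = re.compile(
    r"^\[([^\]]+)\]\[\[\d+,\d+\],(\d+)\]\[[^\]]*\]\s*"
    r"connect\(\) to (\S+) failed: (.+) \((\d+)\)$"
)


def failed_connections(log_text):
    """Map (source host, target IP) to the sorted list of ranks that failed."""
    failures = {}
    for line in log_text.splitlines():
        match = CONNECT_ERROR.match(line.strip())
        if match is None:
            continue  # not a connect() error line
        host, rank, ip, _reason, _errno = match.groups()
        failures.setdefault((host, ip), []).append(int(rank))
    return {key: sorted(ranks) for key, ranks in failures.items()}


if __name__ == "__main__":
    for (host, ip), ranks in failed_connections(SAMPLE_LOG).items():
        print(f"{host} to {ip}: ranks {ranks} could not connect")
```
</pre>
    For the quoted log this groups ranks 0, 1, 2, 3, 4 and 6 on
    yethiraj30 as unable to reach 192.168.0.31, which matches the count
    of six given above. "No route to host" often points to a firewall or
    to MPI choosing a network interface the other node cannot reach, but
    that is only a guess until the network setup has been checked.<br>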
<br>
Mark<br>
</body>
</html>