<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
Dear Dimitar, <br>
I'm following the debate regarding:<br>
<blockquote
cite="mid:BANLkTinHcv9nTtK8O2LTvie_NTuDBb8w3g@mail.gmail.com"
type="cite">
<div class="gmail_quote">
<div class="gmail_quote">The point was not "why" I was getting
the restarts, but the fact itself that I was getting restarts
close in time, as I stated in my first post. I actually also
don't know whether jobs are deleted or suspended. I've thought
that a job returned back to the queue will basically start
from the beginning when later moved to an empty slot ... so
don't understand the difference from that perspective.<br>
</div>
</div>
</blockquote>
<br>
In your second mail you say:<br>
<br>
<span style="border-collapse: collapse; font-family:
arial,sans-serif; font-size: 13px;">
<div>Submitted by:</div>
<div>========================</div>
<div><font size="1">ii=1</font></div>
<div><font size="1">ifmpi="mpirun -np $NSLOTS"</font></div>
<div><font size="1">--------</font></div>
<div><font size="1"> if [ ! -f run${ii}-i.tpr ];then</font></div>
<div>
<div><font size="1"> cp run${ii}.tpr run${ii}-i.tpr </font></div>
<div><font size="1"> tpbconv -s run${ii}-i.tpr -until
200000 -o run${ii}.tpr </font></div>
<div><font size="1"> fi</font></div>
<div><font size="1"><br>
</font></div>
<div><font size="1"> k=`ls md-${ii}*.out | wc -l`</font></div>
<div><font size="1"> outfile="md-${ii}-$k.out"</font></div>
<div><font size="1"> if [[ -f run${ii}.cpt ]]; then</font></div>
<div><font size="1"> </font></div>
<div><font size="1"> <b> $ifmpi `which mdrun` </b>-s
run${ii}.tpr -cpi run${ii}.cpt -v -deffnm run${ii} -npme 0
> $outfile 2>&1 </font></div>
<div><font size="1"><br>
</font></div>
<div><font size="1"> fi</font></div>
</div>
<div>=========================<br>
<br>
<br>
If I understand correctly, you are running the SERIAL mdrun
under mpirun. This means that multiple instances of mdrun are
running at the same time.<br>
Each instance of mdrun is an INDEPENDENT instance. Therefore
checkpoint files, one for each instance (i.e. one for each
CPU), are written at the same time.</div>
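<div><br>
A rough way to check which kind of mdrun you have is to look at the
shared libraries it is linked against. This is only a sketch: it assumes
that mdrun is dynamically linked and that the MPI library name contains
"libmpi" (as with shared builds of Open MPI or MPICH); a statically
linked MPI build would be reported as serial. The helper name
mdrun_kind is mine, just for illustration.<br>
<br>
</div>

```shell
#!/bin/bash
# Sketch only: classify a binary from its `ldd` listing, read on stdin.
# Assumption: an MPI-enabled build shows a library whose name contains
# "libmpi" (libmpi.so, libmpich.so, ...). The name mdrun_kind is hypothetical.
mdrun_kind() {
    if grep -qi 'libmpi'; then
        echo "mpi"
    else
        echo "serial"
    fi
}

# Intended use on the cluster (not executed here):
#   ldd "$(which mdrun)" | mdrun_kind
```

<div>If this prints "serial", then "mpirun -np $NSLOTS" simply starts
$NSLOTS separate copies of the same run, which is the situation I
describe above.</div>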
</span>
</body>
</html>