<html>
<head>
<style>
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
FONT-SIZE: 10pt;
FONT-FAMILY:Tahoma
}
</style>
</head>
<body class='hmmessage'><div style="text-align: left;">Hi,<br><br>Do you really have no error or warning messages at the end of your log<br>file, stdlog or stderr?<br><br>Up to now there has been only one report of problems.<br>This is on a Cray XT4, where some DLB jobs (with initially empty cells)<br>stop at step 10 with the error message that some cell dimensions have become 0.<br>Unfortunately, I cannot reproduce this on an x86_64 Linux machine.<br>So we will have to do some XT4 debugging.<br><br>Can you produce core dump files?<br><br>Berk<br></div><br><br><br><hr id="stopSpelling">> From: st01397@student.uib.no<br>> To: gmx-users@gromacs.org<br>> Date: Mon, 29 Sep 2008 15:21:34 +0200<br>> CC: <br>> Subject: [gmx-users] Possible bug in parallelization, PME or load-balancing on Gromacs 4.0_rc1 ??<br>> <br>> I am running some annealing trials on a Cray XT4. Although the<br>> throughput is impressive, I have severe difficulties with the stability of<br>> the code.<br>> For my relatively small system of ~7500 atoms, the engine typically crashes<br>> after ~500k steps.<br>> <br>> I am using the bleeding-edge CVS version: mdrun.c (1.141) (the newest<br>> one after Erik L.'s recent patch of the PME code) <br>> <br>> I configure and compile on the compute nodes exclusively (not the<br>> frontend) and the only compiler warning(s) I get are of the type:<br>> <br>> "warning: Using 'getpwuid' in statically linked applications requires <br>> at runtime the shared libraries from the glibc version used for linking"<br>> <br>> After compiling, though, the code executes and runs for ~20 min, producing<br>> sound data before stalling.<br>> <br>> The error logs are very short and quite uninformative.<br>> <br>> PBS .o: <br>> Application 159316 exit codes: 137<br>> Application 159316 exit signals: Killed<br>> Application 159316 resources: utime 0, stime 0<br>> --------------------------------------------------<br>> Begin PBS Epilogue hexagon.bccs.uib.no<br>> Date: Mon Sep 29 
12:32:54 CEST 2008<br>> Job ID: 65643.nid00003<br>> Username: bjornss<br>> Group: bjornss<br>> Job Name: pmf_hydanneal_heatup_400K<br>> Session: 10156<br>> Limits: walltime=05:00:00<br>> Resources:<br>> cput=00:00:00,mem=4940kb,vmem=22144kb,walltime=00:20:31<br>> Queue: batch<br>> Account: fysisk<br>> Base login-node: login5<br>> End PBS Epilogue Mon Sep 29 12:32:54 CEST 2008<br>> <br>> PBS .err:<br>> _pmii_daemon(SIGCHLD): PE 0 exit signal Killed<br>> [NID 702]Apid 159316: initiated application termination.<br>> <br>> As proper electrostatics are crucial to my modeling, I am using PME, which<br>> accounts for a large part of my calculation cost: 35-50%.<br>> In the most extreme case, I use the following startup script:<br>> <br>> run.pbs:<br>> <br>> #!/bin/bash<br>> #PBS -A fysisk<br>> #PBS -N pmf_hydanneal_heatup_400K<br>> #PBS -o pmf_hydanneal.o<br>> #PBS -e pmf.hydanneal.err<br>> #PBS -l walltime=5:00:00,mppwidth=40,mppnppn=4<br>> <br>> cd /work/bjornss/pmf/structII/hydrate_annealing/heatup_400K<br>> source $HOME/gmx_latest_290908/bin/GMXRC<br>> <br>> aprun -n 40 parmdrun -s topol.tpr -maxh 5 -npme 20<br>> exit $?<br>> <br>> <br>> Now, apart from a significant reduction in the system dipole moment,<br>> there are no large changes in the system, nor significant translations<br>> of the molecules in the box.<br>> <br>> I enclose the md.log and my parameter file. 
The run topology (topol.tpr)<br>> can be found at:<br>> <br>> http:/drop.io/mdanneal<br>> <br>> If anyone wants to try to replicate the crash on their local cluster,<br>> they are welcome.<br>> If the error persists after such trials, I am willing to<br>> file a bug on Bugzilla.<br>> <br>> <br>> If more information is needed, I will try to provide it upon request.<br>> <br>> <br>> Regards and thanks for bothering<br>> <br>> -- <br>> ---------------------<br>> Bjørn Steen Saethre <br>> PhD-student<br>> Theoretical and Energy Physics Unit<br>> Institute of Physics and Technology<br>> Allegt, 41<br>> N-5020 Bergen<br>> Norway<br>> <br>> Tel(office) +47 55582869 <br>> <br>> <br></body>
</html>