<div style="font-family: 'Times New Roman'; font-size: 16px;"><br /><br /><span>On 02/10/10, <b class="name">Jennifer Williams </b> <Jennifer.Williams@ed.ac.uk> wrote:</span><blockquote cite="mid:20100210113942.jte74cv34kw4sksc@www.staffmail.ed.ac.uk" class="iwcQuote" style="border-left: 1px solid rgb(0, 0, 255); padding-left: 13px; margin-left: 0pt;" type="cite"><div class="mimepart text plain"><br />Sorry for the delay in replying back. I start the job using the following script file:<br /><br />#$ -S /bin/bash<br />#$ -l h_rt=47:59:00<br />#$ -j y<br />#$ -pe mpich2 8<br />#$ -cwd<br />cd /home/jwillia4/GRO/gromacs-4.0.7/JJW_003/PH_TORUN<br />/home/jwillia4/GRO/bin/mpirun -np 8 /home/jwillia4/GRO/bin/mdrun_mpi -v -s md.tpr<br /><br />The strange thing is that sometimes it works and the job runs to completion and sometimes it crashes immediately with the orte error so I know that it is not the input files causing the problems. It seems entirely random.</div></blockquote>That sounds like some kind of dynamic linking problem. You may be able to constrain the GROMACS configure program to link statically to your choice of MPI library with --enable-static or something - but only if static versions of the MPI libraries exist.<br /><blockquote cite="mid:20100210113942.jte74cv34kw4sksc@www.staffmail.ed.ac.uk" class="iwcQuote" style="border-left: 1px solid rgb(0, 0, 255); padding-left: 13px; margin-left: 0pt;" type="cite"><div class="mimepart text plain"><br />Has it to do with the -pe mpich2 8 line? I was previously using Open MPI installed on the cluster for common use but now have downloaded everything into my home directory. The script has been adapted from the time when I didn't have my own OpenMPI in my home directory. Perhaps it needs further alteration but I don't know what.</div></blockquote>Try things and see. We've no idea what your queueing flags are or should be doing, but involving two different MPI libraries is asking for trouble.<br _moz_dirty="" /><blockquote cite="mid:20100210113942.jte74cv34kw4sksc@www.staffmail.ed.ac.uk" class="iwcQuote" style="border-left: 1px solid rgb(0, 0, 255); padding-left: 13px; margin-left: 0pt;" type="cite"><div class="mimepart text plain">How would I do about checking whether MPI is running?</div></blockquote>By running a test program. Either get a "Hello world" program from an MPI tutorial, or perhaps something available with the library itself.<br _moz_dirty="" /><br _moz_dirty="" />Mark<br /><blockquote cite="mid:20100210113942.jte74cv34kw4sksc@www.staffmail.ed.ac.uk" class="iwcQuote" style="border-left: 1px solid rgb(0, 0, 255); padding-left: 13px; margin-left: 0pt;" type="cite"><div class="mimepart text plain"><br />If you spot anything suspicious in the above commands please let me know.<br /><br />Thanks<br /><br />Jenny<br /><br /><br />Quoting Chandan Choudhury <iitdckc@gmail.com>:<br /><br />>As Justin said give the command line options for mdrun and also check that<br />>your mpi environment is running. Better to run a parallel job and check its<br />>output.<br />><br />>Chadnan<br />><br />>--<br />>Chandan kumar Choudhury<br />>NCL, Pune<br />>INDIA<br />><br />><br />>On Mon, Feb 8, 2010 at 8:02 PM, Justin A. Lemkul <jalemkul@vt.edu> wrote:<br />><br />>><br />>><br />>>Jennifer Williams wrote:<br />>><br />>>><br />>>>Dear All,<br />>>><br />>>>I am having problems compiling gromacs 4.0.7 in parallel. 
Mark

> If you spot anything suspicious in the above commands, please let me
> know.
>
> Thanks
>
> Jenny
>
> Quoting Chandan Choudhury <iitdckc@gmail.com>:
>
>> As Justin said, give the command-line options for mdrun and also check
>> that your mpi environment is running. Better to run a parallel job and
>> check its output.
>>
>> Chandan
>>
>> --
>> Chandan kumar Choudhury
>> NCL, Pune
>> INDIA
>>
>> On Mon, Feb 8, 2010 at 8:02 PM, Justin A. Lemkul <jalemkul@vt.edu> wrote:
>>
>>> Jennifer Williams wrote:
>>>
>>>> Dear All,
>>>>
>>>> I am having problems compiling gromacs 4.0.7 in parallel. I am
>>>> following the Quick and Dirty Installation instructions on the
>>>> gromacs webpage. I downloaded the versions of fftw, OpenMPI and
>>>> gromacs-4.0.7 following these instructions.
>>>>
>>>> Everything seems to compile OK and I get all the serial executables,
>>>> including mdrun, written to my bin directory, and they seem to run
>>>> fine. However, when I try to run mdrun_mpi on 6 nodes I get the
>>>> following:
>>>>
>>>> [vlxbig16:08666] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>> runtime/orte_init_stage1.c at line 182
>>>> [vlxbig16:08667] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>> runtime/orte_init_stage1.c at line 182
>>>> [vlxbig16:08700] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>> runtime/orte_init_stage1.c at line 182
>>>> [vlxbig16:08670] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>> runtime/orte_init_stage1.c at line 182
>>>> [vlxbig16:08681] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>> runtime/orte_init_stage1.c at line 182
>>>> [vlxbig16:08659] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>> runtime/orte_init_stage1.c at line 182
>>>> --------------------------------------------------------------------------
>>>> It looks like orte_init failed for some reason; your parallel process is
>>>> likely to abort. There are many reasons that a parallel process can
>>>> fail during orte_init; some of which are due to configuration or
>>>> environment problems. This failure appears to be an internal failure;
>>>> here's some additional information (which may only be relevant to an
>>>> Open MPI developer):
>>>>
>>>>   orte_rml_base_select failed
>>>>   --> Returned value -13 instead of ORTE_SUCCESS
>>>>
>>>> Does anyone have any idea what is causing this? Computer support at
>>>> my University is not sure.
>>>>
>>> How are you launching mdrun_mpi (command line)?
>>>
>>> -Justin
>>>
>>>> Thanks
>>>
>>> --
>>> ========================================
>>>
>>> Justin A. Lemkul
>>> Ph.D. Candidate
>>> ICTAS Doctoral Scholar
>>> MILES-IGERT Trainee
>>> Department of Biochemistry
>>> Virginia Tech
>>> Blacksburg, VA
>>> jalemkul[at]vt.edu | (540) 231-9080
>>> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
>>>
>>> ========================================
>
> Dr. Jennifer Williams
> Institute for Materials and Processes
> School of Engineering
> University of Edinburgh
> Sanderson Building
> The King's Buildings
> Mayfield Road
> Edinburgh, EH9 3JL, United Kingdom
> Phone: +44 (0)131 650 4861
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.