<HTML>
<HEAD>
<META content="text/html; charset=big5" http-equiv=Content-Type>
<META content="OPENWEBMAIL" name=GENERATOR>
</HEAD>
<BODY bgColor=#ffffff>
Hi,
<br />
<br />Does anyone here use GROMACS, LAM, and Condor together?
<br />I run GROMACS with LAM/MPI on a Condor system.
<br />Every time I submit a parallel job, I am assigned a node that is already occupied, and the performance of each CPU is below 10%.
<br />How should I change the scripts?
<br />Below are one submit script and two executable scripts.
<br />
<br />
condor_mpi:
<br />----
<br />#!/bin/bash
<br />Universe =
parallel
<br />Executable =
./lamscript
<br />machine_count =
2
<br />output =
md_$(NODE).out
<br />error =
md_$(NODE).err
<br />log =
md.log
<br />arguments =
/stathome/jiangsl/simulation/gromacs/2OMP/2OMP_1_1/md.sh
<br />+WantIOProxy =
True
<br />should_transfer_files =
yes
<br />when_to_transfer_output =
on_exit
<br />Queue
<br />-------
<br />
<br />
lamscript:
<br />
-------
<br />#!/bin/sh
<br />
<br />_CONDOR_PROCNO=$_CONDOR_PROCNO
<br />_CONDOR_NPROCS=$_CONDOR_NPROCS
<br />_CONDOR_REMOTE_SPOOL_DIR=$_CONDOR_REMOTE_SPOOL_DIR
<br />
<br />SSHD_SH=`condor_config_val
libexec`
<br />SSHD_SH=$SSHD_SH/sshd.sh
<br />
<br />CONDOR_SSH=`condor_config_val
libexec`
<br />CONDOR_SSH=$CONDOR_SSH/condor_ssh
<br />
<br /># Set this to the bin directory of your lam
installation
<br /># This also must be in your .cshrc file, so the remote
side
<br /># can find
it!
<br />export
LAMDIR=/stathome/jiangsl/soft/lam-7.1.4
<br />export
PATH=${LAMDIR}/bin:${PATH}
<br />export
LD_LIBRARY_PATH=/lib:/usr/lib:$LAMDIR/lib:.:/opt/intel/compilers/lib
<br />
<br />. $SSHD_SH $_CONDOR_PROCNO
$_CONDOR_NPROCS
<br />
<br /># If not the head node, just sleep forever, to let
the
<br /># sshds
run
<br />if [ $_CONDOR_PROCNO -ne 0
]
<br />then
<br />               
wait
<br />               
sshd_cleanup
<br />               
exit
0
<br />fi
<br />
<br />EXECUTABLE=$1
<br />shift
<br />
<br /># the binary is copied but the executable flag is cleared,
<br /># so the script has to take care of this
<br />chmod +x
$EXECUTABLE
<br />
<br /># to allow multiple lam jobs running on a single machine,
<br /># we have to give a somewhat unique value
<br />export
LAM_MPI_SESSION_SUFFIX=$$
<br />export
LAMRSH=$CONDOR_SSH
<br /># when a job is killed by the user, this script will get sigterm
<br /># This script has to catch it and do the cleaning for the
<br /># lam environment
<br />finalize()
<br />{
<br />sshd_cleanup
<br />lamhalt
<br />exit
<br />}
<br />trap finalize
TERM
<br />
<br />CONDOR_CONTACT_FILE=$_CONDOR_SCRATCH_DIR/contact
<br />export CONDOR_CONTACT_FILE
<br /># The second field in the contact file is the machine
name
<br /># that condor_ssh knows how to use. Note that this used
to
<br /># say "sort -n +0 ...", but -n option is now
deprecated.
<br />sort < $CONDOR_CONTACT_FILE | awk '{print $2}' >
machines
<br />
<br /># start the lam
environment
<br /># For older versions of lam you may need to remove the -ssi boot rsh
line
<br />lamboot -ssi boot rsh -ssi rsh_agent "$LAMRSH -x"
machines
<br />
<br />if [ $? -ne 0
]
<br />then
<br />        echo "lamscript error
booting
lam"
<br />        exit
1
<br />fi
<br />
<br />mpirun C -ssi rpi usysv -ssi coll_smp 1 $EXECUTABLE $@
&
<br />
<br />
CHILD=$!
<br />TMP=130
<br />while [ $TMP -gt 128 ] ;
do
<br />        wait
$CHILD
<br />       
TMP=$?;
<br />done
<br />
<br /># clean up
files
<br />sshd_cleanup
<br />/bin/rm -f
machines
<br />
<br /># clean up
lam
<br />lamhalt
<br />
<br />exit
$TMP
<br />----
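<br />
<br />A note on the machines list in lamscript: here is a minimal sketch of how it could be built with an explicit numeric sort on the first field, which is the current way to write the old "sort -n +0". It assumes each line of the contact file starts with the node number followed by the host name. contact_to_machines is a hypothetical helper name of mine, not part of Condor or LAM.
<pre>
```shell
#!/bin/sh
# Hypothetical helper (not part of Condor or LAM): print the host names
# from a Condor contact file, ordered by node number.
# Assumption: field 1 is the node number and field 2 is the host name.
contact_to_machines() {
    # -n -k1,1 sorts numerically on the first field only, so node 10
    # comes after node 9 instead of right after node 1.
    sort -n -k1,1 "$1" | awk '{print $2}'
}

# Possible use inside lamscript: call
#   contact_to_machines "$CONDOR_CONTACT_FILE"
# and redirect its output to the machines file before running lamboot.
```
</pre>
<br />I have not tested whether this changes which nodes Condor assigns; it only makes the ordering of the machines file explicit.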
<br />
<br />
md.sh:
<br />
----
<br />#!/bin/sh
<br />#running
GROMACS
<br />/stathome/jiangsl/soft/gromacs-4.0.5/bin/mdrun_mpi_d
\
<br />-s /stathome/jiangsl/simulation/gromacs/2OMP/2OMP_1_1/md/200ns.tpr
\
<br />-e /stathome/jiangsl/simulation/gromacs/2OMP/2OMP_1_1/md/200ns.edr
\
<br />-o /stathome/jiangsl/simulation/gromacs/2OMP/2OMP_1_1/md/200ns.trr
\
<br />-g /stathome/jiangsl/simulation/gromacs/2OMP/2OMP_1_1/md/200ns.log
\
<br />-c
/stathome/jiangsl/simulation/gromacs/2OMP/2OMP_1_1/md/200ns.gro
<br />-----
<br />
<br />
<br />Hsin-Lin
<br />
</BODY>
</HTML>