[gmx-users] internal MPI error: GER overflow

Dr. Y. U. Sasidhar sasidhar at chem.iitb.ac.in
Tue Sep 24 06:26:04 CEST 2002


I am getting the following error on a 16-node cluster running Red Hat Linux
7.3. What should I do to correct it?
gmx_mdrun_mpi script:
=====================
lamboot
mpirun -v -c 16 -s n0 -lamd mdrun_mpi -np 16 -v -s full.tpr -e full.edr \
    -o full.trr -c after_full.gro -g full.log >& full.job &
tail full.job
echo " finished the script "
===================
./gmx_mdrun_mpi

LAM 6.5.6/MPI 2 C++/ROMIO - University of Notre Dame

 finished the script
sasidhar@cluster:~/ykgqp$ more full.job
15301 mdrun_mpi running on n0 (o)
4745 mdrun_mpi running on n1
4340 mdrun_mpi running on n2
4272 mdrun_mpi running on n3
4527 mdrun_mpi running on n4
4487 mdrun_mpi running on n5
5032 mdrun_mpi running on n6
4370 mdrun_mpi running on n7
3931 mdrun_mpi running on n8
4288 mdrun_mpi running on n9
4245 mdrun_mpi running on n10
3947 mdrun_mpi running on n11
4074 mdrun_mpi running on n12
4215 mdrun_mpi running on n13
4196 mdrun_mpi running on n14
4180 mdrun_mpi running on n15
NNODES=16, MYRANK=2, HOSTNAME=node02
NNODES=16, MYRANK=5, HOSTNAME=node05
NNODES=16, MYRANK=6, HOSTNAME=node06
NNODES=16, MYRANK=9, HOSTNAME=node09
NNODES=16, MYRANK=8, HOSTNAME=node08
NNODES=16, MYRANK=10, HOSTNAME=node10
NNODES=16, MYRANK=7, HOSTNAME=node07
NNODES=16, MYRANK=12, HOSTNAME=node12
NNODES=16, MYRANK=15, HOSTNAME=node15
NNODES=16, MYRANK=14, HOSTNAME=node14
NNODES=16, MYRANK=11, HOSTNAME=node11
NNODES=16, MYRANK=13, HOSTNAME=node13
MPI_Isend: internal MPI error: GER overflow (rank 5, MPI_COMM_WORLD)
NNODES=16, MYRANK=4, HOSTNAME=node04
NNODES=16, MYRANK=3, HOSTNAME=node03
-----------------------------------------------------------------------------

One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 4340 failed on node n2 with exit status 1.
-----------------------------------------------------------------------------
Rank (5, MPI_COMM_WORLD): Call stack within LAM:
Rank (5, MPI_COMM_WORLD):  - MPI_Isend()
Rank (5, MPI_COMM_WORLD):  - main()
sasidhar@cluster:~/ykgqp$ lamhalt

LAM 6.5.6/MPI 2 C++/ROMIO - University of Notre Dame
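
For what it is worth, one workaround I have seen suggested (not tried here
yet) is to avoid the daemon-routed transport, since GER (guaranteed
envelope resources) is the flow control LAM applies in -lamd mode. A
sketch of the same launch with those changes, assuming this mpirun accepts
the -c2c and -nger flags (the LAM 6.5.x mpirun man page lists both; please
verify on your install):
=====================
lamboot
# -c2c: direct client-to-client messaging instead of routing via the
# LAM daemons; -nger: disable guaranteed envelope resources entirely
mpirun -v -c 16 -s n0 -c2c -nger mdrun_mpi -np 16 -v -s full.tpr \
    -e full.edr -o full.trr -c after_full.gro -g full.log >& full.job &
=====================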

-- 
 Sasidhar