[gmx-users] multiple GPU on multiple nodes

jhon michael espinosa duran cyberjhon at hotmail.com
Mon Jan 27 23:22:14 CET 2014


Hi guys
I am using a Cray XK7 machine with one GPU (Tesla K20) and one 16-core AMD Opteron Interlagos (x86_64) CPU per node. I am trying to run the GPU version of GROMACS on just two nodes, but it is not working.
With one node and one GPU it works:
aprun -n 1 mdrun_mpi -deffnm filename

With two nodes and two GPUs it does not work. These are the command lines I have tried:
aprun -n 2 mdrun_mpi -deffnm filename
aprun -n 2 mdrun_mpi -gpu_id 00 -deffnm filename
aprun -n 2 mdrun_mpi -gpu_if 0011 -deffnm filename
aprun -n 32 mdrun_mpi -deffnm filename
aprun -n 32 mdrun_mpi -gpu_id 00 -deffnm filename
aprun -n 32 mdrun_mpi -gpu_if 0011 -deffnm filename
Sometimes I get errors like:
Program mdrun_mpi, VERSION 4.6.2
Source code file: /N/soft/cle4/gromacs/gromacs-4.6.2/src/gmxlib/gmx_detect_hardware.c, line: 580

Fatal error:
Some of the requested GPUs do not exist, behave strangely, or are not compatible:
    GPU #0: insane
    GPU #0: insane

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors


Program mdrun_mpi, VERSION 4.6.2
Source code file: /N/soft/cle4/gromacs/gromacs-4.6.2/src/gmxlib/statutil.c, line: 976

Invalid command line argument:
0
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
1 GPU detected on host nid00900:
  #0: NVIDIA Tesla K20, compute cap.: 3.5, ECC: yes, stat: compatible


-------------------------------------------------------
Program mdrun_mpi, VERSION 4.6.2
Source code file: /N/soft/cle4/gromacs/gromacs-4.6.2/src/gmxlib/gmx_detect_hardware.c, line: 580

Fatal error:
Some of the requested GPUs do not exist, behave strangely, or are not compatible:
    GPU #1: inexistent
    GPU #1: inexistent

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
1 GPU detected on host nid00900:
  #0: NVIDIA Tesla K20, compute cap.: 3.5, ECC: yes, stat: compatible

Compiled acceleration: None (Gromacs could use AVX_128_FMA on this machine, which is better)

-------------------------------------------------------
Program mdrun_mpi, VERSION 4.6.2
Source code file: /N/soft/cle4/gromacs/gromacs-4.6.2/src/gmxlib/gmx_detect_hardware.c, line: 356

Fatal error:
Incorrect launch configuration: mismatching number of PP MPI processes and GPUs per node.
mdrun_mpi was started with 2 PP MPI processes per node, but only 1 GPU were detected.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
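Reading the last error literally, mdrun wants the number of PP MPI processes on each node to match the number of GPUs detected there, so with one K20 per node that would mean one process per node. Below is a minimal sketch of what such a launch line could look like. It assumes aprun's -N (processes per node) and -d (CPUs per process) options and mdrun's -ntomp option for OpenMP threads; the helper name build_cmd is hypothetical, and the resulting line has not been tested on this machine.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a launch line that places
# one PP MPI process on each GPU and gives it an equal share of the cores.
build_cmd() {
    nodes=$1            # number of nodes
    gpus_per_node=$2    # GPUs detected on each node (1 on this machine)
    cores_per_node=$3   # CPU cores on each node (16 on this machine)

    ranks=$((nodes * gpus_per_node))             # total MPI processes
    threads=$((cores_per_node / gpus_per_node))  # OpenMP threads per process

    # One digit per process on a node: "0" for one GPU, "01" for two.
    # Only meaningful for fewer than ten GPUs per node.
    gpu_ids=""
    i=0
    while [ "$i" -lt "$gpus_per_node" ]; do
        gpu_ids="${gpu_ids}${i}"
        i=$((i + 1))
    done

    echo "aprun -n $ranks -N $gpus_per_node -d $threads mdrun_mpi -ntomp $threads -gpu_id $gpu_ids -deffnm filename"
}

build_cmd 2 1 16
```

For two nodes with one GPU and 16 cores each, this prints aprun -n 2 -N 1 -d 16 mdrun_mpi -ntomp 16 -gpu_id 0 -deffnm filename. Whether 16 threads per process is the right choice on Interlagos is a separate question.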
If you have any idea how to make this work, please let me know.
John Michael 