<p dir="ltr">Hi,</p>
<p dir="ltr">Are you trying to run several of these at once? If so, you need to manage details better, because just doing three 8 core runs with pinning and different gpu ids will leave 16 cores idle. See examples at <a href="http://manual.gromacs.org/documentation/2016.2/user-guide/mdrun-performance.html">http://manual.gromacs.org/documentation/2016.2/user-guide/mdrun-performance.html</a>. Or better, use the MPI mdrun to do a multi simulation and let it get the details right. </p>
<p dir="ltr">Mark</p>
<br><div class="gmail_quote"><div dir="ltr">On Wed, 22 Feb 2017 17:54 Igor Leontyev <<a href="mailto:ileontyev@ucdavis.edu">ileontyev@ucdavis.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> ><br class="gmail_msg">
> What CPU vs GPU time per step gets reported at the end of the log<br class="gmail_msg">
> file?<br class="gmail_msg">
<br class="gmail_msg">
Thank you, Berk, for the prompt response. Here is my log file with<br class="gmail_msg">
all the details.<br class="gmail_msg">
<br class="gmail_msg">
=================================================<br class="gmail_msg">
Host: compute-0-113.local pid: 12081 rank ID: 0 number of ranks: 1<br class="gmail_msg">
:-) GROMACS - gmx mdrun, 2016.2 (-:<br class="gmail_msg">
<br class="gmail_msg">
GROMACS is written by:<br class="gmail_msg">
...........................................................<br class="gmail_msg">
<br class="gmail_msg">
GROMACS: gmx mdrun, version 2016.2<br class="gmail_msg">
Executable:<br class="gmail_msg">
/home/leontyev/programs/bin/gromacs/gromacs-2016.2/bin/gmx_avx2_gpu<br class="gmail_msg">
Data prefix: /home/leontyev/programs/bin/gromacs/gromacs-2016.2<br class="gmail_msg">
Working dir:<br class="gmail_msg">
/share/COMMON2/MDRUNS/GROMACS/MUTATIONS/PROTEINS/coc-Flu_A-B_LIGs/MDRUNS/InP/fluA/Output_test/6829_6818_9/Gromacs.571690<br class="gmail_msg">
Command line:<br class="gmail_msg">
gmx_avx2_gpu mdrun -nb gpu -gpu_id 3 -pin on -nt 8 -s 6829_6818-liq_0.tpr -e /state/partition1/Gromacs.571690.0//6829_6818-liq_0.edr -dhdl /state/partition1/Gromacs.571690.0//6829_6818-liq_0.xvg -o /state/partition1/Gromacs.571690.0//6829_6818-liq_0.trr -x /state/partition1/Gromacs.571690.0//6829_6818-liq_0.xtc -cpo /state/partition1/Gromacs.571690.0//6829_6818-liq_0.cpt -c 6829_6818-liq_0.gro -g 6829_6818-liq_0.log<br class="gmail_msg">
<br class="gmail_msg">
GROMACS version: 2016.2<br class="gmail_msg">
Precision: single<br class="gmail_msg">
Memory model: 64 bit<br class="gmail_msg">
MPI library: thread_mpi<br class="gmail_msg">
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 32)<br class="gmail_msg">
GPU support: CUDA<br class="gmail_msg">
SIMD instructions: AVX2_256<br class="gmail_msg">
FFT library: fftw-3.3.4-sse2-avx<br class="gmail_msg">
RDTSCP usage: enabled<br class="gmail_msg">
TNG support: enabled<br class="gmail_msg">
Hwloc support: disabled<br class="gmail_msg">
Tracing support: disabled<br class="gmail_msg">
Built on: Mon Feb 20 18:26:54 PST 2017<br class="gmail_msg">
Built by: <a href="mailto:leontyev@cluster01.interxinc.com" class="gmail_msg" target="_blank">leontyev@cluster01.interxinc.com</a> [CMAKE]<br class="gmail_msg">
Build OS/arch: Linux 2.6.32-642.el6.x86_64 x86_64<br class="gmail_msg">
Build CPU vendor: Intel<br class="gmail_msg">
Build CPU brand: Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz<br class="gmail_msg">
Build CPU family: 6 Model: 45 Stepping: 7<br class="gmail_msg">
Build CPU features: aes apic avx clfsh cmov cx8 cx16 htt lahf mmx msr<br class="gmail_msg">
nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3<br class="gmail_msg">
sse4.1 sse4.2 ssse3 tdt x2apic<br class="gmail_msg">
C compiler: /share/apps/devtoolset-1.1/root/usr/bin/gcc GNU 4.7.2<br class="gmail_msg">
C compiler flags: -march=core-avx2 -static-libgcc -static-libstdc++<br class="gmail_msg">
-O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast<br class="gmail_msg">
C++ compiler: /share/apps/devtoolset-1.1/root/usr/bin/g++ GNU 4.7.2<br class="gmail_msg">
C++ compiler flags: -march=core-avx2 -std=c++0x -O3 -DNDEBUG<br class="gmail_msg">
-funroll-all-loops -fexcess-precision=fast<br class="gmail_msg">
CUDA compiler: /share/apps/cuda-8.0/bin/nvcc nvcc: NVIDIA (R) Cuda<br class="gmail_msg">
compiler driver;Copyright (c) 2005-2016 NVIDIA Corporation;Built on<br class="gmail_msg">
Sun_Sep__4_22:14:01_CDT_2016;Cuda compilation tools, release 8.0, V8.0.44<br class="gmail_msg">
CUDA compiler<br class="gmail_msg">
flags:-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_35,code=compute_35;-gencode;arch=compute_52,code=compute_52;-gencode;arch=compute_61,code=compute_61;-use_fast_math;;;-Xcompiler;,-march=core-avx2,,,,,,;-Xcompiler;-O3,-DNDEBUG,-funroll-all-loops,-fexcess-precision=fast,,;<br class="gmail_msg">
<br class="gmail_msg">
CUDA driver: 8.0<br class="gmail_msg">
CUDA runtime: 8.0<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
Running on 1 node with total 24 cores, 24 logical cores, 4 compatible GPUs<br class="gmail_msg">
Hardware detected:<br class="gmail_msg">
CPU info:<br class="gmail_msg">
Vendor: Intel<br class="gmail_msg">
Brand: Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz<br class="gmail_msg">
Family: 6 Model: 63 Stepping: 2<br class="gmail_msg">
Features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf<br class="gmail_msg">
mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp<br class="gmail_msg">
sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic<br class="gmail_msg">
SIMD instructions most likely to fit this hardware: AVX2_256<br class="gmail_msg">
SIMD instructions selected at GROMACS compile time: AVX2_256<br class="gmail_msg">
<br class="gmail_msg">
Hardware topology: Basic<br class="gmail_msg">
Sockets, cores, and logical processors:<br class="gmail_msg">
Socket 0: [ 0] [ 1] [ 2] [ 3] [ 4] [ 5] [ 6] [ 7] [ 8] [ 9] [ 10] [ 11]<br class="gmail_msg">
Socket 1: [ 12] [ 13] [ 14] [ 15] [ 16] [ 17] [ 18] [ 19] [ 20] [ 21] [ 22] [ 23]<br class="gmail_msg">
GPU info:<br class="gmail_msg">
Number of GPUs detected: 4<br class="gmail_msg">
#0: NVIDIA GeForce GTX 1080, compute cap.: 6.1, ECC: no, stat: compatible<br class="gmail_msg">
#1: NVIDIA GeForce GTX 1080, compute cap.: 6.1, ECC: no, stat: compatible<br class="gmail_msg">
#2: NVIDIA GeForce GTX 1080, compute cap.: 6.1, ECC: no, stat: compatible<br class="gmail_msg">
#3: NVIDIA GeForce GTX 1080, compute cap.: 6.1, ECC: no, stat: compatible<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
For optimal performance with a GPU nstlist (now 10) should be larger.<br class="gmail_msg">
The optimum depends on your CPU and GPU resources.<br class="gmail_msg">
You might want to try several nstlist values.<br class="gmail_msg">
Changing nstlist from 10 to 40, rlist from 0.9 to 0.932<br class="gmail_msg">
<br class="gmail_msg">
Input Parameters:<br class="gmail_msg">
integrator = sd<br class="gmail_msg">
tinit = 0<br class="gmail_msg">
dt = 0.001<br class="gmail_msg">
nsteps = 10000<br class="gmail_msg">
init-step = 0<br class="gmail_msg">
simulation-part = 1<br class="gmail_msg">
comm-mode = Linear<br class="gmail_msg">
nstcomm = 100<br class="gmail_msg">
bd-fric = 0<br class="gmail_msg">
ld-seed = 1103660843<br class="gmail_msg">
emtol = 10<br class="gmail_msg">
emstep = 0.01<br class="gmail_msg">
niter = 20<br class="gmail_msg">
fcstep = 0<br class="gmail_msg">
nstcgsteep = 1000<br class="gmail_msg">
nbfgscorr = 10<br class="gmail_msg">
rtpi = 0.05<br class="gmail_msg">
nstxout = 10000000<br class="gmail_msg">
nstvout = 10000000<br class="gmail_msg">
nstfout = 0<br class="gmail_msg">
nstlog = 20000<br class="gmail_msg">
nstcalcenergy = 100<br class="gmail_msg">
nstenergy = 1000<br class="gmail_msg">
nstxout-compressed = 5000<br class="gmail_msg">
compressed-x-precision = 1000<br class="gmail_msg">
cutoff-scheme = Verlet<br class="gmail_msg">
nstlist = 40<br class="gmail_msg">
ns-type = Grid<br class="gmail_msg">
pbc = xyz<br class="gmail_msg">
periodic-molecules = false<br class="gmail_msg">
verlet-buffer-tolerance = 0.005<br class="gmail_msg">
rlist = 0.932<br class="gmail_msg">
coulombtype = PME<br class="gmail_msg">
coulomb-modifier = Potential-shift<br class="gmail_msg">
rcoulomb-switch = 0.9<br class="gmail_msg">
rcoulomb = 0.9<br class="gmail_msg">
epsilon-r = 1<br class="gmail_msg">
epsilon-rf = inf<br class="gmail_msg">
vdw-type = Cut-off<br class="gmail_msg">
vdw-modifier = Potential-shift<br class="gmail_msg">
rvdw-switch = 0.9<br class="gmail_msg">
rvdw = 0.9<br class="gmail_msg">
DispCorr = EnerPres<br class="gmail_msg">
table-extension = 1<br class="gmail_msg">
fourierspacing = 0.12<br class="gmail_msg">
fourier-nx = 42<br class="gmail_msg">
fourier-ny = 42<br class="gmail_msg">
fourier-nz = 40<br class="gmail_msg">
pme-order = 6<br class="gmail_msg">
ewald-rtol = 1e-05<br class="gmail_msg">
ewald-rtol-lj = 0.001<br class="gmail_msg">
lj-pme-comb-rule = Geometric<br class="gmail_msg">
ewald-geometry = 0<br class="gmail_msg">
epsilon-surface = 0<br class="gmail_msg">
implicit-solvent = No<br class="gmail_msg">
gb-algorithm = Still<br class="gmail_msg">
nstgbradii = 1<br class="gmail_msg">
rgbradii = 1<br class="gmail_msg">
gb-epsilon-solvent = 80<br class="gmail_msg">
gb-saltconc = 0<br class="gmail_msg">
gb-obc-alpha = 1<br class="gmail_msg">
gb-obc-beta = 0.8<br class="gmail_msg">
gb-obc-gamma = 4.85<br class="gmail_msg">
gb-dielectric-offset = 0.009<br class="gmail_msg">
sa-algorithm = Ace-approximation<br class="gmail_msg">
sa-surface-tension = 2.05016<br class="gmail_msg">
tcoupl = No<br class="gmail_msg">
nsttcouple = 5<br class="gmail_msg">
nh-chain-length = 0<br class="gmail_msg">
print-nose-hoover-chain-variables = false<br class="gmail_msg">
pcoupl = Parrinello-Rahman<br class="gmail_msg">
pcoupltype = Isotropic<br class="gmail_msg">
nstpcouple = 5<br class="gmail_msg">
tau-p = 0.5<br class="gmail_msg">
compressibility (3x3):<br class="gmail_msg">
compressibility[ 0]={ 5.00000e-05, 0.00000e+00, 0.00000e+00}<br class="gmail_msg">
compressibility[ 1]={ 0.00000e+00, 5.00000e-05, 0.00000e+00}<br class="gmail_msg">
compressibility[ 2]={ 0.00000e+00, 0.00000e+00, 5.00000e-05}<br class="gmail_msg">
ref-p (3x3):<br class="gmail_msg">
ref-p[ 0]={ 1.01325e+00, 0.00000e+00, 0.00000e+00}<br class="gmail_msg">
ref-p[ 1]={ 0.00000e+00, 1.01325e+00, 0.00000e+00}<br class="gmail_msg">
ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 1.01325e+00}<br class="gmail_msg">
refcoord-scaling = All<br class="gmail_msg">
posres-com (3):<br class="gmail_msg">
posres-com[0]= 0.00000e+00<br class="gmail_msg">
posres-com[1]= 0.00000e+00<br class="gmail_msg">
posres-com[2]= 0.00000e+00<br class="gmail_msg">
posres-comB (3):<br class="gmail_msg">
posres-comB[0]= 0.00000e+00<br class="gmail_msg">
posres-comB[1]= 0.00000e+00<br class="gmail_msg">
posres-comB[2]= 0.00000e+00<br class="gmail_msg">
QMMM = false<br class="gmail_msg">
QMconstraints = 0<br class="gmail_msg">
QMMMscheme = 0<br class="gmail_msg">
MMChargeScaleFactor = 1<br class="gmail_msg">
qm-opts:<br class="gmail_msg">
ngQM = 0<br class="gmail_msg">
constraint-algorithm = Lincs<br class="gmail_msg">
continuation = false<br class="gmail_msg">
Shake-SOR = false<br class="gmail_msg">
shake-tol = 0.0001<br class="gmail_msg">
lincs-order = 12<br class="gmail_msg">
lincs-iter = 1<br class="gmail_msg">
lincs-warnangle = 30<br class="gmail_msg">
nwall = 0<br class="gmail_msg">
wall-type = 9-3<br class="gmail_msg">
wall-r-linpot = -1<br class="gmail_msg">
wall-atomtype[0] = -1<br class="gmail_msg">
wall-atomtype[1] = -1<br class="gmail_msg">
wall-density[0] = 0<br class="gmail_msg">
wall-density[1] = 0<br class="gmail_msg">
wall-ewald-zfac = 3<br class="gmail_msg">
pull = false<br class="gmail_msg">
rotation = false<br class="gmail_msg">
interactiveMD = false<br class="gmail_msg">
disre = No<br class="gmail_msg">
disre-weighting = Conservative<br class="gmail_msg">
disre-mixed = false<br class="gmail_msg">
dr-fc = 1000<br class="gmail_msg">
dr-tau = 0<br class="gmail_msg">
nstdisreout = 100<br class="gmail_msg">
orire-fc = 0<br class="gmail_msg">
orire-tau = 0<br class="gmail_msg">
nstorireout = 100<br class="gmail_msg">
free-energy = yes<br class="gmail_msg">
init-lambda = -1<br class="gmail_msg">
init-lambda-state = 0<br class="gmail_msg">
delta-lambda = 0<br class="gmail_msg">
nstdhdl = 100<br class="gmail_msg">
n-lambdas = 13<br class="gmail_msg">
separate-dvdl:<br class="gmail_msg">
fep-lambdas = FALSE<br class="gmail_msg">
mass-lambdas = FALSE<br class="gmail_msg">
coul-lambdas = TRUE<br class="gmail_msg">
vdw-lambdas = TRUE<br class="gmail_msg">
bonded-lambdas = TRUE<br class="gmail_msg">
restraint-lambdas = FALSE<br class="gmail_msg">
temperature-lambdas = FALSE<br class="gmail_msg">
all-lambdas:<br class="gmail_msg">
fep-lambdas          = 0 0 0 0 0 0 0 0 0 0 0 0 0<br class="gmail_msg">
mass-lambdas         = 0 0 0 0 0 0 0 0 0 0 0 0 0<br class="gmail_msg">
coul-lambdas         = 0 0.03 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.97 1<br class="gmail_msg">
vdw-lambdas          = 0 0.03 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.97 1<br class="gmail_msg">
bonded-lambdas       = 0 0.03 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.97 1<br class="gmail_msg">
restraint-lambdas    = 0 0 0 0 0 0 0 0 0 0 0 0 0<br class="gmail_msg">
temperature-lambdas  = 0 0 0 0 0 0 0 0 0 0 0 0 0<br class="gmail_msg">
calc-lambda-neighbors = -1<br class="gmail_msg">
dhdl-print-energy = potential<br class="gmail_msg">
sc-alpha = 0.1<br class="gmail_msg">
sc-power = 1<br class="gmail_msg">
sc-r-power = 6<br class="gmail_msg">
sc-sigma = 0.3<br class="gmail_msg">
sc-sigma-min = 0.3<br class="gmail_msg">
sc-coul = true<br class="gmail_msg">
dh-hist-size = 0<br class="gmail_msg">
dh-hist-spacing = 0.1<br class="gmail_msg">
separate-dhdl-file = yes<br class="gmail_msg">
dhdl-derivatives = yes<br class="gmail_msg">
cos-acceleration = 0<br class="gmail_msg">
deform (3x3):<br class="gmail_msg">
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}<br class="gmail_msg">
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}<br class="gmail_msg">
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}<br class="gmail_msg">
simulated-tempering = false<br class="gmail_msg">
E-x:<br class="gmail_msg">
n = 0<br class="gmail_msg">
E-xt:<br class="gmail_msg">
n = 0<br class="gmail_msg">
E-y:<br class="gmail_msg">
n = 0<br class="gmail_msg">
E-yt:<br class="gmail_msg">
n = 0<br class="gmail_msg">
E-z:<br class="gmail_msg">
n = 0<br class="gmail_msg">
E-zt:<br class="gmail_msg">
n = 0<br class="gmail_msg">
swapcoords = no<br class="gmail_msg">
userint1 = 0<br class="gmail_msg">
userint2 = 0<br class="gmail_msg">
userint3 = 0<br class="gmail_msg">
userint4 = 0<br class="gmail_msg">
userreal1 = 0<br class="gmail_msg">
userreal2 = 0<br class="gmail_msg">
userreal3 = 0<br class="gmail_msg">
userreal4 = 0<br class="gmail_msg">
grpopts:<br class="gmail_msg">
nrdf: 6332.24 62.9925 18705.8<br class="gmail_msg">
ref-t: 298.15 298.15 298.15<br class="gmail_msg">
tau-t: 1 1 1<br class="gmail_msg">
annealing: No No No<br class="gmail_msg">
annealing-npoints: 0 0 0<br class="gmail_msg">
acc: 0 0 0<br class="gmail_msg">
nfreeze: N N N<br class="gmail_msg">
energygrp-flags[ 0]: 0<br class="gmail_msg">
<br class="gmail_msg">
Using 1 MPI thread<br class="gmail_msg">
Using 8 OpenMP threads<br class="gmail_msg">
<br class="gmail_msg">
1 GPU user-selected for this run.<br class="gmail_msg">
Mapping of GPU ID to the 1 PP rank in this node: 3<br class="gmail_msg">
<br class="gmail_msg">
Will do PME sum in reciprocal space for electrostatic interactions.<br class="gmail_msg">
<br class="gmail_msg">
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++<br class="gmail_msg">
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G.<br class="gmail_msg">
Pedersen<br class="gmail_msg">
A smooth particle mesh Ewald method<br class="gmail_msg">
J. Chem. Phys. 103 (1995) pp. 8577-8592<br class="gmail_msg">
-------- -------- --- Thank You --- -------- --------<br class="gmail_msg">
<br class="gmail_msg">
Will do ordinary reciprocal space Ewald sum.<br class="gmail_msg">
Using a Gaussian width (1/beta) of 0.288146 nm for Ewald<br class="gmail_msg">
Cut-off's: NS: 0.932 Coulomb: 0.9 LJ: 0.9<br class="gmail_msg">
Long Range LJ corr.: <C6> 3.6183e-04<br class="gmail_msg">
System total charge, top. A: 7.000 top. B: 7.000<br class="gmail_msg">
Generated table with 965 data points for Ewald.<br class="gmail_msg">
Tabscale = 500 points/nm<br class="gmail_msg">
Generated table with 965 data points for LJ6.<br class="gmail_msg">
Tabscale = 500 points/nm<br class="gmail_msg">
Generated table with 965 data points for LJ12.<br class="gmail_msg">
Tabscale = 500 points/nm<br class="gmail_msg">
Generated table with 965 data points for 1-4 COUL.<br class="gmail_msg">
Tabscale = 500 points/nm<br class="gmail_msg">
Generated table with 965 data points for 1-4 LJ6.<br class="gmail_msg">
Tabscale = 500 points/nm<br class="gmail_msg">
Generated table with 965 data points for 1-4 LJ12.<br class="gmail_msg">
Tabscale = 500 points/nm<br class="gmail_msg">
Potential shift: LJ r^-12: -3.541e+00 r^-6: -1.882e+00, Ewald -1.000e-05<br class="gmail_msg">
Initialized non-bonded Ewald correction tables, spacing: 8.85e-04 size: 1018<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
Using GPU 8x8 non-bonded kernels<br class="gmail_msg">
<br class="gmail_msg">
Using Lorentz-Berthelot Lennard-Jones combination rule<br class="gmail_msg">
<br class="gmail_msg">
There are 21 atoms and 21 charges for free energy perturbation<br class="gmail_msg">
Removing pbc first time<br class="gmail_msg">
Pinning threads with an auto-selected logical core stride of 1<br class="gmail_msg">
<br class="gmail_msg">
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++<br class="gmail_msg">
S. Miyamoto and P. A. Kollman<br class="gmail_msg">
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid<br class="gmail_msg">
Water Models<br class="gmail_msg">
J. Comp. Chem. 13 (1992) pp. 952-962<br class="gmail_msg">
-------- -------- --- Thank You --- -------- --------<br class="gmail_msg">
<br class="gmail_msg">
Intra-simulation communication will occur every 5 steps.<br class="gmail_msg">
Initial vector of lambda components:[ 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 ]<br class="gmail_msg">
Center of mass motion removal mode is Linear<br class="gmail_msg">
We have the following groups for center of mass motion removal:<br class="gmail_msg">
0: rest<br class="gmail_msg">
<br class="gmail_msg">
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++<br class="gmail_msg">
N. Goga and A. J. Rzepiela and A. H. de Vries and S. J. Marrink and H. J. C.<br class="gmail_msg">
Berendsen<br class="gmail_msg">
Efficient Algorithms for Langevin and DPD Dynamics<br class="gmail_msg">
J. Chem. Theory Comput. 8 (2012) pp. 3637--3649<br class="gmail_msg">
-------- -------- --- Thank You --- -------- --------<br class="gmail_msg">
<br class="gmail_msg">
There are: 11486 Atoms<br class="gmail_msg">
<br class="gmail_msg">
Constraining the starting coordinates (step 0)<br class="gmail_msg">
<br class="gmail_msg">
Constraining the coordinates at t0-dt (step 0)<br class="gmail_msg">
RMS relative constraint deviation after constraining: 0.00e+00<br class="gmail_msg">
Initial temperature: 291.365 K<br class="gmail_msg">
<br class="gmail_msg">
Started mdrun on rank 0 Wed Feb 22 02:11:02 2017<br class="gmail_msg">
Step Time<br class="gmail_msg">
0 0.00000<br class="gmail_msg">
<br class="gmail_msg">
Energies (kJ/mol)<br class="gmail_msg">
Bond Angle Proper Dih. Ryckaert-Bell. Improper Dih.<br class="gmail_msg">
2.99018e+03 4.09043e+03 5.20416e+03 4.32600e+01 2.38045e+02<br class="gmail_msg">
LJ-14 Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR)<br class="gmail_msg">
2.04778e+03 1.45523e+04 1.59846e+04 -2.41317e+03 -1.92125e+05<br class="gmail_msg">
Coul. recip. Position Rest. Potential Kinetic En. Total Energy<br class="gmail_msg">
1.58368e+03 2.08367e-09 -1.47804e+05 3.03783e+04 -1.17425e+05<br class="gmail_msg">
Temperature Pres. DC (bar) Pressure (bar) dVcoul/dl dVvdw/dl<br class="gmail_msg">
2.91118e+02 -3.53694e+02 -3.01252e+02 4.77627e+02 1.41810e+01<br class="gmail_msg">
dVbonded/dl<br class="gmail_msg">
-2.15074e+01<br class="gmail_msg">
<br class="gmail_msg">
step 80: timed with pme grid 42 42 40, coulomb cutoff 0.900: 391.6 M-cycles<br class="gmail_msg">
step 160: timed with pme grid 36 36 36, coulomb cutoff 1.043: 595.7 M-cycles<br class="gmail_msg">
step 240: timed with pme grid 40 36 36, coulomb cutoff 1.022: 401.1 M-cycles<br class="gmail_msg">
step 320: timed with pme grid 40 40 36, coulomb cutoff 0.963: 318.8 M-cycles<br class="gmail_msg">
step 400: timed with pme grid 40 40 40, coulomb cutoff 0.938: 349.9 M-cycles<br class="gmail_msg">
step 480: timed with pme grid 42 40 40, coulomb cutoff 0.920: 319.9 M-cycles<br class="gmail_msg">
optimal pme grid 40 40 36, coulomb cutoff 0.963<br class="gmail_msg">
Step Time<br class="gmail_msg">
10000 10.00000<br class="gmail_msg">
<br class="gmail_msg">
Writing checkpoint, step 10000 at Wed Feb 22 02:11:41 2017<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
Energies (kJ/mol)<br class="gmail_msg">
Bond Angle Proper Dih. Ryckaert-Bell. Improper Dih.<br class="gmail_msg">
2.99123e+03 4.14451e+03 5.19572e+03 2.56045e+01 2.74109e+02<br class="gmail_msg">
LJ-14 Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR)<br class="gmail_msg">
2.01371e+03 1.45326e+04 1.55974e+04 -2.43903e+03 -1.88805e+05<br class="gmail_msg">
Coul. recip. Position Rest. Potential Kinetic En. Total Energy<br class="gmail_msg">
1.26353e+03 7.39689e+01 -1.45132e+05 3.14390e+04 -1.13693e+05<br class="gmail_msg">
Temperature Pres. DC (bar) Pressure (bar) dVcoul/dl dVvdw/dl<br class="gmail_msg">
3.01283e+02 -3.61306e+02 1.35461e+02 3.46732e+02 1.03533e+01<br class="gmail_msg">
dVbonded/dl<br class="gmail_msg">
-1.08537e+01<br class="gmail_msg">
<br class="gmail_msg">
<====== ############### ==><br class="gmail_msg">
<==== A V E R A G E S ====><br class="gmail_msg">
<== ############### ======><br class="gmail_msg">
<br class="gmail_msg">
Statistics over 10001 steps using 101 frames<br class="gmail_msg">
<br class="gmail_msg">
Energies (kJ/mol)<br class="gmail_msg">
Bond Angle Proper Dih. Ryckaert-Bell. Improper Dih.<br class="gmail_msg">
3.01465e+03 4.25438e+03 5.23249e+03 3.47157e+01 2.59375e+02<br class="gmail_msg">
LJ-14 Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR)<br class="gmail_msg">
2.02486e+03 1.45795e+04 1.58085e+04 -2.42589e+03 -1.89788e+05<br class="gmail_msg">
Coul. recip. Position Rest. Potential Kinetic En. Total Energy<br class="gmail_msg">
1.28411e+03 6.08802e+01 -1.45660e+05 3.09346e+04 -1.14726e+05<br class="gmail_msg">
Temperature Pres. DC (bar) Pressure (bar) dVcoul/dl dVvdw/dl<br class="gmail_msg">
2.96448e+02 -3.57435e+02 3.32252e+01 4.36060e+02 1.77368e+01<br class="gmail_msg">
dVbonded/dl<br class="gmail_msg">
-1.82384e+01<br class="gmail_msg">
<br class="gmail_msg">
Box-X Box-Y Box-Z<br class="gmail_msg">
4.99607e+00 4.89654e+00 4.61444e+00<br class="gmail_msg">
<br class="gmail_msg">
Total Virial (kJ/mol)<br class="gmail_msg">
1.00345e+04 5.03211e+01 -1.17351e+02<br class="gmail_msg">
4.69630e+01 1.04021e+04 1.73033e+02<br class="gmail_msg">
-1.16637e+02 1.75781e+02 1.01673e+04<br class="gmail_msg">
<br class="gmail_msg">
Pressure (bar)<br class="gmail_msg">
7.67740e+01 -1.32678e+01 3.58518e+01<br class="gmail_msg">
-1.22810e+01 -2.15571e+01 -5.79828e+01<br class="gmail_msg">
3.56420e+01 -5.87931e+01 4.44585e+01<br class="gmail_msg">
<br class="gmail_msg">
T-Protein T-LIG T-SOL<br class="gmail_msg">
2.98707e+02 2.97436e+02 2.95680e+02<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
P P - P M E L O A D B A L A N C I N G<br class="gmail_msg">
<br class="gmail_msg">
PP/PME load balancing changed the cut-off and PME settings:<br class="gmail_msg">
particle-particle PME<br class="gmail_msg">
rcoulomb rlist grid spacing 1/beta<br class="gmail_msg">
initial 0.900 nm 0.932 nm 42 42 40 0.119 nm 0.288 nm<br class="gmail_msg">
final 0.963 nm 0.995 nm 40 40 36 0.128 nm 0.308 nm<br class="gmail_msg">
cost-ratio 1.22 0.82<br class="gmail_msg">
(note that these numbers concern only part of the total PP and PME load)<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
M E G A - F L O P S A C C O U N T I N G<br class="gmail_msg">
<br class="gmail_msg">
NB=Group-cutoff nonbonded kernels NxN=N-by-N cluster Verlet kernels<br class="gmail_msg">
RF=Reaction-Field VdW=Van der Waals QSTab=quadratic-spline table<br class="gmail_msg">
W3=SPC/TIP3p W4=TIP4p (single or pairs)<br class="gmail_msg">
V&F=Potential and force V=Potential only F=Force only<br class="gmail_msg">
<br class="gmail_msg">
Computing: M-Number M-Flops % Flops<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
NB Free energy kernel 20441.549154 20441.549 0.3<br class="gmail_msg">
Pair Search distance check 289.750448 2607.754 0.0<br class="gmail_msg">
NxN Ewald Elec. + LJ [F] 78217.065728 5162326.338 85.6<br class="gmail_msg">
NxN Ewald Elec. + LJ [V&F] 798.216192 85409.133 1.4<br class="gmail_msg">
1,4 nonbonded interactions 55.597769 5003.799 0.1<br class="gmail_msg">
Calc Weights 344.614458 12406.120 0.2<br class="gmail_msg">
Spread Q Bspline 49624.481952 99248.964 1.6<br class="gmail_msg">
Gather F Bspline 49624.481952 297746.892 4.9<br class="gmail_msg">
3D-FFT 36508.030372 292064.243 4.8<br class="gmail_msg">
Solve PME 31.968000 2045.952 0.0<br class="gmail_msg">
Shift-X 2.882986 17.298 0.0<br class="gmail_msg">
Bonds 21.487804 1267.780 0.0<br class="gmail_msg">
Angles 38.645175 6492.389 0.1<br class="gmail_msg">
Propers 58.750116 13453.777 0.2<br class="gmail_msg">
Impropers 4.270427 888.249 0.0<br class="gmail_msg">
RB-Dihedrals 0.445700 110.088 0.0<br class="gmail_msg">
Pos. Restr. 0.900090 45.005 0.0<br class="gmail_msg">
Virial 23.073531 415.324 0.0<br class="gmail_msg">
Update 114.871486 3561.016 0.1<br class="gmail_msg">
Stop-CM 1.171572 11.716 0.0<br class="gmail_msg">
Calc-Ekin 45.966972 1241.108 0.0<br class="gmail_msg">
Constraint-V 187.108062 1496.864 0.0<br class="gmail_msg">
Constraint-Vir 18.717354 449.216 0.0<br class="gmail_msg">
Settle 62.372472 20146.308 0.3<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
Total 6028896.883 100.0<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
R E A L C Y C L E A N D T I M E A C C O U N T I N G<br class="gmail_msg">
<br class="gmail_msg">
On 1 MPI rank, each using 8 OpenMP threads<br class="gmail_msg">
<br class="gmail_msg">
Computing: Num Num Call Wall time Giga-Cycles<br class="gmail_msg">
Ranks Threads Count (s) total sum %<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
Neighbor search 1 8 251 0.530 9.754 1.4<br class="gmail_msg">
Launch GPU ops. 1 8 10001 0.509 9.357 1.3<br class="gmail_msg">
Force 1 8 10001 10.634 195.662 27.3<br class="gmail_msg">
PME mesh 1 8 10001 22.173 407.991 57.0<br class="gmail_msg">
Wait GPU local 1 8 10001 0.073 1.338 0.2<br class="gmail_msg">
NB X/F buffer ops. 1 8 19751 0.255 4.690 0.7<br class="gmail_msg">
Write traj. 1 8 3 0.195 3.587 0.5<br class="gmail_msg">
Update 1 8 20002 1.038 19.093 2.7<br class="gmail_msg">
Constraints 1 8 20002 0.374 6.887 1.0<br class="gmail_msg">
Rest 3.126 57.513 8.0<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
Total 38.906 715.871 100.0<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
Breakdown of PME mesh computation<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
PME spread/gather 1 8 40004 19.289 354.929 49.6<br class="gmail_msg">
PME 3D-FFT 1 8 40004 2.319 42.665 6.0<br class="gmail_msg">
PME solve Elec 1 8 20002 0.518 9.538 1.3<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
<br class="gmail_msg">
GPU timings<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
Computing: Count Wall t (s) ms/step %<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
Pair list H2D 251 0.023 0.090 1.1<br class="gmail_msg">
X / q H2D 10001 0.269 0.027 12.5<br class="gmail_msg">
Nonbonded F kernel 9700 1.615 0.166 75.0<br class="gmail_msg">
Nonbonded F+ene k. 50 0.014 0.273 0.6<br class="gmail_msg">
Nonbonded F+prune k. 200 0.039 0.196 1.8<br class="gmail_msg">
Nonbonded F+ene+prune k. 51 0.016 0.323 0.8<br class="gmail_msg">
F D2H 10001 0.177 0.018 8.2<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
Total 2.153 0.215 100.0<br class="gmail_msg">
-----------------------------------------------------------------------------<br class="gmail_msg">
<br class="gmail_msg">
Average per-step force GPU/CPU evaluation time ratio: 0.215 ms/3.280 ms = 0.066<br class="gmail_msg">
For optimal performance this ratio should be close to 1!<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
NOTE: The GPU has >25% less load than the CPU. This imbalance causes<br class="gmail_msg">
performance loss.<br class="gmail_msg">
<br class="gmail_msg">
Core t (s) Wall t (s) (%)<br class="gmail_msg">
Time: 311.246 38.906 800.0<br class="gmail_msg">
(ns/day) (hour/ns)<br class="gmail_msg">
Performance: 22.210 1.081<br class="gmail_msg">
=================================================<br class="gmail_msg">
<br class="gmail_msg">
On 2/22/2017 1:04 AM, Igor Leontyev wrote:<br class="gmail_msg">
> Hi.<br class="gmail_msg">
> I am having a hard time accelerating free energy (FE) simulations on<br class="gmail_msg">
> my high-end GPU. I am not sure whether this is normal for my smaller<br class="gmail_msg">
> systems or whether I am doing something wrong.<br class="gmail_msg">
><br class="gmail_msg">
> The efficiency of GPU acceleration seems to decrease as the system<br class="gmail_msg">
> gets smaller, right? Typical box sizes in FE simulations are 32x32x32 A^3<br class="gmail_msg">
> in water (~3.5K atoms) and about 60x60x60 A^3 in protein (~25K atoms);<br class="gmail_msg">
> a larger MD box is rarely needed for FE simulations.<br class="gmail_msg">
><br class="gmail_msg">
> For my system (11K atoms) on 8 CPU cores, adding a GTX 1080 GPU gives<br class="gmail_msg">
> only up to a 50% speedup, and GPU utilization during the simulation is<br class="gmail_msg">
> only 1-2%. Does that sound right? (I am using the current gmx version<br class="gmail_msg">
> 2016.2 and CUDA driver 8.0; on request I will attach log files with all<br class="gmail_msg">
> the details.)<br class="gmail_msg">
><br class="gmail_msg">
> BTW, regarding how much the perturbed interactions cost: in my case the<br class="gmail_msg">
> simulation with "free_energy = no" runs about TWICE as fast.<br class="gmail_msg">
><br class="gmail_msg">
> Igor<br class="gmail_msg">
><br class="gmail_msg">
>> On 2/13/17, 1:32 AM,<br class="gmail_msg">
>> "<a href="mailto:gromacs.org_gmx-developers-bounces@maillist.sys.kth.se" class="gmail_msg" target="_blank">gromacs.org_gmx-developers-bounces@maillist.sys.kth.se</a> on behalf of<br class="gmail_msg">
>> Berk Hess" <<a href="mailto:gromacs.org_gmx-developers-bounces@maillist.sys.kth.se" class="gmail_msg" target="_blank">gromacs.org_gmx-developers-bounces@maillist.sys.kth.se</a> on<br class="gmail_msg">
>> behalf of <a href="mailto:hess@kth.se" class="gmail_msg" target="_blank">hess@kth.se</a>> wrote:<br class="gmail_msg">
>><br class="gmail_msg">
>> That depends on what you mean by this.<br class="gmail_msg">
>> With free energy, all non-perturbed non-bonded interactions can run on<br class="gmail_msg">
>> the GPU. The perturbed ones currently cannot. For a large system with<br class="gmail_msg">
>> a few perturbed atoms this is no issue, but for smaller systems the<br class="gmail_msg">
>> free-energy kernel can be the limiting factor. I think there is a lot<br class="gmail_msg">
>> of gain to be had in making the extremely complex CPU free-energy<br class="gmail_msg">
>> kernel faster. Initially I thought SIMD would not help there, but<br class="gmail_msg">
>> since any perturbed i-particle has perturbed interactions with all its<br class="gmail_msg">
>> j's, it will help a lot.<br class="gmail_msg">
>><br class="gmail_msg">
>> Cheers,<br class="gmail_msg">
>><br class="gmail_msg">
>> Berk<br class="gmail_msg">
>><br class="gmail_msg">
>> On 2017-02-13 01:08, Michael R Shirts wrote:<br class="gmail_msg">
>> > What's the current state of the free energy code on GPUs, and what<br class="gmail_msg">
>> > are the roadblocks?<br class="gmail_msg">
>> ><br class="gmail_msg">
>> > Thanks!<br class="gmail_msg">
>> > ~~~~~~~~~~~~~~~~<br class="gmail_msg">
>> > Michael Shirts<br class="gmail_msg">
</blockquote></div>