[gmx-users] Results of villin headpiece with AMD 8 core

p buscemi pbuscemi at q.com
Mon Jan 14 16:31:38 CET 2019


Mirco, to continue, here are the results from the 32-core AMD Ryzen / 1080 Ti system.
8.4% of the available CPU time was lost due to load imbalance in the domain decomposition.

               Core t (s)   Wall t (s)        (%)
       Time:   151131.597     2361.432     6400.0
                                39:21
                 (ns/day)    (hour/ns)
Performance:        7.318        3.280

Command: gmx mdrun -deffnm dppc.md -nb gpu -pme gpu -ntmpi 8 -ntomp 8 -npme 1 -gputasks 00000000
Using other variations of -ntmpi/-ntomp, or only the defaults, severely degraded performance. I was not able to get the load imbalance below 8.4% manually.
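For anyone who wants to repeat that kind of sweep, a minimal sketch is below (the specific rank/thread pairs, the forced -dlb, and the short -nsteps/-resethway benchmark settings are illustrative suggestions only, not the exact commands from the runs above):

#!/bin/bash
# Sweep thread-MPI rank vs. OpenMP thread splits on the 64-thread box,
# keeping all nonbonded and PME work on GPU 0.
for pair in "4 16" "8 8" "16 4"; do
    set -- $pair
    ntmpi=$1; ntomp=$2
    # -gputasks takes one digit per rank; map every rank to GPU 0
    tasks=$(printf '0%.0s' $(seq 1 $ntmpi))
    gmx mdrun -deffnm dppc.md -nb gpu -pme gpu \
              -ntmpi $ntmpi -ntomp $ntomp -npme 1 -gputasks $tasks \
              -dlb yes -nsteps 10000 -resethway \
              -g sweep_${ntmpi}x${ntomp}.log
done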
============================== log head=============================
GROMACS: gmx mdrun, version 2019
Executable: /usr/local/gromacs/bin/gmx
Data prefix: /usr/local/gromacs
Working dir: /home/hms/Desktop/d.dppc.4096
Process ID: 87696
Command line:
gmx mdrun -deffnm dppc.md -nb gpu -pme gpu -ntmpi 8 -ntomp 8 -npme 1 -gputasks 00000000

GROMACS version: 2019
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: AVX2_128
FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128-avx512
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: hwloc-1.11.0
Tracing support: disabled
C compiler: /usr/bin/gcc-6 GNU 6.5.0
C compiler flags: -mavx2 -mfma -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler: /usr/bin/g++-6 GNU 6.5.0
C++ compiler flags: -mavx2 -mfma -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler: /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2018 NVIDIA Corporation;Built on Sat_Aug_25_21:08:01_CDT_2018;Cuda compilation tools, release 10.0, V10.0.130
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;-D_FORCE_INLINES;; ;-mavx2;-mfma;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver: 10.0
CUDA runtime: 10.0

Running on 1 node with total 64 cores, 64 logical cores, 1 compatible GPU
Hardware detected:
CPU info:
Vendor: AMD
Brand: AMD Ryzen Threadripper 2990WX 32-Core Processor
Family: 23 Model: 8 Stepping: 2
Features: aes amd apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdrnd rdtscp sha sse2 sse3 sse4a sse4.1 sse4.2 ssse3
Hardware topology: Basic
Sockets, cores, and logical processors:
Socket 0: [ 0] [ 1] [ 2] [ 3] [ 4] [ 5] [ 6] [ 7] [ 8] [ 9] [ 10] [ 11] [ 12] [ 13] [ 14] [ 15] [ 16] [ 17] [ 18] [ 19] [ 20] [ 21] [ 22] [ 23] [ 24] [ 25] [ 26] [ 27] [ 28] [ 29] [ 30] [ 31] [ 32] [ 33] [ 34] [ 35] [ 36] [ 37] [ 38] [ 39] [ 40] [ 41] [ 42] [ 43] [ 44] [ 45] [ 46] [ 47] [ 48] [ 49] [ 50] [ 51] [ 52] [ 53] [ 54] [ 55] [ 56] [ 57] [ 58] [ 59] [ 60] [ 61] [ 62] [ 63]
GPU info:
Number of GPUs detected: 1
#0: NVIDIA GeForce GTX 1080 Ti, compute cap.: 6.1, ECC: no, stat: compatible

On Jan 13 2019, at 7:47 pm, pbuscemi <pbuscemi at q.com> wrote:
> Mirco,
>
> Here are the results for three runs of the million-atom DPPC system
> =============================== 8-core 2700X / 1080 Ti ===============================
> gmx mdrun -deffnm dppc.md -nb gpu -pme gpu -ntmpi 4 -ntomp 4 -npme 1 -gputasks 0000
>                Core t (s)   Wall t (s)        (%)
>        Time:     5286.270      330.392     1600.0
>                  (ns/day)    (hour/ns)
> Performance:        5.126        4.682
>
> ======================================================================================
> On Jan 12 2019, at 5:42 pm, paul buscemi <pbuscemi at q.com> wrote:
> >
> > Mirco,
> > on the modification - nicely done.
> > On the system speed: running Maestro-Desmond (one core), the 1080 Ti is pegged and usually at 90% power. The folks at Schrodinger know what they are doing. So the base speed is apparently sufficient; it's some other factor, e.g. the workload distribution, that is not optimized.
> >
> > I’ll work with your files tomorrow and let you know how it turns out - thanks.
> > Have a great weekend
> > Paul
> > > On Jan 12, 2019, at 3:11 PM, Wahab Mirco <Mirco.Wahab at chemie.tu-freiberg.de> wrote:
> > > Hi Paul,
> > > thanks for your reply.
> > > On 11.01.2019 23:20, paul buscemi wrote:
> > > > Getting the ion and SOL concentration correct in the top is trickier (for me) than it should have been. If you happen to reuse both solvate and genion during the build, keeping track of the top is like using a digital Rubik's cube..! The charge on the villin was +1 because after I downloaded it from the PDB I removed all other water and ions - it just made pdb2gmx easier to work with.
> > >
> > >
> > > I simply hand-edited the .gro, making up two ions and putting them
> > > somewhere near the corners, followed by a short energy minimization.
> > > Then I added one line in the .top for the ions.
> > >
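[For illustration only: assuming standard force-field ion names, that extra line goes at the end of the [ molecules ] section of the .top, something like

[ molecules ]
; Compound        #mols
Protein               1
SOL               xxxxx
NA                    2

where the compound names and counts here are placeholders, not the actual entries from Mirco's topology.]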
> > > > The 1080 scaled nicely with the 1080 Ti; these are really nice pieces of hardware. And you are correct: given the choice of more processors vs. faster processors, choose the latter. I have the AMD OC'd to 4.0 GHz and it runs the same model almost as fast as the 32-core AMD at 3.7 GHz.
> > > Your system is possibly too slow to saturate the 1080 Ti at this small
> > > system size. In a much larger system, the lead of the 1080 Ti over the
> > > 1080 may reach the theoretical expectation.
> > >
> > > > I've run 300k DPPC models (~300 DPPC molecules) and they run at ~15 ns/day in NPT. And yes, if you can send the pdb, top, and itps, it would be interesting to compare the two AMDs.
> > > I did upload the stuff here plus a readme file. This system is much too
> > > large for a single box + GPU (for productive runs), but maybe in 5 years
> > > or so we can watch capillary waves through connected IMD/VMD in
> > > real-time ;)
> > >
> > > => http://suwos.gibtsfei.net/d.dppc.4096.zip
> > > Regards
> > > Mirco


