[gmx-users] GROMACS parallel performance (sorry to post again)

Zhong Wenyu wyzhong78 at 163.com
Mon Dec 6 04:55:49 CET 2004


Hi, everyone.
Hi, David.
    Following your suggestion, I ran some benchmarks for GROMACS and NAMD with HT on and off. I do not know of any directly comparable benchmark systems for GROMACS and NAMD (or CHARMM), and since I know little about GROMACS I could not set up an equivalent test myself. So I simply compared the standard GROMACS DPPC benchmark with the standard NAMD ApoA1 benchmark. The ApoA1 system has about 92K atoms, and a 2 fs timestep was used.
DPPC (GROMACS)

        CPUs   (Mnbf/s)   (GFlops)   (ps/NODE hour)   (NODE hour/ns)   Efficiency
HT off    1:    18.722      0.675         5.781           172.968        1
          2:    33.221      1.198        10.256            97.500        0.8870
          4:    39.605      1.428        12.224            81.806        0.5286
          8:    49.064      1.770        15.145            66.028        0.3275

HT on     2:    26.032      0.939         8.045           124.306        0.6958
          4:    33.108      1.195        10.221            97.833        0.4421
          8:    42.312      1.528        13.072            76.500        0.2827

ApoA1 (NAMD)

        CPUs   (node seconds/step)   (ps/NODE hour)   Efficiency
HT off    1:         3.1571              2.281          1
          2:         1.6519              4.359          0.9555
          4:         0.8657              8.317          0.9116
          8:         0.4723             15.244          0.8354

HT on   0.5:         3.2170              2.238          1.9623
          1:         2.0815              3.459          1.5164
          2:         1.5409              4.673          1.0243
          4:         0.8668              8.306          0.9103
          8:         0.4343             16.578          0.9084

    Note: with HT on, the CPU counts given are the real (physical) number of CPUs, so 0.5 means a single process on one logical (HT) processor.
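
    For reference, the Efficiency column above is consistent with the usual definition of parallel efficiency (speedup over one CPU, divided by the CPU count), and NODE hour/ns is just the reciprocal of ps/NODE hour. A minimal sketch of the arithmetic, assuming that is indeed how the tables were derived (the function name is mine, not output from either package):

        # Parallel efficiency: throughput on N CPUs divided by
        # N times the single-CPU throughput.
        def parallel_efficiency(ps_per_node_hour_n, ps_per_node_hour_1, n_cpus):
            return ps_per_node_hour_n / (n_cpus * ps_per_node_hour_1)

        # DPPC, HT off, 2 CPUs: reproduces the 0.8870 in the table.
        print(parallel_efficiency(10.256, 5.781, 2))   # -> 0.887...

        # NODE hour/ns is reciprocal throughput in other units:
        # 1000 / 5.781 = 172.97 node-hour/ns (172.968 in the table; the
        # small difference is rounding of the printed throughput).
        print(1000 / 5.781)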

    From these results, GROMACS is roughly 2.5 times faster than NAMD on a single CPU (though on different systems; see the sketch below). But NAMD scales very well and can take advantage of HT, while GROMACS's scaling is poor. Is this performance reasonable? What is the problem?
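
    To put the two codes on the common footing David suggested (ps/day), a quick conversion using the single-CPU, HT-off rows above; the variable names are mine:

        HOURS_PER_DAY = 24

        gmx_dppc   = 5.781 * HOURS_PER_DAY   # ~138.7 ps/day (DPPC, 1 CPU, HT off)
        namd_apoa1 = 2.281 * HOURS_PER_DAY   # ~54.7 ps/day (ApoA1, 1 CPU, HT off)

        # Roughly 2.5x in GROMACS's favour, though the systems differ
        # (DPPC vs. the ~92K-atom ApoA1), so this is only indicative.
        print(gmx_dppc / namd_apoa1)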

    Best regards.

Wenyu Zhong

  ----- Original Message -----
  From: Zhong Wenyu
  To: Zhong Wenyu
  Sent: Thursday, December 02, 2004 7:31 PM
  Subject: [gmx-users] GROMACS parallel performance (sorry to post again)


  > On Tue, 2004-11-30 at 20:24 +0800, Zhong Wenyu wrote:
  > Hi, everyone.
  >     I'm sorry to post this mail again, but it has really been
  > bothering me and I haven't found a solution in the previous post.
  > I am a newbie to GROMACS and have used NAMD for some time. I think
  > maybe GROMACS can do something NAMD can't, so I came to this list.
  >     I have compiled single-precision versions of FFTW and GROMACS
  > with icc. The CFLAGS used were "-O3 -ip -unroll -xN". The first
  > problem that bothered me was the parallel performance of GROMACS. It
  > was unreasonably poor on a Xeon 2.4 GHz cluster I built myself. The
  > OS was Rocks 3.2.0 based on RedHat 3.0EL, and the network was gigabit
  > Ethernet. Hyperthreading was on, and NAMD ran about 10%-20% faster
  > with it on than off.
  >     The commands used were:
  >     grompp -f grompp.mdp -c conf.gro -p topol.top -po mdout.mdp -np
  > $NSLOTS -shuffle -sort
  >     mpirun -np $NSLOTS -machinefile $nodefile mdrun -s topol.tpr -o
  > traj.trr -c confout.gro -e ener.edr -g md.log
  > Benchmarks:
  >     Villin:
  >            CPUs   (Mnbf/s)   (GFlops) (ps/NODE hour) (NODE hour/ns)
  >             1:     45.429      1.424    345.125      2.897
  >             4:     46.467      1.458    352.941      2.833
  >             8:     29.621      0.929    225.000      4.444
  >     DPPC:
  >            CPUs   (Mnbf/s)   (GFlops) (ps/NODE hour) (NODE hour/ns)
  >             1:     18.722      0.675      5.781    172.968
  >             8:     33.108      1.195     10.221     97.833
  >            16:     42.312      1.528     13.072     76.500
  >     Since hyperthreading was on, a run listed as 4 CPUs actually used
  > only 2 physical CPUs.
  >     The performance is far from the benchmarks on the GROMACS
  > website, and is almost the same as described in a previous post:
  > http://www.gromacs.org/pipermail/gmx-users/2004-June/011028.html, but
  > I haven't seen a solution to it.
  >     What's the problem? What should I do? Maybe I should turn off
  > hyperthreading, recompile GROMACS, or something else?

  Turn off hyperthreading. Compare performance in ps/day to NAMD.

  Parallel scaling will be improved in v4.0.

  >     Please help me. 
  >     Thanks a lot.
  > Wenyu Zhong
  -- 
  David.
  ________________________________________________________________________
  David van der Spoel, PhD, Assoc. Prof., Molecular Biophysics group,
  Dept. of Cell and Molecular Biology, Uppsala University.
  Husargatan 3, Box 596,          75124 Uppsala, Sweden
  phone:  46 18 471 4205          fax: 46 18 511 755
  spoel at xray.bmc.uu.se    spoel at gromacs.org   http://xray.bmc.uu.se/~spoel
  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
