Hi,

Looking at your 64-core results, it seems that your PP:PME load ratio is about 1:1.
In most cases a 3:1 ratio gives much better performance.
grompp probably also printed a note about this, and about how to fix it.
I have also described this briefly in the parallelization section of the PDF manual.

You should probably increase your cut-offs and PME grid spacing by the same factor
(something around 1.2).
mdrun should hopefully choose the proper number of PME nodes for you
when you do not use -npme.
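As a rough sketch (the factor of 1.2 is only a suggested starting point, and the
exact values are your choice), the relevant lines of your .mdp below would become
something like:

    rlist          = 0.96   ; 0.8  * 1.2
    rcoulomb       = 0.96   ; 0.8  * 1.2
    fourierspacing = 0.288  ; 0.24 * 1.2

and you would then launch without -npme, letting mdrun pick the number of PME nodes, e.g.:

    mdrun_mpi -s test.tpr -np 64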
Berk

> Date: Wed, 1 Oct 2008 17:18:24 -0400
> From: jalemkul@vt.edu
> To: gmx-users@gromacs.org
> Subject: [gmx-users] Improving scaling - Gromacs 4.0 RC2
>
> Hi,
>
> I've been playing around with the latest release candidate of version 4.0, and I
> was hoping someone out there more knowledgeable than me might tell me how to
> improve a bit on the performance I'm seeing. To clarify, the performance I'm
> seeing is a ton faster than 3.3.x, but I still seem to be getting bogged down
> with the PME/PP balance. I'm using mostly the default options with the new mdrun:
>
> mdrun_mpi -s test.tpr -np 64 -npme 32
>
> The system contains about 150,000 atoms - a membrane protein surrounded by
> several hundred lipids and solvent (water). The protein parameters are GROMOS,
> the lipids are Berger, and the water is SPC. My .mdp file (adapted from a generic
> 3.3.x file that I have always used for such simulations) is attached at the end
> of this mail. It seems that my system runs fastest on 64 CPUs. Almost all tests
> with 128 or 256 seem to run slower. The nodes are dual-core 2.3 GHz Xserve G5s,
> connected by InfiniBand.
>
> Here's a summary of some of the tests I've run:
>
> -np   -npme   -ddorder     ns/day   % performance loss from imbalance
> 64    16      interleave   5.760    19.6
> 64    32      interleave   9.600    40.9
> 64    32      pp_pme       5.252     3.9
> 64    32      cartesian    5.383     4.7
>
> All other mdrun command-line options are defaults.
>
> I get ~10.3 ns/day with -np 256 -npme 64, but since -np 64 -npme 32 seems to
> give almost the same performance, there seems to be no compelling reason to tie
> up that many nodes.
>
> Any hints on how to speed things up any more? Is it possible? Not that I'm
> complaining...the same system under GMX 3.3.3 gives just under 1 ns/day :) I'm
> really curious about the 40.9% performance loss I'm seeing with -np 64 -npme 32,
> even though it gives the best overall performance in terms of ns/day.
>
> Thanks in advance for your attention, and any comments.
>
> -Justin
>
> =======test.mdp=========
> title                   = NPT simulation for a membrane protein
> ; Run parameters
> integrator              = md
> dt                      = 0.002
> nsteps                  = 10000          ; 20 ps
> nstcomm                 = 1
> ; Output parameters
> nstxout                 = 500
> nstvout                 = 500
> nstfout                 = 500
> nstlog                  = 500
> nstenergy               = 500
> ; Bond parameters
> constraint_algorithm    = lincs
> constraints             = all-bonds
> continuation            = no             ; starting up
> ; Twin-range cutoff scheme, parameters for Gromos96
> nstlist                 = 5
> ns_type                 = grid
> rlist                   = 0.8
> rcoulomb                = 0.8
> rvdw                    = 1.4
> ; PME electrostatics parameters
> coulombtype             = PME
> fourierspacing          = 0.24
> pme_order               = 4
> ewald_rtol              = 1e-5
> optimize_fft            = yes
> ; V-rescale temperature coupling is on in three groups
> Tcoupl                  = V-rescale
> tc_grps                 = Protein POPC SOL_NA+_CL-
> tau_t                   = 0.1 0.1 0.1
> ref_t                   = 310 310 310
> ; Pressure coupling is on
> Pcoupl                  = Berendsen
> pcoupltype              = semiisotropic
> tau_p                   = 2.0
> compressibility         = 4.5e-5 4.5e-5
> ref_p                   = 1.0 1.0
> ; Generate velocities is on
> gen_vel                 = yes
> gen_temp                = 310
> gen_seed                = 173529
> ; Periodic boundary conditions are on in all directions
> pbc                     = xyz
> ; Long-range dispersion correction
> DispCorr                = EnerPres
>
> ========end test.mdp==========
>
> --
> ========================================
>
> Justin A. Lemkul
> Graduate Research Assistant
> Department of Biochemistry
> Virginia Tech
> Blacksburg, VA
> jalemkul[at]vt.edu | (540) 231-9080
> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
>
> ========================================