Hi,

In Gromacs 4.5 there is no difference between the two, since it does not use
real thread parallelization. Gromacs 4.5 has a built-in threaded MPI library,
but OpenMPI also has an efficient MPI implementation for shared-memory
machines. Even with proper thread parallelization, though, I would expect the
same 15 to 20% performance improvement. (I suspect the lower 8% figure
reflects scaling losses in going from 8 to 16 processes for that particular
system.)

Berk

> Date: Mon, 9 Aug 2010 11:36:35 -0400
> From: chris.neale@utoronto.ca
> To: gmx-users@gromacs.org
> Subject: [gmx-users] hyperthreading
>
> I haven't tried mdrun -nt based hyperthreading, but I have tested using
> -np 16 on an 8-core box. I get an 8% to 18% performance increase using
> -np 16 with an optimized -npme, compared to -np 8 with an optimized -npme.
> This is on a cluster of Intel Xeon E5540 ("Nehalem") nodes, each with two
> quad-core CPUs in a single box, running Gromacs 4.0.7. I now regularly
> overload the number of processes.
>
> Selected examples:
>
> System A with 250,000 atoms:
> mdrun -np 8  -npme -1    1.15 ns/day
> mdrun -np 8  -npme 2     1.02 ns/day
> mdrun -np 16 -npme 2     0.99 ns/day
> mdrun -np 16 -npme 4     1.36 ns/day  <-- 118% performance vs. 1.15 ns/day
> mdrun -np 15 -npme 3     1.32 ns/day
>
> System B with 35,000 atoms (4 fs timestep):
> mdrun -np 8  -npme -1   22.66 ns/day
> mdrun -np 8  -npme 2    23.06 ns/day
> mdrun -np 16 -npme -1   22.69 ns/day
> mdrun -np 16 -npme 4    24.90 ns/day  <-- 108% performance vs. 23.06 ns/day
> mdrun -np 56 -npme 16   14.15 ns/day
>
> Cutoffs and timesteps differ between these runs, but both use PME and
> explicit water.
>
> I'd be interested in hearing about any comparisons between -np based
> process overloading and -nt based hyperthreading.
>
> Hope it helps,
> Chris.
>
> -- original message --
>
> Hi,
>
> These are Nehalem Xeons, I presume? Then you get 15 to 20% more performance
> in Gromacs running 2 vs. 1 threads or processes per physical core.
>
> Berk
>
> > Date: Fri, 6 Aug 2010 09:24:11 -0500
> > From: dmobley at gmail.com
> > To: gmx-users at gromacs.org
> > Subject: [gmx-users] hyperthreading
> >
> > Dear All,
> >
> > I'm putting together a new Dell Xeon cluster running ROCKS 5.3, which
> > uses CentOS (6, I believe). It is currently ~20 dual quad-core nodes
> > with roughly 16 GB of RAM each.
> >
> > In any case, I wanted to inquire about hyperthreading. Does anyone have
> > experience on similar machines with vs. without hyperthreading? The
> > ROCKS users list suggests that hyperthreading ought always to be off
> > for HPC applications, which sounds overly simplistic to me, though I
> > more or less follow the logic behind it.
> >
> > So, has anyone done any benchmarking in a similar setting, and what are
> > your thoughts? I can obviously do some benchmarking myself, but I
> > thought I'd check in with the list first.
> >
> > Thanks so much,
> > David
> >
> > --
> > David Mobley
> > dmobley at gmail.com
> > 504-383-3662
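
For concreteness, the two launch styles compared in this thread look roughly
like this on an 8-core node (a minimal sketch for Gromacs 4.x; topol.tpr, the
output names, and the mdrun/mdrun_mpi binary names are assumptions about the
local installation):

    # Hyperthreading via Gromacs 4.5's built-in thread-MPI: -nt sets the
    # total thread count, so 16 threads on an 8-core Nehalem box places
    # two threads on each physical core.
    mdrun -nt 16 -npme 4 -s topol.tpr -deffnm hyper_nt16

    # Process overloading with an external MPI such as OpenMPI, as in the
    # -np benchmarks above (mdrun_mpi is the MPI-enabled binary here).
    mpirun -np 16 mdrun_mpi -npme 4 -s topol.tpr -deffnm hyper_np16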
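
And the -npme optimization Chris describes can be scripted as a simple scan
(again a sketch: the -np value, the candidate -npme list, and -maxh 0.25 to
cap each trial's wall time are illustrative choices):

    # Run a short benchmark per PME-node count and compare the ns/day
    # figures on the Performance line near the end of each log.
    for npme in -1 2 3 4; do
      mpirun -np 16 mdrun_mpi -s topol.tpr -npme $npme \
             -deffnm bench_npme${npme} -maxh 0.25
      grep Performance bench_npme${npme}.log
    done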