<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
    <title></title>
  </head>
  <body bgcolor="#ffffff" text="#000000">
    Hi Alan,<br>
    <br>
    On 9/14/10 3:21 PM, Alan wrote:
    <blockquote
      cite="mid:AANLkTiksD3z-069_bGcwhMFwysYj=SomFn5QU89YRLHJ@mail.gmail.com"
      type="cite">
      Hi there,
      <div><br>
      </div>
      <div>I am testing on a MBP 17" SL 10.6.4 64 bits and nvidia
        GeForce 9600M GT <br>
        <div><br>
        </div>
        <div>So I got mdrun-gpu compiled and apparently running, but
          when I try to run 'mdrun' to compare I get a segmentation fault.</div>
        <div><br>
        </div>
        <div>Any other comments to the md.mdp and em.mdp are very
          welcome too.</div>
        <div><br>
        </div>
        <div>
          <div>##### To test mdrun-gpu</div>
          <div><br>
          </div>
          <div>
            <div>cat &lt;&lt; EOF &gt;| em.mdp</div>
            <div>define                   = -DFLEXIBLE</div>
            <div>integrator               = cg ; steep</div>
            <div>nsteps                   = 200</div>
            <div>constraints              = none</div>
            <div>emtol                    = 1000.0</div>
            <div>nstcgsteep               = 10 ; do a steep every 10
              steps of cg</div>
            <div>emstep                   = 0.01 ; used with steep</div>
            <div>nstcomm                  = 1</div>
            <div>coulombtype              = PME</div>
            <div>ns_type                  = grid</div>
            <div>rlist                    = 1.0</div>
            <div>rcoulomb                 = 1.0</div>
            <div>rvdw                     = 1.4</div>
            <div>Tcoupl                   = no</div>
            <div>Pcoupl                   = no</div>
            <div>gen_vel                  = no</div>
            <div>nstxout                  = 0 ; write coords every #
              step</div>
            <div>optimize_fft             = yes</div>
            <div>EOF</div>
            <div><br>
            </div>
            <div>cat &lt;&lt; EOF &gt;| md.mdp</div>
            <div>integrator               = md-vv</div>
            <div>nsteps                   = 1000</div>
            <div>dt                       = 0.002</div>
            <div>constraints              = all-bonds</div>
            <div>constraint-algorithm     = shake</div>
            <div>nstcomm                  = 1</div>
            <div>nstcalcenergy            = 1</div>
            <div>ns_type                  = grid</div>
            <div>rlist                    = 1.3</div>
            <div>rcoulomb                 = 1.3</div>
            <div>rvdw                     = 1.3</div>
            <div>vdwtype                  = cut-off</div>
            <div>coulombtype              = PME</div>
          </div>
        </div>
      </div>
    </blockquote>
    PME on the GPUs is not very fast: only about 3 times faster than a
    single core.<br>
    <blockquote
      cite="mid:AANLkTiksD3z-069_bGcwhMFwysYj=SomFn5QU89YRLHJ@mail.gmail.com"
      type="cite">
      <div>
        <div>
          <div>
            <div>Tcoupl                   = Andersen</div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Andersen works only with OpenMM. Gromacs accepts it as an option,
    but the actual algorithm is not implemented for the CPU version yet.<br>
    <blockquote
      cite="mid:AANLkTiksD3z-069_bGcwhMFwysYj=SomFn5QU89YRLHJ@mail.gmail.com"
      type="cite">
      <div>
        <div>
          <div>
            <div>nsttcouple               = 1</div>
            <div>tau_t                    = 0.1</div>
            <div>tc-grps                  = system</div>
            <div>ref_t                    = 300</div>
            <div>Pcoupl                   = mttk</div>
            <div>Pcoupltype               = isotropic</div>
            <div>
              nstpcouple               = 1</div>
            <div>tau_p                    = 0.5</div>
            <div>compressibility          = 4.5e-5</div>
            <div>ref_p                    = 1.0</div>
            <div>gen_vel                  = yes</div>
            <div>nstxout                  = 2 ; write coords every #
              step</div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Fetching data from the GPU every 2 steps is way too often. Use a
    value that you will actually use in production runs.<br>
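    <br>
    As a rough sketch (the value 1000 is only an illustrative assumption,
    not a tuned recommendation for your system), the output line in md.mdp
    could become:<br>
    <pre>nstxout                  = 1000 ; illustrative value: write coords every 1000 steps</pre>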
    <br>
    <blockquote
      cite="mid:AANLkTiksD3z-069_bGcwhMFwysYj=SomFn5QU89YRLHJ@mail.gmail.com"
      type="cite">
      <div>
        <div>
          <div>
            <div>lincs-iter               = 2</div>
            <div>DispCorr                 = EnerPres</div>
            <div>optimize_fft             = yes</div>
            <div>EOF</div>
            <div><br>
            </div>
            <div>wget -c "<a moz-do-not-send="true"
                href="http://www.pdbe.org/download/1brv">http://www.pdbe.org/download/1brv</a>"
              -O 1brv.pdb</div>
            <div><br>
            </div>
            <div>pdb2gmx -ff amber99sb -f 1brv.pdb -o Prot.pdb -p
              Prot.top -water spce -ignh</div>
            <div><br>
            </div>
            <div>editconf -bt triclinic -f Prot.pdb -o Prot.pdb -d 1.0</div>
            <div><br>
            </div>
            <div>genbox -cp Prot.pdb -o Prot.pdb -p Prot.top -cs</div>
            <div><br>
            </div>
            <div>grompp -f em.mdp -c Prot.pdb -p Prot.top -o Prot.tpr</div>
            <div><br>
            </div>
            <div>echo 13 | genion -s Prot.tpr -o Prot.pdb -neutral -conc
              0.15 -p Prot.top -norandom</div>
            <div><br>
            </div>
            <div>grompp -f em.mdp -c Prot.pdb -p Prot.top -o em.tpr</div>
            <div><br>
            </div>
            <div>mdrun -v -deffnm em</div>
            <div><br>
            </div>
            <div>grompp -f md.mdp -c em.gro -p Prot.top -o md.tpr</div>
            <div><br>
            </div>
            <div>mdrun-gpu -v -deffnm md -device
              "OpenMM:platform=Cuda,memtest=15,deviceid=0,force-device=yes"</div>
            <div><br>
            </div>
            <div>[snip]</div>
            <div>Reading file md.tpr, VERSION 4.5.1-dev-20100913-9342b
              (single precision)</div>
            <div>Loaded with Money</div>
            <div><br>
            </div>
            <div><br>
            </div>
            <div>Back Off! I just backed up md.trr to ./#md.trr.7#</div>
            <div><br>
            </div>
            <div>Back Off! I just backed up md.edr to ./#md.edr.7#</div>
            <div><br>
            </div>
            <div>WARNING: OpenMM supports only Andersen thermostat with
              the md/md-vv/md-vv-avek integrators.</div>
            <div><br>
            </div>
            <div><br>
            </div>
            <div>WARNING: OpenMM supports only Monte Carlo barostat for
              pressure coupling.</div>
            <div><br>
            </div>
            <div><br>
            </div>
            <div>WARNING: Non-supported GPU selected (#0, GeForce 9600M
              GT), forced continuing.Note, that the simulation can be
              slow or it migth even crash.</div>
            <div><br>
            </div>
            <div><br>
            </div>
            <div>Pre-simulation ~15s memtest in progress...done, no
              errors detected</div>
            <div>starting mdrun 'PROTEIN G in water'</div>
            <div>1000 steps,      2.0 ps.</div>
            <div>step 900, remaining runtime:     4 s</div>
            <div>Writing final coordinates.</div>
            <div><br>
            </div>
            <div>step 1000, remaining runtime:     0 s</div>
            <div>Post-simulation ~15s memtest in progress...done, no
              errors detected</div>
            <div><br>
            </div>
            <div><span class="Apple-tab-span" style="white-space: pre;">
              </span>OpenMM run - timing based on wallclock.</div>
            <div><br>
            </div>
            <div>               NODE (s)   Real (s)      (%)</div>
            <div>       Time:     44.556     44.556    100.0</div>
            <div>               (Mnbf/s)   (MFlops)   (ns/day)
               (hour/ns)</div>
            <div>Performance:      0.000      0.027      3.882    
               6.182</div>
          </div>
          <div><br>
          </div>
          <div><br>
          </div>
          <div>But if I try:</div>
          <div>mdrun -v -deffnm md -nt 1</div>
          <div>[snip]</div>
          <div>
            <div>starting mdrun 'PROTEIN G in water'</div>
            <div>1000 steps,      2.0 ps.</div>
            <div>[1]    75786 segmentation fault  mdrun -v -deffnm md
              -nt 1</div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    It might be due to the Andersen thermostat setting.<br>
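    <br>
    A quick way to check, assuming the thermostat really is the culprit,
    would be to rerun the CPU job with coupling switched off in md.mdp
    (the combination you mention below) and regenerate md.tpr with grompp.
    A minimal sketch of the two lines to change:<br>
    <pre>Tcoupl                   = no ; Andersen is not implemented for the CPU version
Pcoupl                   = no ; switched off as well, as in your comparison run</pre>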
    <br>
    <blockquote
      cite="mid:AANLkTiksD3z-069_bGcwhMFwysYj=SomFn5QU89YRLHJ@mail.gmail.com"
      type="cite">
      <div>
        <div>
          <div>
          </div>
          <div><br>
          </div>
          <div>Note: using -nt 1 because SHAKE is not supported with
            domain decomposition.</div>
          <div><br>
          </div>
          <div>If I set Tcoupl and Pcoupl = no, then I can compare
            mdrun with mdrun-gpu, and my GPU is ~2 times slower than a
            single core. Well, I definitely don't intend to use mdrun-gpu,
            but I am surprised that it performed that badly (OK, I am
            using a low-end GPU, but sander_openmm seems to work fine
            and very fast on my MBP).</div>
          <div><br>
          </div>
        </div>
      </div>
    </blockquote>
    Try fetching data less often. Also, the GPUs are currently best used
    for implicit solvent simulations.<br>
    <br>
    <blockquote
      cite="mid:AANLkTiksD3z-069_bGcwhMFwysYj=SomFn5QU89YRLHJ@mail.gmail.com"
      type="cite">
      <div>
        <div>
          <div>BTW, in gmx 4.5 manual, there's reference to Andersen
            thermostat only at section 6.9 GROMACS on GPUs. Is it
            supposed to be used only with mdrun-gpu?</div>
        </div>
      </div>
    </blockquote>
    Yes, at the moment.<br>
    <br>
    Rossen<br>
    <br>
    <blockquote
      cite="mid:AANLkTiksD3z-069_bGcwhMFwysYj=SomFn5QU89YRLHJ@mail.gmail.com"
      type="cite">
      <div>
        <div>
          <div><br>
          </div>
          <div>Any ideas? Thanks,</div>
          <div><br>
          </div>
          <div>Alan</div>
          <br>
          -- <br>
          Alan Wilter S. da Silva, D.Sc. - CCPN Research Associate<br>
          Department of Biochemistry, University of Cambridge. <br>
          80 Tennis Court Road, Cambridge CB2 1GA, UK.<br>
          &gt;&gt;<a moz-do-not-send="true"
            href="http://www.bio.cam.ac.uk/%7Eawd28">http://www.bio.cam.ac.uk/~awd28</a>&lt;&lt;<br>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>