<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#ffffff" text="#000000">

    Dear Gromacs community<br>

             I would like to better understand how the uncertainty on

    the diffusion coefficient is calculated in GROMACS.<br>

    I am calculating water diffusivity for different water models at

    different temperatures.<br>

    I am using Gomacs 4.5.4<br>

    <br>

    <br>

    Reading the manual, I am a bit confused on how g_msd calculate the

    uncertainty on the average.<br>

    Indeed, the manual told me (pag 285-286) <br>

    <br>

    1)<br>

    <br>

    "g_msd computes the mean square displacement (MSD) of atoms from a

    set of initial positions. This<br>

    provides an easy way to compute the diffusion constant using the

    Einstein relation. The time between the<br>

    reference points for the MSD calculation is set with -trestart. The

    diffusion constant is calculated by<br>

    least squares fitting a straight line (D*t + c) through the MSD(t)

    from -beginfit to -endfit (note that t <br>

    is time from the reference positions, not simulation time). <font

      color="#3366ff">An error estimate given, which is the difference<br>

      of the diffusion coefficients obtained from fits over the two

      halves of the fit interval.</font>"<br>

    <br>

    <br>

    2) By the way, few lines below<br>

    <br>

    <br>

    "The diffusion coefficient is determined by linear regression of the

    MSD,<br>

    where, unlike for the normal output of D, the times are weighted

    according to<br>

    the number of reference points, i.e. short times have a higher

    weight. Also<br>

    when -beginfit=-1,fitting starts at 10% and when -endfit=-1, fitting

    goes to<br>

    90%.<font color="#3366ff"> Using this option one also gets an

      accurate error estimate based on the<br>

      statistics between individual molecules</font>. Note that this

    diffusion coefficient<br>

    and error estimate are only accurate when the MSD is completely

    linear<br>

    between -beginfit and -endfit."<br>

    <br>

    Based on this last paragraph, I suppose that g_msd calculates for

    each molecule the diffusion coefficient and then it calculates the

    average and standard deviation from the daset of the diffusion

    coefficients of each molecules.<br>

    <br>

    So, if I compute <br>

    <br>

    g_msd -f traj.xtc -s topol.tpr -n index.ndx.ndx -o msd.xvg -b 0 -e

    100000 -beginfit 15000 -endfit 25000 -trestart 10 -type z<br>

    <br>

    at the top of my output file msd.xvg I can read<br>

    <br>

    # MSD gathered over 100000 ps with 10001 restarts<br>

    # Diffusion constants fitted from time 1500 to 25000 ps<br>

    # D[all_100000] = 0.1230 (+/- 0.0060) (1e-5 cm^2/s)<br>

    <br>

    <br>

    So, 0.0060 is one standard deviation based on the statistic between

    different molecules (second method) or is calculated using the

    difference of the diffusion coefficient on the two halves of the

    fitting region (first method)?<br>

    <br>

    I was searching in the mail list without success...thanks for any

    help<br>

    <br>

    Ivan<br>

    <pre class="moz-signature" cols="72">-- 

------

Ivan Gladich, Ph.D.

Postdoctoral Fellow

Academy of Sciences of the Czech Republic

Institute of Organic Chemistry and Biochemistry AS CR, v.v.i.

Flemingovo nám. 2.

166 10 Praha 6

Czech Republic 

Tel: +420775504164

e-mail: <a class="moz-txt-link-abbreviated" href="mailto:ivan.gladich@uochb.cas.cz">ivan.gladich@uochb.cas.cz</a>

web page:<a class="moz-txt-link-freetext" href="http://www.molecular.cz/~gladich/">http://www.molecular.cz/~gladich/</a>

-----</pre>

  </body>

</html>