<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Dear Gromacs community<br>
I would like to better understand how the uncertainty on
the diffusion coefficient is calculated in GROMACS.<br>
I am calculating water diffusivity for different water models at
different temperatures.<br>
I am using Gomacs 4.5.4<br>
<br>
<br>
Reading the manual, I am a bit confused on how g_msd calculate the
uncertainty on the average.<br>
Indeed, the manual told me (pag 285-286) <br>
<br>
1)<br>
<br>
"g_msd computes the mean square displacement (MSD) of atoms from a
set of initial positions. This<br>
provides an easy way to compute the diffusion constant using the
Einstein relation. The time between the<br>
reference points for the MSD calculation is set with -trestart. The
diffusion constant is calculated by<br>
least squares fitting a straight line (D*t + c) through the MSD(t)
from -beginfit to -endfit (note that t <br>
is time from the reference positions, not simulation time). <font
color="#3366ff">An error estimate given, which is the difference<br>
of the diffusion coefficients obtained from fits over the two
halves of the fit interval.</font>"<br>
<br>
<br>
2) By the way, few lines below<br>
<br>
<br>
"The diffusion coefficient is determined by linear regression of the
MSD,<br>
where, unlike for the normal output of D, the times are weighted
according to<br>
the number of reference points, i.e. short times have a higher
weight. Also<br>
when -beginfit=-1,fitting starts at 10% and when -endfit=-1, fitting
goes to<br>
90%.<font color="#3366ff"> Using this option one also gets an
accurate error estimate based on the<br>
statistics between individual molecules</font>. Note that this
diffusion coefficient<br>
and error estimate are only accurate when the MSD is completely
linear<br>
between -beginfit and -endfit."<br>
<br>
Based on this last paragraph, I suppose that g_msd calculates for
each molecule the diffusion coefficient and then it calculates the
average and standard deviation from the daset of the diffusion
coefficients of each molecules.<br>
<br>
So, if I compute <br>
<br>
g_msd -f traj.xtc -s topol.tpr -n index.ndx.ndx -o msd.xvg -b 0 -e
100000 -beginfit 15000 -endfit 25000 -trestart 10 -type z<br>
<br>
at the top of my output file msd.xvg I can read<br>
<br>
# MSD gathered over 100000 ps with 10001 restarts<br>
# Diffusion constants fitted from time 1500 to 25000 ps<br>
# D[all_100000] = 0.1230 (+/- 0.0060) (1e-5 cm^2/s)<br>
<br>
<br>
So, 0.0060 is one standard deviation based on the statistic between
different molecules (second method) or is calculated using the
difference of the diffusion coefficient on the two halves of the
fitting region (first method)?<br>
<br>
I was searching in the mail list without success...thanks for any
help<br>
<br>
Ivan<br>
<pre class="moz-signature" cols="72">--
------
Ivan Gladich, Ph.D.
Postdoctoral Fellow
Academy of Sciences of the Czech Republic
Institute of Organic Chemistry and Biochemistry AS CR, v.v.i.
Flemingovo nám. 2.
166 10 Praha 6
Czech Republic
Tel: +420775504164
e-mail: <a class="moz-txt-link-abbreviated" href="mailto:ivan.gladich@uochb.cas.cz">ivan.gladich@uochb.cas.cz</a>
web page:<a class="moz-txt-link-freetext" href="http://www.molecular.cz/~gladich/">http://www.molecular.cz/~gladich/</a>
-----</pre>
</body>
</html>