<html><head><style type="text/css"><!-- DIV {margin:0px;} --></style></head><body><div style="font-family:times new roman,new york,times,serif;font-size:12pt"><div>Hi everybody,<br><br>We're trying to understand how the GROMACS tool g_msd calculates the MSD (without reading through the C code: I really don't know much about it, and there are hundreds of lines).<br>Our goal is to calculate the MSD properly in our supercooled water simulations, i.e., to choose the values of -b, -e and -trestart correctly in order to get the longest reliable MSD data.<br><br><br>We found a mail from Gaurav Goel (quoted below) that gives a reasonable explanation of the topic.<br><br>If we understand it correctly, then the output of<br><br>g_msd -b 50 -e 100 -trestart 50<br><br>should be the same as the second half (with the proper shifting) of<br>the output of<br><br>g_msd -b 0 -e 100 -trestart 50<br><br>But it is not, as anyone can verify with any

 simulation.<br>So what are we missing?<br><br>Thanks in advance,<br>Julian<br><br>--<br>Julian Gelman Constantin <br>Department of Inorganic, Analytic and Chemical Physics (DQIAQF) <br>School of Exact and Natural Sciences <br>University of Buenos Aires, Argentina<br><br><br>---<br>Gaurav Goel gauravgoeluta at gmail.com<br>Tue Jul 13 00:56:02 CEST 2010<br><br>On Mon, Jul 12, 2010 at 5:22 PM, Ricardo Cuya Guizado<br>&lt;rcuyag at hotmail.com&gt; wrote:<br>&gt; Dear GROMACS users,<br>&gt; I ran a 20 ns MD simulation of a solute in water.<br>&gt; Using the g_msd program, I obtained the MSD versus time.<br>&gt; In the plot, I observed linear behaviour of the MSD from 0 to 15 ns and a<br>&gt; plateau with no linear trend over approximately the last 5 ns.<br>&gt; To find out whether the observed plateau is due to the data or to<br>&gt; the way the algorithm processes the data, I divided the MD run into two<br>&gt; trajectories and obtained the MSD for each

 one.<br>&gt; For 0-10 ns, the plot shows a linear trend at the beginning and a<br>&gt; plateau with no linear trend from 9 to 10 ns.<br>&gt; For 10-20 ns, the plot is linear from 10 to 18 ns and not linear<br>&gt; at the end, where the same plateau is observed.<br>&gt; Comparing the plots, they are not equivalent.<br>&gt; Why does g_msd produce a nonlinear plot at the end of the calculation, and<br>&gt; why is the plateau always produced?<br>&gt; Could somebody explain how the g_msd algorithm works, and why the plots<br>&gt; are not equivalent or why they should be equivalent?<br><br>I will explain how the g_msd algorithm works and hopefully that will<br>answer all your questions above. What you see in the output file is<br>the average MSD versus time. This average is taken over all the particles<br>in the group you selected and over multiple time origins (the spacing of<br>these origins is set with the -trestart parameter). Also, time

 in<br>column 1 is the time difference from the start of your trajectory to<br>the current time.<br><br>E.g., let's say you collected a trajectory over 5 time units and<br>chose -trestart=1 time unit and -dt=1 time unit.<br><br>dt=1 means you'll have 6 configurations for your analysis (including<br>the configuration at t=0).<br><br>trestart=1 means you'll have 5 distinct trajectories for your analysis:<br>T1: 0-5<br>T2: 1-5<br>T3: 2-5<br>T4: 3-5<br>T5: 4-5<br><br>Now you can notice that all 5 trajectories contribute to the average<br>MSD after 1 time unit (T1-T5), 4 trajectories contribute to the<br>average MSD after 2 time units (T1-T4), 3 trajectories to the average<br>MSD after 3 time units (T1-T3), ..., and only one trajectory to the<br>MSD after 5 time units (T1). Of course, this assumes that trestart is<br>large enough that all these trajectories are uncorrelated.<br><br>So, it's clear that the longer the time interval at which you want

 to<br>evaluate the MSD, the fewer the trajectories used to evaluate<br>it, and hence the higher the error in MSD values at longer times. That might<br>explain the deviation from linear behaviour at long times.<br><br>However, you must be careful in interpreting the MSD data and I<br>recommend reading some literature on the subject. A plateau in MSD<br>versus time data might also signify what is called cage motion, in<br>which a particle or atom is trapped by the surrounding particles and<br>is not able to move out of that hole on the simulation time scale. If<br>you want, you can send me your MSD versus time data along with some<br>information on your system (such as potentials, density, temperature,<br>etc.) and I can let you know my comments.<br><br>A few words of caution:<br>Make sure that the center of mass of your particle (or atom or<br>molecule) is diffusing over several particle diameters. Also, make sure<br>that you're calculating the self-diffusion

 coefficient by fitting a<br>straight line to the linear region of the MSD versus time data. You can<br>either modify the -beginfit and -endfit options, or calculate the<br>slope of the MSD versus time data using some other software (e.g.,<br>gnuplot, Excel, etc.). If you're doing the latter, you'll need to take<br>a look at the code in gmx_msd.c to know how the diffusion coefficient<br>is calculated from the slope of the MSD versus time data (to get correct<br>units, use proper scaling factors, etc.).<br><br>I hope that helped.<br><br>-Gaurav<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br>&gt; Regards<br>&gt; Ricardo Cuya<br></div>
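<div><br>The multiple-time-origin averaging described in the quoted message can be sketched in plain Python. This is only an illustration under stated assumptions, not the actual gmx_msd.c implementation: the function name msd_multi_origin, the layout of frames, and the restart_every argument (the restart interval counted in frames, standing in for -trestart) are all hypothetical.<br><br></div>
<pre>
```python
def msd_multi_origin(frames, restart_every):
    """Average the MSD over particles and over multiple time origins.

    frames[t][i] is the (x, y, z) position of particle i at frame t.
    restart_every is the spacing between time origins, in frames
    (a hypothetical stand-in for -trestart divided by the frame spacing).
    Returns (msd, counts), both indexed by lag in frames.
    """
    n_frames = len(frames)
    sums = [0.0] * n_frames
    counts = [0] * n_frames
    for origin in range(0, n_frames, restart_every):
        for t in range(origin, n_frames):
            lag = t - origin
            # Squared displacement of each particle since this origin.
            sq = [
                sum((a - b) ** 2 for a, b in zip(pos, pos0))
                for pos, pos0 in zip(frames[t], frames[origin])
            ]
            sums[lag] += sum(sq) / len(sq)  # average over particles
            counts[lag] += 1                # one more origin at this lag
    msd = [s / c for s, c in zip(sums, counts)]  # average over origins
    return msd, counts


# Toy trajectory: one particle moving at unit speed along x, 6 frames (t = 0..5).
frames = [[(float(t), 0.0, 0.0)] for t in range(6)]
msd, counts = msd_multi_origin(frames, restart_every=1)
print(counts)
print(msd)
```
</pre>
<div>For the 5-time-unit example in the quoted message (6 frames, a restart at every frame), counts shows how many time origins contribute to each lag: 5 for a lag of 1 time unit, falling to 1 for a lag of 5. In this sketch, a later -b simply removes the earlier origins, so the remaining short lags are averaged over fewer origins.<br></div>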

</div><br>


      &nbsp;</body></html>