<html><head><style type="text/css"><!-- DIV {margin:0px;} --></style></head><body><div style="font-family:times new roman,new york,times,serif;font-size:12pt"><div>Hi everybody,<br><br>We're trying to understand how gromacs g_msd calculates MSD (without reading through the C code, I really don't know much about it, and there are hundreds of lines).<br>(What we want to do is to calculate MSD properly in our supercooled water simulations, i.e., to choose correctly the values for -b, -e and -trestart in order to get the longest, reliable MSD data)<br><br><br>We found a mail from Gaurav Goel (quoted below) which gives a reasonable explanation on the topic.<br><br>If we understand this correctly, then the output of<br><br>g_msd -b 50 -e 100 -trestart 50<br><br>should be the same as the second half (with the proper shifting) of <br>the output of<br><br>g_msd -b 0 -e 100 -trestart 50<br><br>But it is not, as anyone can verify with any
simulation.<br>What are we missing?<br><br>Thanks in advance,<br>Julian<br><br>--<br>Julian Gelman Constantin <br>Department of Inorganic, Analytic and Chemical Physics (DQIAQF) <br>School of Exact and Natural Sciences <br>University of Buenos Aires, Argentina<br><br><br>---<br>Gaurav Goel gauravgoeluta at gmail.com<br>Tue Jul 13 00:56:02 CEST 2010<br><br>On Mon, Jul 12, 2010 at 5:22 PM, Ricardo Cuya Guizado<br>&lt;rcuyag at hotmail.com&gt; wrote:<br>> Dear gromacs users,<br>> I ran a 20 ns MD simulation of a solute in water.<br>> I obtained the MSD versus time with the g_msd program.<br>> In the plot, I observed linear behaviour of the MSD from 0 to 15 ns and a<br>> plateau with no linear trend over approximately the last 5 ns.<br>> To find out whether the observed plateau is due to the data or to<br>> the way the algorithm processes the data, I divided the MD run into two<br>> trajectories and obtained the MSD for each
one.<br>> From 0-10 ns, the plot shows a linear trend at the beginning and a<br>> plateau with no linear trend from 9 to 10 ns.<br>> From 10-20 ns, the plot is linear from 10 to 18 ns and not linear<br>> at the end; the same plateau was observed.<br>> Comparing the plots, they are not equivalent.<br>> Why does g_msd produce a non-linear plot at the end of the calculation, and why is the<br>> plateau always produced?<br>> Could somebody explain how the g_msd algorithm works, and why the plots<br>> are not equivalent or why they should be equivalent?<br><br>I will explain how the g_msd algorithm works and hopefully that will<br>answer all your questions above. What you see in the output file is the<br>average MSD versus time. This average is taken over all the particles<br>in the group you selected and over multiple time origins (the spacing between<br>origins is set with the -trestart parameter). Also, time
in<br>column 1 is the time difference from the start of your trajectory to the<br>current time.<br><br>E.g., let's say you collected a trajectory over 5 time units and<br>chose -trestart=1 time unit and -dt=1 time unit.<br><br>dt=1 means you'll have 6 configurations for your analysis (including<br>the configuration at t=0).<br><br>trestart=1 means you'll have 5 distinct trajectories for your analysis:<br>Trajectory 1: 0-5<br>T2: 1-5<br>T3: 2-5<br>T4: 3-5<br>T5: 4-5<br><br>Now you can notice that all 5 trajectories contribute to the average<br>MSD after 1 time unit (T1-T5), 4 trajectories contribute to the<br>average MSD after 2 time units (T1-T4), 3 trajectories to the average<br>MSD after 3 time units (T1-T3), ..., and only one trajectory to the<br>MSD after 5 time units (T1). Of course, this assumes that trestart is<br>large enough that all these trajectories are uncorrelated.<br><br>So, it's clear that the longer the time interval at which you want
to<br>evaluate the MSD, the fewer the trajectories used to evaluate<br>it... and hence, the higher the error in MSD values at longer times. That might<br>explain the deviation from linear behaviour at long times.<br><br>However, you must be careful in interpreting the MSD data and I<br>recommend reading some literature on the subject. A plateau in MSD<br>versus time data might also signify what is called cage motion, in<br>which a particle or atom is trapped by the surrounding particles and<br>is not able to move out of that hole on the simulation time scale. If<br>you want you can send me your MSD versus time data along with some<br>information on your system (such as potentials, density, temperature<br>etc.) and I can let you know my comments.<br><br>A few words of caution:<br>Make sure that the center of mass of your particle (or atom or<br>molecule) is diffusing several particle diameters. Also, make sure<br>that you're calculating the self-diffusion
coefficient by fitting a<br>straight line to the linear region of MSD versus time data. You can<br>either modify the -beginfit and -endfit options... or calculate the<br>slope of the MSD versus time data using some other software (e.g.,<br>gnuplot, Excel, etc.). If you're doing the latter you'll need to take<br>a look at the code in gmx_msd.c to know how the diffusion coefficient<br>is calculated from the slope of MSD versus time data (to get correct<br>units, use proper scaling factors, etc.).<br><br>I hope that helped.<br><br>-Gaurav<br>><br>><br>><br>><br>> Regards<br>> Ricardo Cuya<br></div>
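<div><br>The counting of time origins in the quoted explanation can be sketched in a few lines of Python. This is only an illustration of the averaging scheme as Gaurav describes it, not the g_msd source: the function name msd_multi_origin, the 1D coordinates and the restart_every argument (restart spacing in frames) are assumptions made for the example.<br><br></div>
<pre>
```python
def msd_multi_origin(positions, restart_every):
    """Average MSD over particles and over multiple time origins.

    positions[frame][particle] is a 1D coordinate; frames are equally spaced.
    restart_every is the spacing between time origins, in frames.
    Returns (msd, counts), two dicts keyed by lag in frames (1 .. n_frames - 1).
    Hypothetical helper for illustration only; not the g_msd implementation.
    """
    n_frames = len(positions)
    n_particles = len(positions[0])
    sums = [0.0] * n_frames
    n_origins = [0] * n_frames

    # Every restart_every-th frame starts a new sub-trajectory (time origin).
    for origin in range(0, n_frames - 1, restart_every):
        for frame in range(origin + 1, n_frames):
            lag = frame - origin
            squared = 0.0
            for p in range(n_particles):
                d = positions[frame][p] - positions[origin][p]
                squared += d * d
            sums[lag] += squared / n_particles  # average over particles
            n_origins[lag] += 1

    # Average over the origins that actually reach each lag.
    msd = {lag: sums[lag] / n_origins[lag] for lag in range(1, n_frames)}
    counts = {lag: n_origins[lag] for lag in range(1, n_frames)}
    return msd, counts


# Six frames (t = 0..5) of two particles moving at constant speed 1 and 2.
positions = [[frame * 1.0, frame * 2.0] for frame in range(6)]
msd, counts = msd_multi_origin(positions, restart_every=1)
print(counts)
```
</pre>
<div>For six frames and a restart at every frame, counts reproduces the 5, 4, 3, 2, 1 pattern of trajectories T1-T5 above. In this sketch, calling it on 11 frames with restart_every=5 gives two origins for lags 1-5 and a single origin for lags 6-10, so the long-lag half of the curve rests on fewer samples than the short-lag half.<br></div>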
</div><br>
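<div style="font-family:times new roman,new york,times,serif;font-size:12pt"><div><br>For the fitting advice in the quoted reply, here is a minimal sketch of getting a diffusion coefficient from the slope outside of g_msd. It assumes the Einstein relation MSD = 2*d*D*t (d = 3 dimensions), times in ps and MSD in nm^2, and a result in units of 1e-5 cm^2/s. The function names and the first/last index arguments are hypothetical, and the scaling should still be checked against gmx_msd.c as recommended above.<br><br></div>
<pre>
```python
def fit_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def diffusion_coefficient(times_ps, msd_nm2, first, last, dims=3):
    """Fit MSD = 2 * dims * D * t on the points with index first .. last - 1.

    Assumes times in ps and MSD in nm^2 (an assumption of this sketch).
    Returns D in units of 1e-5 cm^2/s.
    """
    slope = fit_slope(times_ps[first:last], msd_nm2[first:last])  # nm^2/ps
    d_nm2_per_ps = slope / (2 * dims)
    # 1 nm^2/ps = 1e-14 cm^2 / 1e-12 s = 1e-2 cm^2/s = 1000 * 1e-5 cm^2/s
    return d_nm2_per_ps * 1000.0


# Made-up, perfectly linear data with slope 0.015 nm^2/ps.
times_ps = [0.0, 10.0, 20.0, 30.0, 40.0]
msd_nm2 = [0.0, 0.15, 0.30, 0.45, 0.60]
print(diffusion_coefficient(times_ps, msd_nm2, first=1, last=5))
```
</pre>
<div>The first and last indices play the role of a fit window: choose them so that only the linear region of the curve is included, leaving out the short-time part and the noisy long-time tail.<br></div></div>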
</body></html>