Thank you Tsjerk, this is indeed the solution I had worked out, as mentioned in a previous post. The only hitch is that it works only if the compiler and C library support large files. In that case I can use a <span style="background-color:rgb(255,255,255);font-family:CourierNew,Courier,monospace;text-align:left">#define _FILE_OFFSET_BITS 64</span><span style="background-color:rgb(255,255,255);font-family:CourierNew,Courier,monospace;text-align:left"> </span><span style="background-color:rgb(255,255,255);font-family:CourierNew,Courier,monospace;text-align:left">and fseeko instead of fseek. I tested it with a 200 GB file.</span><div>
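For anyone who wants to try the same thing, the idea can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not GROMACS code: the helper name estimate_xtc_frames is made up for illustration, and the position of the size field (the last four bytes of the 92-byte header, stored big-endian) is taken from the Python script quoted further down in this thread. Since the size of a frame in an xtc file is variable, the result is an estimate and not an exact count.<br>

```c
/* Both macros must be defined before any system header is included. */
#define _FILE_OFFSET_BITS 64      /* make off_t 64 bits wide where supported */
#define _POSIX_C_SOURCE 200809L   /* expose fseeko/ftello under strict -std modes */

#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of GROMACS or xdrfile): estimate the
 * number of frames in an xtc file from the size of the first frame.
 * Assumption: the first frame starts with a 92-byte header whose last
 * four bytes hold the size of the compressed coordinates as a
 * big-endian integer, as implied by the Python script in this thread.
 * Returns -1 on any I/O error.
 */
static int64_t estimate_xtc_frames(const char *path)
{
    unsigned char hdr[92];
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        return -1;
    }

    /* Read only the header of the first frame, not the whole frame. */
    if (fread(hdr, 1, sizeof hdr, fp) != sizeof hdr) {
        fclose(fp);
        return -1;
    }

    /* Decode the big-endian size field by hand (bytes 88..91, 0-based). */
    uint32_t nbytes = ((uint32_t)hdr[88] << 24) | ((uint32_t)hdr[89] << 16)
                    | ((uint32_t)hdr[90] << 8)  |  (uint32_t)hdr[91];
    int64_t frame_size = 92 + (int64_t)nbytes;

    /* fseeko/ftello use off_t, so files larger than 2 GB are handled. */
    if (fseeko(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    off_t file_size = ftello(fp);
    fclose(fp);
    if (file_size < 0) {
        return -1;
    }

    /* Approximate only: later frames may be larger or smaller. */
    return (int64_t)file_size / frame_size;
}
```

Calling estimate_xtc_frames("traj.xtc") (the file name is just an example) gives a number that can serve as a sanity bound before asking for a particular frame. Because it is an estimate, a frame index close to that bound still needs a proper check when the frame is actually read.<br>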
<div><div style="text-align:left"><font face="CourierNew, Courier, monospace"><br></font></div><br><div class="gmail_quote">On 5 June 2012 09:45, Tsjerk Wassenaar <span dir="ltr"><<a href="mailto:tsjerkw@gmail.com" target="_blank">tsjerkw@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Paolo,<br>
<br>
The python code also gives a hint about the C solution... You still<br>
don't need to read in the first frame. Bytes 81-84 from the start<br>
contain the size of the frame, excluding 92 bytes used for the header.<br>
Mind that this is only an approximate size for a frame, as the size<br>
per frame in an xtc file is variable. But it'll probably be close. If<br>
you have the size of one frame, you need the size of the file, for<br>
which you can use the solution at<br>
<a href="http://stackoverflow.com/questions/8236/how-do-you-determine-the-size-of-a-file-in-c" target="_blank">http://stackoverflow.com/questions/8236/how-do-you-determine-the-size-of-a-file-in-c</a><br>
Dividing one by the other should give an indication of the number of<br>
frames. If you have a small C program for calculating the number of<br>
frames, please do post it. It might be interesting for others.<br>
<br>
Hope it helps,<br>
<br>
Tsjerk<br>
<div class="HOEnZb"><div class="h5"><br>
On Tue, Jun 5, 2012 at 12:40 AM, Oliver Stueker <<a href="mailto:ostueker@gmail.com">ostueker@gmail.com</a>> wrote:<br>
><br>
> As far as I know there is no field at the beginning of the file that would<br>
> give a parser hints how many frames are in it.<br>
> (probably because that makes it easier/more performant to append to the file<br>
> while reducing the risk of corrupting it in case a write goes bad)<br>
><br>
> On the other hand that makes it hard to implement random-access to frames in<br>
> XTC/TRR files.<br>
><br>
> Interestingly, there is currently a discussion on the mailing list of MDAnalysis<br>
> (a Python framework that can deal with XTC and other trajectories) on how<br>
> libxdr might be extended to generate a checksum-protected index for XTC<br>
> files, so that a given trajectory has to be read only once from beginning to<br>
> end.<br>
> <a href="https://groups.google.com/group/mdnalysis-discussion/browse_thread/thread/3cae3634c726f1ad" target="_blank">https://groups.google.com/group/mdnalysis-discussion/browse_thread/thread/3cae3634c726f1ad</a><br>
><br>
><br>
> a different Oliver<br>
><br>
><br>
> On Mon, Jun 4, 2012 at 3:24 PM, Paolo Franz <<a href="mailto:paolo.franz@gmail.com">paolo.franz@gmail.com</a>> wrote:<br>
>><br>
>> I am trying to avoid doing it by brute force, that is, reading all frames<br>
>> until the last one is found. What I originally need to do is to test<br>
>> whether a frame exists in the trajectory. I tried xtc_seek_frame, but that<br>
>> does not work. Of course, if I know how many frames there are, the test<br>
>> becomes trivial.<br>
>><br>
>> That said, I definitely know what is in the trajectory and how many frames<br>
>> there are: I ran the MD myself and I have the output file! What I want to do<br>
>> is to write code that figures out by itself what to expect, so that if by any<br>
>> chance I forget what is inside, it does not go into an infinite loop when I<br>
>> ask it to analyse the wrong frame.<br>
>><br>
>> Cheers<br>
>> Paolo<br>
>><br>
>> On 4 June 2012 22:59, Justin A. Lemkul <<a href="mailto:jalemkul@vt.edu">jalemkul@vt.edu</a>> wrote:<br>
>>><br>
>>><br>
>>> If all you need is the number of frames contained in an .xtc file, is<br>
>>> there some reason why running gmxcheck on the .xtc file is insufficient?<br>
>>><br>
>>> -Justin<br>
>>><br>
>>><br>
>>> On 6/4/12 4:56 PM, Paolo Franz wrote:<br>
>>>><br>
>>>> Hi Tsjerk,<br>
>>>> Thanks, but I don't really want to use a python script, I am doing this from<br>
>>>> some C/C++ code. I think I figured out a way to do it, but I haven't tested it yet:<br>
>>>><br>
>>>> i) open the file<br>
>>>> ii) do a read_first_xtc<br>
>>>> iii) then get the file pointer position from ftellg, which should be the length of the frame in bytes;<br>
>>>> iv) place the file pointer at the end of the file with an fseek, then get the length with an ftellg<br>
>>>> v) Divide the total length by the length of a frame and obtain the number of written frames.<br>
>>>><br>
>>>> I am only wondering what to do when the length in bytes of the file is too large for a long int!<br>
>>>><br>
>>>> On 4 June 2012 16:11, Tsjerk Wassenaar <<a href="mailto:tsjerkw@gmail.com">tsjerkw@gmail.com</a>> wrote:<br>
>>>><br>
>>>> Hey Paolo,<br>
>>>><br>
>>>> I think I posted a script for extracting a last frame before, but if I<br>
>>>> can't even find it myself... Here it is:<br>
>>>><br>
>>>> #!/usr/bin/env python<br>
>>>><br>
>>>> from struct import unpack<br>
>>>> import sys<br>
>>>><br>
>>>> def i(x): return unpack(">I", x)[0]  # Big-endian 32-bit unsigned integer<br>
>>>><br>
>>>> f = open(sys.argv[1], "rb")  # Binary mode: the file is not text<br>
>>>> tag = f.read(8)  # Tag: magic number and number of atoms<br>
>>>> n = 92 + i(f.read(84)[-4:])  # Size of frame in bytes<br>
>>>><br>
>>>> f.seek(-5*n//4, 2)  # This should contain a complete frame<br>
>>>> frame = f.read()  # Read the remaining part in<br>
>>>> frame = frame[frame.index(tag):]  # Find the tag<br>
>>>><br>
>>>> # Open the output file<br>
>>>> if len(sys.argv) > 2:<br>
>>>>     o = sys.argv[2]<br>
>>>> else:<br>
>>>>     o = sys.argv[1][:-4]+"-last.xtc"<br>
>>>> open(o,"wb").write(frame)<br>
>>>><br>
>>>> ###<br>
>>>><br>
>>>> Hope it helps. Cheers,<br>
>>>><br>
>>>> Tsjerk<br>
>>>> On Mon, Jun 4, 2012 at 12:59 PM, Paolo Franz <<a href="mailto:paolo.franz@gmail.com">paolo.franz@gmail.com</a>> wrote:<br>
>>>> > Hello everybody!<br>
>>>> ><br>
>>>> > I am wondering how I can figure out the number of frames contained in an<br>
>>>> > .xtc file. I need to read a particular frame of a trajectory, and I<br>
>>>> > thought that the function<br>
>>>> > xtc_seek_frame(FILE * , int *, int *)<br>
>>>> > would return 0 if the frame was there and 1 if it was not. Instead, if I<br>
>>>> > call it with a frame outside the boundaries it seems to go into an infinite<br>
>>>> > loop. What am I doing wrong? Is there a way to read the last frame of an<br>
>>>> > .xtc file?<br>
>>>> ><br>
>>>> > Sincerely<br>
>>>> > Paolo<br>
>>>> ><br>
><br>
><br>
</div></div><div class="im HOEnZb">> --<br>
> gmx-developers mailing list<br>
> <a href="mailto:gmx-developers@gromacs.org">gmx-developers@gromacs.org</a><br>
> <a href="http://lists.gromacs.org/mailman/listinfo/gmx-developers" target="_blank">http://lists.gromacs.org/mailman/listinfo/gmx-developers</a><br>
> Please don't post (un)subscribe requests to the list. Use the<br>
> www interface or send it to <a href="mailto:gmx-developers-request@gromacs.org">gmx-developers-request@gromacs.org</a>.<br>
<br>
<br>
<br>
</div><div class="im HOEnZb">--<br>
Tsjerk A. Wassenaar, Ph.D.<br>
<br>
post-doctoral researcher<br>
Molecular Dynamics Group<br>
* Groningen Institute for Biomolecular Research and Biotechnology<br>
* Zernike Institute for Advanced Materials<br>
University of Groningen<br>
The Netherlands<br>
</div><div class="HOEnZb"><div class="h5">
</div></div></blockquote></div><br></div></div>