<br><br><div class="gmail_quote">On Tue, Apr 17, 2012 at 10:58 AM, David van der Spoel <span dir="ltr"><<a href="mailto:spoel@xray.bmc.uu.se">spoel@xray.bmc.uu.se</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">On 2012-04-17 16:47, Roland Schulz wrote:<br>
> Hi,<br>
><br>
> On Tue, Apr 17, 2012 at 9:48 AM, Erik Lindahl <<a href="mailto:erik@kth.se">erik@kth.se</a><br>
</div><div><div class="h5">> <mailto:<a href="mailto:erik@kth.se">erik@kth.se</a>>> wrote:<br>
><br>
><br>
> On Apr 17, 2012, at 3:18 PM, Roland Schulz wrote:<br>
><br>
> > E.g. for parallelization the issue is very similar as it is for<br>
> portability. Supporting domain decomposition makes it more difficult<br>
> for everyone, and everyone has to make sure that they don't break it.<br>
> And it is only included because it is essential to Gromacs and used by<br>
> almost everyone.<br>
><br>
> Right - and that's of course something we don't want to push down<br>
> just on the few people working with parallelization :-) We don't<br>
> have automated tests for it yet, but when we have more functional<br>
> tests the idea is that we should automatically reject patches that<br>
> break parallel runs!<br>
><br>
> Yes. But we only do it for parallelization because the majority (in this<br>
> case probably everyone) agrees that this is important. We wouldn't<br>
> accept a feature which would be as time consuming for every developer as<br>
> parallelization is, but only useful for a small minority. :-)<br>
><br>
> I simply don't buy the argument that just because these 1132 lines<br>
> are not perfect (they obviously aren't) portability doesn't matter<br>
> at all and we might as well include 10 megabytes of additional<br>
> source code where we have no control of the portability.<br>
><br>
> I didn't say portability isn't important at all. All I'm saying is that<br>
> portability shouldn't be treated as a Boolean. In practice portability<br>
> is, like any other metric, a scale. And the decision to support 99.9% of<br>
> platforms instead of 99.5% should be a matter of cost-benefit analysis, as<br>
> is adding a new feature.<br>
><br>
> > But I think that "fancy" IO is also an optional feature. I agree<br>
> that it is a very important feature and it has many disadvantages if<br>
> the same format is not used everywhere. But it is also<br>
> non-essential. And at that point it should become a matter of<br>
> cost-benefit and not a matter of principle. I.e. how many people<br>
> benefit from features made possible by HDF5 (e.g. because limited<br>
> developer time wouldn't allow them without HDF5) versus how much of<br>
> a pain is it for the few people who have to live with XTC (and<br>
> conversion). And one very important factor in that cost-benefit<br>
> analysis is the ratio of users.<br>
><br>
> But now you are moving the goal-posts! The aim of the present<br>
> TNG-based project was NOT "fancy" IO, but a new default simple<br>
> portable Gromacs trajectory format that (1) includes headers for<br>
> atom names and stuff, (2) is a small free library that can easily be<br>
> contributed to other codes so they can read/write our files, and (3)<br>
> enable better compression.<br>
><br>
> What I meant with "fancy" IO was that it is optional. These 3 things<br>
> aren't required to run a simulation on an exotic platform (e.g. Kei) and<br>
> to be able to analyze the results (after potentially converting them).<br>
><br>
> It would of course be nice if this format also allowed efficient<br>
> parallel IO and advanced slicing, but that has never been the<br>
> primary goal of the file format project, in particular not if it<br>
> starts to come in conflict with the aims above.<br>
><br>
> As I said before, parallel IO isn't the issue. (Simple) parallel writing<br>
> is easier without HDF5. Parallel reading (for analysis) is possible as<br>
> long as the format is seekable (can be easily added even to XTC by<br>
> creating a 2nd file with the index).<br>
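A per-frame offset index of this kind is straightforward. Here is a minimal stdlib-only Python sketch, using a made-up length-prefixed frame format for illustration (this is not the actual XTC layout); the offset list stands in for the second file holding the index:

```python
import io
import struct

def write_frames(buf, frames):
    """Write variable-length frames and return their byte offsets.

    Toy format (hypothetical, not real XTC): a 4-byte big-endian
    length prefix followed by the payload. The returned offset list
    is what would be stored in the sidecar index file.
    """
    index = []
    for payload in frames:
        index.append(buf.tell())
        buf.write(struct.pack(">I", len(payload)))
        buf.write(payload)
    return index

def read_frame(buf, index, k):
    """Seek directly to frame k via the index -- no scan from the
    start, so independent readers (e.g. parallel analysis processes)
    can each jump straight to their own frames."""
    buf.seek(index[k])
    (n,) = struct.unpack(">I", buf.read(4))
    return buf.read(n)

traj = io.BytesIO()
idx = write_frames(traj, [b"frame0", b"frame-one", b"fr2"])
assert read_frame(traj, idx, 2) == b"fr2"
assert read_frame(traj, idx, 0) == b"frame0"
```

The same idea works for any append-only format whose frames are self-delimiting; the index can even be rebuilt by one sequential scan if it is lost.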
><br>
><br>
> Having said that, we just discussed things here in the lab, and one<br>
> alternative could be to have a simple built-in HDF5 implementation<br>
> that can write correct headers for 1-3 dimensional arrays so our<br>
> normal files are HDF5-compliant when written on a single node. This<br>
> should be possible to do in ~100k of source code. If there is no<br>
> external HDF5 library present, this will be the only alternative<br>
> supported, and you will not be able to use e.g. parallel IO - but<br>
> the file format will work.<br>
><br>
><br>
> Option 1) Up to 100k lines we have to write and support. And the code<br>
> can only use the subset of HDF5 supported.<br>
> Option 2) Users on very exotic platforms have to keep using XTC and in<br>
> post-production convert their files (only if they want to benefit of<br>
> HDF5 advantages in analysis)<br>
><br>
> I really don't see how Option 1 could win in any reasonable<br>
> cost benefit analysis. :-)<br>
><br>
> BTW: All of HDF5 is 135k lines (according to sloccount, excluding the<br>
> C++, HL, and Fortran bindings). And HDF5 abstracts all OS-dependent<br>
> functions (IO, threads, ...). Thus only a small part (18 files, 9300<br>
> lines in total - this includes the respective headers and the abstraction<br>
> layer itself) has any #ifdef for Windows. Thus only those files would need<br>
> to be touched to add support for an OS other than POSIX, Windows, or VMS.<br>
> It is even possible to write one's own low-level file layer<br>
> (<a href="http://www.hdfgroup.org/HDF5/doc/TechNotes/VFL.html" target="_blank">http://www.hdfgroup.org/HDF5/doc/TechNotes/VFL.html</a>), which could be<br>
> based on futil.c to give us our own OS abstraction.<br>
><br>
> The caveat is what happens to the physical file format when HDF5<br>
> writes parallel IO? Will this result in a file with different<br>
> properties that is difficult for us to read with a naive<br>
> implementation?<br>
><br>
> No problem. HDF5 parallel IO doesn't produce different formats. It<br>
> writes in standard chunks (which would need to be supported anyhow for<br>
> block compression and fast seek).<br>
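To illustrate why chunking gives both block compression and fast seek: if frames are grouped into fixed-size chunks and each chunk is compressed independently, a small chunk index yields random access without decompressing the whole file. A rough stdlib-only sketch (chunk size, separator, and format are invented for illustration; in real HDF5 the library manages the chunk B-tree itself):

```python
import io
import zlib

FRAMES_PER_CHUNK = 2  # illustrative; real chunk sizes would be tuned

def write_chunked(buf, frames):
    """Compress frames in fixed-size groups; return [(offset, size), ...].

    The NUL separator assumes payloads contain no NUL bytes -- fine for
    this toy data, not a general framing scheme."""
    index = []
    for i in range(0, len(frames), FRAMES_PER_CHUNK):
        blob = zlib.compress(b"\x00".join(frames[i:i + FRAMES_PER_CHUNK]))
        index.append((buf.tell(), len(blob)))
        buf.write(blob)
    return index

def read_frame(buf, index, k):
    """Decompress only the chunk holding frame k -- cost is O(chunk),
    not O(file), which is what makes parallel reads cheap."""
    off, size = index[k // FRAMES_PER_CHUNK]
    buf.seek(off)
    frames = zlib.decompress(buf.read(size)).split(b"\x00")
    return frames[k % FRAMES_PER_CHUNK]

buf = io.BytesIO()
idx = write_chunked(buf, [b"f0", b"f1", b"f2", b"f3", b"f4"])
assert read_frame(buf, idx, 3) == b"f3"
```

Because each chunk is self-contained, readers on different nodes can fetch disjoint chunks concurrently, and the serial and parallel writers produce the same on-disk layout, which matches the point above that HDF5 parallel IO does not change the format.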
><br>
> Roland<br>
><br>
<br>
</div></div>Nice discussion. Just wanted to point out that if GROMACS needs HDF5 the<br>
big-iron vendors will help port HDF5 to their platforms.<br>
<br>
By the way, has anyone worked on a port to iOS yet :) ?<br></blockquote><div>It seems ;-) </div><div><a href="http://code.google.com/p/ios-face-detection/source/browse/OpenCV-2.2.0/include/opencv2/flann/hdf5.h?r=d35a62f475aa2813e4f3c80e50c33b7112389746">http://code.google.com/p/ios-face-detection/source/browse/OpenCV-2.2.0/include/opencv2/flann/hdf5.h?r=d35a62f475aa2813e4f3c80e50c33b7112389746</a></div>
<div><a href="http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/2010-January/002357.html">http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/2010-January/002357.html</a></div><div><br></div><div>Roland</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span class="HOEnZb"><font color="#888888"><br>
<br>
--<br>
David van der Spoel, Ph.D., Professor of Biology<br>
Dept. of Cell & Molec. Biol., Uppsala University.<br>
Box 596, 75124 Uppsala, Sweden. Phone: <a href="tel:%2B46184714205" value="+46184714205">+46184714205</a>.<br>
<a href="mailto:spoel@xray.bmc.uu.se">spoel@xray.bmc.uu.se</a> <a href="http://folding.bmc.uu.se" target="_blank">http://folding.bmc.uu.se</a><br>
</font></span><div class="HOEnZb"><div class="h5">--<br>
gmx-developers mailing list<br>
<a href="mailto:gmx-developers@gromacs.org">gmx-developers@gromacs.org</a><br>
<a href="http://lists.gromacs.org/mailman/listinfo/gmx-developers" target="_blank">http://lists.gromacs.org/mailman/listinfo/gmx-developers</a><br>
Please don't post (un)subscribe requests to the list. Use the<br>
www interface or send it to <a href="mailto:gmx-developers-request@gromacs.org">gmx-developers-request@gromacs.org</a>.<br>
<br>
<br>
<br>
<br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>ORNL/UT Center for Molecular Biophysics <a href="http://cmb.ornl.gov">cmb.ornl.gov</a><br>865-241-1537, ORNL PO BOX 2008 MS6309<br>