[gmx-developers] libxml2

Erik Lindahl erik.lindahl at scilifelab.se
Mon Nov 11 02:43:20 CET 2013


Hi,

I’m fine with having it as a hard dependency, provided we’ve had it compile automatically during installs for a while without complaints (it has been on by default for 4.6, right?).

However, I also sat down and looked a bit at the XML files in David’s patch, and this made me realize we need a broader approach.

Just introducing XML is not going to help us much, in particular not if we just add a generic “XML” file type. This would be like merely having “BIN” and “ASCII” types for all other files. Before we know it, we are going to have a dozen XML formats for different programs that have nothing to do with each other, and there will be lots of input/output routines in all programs processing them ever-so-slightly differently. In addition, adding XML tags instead of relying on tabs/spaces/newlines is of course a step forward, but only a very small one - to really fix things we need to make the data itself more structured.

Some things I would like to see before we start using XML:

1) We need proper namespaces and sub-namespaces, so we can tell different XML components from each other. This will also require us to think a bit about information in general, even if we don’t implement all components from the start. There are going to be lots of places where we specify information on a residue and atom basis - how should all these relate to each other? When are things forcefield-specific vs. general, and when should they go in the same vs. different files?

2) I think it makes a lot of sense to separate different XML files, so a future mdp replacement might have extension xmdp, while the XML topology has extension xtop. We should still be able to merge all of them in a single file (fine with namespaces), but this will avoid the problems when we specify an XML mdp file where a program was expecting an XML top file (in other words, no generic “XML” file format that can contain anything).

3) We need to think through naming carefully. In particular: no custom abbreviations unless really necessary. We should also use proper names for types and similar settings, rather than merely translating our old integer selectors to XML.

4) For any measurement, we should have units.

5) To enable XSLT transformations and better namespace handling, I think we should standardize on (and require) schema descriptions for validation, rather than the older DTDs.

6) We need some good common modules for reading/writing generic structured data, so the actual files are isolated from the programs using them.
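
To make points 1, 2 and 4 a bit more concrete, here is a minimal sketch in Python (standard library only, chosen for brevity; it is not GROMACS code). The namespace URIs, the element names and the "unit" attribute are all invented for illustration, and nothing here is a proposed format.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this sketch only.
NS_TOP = "http://example.org/gromacs/topology"
NS_FF = "http://example.org/gromacs/forcefield"

# A made-up topology fragment: general data lives in the topology
# namespace, forcefield-specific data in its own namespace, and every
# measurement carries an explicit unit.
SAMPLE = f"""<topology xmlns="{NS_TOP}" xmlns:ff="{NS_FF}">
  <residue name="SOL">
    <atom name="OW">
      <mass unit="u">15.9994</mass>
      <ff:charge unit="e">-0.834</ff:charge>
    </atom>
  </residue>
</topology>"""


def load_topology(text):
    """Parse text and reject documents whose root is not a topology."""
    root = ET.fromstring(text)
    expected = f"{{{NS_TOP}}}topology"
    if root.tag != expected:
        raise ValueError(f"expected root {expected}, got {root.tag}")
    return root


def read_quantity(element):
    """Return (value, unit); a measurement without a unit is an error."""
    unit = element.get("unit")
    if unit is None:
        raise ValueError(f"{element.tag} has no unit attribute")
    return float(element.text), unit


root = load_topology(SAMPLE)
ns = {"top": NS_TOP, "ff": NS_FF}
print(read_quantity(root.find("top:residue/top:atom/top:mass", ns)))
print(read_quantity(root.find("top:residue/top:atom/ff:charge", ns)))
```

The root-element check is what keeps an XML mdp file from being silently accepted where an XML topology was expected, while the namespaces still allow the pieces to be merged into one file later.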



Some of this will take time, but my worry about pushing ahead and starting to use XML anyway for individual programs is that it could soon create the same kind of divergent mess we have had with the current text files.

Cheers,

Erik




On 10 Nov 2013, at 15:34, Mark Abraham <mark.j.abraham at gmail.com> wrote:

> My experience of libxml2 has been favourable. I'm happy with a dependency on it, but someone needs to identify a version (preferably one that is known to be in package repos and/or have binaries available on the web). I would suggest we implement the dependency roughly as we do for FFTW:
> 
> * the install guide drops suitable hints to go get libxml2-dev(el) from your favourite repo (note that libxml2 might be installed by default, but we might need the #include headers that are only in the -dev or -devel packages!)
> * CMake detects if those exist in CMAKE_PREFIX_PATH, and gives a fatal error if not found.
> * the fatal error can be avoided by either letting the user supply a libxml2 tarball (e.g. so we can test in Jenkins also), or use cmake -DGMX_BUILD_OWN_LIBXML2 to do the same download-and-build thing.
> 
> Even if legal, I'm not so keen on bundling the libxml2 tarball at ~5MB, when gromacs is ~10MB. Bundling just the headers we need in order to use a system libxml2 might be a good option.
> 
> The proposed bump to require CMake version 2.8.8 in Redmine/Gerrit should make this a little smoother than it has been in the past.
> 
> I think there should be a wrapper layer between libxml2 and the GROMACS code that uses it, so that we have the option to change the implementation if we want to do so later.
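
A minimal sketch of what such a wrapper layer could look like, again in Python with the standard library only. The class and function names are hypothetical, and the ElementTree backend merely stands in for a libxml2-based one; the point is only that calling code depends on the interface and never on the parser.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class StructuredReader(ABC):
    """Hypothetical interface: the only thing calling code depends on."""

    @abstractmethod
    def read(self, text):
        """Return nested dicts with name, attributes, text and children."""


class ElementTreeReader(StructuredReader):
    """One possible backend; it could be swapped for another parser."""

    def read(self, text):
        return self._convert(ET.fromstring(text))

    def _convert(self, element):
        # Translate the parser's own node type into plain data so that
        # no parser-specific object leaks out of the wrapper.
        return {
            "name": element.tag,
            "attributes": dict(element.attrib),
            "text": (element.text or "").strip(),
            "children": [self._convert(child) for child in element],
        }


def count_atoms(reader, text):
    """Caller code: sees only the StructuredReader interface."""

    def walk(node):
        own = 1 if node["name"] == "atom" else 0
        return own + sum(walk(child) for child in node["children"])

    return walk(reader.read(text))


print(count_atoms(ElementTreeReader(), "<residue><atom/><atom/></residue>"))
```

Changing the implementation later would then mean writing another StructuredReader subclass, with count_atoms and similar callers left untouched.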
> 
> There was an interesting post from Marcus Hanwell from Kitware on this list earlier this year about how their projects handle this kind of thing (http://gromacs.5086.x6.nabble.com/parallel-make-problems-td5009226.html). That seems like what we should do, now that we have several of these kinds of dependencies currently "living" in src/external (FFTW, Boost subset, TNG, now libxml2, maybe later PDBx or some FMM code; maybe gmxblas and gmxlapack should go live there too). For 5.0, I can live with a hack that copies how we handle FFTW, though.
> 
> Mark
> 
> 
> On Sun, Nov 10, 2013 at 9:35 PM, David van der Spoel <spoel at xray.bmc.uu.se> wrote:
> On 2013-11-10 20:58, Erik Lindahl wrote:
> Hi,
> 
> One reason could be that we haven’t really started to use standardized XML input/output formats yet, although we’re heading there long term. I’m also not enough of an expert to say whether libxml2 is the best XML parser out there, since there are quite a few alternatives?
> 
> If there are any specific new modules that would need it, doesn’t it make more sense to have those modules go through the normal code review (including a discussion of whether the proposed XML formats are nicely designed, etc), rather than separately making XML a hard requirement even if the current code doesn’t rely on it?
> 
> This is why I asked. I think we should refrain from adding more text-only, poorly documented files that are prone to errors. Therefore I used XML files in this patch https://gerrit.gromacs.org/#/c/2659/, but Teemu pointed out that it does not compile without XML (since there are no ifdefs). So rather than implementing TWO versions of the code to read in the necessary data, we have to decide whether to make this obligatory now or not.
> 
> As regards compiling under windows (from the libxml2 website):
> 
> Libxml2 is known to be very portable, the library should build and work without serious troubles on a variety of systems (Linux, Unix, Windows, CygWin, MacOS, MacOS X, RISC Os, OS/2, VMS, QNX, MVS, VxWorks, ...)
> 
> It is distributed under the MIT license so I guess we could even include it in the source code as a backup. With the known portability and the license I don't see any reason not to. It comes pre-installed on Macs and Linux by the way.
> 
> 
> 
> 
> Cheers,
> 
> Erik
> 
> On 10 Nov 2013, at 11:51, David van der Spoel <spoel at xray.bmc.uu.se> wrote:
> 
> Hi,
> 
> we decided a long time ago that for 5.0 libxml2 would be required.
> If there is any reason why we should remain able to compile
> gromacs WITHOUT libxml2, please speak up now.
> 
> If no good arguments are brought forward then I will change the main
> CMakeLists.txt such that gromacs will not compile without it.
> 
> --
> David van der Spoel, Ph.D., Professor of Biology
> Dept. of Cell & Molec. Biol., Uppsala University.
> Box 596, 75124 Uppsala, Sweden. Phone:  +46184714205.
> spoel at xray.bmc.uu.se    http://folding.bmc.uu.se
> --
> gmx-developers mailing list
> gmx-developers at gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-developers
> Please don't post (un)subscribe requests to the list. Use the
> www interface or send it to gmx-developers-request at gromacs.org.
> 
> 
> 
