<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">PS:<div><br></div><div>If it wasn’t implicitly clear, I can try to help realize this, although I can’t promise to do it right away, and I can’t do it myself :-)</div><div><br></div><div><br></div><div><div>To try to be constructive, I’ve been considering the scenario where we want to describe the execution of a complete job, including some components that will require extra chemical data. Longer-term, I think it would be great if we could assemble a single XML document that really describes the entire system (even coordinates), force field, topology, simulation, the MDP settings, metadata settings for parallelization, and not least the chemical data.</div><div><br></div><div>Then we could have a structure with a top-level “gromacs” XML namespace that just contains metadata (user, generating program, etc.) and a bunch of lower namespaces that contain the actual data.</div></div><div>These could for instance be “forcefield”, “topology”, mdp parameters, and likely a separate block to be able to describe higher-level simulation metadata (e.g. parallelization or that we should run N simulations in REMD).</div><div><br></div><div>We don’t need to think of the contents of most of these until we implement them. If we want to start with the special case of structure factors, I guess the questions we should think of are:</div><div><br></div><div><br></div><div>1) Where do we see this type of data fitting in a bigger Gromacs namespace? What other similar data might we have in the future?</div><div><br></div><div>2) Are there any other structure factors that could occur in a simulation (say, X-ray)? Can we describe those in the same data structure, or should they be separate? 
If separate, we should reflect that in the naming, etc.</div><div><br></div><div>3) Can we design a simple data structure for _this_ type of data, so other programs that need it can ask Gromacs (which will also validate input XML files) rather than write their own XML parsing code?</div><div><br></div><div><br></div><div>If that sounds potentially interesting, I can try to contribute by starting to sketch the highest-level namespace?</div><div><br></div><div><br></div><div>Cheers,</div><div><br></div><div>Erik</div><div><br></div><div><br><div><div>On 11 Nov 2013, at 03:47, Erik Lindahl &lt;<a href="mailto:erik.lindahl@scilifelab.se">erik.lindahl@scilifelab.se</a>&gt; wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">




<div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">

Hi,

<div><br>

<div>

<div>On 10 Nov 2013, at 23:33, David van der Spoel &lt;<a href="mailto:spoel@xray.bmc.uu.se">spoel@xray.bmc.uu.se</a>&gt; wrote:</div>

<blockquote type="cite">

<div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">

<br>

I guess this will prevent us from using xml in practice. We have<span class="Apple-converted-space">&nbsp;</span><br>

discussed xml for ten years or so, but the transition to xml schema is a<span class="Apple-converted-space">&nbsp;</span><br>

real show stopper. I don't have the time to learn that as well. Does<span class="Apple-converted-space">&nbsp;</span><br>

that imply I should stop developing? In addition, for many small files<span class="Apple-converted-space">&nbsp;</span><br>

you don't need a dtd or schema (and in fact there isn't one for these<span class="Apple-converted-space">&nbsp;</span><br>

xml files), it's just that the libxml2 library demands you put it into<span class="Apple-converted-space">&nbsp;</span><br>

the file. If we're talking rtp files then that's another matter where<span class="Apple-converted-space">&nbsp;</span><br>

more structure is needed.<br>

</div>

</blockquote>

<div><br>

</div>

<div>I think the ability to validate the contents of a file is the core concept we want from XML. An XML file that doesn’t have any DTD or Schema is just a text file that looks fancier - you can add illegal data anywhere, and then you only rely on the internal

 logic of the program reading it to catch your error (or not) - that won’t really be much safer than our current text files.</div>
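<div>To make the point above concrete, here is a minimal Python sketch using only the standard library. The element and attribute names (structurefactor, atom, a1) are made up for illustration and are not an existing Gromacs format. It shows that a non-validating parser happily accepts a misspelled element with a non-numeric value, so the only check left is whatever the reading program implements itself:</div>

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names, for illustration only.
root = ET.Element("structurefactor")
ET.SubElement(root, "atom", {"element": "C", "a1": "2.31"})
# A misspelled element with a non-numeric value: illegal for our purposes,
# but still perfectly well-formed XML.
ET.SubElement(root, "atmo", {"element": "C", "a1": "not-a-number"})
text = ET.tostring(root, encoding="unicode")

# ElementTree only checks well-formedness, so parsing succeeds.
parsed = ET.fromstring(text)
tags = [child.tag for child in parsed]


def read_coefficients(tree):
    """Program-side checking: the only safety net without a schema."""
    values = []
    for child in tree:
        if child.tag != "atom":
            raise ValueError("unexpected element: " + child.tag)
        values.append(float(child.get("a1")))
    return values


try:
    read_coefficients(parsed)
    caught = None
except ValueError as err:
    caught = str(err)
```

<div>Every tool reading such a file would need its own version of read_coefficients, whereas a DTD or schema would let a validating parser reject the file before any tool-specific code runs.</div>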

<div><br>

</div>

<div>Learning to write a schema for a simple file takes less than an hour, and there are even free DTD-to-schema converters. Obviously, it will still be a lot of work to write an advanced schema e.g. for topologies, but I don’t think that’s on the table right

 now. &nbsp;However, just as class design is a pain for all of us (well, maybe not Teemu :-), the reason for doing it is that it will save time for all developers and lead to fewer bugs in the long run.</div>

<blockquote type="cite">

<div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">

<br>

Some other points, like having clear names and units I do agree with and<span class="Apple-converted-space">&nbsp;</span><br>

can change it in my present application.<br>

<br>

Common modules for writing and reading implies that all possible data<span class="Apple-converted-space">&nbsp;</span><br>

should be merged into one or a few monster formats. This in itself will<span class="Apple-converted-space">&nbsp;</span><br>

create extra problems.<br>

</div>

</blockquote>

<div><br>

</div>

<div>Well, it doesn’t necessarily have to be _one_ single format, but I think it is a far better solution to standardize on how we do it rather than have ~20 tools each inventing their own structure for how to store and read data. That is what we have right now

 with the text files...</div>

<br>

<blockquote type="cite">

<div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">

As for changing names of files, this shouldn't be necessary as one<span class="Apple-converted-space">&nbsp;</span><br>

should be able to see from the content what kind of file this is. No<span class="Apple-converted-space">&nbsp;</span><br>

strong feelings here but it would be very confusing to add many new<span class="Apple-converted-space">&nbsp;</span><br>

file names.<br>

</div>

</blockquote>

<div><br>

</div>

If we have a good namespace structure we can probably get by without it. However, at some point we have to consider how to separate the topology XML file from the mdp XML file in each directory.</div>
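<div>As a rough sketch of how a namespace structure could let a program tell such files apart from their content rather than their names, the Python snippet below classifies a document by the namespace of its root element. The namespace URIs and element names are purely hypothetical; nothing like this is defined by Gromacs today:</div>

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; nothing like this is defined by GROMACS.
NAMESPACES = {
    "urn:example:gromacs:topology": "topology",
    "urn:example:gromacs:mdp": "mdp",
}


def make_document(namespace_uri, root_name):
    """Serialize a tiny document whose root lives in the given namespace."""
    root = ET.Element("{" + namespace_uri + "}" + root_name)
    return ET.tostring(root, encoding="unicode")


def kind_of(xml_text):
    """Return the kind of data, judged from the root namespace alone."""
    tag = ET.fromstring(xml_text).tag
    # ElementTree spells a qualified tag as "{uri}localname".
    uri = tag[1:].split("}", 1)[0] if tag.startswith("{") else ""
    return NAMESPACES.get(uri, "unknown")


topology_text = make_document("urn:example:gromacs:topology", "topology")
mdp_text = make_document("urn:example:gromacs:mdp", "parameters")
# A document without any namespace cannot be classified this way.
plain_text = ET.tostring(ET.Element("parameters"), encoding="unicode")
```

<div>Here kind_of(topology_text) gives "topology" and kind_of(mdp_text) gives "mdp", while the namespace-free document comes back as "unknown". A person browsing a directory still cannot see this without opening the files, which is the remaining argument for distinct file names.</div>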

<div><br>

<blockquote type="cite">

<div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">

@Mark: an extra layer wouldn't help would it - there is no competing<span class="Apple-converted-space">&nbsp;</span><br>

package as far as I know. There is, however, libxml++, a C++ wrapper<span class="Apple-converted-space">&nbsp;</span><br>

around libxml2, which is slightly more logical to use in C++ code, but<span class="Apple-converted-space">&nbsp;</span><br>

it would imply an extra library. On the other hand that might function<span class="Apple-converted-space">&nbsp;</span><br>

as a thin wrapper around the library.<br>

</div>

</blockquote>

<div><br>

</div>

<div>I know of at least Expat and MSXML, and quickly also found Mini-XML, Xerces, AsmXml and RapidXml, where the last two claim an order of magnitude faster parsing speeds than libxml2.&nbsp;</div>

<div>I see no particular reason for using any of those libraries today, but this sounds like exactly the same situation where we originally saw no reason for any other FFT libraries than FFTW :-)</div>

<div><br>

</div>

<div>Cheers,</div>

<div><br>

</div>

<div>Erik</div>

<div><br>

</div>

</div>

</div>

</div>


-- <br>gmx-developers mailing list<br><a href="mailto:gmx-developers@gromacs.org">gmx-developers@gromacs.org</a><br>http://lists.gromacs.org/mailman/listinfo/gmx-developers<br>Please don't post (un)subscribe requests to the list. Use the <br>www interface or send it to gmx-developers-request@gromacs.org.</blockquote></div><br></div></body></html>