<p dir="ltr">Hi,</p>

<p dir="ltr">I think the only issue is indeed large negative arguments to exp and large positive arguments to erfc. Such arguments occur in several places in the code. At some point we also had issues with this in the SIMD kernels, since one of the SIMD math functions doesn't handle them correctly. I needed to add extra masking or move the existing masking to deal with this.<br>

For our SIMD kernels we can (hope to) check for every such case, but we can't do that for all code.</p>
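<p dir="ltr">As a rough scalar illustration of what such a guard could look like (a sketch only: the helper names and the cut-off values are assumptions made for this example, not existing GROMACS code), the argument is checked before the math function is called, so the function never sees a value whose result would underflow:</p>

```cpp
#include <cmath>

// Hypothetical helpers for illustration only; these are not GROMACS functions.
// The cut-offs are assumptions, chosen so that the true single-precision
// result is still a normal (non-denormal) number at the cut-off itself.
constexpr float c_expMinArg  = -87.0F; // exp(-87) is about 1.6e-38
constexpr float c_erfcMaxArg = 9.0F;   // erfc(9) is about 4.1e-37

// Returns exp(x), or exactly zero when x is below the cut-off.
// std::exp is never called with the out-of-range argument.
inline float guardedExp(float x)
{
    return (x < c_expMinArg) ? 0.0F : std::exp(x);
}

// Returns erfc(x), or exactly zero when x is above the cut-off.
inline float guardedErfc(float x)
{
    return (x > c_erfcMaxArg) ? 0.0F : std::erfc(x);
}
```

<p dir="ltr">With these, guardedExp(-1000.0f) and guardedErfc(100.0f) both return zero without the library function being evaluated. A SIMD version would instead clamp or mask the argument and blend in zero, since it cannot branch per element.</p>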

<p dir="ltr">Cheers,</p>

<p dir="ltr">Berk</p>

<div class="gmail_extra"><br><div class="gmail_quote">On Sep 15, 2016 9:43 PM, Erik Lindahl &lt;erik.lindahl@gmail.com&gt; wrote:<br type="attribution"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div style="word-wrap:break-word"><div style="font-family:&#39;helvetica&#39; , &#39;arial&#39;;font-size:13px;color:rgba( 0 , 0 , 0 , 1 );margin:0px">Hi,</div><p>On 15 September 2016 at 19:12:50, Schulz, Roland (<a href="mailto:roland.schulz&#64;intel.com">roland.schulz&#64;intel.com</a>) wrote:</p> <div><blockquote style="font-family:&#39;helvetica&#39; , &#39;arial&#39;;font-size:13px;font-style:normal;font-weight:normal;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><div><div></div><div>Hi, <br /><br />What precision do we expect from math functions in GROMACS? We use by default relatively aggressive math options for many compilers but we haven&#39;t documented what precision we demand from the compiler. This makes it difficult to decide for new compilers / compiler versions what the correct set of flags is. </div></div></blockquote></div><div><div><blockquote style="font-family:&#39;helvetica&#39; , &#39;arial&#39;;font-size:13px;font-style:normal;font-weight:normal;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><div><div><br />How many ulps max relative error do we accept for standard math functions? Should we use the same precision for non-simd math function as for simd functions (GMX_SIMD_ACCURACY_BITS_SINGLE)? Our default (22) corresponds to accepting 1ulp relative error, correct? Currently our default flags for ICC don&#39;t specify accuracy and thus we use the default accuracy for O3 which is 4ulps. Also AFAIK gcc -ffast-math allows 2ulps errors. Where this matters is in FunctionTest.ErfInvDouble which fails with ICC17 and -no-prec-div (corresponds to -ffast-math) because the result is only correct to 6ulp but the test requires 4ulp which I believe one cannot guarantee if the intermediate results are only correct to 2ulp (because it includes a difference of intermediate results). 
</div></div></blockquote></div><p>To tell the truth I haven’t done any super-deep analysis. In my early trials (in particular for the SIMD layer) I optimized entirely for performance and accepted larger errors, but when I tested I realized we could get to full performance with very little extra cost, and that has the huge advantage we don’t need to worry about accuracy when we use our own functions in new places (which in turn means we can use them almost everywhere).</p><p>My reason for settling on 4 ULPs is that the SIMD math functions I tested (at least with gcc) typically achieved 1-2 ULP accuracy, and then I doubled this to have some margin.</p><p>Personally I would be hesitant to increase this without a detailed analysis of what goes wrong, and why it does not happen with other compilers.</p><p>If we accept 6 ULP, why not 8? If we accept 8, why not 10 for another compiler, and so on? </p><div><div><blockquote style="font-family:&#39;helvetica&#39; , &#39;arial&#39;;font-size:13px;font-style:normal;font-weight:normal;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><div><div><br />How accurate must results be for non-standard input values such as extremes, NaNs, infinities, and denormals? For GCC we use -ffast-math which AFAIU means it won&#39;t produce correct results for NaNs and infinities. I&#39;m not sure about extremes and denormals. </div></div></blockquote></div><p>I don’t expect us to ever need to handle NaN or Inf as input values. Denormals can be clipped to zero. 
</p><div><div><blockquote style="font-family:&#39;helvetica&#39; , &#39;arial&#39;;font-size:13px;font-style:normal;font-weight:normal;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><div><div>An example where this matters: When compiling with ICC and allowing that all 4 non-standard values don&#39;t have to produce correct results (using -fast or -fp-model fast&#61;2 or -fimf-domain-exclusion&#61;common), the complex/nbnxn-ljpme-LB test fails because cr2 (nbnxn_kernel_ref_inner.h line 241) gets very large and expf(cr2) should produce zero but produces NaN. Our SIMD exp also doesn&#39;t support very large values, but in our SIMD kernel we mask out particles beyond the cut-off so that the argument cannot get that large. </div></div></blockquote></div><p>Spontaneously it sounds very dangerous to have the compiler assume that every single argument for every single invocation of the exponential function in a 3-million-line program has been checked so the arguments never fall in the extreme range. Basically: There’s no way we’ve checked that for old code, but new code might be better.</p><p><br /></p><p>Cheers,</p><p><br /></p><p>Erik</p></div></div></div></div>


</blockquote></div><br></div>