<div dir="ltr"><div dir="ltr">Hi,<div><br></div><div>Indeed, a straight port without much performance optimization may not be a lot of effort. However, integrating an additional kernel flavor into the existing codebase adds complexity, and it will probably require some refactoring and preliminary work to accommodate the new set of kernels without duplicating code or introducing complexity or performance overhead in the current kernels.</div><div><br></div><div>Also note that most NVIDIA consumer cards -- which are very widely used by our users -- have 32x lower DP throughput than SP, which is a far larger slowdown than most people would find acceptable, I'd say.</div><div><br></div><div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">--<br>Szilárd</div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Jul 22, 2019 at 10:28 PM Berk Hess <<a href="mailto:hess@kth.se">hess@kth.se</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<div class="gmail-m_-3456568359214469351moz-cite-prefix">Hi,<br>
<br>
IIRC all Nvidia Tesla cards have always had double precision, at
half the throughput of single precision. But there are very few
cases where double precision is needed. Energy drift in single
precision is never an issue, unless you really cannot use a
thermostat.<br>
<br>
Having said that, making the GPU code, either CUDA or OpenCL,
work in double precision is probably not much effort. Making
it work efficiently, however, requires optimizing several algorithmic
parameters and maybe changing the arrangement of some data in the
different GPU memory levels.<br>
<br>
Cheers,<br>
<br>
Berk<br>
<br>
On 7/22/19 10:10 PM, James wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi,
<div><br>
</div>
<div>My apologies if this question has been previously
discussed. I just joined the list, and all I know from
reading the docs and release comments is that writing code for
double precision on GPUs is not a priority.</div>
<div><br>
</div>
<div>However, I believe all recent upper-end Nvidia cards have
native double precision (which was not true several
generations ago). So, you don't have to have a real
"scientific computing" GPU to take advantage of this -- most
people probably already have the hardware. Still, I understand
that most people do not need or want to run double precision.
But some do (and you have to if you are concerned with
conservation of energy -- the energy drift in single precision
is substantial).</div>
<div><br>
</div>
<div>So, I would like to ask how much effort this is believed
to require. Would it require a lot of new code, or
would it mostly be a matter of porting the single precision
code to double precision?</div>
<div><br clear="all">
<div>
<div dir="ltr" class="gmail-m_-3456568359214469351gmail_signature">
<div dir="ltr">
<div dir="ltr">Sincerely,<br>
James<br>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
</blockquote>
<br>
</div>
-- <br>
Gromacs Developers mailing list<br>
<br>
* Please search the archive at <a href="http://www.gromacs.org/Support/Mailing_Lists/GMX-developers_List" rel="noreferrer" target="_blank">http://www.gromacs.org/Support/Mailing_Lists/GMX-developers_List</a> before posting!<br>
<br>
* Can't post? Read <a href="http://www.gromacs.org/Support/Mailing_Lists" rel="noreferrer" target="_blank">http://www.gromacs.org/Support/Mailing_Lists</a><br>
<br>
* For (un)subscribe requests visit<br>
<a href="https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-developers" rel="noreferrer" target="_blank">https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-developers</a> or send a mail to <a href="mailto:gmx-developers-request@gromacs.org" target="_blank">gmx-developers-request@gromacs.org</a>.</blockquote></div></div>