<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi,<br>
      <br>
      I assume this is with GPUs.<br>
      If you run in a debugger and break on exit, can you tell me which
      sort_atoms call this comes from?<br>
      <br>
      On how many MPI ranks is this?<br>
      If I can easily run this, could you mail me the tpr and the run
      settings?<br>
      <br>
      Cheers,<br>
      <br>
      Berk<br>
      <br>
      On 11/08/2013 02:30 PM, Carsten Kutzner wrote:<br>
    </div>
    <blockquote cite="mid:527CE75D.1050509@gwdg.de" type="cite">
      <meta http-equiv="Content-Type" content="text/html;
        charset=ISO-8859-1">
      <div class="moz-cite-prefix">Hi,<br>
        <br>
        Using a freshly checked-out 4.6 branch compiled with debug checks, I
        get<br>
        <br>
        -------------------------------------------------------<br>
        Program mdrun, VERSION 4.6.4-dev-20131107-ba8232e<br>
        Source code file:
        /home/ckutzne/junoworkspace/git-gromacs-vanilla/src/mdlib/nbnxn_search.c,
        line: 609<br>
        <br>
        Fatal error:<br>
        (int)((x[74522][x]=11.764535 - 10.229600)*58.394176) = 89, not
        in 0 - 16*4<br>
        <br>
        For more information and tips for troubleshooting, please check
        the GROMACS<br>
        website at <a moz-do-not-send="true"
          class="moz-txt-link-freetext"
          href="http://www.gromacs.org/Documentation/Errors">http://www.gromacs.org/Documentation/Errors</a><br>
        -------------------------------------------------------<br>
        <br>
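        Reading the message as (coordinate - lower bound) * inverse bin
        width, truncated to an integer, the failing check can be
        reproduced with the minimal Python sketch below. The function
        and parameter names are hypothetical and not taken from
        nbnxn_search.c, and the inclusive upper bound of 16*4 = 64 is an
        assumption based on the wording of the message.<br>
        <br>
<pre>
```python
def sort_bin(coord, lower_bound, inv_bin_width):
    """Truncate the scaled offset to an integer bin, like a C (int) cast."""
    return int((coord - lower_bound) * inv_bin_width)


def bin_in_range(index, bins, oversize):
    """True when index lies in 0 .. bins*oversize (inclusive bound assumed)."""
    return index in range(0, bins * oversize + 1)


# Values copied from the fatal error message.
index = sort_bin(11.764535, 10.229600, 58.394176)
print(index, bin_in_range(index, 16, 4))
```
</pre>
        With these numbers the scaled offset is about 89.63, so the bin
        index is 89 and lies outside the 0 to 64 range, which is what the
        debug check reports.<br>
        <br>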
        Carsten<br>
        <br>
        <br>
        On 11/08/2013 02:00 PM, Berk Hess wrote:<br>
      </div>
      <blockquote cite="mid:527CE062.3020200@kth.se" type="cite">
        <div class="moz-cite-prefix">On 11/08/2013 01:44 PM, Mark
          Abraham wrote:<br>
        </div>
        <blockquote
cite="mid:CAMNuMAS9x3=btqto7URjCjB6tMdTwtHV6Ye7eCZ21wJmWpb1Ag@mail.gmail.com"
          type="cite">
          <div dir="ltr"><br>
            <div class="gmail_extra"><br>
              <br>
              <div class="gmail_quote">On Fri, Nov 8, 2013 at 12:58 PM,
                Carsten Kutzner <span dir="ltr">&lt;<a
                    moz-do-not-send="true" href="mailto:ckutzne@gwdg.de"
                    target="_blank">ckutzne@gwdg.de</a>&gt;</span>
                wrote:<br>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi
                  Mark, hi Berk,<br>
                  <div class="im"><br>
                    On Nov 7, 2013, at 6:48 PM, Berk Hess &lt;<a
                      moz-do-not-send="true" href="mailto:hess@kth.se">hess@kth.se</a>&gt;
                    wrote:<br>
                    <br>
                    &gt; Hi Carsten,<br>
                    &gt;<br>
                    &gt; After how many steps does this happen?<br>
                  </div>
                  This happens immediately at startup.<br>
                  <div class="im"><br>
                    &gt; Could you run with a debug build (or without
                    NDEBUG defined)?<br>
                    &gt; I added a lot of checks, not done with NDEBUG,
                    in the fix for the issue you linked.<br>
                  </div>
                  Will do that now.<br>
                  <div class="im"><br>
                    &gt; On 11/07/2013 06:27 PM, Mark Abraham wrote:<br>
                    &gt;&gt; Unclear. 6583c94 is one of your commits.
                    Some very recent stuff has been playing with nstlist
                    and rlist (safely, or so we thought.) Can you
                    reproduce with mainstream release-4-6?<br>
                  </div>
                  This is basically mainstream 4-6, since my commit only
                  changes the default for appending to no.
                </blockquote>
                <div><br>
                </div>
                <div>Right. What's the mainstream parent commit? I was
                  going to release 4.6.4 today - if you're based off the
                  current tip then maybe we shouldn't. If you're based
                  off code a month back then we know the problem, if
                  any, is of longer standing.</div>
              </div>
            </div>
          </div>
        </blockquote>
        This is 4.6.4-dev, which seems to include my fix for the previous
        issue, so this issue is surely present in the current
        release-4-6 branch. It must be due to a somewhat exotic
        condition, since this code is widely used and we haven't had
        other reports.<br>
        <br>
        I think it should be easy to track this down with all the debug
        checks in the code.<br>
        And if Carsten can send me his system and the conditions to
        reproduce it, I can also help with debugging.<br>
        <br>
        Cheers,<br>
        <br>
        Berk<br>
        <blockquote
cite="mid:CAMNuMAS9x3=btqto7URjCjB6tMdTwtHV6Ye7eCZ21wJmWpb1Ag@mail.gmail.com"
          type="cite">
          <div dir="ltr">
            <div class="gmail_extra">
              <div class="gmail_quote">
                <div><br>
                </div>
                <div>Mark</div>
                <div><br>
                </div>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex"> <span
                    class="HOEnZb"><font color="#888888"><br>
                      Carsten<br>
                    </font></span>
                  <div class="HOEnZb">
                    <div class="h5"><br>
                      &gt;&gt;<br>
                      &gt;&gt; Mark<br>
                      &gt;&gt;<br>
                      &gt;&gt;<br>
                      &gt;&gt; On Thu, Nov 7, 2013 at 5:18 PM, Carsten
                      Kutzner &lt;<a moz-do-not-send="true"
                        href="mailto:ckutzne@gwdg.de">ckutzne@gwdg.de</a>&gt;
                      wrote:<br>
                      &gt;&gt; Hi,<br>
                      &gt;&gt;<br>
                      &gt;&gt; we have a 120k atom system that crashes
                      with<br>
                      &gt;&gt;<br>
                      &gt;&gt;
                      ------------------------------------------------------<br>
                      &gt;&gt; Program mdrun_mpi, VERSION
                      4.6.4-dev-20131015-6583c94<br>
                      &gt;&gt; Source code file:
                      /home/c/gromacs/src/mdlib/nbnxn_search.c, line:
                      685<br>
                      &gt;&gt;<br>
                      &gt;&gt; Software inconsistency error:<br>
                      &gt;&gt; Lost particles while sorting<br>
                      &gt;&gt; For more information and tips for
                      troubleshooting, please check the GROMACS<br>
                      &gt;&gt; website at <a moz-do-not-send="true"
                        href="http://www.gromacs.org/Documentation/Errors"
                        target="_blank">http://www.gromacs.org/Documentation/Errors</a><br>
                      &gt;&gt;
                      -------------------------------------------------------<br>
                      &gt;&gt;<br>
                      &gt;&gt; if run with &gt;= 2 MPI processes on a
                      GPU and small values for nstlist. On my
                      workstation,<br>
                      &gt;&gt; nstlist = 34 and larger work, whereas
                      nstlist &lt;= 33 leads to the above problem.<br>
                      &gt;&gt;<br>
                      &gt;&gt; Another system (60k atoms) does not
                      produce this problem, so system size seems<br>
                      &gt;&gt; to matter as well.<br>
                      &gt;&gt;<br>
                      &gt;&gt; Looks like an old ghost:<br>
                      &gt;&gt;<br>
                      &gt;&gt; <a moz-do-not-send="true"
                        href="http://redmine.gromacs.org/issues/1153"
                        target="_blank">http://redmine.gromacs.org/issues/1153</a><br>
                      &gt;&gt;<br>
                      &gt;&gt;<br>
                      &gt;&gt; Should I file a redmine issue?<br>
                      &gt;&gt;<br>
                      &gt;&gt; Carsten<br>
                      &gt;&gt;<br>
                      &gt;&gt;<br>
                      &gt;&gt; --<br>
                      &gt;&gt; gmx-developers mailing list<br>
                      &gt;&gt; <a moz-do-not-send="true"
                        href="mailto:gmx-developers@gromacs.org">gmx-developers@gromacs.org</a><br>
                      &gt;&gt; <a moz-do-not-send="true"
                        href="http://lists.gromacs.org/mailman/listinfo/gmx-developers"
                        target="_blank">http://lists.gromacs.org/mailman/listinfo/gmx-developers</a><br>
                      &gt;&gt; Please don't post (un)subscribe requests
                      to the list. Use the www interface or send it to <a
                        moz-do-not-send="true"
                        href="mailto:gmx-developers-request@gromacs.org">gmx-developers-request@gromacs.org</a>.<br>
                      &gt;&gt;<br>
                      &gt;&gt;<br>
                      &gt;&gt;<br>
                      &gt;<br>
                    </div>
                  </div>
                </blockquote>
              </div>
              <br>
            </div>
          </div>
          <br>
          <br>
        </blockquote>
        <br>
        <br>
        <br>
      </blockquote>
      <br>
      <br>
      <br>
    </blockquote>
    <br>
  </body>
</html>