Re: [lammps-users] Improved performance for setup on large numbers of MPI ranks
From: Christopher Knight <cjknight2009@...24...>
Date: Thu, 26 Oct 2017 14:38:31 -0500

Hi Stan,

Yup, sure thing. I just did a quick test with the remotes/origin/comm-nprocs-opt branch, and the rhodo input replicated 30x30x30 ran OK for me on 4,096 ranks.

When I was writing this, atoms not being assigned correctly was usually because the overlap criterion wasn’t robust, so my first guess is that’s the issue in your case (assuming it’s not because I simply omitted support/testing for something, e.g. triclinic).
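
To illustrate what an overlap criterion of this kind looks like, here is a minimal Python sketch. It is not the code in the branch: the function names, the `pad` tolerance, and the brute-force loop over replicas are all assumptions made for illustration, and it handles orthogonal boxes only. Each replica of the original box is tested against a processor's subdomain, and a small pad keeps a replica whose face lands exactly on a subdomain boundary from being dropped by round-off.

```python
from itertools import product


def boxes_overlap(lo_a, hi_a, lo_b, hi_b, pad=0.0):
    """Return True if two axis-aligned boxes overlap in every dimension.

    The comparison is inclusive, and `pad` (hypothetical tolerance) widens
    the test so that boxes separated only by round-off still count.
    """
    return all(
        la <= hb + pad and lb <= ha + pad
        for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b)
    )


def overlapping_replicas(origin, cell, nrep, sub_lo, sub_hi, pad=0.0):
    """List the (i, j, k) replica indices whose box overlaps a subdomain.

    origin : lower corner of the unreplicated box
    cell   : edge lengths of the unreplicated box
    nrep   : number of replicas in x, y, z
    sub_lo, sub_hi : corners of one processor's subdomain
    """
    hits = []
    for idx in product(*(range(n) for n in nrep)):
        lo = [o + i * c for o, i, c in zip(origin, idx, cell)]
        hi = [l + c for l, c in zip(lo, cell)]
        if boxes_overlap(lo, hi, sub_lo, sub_hi, pad):
            hits.append(idx)
    return hits


# A unit box replicated 3x3x3; the subdomain straddles replicas 0 and 1
# in each dimension, so 2*2*2 replicas are selected.
hits = overlapping_replicas(
    origin=(0.0, 0.0, 0.0),
    cell=(1.0, 1.0, 1.0),
    nrep=(3, 3, 3),
    sub_lo=(0.5, 0.5, 0.5),
    sub_hi=(1.5, 1.5, 1.5),
)
print(len(hits))
```

A test that is too strict at the boundaries silently drops replicas (and the atoms in them), while one that is too loose only costs some extra work, which is why the sketch errs on the inclusive side.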

Can you share details of what you ran and/or the inputs (if they’re not readily accessible)? I didn’t see a description on GitHub of the inputs that crashed.

LAMMPS (23 Oct 2017)
  using 1 OpenMP thread(s) per MPI task
using multi-threaded neighbor list subroutines
Reading data file ...
  orthogonal box = (-27.5 -38.5 -36.3646) to (27.5 38.5 36.3615)
  16 by 16 by 16 MPI processor grid
  reading atoms ...
  32000 atoms
Replicating atoms ...
  orthogonal box = (-27.5 -38.5 -36.3646) to (1622.5 2271.5 2145.42)
  16 by 16 by 16 MPI processor grid
Replicate::bounding box image: lo= -1 -1 -1  hi= 1 1 1
Replicate:: buf_all memory allocating       8.02 MB
Replicate: average # of replicas added to proc= 111.458252 out of 27000 (0.412808 %)
  864000000 atoms
  748521000 bonds
  1092609000 angles
  1534383000 dihedrals
  27918000 impropers
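
As a sanity check on the numbers in the log above, the following short Python sketch reproduces the counts from the replication factors. All values are copied from the log; the variable names are only illustrative.

```python
# Values taken from the log output above.
ranks = 16 * 16 * 16            # MPI processor grid
replicas = 30 * 30 * 30         # replicate 30x30x30
atoms_in = 32000                # atoms in the unreplicated rhodo data file
avg_replicas_per_proc = 111.458252

atoms_out = atoms_in * replicas
pct = 100.0 * avg_replicas_per_proc / replicas

# Per-replica topology counts implied by the replicated totals.
bonds_per_replica = 748521000 // replicas
angles_per_replica = 1092609000 // replicas
dihedrals_per_replica = 1534383000 // replicas
impropers_per_replica = 27918000 // replicas

print(ranks, replicas, atoms_out)
print(f"{pct:.6f} %")
print(bonds_per_replica, angles_per_replica,
      dihedrals_per_replica, impropers_per_replica)
```

The percentage is the point of the optimization: on average each rank only has to consider about 111 of the 27,000 replicas rather than all of them.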

On Oct 26, 2017, at 12:54 PM, Moore, Stan <stamoor@...3...> wrote:

Axel added your code as a PR on LAMMPS GitHub. I tried out the replicate memory command but got a crash; can you take a look?