Re: [lammps-users] Secondary run failure when calculating clustering



From: David Nicholson <danich@...212...>
Date: Mon, 13 Nov 2017 16:06:42 -0500

Hello Craig and Axel,

I think that I have diagnosed the problem here: https://github.com/lammps/lammps/pull/728. If I'm correct, you should be able to avoid this problem for now by running with "pre no" on the second run, or by ensuring that the neighbor list isn't built on the last step of the first run (i.e., by making the first run's length a non-multiple of 10).
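For concreteness, the two workarounds would look something like this (a sketch, not tested against your exact script; "pre no" tells the run command to skip the setup phase, including the neighbor-list build, before the second run starts):

# option 1: skip setup on the second run
run 10
run 10 pre no

# option 2: make the first run's length a non-multiple of 10
# (10 is the default neighbor-list delay), so no build lands on its last step
run 11
run 10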

-David

On Wed, Nov 8, 2017 at 12:13 PM, Axel Kohlmeyer <akohlmey@...24...> wrote:
Just a quick note: this is probably related to the issue described here:


Both computes use the same communication pattern and logic to iteratively exchange the information across MPI ranks.
I can reproduce it and have some traces, but have not yet understood what exactly is causing the deadlock.
This is quite complex, and I am currently extremely busy with other tasks.

axel.

On Wed, Nov 8, 2017 at 6:20 AM, Devonport, Craig <C.Devonport@...1982...> wrote:
Hi,

I’m trying to calculate solid cluster sizes and I’ve come up against a problem.

I’m using a compute reduce to get the largest cluster ID (I know this is just the ID of the first atom identified in the cluster, but it forces the clustering to update). With this in my thermo_style, a single run command works fine, but a second run only works if the first was for fewer than 10 steps. If the first run was for 10 or more steps, LAMMPS hangs after printing the thermo_style headings.

This problem appears to go away at small system sizes, but I haven’t narrowed down exactly where the limit is.

Have I missed something about clustering in LAMMPS that is causing this?


I’ve simplified my script to hopefully focus on the issue
(in my actual script I’m using Q6 and coordination number to determine solid particles, etc.)

log test.log
units lj
atom_style atomic
atom_modify map hash
lattice fcc 1.0 spacing 1 1 1

region box block 0 9 0 2 0 2 units lattice
# doesn't work at 10 x 2 x 2 - 160 atoms
# works at 9 x 2 x 2 - 144 atoms
create_box 1 box
create_atoms 1 box

mass 1 1.0
pair_style lj/cut 3.5
pair_modify tail yes
pair_coeff 1 1 1.0 1.0

# do clustering on all atoms
compute cluster all cluster/atom 1.5

# get largest cluster number
compute max all reduce max c_cluster

thermo_style custom step c_max
thermo 1
fix 1 all nph iso 0.02 0.02 1
fix 2 all langevin 1.5 1.5 0.1 1

# run 1
run 10

# run 2
run 10
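For example, consistent with what I described above, replacing the two run commands with

# first run shorter than 10 steps: second run proceeds normally
run 9
run 10

avoids the hang, while the 10-step first run shown above does not.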

I’m running the 23 Oct 2017 version of LAMMPS.
This happens on macOS 10.12.6 with LAMMPS compiled with gcc 7.2.0 (from MacPorts),
and on SUSE Enterprise 11.4 with LAMMPS compiled with gcc 6.3 or icc 17.0.2.

Here’s the output I get when it’s not working:

LAMMPS (23 Oct 2017)
Lattice spacing in x,y,z = 1.5874 1.5874 1.5874
Created orthogonal box = (0 0 0) to (15.874 3.1748 3.1748)
  1 by 1 by 1 MPI processor grid
Created 160 atoms
  Time spent = 0.000475883 secs
Neighbor list info ...
  update every 1 steps, delay 10 steps, check yes
  max neighbors/atom: 2000, page size: 100000
  master list distance cutoff = 3.8
  ghost atom cutoff = 3.8
  binsize = 1.9, bins = 9 2 2
  2 neighbor lists, perpetual/occasional/extra = 1 1 0
  (1) pair lj/cut, perpetual
      attributes: half, newton on
      pair build: half/bin/atomonly/newton
      stencil: half/bin/3d/newton
      bin: standard
  (2) compute cluster/atom, occasional
      attributes: full, newton on
      pair build: full/bin/atomonly
      stencil: full/bin/3d
      bin: standard
Setting up Verlet run ...
  Unit style    : lj
  Current step  : 0
  Time step     : 0.005
Per MPI rank memory allocation (min/avg/max) = 4.325 | 4.325 | 4.325 Mbytes
Step c_max 
       0            1 
       1            1 
       2            1 
       3            1 
       4            1 
       5            1 
       6            1 
       7            1 
       8            1 
       9            1 
      10            1 
Loop time of 0.00756097 on 1 procs for 10 steps with 160 atoms

Performance: 571355.384 tau/day, 1322.582 timesteps/s
99.6% CPU use with 1 MPI tasks x no OpenMP threads

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 0.0022697  | 0.0022697  | 0.0022697  |   0.0 | 30.02
Neigh   | 0.00042892 | 0.00042892 | 0.00042892 |   0.0 |  5.67
Comm    | 0.00012016 | 0.00012016 | 0.00012016 |   0.0 |  1.59
Output  | 0.004585   | 0.004585   | 0.004585   |   0.0 | 60.64
Modify  | 0.00013828 | 0.00013828 | 0.00013828 |   0.0 |  1.83
Other   |            | 1.884e-05  |            |       |  0.25

Nlocal:    160 ave 160 max 160 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost:    2291 ave 2291 max 2291 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs:    17980 ave 17980 max 17980 min
Histogram: 1 0 0 0 0 0 0 0 0 0
FullNghs:  35960 ave 35960 max 35960 min
Histogram: 1 0 0 0 0 0 0 0 0 0

Total # of neighbors = 35960
Ave neighs/atom = 224.75
Neighbor list builds = 1
Dangerous builds = 1
Setting up Verlet run ...
  Unit style    : lj
  Current step  : 10
  Time step     : 0.005
Per MPI rank memory allocation (min/avg/max) = 4.452 | 4.452 | 4.452 Mbytes
Step c_max 


Hope I’ve been clear and provided enough information to replicate this.

Thanks,

Craig






------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
lammps-users mailing list
lammps-users@...655....net
https://lists.sourceforge.net/lists/listinfo/lammps-users




--
Dr. Axel Kohlmeyer  akohlmey@...24...  http://goo.gl/1wk0
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste, Italy.





--
David Nicholson
Graduate Student
Massachusetts Institute of Technology
914 400 3192