Re: [lammps-users] issue running lennard/mdf when using gpu acceleration for other pair_style commands

From: Axel Kohlmeyer <akohlmey@...24...>
Date: Mon, 20 Nov 2017 09:31:13 -0500

unless you turned them off, batch systems usually capture the standard and error output from the submitted scripts. there *must* be an error message in those somewhere.

but there are a few things in your log file that don't make much sense to me.
have you been able to run any of the benchmark examples correctly on the GPUs?
what exactly is the command line and the input for your simulation?
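
One way to search the captured output for such a message: LAMMPS prefixes fatal messages with `ERROR` and non-fatal ones with `WARNING`, so a case-sensitive search separates them from the batch system's own lowercase `srun: error:` lines. The Python below is a minimal sketch, not part of LAMMPS. The helper names `find_messages` and `scan_files` are hypothetical, and the `slurm-*.out` file pattern is an assumption based on the file name visible in the Slurm output quoted further down; a site may redirect output elsewhere.

```python
import re
from pathlib import Path

# LAMMPS writes fatal messages as "ERROR: ..." or "ERROR on proc N: ..."
# and non-fatal ones as "WARNING: ...". The search is case-sensitive on
# purpose, so lowercase "srun: error: ..." lines from Slurm are skipped.
MESSAGE = re.compile(r"\b(ERROR|WARNING)( on proc \d+)?:")


def find_messages(text):
    """Return (line_number, line) pairs for lines holding a LAMMPS message."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # search(), not match(): output from several ranks can be
        # interleaved, so the prefix is not always at the start of a line.
        if MESSAGE.search(line):
            hits.append((number, line.strip()))
    return hits


def scan_files(directory=".", pattern="slurm-*.out"):
    """Scan all files matching pattern (an assumed naming scheme)."""
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        results[path.name] = find_messages(path.read_text(errors="replace"))
    return results


if __name__ == "__main__":
    for name, hits in scan_files().items():
        for number, line in hits:
            print(f"{name}:{number}: {line}")
```

If the search finds nothing, the message may have gone to a separate error file or been discarded by the job script, which is the point at which the local admin staff can help.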

On Mon, Nov 20, 2017 at 2:58 AM, riccardo innocenti <riccardo-1990@...4463...> wrote:

Dear Axel,


But in this case it does not seem to print an error message. 


These are my last lines in log.lammps:


Neighbor list info ...
  update every 1 steps, delay 10 steps, check yes
  max neighbors/atom: 2000, page size: 100000
  master list distance cutoff = 12
  ghost atom cutoff = 12
  binsize = 6, bins = 25 25 25
  7 neighbor lists, perpetual/occasional/extra = 7 0 0
  (1) pair coul/long/gpu, perpetual, skip from (6)
      attributes: full, newton off
      pair build: skip
      stencil: none
      bin: none
  (2) pair lj/cut/gpu, perpetual, skip from (6)
      attributes: full, newton off
      pair build: skip
      stencil: none
      bin: none
  (3) pair lj/cut/coul/long/gpu, perpetual, skip from (6)
      attributes: full, newton off
      pair build: skip
      stencil: none
      bin: none
  (4) pair buck/gpu, perpetual, skip from (6)
      attributes: full, newton off
      pair build: skip
      stencil: none
      bin: none
  (5) pair lennard/mdf, perpetual, skip from (7)
      attributes: half, newton off
      pair build: skip
      stencil: none
      bin: none
  (6) neighbor class addition, perpetual
      attributes: full, newton off
      pair build: full/bin
      stencil: full/bin/3d
      bin: standard
  (7) neighbor class addition, perpetual, half/full from (6)
      attributes: half, newton off
      pair build: halffull/newtoff
      stencil: none
      bin: none
WARNING: Inconsistent image flags (../domain.cpp:785)
Memory usage per processor = 84.5436 Mbytes
Step Time PotEng Temp Press Volume Pxx Pyy Pzz Cella Cellb Cellc CellAlpha CellBeta CellGamma CPU
       0            0    -45205.24          300   -297.94047    3167414.5   -130.57545   -340.03169   -423.21427    146.85936    146.85936    146.85936           90           90           90            0


and then the program just calls MPI_Abort(). There does not seem to be any indication of where the error is.
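
The neighbor list section of the log already shows which pair styles were set up as GPU variants: those carry the `/gpu` suffix, while `lennard/mdf` does not. A minimal Python sketch for pulling that split out of a log excerpt follows. The function name `split_pair_styles` is hypothetical, and the excerpt is copied from the log lines above. This only reports what the log says about the styles; it does not explain why the run aborts.

```python
import re

# Matches request lines such as
# "  (1) pair coul/long/gpu, perpetual, skip from (6)"
PAIR_REQUEST = re.compile(r"^\s*\((\d+)\)\s+pair\s+([^,\s]+),")


def split_pair_styles(log_text):
    """Split pair styles from a 'Neighbor list info' block by /gpu suffix."""
    gpu, cpu = [], []
    for line in log_text.splitlines():
        match = PAIR_REQUEST.match(line)
        if match:
            style = match.group(2)
            (gpu if style.endswith("/gpu") else cpu).append(style)
    return gpu, cpu


# Excerpt taken from the log.lammps lines quoted above.
excerpt = """\
  (1) pair coul/long/gpu, perpetual, skip from (6)
  (2) pair lj/cut/gpu, perpetual, skip from (6)
  (3) pair lj/cut/coul/long/gpu, perpetual, skip from (6)
  (4) pair buck/gpu, perpetual, skip from (6)
  (5) pair lennard/mdf, perpetual, skip from (7)
  (6) neighbor class addition, perpetual
"""

gpu_styles, cpu_styles = split_pair_styles(excerpt)
print("GPU variants:", gpu_styles)
print("CPU only:    ", cpu_styles)
```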


Kind regards,

Riccardo


From: Axel Kohlmeyer <akohlmey@...24...>
Sent: 19 November 2017 19:07:21

To: riccardo innocenti
Cc: lammps-users@...396...sourceforge.net
Subject: Re: [lammps-users] issue running lennard/mdf when using gpu acceleration for other pair_style commands
 


On Sun, Nov 19, 2017 at 10:34 AM, riccardo innocenti <riccardo-1990@...4463...> wrote:

Dear Axel,


Thank you for the reply.


I was not interested in accelerating those styles (mdf) on the gpu, but the other ones present in my force field file (e.g. pppm, coul/long...).


What part of the output could help identify what the problem is?

wherever the error messages are captured. when LAMMPS calls MPI_Abort(), it does so only after printing an error message stating why it stopped.

axel.
 


Kind regards,

Riccardo


From: Axel Kohlmeyer <akohlmey@...24...>
Sent: 19 November 2017 16:19:26
To: riccardo innocenti
Cc: lammps-users@...655....net
Subject: Re: [lammps-users] issue running lennard/mdf when using gpu acceleration for other pair_style commands
 


On Sun, Nov 19, 2017 at 9:12 AM, riccardo innocenti <riccardo-1990@...4463...> wrote:

Dear All,


I am trying to run some simulations on GPU-accelerated nodes (NVIDIA Tesla K20X with 6 GB GDDR5 memory) using the mdf class of potentials. The LAMMPS version I am using is 10Mar17.


When I use an mdf pair_style (it does not matter whether it is the buck, lennard or lj variant), the simulation fails after outputting the energy at step 0, without any error message in log.lammps. The last lines of my output file look like:


the output below is from your queuing system, except for the first line, so it is not useful at all. consult with your local admin staff to learn how to find the output to the screen.

also, trying to run mdf pair styles on the GPU is a pointless exercise, since those styles are not GPU accelerated, as is clearly evident from the LAMMPS manual.

axel.

 


Rank 20 [Sun Nov 19 15:05:23 2017] [c0-1c1s1n0] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 20
srun: error: nid01988: task 20: Aborted
srun: Terminating job step 4576432.0
slurmstepd: error: *** STEP 4576432.0 ON nid01987 CANCELLED AT 2017-11-19T15:05:23 ***
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
Initializing Device 0 on core 11...srun: error: nid01992: tasks 60-68,70-71: Killed
srun: error: nid01990: tasks 36-47: Killed
srun: error: nid01994: tasks 84-95: Killed
srun: error: nid01993: tasks 72-83: Killed
srun: error: nid01988: tasks 12-19,21-23: Killed
srun: error: nid01989: tasks 24-35: Killed
srun: error: nid01992: task 69: Killed
srun: error: nid01987: tasks 0-11: Killed
srun: error: nid01991: tasks 48-59: Killed
"slurm-4576432.out" 170L, 6808C                  


When I run the simulation without GPU acceleration the simulations run without any issues.


I am not sure what the error could be. Does anyone have any suggestions?


Kind regards,

Riccardo


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
lammps-users mailing list
lammps-users@...655....net
https://lists.sourceforge.net/lists/listinfo/lammps-users




--
Dr. Axel Kohlmeyer  akohlmey@...24...  http://goo.gl/1wk0
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste. Italy.
