Re: [lammps-users] error: segmentation fault_reax/c_KOKKOS


From: Axel Kohlmeyer <akohlmey@...24...>
Date: Tue, 5 Sep 2017 16:15:30 -0400



On Mon, Sep 4, 2017 at 11:49 AM, Mohammad Izadi <izadi0511@...92......> wrote:

Dear lammps users,

I installed the kokkos_mpi_only build of LAMMPS (lammps-31Mar17) on a server, and I run reax/c with the KOKKOS package from the command line:


please first update to the latest LAMMPS version and test if your issue persists.
if yes, please provide a full input deck, so people elsewhere can reproduce this issue and debug it.

axel.

 

  nohup mpirun -np 2 ./lmp_kokkos_mpi_only -k on -sf kk < in.input &

My system has 2073 atoms and is in the gas phase. With a smaller system (e.g. 200 atoms), it runs without any error. My input file is below:

=================
echo            both
units           real
newton          on
atom_style      charge
dimension       3
boundary        p p p
#read_restart   restart22
restart         500 restart11 restart22
read_data       Silica2073.data
pair_style      reax/c NULL
pair_coeff      * * ffield.reax.input C H O N S Si Na Ar
neighbor        2 bin
neigh_modify    every 5 delay 0 check no
velocity        all create 2100 235485 mom yes rot yes
fix             1 all nvt temp 2100.0 2100.0 100.0
fix             2 all qeq/reax 1 0.0 10.0 1e-6 reax/c
fix             4 all reax/c/species 10 10 250 species.txt
fix             6 all efield 0.0001 0.0 0.0
fix_modify      6 energy yes
fix             7 all reax/c/bonds 250 bonds.reaxc
compute         reax all pair reax/c
variable        eb equal c_reax[1]
variable        ea equal c_reax[2]
variable        elp equal c_reax[3]
variable        emol equal c_reax[4]
variable        ev equal c_reax[5]
variable        epen equal c_reax[6]
variable        ecoa equal c_reax[7]
variable        ehb equal c_reax[8]
variable        et equal c_reax[9]
variable        eco equal c_reax[10]
variable        ew equal c_reax[11]
variable        ep equal c_reax[12]
variable        efi equal c_reax[13]
variable        eqeq equal c_reax[14]
thermo_style    custom step temp atoms etotal ke pe v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq density vol press
thermo          250
timestep        0.1
dump            1 all xyz 250 dumpnvt.xyz
run             4000000
========================

Also, a single-core run does not crash, but a multi-core run with the large system (2073 atoms) stops immediately with the error below:

=========================================
WARNING: Fixes cannot send data in Kokkos communication, switching to classic communication (../comm_kokkos.cpp:382)
[cschpc:169783] *** Process received signal ***
[cschpc:169783] Signal: Segmentation fault (11)
[cschpc:169783] Signal code: Address not mapped (1)
[cschpc:169783] Failing at address: (nil)
[cschpc:169783] [ 0] /lib64/libpthread.so.0() [0x3f6940f710]
[cschpc:169783] [ 1] ./lmp_kokkos_mpi_only(_ZN6Kokkos12parallel_forINS_11RangePolicyIJNS_6SerialEN9LAMMPS_NS27PairReaxFindBondSpeciesZeroEEEENS3_15PairReaxCKokkosIS2_EEEEvRKT_RKT0_RKSsPNS_4Impl9enable_ifIXntsrNSG_11is_integralIS8_EE5valueEvE4typeE+0x268) [0x17a1bc8]
[cschpc:169783] [ 2] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS15PairReaxCKokkosIN6Kokkos6SerialEE15FindBondSpeciesEv+0xb0) [0x17aa0d0]
[cschpc:169783] [ 3] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS15PairReaxCKokkosIN6Kokkos6SerialEE7computeEii+0x34a4) [0x17e26e4]
[cschpc:169783] [ 4] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS12VerletKokkos5setupEv+0x6aa) [0x1a6b43a]
[cschpc:169783] [ 5] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS3Run7commandEiPPc+0x65e) [0x1a2271e]
[cschpc:169783] [ 6] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS5Input15command_creatorINS_3RunEEEvPNS_6LAMMPSEiPPc+0x26) [0xcfcc66]
[cschpc:169783] [ 7] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS5Input15execute_commandEv+0x7e7) [0xcfb0f7]
[cschpc:169783] [ 8] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS5Input4fileEv+0x317) [0xcfbc57]
[cschpc:169783] [ 9] ./lmp_kokkos_mpi_only(main+0x46) [0xd136c6]
[cschpc:169783] [10] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3f6881ed5d]
[cschpc:169783] [11] ./lmp_kokkos_mpi_only() [0x6adfd1]
[cschpc:169783] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 169783 on node cschpc.ut.ac.ir exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
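
The backtrace above lists its C++ symbols in mangled form, which makes it hard to see where the fault occurs. What follows is a minimal sketch, assuming the Open MPI frame layout shown in this message, that pulls the symbol names out of such a trace so they can be passed to a demangler such as c++filt. The names FRAME_RE, parse_frames and SAMPLE are hypothetical helpers for illustration, not part of LAMMPS or Open MPI.

```python
import re

# Assumed frame layout (taken from the trace in this message):
#   [host:pid] [ N] binary(symbol+0xOFFSET) [0xADDRESS]
FRAME_RE = re.compile(
    r"\[\s*(?P<index>\d+)\]\s+(?P<binary>\S+?)"
    r"\((?P<symbol>[^()+]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\)"
)


def parse_frames(trace):
    """Return (frame index, symbol) pairs for frames that carry a symbol name."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        # Frames printed as "binary()" have no symbol and are skipped.
        if match and match.group("symbol"):
            frames.append((int(match.group("index")), match.group("symbol")))
    return frames


# Three frames copied from the trace above.
SAMPLE = """\
[cschpc:169783] [ 0] /lib64/libpthread.so.0() [0x3f6940f710]
[cschpc:169783] [ 2] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS15PairReaxCKokkosIN6Kokkos6SerialEE15FindBondSpeciesEv+0xb0) [0x17aa0d0]
[cschpc:169783] [ 9] ./lmp_kokkos_mpi_only(main+0x46) [0xd136c6]
"""

if __name__ == "__main__":
    for index, symbol in parse_frames(SAMPLE):
        print(index, symbol)
```

Demangled, frame 2 reads LAMMPS_NS::PairReaxCKokkos<Kokkos::Serial>::FindBondSpecies(), so the trace places the fault in the bond-species step of the KOKKOS reax/c pair style, called from compute() during the setup of the run.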

Is this caused by a shortage of RAM on the computer?

I cannot make sense of it. If you have any suggestions about this problem, I would be glad to hear them.

Thanks in advance for your help.

Best regards,

 

=====================
Mohammad Ebrahim Izadi,
Department of Chemistry,
Tehran University,
Islamic Republic of Iran,
Phone : +98 – 21 – 61113358
Fax :  +98 – 21 – 66409348






--
Dr. Axel Kohlmeyer  akohlmey@...12...24...  http://goo.gl/1wk0
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste. Italy.