Re: [lammps-users] error: segmentation fault_reax/c_KOKKOS


From: Mohammad Izadi <izadi0511@...24...>
Date: Wed, 6 Sep 2017 13:00:20 +0430

Dear Axel,

Thank you for your help. I have updated my LAMMPS to lammps-11Aug17 and first built lmp_mpi with the commands below:

make yes-all

make no-lib

make mpi

I ran my simulation with the command "nohup mpirun -np 25 ./lmp_mpi -sf intel < in.input &", but it halted with the final error "Segmentation fault".

I then built lmp_kokkos_mpi_only with the commands below:

make yes-all

make no-lib

make yes-KOKKOS

make kokkos_mpi_only

I ran it with "nohup mpirun -np 25 ./lmp_kokkos_mpi_only -k on -sf kk < in.input &". This run also stopped with a similar error, so the new version has the same problem.

Please help me with this problem. All of my input files are attached to this email. I have some much larger input files with exactly the same composition that I want to run once this problem is solved.


Thank you sincerely.

Best regards,

 =====================

Mohammad Ebrahim izadi,

Department of Chemistry,

Tehran University,

Islamic Republic of Iran,

Phone : +98 – 21 – 61113358

Fax :  +98 – 21 – 66409348


On Wed, Sep 6, 2017 at 12:45 AM, Axel Kohlmeyer <akohlmey@...24...> wrote:


On Mon, Sep 4, 2017 at 11:49 AM, Mohammad Izadi <izadi0511@...24...> wrote:

Dear lammps users,

I installed LAMMPS kokkos_mpi_only (lammps-31Mar17) on a server and ran reax/c with the KOKKOS package from the command line:


Please first update to the latest LAMMPS version and test whether your issue persists.
If yes, please provide a full input deck, so people elsewhere can reproduce this issue and debug it.

axel.

 

  nohup  mpirun     -np    2    ./lmp_kokkos_mpi_only   -k   on   -sf    kk    <  in.input &

My system has 2073 atoms and is in the gas phase. A smaller system (e.g. 200 atoms) runs without any error. My input file is below:

=================

echo            both

units                       real

newton            on

atom_style            charge

dimension       3

boundary        p p p

#read_restart    restart22

restart 500      restart11 restart22

read_data              Silica2073.data

pair_style              reax/c NULL

pair_coeff              * * ffield.reax.input C H O N S Si Na Ar

neighbor                2 bin

neigh_modify      every 5 delay 0 check no

velocity          all create 2100 235485 mom yes rot yes

fix                       1 all nvt temp 2100.0 2100.0 100.0

fix                   2 all qeq/reax 1 0.0 10.0 1e-6 reax/c

fix                  4 all reax/c/species 10 10 250 species.txt

fix                  6 all efield 0.0001 0.0 0.0

fix_modify     6 energy yes

fix                  7 all reax/c/bonds 250 bonds.reaxc

compute reax all pair reax/c

variable eb             equal c_reax[1]

variable ea             equal c_reax[2]

variable elp            equal c_reax[3]

variable emol        equal c_reax[4]

variable ev             equal c_reax[5]

variable epen         equal c_reax[6]

variable ecoa         equal c_reax[7]

variable ehb           equal c_reax[8]

variable et              equal c_reax[9]

variable eco           equal c_reax[10]

variable ew            equal c_reax[11]

variable ep             equal c_reax[12]

variable efi             equal c_reax[13]

variable eqeq         equal c_reax[14]

thermo_style    custom  step  temp  atoms  etotal  ke  pe  v_eb  v_ea  v_elp  v_emol  v_ev  v_epen v_ecoa  v_ehb  v_et  v_eco  v_ew  v_ep  v_efi  v_eqeq  density  vol  press

thermo          250

timestep 0.1

dump                     1 all xyz  250 dumpnvt.xyz

run                          4000000

========================
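
[Editorial sketch, not part of the original message.] One hedged way to check whether a single fix is involved in a crash is to run a copy of the input with that fix commented out and compare. The snippet below relies only on the LAMMPS convention that a fix line reads "fix ID group-ID style args". The function name comment_out_fix_style and the file name in.nospecies are made up for this example.

```shell
#!/bin/sh
# Sketch: write a copy of a LAMMPS input with every fix of one style
# commented out. comment_out_fix_style is a hypothetical helper name.
comment_out_fix_style() {
    # $1 is the fix style to disable, e.g. reax/c/species.
    # A fix line has the form: fix <ID> <group-ID> <style> <args...>,
    # so the style is the fourth whitespace-separated field.
    awk -v style="$1" '
        $1 == "fix" && $4 == style { print "#" $0; next }
        { print }
    '
}

# Usage (in.nospecies is an illustrative output name):
#   comment_out_fix_style reax/c/species < in.input > in.nospecies
```

Lines that refer to the disabled fix by ID (for example a fix_modify line) would also need to be commented out by hand; the input above has none for fix 4.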

Also, a single-core run does not stop, but multi-core runs of the large system (2073 atoms) stop immediately with the error below:

=========================================

WARNING: Fixes cannot send data in Kokkos communication, switching to classic communication (../comm_kokkos.cpp:382)

[cschpc:169783] *** Process received signal ***

[cschpc:169783] Signal: Segmentation fault (11)

[cschpc:169783] Signal code: Address not mapped (1)

[cschpc:169783] Failing at address: (nil)

[cschpc:169783] [ 0] /lib64/libpthread.so.0() [0x3f6940f710]

[cschpc:169783] [ 1] ./lmp_kokkos_mpi_only(_ZN6Kokkos12parallel_forINS_11RangePolicyIJNS_6SerialEN9LAMMPS_NS27PairReaxFindBondSpeciesZeroEEEENS3_15PairReaxCKokkosIS2_EEEEvRKT_RKT0_RKSsPNS_4Impl9enable_ifIXntsrNSG_11is_integralIS8_EE5valueEvE4typeE+0x268) [0x17a1bc8]

[cschpc:169783] [ 2] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS15PairReaxCKokkosIN6Kokkos6SerialEE15FindBondSpeciesEv+0xb0) [0x17aa0d0]

[cschpc:169783] [ 3] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS15PairReaxCKokkosIN6Kokkos6SerialEE7computeEii+0x34a4) [0x17e26e4]

[cschpc:169783] [ 4] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS12VerletKokkos5setupEv+0x6aa) [0x1a6b43a]

[cschpc:169783] [ 5] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS3Run7commandEiPPc+0x65e) [0x1a2271e]

[cschpc:169783] [ 6] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS5Input15command_creatorINS_3RunEEEvPNS_6LAMMPSEiPPc+0x26) [0xcfcc66]

[cschpc:169783] [ 7] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS5Input15execute_commandEv+0x7e7) [0xcfb0f7]

[cschpc:169783] [ 8] ./lmp_kokkos_mpi_only(_ZN9LAMMPS_NS5Input4fileEv+0x317) [0xcfbc57]

[cschpc:169783] [ 9] ./lmp_kokkos_mpi_only(main+0x46) [0xd136c6]

[cschpc:169783] [10] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3f6881ed5d]

[cschpc:169783] [11] ./lmp_kokkos_mpi_only() [0x6adfd1]

[cschpc:169783] *** End of error message ***

--------------------------------------------------------------------------

mpirun noticed that process rank 0 with PID 169783 on node cschpc.ut.ac.ir exited on signal 11 (Segmentation fault).

--------------------------------------------------------------------------
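
[Editorial sketch, not part of the original message.] The mangled C++ symbols in the backtrace above are easier to read once demangled. A minimal sketch, assuming the c++filt tool from GNU binutils is installed; if it is not, the symbols are printed unchanged. The function name demangle_frames is made up for this example.

```shell
#!/bin/sh
# Sketch: print the C++ symbols from an Open MPI backtrace in readable form.
# demangle_frames is a hypothetical helper name.
demangle_frames() {
    # Each frame looks like: [host:pid] [ n] binary(symbol+0xoffset) [address]
    # Keep only the mangled symbol (it starts with _Z); frames without a
    # symbol, such as "libpthread.so.0()", are dropped.
    sed -n 's/.*(\(_Z[A-Za-z0-9_]*\)+0x[0-9a-f]*).*/\1/p' |
    if command -v c++filt >/dev/null 2>&1; then
        c++filt          # demangle, one symbol per line
    else
        cat              # fallback: leave the mangled names as they are
    fi
}

# Usage (nohup writes to nohup.out by default):
#   demangle_frames < nohup.out
```

With c++filt available, frame [ 2] of the trace above comes out as LAMMPS_NS::PairReaxCKokkos<Kokkos::Serial>::FindBondSpecies(), which shows that the crash happens inside the KOKKOS ReaxFF pair style during setup of the run.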

Is this caused by a shortage of RAM on the computer?

I cannot work out the cause. If you have any suggestions about this problem, I would be glad to hear them.
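
[Editorial sketch, not part of the original message.] The observation that single-core runs succeed while multi-core runs fail can be narrowed down by trying several MPI process counts in turn. The sketch below assumes that mpirun exits with a non-zero status when a rank dies on a signal, and that a shortened copy of the input is used (the backtrace shows the failure during setup, so a very short run should be enough). LAUNCH, LMP, first_failing_np and in.short are illustrative names, not LAMMPS settings.

```shell
#!/bin/sh
# Sketch: find the smallest MPI process count at which a run fails.
# LAUNCH and LMP are hypothetical variables; override them as needed.
LAUNCH="${LAUNCH:-mpirun}"
LMP="${LMP:-./lmp_kokkos_mpi_only}"

first_failing_np() {
    # $1 is the input script; the remaining arguments are process counts.
    input=$1
    shift
    for np in "$@"; do
        # -in names the input script instead of redirecting stdin.
        # Output of each attempt is kept in log.np<count> for inspection.
        if ! $LAUNCH -np "$np" "$LMP" -k on -sf kk -in "$input" \
                > "log.np$np" 2>&1; then
            echo "$np"
            return 0
        fi
    done
    echo "none"
}

# Usage (in.short is an assumed copy of in.input with a short run length):
#   first_failing_np in.short 1 2 4 8
```

If the run already fails at 2 processes, as in the first report, the per-count log files make it easy to compare the last lines printed before the crash.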

 

Thanks in advance for your help

 

Best regards,

 

=====================

Mohammad Ebrahim izadi,

Department of Chemistry,

Tehran University,

Islamic Republic of Iran,

Phone : +98 – 21 – 61113358

Fax :  +98 – 21 – 66409348



--
Dr. Axel Kohlmeyer  akohlmey@...24...  http://goo.gl/1wk0
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste. Italy.

Attachment: segmentation_fault.rar
Description: application/rar