# [lammps-users] Fix Addforce Slow Down the Computational Efficiency

 From: Wei Peng Date: Mon, 16 Apr 2018 16:58:12 -0400

Dear LAMMPS administrators or users,

I am simulating a system with 8 nanoparticles containing Lennard-Jones atoms connected by FENE bonds. At every step, I apply a force to each atom in one half of each nanoparticle. The force on each of these atoms points toward the center of mass of the whole nanoparticle. To conserve momentum, I add an opposite force that is applied to the rest of the system. To remove the additional energy from the system, I use a Nose-Hoover thermostat.

I used the fix addforce command to apply the extra force. It works well and gives the expected result. However, the additional force slows the computation down significantly (about 20 times slower). I guessed that the slowdown was mainly caused by communication cost, but according to the output file produced by LAMMPS, that is not the case.

I have attached the input file and the output file here. The version of LAMMPS is Aug. 2017.

Can you give me any advice on accelerating the computation? Thank you for your help!

Below is the input file:

compute          coord1 np1 property/atom xu yu zu
compute          c1 np1 com

# np1 is the group id of nanoparticle 1.

variable         famp equal "0.1"

variable         dirx1 atom "c_coord1[1]-c_c1[1]"
variable         diry1 atom "c_coord1[2]-c_c1[2]"
variable         dirz1 atom "c_coord1[3]-c_c1[3]"
variable         diramp1 atom "sqrt(v_dirx1^2 + v_diry1^2 + v_dirz1^2)"
variable         fx1 atom "v_famp*v_dirx1/v_diramp1"
variable         fy1 atom "v_famp*v_diry1/v_diramp1"
variable         fz1 atom "v_famp*v_dirz1/v_diramp1"
compute          fxsum1 half1 reduce sum v_fx1
compute          fysum1 half1 reduce sum v_fy1
compute          fzsum1 half1 reduce sum v_fz1
fix              addfnp1 half1 addforce v_fx1 v_fy1 v_fz1

# "half1" is the group id of the half of nanoparticle 1 to which the force is applied.
# I did the same for the other 7 nanoparticles; those commands are not shown here.

variable         fxliquid equal "-1.0 *(c_fxsum1+c_fxsum2+c_fxsum3+c_fxsum4+c_fxsum5+c_fxsum6+c_fxsum7+c_fxsum8)/count(liquid)"

variable         fyliquid equal "-1.0 * (c_fysum1+c_fysum2+c_fysum3+c_fysum4+c_fysum5+c_fysum6+c_fysum7+c_fysum8)/count(liquid)"

variable         fzliquid equal "-1.0 * (c_fzsum1+c_fzsum2+c_fzsum3+c_fzsum4+c_fzsum5+c_fzsum6+c_fzsum7+c_fzsum8)/count(liquid)"

fix              addliquid liquid addforce v_fxliquid v_fyliquid v_fzliquid

# I sum the forces applied to the 8 nanoparticles, take the opposite, divide by the number of atoms in group "liquid", and add the result to every atom in that group.
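To make the force balance explicit, here is a minimal Python sketch of the same bookkeeping. It is only an illustration, not part of the simulation. The function names, coordinates, and group size are made up, and the direction follows the variable definitions as written in the input file (atom position minus center of mass).

```python
import math

def radial_unit_forces(positions, com, famp):
    """Force of magnitude famp on each atom, along (r - com) / |r - com|.

    Mirrors v_fx1, v_fy1, v_fz1 in the input file; the direction is the
    atom position minus the center of mass, as the variables are written.
    """
    forces = []
    for x, y, z in positions:
        dx, dy, dz = x - com[0], y - com[1], z - com[2]
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        forces.append((famp * dx / norm, famp * dy / norm, famp * dz / norm))
    return forces

def counter_force_per_atom(forces, n_liquid):
    """Uniform force on each liquid atom that cancels the summed applied force.

    Mirrors v_fxliquid, v_fyliquid, v_fzliquid: minus the total applied
    force, divided by the number of liquid atoms.
    """
    sx = sum(f[0] for f in forces)
    sy = sum(f[1] for f in forces)
    sz = sum(f[2] for f in forces)
    return (-sx / n_liquid, -sy / n_liquid, -sz / n_liquid)

# Hypothetical data: three driven atoms, center of mass at the origin,
# and ten liquid atoms sharing the counter force.
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
com = (0.0, 0.0, 0.0)
n_liquid = 10

applied = radial_unit_forces(positions, com, famp=0.1)
counter = counter_force_per_atom(applied, n_liquid)

# Net force on the whole system, component by component; should be ~0.
net = tuple(
    sum(f[i] for f in applied) + n_liquid * counter[i] for i in range(3)
)
```

In this picture, the per-atom forces and their sums have to be recomputed at every step, since both the atom positions and the center of mass change.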

Here is the performance report in the output file:

Loop time of 2228.15 on 1024 procs for 100000 steps with 53432 atoms

Performance: 38776.610 tau/day, 44.880 timesteps/s
100.0% CPU use with 1024 MPI tasks x 1 OpenMP threads

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 3.1147     | 3.4368     | 3.6571     |   4.7 |  0.15
Bond    | 0.14084    | 0.74264    | 5.9983     | 113.4 |  0.03
Neigh   | 152.64     | 153.88     | 154.96     |   4.0 |  6.91
Comm    | 39.216     | 47.364     | 63.729     |  80.0 |  2.13
Output  | 8.1437     | 8.8403     | 9.5221     |  13.4 |  0.40
Modify  | 1982.8     | 1999.6     | 2007.8     |  12.5 | 89.74
Other   |            | 14.33      |            |       |  0.64

Nlocal:    52.1797 ave 478 max 40 min
Histogram: 1010 6 2 2 0 2 0 1 0 1
Nghost:    236.358 ave 1235 max 204 min
Histogram: 993 16 4 2 3 3 1 1 0 1
Neighs:    261.92 ave 343 max 63 min
Histogram: 2 3 4 4 5 66 355 429 145 11

Total # of neighbors = 268206
Ave neighs/atom = 5.01958
Ave special neighs/atom = 0.63243
Neighbor list builds = 22088
Dangerous builds = 0
Total wall time: 0:37:10
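
The headline numbers in this report can be cross-checked with a few lines of arithmetic. The sketch below only reuses values printed above; the variable names are my own.

```python
# Values copied from the performance report above.
loop_time_s = 2228.15     # "Loop time of 2228.15"
n_steps = 100000          # "for 100000 steps"
n_atoms = 53432           # "with 53432 atoms"
n_procs = 1024            # "on 1024 procs"
modify_avg_s = 1999.6     # average time in the "Modify" row

# Throughput: should agree with the reported 44.880 timesteps/s.
steps_per_second = n_steps / loop_time_s

# Share of the loop time spent in Modify (fixes and the computes and
# variables they invoke): should agree with the reported 89.74 %total.
modify_fraction = modify_avg_s / loop_time_s

# Average number of owned atoms per MPI task: should agree with the
# reported "Nlocal: 52.1797 ave".
atoms_per_proc = n_atoms / n_procs

# Average wall time per step spent in Modify, in milliseconds.
modify_ms_per_step = 1000.0 * modify_avg_s / n_steps

print(f"{steps_per_second:.3f} timesteps/s")
print(f"{100.0 * modify_fraction:.2f} % of loop time in Modify")
print(f"{atoms_per_proc:.4f} atoms per MPI task on average")
print(f"{modify_ms_per_step:.3f} ms per step in Modify")
```

So roughly 20 ms of every step goes to the Modify section, while each MPI task owns only about 52 atoms on average.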

Sincerely,
Wei