Re: [lammps-users] converting pdb to lammps data file

From: Andrew Jewett <jewett@...1937...>
Date: Mon, 18 Sep 2017 17:53:32 -0700

As Axel pointed out, PDB files lack a lot of information.  Any tool that converts PDB files therefore has to make a large number of strong assumptions about your system.  Both moltemplate and topotools are general-purpose programs that make few assumptions, so you will have to supply all of this information yourself (missing hydrogen atoms, atom types, force field choice, bond ambiguity, etc.).  There are several conflicting atom naming conventions, which makes interpreting PDB files somewhat hellish.  They require a lot of processing.
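
To make the missing information concrete, here is a minimal Python sketch (not taken from any of the tools mentioned in this thread).  It slices one fixed-width PDB ATOM record into the fields the format actually carries, then fills in the force-field-specific fields from a lookup table.  The sample record, the residue name "DPP" and the helper names are made up for illustration; the type, mass and charge for P are the values from the question quoted further down.

```python
def parse_atom_record(line):
    """Slice one fixed-width PDB ATOM/HETATM record into named fields."""
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "resname": line[17:20].strip(),
        "chain": line[21].strip(),
        "resseq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "element": line[76:78].strip(),
    }


# Made-up sample record, assembled piece by piece so the columns line up.
sample = (
    "ATOM  " + f"{1:5d}" + " " + f"{' P':<4s}" + " " + f"{'DPP':>3s}" + " " + "A"
    + f"{1:4d}" + " " * 4
    + f"{12.345:8.3f}{-6.789:8.3f}{0.123:8.3f}"
    + f"{1.0:6.2f}{0.0:6.2f}" + " " * 10 + f"{'P':>2s}"
)

atom = parse_atom_record(sample)

# Hypothetical residue template: (residue name, atom name) -> force field data.
# In practice this comes from the force field's own database, not the PDB file.
TEMPLATE = {("DPP", "P"): {"type": "PL", "mass": 30.9740, "charge": 1.500}}


def complete_atom(atom, template):
    """Add force-field-specific fields; fail loudly if the template has no entry."""
    key = (atom["resname"], atom["name"])
    if key not in template:
        raise KeyError(f"no force field entry for {key}")
    return {**atom, **template[key]}


completed = complete_atom(atom, TEMPLATE)
print(atom)  # names and coordinates only: no atom type, partial charge, or bonds
print(completed["type"], completed["mass"], completed["charge"])
```

The lookup is keyed on residue and atom names, which is why a PDB file that deviates from the expected naming convention cannot be processed without manual work.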

One of these assumptions is the force field you are using.  Different PDB conversion tools often only work with one type of force field.  So, generally speaking, you need to decide which force field you want to use first, and then choose the simulation software and conversion tool that best support that force field.  If you have decided to limit yourself to LAMMPS, then I have heard that charmm2lmp (ch2lamp?) can completely convert PDB files into LAMMPS data files without any other tools, but you are limited to the CHARMM force field (and probably not the latest one).  The same could be said about AmberTools+amb2lmp (or amber2lmp?) and Materials Studio+msi2lmp.  These tools are not fully supported and have their share of bugs.  Axel (and Bruce Allen?) have been valiantly attempting to maintain msi2lmp, but it has some limitations that cannot be easily addressed.  Another alternative is to use the ATB database (see below).

   --- recommendation ---

If you are simulating large biomolecules and are willing to start from scratch, then my impression is that the most modern and versatile free tool for simulating all-atom proteins, nucleic acids and lipids might be OpenMM.

NAMD, GROMACS, AMBER and Materials Studio are also still very popular and powerful.  LAMMPS has some features that none of these programs have, but if you don't need them, then why suffer?  All of these simulation programs come with PDB conversion tools which are much more convenient than anything available for LAMMPS (and, I hate to say, more likely to be bug free).

    --- moltemplate ----

   If you want to prepare an all-atom simulation of DPPC lipids in moltemplate, it is possible, and it's not that difficult.  You just have to define a "DPPC" molecule that explicitly lists all the atom types and bonds in that type of lipid (similar to the "" file in the moltemplate examples).  For an example of the file format, see:
You can find all of the atom types you need in the "" file, but you have to choose them carefully.

Alternatively, you can generate a "" file containing a DPPC lipid molecule in moltemplate format using the "ATB" database.  This is probably the easiest (and safest) way to create these files right now.
(All files are available in moltemplate format, apparently.   When I tried it, the process seemed relatively straightforward.)

Once you have defined a "DPPC" molecule in moltemplate, it's not difficult to create a "" file which tells moltemplate to generate many copies of these lipid molecules, as well as the many water molecules and ions that would be present.  You can use either moltemplate ".move()" commands or PACKMOL (or LipidWrap?) to generate initial coordinates, which moltemplate can read, for all the molecules in the simulation.

  However, if you plan to run all-atom simulations of -proteins-, then it is more difficult.  You would have to write definitions for all 20 amino acids, and then create a polymer object linking them together.  I don't run these kinds of simulations, so I never took the time to write these 20 molecule objects, but it should not be -that- difficult.  See post here:

For this reason and others, moltemplate is not yet a convenient tool for preparing all-atom simulations of proteins.  I don't know if it ever will be.  It was intended for coarse grained modeling.

People can spend months figuring out how to convert PDB files into working simulation input files.  (This is not a good file format.  I wish it would die.  But people keep using it.)  This is not easy.  It's really time consuming.
I hope this helps give you an idea of the software that is available and its relative strengths.


PS. I wish I could say more about topotools.  I know you can do all this using topotools, but you will need to supply all of these missing details as well.
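
One of those details is how interaction types get counted, which is what the dihedral question quoted below runs into.  The following Python sketch is only an illustration and rests on an assumption: that a guessed dihedral is labeled by the atom types of its four atoms, read in either direction.  Under that assumption, the number of dihedral types in the data file is the number of distinct type quadruples found in the structure, which need not match the number of dihedral entries a force field lists (a force field may, for example, define several terms for one quadruple).  All type names other than PL are placeholders, and dihedral_type is a made-up helper, not a topotools command.

```python
from collections import Counter


def dihedral_type(a, b, c, d):
    """Return a direction-independent label for the dihedral a-b-c-d."""
    forward = (a, b, c, d)
    backward = (d, c, b, a)
    # Pick one canonical ordering so a-b-c-d and d-c-b-a share a label.
    return "-".join(min(forward, backward))


# Placeholder list of dihedrals, each given by the types of its four atoms.
dihedrals = [
    ("OSL", "PL", "OSL", "CTL2"),
    ("CTL2", "OSL", "PL", "OSL"),   # same dihedral type, written in reverse
    ("O2L", "PL", "OSL", "CTL2"),
    ("CTL2", "CTL2", "CTL2", "CTL2"),
]

counts = Counter(dihedral_type(*quad) for quad in dihedrals)
for label, n in sorted(counts.items()):
    print(label, n)
print(len(counts), "distinct dihedral types")
```

Comparing a list like this against the force field's own dihedral table, entry by entry, is one way to see where two counts diverge.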

On Sep 18, 2017 4:56 AM, "Axel Kohlmeyer" <akohlmey@...24...> wrote:
On Mon, Sep 18, 2017 at 7:34 AM, Neda Rafiee <ne.rafiee@...444...> wrote:
> Thanks a lot Axel,
> In fact I followed the tutorial, I used the following command for all of my
> atoms, for example for P I used:
> set selpl [atomselect top {name P}]
> $selpl set type PL
> $selpl set mass 30.9740
> $selpl set charge 1.500
> and then, I used :
> topo retypebonds
> topo numbondtypes
> topo guessangles
> topo numangletypes
> topo guessdihedrals
> topo numdihedraltypes
> topo writelammpsdata full
> Actually the number of atom types, bond types, and angle types are correct
> but there is a problem with the number of dihedral types. Actually, I have
> 58 dihedral types but in the created data file there are only 43. Can you
> help me with this?

no. i have no time to do your work. topotools is not designed to be a
simple automated topology builder tool (one could program one on top
of it, but i don't have the interest or the time), but a tool that
simplifies topology manipulations, that other automated tools cannot
do easily. thus, as a "close to the metal" tool, it is very much a
tool for people that know what they are doing or those that spend the
time to carefully test and understand each step of what they are doing.

mind you, the topotools tutorial is only an example and only for a
force field that doesn't require a residue template database to look
up topology, atom type and charge data.

> And what is your opinion with using Moltemplate in lammps to produce a data
> file?

topotools and moltemplate are different tools with different goals. i
have never used moltemplate.


> Thanks
> Neda
> On Mon, Sep 18, 2017 at 3:49 PM, Axel Kohlmeyer <akohlmey@...24...> wrote:
>> On Mon, Sep 18, 2017 at 6:50 AM, R. Varsha <varsharani.0909@...24...>
>> wrote:
>> > Hello Neda,
>> >
>> > Yes you can create lammps data file from your pdb via topotools. First
>> > you
>> > have to load that pdb file in vmd and then open tk console and give the
>> > following command -
>> > topo writelammpsdata <output filename> <atom-style>
>> >
>> > for e.g.
>> > topo writelammpsdata full
>> that is bad advice. this command will indeed create a data file, but
>> such a data file is _very likely_ bogus, *especially* when the input
>> is only a pdb file.
>> building a correct data file (or rather a force field specific
>> topology in general) for a molecular system, requires a lot more
>> effort. there is not a simple automatic "conversion" simply because
>> the pdb file is lacking a lot of the necessary information. in
>> particular a pdb file is lacking:
>> - explicit atom type information (note, that the atom names in a pdb
>> rarely coincide with the atom types, which are force field specific)
>> - explicit bond/angle/dihedral/improper information. VMD will use
>> heuristics to reconstruct bonds, but those may not be accurate and
>> anything beyond is not created, since those are not needed for
>> visualization
>> - partial charge information. most conventional force fields require
>> an assignment of an force field specific atom type *and* a partial
>> charge. those are usually stored in residue specific databases. for
>> charmm, these are .rtf files, while a force field like OPLS/AA has an
>> increment system.
>> for proteins (and selected lipids) where _the pdb file follows
>> *strictly* the published naming conventions_ tools exist (e.g. charmm
>> (the program), psfgen, xleap/tleap, pdb2gmx and more) to process the
>> pdb file, often in an interactive process and requiring a suitable
>> force field specific database.
>> topotools in combination with VMD scripting *can* be used for this,
>> but it requires a significant amount of thinking and custom scripting
>> as shown by the simple examples here:
>> axel.
>> >
>> >
>> > Regards
>> > Varsha
>> >
>> >
>> --
>> Dr. Axel Kohlmeyer  akohlmey@...24...
>> College of Science & Technology, Temple University, Philadelphia PA, USA
>> International Centre for Theoretical Physics, Trieste. Italy.

Dr. Axel Kohlmeyer  akohlmey@...43...4...
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste. Italy.
