- Open Access
SPICES: a particle-based molecular structure line notation and support library for mesoscopic simulation
© The Author(s) 2018
- Received: 5 January 2018
- Accepted: 3 August 2018
- Published: 9 August 2018
- Molecular structure representation
- Line notation
- Mesoscopic simulation
- Dissipative Particle Dynamics
A molecular simulation task comprises three successive steps: the definition of a simulation job with all necessary input information (preparation step), the loop over discrete integration time steps that numerically solves the equations of motion (simulation step), and the analysis of the simulation record with all calculated results (evaluation step). The preparation step has to provide data structures that the algorithms of the simulation step can exploit efficiently to achieve maximum performance. This is commonly achieved by defining adequate sets of arrays that encode all necessary molecular information, such as spatial positions or bonds of the interacting entities. The content of these arrays is usually provided by large tabular ASCII files that are often (at least partly) edited by hand. An example of such an ASCII file is PositionsBonds1.txt in the Jdpd repository (see the reference list), which describes 1,2-Dimyristoyl-sn-glycero-3-phosphocholine (DMPC) phospholipid molecules of a bilayer-membrane simulation task: each line contains an interacting entity, its spatial x, y and z coordinates, line offsets to bonded entities and specific indices for additional force assignments. The manual creation of these machine-oriented contents is not only tedious but also error-prone: for all but the simplest molecular ensembles, errors are likely to be introduced that may spoil the whole simulation process. Thus there is a clear need to prevent mistakes by safeguarded operations and to reduce manual preparation overhead by adequate automation.
A key part of this work is a set of methods operating on an intuitive line notation for particle-decomposed molecular structures denoted SPICES (Simplified Particle Input ConnEction Specification). The SPICES design is derived from straightforward simplifications of the well-established SMILES representation for atom-based molecular connectivity [21–23]. The set of SPICES-related methods supports the interplay of structural encoding levels (compare Fig. 1) as well as structure-based calculations for mesoscopic simulations (length and time scales, simulation box size, compound concentrations etc.). It allows for parsing and (graphically) analyzing the line notations, topological calculations (e.g. particle frequencies, particle neighbors or particle paths), the generation of corresponding 3D particle structures with support for their spatial mapping into the simulation box, and the final output of tabular ASCII files with molecular information for the following simulation step (the construction of the tabular ASCII file PositionsBonds1.txt listed in the references was in fact supported by the SPICES-related code of this work).
The Spices.jar library supports all aspects of SPICES definition and handling. A Spices object may be created from an input structure string alone or in combination with additional information like a map of available particles. A syntax parser analyzes the provided line notation; the methods isValid and getErrorMessage report whether the notation is valid and, if necessary, return detailed syntax error information. SPICES properties like the frequency of particles or complete lists of particle neighbors are evaluated upon user request by the methods getParticleFrequencies and getNextNeighbors.
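To make the two topological properties mentioned above concrete, the following minimal Python sketch computes particle frequencies and next-neighbor lists for a small particle graph. It is not the Spices.jar API: the particle labels, the bond list and the function names `particle_frequencies` and `next_neighbors` are hypothetical and chosen only for illustration, and the graph is assumed to be already parsed from a line notation.

```python
from collections import Counter

# Hypothetical particle graph: list index -> particle type (labels are made up).
particles = ["A", "B", "B", "C"]
# Bonds as pairs of particle indices (assumed output of some parser).
bonds = [(0, 1), (1, 2), (1, 3)]


def particle_frequencies(particles):
    """Count how often each particle type occurs in the structure."""
    return Counter(particles)


def next_neighbors(particles, bonds):
    """Map every particle index to the sorted indices of its bonded neighbors."""
    neighbors = {index: [] for index in range(len(particles))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
    return {index: sorted(bonded) for index, bonded in neighbors.items()}


frequencies = particle_frequencies(particles)
neighbors = next_neighbors(particles, bonds)
```

Here `frequencies` holds one count per particle type and `neighbors[1]` lists the three particles bonded to the central particle with index 1.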
A function of specific importance is the spatial projection of topological SPICES into a simulation box to set up adequate start geometries. Since a mesoscopic simulation is driven by soft particle potentials (in contrast to the atomic hard-core repulsions of e.g. molecular dynamics), different particles may penetrate each other and even occupy exactly the same spatial position (which would lead to infinite forces for hard atomic potentials). Thus the possibly severe problems of particle entanglements or caging effects due to inadequate start geometries are considerably attenuated. Nonetheless, a more favorable initial configuration may considerably reduce the necessary simulation period. A straightforward approach is a spatial linear tube representation as shown in Figs. 4 and 5: the longest linear particle chain in the molecule is determined and its particles are consecutively lined up along a straight line according to the specified bond length (which may be squeezed to fit into specific compartments like simulation box layers, see below and Fig. 5). Then all branched side particles are collapsed onto their nearest-neighbor particle on this line. For a fast determination of a sufficiently long linear particle chain, the Depth-First Search (DFS) algorithm is used. Starting from the first particle of the SPICES line notation, the maximum-distant particle A is determined by a first DFS run. With a second DFS run, the maximum-distant particle B from particle A is determined. Finally, the particle chain between A and B is chosen for the spatial tube representation. If a [START]/[END] tag pair is defined, the longest (oriented) linear chain between the tagged particles is evaluated. The sketched algorithm leads to true longest chains for acyclic SPICES but not necessarily for cyclic particle structures.
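The two-run DFS procedure described above can be illustrated with a short Python sketch. It is an independent illustration under stated assumptions, not the code of Spices.jar: the particle graph is assumed to be acyclic and given as a bond list over integer particle indices, and the names `build_adjacency`, `farthest_particle` and `longest_chain` are hypothetical. [START]/[END] tags and the collapsing of side particles are not covered.

```python
def build_adjacency(bonds):
    """Build an undirected adjacency map from (i, j) bond pairs."""
    adjacency = {}
    for i, j in bonds:
        adjacency.setdefault(i, []).append(j)
        adjacency.setdefault(j, []).append(i)
    return adjacency


def farthest_particle(adjacency, start):
    """Iterative DFS from start; returns the deepest particle and the parent map.

    In an acyclic graph the DFS depth equals the true path distance.
    """
    parent = {start: None}
    depth = {start: 0}
    farthest = start
    stack = [start]
    while stack:
        node = stack.pop()
        if depth[node] > depth[farthest]:
            farthest = node
        for neighbor in adjacency[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                depth[neighbor] = depth[node] + 1
                stack.append(neighbor)
    return farthest, parent


def longest_chain(adjacency, first_particle):
    """Return the chain from particle A to particle B found by two DFS runs."""
    particle_a, _ = farthest_particle(adjacency, first_particle)
    particle_b, parent = farthest_particle(adjacency, particle_a)
    chain = []
    node = particle_b
    while node is not None:  # walk back from B to A via the parent map
        chain.append(node)
        node = parent[node]
    chain.reverse()
    return chain


# Hypothetical branched structure: main chain 0-1-2-3-4-5, side particle 6 on 2.
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (2, 6)]
chain = longest_chain(build_adjacency(bonds), first_particle=6)
```

Even though the search starts at the side particle 6, the first run ends at particle 5 and the second run at particle 0, so `chain` contains the six main-chain particles. For cyclic graphs the DFS depth is not a shortest-path distance, which matches the caveat stated in the text.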
For a distinct fragmentation scheme of a molecule there may be several different but equally valid SPICES line notations since the proposed line notation is not canonically unique. For acyclic SPICES with a defined [START]/[END] tag pair, the sketched 3D tube construction process leads to a single distinct spatial 3D tube representation for all these possible line notations (without a defined [START]/[END] tag pair there may be two possible orientations). For cyclic particle structures this may not be the case, i.e. different but equally valid SPICES line notations may lead to different spatial 3D tube representations and correspondingly different start geometries of a simulation. In our experience this shortcoming is of minor practical relevance since the possibly different 3D tube representations for small molecules seem to be sufficiently similar for convergent mesoscopic simulation results. On the other hand, for large complex molecules like cross-linked (bio)polymers the simple linear 3D tube representation is questionable in principle, so that specific conversion tools like a PDB-to-SPICES parser for peptides and proteins, which would take the known molecular 3D structure into account, would be advisable.
The sketched spatial projection (see Fig. 5) is accomplished by the interplay of the methods setCoordinates and getParticlePositionsAndConnections. After creation of a Spices object from a SPICES line notation string (which takes a fraction of a second for small molecules like DMPC), arrays for the first (start) and the last (end) particle positions of all spatial linear 3D tubes as well as the bond length may be provided via the setCoordinates method. The first (start) particles of the linear chains are always placed at the defined start positions, whereas the last (end) particles do not necessarily reach the defined end positions: if the defined start/end straight line is longer than the accumulated bond lengths of the particles on the longest linear chain, the 3D tube is shorter than defined. Conversely, a 3D tube is squeezed (with equally reduced bond lengths) if the defined start/end straight line is shorter than the accumulated bond lengths. Thus the calling code (e.g. a compartment editor that allows for flexible compartment definitions within the simulation box, like the bilayer compartment shown on the right in Fig. 5) only has to define correctly oriented and valid lines within an arbitrary compartment (which is comparatively simple to realize) without having to calculate and pre-check every individual length (which could be more difficult). The method getParticlePositionsAndConnections then provides all corresponding particle positions within the simulation box, with all particle–particle bonds coded as specific offsets that are commonly used by simulation kernels (compare the tabular ASCII file PositionsBonds1.txt in the reference list). The interplay of setCoordinates and getParticlePositionsAndConnections is sufficiently fast for true on-the-fly calculations: e.g. a spatial projection of 50,000 DMPC molecules (with 800,000 particles) into the simulation box takes less than a second on an ordinary scientific workstation or even a standard notebook computer.
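The placement rule for a single linear 3D tube (fixed start position, end position possibly not reached, squeezing when the line is too short) can be sketched in a few lines of Python. This is a minimal illustration and not the implementation behind setCoordinates: the function name `tube_positions` and its parameters are hypothetical, only the particles of the longest chain are placed, and bond offsets are not generated.

```python
import math


def tube_positions(start, end, particle_count, bond_length):
    """Line up chain particles on the straight line from start towards end.

    The first particle always sits at start. If the line is longer than the
    accumulated bond lengths, the tube stops short of end; if it is shorter,
    all bonds are reduced equally so that the last particle lands on end.
    """
    delta = [e - s for s, e in zip(start, end)]
    line_length = math.sqrt(sum(d * d for d in delta))
    if particle_count < 2 or line_length == 0.0:
        # Degenerate case: all particles collapse onto the start position.
        return [tuple(start)] * particle_count
    # Squeeze only if the accumulated bond length exceeds the line length.
    step = min(bond_length, line_length / (particle_count - 1))
    unit = [d / line_length for d in delta]
    return [
        tuple(s + index * step * u for s, u in zip(start, unit))
        for index in range(particle_count)
    ]


# Line longer than the chain: 6 particles with bond length 1.0 on a line of 10.
short_tube = tube_positions((0.0, 0.0, 0.0), (0.0, 0.0, 10.0), 6, 1.0)
# Line shorter than the chain: 5 particles squeezed onto a line of length 2.
squeezed_tube = tube_positions((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 5, 1.0)
```

In the first call the last particle ends at z = 5 and does not reach the defined end position; in the second call the bond length is reduced to 0.5 so that the last particle sits exactly on the end position.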
A graphical visualization may be achieved by adequate application of open-source projects that provide chemical structure drawing capabilities. For instance, the structure-diagram layout of the Chemistry Development Kit (CDK) [26–28] can be customized to display SPICES instead of atom-based connection topologies. A principal problem of this (mis)use of atom-based layouts is the inappropriateness of its layout elements and templates: particle graphs do not follow common patterns of atomic connections (see Fig. 6), so that topological visualizations may result in incomprehensible graphs. Thus a more general graph visualization approach, e.g. with the GraphStream library, is necessary. In addition, this library allows individually tailored changes of the produced graph by manual displacement of node positions to remove unwanted node or edge overlaps. SpicesViewer.jar is a GUI application (on top of Spices.jar and the connection library SpicesToGraphStream.jar) for a topological SPICES display with the GraphStream library to analyze the influence of different graph settings and to demonstrate computational functions like zooming or graph image generation. Figure 6 shows the SpicesViewer.jar GUI with a manually tailored SPICES graph visualization of the cyclic peptide Kalata B1 with 29 amino acids.
This work provides a Java library for SPICES handling and mesoscopic simulation support (Spices.jar) in combination with a connection library (SpicesToGraphStream.jar) and a Java Graphical User Interface (GUI) viewer application (SpicesViewer.jar) for visual topological inspection and manipulation of SPICES molecule definitions. All libraries/applications are publicly available as open source published under the GNU General Public License version 3. The SPICES GitHub repository contains the Java bytecode libraries, a Windows OS installer for the SpicesViewer GUI application, all Javadoc HTML documentation and the NetBeans source code packages including unit tests.
The presented set of methods may facilitate molecular structure definitions for mesoscopic simulation tasks. The SpicesViewer GUI application demonstrates relevant use cases in detail with corresponding sample code. The new libraries may be utilized within scripting environments or become part of integrated mesoscopic simulation systems.
Future developments may address SPICES parsers that especially support the more difficult preparation of polymer systems, e.g. a PDB-to-SPICES parser for peptides and proteins provided in form of PDB files (actually, the SPICES string of the Kalata B1 peptide in Fig. 6 was generated from its PDB file with a prototype parser that uses the amino acid fragmentation schemes and connection rules outlined in ). Another promising challenge would be a conversion between particle and all-atom representations for an interplay of atomistic and mesoscopic simulation.
KvdB, MD, JS and AZ designed, implemented and tested the SPICES-related code. ME, HK and AZ conceived the SPICES approach and led the project development. All authors read and approved the final manuscript.
The authors would like to thank the GraphStream dynamic graph library development and project team, the Apache Commons contributors, the reviewers for helpful suggestions (especially the catchy SPICES acronym) and Noel O’Boyle for stimulating discussions. The support of GNWI – Gesellschaft für naturwissenschaftliche Informatik mbH, Oer-Erkenschwick, Germany, is gratefully acknowledged.
The authors declare that they have no competing interests.
Availability of data and materials
SPICES repository at https://github.com/zielesny/SPICES.
Availability and requirements
Project name: SPICES. Project home page: SPICES repository at https://github.com/zielesny/SPICES. Operating system(s): Platform independent. Programming language: Java. Other requirements: Java 1.8 or higher. License: GNU General Public License version 3.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Text file (2018) PositionsBonds1.txt. https://github.com/zielesny/Jdpd/tree/master/src/de/gnwi/jdpd/tests/test_DMPC. Accessed 16 June 2018
- Engel T, Gasteiger J (eds) (2018) Chemoinformatics: basic concepts and methods. Wiley, Weinheim
- Engel T, Gasteiger J (eds) (2018) Applied chemoinformatics: achievements and future opportunities. Wiley, Weinheim
- Siani MA, Weininger D, Blaney JM (1994) CHUCKLES: a method for representing and searching peptide and peptoid sequences on both monomer and atomic levels. J Chem Inf Comput Sci 34(3):588–593
- Siani MA, Weininger D, James CA, Blaney JM (1995) CHORTLES: a method for representing oligomeric and template-based mixtures. J Chem Inf Comput Sci 35(6):1026–1033
- Drefahl A (2011) CurlySMILES: a chemical language to customize and annotate encodings of molecular and nanodevice structures. J Cheminf 3:1
- Zhang T, Li H, Xi H, Stanton RV, Rotstein SH (2012) HELM: a hierarchical notation language for complex biomolecule structure representation. J Chem Inf Model 52(10):2796–2806
- Dufresne Y, Noé L, Leclère V, Pupin M (2015) Smiles2Monomers: a link between chemical and biological structures for polymers. J Cheminf 7:62
- Truszkowski A, Daniel M, Kuhn H, Neumann S, Steinbeck C, Zielesny A, Epple M (2014) A molecular fragment cheminformatics roadmap for mesoscopic simulation. J Cheminf 6:45
- Hoogerbrugge PJ, Koelman JMVA (1992) Simulating microscopic hydrodynamic phenomena with dissipative particle dynamics. Europhys Lett 19(3):155–160
- Koelman JMVA, Hoogerbrugge PJ (1993) Dynamic simulations of hard-sphere suspensions under steady shear. Europhys Lett 21(3):363–368
- Espanol P, Warren P (1995) Statistical mechanics of dissipative particle dynamics. Europhys Lett 30(4):191–196
- Espanol P (1995) Hydrodynamics from dissipative particle dynamics. Phys Rev E 52(2):1734–1742
- Groot RD, Warren P (1997) Dissipative particle dynamics: bridging the gap between atomistic and mesoscopic simulation. J Chem Phys 107(11):4423–4435
- Groot RD, Madden TJ (1998) Dynamic simulation of diblock copolymer microphase separation. J Chem Phys 108(20):8713–8724
- Ryjkina E, Kuhn H, Rehage H, Müller F, Peggau J (2002) Molecular dynamic computer simulations of phase behavior of non-ionic surfactants. Angew Chem Int Ed 41(6):983–986
- Schulz SG, Kuhn H, Schmid G, Mund C, Venzmer J (2004) Phase behavior of amphiphilic polymers: a dissipative particles dynamics study. Colloid Polym Sci 283:284–290
- Truszkowski A, Epple M, Fiethen A, Zielesny A, Kuhn H (2013) Molecular fragment dynamics study on the water–air interface behavior of non-ionic polyoxyethylene alkyl ether surfactants. J Colloid Interface Sci 410:140–145
- Vishnyakov A, Lee M-T, Neimark AV (2013) Prediction of the critical micelle concentration of nonionic surfactants by dissipative particle dynamics simulations. J Phys Chem Lett 4:797–802
- Truszkowski A, van den Broek K, Kuhn H, Zielesny A, Epple M (2015) Mesoscopic simulation of phospholipid membranes, peptides, and proteins with molecular fragment dynamics. J Chem Inf Model 55:983–997
- Weininger D (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 28:31–36
- Weininger D, Weininger A, Weininger JL (1989) SMILES. 2. Algorithm for generation of unique SMILES notation. J Chem Inf Comput Sci 29(2):97–101
- Weininger D (1990) SMILES. 3. Depict. Graphical depiction of chemical structures. J Chem Inf Comput Sci 30(3):237–243
- Groot RD (2003) Electrostatic interactions in dissipative particle dynamics: simulation of polyelectrolytes and anionic surfactants. J Chem Phys 118(24):11265–11277
- Sedgewick R, Wayne K (2011) Algorithms. Chapter 4: Graphs, 4th edn. Addison-Wesley, Boston
- Steinbeck C, Han Y, Kuhn S, Horlacher O, Luttmann E, Willighagen EL (2003) The Chemistry Development Kit (CDK): an open-source Java library for chemo- and bioinformatics. J Chem Inf Comput Sci 43(2):493–500
- Steinbeck C, Hoppe C, Kuhn S, Floris M, Guha R, Willighagen EL (2006) Recent developments of the Chemistry Development Kit (CDK): an open-source Java library for chemo- and bioinformatics. Curr Pharm Des 12(17):2111–2120
- Willighagen EL, Mayfield JW, Alvarsson J, Berg A, Carlsson L, Jeliazkova N, Kuhn S, Pluska T, Rojas-Chertó M, Spjuth O, Torrance G, Evelo CT, Guha R, Steinbeck C (2017) The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching. J Cheminf 9:33
- GraphStream: A dynamic graph library. http://graphstream-project.org. Accessed 16 June 2018
- GNU General Public License. http://www.gnu.org/licenses. Accessed 16 June 2018
- Javadoc documentation. http://www.oracle.com/technetwork/java/javase/documentation. Accessed 16 June 2018
- NetBeans IDE Version 8.2. https://netbeans.org. Successor: https://netbeans.apache.org. Accessed 16 June 2018