Full article: Automated determination of hybrid particle-field parameters by machine learning

Formulae display: $MathJax Logo$ ?Mathematical formulae have been encoded as MathML and are displayed in this HTML version using MathJax in order to improve their display. Uncheck the box to turn MathJax off. This feature requires Javascript. Click on a formula to zoom.

Abstract

The hybrid particle-field molecular dynamics method is an efficient alternative to standard particle-based coarse grained approaches. In this work, we propose an automated protocol for optimisation of the effective parameters that define the interaction energy density functional, based on Bayesian optimisation. The machine-learning protocol makes use of an arbitrary fitness function defined upon a set of observables of relevance, which are optimally matched by an iterative process. Employing phospholipid bilayers as test systems, we demonstrate that the parameters obtained through our protocol are able to reproduce reference data better than currently employed sets derived by Flory-Huggins models. The optimisation procedure is robust and yields physically sound values. Moreover, we show that the parameters are satisfactorily transferable among chemically analogous species. Our protocol is general, and does not require heuristic a posteriori rebalancing. Therefore it is particularly suited for optimisation of reliable hybrid particle-field potentials of complex chemical mixtures, and extends the applicability corresponding simulations to all those systems for which calibration of the density functionals may not be done via simple theoretical models.

GRAPHICAL ABSTRACT

Keywords:

1. Introduction

Hybrid particle-field (hPF) simulations are a class of efficient methods that are well adapted for studying very large soft matter systems with molecular resolution [Citation1–4]. The essence of the hPF methodology is contained in the two terms of the hPF Hamiltonian: (1) $H ({r}) = \sum_{m} H_{0} ({r}_{m}) + W [{ϕ (r)}] .$ (1) Here $H_{0}$ , the Hamiltonian of a single molecule m, contains the kinetic energy and the intramolecular potential as defined in standard particle-based potentials, and W, the interaction energy functional [Citation5–7] dependent on the density-fields $ϕ (r)$ of the different particle species, models all intermolecular interactions.

Intramolecular forces, by their very nature, only act on a single molecule, while the density-field interactions manifest as a quasi-instantaneous external potential, coupling the motion of the different molecules. The possibility of computing the external potential using particle-mesh routines allows for a very efficient and highly parallel implementation requiring very little communication among processors, resulting in algorithms formally exhibiting strong-scaling [Citation8,Citation9]. Very recently, a GPU-based implementation of the Monte Carlo-based hPF (single chain in mean field) set a new milestone with simulations of polymer melts composed of 10 billion particles [Citation9].

The coupling of hPF to molecular dynamics in efficient parallelised software [Citation8,Citation10] has allowed for the application of hPF simulations on both conventional soft polymer mixtures and biological systems [Citation11–15]. Prominent examples range from nanostructured multiphase materials [Citation16–19] to organised and disorganised lipid/water mixtures [Citation20,Citation21]. Recently, hPF was extended to simulations of polypeptides [Citation15], and to include explicit treatment of electrostatic interactions [Citation22–24], the latter opening up to the formulation of density functional-based computational predictive models of the complex phase behavior of lipopolysaccharides [Citation25].

Despite the growing level of maturity reached by hPF simulations, so far relatively little attention has been put into developing systematic protocols for the parameterisation of the interaction energy functional W. In particular, the quality of hPF models depends on both the physical model chosen for $W [ϕ]$ , and on the appropriate calibration of all the numerical parameters it may depend upon. The most commonly employed model for W typically takes the form of: (2) $\begin{aligned} W [ϕ (r)] & = \frac{1}{2 ϕ_{0}} \int d r (\sum_{ij} {\tilde{χ}}_{ij} ϕ_{i} (r) ϕ_{j} (r) \\ + \frac{1}{κ} {(\sum_{j} ϕ_{j} (r) - ϕ_{0})}^{2}), \end{aligned}$ (2) where the average number density of the system is denoted $ϕ_{0}$ , κ is a compressibility term which controls the level of fluctuations of the overall density, and the ${\tilde{χ}}_{ij}$ matrix is an energetic parameter that models local mixing energy between species i, j present in the system.

Parameters for the local mixing energy may be derived by different experimental approaches. For example, for simple polymers in a solvent, the $\tilde{χ}$ -parameter can be obtained from thermometric data [Citation26]. This is however not as easily available when considering hetero-polymeric systems. Another approach is to estimate $\tilde{χ}$ by its relationship with the Hildebrand solubility parameter [Citation27]. However this can be problematic as solubility parameters are often inaccurate [Citation28]. Most importantly, for the molecular resolution of hPF models, which often adopt coarse grained (CG) representations in the range of four–ten atoms per bead, factorisation of global experimental data into the individual molecular components may not be trivial.

A more effective determination of $\tilde{χ}$ parameters may be obtained using simple Flory-Huggins (F-H) lattice models: (3) ${\tilde{χ}}_{ij} = - z (ϵ_{ij} - \frac{1}{2} (ϵ_{ii} + ϵ_{jj})),$ (3) where $ϵ_{ij}$ is the mixing energy between species i and j, and z is the coordination number, which takes the value of 6 for three-dimensional Cartesian lattices. The mixing energy between two species can be approximated by the two-body interaction energy defined in the potential of the underlying molecular model employed. While this approach has been quite successful so far, there are a few limitations that hamper its general use. Prominently, the F-H model considers contact energies only, sometimes even disregarding entropy contributions to the binding, not taking into account many-body effects, or long-range interactions. The latter are particularly important, for example, in very polar or charged moieties. In practice, F-H parameters provide very good qualitative guesses for the values of $\tilde{χ}$ . Nonetheless, satisfactory quantitative agreement with reference data, especially in chemically complex systems, usually requires an a posteriori heuristic fine tuning of at least some of the values of the $\tilde{χ}$ matrix [Citation21].

Importantly, even though the first term of the interaction energy in (Equation2(2) $\begin{aligned} W [ϕ (r)] & = \frac{1}{2 ϕ_{0}} \int d r (\sum_{ij} {\tilde{χ}}_{ij} ϕ_{i} (r) ϕ_{j} (r) \\ + \frac{1}{κ} {(\sum_{j} ϕ_{j} (r) - ϕ_{0})}^{2}), \end{aligned}$ (2) ) accounts in principle for the total energy of mixing, in recent times the addition of other terms to the W functional, for example explicitly describing electrostatics [Citation22–24] or surface interactions [Citation29,Citation30], poses the problem of appropriately factorising such contributions out the mixing $\tilde{χ}$ term to avoid non-physical double-counting. In these cases, $\tilde{χ}$ loses a direct physical meaning, and for this reason it is problematic to define plausible values for $\tilde{χ}$ directly from theoretical models.

The hPF interaction energy is globally dependent on a large set of parameters comprising both the $\tilde{χ}$ matrix, and any other parameter present in other energy terms eventually employed. Therefore, the determination of an accurate functional W should be addressed as a global optimisation problem. Systematic approaches to parameterisation of ordinary particle-particle potentials in CG force fields, such as force matching [Citation31], Iterative Boltzmann Inversion [Citation32] and Inverse Monte Carlo [Citation33], effectively consider parameterisation as optimisation problems where parameters are chosen to satisfy a given fitness function. For example, Iterative Boltzmann inversion and Inverse Monte Carlo consider a high resolution reference potential of mean force and optimise interaction potentials to reproduce this reference using the CG degrees of freedom. A key observation in such attempts is that the potential of mean force and interactions potentials, due to loss of entropy in the process of CG, most often are significantly different. Similarly, the $\tilde{χ}$ parameter of continuum density-field for polymers has not the direct meaning of a potential of mean force, but rather that of a phenomenological energetic term [Citation7].

The determination of hPF $\tilde{χ}$ force fields parameters poses a particularly challenging optimisation problem. First, these interactions cannot be framed as in a reaction coordinate form; therefore, $\tilde{χ}$ parameters cannot be optimised through standard state of the art methods, such as Iterative Boltzmann inversion or Inverse Monte Carlo. Second, the gathering of hPF data does not yield derivatives of the model fitness with respect to the parameters, thereby restricting us to gradient-free optimisation. Finally, the $\tilde{χ}$ -matrix may involve a large parameter space for complex chemical mixtures, thus a general optimisation method needs to be capable of dealing with large dimensional parameter spaces.

Given such constraints, the large family of surrogate (or response surface methodology [RSM]) model based approaches, in which a response surface meta-model is introduced and updated through sequential noisy sampling, provides several possible optimisation techniques. Methods in the literature, of particular relevance, are classical sequential RSM [Citation34,Citation35], Lipschitz optimisation [Citation36,Citation37], Trust region methods [Citation34,Citation38], and Bayesian optimisation [Citation39–41] (BO). In addition, various random search methods, such as genetic algorithms [Citation42,Citation43], simulated annealing [Citation44–46], Latin hypercube sampling [Citation47], or straight uniform random sampling, are applicable.

Among the cited methodologies, BO is a versatile scheme for the global optimisation of expensive non-linear black-box functions for which derivatives with respect to the input parameters are hard or impossible to compute [Citation39–41]. The BO algorithm, developed in the 70s, has in the last decade emerged as a strong solution to derivative-free optimisation of computationally expensive and noisy black-box functions, with powerful performance in many practical applications, especially within the field of machine learning hyper-parameter optimisation [Citation48–52].

In this work we present a protocol for the optimisation of hPF parameters based on BO. The choice of this methodology is based on its strong theoretical convergence properties when paired with an upper-confidence bound (UCB) acquisition function [Citation53], its simple implementation, and its highly data efficient sampling [Citation54]. The effectiveness and robustness of our optimisation protocol is tested against uniform random sampling, the simplest possible optimisation strategy, and previous literature data based on F-H models.

2. Materials and methods

2.1. The hybrid particle-field method

The phase space of a molecular system with total energy (Equation1(1) $H ({r}) = \sum_{m} H_{0} ({r}_{m}) + W [{ϕ (r)}] .$ (1) ) may be sampled either by Monte Carlo [Citation1], or by molecular dynamics (hPF-MD) [Citation3]. In this work we employed hPF-MD.

In hPF-MD, the equations of motion for the independent particles are determined by the presence of an external potential obtained as the functional derivative of W. Specifically, the potential acting on each particle species i located at position $r$ takes the form [Citation3]: (4) $\begin{aligned} V_{i}^{ext} (r) & = \frac{δW [ϕ (r)]}{δ ϕ_{i} (r)} \\ = \frac{1}{ϕ_{0}} (\sum_{j} {\tilde{χ}}_{ij} ϕ_{j} (r) + \frac{1}{κ} (\sum_{j} ϕ_{j} (r) - ϕ_{0})) . \end{aligned}$ (4) In the OCCAM hPF-MD software [Citation8], which we employ in this paper, the related forces are evaluated via a numerical particle-mesh approach from spatial derivatives of the external potential: (5) $F_{i}^{ext} (r) = - \nabla V_{i} (r) = - \frac{1}{ϕ_{0}} \sum_{j} ({\tilde{χ}}_{ij} + \frac{1}{κ}) \nabla ϕ_{j} (r) .$ (5) For more details on the computation of the forces, see ref. [Citation3].

2.2. hPF force field parameterisation protocol

To determine hPF force field parameters, we employ a general iterative automated optimisation framework as depicted in Figure . Starting from a force field parameter set $x$ , a hPF trajectory is gathered and analysed giving output data of relevance $y_{sim .}$ . The output data is then compared to reference data $y_{ref .}$ , which can be provided by any accurate source, including high(er) resolution simulations or experiment. An objective (or fitness) function $η = η (y_{sim .}, y_{ref .}; x)$ assesses the quality of the parameterisation, and from the fitness value, the optimiser proposes a new hPF parameter set $x$ . The full cycle is automated and is repeated until satisfactory convergence of the fitness is reached, yielding the optimal hPF force field.

Figure 1. Protocol for optimising hPF force fields.

In principle, any optimiser that is not dependent on gradient values of the fitness values, can be employed. However, given the potentially large dimension of the parameter space $x$ and the computationally expensive simulations needed to gather $y_{sim .}$ , it is essential that the optimiser should converge with the fewest possible amount of iterations. Next, simulation data has an element of stochasticity due to only sampling a finite set of system configurations in the ensemble average, therefore the optimiser needs to be robust against noise in the fitness values. Finally, we note that computational expensiveness of gathering $y_{sim .}$ , makes almost any computational cost of the optimiser itself negligible.

The protocol we propose makes use of BO, a surrogate based model for solving constrained optimisation problems: (6) $x_{opt} = argma x_{x \in X} η (x) .$ (6) The space of possible parameter configurations $X$ is usually a compact subset of $R$ and the objective function η is in general unknown, non-convex, multimodal, and only accessible through (computationally expensive) pointwise noisy sampling. In the BO algorithm, a Gaussian process (GP) function prior is placed on the underlying true objective and updated via Bayesian posterior updating (Bayes' rule) by sequential probing of η [Citation54]. In this way, a probabilistic response surface is built which represents, at each iteration, the model's beliefs about the objective (μ) and how confident the model is at each point in $X$ (σ). BO achieves high efficiency in the sampling of the parameter space by leveraging both μ and σ in an acquisition function (AF), $a (x) = a (μ (x), σ (x))$ . Often, the AF contains a parameter β which governs the trade-off between exploration (sampling areas in $X$ where the uncertainty is high) and exploitation (sampling areas where good $x$ are known to be located). The AF guides the sampling by picking points $x \in X$ to explore according to a strategy for improving upon the currently best found $x$ .

The GP prior is a multivariate Gaussian distribution over functions, uniquely defined by a covariance kernel $Σ_{0}$ and a mean function $μ_{0}$ . The kernel function induces a metric on $X$ which defines a measure of the distance (similarity) between points $x$ and $x^{'}$ . The choice of a specific such $Σ_{0}$ represents a priori assumptions about the structure of the underlying true objective.

Often, one or more hyper-parameters in the covariance kernel have to be specified. It is customary to fix the values of these parameters by maximising the marginal likelihood of the model, given the observed data. Marginalising out the true noise-free objective function gives the likelihood of the model hyper-parameters. For GPs, the log marginal likelihood integral in question is analytically tractable, and may be easily maximised to determine the optimal kernel hyper-parameters.

2.3. Test case: phospholipid model for bilayers

As test case we consider a hPF-MD model for fully-saturated phospholipid bilayers, using in particular four variants characterised by different lengths of the fatty tail dipalmitoylphosphatidylcholine (DPPC), dimyristoylphosphatidylcholine (DMPC), distearoylphosphatidylcholine (DSPC), and mono-unsaturated dioleoylphosphatidylcholine (DOPC). For direct comparison, we use the same mapping of the model developed by De Nicola et al. [Citation21] (Figure ), which employs a MARTINI CG representation of the phospholipids [Citation55] and explicit solvent.

Figure 2. Summary of the hPF phospholipid model. Left: CG representation of the DPPC phospholipid and solvent. Right: Outline of the two terms in the hPF Hamiltonian.

In this work, we limit our analysis to the optimisation of the ${\tilde{χ}}_{ij}$ matrix, while the bonded terms and the compressibility κ are kept the same in the model by De Nicola et al. [Citation21].

A ( $13 \times 13 \times 14 {nm}^{3}$ ) simulation box containing 528 DPPC lipids and 14,000 water beads is employed. Each simulation in the optimisation look lasts $20 ns$ . The hPF simulations were run using OCCAM.Footnote¹ The simulations are performed under the NVT ensemble, using the Andersen thermostat with a coupling time of $0.1 ps$ and a collision frequency of $7.0 {ps}^{- 1}$ . A time step of $0.03 ps$ was used. The particle-mesh routines for particle-field forces in OCCAM employed a grid size of $0.58 nm$ (1.25 times the bond length used) and an update period of $0.3 ps$ (10 time steps). hPF-MD simulations are performed at a temperatures of 335, 325, and $303 K$ for DSPC, DPPC, and DMPC/DOPC, respectively. These temperatures were chosen to allow direct comparison with the results in [Citation21].

To evaluate the fitness of the model we consider electron density profiles (ϕ) of the different species, compared to those obtained from reference CG simulations using the MARTINI force field. This choice is made to have the best assessment of the quality of the BO procedure as compared to the F-H parameters used in the literature currently. For optimal determination of hPF parameters for phospholipids, more accurate all-atom models or experimental data may be eventually employed. Note however that using an experimental target for the objective function adds a layer of complexity due to the incompleteness of the data as only the total electron density across the membrane normal is available, not the density of individual moieties.

The fitness $η (φ; \tilde{χ})$ is defined as the average mean squared error over the electron densities of the different species k: (7) $η (φ; \tilde{χ}) = \frac{1}{n n_{k}} \sum_{k = 1}^{n_{k}} \sum_{i = 1}^{n} | φ_{i}^{k} - {\hat{φ}}_{i}^{k} |^{2},$ (7) with $φ_{i}^{k}$ being the electron density of species k at a position $z_{i} = 2 iℓ / n - ℓ$ along the bilayer normal. The density profiles are computed relative to the center of mass of all carbon type beads in the simulation, which is taken to be the center of the bilayer. ${\hat{φ}}_{i}^{k}$ indicates the reference density to be matched (in our case the MARTINI simulation results). The total number of different particle species is denoted $n_{k}$ , while n is the number of bins in the chosen density histogram. For a better of comparison with F-H data [Citation21], the absolute deviations $S_{k}$ are also reported: (8) $S_{k} (φ; \tilde{χ}) = \frac{1}{n} \sum_{i = 1}^{n} | φ_{i}^{k} - {\hat{φ}}_{i}^{k} | .$ (8) In addition, $S_{p}$ , the mean percentage error relative to the average electron density $φ_{0}$ over the full histogram across all species, is reported: (9) $S_{p} (φ; \tilde{χ}) = \frac{1}{φ_{0} n n_{k}} \sum_{k = 1}^{n_{k}} \sum_{i = 1}^{n} | φ_{i}^{k} - {\hat{φ}}_{i}^{k} |,$ (9) To avoid potential cold-start problems, each optimisation run is started with $2 d$ (d being the dimension of the parameter space) randomly sampled points. After the initial random sampling period, new points to be probed are selected according to the maximum of the UCB acquisition function [Citation53] (Figure ). The exploration/exploitation trade-off parameter in the acquisition function is set to $β = 2$ , favouring exploration of the large parameter space. The Gaussian process underlying the BO uses a Matérn covariance kernel [Citation56,Citation57] with smoothing parameter $ν = 5 / 2$ , and a constant zero mean function. In addition, a diagonal white noise kernel is added to account for the noisy sampling.

All $\tilde{χ}$ parameters are constrained to the values used by De Nicola $\pm 10 {kJ mol}^{- 1}$ [Citation21]. Minimising the volume of the $X$ hypercube is important for reducing the computational cost of the optimisation scheme, however the range used must be large enough to include the true optimum. As our fitness (MSE) is locally convex close to the true minimum, an insufficiently large parameter subset search would result in an optima on the edge of $X$ being reported by the BO method, as opposed to an interior point. As no such results were found, the value used was deemed to sufficiently balance both concerns.

3. Results and discussion

3.1. Optimisation of hPF parameters for DPPC

DPPC was used as prototypic test systems to assess the effectiveness of BO for the determination of hPF $\tilde{χ}$ parameters. The choice of such a system was determined both by the presence of a relatively complex chemical structure, and by the existence of a vast reference literature, including experimental [Citation58–60] and computer simulations [Citation61,Citation62], as well as hPF models [Citation20,Citation21].

Figure reports the density profiles for hPF simulations of DPPC after BO of the $\tilde{χ}$ parameters with respect to the mean-squared-error objective function, computed between the hPF and reference MARTINI density profiles (Equation7(7) $η (φ; \tilde{χ}) = \frac{1}{n n_{k}} \sum_{k = 1}^{n_{k}} \sum_{i = 1}^{n} | φ_{i}^{k} - {\hat{φ}}_{i}^{k} |^{2},$ (7) ). The density profiles for all the bead types match well those of the reference, with $S_{k}$ values smaller than $7.3 el . / {nm}^{3}$ for all lipid beads, and with a $S_{p}$ value less than $2 %$ .

Figure 3. Density profiles and representative membrane snapshots from hPF-MD simulations of a DPPC bilayer using ${\tilde{χ}}_{F - H}$ parameters [Citation21] (left), particle-based simulations using the MARTINI CG force field (centre), and hPF-MD ${\tilde{χ}}_{BO}$ parameters (right). The table presents absolute deviations $S_{k}$ in the density profiles between the F-H and BO parameter simulations, and the reference MARTINI profile. Percentage deviations $S_{p}$ are given in parenthesis. $S_{k}$ values are given in el./nm $^{3}$ .

Previously published hPF-MD models for phospholipids are based on the MARTINI CG mapping [Citation55,Citation63,Citation64] and employ a $\tilde{χ}$ matrix based on the F-H model (Equation3(3) ${\tilde{χ}}_{ij} = - z (ϵ_{ij} - \frac{1}{2} (ϵ_{ii} + ϵ_{jj})),$ (3) ). F-H parameters are extracted from the corresponding Lennard-Jones binding energies of the MARTINI force field ( ${\tilde{χ}}_{F - H}$ hereafter [Citation21]). As noted in the original work, using the lateral density profile as the benchmark property, heuristic adjustment of the $\tilde{χ}$ parameter between C and W beads was required to improve the stability and overall structure of the bilayer. Overall, the F-H parameter set produces a satisfactory organisation of the lipid bilayer (Figure ), evidenced by a very good qualitative agreement of the lateral density profiles for the different moieties compared to reference CG simulations using the MARTINI force field. Nonetheless, the hPF/F-H density profiles are characterised by a $S_{p}$ of about 4 to $5 %$ for the different lipids [Citation21], and larger values of $S_{k}$ , reaching a maximum of $20.51 el . / {nm}^{3}$ for the G bead. Comparison of $S_{k}$ and $S_{p}$ values indicates that BO provides a substantial improvement compared to F-H.

Given the use of theoretical models for the derivation of $\tilde{χ}$ , it has been hard so far to discern the origin of any discrepancies from reference data between intrinsic approximations of the hPF method, or the use of non optimal parameter sets. In particular, broader density profiles in lipids were usually understood as a consequence of the intrinsic softness of the field interactions [Citation22]. In fact, using BO parameters, there is an appreciable sharpening of the distributions for all the beads, even though the peaks remain broader than CG simulations based on pair-interactions (Figure ). This is of particular interest, as it demonstrates that indeed, in phospholipids, the F-H parameterisation is accurate enough to capture the physics of the hPF model; nonetheless, there is still space for significant quantitative improvement by a global optimisation approach.

3.2. Feature importance

Data in Figure show how the performance of the F-H parameters is not equal for all the moieties present in the system. In particular, F-H is better at reproducing the distributions of the lipid head and tails, while the density profile of the glycerol groups (G beads) appears too broad. The physical reason for such discrepancy may be attributed to the fact that glycerol floats at the interface between the phase-separated water and lipid fatty tails. Therefore, its distribution depends more than the others on a delicate balance among all the terms in W. This effect may be difficult to reproduce adopting an independent parameterisation of the individual elements of the $\tilde{χ}$ mixing energy matrix. On the contrary, the BO approach appears better suited to take into account all competing interactions, producing more balanced $\tilde{χ}$ values.

The uneven error in the F-H distributions suggests that the hPF model is not equally robust with respect to variations of the different $\tilde{χ}$ terms. To verify this hypothesis, we calculated the correlation (mutual information, MI) between input $\tilde{χ}$ parameters and resulting fitness. The MI between two continuous random variables X and Y with probability density functions (PDF) $f_{X}$ and $f_{Y}$ (and joint PDF $f_{X, Y}$ ) is defined as [Citation65,Citation66] (10) $I (X; Y) = - \int_{X} \int_{Y} d x d y f_{X, Y} (x, y) \log \frac{f_{X, Y} (x, y)}{f_{X} (x) f_{Y} (x)},$ (10) and can be understood as the reduction in uncertainty about the values of Y, once X is revealed. The MI between any input parameter ${\tilde{χ}}_{kj}$ and the resulting fitness $η (ϕ; \tilde{χ})$ , thus yields a measure of the feature importance for the full parameter space.

Figure shows the relative feature importance of the different ${\tilde{χ}}_{ij}$ parameters, as well as optimisation results from BO runs which only include the most important ones. The fitness is here represented by the average coefficient of determination, $R^{2}$ , over the density profiles of all the different beads, (11) $η^{R^{2}} (φ; \tilde{χ}) = \frac{1}{n_{k}} \sum_{k = 1}^{n_{k}} R^{2} (φ^{k}, {\hat{φ}}^{k}) .$ (11) Evidently, a subset of just four parameters carry the majority of the feature importance, meaning optimising only these four, keeping the others at their F-H model value, yields results comparable to the ones obtained after an optimisation over the full 10-dimensional parameter space (Figures and ). The four relevant parameters have a clear physical meaning, as they are the main determinants for the hydrophilic/hydrophobic character of the polar heads and the fatty tails, respectively ( ${\tilde{χ}}_{NW}, {\tilde{χ}}_{CW}$ ), and for the amphipathic behaviour of glycerol ( ${\tilde{χ}}_{GW}, {\tilde{χ}}_{GC}$ ).

Figure 4. Left: Feature importance as ranked by the mutual information measure between the fitness and the individual ${\tilde{χ}}_{ij}$ parameters, for hPF-MD simulations of a DPPC bilayer with randomly sampled $\tilde{χ}$ matrices. Presented values are normalised relative to the most important parameter ( ${\tilde{χ}}_{CW}$ ) (arbitrary units). Inset details the low relative MI values found for the last six matrix elements (error bars omitted). $\tilde{χ}$ parameters are included in order of decreasing feature importance (left). Right: Best fitness achieved (here using the average coefficient of determination, $R^{2}$ , across all bead species) for each dimension of the parameter space subspace used in hPF-MD BO protocol runs on the DPPC bilayer system.

Figure 4. Left: Feature importance as ranked by the mutual information measure between the fitness and the individual χ~ij parameters, for hPF-MD simulations of a DPPC bilayer with randomly sampled χ~ matrices. Presented values are normalised relative to the most important parameter (χ~CW) (arbitrary units). Inset details the low relative MI values found for the last six matrix elements (error bars omitted). χ~ parameters are included in order of decreasing feature importance (left). Right: Best fitness achieved (here using the average coefficient of determination, R2, across all bead species) for each dimension of the parameter space subspace used in hPF-MD BO protocol runs on the DPPC bilayer system.

Figure 5. Top: Density profiles for hPF-MD DPPC bilayer simulations ran with Bayesian optimised parameter sets with four (left) and ten (right) included $\tilde{χ}$ parameters. The four-parameter simulation uses ${\tilde{χ}}_{F - H}$ values for all but the $\tilde{χ}$ matrix elements with the highest feature importance, namely ${\tilde{χ}}_{NW}$ , ${\tilde{χ}}_{CW}$ , ${\tilde{χ}}_{GW}$ , and ${\tilde{χ}}_{GC}$ , c.f. column three of the table (bottom). Bottom: Resulting $\tilde{χ}$ matrices from the BO protocol applied to hPF-MD simulations of a DPPC bilayer. Results reported for selected subspaces of the full 10-dimensional parameter space, with the ${\tilde{χ}}_{ij}$ s shown in red being fixed and not part of the optimisation run. All $\tilde{χ}$ values given in ${kJmol}^{- 1}$ . Mean percentage errors, $S_{p}$ , associated with each set of optimised parameters is given in the last row.

3.3. Transferability of BO-hPF parameters

Table reports the parameter sets obtained by BO for DPPC compared to those obtained for two other saturated phospholipids differing in the length of fatty acid chains (DSPC, DMPC), and one unsaturated lipid (DOPC). Overall, the most relevant four $\tilde{χ}$ matrix elements do not differ significantly from DPPC to DSPC. The less hydrophobic character of the C bead in DOPC may be attributed to the presence of the unsaturated moiety. We remark that for sake of simplicity, the C=C bond was represented by a different bead type, consistent with the MARTINI mapping, and all $\tilde{χ}$ parameters involving that were kept at the reference F-H values [Citation21].

Table 1. Optimised $\tilde{χ}$ -matrix parameters found by the BO scheme for hPF-MD simulations of DPPC, DMPC, DSPC, and DOPC bilayer systems. The $\tilde{χ}$ matrix elements are given in order of decreasing feature importance. Reference parameters are the Flory-Huggins ( ${\tilde{χ}}_{F - H}$ ) parameters used in [Citation21]. All values given in units of ${kJ mol}^{- 1}$ .

Display Table

The transferability of the obtained data sets is tested by performing hPF simulations for a lipid using parameters optimised on other structures. The absolute error on the density profiles obtained exchanging $\tilde{χ}$ values are presented in Table , and show how the global structure of the bilayers remain mostly unaffected, with relatively small changes in $S_{k}$ and $S_{p}$ values, which remain systematically lower than those of the F-H parameterisation. This fact indicates that the BO protocol is able to find robust data-sets for chemically similar moieties, also ensuring very good transferability.

Table 2. Mean absolute deviations in electron density, $S_{k}$ , with respect to the MARTINI reference density for the different lipids simulated with Bayesian optimised parameter sets on the different phospholipids (relative percentage deviations $S_{p}$ in parenthesis). Comparison with data from De Nicola, using the baseline ${\tilde{χ}}_{F - H}$ parameter set [Citation21]. All $S_{k}$ values given $el . / {nm}^{3}$ .

Display Table

3.4. Robustness of BO-hPF procedure

Large multidimensional parameter spaces often exhibit multiple locally optimal parameter sets or flat fitness surfaces that can hinder convergence towards the globally optimal parameter set. Figure shows one such example for the hPF parameters, with a projection of the fitness (estimated by the surrogate fitting function) in terms of ${\tilde{χ}}_{CW}$ and ${\tilde{χ}}_{GC}$ . The plot exhibits a narrow region of unacceptable values, and a relatively large flat plateau of high score, were the determination of the position of the maximum is numerically non trivial, and may lead to multiple solutions.

Figure 6. Top left: The surrogate objective fitness surface (here using the average coefficient of determination, $R^{2}$ , across all bead species) in an example DPPC BO run with only the four parameters exhibiting the highest feature importance scores included ( ${\tilde{χ}}_{CW}$ , ${\tilde{χ}}_{GC}$ , ${\tilde{χ}}_{NW}$ , and ${\tilde{χ}}_{GW}$ ). Individual samplings with their associated fitnesses are represented as blue dots. A projection onto the subspace spanned by ${\tilde{χ}}_{CW}$ and ${\tilde{χ}}_{GC}$ shown. All $\tilde{χ}$ matrix elements are given in ${kJ mol}^{- 1}$ . Top right: Best DPPC simulation membrane fitness (average $R^{2}$ ) for BO and random sampling with only the four parameters exhibiting the highest feature importance scores included. Comparison with the fitness achieved by the reference ${\tilde{χ}}_{F - H}$ parameter set. Inset details when BO and random sampling surpass the ${\tilde{χ}}_{F - H}$ parameter set in terms of $R^{2}$ fitness. Bottom: Scatter matrices showing correlations between all pairs of $\tilde{χ}$ parameters in a BO run on a DPPC bilayer (left) compared with random sampling (right). Only the four parameters exhibiting the highest feature importance scores are included in the sampling. The matrix diagonal shows the density of sampled points for each individual ${\tilde{χ}}_{ij}$ parameter. All $\tilde{χ}$ matrix elements are given in ${kJ mol}^{- 1}$ .

However, our tests on transferability across lipid species do not indicate such problems for the BO-hPF procedure, finding instead systematically consistent parameter sets for the different lipid species. Moreover, as shown in Figure , BO converges steadily, and outperforms random sampling protocols in finding the optimal solution, even as both schemes improve upon ${\tilde{χ}}_{F - H}$ after only a handful of iterations. In particular, after a few efficient initial steps, random sampling is not able to converge toward the best solution, and remains confined in the large basin comprising of very different, not fully optimised, combinations of parameters. This is in agreement with results reported in the literature for BO applied to toy model functions [Citation48], and such diverse fields as e.g. chemical design [Citation67], active learning [Citation68], robotics [Citation69,Citation70], and machine learning [Citation71]. The faster and more robust convergence of BO is determined by the intrinsic ability of the algorithm to learn what region of the space is more relevant to sample, disregarding other less relevant regions (Figure ).

4. Concluding remarks

In this work, we proposed a protocol to determine accurate potentials for hPF simulations, using BO as the main driver for the optimisation of the free parameters. Our scheme requires the definition of an arbitrary fitting function based on any set of relevant observables to be learned. The quantities of relevance may come from experimental data or from more accurate higher-resolution benchmark simulations (for example all-atom or CG), the only requirement being that the pertinent quantities can be straightforwardly estimated with a hPF model.

Using DPPC, DMPC, DSPC, and DOPC phospholipid bilayers as test systems, we showed how this procedure determines sets of parameters for the interaction energy that significantly improve the models present in the literature based on F-H theory. The new Bayesian-optimised potentials also show excellent transferability among chemically similar moieties.

Despite being more complex than F-H, the BO procedure here introduced offers various advantages. First, the procedure does not require the estimate of two-body interaction energies, which may be difficult to determine with good accuracy, for example, in the absence of CG models compatible with the mapping employed in the hPF simulations. Second, the protocol is very general, and can thus be used to concomitantly optimise the mixing terms of the interaction energy ( $\tilde{χ}$ ) and any other parameter of relevance present in other parts of the energy functional. This is particularly interesting in the view of recent advances for hPF model potentials, which include, for example, specific potentials for peptides, for electrostatics [Citation22–24], or for surface energy terms [Citation29,Citation30]. Finally, being an automatic procedure, BO does not require user-based fine tuning of the parameters, ensuring a more systematic and reproducible determination of the potentials, especially for chemically complex systems.

BO is robust in determining physically meaningful parameters despite the relatively large variable space. This is due to the ability of BO to restrain the search only in a sub-region of the space where the physical solution is contained. Nonetheless, this evidence cannot be assumed as general, and it cannot be excluded that BO of hPF parameters over even higher-dimensional variable spaces would lead to numerical ambiguities. In this respect, we may suggest that the best strategy for the optimisation of hPF parameters implies the formulation of an adequate Ansatz, for example using the F-H method, that would be used as a starting point for the optimisation. In this work, we showed how feature importance can be applied to the BO procedure to identify on-the-fly those parameters that are not relevant for the convergence to the best solution, and which can be thus dropped out of the optimisation protocol. In this way, full BO optimisation can be performed only on a subset of relevant parameters, keeping all the others at (or in the neighbourhood of) their initial F-H values. In case the F-H parameters cannot be determined, or the parameter space is intrinsically too large, we foresee the possibility of introducing penalty terms to the fitting function, similarly to those used in other optimisation procedures like RESP [Citation72], even though this has not been explored in this work.

In conclusion, the establishment of an automated machine-learning procedure for the optimisation of hPF parameters promises to further expand the applicability of this powerful simulation method toward increasingly chemically complex systems.

Acknowledgments

The authors thank Antonio De Nicola for providing topology and structure files for the lipid bilayer systems.

Data availability statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

The authors acknowledge the support of the Norges Forskningsråd (Norwegian Research Council) through the CoE Hylleraas Centre for Quantum Molecular Sciences [grant number P262695], the Norwegian Supercomputing Program (NOTUR) [grant number NN4654K], and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), [project number 233630050 - TRR 146].

Notes

1 http://www.occammd.org/

References

K.C. Daoulas and M. Müller, J. Chem. Phys. 125 (18), 184904 (2006).
PubMedGoogle Scholar
M. Müller, J. Stat. Phys. 145 (4), 967–1016 (2011).
Web of Science ®Google Scholar
G. Milano and T. Kawakatsu, J. Chem. Phys. 130 (21), 214106 (2009).
PubMedGoogle Scholar
G.G. Vogiatzis, G. Megariotis and D.N. Theodorou, Macromolecules 50 (7), 3004–3029 (2017).
Web of Science ®Google Scholar
F. Schmid, J. Phys.: Condens. Matter 10 (37), 8105 (1998).
Web of Science ®Google Scholar
T. Kawakatsu, Statistical Physics of Polymers: An Introduction (Springer Science & Business Media, Berlin, 2013).
Google Scholar
G. Fredrickson, The Equilibrium Theory of Inhomogeneous Polymers, Vol. 134 (Oxford University Press, Oxford, 2006).
Google Scholar
Y. Zhao, A. De Nicola, T. Kawakatsu and G. Milano, J. Comput. Chem. 33 (8), 868–880 (2012).
PubMed Web of Science ®Google Scholar
L. Schneider and M. Müller, Comput. Phys. Commun. 235, 463–476 (2019).
Web of Science ®Google Scholar
Y.L. Zhu, H. Liu, Z.W. Li, H.J. Qian, G. Milano and Z.Y. Lu, J. Comput. Chem. 34 (25), 2197–2211 (2013).
PubMed Web of Science ®Google Scholar
G. Milano, T. Kawakatsu and A. De Nicola, Phys. Biol. 10 (4), 045007 (2013).
PubMedGoogle Scholar
T.A. Soares, S. Vanni, G. Milano and M. Cascella, J. Phys. Chem. Lett. 8 (15), 3586–3594 (2017).
PubMed Web of Science ®Google Scholar
M. Cascella and S. Vanni, in Chemical Modelling: Applications and Theory, edited by Michael Springborg and Jan-Ole Joswig (Royal Society of Chemistry, London, 2015), Vol. 12, Chap. 1, pp. 1–52.
Google Scholar
S.J. Marrink, V. Corradi, P.C.T. Souza, H.I. Ingólfsson, D.P. Tieleman and M.S.P. Sansom, Chem. Rev. 119 (9), 6184–6226 (2019).
PubMed Web of Science ®Google Scholar
S.L. Bore, G. Milano and M. Cascella, J. Chem. Theory Comput. 14 (2), 1120–1130 (2018).
PubMed Web of Science ®Google Scholar
A. De Nicola, T. Kawakatsu, F. Müller-Plathe and G. Milano, Eur. Phys. J. Spec. Top. 225 (8-9), 1817–1841 (2016).
Web of Science ®Google Scholar
Y. Zhao, M. Byshkin, Y. Cong, T. Kawakatsu, L. Guadagno, A. De Nicola, N. Yu, G. Milano and B. Dong, Nanoscale 8 (34), 15538–15552 (2016).
PubMed Web of Science ®Google Scholar
G. Munaò, A. Pizzirusso, A. Kalogirou, A. De Nicola, T. Kawakatsu, F. Müller-Plathe and G. Milano, Nanoscale 10 (46), 21656–21670 (2018).
PubMed Web of Science ®Google Scholar
G. Munaò, A. De Nicola, F. Müller-Plathe, T. Kawakatsu, A. Kalogirou and G. Milano, Macromolecules 52 (22), 8826–8839 (2019).
Web of Science ®Google Scholar
A. De Nicola, Y. Zhao, T. Kawakatsu, D. Roccatano and G. Milano, Theor. Chem. Acc. 131 (3), 1167 (2012).
Google Scholar
A. De Nicola, Y. Zhao, T. Kawakatsu, D. Roccatano and G. Milano, J. Chem. Theory Comput. 7 (9), 2947–2962 (2011).
PubMed Web of Science ®Google Scholar
Y.L. Zhu, Z.Y. Lu, G. Milano, A.C. Shi and Z.Y. Sun, Phys. Chem. Chem. Phys. 18 (14), 9799–9808 (2016).
PubMed Web of Science ®Google Scholar
H.B. Kolli, A. De Nicola, S.L. Bore, K. Schäfer, G. Diezemann, J. Gauss, T. Kawakatsu, Z.Y. Lu, Y.L. Zhu, G. Milano and M. Cascella, J. Chem. Theory Comput. 14 (9), 4928–4937 (2018).
PubMed Web of Science ®Google Scholar
S.L. Bore, H.B. Kolli, T. Kawakatsu, G. Milano and M. Cascella, J. Chem. Theory Comput. 15 (3), 2033–2041 (2019).
PubMed Web of Science ®Google Scholar
A. De Nicola, T.A. Soares, D.E.S. Santos, S.L. Bore, G.J.A. Sevink, M. Cascella and G. Milano, Biochim. Biophys. Acta 0, 129570 (2020).
Google Scholar
J. Brandrup, E.H. Immergut, E.A. Grulke, A. Abe and D.R. Bloch, Polymer Handbook, Vol. 7 (John Wiley & Sons, New York, 1989).
Google Scholar
T. Lindvig, M.L. Michelsen and G.M. Kontogeorgis, Fluid Ph. Equilibria 203 (1-2), 247–260 (2002).
Web of Science ®Google Scholar
S. Venkatram, C. Kim, A. Chandrasekaran and R. Ramprasad, J. Chem. Inf. Model. 59 (10), 4188–4194 (2019).
PubMed Web of Science ®Google Scholar
A.P. Sgouros, A.T. Lakkas, G. Megariotis and D.N. Theodorou, Macromolecules 51 (23), 9798–9815 (2018).
Web of Science ®Google Scholar
S.L. Bore, H.B. Kolli, A. De Nicola, M. Byshkin, T. Kawakatsu, G. Milano and M. Cascella, J. Chem. Phys. 152 (18), 184908 (2020).
PubMedGoogle Scholar
S. Izvekov and G.A. Voth, J. Phys. Chem. B 109 (7), 2469–2473 (2005).
PubMed Web of Science ®Google Scholar
D. Reith, M. Pütz and F. Müller-Plathe, J. Comput. Chem. 24 (13), 1624–1636 (2003).
PubMed Web of Science ®Google Scholar
A.P. Lyubartsev, in Coarse-Grained Modeling of Biomolecules, edited by Garegin A. Papoian (CRC Press, Boca Raton, 2017), Chap. 1, pp. 29–54.
Google Scholar
S. Amaran, N.V. Sahinidis, B. Sharda and S.J. Bury, Ann. Oper. Res. 240 (1), 351–380 (2016).
Web of Science ®Google Scholar
R.H. Myers, D.C. Montgomery and C.M. Anderson-Cook, Response Surface Methodology: Process and Product Optimization Using Designed Experiments (John Wiley & Sons, New York, 2016).
Google Scholar
S.A. Piyavskii, USSR Comput. Math. & Math. Phys. 12 (4), 57–67 (1972).
Google Scholar
B.O. Shubert, SIAM J. Numer. Anal. 9 (3), 379–388 (1972).
Web of Science ®Google Scholar
K.H. Chang, L.J. Hong and H. Wan, Informs. J. Comput. 25 (2), 230–243 (2013).
Web of Science ®Google Scholar
J. Mockus, V. Tiesis and A. Zilinskas, The Application of Bayesian Methods for Seeking the Extremum, Vol. 2 (North-Holand, Amsterdam, 1978), pp. 117–129.
Google Scholar
H.J. Kushner, J. Math. Anal. Appl 5 (1), 150–167 (1962).
Google Scholar
M. Pelikan, D.E. Goldberg and E. Cantú-Paz, Proceedings of the 1st Annual Conference on Genetic and Evolutionary Computation, Vol. 1 (Morgan Kaufmann Publishers Inc., Burlington, 1999), pp. 525–532.
Google Scholar
D. Whitley, Stat. Comput. 4 (2), 65–85 (1994).
Web of Science ®Google Scholar
C.R. Reeves, INFORMS J. Comput. 9 (3), 231–250 (1997).
Google Scholar
S. Kirkpatrick, C.D. Gelatt and M.P. Vecchi, Science 220 (4598), 671–680 (1983).
PubMed Web of Science ®Google Scholar
D. Bertsimas and J. Tsitsiklis, Stat. Sci. 8 (1), 10–15 (1993).
Web of Science ®Google Scholar
T.M. Alkhamis, M.A. Ahmed and V.K. Tuan, Eur. J. Oper. Res. 116 (3), 530–544 (1999).
Web of Science ®Google Scholar
M.D. McKay, R.J. Beckman and W.J. Conover, Technometrics 21 (2), 239–245 (1979).
Web of Science ®Google Scholar
J. Snoek, H. Larochelle and R.P. Adams, in Advances in Neural Information Processing Systems 25, edited by F. Pereira, C.J.C. Burges, L. Bottou and K.Q. Weinberger (Curran Associates, Inc., Red Hook, 2012), pp. 2951–2959.
Google Scholar
D.R. Jones, J. Global Optim. 21 (4), 345–383 (2001).
Web of Science ®Google Scholar
J. Azimi, A. Jalali and X. Fern, preprint, arXiv:1202.5597 (2012)..
Google Scholar
J.S. Bergstra, R. Bardenet, Y. Bengio and B. Kégl, in Advances in Neural Information Processing Systems 24, edited by J. Shawe-Taylor, R.S. Zemel, P.L. Bartlett, F. Pereira and K.Q. Weinberger (Curran Associates, Inc., Red Hook, 2011), pp. 2546–2554.
Google Scholar
K. Swersky, J. Snoek and R.P. Adams, in Advances in Neural Information Processing Systems 26, edited by C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K.Q. Weinberger (Curran Associates, Inc., Red Hook, 2013), pp. 2004–2012.
Google Scholar
N. Srinivas, A. Krause, S.M. Kakade and M. Seeger, preprint, arXiv:0912.3995 (2009).
Google Scholar
B. Shahriari, K. Swersky, Z. Wang, R.P. Adams and N. de Freitas, Proc. IEEE 104 (1), 148–175 (2015).
Web of Science ®Google Scholar
T.A. Wassenaar, H.I. Ingólfsson, R.A. Böckmann, D.P. Tieleman and S.J. Marrink, J. Chem. Theory Comput. 11 (5), 2144–2155 (2015).
PubMed Web of Science ®Google Scholar
B. Matérn, Spatial Variation, Vol. 36 (Springer Science & Business Media, Berlin, 2013).
Google Scholar
M.L. Stein, Interpolation of Spatial Data: Some Theory for Kriging (Springer Science & Business Media, Berlin, 2012).
Google Scholar
H.I. Petrache, S.W. Dodd and M.F. Brown, Biophys. J. 79 (6), 3172–3192 (2000).
PubMed Web of Science ®Google Scholar
Q. Waheed and O. Edholm, Biophys. J. 97 (10), 2754–2760 (2009).
PubMed Web of Science ®Google Scholar
J.F. Nagle and S. Tristram-Nagle, Biochim. Biophys. Acta, Biomembr. 1469 (3), 159–195 (2000).
PubMed Web of Science ®Google Scholar
E. Lindahl and O. Edholm, J. Chem. Phys. 113 (9), 3882–3893 (2000).
Web of Science ®Google Scholar
Z.A. Levine, R.M. Venable, M.C. Watson, M.G. Lerner, J.E. Shea, R.W. Pastor and F.L. Brown, J. Am. Chem. Soc. 136 (39), 13582–13585 (2014).
PubMed Web of Science ®Google Scholar
S.J. Marrink, A.H. de Vries and A.E. Mark, J. Phys. Chem. B 108 (2), 750–760 (2004).
Web of Science ®Google Scholar
S.J. Marrink, H.J. Risselada, S. Yefimov, D.P. Tieleman and A.H. de Vries, J. Phys. Chem. B 111 (27), 7812–7824 (2007).
PubMed Web of Science ®Google Scholar
C.E. Shannon, Bell Syst. Tech. J. 27 (3), 379–423 (1948).
Google Scholar
B. Frénay, G. Doquire and M. Verleysen, Neural Netw. 48, 1–7 (2013).
PubMed Web of Science ®Google Scholar
R.R. Griffiths and J.M. Hernández-Lobato, preprint, arXiv:1709.05501 (2017).
Google Scholar
E. Brochu, V.M. Cora and N. De Freitas, preprint, arXiv:1012.2599 (2010).
Google Scholar
R. Calandra, A. Seyfarth, J. Peters and M.P. Deisenroth, Ann. Math. Artif. Intel. 76 (1-2), 5–23 (2016).
Web of Science ®Google Scholar
D.J. Lizotte, T. Wang, M.H. Bowling and D. Schuurmans, in IJCAI, Vol. 7 (Institute of Electrical and Electronics Engineers, Piscataway, 2007), pp. 944–949.
Google Scholar
A. Klein, S. Falkner, S. Bartels, P. Hennig and F. Hutter, Electron. J. Stat. 11 (2), 4945–4968 (2017).
Web of Science ®Google Scholar
C.I. Bayly, P. Cieplak, W. Cornell and P.A. Kollman, J. Phys. Chem. 97 (40), 10269–10280 (1993).
Web of Science ®Google Scholar

Automated determination of hybrid particle-field parameters by machine learning

Abstract

GRAPHICAL ABSTRACT

1. Introduction