Editorial

The current impact of water thermodynamics for small-molecule drug discovery

Pages 1221-1225 | Received 28 Jun 2019, Accepted 03 Sep 2019, Published online: 10 Sep 2019

1. Introduction

The affinity of drug molecules for their respective molecular targets can be described by the thermodynamic properties of the molecular interaction. These properties typically reflect the totality of all underlying molecular recognition processes during ligand binding and are defined by a complex mix of many contributions. This is a direct consequence of binding not following a simple rigid two-body lock-and-key interaction model but rather proceeding through a highly dynamic process that includes changes in protein and/or ligand flexibility as well as solvation and desolvation effects. Notwithstanding this known complexity, consulting thermodynamic signatures has been proposed as a means to select and/or design drug molecules that offer an accelerated path for further development [Citation1]. Whilst initially regarded as a more advanced concept for increasing the success of prospective drug design, this approach has lately come under scrutiny because it has not delivered the envisioned impact in drug discovery [Citation2]. A major factor that has limited the utility of thermodynamic signatures is their strong and unpredictable modulation by the local water structure. Although this poses a challenge, it also represents a large untapped potential for drug discovery, as we have lately gained much more thermodynamic and structural insight into the role of water molecules in ligand binding.

2. The impact of water on the modulation of thermodynamic ligand binding signatures

Binding thermodynamics always needs to consider the entire path of a highly dynamic binding process, starting with the separated pair of protein and ligand and ending with the intimately formed protein-ligand complex. This path consists of many simultaneous as well as sequential steps that ultimately define the overall thermodynamic binding signature. Consequently, experimentally observed binding enthalpy values, which can be extracted, for example, from rigorously performed isothermal titration calorimetry (ITC) experiments, tend to contain a mix of many positive and negative thermodynamic contributions that are strongly modulated by changes in the solvent inventory [Citation3]. In this context it is important to realize that the classical view of the binding entropy being dominated by desolvation effects (a view frequently referred to as ‘the hydrophobic effect’) is far too simplistic, since experimental data in particular have shown that solvent displacement and binding in hydrophobic cavities can in some instances be strongly enthalpically driven. This implies that changes in the water structure can frequently remain undetected, as there is no instantaneous and experimentally observable thermodynamic signature that can be derived from ITC or surface plasmon resonance (SPR) experiments and readily attributed to, e.g., the removal of solvent molecules from hydrophobic binding surfaces. This is predominantly a consequence of compensatory thermodynamic processes, as exemplified in Figure 1. In the presented case of two closely related and equipotent Melagatran analogues, significant changes in the solvent inventory upon ligand binding remain undetected because of compensatory thermodynamic contributions that originate from simultaneous changes in hydrogen bonding and binding site conformation [Citation2].
Nevertheless, solvent changes contribute significantly to the overall thermodynamic footprint of binding, and although their contribution to the affinity of the molecular interaction tends to be minor, they cannot be neglected when trying to optimize small-molecule binding to drug targets. This already becomes obvious when taking a closer look at how low-molecular-weight fragments tend to bind to a large variety of target proteins representing different types of binding pockets.

Figure 1. Relationship between changes in the solvent inventory and the observed binding enthalpy ΔHobs for two closely related and equipotent Melagatran analogues in complex with thrombin. (a) The carboxy-amide of the first analogue forms hydrogen bonds with the backbone amide of Gly219 and the Glu192 side chain, which is further mediated by a well-defined water molecule. (b) Switching to a tertiary dimethyl-amide analogue removes the ability to form any hydrogen bond with Glu192, forcing this side chain into a different conformation, and an additional water molecule is observed. The totality of changes in the hydrogen bonding between ligand and protein, together with the associated structural changes within the binding site, masks the significant changes in the solvent inventory, as reflected in an almost unchanged ΔHobs. Reprinted with permission from [Citation2]. Copyright 2019 American Chemical Society.

3. The influence of water on the binding properties of fragment hits

A recent report systematically analyzed the molecular and binding properties of fragment hits by thoroughly investigating 489 published protein–fragment complexes [Citation4]. This analysis revealed several common features, including preferences in buried surface area upon binding, hydrogen bonding, and other directional interactions with the protein targets. Almost three quarters of all fragment hits bury more than 80% of their total solvent-accessible surface area upon binding. This has strong consequences for the overall solvent inventory, as fragments are typically quite polar owing to the frequent presence of atoms that can either accept or donate hydrogen bonds. In fact, the analysis reveals that the stabilization of polar interactions is a recurring theme for most of the fragment–protein complexes. These interactions also include H-bonds to structural water molecules, stressing the important role of water in the binding of fragments to proteins. When examining high-resolution protein structures that allow for a proper assessment of structural water molecules, almost half of the fragment hits establish at least one hydrogen bond to solvent molecules that reside within the binding pocket. These solvent molecules are often part of extended nonbonded interaction networks, thereby offering different interaction hotspots. In some cases, these interactions represent the only polar interaction of the fragment molecule within the binding pocket, which presents an additional limitation for the characterization and prediction of fragment binding using computational approaches. It becomes clear from these findings, however, that the solvent inventory needs to be carefully considered when assessing drug design opportunities that aim to improve small-molecule binding.
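
The buried-surface statistic quoted above can be made concrete with a small calculation. The following is a minimal sketch only: the function names and the surface-area values are hypothetical and are not taken from the cited analysis, and it assumes that the solvent-accessible surface area (SASA) of the fragment has already been computed for the free and the bound state with a tool of the reader's choice.

```python
def buried_fraction(sasa_free: float, sasa_bound: float) -> float:
    """Fraction of a fragment's SASA that becomes buried upon binding.

    sasa_free  -- SASA of the isolated fragment (e.g. in square angstroms)
    sasa_bound -- SASA of the same fragment inside the complex
    """
    if sasa_free <= 0:
        raise ValueError("sasa_free must be positive")
    if not 0 <= sasa_bound <= sasa_free:
        raise ValueError("sasa_bound must lie between 0 and sasa_free")
    return (sasa_free - sasa_bound) / sasa_free


def share_above_threshold(pairs, threshold=0.8):
    """Share of fragments whose buried fraction exceeds the threshold."""
    fractions = [buried_fraction(free, bound) for free, bound in pairs]
    return sum(f > threshold for f in fractions) / len(fractions)


# Hypothetical (free, bound) SASA values for four fragments.
example_pairs = [(300.0, 45.0), (280.0, 84.0), (250.0, 25.0), (320.0, 40.0)]
print(share_above_threshold(example_pairs))
```

A fragment that buries most of its surface necessarily displaces or reorganizes most of its own hydration shell, which is why this simple ratio is a useful first indicator of how strongly the solvent inventory is perturbed.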

4. The opportunity to exploit water for improved drug design

It is now increasingly recognized that a thermodynamic signature is not an endpoint that can be used directly as an optimization parameter in drug design [Citation2,Citation3]. But what about changes between series, or small changes during optimization for the same target, where the structure and thus the solvent inventory are only perturbed? Combined computational and experimental investigations of closely related compounds under the same conditions can in fact provide important information to guide both compound and experimental design [Citation5–Citation8]. One of the most commonly applied design optimization parameters is LLE (lipophilic ligand efficiency) [Citation9]. It has been suggested that optimizing for LLE roughly corresponds to optimizing for enthalpically dominated thermodynamic signatures [Citation10]. Theoretical considerations as well as a comparison of the experimental thermodynamic signature and LLE for congeneric sets of ligands show, however, that optimization for LLE and optimization for binding enthalpy are intrinsically different [Citation2]. Whereas LLE has a physical connection with ligand binding specificity and promiscuity, there is no such connection for the enthalpic signature, which is related to the complexity of water thermodynamics.
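
For readers less familiar with the metric, LLE is commonly computed as the difference between potency on a logarithmic scale and lipophilicity. The sketch below assumes the widely used form LLE = pIC50 − logP (logD is often substituted for logP); the text above does not spell out the formula, and the function names are illustrative.

```python
import math


def pic50_from_nanomolar(ic50_nm: float) -> float:
    """Convert an IC50 given in nM to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def lle(pic50: float, logp: float) -> float:
    """Lipophilic ligand efficiency, assumed here as pIC50 - logP."""
    return pic50 - logp


# Hypothetical compound: IC50 = 10 nM, logP = 3.0
print(lle(pic50_from_nanomolar(10.0), 3.0))
```

Note that nothing in this expression refers to the binding enthalpy: LLE is built from an affinity and a partition coefficient only, which is consistent with the point made above that optimizing LLE and optimizing enthalpy are different exercises.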

We have already seen that water can have a dominant effect on the overall thermodynamic signature of binding. Water molecules can engage in very specific interactions that promote protein-ligand recognition, as seen with fragments, but a large portion of the thermodynamic signal may stem from processes that do not take place in direct proximity to the ligand. The many roles played by water in biological systems are discussed in a recent comprehensive review by Spyrakis et al. [Citation8]. They introduce the terms ‘hot’ and ‘cold’ waters to distinguish between ‘cold’ water molecules, which are tightly bound, shape active sites and mediate interactions, and ‘hot’ molecules, which are easily displaced by ligands or protein conformational changes. They point out that water-mediated interactions often display substantial entropy-enthalpy compensation that has small effects on the affinity but large effects on the thermodynamic signature. This disconnect significantly increases the need for a more rigorous treatment of solvent effects in rational drug design. In this context, simple docking studies may enable a reasonable ranking of the affinity of compounds, but the structural information in the docking poses may be incorrect and hence misleading.
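
The disconnect between affinity and thermodynamic signature follows directly from the standard relations ΔG = RT ln(Kd) and ΔG = ΔH − TΔS. The following minimal sketch uses two hypothetical ligands with identical Kd but different measured enthalpies; the numerical values are invented for illustration and do not come from any of the cited studies.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def delta_g_from_kd(kd_molar: float, temperature: float = 298.15) -> float:
    """Standard binding free energy in kJ/mol from a dissociation constant."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KJ * temperature * math.log(kd_molar)


def minus_t_delta_s(delta_g: float, delta_h: float) -> float:
    """Entropic term -T*dS obtained by rearranging dG = dH - T*dS."""
    return delta_g - delta_h


# Two hypothetical equipotent ligands (Kd = 100 nM) with different enthalpies.
dg = delta_g_from_kd(100e-9)
for name, dh in [("ligand A", -60.0), ("ligand B", -20.0)]:
    print(name, round(dg, 2), dh, round(minus_t_delta_s(dg, dh), 2))
```

Both ligands have the same ΔG, yet a 40 kJ/mol difference in ΔH is offset exactly by the entropic term. An affinity ranking would therefore see no difference between them, while their thermodynamic signatures differ substantially.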

5. Theoretical approaches and tools for exploiting water in drug discovery

To impact molecular design, it is imperative to have interpretable signals. Computational methods allow us to study details that are not available experimentally and to make quantitative estimates for individual parts of a system. Experimental methods such as SPR and ITC can provide the free energy of binding as well as kinetic and thermodynamic data. Computational methods make it possible to generate spatially resolved full free energy profiles of ligand binding. An example is the case of a water molecule that is encapsulated by the ligand and released upon binding. Here, atoms are restrained to representative conformers taken from a molecular dynamics (MD) trajectory or to hydrogen-hydrogen distance restraints obtained from nuclear magnetic resonance (NMR) experiments, and selected conformations are used to investigate solvation properties. In that study, the torsional distribution around the S-N bond of sulfonamides was complemented by NMR-derived distance restraints for the ligand conformations used for estimation of solvent thermodynamics [Citation11]. Investigation of a pair of diastereomeric ligands was used to estimate differences in conformational and solvation entropy [Citation12]. Another example is the switching of partial charges of individual water molecules to study the effect of water hydrogen bond networks [Citation13]. Water networks that are not directly involved in ligand interactions can have profound effects on both affinity and kinetics, and a single water molecule may also alter the thermodynamic signature [Citation6,Citation13,Citation14]. A recent systematic study of the effect of ligand rigidification for a series of Fasudil derivatives highlights that, in the general case, specific ligand-water interactions in the free solvated state must also be considered [Citation11]. The study also exemplifies the complexity of the impact of water on the overall thermodynamic profile.
The authors studied the thermodynamic signatures for a set of flexible/rigid pairs while keeping the number of heteroatoms constant. The compounds had roughly the same affinity but drastically different thermodynamic signatures. It is often assumed that rigidification of a flexible ligand will lead to a decrease in the entropic penalty when binding to a protein. Contrary to this common perception, they found that the most flexible ligands had the most favorable entropic signals. The underlying cause was that the unbound flexible ligand strongly coordinates water molecules in solution. The main entropic signal originated from the release of these water molecules from the ligand upon binding and not necessarily from protein-bound solvent molecules, although a more efficient displacement of protein-bound solvent as a result of increased ligand flexibility and improved binding site adaptability should not be fully excluded. The effect would thus not be visible in most standard studies, which tend to focus entirely on the bound protein/ligand/water complex. This effect is different from the one observed in the study by Li and Gilson [Citation15], where more flexible ligands displayed a larger entropy contribution to binding that was dominated by conformational effects on protein residues.

MD simulations can be used to investigate water networks [Citation16]. Using such simulations in conjunction with high-resolution crystallography and SPR, the importance of water networks was studied in detail. It was found that inhibitors forming the most perfect water networks showed significantly prolonged residence times [Citation6]. Simpler methods can be used to find enthalpically advantageous interactions; conserved waters will typically correlate well with these. A recent study compared four commercially available solvent mapping tools (SZMAP, WaterFLAP, 3D-RISM, and WaterMap) on three different targets [Citation17]. The authors concluded that WaterMap, the only simulation-based approach, did provide somewhat enhanced accuracy over the grid-based methods, but that all methods could be used to improve on the predictions from docking alone.

The limits and accuracy of molecular simulations of free energies depend on the ability to sample relevant physical configurations and on the underlying force fields. Biologically relevant systems will typically contain large molecules with many internal degrees of freedom as well as many solvent molecules. Sampling of all possible configurations/conformations is simply intractable for many systems. One way of trying to work around this is to perform free energy perturbation (FEP) simulations, where one investigates differences in estimated free energy between closely related systems, e.g. small incremental changes in ligands. In the absence of experimental data on individual water positions, the calibration of the employed force fields is very challenging. Different methods and force fields can also give vastly different estimates for the same water in the same crystal structure [Citation8]. A recent study investigated why the computed free energies of protein folding are so sensitive to the water models used [Citation18]. The authors studied some of the most common models and found that the largest contributions to discrepancies between different water models in computed protein folding free energy landscapes came from the water–water interactions. They claim that the difference is primarily due to long-range electrostatic interactions and not to van der Waals interactions. They further point out that the contribution from solvent–solvent interactions to the total protein folding free energy is much larger than the folding free energy itself. Thus, small errors in the force field may accumulate. The effect of this will be highly dependent on the system studied. For systems suitable for FEP simulations (e.g. small changes in ligands and small conformational changes) this may be of lesser importance. For protein–protein interactions or large conformational changes, this effect is expected to be more important.
Even for seemingly small perturbations, however, water interactions can respond collectively through distributed networks. One way of determining errors that arise from incomplete sampling is to examine hysteresis. A recent study observed large hysteresis in FEP calculations, with a strong dependence on the initial solvation states [Citation19]. We are currently not aware of any simple rules that tell a scientist a priori whether a system is simple enough. The question remains how accurate a model one can afford. A simulation of a solvated protein-ligand system with explicit water molecules may include an order of magnitude more water atoms than protein atoms. Thus, most of the time in a simulation can be spent on simulating water. It has also been pointed out that many of the widely used water models reproduce the bulk properties of water poorly. A recent study utilized machine learning to parametrize coarse-grained models to fit the structural and mesoscopic thermodynamic anomalies of water and ice [Citation20]. These simpler models are significantly more tractable computationally. At the same time, models parameterized on mesoscopic properties may not necessarily be relevant on the atomic scale [Citation21]. Force field optimization is also vulnerable to error cancellation effects. In their comprehensive review, Onufriev and Izadi discuss the remits of various water models and approximations relevant for the simulation of biomolecules [Citation22]. One of their take-home messages is that there is probably not going to be one water model that rules them all; instead, one will have to weigh the pros and cons and select the model and approach that best address the question at hand.
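
The idea of using hysteresis as a convergence check can be illustrated with the textbook exponential-averaging (Zwanzig) estimator for a single perturbation step. This is a minimal sketch under stated assumptions: the studies discussed above do not necessarily use this estimator (production FEP workflows typically use many intermediate windows and estimators such as BAR or MBAR), the function names are hypothetical, and the energy differences are assumed to be supplied in kcal/mol.

```python
import math

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def zwanzig_delta_g(delta_u, temperature=298.15):
    """Free energy difference A->B from energy differences sampled in state A.

    Implements dG = -kT * ln < exp(-dU / kT) >_A, where dU = U_B - U_A is
    evaluated on configurations drawn from state A.
    """
    if not delta_u:
        raise ValueError("need at least one sample")
    kt = KB_KCAL * temperature
    scaled = [-du / kt for du in delta_u]
    # Log-sum-exp trick: factor out the largest exponent for numerical stability.
    largest = max(scaled)
    mean_exp = sum(math.exp(s - largest) for s in scaled) / len(scaled)
    return -kt * (largest + math.log(mean_exp))


def hysteresis(dg_forward, dg_reverse):
    """Absolute mismatch between the forward (A->B) and reverse (B->A) estimates.

    For fully converged sampling dG(A->B) = -dG(B->A), so the value is zero.
    """
    return abs(dg_forward + dg_reverse)


# Hypothetical energy differences (kcal/mol) from a forward and a reverse run.
forward = zwanzig_delta_g([1.1, 0.8, 1.4, 0.9])
reverse = zwanzig_delta_g([-0.7, -1.0, -0.5, -0.9])
print(forward, reverse, hysteresis(forward, reverse))
```

A large hysteresis value signals that the forward and reverse runs explored different regions of configuration space, for example different hydration states of the binding site, which is the situation described above for calculations that depend on the initial solvation state.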

6. Conclusions

Undoubtedly, solvent molecules play an important role in molecular recognition processes and thus cannot be neglected in the context of structure-based drug design. As their thermodynamic contributions to binding cannot be easily disentangled using experimental approaches, only an effective combination of computation and experiment can provide valuable guidance for drug discovery projects, and such combined investigations are far from routine. Although some advances have been made in predicting affinities using molecular simulations, quantitative computational predictions of thermodynamic signatures are far from feasible even for relatively simple and well-described systems. This is in part due to the incorporation of canceling errors in the parameterization of force fields as well as the intrinsically quantum effects of solvent networks. The coordinated combination of biophysical methods like NMR, ITC, and X-ray crystallography with computational models that aim to guide compound and experimental design should, however, be strongly considered in drug discovery projects. A better understanding and appreciation of solvent thermodynamics has the potential to significantly impact molecular design and drug discovery moving forward.

7. Expert opinion

Solvent molecules that mediate molecular interactions are a common feature of protein-drug complexes. Nevertheless, these interactions often remain undetected and, more importantly, are underutilized during rational drug design. In our view, there are several underlying reasons for this. One reason is that they are frequently considered to contribute insignificantly to the binding affinity. Another is the inherent difficulty of assessing them experimentally as well as computationally. It appears intuitive that thermodynamic signatures might provide a different handle for understanding the role of water interactions and thereby guide and inform drug design. Recent studies have, however, made it clear that thermodynamic signatures need complementary information to provide detailed mechanistic understanding and that they cannot be used as endpoints for optimization [Citation2,Citation3]. Nevertheless, thermodynamic characterization has played an important role in furthering our understanding of the driving forces of protein–ligand interactions. The most important lesson from these studies is, in our view, the increased appreciation of the complexity and variety of effects displayed by water interactions, which makes it impossible to aggregate them into a single optimization parameter that can be used in an automated fashion. Design strategies are frequently stated only with reference to a static picture of protein-ligand complexes. It is well known that protein and ligand flexibility affect not only the dynamics but also the thermodynamics of protein–ligand interactions. Ligand binding is always a dynamic process, and the water network around the ligand binding site may act collectively even for very rigid sites. Thus, the number and nature of the water interactions involved may vary considerably, and the dewetting of the ligand binding site and of the ligand should influence the binding event in a non-trivial way.
Obviously, high-resolution crystal structures enable one to uncover some of the molecular features of the solvent inventory and might provide a retrospective rationale for improved affinity or more favorable enthalpic binding. From such experiments, we know that solvent molecules do not only serve as a bulk medium but also take part very specifically in molecular recognition events. Many individual water interactions can, however, also be transient and are not necessarily retained in the net protein-ligand recognition event, and it is a real challenge to identify methods that can handle the large length and time scales involved in biomolecular hydration. The transient and statistical nature of these phenomena makes them difficult to characterize using interpretable and quantifiable observables that can be used as optimization parameters [Citation2,Citation3].

In our view, an increased understanding of the thermodynamics of solvent molecules is imperative for the progress of rational design. Protein–ligand interactions are inherently statistical in nature, and experimental observables are statistical emergent phenomena originating from the summation of countless individual events and degrees of freedom. This complexity necessitates concerted efforts that combine theoretical and experimental approaches. It also makes it important to invest in the study of much simpler systems. Pharmaceutical companies will naturally focus on biologically relevant systems, including multi-domain proteins and larger multi-component protein complexes. These have an enormous number of degrees of freedom, which makes it difficult to treat the systems with rigor from a computational perspective and also difficult to deconvolute and interpret experimental outputs. Fundamental academic work with less complex systems could, in our opinion, furnish a much better opportunity to advance the field than work employing very complex systems that mainly provide thermodynamic noise. An important effort in this direction is, in the authors' opinion, the SAMPLx challenges, which have been used to benchmark and advance computational approaches to host-guest systems and hydration as well as distribution coefficients [Citation23]. Even for these simpler systems, an accurate prediction can be challenging, and the accuracy of methodologies can be highly dependent on the system under study [Citation24]. This editorial is a call for engagement in such concerted efforts to apply rigorous and systematic experimental design using state-of-the-art computational and experimental tools to advance the field.

Declaration of interest

Stefan Geschwindner and Johan Ulander are both employees of AstraZeneca. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

Reviewer Disclosures

Peer reviewers on this manuscript have no relevant financial or other relationships to disclose.

Additional information

Funding

The manuscript has not been funded.

References

  1. Ladbury JE, Klebe G, Freire E. Adding calorimetric data to decision making in lead discovery: a hot tip. Nat Rev Drug Discov. 2010;9:23–27.
  2. Geschwindner S, Ulander J, Johansson P. Ligand binding thermodynamics in drug discovery: still a hot tip? J Med Chem. 2015;58:6321–6335.
  3. Klebe G. Broad-scale analysis of thermodynamic signatures in medicinal chemistry: are enthalpy-favored binders the better development option? Drug Discov Today. 2019;24:943–948.
  4. Giordanetto F, Jin C, Willmore L, et al. Fragment hits: what do they look like and how do they bind? J Med Chem. 2019;62:3381–3394.
  5. Irwin BWJ, Huggins DJ. Estimating atomic contributions to hydration and binding using free energy perturbation. J Chem Theory Comput. 2018;14:3218–3227.
  6. Krimmer SG, Cramer J, Betz M, et al. Rational design of thermodynamic and kinetic binding profiles by optimizing surface water networks coating protein-bound ligands. J Med Chem. 2016;59:10530–10548.
  7. Schiebel J, Gaspari R, Wulsdorf T, et al. Intriguing role of water in protein-ligand binding studied by neutron crystallography on trypsin complexes. Nat Commun. 2018;9:e3559.
  8. Spyrakis F, Ahmed MH, Bayden AS, et al. The roles of water in the protein matrix: a largely untapped resource for drug discovery. J Med Chem. 2017;60:6781–6827.
  9. Leeson PD, Springthorpe B. The influence of drug-like concepts on decision-making in medicinal chemistry. Nat Rev Drug Discov. 2007;6:881–890.
  10. Shultz MD. The thermodynamic basis for the use of lipophilic efficiency (LipE) in enthalpic optimizations. Bioorg Med Chem Lett. 2013;23:5992–6000.
  11. Wienen-Schmidt B, Jonker HRA, Wulsdorf T, et al. Paradoxically, most flexible ligand binds most entropy-favored: intriguing impact of ligand flexibility and solvation on drug–kinase binding. J Med Chem. 2018;61:5922–5933.
  12. Verteramo ML, Stenström O, Ignjatović MM, et al. Interplay between conformational entropy and solvation entropy in protein–ligand binding. J Am Chem Soc. 2019;141:2012–2026.
  13. Magarkar A, Schnapp G, Apel A-K, et al. Enhancing drug residence time by shielding of intra-protein hydrogen bonds: a case study on CCR2 antagonists. ACS Med Chem Lett. 2019;10:324–328.
  14. Betz M, Wulsdorf T, Krimmer SG, et al. Impact of surface water layers on protein-ligand binding: how well are experimental data reproduced by molecular dynamics simulations in a thermolysin test case? J Chem Inf Model. 2016;56:223–233.
  15. Li A, Gilson MK. Protein-ligand binding enthalpies from near-millisecond simulations: analysis of a preorganization paradox. J Chem Phys. 2018;149:072311.
  16. Haider K, Cruz A, Ramsey S, et al. Solvation structure and thermodynamic mapping (SSTMap): an open-source, flexible package for the analysis of water in molecular dynamics trajectories. J Chem Theory Comput. 2018;14:418–425.
  17. Bucher D, Stouten P, Triballeau N. Shedding light on important waters for drug design: simulations versus grid-based methods. J Chem Inf Model. 2018;58:692–699.
  18. Anandakrishnan R, Izadi S, Onufriev AV. Why computed protein folding landscapes are sensitive to the water model. J Chem Theory Comput. 2019;15:625–636.
  19. Wahl J, Smieško M. Assessing the predictive power of relative binding free energy calculations for test cases involving displacement of binding site water molecules. J Chem Inf Model. 2019;59:754–765.
  20. Chan H, Cherukara MJ, Narayanan B, et al. Machine learning coarse grained models for water. Nat Commun. 2019;10:e379.
  21. Silverstein KAT, Haymet ADJ, Dill KA. A simple model of water and the hydrophobic effect. J Am Chem Soc. 1998;120:3166–3175.
  22. Onufriev AV, Izadi S. Water models for biomolecular simulations. Wiley Interdiscip Rev Comput Mol Sci. 2018;8:e1347.
  23. Bannan CC, Burley KH, Chiu M, et al. Blind prediction of cyclohexane–water distribution coefficients from the SAMPL5 challenge. J Comput Aid Mol Des. 2016;30:927–944.
  24. Rizzi A, Murkli S, McNeill JN, et al. Overview of the SAMPL6 host–guest binding affinity prediction challenge. J Comput Aid Mol Des. 2018;32:937–963.
