References
- PDB Statistics: Overall growth of released structures per year. Nov 30, 2020. https://www.rcsb.org/stats/growth/growth-released-structures.
- Hollingsworth SA, Dror RO. Molecular dynamics simulation for all. Neuron. 2018;99:1129–31.
- Enkavi G, Javanainen M, Kulig W, et al. Multiscale simulations of biological membranes: the challenge to understand biological phenomena in a living substance. Chem Rev. 2019;119:5607–5774.
- Shirts M, Pande VS. Screen savers of the world unite. Science. 2000;290:1903–1904.
- Shaw DE, Deneroff MM, Dror RO, et al. Anton, a special-purpose machine for molecular dynamics simulation. Commun ACM. 2008;51:91–97.
- Shaw DE, Grossman JP, Bank JA, et al. Anton 2: raising the bar for performance and programmability in a special-purpose molecular dynamics supercomputer. SC’14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 41–53 (2014). New Orleans, USA. doi: 10.1109/SC.2014.9.
- Shaw DE. Millisecond-long molecular dynamics simulations of proteins on a special-purpose machine. Biophys J. 2013;104:45a.
- Lane TJ, Shukla D, Beauchamp KA, et al. To milliseconds and beyond: challenges in the simulation of protein folding. Curr Opin Struct Biol. 2013;23:58–65.
- Chandler DE, Strümpfer J, Sener M, et al. Light harvesting by lamellar chromatophores in Rhodospirillum photometricum. Biophys J. 2014;106:2503–2510.
- Dror RO, Dirks RM, Grossman JP, et al. Biomolecular simulation: a computational microscope for molecular biology. Annu Rev Biophys. 2012;41:429–452.
- Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference, and prediction, Second Edition. Springer Series in Statistics; New York, NY. 2009. DOI:10.1007/978-0-387-84858-7.
- Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589.
- Hegedűs T, Geisler M, Lukács G, et al. AlphaFold2 transmembrane protein structure prediction shines. bioRxiv. 2021;2021.08.21.457196. DOI:10.1101/2021.08.21.457196.
- Haiech J, Koscielniak T, Grassy G. Use of TSAR as a new tool to analyze the molecular dynamics trajectories of proteins. J Mol Graph. 1995;13:46–48.
- Gordon HL, Somorjai RL. Fuzzy cluster analysis of molecular dynamics trajectories. Proteins. 1992;14:249–264.
- Troyer JM, Cohen FE. Protein conformational landscapes: energy minimization and clustering of a long molecular dynamics trajectory. Proteins. 1995;23:97–110.
- Karpen ME, Tobias DJ, Brooks CL III. Statistical clustering techniques for the analysis of long molecular dynamics trajectories: analysis of 2.2-ns trajectories of YPGDV. Biochemistry. 1993;32:412–420.
- Torda AE, van Gunsteren WF. Algorithms for clustering molecular dynamics configurations. J Comput Chem. 1994;15:1331–1340.
- Bellman R. Dynamic programming. Science. 1966;153:34–37.
- McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys. 1943;5:115–133.
- LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–444.
- Jolliffe IT, Cadima J. Principal component analysis: a review and recent developments. Philos Trans R Soc A Math Phys Eng Sci. 2016;374:20150202.
- Stein SAM, Loccisano AE, Firestine SM, et al. Principal components analysis: a review of its application on molecular dynamics data. Annu Rep Comput Chem. 2006;2:233–261.
- Lange OF, Grubmüller H. Can principal components yield a dimension reduced description of protein dynamics on long time scales? J Phys Chem B. 2006;110:22842–22852.
- Sittel F, Jain A, Stock G. Principal component analysis of molecular dynamics: on the use of cartesian vs. internal coordinates. J Chem Phys. 2014;141:014111.
- Amadei A, Linssen ABM, de Groot BL, et al. An efficient method for sampling the essential subspace of proteins. J Biomol Struct Dyn. 1996;13:615–625.
- Amadei A, Linssen ABM, Berendsen HJC. Essential dynamics of proteins. Proteins. 1993;17:412–425.
- Daidone I, Amadei A. Essential dynamics: foundation and applications. Wiley Interdiscip Rev Comput Mol Sci. 2012;2:762–770.
- Lange OF, Schäfer LV, Grubmüller H. Flooding in GROMACS: accelerated barrier crossings in molecular dynamics. J Comput Chem. 2006;27:1693–1702.
- Branduardi D, Bussi G, Parrinello M. Metadynamics with adaptive Gaussians. J Chem Theory Comput. 2012;8:2247–2254.
- Spiwok V, Lipovová P, Králová B. Metadynamics in essential coordinates: free energy simulation of conformational changes. J Phys Chem B. 2007;111:3073–3076.
- Yang YI, Shao Q, Zhang J, et al. Enhanced sampling in molecular dynamics. J Chem Phys. 2019;151:070902.
- Altis A, Nguyen PH, Hegger R, et al. Dihedral angle principal component analysis of molecular dynamics simulations. J Chem Phys. 2007;126:244111.
- Sittel F, Filk T, Stock G. Principal component analysis on a torus: theory and application to protein dynamics. J Chem Phys. 2017;147:244101.
- Wolf A, Kirschner KN. Principal component and clustering analysis on molecular dynamics data of the ribosomal L11·23S subdomain. J Mol Model. 2013;19:539–549.
- Mu Y, Nguyen PH, Stock G. Energy landscape of a small peptide revealed by dihedral angle principal component analysis. Proteins. 2005;58:45–52.
- Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825–2830.
- Abraham MJ, Murtola T, Schulz R, et al. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1–2:19–25.
- Molgedey L, Schuster HG. Separation of a mixture of independent signals using time delayed correlations. Phys Rev Lett. 1994;72:3634–3637.
- Pérez-Hernández G, Noé F. Hierarchical time-lagged independent component analysis: computing slow modes and reaction coordinates for large molecular systems. J Chem Theory Comput. 2016;12:6118–6129.
- Noé F, Clementi C. Kinetic distance and kinetic maps from molecular dynamics simulation. J Chem Theory Comput. 2015;11:5002–5011.
- Sultan MM, Pande VS. TICA-metadynamics: accelerating metadynamics by using kinetically selected collective variables. J Chem Theory Comput. 2017;13:2440–2447.
- Scherer MK, Trendelkamp-Schroer B, Paul F, et al. PyEMMA 2: a software package for estimation, validation, and analysis of Markov models. J Chem Theory Comput. 2015;11:5525–5542.
- Hofmann T, Schölkopf B, Smola AJ. Kernel methods in machine learning. Ann Stat. 2008;36:1171–1220.
- Schwantes CR, Pande VS. Modeling molecular kinetics with tICA and the kernel trick. J Chem Theory Comput. 2015;11:600–608.
- Antoniou D, Schwartz SD. Toward identification of the reaction coordinate directly from the transition state ensemble using the kernel PCA method. J Phys Chem B. 2011;115:2465–2469.
- Coifman RR, Lafon S, Lee AB, et al. Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps. Proc Natl Acad Sci. 2005;102:7426–7431.
- Tenenbaum JB, De Silva V, Langford JC. A global geometric framework for nonlinear dimensionality reduction. Science. 2000;290:2319–2323.
- Ferguson AL, Panagiotopoulos AZ, Debenedetti PG, et al. Systematic determination of order parameters for chain dynamics using diffusion maps. Proc Natl Acad Sci. 2010;107:13597–13602.
- Rohrdanz MA, Zheng W, Maggioni M, et al. Determination of reaction coordinates via locally scaled diffusion map. J Chem Phys. 2011;134:124116.
- Kim SB, Dsilva CJ, Kevrekidis IG, et al. Systematic characterization of protein folding pathways using diffusion maps: application to Trp-cage miniprotein. J Chem Phys. 2015;142:085101.
- Das P, Moll M, Stamati H, et al. Low-dimensional, free-energy landscapes of protein-folding reactions by nonlinear dimensionality reduction. Proc Natl Acad Sci. 2006;103:9885–9890.
- Spiwok V, Králová B. Metadynamics in the conformational space nonlinearly dimensionally reduced by Isomap. J Chem Phys. 2011;135:224504.
- Hashemian B, Millán D, Arroyo M. Modeling and enhanced sampling of molecular systems with smooth and nonlinear data-driven collective variables. J Chem Phys. 2013;139:214101.
- Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science. 2006;313:504–507.
- Wehmeyer C, Noé F. Time-lagged autoencoders: deep learning of slow collective variables for molecular kinetics. J Chem Phys. 2018;148:241703.
- Chen W, Ferguson AL. Molecular enhanced sampling with autoencoders: on-the-fly collective variable discovery and accelerated free energy landscape exploration. J Comput Chem. 2018;39:2079–2102.
- Ribeiro JML, Bravo P, Wang Y, et al. Reweighted autoencoded variational Bayes for enhanced sampling (RAVE). J Chem Phys. 2018;149:072301.
- Lamim Ribeiro JM, Tiwary P. Toward achieving efficient and accurate ligand-protein unbinding with deep learning and molecular dynamics through RAVE. J Chem Theory Comput. 2019;15:708–719.
- Varolgüneş YB, Bereau T, Rudzinski JF. Interpretable embeddings from molecular simulations using Gaussian mixture variational autoencoders. Mach Learn: Sci Technol. 2020;1:015012.
- Wang W, Gómez-Bombarelli R. Coarse-graining auto-encoders for molecular dynamics. npj Comput Mater. 2019;5:125.
- Degiacomi MT. Coupling molecular dynamics and deep learning to mine protein conformational space. Structure. 2019;27:1034–1040.e3.
- Hub JS, De Groot BL. Detection of functional modes in protein dynamics. PLoS Comput Biol. 2009;5:e1000480.
- Krivobokova T, Briones R, Hub JS, et al. Partial least-squares functional mode analysis: application to the membrane proteins AQP1, Aqy1, and CLC-ec1. Biophys J. 2012;103:786–796.
- Kaptan S, Assentoft M, Schneider HP, et al. H95 is a pH-dependent gate in Aquaporin 4. Structure. 2015;23:2309–2318.
- Saboe PO, Rapisarda C, Kaptan S, et al. Role of pore-lining residues in defining the rate of water conduction by Aquaporin-0. Biophys J. 2017;112:953–965.
- Izvekov S, Voth GA. A multiscale coarse-graining method for biomolecular systems. J Phys Chem B. 2005;109:2469–2473.
- Izvekov S, Voth GA. Effective force field for liquid hydrogen fluoride from ab initio molecular dynamics simulation using the force-matching method. J Phys Chem B. 2005;109:6573–6586.
- Scherer C, Scheid R, Andrienko D, et al. Kernel-based machine learning for efficient simulations of molecular liquids. J Chem Theory Comput. 2020;16:3194–3204.
- John ST, Csányi G. Many-body coarse-grained interactions using Gaussian approximation potentials. J Phys Chem B. 2017;121:10934–10949.
- Wang J, Olsson S, Wehmeyer C, et al. Machine learning of coarse-grained molecular dynamics force fields. ACS Cent Sci. 2019;5:755–767.
- Murtola T, Falck E, Karttunen M, et al. Coarse-grained model for phospholipid/cholesterol bilayer employing inverse Monte Carlo with thermodynamic constraints. J Chem Phys. 2007;126:075101.
- Mehmood T, Ahmed B. The diversity in the applications of partial least squares: an overview. J Chemom. 2016;30:4–17.
- Peters JH, de Groot BL, Levitt M. Ubiquitin dynamics in complexes reveal molecular recognition mechanisms beyond induced fit and conformational selection. PLoS Comput Biol. 2012;8:e1002704.
- Sakuraba S, Kono H. Spotting the difference in molecular dynamics simulations of biomolecules. J Chem Phys. 2016;145:074116.
- Sultan MM, Kiss G, Shukla D, et al. Automatic selection of order parameters in the analysis of large scale molecular dynamics simulations. J Chem Theory Comput. 2014;10:5217–5223.
- Breiman L, Friedman J, Stone CJ, et al. Classification and regression trees. Taylor & Francis; 1984. https://books.google.fi/books?id=JwQx-WOmSyQC
- Breiman L. Random forests. Mach Learn. 2001;45:5–32.
- Šikić M, Tomić S, Vlahoviček K. Prediction of protein-protein interaction sites in sequences and 3D structures by random forests. PLoS Comput Biol. 2009;5:e1000278.
- Riniker S. Molecular dynamics fingerprints (MDFP): machine learning from MD data to predict free-energy differences. J Chem Inf Model. 2017;57:726–741.
- Aghaaminiha M, Ghanadian SA, Ahmadi E, et al. A machine learning approach to estimation of phase diagrams for three-component lipid mixtures. Biochim Biophys Acta - Biomembr. 2020;1862:183350.
- Wang F, Shen L, Zhou H, et al. Machine learning classification model for functional binding modes of TEM-1 β-lactamase. Front Mol Biosci. 2019;6:47.
- Deisenroth MP, Faisal AA, Ong CS. Mathematics for machine learning. Cambridge University Press; 2020.
- Prinz JH, Wu H, Sarich M, et al. Markov models of molecular kinetics: generation and validation. J Chem Phys. 2011;134:174105.
- Wolf A, Kirschner KN. Principal component and clustering analysis on molecular dynamics data of the ribosomal L11·23S subdomain. J Mol Model. 2013;19:539–549.
- Sgourakis NG, Merced-Serrano M, Boutsidis C, et al. Atomic-level characterization of the ensemble of the Aβ(1-42) monomer in water using unbiased molecular dynamics simulations and spectral algorithms. J Mol Biol. 2011;405:570–583.
- Abramyan TM, Snyder JA, Thyparambil AA, et al. Cluster analysis of molecular simulation trajectories for systems where both conformation and orientation of the sampled states are important. J Comput Chem. 2016;37:1973–1982.
- Bremer PL, De Boer D, Alvarado W, et al. Overcoming the heuristic nature of k-means clustering: identification and characterization of binding modes from simulations of molecular recognition complexes. J Chem Inf Model. 2020;60:3081–3092.
- Ester M, Kriegel H-P, Sander J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise. KDD96 Proceedings 226–231 (1996). Portland, Oregon, USA. https://dl.acm.org/doi/10.5555/3001460.3001507
- Wang K, Chodera JD, Yang Y, et al. Identifying ligand binding sites and poses using GPU-accelerated Hamiltonian replica exchange molecular dynamics. J Comput Aided Mol Des. 2013;27:989–1007.
- Galindo-Murillo R, Cheatham TE. DNA binding dynamics and energetics of cobalt, nickel, and copper metallopeptides. ChemMedChem. 2014;9:1252–1259.
- Kim M, Choi SH, Kim J, et al. Density-based clustering of small peptide conformations sampled from a molecular dynamics simulation. J Chem Inf Model. 2009;49:2528–2536.
- Campello RJGB, Moulavi D, Zimek A, et al. Hierarchical density estimates for data clustering, visualization, and outlier detection. ACM Trans Knowl Discov Data. 2015;10:1–51.
- Melvin RL, Xiao J, Godwin RC, et al. Visualizing correlated motion with HDBSCAN clustering. Protein Sci. 2018;27:62.
- Maragakis P, Van Der Vaart A, Karplus M. Gaussian-mixture umbrella sampling. J Phys Chem B. 2009;113:4664–4673.
- Westerlund AM, Delemotte L. InfleCS: clustering free energy landscapes with Gaussian mixtures. J Chem Theory Comput. 2019;15:6752–6759.
- Debnath J, Parrinello M. Gaussian mixture-based enhanced sampling for statics and dynamics. J Phys Chem Lett. 2020;11:5076–5080.
- Plante A, Shore DM, Morra G, et al. A machine learning approach for the discovery of ligand-specific functional mechanisms of GPCRs. Molecules. 2019;24:2097.
- Simonyan K, Vedaldi A, Zisserman A. Deep inside convolutional networks: visualising image classification models and saliency maps. In: Workshop at International Conference on Learning Representations (2014). Banff, Canada. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.746.3713