Building and deploying a cyberinfrastructure for the data-driven design of chemical systems and the exploration of chemical space

Pages 921-929 | Received 09 Jan 2018, Accepted 26 Apr 2018, Published online: 22 May 2018

References

  • Selassie CD . History of quantitative structure-activity relationships. In: Abraham DJ , editor. Burger’s medicinal chemistry and drug discovery. Hoboken (NJ): Wiley; 2003. p. 1–48.
  • Müller K-R , Rätsch G , Sonnenburg S , et al . Classifying ‘drug-likeness’ with kernel-based learning methods. J Chem Inform Model. 2005;45:249–253.
  • Le Bailly de Tilleghem C , Govaerts B . A review of quantitative structure-activity relationship (QSAR) models. Technical Report 07027, Université catholique de Louvain; 2007.
  • Lipinski C , Hopkins A . Navigating chemical space for biology and medicine. Nature. 2004;432:855–861.
  • Kirkpatrick P , Ellis C . Chemical space. Nature. 2004;432:823.
  • Dobson CM . Chemical space and biology. Nature. 2004;432:824–828.
  • Zvinavashe E , Murk AJ , Rietjens IMCM . Promises and pitfalls of quantitative structure-activity relationship approaches for predicting metabolism and toxicity. Chem Res Toxicol. 2008;21:2229–2236.
  • Scior T , Medina-Franco JL , Do QT , et al . How to recognize and workaround pitfalls in QSAR studies: a critical review. Curr Med Chem. 2009;16:4297–4313.
  • Schneider G . Virtual screening: an endless staircase? Nat Rev Drug Discovery. 2010;9:273–276.
  • Rajan K . Informatics for materials science and engineering: data-driven discovery for accelerated experimentation and application. Amsterdam: Butterworth-Heinemann; 2013.
  • National Science and Technology Council . Materials genome initiative for global competitiveness. Tech. Rep. Washington (DC): National Science and Technology Council; 2011.
  • Hansen K , Biegler F , Fazli S , et al . Assessment and validation of machine learning methods for predicting molecular atomization energies. J Chem Theory Comput. 2013;9:3404–3419.
  • Huan TD , Mannodi-Kanakkithodi A , Ramprasad R . Accelerated materials property predictions and design using motif-based fingerprints. Phys Rev B. 2015;92:1–10. arXiv:1503.07503v2.
  • Bartók AP , Kondor R , Csányi G . On representing chemical environments. Phys Rev B. 2013;87:184115.
  • Isayev O , Fourches D , Muratov EN , et al . Materials cartography: representing and mining materials space using structural and electronic fingerprints. Chem Mater. 2015;27:735–743.
  • Behler J , Parrinello M . Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys Rev Lett. 2007;98:146401.
  • Snyder JC , Rupp M , Hansen K , et al . Finding density functionals with machine learning. Phys Rev Lett. 2012;108:253002.
  • Mueller T , Hautier G , Jain A , et al . Evaluation of tavorite-structured cathode materials for lithium-ion batteries using high-throughput computing. Chem Mater. 2011;23:3854–3862.
  • Wang S , Wang Z , Setyawan W , et al . Assessing the thermoelectric properties of sintered compounds via high-throughput ab-initio calculations. Phys Rev X. 2011;1:021012.
  • Wilmer CE , Leaf M , Lee CY , et al . Large-scale screening of hypothetical metal-organic frameworks. Nat Chem. 2012;4:83–89.
  • Korth M . Large-scale virtual high-throughput screening for the identification of new battery electrolyte solvents: evaluation of electronic structure theory methods. Phys Chem Chem Phys. 2014;16:7919–7926.
  • Olivares-Amaya R , Amador-Bedolla C , Hachmann J , et al . Accelerated computational discovery of high-performance materials for organic photovoltaics by means of cheminformatics. Energy Environ Sci. 2011;4:4849–4861.
  • Wen S , Nanda K , Huang Y , et al . Practical quantum mechanics-based fragment methods for predicting molecular crystal properties. Phys Chem Chem Phys. 2012;14:7578–7590.
  • Stevanović V , Lany S , Ginley DS , et al . Assessing capability of semiconductors to split water using ionization potentials and electron affinities only. Phys Chem Chem Phys. 2014;16:3706–3714.
  • Potyrailo R , Rajan K , Stoewe K , et al . Combinatorial and high-throughput screening of materials libraries: review of state of the art. ACS Combin Sci. 2011;13:579–633.
  • Hachmann J , Olivares-Amaya R , Atahan-Evrenk S , et al . The Harvard clean energy project: large-scale computational screening and design of organic photovoltaics on the world community grid. J Phys Chem Lett. 2011;2:2241–2251.
  • Hachmann J , Olivares-Amaya R , Jinich A , et al . Lead candidates for high-performance organic photovoltaics from high-throughput quantum chemistry -- the Harvard clean energy project. Energy Environ Sci. 2014;7:698–704.
  • Gunter D , Cholia S , Jain A , et al . Community accessible datastore of high-throughput calculations: experiences from the Materials Project. In: 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC); 2012. p. 1244–1251.
  • White AA . Big data are shaping the future of materials science. MRS Bull. 2013;38:594–595.
  • Blum LC , van Deursen R , Reymond JL . Visualisation and subsets of the chemical universe database GDB-13 for virtual screening. J Comput Aided Mol Des. 2011;25:637–647.
  • Ruddigkeit L , van Deursen R , Blum LC , et al . Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J Chem Inform Model. 2012;52:2864–2875.
  • Rupp M , Tkatchenko A , Müller KR , et al . Fast and accurate modeling of molecular atomization energies with machine learning. Phys Rev Lett. 2012;108(5):058301.
  • Pilania G , Wang C , Jiang X , et al . Accelerating materials property predictions using machine learning. Sci Rep. 2013;3:2810. DOI:10.1038/srep02810
  • Mansbach RA , Ferguson AL . Machine learning of single molecule free energy surfaces and the impact of chemistry and environment upon structure and dynamics. J Chem Phys. 2015;142:105101.
  • Pegg SCH , Haresco JJ , Kuntz ID . A genetic algorithm for structure-based de novo design. J Comput Aided Mol Des. 2001;15:911–933.
  • Proschak E , Sander K , Zettl H , et al . From molecular shape to potent bioactive agents II: fragment-based de novo design. ChemMedChem. 2009;4:45–48.
  • Nakamura M , Hachiya T , Saito Y , et al . An efficient algorithm for de novo predictions of biochemical pathways between chemical compounds. BMC Bioinform. 2012;13:S8.
  • Sokolov AN , Atahan-Evrenk S , Mondal R , et al . From computational discovery to experimental characterization of a high hole mobility organic crystal. Nat Commun. 2011;2:437–438.
  • Hartenfeller M , Schneider G . De novo drug design. In: Bajorath J , editor. Chemoinformatics and computational chemical biology. Vol. 672. Totowa (NJ): Humana Press; 2011. p. 299–323.
  • Hachmann J , Afzal MAF . ChemLG 0.5 -- a library generator code for the enumeration of chemical and materials space; 2017. Available from: https://hachmannlab.github.io/chemlg.
  • Hachmann J , Evangelista WS , Afzal MAF , et al . ChemHTPS 0.7 -- an automated virtual high-throughput screening program suite for chemical and materials data generation; 2017. Available from: https://hachmannlab.github.io/chemhtps.
  • Hachmann J , Agrawal S , Sonpal A , et al . ChemBDDB 0.2 -- a big data database toolkit for chemical and materials data storage; 2017. Available from: https://hachmannlab.github.io/chembddb.
  • Hachmann J , Haghighatlari M . ChemML 0.10 -- a machine learning and informatics program suite for chemical and materials data mining; 2017. Available from: https://hachmannlab.github.io/chemml.
  • Reymond JL , Ruddigkeit L , Blum L , et al . The enumeration of chemical space. Wiley Interdisciplinary Rev Comput Mol Sci. 2012;2:717–733.
  • Neese F . The ORCA program system. Wiley Interdisciplinary Rev Comput Mol Sci. 2012;2:73–78.
  • Shao Y , Gan Z , Epifanovsky E , et al . Advances in molecular quantum chemistry contained in the Q-Chem 4 program package. Mol Phys. 2015;113:184–215.
  • Abraham MJ , Murtola T , Schulz R , et al . GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1–2:19–25.
  • Amador-Bedolla C , Olivares-Amaya R , Hachmann J , et al . Organic photovoltaics. In: Rajan K , editor. Informatics for materials science and engineering: data-driven discovery for accelerated experimentation and application. Amsterdam: Butterworth-Heinemann; 2013. Chapter 17; p. 423–442.
  • Pyzer-Knapp EO , Suh C , Gómez-Bombarelli R , et al . What is high-throughput virtual screening? A perspective from organic materials discovery. Ann Rev Mater Res. 2015;45:195–216.
  • Lopez SA , Pyzer-Knapp EO , Simm GN , et al . The Harvard organic photovoltaic dataset. Sci Data. 2016;3:160086.
  • Kowalski BR , Bender CF . Pattern recognition. Powerful approach to interpreting chemical data. J Am Chem Soc. 1972;94:5632–5639.
  • Pedregosa F , Weiss R , Brucher M . Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825–2830.
  • Abadi M , Agarwal A , Barham P , et al . TensorFlow: large-scale machine learning on heterogeneous systems; 2015, software available from tensorflow.org.
  • Chollet F , et al . Keras; 2015, software available from: https://github.com/fchollet/keras.
  • Talete srl . DRAGON (Software for Molecular Descriptor Calculation); 2011, software available from: http://www.talete.mi.it/.
  • O’Boyle NM , Banck M , James CA , et al . Open Babel: an open chemical toolbox. J Cheminform. 2011;3:33. DOI:10.1186/1758-2946-3-33
  • RDKit: Open-source cheminformatics, software available from: http://www.rdkit.org.
  • Bishop CM . Pattern recognition and machine learning. New York: Springer; 2006.
  • Müller KR , Mika S , Rätsch G , et al . An introduction to kernel-based learning algorithms. IEEE Trans Neural Netw. 2001;12:181–201.
  • Manzhos S , Carrington T . A random-sampling high dimensional model representation neural network for building potential energy surfaces. J Chem Phys. 2006;125:084109.
  • Dahl GE . Deep learning approaches to problems in speech recognition, computational chemistry, and natural language text processing [PhD thesis]. University of Toronto; 2015.
  • Todeschini R , Consonni V , Mannhold R , et al . Handbook of molecular descriptors. Weinheim: Wiley-VCH; 2000.
  • Sykora VJ , Leahy DE . Chemical Descriptors Library (CDL): a generic, open source software library for chemical informatics. J Chem Inform Model. 2008;48(10):1931–1942.
  • Nilakantan R , Bauman N , Dixon JS , et al . Topological torsion: a new molecular descriptor for SAR applications. Comparison with other descriptors. J Chem Inform Comput Sci. 1987;27:82–85.
  • O’Boyle NM , Sayle RA . Comparing structural fingerprints using a literature-based similarity benchmark. J Cheminform. 2016;8:1–14.
  • Hansen K , Biegler F , Ramakrishnan R , et al . Machine learning predictions of molecular properties: accurate many-body potentials and nonlocality in chemical space. J Phys Chem Lett. 2015;6:2326–2331.
  • Ramakrishnan R , von Lilienfeld OA . Machine learning, quantum chemistry, and chemical space. In: Parrill AL , Lipkowitz KB , editors. Reviews in computational chemistry. Vol. 30. Hoboken (NJ): John Wiley & Sons; 2017. p. 225–256.
  • Ward L , Agrawal A , Choudhary A , et al . A general-purpose machine learning framework for predicting properties of inorganic materials. npj Comput Mater. 2016;2:16028.
  • Schütt KT , Arbabzadah F , Chmiela S , et al . Quantum-chemical insights from deep tensor neural networks. Nat Commun. 2017;8:13890.
  • Isayev O , Oses C , Toher C , et al . Universal fragment descriptors for predicting electronic properties of inorganic crystals. Nat Commun. 2017;8:15679.
  • Ferré G , Haut T , Barros K . Learning molecular energies using localized graph kernels. J Chem Phys. 2017;146:114107.
  • Collins CR , Gordon GJ , von Lilienfeld OA , et al . Constant size molecular descriptors for use with machine learning. arXiv 2017, arXiv:1701.06649.
  • Collins CR , Gordon GJ , von Lilienfeld OA , et al . Constant size molecular descriptors for accurate machine learning models of molecular properties. J Chem Phys. 2018;148:241718. DOI:10.1063/1.5020441
  • Tian Y . Inheritance of molecular orbital energies from monomer building blocks to larger copolymers in organic semiconductors [master’s thesis]. University at Buffalo; 2016.
  • Shih CY . Systematic trends in results from different density functional theory models [master’s thesis]. University at Buffalo; 2015.
  • Afzal MAF , Cheng C , Hachmann J . Combining first-principles and data modeling for the accurate prediction of the refractive index of organic polymers. J Chem Phys. 2018;148:241712.
  • Kumaran Sudalayandi Rajeswari V . First-principles modeling of polymer degradation kinetics and virtual high-throughput screening of candidates for biodegradable polymers [master’s thesis]. University at Buffalo; 2018.
  • Lei T , Wang JY , Pei J . Roles of flexible chains in organic semiconducting materials. Chem Mater. 2014;26:594–603.
  • Angione MD , Pilolli R , Cotrone S , et al . Carbon based materials for electronic bio-sensing. Mater Today. 2011;14:424–433.
  • Voigt A , Ostrzinski U , Pfeiffer K , et al . New inks for the direct drop-on-demand fabrication of polymer lenses. Microelectron Eng. 2011;88:2174–2179.
  • Ummartyotin S , Juntaro J , Sain M , et al . Development of transparent bacterial cellulose nanocomposite film as substrate for flexible organic light emitting diode (OLED) display. Indus Crops Prod. 2012;35:92–97.
  • Xiang C , Ma R . Devices to increase OLED output coupling efficiency with a high refractive index substrate. US Patent 9,640,781. 2017.
  • Nishiyama H , Nishii J , Mizoshiri M , et al . Microlens arrays of high-refractive-index glass fabricated by femtosecond laser lithography. Appl Surf Sci. 2009;255:9750–9753.
  • Kokubun Y , Funato N , Takizawa M . Athermal waveguides for temperature-independent lightwave devices. IEEE Photon Technol Lett. 1993;5:1297–1300.
  • Wei H , Krishnaswamy S . Direct laser writing polymer micro-resonators for refractive index sensors. IEEE Photon Technol Lett. 2016;28:2819–2822.
  • Rodríguez A , Vitrant G , Chollet PA , et al . Optical control of an integrated interferometer using a photochromic polymer. Appl Phys Lett. 2001;79:461–463.
  • Singaravalu S , Mayo DC , Park HK , et al . Anti-reflective polymer-nanocomposite coatings fabricated by RIR-MAPLE. In: SPIE LASE. Vol. 8607. International Society for Optics and Photonics; Bellingham, WA; 2013. p. 860718.
  • Kim JB , Lee JH , Moon CK , et al . Highly enhanced light extraction from surface plasmonic loss minimized organic light-emitting diodes. Adv Mater. 2013;25:3571–3577.
  • Kim E , Cho H , Kim K , et al . A facile route to efficient, low-cost flexible organic light-emitting diodes: utilizing the high refractive index and built-in scattering properties of industrial-grade PEN substrates. Adv Mater. 2015;27:1624–1631.
  • Jintoku H , Ihara H . The simplest method for fabrication of high refractive index polymer-metal oxide hybrids based on a soap-free process. Chem Commun. 2014;50:10611–10614.
  • Liu JG , Ueda M . High refractive index polymers: fundamental research and practical applications. J Mater Chem. 2009;19:8907–8919.