Abstract
Aim: We introduce AutoQSAR, an automated machine-learning application to build, validate and deploy quantitative structure–activity relationship (QSAR) models. Methodology/results: Descriptor generation, feature selection and the creation of a large number of QSAR models have been automated into a single workflow within AutoQSAR. The models are built using a variety of machine-learning methods, and each model is scored using a novel approach. The effectiveness of the method is demonstrated through comparison with literature QSAR models using identical datasets for six endpoints: protein–ligand binding affinity, solubility, blood–brain barrier permeability, carcinogenicity, mutagenicity and bioaccumulation in fish. Conclusion: AutoQSAR demonstrates predictive performance similar to or better than published results for four of the six endpoints while requiring minimal human time and expertise.
Supplementary Material
AutoQSAR models sufficient to regenerate the data in this publication have been provided for all endpoints. To view the supplementary material, please see the list here and the AutoQSAR models here.
Financial & competing interests disclosure
All authors are employees of Schrödinger, Inc. and some authors hold shares of that company. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
No writing assistance was utilized in the production of this manuscript.