Abstract
A quantitative structure–activity relationship (QSAR) was developed to predict the efficacy of carbon adsorption as a control technology for endocrine-disrupting compounds, pharmaceuticals, and components of personal care products, giving water quality professionals a tool to help protect public health. Here, we expand previous work to investigate a broad spectrum of molecular descriptors, including subdivided surface areas, adjacency and distance matrix descriptors, electrostatic partial charges, potential energy descriptors, conformation-dependent charge descriptors, and Transferable Atom Equivalent (TAE) descriptors that characterize the regional electronic properties of molecules. We compare how well linear (Partial Least Squares) and non-linear (Support Vector Machine) machine learning methods describe a broad chemical space and produce a user-friendly model. We employ cross-validation, y-scrambling, and external validation for quality control. The recommended Support Vector Machine model, trained on 95 compounds using 23 descriptors, offered a good balance of strong performance statistics, low error, and low probability of over-fitting while describing a wide range of chemical features. The model, built on a log-uptake (qe) response calculated at an aqueous equilibrium concentration (Ce) of 1 μM, described the training dataset with an r² of 0.932, had a cross-validated r² of 0.833, and had an average residual of 0.14 log units.
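The y-scrambling check named above can be illustrated with a minimal sketch. It is not the authors' workflow: it assumes a single synthetic descriptor and an ordinary least-squares fit in place of the Support Vector Machine or Partial Least Squares models, and every name and data value in it (`r_squared`, `y_scramble`, the generated `x` and `y`) is hypothetical. The idea is only that a model fitted to shuffled responses should score far worse than the model fitted to the true responses; if it does not, the original fit is likely a chance correlation.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation; equals r^2 of a one-descriptor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def y_scramble(x, y, n_rounds=200, seed=0):
    """Refit after shuffling the responses; return the r^2 of each scrambled fit."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = y[:]          # copy so the true responses stay intact
        rng.shuffle(shuffled)    # break the descriptor-response pairing
        scores.append(r_squared(x, shuffled))
    return scores


# Synthetic stand-in data (assumption): 95 "compounds", one descriptor,
# and a noisy linear response playing the role of log uptake.
data_rng = random.Random(42)
x = [data_rng.uniform(0.0, 5.0) for _ in range(95)]
y = [0.8 * xi + data_rng.gauss(0.0, 0.3) for xi in x]

true_r2 = r_squared(x, y)
scrambled = y_scramble(x, y)
mean_scrambled_r2 = sum(scrambled) / len(scrambled)
# Fraction of scrambled fits that match or beat the true fit (an empirical p-value).
fraction_as_good = sum(s >= true_r2 for s in scrambled) / len(scrambled)

print(f"true r2 = {true_r2:.3f}")
print(f"mean scrambled r2 = {mean_scrambled_r2:.3f}")
print(f"fraction of scrambled fits >= true fit = {fraction_as_good:.3f}")
```

With a real multi-descriptor model, the same loop would wrap the full fitting and cross-validation procedure, so that descriptor selection is also repeated on each scrambled response.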
Acknowledgements
Rensselaer Polytechnic Institute acknowledges that the WaterRF is the joint owner of certain technical information upon which this manuscript is based. This document was reviewed by a panel of independent experts selected by WaterRF. Mention of trade names or commercial products does not constitute WaterRF endorsement or recommendations for use. Similarly, omission of products or trade names indicates nothing concerning WaterRF’s position regarding product effectiveness or applicability. The comments and views detailed herein may not necessarily reflect the views of the WaterRF, its officers, directors, affiliates or agents.