Abstract
Predicting the plasma protein binding (PPB) affinity of chemicals is of paramount importance in the drug development process. In this study, ensemble machine learning-based QSPR models were established for four-category classification and for PPB affinity prediction of diverse compounds, using a large PPB dataset of 930 compounds and following the OECD guidelines. The structural diversity of the chemicals was assessed using the Tanimoto similarity index. The predictive power of the developed QSPR models was evaluated through internal and external validation. In the QSPR models, XLogP was the most important descriptor. On the test data, the classification QSPR models achieved an accuracy of >93%, while the regression QSPR models yielded a squared correlation coefficient (r²) of >0.920 between the measured and predicted PPB affinities, with a root mean squared error of <9.77. The values of the statistical coefficients derived for the test data were above their threshold limits, lending high confidence to this analysis. The QSPR models developed in this study performed better than those reported in previous studies. The results suggest that the developed QSPR models are reliable for predicting the PPB affinity of structurally diverse chemicals and can be useful for the initial screening of candidate molecules in the drug development process.
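As an illustration of the quantities named above, the following minimal Python sketch computes the Tanimoto similarity index between two binary fingerprints, and the r² and root mean squared error between measured and predicted values. It rests on assumptions not stated in the abstract: fingerprints are represented as sets of on-bit indices, r² is taken as the squared Pearson correlation coefficient, and the function names and example values are hypothetical rather than taken from the study.

```python
import math


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices.

    Assumption: two empty fingerprints are assigned a similarity of 0.0.
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values (assumed definition)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    s_mp = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    s_mm = sum((m - mean_m) ** 2 for m in measured)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return s_mp ** 2 / (s_mm * s_pp)


def rmse(measured, predicted):
    """Root mean squared error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


if __name__ == "__main__":
    # Hypothetical fingerprints and PPB values, for illustration only.
    print(tanimoto({1, 2, 3}, {2, 3, 4}))
    print(r_squared([10.0, 50.0, 90.0], [12.0, 47.0, 93.0]))
    print(rmse([10.0, 50.0, 90.0], [12.0, 47.0, 93.0]))
```

A Tanimoto value near 0 for most compound pairs would indicate a structurally diverse dataset, while r² and RMSE summarize the agreement between measured and predicted affinities on held-out data.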
Acknowledgement
The authors thank the Director, CSIR-Indian Institute of Toxicology Research, Lucknow (India) for his keen interest in this work and for providing all necessary facilities.