Abstract
Recent implementations of QSAR modelling software provide the user with numerous models and a wealth of information. In this work, we provide guidance on how to interpret the results of QSAR modelling, compare and assess the resulting models, and select the best and most consistent ones. Two QSAR datasets are used as case studies for the comparison of model performance parameters and model selection methods. We demonstrate the capabilities of sum of ranking differences (SRD) for model selection and ranking, and identify the best performance indicators and models. While exchanging the original training and (external) test sets does not affect the ranking of performance parameters, it yields improved models in certain cases (despite the smaller number of molecules in the training set). Performance parameters for external validation are clearly separated from the other performance metrics in the SRD analyses, highlighting their value in data fusion.
Acknowledgement
The authors wish to acknowledge Paola Gramatica from the University of Insubria for providing access to QSARINS 2.2.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
† Presented at the 8th International Symposium on Computational Methods in Toxicology and Pharmacology Integrating Internet Resources, CMTPI-2015, 21–25 June 2015, Chios, Greece.