Abstract
Combining molecular biology-based and chemistry-based techniques can yield important insight. In the present work, Bayesian classifiers were developed from several statistically significant molecular properties of compiled datasets of drugs and non-drugs, including the disease category or organ of each drug. The results show that the classifiers give a useful and simple classification based on several different ligand efficiencies and molecular properties. Used as a ranking tool, they also provide early recall of drugs among non-drugs. Because the chemical space of compounds is addressed together with their anatomical characterization, chemical libraries can be improved to select for a specific organ or disease. Eventually, by including even finer detail, the method may help in designing libraries that target a specific pharmacological or toxicological chemical space. Alternatively, a lack of statistically significant differences in property density distributions may help to identify compounds that could be active on several organs or disease groups and, given their very similar or considerably overlapping chemical space, could therefore have wanted or unwanted side effects. The overlaps between the densities of several properties for organs or disease categories were calculated by integrating the area under the curves where they intersect. The naïve Bayesian classifiers are readily built, fast to score, and easily interpretable.
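As an illustration of the overlap calculation described above, the following is a minimal sketch, not the procedure used in this work. It assumes that each property density is a Gaussian kernel density estimate with a fixed bandwidth, and that the overlap is the integral of the pointwise minimum of the two curves, evaluated with the trapezoidal rule. The function names, the bandwidth, and the sample values are hypothetical and are not taken from the compiled datasets.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def density_overlap(samples_a, samples_b, bandwidth=0.5, grid_points=2001):
    """Integrate min(f_a, f_b) over a shared grid using the trapezoidal rule.

    The result lies between 0 (disjoint densities) and about 1 (identical densities).
    """
    f_a = gaussian_kde(samples_a, bandwidth)
    f_b = gaussian_kde(samples_b, bandwidth)
    # Extend the grid five bandwidths beyond the data so the kernel tails are covered.
    lo = min(min(samples_a), min(samples_b)) - 5.0 * bandwidth
    hi = max(max(samples_a), max(samples_b)) + 5.0 * bandwidth
    step = (hi - lo) / (grid_points - 1)
    ys = [min(f_a(lo + i * step), f_b(lo + i * step)) for i in range(grid_points)]
    return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


# Invented property values for two hypothetical organ classes (illustration only).
class_a = [1.2, 2.1, 2.8, 3.0, 3.6, 4.1]
class_b = [2.0, 2.9, 3.3, 3.9, 4.4, 5.2]
print(density_overlap(class_a, class_b))
```

Under these assumptions, an overlap close to 1 would indicate nearly indistinguishable property distributions for the two classes, while a value close to 0 would indicate well-separated ones.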
Acknowledgements
We thank Mare Oja for help with data collation and Dr Csaba Hetényi for helpful discussions and fruitful collaboration on similar topics. We thank the Estonian Science Foundation (Grant 7709) and the Estonian Ministry for Education and Research (Grant SF0140031Bs09) for support.
Notes
Presented at the 15th International Workshop on Quantitative Structure–Activity Relationships in Environmental and Health Sciences (QSAR2012), 18–22 June 2012, Tallinn, Estonia.