Molecular activity prediction by means of supervised subspace projection based ensembles of classifiers

Pages 187-212 | Received 22 Nov 2017, Accepted 29 Dec 2017, Published online: 01 Feb 2018

References

  • S.M. Paul, D.S. Mytelka, C.T. Dunwiddie, C.C. Persinger, B.H. Munos, S.R. Lindborg, and A.L. Schacht, How to improve R&D productivity: The pharmaceutical industry's grand challenge, Nat. Rev. Drug Discov. 9 (2010), pp. 203–214.
  • Y. Fukunishi, Structure-based drug screening and ligand-based drug screening with machine learning, Comb. Chem. High Throughput Screen. 12 (2009), pp. 397–408.
  • C. Hansch and T. Fujita, ρ-σ-π Analysis. A method for the correlation of biological activity and chemical structure, J. Am. Chem. Soc. 86 (1964), pp. 1616–1626.
  • E. Byvatov, K.H. Baringhaus, G. Schneider, and H. Matter, A virtual screening filter for identification of cytochrome P450 2C9 (CYP2C9) inhibitors, QSAR Comb. Sci. 26 (2007), pp. 618–628.
  • S. Sciabola, I. Morao, and M.J. de Groot, Pharmacophoric fingerprint method (TOPP) for 3D-QSAR modeling: Application to CYP2D6 metabolic stability, J. Chem. Inf. Model. 47 (2007), pp. 76–84.
  • J.R. Votano, M. Parham, L.M. Hall, L.H. Hall, L.B. Kier, S. Oloff, and A. Tropsha, QSAR modeling of human serum protein binding with several modeling techniques utilizing structure–information representation, J. Med. Chem. 49 (2006), pp. 7169–7181.
  • L.H. Hall and L.B. Kier, Electrotopological state indices for atom types: A novel combination of electronic, topological, and valence state information, J. Chem. Inf. Comput. Sci. 35 (1995), pp. 1039–1045.
  • R. Carrasco-Velar, J. Prieto-Entenza, A. Antelo-Collado, J. Padrón-García, G. Cerruela-García, Á. Maceo-Pixa, R. Alcolea-Núñez, and L. Silva-Rojas, Hybrid reduced graph for SAR studies, SAR QSAR Environ. Res. 24 (2013), pp. 201–214.
  • G.C. García, B. Palacios-Bejarano, I.L. Ruiz, and M. Gómez-Nieto, Comparison of representational spaces based on structural information in the development of QSAR models for benzylamino enaminone derivatives, SAR QSAR Environ. Res. 23 (2012), pp. 751–774.
  • R. Todeschini and V. Consonni, Handbook of Molecular Descriptors, Vol. 11, John Wiley & Sons, Weinheim, Germany, 2008.
  • P. Yang, Y.H. Yang, B.B. Zhou, and A.Y. Zomaya, A review of ensemble methods in bioinformatics, Curr. Bioinform. 5 (2010), pp. 296–308.
  • T. Hancock, R. Put, D. Coomans, Y. Vander Heyden, and Y. Everingham, A performance comparison of modern statistical techniques for molecular descriptor selection and retention prediction in chromatographic QSRR studies, Chemom. Intell. Lab. Syst. 76 (2005), pp. 185–196.
  • C. Merkwirth, H. Mauser, T. Schulz-Gasch, O. Roche, M. Stahl, and T. Lengauer, Ensemble methods for classification in cheminformatics, J. Chem. Inf. Comput. Sci. 44 (2004), pp. 1971–1978.
  • D. Plouffe, A. Brinker, C. McNamara, K. Henson, N. Kato, K. Kuhen, A. Nagle, F. Adrián, J.T. Matzen, and P. Anderson, In silico activity profiling reveals the mechanism of action of antimalarials discovered in a high-throughput screen, Proceed. Nat. Acad. Sci. 105 (2008), pp. 9059–9064.
  • F. Hammann, C. Suenderhauf, and J.R. Huwyler, A binary ant colony optimization classifier for molecular activities, J. Chem. Inf. Model. 51 (2011), pp. 2690–2696.
  • F. Fontaine, M. Pastor, I. Zamora, and F. Sanz, Anchor−GRIND: Filling the gap between standard 3D QSAR and the GRid-INdependent Descriptors, J. Med. Chem. 48 (2005), pp. 2687–2694.
  • B. Gaüzère, L. Brun, and D. Villemin, Two new graphs kernels in chemoinformatics, Pattern Recognit. Lett. 33 (2012), pp. 2038–2047.
  • C. Helma, T. Cramer, S. Kramer, and L. De Raedt, Data mining and machine learning techniques for the identification of mutagenicity inducing substructures and structure activity relationships of noncongeneric compounds, J. Chem. Inf. Comput. Sci. 44 (2004), pp. 1402–1411.
  • The Carcinogenic Potency Database (CPDB). Available at https://toxnet.nlm.nih.gov/cpdb/cpdb.html. (accessed on 4 June 2017).
  • B.N. Ames, W.E. Durston, E. Yamasaki, and F.D. Lee, Carcinogens are mutagens: A simple test system combining liver homogenates for activation and bacteria for detection, Proceed. Nat. Acad. Sci. 70 (1973), pp. 2281–2285.
  • G. Hakimelahi and G. Khodarahmi, The identification of toxicophores for the prediction of mutagenicity, hepatotoxicity and cardiotoxicity, J. Iran Chem. Soc. 2 (2005), pp. 244–267.
  • J. Blagg, Structure–activity relationships for in vitro and in vivo toxicity, Annu. Rep. Med. Chem. 41 (2006), pp. 353–368.
  • B.P. Cho, F.A. Beland, and M.M. Marques, NMR structural studies of a 15-mer DNA sequence from a ras protooncogene, modified at the first base of codon 61 with the carcinogen 4-aminobiphenyl, Biochemistry 31 (1992), pp. 9587–9602.
  • C.W. Yap, PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints, J. Comput. Chem. 32 (2011), pp. 1466–1474.
  • R.E. Carhart, D.H. Smith, and R. Venkataraghavan, Atom pairs as molecular features in structure–activity studies: Definition and applications, J. Chem. Inf. Comput. Sci. 25 (1985), pp. 64–73.
  • C. Steinbeck, Y. Han, S. Kuhn, O. Horlacher, E. Luttmann, and E. Willighagen, The Chemistry Development Kit (CDK): An open-source Java library for chemo- and bioinformatics, J. Chem. Inf. Comput. Sci. 43 (2003), pp. 493–500.
  • E-State fragments in E-State Fingerprint. Available at http://www.scbdd.com/padel_desc/fps-estate/. (accessed on 3 November 2017).
  • J.L. Durant, B.A. Leland, D.R. Henry, and J.G. Nourse, Reoptimization of MDL keys for use in drug discovery, J. Chem. Inf. Comput. Sci. 42 (2002), pp. 1273–1280.
  • SMARTS Patterns for Functional Group in Substructure Fingerprint. Available at http://www.scbdd.com/cdk_desc/fps-substructure/. (accessed on 4 July 2017).
  • G.M. Downs, V.J. Gillet, J.D. Holliday, and M.F. Lynch, Review of ring perception algorithms for chemical graphs, J. Chem. Inf. Comput. Sci. 29 (1989), pp. 172–187.
  • I.H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, San Francisco, USA, 2005.
  • S. Garcia, J. Derrac, J. Cano, and F. Herrera, Prototype selection for nearest neighbor classification: Taxonomy and empirical study, IEEE Trans. Pattern Anal. Mach. Intell. 34 (2012), pp. 417–435.
  • A. Ben-David, A lot of randomness is hiding in accuracy, Eng. Appl. Artif. Intell. 20 (2007), pp. 875–885.
  • C.-C. Chang and C.-J. Lin, LIBSVM: A library for support vector machines, ACM Trans. Intell. Syst. Technol. 2 (2011), pp. 1–27.
  • L.I. Kuncheva, A theoretical study on six classifier fusion strategies, IEEE Trans. Pattern Anal. Mach. Intell. 24 (2002), pp. 281–286.
  • T.G. Dietterich, An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization, Mach. Learn. 40 (2000), pp. 139–157.
  • J. Maudes, J.J. Rodríguez, C. García-Osorio, and N. García-Pedrajas, Random feature weights for decision tree ensemble construction, Inf. Fusion 13 (2012), pp. 20–30.
  • N. García-Pedrajas, C. García-Osorio, and C. Fyfe, Nonlinear boosting projections for ensemble construction, J. Mach. Learn. Res. 8 (2007), pp. 1–33.
  • S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall PTR, Upper Saddle River, NJ, USA, 1998.
  • N. García-Pedrajas and C. García-Osorio, Constructing ensembles of classifiers using supervised projection methods based on misclassified instances, Expert Syst. Appl. 38 (2011), pp. 343–359.
  • N. García-Pedrajas, J. Maudes-Raedo, C. García-Osorio, and J.J. Rodríguez-Díez, Supervised subspace projections for constructing ensembles of classifiers, Inf. Sci. 193 (2012), pp. 1–21.
  • K. Fukunaga and J. Mantock, Nonparametric discriminant analysis, IEEE Trans. Pattern Anal. Mach. Intell. (1983), pp. 671–678.
  • B.-C. Kuo and D.A. Landgrebe, Nonparametric weighted feature extraction for classification, IEEE Trans. Geosci. Remote Sensing 42 (2004), pp. 1096–1105.
  • Q. Tian, J. Yu, and T.S. Huang, Boosting multiple classifiers constructed by hybrid discriminant analysis, in Multiple Classifier Systems: Proceedings of the 6th International Workshop, MCS 2005, Seaside, CA, 13–15 June, N.C. Oza, R. Polikar, J. Kittler, and F. Roli, eds., Springer Berlin Heidelberg, Berlin, Heidelberg, 2005, pp. 42–52.
  • W.J. Youden, Index for rating diagnostic tests, Cancer 3 (1950), pp. 32–35.
  • J. Demšar, Statistical comparisons of classifiers over multiple data sets, J. Mach. Learn. Res. 7 (2006), pp. 1–30.
  • S.L. Salzberg, On comparing classifiers: Pitfalls to avoid and a recommended approach, Data Min. Knowl. Discov. 1 (1997), pp. 317–328.
  • F. Wilcoxon, Individual comparisons by ranking methods, Biometrics 1 (1945), pp. 80–83.
  • D.J. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, CRC Press, New York, USA, 2003.
  • P.B. Nemenyi, Distribution-Free Multiple Comparisons, PhD thesis, Princeton University, USA, 1963.
