Classifying bio-concentration factor with random forest algorithm, influence of the bio-accumulative vs. non-bio-accumulative compound ratio to modelling result, and applicability domain for random forest model

Pages 967-981 | Received 10 Jun 2014, Accepted 03 Aug 2014, Published online: 06 Dec 2014

References

  • European Parliament, Regulation (EC) No 1907/2006 of the European Parliament and of the Council of 18 December 2006 concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) establishing a European Chemicals Agency, amending Directive 1999/45/EC and repealing Council Regulation (EEC) No 793/93 and Commission Regulation (EC) No 1488/94 as well as Council Directive 76/769/EEC and Commission Directives 91/155/EEC, 93/67/EEC, 93/105/EC and 2000/21/EC, Off. J. Eur. Union L396 (2006), pp. 1–849.
  • J.C. Dearden, QSAR modeling of bioaccumulation, in Predicting Chemical Toxicity and Fate, M.T.D. Cronin and D.J. Livingstone, eds., CRC Press, Boca Raton, FL, 2004, pp. 333–356.
  • M. Pavan, A.P. Worth, and T.I. Netzeva, Review of QSAR Models for Bioconcentration, EUR 22327EN, European Commission Joint Research Centre, Ispra, Italy, 2006.
  • G. Piir, S. Sild, A. Roncaglioni, E. Benfenati, and U. Maran, QSAR model for the prediction of bio-concentration factor using aqueous solubility and descriptors considering various electronic effects, SAR QSAR Environ. Res. 21 (2010), pp. 711–729. doi:10.1080/1062936X.2010.528596
  • X. Sun, Y. Li, X. Liu, J. Ding, Y. Wang, H. Shen, and Y. Chang, Classification of bioaccumulative and non-bioaccumulative chemicals using statistical learning approaches, Mol. Diversity 12 (2008), pp. 157–169. doi:10.1007/s11030-008-9092-x
  • M. Nendza and M. Müller, Screening for low aquatic bioaccumulation (1): Lipinski’s ‘Rule of 5’ and molecular size, SAR QSAR Environ. Res. 21 (2010), pp. 495–512. doi:10.1080/1062936X.2010.502295
  • M. Nendza and T. Herbst, Screening for low aquatic bioaccumulation (2): Physico-chemical constraints, SAR QSAR Environ. Res. 22 (2011), pp. 351–364. doi:10.1080/1062936X.2011.569896
  • S. Strempel, M. Nendza, M. Scheringer, and K. Hungerbühler, Using conditional inference trees and random forests to predict the bioaccumulation potential of organic chemicals, Environ. Toxicol. Chem. 32 (2013), pp. 1187–1195. doi:10.1002/etc.2150
  • A. Fernández, A. Lombardo, R. Rallo, A. Roncaglioni, F. Giralt, and E. Benfenati, Quantitative consensus of bioaccumulation models for integrated testing strategies, Environ. Int. 45 (2012), pp. 51–58. doi:10.1016/j.envint.2012.03.004
  • M. Galar, A. Fernández, E. Barrenechea, H. Bustince, and F. Herrera, A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches, IEEE Trans. Syst. Man. Cybern. C 42 (2012), pp. 463–484. doi:10.1109/TSMCC.2011.2161285
  • A.P. Toropova, A.A. Toropov, A. Lombardo, A. Roncaglioni, E. Benfenati, and G. Gini, A new bioconcentration factor model based on SMILES and indices of presence of atoms, Eur. J. Med. Chem. 45 (2010), pp. 4399–4402. doi:10.1016/j.ejmech.2010.06.019
  • JChem for Excel 6.0.0. ChemAxon, 2013; software available at http://www.chemaxon.com/products/jchem-for-office/
  • E. Bolton, Y. Wang, P.A. Thiessen, and S.H. Bryant, PubChem: Integrated platform of small molecules and biological activities, Annu. Rep. Comput. Chem. 4 (2008), pp. 217–241. doi:10.1016/S1574-1400(08)00012-1
  • SciFinder, Chemical Abstracts Service, Columbus, OH, 2014; software available at https://scifinder.cas.org.
  • US Environmental Protection Agency, Persistent Bioaccumulative Toxic (PBT) Chemicals; Lowering of Reporting Thresholds for Certain PBT Chemicals; Community Right-to-Know Toxic Chemical Reporting, Fed. Reg. 64 (1999), pp. 58666–58753.
  • G. Piir, S. Sild, and U. Maran, Comparative analysis of local and consensus quantitative structure-activity relationship approaches for the prediction of bio-concentration factor, SAR QSAR Environ. Res. 24 (2013), pp. 175–199. doi:10.1080/1062936X.2012.762426
  • S. Dimitrov, N. Dimitrova, T. Parkerton, M. Comber, M. Bonnell, and O. Mekenyan, Base-line model for identifying the bioaccumulation potential of chemicals, SAR QSAR Environ. Res. 16 (2005), pp. 531–554. doi:10.1080/10659360500474623
  • C.W. Yap, PaDEL-Descriptor: An open source software to calculate molecular descriptors and fingerprints, J. Comput. Chem. 32 (2011), pp. 1466–1474. doi:10.1002/jcc.v32.7
  • T. Cheng, Y. Zhao, X. Li, F. Lin, Y. Xu, X. Zhang, Y. Li, R. Wang, and L. Lai, Computation of octanol-water partition coefficients by guiding an additive model with knowledge, J. Chem. Inf. Model. 47 (2007), pp. 2140–2148. doi:10.1021/ci700257y
  • Estimation Programs Interface Suite™ for Microsoft® Windows v 4.11, US Environmental Protection Agency, Washington DC, 2012; software available at http://www.epa.gov/opptintr/exposure/pubs/episuite.htm.
  • I.V. Tetko, J. Gasteiger, R. Todeschini, A. Mauri, D. Livingstone, P. Ertl, V.A. Palyulin, E.V. Radchenko, N.S. Zefirov, A.S. Makarenko, V.Y. Tanchuk, and V.V. Prokopenko, Virtual computational chemistry laboratory - design and description, J. Comput. Aid. Mol. Des. 19 (2005), pp. 453–463. doi:10.1007/s10822-005-8694-y
  • VCCLAB, 2005; software available at http://www.vcclab.org.
  • L. Breiman, Random Forests, Machine Learning 45 (2001), pp. 5–32. doi:10.1023/A:1010933404324
  • R: A language and environment for statistical computing 3.0.2, R Foundation for Statistical Computing, Vienna, 2013; software available at http://www.R-project.org/.
  • F. Sahigara, K. Mansouri, D. Ballabio, A. Mauri, V. Consonni, and R. Todeschini, Comparison of different approaches to define the applicability domain of QSAR models, Molecules 17 (2012), pp. 4791–4810. doi:10.3390/molecules17054791
  • V. Ruusmann, S. Sild, and U. Maran, QSAR DataBank – An approach for the digital organization and archiving of QSAR model information, J. Cheminf. 6 (2014), p. 25. doi:10.1186/1758-2946-6-25
  • J.A. Platts, D. Butina, M.H. Abraham, and A. Hersey, Estimation of molecular linear free energy relation descriptors using a group contribution approach, J. Chem. Inf. Comput. Sci. 39 (1999), pp. 835–845. doi:10.1021/ci980339t
