
The relative performance of ensemble methods with deep convolutional neural networks for image classification

Pages 2800–2818 | Received 05 Apr 2017, Accepted 11 Feb 2018, Published online: 26 Feb 2018

References

  • D. Benkeser, C. Ju, S. Lendle, and M. van der Laan, Online cross-validation-based ensemble learning, Statistics in Medicine, 37 (2018), pp. 249–260. doi: 10.1002/sim.7320
  • J.O. Berger and M.E. Bock, Combining independent normal mean estimation problems with unknown variances, Ann. Statist. 4 (1976), pp. 642–648. doi: 10.1214/aos/1176343472
  • L. Breiman, Bagging predictors, Mach. Learn. 24 (1996), pp. 123–140.
  • L. Breiman, Stacked regressions, Mach. Learn. 24 (1996), pp. 49–64.
  • L. Breiman, Random forests, Mach. Learn. 45 (2001), pp. 5–32. doi: 10.1023/A:1010933404324
  • A. Chambaz, W. Zheng, and M. van der Laan, Data-adaptive inference of the optimal treatment rule and its mean reward. the masked bandit, U.C. Berkeley Division of Biostatistics Working Paper Series, 2016.
  • K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, Learning phrase representations using RNN encoder–decoder for statistical machine translation, preprint (2014). Available at arXiv:1406.1078.
  • A. Choromanska, M. Henaff, M. Mathieu, G.B. Arous, and Y. LeCun, The loss surfaces of multilayer networks, in AISTATS, San Diego, California, USA, 2015.
  • M.M. Davies and M.J. van der Laan, Optimal spatial prediction using ensemble machine learning, Int. J. Biostat. 12 (2016), pp. 179–201.
  • T.G. Dietterich, Ensemble methods in machine learning, in International workshop on multiple classifier systems, Springer Berlin Heidelberg, Berlin, Heidelberg, NY, 2000, pp. 1–15.
  • B. Efron and C. Morris, Combining possibly related estimation problems, J. R. Stat. Soc. Ser. B 35 (1973), pp. 379–421.
  • Y. Freund and R.E. Schapire, Experiments with a new boosting algorithm, in Proceedings of Machine Learning Research, Vol. 96, Bari, Italy, 1996, pp. 148–156.
  • Y. Freund and R.E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, J. Comput. Syst. Sci. 55 (1997), pp. 119–139. doi: 10.1006/jcss.1997.1504
  • J.H. Friedman, Greedy function approximation: a gradient boosting machine, Ann. Statist. 29 (2001), pp. 1189–1232. doi: 10.1214/aos/1013203451
  • I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, MIT Press, 2016. Available at http://www.deeplearningbook.org
  • E.J. Green and W.E. Strawderman, A James–Stein type estimator for combining unbiased and possibly biased estimators, J. Am. Stat. Assoc. 86 (1991), pp. 1001–1006. doi: 10.1080/01621459.1991.10475144
  • A. Grover and J. Leskovec, node2vec: Scalable feature learning for networks, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, San Francisco, CA, USA, 2016, pp. 855–864.
  • K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, preprint (2015). Available at arXiv:1512.03385.
  • K. He, X. Zhang, S. Ren, and J. Sun, Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification, in Proceedings of the IEEE International Conference on Computer Vision, Araucano Park, Santiago, Chile, 2015, pp. 1026–1034.
  • T. Hothorn, P. Bühlmann, S. Dudoit, A. Molinaro, and M.J. van der Laan, Survival ensembles, Biostatistics 7 (2006), pp. 355–373. doi: 10.1093/biostatistics/kxj011
  • S. Ioffe and C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, preprint (2015). Available at arXiv:1502.03167.
  • C. Ju, M. Combs, S.D. Lendle, J.M. Franklin, R. Wyss, S. Schneeweiss, and M.J. van der Laan, Propensity score prediction for electronic healthcare dataset using super learner and high-dimensional propensity score method, U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 351, 2016.
  • A. Krizhevsky and G. Hinton, Learning multiple layers of features from tiny images, Technical report, University of Toronto, 2009.
  • A. Krizhevsky, I. Sutskever, and G.E. Hinton, ImageNet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems, Lake Tahoe, Nevada, USA, 2012, pp. 1097–1105.
  • L.I. Kuncheva, C.J. Whitaker, C.A. Shipp, and R.P. Duin, Limits on the majority vote accuracy in classifier fusion, Pattern Anal. Appl. 6 (2003), pp. 22–31. doi: 10.1007/s10044-002-0173-7
  • Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature 521 (2015), pp. 436–444. doi: 10.1038/nature14539
  • M. Lin, Q. Chen, and S. Yan, Network in network, preprint (2013). Available at arXiv:1312.4400.
  • A.R. Luedtke and M.J. van der Laan, Super-learning of an optimal dynamic treatment rule, Int. J. Biostat. 12 (2016), pp. 305–332.
  • M.T. Luong, H. Pham, and C.D. Manning, Effective approaches to attention-based neural machine translation, preprint (2015). Available at arXiv:1508.04025.
  • T.M. Mitchell, Machine Learning, Vol. 45, McGraw Hill, Burr Ridge, IL, 1997, pp. 870–877.
  • B. Perozzi, R. Al-Rfou, and S. Skiena, DeepWalk: Online learning of social representations, in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, 2014, pp. 701–710.
  • R. Pirracchio, M.L. Petersen, M. Carone, M.R. Rigon, S. Chevret, and M.J. van der Laan, Mortality prediction in intensive care units with the super ICU learner algorithm (SICULA): A population-based study, Lancet Respir. Med. 3 (2015), pp. 42–52. doi: 10.1016/S2213-2600(14)70239-5
  • E.C. Polley and M.J. van der Laan, Super learner in prediction, U.C. Berkeley Division of Biostatistics Working Paper Series, 2010.
  • J.N.K. Rao and K. Subrahmaniam, Combining independent estimators and estimation in linear regression with unequal variances, Biometrics 27 (1971), pp. 971–990. doi: 10.2307/2528832
  • D.B. Rubin and S. Weisberg, The variance of a linear combination of independent estimators using estimated weights, Biometrika 62 (1975), pp. 708–709. doi: 10.1093/biomet/62.3.708
  • O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A.C. Berg, and L. Fei-Fei, ImageNet large scale visual recognition challenge, Int. J. Comput. Vis. 115 (2015), pp. 211–252. doi: 10.1007/s11263-015-0816-y
  • K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, preprint (2014). Available at arXiv:1409.1556.
  • S.E. Sinisi, E.C. Polley, M.L. Petersen, S.Y. Rhee, and M.J. van der Laan, Super learning: An application to the prediction of HIV-1 drug resistance, Stat. Appl. Genet. Mol. Biol. 6 (2007). doi: 10.2202/1544-6115.1240
  • N. Srivastava, G.E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Dropout: A simple way to prevent neural networks from overfitting, J. Mach. Learn. Res. 15 (2014), pp. 1929–1958.
  • Y. Su, S. Shan, X. Chen, and W. Gao, Hierarchical ensemble of global and local classifiers for face recognition, IEEE. Trans. Image. Process. 18 (2009), pp. 1885–1896. doi: 10.1109/TIP.2009.2021737
  • C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, Going deeper with convolutions, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, 2015, pp. 1–9.
  • M.J. van der Laan and S. Dudoit, Unified cross-validation methodology for selection among estimators and a general cross-validated adaptive epsilon-net estimator: Finite sample oracle inequalities and examples, U.C. Berkeley Division of Biostatistics Working Paper Series, 2003.
  • M.J. van der Laan, E.C. Polley, and A.E. Hubbard, Super learner, Stat. Appl. Genet. Mol. Biol. 6 (2007). doi: 10.2202/1544-6115.1309
  • A. Veit, M. Wilber, and S. Belongie, Residual networks are exponential ensembles of relatively shallow networks, preprint (2016). Available at arXiv:1605.06431.
  • H. Wang, A. Cruz-Roa, A. Basavanhally, H. Gilmore, N. Shih, M. Feldman, J. Tomaszewski, F. Gonzalez, and A. Madabhushi, Cascaded ensemble of convolutional neural networks and handcrafted features for mitosis detection, in SPIE Medical Imaging, International Society for Optics and Photonics, San Diego, CA, 2014, p. 90410B.
  • D.H. Wolpert, Stacked generalization, Neural. Netw. 5 (1992), pp. 241–259. doi: 10.1016/S0893-6080(05)80023-1
  • R. Wyss, S. Schneeweiss, M. van der Laan, S.D. Lendle, C. Ju, and J.M. Franklin, Using super learner prediction modeling to improve high-dimensional propensity score estimation, Epidemiology 29 (2018), pp. 96–106. doi: 10.1097/EDE.0000000000000762
  • M.D. Zeiler and R. Fergus, Visualizing and understanding convolutional networks, in European Conference on Computer Vision, Springer, Columbus, OH, 2014, pp. 818–833.
