Original Articles

Automatic model selection for high-dimensional survival analysis

Pages 62-76 | Received 29 Sep 2013, Accepted 26 May 2014, Published online: 18 Jun 2014

References

  • Hoerl A, Kennard R. Ridge regression: biased estimation for nonorthogonal problems. Technometrics. 1970;12:55–67. doi: 10.1080/00401706.1970.10488634
  • Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B Methodol. 1996;58:267–288.
  • Binder H, Schumacher M. Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models. BMC Bioinformatics. 2008;9(1):14. doi: 10.1186/1471-2105-9-14
  • Bøvelstad HM, Nygård S, Størvold HL, Aldrin M, Borgan Ø, Frigessi A, Lingjaerde OC. Predicting survival from microarray data: a comparative study. Bioinformatics. 2007;23:2080–2087. doi: 10.1093/bioinformatics/btm305
  • Kammers K, Lang M, Hengstler JG, Schmidt M, Rahnenführer J. Survival models with preclustered gene groups as covariates. BMC Bioinformatics. 2011;12(1):478. doi: 10.1186/1471-2105-12-478
  • López-Ibáñez M, Dubois-Lacoste J, Stützle T, Birattari M. The irace package, iterated race for automatic algorithm configuration, TR/IRIDIA/y2011-004, IRIDIA, Université Libre de Bruxelles, Belgium; 2011.
  • Hutter F, Hoos HH, Leyton-Brown K. Sequential model-based optimization for general algorithm configuration. In: Coello-Coello CA, editor, Learning and intelligent optimization. Berlin Heidelberg: Springer; 2011, p. 507–523.
  • Thornton C, Hutter F, Hoos HH, Leyton-Brown K. Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining; 2013. p. 847–855.
  • Koch P, Bischl B, Flasch O, Bartz-Beielstein T, Weihs C, Konen W. Tuning and evolution of support vector kernels. Evol Intell. 2012;5:153–170. doi: 10.1007/s12065-012-0073-8
  • Heagerty PJ, Zheng Y. Survival model predictive accuracy and ROC curves. Biometrics. 2005;61:92–105. doi: 10.1111/j.0006-341X.2005.030814.x
  • Cox D. Regression models and life-tables. J R Stat Soc Ser B Methodol. 1972;34:187–220.
  • Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc Ser B Statist Methodol. 2005;67:301–320. doi: 10.1111/j.1467-9868.2005.00503.x
  • Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting. Ann Statist. 2000;28:337–407. doi: 10.1214/aos/1016218223
  • Bühlmann P, Hothorn T. Boosting algorithms: regularization, prediction and model fitting. Statist Sci. 2007;22:477–505. doi: 10.1214/07-STS242
  • Hothorn T, Buehlmann P, Kneib T, Schmid M, Hofner B. mboost: model-based boosting. R package version 2.2-3; 2013.
  • Binder H, Allignol A, Schumacher M, Beyersmann J. Boosting for high-dimensional time-to-event data with competing risks. Bioinformatics. 2009;25:890–896. doi: 10.1093/bioinformatics/btp088
  • Binder H. CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks. R package version 1.4; 2013.
  • Mantel N. Chi-square tests with one degree of freedom; extensions of the Mantel-Haenszel procedure. J Amer Statist Assoc. 1963;58:690–700.
  • Ishwaran H, Kogalur U. Random survival forests for R. R News. 2007;7:25–31.
  • Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS. Random survival forests. Ann Appl Statist. 2008;2:841–860. doi: 10.1214/08-AOAS169
  • Saeys Y, Inza I, Larrañaga P. A review of feature selection techniques in bioinformatics. Bioinformatics. 2007;23:2507–2517. doi: 10.1093/bioinformatics/btm344
  • Kratz JR, He J, Van Den Eeden SK, Zhu ZH, Gao W, Pham PT, Mulvihill MS, Ziaei F, Zhang H, Su B, Zhi X, Quesenberry CP, Habel LA, Deng Q, Wang Z, Zhou J, Li H, Huang MC, Yeh CC, Segal MR, Ray MR, Jones KD, Raz DJ, Xu Z, Jahan TM, Berryman D, He B, Mann MJ, Jablons DM. A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international validation studies. Lancet. 2012;379:823–832. doi: 10.1016/S0140-6736(11)61941-7
  • Peng H, Long F, Ding C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell. 2005;27:1226–1238. doi: 10.1109/TPAMI.2005.159
  • Tibshirani R, Hastie T, Narasimhan B, Chu G. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA. 2002;99:6567–6572. doi: 10.1073/pnas.082099299
  • Hastie T, Tibshirani R, Narasimhan B, Chu G. pamr: Pam: prediction analysis for microarrays. R package version 1.54.1; 2013.
  • Birattari M, Stützle T, Paquete L, Varrentrapp K. A racing algorithm for configuring metaheuristics. In: Proceedings of the genetic and evolutionary computation conference, GECCO ’02; San Francisco, CA: Morgan Kaufmann Publishers Inc.; 2002. p. 11–18.
  • Conover WJ. Practical nonparametric statistics. Wiley series in probability and mathematical statistics. New York: Wiley; 1980.
  • Edgar R, Domrachev M, Lash AE. Gene expression omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002;30:207–210. doi: 10.1093/nar/30.1.207
  • Shedden K, Taylor JM, Enkemann SA, Tsao MS, Yeatman TJ, Gerald WL, Eschrich S, Jurisica I, Giordano TJ, Misek DE, Chang AC, Zhu CQ, Strumpf D, Hanash S, Shepherd FA, Ding K, Seymour L, Naoki K, Pennell N, Weir B, Verhaak R, Ladd-Acosta C, Golub T, Gruidl M, Sharma A, Szoke J, Zakowski M, Rusch V, Kris M, Viale A, Motoi N, Travis W, Conley B, Seshan VE, Meyerson M, Kuick R, Dobbin KK, Lively T, Jacobson JW, Beer DG. Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nat Med. 2008;14:822–827. doi: 10.1038/nm.1790
  • Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003;4:249–264. doi: 10.1093/biostatistics/4.2.249
  • Simon R. Resampling strategies for model assessment and selection. In: Fundamentals of data mining in genomics and proteomics. US: Springer; 2007. p. 173–186.
  • Bischl B, Mersmann O, Trautmann H, Weihs C. Resampling methods for meta-model validation with recommendations for evolutionary computation. Evol Comput. 2012;20:249–275. doi: 10.1162/EVCO_a_00069
  • Therneau TM. A package for survival analysis in S. R package version 2.37-4; 2013.
  • Therneau TM, Grambsch PM. Modeling survival data: extending the Cox model. New York: Springer; 2000.
  • Simon N, Friedman J, Hastie T, Tibshirani R. Regularization paths for Cox's proportional hazards model via coordinate descent. J Statist Softw. 2011;39:1–13.
  • Bischl B, Lang M, Mersmann O, Rahnenführer J, Weihs C. Computing on high performance clusters with R: packages BatchJobs and BatchExperiments. 1/2012, TU Dortmund University, 2012. Available from: http://sfb876.tu-dortmund.de/PublicPublicationFiles/bischl_etal_2012a.pdf.
  • Bilal E, Dutkowski J, Guinney J, Jang IS, Logsdon BA, Pandey G, Sauerwine BA, Shimoni Y, Moen Vollan HK, Mecham BH, Rueda OM, Tost J, Curtis C, Alvarez MJ, Kristensen VN, Aparicio S, Børresen-Dale AL, Caldas C, Califano A, Friend SH, Ideker T, Schadt EE, Stolovitzky GA, Margolin AA. Improving breast cancer survival analysis through competition-based multidimensional modeling. PLoS Comput Biol. 2013;9.
  • Rijn J, Bischl B, Torgo L, Gao B, Umaashankar V, Fischer S, Winter P, Wiswedel B, Berthold M, Vanschoren J. OpenML: a collaborative science platform. In: Blockeel H, Kersting K, Nijssen S, Železný F, editors, Machine learning and knowledge discovery in databases. Berlin Heidelberg: Springer; 2013. p. 645–649.
  • Snoek J, Larochelle H, Adams R. Practical Bayesian optimization of machine learning algorithms. In: Pereira F, Burges C, Bottou L, Weinberger K, editors, Advances in neural information processing systems 25. 2012, p. 2960–2968.
