Research Article

Survival stacking with multiple data types using pseudo-observation-based-AUC loss

Pages 858-870 | Received 15 Jul 2021, Accepted 08 Feb 2022, Published online: 15 May 2022

References

  • Aalen, O. O., and S. Johansen. 1978. An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scandinavian Journal of Statistics 5 (3):141–150.
  • Ambale-Venkatesh, B., X. Yang, C. O. Wu, K. Liu, W. G. Hundley, R. McClelland, A. S. Gomes, A. R. Folsom, S. Shea, E. Guallar, et al. 2017. Cardiovascular event prediction by machine learning: The multi-ethnic study of atherosclerosis. Circulation Research 121 (9):1092–1101. doi:10.1161/CIRCRESAHA.117.311312.
  • Andersen, P. K., and M. Pohar Perme. 2010. Pseudo-observations in survival analysis. Statistical Methods in Medical Research 19 (1):71–99. doi:10.1177/0962280209105020.
  • Binder, H., A. Allignol, M. Schumacher, and J. Beyersmann. 2009. Boosting for high-dimensional time-to-event data with competing risks. Bioinformatics 25 (7):890–896. doi:10.1093/bioinformatics/btp088.
  • Binder, H., and M. Schumacher. 2008. Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models. BMC Bioinformatics 9 (14):1–10.
  • Binder, N., T. A. Gerds, and P. K. Andersen. 2014. Pseudo-observations for competing risks with covariate dependent censoring. Lifetime Data Analysis 20 (2):303–315. doi:10.1007/s10985-013-9247-7.
  • Corey, K. M., S. Kashyap, E. Lorenzi, S. A. Lagoo-Deenadayalan, K. Heller, K. Whalen, S. Balu, M. T. Heflin, S. R. McDonald, M. Swaminathan, et al. 2018. Development and validation of machine learning models to identify high-risk surgical patients using automatically curated electronic health record data (Pythia): A retrospective, single-site study. PLOS Medicine 15 (11):1–19.
  • Cox, D. R. 1972. Regression models and life-tables. Journal of the Royal Statistical Society. Series B (Methodological) 34 (2):187–220. doi:10.1111/j.2517-6161.1972.tb00899.x.
  • Curtis, C., S. Shah, S.-F. Chin, G. Turashvili, O. Rueda, M. Dunning, D. Speed, A. Lynch, S. Samarajiwa, Y. Yuan, et al. 2012. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 486 (7403):346–352. doi:10.1038/nature10983.
  • Fong, Y., S. Yin, and Y. Huang. 2016. Combining biomarkers linearly and nonlinearly for classification using the area under the ROC curve. Statistics in Medicine 35 (21):3792–3809. doi:10.1002/sim.6956.
  • Friedman, J., T. Hastie, and R. Tibshirani. 2010. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software 33 (1):1–22. URL http://www.jstatsoft.org/v33/i01/.
  • Gonzalez Ginestet, P., A. Kotalik, D. M. Vock, J. Wolfson, and E. E. Gabriel. 2020. Stacked inverse probability of censoring weighted bagging: A case study in the InfCareHIV register. Journal of the Royal Statistical Society: Series C (Applied Statistics) 70 (1):51–65.
  • Graw, F., T. A. Gerds, and M. Schumacher. 2009. On pseudo-values for regression analysis in competing risks models. Lifetime Data Analysis 15 (2):241–255. doi:10.1007/s10985-008-9107-z.
  • Gulshan, V., L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, J. Cuadros, et al. 2016. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316 (22):2402–2410. doi:10.1001/jama.2016.17216.
  • Hastie, T. J., and R. J. Tibshirani. 1990. Generalized additive models. Boca Raton: CRC press.
  • Heagerty, P. J., T. Lumley, and M. S. Pepe. 2000. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 56 (2):337–344. doi:10.1111/j.0006-341X.2000.00337.x.
  • Ishwaran, H., T. A. Gerds, U. B. Kogalur, R. D. Moore, S. J. Gange, and B. M. Lau. 2014. Random survival forests for competing risks. Biostatistics 15 (4):757–773. doi:10.1093/biostatistics/kxu010.
  • Ishwaran, H., U. B. Kogalur, E. H. Blackstone, and M. S. Lauer. 2008. Random survival forests. The Annals of Applied Statistics 2 (3):841–860.
  • Mukherjee, A., R. Russell, S.-F. Chin, B. Liu, O. Rueda, H. Ali, G. Turashvili, B. Mahler-Araujo, I. Ellis, S. Aparicio, et al. 2018. Associations between genomic stratification of breast cancer and centrally reviewed tumour pathology in the METABRIC cohort. npj Breast Cancer 4 (1):12. doi:10.1038/s41523-018-0056-8.
  • Nygård Johansen, M., S. Lundbye‐Christensen, and E. Thorlund Parner. 2020. Regression models using parametric pseudo-observations. Statistics in Medicine 39 (22):2949–2961. doi:10.1002/sim.8586.
  • Pepe, M. S. 2000. Combining diagnostic test results to increase accuracy. Biostatistics 1 (2):123–140. doi:10.1093/biostatistics/1.2.123.
  • Pepe, M. S., Z. Feng, Y. Huang, G. Longton, R. Prentice, I. M. Thompson, and Y. Zheng. 2007. Integrating the predictiveness of a marker with its performance as a classifier. American Journal of Epidemiology 167 (3):362–368. doi:10.1093/aje/kwm305.
  • Polley, E., E. LeDell, C. Kennedy, S. Lendle, and M. van der Laan. 2019. SuperLearner: Super learner prediction. R package version 2.0-26.
  • R Core Team. 2020. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. URL https://www.R-project.org/ (accessed 2021 Dec 21).
  • Royston, P., and D. G. Altman. 2013. External validation of a Cox prognostic model: Principles and methods. BMC Medical Research Methodology 13 (1):1–15.
  • Royston, P., and W. Sauerbrei. 2004. A new measure of prognostic separation in survival data. Statistics in Medicine 23 (5):723–748. doi:10.1002/sim.1621.
  • Sabathé, C., P. K. Andersen, C. Helmer, T. A. Gerds, H. Jacqmin-Gadda, and P. Joly. 2020. Regression analysis in an illness-death model with interval-censored data: A pseudo-value approach. Statistical Methods in Medical Research 29 (3):752–764. doi:10.1177/0962280219842271.
  • Sachs, M. C., A. Discacciati, Å. H. Everhov, O. Olén, and E. E. Gabriel. 2019. Ensemble prediction of time-to-event outcomes with competing risks: A case-study of surgical complications in Crohn’s disease. Journal of the Royal Statistical Society: Series C (Applied Statistics) 68 (5):1431–1446.
  • Sachs, M. C., and E. E. Gabriel. 2021. eventglm: Regression models for event history outcomes. R package version 1.1.1. URL https://sachsmc.github.io/eventglm/ (accessed 2021 Dec 21).
  • Schumacher, M., G. Bastert, H. Bojar, K. Hübner, M. Olschewski, W. Sauerbrei, C. Schmoor, C. Beyerle, R. Neumann, and H. Rauschecker. 1994. Randomized 2 x 2 trial evaluating hormonal treatment and the duration of chemotherapy in node-positive breast cancer patients. German Breast Cancer Study Group. Journal of Clinical Oncology 12 (10):2086–2093. doi:10.1200/JCO.1994.12.10.2086.
  • Steyerberg, E., A. Vickers, N. Cook, T. Gerds, M. Gonen, N. Obuchowski, M. Pencina, and M. Kattan. 2010. Assessing the performance of prediction models: A framework for traditional and novel measures. Epidemiology (Cambridge, Mass.) 21 (1):128–138. doi:10.1097/EDE.0b013e3181c30fb2.
  • Therneau, T. M. 2020. A package for survival analysis in R. R package version 3.2-7. URL https://CRAN.R-project.org/package=survival (accessed 2021 Dec 21).
  • Therneau, T. M., and P. M. Grambsch. 2000. Modeling survival data: Extending the Cox Model. New York: Springer-Verlag.
  • Tibshirani, R. J., and B. Efron. 2002. Pre-validation and inference in microarrays. Statistical Applications in Genetics and Molecular Biology 1 (1):1–18.
  • Van der Laan, M. J., E. C. Polley, and A. E. Hubbard. 2007. Super learner. Statistical Applications in Genetics and Molecular Biology 6 (1). doi:10.2202/1544-6115.1309.
  • Weng, S. F., J. Reps, J. Kai, J. M. Garibaldi, and N. Qureshi. 2017. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLOS ONE 12 (4):1–14. doi:10.1371/journal.pone.0174944.