Commentary

Risk Model Development and Validation in Clinical Oncology: Lessons Learned

Pages 1-11 | Received 15 Feb 2022, Accepted 16 Oct 2022, Published online: 03 Nov 2022

References

  • Shmueli G. To explain or to predict? Statist Sci. 2010;25(3):289–310. doi:10.1214/10-STS330.
  • Piccininni M, Konigorski S, Rohmann JL, Kurth T. Directed acyclic graphs and causal thinking in clinical risk prediction modeling. BMC Med Res Methodol. 2020;20(1):179. doi:10.1186/s12874-020-01058-z.
  • Carmona-Bayonas A, Jiménez-Fonseca P, Gallego J, Msaouel P. Causal considerations can inform the interpretation of surprising associations in medical registries. Cancer Invest. 2022;40(1):1–13. doi:10.1080/07357907.2021.1999971.
  • Kent DM, van Klaveren D, Paulus JK, D'Agostino R, Goodman S, Hayward R, et al. The predictive approaches to treatment effect heterogeneity (PATH) statement: explanation and elaboration. Ann Intern Med. 2020;172(1):W1–W25. doi:10.7326/M18-3668.
  • Msaouel P, Lee J, Thall PF. Making patient-specific treatment decisions using prognostic variables and utilities of clinical outcomes. Cancers (Basel). 2021;13(11):2741. doi:10.3390/cancers13112741.
  • Cuzick J. Prognosis vs treatment interaction. JNCI Cancer Spectr. 2018;2:pky006. doi:10.1093/jncics/pky006.
  • Lee J, Thall PF, Msaouel P. Precision Bayesian phase I-II dose-finding based on utilities tailored to prognostic subgroups. Stat Med. 2021;40(24):5199–5217. doi:10.1002/sim.9120.
  • Zhou Z-R, Wang W-W, Li Y, Jin K-R, Wang X-Y, Wang Z-W, et al. In-depth mining of clinical data: the construction of clinical prediction model with R. Ann Transl Med. 2019;7(23):796. doi:10.21037/atm.2019.08.63.
  • Meng X-L. Enhancing (publications on) data quality: deeper data minding and fuller data confession. J Royal Stat Soc: Series A. 2021;184(4):1161–1175. doi:10.1111/rssa.12762.
  • Pagel C, Yates CA. Tackling the pandemic with (biased) data. Science. 2021;374(6566):403–404. doi:10.1126/science.abi6602.
  • Tleyjeh IM, Kashour T, Mandrekar J, Petitti DB. Overlooked shortcomings of observational studies of interventions in coronavirus disease 2019: an illustrated review for the clinician. Open Forum Infect Dis. 2021;8(8):ofab317. doi:10.1093/ofid/ofab317.
  • Hand DJ. Dark data: why what you don’t know matters. Princeton: Princeton University Press; 2020.
  • Little RJA, Rubin DB. Statistical analysis with missing data. (2nd ed). Hoboken (NJ): Wiley; 2002.
  • Sterne JAC, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009;338:b2393. doi:10.1136/bmj.b2393.
  • Lee KJ, Tilling KM, Cornish RP, Little RJA, Bell ML, Goetghebeur E, STRATOS initiative, et al. Framework for the treatment and reporting of missing data in observational studies: the treatment and reporting of missing data in observational studies framework. J Clin Epidemiol. 2021;134:79–88. doi:10.1016/j.jclinepi.2021.01.008.
  • Amrhein V, Korner-Nievergelt F, Roth T. The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research. PeerJ. 2017;5:e3544. doi:10.7717/peerj.3544.
  • Altman DG, Bland JM. Absence of evidence is not evidence of absence. BMJ. 1995;311(7003):485. doi:10.1136/bmj.311.7003.485.
  • Amrhein V, Greenland S, McShane B. Scientists rise up against statistical significance. Nature. 2019;567(7748):305–307. doi:10.1038/d41586-019-00857-9.
  • Hahn A, Göhler AC, Hermann C, Winkler A. Even when you know it is a placebo, you experience less sadness: first evidence from an experimental open-label placebo investigation. J Affect Disord. 2022;304:159–166. doi:10.1016/j.jad.2022.02.043.
  • Riley RD, Ensor J, Snell KIE, Harrell FE, Martin GP, Reitsma JB, et al. Calculating the sample size required for developing a clinical prediction model. BMJ. 2020;368:m441. doi:10.1136/bmj.m441.
  • Chen L. Overview of clinical prediction models. Ann Transl Med. 2020;8(4):71. doi:10.21037/atm.2019.11.121.
  • Peduzzi P, Concato J, Feinstein AR, Holford TR. Importance of events per independent variable in proportional hazards regression analysis. II. Accuracy and precision of regression estimates. J Clin Epidemiol. 1995;48(12):1503–1510. doi:10.1016/0895-4356(95)00048-8.
  • Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol. 1996;49(12):1373–1379. doi:10.1016/S0895-4356(96)00236-3.
  • Concato J, Peduzzi P, Holford TR, Feinstein AR. Importance of events per independent variable in proportional hazards analysis. I. Background, goals, and general strategy. J Clin Epidemiol. 1995;48(12):1495–1501. doi:10.1016/0895-4356(95)00510-2.
  • van Smeden M, Moons KG, de Groot JA, Collins GS, Altman DG, Eijkemans MJ, et al. Sample size for binary logistic prediction models: beyond events per variable criteria. Stat Methods Med Res. 2019;28(8):2455–2474. doi:10.1177/0962280218784726.
  • van Smeden M, de Groot JAH, Moons KGM, Collins GS, Altman DG, Eijkemans MJC, et al. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis. BMC Med Res Methodol. 2016;16(1):163. doi:10.1186/s12874-016-0267-3.
  • Meng X-L. Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 US presidential election. Ann Appl Stat. 2018;12(2):685–726. doi:10.1214/18-AOAS1161SF.
  • Bradley VC, Kuriwaki S, Isakov M, Sejdinovic D, Meng X-L, Flaxman S, et al. Unrepresentative big surveys significantly overestimated US vaccine uptake. Nature. 2021;600(7890):695–700. doi:10.1038/s41586-021-04198-4.
  • Msaouel P. The big data paradox in clinical practice. Cancer Invest. 2022;40(7):567–576. doi:10.1080/07357907.2022.2084621.
  • Greenland S, Hofman A. Multiple comparisons controversies are about context and costs, not frequentism versus Bayesianism. Eur J Epidemiol. 2019;34(9):801–808. doi:10.1007/s10654-019-00552-z.
  • Streiner DL. Best (but oft-forgotten) practices: the multiple problems of multiplicity-whether and how to correct for many statistical tests. Am J Clin Nutr. 2015;102(4):721–728. doi:10.3945/ajcn.115.113548.
  • Schrodi SJ. The use of multiplicity corrections, order statistics and generalized family-wise statistics with application to genome-wide studies. PLoS One. 2016;11(4):e0154472. doi:10.1371/journal.pone.0154472.
  • Dmitrienko A, D'Agostino RB. Multiplicity considerations in clinical trials. N Engl J Med. 2018;378(22):2115–2122. doi:10.1056/NEJMra1709701.
  • Vickerstaff V, Omar RZ, Ambler G. Methods to adjust for multiple comparisons in the analysis and sample size calculation of randomised controlled trials with multiple primary outcomes. BMC Med Res Methodol. 2019;19(1):129. doi:10.1186/s12874-019-0754-4.
  • Hochberg Y. A sharper Bonferroni procedure for multiple tests of significance. Biometrika. 1988;75(4):800–802. doi:10.1093/biomet/75.4.800.
  • Hommel G. A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika. 1988;75(2):383–386. doi:10.1093/biomet/75.2.383.
  • Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Royal Stat Soc: Series B. 1995;57(1):289–300. doi:10.1111/j.2517-6161.1995.tb02031.x.
  • Brzyski D, Peterson CB, Sobczyk P, Candès EJ, Bogdan M, Sabatti C, et al. Controlling the rate of GWAS false discoveries. Genetics. 2017;205(1):61–75. doi:10.1534/genetics.116.193987.
  • Ge T, Chen C-Y, Ni Y, Feng Y-CA, Smoller JW. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun. 2019;10(1):1776. doi:10.1038/s41467-019-09718-5.
  • Gelman A, Hill J. Data analysis using regression and multilevel/hierarchical models. Cambridge, New York: Cambridge University Press; 2007.
  • Spiegelhalter DJ. The art of statistics: how to learn from data. (1st US ed). New York: Basic Books; 2019.
  • Shapiro DD, Msaouel P. Causal diagram techniques for urologic oncology research. Clin Genitourin Cancer. 2021;19:271.e1–271.e7.
  • Sperrin M, Jenkins D, Martin GP, Peek N. Explicit causal reasoning is needed to prevent prognostic models being victims of their own success. J Am Med Inform Assoc. 2019;26(12):1675–1676. doi:10.1093/jamia/ocz197.
  • Blum KA, Gupta S, Tickoo SK, Chan TA, Russo P, Motzer RJ, et al. Sarcomatoid renal cell carcinoma: biology, natural history and management. Nat Rev Urol. 2020;17(12):659–678. doi:10.1038/s41585-020-00382-9.
  • Heinze G, Wallisch C, Dunkler D. Variable selection - a review and recommendations for the practicing statistician. Biom J. 2018;60(3):431–449. doi:10.1002/bimj.201700067.
  • Harrell FE. Regression modeling strategies. (2nd ed). Switzerland: Springer International Publishing; 2015.
  • Harrell FE, Jr., Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statist Med. 1996;15(4):361–387. doi:10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4.
  • Putt ME. Assessing risk factors with information beyond P value thresholds: statistical significance does not equal clinical importance. Cancer. 2021;127(8):1180–1185. doi:10.1002/cncr.33369.
  • Altman DG, Andersen PK. Bootstrap investigation of the stability of a Cox regression model. Stat Med. 1989;8(7):771–783. doi:10.1002/sim.4780080702.
  • Sauerbrei W, Schumacher M. A bootstrap resampling procedure for model building: application to the Cox regression model. Stat Med. 1992;11(16):2093–2109. doi:10.1002/sim.4780111607.
  • Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. New York (NY): Springer-Verlag; 2009.
  • Efron B, Hastie T, Tibshirani R. Least angle regression. Ann Stat. 2004;32:407–499.
  • Greenland S. Invited commentary: variable selection versus shrinkage in the control of multiple confounders. Am J Epidemiol. 2008;167(5):523–529. doi:10.1093/aje/kwm355.
  • Greenland S. Bayesian perspectives for epidemiological research. II. Regression analysis. Int J Epidemiol. 2007;36(1):195–202. doi:10.1093/ije/dyl289.
  • Gruber MHJ. Improving efficiency by shrinkage: the James-Stein and ridge regression estimators. New York: Marcel Dekker; 1998.
  • Pavlou M, Ambler G, Seaman S, De Iorio M, Omar RZ. Review and evaluation of penalised regression methods for risk prediction in low-dimensional data with few events. Stat Med. 2016;35(7):1159–1177. doi:10.1002/sim.6782.
  • Halabi S, Li C, Luo S. Developing and validating risk assessment models of clinical outcomes in modern oncology. JCO Precis Oncol. 2019;3(3):1–12. doi:10.1200/PO.19.00068.
  • Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, Van Calster B, et al. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22. doi:10.1016/j.jclinepi.2019.02.004.
  • Lunn D, Spiegelhalter D, Thomas A, Best N. The BUGS project: evolution, critique and future directions. Stat Med. 2009;28(25):3049–3067. doi:10.1002/sim.3680.
  • Vehtari A, Gelman A, Gabry J. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat Comput. 2017;27(5):1413–1432. doi:10.1007/s11222-016-9696-4.
  • Austin PC, Steyerberg EW. Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models. Stat Methods Med Res. 2017;26(2):796–808. doi:10.1177/0962280214558972.
  • Steyerberg EW, Harrell FE Jr. Prediction models need appropriate internal, internal-external, and external validation. J Clin Epidemiol. 2016;69:245–247. doi:10.1016/j.jclinepi.2015.04.005.
  • Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162(1):55–63. doi:10.7326/M14-0697.
  • Vandenbroucke JP, von Elm E, Altman DG, Gøtzsche PC, Mulrow CD, Pocock SJ, et al.; STROBE Initiative. Strengthening the reporting of observational studies in epidemiology (STROBE): explanation and elaboration. Int J Surg. 2014;12(12):1500–1524. doi:10.1016/j.ijsu.2014.07.014.
