Predicting dichotomised outcomes from high-dimensional data in biomedicine

Pages 1756-1771 | Received 23 Aug 2022, Accepted 28 Jun 2023, Published online: 26 Jul 2023

References

  • C. Caspell-Garcia, T. Simuni, D. Tosun-Turgut, I. Wu, Y. Zhang, M. Nalls, A. Singleton, L.A. Shaw, J.-H. Kang, J.Q. Trojanowski, A. Siderowf, C. Coffey, S. Lasch, D. Aarsland, D. Burn, L.M. Chahine, A.J. Espay, E.D. Foster, K.A. Hawkins, I. Litvan, I. Richard, and D. Weintraub, Multiple modality biomarker prediction of cognitive impairment in prospectively followed de novo Parkinson disease, PLoS One 12 (2017), Article ID e0175674. doi: 10.1371/journal.pone.0175674
  • J. Cohen, The cost of dichotomization, Appl. Psychol. Meas. 7 (1983), pp. 249–253. doi: 10.1177/014662168300700301
  • N.V. Dawson and R. Weiss, Dichotomizing continuous variables in statistical analysis: A practice to avoid, Med. Decis. Making 32 (2012), pp. 225–226. doi: 10.1177/0272989X12437605
  • A.R. de Leon and B. Wu, Copula-based regression models for a bivariate mixed discrete and continuous outcome, Stat. Med. 30 (2011), pp. 175–185. doi: 10.1002/sim.4087
  • M. de Paula and C.A.R. Diniz, Generalized linear regression models incorporating original outcome distributions, Commun. Stat. – Theory Methods 45 (2016), pp. 5762–5786. doi: 10.1080/03610926.2014.948726
  • A. Dupuy and D. Nassar, Dichotomization of primary outcomes serves external validity, J. Invest. Dermatol. 134 (2014), pp. 266–267. doi: 10.1038/jid.2013.258
  • D.P. Farrington and R. Loeber, Some benefits of dichotomization in psychiatric and criminological research, Crim. Behav. Ment. Health 10 (2000), pp. 100–122. doi: 10.1002/cbm.349
  • V. Fedorov, F. Mannino, and R. Zhang, Consequences of dichotomization, Pharm. Stat. 8 (2009), pp. 50–61. doi: 10.1002/pst.331
  • G.M. Fitzmaurice and N.M. Laird, Regression models for a bivariate discrete and continuous outcome with clustering, J. Am. Stat. Assoc. 90 (1995), pp. 845–852. doi: 10.2307/2291318
  • J. Friedman, T. Hastie, and R. Tibshirani, Regularization paths for generalized linear models via coordinate descent, J. Stat. Softw. 33 (2010), pp. 1–22. doi: 10.18637/jss.v033.i01 (glmnet).
  • M.E. Fullard, B. Tran, S.X. Xie, J.B. Toledo, C. Scordia, C. Linder, R. Purri, D. Weintraub, J.E. Duda, L.M. Chahine, and J.F. Morley, Olfactory impairment predicts cognitive decline in early Parkinson's disease, Parkinsonism Relat. Disord. 25 (2016), pp. 45–51. doi: 10.1016/j.parkreldis.2016.02.013
  • F.E. Harrell, General aspects of fitting regression models – avoiding categorization; ordinal logistic regression, in Regression Modeling Strategies, Springer, Cham, 2015, pp. 311–325. doi: 10.1007/978-3-319-19425-7
  • S. Heritier and E. Ronchetti, Robust binary regression with continuous outcomes, Can. J. Stat. 32 (2004), pp. 239–249. doi: 10.2307/3315927
  • O. Kuss, The danger of dichotomizing continuous variables: A visualization, Teach. Stat. 35 (2013), pp. 78–79. doi: 10.1111/test.12006
  • R.C. MacCallum, S. Zhang, K.J. Preacher, and D.D. Rucker, On the practice of dichotomization of quantitative variables, Psychol. Methods 7 (2002), pp. 19–40. doi: 10.1037/1082-989X.7.1.19
  • K. Marek, D. Jennings, S. Lasch, A. Siderowf, and C. Tanner, The Parkinson Progression Marker Initiative (PPMI), Prog. Neurobiol. 95 (2011), pp. 629–635. doi: 10.1016/j.pneurobio.2011.09.005
  • B.K. Moser and L.P. Coombs, Odds ratios for a continuous outcome variable without dichotomizing, Stat. Med. 23 (2004), pp. 1843–1860. doi: 10.1002/sim.1776
  • M.P. Naeini and G.F. Cooper, Binary classifier calibration using an ensemble of piecewise linear regression models, Knowl. Inf. Syst. 54 (2018), pp. 151–170. doi: 10.1007/s10115-017-1133-2
  • O. Naggara, J. Raymond, F. Guilbert, D. Roy, A. Weill, and D.G. Altman, Analysis by categorizing or dichotomizing continuous variables is inadvisable: An example from the natural history of unruptured aneurysms, AJNR Am. J. Neuroradiol. 32 (2011), pp. 437–440. doi: 10.3174/ajnr.A2425
  • Z.S. Nasreddine, N.A. Phillips, V. Bédirian, S. Charbonneau, V. Whitehead, I. Collin, J.L. Cummings, and H. Chertkow, The Montreal Cognitive Assessment, MoCA: A brief screening tool for mild cognitive impairment, J. Am. Geriatr. Soc. 53 (2005), pp. 695–699. doi: 10.1111/j.1532-5415.2005.53221.x
  • A. Rauschenberger, I. Ciocănea-Teodorescu, M.A. Jonker, R.X. Menezes, and M.A. van de Wiel, Sparse classification with paired covariates, Adv. Data Anal. Classif. 14 (2020), pp. 571–588. doi: 10.1007/s11634-019-00375-6 (palasso).
  • A. Rauschenberger and E. Glaab, Predicting correlated outcomes from molecular data, Bioinform. 37 (2021), pp. 3889–3895. doi: 10.1093/bioinformatics/btab576 (joinet).
  • A. Rauschenberger, E. Glaab, and M.A. van de Wiel, Predictive and interpretable models via the stacked elastic net, Bioinform. 37 (2021), pp. 2012–2016. doi: 10.1093/bioinformatics/btaa535 (starnet).
  • J. Schwarz and D. Heider, GUESS: Projecting machine learning scores to well-calibrated probability estimates for clinical decision-making, Bioinform. 35 (2019), pp. 2458–2465. doi: 10.1093/bioinformatics/bty984 (CalibratR).
  • Y. Shentu and M. Xie, A note on dichotomization of continuous response variable in the presence of contamination and model misspecification, Stat. Med. 29 (2010), pp. 2200–2214. doi: 10.1002/sim.3966
  • D.L. Streiner, Breaking up is hard to do: The heartbreak of dichotomizing continuous data, Can. J. Psychiatry 47 (2002), pp. 262–266. doi: 10.1177/070674370204700307
  • S. Suissa, Binary methods for continuous outcomes: A parametric alternative, J. Clin. Epidemiol. 44 (1991), pp. 241–248. doi: 10.1016/0895-4356(91)90035-8
  • S. Suissa and L. Blais, Binary regression with continuous outcomes, Stat. Med. 14 (1995), pp. 247–255. doi: 10.1002/sim.4780140303
  • R. Ulrich and M. Wirtz, On the correlation of a naturally and an artificially dichotomized variable, Br. J. Math. Stat. Psychol. 57 (2004), pp. 235–251. doi: 10.1348/0007110042307203
  • M.A. van de Wiel, J. Berkhof, and W.N. van Wieringen, Testing the prediction error difference between 2 predictors, Biostat. 10 (2009), pp. 550–560. doi: 10.1093/biostatistics/kxp011
  • B. Zadrozny and C. Elkan, Transforming classifier scores into accurate multiclass probability estimates, in Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 694–699. doi: 10.1145/775047.775151
  • H. Zou and T. Hastie, Regularization and variable selection via the elastic net, J. R. Stat. Soc., B: Stat. Methodol. 67 (2005), pp. 301–320. doi: 10.1111/j.1467-9868.2005.00503.x