Predicting dichotomised outcomes from high-dimensional data in biomedicine

Armin Rauschenberger (corresponding author), Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg

https://orcid.org/0000-0001-6498-4801

Enrico Glaab, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg

https://orcid.org/0000-0003-3977-7469

Pages 1756-1771 | Received 23 Aug 2022, Accepted 28 Jun 2023, Published online: 26 Jul 2023

https://doi.org/10.1080/02664763.2023.2233057

References

C. Caspell-Garcia, T. Simuni, D. Tosun-Turgut, I. Wu, Y. Zhang, M. Nalls, A. Singleton, L.A. Shaw, J.-H. Kang, J.Q. Trojanowski, A. Siderowf, C. Coffey, S. Lasch, D. Aarsland, D. Burn, L.M. Chahine, A.J. Espay, E.D. Foster, K.A. Hawkins, I. Litvan, I. Richard, and D. Weintraub, Multiple modality biomarker prediction of cognitive impairment in prospectively followed de novo Parkinson disease, PLoS One 12 (2017), Article ID e0175674. doi: 10.1371/journal.pone.0175674
J. Cohen, The cost of dichotomization, Appl. Psychol. Meas. 7 (1983), pp. 249–253. doi: 10.1177/014662168300700301
N.V. Dawson and R. Weiss, Dichotomizing continuous variables in statistical analysis: A practice to avoid, Med. Decis. Making 32 (2012), pp. 225–226. doi: 10.1177/0272989X12437605
A.R. de Leon and B. Wu, Copula-based regression models for a bivariate mixed discrete and continuous outcome, Stat. Med. 30 (2011), pp. 175–185. doi: 10.1002/sim.4087
M. de Paula and C.A.R. Diniz, Generalized linear regression models incorporating original outcome distributions, Commun. Stat. – Theory Methods 45 (2016), pp. 5762–5786. doi: 10.1080/03610926.2014.948726
A. Dupuy and D. Nassar, Dichotomization of primary outcomes serves external validity, J. Invest. Dermatol. 134 (2014), pp. 266–267. doi: 10.1038/jid.2013.258
D.P. Farrington and R. Loeber, Some benefits of dichotomization in psychiatric and criminological research, Crim. Behav. Ment. Health 10 (2000), pp. 100–122. doi: 10.1002/cbm.349
V. Fedorov, F. Mannino, and R. Zhang, Consequences of dichotomization, Pharm. Stat. 8 (2009), pp. 50–61. doi: 10.1002/pst.331
G.M. Fitzmaurice and N.M. Laird, Regression models for a bivariate discrete and continuous outcome with clustering, J. Am. Stat. Assoc. 90 (1995), pp. 845–852. doi: 10.2307/2291318
J. Friedman, T. Hastie, and R. Tibshirani, Regularization paths for generalized linear models via coordinate descent, J. Stat. Softw. 33 (2010), pp. 1–22. doi: 10.18637/jss.v033.i01 (glmnet).
M.E. Fullard, B. Tran, S.X. Xie, J.B. Toledo, C. Scordia, C. Linder, R. Purri, D. Weintraub, J.E. Duda, L.M. Chahine, and J.F. Morley, Olfactory impairment predicts cognitive decline in early Parkinson's disease, Parkinsonism Relat. Disord. 25 (2016), pp. 45–51. doi: 10.1016/j.parkreldis.2016.02.013
F.E. Harrell, General aspects of fitting regression models – avoiding categorization; ordinal logistic regression, in Regression Modeling Strategies, Springer, Cham, 2015, pp. 311–325. doi: 10.1007/978-3-319-19425-7
S. Heritier and E. Ronchetti, Robust binary regression with continuous outcomes, Can. J. Stat. 32 (2004), pp. 239–249. doi: 10.2307/3315927
O. Kuss, The danger of dichotomizing continuous variables: A visualization, Teach. Stat. 35 (2013), pp. 78–79. doi: 10.1111/test.12006
R.C. MacCallum, S. Zhang, K.J. Preacher, and D.D. Rucker, On the practice of dichotomization of quantitative variables, Psychol. Methods 7 (2002), pp. 19–40. doi: 10.1037/1082-989X.7.1.19
K. Marek, D. Jennings, S. Lasch, A. Siderowf, and C. Tanner, The Parkinson Progression Marker Initiative (PPMI), Prog. Neurobiol. 95 (2011), pp. 629–635. doi: 10.1016/j.pneurobio.2011.09.005
B.K. Moser and L.P. Coombs, Odds ratios for a continuous outcome variable without dichotomizing, Stat. Med. 23 (2004), pp. 1843–1860. doi: 10.1002/sim.1776
M.P. Naeini and G.F. Cooper, Binary classifier calibration using an ensemble of piecewise linear regression models, Knowl. Inf. Syst. 54 (2018), pp. 151–170. doi: 10.1007/s10115-017-1133-2
O. Naggara, J. Raymond, F. Guilbert, D. Roy, A. Weill, and D.G. Altman, Analysis by categorizing or dichotomizing continuous variables is inadvisable: An example from the natural history of unruptured aneurysms, AJNR Am. J. Neuroradiol. 32 (2011), pp. 437–440. doi: 10.3174/ajnr.A2425
Z.S. Nasreddine, N.A. Phillips, V. Bédirian, S. Charbonneau, V. Whitehead, I. Collin, J.L. Cummings, and H. Chertkow, The Montreal Cognitive Assessment, MoCA: A brief screening tool for mild cognitive impairment, J. Am. Geriatr. Soc. 53 (2005), pp. 695–699. doi: 10.1111/j.1532-5415.2005.53221.x
A. Rauschenberger, I. Ciocănea-Teodorescu, M.A. Jonker, R.X. Menezes, and M.A. van de Wiel, Sparse classification with paired covariates, Adv. Data Anal. Classif. 14 (2020), pp. 571–588. doi: 10.1007/s11634-019-00375-6 (palasso).
A. Rauschenberger and E. Glaab, Predicting correlated outcomes from molecular data, Bioinform. 37 (2021), pp. 3889–3895. doi: 10.1093/bioinformatics/btab576 (joinet).
A. Rauschenberger, E. Glaab, and M.A. van de Wiel, Predictive and interpretable models via the stacked elastic net, Bioinform. 37 (2021), pp. 2012–2016. doi: 10.1093/bioinformatics/btaa535 (starnet).
J. Schwarz and D. Heider, GUESS: Projecting machine learning scores to well-calibrated probability estimates for clinical decision-making, Bioinform. 35 (2019), pp. 2458–2465. doi: 10.1093/bioinformatics/bty984 (CalibratR).
Y. Shentu and M. Xie, A note on dichotomization of continuous response variable in the presence of contamination and model misspecification, Stat. Med. 29 (2010), pp. 2200–2214. doi: 10.1002/sim.3966
D.L. Streiner, Breaking up is hard to do: The heartbreak of dichotomizing continuous data, Can. J. Psychiatry 47 (2002), pp. 262–266. doi: 10.1177/070674370204700307
S. Suissa, Binary methods for continuous outcomes: A parametric alternative, J. Clin. Epidemiol. 44 (1991), pp. 241–248. doi: 10.1016/0895-4356(91)90035-8
S. Suissa and L. Blais, Binary regression with continuous outcomes, Stat. Med. 14 (1995), pp. 247–255. doi: 10.1002/sim.4780140303
R. Ulrich and M. Wirtz, On the correlation of a naturally and an artificially dichotomized variable, Br. J. Stat. Psychol. 57 (2004), pp. 235–251. doi: 10.1348/0007110042307203
M.A. van de Wiel, J. Berkhof, and W.N. van Wieringen, Testing the prediction error difference between 2 predictors, Biostat. 10 (2009), pp. 550–560. doi: 10.1093/biostatistics/kxp011
B. Zadrozny and C. Elkan, Transforming classifier scores into accurate multiclass probability estimates, in Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 694–699. doi: 10.1145/775047.775151
H. Zou and T. Hastie, Regularization and variable selection via the elastic net, J. R. Stat. Soc., B: Stat. Methodol. 67 (2005), pp. 301–320. doi: 10.1111/j.1467-9868.2005.00503.x
