Biostatistics & Epidemiology Volume 4, 2020 - Issue 1

U.S. Department of Veterans Affairs Panel on Statistics and Analytics on Healthcare Datasets: Challenges and Recommended Strategies

Statistical modeling methods: challenges and strategies

Steven S. Henley (a, b, c), corresponding author
a Department of Medicine, Loma Linda University School of Medicine, Loma Linda, CA, USA; b Center for Advanced Statistics in Education, VA Loma Linda Healthcare System, Loma Linda, CA, USA; c Martingale Research Corporation, Plano, TX, USA

Richard M. Golden (d)
d School of Behavioral and Brain Sciences, University of Texas at Dallas, Richardson, TX, USA

T. Michael Kashner (a, b, e)
a Department of Medicine, Loma Linda University School of Medicine, Loma Linda, CA, USA; b Center for Advanced Statistics in Education, VA Loma Linda Healthcare System, Loma Linda, CA, USA; e Department of Veterans Affairs, Office of Academic Affiliations (10A2D), Washington, DC, USA

Pages 105-139 | Received 13 Jun 2018, Accepted 24 Apr 2019, Published online: 22 Jul 2019

https://doi.org/10.1080/24709360.2019.1618653

References

Agresti A. Categorical data analysis. 2nd ed. New York: Wiley-Interscience; 2002.
Google Scholar
Rencher AC, Christensen WF. Methods of multivariate analysis. 3rd ed. Hoboken (NJ): Wiley; 2012.
Google Scholar
Christensen R. Log-linear models and logistic regression. New York: Springer-Verlag; 1997.
Google Scholar
Fox J. Applied regression analysis and generalized linear models. 3rd ed. Los Angeles (CA): SAGE; 2015.
Google Scholar
Draper NR, Smith H. Applied regression analysis. 3rd ed. New York: Wiley; 1998.
Google Scholar
Harrell FE. Regression modeling strategies: with applications to linear models, logistic regression, and survival analysis. New York: Springer; 2001.
Google Scholar
Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference, and prediction. 2nd ed. New York: Springer; 2009.
Google Scholar
Steyerberg EW. Clinical prediction models: a practical approach to development, validation, and updating. 1st; Reprint ed. New York: Springer; 2010.
Google Scholar
Royston P, Sauerbrei W. Multivariate model-building: a pragmatic approach to regression analysis based on fractional polynomials for modelling continuous variables. New York: John Wiley & Sons; 2008.
Google Scholar
Hastie T, Tibshirani R. Generalized additive models. 1st ed. New York: Chapman and Hall; 1990.
Google Scholar
McCullagh P, Nelder JA. Generalized linear models. 2nd ed. London; New York: Chapman and Hall; 1989.
Google Scholar
Hosmer DW, Lemeshow S, Sturdivant RX. Applied logistic regression. 3rd ed. New York: Wiley-Interscience; 2013.
Google Scholar
Hosmer DW, Lemeshow S, May S. Applied survival analysis: regression modeling of time-to-event data. 2nd ed. Hoboken (NJ): Wiley-Interscience; 2008.
Google Scholar
Zar JH. Biostatistical analysis. 5th ed. New York: Pearson; 2009.
Google Scholar
Gentle JE. Elements of computational statistics. New York: Springer-Verlag; 2002.
Google Scholar
Vittinghoff E, Glidden DV, Shiboski SC, et al. Regression methods in biostatistics: linear, logistic, survival, and repeated measures models. 2nd ed. New York: Springer; 2012.
Google Scholar
Daniel WW, Cross CL. Biostatistics: a foundation for analysis in the health sciences. 11th ed. Hoboken (NJ): John Wiley & Sons; 2019.
Google Scholar
White H. Estimation, inference, and specification analysis. Cambridge; New York: Cambridge University Press; 1994.
Google Scholar
White H. Asymptotic theory for econometricians. Revised ed. New York: Academic Press; 2001.
Google Scholar
Eddy DM, Hollingworth W, Caro JJ, et al. Model transparency and validation: a report of the ISPOR-SMDM modeling good research practices task force–7. Value Health. 2012;15(6):843–850.
Google Scholar
Collins GS, Reitsma JB, Altman DG, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMC Med. 2015;13:1.
Google Scholar
Depaoli S, van de Schoot R. Improving transparency and replication in Bayesian statistics: the WAMBS-checklist. Psychol Methods. 2017;22(2):240–261.
Google Scholar
Thiese MS, Arnold ZC, Walker SD. The misuse and abuse of statistics in biomedical research. Biochem Med. 2015;25(1):5–11.
Google Scholar
Caro JJ, Briggs AH, Siebert U, et al. Modeling good research practices–overview: a report of the ISPOR-SMDM modeling good research practices task force–1. Value Health. 2012;15(6):796–803.
Google Scholar
Wei B. Exponential family nonlinear models. New York: Springer; 1998.
Google Scholar
Bishop C. Neural networks for pattern recognition. New York: Oxford University Press; 1995.
Google Scholar
Golden RM. Mathematical methods for neural network analysis and design. Cambridge (MA): MIT Press; 1996.
Google Scholar
Duda RO, Hart PE, Stork DG. Pattern classification. 2nd ed. New York: Wiley-Interscience; 2000.
Google Scholar
Bishop CM. Pattern recognition and machine learning. 1st (reprint) ed. New York: Springer; 2016.
Google Scholar
Murphy KP. Machine learning: a probabilistic perspective. 1st ed. Cambridge (MA): The MIT Press; 2012.
Google Scholar
Zheng A, Casari A. Feature engineering for machine learning: principles and techniques for data scientists. 1st ed. Sebastopol (CA): O’Reilly Media; 2018.
Google Scholar
Guyon I, Elisseff A. An introduction to variable and feature selection. J Mach Learn Res. 2003;3:1157–1182.
Google Scholar
Kuhn A, Johnson K. Applied predictive modeling. New York: Springer; 2013.
Google Scholar
Collett D. Modelling binary data. 2nd ed. Boca Raton (FL): Chapman & Hall/CRC; 2003.
Google Scholar
Dobson AJ, Barnett A. An introduction to generalized linear models. 3rd ed. Boca Raton (FL): Chapman & Hall/CRC; 2008.
Google Scholar
Seber GAF, Wild CJ. Nonlinear regression; Hoboken (NJ): Wiley-Interscience; 2003.
Google Scholar
Hardin JW, Hilbe JM. Generalized linear models and extensions. 3rd ed. College Station (TX): Stata Press; 2012.
Google Scholar
Hilbe JM. Logistic regression models. New York: Chapman and Hall; 2009.
Google Scholar
Weisberg S. Applied linear regression. 4th ed. Hoboken (NJ): Wiley; 2013.
Google Scholar
Zhou X-H, Zhou C, Lui D, et al. Applied missing data analysis in the health sciences. 1st ed. Hoboken (NJ): Wiley; 2014.
Google Scholar
Little RJA, Rubin DB. Statistical analysis with missing data. 2nd ed. Hoboken (NJ): Wiley-Interscience; 2002.
Google Scholar
Schafer JL. Analysis of incomplete multivariate data. New York: Chapman & Hall/CRC; 1997.
Google Scholar
Allison PD. Missing data. Thousand Oaks (CA): Sage; 2001.
Google Scholar
Golden RM, Henley SS, White H, et al. Consequences of model misspecification for maximum likelihood estimation with missing data. Econometrics. 2019;7(3):37. doi:10.3390/econometrics7030037.
Google Scholar
Zhou Z-H. Challenges and strategies in analysis of missing data. Biostat Epidemiol. 2020;4(1):15–23. doi:10.1080/24709360.2018.1469810.
Google Scholar
Cook RD, Weisberg S. Applied regression including computing and graphics. 1st ed. New York: Wiley-Interscience; 1999.
Google Scholar
Cook RD, Weisberg S. Residuals and influence in regression. 1st ed. New York: Chapman and Hall/CRC; 1983.
Google Scholar
Fox J. Regression diagnostics: an introduction (quantitative applications in the social sciences). 1st ed. Newbury Park (CA): SAGE; 1991.
Google Scholar
Hamilton LC. Regression with graphics: a second course in applied statistics. Pacific Grove, CA: Brooks/Cole; 1992.
Google Scholar
Belsley DA, Kuh E, Welsch RE. Regression diagnostics: identifying influential data and sources of collinearity. 1st ed. New York: Wiley-Interscience; 1980.
Google Scholar
Wasserman L. All of nonparametric statistics. New York: Springer; 2007.
Google Scholar
Corder GW, Foreman DI. Nonparametric statistics: a step-by-step approach. 2nd ed. Hoboken (NJ): Wiley; 2014.
Google Scholar
Hollander M, Wolfe DA, Chicken E. Nonparametric statistical methods. 3rd ed. New York: John Wiley & Sons; 2014.
Google Scholar
Kashner TM, Henley SS, Golden RM, Zhou Z-HA. Making causal inferences about treatment effect sizes from observational datasets. Biostat Epidemiol. 2020;4(1):48–83. doi:10.1080/24709360.2019.1681211.
Google Scholar
Pearl J. Causal diagrams for empirical research. Biometrika. 1995;82:669–688.
Google Scholar
Pearl J. Causality: models, reasoning, and inference. Cambridge: University of Cambridge Press; 2000.
Google Scholar
VanderWeele TJ, Robins JM. Directed acyclic graphs, sufficient causes, and the properties of conditioning on a common effect. Am J Epidemiol. 2007;166(9):1096–1104.
Google Scholar
Hoeting JA, Madigan D, Raftery AE, et al. Bayesian model averaging: a tutorial. Stat Sci. 1999;14(4):382–401.
Google Scholar
Garcia-Perez MA. Statistical conclusion validity: some common threats and simple remedies. Front Psychol. 2012;3(325). doi: 10.3389/fpsyg.2012.00325
Google Scholar
Cook TD, Campbell DT. Quasi-experimentation: design & analysis issues for field settings. 1st ed. Boston (MA): Houghton Mifflin; 1979.
Google Scholar
Burnham KP, Anderson DR. Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed. New York: Springer; 2002.
Google Scholar
Burnham KP, Anderson DR. Multimodel inference: understanding AIC and BIC in model selection. Sociol Methods Res. 2004;33(2):261–304.
Google Scholar
Claeskens G, Hjort NL. Model selection and model averaging. Cambridge; New York: Cambridge University Press; 2008.
Google Scholar
Steckler A, McLeroy KR. The importance of external validity. Am J Public Health. 2008;98(1):9–10.
Google Scholar
Persaud N, Mamdani MM. External validity: the neglected dimension in evidence ranking. J Eval Clin Pract. 2006;12(4):450–453.
Google Scholar
Maronna RA, Martin RD, Yohai VJ, et al. Robust statistics: theory and methods (with R). 2nd ed. Hoboken (NJ): John Wiley & Sons; 2019.
Google Scholar
Pearl J, Bareinboim E. External validity: from do-calculus to transportability across populations. Stat Sci. 2014;29(4):579–595.
Google Scholar
Justice AC, Covinsky KE, Berlin JA. Assessing the generalizability of prognostic information. Ann Intern Med. 1999;130(6):515–524.
Google Scholar
Altman DG, Royston P. What do we mean by validating a prognostic model? Stat Med. 2000;19(4):453–473.
Google Scholar
Khorsan R, Crawford C. How to assess the external validity and model validity of therapeutic trials: a conceptual approach to systematic review methodology. Evid Based Compl Altern Med. 2014;2014:1–12.
Google Scholar
Steyerberg EW, Harrell FE, Borsboom GJ, et al. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol. 2001;54(8):774–781.
Google Scholar
Brakenridge SC, Phelan HA, Henley SS, et al. Early blood product and crystalloid volume resuscitation: risk association with multiple organ dysfunction after severe blunt traumatic injury. J Trauma. 2011;71(2):299–305.
Google Scholar
Brakenridge SC, Henley SS, Kashner TM, et al. Comparing clinical predictors of deep venous thrombosis vs. pulmonary embolus after severe blunt injury: a new paradigm for post-traumatic venous thromboembolism? J Trauma. 2013;74(5):1231–1237.
Google Scholar
Westover AN, Kashner TM, Winhusen TM, et al. A systematic approach to subgroup analyses in a smoking cessation trial. Am J Drug Alcohol Abuse. 2015;41(6):498–507.
Google Scholar
Henley SS, Kashner TM, Golden RM, et al. Response to letter regarding “a systematic approach to subgroup analyses in a smoking cessation trial”. Am J Drug Alcohol Abuse. 2016;42(1):112–113.
Google Scholar
Gorelick DA, McPherson S. Improving the analysis and modeling of substance use. Am J Drug Alcohol Abuse. 2015;41(6):475–478.s
Google Scholar
Volkow ND. Director’s report to the national advisory council on drug abuse, September 2015. US National Institutes of Health: National Institute on Drug Abuse; 2015.
Google Scholar
Winhusen TM, Somoza EC, Brigham GS, et al. Impact of attention-deficit/hyperactivity disorder (ADHD) treatment on smoking cessation intervention in ADHD smokers: a randomized, double-blind, placebo-controlled trial. J Clin Psychiatry. 2010;71(12):1680–1688.
Google Scholar
Golden RM, Henley SS, White H, et al. Generalized information matrix tests for detecting model misspecification. Econometrics. 2016;4(4):46.
Google Scholar
Golden RM, Henley SS, White H, et al. New directions in information matrix testing: eigenspectrum tests. In: Chen X, Swanson NR, editors. Causality, prediction, and specification analysis: recent advances and future directions: essays in honor of Halbert L. White, Jr. (Festschrift Hal White Conference). New York: Springer; 2013. p. 145–178.
Google Scholar
White H. Maximum likelihood estimation of misspecified models. Econometrica. 1982;50(1):1–25.
Google Scholar
Bates DM, Watts DG. Nonlinear regression analysis and its applications. New York: Wiley-Interscience; 2007.
Google Scholar
Hemmert GAJ, Schons LM, Wieseke J, et al. Log-likelihood-based pseudo-R2 in logistic regression: deriving sample-sensitive benchmarks. Sociol Methods Res. 2016;47(3):507–531.
Google Scholar
Kullback S, Leibler R. On information and sufficiency. Ann Math Stat. 1951;22:79–86.
Google Scholar
Akaike H. Information theory and an extension of the maximum likelihood principle. Second International Symposium on Information Theory; 1973; Budapest.
Google Scholar
Linhart H, Zucchini W. Model selection. New York: Wiley; 1986.
Google Scholar
Konishi S, Kitagawa G. Information criteria and statistical modeling. New York: Springer; 2008.
Google Scholar
Hurvich CM, Tsai C-L. Regression and time series model selection in small samples. Biometrika. 1989;76(2):297–307.
Google Scholar
Hurvich CM, Tsai CL. A corrected Akaike information criterion for vector autoregressive model selection. J Time Ser Anal. 1993;14:271–279.
Google Scholar
Schwartz G. Estimating the dimension of a model. Ann Stat. 1978;6:461–464.
Google Scholar
Ando T. Bayesian predictive information criterion for the evaluation of hierarchical Bayesian and empirical Bayes models. Biometrika. 2007;94(2):443–458.
Google Scholar
Ando T. Predictive Bayesian model selection. Am J Math Manage Sci. 2011;31:13–38.
Google Scholar
Bozdogan H. Akaike’s information criterion and recent developments in information complexity. J Math Psychol. 2000;44(1):62–91.
Google Scholar
Takeuchi K. Distribution of information statistics and a criterion of model fitting for adequacy of models. Math Sci. 1976;153:12–18.
Google Scholar
Djuric PM. Asymptotic MAP criteria for model selection. IEEE Trans Signal Process. 1998;46:2726–2735.
Google Scholar
Lv J, Liu JS. Model selection principles in misspecified models. J R Stat Soc Series BStat Methodol. 2014;76(1):141–167.
Google Scholar
Poskitt DS. Precision, complexity and Bayesian model determination. J R Stat Soc Series B Methodol. 1987;49(2):199–208.
Google Scholar
Cavanaugh JE. A large-sample model selection criterion based on Kullback’s symmetric divergence. Stat Probab Lett. 1999;42:333–343.
Google Scholar
Seghouane AK. A note on overfitting properties of KIC and KICc. Signal Process. 2006;86:3055–3060.
Google Scholar
White H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica. 1980;48(4):817–838.
Google Scholar
White H. Consequences and detection of misspecified nonlinear regression models. J Am Stat Assoc. 1981;76:419–433.
Google Scholar
Begg MD, Lagakos S. On the consequences of model misspecification in logistic regression. Environ Health Perspect. 1990;87:69–75.
Google Scholar
Lehmann EL. Model specification: the views of Fisher and Neyman, and later developments. Stat Sci. 1990;5:160–168.
Google Scholar
Godfrey LG. Misspecification tests in econometrics: the Lagrange multiplier principle and other approaches. Cambridge, MA: Cambridge University Press; 1989.
Google Scholar
Gelman A, Hill J. Data analysis using regression and multilevel/hierarchical models. Cambridge, MA: Cambridge University Press; 2006.
Google Scholar
Davison AC, Tsai CL. Regression model diagnostics. Int Stat Rev. 1992;60:337–353.
Google Scholar
Cook RD. Detection of influential observation in linear regression. Technometrics. 1977;19(1):15–18.
Google Scholar
Hodge VJ, Austin J. A survey of outlier detection methodologies. Artif Intell Rev. 2004;22(2):85–126.
Google Scholar
Rousseeuw P, Leroy A. Robust regression and outlier detection. 3rd ed. New York: John Wiley & Sons; 1996.
Google Scholar
White H. Specification testing in dynamic models. Advances in Econometrics - Fifth World Congress; 1987.
Google Scholar
Wilks SS. The large-sample distribution of the likelihood ratio for testing composite hypotheses. Ann Math Stat. 1938;9:60–62.
Google Scholar
Davies N, Petruccelli JD. Detecting non-linearity in time series. J R Stat Soc Series D Stat. 1986;35(2):271–280.
Google Scholar
Hosmer DW, Taber S, Lemeshow S. The importance of assessing the fit of logistic regression models: a case study. Am J Public Health. 1991;81:1630–1635.
Google Scholar
Hosmer DW, Hosmer T, Le Cessie S, et al. A comparison of goodness-of-fit tests for the logistic regression model. Stat Med. 1997;16:965–980.
Google Scholar
Allison PD. Measures of fit for logistic regression. Paper 1485-2014. SAS Global Forum; 2014; Washington (DC).
Google Scholar
Xie XJ. Goodness-of-fit tests for logistic regression models: evaluating logistic model fit when continuous covariates are present. Riga: VDM Verlag Dr. Mueller E.K.; 2008.
Google Scholar
Hosmer DW, Hjort NL. Goodness-of-fit processes for logistic regression: simulation results. Stat Med. 2002;21(18):2723–2738.
Google Scholar
Kuss O. Global goodness-of-fit tests in logistic regression with sparse data. Stat Med. 2002;21:3789–3801.
Google Scholar
Cho JS, Phillips PCB. Pythagorean generalization of testing the equality of two symmetric positive definite matrices. J Econom. 2018;202(1):45–56.
Google Scholar
Prokhorov A, Schepsmeier U, Zhu Y. Generalized information matrix tests for copulas. Econom Rev. 2019; in press.
Google Scholar
Huang W, Prokhorov A. A goodness-of-fit test for copulas. Econom Rev. 2014;33(7):751–771.
Google Scholar
Cho J, Phillips PCB. Testing equality of covariance matrices via Pythagorean means. Cowles Foundation Discussion Paper No. 1970; 2014.
Google Scholar
Ibragimov R, Prokhorov A. Heavy tails and copulas: topics in dependence modelling in economics and finance. Hackensack (NJ): World Scientific Publishing; 2017.
Google Scholar
Schepsmeier U. Efficient information based goodness-of-fit tests for vine copula models with fixed margins: a comprehensive review. J Multivar Anal. 2015;138:34–52.
Google Scholar
Schepsmeier U. A goodness-of-fit test for regular vine copula models. Econom Rev. 2016;38(1):25–46.
Google Scholar
Hall A. The information matrix test for the linear model. Rev Econ Stud. 1987;54:257–263.
Google Scholar
Nelder JA, Wedderburn RWM. Generalized linear models. J R Stat Soc Series A. 1972;135:370–384.
Google Scholar
Golden RM. Making correct statistical inferences using a wrong probability model. J Math Psychol. 1995;39(1):3–20.
Google Scholar
Archer KJ, Lemeshow S. Goodness-of-fit test for a logistic regression model fitted using survey sample data. Stata J. 2006;6:97–105.
Google Scholar
Copas JB. Unweighted sum of squares test for proportions. Appl Stat. 1989;38:71–80.
Google Scholar
Deng X, Wan S, Zhang B. An improved goodness-of-test for logistic regression models based on case-control data by random partition. Commun Stat Simul Comput. 2009;38:233–243.
Google Scholar
Hosmer DW, Lemeshow S, Klar J. Goodness-of-fit testing for multiple logistic regression analysis when the estimated probabilities are small. Biom J. 1988;30(7):1–14.
Google Scholar
Hosmer DW, Lemeshow S. A goodness-of-fit test for the multiple logistic regression model. Commun Stat. 1980;9:1043–1069.
Google Scholar
Qin J, Zhang B. A goodness-of-fit test for logistic regression models based on case-control data. Biometrika. 1997;84:609–618.
Google Scholar
Tsiatis AA. A note on a goodness-of-fit test for the logistic regression model. Biometrika. 1980;67:250–251.
Google Scholar
Zhang B. A chi-squared goodness-of-fit test for logistic regression models based on case-control data. Biometrika. 1999;86:531–539.
Google Scholar
Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15(4):361–387.
Google Scholar
Breusch TS, Pagan AR. Simple test for heteroscedasticity and random coefficient variation. Econometrica. 1979;47(5):1287–1294.
Google Scholar
Durbin J, Watson GS. Testing for serial correlation in least squares regression. Biometrika. 1971;58(1):1–19.
Google Scholar
White KJ. The Durbin-Watson test for autocorrection in nonlinear models. Rev Econ Stat. 1992;74(2):370–373.
Google Scholar
Hendry DF, Richard J-F. On the formulation of empirical models in dynamic econometrics. J Econom. 1982;20:3–33.
Google Scholar
Paul P, Pennell ML, Lemeshow S. Standardizing the power of the Hosmer-Lemeshow goodness of fit test in large data sets. Stat Med. 2013;32(1):67–80.
Google Scholar
Davidson J. Econometric theory. Malden (MA): Wiley-Blackwell; 2000.
Google Scholar
Ljung GM, Box GEP. On a measure of a lack of fit in time series models. Biometrika. 1978;65(2):297–303.
Google Scholar
Golden RM. Discrepancy risk model selection test theory for comparing possibly misspecified or nonnested models. Psychometrika. 2003;68(2):229–249.
Google Scholar
Vuong QH. Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica. 1989;57:307–333.
Google Scholar
Golden RM. Statistical tests for comparing possibly misspecified and nonnested models. J Math Psychol. 2000;44(1):153–170.
Google Scholar
Rivers D, Vuong Q. Model selection tests for nonlinear dynamic models. Economet J. 2002;5:1–39.
Google Scholar
Feng ZD, McCulloch CE. Using bootstrap likelihood ratios in finite mixture models. J R Stat Soc Series B Methodol. 1996;58(3):609–617.
Google Scholar
Tekle FB, Gudicha DW, Vermunt JK. Power analysis for the bootstrap likelihood ratio test for the number of classes in latent class models. Adv Data Anal Classif. 2016;10(2):209–224.
Google Scholar
Gray HL, Baek J, Woodward WA, et al. A bootstrap generalized likelihood ratio test in discriminant analysis. Comput Stat Data Anal. 1996;22(2):137–158.
Google Scholar
McLachlan GJ. On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. J R Stat Soc Series C Appl Stat. 1987;36(3):318–324.
Google Scholar
Lo Y, Mendell NR, Rubin DB. Testing the number of components in a normal mixture. Biometrika. 2001;88(3):767–778.
Google Scholar
Gibbons JD, Chakraborti S. Nonparametric statistical inference. 4th ed. New York: Chapman and Hall/CRC; 2003.
Google Scholar
Serfling RJ. Approximation theorems of mathematical statistics. 2nd ed. New York: Wiley-Interscience; 1980.
Google Scholar
Bagdonavicius V, Kruopis J, Nikulin MS. Non-parametric tests for complete data. London & Hoboken: Wiley-ISTE; 2011.
Google Scholar
Kanji GK. 100 statistical tests. 3rd ed. Thousand Oaks, CA: SAGE; 2006.
Google Scholar
Barnett V, Lewis T. Outliers in statistical data. 3rd ed. New York: Wiley; 1994.
Google Scholar
Iglewicz B, Hoaglin DC. How to detect and handle outliers. Milwaukee (WI): American Society for Quality Control Press; 1993.
Google Scholar
Anderson TW, Darling DA. Asymptotic theory of certain “goodness-of-fit” criteria based on stochastic processes. Ann Math Stat. 1952;23:193–212.
Google Scholar
Justel A, Peña D, Zamar R. A multivariate Kolmogorov–Smirnov test of goodness of fit. Stat Probab Lett. 1997;35(3):251–259.
Google Scholar
Kolmogorov AN. Sulla Determinazione Empirica di una Legge di Distribuzione. Giornale dell’Istituto Italiano degli Attuari. 1933;4:83–91.
Google Scholar
Smirnov N. Table for estimating the goodness of fit of empirical distributions. Ann Math Stat. 1948;19:279–281.
Google Scholar
Zhao D, Bu L, Alippi C, et al. A Kolmogorov-Smirnov test to detect changes in stationarity in big data. IFAC PapersOnLine. 2017;50(1):14260–14265.
Google Scholar
Shapiro SS, Wilk MB. An analysis of variance test for normality (complete samples). Biometrika. 1965;52(3–4):591–611.
Google Scholar
Kendall M. Rank correlation methods. Oxford: Charles Griffin & Company Limited; 1948.
Google Scholar
Kendall M. A new measure of rank correlation. Biometrika. 1938;30(1–2):81–93.
Google Scholar
Well AD, Myers JL. Research design and statistical analysis. 3rd ed. New York: Routledge; 2010.
Google Scholar
Cook RD. Regression graphics: ideas for studying regressions through graphics. 1st ed. New York: Wiley-Interscience; 1998.
Google Scholar
Kashner TM, Hinson R, Holland G, et al. A data accounting system for clinical investigators. J Am Med Inform Assoc. 2007;14(4):394–396.
Google Scholar
Osborne JW. Best practices in data cleaning: a complete guide to everything you need to do before and after collecting your data. 1st ed. Thousand Oaks (CA): SAGE Publications; 2013.
Google Scholar
Box GEP, Cox DR. An analysis of transformations. J R Stat Soc Series B. 1964;26(2):211–252.
Google Scholar
Kashner TM, Henley SS, Golden RM, et al. Assessing the preventive effects of cognitive therapy following relief of depression: a methodological innovation. J Affect Disord. 2007;104(1–3):251–261.
Google Scholar
Kashner TM, Henley SS, Golden RM, et al. Studying the effects of ACGME duty hours limits on resident satisfaction: results from VA learners’ perceptions survey. Acad Med. 2010;85(7):1130–1139.
Google Scholar
Jaccard J, Turris R. Interaction effects in multiple regression. (NY): SAGE; 2003.
Google Scholar
Box GEP. Do interactions matter? Qual Eng. 1990;2:365–369.
Google Scholar
Balli HO, Sørensen BE. Interaction effects in econometrics. Empir Econ. 2013;45(1):583–603.
Google Scholar
Aiken LS, West SG. Multiple regression: testing and interpreting interactions. Thousand Oaks (CA): Sage; 1991.
Google Scholar
Cox DR. Interaction. Int Stat Rev / Revue Internationale de Statistique. 1984;52(1):1–24.
Google Scholar
Greenland S. Tests for interaction in epidemiologic studies: a review and a study of power. Stat Med. 1983;2(2):243–251.
Google Scholar
Hayes AF, Matthes J. Computational procedures for probing interactions in OLS and logistic regression: SPSS and SAS implementations. Behav Res Methods. 2009;41(3):924–936.
Google Scholar
Stone-Romero EF, Anderson LE. Relative power of moderated multiple regression and the comparison of subgroup correlation coefficients for detecting moderator effects. J Appl Psychol. 1994;79:354–359.
Google Scholar
George EI, Clyde M. Model uncertainty. Stat Sci. 2004;19(1):81–94.
Google Scholar
Draper D. Assessment and propagation of model uncertainty. J R Stat Soc Series B Methodol. 1995;57(1):45–97.
Google Scholar
Greenland S. Modeling and variable selection in epidemiologic analysis. Am J Public Health. 1989;79(3):340–349.
Google Scholar
George EI. The variable selection problem. J Am Stat Assoc. 2000;95(452):1304–1308.
Google Scholar
Kosorok MR, Ma S. Marginal asymptotics for the “large p, small n” paradigm: with applications to microarray data. Ann Stat. 2007;35(4):1456–1486.
Google Scholar
Fan J, Lv J. Sure independence screening for ultrahigh dimensional feature space. J R Stat Soc Series B Stat Methodol. 2008;70(5):849–911.
Google Scholar
Fan Y, Tang CY. Tuning parameter selection in high dimensional penalized likelihood. J R Stat Soc Series B Stat Methodol. 2013;75(3):531–552.
Google Scholar
Wasserman L, Roeder K. High-dimensional variable selection. Ann Stat. 2009;37(5A):2178–2201.
Google Scholar
Buhlmann P, van de Geer S. Statistics for high-dimensional data: methods, theory, and applications. New York: Springer; 2011.
Google Scholar
West M. Bayesian factor regression models in the “large p, small n” paradigm. In: Bernardo JM, Bayarri MJ, Berger JO, et al., editors. Bayesian statistics. London: Oxford University Press; 2003. p. 723–732.
Google Scholar
Johnstone IM, Titterington DM. Statistical challenges of high-dimensional data. Philos Trans Series A Math Phys Eng Sci. 2009;367(1906):4237–4253.
Google Scholar
Tibshirani R. Regression shrinkage and selection via the LASSO. J R Stat Soc Series B. 1996;58:267–288.
Google Scholar
Hastie T, Tibshirani R, Tibshirani RJ. Extended comparisons of best subset selection, forward stepwise selection, and the lasso. arXiv:1707.08692v2 [stat.ME]; 2017.
Google Scholar
Efroymson MA. Multiple regression analysis. In: Ralston A, Wilf HS, editors. Mathematical methods for digital computers. New York: Wiley; 1960. p. 191–203.
Google Scholar
Miller AJ. Subset selection in regression. 2nd ed. Boca Raton: Chapman & Hall/CRC; 2002.
Google Scholar
Furnival GM. All possible regressions with less computation. Technometrics. 1971;13(2):403–408.
Google Scholar
Furnival GM, Wilson RW. Regression by leaps and bounds. Technometrics. 1974;16:499–511.
Google Scholar
Gatu C, Kontoghiorghes EJ. Branch-and-bound algorithms for computing the best-subset regression models. J Comput Graph Stat. 2006;15(1):139–156.
Google Scholar
Hand DJ. Branch and bound in statistical data. J R Stat Soc Series D Stat. 1981;30(1):1–13.
Google Scholar
Hans C, Dobra A, West M. Shotgun stochastic search for “large p” regression. J Am Stat Assoc. 2007;102(478):507–516.
Google Scholar
Cai A, Tsay RS, Chen R. Variable selection in linear regression with many predictors. J Comput Graph Stat. 2009;18(3):573–591.
Google Scholar
George EI, McCulloch RE. Variable selection via Gibbs sampling. J Am Stat Assoc. 1993;88:881–889.
Google Scholar
Han C, Carlin BP. Markov chain Monte Carlo methods for computing Bayes factors: a comparative review. J Am Stat Assoc. 2001;96(455):1122–1132.
Google Scholar
Hocking R. The analysis and selection of variables in linear regression. Biometrics. 1976;32:1–49.
Google Scholar
Garside MJ. The best sub-set in multiple-regression analysis. R Stat Soc Series C Appl Stat. 1965;14(2–3):196–200.
Google Scholar
Schatzoff M, Tsao R, Fienberg S. Efficient calculation of all possible regressions. Technometrics. 1968;10(4):769.
Google Scholar
Bertsimas D, King A, Mazumder R. Best subset selection via a modern optimization lens. Ann Stat. 2016;44(2):813–852.
Google Scholar
Bertsimas D, King A. Logistic regression: from art to science. Stat Sci. 2017;32(3):367–384.
Google Scholar
Hosmer DW, Jovanovic B, Lemeshow S. Best subsets logistic regression. Biometrics. 1989;45(4):1265–1270.
Google Scholar
Edwards D, Havranek T. A fast model selection procedure for large families of models. J Am Stat Assoc. 1987;82(397):205–213.
Google Scholar
Edwards D, Havranek T. A fast procedure for model search in multidimensional contingency tables. Biometrika. 1985;72(2):339–351.
Google Scholar
Lawless JF, Singhal K. Efficient screening of nonnormal regression models. Biometrics. 1978;34(2):318–327.
Google Scholar
LaMotte LR, Hocking R. Computational efficiency in the selection of regression variables. Technometrics. 1970;12(1):83–93.
Google Scholar
Brusco MJ, Stahl S. Branch-and-Bound applications in Combinatorial data analysis. New York: Springer-Verlag; 2005.
Google Scholar
Gatu C, Kontoghiorghes EJ. Efficient strategies for deriving the subset VAR models. Comput Manage Sci. 2005;2(4):253–278.
Google Scholar
Gatu C, Kontoghiorghes EJ, Gilli M, et al. An efficient branch-and-bound strategy for subset vector autoregressive model selection. J Econ Dynam Control. 2008;32(6):1949–1963.
Google Scholar
Winkler G. Image analysis, random fields, and dynamic Monte Carlo methods. New York: Springer-Verlag; 1991.
Gilks WR, Richardson S, Spiegelhalter DJ, editors. Markov chain Monte Carlo in practice. Boca Raton (FL): Chapman & Hall/CRC; 1996.
Geyer C. Practical Markov chain Monte Carlo. Stat Sci. 1992;7(4):473–483.
Gamerman D, Lopes HF. Markov chain Monte Carlo: stochastic simulation for Bayesian inference. 2nd ed. New York: Chapman & Hall/CRC; 2006.
Neal RM. Probabilistic inference using Markov chain Monte Carlo methods. Technical report CRG-TR-93-1. Toronto: Department of Computer Science, University of Toronto; 1993.
Gelfand AE, Smith AFM. Sampling-based approaches to calculating marginal densities. J Am Stat Assoc. 1990;85(410):398–409.
George EI, McCulloch RE. Approaches for Bayesian variable selection. Stat Sin. 1997;7:339–373.
Madigan D, Raftery AE. Model selection and accounting for model uncertainty in graphical models using Occam’s window. J Am Stat Assoc. 1994;89(428):1535–1546.
Raftery AE, Zheng Y. Long-run performance of Bayesian model averaging. Seattle: University of Washington; 2003.
Wang D, Lertsithichai P, Nanchahal K, et al. Risk factors of coronary heart disease: a Bayesian model averaging approach. J Appl Stat. 2003;30(7):813–826.
Wang D, Zhang W, Bakhai A. Comparison of Bayesian model averaging and stepwise methods for model selection in logistic regression. Stat Med. 2004;23(22):3451–3467.
Viallefont V, Raftery AE, Richardson S. Variable selection and Bayesian model averaging in case-control studies. Stat Med. 2001;20(21):3215–3230.
Genell A, Nemes S, Steineck G, et al. Model selection in medical research: a simulation study comparing Bayesian model averaging and stepwise regression. BMC Med Res Methodol. 2010;10:108.
Wilson MA, Iversen ES, Clyde MA, et al. Bayesian model search and multilevel inference for SNP association studies. Ann Appl Stat. 2010;4(3):1342–1364.
Volinsky CT, Madigan D, Raftery AE, et al. Bayesian model averaging in proportional hazard models: assessing the risk of a stroke. J R Stat Soc Series C Appl Stat. 1997;46(4):433–448.
Regal RR, Hook EB. The effects of model selection on confidence intervals for the size of a closed population. Stat Med. 1991;10(5):717–721.
Vock DM, Atchison EA, Legler JM, et al. Accounting for model uncertainty in estimating global burden of disease. Bull World Health Organ. 2011;89(2):112–120.
Montgomery JM, Nyhan B. Bayesian model averaging: theoretical developments and practical applications. Polit Anal. 2010;18(2):245–270.
Ando T, Tsay R. Predictive likelihood for Bayesian model selection and averaging. Int J Forecast. 2010;26(4):744–763.
Prost L, Makowski D, Jeuffroy M. Comparison of stepwise selection and Bayesian model averaging for yield gap analysis. Ecol Modell. 2008;219(1–2):66–76.
Yang X, Belin TR, Boscardin WJ. Imputation and variable selection in linear regression models with missing covariates. Biometrics. 2005;61(2):498–506.
Jackson CH, Thompson SG, Sharples LD. Accounting for uncertainty in health economic decision models by using model averaging. J R Stat Soc Series A Stat Soc. 2009;172:383–404.
Morales KH, Ibrahim JG, Chen C-J, et al. Bayesian model averaging with applications to benchmark dose estimation for arsenic in drinking water. J Am Stat Assoc. 2006;101(473):9–17.
Clyde M. Bayesian model averaging and model search strategies. In: Bernardo JM, Berger JO, Dawid AP, Smith AFM, editors. Bayesian statistics 6. London: Oxford University Press; 1999. p. 157–185.
Raftery AE, Madigan D, Hoeting JA. Bayesian model averaging for linear regression models. J Am Stat Assoc. 1997;92:179–191.
Burnham KP, Anderson DR. Model selection and inference: a practical information-theoretic approach. New York: Springer; 1998.
Buckland ST, Burnham KP, Augustin NH. Model selection: an integral part of inference. Biometrics. 1997;53(2):603–618.
Wang H, Zhang X, Zou G. Frequentist model averaging estimation: a review. J Syst Sci Complex. 2009;22(4):732–748.
Ullah A, Wang H. Parametric and nonparametric frequentist model selection and model averaging. Econometrics. 2013;1:157–179.
Barr J, Cavanaugh J. Forensics: assessing model goodness: a machine learning view. Encyclopedia Semant Comput Robot Intell. 2018;2(2):1850015.
Russell SJ, Norvig P. Artificial intelligence: a modern approach. New Jersey: Prentice Hall; 2003.
Picard R, Cook D. Cross-validation of regression models. J Am Stat Assoc. 1984;79(387):575–583.
Stone M. Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Series B. 1974;36:111–133.
Efron B, Tibshirani R. An introduction to the bootstrap. New York: Chapman & Hall/CRC; 1993.
Anderson TW. An introduction to multivariate statistical analysis. New York: John Wiley & Sons; 2003.
Gordon AD. Classification. 2nd ed. New York: Chapman & Hall/CRC; 1999.
Huberty CJ. Applied discriminant analysis. New York: John Wiley & Sons; 1994.
Krzanowski WJ. Principles of multivariate analysis: a user’s perspective. New York: Oxford University Press; 2000.
Huber PJ, Ronchetti EM. Robust statistics. 2nd ed. New York: John Wiley & Sons; 2009.
de Myttenaere A, Golden B, Le Grand B, et al. Mean absolute percentage error for regression models. Neurocomputing. 2016;192:38–48.
Tarpey T. A note on the prediction sum of squares statistic for restricted least squares. Am Stat. 2000;54(2):116–118.
Allen DM. The relationship between variable selection and data augmentation and a method for prediction. Technometrics. 1974;16:125–127.
Pepe MS. The statistical evaluation of medical tests for classification and prediction. Oxford: Oxford University Press; 2004.
Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inform Process Manage. 2009;45(4):427–437.
UCI machine learning repository. Irvine (CA): University of California, School of Information and Computer Science; 2017. Available from: http://archive.ics.uci.edu/ml.
Wickens TD. Elementary signal detection theory. New York: Oxford University Press; 2002.
Fawcett T. An introduction to ROC analysis. Pattern Recognit Lett. 2006;27:861–874.
Metz CE. Basic principles of ROC analyses. Semin Nucl Med. 1978;8:283–298.
Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem. 1993;39(4):561–577.
Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143(1):29–36.
DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–845.
King G, Zeng L. Logistic regression in rare events data. Polit Anal. 2001;9:137–163.
Berger JO. Statistical decision theory and Bayesian analysis. 2nd ed. New York: Springer-Verlag; 1985.
Aitchison J. Goodness of prediction fit. Biometrika. 1975;62:547–554.
Harris IR. Predictive fit for natural exponential families. Biometrika. 1989;76(4):675–684.
Vidoni P. Improved predictive model selection. J Stat Plan Inference. 2008;138:3713–3721.
Fushiki T, Komaki F, Aihara K. Nonparametric bootstrap prediction. Bernoulli. 2005;11(2):293–307.
Ng A, Jordan MI. On discriminative vs. generative classifiers: a comparison of logistic regression and naive Bayes. NIPS; 2001.
Schölkopf B, Smola AJ. Learning with kernels: support vector machines, regularization, optimization and beyond. Cambridge (MA): MIT Press; 2001.
Smola AJ, Bartlett PL, Schölkopf B, et al. Advances in large-margin classifiers. Cambridge (MA): MIT Press; 2000.
Perkins NJ, Schisterman EF. The Youden index and the optimal cut-point corrected for measurement error. Biom J. 2005;47(4):428–441.
Youden WJ. Index for rating diagnostic tests. Cancer. 1950;3:32–35.
Baldi P, Brunak S, Chauvin Y, et al. Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics. 2000;16:412–424.
Fletcher RH, Fletcher SW. Clinical epidemiology: the essentials. 4th ed. Baltimore (MD): Lippincott Williams & Wilkins; 2005.
Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20(1):37–46.
Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta. 1975;405(2):442–451.
Cramér H. Mathematical methods of statistics. Princeton: Princeton University Press; 1946.
Sasaki Y. The truth of the F-measure. Manchester: School of Computer Science, University of Manchester; 2007.
Powers DMW. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness & correlation. J Mach Learn Technol. 2011;2(1):37–63.
Magder LS, Fix AD. Optimal choice of a cut point for a quantitative diagnostic test performed for research purposes. J Clin Epidemiol. 2003;56(10):956–962.
