Original Articles

Using Lasso for Predictor Selection and to Assuage Overfitting: A Method Long Overlooked in Behavioral Sciences

REFERENCES

  • Ambartsumian, V.A. (1929). On a problem of the theory of eigenvalues. Zeitschrift für Physik, 53, 690–695. doi:10.1007/bf01330827
  • Andersen, C.M., & Bro, R. (2010). Variable selection in regression—a tutorial. Journal of Chemometrics, 24, 728–737. doi:10.1002/cem.1360
  • Babyak, M.A. (2004). What you see may not be what you get: A brief, nontechnical introduction to overfitting in regression-type models. Psychosomatic Medicine, 66, 411–421. doi:10.1097/01.psy.0000127692.23278.a9
  • Belloni, A., & Chernozhukov, V. (2013). Least squares after model selection in high dimensional sparse models. Bernoulli, 19, 521–547. doi:10.3150/11-bej410
  • Candès, E., & Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n. The Annals of Statistics, 35, 2313–2351. doi:10.1214/009053606000001523
  • Cohen, J., Cohen, P., West, S.G., & Aiken, L.S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences. Hoboken, NJ: Routledge.
  • Derksen, S., & Keselman, H.J. (1992). Backward, forward and stepwise automated subset selection algorithms: Frequency of obtaining authentic and noise variables. British Journal of Mathematical and Statistical Psychology, 45, 265–282. doi:10.1111/j.2044-8317.1990.tb00940.x
  • Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, 32, 407–499. doi:10.1214/009053604000000067
  • Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33, 1–22. doi:10.18637/jss.v033.i01
  • Gelman, A., & Shalizi, C.R. (2013). Philosophy and the practice of Bayesian statistics. British Journal of Mathematical and Statistical Psychology, 66, 8–38. doi:10.1111/j.2044-8317.2011.02037.x
  • Groll, A., & Tutz, G. (2014). Variable selection for generalized linear mixed models by L1 penalized estimation. Statistics and Computing, 24, 137–154. doi:10.1007/s11222-012-9359-z
  • Gui, J., & Li, H. (2005). Penalized Cox regression analysis in the high-dimensional and low sample size settings, with applications to microarray gene expression data. Bioinformatics, 21, 3001–3008. doi:10.1093/bioinformatics/bti422
  • Hadamard, J. (1902). Sur les problèmes aux dérivées partielles et leur signification physique. Princeton University Bulletin, 13, 49–52.
  • Harrell, F.E. (2001). Regression modeling strategies: With applications to linear models, logistic regression, and survival analysis. New York, NY: Springer.
  • Harrell, F.E., Lee, K.L., Matchar, D.B., & Reichert, T.A. (1985). Regression models for prognostic prediction: Advantages, problems, and suggested solutions. Cancer Treatment Reports, 69, 1071–1077.
  • Hartmann, A., Van Der Kooij, A.J., & Zeeck, A. (2009). Exploring nonlinear relations: Models of clinical decision making by regression with optimal scaling. Psychotherapy Research, 19, 482–492. doi:10.1080/10503300902905939
  • Hartmann, A., Zeeck, A., & Barrett, M.S. (2010). Interpersonal problems in eating disorders. International Journal of Eating Disorders, 43, 619–627. doi:10.1002/eat.20747
  • Hawkins, D.M. (2004). The problem of overfitting. Journal of Chemical Information and Computer Sciences, 44, 1–12. doi:10.1021/ci0342472
  • Hesterberg, T., Choi, N.H., Meier, L., & Fraley, C. (2008). Least angle and ℓ1 penalized regression: A review. Statistics Surveys, 2, 61–93. doi:10.1214/08-ss035
  • Hoerl, A.E., & Kennard, R.W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12, 55–67. doi:10.2307/1267351
  • Hurvich, C.M., & Tsai, C.L. (1990). The impact of model selection on inference in linear regression. The American Statistician, 44, 214–217. doi:10.2307/2685338
  • Johnson, J.B., & Omland, K.S. (2004). Model selection in ecology and evolution. Trends in Ecology & Evolution, 19, 101–108. doi:10.1016/j.tree.2003.10.013
  • Johnson, M., & Sinharay, S. (2011). Remarks from the new editors. Journal of Educational and Behavioral Statistics, 36, 3–5. doi:10.3102/1076998610387267
  • Khalili, A., & Chen, J. (2007). Variable selection in finite mixture of regression models. Journal of the American Statistical Association, 102, 1025–1038. doi:10.1198/016214507000000590
  • Knapp, T.R., & Sawilowsky, S.S. (2001). Constructive criticisms of methodological and editorial practices. The Journal of Experimental Education, 70, 65–79. doi:10.1080/00220970109599498
  • Lockhart, R., Taylor, J., Tibshirani, R.J., & Tibshirani, R. (2014). A significance test for the lasso. The Annals of Statistics, 42, 413–468. doi:10.1214/13-aos1175
  • Lomax, R.G., & Hahs-Vaughn, D.L. (2013). Statistical concepts: A second course. New York, NY: Routledge.
  • Meier, L., Van De Geer, S., & Bühlmann, P. (2008). The group lasso for logistic regression. Journal of the Royal Statistical Society: Series B, 70, 53–71. doi:10.1111/j.1467-9868.2007.00627.x
  • Meier, L. (2009). grplasso: Fitting user specified models with Group Lasso penalty. R package version 0.4-2.
  • Meinshausen, N., & Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society: Series B, 72, 417–473. doi:10.1111/j.1467-9868.2010.00740.x
  • Park, M.Y., & Hastie, T. (2007). L1‐regularization path algorithm for generalized linear models. Journal of the Royal Statistical Society: Series B, 69, 659–677. doi:10.1111/j.1467-9868.2007.00607.x
  • Pope, P.T., & Webster, J.T. (1972). The use of an F-statistic in stepwise regression procedures. Technometrics, 14, 327–340. doi:10.1080/00401706.1972.10488919
  • Schelldorfer, J., Meier, L., & Bühlmann, P. (2014). GLMMLasso: An algorithm for high dimensional generalized linear mixed models using ℓ1-penalization. Journal of Computational and Graphical Statistics, 23, 460–477. doi:10.1080/10618600.2013.773239
  • Scheidt, C.E., Hasenburg, A., Kunze, M., Waller, E., Pfeifer, R., Zimmermann, P., … Waller, N. (2012). Are individual differences of attachment predicting bereavement outcome after perinatal loss? A prospective cohort study. Journal of Psychosomatic Research, 73, 375–382. doi:10.1016/j.jpsychores.2012.08.017
  • Schmid, N.S., Taylor, K.I., Foldi, N.S., Berres, M., & Monsch, A.U. (2013). Neuropsychological signs of Alzheimer's disease 8 years prior to diagnosis. Journal of Alzheimer's Disease, 34, 537–546.
  • Sharpe, D. (2013). Why the resistance to statistical innovations? Bridging the communication gap. Psychological Methods, 18, 572–582. doi:10.1037/a0034177
  • Städler, N., Bühlmann, P., & Van De Geer, S. (2010). ℓ1-penalization for mixture regression models. Test, 19, 209–256. doi:10.1007/s11749-010-0197-z
  • Stephens, P.A., Buskirk, S.W., Hayward, G.D., & Martinez del Rio, C. (2005). Information theory and hypothesis testing: A call for pluralism. Journal of Applied Ecology, 42, 4–12. doi:10.1111/j.1365-2664.2005.01002.x
  • Steyerberg, E.W., Eijkemans, M.J., & Habbema, J.D.F. (1999). Stepwise selection in small data sets: A simulation study of bias in logistic regression analysis. Journal of Clinical Epidemiology, 52, 935–942.
  • Stock, J.H., & Watson, M.W. (2003). Introduction to econometrics. Boston, MA: Addison-Wesley.
  • Subramanian, J., & Simon, R. (2013). Overfitting in prediction models—Is it a problem only in high dimensions? Contemporary Clinical Trials, 36, 636–641. doi:10.1016/j.cct.2013.06.011
  • Thompson, B. (1989). Why won't stepwise methods die? Measurement and Evaluation in Counseling and Development, 21, 146–148.
  • Thompson, B. (1995). Stepwise regression and stepwise discriminant analysis need not apply here: A guidelines editorial. Educational and Psychological Measurement, 55, 525–534. doi:10.1177/0013164495055004001
  • Thompson, B. (2001). Significance, effect sizes, stepwise methods, and other issues: Strong arguments move the field. The Journal of Experimental Education, 70, 80–93. doi:10.1080/00220970109599499
  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58, 267–288. doi:10.1111/j.2517-6161.1996.tb02080.x
  • Tibshirani, R. (1997). The lasso method for variable selection in the Cox model. Statistics in Medicine, 16, 385–395. doi:10.1002/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3
  • Tikhonov, A.N. (1943). On the stability of inverse problems. Doklady Akademii Nauk SSSR, 39, 195–198.
  • Vinod, H.D. (1978). A survey of ridge regression and related techniques for improvements over ordinary least squares. The Review of Economics and Statistics, 60, 121–131. doi:10.2307/1924340
  • Waldmann, P., Mészáros, G., Gredler, B., Fuerst, C., & Sölkner, J. (2013). Evaluation of the lasso and the elastic net in genome-wide association studies. Frontiers in Genetics, 4(270), 1–11. doi:10.3389/fgene.2013.00270
  • Wasserman, L., & Roeder, K. (2009). High dimensional variable selection. The Annals of Statistics, 37, 2178–2201. doi:10.1214/08-aos646
  • Whittingham, M.J., Stephens, P.A., Bradbury, R.B., & Freckleton, R.P. (2006). Why do we still use stepwise modelling in ecology and behaviour? Journal of Animal Ecology, 75, 1182–1189. doi:10.1111/j.1365-2656.2006.01141.x
  • Wilkinson, L. (1979). Tests of significance in stepwise regression. Psychological Bulletin, 86, 168–174. doi:10.1037/0033-2909.86.1.168
  • Wintle, B.A., McCarthy, M.A., Volinsky, C.T., & Kavanagh, R.P. (2003). The use of Bayesian model averaging to better represent uncertainty in ecological models. Conservation Biology, 17, 1579–1590. doi:10.1111/j.1523-1739.2003.00614.x
  • Yuan, M., & Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B, 68, 49–67. doi:10.1111/j.1467-9868.2005.00532.x
  • Zou, H., Hastie, T., & Tibshirani, R. (2007). On the “degrees of freedom” of the lasso. The Annals of Statistics, 35, 2173–2192. doi:10.1214/009053607000000127
  • Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101, 1418–1429. doi:10.1198/016214506000000735
  • Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67, 301–320. doi:10.1111/j.1467-9868.2005.00503.x
