Theory and Methods

Excess Optimism: How Biased is the Apparent Error of an Estimator Tuned by SURE?

Pages 697-712 | Received 01 Jan 2017, Published online: 06 Aug 2018

References

  • Akaike, H. (1973), “Information Theory and an Extension of the Maximum Likelihood Principle,” Second International Symposium on Information Theory, pp. 267–281.
  • Ball, K. (1993), “The Reverse Isoperimetric Problem for Gaussian Measure,” Discrete & Computational Geometry, 10, 411–420.
  • Baranchik, A. (1964), “Multiple Regression and Estimation of the Mean of a Multivariate Normal Distribution,” Technical Report, Stanford University.
  • Berk, R., Brown, L., Buja, A., Zhang, K., and Zhao, L. (2013), “Valid Post-Selection Inference,” Annals of Statistics, 41, 802–837.
  • Bernau, C., Augustin, T., and Boulesteix, A.-L. (2013), “Correcting the Optimal Resampling-Based Error Rate by Estimating the Error Rate of Wrapper Algorithms,” Biometrics, 69, 693–702.
  • Breiman, L. (1992), “The Little Bootstrap and Other Methods for Dimensionality Selection in Regression: X-Fixed Prediction Error,” Journal of the American Statistical Association, 87, 738–754.
  • Candes, E. J., Sing-Long, C. M., and Trzasko, J. D. (2013), “Unbiased Risk Estimates for Singular Value Thresholding and Spectral Estimators,” IEEE Transactions on Signal Processing, 61, 4643–4657.
  • Cavalier, L., Golubev, Y., Picard, D., and Tsybakov, A. (2002), “Oracle Inequalities for Inverse Problems,” Annals of Statistics, 30, 843–874.
  • Chen, X., Lin, Q., and Sen, B. (2015), “On Degrees of Freedom of Projection Estimators with Applications to Multivariate Shape Restricted Regression,” arXiv: 1509.01877.
  • Donoho, D. L., and Johnstone, I. M. (1994), “Ideal Spatial Adaptation by Wavelet Shrinkage,” Biometrika, 81, 425–455.
  • ——— (1995), “Adapting to Unknown Smoothness via Wavelet Shrinkage,” Journal of the American Statistical Association, 90, 1200–1224.
  • ——— (1998), “Minimax Estimation via Wavelet Shrinkage,” Annals of Statistics, 26, 879–921.
  • Efron, B. (1986), “How Biased is the Apparent Error Rate of a Prediction Rule?” Journal of the American Statistical Association, 81, 461–470.
  • ——— (2004), “The Estimation of Prediction Error: Covariance Penalties and Cross-Validation,” Journal of the American Statistical Association, 99, 619–632.
  • ——— (2010), Large-scale Simultaneous Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction, New York: Cambridge University Press.
  • ——— (2014), “Estimation and Accuracy after Model Selection,” Journal of the American Statistical Association, 109, 991–1007.
  • Efron, B., and Hastie, T. (2016), Computer Age Statistical Inference: Algorithms, Inference, and Data Science, New York: Cambridge University Press.
  • Fithian, W., Sun, D., and Taylor, J. (2014), “Optimal Inference after Model Selection,” arXiv: 1410.2597.
  • Hoerl, A., and Kennard, R. (1970), “Ridge Regression: Biased Estimation for Nonorthogonal Problems,” Technometrics, 12, 55–67.
  • James, W., and Stein, C. (1961), “Estimation with Quadratic Loss,” Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1, 361–379.
  • Janson, L., Fithian, W., and Hastie, T. (2015), “Effective Degrees of Freedom: A Flawed Metaphor,” Biometrika, 102, 479–485.
  • Johnstone, I. M. (1999), “Wavelet Shrinkage for Correlated Data and Inverse Problems: Adaptivity Results,” Statistica Sinica, 9, 51–83.
  • ——— (2015), Gaussian Estimation: Sequence and Wavelet Models, New York: Cambridge University Press, draft version.
  • Klivans, A., O’Donnell, R., and Servedio, R. (2008), “Learning Geometric Concepts via Gaussian Surface Area,” Foundations of Computer Science, 49, 541–550.
  • Kneip, A. (1994), “Ordered Linear Smoothers,” Annals of Statistics, 22, 835–866.
  • Krstajic, D., Buturovic, L., Leahy, D., and Thomas, S. (2014), “Cross-Validation Pitfalls When Selecting and Assessing Regression and Classification Models,” Journal of Cheminformatics, 6.
  • Lee, J., Sun, D., Sun, Y., and Taylor, J. (2016), “Exact Post-Selection Inference, with Application to the Lasso,” Annals of Statistics, 44, 907–927.
  • Li, K.-C. (1985), “From Stein’s Unbiased Risk Estimates to the Method of Generalized Cross-Validation,” Annals of Statistics, 13, 1352–1377.
  • ——— (1986), “Asymptotic Optimality of CL and Generalized Cross-Validation in Ridge Regression with Application to Spline Smoothing,” Annals of Statistics, 14, 1101–1112.
  • ——— (1987), “Asymptotic Optimality for Cp, CL, Cross-Validation and Generalized Cross-Validation: Discrete Index Set,” Annals of Statistics, 15, 958–975.
  • Lockhart, R., Taylor, J., Tibshirani, R. J., and Tibshirani, R. (2014), “A Significance Test for the Lasso,” Annals of Statistics, 42, 413–468.
  • Mallows, C. (1973), “Some Comments on Cp,” Technometrics, 15, 661–675.
  • Mikkelsen, F. R., and Hansen, N. R. (2016), “Degrees of Freedom for Piecewise Lipschitz Estimators,” arXiv: 1601.03524.
  • Nazarov, F. (2003), “On the Maximal Perimeter of a Convex Set in R^n with Respect to Gaussian Measure,” Geometric Aspects of Functional Analysis, 1806, 169–187.
  • Stein, C. (1981), “Estimation of the Mean of a Multivariate Normal Distribution,” Annals of Statistics, 9, 1135–1151.
  • Tian Harris, X. (2016), “Prediction Error after Model Selection,” arXiv: 1610.06107.
  • Tibshirani, R. J. (2015), “Degrees of Freedom and Model Search,” Statistica Sinica, 25, 1265–1296.
  • Tibshirani, R. J., and Taylor, J. (2011), “The Solution Path of the Generalized Lasso,” Annals of Statistics, 39, 1335–1371.
  • ——— (2012), “Degrees of Freedom in Lasso Problems,” Annals of Statistics, 40, 1198–1232.
  • Tibshirani, R. J., Taylor, J., Lockhart, R., and Tibshirani, R. (2016), “Exact Post-Selection Inference for Sequential Regression Procedures,” Journal of the American Statistical Association, 111, 600–620.
  • Tibshirani, R. J., and Tibshirani, R. (2009), “A Bias Correction for the Minimum Error Rate in Cross-Validation,” Annals of Applied Statistics, 3, 822–829.
  • Tsamardinos, I., Rakhshani, A., and Lagani, V. (2015), “Performance-Estimation Properties of Cross-Validation-Based Protocols with Simultaneous Hyper-Parameter Optimization,” International Journal on Artificial Intelligence Tools, 24.
  • Ulfarsson, M. O., and Solo, V. (2013a), “Tuning Parameter Selection for Nonnegative Matrix Factorization,” IEEE International Conference on Acoustics, Speech and Signal Processing.
  • ——— (2013b), “Tuning Parameter Selection for Underdetermined Reduced-Rank Regression,” IEEE Signal Processing Letters, 20, 881–884.
  • Varma, S., and Simon, R. (2006), “Bias in Error Estimation When Using Cross-Validation for Model Selection,” BMC Bioinformatics, 7.
  • Xie, X., Kou, S., and Brown, L. (2012), “SURE Estimates for a Heteroscedastic Hierarchical Model,” Journal of the American Statistical Association, 107, 1465–1479.
  • Ye, J. (1998), “On Measuring and Correcting the Effects of Data Mining and Model Selection,” Journal of the American Statistical Association, 93, 120–131.
  • Zou, H., Hastie, T., and Tibshirani, R. (2007), “On the ‘Degrees of Freedom’ of the Lasso,” Annals of Statistics, 35, 2173–2192.
  • Zou, H., and Yuan, M. (2008), “Regularized Simultaneous Model Selection in Multiple Quantiles Regression,” Computational Statistics and Data Analysis, 52, 5296–5304.
