Regularized Regression: Implementation and Interpretation

Distributed Generalized Cross-Validation for Divide-and-Conquer Kernel Ridge Regression and Its Asymptotic Optimality

Pages 891-908 | Received 13 Aug 2017, Accepted 18 Feb 2019, Published online: 28 May 2019

References

  • Abramowitz, M., and Stegun, I. A. (1972), Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables (Vol. 55), New York: Dover Publications.
  • Bach, F. (2013), “Sharp Analysis of Low-Rank Kernel Matrix Approximations,” in Conference on Learning Theory, pp. 185–209.
  • Bertin-Mahieux, T., Ellis, D. P., Whitman, B., and Lamere, P. (2011), “The Million Song Dataset,” in ISMIR (Vol. 2), p. 10.
  • Blanchard, G., and Mücke, N. (2016), “Parallelizing Spectral Algorithms for Kernel Learning,” arXiv no. 1610.07487.
  • Chang, X., Lin, S.-B., and Zhou, D.-X. (2017), “Distributed Semi-supervised Learning With Kernel Ridge Regression,” Journal of Machine Learning Research, 18, 1–22.
  • Chen, J., Zhang, C., Kosorok, M. R., and Liu, Y. (2017), “Double Sparsity Kernel Learning With Automatic Variable Selection and Data Extraction,” arXiv no. 1706.01426.
  • Chen, X., and Xie, M. (2014), “A Split-and-Conquer Approach for Analysis of Extraordinarily Large Data,” Statistica Sinica, 24, 1655–1684.
  • Craven, P., and Wahba, G. (1978), “Smoothing Noisy Data With Spline Functions,” Numerische Mathematik, 31, 377–403. DOI:10.1007/BF01404567.
  • Gu, C. (2013), Smoothing Spline ANOVA Models (Vol. 297), New York: Springer Science & Business Media.
  • Gu, C., and Ma, P. (2005), “Optimal Smoothing in Nonparametric Mixed-Effect Models,” The Annals of Statistics, 33, 1357–1379. DOI:10.1214/009053605000000110.
  • Guo, Z.-C., Shi, L., and Wu, Q. (2017), “Learning Theory of Distributed Regression With Bias Corrected Regularization Kernel Network,” Journal of Machine Learning Research, 18, 1–25.
  • Györfi, L., Kohler, M., Krzyzak, A., and Walk, H. (2006), A Distribution-Free Theory of Nonparametric Regression, New York: Springer Science & Business Media.
  • Hastie, T. J. (2017), “Generalized Additive Models,” in Statistical Models in S, Boca Raton, FL: Routledge, pp. 249–307.
  • Jehan, T., and DesRoches, D. (2011), Analyzer Documentation, Somerville, MA: The Echo Nest.
  • Kandasamy, K., and Yu, Y. (2016), “Additive Approximations in High Dimensional Nonparametric Regression via the SALSA,” in International Conference on Machine Learning, pp. 69–78.
  • Li, K.-C. (1986), “Asymptotic Optimality of CL and Generalized Cross-Validation in Ridge Regression With Application to Spline Smoothing,” The Annals of Statistics, 14, 1101–1112. DOI:10.1214/aos/1176350052.
  • Lin, S.-B., Guo, X., and Zhou, D.-X. (2017), “Distributed Learning With Regularized Least Squares,” The Journal of Machine Learning Research, 18, 3202–3232.
  • Lu, J., Cheng, G., and Liu, H. (2016), “Nonparametric Heterogeneity Testing For Massive Data,” arXiv no. 1601.06212.
  • Mallows, C. L. (2000), “Some Comments on Cp,” Technometrics, 42, 87–94. DOI:10.2307/1271437.
  • Pollard, D. (1986), “Rates of Uniform Almost-Sure Convergence for Empirical Processes Indexed by Unbounded Classes of Functions,” Technical Report, Department of Statistics, Yale University.
  • Pollard, D. (1995), “Uniform Ratio Limit Theorems for Empirical Processes,” Scandinavian Journal of Statistics, 22, 271–278.
  • R Core Team (2018), R: A Language and Environment for Statistical Computing, Vienna, Austria: R Foundation for Statistical Computing.
  • Rice, J., and Rosenblatt, M. (1983), “Smoothing Splines: Regression, Derivatives and Deconvolution,” The Annals of Statistics, 11, 141–156. DOI:10.1214/aos/1176346065.
  • Schaback, R., and Wendland, H. (2006), “Kernel Techniques: From Machine Learning to Meshless Methods,” Acta Numerica, 15, 543–639. DOI:10.1017/S0962492906270016.
  • Shang, Z., and Cheng, G. (2017), “Computational Limits of a Distributed Algorithm for Smoothing Spline,” The Journal of Machine Learning Research, 18, 3809–3845.
  • Shawe-Taylor, J., and Cristianini, N. (2004), Kernel Methods for Pattern Analysis, Cambridge: Cambridge University Press.
  • van de Geer, S. A. (2000), Empirical Processes in M-Estimation (Vol. 6), Cambridge: Cambridge University Press.
  • Wahba, G. (1990), Spline Models for Observational Data (Vol. 59), Philadelphia, PA: SIAM.
  • Wood, S. N. (2004), “Stable and Efficient Multiple Smoothing Parameter Estimation for Generalized Additive Models,” Journal of the American Statistical Association, 99, 673–686. DOI:10.1198/016214504000000980.
  • Xiang, D., and Wahba, G. (1996), “A Generalized Approximate Cross Validation for Smoothing Splines With Non-Gaussian Data,” Statistica Sinica, 6, 675–692.
  • Xu, G., and Huang, J. Z. (2012), “Asymptotic Optimality and Efficient Computation of the Leave-Subject-Out Cross-Validation,” The Annals of Statistics, 40, 3003–3030. DOI:10.1214/12-AOS1063.
  • Xu, G., Shang, Z., and Cheng, G. (2018), “Optimal Tuning for Divide-and-Conquer Kernel Ridge Regression With Massive Data,” in Proceedings of the 35th International Conference on Machine Learning, PMLR (Vol. 80), pp. 5483–5491.
  • Yuan, M. (2006), “GACV for Quantile Smoothing Splines,” Computational Statistics & Data Analysis, 50, 813–829. DOI:10.1016/j.csda.2004.10.008.
  • Zhang, C., Liu, Y., and Wu, Y. (2016), “On Quantile Regression in Reproducing Kernel Hilbert Spaces With the Data Sparsity Constraint,” The Journal of Machine Learning Research, 17, 1374–1418.
  • Zhang, Y., Duchi, J., and Wainwright, M. (2015), “Divide and Conquer Kernel Ridge Regression: A Distributed Algorithm With Minimax Optimal Rates,” The Journal of Machine Learning Research, 16, 3299–3340.
  • Zhao, T., Cheng, G., and Liu, H. (2016), “A Partially Linear Framework for Massive Heterogeneous Data,” Annals of Statistics, 44, 1400. DOI:10.1214/15-AOS1410.
