Theory and Methods

DDAC-SpAM: A Distributed Algorithm for Fitting High-dimensional Sparse Additive Models with Feature Division and Decorrelation

Received 08 Oct 2021, Accepted 07 Jun 2023, Published online: 26 Jul 2023

References

  • Aerts, M., Claeskens, G., and Wand, M. P. (2002), “Some Theory for Penalized Spline Generalized Additive Models,” Journal of Statistical Planning and Inference, 103, 455–470. DOI: 10.1016/S0378-3758(01)00237-3.
  • Battey, H., Fan, J., Liu, H., Lu, J., and Zhu, Z. (2018), “Distributed Estimation and Inference with Statistical Guarantees,” The Annals of Statistics, 46, 1352–1382.
  • Borggaard, C., and Thodberg, H. H. (1992), “Optimal Minimal Neural Interpretation of Spectra,” Analytical Chemistry, 64, 545–551. DOI: 10.1021/ac00029a018.
  • Boyd, S., Parikh, N., Chu, E., Peleato, B., and Eckstein, J. (2011), “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers,” Foundations and Trends in Machine Learning, 3, 1–122. DOI: 10.1561/2200000016.
  • Cai, T. T., Zhang, A. R., and Zhou, Y. (2022), “Sparse Group Lasso: Optimal Sample Complexity, Convergence Rate, and Statistical Inference,” IEEE Transactions on Information Theory, 68, 5975–6002. DOI: 10.1109/TIT.2022.3175455.
  • Chen, X., and Xie, M.-g. (2014), “A Split-and-Conquer Approach for Analysis of Extraordinarily Large Data,” Statistica Sinica, 24, 1655–1684. DOI: 10.5705/ss.2013.088.
  • Chikuse, Y. (2003), Statistics on Special Manifolds (Vol. 174), New York: Springer.
  • Downs, T. D. (1972), “Orientation Statistics,” Biometrika, 59, 665–676. DOI: 10.1093/biomet/59.3.665.
  • Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004), “Least Angle Regression,” The Annals of Statistics, 32, 407–499. DOI: 10.1214/009053604000000067.
  • Foygel, R., and Drton, M. (2010), “Exact Block-Wise Optimization in Group Lasso and Sparse Group Lasso for Linear Regression,” arXiv preprint arXiv:1010.3320.
  • Friedman, J., Hastie, T., and Tibshirani, R. (2010), “Regularization Paths for Generalized Linear Models via Coordinate Descent,” Journal of Statistical Software, 33, 1–22.
  • Haris, A., Simon, N., and Shojaie, A. (2022), “Generalized Sparse Additive Models,” Journal of Machine Learning Research, 23, 1–56.
  • Hastie, T., and Tibshirani, R. (1990), Generalized Additive Models, Chapman & Hall/CRC Monographs on Statistics & Applied Probability, New York: Taylor & Francis.
  • He, Q., Zhang, H. H., Avery, C. L., and Lin, D. (2016), “Sparse Meta-Analysis with High-Dimensional Data,” Biostatistics, 17, 205–220. DOI: 10.1093/biostatistics/kxv038.
  • Huang, J., Horowitz, J. L., and Wei, F. (2010), “Variable Selection in Nonparametric Additive Models,” The Annals of Statistics, 38, 2282–2313.
  • Jia, J., and Rohe, K. (2015), “Preconditioning the Lasso for Sign Consistency,” Electronic Journal of Statistics, 9, 1150–1172. DOI: 10.1214/15-EJS1029.
  • Khatri, C. G., and Pillai, K. C. S. (1965), “Some Results on the Non-central Multivariate Beta Distribution and Moments of Traces of Two Matrices,” Annals of Mathematical Statistics, 36, 1511–1520. DOI: 10.1214/aoms/1177699910.
  • Koltchinskii, V., and Yuan, M. (2010), “Sparsity in Multiple Kernel Learning,” The Annals of Statistics, 38, 3660–3695. DOI: 10.1214/10-AOS825.
  • Lee, J. D., Liu, Q., Sun, Y., and Taylor, J. E. (2017), “Communication-Efficient Sparse Regression,” Journal of Machine Learning Research, 18, 115–144.
  • Lou, Y., Bien, J., Caruana, R., and Gehrke, J. (2016), “Sparse Partially Linear Additive Models,” Journal of Computational and Graphical Statistics, 25, 1126–1140. DOI: 10.1080/10618600.2015.1089775.
  • Meier, L., Van de Geer, S., and Bühlmann, P. (2008), “The Group Lasso for Logistic Regression,” Journal of the Royal Statistical Society, Series B, 70, 53–71. DOI: 10.1111/j.1467-9868.2007.00627.x.
  • Meier, L., Van de Geer, S., and Bühlmann, P. (2009), “High-Dimensional Additive Modeling,” The Annals of Statistics, 37, 3779–3821. DOI: 10.1214/09-AOS692.
  • Petersen, A., and Witten, D. (2019), “Data-Adaptive Additive Modeling,” Statistics in Medicine, 38, 583–600. DOI: 10.1002/sim.7859.
  • Petersen, A., Witten, D., and Simon, N. (2016), “Fused Lasso Additive Model,” Journal of Computational and Graphical Statistics, 25, 1005–1025. DOI: 10.1080/10618600.2015.1073155.
  • Raskutti, G., Wainwright, M. J., and Yu, B. (2012), “Minimax-Optimal Rates for Sparse Additive Models Over Kernel Classes via Convex Programming,” Journal of Machine Learning Research, 13, 389–427.
  • Ravikumar, P., Lafferty, J., Liu, H., and Wasserman, L. (2009), “Sparse Additive Models,” Journal of the Royal Statistical Society, Series B, 71, 1009–1030. DOI: 10.1111/j.1467-9868.2009.00718.x.
  • Sadhanala, V., and Tibshirani, R. J. (2019), “Additive Models with Trend Filtering,” The Annals of Statistics, 47, 3032–3068. DOI: 10.1214/19-AOS1833.
  • Shi, C., Lu, W., and Song, R. (2018), “A Massive Data Framework for M-estimators with Cubic-Rate,” Journal of the American Statistical Association, 113, 1698–1709. DOI: 10.1080/01621459.2017.1360779.
  • Simon, N., and Tibshirani, R. (2012), “Standardization and the Group Lasso Penalty,” Statistica Sinica, 22, 983–1001. DOI: 10.5705/ss.2011.075.
  • Song, Q., and Liang, F. (2015), “A Split-and-Merge Bayesian Variable Selection Approach for Ultrahigh Dimensional Regression,” Journal of the Royal Statistical Society, Series B, 77, 947–972. DOI: 10.1111/rssb.12095.
  • Speckman, P. (1985), “Spline Smoothing and Optimal Rates of Convergence in Nonparametric Regression Models,” The Annals of Statistics, 13, 970–983. DOI: 10.1214/aos/1176349650.
  • Stone, C. J. (1985), “Additive Regression and other Nonparametric Models,” The Annals of Statistics, 13, 689–705. DOI: 10.1214/aos/1176349548.
  • Tang, L., Zhou, L., and Song, P. X.-K. (2016), “Method of Divide-and-Combine in Regularised Generalised Linear Models for Big Data,” arXiv preprint arXiv:1611.06208.
  • Thodberg, H. H. (1993), “Ace of Bayes: Application of Neural Networks with Pruning,” Technical Report, The Danish Meat Research Institute.
  • Tibshirani, R. (1996), “Regression Shrinkage and Selection via the Lasso,” Journal of the Royal Statistical Society, Series B, 58, 267–288. DOI: 10.1111/j.2517-6161.1996.tb02080.x.
  • Van de Geer, S., Bühlmann, P., Ritov, Y., and Dezeure, R. (2014), “On Asymptotically Optimal Confidence Regions and Tests for High-Dimensional Models,” The Annals of Statistics, 42, 1166–1202.
  • Wang, X., Dunson, D. B., and Leng, C. (2016), “Decorrelated Feature Space Partitioning for Distributed Sparse Regression,” in Advances in Neural Information Processing Systems, pp. 802–810.
  • Wood, S. N. (2011), “Fast Stable Restricted Maximum Likelihood and Marginal Likelihood Estimation of Semiparametric Generalized Linear Models,” Journal of the Royal Statistical Society, Series B, 73, 3–36. DOI: 10.1111/j.1467-9868.2010.00749.x.
  • Yang, J., Mahoney, M. W., Saunders, M., and Sun, Y. (2016), “Feature-Distributed Sparse Regression: A Screen-and-Clean Approach,” in Advances in Neural Information Processing Systems, pp. 2712–2720.
  • Yang, Y., and Zou, H. (2015), “A Fast Unified Algorithm for Solving Group-Lasso Penalized Learning Problems,” Statistics and Computing, 25, 1129–1141. DOI: 10.1007/s11222-014-9498-5.
  • Yang, Y., and Zou, H. (2017), gglasso: Group Lasso Penalized Learning Using a Unified BMD Algorithm. R package version 1.4. Available at https://CRAN.R-project.org/package=gglasso
  • Yuan, M., and Lin, Y. (2006), “Model Selection and Estimation in Regression with Grouped Variables,” Journal of the Royal Statistical Society, Series B, 68, 49–67. DOI: 10.1111/j.1467-9868.2005.00532.x.
  • Yuan, M., and Zhou, D. (2016), “Minimax Optimal Rates of Estimation in High Dimensional Additive Models,” The Annals of Statistics, 44, 2564–2593. DOI: 10.1214/15-AOS1422.
  • Zeng, D., and Lin, D. (2015), “On Random-Effects Meta-Analysis,” Biometrika, 102, 281–294. DOI: 10.1093/biomet/asv011.
  • Zhang, Y., Duchi, J., and Wainwright, M. (2013), “Divide and Conquer Kernel Ridge Regression,” in Conference on Learning Theory, pp. 592–617.
  • Zhang, Y., Duchi, J., and Wainwright, M. (2015), “Divide and Conquer Kernel Ridge Regression: A Distributed Algorithm with Minimax Optimal Rates,” Journal of Machine Learning Research, 16, 3299–3340.
  • Zhang, Y., Wainwright, M. J., and Duchi, J. C. (2012), “Communication-Efficient Algorithms for Statistical Optimization,” in Advances in Neural Information Processing Systems, pp. 1502–1510.
  • Zhao, P., and Yu, B. (2006), “On Model Selection Consistency of Lasso,” Journal of Machine Learning Research, 7, 2541–2563.
  • Zhao, T., Cheng, G., and Liu, H. (2016), “A Partially Linear Framework for Massive Heterogeneous Data,” The Annals of Statistics, 44, 1400–1437.
  • Zhou, Y., Porwal, U., Zhang, C., Ngo, H. Q., Nguyen, X., Ré, C., and Govindaraju, V. (2014), “Parallel Feature Selection Inspired by Group Testing,” in Advances in Neural Information Processing Systems, pp. 3554–3562.
