References
- Bach F. Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression. J Mach Learn Res. 2014;15(1):595–627.
- Cohen K, Nedić A, Srikant R. On projected stochastic gradient descent algorithm with weighted averaging for least squares regression. IEEE Trans Automat Contr. 2017;62(11):5974–5981.
- Cardot H, Cénac P, Godichon-Baggioni A. Online estimation of the geometric median in Hilbert spaces: nonasymptotic confidence balls. Ann Stat. 2017;45(2):591–614.
- Cardot H, Cénac P, Zitt P-A. Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm. Bernoulli. 2013;19(1):18–43.
- Godichon-Baggioni A. Estimating the geometric median in Hilbert spaces with stochastic gradient algorithms: Lp and almost sure rates of convergence. J Multivar Anal. 2016;146:209–222.
- Bercu B, Costa M, Gadat S. Stochastic approximation algorithms for superquantiles estimation; 2020. arXiv preprint arXiv:2007.14659.
- Costa M, Gadat S. Non asymptotic controls on a recursive superquantile approximation. Electron J Stat. 2021;15(2):4718–4769.
- Alfarra M, Hanzely S, Albasyoni A, et al. Adaptive learning of the optimal mini-batch size of SGD; 2020. arXiv preprint arXiv:2005.01097.
- Konečný J, Liu J, Richtárik P, et al. Mini-batch semi-stochastic gradient descent in the proximal setting. IEEE J Sel Top Signal Process. 2016;10(2):242–255.
- Robbins H, Monro S. A stochastic approximation method. Ann Math Stat. 1951;22(3):400–407.
- Pelletier M. On the almost sure asymptotic behaviour of stochastic algorithms. Stoch Process Their Appl. 1998;78(2):217–244.
- Ruppert D. Efficient estimations from a slowly convergent Robbins-Monro process. Technical report. Cornell University Operations Research and Industrial Engineering; 1988.
- Polyak B, Juditsky A. Acceleration of stochastic approximation. SIAM J Control Optim. 1992;30(4):838–855.
- Pelletier M. Asymptotic almost sure efficiency of averaged stochastic algorithms. SIAM J Control Optim. 2000;39(1):49–72.
- Moulines E, Bach F. Non-asymptotic analysis of stochastic approximation algorithms for machine learning. Adv Neural Inf Process Syst. 2011;24.
- Gadat S, Panloup F. Optimal non-asymptotic bound of the Ruppert-Polyak averaging without strong convexity; 2017. arXiv preprint arXiv:1709.03342.
- Godichon-Baggioni A. Online estimation of the asymptotic variance for averaged stochastic gradient algorithms. J Stat Plan Inference. 2019;203:1–19.
- Godichon-Baggioni A. Lp and almost sure rates of convergence of averaged stochastic gradient algorithms: locally strongly convex objective. ESAIM – Probab Stat. 2019;23:841–873.
- Défossez A, Bottou L, Bach F, et al. A simple convergence proof of Adam and Adagrad; 2020. arXiv preprint arXiv:2003.02395.
- Wang H, Gurbuzbalaban M, Zhu L, et al. Convergence rates of stochastic gradient descent under infinite noise variance. Adv Neural Inf Process Syst. 2021;34:18866–18877.
- Li CJ, Mou W, Wainwright M, et al. ROOT-SGD: sharp nonasymptotics and asymptotic efficiency in a single algorithm. In: Conference on Learning Theory. PMLR; 2022. p. 909–981.
- Chaudhuri P. Multivariate location estimation using extension of R-estimates through U-statistics type approach. Ann Stat. 1992;20(2):897–916.
- Duchi J, Hazan E, Singer Y. Adaptive subgradient methods for online learning and stochastic optimization. J Mach Learn Res. 2011;12(7):2121–2159.
- Boyer C, Godichon-Baggioni A. On the asymptotic rate of convergence of stochastic Newton algorithms and their weighted averaged versions; 2020. arXiv preprint arXiv:2011.09706.
- Mokkadem A, Pelletier M. A generalization of the averaging procedure: the use of two-time-scale algorithms. SIAM J Control Optim. 2011;49(4):1523–1543.