References
- A. Agafonov, D. Kamzolov, P. Dvurechensky, and A. Gasnikov, Inexact tensor methods and their application to stochastic convex optimization, arXiv preprint: 2012.15636, 2020.
- L. Bottou, F. Curtis, and J. Nocedal, Optimization methods for large-scale machine learning, SIAM Rev. 60 (2018), pp. 223–311.
- E.G. Birgin, J.L. Gardenghi, J.M. Martínez, S.A. Santos, and Ph.L. Toint, Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularized models, Math. Program. 163(1–2) (2017), pp. 359–368.
- R. Bollapragada, R.H. Byrd, and J. Nocedal, Exact and inexact subsampled Newton methods for optimization, IMA J. Numer. Anal. 39(2) (2018), pp. 545–578.
- J. Bolte, A. Daniilidis, and A. Lewis, The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems, SIAM J. Optim. 17(4) (2007), pp. 1205–1223.
- S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends Mach. Learn. 3(1) (2011), pp. 1–122.
- C. Cartis, N.I.M. Gould, and Ph.L. Toint, A concise second-order complexity analysis for unconstrained optimization using high-order regularized models, Optim. Methods Softw. 35(2) (2020), pp. 243–256.
- C.C. Chang and C.J. Lin, LIBSVM: a library for support vector machines, ACM Trans. Intell. Syst. Technol. 2(3) (2011), pp. 1–27.
- A. Defazio, F. Bach, and S. Lacoste-Julien, SAGA: a fast incremental gradient method with support for non-strongly convex composite objectives, Adv. Neural Inf. Process. Syst. 27 (2014), pp. 1646–1654.
- N. Doikov and Yu. Nesterov, Local convergence of tensor methods, Math. Program. 193(1) (2022), pp. 315–336.
- N. Doikov and Yu. Nesterov, Inexact tensor methods with dynamic accuracies, in International Conference on Machine Learning, 2020, pp. 2577–2586.
- A. Gasnikov, P. Dvurechensky, E. Gorbunov, E. Vorontsova, D. Selikhanovych, C.A. Uribe, B. Jiang, H. Wang, S. Zhang, S. Bubeck, and Q. Jiang, Near optimal methods for minimizing convex functions with Lipschitz pth derivatives, in Conference on Learning Theory, 2019, pp. 1392–1393.
- I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, MIT Press, 2016.
- A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis, Wiley, 2001.
- R. Johnson and T. Zhang, Accelerating stochastic gradient descent using predictive variance reduction, in Advances in Neural Information Processing Systems, 2013, pp. 315–323.
- D. Kovalev, K. Mishchenko, and P. Richtárik, Stochastic Newton and cubic Newton methods with simple local linear-quadratic rates, in Advances in Neural Information Processing Systems, 2019.
- A. Lucchi and J. Kohler, A stochastic tensor method for non-convex optimization, arXiv preprint: 1911.10367, 2019.
- A. Mokhtari, M. Eisen, and A. Ribeiro, IQN: an incremental quasi-Newton method with local superlinear convergence rate, SIAM J. Optim. 28(2) (2018), pp. 1670–1698.
- E. Moulines and F. Bach, Non-asymptotic analysis of stochastic approximation algorithms for machine learning, Adv. Neural Inf. Process. Syst. 24 (2011), pp. 451–459.
- J. Mairal, Incremental majorization–minimization optimization with application to large-scale machine learning, SIAM J. Optim. 25(2) (2015), pp. 829–855.
- Yu. Nesterov, Implementable tensor methods in unconstrained convex optimization, Math. Program. 186(1–2) (2021), pp. 157–183.
- Yu. Nesterov, Inexact basic tensor methods for some classes of convex optimization problems, Optim. Methods Softw. 37(3) (2022), pp. 878–906. doi:10.1080/10556788.2020.1854252
- Yu. Nesterov, Gradient methods for minimizing composite functions, Math. Program. 140(1) (2013), pp. 125–161.
- I. Necoara, General convergence analysis of stochastic first order methods for composite optimization, J. Optim. Theory Appl. 189(1) (2021), pp. 66–95.
- I. Necoara, V. Nedelcu, and I. Dumitrache, Parallel and distributed optimization methods for estimation and control in networks, J. Process Control 21(5) (2011), pp. 756–766.
- I. Necoara and D. Lupu, General higher-order majorization–minimization algorithms for (non)convex optimization, arXiv preprint: 2010.13893, 2020.
- A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro, Robust stochastic approximation approach to stochastic programming, SIAM J. Optim. 19(4) (2009), pp. 1574–1609.
- Yu. Nesterov and B.T. Polyak, Cubic regularization of Newton method and its global performance, Math. Program. 108(1) (2006), pp. 177–205.
- L.M. Nguyen, J. Liu, K. Scheinberg, and M. Takáč, SARAH: A novel method for machine learning problems using stochastic recursive gradient, in International Conference on Machine Learning, 2017.
- A. Rodomanov and Yu. Nesterov, Smoothness parameter of power of Euclidean norm, J. Optim. Theory Appl. 185(2) (2020), pp. 303–326.
- L. Rosasco, S. Villa, and B.C. Vu, Convergence of stochastic proximal gradient algorithm, Appl. Math. Optim. 82(3) (2020), pp. 891–917.
- N. Tripuraneni, M. Stern, C. Jin, J. Regier, and M.I. Jordan, Stochastic cubic regularization for fast nonconvex optimization, Adv. Neural Inf. Process. Syst. 31 (2018), pp. 2899–2908.
- D. Zhou, P. Xu, and Q. Gu, Stochastic variance-reduced cubic regularized Newton method, in International Conference on Machine Learning, 2018, pp. 5985–5994.
- Hyperspectral Remote Sensing Scenes, http://www.ehu.eus/ccwintco/index.php/Hyperspectral-Remote-Sensing-Scenes.
- A. Wächter and L.T. Biegler, On the implementation of a primal–dual interior point filter line search algorithm for large-scale nonlinear programming, Math. Program. 106(1) (2006), pp. 25–57.