
Accelerated dual-averaging primal–dual method for composite convex minimization

Pages 741–766 | Received 07 Jun 2019, Accepted 05 Jan 2020, Published online: 20 Jan 2020

