Original Articles

A smoothing stochastic gradient method for composite optimization

Pages 1281-1301 | Received 28 Feb 2013, Accepted 03 Feb 2014, Published online: 13 Mar 2014

REFERENCES

  • A. Auslender and M. Teboulle, Interior gradient and proximal methods for convex and conic optimization, SIAM J. Optim. 16 (2006), pp. 697–725. doi: 10.1137/S1052623403427823
  • A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci. 2(1) (2009), pp. 183–202.
  • S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, New York, NY, 2004.
  • X. Chen, Q. Lin, S. Kim, J.G. Carbonell, and E.P. Xing, Smoothing proximal gradient method for general structured sparse learning, Ann. Appl. Stat. 6 (2012), pp. 719–752. doi: 10.1214/11-AOAS514
  • J. Friedman, T. Hastie, and R. Tibshirani, Regularization paths for generalized linear models via coordinate descent, J. Stat. Softw. 33(1) (2010), pp. 1–22.
  • S. Ghadimi and G. Lan, Optimal stochastic approximation algorithms for strongly convex stochastic composite optimization, I: A generic algorithmic framework, SIAM J. Optim. 22 (2012), pp. 1469–1492. doi: 10.1137/110848864
  • J.-B. Hiriart-Urruty and C. Lemaréchal, Fundamentals of Convex Analysis, Springer, Heidelberg, 2001.
  • C. Hu, J.T. Kwok, and W. Pan, Accelerated gradient methods for stochastic optimization and online learning, Adv. Neural Inf. Process. Syst. 22 (2009), pp. 781–789.
  • L. Jacob, G. Obozinski, and J.-P. Vert, Group Lasso with overlap and graph Lasso, International Conference on Machine Learning, Montreal, QC, Canada, 2009.
  • R. Jenatton, J.-Y. Audibert, and F. Bach, Structured variable selection with sparsity-inducing norms, J. Mach. Learn. Res. 12 (2011), pp. 2777–2824.
  • G. Lan, An optimal method for stochastic composite optimization, Math. Program. 133(1) (2012), pp. 365–397. doi: 10.1007/s10107-010-0434-y
  • G. Lan, Z. Lu, and R.D.C. Monteiro, Primal-dual first-order methods with O(1/ε) iteration-complexity for cone programming, Math. Program. 126(1) (2011), pp. 1–29. doi: 10.1007/s10107-008-0261-6
  • J. Liu, S. Ji, and J. Ye, Multi-task feature learning via efficient ℓ2,1-norm minimization, Conference on Uncertainty in Artificial Intelligence, Montreal, QC, Canada, 2009.
  • M.S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret, Applications of second-order cone programming, Linear Algebra Appl. 284 (1998), pp. 193–228. doi: 10.1016/S0024-3795(98)10032-0
  • L. Meier, S. van de Geer, and P. Bühlmann, The group Lasso for logistic regression, J. R. Stat. Soc. Ser. B 70 (2008), pp. 53–71. doi: 10.1111/j.1467-9868.2007.00627.x
  • A. Nemirovski and D. Yudin, Problem Complexity and Method Efficiency in Optimization, John Wiley, New York, 1983.
  • A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro, Robust stochastic approximation approach to stochastic programming, SIAM J. Optim. 19(4) (2009), pp. 1574–1609.
  • Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Kluwer Academic Publishers, Norwell, MA, 2004.
  • Y. Nesterov, Smooth minimization of non-smooth functions, Math. Program. 103(1) (2005), pp. 127–152. doi: 10.1007/s10107-004-0552-5
  • Y. Nesterov, Gradient methods for minimizing composite objective function, Math. Program. 140(1) (2013), pp. 125–161. doi: 10.1007/s10107-012-0629-5
  • G. Obozinski, F. Bach, R. Jenatton, and J. Mairal, Proximal methods for sparse hierarchical dictionary learning, International Conference on Machine Learning, Haifa, Israel, 2010.
  • T.K. Pong, P. Tseng, S. Ji, and J. Ye, Trace norm regularization: Reformulations, algorithms, and multi-task learning, SIAM J. Optim. 20(6) (2010), pp. 3465–3489. doi: 10.1137/090763184
  • R. Tibshirani, Regression shrinkage and selection via the Lasso, J. R. Stat. Soc. Ser. B 58 (1996), pp. 267–288.
  • R. Tibshirani and M. Saunders, Sparsity and smoothness via the fused Lasso, J. R. Stat. Soc. Ser. B 67(1) (2005), pp. 91–108. doi: 10.1111/j.1467-9868.2005.00490.x
  • K.-C. Toh and S. Yun, An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems, Pac. J. Optim. 6(3) (2010), pp. 615–640.
  • P. Tseng, On accelerated proximal gradient methods for convex-concave optimization, Tech. Rep., Department of Mathematics, University of Washington, Seattle, WA, 2008. Available at http://pages.cs.wisc.edu/brecht/cs726docs/Tseng.APG.pdf
  • P. Tseng and S. Yun, A coordinate gradient descent method for nonsmooth separable minimization, Math. Program. 117 (2009), pp. 387–423. doi: 10.1007/s10107-007-0170-0
  • M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, J. R. Stat. Soc. Ser. B 68 (2006), pp. 49–67. doi: 10.1111/j.1467-9868.2005.00532.x
