Regular Papers

A second-order method for convex ℓ1-regularized optimization with active-set prediction

Pages 605-621 | Received 17 May 2015, Accepted 01 Jan 2016, Published online: 16 Feb 2016

References

  • G. Andrew and J. Gao, Scalable training of ℓ1-regularized log-linear models, in Proceedings of the 24th International Conference on Machine Learning, Z. Ghahramani (ed.), Omnipress, Corvallis, OR, 2007, pp. 33–40.
  • F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, Optimization with sparsity-inducing penalties, Found. Trends Mach. Learn. 4 (2012), pp. 1–106. doi: 10.1561/2200000015
  • A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imag. Sci. 2 (2009), pp. 183–202. doi: 10.1137/080716542
  • R.H. Byrd, J. Nocedal, and R. Schnabel, Representations of quasi-Newton matrices and their use in limited memory methods, Math. Program. 63 (1994), pp. 129–156. doi: 10.1007/BF01582063
  • R.H. Byrd, J. Nocedal, and F. Oztoprak, An inexact successive quadratic approximation method for ℓ1-regularized optimization, Math. Program. (2015), pp. 1–22. doi: 10.1007/s10107-015-0941-y
  • R.H. Byrd, G.M. Chin, J. Nocedal, and F. Oztoprak, A family of second-order methods for convex L1 regularized optimization, Tech. Rep., Optimization Center Report 2012/2, Northwestern University, 2012.
  • R.H. Byrd, G.M. Chin, J. Nocedal, and Y. Wu, Sample size selection in optimization methods for machine learning, Math. Program. 134 (2012), pp. 127–155. doi: 10.1007/s10107-012-0572-5
  • I. Daubechies, M. Defrise, and C. De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint, Commun. Pure Appl. Math. 57 (2004), pp. 1413–1457. doi: 10.1002/cpa.20042
  • M. De Santis, S. Lucidi, and F. Rinaldi, A fast active set block coordinate descent algorithm for ℓ1-regularized least squares, arXiv preprint arXiv:1403.1738, 2014.
  • D.L. Donoho, De-noising by soft-thresholding, IEEE Trans. Inf. Theory, 41 (1995), pp. 613–627. doi: 10.1109/18.382009
  • K. Fountoulakis and J. Gondzio, Performance of first- and second-order methods for big data optimization, arXiv preprint arXiv:1503.03520, 2015.
  • K. Fountoulakis, J. Gondzio, and P. Zhlobich, Matrix-free interior point method for compressed sensing problems, Math. Program. Comput. 6 (2014), pp. 1–31. doi: 10.1007/s12532-013-0063-6
  • J. Friedman, T. Hastie, and R. Tibshirani, Regularization paths for generalized linear models via coordinate descent, J. Statist. Softw. 33 (2010), pp. 1, Available at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2929880/. doi: 10.18637/jss.v033.i01
  • D. Greene and P. Cunningham, Practical solutions to the problem of diagonal dominance in kernel document clustering, in Proceedings of the 23rd International Conference on Machine Learning (ICML '06), Pittsburgh, PA, ACM, New York, 2006, pp. 377–384. Available at http://doi.acm.org/10.1145/1143844.1143892.
  • Z. Han and F.E. Curtis, Primal–dual active-set methods for isotonic regression and trend filtering, arXiv preprint arXiv:1508.02452, 2015.
  • C.J. Hsieh, M.A. Sustik, P. Ravikumar, and I.S. Dhillon, Sparse inverse covariance matrix estimation using quadratic approximation, Adv. Neural Inf. Process. Syst. 24 (2011), pp. 2330–2338.
  • P. Hungerländer and F. Rendl, A feasible active set method for strictly convex quadratic problems with simple bounds, SIAM J. Optim. 25 (2015), pp. 1633–1659. doi: 10.1137/140984439
  • K. Koh, S.J. Kim, and S.P. Boyd, An interior-point method for large-scale l1-regularized logistic regression, J. Mach. Learn. Res. 8 (2007), pp. 1519–1555.
  • J.D. Lee, Y. Sun, and M.A. Saunders, Proximal Newton-type methods for minimizing composite functions, SIAM J. Optim. 24 (2014), pp. 1420–1443. doi: 10.1137/130921428
  • Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Kluwer Academic Publishers, Boston, MA, 2004.
  • Y. Nesterov, Efficiency of coordinate descent methods on huge-scale optimization problems, SIAM J. Optim. 22 (2012), pp. 341–362. doi: 10.1137/100802001
  • J. Nocedal and S. Wright, Numerical Optimization, 2nd ed., Springer, New York, 2006.
  • P. Olsen, F. Oztoprak, J. Nocedal, and S. Rennie, Newton-like methods for sparse inverse covariance estimation, Advances in Neural Information Processing Systems 25, P. Bartlett, F. Pereira, C. Burges, L. Bottou, and K. Weinberger, eds., 2012, pp. 764–772. Available at http://books.nips.cc/papers/files/nips25/NIPS2012_0344.pdf.
  • PASCAL Large Scale Learning Challenge, 2008. Available at http://largescale.ml.tu-berlin.de/ (accessed 1 January 2015).
  • P. Richtárik and M. Takáč, Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function, Math. Program. 144 (2014), pp. 1–38. doi: 10.1007/s10107-012-0614-z
  • K. Scheinberg and X. Tang, Practical inexact proximal quasi-Newton method with global complexity analysis, arXiv preprint arXiv:1311.6547, 2014.
  • M. Schmidt, Graphical model structure learning with l1-regularization, Ph.D. thesis, University of British Columbia, 2010.
  • M. Schmidt, G. Fung, and R. Rosales, Fast optimization methods for l1 regularization: A comparative study and two new approaches, in J.N. Kok, J. Koronacki, R.L. de Mantaras, S. Matwin, D. Mladenič, and A. Skowron (eds.), Machine Learning: ECML 2007, Springer, Warsaw, 2007, pp. 286–297, Available at http://dx.doi.org/10.1007/978-3-540-74958-5_28.
  • M. Schmidt, D. Kim, and S. Sra, Projected Newton-type methods in machine learning, in Optimization for Machine Learning, S. Sra, S. Nowozin, and S.J. Wright (eds.), MIT Press, Cambridge, MA, 2011.
  • S. Solntsev, J. Nocedal, and R.H. Byrd, An algorithm for quadratic ℓ1-regularized optimization with a flexible active-set strategy, Optim. Methods Softw. (2014), pp. 1–25.
  • S. Sra, S. Nowozin, and S. Wright, Optimization for Machine Learning, MIT Press, Cambridge, MA, 2011.
  • P. Tseng and S. Yun, A coordinate gradient descent method for nonsmooth separable minimization, Math. Program. 117 (2009), pp. 387–423. doi: 10.1007/s10107-007-0170-0
  • Z. Wen, W. Yin, D. Goldfarb, and Y. Zhang, A fast algorithm for sparse reconstruction based on shrinkage, subspace optimization, and continuation, SIAM J. Sci. Comput. 32 (2010), pp. 1832–1857. doi: 10.1137/090747695
  • Z. Wen, W. Yin, H. Zhang, and D. Goldfarb, On the convergence of an active-set method for ℓ1 minimization, Optim. Methods Softw. 27 (2012), pp. 1127–1146. doi: 10.1080/10556788.2011.591398
  • S. Wright, Coordinate descent algorithms, Technical report, University of Wisconsin, Madison, WI, 2014.
  • S.J. Wright, R.D. Nowak, and M.A.T. Figueiredo, Sparse reconstruction by separable approximation, IEEE Trans. Signal Process. 57 (2009), pp. 2479–2493. doi: 10.1109/TSP.2009.2016892
  • G.X. Yuan, C.H. Ho, and C.J. Lin, An improved GLMNET for ℓ1-regularized logistic regression, J. Mach. Learn. Res. 13 (2012), pp. 1999–2030.
