Statistical Learning

Algorithms for Sparse Support Vector Machines

Pages 1097-1108 | Received 14 Aug 2021, Accepted 07 Nov 2022, Published online: 13 Dec 2022

