Journal of the Operational Research Society Volume 69, 2018 - Issue 9

Original Articles

Mixed integer linear programming and heuristic methods for feature selection in clustering

Stefano Benati, School of International Studies, University of Trento, Trento, Italy. Correspondence: [email protected]

https://orcid.org/0000-0002-1928-5224

Sergio García, School of Mathematics, The University of Edinburgh, Edinburgh, UK.

https://orcid.org/0000-0003-4281-6916

Justo Puerto, Institute of Mathematics of the University of Seville (IMUS), Universidad de Sevilla, Sevilla, Spain.

https://orcid.org/0000-0003-4079-8419

Pages 1379-1395 | Received 04 Jul 2016, Accepted 20 Oct 2017, Published online: 05 Jan 2018

https://doi.org/10.1080/01605682.2017.1398206

References

AlBdaiwi, B., Ghosh, D., & Goldengorin, B. (2011). Data aggregation for p-median problems. Journal of Combinatorial Optimization, 21, 348–363.
Andrews, J. L., & McNicholas, P. D. (2013). vscc: Variable selection for clustering and classification. R package version 1. Retrieved from http://CRAN.R-project.org/package=vscc
Andrews, J. L., & McNicholas, P. D. (2014). Variable selection for clustering and classification. Journal of Classification, 31, 136–153.
Avella, P., Boccia, M., Salerno, S., & Vasilyev, I. (2012). An aggregation heuristic for large scale p-median problems. Computers and Operations Research, 39, 1625–1632.
Avella, P., Sassano, A., & Vasil’ev, I. (2007). Computational study of large-scale p-median problems. Mathematical Programming, 109, 89–114.
Benati, S., & García, S. (2014). A mixed integer linear model for clustering with variable selection. Computers and Operations Research, 43, 280–285.
Brusco, M. J. (2004). Clustering binary data in the presence of masking variables. Psychological Methods, 9, 510–523.
Caballero, R., Laguna, M., Martí, R., & Molina, J. (2011). Scatter tabu search for multiobjective clustering problems. The Journal of the Operational Research Society, 62, 2034–2046.
Carmone, F. J., Kara, A., & Maxwell, S. (1999). HINoV: A new model to improve market segmentation by identifying noisy variables. Journal of Marketing Research, 36, 501–509.
Chen, J. S., Ching, R. K. H., & Lin, Y. S. (2004). An extended study of the k-means algorithm for data clustering and its applications. The Journal of the Operational Research Society, 55, 976–987.
Church, R. L. (2003). COBRA: A new formulation of the classic p-median location problem. Annals of Operations Research, 122, 103–120.
Church, R. L. (2008). BEAMR: An exact and approximate model for the p-median problem. Computers and Operations Research, 35, 417–426.
Cornuejols, G., Nemhauser, G., & Wolsey, L. (1980). A canonical representation of simple plant location problems and its applications. SIAM Journal on Algebraic and Discrete Methods, 1, 261–272.
Elloumi, S. (2010). A tighter formulation of the p-median problem. Journal of Combinatorial Optimization, 19, 69–83.
Elloumi, S., Labbé, M., & Pochet, Y. (2004). A new formulation and resolution method for the p-center problem. INFORMS Journal on Computing, 16, 84–94.
Fraiman, R., Justel, A., & Svarc, M. (2008). Selection of variables for cluster analysis and classification rules. Journal of the American Statistical Association, 103, 1294–1303.
Fraley, C., & Raftery, A. E. (2002). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association, 97, 611–631.
Fraley, C., Raftery, A. E., Murphy, T. B., & Scrucca, L. (2012). mclust version 4 for R: Normal mixture modeling for model-based clustering, classification, and density estimation (Technical Report No. 597). Department of Statistics, University of Washington.
Fowlkes, E. B., Gnanadesikan, R., & Kettering, J. R. (1988). Variable selection in clustering. Journal of Classification, 5, 205–228.
Friedman, J., & Meulman, J. (2004). Clustering objects on subsets of attributes. Journal of the Royal Statistical Society, Series B, 66, 815–849.
García, S., Labbé, M., & Marín, A. (2011). Solving large p-median problems with a radius formulation. INFORMS Journal on Computing, 23, 546–556.
García, S., Landete, M., & Marín, A. (2012). New formulation and a branch-and-cut algorithm for the multiple allocation p-hub median problem. European Journal of Operational Research, 220, 48–57.
García-Escudero, L. A., Gordaliza, A., & Matrán, C. (2003). Trimming tools in exploratory data analysis. Journal of Computational and Graphical Statistics, 12, 434–449.
Goldengorin, B., & Krushinsky, D. (2011). Complexity evaluation of benchmark instances for the p-median problem. Mathematical and Computer Modelling, 53, 1719–1736.
Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.
Hartigan, J. A., & Wong, M. A. (1979). Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society, Series C, 28, 100–108.
Hocking, R. R. (1976). The analysis and selection of variables in linear regression. Biometrics, 32, 1–49.
Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2, 193–218.
Kariv, O., & Hakimi, S. L. (1979). An algorithmic approach to network location problems, part II. The p-medians. SIAM Journal on Applied Mathematics, 37, 539–560.
Law, M. H. C., Figueiredo, M. A. T., & Jain, A. K. (2004). Simultaneous feature selection and clustering using mixture models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26, 1154–1166.
MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability (pp. 281–297). Berkeley, CA: University of California Press.
Marín, A., Nickel, S., Puerto, J., & Velten, S. (2009). A flexible model and efficient solution strategies for discrete location problems. Discrete Applied Mathematics, 157, 1128–1145.
McLachlan, G. J., & Krishnan, T. (1997). The EM algorithm and extensions. New York, NY: Wiley.
Maldonado, S., Pérez, J., Weber, R., & Labbé, M. (2014). Feature selection for support vector machines via mixed integer linear programming. Information Sciences, 279, 163–175.
Mladenovic, N., Brimberg, J., Hansen, P., & Moreno-Pérez, J. A. (2007). The p-median problem: A survey of metaheuristic approaches. European Journal of Operational Research, 179, 927–939.
Morlini, I., & Zani, S. (2013). Variable selection in cluster analysis: An approach based on a new index. In A. Giusti, G. Ritter, & M. Vichi (Eds.), Classification and data mining (pp. 71–79). Studies in Classification, Data Analysis, and Knowledge Organization. Berlin: Springer.
Qiu, W., & Joe, H. (2006). Generation of random clusters with specified degree of separation. Journal of Classification, 23, 315–334.
Qiu, W., & Joe, H. (2013). clusterGeneration: Random Cluster Generation (with specified degree of separation). Retrieved from http://CRAN.R-project.org/package=clusterGeneration
Pan, W., & Shen, X. (2007). Penalized model-based clustering with application to variable selection. Journal of Machine Learning Research, 8, 1145–1164.
Puerto, J., Ramos, A. B., & Rodríguez-Chía, A. M. (2013). A specialized branch & bound & cut for single-allocation ordered median hub location problems. Discrete Applied Mathematics, 161, 2624–2646.
Raftery, A. E., & Dean, N. (2006). Variable selection for model-based clustering. Journal of the American Statistical Association, 101, 168–178.
Scrucca, L., Raftery, A. E., & Dean, N. (2013). clustvarsel: A package implementing variable selection for model-based clustering in R, version 2.0. Retrieved from http://CRAN.R-project.org/package=clustvarsel
Steinley, D., & Brusco, M. J. (2008a). A new variable weighting and selection procedure for k-means cluster analysis. Multivariate Behavioral Research, 43, 77–108.
Steinley, D., & Brusco, M. J. (2008b). Selection of variables in cluster analysis: An empirical comparison of eight procedures. Psychometrika, 73, 125–144.
Tadesse, M. G., Sha, N., & Vannucci, M. (2005). Bayesian variable selection in clustering high-dimensional data. Journal of the American Statistical Association, 100, 602–617.
Yang, J., & Olafsson, S. (2009). Near-optimal feature selection for large databases. The Journal of the Operational Research Society, 60, 1045–1055.
Witten, D. M., & Tibshirani, R. (2010). A framework for feature selection in clustering. Journal of the American Statistical Association, 105, 713–726.
