Clustering

Graphical and Computational Tools to Guide Parameter Choice for the Cluster Weighted Robust Model

Pages 1195-1214 | Received 02 Mar 2022, Accepted 27 Nov 2022, Published online: 09 Jan 2023
