Original Articles

Outlier-free merging of homogeneous groups of pre-classified observations under contamination

Pages 2997-3020 | Received 18 May 2017, Accepted 02 Jul 2017, Published online: 17 Jul 2017

