Journal of Applied Statistics Volume 47, 2020 - Issue 7

Open access

Evaluation of robust outlier detection methods for zero-inflated complex data

M. Templ (corresponding author), Zurich University of Applied Sciences, Winterthur, Switzerland; Vienna University of Technology, Vienna, Austria

https://orcid.org/0000-0002-8638-5276

J. Gussenbauer, Statistics Austria, Vienna, Austria

P. Filzmoser, Vienna University of Technology, Vienna, Austria

Pages 1144-1167 | Received 19 Nov 2018, Accepted 19 Sep 2019, Published online: 27 Sep 2019

https://doi.org/10.1080/02664763.2019.1671961

Figures & data

Table 1. Overview of univariate and multivariate outlier detection methods addressed.

Figure 1. Top: Estimates of the Gini coefficient (left) and variance of the Gini coefficient (right) for the Albanian data set after univariate outlier detection methods as well as outlier imputation have been applied. Bottom: Share of upper and lower outliers for each outlier detection scheme applied to the Albanian data set.

Figure 2. Top: Estimates of Gini coefficient (left) and variance of Gini coefficient (right) for the Albanian data set after multivariate outlier detection methods as well as outlier imputation have been applied. Bottom: Share of outliers detected by multivariate outlier detection methods for the Albanian household data.

Figure 3. Estimated Gini coefficients for different levels of ε and different outlier detection methods. The dashed line indicates a baseline representing the median of the Gini coefficients of the uncontaminated data.

Figure 4. Boxplots of successfully detected artificial outliers, where the whole observation was contaminated, for different outlier detection methods and different levels of ε.

Figure 5. Boxplots of successfully detected artificial outliers, where only single cells were contaminated, for different outlier detection methods and different levels of ε.

Figure 6. Share of false/positive outliers to number of clean data points for different outlier detection methods and different levels of ε.

Table 2. Percentages of misclassified observations for different levels of ε for the univariate methods.

Figure 7. Boxplots of calculated Gini coefficients for different outlier detection methods and different levels of ε. The dashed line indicates a baseline representing the median of the Gini coefficients of the uncontaminated data.

Figure 8. Boxplots of successfully detected artificial outliers, where the whole observation was contaminated, for different outlier detection methods and different levels of ε.

Figure 9. Boxplots of successfully detected artificial outliers, where only single cells were contaminated, for different outlier detection methods and different levels of ε.

Figure 10. Boxplots of share of false/positive outliers to number of clean data points for different outlier detection methods and different levels of ε.

Table 3. Percentages of misclassified observations for different levels of ε regarding multivariate methods.
