Original Articles

Exploratory tools for outlier detection in compositional data with structural zeros

Pages 734-752 | Received 11 Aug 2015, Accepted 20 Apr 2016, Published online: 12 May 2016

