Abstract
Outlier detection is fundamental to data analysis. Desirable properties are affine invariance, robustness, low computational burden, and nonimposition of elliptical contours. However, leading methods fail to possess all of these features. The Mahalanobis distance outlyingness (MD) imposes elliptical contours. The projection outlyingness (SUP), which involves projections of the data onto all univariate directions, is highly computationally intensive. Computationally easy variants using projection pursuit with only finitely many directions have been introduced, but these fail to capture all of the other desired properties at once. Here, we develop a ‘robust Mahalanobis spatial outlyingness on projections’ (RMSP) function, which satisfies all four desired properties. Pre-transformation to a strong invariant coordinate system yields affine invariance, ‘spatial trimming’ yields robustness, and ‘spatial Mahalanobis outlyingness’ is used to obtain computational ease and smooth, unconstrained contours. In an empirical study using artificial and real data, we find that SUP is outclassed by MD and RMSP, that MD and RMSP are competitive, and that RMSP is especially advantageous in describing the intermediate outlyingness structure when elliptical contours are not assumed.
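As a rough illustration of the two baseline outlyingness functions named above, the following Python sketch computes a classical Mahalanobis distance outlyingness and a finite-direction approximation to the projection outlyingness. It is not the RMSP function. The function names, the use of the sample mean and covariance for MD, the median and MAD as univariate location and scale, and the random choice of directions are all assumptions made for illustration.

```python
import numpy as np


def mahalanobis_outlyingness(X):
    """Classical (non-robust) Mahalanobis distance of each row from the sample mean."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    diff = X - mu
    # Solve cov @ z = diff.T instead of forming an explicit inverse.
    z = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, z))


def projection_outlyingness(X, n_directions=500, seed=0):
    """Approximate projection outlyingness using finitely many random unit directions.

    For each direction u, each point's projection is scored by
    |u'x - median| / MAD; the outlyingness is the maximum over directions.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    U = rng.standard_normal((n_directions, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)  # normalize to unit vectors
    P = X @ U.T                                    # shape (n_points, n_directions)
    med = np.median(P, axis=0)
    mad = np.median(np.abs(P - med), axis=0)
    return np.max(np.abs(P - med) / mad, axis=1)


# Hypothetical data: a standard normal cloud plus one planted outlier (last row).
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((200, 2)), [[8.0, -8.0]]])
md = mahalanobis_outlyingness(X)
po = projection_outlyingness(X)
```

In this sketch the exact projection outlyingness would require a supremum over all unit directions; sampling a fixed number of directions trades that exactness for speed, which is the computational shortcut the abstract refers to.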
AMS 2000 Subject Classifications :
Acknowledgements
The authors gratefully acknowledge very helpful, insightful comments from reviewers. Useful input from G. L. Thompson is also greatly appreciated. Support under National Science Foundation Grants DMS-0805786 and DMS-1106691 and National Security Agency Grant H98230-08-1-0106 is sincerely acknowledged.