
Distance-based outlier detection for high dimension, low sample size data

Pages 13-29 | Received 14 Dec 2016, Accepted 02 Mar 2018, Published online: 24 Mar 2018

