Abstract
High-dimension, low sample size (HDLSS) data are becoming common in fields such as genetic microarrays, medical imaging, text recognition, finance, and chemometrics. Such data have surprising and often counterintuitive geometric structures because high-dimensional noise dominates and corrupts the local neighborhoods. In this article, we estimate the intrinsic dimension (ID), which allows one to distinguish between deterministic chaos and random noise in HDLSS data. We propose a new ID estimation methodology and study its properties using a d-asymptotic approach.
Acknowledgments
The authors thank the anonymous referees for their valuable comments and careful reading. The research of the second author was partially supported by a Grant-in-Aid for Scientific Research (B), Japan Society for the Promotion of Science (JSPS), under Contract Number 18300092.