Abstract
Principal component analysis is a multivariate technique widely used for dimensionality reduction. When dealing with high-dimensional data, the number of principal components to retain must be chosen. Several criteria for this choice have been proposed in the literature, but most of them have serious limitations, such as reliance on normality assumptions, subjective analysis, or asymptotic properties. This study proposes two new parametric bootstrap tests for determining the optimal number of principal components (PCs) to retain for subsequent analysis, based on the amount of the total variation accounted for by the first k principal components. The performance of these tests was compared with each other and with that of the tests of Fujikoshi (1980) and Gebert and Ferreira (2010) through Monte Carlo simulations. Under multivariate normality, the two proposed parametric bootstrap tests are recommended. Under nonnormality, the test of Gebert and Ferreira (2010) is recommended. The three bootstrap tests outperform the Fujikoshi test in most circumstances.
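To make the quantity under test concrete, the following is a minimal illustrative sketch and not the procedure proposed in this study. It computes the share of total variation explained by the first k principal components and approximates a p-value by parametric bootstrap under multivariate normality. The null hypothesis form (the first k components explain at least a threshold `rho0`), the eigenvalue rescaling used to build a null covariance matrix, and all function and parameter names are assumptions introduced here for illustration only.

```python
import numpy as np


def proportion_explained(cov, k):
    """Share of total variance carried by the k largest eigenvalues of cov."""
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]  # descending order
    return eigvals[:k].sum() / eigvals.sum()


def bootstrap_pvalue(X, k, rho0, n_boot=500, seed=0):
    """Illustrative parametric bootstrap p-value for H0: rho_k >= rho0.

    Assumption (not from the source): the null covariance is obtained by
    rescaling the sample eigenvalues so that the first k explain exactly rho0.
    Requires 1 <= k < p. If rho0 is far below the observed share, the rescaled
    eigenvalues may change order, so this is only a rough sketch.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    S = np.cov(X, rowvar=False)

    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    rho_hat = eigvals[:k].sum() / eigvals.sum()

    # Rescale eigenvalues: total variance is preserved, first-k share is rho0.
    null_vals = eigvals.copy()
    null_vals[:k] *= rho0 / rho_hat
    null_vals[k:] *= (1.0 - rho0) / (1.0 - rho_hat)
    sigma0 = (eigvecs * null_vals) @ eigvecs.T

    # Resample from the fitted normal model under H0 and recompute the statistic.
    boot = np.empty(n_boot)
    for b in range(n_boot):
        Xb = rng.multivariate_normal(np.zeros(p), sigma0, size=n)
        boot[b] = proportion_explained(np.cov(Xb, rowvar=False), k)

    # Small observed shares are evidence against H0.
    return rho_hat, float(np.mean(boot <= rho_hat))
```

Under these assumptions, one would call `bootstrap_pvalue(X, k, rho0)` for k = 1, 2, ... and retain the smallest k for which the null hypothesis is not rejected at the chosen significance level.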
AMS Subject Classification: