Abstract
We propose a new approach to the problem of high-dimensional multivariate ANOVA via bootstrapping max statistics that involve the differences of sample mean vectors. The proposed method proceeds via the construction of simultaneous confidence regions for the differences of population mean vectors. It is suited to simultaneously test the equality of several pairs of mean vectors of potentially more than two populations. By exploiting the variance decay property that is a natural feature in relevant applications, we are able to provide dimension-free and nearly parametric convergence rates for Gaussian approximation, bootstrap approximation, and the size of the test. We demonstrate the proposed approach with ANOVA problems for functional data and sparse count data. The proposed methodology is shown to work well in simulations and several real data applications.
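To illustrate the general idea of a bootstrapped max statistic, the following is a minimal sketch for the simplest case of two populations. It is not the authors' implementation. It assumes a Gaussian multiplier bootstrap and no coordinate-wise standardization, neither of which is specified in the abstract, and all function names (`bootstrap_max_quantile`, `simultaneous_region`, `reject_equal_means`) are hypothetical. The sketch estimates a quantile of the maximum absolute deviation of the difference in sample means, forms a simultaneous confidence region around that difference, and rejects equality of the mean vectors if zero falls outside the region in any coordinate.

```python
import math
import random


def mean_vec(rows):
    """Coordinate-wise sample mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def centered(rows):
    """Subtract the sample mean vector from every observation."""
    m = mean_vec(rows)
    return [[v - mj for v, mj in zip(row, m)] for row in rows]


def bootstrap_max_quantile(x, y, alpha=0.05, n_boot=500, seed=0):
    """Estimate the (1 - alpha) quantile of
    max_j |(xbar_j - mu_x_j) - (ybar_j - mu_y_j)|
    with a Gaussian multiplier bootstrap (an assumption of this sketch)."""
    rng = random.Random(seed)
    cx, cy = centered(x), centered(y)
    nx, ny, p = len(x), len(y), len(x[0])
    stats = []
    for _ in range(n_boot):
        # Independent standard normal multipliers, one per observation.
        wx = [rng.gauss(0.0, 1.0) for _ in range(nx)]
        wy = [rng.gauss(0.0, 1.0) for _ in range(ny)]
        best = 0.0
        for j in range(p):
            sx = sum(w * row[j] for w, row in zip(wx, cx)) / nx
            sy = sum(w * row[j] for w, row in zip(wy, cy)) / ny
            best = max(best, abs(sx - sy))
        stats.append(best)
    stats.sort()
    idx = min(n_boot - 1, math.ceil((1 - alpha) * n_boot) - 1)
    return stats[idx]


def simultaneous_region(x, y, alpha=0.05, n_boot=500, seed=0):
    """Per-coordinate intervals for mu_x - mu_y that hold simultaneously."""
    q = bootstrap_max_quantile(x, y, alpha, n_boot, seed)
    diff = [a - b for a, b in zip(mean_vec(x), mean_vec(y))]
    return [(d - q, d + q) for d in diff]


def reject_equal_means(x, y, alpha=0.05, n_boot=500, seed=0):
    """Reject H0: mu_x = mu_y if zero lies outside any interval."""
    region = simultaneous_region(x, y, alpha, n_boot, seed)
    return any(lo > 0 or hi < 0 for lo, hi in region)
```

Because the region is built from a single quantile of the maximum over all coordinates, the same construction both tests the global hypothesis and indicates which coordinates drive a rejection. Extending it to several pairs of populations would amount to taking the maximum over pairs as well as over coordinates.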
Supplementary Material
Supplement:
The Supplement contains the proofs for the results in Section 3, and additional simulation studies for functional ANOVA and high-dimensional MANOVA. (PDF)
R-package:
The hdanova.cuda package implements the proposed method for the GPU-based computing platform.
Notes
1 Note that M itself is not a test statistic since it involves unknown parameters, but being able to estimate the quantiles of M will enable our testing procedure based on SCRs.
2 That is, for each, the equation is satisfied.
3 Originally available from ftp://ftp.cs.cornell.edu/pub/smart, and now available publicly on the Internet, for example, https://www.dataminingresearch.com/index.php/2010/09/classic3-classic4-datasets/