Theory and Methods

A Simple Two-Sample Test in High Dimensions Based on L2-Norm

Pages 1011-1027 | Received 29 May 2018, Accepted 25 Mar 2019, Published online: 30 May 2019
 

ABSTRACT

Testing the equality of two means is a fundamental inference problem. For high-dimensional data, Hotelling's T²-test either performs poorly or becomes inapplicable. Several modifications have been proposed to address this issue. However, most of them rely on asymptotic normality of the null distributions of their test statistics, which inevitably requires strong assumptions on the covariance. We study this problem thoroughly and propose an L2-norm based test that works under mild conditions, even when there are fewer observations than the dimension. Specifically, to cope with general nonnormality of the null distribution, we employ the Welch–Satterthwaite χ²-approximation. We derive a sharp upper bound on the approximation error and use it to justify that the χ²-approximation is preferred to the normal approximation. Simple ratio-consistent estimators for the parameters in the χ²-approximation are given. Importantly, our test can cope with singularity or near singularity of the covariance, which is commonly seen in high dimensions and is the main cause of nonnormality. The power of the proposed test is also investigated. Extensive simulation studies and an application show that our test is at least comparable to, and often outperforms, several competitors in terms of size control, and that the powers are comparable when the sizes are comparable. Supplementary materials for this article are available online.
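
To make the idea in the abstract concrete, the following is a minimal sketch of an L2-norm two-sample statistic with a Welch–Satterthwaite (two-moment matching) χ²-approximation. It is an illustration under stated assumptions, not the article's procedure: it assumes a common covariance for the two samples, and it uses naive plug-in estimates of tr(Σ) and tr(Σ²) from the pooled sample covariance rather than the ratio-consistent estimators mentioned in the abstract. The function name `l2_two_sample_ws` and its return values are hypothetical.

```python
def l2_two_sample_ws(x, y):
    """Sketch of an L2-norm two-sample test with a two-moment chi-square fit.

    x, y: lists of observations, each observation a list of p floats.
    Returns (T, beta, d), where T is approximated under the null by
    beta * chi-square with d degrees of freedom.
    Assumptions (not from the article): common covariance, naive plug-in
    trace estimates from the pooled sample covariance S.
    """
    n1, n2, p = len(x), len(y), len(x[0])
    n = n1 + n2

    # Sample mean vectors of the two groups.
    mean_x = [sum(row[j] for row in x) / n1 for j in range(p)]
    mean_y = [sum(row[j] for row in y) / n2 for j in range(p)]

    # L2-norm statistic: scaled squared distance between the sample means.
    T = (n1 * n2 / n) * sum((a - b) ** 2 for a, b in zip(mean_x, mean_y))

    # Center each observation by its own group mean and stack the rows (Z).
    z = [[v - m for v, m in zip(row, mean_x)] for row in x]
    z += [[v - m for v, m in zip(row, mean_y)] for row in y]

    # With S = Z'Z / (n - 2): tr(S) = ||Z||_F^2 / (n - 2) and
    # tr(S^2) = ||Z Z'||_F^2 / (n - 2)^2, so only the n x n Gram matrix
    # is needed, which is convenient when p is much larger than n.
    tr_s = sum(v * v for row in z for v in row) / (n - 2)
    tr_s2 = 0.0
    for zi in z:
        for zk in z:
            g = sum(a * b for a, b in zip(zi, zk))
            tr_s2 += g * g
    tr_s2 /= (n - 2) ** 2

    # Match the first two moments of the null distribution to beta * chi2_d.
    beta = tr_s2 / tr_s
    d = tr_s ** 2 / tr_s2
    return T, beta, d


# Tiny worked example with p = 2 and two observations per group.
T, beta, d = l2_two_sample_ws([[0.0, 0.0], [2.0, 0.0]],
                              [[1.0, 1.0], [1.0, 3.0]])
print(T, beta, d)
```

An approximate p-value would then be the upper tail probability of a χ² distribution with d degrees of freedom evaluated at T / beta, for example via `scipy.stats.chi2.sf(T / beta, d)` if SciPy is available. Note that d need not be an integer, and that the sketch does not guard against a degenerate sample with zero pooled variance.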

Acknowledgments

The authors thank the co-editor, the associate editor, and two reviewers for their constructive comments and suggestions, which helped us improve the article substantially.

Funding

Zhang is supported by the National University of Singapore academic research grant R-155-000-187-114. Zhou is financially supported by the First Class Discipline of Zhejiang—A(Zhejiang Gongshang University—Statistics). Cheng is supported by the Hong Kong Baptist University grants RC-ICRS17-18 and FRG2/17-18/086. Guo would like to thank Professor Wen-Lung Shiau, the Advanced Data Analysis Center (PLS-SEM of Zhejiang University of Technology), for his support on this research.
