Empirical Bayes Confidence Intervals for Selected Parameters in High-Dimensional Data

J. T. Gene Hwang Department of Mathematics, Department of Statistics, Cornell University, Ithaca, NY, 14850; Department of Mathematics, National Chung Cheng University, Taiwan, ROC

Zhigen Zhao Department of Statistics, Temple University, Philadelphia, PA, 19122

Abstract

Modern statistical problems often involve a large number of populations and hence a large number of parameters that characterize these populations. It is common for scientists to use data to select the most significant populations, such as those with the largest t statistics. The scientific interest often lies in studying and making inferences regarding the parameters corresponding to the selected populations, called the selected parameters. Current statistical practice either applies a traditional procedure as if there were no selection, which is not valid, or uses a Bonferroni-type procedure, which is valid but very conservative and often noninformative. In this article, we propose valid and sharp confidence intervals that allow scientists to select parameters and to make inferences for the selected parameters based on the same data. This type of confidence interval allows users to zero in on the most interesting selected parameters without collecting more data. The validity of confidence intervals is defined as control of the Bayes coverage probability so that it is no less than a nominal level uniformly over a class of prior distributions for the parameter. When a mixed model is assumed and the random effects are the key parameters, this validity criterion is exactly the frequentist criterion, since the Bayes coverage probability is identical to the frequentist coverage probability. Assuming that the observations are normally distributed with unequal and unknown variances, we select parameters with the largest t statistics. We then construct sharp empirical Bayes confidence intervals for these selected parameters, which have either a large Bayes coverage probability or a small Bayes false coverage rate uniformly for a class of priors. Our intervals, applicable to any high-dimensional data, are applied to microarray data and are shown to be better than all the alternatives. It is also anticipated that the same intervals would be valid for any selection rule. Supplementary materials for this article are available online.
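
The selection effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' empirical Bayes procedure; it only shows why the two existing practices named above fall short. It assumes a deliberately simplified setting that is not taken from the article: known unit variances (so z statistics stand in for t statistics), a N(0, tau^2) prior on the parameters, and selection of the k largest observations in absolute value. The function name and all parameter values are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def selected_coverage(p=2000, k=20, tau=1.0, alpha=0.05, reps=200, seed=0):
    """Estimate the Bayes coverage of naive and Bonferroni intervals
    for the k parameters selected by largest |x|.

    Simplifying assumptions (not from the article): theta_i ~ N(0, tau^2),
    x_i | theta_i ~ N(theta_i, 1) with the variance treated as known.
    """
    rng = np.random.default_rng(seed)
    # Half-width of the usual interval that ignores selection.
    z_naive = NormalDist().inv_cdf(1 - alpha / 2)
    # Half-width of the Bonferroni interval adjusted for all p parameters.
    z_bonf = NormalDist().inv_cdf(1 - alpha / (2 * p))

    hits_naive = 0
    hits_bonf = 0
    for _ in range(reps):
        theta = rng.normal(0.0, tau, size=p)   # draw parameters from the prior
        x = theta + rng.normal(size=p)         # one observation per population
        top = np.argsort(-np.abs(x))[:k]       # select the k most extreme
        err = np.abs(x[top] - theta[top])      # distance from interval center
        hits_naive += int(np.sum(err <= z_naive))
        hits_bonf += int(np.sum(err <= z_bonf))

    total = reps * k
    return hits_naive / total, hits_bonf / total, z_naive, z_bonf


if __name__ == "__main__":
    naive_cov, bonf_cov, z_naive, z_bonf = selected_coverage()
    print(f"naive:      coverage {naive_cov:.3f}, half-width {z_naive:.2f}")
    print(f"Bonferroni: coverage {bonf_cov:.3f}, half-width {z_bonf:.2f}")
```

In this simplified setup the naive intervals typically cover the selected parameters far less often than the nominal 95%, while the Bonferroni intervals keep coverage above the nominal level at the cost of being roughly twice as wide. That gap is the motivation for the sharper intervals the article proposes.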

ACKNOWLEDGMENTS

The authors thank Song Chang at Cornell, who helped with simulations at an earlier stage of this research. We also thank Professors Philip Everson and Carl N. Morris for explaining their work to us.

Hwang's research is supported by the National Science Council, Taiwan, grant No. NSC 100-2118-M-194-004-MY3. Zhao's research is supported by NSF grant DMS-1208735.
