1,301
Views
10
CrossRef citations to date
0
Altmetric
Theory and Methods

Empirical Bayes Confidence Intervals for Selected Parameters in High-Dimensional Data

&
Pages 607-618 | Received 01 Apr 2011, Published online: 01 Jul 2013
 

Abstract

Modern statistical problems often involve a large number of populations and hence a large number of parameters that characterize these populations. It is common for scientists to use data to select the most significant populations, such as those with the largest t statistics. The scientific interest often lies in studying and making inferences regarding these parameters, called the selected parameters, corresponding to the selected populations. The current statistical practices either apply a traditional procedure assuming there were no selection—a practice that is not valid—or they use the Bonferroni-type procedure that is valid but very conservative and often noninformative. In this article, we propose valid and sharp confidence intervals that allow scientists to select parameters and to make inferences for the selected parameters based on the same data. This type of confidence interval allows the users to zero in on the most interesting selected parameters without collecting more data. The validity of confidence intervals is defined as the controlling of Bayes coverage probability so that it is no less than a nominal level uniformly over a class of prior distributions for the parameter. When a mixed model is assumed and the random effects are the key parameters, this validity criterion is exactly the frequentist criterion, since the Bayes coverage probability is identical to the frequentist coverage probability. Assuming that the observations are normally distributed with unequal and unknown variances, we select parameters with the largest t statistics. We then construct sharp empirical Bayes confidence intervals for these selected parameters, which have either a large Bayes coverage probability or a small Bayes false coverage rate uniformly for a class of priors. Our intervals, applicable to any high-dimensional data, are applied to microarray data and are shown to be better than all the alternatives. It is also anticipated that the same intervals would be valid for any selection rule. Supplementary materials for this article are available online.

ACKNOWLEDGMENTS

The authors thank Song Chang at Cornell who at the earlier stage of this research helped us with simulations. Also we thank Professors Philip Everson and Carl N. Morris for explaining their work to the authors.

Hwang's research is supported by the National Science Council, Taiwan, grant No. NSC 100-2118-M-194-004-MY3. Zhao's research is supported by NSF grant DMS-1208735.

Log in via your institution

Log in to Taylor & Francis Online

PDF download + Online access

  • 48 hours access to article PDF & online version
  • Article PDF can be downloaded
  • Article PDF can be printed
USD 61.00 Add to cart

Issue Purchase

  • 30 days online access to complete issue
  • Article PDFs can be downloaded
  • Article PDFs can be printed
USD 343.00 Add to cart

* Local tax will be added as applicable

Related Research

People also read lists articles that other readers of this article have read.

Recommended articles lists articles that we recommend and is powered by our AI driven recommendation engine.

Cited by lists all citing articles based on Crossref citations.
Articles with the Crossref icon will open in a new tab.