Book reviews

Large-scale inference: empirical Bayes methods for estimation, testing, and prediction

Page 2305 | Published online: 29 May 2012

Large-scale inference: empirical Bayes methods for estimation, testing, and prediction, by Bradley Efron, Cambridge, Cambridge University Press, 2010, xii + 263 pp., £45.00 or US$70.00 (hardback), ISBN 978-0-521-19249-1

With the proliferation of high-dimensional data comes the daunting task of extracting as much information from such data as possible. The task is particularly challenging because the number of variables is often larger than the number of samples. Efron tackles this all-important problem using microarray data as his examples. Empirical Bayes methods for estimation, testing and prediction are at the heart of this marvellous collection of some of Efron's best work. The chapters read as a collection of Efron's best papers on the false discovery rate (FDR), bound in book form.

Any new methodology for handling high-dimensional data brings with it ways in which inference can become flawed. Efron makes an effort to address the inadequacies of the new methods. The FDR and its many variants are identified early in the book as the main tool for finding genes of interest.

Estimation of the proportion of null hypotheses is the basis for making the FDR more powerful. Efron models the distribution of p-values as a mixture of p-values from the null and those from the alternative. Maximum-likelihood estimation and centering methods allow the proportion of nulls to be estimated. He assumes that the proportion of p-values from the null distribution is high (close to 1) and that the null distribution is standard normal. For other data sets, he illustrates the use of an empirical null rather than the standard normal null. In addition to the ordinary FDR, Efron's local FDR is discussed, which is the probability that a gene is null given the data.
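
The mixture idea in the preceding paragraph can be made concrete with a small sketch. It is an illustration only, not code from the book: it assumes the mixture is written on the z-value scale, that the null proportion `pi0` and the alternative distribution (a normal with mean `alt_mean` and standard deviation `alt_sd`) are known rather than estimated, and that the null is the standard normal. All function and parameter names are hypothetical.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def normal_sf(z, mean=0.0, sd=1.0):
    """Upper-tail probability Pr(Z >= z) of a normal distribution."""
    u = (z - mean) / sd
    return 0.5 * math.erfc(u / math.sqrt(2.0))


def local_fdr(z, pi0, alt_mean, alt_sd):
    """Pr(null | Z = z) under the assumed two-component mixture."""
    null_part = pi0 * normal_pdf(z)  # standard normal null (assumption)
    alt_part = (1.0 - pi0) * normal_pdf(z, alt_mean, alt_sd)
    return null_part / (null_part + alt_part)


def tail_fdr(z, pi0, alt_mean, alt_sd):
    """Pr(null | Z >= z): the tail-area counterpart of local_fdr."""
    null_part = pi0 * normal_sf(z)
    alt_part = (1.0 - pi0) * normal_sf(z, alt_mean, alt_sd)
    return null_part / (null_part + alt_part)


if __name__ == "__main__":
    # Illustrative values only: 90% nulls, alternatives centred at z = 3.
    for z in (0.0, 2.0, 4.0):
        print(z, local_fdr(z, 0.9, 3.0, 1.0), tail_fdr(z, 0.9, 3.0, 1.0))
```

With a large null proportion, a gene with a z-value near zero is almost certainly null, while one far out in the tail is almost certainly not. In practice neither `pi0` nor the alternative density is known, which is why the estimation step described above matters.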

The problem of highly correlated data is also tackled. Efron suggests ways of taking into account correlation both between genes and between samples. Finally, Bayes and empirical Bayes methods for prediction of local FDRs are also covered.

The only drawback of the book is its reliance on continuous test statistics; much of the high-dimensional data now being collected is categorical.
