Abstract
Classical prediction methods, such as Fisher's linear discriminant function, were designed for small-scale problems in which the number of predictors N is much smaller than the number of observations n. Modern scientific devices often reverse this situation. A microarray analysis, for example, might include n=100 subjects measured on N=10,000 genes, each of which is a potential predictor. This article proposes an empirical Bayes approach to large-scale prediction, in which the optimum Bayes prediction rule is estimated using the data from all of the predictors. Microarray examples are used to illustrate the method. The results demonstrate a close connection with the shrunken centroids algorithm of Tibshirani et al. (2002), a frequentist regularization approach to large-scale prediction, and also with false discovery rate theory.