ABSTRACT
A false discovery rate (FDR) procedure is often employed in exploratory data analysis to determine which among thousands or millions of attributes are worthy of follow-up analysis. However, these methods tend to discover the most statistically significant attributes, which need not be the most worthy of further exploration. This article provides a new FDR-controlling method that allows for the nature of the exploratory analysis to be considered when determining which attributes are discovered. To illustrate, a study in which the objective is to classify discoveries into one of several clusters is considered, and a new FDR method that minimizes the misclassification rate is developed. It is shown analytically and with simulation that the proposed method performs better than competing methods.
Acknowledgments
The authors would like to thank two anonymous reviewers for helpful comments.