Abstract
Bayesian model averaging (BMA) is an effective technique for addressing model uncertainty in variable selection problems. However, current BMA approaches are computationally difficult to apply to data in which there are many more measurements (variables) than samples. This paper presents a method that combines ℓ1 regularization with Markov chain Monte Carlo model composition techniques for BMA. By treating the ℓ1 regularization path as a model space, we propose a method to resolve the model uncertainty that arises in model averaging from the selection of a point on the solution path. We show that this method is computationally and empirically effective for regression and classification in high-dimensional data sets. We apply our technique in simulations and to applications that arise in genomics.
Acknowledgements
We thank Dr Ka Yee Yeung for many fruitful discussions on genomics, for making iBMA available, and for making us aware of the DREAM competition as a source of benchmark data. We also thank an anonymous referee for comments leading to a number of improvements in this article.