Abstract
This article shows that a probabilistic version of the classical forward-stepwise variable inclusion procedure can serve as a general data-augmentation scheme for model space distributions in (generalized) linear models. This latent variable representation takes the form of a Markov process, thereby allowing information propagation algorithms to be applied for sampling from model space posteriors. In particular, we propose a sequential Monte Carlo method for achieving effective unbiased Bayesian model averaging in high-dimensional problems, using proposal distributions constructed via local information propagation. The method, called LIPS (local information propagation based sampling), is illustrated using real and simulated examples with dimensionality ranging from 15 to 1000, and its performance in estimating posterior inclusion probabilities and in out-of-sample prediction is compared with that of several other methods, namely MCMC, BAS, iBMA, and LASSO. In addition, it is shown that the latent variable representation can also serve as a modeling tool for specifying model space priors that account for knowledge regarding model complexity and conditional inclusion relationships. Supplementary materials for this article are available online.
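To make the sequential Monte Carlo idea concrete, the sketch below shows a generic sequential importance sampler over binary variable-inclusion vectors: each particle grows its model one variable at a time, proposing inclusion or exclusion in proportion to the scores of the two candidate extensions and accumulating importance weights. This is only a schematic stand-in, not the paper's LIPS algorithm; LIPS constructs its proposals via local information propagation on the latent Markov process, whereas this toy uses a myopic one-step proposal, and the function names (`smc_model_sampler`, `log_score`) are illustrative.

```python
import numpy as np

def smc_model_sampler(log_score, p, n_particles=1000, rng=None):
    """Sequential importance sampling over binary inclusion vectors.

    log_score(gamma) returns the log (unnormalized) posterior score of
    a partial inclusion vector gamma.  A schematic illustration only:
    the paper's LIPS method replaces the myopic one-step proposal used
    here with proposals built by local information propagation.
    """
    rng = np.random.default_rng(rng)
    particles = np.zeros((n_particles, 0), dtype=int)
    # log score of each particle's current (partial) model
    prev = np.array([log_score(g) for g in particles])
    log_w = np.zeros(n_particles)
    for _ in range(p):
        ext0 = np.hstack([particles, np.zeros((n_particles, 1), dtype=int)])
        ext1 = np.hstack([particles, np.ones((n_particles, 1), dtype=int)])
        s0 = np.array([log_score(g) for g in ext0])
        s1 = np.array([log_score(g) for g in ext1])
        # stable log-sum-exp of the two extension scores
        m = np.maximum(s0, s1)
        lse = m + np.log(np.exp(s0 - m) + np.exp(s1 - m))
        p1 = np.exp(s1 - lse)              # proposal prob. of inclusion
        take1 = rng.random(n_particles) < p1
        particles = np.where(take1[:, None], ext1, ext0)
        # incremental importance weight: total extension mass
        # relative to the particle's previous score
        log_w += lse - prev
        prev = np.where(take1, s1, s0)
    w = np.exp(log_w - log_w.max())
    return particles, w / w.sum()
```

Given the weighted particles, a posterior inclusion probability for variable `j` is estimated as `np.sum(w * particles[:, j])`, and model-averaged predictions follow by weighting each sampled model's prediction by `w`.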
Additional information
Notes on contributors
Li Ma
Li Ma is Assistant Professor, Department of Statistical Science, Duke University, Box 90251, Durham, NC 27708-0251 (E-mail: [email protected]). This research is supported by NSF grant DMS-1309057. The author is especially grateful to Quanli Wang for programming help that substantially improved the efficiency of the software. The implementation of the LIPS algorithm used in the examples is based on the SMCTC template class in C++ (Johansen 2009).