Original Articles

Variational Approximation for Mixtures of Linear Mixed Models

Pages 564-585 | Received 01 Dec 2011, Published online: 28 Apr 2014
 

Abstract

Mixtures of linear mixed models (MLMMs) are useful for clustering grouped data and can be estimated by likelihood maximization through the Expectation–Maximization algorithm. A suitable number of components is then conventionally determined by comparing different mixture models using penalized log-likelihood criteria such as the Bayesian information criterion. We propose fitting MLMMs with variational methods, which can perform parameter estimation and model selection simultaneously. We describe a variational approximation for MLMMs in which the variational lower bound is in closed form, allowing for fast evaluation, and develop a novel variational greedy algorithm for model selection and learning of the mixture components. This approach handles algorithm initialization and returns a plausible number of mixture components automatically. In cases of weak identifiability of certain model parameters, we use hierarchical centering to reparameterize the model and show empirically that the resulting gain in efficiency in variational algorithms is similar to that in Markov chain Monte Carlo (MCMC) algorithms. Related to this, we prove that the approximate rate of convergence of variational algorithms by Gaussian approximation is equal to that of the corresponding Gibbs sampler, which suggests that reparameterizations can lead to improved convergence in variational algorithms just as in MCMC algorithms. Supplementary materials for the article are available online.
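
For readers unfamiliar with the term, the "variational lower bound" mentioned above can be stated in its generic form. The block below is a sketch of the standard identity only, not a reproduction of the article's Equation (3). The symbols y (observed data), θ (all unknown parameters and latent variables), and q (the approximating density) are notation assumed here for illustration and may not match the article's.

```latex
% Generic variational lower bound (standard identity; notation assumed, not the article's)
\begin{align*}
\log p(y)
  &= \mathcal{L}(q) + \mathrm{KL}\bigl(q(\theta) \,\|\, p(\theta \mid y)\bigr), \\
\mathcal{L}(q)
  &= \mathbb{E}_{q}\bigl[\log p(y, \theta)\bigr] - \mathbb{E}_{q}\bigl[\log q(\theta)\bigr], \\
\log p(y)
  &\ge \mathcal{L}(q)
  \quad \text{since } \mathrm{KL}(\cdot \,\|\, \cdot) \ge 0 .
\end{align*}
```

Because the Kullback–Leibler term is nonnegative, maximizing the lower bound over q is equivalent to minimizing the divergence from q to the posterior. A bound available in closed form can be evaluated cheaply at each iteration, which is the property the abstract highlights.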

SUPPLEMENTARY MATERIALS

Appendix: Derivation of the variational lower bound in Equation (3), together with the expressions for the variational lower bounds and parameter updates for Algorithms 2 and 3. An application of Algorithm 2 to the yeast galactose data of Ideker et al. (2001) is also included. (VA_MLMM.appendix.pdf)

R code and data: R code implementing the variational greedy algorithm (VGA) using Algorithms 1, 2, and 3, together with the water temperature dataset, is available as supplementary material. See the “README” file in the zip file for details. (VGA.zip)

ACKNOWLEDGMENTS

Siew Li Tan was partially supported as part of the tropical reservoir research program of the Singapore-Delft Water Alliance (SDWA). We thank SDWA for supplying the water temperature dataset and Dr. David Burger and Dr. Hans Los for their valuable comments and suggestions.
