Review

Variational Inference: A Review for Statisticians

David M. Blei, Alp Kucukelbir & Jon D. McAuliffe
Pages 859–877 | Received 01 Jan 2016, Published online: 13 Jul 2017

References

  • Ahmed, A. , Aly, M. , Gonzalez, J. , Narayanamurthy, S. , and Smola, A. (2012), “Scalable Inference in Latent Variable Models,” in International Conference on Web Search and Data Mining , pp. 123–132.
  • Airoldi, E. , Blei, D. , Fienberg, S. , and Xing, E. (2008), “Mixed Membership Stochastic Blockmodels,” Journal of Machine Learning Research , 9, 1981–2014.
  • Amari, S. (1982), “Differential Geometry of Curved Exponential Families-Curvatures and Information Loss,” The Annals of Statistics , 10, 357–385.
  • ——— (1998), “Natural Gradient Works Efficiently in Learning,” Neural Computation , 10, 251–276.
  • Archambeau, C. , Cornford, D. , Opper, M. , and Shawe-Taylor, J. (2007a), “Gaussian Process Approximations of Stochastic Differential Equations,” Workshop on Gaussian Processes in Practice , 1, 1–16.
  • Archambeau, C. , Opper, M. , Shen, Y. , Cornford, D. , and Shawe-Taylor, J. (2007b), “Variational Inference for Diffusion Processes,” in Neural Information Processing Systems , pp. 17–24.
  • Armagan, A. , Clyde, M. , and Dunson, D. (2011), “Generalized Beta Mixtures of Gaussians,” in Neural Information Processing Systems , pp. 523–531.
  • Armagan, A. , and Dunson, D. (2011), “Sparse Variational Analysis of Linear Mixed Models for Large Data Sets,” Statistics & Probability Letters , 81, 1056–1062.
  • Barber, D. (2012), Bayesian Reasoning and Machine Learning , Cambridge, UK : Cambridge University Press.
  • Barber, D. , and Bishop, C. M. (1998), “Ensemble Learning in Bayesian Neural Networks,” in Generalization in Neural Networks and Machine Learning , ed. C. M. Bishop, New York : Springer Verlag, pp. 215–237.
  • Barber, D. , and Chiappa, S. (2006), “Unified Inference for Variational Bayesian Linear Gaussian State-Space Models,” in Neural Information Processing Systems , pp. 81–88.
  • Barber, D. , and van de Laar, P. (1999), “Variational Cumulant Expansions for Intractable Distributions,” Journal of Artificial Intelligence Research , 10, 435–455.
  • Barber, D. , and Wiegerinck, W. (1999), “Tractable Variational Structures for Approximating Graphical Models,” in Neural Information Processing Systems , pp. 183–189.
  • Beal, M. , and Ghahramani, Z. (2003), “The Variational Bayesian EM Algorithm for Incomplete Data: With Application to Scoring Graphical Model Structures,” in Bayesian Statistics (Vol. 7), eds. J. Bernardo , M. Bayarri , J. Berger , A. Dawid , D. Heckerman , A. Smith , and M. West , Oxford, UK : Oxford University Press, pp. 453–464.
  • Bernardo, J. , and Smith, A. (1994), Bayesian Theory , Chichester, UK : Wiley.
  • Bickel, P. , Choi, D. , Chang, X. , and Zhang, H. (2013), “Asymptotic Normality of Maximum Likelihood and its Variational Approximation for Stochastic Blockmodels,” The Annals of Statistics , 41, 1922–1943.
  • Bishop, C. (2006), Pattern Recognition and Machine Learning , New York : Springer.
  • Bishop, C. , Lawrence, N. , Jaakkola, T. , and Jordan, M. I. (1998), “Approximating Posterior Distributions in Belief Networks using Mixtures,” in Neural Information Processing Systems , pp. 416–422.
  • Bishop, C. , and Winn, J. (2000), “Non-linear Bayesian Image Modelling,” in European Conference on Computer Vision , pp. 3–17.
  • Blei, D. (2012), “Probabilistic Topic Models,” Communications of the ACM , 55, 77–84.
  • Blei, D. , and Jordan, M. I. (2006), “Variational Inference for Dirichlet Process Mixtures,” Bayesian Analysis , 1, 121–144.
  • Blei, D. , and Lafferty, J. (2007), “A Correlated Topic Model of Science,” Annals of Applied Statistics , 1, 17–35.
  • Blei, D. , Ng, A. , and Jordan, M. I. (2003), “Latent Dirichlet Allocation,” Journal of Machine Learning Research , 3, 993–1022.
  • Braun, M. , and McAuliffe, J. (2010), “Variational Inference for Large-Scale Models of Discrete Choice,” Journal of the American Statistical Association , 105, 324–335.
  • Brown, L. (1986), Fundamentals of Statistical Exponential Families , Hayward, CA : Institute of Mathematical Statistics.
  • Bugbee, B. , Breidt, F. , and van der Woerd, M. (2016), “Laplace Variational Approximation for Semiparametric Regression in the Presence of Heteroscedastic Errors,” Journal of Computational and Graphical Statistics , 25, 225–245.
  • Carbonetto, P. , and Stephens, M. (2012), “Scalable Variational Inference for Bayesian Variable Selection in Regression, and its Accuracy in Genetic Association Studies,” Bayesian Analysis , 7, 73–108.
  • Celisse, A. , Daudin, J.-J. , and Pierre, L. (2012), “Consistency of Maximum-Likelihood and Variational Estimators in the Stochastic Block Model,” Electronic Journal of Statistics , 6, 1847–1899.
  • Challis, E. , and Barber, D. (2013), “Gaussian Kullback-Leibler Approximate Inference,” The Journal of Machine Learning Research , 14, 2239–2286.
  • Chan, A. , and Vasconcelos, N. (2009), “Layered Dynamic Textures,” IEEE Transactions on Pattern Analysis and Machine Intelligence , 31, 1862–1879.
  • Cohen, S. , and Smith, N. (2010), “Covariance in Unsupervised Learning of Probabilistic Grammars,” The Journal of Machine Learning Research , 11, 3017–3051.
  • Cummins, M. , and Newman, P. (2008), “FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance,” The International Journal of Robotics Research , 27, 647–665.
  • de Freitas, N. D. , Højen-Sørensen, P. , Jordan, M. , and Russell, S. (2001), “Variational MCMC,” in Uncertainty in Artificial Intelligence , pp. 120–127.
  • Damianou, A. , Titsias, M. , and Lawrence, N. (2011), “Variational Gaussian Process Dynamical Systems,” in Neural Information Processing Systems , pp. 2510–2518.
  • Daunizeau, J. , Adam, V. , and Rigoux, L. (2014), “VBA: A Probabilistic Treatment of Nonlinear Models for Neurobiological and Behavioural Data,” PLoS Computational Biology , 10, e1003441.
  • Dempster, A. , Laird, N. , and Rubin, D. (1977), “Maximum Likelihood from Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society , Series B, 39, 1–38.
  • Deng, L. (2004), “Switching Dynamic System Models for Speech Articulation and Acoustics,” in Mathematical Foundations of Speech and Language Processing, eds. M. Johnson, S. P. Khudanpur, M. Ostendorf, and R. Rosenfeld, New York : Springer, pp. 115–133.
  • Diaconis, P. , and Ylvisaker, D. (1979), “Conjugate Priors for Exponential Families,” The Annals of Statistics , 7, 269–281.
  • Du, L. , Lu, R. , Carin, L. , and Dunson, D. (2009), “A Bayesian Model for Simultaneous Image Clustering, Annotation and Object Segmentation,” in Neural Information Processing Systems , pp. 486–494.
  • Ermis, B. , and Bouchard, G. (2014), “Iterative Splits of Quadratic Bounds for Scalable Binary Tensor Factorization,” in Uncertainty in Artificial Intelligence , pp. 192–199.
  • Erosheva, E. A. , Fienberg, S. E. , and Joutard, C. (2007), “Describing Disability through Individual-Level Mixture Models for Multivariate Binary Data,” The Annals of Applied Statistics , 1, 346–384.
  • Flandin, G. , and Penny, W. (2007), “Bayesian fMRI Data Analysis with Sparse Spatial Basis Function Priors,” NeuroImage , 34, 1108–1125.
  • Foti, N. , Xu, J. , Laird, D. , and Fox, E. (2014), “Stochastic Variational Inference for Hidden Markov Models,” in Neural Information Processing Systems , pp. 3599–3607.
  • Furmston, T. , and Barber, D. (2010), “Variational Methods for Reinforcement Learning,” in Artificial Intelligence and Statistics , pp. 241–248.
  • Gelfand, A. , and Smith, A. (1990), “Sampling Based Approaches to Calculating Marginal Densities,” Journal of the American Statistical Association , 85, 398–409.
  • Geman, S. , and Geman, D. (1984), “Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images,” IEEE Transactions on Pattern Analysis and Machine Intelligence , 6, 721–741.
  • Gershman, S. J. , Blei, D. M. , Norman, K. A. , and Sederberg, P. B. (2014), “Decomposing Spatiotemporal Brain Patterns into Topographic Latent Sources,” NeuroImage , 98, 91–102.
  • Ghahramani, Z. , and Jordan, M. I. (1997), “Factorial Hidden Markov Models,” Machine Learning , 29, 245–273.
  • Giordano, R. J. , Broderick, T. , and Jordan, M. I. (2015), “Linear Response Methods for Accurate Covariance Estimates from Mean Field Variational Bayes,” in Neural Information Processing Systems , pp. 1441–1449.
  • Grimmer, J. (2011), “An Introduction to Bayesian Inference via Variational Approximations,” Political Analysis , 19, 32–47.
  • Hall, P. , Ormerod, J. , and Wand, M. (2011a), “Theory of Gaussian Variational Approximation for a Poisson Mixed Model,” Statistica Sinica , 21, 369–389.
  • Hall, P. , Pham, T. , Wand, M. , and Wang, S. (2011b), “Asymptotic Normality and Valid Inference for Gaussian Variational Approximation,” Annals of Statistics , 39, 2502–2532.
  • Harrison, L. , and Green, G. (2010), “A Bayesian Spatiotemporal Model for Very Large Data Sets,” NeuroImage , 50, 1126–1141.
  • Hastings, W. (1970), “Monte Carlo Sampling Methods using Markov Chains and their Applications,” Biometrika , 57, 97–109.
  • Hensman, J. , Fusi, N. , and Lawrence, N. (2013), “Gaussian Processes for Big Data,” in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence , Corvallis, OR : AUAI Press, pp. 282–290.
  • Hensman, J. , Rattray, M. , and Lawrence, N. (2012), “Fast Variational Inference in the Conjugate Exponential Family,” in Neural Information Processing Systems , pp. 2888–2896.
  • Hinton, G. , and Van Camp, D. (1993), “Keeping the Neural Networks Simple by Minimizing the Description Length of the Weights,” in Computational Learning Theory , pp. 5–13.
  • Hoffman, M. , Blei, D. , and Mimno, D. M. (2012), “Sparse Stochastic Inference for Latent Dirichlet Allocation,” in Proceedings of the 29th International Conference on Machine Learning (ICML-12), eds. J. Langford and J. Pineau, New York: ACM, pp. 1599–1606.
  • Hoffman, M. D. , Blei, D. , Wang, C. , and Paisley, J. (2013), “Stochastic Variational Inference,” Journal of Machine Learning Research , 14, 1303–1347.
  • Hoffman, M. D. , and Blei, D. M. (2015), “Structured Stochastic Variational Inference,” in Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics (Vol. 38), eds. G. Lebanon and S. V. N. Vishwanathan, San Diego, CA: Proceedings of Machine Learning Research, pp. 361–369.
  • Hoffman, M. D. , and Gelman, A. (2014), “The No-U-turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo,” The Journal of Machine Learning Research , 15, 1593–1623.
  • Honkela, A. , Tornio, M. , Raiko, T. , and Karhunen, J. (2008), “Natural Conjugate Gradient in Variational Inference,” in Neural Information Processing, eds. J. C. Platt, D. Koller, Y. Singer, and S. T. Roweis, New York : Springer, pp. 305–314.
  • Jaakkola, T. , and Jordan, M. I. (1996), “Computing Upper and Lower Bounds on Likelihoods in Intractable Networks,” in Uncertainty in Artificial Intelligence , pp. 340–348.
  • ——— (1997), “A Variational Approach to Bayesian Logistic Regression Models and their Extensions,” in Artificial Intelligence and Statistics , pp. 1–12.
  • ——— (2000), “Bayesian Parameter Estimation via Variational Methods,” Statistics and Computing , 10, 25–37.
  • Ji, C. , Shen, H. , and West, M. (2010), “Bounded Approximations for Marginal Likelihoods,” Technical Report, Duke University.
  • Johnson, M. , and Willsky, A. (2014), “Stochastic Variational Inference for Bayesian Time Series Models,” in International Conference on Machine Learning , pp. 1854–1862.
  • Jojic, N. , and Frey, B. (2001), “Learning Flexible Sprites in Video Layers,” in Computer Vision and Pattern Recognition , pp. 1–8.
  • Jojic, V. , Jojic, N. , Meek, C. , Geiger, D. , Siepel, A. , Haussler, D. , and Heckerman, D. (2004), “Efficient Approximations for Learning Phylogenetic HMM Models from Data,” Bioinformatics , 20, 161–168.
  • Jordan, M. I. , Ghahramani, Z. , Jaakkola, T. , and Saul, L. (1999), “An Introduction to Variational Methods for Graphical Models,” Machine Learning , 37, 183–233.
  • Khan, M. E. , Bouchard, G. , Murphy, K. P. , and Marlin, B. M. (2010), “Variational Bounds for Mixed-Data Factor Analysis,” in Neural Information Processing Systems , pp. 1108–1116.
  • Kiebel, S. , Daunizeau, J. , Phillips, C. , and Friston, K. (2008), “Variational Bayesian Inversion of the Equivalent Current Dipole Model in EEG/MEG,” NeuroImage , 39, 728–741.
  • Kingma, D. , and Welling, M. (2014), “Auto-Encoding Variational Bayes,” in Proceedings of the 2nd International Conference on Learning Representations (ICLR) .
  • Knowles, D. , and Minka, T. (2011), “Non-Conjugate Variational Message Passing for Multinomial and Binary Regression,” in Neural Information Processing Systems , pp. 1701–1709.
  • Kucukelbir, A. , Ranganath, R. , Gelman, A. , and Blei, D. (2015), “Automatic Variational Inference in Stan,” in Neural Information Processing Systems , pp. 568–576.
  • Kucukelbir, A. , Tran, D. , Ranganath, R. , Gelman, A. , and Blei, D. M. (2017), “Automatic Differentiation Variational Inference,” Journal of Machine Learning Research , 18, 1–45.
  • Kullback, S. , and Leibler, R. (1951), “On Information and Sufficiency,” The Annals of Mathematical Statistics , 22, 79–86.
  • Kurihara, K. , and Sato, T. (2006), “Variational Bayesian Grammar Induction for Natural Language,” in Grammatical Inference: Algorithms and Applications , New York : Springer, pp. 84–96.
  • Kushner, H. , and Yin, G. (1997), Stochastic Approximation Algorithms and Applications , New York : Springer.
  • Lashkari, D. , Sridharan, R. , Vul, E. , Hsieh, P. , Kanwisher, N. , and Golland, P. (2012), “Search for Patterns of Functional Specificity in the Brain: A Nonparametric Hierarchical Bayesian Model for Group fMRI Data,” NeuroImage , 59, 1348–1368.
  • Lauritzen, S. , and Spiegelhalter, D. (1988), “Local Computations with Probabilities on Graphical Structures and their Application to Expert Systems,” Journal of the Royal Statistical Society , Series B, 50, 157–224.
  • Le Cun, Y. , and Bottou, L. (2004), “Large Scale Online Learning,” in Neural Information Processing Systems , pp. 217–224.
  • Leisink, M. , and Kappen, H. (2001), “A Tighter Bound for Graphical Models,” Neural Computation , 13, 2149–2171.
  • Liang, P. , Jordan, M. I. , and Klein, D. (2009), “Probabilistic Grammars and Hierarchical Dirichlet Processes,” in The Handbook of Applied Bayesian Analysis , eds. A. O’Hagan , and M. West , New York : Oxford University Press, pp. 776–819.
  • Liang, P. , Petrov, S. , Klein, D. , and Jordan, M. I. (2007), “The Infinite PCFG using Hierarchical Dirichlet Processes,” in Empirical Methods in Natural Language Processing , pp. 688–697.
  • Likas, A. , and Galatsanos, N. (2004), “A Variational Approach for Bayesian Blind Image Deconvolution,” IEEE Transactions on Signal Processing , 52, 2222–2233.
  • Logsdon, B. , Hoffman, G. , and Mezey, J. (2010), “A Variational Bayes Algorithm for Fast and Accurate Multiple Locus Genome-Wide Association Analysis,” BMC Bioinformatics , 11, 58.
  • MacKay, D. J. (1997), “Ensemble Learning for Hidden Markov Models,” unpublished manuscript, available at http://www.inference.eng.cam.ac.uk/mackay/ensemblePaper.pdf .
  • Manning, J. R. , Ranganath, R. , Norman, K. A. , and Blei, D. M. (2014), “Topographic Factor Analysis: A Bayesian Model for Inferring Brain Networks from Neural Data,” PLoS ONE , 9, e94914.
  • Marlin, B. M. , Khan, M. E. , and Murphy, K. P. (2011), “Piecewise Bounds for Estimating Bernoulli-Logistic Latent Gaussian Models,” in International Conference on Machine Learning , pp. 633–640.
  • McGrory, C. A. , and Titterington, D. M. (2007), “Variational Approximations in Bayesian Model Selection for Finite Mixture Distributions,” Computational Statistics and Data Analysis , 51, 5352–5367.
  • Metropolis, N. , Rosenbluth, A. , Rosenbluth, M. , Teller, A. , and Teller, E. (1953), “Equation of State Calculations by Fast Computing Machines,” Journal of Chemical Physics , 21, 1087–1092.
  • Minka, T. P. (2001), “Expectation Propagation for Approximate Bayesian Inference,” in Uncertainty in Artificial Intelligence , pp. 362–369.
  • ——— (2005), “Divergence Measures and Message Passing,” Technical Report, Microsoft Research.
  • Minka, T. , Winn, J. , Guiver, J. , Webster, S. , Zaykov, Y. , Yangel, B. , Spengler, A. , and Bronskill, J. (2014), Infer.NET 2.6. Cambridge, UK : Microsoft Research.
  • Naseem, T. , Chen, H. , Barzilay, R. , and Johnson, M. (2010), “Using Universal Linguistic Knowledge to Guide Grammar Induction,” in Empirical Methods in Natural Language Processing , pp. 1234–1244.
  • Nathoo, F. , Babul, A. , Moiseev, A. , Virji-Babul, N. , and Beg, M. (2014), “A Variational Bayes Spatiotemporal Model for Electromagnetic Brain Mapping,” Biometrics , 70, 132–143.
  • Neal, R. M. , and Hinton, G. E. (1998), “A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants,” in Learning in Graphical Models , New York: Springer, pp. 355–368.
  • Neville, S. , Ormerod, J. , and Wand, M. (2014), “Mean Field Variational Bayes for Continuous Sparse Signal Shrinkage: Pitfalls and Remedies,” Electronic Journal of Statistics , 8, 1113–1151.
  • Nott, D. J. , Tan, S. L. , Villani, M. , and Kohn, R. (2012), “Regression Density Estimation with Variational Methods and Stochastic Approximation,” Journal of Computational and Graphical Statistics , 21, 797–820.
  • Opper, M. , and Winther, O. (2005), “Expectation Consistent Approximate Inference,” The Journal of Machine Learning Research , 6, 2177–2204.
  • Ormerod, J. , You, C. , and Muller, S. (2014), “A Variational Bayes Approach to Variable Selection,” unpublished manuscript, available at http://www.maths.usyd.edu.au/u/jormerod/JTOpapers/VariableSelectionFinal.pdf .
  • Paisley, J. , Blei, D. , and Jordan, M. I. (2012), “Variational Bayesian Inference with Stochastic Search,” in Proceedings of the 29th International Conference on International Conference on Machine Learning , Madison, WI: Omnipress, pp. 1363–1370.
  • Parisi, G. (1988), Statistical Field Theory , Melville, NY: Addison-Wesley.
  • Pearl, J. (1988), Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference , San Francisco, CA: Morgan Kaufmann.
  • Penny, W. , Kiebel, S. , and Friston, K. (2003), “Variational Bayesian Inference for fMRI Time Series,” NeuroImage , 19, 727–741.
  • Penny, W. , Trujillo-Barreto, N. , and Friston, K. (2005), “Bayesian fMRI Time Series Analysis with Spatial Priors,” NeuroImage , 24, 350–362.
  • Peterson, C. , and Anderson, J. (1987), “A Mean Field Theory Learning Algorithm for Neural Networks,” Complex Systems , 1, 995–1019.
  • Raj, A. , Stephens, M. , and Pritchard, J. (2014), “fastSTRUCTURE: Variational Inference of Population Structure in Large SNP Data Sets,” Genetics , 197, 573–589.
  • Ramos, F. , Upcroft, B. , Kumar, S. , and Durrant-Whyte, H. (2012), “A Bayesian Approach for Place Recognition,” Robotics and Autonomous Systems , 60, 487–497.
  • Ranganath, R. , Gerrish, S. , and Blei, D. (2014), “Black Box Variational Inference,” in Artificial Intelligence and Statistics , pp. 814–822.
  • Ranganath, R. , Tran, D. , and Blei, D. (2016), “Hierarchical Variational Models,” in International Conference on Machine Learning , pp. 324–333.
  • Regier, J. , Miller, A. , McAuliffe, J. , Adams, R. , Hoffman, M. , Lang, D. , Schlegel, D. , and Prabhat (2015), “Celeste: Variational Inference for a Generative Model of Astronomical Images,” in International Conference on Machine Learning , pp. 2095–2103.
  • Reyes-Gomez, M. , Ellis, D. , and Jojic, N. (2004), “Multiband Audio Modeling for Single-Channel Acoustic Source Separation,” in Acoustics, Speech, and Signal Processing , pp. 641–644.
  • Rezende, D. J. , Mohamed, S. , and Wierstra, D. (2014), “Stochastic Backpropagation and Approximate Inference in Deep Generative Models,” in Proceedings of the 31st International Conference on Machine Learning (Vol. 32), eds. E. P. Xing and T. Jebara, Beijing, China: Proceedings of Machine Learning Research, pp. 1278–1286.
  • Robbins, H. , and Monro, S. (1951), “A Stochastic Approximation Method,” The Annals of Mathematical Statistics , 22, 400–407.
  • Robert, C. , and Casella, G. (2004), Monte Carlo Statistical Methods (Springer Texts in Statistics) , New York : Springer-Verlag.
  • Roberts, S. , Guilford, T. , Rezek, I. , and Biro, D. (2004), “Positional Entropy During Pigeon Homing I: Application of Bayesian Latent State Modelling,” Journal of Theoretical Biology , 227, 39–50.
  • Roberts, S. , and Penny, W. (2002), “Variational Bayes for Generalized Autoregressive Models,” IEEE Transactions on Signal Processing , 50, 2245–2257.
  • Rohde, D. , and Wand, M. (2016), “Semiparametric Mean Field Variational Bayes: General Principles and Numerical Issues,” Journal of Machine Learning Research , 17, 1–47.
  • Salimans, T. , Kingma, D. , and Welling, M. (2015), “Markov Chain Monte Carlo and Variational Inference: Bridging the Gap,” in International Conference on Machine Learning , pp. 1218–1226.
  • Salimans, T. , and Knowles, D. (2014), “On using Control Variates with Stochastic Approximation for Variational Bayes,” arXiv preprint, arXiv:1401.1022. Available at https://arxiv.org/abs/1401.1022
  • Sanguinetti, G. , Lawrence, N. , and Rattray, M. (2006), “Probabilistic Inference of Transcription Factor Concentrations and Gene-Specific Regulatory Activities,” Bioinformatics , 22, 2775–2781.
  • Sato, M. (2001), “Online Model Selection Based on the Variational Bayes,” Neural Computation , 13, 1649–1681.
  • Sato, M. , Yoshioka, T. , Kajihara, S. , Toyama, K. , Goda, N. , Doya, K. , and Kawato, M. (2004), “Hierarchical Bayesian Estimation for MEG Inverse Problem,” NeuroImage , 23, 806–826.
  • Saul, L. , and Jordan, M. I. (1996), “Exploiting Tractable Substructures in Intractable Networks,” in Neural Information Processing Systems , pp. 486–492.
  • Saul, L. K. , Jaakkola, T. , and Jordan, M. I. (1996), “Mean Field Theory for Sigmoid Belief Networks,” Journal of Artificial Intelligence Research , 4, 61–76.
  • Spall, J. (2003), Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control , New York : Wiley.
  • Stan Development Team (2015), Stan Modeling Language Users Guide and Reference Manual, Version 2.8.0. New York: Columbia University.
  • Stegle, O. , Parts, L. , Durbin, R. , and Winn, J. (2010), “A Bayesian Framework to Account for Complex Non-Genetic Factors in Gene Expression Levels Greatly Increases Power in eQTL Studies,” PLoS Computational Biology , 6, e1000770.
  • Sudderth, E. B. , and Jordan, M. I. (2009), “Shared Segmentation of Natural Scenes using Dependent Pitman-Yor Processes,” in Neural Information Processing Systems , pp. 1585–1592.
  • Sung, J. , Ghahramani, Z. , and Bang, Y. (2008), “Latent-Space Variational Bayes,” IEEE Transactions on Pattern Analysis and Machine Intelligence , 30, 2236–2242.
  • Sykacek, P. , Roberts, S. , and Stokes, M. (2004), “Adaptive BCI Based on Variational Bayesian Kalman Filtering: An Empirical Evaluation,” IEEE Transactions on Biomedical Engineering , 51, 719–727.
  • Tan, L. , and Nott, D. (2013), “Variational Inference for Generalized Linear Mixed Models using Partially Noncentered Parametrizations,” Statistical Science , 28, 168–188.
  • Tan, L. , and Nott, D. (2014), “A Stochastic Variational Framework for Fitting and Diagnosing Generalized Linear Mixed Models,” Bayesian Analysis , 9, 963–1004.
  • Tan, L. , and Nott, D. (2018), “Gaussian Variational Approximation with Sparse Precision Matrices,” Statistics and Computing , 28, 259–275. Available at https://doi.org/10.1007/s11222-017-9729-7
  • Tipping, M. , and Lawrence, N. (2005), “Variational Inference for Student-t Models: Robust Bayesian Interpolation and Generalised Component Analysis,” Neurocomputing , 69, 123–141.
  • Titsias, M. , and Lawrence, N. (2010), “Bayesian Gaussian Process Latent Variable Model,” in Artificial Intelligence and Statistics , pp. 844–851.
  • Titsias, M. , and Lázaro-Gredilla, M. (2014), “Doubly Stochastic Variational Bayes for Non-Conjugate Inference,” in International Conference on Machine Learning , pp. 1971–1979.
  • Tran, D. , Ranganath, R. , and Blei, D. M. (2016), “The Variational Gaussian Process,” in International Conference on Learning Representations , pp. 1–4.
  • Ueda, N. , and Ghahramani, Z. (2002), “Bayesian Model Search for Mixture Models Based on Optimizing Variational Bounds,” Neural Networks , 15, 1223–1241.
  • Van Den Broek, B. , Wiegerinck, W. , and Kappen, B. (2008), “Graphical Model Inference in Optimal Control of Stochastic Multi-Agent Systems,” Journal of Artificial Intelligence Research , 32, 95–122.
  • Vermaak, J. , Lawrence, N. D. , and Pérez, P. (2003), “Variational Inference for Visual Tracking,” in Computer Vision and Pattern Recognition , pp. 1–8.
  • Villegas, M. , Paredes, R. , and Thomee, B. (2013), “Overview of the ImageCLEF 2013 Scalable Concept Image Annotation Subtask,” in CLEF Evaluation Labs and Workshop , pp. 308–328.
  • Wainwright, M. J. , and Jordan, M. I. (2008), “Graphical Models, Exponential Families, and Variational Inference,” Foundations and Trends in Machine Learning , 1, 1–305.
  • Wand, M. (2014), “Fully Simplified Multivariate Normal Updates in Non-Conjugate Variational Message Passing,” Journal of Machine Learning Research , 15, 1351–1369.
  • Wand, M. , Ormerod, J. , Padoan, S. , and Frühwirth, R. (2011), “Mean Field Variational Bayes for Elaborate Distributions,” Bayesian Analysis , 6, 847–900.
  • Wang, B. , and Titterington, D. (2005), “Inadequacy of Interval Estimates Corresponding to Variational Bayesian Approximations,” in Artificial Intelligence and Statistics , pp. 373–380.
  • Wang, B. , and Titterington, D. (2006), “Convergence Properties of a General Algorithm for Calculating Variational Bayesian Estimates for a Normal Mixture Model,” Bayesian Analysis , 1, 625–650.
  • Wang, C. , and Blei, D. (2013), “Variational Inference in Nonconjugate Models,” Journal of Machine Learning Research , 14, 1005–1031.
  • Wang, C. , and Blei, D. (2015), “A General Method for Robust Bayesian Modeling,” arXiv preprint, arXiv:1510.05078. Available at https://arxiv.org/abs/1510.05078
  • Wang, P. , and Blunsom, P. (2013), “Collapsed Variational Bayesian Inference for Hidden Markov Models,” in Artificial Intelligence and Statistics , pp. 599–607.
  • Wang, Y. , and Mori, G. (2009), “Human Action Recognition by Semilatent Topic Models,” IEEE Transactions on Pattern Analysis and Machine Intelligence , 31, 1762–1774.
  • Waterhouse, S. , MacKay, D. , and Robinson, T. (1996), “Bayesian Methods for Mixtures of Experts,” in Neural Information Processing Systems , pp. 351–357.
  • Welling, M. , and Teh, Y. (2011), “Bayesian Learning via Stochastic Gradient Langevin Dynamics,” in International Conference on Machine Learning , pp. 681–688.
  • Westling, T. , and McCormick, T. H. (2015), “Establishing Consistency and Improving Uncertainty Estimates of Variational Inference Through M-estimation,” arXiv preprint, arXiv:1510.08151. Available at https://arxiv.org/abs/1510.08151
  • Wiggins, C. , and Hofman, J. (2008), “Bayesian Approach to Network Modularity,” Physical Review Letters , 100, 258701.
  • Wingate, D. , and Weber, T. (2013), “Automated Variational Inference in Probabilistic Programming,” arXiv preprint, arXiv:1301.1299. Available at https://arxiv.org/abs/1301.1299
  • Winn, J. , and Bishop, C. (2005), “Variational Message Passing,” Journal of Machine Learning Research , 6, 661–694.
  • Wipf, D. , and Nagarajan, S. (2009), “A Unified Bayesian Framework for MEG/EEG Source Imaging,” NeuroImage , 44, 947–966.
  • Woolrich, M. , Behrens, T. , Beckmann, C. , Jenkinson, M. , and Smith, S. (2004), “Multilevel Linear Modeling for fMRI Group Analysis using Bayesian Inference,” NeuroImage , 21, 1732–1747.
  • Xing, E. , Wu, W. , Jordan, M. I. , and Karp, R. (2004), “LOGOS: A Modular Bayesian Model for de novo Motif Detection,” Journal of Bioinformatics and Computational Biology , 2, 127–154.
  • Yedidia, J. S. , Freeman, W. T. , and Weiss, Y. (2001), “Generalized Belief Propagation,” in Neural Information Processing Systems , pp. 689–695.
  • Yogatama, D. , Wang, C. , Routledge, B. , Smith, N. A. , and Xing, E. (2014), “Dynamic Language Models for Streaming Text,” Transactions of the Association for Computational Linguistics , 2, 181–192.
  • You, C. , Ormerod, J. , and Muller, S. (2014), “On Variational Bayes Estimation and Variational Information Criteria for Linear Regression Models,” Australian & New Zealand Journal of Statistics , 56, 73–87.
  • Yu, T. , and Wu, Y. (2005), “Decentralized Multiple Target Tracking using Netted Collaborative Autonomous Trackers,” in Computer Vision and Pattern Recognition , pp. 939–946.
  • Zumer, J. , Attias, H. , Sekihara, K. , and Nagarajan, S. (2007), “A Probabilistic Algorithm Integrating Source Localization and Noise Suppression for MEG and EEG Data,” NeuroImage , 37, 102–115.
