Research Article

Relative Entropy Gradient Sampler for Unnormalized Distribution

Received 28 Mar 2023, Accepted 21 Feb 2024, Published online: 21 May 2024

References

  • Ambrosio, L., Gigli, N., and Savaré, G. (2008), Gradient Flows: in Metric Spaces and in the Space of Probability Measures, Basel: Springer.
  • Andrieu, C., de Freitas, N., Doucet, A., and Jordan, M. I. (2003), “An Introduction to MCMC for Machine Learning,” Machine Learning, 50, 5–43. DOI: 10.1023/A:1020281327116.
  • Beal, M. J. (2003), Variational Algorithms for Approximate Bayesian Inference, PhD thesis, University College London.
  • Boffi, N. M., and Vanden-Eijnden, E. (2023), “Probability Flow Solution of the Fokker–Planck Equation,” Machine Learning: Science and Technology, 4, 035012. DOI: 10.1088/2632-2153/ace2aa.
  • Bregman, L. M. (1967), “The Relaxation Method of Finding the Common Point of Convex Sets and its Application to the Solution of Problems in Convex Programming,” USSR Computational Mathematics and Mathematical Physics, 7, 200–217. DOI: 10.1016/0041-5553(67)90040-7.
  • Chen, C., Zhang, R., Wang, W., Li, B., and Chen, L. (2018), “A Unified Particle-Optimization Framework for Scalable Bayesian Sampling,” in Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence.
  • Chen, T., Fox, E., and Guestrin, C. (2014), “Stochastic Gradient Hamiltonian Monte Carlo,” in Proceedings of the 31st International Conference on Machine Learning, pp. 1683–1691.
  • Dawid, A. P. (2007), “The Geometry of Proper Scoring Rules,” Annals of the Institute of Statistical Mathematics, 59, 77–93. DOI: 10.1007/s10463-006-0099-8.
  • Duane, S., Kennedy, A., Pendleton, B. J., and Roweth, D. (1987), “Hybrid Monte Carlo,” Physics Letters B, 195, 216–222. DOI: 10.1016/0370-2693(87)91197-X.
  • Duncan, A., Nüsken, N., and Szpruch, L. (2019), “On the Geometry of Stein Variational Gradient Descent,” arXiv preprint arXiv:1912.00894.
  • Dunson, D. B., and Johndrow, J. E. (2019), “The Hastings Algorithm at Fifty,” Biometrika, 107, 1–23. DOI: 10.1093/biomet/asz066.
  • Gao, Y., Huang, J., Jiao, Y., Liu, J., Lu, X., and Yang, Z. (2022), “Deep Generative Learning via Euler Particle Transport,” in Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, pp. 336–368.
  • Gao, Y., Jiao, Y., Wang, Y., Wang, Y., Yang, C., and Zhang, S. (2019), “Deep Generative Learning via Variational Gradient Flow,” in Proceedings of the 36th International Conference on Machine Learning, pp. 2093–2101.
  • Gershman, S., Hoffman, M., and Blei, D. (2012), “Nonparametric Variational Inference,” in Proceedings of the 29th International Conference on Machine Learning.
  • Gneiting, T., and Raftery, A. E. (2007), “Strictly Proper Scoring Rules, Prediction, and Estimation,” Journal of the American Statistical Association, 102, 359–378. DOI: 10.1198/016214506000001437.
  • Gu, H., Birmpa, P., Pantazis, Y., Rey-Bellet, L., and Katsoulakis, M. A. (2022), “Lipschitz Regularized Gradient Flows and Latent Generative Particles,” arXiv preprint arXiv:2210.17230.
  • Hastings, W. K. (1970), “Monte Carlo Sampling Methods Using Markov Chains and their Applications,” Biometrika, 57, 97–109. DOI: 10.1093/biomet/57.1.97.
  • Hoffman, M. D., and Gelman, A. (2014), “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo,” Journal of Machine Learning Research, 15, 1593–1623.
  • Jordan, R., Kinderlehrer, D., and Otto, F. (1998), “The Variational Formulation of the Fokker–Planck Equation,” SIAM Journal on Mathematical Analysis, 29, 1–17. DOI: 10.1137/S0036141096303359.
  • Kanamori, T., and Sugiyama, M. (2014), “Statistical Analysis of Distance Estimators with Density Differences and Density Ratios,” Entropy, 16, 921–942. DOI: 10.3390/e16020921.
  • Korba, A., Salim, A., Arbel, M., Luise, G., and Gretton, A. (2020), “A Non-asymptotic Analysis for Stein Variational Gradient Descent,” in Advances in Neural Information Processing Systems (Vol. 33).
  • LeVeque, R. J. (2007), Finite Difference Methods for Ordinary and Partial Differential Equations: Steady-state and Time-dependent Problems (Vol. 98), Philadelphia: SIAM.
  • Liu, C., Zhuo, J., Cheng, P., Zhang, R., and Zhu, J. (2019a), “Understanding and Accelerating Particle-based Variational Inference,” in Proceedings of the 36th International Conference on Machine Learning, pp. 4082–4092.
  • Liu, C., Zhuo, J., and Zhu, J. (2019b), “Understanding MCMC Dynamics as Flows on the Wasserstein Space,” in Proceedings of the 36th International Conference on Machine Learning, pp. 4093–4103.
  • Liu, Q. (2017), “Stein Variational Gradient Descent as Gradient Flow,” in Advances in Neural Information Processing Systems.
  • Liu, Q., and Wang, D. (2016), “Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm,” in Advances in Neural Information Processing Systems.
  • Lu, J., Lu, Y., and Nolen, J. (2019), “Scaling Limit of the Stein Variational Gradient Descent: The Mean Field Regime,” SIAM Journal on Mathematical Analysis, 51, 648–671. DOI: 10.1137/18M1187611.
  • Maoutsa, D., Reich, S., and Opper, M. (2020), “Interacting Particle Solutions of Fokker–Planck Equations through Gradient-Log-Density Estimation,” Entropy, 22, 802. DOI: 10.3390/e22080802.
  • Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E. (1953), “Equation of State Calculations by Fast Computing Machines,” The Journal of Chemical Physics, 21, 1087–1092. DOI: 10.1063/1.1699114.
  • Neal, R. M. (2011), “MCMC Using Hamiltonian Dynamics,” in Handbook of Markov Chain Monte Carlo, Boca Raton, FL: CRC Press.
  • Roberts, G. O., and Stramer, O. (2002), “Langevin Diffusions and Metropolis-Hastings Algorithms,” Methodology and Computing in Applied Probability, 4, 337–357. DOI: 10.1023/A:1023562417138.
  • Roberts, G. O., and Tweedie, R. L. (1996), “Exponential Convergence of Langevin Distributions and their Discrete Approximations,” Bernoulli, 2, 341–363. DOI: 10.2307/3318418.
  • Salim, A., Korba, A., and Luise, G. (2020), “The Wasserstein Proximal Gradient Algorithm,” arXiv preprint arXiv:2002.03035.
  • Salim, A., Sun, L., and Richtárik, P. (2021), “Complexity Analysis of Stein Variational Gradient Descent Under Talagrand’s Inequality T1,” arXiv preprint arXiv:2106.03076.
  • Tierney, L. (1994), “Markov Chains for Exploring Posterior Distributions,” The Annals of Statistics, 22, 1701–1728. DOI: 10.1214/aos/1176325750.
  • Villani, C. (2008), Optimal Transport: Old and New (Vol. 338), Berlin: Springer.
  • Wainwright, M. J., and Jordan, M. I. (2008), “Graphical Models, Exponential Families, and Variational Inference,” Foundations and Trends in Machine Learning, 1, 1–305.
  • Wang, Y., Chen, P., Pilanci, M., and Li, W. (2022), “Optimal Neural Network Approximation of Wasserstein Gradient Direction via Convex Optimization,” arXiv preprint arXiv:2205.13098.
  • Welling, M., and Teh, Y. W. (2011), “Bayesian Learning via Stochastic Gradient Langevin Dynamics,” in Proceedings of the 28th International Conference on Machine Learning, pp. 681–688.
  • Zhu, M., Liu, C., and Zhu, J. (2020), “Variance Reduction and Quasi-Newton for Particle-based Variational Inference,” in Proceedings of the 37th International Conference on Machine Learning, pp. 11576–11587.
  • Zhuo, J., Liu, C., Shi, J., Zhu, J., Chen, N., and Zhang, B. (2018), “Message Passing Stein Variational Gradient Descent,” in Proceedings of the 35th International Conference on Machine Learning, pp. 6018–6027.
