Fast Search and Estimation of Bayesian Nonparametric Mixture Models Using a Classification Annealing EM Algorithm

Pages 236-247 | Received 27 Jun 2019, Accepted 03 Aug 2020, Published online: 30 Sep 2020
