Abstract
In topic modelling, the central computational problem is approximating the posterior distribution over the latent variables given an observed collection. In practice we must resort to variational approximations, but it is not well understood which variational variant is the best choice in which setting. In this paper, we focus on four inference methods for topic models, namely mean-field variational Bayes, collapsed variational Bayes (CVB), hybrid variational-Gibbs and expectation propagation, and aim to systematically compare them. We analyse them from two perspectives, i.e. the form of the approximate posterior distribution and the type of α-divergence being minimised, and then empirically compare them on various data-sets using two popular metrics. The empirical results largely match our analysis, and they indicate that CVB0, the zeroth-order approximation of CVB, may be the best variational variant for topic models.
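One of the standard evaluation metrics referred to above is held-out (test) perplexity. As a hedged illustration, the sketch below computes per-word perplexity from estimated document-topic proportions `theta` and topic-word distributions `phi`; the variable names and the evaluation protocol here are assumptions for exposition, not the paper's exact setup.

```python
import numpy as np

def perplexity(docs, theta, phi):
    """Per-word test perplexity (illustrative sketch, not the paper's exact protocol).

    docs:  list of documents, each a list of (word_id, count) pairs
    theta: (D, K) array of document-topic proportions
    phi:   (K, V) array of topic-word distributions
    """
    log_lik, n_words = 0.0, 0
    for d, doc in enumerate(docs):
        for w, c in doc:
            # Mixture likelihood of word w in document d:
            # p(w | d) = sum_k theta[d, k] * phi[k, w]
            log_lik += c * np.log(theta[d] @ phi[:, w])
            n_words += c
    # Perplexity is the exponentiated negative mean log-likelihood per token;
    # lower values indicate a better predictive model.
    return np.exp(-log_lik / n_words)
```

Under this definition, a model that assigns a uniform distribution over a vocabulary of size V attains perplexity exactly V, which is a useful sanity check.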
Notes
No potential conflict of interest was reported by the authors.
4 On some data-sets, the test perplexities achieved by optimising the hyperparameters are worse than those achieved by fixing them. These results suggest that optimising the hyperparameters does not always lead to a better solution (Asuncion et al., 2009).