ABSTRACT
Online social media such as Twitter are growing rapidly. Twitter has become one of the most popular microblogging services on the Internet. It lets millions of users communicate and interact by sending short messages of up to 140 characters. The massive volume of information that Twitter generates calls for automatic tools that can determine the topics people are talking about. The topic detection task focuses on discovering these main topics automatically. In this article, we first explore different approaches to detecting the topics of tweets. We then classify these approaches along four dimensions: with or without word embedding, specified or unspecified, offline (RED) or online (NED), and supervised or unsupervised. Finally, we discuss the studied approaches in detail.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
6. Defence Advanced Research Projects Agency
7. C implementation of variational expectation maximization for latent Dirichlet allocation (LDA), from http://www.cs.princeton.edu/~blei/lda-c/index.html
9. Global Vectors for Word Representation
10. Embeddings from Language Models
11. Bidirectional Encoder Representations from Transformers
12. Dirichlet Multinomial Mixture