Research Article

What We Can Do and Cannot Do with Topic Modeling: A Systematic Review


ABSTRACT

Topic modeling has become an effective tool for communication scholars to explore large volumes of text. However, empirical studies applying topic modeling often face the critical question of how to make meaningful theoretical contributions. In this study, we highlighted the importance of the theoretical underpinning, research design, and methodological details of topic modeling studies. We summarized five normative arguments that address critical issues in theory building and testing, research design, and reliability and validity assessments. Using these normative arguments as criteria, we systematically reviewed 105 communication studies that applied topic modeling. We identified gaps and missed opportunities in previous studies and discussed potential pitfalls for the field.

Acknowledgement

The authors are grateful to Dr. Tai-Quan 'Winson' Peng, Dr. Marko Bachl, and the three anonymous reviewers for their constructive suggestions.

Disclosure statement

We have no known conflict of interest to disclose.

Notes

1 Other models based on non-negative matrix factorization (NMF; Shi et al., 2018) are also considered topic modeling techniques, which researchers can use to classify documents.

2 We selected 2009 as the starting point because in that year Lazer et al. (2009) published a pioneering study introducing computational social science.

3 We excluded studies focusing on methodological innovation (e.g., studies from Communication Methods and Measures) because they are difficult to compare with other empirical studies, whose goals are typically theory building and theory testing. However, we still integrate important topic modeling studies into the literature review and discussion.

Additional information

Notes on contributors

Yingying Chen

Dr. Yingying Chen (PhD in Information and Media, Michigan State University) is an Assistant Professor in the School of Journalism and Communication at the Renmin University of China. Her research focuses on the intersection of political communication, science communication, information diffusion, and computational social science methods.

Zhao Peng

Dr. Zhao Peng (PhD, Michigan State University) is an Assistant Professor in the Department of Journalism at Emerson College. Her research investigates how communication scientists can more effectively present their findings, including how communication practitioners may apply DEIA knowledge in data visualizations.

Sei-Hill Kim

Dr. Sei-Hill Kim (PhD, Cornell University) is the Eleanor M. & R. Frank Mundy Professor in the School of Journalism and Mass Communications at the University of South Carolina. His research investigates the role of mass media in shaping public discourse on important issues in politics, public health, and science and technology.

Chang-Won Choi

Dr. Chang-Won Choi (PhD, The University of South Carolina) is an Assistant Professor of Integrated Marketing Communications in the School of Journalism and New Media at the University of Mississippi. His research interests include viral advertising, digital advertising, corporate social responsibility (CSR) campaigns, and computational methods for advertising research.
