ABSTRACT
Topic modeling has become an effective tool for communication scholars to explore large volumes of text. However, empirical studies applying topic modeling often face the critical question of how to make meaningful theoretical contributions. In this study, we highlighted the importance of the theoretical underpinning, research design, and methodological details of topic modeling studies. We summarized five normative arguments that address critical issues in theory building and testing, research design, and reliability and validity assessments. Using these normative arguments as criteria, we systematically reviewed 105 communication studies that applied topic modeling. We identified gaps and missed opportunities in previous studies and discussed potential pitfalls for the field.
Acknowledgement
The authors are grateful to Dr. Tai-Quan 'Winson' Peng, Dr. Marko Bachl, and the three anonymous reviewers for their constructive suggestions.
Disclosure statement
We have no known conflict of interest to disclose.
Notes
1 Models based on non-negative matrix factorization (NMF) (Shi et al., 2018) are also considered topic modeling techniques, which researchers can use to classify documents.
2 We selected 2009 as the starting point because in that year, Lazer et al. (2009) published a pioneering study introducing computational social science.
3 We excluded studies focusing on methodological innovation (e.g., studies from Communication Methods and Measures) because it is difficult to compare these studies with other empirical studies whose goals are typically theory building and theory testing. However, we still integrate important topic modeling studies in the literature review and discussion.
Additional information
Notes on contributors
Yingying Chen
Dr. Yingying Chen (PhD in Information and Media, Michigan State University) is an Assistant Professor in the School of Journalism and Communication at the Renmin University of China. Her research focuses on the intersection of political communication, science communication, information diffusion, and computational social science methods.
Zhao Peng
Dr. Zhao Peng (PhD, Michigan State University) is an Assistant Professor in the Department of Journalism at Emerson College. Her research investigates how communication scientists can more effectively present their findings, including how communication practitioners may apply DEIA knowledge in data visualizations.
Sei-Hill Kim
Dr. Sei-Hill Kim (PhD, Cornell University) is the Eleanor M. & R. Frank Mundy Professor in the School of Journalism and Mass Communications at the University of South Carolina. His research investigates the role of mass media in shaping public discourse on important issues in politics, public health, and science and technology.
Chang Won Choi
Dr. Chang-Won Choi (PhD, The University of South Carolina) is an Assistant Professor of Integrated Marketing Communications in the School of Journalism and New Media at the University of Mississippi. His research interests include viral advertising, digital advertising, corporate social responsibility (CSR) campaigns, and computational methods for advertising research.