ABSTRACT
Online communities are increasingly important discursive spaces in which individuals obtain health information and engage in sensemaking, and they play an especially essential role during viral outbreaks in which social distancing requirements may preclude engagement with other communities and information sources. However, the manner in which those communities evolve in response to a rapidly developing public health crisis, as well as in reaction to one another, is not well understood. This longitudinal study uses latent Dirichlet allocation to assess the co-evolution of three subreddits focused on COVID-19 during the earliest, most volatile stages of the outbreak. The results demonstrate the power of being the first online community addressing an emerging health crisis as well as the manner in which latecomers to the conversation gravitate toward distinct niches to differentiate themselves with respect to both topical foci and associated communication styles. The results also highlight individuals’ detachment from developments in even an unprecedented crisis such as a global pandemic, which represents a critical barrier that health communication professionals must overcome to persuade audiences to take health crises seriously. Future studies should examine the potential role of coordination among community administrators as well as the extent to which users are aware of and exert agency over the co-evolutionary processes spanning multiple communities.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1. Segmenting the subreddits into three separate LDA models likely would have resulted in topic solutions that could not be directly compared. For instance, a topic attributed to r/Coronavirus may have appeared qualitatively similar to a corresponding topic attributed to r/COVID19, but with different word allocations to those respective topics. This would have inhibited subsequent efforts to associate changes in one subreddit with those in another, with subtle differences in the topics obtained for each subreddit masking such effects. To ensure that changes in allocations for a given topic could be reliably compared across subreddits, a single LDA model encompassing the overall discursive space was estimated rather than a separate model for each subreddit. This guaranteed that all subreddits would be evaluated with respect to an identical set of topics.
2. As an alternative to LDA, structural topic modeling (STM), which can incorporate covariates (e.g., the organizational structure of Reddit posts and comments), was also considered. However, current STM algorithms rely on variational expectation maximization, whose memory requirements far exceed what is available on typical computer hardware for data sets with millions of documents. Because STM could not feasibly be employed, LDA, which models topics without reference to the characteristics of individual documents, was used instead.
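The pooled-model design described in Note 1 can be illustrated with a minimal sketch. It is not the study's implementation: the estimation method, hyperparameters, and software used by the authors are not stated here, so the toy collapsed Gibbs sampler, the function names (`fit_pooled_lda`, `mean_topic_share_by_group`), the example posts, and the placeholder label `subreddit_3` are all assumptions made for illustration. The point it shows is that fitting one model on the combined corpus yields a single topic set, so per-subreddit topic shares share the same topic indices and can be compared directly.

```python
import random

def fit_pooled_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA over ALL documents at once.

    docs: list of token lists. Returns one topic-proportion row per document.
    Hyperparameters are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    # Resample each token's topic conditional on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed document-topic proportions (each row sums to 1).
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]

def mean_topic_share_by_group(theta, groups):
    """Average shared-model topic proportions within each group (subreddit)."""
    n_topics = len(theta[0])
    totals, counts = {}, {}
    for row, g in zip(theta, groups):
        acc = totals.setdefault(g, [0.0] * n_topics)
        for t, p in enumerate(row):
            acc[t] += p
        counts[g] = counts.get(g, 0) + 1
    return {g: [x / counts[g] for x in acc] for g, acc in totals.items()}

# Invented toy posts; "subreddit_3" is a placeholder, not a name from the source.
posts = [
    ("r/Coronavirus", "lockdown news travel ban news"),
    ("r/Coronavirus", "travel ban lockdown announced"),
    ("r/COVID19", "study preprint antibody trial"),
    ("r/COVID19", "antibody study trial results preprint"),
    ("subreddit_3", "lockdown travel news study"),
    ("subreddit_3", "preprint antibody news ban"),
]
N_TOPICS = 2
groups = [name for name, _ in posts]
docs = [text.split() for _, text in posts]

theta = fit_pooled_lda(docs, N_TOPICS)            # one model, one topic set
shares = mean_topic_share_by_group(theta, groups)  # comparable across groups
for name, share in shares.items():
    print(name, [round(p, 3) for p in share])
```

Because every subreddit is scored against the same topics, topic 0 for one subreddit refers to the same word distribution as topic 0 for another. Fitting three separate models would give no such guarantee, which is the comparability problem Note 1 describes.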