Research Article

Toward an Extended Infodemiology Framework: Leveraging Social Media Data and Web Search Queries as Digital Pulse on Cancer Communication


ABSTRACT

This study aims to extend the infodemiology framework by postulating that effective use of digital data sources for cancer communication should consider four components: (a) content: key topics that people are concerned with, (b) congruence: how interest in cancer topics differs between public posts (i.e., tweets) and private web searches, (c) context: the influence of the information environment, and (d) conduits: the sources through which cancer information flows. We compared tweets (n = 36,968) and Google web searches on breast, lung, and prostate cancer between National Cancer Prevention Month (February) and a non-cancer awareness month (March) in 2018. There are three key findings. First, reliance on public tweets alone may result in lost opportunities to identify potential cancer misinformation that is detectable in private web searches. Second, lung cancer tweets were the most sensitive to the external information environment; they became substantially more pessimistic after the end of the cancer awareness month. Finally, the cancer communication landscape was largely democratized, with no prominent conduits dominating conversations on Twitter.

Notes

1. A tweet-retweet network consists of all the nodes in a given network, with directed edges that indicate whether one node has retweeted another. In other words, a directed edge from node A to node B means that node A has retweeted a post from node B.

2. While we make a conceptual distinction between the three types of cancer information conduits, we do not claim that they are mutually exclusive categories: a highly influential Twitter user could also be an important information broker.

3. The six corpora were: (a) February breast cancer (n = 11,482); (b) February lung cancer (n = 4,104); (c) February prostate cancer (n = 3,207); (d) March breast cancer (n = 11,526); (e) March lung cancer (n = 3,854); (f) March prostate cancer (n = 2,795).

4. A document-term matrix is a way of representing textual data for LDA in which rows are documents (i.e., tweets), columns are terms (i.e., individual words), and each cell shows how frequently a given term appears in a given document (Welbers et al., 2017).
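
As an illustration of the data structures described in Notes 1 and 4, the following is a minimal Python sketch and not the authors' actual pipeline. The function names (`build_retweet_network`, `in_degree`, `document_term_matrix`), the toy accounts, and the toy tweets are all hypothetical, and whitespace tokenization is a simplifying assumption; a real analysis would typically also remove stop words and normalize terms before fitting LDA.

```python
from collections import Counter


def build_retweet_network(retweets):
    """Build a directed network from (retweeter, original_author) pairs.

    An edge A -> B means that account A retweeted a post from account B.
    The result maps each node to a Counter of the accounts it retweeted.
    """
    network = {}
    for retweeter, author in retweets:
        network.setdefault(retweeter, Counter())[author] += 1
        # Keep accounts that are only ever retweeted as nodes too.
        network.setdefault(author, Counter())
    return network


def in_degree(network):
    """Count how many distinct accounts retweeted each node."""
    counts = Counter()
    for targets in network.values():
        for author in targets:
            counts[author] += 1
    return counts


def document_term_matrix(documents):
    """Return (vocabulary, matrix): rows are documents, columns are terms.

    Each cell holds the number of times a term occurs in a document.
    Tokenization here is lowercase + whitespace split (an assumption).
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Hypothetical toy data: A and C both retweet B; A also retweets C.
retweets = [("A", "B"), ("C", "B"), ("A", "C")]
network = build_retweet_network(retweets)
popularity = in_degree(network)

# Hypothetical toy tweets standing in for one corpus.
tweets = [
    "screening saves lives",
    "lung cancer screening",
    "screening screening reminder",
]
vocabulary, matrix = document_term_matrix(tweets)
```

In this toy network, an account with a high in-degree is one that many others retweet, which is one simple way to look for the kind of prominent conduit the study reports not finding. The matrix rows line up with `tweets` and the columns with `vocabulary`, which is the input shape that topic-modeling tools generally expect.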

Additional information

Funding

This project was supported by a research grant from Nanyang Technological University [Grant No.: M020060110].
