Using geospatial social media data for infectious disease studies: a systematic review

Pages 130-157 | Received 04 Sep 2022, Accepted 17 Dec 2022, Published online: 03 Jan 2023

ABSTRACT

Geospatial social media (GSM) data has been increasingly used in public health due to its rich, timely, and accessible spatial information, particularly in infectious disease research. This review synthesized 86 research articles that use GSM data to study infectious diseases and were published between December 2013 and March 2022. These articles cover 12 infectious disease types ranging from respiratory infectious diseases to sexually transmitted diseases, with spatial levels ranging from the neighborhood and county to the state and country. We categorized these studies into three major infectious disease research domains: surveillance, explanation, and prediction. With the assistance of advanced computing, statistical, and spatial methods, GSM data has been widely applied to these domains, particularly surveillance and explanation. We further identified four knowledge gaps in terms of contextual information use, application scopes, spatiotemporal dimension, and data limitations, and proposed innovation opportunities for future research. Our findings will contribute to a better understanding of using GSM data in infectious disease studies and provide insights into strategies for using GSM data more effectively in future research.

1. Introduction

As a significant threat to public health, infectious diseases are distinguished from many other types of diseases by characteristics such as unpredictability, transmissibility, and preventability. Global pandemics and local epidemics in history could even influence the course of a war, determine the fate of nations, and shape the progress of civilization (Fauci and Morens Citation2012). Infectious disease research has undoubtedly been a priority for the scientific community.

Traditionally, survey questionnaires, census data, and medical records have been widely applied to diverse aspects of infectious disease research, such as monitoring the prevalence and incidence (Khanal, Adhikari, and Karkee Citation2013) and forecasting the epidemic trends (Inampudi et al. Citation2020). However, the utilization of these traditional data continues to face challenges, including the long update periods of the dataset, data scope restricted to a certain geographic area, and relatively small sample size (Kitchin Citation2013; Jing et al. Citation2021; Sha et al. Citation2021). As a result, new data sources and methodologies have been introduced to address these issues and inform policy decision-making.

Social media platforms such as Twitter (Li, Erfani, et al. Citation2021; Li, Huang, et al. Citation2021), Facebook (Ascani et al. Citation2021), and Instagram (Puspitasari, Ariful, and Nuqoba Citation2021) have been recognized as unique and powerful data sources for studying infectious diseases over the last decade due to their accessibility, timeliness, and richness. Social media are internet-based applications that enable communication and resource sharing, where users post and share their opinions, experiences, and emotions in the form of texts, images, and videos (Kaplan and Haenlein Citation2010). In some cases, social media data are referred to as crowdsourced data (Chunara, Smolinski, and Brownstein Citation2013). Among the various types of social media data, Geospatial Social Media (GSM) data (i.e. social media data with geolocation/spatial information) provide rich information about geolocation and mobility in addition to individual behaviors and user characteristics (Sun et al. Citation2019; Li, Wachowicz, and Fan Citation2021). Such timely spatial data enable researchers to rapidly obtain a wealth of useful information for a variety of infectious disease studies that inherently include a spatial component, including surveillance (Chang et al. Citation2021), prediction (Fakhry, Asfoura, and Kassam Citation2020), and response (Chang et al. Citation2021).

Despite several studies reviewing the application of social media data on public health (Tang et al. Citation2018; Edo-Osagie et al. Citation2020), no systematic review of the use of GSM data in infectious diseases exists to the best of our knowledge. To bridge the gap, this article aims to examine the use of GSM data on infectious diseases in terms of data type, research topics, research methods, and spatial levels. Specifically, this review expands the previous literature reviews in three ways. First, it includes empirical studies in common infectious diseases with a broader coverage from respiratory infectious diseases to sexually transmitted diseases. Second, it employs a spatial perspective (i.e. utilizing spatial data and spatial analytical techniques to disentangle the mechanisms that drive the spatial and temporal transmission of infectious diseases) in extracting and synthesizing information from the included papers, allowing for a comprehensive examination of how spatial information can contribute to a deeper understanding of outbreaks and transmission of infectious diseases and their related issues (including individual attitudes, government policies, and vaccine efficacy in response to public health emergencies caused by infectious diseases). Third, it identifies new opportunities in infectious disease research from a spatial perspective.

2. Methods

2.1. Eligibility criteria

This review follows the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (Moher et al. Citation2009). Only original peer-reviewed journal articles reporting empirical studies on GSM and infectious diseases were included. To be eligible for inclusion, a paper should (1) be an original study and peer-reviewed; (2) report empirical studies (i.e. not literature reviews or methodology papers); (3) be published in a journal or full-text conference proceeding (i.e. not book reviews, book chapters, or others); (4) be published in the English language prior to March 1, 2022; and (5) base its analysis on user-generated data from social media platforms rather than survey data collected via social media platforms.

The keywords used in the searches included terms for social media, health issues, and spatial dimensions. Our keywords for social media included mainstream social media platforms that are widely used in public health research, including Twitter, Facebook, Weibo, and Instagram, as well as the term ‘social media’ to cover other social media platforms. The keyword list for infectious diseases was built as follows. First, since there is no official list of infectious diseases, we referred to the list at the University of Utah (UniversityOfUTAH Citation2022), which includes 27 different types of infectious diseases. Second, we referred to other review articles on social media and infectious diseases (Tang et al. Citation2018). Third, based on the prior two steps, the authors discussed and identified the main infectious disease keywords and used ‘infectious disease’ to include other infectious diseases, totaling 37 keyword terms. To our knowledge, this is one of the most extensive keyword lists in the reviews of infectious diseases and social media to date. Lastly, five spatially related keywords were determined through discussion among the authors and referenced from relevant geospatial social media review literature (Barros, Gutiérrez, and García-Palomares Citation2022). See section 2.2 for a complete list of keywords.

2.2. Data sources and search strategies

Our review topics involve computer science, social sciences, and life sciences, which led to the selection of two databases. First, Web of Science is one of the most well-known databases for indexing academic literature in various fields worldwide, and it is widely used in the field of academic reviews (Li, Dong, and Liu Citation2020; Mollalo et al. Citation2021), so it was chosen as the first database. Second, PubMed is a well-known database of biomedical and life sciences literature, and it is frequently used in public health academic reviews (Edo-Osagie et al. Citation2020; Mollalo et al. Citation2021), hence it was chosen as the second database to avoid missing relevant articles. Third, we looked at other databases and found that the search results were generally consistent with the first two databases. As a result, the Web of Science and PubMed search results met our review database criteria.

We first conducted a literature search on Web of Science based on combinations of keywords, with a cutoff date of March 1, 2022. In Web of Science, we used the ‘topic’ search, which searches titles, abstracts, keywords, and keywords plus. Other search options were left at their defaults. We then searched PubMed using the ‘Title/Abstract’ option.

The search rule, along with the selected keywords, is shown below, taking the Web of Science rule as an example.

((TS = (Twitter)) OR (TS = (Tweets)) OR (TS = (Weibo)) OR (TS = (Facebook)) OR (TS = (Instagram)) OR (TS = (Social media))) AND ((TS = (Geotagged)) OR (TS = (Geocoded)) OR (TS = (Geolocated)) OR (TS = (Spatial)) OR (TS = (Spatiotemporal)) OR (TS = (Geographic)) OR (TS = (Geographical))) AND ((TS = (COVID)) OR (TS = (COVID-19)) OR (TS = (Coronavirus)) OR (TS = (Epidemic)) OR (TS = (Pandemic)) OR (TS = (Influenza)) OR (TS = (Virus)) OR (TS = (Infectious disease)) OR (TS = (Ebola)) OR (TS = (Measles)) OR (TS = (Zika)) OR (TS = (Cholera)) OR (TS = (SARS)) OR (TS = (Flu)) OR (TS = (H1N1)) OR (TS = (Dengue)) OR (TS = (Fever)) OR (TS = (Plague)) OR (TS = (MERS)) OR (TS = (Malaria)) OR (TS = (Polio)) OR (TS = (HIV)) OR (TS = (AIDS)) OR (TS = (Acquired immunodeficiency syndrome)) OR (TS = (Sexually transmitted infections)) OR (TS = (Sexually transmitted disease)) OR (TS = (STD)) OR (TS = (STI)) OR (TS = (Syphilis)) OR (TS = (Gonorrhea)) OR (TS = (Chlamydia)) OR (TS = (Trichomoniasis)) OR (TS = (Tuberculosis)) OR (TS = (Hepatitis B)) OR (TS = (HBV)) OR (TS = (HCV)) OR (TS = (Viral Hepatitis)))
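A rule of this shape can be assembled programmatically from the three keyword groups, which makes it easier to audit the term lists or adapt them to another database. The following is a minimal sketch, not the procedure the authors used; the function name `build_wos_topic_query` and the list names are hypothetical, and the terms are copied from the rule above.

```python
def build_wos_topic_query(*keyword_groups):
    """Join terms within a group with OR, then join the groups with AND."""
    clauses = []
    for group in keyword_groups:
        terms = " OR ".join(f"(TS = ({term}))" for term in group)
        clauses.append(f"({terms})")
    return " AND ".join(clauses)


# Keyword groups copied from the Web of Science rule shown above.
SOCIAL_MEDIA = ["Twitter", "Tweets", "Weibo", "Facebook", "Instagram", "Social media"]
SPATIAL = ["Geotagged", "Geocoded", "Geolocated", "Spatial", "Spatiotemporal",
           "Geographic", "Geographical"]
DISEASE = [
    "COVID", "COVID-19", "Coronavirus", "Epidemic", "Pandemic", "Influenza",
    "Virus", "Infectious disease", "Ebola", "Measles", "Zika", "Cholera",
    "SARS", "Flu", "H1N1", "Dengue", "Fever", "Plague", "MERS", "Malaria",
    "Polio", "HIV", "AIDS", "Acquired immunodeficiency syndrome",
    "Sexually transmitted infections", "Sexually transmitted disease",
    "STD", "STI", "Syphilis", "Gonorrhea", "Chlamydia", "Trichomoniasis",
    "Tuberculosis", "Hepatitis B", "HBV", "HCV", "Viral Hepatitis",
]

query = build_wos_topic_query(SOCIAL_MEDIA, SPATIAL, DISEASE)
print(len(DISEASE), "disease terms;", query.count("TS = ("), "terms in total")
```

Keeping each group as a plain list means the same terms could be reused to build a PubMed ‘Title/Abstract’ query with a different formatting function.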

The initial search yielded 685 articles from Web of Science and 322 articles from PubMed, with 712 non-duplicate articles in total. We then screened the abstracts of these articles using the criteria listed above. For those articles that could not be judged as eligible based on the abstract information, the full texts were examined to determine eligibility. These articles were discussed by the authors to determine whether they should be included in the review. The exclusion criteria were as follows: (1) exclusion of non-research papers, such as literature reviews; (2) exclusion of non-infectious disease research papers; (3) exclusion of methodology papers for data mining; (4) exclusion of research papers that mentioned ‘spatial’ in the article but did not use geospatial social media data; (5) exclusion of research papers that mentioned social media platforms but did not actually use social media data. Following the screening, 86 articles were finally included in this review (Figure 1).
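The counts above imply that 685 + 322 − 712 = 295 records were duplicates across the two databases. The review does not describe how duplicates were detected; the sketch below shows one common approach under stated assumptions: match on a lowercased DOI when present, otherwise on a normalized title. The record fields (`doi`, `title`), the helper names, and the toy records are all hypothetical.

```python
import re


def record_key(record):
    """Prefer a lowercased DOI; otherwise fall back to a normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return ("title", title)


def merge_unique(*result_sets):
    """Keep the first record seen for each key, in input order."""
    seen = {}
    for records in result_sets:
        for record in records:
            seen.setdefault(record_key(record), record)
    return list(seen.values())


# Toy records for illustration only (not from the review).
wos = [
    {"doi": "10.1000/A1", "title": "Geotagged tweets and influenza"},
    {"doi": "", "title": "Weibo posts and COVID-19: a spatial view"},
]
pubmed = [
    {"doi": "10.1000/a1", "title": "Geotagged Tweets and Influenza."},
    {"doi": None, "title": "Weibo Posts and COVID-19 - A Spatial View"},
    {"doi": "10.1000/b2", "title": "Facebook mobility and dengue"},
]

unique = merge_unique(wos, pubmed)
duplicates_removed = 685 + 322 - 712  # counts reported in the text
print(len(unique), duplicates_removed)
```

A record that carries a DOI in one database but not in the other would not be matched by this sketch, so a manual check of near-duplicates would still be needed.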

Figure 1. Flow diagram for article identification and selection.

3. Results

3.1. Overview

3.1.1. Publication characteristics

The number of empirical studies using GSM data in infectious diseases has increased rapidly in recent years. As illustrated in Figure 2, the earliest article was published in December 2013, and the latest in February 2022 (as of March 1, 2022). Before 2020, fewer than 10 relevant papers were published each year; however, the number increased sharply after the COVID-19 outbreak, and most of them are COVID-19-related studies (n = 50). These papers are widely dispersed across various subjects, with the majority in journals related to multidisciplinary sciences, medicine, environment and geography, and computer science. Thus, GSM data is expected to remain a focus in geography, public health, and computer science in the coming years. In addition, most studies were undertaken in the United States, Europe, and East Asia, with relatively few studies conducted in other regions of the world.

Figure 2. Number of publications on the use of GSM data on infectious diseases (As of March 1, 2022).

3.1.2. Research characteristics

Figure 3 summarizes the characteristics of the included papers in terms of information type, infectious disease type, and application domains. The information extracted from GSM data in these papers can be classified into three categories: contextual information, movement information, and spatial-social network information. Specifically, contextual information includes content-based information such as texts, images, and videos posted on social media platforms; movement information refers to information about human mobility that is reflected in GSM data; and network information includes spatial connectivity or social connectivity extracted from GSM data (e.g. social connectedness between people from Facebook (Holtz et al. Citation2020) and place connectivity between places from Twitter (Li, Huang, Ye, et al. Citation2021)).

Figure 3. Summary of the characteristics of the included papers.

The three types of information were used in these studies for 12 different infectious diseases. As illustrated in Figure 4, the majority of studies focused on COVID-19 (n = 50), followed by influenza (n = 12), HIV (n = 9), other STIs (e.g. syphilis and gonorrhea), and other emerging infectious diseases (e.g. Zika, MERS, and Ebola). The distribution of papers on infectious diseases is generally determined by the scale and severity of the epidemic. The number of publications may also be affected by the epidemic cycle. For example, although the mortality of Ebola is high, the number of publications on Ebola is not very high since this epidemic was short-lived. More scientific evidence is required to explain this correlation.

Figure 4. Number of publications by infectious diseases.

From the perspective of public health implications, these studies are grouped into three independent domains (Figure 3): surveillance (n = 36), explanation (n = 35), and prediction (n = 21). Because some studies contribute to more than one domain, these counts sum to more than the 86 included articles. Studies belonging to each of these domains were subsequently reviewed in depth in terms of themes, data, methods, and spatial levels.

Several pieces of scientific evidence support our classification. Surveillance is the monitoring of multiple statuses and is a popular domain in public health, with many review studies focusing on the surveillance of infectious diseases (Lee et al. Citation2016) and public health (Sinnenberg et al. Citation2017; Jordan et al. Citation2018; Edo-Osagie et al. Citation2020). In the current review, surveillance includes publications that use GSM data to monitor the spatiotemporal distribution patterns of infectious diseases, evaluate the effectiveness of various government practices to combat infectious diseases, and assess public attitudes and reactions to infectious diseases.

In terms of explanation, several review papers on geography (e.g. Li, Dong, and Liu Citation2020), social determinants of infectious diseases (Bishwajit, Ide, and Ghosh Citation2014; Duarte et al. Citation2021), and health (Palmer et al. Citation2019) have involved mechanism/correlation explanations. Explanation studies in the current review are those that investigate the factors that correlate with the prevalence of and deaths from infectious diseases, and the factors that correlate with the response to infectious diseases.

The prediction domain has also been highlighted in several health-related studies (Lee et al. Citation2016; Edo-Osagie et al. Citation2020). For instance, Edo-Osagie et al. (Citation2020) defined a domain of forecasting, which includes articles on predicting trends in health-related events. Predictions of infectious disease outbreaks/deaths, as well as predictions on topics related to infectious disease response, are included in the prediction domain of this review.

3.2. Surveillance

Infectious disease surveillance is essential for identifying public health threats and developing prevention and control strategies. Key aspects of surveillance are the assessment of spatiotemporal patterns of specific disease transmission and the assessment of responses to public health emergencies caused by infectious diseases, such as individual attitudes, government policies, and vaccine efficacy. In this section, we summarize the efforts in assessing spatiotemporal patterns of and responses to infectious diseases using GSM data (Table 1).

Table 1. Related articles involving the surveillance.

The primary information category used for these studies is contextual information. The majority of studies focused on COVID-19, including geo-distribution/pattern of COVID-19 transmission (Peng et al. Citation2020; Pobiruchin, Zowalla, and Wiesner Citation2020; Cuomo et al. Citation2021; Lopreite et al. Citation2021), attitudes and perceptions of the public during the COVID-19 outbreak (Han et al. Citation2020; Hung et al. Citation2020; Yigitcanlar et al. Citation2020), assessment on the vaccine (Hu et al. Citation2021) and lockdown measures (Li, Erfani, et al. Citation2021; Surano, Porfiri, and Rizzo Citation2022), the impact of misinformation on COVID-19 (Chen et al. Citation2021), government online practices (Slavik et al. Citation2021), the impact of social distancing (Kwon et al. Citation2020; Porcher and Renault Citation2021; Shen et al. Citation2021), and the impact of COVID-19 restrictions on the utilization of different locations, services, and amenities (Legeby et al. Citation2022). For example, Cuomo et al. (Citation2021) collected tweets related to COVID-19 and then assessed the longitudinal and geospatial variations among different communities. Hung et al. (Citation2020) investigated the sentiments toward COVID-19 ranging from positive to negative using COVID-19-related tweets. Chen et al. (Citation2021) evaluated the prevalence of rumors during the COVID-19 outbreak in China by using COVID-19 rumor data extracted from Sina Weibo.

Ebola transmission and people's reactions to Ebola in China (Liu et al. Citation2016) and across the world (Tran and Lee Citation2016) were also studied using contextual information. Tran and Lee (Citation2016) analyzed spatiotemporal and social characteristics of Ebola-related tweets (e.g. the distance between Ebola-related tweets; the similarities between the two events; the role of social ties in spreading Ebola-related tweets). Liu et al. (Citation2016) collected two keyword indices from Chinese online platforms, the Ebola-related Baidu Index (BDI; Baidu is a search company similar to Google) and the Sina Micro Index (SMI; Sina Micro is a platform similar to Twitter), and evaluated the public perception of Ebola in China. Their results showed that Ebola-related BDI and SMI trends were highly correlated, but these two indices were not significantly correlated with Ebola disease-related indicators (e.g. case rates and death rates). Other studies assessed the transmission of dengue fever (Nsoesie et al. Citation2016; Ye et al. Citation2016), HIV (Cai et al. Citation2020; van Heerden and Young Citation2020), and Zika virus (Pruss et al. Citation2019). Among them, Nsoesie et al. (Citation2016) used machine learning methods to investigate spatiotemporal trends of dengue fever events on Twitter. Cai et al. (Citation2020) collected geolocated tweets related to the HIV outbreak in Indiana to analyze geospatial characteristics of the epidemic. Van Heerden and Young (Citation2020) used a combination of data from Twitter, Instagram, and YouTube to map and examine variations in posts related to HIV across regions of South Africa. Pruss et al. (Citation2019) examined temporal and spatial variation in the tweets related to Zika virus.

Human mobility data derived from social media platforms also contribute to assessing the spread of infectious diseases, including the transmission of COVID-19 (Kraemer et al. Citation2018; Chang et al. Citation2021; Cowley et al. Citation2021; Shepherd et al. Citation2021) and the effects of governmental lockdown in response to the pandemic (Huang et al. Citation2020; Gibbs et al. Citation2021; Gulnerman Citation2021; Iranmanesh and Alpar Atun Citation2021). For example, leveraging Facebook data, genomics, and mobile phone data, Cowley et al. (Citation2021) analyzed the pandemic trajectory and the emergence of variants. Using geolocated tweets from Lahore, Pakistan, Kraemer et al. (Citation2018) explored spatiotemporal variation in dengue transmission and found that the variation was related to patterns of intra-urban human mobility.

Only one study in this domain used social network data (Holtz et al. Citation2020). Holtz et al. (Citation2020) analyzed the effects of policies in one area on human movement and social distancing in other areas, and the effects of adopting disjointed local policies in the context of such spillover effects, utilizing mobile phone data from tens of millions of devices and social network connections from millions of Facebook users. The result revealed that policies in one area could affect residents’ contact patterns in other areas.

Using multiple methods, researchers extracted various types of information from social media data to assess the patterns of the epidemics and their related issues over time, at various spatial levels, such as neighborhood (Legeby et al. Citation2022), county (Guntuku et al. Citation2021), state (Hu et al. Citation2021) and national levels (Tran and Lee Citation2016). These methods include emotion analysis (Hu et al. Citation2021), content analysis (Nsoesie et al. Citation2016), topic modeling (Tran and Lee Citation2016), mobility and network analysis such as social network analysis (Hung et al. Citation2020), and spatial analysis such as geospatial clustering analysis (Cuomo et al. Citation2021) and kernel density analysis (Peng et al. Citation2020). These studies provide essential scientific support for the government's prompt response to infectious disease outbreaks.
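To make one of the spatial methods named above concrete, the following is a minimal sketch of kernel density estimation over geotagged post locations. It is an illustration under stated assumptions, not the implementation used in any of the cited studies: coordinates are assumed to be already projected to planar units (e.g. kilometers), a Gaussian kernel with a fixed bandwidth is assumed, and the function name and sample points are hypothetical.

```python
import math


def gaussian_kde_2d(points, query, bandwidth):
    """Gaussian kernel density estimate at `query` from 2-D `points`.

    All coordinates and the bandwidth share the same planar units.
    """
    qx, qy = query
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for x, y in points:
        squared_distance = (x - qx) ** 2 + (y - qy) ** 2
        total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
    return norm * total


# Hypothetical projected locations of disease-related posts (km).
posts = [(0.0, 0.0), (0.5, 0.2), (-0.3, 0.4), (10.0, 10.0)]

dense = gaussian_kde_2d(posts, (0.0, 0.0), bandwidth=1.0)
sparse = gaussian_kde_2d(posts, (10.0, 10.0), bandwidth=1.0)
print(dense > sparse)  # the cluster of three posts yields the higher density
```

Evaluating the estimate on a regular grid would give the kind of density surface used to highlight hotspots; the bandwidth controls how smooth that surface is.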

3.3. Explanation

To predict and prevent the spread of infectious diseases, it is necessary to understand the spatiotemporal mechanisms of disease transmission and to examine the correlations between disease-related social media posts and infection cases. In this section, we reviewed studies that explored the correlations between infectious disease-related social media posts and infectious disease outbreaks/deaths, and the associations between socioeconomic and environmental factors and diseases (Table 2).

Table 2. Related articles involving the explanation.

Many studies used contextual information to investigate the relationships between disease-related posts on social media and disease cases/deaths. The words and behaviors of users on social media platforms may be related to individual or regional physical/mental illness and attitude. For COVID-19, the number of social media posts related to COVID-19 cases is associated with actual COVID-19 cases (Shen et al. Citation2020; Cuomo et al. Citation2021); tweets related to COVID-19 deaths are associated with actual COVID-19 deaths (Dahal et al. Citation2021; Turiel, Fernandez-Reyes, and Aste Citation2021); and COVID-19 vaccine tweets are linked to the actual COVID-19 vaccination rate (Chan, Jamieson, and Albarracin Citation2020). Research on other infectious diseases focused on the associations between cases/new diagnoses and social media posts related to HIV (Young, Rivers, and Lewis Citation2014; Ireland et al. Citation2016; Nielsen et al. Citation2017; Li, Qiao, et al. Citation2021), dengue fever (Puspitasari, Ariful, and Nuqoba Citation2021), syphilis (Young et al. Citation2018), and influenza (Broniatowski, Paul, and Dredze Citation2013; Allen et al. Citation2016; Huang et al. Citation2019). These studies identified a strong correlation between disease-related social media posts and actual outbreaks of these diseases by analyzing data at different spatial scales (i.e. county and state). The social media data used were dominated by Twitter and Chinese Sina Weibo.
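The basic comparison behind these findings can be sketched as a Pearson correlation between the number of disease-related posts and the number of reported cases across spatial units. The example below is illustrative only: the county-level values are made up, the function name is hypothetical, and the cited studies used a range of more elaborate models.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for five counties (illustration only).
posts_per_county = [120, 45, 300, 80, 15]
cases_per_county = [60, 20, 170, 35, 10]

r = pearson_r(posts_per_county, cases_per_county)
print(round(r, 3))
```

In practice, both series would usually be normalized by population before comparison, since raw post and case counts both grow with county size.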

Using contextual information, researchers also investigated the socioeconomic and demographic determinants of COVID-19-related issues, such as public concerns about the pandemic (Scotti et al. Citation2020; Su et al. Citation2021), pandemic severity (Li et al. Citation2020), lockdown measures (Surano, Porfiri, and Rizzo Citation2022), and social distancing measures (Porcher and Renault Citation2021). Su et al. (Citation2021) classified tweets by socioeconomic status at the county level and found that COVID-19 concerns varied by socioeconomic status in both hotspot and non-hotspot areas in the United States. Similarly, Scotti et al. (Citation2020) investigated whether socioeconomic and epidemiological variables influenced feelings on Twitter during the pandemic and revealed that areas with a higher mortality rate and a higher poverty level are associated with more negative emotions. In China, researchers investigated the impact of the urban environment on pandemic severity in Wuhan, using Weibo-based COVID-19 case data as the outcome variable (Li et al. Citation2020).

Aside from COVID-19, socioeconomic and demographic factors correlated with the prevalence of other infectious diseases were investigated using contextual information. For example, scholars explored spatiotemporal trends in dengue events reported on Twitter compared to confirmed cases and analyzed the associations with sociodemographic factors at the municipality level (Nsoesie et al. Citation2016). Using influenza-related tweets, researchers found more tweets about the influenza vaccine among women than men (Huang et al. Citation2019). Using Zika-related tweets in the travel context, researchers compared the differences in the network, demographic, and language characteristics between users who modified their behavior and the control group (Daughton and Paul Citation2019). HIV has been a prominent topic in research on sexually transmitted diseases. Twitter activity among young men was found to be associated with overall HIV prevalence (Stevens et al. Citation2020). In another study (Chan et al. Citation2021), the number of men who have sex with men (MSM) appeared to be linked to the HIV-related tweets in a region, the in-person communications about pre-exposure prophylaxis (PrEP), and, ultimately, actual PrEP use.

Indicators derived from the movement information were also used as social and environmental factors to assess their impacts on infectious diseases. For example, a range of mobility pattern indicators derived from Twitter, Facebook, and Google Trends have been developed, which serve as proxies for the human movement that drives the transmission of infectious diseases. Using Twitter-based population mobility data, a significant link between population movement and COVID-19 outbreaks was demonstrated (Zeng et al. Citation2021). Using Facebook data as a proxy of individual mobility, Ascani et al. (Citation2021) found a positive relationship between mobility within labor market areas and COVID-19 excess mortality in Italy. Using daily Community Mobility Reports data from Google as a proxy for social distancing and geotagged messages from Twitter to capture beliefs at the state level, Porcher and Renault (Citation2021) observed that an increase in the Twitter index of social distancing on one day was associated with a decrease in mobility on the following day.
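A one-day lagged association of the kind described above can be illustrated by correlating an index on day t with mobility on day t + 1. The sketch below is a simplified illustration with made-up daily values and hypothetical function names; it is not the regression specification used in the cited study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_pearson(leading, following, lag):
    """Correlate leading[t] with following[t + lag] for a non-negative lag."""
    usable = len(leading) - lag
    return pearson_r(leading[:usable], following[lag:])


# Made-up daily series for one state (illustration only).
distancing_index = [1, 3, 2, 5, 4, 6, 5]
mobility = [9.0, 9.1, 6.9, 8.2, 4.8, 6.1, 3.9]

same_day = lagged_pearson(distancing_index, mobility, lag=0)
next_day = lagged_pearson(distancing_index, mobility, lag=1)
print(next_day < 0)  # higher index today, lower mobility tomorrow
```

Comparing the coefficient across several lags is a quick way to see at which delay the two series are most strongly related, although it does not by itself establish a causal direction.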

Network information derived from social media data was also used to explain infectious diseases and related topics. A social connectivity index derived from Facebook (Fritz and Kauermann Citation2022) and place connectivity indices derived from Twitter (Li, Huang, Ye, et al. Citation2021) and Facebook (Fan et al. Citation2021) have been used as environmental factors that influence disease transmission. For example, using data from Facebook activities, researchers explored the impact of human movement and social connectivity on new COVID-19 infections in Germany (Fritz and Kauermann Citation2022). Also using Facebook data, Fan et al. (Citation2021) examined the impact of population co-location reduction on COVID-19 cross-county transmission risk using cross-county human co-location data. Using the place connectivity index derived from billions of geotagged tweets, a significant impact of place connectivity on COVID-19 case infections was revealed (Li, Huang, Ye, et al. Citation2021).

Studies involving explanations have employed spatial and non-spatial statistical methods, such as geographically weighted regression (Forati and Ghose Citation2021), negative binomial models (Stevens et al. Citation2020), OLS regression (Turiel, Fernandez-Reyes, and Aste Citation2021), and correlation analysis (Dahal et al. Citation2021), as well as machine learning methods such as random forest (RF) (Li, Yang, et al. Citation2021). The majority of studies were conducted at the county level, with a few at the state and other geographic levels. These studies have investigated some key research questions, such as the relationship between users’ social media posts and actual relevant cases at different spatial scales and the environmental and socioeconomic determinants of infectious diseases. These studies have deepened our understanding of infectious disease transmission processes and their risk factors, as well as the connection between social media use and public health problems.

3.4. Prediction

Infectious disease prediction is a crucial part of infectious disease prevention. Depending on the prediction results, countermeasures can be taken in advance for prevention. Scholars have used GSM data to forecast infectious disease outbreaks and the related issues caused by epidemics/pandemics (such as the pressure on the medical system and the consequences of travel restrictions), with various spatiotemporal resolutions (Table 3).

Table 3. Related articles involving the prediction.

Using contextual information, predictions of COVID-19 outbreaks (Fakhry, Asfoura, and Kassam Citation2020) and related health system overloads (Rivieccio et al. Citation2021) are active research topics. Using state-level Twitter-based COVID-19 information, Fakhry, Asfoura, and Kassam (Citation2020) predicted current and future COVID-19 cases with a dual machine learning approach. To identify predictors of possible new health system overloads, Rivieccio et al. (Citation2021) analyzed data from Twitter and emergency services. GSM data were also widely utilized to predict influenza-like illness (ILI) (Nagar et al. Citation2014; Wang et al. Citation2016; Elkin, Topal, and Bebek Citation2017; Kandula, Hsu, and Shaman Citation2017; Lu et al. Citation2018; Wakamiya, Kawai, and Aramaki Citation2018; Chen et al. Citation2019; Wang et al. Citation2020). For instance, using Twitter data and the partial differential equation (PDE) model, Wang et al. (Citation2020) predicted the regional influenza outbreak. Using Google search trends, Kandula, Hsu, and Shaman (Citation2017) produced subregional nowcasts of seasonal influenza. Using H7N9-related Baidu Search Index (BSI) and Weibo Posting Index (WPI) data, Chen et al. (Citation2019) predicted H7N9 cases in China based on seasonal autoregressive integrated moving average models. Scholars have also used sexually transmitted disease (STD)-related tweets to predict diagnoses of STDs such as HIV, gonorrhea, and chlamydia (Chan et al. Citation2018; Adnan et al. Citation2020). Adnan et al. (Citation2020) evaluated the temporal predictive strength for Campylobacter from five data sources including consumer helpline, general practice consultations, Google Trends, tweets, and school absenteeism. Their results showed that models using tweets and Google Trends can provide better prediction performance in early outbreaks compared to conventional data sources.

Using movement information, Twitter-based population mobility data were used to predict dengue (Ramadona et al. Citation2019) and COVID-19 cases (Bisanzio et al. Citation2020; Cowley et al. Citation2021; Zeng et al. Citation2021; Lucas, Vahedi, and Karimzadeh Citation2022). For example, Ramadona et al. (Citation2019) combined human mobility derived from neighborhood-level geotagged tweets and incidence data to predict intra-urban dengue transmission in Indonesia. Zeng et al. (Citation2021) forecasted daily new cases of COVID-19 at state and county levels in South Carolina, US, using a Twitter-derived mobility index. Similarly, Bisanzio et al. (Citation2020) predicted the global spread of COVID-19 during the early outbreak period using Twitter activity as a proxy indicator of human mobility. Regarding network information, outbreaks of dengue activity were predicted by integrating satellite imagery, weather data, clinical data, Twitter-based connectivity, and census data (Castro et al. Citation2021).

Several studies have integrated human movement and network information (Chang et al. Citation2021; Vahedi, Karimzadeh, and Zoraghein Citation2021; Lucas, Vahedi, and Karimzadeh Citation2022). Utilizing connectivity and human movement information derived from Facebook and cellphone data, as well as infection rates and socioeconomic compositions of counties as predictive features, Vahedi, Karimzadeh, and Zoraghein (Citation2021) predicted the spatial and temporal patterns of COVID-19 cases in the contiguous US. Lucas, Vahedi, and Karimzadeh (Citation2022) used weekly new cases as temporal features and Facebook movement and connectedness as spatial features to predict the county-level spread of COVID-19 cases in the US. Using Facebook movement data and colocation data, Chang et al. (Citation2021) assessed the potential impact of COVID-19 intracity/intercity travel restrictions at multiple geographic levels in Taiwan and found intercity travel restrictions could reduce the outbreak's scope.

GSM data could aid in achieving acceptable infectious disease predictions and enhance our knowledge of the direction and risk of disease transmission. In several studies, models using these alternative data sources outperformed those using traditional data. For example, Ramadona et al. (Citation2019) found that a mobility-weighted incidence index derived from Twitter outperformed conventional mobility and neighborhood centrality in predicting dengue risk; Vahedi, Karimzadeh, and Zoraghein (Citation2021) showed that the model using a social connectedness index outperformed the base model without that social media-based index, with a 6.46% improvement in the mean absolute error (MAE) over the two-week prediction. Other studies also reported improved accuracy, including COVID-19 prediction (Lucas, Vahedi, and Karimzadeh Citation2022), influenza prediction (Elkin, Topal, and Bebek Citation2017; Lu et al. Citation2018), and Campylobacter prediction (Adnan et al. Citation2020).
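A percentage improvement in MAE of this kind is often read as the relative reduction in error against a baseline model. The sketch below illustrates that reading only; it is an assumption about how such a figure may be computed, not the procedure reported by the cited study, and the function name and the input values are hypothetical.

```python
def relative_mae_improvement(baseline_mae, model_mae):
    """Percentage reduction in MAE relative to a baseline model.

    Assumed definition (not taken from the reviewed studies):
        100 * (baseline_mae - model_mae) / baseline_mae
    A positive value means the new model has a lower error.
    """
    if baseline_mae <= 0:
        raise ValueError("baseline_mae must be positive")
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Hypothetical MAE values chosen only to illustrate the arithmetic.
improvement = relative_mae_improvement(baseline_mae=100.0, model_mae=93.54)
print(f"Relative MAE improvement: {improvement:.2f}%")
```

Because the measure is relative, the same percentage can correspond to very different absolute error reductions, so it is best reported alongside the underlying MAE values.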

Validation is an essential component of prediction. Many studies validated their model performance by comparing predicted outcomes with actual disease cases (Wang et al. Citation2016; Kandula, Hsu, and Shaman Citation2017; Chan et al. Citation2018; Lu et al. Citation2018; Wakamiya, Kawai, and Aramaki Citation2018; Adnan et al. Citation2020; Vahedi, Karimzadeh, and Zoraghein Citation2021), using a range of validation metrics such as Pearson correlation coefficient (COR), root mean square error (RMSE), mean absolute error (MAE), and mean absolute proportion error (MAPE). After evaluating model performance using out-of-sample data from four influenza seasons between 2012 and 2016, Lu et al. (Citation2018) validated their model performance using real data from 2016 to 2017. The COR between the out-of-sample influenza activity estimates and officially reported influenza activity was 0.98. Kandula, Hsu, and Shaman (Citation2017) validated their model estimates using state-level ILI counts from the 2005 to 2010 seasons provided by the CDC, achieving good predictive performance with a COR of 0.84, RMSE of 1.01, and MAPE of 0.83.
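The validation metrics named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any reviewed study: the function name `validation_metrics` and the sample values are hypothetical, and MAPE is computed here as the mean of |predicted − observed| / |observed| expressed as a proportion, which is only one possible convention.

```python
import math


def validation_metrics(observed, predicted):
    """Compare predicted values with observed case counts.

    Returns a dict with COR (Pearson correlation), RMSE, MAE, and MAPE.
    MAPE is returned as a proportion and skips time points where the
    observed value is zero (an assumption made for this sketch).
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    ratios = [abs(e) / abs(o) for e, o in zip(errors, observed) if o != 0]
    mape = sum(ratios) / len(ratios) if ratios else float("nan")

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # Pearson correlation is undefined when either series is constant.
    if var_o > 0 and var_p > 0:
        cor = cov / math.sqrt(var_o * var_p)
    else:
        cor = float("nan")

    return {"COR": cor, "RMSE": rmse, "MAE": mae, "MAPE": mape}


# Hypothetical weekly case counts and model predictions.
metrics = validation_metrics([10, 20, 30, 40], [12, 18, 33, 39])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

COR captures whether predictions track the shape of the epidemic curve, while RMSE, MAE, and MAPE capture the size of the errors, which is why the two kinds of metric tend to be reported together.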

Lastly, these studies have used different regression (Vahedi, Karimzadeh, and Zoraghein Citation2021), machine learning (Fakhry, Asfoura, and Kassam Citation2020), and deep learning methods (Lucas, Vahedi, and Karimzadeh Citation2022) for prediction, with the majority at the county, state, and regional levels.

4. Discussion

Massive social media data provide unprecedented opportunities for health research to gain insights that were previously unavailable or hard to obtain from traditional data sources. This review systematically examines the use of geospatial social media (GSM) data in infectious disease research. We found that the number of relevant publications has increased over time, covering a wide range of topics involving 12 infectious diseases. We classified these studies into three domains (surveillance, explanation, and prediction of infectious diseases) and thoroughly reviewed all articles in each domain. We observed that GSM data has been widely and deeply applied to these domains, particularly in surveillance and explanation, using various statistical and spatial methods at diverse geographic levels. Following these results, this section discusses the knowledge gaps in this area in terms of data extraction, research topic, spatial analysis scope, and data quality, and proposes innovation opportunities for future research.

4.1. Data extraction: underuse of social media contextual information

While many studies used social media data with content and geographic location, none of the articles in our review addressed the demographic information of social media users. Users’ demographic information is essential for determining the impact of individual-level factors on infectious disease cases and deaths, but it is not directly available from the data. One approach is to use machine learning techniques to extract such information. Studies have demonstrated that demographic information such as age, sex, income, education level, marital status, and religious status can be inferred from users’ posts using machine learning algorithms (Preoţiuc-Pietro et al. Citation2015; Poulston, Stevenson, and Bontcheva Citation2016). With demographic information, a range of new research can be conducted, such as analyzing racial and socioeconomic disparities in disease cases/deaths and conducting more precise disease prediction.

Images and videos are another underutilized source of contextual information. Some scholars collected images from social media such as Twitter (Chen and Dredze Citation2018) and Instagram (Seltzer et al. Citation2017) to investigate public health issues; however, our review found no papers that used images from mainstream social media platforms to study infectious diseases. Aside from texts, images and videos from social media platforms can be utilized to extract socioeconomic and environmental determinants of epidemics. With image analysis, for example, elements of the built environment can be measured via deep learning algorithms such as Fully Convolutional Networks (FCN) and Convolutional Neural Networks (CNN) (Gebru et al. Citation2017; Helbich et al. Citation2019), and perceptions of the built environment can be evaluated.

Our review also found that indicators derived from contextual information may not be accurate, which is another sign of its underutilization. First, in some studies, the keywords of the HIV risk behavior index do not include slang terms for sex-related and drug-related behaviors (Li, Qiao, et al. Citation2021). Second, in most cases, emoji and function words in tweets are unprocessed (Chen et al. Citation2021). Due to the word limit of posts on Twitter and Sina Weibo, abbreviated words and acronyms appear frequently (Huang et al. Citation2022), and the majority of these words are likewise unprocessed. Third, although English is commonly used on Twitter, not all conversations about infectious diseases are in English. A 2018 report showed that the top three languages in monthly tweets were English (31.8%), Japanese (18.8%), and Spanish (8.46%) (Vicinitas Citation2022). Most studies excluded non-English tweets, restricting their ability to capture the online discourse of people who speak other languages (Stevens et al. Citation2020). In the few studies dealing with multilingualism, the corresponding tweets were analyzed using naïve translation methods (Zhang et al. Citation2021). Sentiment and emotion differ across languages. Direct translation of tweets may overlook such differences, and current sentiment and emotion analysis techniques do not adequately address this issue (Huang et al. Citation2022). In light of this, more work could focus on improving text classification models (e.g. using more advanced classifiers), exploring how to extract slang, emoji, abbreviations, and function words, and analyzing multiple languages.

In addition, the sample size and methodological restrictions in sentiment analysis may hinder the extraction of useful contextual information. Many articles have conducted sentiment analysis, but they are incapable of fully capturing nuances between similar tweets (Surano, Porfiri, and Rizzo Citation2022). Furthermore, many common action words (e.g. ‘work’ or ‘exercise’) tend to evoke mixed feelings (Ireland et al. Citation2016), but existing emotion analysis cannot extract these feelings accurately. More precise sentiment analysis can help assess a wider spectrum of sentiments and provide a clearer picture of how the general public perceives infectious diseases. Future research should expand the sample size to reduce sentiment uncertainty and explore more diverse sentiment dimensions (Hu et al. Citation2021) and more precise topic modeling methods using advanced natural language processing technologies such as deep learning or artificial intelligence.

4.2. Research topic: inadequate breadth and depth

From COVID-19 for respiratory diseases to gonorrhea for sexually transmitted diseases, infectious diseases encompass a wide range of illnesses. While the reviewed publications cover 12 infectious diseases, some infectious diseases with significant public health implications were overlooked. For example, with the prevalence of COVID-19, many disciplines have been dedicated to studying COVID-19 (n = 50), whereas other infectious diseases with high mortality rates, such as Tuberculosis and Malaria, or those that affect small populations or areas, such as Hepatitis B virus (HBV), were less frequently linked to social media data.

Below, we discuss the current application limitations and future directions of each of the three domains. For the surveillance research, a major limitation is the use of a single social media data source such as Facebook or Twitter, which represents only a portion of the population and may not reflect the spatiotemporal patterns of the general population. People’s reactions and attitudes to the same epidemic may differ depending on their social backgrounds (Fan et al. Citation2021). To reduce bias and improve the accuracy of the assessment, more diverse datasets such as Electronic Health Record (EHR) data, mobile phone data, or survey data need to be used.

For the explanation research, some critical topics were not discussed in the selected papers, such as the relationship between Internet language use, individual behavior, and infectious diseases (e.g. the relationship between language use about condoms, protective sex behavior, and HIV-positive rates). The relationship between connectivity and COVID-19 cases was presented in some articles (Li, Huang, Ye, et al. Citation2021), but the differences in connectivity within or between states in relation to COVID-19 cases, and how these relationships vary by socioeconomic characteristics (e.g. concentrated disadvantage and urbanicity), remain to be examined. Future studies could further explore the potential mechanisms behind these topics, for example, by examining various mediation and moderation effects of social media-based environmental factors on infectious disease outbreaks. Moreover, there are several other issues worth investigating. People often choose Twitter to get in touch with their friends and acquaintances, so cross-regional impact may also be important and worth considering in the future (Chan et al. Citation2021). Infectious diseases can be linked to other health issues, such as physical and mental health, and further research is required to better understand the association between infectious diseases (and the associated emotions) and mental health. Specifically, potential research topics include the influence of COVID-19 on tweet-based measures of mental health, the relationship between epidemic risk perception and people's mental health, the relationship between risk perception, social support (or other community factors such as social capital and social disorder) and mental health, and the relationships between epidemics, epidemic risk perception, and depression. Lastly, we note that most studies in the explanation domain attempted to identify correlations. Without individual and longitudinal data, potential causal relationships cannot be revealed.

Compared with the other two domains, prediction has received the least attention in studies using social media data, with only 21 publications. One limitation of prediction is that model performance on past data cannot guarantee future performance. Future research could use updated data to train the model and incorporate other key factors such as weather patterns and traffic systems.

4.3. Spatiotemporal analysis: insufficient attention to the spatiotemporal dimension

Most of the reviewed studies were conducted at the county (n = 27) or state/provincial level (n = 28), rather than at the neighborhood/community (n = 7) or country level (n = 3). The lack of neighborhood-level studies is likely due to the scarcity of data at such levels. A recent study (Li, Huang, Ye, et al. Citation2021) reveals that a majority of the worldwide geotagged tweets (79%) are geotagged at the city level, and tweets geotagged at the neighborhood level or lower only account for 7.9%. Since neighborhood-level analysis could help reduce bias caused by the aggregation of spatial scales, future work could explore ways to increase sample size using machine/deep learning and natural language processing methods to infer geolocations from the texts by geoparsing place names (Gelernter and Balaji Citation2013; Wang and Hu Citation2019). In addition, future research could also leverage the rich country-level data to explore innovative and significant research topics, such as the association between global climate change, political environment, and infectious diseases, since infectious diseases often spread across countries. Regarding the study regions, most of the studies are in North America, Europe, and East Asia. Future research should pay more attention to Africa and South America because these regions are home to low-income countries, which are more vulnerable to infectious disease outbreaks.

The use of social media platforms is also not uniform across time, and related indicators such as sentiment scores extracted from social media data are thus not uniform across time either (Padilla et al. Citation2018). Ignoring the time factor limits the interpretability of some research findings. For instance, many studies in the explanation domain use cross-sectional data, which limits the ability to provide theoretical explanations and infer causal mechanisms. More longitudinal data is necessary to investigate potential causal pathways, such as clarifying the link between the use of risk-behavior words on Twitter and changes in HIV transmission across time.

4.4. Data quality: inherent flaws of social media data

4.4.1. Population bias

Social media data suffers from a number of inherent flaws. First, social media users are not strictly representative of the general population (with regard to their socioeconomic and demographic status) or of people at risk of pathogen exposure (Adnan et al. Citation2020; Guntuku et al. Citation2021). For example, the majority of Twitter users fall in specific age ranges (18–29 and 30–39) and are technologically savvy (Huang and Wong Citation2016; Jiang, Li, and Ye Citation2019). Facebook users are also mostly between the ages of 25 and 34 (Statista Citation2022a), and users of the Chinese social media platform Sina Weibo are young and well-educated (Statista Citation2022b). The use of geotagged social media data may exacerbate the population bias. For example, only around 1–2% of tweets are geotagged (Hong et al. Citation2012; Cuomo et al. Citation2021), and those geotagged tweets tend to be concentrated in certain populations, such as younger people and females who like to tweet with geographic location (Feng and Kirkley Citation2021). In addition, a small group of users may produce the majority of posts, as 80% of tweets are posted by the top 10% of most active users (Wojcik and Hughes Citation2019). With such potentially uneven sampling and a nonrepresentative demographic distribution, the relevant indices and findings may be biased. For instance, we cannot collect topic content and sentiment from those who do not use social media applications to express their opinions and comments, so we must be more cautious about generalizing the results. Furthermore, geotagged tweets generally over-represent urban populations relative to rural ones. Since epidemic response differs between urban and rural areas, many research findings based on social media data may be restricted to urban populations.
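One simple way to gauge the posting concentration described above in a collected dataset is to compute the share of posts written by the most active fraction of users. The sketch below is illustrative only: the function name `top_user_share`, the input format (one user identifier per post), and the sample data are assumptions, not taken from any reviewed study.

```python
from collections import Counter


def top_user_share(user_ids, top_fraction=0.10):
    """Share of all posts written by the most active `top_fraction` of users.

    `user_ids` holds one (hypothetical) user identifier per post.
    The number of top users is rounded to the nearest whole user,
    with a minimum of one.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    counts = Counter(user_ids)
    if not counts:
        raise ValueError("user_ids is empty")

    posts_per_user = sorted(counts.values(), reverse=True)
    k = max(1, round(top_fraction * len(posts_per_user)))
    return sum(posts_per_user[:k]) / sum(posts_per_user)


# Hypothetical sample: one very active account and nine occasional posters.
sample = ["u0"] * 82 + [f"u{i}" for i in range(1, 10) for _ in range(2)]
print(f"Share of posts from the top 10% of users: {top_user_share(sample):.0%}")
```

A high share suggests that aggregate indicators such as topic frequencies or sentiment scores may mostly reflect a few prolific accounts, so limiting or down-weighting posts per user before aggregation may be worth considering.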

4.4.2. Digital divide

The digital divide refers to inequality in access to the Internet and Information and Communications Technology (ICT). This disparity grows when regions are compared. According to a Statista report (Statista Citation2022c), by the end of 2021, only 40% of Africans had Internet access, compared to more than 80% of Europeans and people in the U.S. Two factors may be at the root of this divide: (1) inequalities in access to the Internet and ICT, as the cost of Internet access varies greatly among socioeconomic groups and countries; and (2) inequalities in people’s digital skills. Immigrant populations, for example, face barriers to telemedicine adoption that are not always related to Internet access (Wang, Do, and Wilson Citation2018) and may instead stem from a lack of digital skills, which impedes the smooth use of various social media software (Ramsetty and Adams Citation2020). For infectious disease research, the digital divide makes it more difficult to track data on vulnerable groups, so relevant research is limited or neglected. The digital divide also makes it challenging to investigate some of the factors associated with infectious diseases. For example, during the pandemic, the opinions of vulnerable groups in some areas may not be sufficiently expressed on social media, so a low number of relevant posts in an area may be associated with high infection rates.

4.4.3. Misinformation

Social media narratives may contain misinformation about disease transmission, which affects the validity of findings. Misinformation has a significant impact on public awareness. For example, social media users can falsify flu-related posts to attract more attention from authorities and obtain additional assistance such as vaccines and medical supplies (Allen et al. Citation2016). Google search patterns and social media-based medical visits may reflect media reports and the perception of the situation instead of the actual influence of the epidemic. Most studies did not engage with social media users to confirm whether they were actually taking a specific medicine or to confirm their self-reported risk behaviors or health status, so social media data may not accurately reflect the dynamics of infectious diseases (Li, Qiao, et al. Citation2021). In addition, in the event of a large-scale outbreak, the response of the local media may differ from that in areas with few cases. Thus, media attention is likely to significantly influence predictions (Adnan et al. Citation2020). Censorship may also have an impact on the research findings. Some studies suggest that COVID-19 cases reported by the Chinese Center for Disease Control and Prevention may be underestimated due to limited testing capacity, the presence of asymptomatic carriers, and Internet censorship (Shen et al. Citation2020). Also, a small proportion of geotagged tweets reported fake geolocation information (Xu, Dredze, and Broniatowski Citation2020). To obtain more authentic data, fact-checkers, social media companies, news media, professional organizations, and authorities need to coordinate efforts to control the spread of misinformation about infectious diseases, so that scientific information can be released and public awareness of the pandemic can be improved (Forati and Ghose Citation2021).

4.4.4. Limited data sharing

Sharing social media research datasets enables reproducibility, replicability, and comparability (Kinder-Kurlanda et al. Citation2017), and helps provide a direct data source for scientific studies. However, most studies in our review did not share datasets due to legal restrictions, data sharing policies, format and storage issues, and conflicts of interest, even though most of their data originate from open-access social media platforms. From the data provider perspective, data sharing policies of social media platforms are constantly changing. For example, access to large-scale geospatial data from Chinese Sina Weibo is becoming increasingly difficult. With the acquisition of Twitter by Elon Musk, the future of Twitter's data sharing policies is unclear.

To overcome these limitations, it is critical to explore, assess, and integrate new types of data sources in infectious disease studies. The advancement of information and communication technologies has enabled a plethora of new types of data, including EHR data, mobile phone data, street view data, credit card data, and traffic data (Jing et al. Citation2022). For instance, mobile phone data can be used to measure population movement more accurately by covering larger populations at finer geographic resolutions. In addition, major social media platforms have undergone rapid evolution in recent years. For example, young people have shifted their preference from traditional platforms such as Facebook and Twitter to newer apps like Snapchat and TikTok (Mittmann et al. Citation2022). The incorporation of these new data sources has the potential to reduce data bias as well as broaden the scope of infectious disease research.

4.5. Limitations of the current study

The current review has several limitations in terms of article selection. First, the inclusion/exclusion criteria influence the result of article selection, which may lead to selection bias in the articles included in the review. Second, we reviewed only two main databases, leaving out other related databases such as Scopus, PsycInfo, and IEEE Xplore. Third, while we compiled an extensive keyword list for the major infectious diseases, some keywords for uncommon infectious diseases may not be included. Fourth, we did not review non-English publications, which may have resulted in the omission of important papers published in other languages, such as Chinese, German, and Japanese. Lastly, publications in this field increased rapidly during the COVID-19 outbreak, and our review did not include relevant papers published after the time of our data collection.

5. Conclusions

Social media data has become increasingly important in the field of health and healthcare in recent years. This article conducted a systematic review of the use of GSM data in infectious disease research. We began by providing an overview of current publications and research characteristics, and then synthesized the use of GSM data in three domains: surveillance, explanation, and prediction. We further discussed the research gaps and proposed new research opportunities regarding information extraction, research topic, spatial analysis, and data sources. With the increasing availability of social media data, as well as the advancement of machine learning and artificial intelligence, future research can expand current applications to advance our understanding of infectious diseases and human health.

Acknowledgements

We thank the five anonymous reviewers and the editor for their detailed and insightful comments that significantly improved the manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by National Institutes of Health [grant number 3R01AI127203-04S1] and NSF [grant number 2028791]. The funders had no role in study design, data collection and analysis, decision to publish or preparation of this article.

References

  • Adnan, Mehnaz, Xiaoying Gao, Xiaohan Bai, Elizabeth Newbern, Jill Sherwood, Nicholas Jones, Michael Baker, Tim Wood, and Wei Gao. 2020. “Potential Early Identification of a Large Campylobacter Outbreak Using Alternative Surveillance Data Sources: Autoregressive Modelling and Spatiotemporal Clustering.” JMIR Public Health and Surveillance 6 (3): e18281. doi:10.2196/18281
  • Ali, G. G. Md Nawaz, Md Mokhlesur Rahman, Md Amjad Hossain, Md Shahinoor Rahman, Kamal Chandra Paul, Jean-Claude Thill, and Jim Samuel. 2021. “Public Perceptions of COVID-19 Vaccines: Policy Implications from US Spatiotemporal Sentiment Analytics.” Paper presented at the Healthcare.
  • Allen, Chris, Ming-Hsiang Tsou, Anoshe Aslam, Anna Nagel, and Jean-Mark Gawron. 2016. “Applying GIS and Machine Learning Methods to Twitter Data for Multiscale Surveillance of Influenza.” PLoS One 11 (7): e0157734. doi:10.1371/journal.pone.0157734.
  • Ascani, Andrea, Alessandra Faggian, Sandro Montresor, and Alessandro Palma. 2021. “Mobility in Times of Pandemics: Evidence on the Spread of COVID19 in Italy’s Labour Market Areas.” Structural Change and Economic Dynamics 58: 444–454. doi:10.1016/j.strueco.2021.06.016
  • Barros, C., J. Gutiérrez, and J. García-Palomares. 2022. “Geotagged Data from Social Media in Visitor Monitoring of Protected Areas; a Scoping Review.” Current Issues in Tourism 25 (9): 1399–1415. doi:10.1080/13683500.2021.1931053
  • Bisanzio, Donal, Moritz U. G. Kraemer, Isaac I. Bogoch, Thomas Brewer, John S. Brownstein, and Richard Reithinger. 2020. “Use of Twitter Social Media Activity as a Proxy for Human Mobility to Predict the Spatiotemporal Spread of COVID-19 at Global Scale.” Geospatial Health 15: 1. doi:10.1080/13683500.2021.1931053.
  • Bishwajit, Ghose, Seydou Ide, and Sharmistha Ghosh. 2014. “Social Determinants of Infectious Diseases in South Asia.” International Scholarly Research Notices 2014: 135243. doi:10.1155/2014/135243.
  • Broniatowski, David A., Michael J. Paul, and Mark Dredze. 2013. “National and Local Influenza Surveillance Through Twitter: An Analysis of the 2012-2013 Influenza Epidemic.” PLoS One 8 (12): e83672. doi:10.1371/journal.pone.0083672
  • Cai, Mingxiang, Neal Shah, Jiawei Li, Wen-Hao Chen, Raphael E. Cuomo, Nick Obradovich, and Tim K. Mackey. 2020. “Identification and Characterization of Tweets Related to the 2015 Indiana HIV Outbreak: A Retrospective Infoveillance Study.” PLoS One 15 (8): e0235150. doi:10.1371/journal.pone.0235150.
  • Castro, Lauren A., Nicholas Generous, Wei Luo, Ana Pastore y Piontti, Kaitlyn Martinez, Marcelo F. C. Gomes, Dave Osthus, Geoffrey Fairchild, Amanda Ziemann, and Alessandro Vespignani. 2021. “Using Heterogeneous Data to Identify Signatures of Dengue Outbreaks at Fine Spatio-Temporal Scales Across Brazil.” PLoS Neglected Tropical Diseases 15 (5): e0009392. doi:10.1371/journal.pntd.0009392.
  • Chan, Man-pui Sally, Kathleen Hall Jamieson, and Dolores Albarracin. 2020. “Prospective Associations of Regional Social Media Messages with Attitudes and Actual Vaccination: A Big Data and Survey Study of the Influenza Vaccine in the United States.” Vaccine 38 (40): 6236–6247. doi:10.1016/j.vaccine.2020.07.054
  • Chan, Man-pui Sally, Sophie Lohmann, Alex Morales, Chengxiang Zhai, Lyle Ungar, David R. Holtgrave, and Dolores Albarracín. 2018. “An Online Risk Index for the Cross-Sectional Prediction of New HIV, Chlamydia, and Gonorrhea Diagnoses Across US Counties and Across Years.” AIDS and Behavior 22 (7): 2322–2333. doi:10.1007/s10461-018-2046-0
  • Chan, Man-pui Sally, Alex Morales, Maria Zlotorzynska, Patrick Sullivan, Travis Sanchez, Chengxiang Zhai, and Dolores Albarracín. 2021. “Estimating the Influence of Twitter on Pre-Exposure Prophylaxis Use and HIV Testing as a Function of Rates of Men Who Have Sex with Men in the United States.” In S101-S9. LWW.
  • Chang, Meng-Chun, Rebecca Kahn, Yu-An Li, Cheng-Sheng Lee, Caroline O. Buckee, and Hsiao-Han Chang. 2021. “Variation in Human Mobility and its Impact on the Risk of Future COVID-19 Outbreaks in Taiwan.” BMC Public Health 21 (1): 1–10. doi:10.1186/s12889-020-10013-y
  • Chen, Bin, Xinyi Chen, Jin Pan, Kui Liu, Bo Xie, Wei Wang, Ying Peng, Fei Wang, Na Li, and Jianmin Jiang. 2021. “Dissemination and Refutation of Rumors During the COVID-19 Outbreak in China: Infodemiology Study.” Journal of Medical Internet Research 23 (2): e22427. doi:10.2196/22427
  • Chen, Tao, and Mark Dredze. 2018. “Vaccine Images on Twitter: Analysis of What Images are Shared.” Journal of Medical Internet Research 20 (4): e8221. doi:10.2196/jmir.8221.
  • Chen, Bin, Jian Shao, Kui Liu, Gaofeng Cai, Zhenggang Jiang, Yuru Huang, Hua Gu, and Jianmin Jiang. 2018. “Does Eating Chicken Feet with Pickled Peppers Cause Avian Influenza? Observational Case Study on Chinese Social Media During the Avian Influenza A (H7N9) Outbreak.” JMIR Public Health and Surveillance 4 (1): e8198. doi:10.2196/publichealth.8198.
  • Chen, Ying, Yuzhou Zhang, Zhiwei Xu, Xuanzhuo Wang, Jiahai Lu, and Wenbiao Hu. 2019. “Avian Influenza A (H7N9) and Related Internet Search Query Data in China.” Scientific Reports 9 (1): 1–9. doi:10.1038/s41598-018-37186-2
  • Chunara, Rumi, Mark S. Smolinski, and John S. Brownstein. 2013. “Why we Need Crowdsourced Data in Infectious Disease Surveillance.” Current Infectious Disease Reports 15 (4): 316–319. doi:10.1007/s11908-013-0341-5
  • Cowley, Lauren A., Mokibul Hassan Afrad, Sadia Isfat Ara Rahman, Md Mahfuz Al Mamun, Taylor Chin, Ayesha Mahmud, Mohammed Ziaur Rahman, Mallick Masum Billah, Manjur Hossain Khan, and Sharmin Sultana. 2021. “Genomics, Social Media and Mobile Phone Data Enable Mapping of SARS-CoV-2 Lineages to Inform Health Policy in Bangladesh.” Nature Microbiology 6 (10): 1271–1278. doi:10.1038/s41564-021-00955-3
  • Cresswell, Kathrin, Ahsen Tahir, Zakariya Sheikh, Zain Hussain, Andrés Domínguez Hernández, Ewen Harrison, Robin Williams, Aziz Sheikh, and Amir Hussain. 2021. “Understanding Public Perceptions of COVID-19 Contact Tracing Apps: Artificial Intelligence–Enabled Social Media Analysis.” Journal of Medical Internet Research 23 (5): e26618. doi:10.2196/26618
  • Cuomo, Raphael E., Vidya Purushothaman, Jiawei Li, Mingxiang Cai, and Timothy K. Mackey. 2020. “Sub-national Longitudinal and Geospatial Analysis of COVID-19 Tweets.” PLoS One 15 (10): e0241330. doi:10.1371/journal.pone.0241330.
  • Cuomo, Raphael E., Vidya Purushothaman, Jiawei Li, Mingxiang Cai, and Tim K. Mackey. 2021. “A Longitudinal and Geospatial Analysis of COVID-19 Tweets During the Early Outbreak Period in the United States.” BMC Public Health 21 (1): 1–11. doi:10.1186/s12889-020-10013-y
  • Dahal, Sushma, Juan M. Banda, Ana I. Bento, Kenji Mizumoto, and Gerardo Chowell. 2021. “Characterizing all-Cause Excess Mortality Patterns During COVID-19 Pandemic in Mexico.” BMC Infectious Diseases 21 (1): 1–10. doi:10.1186/s12879-020-05706-z
  • Daughton, Ashlynn R., and Michael J. Paul. 2019. “Identifying Protective Health Behaviors on Twitter: Observational Study of Travel Advisories and Zika Virus.” Journal of Medical Internet Research 21 (5): e13090. doi:10.2196/13090
  • Duarte, Raquel, Ana Aguiar, Marta Pinto, Isabel Furtado, Simon Tiberi, Knut Lönnroth, and G. B. Migliori. 2021. “Different Disease, Same Challenges: Social Determinants of Tuberculosis and COVID-19.” Pulmonology 27 (4): 338–344. doi:10.1016/j.pulmoe.2021.02.002
  • Edo-Osagie, Oduwa, Beatriz De La Iglesia, Iain Lake, and Obaghe Edeghere. 2020. “A Scoping Review of the Use of Twitter for Public Health Research.” Computers in Biology and Medicine 122: 103770. doi:10.1016/j.compbiomed.2020.103770
  • Elkin, Lauren S., Kamil Topal, and Gurkan Bebek. 2017. “Network Based Model of Social Media Big Data Predicts Contagious Disease Diffusion.” Information Discovery and Delivery. doi:10.1108/IDD-05-2017-0046.
  • Fakhry, Nuha Noha, Evan Asfoura, and Gamal Kassam. 2020. “Tracking Coronavirus Pandemic Diseases Using Social Media: A Machine Learning Approach.” International Journal of Advanced Computer Science and Applications 11 (10). doi:10.14569/IJACSA.2020.0111028.
  • Fan, Chao, Sanghyeon Lee, Yang Yang, Bora Oztekin, Qingchun Li, and Ali Mostafavi. 2021. “Effects of Population Co-location Reduction on Cross-County Transmission Risk of COVID-19 in the United States.” Applied Network Science 6 (1): 1–18. doi:10.1007/s41109-020-00342-7
  • Fauci, Anthony S., and David M. Morens. 2012. “The Perpetual Challenge of Infectious Diseases.” New England Journal of Medicine 366 (5): 454–461. doi:10.1056/NEJMra1108296
  • Feng, Shihui, and Alec Kirkley. 2021. “Integrating Online and Offline Data for Crisis Management: Online Geolocalized Emotion, Policy Response, and Local Mobility During the COVID Crisis.” Scientific Reports 11 (1): 1–14. doi:10.1038/s41598-020-79139-8
  • Forati, Amir Masoud, and Rina Ghose. 2021. “Geospatial Analysis of Misinformation in COVID-19 Related Tweets.” Applied Geography 133: 102473. doi:10.1016/j.apgeog.2021.102473
  • Fritz, Cornelius, and Goeran Kauermann. 2022. “On the Interplay of Regional Mobility, Social Connectedness and the Spread of COVID-19 in Germany.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 185 (1): 400. doi:10.1111/rssa.12753
  • Gebru, Timnit, Jonathan Krause, Yilun Wang, Duyun Chen, Jia Deng, Erez Lieberman Aiden, and Li Fei-Fei. 2017. “Using Deep Learning and Google Street View to Estimate the Demographic Makeup of Neighborhoods Across the United States.” Proceedings of the National Academy of Sciences 114 (50): 13108–13113. doi:10.1073/pnas.1700035114
  • Gelernter, Judith, and Shilpa Balaji. 2013. “An Algorithm for Local Geoparsing of Microtext.” GeoInformatica 17 (4): 635–667. doi:10.1007/s10707-012-0173-8
  • Gibbs, Hamish, Emily Nightingale, Yang Liu, James Cheshire, Leon Danon, Liam Smeeth, Carl A. B. Pearson, Chris Grundy, LSHTM CMMID COVID-19 Working Group, and Adam J. Kucharski. 2021. “Detecting Behavioural Changes in Human Movement to Inform the Spatial Scale of Interventions Against COVID-19.” PLoS Computational Biology 17 (7): e1009162. doi:10.1371/journal.pcbi.1009162
  • Gulnerman, Ayse Giz. 2021. “Changing Pattern of Human Movements in Istanbul During Covid-19.” Paper presented at the International Conference on Computational Science and its Applications.
  • Guntuku, Sharath Chandra, Alison M. Buttenheim, Garrick Sherman, and Raina M. Merchant. 2021. “Twitter Discourse Reveals Geographical and Temporal Variation in Concerns about COVID-19 Vaccines in the United States.” Vaccine 39 (30): 4034–4038. doi:10.1016/j.vaccine.2021.06.014
  • Han, Xuehua, Juanle Wang, Min Zhang, and Xiaojie Wang. 2020. “Using Social Media to Mine and Analyze Public Opinion Related to COVID-19 in China.” International Journal of Environmental Research and Public Health 17 (8): 2788. doi:10.3390/ijerph17082788
  • Helbich, Marco, Yao Yao, Ye Liu, Jinbao Zhang, Penghua Liu, and Ruoyu Wang. 2019. “Using Deep Learning to Examine Street View Green and Blue Spaces and Their Associations with Geriatric Depression in Beijing, China.” Environment International 126: 107–117. doi:10.1016/j.envint.2019.02.013
  • Holtz, David, Michael Zhao, Seth G. Benzell, Cathy Y. Cao, Mohammad Amin Rahimian, Jeremy Yang, Jennifer Allen, Avinash Collis, Alex Moehring, and Tara Sowrirajan. 2020. “Interdependence and the Cost of Uncoordinated Responses to COVID-19.” Proceedings of the National Academy of Sciences 117 (33): 19837–19843. doi:10.1073/pnas.2009522117
  • Hong, Liangjie, Amr Ahmed, Siva Gurumurthy, Alexander J. Smola, and Kostas Tsioutsiouliklis. 2012. “Discovering Geographical Topics in the Twitter Stream.” Paper presented at the Proceedings of the 21st International Conference on World Wide Web.
  • Hu, Tao, Siqin Wang, Wei Luo, Mengxi Zhang, Xiao Huang, Yingwei Yan, Regina Liu, Kelly Ly, Viraj Kacker, and Bing She. 2021. “Revealing Public Opinion Towards COVID-19 Vaccines with Twitter Data in the United States: Spatiotemporal Perspective.” Journal of Medical Internet Research 23 (9): e30854. doi:10.2196/30854.
  • Huang, Xiao, Zhenlong Li, Yuqin Jiang, Xiaoming Li, and Dwayne Porter. 2020. “Twitter Reveals Human Mobility Dynamics During the COVID-19 Pandemic.” PLoS One 15 (11): e0241957. doi:10.1371/journal.pone.0241957.
  • Huang, Xiaolei, Michael C. Smith, Amelia M. Jamison, David A. Broniatowski, Mark Dredze, Sandra Crouse Quinn, Justin Cai, and Michael J. Paul. 2019. “Can Online Self-Reports Assist in Real-Time Identification of Influenza Vaccination Uptake? A Cross-Sectional Study of Influenza Vaccine-Related Tweets in the USA, 2013–2017.” BMJ Open 9 (1): e024018. doi:10.1136/bmjopen-2018-024018.
  • Huang, Xiao, Siqin Wang, Mengxi Zhang, Tao Hu, Alexander Hohl, Bing She, Xi Gong, Jianxin Li, Xiao Liu, and Oliver Gruebner. 2022. “Social Media Mining Under the COVID-19 Context: Progress, Challenges, and Opportunities.” International Journal of Applied Earth Observation and Geoinformation 113: 102967. doi:10.1016/j.jag.2022.102967
  • Huang, Qunying, and David W. S. Wong. 2016. “Activity Patterns, Socioeconomic Status and Urban Spatial Structure: What Can Social Media Data Tell Us?” International Journal of Geographical Information Science 30 (9): 1873–1898. doi:10.1080/13658816.2016.1145225
  • Hung, Man, Evelyn Lauren, Eric S. Hon, Wendy C. Birmingham, Julie Xu, Sharon Su, Shirley D. Hon, Jungweon Park, Peter Dang, and Martin S. Lipsky. 2020. “Social Network Analysis of COVID-19 Sentiments: Application of Artificial Intelligence.” Journal of Medical Internet Research 22 (8): e22590. doi:10.2196/22590
  • Inampudi, Srividya, Greshma Johnson, Jay Jhaveri, S. Niranjan, Kuldeep Chaurasia, and Mayank Dixit. 2020. “Machine Learning Based Prediction of H1N1 and Seasonal Flu Vaccination.” Paper presented at the International Advanced Computing Conference.
  • Iranmanesh, Aminreza, and Resmiye Alpar Atun. 2021. “Reading the Changing Dynamic of Urban Social Distances During the COVID-19 Pandemic Via Twitter.” European Societies 23 (sup1): S872–S886. doi:10.1080/14616696.2020.1846066
  • Ireland, Molly E., Qijia Chen, H. Andrew Schwartz, Lyle H. Ungar, and Dolores Albarracin. 2016. “Action Tweets Linked to Reduced County-Level HIV Prevalence in the United States: Online Messages and Structural Determinants.” AIDS and Behavior 20 (6): 1256–1264. doi:10.1007/s10461-015-1252-2
  • Jiang, Yuqin, Zhenlong Li, and Xinyue Ye. 2019. “Understanding Demographic and Socioeconomic Biases of Geotagged Twitter Users at the County Level.” Cartography and Geographic Information Science 46 (3): 228–242. doi:10.1080/15230406.2018.1434834
  • Jing, Fengrui, Lin Liu, Suhong Zhou, Zhenlong Li, Jiangyu Song, Linsen Wang, Ruofei Ma, and Xiaoming Li. 2022. “Exploring Large-Scale Spatial Distribution of Fear of Crime by Integrating Small Sample Surveys and Massive Street View Images.” Environment and Planning B: Urban Analytics and City Science 23998083221135608. doi:10.1177/23998083221135608.
  • Jing, Fengrui, Lin Liu, Suhong Zhou, Jiangyu Song, Linsen Wang, Hanlin Zhou, Yiwen Wang, and Ruofei Ma. 2021. “Assessing the Impact of Street-View Greenery on Fear of Neighborhood Crime in Guangzhou, China.” International Journal of Environmental Research and Public Health 18 (1): 311. doi:10.3390/ijerph18010311
  • Jordan, Sophie E., Sierra E. Hovet, Isaac Chun-Hai Fung, Hai Liang, King-Wa Fu, and Zion Tsz Ho Tse. 2018. “Using Twitter for Public Health Surveillance from Monitoring and Prediction to Public Response.” Data 4 (1): 6. doi:10.3390/data4010006
  • Kandula, Sasikiran, Daniel Hsu, and Jeffrey Shaman. 2017. “Subregional Nowcasts of Seasonal Influenza Using Search Trends.” Journal of Medical Internet Research 19 (11): e7486. doi:10.2196/jmir.7486.
  • Kaplan, Andreas M., and Michael Haenlein. 2010. “Users of the World, Unite! The Challenges and Opportunities of Social Media.” Business Horizons 53 (1): 59–68. doi:10.1016/j.bushor.2009.09.003
  • Khanal, Vishnu, Mandira Adhikari, and Rajendra Karkee. 2013. “Social Determinants of Poor Knowledge on HIV among Nepalese Males: Findings from National Survey 2011.” Journal of Community Health 38 (6): 1147–1156. doi:10.1007/s10900-013-9727-4
  • Kim, Ick-Hoi, Chen-Chieh Feng, Yi-Chen Wang, Brian H. Spitzberg, and Ming-Hsiang Tsou. 2017. “Exploratory Spatiotemporal Analysis in Risk Communication During the MERS Outbreak in South Korea.” The Professional Geographer 69 (4): 629–643. doi:10.1080/00330124.2017.1288577
  • Kinder-Kurlanda, Katharina, Katrin Weller, Wolfgang Zenk-Möltgen, Jürgen Pfeffer, and Fred Morstatter. 2017. “Archiving Information from Geotagged Tweets to Promote Reproducibility and Comparability in Social Media Research.” Big Data & Society 4 (2): 2053951717736336. doi:10.1177/2053951717736336.
  • Kitchin, Rob. 2013. “Big Data and Human Geography: Opportunities, Challenges and Risks.” Dialogues in Human Geography 3 (3): 262–267. doi:10.1177/2043820613513388
  • Kraemer, Moritz U. G., Donal Bisanzio, R. C. Reiner, R. Zakar, Jared B. Hawkins, Clark C. Freifeld, David L. Smith, Simon I. Hay, John S. Brownstein, and T. Alex Perkins. 2018. “Inferences About Spatiotemporal Variation in Dengue Virus Transmission are Sensitive to Assumptions About Human Mobility: A Case Study Using Geolocated Tweets from Lahore, Pakistan.” EPJ Data Science 7: 1–17. doi:10.1140/epjds/s13688-017-0128-2
  • Kwon, Jiye, Connor Grady, Josemari T. Feliciano, and Samah J. Fodeh. 2020. “Defining Facets of Social Distancing During the COVID-19 Pandemic: Twitter Analysis.” Journal of Biomedical Informatics 111: 103601. doi:10.1016/j.jbi.2020.103601
  • Lee, Elizabeth C., Jason M. Asher, Sandra Goldlust, John D. Kraemer, Andrew B. Lawson, and Shweta Bansal. 2016. “Mind the Scales: Harnessing Spatial Big Data for Infectious Disease Surveillance and Inference.” The Journal of Infectious Diseases 214 (suppl_4): S409–S413. doi:10.1093/infdis/jiw344.
  • Legeby, Ann, Daniel Koch, Fábio Duarte, Cate Heine, Tom Benson, Umberto Fugiglando, and Carlo Ratti. 2022. “New Urban Habits in Stockholm Following COVID-19.” Urban Studies 00420980211070677. doi:10.1177/00420980211070677.
  • Li, Ting, Yuxiang Dong, and Zhenhuan Liu. 2020. “A Review of Social-Ecological System Resilience: Mechanism, Assessment and Management.” Science of The Total Environment 723: 138113. doi:10.1016/j.scitotenv.2020.138113
  • Li, Lingyao, Abdolmajid Erfani, Yu Wang, and Qingbin Cui. 2021. “Anatomy Into the Battle of Supporting or Opposing Reopening Amid the COVID-19 Pandemic on Twitter: A Temporal and Spatial Analysis.” PLoS One 16 (7): e0254359. doi:10.1371/journal.pone.0254359.
  • Li, Zhenlong, Xiao Huang, Xinyue Ye, Yuqin Jiang, Yago Martin, Huan Ning, Michael E. Hodgson, and Xiaoming Li. 2021. “Measuring Global Multi-Scale Place Connectivity Using Geotagged Social Media Data.” Scientific Reports 11 (1): 1–19. doi:10.1038/s41598-020-79139-8
  • Li, Zhenlong, Shan Qiao, Yuqin Jiang, and Xiaoming Li. 2021. “Building a Social Media-Based HIV Risk Behavior Index to Inform the Prediction of HIV New Diagnosis: A Feasibility Study.” AIDS (London, England) 35 (Suppl 1): S91. doi:10.1097/QAD.0000000000002787.
  • Li, Songnian, Monica Wachowicz, and Hongchao Fan. 2021. “Analytics of Big Geosocial Media and Crowdsourced Data.” Big Earth Data 5 (1): 1–4. doi:10.1080/20964471.2021.1898780
  • Li, Qingchun, Yang Yang, Wanqiu Wang, Sanghyeon Lee, Xin Xiao, Xinyu Gao, Bora Oztekin, Chao Fan, and Ali Mostafavi. 2021. “Unraveling the Dynamic Importance of County-Level Features in Trajectory of COVID-19.” Scientific Reports 11 (1): 1–11. doi:10.1038/s41598-020-79139-8
  • Li, Xin, Lin Zhou, Tao Jia, Ran Peng, Xiongwu Fu, and Yuliang Zou. 2020. “Associating COVID-19 Severity with Urban Factors: A Case Study of Wuhan.” International Journal of Environmental Research and Public Health 17 (18): 6712. doi:10.3390/ijerph17186712
  • Liu, Kui, Li Li, Tao Jiang, Bin Chen, Zhenggang Jiang, Zhengting Wang, Yongdi Chen, Jianmin Jiang, and Hua Gu. 2016. “Chinese Public Attention to the Outbreak of Ebola in West Africa: Evidence from the Online Big Data Platform.” International Journal of Environmental Research and Public Health 13 (8): 780. doi:10.3390/ijerph13080780
  • Liu, Siru, and Jialin Liu. 2021. “Public Attitudes Toward COVID-19 Vaccines on English-Language Twitter: A Sentiment Analysis.” Vaccine 39 (39): 5499–5505. doi:10.1016/j.vaccine.2021.08.058
  • Lopreite, Milena, Pietro Panzarasa, Michelangelo Puliga, and Massimo Riccaboni. 2021. “Early Warnings of COVID-19 Outbreaks Across Europe from Social Media.” Scientific Reports 11 (1): 1–7. doi:10.1038/s41598-020-79139-8
  • Lu, Fred Sun, Suqin Hou, Kristin Baltrusaitis, Manan Shah, Jure Leskovec, Jared Hawkins, John Brownstein, Giuseppe Conidi, Julia Gunn, and Josh Gray. 2018. “Accurate Influenza Monitoring and Forecasting Using Novel Internet Data Streams: A Case Study in the Boston Metropolis.” JMIR Public Health and Surveillance 4 (1): e8950. doi:10.2196/publichealth.8950.
  • Lucas, Benjamin, Behzad Vahedi, and Morteza Karimzadeh. 2022. “A Spatiotemporal Machine Learning Approach to Forecasting COVID-19 Incidence at the County Level in the USA.” International Journal of Data Science and Analytics, 1–20. doi:10.1007/s41060-021-00295-9.
  • Mittmann, Gloria, Kate Woodcock, Sylvia Dörfler, Ina Krammer, Isabella Pollak, and Beate Schrank. 2022. “‘TikTok Is My Life and Snapchat Is My Ventricle’: A Mixed-Methods Study on the Role of Online Communication Tools for Friendships in Early Adolescents.” The Journal of Early Adolescence 42 (2): 172–203. doi:10.1177/02724316211020368
  • Moher, David, Alessandro Liberati, Jennifer Tetzlaff, Douglas G. Altman, and the PRISMA Group. 2009. “Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement.” Annals of Internal Medicine 151 (4): 264–269. doi:10.7326/0003-4819-151-4-200908180-00135
  • Mollalo, Abolfazl, Alireza Mohammadi, Sara Mavaddati, and Behzad Kiani. 2021. “Spatial Analysis of COVID-19 Vaccination: A Scoping Review.” International Journal of Environmental Research and Public Health 18 (22): 12024. doi:10.3390/ijerph182212024
  • Nagar, Ruchit, Qingyu Yuan, Clark C. Freifeld, Mauricio Santillana, Aaron Nojima, Rumi Chunara, and John S. Brownstein. 2014. “A Case Study of the New York City 2012-2013 Influenza Season with Daily Geocoded Twitter Data from Temporal and Spatiotemporal Perspectives.” Journal of Medical Internet Research 16 (10): e3416. doi:10.2196/jmir.3416.
  • Nielsen, René Clausen, Miguel Luengo-Oroz, Maeve B. Mello, Josi Paz, Colin Pantin, and Taavi Erkkola. 2017. “Social Media Monitoring of Discrimination and HIV Testing in Brazil, 2014–2015.” AIDS and Behavior 21 (1): 114–120. doi:10.1007/s10461-017-1753-2
  • Nsoesie, Elaine O., Luisa Flor, Jared Hawkins, Adyasha Maharana, Tobi Skotnes, Fatima Marinho, and John S. Brownstein. 2016. “Social Media as a Sentinel for Disease Surveillance: What Does Sociodemographic Status Have to Do with It?” PLoS Currents 8. doi:10.1371/currents.outbreaks.cc09a42586e16dc7dd62813b7ee5d6b6.
  • Padilla, Jose J., Hamdi Kavak, Christopher J. Lynch, Ross J. Gore, and Saikou Y. Diallo. 2018. “Temporal and Spatiotemporal Investigation of Tourist Attraction Visit Sentiment on Twitter.” PLoS One 13 (6): e0198857. doi:10.1371/journal.pone.0198857.
  • Palmer, Richard C., Deborah Ismond, Erik J. Rodriquez, and Jay S. Kaufman. 2019. “Social Determinants of Health: Future Directions for Health Disparities Research.” American Journal of Public Health 109 (S1): S70–S71. doi:10.2105/AJPH.2019.304964
  • Peng, Zhenghong, Ru Wang, Lingbo Liu, and Hao Wu. 2020. “Exploring Urban Spatial Features of COVID-19 Transmission in Wuhan Based on Social Media Data.” ISPRS International Journal of Geo-Information 9 (6): 402. doi:10.3390/ijgi9060402
  • Pobiruchin, Monika, Richard Zowalla, and Martin Wiesner. 2020. “Temporal and Location Variations, and Link Categories for the Dissemination of COVID-19–Related Information on Twitter During the SARS-CoV-2 Outbreak in Europe: Infoveillance Study.” Journal of Medical Internet Research 22 (8): e19629. doi:10.2196/19629
  • Porcher, Simon, and Thomas Renault. 2021. “Social Distancing Beliefs and Human Mobility: Evidence from Twitter.” PLoS One 16 (3): e0246949. doi:10.1371/journal.pone.0246949.
  • Poulston, Adam, Mark Stevenson, and Kalina Bontcheva. 2016. “User Profiling with Geo-Located Posts and Demographic Data.” Paper presented at the Proceedings of the First Workshop on NLP and Computational Social Science.
  • Preoţiuc-Pietro, Daniel, Svitlana Volkova, Vasileios Lampos, Yoram Bachrach, and Nikolaos Aletras. 2015. “Studying User Income Through Language, Behaviour and Affect in Social Media.” PLoS One 10 (9): e0138717. doi:10.1371/journal.pone.0138717.
  • Pruss, Dasha, Yoshinari Fujinuma, Ashlynn R. Daughton, Michael J. Paul, Brad Arnot, Danielle Albers Szafir, and Jordan Boyd-Graber. 2019. “Zika Discourse in the Americas: A Multilingual Topic Analysis of Twitter.” PLoS One 14 (5): e0216922. doi:10.1371/journal.pone.0216922.
  • Puspitasari, Ira, Rohiim Ariful, and Barry Nuqoba. 2021. “Public Health on Social Media: Using Instagram Posts for Investigating Dengue Hemorrhagic Fever in Indonesia.” Paper presented at the AIP Conference Proceedings.
  • Ramadona, Aditya Lia, Yesim Tozan, Lutfan Lazuardi, and Joacim Rocklöv. 2019. “A Combination of Incidence Data and Mobility Proxies from Social Media Predicts the Intra-Urban Spread of Dengue in Yogyakarta, Indonesia.” PLoS Neglected Tropical Diseases 13 (4): e0007298. doi:10.1371/journal.pntd.0007298.
  • Ramsetty, Anita, and Cristin Adams. 2020. “Impact of the Digital Divide in the Age of COVID-19.” Journal of the American Medical Informatics Association 27 (7): 1147–1148. doi:10.1093/jamia/ocaa078
  • Rao, Ashwin, Fred Morstatter, Minda Hu, Emily Chen, Keith Burghardt, Emilio Ferrara, and Kristina Lerman. 2021. “Political Partisanship and Antiscience Attitudes in Online Discussions About COVID-19: Twitter Content Analysis.” Journal of Medical Internet Research 23 (6): e26692. doi:10.2196/26692
  • Rivieccio, Bruno Alessandro, Alessandra Micheletti, Manuel Maffeo, Matteo Zignani, Alessandro Comunian, Federica Nicolussi, Silvia Salini, Giancarlo Manzi, Francesco Auxilia, and Mauro Giudici. 2021. “CoViD-19, Learning from the Past: A Wavelet and Cross-Correlation Analysis of the Epidemic Dynamics Looking to Emergency Calls and Twitter Trends in Italian Lombardy Region.” PLoS One 16 (2): e0247854. doi:10.1371/journal.pone.0247854.
  • Scotti, Francesco, Davide Magnanimi, Valeria Maria Urbano, and Francesco Pierri. 2020. “Online Feelings and Sentiments Across Italy During Pandemic: Investigating the Influence of Socio-Economic and Epidemiological Variables.” Paper presented at the 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM).
  • Seltzer, E. K., E. Horst-Martz, M. Lu, and Raina Martha Merchant. 2017. “Public Sentiment and Discourse About Zika Virus on Instagram.” Public Health 150: 170–175. doi:10.1016/j.puhe.2017.07.015
  • Sha, Dexuan, Yi Liu, Qian Liu, Yun Li, Yifei Tian, Fayez Beaini, Cheng Zhong, Tao Hu, Zifu Wang, and Hai Lan. 2021. “A Spatiotemporal Data Collection of Viral Cases for COVID-19 Rapid Response.” Big Earth Data 5 (1): 90–111. doi:10.1080/20964471.2020.1844934
  • Shen, Cuihua, Anfan Chen, Chen Luo, Jingwen Zhang, Bo Feng, and Wang Liao. 2020. “Using Reports of Symptoms and Diagnoses on Social Media to Predict COVID-19 Case Counts in Mainland China: Observational Infoveillance Study.” Journal of Medical Internet Research 22 (5): e19421. doi:10.2196/19421
  • Shen, Lining, Rui Yao, Wenli Zhang, Richard Evans, Guang Cao, and Zhiguo Zhang. 2021. “Emotional Attitudes of Chinese Citizens on Social Distancing During the COVID-19 Outbreak: Analysis of Social Media Data.” JMIR Medical Informatics 9 (3): e27079. doi:10.2196/27079
  • Shepherd, Harry E. R., Florence S. Atherden, Ho Man Theophilus Chan, Alexandra Loveridge, and Andrew J. Tatem. 2021. “Domestic and International Mobility Trends in the United Kingdom During the COVID-19 Pandemic: An Analysis of Facebook Data.” International Journal of Health Geographics 20 (1): 1–13. doi:10.1186/s12942-020-00255-9
  • Sinnenberg, Lauren, Alison M. Buttenheim, Kevin Padrez, Christina Mancheno, Lyle Ungar, and Raina M. Merchant. 2017. “Twitter as a Tool for Health Research: A Systematic Review.” American Journal of Public Health 107 (1): e1–e8. doi:10.2105/AJPH.2016.303512
  • Slavik, Catherine E., Charlotte Buttle, Shelby L. Sturrock, J. Connor Darlington, and Niko Yiannakoulias. 2021. “Examining Tweet Content and Engagement of Canadian Public Health Agencies and Decision Makers During COVID-19: Mixed Methods Analysis.” Journal of Medical Internet Research 23 (3): e24883. doi:10.2196/24883
  • Statista. 2022a. Accessed November 7. https://www.statista.com/statistics/1176654/internet-penetration-rate-africa-compared-to-global-average/.
  • Statista. 2022b. Accessed November 7. https://www.statista.com/statistics/187549/facebook-distribution-of-users-age-group-usa/.
  • Statista. 2022c. Accessed November 7. https://www.statista.com/statistics/320940/china-sina-weibo-user-breakdown-by-age-group/.
  • Stevens, Robin, Stephen Bonett, Jacqueline Bannon, Deepti Chittamuru, Barry Slaff, Safa K. Browne, Sarah Huang, and José A. Bauermeister. 2020. “Association Between HIV-Related Tweets and HIV Incidence in the United States: Infodemiology Study.” Journal of Medical Internet Research 22 (6): e17196. doi:10.2196/17196
  • Su, Yihua, Aarthi Venkat, Yadush Yadav, Lisa B. Puglisi, and Samah J. Fodeh. 2021. “Twitter-based Analysis Reveals Differential COVID-19 Concerns Across Areas with Socioeconomic Disparities.” Computers in Biology and Medicine 132: 104336. doi:10.1016/j.compbiomed.2021.104336
  • Sun, Xiaoyu, Zhou Huang, Xia Peng, Yiran Chen, and Yu Liu. 2019. “Building a Model-Based Personalised Recommendation Approach for Tourist Attractions from Geotagged Social Media Data.” International Journal of Digital Earth 12 (6): 661–678. doi:10.1080/17538947.2018.1471104
  • Surano, Francesco Vincenzo, Maurizio Porfiri, and Alessandro Rizzo. 2022. “Analysis of Lockdown Perception in the United States During the COVID-19 Pandemic.” The European Physical Journal Special Topics 231 (9): 1625–1633. doi:10.1140/epjs/s11734-021-00265-z
  • Tang, Lu, Bijie Bie, Sung-Eun Park, and Degui Zhi. 2018. “Social Media and Outbreaks of Emerging Infectious Diseases: A Systematic Review of Literature.” American Journal of Infection Control 46 (9): 962–972. doi:10.1016/j.ajic.2018.02.010
  • Tran, Thanh, and Kyumin Lee. 2016. “Understanding Citizen Reactions and Ebola-Related Information Propagation on Social Media.” Paper presented at the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM).
  • Turiel, Jeremy, Delmiro Fernandez-Reyes, and Tomaso Aste. 2021. “Wisdom of Crowds Detects COVID-19 Severity Ahead of Officially Available Data.” Scientific Reports 11 (1): 1–9. doi:10.1038/s41598-020-79139-8
  • University of Utah. 2022. Accessed November 7. https://healthcare.utah.edu/infectiousdiseases/general.php.
  • Vahedi, Behzad, Morteza Karimzadeh, and Hamidreza Zoraghein. 2021. “Spatiotemporal Prediction of COVID-19 Cases Using Inter- and Intra-County Proxies of Human Interactions.” Nature Communications 12 (1): 1–15. doi:10.1038/s41467-020-20314-w
  • van Heerden, Alastair, and Sean Young. 2020. “Use of Social Media Big Data as a Novel HIV Surveillance Tool in South Africa.” PLoS One 15 (10): e0239304. doi:10.1371/journal.pone.0239304.
  • Vicinitas. 2022. Accessed November 7. https://www.vicinitas.io/blog/twitter-social-media-strategy-2018-research-100-million-tweets.
  • Wakamiya, Shoko, Yukiko Kawai, and Eiji Aramaki. 2018. “Twitter-based Influenza Detection After Flu Peak Via Tweets with Indirect Information: Text Mining Study.” JMIR Public Health and Surveillance 4 (3): e8627. doi:10.2196/publichealth.8627.
  • Wang, Yang, D. Phuong Do, and Fernando A. Wilson. 2018. “Immigrants’ Use of eHealth Services in the United States, National Health Interview Survey, 2011-2015.” Public Health Reports 133 (6): 677–684. doi:10.1177/0033354918795888
  • Wang, Jimin, and Yingjie Hu. 2019. “Are We There Yet? Evaluating State-of-the-Art Neural Network Based Geoparsers Using EUPEG as a Benchmarking Platform.” Paper presented at the Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Geospatial Humanities.
  • Wang, Feng, Haiyan Wang, Kuai Xu, Ross Raymond, Jaime Chon, Shaun Fuller, and Anton Debruyn. 2016. “Regional Level Influenza Study with Geo-Tagged Twitter Data.” Journal of Medical Systems 40 (8): 1–8. doi:10.1007/s10916-016-0545-y.
  • Wang, Yufang, Kuai Xu, Yun Kang, Haiyan Wang, Feng Wang, and Adrian Avram. 2020. “Regional Influenza Prediction with Sampling Twitter Data and PDE Model.” International Journal of Environmental Research and Public Health 17 (3): 678. doi:10.3390/ijerph17030678
  • Wojcik, Stefan, and Adam Hughes. 2019. “How Twitter Users Compare to the General Public.” Pew Research Center: Internet, Science & Tech.
  • Xu, Paiheng, Mark Dredze, and David A. Broniatowski. 2020. “The Twitter Social Mobility Index: Measuring Social Distancing Practices with Geolocated Tweets.” Journal of Medical Internet Research 22 (12): e21499. doi:10.2196/21499
  • Yang, Wan, Marc Lipsitch, and Jeffrey Shaman. 2015. “Inference of Seasonal and Pandemic Influenza Transmission Dynamics.” Proceedings of the National Academy of Sciences 112 (9): 2723–2728. doi:10.1073/pnas.1415012112
  • Ye, Xinyue, Shengwen Li, Xining Yang, and Chenglin Qin. 2016. “Use of Social Media for the Detection and Analysis of Infectious Diseases in China.” ISPRS International Journal of Geo-Information 5 (9): 156. doi:10.3390/ijgi5090156
  • Yigitcanlar, Tan, Nayomi Kankanamge, Alexander Preston, Palvinderjit Singh Gill, Maqsood Rezayee, Mahsan Ostadnia, Bo Xia, and Giuseppe Ioppolo. 2020. “How Can Social Media Analytics Assist Authorities in Pandemic-Related Policy Decisions? Insights from Australian States and Territories.” Health Information Science and Systems 8 (1): 1–21. doi:10.1007/s13755-020-00121-9.
  • Young, Sean D., Neil Mercer, Robert E. Weiss, Elizabeth A. Torrone, and Sevgi O. Aral. 2018. “Using Social Media as a Tool to Predict Syphilis.” Preventive Medicine 109: 58–61. doi:10.1016/j.ypmed.2017.12.016
  • Young, Sean D., Caitlin Rivers, and Bryan Lewis. 2014. “Methods of Using Real-Time Social Media Technologies for Detection and Remote Monitoring of HIV Outcomes.” Preventive Medicine 63: 112–115. doi:10.1016/j.ypmed.2014.01.024
  • Yousefinaghani, Samira, Rozita Dara, Samira Mubareka, Andrew Papadopoulos, and Shayan Sharif. 2021. “An Analysis of COVID-19 Vaccine Sentiments and Opinions on Twitter.” International Journal of Infectious Diseases 108: 256–262. doi:10.1016/j.ijid.2021.05.059
  • Zeng, Chengbo, Jiajia Zhang, Zhenlong Li, Xiaowen Sun, Bankole Olatosi, Sharon Weissman, and Xiaoming Li. 2021. “Spatial-temporal Relationship Between Population Mobility and COVID-19 Outbreaks in South Carolina: Time Series Forecasting Analysis.” Journal of Medical Internet Research 23 (4): e27045. doi:10.2196/27045
  • Zhang, Xiangliang, Qiang Yang, Somayah Albaradei, Xiaoting Lyu, Hind Alamro, Adil Salhi, Changsheng Ma, Manal Alshehri, Inji Ibrahim Jaber, and Faroug Tifratene. 2021. “Rise and Fall of the Global Conversation and Shifting Sentiments During the COVID-19 Pandemic.” Humanities and Social Sciences Communications 8 (1): 1–10. doi:10.1057/s41599-020-00684-8
  • Zhu, Yongjian, Liqing Cao, Jingui Xie, Yugang Yu, Anfan Chen, and Fengming Huang. 2021. “Using Social Media Data to Assess the Impact of COVID-19 on Mental Health in China.” Psychological Medicine, 1–8. doi:10.1017/S0033291721001598.
  • Zhu, Bangren, Xinqi Zheng, Haiyan Liu, Jiayang Li, and Peipei Wang. 2020. “Analysis of Spatiotemporal Characteristics of Big Data on Social Media Sentiment with COVID-19 Epidemic Topics.” Chaos, Solitons & Fractals 140: 110123. doi:10.1016/j.chaos.2020.110123