Acceptance & Hesitation

Longitudinal analysis of behavioral factors and techniques used to identify vaccine hesitancy among Twitter users: Scoping review

Article: 2278377 | Received 04 Sep 2023, Accepted 29 Oct 2023, Published online: 20 Nov 2023

ABSTRACT

While vaccines have played a pivotal role in the fight against infectious diseases, individuals turn to online resources to find vaccine-related support and information. The benefits and consequences of these online peer interactions are unclear, but they mainly cause a behavioral shift in user sentiment toward vaccination. This scoping review aims to identify the community and individual factors that longitudinally influence public behavior toward vaccination. The secondary aim is to gain insight into the techniques and methodologies used to extract these factors from Twitter data. We followed PRISMA-ScR guidelines to search various online repositories. From this search process, 28 of the most relevant articles were selected out of 705 retrieved records. Three main themes emerged: individual factors and community factors influencing public attitude toward vaccination, and the techniques employed to identify these factors. Anti-vax, pro-vax, and neutral are the major communities, while misinformation, vaccine campaigns, and user demographics are the common individual factors assessed in the reviewed studies. Twitter user sentiment (positive, negative, and neutral) and emotions (fear, trust, sadness) were also discussed to identify intentions to accept or refuse vaccines. SVM, LDA, and BERT are the techniques used for classification and topic modeling, while Louvain, NodeXL, and Infomap are used for community detection. This research is notable for being the first systematic review that emphasizes the dearth of longitudinal studies and the methodological and practical constraints that hinder the effective implementation of an explainable, longitudinal behavior analysis system. Moreover, possible new research directions are suggested to help researchers perform accurate human behavior analysis.

Introduction

Recently, longitudinal analysis has gained significant attention from healthcare practitioners and psychologists as a way to perform in-depth analyses of changes in users’ health and behavior over time. In this process, researchers analyze data collected from the same subjects or participants over a period of time. It is often used to examine changes, trends, or relationships within a specific group or population. In contrast to cross-sectional investigations, longitudinal analysis allows for the investigation of temporal patterns, the exploration of causal relationships, and the examination of individual trajectories. For human behavior analysis, longitudinal evaluations are performed to understand how responses change over time and to identify the factors that influence this change. By repeatedly measuring individuals, researchers can capture the within-individual differences in responses.Citation1 Longitudinal analysis requires a considerable amount of data spanning time; however, collecting data through surveys, questionnaires, or public stations can be time-consuming and prone to errors. Additionally, when data collection extends over several years, it becomes more complex and financially demanding, making it less feasible for researchers.

With the availability of social media platforms, it has become easier for researchers to collect data spanning time and perform longitudinal behavior analysis. Social media tools such as Twitter, Facebook, Instagram, YouTube, and others help users exchange ideas, information, knowledge, and other facts across human societies, and they generate an enormous amount of data. Research shows that regular use of these applications can have both positive and negative effects on human behavior.Citation2 With around 400 million registered users sharing their thoughts, Twitter generates a substantial amount of data daily. Twitter users utilize “tweets,” which are short messages, to share news, information, and opinions. Additionally, Twitter provides options for users to engage with tweets by liking, commenting, retweeting (reposting a tweet), and sharing them. Kaur et al.Citation3 synthesized the dynamics and flow of behavioral changes among Twitter users during the pandemic.

With several vaccines approved globally, mass vaccination campaigns are currently underway. However, attitudes toward vaccination, specifically vaccine hesitancy, pose a potential threat to achieving sufficient coverage and community immunity. The SAGE Working Group on Vaccine Hesitancy determined that vaccine hesitancy pertains to the postponement of acceptance or the refusal of vaccination even when vaccination services are readily available.Citation4 Factors such as attitudes toward the collective importance, efficacy, side effects, and speed of vaccine development were significant predictors of vaccine hesitancy, while sociodemographic variables such as age, sex, and socioeconomic status only explained a small proportion (10%) of the variance. Additionally, anti-vaccine content is prevalent across social media platforms, posted by a minority of users, but frequently generating greater user engagement than neutral or pro-vaccine content.Citation5 Recent studies have also found that social media posts containing vaccine-related misinformation are shared as frequently as those with reliable information.Citation6,Citation7 Disinformation campaigns on social media have been associated with drops in vaccination coverage, as measured by annual data on actual vaccination rates from the WHO, and increased levels of negative vaccine discourse on Twitter.Citation8

The literature on vaccine hesitancy lacks conceptual clarity, as it presents varying definitions of vaccine hesitancy, ranging from a psychological state to different types of vaccination behavior.Citation9,Citation10 Additionally, the terms ‘low uptake,’ ‘vaccine confidence,’ and ‘low intention to vaccinate’ are often equated with vaccine hesitancy.Citation11,Citation12 Several research studies have employed systematic reviews and surveys to synthesize existing literature and pinpoint knowledge gaps. For instance, in the field of sentiment analysis for vaccine hesitancy, Alamoodi and colleaguesCitation13 conducted a comprehensive analysis. They highlighted the repercussions of vaccine hesitancy across various sectors, including social, medical, public health, and technology science. In a different study, Zhao and colleaguesCitation14 carried out a review focusing on the impact of COVID-19 on mental health in Australia. Their longitudinal analysis revealed that specific demographic groups, such as young individuals, those with preexisting mental health conditions, and individuals facing financial hardships, experienced more significant declines in their mental well-being. Another valuable contribution came from Bussink-Voorend et al.,Citation15 who performed a systematic analysis of literature gathered from the PubMed, Embase, and PsycINFO databases. Their work aimed to provide readers with a clear understanding of the concept of vaccine hesitancy. Lastly, Skafle and colleaguesCitation16 conducted a scoping review of the literature to identify instances of misinformation related to autism and COVID-19 vaccination shared on social media.

Upon reviewing the literature, a common observation is that the majority of existing scoping reviews concentrate on particular areas, such as misinformation, sentiments, mental health, or other healthcare consequences. Additionally, many of these reviews are limited in geographic scope or focus on specific vaccines. In contrast, our review aims to synthesize the literature from various research domains, including the longitudinal analysis of user behavior toward vaccination, factors influencing user behavior, and techniques for identifying user attitudes over time using Twitter data. Furthermore, this review aims to evaluate vaccine hesitancy and its underlying factors (individual and community factors) on Twitter as a social media platform. The findings of this scoping review will provide valuable insights for healthcare administrators and policymakers to understand the factors associated with vaccine hesitancy among different cohorts engaging in Twitter discussions. This understanding will facilitate the planning of vaccination campaigns and help improve the uptake rates of various vaccines. The main objectives of this scoping review are listed below:

  • To identify how Twitter users’ behavior toward vaccination changes longitudinally. This objective also aims to list the prominent community and individual factors that influence user behavior toward vaccines.

  • To describe the state-of-the-art techniques used to identify Twitter users’ behavior toward vaccination longitudinally. Furthermore, this objective aims to explain the different machine learning and software-based methodologies used to perform sentiment analysis of users toward vaccination.

  • To determine both the healthcare and social impacts of vaccine hesitancy on the population. Moreover, this objective aims to identify how vaccine hesitancy challenges healthcare workers and state agencies in fighting outbreaks.

Methodology

The PRISMA-ScR guidelinesCitation17 and the vaccine hesitancy scoping review frameworkCitation18 were followed to carry out this longitudinal research work. The supplementary file presents the adherence to the PRISMA-ScR guidelines.

Table 1. Studies analysis.

Search process

A search was performed on 10 databases: Scopus, PubMed, IEEE Xplore, ACM, Google Scholar, PsycINFO, Ovid, CINAHL, Springer Link, and Cochrane Library, using relevant keywords and search queries. The MEDLINE database was not searched separately because PubMed provides primary access to MEDLINE references and abstracts. Google Scholar retrieved a large number of studies (30,600), so only the 100 studies from the first 10 pages of results were included to keep the search results relevant to this review. The search was restricted to English-language articles. This study focuses on longitudinal vaccine-related content in Twitter data. Because Twitter debuted on March 21, 2006, the search was limited to the period from 2006 to the present. Forward and backward reference list checking was performed to identify up-to-date studies.

Search terms

The authors of this study developed the search queries through discussion and consultation. The queries aimed to find all relevant studies that performed longitudinal analysis of Twitter data on vaccines. Search terms were chosen based on the intervention (‘longitudinal’ OR ‘retrospective’ OR ‘prospective’) on the Twitter platform (twitter OR tweet*). The target outcome was (vaccin* OR immuniz* OR immunis*). An example search query for the Scopus database is: TITLE-ABS-KEY ((‘longitudinal’ OR ‘retrospective’ OR ‘prospective’) AND (twitter OR tweet*) AND (vaccin* OR immuniz* OR immunis*)). The detailed search terms used for the different databases are shown in the Appendix Search Results Excel file.
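The structure of the Scopus query above (three concept groups joined with AND, alternatives within each group joined with OR) can be sketched in a few lines of Python. This is only an illustration of how such a Boolean string could be assembled; the function names `or_group` and `build_scopus_query` are hypothetical and are not taken from the review.

```python
def or_group(terms):
    """Join alternative terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_scopus_query(design_terms, platform_terms, outcome_terms):
    """Combine the three concept groups with AND inside TITLE-ABS-KEY."""
    groups = [
        or_group([f"'{t}'" for t in design_terms]),  # quoted, as in the text
        or_group(platform_terms),                    # wildcards kept as-is
        or_group(outcome_terms),
    ]
    return "TITLE-ABS-KEY (" + " AND ".join(groups) + ")"


query = build_scopus_query(
    ["longitudinal", "retrospective", "prospective"],
    ["twitter", "tweet*"],
    ["vaccin*", "immuniz*", "immunis*"],
)
print(query)
```

Keeping each concept group as a separate list makes it easier to adapt the same terms to other databases, whose field codes and wildcard syntax differ from those of Scopus.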

Inclusion and exclusion criteria

A primary search was performed to find literature on longitudinal vaccine-related topics in Twitter data. Studies that considered Twitter data along with other social media data were included. Only articles written in English were considered. The included studies are empirical studies, peer-reviewed articles, dissertations, book chapters, and conference proceedings published from 2006 to the present. No limitation was placed on the demographic characteristics of the study population, such as age, gender, and nationality.

Articles published before 2006 were excluded because Twitter started in 2006. Articles written in languages other than English were not considered. Longitudinal studies that analyzed tweets rather than Twitter users were excluded, as were studies that failed to evaluate user sentiments over a certain period of time. Longitudinal analyses of vaccination reports from hospitals or other health workers were removed from the search results. Studies that manually assessed or surveyed users’ sentiments toward vaccines were also excluded. Systematic reviews, newspapers, magazines, reviews, proposals and posters, non-peer-reviewed articles, abstract-only records, and letters to the editor were excluded from the analysis.

Study selection

The study selection procedure comprised four stages. First, duplicate studies were identified using the automated duplicate detection feature of Rayyan softwareCitation47 and eliminated. Second, two reviewers (SK and MRB) assessed the titles and abstracts of the remaining unique studies and included those that met the inclusion criteria. Third, the reviewers thoroughly examined the full text of the selected studies. Finally, the forward and backward reference lists of the included studies were checked to identify further relevant studies. Any discrepancies between the reviewers were resolved through discussion with the third author (ZS).

Data extraction and synthesizing

To prepare for data extraction from the included studies, an extraction table with column headers was created in an Excel spreadsheet and shared with the other authors (MRB and ZS). The data extraction sheet was reviewed and updated through discussion. A pilot test was conducted on two studies to ensure data consistency and availability. The extracted data encompassed study characteristics (such as type of paper, authors, and authors’ location), attributes of the Twitter data (such as tool, online source searched (Twitter), online source searched (others), tweet language, tweet language (others), total duration, number of tweets, and number of users), and vaccine-related topics in the acquired data (such as type of vaccine, topics, emotions, peak time, and concerns). The extracted data are presented and summarized in the tables. The analysis and findings are synthesized using a narrative approach.

Results

This section outlines the search results and main findings of this scoping review. It briefly explains the article collection and selection process, the key factors influencing users’ behavior toward vaccination, and the methodologies used in the included articles to longitudinally analyze psychological characteristics associated with vaccine hesitancy.

Search results

During the initial retrieval, 705 records were obtained. After removing duplicate entries, 431 studies remained for further evaluation. Using our established inclusion and exclusion criteria (see the Methodology section), we screened these studies by title and abstract and retained 294 studies for full-text review. After full-text screening, only the 28 most relevant articles remained for final evaluation and review. During this process, we removed articles that considered only tweets instead of users, because we aimed to study users’ behavior toward vaccination, not tweet contents. We also removed articles that relied solely on social media sources other than Twitter, because our primary focus is Twitter; articles that considered other social media sources along with Twitter were included. Furthermore, we excluded articles that performed manual surveys, because our focus is the literature that used Twitter as the primary source for longitudinal analysis of users’ attitudes toward vaccination. The overall search, retrieval, and screening process, and the selection of the final pool of most relevant studies, are described in Figure 1.
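The screening counts reported above imply how many records were removed at each stage. The short Python sketch below only restates that arithmetic; the stage labels and the helper name `stage_losses` are illustrative and do not come from the review.

```python
# Records remaining after each screening stage, as reported in the text.
stages = [
    ("records retrieved", 705),
    ("after duplicate removal", 431),
    ("after title/abstract screening", 294),
    ("included after full-text review", 28),
]


def stage_losses(stages):
    """Return (stage name, records removed to reach it) for each later stage."""
    losses = []
    for (_, before), (name, after) in zip(stages, stages[1:]):
        losses.append((name, before - after))
    return losses


losses = stage_losses(stages)
for name, removed in losses:
    print(f"{name}: {removed} records removed")
```

The three differences add up to the gap between the 705 retrieved records and the 28 included articles, which is a quick consistency check on the reported flow.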

Figure 1. Experimental setup.

Demographics of the included studies

The overall demographics of the finalized relevant articles are shown in Figure 2. The outer shell represents the references to the included articles, the second shell from the outside represents the types of vaccines reported in these longitudinal studies, the third shell from the outside represents the publication year, and the inner circle represents the country of the first author (as reported in the studies). The vaccines include the Measles, Mumps, and Rubella (MMR) vaccine, the Human Papillomavirus (HPV) vaccine, and coronavirus disease 2019 (COVID-19) vaccines. Most of the studies (N = 19 ~ 68%) reported on COVID-19 vaccination, (N = 5 ~ 18%) on the HPV vaccine, and (N = 4 ~ 14%) on MMR vaccines. The largest share of studies (N = 11 ~ 39%) was published in 2022, while (N = 9 ~ 32%) were published in 2021. Thus, of the 28 longitudinal studies on user behavior toward vaccination, 20 (~71%) were published in these two most recent years. Moreover, most of the studies were reported by US authors (N = 14 ~ 50%); UK authors reported three studies (~11%) and Chinese authors two, while the remaining countries, including Japan, Singapore, and India, each reported only one study on longitudinal analysis using Twitter data. Most of the studies (N = 23 ~ 82%) are journal articles, while only (N = 5 ~ 18%) are conference papers.

Figure 2. Evolution of included articles.

Studies taxonomy

The overall taxonomy of the finalized relevant articles covers the number of data samples, the type of vaccine studied, and the range of years followed for the longitudinal analysis of users’ behavior, as well as the techniques or methodologies used to perform this analysis. Along with Twitter, some researchers in the included articles considered data from other online social media and news sources. For example, Calo et al.Citation39 considered posts and status reviews from Facebook, Instagram, Reddit, and YouTube to analyze public attitude toward HPV vaccination, while Islam et al.Citation22 considered blogs and news reports from Google, Facebook, fact-checking agency websites, YouTube, Fact check, and television and newspaper websites. Hussain et al.Citation33 considered Facebook posts along with Twitter data for user behavior analysis toward COVID-19 vaccination.

The data source is the starting point for any sentiment analysis task. Numerous data sources are reported in the selected articles to identify the behavior of Twitter users toward vaccination. Twitter is the primary data source in all the included articles (N = 28 ~ 100%); however, three research papersCitation22,Citation33,Citation39 (N = 3 ~ 11%) also used other sources, such as YouTube, Facebook, Instagram, Reddit, Fact check, and television and newspaper websites, along with Twitter for data collection and sentiment analysis.

Only four studiesCitation19,Citation33,Citation34,Citation43 (N = 4 ~ 14%) make both data and implementation code publicly available. Two research articlesCitation32,Citation45 provided a public access link to the implementation code but not to the data, owing to Twitter policy restrictions. Three studiesCitation20,Citation22,Citation39 provided only the identifiers of the Twitter data (Tweet IDs and User IDs) used for experimentation. One research articleCitation44 made its code publicly available but kept its data private and available on request only. The research articlesCitation26,Citation37,Citation43 kept their data private and provide it upon reasonable request to the corresponding authors. The remaining articles (N = 16 ~ 57%) kept their simulations and data private. The information about data sources, code implementation, and public access links is shown in Table 2.

Table 2. Data and code availability with access link.

Only one research articleCitation45 provided information about the hardware resources used for experimentation and sentiment analysis. The authors used an NVIDIA Tesla V100 SXM3 32 GB GPU to develop their 12-layer architecture with hidden size h = 768, a dropout rate of 0.2, a learning rate of 0.001, regularization of 0.001, and sigmoid as the activation function.
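For readers who want to record these reported settings in a reproducible form, the sketch below collects them in a small Python configuration object. The class and field names are hypothetical, and the text does not state the type of regularization or which layers use the sigmoid activation, so those details are left open here.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ModelConfig:
    """Hyperparameters as reported for the single study that listed them."""
    num_layers: int = 12
    hidden_size: int = 768
    dropout: float = 0.2
    learning_rate: float = 0.001
    regularization: float = 0.001  # type (e.g. L1 or L2) is not stated
    activation: str = "sigmoid"    # placement in the network is not stated


config = ModelConfig()
print(asdict(config))
```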

The finalized pool of relevant articles studies the longitudinal analysis of users’ behavior toward three vaccine types: the MMR vaccine, the HPV vaccine, and COVID-19 vaccines. Most of the studies (N = 19 ~ 68%) reported on COVID-19 vaccination, (N = 5 ~ 18%) on the HPV vaccine, and (N = 4 ~ 14%) on MMR vaccines in their longitudinal analysis using Twitter and other social media platforms.
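The "N ~ %" figures used throughout this section are shares of the 28 included studies rounded to the nearest whole percent. A minimal Python sketch of that calculation for the vaccine types is given below; the variable and function names are illustrative only.

```python
TOTAL_STUDIES = 28

# Study counts per vaccine type, as reported in the text.
vaccine_counts = {"COVID-19": 19, "HPV": 5, "MMR": 4}


def share(count, total=TOTAL_STUDIES):
    """Percentage of the included studies, rounded to the nearest integer."""
    return round(100 * count / total)


shares = {vaccine: share(n) for vaccine, n in vaccine_counts.items()}
for vaccine, pct in shares.items():
    print(f"{vaccine}: N = {vaccine_counts[vaccine]} ~ {pct}%")
```

Because the three counts sum to 28, each study is counted under exactly one vaccine type in this breakdown.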

For user demographics and profile data extraction, different techniques are reported in the included studies. Two articlesCitation20,Citation21 used m3 inference in Python for geographical information extraction. One of themCitation21 also used the “Geopy” and “Pycountry” libraries, while another studyCitation25 used DeepFace and anotherCitation41 used a named entity recognizer (NER) for user demographics and geographical information extraction. Several geographical regions are covered for users’ psychological and behavioral analysis toward vaccination. Among them, the USA is the most frequently reported region, covered in (N = 19 ~ 68%) studies.Citation19,Citation30,Citation32,Citation33,Citation35,Citation37,Citation38,Citation40,Citation42 The second most frequently reported region is the UK, covered in (N = 8 ~ 29%) research studies.Citation22,Citation23,Citation33,Citation34,Citation38,Citation41,Citation44,Citation46 India and Pakistan were covered in retrospective studiesCitation22,Citation34,Citation38 of users’ behavior toward vaccination. Australia and Brazil were covered in (N = 4 ~ 14%) studiesCitation21,Citation22,Citation34,Citation38 for longitudinal attitude analysis toward vaccination.

Characteristics of data

The details about the size of data observed, the number of users analyzed for longitudinal observations, and the time frame for which the users’ sentiments are analyzed are briefly explained below.

Data samples studied

In the finalized studies, varying numbers of samples are used for longitudinal analysis of users’ behavior toward vaccine hesitancy. Only one articleCitation21 reported 1.4 billion tweets for retrospective analysis of user behavior. Similarly, a single articleCitation34 reported about 13 million tweets for behavioral analysis toward vaccination.

In the final pool of relevant articles, some researchers considered tweets in other languages along with English-language tweets. One research paperCitation19 considered Japanese tweets along with English tweets. Similarly, one articleCitation26 considered Spanish, Turkish, Japanese, Portuguese, German, Slovenian, and Dutch tweets along with English tweets for longitudinal analysis. Another research paperCitation32 analyzed Dutch tweets along with English tweets, while one articleCitation34 considered English tweets along with tweets in 90 other languages from Southeast Asian, Eastern Mediterranean, and Western Pacific countries, including India, Indonesia, and Pakistan. One research paperCitation37 downloaded Spanish-language tweets along with English tweets for vaccine hesitancy analysis among Twitter users.

Timeframe of data

The included studies report different data timeframes for users’ psychological observations and attitudes toward vaccine hesitancy. Some researchers performed their longitudinal observation on data spanning two years, three years, or longer, while others based their observations on only a few months of Twitter data. A detailed description of the data timeframes is shown in Table 3.

Table 3. Data timeframe selected for users’ longitudinal sentiment analysis.

From Table 3, it is concluded that only three articlesCitation23,Citation40,Citation43 considered an observational period of more than three years. Similarly, only six articlesCitation20,Citation21,Citation28,Citation34,Citation36,Citation45 considered data spanning one year or more for psychological observation of users toward vaccination.

Table 4. Community factors extracted from the included studies.

Table 5. Individual factors extracted from the included studies.

Number of users

During the assessment and evaluation process, we found that varying numbers of users were selected for longitudinal analysis toward vaccination across the research articles. Only one study considered more than one million users (1.15 million) for longitudinal behavior analysis. Only two papers considered between 500,000 and 1,000,000 users. However, many of the studies (N = 10 ~ 36%) provided no user information in their psychological analysis toward vaccination.

Table 6. Data collection strategies.

Users’ longitudinal behavior analysis

Figure 3 represents the framework followed in this scoping review for the longitudinal analysis of user behavior. It provides a comprehensive overview of the two themes identified during this systematic analysis of the literature. These two overarching themes are (1) factors influencing users’ behavior toward vaccination, and (2) the methods/techniques used in the literature to perform a longitudinal analysis of vaccine hesitancy among Twitter users. These broad themes are further divided into sub-themes: “factors affecting users’ behavior” is divided into community factors and individual factors, where the individual factors are classified into contextual factors, individual and group factors, and vaccine-specific factors. The community factors are classified into community-specific factors such as politicians, religious and other influential activists, and media (news or advertisement teams), as shown in the leaf nodes of Figure 3.

Figure 3. Study taxonomy.

The methods/techniques to identify vaccine hesitancy are divided into community detection methods, which are further divided into machine learning methods and statistical methods. The machine learning methods are in turn split into shallow and deep architectures. The shallow architectures use binary patterns for classification and identification, while the deep architectures use neural network-based models for the same purposes.

Factors affecting users’ behavior toward vaccination

The question of why some people agree to be vaccinated while others do not is a critical issue, especially for the healthcare domain. The capability, opportunity, motivation – behavior (COM-B) model has been presented to identify the factors influencing Twitter users’ behavior toward COVID-19 vaccination.Citation48 The included studies identified different individual and community-based features that directly or indirectly affect public behavior toward vaccination.

Community factors

In the realm of sociology and community psychology, community factors refer to the various elements that characterize a particular community or social group. These factors encompass the collective characteristics, resources, and dynamics that influence the well-being and functioning of the community and its individual members.Citation49 Community factors can include aspects such as the social norms, values, and beliefs within the community, the availability of social support networks, the quality of local institutions and services, the level of community engagement and cohesion, and the presence of economic opportunities. These factors shape the social environment and can have a profound impact on health, social relationships, and overall quality of life within a community.Citation4 During this scoping review, we identified different communities, and the information disseminated from these communities, that were used in the included studies for psychological analysis of Twitter users toward vaccination. Furthermore, the included studies used different community-based factors to identify a user’s engagement with, and exposure to, a certain psychological behavior toward vaccination. These factors include the information a user retweets, likes, or replies to, the user’s follower network, and the community in which the user engages. Table 4 represents the different community-based factors identified in the included longitudinal studies.

During our scoping review analysis, we found that several research articlesCitation19,Citation23,Citation29,Citation31,Citation32,Citation35,Citation37,Citation42,Citation44 reported different communities for the longitudinal analysis of users’ behavior toward vaccination. These were anti-vaxxer, pro-vaxxer, or neutral communities, and based on these communities and on information exposure,Citation19,Citation20,Citation28,Citation29,Citation32,Citation35,Citation42,Citation44 the authors decided whether a user is pro-vaccine, anti-vaccine, or neutral. Information exposure here means the information that a user disseminates or engages with, such as what the user likes, retweets, and replies to, and whom the user follows. Similarly, the research articlesCitation19,Citation26,Citation27,Citation31,Citation33,Citation37,Citation38,Citation42,Citation43 also identified influential personalities, such as political, religious, and media figures, and the activities in which a user is involved. The authors of these articles then classified each user into the anti-vaccine, pro-vaccine, or neutral community.
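To make the idea of assigning a user to a community from their information exposure concrete, the following Python sketch applies a simple majority rule. It is a hypothetical illustration, not the method of any included study: the labels, the `min_share` threshold, and the choice to treat users without interactions as neutral are all assumptions.

```python
from collections import Counter


def classify_user(interactions, min_share=0.5):
    """Label a user by the community that dominates their interactions.

    `interactions` is a list of community labels ("pro", "anti", "neutral"),
    one per like, retweet, reply, or followed account.
    """
    if not interactions:
        return "neutral"  # assumption: no evidence means no stance
    counts = Counter(interactions)
    label, top = counts.most_common(1)[0]
    # Require a clear majority; otherwise fall back to neutral.
    if top / len(interactions) > min_share:
        return label
    return "neutral"


example = ["anti", "anti", "pro", "anti", "neutral"]
print(classify_user(example))
```

Applying the same rule to a user's interactions in successive time windows would give a stance trajectory over time, which is the kind of longitudinal signal the included studies examine with their own, more elaborate methods.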

Individual factors

Individual factors refer to the distinct traits or characteristics that differentiate one person from another. They can take the form of contextual factors, individual and group factors, or vaccine-specific factors.Citation50 Table 5 represents the individual factors identified in the included studies. Sentiment and emotions are the most frequently reported individual factors, appearing in (N = 17 ~ 61%) articles.Citation20,Citation25,Citation27,Citation29,Citation30,Citation33,Citation35,Citation38,Citation40,Citation46 The most discussed sentiments are fear, trust, sadness, anger, disgust, surprise, joy, intent to accept or reject vaccination, and many others. The second most discussed topic among the individual and group factors is user demographics, analyzed in (N = 13 ~ 46%) studies.Citation20,Citation21,Citation23,Citation28,Citation33,Citation36,Citation37,Citation39,Citation40 Within user demographics, these studies reported factors such as age, gender, user type, and occupation. The factor of belief, attitude, and prevention is assessed in (N = 9 ~ 32%) research articles.Citation20,Citation27,Citation32,Citation34,Citation40,Citation41,Citation43,Citation45 The most discussed topics in this category are beliefs that vaccines are safe, effective, or non-effective, pandemic care, public safety, and many others. The individual factor knowledge/awareness is reported in (N = 7 ~ 25%) studies,Citation20,Citation23,Citation27,Citation32,Citation33,Citation41,Citation43 and the data extracted include vaccine awareness, educational impact, scientific inquiry, and many others. Detailed information about all these individual factors is provided in the supplementary study taxonomy Excel file.

Among the vaccine-specific factors, misinformation is the most frequently reported theme, appearing in (N = 11 ~ 39%) research articles.Citation26,Citation28,Citation30,Citation33,Citation37,Citation39,Citation41,Citation42 Several misinformation themes are discussed in these articles, and we divided them into three broad categories: conspiracy theories, medical-related misinformation, and vaccine-related misinformation, as shown in Figure 4. The research articlesCitation20,Citation23,Citation26,Citation32,Citation37,Citation41,Citation42 reported the rumors and conspiracy theories shared on Twitter about vaccinations, while the articlesCitation26,Citation30,Citation33,Citation34,Citation37,Citation42 reported their research on medical-related misinformation. The research papersCitation20,Citation23,Citation26,Citation33,Citation39,Citation42 performed their longitudinal analysis on vaccine-related misinformation found in Twitter data. The second most frequently reported theme among the vaccine-specific factors is the vaccine campaign, analyzed in five articles.Citation27,Citation34,Citation39,Citation45,Citation46 For the campaign theme, the extracted text includes vaccine motivation, vaccine distribution, medical training, and many others. Detailed information about all these individual factors is provided in the supplementary material, namely the topic discussed Excel file.

Figure 4. Misinformation analyzed in the included studies.


In psychology, contextual factors refer to the environmental or situational elements that influence an individual’s thoughts, feelings, and behaviors. These factors include the physical, social, cultural, and historical contexts in which individuals are embedded. Contextual factors can have a significant impact on an individual’s experiences, perceptions, and actions, shaping their development, interactions, and overall psychological well-being. During our review analysis, we identified several contextual factors in the included studies. Pandemic news is reported in seven studies,Citation21,Citation26,Citation28,Citation33,Citation41,Citation45,Citation46 and the data extracted for this theme include cases and deaths, reproduction rate, new cases, new vaccines, vaccine and disease, and many others. Similarly, geographical barriers are also reported in seven studies,Citation24,Citation26,Citation31,Citation38,Citation40,Citation43 and the data associated with this theme include vaccine accessibility, inequities, and many others. Detailed information about all these contextual factors is provided in the supplementary material, namely, the topic discussed Excel file.

Methods to identify vaccine hesitancy

The included studies report numerous data collection strategies, together with statistical and machine learning-based methods, for calculating vaccine hesitancy and users’ emotional themes (happy, sad, sorrow, etc.) among Twitter users. These data collection strategies and methods are briefly discussed in the following subsections.

Data collection strategies

During the analysis process, we found that different data collection strategies, such as application programming interfaces (APIs) and other web-crawling tools, are used to accumulate data from social media platforms like Twitter, Facebook, YouTube, and many others. These data collection strategies are outlined in . From , it is evident that most of the articles (N = 18, ~64.28%) reported using the Twitter search API to collect data from Twitter, because it is provided through the Twitter developers portal, is freely accessible to researchers globally, and is maintained by the Twitter developers’ community. Two articlesCitation38,Citation41 reported the Snscrape API. It is worth noting that Snscrape extends beyond scraping tweets and offers functionality for extracting data from various other social networking platforms, including Facebook, Instagram, Reddit, VKontakte, and Weibo (Sina Weibo). Twitter permits the use of polite crawlers. Nevertheless, if data obtained through scraping are publicly shared in an unconventional manner, Twitter has the authority to terminate API access and potentially take disciplinary measures against the account.

Community factors detection methods

Social media platforms like Twitter assist multinational companies, political parties, and advertising teams by offering a dynamic perspective for classifying like-minded consumers and voters through community detection methods.Citation51 In the included studies, several community detection methods are used, which are shown in Figure 5. These methods can be software tools, machine learning methods, or statistical methodologies.

Figure 5. Different community detection methods reported in the included studies.


Software

In the included studies, several software tools are reported for identifying communities/clusters, as shown in . R is among the most frequently reported tools for community detection, and different versions of it are used: the articleCitation22 used R version 4.0.3, the articlesCitation28,Citation46 used R version 4.0.2, the articleCitation34 used R version 4.3.2, and the articleCitation40 used R version 3.6.2. Two studiesCitation32,Citation44 reported the RStudio software for community detection using retweet packages. The research articleCitation24 reported STATA version 15 for identifying distinct types of communities to perform longitudinal analysis of users’ behavior toward COVID-19 vaccination using Twitter data.

Machine learning methods

During the assessment and evaluation of the included studies, we found that numerous machine learning methods are proposed for longitudinal analysis of users’ sentiment. For simplicity, we divided these machine learning methodologies into two categories: deep architectures and shallow architectures.Citation52 Deep architectures are models that use hidden layers for feature extraction, classification, and identification purposes. In the included articles, numerous deep learning models are proposed for hesitancy analysis using Twitter data. The GenLouvain method is the most frequently reported community detection method, appearing in five studies.Citation19,Citation29,Citation30,Citation32,Citation42 The second most frequently reported model is BERT (Bidirectional Encoder Representations from Transformers), reported in two longitudinal studies.Citation22 The bidirectional long short-term memory (Bi-LSTM) model is reported inCitation45 for longitudinal analysis of user attitude toward COVID-19 vaccination. XLM-RoBERTa (Cross-lingual Language Model RoBERTa) is reported inCitation34 for longitudinally evaluating public attitude toward vaccination using Meltwater media monitoring platform data.
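
Louvain-family methods such as the GenLouvain method mentioned above search for a partition of the network that maximizes modularity. The following minimal sketch shows only the modularity score itself, not the optimization procedure, and it is not taken from any of the included studies. The retweet edges and account names are hypothetical, and the graph is assumed to be undirected and unweighted.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 m)) ** 2
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    degree = defaultdict(int)
    internal_edges = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal_edges[community_of[u]] += 1
    degree_sum = defaultdict(int)
    for node, k in degree.items():
        degree_sum[community_of[node]] += k
    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical retweet network: two tightly knit triangles joined by one edge.
retweet_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),   # first group of accounts
    ("d", "e"), ("e", "f"), ("d", "f"),   # second group of accounts
    ("c", "d"),                           # single bridge between the groups
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_two_groups = modularity(retweet_edges, two_groups)
q_one_group = modularity(retweet_edges, one_group)
```

In this toy network the two-group partition scores higher than placing every account in one community, which is the kind of signal a modularity-based method uses to separate, for example, anti-vax and pro-vax clusters.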

Shallow architectures are techniques that have no hidden layers and no automatic feature extraction capabilities. Feature engineering is required to extract informative features from the data before these models are trained and tested for identification and classification tasks. Typically, these models perform well in binary classification problems. Numerous shallow methodologies and models are reported for identifying different communities (anti-vaxxers, pro-vaxxers, or neutral) from Twitter data. In the included studies, the support vector machine (SVM) is the most frequently used shallow model, reported in four studies.Citation23,Citation29,Citation35,Citation45 With an appropriate kernel, the SVM model shows strong capabilities in data mining and NLP-relevant research problems. Multiple regression techniques such as logistic regression, linear regression, univariate linear regression, spline regression, and multivariate linear regression are reported in the articles.Citation21,Citation29,Citation34,Citation45 After fine-tuning of the hyper-parameters, these regression models perform well on NLP-relevant tasks. The Naïve Bayes technique is reported in two studies,Citation29,Citation45 and it is considered the simplest and most generalized classification model in data mining tasks.

Individual factors detection methods

After community detection, the next step is to synthesize the techniques used in the longitudinal studies to identify the topics discussed in the communities and the individual responses toward these topics. During our scoping review process, we found multiple techniques for identifying individual factors (also shown in ) and the sentiments and emotions expressed toward the topics discussed among communities. For simplicity, we divided these individual topic identification methodologies into four classes, as shown in . These four broad classes include (1) emotion and sentiment detection techniques, (2) correlation identification techniques, (3) topic identification techniques, and (4) user stance calculation techniques.

Figure 6. Individual factors and methodologies.


Sentiments and emotions detection techniques

In psychology, sentiments and emotions are related concepts, but they differ in meaning and usage.Citation53 Sentiments are broader and can be shared by groups or communities, whereas emotions are specific to the individual and the immediate context. While emotions and sentiments are distinct, they are interconnected and can influence each other. Emotions can contribute to the formation of sentiments, and sentiments can shape emotional experiences and responses in different situations.Citation53 During our review analysis, we separated the sentiment and emotion detection techniques for readability, as shown in .

For sentiment analysis, VADER (Valence Aware Dictionary and Sentiment Reasoner) is the most frequently reported tool in the finalized longitudinal studies. It is a lexicon and rule-based sentiment evaluation tool and is reported in (N = 8, ~28%) studies.Citation20,Citation21,Citation25,Citation27,Citation30,Citation33,Citation46 The Python library TextBlob is reported in three studies.Citation20,Citation33,Citation46
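
To illustrate what "lexicon and rule-based" means in practice, the sketch below scores a post by summing word valences from a small dictionary, flipping the sign after a negation word, and squashing the sum into the range (-1, 1). It is a simplified illustration, not the VADER implementation: the lexicon, its valence values, and the single negation rule are hypothetical. The normalization constant 15 and the ±0.05 labelling thresholds follow the conventions commonly described for VADER's compound score and should be treated as assumptions here.

```python
import math
import re

# Hypothetical toy lexicon; real lexicons contain thousands of rated words.
TOY_LEXICON = {
    "safe": 1.5, "effective": 1.8, "trust": 1.6, "protect": 1.4,
    "fear": -1.9, "dangerous": -2.1, "hoax": -2.0, "scared": -1.7,
}
NEGATIONS = {"not", "no", "never"}


def compound_score(text):
    """Sum word valences, flip sign after a negation, squash into (-1, 1)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, token in enumerate(tokens):
        valence = TOY_LEXICON.get(token, 0.0)
        # Simplified rule: a negation directly before a word inverts it.
        if i > 0 and tokens[i - 1] in NEGATIONS:
            valence = -valence
        total += valence
    # Assumed normalization: x / sqrt(x^2 + 15).
    return total / math.sqrt(total * total + 15)


def sentiment_label(text, threshold=0.05):
    """Map the compound score to positive, negative, or neutral."""
    score = compound_score(text)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"
```

For example, `sentiment_label("The vaccine is safe and effective")` is classified as positive under this toy lexicon, while `sentiment_label("Vaccines are not safe")` is classified as negative because of the negation rule.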

For emotion detection, the LIWC (Linguistic Inquiry and Word Count) dictionary is used to analyze the linguistic and psychological dimensions of written text or content in the posts. It is reported in three studies.Citation19,Citation23,Citation30 RNN (Recurrent Neural Network) models are suggested for emotion detection in three studies.Citation21,Citation40,Citation43 NRCLex (National Research Council Lexicon), a Python library, is reported inCitation20 for emotion detection purposes.

Correlation identification techniques

These techniques are used to identify the correlation between the most discussed topics and user sentiments. In the finalized articles, several techniques for identifying correlations between topics and sentiments are proposed. The research articlesCitation38,Citation42 reported cosine similarity for correlation identification. The research papersCitation43,Citation45 used Global Vectors for Word Representation (GloVe) Twitter. The pre-trained GloVe model (trained on 2 billion tweets) maps each token (i.e., word) in the text to a 200-dimensional vector.Citation54 The research paperCitation21 used Pearson correlation for stance calculations, whileCitation42 reported principal component analysis (PCA) andCitation28 used Silhouette width (ranging from −1 to 1) for stance calculation toward vaccination. The research articleCitation37 used the Botometer API to assess whether a post and its underlying stance come from a human or a bot.
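
The two similarity measures named above can be written in a few lines. The sketch below is a generic illustration rather than the procedure of any included study; in practice the input vectors would be, for example, word embeddings or per-week topic and sentiment scores.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(x, y):
    """Pearson r, computed as the cosine similarity of mean-centered data."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [v - mean_x for v in x]
    centered_y = [v - mean_y for v in y]
    return cosine_similarity(centered_x, centered_y)
```

Writing Pearson correlation as the cosine similarity of mean-centered vectors makes the relationship between the two measures explicit: both range from −1 to 1, but only Pearson correlation ignores a constant offset in the data.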

Topic identification techniques

These techniques are used to perform topic modeling, that is, to extract the topics discussed in different communities and to identify the most discussed topics among them. Four research articlesCitation20,Citation25,Citation27,Citation41 proposed the Latent Dirichlet Allocation (LDA) technique for topic modeling. Two papersCitation38,Citation45 used the Word2Vec model for topic extraction from the community discussions. The studyCitation23 reported the meaning extraction method, whileCitation40 used locally estimated scatterplot smoothing (LOESS) for topic modeling purposes.

Users’ stance calculation techniques

These techniques are used for calculating vaccine acceptance or rejection levels based on the topics discussed in the communities. The growing availability of digital data and large datasets has made the sentiment analysis domain more interesting,Citation45 and the mining of texts has gained significant attention from researchers.Citation55 Using AI to analyze the emotions, attitudes, and opinions expressed in comments is a breakthrough that holds promise for identifying public opinions on vaccine hesitancy.Citation56 By categorizing opinions according to polarity (positive, negative, or neutral), emotions (such as anger and joy), or degree of agreement, sentiment analysis can provide valuable insights.Citation56 For users’ stance calculations, the research articlesCitation36,Citation39 employed the χ² test, while the studyCitation19 reported the Mann-Whitney U-test for users’ stance toward vaccination. Along with the χ² test, the research articleCitation36 also used the Kruskal-Wallis test for users’ stance calculation toward vaccination. The research paperCitation26 employed the SAGE hesitancy matrix to identify users’ stances toward vaccination based on the community discussion and information exposure.
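
As an illustration of how a χ² test can relate community membership to stance, the sketch below computes the Pearson χ² statistic for a 2×2 contingency table and its p-value for one degree of freedom. It is a minimal example under stated assumptions, not the analysis of any included study: the counts are hypothetical, no continuity correction is applied, and the p-value formula used here is valid only for one degree of freedom.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). With one degree of freedom the p-value
    equals erfc(sqrt(statistic / 2)). No Yates correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a non-zero total")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts of users: rows are communities, columns are stances.
#                 accept  refuse
stance_counts = [[30, 10],    # community A
                 [10, 30]]    # community B
chi2, p = chi_square_2x2(stance_counts)
```

A small p-value here would suggest that stance is not independent of community membership in the sampled users. Tables with small expected counts usually call for an exact test instead.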

Statistical methods

The statistical methods for data analysis encompass descriptive statistics and inferential statistics. Descriptive statistics employ measures like the mean and median to summarize data. Inferential statistics, on the other hand, draw conclusions from data using tests like Student’s t-test and the z-test. Additional statistical techniques involve data sampling, central tendency, random variables, probability distributions, statistical inference, confidence intervals, and hypothesis testing. Several statistical operations and methodologies are reported in the included studies. For example, the studyCitation22 employed the statistical package R version 4.0.3 on an Excel sheet to perform topic modeling in different community discussions. The articleCitation24 reports Spearman correlation and statistical analysis for topic and sentiment analysis. The research papersCitation26,Citation27 employed the vaccine hesitancy matrix and the GraphPad Prism statistical software, version 9.0.2, respectively. Moreover, numerous statistical techniques for stance similarity and sentiment analysis, including the χ² test, t-test, Mann-Whitney U-test, and many others, are provided in the supplementary material, namely, the methodologies employed Excel file.

Discussions

This section of the paper presents a summary of the review’s findings and results. It provides a concise overview of the principal outcomes, challenges encountered, and practical implications derived from this review work.

Main results

Taken together, the 28 longitudinal studies on user behavior regarding vaccine hesitancy, all utilizing Twitter data, highlight the pressing concern surrounding the proliferation of both community and individual factors and their correlation with vaccine hesitancy among users. This scoping review is the first of its kind to analyze Twitter data and identify numerous influential factors at both community and individual levels that shape human behavior toward vaccines over time. Our analysis of users’ behavior over time revealed two overarching themes: factors influencing human behavior toward vaccines and methodologies employed to calculate vaccine hesitancy among Twitter users. The individual factors influencing user behavior were further categorized into three distinct classes: contextual factors, individual and group factors, and vaccine-specific factors. Similarly, community factors were classified into three classes: influential personalities, community analysis, and information exposure. Methodologies utilized encompassed both machine learning methods and software-based approaches. While most of the included studies were conducted between 2021 and 2022, it is crucial to note that the findings predominantly reflect data from Europe and the United States. Consequently, there remains a notable dearth of information, particularly from African, Asian, and South American countries. Nonetheless, it is worth mentioning that Twitter emerged as the most extensively studied platform for longitudinal analysis, followed by Facebook, Reddit, and YouTube, respectively.

The growing body of recent evidence in behavior analysis utilizing Twitter data reflects the availability of new digital platforms and advanced data mining and machine learning techniques. The GenLouvain method and SVM are the most frequently reported methodologies for community detection, while LDA and Word2Vec models are the most employed algorithms for topic modeling and discussion identification among different Twitter communities. Exploring these behaviors can assist public health officials in tailoring their messages to address public health concerns and enhance healthcare delivery. In our analysis, the most frequently discussed factors are sentiments and emotions, followed by user demographics and the spread of misinformation on Twitter. We identified three broad categories of misinformation: medical-related, vaccine-related, and conspiracy theories. However, these categories are interconnected and can overlap, as skepticism toward vaccine development may be rooted in conspiratorial beliefs regarding hidden power structures and corrupt elites.

Digital data can help portray the dynamics of public health surveillance systems and allow public health professionals to pinpoint the general concerns or needs of the public during infectious disease events to create location-specific campaigns. For example, the finding that there is no association between community and individual discussions and resistive behaviors toward vaccination among Twitter users can indicate that this population is unfamiliar with the relationship between vaccine hesitancy and individual or community discussion on social media platforms. Several emotional and sentimental themes are identified during our evaluation process, and a number of techniques are reported to extract these themes from the users’ community discussions. VADER and TextBlob are the predominantly utilized libraries for sentiment analysis, while NRCLex, LIWC, and RNN are the most employed techniques for calculating emotions (happy, sad, sorrow, anger, and joy). Since longitudinal analysis is a temporal and momentary analysis of user behavior, RNN is the most employed emotion detection technique.

Notably, the majority of the research conducted on Twitter engagement has focused on extended periods of time, spanning months or even years. There is a notable scarcity of studies investigating engagement on a more immediate, momentary scale. Out of the 28 finalized longitudinal studies, only a third encompassed long-term analysis, exceeding one year in duration.Citation20,Citation21,Citation23,Citation28,Citation34,Citation36,Citation40,Citation43,Citation45 Long-term analysis offers enriched evidence for gaining insights into the attitude patterns of the population that disseminates information about vaccines on Twitter and other social media tools. Conversely, around two-thirds of the studies were of a midterm nature, measuring engagement over several months up to one year.Citation19,Citation22,Citation24,Citation25,Citation27,Citation29,Citation33,Citation35,Citation37,Citation39,Citation41,Citation42,Citation44,Citation46 Notably, there was only a single study that employed a momentary approach, examining engagement over the course of just one week.Citation26 These momentary approaches typically employ longitudinal designs to analyze the captured data. However, further research is required to explore the short-term cross-sectional progression of engagement, specifically in relation to discussions surrounding vaccinations at both the community and individual levels. Additionally, investigating the interaction between momentary engagement and other variables of a momentary nature is also an area that warrants additional attention.

Our research findings indicate a notable prevalence of Twitter data usage in the analysis of human behavior, particularly in the context of understanding the factors that influence user behavior toward vaccination. There are several possible explanations for this observation. Firstly, this category encompasses a wide range of topics, including seasonal outbreaks, epidemics, sexually transmitted diseases, and infectious diseases. The diverse nature of these topics makes them highly relevant and widely studied. Another contributing factor is the convenience of utilizing relative search volumes on Twitter, access logs from other social media platforms, and the prevailing fear and hype surrounding infectious diseases and various epidemics such as HPV and MMR. Surprisingly, a minimal proportion of research papers (0.3%) focused on community analysis, and a similar percentage (approximately 0.3%) explored the concept of information exposure. This finding is unexpected given the wealth of available Twitter data for analysis in these areas. A survey conducted across 19 countries between June 16 and June 20, 2020, using an online panel of 13,426 respondents, found that 72% of participants were either very or somewhat likely to take a COVID-19 vaccine. However, acceptance rates varied significantly between countries, ranging from 90% in China to less than 55% in Russia.Citation57 Higher vaccine acceptance was associated with older age, higher socioeconomic status, and trust in the government.Citation57 A recent survey of UK adults yielded similar results, with 72% of participants expressing willingness to be vaccinated and the remaining 28% reporting strong hesitancy or uncertainty.Citation58

During our analysis, it has been observed that vaccination plays a crucial role in the fight against the pandemic. Twitter provides a user-friendly interface where individuals can freely share their perspectives and engage in discussions on various public issues, including healthcare, politics, human rights, and personal experiences. This makes it an excellent platform for conducting opinion-based textual data analytics for various real-world applications. However, the rampant spread of misinformation related to the pandemic and vaccines through social media platforms has led the World Health Organization to coin the term “infodemic.” False claims regarding negative vaccine side effects, vaccine reliability, and other individual and community factors significantly influence the behavior of Twitter users toward vaccination. These factors not only diminish the severity of outbreaks but also pose challenges for healthcare agencies and workers striving to control the spread of a particular outbreak while promoting public health through the use of vaccines and other medical resources.

Challenges and future recommendations

Based on this analysis, we suggest the following recommendations, which open new directions for the research community to explore.

  • User-generated content on Twitter is often subject to bias, as it tends to reflect information that individuals feel comfortable sharing, which may not accurately represent the full range of their emotions and experiences. Among the 28 studies analyzed, no longitudinal study was found that linked the findings with users’ subjective experiences, whether self-reported or not, using text, image, or video data types. Therefore, there is a significant gap in research that can identify and address content biases that impact the collection and analysis of digital data for studying vaccination-related behaviors. It is crucial to conduct studies that can accurately determine and mitigate these biases in order to enhance the reliability and validity of behavioral analyses in the context of vaccination.

  • The anonymity provided by the internet allows individuals with stigmatized attributes to benefit from supportive communication on Twitter. However, the challenge of accurately determining user demographics raises unresolved questions about the population biases present among internet users with diverse cultural backgrounds or socioeconomic statuses. Demographic data for most digital platforms are not representative at a national level and tend to be skewed toward younger age groups and users with higher levels of education. Unfortunately, this important topic remains significantly underreported by the research community.

  • We found that no studies assessed digital media utilization for vulnerable populations (e.g., low-income, older adults, or people with a disability) who are under-represented on different digital platforms. Studies on detecting social bots are scarce.

  • For longitudinal analysis, a considerable timeframe is required to perform an enriched analysis of the different individual and community factors associated with vaccine hesitancy among Twitter users. However, only three of the included studiesCitation23,Citation40,Citation43 covered a timeframe longer than two years. Similarly, a considerable amount of Twitter data is required to perform a momentary qualitative analysis of user behavior, but among the 28 included articles, only one studyCitation25 used a dataset of more than one million tweets.

  • Among the studies included in the analysis, a mere four studies (14%) took into account scientific and social media platforms other than Twitter for behavioral analysis. To conduct a more comprehensive and in-depth psychological analysis of user behavior, it is essential to consider other scientific and nonscientific platforms as well. Exploring these platforms as part of future research presents a valuable challenge for further investigation.

  • During this review analysis, we found that the majority of the included studies (approximately 72%) performed their longitudinal analysis on Twitter data from Europe and the United States. There is no significant contribution toward longitudinal analysis from low-income countries like Pakistan, Bangladesh, India, and many other countries. This topic requires considerable attention from the research community.

  • Explainable artificial intelligence (XAI) pertains to the capability of AI systems to offer clear and transparent explanations for their decisions and actions. Within the domain of behavioral analysis, XAI plays a crucial role in improving transparency, accountability, trust, and the identification and rectification of errors. Consequently, it enhances the acceptance of AI-based methods for human psychological evaluations. However, despite the significance of explainability, the studies included in our analysis did not report any research specifically addressing the explainability aspect of various approaches utilized for longitudinal behavior analysis of Twitter users. This represents a notable gap in the current literature, emphasizing the need for future studies to explore and incorporate explainability techniques into the analysis of Twitter user behavior.

Strength and limitations

The strengths and limitations of this review are discussed briefly below. No major limitations were found for this scoping review, but some minor limitations threaten the validity of this work. Based on these limitations, some future recommendations are proposed that should be addressed in the near future to perform a more authentic behavior analysis of the public attitude toward vaccination using Twitter data.

Strength

The following are some of the main strengths and applications of this scoping review. Firstly, to the best of our knowledge, this is the first systematic review of its kind that analyzed the longitudinal literature reported in 10 well-reputed online repositories, namely, Scopus, PubMed, IEEE Xplore, ACM, Google Scholar, PsycINFO, Ovid, CINAHL, Springer Link, and Cochrane Library, to accumulate relevant studies and perform the review analysis. A considerable number of studies (705 articles) were synthesized to develop a final database of the 28 most relevant longitudinal studies for the assessment and evaluation process.

Secondly, this research work presents a concise summary of the key individual and community factors from Twitter data (Facebook, Reddit, YouTube, Instagram, etc., as secondary social media tools) that influence human behavior toward vaccination longitudinally. Moreover, it also outlines the algorithms and methodologies that can be employed to perform momentary sentiment and behavior analysis using social media (Twitter and other platforms) data. Furthermore, it explains different correlation functions and statistical methodologies that can be integrated to identify the correlations between highly discussed topics in communities and conclude users’ emotions and stances about vaccines (or other highly discussed topics).

Thirdly, after analyzing the literature, this research work identified the gaps in the extant literature and presented new research directions for the research community to explore. This will not only open new directions for researchers, but will also assist the healthcare workforce and health agencies in identifying public health sentiments about outbreaks or pandemics and following precautionary measures on a priority basis.

Limitations

The current review is a scoping review of longitudinal studies on users’ behavior toward vaccine hesitancy using Twitter community discussions during the last two decades. Thus, many studies using a cross-sectional design were not included in the review, and including these studies in the review might have given a different picture of how engagement has been studied across the past two decades. For example, several studies focusing on momentary engagement using cross-sectional designs have been published during the past 20 years and were not included in this review due to their cross-sectional design. However, the present review addressed the need to review the longitudinal research on users’ engagements in Twitter discussions which presents a first appraisal of the evidence base that can be further developed.

The longitudinal studies reported before 2006 were also skipped during the reviewing process because the prime concern of this scoping review is to analyze the longitudinal studies reported on Twitter data. Also, we skipped those studies that considered tweets instead of users for longitudinal analysis because the main focus of this scoping review is to identify the individual and community factors from Twitter discussions that caused a momentary shift in users’ behavior toward vaccination.

For this scoping review, we considered only 10 well-reputed online repositories for collecting longitudinal studies, namely, Scopus, PubMed, IEEE Xplore, ACM, Google Scholar, PsycINFO, Ovid, CINAHL, Springer Link, and Cochrane Library. Several other journals and publishers publish research work, but our main objective was to target highly peer-reviewed journals and libraries that publish medical-related research work. Moreover, the last search was performed on January 07, 2023, whereas new research studies are published daily.

Moreover, a notable limitation arises from the weak association observed between self-reported social media usage and actual utilization, as documented in reference. A considerable portion of these investigations gathered data from Twitter, primarily because Twitter has afforded researchers access to its data, rendering it more accessible than other social media platforms. Nevertheless, it’s important to note that this Twitter-centric dataset may not accurately reflect a randomly selected cross-section of the population, given that its user base predominantly comprises individuals aged 25 to 34, primarily located in the United States. Additionally, it is worth mentioning that our analysis did not encompass an evaluation of the potential influence of social media bots (automated accounts) disseminating misleading information in these studies. Furthermore, we did not delve into the role of social media algorithms in contributing to the formation of echo chambers.

During our analysis and assessment process, we considered only longitudinal studies that are reported using Twitter data. The longitudinal studies reported on medical records from manual hospital records, healthcare centers, surveys, or other verbal discussions are skipped during our review analysis. Also, we included the studies that evaluated users’ sentiments and emotions about vaccines with time, not the studies that reported effects on jobs and life standards.

Conclusion

This comprehensive scoping review investigates the patterns and trends influencing Twitter users’ vaccine behavior longitudinally, as reported in the literature. The research focuses on understanding the individual and community factors that influence vaccine acceptance and refusal, exploring changes in vaccination rates, and identifying techniques used to determine vaccine behaviors. For community factor detection, SVM and the GenLouvain method are the most frequently reported machine learning techniques, while R is among the most widely used software. VADER is extensively reported for sentiment analysis, while LIWC is used for emotion detection regarding vaccination. Cosine similarity and GloVe are the statistical methods increasingly reported for calculating the correlation between topics discussed on Twitter by different communities. LDA and Word2Vec are the techniques mostly reported for topic modeling, while the Mann-Whitney U test, Kruskal-Wallis test, and χ² test are the techniques used for stance calculation about vaccination among Twitter users. Community factors encompass social norms, values, beliefs, social support networks, local institutions, community engagement, and economic opportunities within a community. Individual factors include sentiment, emotions, user demographics, beliefs, attitudes, prevention, knowledge/awareness, and vaccine-specific factors such as misinformation and vaccine campaigns. The findings reveal the significance of mass media in influencing information-seeking behavior. While the majority of studies focused on Twitter, it is crucial to explore other digital platforms for a more comprehensive understanding of behavior analysis. The lack of studies reporting the explainability aspect of different approaches used for Twitter users’ longitudinal behavior analysis underscores the need for further research in this area.
Additionally, the demographic coverage of the studies revealed a notable dearth of information from African, Asian, and South American countries, highlighting the need for more diverse geographical representation in future studies.

Further research is needed to address gaps in community analysis and information exposure, as well as to improve the explainability of approaches used for Twitter user behavior analysis.
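
The conclusion names cosine similarity as one way to compare the topics discussed by different communities. As an illustration only, the following minimal sketch computes cosine similarity between two sparse term-count vectors. The community names, terms, and counts are hypothetical toy values and are not taken from any of the reviewed studies, and the reviewed studies may have computed similarity on different representations (for example, embedding vectors).

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: an empty vector has similarity 0.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical term counts for two communities (assumed toy data).
pro_vax = Counter({"vaccine": 4, "safe": 3, "protect": 2})
anti_vax = Counter({"vaccine": 4, "risk": 3, "mandate": 2})

similarity = cosine_similarity(pro_vax, anti_vax)
print(f"topic similarity: {similarity:.3f}")
```

A value near 1 would suggest that the two communities use largely the same vocabulary, while a value near 0 would suggest little overlap. Repeating the calculation per time window is one possible way to follow such overlap longitudinally.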

Authors’ contribution

S.K. and Z.S. contributed to the research concept and overall study design. S.K. and M.R.B. carried out the scoping review process, and Z.S. reviewed and resolved conflicts between the two authors. Z.S. supervised the study and provided technical assistance. S.K. contributed to writing the original draft. All the authors read and approved the final manuscript.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The complete data can be found in this research paper. Any inquiries or requests for additional information should be directed to the corresponding author (Z.S.).

Supplementary data

Supplemental data for this article can be accessed on the publisher’s website at https://doi.org/10.1080/21645515.2023.2278377.

Additional information

Funding

Open Access funding provided by the Qatar National Library.

Notes

[a] Twitter has been renamed “X”, but the name Twitter is used in this manuscript because the literature search used Twitter as the main keyword for collecting articles.

References

  • Fitzmaurice GM, Laird NM, Ware JH. Applied longitudinal analysis. Journal of the American Statistical Association, John Wiley & Sons; 2012.
  • Merchant RM, Lurie N. Social media and emergency preparedness in response to Novel Coronavirus. JAMA. 2020;323(20):2011–19. doi:10.1001/jama.2020.4469.
  • Kaur S, Kaul P, Zadeh PM. Monitoring the dynamics of emotions during COVID-19 using Twitter data. Procedia Comput Sci. 2020 Jan 01;177:423–30. doi:10.1016/j.procs.2020.10.056.
  • MacDonald NE. Vaccine hesitancy: Definition, scope and determinants. Vaccine. 2015 Aug 14;33(34):4161–4. doi:10.1016/j.vaccine.2015.04.036.
  • Johnson NF, Velásquez N, Restrepo J, Leahy R, Gabriel N, El Oud S, Zheng M, Manrique P, Wuchty S, Lupu Y. The online competition between pro- and anti-vaccination views. Nature. 2020 June 01;582(7811):230–3. doi:10.1038/s41586-020-2281-1.
  • Cinelli M, Quattrociocchi W, Galeazzi A, Valensise CM, Brugnoli E, Schmidt AL, Zola P, Zollo F, Scala A. The COVID-19 social media infodemic. Sci Rep. 2020;10(1):1–0. doi:10.1038/s41598-020-73510-5.
  • Kouzy R, Abi Jaoude J, Kraitem A, El Alam MB, Karam B, Adib E, Zarka J, Traboulsi C, Akl E, Baddour K. Coronavirus goes viral: quantifying the COVID-19 misinformation epidemic on Twitter. Cureus. 2020 Mar 13;12(3):e7255. doi:10.7759/cureus.7255.
  • Moosa AS, Wee YMS, Jaw MH, Tan QF, Tse WLD, Loke CY, Ee GLA, Ng CCD, Aau WK, Koh YLE, et al. A multidisciplinary effort to increase COVID-19 vaccination among the older adults. Front Public Health. 2022;10:904161. doi:10.3389/fpubh.2022.904161.
  • Brewer NT, Chapman GB, Rothman AJ, Leask J, Kempe A. Increasing vaccination: Putting psychological science into action. Psychol Sci Public Inter. 2017;18(3):149–207. doi:10.1177/1529100618760521.
  • Bedford H, Attwell K, Danchin M, Marshall H, Corben P, Leask J. Vaccine hesitancy, refusal and access barriers: The need for clarity in terminology. Vaccine. 2018;36(44):6556–8. doi:10.1016/j.vaccine.2017.08.004.
  • National Vaccine Advisory Committee. Assessing the state of vaccine confidence in the United States: recommendations from the National Vaccine Advisory Committee: Approved by the National Vaccine Advisory Committee on June 10, 2015. Public Health Rep. 2015;130(6):573–95. doi:10.1177/003335491513000606.
  • Schmid P, Rauber D, Betsch C, Lidolt G, Denker M-L, Cowling BJ. Barriers of influenza vaccination intention and behavior–a systematic review of influenza vaccine hesitancy, 2005–2016. PloS One. 2017;12(1):e0170550. doi:10.1371/journal.pone.0170550.
  • Alamoodi AH, Zaidan B, Al-Masawa M, Taresh SM, Noman S, Ahmaro IYY, Garfan S, Chen J, Ahmed MA, Zaidan AA, et al. Multi-perspectives systematic review on the applications of sentiment analysis for vaccine hesitancy. Comput Biol Med. 2021;139:104957. doi:10.1016/j.compbiomed.2021.104957.
  • Zhao Y, Leach LS, Walsh E, Batterham PJ, Calear AL, Phillips C, Olsen A, Doan T, LaBond C, Banwell C. COVID-19 and mental health in Australia – a scoping review. BMC Public Health. 2022 June 15;22(1):1200. doi:10.1186/s12889-022-13527-9.
  • Bussink-Voorend D, Hautvast JL, Vandeberg L, Visser O, Hulscher ME. A systematic literature review to clarify the concept of vaccine hesitancy. Nat Hum Behav. 2022;6(12):1634–48. doi:10.1038/s41562-022-01431-6.
  • Skafle I, Nordahl-Hansen A, Quintana DS, Wynn R, Gabarron E. Misinformation about COVID-19 vaccines on social media: rapid review. J Med Internet Res. 2022 Aug 4;24(8):e37367. doi:10.2196/37367.
  • Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, Moher D, Peters MDJ, Horsley T, Weeks L, et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467–73. doi:10.7326/M18-0850.
  • Newman PA, Reid L, Tepjan S, Fantus S, Allan K, Nyoni T, Guta A, Williams CC. COVID-19 vaccine hesitancy among marginalized populations in the U.S. and Canada: Protocol for a scoping review. PloS One. 2022;17(3):e0266120. doi:10.1371/journal.pone.0266120.
  • Miyazaki K, Uchiba T, Tanaka K, Sasahara K. Aggressive behaviour of anti-vaxxers and their toxic replies in English and Japanese. Humanit Soc Sci Commun. 2022;9(1). doi:10.1057/s41599-022-01245-x.
  • Lanier HD, Diaz MI, Saleh SN, Lehmann CU, Medford RJ. Analyzing COVID-19 disinformation on Twitter using the hashtags #scamdemic and #plandemic: Retrospective study. PloS One. 2022;17(6):e0268409. doi:10.1371/journal.pone.0268409.
  • Zhang C, Xu S, Li Z, Liu G, Dai D, Dong C. The evolution and disparities of online attitudes toward COVID-19 vaccines: year-long longitudinal and cross-sectional study. J Med Internet Res. 2022;24(1):e32394. doi:10.2196/32394.
  • Islam MS, Kamal A-HM, Kabir A, Southern DL, Khan SH, Hasan SMM, Sarkar T, Sharmin S, Das S, Roy T, et al. COVID-19 vaccine rumors and conspiracy theories: The need for cognitive inoculation against misinformation to improve vaccine adherence. PloS One. 2021;16(5):e0251605. doi:10.1371/journal.pone.0251605.
  • Mitra T, Counts S, Pennebaker JW. Understanding anti-vaccination attitudes in social media. [Georgia Institute of Technology, United States, Microsoft Research, United States, University of Texas at Austin, United States]. AAAI Press; 2016.
  • Bradford NJ, Amani B, Walker VP, Sharif MZ, Ford CL. Barely tweeting and rarely about racism: Assessing US state health department Twitter use during the COVID-19 vaccine rollout. Ethn Dis. 2022;32(3):257–64. doi:10.18865/ed.32.3.257.
  • Xie Z, Wang X, Jiang Y, Chen Y, Huang S, Ma H, Li D. Public perception of COVID-19 vaccines on Twitter in the United States. medRxiv. 2021.
  • Calac AJ, Haupt MR, Li Z, Mackey T. Spread of COVID-19 vaccine misinformation in the ninth inning: Retrospective observational infodemic study. JMIR Infodemiol. 2022 Jan;2(1):e33587. doi:10.2196/33587.
  • Margus C, Brown N, Hertelendy AJ, Safferman MR, Hart A, Ciottone GR. Emergency Physician Twitter use in the COVID-19 pandemic as a potential predictor of impending surge: Retrospective observational study. J Med Internet Res. 2021 July 14;23(7):e28615. doi:10.2196/28615.
  • London J. Tweeting the alarm: Exploring the efficacy of Twitter as a serial transmitter during the COVID-19 pandemic. Virtual Event, Germany; 2021. doi:10.1145/3458026.3462159
  • Yuan X, Crooks AT. Examining online vaccination discussion and communities in Twitter. Copenhagen, Denmark; 2018. doi:10.1145/3217804.3217912
  • Sharevski F, Devine A, Jachim P, Pieroni E. Meaningful context, a red flag, or both? Preferences for enhanced misinformation warnings among US Twitter users. Karlsruhe, Germany; 2022. doi:10.1145/3549015.3555671.
  • Addawood A. Usage of scientific references in MMR vaccination debates on Twitter. 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Barcelona, Spain; 2020. p. 971–9. doi:10.1109/ASONAM.2018.8508385.
  • Lutkenhaus R, Jansz J, Bouman M. Tailoring in the digital era: Stimulating dialogues on health topics in collaboration with social media influencers. Digital Health. 2019;5:1–11. doi:10.1177/2055207618821521.
  • Hussain A, Tahir A, Hussain Z, Sheikh Z, Gogate M, Dashtipour K, Ali A, Sheikh A. Artificial Intelligence–Enabled Analysis of Public Attitudes on Facebook and Twitter Toward COVID-19 Vaccines in the United Kingdom and the United States: Observational Study. J Med Internet Res. 2021;23(4):e26627. doi:10.2196/26627.
  • Zhou X, Zhang X, Larson HJ, de Figueiredo A, Jit M, Fodeh S, Vermund SH, Zang S, Lin L, Hou Z. Global spatiotemporal trends and determinants of COVID-19 vaccine acceptance on Twitter: A multilingual deep learning study in 135 countries and territories. medRxiv. 2022;p. 2022.11. 14.22282300.
  • Dunn AG, Leask J, Zhou X, Mandl KD, Coiera E. Associations between exposure to and expression of negative opinions about human papillomavirus vaccines on social media: An observational study. J Med Internet Res. 2015;17(6):e144–e144. doi:10.2196/jmir.4343.
  • Massey PM, Budenz A, Leader A, Fisher K, Klassen AC, Yom-Tov E. What drives health professionals to tweet about #HPVvaccine? Identifying strategies for effective communication. Prev Chronic Dis. 2018;15:E26–E26. doi:10.5888/pcd15.170320.
  • Ruiz-Núñez C, Segado-Fernández S, Jiménez-Gómez B, Hidalgo PJJ, Magdalena CSR, Pollo MDCÁ, Santillán-Garcia A, Herrera-Peco I. Bots’ activity on COVID-19 pro and anti-vaccination networks: Analysis of Spanish-written messages on Twitter. Vaccines (Basel). 2022;10(8):1240. doi:10.3390/vaccines10081240.
  • Chopra H, Vashishtha A, Pal R, Tyagi A, Sethi T. Mining trends of COVID-19 vaccine beliefs on Twitter with Lexical Embeddings. Ithaca: Cornell University Library; 2021.
  • Calo WA, Gilkey MB, Shah PD, Dyer A-M, Margolis MA, Dailey SA, Brewer NT. Misinformation and other elements in HPV vaccine tweets: An experimental comparison. J Behav Med. 2021 June;44(3):310–9. doi:10.1007/s10865-021-00203-3.
  • Du J, Luo C, Shegog R, Bian J, Cunningham RM, Boom JA, Poland GA, Chen Y, Tao C. Use of deep learning to analyze social media discussions about the human papillomavirus vaccine. JAMA Netw Open. 2020;3(11):e2022025–e2022025. doi:10.1001/jamanetworkopen.2020.22025.
  • Evans SL, Jones R, Alkan E, Sichman JS, Haque A, de Oliveira FB, Mougouei D. The emotional impact of COVID-19 news reporting: A longitudinal study using natural language processing. Human Behav Emerg Technol. 2023 Mar 09;2023:7283166. doi:10.1155/2023/7283166.
  • Osborne MT, Malloy SS, Nisbet EC, Bond RM, Tien JH. Sentinel node approach to monitoring online COVID-19 misinformation. Sci Rep. 2022;12(1):1–15. doi:10.1038/s41598-022-12450-8.
  • Du J, Cunningham RM, Xiang Y, Li F, Jia Y, Boom JA, Myneni S, Bian J, Luo C, Chen Y, et al. Leveraging deep learning to understand health beliefs about the human papillomavirus vaccine from social media. npj Digit Med. 2019 Apr 15;2(1):27. doi:10.1038/s41746-019-0102-4.
  • Fazel S, Zhang L, Javid B, Brikell I, Chang Z. Harnessing Twitter data to survey public attention and attitudes towards COVID-19 vaccines in the UK. Sci Rep. 2021 Dec 14;11(1):23402. doi:10.1038/s41598-021-02710-4.
  • Wankhade M, Rao ACS. Opinion analysis and aspect understanding during COVID-19 pandemic using BERT-Bi-LSTM ensemble method. Sci Rep. 2022 Oct 12;12(1):17095. doi:10.1038/s41598-022-21604-7.
  • Liew TM, Lee CS. Examining the Utility of social media in COVID-19 vaccination: Unsupervised learning of 672,133 Twitter posts. JMIR Public Health Surveill. 2021 Nov 3;7(11):e29789. doi:10.2196/29789.
  • Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan—a web and mobile app for systematic reviews. Syst Rev. 2016 Dec 05;5(1):210. doi:10.1186/s13643-016-0384-4.
  • Liu S, Liu J. Understanding behavioral intentions toward COVID-19 vaccines: Theory-based content analysis of tweets. J Med Internet Res. 2021 May 12;23(5):e28118. doi:10.2196/28118.
  • Grandjean M, Mauro A. A social network analysis of Twitter: Mapping the digital humanities community. Cogent Arts Humanit. 2016 Dec 31;3(1):1171458. doi:10.1080/23311983.2016.1171458.
  • MacDonald NE. Vaccine hesitancy: Definition, scope and determinants. Vaccine. 2015 Aug 14;33(34):4161–4. doi:10.1016/j.vaccine.2015.04.036.
  • Gupta SK, Singh DP. Seed community identification framework for community detection over social media. Arab J Sci Eng. 2023 Feb 01;48(2):1829–43. doi:10.1007/s13369-022-07020-z.
  • Khan S, Khan HU, Nazir S. Systematic analysis of healthcare big data analytics for efficient care and disease diagnosing. Sci Rep. 2022 Dec 26;12(1):22377. doi:10.1038/s41598-022-26090-5.
  • Park EH, Storey VC. Emotion ontology studies: A framework for expressing feelings digitally and its application to sentiment analysis. ACM Comput Surv. 2023;55(9):1–38. Article 181. doi: 10.1145/3555719.
  • Mitra T, Counts S, Pennebaker J. Understanding anti-vaccination attitudes in social media. Proc Int AAAI Conf Web Soc Media. 2021 Apr 08;10(1):269–78. doi:10.1609/icwsm.v10i1.14729.
  • Dang NC, Moreno-García MN, De la Prieta F. Sentiment analysis based on deep learning: A comparative study. Electronics. 2020;9(3):483. doi:10.3390/electronics9030483.
  • Piedrahita-Valdés H, Piedrahita-Castillo D, Bermejo-Higuera J, Guillem-Saiz P, Bermejo-Higuera JR, Guillem-Saiz J, Sicilia-Montalvo JA, Machío-Regidor F. Vaccine hesitancy on social media: Sentiment analysis from June 2011 to April 2019. Vaccines. 2021;9(1):28. doi:10.3390/vaccines9010028.
  • Lazarus JV, Ratzan SC, Palayew A, Gostin LO, Larson HJ, Rabin K, Kimball S, El-Mohandes A. A global survey of potential acceptance of a COVID-19 vaccine. Nat Med. 2021 Feb 01;27(2):225–8. doi:10.1038/s41591-020-1124-9.
  • Freeman D, Loe BS, Chadwick A, Vaccari C, Waite F, Rosebrock L, Jenner L, Petit A, Lewandowsky S, Vanderslott S, Innocenti S. COVID-19 vaccine hesitancy in the UK: The Oxford coronavirus explanations, attitudes, and narratives survey (oceans) II. Psychol Med. 2022;52(14):3127–41.