Research Article

Linking survey with Twitter data: examining associations among smartphone usage, privacy concern and Twitter linkage consent

Received 20 Apr 2023, Accepted 19 Dec 2023, Published online: 04 Jan 2024

ABSTRACT

Linking survey and social media data has gained popularity. However, obtaining consent from respondents to link social media data is a known challenge. Using data from a nationally representative survey of the U.K., this study investigated whether respondents’ (a) activity frequency, (b) activity variety and (c) technical skills with smartphones are associated with consent to link Twitter data to survey responses. Additionally, this study explored the mediating role of privacy and security concern and the moderating effects of age, gender, employment and educational level to better understand the influence of privacy concern on Twitter linkage consent. Results showed that activity variety with smartphones is positively associated with Twitter linkage consent, and privacy concern mediated the effects of activity frequency and activity variety with smartphones on linkage consent. Age and employment status moderated the associations between privacy concern and linkage consent, with younger and employed respondents being more likely to be affected by privacy concern.

Social media platforms, such as Twitter (also known as X), Facebook and Instagram, allow users to share updates and experiences, providing a wealth of real-time data for researchers to study human actions, thoughts, and feelings (Fiesler & Proferes, 2018). However, due to the public accessibility of social media information and the associated risks of exposing personal privacy, researchers typically gather and examine these data in aggregate, rather than attributing them to specific individuals. This approach, though it protects privacy, may limit the potential utility of social media data for research purposes (Townsend & Wallace, 2017).

In recent years, a growing number of studies have sought to link social media data with other data sources in relatively secure settings. This involves storing and processing different types of data separately until they are needed for linkage, using linked data only for specific research inquiries, and restricting access to identifiable information (Sloan et al., 2020). By minimizing the risk of disclosure, public and anonymized data sources can supplement each other (Hughes et al., 2021). Data linkage between different types of data depends on obtaining consent from respondents (Knies et al., 2012). This is critical because informed consent is a pivotal ethical principle in research, demonstrating respect for and protection of respondents. The willingness of participants to engage in research is also heavily influenced by whether their consent is requested (Fiesler & Proferes, 2018). Therefore, understanding the factors that influence respondents’ consent decisions can help identify potential biases in linked data sources and improve consent rates by addressing these concerns in survey design (Ohme et al., 2021).

Previous research has offered some insights into the impact of respondents’ demographics on data linkage consent. However, there remains limited literature on whether individuals’ behaviors, particularly smartphone usage, could affect their consent decisions. Given that the smartphone is increasingly becoming the primary device through which individuals integrate their online and offline activities (Klimmt et al., 2017; Wenz & Keusch, 2023), the lack of understanding of the effect of smartphone usage may impair the interpretation of linked datasets. Additionally, the continuous connectivity and omnipresent nature of smartphones might affect users’ privacy concerns (Ketelaar & Van Balen, 2018), the primary factor underlying their hesitation to grant consent for sharing personal data (Fiesler & Proferes, 2018). In light of this, this study focuses on respondents’ consent to link Twitter data with survey responses, aiming (a) to investigate the impact of smartphone usage (i.e. activity frequency, activity variety, and technical skills with smartphones) on Twitter linkage consent, and (b) to examine the mediating role of privacy and security concern between smartphone usage and Twitter linkage consent.

Moreover, previous research on the effects of demographic factors on data linkage consent has often yielded inconsistent results. This may be due to the presence of other variables that interact with demographic features. To explore this, this study examines the moderating role of age, gender, employment status and educational level in the association between privacy concern and Twitter linkage consent. By doing so, this study contributes to a more comprehensive understanding of the factors that affect data linkage consent as well as the potential systematic bias of the linked dataset due to these factors. The findings of this study will also provide insights for designing survey questionnaires to improve data linkage consent rates.

Rationale

Linking survey with Twitter data

Social media has become an integral part of modern society, profoundly altering interpersonal communication in today’s world (Hedman & Djerf-Pierre, 2013). As one of the most popular social media platforms, Twitter enables users to share brief, real-time messages, as well as interact through likes, comments, and retweets. With an average of over 500 million daily tweets (Beveridge, 2022), Twitter has become a crucial source of publicly available information that can be utilized for various research purposes, such as identifying emerging online topics, tracking online sentiments, and mapping social networks (Ovadia, 2009).

There has been growing research interest in combining Twitter information with survey responses, as these two types of data can supplement each other and offer a more comprehensive understanding of individuals’ online and offline behaviors (Eady et al., 2019). For example, with the linked dataset, it becomes possible to investigate respondents’ online behaviors objectively, rather than relying solely on self-reported measurements (Hughes et al., 2021). Such linkage also provides an opportunity to validate survey responses by comparing them with information retrieved from respondents’ tweets (Sloan et al., 2015). Moreover, advanced computational methods can be applied to tweet data to build machine learning models, which can then be refined and validated using survey data (Braithwaite et al., 2016). Overall, merging survey data with Twitter data enables researchers to explore a wider array of research inquiries concerning the interplay between online and offline activities, as well as to pursue new avenues for methodological advancement.

To ethically and practically access and link Twitter data with survey responses, it is necessary to obtain respondents’ consent. Such consent is often requested through a series of survey questions, along with information detailing the purpose, procedure and handling of data collection and storage (Stier et al., 2020). Despite researchers’ efforts to develop ethical guidelines for securing and storing linkage data (Sloan et al., 2020), not all individuals grant consent for their data to be linked in this manner, and consent rates vary greatly across studies (Mneimneh et al., 2021). As consent to data linkage is non-probabilistic and the linked sample may not represent the entire population, it is crucial to investigate which factors may affect data linkage consent. By doing so, researchers can better understand the limitations of linked datasets when drawing conclusions, and can tailor the design of survey questionnaires to mitigate the influence of these factors.

Effects of smartphone usage on Twitter linkage consent

In this study, we first propose that individuals’ smartphone usage may influence their willingness to grant consent for Twitter linkage. Smartphones are the most widely used mobile devices for accessing social media (Dean, 2022; Parsons, 2022). As such, individuals who use smartphones more intensively or possess greater technical expertise may be more familiar with the functionality of social media and the potential benefits of exploiting social media data. Consequently, they might be more inclined to grant consent for Twitter linkage. Findings from prior empirical studies lend support to this argument. For instance, Silber et al. (2022) discovered that individuals’ affinity with technology is positively related to their willingness to share Twitter data. Findings related to consent in other data linkage contexts indicate a comparable pattern, albeit with potential differences in the underlying motivations for sharing various data types. For instance, Elevelt et al. (2019) discovered a positive relationship between variety of smartphone usage and willingness to share GPS data. Máté et al. (2023) noted that those who use smartphones for a wider range of activities tend to participate more in passive digital data collection. Wenz et al. (2019) observed that individuals who performed a greater variety of activities with their smartphones were more likely to take part in mobile data collection tasks, such as connecting smartphones via Bluetooth and using smartphone cameras to take photos. To delve deeper into this association, we examine smartphone usage habits from three perspectives and propose the following (see Figure 1 for the summarized hypotheses):

Figure 1. Summary of hypotheses.

H1a, H1b and H1c represent total effects between the corresponding variables and consent to link Twitter.

H1:

Respondents’ (a) activity frequency, (b) activity variety and (c) technical skills with smartphone are positively associated with Twitter linkage consent.

Privacy and security concern as mediator

Additionally, privacy and security concerns have been a major contributing factor to individuals’ reluctance to link public with anonymous data (Mneimneh et al., 2021; Otto & Kruikemeier, 2023; Sala et al., 2012). Such concerns often stem from the fear of personal data breaches and a lack of clarity about how different data types are merged (Clarke et al., 2021). Meanwhile, privacy concern is not inherently fixed, but can be influenced by other factors (Sipior et al., 2014; Smith et al., 2011). Smith et al.’s (2011) macro model of APCO (Antecedents ➔ Privacy Concerns ➔ Outcomes) suggests that privacy concern can act as both a dependent and an independent variable. This study proposes that smartphone usage may play a crucial role in shaping respondents’ privacy concerns, which in turn influence their consent to data linkage. The rationale is outlined below.

With the increasing prevalence of mobile internet services, smartphones have emerged as the primary devices for accessing online content and services, reflecting a growing trend toward mobile-first digital experiences (Tsetsi & Rains, 2017). Those who frequently use smartphones are typically more engaged online. Prior studies have indicated that more frequent and adept internet users often exhibit less concern about privacy and security (Metzger, 2004; Yao et al., 2007). It is thus plausible that higher frequency, variety and proficiency in smartphone usage, implying greater internet exposure, may lead to lower levels of privacy and security concern. Accordingly, we put forward that:

H2:

Respondents’ (a) activity frequency, (b) activity variety and (c) technical skills with smartphone are negatively associated with privacy and security concerns.

H3:

Respondents’ privacy and security concerns are negatively associated with Twitter linkage consent.

Based on H2 and H3, (a) activity frequency, (b) activity variety and (c) technical skills with smartphones will be negatively related to individuals’ privacy concerns, which, in turn, will be negatively associated with their consent to Twitter linkage. In other words, smartphone usage may influence Twitter linkage consent through privacy concern, which aligns with the logic of the mediation model (Hayes, 2009; Wu & Zumbo, 2008). Prior studies on consent to adopt advanced technology for data collection, such as contact tracing apps and network devices, also suggested that privacy concern could play a crucial mediating role in individuals’ decision-making (Alraja et al., 2019; Wang et al., 2022). Accordingly, we propose:

H4:

Privacy and security concerns mediate the associations between (a) activity frequency, (b) activity variety, (c) technical skills with smartphone and Twitter linkage consent.

Potential moderating effects by demographics

In addition to investigating the impact of individuals’ smartphone usage on data linkage consent, this study also seeks to address the inconsistency in the reported influence of individuals’ characteristics on data linkage consent. For instance, some scholars found that males were more likely to grant consent to link survey data with social media data, such as Twitter, Facebook, and Spotify (Mneimneh et al., 2021; Silber et al., 2022). Similar outcomes were also observed in research examining linkage consent between survey and medical records (Knies et al., 2012; Sala et al., 2012). In contrast, Dunn et al. (2004) discovered that the consent rate was lower in males than in females until around the age of 70 years. Several studies also revealed no significant relationship between gender and data linkage consent (Mneimneh et al., 2021). These discrepancies may be due to the influence of other variables on the relationship between demographic features and data linkage consent. This study places its emphasis on privacy and security concerns, and specifically examines how demographic features might moderate the association between these concerns and Twitter linkage consent.

First, respondents’ age may moderate the negative effect of privacy and security concern on data linkage consent. Previous research has indicated that an individual’s propensity to engage in data linkage declines with age (Al Baghal et al., 2020; Mneimneh et al., 2021). This study further proposes that, among individuals with the same level of privacy concern, older respondents might be more aware of the potential risks and thus more cautious about sharing Twitter accounts. In contrast, younger respondents, who are better adapted to technological realities, may feel more comfortable with innovations and be better able to handle privacy concern, leading them to be more inclined to consent to data linkage. Therefore, we propose that:

H5:

The negative association between privacy and security concern and Twitter linkage consent will be moderated by age; the negative association will be stronger for older respondents than middle-aged and younger respondents.

Second, the association between privacy and security concern and data linkage consent might be moderated by respondents’ employment status. According to privacy calculus theory (Wang et al., 2016), individuals rationally weigh the potential benefits and risks of personal information disclosure. Self-disclosure often occurs when perceived benefits outweigh the associated costs. In the case of this study, employed respondents, who often possess greater socioeconomic resources, might have more at stake if their privacy and security are compromised. Thus, they may avoid disclosing personal information, and the impact of privacy concern on Twitter linkage consent could be more pronounced. In contrast, unemployed individuals may not have as much at stake, even when their level of privacy and security concern is similar. Previous research on attitudes toward data linkage also showed that respondents may support data linkage in general, but express reservations about linking employment-related data due to its potentially higher risks (Xafis, 2015). Accordingly, we put forward that:

H6:

The negative association between privacy and security concern and Twitter linkage consent will be moderated by employment status; the negative association will be stronger for employed respondents than unemployed respondents.

Additionally, gender may moderate the association between privacy concern and consent for linking Twitter data. According to sociocultural theory, males and females are often assigned different social norms and expectations (Eagly & Wood, 1999). These societal expectations can potentially affect their levels of caution and willingness to disclose personal information. Specifically, females tend to demonstrate higher levels of caution and adopt more protective actions when confronted with external risks (McLean & Anderson, 2009). In contrast, males, who are often expected to be more risk-taking, may engage in less protective behavior (McLean & Anderson, 2009). Prior empirical research also suggested that females tend to express more concern about online privacy and, consequently, tend to adopt more self-protective online behaviors than males (Dommeyer & Gross, 2003; Hoy & Milne, 2010). In the current study, females may therefore be more hesitant than males to share Twitter information due to concerns about privacy and security. Accordingly, we put forward:

H7:

The negative association between privacy and security concern and Twitter linkage consent will be moderated by gender; the negative association will be stronger for females than for males.

Lastly, educational level may serve as a moderator of the association between privacy concern and Twitter linkage consent. Research on online privacy management has shown that individuals with lower educational attainment are less likely to take measures to protect their online privacy (Boerman et al., 2021; Chai et al., 2009; Smit et al., 2014). This could be because less educated individuals may lack the necessary knowledge of and familiarity with specific privacy protection measures (Boerman et al., 2021). Given this, it is plausible that those with lower levels of education might not avoid potential privacy risks and tend to grant consent to link Twitter data, even if they have a similar level of privacy concern. Comparatively, individuals with higher education are more likely to translate their privacy knowledge into actual preventive actions and, thus, might be more inclined to decline data linkage. Therefore, we put forward:

H8:

The negative association between privacy and security concern and Twitter linkage consent will be moderated by educational level; the negative association will be stronger for individuals with higher educational levels compared to those with lower and medium educational levels.

Methods

Data collection

The data for this study were sourced from the UK Understanding Society Innovation Panel (IP), a nationally representative survey of the United Kingdom. The IP survey is an ongoing longitudinal study that tracks individuals within households (University of Essex ISER, 2021). The initial wave of the IP survey began in 2008, targeting households across England, Scotland and Wales with a stratified and geographically clustered design. First, sectors were randomly selected in a systematic manner, with the probability of selection proportional to the sector’s population size. Then, within each selected sector, addresses were chosen with a systematic random sampling approach, resulting in a sample of 2,760 addresses. For each sampled address, interviewers identified the sampled persons. In the subsequent waves, only households that had participated in the previous wave were revisited. To replenish the sample, new addresses were added, with 960 in wave 4, 1,560 in wave 7, and 960 in wave 10.

This study used the dataset from Wave 10 of the IP survey, gathered in May 2017. Respondents were first asked if they had a Twitter account. Those who confirmed having an account were subsequently asked for consent to link their Twitter data with survey responses. The sample for Wave 10 comprised 2,570 individuals. Of these, 513 (approximately 19.96%) reported they had a Twitter account. Among those Twitter account holders, 171 individuals agreed to the linkage of Twitter data, while 315 individuals declined (see Footnote 1).
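The proportions implied by these counts can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in this paragraph; the helper name `pct` and the variable names are illustrative, not part of the study. Note that 171 + 315 = 486, so 27 account holders have no consent decision reported in this passage.

```python
def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Counts as reported in the text for Wave 10.
wave10_sample = 2570
account_holders = 513
consented = 171
declined = 315

# Share of the Wave 10 sample reporting a Twitter account.
share_with_account = pct(account_holders, wave10_sample)

# Consent rate among holders with a recorded decision (agreed or declined).
consent_rate_decided = pct(consented, consented + declined)

# Consent rate relative to all account holders.
consent_rate_all_holders = pct(consented, account_holders)

# Account holders without a reported decision in this passage.
no_decision = account_holders - consented - declined

print(f"{share_with_account:.2f}% of the sample reported an account")
print(f"{consent_rate_decided:.1f}% of decided holders consented")
print(f"{consent_rate_all_holders:.1f}% of all holders consented")
print(f"{no_decision} holders without a reported decision")
```

Which denominator is appropriate depends on how the undecided cases are treated, which this passage does not specify.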

After listwise deletion of missing values, 454 of 513 respondents remained. In the final sample, the mean age was 37, with a range from 16 to 76. Males accounted for 46.3 percent (n = 210), while females accounted for 53.7 percent (n = 244). Married respondents accounted for 57.7 percent (n = 262), while single respondents accounted for 42.3 percent (n = 192). Most respondents were employed (n = 352, 77.5%), while 102 were not (22.5%). The largest group of respondents had obtained a degree (n = 170, 37.4%), followed by A-level (n = 134, 29.5%), GCSE (n = 75, 16.5%) and other higher degree (n = 53, 11.7%). Respondents’ self-reported personal monthly gross income ranged from 0 to 15,000 GBP, with a mean of 2,082.68 GBP. Most respondents lived in urban areas (n = 344, 75.8%) as opposed to rural areas (n = 110, 24.2%).

Variable construction

For all study variables, higher scores represented higher values of the variable. The reliability of multi-item variables was evaluated using McDonald’s omega (ω). For variables that included four or more items, confirmatory factor analyses (CFAs) were performed. See the supplementary material for descriptive statistics for all study variables.

Activity frequency with smartphone

One item, ‘How often do you use a smartphone for activities other than phone calls or text messaging?’, was used to evaluate respondents’ activity frequency with smartphones. Respondents rated their answers using a 4-point scale (1 = Every day; 4 = Once a month or less). This item was reverse coded, so that higher scores indicated a higher frequency of smartphone use.

Activity variety with smartphone

Twelve items were used to measure activity variety with smartphone (e.g. ‘Do you use your smartphone for browsing websites/playing games/shopping’). Each item was dummy coded (1 = Yes; 0 = No). The scores for all 12 items were then summed up, with a higher score indicating a wider range of activities conducted using smartphone.
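The coding rules for the two usage measures described above can be sketched as follows. This is a minimal illustration, assuming raw answers are stored as integers; the function names `reverse_code` and `activity_variety` are hypothetical and do not come from the study’s materials.

```python
def reverse_code(score, low=1, high=4):
    """Reverse a rating on a low..high scale (1 <-> 4, 2 <-> 3 by default)."""
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside the {low}-{high} scale")
    return low + high - score

def activity_variety(answers):
    """Sum twelve dummy-coded items (1 = Yes, 0 = No) into a 0-12 score."""
    if len(answers) != 12:
        raise ValueError("expected answers to exactly 12 items")
    if any(a not in (0, 1) for a in answers):
        raise ValueError("items must be dummy coded as 0 or 1")
    return sum(answers)

# Hypothetical respondent: uses a smartphone every day (raw answer 1)
# and reports 'Yes' on 5 of the 12 activity items.
frequency_score = reverse_code(1)
variety_score = activity_variety([1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0])
print(frequency_score, variety_score)
```

After reverse coding, the most frequent users receive the highest frequency score, which matches the convention that higher scores represent higher values of each variable.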

Technical skill with smartphone

One item, ‘How would you rate your skills of using a smartphone?’, was used to assess respondents’ skill levels with smartphones. The answers ranged from ‘1’ (Beginner) to ‘5’ (Advanced).

Privacy and security concern

Eight items were employed to measure privacy and security concerns. Using a 5-point Likert scale (1 = Not at all concerned; 5 = Extremely concerned), respondents answered questions about how concerned they would be about the privacy and security of providing information in multiple ways (e.g. completing an online questionnaire in a mobile browser, using the smartphone camera, or sharing the smartphone’s GPS position). CFA results suggested that the eight items formed a unidimensional factor: χ2(12) = 37.76, p < .001, χ2/df = 3.15, CFI = .99, RMSEA = .07, SRMR = .03. The reliability analysis showed acceptable internal consistency, with a McDonald’s omega (ω) of .91 [CI95% = .89, .92].
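For readers unfamiliar with these statistics, the sketch below shows how McDonald’s omega is commonly computed for a one-factor model and how the normed chi-square is obtained. It assumes standardized loadings and uncorrelated residuals; the loadings of 0.70 are hypothetical placeholders, since the item-level loadings are not reported in this section, so the resulting omega is not expected to match the reported .91.

```python
def mcdonald_omega(loadings):
    """Omega for a single factor with standardized loadings.

    Assumes uncorrelated residuals, so each residual variance is 1 - loading^2.
    """
    common = sum(loadings) ** 2
    residual = sum(1.0 - l ** 2 for l in loadings)
    return common / (common + residual)

def normed_chi_square(chi2, df):
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df

# Hypothetical loadings for eight items (NOT taken from the study).
hypothetical_loadings = [0.70] * 8
omega = mcdonald_omega(hypothetical_loadings)

# Fit values as reported in the text for the privacy-concern CFA.
ratio = normed_chi_square(37.76, 12)

print(f"omega with hypothetical loadings: {omega:.3f}")
print(f"chi-square / df: {ratio:.2f}")
```

The normed chi-square reproduces the reported value of 3.15 from the reported χ2 and degrees of freedom.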

Findings

Structural equation modelling

To test the hypotheses, a structural model was constructed using lavaan (version 0.6-12) in R. This model included activity frequency, activity variety and technical skills with smartphones as exogenous variables, privacy and security concern as the mediator, and consent to Twitter linkage as the endogenous variable. Demographic variables, including age, gender, income, education, marital status, employment status, and residential area, were incorporated as control variables. These were retained in the model if they showed a significant association with any of the variables under study. Mediation analysis was conducted by investigating the indirect effects of (a) activity frequency, (b) activity variety and (c) technical skills with smartphones on Twitter linkage consent through privacy and security concern. This approach is commonly used to examine mediation effects (e.g. Hayes, 2009; Shrout & Bolger, 2002). The analysis was carried out by computing bias-corrected bootstrapped confidence intervals with 5,000 random samples. To address any missing data, full information maximum likelihood estimation was utilized (Graham, 2009). First, goodness-of-fit for the measurement model was examined, where all latent factors were free to covary. The output indicated that the magnitudes of the relationships between items and factors were adequate for all variables and that all factor loadings were statistically significant. Then, goodness-of-fit for the structural model was examined (see Footnote 2). Results suggested acceptable fit for the proposed model, χ2(37) = 93.95, p < .001, χ2/df = 2.53, RMSEA = .06 [CI90 = .044, .073], SRMR = .03, CFI = .97 (see Footnote 3).
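The idea of bootstrapping an indirect effect can be illustrated with a small, self-contained sketch. It is deliberately simplified relative to the analysis described above: it uses synthetic data, a single predictor, a continuous outcome with least-squares estimates, and a plain percentile interval rather than the bias-corrected interval and lavaan model used in the study. All names and generating values are illustrative assumptions.

```python
import random

def _cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def indirect_effect(x, m, y):
    """Product of path a (M on X) and path b (Y on M, controlling for X)."""
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b

def percentile_bootstrap_ci(x, m, y, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the indirect effect (not bias-corrected)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample respondents with replacement, keeping their x, m, y together.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    # Approximate percentile cut points in the sorted bootstrap estimates.
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper

# Synthetic data: X raises M (a = 0.6) and M lowers Y (b = -0.6).
gen = random.Random(42)
n = 300
x = [gen.gauss(0, 1) for _ in range(n)]
m = [0.6 * xi + gen.gauss(0, 1) for xi in x]
y = [-0.6 * mi + 0.1 * xi + gen.gauss(0, 1) for xi, mi in zip(x, m)]

point = indirect_effect(x, m, y)
lower, upper = percentile_bootstrap_ci(x, m, y)
print(f"indirect effect {point:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

An interval that excludes zero is read as evidence of mediation. The bias-corrected variant adjusts the percentile cut points for asymmetry in the bootstrap distribution and is not implemented here.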

H1 predicted that (a) activity frequency, (b) activity variety and (c) technical skills with smartphones are positively related to consent to Twitter linkage. Results (see Figure 2) revealed that only the relationship between activity variety and Twitter linkage consent was significant (b = .12, SE = .05, p = .012, CI95 = .03, .22). The relationships between activity frequency and linkage consent (b = −.04, SE = .05, p = .43, CI95 = −.15, .06) and between technical skills and linkage consent (b = −.02, SE = .05, p = .76, CI95 = −.11, .08) were not significant. Thus, H1b was supported, indicating that individuals who use smartphones for multiple purposes are more likely to consent to link Twitter data with survey responses. However, H1a and H1c were not supported.

Figure 2. Final model with standardized path coefficients.

*p < .05. **p < .01. ***p < .001. The standardized coefficients between activity frequency, activity variety and technical skills with smartphones and consent to link Twitter represent total effects.

Regarding H2a to H2c, results showed that activity variety with smartphones was negatively associated with privacy concerns (b = −.32, SE = .06, p < .001, CI95 = −.44, −.20), in accordance with H2b. However, activity frequency with smartphones was positively related to privacy concern (b = .15, SE = .05, p = .004, CI95 = .05, .26), contrary to the direction predicted by H2a. The relationship between smartphone technical skills and privacy concern was not significant (b = −.06, SE = .06, p = .27, CI95 = −.17, .05). Therefore, H2b was supported, whereas H2a and H2c were not. Furthermore, privacy and security concerns were negatively related to Twitter linkage consent (b = −.20, SE = .06, p < .001, CI95 = −.31, −.10), providing support for H3.

Mediation analysis (i.e. indirect analysis) revealed that privacy and security concern mediated the association between activity frequency and Twitter linkage consent (b = −.03, SE = .02, p = .036, CI95 = −.061, −.002), as well as between activity variety and Twitter linkage consent (b = .07, SE = .02, p = .004, CI95 = .021, .110), but not the association between technical skills and Twitter linkage consent (b = .012, SE = .01, p = .31, CI95 = −.011, .036). Therefore, H4a and H4b were supported, whereas H4c was not.
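As a rough consistency check, each indirect effect can be approximated as the product of the path from a usage variable to privacy concern and the path from privacy concern to consent. The sketch below multiplies the rounded coefficients reported in the text; because the inputs are rounded to two decimals, small discrepancies with the reported indirect effects (for example .064 here versus the reported .07 for activity variety) are to be expected. The dictionary keys are illustrative labels.

```python
# Paths from smartphone usage to privacy concern, as reported (rounded).
paths_to_concern = {
    "activity frequency": 0.15,
    "activity variety": -0.32,
    "technical skills": -0.06,
}

# Path from privacy concern to Twitter linkage consent, as reported (rounded).
concern_to_consent = -0.20

# Product-of-coefficients approximation of each indirect effect.
indirect = {name: a * concern_to_consent for name, a in paths_to_concern.items()}

for name, value in indirect.items():
    print(f"{name}: {value:+.3f}")
```

The signs follow directly from the products: frequent use raises concern and thereby lowers consent, while varied use lowers concern and thereby raises consent.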

Moderation analysis

To examine the influence of the four proposed moderators (i.e. age, employment status, gender, and educational level) on the relationship between privacy concerns and consent to link Twitter (H5 to H8), simple slope analyses were conducted at the mean, as well as at one standard deviation above and below the mean of the moderator, with the PROCESS macro in SPSS (Hayes, 2022). The Johnson-Neyman method was applied to ascertain the specific values of a moderator for which the effect of the independent variable on the dependent variable becomes statistically significant (Hayes, 2022).
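The logic of a simple slope analysis can be sketched in a few lines: in a model with an interaction term, the effect of the focal predictor at a given moderator value equals its coefficient plus the interaction coefficient times that moderator value. The example below assumes a mean-centered continuous moderator, and every number in it is hypothetical and chosen only for illustration; none is taken from the study’s output.

```python
def conditional_slope(b_focal, b_interaction, moderator_value):
    """Effect of the focal predictor at a given (centered) moderator value."""
    return b_focal + b_interaction * moderator_value

def simple_slopes(b_focal, b_interaction, moderator_sd):
    """Slopes at one SD below the mean, at the mean, and one SD above."""
    return {
        "-1 SD": conditional_slope(b_focal, b_interaction, -moderator_sd),
        "mean": conditional_slope(b_focal, b_interaction, 0.0),
        "+1 SD": conditional_slope(b_focal, b_interaction, moderator_sd),
    }

# Hypothetical values: focal coefficient, interaction coefficient, moderator SD.
slopes = simple_slopes(b_focal=-0.60, b_interaction=0.015, moderator_sd=15.0)

for level, slope in slopes.items():
    print(f"{level}: {slope:+.3f}")
```

With a positive interaction coefficient, a negative focal effect weakens as the moderator increases, which is the kind of pattern a simple slope plot visualizes.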

Results suggested that respondents’ age moderated the negative association between privacy concern and Twitter linkage consent (b = .02, SE = .008, p = .046, CI95 = .0003, .031) (see Figure 3). However, contrary to the hypothesis, the negative effect of privacy concern on Twitter linkage consent was stronger for younger respondents (b = −.80, SE = .19, p < .001, CI95 = −1.17, −.44) than for middle-aged (b = −.58, SE = .12, p < .001, CI95 = −.82, −.33) and older respondents (b = −.35, SE = .15, p = .02, CI95 = −.64, −.06). Thus, H5 was not supported.

Figure 3. Moderating effects of age on the association between privacy and security concern and consent to Twitter linkage.

The values on Y-axis denote the logarithmic transformation of the odds of Twitter linkage consent. These values can range from negative to positive infinity. Higher probabilities of consent correspond to higher log-odds, and lower probabilities correspond to lower log-odds.

Furthermore, results showed that respondents’ employment status moderated the negative association between privacy concern and Twitter linkage consent (b = −.62, SE = .28, p = .03, CI95 = −1.17, −.07) (see Figure 4). For employed respondents, higher privacy concern was associated with a lower likelihood of disclosing Twitter information (b = −.67, SE = .14, p < .001, CI95 = −.95, −.39). For unemployed respondents, however, the association between privacy concern and Twitter linkage consent was not significant (b = −.05, SE = .24, p = .84, CI95 = −.53, .43). Therefore, H6 was supported.

Figure 4. Moderating effects of employment status on the association between privacy and security concerns and consent to Twitter Linkage.

The values on Y-axis denote the logarithmic transformation of the odds of Twitter linkage consent. These values can range from negative to positive infinity. Higher probabilities of consent correspond to higher log-odds, and lower probabilities correspond to lower log-odds.

Additionally, it was discovered that gender (b = −.34, SE = .24, p = .16, CI95 = −.82, .14) and educational level (b = −.07, SE = .16, p = .64, CI95 = −.38, .23) did not moderate the association between privacy concern and Twitter linkage consent. Thus, H7 and H8 were not supported.
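To make the conditional effects above easier to interpret, the sketch below converts them to odds ratios. It assumes the reported simple slopes are logistic regression coefficients on the log-odds scale, as the figure notes on log-odds suggest; under that assumption, exponentiating a slope gives the multiplicative change in the odds of consent per one-unit increase in privacy concern. The function names and dictionary labels are illustrative.

```python
import math

def odds_ratio(log_odds_coefficient):
    """Multiplicative change in odds for a one-unit increase in the predictor."""
    return math.exp(log_odds_coefficient)

def probability(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Simple slopes of privacy concern as reported in the text.
reported_slopes = {
    "younger": -0.80,
    "middle-aged": -0.58,
    "older": -0.35,
    "employed": -0.67,
    "unemployed": -0.05,
}

ratios = {group: odds_ratio(b) for group, b in reported_slopes.items()}

for group, ratio in ratios.items():
    print(f"{group}: odds of consent multiplied by {ratio:.2f} per unit of concern")

# A log-odds of zero corresponds to a 50% probability of consent.
print(probability(0.0))
```

On this reading, an odds ratio near 1 (as for unemployed respondents) indicates almost no change in the odds of consent as concern rises, consistent with the non-significant slope for that group.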

Discussion

Using data from the UK Understanding Society Innovation Panel (IP), this study tested the associations among activity frequency, activity variety, technical skills with smartphones, privacy and security concern, and Twitter linkage consent. Results indicated that both the frequency and variety of smartphone activities significantly predicted privacy and security concerns, which in turn influenced respondents’ consent to link Twitter data with survey responses. Additionally, age and employment status moderated the associations between privacy concern and Twitter linkage consent.

First, in contrast to our hypotheses, only activity variety with smartphones was positively related to consent to Twitter linkage, while the effects of activity frequency and technical skills with smartphones were not significant. The finding for activity variety is consistent with previous research that highlighted the influence of smartphone usage variety on the willingness to take part in passive mobile data collection (Keusch et al., 2019; Struminskaya et al., 2021; Wenz et al., 2019). For instance, Wenz et al. (2019) found a similar positive association between the diversity of smartphone activities and the willingness to take part in mobile data collection tasks, such as downloading apps for surveys and sharing GPS position. One possible explanation for this consistent pattern is that individuals who use smartphones for a wider array of purposes tend to be more open to new experiences and practices, and thus more likely to grant consent to Twitter linkage. This reminds us to be cautious when drawing conclusions from linked data sources, as the linked sample may be skewed toward individuals with specific smartphone usage patterns.

Additionally, we found that activity frequency was positively related to privacy and security concern, while activity variety was negatively related to privacy concern. This implies that it is critical to distinguish between different operationalizations of smartphone usage (activity frequency versus activity variety) when considering its potential influences. The positive relationship could arise because those who use smartphones more frequently have a higher likelihood of encountering privacy and security risks, such as data breaches or hacking attempts, and may therefore be more aware of the potential risks and vulnerabilities associated with these activities. Self-reported frequent smartphone use is also often associated with heightened anxiety and stress (Elhai et al., Citation2021), which could amplify concerns over privacy and security intrusions. In contrast, individuals who engage in a wider variety of smartphone activities may have a more relaxed attitude toward privacy. They may be less concerned about protecting their personal information because they are accustomed to sharing it across different platforms. Conversely, individuals who engage in a narrower range of activities may be more likely to perceive those activities as particularly important or sensitive, thereby heightening their privacy concerns.

Moreover, our study found that privacy and security concern was a significant determinant directly affecting the willingness to consent to Twitter linkage, corroborating previous research on data linkage and passive data collection (Keusch et al., Citation2019; Mneimneh et al., Citation2021). This study also provides evidence suggesting that privacy concern could act as a mediator between other factors, such as the variety and frequency of smartphone use, and consent to data linkage. The results shed light on respondents’ decision-making process regarding whether to consent to Twitter linkage and provide some insights for researchers seeking to address respondents’ privacy concerns when designing surveys. For example, it might be helpful to explicitly address, in the survey introduction, the privacy concerns associated with frequent smartphone use in order to alleviate respondents’ potential anxiety.

Regarding the moderating role of demographic factors, our study revealed that age moderated the link between privacy concern and consent to Twitter linkage, with a somewhat unexpected twist: younger respondents showed a stronger association between privacy concern and consent behavior. This suggests that, at the same level of privacy concern, younger respondents were more likely to translate that concern into actual behavior. Specifically, among those with lower privacy worries, younger respondents were more inclined than middle-aged and older respondents to consent to provide their Twitter account, whereas among those expressing higher privacy and security concerns, younger respondents were less likely than older respondents to agree to link their Twitter accounts. One potential reason could be that younger people, having grown up in the digital age, are more adept at managing online privacy (Blank et al., Citation2014), which may lead to actions more closely aligned with their privacy concerns, such as a lower rate of consent to disclose information. Furthermore, Rusk (Citation2014) posited that trust toward the data-collecting entity may affect the association between privacy attitudes and behaviors. Older respondents might place more trust in public data collectors, such as the universities in this study, than younger respondents (Christensen & Lægreid, Citation2005). This trust may lead them to be more amenable to data linkage, despite having similar privacy concerns.

Furthermore, our findings indicate that employment status moderates the relationship between privacy and security concerns and the consent to link Twitter. Employed respondents showed a pronounced negative correlation between privacy concerns and their willingness to consent to Twitter linkage, suggesting that as privacy concerns increase, the likelihood of consent decreases for this group. This could be attributed to employed individuals’ heightened awareness of the potential repercussions of sharing personal information online, which may stem from their professional obligations and the reputational risks involved (Bhave et al., Citation2020). Consequently, they might exercise more caution when considering whether to link their Twitter accounts to a research study. On the other hand, unemployed respondents may not be subject to the same level of professional scrutiny or potential exposure, and thus, their privacy concerns may not significantly deter them from consenting to social media account linkage.

Limitations and next steps

Several limitations of the study are worth noting. First, there is potential for bias due to survey nonresponse. In this study, low consent rates and a small percentage of Twitter users among the initial sample may lead to a lack of representativeness in the final linked dataset. Second, the cross-sectional nature of the data restricts our ability to infer causality from the observed associations. Subsequent studies would benefit from employing experimental or longitudinal designs to better understand the dynamic relationship between smartphone usage and privacy concerns, as well as Twitter linkage consent. Third, we relied on self-report measures to assess activity frequency, activity variety and technical skills with smartphones. Future research could employ unobtrusive means of data collection, such as passively collected smartphone usage data, to gain a more precise assessment of smartphone usage habits. Lastly, the dataset used in this study was collected in 2017. In the years since, Twitter has undergone significant changes, especially with Elon Musk taking control of the platform. It is anticipated that Twitter (now named X) will evolve into a super-app, incorporating more features. As Elon Musk (Citation2023) stated: ‘we will add comprehensive communications and the ability to conduct your entire financial world.’ Moreover, Musk has also pledged to reduce content moderation and appease advertisers (Kelly, Citation2022). This expansion and transformation are likely to heighten privacy concerns among users, which could make it harder to obtain consent for data linkage and further lower the rate of consent to link Twitter data.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/13645579.2023.2299482.

Additional information

Funding

ESRC project ‘Understanding [Online/Offline] Society: Linking Surveys with Twitter Data’ [ES/S015175/1]. ESRC project ‘Understanding Society: The UK Household Longitudinal Survey Waves 13-15’ [ES/T002611/1].

Notes on contributors

Shujun Liu

Shujun Liu is a Research Associate in the School of Social Sciences at Cardiff University, where she works on the ESRC project ‘Understanding [Online/Offline] Society: Linking Surveys with Twitter Data’ (ES/S015175/1). Her key research interests include digital media studies, computational social science, climate communication, and political communication.

Luke Sloan

Luke Sloan is Deputy Director of the Social Data Science Lab and Principal Investigator on the ESRC project ‘Understanding [Online/Offline] Society: Linking Surveys with Twitter Data’ (ES/S015175/1). His key research interests are understanding representation on Twitter and augmenting social media data through data linkage.

Tarek Al Baghal

Tarek Al Baghal is a Professor of Survey Methodology at the Institute for Social and Economic Research, University of Essex, and is Deputy Director of Understanding Society.

Matthew Williams

Matthew Williams is Professor of Criminology in the School of Social Sciences at Cardiff University and Director of HateLab and the Social Data Science Lab. His main areas of research activity are Cybercrime, Human Factors in Cybersecurity, Hate Crime, Hate Speech and Extremism Online.

Curtis Jessop

Curtis Jessop is the Director of Attitudinal Surveys at the National Centre for Social Research where he oversees the development and delivery of the British Social Attitudes study and the NatCen Panel. He has also conducted extensive work looking at the ethics, feasibilities and practicalities of linking digital trace and survey data.

Paulo Serôdio

Paulo Serôdio is a Senior Research Officer at the Institute for Social and Economic Research, University of Essex.

Notes

1. The two values do not add up to 513 because of missing values, which arise for the following reasons: (1) Inapplicable: the participant was not eligible for this question and was never asked it; (2) Proxy: when a person could not participate in the interview, someone else in the household answered questions on their behalf; (3) Refusal: the respondent refused to answer; (4) DK: the respondent did not know the answer.

2. The analysis was conducted using the Maximum Likelihood (ML) estimator. Taking the ordinal nature of the items into consideration, we additionally ran a structural equation model using the Diagonally Weighted Least Squares (DWLS) estimator with polychoric correlations. DWLS is useful when dealing with ordinal data and provides estimates without requiring the data to be normally distributed. With the robust method, the model showed an acceptable fit, χ2(37) = 72.97, p < .001, χ2/df = 1.97, RMSEA = .046 [CI90 = .030, .062], SRMR = .03, CFI = .995. The results are consistent with those obtained using ML. Specifically, smartphone activity variety was positively associated with Twitter linkage consent (b = .17, SE = .04, p = .04). Meanwhile, activity frequency (b = .17, SE = .16, p = .001) was positively associated with privacy concern, while activity variety (b = −.32, SE = .03, p < .001) was negatively associated with privacy concern. Furthermore, privacy and security concern mediated the effects of activity frequency (b = −.05, SE = .06, p = .008) and activity variety (b = .10, SE = .01, p < .001) on Twitter linkage consent.

3. A model is deemed to have a good fit if the Root Mean Square Error of Approximation (RMSEA) is below .06 and the Comparative Fit Index (CFI) exceeds .95. An acceptable model fit is indicated when the RMSEA values range from .06 to .08 and the CFI values fall between .90 and .95 (Hu & Bentler, Citation1999).
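
The cutoffs in this note can be expressed as a small check. The sketch below is illustrative only: the function `classify_fit` and its labels are hypothetical, it uses just the RMSEA and CFI thresholds stated in this note, and the treatment of boundary values (exactly .06 or .95) is an assumption. The input values are those reported for the DWLS model in Note 2.

```python
def classify_fit(rmsea, cfi):
    """Classify model fit using only the RMSEA and CFI cutoffs stated in Note 3.

    Assumption: boundary values (e.g. RMSEA exactly .06) are assigned to the
    'acceptable' band; the note itself does not say how to treat them.
    """
    if rmsea < 0.06 and cfi > 0.95:
        return "good"
    if rmsea <= 0.08 and cfi >= 0.90:
        return "acceptable"
    return "below the stated cutoffs"


# Values reported for the DWLS model in Note 2.
chi2, df = 72.97, 37
ratio = round(chi2 / df, 2)  # chi-square divided by its degrees of freedom

print(f"chi2/df = {ratio}")
print(f"fit: {classify_fit(rmsea=0.046, cfi=0.995)}")
```

Under these two cutoffs alone, the RMSEA and CFI values reported in Note 2 meet the stricter set of thresholds; other indices, such as SRMR, are not considered in this sketch.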

References

  • Al Baghal, T., Sloan, L., Jessop, C., Williams, M. L., & Burnap, P. (2020). Linking twitter and survey data: The impact of survey mode and demographics on consent rates across three UK studies. Social Science Computer Review, 38(5), 517–532. https://doi.org/10.1177/0894439319828011
  • Alraja, M. N., Farooque, M. M. J., & Khashab, B. (2019). The effect of security, privacy, familiarity, and trust on users’ attitudes toward the use of the IoT-based healthcare: The mediation role of risk perception. IEEE Access, 7, 111341–111354. https://doi.org/10.1109/ACCESS.2019.2904006
  • Beveridge, C. (2022). 33 Twitter Stats That Matter to Marketers in 2023. Hootsuite. https://blog.hootsuite.com/twitter-statistics/
  • Bhave, D. P., Teo, L. H., & Dalal, R. S. (2020). Privacy at work: A review and a research agenda for a contested terrain. Journal of Management, 46(1), 127–164. https://doi.org/10.1177/0149206319878254
  • Blank, G., Bolsover, G., & Dubois, E. (2014). A new privacy paradox: Young people and privacy on social network sites (Vol. 17). Oxford Internet Institute. https://www.oxfordmartin.ox.ac.uk/downloads/A%20New%20Privacy%20Paradox%20April%202014.pdf
  • Boerman, S. C., Kruikemeier, S., & Zuiderveen Borgesius, F. J. (2021). Exploring motivations for online privacy protection behavior: Insights from panel data. Communication Research, 48(7), 953–977. https://doi.org/10.1177/0093650218800915
  • Braithwaite, S. R., Giraud-Carrier, C., West, J., Barnes, M. D., & Hanson, C. L. (2016). Validating machine learning algorithms for Twitter data against established measures of suicidality. JMIR Mental Health, 3(2), e4822. https://doi.org/10.2196/mental.4822
  • Chai, S., Bagchi-Sen, S., Morrell, C., Rao, H. R., & Upadhyaya, S. J. (2009). Internet and online information privacy: An exploratory study of preteens and early teens. IEEE Transactions on Professional Communication, 52(2), 167–182. https://doi.org/10.1109/TPC.2009.2017985
  • Christensen, T., & Lægreid, P. (2005). Trust in government: The relative importance of service satisfaction, political factors, and demography. Public Performance & Management Review, 28(4), 487–511. https://doi.org/10.1080/15309576.2005.11051848
  • Clarke, H., Clark, S., Birkin, M., Iles-Smith, H., Glaser, A., & Morris, M. A. (2021). Understanding barriers to novel data linkages: Topic modeling of the results of the LifeInfo survey. Journal of Medical Internet Research, 23(5), e24236. https://doi.org/10.2196/24236
  • Dean, B. (2022). Facebook demographic statistics: How many people use Facebook in 2022? Backlinko. https://backlinko.com/facebook-users
  • Dommeyer, C. J., & Gross, B. L. (2003). What consumers know and what they do: An investigation of consumer knowledge, awareness, and use of privacy protection strategies. Journal of Interactive Marketing, 17(2), 34–51. https://doi.org/10.1002/dir.10053
  • Dunn, K. M., Jordan, K., Lacey, R. J., Shapley, M., & Jinks, C. (2004). Patterns of consent in epidemiologic research: Evidence from over 25,000 responders. American Journal of Epidemiology, 159(11), 1087–1094. https://doi.org/10.1093/aje/kwh141
  • Eady, G., Nagler, J., Guess, A., Zilinsky, J., & Tucker, J. A. (2019). How many people live in political bubbles on social media? Evidence from linked survey and Twitter data. SAGE Open, 9(1), 1–21. https://doi.org/10.1177/2158244019832705
  • Eagly, A. H., & Wood, W. (1999). The origins of sex differences in human behavior: Evolved dispositions versus social roles. American Psychologist, 54(6), 408. https://doi.org/10.1037/0003-066X.54.6.408
  • Elevelt, A., Lugtig, P., & Toepoel, V. (2019). Doing a time use survey on smartphones only: What factors predict nonresponse at different stages of the survey process? Survey Research Methods, 13(2), 195–213. https://doi.org/10.18148/srm/2019.v13i2.7385
  • Elhai, J. D., Sapci, O., Yang, H., Amialchuk, A., Rozgonjuk, D., & Montag, C. (2021). Objectively-measured and self-reported smartphone use in relation to surface learning, procrastination, academic productivity, and psychopathology symptoms in college students. Human Behavior and Emerging Technologies, 3(5), 912–921. https://doi.org/10.1002/hbe2.254
  • Fiesler, C., & Proferes, N. (2018). “Participant” perceptions of Twitter research ethics. Social Media + Society, 4(1), 1–14. https://doi.org/10.1177/2056305118763366
  • Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60(1), 549–576. https://doi.org/10.1146/annurev.psych.58.110405.085530
  • Hayes, A. F. (2009). Beyond Baron and Kenny: Statistical mediation analysis in the new millennium. Communication Monographs, 76(4), 408–420. https://doi.org/10.1080/03637750903310360
  • Hayes, A. F. (2022). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach (3rd ed). The Guilford Press.
  • Hedman, U., & Djerf-Pierre, M. (2013). The social journalist: Embracing the social media life or creating a new digital divide? Digital Journalism, 1(3), 368–385. https://doi.org/10.1080/21670811.2013.776804
  • Hoy, M. G., & Milne, G. (2010). Gender differences in privacy-related measures for young adult Facebook users. Journal of Interactive Advertising, 10(2), 28–45. https://doi.org/10.1080/15252019.2010.10722168
  • Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. https://doi.org/10.1080/10705519909540118
  • Hughes, A. G., McCabe, S. D., Hobbs, W. R., Remy, E., Shah, S., & Lazer, D. M. (2021). Using administrative records and survey data to construct samples of tweeters and tweets. Public Opinion Quarterly, 85(S1), 323–346. https://doi.org/10.1093/poq/nfab020
  • Kelly, H. (2022, January). Twitter privacy settings to change now. The Washington Post. https://www.washingtonpost.com/technology/2022/01/19/twitter-privacy-settings/
  • Ketelaar, P. E., & Van Balen, M. (2018). The smartphone as your follower: The role of smartphone literacy in the relation between privacy concerns, attitude and behaviour towards phone-embedded tracking. Computers in Human Behavior, 78, 174–182. https://doi.org/10.1016/j.chb.2017.09.034
  • Keusch, F., Struminskaya, B., Antoun, C., Couper, M. P., & Kreuter, F. (2019). Willingness to participate in passive mobile data collection. Public Opinion Quarterly, 83(S1), 210–235. https://doi.org/10.1093/poq/nfz007
  • Klimmt, C., Hefner, D., Reinecke, L., Rieger, D., & Vorderer, P. (2017). The permanently online and permanently connected mind: Mapping the cognitive structures behind mobile internet use. In P. Vorderer, D. Hefner, L. Reinecke, & C. Klimmt (Eds.), Permanently online, permanently connected (pp. 18–28). Routledge.
  • Knies, G., Burton, J., & Sala, E. (2012). Consenting to health record linkage: Evidence from a multi-purpose longitudinal survey of a general population. BMC Health Services Research, 12(1), 1–6. https://doi.org/10.1186/1472-6963-12-52
  • Máté, Á., Rakovics, Z., Rudas, S., Wallis, L., Ságvári, B., Huszár, Á., & Koltai, J. (2023). Willingness of participation in an application-based digital data collection among different social groups and smartphone user clusters. Sensors, 23(9), 2–17. https://doi.org/10.3390/s23094571
  • McLean, C. P., & Anderson, E. R. (2009). Brave men and timid women? A review of the gender differences in fear and anxiety. Clinical Psychology Review, 29(6), 496–505. https://doi.org/10.1016/j.cpr.2009.05.003
  • Metzger, M. J. (2004). Privacy, trust, and disclosure: Exploring barriers to electronic commerce. Journal of Computer-Mediated Communication, 9(4), 942. https://doi.org/10.1111/j.1083-6101.2004.tb00292.x
  • Mneimneh, Z. N., McClain, C., Bruffaerts, R., & Altwaijri, Y. A. (2021). Evaluating survey consent to social media linkage in three international health surveys. Research in Social and Administrative Pharmacy, 17(6), 1091–1100. https://doi.org/10.1016/j.sapharm.2020.08.007
  • Musk, E. (2023, July 25). In the months to come, we will add comprehensive communications and the ability to conduct your entire financial world [Tweet]. Twitter. https://shorturl.at/bdCI4
  • Ohme, J., Araujo, T., de Vreese, C. H., & Piotrowski, J. T. (2021). Mobile data donations: Assessing self-report accuracy and sample biases with the iOS screen time function. Mobile Media & Communication, 9(2), 293–313. https://doi.org/10.1177/2050157920959106
  • Otto, L. P., & Kruikemeier, S. (2023). The smartphone as a tool for mobile communication research: Assessing mobile campaign perceptions and effects with experience sampling. New Media & Society, 25(4), 795–815. https://doi.org/10.1177/14614448231158651
  • Ovadia, S. (2009). Exploring the potential of Twitter as a research tool. Behavioral & Social Sciences Librarian, 28(4), 202–205. https://doi.org/10.1080/01639260903280888
  • Parsons, J. (2022, November 3). What Percentage of Twitter Uses Mobile Devices? Growtraffic. https://growtraffic.com/blog/2022/11/percentage-twitter-uses-mobile-devices
  • Rusk, J. D. (2014). Trust and decision making in the privacy paradox. In Proceedings of the Southern Association for Information Systems Conference, Macon, GA, U.S.
  • Sala, E., Burton, J., & Knies, G. (2012). Correlates of obtaining informed consent to data linkage: Respondent, interview, and interviewer characteristics. Sociological Methods & Research, 41(3), 414–439. https://doi.org/10.1177/0049124112457330
  • Shrout, P. E., & Bolger, N. (2002). Mediation in experimental and nonexperimental studies: New procedures and recommendations. Psychological Methods, 7(4), 422–445. https://doi.org/10.1037/1082-989X.7.4.422
  • Silber, H., Breuer, J., Beuthner, C., Gummer, T., Keusch, F., Siegers, P., Stier, S., & Weiß, B. (2022). Linking surveys and digital trace data: Insights from two studies on determinants of data sharing behaviour. Journal of the Royal Statistical Society Series A: Statistics in Society, 185(Supplement_2), S387–S407. https://doi.org/10.1111/rssa.12954
  • Sipior, J. C., Ward, B. T., & Volonino, L. (2014). Privacy concerns associated with smartphone use. Journal of Internet Commerce, 13(3–4), 177–193. https://doi.org/10.1080/15332861.2014.947902
  • Sloan, L., Jessop, C., Al Baghal, T., & Williams, M. (2020). Linking survey and Twitter data: Informed consent, disclosure, security, and archiving. Journal of Empirical Research on Human Research Ethics, 15(1–2), 63–76. https://doi.org/10.1177/1556264619853447
  • Sloan, L., Morgan, J., Burnap, P., Williams, M., & Preis, T. (2015). Who tweets? Deriving the demographic characteristics of age, occupation and social class from Twitter user meta-data. PloS One, 10(3), e0115545. https://doi.org/10.1371/journal.pone.0115545
  • Smith, H. J., Dinev, T., & Xu, H. (2011). Information privacy research: An interdisciplinary review. MIS Quarterly, 35(4), 989–1015. https://doi.org/10.2307/41409970
  • Smit, E. G., Van Noort, G., & Voorveld, H. A. (2014). Understanding online behavioural advertising: User knowledge, privacy concerns and online coping behaviour in Europe. Computers in Human Behavior, 32, 15–22. https://doi.org/10.1016/j.chb.2013.11.008
  • Stier, S., Breuer, J., Siegers, P., & Thorson, K. (2020). Integrating survey data and digital trace data: Key issues in developing an emerging field. Social Science Computer Review, 38(5), 503–516. https://doi.org/10.1177/0894439319843669
  • Struminskaya, B., Lugtig, P., Toepoel, V., Schouten, B., Giesen, D., & Dolmans, R. (2021). Sharing data collected with smartphone sensors: Willingness, participation, and nonparticipation bias. Public Opinion Quarterly, 85(S1), 423–462. https://doi.org/10.1093/poq/nfab025
  • Townsend, L., & Wallace, C. (2017). The ethics of using social media data in research: A new framework. In K. Woodfield (Ed.), The ethics of online research (pp. 189–207). Emerald Publishing Limited. https://doi.org/10.1108/S2398-601820180000002008
  • Tsetsi, E., & Rains, S. A. (2017). Smartphone internet access and use: Extending the digital divide and usage gap. Mobile Media & Communication, 5(3), 239–255. https://doi.org/10.1177/2050157917708329
  • University of Essex, Institute for Social and Economic Research. (2021). Understanding society: Innovation panel, waves 1-13, 2008-2020 (11th ed.). UK Data Service. https://doi.org/10.5255/UKDA-SN-6849
  • Wang, T., Duong, T. D., & Chen, C. C. (2016). Intention to disclose personal information via mobile applications: A privacy calculus perspective. International Journal of Information Management, 36(4), 531–542. https://doi.org/10.1016/j.ijinfomgt.2016.03.003
  • Wang, Y., Ngien, A., & Ahmed, S. (2022). Nationwide adoption of a digital contact tracing app: Examining the role of privacy concern, political trust, and technology literacy. Communication Studies, 73(4), 364–379. https://doi.org/10.1080/10510974.2022.2094982
  • Wenz, A., Jäckle, A., & Couper, M. P. (2019). Willingness to use mobile technologies for data collection in a probability household panel. Survey Research Methods, 13(1), 1–22. https://doi.org/10.18148/srm/2019.v13i1.7298
  • Wenz, A., & Keusch, F. (2023). Increasing the acceptance of smartphone-based data collection. Public Opinion Quarterly, 87(2), 357–388. https://doi.org/10.1093/poq/nfad019
  • Wu, A. D., & Zumbo, B. D. (2008). Understanding and using mediators and moderators. Social Indicators Research, 87(3), 367–392. https://doi.org/10.1007/s11205-007-9143-1
  • Xafis, V. (2015). The acceptability of conducting data linkage research without obtaining consent: Lay people’s views and justifications. BMC Medical Ethics, 16(79), 1–16. https://doi.org/10.1186/s12910-015-0070-4
  • Yao, M. Z., Rice, R. E., & Wallis, K. (2007). Predicting user concerns about online privacy. Journal of the American Society for Information Science and Technology, 58(5), 710–722. https://doi.org/10.1002/asi.20530