Research Article

Leveraging Data Donations for Communication Research: Exploring Drivers Behind the Willingness to Donate


ABSTRACT

Using data donations to collect digital trace data holds great potential for communication research, which has not yet been fully realized. Besides limited awareness and expertise among researchers, a central challenge is to motivate people to donate their personal data. Therefore, this article investigates which factors affect people’s willingness to donate across different platforms and data types. The study applies a multilevel approach that explains the reported willingness to donate different types of data (level 1) generated on different platforms (level 2) by potential data donors with individual characteristics (level 3) in a hypothetical research project. The analysis is based on data collected through a national online survey (n = 833). We find higher willingness to donate YouTube data compared to Facebook, Instagram, or Google, as well as relevant influencing factors at all three levels. Greater willingness is found for lower perceived sensitivity and higher perceived relevance of the data (level of data type), greater perceived behavioral control to request and submit the data (platform level), more favorable attitudes toward data donation and the donation purpose, as well as lower contextual privacy concerns (individual level). Based on these findings, practical implications for future data donation studies are proposed.

Vast amounts of digital trace data are permanently recorded by digital devices and online services. Digital trace data in the broadest sense can be defined as digital observations that contain information about the actions of a (human) agent captured and contextualized through a digital device or service (Howison et al., Citation2011). Such data can entail anything from linking a physical in-store purchase with a loyalty or credit card number to activity logs on a web server to messages sent through e-mails. Of particular interest to communication and media researchers and hence the focus of this article are digital trace data that capture people’s interactions with large online platforms such as social networking sites, media streaming applications, or search engines.

Although these data are primarily collected by profit-oriented companies to improve their products and services, they also hold great potential for scientific research. So far, this potential has rarely been fully realized because access to digital trace data is often restricted (Lazer et al., Citation2020). While there are some proven methods for academic researchers to access digital trace data, such as public application programming interfaces (APIs), web scraping, or direct collaborations with companies, all of them come with significant drawbacks. Access to public APIs is becoming increasingly restricted (Bruns, Citation2019; Freelon, Citation2018), web scraping is associated with both ethical and legal concerns (Mancosu & Vegetti, Citation2020), and collaborations with companies raise questions about the independence of academic research from commercial interests (Breuer et al., Citation2020; Bruns, Citation2019).

A promising approach that addresses many of these shortcomings and offers additional advantages is to utilize data donations, broadly defined as a person’s consensual act of sharing (donating) data for research purposes (Prainsack, Citation2019; Skatova & Goulding, Citation2019).Footnote1 In the context of digital trace data generated by online (media) platforms, the data to be donated are considered personal data as they are directly linked to a natural person and contain information about that person’s characteristics and behavior as recorded by the respective platform (GDPR, Art. 4). Thus, to collect data donations, users of a relevant platform are approached and asked to contribute these data to a research project. One way to do this is the download-upload approach. This approach utilizes the “takeout” function that many large companies such as Google, Instagram, or Spotify have implemented in their platforms, which allows users to download a copy of the data stored about them and subsequently submit these data to the researchers.

In principle, a successful data donation collection depends on two things: First, it requires a sound and reliable implementation of the data donation process that considers the technological, legal, and ethical requirements. Second, it requires individuals who are willing to donate their data. While the first aspect can be directly designed by the researcher, the latter depends on the motivation of potential participants and is much more difficult to achieve. People’s initial motivation to participate in a data donation project is already affected at the very beginning of the donation process by the information they receive about the research project in the invitation and introduction of a study (e.g., what the research goal is, who conducts the research, what kind of data are requested, etc.). Based on this information, participants then decide whether they want to start the data donation process.

However, it remains unclear which factors are driving the (un-)willingness to participate in data donation studies (Breuer et al., Citation2020). This article addresses this issue and explores the following research question: Which factors affect people’s willingness to donate personal data for academic research?

Before we derive the relevant factors related to people’s willingness to donate, we contextualize this method in relation to other approaches to collect digital trace data. Based on theoretical considerations and previous empirical findings, we then introduce our hypotheses relating (a) to the data requested for donation, (b) to the platform on which these data were generated, and (c) to individual attitudes and predispositions. Based on a hypothetical research scenario that asks participants to donate their personal data through the download-upload approach, we test the hypotheses with a multilevel approach that incorporates the reported willingness to donate different types of data (e.g., search terms, likes, social ties – level 1) that belong to different platforms (i.e., Google Search, Instagram, Facebook, and YouTube – level 2) from the perspective of potential data donors with individual characteristics (i.e., attitudes, age, gender, education – level 3). Finally, we discuss the results and offer practical recommendations for optimizing the communication with and recruitment of participants for future data donation projects.
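The three-level structure described above (data types nested in platforms nested in persons) can be illustrated as a linear mixed-effects model with a random intercept per participant and a variance component for platforms. The following sketch uses synthetic data; all variable names, effect sizes, and the concrete model specification are illustrative assumptions, not the study’s actual analysis code.

```python
# Minimal sketch of a three-level model: willingness ratings (level 1)
# nested in platforms (level 2) nested in persons (level 3).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
platforms = ["google", "facebook", "instagram", "youtube"]

rows = []
for pid in range(60):                  # 60 simulated participants
    attitude = rng.normal()            # level-3 predictor (constant per person)
    for platform in platforms:
        control = rng.normal()         # level-2 predictor (per platform)
        for _ in range(3):             # three data types per platform
            sensitivity = rng.normal() # level-1 predictors (per data type)
            relevance = rng.normal()
            willingness = (4 + 0.5 * attitude + 0.4 * control
                           - 0.6 * sensitivity + 0.5 * relevance
                           + rng.normal(scale=0.8))
            rows.append(dict(pid=pid, platform=platform, attitude=attitude,
                             control=control, sensitivity=sensitivity,
                             relevance=relevance, willingness=willingness))
df = pd.DataFrame(rows)

model = smf.mixedlm(
    "willingness ~ sensitivity + relevance + control + attitude",
    data=df,
    groups="pid",                                # level-3 grouping: person
    vc_formula={"platform": "0 + C(platform)"},  # level-2 variance component
)
result = model.fit()
print(result.params[["sensitivity", "relevance", "control", "attitude"]])
```

With this simulated data, the fixed-effect estimates recover the assumed signs (negative for sensitivity, positive for relevance); the `vc_formula` term is one common way to express a nested intermediate level in `statsmodels`.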

Data donations as an approach to collect digital trace data

Prominent approaches to collect digital trace data that have been applied in numerous studies are public APIs, web scraping, and direct collaborations with the organizations recording the digital trace data. However, each of these approaches is associated with noteworthy limitations. Public APIs, although cost-effective and easy to use, often return datasets that are hard to replicate and contain biases that are difficult to evaluate (Ruths & Pfeffer, Citation2014). Additionally, public APIs usually only expose a subset of the digital trace data a company collects and are becoming increasingly restrictive (Bruns, Citation2019; Hogan, Citation2018). Web scraping remains a legal gray area (Freelon, Citation2018) and only allows for the collection of publicly visible information, which limits its utility (Halavais, Citation2019). Collaboration platforms, such as Social Science One (King & Persily, Citation2020), have not fulfilled their promises and instead have raised concerns regarding a loss of academic independence (Breuer et al., Citation2020; Bruns, Citation2019), a lack of external reproducibility (Theocharis & Jungherr, Citation2020), and a new digital divide between researchers with and without the privilege to collaborate with these organizations (Boyd & Crawford, Citation2012; Ruths & Pfeffer, Citation2014; Walker et al., Citation2019). Lastly, a drawback that all mentioned approaches have in common is the lack of informed consent from the users about whom the collected data contain information (Flick, Citation2016; Lomborg & Bechmann, Citation2014). In addition to these approaches, there are many examples of researchers developing dedicated applications that implement a wide range of automatic tracking approaches, each with their own advantages and disadvantages (for an overview see Christner et al., Citation2022).

Data donations add to this methodological toolbox, addressing some of the drawbacks of the established approaches and offering additional possibilities. In principle, there are several ways to collect data donations in practice. One approach is to utilize APIs that require the user’s permission to access their personal data. Another approach is to use applications to which users give permission to scrape data using their personal account (e.g., DataSkop, Citation2021). However, the most versatile approach is to utilize the participants’ right to request and download their personal data from a digital service in the form of a data takeout (or data download package; Boeschoten, Araujo et al., Citation2022) and ask them to download and submit these data to the researchers. We call this the download-upload approach.

Data donations in general, and the download-upload approach in particular, have gained increased attention owing to newly introduced laws that grant users the right to request and share data that (digital) service providers have stored about them (Ausloos & Veale, Citation2021), such as the General Data Protection Regulation (GDPR) in the European Union in 2018. Data donations utilize these laws and directly involve users in the collection of digital trace data – a characteristic that sets data donations apart from other approaches. Therefore, researchers are in direct contact with the people whose behavior is depicted in the data – referred to herein as data donors – which leads to four inherent qualities of data donations: they are consensual, enrichable, connectable, and universal.

The first three qualities stem from the fact that the collection of data donations is a user-centric approach. Data donations are consensual because there is always direct contact between researchers and data donors. Researchers can, thus, disclose their identity and organizational affiliation, the study goal, and the intended use of the donated data. Potential participants can use this information to make an informed decision about whether to disclose data and provide explicit consent to participate.

Owing to the direct contact with data donors, digital trace data collected through data donations are enrichable. This means that researchers can gather additional individual-level information from data donors through an appended survey and investigate research questions that cannot be addressed with digital trace data alone (Jungherr, Citation2018; Stier et al., Citation2019).

Data donations are connectable because a person may donate data from different services to the same study. Studies that utilize data donations are, thus, not restricted to a single service but allow the exploration of cross-platform behavioral patterns (Halavais, Citation2019).

Lastly, collecting data donations is a universal approach – at least in countries where the GDPR or similar laws apply. The right to data portability as defined in the GDPR (Art. 20) provides individuals with the right to request the personal data that is explicitly linked to them or their actions from any company that stores such data. Data donations as an approach to collecting (digital trace) data can therefore be applied across a wide range of research scenarios (Ausloos & Veale, Citation2021) and also have the potential to make data sources available that are not accessible through other means (e.g., because no API is available or no data are publicly displayed). However, companies’ compliance with and implementation of the right to data portability varies widely, although large online platforms seem to be the ones implementing the right most comprehensively (Kuebler-Wachendorff et al., Citation2021; Syrmoudis et al., Citation2021).

Stages of a data donation study

In practice, the download-upload approach is comparable to the process of an online survey and can conceptually be divided into two stages: the consideration and the donation stage. The consideration stage entails the study invitation, the briefing, and any additional introductory parts. For participants, the study invitation is usually the first point of contact with a study. The invitation briefly summarizes the main aspects of a study (i.e., who conducts the study, what the research goal is, how participants can donate their data, etc.) and provides a call-to-action (e.g., visiting an online questionnaire). Based on the limited information provided in the invitation, participants will make an initial decision on whether to participate in the study by following the call-to-action (e.g., clicking on a URL to open a questionnaire). If a participant decides to take part in a study, they will first be presented with an introduction that reiterates and extends the central aspects of the study (Dillman et al., Citation2014). Usually, the introduction will conclude with a consent statement confirming that a participant is, at least preliminarily, willing to participate and donate their data (e.g., by ticking a box and clicking on a “next” button).

The donation stage will be entered by those participants who indicate that they are willing to donate. This stage comprises the instructions on how to request and download the data from a digital service, the actual data donation – that is, transmission of the downloaded data to the researchers – and a debriefing of the participants at the end.

Challenges of collecting data donations

In the consideration stage, researchers are faced with the challenge of informing participants about the content of a study while being brief and concise in order not to discourage people from participating (Dillman et al., Citation2014). The goal is to minimize the dropout rate and to avoid dropout-related biases with respect to the investigated dimensions. While in a traditional survey the information can focus on the topic and objectives of the study, data donation studies must also include additional information about the process of donating data. For most participants, data donation is likely to be a new concept about which they lack a clear understanding and with which they have little or no experience. Consequently, the participation effort is high, and so is the hurdle to initiating participation.

In the donation stage, the main challenge for researchers is to implement a trustworthy and secure donation process. So far, few technical solutions to collect data donations are available, most of which are still in the pilot phase (e.g., Araujo et al., Citation2022; Boeschoten, Mendrik, et al., Citation2022; Menchen-Trevino, Citation2016; Pfiffner et al., Citation2022). Consequently, a custom implementation of the donation process is often needed (Ausloos & Veale, Citation2021), which requires specific knowledge and an infrastructure that may not be accessible to all interested researchers. Regarding the communication with participants, comprehensive and clear instructions to guide willing data donors are crucial (Breuer et al., Citation2020). Hence, considerable effort is needed from researchers to prepare comprehensible instruction material to minimize compliance and consent errors (Boeschoten, Araujo et al., Citation2022).

The design of the donation stage will be highly variable from study to study, depending on both the platform from which the data should be donated and the technical implementation of the data donation process. In contrast to this, the type of information communicated in the study invitation and introduction will be similar across research scenarios. Therefore, the study at hand focuses on the motivational aspects that are relevant in the consideration stage.

Willingness to donate personal data for research

A theory that has often been applied to explain people’s willingness to donate in other contexts (e.g., medical donations or charitable causes) or participation in academic research (e.g., Bosnjak et al., Citation2005; Haunberger, Citation2011) is the theory of planned behavior (Ajzen, Citation1991). According to this theory, human behavior is based on a reasoned process in which personal beliefs regarding the relevant action, aggregated as attitudes toward the behavior, subjective norms, and perceived behavioral control, shape the behavioral intention. This intention captures the motivational factors underlying a behavior and is an indication of the effort people are willing to invest in executing the actual behavior (Ajzen, Citation1991). In the context of an academic data donation study, this intention is already formed in the consideration stage when potential participants are confronted with the study invitation or the introduction to the study. Therefore, although the theory is not without criticism (e.g., Sniehotta et al., Citation2014), it provides a helpful foundation to identify and explore the factors that affect people’s willingness to donate their data. In line with this theoretical tradition, the term willingness to donate refers to a person’s preliminary agreement formed during the consideration stage to actively provide their personal digital trace data for academic research. Hence, the willingness to donate is a necessary precondition for an actual data donation.

Factors affecting people’s willingness to donate personal data

The theory of planned behavior suggests that people’s willingness to perform a specific behavior depends on attitudes, perceived behavioral control, and subjective norms. However, we assume that the subjective norm is unlikely to be an important predictor of the willingness to donate data. This is because donating data is usually done in private and, therefore, is not directly observable by others. Such private behavior was found to be less affected by social norms (Lapinski & Rimal, Citation2005). Furthermore, given the novelty of the approach, people are unlikely to have strong assumptions about the behavior of others or to feel social pressure to act in a specific way. Hence, this study does not include subjective norms but focuses on factors related to attitudes and perceived behavioral control, for which seven hypotheses are derived (for an overview, see Figure 1).

Figure 1. Hypotheses Model.

Attitudes

An attitude is a person’s evaluation of an object, which can include anything from physical things to people, ideas, or behaviors (Bohner & Dickel, Citation2011). In the context of the data donation process, potential donors might evaluate a variety of objects, which can, in turn, affect their willingness to donate data. These include the act of donating data in general, the donation receiver, the purpose of donation, and the kind of data that should be donated.

Act of donating data

According to the theory of planned behavior, a person’s general attitude toward a behavior is assumed to be a relevant predictor of the execution of that behavior (Ajzen, Citation1991). In the case at hand, a positive association between the general attitude toward data donation and the willingness to donate is assumed:

H1:

The more favorable the attitude toward the act of data donation in general is, the greater the willingness to donate data is.

Donation receiver

The attitude toward the donation receiver—that is, the entity asking for a data donation (e.g., a specific university, a commercial organization, etc.)—is also expected to affect people’s willingness to donate. From the perspective of the data donor, a data donation is characterized by uncertainty regarding how the data will be stored and used and what the actual outcome of a study will be. Under such circumstances, the attitude toward and trust in the donation receiver are assumed to be crucial preconditions for cooperative behavior (Hillebrand & Hornuf, Citation2021; Hummel et al., Citation2019). Findings from different domains support this assumption. For online surveys, studies have demonstrated a positive relationship between participation rates and attitudes toward the sender of a survey (Brosnan et al., Citation2019; Keusch, Citation2015) and trust in the sponsor of a survey (Fang et al., Citation2009). For the specific case of data donations, results of a qualitative study indicate that people who chose to donate their data often expressed a positive connectedness with the donation receiver (Sleigh, Citation2018). Similarly, a quantitative online experiment found that people were significantly less willing to donate their data to a project managed by a private entity compared to one managed by academia or a government institution (Hillebrand & Hornuf, Citation2021). Drawing on this research, the following hypothesis is tested:

H2:

The more favorable the attitude toward the donation receiver is, the greater the willingness to donate data is.

Purpose of donation

The purpose of the donation is the stated aim of the research project to which a data donation contributes. Attitudes toward the purpose of donation consist of the perceived relevance of a stated research goal to oneself and others. This can also be characterized as cause involvement, a concept that has received considerable attention in (social) marketing research (e.g., Grau & Folse, Citation2007) and that is linked to behavioral intention.

In survey participation research, multiple studies have found a correlation between higher interest in the survey topic and greater willingness to participate (Brosnan et al., Citation2019; Keusch, Citation2015). For the specific case of donating personal data for research purposes, similar observations have been made. Results from qualitative studies indicate that it is important to inform participants about the purpose of their donation and the expected impact of the study results, because people who can relate to a purpose are more likely to be willing to share their data for research (Skatova et al., Citation2019; Sleigh, Citation2018). The same observation was made in a quantitative survey study in which a strong need for understanding the research purpose correlated with the stated willingness to donate one’s data (Skatova & Goulding, Citation2019). Based on the theoretical assumptions and these empirical observations, the following hypothesis is formulated:

H3:

The more favorable the attitude toward the purpose of donation is, the greater the willingness to donate data is.

Contextual privacy concerns

Privacy concerns can be defined as a subjective measure (Buchanan et al., Citation2007) that captures people’s concerns about the opportunistic use of their personal data by others (Kayhan & Davis, Citation2016). Strong privacy concerns are assumed to be an obstacle for activities requiring the provision of personal information (Smith et al., Citation1996), which also applies to data donations (Fortes & Rita, Citation2016).

Conceptually, privacy concerns can either be regarded as a persistent trait that is relatively stable across situations or as contextual, depending on a specific situation. Especially in situations where the type of information exchange and the involved entities are known, contextual privacy concerns are assumed to hold greater explanatory power (Kayhan & Davis, Citation2016). Hence, in this study, the focus is on contextual privacy concerns.

Empirical findings regarding the role of privacy concerns in data donations are inconclusive. Some studies suggest that privacy concerns are indeed an important predictor of the willingness to donate: privacy and security concerns, as well as the perception of lacking control over or information about what happens to the shared data, are among the main reasons not to participate in passive mobile data collection (Keusch et al., Citation2019), and the willingness to donate is significantly greater for projects that are perceived as low risk compared to projects with medium or high risks attached to them (Hillebrand & Hornuf, Citation2021). In other studies, perceived privacy concerns did not affect the likelihood of data sharing (Ohme et al., Citation2020; Silber et al., Citation2021). However, these studies measured privacy concerns globally and not in relation to a specific donation receiver, which might explain why no relationship was found. Therefore, it is reasonable to assume that greater contextual privacy concerns are related to a lower willingness to donate personal data:

H4:

The greater the contextual privacy concerns are, the lower the willingness to donate data is.

Data to be donated

Digital trace data usually capture a person’s behavior at a granular level and might disclose personal habits, preferences, and attitudes to an extent that is considered sensitive (Lomborg, Citation2013). The sensitivity assessment of data depends, among other things, on the type of data and past interactions with the service or product on which the data were recorded (Bansal et al., Citation2010). Therefore, it is assumed that people’s attitude toward the type of data they are asked to donate is linked to the willingness to donate these data.

Empirical results regarding the disclosure of “traditional” data, such as one’s name, address, or financial statements, show the relevance of perceived information sensitivity to information disclosure. In general, the likelihood of disclosing information decreases when the perceived sensitivity of information is higher (Bansal et al., Citation2010; Malheiros et al., Citation2013). An indication that this could also apply to data donations is the fact that people’s concerns (Keusch et al., Citation2020) and willingness to disclose data vary depending on the type of data requested. For example, people are more likely to share their music streaming history than their Facebook or credit card purchase history (Seltzer et al., Citation2019; Silber et al., Citation2021). Findings also indicate that people have an idea about what kind of information could be inferred about them from their trace data and that people differ in their assessments of the sensitivity of these data (Skatova et al., Citation2019; Sleigh, Citation2018). Based on these observations, the following hypothesis is formulated:

H5:

The higher the perceived sensitivity of one’s personal data is, the lower the willingness to donate data is.

Another aspect that is directly related to the type of data requested is the perceived relevance of the data for the proposed donation purpose. When confronted with a donation request, potential donors will, on the one hand, consider whether they think the requested data are appropriate for the proposed research question (this is sometimes referred to as contextual integrity; Nissenbaum, Citation2004) and, on the other hand, consider whether their donation will have any value for the donation purpose before they make the effort of actually donating (Duncan, Citation2004). Hence, the following hypothesis is formulated:

H6:

The higher the perceived relevance of the requested data is, the greater the willingness to donate data is.

Perceived behavioral control

In the context of data donations, perceived behavioral control refers to people’s beliefs that they are capable of accessing and downloading their personal data from a service and subsequently submitting these data to the researchers. Theoretically, perceived behavioral control consists of two subdimensions: autonomy (the degree of control over performing a behavior) and capacity (the belief that one is able to perform a behavior; Fishbein & Ajzen, Citation2010). Following this definition, the capacity dimension is closely related to the “ease of use” as incorporated in the technology acceptance model (Davis, Citation1989). Perceived behavioral control is also closely related to self-efficacy and depends on factors such as a person’s skills and abilities – which partly depend on previous experience – or their available resources, such as time (Ajzen, Citation2020).

Existing findings support this rationale and indicate that people are more likely to share their data when the donation procedure is less effortful (Silber et al., Citation2021). Similarly, in a study that asked people to share screen time measures from their smartphone, people who reported higher levels of perceived technical mobile phone skills were more likely to participate in the study (Ohme et al., Citation2020). In the case of the download-upload approach to collect data donations, it is important to note that the perceived behavioral control will likely depend on the platform from which the data are requested, because the takeout functionality can be implemented differently on each platform. Regarding perceived behavioral control, the following hypothesis is proposed:

H7:

The greater the perceived behavioral control over the forthcoming data donation process is, the greater the willingness to donate data is.

This review of the literature shows that previous studies have investigated factors that affect people’s willingness to donate their personal data for research. However, many of these studies focus on a subset of the identified factors or only examine their relevance for one particular data type. The present study extends this line of research by integrating multiple factors and testing their relationship with the willingness to donate data simultaneously for different types of data, which will be described in the following sections.

Method

To answer the research question and test the hypotheses, an online survey was conducted among Swiss internet users. The study was preregistered on OSF.Footnote2

Sample

Participants were recruited to complete an online questionnaire through the online access panel of GfK Switzerland, consisting of 38,000 active members. The original language of the questionnaire was German. Overall, 989 participants completed the questionnaire. The sample is representative of the Swiss population between 16 and 75 years of age regarding age and gender distribution; only the group of young males aged 16–29 years is slightly underrepresented. Participants who failed to correctly answer the two included quality check items (e.g., “This serves as a quality control: select the option ‘completely agree (7)’ here”) were excluded from the sample (n = 44). Additionally, cases with missing values for one of the predictor variables were excluded because the applied analytical model assumes complete cases at the predictor level. Hence, the final sample used for the analysis consists of 833 participants. In the final sample, the distribution of age groups still approximates the general population, although participants aged 30–44 years are slightly overrepresented (2.2% points above the census statistic) at the expense of both the 16- to 29-year-olds and the 60- to 75-year-olds, who are slightly underrepresented (1.7 and 0.5% points below the census statistic, respectively). Regarding gender, 48.6% are female, which is 1.2% points below the census statistic. These minor deviations are considered unproblematic for the analysis at hand, and no weighting was applied.
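The two exclusion steps described above (dropping failed quality checks, then keeping complete cases on the predictors) can be sketched as a simple filtering pipeline. The column names and simulated responses below are hypothetical stand-ins, not the study’s actual data.

```python
# Hypothetical sketch of the sample-exclusion steps.
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n_raw = 989  # completed questionnaires

raw = pd.DataFrame({
    "respondent_id": np.arange(n_raw),
    # Quality-check item: the correct answer is 7 ("completely agree").
    "quality_check": rng.choice([7, 3], size=n_raw, p=[0.95, 0.05]),
    "privacy_concerns": rng.normal(size=n_raw),  # one example predictor
})
# Simulate some missing predictor values.
raw.loc[rng.choice(n_raw, size=30, replace=False), "privacy_concerns"] = np.nan

# Step 1: exclude respondents who failed the quality check.
passed = raw[raw["quality_check"] == 7]

# Step 2: keep complete cases only, since the multilevel model
# assumes no missing values at the predictor level.
predictors = ["privacy_concerns"]
final = passed.dropna(subset=predictors)

print(len(raw), len(passed), len(final))
```

The same two-step logic applies regardless of how many quality-check items or predictor columns a study uses; `dropna(subset=...)` simply takes the full predictor list.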

Design and procedure

In the questionnaire, all the included attitudes are measured in relation to a specific research purpose, a donation receiver, or the requested data. The research purpose (investigation of algorithmic influence across a range of platforms) and the donation receiver (University of Zurich) were first introduced in the questions measuring the respective attitudes. Participants were then presented with the following brief description of a fictitious study: “A research project at the University of Zurich is investigating how recommendation algorithms on the internet influence our digital usage patterns and personal preferences.” Next, privacy concerns regarding this study were measured, followed by question blocks related to a maximum of three out of four platforms in a random order. For each platform, perceived behavioral control, attitudes toward platform-specific data types, and the willingness to donate these data types for the previously introduced study were measured. See Table 1 for an overview of the included platforms and data types, and Appendix A for a more detailed description of the survey design. The complete questionnaire can be found on OSF (https://doi.org/10.17605/OSF.IO/H24XV).

Table 1. Included platforms and data types.

This study design with a hypothetical scenario was chosen to compare different data types from the four most-used digital services in Switzerland (IGEM, Citation2021; Latzer et al., Citation2021) and cover different platform categories (i.e., search engines, social networking sites, video streaming platforms).

Data collection took place between November 29 and December 10, 2021. The software Unipark (UNIPARK, Citation2022) was used to implement the questionnaire. Participants first provided informed consent before answering the standardized questions. On average, it took participants 18 minutes to complete the questionnaire.

Measures

The following paragraphs describe how the constructs were measured and provide examples of the items used. The complete item sets per measure can be found in Table 2.

Table 2. Variable measurements.

Willingness to donate

To measure the willingness to donate, participants were asked the following question with respect to each platform mentioned in Table 1: “Would you donate the following data about your [platform] usage to the research project mentioned earlier (Data donation to the University of Zurich to study recommendation algorithms)?” Participants indicated their willingness to donate with respect to each of the associated data types on a rating scale ranging from 1 (under no circumstance) to 7 (under any circumstance).

Attitudes

Attitude toward donating data

Participants were presented with the following definition of the term data donation: “As a user, you have the right to request the usage data that a company has stored about you. Such usage data is not only interesting for you personally and the companies, but also for scientific research. In this context, ‘data donation’ refers to the process of requesting your personal usage data from the companies and then making it available to a scientific institution for research purposes.” Next, the attitude toward the act of donating one’s personal data was measured with three items on a 7-point rating scale ranging from 1 (do not agree at all) to 7 (completely agree; e.g., “I find the donation of one’s personal data for research worth supporting”). The items are a subset of a scale proposed by Bresnahan et al. (Citation2008).

Attitude toward the donation receiver

The attitude toward the donation receiver was measured with four items adapted from Dennis et al. (Citation2016) and one additional item rated on a 7-point rating scale ranging from 1 (do not agree at all) to 7 (completely agree; e.g., “Researchers at the University of Zurich are trustworthy”).

Attitude toward the donation purpose

To assess the attitude toward the purpose of donation, participants were presented with the following statement: “Currently, a lot of research is being done on how recommendation algorithms on the Internet influence our digital usage patterns and our personal preferences.” They were then asked to state their attitude toward this kind of research with four items that they had to rate on a 7-point semantic differential-type scale (e.g., “This kind of research is unimportant/is important”). The items were adapted from the operationalization of the construct campaign cause involvement in Grau and Folse (Citation2007).

Attitude toward contextual privacy

To measure contextual privacy concerns, four items from the scale suggested by Ha et al. (Citation2021) were adapted (e.g., “I would have reservations about donating data for this research project, because my personal data could be misused”). Participants rated these statements on a scale from 1 (do not agree at all) to 7 (completely agree).

Attitudes toward the data to be donated

Attitude toward the data to be donated was measured regarding data sensitivity and perceived relevance of the data for the donation purpose (i.e., the research question).

Data sensitivity was measured separately for each data type with three items. Each item was assessed in a separate question and participants answered each question on a 7-point rating scale relating to the given data type. Different end points were used for each of the three items (e.g., “How sensitive do you consider the following data about your personal [platform] usage to be?” rated from 1 [not sensitive at all] to 7 [extremely sensitive]). Item 1 was adapted from Bansal et al. (Citation2010) and items 2 and 3 from Mothersbaugh et al. (Citation2012).

Perceived relevance of the data was measured with the question “How helpful do you think the following data about your personal [platform] usage is for examining the impact of recommendation algorithms on human behavior?” rated on a scale from 1 (not helpful at all) to 7 (extremely helpful) for each data type.

Perceived behavioral control

The operationalization of the perceived behavioral control in this study focused on the capacity dimension and the construct was measured with the following question: “Thinking about the process of donating data (i.e., downloading your [platform] usage data first and then submitting it to the researchers), how much do you agree with the following statements?” The response items were guided by Fishbein and Ajzen’s (Citation2010) suggestions and measured on a 7-point rating scale ranging from 1 (do not agree at all) to 7 (completely agree; e.g., “For me, it would be easy to request my [platform] usage data and submit it to the researchers”).

Control variables

The participants’ age, gender, education, and platform use frequency were included as control variables.

Data preparation

The internal reliability of all multi-item measures was tested using Cronbach’s alpha. The reliability was sufficient in all cases and each measure was, therefore, combined into a single mean index (Cronbach’s alpha is reported in Table 2 for each measure).
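As an illustration, Cronbach’s alpha and the mean-index construction can be computed as follows. This is a hypothetical Python sketch with invented responses (the original analysis was carried out in R); the function and data are ours, not the authors’.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) response matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each single item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented data: 5 respondents answering 3 items on a 7-point scale
responses = np.array([
    [6, 7, 6],
    [2, 3, 2],
    [5, 5, 4],
    [7, 6, 7],
    [3, 2, 3],
])

alpha = cronbach_alpha(responses)
mean_index = responses.mean(axis=1)  # one mean score per respondent
```

With sufficiently consistent items (as in this toy data), alpha approaches 1 and the row means serve as the single index used in the analysis.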

Analysis strategy

To test the hypotheses, a three-level linear mixed-effect model was used. More specifically, a multilevel model with crossed random effects between individuals (level 3) and platforms (level 2) and nested random effects between platforms (level 2) and data types (level 1) was formulated. The conceptual model is depicted in Figure 2 along with an overview of which measured variable is located at which level of the model. The analysis was carried out in R 4.1.2 (R Core Team, Citation2022) and the model parameters were estimated using the restricted maximum likelihood method implemented in the R-package “lme4” (version 1.1–28; Bates et al., Citation2015).Footnote3
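In simplified notation (the symbols here are ours, introduced for illustration rather than taken from the original model specification), the three-level structure can be written as

$$\mathrm{WTD}_{i,d(p)} = \beta_0 + \mathbf{x}_{i,d(p)}^{\top}\boldsymbol{\beta} + u_i + v_p + w_{d(p)} + \varepsilon_{i,d(p)},$$

where $u_i$ is the random intercept of individual $i$ (level 3), $v_p$ that of platform $p$ (level 2), and $w_{d(p)}$ that of data type $d$ nested within platform $p$ (level 1); the individual intercepts are crossed with the platform and data-type intercepts, and $\varepsilon_{i,d(p)}$ is the observation-level residual.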

Figure 2. Conceptual multilevel model.

Note. CV = Control Variable, DV = Dependent Variable, DT = Data Type.

This analytical approach was chosen because (1) the dependent variable was measured repeatedly, (2) the different data types for which the willingness to donate was measured are nested within different platforms, (3) the explanatory variables are related to different levels, and (4) the model encompasses multiple donation scenarios, so the generalized, cross-situational relevance of the predictors for donating data can be estimated.

Results

Willingness to donate: descriptives

The reported willingness to donate data varied across the platforms and data types that participants were asked to donate (see Table 3). In general, the means are below the midpoint of the scale (i.e., people tend to be hesitant on average) and distributions tended to be positively skewed, except for data types belonging to YouTube, which tended more toward a uniform distribution and a mean above the midpoint (i.e., in favor of donation). An ad-hoc comparison of the mean willingness to donate associated with the different data types (F(13, 5607) = 46.75, p < .001) showed some significant differences (based on Tukey post-hoc tests). The willingness to donate data belonging to YouTube is higher compared to most other data types, except for seen content on Instagram and Facebook. Furthermore, people are also more willing to share what they see on social media compared to their activities on Google. Finally, people are least likely to share data that include information about their personal social network (friends and followers).

Table 3. Willingness to donate across platforms and data types.

Multilevel analysis

Table 4 presents the results of the multilevel model analysis. M0 constituted the base model and only included the random intercepts; M1 extended it with the predictor and control variables. The fit of M1 was compared against M0, and the M1 results were used to test the hypotheses.

Table 4. Multilevel model – results.

Fixed effects: hypothesis tests

The estimates of M1 indicate that, except for the attitude toward the receiver (p = .068), all other predictors are significantly associated with the willingness to donate (all p < .001). More favorable attitudes toward data donation in general (H1) and toward the donation purpose (H3) are related to greater willingness to donate. Similarly, higher perceived relevance of the requested data (H6) and greater perceived behavioral control (H7) are positively related to the willingness to donate. However, higher perceived data sensitivity (H5) and stronger contextual privacy concerns (H4) are associated with lower willingness to donate. Thus, this analysis suggests that H2 is the only hypothesis that must be rejected given the non-significant effect.

Regarding the magnitude of effects, as indicated by the estimated coefficients, perceived data sensitivity is the strongest predictor of the willingness to donate (β = −0.45), followed by perceived relevance of the requested data (β = 0.31), perceived behavioral control (β = 0.25), attitude toward data donation (β = 0.21), and contextual privacy concerns (β = −0.19). Attitude toward the donation purpose (β = 0.11) seems to have the least effect on the willingness to donate.

Model fit and random effects

The extended model M1 seemed to fit the data better than the base model M0 (M1: REML criterion = 21430.3, AIC = 21395, BIC = 21503, deviance = 21363; M0: REML criterion = 23808.4, AIC = 23817, BIC = 23851, deviance = 23807). To verify this statistically, the models were re-fit using the maximum likelihood method and compared with a likelihood ratio test. This test confirmed the better fit of M1 (χ2(11) = 2444, p < .001).
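The likelihood-ratio test compares twice the log-likelihood difference against a chi-squared distribution with degrees of freedom equal to the number of added parameters; a quick Python check using the values reported above:

```python
from scipy.stats import chi2

# Values reported in the text above
lr_stat = 2444.0  # likelihood-ratio statistic comparing M1 against M0
df = 11           # number of parameters added in M1

p_value = chi2.sf(lr_stat, df)  # survival function: P(X >= lr_stat)
# A statistic this large relative to df yields a p-value far below .001
```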

The intraclass correlation coefficients (ICCs) for M0 indicate that the requested data type (ICC = .034) is more relevant to the willingness to donate than the platform to which the requested data belong (ICC = .016). However, the ICCs are rather low, indicating that both the type of platform and the type of requested data are not that decisive in explaining the willingness to donate.

In the base model, 60.6% of the variance of willingness to donate can be attributed to differences between individuals. After accounting for the effects of the theoretically derived predictors and control variables, the share of variance attributable to remaining interindividual differences drops to 47.9%. The ICCs of the platform (ICC = .013) and data type (ICC = .015) also decrease in M1, supporting the observation from the M0 results that these variables are less crucial than individual differences in explaining people’s willingness to donate.
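The ICC of each level is simply that level’s variance component divided by the total variance. A Python sketch with hypothetical variance components, chosen only to roughly reproduce the reported M0 ICCs (the actual estimates are in Table 4):

```python
def icc(var_component: float, all_components: list) -> float:
    """Share of total variance attributable to one random-effect level."""
    return var_component / sum(all_components)

# Hypothetical variance components (individual, platform, data type, residual)
var_individual, var_platform, var_datatype, var_residual = 2.20, 0.06, 0.12, 1.25
components = [var_individual, var_platform, var_datatype, var_residual]

icc_individual = icc(var_individual, components)  # close to the reported .606
icc_platform = icc(var_platform, components)
icc_datatype = icc(var_datatype, components)
```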

Discussion

Data donations provide a new approach for collecting digital trace data for media and communication research. However, motivating members of the general public to donate their media usage data remains a challenge. Identifying the factors related to the willingness to donate personal data is crucial to facilitate participant recruitment and participant motivation in future data donation studies. In the study at hand, we conducted a survey to explore the association of different factors with the willingness to donate personal data. The importance of these factors was examined by employing a multilevel analysis that controlled for different platforms and data types, thus allowing for more generalizable results.

Hypotheses discussion

General attitude toward data donations

The first factor that was positively linked to the willingness to donate is a person’s general attitude toward data donation (H1). In our sample, attitude toward the concept of data donation for academic research tended to be positive (M = 4.51, SD = 1.46), but at the same time, the term “data donation” was not commonly known (10.1% of respondents reported that they had heard the term before). An expectation that we share with Ohme et al. (Citation2020) is that the general willingness to donate personal data will increase when the collection of data donations for research becomes a more common practice. To achieve this, more data donation studies and additional efforts from the scientific community to raise awareness among the general public about the potential that data donations offer for academic research are needed (e.g., by promoting the concept at public science exhibitions or through image campaigns).

Role of the donation receiver

Our model showed no significant relationship between the willingness to donate and personal attitude toward the donation receiver (H2). However, this does not mean that the receiver is irrelevant in the donation decision. First, in our sample, the attitude toward the receiver was generally favorable (M = 5.26, SD = 1.20), which could be due to self-selection bias in the study, because the institution was already mentioned in the introduction of our survey. Second, even if the strength of the attitude toward the receiver is not as influential, other studies indicate that the type of institution requesting a data donation can be (Hillebrand & Hornuf, Citation2021; Richter et al., Citation2021). According to these findings, universities and governments are more likely than private companies to receive data donations.

Role of the communicated research goal

Attitude toward the donation purpose and perceived relevance of the requested data are both directly related to the research goal communicated to potential study participants. Both variables showed a positive relationship with the willingness to donate personal data.

More positive attitudes toward the donation purpose were linked to greater willingness to donate (H3). This is in line with previous research showing that people are more likely to share their personal data if they can relate to the research goal (Skatova & Goulding, Citation2019; Skatova et al., Citation2019; Sleigh, Citation2018).

Similarly, increased perceived relevance was associated with greater willingness to donate (H6). This is in line with theories of altruistic and charitable donation behavior (Duncan, Citation2004) that propose that people are more likely to donate when they think their contribution can positively impact the promoted cause. Compared to the attitude toward the donation purpose, the observed effect size for the perceived relevance was about three times higher (βrelevance = 0.31 vs. βpurpose = 0.11).

Perceived data sensitivity and privacy concerns

Our results indicate that perceived data sensitivity was the strongest predictor of a person’s willingness to donate—the higher the perceived sensitivity of the requested data, the lower the willingness to donate (H5). Generally, this observation is in line with previous findings regarding information disclosure behavior in general (Bansal et al., Citation2010; Malheiros et al., Citation2013) and data sharing behavior more specifically (Keusch et al., Citation2020; Ohme et al., Citation2020).

Our findings further show that the perceived data sensitivity depends to some extent on the data type requested as a donation, although there was considerable variation in the responses and the differences between platforms were moderately large when looking at the mean sensitivity of data types belonging to a platform (Google: M = 4.20, SD = 1.62; Facebook: M = 3.79, SD = 1.78; Instagram: M = 3.45, SD = 1.68; YouTube: M = 3.39, SD = 1.71). This indicates that perceived sensitivity also depends on personal predispositions and usage habits.

Related to perceived sensitivity are contextual privacy concerns, for which we also observed a significant negative relationship with people’s willingness to donate (H4). This is in line with observations that people are more willing to donate data for projects that are perceived as low risk (Hillebrand & Hornuf, Citation2021; Keusch et al., Citation2019). The question remains whether privacy concerns can be influenced by a description in a study or whether contextual privacy concerns are predetermined by general privacy concerns and are, therefore, stable across situations (in our sample, general privacy concerns and contextual privacy concerns showed a correlation of r = .42).

Requesting one’s personal data

Another factor that was positively related to the willingness to donate is the perceived behavioral control over accessing and downloading one’s data (H7). While previous studies found that technical skills, in general, do not seem to have a significant impact on the willingness to donate (Keusch et al., Citation2019; Ohme et al., Citation2020), perceived behavioral control does. Compared to general skills that are accumulated over time, researchers can positively affect participants’ perceived behavioral control by designing convincing and easy-to-understand communication materials that introduce the upcoming data donation process. This is crucial because many members of the general public have never requested a data takeout from a digital service (only 7.75% of participants in this study indicated that they had already done so in the past).

Limitations

In the study at hand, we investigate the relationship of different factors with people’s reported willingness to donate different data types from four online platforms in a hypothetical research scenario (asking if they would be willing to donate different kinds of data to such a study). This design choice provides unique insights but also entails some limitations. First, our results are valuable for explaining participants’ initial willingness to take part in a data donation project but may not fully explain actual donation rates. Second, the results are based on correlational observations and should be treated with caution when drawing causal inferences. Third, the number of groups at the lower levels of the multilevel model was limited (four platforms with two to four data types each). Therefore, the precision of the random-effects estimates might have been relatively low (Hox, Citation2010). However, since we had a large number of individuals (“groups”) at the highest level, this should not have had a noteworthy effect on the accuracy of the fixed-effects estimates on which our hypotheses and interpretation mainly focused (Snijders & Bosker, Citation2012). Fourth, our study presented participants with one scenario (the same for each participant). Although our sample showed variation regarding the strengths of attitudes, our results do not guarantee that the observed relationships between these factors and people’s willingness to donate will be the same across different purpose and receiver scenarios. Fifth, we did not include all possible factors that could be related to the willingness to donate, and we do not claim to be exhaustive in this respect. Most notably, we did not consider the effects of incentives (financial or otherwise) in our model. However, it can be expected that an adequate incentive will increase the willingness to donate data, as other studies investigating related topics have shown (Haas et al., Citation2020; Keusch et al., Citation2019).

These limitations could only be resolved by experimental variation of multiple factors and effective data donations. However, this was not feasible with the available resources or would have required focusing on a single platform. Therefore, we propose to use our findings regarding the reported willingness to donate different data types from different platforms as a starting point to inform the design and focus of future studies that further explore specific aspects of the data donation process.

Practical implications

Despite these limitations, our observations show that people are more willing to donate data that contain information on entertainment-related media use (e.g., YouTube data) compared to data that contain more information on social relations (e.g., friends or followers) or personal information needs (e.g., search engine use). Furthermore, on social media, people seem to be more willing to donate data that contain information about what content they have been “passively” presented with (e.g., seen posts on social media) compared to data that directly capture their active behavior (e.g., what they have posted or liked). The implications of these differences are at least threefold: First, if the research question allows it, researchers may focus on data types with higher likelihood of participation. Second, surveys addressing data types with lower willingness to donate should make sure to gain as much information on the nonresponse as possible to estimate potential bias in the data. Third, projects addressing data types with lower willingness to donate should increase efforts to optimize initial communication with potential data donors. Our results show that perceived sensitivity of the data to be donated, the perceived relevance of the data for the donation purpose, and the perceived behavioral control have the strongest relationship with people’s willingness to donate. Interestingly, these are all factors that are specific to data donation studies. More general information that needs to be communicated in every study scenario (i.e., also in a traditional survey study), such as the donation receiver or the donation purpose by itself, seems to be of lower relevance. This indicates that researchers may need to emphasize different aspects than they are used to.
Therefore, we suggest the following practical implications for the study invitation and introduction: First, although participants’ general perception of how sensitive their data are is likely a stable attitude that is hard to change during participant recruitment, researchers can tackle this challenge by trying to reduce the amount of sensitive data they are collecting or by implementing data anonymization and reduction steps in the data donation process (Boeschoten, Mendrik, et al., Citation2022; Breuer et al., Citation2022). Furthermore, a transparent and comprehensible consent form can reassure participants that their data, although sensitive, are secure with the researchers. This consent form must be designed in an intelligible way, contain the legally and ethically necessary information, and omit unnecessary technical details (Breuer et al., Citation2022). Second, researchers should explain to potential participants how and why their data donation can help to reach the research goal in order to demonstrate the relevance of their donation. These explanations should also address people who might feel that their personal data are not relevant for the stated research goal (e.g., because they use the service only seldomly); otherwise, researchers run the risk of obtaining a biased sample. Overall, communicating how and why people’s data donations can help to reach a research goal is even more important than convincing participants that the research goal itself is a cause worth supporting. Third, if potential participants believe that they will be able to request and upload their data, they will be more willing to donate their data. Therefore, researchers should try to explain in simple but brief terms what the data donation process will entail and that no special skills are required.
Related to this, researchers have to make sure that these hurdles are minimized, especially for participants who were previously unaware that they can request their personal data from digital platforms.

Lastly, we want to emphasize that optimizing the identified motivational factors will not guarantee a successful data donation study, as there are additional challenges involved in the collection of data donations (see van Driel et al., Citation2022, for an overview). First, some people may be hesitant to share their data regardless of any efforts to optimize the proposed factors. Second, convincing individuals to start the process of donating their data is only the first step. Based on our experiences with the collection of data donations (e.g., Blassnig et al., Citation2023), a high drop-out rate may occur during the actual donation process due to technical issues, loss of motivation, or long timespans between requesting the data and its provision by the respective service provider. Third, even if participants successfully download and upload their data, they may still consider the information too personal and abort the process or refuse to consent to the donation. Nevertheless, we still believe that the collection of data donations is a promising approach for communication science, and we are convinced that, with the accumulation of experience, the academic community will be able to successfully face and minimize the challenges associated with data donations.

Conclusion

The present study examined the relationships of different factors with people’s willingness to donate personal data for academic research through a multilevel analysis. Apart from the attitude toward the donation receiver, all included factors showed a significant relationship with the willingness to donate.

Our results provide important insights for improving communication with potential data donors. They indicate that highlighting the importance of data donation and the positive impact a data donation can have are important for increasing data donation rates. The results further show that both higher perceived sensitivity and stronger contextual privacy concerns are related to lower willingness to donate personal data. Hence, it is crucial to foster a donation environment that potential data donors regard as secure and trustworthy. Furthermore, the results underscore the importance of preparing good instruction material to increase people’s sense of control over the data donation process.

Future methodological studies could add to this line of research by examining the actual data donation process (not a hypothetical situation) and experimentally manipulating relevant aspects of the donation process to deepen our understanding of how to best design data donation studies.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data used for the presented analysis along with the R-scripts used for the analysis can be found here: https://doi.org/10.17605/OSF.IO/H24XV.

Additional information

Funding

The work was supported by the Digital Society Initiative of the University of Zurich.

Notes on contributors

Nico Pfiffner

Nico Pfiffner is a research assistant and PhD candidate at the Department of Communication and Media Research (IKMZ) at the University of Zurich. His current work focuses on the question of how digital trace data can be made accessible for academic research through data donations from citizens. He is interested in both the technical implementation of data donation collections as well as the associated practical and epistemological challenges and implications.

Thomas N. Friemel

Thomas N. Friemel (PhD, 2008 U of Zurich) is an associate professor at the Department of Communication and Media Research at the University of Zurich. His research focuses on the social context of media use and effects.

Notes

1 Although in principle both individuals and organizations can “donate” data, the term data donation usually refers to individuals (Susha et al., Citation2019) whereas organizational data donation is usually termed data philanthropy (Ajana, Citation2017; Kirkpatrick, Citation2013; Susha et al., Citation2019).

2 The preregistration can be accessed here: https://osf.io/5qsgh. Deviations from the preregistration are discussed here: https://doi.org/10.17605/OSF.IO/H24XV.

3 The R-script for the model estimation and additional material can be found here: https://doi.org/10.17605/OSF.IO/H24XV.

References

  • Ajana, B. (2017). Digital health and the biopolitics of the quantified self. Digital Health, 3, 2055207616689509. https://doi.org/10.1177/2055207616689509
  • Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50(2), 179–211. https://doi.org/10.1016/0749-5978(91)90020-T
  • Ajzen, I. (2020). The theory of planned behavior: Frequently asked questions. Human Behavior and Emerging Technologies, 2(4), 314–324. https://doi.org/10.1002/hbe2.195
  • Araujo, T., Ausloos, J., van Atteveldt, W., Loecherbach, F., Moeller, J., Ohme, J., Trilling, D., van de Velde, B., Vreese, C. D., & Welbers, K. (2022). OSD2F: An open-source data donation framework. Computational Communication Research, 4(2), 372–387. https://doi.org/10.5117/CCr2022.2.001.ARAU
  • Ausloos, J., & Veale, M. (2021). Researching with data rights. Technology and Regulation, 136–157. https://doi.org/10.26116/techreg.2020.010
  • Bansal, G., Zahedi, F., & Gefen, D. (2010). The impact of personal dispositions on information sensitivity, privacy concern and trust in disclosing health information online. Decision Support Systems, 49(2), 138–150. https://doi.org/10.1016/j.dss.2010.01.010
  • Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
  • Blassnig, S., Mitova, E., Pfiffner, N., & Reiss, M. V. (2023). Googling referendum campaigns: Analyzing online search patterns regarding Swiss direct-democratic votes. Media and Communication, 11(1), 19–30. https://doi.org/10.17645/mac.v11i1.6030
  • Boeschoten, L., Araujo, T., Ausloos, J., Möller, J. E., & Oberski, D. L. (2022). A framework for privacy preserving digital trace data collection through data donation. Computational Communication Research, 4(2), 388–423. https://doi.org/10.5117/CCr2022.2.002.BoEs
  • Boeschoten, L., Mendrik, A., van der Veen, E., Vloothuis, J., Hu, H., Voorvaart, R., & Oberski, D. L. (2022). Privacy-preserving local analysis of digital trace data: A proof-of-concept. Patterns, 3(3), 100444. https://doi.org/10.1016/j.patter.2022.100444
  • Bohner, G., & Dickel, N. (2011). Attitudes and attitude change. Annual Review of Psychology, 62(1), 391–417. https://doi.org/10.1146/annurev.psych.121208.131609
  • Bosnjak, M., Tuten, T. L., & Wittmann, W. W. (2005). Unit (non)response in Web-based access panel surveys: An extended planned-behavior approach. Psychology & Marketing, 22(6), 489–505. https://doi.org/10.1002/mar.20070
  • Boyd, D., & Crawford, K. (2012). Critical questions for big data. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/1369118X.2012.678878
  • Bresnahan, M. J., Guan, X., Wang, X., & Mou, Y. (2008). The culture of the body: Attitudes toward organ donation in China and the US. Chinese Journal of Communication, 1(2), 181–195. https://doi.org/10.1080/17544750802287976
  • Breuer, J., Bishop, L., & Kinder-Kurlanda, K. (2020). The practical and ethical challenges in acquiring and sharing digital trace data: Negotiating public-private partnerships. New Media & Society, 22(11), 2058–2080. https://doi.org/10.1177/1461444820924622
  • Breuer, J., Kmetty, Z., Haim, M., & Stier, S. (2022). User-centric approaches for collecting Facebook data in the ‘post-API age’: Experiences from two studies and recommendations for future research. Information, Communication & Society, 1–20. https://doi.org/10.1080/1369118X.2022.2097015
  • Brosnan, K., Kemperman, A., & Dolnicar, S. (2019). Maximizing participation from online survey panel members. International Journal of Market Research, 63(4), 147078531988070. https://doi.org/10.1177/1470785319880704
  • Bruns, A. (2019). After the ‘APIcalypse’: Social media platforms and their fight against critical scholarly research. Information, Communication & Society, 22(11), 1544–1566. https://doi.org/10.1080/1369118X.2019.1637447
  • Buchanan, T., Paine, C., Joinson, A. N., & Reips, U.-D. (2007). Development of measures of online privacy concern and protection for use on the Internet. Journal of the American Society for Information Science and Technology, 58(2), 157–165. https://doi.org/10.1002/asi.20459
  • Christner, C., Urman, A., Adam, S., & Maier, M. (2022). Automated tracking approaches for studying online media use: A critical review and recommendations. Communication Methods and Measures, 16(2), 79–95. https://doi.org/10.1080/19312458.2021.1907841
  • DataSkop. (2021). Pilotprojekt: Wahlempfehlung: Was zeigt dir der YouTube-Algorithmus zur Bundestagswahl? AlgorithmWatch, Europa-Universität Viadrina, Fachhochschule Potsdam, Universität Paderborn, Verein Mediale Pfade. https://dataskop.net/pilotprojekt-wahlempfehlung-was-zeigt-dir-der-youtube-algorithmus-zur-bundestagswahl-juli-2021/
  • Davis, F. D. (1989). Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly, 13(3), 319–340. https://doi.org/10.2307/249008
  • Dennis, C., Papagiannidis, S., Alamanos, E., & Bourlakis, M. (2016). The role of brand attachment strength in higher education. Journal of Business Research, 69(8), 3049–3057. https://doi.org/10.1016/j.jbusres.2016.01.020
  • Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, phone, mail, and mixed-mode surveys: The tailored design method (4th ed.). Wiley.
  • Duncan, B. (2004). A theory of impact philanthropy. Journal of Public Economics, 88(9–10), 2159–2180. https://doi.org/10.1016/S0047-2727(03)00037-9
  • Fang, J., Shao, P., & Lan, G. (2009). Effects of innovativeness and trust on web survey participation. Computers in Human Behavior, 25(1), 144–152. https://doi.org/10.1016/j.chb.2008.08.002
  • Fishbein, M., & Ajzen, I. (2010). Predicting and changing behavior: The reasoned action approach. Psychology Press. https://doi.org/10.4324/9780203838020
  • Flick, C. (2016). Informed consent and the Facebook emotional manipulation study. Research Ethics, 12(1), 14–28. https://doi.org/10.1177/1747016115599568
  • Fortes, N., & Rita, P. (2016). Privacy concerns and online purchasing behaviour: Towards an integrated model. European Research on Management and Business Economics, 22(3), 167–176. https://doi.org/10.1016/j.iedeen.2016.04.002
  • Freelon, D. (2018). Computational research in the post-API age. Political Communication, 35(4), 665–668. https://doi.org/10.1080/10584609.2018.1477506
  • Grau, S. L., & Folse, J. A. G. (2007). Cause-related marketing (CRM): The influence of donation proximity and message-framing cues on the less-involved consumer. Journal of Advertising, 36(4), 19–33. https://doi.org/10.2753/JOA0091-3367360402
  • Haas, G.-C., Kreuter, F., Keusch, F., Trappmann, M., & Bähr, S. (2020). Effects of incentives in smartphone data collection. In C. A. Hill, P. P. Biemer, T. D. Buskirk, L. Japec, A. Kirchner, S. Kolenikov, & L. E. Lyberg (Eds.), Big data meets survey science (pp. 387–414). Wiley. https://doi.org/10.1002/9781118976357.ch13
  • Ha, Q. -A., Chen, J. V., Uy, H. U., & Capistrano, E. P. (2021). Exploring the privacy concerns in using intelligent virtual assistants under perspectives of information sensitivity and anthropomorphism. International Journal of Human–Computer Interaction, 37(6), 512–527. https://doi.org/10.1080/10447318.2020.1834728
  • Halavais, A. (2019). Overcoming terms of service: A proposal for ethical distributed research. Information, Communication & Society, 22(11), 1567–1581. https://doi.org/10.1080/1369118X.2019.1627386
  • Haunberger, S. (2011). Explaining unit nonresponse in online panel surveys: An application of the extended theory of planned behavior. Journal of Applied Social Psychology, 41(12), 2999–3025. https://doi.org/10.1111/j.1559-1816.2011.00856.x
  • Hillebrand, K., & Hornuf, L. (2021). The social dilemma of big data: Donating personal data to promote social welfare. SSRN Electronic Journal. Advance online publication. https://doi.org/10.2139/ssrn.3801476
  • Hogan, B. (2018). Social media giveth, social media taketh away: Facebook, friendships, and APIs. International Journal of Communication, 12, 592–611.
  • Howison, J., Wiggins, A., & Crowston, K. (2011). Validity issues in the use of social network analysis with digital trace data. Journal of the Association for Information Systems, 12(12), 767–797. https://doi.org/10.17705/1jais.00282
  • Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). Routledge.
  • Hummel, P., Braun, M., & Dabrock, P. (2019). Data donations as exercises of sovereignty. In J. Krutzinna & L. Floridi (Eds.), The ethics of medical data donation (pp. 23–54). Springer International Publishing. https://doi.org/10.1007/978-3-030-04363-6_3
  • IGEM. (2021). Zusammenfassung IGEM-Digimonitor 2021: Die repräsentative Studie zur digitalen Schweiz. https://www.igem.ch/download/Zusammenfassung-IGEM-Digimonitor-2021.pdf
  • Jungherr, A. (2018). Normalizing digital trace data. In N. J. Stroud & S. C. McGregor (Eds.), New agendas in communication. Digital discussions: How big data informs political communication (pp. 9–35). Routledge Taylor & Francis Group.
  • Kayhan, V. O., & Davis, C. J. (2016). Situational privacy concerns and antecedent factors. Journal of Computer Information Systems, 56(3), 228–237. https://doi.org/10.1080/08874417.2016.1153913
  • Keusch, F. (2015). Why do people participate in Web surveys? Applying survey participation theory to Internet survey data collection. Management Review Quarterly, 65(3), 183–216. https://doi.org/10.1007/s11301-014-0111-y
  • Keusch, F., Struminskaya, B., Antoun, C., Couper, M. P., & Kreuter, F. (2019). Willingness to participate in passive mobile data collection. Public Opinion Quarterly, 83(Suppl S1), 210–235. https://doi.org/10.1093/poq/nfz007
  • Keusch, F., Struminskaya, B., Kreuter, F., & Weichbold, M. (2020). Combining active and passive mobile data collection. In C. A. Hill, P. P. Biemer, T. D. Buskirk, L. Japec, A. Kirchner, S. Kolenikov, & L. E. Lyberg (Eds.), Big data meets survey science (pp. 657–682). Wiley. https://doi.org/10.1002/9781118976357.ch22
  • King, G., & Persily, N. (2020). A new model for industry–academic partnerships. PS: Political Science & Politics, 53(4), 703–709. https://doi.org/10.1017/S1049096519001021
  • Kirkpatrick, R. (2013, March 21). A new type of philanthropy: Donating data. Harvard Business Review, 2013. https://hbr.org/2013/03/a-new-type-of-philanthropy-don
  • Kuebler-Wachendorff, S., Luzsa, R., Kranz, J., Mager, S., Syrmoudis, E., Mayr, S., & Grossklags, J. (2021). The right to data portability: Conception, status quo, and future directions. Informatik Spektrum, 44(4), 264–272. https://doi.org/10.1007/s00287-021-01372-w
  • Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. https://doi.org/10.18637/jss.v082.i13
  • Lapinski, M. K., & Rimal, R. N. (2005). An explication of social norms. Communication Theory, 15(2), 127–147. https://doi.org/10.1111/j.1468-2885.2005.tb00329.x
  • Latzer, M., Büchi, M., Kappeler, K., & Festic, N. (2021). Internetanwendungen und deren Nutzung in der Schweiz 2021: Themenbericht aus dem World Internet Project – Switzerland 2021. Universität Zürich. https://mediachange.ch/research/wip-ch-2021/
  • Lazer, D., Pentland, A., Watts, D. J., Aral, S., Athey, S., Contractor, N., Freelon, D., Gonzalez-Bailon, S., King, G., Margetts, H., Nelson, A., Salganik, M. J., Strohmaier, M., Vespignani, A., & Wagner, C. (2020). Computational social science: Obstacles and opportunities. Science, 369(6507), 1060–1062. https://doi.org/10.1126/science.aaz8170
  • Lomborg, S. (2013). Personal Internet archives and ethics. Research Ethics, 9(1), 20–31. https://doi.org/10.1177/1747016112459450
  • Lomborg, S., & Bechmann, A. (2014). Using APIs for data collection on social media. The Information Society, 30(4), 256–265. https://doi.org/10.1080/01972243.2014.915276
  • Malheiros, M., Preibusch, S., & Sasse, M. A. (2013). “Fairly truthful”: The impact of perceived effort, fairness, relevance, and sensitivity on personal data disclosure. In D. Hutchison, T. Kanade, J. Kittler, J. M. Kleinberg, F. Mattern, J. C. Mitchell, M. Naor, O. Nierstrasz, C. Pandu Rangan, B. Steffen, M. Sudan, D. Terzopoulos, D. Tygar, M. Y. Vardi, G. Weikum, M. Huth, N. Asokan, S. Čapkun, I. Flechais, & L. Coles-Kemp (Eds.), Lecture Notes in Computer Science. Trust and trustworthy computing (Vol. 7904, pp. 250–266). Springer. https://doi.org/10.1007/978-3-642-38908-5_19
  • Mancosu, M., & Vegetti, F. (2020). What you can scrape and what is right to scrape: A proposal for a tool to collect public Facebook data. Social Media + Society, 6(3), 205630512094070. https://doi.org/10.1177/2056305120940703
  • Menchen-Trevino, E. (2016, March 20). Web Historian: Enabling multi-method and independent research with real-world web browsing history data. In X. Lin & M. Khoo (Eds.), iConference 2016 Proceedings. iSchools. https://doi.org/10.9776/16611
  • Mothersbaugh, D. L., Foxx, W. K., Beatty, S. E., & Wang, S. (2012). Disclosure antecedents in an online service context. Journal of Service Research, 15(1), 76–98. https://doi.org/10.1177/1094670511424924
  • Nissenbaum, H. (2004). Privacy as contextual integrity. Washington Law Review, 79(1), 119–158.
  • Ohme, J., Araujo, T., de Vreese, C. H., & Piotrowski, J. T. (2020). Mobile data donations: Assessing self-report accuracy and sample biases with the iOS screen time function. Mobile Media & Communication, 9(2), 293–313. https://doi.org/10.1177/2050157920959106
  • Pfiffner, N., Witlox, P., & Friemel, T. N. (2022). Data Donation Module (Version 0.1.26) [Computer software]. https://github.com/uzh/ddm
  • Prainsack, B. (2019). Data donation: How to resist the iLeviathan. In J. Krutzinna & L. Floridi (Eds.), The ethics of medical data donation (pp. 9–22). Springer International Publishing. https://doi.org/10.1007/978-3-030-04363-6_2
  • R Core Team. (2022). R: A language and environment for statistical computing. https://www.R-project.org/
  • Richter, G., Borzikowsky, C., Lesch, W., Semler, S. C., Bunnik, E. M., Buyx, A., & Krawczak, M. (2021). Secondary research use of personal medical data: Attitudes from patient and population surveys in the Netherlands and Germany. European Journal of Human Genetics: EJHG, 29(3), 495–502. https://doi.org/10.1038/s41431-020-00735-3
  • Ruths, D., & Pfeffer, J. (2014). Social media for large studies of behavior. Science, 346(6213), 1063–1064. https://doi.org/10.1126/science.346.6213.1063
  • Seltzer, E., Goldshear, J., Guntuku, S. C., Grande, D., Asch, D. A., Klinger, E. V., & Merchant, R. M. (2019). Patients’ willingness to share digital health and non-health data for research: A cross-sectional study. BMC Medical Informatics and Decision Making, 19(1), 157. https://doi.org/10.1186/s12911-019-0886-9
  • Silber, H., Breuer, J., Beuthner, C., Gummer, T., Keusch, F., Siegers, P., Stier, S., & Weiß, B. (2021). Linking surveys and digital trace data: Insights from two studies on determinants of data sharing behavior. SocArXiv. Advance online publication. https://doi.org/10.31235/osf.io/dz93u
  • Skatova, A., & Goulding, J. (2019). Psychology of personal data donation. Plos One, 14(11), e0224240. https://doi.org/10.1371/journal.pone.0224240
  • Skatova, A., Shiells, K., & Boyd, A. (2019). Attitudes towards transactional data donation and linkage in a longitudinal population study: Evidence from the Avon Longitudinal Study of Parents and Children. Wellcome Open Research, 4, 192. https://doi.org/10.12688/wellcomeopenres.15557.1
  • Sleigh, J. (2018). Experiences of donating personal data to mental health research: An explorative anthropological study. Biomedical Informatics Insights, 10, 1178222618785131. https://doi.org/10.1177/1178222618785131
  • Smith, H. J., Milberg, S. J., & Burke, S. J. (1996). Information privacy: Measuring individuals’ concerns about organizational practices. MIS Quarterly, 20(2), 167–196. https://doi.org/10.2307/249477
  • Sniehotta, F. F., Presseau, J., & Araújo-Soares, V. (2014). Time to retire the theory of planned behaviour. Health Psychology Review, 8(1), 1–7. https://doi.org/10.1080/17437199.2013.869710
  • Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Sage.
  • Stier, S., Breuer, J., Siegers, P., & Thorson, K. (2019). Integrating survey data and digital trace data: Key issues in developing an emerging field. Social Science Computer Review, 38(5), 503–516. https://doi.org/10.1177/0894439319843669
  • Susha, I., Grönlund, Å., & van Tulder, R. (2019). Data driven social partnerships: Exploring an emergent trend in search of research challenges and questions. Government Information Quarterly, 36(1), 112–128. https://doi.org/10.1016/j.giq.2018.11.002
  • Syrmoudis, E., Mager, S., Kuebler-Wachendorff, S., Pizzinini, P., Grossklags, J., & Kranz, J. (2021). Data portability between online services: An empirical analysis on the effectiveness of GDPR art. 20. Proceedings on Privacy Enhancing Technologies, 2021(3), 351–372. https://doi.org/10.2478/popets-2021-0051
  • Theocharis, Y., & Jungherr, A. (2020). Computational social science and the study of political communication. Political Communication, 38(1–2), 1–22. https://doi.org/10.1080/10584609.2020.1833121
  • UNIPARK. (2022). Survey-Software. https://www.unipark.com/en/survey-software/
  • van Driel, I. I., Giachanou, A., Pouwels, J. L., Boeschoten, L., Beyens, I., & Valkenburg, P. M. (2022). Promises and pitfalls of social media data donations. Communication Methods and Measures, 16(4), 266–282. https://doi.org/10.1080/19312458.2022.2109608
  • Walker, S., Mercea, D., & Bastos, M. (2019). The disinformation landscape and the lockdown of social platforms. Information, Communication & Society, 22(11), 1531–1543. https://doi.org/10.1080/1369118X.2019.1648536

Appendices

A. Questionnaire structure

The questionnaire was conceptually divided into three parts.

The first part consisted of a series of fixed questions, beginning with general sociodemographic (i.e., age and gender) and media use measures. Age and gender were used for quota control: interlocked age-gender groups were closed once the intended sample size was reached. Next, participants were presented with a definition of the term data donation, and their attitude toward the concept of data donation for academic research was measured. Subsequent questions captured participants’ attitudes toward the University of Zurich (introduced as the recipient of the data donations in the second part) and toward academic research on the influence of selection algorithms on usage patterns and personal preferences (introduced as the donation purpose in the second part).
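The interlocked quota procedure described above can be sketched as follows. This is a minimal, hypothetical illustration: the cell labels and target sizes are invented for the example and are not taken from the study.

```python
# Hypothetical sketch of interlocked age-gender quota control: each
# combined age-gender cell is closed once its target size is reached.
# Cell definitions and targets below are illustrative only.

QUOTA_TARGETS = {
    ("18-34", "female"): 70,
    ("18-34", "male"): 70,
    ("35-54", "female"): 70,
    ("35-54", "male"): 70,
    ("55+", "female"): 70,
    ("55+", "male"): 70,
}

completed_counts = {cell: 0 for cell in QUOTA_TARGETS}

def screen_participant(age_group: str, gender: str) -> bool:
    """Admit a participant only if their interlocked quota cell is still open."""
    cell = (age_group, gender)
    if cell not in QUOTA_TARGETS:
        return False
    if completed_counts[cell] >= QUOTA_TARGETS[cell]:
        return False  # cell closed: quota reached
    completed_counts[cell] += 1
    return True
```

Because the cells are interlocked (age crossed with gender) rather than marginal, closing one cell (e.g., younger women) does not prevent recruitment of other combinations.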

The second part began with the introduction of a fictional study conducted by the University of Zurich to investigate algorithmic influences across a range of digital platforms. Participants’ attitudes toward different data types and their willingness to donate these data to the fictional research project were then measured. For this, each participant was presented with at most three question blocks, each related to a different platform (see ). Within each block, the questions referred to one platform from the respective platform category (for social networking sites, this defaulted to Instagram; if a participant did not use Instagram, the questions referred to Facebook). A block was skipped if the participant indicated never using any of its related platforms. In each block, participants first indicated their willingness to donate each data type separately; subsequently, perceived data sensitivity, perceived relevance, and perceived behavioral control were measured for the respective data types.
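The block-routing logic described above can be sketched as a small function. This is an illustrative reconstruction, not the study’s actual survey code; the Instagram-default/Facebook-fallback rule for social networking sites and the skip rule follow the appendix text, while the handling of other categories is an assumption for the example.

```python
# Hypothetical sketch of question-block routing: within the social
# networking sites category, Instagram is the default platform and
# Facebook the fallback; a block is skipped (None) when the participant
# uses none of the category's platforms.

def select_block_platform(category, used_platforms):
    """Return the platform a question block refers to, or None to skip it."""
    if category == "social networking sites":
        if "Instagram" in used_platforms:
            return "Instagram"
        if "Facebook" in used_platforms:
            return "Facebook"
        return None  # block skipped: no related platform used
    # Assumed handling for single-platform categories (e.g., YouTube, Google):
    # ask about the platform only if the participant uses it.
    return category if category in used_platforms else None
```

For example, a participant who uses only Facebook receives the social networking block with Facebook wording, while a participant who uses neither Instagram nor Facebook skips that block entirely.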

The third part concluded the questionnaire with additional sociodemographic questions.