SI: Computational Me

Digital Trace Data Collection for Social Media Effects Research: APIs, Data Donation, and (Screen) Tracking


  • Abul-Fottouh, D., Song, M. Y., & Gruzd, A. (2020). Examining algorithmic biases in YouTube’s recommendations of vaccine videos. International Journal of Medical Informatics, 140, 104175. https://doi.org/10.1016/j.ijmedinf.2020.104175
  • Allcott, H., Braghieri, L., Eichmeyer, S., & Gentzkow, M. (2020). The welfare effects of social media. The American Economic Review, 110(3), 629–676. https://doi.org/10.1257/aer.20190658
  • Allen, J., Howland, B., Mobius, M., Rothschild, D., & Watts, D. J. (2020). Evaluating the fake news problem at the scale of the information ecosystem. Science Advances, 6(14), eaay3539. https://doi.org/10.1126/sciadv.aay3539
  • Amaya, A., Biemer, P. P., & Kinyon, D. (2020). Total error in a big data world: Adapting the TSE framework to big data. Journal of Survey Statistics and Methodology, 8(1), 89–119. https://doi.org/10.1093/jssam/smz056
  • Araujo, T., Ausloos, J., van Atteveldt, W., Loecherbach, F., Moeller, J., Ohme, J., Trilling, D., van de Velde, B., de Vreese, C., & Welbers, K. (2022). OSD2F: An open-source data donation framework. Computational Communication Research, 4(2), 372–387. https://doi.org/10.5117/CCR2022.2.001.ARAU
  • Araujo, T., Wonneberger, A., Neijens, P., & Vreese, C. D. (2017). How much time do you spend online? Understanding and improving the accuracy of self-reported measures of internet use. Communication Methods and Measures, 11(3), 173–190. https://doi.org/10.1080/19312458.2017.1317337
  • Ausloos, J., & Veale, M. (2021). Researching with data rights. Technology and Regulation, 136–157. http://dx.doi.org/10.2139/ssrn.3465680
  • Bartley, N., Abeliuk, A., Ferrara, E., & Lerman, K. (2021). Auditing algorithmic bias on Twitter. 13th ACM Web Science Conference 2021, 65–73. https://doi.org/10.1145/3447535.3462491
  • Baumgartner, S. E., Sumter, S. R., Petkevič, V., & Wiradhany, W. (2022). A novel iOS data donation approach: Automatic processing, compliance, and reactivity in a longitudinal study. Social Science Computer Review, 08944393211071068. https://doi.org/10.1177/08944393211071068
  • Bennett, W. L., & Iyengar, S. (2008). A new era of minimal effects? The changing foundations of political communication. Journal of Communication, 58(4), 707–731.
  • Boeschoten, L., Ausloos, J., Moeller, J., Araujo, T., & Oberski, D. L. (2020). Digital trace data collection through data donation. ArXiv:2011.09851 [Cs, Stat]. http://arxiv.org/abs/2011.09851
  • Boeschoten, L., Ausloos, J., Möller, J. E., Araujo, T., & Oberski, D. L. (2022). A framework for privacy preserving digital trace data collection through data donation. Computational Communication Research, 4(2), 388–423. https://doi.org/10.5117/CCR2022.2.002.BOES
  • Boeschoten, L., Mendrik, A., van der Veen, E., Vloothuis, J., Hu, H., Voorvaart, R., & Oberski, D. L. (2022). Privacy-preserving local analysis of digital trace data: A proof-of-concept. Patterns, 3(3), 100444. https://doi.org/10.1016/j.patter.2022.100444
  • Borges Do Nascimento, I. J., Beatriz Pizarro, A., Almeida, J., Azzopardi-Muscat, N., André Gonçalves, M., Björklund, M., & Novillo-Ortiz, D. (2022). Infodemics and health misinformation: A systematic review of reviews. Bulletin of the World Health Organization, 100(9), 544–561. https://doi.org/10.2471/BLT.21.287654
  • Breuer, J., Kmetty, Z., Haim, M., & Stier, S. (2022). User-centric approaches for collecting Facebook data in the ‘post-API age’: Experiences from two studies and recommendations for future research. Information, Communication & Society. https://doi.org/10.1080/1369118X.2022.2097015
  • Burgess, J., Angus, D., Carah, N., Andrejevic, M., Hawker, K., Lewis, K., Obeid, A. K., Smith, A., Tan, J., Fordyce, R., Trott, V., & Li, L. (2021). Critical simulation as hybrid digital method for exploring the data operations and vernacular cultures of visual social media platforms. Preprint SocArXiv. https://doi.org/10.31235/osf.io/2cwsu
  • Chiatti, A., Cho, M. J., Gagneja, A., Yang, X., Brinberg, M., Roehrick, K., Choudhury, S. R., Ram, N., Reeves, B., & Giles, C. L. (2018). Text extraction and retrieval from smartphone screenshots: Building a repository for life in media. Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 948–955. https://doi.org/10.1145/3167132.3167236
  • Christner, C., Urman, A., Adam, S., & Maier, M. (2022). Automated tracking approaches for studying online media use: A critical review and recommendations. Communication Methods and Measures, 16(2), 79–95. https://doi.org/10.1080/19312458.2021.1907841
  • Cohn, J. (2019). The burden of choice: Recommendations, subversion, and algorithmic culture. Rutgers University Press.
  • Cramer, H., Garcia-Gathright, J., Springer, A., & Reddy, S. (2018). Assessing and addressing algorithmic bias in practice. Interactions, 25(6), 58–63. https://doi.org/10.1145/3278156
  • Cronin, J., von Hohenberg, B. C., Gonçalves, J. F. F., Menchen-Trevino, E., & Wojcieszak, M. (2022). The (null) over-time effects of exposure to local news websites: Evidence from trace data. Journal of Information Technology & Politics, 1–15. https://doi.org/10.1080/19331681.2022.2123878
  • De Vreese, C. H., Boukes, M., Schuck, A., Vliegenthart, R., Bos, L., & Lelkes, Y. (2017). Linking survey and media content data: Opportunities, considerations, and pitfalls. Communication Methods and Measures, 11(4), 221–244. https://doi.org/10.1080/19312458.2017.1380175
  • Diener, E. (1984). Subjective well-being. Psychological Bulletin, 95(3), 542–575. https://doi.org/10.1037/0033-2909.95.3.542
  • European Union. (2016). Regulation (EU) 2016/679 of the European parliament and of the council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing directive 95/46/EC (general data protection regulation). OJ, 59(L 119), 1–89.
  • Fan, Y., Lehmann, S., & Blok, A. (2022). Extracting the interdisciplinary specialty structures in social media data-based research: A clustering-based network approach. Journal of Informetrics, 16(3), 101310. https://doi.org/10.1016/j.joi.2022.101310
  • Freelon, D. (2014). On the interpretation of digital trace data in communication and social computing research. Journal of Broadcasting & Electronic Media, 58(1), 59–75. https://doi.org/10.1080/08838151.2013.875018
  • Freelon, D. (2018). Computational research in the post-API age. Political Communication, 35(4), 665–668. https://doi.org/10.1080/10584609.2018.1477506
  • Freelon, D., Pruden, M. L., Malmer, D., & Crist, A. (2022). Piegraph. [Computer software]. Retrieved from http://pcad.ils.unc.edu
  • Freelon, D., & Wells, C. (2020). Disinformation as political communication. Political Communication.
  • Gaisbauer, F., Pournaki, A., Banisch, S., Olbrich, E., & Guidi, B. (2021). Ideological differences in engagement in public debate on Twitter. PLOS ONE, 16(3), e0249241. https://doi.org/10.1371/journal.pone.0249241
  • Gillespie, T. (2014). The relevance of algorithms. In T. Gillespie, P. J. Boczkowski, & K. A. Foot (Eds.), Media technologies: Essays on communication, materiality, and society (pp. 167–194). MIT Press.
  • Grinberg, N., Joseph, K., Friedland, L., Swire-Thompson, B., & Lazer, D. (2019). Fake news on Twitter during the 2016 U.S. presidential election. Science, 363(6425), 374–378. https://doi.org/10.1126/science.aau2706
  • Guess, A., Nagler, J., & Tucker, J. (2019). Less than you think: Prevalence and predictors of fake news dissemination on Facebook. Science Advances, 5(1), eaau4586. https://doi.org/10.1126/sciadv.aau4586
  • Guo, J. (2022). Deep learning approach to text analysis for human emotion detection from big data. Journal of Intelligent Systems, 31(1), 113–126. https://doi.org/10.1515/jisys-2022-0001
  • Halavais, A. (2019). Overcoming terms of service: A proposal for ethical distributed research. Information, Communication & Society, 22(11), 1567–1581. https://doi.org/10.1080/1369118X.2019.1627386
  • Hancock, J. T., Liu, S. X., Luo, M., & Mieczkowski, H. (2022). Social media and psychological well-being. In S. C. Matz (Ed.), The psychology of technology: Social science research in the age of big data (pp. 195–238). American Psychological Association. https://doi.org/10.1037/0000290-007
  • Hooker, S. (2021). Moving beyond “algorithmic bias is a data problem.” Patterns, 2(4), 100241. https://doi.org/10.1016/j.patter.2021.100241
  • Howison, J., Wiggins, A., & Crowston, K. (2011). Validity issues in the use of social network analysis with digital trace data. Journal of the Association for Information Systems, 12(12), 767–797. https://doi.org/10.17705/1jais.00282
  • Kmetty, Z., & Németh, R. (2022). Which is your favorite music genre? A validity comparison of Facebook data and survey data. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique, 154(1), 82–104. https://doi.org/10.1177/07591063211061754
  • Lambrecht, A., & Tucker, C. (2019). Algorithmic bias? An empirical study of apparent gender-based discrimination in the display of STEM career ads. Management Science, 65(7), 2966–2981. https://doi.org/10.1287/mnsc.2018.3093
  • Lazer, D., Hargittai, E., Freelon, D., Gonzalez-Bailon, S., Munger, K., Ognyanova, K., & Radford, J. (2021). Meaningful measures of human society in the twenty-first century. Nature, 595(7866), 189–196. https://doi.org/10.1038/s41586-021-03660-7
  • Lee, J. H. S., Li, T., Hsu, W., & Lee, M. L. (2021). Repurpose image identification for fake news detection. In C. Strauss, G. Kotsis, A. M. Tjoa, & I. Khalil (Eds.), Database and Expert Systems Applications: 32nd International Conference, DEXA 2021, Virtual Event, September 27–30, 2021, Proceedings, Part II (Vol. 12924). Springer International Publishing. https://doi.org/10.1007/978-3-030-86475-0
  • Lee, J., Reeves, N., Ram, B., & Hamilton, J. (2022). The psychology of poverty and life online: Natural experiments on the effects of smartphone payday loan ads on psychological stress. Information, Communication & Society, 1–22. https://doi.org/10.1080/1369118X.2022.2109982
  • Luiten, A., Hox, J. J. C. M., & De Leeuw, E. D. (2020). Survey nonresponse trends and fieldwork effort in the 21st century: Results of an international study across countries and surveys. Journal of Official Statistics, 36(3), 469–487. https://doi.org/10.2478/jos-2020-0025
  • Mackey, T. K., Purushothaman, V., Haupt, M., Nali, M. C., & Li, J. (2021). Application of unsupervised machine learning to identify and characterise hydroxychloroquine misinformation on Twitter. Lancet Digital Health, 3(2), e72–e75. https://doi.org/10.1016/S2589-7500(20)30318-6
  • Mellon, J., & Prosser, C. (2017). Twitter and Facebook are not representative of the general population: Political attitudes and demographics of British social media users. Research & Politics, 4(3), 2053168017720008. https://doi.org/10.1177/2053168017720008
  • Menchen-Trevino, E. (2016). Web historian: Enabling multi-method and independent research with real-world web browsing history data. IConference 2016 Proceedings (iSchools). https://doi.org/10.9776/16611
  • Noble, S. U. (2018). Algorithms of oppression: How search engines reinforce racism. NYU Press.
  • Ohme, J., & Araujo, T. (2022). Digital data donations: A quest for best practices. Patterns, 3(4), 100467. https://doi.org/10.1016/j.patter.2022.100467
  • Ohme, J., Araujo, T., de Vreese, C. H., & Piotrowski, J. T. (2021). Mobile data donations: Assessing self-report accuracy and sample biases with the iOS Screen Time function. Mobile Media & Communication, 9(2), 293–313. https://doi.org/10.1177/2050157920959106
  • Orben, A. (2020). Teenagers, screens and social media: A narrative review of reviews and key studies. Social Psychiatry and Psychiatric Epidemiology, 55(4), 407–414. https://doi.org/10.1007/s00127-019-01825-4
  • Otto, L. P., Thomas, F., Glogger, I., & De Vreese, C. H. (2022). Linking media content and survey data in a dynamic and digital media environment – mobile longitudinal linkage analysis. Digital Journalism, 10(1), 200–215. https://doi.org/10.1080/21670811.2021.1890169
  • Parry, D. A., Davidson, B. I., Sewall, C. J. R., Fisher, J. T., Mieczkowski, H., & Quintana, D. S. (2021). A systematic review and meta-analysis of discrepancies between logged and self-reported digital media use. Nature Human Behaviour, 5(11), 1535–1547. https://doi.org/10.1038/s41562-021-01117-5
  • Reeves, B., Ram, N., Robinson, T. N., Cummings, J. J., Giles, C. L., Pan, J., Chiatti, A., Cho, M., Roehrick, K., Yang, X., Gagneja, A., Brinberg, M., Muise, D., Lu, Y., Luo, M., Fitzgerald, A., & Yeykelis, L. (2021). Screenomics: A framework to capture and analyze personal life experiences and the ways that technology shapes them. Human–Computer Interaction, 36(2), 150–201. https://doi.org/10.1080/07370024.2019.1578652
  • Robertson, R. E., Jiang, S., Joseph, K., Friedland, L., Lazer, D., & Wilson, C. (2018). Auditing partisan audience bias within Google Search. Proceedings of the ACM on Human-Computer Interaction, 2(CSCW), 1–22. https://doi.org/10.1145/3274417
  • Scharkow, M. (2016). The accuracy of self-reported internet use—a validation study using client log data. Communication Methods and Measures, 10(1), 13–27. https://doi.org/10.1080/19312458.2015.1118446
  • Singh, B., & Sharma, D. K. (2022). Predicting image credibility in fake news over social media using multi-modal approach. Neural Computing & Applications, 34(24), 21503–21517. https://doi.org/10.1007/s00521-021-06086-4
  • Stadel, M., & Stulp, G. (2022). Balancing bias and burden in personal network studies. Social Networks, 70, 16–24. https://doi.org/10.1016/j.socnet.2021.10.007
  • Stier, S., Breuer, J., Siegers, P., & Thorson, K. (2020). Integrating survey data and digital trace data: Key issues in developing an emerging field. Social Science Computer Review, 38(5), 503–516. https://doi.org/10.1177/0894439319843669
  • Sun, X., Ram, N., Reeves, B., Cho, M. -J., Fitzgerald, A., & Robinson, T. N. (2022). Connectedness and independence of young adults and parents in the digital world: Observing smartphone interactions at multiple timescales using screenomics. Journal of Social and Personal Relationships. https://doi.org/10.1177/02654075221104268
  • Thorson, K., Cotter, K., Medeiros, M., & Pak, C. (2021). Algorithmic inference, political interest, and exposure to news and politics on Facebook. Information, Communication & Society, 24(2), 183–200. https://doi.org/10.1080/1369118X.2019.1642934
  • Toth, R., & Trifonova, T. (2021). Somebody’s watching me: Smartphone use tracking and reactivity. Computers in Human Behavior Reports, 4, 100142. https://doi.org/10.1016/j.chbr.2021.100142
  • Tufekci, Z. (2014). Engineering the public: Big data, surveillance and computational politics. First Monday, 19(7). https://doi.org/10.5210/fm.v19i7.4901
  • Valkenburg, P. M. (2022). Theoretical foundations of social media uses and effects. In J. Nesi, E. H. Telzer, & M. J. Prinstein (Eds.), Handbook of adolescent digital media use and mental health (1st ed., pp. 39–60). Cambridge University Press. https://doi.org/10.1017/9781108976237.004
  • Valkenburg, P. M., Beyens, I., Meier, A., & Vanden Abeele, M. M. P. (2022). Advancing our understanding of the associations between social media use and well-being. Current Opinion in Psychology, 47, 101357. https://doi.org/10.1016/j.copsyc.2022.101357
  • Valkenburg, P., Beyens, I., Pouwels, J. L., van Driel, I. I., & Keijsers, L. (2021). Social media use and adolescents’ self-esteem: Heading for a person-specific media effects paradigm. The Journal of Communication, 71(1), 56–78. https://doi.org/10.1093/joc/jqaa039
  • Vanden Abeele, M. M. P. (2020). Digital wellbeing as a dynamic construct. Communication Theory. https://doi.org/10.1093/ct/qtaa024
  • van Driel, I. I., Giachanou, A., Pouwels, J. L., Boeschoten, L., Beyens, I., & Valkenburg, P. M. (2022). Promises and pitfalls of social media data donations. Communication Methods and Measures, 16(4), 266–282. https://doi.org/10.1080/19312458.2022.2109608
  • Vosoughi, S., Roy, D., & Aral, S. (2018). The spread of true and false news online. Science, 359(6380), 1146–1151. https://doi.org/10.1126/science.aap9559
  • Vuorre, M., & Przybylski, A. K. (2022). Global well-being and mental health in the internet age. Preprint PsyArXiv. https://doi.org/10.31234/osf.io/9tbjy
  • Wagner, C., Strohmaier, M., Olteanu, A., Kıcıman, E., Contractor, N., & Eliassi-Rad, T. (2021). Measuring algorithmically infused societies. Nature, 595(7866), 197–204. https://doi.org/10.1038/s41586-021-03666-1
  • Wojcieszak, M., Menchen-Trevino, E., Goncalves, J. F. F., & Weeks, B. (2022). Avenues to news and diverse news exposure online: Comparing direct navigation, social media, news aggregators, search queries, and article hyperlinks. The International Journal of Press/Politics, 27(4), 194016122110091. https://doi.org/10.1177/19401612211009160
  • World Health Organization. (2022, September 1). Infodemics and misinformation negatively affect people’s health behaviours, new WHO review finds. https://www.who.int/europe/news/item/01-09-2022-infodemics-and-misinformation-negatively-affect-people-s-health-behaviours--new-who-review-finds
  • Yang, K.-C., Pierri, F., Hui, P.-M., Axelrod, D., Torres-Lugo, C., Bryden, J., & Menczer, F. (2021). The COVID-19 infodemic: Twitter versus Facebook. Big Data & Society, 8(1), 20539517211013861. https://doi.org/10.1177/20539517211013861
  • Yang, X., Ram, N., Robinson, T., & Reeves, B. (2019). Using screenshots to predict task switching on smartphones. Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, 1–6. https://doi.org/10.1145/3290607.3313089
  • Yee, A. Z. H., Yu, R., Lim, S. S., Lim, K. H., Dinh, T. T. A., Loh, L., Hadianto, A., & Quizon, M. (2022). ScreenLife capture: An open-source and user-friendly framework for collecting screenome data from Android smartphones. Behavior Research Methods, 1–18. https://doi.org/10.3758/s13428-022-02006-z
  • Yeykelis, L., Cummings, J. J., & Reeves, B. (2014). Multitasking on a single device: Arousal and the frequency, anticipation, and prediction of switching between media content on a computer. The Journal of Communication, 64(1), 167–192. https://doi.org/10.1111/jcom.12070
  • Zhang, L. C. (2012). Topics of statistical theory for register‐based statistics and data integration. Statistica Neerlandica, 66(1), 41–63. https://doi.org/10.1111/j.1467-9574.2011.00508.x