Research Article

Lifting the Veil on the Use of Big Data News Repositories: A Documentation and Critical Discussion of A Protest Event Analysis

References

  • Austin Holmes, A., & Baoumi, H. (2016, January 29). Egypt’s protests by the numbers. Carnegie Endowment for International Peace. https://carnegieendowment.org/sada/?fa=62627
  • Banks, A. S. (1997). Cross-national time-series data archive [English]. https://www.worldcat.org/title/cross-national-time-series-data-archive/oclc/768447979
  • Bekker, M. (2022). Better, faster, stronger: Using machine learning to analyse South African police-recorded protest data. South African Review of Sociology, 52(1), 4–23. https://doi.org/10.1080/21528586.2021.1982762
  • Best, R. H., Carpino, C., & Crescenzi, M. J. C. (2013). An analysis of the TABARI coding system. Conflict Management and Peace Science, 30(4), 335–348. https://doi.org/10.1177/0738894213491176
  • Bolivar, F., Camara, N., Davila Egas, T., Orkun Isa, B., Posadas, C., Rodrigo, T., & Vazquez, S. (2021). Understanding the sustainability framework using Big Data. BBVA Research. https://www.bbvaresearch.com/en/publicaciones/global-understanding-the-sustainability-framework-using-big-data/
  • Boyd, D., & Crawford, K. (2012). Critical questions for Big Data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/1369118X.2012.678878
  • Bruns, A., Harrington, S., & Hurcombe, E. (2021). Coronavirus conspiracy theories: Tracing misinformation trajectories from the fringes to the mainstream. In M. Lewis, E. Govender, & K. Holland (Eds.), Communicating COVID-19: Interdisciplinary perspectives (pp. 229–250). Springer International Publishing. https://doi.org/10.1007/978-3-030-79735-5
  • Bruns, A., Hurcombe, E., & Harrington, S. (2021). Covering conspiracy: Approaches to reporting the COVID/5G conspiracy theory. Digital Journalism, 1–22. https://doi.org/10.1080/21670811.2021.1968921
  • Campante, F., & Yanagizawa-Drott, D. (2018). Long-range growth: Economic development in the global network of air links. The Quarterly Journal of Economics, 133(3), 1395–1458. https://doi.org/10.1093/qje/qjx050
  • Carvalho, T. (2022). Contesting austerity: Social movements and the left in Portugal and Spain (2008-2015). Amsterdam University Press.
  • Chalabi, M. (2014, May 6). Kidnapping of girls in Nigeria is part of a worsening problem (Updated). FiveThirtyEight. https://fivethirtyeight.com/features/nigeria-kidnapping/
  • Christensen, D. (2019). Concession stands: How mining investments incite protest in Africa. International Organization, 73(1), 65–101. https://doi.org/10.1017/S0020818318000413
  • Christensen, D., & Garfias, F. (2018). Can you hear me now? How communication technology affects protest and repression. Quarterly Journal of Political Science, 13(1), 89–117. https://doi.org/10.1561/100.00016129
  • Claassen, C., & Gibson, J. L. (2016). Macro-tolerance and protest: Does a culture of political intolerance dampen dissent?
  • Consoli, S., Pezzoli, L. T., & Tosetti, E. (2021). Emotions in macroeconomic news and their impact on the European bond market. Journal of International Money and Finance, 118, 102472. https://doi.org/10.1016/j.jimonfin.2021.102472
  • David Williams, O., Yung, K. C., & Grépin, K. A. (2021). The failure of private health services: COVID-19 induced crises in low- and middle-income country (LMIC) health systems. Global Public Health, 16(8–9), 1320–1333. https://doi.org/10.1080/17441692.2021.1874470
  • Dearing, J. W., Rogers, E. M., & Rogers, E. (1996). Agenda-Setting. SAGE.
  • Diesner, J. (2015). Small decisions with big impact on data analytics. Big Data & Society, 2(2), 2053951715617185. https://doi.org/10.1177/2053951715617185
  • D’Ignazio, C., & Klein, L. F. (2020). Data feminism. MIT Press.
  • Dos Santos, R. F., Perkins, T. K., Wood, C. D., Meyer, W. D., Garfinkle, N. W., Enscore, S. I., Wang, X., Selig, L. A., & Calfas, G. W. (2017). Social and political event data to support Army requirements (ERDC/CERL TR-17-40; Military Facilities Engineering Technology). U.S. Army Engineer Research Development Center.
  • Drakos, K., & Gofas, A. (2006). The devil you know but are afraid to face: Underreporting bias and its distorting effects on the study of terrorism. Journal of Conflict Resolution, 50(5), 714–735. https://doi.org/10.1177/0022002706291051
  • Earl, J., Martin, A., McCarthy, J. D., & Soule, S. A. (2004). The use of newspaper data in the study of collective action. Annual Review of Sociology, 30(1), 65–80. https://doi.org/10.1146/annurev.soc.30.012703.110603
  • Earl, J., Soule, S. A., & McCarthy, J. D. (2003). Protest under fire? Explaining the policing of protest. American Sociological Review, 68(4), 581–606. https://doi.org/10.2307/1519740
  • Eck, K. (2012). In data we trust? A comparison of UCDP GED and ACLED conflict events datasets. Cooperation and Conflict, 47(1), 124–141. https://doi.org/10.1177/0010836711434463
  • The Economist. (2020, March 10). Political protests have become more widespread and more frequent. The Economist. https://www.economist.com/graphic-detail/2020/03/10/political-protests-have-become-more-widespread-and-more-frequent
  • Entman, R. M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication, 43(4), 51–58. https://doi.org/10.1111/j.1460-2466.1993.tb01304.x
  • Fengcai, Q., Jinsheng, D., & Li, W. (2020). An online framework for temporal social unrest event prediction using news stream. 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 176–182. https://doi.org/10.1109/CyberC49757.2020.00036
  • Ferreira, L. N., Hong, I., Rutherford, A., & Cebrian, M. (2021). The small-world network of global protests. Scientific Reports, 11(1), 19215. https://doi.org/10.1038/s41598-021-98628-y
  • Fuchs, C. (2017). From digital positivism and administrative big data analytics towards critical digital and social media research! European Journal of Communication, 32(1), 37–49. https://doi.org/10.1177/0267323116682804
  • Fu, K., & Zhu, Y. (2020). Did the world overlook the media’s early warning of COVID-19? Journal of Risk Research, 23(7–8), 1047–1051. https://doi.org/10.1080/13669877.2020.1756380
  • Galambos, L. (1975). The public image of big business in America, 1880-1940: A quantitative study in social change. Johns Hopkins University Press.
  • The GDELT Project. (2015, February 19). GDELT 2.0: Our global world in realtime. https://blog.gdeltproject.org/gdelt-2-0-our-global-world-in-realtime/
  • The GDELT Project. (2021, January 22). A behind-the-scenes look at how we think about master file formats and timestamping. https://blog.gdeltproject.org/a-behind-the-scenes-look-at-how-we-think-about-master-file-formats-and-timestamping/
  • The GDELT Project. (n.d.). The GDELT project: Watching our world unfold. Retrieved November 18, 2021, from https://www.gdeltproject.org
  • Gitelman, L. (2013). Raw data is an oxymoron. MIT Press.
  • Guo, L., & Vargo, C. (2020). “Fake News” and emerging online media ecosystem: An integrated intermedia agenda-setting analysis of the 2016 U.S. Presidential election. Communication Research, 47(2), 178–200. https://doi.org/10.1177/0093650218777177
  • Haig, C. S., Schmidt, K., & Brannen, S. (2020, March 2). The age of mass protests: Understanding an escalating global trend. Center for Strategic and International Studies. https://www.csis.org/analysis/age-mass-protests-understanding-escalating-global-trend
  • Halkia, M., Ferri, S., Papazoglou, M., Van Damme, M.-S., & Thomakos, D. (2020). Conflict event modelling: Research experiment and event data limitations. Proceedings of AESPEN 2020, 42–48. https://aclanthology.org/2020.aespen-1.8/
  • Hammond, J., & Weidmann, N. B. (2014). Using machine-coded event data for the micro-level study of political violence. Research & Politics, 1(2). https://doi.org/10.1177/20531680145399
  • Hopp, F. R., Fisher, J. T., & Weber, R. (2020). Dynamic transactions between news frames and sociopolitical events: An integrative, hidden Markov model approach. Journal of Communication, 70(3), 335–355. https://doi.org/10.1093/joc/jqaa015
  • Hopp, F. R., Schaffer, J., Fisher, J. T., & Weber, R. (2019). iCoRe: The GDELT interface for the advancement of communication research. Computational Communication Research, 1(1), 13–44. https://doi.org/10.5117/CCR2019.1.002.HOPP
  • Hovy, D., & Prabhumoye, S. (2021). Five sources of bias in natural language processing. Language and Linguistics Compass, 15(8), 8. https://doi.org/10.1111/lnc3.12432
  • Hutter, S. (2014). Protest event analysis and its offspring. In D. Della Porta (Ed.), Methodological practices in social movement research. Oxford Scholarship Online.
  • Jäger, K. (2018). The limits of studying networks via event data: Evidence from the ICEWS dataset. Journal of Global Security Studies, 3(4), 498–511. https://doi.org/10.1093/jogss/ogy015
  • Jenkins, J. C., & Perrow, C. (1977). Insurgency of the powerless: Farm worker movements (1946-1972). American Sociological Review, 42(2), 249–268. https://doi.org/10.2307/2094604
  • Katzenbach, C. (2021). “AI will fix this” – The technical, discursive, and political turn to AI in governing communication. Big Data & Society, 8(2), 20539517211046182. https://doi.org/10.1177/20539517211046182
  • Kolanovic, M., & Krishnamachari, R. T. (2017). Big data and AI strategies: Machine learning and alternative data approach to investing. J.P. Morgan.
  • Kriesi, H., Wüest, B., Lorenzini, J., Makarov, P., Enggist, M., Rothenhäusler, K., Kurer, T., Häusermann, S., Wangen, P., Altiparmakis, A., Borbáth, E., Bremer, B., Gessler, T., Hunger, S., Hutter, S., Schulte-Cloos, J., & Wang, C. (2020). PolDem-protest dataset 30 European countries, Version 1. https://poldem.eui.eu/downloads/pea/poldem-protest_30_codebook.pdf
  • Krippendorff, K. (2018). Content analysis: An introduction to its methodology (Fourth ed.). SAGE.
  • Kurer, T., Häusermann, S., Wüest, B., & Enggist, M. (2019). Economic grievances and political protest. European Journal of Political Research, 58(3), 866–892. https://doi.org/10.1111/1475-6765.12318
  • Kwak, H., & An, J. (2014). A first look at global news coverage of disasters by using the GDELT dataset. In L. M. Aiello & D. McFarland (Eds.), Social informatics: 6th International Conference, Socinfo 2014, Barcelona, Spain, November 11-13, 2014. Proceedings (1st edition). Springer.
  • LaFree, G. (2010). The Global Terrorism Database (GTD): Accomplishments and challenges. Perspectives on Terrorism, 4(1), 24–46.
  • Lakoff, G. (1990). Don’t think of an elephant: Know your values and frame the debate. Chelsea Green Publishing Co.
  • Leetaru, K. (2014, May 30). Did the Arab spring really spark a wave of global protests? Foreign Policy. https://foreignpolicy.com/2014/05/30/did-the-arab-spring-really-spark-a-wave-of-global-protests/
  • Leetaru, K., & Schrodt, P. A. (2013). GDELT: Global data on events, location and tone, 1979-2012.
  • Levin, N., Ali, S., & Crandall, D. (2018). Utilizing remote sensing and big data to quantify conflict intensity: The Arab spring as a case study. Applied Geography, 94, 1–17. https://doi.org/10.1016/j.apgeog.2018.03.001
  • Malik, M., Hopp, F. R., Chen, Y., & Weber, R. (2021). Does regional variation in pathogen prevalence predict the moralization of language in COVID-19 news? Journal of Language and Social Psychology, 40(5–6), 653–676. https://doi.org/10.1177/0261927X211044194
  • Manacorda, M., & Tesei, A. (2020). Liberation technology: Mobile phones and political mobilization in Africa. Econometrica, 88(2), 533–567. https://doi.org/10.3982/ECTA14392
  • Mattoni, A., & Pavan, E. (2018). Politics, participation and big data. Introductory reflections on the ontological, epistemological, and methodological aspects of a complex relationship [Data set]. Partecipazione & Conflitto, 11(2), 313–331. https://doi.org/10.1285/I20356609V11I2P313
  • Mayer-Schönberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work, and think. Houghton Mifflin Harcourt.
  • McAdam, D. (1982/1999). Political process and the development of black insurgency, 1930–1970. University of Chicago Press.
  • McCombs, M. E., & Shaw, D. L. (1972). The agenda-setting function of mass media. Public Opinion Quarterly, 36(2), 176–187. https://doi.org/10.1086/267990
  • Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A. (2021). A survey on bias and fairness in machine learning. ACM Computing Surveys, 54(6), 1–35. https://doi.org/10.1145/3457607
  • Metternich, N. W., Dorff, C., Gallop, M., Weschle, S., & Ward, M. D. (2013). Antigovernment networks in civil conflicts: How network structures affect conflictual behavior. American Journal of Political Science, 57(4), 892–911. https://doi.org/10.1111/ajps.12039
  • Neumayer, C. (2022). Content, form and reception: Perspectives from digital media data. In P. Vossen & A. Fokkens (Eds.), The perspective web (pp. 143–155). Cambridge University Press.
  • Neumayer, C., Mortensen, M., & Poell, T. (2019). Introduction. In C. Neumayer, M. Mortensen, & T. Poell (Eds.), Social media materialities and protest: Critical reflections (pp. 1–14). Routledge.
  • Neumayer, C., & Rossi, L. (2016). 15 years of protest and media technologies scholarship: A sociotechnical timeline. Social Media + Society, 2(3), 2056305116662180. https://doi.org/10.1177/2056305116662180
  • Neumayer, C., Rossi, L., & Struthers, D. M. (2021). Invisible data: A framework for understanding visibility processes in social media data. Social Media + Society, 7(1), 2056305120984472. https://doi.org/10.1177/2056305120984472
  • Odziemkowska, K., & Henisz, W. J. (2021). Webs of influence: Secondary stakeholder actions and cross-national corporate social performance. Organization Science, 32(1), 233–255. https://doi.org/10.1287/orsc.2020.1380
  • Olzak, S. (1992). The dynamics of ethnic competition and conflict. Stanford University Press.
  • Ortiz, A., & Rodrigo, T. (2018). Monitoring global trade support in real time using Big Data. BBVA Research. https://www.bbvaresearch.com/wp-content/uploads/2018/07/Exploring-the-global-trade-and-protectionism-in-real-time-using-Big-Data_.pdf
  • Ou-Yang, L. (n.d.). Newspaper. Github. https://github.com/codelucas/newspaper
  • Ponticelli, J., & Voth, H.-J. (2020). Austerity and anarchy: Budget cuts and social unrest in Europe, 1919–2008. Journal of Comparative Economics, 48(1), 1–19. https://doi.org/10.1016/j.jce.2019.09.007
  • Portos, M. (2021). Grievances and public protests: Political mobilisation in Spain in the age of austerity. Palgrave Macmillan. https://doi.org/10.1007/978-3-030-53405-9
  • Raleigh, C., Linke, A., Hegre, H., & Karlsen, J. (2010). Introducing ACLED: An armed conflict location and event dataset: Special data feature. Journal of Peace Research, 47(5), 651–660. https://doi.org/10.1177/0022343310378914
  • Scheufele, D. A. (1999). Framing as a theory of media effects. Journal of Communication, 49(1), 103–122. https://doi.org/10.1111/j.1460-2466.1999.tb02784.x
  • Schrodt, P. A. (2012). CAMEO conflict and mediation event observations event and actor codebook. http://data.gdeltproject.org/documentation/CAMEO.Manual.1.1b3.pdf
  • Schrodt, P. A., & Yonamine, J. E. (2013). A guide to event data: Past, present, and future. All Azimuth: A Journal of Foreign Policy and Peace, 2(2), 5–22.
  • Sundberg, R., & Melander, E. (2013). Introducing the UCDP georeferenced event dataset. Journal of Peace Research, 50(4), 523–532. https://doi.org/10.1177/0022343313484347
  • Tiedemann, J., & Thottingal, S. (2020). OPUS-MT – Building open translation services for the world. https://helda.helsinki.fi/handle/10138/327852
  • Tilly, C. (1995). Popular contention in Great Britain, 1758-1834. Harvard University Press.
  • Vargo, C. J., & Guo, L. (2017). Networks, big data, and intermedia agenda setting: An analysis of traditional, partisan, and emerging online U.S. news. Journalism & Mass Communication Quarterly, 94(4), 1031–1055. https://doi.org/10.1177/1077699016679976
  • Wang, W., Kennedy, R., Lazer, D., & Ramakrishnan, N. (2016). Growing pains for global monitoring of societal events. Science, 353(6307), 1502–1503. https://doi.org/10.1126/science.aaf6758
  • Wang, D. J., & Soule, S. A. (2012). Social movement organizational collaboration: Networks of learning and the diffusion of protest tactics, 1960–1995. American Journal of Sociology, 117(6), 1674–1722. https://doi.org/10.1086/664685
  • Ward, M. D., Berger, A., Cutler, J., Matthew, D., Cassy, D., & Ben, R. (2013). Comparing GDELT and ICEWS event data.
  • Welbers, K., Van Atteveldt, W., Bajjalieh, J., Shalmon, D., Joshi, P. V., Althaus, S., Chan, C.-H., Wessler, H., & Jungblut, M. (2022). Linking event archives to news: A computational method for analyzing the gatekeeping process. Communication Methods and Measures, 16(1), 59–78. https://doi.org/10.1080/19312458.2021.1953455
  • Williams, S. (2020). Exploration of the Global Database of Events, Language and Tone (GDELT), with specific application to disaster reporting. Office for National Statistics. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/methodologies/explorationoftheglobaldatabaseofeventslanguageandtonegdeltwithspecificapplicationtodisasterreporting#strengths-and-limitations
  • Wright, J., Lennox, R., & Verissimo, D. (2020). Online monitoring of global attitudes towards wildlife. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3658481
  • Wu, C., & Gerber, M. S. (2018). Forecasting civil unrest using social media and protest participation theory. IEEE Transactions on Computational Social Systems, 5(1), 82–94. https://doi.org/10.1109/TCSS.2017.2763128
  • Yesilbas, V., Padilla, J. J., & Frydenlund, E. (2021). An analysis of global news coverage of refugees using a big data approach. In R. Thomson, M. N. Hussain, C. Dancy, & A. Pyke (Eds.), Social, Cultural, and Behavioral Modeling: 14th International Conference, SBP-BRiMS 2021, Virtual Event, July 6–9, 2021, Proceedings (Vol. 12720, pp. 111–120). Springer International Publishing. https://doi.org/10.1007/978-3-030-80387-2
  • Yuen, S., Cheng, E. W., Or, N. H. K., Grépin, K. A., Fu, K.-W., Yung, K.-C., & Yue, R. P. H. (2021). A tale of two city-states: A comparison of the state-led vs civil society-led responses to COVID-19 in Singapore and Hong Kong. Global Public Health, 16(8–9), 1283–1303. https://doi.org/10.1080/17441692.2021.1877769
  • Zheng, C. (2020). Comparisons of the city brand influence of global cities: Word-embedding based semantic mining and clustering analysis on the big data of GDELT global news knowledge graph. Sustainability, 12(16), 6294. https://doi.org/10.3390/su12166294