(What) Can Journalism Studies Learn from Supervised Machine Learning?


  • Bishop, C. M. 2006. Pattern Recognition and Machine Learning. Singapore: Springer.
  • Boumans, J. W., and D. Trilling. 2016. “Taking Stock of the Toolkit: An Overview of Relevant Automated Content Analysis Approaches and Techniques for Digital Journalism Scholars.” Digital Journalism 4 (1): 8–23.
  • Breiman, L. 2001. “Statistical Modeling: The Two Cultures (with Comments and a Rejoinder by the Author).” Statistical Science 16 (3): 199–231.
  • Broersma, M., and F. Harbers. 2018. “Exploring Machine Learning to Study the Long-Term Transformation of News: Digital Newspaper Archives, Journalism History, and Algorithmic Transparency.” Digital Journalism 6 (9): 1150–1164.
  • Bruns, A. 2005. Gatewatching: Collaborative Online News Production. New York: Peter Lang.
  • Burggraaff, C., and D. Trilling. 2017. “Through a Different Gate: An Automated Content Analysis of How Online News and Print News Differ.” Journalism. doi:10.1177/1464884917716699.
  • Burscher, B., D. Odijk, R. Vliegenthart, M. de Rijke, and C. H. de Vreese. 2014. “Teaching the Computer to Code Frames in News: Comparing Two Supervised Machine Learning Approaches to Frame Analysis.” Communication Methods and Measures 8 (3): 190–206.
  • Burscher, B., R. Vliegenthart, and C. H. de Vreese. 2015. “Using Supervised Machine Learning to Code Policy Issues: Can Classifiers Generalize Across Contexts?” The ANNALS of the American Academy of Political and Social Science 659 (1): 122–131.
  • Carlson, M., S. Robinson, S. C. Lewis, and D. A. Berkowitz. 2018. “Journalism Studies and its Core Commitments: The Making of a Communication Field.” Journal of Communication 68 (1): 6–25.
  • Castillo, C., M. El-Haddad, J. Pfeffer, and M. Stempeck. 2014, February. “Characterizing the Life Cycle of Online News Stories Using Social Media Reactions.” In Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing, 211–223. Baltimore, Maryland: ACM.
  • Castillo, C., M. Mendoza, and B. Poblete. 2011, March. “Information Credibility on Twitter.” In Proceedings of the 20th International Conference on World Wide Web, 675–684. Hyderabad, India: ACM.
  • Chadwick, A. 2013. The Hybrid Media System: Politics and Power. Oxford, UK: Oxford University Press.
  • Chen, N.-C., M. Drouhard, R. Kocielnik, J. Suh, and C. R. Aragon. 2018. “Using Machine Learning to Support Qualitative Coding in Social Science.” ACM Transactions on Interactive Intelligent Systems 8 (2): 1–20. doi:10.1145/3232718.
  • Deacon, D. 2007. “Yesterday’s Papers and Today’s Technology.” European Journal of Communication 22 (1): 5–25.
  • De Choudhury, M., N. Diakopoulos, and M. Naaman. 2012, February. “Unfolding the Event Landscape on Twitter: Classification and Exploration of User Categories.” In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, 241–244. ACM.
  • de Mello, R. F., and M. A. Ponti. 2018. Machine Learning: A Practical Approach on the Statistical Learning Theory. Cham, Switzerland: Springer.
  • Deutch, D., and N. Frost. (n.d.). Explaining White-box Classifications to Data Scientists. Retrieved from https://www.cs.tau.ac.il/~danielde/WhiteBoxFull.pdf.
  • Deuze, M. 2008. “The Changing Context of News Work: Liquid Journalism for a Monitorial Citizenry.” International Journal of Communication 18 (2): 848–865.
  • Doshi-Velez, F., and B. Kim. 2017. Towards A Rigorous Science of Interpretable Machine Learning. arXiv preprint arXiv:1702.08608.
  • Esser, F. 1999. “‘Tabloidization’ of News: A Comparative Analysis of Anglo-American and German Press Journalism.” European Journal of Communication 14 (3): 291–324.
  • Esser, F., and A. Umbricht. 2014. “The Evolution of Objective and Interpretative Journalism in the Western Press: Comparing Six News Systems Since the 1960s.” Journalism & Mass Communication Quarterly 91 (2): 229–249.
  • Fisher, C. 2016. “The Advocacy Continuum: Towards a Theory of Advocacy in Journalism.” Journalism 17 (6): 711–726.
  • Flaounas, I., O. Ali, T. Lansdall-Welfare, T. De Bie, N. Mosdell, J. Lewis, and N. Cristianini. 2013. “Research Methods in the Age of Digital Journalism.” Digital Journalism 1 (1): 102–116. doi:10.1080/21670811.2012.714928.
  • Gigerenzer, G. 2004. “Mindless Statistics.” The Journal of Socio-Economics 33 (5): 587–606.
  • Grinberg, N. 2018. “Identifying Modes of User Engagement With Online News and Their Relationship to Information Gain in Text.” In Proceedings of the 2018 World Wide Web Conference, 1745–1754. International World Wide Web Conferences Steering Committee, April.
  • Günther, E., and T. Quandt. 2015. “Word Counts and Topic Models.” Digital Journalism 4 (1): 75–88. doi:10.1080/21670811.2015.1093270.
  • Hamborg, F., K. Donnay, and B. Gipp. 2018. “Automated Identification of Media Bias in News Articles: An Interdisciplinary Literature Review.” International Journal on Digital Libraries. doi:10.1007/s00799-018-0261-y.
  • Hamborg, F., S. Lachnit, M. Schubotz, T. Hepp, and B. Gipp. 2018, March. “Giveme5W: Main Event Retrieval from News Articles by Extraction of the Five Journalistic W Questions.” In International Conference on Information, 356–366. Springer.
  • Heinrich, A. 2011. Network Journalism: Journalistic Practice in Interactive Spheres. New York: Routledge.
  • Hermida, A. 2010. “Twittering the News: The Emergence of Ambient Journalism.” Journalism Practice 4 (3): 297–308.
  • Herrera, F., F. Charte, A. J. Rivera, and M. J. del Jesus. 2016. Multilabel Classification: Problem Analysis, Metrics and Techniques. doi:10.1007/978-3-319-41111-8.
  • Hester, J. B., and E. Dougall. 2007. “The Efficiency of Constructed Week Sampling for Content Analysis of Online News.” Journalism & Mass Communication Quarterly 84 (4): 811–824.
  • Hochreiter, S., and J. Schmidhuber. 1997. “LSTM Can Solve Hard Long Time Lag Problems.” In Advances in Neural Information Processing Systems, 473–479.
  • James, G., D. Witten, T. Hastie, and R. Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. New York: Springer.
  • Jiang, L., and E. H. Han. 2019. ModBot: Automatic Comments Moderation. https://drive.google.com/file/d/10bkVVPwqMolzEm_MSsmkecoWWq496pw7/.
  • Joshi, M., W. W. Cohen, M. Dredze, and C. P. Rosé. 2012. “Multi-Domain Learning: When Do Domains Matter?” In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 1302–1312. Association for Computational Linguistics, July.
  • Kaplan, R. M., D. A. Chambers, and R. E. Glasgow. 2014. “Big Data and Large Sample Size: A Cautionary Note on the Potential for Bias.” Clinical and Translational Science 7 (4): 342–346.
  • Karlsson, M., C. Clerwall, and L. Nord. 2014. “You Ain’t Seen Nothing Yet: Transparency’s (Lack of) Effect on Source and Message Credibility.” Journalism Studies 15 (5): 668–678.
  • Kinder, D. R. 2007. “Curmudgeonly Advice.” Journal of Communication 57 (1): 155–162.
  • Krawczyk, B. 2016. “Learning From Imbalanced Data: Open Challenges and Future Directions.” Progress in Artificial Intelligence 5 (4): 221–232.
  • Lazer, D., A. Pentland, L. Adamic, S. Aral, A.-L. Barabasi, D. Brewer, N. Christakis, N. Contractor, J. Fowler, and M. Gutmann. 2009. “Computational Social Science.” Science 323 (5915): 721–723. doi:10.1126/science.1167742.
  • Leavy, S. 2018. “Uncovering Gender Bias in Newspaper Coverage of Irish Politicians Using Machine Learning.” Digital Scholarship in the Humanities 34 (1): 48–63.
  • Lewis, D. D., Y. Yang, T. G. Rose, and F. Li. 2004. “RCV1: A New Benchmark Collection for Text Categorization Research.” Journal of Machine Learning Research 5: 361–397.
  • Lovejoy, J., B. R. Watson, S. Lacy, and D. Riffe. 2014. “Assessing the Reporting of Reliability in Published Content Analyses: 1985–2010.” Communication Methods and Measures 8 (3): 207–221.
  • Mahrt, M., and M. Scharkow. 2013. “The Value of Big Data in Digital Media Research.” Journal of Broadcasting & Electronic Media 57 (1): 20–33.
  • Margolin, D. B. 2019. “Computational Contributions: A Symbiotic Approach to Integrating Big, Observational Data Studies Into the Communication Field.” Communication Methods and Measures. doi:10.1080/19312458.2019.1639144.
  • McCombs, M. E., and D. L. Shaw. 1972. “The Agenda-Setting Function of Mass Media.” Public Opinion Quarterly 36 (2): 176–187.
  • Nelson, L. K. 2017. “Computational Grounded Theory: A Methodological Framework.” Sociological Methods & Research. doi:10.1177/0049124117729703.
  • Opperhuizen, A. E., K. Schouten, and E. H. Klijn. 2019. “Framing a Conflict! How Media Report on Earthquake Risks Caused by Gas Drilling.” Journalism Studies 20 (5): 717–734.
  • Qin, J. 2015. “Hero on Twitter, Traitor on News.” The International Journal of Press/Politics 20 (2): 166–184.
  • Ribeiro, M. T., S. Singh, and C. Guestrin. 2016, August. “Why Should I Trust You?: Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144. ACM.
  • Riffe, D., S. Lacy, and F. Fico. 2014. Analyzing Media Messages: Using Quantitative Content Analysis in Research. 3rd ed. Mahwah: Routledge.
  • Rizos, G., S. Papadopoulos, and Y. Kompatsiaris. 2016. “Predicting News Popularity by Mining Online Discussions.” In Proceedings of the 25th International Conference Companion on World Wide Web, 737–742. International World Wide Web Conferences Steering Committee, April.
  • Rudkowsky, E., M. Haselmayer, M. Wastian, M. Jenny, Š. Emrich, and M. Sedlmair. 2018. “More Than Bags of Words: Sentiment Analysis with Word Embeddings.” Communication Methods and Measures 12 (2–3): 140–157.
  • Scharkow, M. 2013. “Thematic Content Analysis Using Supervised Machine Learning: An Empirical Evaluation Using German Online News.” Quality and Quantity 47 (2): 761–773.
  • Schaudt, S., and S. Carpenter. 2009. “The News That’s Fit to Click: An Analysis of Online News Values and Preferences Present in the Most-Viewed Stories on Azcentral.com.” Southwestern Mass Communication Journal 24 (2): 17–26.
  • Scheufele, D. A., and D. Tewksbury. 2006. “Framing, Agenda Setting, and Priming: The Evolution of Three Media Effects Models.” Journal of Communication 57 (1): 9–20.
  • Shmueli, G. 2010. “To Explain or to Predict?” Statistical Science 25 (3): 289–310.
  • Shreyas, R., D. M. Akshata, B. S. Mahanand, B. Shagun, and C. M. Abhishek. 2016. “Predicting Popularity of Online Articles Using Random Forest Regression.” In 2016 Second International Conference on Cognitive Computing and Information Processing (CCIP), 1–5. IEEE, August.
  • Singer, J. B. 2014. “User-generated Visibility: Secondary Gatekeeping in a Shared Media Space.” New Media & Society 16 (1): 55–73.
  • Soroka, S., L. Young, and M. Balmas. 2015. “Bad News or Mad News? Sentiment Scoring of Negativity, Fear, and Anger in News Content.” The ANNALS of the American Academy of Political and Social Science 659 (1): 108–121.
  • Steele, C. A., and K. G. Barnhurst. 1996. “The Journalism of Opinion: Network News Coverage of U.S. Presidential Campaigns, 1968–1988.” Critical Studies in Mass Communication 13 (3): 187–209.
  • Steensen, S., and L. Ahva. 2015. “Theories of Journalism in a Digital Age.” Journalism Practice 9 (1): 1–18.
  • Trilling, D., P. Tolochko, and B. Burscher. 2017. “From Newsworthiness to Shareworthiness: How to Predict News Sharing Based on Article Characteristics.” Journalism & Mass Communication Quarterly 94 (1): 38–60.
  • Van Canneyt, S., P. Leroux, B. Dhoedt, and T. Demeester. 2018. “Modeling and Predicting the Popularity of Online News Based on Temporal and Content-Related Features.” Multimedia Tools and Applications 77 (1): 1409–1436.
  • Vasdev, S. 2019. “Can Machine Learning Help us Measure the Trustworthiness of News?” Presented at the Computation + Journalism Symposium, February. https://drive.google.com/file/d/1_Qrp2pGhl3eu7r-3BU6nwG_XN32jv7T_/view.
  • White, D. M. 1950. “The ‘Gatekeeper’: A Case Study in the Selection of News.” Journalism & Mass Communication Quarterly 27 (4).
  • Wu, B., and H. Shen. 2015. “Analyzing and Predicting News Popularity on Twitter.” International Journal of Information Management 35 (6): 702–711.
  • Zelizer, B. 2004. Taking Journalism Seriously: News and the Academy. Thousand Oaks: SAGE Publications.
  • Zhang, Y., R. Jin, and Z. H. Zhou. 2010. “Understanding Bag-of-Words Model: A Statistical Framework.” International Journal of Machine Learning and Cybernetics 1 (1–4): 43–52.