Abstract
After radio and television, online media are fast becoming a primary source of information for many Africans. With this increase, it is becoming necessary for media researchers to explore ways to better understand production, content and reception patterns of online news on the continent. This paper introduces freely available tools for systematic and (semi-)automated collection, storage and analysis of digital news that build on recent advances in the computational power of personal computers and the decreasing costs of storing large amounts of data. I start by describing existing challenges in the collection of online news text data, including the limited amount of African news content in commercial databases and the methodological shortcomings of using commercial search engines. Then, I present a four-stage approach using packages written in the open-source R programming language to automate the collection of online news content (web scraping); transform this content for easier storage and analysis (data processing); use computational text analysis tools to describe and categorise data; and present the results in ways that are easier to understand (data visualisation). The paper concludes with a summary of recommendations for using computational methods to study African communication phenomena.
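The four-stage pipeline outlined above (scrape, process, analyse, visualise) can be illustrated with a minimal sketch. The paper's own tooling is R-based; the sketch below uses Python's standard library instead, and the HTML page, class names and term counts are invented for illustration rather than taken from the paper.

```python
from html.parser import HTMLParser
from collections import Counter
import re

# Stage 1 (web scraping) is simulated with a stored HTML page; in a real
# workflow the page would be fetched over HTTP. All content is illustrative.
RAW_HTML = """
<html><body>
  <h1>Election coverage grows online</h1>
  <p>Online news coverage of the election expanded rapidly this year.</p>
  <p>Researchers say online platforms now rival radio for election news.</p>
</body></html>
"""

class TextExtractor(HTMLParser):
    """Stage 2 (data processing): strip markup, keep readable text."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)

parser = TextExtractor()
parser.feed(RAW_HTML)
document = " ".join(parser.chunks)

# Stage 3 (computational text analysis): a simple term-frequency description.
tokens = re.findall(r"[a-z]+", document.lower())
freq = Counter(tokens)

# Stage 4 (data visualisation): a bare-bones text bar chart of the top terms.
for term, count in freq.most_common(3):
    print(f"{term:10s} {'#' * count}")
```

In practice each stage would be handled by a dedicated R package (for example, an HTML parser for scraping and a corpus toolkit for analysis), but the division of labour between the four stages is the same.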
Notes
1 For a discussion of ways in which Python can be used for text analysis, see Lane, Howard, and Hapke (2019).
2 Data were retrieved directly from the website of each journal. Initially, all articles mentioning the phrase “content analysis” together with “Africa” or the name of an African country were retrieved using a custom-built web scraper. The initial search returned 697 results. The author examined the methods section of each article and retained only those that used content analysis and studied at least one African country; most of the articles mentioned “content analysis” or African countries only in passing, and were thus discarded.
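The two-step retrieval described in the note (an automated keyword co-mention filter, followed by manual screening of each methods section) can be sketched as follows. The term list, article records and function names below are invented for illustration; the paper's actual scraper and country list are not reproduced here.

```python
# Abbreviated, illustrative list of search terms; the real filter would
# include "Africa" plus the name of every African country.
AFRICAN_TERMS = {"africa", "kenya", "nigeria", "ghana", "senegal"}

# Hypothetical scraped records standing in for journal search results.
articles = [
    {"title": "A", "text": "We apply content analysis to newspapers in Kenya."},
    {"title": "B", "text": "Content analysis has a long history in Europe."},
    {"title": "C", "text": "Nigeria's media landscape, mentioned in passing."},
]

def co_mentions(text):
    """Step 1: keep articles that mention 'content analysis' AND an African term."""
    t = text.lower()
    return "content analysis" in t and any(term in t for term in AFRICAN_TERMS)

# Step 2 (manual in the paper) is reading each methods section; here only
# the automated co-mention filter is shown.
candidates = [a["title"] for a in articles if co_mentions(a["text"])]
print(candidates)
```

Only article A survives the keyword filter in this toy data; articles B and C mimic the false positives the note describes, which the author removed by hand.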