Abstract
Massive datasets of communication are challenging traditional, human-driven approaches to content analysis. Computational methods present enticing solutions to these problems but in many cases are insufficient on their own. We argue that an approach blending computational and manual methods throughout the content analysis process may yield more fruitful results, and draw on a case study of news sourcing on Twitter to illustrate this hybrid approach in action. Careful combinations of computational and manual techniques can preserve the strengths of traditional content analysis, with its systematic rigor and contextual sensitivity, while also maximizing the large-scale capacity of Big Data and the algorithmic accuracy of computational methods.
Notes
This research was supported by the Office of the Vice President for Research at the University of Minnesota and the Social Sciences and Humanities Research Council of Canada.
1 It is worth noting that this view is challenged by scholars like Herring (2010) and Karlsson (2012), who share a concern that the structural features of new media (such as hyperlinks) and the media content created through them (such as blog comments) are simply too “new” to be addressed by “old” methods of content analysis alone.
2 In late December 2012, Twitter introduced a feature allowing users to download their entire Twitter archive. Thus, researchers studying an individual's (or set of individuals') use of Twitter, as we did, may request from the subjects of their study a copy of their Twitter archive. Because this feature was not available to us at the time of writing, we have only briefly experimented with it. However, according to the documentation accompanying the download (as of January 2013), “the JSON export contains a full representation of your Tweets as returned by v1.1 of the Twitter API.” As such, it would be a rich and comprehensive data set, like the one provided to us by Carvin, available in a standardized format that is easily parsed.
3 Although we used Microsoft Excel and SPSS because of the researchers' collective familiarity with those programs, our approach may be easily adopted using open-source solutions, such as LibreOffice and R, respectively.