The Workshop

Affective News: The Automated Coding of Sentiment in Political Texts

Pages 205-231 | Published online: 26 Apr 2012
 

Abstract

An increasing number of studies in political communication focus on the “sentiment” or “tone” of news content, political speeches, or advertisements. This growing interest in measuring sentiment coincides with a dramatic increase in the volume of digitized information. Computer automation has a great deal of potential in this new media environment. The objective here is to outline and validate a new automated measurement instrument for sentiment analysis in political texts. Our instrument uses a dictionary-based approach consisting of a simple word count of the frequency of keywords in a text from a predefined dictionary. The design of the freely available Lexicoder Sentiment Dictionary (LSD) is discussed in detail here. The dictionary is tested against a body of human-coded news content, and the resulting codes are also compared to results from nine existing content-analytic dictionaries. Analyses suggest that the LSD produces results that are more systematically related to human coding than are results based on the other available dictionaries. The LSD is thus a useful starting point for a revived discussion about dictionary construction and validation in sentiment analysis for political communication.
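The dictionary-based approach described in the abstract reduces to a simple frequency count: tally occurrences of positive and negative keywords from a predefined list, then take the difference as net tone. The following sketch illustrates the idea; the word lists are a small illustrative sample drawn from the randomly selected LSD terms reported in Note 19, not the actual dictionary.

```python
import re

# Illustrative subsets only; the full LSD contains thousands of entries.
POSITIVE = {"success", "gain", "friend", "credible", "peace"}
NEGATIVE = {"fight", "flop", "hypocritical", "worrying", "relapse"}

def net_tone(text):
    """Return (positive count, negative count, net tone) for a text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos, neg, pos - neg

pos, neg, net = net_tone(
    "The credible peace deal was a success, despite worrying signs."
)
print(pos, neg, net)  # -> 3 1 2
```

A real implementation (as in Lexicoder) would also handle multiword entries such as "look up to" and apply the preprocessing and inflection handling discussed in the notes below.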

Acknowledgments

The authors are grateful to Mark Daku, who programmed Lexicoder; to Marc André Bodet and Blake Andrew for their work using the LSD in its early stages; to Christopher Wlezien and Robert Erikson, for providing us some U.S. polling data with which to further test the dictionary; and to the editor and anonymous reviewers, whose comments were critical to this final version of the article.

Notes

1. On the tone of news content, see discussion in the following section. On negative advertising in particular, see, for example, a meta-analysis by Lau, Sigelman, Heldman, & Babbitt (1999).

2. Affect generally refers to any conscious or unconscious feeling as distinct from a cognitive perception, and it is a necessary component of the more complex experience of emotion, which is generally conceived of as having both affective and cognitive components (Huitt, 2003). For the purposes at hand, we use the terms sentiment and tone to refer broadly to affect or emotion. Practically speaking, automation cannot distinguish between the two, and thus the distinction is adequate. Finer distinctions within these broad categories are made explicit in the text that follows.

5. For examples in computational linguistics, see, for instance, Généreux and Evans (2006); Hatzivassiloglou and McKeown (1997); Joachims (1998); Kim and Hovy (2006); Kushal, Lawrence, and Pennock (2003); Leshed and Kaye (2006); Mishne (2005); Pang et al. (2002); and Wiebe (2000). Political scientists have taken up statistical methods to automatically estimate policy positions in party manifestos (Laver, Benoit, & Garry, 2003) and to classify the topics of congressional speeches (Purpura & Hillard, 2006).

7. There are many other examples, including Pennebaker, Slatcher, and Chung's (2005) use of word counts to derive psychological attributes of political candidates from their tone in natural conversation. Frequency analysis has also been applied to international communications to monitor conflict, and there have been various efforts to use word counts in the analysis of international relations (e.g., Doucet & Jehn, 1997; Hogenraad, 2005; Holsti, Brody, & North, 1964; Hopmann & King, 1976). For work using DICTION specifically, see http://www.dictionsoftware.com.

8. Several studies have used proximity-based lexical rules to attribute tone to actors or topics at the subdocument level by measuring the local co-occurrence of dictionary words and a "subject" of interest (see, e.g., Mullen & Collier, 2004; Pang et al., 2002; Tong, 2001; see also research on subjectivity analysis, e.g., Wiebe, 2000). Despite many sophisticated approaches, however, state-of-the-art machine-learning and NLP sentiment analysis techniques cannot readily unravel the topic-specific relationship between presented evidence and speaker opinion (see, e.g., Thomas et al., 2006, p. 2).

9. GI combines Osgood's Semantic Differential Scale and the Lasswell Value Dictionary.

10. The original GI software package was programmed with a set of word sense disambiguation rules that corresponded to various senses annotated in the dictionary. However, neither the program nor the rules are maintained. Thus, most research simply collapses or weights the carefully annotated multiple word senses, to the dismay of creator Philip Stone, who laments the tendency as "a step backwards in both theory and technique" (1986, p. 76).

12. We were unfortunately unable to include other dictionaries in the construction of the LSD due to proprietary restrictions on their use, modification, or distribution.

13. A complete list of categories classified as positive or negative is available upon request.

14. Alternately, we could have aggregated by category across dictionaries to calculate word scores. We chose to aggregate per dictionary first to avoid biasing the tone in favor of the dictionary with the most categories.
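Note 14's aggregation order can be made concrete with a short sketch. The idea, on our reading, is to collapse a word's category classifications within each source dictionary into a single vote first, and only then average across dictionaries, so each dictionary counts once regardless of how many categories it contains. The data structure and entries below are hypothetical illustrations, not actual dictionary content.

```python
# Hypothetical mapping: word -> {dictionary: [category polarities]},
# where +1 marks a positive category and -1 a negative one.
entries = {
    "triumph": {"dictA": [+1, +1, +1],  # three positive categories
                "dictB": [+1]},          # one positive category
}

def word_score(word):
    """Average per-dictionary votes, one vote per dictionary."""
    per_dict = []
    for cats in entries[word].values():
        total = sum(cats)
        # Collapse all of a dictionary's categories into a single vote.
        per_dict.append(1 if total > 0 else -1 if total < 0 else 0)
    return sum(per_dict) / len(per_dict)

print(word_score("triumph"))  # -> 1.0
```

Aggregating per dictionary first means dictA's three categories do not outvote dictB's one, which is the bias the note describes avoiding.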

15. The General Inquirer is a notable exception (see Note 9). Hart's DICTION program also makes modest statistical adjustments by differentially weighting homographs.

17. Trials were conducted using a subset of subjective words, noted in the literature to improve sentiment analysis (Wiebe, 2000). However, this version did not perform as well as the full LSD.

18. These modules are available alongside the LSD online.

19. By way of example, randomly drawn positive terms include beaming, charity, cognizant, comprehend, credible, curious, dignify, dominance, ecstatic, friend, gain, gentle, justifiably, look up to, meticulous, of note, peace, politeness, reliability, and success; randomly drawn negative terms include admonish, appall, disturbed, fight, flop, grouch, huffish, hypocritical, impurity, irritating, limp, omission, oversight, rancor, relapse, sap, serpent, untimely, worrying, and yawn.

20. Lexicoder was developed by Stuart Soroka and Lori Young, and programmed by Mark Daku. It is available at http://www.lexicoder.com.

21. We would very much have liked to include DICTION in our study; however, the word list—though it can be inspected—is not exportable.

22. Notably, it is not a measure of journalistic tone, bias, or subjectivity.

23. Note that differences in human codes in our sample are a matter of degree. There is no single case of both positive and negative tone codes for a single story, for instance. Thus, we are confident that differences do reflect genuine ambiguity.

24. Proportions are used to control for the varying length of articles.

25. Note that the R² value for unprocessed text is 1.4 points lower than above, and about 4 points lower when inflections are not applied. Full results are available from the authors.

26. It need not be, of course, and that is a first difficulty. Another option is to let the mean net tone of all neutral articles, as determined by coders, be zero. Using that value, .45, does not significantly change the results mentioned in the text.

27. The neutral category represents a real problem in this kind of analysis, since it is not clear in the LSD codes where exactly neutral ends, and the error around the mean in this particular sample is of course a very rough proxy.

28. The lengthy lag of media content (4 days or more) is a consequence of trying to build models that can predict shifts in vote shares. The original models are described in detail in Soroka et al. (2009).

29. Stories were selected by searching the story text for the word "election" or any one of the party leaders' names, in searches limited by geography (Canada). Newspapers include the Globe and Mail, the Toronto Star, the National Post, the Vancouver Sun, and the Calgary Herald. We exclude the two French-language newspapers here, since they cannot be coded using the LSD.

30. Using this matched subsample of articles makes for a stricter test of the LSD in comparison with human coding than would a test using the entire database. This makes sense for the purposes at hand. However, we should note that this approach greatly attenuates one of the main advantages of automated coding—namely, the ability to work with much more data than could feasibly be coded by humans. Given that the reliability of automated tone will increase with sample size, and the ease with which sample size can be increased, we are limiting our automated predictions here rather severely.

31. Note that manual tone in the original study is also a measure of net tone, accounting for the relative weight of positive versus negative coverage toward each actor during the campaign.

32. Note that we include variants of party names such as “Grits” or “Tories” for Liberals and Conservatives, for instance.

33. We could also calculate net tone as a proportion of the total number of words analyzed, to account for differences in the volume of coverage of various actors. We see some advantage to the raw measure we use here, however, since it captures, in part, the consequences of a large versus small amount of negative/positive coverage. In any case, results are not very different when we use the percentage-point measure.
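The contrast drawn in Note 33 between the raw net-tone measure and a percentage-point alternative can be sketched with illustrative numbers (the figures below are made up, not taken from the study):

```python
def net_tone_raw(pos_hits, neg_hits):
    """Raw net tone: positive hits minus negative hits."""
    return pos_hits - neg_hits

def net_tone_pct(pos_hits, neg_hits, total_words):
    """Net tone as a percentage of all words analyzed."""
    return 100.0 * (pos_hits - neg_hits) / total_words

# Actor with heavy coverage: 120 positive, 80 negative hits in 10,000 words.
print(net_tone_raw(120, 80))         # -> 40
print(net_tone_pct(120, 80, 10000))  # -> 0.4

# Actor with light coverage: 12 positive, 8 negative hits in 1,000 words.
print(net_tone_raw(12, 8))           # -> 4
print(net_tone_pct(12, 8, 1000))     # -> 0.4
```

The percentage-point measure treats the two actors identically, while the raw measure preserves the difference between a large and a small volume of net-positive coverage, which is the advantage the note attributes to the raw measure.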

34. This automated measure of tone differs from the manual measure in at least one way. Recall that automated tone is a composite measure of the relative negativity of the events or issues being covered and the opinions and attitudes of newsmakers. In the original study, expert coders were trained to measure the latter. We should accordingly expect to see some differences in their relationship to vote shares. However, as the results demonstrate, any such differences turn out to be minor.

35. On the value of the MAE as a goodness-of-fit measure in prediction and forecasting, see Krueger and Lewis-Beck (2005).

36. And note that results from these models—relying on just 1,590 of the original manually coded articles—are not very different from the original results in Soroka et al. (2009).

37. Predictions are smoothed using lowess smoothing with a bandwidth of .2.
