Research Article

The Automatic Analysis of Emotion in Political Speech Based on Transcripts

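Christopher Cochrane, Ludovic Rheault, Jean-François Godbout, Tanya Whyte, Michael W.-C. Wong & Sophie Borwein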
Pages 98-121 | Published online: 13 Aug 2021
 

ABSTRACT

Automatic sentiment analysis is used extensively in political science. The digitization of legislative transcripts has increased the potential application of established tools for the automated analyses of emotion in text. Unlike in writing, however, expressing emotion in speech involves intonation, facial expressions, and body language. Drawing on a new dataset of annotated texts and videos from the Canadian House of Commons, this paper does three things. First, we examine whether transcripts capture the emotional content of speeches. We find that transcripts capture sentiment, but not emotional arousal. Second, we compare strategies for the automated analysis of sentiment in text. We find that leading approaches performed reasonably well, but sentiment dictionaries generated using word embeddings surpassed these other approaches. Finally, we test the robustness of the approach based on word embeddings. Although the methodology is reasonably robust to alternative specifications, we find that dictionaries created using word embeddings are sensitive to the choice of seed words and to training corpus size. We conclude by discussing the implications for analyses of political speech.
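To make the idea of a seed-word dictionary concrete, the following is a minimal sketch and not necessarily the authors' procedure. It assumes that each word is scored by its mean cosine similarity to a set of positive seed words minus its mean cosine similarity to a set of negative seed words. The two-dimensional vectors, the seed lists, and all function names are invented for illustration; in practice the vectors would come from a model trained on a large corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def seed_polarity(word, vectors, pos_seeds, neg_seeds):
    """Mean similarity to positive seeds minus mean similarity to negative seeds."""
    vec = vectors[word]
    pos = sum(cosine(vec, vectors[s]) for s in pos_seeds) / len(pos_seeds)
    neg = sum(cosine(vec, vectors[s]) for s in neg_seeds) / len(neg_seeds)
    return pos - neg

def build_dictionary(vectors, pos_seeds, neg_seeds):
    """Assign a polarity score to every word in the vocabulary."""
    return {w: seed_polarity(w, vectors, pos_seeds, neg_seeds) for w in vectors}

def score_sentence(tokens, dictionary):
    """Average the scores of matched tokens; None means the sentence is unclassified."""
    hits = [dictionary[t] for t in tokens if t in dictionary]
    return sum(hits) / len(hits) if hits else None

# Toy vectors, invented for illustration only (hypothetical values).
TOY_VECTORS = {
    "good": [1.0, 0.1],
    "excellent": [0.9, 0.2],
    "bad": [-1.0, 0.1],
    "terrible": [-0.9, 0.2],
    "support": [0.8, 0.5],
    "scandal": [-0.7, 0.6],
}

toy_dictionary = build_dictionary(
    TOY_VECTORS, pos_seeds=["good", "excellent"], neg_seeds=["bad", "terrible"]
)
```

Because every score is defined relative to the seed lists, replacing the seeds changes every entry in the dictionary, which is one way to see why the choice of seed words matters.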

Acknowledgments

This paper was improved by feedback received at the Centre for the Study of Democratic Citizenship at McGill, the Department of Political Science at Université Laval, the Department of Political Science at Western University, the 2nd Annual Politics and Computational Social Science Conference, the 115th Conference of the American Political Science Association, the 2019 Conference of the Canadian Political Science Association, and, especially, from comments by Jacob Montgomery, Bryce Dietrich, J. Scott Matthews, Sven-Oliver Proksch, François Pétry, Yannick Dufresne, and David Armstrong. We are also grateful for the exceptionally detailed and constructive criticism from the Journal’s anonymous reviewers. We also thank Meghan Snider, Pierre-Oliver Bonin, Jason Vandenbeukel, Katie Moez, Stefan Ferraro, and Justin Savoie for their excellent assistance coding. We are responsible for any remaining errors.

Disclosure Statement

No potential conflict of interest was reported by the authors.

Data Availability Statement

The data described in this article are openly available in the Open Science Framework at https://doi.org/10.17605/OSF.IO/VUTW4.

Open Scholarship

This article has earned the Center for Open Science badges for Open Data, Open Materials and Preregistered. The data and materials are openly accessible at https://doi.org/10.17605/OSF.IO/VUTW4.

Correction Statement

This article has been republished with minor changes. These changes do not impact the academic content of the article.

Notes

1. In three cases, there were minor distortions in the indicated videos (e-mail dings). The previous sentence was used for those cases as well.

2. The missing speech implied that government ministers were taking bribes, and we suspect it was withdrawn from Hansard by the Member.

3. Although video coders may have coded the same video more than twice, we restricted the analysis to their first two scores because the text coders coded each snippet only twice.

4. For general summaries of these methods and their applications, see, for instance, Quinn et al. (2010), Cambria et al. (2013), Grimmer and Stewart (2013), Wilkerson and Casas (2017), and Benoit (2019).

5. There are two popular variants. The Continuous Bag of Words (CBOW) algorithm assigns vectors to maximize the likelihood of a word appearing, given its context. The Skip Gram algorithm assigns vectors to maximize the likelihood of contexts appearing, given each word. We use the CBOW algorithm.

6. Pang and Lee (2008) summarize the history of this development.

7. Amir et al. (2015) also used word embeddings to predict the sentiment of Twitter terms from a labeled set of words and phrases, but with a regression-based approach.

8. For a detailed examination of the relevance of arithmetic operations performed on word embeddings, see Ethayarajh et al. (2018).

9. For each approach, unclassified sentences are not included in the calculations of accuracy and R-squared. We exclude them because their inclusion as “neutral” classifications substantially reduces the performance of these dictionaries, and, for the purposes of comparing established dictionaries to the approach based on word embeddings, we wanted to represent the established dictionaries in the strongest possible way. In response to a helpful suggestion, we also experimented by including in the Lexicoder analysis the entire paragraph surrounding the sentences that we extracted. In effect, this meant that a greater number of words would align with the Lexicoder dictionary, which could conceivably result in a better classification of the context of the sentence in our analysis. In following this approach, we found a small decrease in the accuracy of Lexicoder’s classification and in the amount of variance that it explained, but an appreciable decrease, from 31% to 10%, in the proportion of our sentences that Lexicoder was unable to classify.

10. We are confident that supervised models trained on annotated parliamentary text would represent an excellent strategy for analyzing sentiment in parliamentary corpora, provided that there was enough annotated data with which to train the models. Parliamentary data are not normally annotated for sentiment, however, and the process of annotating them is time consuming and costly.
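The scoring choice described in Note 9 can be illustrated with a short sketch. It assumes that a dictionary returns a label for each sentence, or no label when none of its words match, and it compares accuracy when unclassified sentences are dropped with accuracy when they are counted as "neutral". The label names, the function name, and the example data are hypothetical and are not taken from the article's dataset.

```python
def accuracy_and_unclassified_share(predicted, gold, unclassified_as_neutral=False):
    """Return (accuracy, share of sentences the dictionary left unclassified).

    predicted: list of labels, with None marking an unclassified sentence.
    gold: list of human-coded labels of the same length.
    """
    if not predicted or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and equal in length")
    pairs = list(zip(predicted, gold))
    unclassified_share = sum(p is None for p, _ in pairs) / len(pairs)
    if unclassified_as_neutral:
        # Alternative scoring: treat every unclassified sentence as "neutral".
        scored = [("neutral" if p is None else p, g) for p, g in pairs]
    else:
        # Scoring described in Note 9: drop unclassified sentences.
        scored = [(p, g) for p, g in pairs if p is not None]
    accuracy = sum(p == g for p, g in scored) / len(scored) if scored else None
    return accuracy, unclassified_share

# Hypothetical example: five sentences, two of which the dictionary cannot classify.
example_predicted = ["positive", None, "negative", "negative", None]
example_gold = ["positive", "positive", "negative", "positive", "negative"]

excluded = accuracy_and_unclassified_share(example_predicted, example_gold)
as_neutral = accuracy_and_unclassified_share(
    example_predicted, example_gold, unclassified_as_neutral=True
)
```

In this toy case, dropping the two unclassified sentences gives an accuracy of two out of three, whereas counting them as neutral gives two out of five. This is the kind of reduction in measured performance that the note describes.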

Additional information

Funding

This research was funded by the Social Sciences and Humanities Research Council of Canada.

Notes on contributors

Christopher Cochrane

Christopher Cochrane is an Associate Professor in the Department of Political Science, University of Toronto.

Ludovic Rheault

Ludovic Rheault is an Associate Professor in the Department of Political Science, University of Toronto.

Jean-François Godbout

Jean-François Godbout is a Professor in the Department of Political Science, Université de Montréal. 

Tanya Whyte

Tanya Whyte is a Ph.D. recipient from the Department of Political Science, University of Toronto.

Michael W.-C. Wong

Michael W.-C. Wong (M.Phil., Oxford) is a Research Assistant in the Department of Political Science at the University of Toronto Scarborough.

Sophie Borwein

Sophie Borwein is a Ph.D. recipient from the Department of Political Science, University of Toronto.
