Regular articles

Reproducing affective norms with lexical co-occurrence statistics: Predicting valence, arousal, and dominance

Pages 1584-1598 | Received 30 Jan 2014, Accepted 23 Jun 2014, Published online: 10 Sep 2014
 

Abstract

Human ratings of valence, arousal, and dominance are frequently used to study the cognitive mechanisms of emotional attention, word recognition, and numerous other phenomena in which emotions are hypothesized to play an important role. Collecting such norms from human raters is expensive and time-consuming. As a result, affective norms are available for only a small number of English words, are not available for proper nouns in English, and are sparse in other languages. This paper investigated whether affective ratings can be predicted from length, contextual diversity, co-occurrences with words of known valence, and orthographic similarity to words of known valence, providing an algorithm for estimating affective ratings for larger and different datasets. Our bootstrapped ratings achieved correlations with human ratings on valence, arousal, and dominance that are on par with previously reported correlations across gender, age, education, and language boundaries. We release these bootstrapped norms for 23,495 English words.
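To make the idea of predicting a rating from co-occurrence with words of known valence concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each word is represented by a vector of co-occurrence counts and estimates an unrated word's valence as the similarity-weighted average of its nearest rated neighbours. The function names, the choice of cosine similarity, the value of `k`, and all vectors and ratings are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def estimate_rating(target_vec, seed_vecs, seed_ratings, k=2):
    """Similarity-weighted mean rating of the k most similar seed words."""
    scored = sorted(
        ((cosine(target_vec, vec), seed_ratings[word])
         for word, vec in seed_vecs.items()),
        reverse=True,
    )[:k]
    # Negative similarities are clipped so they cannot act as negative weights.
    total = sum(max(sim, 0.0) for sim, _ in scored)
    if total == 0.0:
        # No informative neighbour: fall back to the mean seed rating.
        return sum(seed_ratings.values()) / len(seed_ratings)
    return sum(max(sim, 0.0) * rating for sim, rating in scored) / total


# Hypothetical co-occurrence counts over three context words.
seed_vecs = {
    "joy": [5, 1, 0],
    "love": [4, 2, 0],
    "grief": [0, 1, 5],
    "pain": [0, 2, 4],
}
# Hypothetical valence ratings on a 1-9 scale.
seed_ratings = {"joy": 8.2, "love": 8.0, "grief": 1.9, "pain": 2.1}

estimate = estimate_rating([4, 1, 0], seed_vecs, seed_ratings, k=2)
print(round(estimate, 2))
```

In this toy setting the target vector resembles the vectors for the two positive seed words, so the estimate falls between their ratings. A full model would combine such a score with the other predictors named in the abstract (length, contextual diversity, orthographic similarity) in a regression.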

Notes

1. A common method for remedying multicollinearity in linear regression is the use of ridge regression. When repeating our regressions with ridge regression (selecting our ridge regression parameter via the method of Cule & De Iorio, 2012), in no case did any coefficient differ from those in the original regressions by more than .002, and all significant predictors remained so at the p < .001 level.
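The comparison described in this note can be illustrated with a small sketch. It assumes two correlated predictors and a fixed, hypothetical penalty `lam`; it does not implement the parameter-selection method of Cule & De Iorio, and the data are invented. Predictors and outcome are centred so that the intercept is not penalised, and the 2x2 normal equations are solved in closed form.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Ridge coefficients for two predictors; lam = 0 gives ordinary least squares."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [a - m1 for a in x1]
    c2 = [b - m2 for b in x2]
    cy = [v - my for v in y]
    # Entries of X'X + lam * I and of X'y for the centred data.
    s11 = sum(a * a for a in c1) + lam
    s22 = sum(b * b for b in c2) + lam
    s12 = sum(a * b for a, b in zip(c1, c2))
    t1 = sum(a * v for a, v in zip(c1, cy))
    t2 = sum(b * v for b, v in zip(c2, cy))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2


# Hypothetical, positively correlated predictors; y = 2 * x1 + x2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [2 * a + b for a, b in zip(x1, x2)]

ols = ridge_two_predictors(x1, x2, y, lam=0.0)
ridge = ridge_two_predictors(x1, x2, y, lam=0.5)
largest_change = max(abs(o - r) for o, r in zip(ols, ridge))
print(ols, ridge, largest_change)
```

Reporting the largest absolute change in any coefficient, as `largest_change` does here, mirrors the kind of check the note describes: if the ridge and unpenalised coefficients are nearly identical, multicollinearity is unlikely to be distorting the original estimates.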

2. These norms are much smaller than the Warriner et al. (2013) norms, precluding the creation of a large test set. Therefore, these were evaluated via leave-one-out cross-validation as in Bestgen and Vincze (2012).
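Leave-one-out cross-validation, as mentioned in this note, can be sketched as follows. The example assumes a single predictor and a simple least-squares line, which is a simplification for illustration rather than the model used in the paper; the data and function names are hypothetical. Each item is predicted by a model fitted to all the other items, and the held-out predictions are then correlated with the observed values.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loo_predictions(xs, ys):
    """Predict each item from a model trained on all remaining items."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    return preds


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical predictor scores and human ratings for six words.
scores = [0.1, 0.4, 0.5, 0.7, 0.8, 0.9]
ratings = [2.0, 3.9, 4.2, 6.1, 6.8, 8.0]

held_out = loo_predictions(scores, ratings)
print(pearson(held_out, ratings))
```

Because every prediction comes from a model that never saw the item being predicted, the resulting correlation estimates out-of-sample accuracy without setting aside a separate test set, which is the reason the procedure suits small norm sets.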

3. Although further dimension reduction techniques were not useful in the present analysis (see the Appendix), dimension reduction techniques such as LSA may confer advantages in cases for which the amount of available data is small. For example, Vincze and Bestgen (2011) achieved higher correlations with human ratings on a set of Spanish norms for valence, arousal, and dominance than did the method described here. It is difficult to know whether this difference is due to differences in the norms, the technique employed, the size and quality of the training corpora, or some interaction among these variables. Further research will help to tease out the relevant factors.
