ABSTRACT
In the present study, we asked whether the valence of natural emotional sounds can be extracted rapidly and unintentionally. In a first experiment, we collected explicit valence ratings of brief natural sound segments. Results showed that sound segments of 400 and 600 ms duration, and with some limitations even segments as short as 200 ms, are evaluated reliably. In a second experiment, we introduced an auditory version of the affective Simon task to assess automatic (i.e. unintentional and fast) evaluations of sound valence. The pattern of results indicates that the affective information of natural emotional sounds can be extracted rapidly (i.e. within a few hundred milliseconds of exposure) and in an unintentional fashion.
Acknowledgements
The authors thank Ullrich Ecker for his helpful comments and Thorid Römer for assistance in data collection.
Notes
1. Despite its relative neglect compared with visual affective research (which encompasses hundreds of published studies), there are noteworthy attempts to investigate sound evaluation that should be mentioned: studies on preferential processing of conditioned valence of sounds (Bröckelmann et al., 2011, 2013; Folyi, Liesefeld, & Wentura, 2015), on functional magnetic resonance imaging and electrophysiological correlates of complex emotional sounds such as environmental sounds, emotional vocalizations, and music (e.g. Czigler, Cox, Gyimesi, & Horváth, 2007; Grandjean et al., 2005; Koelsch, Fritz, von Cramon, Müller, & Friederici, 2006; Mitchell, Elliott, Barry, Cruttenden, & Woodruff, 2003; Sander, Frome, & Scheich, 2007; Sander & Scheich, 2001; Sauter & Eimer, 2010; Scott, Sauter, & McGettigan, 2009; Shinkareva et al., 2014), on identifying non-symbolic, low-level acoustic features that contribute to the evaluation of a wide range of sounds via computational modelling (e.g. Weninger, Eyben, Schuller, Mortillaro, & Scherer, 2013), and on multisensory integration of emotional information (e.g. Dolan, Morris, & de Gelder, 2001; Pourtois, de Gelder, Bol, & Crommelinck, 2005).
2. Sample size was determined by considerations regarding the reliability of the mean ratings (see Materials).
3. All correlations are associated with p < .001. However, due to the multimodal distribution of the norm ratings, inferential statistics might be biased. Thus, the correlations should primarily be taken as a descriptive index of the correspondence between the ratings of the brief segments and the ratings of the full sounds.
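For illustration only, a minimal Python sketch of such a descriptive correlation check; the arrays, sample size, and noise level are hypothetical placeholders, not the authors' data or materials.

```python
# Descriptive correspondence between brief-segment ratings and full-sound
# norm ratings. Given the multimodal distribution of the norm ratings,
# the coefficient is read as a descriptive index rather than as a test.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Placeholder data: mean valence ratings for 60 hypothetical sounds.
norm_ratings = rng.uniform(-3, 3, size=60)                    # full-sound norms
segment_ratings = norm_ratings + rng.normal(0, 0.8, size=60)  # brief-segment means

r, p = pearsonr(segment_ratings, norm_ratings)
print(f"r = {r:.2f} (descriptive index of correspondence; p = {p:.3g})")
```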
4. As an alternative, we conducted a 3 (valence) × 2 (animacy category: animate vs. inanimate) × 3 (duration) MANOVA. All effects reported below are essentially the same in this analysis. Additionally, there were significant effects involving animacy. However, for the sake of succinctness, and because these effects are ambiguous and therefore rather uninformative (i.e. they might reflect better discriminability of one category relative to the other, or a response bias), we report only the reduced analysis. A sketch of how such a reduced analysis could be specified follows this note.
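As a rough sketch, the reduced 3 (valence) × 3 (duration) within-subject analysis could be specified as below. Note that this uses the univariate repeated-measures ANOVA from statsmodels as a stand-in for the multivariate (MANOVA) approach mentioned above, and that the column names (subject, valence, duration, score) and the simulated data are assumptions for illustration, not the authors' pipeline.

```python
# Sketch of a reduced 3 (valence) x 3 (duration) within-subject ANOVA.
# Data and column names are hypothetical placeholders.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
subjects = range(1, 25)
valence = ["negative", "neutral", "positive"]
duration = [200, 400, 600]

# One aggregated score per subject x valence x duration cell,
# as AnovaRM expects exactly one observation per cell.
rows = [
    {"subject": s, "valence": v, "duration": d, "score": rng.normal()}
    for s in subjects for v in valence for d in duration
]
df = pd.DataFrame(rows)

# Repeated-measures ANOVA with both factors within subjects.
res = AnovaRM(df, depvar="score", subject="subject",
              within=["valence", "duration"]).fit()
print(res)
```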
References

Bröckelmann, A.-K., Steinberg, C., Elling, L., Zwanzger, P., Pantev, C., & Junghöfer, M. (2011). Emotion-associated tones attract enhanced attention at early auditory processing: Magnetoencephalographic correlates. The Journal of Neuroscience, 31, 7801–7810. doi:10.1523/JNEUROSCI.6236-10.2011

Bröckelmann, A.-K., Steinberg, C., Dobel, C., Elling, L., Zwanzger, P., Pantev, C., & Junghöfer, M. (2013). Affect-specific modulation of the N1m to shock-conditioned tones: Magnetoencephalographic correlates. European Journal of Neuroscience, 37, 303–315. doi:10.1111/ejn.12043

Czigler, I., Cox, T. J., Gyimesi, K., & Horváth, J. (2007). Event-related potential study to aversive auditory stimuli. Neuroscience Letters, 420, 251–256. doi:10.1016/j.neulet.2007.05.007

Dolan, R. J., Morris, J. S., & de Gelder, B. (2001). Crossmodal binding of fear in voice and face. Proceedings of the National Academy of Sciences, 98, 10006–10010. doi:10.1073/pnas.171288598

Folyi, T., Liesefeld, H. R., & Wentura, D. (2015). Attentional enhancement for positive and negative tones at an early stage of auditory processing. Manuscript submitted for publication.

Grandjean, D., Sander, D., Pourtois, G., Schwartz, S., Seghier, M. L., Scherer, K. R., & Vuilleumier, P. (2005). The voices of wrath: Brain responses to angry prosody in meaningless speech. Nature Neuroscience, 8, 145–146. doi:10.1038/nn1392

Koelsch, S., Fritz, T., von Cramon, D. Y., Müller, K., & Friederici, A. D. (2006). Investigating emotion with music: An fMRI study. Human Brain Mapping, 27, 239–250. doi:10.1002/hbm.20180

Mitchell, R. L., Elliott, R., Barry, M., Cruttenden, A., & Woodruff, P. W. (2003). The neural response to emotional prosody, as revealed by functional magnetic resonance imaging. Neuropsychologia, 41, 1410–1421. doi:10.1016/S0028-3932(03)00017-4

Pourtois, G., de Gelder, B., Bol, A., & Crommelinck, M. (2005). Perception of facial expressions and voices and of their combination in the human brain. Cortex, 41, 49–59. doi:10.1016/S0010-9452(08)70177-1

Sander, K., Frome, Y., & Scheich, H. (2007). FMRI activations of amygdala, cingulate cortex, and auditory cortex by infant laughing and crying. Human Brain Mapping, 28, 1007–1022. doi:10.1002/hbm.20333

Sander, K., & Scheich, H. (2001). Auditory perception of laughing and crying activates human amygdala regardless of attentional state. Cognitive Brain Research, 12, 181–198. doi:10.1016/S0926-6410(01)00045-3

Sauter, D. A., & Eimer, M. (2010). Rapid detection of emotion from human vocalizations. Journal of Cognitive Neuroscience, 22, 474–481. doi:10.1162/jocn.2009.21215

Scott, S. K., Sauter, D., & McGettigan, C. (2009). Brain mechanisms for processing perceived emotional vocalizations in humans. In S. Brudzynski (Ed.), Handbook of mammalian vocalizations: An integrative neuroscience approach (pp. 187–198). Oxford: Academic Press.

Shinkareva, S. V., Wang, J., Kim, J., Facciani, M. J., Baucom, L. B., & Wedell, D. H. (2014). Representations of modality-specific affective processing for visual and auditory stimuli derived from functional magnetic resonance imaging data. Human Brain Mapping, 35, 3558–3568. doi:10.1002/hbm.22421

Weninger, F., Eyben, F., Schuller, B. W., Mortillaro, M., & Scherer, K. R. (2013). On the acoustics of emotion in audio: What speech, music, and sound have in common. Frontiers in Psychology, 4, 1–12. doi:10.3389/fpsyg.2013.00292

Additional information
Funding
The present research was conducted within the International Research Training Group “Adaptive Minds” supported by the German Research Foundation [GRK 1457].