Audio features dedicated to the detection and tracking of arousal and valence in musical compositions

Pages 322-333 | Received 23 Oct 2017, Accepted 09 Apr 2018, Published online: 27 Apr 2018
 

ABSTRACT

The aim of this paper was to discover which combination of audio features gives the best performance in music emotion detection. Emotion recognition was treated as a regression problem, and a two-dimensional valence–arousal model was used to measure emotions in music. The features were extracted with Essentia and Marsyas, tools for audio analysis and audio-based music information retrieval. The influence of different feature sets (low-level, rhythm, tonal, and their combination) on arousal and valence prediction was examined. Using a combination of different types of features significantly improved the results compared with using just one group of features. Features particularly suited to detecting arousal and valence separately, as well as features useful in both cases, were identified and are presented. This paper also presents the process of building emotion maps of musical compositions. The resulting emotion maps show how emotions are distributed over the course of an audio recording, knowledge that until now had been available only to music experts.
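To make the idea of an emotion map concrete, the following is a minimal sketch, not the paper's implementation. It assumes that a regressor has already produced one (arousal, valence) pair per consecutive audio segment, that both values are centered on zero, and that segments have a fixed length. The function names, the `segment_seconds` parameter, and the quadrant labels Q1 to Q4 are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# One row of the map: (start time in seconds, arousal, valence, quadrant label).
MapRow = Tuple[float, float, float, str]


def quadrant(arousal: float, valence: float) -> str:
    """Label a point on the valence-arousal plane by quadrant.

    Assumed convention (illustrative only):
      Q1: high arousal, positive valence
      Q2: high arousal, negative valence
      Q3: low arousal, negative valence
      Q4: low arousal, positive valence
    Values exactly on an axis are assigned to the non-negative side.
    """
    if arousal >= 0:
        return "Q1" if valence >= 0 else "Q2"
    return "Q4" if valence >= 0 else "Q3"


def emotion_map(
    predictions: List[Tuple[float, float]],
    segment_seconds: float = 6.0,  # assumed segment length, not taken from the abstract
) -> List[MapRow]:
    """Turn per-segment (arousal, valence) predictions into a time-stamped map."""
    rows: List[MapRow] = []
    for index, (arousal, valence) in enumerate(predictions):
        start = index * segment_seconds
        rows.append((start, arousal, valence, quadrant(arousal, valence)))
    return rows


if __name__ == "__main__":
    # Made-up predictions for three consecutive segments, for demonstration only.
    demo = [(0.5, 0.5), (-0.2, 0.3), (-0.4, -0.1)]
    for row in emotion_map(demo, segment_seconds=5.0):
        print(row)
```

Reading the rows in order shows how the predicted emotion moves through the valence-arousal plane over the course of a recording. Plotting arousal and valence against the start time would give a visual form of the same map.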

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes on contributor

Jacek Grekow received an MSc degree in Computer Science from the Technical University in Sofia, Bulgaria, in 1994, and a PhD degree from the Polish-Japanese Institute of Information Technology in Warsaw, Poland, in 2009. He also obtained a Master of Arts degree from The Fryderyk Chopin University of Music in Warsaw in 2007. His primary research interests are music information retrieval, emotions in music, and music visualization.

Additional information

Funding

This research was carried out as part of study no. S/WI/3/2013.