Abstract
We describe the development of a computational model predicting listener-perceived expressions of music in branding contexts. Representative ground truth from multi-national online listening experiments was combined with machine-learned music-branding expert knowledge and audio signal analysis toolbox outputs. A mixture of random forest and traditional regression models predicts average ratings of perceived brand image on four dimensions. The resulting cross-validated prediction accuracy (R²) was 61% for Arousal, 44% for Valence, 55% for Authenticity, and 74% for Timeliness. Audio descriptors for rhythm, instrumentation, and musical style contributed most. Adaptive sub-models for different marketing target groups further increase prediction accuracy.
Acknowledgement
We would like to express our gratitude to Geoffroy Peeters and his team from project partner IRCAM for their contributions to the machine learning parts of this study. Similarly, we thank our project partner HearDis for contributing the stimulus material and tag knowledge, as well as our partners from Integral Markt- und Meinungsforschung for contributing their SINUS meta-milieu scales.