Original Articles

Autotagger: A Model for Predicting Social Tags from Acoustic Features on Large Music Databases

Pages 115-135 | Published online: 26 Nov 2008
 

Abstract

Social tags are user-generated keywords associated with some resource on the Web. In the case of music, social tags have become an important component of “Web 2.0” recommender systems, allowing users to generate playlists based on use-dependent terms such as chill or jogging that have been applied to particular songs. In this paper, we propose a method for predicting these social tags directly from MP3 files. With a set of 360 classifiers trained using the online ensemble learning algorithm FilterBoost, we map audio features onto social tags collected from the Web. The resulting automatic tags (or autotags) furnish information about music that is otherwise untagged or poorly tagged, allowing for insertion of previously unheard music into a social recommender. This avoids the “cold-start problem” common in such systems. Autotags can also be used to smooth the tag space from which similarities and recommendations are made by providing a set of comparable baseline tags for all tracks in a recommender system. Because the words we learn are the same as those used by people who label their music collections, it is easy to integrate our predictions into existing similarity and prediction methods based on web data.
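
As a rough illustration of the idea in the abstract, the sketch below trains a binary classifier for each tag on audio features and then tags an unseen track. It is a toy under stated assumptions, not the method of the paper: plain AdaBoost over threshold stumps stands in for FilterBoost, the one-classifier-per-tag layout is assumed, the two features (tempo and energy) and all data values are hypothetical, and only two tags are used rather than 360 classifiers.

```python
import math

def stump_predict(x, feature, threshold, polarity):
    """Return +1/-1 from a single-feature threshold test."""
    return polarity if x[feature] >= threshold else -polarity

def train_boosted_stumps(X, y, rounds=5):
    """Plain AdaBoost (a stand-in for FilterBoost); labels y are +1/-1."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []  # list of (alpha, feature, threshold, polarity)
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for feature in range(len(X[0])):
            for threshold in sorted({x[feature] for x in X}):
                for polarity in (1, -1):
                    err = sum(w for w, x, label in zip(weights, X, y)
                              if stump_predict(x, feature, threshold, polarity) != label)
                    if best is None or err < best[0]:
                        best = (err, feature, threshold, polarity)
        err, feature, threshold, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * label * stump_predict(x, feature, threshold, polarity))
                   for w, x, label in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def score(model, x):
    """Weighted vote of all stumps; positive means the tag applies."""
    return sum(alpha * stump_predict(x, f, t, p) for alpha, f, t, p in model)

def train_autotagger(X, tags_per_track, vocabulary):
    """Assumed layout: one binary classifier per tag in the vocabulary."""
    return {tag: train_boosted_stumps(X, [1 if tag in tags else -1 for tags in tags_per_track])
            for tag in vocabulary}

def autotag(models, x):
    """Return the sorted tags whose classifier score is positive."""
    return sorted(tag for tag, model in models.items() if score(model, x) > 0)

# Hypothetical features per track: [tempo in BPM, energy in 0..1].
features = [[60, 0.2], [70, 0.3], [150, 0.9], [160, 0.8]]
social_tags = [{"chill"}, {"chill"}, {"jogging"}, {"jogging"}]
models = train_autotagger(features, social_tags, ["chill", "jogging"])

print(autotag(models, [65, 0.25]))   # ['chill']
print(autotag(models, [155, 0.85]))  # ['jogging']
```

Because the predicted labels come from the same vocabulary as the human-applied tags, the output of `autotag` could be merged with existing tag data for a track that has few or no social tags, which is the cold-start case the abstract describes.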

Acknowledgement

Many thanks to the members of the CAL group, in particular Luke Barrington, Gert Lanckriet and Douglas Turnbull, for publishing the CAL500 data set and answering our numerous questions. Thanks to the many individuals who provided input, support and comments, including James Bergstra, Andrew Hankinson, Stephen Green, and the members of LISA lab, BRAMS lab and CIRMMT. Thanks to Joseph Turian for pointing us to the phrase “There's no data like more data” (originally from speech recognition, we believe).

Notes

2. Audioscrobbler. Web Services described at http://www.audioscrobbler.net/data/webservices/.

3. Music Information Retrieval Evaluation eXchange; yearly contest pages found at www.music-ir.org.

5. Of course, real recommenders deal with a more complex situation, caring about novelty of recommendations, serendipity and user confidence among others [see Herlocker et al. (2004) for more details]. However, similarity is essential. We do it on the artist level because the data available to build a ground truth would be too sparse on the album or song level.
