The State of the Art Ten Years After a State of the Art: Future Research in Music Information Retrieval

Pages 147-172 | Received 16 Jul 2013, Accepted 07 Feb 2014, Published online: 09 May 2014
 

Abstract

A decade has passed since the first review of research on a ‘flagship application’ of music information retrieval (MIR): the problem of music genre recognition (MGR). During this time, about 500 works addressing MGR have been published, and at least 10 campaigns have been run to evaluate MGR systems. This makes MGR one of the most researched areas of MIR. So, where does MGR now lie? We show that in spite of this massive amount of work, MGR does not lie far from where it began, and the paramount reason for this is that most evaluation in MGR lacks validity. We perform a case study of all published research using the most-used benchmark dataset in MGR during the past decade: GTZAN. We show that none of the evaluations in these many works is valid to produce conclusions with respect to recognizing genre, i.e. that a system is using criteria relevant for recognizing genre. In fact, the problems of validity in evaluation also affect research in music emotion recognition and autotagging. We conclude by discussing the implications of our work for MGR and MIR in the next ten years.

Acknowledgements

Thanks to: Fabien Gouyon, Nick Collins, Arthur Flexer, Mark Plumbley, Geraint Wiggins, Mark Levy, Roger Dean, Julián Urbano, Alan Marsden, Lars Kai Hansen, Jan Larsen, Mads G. Christensen, Sergios Theodoridis, Aggelos Pikrakis, Dan Stowell, Rémi Gribonval, Geoffrey Peeters, Diemo Schwarz, Roger Dannenberg, Bernard Mont-Reynaud, Gaël Richard, Rolf Bardeli, Jort Gemmeke, Curtis Roads, Stephen Pope, Yi-Hsuan Yang, George Tzanetakis, Constantine Kotropoulos, Yannis Panagakis, Ulaş Bağci, Engin Erzin, and João Paulo Papa for illuminating discussions about these topics (which does not mean that any of them endorse the ideas herein). Mads G. Christensen, Nick Collins, Cynthia Liem, and Clemens Hage helped identify several excerpts in GTZAN, and my wife Carla Sturm endured my repeated listening to all of its excerpts. Thanks to the many, many associate editors and anonymous reviewers for the comments that helped move this work closer and closer to being publishable.

Notes

4 The bibliography and spreadsheet that we use to generate this figure are available as supplementary material, as well as online here: http://imi.aau.dk/~bst/software.

5 The dataset can be downloaded from here: http://marsyas.info/download/data_sets

6 Fabbri (1999) also notes that genre helps ‘to speed up communication’.

8 All relevant references are available in the supplementary material, as well as online: http://imi.aau.dk/~bst/software.

10 Personal communication.

12 This is the file ‘country.00015.wav’ in GTZAN.

13 ‘Power Nap’ by J.S. Epperson (Binaural Beats Entrainment), 2010.

14 Our machine-readable index of this metadata is available at: http://imi.aau.dk/~bst/software.

15 These can be heard online here: http://imi.aau.dk/~bst/research/GTZANtable2

16 Which is now over 100 considering publications in 2013 not included in our survey (Sturm, 2012b).

17 A modern-day equivalent is ‘Maggie’, a dog that has performed arithmetic feats on nationally syndicated television programs: http://www.oprah.com/oprahshow/Maggie-the-Dog-Does-Math

18 We use here the multivariate dataset, http://archive.ics.uci.edu/ml/datasets/Adult

19 This feature takes a value in ‘Wife’, ‘Husband’, ‘Unmarried’, ‘Child’.

20 This feature takes a value in ‘Tech-support’, ‘Craft-repair’, ‘Other-service’, ‘Sales’, ‘Exec-managerial’, ‘Prof-specialty’, ‘Handlers-cleaners’, ‘Machine-op-inspct’, ‘Adm-clerical’, ‘Farming-fishing’, ‘Transport-moving’, ‘Priv-house-serv’, ‘Protective-serv’, ‘Armed-Forces’.

21 This feature takes a value in ‘Married’, ‘Divorced’, ‘Never-married’, ‘Widowed’, ‘Separated’.

22 At least, the scope of the first conclusion must be limited to 1996 USA, and that of the second to the dataset.

23 R. Hamming, ‘You get what you measure’, lecture at Naval Postgraduate School, June 1995. http://www.youtube.com/watch?v=LNhcaVi3zPA

24 What conclusion is valid in this case has yet to be determined.

25 Personal communication with J.P. Papa.

This work was supported by Independent Postdoc Grant 11-105218 from Det Frie Forskningsråd.
