Original Articles

Convolution-based classification of audio and symbolic representations of music

Pages 191-205 | Received 22 Oct 2016, Accepted 23 Mar 2018, Published online: 06 May 2018

References

  • Allwein, E.L., Schapire, R.E., & Singer, Y. (2000). Reducing multiclass to binary: A unifying approach for margin classifiers. Journal of Machine Learning Research, 1 (Dec), 113–141.
  • Arthur, D., & Vassilvitskii, S. (2007). K-means++: The advantages of careful seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 1027–1035). New Orleans, Louisiana.
  • Bach, J.S. (1742a). The Well-Tempered Clavier, BWV 846–893. (Recorded by Angela Hewitt. Bach: The Well-Tempered Clavier [Audio CD]. HYPERION (2009)).
  • Bach, J.S. (1742b). The Well-Tempered Clavier, BWV 846–893. (Recorded by Pieter-Jan Belder. Bach: The Well-Tempered Clavier [Audio CD]. Brilliant Classics (2009)).
  • Bengio, Y., Courville, A., & Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (8), 1798–1828.
  • Cataltepe, Z., Yaslan, Y., & Sonmez, A. (2007). Music genre classification using MIDI and audio features. EURASIP Journal on Advances in Signal Processing, 2007 (1), 1–8.
  • Choi, K., Fazekas, G., Sandler, M.B., & Cho, K. (2017, October 23–27). Transfer learning for music classification and regression tasks. Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2017, Suzhou, China (pp. 141–149).
  • Costa, Y.M., Oliveira, L., Koerich, A.L., Gouyon, F., & Martins, J. (2012). Music genre classification using LBP textural features. Signal Processing, 92 (11), 2723–2737.
  • Daubechies, I., & Maes, S. (1996). A nonlinear squeezing of the continuous wavelet transform based on auditory nerve models. In A. Aldroubi & M. Unser (Eds.), Wavelets in medicine and biology (pp. 527–546). Boca Raton, FL: CRC Press.
  • Davies, D.L., & Bouldin, D.W. (1979). A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-1 (2), 224–227.
  • de Boor, C. (1978). A practical guide to splines. New York: Springer-Verlag.
  • Eerola, T., & Toiviainen, P. (2003). MIDI toolbox: MATLAB tools for music research. Jyväskylä, Finland: University of Jyväskylä. Retrieved from http://www.jyu.fi/hum/laitokset/musiikki/en/research/coe/materials/miditoolbox/
  • Eisen, C., Baldassarre, A., & Griffiths, P. String quartet. Retrieved from http://www.oxfordmusiconline.com.zorac.aub.aau.dk/subscriber/article/grove/music/40899 (Accessed 9-Oct-2015).
  • Foleiss, J.H., & Tavares, T. (2016). A spectral bandwise feature-based system for the MIREX 2016 train/test task. Submission to Audio Classification (Train/Test) Tasks of MIREX, 2016.
  • Gibbons, J.D., & Chakraborti, S. (2011). Nonparametric statistical inference. In M. Lovric (Ed.), International encyclopedia of statistical science (pp. 977–979). Berlin, Heidelberg: Springer.
  • Herlands, W., Der, R., Greenberg, Y., & Levin, S. (2014, July 27–31). A machine learning approach to musically meaningful homogeneous style classification. Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Québec City, Québec, Canada (pp. 276–282).
  • Hillewaere, R., Manderick, B., Conklin, D., Downie, J.S., & Veltkamp, R.C. (2010, August 9–13). String quartet classification with monophonic models. Proceedings of the 11th International Society for Music Information Retrieval Conference, ISMIR 2010, Utrecht, Netherlands (pp. 537–542). International Society for Music Information Retrieval.
  • Hontanilla, M., Pérez-Sancho, C., & Iñesta, J.M. (2013). Modeling musical style with language models for composer recognition (pp. 740–748). Berlin, Heidelberg: Springer.
  • Karmakar, A., Kumar, A., & Patney, R. (2011). Synthesis of an optimal wavelet based on auditory perception criterion. EURASIP Journal on Advances in Signal Processing, 2011 (1), 1–13.
  • Kay, K.N., Naselaris, T., Prenger, R.J., & Gallant, J.L. (2008). Identifying natural images from human brain activity. Nature, 452 (7185), 352–355.
  • Keys, R. (1981). Cubic convolution interpolation for digital image processing. IEEE Transactions on Acoustics, Speech, and Signal Processing, 29 (6), 1153–1160.
  • Kuncheva, L.I. (2004). Combining pattern classifiers: Methods and algorithms. New Jersey: John Wiley & Sons.
  • LeCun, Y., Kavukcuoglu, K., & Farabet, C. (2010). Convolutional networks and applications in vision. Proceedings of 2010 IEEE International Symposium on Circuits and Systems (pp. 253–256). Paris.
  • Lidy, T., Rauber, A., Pertusa, A., & Iñesta, J.M. (2007, September 23–27). Improving genre classification by combination of audio and symbolic descriptors using a transcription system. In S. Dixon, D. Bainbridge, & R. Typke (Eds.), Proceedings of the 8th International Conference on Music Information Retrieval, ISMIR 2007, Vienna, Austria (pp. 61–66). Vienna: Austrian Computer Society.
  • Lidy, T., & Schindler, A. (2016). Parallel convolutional neural networks for music genre and mood classification. MIREX2016. New York. Retrieved from http://www.music-ir.org/mirex/abstracts/2016/LS1.pdf
  • Mandel, M.I., Devaney, J., Turnbull, D., & Tzanetakis, G. (Eds.). (2016, August 7–11). Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR 2016, New York City, United States.
  • Meredith, D. (2006). The ps13 pitch spelling algorithm. Journal of New Music Research, 35 (2), 121–159.
  • Murdock, B.B., Jr (1979). Convolution and correlation in perception and memory. In L.-G. Nilsson (Ed.), Perspectives on Learning and Memory (pp. 105–119). Hillsdale, NJ: Erlbaum.
  • Ogihara, M., & Li, T. (2008). N-gram chord profiles for composer style representation. Proceedings of the 9th International Conference on Music Information Retrieval (ISMIR 2008) (pp. 671–676). Philadelphia, PA.
  • Oramas, S., Nieto, O., Barbieri, F., & Serra, X. (2017, October 23–27). Multi-label music genre classification from audio, text and images using deep features. Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2017, Suzhou, China (pp. 23–30).
  • Paul, E.S., & Kaufman, S.B. (2014). The philosophy of creativity: New essays. Oxford: Oxford University Press.
  • Pons, J., Lidy, T., & Serra, X. (2016). Experimenting with musically motivated convolutional neural networks. Proceedings of the 14th International Workshop on Content-Based Multimedia Indexing (CBMI 2016), Bucharest, Romania. IEEE.
  • Pribram, K.H. (1986). Convolution and matrix systems as content addressible distributed brain processes in perception and memory. Journal of Neurolinguistics, 2 (1), 349–364.
  • Rush, J.C., & Sabers, D.L. (1981). The perception of artistic style. Studies in Art Education, 23 (1), 24–32.
  • Sapp, C., & Liu, Y.W. (2015). The Haydn/Mozart string quartet quiz. Retrieved from http://qq.themefinder.org (Accessed 26-Dec-2015).
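  • Schlüter, J. (2016). Learning to pinpoint singing voice from weakly labeled examples. In M.I. Mandel, J. Devaney, D. Turnbull, & G. Tzanetakis (Eds.), Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR 2016 (pp. 44–50). New York.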
  • Schörkhuber, C., Klapuri, A., Holighaus, N., & Dörfler, M. (2014). A Matlab toolbox for efficient perfect reconstruction time-frequency transforms with log-frequency resolution. Audio Engineering Society Conference: 53rd International Conference: Semantic Audio, London, UK. Audio Engineering Society.
  • Snowden, R.J., Thompson, P., & Troscianko, T. (2012). Basic vision: An introduction to visual perception. Oxford: Oxford University Press.
  • Stein, L. (1979). Structure & style: The study and analysis of musical forms. Summy-Birchard Music. Miami, FL.
  • Sturm, B.L. (2014a). A simple method to determine if a music information retrieval system is a “horse”. IEEE Transactions on Multimedia, 16 (6), 1636–1644.
  • Sturm, B.L. (2014b). A survey of evaluation in music genre recognition. In A. Nürnberger, S. Stober, B. Larsen, & M. Detyniecki (Eds.), Adaptive Multimedia Retrieval: Semantics, Context, and Adaptation: 10th International Workshop, AMR 2012, Copenhagen, Denmark, October 24–25, 2012, Revised Selected Papers (pp. 29–66). Springer. Lecture Notes in Computer Science, Vol. 8382. doi:10.1007/978-3-319-12093-5_2
  • Tuia, D., Volpi, M., Mura, M.D., Rakotomamonjy, A., & Flamary, R. (2014). Automatic feature learning for spatio-spectral image classification with sparse SVM. IEEE Transactions on Geoscience and Remote Sensing, 52 (10), 6062–6074.
  • Tzanetakis, G., Ermolinskyi, A., & Cook, P. (2003). Pitch histograms in audio and symbolic music information retrieval. Journal of New Music Research, 32 (2), 143–152.
  • van Kranenburg, P., & Backer, E. (2004, April 15–18). Musical style recognition – A quantitative approach. Proceedings of the Conference on Interdisciplinary Musicology (CIM04), Graz, Austria (pp. 106–107).
  • van Kranenburg, P., Volk, A., & Wiering, F. (2013). A comparison between global and local features for computational classification of folk song melodies. Journal of New Music Research, 42 (1), 1–18.
  • Velarde, G., Weyde, T., Cancino Chacón, C.E., Meredith, D., & Grachten, M. (2016, August 7–11). Composer recognition based on 2D-filtered piano-rolls. Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR 2016, New York City, United States (pp. 115–121). Retrieved from https://wp.nyu.edu/ismir2016/wp-content/uploads/sites/2294/2016/07/063Paper.pdf
  • Velarde, G., Weyde, T., & Meredith, D. (2013). An approach to melodic segmentation and classification based on filtering with the Haar-wavelet. Journal of New Music Research, 42 (4), 325–345.
  • Wu, M.J., Chen, Z.S., Jang, J.S.R., Ren, J.M., Li, Y.H., & Lu, C.H. (2011). Combining visual and acoustic features for music genre classification. 2011 10th International Conference on Machine Learning and Applications and Workshops (ICMLA) (Vol. 2, pp. 124–129). Honolulu, Hawaii: IEEE.
