Large vocabulary automatic chord estimation using bidirectional long short-term memory recurrent neural network with even chance training

Pages 53-67 | Received 29 Aug 2016, Accepted 08 Aug 2017, Published online: 30 Oct 2017

