ABSTRACT
Generative statistical models of chord sequences play crucial roles in music processing. To capture syntactic similarities among certain chords (e.g. in the key of C major, between G and G7 and between F and Dm), we study hidden Markov models and probabilistic context-free grammar models with latent variables that describe syntactic categories of chord symbols, together with unsupervised learning techniques for inducing the latent grammar from data. Surprisingly, we find that these models often outperform conventional Markov models in predictive power, and that the self-emergent categories often correspond to traditional harmonic functions. This implies the need for chord categories in harmony models from the informatics perspective.
Notes
1 The number of possible ways to tie transition probabilities is given by the partition number. For example, with 10 symbols, it is 42 for a first-order model and about 1.9 × 10^8 for a second-order model (with 10^2 = 100 bigram contexts).
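The counts in Note 1 can be checked with a short sketch. It assumes that the relevant quantity is the integer partition number p(n) of the number of contexts n: n = 10 for a first-order model with 10 symbols, and n = 10^2 = 100 bigram contexts for a second-order model. The function name partition_numbers is ours, introduced only for illustration.

```python
def partition_numbers(n_max):
    """Return [p(0), ..., p(n_max)], where p(n) is the number of
    integer partitions of n (order of parts ignored)."""
    p = [1] + [0] * n_max  # p(0) = 1: the empty partition
    # Allow parts of size 1, 2, ..., n_max in turn (coin-change style DP).
    for part in range(1, n_max + 1):
        for total in range(part, n_max + 1):
            p[total] += p[total - part]
    return p


n_symbols = 10  # assumption: 10 chord symbols, as in the note's example
p = partition_numbers(n_symbols ** 2)

first_order = p[n_symbols]        # 10 contexts
second_order = p[n_symbols ** 2]  # 100 bigram contexts (assumed)

print(first_order)   # 42
print(second_order)  # 190569292
```

The second value, 190569292, is roughly 1.9 × 10^8, which shows how quickly the space of possible tyings grows with the context length.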
2 To make the relation between PCFG models and HMMs explicit, we here do not include the start symbol in .
3 J-Total Music: http://music.j-total.net (Researchers who wish to have access to the J-Pop data should contact the authors.)
4 The dataset was downloaded from the McGill Billboard Project webpage: http://ddmal.music.mcgill.ca/research/billboard (Complete Annotation section).