Abstract
This article describes a method for estimating and tracking the tempo of musical recordings, which was submitted to the MIREX 2006 evaluation contest, where it ranked third out of seven submissions. The algorithm we present is composed of three stages. First, a front-end analyses the audio signal to extract a representation of the musically relevant events, the so-called “detection function”. Second, the periodicity of these events is estimated over contiguous, overlapping excerpts of the detection function. Finally, the periodicities are tracked through time, and the most energetic ones are selected as the tempi.
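The three-stage pipeline summarized above can be sketched in code. The abstract does not specify the paper's actual front-end, periodicity estimator, or tracking scheme, so the sketch below substitutes common stand-ins: a spectral-flux detection function, autocorrelation over overlapping excerpts, and a simple most-energetic-candidate selection in place of the full tracking stage. All function names and parameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def detection_function(x, sr, hop=512, win=1024):
    """Front-end stand-in: spectral flux as a detection function.
    Returns the detection function and its own sampling rate (frames/s)."""
    n_frames = 1 + (len(x) - win) // hop
    window = np.hanning(win)
    flux = np.zeros(n_frames)
    prev = None
    for i in range(n_frames):
        mag = np.abs(np.fft.rfft(x[i * hop:i * hop + win] * window))
        if prev is not None:
            # sum of positive magnitude increases -> emphasizes onsets
            flux[i] = np.sum(np.maximum(mag - prev, 0.0))
        prev = mag
    return flux, sr / hop

def periodicity(excerpt, df_rate, bpm_min=60.0, bpm_max=200.0):
    """Stage two stand-in: dominant lag of the excerpt's autocorrelation,
    searched within a plausible tempo range, converted to BPM."""
    d = excerpt - excerpt.mean()
    ac = np.correlate(d, d, mode="full")[len(d) - 1:]   # lags 0..n-1
    lag_min = int(df_rate * 60.0 / bpm_max)
    lag_max = int(df_rate * 60.0 / bpm_min)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    return 60.0 * df_rate / lag, float(ac[lag])

def estimate_tempo(x, sr, excerpt_s=4.0, hop_s=2.0):
    """Run the pipeline; 'tracking' is reduced here to keeping the
    single most energetic periodicity candidate."""
    df, df_rate = detection_function(x, sr)
    n, h = int(excerpt_s * df_rate), int(hop_s * df_rate)
    candidates = []
    for start in range(0, len(df) - n + 1, h):
        bpm, strength = periodicity(df[start:start + n], df_rate)
        candidates.append((strength, bpm))
    return max(candidates)[1]
```

For example, a synthetic click track with clicks every 0.5 s should yield an estimate close to 120 BPM, up to the frame-rate quantization of the autocorrelation lag.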
Acknowledgements
This work was jointly supported by the Mexican Council for Science and Technology Grant no. 129114 and the French Ministry of Research under the Project ACI-Music-Discover. The authors would like to thank the anonymous reviewers for their constructive comments, suggestions, and corrections.
Notes
1. By “musically relevant” we refer to all discrete sound events, such as note onsets or changes in loudness, pitch, or timbre, that the listener uses to infer a regular musical pattern.
2. We suppose that the input signal is extracted from commercial CDs, sampled at 44.1 kHz with 16 bits of resolution and converted to mono.
3. In the frequency domain, the shape of the spectral peaks corresponds to the Fourier transform of the analysis window, so a suitable estimate of the peak width (the main-lobe width) can be obtained from the parameters of the analysis window.
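Note 3 can be checked numerically: the null-to-null main-lobe width is 2 DFT bins for a rectangular window and 4 bins for a Hann window. The small script below (illustrative only, not part of the paper) measures this by heavily zero-padding the window and locating the first spectral null on either side of the peak.

```python
import numpy as np

def main_lobe_width_bins(window, zp=64):
    """Null-to-null main-lobe width of `window`, in bins of an
    n-point DFT, measured on a zero-padded spectrum."""
    n = len(window)
    spec = np.abs(np.fft.rfft(window, n * zp))
    # walk right from the DC peak until the magnitude stops decreasing:
    # that sample is the first null of the main lobe
    i = 1
    while spec[i] <= spec[i - 1]:
        i += 1
    first_null = i - 1
    # the lobe is symmetric about DC; convert from the fine (zero-padded)
    # grid back to bins of the original n-point DFT
    return 2.0 * first_null / zp
```

A rectangular window of any length gives a width of 2 bins, i.e. 2·fs/N Hz, while `np.hanning` gives approximately 4 bins, consistent with estimating peak width from the window parameters.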