Abstract
In this paper we present two probabilistic models for real-time acoustic event detection: the Hidden Markov Model and the Change Point Model. We construct the generative models so that each time slice of the audio spectra is generated from a ‘spectral template’ multiplied by a volume factor. From this point of view, we treat event detection as a template matching problem in which the aim is to infer the active template and its volume as the audio data are observed. The novel contribution of this paper is a Change Point Model for real-time template matching using a conditional Poisson observation model. For this model, we develop an exact inference algorithm and an effective approximation scheme. We evaluate the models on online monophonic pitch tracking of two low-pitched instruments, focusing on the trade-off between the latency and the accuracy of the system. The evaluation results suggest favourable features such as quick detection, graceful degradation and an acceptable level of accuracy when compared with a state-of-the-art monophonic pitch tracking algorithm (YIN). We believe that these models provide a flexible and powerful modelling framework for real-time event and pitch detection.
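To make the template matching view concrete, the following is a minimal sketch for a single time slice, and not the inference algorithm developed in the paper. It assumes each spectral bin is Poisson distributed with rate equal to the volume times the template value, estimates the volume by maximum likelihood for each candidate template, and picks the template with the highest likelihood. The temporal structure of the Hidden Markov and Change Point Models and any prior on the volume are deliberately left out. The function names, the two templates and the observed frame are hypothetical and chosen only for illustration.

```python
import math


def poisson_loglik(x, template, volume):
    """Log-likelihood of counts x under x[k] ~ Poisson(volume * template[k]).

    Assumes volume > 0 and strictly positive template entries.
    """
    total = 0.0
    for x_k, t_k in zip(x, template):
        rate = volume * t_k
        total += x_k * math.log(rate) - rate - math.lgamma(x_k + 1)
    return total


def match_template(x, templates):
    """Return (index, volume, loglik) of the best-fitting template.

    For a fixed template, the maximum-likelihood volume under this
    Poisson model is sum(x) / sum(template). Assumes sum(x) > 0.
    """
    best = None
    for i, template in enumerate(templates):
        volume = sum(x) / sum(template)
        loglik = poisson_loglik(x, template, volume)
        if best is None or loglik > best[2]:
            best = (i, volume, loglik)
    return best


# Two hypothetical spectral templates over four frequency bins.
templates = [
    [8.0, 1.0, 0.5, 0.5],  # energy concentrated in the first bin
    [0.5, 1.0, 8.0, 0.5],  # energy concentrated in the third bin
]

# One hypothetical observed time slice, roughly 3x the second template.
frame = [2, 3, 24, 1]

index, volume, loglik = match_template(frame, templates)
print(index, volume)
```

In a real-time setting this per-frame decision would be replaced by filtering over time, which is where the latency and accuracy trade-off mentioned above arises.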
Acknowledgements
We would like to thank the reviewers for helpful comments and suggestions. This work is funded by The Scientific and Technical Research Council of Turkey (TÜBİTAK) grant number 110E292, project ‘Bayesian matrix and tensor factorisations (BAYTEN)’. The work of Umut Şimşekli is supported by the PhD scholarship (2211) from TÜBİTAK.
Notes
1. Note that we use MATLAB's colon operator syntax, in which $(1:T)$ is equivalent to $[1, 2, 3, \dots, T]$ and $x_{1:T} \equiv \{x_1, x_2, \dots, x_T\}$.