Original Articles

Two-scale Auditory Feature Based Non-intrusive Speech Quality Evaluation

Kartik Audhkhasi & Arun Kumar
Pages 111-118 | Published online: 01 Sep 2014
 

Abstract

This paper proposes a novel two-scale, auditory-feature-based algorithm for non-intrusive evaluation of speech quality. Features are extracted from the distorted speech signal using the neuron firing probabilities along the length of the basilar membrane, obtained from an explicit auditory model. This contrasts with previous methods, which either use standard vocal-tract-based features or incorporate only some aspects of the human auditory perception mechanism. The features are extracted at two scales: a global scale spanning all voiced frames in an utterance, and a local scale spanning the voiced frames of each contiguous voiced segment in the utterance. This is followed by a simple information fusion at the score level using Gaussian Mixture Models (GMMs). The use of an explicit auditory model to extract features rests on the premise that qualitatively similar processing happens in human speech perception. In addition, auditory feature extraction at two scales captures the effects of both long-term and short-term distortions on speech quality. The proposed algorithm is shown to perform at least as well as ITU-T Recommendation P.563.
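The two-scale grouping described in the abstract can be illustrated with a minimal sketch. It assumes that per-frame feature vectors and voiced/unvoiced flags are already available, and it uses a plain component-wise mean as the summary statistic at each scale. The abstract does not state which statistics the paper uses, so the mean, the function names, and the input layout are all assumptions made for illustration. The auditory model and the GMM-based score fusion are not covered.

```python
from itertools import groupby
from statistics import fmean


def voiced_segments(voiced_flags):
    """Return one list of frame indices per contiguous run of voiced frames."""
    segments = []
    index = 0
    for is_voiced, run in groupby(voiced_flags):
        length = len(list(run))
        if is_voiced:
            segments.append(list(range(index, index + length)))
        index += length
    return segments


def mean_vector(frames):
    """Component-wise mean of equal-length feature vectors (assumed statistic)."""
    return [fmean(column) for column in zip(*frames)]


def two_scale_features(frame_features, voiced_flags):
    """Summarise features at a global scale and a local (per-segment) scale.

    Global scale: all voiced frames in the utterance.
    Local scale: the voiced frames of each contiguous voiced segment.
    """
    segments = voiced_segments(voiced_flags)
    voiced = [frame_features[i] for segment in segments for i in segment]
    if not voiced:
        raise ValueError("utterance contains no voiced frames")
    global_feature = mean_vector(voiced)
    local_features = [
        mean_vector([frame_features[i] for i in segment]) for segment in segments
    ]
    return global_feature, local_features


# Hypothetical toy utterance: four frames, two features per frame.
features = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [5.0, 6.0]]
flags = [True, True, False, True]
global_feature, local_features = two_scale_features(features, flags)
```

In this toy input the unvoiced third frame splits the utterance into two voiced segments, so the local scale yields two summary vectors while the global scale yields one.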

Additional information

Notes on contributors

Kartik Audhkhasi

Kartik Audhkhasi received his B.Tech. in Electrical Engineering and M.Tech. in Information and Communication Technology from the Indian Institute of Technology, Delhi (IITD) in 2008. At present, he is a Ph.D. student at the Signal Analysis and Interpretation Laboratory (SAIL) in the Electrical Engineering Department at the University of Southern California (USC). He is broadly interested in signal processing and machine learning, with an emphasis on speech processing, recognition, and human language technologies.

Arun Kumar

Arun Kumar received the B.Tech, M.Tech and PhD degrees in Electrical Engineering from the Indian Institute of Technology (IIT), Kanpur. He was a Visiting Researcher at the University of California, Santa Barbara, from 1994 to 1996. Since 1997, he has been with the Centre for Applied Research in Electronics (CARE), IIT Delhi, where he is currently a Professor. His research interests span digital signal processing, underwater acoustics, communications, and voice technologies for man-machine interaction. In these areas, he has introduced new courses at the Masters level and has supervised several Masters and PhD theses at IIT Delhi. He has also supervised over 25 funded research and development projects for Indian and foreign industries, as well as for various government organizations. He has received the Young Scientist award of the International Union of Radio Science (URSI) in the Netherlands.
