VEP Detection for Read, Extempore and Conversation Speech

Pages 2652-2660 | Published online: 02 Mar 2020
 

ABSTRACT

In this paper, we propose a novel approach for accurate detection of vowel end points (VEPs) in any mode of speech. A VEP is the instant at which a vowel ends in the speech signal. In this study, we consider three broad modes of speech: conversation, extempore, and read. Existing methods have explored VEP detection for the read mode of speech and may not be appropriate for VEP detection in the extempore and conversation modes, because the acoustic characteristics of read speech differ markedly from those of the other two modes. To handle this problem, we propose a two-stage method for accurately detecting VEPs irrespective of mode. In the first stage, vowel onset points (VOPs) are detected in the speech signal using our recent method based on the continuous wavelet transform and phone boundaries. A VOP represents the start of a vowel in the speech signal. In the second stage, phone boundaries are detected using a spectral transition measure approach, and the closest succeeding phone boundary for each detected VOP is taken as the detected VEP. Experiments are conducted on the TIMIT and Bengali speech corpora. The performance of the proposed VEP detection method is compared with two state-of-the-art signal processing methods. The significance of the proposed method is shown by automatically detecting vowel regions in the TIMIT and Bengali speech corpora. The evaluation results show that the proposed method performs significantly better than the existing methods.
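
The second-stage pairing rule described in the abstract (take the closest succeeding phone boundary for each detected VOP as the VEP) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `match_veps`, the use of timestamps in seconds, and the choice to require a boundary strictly after the VOP are all assumptions made for illustration. The VOP and phone-boundary detectors themselves are outside the scope of this sketch and are assumed to have already produced the two input lists.

```python
from bisect import bisect_right


def match_veps(vops, boundaries):
    """Pair each VOP with the closest phone boundary that follows it.

    vops, boundaries: lists of timestamps in seconds (assumed units).
    Returns one entry per VOP: the matched boundary, or None when no
    boundary occurs after that VOP.
    """
    boundaries = sorted(boundaries)
    veps = []
    for vop in vops:
        # Index of the first boundary strictly greater than the VOP
        # (hypothetical tie-breaking choice, not stated in the abstract).
        i = bisect_right(boundaries, vop)
        veps.append(boundaries[i] if i < len(boundaries) else None)
    return veps
```

For example, under these assumptions `match_veps([0.10, 0.52], [0.05, 0.18, 0.40, 0.61])` pairs the first VOP with the boundary at 0.18 and the second with the boundary at 0.61, so each vowel region is the interval from a VOP to its matched VEP.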

Additional information

Notes on contributors

Kumud Tripathi

Kumud Tripathi received the BTech degree from the Department of Computer Science and Engineering, SR Group of Institution, Jhansi, India, in 2011 and the MTech degree from the Department of Information Technology, Indian Institute of Information Technology, Allahabad, India, in 2015. She is currently pursuing the PhD degree in the Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur. She has published papers in 6 international conferences and 2 journals. Her research interests include speech and signal processing.

K. Sreenivasa Rao

K. Sreenivasa Rao received the PhD degree from the Department of Computer Science and Engineering, Indian Institute of Technology (IIT), Chennai, in 2005. He is currently working as a professor in the Department of Computer Science and Engineering, IIT Kharagpur, Kharagpur, India. His research interests are speech, audio and music signal processing, machine learning and big-data analytics. He has published more than 200 articles in reputed international journals and conference proceedings.
