Labelling of Hindi Speech

Archana Balyan, Department of ECE, Maharaja Surajmal Institute of Technology, GGSIPU, New Delhi, India

Amita Dev, Department of CSE, Bhai Parmanand Institute of Business Studies, Shakarpur, New Delhi, India

Ruchika Kumari, Department of ECE, Maharaja Surajmal Institute of Technology, GGSIPU, New Delhi, India

Shyam Sunder Agrawal, Director General, KIIT, MD University, Rohtak, India

ABSTRACT

The goal of this paper is to obtain speech that is segmented and labelled at the syllable level, and to show that a reasonable number of syllables may suffice for travel-domain applications. A baseline group-delay-based segmentation technique is applied to spoken sentences to generate a database labelled at the syllable level. The system is validated against 50 manually segmented speech utterances, and segmentation accuracy is evaluated by time-error analysis. 63.07% of syllables have a time error of less than 30 ms, and vowels are segmented more accurately than fricatives. The confidence interval is found to be 0.1147 ms for a confidence level of 95%. This paper also presents an implementation of an algorithm that identifies syllables in Hindi words using linguistic rules. After a survey of the relevant literature, a set of rules is identified and implemented as a simple algorithm. The text segmentation algorithm is tested on 2400 distinct words and segments written text with 99.5% accuracy.
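The time-error analysis mentioned above can be illustrated with a minimal sketch. It assumes that automatic and manual syllable boundaries are already paired one-to-one and given in milliseconds, and that the interval is a normal-approximation interval on the mean absolute error; the abstract does not state either detail, and the function name `time_error_summary` and its parameters are hypothetical.

```python
import math
from statistics import mean, stdev


def time_error_summary(manual_ms, automatic_ms, tolerance_ms=30.0, z=1.96):
    """Summarise boundary time errors between paired manual and automatic labels.

    Assumptions (not taken from the paper): boundaries are paired by index,
    and the interval is z * s / sqrt(n) with z = 1.96 for a 95% level.
    """
    if len(manual_ms) != len(automatic_ms):
        raise ValueError("boundary lists must have the same length")
    if len(manual_ms) < 2:
        raise ValueError("at least two boundaries are needed for a standard deviation")

    # Absolute deviation of each automatic boundary from its manual reference.
    errors = [abs(a - m) for m, a in zip(manual_ms, automatic_ms)]

    # Share of boundaries whose error is strictly below the tolerance.
    within = sum(1 for e in errors if e < tolerance_ms)
    pct_within = 100.0 * within / len(errors)

    # Half-width of the normal-approximation confidence interval on the mean error.
    half_width = z * stdev(errors) / math.sqrt(len(errors))

    return {
        "mean_error_ms": mean(errors),
        "pct_within": pct_within,
        "ci_half_width_ms": half_width,
    }


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    manual = [100.0, 250.0, 400.0, 620.0]
    automatic = [110.0, 245.0, 450.0, 615.0]
    print(time_error_summary(manual, automatic))
```

With real data, `manual_ms` would hold the hand-labelled boundaries of the validation utterances and `automatic_ms` the boundaries produced by the segmenter, so `pct_within` corresponds to the kind of figure reported for the 30 ms threshold.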

Acknowledgements

The authors would like to acknowledge Prof. Dr Hema Murthy from IIT, Chennai for her support throughout the execution of this work.

Additional information

Notes on contributors

Archana Balyan

Archana Balyan obtained her BE (ECE) degree from Bangalore University and her ME (ECE) degree from Delhi College of Engineering, University of Delhi. She has 20 years of teaching experience and is an associate professor at Maharaja Surajmal Institute of Technology (a premier institute affiliated to GGSIP University, Delhi). She has published several papers in reputed international journals and in the proceedings of leading conferences. Her research interests include speech synthesis, computational linguistics and analog electronics.

Amita Dev

Amita Dev obtained her BTech degree from Punjab University and completed her postgraduate studies at BITS, Pilani, India. She obtained her PhD degree from Delhi University in the area of speech recognition. She has more than 25 years of experience and is presently working as Principal of Ambedkar Polytechnic, New Delhi and Bhai Parmanand Institute of Business Studies. She was awarded the “National level best engineering teachers award” in 2001 by ISTE for her significant contribution to the field of engineering and technology. She has also been awarded the “State level best teacher award” by the Department of Training and Technical Education, Government of Delhi. She is the recipient of the “National level young teachers award” for pursuing advanced research in the field of speech recognition. She has published more than 20 papers in international and national journals and in the proceedings of leading conferences. She has written several books in the area of computer science and engineering.

Ruchika Kumari

Ruchika Kumari obtained her BE (ECE) degree from Swami Ramanand Teerth University, Maharashtra and her ME (Power Electronics) degree from RGPV University, Bhopal. She is presently an assistant professor at Maharaja Surajmal Institute of Technology (a premier institute affiliated to GGSIP University) and has more than 10 years of teaching experience. Her research interests include speech signal processing.

Shyam Sunder Agrawal

Shyam Sunder Agrawal is a scientist and teacher of eminence with more than 40 years of R&D and teaching experience. He obtained his MSc and PhD degrees in physics and electronics in 1965 and 1970, respectively, from Aligarh Muslim University, Aligarh. He has published more than 150 research papers in Indian and foreign journals and has presented a large number of invited and contributed research papers at international and national conferences and academic forums. Key areas of his research include human speech perception, speaker identification, high-quality speech synthesis and the development of speech corpora and databases for Indian languages. Currently, Dr. S.S. Agrawal is Director General at KIIT, Gurgaon and an Advisor to CDAC, Noida.

