Original Article

Segmental Intelligibility of Three Text-to-Speech Synthesis Methods in Reverberant Environments

Pages 150-163 | Published online: 12 Jul 2009
 

Abstract

In this study, the segmental intelligibility of three currently available text-to-speech products was investigated under two reverberant conditions. The reverberation times used were 1.2 and 2.4 s, simulating reverberation that may exist in a large room and in a large hall with poor acoustics. Human speech had an overall intelligibility (whole words correct) of 95% and a phoneme error rate of 2.35% under the reverberant conditions investigated, values that were not significantly different from those obtained in a nonreverberant controlled condition. In contrast, the overall intelligibility of the text-to-speech voices was 68% and their phoneme error rate was 14.48%, indicating that text-to-speech output suffers significantly in the same reverberant conditions. Implications of these findings for the improvement of text-to-speech products and for the practice of AAC are discussed, with suggestions for further research.
