Original Articles

Pronunciation of Digit Sequences in Text-to-Speech Systems

Pages 241-249 | Published online: 24 Oct 2007
 

Abstract

Text-to-speech systems usually consist of a preprocessor for expanding abbreviations, a system for converting orthographic text to a phonemic representation, rules for generating appropriate rhythm and intonation, and a speech synthesizer that generates an acoustic waveform from the phonemic representation. Multi-layer perceptrons have recently been used for the orthographic-to-phonemic conversion process. This paper explores the possibility of using perceptrons in the preprocessor as well. It is shown that single-layer perceptrons are sufficient for expanding 3-digit numbers, 4-digit numbers and cardinal numbers into appropriate orthographic text, but that a multi-layer perceptron is required for expanding 12-hour clock times.
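
To make the preprocessing task concrete, the following is a minimal rule-based sketch of the input-to-output mapping that such a preprocessor has to produce for two of the cases named in the abstract: 3-digit numbers and 12-hour clock times. It is not the perceptron method studied in the paper, only a hand-written reference for the kind of expansion involved. The function names (`expand_below_100`, `expand_three_digit`, `expand_clock_time`) are hypothetical, and the wording conventions (British-style "and" after "hundred", "oh" for a leading zero in the minutes, "o'clock" on the hour) are assumptions rather than the paper's own target strings.

```python
# Rule-based reference for digit-sequence expansion (illustrative only;
# the paper itself uses perceptrons, not hand-written rules).

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def expand_below_100(n: int) -> str:
    """Spell out an integer in the range 0..99."""
    if not 0 <= n < 100:
        raise ValueError(f"expected 0..99, got {n}")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def expand_three_digit(digits: str) -> str:
    """Spell out a 3-digit string as a cardinal number (assumed convention)."""
    if len(digits) != 3 or not digits.isdigit():
        raise ValueError(f"expected exactly three digits, got {digits!r}")
    hundreds, rest = divmod(int(digits), 100)
    if hundreds == 0:
        # Leading zero: fall back to the two-digit reading.
        return expand_below_100(rest)
    head = f"{ONES[hundreds]} hundred"
    return head if rest == 0 else f"{head} and {expand_below_100(rest)}"


def expand_clock_time(text: str) -> str:
    """Spell out a 12-hour clock time written as H:MM (assumed convention)."""
    hour_text, sep, minute_text = text.partition(":")
    if sep != ":" or not (hour_text.isdigit() and minute_text.isdigit()):
        raise ValueError(f"expected H:MM, got {text!r}")
    hour, minute = int(hour_text), int(minute_text)
    if not (1 <= hour <= 12 and 0 <= minute <= 59):
        raise ValueError(f"not a 12-hour clock time: {text!r}")
    if minute == 0:
        return f"{ONES[hour]} o'clock"
    if minute < 10:
        return f"{ONES[hour]} oh {ONES[minute]}"
    return f"{ONES[hour]} {expand_below_100(minute)}"


if __name__ == "__main__":
    print(expand_three_digit("241"))
    print(expand_clock_time("3:05"))
```

Under these assumed conventions, the clock-time rule has to treat the minute field in three different ways depending on its value, whereas the 3-digit rule handles each digit position more uniformly. This is offered only as an illustration of the two tasks, not as an explanation of the paper's result.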
