Novel Applications of Neural Networks in Speech Technology Systems: Search Space Reduction and Prosodic Modeling

Pages 631-646 | Published online: 01 Mar 2013

