Original Article

Segmental Intelligibility of Three Text-to-Speech Synthesis Methods in Reverberant Environments

Pages 150-163 | Published online: 12 Jul 2009
