Voice banking to support individuals who use speech-generating devices: development and evaluation of Singaporean-accented English synthetic voices and a Singapore Colloquial English recording inventory

Pages 208-218 | Received 08 Apr 2022, Accepted 02 Feb 2023, Published online: 27 Mar 2023

