Review

Automated speech analysis tools for children’s speech production: A systematic literature review

Pages 583-598 | Received 24 Jul 2017, Accepted 14 May 2018, Published online: 11 Jul 2018

References

  • Allen, M.M. (2013). Intervention efficacy and intensity for children with speech sound disorder. Journal of Speech, Language & Hearing Research, 56, 865–877. doi:10.1044/1092-4388(2012/11-0076)
  • Australian Bureau of Statistics. (2016). Household use of information technology, Australia, 2014-15 (Vol. 2017). Canberra, Australia: Australian Bureau of Statistics.
  • Azizi, S., Towhidkhah, F., & Almasganj, F. (2012). Study of VTLN method to recognize common speech disorders in speech therapy of Persian children. Paper presented at the 2012 19th Iranian Conference of Biomedical Engineering, ICBME 2012.
  • Baker, E. (2012). Optimal intervention intensity in speech-language pathology: Discoveries, challenges, and unchartered territories. International Journal of Speech-Language Pathology, 14, 478–485. doi:10.3109/17549507.2012.717967
  • Baker, E., & McLeod, S. (2011). Evidence-based practice for children with speech sound disorders: Part 1 Narrative review. Language, Speech & Hearing Services in Schools, 42, 102–139. doi:10.1044/0161-1461(2010/09-0075)
  • Ballard, K.J., Robin, D.A., McCabe, P., & McDonald, J. (2010). A treatment for dysprosody in childhood apraxia of speech. Journal of Speech, Language & Hearing Research, 53, 1227–1245. doi:10.1044/1092-4388(2010/09-0130)
  • Bártů, M., & Tucková, J. (2008). A classification method of children with developmental dysphasia based on disorder speech analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5164 LNCS, pp. 822–828).
  • Byun, T.M., Campbell, H., Carey, H., Liang, W., Park, T.H., & Svirsky, M. (2017). Enhancing intervention for residual rhotic errors via app-delivered biofeedback: A case study. Journal of Speech, Language, and Hearing Research, 60, 1810–1817. doi:10.1044/2017_JSLHR-S-16-0248
  • Charter, R.A. (2003). A breakdown of reliability coefficients by test type and reliability method, and the clinical implications of low reliability. The Journal of General Psychology, 130, 290–304. doi:10.1080/00221300309601160
  • Chen, Y.J. (2011). Identification of articulation error patterns using a novel dependence network. IEEE Transactions on Biomedical Engineering, 58, 3061–3068. doi:10.1109/TBME.2011.2135352
  • Cler, G.J., Mittelman, T., Braden, M.N., Woodnorth, G.H., & Stepp, C.E. (2017). Video game rehabilitation of velopharyngeal dysfunction: A case series. Journal of Speech, Language, and Hearing Research, 60, 1800–1809. doi:10.1044/2017_JSLHR-S-16-0231
  • Cucchiarini, C. (1996). Assessing transcription agreement: Methodological aspects. Clinical Linguistics & Phonetics, 10, 131–155. doi:10.3109/02699209608985167
  • de Wet, F., Van der Walt, C., & Niesler, T.R. (2009). Automatic assessment of oral language proficiency and listening comprehension. Speech Communication, 51, 864–874. doi:10.1016/j.specom.2009.03.002
  • Delmonte, R. (2009). Prosodic tools for language learning. International Journal of Speech Technology, 12, 161–184. doi:10.1007/s10772-010-9065-1
  • Deng, L., & Li, X. (2013). Machine learning paradigms for speech recognition: An overview. IEEE Transactions on Audio, Speech & Language Processing, 21, 130.
  • Dodd, B. (2013). Differential diagnosis and treatment of children with speech disorder. West Sussex, UK: Wiley.
  • Dudy, S., Asgari, M., & Kain, A. (2015). Pronunciation analysis for children with speech sound disorders. Paper presented at the Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS.
  • Duenser, A., Ward, L., Stefania, A., Smith, D., Freyne, J., Morgan, A., & Dodd, B. (2016). Feasibility of technology enabled speech disorder screening. In A. Georgiou, L. K. Schaper, & S. Whetton (Eds.), Digital health innovation for consumers, clinicians, connectivity and community (Vol. 227, pp. 21–27). Amsterdam, Netherlands: IOS Press.
  • Edeal, D.M., & Gildersleeve-Neumann, C.E. (2011). The importance of production frequency in therapy for childhood apraxia of speech. American Journal of Speech-Language Pathology, 20, 95–110. doi:10.1044/1058-0360(2011/09-0005)
  • Engwall, O., & Balter, O. (2007). Pronunciation feedback from real and virtual language teachers. Computer Assisted Language Learning, 20, 235–262. doi:10.1080/09588220701489507
  • Eskenazi, M. (2009). An overview of spoken language technology for education. Speech Communication, 51, 832–884. doi:10.1016/j.specom.2009.04.005
  • Expressive Solutions LLC. (2011). ArtikPix (Version 2.0) [Mobile Application]: Expressive Solutions LLC. Retrieved from http://itunes.apple.com
  • Ferrer, L., Bratt, H., Richey, C., Franco, H., Abrash, V., & Precoda, K. (2015). Classification of lexical stress using spectral and prosodic features for computer-assisted language learning systems. Speech Communication, 69, 31–45. doi:10.1016/j.specom.2015.02.002
  • Grant, M.J., & Booth, A. (2009). A typology of reviews: An analysis of 14 review types and associated methodologies. Health Information & Libraries Journal, 26, 91–108. doi:10.1111/j.1471-1842.2009.00848.x
  • Hacker, C., Cincarek, T., Maier, A., Heßler, A., & Nöth, E. (2007, April 15–20). Boosting of prosodic and pronunciation features to detect mispronunciations of non-native children. Paper presented at the 2007 IEEE International Conference on Acoustics, Speech and Signal Processing – ICASSP '07.
  • Kadi, K.L., Selouani, S.A., Boudraa, B., & Boudraa, M. (2016). Fully automated speaker identification and intelligibility assessment in dysarthria disease using auditory knowledge. Biocybernetics and Biomedical Engineering, 36, 233–247. doi:10.1016/j.bbe.2015.11.004
  • Keilmann, A., Braun, L., & Napiontek, U. (2004). Emotional satisfaction of parents and speech-language therapists with outcome of training intervention in children with speech and language disorders. Folia Phoniatrica et Logopaedica, 56, 51–61. doi:10.1159/000075328
  • Kenny, B., & Lincoln, M. (2012). Sport, scales, or war? Metaphors speech-language pathologists use to describe caseload management. International Journal of Speech-Language Pathology, 14, 247–259. doi:10.3109/17549507.2012.651747
  • Kent, R.D., & Kim, Y.J. (2003). Toward an acoustic typology of motor speech disorders. Clinical Linguistics & Phonetics, 17, 427–445. doi:10.1080/0269920031000086248
  • Keshet, J. (in press). Automatic speech recognition: A primer for speech pathology researchers. International Journal of Speech-Language Pathology.
  • Kurian, C. (2014). A review on technological development of automatic speech recognition. International Journal of Soft Computing and Engineering, 4, 2231–2307.
  • Lee, J., Lee, C.H., Kim, D.-W., & Kang, B.-Y. (2017). Smartphone-assisted pronunciation learning technique for ambient intelligence. IEEE Access, 5, 312–325. doi:10.1109/ACCESS.2016.2641474
  • Lee, S., Noh, H., Lee, J., Lee, K., Lee, G.G., Sagong, S., & Kim, M. (2011). On the effectiveness of Robot-Assisted Language Learning. ReCALL, 23, 25–58. doi:10.1017/S0958344010000273
  • Lim, J.M., McCabe, P., & Purcell, A. (2017). Challenges and solutions in speech-language pathology service delivery across Australia and Canada. European Journal for Person Centred Healthcare, 5, 120–128. doi:10.5750/ejpch.v5i1.1244
  • Maier, A., Haderlein, T., Eysholdt, U., Rosanowski, F., Batliner, A., Schuster, M., & Nöth, E. (2009a). PEAKS – A system for the automatic evaluation of voice and speech disorders. Speech Communication, 51, 425–437. doi:10.1016/j.specom.2009.01.004
  • Maier, A., Hönig, F., Bocklet, T., Nöth, E., Stelzle, F., Nkenke, E., & Schuster, M. (2009b). Automatic detection of articulation disorders in children with cleft lip and palate. Journal of the Acoustical Society of America, 126, 2589–2602. doi:10.1121/1.3216913
  • Maier, A., Hönig, F., Hacker, C., Schuster, M., & Nöth, E. (2008). Automatic evaluation of characteristics of speech disorders in children with cleft lip and palate. Paper presented at the Interspeech 2008 – International Conference on Spoken Language Processing, Brisbane, Australia.
  • Martin, T. (2014). The evolution of the smartphone. Pocketnow, 2017 (25th June). Retrieved from Pocketnow.com website: http://pocketnow.com/2014/07/28/the-evolution-of-the-smartphone
  • Mazenan, M. N., Swee, T. T., & Soh, S. S. (2015). Recognition test on highly newly robust Malay corpus based on statistical analysis for Malay articulation disorder. Paper presented at the BMEiCON 2014 – 7th Biomedical Engineering International Conference.
  • McAllister, L., McCormack, J., McLeod, S., & Harrison, L.J. (2011). Expectations and experiences of accessing and participating in services for childhood speech impairment. International Journal of Speech-Language Pathology, 13, 251–267. doi:10.3109/17549507.2011.535565
  • McLeod, S., & Baker, E. (2014). Speech-language pathologists' practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders. Clinical Linguistics & Phonetics, 28, 508–531. doi:10.3109/02699206.2014.926994
  • Moher, D., Liberati, A., Tetzlaff, J., Altman, D.G., & The PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Medicine, 6, e1000097. doi:10.1371/journal.pmed.1000097
  • Morton, H., Gunson, N., & Jack, M. (2012). Interactive language learning through speech-enabled virtual scenarios. Advances in Human-Computer Interaction, 2012, 389523.
  • Murray, E., McCabe, P., & Ballard, K.J. (2014). A systematic review of treatment outcomes for children with childhood apraxia of speech. American Journal of Speech-Language Pathology, 23, 486–504. doi:10.1044/2014_AJSLP-13-0035
  • Murray, E., McCabe, P., & Ballard, K.J. (2015). A randomized controlled trial for children with childhood apraxia of speech comparing Rapid Syllable Transition Treatment and the Nuffield Dyspraxia Programme–Third Edition. Journal of Speech, Language & Hearing Research, 58, 669–686. doi:10.1044/2015_JSLHR-S-13-0179
  • Murray, E., McCabe, P., Heard, R., & Ballard, K.J. (2015). Differential diagnosis of children with suspected childhood apraxia of speech. Journal of Speech, Language & Hearing Research, 58, 43–60. doi:10.1044/2014_JSLHR-S-12-0358
  • Mustafa, B.M., Rosdi, F., Salim, S.S., & Mughal, M.U. (2015). Exploring the influence of general and specific factors on the recognition accuracy of an ASR system for dysarthric speaker. Expert Systems with Applications, 42, 3924–3932. doi:10.1016/j.eswa.2015.01.033
  • Navarro-Newball, A.A., Loaiza, D., Oviedo, C., Castillo, A., Portilla, A., Linares, D., & Álvarez, G. (2014). Talking to Teo: Video game supported speech therapy. Entertainment Computing, 5, 401–412. doi:10.1016/j.entcom.2014.10.005
  • Neri, A., Mich, O., Gerosa, M., & Giuliani, D. (2008). The effectiveness of computer assisted pronunciation training for foreign language learning by children. Computer Assisted Language Learning, 21, 393–408. doi:10.1080/09588220802447651
  • Nicolao, M., Beeston, A.V., & Hain, T. (2015, April 19–24). Automatic assessment of English learner pronunciation using discriminative classifiers. Paper presented at the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
  • O’Callaghan, C., McAllister, L., & Wilson, L. (2005). Barriers to accessing rural paediatric speech pathology services: Health care consumers' perspectives. Australian Journal of Rural Health, 13, 162–171. doi:10.1111/j.1440-1854.2005.00686.x
  • O’Shaughnessy, D. (2008). Invited paper: Automatic speech recognition: History, methods and challenges. Pattern Recognition, 41, 2965–2979. doi:10.1016/j.patcog.2008.05.008
  • O’Shaughnessy, D. (2015, October 28–30). Automatic speech recognition. Paper presented at the 2015 Chilean Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), Santiago, Chile.
  • Obach, D.D., & Cordel, M.O. (2012, November 19–22). Performance comparison of ASR classifiers for the development of an English CAPT system for Filipino students. Paper presented at the TENCON 2012 IEEE Region 10 Conference.
  • Oliveira, C., Lousada, M., & Jesus, L.M.T. (2015). The clinical practice of speech and language therapists with children with phonologically based speech sound disorders. Child Language Teaching & Therapy, 31, 173–194. doi:10.1177/0265659014550420
  • Pantoja, M. (2014). Automatic pronunciation assistance on video. Paper presented at the PIVP 2014 - Proceedings of the 1st International Workshop on Perception Inspired Video Processing, Workshop of MM 2014.
  • Parnandi, A., Karappa, V., Lan, T., Shahin, M., McKechnie, J., Ballard, K., … Gutierrez-Osuna, R. (2015). Development of a remote therapy tool for childhood apraxia of speech. ACM Transactions on Accessible Computing, 7, 10. doi:10.1145/2776895
  • Ruggero, L., McCabe, P., Ballard, K.J., & Munro, N. (2012). Paediatric speech language pathology service delivery: An exploratory survey of Australian parents. International Journal of Speech-Language Pathology, 14, 338–350. doi:10.3109/17549507.2011.650213
  • Saz, O., Lleida, E., & Rodríguez, W. R. (2009). Avoiding speaker variability in pronunciation verification of children's disordered speech. Paper presented at the Proceedings of the 2nd Workshop on Child, Computer and Interaction, WOCCI '09.
  • Saz, O., Yin, S.C., Lleida, E., Rose, R., Vaquero, C., & Rodríguez, W.R. (2009). Tools and technologies for computer-aided speech and language therapy. Speech Communication, 51, 948–967. doi:10.1016/j.specom.2009.04.006
  • Schipor, O.A., Pentiuc, S.G., & Schipor, M.D. (2012). Automatic assessment of pronunciation quality of children within assisted speech therapy. Automatinis vaikų tarsenos kokybės vertinimas pagalbinio kalbėjimo terapijoje, (122), 15–18.
  • Shahin, M., Ahmed, B., & Ballard, K. J. (2012). Automatic classification of unequal lexical stress patterns using machine learning algorithms. Paper presented at the 2012 IEEE Workshop on Spoken Language Technology, Miami, FL, USA.
  • Shahin, M., Ahmed, B., McKechnie, J., Ballard, K., & Gutierrez-Osuna, R. (2014). Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech. Paper presented at the Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH.
  • Shahin, M., Ahmed, B., Parnandi, A., Karappa, V., McKechnie, J., Ballard, K.J., & Gutierrez-Osuna, R. (2015). Tabby Talks: An automated tool for the assessment of childhood apraxia of speech. Speech Communication, 70, 49–64. doi:10.1016/j.specom.2015.04.002
  • Shahin, M., Epps, J., & Ahmed, B. (2016). Automatic classification of lexical stress in English and Arabic languages using deep learning. Paper presented at the Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH.
  • Shaikh, N., & Deshmukh, R.R. (2016). Speech recognition system – a review. IOSR Journal of Computer Engineering, 18, 1–9.
  • Simmons, E.S., Paul, R., & Shic, F. (2016). Brief Report: A mobile application to treat prosodic deficits in autism spectrum disorder and other communication impairments: A pilot study. Journal of Autism & Developmental Disorders, 46, 320–327. doi:10.1007/s10803-015-2573-8
  • Singh, S., Thakur, A., & Vir, D. (2015). Automatic articulation error detection tool for Punjabi language with aid for hearing impaired people. International Journal of Speech Technology, 18, 143–156. doi:10.1007/s10772-014-9256-2
  • Skahan, S.M., Watson, M., & Lof, G.L. (2007). Speech-language pathologists' assessment practices for children with suspected speech sound disorders: results of a national survey. American Journal of Speech-Language Pathology, 16, 246–259. doi:10.1044/1058-0360(2007/029)
  • Strik, H., & Cucchiarini, C. (1999). Modeling pronunciation variation for ASR: A survey of the literature. Speech Communication, 29, 225–246. doi:10.1016/S0167-6393(99)00038-2
  • Su, H. Y., Wu, C. H., & Tsai, P. J. (2008). Automatic assessment of articulation disorders using confident unit-based model adaptation. Paper presented at the ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing – Proceedings.
  • Suanpirintr, S., & Thubthong, N. (2007). The effect of pauses in dysarthric speech recognition study on Thai cerebral palsy children. Paper presented at the Proceedings of the 1st international convention on Rehabilitation engineering & assistive technology: in conjunction with 1st Tan Tock Seng Hospital Neurorehabilitation Meeting, Singapore.
  • Sztaho, D., Nagy, K., & Vicsi, K. (2010). Subjective tests and automatic sentence modality recognition with recordings of speech impaired children. Paper presented at the Proceedings of the Second international conference on Development of Multimodal Interfaces: active Listening and Synchrony, Dublin, Ireland.
  • Thomas, D.C., McCabe, P., & Ballard, K.J. (2014). Rapid Syllable Transitions (ReST) treatment for childhood apraxia of speech: The effect of lower dose-frequency. Journal of Communication Disorders, 51, 29–42. doi:10.1016/j.jcomdis.2014.06.004
  • Ting, H. N., & Mark, K. M. (2008). Speaker-dependent Malay vowel recognition for a child with articulation disorder using multi-layer perceptron. Paper presented at the IFMBE Proceedings.
  • To, C.K., Law, T., & Cheung, P.S.P. (2012). Treatment intensity in everyday clinical management of speech sound disorders in Hong Kong. International Journal of Speech-Language Pathology, 14, 462–466. doi:10.3109/17549507.2012.688867
  • Tommy, C. A., & Minoi, J. L. (2016, December 4–8). Speech therapy mobile application for speech and language impairment children. Paper presented at the 2016 IEEE EMBS Conference on Biomedical Engineering and Sciences (IECBES).
  • van Santen, J.P.H., Prud’hommeaux, E.T., & Black, L.M. (2009). Automated assessment of prosody production. Speech Communication, 51, 1082–1097. doi:10.1016/j.specom.2009.04.007
  • Verdon, S., Wilson, L., Smith-Tamaray, M., & McAllister, L. (2011). An investigation of equity of rural speech-language pathology services for children: A geographical perspective. International Journal of Speech-Language Pathology, 13, 239–250. doi:10.3109/17549507.2011.573865
  • Wang, Y.H., & Young, S.S.C. (2015). Effectiveness of feedback for enhancing English pronunciation in an ASR-based CALL System. Journal of Computer Assisted Learning, 31, 493–504. doi:10.1111/jcal.12079
  • Wielgat, R., Zieliński, T.P., Woźniak, T., Grabias, S., & Król, D. (2008). Automatic recognition of pathological phoneme production. Folia Phoniatrica et Logopaedica, 60, 323–331. doi:10.1159/000170083
  • Williams, L.A. (2012). Intensity in phonological intervention: Is there a prescribed amount? International Journal of Speech-Language Pathology, 14, 456–461. doi:10.3109/17549507.2012.688866
