Communication rights of people with communication disabilities

Fostering human rights through TalkBank

Pages 115-119 | Received 29 Jul 2017, Accepted 08 Oct 2017, Published online: 10 Nov 2017

Abstract

In accord with articles 19 and 27 of the Universal Declaration of Human Rights, people with speech and language disorders have the right to receive maximal benefit from academic research on speech and language acquisition and disorders. To evaluate the diverse nature of speech and language disorders, this research must have access to large datasets, as well as to refined tools for the systematic analysis of these datasets. The TalkBank system addresses this need by providing researchers with thousands of hours of open-access database archives of digital audio, video and transcript files documenting typical and disordered language use in dozens of languages and cultures. In this paper, we review the TalkBank system, with an emphasis on the AphasiaBank, PhonBank and FluencyBank databases. We describe how specialised assessment tools can be used to study issues in speech and language acquisition and disorders recorded within these databases. We then provide illustrations of how assessments support the needs of researchers, clinicians, developers, and educators, whose combined work contributes solutions for people with speech, language and language learning disorders worldwide.

In accord with articles 19 and 27 of the Universal Declaration of Human Rights, people with speech and language disorders have the right to receive maximal benefit from academic research on speech and language acquisition and disorders. To evaluate the diverse nature of speech and language disorders, researchers must have access to large datasets and powerful tools for the systematic analysis of these datasets. The TalkBank system addresses this need by providing researchers with thousands of hours of open-access transcription linked to digital audio or video, documenting typical and disordered spoken language use in dozens of languages and cultures. The TalkBank system is grounded on these six basic principles: (1) use of a common transcription format called CHAT (Codes for the Human Analysis of Transcripts), (2) free availability of the analysis programs called CLAN (Child Language ANalysis), (3) open access to contributed data, (4) protection of participant rights through informed consent, (5) interoperability with other language analysis systems and (6) conformity with international standards for language database structure. The TalkBank system is currently composed of 12 specialised spoken language banks. In this report, we will focus on AphasiaBank for the study of aphasia, PhonBank for the study of typical and disordered child phonology, and FluencyBank for the study of disfluency and stuttering. The data and resources for each of these banks can be accessed through http://talkbank.org.

AphasiaBank

The June 2017 announcement marking Aphasia Awareness month emphasised the fact that communication access is a basic human right, and that it is important to promote strategies to remove communication barriers and reduce the psychological distress of living with aphasia. AphasiaBank (MacWhinney, Fromm, Forbes, & Holland, Citation2011) (http://aphasia.talkbank.org/) aims to achieve those goals through improving patient-oriented assessment and treatment of aphasia, based on the analysis of spoken language use from a large number of people with aphasia. The system has been organised and directed by Brian MacWhinney, Audrey Holland, Davida Fromm, and Margie Forbes. Over 750 professionals from 37 countries and a variety of disciplines (e.g. speech-language pathology, linguistics, psychology, neurology, computer science) have requested access to the database for research, education, and/or clinical purposes. Over 50 research and clinical sites have contributed to the database.

AphasiaBank focuses on recordings gathered using a standard discourse protocol and elicitation script (http://aphasia.talkbank.org/protocol/). This focus on discourse samples is motivated by the importance of spoken interaction in everyday life, as well as the richness of the macro- and micro-linguistic information contained in discourse samples. As of June 2017, the protocol database contains 425 transcripts and video files from people with aphasia and 247 from people without aphasia. These samples are mostly in English, but smaller corpora are also available for French, Cantonese, German, Italian, Japanese, Spanish, and some bilingual participants. Transcriptions are done in CHAT format (MacWhinney, Citation2000), and transcribers use an error-coding format to capture word-level and sentence-level errors. The discourse protocol is augmented by a standard test battery and comprehensive demographic data collection.
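As a rough illustration of how a shared transcript format supports automated analysis, the following minimal sketch groups utterances by speaker. It assumes a simplified CHAT-like layout in which header lines begin with `@`, main speaker tiers begin with `*` plus a speaker code, and a tab separates the code from the utterance. The sample text and the function name `utterances_by_speaker` are hypothetical, and the sketch does not handle error codes, dependent tiers, or media links.

```python
import re
from collections import defaultdict

# Hypothetical excerpt in a simplified CHAT-like layout (not real AphasiaBank data).
SAMPLE = """\
@Begin
@Participants:\tPAR Participant, INV Investigator
*INV:\ttell me what happened .
*PAR:\tI was at home .
*PAR:\tthen I could not talk .
@End
"""

# A main tier line: asterisk, speaker code, colon, tab, utterance text.
TIER = re.compile(r"^\*(?P<speaker>[A-Z0-9]+):\t(?P<text>.*)$")


def utterances_by_speaker(chat_text):
    """Group main-tier utterances by speaker code.

    Header lines (starting with @) and any other lines are skipped.
    """
    grouped = defaultdict(list)
    for line in chat_text.splitlines():
        match = TIER.match(line)
        if match:
            grouped[match.group("speaker")].append(match.group("text"))
    return dict(grouped)


result = utterances_by_speaker(SAMPLE)
for speaker, lines in result.items():
    print(speaker, len(lines), "utterance(s)")
```

Because every transcript follows the same conventions, the same small routine could in principle be applied across contributions from many sites; the CLAN programs remain the reference tools for real analyses.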

An alternative interview method was developed for eliciting communication samples from participants with more severe or global aphasia. The Famous People Protocol (http://aphasia.talkbank.org/famous/) uses a detailed interactive protocol with pictures of famous entertainers, world figures, sports figures, and former United States presidents. Participants with aphasia are encouraged to communicate using any modality (e.g. gesture, pantomime, drawing, singing) or compensatory strategies (e.g. circumlocuting, self-cuing). This user-friendly tool was designed to yield valuable clinical information while also providing patients with experiences of communication success.

The AphasiaBank database has facilitated the development of new clinician-friendly discourse evaluation tools focussing on methods such as core concepts (Richardson & Dalton, Citation2016) or propositional density (Brown, Snodgrass, Kemper, Herman, & Covington, Citation2008). The CLAN programs for analysis of TalkBank materials allow researchers and clinicians to automatically compute a wide range of these measures and to compare results from individual patients with patterns in the larger database. Use of the CLAN programs and AphasiaBank data has resulted in hundreds of publications, presentations and theses on topics such as treatment outcomes, recovery, listener perceptions, lexical diversity, syntax, gestures, story grammar, coherence, and aphasia syndrome classification. The website provides links at http://aphasia.talkbank.org/publications/ to publications that have used AphasiaBank data. Many other videos and transcripts of therapy (e.g. Script training, group therapy, oral reading assessment) and non-protocol tasks (e.g. conversations, story retells, other picture descriptions) are also at the website. Educational resources include a Grand Rounds tutorial with descriptions of classic aphasia types, video samples and questions for discussion about the language samples and potential treatment approaches.
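The idea of comparing one patient's result with patterns in a larger database can be sketched as follows. This is a minimal illustration, not the CLAN implementation: it uses the type-token ratio as a simple stand-in for lexical diversity and a z-score against a reference group. The token list, the reference values, and the function names are all hypothetical.

```python
from statistics import mean, stdev


def type_token_ratio(tokens):
    """Number of distinct word forms divided by the total number of words."""
    if not tokens:
        raise ValueError("need at least one token")
    lowered = [t.lower() for t in tokens]
    return len(set(lowered)) / len(lowered)


def z_score(value, reference_values):
    """Distance of value from the reference mean, in reference standard deviations."""
    return (value - mean(reference_values)) / stdev(reference_values)


# Hypothetical numbers for illustration only (not AphasiaBank data).
patient_tokens = "the boy the boy is is kicking the ball".split()
reference_ttrs = [0.70, 0.80, 0.75, 0.85, 0.65]

patient_ttr = type_token_ratio(patient_tokens)
patient_z = z_score(patient_ttr, reference_ttrs)
print(f"TTR = {patient_ttr:.2f}, z = {patient_z:.2f}")
```

A strongly negative z-score would flag a sample as less lexically diverse than the reference group. In practice, the type-token ratio is sensitive to sample length, so any real comparison would need samples of comparable size or a length-corrected measure.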

Future work in AphasiaBank will address: (1) apraxia of speech assessment using state-of-the-art computer-mediated procedures; (2) more automation for transcription and clinical assessment; (3) long-term recovery processes; (4) functional communication; and (5) comparisons with other populations (e.g. people with traumatic brain injury, right hemisphere stroke and primary progressive aphasia). Improvements in empirical and clinical measurement and analysis will be used to improve speech and language services for individuals with aphasia, thereby positively impacting their quality of life.

AphasiaBank extensions

The advantages of data sharing and automated analysis of discourse have led to the development of several other clinical language banks (e.g. TBIBank, RHDBank and DementiaBank), all accessible from the overall TalkBank page at http://talkbank.org. In each of these areas, the collection, transcription and analysis of full discourse samples allows for the investigation of the relationship between cognition and language (e.g. coherence, cohesion, pragmatics). TBIBank is a repository of multimedia interactions for the study of communication in people with traumatic brain injury. It includes media files and transcripts from a standard discourse protocol as well as other contributions of conversations, story retells, story generations, picture descriptions and procedural discourse. RHDBank is one of the more recent databases created for the study of communication in people with Right Hemisphere Disorder. This corpus is based on a standard discourse protocol, demographic data collection and set of assessment procedures. The discourse protocol includes free speech, picture descriptions, the Cinderella story telling, a procedural discourse task, a question production task, and a first-encounter conversation. DementiaBank has transcripts and media from individuals with various types of dementia as well as individuals with primary progressive aphasia. The largest corpus in this repository has longitudinal data for four language tasks (Cookie Theft picture descriptions, a sentence construction task, word fluency, and a story retell task) from individuals with Alzheimer’s disease (AD), other types of dementia and elderly controls. These data have been of interest to researchers using machine learning and linguistic analysis to automatically identify AD from short narrative samples and researchers working to improve speech recognition skills in personal assistive robots trained to work with older adults with AD (Rudzicz et al. 2014).

PhonBank

PhonBank (http://phonbank.talkbank.org) extends TalkBank analyses to the study of typical and disordered phonetics and phonology. The study of phonological development has important implications for (1) the diagnosis and treatment of speech and language disorders, (2) models of the biological bases of speech and language acquisition and production, (3) understanding child and adult bilinguals, (4) promoting the teaching of second languages, and (5) advancing linguistic theory. Spearheaded by Brian MacWhinney at Carnegie Mellon University and Yvan Rose at Memorial University of Newfoundland, this research consortium pursues two inter-related goals: (1) to develop PhonBank, a shared database for the study of phonology, phonological development, and speech disorders; and (2) to develop Phon, a specialised software program for the building and analysis of phonological corpora (Rose & MacWhinney, Citation2014).

As reported in Rose and Stoel-Gammon (Citation2015), research in the areas of phonological development and phonological disorders comes from two relatively disparate fields of study with different goals and different methods of collecting and analysing data. One field, composed primarily of educators, speech-language pathologists, and psychologists, seeks to establish norms for phonological acquisition in children with typical development with the goal of identifying those children with atypical development. The second source of data on phonological development and disorders is based primarily on work by linguists, speech-language pathologists, and speech scientists and is focussed on how children acquired phonology rather than what they acquired. Regardless of the approach to data collection and analysis, research on phonological development and disorders requires an enormous amount of time and patience, particularly in the context of longitudinal studies. When a researcher uses an extant dataset, there is no need to collect or phonetically transcribe the productions, while the need for sorting and cross tabulating the data remains. PhonBank offers the template upon which these two approaches are now converging. By sharing group and individual children's data within a unified data format and a specialised software program to systematically analyse these data, researchers and clinicians can use the data from past work as a springboard for their own studies, allowing them to check the status of particular phenomena and generate new questions.
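The sorting and cross-tabulation step mentioned above can be pictured with a small sketch. It assumes that target and actual segments have already been aligned pairwise, which is the difficult part and is not shown here. The data, the use of an empty string to mark a deleted segment, and the function names are hypothetical; this is not how Phon itself is implemented.

```python
from collections import Counter

# Hypothetical, already-aligned (target, actual) consonant pairs from one child.
# An empty string marks a deleted segment.
aligned_pairs = [
    ("k", "t"), ("k", "k"), ("k", "t"),
    ("s", "s"), ("s", ""),
    ("r", "w"), ("r", "w"),
]


def cross_tabulate(pairs):
    """Count how often each target segment was realised as each actual segment."""
    return Counter(pairs)


def accuracy_by_target(pairs):
    """Proportion of productions that match the target, per target segment."""
    totals, correct = Counter(), Counter()
    for target, actual in pairs:
        totals[target] += 1
        if target == actual:
            correct[target] += 1
    return {t: correct[t] / totals[t] for t in totals}


table = cross_tabulate(aligned_pairs)
accuracy = accuracy_by_target(aligned_pairs)
for (target, actual), count in sorted(table.items()):
    print(f"{target} -> {actual or 'deleted'}: {count}")
```

With data in a unified format, tabulations of this kind can be rerun on any contributed corpus without repeating the collection or transcription work.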

Because of its focus on the phonological and phonetic properties of speech patterns, including speech acoustics, PhonBank facilitates the study of elusive pronunciation details that, together, can make up a person's accent. This is particularly relevant for the study of speech phenomena that deviate from standard pronunciation. For example, one can study speech accents within multilingual societies, and how they affect perceptibility or might influence social stigmatisation of speaker groups. Within multilingual societies, we need to distinguish language disorders from accents that are typical among multilingual speakers. In countries with aboriginal populations speaking their traditional languages, knowledge of language development and speech phonetics is central to the provision of educational and clinical services adapted to the specific needs of these speakers. These questions also have direct implications from the perspective of human rights. For example, how much deviation from the standardised norm can be expected, without negative consequences, from speakers whose native dialects differ from an established norm? Similarly, how can we assess speech and language disorders in aboriginal languages, given the virtual absence of diagnostic tools adapted for these languages, and of clinical practitioners competent in these languages?

Answers to such questions require access to data and tools for their analysis. Phon data annotation and mining functions can address speech data from virtually all languages, and PhonBank was the site of publication of the first-ever publicly-accessible database of aboriginal (Cree) language acquisition (Rose, Brittain, Dyck, & Swain, Citation2010). Additional data from minority languages and dialects, such as Berber, Galician, and Quechua, have since been added to the database, and Phon was used in the analysis of other minority languages such as Gurindji and Cayuga. The research which emanates from these projects is also relevant beyond the scholarly worlds of linguistics or psychology. Educators (e.g. teachers, curriculum designers, dictionary makers) working with minority and endangered languages require descriptive grammars of these languages for teaching and evaluation purposes. Also, as the details emanating from database research become available, these are used by speech-language pathologists and educators. This knowledge contributes to creating speech and language assessment tools that can provide early diagnosis and lead to better early intervention. Children from minority-language communities who are referred for assessment are, at present, largely assessed through translation, or in their second language, a practice that often results in either under- or over-diagnosis of speech and language disorders among children (Stow & Dodd, Citation2005; Winter, Citation2001). Finally, outside of academia and beyond minority language-speaking communities, the existence of these research projects raises the profile of these linguistic communities within the larger society.

FluencyBank

The third major clinical component of TalkBank is FluencyBank (http://fluency.talkbank.org), organised by Nan Bernstein-Ratner at the University of Maryland and Brian MacWhinney at Carnegie Mellon University. The goal of this bank is to provide extensive, well-transcribed data for the study of fluency and disfluency in both children and adults. Fluency disorders are involved in stuttering, aphasia, apraxia and other disabilities. There are numerous classic studies of adult disfluency patterns based on slips of the tongue in typical adults. However, there are not yet any publicly available data on the development of fluency and disfluency in children. Although FluencyBank is only one year old, it has already attracted major commitments of transcriptions and media from earlier investigations. The task here is to convert the data from the divergent formats used in these projects to the uniform CHAT format needed for analysis by the TalkBank CLAN program. In addition, the project will collect new longitudinal data from children in the Eastern United States who show early evidence of stuttering.

The study of disfluency has been impeded by the difficulty of developing a consistent and reliable method of coding disfluencies. It is particularly difficult to separate typical patterns of disfluency from disordered patterns. In addition to the challenge of coding reliability, the analysis of disfluencies suffers from a workload problem. Creating precise measurements of the length of unfilled pauses and tallying segment repetitions, mispronunciations and retraces is an exceedingly time-intensive task. To address these problems of reliability and workload, we are implementing these five methods:

  1. FluencyBank transcripts are linked directly to the audio record, thereby tightening the linkage of codes to data.

  2. FluencyBank has defined a consistent system for fluency coding, based on CHAT.

  3. By encouraging data-sharing, FluencyBank will be able to create a large inventory of well-transcribed and well-coded data linked to audio.

  4. Because both Phon and CLAN link directly to Praat, it is possible to create a core set of “gold standard” transcriptions of disfluency patterns.

  5. Using these gold standard transcriptions, we can train automatic speech recognition (ASR) systems such as SpeechKitchen (Metze, Fosler-Lussier, and Bates, Citation2013) to perform automatic diarisation and segmentation on input recordings. We have shown that this method is particularly powerful when participants are asked to repeat target sentences or passages.

Automatic diarisation and segmentation will address many parts of the workload problem. However, we will still need human input and further analysis to evaluate coding reliability and to identify particular patterns of disfluency. By grounding disfluency coding on the underlying acoustic facts as quantified in Praat (http://praat.org), the basic phonological facts as characterised in Phon, and the basic lexical and morphosyntactic facts as characterised by CLAN, we can achieve much greater levels of coding consistency.
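Once disfluencies are coded consistently, the tallying that is so time-intensive by hand becomes a mechanical step. The following minimal sketch counts a few markers in one coded utterance. The marker set is a simplified subset assumed for illustration (`[/]` for repetition, `[//]` for retracing, `&-` for filled pauses, and a number in parentheses for a timed pause in seconds); the actual FluencyBank coding system is richer, and the utterance and function name are hypothetical.

```python
import re

# Simplified marker patterns, assumed for illustration only.
PATTERNS = {
    "repetition": re.compile(r"\[/\]"),
    "retracing": re.compile(r"\[//\]"),
    "filled_pause": re.compile(r"&-\w+"),
    "timed_pause": re.compile(r"\((\d+(?:\.\d+)?)\)"),
}


def tally_disfluencies(utterance):
    """Count each marker type and sum the durations of timed pauses."""
    counts = {name: len(p.findall(utterance)) for name, p in PATTERNS.items()}
    durations = PATTERNS["timed_pause"].findall(utterance)
    counts["pause_seconds"] = sum(float(d) for d in durations)
    return counts


# Hypothetical coded utterance (not FluencyBank data).
utterance = "<I want> [/] I want &-um (1.5) the [//] a cookie (0.5) ."
tally = tally_disfluencies(utterance)
print(tally)
```

A tally of this kind only summarises codes that a transcriber or an automatic system has already assigned; it does not decide whether a pattern is typical or disordered, which still requires human judgement.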

Summary and future

In this brief account, we have discussed the three components of TalkBank that focus most specifically on speech and language disorders, as well as current extensions of AphasiaBank to the study of TBI, RHD, and dementia. There are also components of the TalkBank system that provide data on specific language impairment (SLI), Down syndrome, and autism spectrum disorder (ASD). As these various clinical databases grow and mature, we will be able to create increasingly powerful comparisons of language usage across these various groups. Using this information, we can develop more precise ideas about speech and language patterns that cause problems in each of these disorders, thereby providing guidance to programs for speech and language therapy and improving the quality of life for people with language disorders.

All of these programs and databases are freely available, and we actively solicit further data contributions and new projects. We encourage clinicians and researchers to visit the central website at http://talkbank.org which provides links to all TalkBank materials, including web video tutorials, ground rules for data usage, addresses for subscribing to user groups, links for downloading, and summaries of the content of the many datasets of clinical relevance in TalkBank.

Declaration of interest

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by NIDCD grant DC 008524 for AphasiaBank, NIDCD grant DC015494 for FluencyBank, NSF Grant BCS-1626294 for FluencyBank, and NICHD grant HD051698 for PhonBank.

References

  • Brown, C., Snodgrass, T., Kemper, S.J., Herman, R., & Covington, M.A. (2008). Automatic measurement of propositional idea density from part-of-speech tagging. Behavior Research Methods, Instruments, and Computers, 40, 540–545. doi:10.3758/BRM.40.2.540
  • MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
  • MacWhinney, B., Fromm, D., Forbes, M., & Holland, A. (2011). AphasiaBank: Methods for studying discourse. Aphasiology, 25, 1286–1307. doi:10.1080/02687038.2011.589893
  • Metze, F., Fosler-Lussier, E., & Bates, R. (2013). The speech recognition virtual kitchen. Paper presented at the INTERSPEECH, Lyon, France. Retrieved from: https://github.org/srvk
  • Richardson, J.D., & Dalton, S.G. (2016). Main concepts for three different discourse tasks in a large non-clinical sample. Aphasiology, 30, 45–73. doi:10.1080/02687038.2015.1057891
  • Rose, Y., Brittain, J., Dyck, C., & Swain, E. (2010). The acquisition of metrical opacity: A longitudinal case study from Northern East Cree. In K. Franich, K. M. Iserman, & L. L. Keil (Eds.), Proceedings of the 34th Annual Boston University Conference on Language Development (pp. 339–350). Somerville, MA: Cascadilla Press.
  • Rose, Y., & MacWhinney, B. (2014). The PhonBank Project: Data and software-assisted methods for the study of phonology and phonological development. In J. Durand, U. Gut, & G. Kristoffersen (Eds.), The Oxford handbook of corpus phonology (pp. 380–401). Oxford, UK: Oxford University Press.
  • Rose, Y., & Stoel-Gammon, C. (2015). Using PhonBank and Phon in studies of phonological development and disorders. Clinical Linguistics and Phonetics, 29, 686–700. doi:10.3109/02699206.2015.1041609
  • Stow, C., & Dodd, B. (2005). A survey of bilingual children referred for investigation of communication disorders: A comparison with monolingual children referred in one area in England. Journal of Multilingual Communication Disorders, 3, 1–23. doi:10.1080/14769670400009959
  • Winter, K. (2001). Numbers of bilingual children in speech and language therapy: Theory and practice of measuring their representation. International Journal of Bilingualism, 5, 465–495. doi:10.1177/13670069010050040401