ABSTRACT
Aims
To detect differences in speech fluency across primary progressive aphasia (PPA) syndromes using automated analysis techniques. The resulting linguistic features are evaluated for their use in a predictive model that identifies common patterns in speakers with PPA. Because fluency is observable in audio recordings, its quantification may provide a low-cost instrument that augments spontaneous speech analyses in clinical practice.
Methods and Procedures
Speech was recorded from 14 control, 7 nonfluent variant (nfvPPA), and 8 semantic variant (svPPA) speakers. The recordings were annotated for speech and non-speech using Kaldi, a widely used speech processing toolkit. Variables relating to fluency (pause rate, number of pauses, length of pauses) were analyzed.
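The pause variables can be derived directly from a speech/non-speech annotation. The following is a minimal illustrative sketch, not the authors' pipeline: it assumes speech segments are given as (start, end) times in seconds, defines pauses as the gaps between consecutive phonated segments (so turn-initial and turn-final silences are excluded), and computes pause count, pause rate, mean pause length, and the proportion of locution time that is silent.

```python
def pause_metrics(speech_segments, locution_time):
    """Compute simple fluency measures from speech/non-speech annotation.

    speech_segments: list of (start, end) tuples in seconds, sorted by start.
    locution_time:   total duration of the locution in seconds, including
                     turn-initial/final silences (which are not pauses).
    """
    # Pauses are the gaps between consecutive phonated segments.
    pauses = []
    for (_, end_prev), (start_next, _) in zip(speech_segments, speech_segments[1:]):
        gap = start_next - end_prev
        if gap > 0:
            pauses.append(gap)

    n = len(pauses)
    return {
        "n_pauses": n,
        "pause_rate": n / locution_time,                       # pauses per second
        "mean_pause_len": sum(pauses) / n if n else 0.0,       # seconds
        "silence_proportion": sum(pauses) / locution_time,     # fraction of locution
    }
```

For example, three speech segments spanning a 5-second locution with gaps of 0.5 s and 0.2 s yield two pauses, a pause rate of 0.4/s, and a silence proportion of 0.14.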
Outcomes and Results
The best-fitting distribution of pause durations was a mixture of two Gaussian distributions, corresponding to short and long pause categories.
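A two-component mixture of this kind can be recovered with standard tooling. The sketch below is illustrative only (not the study's actual model or data): it generates synthetic pause durations from a short and a long population, fits a two-component Gaussian mixture on log-transformed durations (log transformation is a common convention for duration data and an assumption here), and reads off the component means, which would separate the short and long pause categories.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic pause durations in seconds: a short and a long population.
durations = np.concatenate([
    rng.lognormal(mean=np.log(0.2), sigma=0.3, size=300),  # short pauses
    rng.lognormal(mean=np.log(1.0), sigma=0.4, size=150),  # long pauses
])

# Fit a two-component Gaussian mixture on log durations.
log_d = np.log(durations).reshape(-1, 1)
gmm = GaussianMixture(n_components=2, random_state=0).fit(log_d)

# Component means back on the seconds scale, and per-pause category labels.
means = np.exp(gmm.means_.ravel())
labels = gmm.predict(log_d)
```

With such a fit, each pause can be assigned to the short or long category via `predict`, which is what makes the short-versus-long contrasts reported below computable per speaker.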
Group-level differences were found in the rate of pauses and the proportion of silence: nfvPPA speakers use more short pauses relative to long pauses than control speakers, and both their short and long pauses last longer; svPPA speakers use more long pauses relative to short pauses, and their short pauses are significantly shorter than those of control speakers.
Participants in both PPA groups pause more frequently. Although svPPA speakers are typically perceived as fluent, our analysis shows that their fluency patterns are distinct from those of control speakers once the short/long pause distinction is taken into account.
Conclusions
Automatic measurements of pause duration show meaningful distinctions across the groups and might provide future aid in clinical assessment.
Acknowledgements
Participant recruitment was partly accomplished through Hersenonderzoek.nl, the Dutch online registry that facilitates participant recruitment for neuroscience studies (www.hersenonderzoek.nl). Hersenonderzoek.nl is funded by ZonMw-Memorabel (project no. 73305095003), a project in the context of the Dutch Deltaplan Dementie, the Alzheimer’s Society in the Netherlands (Alzheimer Nederland), Brain Foundation Netherlands (Hersenstichting), and Amsterdam Neuroscience.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1. (1) A custom implementation of the algorithm of Ramírez et al. (2004); (2) the Matlab implementation of the algorithm of Drugman et al. (2015); (3) the voice activity detection (VAD) algorithm bundled with the WSJ recipe of Kaldi.
2. The locution time includes the turn-initial and turn-final silences (if any); these silences are not counted as pauses because they do not interrupt the phonated segments.
3. R version 4.1.2, with packages rethinking (McElreath, 2023), mc-stan (Stan Development Team, 2016) and vegan (Oksanen et al., 2022).
4. As implemented in the stats package of R (R Core Team, 2017).