ABSTRACT
There is growing evidence that subtle changes in spontaneous speech may reflect early pathological changes in cognitive function. Recent work has found that lexical-semantic features of spontaneous speech predict cognitive dysfunction in individuals with mild cognitive impairment (MCI). The current study assessed whether Ostrand and Gunstad’s (OG) lexical-semantic features also predict cognitive status in a sample of individuals with Alzheimer’s clinical syndrome (ACS) and healthy controls (HC). To extend prior work, four additional (New) speech indices shown to be important in language processing research were also explored in this sample. Speech transcripts of the Cookie Theft Task from 81 individuals with ACS (mean age = 72.7 years, SD = 8.80, 70.4% female) and 61 HC (mean age = 63.9 years, SD = 8.52, 62.3% female) from DementiaBank were analyzed. Random forest and logistic models were used to examine whether subject-level lexical-semantic features could accurately discriminate individuals with ACS from HC. Logistic models with the New lexical-semantic features obtained good classification accuracy (78.4%), but the OG features performed well across a wider range of machine learning model types. In terms of sensitivity and specificity, the random forest model trained on the OG features was the most balanced. These findings suggest that features of spontaneous speech used to predict MCI may also distinguish individuals with ACS from healthy controls. Future work should evaluate these lexical-semantic features in pre-clinical persons to further explore their potential to assist with early detection through speech analysis.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1. These two methods were used as complementary approaches. Some studies have suggested superior performance for random forest over logistic models (Maroco et al., 2011), while others have shown better performance for logistic models (Kirasich et al., 2018). Random forest models are difficult to interpret in terms of the structure of the resulting model and involve fine-tuning of parameters (which we left at their standard values in our analyses, including the default number of trees). We therefore also used logistic models, which are more interpretable and may, in some cases, provide superior fit.
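
The abstract compares models by classification accuracy, sensitivity, and specificity. As an illustration only, the sketch below shows how these three quantities could be computed from true and predicted labels. The function name `classification_metrics`, the coding of ACS as 1 and HC as 0, and the example labels are all assumptions made for this sketch; they are not the authors' code or data.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity, and specificity for binary labels.

    Assumed coding (hypothetical, for illustration): 1 = ACS, 0 = HC.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # ACS correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # HC correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # HC wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # ACS missed

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: proportion of true ACS cases that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: proportion of true HC cases that were cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up labels for demonstration; not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

A model is "balanced" in the sense used in the abstract when its sensitivity and specificity are close to each other, rather than one being high at the expense of the other.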