Aging, Neuropsychology, and Cognition
A Journal on Normal and Dysfunctional Development
Volume 30, 2023 - Issue 2
Original Article

Improvement in executive function for older adults through smartphone apps: a randomized clinical trial comparing language learning and brain training

Pages 150-171 | Received 03 Jun 2021, Accepted 05 Oct 2021, Published online: 25 Oct 2021

ABSTRACT

Bilingualism has been linked to improved executive function and delayed onset of dementia, but it is unknown whether similar benefits can be obtained later in life through deliberate intervention. Given the logistical hurdles of second language acquisition in a randomized trial for older adults, few interventional studies have been done thus far. However, recently developed smartphone apps offer a convenient means to acquire skills in a second language and can be compared with brain training apps specifically designed to improve executive function. In a randomized clinical trial, 76 adults aged 65–75 were assigned to either 16 weeks of Spanish learning using the app Duolingo 30 minutes a day, an equivalent amount of brain training using the app BrainHQ, or a waitlist control condition. Executive function was assessed before and after the intervention with preregistered (NCT03638882) tests previously linked to better performance in bilinguals. For two of the primary measures: incongruent Stroop color naming and 2-back accuracy, Duolingo provided equivalent benefits as BrainHQ compared to a control group. On reaction time for N-back and Simon tests, the BrainHQ group alone experienced strong gains over the other two groups. Duolingo was rated as more enjoyable. These results suggest that app-based language learning may provide some similar benefits as brain training in improving executive function in seniors but has less impact on processing speed. However, future advancements in app design may optimize not only the acquisition of the target language but also the side benefits of the language learning experience.

Introduction

A comprehensive review by Livingston et al. (2017) estimated that 35% of dementia risk is modifiable through lifestyle factors that confer “cognitive reserve” – the ability to tolerate a greater degree of neurodegenerative disease before displaying symptoms of dementia (Chan et al., 2018; Stern, 2012). One intriguing protective factor is bilingualism (Bialystok, 2021): Lifelong bilinguals are diagnosed with dementia later in life than monolinguals (Alladi et al., 2013; Anderson, Hawrylewicz, & Grundy; Bialystok et al., 2007; Chertkow et al., 2010; Craik et al., 2010; Livingston et al., 2017; Woumans et al., 2015), yet exhibit more advanced neurodegeneration upon diagnosis (Kowoll et al., 2016; Perani et al., 2017; Schweizer et al., 2012; meta-analysis in Anderson et al., 2020). On a population level, multilingual countries have lower incidences of dementia than do monolingual countries matched on such indices as life expectancy and wealth (Klein et al., 2016).

The protective effect of bilingualism is linked to mixed findings on improved executive function in bilinguals, based on standard tests including the Simon task (Bialystok et al., 2004), flanker task (Abutalebi et al., 2012), and Stroop task (Bialystok, Craik et al., 2014). Meta-analyses have both supported (Grundy, 2020; Van den Noort et al., 2019) and refuted (Donnelly et al., 2019; Lehtonen et al., 2018) the reliability of these effects, but neuroimaging studies routinely report more efficient recruitment of brain resources by bilinguals while performing these tasks (Gold et al., 2013; Kousaie & Phillips, 2017). Many factors may be responsible for the contradictory results associated with between-group comparisons (Antoniou, 2019), so longitudinal training studies could provide more controlled evidence.

Bilingualism is a life circumstance that is typically beyond an individual’s control, so it is unknown whether similar benefits could emerge from deliberate second-language learning in older adults. Although these learners may not achieve full fluency, it is possible that the process of acquiring the second language may confer measurable benefits. Language learning is a demanding mental activity involving working memory, long-term memory, sustained attention, and auditory processing, potentially providing a more effective “mental workout” than other recreational activities enjoyed by seniors, such as crosswords, sudoku, reading, or painting.

Given the logistical challenges of arranging effective language instruction, few formal studies have addressed this possibility (Antoniou et al., 2013), but existing studies have shown promising results. In one study, individuals (mean age 50 years) who completed a one-week intensive course in Gaelic outperformed a control group on attention tasks (Bak et al., 2016). In another study, ERP signals to a go/no-go task indicated less conflict for students after taking introductory Spanish (Sullivan et al., 2014). A study examining intelligence scores at ages 11 and 72 within individuals found that specifically those who went on to become bilingual significantly outperformed their childhood scores in later life (Bak et al., 2014). In contrast, some controlled studies of language training have failed to find improvements in cognitive function (Berggren et al., 2020; Ramos et al., 2017, reviewed by Pot et al., 2019). Most training studies have relied on classroom learning, in which it is difficult to control and quantify the degree of engagement experienced by each participant. One study used a computer-based intervention (Rosetta Stone) to teach English to seniors aged 60–85, finding overall improvements in a general measure of cognitive well-being (ADAS-Cog), but not in specific measures of executive function that are similar to those tested in the present experiment (Wong et al., 2019).

The recent popularity of self-directed software applications for language learning can provide a more controlled approach to this research. Modern language apps run on smartphones, allowing them to be accessed throughout the day at opportune moments, and are heavily “gamified,” making them engaging. With such apps, a language-learning intervention for older adults involving 30 minutes a day of home-based training becomes feasible, and all participants can complete the same exercises while their progress is tracked.

Simultaneously, smartphone-based apps offering “brain training” that are based on empirical research and intended to strengthen cognitive brain networks have also proliferated. The benefits of brain training apps are as controversial as those of bilingualism, with high-profile reviews and position papers claiming either that brain training is highly effective or barely effective at all (Cognitive Training Data Website, 2014; Max Planck Institute for Human Development, 2014; Rabipour & Raz, 2012; Simons et al., 2016). The consensus is that brain training apps produce lasting improvement on the specific exercises that are trained and may lead to “near transfer” for related tasks, but there is disagreement about the potential for “far transfer” that would make people more resilient to cognitive decline.

To protect against cognitive decline, an intervention should improve abilities associated with greater cognitive reserve, including executive functioning (managing conflicting information) and working memory. Therefore, we selected three tests as outcome measures in a preregistered clinical trial: N-back, Simon Task, and the Color-Word Interference Test (Stroop task). To evaluate the impact of a realistic, sustainable amount of app-based training, we randomized volunteers into three groups for a 16-week intervention, comparing 30 minutes a day of language-learning vs. brain training, with both compared to a passive control group. The two interventions served as active controls for each other. We selected two popular market-leading apps for the interventions: Duolingo for learning Spanish, and BrainHQ by Posit Science for cognitive training focused on executive function.

No improvement from pretest to posttest was expected for the Control group, but we expected the BrainHQ group to show significant gains at posttest because of the similarity between the training and the outcome measures. This expectation is supported by a recent study (not yet published when we began this project) that found improved working memory and response speed with BrainHQ compared to an active control consisting of casual computer games, with a study design and population similar to our own (Lee et al., 2020). For the Duolingo group, we expected intermediate results: better than Control, but not surpassing a group specifically trained on exercises related to the outcome measures, for whom gains would reflect near transfer. In contrast, improvement for Duolingo would reflect far transfer, a more dramatic outcome. All measures included control conditions with minimal conflict or memory load demands to distinguish improved general response speed from improved executive function. No improvement was predicted for these control conditions.

Method

Participants

Participants were healthy community members recruited through advertisement. Inclusion criteria were as follows: 1) neurologically healthy adults aged 65–75, 2) normal or corrected-to-normal vision and hearing (self-report), 3) monolingual native English speakers, based on the Language and Social Background Questionnaire (LSBQ) (Anderson et al., 2018), 4) no previous formal study of Spanish at any point in their life, nor of any other language in the past 10 years, and 5) daily access to a compatible smartphone or tablet.

Ninety-five participants completed the pretesting and were then pseudorandomly assigned to their study condition (supplementary information). Seventy-six completed the study and are included in the analysis: 28 in the Duolingo group, and 24 in each of the BrainHQ and Control groups. For two participants (1 BrainHQ, 1 Control), technical failures caused the loss of data for the N-back and Simon tests, leaving a final n of 74 for those tests.

Intervention

Following pretesting, participants in the Duolingo and BrainHQ training groups spent 16 weeks using the assigned app for 30 minutes/day, 5 times/week. The Duolingo intervention used the Introductory Spanish course, and BrainHQ participants were assigned eight exercises (out of 29) that focused on executive function, attention, and working memory, all tapping abilities similar to those in the outcome measures. Usage of both apps was tracked through built-in functions, and feedback was used to keep participants on track. All participants returned after 16 weeks for posttesting.

Background measures

The Montreal Cognitive Assessment (MoCA) (Nasreddine et al., 2005) was administered at pretest to screen for mild cognitive impairment. A conservative minimum score of 21 (Carson et al., 2018; Waldron-Perrine & Axelrod, 2012) resulted in exclusion of three volunteers.

WEBCape is an online platform for assessing language proficiency for university placement. We used it to confirm lack of Spanish proficiency before training and assess Spanish learning in the Duolingo group.

Outcome measures (Preregistered clinicaltrials.gov NCT03638882).

N-back task

Implemented in E-Prime, the program presented single digits in white on a black background. Each trial began with a fixation cross with a jittered duration of 500, 1000, or 1500 ms, followed by one digit displayed for 2000 ms, which was the maximum time allowed for a response. Participants sat with their left and right index fingers on the designated response keys on a laptop and pressed “yes” with their left index finger or “no” with their right. Targets requiring a “yes” response made up 25% of the trials. Each run consisted of 96 trials; the first two were always non-targets and were excluded from the analysis. Three conditions were completed in a fixed sequence: 0-back (“yes” if the digit is “0”); 1-back (“yes” if the digit is the same as the one on the previous trial); and 2-back (“yes” if the digit is the same as the one 2 trials previously).
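The target rules for the three conditions can be illustrated with a short sketch. This is not the E-Prime implementation used in the study; the function name `label_targets` and the digit sequence are hypothetical and serve only to show which trials would call for a “yes” response under each rule.

```python
def label_targets(digits, n):
    """Return one boolean per trial: True if the trial is a target.

    n = 0: target when the digit is 0.
    n > 0: target when the digit matches the one shown n trials earlier.
    """
    labels = []
    for i, digit in enumerate(digits):
        if n == 0:
            labels.append(digit == 0)
        else:
            # Trials without n predecessors can never be targets.
            labels.append(i >= n and digit == digits[i - n])
    return labels


# Hypothetical digit sequence, for illustration only.
digits = [3, 0, 3, 3, 7, 3]
for n in (0, 1, 2):
    print(n, label_targets(digits, n))
```

The same sequence yields different targets under each rule, which is why the 0-back condition can act as a pure detection control while the 1-back and 2-back conditions require holding earlier digits in memory.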

The 0-back condition involves only target detection. It was included as a control to identify improvement in reaction time (RT) without improvement in executive function, so no training benefit was expected. For the 1-back and 2-back conditions, we predicted a Group X Session interaction involving stronger improvements in the intervention groups than in the Control group for reaction time and accuracy, although accuracy might be at ceiling for the 1-back condition. There was no specific prediction for targetness (target vs. non-target stimuli). The 2-back condition was predicted to be more difficult than the 1-back (main effect of N-back) and to benefit more from training (interaction of Group X Session X Condition).

Simon task

Adapted from Bialystok et al. (2004, Study 2), the task was implemented in E-Prime with a white background. Each trial began with a black central fixation cross for 300 ms, which was then replaced by a colored square, either at the center or on the right or left side of the screen, depending on the condition. Participants were instructed to press a left-side key when they saw squares of certain colors, and a right-side key for other colors. The square remained on the screen until a response was made. Trials were separated by a 500 ms blank screen. Participants completed four condition blocks, each consisting of 24 trials; the blocks were then repeated in reverse order, for a total of 48 trials in each condition. All blocks began with instructions and practice trials that were repeated if incorrect.

In the first condition, Center-2, participants pressed the left key for blue squares and the right key for brown squares. In the second condition, Side-2, squares were blue or brown and appeared on the left or right side of the screen, creating congruent and incongruent trials with equal numbers of each. For congruent trials, the position of the square matched the position of the correct response key, with the opposite configuration for incongruent trials. The third condition, Center-4, was similar to the first condition except that there were four possible colors: pink and yellow requiring a left response, and red and green requiring a right response. Finally, Side-4 used the four colors from Center-4 but presented them on one side of the screen, creating congruent and incongruent trials.

Accuracy was expected to be at ceiling, so all predictions concerned RT. Center-2 does not involve working memory or managing response conflict, so it was included as a control. For the side presentations, we expected improvement in RT for the training conditions (Group X Session interaction). We also expected main effects of load, with longer RT for the 4-color condition than the 2-color condition, and congruency, with longer RT for incongruent than congruent trials, but were uncertain whether these effects would also undergo more improvement following cognitive training (higher-order interactions of Group X Session with load or congruency). While some early evidence suggested that bilinguals outperform monolinguals specifically on incongruent trials (Bialystok et al., 2004), subsequent studies have found equivalent advantages for both congruent and incongruent trials (Hilchey & Klein, 2011).

Color-word interference task

This test is based on the DKEFS battery (Delis et al., 2001) implementation of the classic “Stroop effect” test, in which participants name the ink color of a color word, such as the word “BLUE” written in red ink, to which participants must answer “red.” The DKEFS battery has four conditions; we focus on conditions 1 and 3. Each condition presents 50 stimuli on a single page and asks the participant to respond as quickly as possible without making errors. The dependent variable is the total time to respond to all the stimuli. Errors were logged but not analyzed.

In color naming, the stimuli are colored squares, and the participant names the color of each square sequentially. This condition was a control to assess group differences in speed of color naming. In color word interference, color words were printed in incongruent colors and participants named the ink color while ignoring the word.

Participant engagement and satisfaction

Satisfaction questionnaire

After completing the study, participants in the intervention conditions were administered a questionnaire (supplementary information) to evaluate their satisfaction with the intervention. Each question was answered with a Likert rating from 1 to 10 and overall program satisfaction was calculated as the sum of eight questions directly addressing satisfaction.

Usage tracking

Based on estimated progress milestones corresponding to 30 minutes of usage 5 times a week (supplementary information), we estimated the proportion of the expected training completed by each participant.

Sample size and power

A recent meta-analysis of cognitive training in seniors (Chiu et al., 2017) estimated that interventions with over 3 training sessions per week, using executive function as an outcome measure, achieved an average effect size of 0.647, while those with less frequent sessions had much smaller effects. It is unclear whether such effect sizes apply to single-group comparisons of scores before and after the intervention, or rather to comparisons of an experimental group to a control group (a group by session interaction, as examined in the present study). An estimate with G-power software (Faul et al., 2007) for a single-group design (α = .05, power [1 − β] = .80, effect size d = .647) yields a required sample size of 17, while a two-group design requires 31 in each group. Given the resource constraints in the present study (an 18-month grant essentially funding one research assistant and small expenses), we estimated a maximum capacity of 90 participants, 30 in each group, and ended up with 76 who completed the intervention, for an estimated power of about 75%. For a smaller effect size like d = .4, which has been described as “a good first estimate of the smallest effect size of interest in psychological research” (Brysbaert, 2019) and is close to the average observed across many studies of cognitive training in older adults (Chiu et al., 2017), 80% power would require 78 participants per group, beyond the resources available for the present study. However, the observed effect sizes in the present study can also inform the planning of future studies to investigate specific comparisons in larger groups.
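The sample-size figures above can be roughly checked with the textbook normal approximation. The sketch below is not the G-power calculation: it assumes one-tailed tests (the text does not state the tail) and it ignores the noncentral t correction that G-power applies, so it tends to land at or slightly below the reported values. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def _z_sum_squared(alpha, power, one_tailed):
    """(z_alpha + z_power)^2 under the normal approximation."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha) if one_tailed else z(1 - alpha / 2)
    return (z_alpha + z(power)) ** 2


def n_per_group_two_groups(d, alpha=0.05, power=0.80, one_tailed=True):
    """Approximate n per group for comparing two independent groups."""
    return ceil(2 * _z_sum_squared(alpha, power, one_tailed) / d ** 2)


def n_single_group(d, alpha=0.05, power=0.80, one_tailed=True):
    """Approximate n for a single-group (pre vs. post) comparison."""
    return ceil(_z_sum_squared(alpha, power, one_tailed) / d ** 2)


print(n_per_group_two_groups(0.647))  # approximation for d = .647
print(n_per_group_two_groups(0.4))    # approximation for d = .4
print(n_single_group(0.647))          # single-group approximation
```

Under these assumptions the approximation gives 30 per group for d = .647 and 78 per group for d = .4, close to the 31 and 78 reported, and 15 for the single-group design against the reported 17.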

Statistical analyses

Results of all experimental tasks consisting of individual trials were analyzed with mixed-effects models, focusing on Group X Session interactions reflecting differential improvement for the interventions. Differential improvement was tested as three specific planned pairwise contrasts within the omnibus model, testing for greater post-pre improvement in one group vs. another (BrainHQ vs. Control, Duolingo vs. Control, BrainHQ vs. Duolingo). For continuous outcome measures, models were fit with the function lmer in the R package lme4 (Bates et al., 2014), with p-values generated via the Satterthwaite degrees of freedom estimate implemented in the package lmerTest (Kuznetsova et al., 2017). For measures with binary outcomes (n-back accuracy), we used mixed-effects logistic regression (Jaeger, 2008) using the R function glmer with a binomial response family, estimating a random intercept for each participant, with fixed effects of Group, Session, and Targetness. P-values were estimated by the Wald test implemented in the R function summary. All models included random intercepts per participant as well as condition-specific intercepts for within-subjects factors.

Model formulas in R syntax were as follows:

N-back 1-back & 2-back RT:

lmer(RT ~ Group*Session*istarget*numback + (1 | Subject) + (1 | numback:Subject) + (1 | Session:Subject) + (1 | istarget:Subject))

where Group ∈ (Control, BrainHQ, Duolingo); Session ∈ (pre,post); istarget ∈ (target, nontarget); numback ∈ (1,2)

N-back 0-back RT:

lmer(RT ~ Group*Session*istarget + (1 | Subject) + (1 | Session:Subject) + (1 | istarget:Subject))

N-back 2-back accuracy:

glmer(accuracy ~ Group*Session*istarget + (1|Subject) + (1 | Session:Subject) + (1 | istarget:Subject), family = binomial)

Simon task RT Side Presentation:

lmer(RT ~ Group*Session*Colors*TrialType + (1 | Subject) + (1 | Session:Subject) + (1 | Colors:Subject) + (1 | TrialType:Subject))

where Group ∈ (Control, BrainHQ, Duolingo); Colors ∈ (2,4); TrialType ∈ (Congruent,Incongruent); Session ∈ (pre,post)

Simon task RT Center Presentation:

lmer(RT ~ Group*Session*Colors + (1 | Subject) + (1 | Session:Subject) + (1 | Colors:Subject))

For tasks with only a single score per session (the color-word interference paradigm), we report results obtained with standard repeated-measures ANOVA with type-3 sums of squares, using the ezANOVA function from the ez package in R. Analyses with mixed-effects models, as expected, returned essentially identical results.

Planned pairwise linear contrasts comparing the improvement between pairs of groups were conducted using the emmeans package (Lenth et al., 2018).

In addition to statistical significance testing, we estimated effect sizes for all outcome measures using a simple traditional approach (Cumming, 2014). We computed Cohen’s d as:

$$d = \frac{M_E - M_C}{s}$$

For effect sizes of training effects (post – pre) within single groups (Control, BrainHQ, and Duolingo), $M_E$ is the average score of the group post-training, $M_C$ is the average score pre-training, and $s$ is the standard deviation of the pre-training scores pooled across all three groups (since none have yet undergone randomized intervention and can be regarded as equivalent). For effect size estimates of differences between groups, $M_E$ is the average difference score (post minus pre) of the experimental group (BrainHQ or Duolingo), $M_C$ is the average difference score within the Control group, and $s$ is the standard deviation of difference scores within the Control group.
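The two effect-size definitions described above can be sketched in a few lines. This is an illustration rather than the authors' analysis code: the function names and the scores are hypothetical, and "pooled across all three groups" is interpreted here as the sample standard deviation of all pre-training scores taken together, which is an assumption.

```python
from statistics import mean, stdev


def d_within(pre, post, all_pre_scores):
    """Within-group effect size: (post mean - pre mean) / SD of all pre scores."""
    return (mean(post) - mean(pre)) / stdev(all_pre_scores)


def d_between(exp_pre, exp_post, ctl_pre, ctl_post):
    """Between-group effect size on difference scores (post minus pre),
    scaled by the SD of the Control group's difference scores."""
    exp_diff = [b - a for a, b in zip(exp_pre, exp_post)]
    ctl_diff = [b - a for a, b in zip(ctl_pre, ctl_post)]
    return (mean(exp_diff) - mean(ctl_diff)) / stdev(ctl_diff)


# Hypothetical reaction times in ms, for illustration only.
ctl_pre, ctl_post = [490, 500, 510], [480, 500, 520]
exp_pre, exp_post = [490, 500, 510], [470, 470, 470]

print(d_within(exp_pre, exp_post, ctl_pre + exp_pre))
print(d_between(exp_pre, exp_post, ctl_pre, ctl_post))
```

For reaction-time measures a negative d indicates faster responses after training, so improvement appears as a negative value in both variants.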

Results

Background demographics

Background measures by group are presented in Table 1, with p-values obtained from ANOVA for continuous variables and chi-square for sex. There were no significant differences between the groups on any measures, including number of days between pre- and post-intervention testing.

Table 1. Demographics of participants in the three experimental groups, showing mean, standard deviation, and p-value from a 3 × 1 ANOVA testing for between-group differences.

N-back response time

Effect sizes for all outcome measures are shown in Table 2, and a summary of all group differences for the outcome measures reported below is presented in Table 3. Reaction times for the N-back task are presented in Figure 1A (separate plots for targets and non-targets shown in Figure S1). The 0-back condition, which was intended to test for generalized improvements in response time in the absence of demands on working memory and executive function, was analyzed with a mixed model including fixed effects of Group, Session, and Targetness (target or non-target). A sizable improvement after intervention was present only in the BrainHQ group (−57 ± 78 msec, with ± indicating standard deviation). This improvement was significantly larger than the improvement seen in the Duolingo group (−11 ± 50 msec); [Z = −2.82, p = .0048] and in the Control group (−8.21 ± 50 msec); [Z = −2.53, p = .0115]. Improvement in the Duolingo group did not differ from the Control group [Z = 0.18, p = .8601]. Responses to targets were slower than to non-targets, resulting in a significant main effect of Targetness [F(1, 71.3) = 9.461, p = .0030]. There was no interaction of Group X Session X Targetness [F(2, 13370) = 1.86, p = .1556].

Table 2. Effect sizes for changes in test scores in each group, in raw time (milliseconds) and as Cohen’s d (see text for details of computation). Effect sizes with d > .4 are bolded.

Table 3. Summary of findings showing differences in improvement by groups for the outcome measures.

Figure 1. Performance in the N-back task. A) Reaction time to all trials, both target and non-target. B) Accuracy expressed as D-prime, a measure of response discrimination combining targets and non-targets. For all figures, error bars represent 95% confidence interval of the mean averaged across subjects (after averaging across trials within subject).


The 1-back and 2-back conditions, which present challenges for working memory and executive function, were analyzed together for reaction time in a mixed model with n-back as an additional fixed effect (1-back or 2-back). Trials with no response in the 2-s response period (3.6% of trials) were scored as incorrect in analyses of accuracy but were excluded from calculations of d-prime (see below) as they cannot be unambiguously characterized as a “false alarm” or “miss.” In this analysis, participants in the BrainHQ group again exhibited strong improvements in response time (−88 ± 77 msec), significantly exceeding those seen in the Control group (−8 ± 99 msec); [Z = −2.40, p = .0165]. The improvement in the Duolingo group (−47 ± 91 msec) fell in between that observed in the BrainHQ and Control groups. The difference in improvement between the Duolingo group and the Control group was not statistically significant [Z = −1.178, p = .2389], but neither was the difference between BrainHQ and Duolingo [Z = −1.133, p = .1821]. With the improvement from Duolingo approximately half the size of that seen in the BrainHQ group, and several times larger than that of the Control group, this pattern suggests that Duolingo may induce modest improvements in n-back response time, with an observed effect size of d = 0.40. However, a larger study would be necessary to provide conclusive evidence that this is a reliable finding (see discussion).

Regarding specific conditions, there was again a significant main effect of Targetness [F(1, 71) = 16.93, p = .0001], but in the direction opposite to that seen in the 0-back. Here, responses to targets were faster than to non-targets. This underscores the fact that 0-back and 1-back/2-back are different tasks, with only the latter involving working memory, and that in this context “rejecting” a non-target takes longer than recognizing a target. There was also a significant higher-order interaction of Group X Session X Targetness [F(2, 25481) = 4.70, p = .009]. To investigate the reason for this, we implemented post-hoc contrasts testing whether group differences in improvement were stronger for targets or non-targets. In fact, the non-targets seemed to drive the effect. Reaction time improvement on non-targets was significantly larger for the BrainHQ group vs. Control [Z = −3.57, p = .0004]. The differences between Duolingo and Control [Z = −1.80, p = .0713], and between BrainHQ and Duolingo [Z = −1.94, p = .0524] did not reach significance. In contrast, there were no significant differences between any of the three groups for post-hoc contrasts examining the target trials alone.

N-back accuracy

Accuracy was essentially at ceiling in the 0-back and 1-back conditions, with most participants showing fewer than three errors per session. In contrast, the most difficult condition, 2-back, gave rise to accuracy scores in the range of 0.6–0.9, making this condition amenable to analyses of accuracy as an outcome variable. For ease in visualization, overall accuracy across targets (25% of trials) and non-targets is plotted as d-prime in Figure 1B (raw accuracy shown in Figure S1). The BrainHQ group improved in its average d-prime score after the intervention (mean change 0.54 ± 0.78) significantly more than the Control group did (0.20 ± 0.57), [Z = 2.76, p = .0059]. Similarly, the Duolingo group (0.57 ± 0.68) also improved significantly more than the Control group [Z = 2.22, p = .0263], and there was no significant difference in the degree of improvement for BrainHQ vs. Duolingo [Z = 0.68, p = .4998]. Thus, both interventions led to similar improvements in accuracy compared to the control group on the 2-back condition.
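For readers unfamiliar with d-prime, the standard signal-detection definition is the z-transformed hit rate minus the z-transformed false-alarm rate. The sketch below illustrates that definition only. The text does not say how hit or false-alarm rates of exactly 0 or 1 were handled, so the log-linear correction used here (adding 0.5 to each count) is an assumption, and the function name and trial counts are hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (assumed here, not taken from the study)
    keeps both rates strictly between 0 and 1 so z stays finite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical 2-back session: 24 targets and 70 non-targets.
print(d_prime(hits=20, misses=4, false_alarms=5, correct_rejections=65))
```

Because d-prime combines responses to targets and non-targets, a participant who simply answers “no” on every trial scores near zero despite a high raw accuracy, which is one reason it is convenient for visualization when only 25% of trials are targets.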

Simon task

Results for the Simon task are shown in Figure 2. As expected, accuracy was at ceiling (above 95% for all conditions), so only RT was analyzed. We first present the results of an analysis limited to the control condition with center presentation, requiring no management of the conflict between the response button and the side of presentation. We modeled fixed effects of Group, Session, and Memory Load (two colors or four colors). Because participants had unlimited time to respond, there were no missing trials. To reduce the effect of RT outliers without eliminating trials inducing prolonged indecision, trials with RT greater than 5 seconds were winsorized to 5 seconds (0.3% of all trials) and scored as errors but retained for RT analysis.
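The outlier rule described above (cap the RT at 5 seconds, mark the trial as an error, keep it in the RT analysis) can be sketched as follows. The function name and the example trials are hypothetical; this is not the authors' preprocessing script.

```python
def winsorize_trial(rt_ms, correct, cap_ms=5000):
    """Cap a reaction time at cap_ms; capped trials are scored as errors
    but are still returned so they remain in the RT analysis."""
    if rt_ms > cap_ms:
        return cap_ms, False
    return rt_ms, correct


# Hypothetical trials as (RT in ms, response correct?).
trials = [(640, True), (6200, True), (910, False)]
print([winsorize_trial(rt, ok) for rt, ok in trials])
```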

Figure 2. Reaction time for the Simon task, across all conditions.


For the center-presentation conditions of the Simon task, unlike the 0-back test, there were substantial test–retest practice effects, with all three groups exhibiting shorter reaction times in the posttest than the pretest (Figure 2). Comparing the improvement among groups, however, only the BrainHQ group outperformed the other groups. There was significantly more improvement in the BrainHQ group (−147 ± 112 msec) compared to the Control group (−75 ± 126 msec); [Z = −2.19, p = .0288], and compared to the Duolingo group (−62 ± 101 msec); [Z = −2.69, p = .0071], whereas Duolingo did not differ from Control [Z = 0.40, p = .6873].

The main effect of Memory Load (2 vs. 4 colors) was also significant [F(1,71) = 205.6, p < .0001], as expected, reflecting longer RT for 4 colors vs. 2 colors, but there was no significant 3-way interaction of Group X Session X Memory Load.

For the side presentation conditions, which introduced the challenge of managing stimulus-response incompatibilities (e.g., a right button press for a stimulus appearing on the left side), we used a similar mixed-effects model, with the additional factor of Congruency. Again, participants in the BrainHQ group experienced significantly larger gains than the other two groups. BrainHQ (−168 ± 107 msec) significantly outperformed both the Control group (−40 ± 86 msec); [Z = −4.36, p < .0001], and the Duolingo group (−78 ± 103 msec); [Z = −3.20, p = .0014]. As in the n-back task, the Duolingo group again showed gains intermediate between BrainHQ and Control, but the difference between Duolingo and Control was not significant [Z = −1.37, p = .1710]. As predicted, there was an effect of Memory Load, with longer RT for the 4-color condition [F(1, 71) = 153.5, p < .0001], and an effect of Congruency, with longer RTs to spatially incongruent trials [F(1, 71) = 15.5, p = .0001], but no higher-order interactions of these factors with Group X Session.

Color word interference test

The average completion times for the two conditions are presented in Figure 3. For color patch naming (Figure 3A), all three groups exhibited improvement in the second testing session to a similar degree, resulting in a significant main effect of Session [F(1, 73) = 16.46, p = .0001], but no Group X Session interaction [F(2, 73) = 1.30, p = .28], and accordingly, no pairwise contrasts between groups were significant. The amounts of improvement were −1.00 ± 4.72 sec for Control, −2.58 ± 2.60 sec for BrainHQ, and −1.39 ± 3.07 sec for Duolingo.

Figure 3. Total time to complete the control and conflict-inducing conditions of the DKEFS Color-Word Interference Test. A) Color patch naming. B) Incongruent color word naming.

For the critical Incongruent Color Word Naming condition (Figure 3B), improvement was present in all three groups, resulting in a significant main effect of Session [F(1,73) = 46.32, p < .0001]. However, this improvement was stronger in both the BrainHQ and Duolingo groups than in the Control group, resulting in a significant Group X Session interaction [F(2,73) = 4.88, p = .0103]. Planned pairwise contrasts showed more improvement for BrainHQ (−9.00 ± 7.99 sec) vs. Control (−2.25 ± 7.97 sec) [t(73) = −3.07, p = .0030] and for Duolingo (−6.61 ± 6.92 sec) vs. Control [t(73) = −2.06, p = .0430], but no difference between BrainHQ and Duolingo [t(73) = −1.13, p = .2617].

Program satisfaction

The full questionnaire is shown in Table S1 (supplementary information). For the composite satisfaction score, Duolingo received more favorable ratings (59.32 ± 11.54) than BrainHQ (51.77 ± 13.87) [t(45) = 2.11, p = .04]. Additionally, participants in the Duolingo group completed significantly more of the assigned training (average 100% ± 30%) than the BrainHQ participants (87% ± 14%) [t(40) = 2.20, p = .03].

Discussion

This study is among the first to examine the cognitive benefits of language learning in a very specific app-based format, comparing it to a well-matched condition involving the same amount of time spent engaged in a learning app on a tablet or smartphone. We selected outcome measures based on previous reports of improvements in executive function for lifelong bilingualism. Notably, most of these measures are based on the time required to make a decision, and therefore it is important to distinguish overall gains in processing speed from gains specific to manipulating information in working memory or managing conflicting responses, which are cognitive abilities falling within the umbrella term of executive function. For this reason, each of our preselected primary outcome measures included a control condition thought to reflect processing speed alone without much demand on executive function.

For two of our measures, we found that both language learning with Duolingo and brain training with BrainHQ produced large gains after the intervention, equivalent to each other and significantly exceeding the test–retest practice-related improvements present in the Control group. One measure was accuracy in the 2-back condition of the n-back task, with accuracy in the 0-back and 1-back conditions not tested due to ceiling effects. Because it is based on an accuracy measure, this improvement is less related to processing speed, but speed does play a role given the tight time pressure of the task with a 2-s deadline to respond to each trial. These results are consistent with research with children (Janus & Bialystok, 2018) and young adults (Barker & Bialystok, 2019; Comishen & Bialystok, 2021) in which the more difficult n-back conditions were performed better by bilinguals with no difference on the simpler conditions. The effect sizes for the improvement in both intervention groups compared to Control, ranging from d = 0.61 to 0.65, are considered “medium” in the traditional classification of Cohen, and similar to gains reported in other cognitive training studies targeting executive function (Chiu et al., 2017). Thus, improvements in working memory usage as reflected by 2-back accuracy seem to be a benefit enjoyed by users of both apps.
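As a rough illustration of how effect sizes of this kind are computed and labeled, the following Python sketch applies the classic pooled-SD formula for Cohen's d to the Stroop change scores reported above for BrainHQ and Control. Equal group sizes are assumed because the per-group n is not given in this section, and the function names are hypothetical. The effect sizes reported in the text may have been derived differently (for example, from model estimates), so the output need not match them.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b, n_a=None, n_b=None):
    """Standardized mean difference (group A minus group B) using the pooled SD."""
    if n_a is None or n_b is None:
        # ASSUMPTION: equal group sizes, so the pooled variance is the simple average.
        pooled_var = (sd_a ** 2 + sd_b ** 2) / 2
    else:
        # Weight each variance by its degrees of freedom.
        pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def cohen_label(d):
    """Conventional Cohen benchmarks: 0.2 small, 0.5 medium, 0.8 large."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Change in incongruent Stroop completion time (sec), mean and SD from the Results.
d_stroop = cohens_d(-9.00, 7.99, -2.25, 7.97)
print(f"BrainHQ vs. Control: d = {d_stroop:.2f} ({cohen_label(d_stroop)})")
```

A negative d here means the first group's completion time dropped more than the second group's, which is the direction of improvement for timed measures.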

Another measure that improved equally in both intervention groups, and significantly more than in the Control group, was the time taken to complete the Incongruent Color Word Naming condition of the Color Word Interference Test (also known as the Stroop test). A classic test of inhibitory function, the Stroop task is a sensitive measure of executive function, and bilingualism has been linked to better Stroop performance among older adults in several studies (Bialystok et al., 2004; Bialystok, Poarch et al., 2014; Incera & McLennan, 2018; Kousaie & Phillips, 2017), although findings are mixed (Antón et al., 2016; Kousaie & Phillips, 2012). Notably, this is a timed test, and overall processing speed would be expected to play a role. Therefore, we also examined performance in a control condition, Color Patch Naming, involving a similar task but with no need to inhibit a prepotent response. For this condition, gains in the three groups did not differ significantly from each other, highlighting the specificity of the gains achieved in the Stroop condition.

Despite these gains in both intervention groups, other tests that primarily involved reaction time showed a clear advantage for the BrainHQ group, with relatively little difference between the Duolingo and Control groups. In the n-back task, the BrainHQ group made larger gains in speed than the other two groups, both on the undemanding 0-back condition (pointing to an overall improvement in processing speed) and on the more challenging 1-back and 2-back conditions. For these latter conditions, the BrainHQ group enjoyed an effect size of d = −0.82 on reaction time (“large” by the standards of Cohen), while the advantage of the Duolingo group over the Control group was about half of this, d = −0.40 (small to medium), and was not statistically significant. Post-hoc power analysis suggests that if this effect size is accurate, a sample size of over 75 participants per group would be necessary to achieve significance. On the other hand, for the 0-back condition, the Duolingo and Control groups barely differed at all, d = −0.06.
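To make the sample-size reasoning concrete, here is a minimal Python sketch of the standard normal-approximation formula for a two-group comparison, n per group ≈ 2((z_alpha + z_power) / d)^2. The significance level, target power, and sidedness used in the power analysis are not stated in the text, so the defaults below are assumptions, and the function name is hypothetical. An exact calculation based on the t distribution (as in G*Power) gives slightly larger values.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80, two_sided=True):
    """Approximate n per group to detect a standardized difference d between two groups.

    ASSUMPTIONS: normal approximation to the two-sample t-test, equal group sizes;
    alpha, power, and sidedness are illustrative defaults, not values from the study.
    """
    if d == 0:
        raise ValueError("effect size d must be non-zero")
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)   # critical value for the test
    z_power = std_normal.inv_cdf(power)      # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / abs(d)) ** 2)


# Effect size of the Duolingo vs. Control reaction-time difference on 1-back and 2-back.
for sided in (True, False):
    label = "two-sided" if sided else "one-sided"
    print(label, n_per_group(0.40, two_sided=sided))
```

The required n is sensitive to these settings, which is why a power statement is easiest to interpret when the significance level, target power, and sidedness are reported alongside it.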

The Simon task, which was also performed at ceiling-level accuracy, revealed similarly strong gains in the BrainHQ group compared to the other two groups. This task was chosen based on previous findings of faster performance in the more demanding conditions by bilingual children (Martin-Rhee & Bialystok, 2008; Poarch & Van Hell, 2012; Tse & Altarriba, 2014) and older adults (Bialystok et al., 2004; Goral et al., 2015; Salvatierra & Rosselli, 2011), although results have been more mixed in young adults (Bialystok et al., 2008). The BrainHQ group made large improvements in reaction time both for the control center-presentation condition and for the more cognitively demanding side-presentation condition. For the Duolingo group, there was virtually no gain compared to the Control group on the center condition, whereas for the side condition, we again observed a small-to-medium gain of a similar effect size (d = −0.44), which again was not significant. Thus, the Simon task converges with the n-back task in suggesting that some benefit of Duolingo on response time for tasks requiring executive function may be present, but a substantially larger study (n > 75 per group) would be necessary to determine that conclusively.

Ultimately, it is not surprising that users of BrainHQ experienced greater gains in response time than users of Duolingo. Exercises in BrainHQ required speeded responses to tasks similar to those in the outcome measures, making the improvement consistent with near transfer. In contrast, Duolingo exercises are not timed and so provide no practice in speeded responding. The fact that individuals in this group nonetheless improved significantly on some task conditions requiring executive functioning provides evidence that the process of learning a language through app-based exercises may provide generalizable cognitive benefits. In this sense, such gains would be considered far transfer and are therefore more impressive than the near transfer of speed found for BrainHQ.

Overall, the improvements documented for BrainHQ training were greater than those for Duolingo, possibly reflecting the greater similarity between the exercises and outcome measures. The Spanish learning exercises bear little resemblance to outcome measures, and the improvements were accordingly more modest. For a user primarily interested in improving processing speed, brain training may be more useful. However, language learning is an interesting activity with other potential rewards, including enhanced opportunities for social interaction which may also have protective effects for brain health. This additional benefit of language learning was supported by the satisfaction and training adherence data that clearly favored Duolingo.

We believe that seniors may not have to make a choice between the inherently rewarding process of learning a language and the cognitive health benefits of speed-based training. It is certainly possible to improve both at the same time. Duolingo is an effective method to study a language despite a nearly complete lack of time pressure involved in the exercises. However, it could easily be extended to incorporate time pressure by requiring speeded responses, an option included in some language learning apps, including Duolingo (but as an optional feature not used in the present study). Furthermore, extended listening exercises in a foreign language can provide real-time cognitive challenge, and can be incrementally increased in difficulty both by increasing the input speed and by modulating the lexical and grammatical complexity of the material. Ultimately, adults who wish to acquire a language and “train their brain” at the same time may seek out language training opportunities that provide both through an increased emphasis on time pressure.

A deeper question in this work concerns the putatively causal link between improved performance on neuropsychological tasks, as seen here, and prevention of dementia. As discussed above, there is considerable evidence that 1) the lifelong experience of bilingualism confers a protective effect by delaying the clinical onset of dementia, and that 2) bilingual children and adults outperform their monolingual peers on certain tasks dependent on executive function. Taken together, these two findings indirectly lead to the conclusion that increasing executive function through an intervention may lower the risk of dementia, but there is as yet no longitudinal data on the protective effects of becoming bilingual. Furthermore, there is as yet no conclusive evidence on the mechanism by which bilingualism exerts a protective effect, although a prominent hypothesis is that the need to manage the conflicting demands of communicating in one language or another at different times leads to improved development and maintenance of brain networks critical for selective attention (Bialystok, 2017; Costa et al., 2009; Kheder & Kaan, 2021).

In any case, adults in this study, or any training study, are not becoming bilingual. Rather, participants in both the Duolingo and BrainHQ groups are engaging in a cognitively stimulating activity that may enhance their mental abilities, similar to numerous other life experiences that are thought to enhance cognitive reserve (e.g. education, social interaction). Bilingualism, like education, is a pervasive long-term aspect of life experience that may be expected to exert a large effect on brain health in later life. Enhanced cognitive activity in mid-life has also been found to exert a protective effect when controlling for early-life factors (Carlson et al., 2008; Karp et al., 2009). It is currently unknown whether any specific interventional program, language-based or otherwise, can provide a similar effect, and the cost and complexity of conducting a randomized clinical trial of such an intervention over several decades is most likely prohibitive. Short of that “gold standard” of evidence, a suitable means of progress in the field is to evaluate the short-term impact of different interventions on cognitive abilities that are known to be associated (directly or indirectly) with mitigated dementia risk. Thus, the present study is a step toward evaluation of language training compared with other app-based interventions as a potential means to improve brain health in seniors.

Although this study found intriguing benefits of both interventions, it is subject to certain limitations. Foremost, some of the observed effect sizes for Duolingo’s effect on n-back and Simon task reaction time were too small to achieve statistical significance within our limited sample size, although they may still guide future work. Second, the amount of language training provided in this study was modest: four months, although a fairly long time for a research study intervention, is not enough time to acquire proficiency in a language. Although participants improved their WEBcape scores (on average from 92 to 136, p = .039), their final scores were still within the range of first-semester Spanish. Thus, any generalizable improvements might be larger under a more realistic period of training, such as one year.

A key aspect of cognitive training studies is the choice of an appropriate control condition, as placebo effects can contribute to gains observed after training (Foroughi et al., 2016; Sala et al., 2019). The purpose in this study was to assess the efficacy of language training on cognitive outcomes, so the Duolingo group was compared to a passive control (to estimate the baseline practice effects from repeating the tests after 16 weeks) and a benchmark training that targeted the processes involved in the outcome measures. The question was whether the Duolingo group would improve more than control and possibly approach the level achieved by a group trained directly in those processes. For that reason, our hypothesis was that Duolingo would fall in the middle. To mitigate the potential effects of placebo, our outcome measures included control conditions involving response time in the absence of executive function demands (0-back as control for 1-back and 2-back, Simon task with center presentation as control for side presentation, and color patch naming as control for the color word interference task). While brain training improved response time on the control conditions as well as the more demanding conditions (except for color patch naming), no benefits of Duolingo were detected for any of the control conditions.

Arguably, a more complete design would include an active control group that engaged in a completely different activity, such as computer games or crossword puzzles, as done in a previous study assessing BrainHQ in seniors (Lee et al., 2020). That study, like ours, found that BrainHQ improved working memory and processing speed on a variety of measures, not limited to specific conditions thought to require conflict management (i.e. executive function). In contrast, our study found greater specificity for training with Duolingo, but less overall improvement. Future studies might compare Duolingo or similar language training to an active control condition with a larger sample size in order to determine whether some effects observed in the present study that were non-significant but of promising effect size (Cohen’s d > 0.4, but p > .05) will turn out to be significant. However, considering that the observed effect sizes in our study might be even smaller with an active control group, it might require a very large study to demonstrate an advantage for Duolingo in its present form compared to an equally engaging “placebo” intervention. Given our findings that Duolingo improves aspects of executive function but has a smaller impact on processing speed than brain training does, we believe that a more promising approach for future studies could be to attempt to combine the benefits of both interventions. It should be possible to develop language-learning activities that can teach a language and boost executive function at the same time, particularly through the use of time pressure. Such an approach might improve language learning outcomes as well, while focusing on an inherently rewarding activity that older adults appear to find more enjoyable and sustainable than brain training, according to our questionnaire.

The present study demonstrated the efficacy of 4 months of smartphone-based language training on improving certain aspects of executive functioning. Although the two interventions differ in many ways, it is notable that BrainHQ provides practice in rapid responding, whereas Duolingo does not, and that the main difference in outcomes was that BrainHQ participants experienced a general increase in speed across all tasks and conditions. Speculatively, the efficacy of language training might improve by incorporating a speeded component. Regarding the implications of these results, translating the observed gains into tangible reductions in the risk of dementia will ultimately require longitudinal studies to confirm causal links between lifestyle activities and health outcomes, but existing studies suggest that engaging mental activities can produce a meaningful protective effect (Livingston et al., 2017). Given that app-based language training is effective, enjoyable, and sustainable, and potentially leads to further life-enhancing activities that take advantage of the newfound language knowledge (e.g. travel, new friends), the present results should encourage seniors to take advantage of these benefits.

Acknowledgments

This study was funded by a grant from CABHI (https://www.cabhi.com/), with additional support from Duolingo.

Disclosure statement

Duolingo provided financial support and in-kind technical support. Posit Science provided technical support. These companies and the funders played no role in the experimental design, the data analysis and interpretation, or the writing of the paper.

Supplementary material

Supplemental data for this article, including Table S1, are available online.

Additional information

Funding

This work was supported by the Center for Aging + Brain Health Innovation [I2P2 Grant].

References

  • Abutalebi, J., Della Rosa, P. A., Green, D. W., Hernandez, M., Scifo, P., Keim, R., Cappa, S. F., & Costa, A. (2012). Bilingualism tunes the anterior cingulate cortex for conflict monitoring. Cerebral Cortex, 22(9), 2076–2086. https://doi.org/10.1093/cercor/bhr287
  • Alladi, S., Bak, T. H., Duggirala, V., Surampudi, B., Shailaja, M., Shukla, A. K., … Kaul, S. (2013). Bilingualism delays age at onset of dementia, independent of education and immigration status. Neurology, 81(22), 1938–1944. https://doi.org/10.1212/01.wnl.0000436620.33155.a4
  • Anderson, J. A. E., Hawrylewicz, K., & Grundy, J. G. (2020). Does bilingualism protect against dementia? A meta-analysis. Psychonomic Bulletin & Review, 27(5), 952–965. https://doi.org/10.3758/s13423-020-01736-5
  • Anderson, J. A. E., Mak, L., Keyvani Chahi, A., & Bialystok, E. (2018). The language and social background questionnaire: Assessing degree of bilingualism in a diverse population. Behavior Research Methods, 50(1), 250–263. https://doi.org/10.3758/s13428-017-0867-9
  • Antón, E., García, Y. F., Carreiras, M., & Duñabeitia, J. A. (2016). Does bilingualism shape inhibitory control in the elderly? Journal of Memory and Language, 90, 147–160. https://doi.org/10.1016/j.jml.2016.04.007
  • Antoniou, M. (2019). The advantages of bilingualism debate. Annual Review of Linguistics, 5(1), 395–415. https://doi.org/10.1146/annurev-linguistics-011718-011820
  • Antoniou, M., Gunasekera, G. M., & Wong, P. C. (2013). Foreign language training as cognitive therapy for age-related cognitive decline: A hypothesis for future research. Neuroscience and Biobehavioral Reviews, 37(10), 2689–2698. https://doi.org/10.1016/j.neubiorev.2013.09.004
  • Bak, T. H., Long, M. R., Vega-Mendoza, M., & Sorace, A. (2016). Novelty, challenge, and practice: The impact of intensive language learning on attentional functions. PloS One, 11(4), e0153485. https://doi.org/10.1371/journal.pone.0153485
  • Bak, T. H., Nissan, J. J., Allerhand, M. M., & Deary, I. J. (2014). Does bilingualism influence cognitive aging? Annals of Neurology, 75(6), 959–963. https://doi.org/10.1002/ana.24158
  • Barker, R. M., & Bialystok, E. (2019). Processing differences between monolingual and bilingual young adults on an emotion n-back task. Brain and Cognition, 134, 29–43. https://doi.org/10.1016/j.bandc.2019.05.004
  • Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
  • Berggren, R., Nilsson, J., Brehmer, Y., Schmiedek, F., & Lövdén, M. (2020). Foreign language learning in older age does not improve memory or intelligence: Evidence from a randomized controlled study. Psychology and Aging, 35(2), 212–219. https://doi.org/10.1037/pag0000439
  • Bialystok, E. (2017). The bilingual adaptation: How minds accommodate experience. Psychological Bulletin, 143(3), 233–262. https://doi.org/10.1037/bul0000099
  • Bialystok, E. (2021). Bilingualism: Pathway to cognitive reserve. Trends in Cognitive Sciences, 25(5), 355–364. https://doi.org/10.1016/j.tics.2021.02.003
  • Bialystok, E., Craik, F., & Luk, G. (2008). Cognitive control and lexical access in younger and older bilinguals. Journal of Experimental Psychology. Learning, Memory, and Cognition, 34(4), 859–873. https://doi.org/10.1037/0278-7393.34.4.859
  • Bialystok, E., Craik, F. I., & Freedman, M. (2007). Bilingualism as a protection against the onset of symptoms of dementia. Neuropsychologia, 45(2), 459–464. https://doi.org/10.1016/j.neuropsychologia.2006.10.009
  • Bialystok, E., Craik, F. I., Klein, R., & Viswanathan, M. (2004). Bilingualism, aging, and cognitive control: Evidence from the Simon task. Psychology and Aging, 19(2), 290–303. https://doi.org/10.1037/0882-7974.19.2.290
  • Bialystok, E., Craik, F. I. M., Binns, M. A., Ossher, L., & Freedman, M. (2014). Effects of bilingualism on the age of onset and progression of MCI and AD: Evidence from executive function tests. Neuropsychology, 28(2), 290–304. https://doi.org/10.1037/neu0000023
  • Bialystok, E., Poarch, G., Luo, L., & Craik, F. I. M. (2014). Effects of bilingualism and aging on executive function and working memory. Psychology and Aging, 29(3), 696–705. https://doi.org/10.1037/a0037254
  • Brysbaert, M. (2019). How many participants do we have to include in properly powered experiments? A tutorial of power analysis with reference tables. Journal of Cognition, 2(1), 16. https://doi.org/10.5334/joc.72
  • Carlson, M. C., Helms, M. J., Steffens, D. C., Burke, J. R., Potter, G. G., & Plassman, B. L. (2008). Midlife activity predicts risk of dementia in older male twin pairs. Alzheimer’s & Dementia: The Journal of the Alzheimer’s Association, 4(5), 324–331. https://doi.org/10.1016/j.jalz.2008.07.002
  • Carson, N., Leach, L., & Murphy, K. J. (2018). A re-examination of Montreal Cognitive Assessment (MoCA) cutoff scores. International Journal of Geriatric Psychiatry, 33(2), 379–388. https://doi.org/10.1002/gps.4756
  • Chan, D., Shafto, M., Kievit, R., Matthews, F., Spink, M., Valenzuela, M., & Henson, R. N. (2018). Lifestyle activities in mid-life contribute to cognitive reserve in late-life, independent of education, occupation, and late-life activities. Neurobiology of Aging, 70, 180–183. https://doi.org/10.1016/j.neurobiolaging.2018.06.012
  • Chertkow, H., Whitehead, V., Phillips, N., Wolfson, C., Atherton, J., & Bergman, H. (2010). Multilingualism (but not always bilingualism) delays the onset of Alzheimer disease: Evidence from a bilingual community. Alzheimer Disease and Associated Disorders, 24(2), 118–125. https://doi.org/10.1097/WAD.0b013e3181ca1221
  • Chiu, H. L., Chu, H., Tsai, J. C., Liu, D., Chen, Y. R., Yang, H. L., & Chou, K. R. (2017). The effect of cognitive-based training for the healthy older people: A meta-analysis of randomized controlled trials. PloS One, 12(5), e0176742. https://doi.org/10.1371/journal.pone.0176742
  • Cognitive Training Data Website. (2014). Cognitive training data response letter. https://www.cognitivetrainingdata.org/the-controversy-does-brain-training-work/response-letter/
  • Costa, A., Hernández, M., Costa-Faidella, J., & Sebastián-Gallés, N. (2009). On the bilingual advantage in conflict processing: Now you see it, now you don’t. Cognition, 113(2), 135–149. https://doi.org/10.1016/j.cognition.2009.08.001
  • Craik, F. I., Bialystok, E., & Freedman, M. (2010). Delaying the onset of Alzheimer disease: Bilingualism as a form of cognitive reserve. Neurology, 75(19), 1726–1729. https://doi.org/10.1212/WNL.0b013e3181fc2a1c
  • Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25(1), 7–29. https://doi.org/10.1177/0956797613504966
  • Delis, D. C., Kaplan, E., & Kramer, J. H. (2001). Delis-Kaplan executive function system. The Psychological Corporation.
  • Donnelly, S., Brooks, P. J., & Homer, B. D. (2019). Is there a bilingual advantage on interference-control tasks? A multiverse meta-analysis of global reaction time and interference cost. Psychonomic Bulletin & Review, 26(4), 1122–1147. https://doi.org/10.3758/s13423-019-01567-z
  • Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191. https://doi.org/10.3758/BF03193146
  • Foroughi, C. K., Monfort, S. S., Paczynski, M., McKnight, P. E., & Greenwood, P. (2016). Placebo effects in cognitive training. Proceedings of the National Academy of Sciences, 113(27), 7470–7474. https://doi.org/10.1073/pnas.1601243113
  • Gold, B. T., Kim, C., Johnson, N. F., Kryscio, R. J., & Smith, C. D. (2013). Lifelong bilingualism maintains neural efficiency for cognitive control in aging. Journal of Neuroscience, 33(2), 387–396. https://doi.org/10.1523/jneurosci.3837-12.2013
  • Goral, M., Campanelli, L., & Spiro, A., III. (2015). Language dominance and inhibition abilities in bilingual older adults. Bilingualism: Language and Cognition, 18(1), 79–89. https://doi.org/10.1017/s1366728913000126
  • Grundy, J. G. (2020). The effects of bilingualism on executive functions: An updated quantitative analysis. Journal of Cultural Cognitive Science, 4(2), 177–199. https://doi.org/10.1007/s41809-020-00062-5
  • Hilchey, M. D., & Klein, R. M. (2011). Are there bilingual advantages on nonlinguistic interference tasks? Implications for the plasticity of executive control processes. Psychonomic Bulletin & Review, 18(4), 625–658. https://doi.org/10.3758/s13423-011-0116-7
  • Incera, S., & McLennan, C. T. (2018). Bilingualism and age are continuous variables that influence executive function. Neuropsychology, Development, and Cognition. Section B, Aging, Neuropsychology and Cognition, 25(3), 443–463. https://doi.org/10.1080/13825585.2017.1319902
  • Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59(4), 434–446. https://doi.org/10.1016/j.jml.2007.11.007
  • Janus, M., & Bialystok, E. (2018). Working memory with emotional distraction in monolingual and bilingual children. Frontiers in Psychology, 9, 1582. https://doi.org/10.3389/fpsyg.2018.01582
  • Karp, A., Andel, R., Parker, M. G., Wang, H. X., Winblad, B., & Fratiglioni, L. (2009). Mentally stimulating activities at work during midlife and dementia risk after age 75: Follow-up study from the Kungsholmen Project. American Journal of Geriatric Psychiatry, 17(3), 227–236. https://doi.org/10.1097/JGP.0b013e318190b691
  • Kheder, S., & Kaan, E. (2021). Cognitive control in bilinguals: Proficiency and code-switching both matter. Cognition, 209, 104575. https://doi.org/10.1016/j.cognition.2020.104575
  • Klein, R. M., Christie, J., & Parkvall, M. (2016). Does multilingualism affect the incidence of Alzheimer’s disease?: A worldwide analysis by country. SSM - Population Health, 2, 463–467. https://doi.org/10.1016/j.ssmph.2016.06.002
  • Kousaie, S., & Phillips, N. A. (2012). Ageing and bilingualism: Absence of a “bilingual advantage” in Stroop interference in a nonimmigrant sample. Quarterly Journal of Experimental Psychology, 65(2), 356–369. https://doi.org/10.1080/17470218.2011.604788
  • Kousaie, S., & Phillips, N. A. (2017). A behavioural and electrophysiological investigation of the effect of bilingualism on aging and cognitive control. Neuropsychologia, 94, 23–35. https://doi.org/10.1016/j.neuropsychologia.2016.11.013
  • Kowoll, M. E., Degen, C., Gorenc, L., Küntzelmann, A., Fellhauer, I., Giesel, F., … Schröder, J. (2016). Bilingualism as a contributor to cognitive reserve? Evidence from cerebral glucose metabolism in mild cognitive impairment and Alzheimer’s disease. Frontiers in Psychiatry, 7, 62. https://doi.org/10.3389/fpsyt.2016.00062
  • Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. https://doi.org/10.18637/jss.v082.i13
  • Lee, H. K., Kent, J. D., Wendel, C., Wolinsky, F. D., Foster, E. D., Merzenich, M. M., & Voss, M. W. (2020). Home-based, adaptive cognitive training for cognitively normal older adults: Initial efficacy trial. The Journals of Gerontology: Series B, 75(6), 1144–1154. https://doi.org/10.1093/geronb/gbz073
  • Lehtonen, M., Soveri, A., Laine, A., Järvenpää, J., De Bruin, A., & Antfolk, J. (2018). Is bilingualism associated with enhanced executive functioning in adults? A meta-analytic review. Psychological Bulletin, 144(4), 394–425. https://doi.org/10.1037/bul0000142
  • Lenth, R. V. (2021). emmeans: Estimated marginal means, aka least-squares means (R package version 1.5.5-1). https://CRAN.R-project.org/package=emmeans
  • Livingston, G., Sommerlad, A., Orgeta, V., Costafreda, S. G., Huntley, J., Ames, D., … Mukadam, N. (2017). Dementia prevention, intervention, and care. Lancet, 390(10113), 2673–2734. https://doi.org/10.1016/s0140-6736(17)31363-6
  • Martin-Rhee, M. M., & Bialystok, E. (2008). The development of two types of inhibitory control in monolingual and bilingual children. Bilingualism: Language and Cognition, 11(1), 81. https://doi.org/10.1017/S1366728907003227
  • Max Planck Institute for Human Development and Stanford Center on Longevity. (2014). A Consensus on the Brain Training Industry from the Scientific Community. Retrieved Sept. 3, 2021 from https://longevity.stanford.edu/a-consensus-on-the-brain-training-industry-from-the-scientific-community-2/
  • Nasreddine, Z. S., Phillips, N. A., Bedirian, V., Charbonneau, S., Whitehead, V., Collin, I., & Chertkow, H. (2005). The Montreal Cognitive Assessment, MoCA: A brief screening tool for mild cognitive impairment. Journal of the American Geriatrics Society, 53(4), 695–699. https://doi.org/10.1111/j.1532-5415.2005.53221.x
  • Perani, D., Farsad, M., Ballarini, T., Lubian, F., Malpetti, M., Fracchetti, A., … Abutalebi, J. (2017). The impact of bilingualism on brain reserve and metabolic connectivity in Alzheimer’s dementia. Proceedings of the National Academy of Sciences of the United States of America, 114(7), 1690–1695. https://doi.org/10.1073/pnas.1610909114
  • Poarch, G. J., & Van Hell, J. G. (2012). Executive functions and inhibitory control in multilingual children: Evidence from second-language learners, bilinguals, and trilinguals. Journal of Experimental Child Psychology, 113(4), 535–551. https://doi.org/10.1016/j.jecp.2012.06.013
  • Pot, A., Porkert, J., & Keijzer, M. (2019). The bidirectional in bilingual: Cognitive, social and linguistic effects of and on third-age language learning. Behavioral Sciences, 9(9), 98. https://doi.org/10.3390/bs9090098
  • Rabipour, S., & Raz, A. (2012). Training the brain: Fact and fad in cognitive and behavioral remediation. Brain and Cognition, 79(2), 159–179. https://doi.org/10.1016/j.bandc.2012.02.006
  • Ramos, S., Fernández García, Y., Antón, E., Casaponsa, A., & Duñabeitia, J. A. (2017). Does learning a language in the elderly enhance switching ability? Journal of Neurolinguistics, 43(Part A), 39–48. https://doi.org/10.1016/j.jneuroling.2016.09.001
  • Sala, G., Aksayli, N. D., Tatlidil, K. S., Tatsumi, T., Gondo, Y., & Gobet, F. (2019). Near and far transfer in cognitive training: A second-order meta-analysis. Collabra: Psychology, 5(1), 18. https://doi.org/10.1525/collabra.203
  • Salvatierra, J. L., & Rosselli, M. (2011). The effect of bilingualism and age on inhibitory control. International Journal of Bilingualism, 15(1), 26–37. https://doi.org/10.1177/1367006910371021
  • Schweizer, T. A., Ware, J., Fischer, C. E., Craik, F. I., & Bialystok, E. (2012). Bilingualism as a contributor to cognitive reserve: Evidence from brain atrophy in Alzheimer’s disease. Cortex, 48(8), 991–996. https://doi.org/10.1016/j.cortex.2011.04.009
  • Simons, D. J., Boot, W. R., Charness, N., Gathercole, S. E., Chabris, C. F., Hambrick, D. Z., & Stine-Morrow, E. A. (2016). Do “brain-training” programs work? Psychological Science in the Public Interest, 17(3), 103–186. https://doi.org/10.1177/1529100616661983
  • Stern, Y. (2012). Cognitive reserve in ageing and Alzheimer’s disease. Lancet Neurology, 11(11), 1006–1012. https://doi.org/10.1016/s1474-4422(12)70191-6
  • Sullivan, M. D., Janus, M., Moreno, S., Astheimer, L., & Bialystok, E. (2014). Early stage second-language learning improves executive control: Evidence from ERP. Brain and Language, 139, 84–98. https://doi.org/10.1016/j.bandl.2014.10.004
  • Tse, C. S., & Altarriba, J. (2014). The relationship between language proficiency and attentional control in Cantonese-English bilingual children: Evidence from Simon, Simon switching, and working memory tasks. Frontiers in Psychology, 5, 954. https://doi.org/10.3389/fpsyg.2014.00954
  • van den Noort, M., Vermeire, K., Bosch, P., Staudte, H., Krajenbrink, T., Jaswetz, L., Struys, E., Yeo, S., Barisch, P., Perriard, B., Lee, S. H., & Lim, S. (2019). A systematic review on the possible relationship between bilingualism, cognitive decline, and the onset of dementia. Behavioral Sciences, 9(7), 81. https://doi.org/10.3390/bs9070081
  • Waldron-Perrine, B., & Axelrod, B. N. (2012). Determining an appropriate cutting score for indication of impairment on the Montreal Cognitive Assessment. International Journal of Geriatric Psychiatry, 27(11), 1189–1194. https://doi.org/10.1002/gps.3768
  • Wong, P. C. M., Ou, J., Pang, C. W. Y., Zhang, L., Tse, C. S., Lam, L. C. W., & Antoniou, M. (2019). Language training leads to global cognitive improvement in older adults: A preliminary study. Journal of Speech, Language, and Hearing Research, 62(7), 2411–2424. https://doi.org/10.1044/2019_jslhr-l-18-0321
  • Woumans, E., Santens, P., Sieben, A., Versijpt, J. A. N., Stevens, M., & Duyck, W. (2015). Bilingualism delays clinical manifestation of Alzheimer’s disease. Bilingualism: Language and Cognition, 18(3), 568–574. https://doi.org/10.1017/S136672891400087X