
Effects of repeated retrieval on keyword mediator use: shifting to direct retrieval predicts better learning outcomes

Pages 908-917 | Received 25 Mar 2020, Accepted 05 Jul 2020, Published online: 29 Jul 2020

ABSTRACT

Keyword mediators are an effective memory technique to encode novel vocabulary: learners link a novel word form to its meaning with a mental image that includes a keyword that resembles the word form (e.g., nyanya = tomato; keyword mnemonic: the ninja chops the tomato in half). Prior research suggests that such mediated form-meaning associations become less dependent on keywords after retrieval practice. The present study investigated if retrieval-induced decreases in mediator use predict word retention. Thirty participants learned novel vocabulary using experimenter-provided keywords and repeatedly retrieved the words from memory while thinking aloud. As expected, keyword use decreased with practice: learners stopped mentioning keywords for 21.6% of the words (on average after 8.27 retrievals). Shifting to direct, unmediated retrieval predicted higher form and meaning recall on a retention test after 6–8 days. Continuing retrieval practice until a shift has occurred to direct retrieval thus seems beneficial for retention.

Vocabulary acquisition is an essential part of mastering a foreign language (Schmitt, 2008). At its most basic level, vocabulary learning involves the association of new, unfamiliar word forms with new or already existing semantic representations of word meaning in memory (Ellis & Beaton, 1993). One strategy to facilitate the establishment of these form-meaning associations is the keyword method (Atkinson & Raugh, 1975). This is a mnemonic technique in which learners link a keyword, a familiar word that is acoustically similar to the to-be-learned foreign word form (e.g., nyanya – ninja), to the word meaning (here: tomato) with a mental image (e.g., the ninja chops the tomato in half). The keyword method enhances performance on both immediate and delayed tests of word knowledge compared to traditional learning techniques like rote rehearsal (Atkinson, 1975; Fritz et al., 2007). Moreover, positive effects hold irrespective of the age and abilities of the learners (Wyra et al., 2007).

Benefits of keyword mediators have been attributed to the creation and elaboration of meaningful associations between word form and meaning, which facilitate later recall (e.g., Dunlosky et al., 2013; Pyc & Rawson, 2010). Keyword mediators are particularly effective in combination with retrieval practice (Fritz et al., 2007). During retrieval practice, learners actively retrieve word knowledge from memory (e.g., when translating), which leads to enhanced long-term retention (e.g., Karpicke & Roediger, 2008; Roediger & Butler, 2011). One way in which retrieval enhances retention is by stimulating learners to improve the keyword mediators that they use: learners adjust ineffective mediators when retrieval attempts fail, changing them so that the mediator itself can be retrieved from memory more easily in response to the word form, or so that the mediator more readily elicits the correct word meaning (Pyc & Rawson, 2010, 2012). Beyond refining mediators, retrieval practice may also change the nature of mediated form-meaning associations in a more fundamental way, resulting in a shift to unmediated memory retrieval.

From mediated to direct word retrieval

The retrieval of words that are initially encoded with keyword mediators changes with practice such that items can increasingly be retrieved directly, without intermediate retrieval steps (Crutcher & Ericsson, 2000). For example, with practice, learners may connect the word form nyanya to its meaning tomato, without needing the keyword ninja. Rickard (1997) proposed that such a shift from multiple-step retrieval to direct retrieval is fundamental in skill acquisition.

In a foundational study, Kole and Healy (2013) documented the shift from mediated to direct retrieval in word learning, using priming experiments. Participants learned foreign words with experimenter-provided keywords and practiced these words through repeated retrieval. After 5, 10, or 45 retrieval rounds, participants translated each word once more, immediately followed by a lexical decision task in which they rated the lexicality of words that were either semantically related to the keyword or not. Shorter reaction times for semantically related words than for unrelated words were interpreted as an indication that keywords had been activated (a mediated priming effect); similar reaction times for semantically related and unrelated words were interpreted as an indication that participants had retrieved the word meaning without activating the keyword. A mediated priming effect suggesting keyword activation was found only after 5 repetitions, and not after 10 or 45 repetitions (Kole & Healy, 2013). Thus, a shift to direct retrieval seems to have occurred after 5–10 rounds of retrieval practice.

Skill acquisition models, like the direct access model (Rickard & Bajic, 2003) and the identical elements model (Rickard & Bajic, 2006), propose that reduction in keyword mediator use with practice is due to the establishment of direct form-meaning associations in memory. A possible explanation for these changes is the episodic context account of testing effects (Karpicke et al., 2014). This account holds that retrieval strengthens the most important components of the form-meaning association, by reducing the (relative) activation of irrelevant, competing information and increasingly homing in on a limited number of core, contextual features that are processed across successful retrievals (Lehman et al., 2014). Similarly, keyword mediators might eventually no longer be needed when form-meaning associations become increasingly decontextualised and independent of the original encoding context (i.e., the keyword). A compatible theory from the retrieval practice literature, the dual memory theory (Rickard & Pan, 2018), proposes that while learners initially form a study memory during the encoding of items, retrieval practice leads to a second type of memory, test memory, which critically involves the formation of an association from the cue (e.g., foreign word) to the response (e.g., translation). The dual memory theory explains the shift from initially keyword-mediated retrieval to direct retrieval with the idea that test memory becomes the primary route for retrieval after multiple retrieval events (Rickard & Pan, 2018).

Although previous research suggests that repeated retrieval practice fundamentally changes mediated memory retrieval to direct retrieval, the functional significance of this change is at present unclear. Direct memory retrieval might, however, be more viable, because it is less prone to interference by semantically related competitors than multiple-step, mediated memory retrieval (McElree, 2001). Furthermore, strong, selective cue-target associations increase the likelihood of recalling the correct target information (e.g., Karpicke et al., 2014; Lehman et al., 2014; Lehman & Karpicke, 2016). This suggests that the transition to direct retrieval might be important for vocabulary retention, which would have important practical implications for the use of keywords in (foreign) language learning and instruction. For instance, learners could be encouraged to continue practicing retrieval until they no longer rely on the keyword mediator to translate words. The present study therefore tests whether changing from mediated to direct retrieval can indeed be achieved with repeated retrieval and whether it is beneficial for vocabulary learning.

Present study

The main focus in prior research has been on the effects of the keyword method on recall accuracy, both during practice and on later performance tests (e.g., Adams & McIntyre, 1967; Atkinson, 1975; Atkinson & Raugh, 1975; Fritz et al., 2007; Rickard & Bajic, 2003, 2006). The present study taps more directly into the mental processes during word learning with keyword mediators. We do this by means of the think-aloud method (Van Someren et al., 1994), in which learners verbalise their conscious thoughts directly as they arise. In a follow-up to the study by Kole and Healy (2013), the present study investigates the shift from mediated to unmediated direct retrieval more closely in order to establish its relation to word learning outcomes. This contributes to the understanding of word learning with keyword mediators in several ways.

First, we tested whether the occurrence of the shift to direct retrieval is indeed a result of repeated retrieval practice, rather than passage of time. This control step was taken because word representations change over time due to offline consolidation processes (Davis & Gaskell, 2009). Eliminating the effect of time since encoding allowed us to focus purely on the link between repeated retrieval and shifts to direct memory. In the study by Kole and Healy (2013), there was no such control for the time since encoding. Decreased use of mediators after repeated retrieval could, therefore, have been a consequence of the passage of time. Second, we investigated at which moment during practice learners shift from mediated to direct word retrieval. Previous research found keyword activation after five but not after ten repetitions (Kole & Healy, 2013), which suggests that shifts occurred somewhere between five and ten repetitions. To get a more precise estimate of the moment of shift, we measured the moment of shift for each word in the present study using think-aloud data. This allowed us to document variations in the moment of shift between words and participants. Finally, we investigated whether the presence and moment of shifts to direct retrieval predict better word retention.

In this study, participants encoded foreign vocabulary words with experimenter-provided keyword mediators. Providing keywords is as effective as having learners generate their own keywords (Campos et al., 2004), but ensures greater consistency across participants. Participants then engaged in repeated retrieval of the newly learned words while thinking aloud. Based on these verbalisations, the type of retrieval (either mediated: with reference to the keyword, or direct: without reference to the keyword) was determined for each repetition. After one week, word knowledge was tested in order to relate the activation of keywords during practice to word retention.

First, we expected that a shift to direct retrieval would occur as a consequence of repeated retrieval practice rather than mere passage of time. Second, we expected most shifts to occur between 5 and 10 retrieval repetitions, based on previous findings by Kole and Healy (2013). Third, words that shifted to direct retrieval during practice were expected to be recalled more accurately on a retention test than words for which retrieval remained mediated during practice, because direct form-meaning associations are thought to be more stable and less error prone (McElree, 2001; Rickard & Bajic, 2003) and selective cue-target associations might increase the likelihood of recalling the correct target information (e.g., Karpicke et al., 2014; Lehman et al., 2014).

Method

Participants

In total, 33 participants were tested. All were university students and native speakers of Dutch, without prior knowledge of Swahili. The data of three participants were removed from analysis, because they did not comply with the instructions. The remaining 21 female and 9 male participants had an average age of 22.6 years (SD = 2.1).

Stimuli

Fifty Swahili words with corresponding Dutch translations, orthographically similar keywords, and mnemonic association sentences were used (e.g., nyanya = tomato; keyword = ninja; mnemonic = The ninja chops the tomato in half; see the appendix for a complete overview of the stimuli). Of these 50 stimuli, 34 were adapted from an unpublished dataset with participant-generated mnemonics from our lab; the remaining 16 Swahili words were selected from Nelson and Dunlosky (1994). The keywords were all concrete nouns that had substantial orthographic or phonological overlap with the Swahili word form, and were likely to be familiar to learners (cf. Atkinson, 1975; Hulstijn, 1997; Paivio, 1969). The mnemonic association sentences described a concrete, imaginable interaction of the keyword and translation. These sentences were as short as possible (4 to 9 words, M = 7.76, SD = 1.03), similar in complexity across stimuli (e.g., only active sentences, all written in the present tense, containing only two nouns), and similar in structure (keywords always occurred earlier than translations in the sentences).

Procedure

Overview of the experiment

The present study consisted of two sessions. The first session included two initial encoding blocks and 10 retrieval blocks (duration: ± 1.5 h). The second session took place 6–8 days after practice (M = 6.8, SD = 0.8) and included one retention test for receptive and one for productive vocabulary knowledge (duration: ± 20 min).

Initial encoding

During initial encoding, each stimulus (i.e., respectively the Swahili word, keyword, keyword sentence, and L1 translation) was visually presented on a computer screen for 16000 ms, one at a time, in a different random order per participant. Participants were instructed to generate a vivid visualisation of each mnemonic association. After a short break, the stimuli were presented again and participants rated whether they could form an image of each mnemonic on a continuous scale from (1) No, no image at all to (4) Yes, clear image. Based on these ratings, the words were assigned per participant to the frequent and infrequent retrieval condition in such a way that average ratings were similar for the 30 words in the frequent retrieval condition (across participants: M = 3.08, SD = 0.70) and the 20 words in the infrequent retrieval condition (across participants: M = 3.08, SD = 0.71).

Retrieval phase

Initial encoding was followed by 10 retrieval blocks, in which participants typed in the translation of the Swahili words, which were presented one at a time. Submission of empty responses was allowed and there were no constraints on response time. In order to make a comparison between frequently and infrequently practiced words, stimuli were presented in one of two conditions: in the frequent retrieval condition, words were presented in all 10 retrieval blocks; in the infrequent retrieval condition, words were only presented in Block 1, 2, 5, and 10. Immediate feedback was provided: correct responses turned green; incorrect responses turned red and the correct translation was presented (as well as the mnemonic association sentence in Block 1 and 2 only).
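The block schedule described above can be sketched in a few lines. This is only an illustration, not the software used in the experiment; the function and constant names are hypothetical, while the block numbers and word counts (30 frequent and 20 infrequent words, 30 participants in the analysis) come from the text.

```python
# Hypothetical sketch of the retrieval schedule; names are illustrative only.
FREQUENT_BLOCKS = list(range(1, 11))   # frequent words appear in Blocks 1..10
INFREQUENT_BLOCKS = [1, 2, 5, 10]      # infrequent words appear in four blocks


def retrieval_number(block, condition):
    """Return the retrieval count a word reaches in `block` (1-based),
    or None if the word is not presented in that block."""
    blocks = FREQUENT_BLOCKS if condition == "frequent" else INFREQUENT_BLOCKS
    if block not in blocks:
        return None
    return blocks.index(block) + 1


def trials_per_participant(n_frequent=30, n_infrequent=20):
    """Total number of retrieval trials one participant completes."""
    return (n_frequent * len(FREQUENT_BLOCKS)
            + n_infrequent * len(INFREQUENT_BLOCKS))


print(retrieval_number(5, "infrequent"))    # Block 5 is the 3rd retrieval
print(retrieval_number(10, "infrequent"))   # Block 10 is the 4th retrieval
print(trials_per_participant())             # 30*10 + 20*4 = 380
print(30 * trials_per_participant())        # 11400 trials for 30 participants
```

With 30 participants this yields the 11400 retrieval trials for which think-aloud recordings were collected.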

Think-aloud procedure. Participants were instructed to say aloud everything that crossed their mind during retrieval practice. Following Van Someren et al. (1994), participants received instructions and a short practice on thinking aloud prior to the experiment, with additional reminders to think aloud during retrieval practice.

Think-aloud data: categorising direct and mediated retrieval. Think-aloud recordings were collected for every retrieval trial (11400 trials in total) and scored on keyword use by the first author. A subset of trials was scored again after several weeks (n = 300; 10 trials per participant) and complete overlap was found between the scores, indicating high reliability. When participants referred to the keyword and/or mnemonic association sentence during practice, the trial was scored as mediated. When participants mentioned neither the keyword nor the mnemonic association sentence, the trial was scored as direct. When retrieval failed, the trial was treated as if it had not shifted to direct retrieval (see Footnote 1). The presence of a shift was determined by examining retrieval patterns. A word was considered to have shifted from mediated (M) to direct (D) retrieval if participants used direct retrieval at some point during practice and the retrieval stayed direct throughout the rest of practice (e.g., MMMDDDDDDD, shift moment = Trial 4; or MMMDMDMDDD, shift moment = Trial 8).
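The shift criterion can be illustrated with a short sketch. It assumes each word's practice history is coded as a string with one character per trial, M for mediated and D for direct; the function name is hypothetical, and the handling of patterns that are direct from the first trial is our assumption, since the text does not address that case.

```python
def shift_moment(pattern):
    """Return the 1-based trial at which retrieval becomes direct ('D') and
    stays direct until the end of practice, or None if no shift occurred.

    `pattern` holds one character per trial: 'M' = mediated, 'D' = direct.
    Any other character (e.g. a hypothetical 'F' for a failed retrieval)
    is treated as not direct.
    """
    if not pattern or pattern[-1] != "D":
        return None                      # practice did not end in direct retrieval
    i = len(pattern)
    while i > 0 and pattern[i - 1] == "D":
        i -= 1                           # walk back over the final run of D trials
    if i == 0:
        return None                      # assumption: all-direct is not a shift
    return i + 1                         # first trial of the final D run


print(shift_moment("MMMDDDDDDD"))   # 4
print(shift_moment("MMMDMDMDDD"))   # 8
print(shift_moment("MMMMMMMMMM"))   # None
```

The two example patterns from the text give shift moments of Trial 4 and Trial 8, because earlier direct trials that are followed by a mediated trial do not count as the shift.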

Test session

Approximately one week after the practice session (six to eight days, M = 6.8, SD = 0.8) participants completed two tests: a receptive test in which participants produced the Dutch translation of the Swahili words, and a productive test in which participants produced the Swahili word form when presented with the Dutch meaning. The procedure was similar to the practice session in that a word was shown on the screen and participants typed in the translation, but there was no feedback. Omissions (i.e., empty responses) were allowed and there was no response time limit. In between the receptive and the productive test, participants performed a non-linguistic distractor task (a digit span working memory task; Woods et al., 2011) for approximately five minutes, in order to reduce transfer between tests.

Analyses

Prior to further analyses, the data were screened for words for which participants adopted a different mediator during practice (28 out of 1500 words) or expressed familiarity with the Swahili word (3 out of 1500 words). As the number of words in these categories was too low for meaningful analyses, these words were excluded (the number of excluded words per participant ranged from 0 to 7; M = 1.03). This left a total of 1469 words that were included in the statistical analyses.

Test performance

Accuracy on the final test was scored using the Levenshtein Distance (Levenshtein, 1966). This measure describes how many characters need to be removed, added, or replaced to change one word into another (e.g., hause to houses has a Levenshtein Distance of 2). The accuracy on the receptive test was based on a Levenshtein Distance of ≤2, which scored minor typing errors as correct. The productive test was scored twice: once leniently, based on a Levenshtein Distance of ≤2 (thus allowing some deviations from the correct form), and once strictly, based on a Levenshtein Distance of 0 (scoring only the exact response as correct).
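As an illustration of this scoring rule, the sketch below computes the Levenshtein Distance with the standard dynamic-programming recurrence and applies the lenient and strict thresholds. It is not the scoring script used in the study; the function names are hypothetical, and trimming whitespace and lowercasing responses before comparison is our assumption.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string `a` into string `b`."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # delete ca
                            curr[j - 1] + 1,     # insert cb
                            prev[j - 1] + cost)) # substitute (or match)
        prev = curr
    return prev[-1]


def is_correct(response, target, max_distance):
    """Score a typed response against the target word.
    Normalisation (strip + lowercase) is an assumption, not from the study."""
    distance = levenshtein(response.strip().lower(), target.strip().lower())
    return distance <= max_distance


print(levenshtein("hause", "houses"))        # 2, the example from the text
print(is_correct("nyanja", "nyanya", 2))     # True under lenient scoring
print(is_correct("nyanja", "nyanya", 0))     # False under strict scoring
```

An empty response is simply scored as incorrect here, because its distance to the target equals the target's length.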

Statistical analyses

We used logistic mixed-effects models with crossed random effects for participants and items (see Baayen et al., 2008; Quené & Van den Bergh, 2008), to model the dichotomous outcome variables retrieval type (mediated/direct) and recall on the post-test (recalled/not recalled). The lme4 package (version 1.1-20; Bates et al., 2015) in R (R Core Team, 2017) was used for model estimation. To test significance of predictors (predictors were all measured at word level and included as fixed effects in the statistical models), we used p-values calculated with lmerTest (version 2.0-29; cf. Kuznetsova et al., 2017). The reported effect size is the odds ratio (OR). Mixed models were used because they allow the use of predictors at word level, such as the moment of shift, retrieval type, and condition, while including random effects at both the level of words and participants (Baayen et al., 2008).
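In a logistic model, a fixed-effect coefficient β is expressed in log-odds, so the odds ratio is obtained by exponentiating it. The sketch below shows that conversion for three coefficients reported in the Results; the function name is hypothetical. Because the published coefficients are rounded to two decimals, recomputed odds ratios can differ slightly from the reported ones (e.g., β = 3.36 gives about 28.79, against a reported 28.85).

```python
import math


def odds_ratio(beta):
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(beta)


# Coefficients as reported in the Results section (rounded to two decimals).
reported = {
    "Retrieval Type, receptive test": 1.17,
    "Practice Condition, receptive test": 0.47,
    "Practice Condition, Block 5 mediator use": 0.92,
}

for label, beta in reported.items():
    print(f"{label}: beta = {beta:.2f}, OR = {odds_ratio(beta):.2f}")
```

An odds ratio above 1 means the odds of the outcome are higher in the comparison level than in the reference level; a value of 1 means no difference.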

Below, we provide a short overview of the models tested for Hypotheses 1 and 3. Hypothesis 2 was answered by providing descriptive statistics of the shift moment. To describe the effect of repeated retrieval on mediator use during practice (Hypothesis 1), we tested statistical models with Practice Condition (frequent retrieval or infrequent retrieval) and Practice Block (1, 2, 5, or 10; to compare mediator use at the beginning of retrieval practice and after multiple repetitions) as fixed effects and Retrieval Type (mediated or direct) as dichotomous dependent variable, including learners and words as random effects. Second, the average moment of shift (Hypothesis 2) was calculated for words that shifted to direct retrieval in the frequent retrieval condition (the infrequent condition contained only four repetitions, allowing few shifts to direct retrieval and little variation in the moment of shift). Descriptive statistics of the moment of shift are reported to allow comparison with prior studies. Third, to measure the relation between the presence of a shift to direct retrieval during practice and recall on the retention test (Hypothesis 3), we tested models with Retrieval Type (mediated or direct) and Practice Condition as predictors, and Word Retention as the dependent variable (separate models to predict receptive retention, productive retention scored leniently, and productive retention scored strictly). Similarly, we conducted analyses with Shift Moment as quantitative predictor and Word Retention as dependent variable, based on the data of those words that shifted to direct retrieval in the frequent retrieval condition.

Results

Question 1: what is the effect of repeated retrieval practice on keyword mediator use?

During practice, 97.3% of responses in the frequent retrieval condition and 93.4% of responses in the infrequent retrieval condition were correct. Further, in the frequent retrieval condition 78.4% of the responses were mediated (i.e., learners mentioned the keywords) and 21.6% were direct retrieval. In the infrequent retrieval condition, mediators were used in 85.9% of the responses, while only 14.1% were directly retrieved. Figure 1 shows the decrease in mediated retrieval in both conditions over the course of practice. The decrease was significant: an analysis with Block as predictor of Retrieval Type showed significantly less mediated retrieval at the end of practice (Block 10) than in the beginning of practice (Block 1), both in the frequent retrieval condition, β = 3.36, Z(1756) = 11.17, p < .001, OR = 28.85, and in the infrequent retrieval condition, β = 2.96, Z(1174) = 7.02, p < .001, OR = 19.36.

Figure 1. Proportion of items for which participants mentioned the mediator during retrieval in the practice blocks in the frequent and infrequent retrieval condition. The asterisks indicate comparisons between the two retrieval conditions in Block 5 and 10, *** p < .001. The shaded areas represent 95% confidence intervals.

No significant difference in mediator use was found between the frequent and infrequent retrieval condition in Block 1 (β = 0.53, Z(1465) = 1.01, p = .31, OR = 1.70) nor in Block 2 (β = 0.43, Z(1465) = 1.38, p = .17, OR = 1.54). This was to be expected because practice was identical in the two conditions in the first two blocks. However, in Block 5, when words in the frequent retrieval condition were retrieved for the fifth time and words in the infrequent retrieval condition for the third time, the mediator use differed significantly between the conditions (β = 0.92, Z(1465) = 1125.00, p < .001, OR = 2.51), with less mediated retrieval in the frequent retrieval condition than in the infrequent retrieval condition. At the end of practice, when words in the frequent retrieval condition were retrieved for the tenth time and words in the infrequent retrieval condition for the fourth time, there was also significantly less mediated retrieval in the frequent retrieval condition (β = 0.86, Z(1465) = 4.62, p < .001, OR = 2.36).

In sum, repeated retrieval in the frequent retrieval condition reduced the use of keyword mediators and thus increased direct retrieval in comparison to the infrequent retrieval condition. Whereas mediator use did not differ between conditions in the first two blocks, in which words had been retrieved equally often in the two conditions, the frequent retrieval condition reduced mediator use compared to the infrequent retrieval condition in the later blocks.

Question 2: what is the average moment of shift from mediated to direct retrieval?

Calculated over the subset of words on which a shift appeared in the frequent retrieval condition (see Footnote 2), participants shifted on average after 8.27 retrievals (SD = 1.84, n = 21 participants; averaged first across words per participant, and then across participants). This estimate is based on a relatively small subset of trials and items: a shift from mediated to direct retrieval appeared for 21.5% of the words in the frequent retrieval condition (189 out of 880 words) at some point during practice; and of the 30 participants, only 21 shifted to direct retrieval during practice.
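The two-step averaging (first across words within each participant, then across participants) can be sketched as follows. The participant labels and shift moments below are made up purely for illustration and are not the study's data; the function name is likewise hypothetical.

```python
from statistics import mean


def average_shift_moment(shift_moments):
    """Average shift moments across words per participant, then across
    participants. `shift_moments` maps a participant id to a list with one
    entry per word: the shift trial, or None if the word never shifted.
    Returns (grand mean, number of participants with at least one shift)."""
    per_participant = []
    for moments in shift_moments.values():
        shifted = [m for m in moments if m is not None]
        if shifted:                        # skip participants without any shift
            per_participant.append(mean(shifted))
    if not per_participant:
        return None, 0
    return mean(per_participant), len(per_participant)


# Hypothetical data: three participants, only two of whom shifted on any word.
example = {
    "p01": [4, 8, None],
    "p02": [10],
    "p03": [None, None],
}

grand_mean, n_shifters = average_shift_moment(example)
print(grand_mean, n_shifters)   # 8 2
```

Averaging this way weights every shifting participant equally. Pooling all shifted words in the example would instead give (4 + 8 + 10) / 3 ≈ 7.33, so the two procedures can yield different estimates when participants differ in how many words shifted.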

Question 3: does a shift from mediated to direct retrieval predict better retention?

Relation between presence of a shift and word retention

A significant main effect of Retrieval Type on receptive word retention was found (β = 1.17, Z(876) = 2.62, p < .01, OR = 3.22, see Table 1), with higher correct recall on the retention test for words on which a shift to direct retrieval appeared during practice than for words that remained mediated throughout practice. Likewise, significant main effects of Retrieval Type during practice were found on productive test performance when scored leniently (β = 0.78, Z(876) = 2.54, p = .01, OR = 2.19, see Table 1) and when scored strictly (β = 0.77, Z(876) = 3.02, p < .01, OR = 2.15, see Table 1), both with higher word retention for words that were directly retrieved during practice than for mediated words.

Table 1. Fixed and random effects of mixed models with condition, retrieval type, and shift moment as predictors of performance on tests of receptive and productive word knowledge.

Relation between moment of shift and word retention

No significant relation was found between Shift Moment and word retention on the receptive test (β = −0.28, Z(186) = −1.63, p = .10, OR = 0.76, see Table 1), nor on the lenient productive test (β = −0.02, Z(186) = −0.22, p = .83, OR = 0.98, see Table 1), nor on the strictly scored productive test (β = −0.04, Z(186) = −0.55, p = .58, OR = 0.96, see Table 1).

Effects of repeated retrieval practice on word retention

As a supplement to our main analysis, we also tested whether the frequent retrieval condition, which reduced keyword mediator use overall, led to higher word retention than the infrequent retrieval condition. In the frequent retrieval condition, 90.9% of responses were correct on the receptive test, 79.3% on the leniently scored productive test, and 51.0% on the strictly scored productive test. For the infrequently retrieved words, 86.7% of responses were correct on the receptive test, 79.1% on the productive test when scored leniently, and 48.3% on the productive test when scored strictly. A model with Practice Condition as predictor for performance on the receptive test showed a significant difference between the frequent and infrequent retrieval condition (β = 0.47, Z(1465) = 2.55, p = .01, OR = 1.60, see Table 1), with higher word retention in the frequent retrieval condition than in the infrequent retrieval condition. No significant effects of Condition were found on the productive test, neither when scored leniently (β = 0.13, Z(1465) = 0.90, p = .37, OR = 1.14, see Table 1), nor when scored strictly (β = 0.21, Z(1465) = 1.63, p = .10, OR = 1.23, see Table 1).

Discussion

The present study explored the effects of repeated word retrieval on the use of keyword mediators and word retention using a think-aloud paradigm. Mediator use decreased over the course of retrieval practice of newly learned word form-meaning associations and the shift from initially mediated to direct retrieval during practice predicted better recall on a test of receptive and productive word knowledge one week after practice.

There are three main findings. First, as we had expected, think-aloud protocols revealed that mediated retrieval decreased over the course of repeated retrieval practice. Whereas in the beginning of practice, participants almost always mentioned the keywords, the word meaning was increasingly mentioned directly without mediator reference after several repetitions. The decrease in mediator use was larger in the frequent retrieval condition than in the infrequent retrieval condition, which indicates that the decrease in keyword mediation was a consequence of repeated retrieval rather than merely the time that had passed since encoding. This suggests that reduced keyword-related priming effects found in earlier experiments by Kole and Healy (2013) were indeed caused by repeated retrieval and not the mere passage of time. The findings also support the idea that instead of strengthening mediated memory representations, repeated retrieval fundamentally changes them (cf. the identical elements model, Rickard & Bajic, 2006, and the episodic context account of retrieval practice, Karpicke et al., 2014; Lehman et al., 2014). Mediators serve as an initial learning context, but with practice, the mediated connection between the target form and meaning appears to change into a more direct connection.

Second, the average moment at which a shift from mediated to direct retrieval appeared was after 8.27 retrievals, which is comparable to the range proposed by Kole and Healy (2013), who found indications for keyword activation after 5 but not after 10 retrieval rounds.

Third, a shift from mediated to direct retrieval during practice predicted better word retention over time: words that were retrieved directly, without mediators, during practice were more likely to be recalled correctly on a test one week after practice. This is in line with suggestions that direct memory associations can be reactivated more easily, because fewer interim steps are needed and thus the chance for activation of competing, irrelevant information is lower (Lehman et al., 2014; McElree, 2001; Rickard & Bajic, 2003). The present data did not show a relation between the moment of shift during practice and word retention, however. Looking at the conditions overall, we found that the frequent retrieval condition (10 retrieval opportunities), which reduced keyword mediation during practice, led to higher word retention than the infrequent retrieval condition (4 retrieval opportunities) on the receptive test but not on the productive test.

Our findings that the use of mediators decreased over the course of retrieval practice, and that decreased mediation predicted better word retention over time, are relevant for cognitive accounts of retrieval practice. To begin with, it is an ongoing debate in the literature whether and how memory representations change as a consequence of repeated retrieval. According to the semantic elaboration hypothesis (Carpenter, 2009, 2011; Carpenter & Yeung, 2017; Wirebring et al., 2015), the retrieval of information from memory activates cue-relevant information (in particular, mediators), which then becomes incorporated into memory. In this way, retrieval is thought to lead to increasingly elaborate memory representations that are easier to recall due to the increased number or quality of associations. However, the present findings suggest that such elaboration may not be beneficial throughout the course of learning and that “the time course of mediators may be limited, in that they are activated during early stages of learning but cease to be utilised after a cue-target pair has become well learned” (Carpenter & Yeung, 2017, p. 138). In this way, our findings are more in line with the idea that repeated retrieval leads to an increasing restriction of the search set of candidate responses, as suggested by the episodic context account of testing (Karpicke et al., 2014; Lehman et al., 2014). Ultimately, repeated retrieval may produce decontextualisation “wherein items become more retrievable but are no longer only associated with a specific context (e.g., the original study context)” (Lehman et al., 2014, p. 259; cf. also the dual memory theory, Rickard & Pan, 2018).

The present findings are also relevant in light of earlier studies showing that, although repeated retrieval enhances retention, the more repetitions have already been completed, the smaller the added value of extra repetitions for retention over time (Pyc & Rawson, 2009). Although the available data are limited, there appears to be a curvilinear relation between the number of repetitions and retention, which flattens around 6 to 8 repetitions (i.e., after 6 to 8 repetitions, the added value of extra repetitions is lower; Pyc & Rawson, 2009). This number roughly coincides with the average moment of shift to direct retrieval found in the present study (8.27 repetitions). Tentatively, the benefits of additional retrieval practice might decrease once a shift has occurred from mediated to direct retrieval. Especially with a view to practical applications, an interesting question for further research would therefore be to systematically compare the effects of retrieval practice done before and after the shift from mediated to direct retrieval, in order to examine whether, for example, the shift to direct retrieval is a cut-off point after which retrieval practice becomes less beneficial. Further experimental studies along these lines could also partly alleviate the issue that the relation between shifts to direct retrieval and later retention is correlational in the present study. Because the present experiment focused on naturally occurring shifts from mediated to direct retrieval, it did not include a manipulation of retrieval type, and it is therefore not possible to rule out that item characteristics such as word difficulty influenced both shifts to direct retrieval and retention.

A number of limitations of the present study need to be considered. First, participants shifted to direct retrieval on only a limited number of words in the 10 rounds of retrieval practice. The average shift moment of 8.27 retrievals was therefore based on a subset of the practiced words and might underestimate the actual number of retrievals needed to cause a shift, because words that had not yet shifted after 10 rounds could not contribute to this average. The limited number of shifts observed also restricted the range of shift moments, which could explain the lack of a relation between the moment of shift and word retention. Exploratory correlational analyses of our data suggest that participants who shifted, and participants who shifted earlier on average (and thus had a higher number of direct retrievals during practice), might have better retention than participants who did not shift to direct retrieval or who shifted later, but this relation needs to be tested more systematically in future studies.

Second, the think-aloud procedure of the present study may have reduced shifts to direct retrieval compared to earlier studies (e.g., Kole & Healy, 2013; Rickard & Bajic, 2003). Possibly, repeatedly saying the keywords aloud led to additional practice and preservation of the keywords. Oral debriefing furthermore revealed that some participants felt that mentioning the mediators was socially desirable. These issues may have delayed and reduced the shift to direct retrieval in the present study. Follow-up studies in which think-aloud is required on only some of the trials could partly solve this issue. Also, increasing the number of repeated retrievals during practice to the point at which direct retrieval has been established for the majority of the words could reinforce the current findings. Finally, it would be desirable to replicate our findings in future research using self-generated mediators. Despite similar recall for self-generated and experimenter-selected mediators in previous studies (Campos et al., 2004), keyword preservation over the course of retrieval practice might differ between self-generated and experimenter-provided mediators. Nevertheless, despite these limitations, we established a link between the presence of a shift to direct retrieval during practice and word retention over time. This suggests that the study tapped into a meaningful aspect of word learning.

Anecdotally, while the present study was not set up to test the effectiveness of a combination of repeated retrieval practice and the keyword method, performance levels on the final test were rather high. We found that a week after the 90 minutes of practice, participants still remembered the meaning of about 90% of the 50 practiced words on average, and correctly recalled the exact spelling of about half the words. Combining the keyword method with repeated retrieval practice can thus be an effective strategy to remember novel vocabulary. Moreover, learners who encode new vocabulary items with the help of keyword mediators might benefit from continued practice until they no longer need to retrieve the mediator in order to recall the word meaning. Conversely, learners who notice that they engage in effortful step-wise retrieval, in which they first search for the keyword mediator before recalling the word meaning, would do well to continue retrieval practice to further strengthen form-meaning associations.

To conclude, the present study showed that words encoded with keyword mediators are, after repeated retrieval practice, eventually retrieved directly, without reference to mediators. This shift from mediated to direct retrieval predicted better long-term retention, which suggests that continuing practice until a shift to direct retrieval has occurred is beneficial for word retention.

Acknowledgements

This article is based on the thesis of the first author.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 If retrieval failure during practice is related to retention on the posttest, this scoring might introduce a confound causing lower retention of mediated items. The analyses reported hereafter were therefore repeated after excluding any items that were answered incorrectly in the frequent retrieval condition in any of the last three retrieval rounds. This concerned only 4 items in total (0.5% of all items). Excluding these items changed the results numerically but did not change the pattern of results (all effects remained significant and in the same direction).

2 Data from the infrequent retrieval condition was not included because it contained only four repetitions, allowing few shifts to direct retrieval and little variation in the moment of shift.

References

Appendix

Overview of all stimuli: 50 Swahili words with corresponding translations, keywords, and mnemonic association sentences. Note that the keyword is always mentioned before the translation in the original Dutch mnemonic sentences, but this is not always the case in the provided translations due to differences in word order between Dutch and English.