Research Article

Why is a flamingo named as pelican and asparagus as celery? Understanding the relationship between targets and errors in a speeded picture naming task

Received 14 Jan 2023, Accepted 10 Jan 2024, Published online: 13 Feb 2024

ABSTRACT

Speakers sometimes make word production errors, such as mistakenly saying pelican instead of flamingo. This study explored which properties of an error influence the likelihood of its selection over the target word. Analysing real-word errors in speeded picture naming, we investigated whether, relative to the target, naming errors were more typical representatives of the semantic category, were associated with more semantic features, and/or were semantically more closely related to the target than its near semantic neighbours were on average. Results indicated that naming errors tended to be more typical category representatives and possess more semantic features than the targets. Moreover, while not being the closest semantic neighbours, errors were largely near semantic neighbours of the targets. These findings suggest that typicality, number of semantic features, and semantic similarity govern activation levels in the production system, and we discuss possible mechanisms underlying these effects in the context of word production theories.

Introduction

In everyday life, we successfully retrieve and utter thousands of words each day. In neurotypical adults, this mostly happens automatically, rapidly, and at a high level of accuracy. However, despite the high success rate, the process of word production is not perfect and sometimes it breaks down even in neurotypical speakers who may utter an incorrect word (e.g., in malapropisms; Fay & Cutler, Citation1977), make errors in a word’s phonemes, or fail to produce the word at all. If analysed thoroughly, such errors can give us an insight into the processes underlying word production: We can localize the origin of the observed errors in theories of word production and use them to understand the processing mechanisms that have led to precisely this – and not another – erroneous response. Thus, word production errors provide a window into information representation and into processing mechanisms during word production.

Given the fairly low number of spontaneously occurring errors in neurotypical individuals (e.g., saying orange when seeing an apple), a number of tasks have been suggested that experimentally trigger word production errors, and interestingly, the errors in these tasks were found to resemble some of the types of errors made by individuals with aphasia (Moses et al., Citation2004). Among error-inducing tasks are paradigms that induce tip-of-the-tongue states where only partial word recall occurs (e.g., Brown & McNeill, Citation1966), tongue twister or spoonerism tasks (e.g., Baars et al., Citation1975; Vitevitch, Citation2002), and speeded picture naming tasks (also called “tempo picture naming tasks” or “naming to deadline tasks”) that require participants to name pictures at an increased rate (e.g., Moses et al., Citation2004; Vitkovitch et al., Citation1993). In the speeded naming task, the task used in this study, a response deadline is set to limit processing time for word production, which results in an increased number of naming errors. For example, participants may be instructed to name a picture within 600 ms and this time pressure is thought to influence processing at the semantic and/or lexical levels of word production (e.g., Lampe et al., Citation2023; Lloyd-Jones & Nettlemill, Citation2007; Mirman, Citation2011; Moses et al., Citation2004; Vitkovitch et al., Citation1993; Vitkovitch & Humphreys, Citation1991; see Lampe et al., Citation2023, for a review).

Picture naming errors are often described based on broad error categories that describe the given erroneous response in relation to the target word (e.g., semantic error, phonological error). Research has also focused on whether the occurrence of these broad error categories is predicted by the characteristics of the target words (psycholinguistic variables), for example, investigating whether more semantic errors occur when producing words that are less frequent or have more semantic features (e.g., Fieder et al., Citation2019; Hameau, Nickels, et al., Citation2019; Lampe et al., Citation2021; Nickels, Citation1995; Nickels & Howard, Citation1995). However, to date, comparatively little research has tried to understand why a particular incorrect word is chosen over the target word and investigations of naming errors have usually not explored the precise relationship between the intended target item and the incorrect response given by a participant.

In this study, we assessed the relationship between target words and naming errors produced by neurotypical participants in a speeded picture naming task. We aimed to understand why someone, for example, incorrectly says pelican but not peacock for a picture of a flamingo. To shed light on the mechanisms underlying the selection of a particular incorrect word, we examined which properties of the erroneously selected word influenced the likelihood of its selection over the target word. Specifically, we investigated the psycholinguistic semantic variables typicality, number of semantic features, and semantic similarity. We determined whether, relative to the target response, incorrect responses were more typical representatives of their semantic categories, were associated with more semantic features, and whether they were semantically more closely related to the targets than their near semantic neighbours were on average.

Origin of real-word errors in lexical access for word production

Word production is a complex, multistage process. There is, however, general consensus on the following processing stages and representations: access to word meaning (semantic representation), selection of a word (lexical representation), activation of the corresponding phonemes and, finally, preparation of articulatory commands for response execution (e.g., Abdel Rahman & Melinger, Citation2009; Dell, Citation1986; Levelt et al., Citation1999; Oppenheim et al., Citation2010).

It is thought that in conceptually driven word production, lexical representations that are semantically related to the target word are co-activated alongside the target’s lexical representation. For instance, while planning to say the target camel, the representations of horse, dromedary, and donkey are co-activated as they overlap in meaning with the target. There are theoretical debates as to whether these co-activated candidates compete for selection with the target’s lexical representation (e.g., Abdel Rahman & Melinger, Citation2009; Dell, Citation1986; Howard et al., Citation2006; Levelt et al., Citation1999; Mahon et al., Citation2007; Navarrete et al., Citation2014; Oppenheim et al., Citation2010; Rose et al., Citation2019) and around the precise way in which selection occurs (e.g., after a certain number of time steps (e.g., Dell, Citation1986; Dell et al., Citation1997), after reaching an absolute activation threshold (e.g., Oppenheim et al., Citation2010), or based on Luce’s ratio (Levelt et al., Citation1999)). However, in all word production theories the most strongly activated lexical representation at the time of selection is chosen. If the activation of a co-activated lexical representation exceeds the activation of the target’s lexical representation, this co-activated representation is selected, resulting in a semantic naming error.

Lexical representations that are unrelated in meaning to the target word can also be selected. This happens, for example, in perseverative errors where a previously selected representation is still highly active and is reselected rather than the next target item (see e.g., Fischer-Baum et al., Citation2016; Moses et al., Citation2007, for reviews of different theoretical accounts). Lastly, formal naming errors, which share phonemes with the target, arise, depending on the theory, when an incorrect phonological representation is selected and/or when incorrect segments (i.e., phonemes or syllables) are selected during phonological encoding (e.g., Best, Citation1996; Butterworth, Citation1989; Dell, Citation1986).

Relationship between target words and naming errors

There is a large body of experimental research investigating whether properties (i.e., psycholinguistic variables like word frequency, age of acquisition, imageability) of the target word influence picture naming speed, naming accuracy, and the rate of occurrence of certain error types (see Perret & Bonin, Citation2019, for a meta-analysis and review). However, there is another, under-researched, dimension to incorrect responses: the 1:1 relationship between the target word and the erroneous response in terms of psycholinguistic variables. As detailed above, when one word is substituted for a target in word production, the representation of that word must have, for some reason, been more strongly activated in the mental lexicon than the target’s representation. By describing the relationship between target words and erroneously produced responses in terms of psycholinguistic variables, we can identify factors that contribute to the activation patterns in the mental lexicon. This approach allows us to better understand why and how naming errors arise and to reveal the principles that govern information representation and activation in word production.

Previous research into the relationship between target words and erroneous responses with regard to psycholinguistic variables mostly focused on lexical and form-based similarity between the target and an incorrect response that was produced either spontaneously or as a result of an experimental manipulation by neurotypical participants or by individuals with aphasia (see Goldrick et al., Citation2010, for a comprehensive review). A number of studies have demonstrated that speech errors in the spontaneous speech of neurotypical participants (e.g., del Viso et al., Citation1991; Levelt, Citation1989, for errors in an associative relationship with the target, e.g., smoothie for blender) tend to be of higher lexical frequency than the actual target words. The same was found for naming errors produced by individuals with aphasia (Blanken, Citation1990; Kittredge et al., Citation2008; Martin et al., Citation1994, reported a marginally significant result). Similarly, it was shown that error responses of individuals with aphasia were of higher frequency than expected if a response were given at random (e.g., Bormann et al., Citation2008; Gagnon et al., Citation1997; Goldrick et al., Citation2010).

These findings match the theoretically based predictions for effects of lexical frequency. Word frequency is the only variable whose effect is explicitly implemented in theories of word production. Specifically, in WEAVER++, Levelt et al. (Citation1999) suggested lower activation thresholds for word forms of higher frequency and higher activation thresholds for words with lower frequency based on findings by Jescheniak and Levelt (Citation1994). Levelt and colleagues noted that, alternatively, following Roelofs (Citation1997), effects of frequency could also be implemented by varying items’ verification times as a function of their frequency of occurrence, so that lower frequency items would take longer to be processed. Similarly, Dell (Citation1990) implemented frequency as varying resting levels at the lexical level: Higher frequency words have higher resting activation levels than lower frequency words.

However, other research investigating the relationship in lexical frequency between targets and naming errors did not find evidence for naming errors being systematically higher in frequency than the intended target word (spontaneous errors of neurotypical speakers: e.g., Harley & MacAndrew, Citation2001; Vitevitch, Citation1997; experimentally induced errors of neurotypical participants: Dell, Citation1990; errors of individuals with aphasia: Best, Citation1996; Gerhand & Barry, Citation2000; Gordon, Citation2002).

Similarly, conflicting findings have been reported in the literature for age of acquisition, a variable that is highly correlated with word frequency: While Kittredge et al. (Citation2008) found no evidence that incorrect responses in individuals with aphasia differed in age of acquisition from the target words, Gerhand and Barry (Citation2000) found that semantic naming errors consisted of words that were earlier acquired than the actual target words (see also Bormann et al., Citation2008, for a similar finding on a very small item set). This finding is in line with Levelt et al.'s (Citation1999) suggestion that age of acquisition could be modelled in much the same way as lexical frequency: Words with lower age of acquisition would then have an activation advantage over words with higher age of acquisition and would therefore be more likely selected in situations of naming errors.

Relating to word form, length of the target word was reported as a significant predictor of word error length, with naming errors generally preserving the target’s length in number of segments (phonemes or graphemes) or syllables (spontaneous errors of neurotypical speakers: e.g., Fay & Cutler, Citation1977; Harley & MacAndrew, Citation2001; errors of individuals with aphasia: e.g., Best, Citation1996; Blanken, Citation1990; Goldrick et al., Citation2010; Gordon, Citation2002). Lastly, the phonological overlap between the target word and an incorrect response has been the subject of a number of investigations. These studies have mostly reported that target words and naming errors (especially those that were not semantically related) shared aspects of their phonological structure (e.g., overall overlap in phonemes or graphemes or in the segment in word initial position) (spontaneous errors of neurotypical speakers: e.g., del Viso et al., Citation1991; Harley & MacAndrew, Citation2001; errors of individuals with aphasia: e.g., Best, Citation1996; Blanken, Citation1990; Gagnon et al., Citation1997; Goldrick et al., Citation2010; Martin et al., Citation1994). Moreover, the phonological neighbourhood density of the target and the erroneous response was reported to be comparable (spontaneous errors of neurotypical speakers: e.g., Vitevitch, Citation1997; errors of individuals with aphasia: e.g., Gordon, Citation2002).

As is evident from this brief review of the literature, there have been investigations of the influence of various psycholinguistic variables pertaining to lexical information or phonological form on the relationship between targets and incorrectly produced responses. The findings of these studies have been used to gain insights into information representation and processing dynamics during lexical and post-lexical processing stages of word production. More precisely, they have been, for example, used to localize effects of certain variables in the word production process (e.g., Kittredge et al., Citation2008), to reveal co-activation mechanisms during word production due to phonological similarity (e.g., Goldrick et al., Citation2010), and to specify the architecture of word production models (e.g., Blanken, Citation1998).

However, conceptually driven word production starts with semantic processing and there is a growing body of literature highlighting the influence on word production of semantic variables, which are psycholinguistic variables that capture aspects of the semantic representation of words or their relationship with other words in the mental lexicon. The focus of these investigations is usually on whether the semantic variables affect picture naming speed, accuracy, and the type of naming errors produced by individuals with and without aphasia (e.g., Bormann, Citation2011; Fieder et al., Citation2019; Hameau, Nickels, et al., Citation2019; Lampe, Hameau, Fieder, et al., Citation2021; Lampe et al., Citation2021; Mirman, Citation2011; Rabovsky et al., Citation2016). However, there is little previous work on the relationship between target words and naming errors in terms of semantic variables.

Woollams et al. (Citation2008) and Woollams (Citation2012) argued that erroneous naming responses produced by individuals with semantic dementia in picture naming were of higher typicality within their semantic category than the actual target words. In addition, they found interactions with severity of impairment: The greater the semantic impairment, the greater the difference between target and response typicality, while individuals with very mild impairment did not show this typicalisation of responses. Similar findings of response typicalisation in individuals with semantic dementia have also been reported in other modalities: Woollams et al. (Citation2007) found typicalisation effects in reading aloud and Patterson and Erzinçlioğlu (Citation2008) in delayed copy drawing.

Woollams and colleagues interpreted this typicalisation of responses as arising due to more typical concepts dominating processing in the absence of sufficient semantic activation in the context of semantic impairment (e.g., Woollams, Citation2012; Woollams et al., Citation2008). Consequently, there may be a mechanism by which more typical representatives of a semantic category are more strongly activated and more readily selected than atypical representatives. However, the mechanism underlying effects of typicality has not been formally specified in any current theory of word production.

Semantic similarity is the only other semantic variable where a relationship between targets and errors has been analysed. Mirman (Citation2011) showed that the picture naming errors of individuals with aphasia and of neurotypical participants under speeded naming conditions tended to be near semantic neighbours of the target word. This was the case despite the average likelihood of producing a near semantic neighbour by chance being lower than the likelihood of producing a distant semantic neighbour. This over-representation of near neighbours in naming errors was interpreted as arising due to near semantic neighbours being strongly activated and acting as stronger competitors in word production than more distant semantic neighbours.

The current study

We aimed to understand which properties of an error influence the likelihood of its selection over the target word, given the relative paucity of research on this topic. Specifically, we investigated the nature of real-word errors in relation to their target words in speeded picture naming with neurotypical individuals. As outlined previously, naming errors arise if, during word production, representations of incorrect naming responses are co-activated alongside the target representation and are ultimately selected over the target. Thus, studying the composition of the set of incorrect picture naming responses and their properties in relation to the target provides a window into information representation and processing during word production with respect to influences of the investigated variables, which can ultimately be used to inform theories of word production.

Given the increased interest in the effects of semantic variables on picture naming generally (see Nickels et al., Citation2022, for a review), and the sparsity of investigations focusing on semantic variables in relation to target-error relationships, we focused on three semantic variables. We wished to further examine the effects of typicality (Woollams, Citation2012; Woollams et al., Citation2008) and semantic similarity (Mirman, Citation2011), extending the research into target-error relationships in neurotypical speakers. In addition, we examined the effects of the number of semantic features, given recent research showing that this seems to be a semantic variable that has a robust effect on naming (e.g., Lampe et al., Citation2021; Rabovsky et al., Citation2016; Taylor et al., Citation2012).

While we operationalized these semantic variables based on a semantic features norm database (McRae et al., Citation2005), we do not intend this to be a statement about the structure of semantic representations nor does this study attempt to adjudicate between different theories of semantic representation. Instead, we believe that the feature-based semantic variables we focus on could also be indices of relations between holistic lexical concepts (e.g., the number of links between holistic lexical concepts; e.g., Abdel Rahman & Melinger, Citation2009; Collins & Loftus, Citation1975; Levelt et al., Citation1999). In the Discussion, we discuss how our results could be accounted for in both feature-based and holistic architectures of semantic representations.

There is evidence that these three semantic variables influence word production from previous research on picture naming speed and accuracy, albeit with inconsistent directions of effects (typicality: e.g., Dell’Acqua et al., Citation2000; Fieder et al., Citation2019; Grossman et al., Citation1998; Holmes & Ellis, Citation2006; Jolicoeur et al., Citation1984; Lampe, Hameau, Fieder, et al., Citation2021; Morrison et al., Citation1992; Rogers et al., Citation2015; Rossiter & Best, Citation2013; Woollams, Citation2012; number of semantic features: e.g., Lampe, Hameau, Fieder, et al., Citation2021; Lampe et al., Citation2021; Rabovsky et al., Citation2016, Citation2021; Taylor et al., Citation2012; semantic similarity: e.g., Bormann, Citation2011; Fieder et al., Citation2019; Hameau, Biedermann, et al., Citation2019; Hameau, Nickels, et al., Citation2019; Lampe et al., Citation2017; Lampe, Hameau, Fieder, et al., Citation2021; Lampe et al., Citation2021; Mirman, Citation2011; Mirman & Graziano, Citation2013). We examined the relationship between targets and naming errors with respect to these semantic variables while statistically accounting for other psycholinguistic variables (word frequency, age of acquisition, imageability, familiarity, length, and ordinal category position).

Based on findings from studies of naming speed and accuracy, and on the findings by Woollams et al. (Citation2008), we hypothesized that in speeded picture naming neurotypical speakers would make errors that would be more typical representatives of their semantic category than the target words. Moreover, following Mirman (Citation2011), we predicted that incorrect responses should be semantically more similar to their targets than the targets’ near semantic neighbours are on average. Lastly, we anticipated that naming errors would have more semantic features than the target words, given that the number of semantic features has been shown to facilitate naming speed and accuracy (e.g., Lampe et al., Citation2021; Rabovsky et al., Citation2016; Taylor et al., Citation2012). Moreover, following Rabovsky and McRae (Citation2014), whose neural network model indicated higher semantic activation for words with more semantic features and consequently stronger lexical activation (Rabovsky et al., Citation2016), we expected the rich semantic representations of words with many semantic features to be activated more easily than representations of words with sparser representations. This then results in greater activation of the lexical representations of words with many semantic features, ultimately leading to errors that tend to have more semantic features than the targets.

Methods

Participants

Eighty-three right-handed native Australian English speakers aged 17–35 years took part in the experiment. All had normal or corrected-to-normal vision and no history of neurological or cognitive disorders or speech and language impairments. Five participants were excluded from the analyses as they did not meet the eligibility criteria (n = 1), misinterpreted the task instructions (n = 2), or performed poorly in a standard naming task using the same items (n = 2; see Procedure for more details). Consequently, data from 78 participants were included in the analyses (M = 20.1 years, SD = 2.3 years, range = 17–33 years, 64 females, 14 males). The study was approved by the Macquarie University Human Ethics Committee and written informed consent was obtained from the participants. Participants received course credit or monetary compensation for their participation.

Stimuli

The stimuli consisted of 297 colour photographs, depicting objects from McRae et al.'s (Citation2005) feature database on a white background. All items had high name agreement (>75%) in Australian English (see Lampe et al., Citation2021, for more information on the selection of the experimental stimuli). The items were presented in six pseudorandomised experimental lists in which items from the same semantic category were interspersed by a minimum of two items from different semantic categories.

Information on the semantic variables was retrieved from McRae et al.'s (Citation2005) feature database. The calculation of the measure typicality was based on Rosch and Mervis’ (Citation1975) family resemblance score (Lampe, Hameau, Fieder, et al., Citation2021; Lampe et al., Citation2021): For each semantic feature of an item a feature weight was determined, which was the number of items in the same semantic category sharing that feature. This feature weight was then divided by the number of items in the semantic category and subsequently multiplied by its production frequency (i.e., number of participants who produced this particular feature for an item). Finally, the feature weights of all features of an item were summed. To determine the difference in typicality between a target and an incorrect response, we subtracted the typicality value of the target from the typicality value of the incorrect response if typicality information was available for the incorrect response (i.e., if the response was part of McRae et al.'s feature database).
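The family resemblance computation described above can be sketched as follows. The feature sets, production frequencies, and the helper name `typicality` below are invented for illustration; they are not values from McRae et al.'s (2005) norms.

```python
# Toy feature norms: item -> {feature: production frequency, i.e., how many
# norming participants listed that feature for the item}. Invented values.
features = {
    "flamingo": {"has_feathers": 25, "is_pink": 22, "has_long_legs": 18},
    "pelican":  {"has_feathers": 27, "has_a_beak": 24, "flies": 20},
    "swan":     {"has_feathers": 28, "flies": 19, "has_long_neck": 15},
}
category = {"flamingo": "bird", "pelican": "bird", "swan": "bird"}

def typicality(item):
    """Family resemblance score: for each feature of the item, take the
    number of same-category items sharing the feature, divide by the
    number of items in the category, multiply by the feature's production
    frequency, and sum over all of the item's features."""
    members = [i for i, c in category.items() if c == category[item]]
    total = 0.0
    for feat, prod_freq in features[item].items():
        sharing = sum(1 for m in members if feat in features[m])
        total += (sharing / len(members)) * prod_freq
    return total

# Target-error difference: error typicality minus target typicality.
# A positive value means the error is the more typical category member.
diff = typicality("pelican") - typicality("flamingo")
```

With these toy norms, the error pelican comes out as more typical of the bird category than the target flamingo, mirroring the direction of the difference measure used in the analyses.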

The measure number of semantic features was directly retrieved from McRae et al.'s (Citation2005) database. It is a count of all semantic features, except for taxonomic features, produced by the participants in McRae et al.’s feature production task in response to an item name. For our target-error analyses, we calculated the difference between a target and an incorrect response that is part of the database by McRae et al. in terms of number of semantic features by subtracting the target’s number of semantic features from the number of semantic features of the incorrect response.

Semantic similarity was a measure that described the semantic similarity between the target and an incorrect response in contrast to the similarity of a target and all its near semantic neighbours. Following Mirman (Citation2011), we defined near semantic neighbours as words with a cosine similarity of > .4 between the targets’ feature vector and the feature vector of another item in McRae et al.'s (Citation2005) database. We first determined the average cosine similarity between the target and all its near semantic neighbours (i.e., average “similarity” of all near semantic neighbours with the target). Then, for each incorrect response, if this response was in McRae et al.'s database, we obtained its cosine similarity with the target. Finally, our measure of semantic similarity was established by subtracting the average cosine similarity between a target and all its near semantic neighbours from the cosine similarity between a target and the incorrect response.
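A minimal sketch of this measure, assuming invented binary feature vectors in place of McRae et al.'s norms (`similarity_measure` is a hypothetical helper name):

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical feature vectors (1 = feature listed for the item).
vectors = {
    "flamingo": [1, 1, 1, 0, 0, 0],
    "pelican":  [1, 1, 1, 1, 0, 0],
    "swan":     [1, 0, 1, 1, 0, 0],
    "hammer":   [0, 0, 0, 0, 1, 1],
}

def similarity_measure(target, error, threshold=0.4):
    """Cosine similarity between target and error, minus the mean cosine
    similarity between the target and its near semantic neighbours
    (items with cosine similarity > .4 with the target)."""
    sims = {item: cosine(vectors[target], v)
            for item, v in vectors.items() if item != target}
    neighbours = [s for s in sims.values() if s > threshold]
    mean_neighbour_sim = sum(neighbours) / len(neighbours)
    # Positive values: the error is semantically closer to the target
    # than the target's near neighbours are on average.
    return sims[error] - mean_neighbour_sim
```

In this toy example, hammer falls below the .4 threshold and is excluded from the neighbourhood, while pelican is both a near neighbour of flamingo and closer to it than the neighbourhood average, yielding a positive value of the measure.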

In addition to the semantic variables, information on the psycholinguistic control variables familiarity, age of acquisition, length in phonemes, frequency, and imageability was available for all target items and was retrieved for the incorrect responses that were part of McRae et al.'s (Citation2005) database. Familiarity, age of acquisition, and imageability values were obtained from a norming study (Lampe et al., Citation2021). Frequency was obtained from a word frequency database for British English (SUBTLEX-UK; van Heuven et al., Citation2014). In addition, as perseveration errors were reported to arise in neurotypical participants under speeded naming conditions (e.g., Moses et al., Citation2004; Vitkovitch et al., Citation1993; Vitkovitch & Humphreys, Citation1991), we determined whether a given response was perseverative: If a response had been produced earlier in the experiment, irrespective of the accuracy of that response, or the distance between first and subsequent production, it was coded as a perseveration. Lastly, a measure ordinal semantic category position was derived for the target words, which indicated how many items of the same semantic category had been seen in the experiment before the respective target word to account for facilitatory or inhibitory influences of having previously named items from the same semantic category (Howard et al., Citation2006).
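The two trial-level codings described above, perseveration and ordinal semantic category position, can be sketched as follows; the trial data and the helper name `code_trials` are invented for illustration.

```python
def code_trials(trials, category):
    """trials: ordered list of (target, response) pairs from the experiment;
    category: mapping from item to semantic category.

    A response is coded as a perseveration if it was produced on any
    earlier trial, irrespective of that earlier response's accuracy or of
    the distance between productions. The ordinal category position of a
    target counts how many same-category items were seen before it."""
    seen_responses = set()
    category_counts = {}
    coded = []
    for target, response in trials:
        perseveration = response in seen_responses
        ordinal_position = category_counts.get(category[target], 0)
        coded.append((target, response, perseveration, ordinal_position))
        seen_responses.add(response)
        category_counts[category[target]] = ordinal_position + 1
    return coded
```

For instance, if pelican was produced as an (incorrect) response early in a session and then produced again later, the later production is coded as a perseveration even though many trials may have intervened.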

Procedure

Participants were tested individually in a quiet room. The entire testing session consisted of three naming rounds, lasting about 30–35 min each. First, participants took part in a standard picture naming task (Round 1; data reported in Lampe et al., Citation2021). Subsequently, participants completed two further naming tasks: another standard naming task and a speeded picture naming task, with the order of these tasks counterbalanced between participants (Rounds 2 and 3; data reported in Lampe et al., Citation2023). Half of the participants completed speeded naming in Round 2 and standard naming in Round 3; the other half completed the tasks in the reverse order. Thus, when completing the speeded naming task, which is analysed here, the participants had already named the items once or twice under standard naming conditions. We believed that exposing the participants to the experimental items before the speeded naming task was important to decrease any difficulties related to visually identifying the depicted objects. Thus, our experimental design can be compared to the common practice of preceding a naming task with a familiarization phase (note though that we did not provide participants with any feedback on the accuracy of their responses in any of the naming rounds). We have shown elsewhere (Lampe et al., Citation2023) that repetition of the experimental items had little effect on the size of any effects of the semantic variables on response latencies. In addition, while repeated exposure to the items may decrease the number of naming errors made, there is no reason to assume that item repetition should influence the type of naming error made, which is what was investigated here.

For the speeded naming task, participants were instructed to name each picture using a single word. They were asked to prioritize naming speed over naming accuracy and to say the very first word that came to their minds, irrespective of whether this was a correct response or not. To familiarize participants with the task and the required naming speed, five practice trials were administered at the beginning of the experiment and after each of three breaks. Each practice trial began with the display of a fixation cross in the centre of the screen for a random duration of between 500 and 1000 ms. Next, a picture stimulus was displayed on a white background in the centre of the screen, disappearing after 600 ms with an auditory beep, so that participants could experience the required speed. Subsequently, using information from a voice key, a message was presented for 400 ms on the screen saying “Good!” (in green ink) if the participant had managed to initiate their response within the 600 ms of picture display or “Too slow!” (in red ink) if they were slower than this. A blank screen followed for 1000 ms indicating the end of the current trial and the beginning of the next. Participants were asked to respond as quickly as possible, ideally before they heard the sound (to “beat the beep”).

In the experimental trials, the trial sequence was identical to the practice trial sequence, with the only difference being that picture stimuli were now presented for 2000 ms and participants heard no beep. However, the feedback on the screen after the disappearance of the pictures remained, informing the participants whether they had responded quickly enough. The first item of each experimental block was a practice item (presented according to the trial sequence of an experimental trial, with a 2000 ms presentation duration and no “beep”) and was excluded from the analyses.

Response coding

All responses were transcribed and word onsets were marked using auditory and visual inspection in Praat (Boersma & Weenink, Citation2019). The first full response (i.e., a response consisting of at least one possible English syllable) was classified following the response classification by Fieder et al. (Citation2019) (see Appendix A for a summary of the response coding with examples). Correct responses were responses that included only the correctly produced target words (e.g., cat), correct target words preceded by either a determiner (e.g., a cat), disfluency on the initial phoneme or syllable (e.g., k..cat), or a hesitation (e.g., er cat), and elaborations (e.g., red cat). Three stimuli from the McRae et al. (Citation2005) database were adapted to Australian English and for these stimuli the following alternative responses were accepted as correct: pram for the target buggy, prawn for the target shrimp, motorbike for the target motorcycle, while the original targets themselves were marked as synonyms.

Semantic errors were incorrect responses that consisted of a response that was related in meaning to the target word. Specifically, we coded superordinates (e.g., fruit for apple), subordinates (e.g., pound for coin), coordinates (both target and response belonged to the same semantic category according to the classification by Lampe et al. (Citation2021)Footnote2, e.g., duck for swan), associates (target and response were from different semantic categories but were in an associative relationship, e.g., smoothie for blender), semantic others (target and response were from different semantic categories and were semantically, but not associatively or categorically, related, e.g., beetle for turkey), and part-whole relationships (e.g., brick for wall). Moreover, incomplete responses that shared at least 50% of their phonemes with a semantically related item or vice-versa were also classified as semantic errors (e.g., oct for the target squid: > 50% of the phonemes of oct belong to the semantically related word octopus), as were instances of two-step errors that involved a semantic error followed by a phonological error, resulting in a real word or a nonword (e.g., riger, mediated via tiger, for the target lion).

Other errors consisted of abbreviations (e.g., sub for submarine), unrelated real words that had neither phonological, nor semantic relationship to the target (e.g., animal for guitar), parts of compound target words (e.g., straw for strawberry), self-corrected responses preceded by disfluencies on a non-target initial sound shorter than one English syllable (e.g., s octopus for octopus), synonyms (e.g., bunny for rabbit), and visual errors (semantically unrelated, but visually similar to the target items, e.g., lipstick for the target bullet). False starts were defined as responses consisting of less than one syllable, both target-related (e.g., br for broccoli) and target-unrelated (e.g., br for cabbage), that were not followed by a full response. Unrelated nonwords were responses that were not real English words and shared less than 50% of their phonemes with the target (e.g., adewawa for apple). Phonological errors were both words and non-words that shared at least 50% of their phonemes with the target (e.g., pedal for medal) or vice versa (e.g., ulurula for ruler), but were semantically unrelated. There were also some cases where the origin of an error and its relationship to the target was ambiguous (morphological-semantic-phonological errors; e.g., dishwashing machine for the target dishwasher); these were also classified as other errors. Lastly, no responses and indications of a failure to respond (e.g., I don’t know) were coded as omissions.

On average, participants produced 231.76 correct responses (out of 297; 78.03%, SD = 20.25, range: 181–273), with 45.12 semantic errors (15.19%, SD = 15.71, range: 14–87), 14.71 other errors (4.95%, SD = 8.78, range: 2–48), and 6.82 omissions (1.83%, SD = 6.26, range: 1–27).

Twenty percent of the responses were double-coded by an independent second rater to check for consistency. The two raters reached 97.82% agreement with regard to the broad error categories (correct, semantic error, other error, omission) and 95.91% using the detailed error codes (e.g., incomplete response, coordinate error, etc.). Discrepancies were resolved through consensus discussion.
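The consistency check reported here is simple percentage agreement between the two raters' codes. A minimal sketch (in Python, with hypothetical codes; the actual coding scheme is summarized in Appendix A):

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of trials on which two raters assigned the same code."""
    assert len(codes_a) == len(codes_b)
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)

# Hypothetical broad-category codes for five double-coded responses
rater1 = ["correct", "semantic", "correct", "other", "omission"]
rater2 = ["correct", "semantic", "correct", "semantic", "omission"]
agreement = percent_agreement(rater1, rater2)  # 4 of 5 codes match -> 80.0
```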

Analyses

Of the 23,166 data points (from 297 items and 78 participants), 5,089 (21.97% of responses) were incorrect responses, including both word and nonword responses. Of these, only those incorrect responses that were entries in the McRae et al. (Citation2005) database (i.e., real-word responses of the semantic and other errors categories) could be included in our analyses. Moreover, to avoid spurious results caused by difficulties with object identification or poor conceptual knowledge in our participants, 14 items (i.e., jeep, basement, crane (machine), crowbar, swan, tricycle, ashtray, raft, spear, plug (electric), goose, canoe, mug, and barn) were excluded from the analyses. These items were identified as having low naming accuracy (< 60%) in the second standard naming round. We used the second standard naming round to identify these items as participants had previously completed the first round of standard naming in which any initial difficulties with visual object identification should have been resolved. Ultimately, this resulted in 2,349 erroneous responses (10.14% of the total number of data points). However, as some target words of the McRae et al. (Citation2005) database have no near semantic neighbours, we could not calculate the semantic similarity measure for these items. This was the case for 383 erroneous responses. Thus, for the Primary Analyses, 1,966 naming errors were analysed with regard to the relationship between the target word and the incorrect response given.
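The semantic similarity measure compares the target-error cosine similarity with the target's average cosine similarity to its near semantic neighbours. As an illustration only, with hypothetical binary feature vectors rather than values from the McRae et al. norms, the measure might be computed as:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two semantic feature vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical binary feature vectors over a shared feature set
target = [1, 1, 1, 0, 0]                         # the target concept
error = [1, 1, 0, 1, 0]                          # the error produced
neighbours = [[1, 1, 1, 1, 0], [1, 0, 1, 0, 0]]  # near semantic neighbours

sim_te = cosine_similarity(target, error)
mean_nb = np.mean([cosine_similarity(target, n) for n in neighbours])
# Difference score: target-error similarity minus the target's average
# similarity to its near semantic neighbours
dv = sim_te - mean_nb
```

A target word with no near semantic neighbours has no `mean_nb`, which is why such items had to be dropped from the similarity analyses.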

To determine whether the target words and the incorrect responses significantly differed with respect to the semantic variables, we performed linear mixed-effects analyses using the lme4 package (Version 1.1.31; Bates et al., Citation2015b) in RStudio (Version 2022.07.2; RStudio Team, Citation2022). The lmerTest package (Version 3.1.3; Kuznetsova et al., Citation2017) was used to derive p-values. Random intercepts for participants and for items were included in all models. Variance Inflation Factors (VIFs) were calculated for each model to estimate the extent of multicollinearity between the variables. VIFs for the fixed effects (see Section Primary Analyses) in all models were below 2.87 (and below 2.0 for most fixed effects) and thus below the acceptability threshold (VIFs of around 5 indicate potentially problematic multicollinearity; Hair et al., Citation2014; Rogerson, Citation2011). Hence, there was no sign of problematic multicollinearity between the predictors of the models used in the analyses. The data and analysis code for all analyses reported below are available at https://osf.io/gye6x/.
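The VIF diagnostic can be computed from the R² obtained when regressing each predictor on the remaining predictors. A minimal numpy sketch on simulated predictors (not the study's data; the authors' analyses were run in R):

```python
import numpy as np

def vif(X, j):
    """Variance inflation factor for predictor column j of design matrix X.

    Regresses X[:, j] on the remaining columns (plus an intercept) and
    returns 1 / (1 - R^2); values around 5 are commonly taken to signal
    problematic multicollinearity.
    """
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    Z = np.column_stack([np.ones(len(y)), others])  # add intercept
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    r2 = 1 - ((y - Z @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
    return 1.0 / (1.0 - r2)

rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = 0.9 * x1 + 0.5 * rng.normal(size=500)  # strongly correlated with x1
x3 = rng.normal(size=500)                   # independent predictor
X = np.column_stack([x1, x2, x3])
vifs = [vif(X, j) for j in range(3)]        # x1 and x2 inflate each other
```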

Primary analyses

We ran linear mixed-effects models on three dependent variables: the difference between response and target values for (1) typicality, (2) number of semantic features, or (3) the semantic similarity measure. To determine whether there was a significant difference between targets and naming errors in terms of these semantic variables, we interpreted the intercept of each model: the intercept denotes the average difference in the dependent variable between target and response when all fixed effects of the model equal zero (i.e., the average value of the standardized fixed effects, see below). The main advantages of this approach are that it allows us to directly model the variables of interest – the pairwise difference between targets and naming errors for a set of semantic attributes – as the dependent variables in our models and that the models account for variance related to the specific dyads of target and error response. Following this approach, we are not interested in the independent variables themselves, but only in the intercepts of the models. Since the dependent variable represents the difference between target word and erroneous response in, say, typicality, an intercept of zero would indicate that there was no difference between target and response and hence no bias towards more or less typical responses. Depending on the model, an intercept significantly greater than zero indicates that the erroneous responses were more typical than the targets, had more semantic features than the targets, or were semantically more similar to the targets than the targets’ near semantic neighbours were on average. Thus, the intercept can be thought of as similar to a one-sample t-test on the difference scores (functionally a paired t-test).Footnote3
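As a concrete illustration of this intercept logic (with made-up typicality values; the actual models additionally include random intercepts and standardized covariates), the intercept of an intercept-only model fitted to difference scores estimates the same quantity as a one-sample t-test on those differences:

```python
import numpy as np

def one_sample_t(d):
    """t statistic testing whether the mean of d differs from zero."""
    d = np.asarray(d, dtype=float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

# Hypothetical typicality values for five target-error dyads
target_typ = np.array([5.1, 4.2, 6.0, 3.8, 5.5])
error_typ = np.array([5.9, 4.6, 6.3, 4.4, 5.2])

# Dependent variable of the models: pairwise error-minus-target difference;
# a positive mean indicates errors were more typical than their targets
diff = error_typ - target_typ
t_diff = one_sample_t(diff)
```

The mixed models then additionally partial out the covariates and the by-participant and by-item variance, but the estimand of the intercept is this same mean difference.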

There are two advantages of this statistical approach over using paired t-tests. First, mixed models allow us to account for variance related to differences between participants and items (by-participant and by-item random intercepts). Second, this approach allowed us to control for the influence of psycholinguistic variables that may affect the difference between target and response. In each model, we included the additional semantic variables and various control variables as fixed effects: 1) the response-target difference values for the other two semantic variables (e.g., when testing whether incorrect responses were more typical than the target, we included the target-response differences in number of semantic features and in the semantic similarity measure as fixed effects), 2) target-error difference values for the psycholinguistic variables familiarity, frequency, word length, age of acquisition, and imageability, 3) a binary variable for whether the response was perseverative and had been produced earlier, and 4) a measure of ordinal category position of the target word in the experiment. All fixed effects, except for the binary perseveration variable, were standardized (i.e., mean centred and divided by the sample standard deviation), and the perseveration variable was sum coded. This was done to ensure that the intercept (which is evaluated at the zero point of the other fixed effects) represents the response to items that are average on the control variables.
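The standardization and sum-coding steps that place the intercept at "average" covariate values might look as follows (a minimal sketch with hypothetical values):

```python
import numpy as np

def standardize(x):
    """Mean-centre and scale by the sample SD (z-score)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

# Hypothetical control covariate: target-error frequency difference
freq_diff = np.array([0.3, -1.2, 0.8, 0.1, -0.5])
z = standardize(freq_diff)  # mean 0, sample SD 1

# Sum coding for the binary perseveration variable (-1 / +1), so the
# intercept averages over both levels instead of reflecting one level
persev = np.array([0, 1, 1, 0, 0])
persev_sum = np.where(persev == 1, 1, -1)
```

With every fixed effect centred at zero in this way, the model intercept is evaluated at an "average" observation, which is what licenses interpreting it as the adjusted mean target-error difference.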

The inclusion of these fixed effects in our models allowed us to account for any influence of the additional semantic and control variables on the dependent variable. Thus, our intercept denotes the difference in the semantic variable of interest between target and response once differences in the other psycholinguistic variables are accounted for. However, the effects of these fixed effects on the dependent variables are difficult to interpret in a theoretically meaningful way: given that the field is still developing an understanding of precisely how psycholinguistic variables affect word production speed and accuracy, we currently lack the confidence to interpret, for example, the effect of the difference in familiarity on the difference in typicality between target and error. Thus, while we controlled for their variance, we did not attempt to interpret the fixed effects in the present study. The outcome of these models is summarized in the Results section, and full model outputs, including effects of the psycholinguistic control variables, are reported in Appendix B. Given previous evidence for effects of the available psycholinguistic control variables on word production and the importance of accounting for such variables in word production studies (e.g., Perret & Bonin, Citation2019), we included all psycholinguistic variables simultaneously in the models. This also had the advantage that models were structured identically across all analyses. However, as there is a case for parsimonious models (e.g., Bates et al., Citation2015a; Matuschek et al., Citation2017), we also re-ran the models of the Primary Analyses including only those psycholinguistic control variables that made a significant contribution to the model’s fit.

Supplementary analyses

To check for consistency across slightly different model variants, we also ran additional analyses. These are listed below, and full results are reported in Appendices C–E.

Supplementary Analyses Set 1: Analysis of coordinate errors. As coordinate errors were the most common error type (85.30% of the naming errors of the Primary Analyses, n = 1,677) and there is general consensus among theories of word production that coordinates are co-activated during lexical selection, we conducted an additional set of analyses on coordinate errors only. Given that coordinates are from the same semantic category as the target, focusing on this error type also results in the exclusion of unrelated responses. As in the Primary Analyses, we ran linear mixed models on those coordinate errors for which semantic similarity could be calculated (n = 1,677). The distribution of naming errors included in these analyses for the three semantic variables is plotted in Appendix C. The outcome of these models is summarized in the Results section and the full model outputs are reported in Appendix C.

Supplementary Analyses Set 2: Analyses excluding semantic similarity as a fixed effect. As explained in the Analyses section, there were some erroneous responses for which we could not calculate a semantic similarity value. To be able to take advantage of higher statistical power when including all available naming errors (n = 2,349) or all available coordinate errors (n = 1,917), we ran additional analyses that did not include semantic similarity as a fixed effect for the dependent variables typicality and number of semantic features. The distribution of naming errors across the different error types in these analyses and the full model outputs are reported in Appendix D.

Supplementary Analyses Set 3: Alternative fixed effects variables. We also fitted alternative models for each dependent variable in the Primary Analyses, the coordinate errors analyses (Supplementary Analyses Set 1), and the analyses excluding semantic similarity as a fixed effect (Supplementary Analyses Set 2), that contained separate values of the to-be-controlled psycholinguistic variables for the target and the response as fixed effects, rather than target-error differences. Given that both types of models (difference values or separate values as fixed effects) showed the same results, we report the outcomes of the models with difference values as fixed effects, while outputs of the additional models are provided in Appendix E.

Results

Table 1 provides the frequencies and percentages of the different error types included in our Primary Analyses. Figure 1 plots, for each incorrect response, the difference between the target and error raw values for typicality and number of semantic features, and the difference between the cosine semantic similarity of target and error and that of the targets’ near semantic neighbours on average.

Figure 1. The distribution of naming errors of the Primary Analyses for the differences between target and error values for typicality (Panel A) and number of semantic features (Panel B), and the semantic similarity measure (Panel C). Note. The black dashed line indicates the average difference between target and error for the respective semantic variable.


Table 1. Counts and percentage of errors of each error type included in the Primary Analyses.

As described in the Analyses section, to answer our research questions, we focused on the intercepts of the statistical models investigating typicality, number of semantic features, and semantic similarity. Positive intercepts that were significantly different from zero indicated that the incorrect response was more typical, had more semantic features, or was semantically closer to the target than its near semantic neighbours were on average. Here we report the analysis outcomes with regard to the intercepts for the semantic variables; the full model outputs are reported in Appendices B–E.

Primary analyses

Typicality

The model showed a significant positive intercept (β = 2.52, CI = 0.81–4.22, SE = 0.87, t = 2.90, p < .01), which indicates that incorrect responses were more typical within their category than the targets. The model of Supplementary Analyses Set 2, which included 2,349 observations but did not include semantic similarity as a fixed effect, produced a similar outcome (β = 3.89, CI = 2.27–5.52, SE = 0.83, t = 4.70, p < .001).

Number of semantic features

The model provided a significant positive intercept (β = 0.61, CI = 0.13–1.09, SE = 0.24, t = 2.50, p < .05), indicating that incorrect responses had more semantic features than the target items. The model of Supplementary Analyses Set 2, which used 2,349 observations and did not control for semantic similarity, produced a similar positive effect (β = 0.56, CI = 0.11–1.01, SE = 0.23, t = 2.42, p < .05).

Semantic similarity

In contrast to the previous analyses, the semantic similarity model resulted in a significant negative intercept (β = −0.06, CI = −0.09 – −0.03, SE = 0.01, t = −4.52, p < .001), indicating that the average semantic similarity between a target and its near semantic neighbours was greater than between a target and the incorrect response given by the participants.

Comparable effects for the three semantic variables were found in the models of Supplementary Analyses Set 3 with separate fixed effects for the psycholinguistic variables for target and response (Appendix E).Footnote4,Footnote5

Supplementary analyses set 1: coordinate errors

Typicality

The intercept of the typicality model was positive but not significant (β = 1.68, CI = −0.11–3.47, SE = 0.91, t = 1.84, p = .065), indicating that there was no evidence that the typicality of coordinate errors differed from that of the targets. However, the additional model including all available coordinate errors (n = 1,917 observations) but without semantic similarity (Supplementary Analyses Set 2) had a significant positive intercept, indicating that the coordinate errors were more typical than their targets (β = 3.89, CI = 2.27–5.52, SE = 0.83, t = 4.70, p < .001).

Number of semantic features

The model showed a significant positive intercept (β = 1.09, CI = 0.57–1.60, SE = 0.26, t = 4.10, p < .001), indicating that coordinate errors had more semantic features than their targets. Similarly, the model without semantic similarity as a fixed effect (Supplementary Analyses Set 2) that included all available coordinate errors (n = 1,917 observations), yielded a significant positive intercept (β = 0.56, CI = 0.11–1.01, SE = 0.23, t = 2.42, p < .05).

Semantic similarity

The intercept of the model for semantic similarity was not significant (β = −0.00, CI = −0.02–0.02, SE = 0.01, t = −0.07, p = .944), suggesting that there was no evidence for a difference in semantic similarity between target words and incorrect coordinate responses and target words and their near semantic neighbours on average.

Comparable effects for the three semantic variables were found in Supplementary Analyses Set 3 with separate fixed effects for the psycholinguistic variables for target and coordinate errors (Appendix E).

Discussion

We aimed to further understand which properties of an error influence the likelihood of its selection over the target word. Specifically, using a speeded picture naming paradigm, we analysed whether, relative to the target word, naming errors were more typical representatives of the target’s semantic category, associated with more semantic features, and/or semantically more closely related to their targets than their near semantic neighbours were on average. The results are summarized in Table 2 and partly confirmed these hypotheses: while the effect sizes were small (see also Figure 1), naming errors were more typical category representatives than the target word in all analyses, except for the model of Supplementary Analyses Set 1 on coordinate errors that included semantic similarity, which had the smallest sample size (n = 1,677 observations). Similarly, across analyses we found that incorrect responses had more semantic features than the target words. Contrary to our hypothesis for semantic similarity, when including all error types, incorrect responses were semantically more distant from targets than their near semantic neighbours were on average. However, when only analysing coordinate naming errors (Supplementary Analyses Set 1), this difference was no longer significant.

Table 2. Summarized findings of the Primary Analyses and Supplementary Analyses Sets 1 and 2 with typicality, number of semantic features, or semantic similarity as dependent variables.

In the Primary Analyses we found that naming errors are more typical representatives of their semantic categories than the target words. This was the case despite there being, on average, about as many items in the database by McRae et al. (Citation2005) of higher typicality than the target word as of lower or equal typicality (on average, 50.38% of items in the database were of higher typicality than the target and 49.62% were of lower or equal typicality; t(2348) = 0.74, p = 0.459). Thus, if incorrect responses had been selected by chance, we would likely have observed no effect of typicality.

Our finding suggests that representatives of higher typicality are co-activated with the target, are systematically more strongly activated than more atypical representatives, and are, in the case of naming errors, ultimately selected over the target. This is in line with the typicalisation of response effect previously reported for individuals with semantic dementia (Woollams, Citation2012; Woollams et al., Citation2008). As detailed in the Introduction, Woollams and colleagues found that this typicalisation of responses was dependent on the degree of semantic impairment, with greater semantic impairment yielding larger typicalisation effects. More specifically, for individuals with semantic dementia it was suggested that idiosyncratic semantic features, which are not shared by many representatives, are vulnerable to semantic damage, resulting in poorer performance on atypical items and a gravitation towards more typical representatives.

We propose that the underlying mechanism of the typicalisation effect observed here may be similar, despite its being caused by time-pressured processing rather than cognitive decline: the effect may originate at the semantic level, where the time pressure results in insufficient semantic activation, allowing more typical representatives to dominate semantic processing. Specifically, during semantic processing under time pressure, the idiosyncratic semantic features of atypical words may be less accessible and/or less activated than shared features, leaving insufficient semantic information to distinguish a more atypical representative from a more typical one in the required timeframe. Thus, shared semantic features of higher typicality items dominate the available semantic information, resulting in selection of an incorrect candidate of relatively high typicality. For example, if semantic features that apply to most musical instruments are activated, but the idiosyncratic features that are unique to harp are not accessed, a more typical representative of the category musical instruments that was activated by the shared category features, for example piano, may be more strongly activated than the target and ultimately be selected for production.Footnote6

McRae and colleagues suggest a mechanism that may underpin idiosyncratic features being less activated than shared semantic features: They propose that item typicality is determined by the intercorrelation of an item’s semantic features with those of other items from the same semantic category (e.g., a robin is a more typical bird than penguin because its features are strongly intercorrelated with the features of other members of the category; McRae et al., Citation1997, Citation1999). Typical items are characterized by entire networks of intercorrelated features with bi-directional links between them, while semantic features characterizing atypical items are much less interconnected.

Translated to word production, this may mean that when retrieving the semantic information related to a typical item, its network of intercorrelated features will be activated, resulting in co-activation of lexical representations sharing these semantic features. If the item is atypical, many of the features that characterize more typical representatives of the category will also be activated, which will again cause their lexical representations to be activated. In speeded processing, there may not be enough time to also sufficiently activate the weakly intercorrelated features that are associated with atypical items, leaving their lexical representations only weakly activated, at best. Consequently, during lexical selection, there would be a large cohort of strongly co-activated typical category representatives, one of which is selected over the only weakly activated representation of the more atypical target word. However, as we have shown in our previous research (Lampe, Hameau, Fieder, et al., Citation2021; Lampe et al., Citation2021), category typicality and intercorrelational density, a measure capturing featural intercorrelations, have independent effects in word production, suggesting that typicality effects are not solely driven by featural intercorrelations.

Alternatively, Rosch and Mervis (Citation1975), in prototype theory, proposed that more typical semantic representatives share semantic information with other members of their category and are consequently closer to the category prototype. This was thought to result in a processing advantage for more typical category representatives. Thus, more typical representations may be more accessible, which we see, for example, reflected in facilitatory effects of typicality on response times in picture naming (e.g., Dell’Acqua et al., Citation2000; Grossman et al., Citation1998; Holmes & Ellis, Citation2006; Jolicoeur et al., Citation1984). In processing under time constraints, one of the first available representations may be selected, with insufficient time available to wait for less typical representatives to be activated. Additionally, it has been proposed that, in order to be able to respond quickly and to meet task requirements, participants operating under speeded conditions may lower the activation threshold that has to be surpassed for information selection (Coltheart et al., Citation2001; Humphreys et al., Citation1995; Kello, Citation2004). Thus, there may simply not be enough time for representations of lower typicality items to be sufficiently activated before one of the more typical items’ representation surpasses the activation threshold and is selected for further processing.

In sum, we propose that the effect of typicalisation of response arises at the semantic level, where more typical representations are more strongly activated than less typical representations and are therefore, ultimately, more likely to be selected during word production. None of the proposed mechanisms requires lexical competition and thus, they are compatible with both competitive and non-competitive theories of word production (e.g., Abdel Rahman & Melinger, Citation2009; Dell, Citation1986; Howard et al., Citation2006; Levelt et al., Citation1999; Mahon et al., Citation2007; Navarrete et al., Citation2014; Oppenheim et al., Citation2010). However, as our proposed mechanisms mostly rely on semantic features, they are most readily integrated into theories assuming semantic representations are decomposed into semantic features (e.g., Dell, Citation1986). Nevertheless, our proposed mechanisms may also be compatible with theories assuming a model architecture with holistic semantic concepts (e.g., Abdel Rahman & Melinger, Citation2009; Collins & Loftus, Citation1975; Collins & Quillian, Citation1969; Levelt et al., Citation1999): For example, these theories could include a semantic category node that is connected to all representatives of this semantic category. The strength of this connection between semantic category node and the semantic representation of target or error words could be what is captured by our typicality measure.

Lastly, we note that only one model did not show evidence for a difference in typicality between the target and the actual response given – the model with coordinate errors (rather than all errors) and including semantic similarity as one of the fixed effects (Supplementary Analyses Set 1). There was evidence for significant typicalisation of response in the other analyses that either included all real-word errors, both with and without semantic similarity as a fixed effect, or only coordinate errors but without semantic similarity as a fixed effect (Primary Analyses and Supplementary Analyses Set 2). Consequently, the reason for the non-significant finding in Supplementary Analyses Set 1 is unlikely to be due to either the error type investigated or the inclusion of semantic similarity as a fixed effect in the statistical models. Instead, we suggest that the fact that this analysis included the smallest number of observations compared to the other analyses may have resulted in insufficient statistical power to find the effect.

Besides the effect of typicalisation of response, we also found that naming errors had more semantic features than the intended target words. This was the case despite there being on average more items in the McRae et al. (Citation2005) database that had the same number or fewer semantic features than the target word (on average, 37.83% of items had more semantic features than the target word and 62.17% had the same number or fewer semantic features; t(2348) = −22.26, p < .001). Given these proportions, if incorrect responses were selected by chance, we would have expected to observe a negative effect of number of semantic features, with responses mostly having fewer semantic features than the target word. Our finding of naming errors having more semantic features than the target word suggests that representations with more semantic features than the target receive activation during word production and that, at least on occasions leading to naming errors, they are more strongly activated than representatives with fewer semantic features and are ultimately selected. To the best of our knowledge this has not been demonstrated before.

In a neural network model that simulated processing of words with many or few semantic features, Rabovsky and McRae (Citation2014) found higher semantic activation for words with many compared to those with fewer semantic features. In word production, this is thought to result in stronger activation of the lexical representations of words with more semantic features (Lampe et al., Citation2021; Rabovsky et al., Citation2016). In the context of our experiment, activation would spread from the target features through a network of interconnected semantic features that often occur together. This would result in activation of non-target semantic features; words with more semantic features are more likely to contain some of these features. Furthermore, following Rabovsky and McRae, if the (non-target and target) semantic features activated are of words that are characterized by more semantic features than the target, activation of the lexical representations of these other words will be strong, given their rich semantic representations. This provides a two-fold advantage for words with many semantic features: 1) given their strong activation, they are likely to surpass the necessary activation threshold or be the most strongly activated candidate at the moment of selection. In both competitive (e.g., Abdel Rahman & Melinger, Citation2009, Citation2019; Howard et al., Citation2006; Levelt et al., Citation1999) and non-competitive (e.g., Dell, Citation1986; Oppenheim et al., Citation2010) word production architectures, the stronger activation of words with more semantic features makes their selection more likely than selection of less activated words.Footnote7 2) Under competitive lexical selection accounts, strongly activated words with many semantic features act as strong competitors for selection (Abdel Rahman & Melinger, Citation2009, Citation2019; Howard et al., Citation2006; Levelt et al., Citation1999).
Thus, words with many semantic features will be more readily selected than words with fewer semantic features, which is precisely what we observed in our naming error data.

Assuming holistic representations, our measure number of semantic features may be an index of the number of links between a target concept and other concepts in the mental lexicon. Activation may spread along those links and facilitate processing of words that have many such links, making them more likely candidates for selection than concepts with fewer links (see Lampe et al., Citation2021, for a similar argument).

The analysis of the semantic similarity measure on all real-word naming errors in the Primary Analyses revealed that erroneous responses were less semantically similar to the target than the target’s near semantic neighbours were on average. However, when analysing only coordinate errors in Supplementary Analyses Set 1, this effect was no longer significant, suggesting that, in this analysis, there was no evidence that the semantic similarity between target and incorrect response differed from the average semantic similarity between the target and its near semantic neighbours. How can the difference between the analyses be explained? Examining the 289 naming errors that were not coordinates of the target in the Primary Analyses (14.70% of naming errors), it is clear that they were mostly only distantly related to their targets, as indicated by low cosine similarities between the feature vectors of target and response (M = 0.14, range = 0.00–0.52). In the Primary Analyses, these extreme values of semantically unrelated naming errors likely drove the significant negative effect of semantic similarity. However, when focusing only on coordinate errors, we excluded the possibility that unrelated responses with extremely low target-response feature overlap influenced the effect.

Critically, however, findings from both analyses showed no evidence that erroneous responses were systematically semantically more similar to the target than its near semantic neighbours were on average, in contrast to our predictions. This is surprising under the assumption that the semantically most closely related concept should be the most strongly activated after the target, given its strong overlap in semantic features with the target, and this would be expected to facilitate selection of its representation over other lexical representations.

However, in both analyses, the average cosine semantic similarity between the targets and incorrect responses indicated that, whilst not being the most similar semantic neighbours, incorrect responses were, on average, still near semantic neighbours of the targets (Primary Analyses: M = 0.44, range = 0.00–0.85; Supplementary Analyses Set 1 on coordinate naming errors: M = 0.50, range = 0.00–0.85). This is in line with findings by Mirman (Citation2011) and supports his interpretation that near semantic neighbours are strongly activated via their shared semantic features with the target. Consequently, they act as strong competitors for selection in the context of word production theories that assume co-activated lexical representations to compete for selection with the target word (e.g., Abdel Rahman & Melinger, Citation2009; Levelt et al., Citation1999). Alternatively, in the context of Oppenheim et al.’s (Citation2010) non-competitive model, the semantic-to-lexical connections of items with particularly strong semantic overlap with the targets (i.e., near semantic neighbours) will be especially weakened upon retrieval of the target word. If, later in the experiment, these items are encountered as targets, it will be harder to retrieve them because of the impact of these particularly strongly weakened connections on activation of that target. In combination with noise in the system (e.g., induced by the requirement to name the items quickly), this may lead to a semantically closely related item (e.g., the previous target word) being elicited instead of the target. Assuming holistic semantic representations, semantic similarity could be a measure of the number of nodes that are shared by, and thus connect, two concepts (e.g., superordinate and property nodes like “mammal”, “pet”, “fur”, “whiskers” etc. for cat and dog).
So, a near semantic neighbour would be connected via many such links with the target and activation would spread between interconnected concepts, leading to strong co-activation of near semantic neighbours.

Yet it remains unclear why participants did not consistently produce the nearest semantic neighbour of a target. Our findings of naming errors being more typical representatives of their semantic category and having more semantic features indicate that the selection of an incorrect response is not purely driven by its semantic similarity to the target. Instead, other variables also drive co-activation of non-target representations and influence the selection processes. This may result in the pattern we found, whereby incorrect responses are not the most closely related representations but, on average, fall within the range of near semantic neighbours.

Another explanation for our findings of semantic similarity may lie in the semantic feature database that was used to calculate the semantic similarity between items (McRae et al., Citation2005). The authors emphasized that while listing semantic features for each concept, participants tended to produce characteristics that distinguish a given concept from other, similar concepts, rather than listing properties that are common to many objects. This becomes evident when comparing the feature sets characterizing the intuitively similar concepts pen and pencil, which in the McRae et al. database are only distantly related with a cosine feature vector similarity of .12: pen – has ink, used for writing, is long, made of plastic, different colours, is blue, is round, is thin, has a ball point, made of metal, has a cap, has a pointed end, is red, used with paper; pencil – made of lead, used for writing, made of wood, has an eraser, is sharpened, is erasable, used at school, has an eraser on end, is mechanical, is yellow, used for math, is long, is thin, made of graphite, an utensil, used for drawing. We speculate that the semantic similarity measure calculated on the basis of these feature norms may not fully reflect the actual real-world similarity between concepts. This would suggest that the analyses of semantic similarity based on the McRae et al. (Citation2005) database may not be able to fully capture the actual activation patterns in the word production system. Moreover, the use of a feature database like that of McRae et al. potentially limits the generalisability of our findings: The database contains only a subset of all words in our mental lexicons and it is impossible to estimate to what extent this subset is representative of all concrete nouns in our lexicons.
It is also unclear whether potential item selection biases inherent in the generation of the database may have influenced the findings of this study (e.g., highly frequent features such as breathes, or features that are difficult to verbalize, are known to be underrepresented; Cree & McRae, Citation2003). It would be important for future research to replicate our findings for the three semantic variables with, for example, variables derived from large-scale co-occurrence-based models that allow calculation of the statistical distribution of words from large corpora of text (e.g., LSA, Landauer & Dumais, Citation1997; GloVe, Pennington et al., Citation2014).
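To make the similarity measure concrete, the cosine between two feature sets can be sketched as follows. This is a simplified illustration using binary feature vectors (a feature counts as 1 if it was listed for a concept, 0 otherwise); the McRae et al. norms weight features by production frequency, which is why the published pen–pencil similarity of .12 differs from the unweighted value computed here.

```python
import math

def cosine_similarity(a: set, b: set) -> float:
    """Cosine between two binary feature vectors, represented as sets.

    With binary vectors, the dot product equals the number of shared
    features and each vector's norm is the square root of its size.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

# Feature lists for pen and pencil, as reported from the McRae et al. norms.
pen = {"has ink", "used for writing", "is long", "made of plastic",
       "different colours", "is blue", "is round", "is thin",
       "has a ball point", "made of metal", "has a cap",
       "has a pointed end", "is red", "used with paper"}
pencil = {"made of lead", "used for writing", "made of wood",
          "has an eraser", "is sharpened", "is erasable", "used at school",
          "has an eraser on end", "is mechanical", "is yellow",
          "used for math", "is long", "is thin", "made of graphite",
          "an utensil", "used for drawing"}

# Only three features overlap ("used for writing", "is long", "is thin"),
# so even the unweighted cosine is low.
print(round(cosine_similarity(pen, pencil), 2))  # prints 0.2
```

Even without production-frequency weighting, the distinctiveness of the listed features keeps the similarity of these two intuitively related concepts low, illustrating the point made above.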

Moreover, future studies could investigate spontaneously occurring speech errors in individuals with and without aphasia to overcome the artificial character of the speeded naming task. While we believe that the effects reported in this study would hold for spontaneous speech errors, such research faces at least two major difficulties: the actually intended “target” word would need to be known, which is not always the case in natural conversations; and item information would need to be available specifically for the set of produced errors and intended target words.

An interesting observation that goes beyond the scope of this paper is that most naming errors were perseverative in nature: In the Primary Analyses, 64% of errors consisted of words that had been produced earlier in the same experiment and only 36% were words that had not been previously uttered (the proportions were almost identical in Supplementary Analyses Set 1, where 62% of naming errors were perseverative and 38% were not). This suggests that there is lingering activation related to previous responses in the word production system that influences activation levels of such representations and may facilitate their selection over the target word later in the experiment. The same mechanism has been suggested in the research on cumulative semantic inhibition in standard, non-speeded, picture naming: Naming is slowed by other members of the same semantic category having been previously named and therefore, depending on the theory, being more strongly activated and acting as stronger competitors, or having undergone changes in connection strength in favour of the previous target words (e.g., Howard et al., Citation2006; Oppenheim et al., Citation2010; Rose & Abdel Rahman, Citation2016). In our study, this lingering activation may combine with the effects of the semantic variables described and may also explain why it is not always the most similar neighbour that was produced as an error.

Conclusion

This research has focused on the way in which erroneous responses in a speeded picture-naming task differed from their targets in terms of the semantic variables typicality, number of semantic features, and semantic similarity. Our research supports an account where all three semantic variables, and particularly typicality and number of semantic features, influence processing during word production. This suggests that current theories of word production need to account for these semantic variables and specify how they influence activation levels during processing, which is not currently the case. Consequently, throughout the Discussion, we suggested possible mechanisms that could underpin the observed effects in the context of major theories of word production and different semantic architectures.

We have argued that the observed effects of the semantic variables originate at the semantic level of processing and stem from activation of non-target semantic representations, with the increased influence of semantic factors likely due to the task requirements reducing processing time. More specifically, we found that there is strong activation of semantic representations of words that are more typical representatives of a semantic category than the target, have more semantic features than the target, and are semantically closely related to the target.

Although we operationalized our semantic variables using a semantic feature norm database (McRae et al., Citation2005), we proposed that activation of these (target-related) semantic representations can arise in the context of both decomposed, feature-based (e.g., Dell, Citation1986), and non-decomposed, holistic, semantic representations (Abdel Rahman & Melinger, Citation2009; Levelt et al., Citation1999). In the case of decomposed semantic representations, the activation of non-target semantic representations was suggested to arise following inaccessibility of certain feature types (effect of typicality), featural richness of semantic representations (effect of number of semantic features), or featural overlap among concepts (effect of semantic similarity). For holistic semantic architectures we have argued that activation of non-target semantic representations may stem from activation spreading via links between holistic concept nodes that vary in strength (effect of typicality) and number (effects of number of semantic features and semantic similarity).

As a result of semantic co-activation, the lexical representations of words related in meaning to the target are activated. As already highlighted in the Introduction, the process of selecting one of these strongly co-activated representations for production varies between theories of word production. We have hypothesized that in the context of competitive theories, the strongly active lexical representations of words that are more typical or have more semantic features than the target or are near semantic neighbours of the target are the strongest competitors for selection. In contrast, in the context of non-competitive theories, the lexical representations with the strongest activation from semantics at the time of selection are selected, irrespective of activation levels of co-activated representations. We have discussed how the observed effects of typicality and number of semantic features can easily be explained in the context of such non-competitive word production theories. However, to explain the observed effect that naming errors tend to be near semantic neighbours of the target, within these theories we had to assume semantic-to-lexical links to the target that had previously been weakened in the experiment (Oppenheim et al., Citation2010). The generalisability of this proposal should be addressed in future research, e.g., by using computational modelling. Future work could also investigate the potential role of feedback between the lexical and semantic levels for the production of naming errors and their semantic characteristics.

In sum, we have argued that the activation of non-target semantic representations is compatible with theories of semantic representations that assume decomposed or non-decomposed semantic representations. Similarly, while different theories of word production make different predictions regarding the influence of co-activated lexical representations, we have demonstrated that, with the processing assumptions outlined in the Discussion, both competitive and non-competitive types of theories seem to be able to explain our data. Of course, it remains to be seen whether these predictions are upheld using fully implemented computational simulations that incorporate our proposed amendments of these theories. It also remains to be seen whether such amendments could have unseen consequences on the models’ abilities to simulate other psycholinguistic phenomena, and/or whether new findings will be generated that can be tested experimentally.

Author note

The methods of this study were preregistered on the Open Science Framework (https://osf.io/yw6ma/). The data and analysis code for all analyses reported in this study are available at https://osf.io/gye6x/.

Acknowledgements

The authors thank Serje Robidoux for statistical advice.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This research was supported by an International Macquarie University Research Excellence Scholarship (iMQRES) grant to Leonie F. Lampe. Maria Zarifyan received an EMCL + EMJMD student scholarship and Solène Hameau was supported by an Australian Research Council Discovery Project Grant (number DP190101490).

Notes

1 While Lampe et al. (Citation2021) and Lampe et al. (Citation2023) used data from the same large-scale experiment, these publications analysed different parts of the data and asked fundamentally different research questions. Lampe et al. (Citation2021) studied effects of semantic variables on naming speed and accuracy in the standard naming task of Round 1 and Lampe et al. (Citation2023) looked for differences in the strength of effects of semantic variables between standard and speeded picture naming of Rounds 2 and 3. Neither of these studies has focused on the naming errors produced in the speeded naming task nor analysed the relationship between target words and naming errors.

2 We allowed the items tomato and pumpkin to be classified as either fruits or vegetables and seal and walrus as either sea creatures or land animals.

3 Following a comment during the peer review process, we ran additional analyses using paired t-tests comparing the target values and the response values for the three semantic variables of interest. In all three cases, the results of the Primary Analyses with linear mixed effect models were confirmed: Erroneous responses were more typical than the target (t(1965) = -8.80, p < .001), had more semantic features than the target (t(1965) = -4.60, p < .001), and the near semantic neighbours of a target were, on average, semantically more similar to the target than the produced response was (t(1965) = 13.47, p < .001).

4 Comparable effects were also found when, following Bates et al. (Citation2015a), we simplified the fixed effect structure to include only those fixed effects that were supported by the data following likelihood ratio tests.

For typicality the final model included the difference value between target and response in number of semantic features, frequency, familiarity, and length, as well as the semantic similarity measure. The intercept remained significant and positive (β = 1.89, CI = 0.31 – 3.48, SE = 0.87, t = 2.34, p < .05), indicating that naming errors were more typical than the target words. For number of semantic features the final model included the difference value between target and response in typicality, frequency, age of acquisition, and imageability. The intercept remained significant and positive (β = 0.58, CI = 0.13 – 1.04, SE = 0.87, t = 2.50, p < .05), indicating that naming errors had more semantic features than the target words. Finally, the model for semantic similarity included the difference value between target and response in familiarity, length, age of acquisition, imageability, and the binary term of whether a response was a perseveration. The intercept remained significant and negative (β = -0.06, CI = -0.09 – -0.04, SE = 0.87, t = -4.82, p < .001), indicating that naming errors were less semantically similar to the target than the target’s near semantic neighbours were on average.

5 Comparable effects were also found when following a different analysis approach that was suggested during peer review. For these additional analyses we created dyads of correct and incorrect responses and used typicality or number of semantic features of the given response – irrespective of its accuracy – as the dependent variable. For semantic similarity the dependent variable of this additional analysis consisted of the average cosine similarity of a target’s near semantic neighbours for correct responses and the difference in cosine similarity between target and erroneous response for naming errors. We ran linear mixed effect model analyses on these dependent variables with a new binary variable of response accuracy (correct vs incorrect) as a fixed effect in addition to the psycholinguistic control variables also included in the Primary Analyses. The random intercept by stimulus was replaced by a random intercept for dyads of targets and the produced naming errors. Using likelihood ratio tests, we compared this full model with a model that did not include the binary variable of response accuracy. These model comparisons were significant for all three dependent variables and confirmed the findings of the Primary Analyses that erroneous responses were significantly more typical than the target words (χ² (1) = 75.98, p < .001), had more semantic features (χ² (1) = 21.03, p < .001), and the average semantic similarity between a target and its near semantic neighbours was greater than between a target and the incorrect response given by the participants (χ² (1) = 173.65, p < .001).

6 A reviewer enquired about the potential influence of visual typicality on the observed typicalisation effect. While we have no way of comprehensively and objectively determining an item’s visual typicality for its semantic category, as an approximation, we re-calculated our typicality measure including only those features of the McRae et al. (Citation2005) database that are classified as visual-colour, and visual-form and surface. This new measure gives us an estimate of the overall visual typicality of an item for its semantic category in terms of (only) the visual features that were generated by McRae et al.’s participants. It ranges from 0.00 to 55.50 (mean = 16.67; SD = 10.39). 155 of our 283 items (55%) have values below the mean and could thus be considered as visually less typical representatives of their semantic categories. This visual typicality measure correlates strongly with our measure of semantic typicality of the items (r = 0.71). This suggests that when viewing a picture of a semantically typical concept in which the visual features are visible, processing of any typical visual features likely results in co-activation of other semantically and visually typical items that share some of these typical visual features. This may result in selection of visually and semantically typical non-target items, resulting in semantic naming errors.

However, while future research should study the potential influence of category-based visual typicality more systematically, we do not believe that visual typicality of an item within its semantic category was the driving force for our typicalisation effect. While more typical representatives of a semantic category (e.g., horse, dog) likely share a similar, prototypical visual shape (e.g., four legs, tail, general body shape), and this would often be visible in the pictures, the appearance of more atypical representatives of the category (e.g., bat, the animal) is more unique. Thus, images of these atypical items look rather dissimilar from one another and also from more typical representatives of the category. Consequently, we do not believe that the observed typicalisation effect can arise based on visual information alone, as the characteristic visual features of typical items are likely not displayed in images of atypical items (e.g., the visual features of the visually and semantically typical animal horse are not present in the less typical animal bat). In other words, as atypical items look different from typical items, when seeing something that looks unlike a typical category representative, a more typical item should not be activated based on visual information alone. However, given our data, we are currently unable to fully disentangle effects of visual and semantic typicality and hope that future research addresses this topic further.

7 Similarly, in attractor models it is thought that settling in an attractor is facilitated for words with more semantic features as they are represented as stronger attractor basins (Plaut & Shallice, Citation1993; see also Pexman et al., Citation2007).

References

  • Abdel Rahman, R., & Melinger, A. (2009). Semantic context effects in language production: A swinging lexical network proposal and a review. Language and Cognitive Processes, 24(5), 713–734. https://doi.org/10.1080/01690960802597250
  • Abdel Rahman, R., & Melinger, A. (2019). Semantic processing during language production: An update of the swinging lexical network. Language, Cognition and Neuroscience, 34(9), 1176–1192. https://doi.org/10.1080/23273798.2019.1599970
  • Baars, B. J., Motley, M. T., & MacKay, D. G. (1975). Output editing for lexical status in artificially elicited slips of the tongue. Journal of Verbal Learning and Verbal Behavior, 14(4), 382–391. https://doi.org/10.1016/S0022-5371(75)80017-X
  • Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2015a). Parsimonious mixed models. arXiv, 1–27. arXiv:1506.04967
  • Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015b). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
  • Best, W. (1996). When racquets are baskets but baskets are biscuits, where do the words come from? A single case study of formal paraphasic errors in aphasia. Cognitive Neuropsychology, 13(3), 443–480. https://doi.org/10.1080/026432996381971
  • Blanken, G. (1990). Formal paraphasias: A single case study. Brain and Language, 38(4), 534–554. https://doi.org/10.1016/0093-934X(90)90136-5
  • Blanken, G. (1998). Lexicalisation in speech production: Evidence from form related word substitutions in aphasia. Cognitive Neuropsychology, 15(4), 321–360. https://doi.org/10.1080/026432998381122
  • Boersma, P., & Weenink, D. (2019). Praat: Doing phonetics by computer. http://www.praat.org/.
  • Bormann, T. (2011). The role of lexical-semantic neighborhood in object naming: Implications for models of lexical access. Frontiers in Psychology, 2, 1–11. https://doi.org/10.3389/fpsyg.2011.00127
  • Bormann, T., Kulke, F., & Blanken, G. (2008). The influence of word frequency on semantic word substitutions in aphasic naming. Aphasiology, 22(12), 1313–1320. https://doi.org/10.1080/02687030701679436
  • Brown, R., & McNeill, D. (1966). The “tip of the tongue” phenomenon. Journal of Verbal Learning and Verbal Behavior, 5(4), 325–337. https://doi.org/10.1016/S0022-5371(66)80040-3
  • Butterworth, B. (1989). Lexical access in speech production. In W. D. Marslen-Wilson (Ed.), Lexical representation and process (pp. 108–135). The MIT Press.
  • Collins, A. M., & Loftus, E. F. (1975). A spreading-activation theory of semantic processing. Psychological Review, 82(6), 407–428. https://doi.org/10.1037/0033-295X.82.6.407
  • Collins, A. M., & Quillian, M. R. (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 8, 240–247.
  • Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108(1), 204–256. https://doi.org/10.1037/0033-295x.108.1.204
  • Cree, G. S., & McRae, K. (2003). Analyzing the factors underlying the structure and computation of the meaning of chipmunk, cherry, chisel, cheese, and cello (and many other such concrete nouns). Journal of Experimental Psychology: General, 132(2), 163–201. https://doi.org/10.1037/0096-3445.132.2.163
  • Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review, 93(3), 283–321. https://doi.org/10.1037/0033-295X.93.3.283
  • Dell, G. S. (1990). Effects of frequency and vocabulary type on phonological speech errors. Language and Cognitive Processes, 5(4), 313–349. https://doi.org/10.1080/01690969008407066
  • Dell, G. S., Schwartz, M. F., Martin, N., Saffran, E. M., & Gagnon, D. A. (1997). Lexical access in aphasic and nonaphasic speakers. Psychological Review, 104(4), 801–838. https://doi.org/10.1037/0033-295X.104.4.801
  • Dell’Acqua, R., Lotto, L., & Job, R. (2000). Naming times and standardized norms for the Italian PD/DPSS set of 266 pictures: Direct comparisons with American, English, French, and Spanish published databases. Behavior Research Methods, Instruments, & Computers, 32(4), 588–615. https://doi.org/10.3758/BF03200832
  • del Viso, S., Igoa, J. M., & García-Albea, J. E. (1991). On the autonomy of phonological encoding: Evidence from slips of the tongue in Spanish. Journal of Psycholinguistic Research, 20(3), 161–185. https://doi.org/10.1007/BF01067213
  • Fay, D., & Cutler, A. (1977). Malapropisms and the structure of the mental lexicon. Linguistic Inquiry, 8(3), 505–520.
  • Fieder, N., Wartenburger, I., & Abdel Rahman, R. (2019). A close call: Interference from semantic neighbourhood density and similarity in language production. Memory & Cognition, 47(1), 145–168. https://doi.org/10.3758/s13421-018-0856-y
  • Fischer-Baum, S., Miozzo, M., Laiacona, M., & Capitani, E. (2016). Perseveration during verbal fluency in traumatic brain injury reflects impairments in working memory. Neuropsychology, 30(7), 791–799. https://doi.org/10.1037/neu0000286
  • Gagnon, D. A., Schwartz, M. F., Martin, N., Dell, G. S., & Saffran, E. M. (1997). The origins of formal paraphasias in aphasics’ picture naming. Brain and Language, 59(3), 450–472. https://doi.org/10.1006/brln.1997.1792
  • Gerhand, S., & Barry, C. (2000). When does a deep dyslexic make a semantic error? The roles of age-of-acquisition, concreteness, and frequency. Brain and Language, 74(1), 26–47. https://doi.org/10.1006/brln.2000.2320
  • Goldrick, M., Folk, J. R., & Rapp, B. (2010). Mrs. Malaprop’s neighborhood: Using word errors to reveal neighborhood structure. Journal of Memory and Language, 62(2), 113–134. https://doi.org/10.1016/j.jml.2009.11.008
  • Gordon, J. K. (2002). Phonological neighborhood effects in aphasic speech errors: Spontaneous and structured contexts. Brain and Language, 82(2), 113–145. https://doi.org/10.1016/S0093-934X(02)00001-9
  • Grossman, M., Robinson, K., Biassou, N., White-Devine, T., & D’Esposito, M. (1998). Semantic memory in Alzheimer’s disease: Representativeness, ontologic category, and material. Neuropsychology, 12(1), 34–42. https://doi.org/10.1037/0894-4105.12.1.34
  • Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2014). Multivariate data analysis. Pearson.
  • Hameau, S., Biedermann, B., Fieder, N., & Nickels, L. (2019). Investigation of the effects of semantic neighbours in aphasia: A facilitated naming study. Aphasiology, 34(7), 840–864. https://doi.org/10.1080/02687038.2019.1652241
  • Hameau, S., Nickels, L., & Biedermann, B. (2019). Effects of semantic neighbourhood density on spoken word production. Quarterly Journal of Experimental Psychology, 72(12), 2752–2775. https://doi.org/10.1177/1747021819859850
  • Harley, T. A., & MacAndrew, S. B. G. (2001). Constraints upon word substitution speech errors. Journal of Psycholinguistic Research, 30(4), 395–418. https://doi.org/10.1023/A:1010421724343
  • Holmes, S. J., & Ellis, A. W. (2006). Age of acquisition and typicality effects in three object processing tasks. Visual Cognition, 13(7–8), 884–910. https://doi.org/10.1080/13506280544000093
  • Howard, D., Nickels, L., Coltheart, M., & Cole-Virtue, J. (2006). Cumulative semantic inhibition in picture naming: Experimental and computational studies. Cognition, 100(3), 464–482. https://doi.org/10.1016/j.cognition.2005.02.006
  • Humphreys, G. W., Lamote, C., & Lloyd-Jones, T. J. (1995). An interactive activation approach to object processing: Effects of structural similarity, name frequency, and task in normality and pathology. Memory, 3(3–4), 535–586. https://doi.org/10.1080/09658219508253164
  • Jescheniak, J. D., & Levelt, W. J. M. (1994). Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(4), 824–843. https://doi.org/10.1037/0278-7393.20.4.824
  • Jolicoeur, P., Gluck, M. A., & Kosslyn, S. M. (1984). Pictures and names: Making the connection. Cognitive Psychology, 16(2), 243–275. https://doi.org/10.1016/0010-0285(84)90009-4
  • Kello, C. T. (2004). Control over the time course of cognition in the tempo-naming task. Journal of Experimental Psychology: Human Perception and Performance, 30(5), 942–955. https://doi.org/10.1037/0096-1523.30.5.942
  • Kittredge, A. K., Dell, G. S., Verkuilen, J., & Schwartz, M. F. (2008). Where is the effect of frequency in word production? Insights from aphasic picture-naming errors. Cognitive Neuropsychology, 25(4), 463–492. https://doi.org/10.1080/02643290701674851
  • Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. https://doi.org/10.18637/jss.v082.i13
  • Lampe, L. F., Fieder, N., Krajenbrink, T., & Nickels, L. (2017). Semantische Nachbarschaft in der Wortproduktion bei Aphasie [Semantic neighbourhood in word production in aphasia]. In A. Adelt, Ö. Yetim, C. Otto, & T. Fritzsche (Eds.), Spektrum Patholinguistik – Panorama Patholinguistik: Sprachwissenschaft trifft Sprachtherapie (Vol. 10, pp. 103–114). Universitätsverlag Potsdam. https://publishup.uni-potsdam.de/opus4-ubp/frontdoor/index/index/docId/39701.
  • Lampe, L. F., Hameau, S., Fieder, N., & Nickels, L. (2021). Effects of semantic variables on word production in aphasia. Cortex, 141, 363–402. https://doi.org/10.1016/j.cortex.2021.02.020
  • Lampe, L. F., Hameau, S., & Nickels, L. (2021). Semantic variables both help and hinder word production: Behavioral evidence from picture naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 48(1), 72–97. https://doi.org/10.1037/xlm0001050
  • Lampe, L. F., Hameau, S., & Nickels, L. (2023). Are they really stronger? Comparing effects of semantic variables in speeded deadline and standard picture naming. Quarterly Journal of Experimental Psychology, 76(4), 762–782.
  • Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211–240. https://doi.org/10.1037/0033-295X.104.2.211
  • Levelt, W. J. M. (1989). Speaking: From intention to articulation. MIT Press.
  • Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22(1), 1–75. https://doi.org/10.1017/S0140525X99001776
  • Lloyd-Jones, T. J., & Nettlemill, M. (2007). Sources of error in picture naming under time pressure. Memory & Cognition, 35(4), 816–836. https://doi.org/10.3758/BF03193317
  • Mahon, B. Z., Costa, A., Peterson, R., Vargas, K. A., & Caramazza, A. (2007). Lexical selection is not by competition: A reinterpretation of semantic interference and facilitation effects in the picture-word interference paradigm. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33(3), 503–535. https://doi.org/10.1037/0278-7393.33.3.503
  • Martin, N., Dell, G. S., Saffran, E. M., & Schwartz, M. F. (1994). Origins of paraphasias in deep dysphasia: Testing the consequences of a decay impairment to an interactive spreading activation model of lexical retrieval. Brain and Language, 47(4), 609–660. https://doi.org/10.1006/brln.1994.1061
  • Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. https://doi.org/10.1016/J.JML.2017.01.001
  • McRae, K., Cree, G. S., Seidenberg, M. S., & McNorgan, C. (2005). Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods, 37(4), 547–559. https://doi.org/10.3758/BF03192726
  • McRae, K., Cree, G. S., Westmacott, R., & De Sa, V. R. (1999). Further evidence for feature correlations in semantic memory. Canadian Journal of Experimental Psychology, 53(4), 360–373. https://doi.org/10.1037/h0087323
  • McRae, K., de Sa, V. R., & Seidenberg, M. S. (1997). On the nature and scope of featural representations of word meaning. Journal of Experimental Psychology: General, 126(2), 99–130. https://doi.org/10.1037/0096-3445.126.2.99
  • Mirman, D. (2011). Effects of near and distant semantic neighbors on word production. Cognitive, Affective and Behavioral Neuroscience, 11(1), 32–43. https://doi.org/10.3758/s13415-010-0009-7
  • Mirman, D., & Graziano, K. M. (2013). The neural basis of inhibitory effects of semantic and phonological neighbors in spoken word production. Journal of Cognitive Neuroscience, 25(9), 1504–1516. https://doi.org/10.1162/jocn_a_00408
  • Morrison, C. M., Ellis, A. W., & Quinlan, P. T. (1992). Age of acquisition, not word frequency, affects object naming, not object recognition. Memory & Cognition, 20(6), 705–714. https://doi.org/10.3758/BF03202720
  • Moses, M. S., Nickels, L., & Sheard, C. (2004). „I’m sitting here feeling aphasic!” A study of recurrent perseverative errors elicited in unimpaired speakers. Brain and Language, 89(1), 157–173. https://doi.org/10.1016/S0093-934X(03)00364-X
  • Moses, M. S., Nickels, L. A., & Sheard, C. (2007). Chips, cheeks and carols: A review of recurrent perseveration in speech production. Aphasiology, 21(10–11), 960–974. https://doi.org/10.1080/02687030701198254
  • Navarrete, E., Del Prato, P., Peressotti, F., & Mahon, B. Z. (2014). Lexical selection is not by competition: Evidence from the blocked naming paradigm. Journal of Memory and Language, 76, 253–272. https://doi.org/10.1016/j.jml.2014.05.003
  • Nickels, L. (1995). Getting it right? Using aphasic naming errors to evaluate theoretical models of spoken word production. Language and Cognitive Processes, 10(1), 13–45. https://doi.org/10.1080/01690969508407086
  • Nickels, L., & Howard, D. (1995). Aphasic naming: What matters? Neuropsychologia, 33(10), 1281–1303. https://doi.org/10.1016/0028-3932(95)00102-9
  • Nickels, L., Lampe, L. F., Mason, C., & Hameau, S. (2022). Investigating the influence of semantic factors on word retrieval: Reservations, results and recommendations. Cognitive Neuropsychology, 39(3–4), 113–154. https://doi.org/10.1080/02643294.2022.2109958
  • Oppenheim, G. M., Dell, G. S., & Schwartz, M. F. (2010). The dark side of incremental learning: A model of cumulative semantic interference during lexical access in speech production. Cognition, 114(2), 227–252. https://doi.org/10.1016/j.cognition.2009.09.007
  • Patterson, K., & Erzinçlioǧlu, S. W. (2008). Drawing as a ‘window’ on deteriorating conceptual knowledge in neurodegenerative disease. In C. Lange-Küttner & A. Vinter (Hrsg.), Drawing and the Non-Verbal Mind (1. Aufl., S. 281–304). Cambridge University Press. https://doi.org/10.1017/CBO9780511489730.014.
  • Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global vectors for word representation. Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543. https://doi.org/10.3115/v1/D14-1162.
  • Perret, C., & Bonin, P. (2019). Which variables should be controlled for to investigate picture naming in adults? A Bayesian meta-analysis. Behavior Research Methods, 51(6), 2533–2545. https://doi.org/10.3758/s13428-018-1100-1
  • Pexman, P. M., Hargreaves, I. S., Edwards, J. D., Henry, L. C., & Goodyear, B. G. (2007). The neural consequences of semantic richness: When more comes to mind, less activation is observed. Psychological Science, 18(5), 401–406. https://doi.org/10.1111/j.1467-9280.2007.01913.x
  • Plaut, D. C., & Shallice, T. (1993). Deep dyslexia: A case study of connectionist neuropsychology. Cognitive Neuropsychology, 10(5), 377–500. https://doi.org/10.1080/02643299308253469
  • Rabovsky, M., & McRae, K. (2014). Simulating the N400 ERP component as semantic network error: Insights from a feature-based connectionist attractor model of word meaning. Cognition, 132(1), 68–89. https://doi.org/10.1016/j.cognition.2014.03.010
  • Rabovsky, M., Schad, D. J., & Abdel Rahman, R. (2016). Language production is facilitated by semantic richness but inhibited by semantic density: Evidence from picture naming. Cognition, 146, 240–244. https://doi.org/10.1016/j.cognition.2015.09.016
  • Rabovsky, M., Schad, D. J., & Abdel Rahman, R. (2021). Semantic richness and density effects on language production: Electrophysiological and behavioral evidence. Journal of Experimental Psychology: Learning Memory and Cognition, 47(3), 508–517. https://doi.org/10.1037/xlm0000940
  • Roelofs, A. (1997). The WEAVER model of word-form encoding in speech production. Cognition, 64(3), 249–284. https://doi.org/10.1016/S0010-0277(97)00027-9
  • Rogers, T. T., Patterson, K., Jefferies, E., & Lambon Ralph, M. A. (2015). Disorders of representation and control in semantic cognition: Effects of familiarity, typicality, and specificity. Neuropsychologia, 76, 220–239. https://doi.org/10.1016/j.neuropsychologia.2015.04.015
  • Rogerson, P. (2011). Statistical methods for geography. SAGE Publications, Ltd. https://doi.org/10.4135/9781849209953.
  • Rosch, E. H., & Mervis, C. B. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7(4), 573–605. https://doi.org/10.1016/0010-0285(75)90024-9
  • Rose, S. B., & Abdel Rahman, R. (2016). Cumulative semantic interference for associative relations in language production. Cognition, 152, 20–31. https://doi.org/10.1016/j.cognition.2016.03.013
  • Rose, S. B., Aristei, S., Melinger, A., & Abdel Rahman, R. (2019). The closer they are, the more they interfere: Semantic similarity of word distractors increases competition in language production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 45(4), 753–763. https://doi.org/10.1037/xlm0000592
  • Rossiter, C., & Best, W. (2013). “Penguins don’t fly”: An investigation into the effect of typicality on picture naming in people with aphasia. Aphasiology, 27(7), 784–798. https://doi.org/10.1080/02687038.2012.751579
  • RStudio Team. (2022). RStudio: Integrated development for R. RStudio, PBC. https://www.rstudio.com/.
  • Taylor, K. I., Devereux, B. J., Acres, K., Randall, B., & Tyler, L. K. (2012). Contrasting effects of feature-based statistics on the categorisation and basic-level identification of visual objects. Cognition, 122(3), 363–374. https://doi.org/10.1016/j.cognition.2011.11.001
  • van Heuven, W. J. B., Mandera, P., Keuleers, E., & Brysbaert, M. (2014). SUBTLEX-UK: A new and improved word frequency database for British English. Quarterly Journal of Experimental Psychology, 67(6), 1176–1190. https://doi.org/10.1080/17470218.2013.850521
  • Vitevitch, M. S. (1997). The Neighborhood Characteristics of Malapropisms. https://journals-sagepub-com.simsrad.net.ocs.mq.edu.au/doi/abs/10.1177002383099704000301.
  • Vitevitch, M. S. (2002). The influence of phonological similarity neighborhoods on speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28(4), 735–747. https://doi.org/10.1037/0278-7393.28.4.735
  • Vitkovitch, M., & Humphreys, G. W. (1991). Perseverant responding in speeded naming of pictures: It’s in the links. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17(4), 664–680. https://doi.org/10.1037/0278-7393.17.4.664
  • Vitkovitch, M., Humphreys, G. W., & Lloyd-Jones, T. J. (1993). On naming a giraffe a zebra: Picture naming errors across different object categories. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19(2), 243–259. https://doi.org/10.1037/0278-7393.19.2.243
  • Woollams, A. M. (2012). Apples are not the only fruit: The effects of concept typicality on semantic representation in the anterior temporal lobe. Frontiers in Human Neuroscience, 6, 1–9. https://doi.org/10.3389/fnhum.2012.00085
  • Woollams, A. M., Cooper-Pye, E., Hodges, J. R., & Patterson, K. (2008). Anomia: A doubly typical signature of semantic dementia. Neuropsychologia, 46(10), 2503–2514. https://doi.org/10.1016/j.neuropsychologia.2008.04.005
  • Woollams, A. M., Ralph, M. A. L., Plaut, D. C., & Patterson, K. (2007). SD-squared: On the association between semantic dementia and surface dyslexia. Psychological Review, 114(2), 316–339. https://doi.org/10.1037/0033-295X.114.2.316

Appendices

Appendix A – Response coding

Table A1. Response coding with examples.

Appendix B – Primary Analyses: Linear mixed effect model outputs

Full model outputs from models of the Primary Analyses analysing target-error relationships with regard to typicality, number of semantic features, and semantic similarity on 1966 observations with complete information on semantic similarity.

Typicality

Table B1. Summary of Primary Analyses linear mixed model for typicality (n = 1966 observations).

Number of semantic features

Table B2. Summary of Primary Analyses linear mixed model for number of semantic features (n = 1966 observations).

Semantic similarity

Table B3. Summary of Primary Analyses linear mixed model for semantic similarity (n = 1966 observations).

Appendix C – Supplementary Analyses Set 1 on coordinate errors: Distribution of naming errors for semantic variables and linear mixed effect model outputs

Below are the full model outputs from models of the Supplementary Analyses Set 1 analysing target-error relationships with regard to typicality, number of semantic features, and semantic similarity on 1677 coordinate errors.

Typicality

Table C1. Summary of Supplementary Analyses Set 1 linear mixed model for typicality on coordinate errors (n = 1677 observations).

Number of semantic features

Table C2. Summary of Supplementary Analyses Set 1 linear mixed model for number of semantic features on coordinate errors (n = 1677 observations).

Semantic similarity

Table C3. Summary of Supplementary Analyses Set 1 linear mixed model for semantic similarity on coordinate errors (n = 1677 observations).

Appendix D – Supplementary Analyses Set 2 on all available naming errors without including semantic similarity: Naming errors included in the analyses and linear mixed effect model outputs

Table D1. Counts and percentage of errors of each error type included in Supplementary Analyses Set 2.

Typicality

Table D2. Summary of Supplementary Analyses Set 2 linear mixed model for typicality on all available data points (n = 2349 observations).

Table D3. Summary of Supplementary Analyses Set 2 linear mixed model for typicality on all available coordinate errors (n = 1917 observations).

Number of semantic features

Table D4. Summary of Supplementary Analyses Set 2 linear mixed model for number of semantic features on all available data points (n = 2349 observations).

Table D5. Summary of Supplementary Analyses Set 2 linear mixed model for number of semantic features on all available coordinate errors (n = 1917 observations).

Appendix E – Supplementary Analyses Set 3 with separate values for the fixed effects: linear mixed effect model outputs

Below are the full model outputs from models of Supplementary Analyses Set 3 analysing target-error relationships with separate values for fixed effects with regard to typicality, number of semantic features, and semantic similarity on the 1966 observations of the Primary Analyses, 1677 observations of Supplementary Analyses Set 1 on coordinate errors, and 2349 or 1917 observations of Supplementary Analyses Set 2 without including the fixed effect of semantic similarity.

Typicality

Table E1. Summary of Supplementary Analyses Set 3 linear mixed model for typicality with separate values for target and response for control variables as fixed effects (analogous to Primary Analyses on data points with complete information for semantic similarity measure, n = 1966 observations).

Table E2. Summary of Supplementary Analyses Set 3 linear mixed model for typicality on coordinate errors with separate values for target and response for control variables as fixed effects (analogous to Supplementary Analyses Set 1 on data points with complete information for semantic similarity measure, n = 1677 observations).

Table E3. Summary of Supplementary Analyses Set 3 linear mixed model for typicality on all available data points with separate values for target and response for control variables as fixed effects (analogous to Supplementary Analyses Set 2, n = 2349 observations).

Table E4. Summary of Supplementary Analyses Set 3 linear mixed model for typicality on all available coordinate errors with separate values for target and response for control variables as fixed effects (analogous to Supplementary Analyses Set 2, n = 1917 observations).

Number of semantic features

Table E5. Summary of Supplementary Analyses Set 3 linear mixed model for number of semantic features with separate values for target and response for control variables as fixed effects (analogous to Primary Analyses on data points with complete information for semantic similarity measure, n = 1966 observations).

Table E6. Summary of Supplementary Analyses Set 3 linear mixed model for number of semantic features on coordinate errors with separate values for target and response for control variables as fixed effects (analogous to Supplementary Analyses Set 1 on data points with complete information for semantic similarity measure, n = 1677 observations).

Table E7. Summary of Supplementary Analyses Set 3 linear mixed model for number of semantic features on all available data points with separate values for target and response for control variables as fixed effects (analogous to Supplementary Analyses Set 2, n = 2349 observations).

Table E8. Summary of Supplementary Analyses Set 3 linear mixed model for typicality on all available coordinate errors with separate values for target and response for control variables as fixed effects (analogous to Supplementary Analyses Set 2, n = 1917 observations).

Semantic similarity

Table E9. Summary of Supplementary Analyses Set 3 linear mixed model for semantic similarity with separate values for target and response for control variables as fixed effects (analogous to Primary Analyses on data points with complete information for semantic similarity measure, n = 1966 observations).

Table E10. Summary of Supplementary Analyses Set 3 linear mixed model for semantic similarity on coordinate errors with separate values for target and response for control variables as fixed effects (analogous to Supplementary Analyses Set 1 on data points with complete information for semantic similarity measure, n = 1677 observations).