
The effect of gestures on the interpretation of plural references

Pages 454-469 | Received 02 Apr 2021, Accepted 22 Oct 2021, Published online: 02 Nov 2021

ABSTRACT

During language comprehension, comprehenders build conceptual models of linguistic content. Despite their fundamental nature, relatively little is known about how plural expressions are realised in conceptual models. Specifically, it is unknown whether these models contain quantity information when it is left unspecified in the noun phrase. This paper reports two studies that investigate the effect of gestures on plural conceptual representations. Experiment 1 confirmed that small gestures evoke small set sizes and large gestures evoke large set sizes. Experiment 2 replicated these findings and showed that gestures seem to influence comprehenders' representations of number information rather than the amount of space a plural reference occupies. Taken together, these data suggest that when plurals are instantiated in comprehenders' minds, spatial information provided by gestures is used by comprehenders to make inferences about the quantity of objects in a plural set.

The effect of gestures on the interpretation of plural referents

The goal of any linguistic act is to transmit an idea from a speaker’s mind to a listener’s mind. In order to successfully comprehend the idea, the listener must create a conceptual representation of the linguistic content. There has been much debate over the nature of these kinds of conceptual representations (e.g. Barsalou, Citation1999; Johnson-Laird, Citation1983). An important piece of building a conceptual model is resolving references. References point to real world entities, but most of the time the referent is unavailable in the visual context during comprehension. Understanding the factors that influence comprehenders’ representation of references is key to understanding how ideas are instantiated in the mind. Importantly, while references may be physically absent from the visual context, speakers may use gestures to indicate those references. The goal of this paper is to understand how gestures, as part of the visual context, can influence comprehenders’ representations of plural references.

Much work on reference has focused on singular concrete objects; however, a complete understanding of how comprehension works must include an understanding of how more abstract references, such as events and plurals, are realised in the minds of comprehenders. The work presented here is focused on plural references. A fundamental aspect of human language is the ability to refer to quantities of individuals and not merely specific individuals. In English, for example, this can be accomplished by introducing a plural (e.g. cats), a conjoined noun phrase (e.g. the cat and the dog), a quantified expression (e.g. each cat), or a numerical expression (e.g. two cats) into the discourse. Plural references are interesting because their conceptual representations include properties of either the plural set, the individuals that comprise the set, or both, and different types of linguistic reference seem to introduce or focus one or the other kind of representation (e.g. Moxey et al., Citation2004; Patson, Citation2014). Despite their fundamental nature, there is still relatively little known about the nature of conceptual quantity information relevant to plural expressions. For conjoined noun phrases (e.g. Mary and John), quantity representation is somewhat straightforward: Comprehenders seem to create a representation that includes two distinct individuals (e.g. Moxey et al., Citation2004; Patson & Ferreira, Citation2009; Patson & Warren, Citation2011). However, unquantified plurals (e.g. the lovers) do not seem to have that kind of representation (Patson, Citation2014; Patson & Ferreira, Citation2009; Patson & Warren, Citation2011), and it is implausible that comprehenders could have a conceptual representation with an exact number of entities for quantified expressions like There were 20,000 fans in the bleachers (Johnson-Laird, Citation1983). So then how does the quantity information for non-conjoined plurals get realised in the conceptual representation?

It may seem plausible that a comprehender’s conceptual representation of a plural reference might be represented as a small, manipulable set of tokens. Indeed, this view was proposed by Johnson-Laird (Citation1983) and there is evidence that some plurals may be represented as a set of individuated tokens. For example, in syllogistic reasoning tasks, when asked to draw a model of the reasoning process for syllogisms with plural expressions (e.g. All horses are animals. Therefore, all horses’ heads are animal heads.), participants draw a small number (e.g. 3 or 4) of tokens to represent the plural expression (e.g. Bucciarelli & Johnson-Laird, Citation1999). This suggests that comprehenders’ conceptual models for unspecified plurals contain a small number of tokens. Indeed, other work has shown that when number is specified in the linguistic content, comprehenders include exactly that number of entities in their conceptual representation if the number is within the subitisation range (Beg et al., Citation2019; Patson, Citation2021; Patson et al., Citation2014; Šetić & Domijan, Citation2017). For example, Šetić and Domijan (Citation2017) found that participants were faster to respond to a picture of exactly four dogs compared to a picture of three dogs after reading a sentence that contained the expression four dogs. Importantly, this effect was not driven simply by number alone; that is, participants were not faster to respond to a picture of four cats after reading four dogs. This suggests that comprehenders’ conceptual models contained a representation of the quantified expression that contained exact number and object identity information. However, when number is left unspecified or outside of the subitisation range, the conceptual representation of the plural does not seem to contain explicit representations of number (Patson, Citation2021; Patson et al., Citation2014). For example, Patson et al. (Citation2014) found that participants were no faster to respond to a picture of multiple apples than to a picture of one apple after reading a sentence like, The farmer saw the apples. They interpreted this finding as suggesting that comprehenders’ representations of unquantified plurals were “fuzzy”, and that quantity information is not explicitly realised in the conceptual model.

Although comprehenders may not include explicit quantity information for unquantified plurals, there is evidence that when quantity is linguistically implied, comprehenders’ conceptual representation includes implicit information about quantity. For example, Patson (Citation2016) used a picture-matching paradigm to show that when sentences include information about relative set size, comprehenders include that information in their conceptual representations. In Patson (Citation2016), participants read sentences like, The child carried a basket of apples, or The farmer saw the truck full of apples. Then they saw a picture of either a small, countable number of apples or a picture of a large uncountable number of apples. Participants were faster to judge that the picture depicted apples when the number of objects implied in the sentence matched the relative number of objects depicted in the picture. These data suggest that comprehenders create conceptual representations that include relative set size information when that information is provided in the discourse. However, it is still uncertain how that quantity information gets encoded.

The above studies are reading studies, but language comprehension also occurs during speech. And when people talk, they often gesture. In social communication situations, gestures serve to help speakers depict concrete and abstract concepts (e.g. Cienki, Citation1998; Goldin-Meadow, Citation2003) and facilitate lexical access (e.g. Krauss et al., Citation1991). What are known as iconic or representational gestures are amongst the most common type of gestures (Kendon, Citation2004; McNeill, Citation1992). Representational gestures are movements that convey spatial or motor information using the semiotic strategy of signalling meaning through resemblance. An example of a representational gesture is extending the index finger and drawing the shape of an imaginary diamond to indicate a diamond. Relevant to the topic of plurals, gestural iconicity plays a role in the transmission of abstract ideas. For example, representational gestures have been shown to change people’s temporal concepts (e.g. Jamalian & Tversky, Citation2012; Lewis & Stickles, Citation2017), facilitate learning mathematical concepts (e.g. Goldin-Meadow, Citation2003; Goldin-Meadow et al., Citation2001; Goldin-Meadow et al., Citation2009), and convey number information (e.g. Winter et al., Citation2013; Woodin et al., Citation2020).

In one study investigating the use of gestures to convey number and quantity information, Woodin et al. (Citation2020) found that newscasters often performed gestures when talking about quantities. In their study of the TV News Archive, Woodin et al. found that when talking about a “tiny number of people”, newscasters were likely to perform gestures such as a pinching-like gesture in which the gesturer drew her forefinger and thumb close together as if to hold that tiny number between her fingers. When talking about a “huge amount”, newscasters performed gestures in which their hands moved outward from their bodies, with flat, open palms facing one another. The space between their hands seemed to represent the large physical size of the number. Woodin et al. argued that when talking about quantities speakers activate spatial concepts in their minds and their gestures reflect the activation of those spatial concepts.

Research investigating number processing also shows that there is a deep mental connection between the conceptualisation of physical space and numerical quantity. For example, in their classic work, Henik and Tzelgov (Citation1982) found that people were quicker to judge which of two numbers is greater when the greater number is presented in larger typeface. Similarly, when asked to judge which of two quantities of dots is larger, participants are faster when the larger quantity takes up more space on the screen than the smaller quantity (Hurewitz et al., Citation2006). Interestingly, Lindemann et al. (Citation2007) found that the type of hand grip participants made was also related to number magnitude. In their experiment, participants made parity decisions about numerals presented on a computer screen. The type of grip required to make the response was varied. Lindemann et al. found that participants were faster to initiate a precision grip (a gesture in which the forefinger and thumb approach each other as if holding a tiny object) in response to smaller numbers and a power grip (a firm grip involving the full hand as if holding a larger object) in response to larger numbers. This finding suggests that there is a link between size-based spatial-numerical associations and manual actions. Similarly, when participants reach for blocks that happen to have numbers printed on them, they widen their grip aperture between the index finger and thumb if the number is greater regardless of the actual size of the block (Andres et al., Citation2004). In addition, participants’ rating of the graspability of a visually presented object influences their subsequent numerical processing ability (Ranzini et al., Citation2011), and viewing “pinching” gestures interferes with participants’ ability to process larger magnitudes (Badets & Pesenti, Citation2010). 
Taken together, these studies show that thinking about numbers is mentally connected with the actions used for interacting with different sized objects (see Winter et al., Citation2015 for a review).

Returning to gestures, representational gestures seem to be a window into the mind of speakers, with the potential to reveal metaphoric thought processes during language use. For example, Winter and colleagues suggest that their findings that speakers use large sweeping gestures when talking about large numbers and small, precision grips when talking about small numbers reflect the conceptual metaphors that are active in speakers’ minds when talking about quantity information. Considerable research has shown that representational gestures can influence how comprehenders understand language (for a review see Hostetter, Citation2011) perhaps by activating the same conceptual metaphors in listeners that are active in speakers. Importantly, representational gestures have been found to be more communicative when they are nonredundant with the speech stream (Church et al., Citation2007; Church & Goldin-Meadow, Citation1986; Emmorey & Casey, Citation2001; Hostetter, Citation2011; but see Dargue et al., Citation2019 for evidence that nonredundant gestures are not always more informative than redundant gestures) and when they convey spatial or motor information rather than abstract information (Alibali, Citation2005; Hostetter, Citation2011). For example, Church et al. (Citation2007) found that when speakers used gestures to convey information not present in speech, comprehenders incorporated that information into their understanding of the sentence. Specifically, participants who saw a speaker wave their hand in front of their nose to indicate a bad smell while saying It’s bad in that room, remembered the sentence as including information about a bad smell. In addition, Cook and Goldin-Meadow (Citation2006) found that children learning new math concepts performed better when the instructions contained a gesture that corresponded to the math concept, compared to when the instructions did not contain a similar gesture. 
One hypothesis to explain the role of gestures during language use is that gestures reflect the activation of perceptual and motor systems in speakers’ minds and can therefore activate those same systems in comprehenders’ minds (see Hostetter & Alibali, Citation2008).

The current study was designed to investigate whether gestures can be used to convey quantity information for unquantified plural references to comprehenders. If there is a deep mental connection between the conceptualisation of physical size and numerical quantity, then gestures of different physical sizes should be able to evoke differential quantity information for plurals during language comprehension. However, whether gestures can convey quantity information for plural expressions likely depends on how quantity information gets instantiated into comprehenders’ conceptual representations of unquantified plural sets. One hypothesis, based on previous work, is that the conceptual representation of an unquantified plural is “fuzzy” or underspecified and does not include any quantity information (Patson et al., Citation2014). A related hypothesis is that the quantity information is not included in visual terms but instead is realised abstractly (e.g. Johnson-Laird, Citation1983). Under this view, gestures may have no reliable impact on comprehenders’ conceptual representations of a plural set. This is because gestures reflect spatial information, whereas under the underspecified/abstract hypothesis, quantity is not included as a spatial property in the conceptual representation. A second hypothesis is that the conceptual representation of a plural is focused on the level of the group, and features of the group, such as its size and shape, are included in the conceptual representation (Patson, Citation2016; Citation2021; Treisman, Citation2006). Under this view, the shape of the group may indicate information about quantity, as quantity and group size are inextricably linked. Thus, quantity information could be instantiated in the conceptual representation through the spatial information encoded by the gesture. 
For example, if a speaker performs a gesture in which they spread their hands out widely while saying These cookies are for the party, comprehenders may interpret the gesture as indicating a large set (or pile) of cookies which may be interpreted as indicating a large number of cookies (rather than a pile of large cookies).

The goal of the current set of experiments was to (1) determine whether representational gestures could be used to influence plural conceptual representations, and (2) confirm that the gesture’s influence is on quantity and no other features related to the size of a plural set (i.e. object distribution). In order to test these hypotheses, participants watched videos of a speaker gesturing while talking about a plural set. Participants were then given a choice between two photos and asked to choose which of the photos the speaker was referring to. The current study used experimental controls rather than natural interactions to determine the effect of gestures. The gesture and speech stimuli used in this study were constructed to control for other factors, such as facial expressions or prosody, that could confound a listener’s response. The sentences used in this study were written such that they could plausibly refer to small or large set sizes. They were read by an actor whose tone of voice, facial expression, and body movements (other than the gesture) were kept constant. This allowed us to pinpoint the gesture used in the video as the source of comprehenders’ representations.

Experiment 1

The goal of Experiment 1 was to determine whether gestures could be used to influence comprehenders’ choice between a plural with a small set size and a plural with a large set size. Building on previous work showing that speakers use gestures to communicate quantity information (Winter et al., Citation2013; Woodin et al., Citation2020), we predict that large gestures will evoke a representation with a large number of objects while a small gesture will evoke a representation with a small number of objects. These predictions assume that quantity information is included in comprehenders’ conceptual representations by evoking spatial magnitude information (i.e. a large number of objects makes a large group). If quantity information is underspecified (Patson et al., Citation2014) or is included abstractly (Johnson-Laird, Citation1983) in comprehenders’ conceptual representations, then gestures may not influence participants’ choices.

Method

Participants

Sixty-eight volunteers (42 female, 26 male) were recruited as participants from the Center of Science and Industry (COSI) in Columbus, Ohio. The average age of the participants was 30 years and ages ranged between 18 and 69. We did not collect education information for the participants in this study, but data collected by our laboratory show that the average COSI visitor is highly educated: nearly 75% either have or are in the process of earning an undergraduate degree. All were native speakers of English and had normal or corrected-to-normal eyesight. Participants had no history of hearing or speech impairments.

Apparatus

The trials were presented using E-Prime v.2 experimental software (Schneider et al., Citation2002). A Dell P2412H 24-inch monitor (1920 × 1080 pixels) displayed stimuli with a screen refresh rate of 60 Hz. Keyboard presses were used to log responses and record reaction time.

Design and stimuli

The experiment had a single-factor design with three levels: small gesture, large gesture, and no gesture. A total of 54 sentences including a plural noun were used for this experiment (e.g. Ming ordered these cupcakes). For each condition, a different video was made. In the video, the speaker produced the sentence out loud while gesturing on the demonstrative pronoun (e.g. these, those) in the two gesture videos. In the “no gesture” control condition, the speaker produced the sentence while keeping her hands in her lap. More details about the gestures are below. For each sentence, two pictures were created of the objects described in the sentence. One picture contained a small set of objects, and one contained a large set of objects. See Figure 1 for an example of images used in the experiment. In half of the trials, the small picture appeared on the left; in the other half, the small picture appeared on the right. A set of 43 filler items was also created. For the filler items, the same speaker used in the experimental videos spoke a sentence describing a scene. After watching the video, the participants saw two pictures, one of which was clearly correct. For example, the sentence, The turtles are next to the pond, was followed by a picture of three turtles next to a pond and a picture of several ducks near a pond. This ensured that participants were paying attention to the speech in the video. For half of the filler trials, the correct picture appeared on the right side of the screen; for the other half, the correct picture appeared on the left side of the screen. The experimental items were counterbalanced across three lists using a Latin-square procedure such that participants saw all three conditions, but only one version of each item. Each list contained the same filler items.
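The Latin-square counterbalancing described above can be illustrated with a minimal sketch. The rotation rule (item index plus list index, modulo the number of conditions) is an assumption made for illustration; the source states only that a Latin-square procedure was used across three lists, and the function and variable names are hypothetical.

```python
from collections import Counter

CONDITIONS = ["small gesture", "large gesture", "no gesture"]
N_ITEMS = 54   # experimental sentences, as reported in the text
N_LISTS = 3    # counterbalancing lists, as reported in the text


def build_lists(n_items=N_ITEMS, conditions=CONDITIONS, n_lists=N_LISTS):
    """Return one dict per list mapping item index -> gesture condition.

    Assumed rotation scheme: the condition for an item shifts by one
    position on each successive list.
    """
    lists = []
    for list_idx in range(n_lists):
        assignment = {
            item: conditions[(item + list_idx) % len(conditions)]
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists


lists = build_lists()
# Number of items per condition on each list.
per_list_counts = [Counter(assignment.values()) for assignment in lists]
```

Under this scheme each list contains 18 items per condition, and every item appears in a different condition on each of the three lists, so a participant assigned to one list sees all three conditions but only one version of each item.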

Figure 1. Example pictures used in Experiment 1 for the sentence, These leaves fell from the tree. (a) small number picture choice, (b) large number picture choice. [To view this figure in colour, please see the online version of this journal.]


Gesture description

The same speaker appeared in all the videos used in both experiments reported here. In the small gesture condition, the speaker began with her hands in her lap and produced the gesture concurrently with the demonstrative pronoun and the plural noun that followed it. The gesture was produced with only one hand, either the left or the right, each with equal frequency, while the other hand remained in her lap. The gesture was performed in close proximity to the speaker’s body and at a mid-torso height, just above the elbow. The hand assumed slightly varied orientations, always with the thumb and index finger extended in a C-shape. In some instances, all five fingers were extended and used to produce the C-shape. The speaker also varied the rotation of her wrist, which altered the direction the palm was facing. This was done to enhance naturalness. Following the utterance of the plural noun, the gesturing hand returned to the speaker’s lap. See Figure 2.

Figure 2. Screenshots from small gesture video used in Experiment 1 and 2 for the sentence, Macy gathered these bottles. [To view this figure in colour, please see the online version of this journal.]


The large gestures covered more physical space than the small gestures. As in the small gesture condition, the speaker began with her hands in her lap. The large gestures used either one or both hands, each occurring equally often, and were signalled by the raising of the hand(s) to a height level with the speaker’s elbow(s). The arm(s) then moved outward in a horizontal fashion, extending as far from the speaker’s body as naturally possible, the furthest extension occurring in succession with the demonstrative pronoun. Following the plural noun, the hand(s) returned to the speaker’s lap. See Figure 3.

Figure 3. Screenshots from large gesture video used in Experiment 1 and 2 for the sentence, Ms. Smith borrowed these crayons. [To view this figure in colour, please see the online version of this journal.]


Gestures used in the filler items were distinct from the experimental gestures. They were not always cued with a demonstrative pronoun or the plural noun but were congruent with the context of their sentence. Both deictic (pointing gestures) and iconic gestures were used in the filler videos. In all the videos, the speaker’s hands began either on her lap, or held close to her body at elbow-height. The gestures were performed with either one or both of her hands. For the deictic gestures, the hands took various shapes, such as flattened horizontally with palms facing either up or down, or finger extensions, but always motioned outward in different directions from the speaker’s body to work as an indicator towards an implied object(s). The iconic gestures naturally prompted different hand shapes, as well, depending on the context of the sentence. For example, in the sentence, The squirrels ran up the tree, the speaker used an extended index finger to represent the actor in the scene and began with her hand palm-down. While uttering the verb, the speaker rotated her hand so that her palm faced up, while simultaneously raising her entire hand, in order to suggest that the actor was moving in an upward direction.

Procedure

Participants were tested individually or in pairs. After they provided informed consent, they were given a verbal introduction to the experiment. Then the computer guided them through example trials followed by two practice trials. Each trial began with a centre-justified fixation cross. When participants pressed the space bar, the cross was replaced by a video. Participants watched the video and listened to the speaker over a pair of headphones. After the video finished, participants saw two pictures on the screen, one centred on the left of the screen and one centred on the right of the screen. Participants were asked to select the picture that the speaker was likely talking about by using either the “f” key to choose the picture on the left or the “j” key to choose the picture on the right. The pictures disappeared only when participants made their response. The participants’ button presses were recorded. The entire experiment lasted about 15 minutes.

Data analysis

The data were analyzed in R (version 4.0.3; R Core Team, Citation2020) by fitting a mixed logistic regression (with logit link function) using the glmer function in the lme4 package (Bates et al., Citation2014). All analyses involved the random effects model with random intercepts and random slopes of gesture condition for subjects and items (Barr et al., Citation2013). As some items or participants may have stronger/weaker gesture effects, these random effects are theoretically justified. Each model included sum-coded fixed effects of gesture type. Data and analysis code for all experiments reported in the present study are available at the second author’s OSF webpage (DOI 10.17605/OSF.IO/JXZ86).

In the first model, participants’ responses were coded to reflect a preference for the picture that depicted a small number of objects (1= small choice, 0= large choice). An initial model with the maximal number of random effects specified by the design failed to converge. A backwards stepping procedure was used to incrementally remove random effects that accounted for the least amount of variance until the model converged (Barr et al., Citation2013). The final model included random intercepts for participants and items.
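As a rough illustration of the coding scheme described above, the sketch below shows how a single trial might be turned into a binary response plus two sum-coded contrast columns for the three-level gesture factor. The assignment of the −1 row to the no-gesture level is an assumption (the source says only that the fixed effects were sum-coded), and the names are hypothetical; the actual analysis was run in R with lme4, not with this code.

```python
# Assumed sum (deviation) coding for a three-level factor: two contrast
# columns, with one level (here, arbitrarily, "no gesture") coded -1 on both.
SUM_CODES = {
    "small gesture": (1, 0),
    "large gesture": (0, 1),
    "no gesture": (-1, -1),
}


def code_trial(condition, chose_small):
    """Return (response, contrast1, contrast2) for one trial.

    response follows the first model's coding: 1 = small choice, 0 = large choice.
    """
    response = 1 if chose_small else 0
    contrast1, contrast2 = SUM_CODES[condition]
    return response, contrast1, contrast2


rows = [
    code_trial("small gesture", chose_small=True),
    code_trial("no gesture", chose_small=False),
]
# With sum coding, each contrast column sums to zero across the levels.
column_sums = [sum(codes[i] for codes in SUM_CODES.values()) for i in range(2)]
```

Because each contrast column sums to zero across levels, the intercept in a sum-coded model corresponds to the grand mean (on the logit scale) rather than to any single condition.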

Results and discussion

A proportion of correct responses for filler items was computed for each participant. Accuracy rates were high (M = 0.93, SD = 0.16) indicating participants were paying attention during the experiment.

As predicted, in the control condition, participants’ responses were nearly at chance (proportion M = 0.49), indicating that participants randomly chose between the large and small pictures when no gesture was used by the speaker. As the data in Table 1 indicate, the gestures did have a significant effect on participants’ picture choice. When the gesture was large, participants were less likely to choose the small picture compared to the control condition (proportion M = 0.36) and when the gesture was small, participants were more likely to choose the small picture compared to the control condition (proportion M = 0.56). Consistent with Winter and Duffy (Citation2020), the hand used for the gesture did not influence the effect (proportion means for small gestures: left hand = 0.54, right hand = 0.58; proportion means for large gestures: both hands = 0.38, left hand = 0.35, right hand = 0.36).

Table 1. Fixed effects for the logistic mixed model for small picture choice by gesture condition in Experiment 1.

A second model was fit in which participants’ responses were coded to reflect a preference for the picture that depicted a large number of objects (0 = small choice, 1 = large choice). Because these are binary data, recoding the response only changed the signs on the estimates. However, it was conceptually necessary to run a second model because our hypothesis concerns the match between the gesture and the picture that was chosen. For small gestures, “match” means choosing a small picture; for large gestures, “match” means choosing a large picture. Our hypothesis requires us to compare “match” choices to the same picture choice in the control condition. For small gestures that means comparing to a small picture choice in the control condition, and for large gestures that means comparing to a large picture choice in the control condition. As the data in Table 2 indicate, the gestures did have a significant effect on participants’ picture choice. When the gesture was large, participants were more likely to choose the large picture compared to the control condition (proportion M = 0.63) and when the gesture was small, participants were less likely to choose the large picture compared to the control condition (proportion M = 0.43). Consistent with Winter and Duffy (Citation2020), the hand used for the gesture did not influence the effect (proportion means for small gestures: left hand = 0.70, right hand = 0.74; proportion means for large gestures: both hands = 0.78, left hand = 0.74, right hand = 0.72).
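The point that recoding a binary response only flips the signs of the estimates follows from the identity logit(1 − p) = −logit(p). The sketch below checks this on the raw small-choice proportions reported for Experiment 1. It works with descriptive proportions only and is not a re-analysis; the fitted model estimates in the tables come from the mixed model and will differ from these raw log-odds differences.

```python
import math


def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1 - p))


# Raw proportions of small-picture choices reported in the text.
small_choice = {"no gesture": 0.49, "large gesture": 0.36, "small gesture": 0.56}

control = small_choice["no gesture"]

# Log-odds difference from control when the response is coded 1 = small choice.
effects_small_coding = {
    cond: logit(p) - logit(control)
    for cond, p in small_choice.items() if cond != "no gesture"
}

# The same comparison when the response is recoded 1 = large choice.
effects_large_coding = {
    cond: logit(1 - p) - logit(1 - control)
    for cond, p in small_choice.items() if cond != "no gesture"
}
```

Each entry in `effects_large_coding` is the negative of the corresponding entry in `effects_small_coding`, which is why the second model carries the same information as the first with the signs reversed.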

Table 2. Fixed effects for the logistic mixed model for large picture choice by gesture condition in Experiment 1.

The data reported here are consistent with the hypothesis that the conceptual representation of a plural is focused on the level of the group and features of the group, such as size and shape of the group, are included in the conceptual representation (Patson, Citation2016; Citation2021; Treisman, Citation2006) and that the shape of the group may indicate information about quantity. Consistent with previous work (Winter et al., Citation2013; Woodin et al., Citation2020), large gestures seem to evoke a representation of a plural set with a large number of objects while small gestures seem to evoke a representation of a plural set with a small number of objects. These data suggest that quantity information can be instantiated in a plural reference’s conceptual representation through the spatial information encoded by a gesture. Experiment 2 was designed to investigate an alternative way in which gestures may influence the conceptual representation of plural references, namely, the amount of space the plural set occupies.

Experiment 2

The data reported in Experiment 1 suggest that gestures can be used to convey quantity information: Large gestures seem to evoke large quantities, while small gestures seem to evoke small quantities. These data are consistent with the hypothesis that quantity information is encoded spatially in a conceptual representation of a plural reference. Importantly, there is another way in which gestures could influence the conceptual representation of a plural. It is well known that gestures can be used to indicate the size of a reference (e.g. Beattie & Shovelton, Citation2006; Holler & Stevens, Citation2007). For example, Holler and Stevens (Citation2007) asked participants to describe specific target objects from scenes from Where’s Wally to each other. They found that participants encoded reference size information through gestures during their speech. Importantly, the size of an object (or group) is perfectly correlated with the amount of space that object takes up. Thus, it is possible that gestures might be used to convey how much space is taken up by an object or group and not an indication of quantity. It is important to note that there are many ways in which size can be communicated through gesture (e.g. hand shape, arm positioning; e.g. Hassemer & Winter, Citation2016, Citation2018). The goal of Experiment 2 was to determine whether the particular gestures used in this study seem to evoke quantity information or information about how much space a plural set is occupying.

Method

Participants

Sixty-six volunteers (49 female, 17 male) were recruited as participants from the COSI in Columbus, Ohio. The average age of the participants was 30 years and ages ranged between 18 and 62. All were native speakers of English and had normal or corrected to normal eyesight. Participants had no history of hearing or speech impairments.

Apparatus & procedure

The same apparatus and procedure used in Experiment 1 were used in Experiment 2.

Design and stimuli

Two features of the videos and pictures used in Experiment 2 were manipulated. The first feature was the gesture size: small versus large. The same videos used in Experiment 1 were used in Experiment 2. The second feature was the picture choice. Four pictures were created for each sentence frame. Two of the pictures depicted a large (10) number of objects and two of the pictures depicted a small (3) number of objects. Within each set size, there were two different spatial layouts: One of the pictures had the objects grouped together in a pile (group picture) and the other picture had the objects spread out (dispersion picture).

As in Experiment 1, participants were given a pair of pictures to choose from after watching a video. With four pictures for each sentence, it was possible to construct six possible picture pairs for each gesture type. In order to limit the number of conditions in the experiment, only the most informative contrasts were presented to participants. The result was a 2 (gesture type: small vs. large) × 2 (picture preference: quantity vs. object dispersion) design. For the picture preference condition, there were two levels. The first level was the quantity only condition. In this condition, participants were shown two pictures that had the same spatial layout but different quantities. For the small gesture, both pictures were grouped; for the large gesture, both pictures were dispersed, as these are the spatial layouts consistent with each gesture. This condition was designed to allow us to replicate the finding in Experiment 1 that participants’ decisions were influenced by quantity. The second condition directly pitted spatial layout against quantity. In this condition, participants chose between a picture whose quantity was consistent with the gesture and a picture whose spatial layout (but not quantity) might be consistent with the gesture. For the small gesture condition, the quantity-consistent picture was a small, dispersed picture and the spatial-consistent picture was a large, group picture. For the large gesture condition, the quantity-consistent picture was a large, group picture and the spatial-consistent picture was a small, dispersed picture. See Table 3 for an example of all four choice conditions.

Table 3. Example stimuli based on condition for Experiment 2.
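The pairing logic described above can be sketched in a few lines of Python. This is only an illustration of the design as stated in the text, not the authors' stimulus code; the names PICTURES, all_pairs, and selected_pairs and the condition labels are hypothetical.

```python
from itertools import combinations

# The four pictures per sentence frame: (set size, spatial layout).
PICTURES = [
    ("small", "group"),
    ("small", "dispersed"),
    ("large", "group"),
    ("large", "dispersed"),
]


def all_pairs():
    """Every unordered pair of the four pictures (4 choose 2 = 6)."""
    return list(combinations(PICTURES, 2))


def selected_pairs(gesture):
    """Return the two contrasts described in the text for one gesture size.

    Each value is a (quantity_consistent, alternative) pair of pictures.
    The labels 'quantity_only' and 'quantity_vs_layout' are illustrative.
    """
    other = "large" if gesture == "small" else "small"
    # Layout taken to match the gesture: grouped for small, dispersed for large.
    matching_layout = "group" if gesture == "small" else "dispersed"
    other_layout = "dispersed" if gesture == "small" else "group"
    return {
        # Same layout in both pictures, different quantity.
        "quantity_only": ((gesture, matching_layout), (other, matching_layout)),
        # Quantity and layout pull in opposite directions.
        "quantity_vs_layout": ((gesture, other_layout), (other, matching_layout)),
    }
```

Calling selected_pairs("small") and selected_pairs("large") yields the four choice conditions of the 2 × 2 design, two of the six possible pairs for each gesture.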

There were 52 experimental items counterbalanced across four lists such that participants saw all four conditions, but only one version of each item. Each list contained the same 66 filler items. The same filler videos used in Experiment 1 were used in Experiment 2. The pictures used for the fillers were changed to look more like Experiment 2’s experimental items. For the filler items, the correct picture appeared on the right side of the screen on half of the trials, and on the left side of the screen on the other half of the trials.
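One common way to achieve this kind of counterbalancing is a Latin-square rotation of items through conditions. The sketch below assumes such a rotation; the text does not state which assignment scheme was actually used, and the names CONDITIONS and build_lists are hypothetical.

```python
from collections import Counter

# The four cells of the 2 x 2 design (labels are illustrative).
CONDITIONS = [
    ("small", "quantity_only"),
    ("small", "quantity_vs_layout"),
    ("large", "quantity_only"),
    ("large", "quantity_vs_layout"),
]


def build_lists(n_items=52, conditions=CONDITIONS):
    """Assign each item to one condition per list by rotation.

    Assumed scheme: item i on list k gets condition (i + k) mod 4, so
    every item appears in every condition across the four lists but
    only once within any single list.
    """
    n_lists = len(conditions)
    lists = []
    for list_index in range(n_lists):
        assignment = {
            item: conditions[(item + list_index) % n_lists]
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists


lists = build_lists()
# Number of items per condition on the first list.
per_condition = Counter(lists[0].values())
```

With 52 items and four conditions, this rotation places 13 items in each condition on every list.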

Predictions

If gestures convey information about quantity, then there should be no effect of object dispersion on picture choices. That is, participants should always prefer the picture whose quantity matches the gesture size. Therefore, in the large gesture conditions, participants should prefer the large group picture over the small disperse picture and should prefer the large disperse picture over the small disperse picture. In the small gesture conditions, participants should prefer the small group picture over the large group picture and should prefer the small disperse picture over the large group picture.

If gestures convey spatial layout, then there should be a main effect of object dispersion, such that participants will be less likely to choose the quantity-consistent picture when object dispersion is not held constant. In the large gesture conditions, participants should prefer the small disperse picture over the large group picture and should show no preference between the small disperse and large disperse pictures. In the small gesture conditions, participants should prefer the large group picture over the small disperse picture and should show no preference when choosing between the small group picture and the large group picture.

Data analysis

The data were analyzed in R (version 4.0.3; R Core Team, Citation2020) by fitting a mixed logistic regression (with logit link function) using the glmer function in the lme4 package (Bates et al., Citation2014). All analyses involved the random effects model with random intercepts and random slopes of both factors for subjects and items (Barr et al., Citation2013). As some items or participants may have stronger/weaker effects, these random effects are theoretically justified. Each model included effect-coded fixed effects of gesture size and object dispersion. Data and analysis code for all experiments reported in the present study are available at the second author’s OSF webpage (DOI 10.17605/OSF.IO/JXZ86).

In the first model, participants’ responses were coded to reflect whether they chose the picture that was consistent with the quantity of objects implied by the gesture (1= quantity consistent choice, 0= quantity inconsistent choice). An initial model with the maximal number of random effects specified by the design failed to converge. A backwards stepping procedure was used to incrementally remove random effects that accounted for the least amount of variance until the model converged (Barr et al., Citation2013). The final model included random intercepts for participants and items. The subject factor also included random slopes for both factors, but no interaction slope.
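The response coding and the fixed-effects part of such a model can be illustrated with a short Python sketch. It is not the authors' analysis code (that was written in R with lme4 and is available at the OSF link above). The ±0.5 contrast values are an assumption, since the text says only that the factors were effect-coded, and the random effects are omitted for simplicity. All function and variable names are hypothetical.

```python
import math


def quantity_consistent(gesture_size, chosen_set_size):
    """Return 1 if the chosen picture's set size matches the gesture, else 0."""
    return 1 if gesture_size == chosen_set_size else 0


# Effect (sum-to-zero) coding. The +/-0.5 values are an assumption.
GESTURE_CODE = {"small": -0.5, "large": 0.5}
DISPERSION_CODE = {"quantity_only": -0.5, "quantity_vs_layout": 0.5}


def linear_predictor(gesture, dispersion, b0, b_gesture, b_dispersion, b_interaction):
    """Fixed-effects linear predictor on the logit scale (no random effects)."""
    g = GESTURE_CODE[gesture]
    d = DISPERSION_CODE[dispersion]
    return b0 + b_gesture * g + b_dispersion * d + b_interaction * g * d


def inverse_logit(eta):
    """Convert a logit-scale value into a probability."""
    return 1.0 / (1.0 + math.exp(-eta))
```

Under this coding, the intercept corresponds to the grand mean on the logit scale, and an intercept of zero corresponds to chance (a probability of 0.5) of choosing the quantity-consistent picture.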

Results and discussion

A proportion of correct responses for filler items was computed for each participant. Accuracy rates were high (M = 0.96, SD = 0.20), indicating that participants were paying attention during the experiment.

Table 4 shows the mean proportions for each condition. As can be seen from the proportion data, participants strongly preferred the quantity-consistent picture in all four conditions (all proportions are above chance).

Table 4. Proportion of quantity-consistent choices by condition in Experiment 2.

Consistent with Winter and Duffy (Citation2020), the participants’ preferences were not influenced by the hand the speaker used when performing the gesture (proportion means: Small gesture: left hand = 0.70, right hand = 0.74; Large Gesture: left hand = 0.74, right hand = 0.72, both hands = 0.78).

Table 5 reports the fixed effects for the logistic mixed model. As predicted by the quantity hypothesis, the object dispersion factor did not have a significant effect on participants’ choices (proportion mean difference = .01; see Table 4). Although conventional wisdom states that the null hypothesis can never be accepted, this is not strictly the case. Frick (Citation1995) argues that if (1) the null hypothesis is possible; (2) the results are consistent with the null hypothesis; and (3) the experiment was a good effort to find an effect, then it is acceptable to accept the null hypothesis. It is possible that the object dispersion factor could have no effect; therefore, the null hypothesis is possible in this instance. The experiment reported here included data from 66 participants and 54 trials for each participant, meaning that the number of data points was sufficient to find even a small effect. Finally, the data are consistent with the null hypothesis as the dispersion factor was not a significant predictor in this model. Inspection of the proportion means indicates that the two dispersion conditions only differed by .01. Further inspection of the proportion means in Table 4 indicates that the object dispersion factor did not have a consistent effect across the gesture conditions, as indicated by the significant interaction between gesture size and object dispersion (logit coefficient β = 0.66, SE = 0.18, z = 3.66, p < .001). For the large gesture condition, the object dispersion factor decreased participants’ preference for the quantity-consistent picture (proportion mean difference = .04); in the small gesture condition, however, the object dispersion factor had the opposite effect: Participants’ preference for the quantity-consistent picture was increased (proportion mean difference = .06). Follow up comparisons revealed a robust effect of object dispersion in the small gesture condition (logit coefficient β = −0.59, SE = 0.17, z = −3.55, p < .0001). 
No other comparisons differed significantly, all ps > .05. This interaction was not predicted by either hypothesis, so it is unclear why object dispersion differentially impacted the gesture conditions. Importantly, in none of the conditions did the proportion of quantity-consistent choices ever go below chance. That is, participants preferred the quantity-consistent picture whether or not object dispersion was a factor in their choice. This result is most consistent with the hypothesis that gestures evoke quantity information.

Table 5. Fixed effects for the logistic mixed model for picture choice by gesture size and object dispersion in Experiment 2.

General discussion

The goal of this set of experiments was to begin to understand how gestures might influence the conceptual representations of plural references created by language comprehenders. Experiment 1 showed that gestures could be used to influence the set size of a plural reference. Experiment 2 replicated these findings and further demonstrated that gestures were evoking quantity information for plural references rather than the spatial layout of the plural set. Taken together, these findings corroborate previous work showing that gestures can influence how comprehenders represent linguistic content (e.g. Church et al., Citation2007; Church & Goldin-Meadow, Citation1986; Emmorey & Casey, Citation2001; Hostetter, Citation2011). Importantly, these data extend these findings to quantity information implied by plural references.

These data are consistent with previous work showing that gestures are often used by speakers to convey number information (Winter et al., Citation2013; Woodin et al., Citation2020). Winter et al. (Citation2013) found that newscasters were more likely to perform small, pinching-like gestures when talking about “tiny” numbers of things and large gestures when talking about “huge” amounts. They argued that the gestures seem to convey information about quantity by evoking spatial representations of amount. The data reported here suggest that gestures that are used by speakers to convey information about quantity can also be used by comprehenders to make inferences about the quantity information in plural references.

These data are also consistent with work showing a tight link between number and spatial representations (Andres et al., Citation2004; Henik & Tzelgov, Citation1982; Hurewitz et al., Citation2006; Lindemann et al., Citation2007). A related effect, also consistent with the data reported here, is an effect known as the spatial-numerical association of response codes, or SNARC effect (Dehaene et al., Citation1993). The SNARC effect is a demonstration of the spatial organisation of magnitude information: Experimental findings show that participants respond faster to smaller numbers with their left hand and to larger numbers with their right hand. It is believed that this effect is due to the automatic association between the location of the response hand and the spatial magnitude of the number information. Particularly related to this study, the SNARC effect has been extended to plural information, showing that participants are faster to respond to singulars with their left hand and plurals with their right hand (e.g. Patson, Warren, Hurler, & Kaup, Citationunder review; Roettger & Domahs, Citation2015), suggesting that semantic magnitude information is active during the processing of nouns as well as during the processing of digits and number words. Of course, the gestures used in the current experiments likely did not activate the same number line associated with the SNARC effect, as the speaker in the video used both hands to perform the gestures. However, the data reported here suggest that spatial magnitude information related to quantity can be activated during the comprehension of plural sets with the kinds of gestures used in these experiments. These data add to the growing evidence that there is a tight link between quantity/number information and spatial representations of number/quantity.

These data contribute important insights into how plural references may be instantiated in comprehenders’ minds. Previous work has suggested that when quantity information is left unspecified in the linguistic stream, comprehenders leave that information underspecified in the conceptual representation (Patson, Citation2014; Patson et al., Citation2014). It is important to note that when plurals are quantified with small numbers (i.e. within the subitisation range), comprehenders create a conceptual representation that has exactly that number of objects (Beg et al., Citation2019; Patson, Citation2021; Patson et al., Citation2014; Šetić & Domijan, Citation2017). For example, Šetić and Domijan used a picture-matching experiment and found that participants were faster to respond to a picture of exactly three dogs after reading a sentence like, Three dogs were wandering in the street compared to pictures of dogs in which the quantity information did not match the sentence. However, recent evidence using the picture-matching paradigm found that comprehenders do not simulate exact number when the cardinal quantifier (i.e. six, eight) is outside of the subitisation range (Beg et al., Citation2019; Patson, Citation2021). Beg et al. argued that when numbers are outside of the subitisation range comprehenders do not access a symbolic representation of that number’s respective quantity. Given this, it is important to consider what kind of conceptual representation is built for a plural referent when its quantity is either larger than the subitisation range or unspecified. Importantly, in the attention literature, Treisman (Citation2006) has argued that people are only able to maintain three to four objects (i.e. within the subitising range) in visual attention. 
Once the number of objects in the visual scene exceeds four, Treisman argues that the objects are bound together in a single object file and properties of the group object file are in attentional focus rather than properties of the individuals that comprise the group. The data reported here are consistent with this hypothesis. Like previous work (Winter et al., Citation2013; Woodin et al., Citation2020), we argue that the gestures in the experiments reported here influenced the conceptual representations of plural references by evoking spatial representations of amount. This suggests that the conceptual representations of the plural references were indeed focused on the level of the group and the gestures were used to specify properties of that group, namely group size. Furthermore, group size was used by comprehenders to make inferences about quantity due to the deep mental connection between the conceptualisation of physical size and numerical quantity (e.g. Andres et al., Citation2004; Henik & Tzelgov, Citation1982; Hurewitz et al., Citation2006; Lindemann et al., Citation2007). Taken together, all of this work suggests that quantity information is encoded in plural representations through group features, not left underspecified or abstract. If quantity information were abstract or unspecified, we would not have predicted an influence of gestures.

These data raise a number of important questions about how gestures may influence plural conceptual representations. In particular, it would be important to understand how gestures might be used to differentiate plurals within and beyond the subitisation range. Šetić and Domijan (Citation2017) provided evidence consistent with Patson et al. (Citation2014) that comprehenders’ conceptual representations have exact number represented when the number of objects is within the subitisation range. When exact number is conceptually represented, attention is distributed among the individual objects (e.g. Treisman, Citation2006). Given this, it would be interesting to understand how gestures influence those conceptual representations. Would gestures influence the representation of the individual objects themselves? Or rather, would the gesture influence how the objects are spaced in the conceptual representation? For example, if a speaker were to use the same large gesture used in this study while saying, There are three dogs, would that be interpreted as three large dogs, or would that gesture indicate that the dogs are spaced far from each other?

One place that has begun to investigate how different kinds of plurals may be represented gesturally is in the body of work investigating homesigners (Coppola et al., Citation2013). Homesigners are individuals who do not have access to conventional language models, neither spoken nor signed, and therefore create their own gestural system to communicate with their surrounding world (Goldin-Meadow et al., Citation1996; Spaepen et al., Citation2011; Spaepen et al., Citation2013). Researchers have identified several elements of natural languages within homesign systems, such as distinction between nouns, verbs, and adjectives, and several complex grammatical properties (Coppola et al., Citation2013; Spaepen et al., Citation2013). One study indicated that there are at least three different ways in which homesigners communicate quantity information: finger extension, punctuated movements, and unpunctuated movements. Punctuated movements are discrete from the surrounding gestures and can occur in different spaces. They, along with finger extensions, provide more accurate accounts of the set number and act like cardinal numerical devices. Unpunctuated movements are produced in succession with the other gestures and can also occur in different spaces. Unlike their counterpart, unpunctuated movements behave like non-cardinal number devices by appearing to convey “more than one” or “many”. Interestingly, unpunctuated movements do not seem to be reserved for large item sets and are used for sets as small as two or three. It is possible that unpunctuated movements function as a plural marker, similar to -s in English (Coppola et al., Citation2013). Because unpunctuated movements provide underspecified information about quantity, perhaps these gestures elicit mental representations of this sort. In other words, an unpunctuated movement paired with the sentence The buttons are on the table might produce vague conceptualisations of buttons. 
Likewise, punctuated movements may elicit representations that recognise the individuals in the set because they attempt to convey information about number. The conceptualisation of buttons in this condition might be more detailed and specific. Future work should investigate the link between homesigns for quantity information and how plurals are conceptualised in comprehenders’ minds more generally.

A more recent study showed that ASL (American Sign Language), like the homesigns described above, also uses punctuated and unpunctuated repetitions that can convey different kinds of plural references (Schlenker & Lamberton, Citation2019). In ASL, signers can repeat a sign to convey a plural. For example, a signer may repeat the sign for “book” three times to indicate that there are multiple books. It may seem plausible that the three repetitions of the sign indicate that there are three books; however, the repeated sign can be performed with a numeral that overrides that interpretation. Thus, if a signer performs “book” three times followed by the numeral for seven, that is felicitously interpreted as meaning “seven books”. Instead, the repetitions seem to be used to convey information about the spatial configuration of the plural reference. In unpunctuated repetitions, the repetition is vague and has low resolution, that is, each movement does not necessarily denote a single object. In punctuated repetitions, the repetitions have higher resolution such that each repetition denotes a single object. Numerals can override this preference, but when this happens, the punctuated nature of the repetition suggests that the objects are clearly separable. Both unpunctuated and punctuated repetitions can indicate how objects are arranged. For example, if the sign for book is performed three times in a row horizontally, this is taken to mean the books are arranged horizontally. Future work should investigate whether punctuated gestures influence comprehenders’ conceptual representations of plural references.

One limitation of the current findings is that the gestures used in this study were scripted, thus care must be taken in generalising these results. Importantly, we are not claiming that speakers often use the particular gestures used in these studies to convey quantity information about plural referents as instantiated in these experiments. Instead, we are claiming that gestures can communicate information about a plural referent’s relative set size. Importantly, a meta-analysis of previously published work on the effects of gestures on comprehension suggested that scripted gestures have a similar impact on comprehension as spontaneous gestures (Hostetter, Citation2011). Thus, the data reported here suggest that gestures can be used to convey information about plural sets to comprehenders. Furthermore, the forced-choice method used in these experiments does not allow us to conclude that gestures automatically instantiate spatial quantity information in comprehenders’ conceptual representations. Importantly, the forced-choice method allowed us to determine whether comprehenders were sensitive to gesture information and whether it influenced their interpretation of the quantity information in the plural reference. Future work should investigate the timecourse of the gestures’ influence for a better understanding of how and when the gesture exerts an influence on comprehension.

Despite these limitations, these data make useful contributions to our knowledge about how plural references are conceptualised. Consistent with previous work, these data suggest that when an unquantified plural set is instantiated in a comprehender’s mind, attention is focused on the set rather than the objects that make up the set. These data suggest that gestures can influence the representation of the plural reference by influencing spatial aspects (e.g. size) of the set. This, in turn, is used by comprehenders to make inferences about the quantity of the plural reference. These findings are consistent with work suggesting that the conceptual representation for references includes physical and spatial properties (e.g. Barsalou, Citation1999). Furthermore, they suggest that quantity information for plural references is not abstract (Johnson-Laird, Citation1983) or underspecified (Patson et al., Citation2014) but that it is encoded through spatial means.

Open practices statement

All of the experimental materials for experiments reported here are available upon request. Data and analysis scripts are available at DOI 10.17605/OSF.IO/JXZ86. None of the experiments were preregistered.

Acknowledgments

The authors would like to thank Isabel Aey, Angélica Avilés Bosques, and Joshua Perry for their assistance with stimuli creation and data collection. These data were collected at the Language Sciences Laboratory at the Center of Science and Industry (COSI) in Columbus, Ohio.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References