Current perspectives on visual working memory

Memory for action: a functional view of selection in visual working memory

Pages 388-400 | Received 17 Feb 2020, Accepted 28 Apr 2020, Published online: 19 May 2020

ABSTRACT

Perception is shaped by actions, which determine the allocation of selective attention across the visual field. Here, we review evidence that maintenance in visual working memory is similarly influenced by actions (eye or hand movements), planned and executed well after encoding: Representations that are relevant for an upcoming action – because they spatially correspond to the action goal or because they are defined along action-related feature dimensions – are automatically prioritised over action-irrelevant representations and held in a stable state. We summarise what is known about specific characteristics and mechanisms of selection-for-action in working memory, such as its temporal dynamics and spatial specificity, and delineate open questions. This newly-burgeoning area of research promotes a more functional perspective on visual working memory that emphasizes its role in action control.

Humans actively shape and constantly adapt to their environments with their actions. It is no stretch to assume that cognition, including perceptual functions, has evolved in the service of action control. This perspective treats actions not merely as an appendix to a string of mental operations but as their primary purpose; it gained momentum in the 1980s (e.g., Allport, 1987; Neumann, 1987; Neumann & Prinz, 1990) and has since proven a fruitful approach to the study of cognitive functions. Action and perception are tightly intertwined, interacting bidirectionally (e.g., Cruse et al., 1990; Schütz-Bosbach & Prinz, 2007).

The visual system in particular has been characterized as a system optimized for selecting action-relevant information from the visual scene (selection-for-action; Allport, 1987). We ceaselessly move our eyes, heads and hands to gather information and interact with our surroundings. It is these actions that determine a visual feature’s relevance. Consider reaching out for the hand of a child, balancing on a curbside. As you reach out, the hand’s location is rendered more important than other locations in your surroundings, and its size and orientation are more important than other features, because size and orientation will inform the posture of your hand (i.e., grip aperture and orientation) for grasping it. Decades of research have shown that the selective processing of sensory input is indeed tailored to the requirements of action control. In this review, we make the case that it is time to adopt a similar, ecological perspective on the selection of visual information that is maintained in working memory in the absence of corresponding perceptual input.

Studies of selective processing for visual working memory typically instruct participants about which subset of stimuli they should memorise (e.g., the red items but not the green or blue ones; e.g., Jost et al., 2011; Vogel et al., 2005) or display informative cues before or after the presentation of information that is to be remembered (e.g., Griffin & Nobre, 2003; Heuer et al., 2016; Heuer & Schubö, 2016; Kalogeropoulou et al., 2017; Souza, 2016; see Souza & Oberauer, 2016 for review). The world outside the well-controlled laboratory environment, however, hardly ever provides explicit cues pointing to the aspects of our visual environment that we should focus on to achieve a specified goal. Instead, we perform goal-directed actions, which impose the selection of relevant visual information for maintenance in working memory. Actions, therefore, could be considered natural cues. For a specific type of action, this role has long been recognized: Several studies have shown that information about the goal of a saccadic eye movement is maintained in memory, even when the object at that location is irrelevant for the primary task (e.g., Bays & Husain, 2008; Deubel et al., 2002; Irwin, 1992; Irwin et al., 1990; Schut et al., 2017; Shao et al., 2010; Tas et al., 2016; see Aagten-Murphy & Bays, 2018 for review). However, these studies tested memory for visual stimuli that were presented during saccade planning, and sometimes their presence extended well into saccade execution. With this timing, the influence of saccades on memory can most likely be attributed to presaccadic attention shifts (e.g., Deubel & Schneider, 1996; Kowler et al., 1995; Ohl et al., 2017; Rolfs & Carrasco, 2012; Rolfs et al., 2011; see Deubel, 2014 for review), which increase visual sensitivity during encoding and promote the selective transfer of sensory information into working memory.

This review focuses on recent studies that examined whether and how actions – planned and executed well after encoding – modulate already existing representations in visual working memory. It does not cover other sorts of relations between actions and visual working memory, which interact bidirectionally at different stages of processing (e.g., Czoschke et al., 2019; Myers et al., 2017; Souza et al., 2019; Tseng & Bridgeman, 2011; van Ede, Chekroud, and Nobre, 2019; van Ede, Chekroud, Stokes, et al., 2019; Williams et al., 2013; for a review see van Ede, 2020). We will first review evidence that actions (eye and hand movements) are accompanied by an attentional prioritisation of representations (i) that correspond to action-relevant locations, or (ii) that are defined along feature dimensions that are relevant for the particular type of action. We will then discuss specific characteristics and potential mechanisms of selection-for-action in visual working memory. In so doing, we will summarise what we know about issues that have already been addressed, pointing out open questions along the way, and speculate about issues that still lack systematic investigation. Throughout the review, we will draw parallels between this action-related attentional weighting in memory and the well-established deployment of attention to action-relevant visual information in the currently perceived environment (for a distinction between non-perceptual and perceptual attention, see Oberauer, 2019; commonly also referred to as internal and external attention, see Chun et al., 2011).

Prioritization of action-relevant locations

Goal-directed movements shape the deployment of visuospatial attention: The goal of an action is attended before the movement starts, increasing sensitivity at that location relative to other locations (for reviews see Deubel, 2014; Zhao et al., 2012). The coupling of perceptual spatial attention to visual goal locations during action planning is obligatory and has not only been observed for eye movements (e.g., Castet et al., 2006; Deubel & Schneider, 1996; Hanning et al., 2019; Hoffman & Subramaniam, 1995; Kowler et al., 1995; Li et al., 2016; Montagnini & Castet, 2007; Ohl et al., 2017; Rolfs et al., 2011; Rolfs & Carrasco, 2012), for which the link with visual attention might be expected to be particularly strong, but also for hand movements such as reaching and grasping (e.g., Baldauf & Deubel, 2008b, 2010; Deubel et al., 1998; Rolfs et al., 2013; Schiegg et al., 2003; Stewart et al., 2019). All of these studies employed dual-task paradigms, in which participants had to perform a specific movement in combination with a visual task that required the detection, discrimination, or identification of a target stimulus presented briefly before the onset of the movement.

More recent evidence shows that visual working memory is also affected by on-going actions, similar to perception. Actions, planned and executed during memory maintenance, bias visual working memory in favour of stimuli previously presented at the same location as the current action goal. This selection within visual working memory has been observed for both eye movements (Hanning & Deubel, 2018; Hanning et al., 2016; Ohl & Rolfs, 2017, 2018, 2020) and manual pointing movements (Hanning & Deubel, 2018; Heuer & Schubö, 2018; Heuer et al., 2017). Studies in this new line of research combined movement tasks and memory tasks. The particular paradigms differed in some experimental details such as the features to-be-remembered, the cues that were used to indicate the movement goal, or the exact timing of events. But the basic experimental protocol (Figure 1A) and the results (Figure 1B) were remarkably consistent across these studies: During the retention interval of a memory task, participants were cued to perform a movement towards one of the locations previously occupied by the memory items. Embedding the movement task within the visual working memory task ensured that any action-related effect could only be the result of an attentional modulation at the representational level during maintenance, and not of an increase in sensitivity at the perceptual stage. Across all of these studies, the movement cue – hence, the action goal location – was unpredictive of the upcoming test item location in the memory task. Neither was the memorised information required to perform the movement, because the locations were marked by placeholders that were present throughout the trials (Heuer et al., 2017; Heuer & Schubö, 2018; Ohl & Rolfs, 2017, 2018, 2020) or by items that were distinct from the memory items and only presented during the movement task (Hanning et al., 2016). Thus, the movement and memory tasks were not related but merely overlapped in time. Nevertheless, the action modulated memory performance in a spatially specific manner: Memory was better for items that had previously been presented at action-relevant locations (i.e., action goals) than for items presented at action-irrelevant locations.
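The logic of this protocol can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the analysis code of any of the cited studies: the `Trial` record, the integer coding of locations, and the function names `balanced_design` and `congruency_effect` are all hypothetical. The sketch shows (i) a design in which every probe location is equally likely for every movement cue, so that the cue carries no information about the probe, and (ii) the congruency effect computed as the difference in accuracy between trials in which the probed item was at the movement goal and trials in which it was elsewhere.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Trial:
    """One trial of the combined task (hypothetical record layout)."""
    cue_location: int    # index of the cued movement goal
    probe_location: int  # index of the item tested in the memory task
    correct: bool        # whether the memory response was correct


def balanced_design(n_locations):
    """Cross every cue location with every probe location.

    Each probe location occurs equally often for each cue, so the
    movement cue does not predict which item will be tested.
    """
    return list(product(range(n_locations), repeat=2))


def congruency_effect(trials):
    """Return (congruent accuracy, incongruent accuracy, difference)."""
    congruent = [t.correct for t in trials
                 if t.cue_location == t.probe_location]
    incongruent = [t.correct for t in trials
                   if t.cue_location != t.probe_location]
    if not congruent or not incongruent:
        raise ValueError("need both congruent and incongruent trials")
    acc_congruent = sum(congruent) / len(congruent)
    acc_incongruent = sum(incongruent) / len(incongruent)
    return acc_congruent, acc_incongruent, acc_congruent - acc_incongruent
```

With four locations, `balanced_design(4)` yields 16 cue-probe combinations, of which 4 are congruent, so a congruent probe is no more likely than a probe at any other single location. A positive difference returned by `congruency_effect` corresponds to the memory advantage for action-relevant locations described above.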

Figure 1. Prioritisation of action-relevant locations. (A) In a typical paradigm, participants have to memorise a set of items (here, four colours) and perform an eye or hand movement to a cued location during the maintenance interval. Memory and movement tasks are unrelated, that is, the movement cue is not predictive of the upcoming memory probe location. The movement is accompanied by the automatic allocation of attention (orange) to the goal location, and the item representation that is spatially congruent with the movement goal benefits from the increased attentional engagement at that location. (B) As a result, memory performance for congruent (i.e., action-relevant) items is better than for incongruent (i.e., action-irrelevant) items. This effect is largest shortly after encoding and decreases with increasing cue delay (top panel), but remains stable for several seconds once action-related priorities have been established (i.e., across different probe delays following movement execution; bottom panel). These plots illustrate typical results; for the original data see Hanning & Deubel, 2018; Hanning et al., 2016; Heuer et al., 2017; Heuer & Schubö, 2018; Ohl and Rolfs (2017, 2018, 2020).

This pattern of results stood in contrast to control conditions that were visually identical to the main task and only differed in the instructions for the movement cue. In these conditions, participants were asked to ignore the movement cue and perform either no movement at all (Hanning et al., 2016; Ohl & Rolfs, 2017) or one towards the centre of the display irrespective of the indicated location (Heuer et al., 2017; Heuer & Schubö, 2018). Using these different instructions, the movement cue did not elicit the bias in visual working memory that goal-directed actions did. The prioritisation of information spatially congruent with the cued action goal thus cannot be attributed to an automatic shift of attention triggered by the non-informative cue. Instead, it is a consequence of attentional processes specifically associated with the action.

Prioritization of action-relevant feature dimensions

Allocating spatial attention to the goal object location ensures that visual information at that location is processed and maintained preferentially over other objects in the environment. But different features matter depending on what exactly it is that we want to do with an object. When you reach for a child’s hand, as illustrated in the example above, you need to consider features like size and orientation. If, by contrast, you only want to point to the hand in order to encourage someone closer to it to grasp it, its size and orientation are largely irrelevant. But the colour or lightness of the gloves the child is wearing might be helpful to localize the hand and point in the right direction. In these scenarios, the visual information and the goal object are the same, but your action intentions render different features of that very same object more relevant than others. Setting up a specific action plan primes action-related feature dimensions by increasing their weight and thus their impact on perceptual processing (intentional weighting; Hommel, 2009; Hommel et al., 2001; Memelink & Hommel, 2013). Not only does processing of action-relevant features of the goal object itself increase (e.g., Bekkering & Neggers, 2002; Gutteling et al., 2011; Hannus et al., 2005), but entire feature dimensions that provide action-relevant information are primed (e.g., Fagioli et al., 2007; Wykowska & Schubö, 2012; Wykowska et al., 2009).

Again, this generalized influence of action intentions does not end at the perceptual stage: Preparing a particular type of action also induces a selective weighting within visual working memory, resulting in a prioritisation of representations coded on action-relevant feature dimensions. Key evidence for this comes from a study that combined a memory and movement task (Heuer & Schubö, 2017), in which participants memorised items that were defined either by size or by colour while preparing either a grasping or a pointing movement (Figure 2A). Whereas size is a critical feature dimension for grasping (Smeets & Brenner, 1999), colour can be used to localize a goal object and guide a pointing movement (White et al., 2006). In two separate experiments, the type of movement to be prepared was instructed either before the memory items or well after their disappearance during the retention interval. Participants performed the actual pointing or grasping movement towards an item on the display only after responding to the memory task. In both experiments, memory for size was better during the preparation of a grasping movement (Figure 2B). Memory for colour, conversely, tended to be better while a pointing movement was being planned. The latter effect, however, was much smaller and did not reach statistical significance – probably because the action relevance of colour information for pointing movements is simply not that high (see also Bekkering & Neggers, 2002; Hannus et al., 2005).

Figure 2. Prioritisation of action-relevant feature dimensions. (A) In a typical paradigm, participants memorise a set of items defined by different feature dimensions (here, size and colour) and are cued to prepare a grasping or pointing movement during the maintenance interval, but to withhold movement execution until after completion of the memory task at the end of the trial. Memory and movement tasks are unrelated, that is, the movement type is not predictive of the upcoming memory probe type. The different movement types render different features action-relevant: While size is a relevant feature dimension for grasping, colour can be used to guide a pointing movement. Item representations defined along action-relevant feature dimensions benefit from their increased attentional weight (yellow for grasping, green for pointing). (B) As a result, memory performance for size items is better during the preparation of a grasp, whereas performance for colour items tends to be slightly better while a pointing movement is being prepared. This plot illustrates typical results; for the original data see Heuer and Schubö (2017).

Characteristics and mechanisms

A number of specific characteristics and mechanisms of selection-for-action in visual working memory have been investigated thus far. In dedicated paragraphs, each summarizing the current state of knowledge and pointing out open questions, we will discuss these findings. The last paragraphs will then be devoted to speculations about issues that have not yet been systematically investigated.

  • The effects of selection-for-action are largest shortly after encoding and decline thereafter, but they remain highly stable for extended periods of time once priorities are imposed.

By varying the interval between a memory array and a movement cue from 100 to 3200 ms, Ohl and Rolfs (2017, 2018) have shown that the relative enhancement of memoranda congruent with a saccade goal is largest when the action is initiated shortly after the disappearance of the memory array (see Figure 1B, top panel). Thereafter, the difference in performance for action-relevant and -irrelevant items decreases but is reliably observed well beyond the range of iconic memory up to at least 800 ms after memory array offset, in most cases probably longer (see also Hanning & Deubel, 2018; Ohl & Rolfs, 2019). Intervals used in other studies fall within that range (Hanning et al., 2016; Heuer et al., 2017; Heuer & Schubö, 2018; Ohl & Rolfs, 2018). Once action-related priorities have been established in visual working memory, however, they remain highly stable. That is, when the interval between movement and memory test was varied, a sustained memory advantage for the item corresponding to the action goal was evident across several seconds (Ohl & Rolfs, 2017; see Figure 1B, bottom panel). Whereas the findings obtained for hand movements (Hanning & Deubel, 2018; Heuer et al., 2017; Heuer & Schubö, 2018) appear to be consistent with the time course delineated for saccadic eye movements, the timing of hand movements – reaching movements to different locations or different types of movements – has never been varied in a comparable manner.

  • The spatial specificity of selection-for-action in visual working memory varies, depending on factors that have yet to be identified.

Perceptual shifts of attention that accompany goal-directed actions are spatially highly specific to the intended goal location (Baldauf & Deubel, 2008a, 2009; Baldauf et al., 2006; Deubel & Schneider, 1996; Ohl et al., 2017; Rolfs et al., 2011; Stewart et al., 2019) rather than the action’s end-point (van der Stigchel & De Vries, 2015; Wollenberg et al., 2018). This specificity may reduce interference from surrounding objects and ensure efficient selection of information required for the specification of movement parameters. The spatial specificity for action-induced selection within visual working memory is more variable. Ohl and Rolfs (2017, 2018, 2019) observed a relative enhancement of memory performance that was confined to the representations directly corresponding to the action goal, while performance even for items surrounding that location dropped sharply. With similar distances between items, Heuer and Schubö (2017), by contrast, found that memory performance decreased more gradually as a function of distance between a reach goal and a memory test item – a pattern indicative of an attentional gradient spreading out from the action goal location. One might speculate that these different patterns are characteristic of the different effectors for which they were observed, but given that a comparable, spatially highly specific tuning of perceptual attention has been demonstrated for both eye (Baldauf & Deubel, 2008a; Deubel & Schneider, 1996) and hand movements (Baldauf & Deubel, 2009), this explanation seems rather unlikely, although it cannot be ruled out. Another possibility is that the distribution of attention within visual working memory may actually be similar across effectors and studies – a prioritisation of information corresponding to the intended goal location that levels off with increasing distance from that location – but that it depends on task demands (e.g., a simple detection of colour category changes vs a more difficult classification and report of Gabor orientations) whether the weaker attentional engagement at non-target locations yields a measurable behavioural benefit.

In contrast to the spatially specific selection of the action goal, selection by feature-based attention is spatially invariant – it is effective across the entire visual field (e.g., Bichot et al., 1999; White & Carrasco, 2011). Saccadic eye movements in perceptual tasks do not interfere with the deployment of feature-based attention across saccades (Kalogeropoulou & Rolfs, 2017) and objects at locations other than the saccade goal do not benefit from sharing a specific feature with the saccade target (Born et al., 2012; Jonikaitis & Theeuwes, 2013; White et al., 2013). Future studies should determine whether a similar independence of feature-based attention and spatially specific selection-for-action generalizes from perceptual to memory tasks.

  • Selection-for-action in visual working memory occurs automatically.

All studies that examined the effect of actions on maintenance in visual working memory used dual-task paradigms, in which the movement task had no predictive value for the memory task: Each item in memory was equally likely to be tested – irrespective of the goal location or the type of movement – so there was no strategic advantage in deploying more resources to specific representations (Hanning & Deubel, 2018; Hanning et al., 2016; Heuer & Schubö, 2017, 2018; Heuer et al., 2017; Ohl & Rolfs, 2017, 2018, 2020). In fact, items presented at the action goal are prioritised irrespective of memory cue validity – even when they are far less likely to be probed than any other item (Ohl & Rolfs, 2017, 2020) – and in spite of participants’ knowledge about these contingencies. While this is probably the most convincing piece of evidence that selection-for-action in visual working memory occurs involuntarily, it dovetails nicely with other findings. For instance, action-related biases do not vary with set size (Ohl & Rolfs, 2020) and they remain unperturbed when pitted against another powerful selection bias that confers a direct advantage: monetary reward (Heuer & Schubö, 2018).

Evidence for the obligatory nature of selection-for-action in visual working memory is striking, yet some open questions remain. For example, it is not clear how actions generated and executed during memory maintenance affect the deployment of endogenous attention. In particular, can participants – while performing an action – voluntarily deploy attention to other memoranda that are not related to the action (e.g., make use of a valid retro-cue pointing to a location)?

  • Selection-for-action in visual working memory relies on effector-specific attentional mechanisms.

Despite the highly similar behavioural consequences of eye and hand movements on working memory described so far, independent attentional mechanisms appear to be involved. Hanning and Deubel (2018) found that simultaneous eye and hand movements to different locations yielded memory benefits for items presented at both action goals, which were of approximately the same magnitude as when a single eye or hand movement was performed. Thus, there was no tradeoff between effectors, which mirrors findings of simultaneous and independent shifts of attention to perceptual input (Hanning et al., 2018; Jonikaitis & Deubel, 2011; but see Khan et al., 2011). This may seem surprising in light of the large body of work demonstrating a high degree of overlap between eye and hand movements at neural and behavioural levels of analysis (e.g., Beurze et al., 2009; Crawford et al., 2011; Filimon, 2010; Horstmann & Hoffmann, 2005; Neggers & Bekkering, 2000; Pelz et al., 2001). Conceivably, such interactions emerge during later stages of motor control, whereas the attentional selection of movement targets happens early on during movement preparation (Jonikaitis & Deubel, 2011). The timing of selection-for-action in visual working memory – planning or execution – is discussed in further detail in a separate section below.

These findings invite the question whether the prioritisation of action-relevant memory contents relies on mechanisms that are specific to the oculomotor and reach systems, that is, to different types of effectors, or also to different effectors of the same type, such as the right and left arm. Coordinated movements of both hands are typically tightly coupled and crosstalk emerges already during motor programming (e.g., Heuer et al., 1998; Spijkers & Heuer, 2004). Yet perceptual attention appears to be allocated to the separate goals of bimanual actions in parallel and at no cost compared to unimanual reaching, at least under certain conditions (Baldauf & Deubel, 2008b). Whether this also holds for selection in visual working memory has yet to be determined.

  • Selection-for-action in visual working memory occurs during action planning rather than during action execution.

The evidence obtained so far indicates that the prioritisation of action-relevant representations in visual working memory occurs during movement preparation and does not necessarily require the planned movement to be executed. Hanning et al. (2016) observed the same relative enhancement of items at the saccade goal in randomly interleaved catch trials, in which the eye movement was planned but not executed. Converging evidence was obtained for the prioritisation of action-relevant feature dimensions. In the paradigm used by Heuer and Schubö (2017), participants were instructed to prepare the grasping or pointing movement following movement cue presentation (before encoding or during the maintenance interval) but to withhold movement execution until after completion of the memory task. The observed effects of movement type could thus only have arisen during action planning, as later stages did not overlap with the memory task.

  • Does selection-for-action rely on a modulation of action-relevant representations, action-irrelevant representations, or both?

Visual memory is biased towards information that is potentially action-relevant due to its spatial correspondence with the action goal or its coding of action-related feature dimensions. In other words, action-relevant representations are enhanced relative to action-irrelevant representations. It remains an open question, though, how this translates to modulations in the absolute sense. More specifically, it is unclear whether this bias reflects a beneficial modulation of the action-relevant representation (e.g., enhancement or protection), an inhibition of action-irrelevant representations, or a combination of these processes working in concert. Detrimental effects on action-irrelevant representations do not necessarily entail their active inhibition: Any processes that specifically benefit a particular representation likely involve a shift of limited resources and thus come at a cost for the remaining representations. Based on the finding that performance for items at the action goal remains highly stable across several seconds once they have been prioritised during action planning (Ohl & Rolfs, 2017), we may speculate that these items are held in a protected state, while the remaining items are subject to time-based decay or interference (see also Oberauer et al., 2016; Souza & Oberauer, 2016). Other mechanisms, however, could likewise account for this pattern.

Tackling this issue with behavioural experiments requires an adequate baseline, and finding one has proven to be particularly challenging. Conditions without a movement or with movements towards the same constant goal, as have been used in previous studies (Hanning & Deubel, 2018; Hanning et al., 2016; Heuer & Schubö, 2018; Heuer et al., 2017; Ohl & Rolfs, 2017), can control for effects of the movement cue, but they constitute an unsuitable baseline for the memory task: The task load in these conditions is clearly lower than in conditions in which participants also perform cued movements to varying locations. One solution would be to introduce cued movements to goal locations that do not correspond to memory item locations (cf. Ohl & Rolfs, 2020). However, this solution is not entirely satisfactory either: Actions to targets that do not correspond to memory item locations incur a cost in the memory task that may be indicative of increased visual memory load and not just general task load (Lawrence et al., 2004; Postle et al., 2006; Schut et al., 2017; Tas et al., 2016; but see Ohl & Rolfs, 2020). Presumably, information at action-relevant locations that do not correspond to memory item locations likewise receives priority during perceptual and mnemonic processing, taking away resources from items maintained for the memory task. The identification of the specific attentional mechanisms that bring about the action-related weighting in visual working memory might benefit from a closer look at its neural underpinnings.

  • What are the neural mechanisms underlying selection-for-action in visual working memory?

While we have a relatively good understanding of the neurophysiological implementation of action-related influences on perception, dedicated research is needed to characterize the interplay of actions and visual working memory. The neural circuits of working memory, (covert) visual attention and the oculomotor and reach systems overlap and are highly interdependent (e.g., Awh et al., 2006; Ikkai & Curtis, 2011; Jonikaitis & Moore, 2019; Noudoost et al., 2010; Perry & Fallah, 2017). Saccade preparation enhances processing throughout visual cortex via feedback from retinotopically organized oculomotor regions (e.g., Ekstrom et al., 2009; Moore & Armstrong, 2003; Moore et al., 1998; Saber et al., 2015). Similarly, feedback from motor regions that control reaching and grasping movements modifies activity in visual cortex, modulating the processing of visual features according to action intention and enhancing sensitivity near the hand (e.g., Gutteling et al., 2013; Monaco et al., 2018; Perry et al., 2015; Velji-Ibrahim et al., 2018). Thus, we may speculate that top-down signals from effector-specific fronto-parietal oculomotor or reach regions modulate the maintenance of visual working memory representations in visual cortex (sensory recruitment; Harrison & Tong, 2009; Pasternak & Greenlee, 2005; Serences et al., 2009; Supèr et al., 2001). However, higher-level areas are also involved in the maintenance of representations in visual working memory (Bettencourt & Xu, 2015; Riley & Constantinidis, 2016). Action-related selection processes in visual working memory may accordingly target different levels within a distributed network involved in memory maintenance (Christophel et al., 2017), promising an exciting avenue for future research.

  • Is action-related prioritisation within visual working memory equivalent to prioritisation induced by retro-cues?

The finding that representations in visual working memory are weighted according to their potential relevance for planned actions raises the question of whether action relevance is equivalent to another source of selection bias: explicit task-relevance. How differences in task-relevance affect maintenance in visual working memory has been studied extensively since the development of the retro-cueing paradigm (Griffin & Nobre, 2003; Landman et al., 2003; see Souza & Oberauer, 2016, for an overview). In terms of memory performance, the result of an action that renders some retained items more relevant than others is certainly the same as that of using a retro-cue indicating that specific items are more task-relevant than others: Memory for these items is better than for the remaining ones, suggesting that these were maintained in a prioritised state. The effects of actions and retro-cues might therefore be considered functionally equivalent. As both types of effects rely on attentional processes and likely operate on the same representations, it seems reasonable to assume that there is also a degree of overlap of the underlying mechanisms (see also Myers et al., 2017). For instance, prioritisation induced by both actions and retro-cues occurs at the item-level as well as at the level of feature dimensions (e.g., Hajonides et al., 2019; Heuer & Schubö, 2017; Niklaus et al., 2017; Ohl & Rolfs, 2017; Ye et al., 2016). However, there is reason to believe that these selection mechanisms are not entirely equivalent.

For one, retro-cues are typically endogenous cues that provide useful information about the upcoming test item – such as its location or identity – with a given validity, allowing participants to make strategic use of them in order to improve their performance. The benefit conferred by retro-cues, therefore, heavily relies on their voluntary use and is accordingly sensitive to cue reliability (e.g., Gunseli et al., 2015; Shimi et al., 2013; see Note 2). Actions, by contrast, exert their influence automatically and the effects persist not only in the absence of any strategic advantage, but even when they are associated with a disadvantage (e.g., Ohl & Rolfs, 2017, 2020). This suggests that action-related prioritisation in visual working memory is an involuntary, hard-wired part of action planning. A dissociation between selection-for-action and endogenous, non-perceptual shifts of attention is also supported by the observation that simultaneous actions with different effectors yield independent effects, whereas cueing participants to attend to more than one item results in a memory trade-off (Hanning & Deubel, 2018). Findings obtained for eye movements reveal further differences with respect to time course and interaction with memory load. First, saccadic selection is strongest right after memory array offset and declines thereafter (Ohl & Rolfs, 2017, 2018), whereas retro-cue benefits emerge even when the cues are presented several seconds after the memory array (e.g., Astle et al., 2012). Second, saccadic selection occurs independently of memory load (Ohl & Rolfs, 2020), whereas retro-cue benefits increase with set size (see Souza & Oberauer, 2016).

The differences between action-related and cue-related selection described here are merely starting points for a systematic delineation of the underlying mechanisms, which calls for research directly comparing these processes, also at the neural level (for other efforts to delineate the contributions of different biasing mechanisms to prioritisation in visual working memory, see for example Heuer & Schubö, 2018; Niklaus et al., 2019).

Conclusions and outlook

The studies reviewed here have shown that maintenance in visual working memory, similar to perception and transfer into memory, is biased towards potentially action-relevant information. This information is automatically selected and held in a prioritised, stable state. These findings call for a more functional view of visual working memory that emphasizes its purpose of facilitating goal-directed actions, corroborating and extending recent proposals that consider visuospatial working memory to be a key component of the eye movement system (van der Stigchel & Hollingworth, 2018).

Selection-for-action in visual working memory is a newly burgeoning field and our understanding is naturally incomplete. Throughout this review, we have pointed out remaining open questions related to specific characteristics and mechanisms. At a broader level, it will be pivotal to develop new experimental protocols that establish whether the influence of actions generalizes to more natural settings. For a start, future studies could employ movement tasks that are less controlled, for example examining the effects of self-generated actions. Similarly, the complexity of the information maintained in memory could be increased from simple items that carry information on more than one feature dimension to real-world objects that also differ in more abstract features such as affordance. At our current state of knowledge, it seems reasonable to assume that actions constitute the primary source of selection in visual working memory in an ecologically valid environment.

Acknowledgment

We acknowledge support by the Open Access Publication Fund of Humboldt-Universität zu Berlin.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the Deutsche Forschungsgemeinschaft (DFG; German Research Foundation) under grants HE8207/1-1 to A. Heuer, OH274/2-2 to S. Ohl, and RO3579/6-2, RO3579/11-1 as well as RO3579/8-1 to M. Rolfs.

Notes

1 In line with previous work, we use the term “selection-for-action” (Allport, 1987) to emphasize the primary purpose of selection in visual working memory in the reviewed studies: to ensure that any information that might be required for upcoming actions (e.g., information spatially congruent with an action goal) is readily available. The actual utilization of that information for action planning and control is thus not needed for this term to apply.

2 There is evidence from so-called ‘incidental cueing’ (Zokaei, Manohar, et al., 2014; Zokaei, Ning, et al., 2014) that seems to indicate that even non-predictive endogenous cues presented during the delay bias visual working memory. Unlike typical retro-cues, however, these incidental cues required a response: Both cue and memory probe indicated one of the memorised items by its colour and required the discrimination of location (cue) or reproduction of motion direction (probe). Thus, this paradigm can be thought of as a special instance of a dual task, and the influence of the ‘cue’ as similar to the effect of actions reviewed here (after all, the button press response required by the cue is an action).

References