Research Article

Sensorimotor anticipation of others’ actions in real-world and video settings: Modulation by level of engagement?

Pages 293-304 | Received 21 Jun 2021, Published online: 14 Jun 2022

ABSTRACT

Electroencephalography (EEG) studies investigating social cognition have used both video and real-world stimuli, often without a clear rationale for choosing one over the other. Video stimuli can be selected for practical reasons, while naturalistic real-world stimuli are ecologically valid. The current study investigated modulatory effects on EEG mu (8–13 Hz) suppression, directly prior to the onset – and during the course – of observed actions, related to real-world and video settings. Recordings were made over sensorimotor cortex and stimuli in both settings consisted of identical (un)predictable object-related grasping and placing actions. In both settings, a very similar mu suppression was found during unfolding of the action, irrespective of predictability. However, mu suppression related to the anticipation of upcoming predictable actions was found exclusively in the real-world setting. Thus, even though the presentation setting does not seem to modulate mu suppression during action observation, it does affect the anticipation-related mu suppression. We discuss the possibility that this may be due to increased social engagement in real-world settings, which in particular affects anticipation. The findings emphasize the importance of using real-world stimuli to bring out the subtle, anticipatory aspects related to action observation.

Introduction

Social cognition enables one to perceive, understand, and respond to others and to social situations. Recently, concerns have been raised about the ecological validity of the majority of studies investigating social cognition. The main thrust of the criticism is that many studies investigated social stimuli without creating a genuine real-world social encounter and therefore studied a diluted, incomplete form of social cognition, which calls into question the applicability of the conclusions of these studies (e.g., Foulsham et al., 2011; Spiers & Maguire, 2007; Zaki & Ochsner, 2009).

The automatic responses generated by social stimuli during genuine social encounters differ from those generated by social stimuli presented in video displays in a number of important respects. One such respect involves attention and engagement. For example, eye-tracking data have shown that subtle modifications in others’ eye gaze had an effect on participants’ eye movements in real-world interactions but not in video presentations (Foulsham et al., 2011; Freeth et al., 2013; Laidlaw et al., 2011). The authors argued that the mere opportunity for social interaction altered social attention. fMRI studies provided evidence for the differential impact on neural responses of perceiving the experimenter via a live video-feed compared to a previous recording of the experimenter (Redcay et al., 2010). In the former condition, greater activation was found in multiple cortical areas involved in social cognition and reward. Another fMRI study showed that perceiving speech as a real-time audio-feed by a social partner, in contrast to believing the audio was prerecorded, increased activation of the mentalizing system (Rice & Redcay, 2016). EEG studies revealed neurophysiological differences between the perception of a static, live-presented person and a picture of a person (Hietanen et al., 2008; Pönkänen et al., 2011), and found that differences in electrophysiological responses to a change in gaze direction were only enhanced in the real-world setting.

The observation of others’ actions activates an action observation network (AON; Cross et al., 2009), which encompasses higher-order visual regions encoding biological motion, most notably the superior temporal sulcus (STS; Jellema & Perrett, 2003a, 2003b) and parieto-frontal regions. The latter are primarily motor areas, yet they receive visual input from biological motion (Caspers et al., 2010). Within the parieto-frontal regions, the inferior parietal lobule (IPL) and ventral premotor cortex (vPMC) are considered key nodes of the AON, containing mirror neurons that not only become active during action execution but also during observation of similar actions carried out by others (see Rizzolatti & Sinigaglia, 2016, for a review). In concert with other areas of the AON, the mirror neuron areas give rise to the mirror neuron system (MNS), which has been proposed to play a role in the immediate, involuntary understanding of others’ actions (Rizzolatti & Sinigaglia, 2010). The idea is that due to the perceptual-motor matching property of mirror neurons, observed actions automatically trigger the motor representation corresponding to the observed action within the observer’s motor repertoire, which in turn triggers representations of associated outcomes (behavioral/emotional/visceral) that would normally be associated with that particular action when executed by the observer themselves. However, the issue of whether goal/intention reading is generated by, and thus follows, automatic mirroring, or whether it precedes it, remains debated (e.g., Pomiechowska & Csibra, 2017), as is the more general role of the MNS and motor activation in social cognition (Heyes, 2010; Jacob & Jeannerod, 2005).

So far, the few studies that have directly compared activity in the MNS in response to real or video stimuli have produced mixed results. For example, a single-cell study in macaque monkeys reported that real-world actions produced enhanced mirror neuron responses compared to actions presented in video-clips (Ferrari et al., 2003). However, in another single-cell study in macaque monkeys by Caggiano et al. (2011), no differences in mirror neuron responses were found between observed real-world and video actions. Several imaging studies have supported the superiority of real-world over video stimuli in humans. Shimada and Hiraki (2006) used near-infrared spectroscopy (NIRS) to measure the activity in the sensorimotor cortex of infants during the observation of biological and non-biological motion, in both real-world and video presentations. The difference between the two motion conditions – with more activation for biological motion – was exclusively found in the real-world setting. In a MEG study that recorded power reductions in the beta frequency range during action observation as an index for MNS activity, it was found that power reductions were larger in the naturalistic situation than when video-clips were observed (Järveläinen et al., 2001). An index more commonly used for MNS activity is suppressed power in the alpha frequency band over the sensorimotor areas, called mu suppression (for a meta-analysis, see Fox et al., 2016). Ruysschaert et al. (2013) presented actions in a real-world and video setting to infants while recording mu suppression. They reported mu suppression only during the observation of real-world goal-directed actions and not when the actions were presented in video-clips.

An important prerequisite for successful social interaction is being able to predict what the other person is most likely going to do next, an ability that has been proposed to rely (partly) on the MNS (Kilner et al., 2007, 2004; Krol et al., 2020; Maranesi et al., 2014; Prinz, 2006; Southgate et al., 2009). While Kilner et al. (2004) recorded anticipation effects in ERPs (i.e., readiness potentials) prior to the onset of action observation in adults, the studies by Krol et al. (2020) and Southgate et al. (2009) found anticipation effects in mu suppression, in adults and infants, respectively. A single-cell study by Maranesi et al. (2014) reported activation of mirror neurons in the 300–400 ms period directly prior to the observation of the onset of predictable actions. The study by Kilner et al. (2004) presented video stimuli, and the other three studies presented the actions in a real-world setting.

The current study

It is as yet unclear whether action anticipation and observation are processed differently in real-world settings compared to video observations. We reported – using a real-world paradigm – a distinct mu suppression immediately prior to the onset of predictable actions, which we attributed to action anticipation (Krol et al., 2020). The anticipation response exclusively occurred when the upcoming action was predictable (based on contextual cues). If the actions were not predictable, anticipation effects were absent. However, in another study (Krol & Jellema, 2022) in which (un)predictable actions were presented in video-clips, no such anticipation effects were found. As there were major differences in the respective paradigms, with, for example, two interacting agents being observed in the video study versus a single agent in the real-world study, and the use of different electrode clusters, a direct comparison of the two studies is not possible. Therefore, the current study combines an adapted data set from the Krol et al. (2020) real-world study with a new data set using video presentations that were designed to match the real-world actions. This allowed us to directly examine the impact that real-world versus video presentations have on mu suppression during both the anticipation and observation of (non-)predictable actions.

The first research question was whether there is a difference in the extent of mu suppression during action observation between the two settings, and the second question was whether anticipation effects for upcoming predictable actions were modulated by the type of setting. We hypothesized that the presumed increased level of observer engagement in the real-world setting would enhance mu suppression both during the anticipation of upcoming actions and during the unfolding of the action, compared to the video setting. To the best of our knowledge, no previous studies have examined mu suppression during real-world versus video observation of actions using one and the same paradigm in adults.

Methods

Participants

A total of 39 participants were included: 16 of them (3 males and 13 females; age, M = 22.9, SD = 7.7) took part in the real-world experiment (Krol et al., 2020) and the other 23 participants took part in the video experiment (7 males and 16 females; age, M = 22.1, SD = 7.7). The two groups did not differ in age (t(37) = .31, p = .76, d = .05) or male/female ratio (χ²(1, N = 39) = 1.84, p = .18). The minimum sample size for the video experiment was estimated through G*Power (Faul et al., 2007), using as effect size dz = .72, α = .05, and 1-β = .80. The effect size estimation was performed on the effect of predictability on mu suppression during the Anticipation phase in Krol et al. (2020). The minimum sample size was 18 (two-tailed). We chose to further increase the sample size for the video experiment to 23. This number of participants is in line with comparable studies that used video presentations with similarly timed goal-directed hand actions producing sufficiently powered results (e.g., Avanzini et al., 2012; Streltsova et al., 2010). All participants were Psychology undergraduate students of the University of Hull, and received credits toward their degree in exchange for participation. Exclusively right-handed participants were recruited, and all participants had normal or corrected vision. Ethical approval was obtained from the ethics committee of the Psychology department of the University of Hull. All participants provided written consent.
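As a rough cross-check of the sample size estimate, the sketch below uses a normal approximation with a small-sample correction term (often attributed to Guenther) for a two-tailed paired test. This is not the exact noncentral-t computation that G*Power performs; the function name and the choice of approximation are assumptions made for illustration only.

```python
import math
from statistics import NormalDist


def approx_paired_sample_size(dz: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n for a two-tailed paired (one-sample) t-test.

    Normal approximation plus the z_alpha^2 / 2 correction term.
    This is an illustrative approximation, not G*Power's exact method.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical z for two-tailed alpha
    z_power = NormalDist().inv_cdf(power)          # z corresponding to the target power
    n = ((z_alpha + z_power) / dz) ** 2 + z_alpha ** 2 / 2
    return math.ceil(n)


# Inputs taken from the text: dz = .72, alpha = .05, 1 - beta = .80
print(approx_paired_sample_size(0.72))
```

With the inputs reported above, this approximation also arrives at 18 participants, in agreement with the minimum sample size stated in the text.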

Materials and procedure

The following descriptions apply to both real-world and video experiments; differences between the two paradigms are indicated where they apply. Prior to the experiment, participants received an oral briefing about the experimental procedures and signed an informed consent form. EEG recordings were obtained during the observation of an agent performing grasping and placing actions. Playing cards were chosen as objects to be manipulated because they require a specific hand grip for picking up (precision grasp), can easily be incorporated in a game that follows a set of fixed rules (e.g., half are red and half are black), and are well known. In the video condition, the sequences were filmed with a high-definition video camera. Clips were edited with Adobe Premiere Pro CS5 to ensure that the timing of the phases was equal in all clips. The video stimuli were presented on an LCD monitor using E-Prime 2.0 software (Psychology Software Tools, Pittsburgh, PA). This program was also used to change the color of the tablet screen.

At the start of each trial, participants saw an agent (in side-view) sitting at a table, and a tablet positioned on the table in front of the agent. A deck of cards was split in two, with one half positioned on top of the tablet (covering part of the tablet screen) and the other half positioned 20 cm further away from the agent (Figure 1). Each trial included the following phases:

Figure 1. Schematic depiction of sequence of events in single trials in both settings. Top panel shows the real-world setting. In this example a match of tablet color and card color (both red) meant that the participant knew that the agent was going to pick up a card and place it on the table. a, touchpad; b, tablet with deck of cards; c, designated area for placing the cards. The bottom panel shows single key frames from a clip in the video setting (same rule). The selected electrodes in both the real-world and video experiments consisted of two clusters of nine electrodes around CP3 and CP4 of the 10/20 system (far right).


1) Rest phase: The 2 s at the start of each trial in which the agent sat still at the table with their right hand positioned on the table next to the tablet. The tablet screen was white. A 1 s interval of the Rest phase was used as the baseline period, which in the real-world condition started about 2 s before the onset of the subsequent Signal phase. It should be noted that in the real-world setting, there was inevitably some variability in the temporal position of the 1 s baseline period because the Delay phase (see below) was self-timed and thus varied somewhat from trial to trial. The onset of the agent’s action, tagged with a digital marker (see below), served to align the single trials for averaging. To avoid the risk of including the onset of the Signal phase in the baseline period, we selected the period from −5 s to −4 s (with respect to action onset at t = 0) as baseline. In the video setting, the 1 s baseline period started exactly 1 s before the onset of the Signal phase (from −3.5 s to −2.5 s).

2) Signal phase: The 2 s during which the tablet screen displayed a red, black, or blue color signal.

3) Delay phase: The agent did not pick up the card immediately following offset of the signal, but following a short delay, which was 500 ms in the video-clips. Although in the real-world setting, the experimenter attempted to delay the action for .5 s following Signal offset, subsequent measurements indicated the delay period was on average 950 ms (SD = 84 ms). For the analysis, we selected the 500 ms directly preceding action onset, which corresponded to the 500 ms Delay interval in the video setting.

4) Action phase: The 3 s period during which in half of the trials the agent reached out, grasped, and placed the top card, while in the other half, no action was performed. All actions were performed with the right hand. On average, the real-world actions lasted 3.70 s (SD = .27), which was 700 ms longer than the actions in the videos. Even though the real-world actions were intended to last for 3 s, this was hard to adhere to in live performance, and their durations inevitably varied somewhat from trial to trial.

5) Rest phase: The 2 s period directly following the end of the action, during which the actor’s hand resumed the starting position.

6) Intertrial interval (ITI): In the real-world condition, the ITI lasted approximately 4 s, during which time the tablet screen turned white and the participant was allowed to look away and blink. To alert the participant to the start of a new trial, the color of the tablet screen changed from white to yellow for 1 s, directly followed by the color signal. In the video condition, the presentation monitor displayed a gray screen for 1 to 3 s during the ITI, on which a black fixation cross was presented during the last second of the ITI.
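The nominal timing of the phases in the video setting can be laid out on a common time axis with action onset at t = 0. The sketch below does this using only the durations given in the phase descriptions; the function and variable names are hypothetical, and the self-timed real-world trials would deviate from these nominal values.

```python
# Nominal phase durations in the video setting (seconds), taken from the text.
PHASES = [
    ("Rest", 2.0),
    ("Signal", 2.0),
    ("Delay", 0.5),
    ("Action", 3.0),
    ("Rest (post-action)", 2.0),
]


def phase_windows(phases, onset_phase="Action"):
    """Return {phase: (start, end)} in seconds relative to the onset of `onset_phase`."""
    names = [name for name, _ in phases]
    # Total time that elapses before the reference phase begins.
    offset = sum(duration for _, duration in phases[: names.index(onset_phase)])
    windows, t = {}, -offset
    for name, duration in phases:
        windows[name] = (t, t + duration)
        t += duration
    return windows


windows = phase_windows(PHASES)
signal_onset = windows["Signal"][0]
# The 1 s baseline is the final second of the initial Rest phase.
baseline = (signal_onset - 1.0, signal_onset)

for name, (start, end) in windows.items():
    print(f"{name:>20}: {start:+.1f} s to {end:+.1f} s")
print(f"{'Baseline':>20}: {baseline[0]:+.1f} s to {baseline[1]:+.1f} s")
```

Under these nominal durations the baseline falls at −3.5 s to −2.5 s, matching the window stated for the video setting.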

Predictable/unpredictable blocks. Each participant completed two blocks; one block included exclusively predictable actions, the other exclusively unpredictable actions. In the block with predictable actions, participants were able to predict whether or not an action was going to be performed by the agent on the basis of the color of the signal and of the top card: if the colors matched – both red or both black – then the agent performed the action; if the colors did not match, no action was performed. In the block with unpredictable actions, the color of the signal was blue throughout the block, so that participants could not predict whether an action would follow or not. Participants were informed of these rules directly prior to the onset of the block. In each block, an action occurred in half of the trials in a random order. A block consisted of 80 trials (40 action trials, 40 no-action trials), giving 160 trials in total over both blocks. The block with predictable actions was always presented first, as this was the crucial block that potentially contained the anticipation effects, and we did not want the participants’ responses in this block to be contaminated by any experience from the unpredictable block.

Markers. In the real-world condition, two markers were recorded: one marked the onset of the Signal phase (this marker was generated by the E-Prime software) and one marked the onset of the actor’s action. This latter marker was generated by a touchpad (Pal Pad, Adaptivation, Inc., Sioux Falls, United States) located next to the tablet, on which the actor’s right hand rested. Lifting the hand from the pressure-activated membrane switch in the touchpad produced a 4 V signal. The touchpad was connected to the AD-box of the EEG system, recording the exact timing of the action onset and action ending (hand placed back on pad). In the video condition, the E-Prime software provided a marker signaling the start of the trial.

It took approximately 40 minutes to complete the experiment, including a short break between the blocks.

EEG Recording

EEG was continuously recorded from 64 Ag–AgCl-tipped electrodes arranged according to the International 10–20 EEG System using the ActiveTwo system (BioSemi, Amsterdam, The Netherlands). EEG recordings were made relative to the common mode sense (CMS) electrode, located in between P1, PO3, and POz. The CMS and DRL (passive driven right leg), located in between P2, POz, and PO4, formed a feedback loop driving the subject’s average potential as close as possible to the reference voltage in the A/D-box (i.e., the amplifier “zero”). The electrodes had integrated amplifiers to reduce noise and interference in the data. The BioSemi software program, ActiView, was used for EEG data acquisition, where the sampling rate was down-sampled from 2048 Hz to 512 Hz.

EEG Analysis

The EEG data were processed using BrainVision Analyzer 2.1.1 (Brain Products GmbH, Gilching, Germany). First, the raw data were re-referenced to an average reference. The 8–13 Hz frequency band was extracted with a Butterworth Zero Phase filter (48 dB/oct). Next, the data were segmented to cover the entire 10 s of events. In the real-world condition, 5 s before and 5 s after the onset of the action (indicated by the touchpad marker) were selected. A visual inspection was done to remove incorrect markers. Artifact rejection was applied by removing segments that contained data with amplitudes below −50 µV or above 50 µV (6% of all segments were removed). After artifact rejection, the data were rectified so that amplitudes in the alpha frequency band had exclusively positive values. Subsequently, the data were baseline corrected, followed by averaging of all segments of each condition. Finally, the area information was exported in a selected time domain.
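The rejection, rectification, baseline-correction, and averaging steps described above can be sketched in a few lines. This is a simplified illustration, not the BrainVision Analyzer pipeline: it assumes the segments have already been band-pass filtered to 8–13 Hz, it assumes baseline correction by subtracting the mean of the baseline window (the text does not state the exact form of the correction), and the function name and toy data are hypothetical.

```python
def tse_average(segments, baseline_slice, reject_uv=50.0):
    """Average rectified, baseline-corrected segments of band-passed EEG (values in µV).

    segments       : list of equal-length lists of samples (already filtered to 8-13 Hz)
    baseline_slice : slice selecting the baseline samples within each segment
    reject_uv      : segments with any sample below -reject_uv or above +reject_uv are dropped
    """
    # Artifact rejection: keep only segments that stay within +/- reject_uv.
    kept = [seg for seg in segments if all(abs(v) <= reject_uv for v in seg)]
    if not kept:
        raise ValueError("all segments were rejected")

    corrected = []
    for seg in kept:
        rectified = [abs(v) for v in seg]           # rectification
        base = rectified[baseline_slice]
        base_mean = sum(base) / len(base)
        # Assumed form of baseline correction: subtract the baseline mean.
        corrected.append([v - base_mean for v in rectified])

    n = len(corrected)
    # Average across segments, sample by sample.
    return [sum(column) / n for column in zip(*corrected)]


# Toy data: the third segment exceeds 50 µV and is rejected.
toy_segments = [
    [4.0, -4.0, 2.0, -2.0],
    [6.0, -6.0, 3.0, -3.0],
    [80.0, 1.0, 1.0, 1.0],
]
result = tse_average(toy_segments, baseline_slice=slice(0, 2))
print(result)
```

In the toy example, negative values in the output indicate amplitude below the baseline level, which is the direction corresponding to suppression.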

The above method of transforming raw data into power in squared microvolts (µV²) in a specific frequency band is called temporal spectral evolution (TSE; Salmelin & Hari, 1994). This method has been used successfully in other studies that investigated mu suppression (Babiloni et al., 2002; Cattaneo et al., 2007; Hari & Salmelin, 1997; Salmelin & Hari, 1994, 1994). Averaged power was exported for two pre-selected clusters of nine electrodes: one over the left hemisphere (C5, C3, C1, CP5, CP3, CP1, P5, P3, and P1) and one over the right hemisphere (C2, C4, C6, CP2, CP4, CP6, P2, P4, and P6; Figure 1). These electrodes were selected to cover the cortical areas involved in sensorimotor representations of hand and arm actions (Babiloni et al., 2002). Similar clusters were analyzed in a number of previous studies in which hand/arm actions were presented (Avanzini et al., 2012; Coll et al., 2017; Southgate et al., 2009).

Statistical analysis

Mu suppression was analyzed in an overall 2 × 2 × 2 × 2 ANOVA with Prediction (Predictable, Unpredictable), Phase (Anticipation, Action), and Hemisphere (LH, RH) as within-subject factors and Setting (Real-world, Video) as between-subjects factor. The Anticipation phase consisted of the 500 ms immediately before action onset and the Action phase consisted of the 3 s after action onset. In a final analysis, the magnitude of power reductions prior to, and during, the observation of actions was compared with the level of power in the within-trial baseline phase, using one-sample t-tests. Bonferroni corrections were applied to correct for multiple comparisons, and Greenhouse–Geisser corrections were applied whenever Mauchly’s test of sphericity was significant.
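To make the analysis windows concrete, the sketch below converts them into sample indices. It assumes the 512 Hz rate stated in the recording section and a segment that starts 5 s before action onset, as described for the real-world condition; the helper name is hypothetical and the actual export in BrainVision Analyzer may define window edges differently.

```python
SAMPLING_RATE_HZ = 512  # rate after down-sampling, from the text
PRE_ONSET_S = 5.0       # segment starts 5 s before action onset (real-world condition)


def window_to_samples(start_s, end_s, fs=SAMPLING_RATE_HZ, pre_onset_s=PRE_ONSET_S):
    """Convert a window in seconds (relative to action onset) to a half-open sample range."""
    first = round((start_s + pre_onset_s) * fs)
    last = round((end_s + pre_onset_s) * fs)
    return first, last


anticipation = window_to_samples(-0.5, 0.0)  # 500 ms immediately before action onset
action = window_to_samples(0.0, 3.0)         # 3 s after action onset
baseline = window_to_samples(-5.0, -4.0)     # real-world baseline window

print(anticipation, action, baseline)
```

The Anticipation window therefore spans 256 samples and the Action window 1536 samples per segment under these assumptions.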

Results

First, an overall 2 (Prediction) × 2 (Phase) × 2 (Hemisphere) × 2 (Setting) ANOVA was computed (see Figure 2 for the time course of power reductions). This analysis demonstrated that the main effects of Prediction, F(1, 37) = 1.42, p = .24, ηp² = .04, and Setting, F(1, 37) = 2.63, p = .11, ηp² = .07, were not significant. The main effect of Phase, F(1, 37) = 20.08, p < .001, ηp² = .35, was significant. Power suppression in the Anticipation phase (M = −.06; SD = .14) was smaller than in the Action phase (M = −.21; SD = .22). The main effect of Hemisphere was also significant, F(1, 37) = 6.93, p = .01, ηp² = .16. Power in the right hemisphere (M = −.14; SD = .14) was more reduced than in the left hemisphere (M = −.11; SD = .16). The two-way interaction effect of Setting by Prediction, F(1, 37) = 11.42, p = .002, ηp² = .24, was significant. Importantly, the three-way interaction effect between Setting, Phase, and Prediction was also significant, F(1, 37) = 10.08, p = .003, ηp² = .21. None of the other interaction effects were significant, including the four-way interaction effect, F(1, 37) = .18, p = .67, ηp² = .005.

Figure 2. Time course of power fluctuations in the action conditions. Left-side panel, Real-world condition. The top graph depicts power in the mu rhythm averaged over the 9-electrode cluster of the left hemisphere (LH), for the predictable and unpredictable actions. The bottom graph displays the same information for the right hemisphere cluster (RH). The onset of the actions was at t = 0. Right-side panel, Video condition. Details are as those of the real-world condition.


Since the main question was whether the neural activation as a result of observing predictable or unpredictable actions differs between the real-world and video setting, the three-way interaction effect was further explored in a 2 (Prediction) × 2 (Setting) analysis per phase.

Anticipation phase. There were no significant effects of Prediction, F(1, 37) = 2.41, p = .13, ηp² = .06, or Setting, F(1, 37) = 2.27, p = .14, ηp² = .06. However, the interaction between Prediction and Setting was significant, F(1, 37) = 14.78, p < .001, ηp² = .29. Paired-samples t-tests revealed that in the real-world condition, the reduction of power for predictable actions (M = −.18; SD = .16) was significantly larger than for unpredictable actions (M = −.02; SD = .15; t(15) = −2.57, p = .02, dz = −.66). In contrast, in the video condition the effect was reversed, with larger power reductions in the unpredictable block (M = −.07; SD = .16) than in the predictable block (M = −.001; SD = .18), t(22) = −2.80, p = .01, dz = −.59.

Action phase. The 2 (Prediction) × 2 (Setting) ANOVA demonstrated that there was no significant effect for Prediction, F(1, 37) = .002, p = .96, ηp² < .001, and no significant effect for Setting, F(1, 37) = 1.98, p = .17, ηp² = .05. There was again a significant Setting by Prediction interaction effect, F(1, 37) = 4.80, p = .04, ηp² = .12. The difference in power between the two Prediction blocks, however, was not significant in either setting (paired-samples t-tests, real-world: t(15) = −1.27, p = .22, dz = −.08; video: t(22) = 1.89, p = .07, dz = .09). Neither predictable nor unpredictable actions induced significantly more mu suppression in the real-world setting than in the video setting (independent-samples t-tests; predictable, t(37) = 1.71, p = .095; unpredictable, t(37) = 1.00, p = .32).

Thus, during the observation of both predictable and non-predictable actions, there were no significant differences in the extent of mu power suppression between the real-world setting and the video setting. In contrast, with respect to the anticipation of upcoming actions, mu suppression was exclusively found prior to action onset of predictable actions in the real-world setting and not in the video setting.

Comparisons to baseline. The previous analysis did not determine whether or not power was significantly reduced relative to the baseline, i.e., the 1-s epoch before the Signal phase. This was examined in one-sample t-tests (test value = 0), and the averaged power reductions are shown in Figure 3. Bonferroni corrections were applied to correct for multiple comparisons (alpha was adjusted to a significance level of .006, 8 comparisons in total).

Figure 3. Comparisons to baseline. The average mu power suppression over the sensorimotor areas is shown, separated by Phase (anticipation and action), Setting (real-world and video), and Prediction (predictable and unpredictable). Data were baseline corrected to a within-trial baseline period. Error bars represent ±1SD.


In the real-world condition, power in the Anticipation phase was significantly reduced from baseline for predictable actions, t(15) = −4.57, p < .001, d = −1.18, but not for unpredictable actions, t(15) = −.64, p = .53, d = −.18. In the Action phase, power was significantly reduced during the observation of both the predictable actions, t(15) = −4.48, p < .001, d = −.30, and unpredictable actions, t(15) = −5.14, p < .001, d = −.34. In the video condition, power in the Anticipation phase did not significantly differ from baseline for either action type (predictable actions, t(22) = −2.02, p = .06, d = −.43; unpredictable actions, t(22) = −0.03, p = .98, d = −.01).

Power in the Action phase was reduced from baseline for both action types (predictable actions, t(22) = −3.19, p = .004, d = −.15; unpredictable actions, t(22) = −4.03, p = .001, d = −.18).
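The Bonferroni-adjusted threshold and its application to the eight baseline comparisons can be reproduced with simple arithmetic. The sketch below uses the p-values reported above; where the text gives only "p < .001", the upper bound 0.001 is entered as a stand-in, which is an assumption made so the comparison can be computed.

```python
ALPHA = 0.05
N_COMPARISONS = 8
threshold = ALPHA / N_COMPARISONS  # 0.00625, reported in the text as .006

# Reported p-values keyed by (setting, phase, prediction).
# Entries reported as "p < .001" use 0.001 as an upper bound (assumption).
p_values = {
    ("real-world", "anticipation", "predictable"): 0.001,
    ("real-world", "anticipation", "unpredictable"): 0.53,
    ("real-world", "action", "predictable"): 0.001,
    ("real-world", "action", "unpredictable"): 0.001,
    ("video", "anticipation", "predictable"): 0.06,
    ("video", "anticipation", "unpredictable"): 0.98,
    ("video", "action", "predictable"): 0.004,
    ("video", "action", "unpredictable"): 0.001,
}

# A comparison survives correction when its p-value is below the adjusted threshold.
significant = {key: p < threshold for key, p in p_values.items()}

for key, is_sig in significant.items():
    label = "significant" if is_sig else "not significant"
    print(f"{' / '.join(key):<45} p = {p_values[key]:<6} {label}")
```

Under this threshold, five of the eight comparisons survive correction: anticipation of predictable actions in the real-world setting, and action observation in all four setting-by-prediction combinations.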

The results of the comparison to the baseline epoch corroborate the results of the ANOVAs. Significant power reductions were only found immediately before the onset of predictable actions in the real-world, and not in the video, setting. No significant mu suppression was measured prior to unpredictable actions in either setting. In both prediction blocks and in both settings, power was reduced from baseline during the observation of the actions.

No-action conditions. We next analyzed the No-action conditions (predictable and non-predictable), recorded in the real-world and video settings (Figure 4). In these conditions, the Signal phase is present, but no action occurs. In the predictable No-action condition, the participant knows that no action will be performed (tablet color and card color do not match); in the unpredictable No-action condition, the participant does not know in advance whether an action will be performed (the tablet signal turns blue on each trial), and there turns out to be no action. These conditions are useful for examining any possible mu suppression effects due to the mere presentation of the signal color. For example, if the signal offset were to induce mu suppression, then that could be mistaken for mu suppression related to action anticipation, as the Delay phase immediately followed the Signal phase. The overall 2 (Prediction) × 2 (Phase) × 2 (Hemisphere) × 2 (Setting) ANOVA did not reveal any significant main or interaction effects. Comparisons to baseline using one-sample t-tests (test value = 0) showed that neither the Delay phase nor the Action phase (during which no action was presented) differed from the baseline in any of the conditions in the two settings (all ps > .5).

Figure 4. Time course of power fluctuations in the No-action conditions. Left-side panel, Real-world setting. The top graph depicts power in the mu rhythm averaged over the 9-electrode cluster of the left hemisphere (LH), for the predictable and unpredictable conditions. The bottom graph displays the same information for the right hemisphere cluster (RH). The onset of the actions was at t = 0 in the Action conditions. Right-side panel, Video condition. Details are as those of the real-world setting. The inset-illustrations depict the predictable No-action conditions.


Discussion

The main objective of the current study was to determine whether the setting in which actions are presented – real-world versus video setting – affects power suppression in the alpha frequency band (mu rhythm) over the sensorimotor cortical areas. The actions presented were either unpredictable, or predictable on the basis of contextual cues, which allowed us to examine anticipation effects. The paradigms in both settings were very similar in terms of presented actions, setup arrangements, and complexity, so that the relative impact of these settings could be compared.

Action observation. Mu suppression was found during the observation of actions, in line with previous reports (e.g., Arnstein et al., Citation2011; Avanzini et al., Citation2012; Cochin et al., Citation1999; Nyström et al., Citation2011; Perry & Bentin, Citation2009; Southgate et al., Citation2009). There was no significant overall difference in the extent of mu suppression between the real-world and video settings. Therefore, earlier reports that mu suppression induced by action observation is enhanced in real-world settings (Ruysschaert et al., Citation2013) are not supported by the current findings. It should be noted though that Ruysschaert et al. (Citation2013) recorded from infants, and video observation in infancy could differ from that in adults. For example, it has been shown that, relative to adults, infants learn less from observing videos than from real-world observation, a phenomenon referred to as the video deficit in infancy (Anderson & Pempek, Citation2005).

Action anticipation. Mu modulations related to setting and predictability were found during the anticipation of upcoming actions. That is, mu suppression increased exclusively in the anticipation period for predictable actions in the real-world setting. No mu reduction was found during the anticipation period for predictable actions in the video setting, while non-predictable actions did not produce mu suppression in either setting. This concords with Southgate et al. (Citation2009), who reported EEG anticipation effects in a real-world setting. However, Southgate and colleagues tested 9-month-old infants and did not include a non-predictable action condition, which makes comparison difficult. EEG anticipation effects in response to actions presented in videos have rarely been investigated. The Kilner et al. (Citation2004) study was one of the few that reported EEG anticipation effects prior to the onset of predictable actions in adults using video stimuli. However, the Readiness Potential was used as an index for simulation activity in the motor system. It is as yet unclear how the Readiness Potential relates to mu suppression and whether it indexes the same motor activity. Braukmann et al. (Citation2017) reported anticipatory motor activation in adults, indexed by beta-band suppression, using video presentations of predictable object-directed multi-step actions (e.g., making a cup of tea). They reported progressive increases in beta suppression with each successive action step, which they attributed to anticipation of the next step. However, in their paradigm, the presented multi-step actions consisted of a continuous flow of movements, which made it impossible to separate contributions to power suppression due to the prediction of the next sub-step from those due to the observation of the current movement.
Nevertheless, it may well be that the continuous flow of consecutive action sub-steps, where each step logically predicts the next step, is an effective way to induce anticipation effects (cf., Cattaneo et al., Citation2007), to the extent that they can be measured when engagement is relatively low, as in video presentations. It cannot be excluded that the absence of anticipation effects in our video experiment could be related to the artificial nature of the predictive cue (color signal) as opposed to a natural cue consisting of a sub-step of an action chain. This possibility will be explored in future experiments whilst allowing for temporal separation of prediction and action contributions. It should be noted that evidence for involvement of the beta frequency range in action observation remains mixed and usually the contribution of beta is found to be less prominent than that of mu, at least in adults (e.g., Avanzini et al., Citation2012; Babiloni et al., Citation2002).

Mu suppression was found to be stronger overall in the right hemisphere, though none of the interactions with hemisphere was significant. Execution of right-hand actions is typically associated with activation in motor areas of the left hemisphere; however, for observation of right-hand actions performed by an actor facing the observer, the pattern of hemispheric activation is less clear. Some studies reported predominantly left-hemisphere activation for observation of right-hand actions (e.g., Arnstein et al., Citation2011; Cochin et al., Citation1999; Press et al., Citation2011), others reported bilateral activation (e.g., Babiloni et al., Citation2002; Hari & Salmelin, Citation1997), and yet others a right-hemisphere dominance (e.g., Perry & Bentin, Citation2009). It has been suggested that stronger activation occurs over the sensorimotor cortex of the hemisphere that is contralateral to the area in space where movement is seen (Kilner et al., Citation2006). This could in principle account for the lateralization effect in the current study as the agent’s arm always started moving from the left side and returned to the left side. Although the pile on which the card had to be placed was located in the right half of the visual scene, most of the action occurred in the left visual field.

When recording sensorimotor mu suppression from central-parietal electrodes, there is a risk of mu being confounded by occipital alpha related to visual attentional processes (Hobson & Bishop, Citation2016; Klimesch, Citation2012). In Krol et al. (Citation2020) we specifically examined the role of occipital alpha and concluded that occipital alpha did not contribute to the differences in sensorimotor mu suppression between the predictable and unpredictable conditions.

In principle, there is the possibility that the Signal offset caused mu suppression, which could then have been mistakenly interpreted as anticipation-related mu suppression, since the Anticipation phase (i.e., the Delay) immediately followed the Signal phase. The No-action condition, where the tablet signal was presented in the same way as in the Action conditions, but no action followed, is well suited to examine this possibility. If any mu suppression occurred in this condition during the Delay period, it would most likely be related to the Signal, as there was no action to anticipate. In other words, this condition provides a fairly clean recording of Signal-related mu suppression, if it exists. Therefore, the finding that in the No-action condition no mu suppression was found during this period, nor during the subsequent “no action” period, strongly suggests that our anticipation-related mu suppression is not in fact Signal-offset-related mu suppression, and that the latter does not contribute to it.

Another potential issue is that in the real-world setting, the bodily posture and gaze direction of the experimenter performing the reach and grasp actions could inadvertently have provided cues as to whether an action would ensue. Such bodily cues were absent in the video setting, where merely arms and hands were visible. This could, in principle, compromise the “unpredictability” of the unpredictable condition, and could “strengthen” the predictability in the predictable condition (adding a bodily cue to the color cue). To prevent this from happening, the following measures were taken during the real-world recordings: In the instructions provided to the participant directly prior to the start of the recording, it was emphasized that throughout the trials they should exclusively look at the tablet and the moving hand. Only during the 4 s ITI were they allowed to quickly look away, or blink, if they wanted. Participants stringently adhered to this instruction, as verified by a second experimenter present in the lab. Furthermore, the actions were always performed by one and the same experimenter (MK), which allowed variability in the experimenter’s posture and gaze to be kept to a minimum. The experimenter ensured that she fixated her eyes on the tablet and cards at all times during the recording. If such bodily cues were nevertheless conveyed, then mu suppression should be observed prior to the onset of actions in the unpredictable action condition. In this condition, the tablet color (blue at each trial) did not provide any predictive information, allowing anticipatory mu suppression related to bodily cues, if it existed, to reveal itself. However, no mu suppression was found in this condition (, left panel) in the Delay period. This strongly suggests that no bodily cues associated with action performance were present, or that, insofar as they were present, they did not cause any anticipatory mu suppression.

What factors could explain the presence of a mu-related anticipation effect in the real-world setting and its absence in the video setting? One possibility is that participants’ engagement played a role. In an ecologically valid real-world setting, it may be more relevant to anticipate the agent’s actions as, in principle, a joint action could follow. Even though exclusively individual actions were presented in the current study, and participants knew that they were not required to interact with the agent and should passively observe the hand actions, in principle they could have interacted, and hence may have been more motivated to anticipate the actions. Naturalistic experimental settings have been demonstrated to enhance engagement. For example, real-world stimuli enhance social attentional gaze following compared to video stimuli (Risko et al., Citation2012). This seems to contradict the finding by Laidlaw et al. (Citation2011) that participants paid more attention to a videotaped confederate than to a live confederate. However, this latter effect seems to occur especially for scanning of the eye region. In the real-world condition of the current study, the participants were explicitly instructed to observe the hand and arm actions and not to pay attention to the agent’s face. Laidlaw et al. (Citation2011) suggested that social attention increases when social interactions are possible. Further, mu suppression studies in infants also indicate that engagement may be a crucial factor. For example, the finding by Ruysschaert et al. (Citation2013) that in 18–36-month-old infants, mu suppression during action observation occurred in real-world settings but not in video presentations, may reflect a lack of engagement with videos at a young age. This may be linked to the ease of processing of 3D versus 2D information.
Young children in particular struggle with transferring learning from 2D information (e.g., television) to real-life situations, as compared to 3D information (e.g., live interactions; Barr, Citation2010). One could speculate that enhanced processing of 3D information is also prevalent in young adults (current study).

The extent to which the social context resembles the real world may also be crucial when investigating differences between typical and atypical populations (Freeth et al., Citation2013; Laidlaw et al., Citation2011). For example, some studies found less suppression of mu power (presumably reflecting reduced activation in the MNS) in individuals with autism spectrum disorder compared to controls during action observation (e.g., Dapretto et al., Citation2006; Oberman et al., Citation2005; Williams et al., Citation2006), while others found no differences (e.g., Fan et al., Citation2010; Raymaekers et al., Citation2009). Differences in interest in, and engagement with, the social stimuli, instigated by their nature and setting, may well have contributed to these discrepancies, and may specifically affect anticipation processes (Hudson et al., Citation2012).

Limitations

Although precautions were taken to prevent the experimenter performing the reach and grasp actions in the real-world experiment from inadvertently providing postural/bodily cues as to whether an action was going to be performed, we had no objective measure to confirm this. In subsequent experiments, video or eye-tracking recordings of the participant observing the experimenter’s hand actions could be performed to objectively ascertain that no such additional cues were conveyed.

We suggested that increased social interest in, and engagement with, the agent performing the action in the real-world settings, and the potential to interact with them, might be a reason for the presence of anticipatory mu suppression in the real-world and not in the video setting, but we have no objective evidence for it. This might be obtained in future studies by measuring, for example, pupil size (e.g., Hopstaken et al., Citation2015), galvanic skin responses (e.g., Smallwood et al., Citation2004) or heart rate (e.g., Smallwood et al., Citation2004) in both settings, which have been proposed as indexes for engagement and social interest.

In conclusion, researchers in social neuroscience have become increasingly aware of the importance of naturalistic study designs. Previous EEG mu studies usually presented either video stimuli or real-world stimuli. A direct comparison of the influence of these settings on mu suppression, in one and the same paradigm, has rarely been performed. The current study aimed to fill this gap. The findings supported the claim of the superiority of real-world stimuli over video stimuli. That is, exclusively in the real-world setting, mu suppression was found prior to the onset of predictable upcoming actions. No such effect was present in the video setting for predictable actions. No differences in mu suppression between settings were found during action observation. The reason for the anticipatory mu effect could be that the observer’s extent of anticipation depends on their engagement with the agent, and that engagement is enhanced in real-world settings where there is in principle the possibility for joint action.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work has been supported by a grant from the University of Hull.

References