Managing activity transitions in robot-mediated hybrid language classrooms

Abstract

The development of videoconferencing technology has enabled new modes of combining in-person and remote teaching. In this article, we investigate interactional practices in hybrid language classrooms that combine on-site and remote participation by way of telepresence technology. Telepresence robots are videoconferencing tools that can be remotely controlled and moved in the ‘local’ space during video-mediated interaction. In our video-based study, we investigate recordings from university-level foreign language classes (Finnish, German, Swedish and English) involving robot-mediated participants as part of an otherwise on-site classroom student cohort. We draw on multimodal conversation analysis (CA) and analyse a selection of data extracts with a focus on how participants use the robot’s mobility as an interactional resource in moments of transition between whole-class and group-based activities. The analysis explores how moving the robot enables the remote student to demonstrate competent participation and to contribute to the progression of the activity transition. We also analyse how teachers make sense of the remote students’ engagements by monitoring the positioning and movements of the robot, and how they individually support the remote students in moments that can potentially be interactionally challenging in hybrid environments. These findings expand CALL literature by demonstrating how telepresence robots can enhance the multimodal range of meaning-making resources of remote students within everyday classroom practices in hybrid language teaching. As practical implications, we outline some ways in which social interaction provides both a rich resource base for participants and a site in which many pedagogical questions relevant to hybrid education play out.

1. Introduction

Synchronous computer-mediated communication (SCMC) has become a key aspect of distance language learning activities as a means to offer students opportunities to practice oral, interactional and intercultural skills. Reviewing existing research evidence, a meta-analysis by Ziegler (2016) suggests that SCMC contexts can be as beneficial as face-to-face contexts for students’ L2 development. A growing number of CALL studies have taken an interest in video-mediated and task-oriented SCMC, exploring interaction in both dyadic constellations (e.g. Guichon & Cohen, 2014; Satar, 2013, 2016) and in multi-party conferences or between larger learner cohorts (Austin et al., 2017; Rusk & Pörn, 2019; Satar & Wigham, 2017). In comparison to audio or text-based SCMC, videoconferencing can be seen to support the development of L2 interactional competence (Pekarek Doehler & Berger, 2018) by enabling learners to use a broader range of multimodal resources for participation and meaning-making, such as gaze behaviour, facial expressions, and hand gestures (see also Cohen & Wigham, 2019). However, researchers have also underscored that the possible benefits of videoconferencing for CALL depend on how well participants take the technology and its limitations into account in interaction (see e.g. Slaughter et al., 2019; Guichon & Cohen, 2014).

This paper probes such technology-specific “affordances of the medium” (González-Lloret, 2015, p. 577) by exploring interactional use of a telepresence robot as a means to provide synchronous hybrid language teaching. Unlike the kinds of autonomous social robots that tend to be in focus in robot-assisted language learning (RALL) research (for an overview, see Randall, 2019), telepresence robots are non-autonomous, mobility-enhanced, and remote-controlled videoconferencing solutions that typically have a camera, screen, speakers and a microphone, and wheels for movement. For a remote student participating synchronously in hybrid classroom teaching, the ability to “move” around the classroom in a mediated manner by operating the telepresence robot can offer an important resource for participation. Indeed, telepresence robots have previously been used in education to enable homebound or hospitalized students to participate in classroom teaching as well as to provide remote language teaching in hard-to-reach areas (e.g. Cha et al., 2017; Han, 2012; Shin & Han, 2017; Soares et al., 2017). Several earlier design-oriented studies in the field of human-robot interaction (HRI) have identified telepresence robots as a potentially viable tool for hybrid classrooms, one that can provide remote students with a fuller sense of social presence and belonging in the classroom community than traditional, more static, videoconferencing set-ups (e.g. Bell et al., 2016; Edwards et al., 2016; Gleason & Greenhow, 2017; Shin & Han, 2017). Despite such an optimistic outlook, it has also been recognised that asymmetries between on-site and remote participation in hybrid interaction (and by extension, teaching) constitute so-called ‘wicked problems’ that are extremely difficult to erase completely, no matter what videoconferencing technology is used (Jones et al., 2021).

Compared to other videoconferencing tools, remote-controlled mobility is the most distinctive technological feature of telepresence robots. From an interactional perspective, movement is not just a technical ‘gimmick’. It can extend the range of available multimodal resources for interaction, because managing the robot’s position in relation to classroom participants allows a remote student to control what they see through the camera view and to address their talk to a specific individual within a group. However, thus far very few studies have investigated interactional practices of robot-mediated communication (RMC), i.e., social encounters “in which at least one party is telepresent through voice, video, and motion in physical space via a remotely controlled robot” (Herring, 2015, p. 398), in synchronous hybrid education. We are aware of only two such studies in language teaching settings. Liao et al. (2019) suggested design principles of telepresence-place-based FL learning and investigated the principles with the help of real-life interactional data. In our previous study, we explored how participants manage various kinds of physical and digital learning materials in robot-mediated hybrid language teaching (Jakonen & Jauni, 2021).

In this article we address this apparent research gap and, by doing so, respond to Raes et al.’s (2020) call for more investigations of real-time practices of hybrid education from multimodal perspectives. We draw on a multimodal conversation analytical (CA) approach to explore how participants manage transitions between whole-class and small group interactions in a hybrid, tertiary-level language teaching setting in which remote participation takes place via a telepresence robot. As we will elaborate later (section 2.2), such transitions might occur when a group task begins or ends, but also when a group task is paused while the teacher gives students some task-related instructions, to be resumed afterwards. More specifically, we seek answers to the following research questions:

  1. How does the mobility of the telepresence robot afford remote student participation during activity transitions in hybrid language classrooms?

  2. How do teachers interactionally support robot-mediated remote students during activity transitions?

By interactional support, we mean the use of talk and embodied conduct (e.g. gaze, posture, and gestures) employed to facilitate student participation in the on-going classroom activity. Through these questions, we aim to contribute to research on telepresence robots in the fields of CALL and HRI by exploring in micro-level detail the social, multimodal, and cooperative practices involved in hybrid interactions in language teaching contexts. We focus on activity transitions quite deliberately because they can be challenging interactional environments for remote participation insofar as they involve the use of embodied interactional resources such as body movements and gaze, as we will show in our empirical analysis. Before that, we will briefly situate this study within CA investigations of language learning, presenting key tenets of CA and outlining its relevance for issues in computer-assisted language learning (CALL). We then review CA literature on activity transitions in face-to-face classroom/pedagogical settings in order to lay the ground for our empirical work.

2. Theoretical background

2.1. Conversation analysis and (computer-assisted) language learning interactions

Conversation analysis (CA) is both a methodological approach to studying social interaction and by now a considerable theoretical body of findings on how interaction is organised (see e.g. Sidnell & Stivers, 2012). CA capitalises on inductivity and micro-level interactional detail, paying close attention to how participants observably orient to each other’s conduct in real time. CA views language learning as something that is constituted in, and takes place through, social interaction, which differs considerably from cognitive-interactionist perspectives on SLA. Among other topics, previous CA-SLA studies have shed light on issues such as what language learning looks like as social action (e.g. Majlesi & Broth, 2012; Markee, 2008), how learners’ interactional competence develops over time (e.g. Pekarek Doehler & Berger, 2018), how conversational structures such as repair are used as resources for learning (e.g. Lilja, 2014), and how interaction in language classrooms is organised (Seedhouse, 2004). A sizeable number of CA-SLA studies have explored L2 interaction in CALL-relevant contexts, such as text-based CMC (Jenks, 2014), gaming (Rusk & Ståhl, 2022), and other technology-rich environments (e.g. Eilola & Lilja, 2021; Kurhila & Kotilainen, 2020; Thorne et al., 2021).

As González-Lloret (2015, p. 574) points out, CA studies of technology-mediated language learning have traditionally focused on text-based CMC contexts. However, the technological development of video platforms during the past 10 years has increased attention on video-mediated learning interactions (see e.g. Balaman & Doehler, 2022; Balaman & Sert, 2017; Jakonen & Jauni, 2021; Rusk & Pörn, 2019). In broad terms, these studies reveal the interactional complexity of learning spaces created through videoconferencing. For example, Dooly and Davitova (2018) investigated telecollaborative interaction between two classrooms in dispersed locations, a setting in which interaction takes place not only on and via the screen but also within each physically co-present classroom cohort. Furthermore, as Balaman and Doehler (2022) show, video-mediated interaction involves coordinating talk, embodied conduct mediated by the webcam and participants’ possible other screen activities that may be either private or public (such as when sharing one’s screen). When working on L2 tasks online, the participants in their data sometimes verbalised their private task-oriented activities such as online searches by way of turns such as ‘let me check’ (Balaman & Doehler, 2022). Moreover, because learning materials and other material resources are prevalent in language classrooms, participants need to find ways to ensure remote participants’ access to them. If the relevant materials are in one location and cannot be shared digitally, participants may need to make sure they are visible to remote participants through various kinds of checking and showing sequences (e.g. Dooly & Davitova, 2018; Jakonen & Jauni, 2021). Altogether, these and other CA studies thus illustrate that, despite the interactional complexity of video-mediated instruction, teachers and learners have ways to accommodate interactional practices to the specific affordances and constraints of the videoconferencing technology in use.

2.2. Classroom activity transitions as collaborative and interactional achievements

Classroom lessons are typically organized as series of linked activities that may involve interacting together as a class, in groups or pairs, or working individually. Teachers and students thus need to coordinate two kinds of activity transitions in the classroom: between two separate activities and between two phases within an activity, such as when working on an itemized task (e.g. Jacknick, 2011; Mortensen & Hazel, 2011). An early definition by Arlin (1979) viewed classroom transitions as a “teacher-initiated directive to students to end one activity and to start another” (p. 42). While teacher directives such as ‘Okay, start working in groups’ are typical turns-at-talk in moments of transition, studies of classroom interaction have shown that transitions are not the simple, individual efforts that the definition might imply. Instead, transitions are often stepwise, unfold over several interactional turns, and convey expectations towards what should come next (see e.g. Jacknick, 2011).

Moreover, transitions are a site where student agency and students’ understanding of the task get negotiated. The interactionally contingent nature of such work can in turn lead to very different subsequent task interaction in different student groups (e.g. Hellermann & Doehler, 2010). Furthermore, Jacknick (2011) argues that even if teachers may discursively design transitions as if to “discourage further student contributions” (p. 34), students can show agency by self-selecting as speakers and creating ‘wiggle room’. Student agency may also become visible in the form of off-task talk, for which the stepwise nature of transitions may provide opportunities (Markee, 2005).

Particularly relevant for our study is the observation that classroom transitions are multimodal accomplishments. This has become perhaps most apparent in studies of interaction in pedagogical settings beyond language teaching. For example, dance (Broth & Keevallik, 2014) and budo classes (Råman, 2017) include alternation between two kinds of participation frameworks: teacher whole-class demonstrations and student practice in pairs or in small groups. Shifting from one activity to another thus involves a reorganization of the participation framework, which participants largely accomplish through body movements and posture shifts in relation to each other in the physical space. Råman (2017, paragraph 72) concludes that in order to manage transitions in a budo class, participants “assign meaning to not only locations, but also to movements and directions”. While the context is clearly different from language classes, it is worth remembering that transitions in face-to-face classrooms require physical movements, for example when students get into groups. In a hybrid classroom context such as ours, the telepresence robot’s camera does not necessarily provide capabilities to zoom into text on the classroom board so that a remote student could follow the teacher’s board work in the same place where they might interact as part of a student group. In such a situation, robot movement may be needed to shift attention between the teacher, student group(s), and the board (or other task materials). In these ways, activity transitions in synchronous hybrid classrooms can be interactionally challenging because local and remote students tend to have starkly different embodied resources at their disposal, resources which afford and constrain action in different ways. In this study, we focus on how the mediated and re-embodied mobility of the robot enables participation in transitions, and aim to demonstrate how participants use and make sense of it as a particular kind of techno-embodied resource.

3. Data and method

As part of an ongoing project exploring hybrid language teaching in the Finnish higher education context, we have video-recorded lessons in which, alongside a classroom-based cohort, 1-2 students participated remotely and synchronously with the help of a Double 2 telepresence robot. Double 2 is a wheeled videoconferencing tool that the remote user can move in the ‘local’ space – in our case the physical classroom – during interaction. The video corpus amounts to c. 7.5 hours of recordings from four different second language classrooms (Finnish, Swedish, German and English), collected in 2018–2019, before the outbreak of the Covid-19 pandemic. Recordings were made using two classroom cameras and screen-capture software on remote participants’ computers (English and Swedish). The data contain lessons in which a student was unable to join the classroom physically because of an injury (Swedish) or a study abroad period (Finnish), and lessons during which the teachers and students were experimenting with the robot so that students took turns to use the robot from another campus location (German and English). The courses we filmed were part of the university’s regular curriculum, and alongside the robot-mediated remote students, the on-site classroom cohort varied between 10 and 20 students.

Our data show novice users’ first encounters with the telepresence technology. As researchers, we did not endorse any particular hybrid pedagogy or mode of using the robot beyond showing how it works in a technical sense. It seemed that teachers took the hybridity of these lessons into account more in their classroom management than in lesson design or materials selection: in that sense, the data illustrate teachers’ ad hoc practices for supporting synchronous remote participation as opposed to a systematic, pre-planned curricular approach to hybrid education. In particular, we noticed that activity transitions were typical moments for teachers to conduct visible interactional work to facilitate remote students’ robot-mediated participation. We screened the entire video corpus to create a collection of 28 transitions between whole-class and small group interaction in which the robot’s movement was clearly observable, and examined these transitions in micro-level detail with a focus on how participants orient to and manage the robot’s movements and location in the classroom. The collection of extracts thus constitutes a subset of all activity transitions that occurred during the lessons, one that arguably exhibits the kinds of transitions that might be more difficult to accomplish in a more static hybrid set-up based on more traditional videoconferencing applications. In this article, we draw on a multimodal CA framework and report our observations on hybrid transitions by analysing selected data extracts from two different classrooms, Finnish and German. The extracts illustrate recurrent interactional practices, tasks, and troubles related to robot movement during transition moments in our collection of 28 cases. However, because of the small size of the data corpus and the exploratory nature of the study, we do not wish to make claims about the relative frequency of the reported practices in hybrid classrooms beyond our data. In the spirit of CA-SLA studies, our main interest is in understanding transitions as situated and collaborative achievements in these extracts. Moreover, as González-Lloret (2015, p. 578) suggests, CA offers a highly useful methodological toolkit for exploring how participants make sense of new forms of technology-mediated interaction, and the kinds of pedagogical consequences such technologies may have. Detailed transcription of interaction is an essential part of CA analysis, making available both to the analyst and to the reader how talk and embodied actions such as gestures and movement are produced. We have used CA conventions to transcribe talk (Jefferson, 2004) and embodied conduct (Mondada, 2014). Turn elements in languages other than English have also been translated into English, aiming at idiomatic equivalence. Participants’ names are pseudonymised.

4. Telepresent movement and remote participation during classroom transitions

Many transitions in our data collection revolve around a teacher directive issued to students. Realised through a multitude of linguistic forms, directives are “utterances designed to get someone else to do something” (Goodwin, 2006). In this section, we show how ways of moving the robot constitute expected and compliant actions in response to transition-implicating directives (section 4.1). We also analyse how classroom participants make sense of robot movements and illustrate some ways in which they support the remote student’s participation during transitions (section 4.2).

4.1. Movement as projected and responsive action in activity transitions

Extract 1 shows how a teacher in a beginner level Finnish class is concluding instructions for a pair task that involves students telling each other the times shown on a series of clock faces in a handout (also projected on the whiteboard). The student pairs have been established earlier so that our focal pair in the extract is formed by a classroom student (CS) and a robot-mediated remote student (RS, visible in the middle of ). This means that as the teacher’s bilingual (Finnish-English) instruction ends at line 8, the remote student already knows with whom he should begin pair work. The extract illustrates a highly uncomplicated and unsupported transition as the remote student turns the robot to face the classroom student and begins talk.

Extract 1. Taking initiative in a transition.

The teacher’s extended turn at lines 1-4 teaches the students a phrase in Finnish (voitko toistaa/sanoa uudestaan?, ‘can you repeat/say again?’) that they can use in the upcoming pair task as a clarification request. While the instruction projects a transition, it does not yet indicate when precisely the students should begin pair work. However, at line 6, the teacher utters hyvä (‘good’) and simultaneously makes an open-palm gesture with both hands. This can be heard as an assessment of the whole-class instruction, and a signal that it is now complete. Clasping her hands together (), the teacher follows this with a bilingual permission to ‘start’ the pair activity, first uttered in Finnish and then in English (work with your pair, line 7).

Looking at participants’ movement at lines 6-7 shows how the transition unfolds through the way the remote student anticipates the timing of the transition and demonstrates compliance with the teacher’s directive. Firstly, the teacher starts moving away from her position in front of the classroom (to eventually go round the classroom) at the same time as she begins to utter the permission at line 6. The remote student begins to turn the robot around nearly simultaneously, after the word voit (‘you can’) and while the teacher is still talking. Both movement trajectories thus orient to the imminence of the pair activity even though a verbal ‘go ahead’ for the transition is still underway.

-1.3 show how the robot’s orientation changes during the teacher’s permission and the following silence: at line 7 (), the remote student has turned the robot and its camera orientation approximately 90 degrees to the right (towards the walking teacher), and towards the end of line 8 (), he has turned the robot 180 degrees to face his classroom partner. The classroom student lifts his gaze from the task sheet on his desk to the remote student during line 8, which allows the two students to reach a face-to-screenface formation (Due, 2021) in which both participants can see each other and the classroom student’s handout is between them. As mutual gaze has been established, the remote student begins talk, first with a muted microphone (line 9). After unmuting himself, he opens the pair task by asking the classroom student to begin (line 13). All in all, extract 1 thus illustrates a transition which the remote student manages in a highly agentic manner through relevantly timed robot movements and talk. At no point does the remote student receive any individualised teacher support beyond the whole-class instruction. Yet, coordinating the robot’s position allows him to configure the participation framework by shifting from being a recipient of the teacher instruction to becoming a(n active) member in dyadic peer interaction.

Extract 2. Telling where to go to in the classroom.

Extract 3a. Avoiding movement in preparation of a transition.

Extract 2 shows a transition in which, similarly to extract 1, the remote student takes initiative for getting into a peer group. However, the transition itself is more complex and involves an individualised instruction from the teacher at a moment when the remote student is already moving the robot towards a peer group. The extract shows a German class that has already played a few rounds of a group quiz using the Quizlet app. With a new round about to begin, the remote student (Asko) has moved the robot in front of the classroom whiteboard to see the Quizlet interface that is projected there. As the teacher clicks a button to start a new game (line 3), the app reshuffles the groups and shows them on the whiteboard. While this indicates the composition of each group, it does not yet tell where in the classroom space the groups should go to sit together. This is what the teacher instructs in the extract, first to the remote student (lines 6-8) and then to one classroom student (line 10), formulating her instructions in very different ways.

The teacher’s go-ahead signal at line 3 (geht’s los, ‘let’s go’) is followed by a silence, during which the teacher maintains visual orientation on her laptop and nobody moves in the classroom, including the robot that the remote student controls. This suggests that students treat the teacher’s instruction for the upcoming task phase as still incomplete. The teacher resumes talk (line 5), addresses the remote student and identifies his group members to him (line 6). At the same time, she walks to the robot and brings her gaze to it. However, the remote student also begins to turn the robot around almost as soon as the teacher begins to walk towards the robot and verbally address him. Although the robot movement begins later than in extract 1, the turning suggests that the remote student is about to drive the robot towards his group: in other words, he is producing a responsive action that aligns with the teacher’s go-ahead signal.

Unlike in extract 1, the teacher individually instructs the remote student, Asko, at lines 6-9. Asko orients to the teacher’s approach into his field of vision () and talk by suspending robot movement as the teacher utters the first mit (‘with’). Asko resumes turning the robot at the onset of the second mit, by which the teacher has named one of his group members (Lauri). Turning away from the teacher before her turn has ended is a way for the remote participant to orient to the redundancy of the group members’ names, which have already been displayed on the whiteboard. By the end of line 9, Asko has turned the robot 180 degrees so that the robot’s camera and screen are facing Asko’s group members at the back of the room, thereby being in Asko’s field of vision (see -2.4).

The teacher’s two individualised instructions, first to Asko and then to Luukas (line 11), are different in their turn design. Besides telling Asko his group members, the teacher also instructs him how to find his group in the classroom with the help of two directives at lines 7-9, einmal zurück (‘first (go) back’) and dahin (‘there(to)’). She also points towards the group members twice (-2.4), even when she is outside the robot camera’s range and thereby not visible to Asko. In contrast, the teacher instructs the ‘same’ thing to Luukas, a classroom student, merely by taking a step towards him (line 10) and by telling him to go to the ‘other team’ (line 11), at the same time pointing at Luukas and his team. Line 11 thus does not contain the kind of navigational instruction that the teacher provides to the robot-mediated remote student. This indicates that, unlike ‘regular’ on-site students, the remote student is being treated as someone who needs more help in navigating the classroom, including being told where his group members are located.

Taken together, extracts 1-2 illustrate how participants reorganise the participation framework when transitioning from a whole-class to a group-based activity. The mobility of the telepresence robot provides a possibility for the remote student to take initiative in such a practical and collaborative accomplishment in ways that resemble those that classroom students are projected to exhibit. Moving the robot in locally appropriate ways, including suspending and resuming movement in order to adjust to situational contingencies such as the teacher’s ‘extra’ information, involves monitoring and anticipating details of classroom interaction. Doing this allows the robot-mediated remote student to perform relevant technologically mediated embodied actions as a transition is unfolding in real time. While both extracts illustrate agentic and aligning remote participation through ways of moving the robot, in extract 2 the teacher nevertheless also treated the transition as potentially cumbersome to the remote student by way of offering individualised support. In our data, such individualised post-expansions to a whole-class instruction are fairly common practices for the teachers to interactionally support and secure remote participation during transitions, a topic which we will next discuss in more detail.

4.2. Assisting remote students’ activity transitions

The provision of individualised assistance can be seen to support the remote student’s transitioning between activities, but at the same time it makes visible an asymmetry between remote and on-site participation modes. Besides guiding how and where to move the robot for group work (extract 2), in our data teachers may also assist remote participants by configuring a group activity so as to avoid the need for the remote student to move around in the classroom. Extract 3a shows an instance in which the teacher is finishing instructing an activity for getting to know other students in the beginner-level Finnish class. Students are first to interview their partner and then to move around in pairs and introduce their partner to other student pairs. Prior to the extract, the teacher has been modelling phrases for introducing a person. As the teacher is coming to the end of her instruction, she creates a distinction between in-class student pairs and the pair that includes a robot-mediated remote student, treating movement as an inconvenience to the remote student (lines 5-7).

Line 1 (‘now we do so that’) frames the teacher’s turn-in-progress as a modification to the usual task routine. The modification entails suggesting (‘maybe’) a division of labour between student pairs (‘others’) expected to move around and the hybrid pair who ‘can stay’ (line 4) in their current place, accompanied by a gesture to identify the relevant students () and to delineate the area where they are expected to stay. At lines 6-7, the teacher laughingly accounts for the modification by way of making relevant participants’ asymmetric possibilities for mobility. She names the remote student, and through self-repair treats movement in the classroom as more or less an inconvenience to him, something that he would ‘have to’ (line 7) do. While talking, the teacher uses her hands to depict walking by moving them in front of one another in an awkward manner (). ‘Going round between the tables’ is a particular kind of verbal formulation for moving, one that implies a cumbersome embodied action. In this way, the teacher is treating task-relevant physical movement as a more problematic possibility for the remote student than for in-class students. Teachers can use laughter to mitigate face concerns while correcting non-aligned student actions (Jakonen & Evnitskaya, 2020), and it seems that here the laughter, together with talk and ways of gesturing, orients to the sensitive nature of guiding and controlling what kind of techno-embodied conduct is expected from the remote student. Through the laughter, the verbal formulation and the caricature-like gesture of walking, the teacher constructs robot movement as something other than a normal state of affairs, an inconvenience that ought to be avoided if possible.

Besides identifying ways to avoid and facilitate robot movement, teachers may sometimes also visually monitor robot movements after an instruction to begin group work. Extract 3b shows how the transition shown in extract 3a continues as the teacher eventually gives a go-ahead signal for the pair activity (line 11). This time, the robot’s movement to the remote student’s classroom-based partner is delayed, and the teacher treats this as a reason for assistance, first prompting the remote student individually and then remaining nearby to visually monitor robot movement.

The go-ahead signal (line 11) is followed by a recap of the task instruction (lines 12-15). The beginning of pair activity becomes imminent at the latest via the permission to start to ‘move around’ if one ‘already knows’ one’s conversational partner (line 14). Indeed, some students visibly begin pair work by establishing mutual gaze with their partner and initiating talk (not transcribed in the extract) in parallel with the teacher’s ongoing recap. However, the remote student maintains the robot in its position in front of the whiteboard, its screen oriented to the teacher. The teacher treats the lack of robot movement as a sign of trouble in the progression of the transition: after a silence (line 16), the teacher addresses another transition-implicative positive assessment ‘good’ (line 17) to the remote student by looking and pointing at the robot. After a further silence during which the robot remains in position, the teacher provides an individualised and modified directive to start the activity, which, instead of ‘moving around’, prompts the remote student to ‘practice with others’ (lines 19-20). The turn is a hearable ‘nudge’ to start moving (and a reminder of how the teacher set up the activity as less mobile for the remote student in extract 3a). The remote student responds to it by beginning to turn the robot towards his group approximately one second into the silence at line 21.

Such a verbal ‘nudge’ is not the only practice of assistance that the teacher employs to ensure the remote student’s transitioning to pair work in this extract. Towards the end of line 20, the teacher withdraws her gaze from the robot and sweeps it across the class during the first four seconds of the silence at line 21, enacting a so-called ‘lighthouse gaze’ (Cekaite & Björk-Willén, Citation2018). Here, such a gaze pattern can be seen as a means to ensure that the instructed transition is proceeding as it should, and to detect the physically present students’ possible help requests, hand raises, or other signs of trouble. The ‘lighthouse gaze’ ends when the teacher takes a few steps away from her instructional position towards her desk and shifts her gaze back to the robot, which is turning around. Looking at the robot for two seconds and seeing it turn provides visible evidence that the remote student is on his way to his partner. This allows the teacher to withdraw from interacting with the class and go to her desk after the extract ends. Altogether, the visual monitoring of the robot-mediated remote student and the classroom students after the task instruction can be seen as ways to ensure that students do not display signs of trouble beginning the pair task.

Extract 3b. ‘Nudging’ into transition.


5. Concluding discussion

New videoconferencing technologies on the market are expanding the range of available options for CALL activities. In this study, we have investigated the use of a mobility-enhanced videoconferencing robot to enable synchronous hybrid instruction for on-site and remote students within one physical classroom. Our exploration has centred around ways in which the robot’s mobility, its signature feature, provides resources for pedagogical praxis and remote participation in the interactionally complex moments of transitions between whole-class and small-group activities in hybrid language classes. Even if students tend to spend much of their classroom time behind their desks, body movements beyond facial expressions and gestures are deeply engrained in the organisation of classroom interaction. People tend to notice the role of movement in the interactional organisation of an activity when it is somehow problematic, such as at hybrid events where online presence takes the form of a ‘talking head’ (Licoppe & Morel, Citation2012) on a laptop screen that an on-site participant moves around. Activity transitions in a synchronous hybrid class are challenging because they involve a reorganisation of the participation framework (such as when forming student pairs or small groups) and a need to reposition the participants in the physical classroom space. This article has sought to show that body/robot movements and gaze shifts are central resources for making these happen in a hybrid language class.

Altogether, our study expands current CALL literature on the multimodal possibilities of videoconferencing by describing the role of mobility, a thus far little-explored interactional resource, in videoconferencing-based learning activities. The analysis suggests that robot movement is both a projected action in classroom transitions and a display of competent remote participation. To give an example, turning the robot around and moving from the whiteboard to a student group at an appropriate moment is a way for the remote student to display understanding of (target-language) instructions, to take initiative in the unfolding transition, and to maintain its progress. The exact moments and the manner in which remote students move the robot are therefore not random. Instead, they show how remote students interpret implications and expectations conveyed by teacher talk. Classroom participants treat the robot’s movements as intelligible conduct, as if seeing the metal ‘body’ of the robot as an embodied participant in the classroom, doing recognisable and meaningful actions that provide a sense of how the remote student engages in real time with the hybrid instruction. The availability of a ‘body’ makes it possible for the teacher to make inferences about whether the remote student needs assistance in order to participate ‘fully’ in the lesson, for example by monitoring whether transition-relevant robot movements are on time (extract 1) or delayed (extract 3b).

Our observations provide further micro-level interactional support for earlier findings based on self-report and experimental data that suggest telepresence robots may provide an increased sense of social presence and engagement for remote students when compared to more static forms of videoconferencing (e.g. Gleason & Greenhow, Citation2017; Shin & Han, Citation2017). The added value of the CA perspective that we have adopted here is to point at some ways in which notions such as social presence, participation and agency get their situated meaning through social and multimodal negotiation during language teaching activities. Without interactional data and a close micro-analysis, these fleeting moments could be easy to overlook because participants might not register and remember them later in interviews or surveys. Barad (Citation2007, p. 33) has argued that agencies are situated to the extent that “they don’t exist as individual elements”. Similarly, it seems to us that a student’s agency (or lack thereof) is not an automatic by-product of any technology. Agentic participation is still possible for remote students projected as a ‘talking head’ on a laptop screen in a hybrid class beginning group work, as long as relevant assistance is provided. However, the range of available multimodal action resources configures how things can be done, and this may lead to starkly different practices for accomplishing a particular action (such as getting into a group) across different technological set-ups. In the current context, the telepresent robot ‘body’ (Lee & Takayama, Citation2011) tends to ‘expose’ a student to other participants more than screen-based videoconferencing tools such as Zoom, in which the black screens represented by switched-off cameras do not make it easy to gauge remote students’ manner and level of engagement with the instruction.
In comparison, the telepresence robot is considerably less opaque in that its mere movement during classroom activities can be taken as accountable student conduct and investigated for its social meaning. In this sense, such a mobile human-machine assemblage (Due, Citation2021) exposes more of its remote user to the classroom participants than many other videoconferencing tools.

Our analysis suggests that individualised post-expansions of task instructions are a key interactional practice through which teachers provide interactional support to remote students (e.g. extracts 2 and 3b). Besides positioning themselves so that they can monitor the remote student’s actions (3b) in these moments, teachers tend to formulate their directives and gestures addressed to the remote student as noticeably explicit (2), and sometimes offer help even when robot movements indicate no trouble. Such caution may orient to the fact that the instructions are in students’ L2, but also to uncertainty about what exactly robot-mediated students can do in the classroom. In fact, most participants in our data are engaging with an unfamiliar technology, and the data were collected prior to the beginning of the Covid-19 pandemic, which has without a doubt created more awareness of challenges in videoconferencing-based online teaching (see e.g. González-Lloret et al., Citation2021). Newcomers trying to find out how a videoconferencing technology works, and what kinds of pedagogical and interactional adjustments it requires on an ad hoc basis, may well organise lessons differently from seasoned experts who pre-design lessons specifically as hybrid. Rather than a limitation of our study, we think of this aspect of the data as something that tests the telepresence robot’s capabilities regarding fidelity of user experience and the flexibility with which otherwise ‘regular’ face-to-face lessons can be transformed into hybrid lessons without significant pre-planning when a sudden need for hybrid teaching may arise. Against this background, the fact that in much of our interactional data participants are able to sustain teaching and learning activities without apparent problems and resolve emergent troubles suggests careful optimism towards the interactional possibilities of the telepresence robot.

What kinds of pedagogical implications might our observations have for hybrid interaction? Traditionally the focus and scope of CA-SLA studies have revolved more around describing how language teaching and learning take place in and through interaction rather than prescribing any particular instructional practices to practitioners. Such caution is understandable, given that CA studies have amply demonstrated that there are perhaps an infinite number of ways to do things because actions are both context-shaped and context-renewing (Heritage, Citation1984), meaning that a question about the ‘effectiveness’ of a particular practice is very complex. However, recent classroom-based studies have begun to engage more with the question of how findings about the organisation of classroom interaction can be used to bring about change in curricular planning and teaching practices (see e.g. the collection of studies in Kunitz et al., Citation2021). For one thing, interaction can be taken as a point of departure in evidence-based teacher education because many questions of pedagogy and student participation are inherently linked to interactional phenomena such as turn-taking and repairing emerging troubles – or how to facilitate student participation in moments of transitions, which we have explored in this study. A concrete way to make teachers more aware of how their own actions can assist or constrain student participation in the classroom could be to use published transcripts or recordings of one’s own teaching practice as a basis for pedagogical reflection in pre- and in-service training. For a highly technologized setting such as the hybrid classroom, this reflection would do well to consider the relationship between the material environment, its possibilities for participation, and the kind of support the students might need in the particular environment.
There may be limitations to how well ‘effective’ classroom transitions can be engineered through curricular design and lesson planning because transitions are a situated interactional achievement, the exact constraints of which only become apparent when a transition is unfolding. Despite this, advance planning can help teachers ensure that their classroom is at least materially accessible to the particular remote technology being used – for example, that there is sufficient empty space for the telepresence robot to move in the classroom. Secondly, because robot movements and robot-mediated gaze shifts are relatively slow, it would be useful to reserve enough time for transitions between activities. Thirdly, it is crucial to ensure the availability of interactional support for remote students, for whom navigating in a hybrid classroom may be less easy than for in-class students. Our findings are encouraging in the sense that they suggest that even teachers with very little experience with telepresence technology show awareness of interactional asymmetries by monitoring remote students and by offering such individualised support to them after whole-class task instruction.

To sum up, we hope to have illustrated some interactional affordances and constraints of the mobility features of telepresence robots for CALL activities. Considering social interaction is important when designing and implementing synchronous hybrid language teaching because interaction is where social presence, inclusion and participation asymmetries between on-site and remote students are either highlighted or alleviated. As we have tried to argue in this article, synchronous hybrid teaching can be challenging because of asymmetries of video-mediated interaction, recurrent participation framework changes in classroom interaction, and the central role of learning materials in instruction. Despite these challenges, social interaction offers a significant resource base for participants: interaction is ‘hard-wired’ with resources such as the organisation of repair that participants can employ to deal with emergent troubles and to maintain shared understanding of what is going on. It is likely that the fidelity of telepresence solutions will continue to develop in the future, but perhaps a more pressing question for CALL researchers and educators is how to design and implement classes that are maximally sensitive to the interactional needs of both on-site and remote students, as well as to the possibilities of the specific communication solution used. Further studies that investigate hybrid interaction in an in-depth manner in a range of pedagogical constellations are certainly needed and can help us find more answers to this question.

Acknowledgements

We are grateful to the anonymous reviewers for their constructive suggestions for improving earlier versions of this article. We have also benefitted greatly from comments received at various data sessions and conferences where we have presented this work. Any remaining argumentative errors and shortcomings are our own.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by the Academy of Finland [grant number 343480].

Notes on contributors

Teppo Jakonen

Dr Teppo Jakonen works as Academy of Finland Research Fellow at the School of Languages and Translation Studies at University of Turku, Finland. His research explores the role of technology and materiality in social interaction in language teaching and learning activities. Jakonen’s publications have appeared in journals such as Applied Linguistics, The Modern Language Journal, Journal of Pragmatics, and Linguistics and Education.

Heidi Jauni

Dr Heidi Jauni works as a Development Manager at the Language Centre of Tampere University, Finland. Her research interests include conversation analysis, especially interaction in educational settings and mediated interaction.

References

  • Arlin, M. (1979). Teacher transitions can disrupt time flow in classrooms. American Educational Research Journal, 16(1), 42–56.
  • Austin, N., Hampel, R., & Kukulska-Hulme, A. (2017). Video conferencing and multimodal expression of voice: Children’s conversations using Skype for second language development in a telecollaborative setting. System, 64, 87–103. https://doi.org/10.1016/j.system.2016.12.003
  • Balaman, U., & Doehler, S. P. (2022). Navigating the complex social ecology of screen-based activity in video-mediated interaction. Pragmatics, 32(1), 54–79.
  • Balaman, U., & Sert, O. (2017). Development of L2 interactional resources for online collaborative task accomplishment. Computer Assisted Language Learning, 30(7), 601–630. https://doi.org/10.1080/09588221.2017.1334667
  • Barad, K. (2007). Meeting the universe halfway: Quantum physics and the entanglement of matter and meaning. Durham: Duke University Press.
  • Bell, J., Cain, W., Peterson, A., & Cheng, C. (2016). From 2D to Kubi to Doubles: Designs for student telepresence in synchronous hybrid classrooms. International Journal of Designs for Learning, 7(3), 19–33. https://doi.org/10.14434/ijdl.v7i3.19520
  • Broth, M., & Keevallik, L. (2014). Getting ready to move as a couple: Accomplishing mobile formations in a dance class. Space and Culture, 17(2), 107–121. https://doi.org/10.1177/1206331213508483
  • Cekaite, A., & Björk-Willén, P. (2018). Enchantment in storytelling: Co-operation and participation in children’s aesthetic experience. Linguistics and Education, 48, 52–60.
  • Cha, E., Chen, S., & Mataric, M. J. (2017, January). Designing telepresence robots for K-12 education. In RO-MAN 2017 - 26th IEEE International Symposium on Robot and Human Interactive Communication (pp. 683–688). https://doi.org/10.1109/ROMAN.2017.8172377
  • Cohen, C., & Wigham, C. R. (2019). A comparative study of lexical word search in an audioconferencing and a videoconferencing condition. Computer Assisted Language Learning, 32(4), 448–481. https://doi.org/10.1080/09588221.2018.1527359
  • Dooly, M., & Davitova, N. (2018). ‘What can we do to talk more?’: Analysing language learners’ online interaction. Hacettepe Egitim Dergisi, 33, 215–237.
  • Due, B. L. (2021). RoboDoc: Semiotic resources for achieving face-to-screenface formation with a telepresence robot. Semiotica, 2021(238), 253–278. https://doi.org/10.1515/sem-2018-0148
  • Edwards, A., Edwards, C., Spence, P. R., Harris, C., & Gambino, A. (2016). Robots in the classroom: Differences in students’ perceptions of credibility and learning between “teacher as robot” and “robot as teacher”. Computers in Human Behavior, 65, 627–634. https://doi.org/10.1016/j.chb.2016.06.005
  • Eilola, L., & Lilja, N. (2021). The smartphone as a personal cognitive artifact supporting participation in interaction. The Modern Language Journal, 105(1), 294–316. https://doi.org/10.1111/modl.12697
  • Gleason, B. W., & Greenhow, C. (2017). Hybrid education: The potential of teaching and learning with robot-mediated communication. Online Learning, 21(4), 159–176. https://doi.org/10.24059/olj.v21i4.1276
  • González-Lloret, M. (2015). Conversation analysis in computer-assisted language learning. CALICO Journal, 32(3), 569–594. https://doi.org/10.1558/cj.v32i3.27568
  • González-Lloret, M., Canals, L., & Pineda Hoyos, J. E. (2021). Role of technology in language teaching and learning amid the crisis generated by the COVID-19 pandemic. Íkala, Revista de Lenguaje y Cultura, 26(3), 477–482. https://doi.org/10.17533/udea.ikala.v26n3a01
  • Goodwin, M. H. (2006). Participation, affect, and trajectory in family directive/response sequences. Text & Talk, 26(4–5), 513–541.
  • Guichon, N., & Cohen, C. (2014). The impact of the webcam on an online L2 interaction. Canadian Modern Language Review, 70(3), 331–354. https://doi.org/10.3138/cmlr.2102
  • Han, J. (2012). Emerging technologies: Robot assisted language learning. Language Learning & Technology, 16(3), 1–9. https://doi.org/10.1109/SASG.2015.7449269
  • Hellermann, J., & Doehler, S. P. (2010). On the contingent nature of language‐learning tasks. Classroom Discourse, 1(1), 25–45. https://doi.org/10.1080/19463011003750657
  • Heritage, J. (1984). Garfinkel and ethnomethodology. Polity Press.
  • Herring, S. C. (2015). New frontiers in interactive multimodal communication. In A. Georgakopoulou & T. Spilioti (Eds.), The Routledge handbook of language and digital communication (pp. 412–416). Routledge.
  • Jacknick, C. M. (2011). Breaking in is hard to do: How students negotiate classroom activity shifts. Classroom Discourse, 2(1), 20–38. https://doi.org/10.1080/19463014.2011.562656
  • Jakonen, T., & Evnitskaya, N. (2020). Teacher smiles as an interactional and pedagogical resource in the classroom. Journal of Pragmatics, 163, 18–31. https://doi.org/10.1016/j.pragma.2020.04.005
  • Jakonen, T., & Jauni, H. (2021). Mediated learning materials: Visibility checks in telepresence robot mediated classroom interaction. Classroom Discourse, 12(1–2), 121–145. https://doi.org/10.1080/19463014.2020.1808496
  • Jefferson, G. (2004). Glossary of transcript symbols with an introduction. In G. H. Lerner (Ed.), Conversation analysis: Studies from the first generation (pp. 13–31). John Benjamins.
  • Jenks, C. (2014). Social interaction in second language chat rooms. Edinburgh University Press.
  • Jones, B., Zhang, Y., Wong, P. N. Y., & Rintel, S. (2021, April). Belonging there: VROOM-ing into the Uncanny Valley of XR telepresence. Proceedings of the ACM on Human-Computer Interaction, 5, 59.
  • Kunitz, S., Markee, N., & Sert, O. (Eds.). (2021). Classroom-based conversation analytic research. Springer.
  • Kurhila, S., & Kotilainen, L. (2020). Student-initiated language learning sequences in a real-world digital environment. Linguistics and Education, 56(088565), 100807. https://doi.org/10.1016/j.linged.2020.100807
  • Lee, M. K., & Takayama, L. (2011). “Now, I have a body”: Uses and social norms for mobile remote presence in the workplace. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 33–42).
  • Liao, J., Lu, X., Masters, K. A., Dudek, J., & Zhou, Z. (2019). Telepresence-place-based foreign language learning and its design principles. Computer Assisted Language Learning, 1–26. https://doi.org/10.1080/09588221.2019.1690527
  • Licoppe, C., & Morel, J. (2012). Video-in-interaction: “talking heads” and the multimodal organization of mobile and skype video calls. Research on Language and Social Interaction, 45(4), 399–429. https://doi.org/10.1080/08351813.2012.724996
  • Lilja, N. (2014). Partial repetitions as other-initiations of repair in second language talk: Re-establishing understanding and doing learning. Journal of Pragmatics, 71, 98–116. https://doi.org/10.1016/j.pragma.2014.07.011
  • Majlesi, A. R., & Broth, M. (2012). Emergent learnables in second language classroom interaction. Learning, Culture and Social Interaction, 1(3–4), 193–207. https://doi.org/10.1016/j.lcsi.2012.08.004
  • Markee, N. (2005). The organization of off-task talk in second language classrooms. In K. Richards & P. Seedhouse (Eds.), Applying conversation analysis (pp. 197–213). Basingstoke: Palgrave Macmillan.
  • Markee, N. (2008). Toward a learning behavior tracking methodology for CA-for-SLA. Applied Linguistics, 29(3), 404–427. https://doi.org/10.1093/applin/amm052
  • Mondada, L. (2014). Conventions for multimodal transcription. Romanisches Seminar der Universität. https://franz.unibas.ch/fileadmin/franz/user_upload/redaktion/Mondada_conv_multimodality.pdf
  • Mortensen, K., & Hazel, S. (2011). Initiating round robins in the L2 classroom - preliminary observations. Novitas-ROYAL, 5(1), 55–70.
  • Pekarek Doehler, S., & Berger, E. (2018). L2 interactional competence as increased ability for context-sensitive conduct: A longitudinal study of story-openings. Applied Linguistics, 39(4), 555–578.
  • Råman, J. (2017). The organization of transitions between observing and teaching in the budo class. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 19(1), 28.
  • Randall, N. (2019). A survey of robot-assisted language learning (RALL). ACM Transactions on Human-Robot Interaction (THRI), 9(1), 1–36.
  • Raes, A., Detienne, L., Windey, I., & Depaepe, F. (2020). A systematic literature review on synchronous hybrid learning: Gaps identified. Learning Environments Research, 23(3), 269–290.
  • Rusk, F., & Pörn, M. (2019). Delay in L2 interaction in video-mediated environments in the context of virtual tandem language learning. Linguistics and Education, 50, 56–70. https://doi.org/10.1016/j.linged.2019.02.003
  • Rusk, F., & Ståhl, M. (2022). Coordinating teamplay using named locations in a multilingual game environment – Playing esports in an educational context. Classroom Discourse, 13(2). https://doi.org/10.1080/19463014.2021.2024444
  • Satar, M. (2013). Multimodal language learner interactions via desktop videoconferencing within a framework of social presence: Gaze. ReCALL, 25(1), 122–142. https://doi.org/10.1017/S0958344012000286
  • Satar, M. (2016). Meaning-making in online language learner interactions via desktop videoconferencing. ReCALL, 28(3), 305–325. https://doi.org/10.1017/S0958344016000100
  • Satar, M., & Wigham, C. R. (2017). Multimodal instruction-giving practices in webconferencing-supported language teaching. System, 70, 63–80. https://doi.org/10.1016/j.system.2017.09.002
  • Seedhouse, P. (2004). The interactional architecture of the language classroom: A conversation analysis perspective. Blackwell.
  • Shin, K. W. C., & Han, J.-H. (2017). Qualitative exploration on children’s interactions in telepresence robot assisted language learning. Journal of the Korea Convergence Society, 8(3), 177–184. https://doi.org/10.15207/JKCS.2017.8.3.177
  • Sidnell, J., & Stivers, T. (Eds.). (2012). The handbook of conversation analysis. Malden, MA: John Wiley & Sons.
  • Slaughter, Y., Smith, W., & Hajek, J. (2019). Videoconferencing and the networked provision of language programs in regional and rural schools. ReCALL, 31(2), 204–217. https://doi.org/10.1017/S0958344018000101
  • Soares, N., Kay, J. C., & Craven, G. (2017). Mobile robotic telepresence solutions for the education of hospitalized children. Perspectives in Health Information Management, 14(Fall), 1e.
  • Thorne, S. L., Hellermann, J., & Jakonen, T. (2021). Rewilding language education: Emergent assemblages and entangled actions. The Modern Language Journal, 105(S1), 106–125. https://doi.org/10.1111/modl.12687
  • Ziegler, N. (2016). Synchronous computer-mediated communication and interaction: A meta-analysis. Studies in Second Language Acquisition, 38(3), 553–586. https://doi.org/10.1017/S027226311500025X