Original Articles

Multiple Temporalities of Language and Body in Interaction: Challenges for Transcribing Multimodality

 

ABSTRACT

The article focuses on the principles of multimodal CA, the way they can be operationalized in a transcription system, and the analytical and conceptual consequences of transcription choices. Elaborating on the foundations of multimodal CA and drawing on video recordings of French and Swiss German encounters, as well as animal interactions, the article discusses classic and contemporary challenges for transcription and analysis, such as, beyond gesture and gaze, body arrangements in interactional spaces, larger groups, material environments, mobile settings, silent activities, and animal encounters. It also highlights the diversity of multimodal practices involved: mobilizing occasioned material resources, movements not only of the upper (head, gesture) but also the lower (feet, legs, posterior) parts of the body, haptic contacts with objects and coparticipants, and camera movements. The precise transcription of relevant details reveals complex arrangements of multimodal resources and gestalts. Their fine-grained, distinct, multiple temporalities constitute the basis of their sequential order, that is, for sequentiality as a fundamental organizational principle of action. Data are in French and Swiss German.

Notes

1 This distinguishes how “multimodality” is defined in CA versus other approaches, where it can be seen as concerning the characteristics of texts, semiotic signs (e.g., visual vs. written), and digital interfaces (see Mondada, 2014a, p. 138). For a discussion on multimodality and transcription from a semiotic perspective, see Bezemer and Mavers (2011). My aim here is not to compare different transcription systems, which would require another article (see Ayass, 2015).

2 For instance, iconic gestures can be “environmentally coupled” (Goodwin, 2007), they often relate to sedimented manual actions (Streeck, 2009), and they can be shaped in specific ways by objects working as prosthetic extensions (Mondada, 2014a).

3 Because of lack of space, I do not discuss the position that refutes the importance of transcribing multimodality here. Basically, the main argument defended here is that transcribing is indispensable for a fine-grained analytical investigation of temporally ordered details. This also holds true in the face of arguments claiming that video clips would solve all the problems and make multimodal transcription obsolete: even within a—highly desirable—editorial model of scientific articles that include clips in the analytical text, transcripts would still be needed for precise temporal and sequential analysis.

4 Aligning software like ELAN or CLAN allows one to closely align the original recorded data and the transcript but does not fundamentally solve this problem. Because of lack of space, I do not discuss the impact of such software on transcribing here. However, most excerpts have been transcribed using ELAN; ELAN transcripts are fully convertible into the conventions presented here.

5 This points at the difference between transcribing—which depends on a preliminary analysis of what is made locally relevant by the participants, which in turn depends on the way they configure their embodied and verbal actions within its situated ecology—and coding. The former relies on descriptions adapted to the specificity of the particular movement targeted and the latter on standard labels selected from a predefined list of labels (a coding scheme) and homogeneously used throughout the corpus.

6 The need for specific conventions for embodiment relates to limitations associated with the use of verbal conventions for annotating the body. For example, the use of [brackets] for indicating the placement of embodied conduct within talk would be treating it as having the same properties as overlapping turns, which is not the case. The temporality of embodied cues has different affordances and constraints than talk (e.g., extended simultaneity is a general feature of embodiment but a problematic feature for talk). The use of ((double parentheses)) for embodied conducts is also problematic because it reduces them to comments that are inserted in the flow of talk, ignoring their precise temporal location in the ongoing action and their specific temporal trajectory.

7 See the website indicated at the end of this article. Although I have developed and used the system since the beginning of the 2000s (e.g., Mondada, 2007) and the convention is increasingly being used by other scholars, there is no publication as yet that sets out its specific rationale and principles of use. This article fills that gap.

8 All extracts were collected during ethnographic fieldwork with the agreement of the participants, who were informed about the use of the video recordings in analyses featuring selected moments of interaction and represented in transcripts involving textual and visual representations. In the textual part of the transcript, the identity of the participants has been anonymized (e.g., the names identifying them are all pseudonyms); however, the participants did authorize the use of unanonymized photographic images (i.e., screen shots that have been neither pixelated nor covered with black stripes over the faces), which are important for the analysis of gaze and other facial expressions.

9 This article focuses on the conceptual rationale underlying these conventions; for detailed instructions on how to implement them in precisely formatted transcripts, see the web reference to a tutorial at the end of the article.

10 Transcripts are visual-textual hybrids. The diversity of types of images and their role in transcripts are an important point that cannot be discussed here for lack of space. However, see Mondada (2016b) for an extended discussion on different uses of images in transcripts and their analytical and theoretical consequences.

11 The initial movement is signaled with a ≫, the last one, with a -≫ (4). These double arrows refer to movements beginning/continuing before/after the extract, which is important for their location within broader streams of action and temporal spans.

12 For the representation of mobility, cartographic representations complementing other visual representations can be useful. This is especially relevant for longer mobile trajectories, such as those involving cars or bikes (McIlvenny, 2015), but might be difficult to implement for micromobilities, such as small steps.

13 An interesting alternative notation used by Luff and Heath (2015; see also Goodwin, 1981) represents segments of 0.1 seconds with dashes (-). Although this avoids a numbered quantification of time, it still relies on measuring the segment’s length. It has the advantage of showing the emergent progression of time, but it still divides time into homogeneously measured units. Hence, this alternative notation still relies on chronos rather than kairos.
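
As a rough illustration of the dash notation described in note 13, the following minimal Python sketch renders a measured silence as one dash per 0.1 seconds. The function name `silence_as_dashes` and its parameters are hypothetical, introduced here only for illustration; they are not part of any published convention or software. The sketch also makes the note's point visible: the output avoids printed numbers, yet it is still computed from a measured duration divided into equal units.

```python
def silence_as_dashes(duration_s: float, unit_s: float = 0.1) -> str:
    """Render a silence as one dash per unit of time (assumed: 0.1 s per dash).

    Hypothetical helper for illustration only; not an official tool.
    """
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    if unit_s <= 0:
        raise ValueError("unit_s must be positive")
    # Round to the nearest whole number of units so that floating-point
    # noise (e.g., 0.7 / 0.1 == 6.999...) does not drop a dash.
    n_units = round(duration_s / unit_s)
    return "-" * n_units


if __name__ == "__main__":
    # A 0.7-second silence becomes seven dashes.
    print(silence_as_dashes(0.7))
```

The rounding step is a design choice of this sketch: a transcriber working by ear and eye would make the equivalent judgment when deciding how many tenths of a second a silence spans.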

14 The extract belongs to a video corpus assembled in collaboration with A. Meguerditchian at the CNRS primatology center in Rousset (France).
