Abstract
Current methodologies in corpus linguistics have revolutionised the way in which we study language, allowing us to make accurate and retrievable observations and analyses using a range of written and spoken data from naturally occurring contexts. Yet, while current corpora allow us to explore multimillion-word databases, they fail to represent language and communication beyond the word. This is problematic because social interactions are in fact multimodal, combining both verbal and non-verbal elements. This article reports on preliminary developments in this area, carried out as part of an interdisciplinary project based at the University of Nottingham and funded by the UK Economic and Social Research Council, which explores how new textualities can be utilised to extend the scope of what a corpus can reveal and to provide tools for exploring discourse in specific contexts of communication. The linguistic focus of the project is the exploration of the roles and nature of gesture-in-talk, with a particular emphasis on codifying and analysing backchannels. To manage the scale of the research challenge, we examine a particular sub-set of gesture: head nods. The article discusses the development of multimodal corpora, using video data recorded from conversational exchanges that are to be streamed with the verbal data. We explore the requirements that such a ‘multimodal’ corpus should fulfil in order to lead to better descriptions of language and better applications of those descriptions.
Acknowledgement
The research on which this article is based is funded by the UK Economic and Social Research Council (ESRC), e-Social Science Research Node DReSS (www.ncess.ac.uk/nodes/digitalrecord), and the ESRC e-Social Science small grants project HeadTalk (grant no. RES-149-25-1016).
Notes
6 Research at the Max Planck Institute for Psycholinguistics (www.mpi.nl/research/research/other/lc-gesture) is perhaps the most developed in this area, but it does not yet support corpus-linguistic research.