Full Papers

Expressing reactive emotion based on multimodal emotion recognition for natural conversation in human–robot interaction*

Yuanchao Li, Carlos Toshinori Ishi, Koji Inoue, Shizuka Nakamura & Tatsuya Kawahara
Pages 1030-1041 | Received 31 Oct 2018, Accepted 08 Sep 2019, Published online: 19 Sep 2019
 

Abstract

Human–human interaction consists of various nonverbal behaviors that are often emotion-related. To establish rapport, it is essential that the listener respond with a reactive emotion appropriate to the speaker's emotional state. However, human–robot interactions generally fail in this regard because most spoken dialogue systems play only a question-answering role. Aiming for natural conversation, we examine an emotion processing module for a spoken dialogue system, consisting of a user emotion recognition function and a reactive emotion expression function, to improve human–robot interaction. For the emotion recognition function, we propose a method that combines valence from prosody and sentiment from text by decision-level fusion, which considerably improves recognition performance. Moreover, this method reduces fatal recognition errors, thereby improving the user experience. For the reactive emotion expression function, the system's emotion is divided into an emotion category and an emotion level, which are predicted from the parameters estimated by the recognition function on the basis of distributions inferred from human–human dialogue data. As a result, the emotion processing module recognizes the user's emotion from his or her speech and expresses a matching reactive emotion. An evaluation with ten participants demonstrated that the system enhanced by this module is effective for conducting natural conversation.
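The decision-level fusion and the category/level mapping described in the abstract can be sketched in a few lines. The Python fragment below is only an illustration: the score ranges, fusion weight, thresholds, and function names are assumptions for the sketch and are not taken from the paper.

# Minimal sketch of decision-level fusion of prosody valence and text
# sentiment, followed by a reactive emotion category/level mapping.
# Weights, thresholds, and names are illustrative assumptions, not the
# authors' implementation.

def fuse_valence_and_sentiment(prosody_valence: float,
                               text_sentiment: float,
                               w_prosody: float = 0.5) -> float:
    """Combine a prosody-based valence score and a text-based sentiment
    score (both assumed to lie in [-1, 1]) by weighted decision-level fusion."""
    return w_prosody * prosody_valence + (1.0 - w_prosody) * text_sentiment

def reactive_emotion(fused_valence: float) -> tuple:
    """Map the fused estimate to a reactive emotion category and level.
    The boundaries and the three-step level are placeholder choices; the
    paper infers such mappings from human–human dialogue data."""
    if fused_valence >= 0.1:
        category = "positive"
    elif fused_valence <= -0.1:
        category = "negative"
    else:
        category = "neutral"
    level = min(3, 1 + int(abs(fused_valence) * 3))  # coarse intensity 1-3
    return category, level

if __name__ == "__main__":
    cat, lvl = reactive_emotion(fuse_valence_and_sentiment(0.4, 0.7))
    print(cat, lvl)  # -> positive 2

In this sketch the fusion weight simply trades off the two modalities; the paper's contribution is how the fused estimate and the human–human dialogue distributions drive the reactive emotion expression, which is abstracted here into the placeholder mapping.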

GRAPHICAL ABSTRACT

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

* This paper was originally submitted to the special issue on Robot and Human Interactive Communication.

Additional information

Funding

This work was supported by JST ERATO Ishiguro Symbiotic Human-Robot Interaction program [grant number JPMJER1401], Japan.

Notes on contributors

Yuanchao Li

Yuanchao Li received his BE in electronic information engineering from Nanjing University of Posts and Telecommunications, China, in 2014, and his MS in informatics from Kyoto University, Japan, in 2017. He is currently a research engineer at Honda R&D Co., Ltd., Japan. His research interests include affective computing, spoken language processing, and human–robot interaction. He is a member of APSIPA and ISCA.

Carlos Toshinori Ishi

Carlos Toshinori Ishi received his PhD in engineering from The University of Tokyo, Japan, in 2001. He worked on the JST/CREST Expressive Speech Processing Project at ATR from 2002 to 2004 and joined ATR Intelligent Robotics and Communication Labs in 2005. Since 2013, he has been the group leader of the Dept. of Sound Environment Intelligence at ATR Hiroshi Ishiguro Labs. He is a member of ISCA, RSJ, and ASJ.

Koji Inoue

Koji Inoue received his MS and PhD degrees in informatics from Kyoto University, Japan, in 2015 and 2018, respectively. He is currently an Assistant Professor at the Graduate School of Informatics, Kyoto University, and was a Research Fellow of the Japan Society for the Promotion of Science (JSPS) from 2015 to 2018. His research interests include spoken dialogue systems, speech signal processing, multimodal interaction, and conversational robots. He is a member of IEEE and ISCA.

Shizuka Nakamura

Shizuka Nakamura received her MS in 2009 and PhD in 2012, both in Global Information and Telecommunication Studies, from Waseda University. She was a JSPS Research Fellow at Waseda University from 2009 to 2012 and an Assistant Professor at the Graduate School of Language and Culture, Osaka University, from 2012 to 2015. She has been a Researcher at the Graduate School of Informatics, Kyoto University, since 2015. She is engaged in research on speech communication from the viewpoint of speech sciences. She is a member of IPA, ISCA, and ASJ.

Tatsuya Kawahara

Tatsuya Kawahara received his BE in 1987, ME in 1989, and PhD in 1995, all in information science, from Kyoto University, Kyoto, Japan. From 1995 to 1996, he was a Visiting Researcher at Bell Laboratories, Murray Hill, NJ, USA. He is currently a Professor in the School of Informatics, Kyoto University, and has also been an Invited Researcher at ATR and NICT. He has published more than 400 academic papers on speech recognition, spoken language processing, and spoken dialogue systems. He has conducted several projects, including the speech recognition software Julius and the automatic transcription system for the Japanese Parliament (Diet). Dr Kawahara received the Commendation for Science and Technology from the Minister of Education, Culture, Sports, Science and Technology (MEXT) in 2012. From 2003 to 2006, he was a member of the IEEE SPS Speech Technical Committee. He was a General Chair of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2007), a Tutorial Chair of INTERSPEECH 2010, and a Local Arrangement Chair of ICASSP 2012. He was an editorial board member of the Elsevier journal Computer Speech and Language and of IEEE/ACM Transactions on Audio, Speech, and Language Processing. He is the Editor-in-Chief of APSIPA Transactions on Signal and Information Processing, a board member of APSIPA and ISCA, and a Fellow of IEEE.
