Review

Robots that can hear, understand and talk

Pages 533-564 | Published online: 02 Apr 2012

