
Probabilistic nod generation model based on speech and estimated utterance categories

Pages 731-741 | Received 06 Nov 2018, Accepted 05 Apr 2019, Published online: 04 May 2019

