Data Fusion for Real-time Multimodal Emotion Recognition through Webcams and Microphones in E-Learning

Kiavash Bahreini, Rob Nadolski & Wim Westera

REFERENCES

  • Bahreini, K., Nadolski, R., Qi, W., & Westera, W. (2012a). FILTWAM - A framework for online game-based communication skills training - Using webcams and microphones for enhancing learner support. In P. Felicia (Ed.), The 6th European Conference on Games Based Learning (ECGBL) (pp. 39–48). Cork, Ireland: Academic Conferences, Ltd.
  • Bahreini, K., Nadolski, R., & Westera, W. (2012b). FILTWAM - A framework for online affective computing in serious games. In A. De Gloria & S. de Freitas (Eds.), The 4th International Conference on Games and Virtual Worlds for Serious Applications (VS-GAMES ’12), Procedia Computer Science, 15 (pp. 45–52). Genoa, Italy: Elsevier B.V.
  • Bahreini, K., Nadolski, R., & Westera, W. (2014). Towards multimodal emotion recognition in E-learning environments. Interactive Learning Environments. DOI: 10.1080/10494820.2014.908927.
  • Bahreini, K., Nadolski, R., & Westera, W. (2015). Towards real-time speech emotion recognition for affective E-learning. Education and Information Technologies, 1–20. DOI: 10.1007/s10639-015-9388-2.
  • Ben Ammar, M., Neji, M., Alimi, A. M., & Gouarderes, G. (2010). The affective tutoring system. Expert Systems with Applications, 37(4), 3013–3023.
  • Biswas, P., & Langdon, P. (2015). Multimodal intelligent eye-gaze tracking system. International Journal of Human–Computer Interaction, 31(4), 277–294, DOI: 10.1080/10447318.2014.1001301.
  • Bosch, N., Chen, H., D’Mello, S., Baker, R., & Shute, V. (2015). Accuracy vs. availability heuristic in multimodal affect detection in the wild. Proceedings of the 2015 ACM on International Conference on Multimodal Interaction (ICMI ‘15) (pp. 267–274). Seattle, WA: ACM.
  • Martin, B. (1995). Instance-based learning: Nearest neighbour with generalization. Hamilton, NZ: University of Waikato, Department of Computer Science.
  • Buisine, S., Courgeon, M., Charles, A., Clavel, C., Martin, J. C., Tan, N., & Grynszpan, O. (2014). The role of body postures in the recognition of emotions in contextually rich scenarios. International Journal of Human–Computer Interaction, 30(1), 52–62, DOI: 10.1080/10447318.2013.802200.
  • Burkhardt, F., Paeschke, A., Rolfes, M., Sendlmeier, W., & Weiss, B. (2005). A database of German emotional speech. Proceedings of Interspeech 2005, 1517–1520. Lisbon, Portugal.
  • Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., … Narayanan, S. S. (2004). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of ACM 6th International Conference on Multimodal Interfaces (pp. 205–211). New York: ACM.
  • Castellano, G., Kessous, L., & Caridakis, G. (2008). Emotion recognition through multiple modalities: Face, body gesture, speech. In C. Peter & R. Beale (Eds.), Affect and emotion in human–computer interaction, Lecture Notes in Computer Science 4868 (pp. 92–103). Berlin Heidelberg: Springer.
  • Chen, L. S., Huang, T. S., Miyasato, T., & Nakatsu, R. (1998). Multimodal human emotion/expression recognition. Proceedings of the 3rd IEEE International Conference on Automatic Face and Gesture Recognition (FG’98), 366–371.
  • Chen, L. (2000). Joint processing of audio-visual information for the recognition of emotional expressions in human–computer interaction. PhD thesis. University of Illinois at Urbana–Champaign.
  • Cohen, W. W. (1995). Fast effective rule induction. Twelfth International Conference on Machine Learning, 115–123.
  • Cootes, T. F., Taylor, C. J., Cooper, D. H., & Graham, J. (1995). Active shape models – Their training and application. Computer Vision and Image Understanding, 61(1), 38–59.
  • Cristinacce, D., & Cootes, T. (2004). A comparison of shape constrained facial feature detectors. IEEE International Conference on Automatic Face and Gesture Recognition (FG’04), 375–380.
  • Cristinacce, D., & Cootes, T. (2008). Automatic feature localisation with constrained local models. Pattern Recognition, 41(10), 3054–3067.
  • De Silva, L. C., & Ng, L. C. (2000). Bimodal emotion recognition. IEEE International Conference on Automatic Face and Gesture Recognition, 332–335.
  • D’Mello, S. K., & Graesser, A. C. (2012). AutoTutor and Affective AutoTutor: Learning by talking with cognitively and emotionally intelligent computers that talk back. ACM Transactions on Interactive Intelligent Systems, 2(4), 1–39.
  • Ekman, P. (1972). Universals and cultural differences in facial expression of emotion. In J. K. Cole (Ed.), Nebraska Symposium on Motivation (pp. 207–283). Lincoln, NE: University of Nebraska Press.
  • Ekman, P., & Friesen, W. V. (1978). Facial action coding system: Investigator’s guide. Palo Alto, CA: Consulting Psychologists Press. https://www.paulekman.com/product/facs-manual/
  • Frank, E., Hall, M., & Pfahringer, B. (2003). Locally weighted Naive Bayes. 19th Conference on Uncertainty in Artificial Intelligence, 249–256.
  • Gaffary, Y., Eyharabide, V., Martin, J. C., & Ammi, M. (2014). The impact of combining kinesthetic and facial expression displays on emotion recognition by users. International Journal of Human–Computer Interaction, 30(11), 904–920, DOI: 10.1080/10447318.2014.941276.
  • Geertzen, J. (2012). Inter-rater agreement with multiple raters and variables. Retrieved from https://mlnl.net/jg/software/ira/
  • Gross, R., Matthews, I., Cohn, J., Kanade, T., & Baker, S. (2008). Multi-PIE. IEEE International Conference on Automatic Face and Gesture Recognition (FG’08), 1–8.
  • Grubb, C. (2013). Multimodal emotion recognition. Technical Report. Retrieved from http://orzo.union.edu/Archives/SeniorProjects/2013/CS.2013/
  • Hallgren, K. A. (2012). Computing inter-rater reliability for observational data: An overview and tutorial. Tutorials in Quantitative Methods for Psychology, 8(1), 23–34. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/.
  • Hühnel, I., Fölster, M., Werheid, K., & Hess, U. (2014). Empathic reactions of younger and older adults: No age related decline in affective responding. Journal of Experimental Social Psychology, 50, 136–143.
  • Jack, R. E., Garrod, O. G. B., Yu, H., Caldara, R., & Schyns, P. G. (2012). Facial expressions of emotion are not culturally universal. Proceedings of the National Academy of Sciences, 109(19), 7241–7244. DOI: 10.1073/pnas.1200155109.
  • Jaimes, A., & Sebe, N. (2007). Multimodal human–computer interaction: A survey. Computer Vision and Image Understanding, Special Issue on Vision for Human–Computer Interaction, 108(1–2), 116–134.
  • Jiang, L., & Zhang, H. (2006). Weightily averaged one-dependence estimators. Proceedings of the 9th Biennial Pacific Rim International Conference on Artificial Intelligence (PRICAI), 970–974.
  • Krahmer, E., & Swerts, M. (2011). Audio-visual expression of emotions in communication. In Philips Research Book Series 12 (pp. 85–106). Dordrecht, The Netherlands: Springer.
  • Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.
  • Lang, G., & van der Molen, H. T. (2008). Psychologische gespreksvoering [Psychological communication]. Heerlen: Open University of the Netherlands.
  • Le Cessie, S., & van Houwelingen, J. C. (1992). Ridge estimators in logistic regression. Applied Statistics, 41(1), 191–201.
  • Lucey, P., Cohn, J. F., Kanade, T., Saragih, J., Ambadar, Z., & Matthews, I. (2010). The extended Cohn-Kanade dataset (CK+): A complete facial expression dataset for action unit and emotion-specified expression. In Proceedings of the Third IEEE Workshop on CVPR for Human Communicative Behavior Analysis (CVPR4HB 2010) (pp. 94–101). San Francisco, CA: IEEE.
  • Messer, K., Matas, J., Kittler, J., Luettin, J., & Maitre, G. (1999). XM2VTSDB: The extended M2VTS database. International Conference on Audio and Video-Based Biometric Person Authentication (AVBPA’99), 72–77.
  • Murthy, G. R. S., & Jadon, R. S. (2009). Effectiveness of Eigenspaces for facial expression recognition. International Journal of Computer Theory and Engineering, 1(5), 638–642.
  • Nadolski, R. J., Hummel, H. G. K., Van den Brink, H. J., Hoefakker, R., Slootmaker, A., Kurvers, H., & Storm, J. (2008). EMERGO: Methodology and toolkit for efficient development of serious games in higher education. Simulation & Gaming, 39(3), 338–352. Retrieved from http://sag.sagepub.com/content/39/3/338.full.pdf+html
  • Nwe, T., Foo, S., & De Silva, L. (2003). Speech emotion recognition using hidden Markov models. Speech Communication, 41, 603–623.
  • Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2(11), 559–572. DOI: 10.1080/14786440109462720.
  • Pekrun, R. (1992). The impact of emotions on learning and achievement: Towards a theory of cognitive/motivational mediators. Applied Psychology: An International Review, 41, 359–376.
  • Platt, J. (1999). Fast training of support vector machines using sequential minimal optimization. In B. Schoelkopf, C. Burges, & A. Smola (Eds.), Advances in Kernel Methods - Support Vector Learning (pp. 185–208). Cambridge, MA: MIT Press.
  • Preeti, K. (2013). Multimodal emotion recognition for enhancing human–computer interaction. PhD dissertation. Narsee Monjee Institute of Management Studies, Department of Computer Engineering, Mumbai, India.
  • Rus, V., D’Mello, S. K., Hu, X., & Graesser, A. C. (2013). Recent advances in intelligent tutoring systems with conversational dialogue. AI Magazine, 34(3), 42–54.
  • Russell, J. A. (1994). Is there universal recognition of emotion from facial expression? A review of the cross-cultural studies. Psychological Bulletin, 115, 102–141.
  • Saragih, J., Lucey, S., & Cohn, J. F. (2011). Deformable model fitting by regularized landmark mean-shift. International Journal of Computer Vision (IJCV), 91(2), 200–215.
  • Sarrafzadeh, A., Alexander, S., Dadgostar, F., Fan, C., & Bigdeli, A. (2008). How do you know that I don’t understand? A look at the future of intelligent tutoring systems. Computers in Human Behavior, 24(4), 1342–1363.
  • Schuller, B., Lang, M., & Rigoll, G. (2002). Multimodal emotion recognition in audio-visual communication. IEEE International Conference on Multimedia and Expo, ICME ‘02, 1, 745–748. DOI: 10.1109/ICME.2002.1035889.
  • Sebe, N., Cohen, I., Gevers, T., & Huang, T. S. (2006). Emotion recognition based on joint visual and audio cues. 18th International Conference on Pattern Recognition (ICPR 2006), 1136–1139.
  • Sebe, N. (2009). Multimodal interfaces: Challenges and perspectives. Journal of Ambient Intelligence and Smart Environments, 1(1), 23–30.
  • Van der Molen, H. T., & Gramsbergen-Hoogland, Y. H. (2005). Communication in organizations: Basic skills and conversation models. New York, NY: Psychology Press.
  • Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2001), I-511–I-518.
  • Viola, P., & Jones, M. (2004). Robust real-time face detection. International Journal of Computer Vision, 57(2), 137–154.
  • Vogt, T. (2011). Real-time automatic emotion recognition from speech: The recognition of emotions from speech in view of real-time applications. Südwestdeutscher Verlag für Hochschulschriften. ISBN-10: 3838125452.
  • Wagner, J., Lingenfelser, F., Baur, T., Damian, I., Kistler, F., & André, E. (2013). The Social Signal Interpretation (SSI) framework: Multimodal signal processing and recognition in real-time. Proceedings of the 21st ACM International Conference on Multimedia, MM ’13, 831–834.
  • Wang, S., Ling, X., Zhang, F., & Tong, J. (2010). Speech emotion recognition based on principal component analysis and back propagation neural network. In Proceedings of the 2010 International Conference on Measuring Technology and Mechatronics Automation (ICMTMA ’10), Vol. 3 (pp. 437–440). Washington, DC: IEEE Computer Society.
  • Zeng, Z., Pantic, M., Roisman, G. I., & Huang, T. S. (2009). A survey of affect recognition methods: Audio, visual, and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1), 39–58.
  • Zhang, Z. (1999). Feature-based facial expression recognition: Sensitivity analysis and experiment with a multi-layer perceptron. International Journal of Pattern Recognition and Artificial Intelligence, 13(6), 893–911.
  • Zheng, F., & Webb, G. I. (2006). Efficient lazy elimination for averaged one-dependence estimators. Proceedings of the Twenty-Third International Conference on Machine Learning (ICML 2006), 1113–1120.
  • Zheng, Z., & Webb, G. (2000). Lazy learning of Bayesian rules. Machine Learning, 41(1), 53–84.