Multimodal Emotion Recognition for Children with Autism Spectrum Disorder in Social Interaction

Pages 1921–1930 | Received 25 Nov 2022, Accepted 23 Jun 2023, Published online: 11 Jul 2023

