
Multimodal Sentiment Analysis Using Multi-tensor Fusion Network with Cross-modal Modeling

Article: 2000688 | Received 08 Jul 2021, Accepted 20 Oct 2021, Published online: 19 Nov 2021

References

  • Chaturvedi, I., R. Satapathy, S. Cavallari, and E. Cambria. 2019. Fuzzy commonsense reasoning for multimodal sentiment analysis. Pattern Recognition Letters 125:185–200. doi:10.1016/j.patrec.2019.04.024.
  • Chauhan, D. S., M. S. Akhtar, A. Ekbal, and P. Bhattacharyya. 2019. Context-aware interactive attention for multi-modal sentiment and emotion analysis. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, 5647–57.
  • Degottex, G., J. Kane, T. Drugman, T. Raitio, and S. Scherer. 2014. COVAREP: A collaborative voice analysis repository for speech technologies. In IEEE international conference on acoustics, speech and signal processing (ICASSP), 960–64. IEEE, Florence, Italy.
  • Ebrahimi, M., A. H. Yazdavar, and A. Sheth. 2017. Challenges of sentiment analysis for dynamic events. IEEE Intelligent Systems 32 (5):70–75. doi:10.1109/MIS.2017.3711649.
  • Sohrab, F., J. Raitoharju, A. Iosifidis, and M. Gabbouj. 2020. Multimodal subspace support vector data description. Pattern Recognition 110:107648.
  • Kumar, A., and J. Vepa. 2020. Gated mechanism for attention based multimodal sentiment analysis. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4477–81. IEEE, Spain.
  • Li, Y., K. Zhang, J. Wang, and X. Gao. 2021. A cognitive brain model for multimodal sentiment analysis based on attention neural networks. Neurocomputing 430:159–73. doi:10.1016/j.neucom.2020.10.021.
  • Liu, Y. J., and C. L. Cheng. 2018. A gesture feature extraction algorithm based on key frames and local extremum. Computer Technology and Development 28 (3):127–31.
  • Majumder, N., D. Hazarika, A. Gelbukh, E. Cambria, and S. Poria. 2018. Multimodal sentiment analysis using hierarchical fusion with context modeling. Knowledge-Based Systems 161:124–33. doi:10.1016/j.knosys.2018.07.041.
  • Mittal, T., U. Bhattacharya, R. Chandra, A. Bera, and D. Manocha. 2020. M3er: Multiplicative multimodal emotion recognition using facial, textual, and speech cues. Proceedings of the AAAI Conference on Artificial Intelligence 34:1359–67. doi:10.1609/aaai.v34i02.5492.
  • Patel, D., X. Hong, and G. Zhao. 2016. Selective deep features for micro-expression recognition. In The 23rd international conference on pattern recognition (ICPR), 2258–63. IEEE, Cancún, Mexico.
  • Pennington, J., R. Socher, and C. D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the conference on empirical methods in natural language processing (EMNLP), Doha, Qatar, 1532–43.
  • Pham, H., P. P. Liang, T. Manzini, L. P. Morency, and B. Póczos. 2019. Found in translation: Learning robust joint representations by cyclic translations between modalities. Proceedings of the AAAI Conference on Artificial Intelligence 33:6892–99. doi:10.1609/aaai.v33i01.33016892.
  • Plank, B., A. Søgaard, and Y. Goldberg. 2016. Multilingual part-of-speech tagging with bidirectional long short-term memory models and auxiliary loss. arXiv preprint arXiv:1604.05529.
  • Poria, S., E. Cambria, D. Hazarika, N. Mazumder, A. Zadeh, and L. P. Morency. 2017. Multi-level multiple attentions for contextual multimodal sentiment analysis. In IEEE International Conference on Data Mining (ICDM), 1033–38. IEEE, New Orleans, LA, USA.
  • Poria, S., I. Chaturvedi, E. Cambria, and A. Hussain. 2016. Convolutional MKL based multimodal emotion recognition and sentiment analysis. In The 16th IEEE international conference on data mining (ICDM), 439–48. IEEE, Barcelona, Spain.
  • Sahay, S., S. H. Kumar, R. Xia, J. Huang, and L. Nachman. 2018. Multimodal relational tensor network for sentiment and emotion classification. arXiv preprint arXiv:1806.02923.
  • Stöckli, S., M. Schulte-Mecklenbeck, S. Borer, and A. C. Samson. 2018. Facial expression analysis with AFFDEX and FACET: A validation study. Behavior Research Methods 50 (4):1446–60. doi:10.3758/s13428-017-0996-1.
  • Tsai, Y. H. H., P. P. Liang, A. Zadeh, L. P. Morency, and R. Salakhutdinov. 2018. Learning factorized multimodal representations. arXiv preprint arXiv:1806.06176.
  • Tsai, Y. H. H., S. Bai, P. P. Liang, J. Z. Kolter, L. P. Morency, and R. Salakhutdinov. 2019. Multimodal transformer for unaligned multimodal language sequences. In Proceedings of the Conference of the Association for Computational Linguistics, 6558–69. NIH Public Access, Florence, Italy.
  • Xi, C., G. Lu, and J. Yan. 2020. Multimodal sentiment analysis based on multi-head attention mechanism. In Proceedings of the 4th International Conference on Machine Learning and Soft Computing, New York, NY, USA, 34–39.
  • Xing, F. Z., E. Cambria, and R. E. Welsch. 2018. Natural language based financial forecasting: A survey. Artificial Intelligence Review 50 (1):49–73. doi:10.1007/s10462-017-9588-9.
  • Xue, H., X. Yan, S. Jiang, and H. Lai. 2020. Multi-tensor fusion network with hybrid attention for multimodal sentiment analysis. In The International Conference on Machine Learning and Cybernetics (ICMLC), Shenzhen, 169–74. IEEE.
  • Young, T., E. Cambria, I. Chaturvedi, H. Zhou, S. Biswas, and M. Huang. 2018. Augmenting end-to-end dialogue systems with commonsense knowledge. In Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.
  • Zadeh, A. B., P. P. Liang, S. Poria, E. Cambria, and L. P. Morency. 2018c. Multimodal language analysis in the wild: Cmu-mosei dataset and interpretable dynamic fusion graph. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics 1:2236–46.
  • Zadeh, A., M. Chen, S. Poria, E. Cambria, and L. P. Morency. 2017. Tensor fusion network for multimodal sentiment analysis. arXiv preprint arXiv:1707.07250.
  • Zadeh, A., P. P. Liang, N. Mazumder, S. Poria, E. Cambria, and L. P. Morency. 2018a. Memory fusion network for multi-view sequential learning. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.
  • Zadeh, A., P. P. Liang, S. Poria, P. Vij, E. Cambria, and L. P. Morency. 2018b. Multi-attention recurrent network for human communication comprehension. In Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.
  • Zadeh, A., R. Zellers, E. Pincus, and L. P. Morency. 2016. Multimodal sentiment intensity analysis in videos: Facial gestures and verbal messages. IEEE Intelligent Systems 31 (6):82–88. doi:10.1109/MIS.2016.94.
  • Zhou, P., W. Shi, J. Tian, Z. Qi, B. Li, H. Hao, and B. Xu. 2016. Attention-based bidirectional long short-term memory networks for relation classification. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics 2:207–12.