Research Article

An Image-Text Sentiment Analysis Method Using Multi-Channel Multi-Modal Joint Learning

Article: 2371712 | Received 03 Dec 2023, Accepted 17 Jun 2024, Published online: 28 Jun 2024

References

  • An, J., W. M. N. W. Zainon, and Z. Hao. 2023. Improving targeted multimodal sentiment classification with semantic description of images. Computers, Materials & Continua 75(3):5801–20. doi:10.32604/cmc.2023.038220.
  • Angamuthu, S., and P. Trojovský. 2023. Integrating multi-criteria decision-making with hybrid deep learning for sentiment analysis in recommender systems. PeerJ Computer Science 9:e1497. doi:10.7717/peerj-cs.1497.
  • Bao, H., L. Dong, and F. Wei. 2021. BEiT: BERT pre-training of image transformers. arXiv preprint arXiv:2106.08254.
  • Cai, G., and B. Xia. 2015. Convolutional neural networks for multimedia sentiment analysis. In Natural Language Processing and Chinese Computing: 4th CCF Conference, NLPCC 2015, Nanchang, China, October 9–13, 2015, Proceedings, 159–67. Springer International Publishing.
  • Chochlakis, G., T. Srinivasan, J. Thomason, and S. S. Narayanan. 2022. VAuLT: Augmenting the vision-and-language transformer for sentiment classification on social media. arXiv preprint arXiv:2208.09021.
  • Gandhi, A., K. U. Adhvaryu, S. Poria, E. Cambria, and A. Hussain. 2022. Multimodal sentiment analysis: A systematic review of history, datasets, multimodal fusion methods, applications, challenges and future directions. Information Fusion 91:424–44. doi:10.1016/j.inffus.2022.09.025.
  • Gao, Q., B. Cao, X. Guan, T. Gu, X. Bao, J. Wu, B. Liu, and J. Cao. 2022. Emotion recognition in conversations with emotion shift detection based on multi-task learning. Knowledge-Based Systems 248:108861. doi:10.1016/j.knosys.2022.108861.
  • Guo, W., Y. Zhang, X. Cai, L. Meng, J. Yang, and X. Yuan. 2021. LD-MAN: Layout-driven multimodal attention network for online news sentiment recognition. IEEE Transactions on Multimedia 23:1785–98. doi:10.1109/TMM.2020.3003648.
  • Hazarika, D., R. Zimmermann, and S. Poria. 2020. MISA: Modality-invariant and -specific representations for multimodal sentiment analysis. In Proceedings of the 28th ACM International Conference on Multimedia, 1122–31. https://arxiv.org/abs/2005.03545.
  • Jia, L., T. Ma, H. Rong, V. S. Sheng, X. Huang, and X. Xie. 2023. A rearrangement and restore mixer model for target-oriented multimodal sentiment classification. IEEE Transactions on Artificial Intelligence 1–11. doi:10.1109/TAI.2023.3341879.
  • Khan, J., N. Ahmad, S. Khalid, F. Ali, and Y. Lee. 2023. Sentiment and context-aware hybrid DNN with attention for text sentiment classification. IEEE Access 11:28162–79. doi:10.1109/ACCESS.2023.3259107.
  • Li, J., C. Wang, Z. Luo, Y. Wu, and X. Jiang. 2024. Modality-dependent sentiments exploring for multi-modal sentiment classification. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 7930–34. IEEE.
  • Liu, X., F. Wei, W. Jiang, Q. Zheng, Y. Qiao, J. Liu, L. Niu, Z. Chen, and H. Dong. 2023. MTR-SAM: Visual multimodal text recognition and sentiment analysis in public opinion analysis on the internet. Applied Sciences 13(12):7307. doi:10.3390/app13127307.
  • Liu, X., T. Wu, and G. Guo. 2022. Adaptive sparse ViT: Towards learnable adaptive token pruning by fully exploiting self-attention. arXiv preprint arXiv:2209.13802.
  • Manek, A. S., and P. D. Shenoy. 2022. Mining the web data: Intelligent information retrieval system for filtering spam and sentiment analysis. In 2022 IEEE International Conference for Women in Innovation, Technology and Entrepreneurship (ICWITE), 1–10. IEEE.
  • Meena, G., K. Mohbey, K. Kumar, and K. Lokesh. 2023. A hybrid deep learning approach for detecting sentiment polarities and knowledge graph representation on monkeypox tweets. Decision Analytics Journal 7:100243. doi:10.1016/j.dajour.2023.100243.
  • Niu, T., S. Zhu, L. Pang, and A. El-Saddik. 2016. Sentiment analysis on multi-view social data. In Proceedings of the 22nd International Conference on Multimedia Modeling (MMM 2016), 15–27. doi:10.1007/978-3-319-27674-8_2.
  • Paul, S., and P. Y. Chen. 2022. Vision transformers are robust learners. Proceedings of the AAAI Conference on Artificial Intelligence 36(2):2071–81. doi:10.1609/aaai.v36i2.20103.
  • Riyadh, M., and M. O. Shafiq. 2022. GAN-BElectra: Enhanced multi-class sentiment analysis with limited labeled data. Applied Artificial Intelligence 36(1):2083794. doi:10.1080/08839514.2022.2083794.
  • She, D., J. Yang, M. Cheng, Y. Lai, P. L. Rosin, and L. Wang. 2020. WSCNet: Weakly supervised coupled networks for visual sentiment classification and detection. IEEE Transactions on Multimedia 22(5):1358–71. doi:10.1109/TMM.2019.2939744.
  • Tan, C. H., A. Chan, M. Haldar, J. Tang, X. Liu, M. Abdool, H. Gao, L. He, and S. Katariya. 2023. Optimizing Airbnb search journey with multi-task learning. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 4872–81. doi:10.1145/3580305.3599881.
  • Tsai, Y. H., S. Bai, P. P. Liang, J. Z. Kolter, L. Morency, and R. Salakhutdinov. 2019. Multimodal transformer for unaligned multimodal language sequences. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 6558–69.
  • Wang, H., X. Li, Z. Ren, M. Wang, and C. Ma. 2023. Multimodal sentiment analysis representations learning via contrastive learning with condense attention fusion. Sensors 23(5):2679. doi:10.3390/s23052679.
  • Wang, H., C. Ren, and Z. Yu. 2024. Multimodal sentiment analysis based on cross-instance graph neural networks. Applied Intelligence 54(4):3403–16. doi:10.1007/s10489-024-05309-0.
  • Wang, Y., Y. Shen, Z. Liu, P. P. Liang, A. Zadeh, and L. P. Morency. 2019. Words can shift: Dynamically adjusting word representations using nonverbal behaviors. Proceedings of the AAAI Conference on Artificial Intelligence 33(01):7216–23. doi:10.1609/aaai.v33i01.33017216.
  • Xiao, L., X. Hu, Y. Chen, X. Yun, B. Chen, D. Gu, and B. Tang. 2022. Multi-head self-attention based gated graph convolutional networks for aspect-based sentiment classification. Multimedia Tools & Applications 1–20.
  • Xie, G. F., N. Liu, X. J. Hu, and Y. T. Shen. 2023. Toward prompt-enhanced sentiment analysis with mutual describable information between aspects. Applied Artificial Intelligence 37(1):2186432. doi:10.1080/08839514.2023.2186432.
  • Xie, Y., and Y. Liao. 2023. Efficient-ViT: A light-weight classification model based on CNN and ViT. In Proceedings of the 2023 6th International Conference on Image and Graphics Processing, 64–70. doi:10.1145/3582649.3582676.
  • Xu, N., and W. Mao. 2017. MultiSentiNet: A deep semantic network for multimodal sentiment analysis. In Proceedings of the 2017 ACM on Conference on Information & Knowledge Management, 2399–402. doi:10.1145/3132847.3133142.
  • Xu, Y., H. Wei, M. Lin, Y. Deng, K. Sheng, M. Zhang, F. Tang, W. Dong, F. Huang, and C. Xu. 2021. Transformers in computational visual media: A survey. Computational Visual Media 8(1):33–62. doi:10.1007/s41095-021-0247-3.
  • Yan, X., H. Xue, S. Jiang, and Z. Liu. 2022. Multimodal sentiment analysis using multi-tensor fusion network with cross-modal modeling. Applied Artificial Intelligence 36(1):2000688. doi:10.1080/08839514.2021.2000688.
  • Yang, B., L. Wu, J. Zhu, B. Shao, X. Lin, and T. Liu. 2022. Multimodal sentiment analysis with two-phase multi-task learning. IEEE/ACM Transactions on Audio, Speech, and Language Processing 30:2015–24. doi:10.1109/TASLP.2022.3178204.
  • Yang, X., S. Feng, D. Wang, P. Hong, and S. Poria. 2023. Few-shot multimodal sentiment analysis based on multimodal probabilistic fusion prompts. In Proceedings of the 31st ACM International Conference on Multimedia, 6045–53.
  • Yang, X., S. Feng, D. Wang, and Y. Zhang. 2020. Image-text multimodal emotion classification via multi-view attentional network. IEEE Transactions on Multimedia 23:4014–26. doi:10.1109/TMM.2020.3035277.
  • Yin, Z., Y. Du, Y. Liu, and Y. Wang. 2024. Multi-layer cross-modality attention fusion network for multimodal sentiment analysis. Multimedia Tools & Applications 83(21):60171–87. doi:10.1007/s11042-023-17685-9.
  • Yu, W., H. Xu, F. Meng, Y. Zhu, Y. Ma, J. Wu, J. Zou, and K. Yang. 2020. CH-SIMS: A Chinese multimodal sentiment analysis dataset with fine-grained annotation of modality. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 3718–27. doi:10.18653/v1/2020.acl-main.343.
  • Yu, Y., H. Lin, J. Meng, and Z. Zhao. 2016. Visual and textual sentiment analysis of a microblog using deep convolutional neural networks. Algorithms 9(2):41. doi:10.3390/a9020041.
  • Zhang, H., Y. Liu, Z. Xiong, Z. Wu, and D. Xu. 2023. Visual sentiment analysis with semantic correlation enhancement. Complex & Intelligent Systems 1–13.
  • Zhang, K. 2022. Faster R-CNN transmission line multi-target detection based on BAM. In 2022 4th International Conference on Intelligent Control, Measurement and Signal Processing (ICMSP), 364–69. IEEE.
  • Zhang, S., C. Yin, and Z. Yin. 2023. Multimodal sentiment recognition with multi-task learning. IEEE Transactions on Emerging Topics in Computational Intelligence 7(1):200–09. doi:10.1109/TETCI.2022.3224929.
  • Zhang, S., H. Yu, and G. Zhu. 2022. An emotional classification method of Chinese short comment text based on ELECTRA. Connection Science 34(1):254–73. doi:10.1080/09540091.2021.1985968.
  • Zhao, Y., M. Mamat, A. Aysa, and K. Ubul. 2023. Multimodal sentiment system and method based on CRNN-SVM. Neural Computing & Applications 35(35):24713–25. doi:10.1007/s00521-023-08366-7.
  • Zhu, T., L. Li, J. Yang, S. Zhao, H. Liu, and J. Qian. 2023. Multimodal sentiment analysis with image-text interaction network. IEEE Transactions on Multimedia 25:3375–85. doi:10.1109/TMM.2022.3160060.