
M2SA: a novel dataset for multi-level and multi-domain sentiment analysis

Huyen Trang Phan, Ngoc Thanh Nguyen, Dosam Hwang & Yeong-Seok Seo
Pages 494-512 | Received 01 Mar 2023, Accepted 20 Jun 2023, Published online: 07 Jul 2023
 

ABSTRACT

The development of social networks has given people more channels to express their opinions and feelings about events, products, and celebrities. These networks are becoming rich data sources and are gaining attention both in practical applications and in research. Sentiment analysis (SA) is one of the most common uses of this data. Most currently available SA datasets are suitable for SA at only one specific level, such as the document, sentence, or aspect level. This makes it difficult to develop practical systems that require a combination of sentiment analyses at all three levels. Additionally, previous datasets include opinions on only a single domain, although people often mention multiple domains when expressing their views. This study introduces a new multi-level and multi-domain dataset for SA, called M2SA. Each sample in M2SA contains a short text with at least two sentences and two aspects with different domains and sentiment polarities. The release of the M2SA dataset will contribute to research in the field of SA, primarily by promoting the development and improvement of methods for multi-level SA or multi-aspect, multi-domain SA. The M2SA dataset was tested using state-of-the-art SA methods and compared with other standard datasets. The results demonstrate that the M2SA dataset is better than the previous datasets at supporting improvements in the performance of SA methods.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2023R1A2C1008134).

Notes on contributors

Huyen Trang Phan

Huyen Trang Phan received an M.S. degree in computer science from the University of Science and Technology, The University of Da Nang, Vietnam, in 2015, and a Ph.D. degree in computer engineering from Yeungnam University, South Korea, in 2020. She is currently an assistant professor in the Department of Computer Engineering, Yeungnam University, South Korea, and a lecturer at the Faculty of Information Technology, Nguyen Tat Thanh University, Vietnam. She has authored seven journal articles and twelve conference papers as the first author. Her research interests include sentiment analysis, fake news detection, text summarization, and decision support systems.

Ngoc Thanh Nguyen

Ngoc Thanh Nguyen received the Ph.D. degree in computer science from Wroclaw University in 1989. He is currently a full professor at Wroclaw University and head of the Department of Information Systems. He was named a Distinguished Scientist of ACM in 2009. He was an ACM Distinguished Speaker and IEEE Distinguished Visitor (2009-2013). He is also chair of the IEEE SMC Technical Committee on Computational Collective Intelligence and general chair of two conferences, ICCCI and ACIIDS. He has contributed to over 300 publications in various reputed journals, conferences, and books. His research interests include computational collective intelligence, knowledge integration, big data, inconsistent knowledge processing, and multi-agent systems.

Dosam Hwang

Dosam Hwang received the Ph.D. degree from Kyoto University, Kyoto, Japan. He is a professor emeritus of Yeungnam University. He was a full professor in the Department of Computer Engineering at Yeungnam University, South Korea, from 1996 to February 2023. His research interests mainly include natural language processing, ontology, knowledge engineering, information retrieval, and machine translation. He served as the head of Yeungnam University's Computer Engineering Department for five years between 2005 and 2009. He has also held a position as a principal researcher at the Korea Institute of Science and Technology (KIST) and has been a visiting professor at the Korea Advanced Institute of Science and Technology (KAIST). He has been a co-chair of several international conferences and a steering committee member of the ICCCI, ACIIDS, and MISSI international conferences. He was the Assistant Secretary of ISO/TC37/SC4 for language resource management from 2005 to 2007 and, at the same time, the Secretary of the Korean TC for ISO/TC37/SC4. In 2006, he was the Director of the Korean Society for Cognitive Science (KSCS) and the Korean Information Science Society (KISS). He has been serving as the Society's Director and the Mentor of a knowledge engineering study group since 2007. He has also participated in several Korean national research projects, such as a project on a machine translation system (1985-1990), the national IT ontology infrastructure and technology development project ‘CoreOnto’ (2006-2009), and ‘Exobrain’ (2013-2014), a project focused on the construction of a deep knowledge base and question-answering platform. He is now in charge of intelligent service integration based on IoT Big Data as part of another of Korea's principal national research projects, ‘BK+’ (2014-present).
In recognition of his commitment and contributions to these fields of study, he was honored as a Distinguished Researcher of KIST in 1988 by Korea's Ministry of Science and Technology (MoST) and was awarded a prize for Good Conduct from Kyunghee High School in 1973. He has more than 50 publications.

Yeong-Seok Seo

Yeong-Seok Seo received the Ph.D. degree in computer science from the Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea, in 2012. Since September 2022, he has been an Associate Professor (Tenure Track) with the Department of Computer Engineering, Yeungnam University, Gyeongsan, Gyeongbuk, Republic of Korea. His research interests include software engineering, artificial intelligence, the Internet of Things, and big data analysis. He is an Associate Editor of Human-centric Computing and Information Sciences (HCIS) (SCI indexed), Processes (SCI indexed), Electronics (SCI indexed), and Journal of Information Processing Systems (JIPS) (SCOPUS/ESCI indexed). He was also a Guest Editor of the Journal of Systems and Software (JSS) (SCI indexed). Furthermore, he is involved in international standardization activities and is a member of the Korean National Body mirror committee to ISO on IT Service Management and IT Governance (ISO/IEC JTC1/SC40). He has also served as a technical committee member for international conferences and workshops such as ICSE 2020 Demonstrate Track, ICSE 2020 Software Engineering in Practice, MITA 2019, QRS 2020, CSA 2020, WITC 2021, FutureTech 2021, CUTE 2021, CSA 2021, WITC 2022, FutureTech 2022, CUTE 2022, CSA 2022, MITA 2021, WITC 2023, ACIIDS 2023, and ICCCI 2023. Prof. Seo is a member of the board of directors of the software engineering society in Korea. He received the Undang academic paper award (grand prize) in 2022 and the 2nd JIPS Survey Paper Awards in 2019, and received the best paper award at ASK 2022, ASK 2021, MITA 2021, and 2019.