Computers and Computing

Automatic Text Summarization of Konkani Folk Tales Using Supervised Machine Learning Algorithms and Language Independent Features

Jovi D’Silva & Uzzal Sharma
Pages 6162-6175 | Published online: 24 Oct 2021
 

Abstract

Automatic text summarization is an emerging field of research in Natural Language Processing. This work is a novel attempt to bring a low-resource language, Konkani, into the domain of automatic text summarization. We use supervised machine learning algorithms to perform single-document extractive summarization of Konkani documents. In particular, we propose training these algorithms on language-independent features, using a Konkani dataset built specifically for these experiments from books of Konkani folktale literature. We treat summarization as a binary classification problem: once trained, each algorithm classifies sentences by their relevance, and the sentences classified as relevant form the summary. The performance of popular linear and non-linear supervised machine learning algorithms is then evaluated using K-fold cross-validation, and the system-generated summaries are compared with human-generated summaries to verify their effectiveness. The results show that the linear models perform better than the non-linear models, although all models outperformed the baselines. The proposed methodology produces promising summaries without requiring any language-specific domain knowledge.
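The pipeline described in the abstract (language-independent sentence features, binary sentence classification, K-fold evaluation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not list the features or name the algorithms, so the three features (sentence position, relative length, mean term frequency), the hand-written logistic regression standing in for "a linear model", and all function names are hypothetical rather than the authors' implementation.

```python
import math
from collections import Counter

def sentence_features(sentences):
    """Return [position, relative_length, mean_term_frequency] per sentence.

    Tokens are whitespace-separated, so no language-specific resources are
    needed. The choice of these three features is an assumption.
    """
    tokenized = [s.split() for s in sentences]
    counts = Counter(tok for toks in tokenized for tok in toks)
    total = sum(counts.values()) or 1
    max_len = max((len(t) for t in tokenized), default=1) or 1
    n = len(sentences)
    feats = []
    for i, toks in enumerate(tokenized):
        position = 1.0 - i / n  # earlier sentences get a higher score
        rel_len = len(toks) / max_len
        mean_tf = sum(counts[t] for t in toks) / (len(toks) * total) if toks else 0.0
        feats.append([position, rel_len, mean_tf])
    return feats

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit a logistic regression (a linear classifier) by stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(model, x):
    """Return 1 (include the sentence in the summary) or 0 (leave it out)."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0

def cross_validate(X, y, k):
    """Return the accuracy on each of the k held-out folds."""
    scores = []
    for train, test in k_fold_indices(len(X), k):
        model = train_logistic([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        scores.append(hits / len(test))
    return scores
```

Given a document split into sentences and a 0/1 label per sentence marking whether a human summarizer kept it, `cross_validate(sentence_features(sentences), labels, k)` returns one accuracy per fold. Swapping `train_logistic` for a non-linear learner is all it would take to run the kind of linear versus non-linear comparison the abstract reports.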

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Notes on contributors

Jovi D’Silva

Jovi D’Silva is currently a research scholar in the Department of Computer Science and Engineering, School of Engineering, Assam Don Bosco University, Guwahati, India. He obtained his BCA and MCA degrees from Bangalore University, India, and an MTech in computer science and engineering from Christ University, India. His research area is natural language processing.

Uzzal Sharma

Uzzal Sharma obtained his MCA from IGNOU and completed his PhD at Gauhati University. He has over 16 years of experience in academia and industry. His research areas include speech signal processing, software engineering, cyber security and data engineering. He is currently an assistant professor (Stage 2) at Assam Don Bosco University, Guwahati, India.

