Abstract
Automatic text summarization is an emerging field of research in Natural Language Processing. This work is a novel attempt to bring a low-resource language, Konkani, into the domain of automatic text summarization. We use supervised machine learning algorithms to perform single-document extractive summarization of Konkani documents. In particular, we propose training these algorithms on language-independent features, using a Konkani dataset devised specifically for these experiments from books of Konkani folktale literature. We approach automatic text summarization as a binary classification problem: once trained, the algorithms classify sentences by their relevance to generate a summary. The performance of popular linear and non-linear supervised machine learning algorithms is then evaluated using K-fold cross-validation. The system-generated summaries are compared with human-generated summaries to verify their effectiveness. The results show that the linear models perform better than the non-linear models, although all models outperformed the baselines. The proposed methodology produces promising summaries without requiring any language-specific domain knowledge.
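The pipeline described in the abstract (language-independent sentence features, a binary relevance classifier, K-fold cross-validation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the three features (sentence position, normalised length, term-frequency score), the choice of logistic regression as one example of a linear model, and all function names are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def sentence_features(sentences):
    """Assumed language-independent features for the sentences of one non-empty document."""
    tokens = [s.lower().split() for s in sentences]
    tf = Counter(w for t in tokens for w in t)
    max_len = max(len(t) for t in tokens) or 1
    n = len(sentences)
    feats = []
    for i, t in enumerate(tokens):
        position = 1.0 - i / n            # earlier sentences score higher
        length = len(t) / max_len         # length relative to the longest sentence
        # mean term frequency of the sentence's words, scaled to [0, 1]
        tf_score = sum(tf[w] for w in t) / (len(t) * max(tf.values())) if t else 0.0
        feats.append([position, length, tf_score])
    return feats


def predict_proba(w, x):
    """Probability that a sentence is summary-worthy; w[-1] is the bias term."""
    z = w[-1] + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))          # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=200, lr=0.5):
    """Logistic regression fitted by stochastic gradient ascent (illustrative linear model)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - predict_proba(w, xi)
            for j, v in enumerate(xi):
                w[j] += lr * err * v
            w[-1] += lr * err
    return w


def k_fold_accuracy(X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds; requires len(X) >= k."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        w = train_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum((predict_proba(w, X[i]) >= 0.5) == bool(y[i]) for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / k


def summarize(w, sentences, threshold=0.5):
    """Keep, in original order, the sentences classified as relevant."""
    feats = sentence_features(sentences)
    return [s for s, f in zip(sentences, feats) if predict_proba(w, f) >= threshold]
```

In use, `sentence_features` would be applied to each document, the resulting vectors paired with human-assigned 0/1 labels (1 if the sentence appears in the reference summary), and `k_fold_accuracy` called on the pooled data. None of the features depend on Konkani vocabulary or grammar, which is the sense in which the sketch is language independent.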
DISCLOSURE STATEMENT
No potential conflict of interest was reported by the author(s).
Additional information
Notes on contributors
Jovi D’Silva
Jovi D’Silva is presently a research scholar in the Department of Computer Science and Engineering, School of Engineering, Assam Don Bosco University, Guwahati, India. He obtained BCA and MCA degrees from Bangalore University, India, and an MTech in computer science and engineering from Christ University, India. His research area is natural language processing. Email: [email protected]
Uzzal Sharma
Uzzal Sharma obtained his MCA from IGNOU and completed his PhD at Gauhati University. He has over 16 years of experience in academia and industry. His research areas include speech signal processing, software engineering, cyber security and data engineering. He is currently an assistant professor (Stage 2) at Assam Don Bosco University, Guwahati, India. Email: [email protected]