Evaluation of Lexical-Based Approaches to the Semantic Similarity of Malay Sentences

Abstract

We evaluate existing and modified approaches for measuring the semantic similarity of sentences in the Malay language. These approaches have mainly been applied to English sentences, and no study to date has evaluated or compared their effectiveness on Malay sentences. We used a pre-processed Malay machine-readable dictionary to calculate word-to-word semantic similarity with two methods: probability of intersection and normalization. We then used the word-to-word similarity measure to compute sentence-level semantic similarity. We evaluated five sentence similarity measures: vector-based semantic similarity, word order similarity, highest word-to-sentence similarity, the combination of vector-based and word-to-sentence similarity, and the combination of word order and word-to-sentence similarity. We also evaluated the effects of including and excluding lexical components such as prepositions, conjunctions, verbs, and morphological variants.
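As a rough illustration of how a word-to-word score can be lifted to sentence level, the sketch below implements two of the measure families named above, plus a weighted combination. It is not the authors' implementation: the abstract does not give the formulas, so the cosine-over-joint-word-set and bidirectional best-match formulations are assumptions borrowed from common practice for English. The `TOY_SIM` table, the function names, and the `weight` parameter are all hypothetical, and the toy scores stand in for values that the paper derives from a Malay machine-readable dictionary.

```python
import math

# Hypothetical toy word-to-word scores; NOT taken from the paper or its dictionary.
TOY_SIM = {
    frozenset(("kereta", "motokar")): 0.9,
    frozenset(("besar", "luas")): 0.6,
}


def word_sim(a, b):
    """Return 1.0 for identical words, a toy score if listed, else 0.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)


def best_match(word, sentence):
    """Highest similarity between one word and any word of a sentence."""
    return max((word_sim(word, w) for w in sentence), default=0.0)


def vector_similarity(s1, s2):
    """Cosine of two semantic vectors built over the joint word set (assumed form)."""
    joint = sorted(set(s1) | set(s2))
    v1 = [best_match(w, s1) for w in joint]
    v2 = [best_match(w, s2) for w in joint]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    return dot / norm if norm else 0.0


def word_to_sentence_similarity(s1, s2):
    """Mean best-match score per word, averaged over both directions (assumed form)."""
    if not s1 or not s2:
        return 0.0
    forward = sum(best_match(w, s2) for w in s1) / len(s1)
    backward = sum(best_match(w, s1) for w in s2) / len(s2)
    return (forward + backward) / 2


def combined_similarity(s1, s2, weight=0.5):
    """Weighted mix of the two measures; the weight is a hypothetical parameter."""
    return (weight * vector_similarity(s1, s2)
            + (1 - weight) * word_to_sentence_similarity(s1, s2))


if __name__ == "__main__":
    a = ["kereta", "itu", "besar"]
    b = ["motokar", "itu", "luas"]
    print(vector_similarity(a, b))
    print(word_to_sentence_similarity(a, b))
    print(combined_similarity(a, b))
```

Sentences are passed as pre-tokenized word lists, so the effect of including or excluding components such as prepositions or conjunctions could be explored by filtering the lists before calling these functions.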

Acknowledgments

The authors wish to thank the Ministry of Higher Education for the funds provided for this project and also the anonymous referees for their helpful and constructive comments on this paper.
