Design and development of lemmatizer for Sindhi language in devanagri script


Abstract

The study of word formation is called morphology. Morphological analysis is one of the important tasks in Natural Language Processing, and information retrieval (IR) and machine translation are two important applications of a morphological analyzer. To search for or retrieve information from the web, we need the normalized form of a word. This task is accomplished by two kinds of morphological analyzer: the stemmer and the lemmatizer. Stemming is a simple process of stripping off word endings to obtain the normalized form of a word, called the stem. Stemming suffers from two types of errors: over-stemming and under-stemming. Under-stemming can be handled by building a database of exceptional words, whereas over-stemming is handled by a lemmatizer. A lemma is the correct root word or dictionary word, generated by applying rules for affix removal together with additional rules for forming a correct dictionary word. This paper presents the design and development of a lemmatizer for the Sindhi language in Devanagari script. This is our first attempt to develop a lemmatizer for Sindhi in Devanagari script.
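
The distinction drawn in the abstract between stemming and lemmatization can be illustrated with a minimal sketch. It is not the paper's implementation: the suffix list, the exception table, and the repair rules below are hypothetical English toy data chosen only to show the mechanism, and the function names `stem` and `lemmatize` are assumptions. A real Sindhi lemmatizer would replace these tables with Devanagari suffixes and language-specific rules.

```python
# Toy illustration only. All rules and word lists here are hypothetical
# English examples, not the Sindhi rules described in the paper.

SUFFIXES = ["ing", "es", "ed", "s"]

# Exception table for words the suffix rules cannot normalize
# (the "database of exceptional words" idea for under-stemming).
EXCEPTIONS = {"mice": "mouse", "went": "go"}

# Additional rules that repair an over-stemmed form into a dictionary word.
REPAIRS = {"mak": "make", "studi": "study"}


def stem(word):
    """Strip the longest matching suffix, keeping at least 3 characters."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word):
    """Look up exceptions first, then stem and repair the result."""
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    stemmed = stem(word)
    return REPAIRS.get(stemmed, stemmed)


if __name__ == "__main__":
    for w in ["making", "studies", "walked", "mice"]:
        print(w, "stem:", stem(w), "lemma:", lemmatize(w))
```

In this sketch the stemmer alone turns "making" into "mak", which is not a dictionary word, while the lemmatizer's extra repair rule restores "make". The word "mice" passes through the stemmer unchanged and is only normalized through the exception table.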
