ABSTRACT
This article addresses the type-based automatic lemmatisation of the Old English class I strong verbs L-Y. To this end, it discusses the design of a Morphological Generation interface implemented with the rules of Old English strong verb inflection as well as with the most frequent variant spellings. The forms generated by the interface are automatically checked against the two most representative corpora of Old English, the Dictionary of Old English Corpus and the York-Toronto-Helsinki Parsed Corpus of Old English Prose, and validated forms are assigned the corresponding lemma. The results show that almost 99% of the validated forms can successfully be assigned a lemma. The remaining 1% corresponds to formally ambiguous inflectional forms, for which lemma assignment is disputed between two candidates. The conclusion is that such forms can only be disambiguated through contextualised, token-based analysis. Overall, the article confirms that the type-based lemmatisation of Old English strong verbs can be largely automatised.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 For a detailed account of the query strings and their goals, see Metola Rodríguez (2015).
2 For an exhaustive review of the lemmatised inflectional forms of contracted and preterite-present verbs, see García Fernández (2020).
3 For a detailed explanation of the meaning of the POS labels, see Taylor, Warner, Pintzuk & Beths (2003).