Quantitative analysis of Sesotho sa Leboa part-of-speech taggers

References

  • Akbik A, Blythe D, Vollgraf R. 2018. Contextual string embeddings for sequence labeling. Proceedings of the 27th International Conference on Computational Linguistics, 20–26 August, Santa Fe, New Mexico. pp. 1638–1649.
  • Amrullah AZ, Hartanto R, Mustika IW. 2017. A comparison of different part-of-speech tagging techniques for text in Bahasa Indonesia. Proceedings of the 7th International Annual Engineering Seminar, 1–2 August, Yogyakarta, Indonesia. pp. 1–5. https://doi.org/10.1109/INAES.2017.8068538
  • Anbananthen KSM, Krishnan JK, Sayeed MS, Muniapan P. 2017. Comparison of stochastic and rule-based POS tagging on Malay online text. American Journal of Applied Sciences 14(9): 843–851. https://doi.org/10.3844/ajassp.2017.843.851
  • Babych B, Sharoff S. 2016. Ukrainian part-of-speech tagger for hybrid MT: Rapid induction of morphological disambiguation resources from a closely related language. Proceedings of the Fifth Workshop on Hybrid Approaches to Translation (HyTra). European Association for Machine Translation (EAMT), 1 June, Riga, Latvia.
  • Bahcevan CA, Kutlu E, Yildiz T. 2018. Deep neural network architecture for part-of-speech tagging for Turkish language. Proceedings of the Third International Conference on Computer Science and Engineering (UBMK), 20–23 September, Sarajevo. pp. 235–238. https://doi.org/10.1109/UBMK.2018.8566272
  • Barnard E, Davel M, Van Heerden C, de Wet F, Badenhorst J. 2014. The NCHLT speech corpus of the South African languages. Proceedings of the Fourth International Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU), 14–16 May, St Petersburg, Russia. pp. 194–200.
  • Bohnet B, McDonald R, Simoes G, Andor D, Pitler E, Maynez J. 2018. Morphosyntactic tagging with a Meta-BiLSTM model over context sensitive token encodings. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 15–20 July, Melbourne, Australia. pp. 2642–2652. https://doi.org/10.18653/v1/P18-1246
  • Choi JD. 2016. Dynamic feature induction: The last gist to the state-of-the-art. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 12–17 June, San Diego, California. pp. 271–281. https://doi.org/10.18653/v1/N16-1031
  • Denis P, Sagot B. 2012. Coupling an annotated corpus and a lexicon for state-of-the-art POS tagging. Language Resources and Evaluation 46(4): 721–736. https://doi.org/10.1007/s10579-012-9193-0
  • De Schryver GM, De Pauw GD. 2007. Dictionary Writing System (DWS) + Corpus Query Package (CQP): The case of TshwaneLex. Lexikos 17: 226–246. https://doi.org/10.5788/17-0-554
  • Du Toit JS. 2017. A comparative evaluation of open-source part-of-speech taggers for South African languages. Honours thesis, North-West University, Potchefstroom, South Africa.
  • Eiselen R, Puttkammer MJ. 2014. Developing text resources for ten South African languages. Language Resource and Evaluation Conference, 26–31 May, Reykjavik, Iceland. pp. 3698–3703.
  • Eiselen ER, Puttkammer MJ, Hocking J, Kruger A. 2018. CTexTools 2.1.0. SADiLaR resource catalogue [distributor]. https://hdl.handle.net/20.500.12185/480
  • Faaß G, Heid U, Taljard E, Prinsloo DJ. 2009. Part-of-speech tagging of Northern Sotho: Disambiguating polysemous function words. Proceedings of the First Workshop on Language Technologies for African Languages, 31 March, Athens, Greece. pp. 38–45.
  • Garrette D, Mielens J, Baldridge J. 2013. Real-world semi-supervised learning of POS-taggers for low-resource languages. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 4–9 August, Sofia, Bulgaria. pp. 583–592.
  • Givón T. 1976. Topic, pronoun, and grammatical agreement. In: Li CN (ed.), Subject and Topic. New York: Academic Press. pp. 149–188.
  • Halácsy P, Kornai A, Oravecz C. 2007. HunPos: An Open Source Trigram Tagger. Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, 25–27 June, Prague, Czech Republic. pp. 209–212. https://doi.org/10.3115/1557769.1557830
  • Heid U, Taljard E, Prinsloo DJ. 2006. Grammar-based tools for the creation of tagging resources for an unresourced language: the case of Northern Sotho. Proceedings of the 5th International Conference on Language Resources and Evaluation, 24–26 May, Genoa, Italy. pp. 2235–2240.
  • Jurafsky D, Martin JH. 2009. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Hoboken: Prentice Hall.
  • Koleva M. 2013. Towards adaptation of NLP tools for closely related Bantu languages: Building a part-of-speech tagger for Zulu. PhD thesis, Saarland University, Germany.
  • Kurniawan K, Aji AF. 2018. Toward a standardized and more accurate Indonesian part-of-speech tagging. Proceedings of 2018 International Conference on Asian Language Processing, 15–18 November, Bandung, Indonesia. pp. 303–307. https://doi.org/10.1109/IALP.2018.8629236
  • Ling W, Dyer C, Black AW, Trancoso I, Fermandez R, Amir S, Marujo L, Luís T. 2015. Finding function in form: Compositional character models for open vocabulary word representation. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, September, Lisbon, Portugal. pp. 1520–1530. https://doi.org/10.18653/v1/D15-1176
  • Lombard DP. 1985. Introduction to the grammar of Northern Sotho. Pretoria: Van Schaik.
  • Lu X. 2014. Computational Methods for Corpus Annotation and Analysis. Dordrecht: Springer. https://doi.org/10.1007/978-94-017-8645-4
  • Ma X, Hovy E. 2016. End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. http://arxiv.org/abs/1603.01354
  • Malema G, Okgetheng B, Motlhanka M. 2017. Setswana part-of-speech tagging. International Journal on Natural Language Computing 6(6): 15–20. https://doi.org/10.5121/ijnlc.2017.6602
  • Nasiruddin M. 2013. A state of the art of word sense induction: A way towards word sense disambiguation for under-resourced languages. http://arxiv.org/abs/1310.1425
  • Petrochenkov VV, Kazennikov AO. 2013. A statistical tagger for morphological tagging of Russian language texts. Automation and Remote Control 74(10): 1724–1732. https://doi.org/10.1134/S0005117913100123
  • Poulos G, Louwrens LJ. 1994. A linguistic analysis of Northern Sotho. Pretoria: Via Afrika.
  • Prinsloo DJ, Faaß G, Taljard E, Heid U. 2008. Designing a verb guesser for part of speech tagging in Northern Sotho. Southern African Linguistics and Applied Language Studies 26(2): 185–196. https://doi.org/10.2989/SALALS.2008.26.2.1.565
  • Puttkammer MJ, Schlemmer M. 2014. NCHLT Part-of-speech taggers. 1.0. SADiLaR resource catalogue [distributor]. https://hdl.handle.net/20.500.12185/323
  • Puttkammer MJ, Schlemmer M, Bekker R. 2014. NCHLT Sepedi annotated text corpora. SADiLaR resource catalogue [distributor]. https://hdl.handle.net/20.500.12185/325
  • Taljard E, De Schryver GM. 2016. A corpus-driven account of the noun classes and genders in Northern Sotho. Southern African Linguistics and Applied Language Studies 34(2): 169–185. https://doi.org/10.2989/16073614.2016.1206478
  • Taljard E, Faaß G, Heid U, Prinsloo DJ. 2008. On the development of a tagset for Northern Sotho with special reference to the issue of standardisation. Literator 29(1): 111–138. https://doi.org/10.4102/lit.v29i1.103
  • Tasharofi S, Raja F, Oroumchian F, Rahgozar M. 2007. Evaluation of statistical part of speech tagging of Persian text. The 9th International Symposium on Signal Processing and Its Applications, 1–4 February, Sharjah, United Arab Emirates. pp. 1–4. https://doi.org/10.1109/ISSPA.2007.4555312
  • Tseng H, Jurafsky D, Manning C. 2005. Morphological features help POS tagging of unknown words across language varieties. Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing, 14–15 October, Jeju Island, Korea. pp. 32–39.
  • Wilken I, Gumede T, Moors C, Calteaux K. 2018. Human language technology audit 2018: Design considerations and methodology. Proceedings of 2018 International Conference on Intelligent and Innovative Computing Applications, 1–7 December, Plaine Magnien, Mauritius. pp. 1–7. https://doi.org/10.1109/ICONIC.2018.8601212
  • Van Rooy B, Pretorius R. 2003. A word-class tagset for Setswana. Southern African Linguistics and Applied Language Studies 21(4): 203–222. https://doi.org/10.2989/16073610309486344
  • Zerbian S. 2006. Expression of information structure in the Bantu language Northern Sotho. PhD thesis, Humboldt University, Berlin, Germany.