Basic Quantitative Characteristics of the Modern Greek Language Using the Hellenic National Corpus

Dr George Mikros, Department of Italian and Spanish Language and Literature, School of Philosophy, University of Athens; Institute for Language and Speech Processing (ILSP). Corresponding author.

Nick Hatzigeorgiu Institute for Language and Speech Processing (ILSP)

George Carayannis Institute for Language and Speech Processing (ILSP)

Abstract

Modern Greek is one of the least quantitatively studied modern European languages, and this paper aims to help fill that gap. We use the Hellenic National Corpus (HNC), a growing corpus that currently contains 33 million words. The corpus and all the tools used in our work were developed by the Institute for Language and Speech Processing (ILSP). We focus on three main areas: lists of the 1000 most common words and lemmas, word length, and letter frequency. We also compare our results with our earlier work, which used the previous 13-million-word edition of the HNC.
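As a rough illustration of the three kinds of counts named in the abstract, the following minimal sketch computes word frequencies, a word-length distribution, and letter frequencies for a short string. It is not the ILSP tooling: the function name `basic_counts`, the regular-expression tokenizer, and the sample sentence are assumptions made for this example, and lemmatization is left out because it needs a morphological lexicon.

```python
import re
import unicodedata
from collections import Counter


def basic_counts(text):
    """Return (word_freq, length_freq, letter_freq) for a text.

    Hypothetical helper for illustration only; the HNC tools are not
    described in the abstract and may tokenize differently.
    """
    # Normalize to NFC so accented Greek letters are single code points.
    text = unicodedata.normalize("NFC", text).lower()
    # A word is taken to be a maximal run of Unicode letters (an assumption).
    words = re.findall(r"[^\W\d_]+", text)
    word_freq = Counter(words)                             # word form -> count
    length_freq = Counter(len(w) for w in words)           # length -> count
    letter_freq = Counter(ch for w in words for ch in w)   # letter -> count
    return word_freq, length_freq, letter_freq


# Toy input, not taken from the HNC.
word_freq, length_freq, letter_freq = basic_counts("Το παιδί και το σκυλί")
print(word_freq.most_common(3))
print(sorted(length_freq.items()))
print(letter_freq.most_common(5))
```

In this sketch accented and unaccented vowels are counted as different letters; whether a study merges them is a design choice that has to be stated alongside any letter-frequency table.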

Notes

¹The HNC has a Web interface and queries are possible over the Internet at the following Web address: http://hnc.ilsp.gr/

²We would like to thank all the publishers who have donated the texts used in the HNC.

³This test is preferable to other statistical tests because it does not require the measurements to follow a specific distribution. The Wilcoxon test works on the difference within each pair of measurements and gives greater weight to pairs with a larger difference.
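The weighting described in note 3 can be sketched as follows, assuming the test meant is the Wilcoxon signed-rank test for paired measurements. The function name `wilcoxon_signed_rank` and the sample values are hypothetical and are not taken from the paper; the sketch computes only the rank sums, not a p-value.

```python
def wilcoxon_signed_rank(xs, ys):
    """Return (w_plus, w_minus), the rank sums of positive and negative
    paired differences. Illustrative sketch; no p-value is computed."""
    # Difference within each pair; pairs with zero difference are dropped.
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    # Order by absolute difference: larger differences receive larger ranks,
    # which is how the test gives them more weight.
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2  # ties share the average rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Invented paired measurements, for illustration only.
first = [10, 12, 15, 20]
second = [8, 13, 11, 14]
print(wilcoxon_signed_rank(first, second))
```

The smaller of the two rank sums is the usual test statistic; it is then compared against the reference distribution for the given number of non-zero pairs.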
