Original Articles

Measuring Collocation Tendency of Words

Pages 174-187 | Published online: 27 May 2011
 

Abstract

In all natural languages, some words collocate with other words to form multi-word units of meaning: collocations. Because identifying collocations is vital for information retrieval, language learning, psycholinguistics, authorship determination and translation, collocation extraction is an important problem in natural language processing. In this paper we present a method designed to improve current statistical methods that generate ranked lists of collocation candidates.

Because a collocation has meaning integrity, each word in it must suggest, or at least imply, the words that follow it in the collocation. Consequently, the words in an arbitrary text differ in how strongly they facilitate prediction of the next word. A word that helps the prediction tends to collocate; a word that does not help it tends not to. In this paper we attempt to extract collocations by measuring the collocation tendency of words and word combinations. The method filters free word pairs (pairs whose words do not facilitate prediction of the next word, or in which meaning integrity is not yet complete) out of the lists of candidate pairs.
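As a rough illustration of the filtering idea, the sketch below scores each word by how predictable its successor is and drops candidate pairs whose first word predicts poorly. The abstract does not define the tendency measure, so the use of next-word entropy here is an assumption, as are the function names `next_word_entropy` and `filter_free_pairs`, the toy text, and the threshold value.

```python
import math
from collections import Counter, defaultdict

def next_word_entropy(tokens):
    """Map each word to the entropy (bits) of the words observed right after it."""
    followers = defaultdict(Counter)
    for first, second in zip(tokens, tokens[1:]):
        followers[first][second] += 1
    entropy = {}
    for word, counts in followers.items():
        total = sum(counts.values())
        entropy[word] = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
    return entropy

def filter_free_pairs(candidates, entropy, max_entropy):
    """Keep pairs whose first word has low next-word entropy (assumed proxy
    for collocation tendency); words never seen with a successor are dropped."""
    return [
        (a, b) for a, b in candidates
        if entropy.get(a, float("inf")) <= max_entropy
    ]

# Toy text and candidate list, invented for illustration only.
tokens = "new york is big and new york is old and the cat and the dog".split()
entropy = next_word_entropy(tokens)
candidates = [("new", "york"), ("is", "big"), ("the", "cat")]
kept = filter_free_pairs(candidates, entropy, max_entropy=0.5)
```

In this toy text "new" is always followed by "york", so its entropy is zero and the pair survives, while "is" and "the" each have two equally likely successors and their pairs are filtered out.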

The collocation tendency method is tested on a base data set extracted with several statistical collocation extraction techniques (frequency of occurrence, pointwise mutual information, the t-test and the chi-square test) and is evaluated using precision and recall. We found that the collocation tendency method brings a remarkable improvement to the frequency of occurrence and t-test techniques.
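For readers unfamiliar with the baseline techniques, the following sketch computes three of the four named scores (frequency, pointwise mutual information and the t-score) for adjacent word pairs using their usual textbook formulations, and evaluates a ranked list with precision and recall. It is not the authors' implementation: the function names, the toy text, the gold set and the cutoff `k` are all assumptions, and the chi-square test is omitted for brevity.

```python
import math
from collections import Counter

def bigram_scores(tokens):
    """Frequency, PMI and t-score for every adjacent word pair."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)  # used as the sample size for both words and pairs
    scores = {}
    for (a, b), count in bigrams.items():
        p_pair = count / n
        p_indep = (unigrams[a] / n) * (unigrams[b] / n)
        scores[(a, b)] = {
            "freq": count,
            "pmi": math.log2(p_pair / p_indep),
            # t-score with the sample variance approximated by p_pair
            "t": (p_pair - p_indep) / math.sqrt(p_pair / n),
        }
    return scores

def precision_recall(ranked, gold, k):
    """Precision and recall of the top-k ranked pairs against a gold set."""
    top = ranked[:k]
    hits = sum(1 for pair in top if pair in gold)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall

# Toy text and gold set, invented for illustration only.
tokens = "new york is big and new york is old and the cat and the dog".split()
scores = bigram_scores(tokens)
ranked = sorted(scores, key=lambda pair: scores[pair]["t"], reverse=True)
gold = {("new", "york"), ("hot", "dog")}
precision, recall = precision_recall(
    [("new", "york"), ("is", "big"), ("the", "cat")], gold, k=2
)
```

Swapping the sort key between `"freq"`, `"pmi"` and `"t"` produces the different ranked candidate lists that a filtering step could then be applied to.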

Notes

1. A group of two consecutive words in text or speech.

