47
Views
1
CrossRef citations to date
0
Altmetric
Original Articles

Using a language independent domain model for multilingual information extraction

Pages 705-724 | Published online: 26 Nov 2010
 

The volume of electronic text in different languages, particularly on the World Wide Web, is growing significantly, and the problem of users who are restricted in the number of languages they read obtaining information from this text is becoming more widespread. This article investigates some of the issues involved in achieving multilingual information extraction (IE), describes the approach adopted in the M-LaSIE-II IE system, which addresses these problems, and presents the results of evaluating the approach against a small parallel corpus of English/French newswire texts. The approach is based on the assumption that it is possible to construct a language independent representation of concepts relevant to the domain, at least for the small well-defined domains typical of IE tasks, allowing multilingual IE to be successfully carried out without requiring full machine translation.

Reprints and Corporate Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

To request a reprint or corporate permissions for this article, please click on the relevant link below:

Academic Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

Obtain permissions instantly via Rightslink by clicking on the button below:

If you are unable to obtain permissions via Rightslink, please complete and submit this Permissions form. For more information, please visit our Permissions help page.