What to expect from Neural Machine Translation: a practical in-class translation evaluation exercise

Pages 375-387 | Received 15 Aug 2017, Accepted 15 Jul 2018, Published online: 23 Jul 2018
 

ABSTRACT

Machine translation is currently undergoing a paradigm shift from statistical to neural network models. Neural machine translation (NMT) is difficult to conceptualise for translation students, especially without context. This article describes a short in-class evaluation exercise to compare statistical and neural MT, including details of student results and follow-on discussions. As part of this exercise, students carry out evaluations of two types of MT output using three translation quality assessment (TQA) metrics: adequacy, post-editing productivity, and a simple error taxonomy. In this way, the exercise introduces NMT, TQA, and post-editing. In our module, a more detailed explanation of NMT followed the evaluation.

The rise of NMT has been accompanied by a good deal of media hyperbole about neural networks and machine learning, some of which has suggested that several professions, including translation, may be under threat. This evaluation exercise is intended to empower the students, and help them understand the strengths and weaknesses of this new technology. Students’ findings using several language pairs mirror those from published research, such as improved fluency and word order in NMT output, with some unpredictable problems of omission and mistranslation.

Disclosure statement

No potential conflict of interest was reported by the author.

Notes

1. See Forcada (2017) for an accessible introduction to the technology behind NMT.

2. Forcada and Ñeco (1997) had, in fact, suggested a method that was effectively a precursor to NMT some years before.

4. See Doherty et al. (2018) and Lommel (2018) for a further discussion of these.

5. From August 2017, the site at https://translator.microsoft.com/neural/ allowed users to compare NMT and SMT output. This changed in March 2018 so that users can test the research systems described in Hassan et al. (2018). At the time of writing (May 2018), users can still access Google SMT via Google Sheets. Because these are free online tools, the MT systems involved are liable to change without warning.

6. Arabic, Chinese (simplified), English, French, German, Italian, Japanese, Korean, Portuguese, Russian, and Spanish.

7. Most Wikipedia topics chosen by the students were geographical locations, with less obvious choices including Russian Blue cats, Women in Nazi Germany, and Korean pop group Girls’ Generation.

Additional information

Funding

This research was supported by the ADAPT Centre for Digital Content Technology, which is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
