Original Articles

Revision history: Translation trends in Wikipedia

Pages 16-34 | Published online: 30 Jul 2014
 

Abstract

Wikipedia is a well-known example of a website with content developed entirely through crowdsourcing. It has over 4 million articles in English alone, and content in 284 other language versions. While the articles in the different versions are often written directly in the respective target language, translations also take place. Given that a previous study suggested that many of English Wikipedia's translators had neither formal training in translation nor professional work experience as translators, it is worth examining the quality of the translations produced. This paper uses Mossop's taxonomy of editing and revising procedures to explore a corpus of translated Wikipedia articles to determine how often transfer and language/style problems are present in these translations and to assess how these problems are addressed.

Note on contributor

Julie McDonough Dolmaya teaches in the School of Translation at York University's Glendon Campus. She obtained her doctorate in translation studies from the University of Ottawa in 2009. Her research interests range from political translation and oral history to crowdsourcing, blogs and web translation, and she has published articles on these topics in Meta, The Translator, Translation Studies, and others. She blogs about her teaching and research at www.mcdonough-dolmaya.ca.

Notes

1. For instance, Twitter Engineering Manager Gaku Ueda noted in a 2012 interview that “Because our volunteers are avid Twitter users, they understand the product before translating the string. They tend to come up with good translations” (Thicke 2012). Likewise, in a 2012 interview, Baldur's Gate Enhanced Edition Creative Director Trent Oster talked about the quality of the crowdsourced translation of Baldur's Gate, remarking that “these are passionate fans of the series, they know the ins-and-outs, they know the little details, and they're doing it because they love it. The end result is just – the quality is so much higher. The attention to detail is so much higher. They know the terms, they know what THAC0 means and how it should be framed in their language to be understandable to someone who don't [sic] know the rules necessarily” (“How Fans Translated,” 2012).

3. Occasionally, articles are listed in the clean-up section even though they are not actually translations: this usually happens when the article appears to have been written by a non-native speaker of English. The Sonora article (listed in the Appendix) is one such example. As Wikipedia editor Maunus noted on 5 September 2011 when posting the article to “Wikipedia: Pages Needing Translation into English”: “The initial language of this article was English written by a Spanish speaker. This article seems to have been written by a Spanish speaking English learner, possible [sic] with the help of a machine translator.” And indeed, the English article resembles the Spanish Sonora article very little in terms of content and organization, though some passages do correspond closely. See http://en.wikipedia.org/w/index.php?title=Wikipedia:Pages_needing_translation_into_English&oldid=450719832 for the Sonora listing on “Wikipedia: Pages Needing Translation into English”.

4. To see this, click on the “view history” tab on any Wikipedia article. This will bring up a page entitled “[Article Title]: Revision History”. On it, users can “compare selected revisions”. In the legend key, though, users are told that “m” refers to a “minor edit”, while → refers to a “section edit”, etc.
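The navigation described in this note can also be reached directly by URL. The following is a minimal sketch, assuming the standard MediaWiki `index.php?title=...&action=history` address pattern; the function name `history_url` and its `lang` parameter are illustrative and do not come from the article.

```python
from urllib.parse import urlencode


def history_url(title: str, lang: str = "en") -> str:
    """Build the address of an article's "Revision history" page.

    Assumption: the wiki uses the standard MediaWiki URL layout, in which
    spaces in titles are written as underscores.
    """
    query = urlencode({"title": title.replace(" ", "_"), "action": "history"})
    return f"https://{lang}.wikipedia.org/w/index.php?{query}"


# "Mocopulli Airport" is one of the article titles mentioned in these notes.
print(history_url("Mocopulli Airport"))
```

Opening the printed address should show the same page as clicking the “view history” tab.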

5. The total number of edits to each of the 94 articles from the date the article was created until mid-May 2013 was, in most cases, higher than the figures in . This is because many of the articles had been edited numerous times before someone identified them as needing translation or clean-up. is intended to reflect the extent of the revision process once a Wikipedian had identified an article as being in need of translation or revision.

6. When an edit is made by a registered Wikipedia user, his or her user name is recorded. When an internet user does not have a Wikipedia account or is not logged in, his or her IP (Internet Protocol) address is recorded instead. The figures in the “No. of Editors” columns were calculated by counting each unique user name (or IP address) in the page history as one “editor”. Multiple edits by the same user or IP address were counted only once. It is possible, however, that some Wikipedia users edited an article under their user names and also under their IP addresses, which means that the figures in the “No. of Editors” columns may be higher than they should be.
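The counting rule in this note can be illustrated with a short sketch. It is not the procedure actually used for the study, only one way of expressing "each unique user name or IP address counts once". The function name `count_editors` and the sample page history are hypothetical; the user names are invented and the IP address comes from a range reserved for documentation.

```python
def count_editors(page_history):
    """Count unique editors in a page history.

    page_history: iterable of (editor_id, edit_summary) pairs, where
    editor_id is either a user name or an IP address (an assumed layout).
    Repeated edits by the same identifier are counted only once.
    """
    return len({editor_id for editor_id, _ in page_history})


# Hypothetical page history: four edits, three distinct identifiers.
sample_history = [
    ("ExampleUserA", "rough translation cleaned up"),
    ("192.0.2.1", "fixed typo"),
    ("ExampleUserA", "added infobox"),
    ("ExampleUserB", "removed translation tag"),
]

print(count_editors(sample_history))  # 3
```

If ExampleUserA had also made the edit recorded under 192.0.2.1 while logged out, the true number of editors would be two, which is the overcounting risk the note describes.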

7. Indeed, the page history of the English Filipe Espinosa article reveals that on 11 October 2012, about three months after the article was posted to “Wikipedia: Pages Needing Translation into English”, Wikipedia user Cutiekatie removed the article from the list, noting that “i [sic] think it was previously tagged wrong. This doesn't read like a rough translation at all and the article was started in English.”

8. The fact that 2 of the articles in the sub-corpus are not translations does suggest that some of the other 94 articles in the corpus may also be texts originally written in English rather than translations from French and Spanish. The existence of such texts opens the door to avenues of research related to those raised by Toury (2012, 54) when he discusses pseudo-translations, namely, what do the pseudo-translations listed on “Wikipedia: Pages Needing Translation into English” reveal about how Wikipedians regard the status of translated texts and the features of translated texts (e.g. incorrect syntax, grammatical errors)?

9. An infobox is a table added to the top right-hand corner of an article to summarize information shared by certain types of articles. For instance, airport infoboxes like the one added to “Mocopulli Airport” specify the airport type (public/private), owner, operator, location, coordinates, website, etc. See http://en.wikipedia.org/wiki/Help:Infobox for more details and examples.
