

Crowdsourcing Translation without Bilingual People

 

Introduction

We propose a rethinking of the translation problem to bring together translation technology and human-computer interaction, producing a framework for translation that exploits imperfect technology and limited human abilities in tandem to achieve capabilities neither can achieve alone. The core of this framework is MonoTrans, an iterative protocol in which monolingual human participants work together to improve imperfect machine translations.

Projects

MonoTrans: Collaborative Translation by Monolingual People

MonoTrans is an iterative protocol in which monolingual people collaborate to produce a translation. Monolingual translation, that is, translation by people who speak only the source language or only the target language, can help solve the problem of translating between rare languages and can help achieve quality translation at a large scale. At the core of monolingual translation are protocols in which the human participants (monolingual source-language or target-language speakers) work together to make sense of machine translations. Since monolingual translation does not depend on bilingual humans, it can enable translation between uncommon language pairs for which a bilingual translator is hard to find. In addition, monolingual translation can draw on a larger population, and thus is likely to result in much higher throughput.
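The iterative exchange described above can be sketched roughly as follows. This is a minimal illustration, not the MonoTrans implementation: it assumes that machine translation carries messages in both directions, that a target-language speaker edits the candidate, and that a source-language speaker judges a back-translation. All function and parameter names (`monolingual_exchange`, `mt_forward`, `mt_backward`, `target_edit`, `source_accepts`) are hypothetical.

```python
from typing import Callable, Tuple


def monolingual_exchange(
    source: str,
    mt_forward: Callable[[str], str],            # assumed: source -> target machine translation
    mt_backward: Callable[[str], str],           # assumed: target -> source machine translation
    target_edit: Callable[[str], str],           # a target-language speaker revises the candidate
    source_accepts: Callable[[str, str], bool],  # a source-language speaker compares meanings
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Return the final target-language candidate and the number of rounds used."""
    # Start from the imperfect machine translation.
    candidate = mt_forward(source)
    for round_no in range(1, max_rounds + 1):
        # The target-side participant sees only target-language text.
        candidate = target_edit(candidate)
        # The source-side participant sees only a back-translation.
        back_translation = mt_backward(candidate)
        if source_accepts(source, back_translation):
            return candidate, round_no
    # Give up after max_rounds and return the latest candidate.
    return candidate, max_rounds
```

Neither human callback ever receives text in a language the participant does not speak, which is the property that lets the loop run without a bilingual person.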

MonoTrans2: Asynchronous Protocol



Source Side UI


Target Side UI


Target Side: Identifying Translation Errors

MonoTrans2 improves on MonoTrans by using an asynchronous protocol. As the design of MonoTrans2 shows, the tasks in a translation process can be designed so that every user can perform a task independently of the others. The tasks can be broken down and shortened so that there is always a task available for any user. Introducing these short, independent tasks removes the synchronicity restriction of monolingual translation, which in turn improves its scalability.
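One way to picture independent short tasks is a pool that any arriving user can draw from without waiting for a partner. The sketch below is only an illustration under assumptions: the `Task` fields, the task kinds, and the `TaskPool` class are hypothetical and are not taken from MonoTrans2.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, Optional


@dataclass
class Task:
    sentence_id: int
    side: str  # "source" or "target": which monolingual group can do the task
    kind: str  # hypothetical task kinds, e.g. "edit" or "flag_error"


class TaskPool:
    """Holds open tasks per language side so users never block on each other."""

    def __init__(self) -> None:
        self._open: Dict[str, Deque[Task]] = {"source": deque(), "target": deque()}

    def add(self, task: Task) -> None:
        # Any finished task may enqueue follow-up tasks for either side.
        self._open[task.side].append(task)

    def next_for(self, side: str) -> Optional[Task]:
        # A user who shows up at any time takes the oldest open task for
        # their side; None means nothing is available right now.
        queue = self._open[side]
        return queue.popleft() if queue else None
```

Because each task is self-contained, a target-language speaker can pick up work even when no source-language speaker is online, and vice versa.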

ParaTrans: Error-Driven Paraphrase

The source text provided to a machine translation system is typically only one of many ways the input sentence could have been expressed, and alternative forms of expression can often produce a better translation. We introduce error-driven paraphrasing of source sentences: instead of paraphrasing a source sentence exhaustively, we obtain paraphrases only for the parts that are predicted to be problematic for the translation system.
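As a rough illustration of restricting paraphrase requests to problematic parts, the sketch below flags source token spans that do not survive a round trip through machine translation. Using a round-trip comparison as the predictor is an assumption made for this example, not a description of the ParaTrans method, and the function name `problem_spans` is hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def problem_spans(
    source_tokens: List[str], round_trip_tokens: List[str]
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges of source tokens that
    were changed or lost in the round-trip translation."""
    matcher = SequenceMatcher(a=source_tokens, b=round_trip_tokens, autojunk=False)
    spans = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "replace" and "delete" mark source tokens that did not come back intact.
        if tag in ("replace", "delete"):
            spans.append((i1, i2))
    return spans
```

Only the flagged spans would then be sent out for paraphrasing, while the rest of the sentence stays untouched.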

Participants

Publications

  • Chang Hu, Philip Resnik, Yakov Kronrod, and Benjamin Bederson. 2012. Deploying monotrans widgets in the wild. In Proceedings of the 2012 ACM annual conference on Human Factors in Computing Systems (CHI '12). ACM, New York, NY, USA, 2935-2938. DOI=10.1145/2208636.2208700

  • Chang Hu, Philip Resnik, Yakov Kronrod, Vladimir Eidelman, Olivia Buzek, Benjamin B. Bederson. The Value of Monolingual Crowdsourcing in a Real-World Translation Scenario: Simulation using Haitian Creole Emergency SMS Messages, EMNLP 2011 Sixth Workshop on Statistical Machine Translation, July 2011 [pdf]

  • Chang Hu, Benjamin B. Bederson, Philip Resnik, Yakov Kronrod. MonoTrans2: A New Human Computation System to Support Monolingual Translation, CHI 2011, May 2011 [DOI]

  • Yakov Kronrod, Chang Hu, Olivia Buzek, Alexander J. Quinn. Using Monolingual Crowd to Improve Translation, Student poster at AAAS 2011, Winning poster in Math, Technology and Engineering category, February 2011 [poster]

  • Philip Resnik, Olivia Buzek, Chang Hu, Yakov Kronrod, Alexander J. Quinn, Benjamin B. Bederson. Improving Translation via Targeted Paraphrasing, 2010 Conference on Empirical Methods in Natural Language Processing, October 2010 [pdf]

  • Chang Hu, Benjamin B. Bederson, Philip Resnik. Translation by iterative collaboration between monolingual users, ACM SIGKDD Workshop on Human Computation HCOMP '10, July 2010 [pdf] [DOI].

  • Chang Hu, Benjamin B. Bederson, Philip Resnik. Translation by Iterative Collaboration between Monolingual Users, Graphics Interface '10, June 2010 [pdf][DOI].

  • Olivia Buzek, Philip Resnik, and Ben Bederson. Error Driven Paraphrase Annotation using Mechanical Turk, NAACL 2010 Workshop [pdf].

  • Chang Hu. Collaborative translation by monolingual users, Extended Abstracts of CHI '09, April 2009 [pdf] [DOI].

Talks

  • Ben Bederson's Google Tech Talk, September 2009.
  • Ben Bederson's talk at MSRA and Tsinghua University.

Awards

  • Yakov Kronrod, Chang Hu, Olivia Buzek, Alexander J. Quinn. Using Monolingual Crowd to Improve Translation, Student poster at AAAS 2011, Winning poster in Math, Technology and Engineering category, February 2011.  [poster]

Public Citations

  • NPR Morning Edition, June 22, 2010, Using The Wisdom Of Crowds To Translate Language [link]
  • WAMU Kojo Show, Aug 24, 2010, Automating Language Translation: The Future of a Unified Internet? [link]
  • New Scientist, June 28, 2011, The man-machine: Harnessing humans in a hive mind [link]

Workshop

We organized a Crowdsourcing and Translation workshop in June 2010.


Other Related Projects from HCIL

CrowdFlow by Alex Quinn:

  • Quinn, A., Bederson, B., Yeh, T., Lin, J. CrowdFlow: Integrating Machine Learning with Mechanical Turk for Speed-Cost-Quality Flexibility, HCIL Tech Report, May 2010 [link]

Sponsors and Partners

This project is supported by Google and NSF grant #0941455.



This material is based upon work supported by the National Science Foundation under Grant No. 0941455. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.