Workshop on Crowdsourcing and Translation
- June 10-11, 2010
Organized by Ben Bederson and Philip Resnik
University of Maryland, Human-Computer Interaction Lab and Computational Linguistics & Information Processing Lab
Workshop Goals
Our goal is to bring together people whose work is helping to
define this new area, and to create an opportunity for discussions that
will help define its future directions.
Workshop Agenda
This workshop was held at the University of Maryland's Inn
and Conference Center on June 10-11, 2010. The two-day
program included:
- Analysis of Existing Research Areas
- Brainstorming Activities on Research Areas
- Workshop Outcomes & Next Steps
Speakers
- Ed Bice, (slides) CEO of Meedan.net, an organization that is developing technologies to facilitate cross-language interaction on the web. He is also a founder of the Language Commons, a consortium of organizations and companies working to promote open licensed and accessible linguistic data for all the world's languages;
- Lukas Biewald, co-founder and CEO of CrowdFlower (formerly Dolores Labs), which provides labor-on-demand via access to an always-on, scalable online workforce;
- Chris Callison-Burch, assistant research professor at Johns Hopkins, who has been pioneering new methods for acquiring language data using Amazon's Mechanical Turk;
- Alain Desilets, (slides) research officer at the National Research Council of Canada, who has done research on professional translators' work practices and is exploring how massive online collaboration will change the world of translation and multilingual content creation;
- Martin Kay, professor of Linguistics at Stanford University and recipient of the Lifetime Achievement Award from the Association for Computational Linguistics, who has been discussing for 30 years the process of translation by professional and amateur translators, by existing and proposed machine translation systems, and what each might learn from the other;
- Niki Kittur, assistant professor of human-computer interaction at Carnegie Mellon, whose research examines how groups of people can collaborate to process information on a scale that exceeds individual cognitive capabilities;
- Philipp Koehn, (slides) assistant professor at the University of Edinburgh and author of the new textbook Statistical Machine Translation from Cambridge University Press;
- Donghui Lin, (slides) researcher at the National Institute of Information and Communication Technology (NICT) in Japan, who is helping to develop the Language Grid, a multilingual infrastructure on the Internet for multicultural collaboration;
- Rob Miller, (slides) associate professor and leader of the User Interface Design Group in MIT's Computer Science and Artificial Intelligence Lab, whose research lies at the intersection of user interfaces and programming, with a recent focus on toolkits and systems that incorporate human computation and crowd computing; he will be joined by Greg Little, a PhD student in the same lab and the creator of TurKit;
- Bob Moore, principal researcher at Microsoft Research, a language technology veteran whose research has ranged widely within artificial intelligence and natural-language processing, and who is currently focusing on machine learning and statistical modeling in machine translation; he will be joined by Vikram Dendi, a Group Product Manager with Microsoft's MT team leading their crowdsourcing efforts;
- Rob Munro, (slides) a Ph.D. student at Stanford who has coordinated the efforts of thousands of workers and volunteers in translating Haitian Creole and French text messages, in order to communicate urgent needs from the Haitian people to emergency responders and aid organizations after the devastating January 2010 earthquake; and
- Ralph Weischedel, principal scientist at Raytheon BBN Technologies, who leads a group of over 20 researchers in natural language understanding, machine translation, question answering, information extraction, and semantic annotation, in multiple languages.
Non-speaker participants
| Outside University of Maryland | ||||
| ByungGyu Ahn | Johns Hopkins University | |||
| Vikram Dendi | Microsoft | |||
| Peter Highnam | IARPA | |||
| Okan Kolak | ||||
| Alon Lavie | Carnegie Mellon University | |||
| Greg Little | MIT | |||
| Victor Piotrowski | NSF | |||
| Tom Russell | NSF | |||
| David Yarowsky | Johns Hopkins University | |||
| University of Maryland and affiliates | ||||
| Bonnie Dorr | ||||
| Chang Hu | ||||
| Judith Klavans | ||||
| Yakov Kronrod | ||||
| Alex Quinn | ||||
| Hendra Setiawan | ||||
| Amy Weinberg | ||||
Schedule
| Day 1: Thu, June 10, 2010 | ||||
| 8:00am-9:00am | Registration and continental breakfast | |||
| 9:00am-9:10am | Bonnie Dorr | CMPS welcome | ||
| 9:10am-9:20am | Ben Bederson and Philip Resnik | Event intro | ||
| 9:20am-9:45am | Non-speaker participant introductions | |||
| 9:45am-10:30am | Martin Kay | |||
| 10:30am-11:15am | Chris Callison-Burch | |||
| 11:15am-11:30am | Break | |||
| 11:30am-12:15pm | Rob Miller (slides) | |||
| 12:15pm-1:00pm | Donghui Lin (slides) | |||
| 1:00pm-2:00pm | Lunch | |||
| 2:00pm-2:45pm | Ben Bederson and Philip Resnik (slides) | |||
| 2:45pm-3:30pm | Lukas Biewald | |||
| 3:30pm-3:45pm | Break | |||
| 3:45pm-4:30pm | Philipp Koehn (slides) | |||
| 4:30pm-5:15pm | Rob Munro (slides) | |||
| 5:15pm-6:00pm | Ralph Weischedel | |||
| 6:00pm | Informal dinner plans | |||
| Day 2: Fri, June 11, 2010 | ||||
| 8:00am-9:00am | Continental breakfast | |||
| 9:00am-9:30am | Start of day discussion | |||
| 9:30am-10:15am | Ed Bice (slides) | |||
| 10:15am-11:00am | Alain Desilets (slides) | |||
| 11:00am-11:15am | Break | |||
| 11:15am-12:00pm | Niki Kittur | |||
| 12:00pm-12:45pm | Bob Moore and Vikram Dendi | |||
| 12:45pm-1:45pm | Working lunch: discussion and take-away messages | |||
| 1:45pm-2:15pm | Planning for workshop write-up | |||
| 2:15pm-2:45pm | Discussion of follow-on activities (conference workshop?) | |||
| 2:45pm-3:00pm | Workshop wrap-up | |||
Supported by NSF grant #0941455 and by Google.
Jointly organized by Human-Computer Interaction Lab and
Computational Linguistics & Information Processing Lab
