CDI - Translation as a Collaborative Process

Natural language translation remains a crucial problem: solutions are expensive, slow to develop, and difficult to scale. While automated approaches often convey the gist of a text, fully automated high-quality translation remains far out of reach for the vast majority of the world's languages. A variety of projects are now emerging that tap into the Web-based community of people willing to help translate, but bilingual expertise is quite rare relative to the total pool of available volunteers. This project will investigate whether a combination of machine translation and human participants who speak only a single language (i.e., monolingual speakers) can produce high-quality translation. The research is organized around the development of an iterative protocol that combines elements of machine translation, human and semi-automated language annotation, and human correction, motivated by concepts in information theory and discourse analysis. This research framework will support both synchronous and asynchronous pairwise interaction among human participants as well as a "bag of tasks" approach that permits truly distributed human computation.

With respect to broader impacts, this project is among the first to investigate the potential of hybrid human/machine translation involving non-bilingual human participants, combining practical implementation with empirically driven experimentation. If successful, this project will lower the barrier to translation of natural languages, resulting in a widely available approach that offers high-quality translation for an unprecedentedly wide range of language pairs while reducing both the need for bilingual expertise and its cost. The technology to be developed will be evaluated on a real-world problem: translation of books within the (previously NSF-funded) International Children's Digital Library project (www.childrenslibrary.org). The ICDL currently contains 4,000 books in 60 languages and has an active user population that includes 1,000 volunteers with differing language skills who are interested in helping with translation. Participants in Mexico, Romania, Mongolia, and the U.S. will act as early adopters in K-12 educational settings, supporting the ICDL's goal of enabling greater shared cultural understanding through this existing and growing resource.

Principal Investigators