The CLIP Colloquium Series presents...


Language and Translation Model Adaptation using Comparable Corpora

Matthew Snover, CLIP Lab
October 8, 2008, 11 a.m. AVW 2120


Joint work with Bonnie Dorr and Richard Schwartz
Practice Talk for EMNLP 2008

Traditionally, statistical machine translation systems have relied on parallel bilingual data to train a translation model. While parallel bilingual data are expensive to generate, monolingual data are relatively common. Yet monolingual data have been under-utilized, having been used primarily for training a language model in the target language. This paper describes a novel method for utilizing monolingual target data to improve the performance of a statistical machine translation system on news stories. The method exploits the existence of comparable text: multiple texts in the target language that discuss the same or similar stories as the source-language document. For every source document that is to be translated, a large monolingual data set in the target language is searched for documents that might be comparable to the source document. These documents are then used to adapt the MT system to increase the probability of generating texts that resemble the comparable documents. Experimental results obtained by adapting both the language and translation models show substantial gains over the baseline system.
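
As a rough illustration of the language model adaptation idea, the sketch below biases a baseline unigram model toward a set of comparable documents by linear interpolation. The abstract does not say how the adaptation is carried out, so the interpolation scheme, the unigram simplification, the function names, and the toy sentences are all assumptions made for this example. Retrieval of the comparable documents is taken as already done.

```python
from collections import Counter


def unigram_model(docs):
    """Estimate unigram probabilities from a list of whitespace-tokenized documents."""
    counts = Counter(word for doc in docs for word in doc.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(baseline, bias, weight):
    """Linearly interpolate two unigram models.

    `weight` is the share given to the bias model; this scheme is an
    assumption for illustration, not the method described in the talk.
    """
    vocab = set(baseline) | set(bias)
    return {
        word: (1 - weight) * baseline.get(word, 0.0) + weight * bias.get(word, 0.0)
        for word in vocab
    }


# Toy data (hypothetical): a general baseline corpus and documents assumed
# to be comparable to the source document being translated.
baseline = unigram_model(["the market fell", "the team won the match"])
bias = unigram_model(["the earthquake struck the coast"])
adapted = interpolate(baseline, bias, weight=0.5)
```

In this toy setting, words that occur only in the comparable documents, such as "earthquake", receive nonzero probability in the adapted model, which is the intended effect of biasing the system toward text that resembles those documents.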

TER-Plus: Paraphrase, Semantic and Alignment Enhancements to Translation Edit Rate


Joint work with Nitin Madnani, Bonnie Dorr and Richard Schwartz
Practice Talk for NIST Metric MATR 2008 Workshop at AMTA/EMNLP 2008

We describe a new evaluation metric, TER-Plus, or TERp, for automatic evaluation of machine translation. TERp is an extension of Translation Edit Rate (TER) that builds on the success of TER as an evaluation metric and alignment tool while addressing several of its weaknesses through the use of paraphrases, morphological stemming, and synonyms, as well as edit costs that are optimized to correlate better with various types of human judgments. Correlation studies with both adequacy and HTER, comparing TERp to BLEU, METEOR and TER, illustrate the ability of TERp to evaluate translation performance, as well as the differences among types of human judgments.
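
To make the notion of an edit rate concrete, the sketch below computes a simplified word-level edit rate: the cost of insertions, deletions, and substitutions divided by the reference length. It is not TER or TERp themselves. It omits the block shifts that TER allows and the paraphrase and stemming edits of TERp, and it uses a hypothetical `synonyms` set with an arbitrary `synonym_cost` of 0.5, where TERp uses optimized edit costs.

```python
def edit_rate(hypothesis, reference, synonyms=frozenset(), synonym_cost=0.5):
    """Simplified word-level edit rate (no shifts, no paraphrases, no stemming).

    `synonyms` is a set of frozenset word pairs; substituting one member of
    a pair for the other costs `synonym_cost` instead of 1. Both the pairs
    and the cost are assumptions for illustration.
    """
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the cost of editing the hypothesis words seen so far
    # into the first j reference words.
    prev = [float(j) for j in range(len(ref) + 1)]
    for i, h in enumerate(hyp, start=1):
        curr = [float(i)]
        for j, r in enumerate(ref, start=1):
            if h == r:
                sub = 0.0
            elif frozenset((h, r)) in synonyms:
                sub = synonym_cost
            else:
                sub = 1.0
            curr.append(min(prev[j] + 1.0,       # delete hypothesis word
                            curr[j - 1] + 1.0,   # insert reference word
                            prev[j - 1] + sub))  # match or substitute
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical synonym pair, used only for this example.
pairs = frozenset({frozenset(("cat", "feline"))})
plain = edit_rate("the feline sat", "the cat sat")
relaxed = edit_rate("the feline sat", "the cat sat", synonyms=pairs)
```

Here `relaxed` is lower than `plain` because the synonym substitution is charged less than a full edit, which is the general mechanism by which a reduced-cost edit type lowers the penalty for acceptable lexical variation.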

About the Speaker

Matthew Snover received an MS in Computer Science from Washington University in Saint Louis in 2002. He is currently a Ph.D. student in Computer Science at the University of Maryland, College Park, and is a member of the Laboratory for Computational Linguistics and Information Processing. His research interests include unsupervised learning, statistical machine translation, and machine translation evaluation.


This talk is part of the CLIP Colloquium Series. For the complete schedule, please visit http://www.umiacs.umd.edu/research/CLIP/colloq/.