Eliciting a corpus of word-aligned phrases for MT

Lori Levin, Alon Lavie, Erik Peterson

Language Technologies Institute

Carnegie Mellon University

Special Computational Linguistics Colloquium

October 10, 2003, 11:00am, AVW Room 2120

We describe a process for constructing an initial MT system for languages that do not have sufficient resources for traditional data-driven approaches.  The process starts with an elicitation phase in which a bilingual informant translates phrases and supplies word alignments.  The elicitation is supported by an elicitation tool and an elicitation corpus.  This talk will include a demo of the elicitation tool and a discussion of the contents of the elicitation corpus.  We will also describe how the elicited data is used for the automatic learning of syntactic transfer rules.

 

About the speakers:

Alon Lavie is an Associate Research Professor at the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University.  He earned his BA in Computer Science (1987) from the Technion - Israel Institute of Technology and an MS (1993) and PhD (1996) in Computer Science from Carnegie Mellon University.  He has been a member of the faculty since 1996.  Dr. Lavie's main research areas are Machine Translation (MT) and Spoken Language Understanding.  He has worked extensively on MT of both text and speech and on parsing of spoken language, and has spearheaded several research projects in these areas over the past eight years.  As a Co-PI of the NESPOLE! speech translation project, he led the design and development of a distributed speech translation architecture over the Internet for E-commerce applications.

He is currently a Co-PI of the DARPA/TIDES MilliRADD project and the NSF/ITR-funded AVENUE project, which are exploring new machine-learning-based approaches to MT for languages with limited amounts of online resources.


For the colloquium series schedule, see the UMD Computational Linguistics Colloquium Series web page at http://umiacs.umd.edu/~resnik/cl_colloquium/. If you are interested in meeting with the speaker, please contact Doug Oard (oard@umiacs.umd.edu).