Divergence Unraveling for Statistical Translation (DUSTer)
 

Objective:
The main goal of DUSTer is to enable more accurate projection of foreign-language dependency trees without requiring any training on dependency-tree data in the foreign language. We resolve some of the most prevalent linguistic divergence cases by using dependency-tree information to transform the sentence structure of one language so that it more closely resembles that of the other. We present techniques for modifying English parse trees so that the resulting trees better reflect the structure of the corresponding sentences in the other language (in particular, we focus on English-Spanish, English-Arabic, and English-Chinese).

A divergence occurs when the underlying concepts, or gist, of a sentence are distributed over different words in different languages. For example, the notion of running into the room is expressed as "run into the room" in English and as "move-in the room running" (entrar el cuarto corriendo) in Spanish. Although such differences are transparent to human readers, they are a serious obstacle for statistical word aligners. Far from being rare, divergences occurred in approximately 1 out of every 3 sentences in our preliminary investigations. Finding an effective way to detect and repair these divergences would therefore be a major advance for bilingual alignment and, ultimately, for statistical MT.
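
The run/entrar example can be made concrete with a small sketch. The code below is only an illustration of the general idea of rewriting an English dependency tree toward the foreign structure; it is not DUSTer's actual rule format or implementation. The names Node, MANNER_RULES, and unravel_manner_divergence are hypothetical, and the single rule shown is an assumption chosen to match the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A word in a dependency tree together with its dependents."""
    word: str
    children: List["Node"] = field(default_factory=list)


def to_tuple(node):
    """Convert a tree into nested tuples so trees are easy to compare."""
    return (node.word, [to_tuple(c) for c in node.children])


# Hypothetical rule table (an assumption, not DUSTer's rule set):
# (manner verb, directional preposition) -> (motion verb, manner modifier)
MANNER_RULES = {("run", "into"): ("move-in", "running")}


def unravel_manner_divergence(root):
    """Rewrite 'run into X' as 'move-in X running' when a rule matches.

    The preposition's dependents are promoted to the new motion verb and
    the manner is attached as a modifier. Unmatched trees are returned
    unchanged.
    """
    for child in root.children:
        rule = MANNER_RULES.get((root.word, child.word))
        if rule is not None:
            motion, manner = rule
            others = [c for c in root.children if c is not child]
            return Node(motion, others + child.children + [Node(manner)])
    return root


# English: run -> into -> room -> the
english = Node("run", [Node("into", [Node("room", [Node("the")])])])
rewritten = unravel_manner_divergence(english)
print(to_tuple(rewritten))
```

After the rewrite, the English-side tree has the shape "move-in the room running", so each word has a natural one-to-one counterpart in "entrar el cuarto corriendo".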

The following three ideas motivate the development of automatic "divergence correction" techniques:

  • Every language pair has translation divergences that are easy to recognize.
  • Knowing what they are and how to accommodate them provides the basis for refined word-level alignment.
  • Refined word-level alignment results in improved projection of structural information from English to the foreign language.
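
The link between alignment quality and projection quality can be sketched as follows. This is a minimal illustration that assumes dependencies are projected directly through one-to-one alignment links; the function project_heads, the index conventions, and both example alignments are hypothetical and are not taken from DUSTer itself.

```python
def project_heads(english_heads, alignment):
    """Project English dependency heads onto foreign words.

    english_heads: dict mapping each English word index to its head index
                   (None marks the root).
    alignment:     dict mapping English word indices to foreign word
                   indices (assumed one-to-one; unaligned words are absent).

    Returns a dict mapping foreign word indices to foreign head indices.
    A dependency is projected only when both of its ends are aligned.
    """
    projected = {}
    for child, head in english_heads.items():
        if child not in alignment:
            continue
        if head is None:
            projected[alignment[child]] = None
        elif head in alignment:
            projected[alignment[child]] = alignment[head]
    return projected


# Foreign sentence: entrar(0) el(1) cuarto(2) corriendo(3)

# Uncorrected English: run(0) into(1) the(2) room(3)
raw_heads = {0: None, 1: 0, 2: 3, 3: 1}
# Hypothetical alignment: run-corriendo, the-el, room-cuarto; "into" unaligned
raw_alignment = {0: 3, 2: 1, 3: 2}
raw_projection = project_heads(raw_heads, raw_alignment)

# Divergence-corrected English: move-in(0) the(1) room(2) running(3)
fixed_heads = {0: None, 1: 2, 2: 0, 3: 0}
fixed_alignment = {0: 0, 1: 1, 2: 2, 3: 3}
fixed_projection = project_heads(fixed_heads, fixed_alignment)

print(raw_projection)
print(fixed_projection)
```

In this toy case the uncorrected sentence yields a partial foreign tree with "corriendo" as its root and no head for "cuarto", while the corrected sentence projects a complete tree rooted at "entrar".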

Current version of software: Click here.

Researchers:
Bonnie Dorr, Nizar Habash, Necip Fazil Ayan, Nitin Madnani. (DUSTer alumni: Rebecca Hwa, Eric Nichols, Lisa Pearl, Andrew Fister, Ayelet Goldin.)

Sponsors:
Army Research Laboratory, Office of Naval Research, National Science Foundation.




© Copyright 2000-2004, CLIP Laboratory, University of Maryland, All rights reserved.