Bootstrapping parsers via syntactic projection across parallel texts

Title: Bootstrapping parsers via syntactic projection across parallel texts
Publication Type: Journal Articles
Year of Publication: 2005
Authors: Hwa R, Resnik P, Weinberg A, Cabezas C, Kolak O
Journal: Nat. Lang. Eng.
Volume: 11
Issue: 3
Pagination: 311-325
Date Published: 2005/09
ISSN: 1351-3249
Abstract

Broad-coverage, high-quality parsers are available for only a handful of languages. A prerequisite for developing broad-coverage parsers for more languages is the annotation of text with the desired linguistic representations (also known as “treebanking”). However, syntactic annotation is a labor-intensive and time-consuming process, and it is difficult to find linguistically annotated text in sufficient quantities. In this article, we explore using parallel text to help solve the problem of creating syntactic annotation in more languages. The central idea is to annotate the English side of a parallel corpus, project the analysis to the second language, and then train a stochastic analyzer on the resulting noisy annotations. We discuss our background assumptions, describe an initial study on the “projectability” of syntactic relations, and then present two experiments in which stochastic parsers are developed with minimal human intervention via projection from English.

URL: http://dx.doi.org/10.1017/S1351324905003840
DOI: 10.1017/S1351324905003840