Leveraging reusability: cost-effective lexical acquisition for large-scale ontology translation

Publication Type: Conference Papers
Year of Publication: 2006
Authors: Murray CG, Dorr BJ, Jimmy Lin, Hajič J, Pecina P
Conference Name: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Date Published: 2006
Publisher: Association for Computational Linguistics
Conference Location: Stroudsburg, PA, USA

Thesauri and ontologies provide important value in facilitating access to digital archives by representing underlying principles of organization. Translation of such resources into multiple languages is an important component for providing multilingual access. However, the specificity of vocabulary terms in most ontologies precludes fully-automated machine translation using general-domain lexical resources. In this paper, we present an efficient process for leveraging human translations when constructing domain-specific lexical resources. We evaluate the effectiveness of this process by producing a probabilistic phrase dictionary and translating a thesaurus of 56,000 concepts used to catalogue a large archive of oral histories. Our experiments demonstrate a cost-effective technique for accurate machine translation of large ontologies.