TY  - CONF
T1  - Iterative translation disambiguation for cross-language information retrieval
T2  - Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Y1  - 2005
A1  - Monz, Christof
A1  - Dorr, Bonnie J
KW  - cross-language retrieval
KW  - query formulation
KW  - term co-occurrence measures
KW  - term weighting
KW  - translation disambiguation
AB  - Finding a proper distribution of translation probabilities is one of the most important factors impacting the effectiveness of a cross-language information retrieval system. In this paper we present a new approach that computes translation probabilities for a given query by using only a bilingual dictionary and a monolingual corpus in the target language. The algorithm combines term association measures with an iterative machine learning approach based on expectation maximization. Our approach considers only pairs of translation candidates and is therefore less sensitive to data-sparseness issues than approaches using higher n-grams. The learned translation probabilities are used as query term weights and integrated into a vector-space retrieval system. Results for English-German cross-lingual retrieval show substantial improvements over a baseline using dictionary lookup without term weighting.
JA  - Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
T3  - SIGIR '05
PB  - ACM
CY  - New York, NY, USA
SN  - 1-59593-034-5
UR  - http://doi.acm.org/10.1145/1076034.1076123
M3  - 10.1145/1076034.1076123
ER  -