Spoken document retrieval (SDR) has been studied extensively in recent years because of its potential for navigating large multimedia collections. Considering the characteristics and monosyllabic structure of the Chinese language, a syllable-based framework for retrieving Mandarin spoken documents with speech queries has been investigated at Academia Sinica, Taiwan. This talk presents a new syllable-based approach that matches the whole syllable lattice directly, instead of using the syllable and syllable-pair information extracted from the lattice. Experimental results show that this approach can significantly improve retrieval performance. The talk will also briefly introduce the research of the Chinese information processing group at Academia Sinica.
For the colloquium series schedule, see the UMD Computational Linguistics Colloquium Series web page at http://umiacs.umd.edu/~resnik/cl_colloquium/. If you are interested in meeting with the speaker, please contact Philip Resnik (resnik@umiacs.umd.edu).