UMIACS Computational Linguistics Colloquium, DATE

TITAN: A Cross-linguistic Search Engine on the WWW and related research activities at NTT


Genichiro KIKUI


Information and Communication Systems Labs., NTT (Nippon Telegraph and Telephone Corp.), Japan, and CSLI (Center for the Study of Language and Information), Stanford University


DATE, 4pm, AVW Room 4406


Although various search services are available on the World-Wide Web, only a few of them help people access web pages written in a foreign language. To address this problem, we have developed a cross-linguistic WWW search engine, TITAN, which lets users pose queries in their native language while it performs multi-lingual searches.

This talk first outlines TITAN and its two essential modules: a module for identifying the character coding systems and languages of individual web pages, and a "simple MT module".

Then I present some results of our research on statistics-based term-list translation, which resolves translation ambiguity by using word co-occurrence statistics in the target language.
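
As a rough illustration of how target-language co-occurrence statistics can resolve translation ambiguity in a term list, the following minimal sketch picks, for each source term, the candidate translation that co-occurs most often with the candidates chosen for the other terms. It is not the method presented in the talk: the toy Spanish-English dictionary, the three-sentence corpus, the document-level co-occurrence count, and the function names (cooccurrence_counts, pair_score, translate_term_list) are all assumptions made for this example.

```python
from itertools import product


def cooccurrence_counts(corpus):
    """Count, for each unordered word pair, the documents containing both words."""
    counts = {}
    for doc in corpus:
        words = sorted(set(doc.lower().split()))
        for i, a in enumerate(words):
            for b in words[i + 1:]:
                counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


def pair_score(counts, a, b):
    """Look up the co-occurrence count of a pair regardless of word order."""
    key = (a, b) if a <= b else (b, a)
    return counts.get(key, 0)


def translate_term_list(terms, dictionary, counts):
    """Choose one translation per term so that the summed pairwise
    co-occurrence of the chosen target-language words is maximal.
    Exhaustive search: only suitable for short term lists."""
    candidate_lists = [dictionary[t] for t in terms]
    best, best_score = None, -1
    for combo in product(*candidate_lists):
        score = sum(
            pair_score(counts, combo[i], combo[j])
            for i in range(len(combo))
            for j in range(i + 1, len(combo))
        )
        if score > best_score:
            best, best_score = combo, score
    return list(best)


# Hypothetical toy data, for illustration only.
corpus = [
    "the bank lends money",
    "money in the bank",
    "a bench in the park",
]
dictionary = {
    "banco": ["bank", "bench"],  # ambiguous source term
    "dinero": ["money"],
    "parque": ["park"],
}
counts = cooccurrence_counts(corpus)

print(translate_term_list(["banco", "dinero"], dictionary, counts))
print(translate_term_list(["banco", "parque"], dictionary, counts))
```

In this toy setting the ambiguous term "banco" is rendered as "bank" next to "dinero" and as "bench" next to "parque", because those are the pairings that occur together in the target-language corpus. A realistic system would need a much larger corpus, a smoothed association measure, and a search strategy that avoids enumerating every combination.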

Finally, if the audience is interested, I will be pleased to introduce our international joint project on a "distributed cross-linguistic search service for Asian languages", which aims to define protocols between a cross-linguistic meta-searcher and cross-linguistic or mono-lingual search engines, and to develop a prototype service using those protocols.


For the colloquium series schedule, see the UMD Computational Linguistics Colloquium Series web page at http://umiacs.umd.edu/~resnik/cl_colloquium/. If you are interested in meeting with the speaker, please contact Mari Broman Olsen (molsen@umiacs.umd.edu) or Philip Resnik (resnik@umiacs.umd.edu).