UMIACS Computational Linguistics Colloquium, November 23, 1999

Accurate morpho-syntactic tagging using large tagsets


Dan Tufis


Romanian Academy


November 23, 1999, 12:15pm, AVW Room 2120


This talk will address two main issues related to morpho-syntactic tagging:

  1. coping with large tagsets (over 600 tags) without performance degradation, and

  2. improving tagging accuracy significantly (to close to 99%) by combining classifiers based on multiple language models.

Although the case study was based on HMMs and was tested extensively on Romanian, we claim that our approach, called TT-CLAM (Tiered Tagging and Combined Language Models Classifiers), depends neither on a specific tagging method, such as HMM, rule-based, memory-based, or ME-based tagging, nor on a specific language.

Informal demos can be arranged.


For the colloquium series schedule, see the UMD Computational Linguistics Colloquium Series web page at http://umiacs.umd.edu/~resnik/cl_colloquium/. If you are interested in meeting with the speaker, please contact Philip Resnik (resnik@umiacs.umd.edu).