UMIACS Computational Linguistics Colloquium,
November 23, 1999
Accurate morpho-syntactic tagging using large tagsets
Dan Tufis
Romanian Academy
November 23, 1999,
12:15pm, AVW Room 2120
This talk will address two main issues related to morpho-syntactic
tagging:
- coping with large tagsets (over 600 tags) without performance
degradation, and
- significantly improving tagging accuracy (to close to 99%) by
combining multiple language-model classifiers.
Although the case study was based on HMMs and was tested intensively
on the Romanian language, we claim that our approach, called TT-CLAM
(Tiered Tagging and Combined Language Models Classifiers), does not
depend on a specific tagging method such as HMM, rule-based,
memory-based, or ME-based, nor does it depend on a specific language.
Informal demos can be arranged.
For the colloquium series schedule, see the UMD
Computational Linguistics Colloquium Series web page at
http://umiacs.umd.edu/~resnik/cl_colloquium/. If you are interested
in meeting with the speaker, please contact Philip Resnik (resnik@umiacs.umd.edu).