Achieving Mandarin parsing accuracy comparable to that of English has proven to be a challenge for the language processing community. Many of the state-of-the-art parsers that achieve F-scores above 89% on English newswire struggle to exceed 81% or 82% on Mandarin newswire. In this talk, we will describe our efforts to improve Mandarin parsing accuracy to values approaching English levels and discuss the factors contributing to our success. Armed with Mandarin parsers of this quality, we are also beginning to develop language models that are informed by the syntactic information learned during training.
Mary Harper is a Senior Research Scientist and an Area Director at the Center for the Advanced Study of Language and an Affiliate Research Professor in the Computer Science Department at the University of Maryland. Harper's research has focused on developing methods for incorporating multiple types of knowledge into computer algorithms for modeling human communication. Her recent research has focused on the integration of speech and natural language processing systems (in English and Mandarin), multimodal (speech, gesture, and gaze) integration, spoken term detection, and the use of hierarchical structure learned in an unsupervised fashion to improve the classification accuracy of documents and images.
This talk is part of the CLIP Colloquium Series, organized by Jimmy Lin (jimmylin -at- umd .dot. edu). For the complete schedule, please visit http://www.umiacs.umd.edu/research/CLIP/colloq/.