Conditional random fields (Lafferty, McCallum, and Pereira, 2001) are quite effective at sequence labeling tasks like shallow parsing (Sha and Pereira, 2003) and named-entity extraction (McCallum and Li, 2003). CRFs are *log-linear*, allowing the incorporation of arbitrary features into the model. Clever new features are one way to improve performance; clever objective functions are another (see, for instance, recent work on max-margin parsing by Taskar, Klein, et al., 2004).
We have developed a method that does both in the setting of learning from unlabeled data. That is, we use log-linear models, which can exploit new features, together with a new class of objective functions: contrastive estimation (CE). CE can be understood intuitively as exploiting implicit negative evidence, and it is computationally efficient (unlike EM for log-linear models). In fact, CE generalizes EM and a variety of other objective functions. By engineering classes of implicit negative evidence, CE can be adapted to specific applications.
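To make the idea of implicit negative evidence concrete, here is a minimal Python sketch of a CE-style objective for a single sentence. It is illustrative only and is not the implementation behind the results discussed in this talk. The tag set, the feature templates, the brute-force enumeration of taggings, and the particular neighborhood (the observed sentence plus every adjacent-word transposition) are all assumptions made for the example. The objective is the log of the total score of the observed sentence, summed over all taggings, divided by the total score of every sentence in its neighborhood.

```python
import math
from itertools import product

# Hypothetical tag set, chosen only for this example.
TAGS = ("D", "N", "V")


def features(words, tags):
    """Hypothetical feature templates: tag/word pairs and tag bigrams."""
    feats = [("emit", t, w) for w, t in zip(words, tags)]
    feats += [("trans", a, b) for a, b in zip(tags, tags[1:])]
    return feats


def score(theta, words, tags):
    """Log-linear score theta . f(words, tags); unseen features weigh 0."""
    return sum(theta.get(f, 0.0) for f in features(words, tags))


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_total_score(theta, words):
    """Log of the unnormalized score of a sentence, summed over all taggings.

    Brute force over len(TAGS) ** len(words) taggings: fine for a toy
    example, not for real sentences.
    """
    return log_sum_exp(
        [score(theta, words, tags) for tags in product(TAGS, repeat=len(words))]
    )


def neighborhood(words):
    """Assumed neighborhood: the sentence itself plus each adjacent swap."""
    result = [tuple(words)]
    for i in range(len(words) - 1):
        swapped = list(words)
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        result.append(tuple(swapped))
    return result


def ce_objective(theta, words):
    """Log-probability of the observed sentence relative to its neighborhood."""
    numerator = log_total_score(theta, words)
    denominator = log_sum_exp(
        [log_total_score(theta, n) for n in neighborhood(words)]
    )
    return numerator - denominator


sentence = ("the", "dog", "barks")

# With all weights at zero, every neighbor is equally likely.
uniform = ce_objective({}, sentence)

# Hypothetical weights that reward a determiner directly before a noun.
theta = {
    ("emit", "D", "the"): 2.0,
    ("emit", "N", "dog"): 2.0,
    ("trans", "D", "N"): 2.0,
}
informed = ce_objective(theta, sentence)
```

In this sketch the perturbed word orders act as the implicit negative evidence: weights that prefer the observed order over its transpositions raise the objective, so `informed` is larger than `uniform`. If the neighborhood is widened to every possible sentence, the denominator becomes the model's full normalizer and the objective becomes the ordinary marginal likelihood that EM optimizes.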
We describe applications to two natural language learning problems: POS tagging of unlabeled text with a dictionary (Merialdo, 1994) and dependency grammar induction (Klein and Manning, 2004). We show that contrastive estimation outperforms EM (with the same feature sets), is more robust to loss of domain knowledge (dictionary degradation or uninformative initialization), and can recover from that loss by modeling additional, non-orthogonal features.
This is joint work with Jason Eisner and was presented at ACL 2005 and the IJCAI 2005 Workshop on Grammatical Inference Applications.
Noah A. Smith is a PhD student at Johns Hopkins University. He received his undergraduate degrees from the University of Maryland. His graduate research has focused on unsupervised approaches to finding grammatical (or other useful) structure in text, both by improving search (e.g., using deterministic annealing) and by introducing novel, EM-rivaling objective functions (contrastive estimation). Noah has contributed to the Giza statistical MT toolkit, the STRAND system for finding parallel text on the Web, and the Dyna programming language. He works with Jason Eisner and is supported by a fellowship from the Fannie and John Hertz Foundation.
This talk is part of the CLIP Colloquium Series, organized by Jimmy Lin (jimmylin -at- umd .dot. edu). For the complete schedule, please visit http://www.umiacs.umd.edu/research/CLIP/colloq/.