Machine-readable dictionaries (MRDs) have been studied for some time in computational linguistics (CL), but generally only as a source of relatively sparse (syntactic) lexical information. However, with many "computer scientists complaining about the amount of lexical tuning needed to make a lexicon suitable for a particular application," further exploitation of MRDs seems necessary to avoid the lexical acquisition bottleneck. Following some early work on extracting ISA hierarchies, the 1990s saw expanded efforts to extract additional semantic relations (EuroWordNet) and to build more extensive semantic networks (MindNet). Current research is examining verb sense hierarchies (VerbNet) and the construction of semantic frames for lexical entries (FrameNet). We describe our current work with a dictionary publisher: parsing definitions within an overall framework of directed-graph models of the semantic structure of dictionaries, and performing lexicographic tasks aimed at maintaining and constructing dictionary and thesaurus entries. These lexicographic tasks have yielded some immediate benefits, particularly when performed with NLP objectives in mind, specifically word-sense disambiguation in Senseval and focused semantic network expansion in the TREC question-answering track.
For the colloquium series schedule, see the UMD Computational Linguistics Colloquium Series web page at http://umiacs.umd.edu/~resnik/cl_colloquium/. If you are interested in meeting with the speaker, please contact Philip Resnik (resnik@umiacs.umd.edu).