The CLIP Colloquium Series presents...


Building large-scale shallow semantic resources for higher-quality NLP

Eduard Hovy (Information Sciences Institute, University of Southern California)
February 13, 2008, 11:00am, Location AVW 2460

Research in natural language processing (NLP) over the past fifteen years has produced impressive practical results using statistical methods. Increasingly, however, there are signs that continued quality improvement in language processing applications (including QA, summarization, information extraction, and machine translation) requires deeper and richer representations, possibly even (shallow) semantics of text meaning. Although theories of semantics (formal and informal) abound, no one has yet built a resource of semantic symbols that is large enough to effectively support NLP, that is empirically based, and that is (as far as possible) automatically derived. Can this be done? This talk outlines a general research agenda, with examples of research by myself, colleagues, and students, on a set of core problems that must be solved: cross-ontology/termset alignment; seeded automatic harvesting of terms and relations from text; harvesting axiomatic knowledge (inferences) from text; sharing and propagating event structure information across resources; applying human annotation to verify concept granularity; and, if there's time, beginning to discover the nature of complex notions through annotation and verification by domain experts. Many of the results of this research are included in the Omega ontology and ancillary resources that are being built and used at ISI and elsewhere.

About the Speaker

Eduard Hovy leads the Natural Language Research Group at the Information Sciences Institute of the University of Southern California. He is also Deputy Director of the Intelligent Systems Division, as well as a research associate professor in the Computer Science Department of USC and Advisory Professor of the Beijing University of Posts and Telecommunications. He completed a Ph.D. in Computer Science (Artificial Intelligence) at Yale University in 1987. His research focuses on information extraction, automated text summarization, the semi-automated construction of large lexicons and ontologies, machine translation, question answering, and digital government. He is the author or co-editor of five books and over 180 technical articles. Dr. Hovy regularly serves in an advisory capacity to funders of NLP research in the US and EU. In 2001 Dr. Hovy served as President of the Association for Computational Linguistics (ACL) and in 2001-3 as President of the International Association for Machine Translation (IAMT); he currently serves as President of the Digital Government Society of North America (DGSNA). Dr. Hovy regularly co-teaches a course in the Master's Degree Program in Computer Science at the University of Southern California, as well as occasional short courses on MT and other topics at universities and conferences. He has served on the Ph.D. and M.S. committees for students from USC, Carnegie Mellon University, National Taiwan University, and the Universities of Toronto, Karlsruhe, Pennsylvania, Stockholm, Waterloo, Nijmegen, Pretoria, and Ho Chi Minh City.


This talk is part of the CLIP Colloquium Series, organized by Jimmy Lin (jimmylin -at- umd .dot. edu). For the complete schedule, please visit http://www.umiacs.umd.edu/research/CLIP/colloq/.