Philip Resnik
News flash (Feb 2008): What happened to my hand?
News flash (August 2007): My student Stephan Greene successfully defended his doctoral dissertation, Spin: Lexical Semantics, Transitivity, and the Identification of Implicit Sentiment (short abstract).
News flash (June 2007): The Czech-English machine translation system submitted by my student Chris Dyer was the best performer, for all evaluation measures, in the shared task for that language pair at the ACL-2007 Workshop on Statistical Machine Translation (WMT-2007). [details]
News flash (June 2007): Becky and I just celebrated our 5th wedding anniversary, and last November my parents celebrated their 50th!
I do research in computational linguistics, with interests both in the modeling of human linguistic processes (especially lexical semantics, lexical acquisition, and on-line sentence processing) and in the application of natural language processing techniques to practical problems such as cross-language searching and machine translation. My general research agenda for language technology is to improve the state of the art by finding the right balance between knowledge-free statistical modeling and linguistically informed techniques -- and in so doing, to obtain a better scientific understanding of human language itself.
My recent work has focused largely on machine translation and multilingual natural language processing, exploiting parallel corpora and linguistically informed modeling (with a focus on Chinese and Arabic, as well as other less-studied languages). As part of this effort, my postdoc David Chiang (now at USC/ISI) developed Hiero, the first syntax-based system to demonstrate performance comparable to state-of-the-art statistical MT systems (see 2005 NIST MT Evaluation results). I have been working with a number of students to further improve hierarchical phrase-based translation. Recent innovations include the introduction of confusion network decoding (useful in translating speech recognition output and also in text translation of morphologically complex languages), the development of efficient algorithms for using suffix array representations in hierarchical decoding, and the use of English-to-English translation to create artificial reference translations for use in parameter tuning.
Continuing interests in multilingual NLP include using parallel bilingual text to improve word sense disambiguation (that is, the identification of appropriate word meanings in context), exploiting linguistic information to improve word-to-word alignments between translations, and developing parsers for resource-poor languages by projecting linguistic information from English, using a parallel corpus as a bridge from one language to the other.
In a different line of research, I led the development of the Linguist's Search Engine, a tool designed to make it easier for linguists to search naturally occurring data using syntactic and lexical criteria. I'm hoping this tool will make it easier for more linguists to go beyond the exclusive use of introspective judgments as empirical evidence, which can lead to useful and interesting results. In follow-on work with the Center for the Advanced Study of Language (CASL), we ported the LSE to Chinese, and the LSE code is available under an open source license. (Aaron Elkiss, the LSE's chief architect, implementor, and guru, moved on to the CS Ph.D. program at University of Michigan in the fall, so folks interested in participating in the continued care and feeding of the LSE should get in touch with me.)
During the next several years, I hope to re-engage more fully with my interests in computational psycholinguistics. I'm particularly interested in the possibility that ideas from (statistical) information theory may have a useful role to play in explaining why language works the way it does. (This is an idea I first began exploring in my dissertation [ps, pdf].)
See my on-line list of publications for links to papers on the above research topics and more.
Philip Resnik, Associate Professor
Department of Linguistics and Institute for Advanced Computer Studies
1401 Marie Mount Hall
University of Maryland
College Park, MD 20742 USA

UMIACS phone: (301) 405-6760
Linguistics phone: (301) 405-8903
Fax: (301) 405-7104
http://umiacs.umd.edu/~resnik
E-mail: resnik [AT] umd _DOT_ edu
UMIACS office: AV Williams 3143

By far the best way to reach me is by e-mail to resnik [AT] umd _DOT_ edu.