Philip Resnik
News flash (MEDIA, February 11, 2012): I was really pleased to be included among those quoted in discussions by the Wall Street Journal's "Numbers Guy", Carl Bialik, about mining Twitter for public opinion, including both the print column and the accompanying blog post.
News flash (MEDIA, January 31, 2012): I had great fun guesting on the Kojo Nnamdi show on WAMU 88.5 in Washington DC, talking about New Frontiers in Political Polling: Social Media and "Sentiment Analysis". We discussed computational analysis of social media in the context of political campaigns, which was also the topic of a recent posting I did on Language Log called #CompuPolitics; we also briefly discussed the React Labs project, in which collaborators and I are developing a smartphone app for large scale, real-time collection of people's responses during live events like political debates.
News flash (January 26, 2012): Two upcoming talks in cool places, in March. One is a plenary lecture at the 2012 American Association for Applied Linguistics conference, entitled The Linguistics of Spin: A Computational Linguist's Forays into Social Science. The other is a slot at South By Southwest Interactive (SXSWi), on EHRs, NLP and the Future of Clinical Narrative.
News flash (November 7, 2011): I'm in the Bay Area to give a talk today at Google on crowdsourcing and translation, to kick off a new Google-funded collaboration involving me, Ben Bederson, and Chris Callison-Burch that we're calling "Translate the World". Tomorrow I will be giving the keynote talk at the Sentiment Analysis Symposium, a technology/business event focused on, yes, sentiment analysis.
News flash (MEDIA, November 3, 2011): Interviewed by New Scientist for their story on Siri.
News flash (MEDIA, June 28, 2011): Nice mention of my work with Ben Bederson and students on monolingual translation crowdsourcing in Jim Giles, New Scientist, Issue 2818, The man-machine: Harnessing humans in a hive mind.
News flash (June 28, 2011): I've just finished two invited talks on "Computer Assisted Coding and Beyond: An Academic's Adventures with Clinical Natural Language Processing in the Real World", one at the ACL/HLT 2011 BioNLP Workshop, and the other at the National Library of Medicine.
Machine translation. My recent work has largely focused on machine translation, exploiting parallel corpora and linguistically informed modeling in statistical machine translation and in multilingual natural language processing more generally (with a focus on Chinese and Arabic, as well as other less-studied languages). As part of this effort, my postdoc David Chiang (now at USC/ISI) developed Hiero, the first syntax-based system to demonstrate performance comparable to then state-of-the-art statistical phrase-based MT systems (see 2005 NIST MT Evaluation results). I have worked with a number of students to further improve hierarchical phrase-based translation. Some of our innovations include the introduction of lattice decoding (useful in translation of speech recognition output and also for text translation of morphologically complex languages), development of efficient algorithms for using suffix array representations in hierarchical decoding, use of English-to-English translation to create artificial reference translations for use in parameter tuning, the introduction of soft syntactic constraints based on source language structure, and exploitation of lattices (and soon, forests) to represent source language paraphrase and syntactically driven reordering alternatives.
Crowdsourcing and translation. Connected with my machine translation research, Ben Bederson and I are working on an ambitious attempt to achieve low cost, high quality translation by taking advantage of monolingual human participants in a computer-assisted translation protocol, in a project we call "Translation as a Collaborative Process". We're blending ideas from machine translation, human-computer interfaces, and distributed human computation ("crowdsourcing"), and tackling the real-world problem of translating books in the International Children's Digital Library.
We received a 2009 Google Research Award sponsoring this work, as well as funding from NSF. In September 2009, Ben gave a Google tech talk about the project, which is available on YouTube. Ben and I now have a follow-up Google Research Award in which we're collaborating with Chris Callison-Burch to bring his crowdsourcing work and ours together in a framework we're calling "Translate the World".
Computational social science. I have also been doing work on sentiment analysis and related topics such as persuasion, framing, and "spin", with a particular interest in the connections among lexical semantics, surface linguistic expression, and underlying internal state. For example, why does my son say "My toy broke" instead of "I broke my toy"? He's using syntax to package up the statement about what happened in a way that de-emphasizes semantic properties such as causation, volition, and change-of-state. (This is an example of using syntax for "spin", just the same way that Ronald Reagan did in 1987 when he sidestepped attributing responsibility for the Iran-contra scandal; remember "Mistakes were made"? Precocious child.) My student Stephan Greene did a fascinating dissertation on this topic, and for a conference-paper-length description see our 2009 NAACL paper. Current topics of investigation include modeling syntax/semantics/sentiment connections in a Bayesian framework, bootstrapping multilingual sentiment analysis capabilities, and working with political scientists to model agenda setting and framing in political discourse. I've also been working with political scientist collaborators on the React Labs project, a smartphone app for large scale, real-time collection of people's responses during live events like political debates. Outside academia, I do real-world sentiment analysis as Lead Scientist with Converseon Inc., a leading social media firm.
Clinical informatics. Since about 1999 I've been involved in natural language processing for clinical documentation. I helped start up CodeRyte, Inc., which is now the nation's fastest growing provider of NLP solutions in healthcare (see, e.g., Deloitte's Technology Fast 500 and the Inc. 5000 listings). I developed pieces of the core technology, helped build an excellent language technology team, and I currently advise the company on technology development and strategic direction. I've also presented a tutorial on NLP and computer assisted coding at the convention of the American Health Information Management Association (AHIMA), and I have served as one of the co-chairs of AHIMA's steering committee on computer assisted coding. Much to my surprise, I was listed at #82 on the Future Health 100, a list of "the most creative and influential innovators working in healthcare today" at healthspottr.com.
Computational psycholinguistics. During the next several years, I hope to re-engage more fully with my interests in computational psycholinguistics. I'm particularly interested in the possibility that ideas from (statistical) information theory may have a useful role to play in explaining why language works the way it does. (This is an idea I first began exploring in my dissertation [ps, pdf], back in 1993, and in recent years a variety of people like John Hale, Roger Levy, and Florian Jaeger, among others, have done very interesting work in the same spirit.) I'm also interested in using Bayesian modeling as a way to bring linguists here with cognitive modeling interests together with computational linguists focusing on applications. Momentum for that around here has already started building with the recent arrival of Naomi Feldman in our Linguistics Department.
Empirical linguistics. I'm quite interested in promoting the use of naturally occurring data as evidence in linguistics research. I led the development of the Linguist's Search Engine (LSE), a tool designed to make it easier for linguists to search naturally occurring data using syntactic and lexical criteria. This tool was intended to make it easier for more linguists to go beyond the exclusive use of introspective judgments as empirical evidence, which can lead to useful and interesting results. In follow-on work with the Center for the Advanced Study of Language (CASL), we ported the LSE to Chinese, and the LSE code is available under an open source license. (Aaron Elkiss was the LSE's chief architect, implementor, and guru. I kept it running for a number of years after he graduated, but eventually retired it. Anyone interested in resurrecting it: the source code is available.)
See my on-line list of publications for links to papers on the above research topics and more.
Professional History
Philip Resnik, Professor
Department of Linguistics and Institute for Advanced Computer Studies
1401 Marie Mount Hall
University of Maryland
College Park, MD 20742 USA
UMIACS phone: (301) 405-6760
Linguistics phone: (301) 405-8903
Fax: (301) 405-7104
http://umiacs.umd.edu/~resnik
E-mail: resnik [AT] umd _DOT_ edu
UMIACS office: AV Williams 3143
By far the best way to reach me is by e-mail to resnik [AT] umd _DOT_ edu.