Nitin Madnani

Mailing address
Institute for Advanced Computer Studies
AVW 3126G
University of Maryland
College Park, MD 20742
Phone
(O) +1-301-405-6746
E-mail
(first initial+last name)@umiacs.umd.edu
PGP Public Key
pubkey.asc
Research Interests
Statistical Machine Translation
Automatic Paraphrase Generation & Recognition
Automatic Text Summarization
Machine Learning
Computer Science Education
Blog
Colorful Green Ideas

Complete Curriculum Vitae

I am a Ph.D. candidate in the Department of Computer Science at the University of Maryland, College Park. I am also a graduate research assistant in the Computational Linguistics and Information Processing Laboratory at the Institute for Advanced Computer Studies, where I work with my advisor, Bonnie Dorr.

In general, my research is focused on building computational models of human language that can be validated by their contribution to enhancing human understanding of and experience with language. Specifically, my dissertation is based on exploring the intersection of and interaction between statistical machine translation and automatic paraphrase generation to build a computational model of paraphrasing. More details can be found in my research statement.

I am also particularly interested in Computer Science education. I have had a unique set of experiences in teaching computer science that span both my undergraduate and graduate careers. More details can be found in my teaching statement.

I will be graduating in Spring 2010 and am looking for an academic position that focuses on both CS teaching and research.

Education

Experience

Publications

Paraphrasing & Machine Translation

TER-Plus: Paraphrase, Semantic, and Alignment Enhancements to Translation Edit Rate. In Press. Journal of Machine Translation. Matthew Snover, Nitin Madnani, Bonnie Dorr and Richard Schwartz.
Fluency, Adequacy, or HTER? Exploring Different Human Judgments with a Tunable MT Metric. 2009. Fourth Workshop on Statistical Machine Translation. Matthew Snover, Nitin Madnani, Bonnie Dorr and Richard Schwartz.
Applying Automatically Generated Semantic Knowledge: A Case Study in Machine Translation. 2008. Proceedings of the Symposium on Semantic Knowledge Discovery, Organization and Use. Nitin Madnani, Philip Resnik, Bonnie Dorr and Richard Schwartz.
TERp: A System Description. 2008. Proceedings of the First NIST Metrics for Machine Translation Challenge (MetricsMATR). Matt Snover, Nitin Madnani, Bonnie Dorr and Richard Schwartz.
Are Multiple Reference Translations Necessary? Investigating the Value of Paraphrased Reference Translations in Parameter Optimization. 2008. Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas (AMTA 2008). Nitin Madnani, Philip Resnik, Bonnie Dorr and Richard Schwartz.
Using Paraphrases for Parameter Tuning in Statistical Machine Translation. 2007. Proceedings of the Second ACL Workshop on Statistical Machine Translation (WMT-07). Nitin Madnani, Necip Fazil Ayan, Philip Resnik, Bonnie Dorr.
The Hiero Machine Translation System: Extensions, Evaluation, and Analysis. 2005. Proceedings of HLT/EMNLP. David Chiang, Adam Lopez, Nitin Madnani, Christof Monz, Philip Resnik and Michael Subotin.
Rapid Porting of DUSTer to Hindi. 2003. ACM Transactions on Asian Language Information Processing, Volume 2, Issue 2. Bonnie J. Dorr, Necip Fazil Ayan, Nizar Habash, Nitin Madnani, and Rebecca Hwa.

Automatic Text Summarization & Information Retrieval

Multiple Alternative Sentence Compressions and Word-Pair Antonymy for Automatic Text Summarization and Recognizing Textual Entailment. 2008. Text Analysis Conference (TAC). Saif Mohammad, Bonnie J. Dorr, Melissa Egan, Nitin Madnani, David Zajic, and Jimmy Lin.
TREC 2007 ciQA Task: University of Maryland. 2007. Proceedings of the Sixteenth Text REtrieval Conference (TREC). Nitin Madnani, Jimmy Lin, and Bonnie Dorr.
Measuring Variability in Sentence Ordering for News Summarization. 2007. Proceedings of the 11th European Workshop on Natural Language Generation (ENLG). Nitin Madnani, Rebecca Passonneau, John Conroy, Necip Fazil Ayan, Bonnie Dorr, Judith Klavans, Dianne O'Leary and Judith Schlesinger.
Multiple Alternative Sentence Compressions for Automatic Text Summarization. 2007. Proceedings of the Document Understanding Conference (DUC) at HLT/NAACL. Nitin Madnani, David Zajic, Bonnie Dorr, Necip Fazil Ayan and Jimmy Lin.

Education & Software

The Python and The Elephant: Large Scale Natural Language Processing with NLTK and Dumbo. To appear in 2010. Proceedings of the Eighth Annual Python Conference. Nitin Madnani and Jimmy Lin.
Querying and Serving N-gram Language Models with Python. 2009. The Python Papers. Volume 4, Issue 2. Nitin Madnani.
Source Code: Querying and Serving N-gram Language Models with Python. 2009. The Python Papers Source Codes. Volume 1. Nitin Madnani.
Combining Open-Source with Research to Re-engineer a Hands-on Introductory NLP Course. 2008. Proceedings of the Third ACL Workshop on Issues in Teaching Computational Linguistics (TeachCL-08). Nitin Madnani and Bonnie Dorr.
Getting Started on Natural Language Processing with Python. 2007. ACM Crossroads, Volume 13, Issue 4. Nitin Madnani.
[Note: The PDF version has been completely revised since the official ACM version to keep up to date with the changes made to the software used in the article.]

Working Papers

Generating Phrasal & Sentential Paraphrases: A Survey of Data-Driven Methods (Journal Article).
In submission to Computational Linguistics. Second round of review (Oct 2009).
Machine Translation Evaluation and Optimization (Book Chapter).
In Preparation.
A Pythonic Exploration of Vector-Space Methods for Semantic Similarity (Magazine Article).
In Preparation.

Unpublished Manuscripts

Active Learning for Mention Detection: A Comparison of Sentence Selection Strategies.
(Available as arXiv:0911.1965v1 from the arXiv Computing Research Repository (CoRR))
Emily: A Tool for Visual Poetry Analysis.

Invited Talks

Using Paraphrases for Parameter Tuning in Statistical Machine Translation. 2007. Annual Technical Presentation at the Technical Meeting for Global Autonomous Language Exploitation. Nitin Madnani, Necip Fazil Ayan, Philip Resnik and Bonnie J. Dorr.

Posters

Exploring Emily Dickinson Letters. June 2005. HCIL 22nd Annual Symposium and Open House, University of Maryland. Catherine Plaisant, Nitin Madnani, Matt Kirschenbaum, Martha Nell Smith, Tanya Clement, Greg Lord.
Portable Divergence Unraveling - The Case of Hindi. 2004. Research Review Day, University of Maryland. Nitin Madnani, Necip Fazil Ayan, Bonnie Dorr, Nizar Habash, Christof Monz.

Other Talks, Presentations & Miscellanea

Expectation Maximization. 2004.
Advanced NLP Seminar, University of Maryland.
Decoding in Statistical Machine Translation. 2006.
StatMT Reading Group, University of Maryland.
A timeline of inter-annotator agreement measures in Computational Linguistics, based on "Inter-Coder Agreement for Computational Linguistics" by Ron Artstein and Massimo Poesio.
Linguistics seminar on Corpus-based Social Science, University of Maryland.

Software

Python & Perl wrappers for SRILMs: Wrappers that will allow you to read and query an SRI language model directly in your Python and Perl code.
clusterinfo: A Python script that displays current usage of a PBS-based cluster in a more condensed and easier-to-read format.
LM Server: A Python-based XML-RPC server for an SRILM language model. Allows multiple clients to query the same language model that's loaded in memory in server mode.
UMIACS Word Alignment Interface: A Java-based tool for creating and viewing word alignments between language pairs. It has been widely used across the community to create alignments for many language pairs including Welsh-English, Swahili-English, Czech-English and Chinese-English.
TER-plus (TERp): TERp is an automatic evaluation metric for Machine Translation that takes as input a set of reference translations and a set of machine translation outputs for the same data. TERp utilizes automatically generated paraphrases, stemming, synonyms, relaxed shifting constraints and other improvements.
(Note: Work done in collaboration with Matt Snover, who is the main developer of TERp.)

Service