Advanced NLP: Theory and Practice - CMSC 828, Spring 2002
Instructors: Bonnie J. Dorr and Rebecca Hwa
Administration
Course Readings
-
January 29: "Introduction and Overview." Leader: Dorr
-
February 5: "Statistical NLP Review." Leader: Hwa
-
February 12: MT Survey discussion. Leader: Dorr
-
Dorr, Bonnie J., Pamela W. Jordan, and John W. Benoit, "A Survey of Current
Research in Machine Translation," Advances in Computers, Vol 49, M. Zelkowitz
(Ed), Academic Press, London, pp. 1--68, 1999. PS|PDF
-
February 19: Parsing. Leader: Hwa
Reactions due on Feb 10
-
Goodman, Joshua, "Parsing Algorithms and Metrics," Proc. of the 34th ACL, 1996.
PS
-
Collins, Michael, "Three Generative Lexicalised Models for Statistical
Parsing," Proc. of the 35th ACL, 1997. PS
-
Charniak, Eugene, "Statistical Parsing with a Context-free Grammar and
Word Statistics," Proc of AAAI-97. PS
-
February 26: EM and grammar induction
Reactions due on Feb 18
Leaders: Diab and Lopez.
-
Lari and Young, "The Estimation of Stochastic Context-Free Grammars using
the Inside-Outside Algorithm," Computer Speech and Language 4(1), 1990.
PDF
(sorry the printing quality is so low)
-
Pereira and Schabes, "Inside-outside re-estimation for partially bracketed
corpora," In Proc. of the 30th ACL, pp. 128-135, 1992. PDF
-
Background: Collins, Michael, "The EM Algorithm (In fulfillment
of the Written Preliminary Exam II requirement)" 1997. PS
-
March 05: Syntax-Semantics Interface.
Reactions due on February 25
Leaders: Kwon and Zajic.
-
Read only sections 1, 2, and 4: Levin, B. and M. Rappaport Hovav,
"From Lexical Semantics to Argument Realization," 1996. A revised and expanded
version will appear in the Cambridge Research Surveys in Linguistics Series.
PS
-
Optional: Check out the LCS interlingual representation for our
in-house lexicons by clicking here:
English | Chinese | Spanish
-
March 12: Clustering.
Reactions due on March 04
Leaders: Naft and Ayan.
-
Lee, Lillian and Fernando Pereira, "Distributional Similarity Models: Clustering
vs. Nearest Neighbors," Proc. of the 37th ACL, 1999. PS
-
Pereira, Tishby, and Lee, "Distributional Clustering of English Words,"
Proc of 31st ACL, 1993. PS.
-
Lin, Dekang, "Automatic Retrieval and Clustering of Similar Words," Proc.
of Coling/ACL-98, 1998. PS
-
Optional: Mats Rooth, Stefan Riezler, Detlef Prescher, Glenn Carroll,
and Franz Beil, "Inducing a Semantically Annotated Lexicon via EM-Based
Clustering," Proceedings of ACL '99, pp. 104--111. PS
-
Optional: Franz Josef Och, "An Efficient Method for Determining
Bilingual Word Classes," In Proc. of EACL'99, Bergen, Norway, June
1999. PS|PDF
-
March 19: WSD.
Reactions due on March 11
Leaders: Greene and Divita.
-
Main Reading 1: Pedersen, Ted, "Machine Learning with Lexical Features:
The Duluth Approach to Senseval-2," Proceedings of SENSEVAL-2: Second International
Workshop on Evaluating Word Sense Disambiguation Systems, Toulouse, France,
July 5-6, 2001. PDF
-
Main Reading 2: Abney, Steven and Marc Light, "Hiding a Semantic
Class Hierarchy in a Markov Model," Proceedings of the ACL '99 Workshop
on Unsupervised Learning in Natural Language Processing, pp. 1-8. PS
-
Background: Ide, N & Veronis, J. "Introduction to the Special
Issue on Word Sense Disambiguation: The state of the art," Computational
Linguistics, 24, 1: 1-40, 1998.
PS
-
Optional (on feature selection): Pedersen, Kayaalp, Bruce, "Significant
Lexical Relationships," Proceedings of the Thirteenth National Conference
on Artificial Intelligence (AAAI-96), Portland, OR, August 4-8, 1996.
GZIPPED PS
-
Optional: Check out Pedersen's public release of code
here.
Also, you'll find his system description
here.
-
March 26: Vacation.
-
April 02: Guest Lecture: Nizar Habash, "Generation"
Reactions due on March 18
Send reactions to Nizar
-
Irene Langkilde and Kevin Knight, "Generation that Exploits Corpus-Based
Statistical Knowledge," In the Proc. of COLING-ACL, 1998 PS
-
Srinivas Bangalore and Owen Rambow, "Exploiting a Probabilistic Hierarchical
Model for Generation," In the Proc of COLING 2000. PS
-
April 09: MT Model
Reactions due on April 01
Leaders: Russo-Lassner and Tate.
-
Yamada and Knight, "A Syntax-Based Statistical Translation Model," Proc
of ACL-2001, 2001. PS
-
Germann et al, "Fast Decoding and Optimal Decoding for Machine Translation,"
Proc of ACL, 2001. PDF
-
April 16: Catch Up!
-
April 23: Word Alignment and dependency models
Reactions due on April 15
Leaders: Lopez, Nossal, and Kolak.
-
Alshawi, H., S. Bangalore, and S. Douglas, "Learning Dependency Translation
Models as Collections of Finite State Head Transducers," CL vol 26, 2000.
PS
-
Dekai Wu, "An Algorithm for Simultaneously Bracketing Parallel Texts
by Aligning Words," Proc of ACL 1995. PS
-
Melamed, Dan, "Models of Translational Equivalence among Words," CL 26(2),
2000. [PDF]
-
April 30: Guest Lecture: Philip Resnik
-
Resnik, P., "Mining the Web for Bilingual Text," in Proc. of the 37th Annual
Meeting of the ACL, 1999. [PS]
-
Resnik, P. and Smith, N., "Getting Serious about 'More is Better':
Mining the Internet Archive for Bilingual Text," submitted to EMNLP
2002 [DRAFT do not circulate]. [PS]
-
May 07: Guest Lecture: Doug Oard, "Information Retrieval"
(1/2 lecture)
-
James Allan, "Perspectives on Information Retrieval and Speech," In Information
Retrieval Techniques for Speech Applications, Coden, Brown, and Srinivasan
(Eds.), 2002. PDF.
-
Steve Young, "Statistical Modelling in Continuous Speech Recognition," in
Proc. of the ICUAI, 2001. PS
ALSO: "Cross-Language Information Retrieval."
Reactions due on April 29
Leaders: Darwish and Wang
-
McCarley, J.S., "Should we Translate the Documents or the Queries in Cross-language
Information Retrieval," in Proceedings of 37th Annual Meeting of the Association
for Computational Linguistics (College Park, MD, 1999), pp. 208-214.
[PDF]
-
S. E. Robertson and K. Sparck-Jones, "Simple, proven approaches to text
retrieval," Technical Report TR356, Cambridge University Computer Laboratory,
1997. [PS]
-
May 14: Bootstrapping Methods for Multilingual Processing
Reactions due on May 6
Leaders: Cabezas and Kolak
-
Bonnie Dorr, Lisa Pearl, Rebecca Hwa, and Nizar Habash, "Unraveling MT
Divergences: Linguistic Knowledge for Statistical Word-Level
Alignment," submitted to AMTA 2002 [DRAFT do not circulate]. [PS]
-
Rebecca Hwa, Philip Resnik, Amy Weinberg, and Okan Kolak, "Evaluating
Translational Correspondence using Annotation Projection," to appear
in the Proceedings of the 40th Annual Meeting of the ACL 2002. Final
version [PDF]
-
David Yarowsky and Grace Ngai, "Inducing Multilingual POS Taggers and
NP Bracketers via Robust Projection across Aligned Corpora," in Proceedings
of NAACL 2001, pp. 200--207. PDF
Bonnie Dorr's home page
Rebecca Hwa's home page