CMSC723/LING723 Computational Linguistics I
Instructor: Saif Mohammad
Co-instructor: Nitin Madnani
Course co-ordinator: Bonnie Dorr
Class: Wednesdays, 4 to 6:30pm, Computer Science Instructional Center (CSIC) Room 3120
Text: Speech and Language Processing: An Introduction to Natural Language Processing, Speech Recognition, and Computational Linguistics, second edition (published in 2008), by Daniel Jurafsky and James H. Martin.
Guest lectures: by Bonnie Dorr, Philip Resnik, and Doug Oard.
Overview: The lectures in this course will cover topics in four broad areas of Computational Linguistics (words and morphology; syntax; semantics; and pragmatics) and some specific applications.
September 3: INTRODUCTIONS and OVERVIEW by Saif and Bonnie
Reading: Chapter 1 of J&M; Sections 1.3 and 1.4 from this chapter in Foundations of Statistical Natural Language Processing.
- administrivia
- semester plan
- overview of NLP, by Bonnie
- introduction to statistical NLP, by Saif
Lecture notes: Course details and Introduction to statistical NLP, Overview of Computational Linguistics
Assignment 0 posted (not for credit)
September 10: NUTS and BOLTS by Nitin
Reading: NLTK Book (Chapters 1 and 2); Python Beginners' Guide (includes resources for both programmers and non-programmers); ACM article on Getting Started on NLP with Python.
- Introduction to Python and NLTK
Lecture notes: Introduction to Python and NLTK
Assignment 0: finish Q1 before class.
September 17: WORDS by Saif
Reading: Chapter 2, Section 2.2 onwards; Chapter 3.
Lecture notes: FSA, FST, morphology
- regular expressions
- finite state automata
- morphology
- finite state transducers
Assignment 0: turn in solutions to both Q1 and Q2
Assignment 1 (on finite state automata) posted
September 24: WORDS by Saif and Nitin
Reading: Chapter 5.1-5.4 and 5.6
Lecture notes: POS tagging
- finish morphology and FSTs
- part-of-speech tagging
- introduction to HMMs
October 1: WORDS by Nitin
Reading: Chapter 5.5, 6.1-6.4
- hidden Markov models (HMMs)
- expectation-maximization and HMM training
Lecture notes: HMM and EM
Assignment 1 is due; Assignment 2 (on POS-tagging and HMM) is posted
October 8: SYNTAX by Bonnie
Reading: This week's and next week's lectures will draw from subsets of the slides at these two URLs:
http://www.cs.colorado.edu/~martin/SLP/Slides/slp12.pdf
http://www.cs.colorado.edu/~martin/SLP/Slides/slp13.pdf
- context-free grammars (CFG)
- linguistic phenomena
Lecture notes: Syntax Ia, Syntax Ib
October 15: SYNTAX by Bonnie and Nitin; Midterm Review by Saif and Nitin
Reading: This week's and last week's lectures draw from subsets of the slides at these two URLs:
http://www.cs.colorado.edu/~martin/SLP/Slides/slp12.pdf
http://www.cs.colorado.edu/~martin/SLP/Slides/slp13.pdf
TAG: Pages 1-13 and 27-33 (Section 8) of Aravind Joshi and Yves Schabes, Tree-Adjoining Grammars, in Handbook of Formal Languages, G. Rozenberg and A. Salomaa (eds.), Vol. 3, Springer, Berlin, New York, 1997, 69-124.
CCG: New Ch 12 (Section 12.7); Mark Steedman, Categorial Grammar (tutorial overview), Lingua, 90:221--258, 1993; and Shieber et al. (1995) section 4.
- context-free parsing: CYK, Earley by Bonnie
- tree adjoining grammars by Nitin
- midterm review by Saif and Nitin
Lecture notes: Syntax II
Lecture notes: CCG and TAG
Assignment 2 is due
October 22: MIDTERM
Midterm review: These questions are meant to help you focus your preparation for the midterm. They are not a substitute for the readings for all the classes so far.
This is a two-hour in-class exam.
October 29: WORDS by Nitin
Reading: Chapter 4 (Sections 4.1--4.7; 4.9.1)
- N-gram language models
Lecture notes: N-grams
Assignment 3 (on n-grams and smoothing) is posted
November 5: SEMANTICS by Saif
Reading: Chapter 19 (Sections 19.1--19.3); Chapter 20 (Sections 20.1--20.5)
- representing meaning
- word senses
- word sense disambiguation
  - supervised
  - unsupervised
  - semi-supervised
Lecture notes: WSD
November 12: SEMANTICS by Saif
Reading: Chapter 20 (Sections 20.6--20.9)
- lexical semantic relations
- semantic distance
  - WordNet-based measures
  - distributional measures
Lecture notes: WSD and Semantic Distance
Assignment 4 (on word senses and semantic distance) is posted. Assignment 3 is due.
November 19: APPLICATIONS
Reading (optional): Statistical Machine Translation by Adam Lopez
- machine translation
November 26: APPLICATIONS by Saif
See forum for details of this class.
- clustering
- first- and second-order co-occurrences
- singular value decomposition
Assignment 4 is due
December 3: APPLICATIONS
Reading: Combining Bidirectional Translation and Synonymy for Cross-Language Information Retrieval
Reading: 2005 Johns Hopkins Summer Workshop Final Report on Parsing and Spoken Structural Event Detection
- information retrieval
- segmentation in speech NLP
Lecture notes: Information Retrieval
Lecture notes: Segmentation for speech NLP
December 10: APPLICATIONS and WRAP-UP by Saif
- example applications of NLP on the web
- example visualisations of natural language
- exam overview
- wrap-up
FINAL EXAM: December 17, 2008, 4 to 6pm
Last updated: October 2008