 Background and Description

There are a lot of languages out there. The goal of this course is to learn about a handful of different kinds of techniques that are relevant to modeling them as a whole. We will discuss questions of natural language processing (building systems to deal with lots of languages), computational linguistics (using and uncovering latent structure in Language) and computational psycholinguistics (explaining why things are as they are). The course will be almost entirely project-based, in teams that (to the extent possible) span departments.

To be clear, while there are projects that I think are interesting, part of the point of this course is to learn from each other. I would be perfectly happy if everyone came in with their own problem and we all figured out together how to solve it.

Some ideas I have for projects include things like: joint syntactic models over dozens of languages using typological knowledge; uniform information density accounts of typological generalizations; spelling-to-pronunciation modeling across many languages; using (human) second-language acquisition knowledge to help (machine) second-language acquisition; modeling cross-linguistic discourse structure. Or anything else you know about that's fun and interesting that I've never thought of!

Prereqs: Students should have taken at least one of the following courses: Computational Linguistics I, Machine Learning or Computational Modeling of Language. (Or you should at least have some knowledge about these things.) You should also be very interested in this topic: please do not take this course unless you're really going to be involved.