CS4120/6120: Natural Language Processing
Spring 2017 Syllabus
This schedule is subject to change. Check back as the class progresses.
Why NLP?
Language Models
- n-gram models, naive Bayes classifiers, probability, estimation (a smoothed bigram sketch follows this list)
- We also played the Shannon game, guessing the next letter from the previous n letters.
- Readings for Jan. 19: Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP, 2002.
- Victor Chahuneau, Kevin Gimpel, Bryan R. Routledge, Lily Scherlis, and Noah A. Smith. Word Salad: Relating Food Prices and Descriptions. EMNLP, 2012.
- Reading for Jan. 26: C. E. Shannon. Prediction and Entropy of Printed English. The Bell System Technical Journal, January 1951.
- Background: Jurafsky & Martin, chapter 4
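- To make the estimation concrete, here is a minimal bigram model with add-one smoothing sketched in Python; the toy corpus, the start/stop markers, and the function names are illustrative assumptions, not course code:

      from collections import Counter

      corpus = ["the cat sat", "the cat ate", "the dog sat"]
      tokens = [["<s>"] + line.split() + ["</s>"] for line in corpus]

      unigrams = Counter(w for sent in tokens for w in sent)
      bigrams = Counter(pair for sent in tokens for pair in zip(sent, sent[1:]))
      vocab_size = len(unigrams)

      def prob(v, u):
          """Add-one smoothed estimate of P(v | u)."""
          return (bigrams[(u, v)] + 1) / (unigrams[u] + vocab_size)

      # Score a held-out sentence by chaining bigram probabilities.
      sentence = ["<s>", "the", "dog", "ate", "</s>"]
      p = 1.0
      for u, v in zip(sentence, sentence[1:]):
          p *= prob(v, u)
      print(p)
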
Regular Languages
- the Chomsky hierarchy, regular expressions, (weighted) finite-state automata and transducers (a weighted-acceptor sketch follows this list)
- Readings for Feb. 2: Kevin Knight and Jonathan Graehl. Machine Transliteration. Computational Linguistics, 24(4), 1998.
- Background on NLP with unweighted finite-state machines: Karttunen, Chanod, Grefenstette, and Schiller. Regular Expressions for Language Engineering. Journal of Natural Language Engineering, 1997. We discussed the main points and interesting examples from this paper in class, but you can read it for more derivations and examples.
- More background: Jurafsky & Martin, chapter 2
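- A small Python sketch of a weighted finite-state acceptor, summing path weights in the probability semiring; the states, arcs, and weights are made-up assumptions, not an example from the readings:

      # arcs[state][symbol] -> list of (next_state, arc_weight)
      arcs = {
          0: {"a": [(0, 0.5), (1, 0.5)]},
          1: {"b": [(1, 0.9)]},
      }
      final = {1: 0.1}  # state -> final (stopping) weight

      def total_weight(string):
          """Sum over accepting paths of the product of arc and final weights."""
          frontier = {0: 1.0}  # start state 0 with weight 1
          for sym in string:
              nxt = {}
              for state, w in frontier.items():
                  for dest, aw in arcs.get(state, {}).get(sym, []):
                      nxt[dest] = nxt.get(dest, 0.0) + w * aw
              frontier = nxt
          return sum(w * final.get(s, 0.0) for s, w in frontier.items())

      print(total_weight("aab"))  # 0.5 * 0.5 * 0.9 * 0.1 = 0.0225
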
Noisy Channel and Hidden Markov Models
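- A purely illustrative sketch of HMM decoding: a compact Viterbi tagger over a two-state toy model; the tags, probabilities, and example sentence are assumptions, not course data:

      states = ["DT", "NN"]
      start = {"DT": 0.7, "NN": 0.3}
      trans = {"DT": {"DT": 0.1, "NN": 0.9}, "NN": {"DT": 0.4, "NN": 0.6}}
      emit = {"DT": {"the": 0.9, "dog": 0.1}, "NN": {"the": 0.1, "dog": 0.9}}

      def viterbi(words):
          """Return the highest-probability tag sequence under the toy HMM."""
          best = [{s: (start[s] * emit[s][words[0]], None) for s in states}]
          for w in words[1:]:
              column = {}
              for s in states:
                  prev, score = max(((p, best[-1][p][0] * trans[p][s]) for p in states),
                                    key=lambda x: x[1])
                  column[s] = (score * emit[s][w], prev)
              best.append(column)
          # Follow back-pointers from the best final state.
          tag = max(states, key=lambda s: best[-1][s][0])
          tags = [tag]
          for column in reversed(best[1:]):
              tag = column[tag][1]
              tags.append(tag)
          return list(reversed(tags))

      print(viterbi(["the", "dog"]))  # expected: ['DT', 'NN']
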
Context-Free Grammars and Parsers
- Background: Jurafsky & Martin, chapters 12-14
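- As one concrete example of the parsing algorithms covered here, a minimal CKY recognizer for a grammar in Chomsky normal form (recognition only, no parse tree); the toy grammar and sentence are assumptions, not the textbook's examples:

      lexical = {"the": {"DT"}, "dog": {"NN"}, "cat": {"NN"}, "bites": {"VB"}}
      binary = {("DT", "NN"): {"NP"}, ("VB", "NP"): {"VP"}, ("NP", "VP"): {"S"}}

      def cky_recognize(words, goal="S"):
          n = len(words)
          # chart[i][j] holds the nonterminals that span words[i:j]
          chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
          for i, w in enumerate(words):
              chart[i][i + 1] = set(lexical.get(w, ()))
          for width in range(2, n + 1):
              for i in range(n - width + 1):
                  j = i + width
                  for k in range(i + 1, j):  # split point
                      for B in chart[i][k]:
                          for C in chart[k][j]:
                              chart[i][j] |= binary.get((B, C), set())
          return goal in chart[0][n]

      print(cky_recognize("the dog bites the cat".split()))  # True
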
Log-Linear Models
- also known as logistic regression or maximum entropy (maxent) models; directly modeling the conditional probability of output given input, rather than the joint probability of input and output (and then using Bayes' rule); a short worked comparison follows this list
- Background: Jurafsky & Martin, sections 6.6-6.7; N. Smith, Appendix C
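- In symbols (the notation here is assumed, not quoted from the readings): with feature vector f(x, y) and weight vector w, a log-linear model defines

      p(y \mid x) = \frac{\exp\left( w \cdot f(x, y) \right)}{\sum_{y'} \exp\left( w \cdot f(x, y') \right)}

  whereas a generative classifier such as naive Bayes models the joint p(x, y) = p(y) p(x | y) and only then recovers p(y | x) by Bayes' rule.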
Models with Structured Outputs
- models that decide among combinatorially many outputs, e.g. sequences of tags or dependency links; locally normalized (action-based) models such as Maximum Entropy Markov Models (MEMMs); globally normalized models such as linear-chain Conditional Random Fields (CRFs); see the scoring sketch after this list
- Background: Jurafsky & Martin, section 6.8; N. Smith, sections 3.1-3.5
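- A brute-force Python sketch of globally normalized (CRF-style) scoring for a tag sequence; the feature weights and tag set are assumptions, and exhaustive enumeration stands in for the usual forward algorithm just to keep the example short:

      from itertools import product
      from math import exp

      tags = ["DT", "NN"]
      emission_w = {("the", "DT"): 2.0, ("dog", "NN"): 2.0}  # unlisted features score 0
      transition_w = {("DT", "NN"): 1.0}

      def score(words, ts):
          """Unnormalized global score: sum of emission and transition feature weights."""
          s = sum(emission_w.get((w, t), 0.0) for w, t in zip(words, ts))
          s += sum(transition_w.get((a, b), 0.0) for a, b in zip(ts, ts[1:]))
          return s

      def prob(words, ts):
          """Globally normalized probability: exp(score) over the sum for all tag sequences."""
          Z = sum(exp(score(words, y)) for y in product(tags, repeat=len(words)))
          return exp(score(words, ts)) / Z

      print(round(prob(["the", "dog"], ("DT", "NN")), 3))  # about 0.904
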
Formal Semantics
- logical form: lambda expressions, event semantics, quantifiers, intensional semantics; first steps in computational compositional semantics: semantic role labeling, combinatory categorial grammar (CCG); a small composition sketch follows this list
- Background: Jurafsky & Martin, chapters 17-20; see also NLTK book, chapter 10
- Readings for Nov. 10: McDonald et al. Non-projective Dependency Parsing using Spanning Tree Algorithms. EMNLP, 2005.
- Bansal et al. Structured Learning for Taxonomy Induction with Belief Propagation. ACL, 2014.
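- A toy compositional-semantics sketch in Python, treating word meanings as functions that build a logical form by beta reduction; the string-based representation is an illustrative assumption, not a CCG or course implementation:

      # Word meanings as functions that build logical-form strings.
      every = lambda P: lambda Q: f"forall x.({P('x')} -> {Q('x')})"
      student = lambda x: f"student({x})"
      sleeps = lambda x: f"sleep({x})"

      # Compose bottom-up, as a parser would: [every student], then [[every student] sleeps].
      np = every(student)   # generalized quantifier for "every student"
      s = np(sleeps)        # apply it to the VP meaning
      print(s)              # forall x.(student(x) -> sleep(x))
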
Lexical Semantics
- Words and word senses, vector space representations, greedy agglomerative clustering, k-means and EM clustering, Brown clustering as language modeling; learning hierarchies of word meanings; continuous word embeddings; compositional vector-space semantics (a cosine-similarity sketch follows this list)
- Background: Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing. Technical report, October 2015.
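- A small sketch of vector-space word similarity with cosine distance; the three-dimensional "embeddings" below are made up for illustration, not trained vectors:

      import math

      emb = {
          "cat": [0.9, 0.1, 0.0],
          "dog": [0.8, 0.2, 0.1],
          "paris": [0.0, 0.1, 0.9],
      }

      def cosine(u, v):
          dot = sum(a * b for a, b in zip(u, v))
          norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
          return dot / norm

      # Nearest neighbor of "cat" among the other toy words.
      print(max((w for w in emb if w != "cat"),
                key=lambda w: cosine(emb["cat"], emb[w])))  # dog
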
Machine Translation
- word-based alignment models; phrase-based models; syntactic and tree-based models; learning from comparable corpora; topic models; encoder-decoder models (an IBM Model 1 sketch follows this list)
- Background: Jurafsky & Martin, chapter 25
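- A compressed sketch of IBM Model 1 EM for word alignment (no NULL word); the toy sentence pairs and uniform initialization are assumptions, not a reference implementation:

      from collections import defaultdict

      pairs = [("das haus".split(), "the house".split()),
               ("das buch".split(), "the book".split()),
               ("ein buch".split(), "a book".split())]

      source_vocab = {f for fs, _ in pairs for f in fs}
      t = defaultdict(lambda: 1.0 / len(source_vocab))  # t(f | e), uniform start

      for _ in range(10):  # EM iterations
          count = defaultdict(float)
          total = defaultdict(float)
          for fs, es in pairs:
              for f in fs:  # E-step: expected alignment counts
                  z = sum(t[(f, e)] for e in es)
                  for e in es:
                      c = t[(f, e)] / z
                      count[(f, e)] += c
                      total[e] += c
          for (f, e), c in count.items():  # M-step: renormalize per English word
              t[(f, e)] = c / total[e]

      print(round(t[("haus", "house")], 3))  # rises toward 1.0 as EM concentrates the alignment
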
NLP and Linguistics
- Zipf's law, Heaps' law, power-law pitfalls; the poverty of the stimulus, learning in the limit, Gold's and Horning's theorems; probabilistic grammars in historical and psycholinguistics (a quick Zipf's-law check follows below)
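- A quick empirical check of Zipf's law: under the law, rank times frequency is roughly constant, so the printed products should stay in the same ballpark across ranks; the file name is a placeholder assumption:

      from collections import Counter

      # Any plain-text file will do; "corpus.txt" is a placeholder.
      words = open("corpus.txt", encoding="utf-8").read().lower().split()
      counts = Counter(words).most_common()

      # Under Zipf's law, rank * frequency is roughly constant.
      for rank, (word, freq) in enumerate(counts[:20], start=1):
          print(rank, word, freq, rank * freq)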