CS4120/6120: Natural Language Processing

Spring 2017

Instructor: David Smith, Assistant Professor in Computer and Information Science (Office Hours: Thursdays, 12-2, or by appointment; WVH 356)

TA: Liwen Hou (Office Hours: Fridays, 3-5; WVH 472)

Class meeting: Thursdays, 6-9 p.m., Behrakis 325

Course Texts

This graduate/undergraduate course introduces you to natural language processing and to reading papers from the NLP literature. In addition to reading and discussing these papers, you will, in the latter part of the course, complete a project using empirical data.

Along with these readings, lectures will provide background in the fundamental linguistic concepts, statistical models, and algorithms used in NLP. These lectures will primarily draw on material from two textbooks which, while not required, provide useful additional detail:


Lecture notes and readings will be posted on the syllabus.


Course Policies

Discussion and Participation

When reading, the goal is not necessarily to figure out every detail of a given paper but rather to understand how each paper fits with what you've learned about NLP as a whole and what future questions it suggests. In other words, the process should mimic what you would do when conducting research in NLP or other areas of applied CS. At some point during the course, you will also give a short presentation on one of the papers related to your literature review (see below). This presentation will count towards the participation score, which accounts for 20% of the course grade.

Homework Assignments

There will be four homework assignments, together worth 30% of the total course grade. Assignments will mix written derivations and explanations with some programming problems. If you discuss a problem with others, you must note with whom you discussed it at the beginning of your solution write-up. Acknowledging collaboration does not permit sharing the text or code of the actual write-up. Reusing text from published or online sources is likewise not permitted.


Project

During the term, you will complete a project involving a computational approach to natural language, an empirical analysis of that approach, and a report on your project. This project will constitute 50% of the course grade. First, you will send a short pitch to the instructor about the intent and scope of the project, including at least two references to related work and evidence of usable data. After the project scope is adjusted, you will write a brief research plan, which will become part of the final report. At the end of the course, you will give a short in-class presentation on your progress. Finally, you will turn in your report and supplementary code and data.