Course Meeting Times

Lectures: Two sessions / week, 1.5 hours / session

Course Outline

  • Introduction (1 Lecture)
  • Estimation Techniques and Language Modeling (1 Lecture)
  • Parsing and Syntax (5 Lectures)
  • The EM Algorithm in NLP (1 Lecture)
  • Stochastic Tagging and Log-Linear Models (2 Lectures)
  • Probabilistic Similarity Measures and Clustering (2 Lectures)
  • Machine Translation (2 Lectures)
  • Discourse Processing: Segmentation, Anaphora Resolution (3 Lectures)
  • Dialogue Systems (1 Lecture)
  • Natural Language Generation/Summarization (1 Lecture)
  • Unsupervised Methods in NLP (1 Lecture)


Upon completion of 6.864, students will be able to explain and apply fundamental algorithms and techniques in the area of natural language processing (NLP). In particular, students will:

  • Understand approaches to syntax and semantics in NLP.
  • Understand approaches to discourse, generation, dialogue and summarization within NLP.
  • Understand current methods for statistical approaches to machine translation.
  • Understand machine learning techniques used in NLP, including hidden Markov models and probabilistic context-free grammars, clustering and unsupervised methods, log-linear and discriminative models, and the EM algorithm as applied within NLP.


Prerequisites: MIT courses 6.034 and 6.046J, or permission of the instructor.


Suggested textbooks for the course are:

Jurafsky, David, and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Upper Saddle River, NJ: Prentice-Hall, 2000. ISBN: 0130950696.

Manning, Christopher D., and Hinrich Schütze. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press, 1999. ISBN: 0262133601.

Measurable Outcomes and Assessment Methods

Students completing 6.864 will have demonstrated an ability to:

  • Understand the mathematical and linguistic foundations underlying approaches to the above areas in NLP (measured by problem sets and quizzes).
  • Design, implement and test algorithms for NLP problems (measured by problem sets).


Midterm 20%
Final Exam 30%
Six to seven homework assignments 50%

Academic Integrity

Everything you do for credit in this subject is supposed to be your own work. You can talk to other students (and instructors) about approaches to problems, but then you should sit down and do the problem yourself. This is not only the ethical way but also the only effective way of learning the material.