Course Meeting Times

Lectures: 2 sessions / week, 1.5 hours / session

A list of topics covered in the course is presented in the calendar.


This introductory course gives an overview of many concepts, techniques, and algorithms in machine learning, beginning with topics such as classification and linear regression and ending up with more recent topics such as boosting, support vector machines, hidden Markov models, and Bayesian networks. The course will give the student the basic ideas and intuition behind modern machine learning methods as well as a bit more formal understanding of how, why, and when they work. The underlying theme in the course is statistical inference as it provides the foundation for most of the methods covered.

Problem Sets

There will be a total of 5 problem sets, due roughly every two weeks. The content of the problem sets will vary from theoretical questions to more applied problems. You are encouraged to collaborate with other students while solving the problems, but you must turn in your own solutions. Copying will not be tolerated. If you collaborate, you must indicate all of your collaborators.

Each problem set will be graded by a group of students with the guidance of your TAs. Each problem set will be graded in a single grading session, usually on the first Monday after it is due, starting at 5pm. Every student is required to participate in one grading session. You should sign up for grading by contacting a TA, by email or in person; doing so early increases your chances of getting your preferred grading schedule. Students who do not sign up for grading by the third week of the course will be assigned to a problem set by us.

If you drop the class after signing up for a grading session, please be sure to let us know so we can keep track of students available for grading. If you add the class during the term, please remember to sign up for grading.


There will be two in-class exams: a midterm midway through the term and a final on the last day of class.


You are required to complete a class project. The choice of topic is up to you, so long as it clearly pertains to the course material. To ensure that you are on the right track, you will have to submit a one-paragraph description of your project a month before the project is due. As with the problem sets, you are encouraged to collaborate on the project. We expect a four-page write-up that clearly and succinctly describes the project goal, methods, and your results. Each group should submit only one copy of the write-up, listing the names of all group members; the page allowance grows with group size (a two-person group will have 6 pages, a three-person group will have 8 pages, and so on). The projects will be graded on the basis of your understanding of the overall course material (not on, e.g., how brilliantly your method works). The scope of the project should be comparable to about 1-2 problem sets.
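The page allowances stated above follow a simple pattern, which can be sketched as a one-line rule. This is only an illustration: it assumes that "and so on" means two additional pages per additional group member, and the function name `page_limit` is hypothetical, not something defined by the course.

```python
def page_limit(group_size):
    """Page allowance for a project write-up.

    Assumption: 4 pages for one person, plus 2 pages for each
    additional group member (matches the 4 / 6 / 8 figures given).
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return 4 + 2 * (group_size - 1)


# Print the allowance for groups of one to four people.
for members in range(1, 5):
    print(members, "member(s):", page_limit(members), "pages")
```

When in doubt about the allowance for a larger group, check with the course staff rather than relying on this extrapolation.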

The projects are due in Lec #23. Electronic submission is required, but we can accept only PostScript or PDF documents. The short proposal should be turned in on or before Lec #12.

The projects can be literature reviews, theoretical derivations or analyses, applications of machine learning methods to problems you are interested in, or something else (to be discussed with course staff).


Your overall grade will be determined roughly as follows:

Midterm 15%
Problem sets 30%
Final 25%
Project 30%
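As a rough illustration of how these weights combine, the sketch below computes a weighted average of component scores. It assumes that every component is scored on a 0-100 scale and that the weights are applied exactly, whereas the grade is only "roughly" determined this way. The function name `overall_grade` and the sample scores are hypothetical.

```python
# Weights taken from the grading breakdown above, as fractions of the total.
WEIGHTS = {
    "midterm": 0.15,
    "problem_sets": 0.30,
    "final": 0.25,
    "project": 0.30,
}


def overall_grade(scores):
    """Return the weighted average of component scores (assumed 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"midterm": 80, "problem_sets": 90, "final": 70, "project": 100}
print(round(overall_grade(example), 2))
```

Note that the problem sets and the project together account for 60% of the total under these weights, more than the two exams combined.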


There are a number of useful texts for this course but each covers only some part of the class material.

Bishop, Christopher. Neural Networks for Pattern Recognition. New York, NY: Oxford University Press, 1995. ISBN: 9780198538646.

Duda, Richard, Peter Hart, and David Stork. Pattern Classification. 2nd ed. New York, NY: Wiley-Interscience, 2000. ISBN: 9780471056690.

Hastie, T., R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning: Data Mining, Inference and Prediction. New York, NY: Springer, 2001. ISBN: 9780387952840.

MacKay, David. Information Theory, Inference, and Learning Algorithms. Cambridge, UK: Cambridge University Press, 2003. ISBN: 9780521642989. Also available online.

Mitchell, Tom. Machine Learning. New York, NY: McGraw-Hill, 1997. ISBN: 9780070428072.

You are responsible for the material covered in lectures (most of which will appear in lecture notes in some form) and in problem sets, as well as any material specifically made available and indicated for this purpose. The weekly recitations/tutorials will be helpful in understanding the material and solving the homework problems.

Recommended Citation

For any use or distribution of these materials, please cite as follows:

Tommi Jaakkola, course materials for 6.867 Machine Learning, Fall 2006. MIT OpenCourseWare, Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].


Calendar

Lec # | Topics | Key dates
1 | Introduction, linear classification, perceptron update rule |
2 | Perceptron convergence, generalization |
3 | Maximum margin classification |
4 | Classification errors, regularization, logistic regression | Problem set 1 out
5 | Linear regression, estimator bias and variance, active learning |
6 | Active learning (cont.), non-linear predictions, kernels | Problem set 1 due
7 | Kernel regression, kernels | Problem set 2 out
8 | Support vector machine (SVM) and kernels, kernel optimization |
9 | Model selection | Problem set 2 due
10 | Model selection criteria |
11 | Description length, feature selection | Problem set 3 out 3 days before Lec #11
12 | Combining classifiers, boosting |
13 | Boosting, margin, and complexity | Problem set 3 due; Problem set 4 out
14 | Margin and generalization, mixture models |
15 | Mixtures and the expectation maximization (EM) algorithm |
16 | EM, regularization, clustering | Problem set 4 due
17 | Clustering |
18 | Spectral clustering, Markov models | Problem set 5 out
19 | Hidden Markov models (HMMs) |
20 | HMMs (cont.) |
21 | Bayesian networks |
22 | Learning Bayesian networks | Problem set 5 due
23 | Probabilistic inference; guest lecture on collaborative filtering | Projects due
24 | Current problems in machine learning, wrap up | Exams back