This section contains a project description, suggested topics, and examples of student work.
The course project will be a major component of the course grade. You are welcome to choose a topic in any area of machine learning or statistics related to the course syllabus. You are strongly encouraged to choose a topic that you would like to learn about, rather than one you are already familiar with. It is fine to choose a topic related to your independent research, but the course project must be work that you would not have done had you not taken this course. Throughout the semester we will provide project ideas to help you.
For experimental projects, you must provide insight into why certain algorithms perform better than others on your dataset. Note that applying many different algorithms to one dataset and reporting test results does not constitute a project. Also, the application to your data must be novel (you should not use a well-studied dataset, or at least you need to use it in a new way). For algorithmic projects, you should introduce a new algorithm or set of algorithms for solving a problem, and also discuss under what assumptions the methods should work and under what conditions they will fail. For theoretical projects, you must prove something new. It is possible (and desirable, but not always practical) to have a project that combines theoretical and experimental results.
Project Proposal. A 1–2 page proposal is due at the beginning of week 10. It should contain the following information: (1) project title, (2) 1–2 sentence description of the project, (3) importance of the project, (4) precise description of the question you are trying to answer and how you will answer it (if experimental, what data you will collect), (5) reading list (papers you will need to read).
Progress Report. Due week 12. The progress report will not count toward your grade, but it is recommended: our feedback will help you get a sense of how your final report will be received. The amount of progress between the Progress Report and the Final Report is not important, so you are encouraged to turn in everything you have done so far for the Progress Report.
Project Ad. Due week 13. One slide. An "advertisement" describing your project to the class. Include a brief description of your problem and results and, if possible, a graphic. This should encourage other members of the class to attend your final talk. Make it exciting!
Project Talk. Scheduled during the last 2 weeks of classes, weeks 14–15. The length of the talk will depend on the number of talks that need to be given during the time available.
Paper. Due on the last day of class. There is no specified length for the paper, but we expect most papers will be on the order of 10–15 pages. Please try to keep it under 20 pages. Your paper should start with an abstract (1 paragraph) and an introduction (1–2 pages), and end with a conclusion. In the introduction, you should answer the following: Why is your problem interesting and important? Have you addressed a gap in our (collective) knowledge?
Projects are courtesy of anonymous MIT students, unless specified otherwise, and are used with permission.
Online k-Means Clustering of Nonstationary Data (PDF), by Angie King
Improving Tools for Medical Statistics (PDF), by Jacqueline Soegaard