The choice of topic for the class project is up to you, so long as it clearly pertains to the course material. It is fine to select a topic related to your area of research, so long as you can isolate a part that is not carried out in collaboration with people outside the class. You are encouraged to collaborate on the project, but this is by no means required.

We expect a four-page write-up about the project, which should clearly and succinctly describe the project goal, methods, and your results. You can refer to (and provide) supplementary material if you wish, but the four-page description should be detailed enough that the project can be graded on the basis of the write-up alone. A two-person group will have 6 pages, a three-person group will have 8 pages, and so on. The page limit and a minimum font size of 11pt are necessary (and will be enforced) to ensure that we are able to read through everyone's project carefully. Each group should submit only one copy of the write-up and include the names of all group members.

The projects will be graded on the basis of your understanding of the overall course material (not on, e.g., how brilliantly your method works). The scope of the project should be comparable to about 1-2 problem sets.
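
The page limits stated above (4, 6, and 8 pages for groups of one, two, and three) appear to grow by two pages per additional member. The short sketch below makes that pattern explicit. The function name `page_limit` is hypothetical, and the linear extrapolation to larger groups is an assumption based on the "and so on" in the text, not an official rule; check with course staff if in doubt.

```python
def page_limit(group_size: int) -> int:
    """Return the assumed maximum write-up length, in pages, for a group."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    # 4 pages for a single author, plus 2 pages per additional member.
    # Assumption: the stated 4, 6, 8 pattern continues linearly.
    return 4 + 2 * (group_size - 1)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"group of {n}: up to {page_limit(n)} pages")
```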

The projects are due during Lec #23. Electronic submission is required, and we can accept only PostScript or PDF documents.

The projects can be literature reviews, theoretical derivations or analyses, applications of machine learning methods to problems you are interested in, or something else (to be discussed with course staff).

Here are some examples:

  • Apply/develop a machine learning method to solve a specific problem
    • A machine learning approach to classifying your incoming mail
    • Predict stock prices based on past price variation
    • Predict how people would rate movies, books, etc.
    • Cluster gene expression data: how can existing methods be modified to solve the problem better?
  • Surveys/Reviews
    • Complexity of classifiers: the different concepts and how they compare
    • Algorithmic stability: which methods have stability guarantees, and where could we apply these concepts?
    • Collaborative filtering: what methods are available to solve collaborative filtering problems, and in which contexts have they been found effective?
    • Machine learning methods for genomic data: are they effective, and what is missing?
    • Calibration: which methods are calibrated, and how can a method be modified to improve its calibration?
  • Theoretical problems
    • Generalization guarantees for a specific algorithm (ask us)
    • Learnability of specific concept classes (ask us)
    • Convergence/consistency of a specific estimation method (ask us)