Course Meeting Times
Lectures: 3 sessions / week, 1 hour / session
Prerequisites
Permission of the instructor is required. The following courses are helpful but not required: Theory of Probability (18.175) and either Statistical Learning Theory and Applications (9.520) or Machine Learning (6.867).
Description
The main goal of this course is to study the generalization ability of popular machine learning algorithms such as boosting, support vector machines, and neural networks. We will develop technical tools that allow us to give qualitative explanations of why these learning algorithms work so well in many classification problems.
Topics of the course include Vapnik-Chervonenkis theory, concentration inequalities in product spaces, and other elements of empirical process theory.
Grading
The grade is based upon two problem sets and class attendance.
Course Outline
Introduction
- Classification problem set-up
- Examples of learning algorithms: Voting algorithms (boosting), support vector machines, neural networks
- Analyzing generalization ability
Technical Tools: Elements of Empirical Process Theory
One-dimensional Concentration Inequalities
- Chebyshev (Markov), Rademacher, Hoeffding, Bernstein, Bennett
- Toward uniform bounds: Union bound, clustering
Vapnik-Chervonenkis Theory and More
- VC classes of sets and functions
- Shattering numbers, growth function, covering numbers
- Examples of VC classes, properties
- Uniform deviation bounds
- Symmetrization
- Kolmogorov's chaining technique
- Dudley's entropy integral
- Contraction principles
Concentration Inequalities
- Talagrand's concentration inequality on the cube
- Symmetrization
- Talagrand's concentration inequality for empirical processes
- Vapnik-Chervonenkis type inequalities
- Martingale-difference inequalities
Applications
- Generalization ability of voting classifiers, neural networks, support vector machines