Syllabus

Course Meeting Times

Lectures: 3 sessions / week, 1 hour / session

Prerequisites

Permission of instructor is required. Helpful courses (ideal but not required): Theory of Probability (18.175) and either Statistical Learning Theory and Applications (9.520) or Machine Learning (6.867).

Description

The main goal of this course is to study the generalization ability of a number of popular machine learning algorithms such as boosting, support vector machines and neural networks. We will develop a number of technical tools that will allow us to give qualitative explanations of why these learning algorithms work so well in many classification problems.

Topics of the course include Vapnik-Chervonenkis theory, concentration inequalities in product spaces, and other elements of empirical process theory.

Grading

The grade is based upon two problem sets and class attendance.

Course Outline

Introduction

Classification problem set-up
Examples of learning algorithms: Voting algorithms (boosting), support vector machines, neural networks
Analyzing generalization ability

Technical Tools: Elements of Empirical Process Theory

One-dimensional Concentration Inequalities

Chebyshev (Markov), Rademacher, Hoeffding, Bernstein, Bennett
Toward uniform bounds: Union bound, clustering

Vapnik-Chervonenkis Theory and More

VC classes of sets and functions
Shattering numbers, growth function, covering numbers
Examples of VC classes, properties
Uniform deviation bounds
Symmetrization
Kolmogorov’s chaining technique
Dudley’s entropy integral
Contraction principles

Concentration Inequalities

Talagrand’s concentration inequality on the cube
Symmetrization
Talagrand’s concentration inequality for empirical processes
Vapnik-Chervonenkis type inequalities
Martingale-difference inequalities

Applications

Generalization ability of voting classifiers, neural networks, support vector machines

Topics in Statistics: Statistical Learning Theory