LEC # | TOPICS | KEY DATES |
---|---|---|
1 | Markov Decision Processes; Finite-Horizon Problems: Backwards Induction; Discounted-Cost Problems: Cost-to-Go Function, Bellman’s Equation | |
2 | Value Iteration; Existence and Uniqueness of Bellman’s Equation Solution; Gauss-Seidel Value Iteration | |
3 | Optimality of Policies Derived from the Cost-to-Go Function; Policy Iteration; Asynchronous Policy Iteration | Problem set 1 out |
4 | Average-Cost Problems; Relationship with Discounted-Cost Problems; Bellman’s Equation; Blackwell Optimality | Problem set 1 due |
5 | Average-Cost Problems; Computational Methods | |
6 | Application of Value Iteration to Optimization of Multiclass Queueing Networks; Introduction to Simulation-based Methods; Real-Time Value Iteration | Problem set 2 out |
7 | Q-Learning; Stochastic Approximations | |
8 | Stochastic Approximations: Lyapunov Function Analysis; The ODE Method; Convergence of Q-Learning | |
9 | Exploration versus Exploitation: The Complexity of Reinforcement Learning | |
10 | Introduction to Value Function Approximation; Curse of Dimensionality; Approximation Architectures | |
11 | Model Selection and Complexity | Problem set 3 out |
12 | Introduction to Value Function Approximation Algorithms; Performance Bounds | |
13 | Temporal-Difference Learning with Value Function Approximation | |
14 | Temporal-Difference Learning with Value Function Approximation (cont.) | |
15 | Temporal-Difference Learning with Value Function Approximation (cont.); Optimal Stopping Problems; General Control Problems | |
16 | Approximate Linear Programming | Problem set 4 out |
17 | Approximate Linear Programming (cont.) | |
18 | Efficient Solutions for Approximate Linear Programming | |
19 | Efficient Solutions for Approximate Linear Programming: Factored MDPs | |
20 | Policy Search Methods | Problem set 5 out |
21 | Policy Search Methods (cont.) | |
22 | Policy Search Methods for POMDPs; Application: Call Admission Control; Actor-Critic Methods | |
23 | Guest Lecture: Prof. Nick Roy; Approximate POMDP Compression | |
24 | Policy Search Methods: PEGASUS; Application: Helicopter Control | |
As Taught In: Spring 2004
Learning Resource Types: Lecture Notes, Projects with Examples, Problem Sets