| LEC # | TOPICS | LECTURE NOTES |
|---|---|---|
| 1 | Markov Decision Processes; Finite-Horizon Problems: Backwards Induction; Discounted-Cost Problems: Cost-to-Go Function, Bellman’s Equation | (PDF) |
| 2 | Value Iteration; Existence and Uniqueness of Bellman’s Equation Solution; Gauss-Seidel Value Iteration | (PDF) |
| 3 | Optimality of Policies derived from the Cost-to-Go Function; Policy Iteration; Asynchronous Policy Iteration | (PDF) |
| 4 | Average-Cost Problems; Relationship with Discounted-Cost Problems; Bellman’s Equation; Blackwell Optimality | (PDF) |
| 5 | Average-Cost Problems; Computational Methods | (PDF) |
| 6 | Application of Value Iteration to Optimization of Multiclass Queueing Networks; Introduction to Simulation-based Methods; Real-Time Value Iteration | (PDF) |
| 7 | Q-Learning; Stochastic Approximations | (PDF) |
| 8 | Stochastic Approximations: Lyapunov Function Analysis; The ODE Method; Convergence of Q-Learning | (PDF) |
| 9 | Exploration versus Exploitation: The Complexity of Reinforcement Learning | (PDF) |
| 10 | Introduction to Value Function Approximation; Curse of Dimensionality; Approximation Architectures | (PDF) |
| 11 | Model Selection and Complexity | (PDF) |
| 12 | Introduction to Value Function Approximation Algorithms; Performance Bounds | (PDF) |
| 13 | Temporal-Difference Learning with Value Function Approximation | (PDF) |
| 14 | Temporal-Difference Learning with Value Function Approximation (cont.) | (PDF) |
| 15 | Temporal-Difference Learning with Value Function Approximation (cont.); Optimal Stopping Problems; General Control Problems | (PDF) |
| 16 | Approximate Linear Programming | (PDF) |
| 17 | Approximate Linear Programming (cont.) | (PDF) |
| 18 | Efficient Solutions for Approximate Linear Programming | (PDF) |
| 19 | Efficient Solutions for Approximate Linear Programming: Factored MDPs | (PDF) |
| 20 | Policy Search Methods | (PDF) |
| 21 | Policy Search Methods (cont.) | (PDF) |
| 22 | Policy Search Methods for POMDPs; Application: Call Admission Control; Actor-Critic Methods | |
| 23 | Approximate POMDP Compression | |
| 24 | Policy Search Methods: PEGASUS; Application: Helicopter Control | |

As taught in: Spring 2004