Lecture 22: Gradient Descent: Downhill to a Minimum
Description
Gradient descent is the most common optimization algorithm in deep learning and machine learning. It uses only the first derivative (the gradient) when updating parameters, taking successive steps downhill toward a local minimum.
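As a minimal sketch of that update rule in Python (the function, step size, and iteration count below are illustrative choices, not from the lecture):

```python
import numpy as np

def gradient_descent(grad_f, x0, s=0.1, steps=100):
    """Take repeated downhill steps: x <- x - s * grad_f(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - s * grad_f(x)
    return x

# Illustrative use: minimize f(x) = x^2, whose derivative is 2x.
print(gradient_descent(lambda x: 2 * x, x0=5.0))  # approaches the minimizer 0
```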
Summary
Gradient descent: Downhill from \(x\) to new \(X = x - s\,(\partial F/\partial x)\)
Excellent example: \(F(x, y) = \tfrac{1}{2}(x^2 + by^2)\)
If \(b\) is small we take a zig-zag path toward \((0, 0)\).
Each step multiplies by \((b - 1)/(b + 1)\)
Remarkable function: logarithm of determinant of \(X\)
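To see the zig-zag concretely, here is a small numeric sketch (my own, not from the lecture) of gradient descent on \(F(x, y) = \tfrac{1}{2}(x^2 + by^2)\) from the starting point \((b, 1)\), using the exact line-search step \(s = g^T g / g^T A g\) for this quadratic with \(A = \mathrm{diag}(1, b)\); the printed per-step ratio matches \((b - 1)/(b + 1)\):

```python
import numpy as np

b = 0.1                          # small b exaggerates the zig-zag
pt = np.array([b, 1.0])          # starting point (b, 1)
print("predicted ratio:", (b - 1) / (b + 1))

for k in range(5):
    g = np.array([pt[0], b * pt[1]])       # gradient of F = (x^2 + b*y^2)/2
    s = (g @ g) / (g[0]**2 + b * g[1]**2)  # exact line-search step
    new = pt - s * g
    print(k, new, "x-ratio:", new[0] / pt[0])
    pt = new
```

Each step flips the sign of \(y\) while shrinking both coordinates by the same factor, which is exactly the zig-zag path toward \((0, 0)\).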
Related section in textbook: VI.4
Instructor: Prof. Gilbert Strang
Problems for Lecture 22
From textbook Section VI.4
1. For a 1 by 1 matrix in Example 3, the determinant is just \(\det X = x_{11}\). Find the first and second derivatives of \(F(X) = -\log(\det X) = -\log x_{11}\) for \(x_{11} > 0\). Sketch the graph of \(F = -\log x\) to see that this function \(F\) is convex.
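As a quick check on the calculus involved (a hint, not the full exercise):

\[
F(x) = -\log x \quad\Longrightarrow\quad F'(x) = -\frac{1}{x}, \qquad F''(x) = \frac{1}{x^2} > 0 \quad (x > 0),
\]

and \(F'' > 0\) is what confirms convexity on \(x > 0\).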
6. What is the gradient descent equation \(x_{k+1} = x_k - s_k \nabla f(x_k)\) for the least squares problem of minimizing \(\tfrac{1}{2}\|Ax - b\|^2\)?
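For experimenting with that update, a minimal sketch using the standard least squares gradient \(\nabla f = A^{T}(Ax - b)\); the matrix \(A\), vector \(b\), step size, and iteration count are made-up test values, not from the textbook:

```python
import numpy as np

# Hypothetical test data (not from the textbook).
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])

x = np.zeros(2)
s = 0.01                             # fixed step size, kept small for stability
for _ in range(5000):
    x = x - s * A.T @ (A @ x - b)    # x_{k+1} = x_k - s * A^T (A x_k - b)

print(x)                                      # approaches the least squares solution
print(np.linalg.lstsq(A, b, rcond=None)[0])   # reference solution for comparison
```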