Video Lectures

Lecture 26: Structure of Neural Nets for Deep Learning

Description

This lecture is about the central structure of deep neural networks, which are a major force in machine learning. The aim is to construct a function that learns the training data and can then be applied to the test data.

Summary

The net has layers of nodes. Layer zero is the data.
We choose a matrix of “weights” from layer to layer.
Nonlinear step at each layer! Negative values become zero!
We know correct class for the training data.
Weights optimized to (usually) output that correct class.
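The structure summarized above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the lecture: the layer sizes and random weights are placeholders, and the ReLU is applied at every layer for simplicity (in practice the final layer is often linear or softmax).

```python
import numpy as np

def relu(v):
    # Nonlinear step at each layer: negative values become zero
    return np.maximum(v, 0.0)

def forward(x, weights, biases):
    """One pass through the net. Layer zero is the data x."""
    v = x
    for W, b in zip(weights, biases):
        v = relu(W @ v + b)  # affine map by the weight matrix, then ReLU
    return v

# Tiny example: 2 inputs -> 3 hidden nodes -> 1 output (sizes are illustrative)
rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 2)), rng.standard_normal((1, 3))]
biases = [rng.standard_normal(3), rng.standard_normal(1)]
print(forward(np.array([1.0, -2.0]), weights, biases))
```

Training would then adjust `weights` and `biases` so the output (usually) matches the correct class for each training sample.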

Related section in textbook: VII.1

Instructor: Prof. Gilbert Strang

Problems for Lecture 26
From textbook Section VII.1

4. Explain with words or show with graphs why each of these statements about Continuous Piecewise Linear functions (CPL functions) is true:

\(\boldsymbol{M} \hspace{4pt}\) The maximum \(M(x,y)\) of two CPL functions \(F_1(x,y)\) and \(F_2(x, y)\) is CPL.
\(\boldsymbol{S} \hspace{6pt}\) The sum \(S(x,y)\) of two CPL functions \(F_1(x,y)\) and \(F_2(x,y)\) is CPL.
\(\boldsymbol{C} \hspace{6pt}\) If the one-variable functions \(y=F_1(x)\) and \(z=F_2(y)\) are CPL, so is the
\(\hspace{16pt}\)composition \(C(x)=z=F_2(F_1(x))\).
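A numerical illustration (not a proof) of the three statements, using the one-variable case: two CPL functions built from ReLUs, together with their max, sum, and composition. The specific functions and breakpoints here are made up for the example.

```python
def relu(t):
    # ReLU is the basic CPL building block: one breakpoint at t = 0
    return max(t, 0.0)

def F1(x):
    return relu(x - 1.0) - 2.0 * relu(x)    # CPL with breakpoints at 0 and 1

def F2(x):
    return 3.0 * relu(-x) + relu(x - 2.0)   # CPL with breakpoints at 0 and 2

M = lambda x: max(F1(x), F2(x))  # statement M: max of two CPL functions
S = lambda x: F1(x) + F2(x)      # statement S: sum of two CPL functions
C = lambda x: F2(F1(x))          # statement C: composition of CPL functions

# Each of M, S, C is again piecewise linear: between its (finitely many)
# breakpoints the graph is a straight line segment.
print(M(0.5), S(0.5), C(0.5))
```

Sampling any of these on a grid and plotting shows straight segments joined at finitely many corners, which is what the three statements assert.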

Problem 7 uses the blue ball, orange ring example on playground.tensorflow.org with one hidden layer and activation by ReLU (not Tanh). When learning succeeds, a white polygon separates blue from orange in the figure that follows.

7. Does learning succeed for \(N=4\)? What is the count \(r(N, 2)\) of flat pieces in \(F(\boldsymbol{x})\)? The white polygon shows where flat pieces in the graph of \(F(\boldsymbol{x})\) change sign as they go through the base plane \(z=0\). How many sides in the polygon?
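The count of flat pieces can be computed from the formula in Section VII.1: \(N\) folds (ReLU neurons) in general position cut the plane \(m=2\) into \(r(N,m)=\binom{N}{0}+\binom{N}{1}+\cdots+\binom{N}{m}\) pieces. A short sketch of that count (the function name is my own):

```python
from math import comb

def r(N, m):
    # Flat pieces of F(x): regions created by N folds in general
    # position in R^m, following the counting formula of Section VII.1
    return sum(comb(N, k) for k in range(m + 1))

print(r(4, 2))  # 1 + 4 + 6 = 11 flat pieces for N = 4 neurons, m = 2 inputs
```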

Course Info

As Taught In: Spring 2018