Lecture 26: Structure of Neural Nets for Deep Learning

Course Info

Instructor: Prof. Gilbert Strang
Departments: Mathematics
As Taught In: Spring 2018
Level: Undergraduate

Description

This lecture is about the central structure of deep neural networks, which are a major force in machine learning. The aim is to construct a function that learns the training data and then to apply that function to the test data.

Summary

The net has layers of nodes. Layer zero is the data.
We choose a matrix of “weights” from each layer to the next.
A nonlinear step follows at each layer: negative values become zero (ReLU).
We know the correct class for each sample in the training data.
The weights are optimized so that the output is (usually) that correct class.
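
The summary above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the lecture's own code: the function names (`relu`, `affine`, `forward`, `classify`), the bias vectors `b`, and the hand-picked weights in `LAYERS` are assumptions made for this example, and the weights are not the result of any training. The weights are chosen so that the net computes \(F(x,y) = 1 - |x| - |y|\), and the class is read from the sign of \(F\).

```python
def relu(v):
    """Nonlinear step: negative entries become zero."""
    return [max(t, 0.0) for t in v]

def affine(A, b, v):
    """Compute A v + b for a matrix A (list of rows) and vectors b, v."""
    return [sum(a * t for a, t in zip(row, v)) + bi for row, bi in zip(A, b)]

def forward(x, layers):
    """Apply every layer in turn; ReLU follows each layer except the last."""
    v = x  # layer zero is the data
    for A, b in layers[:-1]:
        v = relu(affine(A, b, v))
    A, b = layers[-1]
    return affine(A, b, v)

def classify(x, layers):
    """Read the class from the sign of the single output F(x)."""
    return 1 if forward(x, layers)[0] > 0 else -1

# Hypothetical hand-picked weights (not trained): F(x, y) = 1 - |x| - |y|.
# One hidden layer with 4 nodes, then one output node.
LAYERS = [
    ([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]], [0.0, 0.0, 0.0, 0.0]),
    ([[-1.0, -1.0, -1.0, -1.0]], [1.0]),
]

print(classify([0.2, 0.3], LAYERS))  # a point near the origin
print(classify([2.0, 0.0], LAYERS))  # a point far from the origin
```

With these weights the first point gets class 1 and the second gets class -1. In practice the weights would come from optimization on the training data rather than being written down by hand.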

Related section in textbook: VII.1

Problems for Lecture 26
From textbook Section VII.1

4. Explain with words or show with graphs why each of these statements about Continuous Piecewise Linear functions (CPL functions) is true:

\(\boldsymbol{M} \hspace{4pt}\) The maximum \(M(x,y)\) of two CPL functions \(F_1(x,y)\) and \(F_2(x, y)\) is CPL.
\(\boldsymbol{S} \hspace{6pt}\) The sum \(S(x,y)\) of two CPL functions \(F_1(x,y)\) and \(F_2(x,y)\) is CPL.
\(\boldsymbol{C} \hspace{6pt}\) If the one-variable functions \(y=F_1(x)\) and \(z=F_2(y)\) are CPL, so is the composition \(C(x)=z=F_2(F_1(x))\).

Problem 7 uses the blue ball, orange ring example on playground.tensorflow.org with one hidden layer and activation by ReLU (not Tanh). When learning succeeds, a white polygon separates blue from orange in the figure that follows.

7. Does learning succeed for \(N=4\)? What is the count \(r(N, 2)\) of flat pieces in \(F(\boldsymbol{x})\)? The white polygon shows where flat pieces in the graph of \(F(\boldsymbol{x})\) change sign as they go through the base plane \(z=0\). How many sides in the polygon?
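
As a way to check a hand count of flat pieces, here is a small sketch of the standard formula for the number of regions cut out by \(N\) hyperplanes in \(m\) dimensions, \(r(N,m) = \binom{N}{0} + \binom{N}{1} + \cdots + \binom{N}{m}\). It assumes the \(N\) fold lines are in general position (no two parallel, no three through one point); a trained net is not guaranteed to produce folds in general position, so the value is an upper bound on what the figure may show. The function name `count_pieces` is hypothetical.

```python
from math import comb

def count_pieces(N, m):
    """Regions created by N hyperplanes in general position in m dimensions.

    Uses r(N, m) = C(N, 0) + C(N, 1) + ... + C(N, m).
    """
    return sum(comb(N, i) for i in range(m + 1))

# Three lines in general position split the plane into 7 regions.
print(count_pieces(3, 2))
```

The same call with the \(N\) from the problem gives the count to compare against the figure.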