WEBVTT

00:00:01.550 --> 00:00:03.920
The following content is
provided under a Creative

00:00:03.920 --> 00:00:05.310
Commons license.

00:00:05.310 --> 00:00:07.520
Your support will help
MIT OpenCourseWare

00:00:07.520 --> 00:00:11.610
continue to offer high quality
educational resources for free.

00:00:11.610 --> 00:00:14.180
To make a donation or to
view additional materials

00:00:14.180 --> 00:00:18.140
from hundreds of MIT courses,
visit MIT OpenCourseWare

00:00:18.140 --> 00:00:19.026
at ocw.mit.edu.

00:00:23.170 --> 00:00:27.920
GILBERT STRANG: OK, so kind of
a few things in mind for today.

00:00:27.920 --> 00:00:31.340
One is to answer those two
questions on the second line.

00:00:33.860 --> 00:00:38.080
We found those two formulas
on the first line last time,

00:00:38.080 --> 00:00:40.850
the derivative of A inverse.

00:00:40.850 --> 00:00:43.460
So the derivative of A
squared ought to be easy.

00:00:43.460 --> 00:00:48.020
But if we can't do that,
we need to be sure we can.

00:00:48.020 --> 00:00:51.560
And then this was the
derivative of an eigenvalue.

00:00:51.560 --> 00:00:54.980
And then it's natural to
ask about the derivative

00:00:54.980 --> 00:00:56.870
of the singular value.

00:00:56.870 --> 00:01:00.110
And I had a happy day
yesterday in the snow,

00:01:00.110 --> 00:01:03.380
realizing that that
has a nice formula too.

00:01:03.380 --> 00:01:06.230
Of course, I'm not the first.

00:01:06.230 --> 00:01:13.260
I'm sure that Wikipedia
already knows this formula.

00:01:13.260 --> 00:01:14.970
But it was new to me.

00:01:14.970 --> 00:01:19.460
And I should say Professor
Edelman has carried it

00:01:19.460 --> 00:01:21.120
to the second derivative.

00:01:21.120 --> 00:01:27.320
Again, not new, but it's
more difficult to find

00:01:27.320 --> 00:01:31.040
second derivatives,
and interesting.

00:01:31.040 --> 00:01:34.740
But we'll just stay
with first derivatives.

00:01:34.740 --> 00:01:39.050
OK, so that's my
first item of sort

00:01:39.050 --> 00:01:41.260
of business from last time.

00:01:41.260 --> 00:01:44.840
And then I'd like to say
something about the lab

00:01:44.840 --> 00:01:49.160
homeworks and ask your advice
and begin to say something

00:01:49.160 --> 00:01:51.050
about a project.

00:01:51.050 --> 00:01:58.550
And then I will move to
these topics in Section 4.4

00:01:58.550 --> 00:02:01.430
that you have already.

00:02:01.430 --> 00:02:06.860
And you might notice
I skipped 4.3.

00:02:06.860 --> 00:02:10.610
And the reason is
that on Friday,

00:02:10.610 --> 00:02:13.070
actually arriving
at MIT tomorrow

00:02:13.070 --> 00:02:19.910
is Professor Townsend,
4.3 is all about his work.

00:02:19.910 --> 00:02:24.320
And he's the best
lecturer I know.

00:02:24.320 --> 00:02:29.390
He was here as an instructor
and did 18.06 and was

00:02:29.390 --> 00:02:31.540
a big success.

00:02:31.540 --> 00:02:36.110
Actually, he's also
just won a prize

00:02:36.110 --> 00:02:44.570
for the SIAG/LA, international
prize for young investigators,

00:02:44.570 --> 00:02:48.800
young faculty in
applied linear algebra.

00:02:48.800 --> 00:02:53.300
So he goes to Hong Kong
to get that prize too.

00:02:53.300 --> 00:02:58.700
Anyway, he will be on the videos
and in here in class Friday,

00:02:58.700 --> 00:03:00.860
if all goes well.

00:03:00.860 --> 00:03:06.110
OK, so in order
then, the first thing

00:03:06.110 --> 00:03:09.260
is the derivative of A squared.

00:03:09.260 --> 00:03:16.750
And you might think it's
2A dA dt, but it's not.

00:03:16.750 --> 00:03:18.760
And if you realize
that it's not,

00:03:18.760 --> 00:03:22.480
then you realize what it is,
you will get these things right

00:03:22.480 --> 00:03:23.750
in the future.

00:03:23.750 --> 00:03:32.250
So the answer to the derivative
of A squared is not 2A dA dt.

00:03:36.340 --> 00:03:37.690
And why isn't it?

00:03:37.690 --> 00:03:40.720
And what is the right answer?

00:03:40.720 --> 00:03:43.450
So I do that maybe
just below here.

00:03:50.590 --> 00:03:53.030
Well, I could ask you to
guess the right answer,

00:03:53.030 --> 00:03:55.660
but why don't we do
it systematically.

00:03:55.660 --> 00:03:59.800
So how do you find
the derivative?

00:03:59.800 --> 00:04:01.180
It's a limit.

00:04:01.180 --> 00:04:03.700
First you have a delta A, right.

00:04:03.700 --> 00:04:05.210
And then you take a limit.

00:04:05.210 --> 00:04:15.700
So I look at A plus delta
A squared minus A squared.

00:04:15.700 --> 00:04:18.279
So that's the
change in A squared.

00:04:18.279 --> 00:04:21.820
And I divide it by delta t.

00:04:21.820 --> 00:04:24.370
And then delta t goes to 0.

00:04:24.370 --> 00:04:26.920
So that's the derivative
I'm looking for,

00:04:26.920 --> 00:04:29.020
the derivative of A squared.

00:04:29.020 --> 00:04:34.210
And now, if I write that out,
you'll see why this is wrong,

00:04:34.210 --> 00:04:38.080
but something very close to it,
of course-- can't be far away--

00:04:38.080 --> 00:04:38.980
is right.

00:04:38.980 --> 00:04:41.100
So what happens if
I write this out?

00:04:41.100 --> 00:04:44.770
The A squared will
cancel the A squared.

00:04:44.770 --> 00:04:45.480
What will I have?

00:04:45.480 --> 00:04:49.540
Will I have 2A delta A?

00:04:49.540 --> 00:04:52.795
Why don't I write
2A delta A next?

00:04:55.900 --> 00:05:01.090
Because when you're squaring
a sum of two matrices,

00:05:01.090 --> 00:05:11.180
one term is A delta A, and
another term is delta A A.

00:05:11.180 --> 00:05:15.230
And those are
different in general.

00:05:15.230 --> 00:05:20.210
And then plus delta A squared.

00:05:20.210 --> 00:05:24.680
And now I divide
it all by delta t.

00:05:24.680 --> 00:05:31.550
So you're now seeing my point
that now I let delta t go to 0.

00:05:31.550 --> 00:05:34.760
So I'm just doing
matrix calculus.

00:05:34.760 --> 00:05:40.490
And it's not altogether simple,
but if you follow the rules,

00:05:40.490 --> 00:05:42.200
it comes out right.

00:05:42.200 --> 00:05:48.770
So now what answer do I
get as delta t goes to 0?

00:05:48.770 --> 00:05:51.950
I get A dA dt--

00:05:51.950 --> 00:05:56.240
that's the definition of the--

00:05:56.240 --> 00:05:58.550
that ratio goes to dA dt.

00:05:58.550 --> 00:06:01.460
That's the whole idea
of the derivative of A.

00:06:01.460 --> 00:06:04.730
And now what's the other term?

00:06:04.730 --> 00:06:13.190
It's dA dt A. So it
was simply that point

00:06:13.190 --> 00:06:20.510
that I wanted you to pick up on,
that the derivative might not

00:06:20.510 --> 00:06:24.770
commute with A. Matrices
don't commute in general.

00:06:24.770 --> 00:06:31.415
And so you'll notice that we
had a similar expression there.

00:06:35.350 --> 00:06:38.480
We had to pay attention to
the order of things there.

00:06:38.480 --> 00:06:39.770
And now we get it right.

00:06:39.770 --> 00:06:51.810
It's not this, but A
dA dt plus dA dt A. OK.
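
NOTE
Editor's sketch, not part of the lecture: a finite-difference check of the rule just derived, d(A^2)/dt = A dA/dt + dA/dt A.
The matrix A(t) and all helper names below are hypothetical, chosen only so that A and dA/dt do not commute.
```python
# Compare a central difference of A(t)^2 with the two candidate formulas.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]
def A(t):
    return [[1.0, t], [t * t, 2.0]]
def dA(t):
    return [[0.0, 1.0], [2.0 * t, 0.0]]
def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))
t, h = 0.5, 1e-6
Ap, Am = A(t + h), A(t - h)
P, M = matmul(Ap, Ap), matmul(Am, Am)
# Central difference approximation of d(A^2)/dt at t.
numeric = [[(P[i][j] - M[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
# Order-respecting formula: A dA/dt + dA/dt A.
correct = add(matmul(A(t), dA(t)), matmul(dA(t), A(t)))
# Tempting but wrong formula: 2 A dA/dt.
naive = [[2.0 * x for x in row] for row in matmul(A(t), dA(t))]
err_correct = max_abs_diff(numeric, correct)
err_naive = max_abs_diff(numeric, naive)
print(err_correct, err_naive)
```
For this example the first error should be at roundoff level while the second stays of order 1, which is the non-commuting point made above.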

00:06:51.810 --> 00:06:52.970
Good.

00:06:52.970 --> 00:06:54.990
Now, can I do the other one?

00:06:54.990 --> 00:07:01.200
Which is a little more serious,
but it's a beautiful formula.

00:07:01.200 --> 00:07:04.470
And it's parallel to this guy.

00:07:04.470 --> 00:07:07.050
You might even guess it.

00:07:07.050 --> 00:07:10.050
So I'm looking for the
derivative of a singular value.

00:07:10.050 --> 00:07:12.780
The matrix A is changing.

00:07:12.780 --> 00:07:17.830
dA dt tells me how it's changing
at the moment, at the instant.

00:07:17.830 --> 00:07:22.400
And I want to know how is sigma
changing at that same instant.

00:07:22.400 --> 00:07:26.700
And sort of in parallel
with this is a nice--

00:07:26.700 --> 00:07:27.870
the nice formula--

00:07:27.870 --> 00:07:36.150
u transpose dA dt v of t.

00:07:36.150 --> 00:07:39.030
Boy, you couldn't ask for a
nicer formula than that, right?

00:07:43.310 --> 00:07:46.440
You remember this
is the eigenvector.

00:07:46.440 --> 00:07:50.050
And that's the eigenvector
of A transpose.

00:07:50.050 --> 00:07:52.650
So this is the
singular vector of A.

00:07:52.650 --> 00:07:56.280
And you could say this is a
singular vector of A transpose,

00:07:56.280 --> 00:08:04.470
or it's the left singular vector
of A. So that's our formula.
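
NOTE
Editor's sketch, not part of the lecture: a numerical check of d sigma/dt = u^T (dA/dt) v for the largest singular value of a 2 by 2 matrix.
Assumptions: A(t) is a hypothetical example, the top singular value is simple, and the 2 by 2 closed form below needs the off-diagonal entry q of A^T A to be nonzero.
```python
import math
def A(t):
    return [[2.0, t], [t * t, 1.0]]
def dA(t):
    return [[0.0, 1.0], [2.0 * t, 0.0]]
def top_singular_triple(M):
    # Entries of M^T M = [[p, q], [q, r]].
    p = M[0][0] ** 2 + M[1][0] ** 2
    q = M[0][0] * M[0][1] + M[1][0] * M[1][1]
    r = M[0][1] ** 2 + M[1][1] ** 2
    # Largest eigenvalue of M^T M; sigma is its square root.
    lam = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    sigma = math.sqrt(lam)
    # Unit eigenvector v of M^T M for lam (valid when q != 0), then u = M v / sigma.
    norm = math.hypot(lam - r, q)
    v = [(lam - r) / norm, q / norm]
    u = [(M[0][0] * v[0] + M[0][1] * v[1]) / sigma, (M[1][0] * v[0] + M[1][1] * v[1]) / sigma]
    return sigma, u, v
t, h = 0.5, 1e-6
sigma, u, v = top_singular_triple(A(t))
D = dA(t)
# The formula from the lecture: u^T (dA/dt) v.
formula = sum(u[i] * D[i][j] * v[j] for i in range(2) for j in range(2))
# Central difference of sigma(t) for comparison.
numeric = (top_singular_triple(A(t + h))[0] - top_singular_triple(A(t - h))[0]) / (2 * h)
print(formula, numeric)
```
The two printed numbers should agree to many digits; the sign choice for v does not matter because u flips with it.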

00:08:04.470 --> 00:08:07.360
And if we can just
recall how to prove it,

00:08:07.360 --> 00:08:10.260
which is going to be parallel
to the proof of that one,

00:08:10.260 --> 00:08:14.870
then I'm a happy person and
we can get on with life.

00:08:14.870 --> 00:08:20.340
So let's remember this,
because it will help us

00:08:20.340 --> 00:08:22.140
to remember the other one, too.

00:08:22.140 --> 00:08:25.620
OK, so where do I start?

00:08:25.620 --> 00:08:28.410
I start with a
formula for sigma.

00:08:28.410 --> 00:08:35.340
So I believe that sigma is
u transpose times A times

00:08:35.340 --> 00:08:41.780
v. Everybody agree with that?

00:08:41.780 --> 00:08:45.530
Everything's depending
on t in this formula.

00:08:45.530 --> 00:08:48.890
As time changes,
everything changes.

00:08:48.890 --> 00:08:52.340
But I didn't write
in the parentheses,

00:08:52.340 --> 00:08:56.390
t three more times.

00:08:56.390 --> 00:08:59.500
Can we just remember
about the SVD.

00:08:59.500 --> 00:09:04.756
The SVD says that
A times v equals--

00:09:04.756 --> 00:09:05.710
AUDIENCE: Sigma u.

00:09:05.710 --> 00:09:06.770
GILBERT STRANG: Sigma u.

00:09:06.770 --> 00:09:08.030
Thanks.

00:09:08.030 --> 00:09:09.490
Av is sigma u.

00:09:09.490 --> 00:09:10.410
That's the SVD.

00:09:13.290 --> 00:09:18.380
So when I put in for
Av, I put in sigma u.

00:09:18.380 --> 00:09:19.760
Sigma is just a number.

00:09:19.760 --> 00:09:21.700
So I bring it outside.

00:09:21.700 --> 00:09:24.110
And I'm left with u transpose u.

00:09:24.110 --> 00:09:26.970
And what's u transpose u?

00:09:26.970 --> 00:09:28.660
1.

00:09:28.660 --> 00:09:30.160
So I've used these two facts.

00:09:32.980 --> 00:09:35.890
Or I could have
gone the other way

00:09:35.890 --> 00:09:39.580
and said that this
is the transpose of--

00:09:39.580 --> 00:09:43.060
this is A transpose u transpose.

00:09:43.060 --> 00:09:49.060
I could look at it
that way times v.

00:09:49.060 --> 00:09:50.860
And if I look at
it that way, I'm

00:09:50.860 --> 00:09:53.530
interested in what
is A transpose u.

00:09:53.530 --> 00:09:57.370
And what is A transpose u?

00:09:57.370 --> 00:10:04.900
It's sigma v. And it's
transpose, so sigma v

00:10:04.900 --> 00:10:07.090
transpose v.

00:10:07.090 --> 00:10:09.643
And what is sigma v transpose v?

00:10:09.643 --> 00:10:10.310
AUDIENCE: Sigma.

00:10:10.310 --> 00:10:12.400
GILBERT STRANG: It's
sigma again, of course.

00:10:12.400 --> 00:10:14.070
Got sigma both ways.

00:10:14.070 --> 00:10:15.520
OK.

00:10:15.520 --> 00:10:18.910
Now, I'm ready to
take the derivative.

00:10:18.910 --> 00:10:23.680
That's the formula
I have for sigma,

00:10:23.680 --> 00:10:25.540
completely parallel
to the formula

00:10:25.540 --> 00:10:27.970
that we started out
with for lambda.

00:10:27.970 --> 00:10:31.900
The eigenvalue was
y transpose Ax.

00:10:31.900 --> 00:10:34.510
And now we've got
u transpose Av.

00:10:34.510 --> 00:10:38.170
And, by the way, when
would those two formulas

00:10:38.170 --> 00:10:40.410
be one and the same?

00:10:40.410 --> 00:10:45.060
When does the SVD just
tell us nothing new

00:10:45.060 --> 00:10:50.520
beyond the eigenvalue stuff for
what matrices are the singular

00:10:50.520 --> 00:10:53.430
values, the same as the
eigenvalues, and singular

00:10:53.430 --> 00:10:57.870
vectors the same
as the eigenvectors for--

00:10:57.870 --> 00:10:58.755
For?

00:10:58.755 --> 00:11:00.240
AUDIENCE: Symmetric.

00:11:00.240 --> 00:11:02.340
GILBERT STRANG: Symmetric, good.

00:11:02.340 --> 00:11:08.490
Symmetric, square,
and-- the two words

00:11:08.490 --> 00:11:11.820
that I'm always looking
for in this course.

00:11:11.820 --> 00:11:13.560
If you want an A in
this course, just

00:11:13.560 --> 00:11:17.970
write down positive definite
in the answer to any question,

00:11:17.970 --> 00:11:21.510
because sigmas are by
definition positive.

00:11:21.510 --> 00:11:24.570
And if they're going to agree
totally with the lambdas,

00:11:24.570 --> 00:11:26.460
then the lambdas
have to be positive.

00:11:26.460 --> 00:11:30.237
Or could be 0, so positive
semidefinite

00:11:30.237 --> 00:11:31.320
would be the right answer.

00:11:31.320 --> 00:11:33.060
Anyway, this is our start.

00:11:36.170 --> 00:11:38.460
And what do we do
with that formula?

00:11:38.460 --> 00:11:42.340
So this was all the same,
because v transpose v was 1.

00:11:45.870 --> 00:11:48.000
Here I had v transpose
v. And that's 1.

00:11:48.000 --> 00:11:49.260
So it gave me sigma.

00:11:49.260 --> 00:11:50.240
Yeah, good.

00:11:50.240 --> 00:11:52.190
Everybody's with us.

00:11:52.190 --> 00:11:53.580
OK, what do I do?

00:11:53.580 --> 00:11:55.110
Take the derivative.

00:11:55.110 --> 00:11:58.160
Take the derivative of
that equation in the box.

00:11:58.160 --> 00:12:00.570
It's exactly what
I did last time

00:12:00.570 --> 00:12:03.480
with the corresponding
equation for lambda.

00:12:03.480 --> 00:12:04.620
Same thing.

00:12:04.620 --> 00:12:07.140
And I'm going to get again--

00:12:07.140 --> 00:12:11.130
it's a product rule, because
I have three things multiplied

00:12:11.130 --> 00:12:12.760
on the right-hand side.

00:12:12.760 --> 00:12:15.700
So I've got three terms
from the product rule.

00:12:15.700 --> 00:12:21.780
So d sigma dt,
coming from the box,

00:12:21.780 --> 00:12:35.430
is du transpose dt Av
plus u transpose dA dt v

00:12:35.430 --> 00:12:41.370
plus the third guy, which
will be u transpose A dv dt.

00:12:44.530 --> 00:12:46.090
Did I get the three terms there?

00:12:46.090 --> 00:12:48.200
Yep.

00:12:48.200 --> 00:12:49.980
And which term do I want?

00:12:49.980 --> 00:12:54.300
Which term do I believe is going
to survive and be the answer?

00:12:57.090 --> 00:12:59.620
Well, this is what I'm after.

00:12:59.620 --> 00:13:01.750
So it's the middle term.

00:13:01.750 --> 00:13:03.220
The middle term is just right.

00:13:06.320 --> 00:13:09.770
And the other two terms
had better be zero.

00:13:09.770 --> 00:13:12.200
So that will be the proof.

00:13:12.200 --> 00:13:14.540
The other two
terms will be zero.

00:13:14.540 --> 00:13:17.840
So can we just take
one of those two terms

00:13:17.840 --> 00:13:22.100
and show that it's
zero like this one?

00:13:22.100 --> 00:13:24.140
OK, what have I got here?

00:13:24.140 --> 00:13:27.030
I want to know that
that term is 0.

00:13:27.030 --> 00:13:28.060
So what have I got.

00:13:28.060 --> 00:13:37.370
I've got du transpose
dt times Av.

00:13:37.370 --> 00:13:43.200
And everybody says, OK, in
place of Av, write in sigma u.

00:13:43.200 --> 00:13:47.760
And sigma's a number, so I
don't mind putting it there.

00:13:47.760 --> 00:13:53.310
So I've got sigma, a number,
times the derivative of u times

00:13:53.310 --> 00:13:55.140
u itself, the dot product--

00:13:55.140 --> 00:13:58.410
the derivative of u
with dot product with u.

00:13:58.410 --> 00:14:00.990
And that equals?

00:14:00.990 --> 00:14:05.280
0, I hope, because of this.

00:14:08.100 --> 00:14:09.120
Because of that.

00:14:12.240 --> 00:14:14.790
This comes from the
derivative of that.

00:14:17.580 --> 00:14:23.210
But you see, now we've got
dot products, ordinary dot

00:14:23.210 --> 00:14:26.800
products, and a number
on the right-hand side.

00:14:26.800 --> 00:14:29.750
We're in dimension
1, you could say.

00:14:29.750 --> 00:14:34.300
So this tells me immediately
that the derivative

00:14:34.300 --> 00:14:44.870
of u with u plus u transpose
times the derivative of u

00:14:44.870 --> 00:14:51.680
is the derivative
of 1, which is 0.

00:14:51.680 --> 00:14:56.120
All I'm saying is that
these are the same.

00:14:56.120 --> 00:15:01.535
You know, vectors, x transpose
y is the same as y transpose

00:15:01.535 --> 00:15:04.460
x when I'm talking
about real numbers.

00:15:04.460 --> 00:15:08.030
If I was doing complex
things, which I could do,

00:15:08.030 --> 00:15:16.070
then I'd have to pay attention
and take complex conjugates

00:15:16.070 --> 00:15:16.920
at the right moment.

00:15:16.920 --> 00:15:19.250
But let's not bother.

00:15:19.250 --> 00:15:23.600
So you see, this is
just two of these.

00:15:23.600 --> 00:15:26.960
And it gives me 0.

00:15:26.960 --> 00:15:28.070
So that term's gone.

00:15:30.690 --> 00:15:34.470
And similarly, totally
similarly, this term is gone.

00:15:34.470 --> 00:15:40.820
This is A transpose
u, all transpose.

00:15:40.820 --> 00:15:45.410
I'm just doing the
same thing times dv dt.

00:15:45.410 --> 00:15:48.300
And what is A transpose u?

00:15:48.300 --> 00:15:55.580
It's sigma v. So this is
sigma v transpose dv dt.

00:15:55.580 --> 00:15:59.090
And again 0, because of this.

00:16:02.630 --> 00:16:07.630
So in a way this was a
slightly easier thing--

00:16:07.630 --> 00:16:12.830
the last time was completely
parallel computation.

00:16:12.830 --> 00:16:17.410
But the first and third terms
had to cancel each other with

00:16:17.410 --> 00:16:19.840
the x's and y's.

00:16:19.840 --> 00:16:29.690
Now, they disappear separately,
leaving the right answer.
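
NOTE
Editor's summary of the board work, written out as equations. It only restates the steps spoken above, for real A(t) with unit singular vectors u(t) and v(t).
```latex
\begin{aligned}
\sigma &= u^{\mathsf T} A v, \qquad Av = \sigma u, \quad A^{\mathsf T}u = \sigma v, \quad u^{\mathsf T}u = v^{\mathsf T}v = 1 \\
\frac{d\sigma}{dt} &= \frac{du^{\mathsf T}}{dt} A v + u^{\mathsf T}\frac{dA}{dt} v + u^{\mathsf T} A \frac{dv}{dt} \\
\frac{du^{\mathsf T}}{dt} A v &= \sigma \frac{du^{\mathsf T}}{dt} u = \frac{\sigma}{2}\frac{d}{dt}\left(u^{\mathsf T}u\right) = 0 \\
u^{\mathsf T} A \frac{dv}{dt} &= \sigma v^{\mathsf T}\frac{dv}{dt} = \frac{\sigma}{2}\frac{d}{dt}\left(v^{\mathsf T}v\right) = 0 \\
\frac{d\sigma}{dt} &= u^{\mathsf T}\frac{dA}{dt} v
\end{aligned}
```
The first and third terms vanish separately because u and v stay unit vectors.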

00:16:29.690 --> 00:16:32.860
You might think, how did
we get into derivatives

00:16:32.860 --> 00:16:35.470
of singular values?

00:16:35.470 --> 00:16:40.210
Well, I think if we're
going to understand the SVD,

00:16:40.210 --> 00:16:45.040
then the first derivative
of the sigma is--

00:16:45.040 --> 00:16:47.320
well, except that I've
survived all these years

00:16:47.320 --> 00:16:48.230
without knowing it.

00:16:48.230 --> 00:16:50.440
So you could say it's not--

00:16:53.340 --> 00:16:58.330
you can live without it, but
it's a pretty nice formula.

00:16:58.330 --> 00:17:05.780
OK, that completes
that Section 3.1.

00:17:05.780 --> 00:17:09.770
And more to say about 3.2,
which was the interlacing

00:17:09.770 --> 00:17:11.960
part that I introduced.

00:17:11.960 --> 00:17:14.720
OK, so where am I?

00:17:14.720 --> 00:17:26.220
I guess I'm thinking about the
neat topics about interlacing

00:17:26.220 --> 00:17:28.060
of eigenvalues.

00:17:28.060 --> 00:17:33.810
So may I pick up on that theme,
interlacing of eigenvalues

00:17:33.810 --> 00:17:39.730
and say what's in the notes
and what's the general idea?

00:17:39.730 --> 00:17:40.230
OK.

00:17:43.290 --> 00:17:48.480
So we're leaving the
derivatives and moving

00:17:48.480 --> 00:17:54.570
to finite changes in the
eigenvalues and singular

00:17:54.570 --> 00:17:58.020
values, and we are
recognizing that we

00:17:58.020 --> 00:18:02.730
can't get exact
formulas for the change,

00:18:02.730 --> 00:18:06.200
but we can get
bounds for change.

00:18:06.200 --> 00:18:07.990
And they are pretty cool.

00:18:07.990 --> 00:18:12.060
So let me remind you what
that is, what they are.

00:18:12.060 --> 00:18:15.260
So I have a matrix--

00:18:15.260 --> 00:18:18.450
let's see, a symmetric
matrix S that

00:18:18.450 --> 00:18:22.080
has eigenvalues lambda 1,
greater equal lambda 2,

00:18:22.080 --> 00:18:25.920
greater equal so on.

00:18:25.920 --> 00:18:28.680
Then I change S by some amount.

00:18:28.680 --> 00:18:35.520
I think in the notes there is
a number, theta, times a rank 1 matrix.

00:18:35.520 --> 00:18:40.080
That has eigenvalues mu
1, greater equal mu 2,

00:18:40.080 --> 00:18:43.530
greater equal something.

00:18:43.530 --> 00:18:49.610
And these are what I can't
give you an exact formula for.

00:18:49.610 --> 00:18:52.850
You just would have
to compute them.

00:18:52.850 --> 00:18:57.410
But I can give you
bounds for them.

00:18:57.410 --> 00:18:59.470
And the bounds come
from the lambdas.

00:19:02.030 --> 00:19:04.100
So this was a positive.

00:19:04.100 --> 00:19:05.700
This is a positive change.

00:19:09.590 --> 00:19:14.140
So the eigenvalues will
go up, or stay still,

00:19:14.140 --> 00:19:16.760
but they won't go down.

00:19:16.760 --> 00:19:20.830
So the mu's will be
bigger than the lambdas.

00:19:20.830 --> 00:19:27.130
But the neat thing is that mu
2 will not pass up lambda 1.

00:19:27.130 --> 00:19:29.240
So here is the interlacing.

00:19:29.240 --> 00:19:32.110
Mu 1 is greater equal lambda 1.

00:19:32.110 --> 00:19:35.350
That says that the highest
eigenvalue, the top eigenvalue

00:19:35.350 --> 00:19:39.690
went up, or didn't move.

00:19:39.690 --> 00:19:44.640
But mu 2 is below lambda 1.

00:19:44.640 --> 00:19:46.540
This is the new--
everybody's with me here?

00:19:46.540 --> 00:19:50.210
This is a new, and
this is the old.

00:19:50.210 --> 00:19:56.510
New and old being old is S,
new is with the change in S.

00:19:56.510 --> 00:20:01.540
And that mu 2 is
greater equal lambda 2.

00:20:01.540 --> 00:20:04.010
So the second
eigenvalues went up.

00:20:04.010 --> 00:20:05.158
And then so on.
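
NOTE
Editor's sketch, not part of the lecture: the interlacing mu 1 >= lambda 1 >= mu 2 >= lambda 2 checked on one 2 by 2 example.
The entries of S, the number theta, and the unit vector x are hypothetical; the change is the positive rank 1 matrix theta x x^T.
```python
import math
def eig_sym2(a, b, d):
    # Eigenvalues of the symmetric matrix [[a, b], [b, d]], largest first.
    mid = (a + d) / 2
    rad = math.sqrt(((a - d) / 2) ** 2 + b * b)
    return mid + rad, mid - rad
# Old matrix S = [[a, b], [b, d]].
a, b, d = 2.0, 1.0, -1.0
# Positive rank 1 change theta * x x^T with a unit vector x.
theta, x = 3.0, (0.6, 0.8)
lam1, lam2 = eig_sym2(a, b, d)
mu1, mu2 = eig_sym2(a + theta * x[0] * x[0], b + theta * x[0] * x[1], d + theta * x[1] * x[1])
interlaced = mu1 >= lam1 >= mu2 >= lam2
print(lam1, lam2, mu1, mu2, interlaced)
```
One example proves nothing, but it shows the pattern: each eigenvalue moves up, and the new second eigenvalue stays below the old first one.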

00:20:10.910 --> 00:20:13.790
That's a great fact.

00:20:13.790 --> 00:20:16.810
And I guess that I sent
out a puzzle question.

00:20:16.810 --> 00:20:19.215
Did it arrive in email?

00:20:25.100 --> 00:20:29.820
Did anybody see that puzzle
question and think about it?

00:20:29.820 --> 00:20:31.060
It worried me for a while.

00:20:36.480 --> 00:20:41.820
Suppose this is the
second eigenvalue--

00:20:41.820 --> 00:20:44.310
eigenvector.

00:20:44.310 --> 00:20:50.520
So I'm adding on, I'm hyping
up the second eigenvector,

00:20:50.520 --> 00:20:52.830
hyping up the matrix
in the direction

00:20:52.830 --> 00:20:54.370
of the second eigenvector.

00:20:57.890 --> 00:21:02.250
So the second
eigenvalue was lambda 2.

00:21:02.250 --> 00:21:05.280
And its mu 2, the new
second eigenvalue,

00:21:05.280 --> 00:21:06.860
is going to be bigger by theta.

00:21:11.190 --> 00:21:15.120
But then I lost a little
sleep in thinking, OK,

00:21:15.120 --> 00:21:20.130
if the second eigenvalue
is mu 2 plus theta--

00:21:20.130 --> 00:21:22.980
sorry, if the second
eigenvalue mu 2--

00:21:22.980 --> 00:21:24.300
so let me write it here.

00:21:24.300 --> 00:21:35.460
If mu 2, the second eigenvalue,
is the old lambda 2 plus theta

00:21:35.460 --> 00:21:45.390
then bad news, because theta
can be as big as I want.

00:21:45.390 --> 00:21:48.180
It can be 20, 200, 2,000.

00:21:48.180 --> 00:21:57.300
And if I'm just adding theta
to lambda 2 to get the second--

00:21:57.300 --> 00:22:01.440
because it's a second
eigenvector that's

00:22:01.440 --> 00:22:10.140
getting pumped up, then after a
while, mu 2 will pass lambda 1.

00:22:10.140 --> 00:22:11.520
This will be totally true.

00:22:11.520 --> 00:22:13.200
I have no worries about this.

00:22:13.200 --> 00:22:14.610
The old lambda 1--

00:22:14.610 --> 00:22:18.150
actually, the old--

00:22:18.150 --> 00:22:21.000
I'll even have
equality here, because

00:22:21.000 --> 00:22:27.600
for this particular change,
it's not affecting lambda 1.

00:22:27.600 --> 00:22:30.430
So I think mu 1
would be lambda 1

00:22:30.430 --> 00:22:34.080
in my hypothetical possibility.

00:22:34.080 --> 00:22:35.550
What I'm trying to
get you to do is

00:22:35.550 --> 00:22:39.210
to think through what this
means, because it's quite

00:22:39.210 --> 00:22:43.170
easy to write that line there.

00:22:43.170 --> 00:22:46.950
But then when you think about
it, you get some questions.

00:22:46.950 --> 00:22:50.810
And it looks as
if it might fail,

00:22:50.810 --> 00:22:57.110
because if theta is really
big, that mu 2 would pass up

00:22:57.110 --> 00:22:57.860
lambda 1.

00:22:57.860 --> 00:23:00.500
And the thing would fail.

00:23:00.500 --> 00:23:02.570
And there has to be a catch.

00:23:02.570 --> 00:23:05.960
There has to be a catch.

00:23:05.960 --> 00:23:11.540
So does anybody-- you
saw that in the email.

00:23:11.540 --> 00:23:16.400
And I'll now explain
how I understood

00:23:16.400 --> 00:23:24.650
that everything can work and I'm
not reaching a contradiction.

00:23:24.650 --> 00:23:27.110
And here's my thinking.

00:23:27.110 --> 00:23:32.810
So it's perfectly true that the
eigenvalue that goes with u2--

00:23:32.810 --> 00:23:36.320
or maybe I should be calling
them x2, because usually I

00:23:36.320 --> 00:23:38.750
call the eigenvectors x2--

00:23:38.750 --> 00:23:42.950
it's perfectly true that mu
2, that that one goes up.

00:23:46.020 --> 00:23:51.900
But what happens when
it reaches lambda 1?

00:23:51.900 --> 00:23:54.435
Actually, lambda 1,
the first eigenvalue,

00:23:54.435 --> 00:23:57.930
is staying put, because it's
not getting any push from this.

00:23:57.930 --> 00:24:01.700
But the second eigenvalue is
getting a push of size theta.

00:24:01.700 --> 00:24:07.290
So what happens when lambda
2 plus theta, which is mu 2--

00:24:07.290 --> 00:24:09.480
mu 2 is lambda 2 plus theta--

00:24:09.480 --> 00:24:12.720
what happens when it
comes up to lambda 1

00:24:12.720 --> 00:24:15.120
and I start worrying
that it passes lambda 1?

00:24:18.410 --> 00:24:21.740
Do you see what's
happening there?

00:24:21.740 --> 00:24:25.250
What happens when mu 2 passes--

00:24:25.250 --> 00:24:26.720
when mu 2, which is--

00:24:26.720 --> 00:24:28.220
I'm just going to copy here--

00:24:28.220 --> 00:24:31.875
it's the old lambda 2 plus
the theta, the number.

00:24:31.875 --> 00:24:34.250
What happens when theta gets
bigger and bigger and bigger

00:24:34.250 --> 00:24:37.670
and this hits this thing
and then goes beyond?

00:24:37.670 --> 00:24:40.850
Just to see the logic here.

00:24:40.850 --> 00:24:46.760
What happens is that this lambda
2 plus theta, which was mu 2,

00:24:46.760 --> 00:24:49.070
mu 2 until they got here.

00:24:49.070 --> 00:24:55.570
But what is lambda 2 plus
theta after it passes lambda 1?

00:24:55.570 --> 00:24:56.800
It's mu 1 now.

00:24:59.340 --> 00:25:02.190
It passed up, so it's
the top eigenvalue

00:25:02.190 --> 00:25:07.390
of the altered matrix.

00:25:07.390 --> 00:25:10.380
And therefore, it's just fine.

00:25:10.380 --> 00:25:11.130
It's out here.

00:25:11.130 --> 00:25:13.740
No problem.

00:25:13.740 --> 00:25:15.810
Maybe I'll just say it again.

00:25:15.810 --> 00:25:20.010
When theta is big
enough that mu 2 reaches

00:25:20.010 --> 00:25:23.520
lambda 1, if I increase
theta beyond that,

00:25:23.520 --> 00:25:30.060
then this becomes not
mu 2 any more, but mu 1.

00:25:30.060 --> 00:25:35.130
And then totally
everybody's happy.
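
NOTE
Editor's sketch, not part of the lecture: the puzzle replayed with a hypothetical diagonal S, so the eigenvectors are the coordinate axes and x2 is the second axis.
Pumping up the x2 direction by theta moves only that eigenvalue; the labels mu 1, mu 2, mu 3 are then assigned by sorting.
```python
# Old eigenvalues lambda_1 >= lambda_2 >= lambda_3 of a diagonal S.
lam = [5.0, 3.0, 1.0]
results = []
for theta in [1.0, 2.0, 20.0, 200.0]:
    # Adding theta * x2 x2^T changes only the eigenvalue that goes with x2.
    new = [lam[0], lam[1] + theta, lam[2]]
    # The mu's are defined by their order, not by which eigenvector they belong to.
    mu = sorted(new, reverse=True)
    ok = mu[0] >= lam[0] >= mu[1] >= lam[1] >= mu[2] >= lam[2]
    results.append((theta, mu, ok))
    print(theta, mu, ok)
```
Once lambda 2 plus theta passes 5, it is relabeled as mu 1 and the old top eigenvalue 5 becomes mu 2, so the interlacing never fails.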

00:25:35.130 --> 00:25:40.260
I won't say more on that,
because that's just like a way

00:25:40.260 --> 00:25:44.760
that I found to make me think,
what do these things mean?

00:25:44.760 --> 00:25:48.070
OK, enough said on
that small point.

00:25:48.070 --> 00:25:51.730
But then the main point
is, why is this true?

00:25:51.730 --> 00:25:59.240
This interlacing, which is
really a nice, beautiful fact.

00:25:59.240 --> 00:26:05.500
And you could
imagine that we have

00:26:05.500 --> 00:26:09.220
more different perturbations
than just rank 1s.

00:26:13.300 --> 00:26:19.750
So let me tell you the
inequality, so named

00:26:19.750 --> 00:26:23.650
after the discoverer,
Weyl's inequality.

00:26:27.790 --> 00:26:39.400
So his inequality is for
the eigenvalues of S plus T.

00:26:39.400 --> 00:26:41.980
So T is the change.

00:26:41.980 --> 00:26:43.170
S is where I start.

00:26:43.170 --> 00:26:45.340
It has eigenvalues lambda.

00:26:45.340 --> 00:26:48.520
But now, I'm looking at the
eigenvalues of S plus T.

00:26:48.520 --> 00:26:50.860
So I'm making a change.

00:26:50.860 --> 00:26:53.350
Over here, in my
little puzzle question,

00:26:53.350 --> 00:26:56.710
that was T. It was
a rank 1 change.

00:26:56.710 --> 00:26:59.860
Now I will allow other ranks.

00:26:59.860 --> 00:27:03.430
So I want to estimate
lambdas of S plus T

00:27:03.430 --> 00:27:10.880
in terms of lambdas
of S and lambdas of T.

00:27:10.880 --> 00:27:13.710
And I want some
inequality sign there.

00:27:17.000 --> 00:27:21.680
And it's supposed to be true
for any symmetric matrices,

00:27:21.680 --> 00:27:26.800
symmetric S and T.

00:27:26.800 --> 00:27:32.360
And then a totally
identical Weyl inequality--

00:27:32.360 --> 00:27:33.980
actually, Weyl was
one of the people

00:27:33.980 --> 00:27:36.380
who discovered singular values.

00:27:36.380 --> 00:27:39.350
And when he did it, he
asked about his inequality.

00:27:39.350 --> 00:27:42.370
And he found that it
still worked the way we've

00:27:42.370 --> 00:27:44.180
found this morning earlier.

00:27:47.210 --> 00:27:49.490
I haven't completed
that yet, because I

00:27:49.490 --> 00:27:54.790
haven't told you which
lambdas I'm talking about.

00:27:54.790 --> 00:27:58.420
So let me do that.

00:27:58.420 --> 00:28:01.050
So now, I'll tell you
Weyl's inequality.

00:28:01.050 --> 00:28:03.280
So S and T are symmetric.

00:28:03.280 --> 00:28:05.770
And so the lambdas are real.

00:28:05.770 --> 00:28:07.670
And we want to know--

00:28:07.670 --> 00:28:10.060
we want to get them in order.

00:28:10.060 --> 00:28:11.740
OK, so here it goes.

00:28:15.170 --> 00:28:21.460
Weyl allowed the i-th eigenvalue
of S and the j-th eigenvalue

00:28:21.460 --> 00:28:27.850
of T and figured out that their sum
bounds eigenvalue i plus j minus 1

00:28:27.850 --> 00:28:32.650
of S plus T. So that's
Weyl's great inequality,

00:28:32.650 --> 00:28:42.730
which reduces to the
one I wrote here,

00:28:42.730 --> 00:28:44.680
if I make the right choice--

00:28:44.680 --> 00:28:47.940
yeah, probably, if
I take j equal to 1.

00:28:47.940 --> 00:28:51.340
So you see the beauty of this.

00:28:51.340 --> 00:28:56.560
It tells you about
any eigenvalues of S,

00:28:56.560 --> 00:28:57.760
eigenvalues of T.

00:28:57.760 --> 00:29:00.670
So I'm using lambdas here.

00:29:00.670 --> 00:29:02.950
Lambda of S are the
eigenvalues of S.

00:29:02.950 --> 00:29:07.480
I'm using lambda again for T
and lambda again for S plus T.

00:29:07.480 --> 00:29:11.830
So you have to pay attention
to which matrix I'm

00:29:11.830 --> 00:29:13.330
taking the eigenvalues out of.

00:29:13.330 --> 00:29:17.270
So let me take j equal to 1.

00:29:17.270 --> 00:29:21.780
And this says that
lambda i, because j is 1,

00:29:21.780 --> 00:29:28.210
S plus T is less or equal to
lambda i of S plus lambda 1,

00:29:28.210 --> 00:29:40.170
the top eigenvalue of T.
This is lambda max of T.
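
NOTE
Editor's summary in symbols, assuming symmetric n by n matrices S and T with eigenvalues ordered from largest to smallest and i + j - 1 <= n.
```latex
\begin{aligned}
\lambda_{i+j-1}(S+T) &\le \lambda_i(S) + \lambda_j(T), \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \\
j = 1: \quad \lambda_i(S+T) &\le \lambda_i(S) + \lambda_{\max}(T)
\end{aligned}
```
The second line is the j = 1 case spoken above: adding T raises the i-th eigenvalue by at most the top eigenvalue of T.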

00:29:40.170 --> 00:29:44.850
Do you see that that's totally
reasonable, believable?

00:29:44.850 --> 00:29:49.260
That the eigenvalue
when I add on T-- let's

00:29:49.260 --> 00:29:52.260
imagine in our minds
that T is positive.

00:29:52.260 --> 00:29:56.640
T is like this thing.

00:29:56.640 --> 00:30:02.280
This could be the T, example of
a T. It's what I'm adding on.

00:30:02.280 --> 00:30:06.570
Then the eigenvalues go up.

00:30:06.570 --> 00:30:09.870
But they don't pass that.

00:30:09.870 --> 00:30:12.510
So that tells you how
much it could go up by.

00:30:12.510 --> 00:30:19.070
So I guess that Weyl is giving
us a less than or equal here.

00:30:19.070 --> 00:30:22.050
Less or equal to lambda 1--

00:30:22.050 --> 00:30:24.450
so I'm taking i to be 1--

00:30:24.450 --> 00:30:27.320
plus theta.

00:30:27.320 --> 00:30:32.590
Yeah, so that inequality
I've written down there--

00:30:32.590 --> 00:30:37.020
there's some playing around
to do to get practice.

00:30:37.020 --> 00:30:44.310
And it's not so essential for
us to be like world grandmasters

00:30:44.310 --> 00:30:48.030
at this thing, but
you should see it.

00:30:48.030 --> 00:30:52.010
And you should also
see j equal to 2.

00:30:52.010 --> 00:30:55.230
Why will j equal to
2 tell us something?

00:30:55.230 --> 00:30:58.120
I hope it will.

00:30:58.120 --> 00:31:00.310
Let's see what it tells us.

00:31:00.310 --> 00:31:04.720
Lambda i plus 1 now-- j is 2--

00:31:04.720 --> 00:31:12.640
of S plus T. So it's less
than or equal to lambda i of S

00:31:12.640 --> 00:31:19.480
plus lambda 2 of T. I
think that's interesting.

00:31:19.480 --> 00:31:34.280
And also, I think I also could
get lambda i plus i minus 1.

00:31:34.280 --> 00:31:37.570
Let me write it and
see if it's correct.

00:31:37.570 --> 00:31:40.755
Plus lambda i minus 1.

00:31:40.755 --> 00:31:43.690
So those would add up to i plus 2.

00:31:43.690 --> 00:31:51.120
Yeah, I guess lambda i
plus 1 plus lambda 1 of T.

00:31:51.120 --> 00:31:55.050
That's what I got by taking--

00:31:55.050 --> 00:31:57.360
yeah, did I do that right?

00:32:00.480 --> 00:32:03.696
I'm taking j equal to 1.

00:32:03.696 --> 00:32:07.480
No, well, I don't
think I got it right.

00:32:10.520 --> 00:32:15.080
What do I want to do here to
get a bound on lambda i plus 1?

00:32:15.080 --> 00:32:16.610
I want to take j equal to 2.

00:32:16.610 --> 00:32:23.210
I should just be sensible
and plug in j equal to 2

00:32:23.210 --> 00:32:24.270
and i equal to 1.

00:32:29.510 --> 00:32:35.480
All I want to say is that Weyl's
inequality is the great fact

00:32:35.480 --> 00:32:38.390
out of which all this
interlacing falls

00:32:38.390 --> 00:32:42.650
and more and more, because
the interlacing is telling me

00:32:42.650 --> 00:32:46.020
about neighbors.

00:32:46.020 --> 00:32:50.790
And actually if I use Weyl for i
and j, different i's and j's, I

00:32:50.790 --> 00:32:56.166
even learn about ones
that are not neighbors.
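
NOTE
Editor's sketch, not part of the lecture: Weyl's inequality in its usual form says
lambda_{i+j-1}(S + T) <= lambda_i(S) + lambda_j(T) for symmetric S and T, with
eigenvalues in decreasing order and i + j - 1 <= n. The random matrices and helper
names below are assumptions for illustration only. The loop tries every allowed pair.
```python
import numpy as np
rng = np.random.default_rng(1)
n = 6
def random_symmetric(size):
    # Hypothetical helper: symmetrize a random square matrix.
    B = rng.standard_normal((size, size))
    return (B + B.T) / 2
def eig_desc(M):
    # Eigenvalues of a symmetric matrix, largest first.
    return np.linalg.eigvalsh(M)[::-1]
S = random_symmetric(n)
T = random_symmetric(n)
lam_S, lam_T, lam_ST = eig_desc(S), eig_desc(T), eig_desc(S + T)
# i and j are 1-based as in the lecture; the arrays are 0-based.
violations = [
    (i, j)
    for i in range(1, n + 1)
    for j in range(1, n + 2 - i)
    if lam_ST[i + j - 2] > lam_S[i - 1] + lam_T[j - 1] + 1e-10
]
print(violations)
```
An empty list means no pair (i, j) broke the bound, including pairs that are not neighbors.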

00:32:56.166 --> 00:33:00.600
And I could tell you a
proof of Weyl's inequality.

00:33:00.600 --> 00:33:02.340
But I'll save that
for the notes.

00:33:07.310 --> 00:33:09.100
So I think maybe
that's what I want

00:33:09.100 --> 00:33:14.240
to do about interlacing, just
to say what the notes have,

00:33:14.240 --> 00:33:17.550
but not repeat it all in class.

00:33:17.550 --> 00:33:24.230
So the notes have actually two
ways to prove this interlacing.

00:33:24.230 --> 00:33:27.830
The standard way that every
mathematician would use

00:33:27.830 --> 00:33:30.990
would be Weyl's inequality.

00:33:30.990 --> 00:33:36.600
But last year,
Professor Rao, visiting,

00:33:36.600 --> 00:33:42.060
found a nice argument
that's also in the notes.

00:33:42.060 --> 00:33:43.700
It ends up with a graph.

00:33:43.700 --> 00:33:47.540
And on that graph, you
can see that this is true.

00:33:47.540 --> 00:33:55.530
So for what it's worth, two
approaches to this interlacing

00:33:55.530 --> 00:33:58.440
and some examples.

00:33:58.440 --> 00:34:01.710
But I really don't
want to spend our lives

00:34:01.710 --> 00:34:05.000
on this eigenvalue topic.

00:34:05.000 --> 00:34:08.940
It's a beautiful fact
about symmetric matrices

00:34:08.940 --> 00:34:12.510
and the corresponding fact
is true for singular values

00:34:12.510 --> 00:34:18.270
of any matrix, but let's
think of leaving it there.

00:34:21.150 --> 00:34:28.670
So now, I'm moving on
to the new section.

00:34:28.670 --> 00:34:30.710
The new section
involves something

00:34:30.710 --> 00:34:31.897
called compressed sensing.

00:34:31.897 --> 00:34:33.605
I don't know if you've
heard those words.

00:34:45.949 --> 00:34:52.699
So these are all topics in
Section 4.4, which you have.

00:34:52.699 --> 00:34:55.880
I think we sent it out
10 days ago probably.

00:34:58.660 --> 00:35:04.000
OK, so first let me remember
what the nuclear norm is

00:35:04.000 --> 00:35:06.320
of a matrix.

00:35:06.320 --> 00:35:19.635
The nuclear norm of a matrix is
the sum of the singular values,

00:35:19.635 --> 00:35:22.460
the sum of the singular values.

00:35:22.460 --> 00:35:29.170
So it's like the L1
norm for a vector.

00:35:29.170 --> 00:35:32.080
That's a right way
to think about it.
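
NOTE
Editor's sketch, not part of the lecture: the nuclear norm computed two ways, next to
the L1 norm of a vector. The matrix A and vector v are made-up examples.
np.linalg.norm with ord="nuc" is NumPy's built-in nuclear norm.
```python
import numpy as np
A = np.array([[3.0, 0.0], [4.0, 5.0]])
# Sum of the singular values, straight from the definition.
sigma = np.linalg.svd(A, compute_uv=False)
nuclear_from_svd = sigma.sum()
# NumPy's built-in nuclear norm should agree.
nuclear_builtin = np.linalg.norm(A, ord="nuc")
# The vector analogue: L1 norm = sum of absolute values of the entries.
v = np.array([3.0, -4.0, 0.0])
l1 = np.abs(v).sum()
print(nuclear_from_svd, nuclear_builtin, l1)
```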

00:35:32.080 --> 00:35:34.030
And do you remember
what was special?

00:35:34.030 --> 00:35:38.230
We've talked about
using the L1 norm.

00:35:38.230 --> 00:35:42.610
It has this special property
that the ordinary L2

00:35:42.610 --> 00:35:45.190
norm absolutely does not have.

00:35:45.190 --> 00:35:48.070
What was special
about the L1 norm?

00:35:48.070 --> 00:35:56.080
If I minimize the L1 norm with
some constraint, like Ax equal

00:35:56.080 --> 00:36:01.320
b, what's special about the
solution, the minimum in the L1

00:36:01.320 --> 00:36:02.040
norm?

00:36:02.040 --> 00:36:02.910
AUDIENCE: Sparse.

00:36:02.910 --> 00:36:04.160
GILBERT STRANG: Sparse, right.

00:36:04.160 --> 00:36:06.920
Sparse.

00:36:06.920 --> 00:36:10.700
So this is moving
us up to matrices.

00:36:10.700 --> 00:36:13.670
And that's where compressed
sensing comes in.

00:36:13.670 --> 00:36:16.230
Matrix completion comes in.

00:36:16.230 --> 00:36:20.580
So matrix completion
would just be--

00:36:20.580 --> 00:36:23.270
I mentioned-- so
this is completion.

00:36:26.120 --> 00:36:28.590
And I'll remember
the words Netflix,

00:36:28.590 --> 00:36:31.910
which made the problem famous.

00:36:31.910 --> 00:36:44.140
So I have the matrix A, 3, 2,
question mark, question mark,

00:36:44.140 --> 00:36:47.310
question mark, 1, 4,
6, question mark--

00:36:53.390 --> 00:36:55.780
missing data.

00:36:55.780 --> 00:36:59.650
And so I have to put
in something there,

00:36:59.650 --> 00:37:03.000
because if I don't put in
anything, then the numbers

00:37:03.000 --> 00:37:07.930
I do know are useless,
because no row or no column

00:37:07.930 --> 00:37:10.100
is complete.

00:37:10.100 --> 00:37:12.100
So I just would give up.

00:37:12.100 --> 00:37:14.710
Somebody that sent
me the data, 3 and 2

00:37:14.710 --> 00:37:20.050
and didn't tell me a
ranking for the third movie,

00:37:20.050 --> 00:37:22.780
I'd have to say,
well, I can't use it.

00:37:22.780 --> 00:37:24.010
That's not possible.

00:37:24.010 --> 00:37:28.990
So we need to think about what to put there.

00:37:28.990 --> 00:37:36.790
And the idea is that the numbers
that minimize the nuclear norm

00:37:36.790 --> 00:37:40.180
are a good choice,
a good choice.
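
NOTE
Editor's sketch, not part of the lecture: a toy matrix completion. The 2 by 2 matrix
below is a hypothetical example (not the matrix on the board) with one missing entry x.
The code scans candidate values of x and keeps the one with the smallest nuclear norm.
This example was chosen so that the minimizer also gives a rank-one completion;
that does not happen for every choice of known entries.
```python
import numpy as np
def completed(x):
    # Hypothetical data: three known entries, x fills the missing one.
    return np.array([[2.0, 1.0], [2.0, x]])
# Brute-force scan over candidate fill-in values (a grid search, not a real solver).
candidates = np.linspace(-3.0, 3.0, 601)
nuclear = np.array([np.linalg.norm(completed(x), ord="nuc") for x in candidates])
best_x = candidates[np.argmin(nuclear)]
best_rank = np.linalg.matrix_rank(completed(best_x), tol=1e-8)
print(best_x, best_rank)
```
A grid search only works for one or two unknowns; larger problems need a convex optimization solver.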

00:37:40.180 --> 00:37:46.540
So that's just a connection here
that we will say more about,

00:37:46.540 --> 00:37:49.300
but not--

00:37:49.300 --> 00:37:52.240
we could have a whole
course in compressed sensing

00:37:52.240 --> 00:37:53.470
and nuclear norm.

00:37:53.470 --> 00:37:59.470
Professor Parrilo in course
6 is an expert on this.

00:38:03.450 --> 00:38:06.980
But you see the point that--

00:38:06.980 --> 00:38:18.900
so you remember v1
came from the 0 norm.

00:38:21.530 --> 00:38:23.585
And what is the 0
norm of the vector?

00:38:26.940 --> 00:38:27.860
Well, it's not a norm.

00:38:27.860 --> 00:38:31.760
So you could say,
forget it, no answer.

00:38:31.760 --> 00:38:34.880
But what do we
symbolically mean when

00:38:34.880 --> 00:38:38.310
I write the 0 norm of a vector?

00:38:38.310 --> 00:38:41.070
I mean the number of....?

00:38:41.070 --> 00:38:42.540
Non-zeros.

00:38:42.540 --> 00:38:44.520
The number of non-zeros.

00:38:44.520 --> 00:38:55.430
This was the number of
non-zeros in the vector, in v.

00:38:55.430 --> 00:39:03.720
But it's not a norm, because
if I take 2 times the vector,

00:39:03.720 --> 00:39:07.330
I have the same number
of non-zeros, same norm.

00:39:07.330 --> 00:39:09.662
I can't have the norm
of 2v equal the norm

00:39:09.662 --> 00:39:15.890
of v. That would blow away
all the properties of norms.

00:39:15.890 --> 00:39:17.920
So v0 is not a norm.
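
NOTE
Editor's sketch, not part of the lecture: doubling a vector leaves its count of
non-zeros unchanged, while the L1 norm doubles as a norm must. The vector v is a
made-up example.
```python
import numpy as np
v = np.array([0.0, 3.0, 0.0, -1.0])
# The "0 norm": number of non-zero entries. Scaling by 2 does not change it.
nnz_v = np.count_nonzero(v)
nnz_2v = np.count_nonzero(2 * v)
# The L1 norm scales properly: the norm of 2v is twice the norm of v.
l1_v = np.abs(v).sum()
l1_2v = np.abs(2 * v).sum()
print(nnz_v, nnz_2v, l1_v, l1_2v)
```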

00:39:17.920 --> 00:39:21.500
And then we move it to that sort
of appropriate nearest norm.

00:39:21.500 --> 00:39:23.260
And we get v1.

00:39:23.260 --> 00:39:27.320
We get the L1 norm,
which is the sum of--

00:39:27.320 --> 00:39:32.474
everybody remembers that this is
the sum of the absolute values of the vi.

00:39:32.474 --> 00:39:37.950
And you remember my pictures
of diamonds touching

00:39:37.950 --> 00:39:41.400
planes at sharp points.

00:39:41.400 --> 00:39:44.430
Well, that's what
is going on here.

00:39:44.430 --> 00:39:48.660
That problem was
called basis pursuit.

00:39:48.660 --> 00:39:51.600
And it comes back
again in this section.

00:39:54.120 --> 00:40:01.570
So I minimize this norm
subject to the conditions.
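
NOTE
Editor's sketch, not part of the lecture: basis pursuit on a tiny problem, minimize the
L1 norm of x subject to Ax = b. It relies on a standard linear programming fact: when A
has full row rank m, some minimizer has at most m non-zeros, so for a small problem it is
enough to try every set of m columns. The function name and data are hypothetical, and
this brute force is a stand-in for a real LP solver.
```python
import itertools
import numpy as np
def basis_pursuit_bruteforce(A, b):
    """Minimize the L1 norm of x subject to A x = b by trying every set of m columns."""
    m, n = A.shape
    best_x, best_norm = None, np.inf
    for cols in itertools.combinations(range(n), m):
        sub = A[:, list(cols)]
        if abs(np.linalg.det(sub)) < 1e-12:
            continue  # dependent columns: this choice gives no basic solution
        x = np.zeros(n)
        x[list(cols)] = np.linalg.solve(sub, b)
        if np.abs(x).sum() < best_norm:
            best_x, best_norm = x, np.abs(x).sum()
    return best_x
# One equation, three unknowns: x1 + 2 x2 + 3 x3 = 6.
A = np.array([[1.0, 2.0, 3.0]])
b = np.array([6.0])
x_l1 = basis_pursuit_bruteforce(A, b)
# Minimum L2 norm solution for comparison (via the pseudoinverse).
x_l2 = np.linalg.pinv(A) @ b
print(x_l1, x_l2)
```
The L1 answer puts everything on one component (the diamond touches the plane at a
sharp point), while the L2 answer spreads over all three.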

00:40:01.570 --> 00:40:07.070
Now, I'm just going to take
a jump to the matrix case.

00:40:07.070 --> 00:40:09.920
What's my idea here?

00:40:09.920 --> 00:40:14.420
My idea is that for a
matrix, the nuclear norm

00:40:14.420 --> 00:40:15.440
comes from what?

00:40:18.830 --> 00:40:21.570
What's the norm that
we sort of start with,

00:40:21.570 --> 00:40:24.220
but it's not a norm?

00:40:24.220 --> 00:40:27.280
And when I sort of take the--

00:40:27.280 --> 00:40:34.000
because the requirements
for a norm don't fail--

00:40:34.000 --> 00:40:38.170
they fail for what I'm
about to write there.

00:40:38.170 --> 00:40:41.920
I could put A 0,
but I don't want

00:40:41.920 --> 00:40:44.740
the number of non-zero entries.

00:40:44.740 --> 00:40:47.280
That would be a good guess.

00:40:47.280 --> 00:40:50.480
And probably in some
sense it makes sense.

00:40:50.480 --> 00:40:53.960
But it's not the
answer I'm looking for.

00:40:53.960 --> 00:41:04.030
What do you think is the 0 norm
of a matrix that is not a norm,

00:41:04.030 --> 00:41:09.400
but when I pump it up to the
best, to the nearest good norm,

00:41:09.400 --> 00:41:11.710
I get the nuclear norm?

00:41:11.710 --> 00:41:14.500
So this is the question,
it's what is A0?

00:41:18.698 --> 00:41:22.160
And it's what?

00:41:22.160 --> 00:41:23.060
AUDIENCE: The rank.

00:41:23.060 --> 00:41:24.290
GILBERT STRANG: The rank.

00:41:24.290 --> 00:41:27.890
The rank of a matrix
is the equivalent.

00:41:32.630 --> 00:41:34.170
So I don't know about the zero.

00:41:34.170 --> 00:41:35.720
Nobody else calls it A0.

00:41:35.720 --> 00:41:37.460
So I better not.

00:41:37.460 --> 00:41:39.180
It's the rank.

00:41:39.180 --> 00:41:40.890
So again, the rank
is not a norm,

00:41:40.890 --> 00:41:43.820
because if I double the matrix,
I don't double the rank.
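
NOTE
Editor's sketch, not part of the lecture: doubling a matrix keeps its rank but doubles
its nuclear norm, which is why the rank cannot be a norm. The matrix A is a made-up
rank-one example.
```python
import numpy as np
A = np.array([[1.0, 2.0], [2.0, 4.0]])
# Rank is unchanged by scaling ...
rank_A = np.linalg.matrix_rank(A)
rank_2A = np.linalg.matrix_rank(2 * A)
# ... but the nuclear norm (sum of singular values) scales with the matrix.
nuc_A = np.linalg.norm(A, ord="nuc")
nuc_2A = np.linalg.norm(2 * A, ord="nuc")
print(rank_A, rank_2A, nuc_A, nuc_2A)
```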

00:41:46.600 --> 00:41:48.220
So I have to move to a norm.

00:41:48.220 --> 00:41:50.410
And it turns out to
be the nuclear norm.

00:41:50.410 --> 00:41:52.540
And now, I'll just,
with one minute,

00:41:52.540 --> 00:41:57.850
say it's the guess of some
people who are working hard

00:41:57.850 --> 00:42:02.920
to prove it, that
the deep learning

00:42:02.920 --> 00:42:07.120
algorithm of gradient
descent finds

00:42:07.120 --> 00:42:12.850
the solution to the minimum
problem in the nuclear norm.

00:42:12.850 --> 00:42:16.140
And we don't know if
that's true or not yet.

00:42:16.140 --> 00:42:25.540
For related examples, like
this thing, it's proved.

00:42:25.540 --> 00:42:31.870
For the exact problem of deep
learning, it's a conjecture.

00:42:31.870 --> 00:42:35.200
So that's what's in Section 4.4.

00:42:35.200 --> 00:42:38.860
But that word lasso, you
want to know what that is.

00:42:38.860 --> 00:42:41.140
Compressed sensing,
I'll say a word about.

00:42:41.140 --> 00:42:46.540
So that will be Monday after
Alex Townsend's lecture Friday.

00:42:46.540 --> 00:42:51.790
So he's coming to
speak to computational

00:42:51.790 --> 00:42:55.960
science students all over
MIT tomorrow afternoon.

00:42:55.960 --> 00:43:00.100
I'll certainly go
to that, but then he

00:43:00.100 --> 00:43:03.850
said he would come in and
take this class Friday.

00:43:03.850 --> 00:43:05.410
So I'll see you Friday.

00:43:05.410 --> 00:43:07.455
And he'll be here too.