WEBVTT

00:00:07.460 --> 00:00:09.910
OK.

00:00:09.910 --> 00:00:15.260
Here's lecture sixteen
and if you remember

00:00:15.260 --> 00:00:21.150
I ended up the last lecture
with this formula for what

00:00:21.150 --> 00:00:24.610
I called a projection matrix.

00:00:24.610 --> 00:00:31.610
And maybe I could just
recap for a minute what

00:00:31.610 --> 00:00:35.010
is that magic formula doing?

00:00:35.010 --> 00:00:38.390
For example, it's
supposed to be --

00:00:38.390 --> 00:00:40.230
it's supposed to
produce a projection,

00:00:40.230 --> 00:00:44.770
if I multiply by a b,
so I take P times b,

00:00:44.770 --> 00:00:51.240
I'm supposed to project that
vector b to the nearest point

00:00:51.240 --> 00:00:54.150
in the column space.

00:00:54.150 --> 00:00:54.800
OK.

00:00:54.800 --> 00:00:56.350
Can I just --

00:00:56.350 --> 00:01:01.420
one way to recap is to
take the two extreme cases.

00:01:01.420 --> 00:01:05.040
Suppose a vector b is
in the column space?

00:01:05.040 --> 00:01:10.030
Then what do I get when
I apply the projection P?

00:01:10.030 --> 00:01:13.610
So I'm projecting
into the column space

00:01:13.610 --> 00:01:18.780
but I'm starting with a vector
in this case that's already

00:01:18.780 --> 00:01:20.950
in the column
space, so of course

00:01:20.950 --> 00:01:25.860
when I project it I
get b again, right.

00:01:25.860 --> 00:01:29.860
And I want to show you how
that comes out of this formula.

00:01:29.860 --> 00:01:32.550
Let me do the other extreme.

00:01:32.550 --> 00:01:35.350
Suppose that vector is
perpendicular to the column

00:01:35.350 --> 00:01:36.210
space.

00:01:36.210 --> 00:01:38.860
So imagine this column
space as a plane

00:01:38.860 --> 00:01:42.550
and imagine b as sticking
straight up perpendicular

00:01:42.550 --> 00:01:43.490
to it.

00:01:43.490 --> 00:01:50.490
What's the nearest point in the
column space to b in that case?

00:01:50.490 --> 00:01:54.380
So what's the projection
onto the plane,

00:01:54.380 --> 00:01:57.980
the nearest point in the
plane, if the vector b that

00:01:57.980 --> 00:02:02.000
I'm looking at is -- got no
component in the column space,

00:02:02.000 --> 00:02:05.050
it's sticking completely
-- ninety degrees with it,

00:02:05.050 --> 00:02:10.220
then Pb should be zero, right.

00:02:10.220 --> 00:02:13.040
So those are the
two extreme cases.

00:02:13.040 --> 00:02:18.000
The average vector has a
component p in the column space

00:02:18.000 --> 00:02:20.930
and a component
perpendicular to it,

00:02:20.930 --> 00:02:25.930
and what the projection
does is it kills this part

00:02:25.930 --> 00:02:29.510
and it preserves this part.

00:02:29.510 --> 00:02:30.010
OK.

00:02:30.010 --> 00:02:32.230
Can we just see why that's true?

00:02:32.230 --> 00:02:37.140
Just -- that formula
ought to work.

00:02:37.140 --> 00:02:41.010
So let me start with this one.

00:02:41.010 --> 00:02:44.260
What vectors are in the -- are
perpendicular to the column

00:02:44.260 --> 00:02:45.240
space?

00:02:45.240 --> 00:02:48.220
How do I see that
I really get zero?

00:02:48.220 --> 00:02:50.850
I have to think, what does
it mean for a vector b

00:02:50.850 --> 00:02:54.410
to be perpendicular
to the column space?

00:02:54.410 --> 00:02:59.430
So if it's perpendicular
to all the columns,

00:02:59.430 --> 00:03:02.100
then it's in some other space.

00:03:02.100 --> 00:03:05.740
We've got our four spaces so
the reason I do this is it's

00:03:05.740 --> 00:03:10.030
perfectly using what we
know about our four spaces.

00:03:10.030 --> 00:03:13.860
What vectors are perpendicular
to the column space?

00:03:13.860 --> 00:03:19.190
Those are the guys in the
null space of A transpose,

00:03:19.190 --> 00:03:20.290
right?

00:03:20.290 --> 00:03:22.740
That's the first
section of this chapter,

00:03:22.740 --> 00:03:26.160
that's the key geometry
of these spaces.

00:03:26.160 --> 00:03:28.300
If I'm perpendicular
to the column space,

00:03:28.300 --> 00:03:30.961
I'm in the null
space of A transpose.

00:03:30.961 --> 00:03:31.460
OK.

00:03:31.460 --> 00:03:33.790
So if I'm in the null
space of A transpose,

00:03:33.790 --> 00:03:41.760
and I multiply this big formula
times b, so now I'm getting Pb,

00:03:41.760 --> 00:03:48.530
this is now the projection,
Pb, do you see that I get zero?

00:03:48.530 --> 00:03:50.470
Of course I get zero.

00:03:50.470 --> 00:03:52.950
Right at the end
there, A transpose b

00:03:52.950 --> 00:03:54.840
will give me zero right away.

00:03:54.840 --> 00:03:57.780
So that's why that zero's here.

00:03:57.780 --> 00:04:00.890
Because if I'm perpendicular
to the column space, then

00:04:00.890 --> 00:04:03.840
I'm in the null space of A
transpose and A transpose

00:04:03.840 --> 00:04:08.640
b is zilch.

00:04:08.640 --> 00:04:09.970
OK, what about
the other possibility.

00:04:09.970 --> 00:04:13.370
How do I see that this formula
gives me the right answer

00:04:13.370 --> 00:04:15.250
if b is in the column space?

00:04:18.230 --> 00:04:21.890
So what's a typical vector
in the column space?

00:04:21.890 --> 00:04:24.480
It's a combination
of the columns.

00:04:24.480 --> 00:04:27.240
How do I write a
combination of the columns?

00:04:27.240 --> 00:04:31.090
So tell me, how would
I write, you know,

00:04:31.090 --> 00:04:34.440
your everyday vector
that's in the column space?

00:04:34.440 --> 00:04:38.860
It would have the form
A times some x, right?

00:04:38.860 --> 00:04:42.280
That's what's in the column
space, A times something.

00:04:42.280 --> 00:04:44.730
That makes it a
combination of the columns.

00:04:44.730 --> 00:04:49.570
So these b's were in the
null space of A transpose.

00:04:49.570 --> 00:04:54.640
These guys in the column
space, those b's are Ax-s.

00:04:54.640 --> 00:04:55.210
Right?

00:04:55.210 --> 00:04:58.825
If b is in the column space
then it has the form Ax.

00:05:01.380 --> 00:05:04.450
I'm going to stick that on the
quiz or the final for sure.

00:05:04.450 --> 00:05:08.290
That you have to realize --
because we've said it like

00:05:08.290 --> 00:05:13.020
a thousand times that the things
in the column space are vectors

00:05:13.020 --> 00:05:14.241
A times x.

00:05:14.241 --> 00:05:14.740
OK.

00:05:14.740 --> 00:05:18.050
And do you see what happens
now if we use our formula?

00:05:18.050 --> 00:05:19.990
There's an A transpose A.

00:05:19.990 --> 00:05:21.860
Gets canceled by its inverse.

00:05:21.860 --> 00:05:25.800
We're left with an A times x.

00:05:25.800 --> 00:05:27.530
So the result was Ax.

00:05:27.530 --> 00:05:28.570
Which was b.

00:05:28.570 --> 00:05:30.100
Do you see that it works?

00:05:30.100 --> 00:05:32.750
This is that whole business.

00:05:32.750 --> 00:05:35.870
Cancel, cancel, leaving Ax.

00:05:35.870 --> 00:05:37.840
And Ax was b.

00:05:37.840 --> 00:05:43.730
So that turned out to
be b, in this case.
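
NOTE
A minimal sketch of the two extreme cases, not part of the spoken lecture.
It assumes a small example matrix A with columns (1,1,1) and (1,2,3) and
uses exact fractions; the helper names are made up for illustration.
```python
from fractions import Fraction as F
# Assumed example: A has columns (1,1,1) and (1,2,3).
A = [[F(1), F(1)], [F(1), F(2)], [F(1), F(3)]]
def transpose(M):
    return [list(row) for row in zip(*M)]
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]
def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]
At = transpose(A)
# P = A (A^T A)^{-1} A^T
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)
b_in = matvec(A, [F(2), F(-1)])  # b = Ax, so b is in the column space
b_perp = [F(1), F(-2), F(1)]     # A^T b = 0, so b is perpendicular to it
print(matvec(P, b_in) == b_in)   # P leaves b alone
print(matvec(P, b_perp))         # P sends b to the zero vector
```
With these choices the first check should hold and the second product
should be all zeros, matching the cancel-cancel argument above.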

00:05:43.730 --> 00:05:51.300
OK, so geometrically what we're
seeing is we're taking a vector

00:05:51.300 --> 00:05:53.010
--

00:05:53.010 --> 00:06:00.770
we've got the column space
and perpendicular to that

00:06:00.770 --> 00:06:06.350
is the null space
of A transpose.

00:06:06.350 --> 00:06:10.230
And our typical
vector b is out here.

00:06:10.230 --> 00:06:12.970
There's zero, so there's
our typical vector b,

00:06:12.970 --> 00:06:19.460
and what we're doing is we're
projecting it to p. And the --

00:06:19.460 --> 00:06:22.300
and of course at the same time
we're finding the other part

00:06:22.300 --> 00:06:24.810
of it which is e.

00:06:24.810 --> 00:06:30.520
So the two pieces, the
projection piece and the error

00:06:30.520 --> 00:06:35.280
piece, add up to the original b.

00:06:35.280 --> 00:06:36.230
OK.

00:06:36.230 --> 00:06:39.520
That's like what
our matrix does.

00:06:39.520 --> 00:06:41.440
So this is p --

00:06:41.440 --> 00:06:48.260
p is -- this p is Ab, sorry
-- is Pb, it's the projection,

00:06:48.260 --> 00:06:52.590
applied to b, and this one is --

00:06:52.590 --> 00:06:55.080
OK, that's a projection too.

00:06:55.080 --> 00:06:58.070
That's a projection
down onto that space.

00:06:58.070 --> 00:06:59.860
What's a good formula for it?

00:06:59.860 --> 00:07:05.340
Suppose I ask you for the
projection of the projection

00:07:05.340 --> 00:07:08.830
matrix onto the --

00:07:08.830 --> 00:07:13.240
this space, this
perpendicular space?

00:07:13.240 --> 00:07:16.960
So if this projection
was P, what's

00:07:16.960 --> 00:07:21.070
the projection that gives me e?

00:07:21.070 --> 00:07:24.170
It's the -- what I want is to
get the rest of the vector,

00:07:24.170 --> 00:07:30.790
so it'll be just I minus P times
b, that's a projection too.

00:07:30.790 --> 00:07:35.880
That's the projection onto
the perpendicular space.

00:07:38.950 --> 00:07:40.040
OK.

00:07:40.040 --> 00:07:44.150
So if P's a projection, I
minus P is a projection.

00:07:44.150 --> 00:07:47.790
If P is symmetric, I
minus P is symmetric.

00:07:47.790 --> 00:07:52.290
If P squared equals P, then I
minus P, squared, equals I minus

00:07:52.290 --> 00:07:55.690
P. It's just --

00:07:55.690 --> 00:08:00.460
the algebra -- is only
doing what your --

00:08:00.460 --> 00:08:05.270
picture is completely
telling you.
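
NOTE
A small check of the claim about I minus P, offered as a sketch rather
than as part of the lecture. It assumes the same example matrix A with
columns (1,1,1) and (1,2,3); Q is a made-up name for I - P.
```python
from fractions import Fraction as F
# Assumed example: A has columns (1,1,1) and (1,2,3).
A = [[F(1), F(1)], [F(1), F(2)], [F(1), F(3)]]
def transpose(M):
    return [list(row) for row in zip(*M)]
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]
def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]
At = transpose(A)
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)
I = [[F(int(i == j)) for j in range(3)] for i in range(3)]
Q = [[I[i][j] - P[i][j] for j in range(3)] for i in range(3)]  # Q = I - P
print(Q == transpose(Q))   # symmetric
print(matmul(Q, Q) == Q)   # (I - P) squared equals I - P
```
Both checks should come out true for this A, as the picture suggests.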

00:08:05.270 --> 00:08:08.122
But the algebra leads
to this expression.

00:08:11.820 --> 00:08:16.280
That expression for P given --

00:08:16.280 --> 00:08:19.810
given a basis for
the subspace, given

00:08:19.810 --> 00:08:25.690
the matrix A whose columns are
a basis for our column space.

00:08:25.690 --> 00:08:28.820
OK, that's recap because you
-- you need to see that formula

00:08:28.820 --> 00:08:30.460
more than once.

00:08:30.460 --> 00:08:34.669
And now can I pick
up on using it?

00:08:34.669 --> 00:08:37.789
So now -- and the --

00:08:37.789 --> 00:08:46.590
it's like, let me do that again,
I'll go right through a problem

00:08:46.590 --> 00:08:52.470
that I started at the end, which
is find a best straight line.

00:08:52.470 --> 00:08:53.820
You remember that problem, I --

00:08:53.820 --> 00:08:57.530
I picked a particular
set of points,

00:08:57.530 --> 00:09:00.820
they weren't specially
brilliant, t equal one,

00:09:00.820 --> 00:09:07.190
two, three, the heights were
one, two, and then two again.

00:09:07.190 --> 00:09:10.570
So they were -- heights
were that point, that point,

00:09:10.570 --> 00:09:13.320
which makes it look like I've
got a nice forty-five-degree

00:09:13.320 --> 00:09:18.800
line -- but then the third
point didn't lie on the line.

00:09:18.800 --> 00:09:22.500
And I wanted to find
the best straight line.

00:09:22.500 --> 00:09:26.404
So I'm looking for the
-- this line, y=C+Dt.

00:09:30.850 --> 00:09:35.110
And it's not going to go
through all three points,

00:09:35.110 --> 00:09:37.880
because no line goes
through all three points.

00:09:37.880 --> 00:09:42.190
So I'm going to pick
the best line, the --

00:09:42.190 --> 00:09:45.990
the best being the one that
makes the overall error

00:09:45.990 --> 00:09:48.430
as small as I can make it.

00:09:48.430 --> 00:09:52.390
Now I have to tell you,
what is that overall error?

00:09:52.390 --> 00:10:01.600
And -- because that determines
what's the winning line.

00:10:01.600 --> 00:10:02.740
If we don't know --

00:10:02.740 --> 00:10:06.810
I mean we have to decide
what we mean by the error --

00:10:06.810 --> 00:10:12.940
and then we minimize and we find
the right -- the best C and D.

00:10:12.940 --> 00:10:18.100
So if I went through this --
if I went through that point,

00:10:18.100 --> 00:10:18.600
OK.

00:10:18.600 --> 00:10:20.688
I would solve the
equation C+D=1.

00:10:23.680 --> 00:10:26.310
Because at t equal to one --

00:10:26.310 --> 00:10:30.050
I'd have C plus D, and
it would come out right.

00:10:30.050 --> 00:10:34.310
If it went through this point,
I'd have C plus two D equal to

00:10:34.310 --> 00:10:35.150
two.

00:10:35.150 --> 00:10:38.990
Because at t equal to two, I
would like to get the answer

00:10:38.990 --> 00:10:39.550
two.

00:10:39.550 --> 00:10:43.950
At the third point, I have
C plus three D because t is

00:10:43.950 --> 00:10:47.160
three, but the -- the
answer I'm shooting for is

00:10:47.160 --> 00:10:49.850
two again.

00:10:49.850 --> 00:10:52.680
So those are my three equations.

00:10:52.680 --> 00:10:55.720
And they don't have a solution.

00:10:55.720 --> 00:10:58.110
But they've got a best solution.

00:10:58.110 --> 00:10:59.890
What do I mean by best solution?

00:10:59.890 --> 00:11:04.220
So let me take time out to
remember what I'm talking

00:11:04.220 --> 00:11:06.550
about for best solution.

00:11:06.550 --> 00:11:11.960
So this is my equation Ax=b.

00:11:11.960 --> 00:11:18.440
A is this matrix, one,
one, one, one, two, three.

00:11:18.440 --> 00:11:22.580
x is my -- only have
two unknowns, C and D,

00:11:22.580 --> 00:11:27.440
and b is my right-hand
side, one, two, two.

00:11:27.440 --> 00:11:27.940
OK.

00:11:31.930 --> 00:11:34.670
No solution.

00:11:34.670 --> 00:11:37.300
Three eq- I have a
three by two matrix,

00:11:37.300 --> 00:11:40.200
I do have two
independent columns --

00:11:40.200 --> 00:11:42.540
so I do have a basis
for the column space,

00:11:42.540 --> 00:11:44.630
those two columns
are independent,

00:11:44.630 --> 00:11:46.420
they're a basis for
the column space,

00:11:46.420 --> 00:11:52.370
but the column space
doesn't include that vector.

00:11:52.370 --> 00:11:57.600
So best possible in this --

00:11:57.600 --> 00:12:01.540
what would best possible mean?

00:12:01.540 --> 00:12:05.750
The way that comes out to
linear equations is I --

00:12:05.750 --> 00:12:13.970
I want to minimize
the sum of these --

00:12:13.970 --> 00:12:15.457
I'm going to make an error here.

00:12:15.457 --> 00:12:16.790
I'm going to make an error here.

00:12:16.790 --> 00:12:18.950
I'm going to make
an error there.

00:12:18.950 --> 00:12:24.970
And I'm going to square
and add up those errors.

00:12:24.970 --> 00:12:26.720
So it's a sum of squares.

00:12:26.720 --> 00:12:30.750
It's a least squares
solution I'm looking for.

00:12:30.750 --> 00:12:37.980
So if I -- those errors are the
difference between Ax and b.

00:12:37.980 --> 00:12:40.320
That's what I want
to make small.

00:12:40.320 --> 00:12:42.951
And the way I'm measuring
this -- this is a vector,

00:12:42.951 --> 00:12:43.450
right?

00:12:43.450 --> 00:12:45.480
This is e1, e2, e3.

00:12:45.480 --> 00:12:49.090
The Ax-b, this is the e.

00:12:49.090 --> 00:12:50.890
The error vector.

00:12:50.890 --> 00:12:55.890
And small means its length.

00:12:55.890 --> 00:12:57.662
The length of that vector.

00:12:57.662 --> 00:12:59.370
That's what I'm going
to try to minimize.

00:12:59.370 --> 00:13:04.280
And it's convenient to square.

00:13:04.280 --> 00:13:06.920
If I make something
small, I make --

00:13:09.620 --> 00:13:12.320
this is a never negative
quantity, right?

00:13:12.320 --> 00:13:13.690
The length of that vector.

00:13:16.760 --> 00:13:20.040
The length will be zero
exactly when the --

00:13:20.040 --> 00:13:21.990
when I have the
zero vector here.

00:13:21.990 --> 00:13:26.620
That's exactly the case
when I can solve exactly,

00:13:26.620 --> 00:13:29.730
b is in the column
space, all great.

00:13:29.730 --> 00:13:31.820
But I'm not in that case now.

00:13:31.820 --> 00:13:34.070
I'm going to have
an error vector, e.

00:13:34.070 --> 00:13:35.815
What's this error
vector in my picture?

00:13:38.340 --> 00:13:42.030
I guess what I'm trying
to say is there's --

00:13:42.030 --> 00:13:45.270
there's two pictures
of what's going on.

00:13:45.270 --> 00:13:47.540
There's two pictures
of what's going on.

00:13:47.540 --> 00:13:50.900
One picture is --

00:13:50.900 --> 00:13:55.020
this is the three
points and the line.

00:13:55.020 --> 00:14:00.220
And in that picture, what
are the three errors?

00:14:00.220 --> 00:14:03.480
The three errors are what
I miss by in this equation.

00:14:03.480 --> 00:14:05.150
So it's this --

00:14:05.150 --> 00:14:06.740
this little bit here.

00:14:06.740 --> 00:14:08.950
That vertical distance
up to the line.

00:14:08.950 --> 00:14:12.780
There's one -- sorry there's
one, and there's C plus D.

00:14:12.780 --> 00:14:14.700
And it's that difference.

00:14:14.700 --> 00:14:17.720
Here's two and here's C+2D.

00:14:17.720 --> 00:14:20.600
So vertically it's
that distance --

00:14:20.600 --> 00:14:23.620
that little error there is e1.

00:14:23.620 --> 00:14:26.220
This little error here is e2.

00:14:26.220 --> 00:14:30.540
This little error
coming up is e3.

00:14:30.540 --> 00:14:32.350
e3.

00:14:32.350 --> 00:14:35.240
And what's my overall error?

00:14:35.240 --> 00:14:43.240
It's e1 squared plus e2
squared plus e3 squared.

00:14:43.240 --> 00:14:44.920
That's what I'm
trying to make small.
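
NOTE
A short sketch of the quantity being minimized, using the three data
points from the lecture (t = 1, 2, 3 with heights 1, 2, 2). The function
name squared_error is an assumption made for illustration.
```python
from fractions import Fraction as F
t = [1, 2, 3]
b = [1, 2, 2]
def squared_error(C, D):
    # e1^2 + e2^2 + e3^2 for the line y = C + D t
    return sum((C + D * ti - bi) ** 2 for ti, bi in zip(t, b))
print(squared_error(0, 1))              # the 45-degree line misses only point three
print(squared_error(F(2, 3), F(1, 2)))  # the line found later in this lecture
```
The second value should be smaller than the first, since that line is
the least squares winner.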

00:14:44.920 --> 00:14:54.090
I -- some statisticians -- this
is a big part of statistics,

00:14:54.090 --> 00:14:56.360
fitting straight lines is
a big part of science --

00:14:56.360 --> 00:15:00.310
and specifically statistics,
where the right word to use

00:15:00.310 --> 00:15:02.210
would be regression.

00:15:02.210 --> 00:15:05.270
I'm doing regression here.

00:15:05.270 --> 00:15:06.145
Linear regression.

00:15:09.840 --> 00:15:12.820
And I'm using this
sum of squares

00:15:12.820 --> 00:15:15.270
as the measure of error.

00:15:15.270 --> 00:15:21.190
Again, some statisticians
would be -- they would say, OK,

00:15:21.190 --> 00:15:24.000
I'll solve that problem
because it's the clean problem.

00:15:24.000 --> 00:15:27.080
It leads to a beautiful
linear system.

00:15:27.080 --> 00:15:30.340
But they would be a little
careful about these squares,

00:15:30.340 --> 00:15:32.670
for -- in this case.

00:15:32.670 --> 00:15:35.990
If one of these
points was way off.

00:15:35.990 --> 00:15:39.040
Suppose I had a measurement at
t equal zero that was way off.

00:15:41.560 --> 00:15:44.880
Well, would the straight line,
would the best line be the same

00:15:44.880 --> 00:15:46.890
if I had this fourth point?

00:15:46.890 --> 00:15:50.180
Suppose I have this
fourth data point.

00:15:50.180 --> 00:15:54.880
No, certainly the line would --

00:15:54.880 --> 00:15:57.620
it wouldn't be the -- that
wouldn't be the best line.

00:15:57.620 --> 00:16:01.100
Because that line would
have a giant error --

00:16:01.100 --> 00:16:04.820
and when I squared it it
would be like way out of sight

00:16:04.820 --> 00:16:06.860
compared to the others.

00:16:06.860 --> 00:16:14.280
So this would be called by
statisticians an outlier,

00:16:14.280 --> 00:16:17.830
and they would not be happy to
see the whole problem turned

00:16:17.830 --> 00:16:21.150
topsy-turvy by this one outlier,
which could be a mistake,

00:16:21.150 --> 00:16:22.760
after all.

00:16:22.760 --> 00:16:26.500
So they wouldn't -- so they
wouldn't like maybe squaring,

00:16:26.500 --> 00:16:29.940
if there were outliers, they
would want to identify them.

00:16:29.940 --> 00:16:30.440
OK.

00:16:30.440 --> 00:16:35.800
I'm not going to --

00:16:35.800 --> 00:16:40.870
I don't want to suggest that
least squares isn't used,

00:16:40.870 --> 00:16:44.790
it's the most used, but
it's not exclusively used

00:16:44.790 --> 00:16:47.040
because it's a little --

00:16:47.040 --> 00:16:50.000
overcompensates for outliers.

00:16:50.000 --> 00:16:51.500
Because of that squaring.
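
NOTE
A rough illustration of how one outlier can swing the least squares
line. The fourth point (t = 0, height 10) is a made-up value, not from
the lecture, and fit_line is a hypothetical helper that solves the
two-by-two least squares equations derived later in this lecture.
```python
from fractions import Fraction as F
def fit_line(ts, bs):
    # Solve [[n, St], [St, Stt]] [C, D] = [Sb, Stb] for the line y = C + D t.
    n = len(ts)
    St = sum(ts)
    Stt = sum(ti * ti for ti in ts)
    Sb = sum(bs)
    Stb = sum(ti * bi for ti, bi in zip(ts, bs))
    det = n * Stt - St * St
    C = F(Stt * Sb - St * Stb, det)
    D = F(n * Stb - St * Sb, det)
    return C, D
print(fit_line([1, 2, 3], [1, 2, 2]))         # the three lecture points
print(fit_line([0, 1, 2, 3], [10, 1, 2, 2]))  # with the assumed outlier at t = 0
```
With this particular outlier the slope should even change sign, which is
the kind of topsy-turvy effect described above.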

00:16:51.500 --> 00:16:52.000
OK.

00:16:52.000 --> 00:16:54.300
So suppose we don't
have this guy,

00:16:54.300 --> 00:16:57.300
we just have these
three equations.

00:16:57.300 --> 00:17:01.680
And I want to make --
minimize this error.

00:17:01.680 --> 00:17:02.650
OK.

00:17:02.650 --> 00:17:08.069
Now, what I said is there's
two pictures to look at.

00:17:08.069 --> 00:17:10.940
One picture is this one.

00:17:10.940 --> 00:17:14.700
The three points, the best line.

00:17:14.700 --> 00:17:16.280
And the errors.

00:17:16.280 --> 00:17:20.760
Now, on this picture,
what are these points

00:17:20.760 --> 00:17:24.890
on the line, the points
that are really on the line?

00:17:24.890 --> 00:17:30.490
So they're -- points, let
me call them P1, P2, and P3,

00:17:30.490 --> 00:17:35.610
those are three numbers, so
this -- this height is P1,

00:17:35.610 --> 00:17:45.700
this height is P2, this height
is P3, and what are those guys?

00:17:45.700 --> 00:17:49.930
Suppose those were the
three values instead of --

00:17:49.930 --> 00:17:53.840
there's b1, ev- everybody's
seen all these -- sorry,

00:17:53.840 --> 00:17:57.090
my art is as usual
not the greatest,

00:17:57.090 --> 00:18:04.590
but there's the given b1, the
given b2, and the given b3.

00:18:04.590 --> 00:18:09.330
I promise not to put a single
letter more on that picture.

00:18:09.330 --> 00:18:10.050
OK.

00:18:10.050 --> 00:18:15.600
There's b1, P1 is the one on
the line, and e1 is the distance

00:18:15.600 --> 00:18:16.600
between.

00:18:16.600 --> 00:18:21.410
And same at point two
and same at point three.

00:18:21.410 --> 00:18:23.370
OK, so what's up?

00:18:23.370 --> 00:18:26.310
What's up with those Ps?

00:18:26.310 --> 00:18:29.930
P1, P2, P3, what are they?

00:18:29.930 --> 00:18:32.520
They're the components,
they lie on the line,

00:18:32.520 --> 00:18:33.720
right?

00:18:33.720 --> 00:18:38.420
They're the points
which if instead

00:18:38.420 --> 00:18:44.530
of one, two, two, which
were the b's, suppose I put

00:18:44.530 --> 00:18:47.230
P1, P2, P3 in here.

00:18:47.230 --> 00:18:50.150
I'll figure out in a minute
what those numbers are.

00:18:50.150 --> 00:18:53.040
But I just want to get the
picture of what I'm doing.

00:18:53.040 --> 00:18:56.390
If I put P1, P2, P3 in
those three equations,

00:18:56.390 --> 00:18:58.795
what would be good about
the three equations?

00:19:01.820 --> 00:19:03.820
I could solve them.

00:19:03.820 --> 00:19:06.420
A line goes through the Ps.

00:19:06.420 --> 00:19:10.400
So the P1, P2, P3 vector,
that's in the column

00:19:10.400 --> 00:19:11.320
space.

00:19:11.320 --> 00:19:14.480
That is a combination
of these columns.

00:19:14.480 --> 00:19:16.400
It's the closest combination.

00:19:16.400 --> 00:19:18.180
It's this picture.

00:19:18.180 --> 00:19:20.920
See, I've got the two
pictures like here's

00:19:20.920 --> 00:19:24.710
the picture that
shows the points, this

00:19:24.710 --> 00:19:28.240
is a picture in a
blackboard plane,

00:19:28.240 --> 00:19:34.310
here's a picture that's
showing the vectors.

00:19:34.310 --> 00:19:38.540
The vector b, which is in
this case, in this example

00:19:38.540 --> 00:19:42.090
is the vector one, two, two.

00:19:42.090 --> 00:19:47.940
The column space is in
this case spanned by the --

00:19:47.940 --> 00:19:49.720
well, you see A there.

00:19:49.720 --> 00:19:55.600
The column space of the matrix
one, one, one, one, two, three.

00:19:55.600 --> 00:20:01.540
And this picture shows
the nearest point.

00:20:01.540 --> 00:20:04.510
There's the -- that
point P1, P2, P3,

00:20:04.510 --> 00:20:08.050
which I'm going to compute
before the end of this hour,

00:20:08.050 --> 00:20:13.090
is the closest point
in the column space.

00:20:13.090 --> 00:20:13.780
OK.

00:20:13.780 --> 00:20:19.560
Let me -- I don't dare
leave it any longer --

00:20:19.560 --> 00:20:21.650
can I just compute it now.

00:20:21.650 --> 00:20:24.850
So I want to compute --

00:20:24.850 --> 00:20:28.800
find p. All right.

00:20:28.800 --> 00:20:39.250
Find p. Find x, which
is C and D, find p and P. OK.

00:20:39.250 --> 00:20:42.430
And I really should put
these little hats on

00:20:42.430 --> 00:20:49.830
to remind myself that they're
the estimates, the best line,

00:20:49.830 --> 00:20:51.970
not the perfect line.

00:20:51.970 --> 00:20:53.050
OK.

00:20:53.050 --> 00:20:54.330
OK.

00:20:54.330 --> 00:20:55.540
How do I proceed?

00:20:55.540 --> 00:20:58.340
Let's just run
through the mechanics.

00:20:58.340 --> 00:21:02.530
What's the equation for x?

00:21:02.530 --> 00:21:04.620
The -- or x hat.

00:21:04.620 --> 00:21:10.390
The equation for that is A
transpose A x hat equals A

00:21:10.390 --> 00:21:12.500
transpose x --

00:21:12.500 --> 00:21:14.105
A transpose b.

00:21:18.020 --> 00:21:19.530
The most --

00:21:19.530 --> 00:21:23.620
I'm -- will venture to call
that the most important equation

00:21:23.620 --> 00:21:26.350
in statistics.

00:21:26.350 --> 00:21:28.560
And in estimation.

00:21:28.560 --> 00:21:33.140
And whatever you're -- wherever
you've got error and noise this

00:21:33.140 --> 00:21:36.980
is the estimate
that you use first.

00:21:36.980 --> 00:21:37.500
OK.

00:21:37.500 --> 00:21:42.740
Whenever you're fitting
things by a few parameters,

00:21:42.740 --> 00:21:44.700
that's the equation to use.

00:21:44.700 --> 00:21:46.500
OK, let's solve it.

00:21:46.500 --> 00:21:47.970
What is A transpose A?

00:21:47.970 --> 00:21:50.580
So I have to figure out
what these matrices are.

00:21:50.580 --> 00:21:56.860
One, one, one, one, two, three
and one, one, one, one, two,

00:21:56.860 --> 00:22:04.490
three, that gives me some
matrix, that gives me

00:22:04.490 --> 00:22:12.510
a matrix, what do I get out of
that, three, six, six, and one

00:22:12.510 --> 00:22:15.720
and four and nine, fourteen.

00:22:15.720 --> 00:22:17.040
OK.

00:22:17.040 --> 00:22:21.830
And what do I expect to see in
that matrix and I do see it,

00:22:21.830 --> 00:22:25.210
just before I keep going
with the calculation?

00:22:25.210 --> 00:22:28.450
I expect that matrix
to be symmetric.

00:22:28.450 --> 00:22:30.565
I expect it to be invertible.

00:22:34.100 --> 00:22:36.300
And near the end
of the course I'm

00:22:36.300 --> 00:22:39.060
going to say I expect it
to be positive definite,

00:22:39.060 --> 00:22:45.590
but that's a future fact
about this crucial matrix,

00:22:45.590 --> 00:22:47.050
A transpose A.

00:22:47.050 --> 00:22:47.670
OK.

00:22:47.670 --> 00:22:50.880
And now let me
figure A transpose b.

00:22:50.880 --> 00:22:57.280
So let me -- can I tack on b as
an extra column here, one, two,

00:22:57.280 --> 00:22:59.950
two?

00:22:59.950 --> 00:23:04.770
And tack on the extra
A transpose b is --

00:23:04.770 --> 00:23:09.580
looks like five and one
and four and six, eleven.

00:23:13.770 --> 00:23:20.760
I think my equations are three
C plus six D equals five,

00:23:20.760 --> 00:23:29.700
and six D plus fourt- -- six C
plus fourteen D is eleven.

00:23:29.700 --> 00:23:33.090
Can I just for safety
see if I did that right?

00:23:33.090 --> 00:23:37.350
One, one, one times
one, two, two is five.

00:23:37.350 --> 00:23:40.630
One, two, three, that's
one, four and six, eleven.

00:23:40.630 --> 00:23:42.667
Looks good.

00:23:42.667 --> 00:23:43.625
These are my equations.

00:23:48.860 --> 00:23:52.000
That's my -- they're called
the normal equations.
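
NOTE
A quick sketch that recomputes A transpose A and A transpose b for the
lecture's matrix and right-hand side. The variable names are chosen for
illustration only.
```python
A = [[1, 1], [1, 2], [1, 3]]
b = [1, 2, 2]
At = [list(col) for col in zip(*A)]  # rows of A transpose
AtA = [[sum(x * y for x, y in zip(r, c)) for c in At] for r in At]
Atb = [sum(x * y for x, y in zip(r, b)) for r in At]
print(AtA)  # [[3, 6], [6, 14]]
print(Atb)  # [5, 11]
```
These are the coefficients in 3C + 6D = 5 and 6C + 14D = 11.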

00:23:54.610 --> 00:23:56.984
I'll just write that
word down because it --

00:24:02.800 --> 00:24:04.470
so I solve them.

00:24:04.470 --> 00:24:10.270
I solve that for C and
D. I would like to --

00:24:10.270 --> 00:24:13.130
before I solve them could I
do one thing that's on the --

00:24:13.130 --> 00:24:16.570
that's just above here?

00:24:16.570 --> 00:24:18.110
I would like to --

00:24:18.110 --> 00:24:21.470
I'd like to find these
equations from calculus.

00:24:21.470 --> 00:24:26.320
I'd like to find them from
this minimizing thing.

00:24:26.320 --> 00:24:28.010
So what's the first error?

00:24:28.010 --> 00:24:32.690
The first error is what I
missed by in the first equation.

00:24:32.690 --> 00:24:36.250
C plus D minus one squared.

00:24:36.250 --> 00:24:40.010
And the second error is what
I miss in the second equation.

00:24:40.010 --> 00:24:44.110
C plus two D minus two squared.

00:24:44.110 --> 00:24:52.350
And the third error squared is C
plus three D minus two squared.

00:24:52.350 --> 00:24:56.410
That's my -- overall squared
error that I'm trying

00:24:56.410 --> 00:24:58.040
to minimize.

00:24:58.040 --> 00:24:58.610
OK.

00:24:58.610 --> 00:25:08.910
So how would you minimize that?

00:25:08.910 --> 00:25:16.270
OK, linear algebra has given us
the equations for the minimum.

00:25:16.270 --> 00:25:20.750
But we could use calculus too.

00:25:20.750 --> 00:25:24.440
That's a function of
two variables, C and D,

00:25:24.440 --> 00:25:28.010
and we're looking
for the minimum.

00:25:28.010 --> 00:25:31.140
So how do we find it?

00:25:31.140 --> 00:25:35.160
Directly from calculus, we
take partial derivatives,

00:25:35.160 --> 00:25:37.510
right, we've got two
variables, C and D,

00:25:37.510 --> 00:25:40.900
so take the partial
derivative with respect to C

00:25:40.900 --> 00:25:44.560
and set it to zero, and
you'll get that equation.

00:25:44.560 --> 00:25:47.140
Take the partial
derivative with respect --

00:25:47.140 --> 00:25:51.570
I'm not going to write it
all out, just -- you will.

00:25:51.570 --> 00:25:56.370
The partial derivative with
respect to D, it -- you know,

00:25:56.370 --> 00:25:59.340
it's going to be linear,
that's the beauty of these

00:25:59.340 --> 00:26:03.220
squares, that if I have the
square of something and I take

00:26:03.220 --> 00:26:07.520
its derivative I get something
linear. And this is what I get.

00:26:07.520 --> 00:26:11.440
So this is the derivative of
the error with respect to C

00:26:11.440 --> 00:26:13.770
being zero, and this
is the derivative

00:26:13.770 --> 00:26:17.850
of the error with
respect to D being zero.
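
NOTE
The calculus step is not written out in the lecture. One way to fill it
in, with E(C, D) as an assumed name for the sum of squared errors:
```latex
\begin{aligned}
E(C,D) &= (C+D-1)^2 + (C+2D-2)^2 + (C+3D-2)^2 \\
\frac{\partial E}{\partial C} &= 2(C+D-1) + 2(C+2D-2) + 2(C+3D-2) = 0 \;\Longrightarrow\; 3C + 6D = 5 \\
\frac{\partial E}{\partial D} &= 2(C+D-1) + 4(C+2D-2) + 6(C+3D-2) = 0 \;\Longrightarrow\; 6C + 14D = 11
\end{aligned}
```
Dividing each derivative by two gives the same pair of equations that
came from A transpose A x hat equals A transpose b.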

00:26:17.850 --> 00:26:20.660
Wherever you look, these
equations keep coming.

00:26:20.660 --> 00:26:22.370
So now I guess I'm
going to solve it,

00:26:22.370 --> 00:26:25.830
what will I do, I'll subtract,
I'll do elimination of course,

00:26:25.830 --> 00:26:27.820
because that's the only
thing I know how to do.

00:26:27.820 --> 00:26:32.540
Two of these away from
this would give me --

00:26:32.540 --> 00:26:37.198
let's see, six, so would
that be two Ds equals one?

00:26:37.198 --> 00:26:37.697
Ha.

00:26:41.440 --> 00:26:43.050
So it wasn't --

00:26:43.050 --> 00:26:45.760
I was afraid these numbers
were going to come out awful.

00:26:45.760 --> 00:26:48.770
But if I take two of
those away from that,

00:26:48.770 --> 00:26:51.480
the equation I get left
is two D equals one,

00:26:51.480 --> 00:26:57.700
so I think D is a
half and C is whatever

00:26:57.700 --> 00:27:03.650
back substitution gives, six D
is three, so three C plus three

00:27:03.650 --> 00:27:07.060
is five, I'm doing back
substitution now, right, three,

00:27:07.060 --> 00:27:10.910
can I do it in light
letters, three C plus

00:27:10.910 --> 00:27:15.970
that six D is three equals
five, so three C is two,

00:27:15.970 --> 00:27:17.440
so I think C is two-thirds.

00:27:23.276 --> 00:27:24.275
One-half and two-thirds.

00:27:29.230 --> 00:27:38.640
So the best line, the best
line is the constant two-thirds

00:27:38.640 --> 00:27:42.760
plus one-half t.
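
NOTE
A sketch of the elimination and back substitution just done by hand,
with exact fractions. The row layout [coefficient of C, coefficient of
D, right-hand side] is an assumption of this sketch.
```python
from fractions import Fraction as F
# Normal equations from the lecture: 3C + 6D = 5 and 6C + 14D = 11.
row1 = [F(3), F(6), F(5)]
row2 = [F(6), F(14), F(11)]
m = row2[0] / row1[0]                               # multiplier, here 2
row2 = [r2 - m * r1 for r1, r2 in zip(row1, row2)]  # leaves 2D = 1
D = row2[2] / row2[1]
C = (row1[2] - row1[1] * D) / row1[0]               # back substitution
print(C, D)  # 2/3 1/2
```
So the best line is y = 2/3 + (1/2) t, as stated.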

00:27:42.760 --> 00:27:46.820
And I -- is my picture
more or less right?

00:27:46.820 --> 00:27:49.890
Let me write, let me copy
that best line down again,

00:27:49.890 --> 00:27:52.600
two-thirds and a half.

00:27:52.600 --> 00:27:55.510
Let me -- I'll put in the
two-thirds and the half.

00:27:59.890 --> 00:28:00.710
OK.

00:28:00.710 --> 00:28:05.360
So what's this P1, that's
the value at t equal to one.

00:28:05.360 --> 00:28:08.380
At t equal to one, I have
two-thirds plus a half,

00:28:08.380 --> 00:28:10.280
which is --

00:28:10.280 --> 00:28:13.400
what's that, four-sixths
and three-sixths, so P1, oh,

00:28:13.400 --> 00:28:18.400
I promised not to write
another thing on this --

00:28:18.400 --> 00:28:21.860
I'll erase P1 and
I'll put seven-sixths.

00:28:21.860 --> 00:28:22.580
OK.

00:28:22.580 --> 00:28:27.990
And yeah, it's above one,
and e1 is one-sixth, right.

00:28:27.990 --> 00:28:28.720
You see it all.

00:28:28.720 --> 00:28:29.220
Right?

00:28:29.220 --> 00:28:29.830
What's P2?

00:28:29.830 --> 00:28:31.660
OK.

00:28:31.660 --> 00:28:35.230
At point t equal to two,
where's my line here?

00:28:35.230 --> 00:28:38.920
At t equal to two, it's
two-thirds plus one, right?

00:28:38.920 --> 00:28:41.580
That's five-thirds.

00:28:41.580 --> 00:28:44.130
Two-thirds and t is two,
so that's two-thirds

00:28:44.130 --> 00:28:46.070
and one make five-thirds.

00:28:46.070 --> 00:28:49.320
And that's -- sure enough,
that's smaller than the exact

00:28:49.320 --> 00:28:50.280
two.

00:28:50.280 --> 00:28:55.180
And then final P3, when
t is three, oh, what's

00:28:55.180 --> 00:28:56.820
two-thirds plus three-halves?

00:29:01.390 --> 00:29:03.950
It's the same as
three-halves plus two-thirds.

00:29:03.950 --> 00:29:09.280
It's -- so maybe
four-sixths and nine-sixths,

00:29:09.280 --> 00:29:11.120
maybe thirteen-sixths.
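
NOTE
A small sketch that evaluates the best line at t = 1, 2, 3, assuming the coefficients C = 2/3 and D = 1/2 found above. The function name line_value is hypothetical and only serves the illustration.
```python
from fractions import Fraction
C = Fraction(2, 3)  # constant term of the best line
D = Fraction(1, 2)  # slope of the best line
def line_value(t):
    """Height of the line C + D*t at time t."""
    return C + D * t
p = [line_value(t) for t in (1, 2, 3)]
print(", ".join(str(value) for value in p))
```
The three values are the components of the projection, kept as exact fractions so they can be checked against seven-sixths, five-thirds and thirteen-sixths.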

00:29:11.120 --> 00:29:15.110
OK, and again, look,
oh, look at this, OK.

00:29:15.110 --> 00:29:19.840
You have to admire the
beauty of this answer.

00:29:19.840 --> 00:29:21.260
What's this first error?

00:29:21.260 --> 00:29:25.760
So here are the
errors. e1, e2 and e3.

00:29:25.760 --> 00:29:28.340
OK, what was that
first error, e1?

00:29:28.340 --> 00:29:32.640
Well, if we decide the
errors counting up,

00:29:32.640 --> 00:29:35.260
then it's one-sixth.

00:29:35.260 --> 00:29:38.670
And the last error,
thirteen-sixths

00:29:38.670 --> 00:29:43.420
minus the correct two
is one-sixth again.

00:29:43.420 --> 00:29:47.890
And what's this
error in the middle?

00:29:47.890 --> 00:29:52.530
Let's see, the correct
answer was two, two.

00:29:52.530 --> 00:29:55.900
And we got five-thirds and
it's the other direction,

00:29:55.900 --> 00:29:58.445
minus one-third,
minus two-sixths.

00:30:02.070 --> 00:30:04.560
That's our error vector.

00:30:04.560 --> 00:30:09.220
In our picture, in our
other picture, here it is.

00:30:09.220 --> 00:30:13.880
We just found P and e.

00:30:13.880 --> 00:30:19.120
e is this vector, one-sixth,
minus two-sixths, one-sixth,

00:30:19.120 --> 00:30:21.540
and P is this guy.

00:30:21.540 --> 00:30:23.540
Well, maybe I have
the signs of e wrong,

00:30:23.540 --> 00:30:26.880
I think I have, let me fix it.

00:30:26.880 --> 00:30:32.840
Because I would like
this one-sixth --

00:30:32.840 --> 00:30:37.690
I would like this plus the
P to give the original b.

00:30:37.690 --> 00:30:42.730
I want P plus e to match b.

00:30:42.730 --> 00:30:47.080
So I want minus a
sixth, plus seven-sixths

00:30:47.080 --> 00:30:50.650
to give the correct b equal one.

00:30:50.650 --> 00:30:52.090
OK.

00:30:52.090 --> 00:30:58.060
Now -- I'm going to
take a deep breath here,

00:30:58.060 --> 00:31:06.720
and ask what do we know
about this error vector e?

00:31:06.720 --> 00:31:09.790
You've seen now this whole
problem worked completely

00:31:09.790 --> 00:31:13.780
through, and I even think
the numbers are right.

00:31:13.780 --> 00:31:17.500
So there's P, so let me --

00:31:17.500 --> 00:31:24.840
I'll write -- if I can put
it down here, b is P plus e.

00:31:24.840 --> 00:31:29.110
b I believe was one, two, two.

00:31:29.110 --> 00:31:34.860
The nearest point
had seven-sixths,

00:31:34.860 --> 00:31:36.120
what were the others?

00:31:36.120 --> 00:31:40.590
Five-thirds and thirteen-sixths.

00:31:40.590 --> 00:31:46.950
And the e vector was
minus a sixth, two-sixths,

00:31:46.950 --> 00:31:49.360
one-third in other
words, and minus a sixth.

00:31:58.511 --> 00:31:59.010
OK.

00:31:59.010 --> 00:32:01.930
Tell me some stuff
about these two vectors.

00:32:01.930 --> 00:32:03.820
Tell me something about
those two vectors,

00:32:03.820 --> 00:32:06.480
well, they add to
b, right, great.

00:32:06.480 --> 00:32:07.070
OK.

00:32:07.070 --> 00:32:09.420
What else?

00:32:09.420 --> 00:32:12.520
What else about those
two vectors, the P,

00:32:12.520 --> 00:32:18.700
the projection vector P,
and the error vector e.

00:32:18.700 --> 00:32:21.470
What else do you
know about them?

00:32:21.470 --> 00:32:24.430
They're perpendicular, right.

00:32:24.430 --> 00:32:25.860
Do we dare verify that?

00:32:29.180 --> 00:32:32.230
Can you take the dot
product of those vectors?

00:32:32.230 --> 00:32:35.440
I'm like getting like minus
seven over thirty-six,

00:32:35.440 --> 00:32:36.850
can I change that to ten-sixths?

00:32:42.180 --> 00:32:45.250
Oh, God, come out right here.

00:32:45.250 --> 00:32:50.880
Minus seven over thirty-six,
plus twenty over thirty-six,

00:32:50.880 --> 00:32:53.080
minus thirteen over thirty-six.

00:32:56.730 --> 00:32:57.630
Thank you, God.

00:32:57.630 --> 00:32:59.120
OK.

00:32:59.120 --> 00:33:04.030
And what else should we
know about that vector?

00:33:04.030 --> 00:33:05.740
Actually we know --

00:33:05.740 --> 00:33:08.740
I've got to say we know
even a little more.

00:33:08.740 --> 00:33:13.510
This vector, e, is
perpendicular to P,

00:33:13.510 --> 00:33:18.480
but it's perpendicular
to other stuff too.

00:33:18.480 --> 00:33:22.220
It's perpendicular not just to
this guy in the column space,

00:33:22.220 --> 00:33:25.170
this is in the column
space for sure.

00:33:25.170 --> 00:33:27.680
This is perpendicular
to the column space.

00:33:27.680 --> 00:33:32.710
So like give me another
vector it's perpendicular to.

00:33:32.710 --> 00:33:35.000
Another because it's
perpendicular to the whole

00:33:35.000 --> 00:33:37.490
column space, not
just to this --

00:33:37.490 --> 00:33:40.780
this particular
projection that's --

00:33:40.780 --> 00:33:44.880
that is in the column space,
but it's perpendicular to other

00:33:44.880 --> 00:33:46.520
stuff, whatever's
in the column space,

00:33:46.520 --> 00:33:49.800
so tell me another vector
in the -- oh, well,

00:33:49.800 --> 00:33:53.080
I've written down the matrix,
so tell me another vector

00:33:53.080 --> 00:33:55.000
in the column space.

00:33:55.000 --> 00:33:58.000
Pick a nice one.

00:33:58.000 --> 00:33:59.450
One, one, one.

00:33:59.450 --> 00:34:01.490
That's what
everybody's thinking.

00:34:01.490 --> 00:34:04.230
OK, one, one, one is
in the column space.

00:34:04.230 --> 00:34:07.350
And this guy is supposed
to be perpendicular to one,

00:34:07.350 --> 00:34:08.090
one, one.

00:34:08.090 --> 00:34:10.000
Is it?

00:34:10.000 --> 00:34:10.659
Sure.

00:34:10.659 --> 00:34:12.550
If I take the dot
product with one,

00:34:12.550 --> 00:34:16.690
one, one I get minus a sixth,
plus two-sixths, minus a sixth,

00:34:16.690 --> 00:34:18.080
zero.

00:34:18.080 --> 00:34:20.659
And it's perpendicular
to one, two, three.

00:34:20.659 --> 00:34:23.020
Because if I take the
dot product with one,

00:34:23.020 --> 00:34:30.310
two, three I get minus one, plus
four, minus three, zero again.
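
NOTE
A sketch of the checks made in this part of the lecture, assuming b = (1, 2, 2) and the projection (7/6, 5/3, 13/6). It forms e as b minus the projection and takes the dot product of e with the projection and with both columns. The helper dot is illustrative.
```python
from fractions import Fraction as F
b = [F(1), F(2), F(2)]
p = [F(7, 6), F(5, 3), F(13, 6)]
# Error vector: what is left of b after removing the projection.
e = [bi - pi for bi, pi in zip(b, p)]
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))
col1 = [1, 1, 1]
col2 = [1, 2, 3]
print(", ".join(str(ei) for ei in e))
print(dot(p, e), dot(col1, e), dot(col2, e))
```
All three dot products come out zero, which is the statement that e is perpendicular to the whole column space and not only to the projection.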

00:34:30.310 --> 00:34:32.449
OK, do you see the --

00:34:32.449 --> 00:34:35.739
I hope you see the two pictures.

00:34:35.739 --> 00:34:41.110
The picture here for vectors
and, the picture here

00:34:41.110 --> 00:34:48.120
for the best line, and it's
the same picture, just --

00:34:48.120 --> 00:34:51.440
this one's in the plane
and it's showing the line,

00:34:51.440 --> 00:34:56.060
this one never did show the
line, this -- in this picture,

00:34:56.060 --> 00:34:59.160
C and D never showed up.

00:34:59.160 --> 00:35:02.040
In this picture, C and
D were -- you know,

00:35:02.040 --> 00:35:04.730
they determined that line.

00:35:04.730 --> 00:35:07.020
But the two are
exactly the same.

00:35:07.020 --> 00:35:10.540
C and D is the combination
of the two columns

00:35:10.540 --> 00:35:14.770
that gives P. OK.

00:35:14.770 --> 00:35:19.890
So that's least squares.

00:35:19.890 --> 00:35:23.680
And the special
but most important

00:35:23.680 --> 00:35:26.820
example of fitting by
straight line, so the homework

00:35:26.820 --> 00:35:29.670
that's coming then
Wednesday asks

00:35:29.670 --> 00:35:32.750
you to fit by straight lines.

00:35:32.750 --> 00:35:40.440
So you're just going to end
up solving the key equation.

00:35:40.440 --> 00:35:42.850
You're going to end up
solving that key equation

00:35:42.850 --> 00:35:47.100
and then P will be Ax hat.

00:35:47.100 --> 00:35:47.870
That's it.

00:35:51.640 --> 00:35:54.470
OK.

00:35:54.470 --> 00:35:59.650
Now, can I put in a little
piece of linear algebra

00:35:59.650 --> 00:36:03.350
that I mentioned
earlier, mentioned again,

00:36:03.350 --> 00:36:06.510
but I never did write?

00:36:06.510 --> 00:36:09.840
And I've -- I
should do it right.

00:36:09.840 --> 00:36:16.070
It's about this matrix
A transpose A. There.

00:36:21.840 --> 00:36:26.450
I was sure that that
matrix would be invertible.

00:36:26.450 --> 00:36:29.220
And of course I wanted to
be sure it was invertible,

00:36:29.220 --> 00:36:36.210
because I planned to solve this
system with that matrix.

00:36:36.210 --> 00:36:40.620
So and I announced
like before --

00:36:40.620 --> 00:36:42.660
as the chapter
was just starting,

00:36:42.660 --> 00:36:45.390
I announced that it
would be invertible.

00:36:45.390 --> 00:36:48.615
But now I -- can I
come back to that?

00:36:48.615 --> 00:36:49.115
OK.

00:36:52.940 --> 00:36:56.050
So what I said was --

00:36:56.050 --> 00:37:07.440
that if A has
independent columns,

00:37:07.440 --> 00:37:14.616
then A transpose
A is invertible.

00:37:20.100 --> 00:37:24.080
And I would like to --

00:37:24.080 --> 00:37:27.250
first to repeat
that important fact,

00:37:27.250 --> 00:37:32.320
that that's the requirement
that makes everything go here.

00:37:32.320 --> 00:37:34.610
It's this independent
columns of A

00:37:34.610 --> 00:37:39.050
that guarantees
everything goes through.

00:37:39.050 --> 00:37:42.140
And think why.

00:37:42.140 --> 00:37:44.970
Why does this matrix
A transpose A,

00:37:44.970 --> 00:37:50.410
why is it invertible if the
columns of A are independent?

00:37:50.410 --> 00:38:01.840
OK, there's -- so if it
wasn't invertible, I'm --

00:38:01.840 --> 00:38:04.750
so I want to prove that.

00:38:04.750 --> 00:38:08.060
If it isn't
invertible, then what?

00:38:08.060 --> 00:38:10.610
I want to reach --

00:38:10.610 --> 00:38:13.010
I want to follow that
-- follow that line --

00:38:13.010 --> 00:38:15.400
of thinking and
see what I come to.

00:38:15.400 --> 00:38:17.480
Suppose, so proof.

00:38:17.480 --> 00:38:26.810
Suppose A transpose Ax is zero.

00:38:26.810 --> 00:38:28.400
I'm trying to prove this.

00:38:28.400 --> 00:38:30.440
This is now to prove.

00:38:30.440 --> 00:38:39.910
I don't like to hammer away at
too many proofs in this course.

00:38:39.910 --> 00:38:41.690
But this is like
the central fact

00:38:41.690 --> 00:38:44.320
and it brings in all
the stuff we know.

00:38:44.320 --> 00:38:44.820
OK.

00:38:44.820 --> 00:38:46.700
So I'll start the proof.

00:38:46.700 --> 00:38:51.160
Suppose A transpose Ax is zero.

00:38:51.160 --> 00:38:56.110
What -- and I'm aiming to prove
A transpose A is invertible.

00:38:56.110 --> 00:38:58.150
So what do I want to prove now?

00:39:00.680 --> 00:39:03.560
So I'm aiming to
prove this fact.

00:39:03.560 --> 00:39:06.680
I'll use this, and I'm aiming
to prove that this matrix is

00:39:06.680 --> 00:39:11.740
invertible, OK, so if I
suppose A transpose Ax is zero,

00:39:11.740 --> 00:39:13.875
then what conclusion
do I want to reach?

00:39:16.450 --> 00:39:21.200
I'd like to know
that x must be zero.

00:39:21.200 --> 00:39:23.510
I want to show x must be zero.

00:39:23.510 --> 00:39:33.100
To show now -- to prove x
must be the zero vector.

00:39:33.100 --> 00:39:38.640
Is that right, that's what we
worked in the previous chapter

00:39:38.640 --> 00:39:43.850
to understand, that a
matrix was invertible

00:39:43.850 --> 00:39:51.960
when its null space is
only the zero vector.

00:39:51.960 --> 00:39:53.340
So that's what I want to show.

00:39:53.340 --> 00:40:00.520
How come if A transpose Ax is
zero, how come x must be zero?

00:40:00.520 --> 00:40:01.810
What's going to be the reason?

00:40:05.270 --> 00:40:06.880
Actually I have
two ways to do it.

00:40:10.270 --> 00:40:12.210
Let me show you one way.

00:40:12.210 --> 00:40:14.640
This is -- here, trick.

00:40:18.210 --> 00:40:22.880
Take the dot product
of both sides with x.

00:40:22.880 --> 00:40:25.980
So I'll multiply both
sides by x transpose.

00:40:25.980 --> 00:40:30.100
x transpose A transpose
Ax equals zero.

00:40:33.190 --> 00:40:35.100
I shouldn't have written trick.

00:40:35.100 --> 00:40:37.640
That makes it sound
like just a dumb idea.

00:40:37.640 --> 00:40:39.581
Brilliant idea, I
should have put.

00:40:39.581 --> 00:40:40.080
OK.

00:40:43.040 --> 00:40:44.215
I'll just put idea.

00:40:47.920 --> 00:40:49.230
OK.

00:40:49.230 --> 00:40:57.670
Now, I got to that equation,
x transpose A transpose Ax=0,

00:40:57.670 --> 00:41:06.229
and I'm hoping you can
see the right way to --

00:41:06.229 --> 00:41:07.270
to look at that equation.

00:41:12.760 --> 00:41:15.030
What can I conclude
from that equation,

00:41:15.030 --> 00:41:17.840
that if I have x
transpose A -- well,

00:41:17.840 --> 00:41:21.070
what is x transpose
A transpose Ax?

00:41:21.070 --> 00:41:25.360
Does that -- what's
it giving you?

00:41:29.620 --> 00:41:32.740
Again I'm going to be putting
in parentheses, I'm looking

00:41:32.740 --> 00:41:37.170
at Ax and what am I seeing here?

00:41:37.170 --> 00:41:39.560
Its transpose.

00:41:39.560 --> 00:41:47.580
So I'm seeing here this
is (Ax) transpose times Ax.

00:41:47.580 --> 00:41:48.325
Equaling zero.

00:41:51.640 --> 00:41:57.040
Now if (Ax) transpose Ax, so like
let's call it y or something,

00:41:57.040 --> 00:42:01.450
if y transpose y is zero,
what does that tell me?

00:42:06.780 --> 00:42:08.950
That the vector has
to be zero, right?

00:42:08.950 --> 00:42:10.650
This is the length
squared, that's

00:42:10.650 --> 00:42:15.730
the length of the vector Ax
squared, that's Ax times Ax.

00:42:15.730 --> 00:42:18.210
So I conclude that
Ax has to be zero.

00:42:23.474 --> 00:42:24.640
Well, I'm getting somewhere.

00:42:29.900 --> 00:42:34.610
Now that I know Ax
is zero, now I'm

00:42:34.610 --> 00:42:37.370
going to use my
little hypothesis.

00:42:37.370 --> 00:42:43.290
Somewhere every mathematician
has to use the hypothesis.

00:42:43.290 --> 00:42:45.050
Right?

00:42:45.050 --> 00:42:49.740
Now, if A has independent
columns and we've --

00:42:49.740 --> 00:42:55.580
we're at the point where Ax is
zero, what does that tell us?

00:42:55.580 --> 00:42:59.610
I could -- I mean that could
be like a fill-in question

00:42:59.610 --> 00:43:01.090
on the final exam.

00:43:01.090 --> 00:43:06.820
If A has independent columns
and if Ax equals zero then what?

00:43:10.390 --> 00:43:15.850
Please say it. x is zero, right.

00:43:15.850 --> 00:43:18.370
Which was just what
we wanted to prove.

00:43:18.370 --> 00:43:20.790
That -- do you see why that is?

00:43:20.790 --> 00:43:24.150
If Ax equals zero,
now we're using --

00:43:24.150 --> 00:43:27.190
here we used this was
the square of something,

00:43:27.190 --> 00:43:30.810
so I'll put in
little parentheses

00:43:30.810 --> 00:43:35.720
the observation we made, that
was a square which is zero,

00:43:35.720 --> 00:43:37.610
so the thing has to be zero.

00:43:37.610 --> 00:43:43.130
Now we're using the hypothesis
of independent columns

00:43:43.130 --> 00:43:48.600
that A has
independent columns.

00:43:48.600 --> 00:43:52.060
If A has independent
columns, this is telling me

00:43:52.060 --> 00:43:56.040
x is in its null space,
and the only thing

00:43:56.040 --> 00:44:00.510
in the null space of such a
matrix is the zero vector.
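
NOTE
The argument above, written as one chain of implications. This is a restatement in symbols of the steps in the lecture, assuming A has independent columns. The last step is where that hypothesis is used.
```latex
\[
A^{T}A\,x = 0
\;\Longrightarrow\; x^{T}A^{T}A\,x = 0
\;\Longrightarrow\; (Ax)^{T}(Ax) = \lVert Ax \rVert^{2} = 0
\;\Longrightarrow\; Ax = 0
\;\Longrightarrow\; x = 0 .
\]
```
So the null space of A transpose A contains only the zero vector, and since that matrix is square, it is invertible.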

00:44:00.510 --> 00:44:01.320
OK.

00:44:01.320 --> 00:44:06.620
So that's the argument and
you see how it really used

00:44:06.620 --> 00:44:13.420
our understanding of the
-- of the null space.

00:44:13.420 --> 00:44:13.990
OK.

00:44:13.990 --> 00:44:15.650
That's great.

00:44:15.650 --> 00:44:16.420
All right.

00:44:16.420 --> 00:44:20.750
So where are we then?

00:44:20.750 --> 00:44:24.430
That board is like
the backup theory

00:44:24.430 --> 00:44:28.670
that tells me that
this matrix had

00:44:28.670 --> 00:44:32.610
to be invertible because these
columns were independent.

00:44:35.530 --> 00:44:38.360
OK.

00:44:38.360 --> 00:44:44.940
There's one case
of independent --

00:44:44.940 --> 00:44:50.540
there's one case where the
geometry gets even better.

00:44:50.540 --> 00:44:55.030
When the -- there's one case
when columns are sure to be

00:44:55.030 --> 00:44:56.610
independent.

00:44:56.610 --> 00:45:00.060
And let me put that -- let me
write that down and that'll be

00:45:00.060 --> 00:45:01.780
the subject for next time.

00:45:01.780 --> 00:45:07.040
Columns are sure -- are
certainly independent,

00:45:07.040 --> 00:45:23.290
definitely independent,
if they're perpendicular.

00:45:23.290 --> 00:45:25.190
Oh, I've got to rule
out the zero column,

00:45:25.190 --> 00:45:33.280
let me give them all length one,
so they can't be zero if they

00:45:33.280 --> 00:45:37.855
are perpendicular unit vectors.

00:45:42.870 --> 00:45:53.370
Like the vectors one, zero,
zero, zero, one, zero and zero,

00:45:53.370 --> 00:45:55.480
zero, one.

00:45:55.480 --> 00:46:00.660
Those vectors are unit
vectors, they're perpendicular,

00:46:00.660 --> 00:46:05.820
and they certainly
are independent.

00:46:05.820 --> 00:46:10.610
And what's more, suppose
they're -- oh, that's so nice,

00:46:10.610 --> 00:46:14.080
I mean what is A transpose
A for that matrix?

00:46:14.080 --> 00:46:16.470
For the matrix with
these three columns?

00:46:16.470 --> 00:46:18.280
It's the identity.

00:46:18.280 --> 00:46:23.090
So here's the key to the
lecture that's coming.

00:46:23.090 --> 00:46:27.210
If we're dealing with
perpendicular unit vectors

00:46:27.210 --> 00:46:32.000
and the word for that will
be -- see I could have said

00:46:32.000 --> 00:46:35.650
orthogonal, but I
said perpendicular --

00:46:35.650 --> 00:46:41.370
and this unit vectors gets
put in as the word normal.

00:46:41.370 --> 00:46:42.545
Orthonormal vectors.

00:46:46.070 --> 00:46:49.820
Those are the best
columns you could ask for.

00:46:49.820 --> 00:46:54.730
Matrices with -- whose
columns are orthonormal,

00:46:54.730 --> 00:46:56.950
they're perpendicular
to each other,

00:46:56.950 --> 00:47:01.010
and they're unit vectors, well,
they don't have to be those

00:47:01.010 --> 00:47:06.280
three, let me do a
final example over here,

00:47:06.280 --> 00:47:11.110
how about one at an angle like
that and one at ninety degrees,

00:47:11.110 --> 00:47:18.050
that vector would be cos theta,
sine theta, a unit vector,

00:47:18.050 --> 00:47:24.150
and this vector would be
minus sine theta, cos theta.

00:47:24.150 --> 00:47:30.850
That is our absolute favorite
pair of orthonormal vectors.

00:47:30.850 --> 00:47:33.630
They're both unit vectors
and they're perpendicular.

00:47:33.630 --> 00:47:36.520
That angle is ninety degrees.
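
NOTE
A sketch that checks the pair (cos theta, sine theta) and (minus sine theta, cos theta) numerically, assuming an arbitrary angle. It builds the entries of A transpose A for the matrix with those two columns. The names q1, q2 and gram are illustrative, and floating point is used, so the comparison is up to rounding.
```python
import math
theta = 0.7  # arbitrary angle in radians, chosen only for the illustration
q1 = (math.cos(theta), math.sin(theta))
q2 = (-math.sin(theta), math.cos(theta))
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))
# Entries of A^T A when q1 and q2 are the columns of A.
gram = [[dot(q1, q1), dot(q1, q2)],
        [dot(q2, q1), dot(q2, q2)]]
print(gram)
```
The diagonal entries are the squared lengths and the off-diagonal entries are the dot product of the two columns, so for orthonormal columns the result is the identity up to rounding.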

00:47:36.520 --> 00:47:41.500
So like our job next
time is first to see

00:47:41.500 --> 00:47:43.640
why orthonormal
vectors are great,

00:47:43.640 --> 00:47:47.240
and then to make vectors
orthonormal by picking

00:47:47.240 --> 00:47:49.530
the right basis.

00:47:49.530 --> 00:47:50.625
OK, see you.

00:47:57.070 --> 00:47:58.620
Thanks.