1
00:00:07,460 --> 00:00:09,910
OK.
2
00:00:09,910 --> 00:00:15,260
Here's lecture sixteen
and if you remember
3
00:00:15,260 --> 00:00:21,150
I ended the last lecture
with this formula for what
4
00:00:21,150 --> 00:00:24,610
I called a projection matrix.
5
00:00:24,610 --> 00:00:31,610
And maybe I could just
recap for a minute what
6
00:00:31,610 --> 00:00:35,010
is that magic formula doing?
7
00:00:35,010 --> 00:00:38,390
For example, it's
supposed to be --
8
00:00:38,390 --> 00:00:40,230
it's supposed to
produce a projection,
9
00:00:40,230 --> 00:00:44,770
if I multiply by a b,
so I take P times b,
10
00:00:44,770 --> 00:00:51,240
I'm supposed to project that
vector b to the nearest point
11
00:00:51,240 --> 00:00:54,150
in the column space.
12
00:00:54,150 --> 00:00:54,800
OK.
13
00:00:54,800 --> 00:00:56,350
Can I just --
14
00:00:56,350 --> 00:01:01,420
one way to recap is to
take the two extreme cases.
15
00:01:01,420 --> 00:01:05,040
Suppose a vector b is
in the column space?
16
00:01:05,040 --> 00:01:10,030
Then what do I get when
I apply the projection P?
17
00:01:10,030 --> 00:01:13,610
So I'm projecting
into the column space
18
00:01:13,610 --> 00:01:18,780
but I'm starting with a vector
in this case that's already
19
00:01:18,780 --> 00:01:20,950
in the column
space, so of course
20
00:01:20,950 --> 00:01:25,860
when I project it I
get b again, right.
21
00:01:25,860 --> 00:01:29,860
And I want to show you how
that comes out of this formula.
22
00:01:29,860 --> 00:01:32,550
Let me do the other extreme.
23
00:01:32,550 --> 00:01:35,350
Suppose that vector is
perpendicular to the column
24
00:01:35,350 --> 00:01:36,210
space.
25
00:01:36,210 --> 00:01:38,860
So imagine this column
space as a plane
26
00:01:38,860 --> 00:01:42,550
and imagine b as sticking
straight up perpendicular
27
00:01:42,550 --> 00:01:43,490
to it.
28
00:01:43,490 --> 00:01:50,490
What's the nearest point in the
column space to b in that case?
29
00:01:50,490 --> 00:01:54,380
So what's the projection
onto the plane,
30
00:01:54,380 --> 00:01:57,980
the nearest point in the
plane, if the vector b that
31
00:01:57,980 --> 00:02:02,000
I'm looking at is -- got no
component in the column space,
32
00:02:02,000 --> 00:02:05,050
it's sticking completely
-- ninety degrees with it,
33
00:02:05,050 --> 00:02:10,220
then Pb should be zero, right.
34
00:02:10,220 --> 00:02:13,040
So those are the
two extreme cases.
35
00:02:13,040 --> 00:02:18,000
The average vector has a
component p in the column space
36
00:02:18,000 --> 00:02:20,930
and a component
perpendicular to it,
37
00:02:20,930 --> 00:02:25,930
and what the projection
does is it kills this part
38
00:02:25,930 --> 00:02:29,510
and it preserves this part.
39
00:02:29,510 --> 00:02:30,010
OK.
40
00:02:30,010 --> 00:02:32,230
Can we just see why that's true?
41
00:02:32,230 --> 00:02:37,140
Just -- that formula
ought to work.
42
00:02:37,140 --> 00:02:41,010
So let me start with this one.
43
00:02:41,010 --> 00:02:44,260
What vectors are in the -- are
perpendicular to the column
44
00:02:44,260 --> 00:02:45,240
space?
45
00:02:45,240 --> 00:02:48,220
How do I see that
I really get zero?
46
00:02:48,220 --> 00:02:50,850
I have to think, what does
it mean for a vector b
47
00:02:50,850 --> 00:02:54,410
to be perpendicular
to the column space?
48
00:02:54,410 --> 00:02:59,430
So if it's perpendicular
to all the columns,
49
00:02:59,430 --> 00:03:02,100
then it's in some other space.
50
00:03:02,100 --> 00:03:05,740
We've got our four spaces so
the reason I do this is it's
51
00:03:05,740 --> 00:03:10,030
perfectly using what we
know about our four spaces.
52
00:03:10,030 --> 00:03:13,860
What vectors are perpendicular
to the column space?
53
00:03:13,860 --> 00:03:19,190
Those are the guys in the
null space of A transpose,
54
00:03:19,190 --> 00:03:20,290
right?
55
00:03:20,290 --> 00:03:22,740
That's the first
section of this chapter,
56
00:03:22,740 --> 00:03:26,160
that's the key geometry
of these spaces.
57
00:03:26,160 --> 00:03:28,300
If I'm perpendicular
to the column space,
58
00:03:28,300 --> 00:03:30,961
I'm in the null
space of A transpose.
59
00:03:30,961 --> 00:03:31,460
OK.
60
00:03:31,460 --> 00:03:33,790
So if I'm in the null
space of A transpose,
61
00:03:33,790 --> 00:03:41,760
and I multiply this big formula
times b, so now I'm getting Pb,
62
00:03:41,760 --> 00:03:48,530
this is now the projection,
Pb, do you see that I get zero?
63
00:03:48,530 --> 00:03:50,470
Of course I get zero.
64
00:03:50,470 --> 00:03:52,950
Right at the end
there, A transpose b
65
00:03:52,950 --> 00:03:54,840
will give me zero right away.
66
00:03:54,840 --> 00:03:57,780
So that's why that zero's here.
67
00:03:57,780 --> 00:04:00,890
Because if I'm perpendicular
to the column space, then
68
00:04:00,890 --> 00:04:03,840
I'm in the null space of A
transpose and A transpose
69
00:04:03,840 --> 00:04:08,640
b is zilch.
70
00:04:08,640 --> 00:04:09,970
OK, what about the other possibility?
71
00:04:09,970 --> 00:04:13,370
How do I see that this formula
gives me the right answer
72
00:04:13,370 --> 00:04:15,250
if b is in the column space?
73
00:04:18,230 --> 00:04:21,890
So what's a typical vector
in the column space?
74
00:04:21,890 --> 00:04:24,480
It's a combination
of the columns.
75
00:04:24,480 --> 00:04:27,240
How do I write a
combination of the columns?
76
00:04:27,240 --> 00:04:31,090
So tell me, how would
I write, you know,
77
00:04:31,090 --> 00:04:34,440
your everyday vector
that's in the column space?
78
00:04:34,440 --> 00:04:38,860
It would have the form
A times some x, right?
79
00:04:38,860 --> 00:04:42,280
That's what's in the column
space, A times something.
80
00:04:42,280 --> 00:04:44,730
That makes it a
combination of the columns.
81
00:04:44,730 --> 00:04:49,570
So these b's were in the
null space of A transpose.
82
00:04:49,570 --> 00:04:54,640
These guys in the column
space, those b's are Ax-s.
83
00:04:54,640 --> 00:04:55,210
Right?
84
00:04:55,210 --> 00:04:58,825
If b is in the column space
then it has the form Ax.
85
00:05:01,380 --> 00:05:04,450
I'm going to stick that on the
quiz or the final for sure.
86
00:05:04,450 --> 00:05:08,290
That you have to realize --
because we've said it like
87
00:05:08,290 --> 00:05:13,020
a thousand times that the things
in the column space are vectors
88
00:05:13,020 --> 00:05:14,241
A times x.
89
00:05:14,241 --> 00:05:14,740
OK.
90
00:05:14,740 --> 00:05:18,050
And do you see what happens
now if we use our formula?
91
00:05:18,050 --> 00:05:19,990
There's an A transpose A.
92
00:05:19,990 --> 00:05:21,860
Gets canceled by its inverse.
93
00:05:21,860 --> 00:05:25,800
We're left with an A times x.
94
00:05:25,800 --> 00:05:27,530
So the result was Ax.
95
00:05:27,530 --> 00:05:28,570
Which was b.
96
00:05:28,570 --> 00:05:30,100
Do you see that it works?
97
00:05:30,100 --> 00:05:32,750
This is that whole business.
98
00:05:32,750 --> 00:05:35,870
Cancel, cancel, leaving Ax.
99
00:05:35,870 --> 00:05:37,840
And Ax was b.
100
00:05:37,840 --> 00:05:43,730
So that turned out to
be b, in this case.
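[Editor's sketch of the two extreme cases, written out. The formula P = A (A^T A)^{-1} A^T is assumed from the previous lecture; the transcript only refers to it as "this formula".]

```latex
\[
P = A\,(A^{T}A)^{-1}A^{T}
\]
% Case 1: b is perpendicular to the column space, so A^T b = 0.
\[
A^{T}b = 0 \;\Longrightarrow\; Pb = A\,(A^{T}A)^{-1}\,(A^{T}b) = 0
\]
% Case 2: b is in the column space, so b = Ax for some x.
\[
b = Ax \;\Longrightarrow\; Pb = A\,(A^{T}A)^{-1}(A^{T}A)\,x = Ax = b
\]
```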
101
00:05:43,730 --> 00:05:51,300
OK, so geometrically what we're
seeing is we're taking a vector
102
00:05:51,300 --> 00:05:53,010
--
103
00:05:53,010 --> 00:06:00,770
we've got the column space
and perpendicular to that
104
00:06:00,770 --> 00:06:06,350
is the null space
of A transpose.
105
00:06:06,350 --> 00:06:10,230
And our typical
vector b is out here.
106
00:06:10,230 --> 00:06:12,970
There's zero, so there's
our typical vector b,
107
00:06:12,970 --> 00:06:19,460
and what we're doing is we're
projecting it to p. And the --
108
00:06:19,460 --> 00:06:22,300
and of course at the same time
we're finding the other part
109
00:06:22,300 --> 00:06:24,810
of it which is e.
110
00:06:24,810 --> 00:06:30,520
So the two pieces, the
projection piece and the error
111
00:06:30,520 --> 00:06:35,280
piece, add up to the original b.
112
00:06:35,280 --> 00:06:36,230
OK.
113
00:06:36,230 --> 00:06:39,520
That's like what
our matrix does.
114
00:06:39,520 --> 00:06:41,440
So this is p --
115
00:06:41,440 --> 00:06:48,260
this p is -- sorry -- it's
Pb, it's the projection,
116
00:06:48,260 --> 00:06:52,590
applied to b, and this one is --
117
00:06:52,590 --> 00:06:55,080
OK, that's a projection too.
118
00:06:55,080 --> 00:06:58,070
That's a projection
down onto that space.
119
00:06:58,070 --> 00:06:59,860
What's a good formula for it?
120
00:06:59,860 --> 00:07:05,340
Suppose I ask you for the
projection
121
00:07:05,340 --> 00:07:08,830
matrix onto the --
122
00:07:08,830 --> 00:07:13,240
this space, this
perpendicular space?
123
00:07:13,240 --> 00:07:16,960
So if this projection
was P, what's
124
00:07:16,960 --> 00:07:21,070
the projection that gives me e?
125
00:07:21,070 --> 00:07:24,170
It's the -- what I want is to
get the rest of the vector,
126
00:07:24,170 --> 00:07:30,790
so it'll be just I minus P times
b, that's a projection too.
127
00:07:30,790 --> 00:07:35,880
That's the projection onto
the perpendicular space.
128
00:07:38,950 --> 00:07:40,040
OK.
129
00:07:40,040 --> 00:07:44,150
So if P's a projection, I
minus P is a projection.
130
00:07:44,150 --> 00:07:47,790
If P is symmetric, I
minus P is symmetric.
131
00:07:47,790 --> 00:07:52,290
If P squared equals P, then (I
minus P) squared equals I minus
132
00:07:52,290 --> 00:07:55,690
P. It's just --
133
00:07:55,690 --> 00:08:00,460
the algebra -- is only
doing what your --
134
00:08:00,460 --> 00:08:05,270
picture is completely
telling you.
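[Editor's sketch of that algebra, assuming only the two properties just named, P^T = P and P^2 = P.]

```latex
\[
(I - P)^{T} = I - P^{T} = I - P
\]
\[
(I - P)^{2} = I - 2P + P^{2} = I - 2P + P = I - P
\]
```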
135
00:08:05,270 --> 00:08:08,122
But the algebra leads
to this expression.
136
00:08:11,820 --> 00:08:16,280
That expression for P given --
137
00:08:16,280 --> 00:08:19,810
given a basis for
the subspace, given
138
00:08:19,810 --> 00:08:25,690
the matrix A whose columns are
a basis for our column space.
139
00:08:25,690 --> 00:08:28,820
OK, that's recap because you
-- you need to see that formula
140
00:08:28,820 --> 00:08:30,460
more than once.
141
00:08:30,460 --> 00:08:34,669
And now can I pick
up on using it?
142
00:08:34,669 --> 00:08:37,789
So now -- and the --
143
00:08:37,789 --> 00:08:46,590
it's like, let me do that again,
I'll go right through a problem
144
00:08:46,590 --> 00:08:52,470
that I started at the end, which
is find a best straight line.
145
00:08:52,470 --> 00:08:53,820
You remember that problem, I --
146
00:08:53,820 --> 00:08:57,530
I picked a particular
set of points,
147
00:08:57,530 --> 00:09:00,820
they weren't specially
brilliant, t equal one,
148
00:09:00,820 --> 00:09:07,190
two, three, the heights were
one, two, and then two again.
149
00:09:07,190 --> 00:09:10,570
So they were -- heights
were that point, that point,
150
00:09:10,570 --> 00:09:13,320
which makes it look like I've
got a nice forty-five-degree
151
00:09:13,320 --> 00:09:18,800
line -- but then the third
point didn't lie on the line.
152
00:09:18,800 --> 00:09:22,500
And I wanted to find
the best straight line.
153
00:09:22,500 --> 00:09:26,404
So I'm looking for the
-- this line, y=C+Dt.
154
00:09:30,850 --> 00:09:35,110
And it's not going to go
through all three points,
155
00:09:35,110 --> 00:09:37,880
because no line goes
through all three points.
156
00:09:37,880 --> 00:09:42,190
So I'm going to pick
the best line, the --
157
00:09:42,190 --> 00:09:45,990
the best being the one that
makes the overall error
158
00:09:45,990 --> 00:09:48,430
as small as I can make it.
159
00:09:48,430 --> 00:09:52,390
Now I have to tell you,
what is that overall error?
160
00:09:52,390 --> 00:10:01,600
And -- because that determines
what's the winning line.
161
00:10:01,600 --> 00:10:02,740
If we don't know --
162
00:10:02,740 --> 00:10:06,810
I mean we have to decide
what we mean by the error --
163
00:10:06,810 --> 00:10:12,940
and then we minimize and we find
the right -- the best C and D.
164
00:10:12,940 --> 00:10:18,100
So if I went through this --
if I went through that point,
165
00:10:18,100 --> 00:10:18,600
OK.
166
00:10:18,600 --> 00:10:20,688
I would solve the
equation C+D=1.
167
00:10:23,680 --> 00:10:26,310
Because at t equal to one --
168
00:10:26,310 --> 00:10:30,050
I'd have C plus D, and
it would come out right.
169
00:10:30,050 --> 00:10:34,310
If it went through this point,
I'd have C plus two D equal to
170
00:10:34,310 --> 00:10:35,150
two.
171
00:10:35,150 --> 00:10:38,990
Because at t equal to two, I
would like to get the answer
172
00:10:38,990 --> 00:10:39,550
two.
173
00:10:39,550 --> 00:10:43,950
At the third point, I have
C plus three D because t is
174
00:10:43,950 --> 00:10:47,160
three, but the -- the
answer I'm shooting for is
175
00:10:47,160 --> 00:10:49,850
two again.
176
00:10:49,850 --> 00:10:52,680
So those are my three equations.
177
00:10:52,680 --> 00:10:55,720
And they don't have a solution.
178
00:10:55,720 --> 00:10:58,110
But they've got a best solution.
179
00:10:58,110 --> 00:10:59,890
What do I mean by best solution?
180
00:10:59,890 --> 00:11:04,220
So let me take time out to
remember what I'm talking
181
00:11:04,220 --> 00:11:06,550
about for best solution.
182
00:11:06,550 --> 00:11:11,960
So this is my equation Ax=b.
183
00:11:11,960 --> 00:11:18,440
A is this matrix, one,
one, one, one, two, three.
184
00:11:18,440 --> 00:11:22,580
x is my -- only have
two unknowns, C and D,
185
00:11:22,580 --> 00:11:27,440
and b is my right-hand
side, one, two, two.
186
00:11:27,440 --> 00:11:27,940
OK.
187
00:11:31,930 --> 00:11:34,670
No solution.
188
00:11:34,670 --> 00:11:37,300
Three eq- I have a
three by two matrix,
189
00:11:37,300 --> 00:11:40,200
I do have two
independent columns --
190
00:11:40,200 --> 00:11:42,540
so I do have a basis
for the column space,
191
00:11:42,540 --> 00:11:44,630
those two columns
are independent,
192
00:11:44,630 --> 00:11:46,420
they're a basis for
the column space,
193
00:11:46,420 --> 00:11:52,370
but the column space
doesn't include that vector.
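[Editor's sketch: the three equations and their matrix form, using the heights 1, 2, 2 at t = 1, 2, 3 given earlier.]

```latex
\[
\begin{aligned}
C + D &= 1\\
C + 2D &= 2\\
C + 3D &= 2
\end{aligned}
\qquad\Longleftrightarrow\qquad
\begin{bmatrix} 1 & 1\\ 1 & 2\\ 1 & 3 \end{bmatrix}
\begin{bmatrix} C\\ D \end{bmatrix}
=
\begin{bmatrix} 1\\ 2\\ 2 \end{bmatrix}
\]
```

[The first two equations force D = 1 and C = 0, and then the third would need C + 3D = 3, not 2. That is one way to see there is no exact solution.]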
194
00:11:52,370 --> 00:11:57,600
So best possible in this --
195
00:11:57,600 --> 00:12:01,540
what would best possible mean?
196
00:12:01,540 --> 00:12:05,750
The way that comes out to
linear equations is I --
197
00:12:05,750 --> 00:12:13,970
I want to minimize
the sum of these --
198
00:12:13,970 --> 00:12:15,457
I'm going to make an error here.
199
00:12:15,457 --> 00:12:16,790
I'm going to make an error here.
200
00:12:16,790 --> 00:12:18,950
I'm going to make
an error there.
201
00:12:18,950 --> 00:12:24,970
And I'm going to square
and add up those errors.
202
00:12:24,970 --> 00:12:26,720
So it's a sum of squares.
203
00:12:26,720 --> 00:12:30,750
It's a least squares
solution I'm looking for.
204
00:12:30,750 --> 00:12:37,980
So if I -- those errors are the
difference between Ax and b.
205
00:12:37,980 --> 00:12:40,320
That's what I want
to make small.
206
00:12:40,320 --> 00:12:42,951
And the way I'm measuring
this -- this is a vector,
207
00:12:42,951 --> 00:12:43,450
right?
208
00:12:43,450 --> 00:12:45,480
This is e1, e2, e3.
209
00:12:45,480 --> 00:12:49,090
The Ax-b, this is the e.
210
00:12:49,090 --> 00:12:50,890
The error vector.
211
00:12:50,890 --> 00:12:55,890
And small means its length.
212
00:12:55,890 --> 00:12:57,662
The length of that vector.
213
00:12:57,662 --> 00:12:59,370
That's what I'm going
to try to minimize.
214
00:12:59,370 --> 00:13:04,280
And it's convenient to square.
215
00:13:04,280 --> 00:13:06,920
If I make something
small, I make --
216
00:13:09,620 --> 00:13:12,320
this is a never negative
quantity, right?
217
00:13:12,320 --> 00:13:13,690
The length of that vector.
218
00:13:16,760 --> 00:13:20,040
The length will be zero
exactly when the --
219
00:13:20,040 --> 00:13:21,990
when I have the
zero vector here.
220
00:13:21,990 --> 00:13:26,620
That's exactly the case
when I can solve exactly,
221
00:13:26,620 --> 00:13:29,730
b is in the column
space, all great.
222
00:13:29,730 --> 00:13:31,820
But I'm not in that case now.
223
00:13:31,820 --> 00:13:34,070
I'm going to have
an error vector, e.
224
00:13:34,070 --> 00:13:35,815
What's this error
vector in my picture?
225
00:13:38,340 --> 00:13:42,030
I guess what I'm trying
to say is there's --
226
00:13:42,030 --> 00:13:45,270
there's two pictures
of what's going on.
227
00:13:45,270 --> 00:13:47,540
There's two pictures
of what's going on.
228
00:13:47,540 --> 00:13:50,900
One picture is --
229
00:13:50,900 --> 00:13:55,020
in this is the three
points and the line.
230
00:13:55,020 --> 00:14:00,220
And in that picture, what
are the three errors?
231
00:14:00,220 --> 00:14:03,480
The three errors are what
I miss by in this equation.
232
00:14:03,480 --> 00:14:05,150
So it's this --
233
00:14:05,150 --> 00:14:06,740
this little bit here.
234
00:14:06,740 --> 00:14:08,950
That vertical distance
up to the line.
235
00:14:08,950 --> 00:14:12,780
There's one -- sorry there's
one, and there's C plus D.
236
00:14:12,780 --> 00:14:14,700
And it's that difference.
237
00:14:14,700 --> 00:14:17,720
Here's two and here's C+2D.
238
00:14:17,720 --> 00:14:20,600
So vertically it's
that distance --
239
00:14:20,600 --> 00:14:23,620
that little error there is e1.
240
00:14:23,620 --> 00:14:26,220
This little error here is e2.
241
00:14:26,220 --> 00:14:30,540
This little error
coming up is e3.
242
00:14:30,540 --> 00:14:32,350
e3.
243
00:14:32,350 --> 00:14:35,240
And what's my overall error?
244
00:14:35,240 --> 00:14:43,240
It's e1 squared plus e2
squared plus e3 squared.
245
00:14:43,240 --> 00:14:44,920
That's what I'm
trying to make small.
246
00:14:44,920 --> 00:14:54,090
I -- some statisticians -- this
is a big part of statistics,
247
00:14:54,090 --> 00:14:56,360
fitting straight lines is
a big part of science --
248
00:14:56,360 --> 00:15:00,310
and specifically statistics,
where the right word to use
249
00:15:00,310 --> 00:15:02,210
would be regression.
250
00:15:02,210 --> 00:15:05,270
I'm doing regression here.
251
00:15:05,270 --> 00:15:06,145
Linear regression.
252
00:15:09,840 --> 00:15:12,820
And I'm using this
sum of squares
253
00:15:12,820 --> 00:15:15,270
as the measure of error.
254
00:15:15,270 --> 00:15:21,190
Again, some statisticians
would be -- they would say, OK,
255
00:15:21,190 --> 00:15:24,000
I'll solve that problem
because it's the clean problem.
256
00:15:24,000 --> 00:15:27,080
It leads to a beautiful
linear system.
257
00:15:27,080 --> 00:15:30,340
But they would be a little
careful about these squares,
258
00:15:30,340 --> 00:15:32,670
for -- in this case.
259
00:15:32,670 --> 00:15:35,990
If one of these
points was way off.
260
00:15:35,990 --> 00:15:39,040
Suppose I had a measurement at
t equal zero that was way off.
261
00:15:41,560 --> 00:15:44,880
Well, would the straight line,
would the best line be the same
262
00:15:44,880 --> 00:15:46,890
if I had this fourth point?
263
00:15:46,890 --> 00:15:50,180
Suppose I have this
fourth data point.
264
00:15:50,180 --> 00:15:54,880
No, certainly the line would --
265
00:15:54,880 --> 00:15:57,620
it wouldn't be the -- that
wouldn't be the best line.
266
00:15:57,620 --> 00:16:01,100
Because that line would
have a giant error --
267
00:16:01,100 --> 00:16:04,820
and when I squared it it
would be like way out of sight
268
00:16:04,820 --> 00:16:06,860
compared to the others.
269
00:16:06,860 --> 00:16:14,280
So this would be called by
statisticians an outlier,
270
00:16:14,280 --> 00:16:17,830
and they would not be happy to
see the whole problem turned
271
00:16:17,830 --> 00:16:21,150
topsy-turvy by this one outlier,
which could be a mistake,
272
00:16:21,150 --> 00:16:22,760
after all.
273
00:16:22,760 --> 00:16:26,500
So they wouldn't -- so they
wouldn't like maybe squaring,
274
00:16:26,500 --> 00:16:29,940
if there were outliers, they
would want to identify them.
275
00:16:29,940 --> 00:16:30,440
OK.
276
00:16:30,440 --> 00:16:35,800
I'm not going to --
277
00:16:35,800 --> 00:16:40,870
I don't want to suggest that
least squares isn't used,
278
00:16:40,870 --> 00:16:44,790
it's the most used, but
it's not exclusively used
279
00:16:44,790 --> 00:16:47,040
because it's a little --
280
00:16:47,040 --> 00:16:50,000
overcompensates for outliers.
281
00:16:50,000 --> 00:16:51,500
Because of that squaring.
282
00:16:51,500 --> 00:16:52,000
OK.
283
00:16:52,000 --> 00:16:54,300
So suppose we don't
have this guy,
284
00:16:54,300 --> 00:16:57,300
we just have these
three equations.
285
00:16:57,300 --> 00:17:01,680
And I want to make --
minimize this error.
286
00:17:01,680 --> 00:17:02,650
OK.
287
00:17:02,650 --> 00:17:08,069
Now, what I said is there's
two pictures to look at.
288
00:17:08,069 --> 00:17:10,940
One picture is this one.
289
00:17:10,940 --> 00:17:14,700
The three points, the best line.
290
00:17:14,700 --> 00:17:16,280
And the errors.
291
00:17:16,280 --> 00:17:20,760
Now, on this picture,
what are these points
292
00:17:20,760 --> 00:17:24,890
on the line, the points
that are really on the line?
293
00:17:24,890 --> 00:17:30,490
So they're -- points, let
me call them P1, P2, and P3,
294
00:17:30,490 --> 00:17:35,610
those are three numbers, so
this -- this height is P1,
295
00:17:35,610 --> 00:17:45,700
this height is P2, this height
is P3, and what are those guys?
296
00:17:45,700 --> 00:17:49,930
Suppose those were the
three values instead of --
297
00:17:49,930 --> 00:17:53,840
there's b1, ev- everybody's
seen all these -- sorry,
298
00:17:53,840 --> 00:17:57,090
my art is as usual
not the greatest,
299
00:17:57,090 --> 00:18:04,590
but there's the given b1, the
given b2, and the given b3.
300
00:18:04,590 --> 00:18:09,330
I promise not to put a single
letter more on that picture.
301
00:18:09,330 --> 00:18:10,050
OK.
302
00:18:10,050 --> 00:18:15,600
There's b1, P1 is the one on
the line, and e1 is the distance
303
00:18:15,600 --> 00:18:16,600
between.
304
00:18:16,600 --> 00:18:21,410
And same at point two
and same at point three.
305
00:18:21,410 --> 00:18:23,370
OK, so what's up?
306
00:18:23,370 --> 00:18:26,310
What's up with those Ps?
307
00:18:26,310 --> 00:18:29,930
P1, P2, P3, what are they?
308
00:18:29,930 --> 00:18:32,520
They're the components,
they lie on the line,
309
00:18:32,520 --> 00:18:33,720
right?
310
00:18:33,720 --> 00:18:38,420
They're the points
which if instead
311
00:18:38,420 --> 00:18:44,530
of one, two, two, which
were the b's, suppose I put
312
00:18:44,530 --> 00:18:47,230
P1, P2, P3 in here.
313
00:18:47,230 --> 00:18:50,150
I'll figure out in a minute
what those numbers are.
314
00:18:50,150 --> 00:18:53,040
But I just want to get the
picture of what I'm doing.
315
00:18:53,040 --> 00:18:56,390
If I put P1, P2, P3 in
those three equations,
316
00:18:56,390 --> 00:18:58,795
what would be good about
the three equations?
317
00:19:01,820 --> 00:19:03,820
I could solve them.
318
00:19:03,820 --> 00:19:06,420
A line goes through the Ps.
319
00:19:06,420 --> 00:19:10,400
So the P1, P2, P3 vector,
that's in the column
320
00:19:10,400 --> 00:19:11,320
space.
321
00:19:11,320 --> 00:19:14,480
That is a combination
of these columns.
322
00:19:14,480 --> 00:19:16,400
It's the closest combination.
323
00:19:16,400 --> 00:19:18,180
It's this picture.
324
00:19:18,180 --> 00:19:20,920
See, I've got the two
pictures like here's
325
00:19:20,920 --> 00:19:24,710
the picture that
shows the points, this
326
00:19:24,710 --> 00:19:28,240
is a picture in a
blackboard plane,
327
00:19:28,240 --> 00:19:34,310
here's a picture that's
showing the vectors.
328
00:19:34,310 --> 00:19:38,540
The vector b, which is in
this case, in this example
329
00:19:38,540 --> 00:19:42,090
is the vector one, two, two.
330
00:19:42,090 --> 00:19:47,940
The column space is in
this case spanned by the --
331
00:19:47,940 --> 00:19:49,720
well, you see A there.
332
00:19:49,720 --> 00:19:55,600
The column space of the matrix
one, one, one, one, two, three.
333
00:19:55,600 --> 00:20:01,540
And this picture shows
the nearest point.
334
00:20:01,540 --> 00:20:04,510
There's the -- that
point P1, P2, P3,
335
00:20:04,510 --> 00:20:08,050
which I'm going to compute
before the end of this hour,
336
00:20:08,050 --> 00:20:13,090
is the closest point
in the column space.
337
00:20:13,090 --> 00:20:13,780
OK.
338
00:20:13,780 --> 00:20:19,560
Let me -- I don't dare
leave it any longer --
339
00:20:19,560 --> 00:20:21,650
can I just compute it now.
340
00:20:21,650 --> 00:20:24,850
So I want to compute --
341
00:20:24,850 --> 00:20:28,800
find p. All right.
342
00:20:28,800 --> 00:20:39,250
Find p. Find x, which
is C, D: find x and p. OK.
343
00:20:39,250 --> 00:20:42,430
And I really should put
these little hats on
344
00:20:42,430 --> 00:20:49,830
to remind myself that they're
the estimated, the best line,
345
00:20:49,830 --> 00:20:51,970
not the perfect line.
346
00:20:51,970 --> 00:20:53,050
OK.
347
00:20:53,050 --> 00:20:54,330
OK.
348
00:20:54,330 --> 00:20:55,540
How do I proceed?
349
00:20:55,540 --> 00:20:58,340
Let's just run
through the mechanics.
350
00:20:58,340 --> 00:21:02,530
What's the equation for x?
351
00:21:02,530 --> 00:21:04,620
The -- or x hat.
352
00:21:04,620 --> 00:21:10,390
The equation for that is A
transpose A x hat equals A
353
00:21:10,390 --> 00:21:12,500
transpose x --
354
00:21:12,500 --> 00:21:14,105
A transpose b.
355
00:21:18,020 --> 00:21:19,530
The most --
356
00:21:19,530 --> 00:21:23,620
I will venture to call
that the most important equation
357
00:21:23,620 --> 00:21:26,350
in statistics.
358
00:21:26,350 --> 00:21:28,560
And in estimation.
359
00:21:28,560 --> 00:21:33,140
And whatever you're -- wherever
you've got error and noise this
360
00:21:33,140 --> 00:21:36,980
is the estimate
that you use first.
361
00:21:36,980 --> 00:21:37,500
OK.
362
00:21:37,500 --> 00:21:42,740
Whenever you're fitting
things by a few parameters,
363
00:21:42,740 --> 00:21:44,700
that's the equation to use.
364
00:21:44,700 --> 00:21:46,500
OK, let's solve it.
365
00:21:46,500 --> 00:21:47,970
What is A transpose A?
366
00:21:47,970 --> 00:21:50,580
So I have to figure out
what these matrices are.
367
00:21:50,580 --> 00:21:56,860
One, one, one, one, two, three
and one, one, one, one, two,
368
00:21:56,860 --> 00:22:04,490
three, that gives me some
matrix, that gives me
369
00:22:04,490 --> 00:22:12,510
a matrix, what do I get out of
that, three, six, six, and one
370
00:22:12,510 --> 00:22:15,720
and four and nine, fourteen.
371
00:22:15,720 --> 00:22:17,040
OK.
372
00:22:17,040 --> 00:22:21,830
And what do I expect to see in
that matrix and I do see it,
373
00:22:21,830 --> 00:22:25,210
just before I keep going
with the calculation?
374
00:22:25,210 --> 00:22:28,450
I expect that matrix
to be symmetric.
375
00:22:28,450 --> 00:22:30,565
I expect it to be invertible.
376
00:22:34,100 --> 00:22:36,300
And near the end
of the course I'm
377
00:22:36,300 --> 00:22:39,060
going to say I expect it
to be positive definite,
378
00:22:39,060 --> 00:22:45,590
but that's a future fact
about this crucial matrix,
379
00:22:45,590 --> 00:22:47,050
A transpose A.
380
00:22:47,050 --> 00:22:47,670
OK.
381
00:22:47,670 --> 00:22:50,880
And now let me
figure A transpose b.
382
00:22:50,880 --> 00:22:57,280
So let me -- can I tack on b as
an extra column here, one, two,
383
00:22:57,280 --> 00:22:59,950
two?
384
00:22:59,950 --> 00:23:04,770
And tack on the extra
A transpose b is --
385
00:23:04,770 --> 00:23:09,580
looks like five and one
and four and six, eleven.
386
00:23:13,770 --> 00:23:20,760
I think my equations are three
C plus six D equals five,
387
00:23:20,760 --> 00:23:29,700
and six C
plus fourteen D is eleven.
388
00:23:29,700 --> 00:23:33,090
Can I just for safety
see if I did that right?
389
00:23:33,090 --> 00:23:37,350
One, one, one times
one, two, two is five.
390
00:23:37,350 --> 00:23:40,630
One, two, three, that's
one, four and six, eleven.
391
00:23:40,630 --> 00:23:42,667
Looks good.
392
00:23:42,667 --> 00:23:43,625
These are my equations.
393
00:23:48,860 --> 00:23:52,000
That's my -- they're called
the normal equations.
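[Editor's sketch: forming A transpose A and A transpose b for this example in plain Python. The helper names transpose, matmul and matvec are illustrative and not from the lecture.]

```python
# Sketch: build the normal equations A^T A x = A^T b for the three data points.
# Helper names are hypothetical; only the numbers come from the lecture.

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    # Each entry is the dot product of a row of X with a column of Y.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[1, 1],
     [1, 2],
     [1, 3]]     # columns: all ones, and the times t = 1, 2, 3
b = [1, 2, 2]    # the measured heights

At = transpose(A)
AtA = matmul(At, A)   # 2 by 2 and symmetric
Atb = matvec(At, b)

print(AtA)  # [[3, 6], [6, 14]]
print(Atb)  # [5, 11]
```

[These are the coefficients in 3C + 6D = 5 and 6C + 14D = 11.]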
394
00:23:54,610 --> 00:23:56,984
I'll just write that
word down because it --
395
00:24:02,800 --> 00:24:04,470
so I solve them.
396
00:24:04,470 --> 00:24:10,270
I solve that for C and
D. I would like to --
397
00:24:10,270 --> 00:24:13,130
before I solve them could I
do one thing that's on the --
398
00:24:13,130 --> 00:24:16,570
that's just above here?
399
00:24:16,570 --> 00:24:18,110
I would like to --
400
00:24:18,110 --> 00:24:21,470
I'd like to find these
equations from calculus.
401
00:24:21,470 --> 00:24:26,320
I'd like to find them from
this minimizing thing.
402
00:24:26,320 --> 00:24:28,010
So what's the first error?
403
00:24:28,010 --> 00:24:32,690
The first error is what I
missed by in the first equation.
404
00:24:32,690 --> 00:24:36,250
C plus D minus one squared.
405
00:24:36,250 --> 00:24:40,010
And the second error is what
I miss in the second equation.
406
00:24:40,010 --> 00:24:44,110
C plus two D minus two squared.
407
00:24:44,110 --> 00:24:52,350
And the third error squared is C
plus three D minus two squared.
408
00:24:52,350 --> 00:24:56,410
That's my -- overall squared
error that I'm trying
409
00:24:56,410 --> 00:24:58,040
to minimize.
410
00:24:58,040 --> 00:24:58,610
OK.
411
00:24:58,610 --> 00:25:08,910
So how would you minimize that?
412
00:25:08,910 --> 00:25:16,270
OK, linear algebra has given us
the equations for the minimum.
413
00:25:16,270 --> 00:25:20,750
But we could use calculus too.
414
00:25:20,750 --> 00:25:24,440
That's a function of
two variables, C and D,
415
00:25:24,440 --> 00:25:28,010
and we're looking
for the minimum.
416
00:25:28,010 --> 00:25:31,140
So how do we find it?
417
00:25:31,140 --> 00:25:35,160
Directly from calculus, we
take partial derivatives,
418
00:25:35,160 --> 00:25:37,510
right, we've got two
variables, C and D,
419
00:25:37,510 --> 00:25:40,900
so take the partial
derivative with respect to C
420
00:25:40,900 --> 00:25:44,560
and set it to zero, and
you'll get that equation.
421
00:25:44,560 --> 00:25:47,140
Take the partial
derivative with respect --
422
00:25:47,140 --> 00:25:51,570
I'm not going to write it
all out, just -- you will.
423
00:25:51,570 --> 00:25:56,370
The partial derivative with
respect to D, it -- you know,
424
00:25:56,370 --> 00:25:59,340
it's going to be linear,
that's the beauty of these
425
00:25:59,340 --> 00:26:03,220
squares, that if I have the
square of something and I take
426
00:26:03,220 --> 00:26:07,520
its derivative I get something
linear. And this is what I get.
427
00:26:07,520 --> 00:26:11,440
So this is the derivative of
the error with respect to C
428
00:26:11,440 --> 00:26:13,770
being zero, and this
is the derivative
429
00:26:13,770 --> 00:26:17,850
of the error with
respect to D being zero.
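[Editor's sketch of the calculus route, which the lecture leaves to the listener. The name E for the total squared error is an assumption; the three squared terms are the ones stated above.]

```latex
\[
E(C, D) = (C + D - 1)^{2} + (C + 2D - 2)^{2} + (C + 3D - 2)^{2}
\]
\[
\frac{\partial E}{\partial C}
  = 2\bigl[(C + D - 1) + (C + 2D - 2) + (C + 3D - 2)\bigr]
  = 2\,(3C + 6D - 5) = 0
\]
\[
\frac{\partial E}{\partial D}
  = 2\bigl[(C + D - 1) + 2\,(C + 2D - 2) + 3\,(C + 3D - 2)\bigr]
  = 2\,(6C + 14D - 11) = 0
\]
```

[Dividing by 2 gives 3C + 6D = 5 and 6C + 14D = 11, the same two normal equations.]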
430
00:26:17,850 --> 00:26:20,660
Wherever you look, these
equations keep coming.
431
00:26:20,660 --> 00:26:22,370
So now I guess I'm
going to solve it,
432
00:26:22,370 --> 00:26:25,830
what will I do, I'll subtract,
I'll do elimination of course,
433
00:26:25,830 --> 00:26:27,820
because that's the only
thing I know how to do.
434
00:26:27,820 --> 00:26:32,540
Two of these away from
this would give me --
435
00:26:32,540 --> 00:26:37,198
let's see, six, so would
that be two Ds equals one?
436
00:26:37,198 --> 00:26:37,697
Ha.
437
00:26:41,440 --> 00:26:43,050
So it wasn't --
438
00:26:43,050 --> 00:26:45,760
I was afraid these numbers
were going to come out awful.
439
00:26:45,760 --> 00:26:48,770
But if I take two of
those away from that,
440
00:26:48,770 --> 00:26:51,480
the equation I get left
is two D equals one,
441
00:26:51,480 --> 00:26:57,700
so I think D is a
half and C is whatever
442
00:26:57,700 --> 00:27:03,650
back substitution gives, six D
is three, so three C plus three
443
00:27:03,650 --> 00:27:07,060
is five, I'm doing back
substitution now, right, three,
444
00:27:07,060 --> 00:27:10,910
can I do it in light
letters, three C plus
445
00:27:10,910 --> 00:27:15,970
that six D is three equals
five, so three C is two,
446
00:27:15,970 --> 00:27:17,440
so I think C is two-thirds.
447
00:27:23,276 --> 00:27:24,275
One-half and two-thirds.
448
00:27:29,230 --> 00:27:38,640
So the best line, the best
line is the constant two-thirds
449
00:27:38,640 --> 00:27:42,760
plus one-half t.
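[Editor's note: a minimal sketch of the elimination just carried out, assuming the data points (1, 1), (2, 2), (3, 2) from this example. The variable names are illustrative. Exact fractions are used so the result matches the board.]

```python
from fractions import Fraction

# Data points (t, b) from the lecture's example.
ts = [1, 2, 3]
bs = [1, 2, 2]

# Entries of the normal equations A^T A [C, D] = A^T b,
# where A has columns (1, 1, 1) and (1, 2, 3).
m = len(ts)                                   # 3
sum_t = sum(ts)                               # 6
sum_tt = sum(t * t for t in ts)               # 14
sum_b = sum(bs)                               # 5
sum_tb = sum(t * b for t, b in zip(ts, bs))   # 11

# Elimination: subtract (sum_t / m) times the first equation
# from the second, which leaves an equation in D alone.
factor = Fraction(sum_t, m)                   # 2
D = (sum_tb - factor * sum_b) / (sum_tt - factor * sum_t)

# Back substitution into m*C + sum_t*D = sum_b.
C = (sum_b - sum_t * D) / m

print(C, D)
```

[With these inputs the elimination step leaves 2D = 1, and back substitution gives 3C = 2.]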
450
00:27:42,760 --> 00:27:46,820
And I -- is my picture
more or less right?
451
00:27:46,820 --> 00:27:49,890
Let me write, let me copy
that best line down again,
452
00:27:49,890 --> 00:27:52,600
two-thirds and a half.
453
00:27:52,600 --> 00:27:55,510
Let me -- I'll put in the
two-thirds and the half.
454
00:27:59,890 --> 00:28:00,710
OK.
455
00:28:00,710 --> 00:28:05,360
So what's this P1, that's
the value at t equal to one.
456
00:28:05,360 --> 00:28:08,380
At t equal to one, I have
two-thirds plus a half,
457
00:28:08,380 --> 00:28:10,280
which is --
458
00:28:10,280 --> 00:28:13,400
what's that, four-sixths
and three-sixths, so P1, oh,
459
00:28:13,400 --> 00:28:18,400
I promised not to write
another thing on this --
460
00:28:18,400 --> 00:28:21,860
I'll erase P1 and
I'll put seven-sixths.
461
00:28:21,860 --> 00:28:22,580
OK.
462
00:28:22,580 --> 00:28:27,990
And yeah, it's above one,
and e1 is one-sixth, right.
463
00:28:27,990 --> 00:28:28,720
You see it all.
464
00:28:28,720 --> 00:28:29,220
Right?
465
00:28:29,220 --> 00:28:29,830
What's P2?
466
00:28:29,830 --> 00:28:31,660
OK.
467
00:28:31,660 --> 00:28:35,230
At point t equal to two,
where's my line here?
468
00:28:35,230 --> 00:28:38,920
At t equal to two, it's
two-thirds plus one, right?
469
00:28:38,920 --> 00:28:41,580
That's five-thirds.
470
00:28:41,580 --> 00:28:44,130
Two-thirds and t is two,
so that's two-thirds
471
00:28:44,130 --> 00:28:46,070
and one make five-thirds.
472
00:28:46,070 --> 00:28:49,320
And that's -- sure enough,
that's smaller than the exact
473
00:28:49,320 --> 00:28:50,280
two.
474
00:28:50,280 --> 00:28:55,180
And then final P3, when
t is three, oh, what's
475
00:28:55,180 --> 00:28:56,820
two-thirds plus three-halves?
476
00:29:01,390 --> 00:29:03,950
It's the same as
three-halves plus two-thirds.
477
00:29:03,950 --> 00:29:09,280
It's -- so maybe
four-sixths and nine-sixths,
478
00:29:09,280 --> 00:29:11,120
maybe thirteen-sixths.
479
00:29:11,120 --> 00:29:15,110
OK, and again, look,
oh, look at this, OK.
480
00:29:15,110 --> 00:29:19,840
You have to admire the
beauty of this answer.
481
00:29:19,840 --> 00:29:21,260
What's this first error?
482
00:29:21,260 --> 00:29:25,760
So here are the
errors. e1, e2 and e3.
483
00:29:25,760 --> 00:29:28,340
OK, what was that
first error, e1?
484
00:29:28,340 --> 00:29:32,640
Well, if we decide the
errors counting up,
485
00:29:32,640 --> 00:29:35,260
then it's one-sixth.
486
00:29:35,260 --> 00:29:38,670
And the last error,
thirteen-sixths
487
00:29:38,670 --> 00:29:43,420
minus the correct two
is one-sixth again.
488
00:29:43,420 --> 00:29:47,890
And what's this
error in the middle?
489
00:29:47,890 --> 00:29:52,530
Let's see, the correct
answer was two, two.
490
00:29:52,530 --> 00:29:55,900
And we got five-thirds and
it's the other direction,
491
00:29:55,900 --> 00:29:58,445
minus one-third,
minus two-sixths.
492
00:30:02,070 --> 00:30:04,560
That's our error vector.
493
00:30:04,560 --> 00:30:09,220
In our picture, in our
other picture, here it is.
494
00:30:09,220 --> 00:30:13,880
We just found P and e.
495
00:30:13,880 --> 00:30:19,120
e is this vector, one-sixth,
minus two-sixths, one-sixth,
496
00:30:19,120 --> 00:30:21,540
and P is this guy.
497
00:30:21,540 --> 00:30:23,540
Well, maybe I have
the signs of e wrong,
498
00:30:23,540 --> 00:30:26,880
I think I have, let me fix it.
499
00:30:26,880 --> 00:30:32,840
Because I would like
this one-sixth --
500
00:30:32,840 --> 00:30:37,690
I would like this plus the
P to give the original b.
501
00:30:37,690 --> 00:30:42,730
I want P plus e to match b.
502
00:30:42,730 --> 00:30:47,080
So I want minus a
sixth, plus seven-sixths
503
00:30:47,080 --> 00:30:50,650
to give the correct b equal one.
504
00:30:50,650 --> 00:30:52,090
OK.
505
00:30:52,090 --> 00:30:58,060
Now -- I'm going to
take a deep breath here,
506
00:30:58,060 --> 00:31:06,720
and ask what do we know
about this error vector e?
507
00:31:06,720 --> 00:31:09,790
You've seen now this whole
problem worked completely
508
00:31:09,790 --> 00:31:13,780
through, and I even think
the numbers are right.
509
00:31:13,780 --> 00:31:17,500
So there's P, so let me --
510
00:31:17,500 --> 00:31:24,840
I'll write -- if I can put
it down here, b is P plus e.
511
00:31:24,840 --> 00:31:29,110
b I believe was one, two, two.
512
00:31:29,110 --> 00:31:34,860
The nearest point
had seven-sixths,
513
00:31:34,860 --> 00:31:36,120
what were the others?
514
00:31:36,120 --> 00:31:40,590
Five-thirds and thirteen-sixths.
515
00:31:40,590 --> 00:31:46,950
And the e vector was
minus a sixth, two-sixths,
516
00:31:46,950 --> 00:31:49,360
one-third in other
words, and minus a sixth.
517
00:31:58,511 --> 00:31:59,010
OK.
518
00:31:59,010 --> 00:32:01,930
Tell me some stuff
about these two vectors.
519
00:32:01,930 --> 00:32:03,820
Tell me something about
those two vectors,
520
00:32:03,820 --> 00:32:06,480
well, they add to
b, right, great.
521
00:32:06,480 --> 00:32:07,070
OK.
522
00:32:07,070 --> 00:32:09,420
What else?
523
00:32:09,420 --> 00:32:12,520
What else about those
two vectors, the P,
524
00:32:12,520 --> 00:32:18,700
the projection vector P,
and the error vector e.
525
00:32:18,700 --> 00:32:21,470
What else do you
know about them?
526
00:32:21,470 --> 00:32:24,430
They're perpendicular, right.
527
00:32:24,430 --> 00:32:25,860
Do we dare verify that?
528
00:32:29,180 --> 00:32:32,230
Can you take the dot
product of those vectors?
529
00:32:32,230 --> 00:32:35,440
I'm like getting like minus
seven over thirty-six,
530
00:32:35,440 --> 00:32:36,850
can I change that to ten-sixths?
531
00:32:42,180 --> 00:32:45,250
Oh, God, come out right here.
532
00:32:45,250 --> 00:32:50,880
Minus seven over thirty-six,
plus twenty over thirty-six,
533
00:32:50,880 --> 00:32:53,080
minus thirteen over thirty-six.
534
00:32:56,730 --> 00:32:57,630
Thank you, God.
535
00:32:57,630 --> 00:32:59,120
OK.
536
00:32:59,120 --> 00:33:04,030
And what else should we
know about that vector?
537
00:33:04,030 --> 00:33:05,740
Actually we know --
538
00:33:05,740 --> 00:33:08,740
I've got to say we know
even a little more.
539
00:33:08,740 --> 00:33:13,510
This vector, e, is
perpendicular to P,
540
00:33:13,510 --> 00:33:18,480
but it's perpendicular
to other stuff too.
541
00:33:18,480 --> 00:33:22,220
It's perpendicular not just to
this guy in the column space,
542
00:33:22,220 --> 00:33:25,170
this is in the column
space for sure.
543
00:33:25,170 --> 00:33:27,680
This is perpendicular
to the column space.
544
00:33:27,680 --> 00:33:32,710
So like give me another
vector it's perpendicular to.
545
00:33:32,710 --> 00:33:35,000
Another because it's
perpendicular to the whole
546
00:33:35,000 --> 00:33:37,490
column space, not
just to this --
547
00:33:37,490 --> 00:33:40,780
this particular
projection that's --
548
00:33:40,780 --> 00:33:44,880
that is in the column space,
but it's perpendicular to other
549
00:33:44,880 --> 00:33:46,520
stuff, whatever's
in the column space,
550
00:33:46,520 --> 00:33:49,800
so tell me another vector
in the -- oh, well,
551
00:33:49,800 --> 00:33:53,080
I've written down the matrix,
so tell me another vector
552
00:33:53,080 --> 00:33:55,000
in the column space.
553
00:33:55,000 --> 00:33:58,000
Pick a nice one.
554
00:33:58,000 --> 00:33:59,450
One, one, one.
555
00:33:59,450 --> 00:34:01,490
That's what
everybody's thinking.
556
00:34:01,490 --> 00:34:04,230
OK, one, one, one is
in the column space.
557
00:34:04,230 --> 00:34:07,350
And this guy is supposed
to be perpendicular to one,
558
00:34:07,350 --> 00:34:08,090
one, one.
559
00:34:08,090 --> 00:34:10,000
Is it?
560
00:34:10,000 --> 00:34:10,659
Sure.
561
00:34:10,659 --> 00:34:12,550
If I take the dot
product with one,
562
00:34:12,550 --> 00:34:16,690
one, one I get minus a sixth,
plus two-sixths, minus a sixth,
563
00:34:16,690 --> 00:34:18,080
zero.
564
00:34:18,080 --> 00:34:20,659
And it's perpendicular
to one, two, three.
565
00:34:20,659 --> 00:34:23,020
Because if I take the
dot product with one,
566
00:34:23,020 --> 00:34:30,310
two, three I get minus one, plus
four, minus three, zero again.
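[Editor's note: a small sketch that repeats these checks with exact fractions, assuming b = (1, 2, 2) and the best line two-thirds plus one-half t found above. The helper name dot is illustrative.]

```python
from fractions import Fraction as F

b = [F(1), F(2), F(2)]
ts = [1, 2, 3]
C, D = F(2, 3), F(1, 2)

# Projection P: heights of the best line at t = 1, 2, 3.
p = [C + D * t for t in ts]

# Error vector e = b - P, so that P + e gives back b.
e = [bi - pi for bi, pi in zip(b, p)]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

# e should be perpendicular to P and to both columns of A.
print(p, e)
print(dot(e, p), dot(e, [1, 1, 1]), dot(e, ts))
```

[All three dot products come out zero, which is the statement that e is perpendicular to the whole column space, not just to P.]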
567
00:34:30,310 --> 00:34:32,449
OK, do you see the --
568
00:34:32,449 --> 00:34:35,739
I hope you see the two pictures.
569
00:34:35,739 --> 00:34:41,110
The picture here for vectors
and, the picture here
570
00:34:41,110 --> 00:34:48,120
for the best line, and it's
the same picture, just --
571
00:34:48,120 --> 00:34:51,440
this one's in the plane
and it's showing the line,
572
00:34:51,440 --> 00:34:56,060
this one never did show the
line, this -- in this picture,
573
00:34:56,060 --> 00:34:59,160
C and D never showed up.
574
00:34:59,160 --> 00:35:02,040
In this picture, C and
D were -- you know,
575
00:35:02,040 --> 00:35:04,730
they determined that line.
576
00:35:04,730 --> 00:35:07,020
But the two are
exactly the same.
577
00:35:07,020 --> 00:35:10,540
C and D is the combination
of the two columns
578
00:35:10,540 --> 00:35:14,770
that gives P. OK.
579
00:35:14,770 --> 00:35:19,890
So that's least squares.
580
00:35:19,890 --> 00:35:23,680
And the special
but most important
581
00:35:23,680 --> 00:35:26,820
example of fitting by
straight line, so the homework
582
00:35:26,820 --> 00:35:29,670
that's coming then
Wednesday asks
583
00:35:29,670 --> 00:35:32,750
you to fit by straight lines.
584
00:35:32,750 --> 00:35:40,440
So you're just going to end
up solving the key equation.
585
00:35:40,440 --> 00:35:42,850
You're going to end up
solving that key equation
586
00:35:42,850 --> 00:35:47,100
and then P will be Ax hat.
587
00:35:47,100 --> 00:35:47,870
That's it.
588
00:35:51,640 --> 00:35:54,470
OK.
589
00:35:54,470 --> 00:35:59,650
Now, can I put in a little
piece of linear algebra
590
00:35:59,650 --> 00:36:03,350
that I mentioned
earlier, mentioned again,
591
00:36:03,350 --> 00:36:06,510
but I never did write?
592
00:36:06,510 --> 00:36:09,840
And I've -- I
should do it right.
593
00:36:09,840 --> 00:36:16,070
It's about this matrix
A transpose A. There.
594
00:36:21,840 --> 00:36:26,450
I was sure that that
matrix would be invertible.
595
00:36:26,450 --> 00:36:29,220
And of course I wanted to
be sure it was invertible,
596
00:36:29,220 --> 00:36:36,210
because I planned to solve this
system with that matrix.
597
00:36:36,210 --> 00:36:40,620
So and I announced
like before --
598
00:36:40,620 --> 00:36:42,660
as the chapter
was just starting,
599
00:36:42,660 --> 00:36:45,390
I announced that it
would be invertible.
600
00:36:45,390 --> 00:36:48,615
But now I -- can I
come back to that?
601
00:36:48,615 --> 00:36:49,115
OK.
602
00:36:52,940 --> 00:36:56,050
So what I said was --
603
00:36:56,050 --> 00:37:07,440
that if A has
independent columns,
604
00:37:07,440 --> 00:37:14,616
then A transpose
A is invertible.
605
00:37:20,100 --> 00:37:24,080
And I would like to --
606
00:37:24,080 --> 00:37:27,250
first to repeat
that important fact,
607
00:37:27,250 --> 00:37:32,320
that that's the requirement
that makes everything go here.
608
00:37:32,320 --> 00:37:34,610
It's this independent
columns of A
609
00:37:34,610 --> 00:37:39,050
that guarantees
everything goes through.
610
00:37:39,050 --> 00:37:42,140
And think why.
611
00:37:42,140 --> 00:37:44,970
Why does this matrix
A transpose A,
612
00:37:44,970 --> 00:37:50,410
why is it invertible if the
columns of A are independent?
613
00:37:50,410 --> 00:38:01,840
OK, there's -- so if it
wasn't invertible, I'm --
614
00:38:01,840 --> 00:38:04,750
so I want to prove that.
615
00:38:04,750 --> 00:38:08,060
If it isn't
invertible, then what?
616
00:38:08,060 --> 00:38:10,610
I want to reach --
617
00:38:10,610 --> 00:38:13,010
I want to follow that
-- follow that line --
618
00:38:13,010 --> 00:38:15,400
of thinking and
see what I come to.
619
00:38:15,400 --> 00:38:17,480
Suppose, so proof.
620
00:38:17,480 --> 00:38:26,810
Suppose A transpose Ax is zero.
621
00:38:26,810 --> 00:38:28,400
I'm trying to prove this.
622
00:38:28,400 --> 00:38:30,440
This is now to prove.
623
00:38:30,440 --> 00:38:39,910
I don't like to hammer away at
too many proofs in this course.
624
00:38:39,910 --> 00:38:41,690
But this is like
the central fact
625
00:38:41,690 --> 00:38:44,320
and it brings in all
the stuff we know.
626
00:38:44,320 --> 00:38:44,820
OK.
627
00:38:44,820 --> 00:38:46,700
So I'll start the proof.
628
00:38:46,700 --> 00:38:51,160
Suppose A transpose Ax is zero.
629
00:38:51,160 --> 00:38:56,110
What -- and I'm aiming to prove
A transpose A is invertible.
630
00:38:56,110 --> 00:38:58,150
So what do I want to prove now?
631
00:39:00,680 --> 00:39:03,560
So I'm aiming to
prove this fact.
632
00:39:03,560 --> 00:39:06,680
I'll use this, and I'm aiming
to prove that this matrix is
633
00:39:06,680 --> 00:39:11,740
invertible, OK, so if I
suppose A transpose Ax is zero,
634
00:39:11,740 --> 00:39:13,875
then what conclusion
do I want to reach?
635
00:39:16,450 --> 00:39:21,200
I'd like to know
that x must be zero.
636
00:39:21,200 --> 00:39:23,510
I want to show x must be zero.
637
00:39:23,510 --> 00:39:33,100
To show now -- to prove x
must be the zero vector.
638
00:39:33,100 --> 00:39:38,640
Is that right, that's what we
worked in the previous chapter
639
00:39:38,640 --> 00:39:43,850
to understand, that a
matrix was invertible
640
00:39:43,850 --> 00:39:51,960
when its null space is
only the zero vector.
641
00:39:51,960 --> 00:39:53,340
So that's what I want to show.
642
00:39:53,340 --> 00:40:00,520
How come if A transpose Ax is
zero, how come x must be zero?
643
00:40:00,520 --> 00:40:01,810
What's going to be the reason?
644
00:40:05,270 --> 00:40:06,880
Actually I have
two ways to do it.
645
00:40:10,270 --> 00:40:12,210
Let me show you one way.
646
00:40:12,210 --> 00:40:14,640
This is -- here, trick.
647
00:40:18,210 --> 00:40:22,880
Take the dot product
of both sides with x.
648
00:40:22,880 --> 00:40:25,980
So I'll multiply both
sides by x transpose.
649
00:40:25,980 --> 00:40:30,100
x transpose A transpose
Ax equals zero.
650
00:40:33,190 --> 00:40:35,100
I shouldn't have written trick.
651
00:40:35,100 --> 00:40:37,640
That makes it sound
like just a dumb idea.
652
00:40:37,640 --> 00:40:39,581
Brilliant idea, I
should have put.
653
00:40:39,581 --> 00:40:40,080
OK.
654
00:40:43,040 --> 00:40:44,215
I'll just put idea.
655
00:40:47,920 --> 00:40:49,230
OK.
656
00:40:49,230 --> 00:40:57,670
Now, I got to that equation,
x transpose A transpose Ax=0,
657
00:40:57,670 --> 00:41:06,229
and I'm hoping you can
see the right way to --
658
00:41:06,229 --> 00:41:07,270
to look at that equation.
659
00:41:12,760 --> 00:41:15,030
What can I conclude
from that equation,
660
00:41:15,030 --> 00:41:17,840
that if I have x
transpose A -- well,
661
00:41:17,840 --> 00:41:21,070
what is x transpose
A transpose Ax?
662
00:41:21,070 --> 00:41:25,360
Does that -- what
is it giving you?
663
00:41:29,620 --> 00:41:32,740
It's again going to be putting
in parentheses, I'm looking
664
00:41:32,740 --> 00:41:37,170
at Ax and what am I seeing here?
665
00:41:37,170 --> 00:41:39,560
Its transpose.
666
00:41:39,560 --> 00:41:47,580
So I'm seeing here this
is (Ax) transpose times Ax.
667
00:41:47,580 --> 00:41:48,325
Equaling zero.
668
00:41:51,640 --> 00:41:57,040
Now if Ax transpose Ax, so like
let's call it y or something,
669
00:41:57,040 --> 00:42:01,450
if y transpose y is zero,
what does that tell me?
670
00:42:06,780 --> 00:42:08,950
That the vector has
to be zero, right?
671
00:42:08,950 --> 00:42:10,650
This is the length
squared, that's
672
00:42:10,650 --> 00:42:15,730
the length of the vector Ax
squared, that's Ax times Ax.
673
00:42:15,730 --> 00:42:18,210
So I conclude that
Ax has to be zero.
674
00:42:23,474 --> 00:42:24,640
Well, I'm getting somewhere.
675
00:42:29,900 --> 00:42:34,610
Now that I know Ax
is zero, now I'm
676
00:42:34,610 --> 00:42:37,370
going to use my
little hypothesis.
677
00:42:37,370 --> 00:42:43,290
Somewhere every mathematician
has to use the hypothesis.
678
00:42:43,290 --> 00:42:45,050
Right?
679
00:42:45,050 --> 00:42:49,740
Now, if A has independent
columns and we've --
680
00:42:49,740 --> 00:42:55,580
we're at the point where Ax is
zero, what does that tell us?
681
00:42:55,580 --> 00:42:59,610
I could -- I mean that could
be like a fill-in question
682
00:42:59,610 --> 00:43:01,090
on the final exam.
683
00:43:01,090 --> 00:43:06,820
If A has independent columns
and if Ax equals zero then what?
684
00:43:10,390 --> 00:43:15,850
Please say it. x is zero, right.
685
00:43:15,850 --> 00:43:18,370
Which was just what
we wanted to prove.
686
00:43:18,370 --> 00:43:20,790
That -- do you see why that is?
687
00:43:20,790 --> 00:43:24,150
If Ax equals zero,
now we're using --
688
00:43:24,150 --> 00:43:27,190
here we used this was
the square of something,
689
00:43:27,190 --> 00:43:30,810
so I'll put in
little parentheses
690
00:43:30,810 --> 00:43:35,720
the observation we made, that
was a square which is zero,
691
00:43:35,720 --> 00:43:37,610
so the thing has to be zero.
692
00:43:37,610 --> 00:43:43,130
Now we're using the hypothesis
of independent columns
693
00:43:43,130 --> 00:43:48,600
that A has
independent columns.
694
00:43:48,600 --> 00:43:52,060
If A has independent
columns, this is telling me
695
00:43:52,060 --> 00:43:56,040
x is in its null space,
and the only thing
696
00:43:56,040 --> 00:44:00,510
in the null space of such a
matrix is the zero vector.
697
00:44:00,510 --> 00:44:01,320
OK.
698
00:44:01,320 --> 00:44:06,620
So that's the argument and
you see how it really used
699
00:44:06,620 --> 00:44:13,420
our understanding of the
-- of the null space.
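[Editor's note: the argument above, written out as one chain. This is a sketch of the spoken proof; the norm notation is an assumption, since the lecture says "length squared" in words.]

```latex
A^{T}A\,x = 0
\;\Longrightarrow\; x^{T}A^{T}A\,x = 0
\;\Longrightarrow\; (Ax)^{T}(Ax) = \lVert Ax \rVert^{2} = 0
\;\Longrightarrow\; Ax = 0
\;\Longrightarrow\; x = 0
```

[The last step is the only place the hypothesis is used: independent columns mean the null space of A contains only the zero vector. So the null space of A transpose A is also only the zero vector, and since A transpose A is square, it is invertible.]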
700
00:44:13,420 --> 00:44:13,990
OK.
701
00:44:13,990 --> 00:44:15,650
That's great.
702
00:44:15,650 --> 00:44:16,420
All right.
703
00:44:16,420 --> 00:44:20,750
So where are we then?
704
00:44:20,750 --> 00:44:24,430
That board is like
the backup theory
705
00:44:24,430 --> 00:44:28,670
that tells me that
this matrix had
706
00:44:28,670 --> 00:44:32,610
to be invertible because these
columns were independent.
707
00:44:35,530 --> 00:44:38,360
OK.
708
00:44:38,360 --> 00:44:44,940
There's one case
of independent --
709
00:44:44,940 --> 00:44:50,540
there's one case where the
geometry gets even better.
710
00:44:50,540 --> 00:44:55,030
When the -- there's one case
when columns are sure to be
711
00:44:55,030 --> 00:44:56,610
independent.
712
00:44:56,610 --> 00:45:00,060
And let me put that -- let me
write that down and that'll be
713
00:45:00,060 --> 00:45:01,780
the subject for next time.
714
00:45:01,780 --> 00:45:07,040
Columns are sure -- are
certainly independent,
715
00:45:07,040 --> 00:45:23,290
definitely independent,
if they're perpendicular.
716
00:45:23,290 --> 00:45:25,190
Oh, I've got to rule
out the zero column,
717
00:45:25,190 --> 00:45:33,280
let me give them all length one,
so they can't be zero if they
718
00:45:33,280 --> 00:45:37,855
are perpendicular unit vectors.
719
00:45:42,870 --> 00:45:53,370
Like the vectors one, zero,
zero, zero, one, zero and zero,
720
00:45:53,370 --> 00:45:55,480
zero, one.
721
00:45:55,480 --> 00:46:00,660
Those vectors are unit
vectors, they're perpendicular,
722
00:46:00,660 --> 00:46:05,820
and they certainly
are independent.
723
00:46:05,820 --> 00:46:10,610
And what's more, suppose
they're -- oh, that's so nice,
724
00:46:10,610 --> 00:46:14,080
I mean what is A transpose
A for that matrix?
725
00:46:14,080 --> 00:46:16,470
For the matrix with
these three columns?
726
00:46:16,470 --> 00:46:18,280
It's the identity.
727
00:46:18,280 --> 00:46:23,090
So here's the key to the
lecture that's coming.
728
00:46:23,090 --> 00:46:27,210
If we're dealing with
perpendicular unit vectors
729
00:46:27,210 --> 00:46:32,000
and the word for that will
be -- see I could have said
730
00:46:32,000 --> 00:46:35,650
orthogonal, but I
said perpendicular --
731
00:46:35,650 --> 00:46:41,370
and this unit vectors gets
put in as the word normal.
732
00:46:41,370 --> 00:46:42,545
Orthonormal vectors.
733
00:46:46,070 --> 00:46:49,820
Those are the best
columns you could ask for.
734
00:46:49,820 --> 00:46:54,730
Matrices with -- whose
columns are orthonormal,
735
00:46:54,730 --> 00:46:56,950
they're perpendicular
to each other,
736
00:46:56,950 --> 00:47:01,010
and they're unit vectors, well,
they don't have to be those
737
00:47:01,010 --> 00:47:06,280
three, let me do a
final example over here,
738
00:47:06,280 --> 00:47:11,110
how about one at an angle like
that and one at ninety degrees,
739
00:47:11,110 --> 00:47:18,050
that vector would be cos theta,
sine theta, a unit vector,
740
00:47:18,050 --> 00:47:24,150
and this vector would be
minus sine theta cos theta.
741
00:47:24,150 --> 00:47:30,850
That is our absolute favorite
pair of orthonormal vectors.
742
00:47:30,850 --> 00:47:33,630
They're both unit vectors
and they're perpendicular.
743
00:47:33,630 --> 00:47:36,520
That angle is ninety degrees.
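[Editor's note: a quick numerical sketch of the claim that A transpose A is the identity when the columns are orthonormal. The helper names gram and is_identity and the angle 0.7 are arbitrary choices for illustration.]

```python
import math

def gram(columns):
    """Return A^T A for a matrix A given as a list of its columns."""
    return [[sum(x * y for x, y in zip(u, v)) for v in columns]
            for u in columns]

def is_identity(G, tol=1e-12):
    """Check that G equals the identity up to rounding error."""
    n = len(G)
    return all(abs(G[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

# The standard basis vectors (1,0,0), (0,1,0), (0,0,1).
standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
G_standard = gram(standard)

# The pair (cos theta, sin theta), (-sin theta, cos theta).
theta = 0.7  # any angle works
rotated = [(math.cos(theta), math.sin(theta)),
           (-math.sin(theta), math.cos(theta))]
G_rotated = gram(rotated)

print(is_identity(G_standard), is_identity(G_rotated))
```

[The diagonal entries are the squared lengths of the columns and the off-diagonal entries are their dot products, so orthonormal columns give ones on the diagonal and zeros elsewhere.]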
744
00:47:36,520 --> 00:47:41,500
So like our job next
time is first to see
745
00:47:41,500 --> 00:47:43,640
why orthonormal
vectors are great,
746
00:47:43,640 --> 00:47:47,240
and then to make vectors
orthonormal by picking
747
00:47:47,240 --> 00:47:49,530
the right basis.
748
00:47:49,530 --> 00:47:50,625
OK, see you.
749
00:47:57,070 --> 00:47:58,620
Thanks.