1
00:00:01 --> 00:00:03
Yes, OK, four,
three, two, one,
2
00:00:03 --> 00:00:05
OK, I see you guys are in a
happy mood.
3
00:00:05 --> 00:00:08
I don't know if that means
18.06 is ending,
4
00:00:08 --> 00:00:09
or, the quiz was good.
5
00:00:09 --> 00:00:13
Uh, my birthday conference was
going on at the time of the
6
00:00:13 --> 00:00:17
quiz, and in the conference,
of course, everybody had to say
7
00:00:17 --> 00:00:21
nice things, but I was
wondering, what would my 18.06
8
00:00:21 --> 00:00:24
class be saying,
because it was at exactly
9
00:00:24 --> 00:00:26
the same time.
10
00:00:26 --> 00:00:31
But, what I know from the
grades so far,
11
00:00:31 --> 00:00:39
they're basically close to,
and maybe slightly above the
12
00:00:39 --> 00:00:44
grades that you got on quiz two.
13
00:00:44 --> 00:00:48
So, very satisfactory.
14
00:00:48 --> 00:00:55
And, then we have a final exam
coming up, and today's lecture,
15
00:00:55 --> 00:01:00.82
as I told you by email,
will be a first step in the
16
00:01:00.82 --> 00:01:07.61
review, and then on Wednesday
I'll do all I can in reviewing
17
00:01:07.61 --> 00:01:09
the whole course.
18
00:01:09 --> 00:01:15
So my topic today is --
actually, this is a lecture I
19
00:01:15 --> 00:01:19
have never given before in this
way, and it will -- well,
20
00:01:19 --> 00:01:23
four subspaces,
that's certainly fundamental,
21
00:01:23 --> 00:01:26.86
and you know that,
so I want to speak about
22
00:01:26.86 --> 00:01:31
left-inverses and right-inverses
and then something called
23
00:01:31 --> 00:01:33
pseudo-inverses.
24
00:01:33 --> 00:01:39
And pseudo-inverses,
let me say right away,
25
00:01:39 --> 00:01:45
that comes in near the end of
chapter seven,
26
00:01:45 --> 00:01:52
and that would not be expected
on the final.
27
00:01:52 --> 00:01:56
But you'll see that what I'm
talking about is really the
28
00:01:56 --> 00:01:59
basic stuff: for an m-by-n
matrix of rank r,
29
00:01:59 --> 00:02:03
we're going back to the most
fundamental picture in linear
30
00:02:03 --> 00:02:03
algebra.
31
00:02:03 --> 00:02:06
Nobody could forget that
picture, right?
32
00:02:06 --> 00:02:09.92
When you're my age,
even, you'll remember the row
33
00:02:09.92 --> 00:02:12
space, and the null space.
34
00:02:12 --> 00:02:18
Orthogonal complements over
there, the column space and the
35
00:02:18 --> 00:02:23
null space of A transpose,
orthogonal complements
36
00:02:23 --> 00:02:24
over here.
37
00:02:24 --> 00:02:27.9
And I want to speak about
inverses.
38
00:02:27.9 --> 00:02:28
OK.
39
00:02:28 --> 00:02:34
And I want to identify the
different possibilities.
40
00:02:34 --> 00:02:38
So first of all,
when does a matrix have just
41
00:02:38 --> 00:02:42
a perfect inverse,
two-sided, you know,
42
00:02:42 --> 00:02:47
so the two-sided inverse is
what we just call inverse,
43
00:02:47 --> 00:02:48
right?
44
00:02:48 --> 00:02:53
And, so that means that there's
a matrix that produces the
45
00:02:53 --> 00:03:00
identity, whether we write it on
the left or on the right.
46
00:03:00 --> 00:03:03
And just tell me,
how are the numbers r,
47
00:03:03 --> 00:03:07.61
the rank, n the number of
columns, m the number of rows,
48
00:03:07.61 --> 00:03:11
how are those numbers related
when we have an invertible
49
00:03:11 --> 00:03:12
matrix?
50
00:03:12 --> 00:03:17
So this is the matrix which was
-- chapter two was all about
51
00:03:17 --> 00:03:20
matrices like this,
the beginning of the course,
52
00:03:20 --> 00:03:26
what was the relation of
r, m, and n, for the nice case?
53
00:03:26 --> 00:03:32
They're all the same,
all equal.
54
00:03:32 --> 00:03:37
So this is the case when r=m=n.
55
00:03:37 --> 00:03:43.23
Square matrix,
full rank, period,
56
00:03:43.23 --> 00:03:51
just -- so I'll use the words
full rank.
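As a quick numerical check of this r = m = n case — a sketch in NumPy, where the particular matrix is just an illustrative example:

```python
import numpy as np

# A square matrix with r = m = n = 2: full rank, so a two-sided
# inverse exists.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
assert np.linalg.matrix_rank(A) == A.shape[0] == A.shape[1]

A_inv = np.linalg.inv(A)

# The same matrix works as an inverse on either side.
left = A_inv @ A
right = A @ A_inv
```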
57
00:03:51 --> 00:03:52.04
OK, good.
58
00:03:52.04 --> 00:03:53
Everybody knows that.
59
00:03:53 --> 00:03:54
OK.
60
00:03:54 --> 00:03:55
Then chapter three.
61
00:03:55 --> 00:04:00
We began to deal with matrices
that were not of full rank,
62
00:04:00 --> 00:04:05
and they could have any rank,
and we learned what the rank
63
00:04:05 --> 00:04:05
was.
64
00:04:05 --> 00:04:09.96
And then we focused,
if you remember on some cases
65
00:04:09.96 --> 00:04:12
like full column rank.
66
00:04:12 --> 00:04:19
Now, can you remember what was
the deal with full column rank?
67
00:04:19 --> 00:04:24
So, now, I think this is the
case in which we have a
68
00:04:24 --> 00:04:28
left-inverse,
and I'll try to find it.
69
00:04:28 --> 00:04:34
So we have a -- what was the
situation there?
70
00:04:34 --> 00:04:40
It's the case of full column
rank, and that means -- what
71
00:04:40 --> 00:04:42
does that mean about r?
72
00:04:42 --> 00:04:48.76
It equals, what's the deal with
r, now, if we have full column
73
00:04:48.76 --> 00:04:54
rank, I mean the columns are
independent, but maybe not the
74
00:04:54 --> 00:04:55
rows.
75
00:04:55 --> 00:04:59
So what is r equal to in this
case?
76
00:04:59 --> 00:04:59
n.
77
00:04:59 --> 00:05:00
Thanks.
n.
78
00:05:00 --> 00:05:01
r=n.
79
00:05:01 --> 00:05:05.82
The n columns are independent,
but probably,
80
00:05:05.82 --> 00:05:07
we have more rows.
81
00:05:07 --> 00:05:12
What's the picture,
and then what's the null space
82
00:05:12 --> 00:05:13
for this?
83
00:05:13 --> 00:05:18
So the n columns are
independent, what's the null
84
00:05:18 --> 00:05:20
space in this case?
85
00:05:20 --> 00:05:26
So of course,
you know what I'm asking.
86
00:05:26 --> 00:05:29
You're saying,
why is this guy asking
87
00:05:29 --> 00:05:35
something I know -- I think
about it in my sleep,
88
00:05:35 --> 00:05:35
right?
89
00:05:35 --> 00:05:40
So the null space of this
matrix if the rank is n,
90
00:05:40 --> 00:05:45
the null space is what vectors
are in the null space?
91
00:05:45 --> 00:05:47
Just the zero vector.
92
00:05:47 --> 00:05:48
Right?
93
00:05:48 --> 00:05:52
The columns are independent.
94
00:05:52 --> 00:05:55
Independent columns.
95
00:05:55 --> 00:06:03
No combination of the columns
gives zero except that one.
96
00:06:03 --> 00:06:12
And what's my picture over here
-- let me redraw my picture --
97
00:06:12 --> 00:06:17
the row space is everything.
98
00:06:17 --> 00:06:17
No.
99
00:06:17 --> 00:06:18
Is that right?
100
00:06:18 --> 00:06:23
Let's see, I often get these
turned around,
101
00:06:23 --> 00:06:23.88
right?
102
00:06:23.88 --> 00:06:25
So what's the deal?
103
00:06:25 --> 00:06:29
The columns are independent,
right?
104
00:06:29 --> 00:06:34
So the rank should be the full
number of columns,
105
00:06:34 --> 00:06:38
so what does that tell us?
106
00:06:38 --> 00:06:40
There's no null space,
right.
107
00:06:40 --> 00:06:41.18
OK.
108
00:06:41.18 --> 00:06:44
The row space is the whole
thing.
109
00:06:44 --> 00:06:47
Yes, I won't even draw the
picture.
110
00:06:47 --> 00:06:52
And what was the deal with --
and these were very important in
111
00:06:52 --> 00:06:59
least squares problems because
-- So, what more is true here?
112
00:06:59 --> 00:07:03
If we have full column rank,
the null space is zero,
113
00:07:03 --> 00:07:07
we have independent columns,
the unique -- so we have zero
114
00:07:07 --> 00:07:09
or one solutions to Ax=b.
115
00:07:09 --> 00:07:14
There may not be any solutions,
but if there's a solution,
116
00:07:14 --> 00:07:18.39
there's only one solution
because other solutions are
117
00:07:18.39 --> 00:07:21.84
found by adding on stuff from
the null space,
118
00:07:21.84 --> 00:07:25
and there's nobody there to add
on.
119
00:07:25 --> 00:07:32
So the particular solution is
the solution,
120
00:07:32 --> 00:07:37
if there is a particular
solution.
121
00:07:37 --> 00:07:43
But of course,
the rows are
122
00:07:43 --> 00:07:49
probably not independent
-- and therefore,
123
00:07:49 --> 00:07:54
so some right-hand sides won't
end up with zero equals zero after
124
00:07:54 --> 00:07:58
elimination, so sometimes we may
have no solution,
125
00:07:58 --> 00:07:59
or one solution.
126
00:07:59 --> 00:07:59
OK.
127
00:07:59 --> 00:08:03
And what I want to say is that
for this matrix A -- oh,
128
00:08:03 --> 00:08:09
yes, tell me something about A
transpose A in this case.
129
00:08:09 --> 00:08:12
So this whole part of the
board, now, is devoted to this
130
00:08:12 --> 00:08:13
case.
131
00:08:13 --> 00:08:15
What's the deal with A
transpose A?
132
00:08:15 --> 00:08:19
I've emphasized over and over
how important that combination
133
00:08:19 --> 00:08:23
is, for a rectangular matrix,
A transpose A is the good thing
134
00:08:23 --> 00:08:27
to look at, and if the rank is
n, if the null space has only
135
00:08:27 --> 00:08:31
zero in it, then the same is
true of A transpose A.
136
00:08:31 --> 00:08:39.59
That's the beautiful fact,
that if the rank of A is n,
137
00:08:39.59 --> 00:08:47
well, we know this will be an n
by n symmetric matrix,
138
00:08:47 --> 00:08:52.32
and it will be full rank.
139
00:08:52.32 --> 00:08:54
So this is invertible.
140
00:08:54 --> 00:08:56
This matrix is invertible.
141
00:08:56 --> 00:08:59
That matrix is invertible.
142
00:08:59 --> 00:09:04
And now I want to show you that
A itself has a one-sided
143
00:09:04 --> 00:09:04
inverse.
144
00:09:04 --> 00:09:05
Here it is.
145
00:09:05 --> 00:09:08.85
The inverse of that,
which exists,
146
00:09:08.85 --> 00:09:13
times A transpose,
there is a one-sided -- shall I
147
00:09:13 --> 00:09:18
call it A inverse left? --
of the matrix A.
148
00:09:18 --> 00:09:20
Why do I say that?
149
00:09:20 --> 00:09:24
Because if I multiply this guy
by A, what do I get?
150
00:09:24 --> 00:09:28
What does that multiplication
give?
151
00:09:28 --> 00:09:33
Of course, you know it
instantly, because I just put
152
00:09:33 --> 00:09:37.87
the parentheses there,
I have A transpose A inverse
153
00:09:37.87 --> 00:09:43
times A transpose A so,
of course, it's the identity.
154
00:09:43 --> 00:09:46
So it's a left inverse.
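That left-inverse formula, (A transpose A) inverse times A transpose, can be checked numerically — a sketch with a made-up 3-by-2 matrix of full column rank:

```python
import numpy as np

# A 3-by-2 matrix with full column rank: r = n = 2, m = 3.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Left-inverse: (A^T A)^{-1} A^T.  A^T A is 2 by 2 and invertible.
A_left = np.linalg.inv(A.T @ A) @ A.T

# Multiplying by A on the left gives the n-by-n identity.
I_n = A_left @ A
```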
155
00:09:46 --> 00:09:51
And this was the totally
crucial case for least squares,
156
00:09:51 --> 00:09:57
because you remember that least
squares, the central equation of
157
00:09:57 --> 00:10:01.2
least squares had this matrix,
A transpose A,
158
00:10:01.2 --> 00:10:04
as its coefficient matrix.
159
00:10:04 --> 00:10:10.3
And in the case of full column
rank, that matrix is invertible,
160
00:10:10.3 --> 00:10:11
and we're good to go.
161
00:10:11 --> 00:10:16.13
So that's the case where there
is a left-inverse.
162
00:10:16.13 --> 00:10:21.1
So A does whatever it does,
we can find a matrix that
163
00:10:21.1 --> 00:10:24
brings it back to the identity.
164
00:10:24 --> 00:10:31
Now, is it true that,
in the other order -- so A
165
00:10:31 --> 00:10:36
inverse left times A is the
identity.
166
00:10:36 --> 00:10:37.47
Right?
167
00:10:37.47 --> 00:10:40
This matrix is m by n.
168
00:10:40 --> 00:10:43.72
This matrix is n by m.
169
00:10:43.72 --> 00:10:49
The identity matrix is n by n.
170
00:10:49 --> 00:10:50
All good.
171
00:10:50 --> 00:10:52
All good if it's n by n.
172
00:10:52 --> 00:10:57
But if you try to put that
matrix on the other side,
173
00:10:57 --> 00:10:59
it would fail.
174
00:10:59 --> 00:11:04.5
If the full column rank -- if
this is smaller than m,
175
00:11:04.5 --> 00:11:09
the case where they're equal
is the beautiful case,
176
00:11:09 --> 00:11:12
but that's all set.
177
00:11:12 --> 00:11:16
Now, we're looking at the case
where the columns are
178
00:11:16 --> 00:11:19
independent but the rows are
not.
179
00:11:19 --> 00:11:23
So this is invertible,
but what matrix is not
180
00:11:23 --> 00:11:23
invertible?
181
00:11:23 --> 00:11:26.62
A A transpose is bad for this
case.
182
00:11:26.62 --> 00:11:28
A transpose A is good.
183
00:11:28 --> 00:11:32
So we can multiply on the left,
everything good,
184
00:11:32 --> 00:11:35
we get the left inverse.
185
00:11:35 --> 00:11:40
But it would not be a two-sided
inverse.
186
00:11:40 --> 00:11:48
A rectangular matrix can't have
a two-sided inverse,
187
00:11:48 --> 00:11:56
because there's got to be some
null space, right?
188
00:11:56 --> 00:11:59
If I have a matrix that's
rectangular, then either that
189
00:11:59 --> 00:12:03
matrix or its transpose has some
null space, because if n and m
190
00:12:03 --> 00:12:06
are different,
then there's going to be some
191
00:12:06 --> 00:12:09
free variables around,
and we'll have some null space
192
00:12:09 --> 00:12:10
in that direction.
193
00:12:10 --> 00:12:14
OK, tell me the corresponding
picture for the opposite case.
194
00:12:14 --> 00:12:18.15
So now I'm going to ask you
about right-inverses.
195
00:12:18.15 --> 00:12:20
A right-inverse.
196
00:12:20 --> 00:12:28
And you can fill this all out,
this is going to be the case of
197
00:12:28 --> 00:12:30
full row rank.
198
00:12:30 --> 00:12:37
And then r is equal to m,
now, the m rows are
199
00:12:37 --> 00:12:42
independent, but the columns are
not.
200
00:12:42 --> 00:12:47
So what's the deal on that?
201
00:12:47 --> 00:12:51
Well, just exactly the flip of
this one.
202
00:12:51 --> 00:12:57
The null space of A transpose
contains only zero,
203
00:12:57 --> 00:13:03
because there are no
combinations of the rows that
204
00:13:03 --> 00:13:05
give the zero row.
205
00:13:05 --> 00:13:08.37
We have independent rows.
206
00:13:08.37 --> 00:13:13
And in a minute,
I'll give an example of all
207
00:13:13 --> 00:13:15
these.
208
00:13:15 --> 00:13:20.25
So, how many solutions to Ax=b
in this case?
209
00:13:20.25 --> 00:13:23.05
The rows are independent.
210
00:13:23.05 --> 00:13:26
So we can always solve Ax=b.
211
00:13:26 --> 00:13:31
Elimination never produces
a zero row,
212
00:13:31 --> 00:13:36
so we never get into that zero
equals one problem,
213
00:13:36 --> 00:13:42
so Ax=b always has a solution,
but too many.
214
00:13:42 --> 00:13:48
So there will be some null
space, the null space of A --
215
00:13:48 --> 00:13:53.19
what will be the dimension of
A's null space?
216
00:13:53.19 --> 00:13:56
How many free variables have we
got?
217
00:13:56 --> 00:14:03
How many special solutions in
that null space have we got?
218
00:14:03 --> 00:14:08
So how many free variables in
this setup?
219
00:14:08 --> 00:14:11
We've got n columns,
so n variables,
220
00:14:11 --> 00:14:16
and this tells us how many are
pivot variables,
221
00:14:16 --> 00:14:22
that tells us how many pivots
there are, so there are n-m free
222
00:14:22 --> 00:14:24
variables.
223
00:14:24 --> 00:14:30
So there are infinitely many
solutions to Ax=b.
224
00:14:30 --> 00:14:34
We have n-m free variables in
this case.
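The free-variable count can be sketched numerically too, using an illustrative 2-by-3 matrix of full row rank:

```python
import numpy as np

# A 2-by-3 matrix with full row rank: m = 2, n = 3, r = m.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

m, n = A.shape
r = np.linalg.matrix_rank(A)

# n variables, r pivot variables, so n - m free variables
# and a null space of dimension n - m = 1.
nullity = n - r
```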
225
00:14:34 --> 00:14:34
OK.
226
00:14:34 --> 00:14:39
Now I wanted to ask about this
idea of a right-inverse.
227
00:14:39 --> 00:14:40.1
OK.
228
00:14:40.1 --> 00:14:44
So I'm going to have a matrix
A, my matrix A,
229
00:14:44 --> 00:14:50
and now there's going to be
some inverse on the right that
230
00:14:50 --> 00:14:53.42
will give the identity matrix.
231
00:14:53.42 --> 00:14:57
So it will be A times A inverse
on the right, which
232
00:14:57 --> 00:15:00
will be I.
233
00:15:00 --> 00:15:06
And can you tell me what,
just by comparing with what we
234
00:15:06 --> 00:15:11
had up there,
what will be the right-inverse,
235
00:15:11 --> 00:15:15
we even have a formula for it.
236
00:15:15 --> 00:15:20
There will be other --
actually, there are other
237
00:15:20 --> 00:15:25
left-inverses,
that's our favorite.
238
00:15:25 --> 00:15:29
There will be other
right-inverses,
239
00:15:29 --> 00:15:34
but tell me our favorite here,
what's the nice right-inverse?
240
00:15:34 --> 00:15:40
The nice right-inverse will be,
well, there we had A transpose
241
00:15:40 --> 00:15:46
A was good, now it will be A A
transpose that's good.
242
00:15:46 --> 00:15:51
The good matrix,
the good right -- the thing we
243
00:15:51 --> 00:15:57
can invert is A A transpose,
so now if I just do it that
244
00:15:57 --> 00:16:01
way, there sits the
right-inverse.
245
00:16:01 --> 00:16:06
You see how completely parallel
it is to the one above?
246
00:16:06 --> 00:16:07
Right.
247
00:16:07 --> 00:16:11
So that's the right-inverse.
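The right-inverse, A transpose times (A A transpose) inverse, can be verified the same way; again, the matrix below is just an example:

```python
import numpy as np

# A 2-by-3 matrix with full row rank; A A^T is 2 by 2 and invertible.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

# Right-inverse: A^T (A A^T)^{-1}.
A_right = A.T @ np.linalg.inv(A @ A.T)

# Multiplying by A on the right gives the m-by-m identity.
I_m = A @ A_right
```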
248
00:16:11 --> 00:16:17
So that's the case when there
is -- In terms of this picture,
249
00:16:17 --> 00:16:23
tell me what the null spaces
are like so far for these three
250
00:16:23 --> 00:16:24.48
cases.
251
00:16:24.48 --> 00:16:28
What about case one,
where we had a two-sided
252
00:16:28 --> 00:16:32
inverse, full rank,
everything great.
253
00:16:32 --> 00:16:37
The null spaces were,
like, gone, right?
254
00:16:37 --> 00:16:43
The null spaces were just the
zero vectors.
255
00:16:43 --> 00:16:48
Then I took case two,
this null space was gone.
256
00:16:48 --> 00:16:56
Case three, this null space was
gone, and then case four is,
257
00:16:56 --> 00:17:04
like, the most general case
when this picture is all there
258
00:17:04 --> 00:17:11
-- when all the null spaces --
this has dimension r,
259
00:17:11 --> 00:17:18
of course, this has dimension
n-r, this has dimension r,
260
00:17:18 --> 00:17:26
this has dimension m-r,
and the final case will be when
261
00:17:26 --> 00:17:30
r is smaller than m and n.
262
00:17:30 --> 00:17:34
But can I just,
before I leave here look a
263
00:17:34 --> 00:17:36
little more at this one?
264
00:17:36 --> 00:17:39
At this case of full column
rank?
265
00:17:39 --> 00:17:43
So A inverse on the left,
it has this left-inverse to
266
00:17:43 --> 00:17:45
give the identity.
267
00:17:45 --> 00:17:51
I said if we multiply it in the
other order, we wouldn't get the
268
00:17:51 --> 00:17:52
identity.
269
00:17:52 --> 00:17:58
But then I just realized that I
should ask you,
270
00:17:58 --> 00:17:59
what do we get?
271
00:17:59 --> 00:18:06
So if I put them in the other
order -- if I continue this down
272
00:18:06 --> 00:18:13
below, but I write A times A
inverse left -- so there's A
273
00:18:13 --> 00:18:18.83
times the left-inverse,
but it's not on the left any
274
00:18:18.83 --> 00:18:20
more.
275
00:18:20 --> 00:18:27.38
So it's not going to come out
perfectly.
276
00:18:27.38 --> 00:18:37
But everybody in this room
ought to recognize that matrix,
277
00:18:37 --> 00:18:38
right?
278
00:18:38 --> 00:18:44
Let's see, is that the guy we
know?
279
00:18:44 --> 00:18:46
Am I OK, here?
280
00:18:46 --> 00:18:50
What is that matrix?
281
00:18:50 --> 00:18:52
P.
282
00:18:52 --> 00:18:53
Thanks.
283
00:18:53 --> 00:18:53
P.
284
00:18:53 --> 00:18:57
That matrix -- it's a
projection.
285
00:18:57 --> 00:19:01
It's the projection onto the
column space.
286
00:19:01 --> 00:19:06
It's trying to be the identity
matrix, right?
287
00:19:06 --> 00:19:12
A projection matrix tries to be
the identity matrix,
288
00:19:12 --> 00:19:18
but you've given it
an impossible job.
289
00:19:18 --> 00:19:21
So it's the identity matrix
where it can be,
290
00:19:21 --> 00:19:24
and elsewhere,
it's the zero matrix.
291
00:19:24 --> 00:19:26
So this is P,
right.
292
00:19:26 --> 00:19:29.39
A projection onto the column
space.
293
00:19:29.39 --> 00:19:29
OK.
294
00:19:29 --> 00:19:34
And if I asked you this one,
and put these in the opposite
295
00:19:34 --> 00:19:38
order -- so this came from up
here.
296
00:19:38 --> 00:19:41
And similarly,
if I try to put the right
297
00:19:41 --> 00:19:44
inverse on the left -- so that,
like, came from above.
298
00:19:44 --> 00:19:48
This, coming from this side,
what happens if I try to put
299
00:19:48 --> 00:19:50
the right inverse on the left?
300
00:19:50 --> 00:19:54
Then I would have A transpose,
A A transpose inverse, A,
301
00:19:54 --> 00:19:58
if this matrix is now on the
left, what do you figure that
302
00:19:58 --> 00:19:59
matrix is?
303
00:19:59 --> 00:20:05
It's going to be a projection,
too, right?
304
00:20:05 --> 00:20:13.06
It looks very much like this
guy, except the only difference
305
00:20:13.06 --> 00:20:18
is, A and A transpose have been
reversed.
306
00:20:18 --> 00:20:25
So this is a projection,
this is another projection,
307
00:20:25 --> 00:20:28
onto the row space.
308
00:20:28 --> 00:20:34
Again, it's trying to be the
identity, but there's only so
309
00:20:34 --> 00:20:37
much the matrix can do.
310
00:20:37 --> 00:20:42
And this is the projection onto
the column space.
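Both projection claims can be checked numerically — a sketch with an illustrative tall matrix, showing that A times its left-inverse is symmetric and idempotent, as a projection matrix must be:

```python
import numpy as np

# Tall matrix with full column rank.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
A_left = np.linalg.inv(A.T @ A) @ A.T

# In the wrong order, A times its left-inverse is not the identity;
# it is the projection onto the column space: P^T = P and P^2 = P.
P = A @ A_left
```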
311
00:20:42 --> 00:20:48
So let me now go back to the
main picture and tell you about
312
00:20:48 --> 00:20:53
the general case,
the pseudo-inverse.
313
00:20:53 --> 00:20:56
These are cases we know.
314
00:20:56 --> 00:20:59.18
So this was important review.
315
00:20:59.18 --> 00:21:04
You've got to know the business
about these ranks,
316
00:21:04 --> 00:21:10
and the free variables --
really, this is linear algebra
317
00:21:10 --> 00:21:11
coming together.
318
00:21:11 --> 00:21:16
And, you know,
one nice thing about teaching
319
00:21:16 --> 00:21:17
18.06,
320
00:21:17 --> 00:21:18
it's not trivial.
321
00:21:18 --> 00:21:22
But it's -- I don't know,
somehow, it's nice when it
322
00:21:22 --> 00:21:23
comes out right.
323
00:21:23 --> 00:21:25
I mean -- well,
I shouldn't say anything bad
324
00:21:25 --> 00:21:27
about calculus,
but I will.
325
00:21:27 --> 00:21:30
I mean, like,
you know, you have formulas for
326
00:21:30 --> 00:21:32
surface area,
and other awful things and,
327
00:21:32 --> 00:21:37.7
you know, they do their best in
calculus, but it's not elegant.
328
00:21:37.7 --> 00:21:44
And, linear algebra just is --
well, you know,
329
00:21:44 --> 00:21:52
linear algebra is about the
nice part of calculus,
330
00:21:52 --> 00:21:57
where everything's,
like, flat, and,
331
00:21:57 --> 00:22:03.08
the formulas come out right.
332
00:22:03.08 --> 00:22:05
And you can go into high
dimensions where,
333
00:22:05 --> 00:22:07
in calculus,
you're trying to visualize
334
00:22:07 --> 00:22:09.48
these things,
well, two or three dimensions
335
00:22:09.48 --> 00:22:10.59
is kind of the limit.
336
00:22:10.59 --> 00:22:13
But here, we don't -- you know,
I've stopped doing two-by-twos,
337
00:22:13 --> 00:22:16
I'm just talking about the
general case.
338
00:22:16 --> 00:22:19
OK, now I really will speak
about the general case here.
339
00:22:19 --> 00:22:22
What could be the inverse --
what's a kind of reasonable
340
00:22:22 --> 00:22:25
inverse for a matrix for the
completely general matrix where
341
00:22:25 --> 00:22:28
there's a rank r,
but it's smaller than n,
342
00:22:28 --> 00:22:31
so there's some null space
left, and it's smaller than m,
343
00:22:31 --> 00:22:34
so A transpose has some null
space, and it's those null
344
00:22:34 --> 00:22:37
spaces that are screwing up
inverses, right?
345
00:22:37 --> 00:22:46
Because if a matrix takes a
vector to zero,
346
00:22:46 --> 00:22:59
well, there's no way an inverse
can, like, bring it back to
347
00:22:59 --> 00:23:02
life.
348
00:23:02 --> 00:23:04.49
My topic is now the
pseudo-inverse,
349
00:23:04.49 --> 00:23:08
and let's just by a picture,
see what's the best inverse we
350
00:23:08 --> 00:23:09
could have?
351
00:23:09 --> 00:23:12.27
So, here's a vector x in the
row space.
352
00:23:12.27 --> 00:23:13
I multiply by A.
353
00:23:13 --> 00:23:17
Now, the one thing everybody
knows is you take a vector,
354
00:23:17 --> 00:23:20
you multiply by A,
and you get an output,
355
00:23:20 --> 00:23:23
and where is that output?
356
00:23:23 --> 00:23:24
Where is Ax?
357
00:23:24 --> 00:23:29
Always in the column space,
right?
358
00:23:29 --> 00:23:33
Ax is a combination of the
columns.
359
00:23:33 --> 00:23:37
So Ax is somewhere here.
360
00:23:37 --> 00:23:43
So I could take all the vectors
in the row space.
361
00:23:43 --> 00:23:49
I could multiply them all by A.
362
00:23:49 --> 00:23:55
I would get a bunch of vectors
in the column space and what I
363
00:23:55 --> 00:24:00.56
think is, I'd get all the
vectors in the column space just
364
00:24:00.56 --> 00:24:01
right.
365
00:24:01 --> 00:24:06
I think that this connection
between an x in the row space
366
00:24:06 --> 00:24:12
and an Ax in the column space,
this is one-to-one.
367
00:24:12 --> 00:24:15
We got a chance,
because they have the same
368
00:24:15 --> 00:24:15
dimension.
369
00:24:15 --> 00:24:19
That's an r-dimensional space,
and that's an r-dimensional
370
00:24:19 --> 00:24:20
space.
371
00:24:20 --> 00:24:23
And somehow,
the matrix A -- it's got these
372
00:24:23 --> 00:24:27.39
null spaces hanging around,
where it's knocking vectors to
373
00:24:27.39 --> 00:24:27
zero.
374
00:24:27 --> 00:24:30
And then it's got all the
vectors in between,
375
00:24:30 --> 00:24:33
which is almost all vectors.
376
00:24:33 --> 00:24:37
Almost all vectors have a row
space component and a null space
377
00:24:37 --> 00:24:37
component.
378
00:24:37 --> 00:24:39
And it's killing the null space
component.
379
00:24:39 --> 00:24:42
But if I look at the vectors
that are in the row space,
380
00:24:42 --> 00:24:45
with no null space component,
just in the row space,
381
00:24:45 --> 00:24:47
then they all go into the
column space,
382
00:24:47 --> 00:24:49
so if I put another vector,
let's say, y,
383
00:24:49 --> 00:24:52
in the row space,
I'm positive that wherever Ay is,
384
00:24:52 --> 00:24:54
it won't hit Ax.
385
00:24:54 --> 00:24:59
Do you see what I'm saying?
386
00:24:59 --> 00:25:02
Let's see why.
387
00:25:02 --> 00:25:05
All right.
388
00:25:05 --> 00:25:09
So here's what I said.
389
00:25:09 --> 00:25:22
If x and y are in the row
space, then Ax is not the same
390
00:25:22 --> 00:25:25
as Ay.
391
00:25:25 --> 00:25:29
They're both in the column
space, of course,
392
00:25:29 --> 00:25:31
but they're different.
393
00:25:31 --> 00:25:36
That would be a perfect
question on a final exam,
394
00:25:36 --> 00:25:42
because that's what I'm
teaching you in that material of
395
00:25:42 --> 00:25:48
chapter three and chapter four,
especially chapter three.
396
00:25:48 --> 00:25:54
If x and y are in the row
space, then Ax is different from
397
00:25:54 --> 00:25:55
Ay.
398
00:25:55 --> 00:26:00
So what this means -- and we'll
see why -- is that,
399
00:26:00 --> 00:26:04
in words, from the row space to
the column space,
400
00:26:04 --> 00:26:08
A is perfect,
it's an invertible matrix.
401
00:26:08 --> 00:26:11
If we, like,
limited it to those spaces.
402
00:26:11 --> 00:26:16
And then, its inverse will be
what I'll call the
403
00:26:16 --> 00:26:18
pseudo-inverse.
404
00:26:18 --> 00:26:20
So that's what the
pseudo-inverse is.
405
00:26:20 --> 00:26:24
It's the inverse -- so A goes
this way, from x to y -- sorry,
406
00:26:24 --> 00:26:27
x to Ax, from y to Ay,
that's A, going that way.
407
00:26:27 --> 00:26:31
Then in the other direction,
anything in the column space
408
00:26:31 --> 00:26:35
comes from somebody in the row
space, and the reverse there is
409
00:26:35 --> 00:26:37
what I'll call the
pseudo-inverse,
410
00:26:37 --> 00:26:40
and the accepted notation is A
plus.
411
00:26:40 --> 00:26:45
So y will be A plus x.
412
00:26:45 --> 00:26:47
I'm sorry.
413
00:26:47 --> 00:26:57
No, y will be A plus times
whatever it started with,
414
00:26:57 --> 00:26:58
A y.
415
00:26:58 --> 00:27:05
Do you see my picture there?
416
00:27:05 --> 00:27:07
Same, of course,
for x and Ax.
417
00:27:07 --> 00:27:10
This way, A does it,
the other way is the
418
00:27:10 --> 00:27:13
pseudo-inverse,
and the pseudo-inverse just
419
00:27:13 --> 00:27:16
kills this stuff,
and the matrix just kills this
420
00:27:16 --> 00:27:16
stuff.
421
00:27:16 --> 00:27:20
So everything that's really
serious here is going on in the
422
00:27:20 --> 00:27:25
row space and the column space,
and now, tell me
423
00:27:25 --> 00:27:32
-- this is the fundamental
fact, that between those two
424
00:27:32 --> 00:27:39
r-dimensional spaces,
our matrix is perfect.
425
00:27:39 --> 00:27:39
Why?
426
00:27:39 --> 00:27:42
Suppose they weren't.
427
00:27:42 --> 00:27:46.68
Why do I get into trouble?
428
00:27:46.68 --> 00:27:50
Suppose -- so,
proof.
429
00:27:50 --> 00:27:57
I haven't written down proof
very much, but I'm going to use
430
00:27:57 --> 00:27:59
that word once.
431
00:27:59 --> 00:28:02
Suppose they were the same.
432
00:28:02 --> 00:28:08
Suppose these are supposed to
be two different vectors.
433
00:28:08 --> 00:28:14
Maybe I'd better make the
statement correctly.
434
00:28:14 --> 00:28:17
If x and y are different
vectors in the row space --
435
00:28:17 --> 00:28:20
maybe I'd better put if x is
different from y,
436
00:28:20 --> 00:28:24
both in the row space -- so I'm
starting with two different
437
00:28:24 --> 00:28:27.95
vectors in the row space,
I'm multiplying by A -- so
438
00:28:27.95 --> 00:28:31
these guys are in the column
space, everybody knows that,
439
00:28:31 --> 00:28:35
and the point is,
they're different over there.
440
00:28:35 --> 00:28:38
So, suppose they weren't.
441
00:28:38 --> 00:28:40
Suppose Ax=Ay.
442
00:28:40 --> 00:28:45
Suppose, well,
that's the same as saying
443
00:28:45 --> 00:28:47
A(x-y) is zero.
444
00:28:47 --> 00:28:49
So what?
445
00:28:49 --> 00:28:56
So, what do I know now about
(x-y), what do I know about this
446
00:28:56 --> 00:28:57
vector?
447
00:28:57 --> 00:29:05
Well, I can see right away,
what space is it in?
448
00:29:05 --> 00:29:09
It's sitting in the null space,
right?
449
00:29:09 --> 00:29:12
So it's in the null space.
450
00:29:12 --> 00:29:16
But what else do I know about
it?
451
00:29:16 --> 00:29:21
Here it was x in the row space,
y in the row space,
452
00:29:21 --> 00:29:23
what about x-y?
453
00:29:23 --> 00:29:28
It's also in the row space,
right?
454
00:29:28 --> 00:29:32
Heck, that thing is a vector
space, and if the vector space
455
00:29:32 --> 00:29:35
is anything at all,
if x is in the row space,
456
00:29:35 --> 00:29:39
and y is in the row space,
then the difference is also,
457
00:29:39 --> 00:29:41
so it's also in the row space.
458
00:29:41 --> 00:29:42.07
So what?
459
00:29:42.07 --> 00:29:45.66
Now I've got a vector x-y
that's in the null space,
460
00:29:45.66 --> 00:29:50.4
and that's also in the row
space, so what vector is it?
461
00:29:50.4 --> 00:29:52
It's the zero vector.
462
00:29:52 --> 00:29:57
So I would conclude from that
that x-y had to be the zero
463
00:29:57 --> 00:30:00
vector, x-y, so,
in other words,
464
00:30:00 --> 00:30:05
if I start from two different
vectors, I get two different
465
00:30:05 --> 00:30:06
vectors.
466
00:30:06 --> 00:30:11
If these vectors are the same,
then those vectors had to be
467
00:30:11 --> 00:30:13
the same.
468
00:30:13 --> 00:30:18
That's like the algebra proof,
which we understand completely
469
00:30:18 --> 00:30:23
because we really understand
these subspaces of what I said
470
00:30:23 --> 00:30:27
in words, that a matrix A is
really a nice,
471
00:30:27 --> 00:30:31
invertible mapping from row
space to column space.
472
00:30:31 --> 00:30:36
If the null spaces keep out of
the way, then we have an
473
00:30:36 --> 00:30:38
inverse.
474
00:30:38 --> 00:30:41
And that inverse is called the
pseudo inverse,
475
00:30:41 --> 00:30:44.9
and it's very,
very useful in applications.
476
00:30:44.9 --> 00:30:49
Statisticians discovered,
oh boy, this is the thing that
477
00:30:49 --> 00:30:53
we needed all our lives,
and here it finally showed up,
478
00:30:53 --> 00:30:56.73
the pseudo-inverse is the right
thing.
479
00:30:56.73 --> 00:31:00
Why do statisticians need it?
480
00:31:00 --> 00:31:06
Because statisticians are,
like, least-squares-happy.
481
00:31:06 --> 00:31:11
I mean they're always doing
least squares.
482
00:31:11 --> 00:31:17
And so this is their central
tool, linear regression.
483
00:31:17 --> 00:31:20
Statisticians who may watch
this on video,
484
00:31:20 --> 00:31:24.81
please forgive that description
of your interests.
485
00:31:24.81 --> 00:31:29
One of your interests is linear
regression and this problem.
486
00:31:29 --> 00:31:34
But this problem is only OK
provided we have full column
487
00:31:34 --> 00:31:34
rank.
488
00:31:34 --> 00:31:38
And statisticians have to worry
all the time about,
489
00:31:38 --> 00:31:43
oh, God, maybe we just repeated
an experiment.
490
00:31:43 --> 00:31:48
You know, you're taking all
these measurements,
491
00:31:48 --> 00:31:52.54
maybe you just repeat them a
few times.
492
00:31:52.54 --> 00:31:57
You know, maybe they're not
independent.
493
00:31:57 --> 00:32:01
Well, in that case,
that A transpose A matrix that
494
00:32:01 --> 00:32:03
they depend on becomes singular.
495
00:32:03 --> 00:32:06
So then that's when they needed
the pseudo-inverse,
496
00:32:06 --> 00:32:10.83
it just arrived at the right
moment, and it's the right
497
00:32:10.83 --> 00:32:11
quantity.
498
00:32:11 --> 00:32:11
OK.
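[The "repeated experiment" breakdown can be sketched in a few lines. A made-up example, assuming NumPy: a duplicated column makes A transpose A singular, so plain least squares fails, but the pseudo-inverse still delivers a least-squares answer.]

```python
import numpy as np

# A repeated measurement: column 2 duplicates column 1, so rank(A) = 1
# and A^T A is singular -- the usual least-squares formula fails.
A = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

AtA = A.T @ A
assert abs(np.linalg.det(AtA)) < 1e-10   # singular: (A^T A)^-1 doesn't exist

# The pseudo-inverse still gives the shortest least-squares solution,
# the one lying in the row space of A.
x_plus = np.linalg.pinv(A) @ b

# It satisfies the normal equations A^T A x = A^T b anyway.
assert np.allclose(AtA @ x_plus, A.T @ b)
# And it is a row space vector: both components equal, along (1, 1).
assert np.isclose(x_plus[0], x_plus[1])
```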
499
00:32:11 --> 00:32:15
So now that you know what the
pseudo-inverse should do,
500
00:32:15 --> 00:32:18
let me see what it is.
501
00:32:18 --> 00:32:20
Can we find it?
502
00:32:20 --> 00:32:30
So this is my -- to complete
the lecture is -- how do I find
503
00:32:30 --> 00:32:35
this pseudo-inverse A plus?
504
00:32:35 --> 00:32:36
OK.
505
00:32:36 --> 00:32:36
OK.
506
00:32:36 --> 00:32:40
Well, here's one way.
507
00:32:40 --> 00:32:49
Everything I do today is to try
to review stuff.
508
00:32:49 --> 00:32:54.47
One way would be to start from
the SVD.
509
00:32:54.47 --> 00:32:58
The Singular Value
Decomposition.
510
00:32:58 --> 00:33:04
And you remember that that
factored A into an orthogonal
511
00:33:04 --> 00:33:11
matrix times this diagonal
matrix times this orthogonal
512
00:33:11 --> 00:33:12
matrix.
513
00:33:12 --> 00:33:18
But what did that diagonal guy
look like?
514
00:33:18 --> 00:33:20
This diagonal guy,
sigma, has some non-zeroes,
515
00:33:20 --> 00:33:23
and you remember,
they came from A transpose A,
516
00:33:23 --> 00:33:26
and A A transpose,
these are the good guys,
517
00:33:26 --> 00:33:28.94
and then some more zeroes,
and all zeroes there,
518
00:33:28.94 --> 00:33:30
and all zeroes there.
519
00:33:30 --> 00:33:32
So you can guess what the
pseudo-inverse is,
520
00:33:32 --> 00:33:35.59
I just invert stuff that's nice
to invert -- well,
521
00:33:35.59 --> 00:33:38
what's the pseudo-inverse of
this?
522
00:33:38 --> 00:33:43
That's what the problem comes
down to.
523
00:33:43 --> 00:33:51
What's the pseudo-inverse of
this beautiful diagonal matrix?
524
00:33:51 --> 00:33:55
But it's got a null space,
right?
525
00:33:55 --> 00:34:00
What's the rank of this matrix?
526
00:34:00 --> 00:34:03
What's the rank of this
diagonal matrix?
527
00:34:03 --> 00:34:05
r, of course.
528
00:34:05 --> 00:34:09
It's got r non-zeroes,
and then it's otherwise,
529
00:34:09 --> 00:34:09
zip.
530
00:34:09 --> 00:34:13
So it's got n columns,
it's got m rows,
531
00:34:13 --> 00:34:15
and it's got rank r.
532
00:34:15 --> 00:34:20
It's the best example,
the simplest example we could
533
00:34:20 --> 00:34:24
ever have of our general setup.
534
00:34:24 --> 00:34:24
OK?
535
00:34:24 --> 00:34:28
So what's the pseudo-inverse?
536
00:34:28 --> 00:34:33
What's the matrix -- so I'll
erase our columns,
537
00:34:33 --> 00:34:38
because right below it,
I want to write the
538
00:34:38 --> 00:34:40
pseudo-inverse.
539
00:34:40 --> 00:34:46
OK, you can make a pretty darn
good guess.
540
00:34:46 --> 00:34:49
If it was a proper diagonal
matrix, invertible,
541
00:34:49 --> 00:34:54
if there weren't any zeroes
down here, if it was sigma one
542
00:34:54 --> 00:34:59
to sigma n, then everybody knows
what the inverse would be,
543
00:34:59 --> 00:35:04
the inverse would be one over
sigma one, down to one over sigma n --
544
00:35:04 --> 00:35:08
but of course,
I'll have to stop at sigma r.
545
00:35:08 --> 00:35:14
And, it will be the rest,
zeroes again,
546
00:35:14 --> 00:35:16
of course.
547
00:35:16 --> 00:35:25
And now this one was m by n,
and this one is meant to have a
548
00:35:25 --> 00:35:33
slightly different,
you know, transpose shape,
549
00:35:33 --> 00:35:34
n by m.
550
00:35:34 --> 00:35:40
They both have that rank r.
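[These shapes are easy to verify concretely. A minimal sketch; the sizes m = 3, n = 4, r = 2 and the singular values 5 and 3 are invented for illustration.]

```python
import numpy as np

# Sigma: m by n, rank r, with sigma_1, ..., sigma_r on the diagonal
# and zeros everywhere else.
m, n, r = 3, 4, 2
Sigma = np.zeros((m, n))
Sigma[0, 0], Sigma[1, 1] = 5.0, 3.0          # sigma_1, sigma_2

# Sigma-plus: the "transpose shape," n by m, with 1/sigma_1, ..., 1/sigma_r
# on the diagonal and then zeros again.
Sigma_plus = np.zeros((n, m))
for i in range(r):
    Sigma_plus[i, i] = 1.0 / Sigma[i, i]

assert Sigma.shape == (3, 4) and Sigma_plus.shape == (4, 3)
assert np.linalg.matrix_rank(Sigma) == r == np.linalg.matrix_rank(Sigma_plus)
```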
551
00:35:40 --> 00:35:45
My idea is that the
pseudo-inverse is the best -- is
552
00:35:45 --> 00:35:49
the closest I can come to an
inverse.
553
00:35:49 --> 00:35:53
So what is sigma times its
pseudo-inverse?
554
00:35:53 --> 00:35:58
Can you multiply sigma by its
pseudo-inverse?
555
00:35:58 --> 00:36:00
Multiply that by that?
556
00:36:00 --> 00:36:02
What matrix do you get?
557
00:36:02 --> 00:36:05
They're diagonal.
558
00:36:05 --> 00:36:08
Rectangular,
of course.
559
00:36:08 --> 00:36:13
But of course,
we're going to get ones,
560
00:36:13 --> 00:36:16
r ones, and all the rest,
zeroes.
561
00:36:16 --> 00:36:23
And the shape of that,
this whole matrix will be m by
562
00:36:23 --> 00:36:23
m.
563
00:36:23 --> 00:36:29.75
And suppose I did it in the
other order.
564
00:36:29.75 --> 00:36:32
Suppose I did sigma plus sigma.
565
00:36:32 --> 00:36:35
Why don't I do it right
underneath?
566
00:36:35 --> 00:36:37.17
In the opposite order?
567
00:36:37.17 --> 00:36:40
See, this matrix hasn't got a
left-inverse,
568
00:36:40 --> 00:36:45
it hasn't got a right-inverse,
but every matrix has got a
569
00:36:45 --> 00:36:46
pseudo-inverse.
570
00:36:46 --> 00:36:52
If I do it in the order sigma
plus sigma, what do I get?
571
00:36:52 --> 00:36:54.42
Square matrix,
this is n by m,
572
00:36:54.42 --> 00:36:57
this is m by n,
so my result --
573
00:36:57 --> 00:37:00
is going to be n by n,
and what is it?
574
00:37:00 --> 00:37:03
Those are diagonal matrices,
it's going to be ones,
575
00:37:03 --> 00:37:04.61
and then zeroes.
576
00:37:04.61 --> 00:37:08
It's not the same as that,
it's a different size -- it's a
577
00:37:08 --> 00:37:09.26
projection.
578
00:37:09.26 --> 00:37:12
One is a projection matrix onto
the column space,
579
00:37:12 --> 00:37:17
and this one is the projection
matrix onto the row space.
580
00:37:17 --> 00:37:20
That's the best that
pseudo-inverse can do.
581
00:37:20 --> 00:37:25
So what the pseudo-inverse does
is, if you multiply on the left,
582
00:37:25 --> 00:37:29.89
you don't get the identity,
if you multiply on the right,
583
00:37:29.89 --> 00:37:34
you don't get the identity,
what you get is the projection.
584
00:37:34 --> 00:37:39
It brings you into the two good
spaces, the row space and column
585
00:37:39 --> 00:37:40
space.
586
00:37:40 --> 00:37:44
And it just wipes out the null
space.
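[Both products can be computed directly in a toy case. This sketch uses a made-up 3-by-4 rank-2 Sigma with singular values 5 and 3, just to see the two projections appear.]

```python
import numpy as np

# A made-up rectangular diagonal Sigma (m=3, n=4, r=2) and its
# pseudo-inverse, with 1/sigma_i in the first r diagonal slots.
Sigma = np.zeros((3, 4))
Sigma[0, 0], Sigma[1, 1] = 5.0, 3.0
Sigma_plus = np.zeros((4, 3))
Sigma_plus[0, 0], Sigma_plus[1, 1] = 1.0 / 5.0, 1.0 / 3.0

# Sigma Sigma+ is m by m: r ones on the diagonal, then zeros.
P_col = Sigma @ Sigma_plus
assert np.allclose(P_col, np.diag([1.0, 1.0, 0.0]))        # 3 by 3

# Sigma+ Sigma is n by n: a different size, again r ones then zeros.
P_row = Sigma_plus @ Sigma
assert np.allclose(P_row, np.diag([1.0, 1.0, 0.0, 0.0]))   # 4 by 4

# Each one is a projection: P squared equals P.
assert np.allclose(P_col @ P_col, P_col)
assert np.allclose(P_row @ P_row, P_row)
```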
587
00:37:44 --> 00:37:50.64
So that's what the
pseudo-inverse of this diagonal
588
00:37:50.64 --> 00:37:56
one is, and then the
pseudo-inverse of A itself --
589
00:37:56 --> 00:38:00
this is perfectly invertible.
590
00:38:00 --> 00:38:07
What's the inverse of V
transpose?
591
00:38:07 --> 00:38:13.41
Just another tiny bit of
review.
592
00:38:13.41 --> 00:38:23.11
That's an orthogonal matrix,
and its inverse is V,
593
00:38:23.11 --> 00:38:25
good.
594
00:38:25 --> 00:38:27
This guy has got all the
trouble in it,
595
00:38:27 --> 00:38:30
everything the null space is
responsible for,
596
00:38:30 --> 00:38:33
so it doesn't have a true
inverse, it has a
597
00:38:33 --> 00:38:36.15
pseudo-inverse,
and then the inverse of U is U
598
00:38:36.15 --> 00:38:37
transpose, thanks.
599
00:38:37 --> 00:38:39
Or, of course,
I could write U inverse.
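[Putting the three factors together gives A+ = V Sigma+ U transpose. A sketch with NumPy; the 3-by-4 rank-2 matrix is made up, and np.linalg.pinv is used only as a cross-check.]

```python
import numpy as np

# A made-up rank-deficient matrix: row 2 is twice row 1, so rank(A) = 2.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [2.0, 4.0, 0.0, 2.0],
              [0.0, 0.0, 1.0, 1.0]])

U, s, Vt = np.linalg.svd(A)      # A = U Sigma V^T, singular values in s
r = int(np.sum(s > 1e-10))       # numerical rank

# Build Sigma+: transpose shape, 1/sigma_i in the first r slots.
Sigma_plus = np.zeros((A.shape[1], A.shape[0]))
Sigma_plus[:r, :r] = np.diag(1.0 / s[:r])

# A+ = V Sigma+ U^T: invert the orthogonal factors by transposing,
# and pseudo-invert the diagonal in the middle.
A_plus = Vt.T @ Sigma_plus @ U.T

assert np.allclose(A_plus, np.linalg.pinv(A))   # matches the library version
assert np.allclose(A @ A_plus @ A, A)           # A+ A projects onto row space
```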
600
00:38:39 --> 00:38:42
So, that's the question of,
how do you find the
601
00:38:42 --> 00:38:46
pseudo-inverse
-- so what statisticians do
602
00:38:46 --> 00:38:50
when they're in this -- so this
is like the case of where least
603
00:38:50 --> 00:38:54
squares breaks down because the
rank is -- you don't have full
604
00:38:54 --> 00:38:58
rank, and the beauty of the
singular value decomposition is,
605
00:38:58 --> 00:39:03
it puts all the problems into
this diagonal matrix where it's
606
00:39:03 --> 00:39:04
clear what to do.
607
00:39:04 --> 00:39:10
It's clearly the best inverse
you could think of.
608
00:39:10 --> 00:39:17
You see there could be other --
I mean, we could put some stuff
609
00:39:17 --> 00:39:22
down here, it would multiply
these zeroes.
610
00:39:22 --> 00:39:28
It wouldn't have any effect,
but then the good
611
00:39:28 --> 00:39:33
pseudo-inverse is the one with
no extra stuff,
612
00:39:33 --> 00:39:38
it's sort of,
like, as small as possible.
613
00:39:38 --> 00:39:44
It has to have those to produce
the ones.
614
00:39:44 --> 00:39:48
If it had other stuff,
it would just be a larger
615
00:39:48 --> 00:39:52
matrix, so this pseudo-inverse
is kind of the minimal matrix
616
00:39:52 --> 00:39:55.18
that gives the best result.
617
00:39:55.18 --> 00:39:57
Sigma sigma plus being r ones.
618
00:39:57 --> 00:39:59
OK.
So I guess I'm hoping --
619
00:39:59 --> 00:40:03
pseudo-inverse,
again, let me repeat what I
620
00:40:03 --> 00:40:06
said at the very beginning.
621
00:40:06 --> 00:40:13
This pseudo-inverse,
which appears at the end,
622
00:40:13 --> 00:40:22
which is in section seven point
four, and probably I did more
623
00:40:22 --> 00:40:29
with it here than I did in the
book.
624
00:40:29 --> 00:40:32
The word pseudo-inverse will
not appear on an exam in this
625
00:40:32 --> 00:40:36
course, but I think if you see
this, all of it will appear,
626
00:40:36 --> 00:40:39
because this is what the
course was about,
627
00:40:39 --> 00:40:42
chapters one,
two, three, four -- but if you
628
00:40:42 --> 00:40:44
see all that,
then you probably see,
629
00:40:44 --> 00:40:48
well, OK, the general case had
both null spaces around,
630
00:40:48 --> 00:40:51.3
and this is the natural thing
to do.
631
00:40:51.3 --> 00:40:51
Yes.
632
00:40:51 --> 00:40:57
So, this is one way to find the
pseudo-inverse.
633
00:40:57 --> 00:41:03
The point of a pseudo-inverse,
of computing a pseudo-inverse
634
00:41:03 --> 00:41:10
is to get some factors where you
can find the pseudo-inverse
635
00:41:10 --> 00:41:12
quickly.
636
00:41:12 --> 00:41:15
And this is,
like, the champion,
637
00:41:15 --> 00:41:19
because this is where we can
invert those,
638
00:41:19 --> 00:41:23
and those two,
easily, just by transposing,
639
00:41:23 --> 00:41:27
and we know what to do with a
diagonal.
640
00:41:27 --> 00:41:31
OK, that's as much review,
maybe --
641
00:41:31 --> 00:41:36
let's have a five-minute
holiday in 18.06 and,
642
00:41:36 --> 00:41:41
I'll see you Wednesday,
then, for the rest of this
643
00:41:41 --> 00:41:41
course.
644
00:41:41 --> 00:41:44
Thanks.