1
00:00:01,161 --> 00:00:03,920
ANNOUNCER: The following content
is provided under a Creative
2
00:00:03,920 --> 00:00:05,310
Commons license.
3
00:00:05,310 --> 00:00:07,520
Your support will help
MIT OpenCourseWare
4
00:00:07,520 --> 00:00:11,610
continue to offer high quality
educational resources for free.
5
00:00:11,610 --> 00:00:14,180
To make a donation or to
view additional materials
6
00:00:14,180 --> 00:00:16,670
from hundreds of
MIT courses, visit
7
00:00:16,670 --> 00:00:18,540
MIT OpenCourseWare at ocw.mit.edu.
8
00:00:22,824 --> 00:00:25,960
PROFESSOR: So this is a big
day mathematically speaking,
9
00:00:25,960 --> 00:00:32,320
because we come to
this key idea, which is
10
00:00:32,320 --> 00:00:34,000
a little bit like eigenvalues.
11
00:00:34,000 --> 00:00:36,970
Well, a lot like
eigenvalues, but different
12
00:00:36,970 --> 00:00:44,410
because the matrix A now is
more usually rectangular.
13
00:00:44,410 --> 00:00:48,250
So for a rectangular matrix,
the whole idea of eigenvalues
14
00:00:48,250 --> 00:00:54,160
is shot because if I
multiply A times a vector
15
00:00:54,160 --> 00:00:59,590
x in n dimensions, out will
come something in m dimensions
16
00:00:59,590 --> 00:01:01,810
and it's not going
to equal lambda x.
17
00:01:01,810 --> 00:01:08,620
So Ax equal lambda x is not even
possible if A is rectangular.
18
00:01:08,620 --> 00:01:13,330
And even if A is square, what
are the problems, just thinking
19
00:01:13,330 --> 00:01:16,420
for a minute about eigenvalues?
20
00:01:16,420 --> 00:01:20,950
The case I wrote up
here is the great case
21
00:01:20,950 --> 00:01:24,790
where I have a symmetric
matrix and then it's
22
00:01:24,790 --> 00:01:28,720
got a full set of
eigenvalues and eigenvectors
23
00:01:28,720 --> 00:01:31,360
and they're
orthogonal, all good.
24
00:01:31,360 --> 00:01:36,730
But for a general square
matrix, either the eigenvectors
25
00:01:36,730 --> 00:01:38,800
are complex--
26
00:01:38,800 --> 00:01:42,550
eigenvalues are complex
or the eigenvectors
27
00:01:42,550 --> 00:01:43,870
are not orthogonal.
28
00:01:46,780 --> 00:01:50,410
So we can't stay with
eigenvalues forever.
29
00:01:50,410 --> 00:01:52,000
That's what I'm saying.
30
00:01:52,000 --> 00:01:55,840
And this is the
right thing to do.
31
00:01:55,840 --> 00:01:57,910
So what are these pieces?
32
00:01:57,910 --> 00:02:07,710
So these are the left and these
are the right singular vectors.
33
00:02:07,710 --> 00:02:10,035
So the new
word is singular.
34
00:02:16,170 --> 00:02:18,660
And in between go the--
35
00:02:18,660 --> 00:02:23,790
not the eigenvalues,
but the singular values.
36
00:02:23,790 --> 00:02:25,780
So we've got the
whole point now.
37
00:02:25,780 --> 00:02:27,690
You've got to pick up on this.
38
00:02:27,690 --> 00:02:32,490
There are two sets of
singular vectors, not one.
39
00:02:32,490 --> 00:02:38,540
For eigenvectors, we just
had one set, the Q's.
40
00:02:38,540 --> 00:02:41,690
Now we have a
rectangular matrix,
41
00:02:41,690 --> 00:02:47,870
we've got one set of left
singular vectors in m dimensions,
42
00:02:47,870 --> 00:02:50,630
and we've got another set
of right singular vectors
43
00:02:50,630 --> 00:02:53,030
in n dimensions.
44
00:02:53,030 --> 00:02:58,280
And numbers in between
are not eigenvalues,
45
00:02:58,280 --> 00:02:59,700
but singular values.
46
00:02:59,700 --> 00:03:00,770
So these guys are--
47
00:03:03,440 --> 00:03:05,150
let me write what
that looks like.
48
00:03:05,150 --> 00:03:11,090
This is, again, a diagonal
matrix sigma 1 to sigma r,
49
00:03:11,090 --> 00:03:13,470
let's say.
50
00:03:13,470 --> 00:03:17,250
So it's again, a diagonal
matrix in the middle.
51
00:03:17,250 --> 00:03:26,670
But the numbers on the
diagonal are all positive or 0.
52
00:03:26,670 --> 00:03:29,110
And they're called
singular values.
53
00:03:29,110 --> 00:03:31,920
So it's just a different world.
54
00:03:31,920 --> 00:03:35,940
So really, the first step we
have to do, the math step,
55
00:03:35,940 --> 00:03:41,550
is to show that
any matrix can be
56
00:03:41,550 --> 00:03:47,630
factored into u times
sigma times v transpose.
57
00:03:47,630 --> 00:03:51,810
So that's the parallel
to the spectral theorem
58
00:03:51,810 --> 00:03:55,650
that any symmetric matrix
could be factored that way.
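A minimal NumPy sketch of the factorization just stated, assuming a small random 3 by 5 matrix as the example (the matrix, the seed, and the variable names are illustrative and not from the lecture). It asks `np.linalg.svd` for U, the singular values, and V transpose, then rebuilds A.

```python
import numpy as np

# Illustrative rectangular matrix: m = 3 rows, n = 5 columns (an assumption).
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))

# full_matrices=True returns U as m by m and Vt (V transpose) as n by n.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# s holds the singular values; place them on the diagonal of an m by n Sigma.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

# Rebuild A from the three factors.
A_rebuilt = U @ Sigma @ Vt
```

Here `U` and `Vt` come out with orthonormal columns and rows, and `s` is sorted from largest to smallest, which matches the convention used later in the lecture.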
59
00:03:55,650 --> 00:03:58,380
So you're good for that part.
60
00:03:58,380 --> 00:04:04,230
We just have to do it to see
what are u and sigma and v?
61
00:04:04,230 --> 00:04:08,280
What are those vectors
and those singular values?
62
00:04:08,280 --> 00:04:09,810
Let's go.
63
00:04:09,810 --> 00:04:16,990
So the key is that A
transpose A is a great matrix.
64
00:04:16,990 --> 00:04:24,370
So that's the key to the
math is A transpose A. So
65
00:04:24,370 --> 00:04:26,650
what are the properties
of A transpose A?
66
00:04:26,650 --> 00:04:28,970
A is rectangular again.
67
00:04:28,970 --> 00:04:34,710
So maybe m by n A transpose.
68
00:04:34,710 --> 00:04:36,715
So this was m by n.
69
00:04:36,715 --> 00:04:39,760
And this was n by m.
70
00:04:39,760 --> 00:04:44,500
So we get a result
that's n by n.
71
00:04:44,500 --> 00:04:48,630
And what else can you tell
me about A transpose A?
72
00:04:48,630 --> 00:04:49,780
It's symmetric.
73
00:04:49,780 --> 00:04:51,610
That's a big deal.
74
00:04:51,610 --> 00:04:53,230
And it's square.
75
00:04:53,230 --> 00:04:55,690
And well yeah, you
can tell me more now,
76
00:04:55,690 --> 00:04:58,690
because we talked
about something,
77
00:04:58,690 --> 00:05:04,390
a topic that's a little more
than symmetric last time.
78
00:05:04,390 --> 00:05:09,070
The matrix A transpose A
will be positive semidefinite.
79
00:05:09,070 --> 00:05:13,990
Its eigenvalues are
greater or equal to 0.
80
00:05:13,990 --> 00:05:18,190
And that will mean that we
can take their square roots.
81
00:05:18,190 --> 00:05:19,720
And that's what we will do.
82
00:05:19,720 --> 00:05:23,020
So A transpose A will
have a factorization.
83
00:05:23,020 --> 00:05:24,380
It's symmetric.
84
00:05:24,380 --> 00:05:29,200
It'll have, like, a Q
lambda Q transpose, but I'm
85
00:05:29,200 --> 00:05:31,630
going to call it V lambda--
86
00:05:31,630 --> 00:05:36,610
no, yeah, lambda-- I'll still
call it lambda V transpose.
87
00:05:36,610 --> 00:05:39,700
So these V's-- what do we know
about eigenvectors of these
88
00:05:39,700 --> 00:05:43,180
V's or eigenvectors of this guy?
89
00:05:43,180 --> 00:05:46,030
Square, symmetric,
positive semidefinite matrix.
90
00:05:46,030 --> 00:05:47,740
So we're in good shape.
91
00:05:47,740 --> 00:05:52,420
And what do we know about the
eigenvalues of A transpose A?
92
00:05:52,420 --> 00:05:55,380
They are all positive.
93
00:05:55,380 --> 00:05:59,230
So the eigenvalues are--
well, or equal to 0.
94
00:05:59,230 --> 00:06:02,905
And these guys are orthogonal.
95
00:06:02,905 --> 00:06:04,780
And these guys are
greater or equal to 0.
96
00:06:07,330 --> 00:06:09,700
So that's good.
97
00:06:09,700 --> 00:06:12,520
That's one of our--
98
00:06:12,520 --> 00:06:14,300
We'll depend a lot on that.
99
00:06:14,300 --> 00:06:17,365
But also, you've got
to recognize that A,
100
00:06:17,365 --> 00:06:24,310
A transpose is a different
guy, A, A transpose.
101
00:06:24,310 --> 00:06:28,300
So what's the shape
of A, A transpose?
102
00:06:28,300 --> 00:06:30,160
How big is that?
103
00:06:30,160 --> 00:06:33,180
Now I've got-- what do I have?
104
00:06:33,180 --> 00:06:35,220
M by n times n by m.
105
00:06:35,220 --> 00:06:37,580
So this will be what size?
106
00:06:37,580 --> 00:06:38,340
M by m.
107
00:06:38,340 --> 00:06:44,370
Different shape but with
the same eigenvalues--
108
00:06:44,370 --> 00:06:45,510
the same eigenvalues.
109
00:06:45,510 --> 00:06:48,210
So it's going to have some other
eigenvectors, u-- of course,
110
00:06:48,210 --> 00:06:49,752
I'm going to call
them u, because I'm
111
00:06:49,752 --> 00:06:51,330
going to go in over there.
112
00:06:51,330 --> 00:06:53,140
They'll be the same.
113
00:06:53,140 --> 00:06:56,000
Well, I'm saying--
yeah, let me--
114
00:06:56,000 --> 00:07:02,650
I shouldn't-- I have to
say when I say the same,
115
00:07:02,650 --> 00:07:05,690
I can't quite literally
mean the very same,
116
00:07:05,690 --> 00:07:10,700
because this has got n
eigenvalues and this has m
117
00:07:10,700 --> 00:07:12,650
eigenvalues.
118
00:07:12,650 --> 00:07:16,660
But the missing guys, the
ones that are in one of them
119
00:07:16,660 --> 00:07:21,110
and not in the other, depending
on the sizes, are zeros.
120
00:07:21,110 --> 00:07:25,670
So really, the heart of the
thing, the non-zero eigenvalues
121
00:07:25,670 --> 00:07:26,290
are the same.
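A short numerical check of that claim, as a sketch under the assumption of a random 3 by 5 example matrix (the names below are illustrative). A transpose A is 5 by 5 and A, A transpose is 3 by 3, so the first has two extra eigenvalues, and those should be zero up to round-off.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))       # m = 3, n = 5 (illustrative sizes)

# eigvalsh is for symmetric matrices and returns eigenvalues in ascending order.
eig_AtA = np.linalg.eigvalsh(A.T @ A)  # n = 5 eigenvalues
eig_AAt = np.linalg.eigvalsh(A @ A.T)  # m = 3 eigenvalues

top_AtA = eig_AtA[-3:]    # the three largest eigenvalues of A^T A
extra_AtA = eig_AtA[:2]   # the two "missing guys", expected to be about 0
```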
122
00:07:29,060 --> 00:07:34,210
Well actually, I've
pretty much revealed
123
00:07:34,210 --> 00:07:38,350
what the SVD is going to use.
124
00:07:38,350 --> 00:07:43,520
It's going to use the U's from
here and the V's from here.
125
00:07:43,520 --> 00:07:45,280
But that's the story.
126
00:07:45,280 --> 00:07:48,550
You've got to see that story.
127
00:07:48,550 --> 00:07:52,330
So fresh start on the
singular value decomposition.
128
00:07:52,330 --> 00:07:54,110
What are we looking for?
129
00:07:54,110 --> 00:07:56,990
Well, as a factorization--
130
00:07:56,990 --> 00:07:58,030
so we're looking for--
131
00:08:03,460 --> 00:08:12,450
we want A. We want vectors v,
so that when I multiply by v--
132
00:08:12,450 --> 00:08:17,100
so if it was an eigenvector,
it would be Av equal lambda v.
133
00:08:17,100 --> 00:08:20,600
But now for A, it's rectangular.
134
00:08:20,600 --> 00:08:22,530
It hasn't got eigenvectors.
135
00:08:22,530 --> 00:08:31,780
So Av is sigma, that's the
new singular value, times u.
136
00:08:31,780 --> 00:08:38,169
That's the first guy and the
second guy and the rth guy.
137
00:08:38,169 --> 00:08:42,740
I'll stop at r, the rank.
138
00:08:42,740 --> 00:08:43,350
Oh, yeah.
139
00:08:46,047 --> 00:08:46,880
Is that what I want?
140
00:08:50,870 --> 00:08:53,740
A-- let me just see.
141
00:08:53,740 --> 00:08:56,200
Av is sigma u.
142
00:08:56,200 --> 00:08:58,570
Yeah, that's good.
143
00:08:58,570 --> 00:09:02,750
So this is what takes the
place of Ax equal lambda x.
144
00:09:02,750 --> 00:09:06,200
A times one set of
singular vectors
145
00:09:06,200 --> 00:09:09,590
gives me a number times the
other set of singular vectors.
146
00:09:09,590 --> 00:09:13,010
And why did I stop
at r the rank?
147
00:09:13,010 --> 00:09:15,360
Because after that,
the sigmas are 0.
148
00:09:15,360 --> 00:09:19,700
So after that, I could
have some more guys,
149
00:09:19,700 --> 00:09:29,030
but they'll be in the null
space, A v equals 0, on down to v n.
150
00:09:29,030 --> 00:09:32,460
So these are the important ones.
151
00:09:32,460 --> 00:09:35,910
So that's what I'm looking for.
152
00:09:35,910 --> 00:09:38,680
Let me say it now in words.
153
00:09:38,680 --> 00:09:43,540
I'm looking for a bunch
of orthogonal vectors v
154
00:09:43,540 --> 00:09:46,330
so that when I
multiply them by A
155
00:09:46,330 --> 00:09:49,910
I get a bunch of
orthogonal vectors u.
156
00:09:49,910 --> 00:09:53,570
That is not so clearly possible.
157
00:09:53,570 --> 00:09:56,090
But it is possible.
158
00:09:56,090 --> 00:09:57,320
It does happen.
159
00:09:57,320 --> 00:09:59,840
I'm looking for one set
of orthogonal vectors
160
00:09:59,840 --> 00:10:02,960
v in the input
space, you could say,
161
00:10:02,960 --> 00:10:07,850
so that the Av's in the output
space are also orthogonal.
162
00:10:11,360 --> 00:10:14,390
In our picture of
the fundamental--
163
00:10:14,390 --> 00:10:18,560
the big picture
of linear algebra,
164
00:10:18,560 --> 00:10:27,530
we have v's in this space, and
then stuff in the null space.
165
00:10:27,530 --> 00:10:34,190
And we have u's over
here in the column space
166
00:10:34,190 --> 00:10:38,160
and some stuff in the
null space over there.
167
00:10:38,160 --> 00:10:42,840
And the idea is that I
have orthogonal v's here.
168
00:10:42,840 --> 00:10:44,820
And when I multiply by A--
169
00:10:44,820 --> 00:10:48,920
so multiply by A--
170
00:10:48,920 --> 00:10:54,760
then I get orthogonal u's over
here, orthogonal to orthogonal.
171
00:10:54,760 --> 00:11:00,260
That's what makes the
v's and the u's special.
172
00:11:00,260 --> 00:11:01,510
Right?
173
00:11:01,510 --> 00:11:03,370
That's the property.
174
00:11:03,370 --> 00:11:04,960
And then when we
write down-- well,
175
00:11:04,960 --> 00:11:07,390
let me write down
what that would mean.
176
00:11:07,390 --> 00:11:11,950
So I've just drawn a
picture to go with this--
177
00:11:11,950 --> 00:11:13,930
those equations.
178
00:11:13,930 --> 00:11:16,610
That picture just goes
with these equations.
179
00:11:16,610 --> 00:11:18,970
And let me just write
down what it means.
180
00:11:18,970 --> 00:11:23,200
It means in matrix--
so I've written it.
181
00:11:23,200 --> 00:11:28,653
Oh yeah, I've written it here
in vectors one at a time.
182
00:11:28,653 --> 00:11:30,070
But of course,
you know, I'm going
183
00:11:30,070 --> 00:11:33,610
to put those vectors into
the columns of a matrix.
184
00:11:33,610 --> 00:11:44,380
So A times v1 up to
let's say vr will equal--
185
00:11:44,380 --> 00:11:45,490
oh yeah.
186
00:11:45,490 --> 00:11:47,780
It equals sigmas times u's.
187
00:11:47,780 --> 00:11:53,420
So this is what I'm
after is u1 up to ur
188
00:11:53,420 --> 00:11:58,400
multiplied by sigma
1 along to sigma r.
189
00:12:01,730 --> 00:12:05,540
What I'm doing now
is just to say I'm
190
00:12:05,540 --> 00:12:10,360
converting these
individual singular
191
00:12:10,360 --> 00:12:14,680
vectors, each v going
into a u to putting them
192
00:12:14,680 --> 00:12:16,300
all together into a matrix.
193
00:12:16,300 --> 00:12:21,670
And of course, what I've written
here is Av equals u sigma,
194
00:12:21,670 --> 00:12:25,300
Av equals u sigma.
195
00:12:28,150 --> 00:12:32,550
That's what that amounts to.
196
00:12:32,550 --> 00:12:38,790
Well, then I'm going to put
a v transpose on this side.
197
00:12:38,790 --> 00:12:44,520
And I'm going to get to A
equals u sigma v transpose,
198
00:12:44,520 --> 00:12:48,380
multiplying both sides
there by v transpose.
199
00:12:48,380 --> 00:12:51,750
I'm kind of writing the same
thing in different forms,
200
00:12:51,750 --> 00:12:55,155
matrix form, vector
at a time form.
201
00:12:57,690 --> 00:13:01,440
And now we have to find them.
202
00:13:01,440 --> 00:13:06,210
Now I've used up boards
saying what we're after,
203
00:13:06,210 --> 00:13:08,430
but now we've got to get there.
204
00:13:08,430 --> 00:13:10,335
So what are the v's
and what are the u's?
205
00:13:15,820 --> 00:13:20,700
Well, the cool idea is to
think of A transpose A.
206
00:13:20,700 --> 00:13:23,470
So you're with me on
what we're looking for.
207
00:13:23,470 --> 00:13:26,210
And now think about
A transpose A.
208
00:13:26,210 --> 00:13:29,740
So if this is what
I'm hoping for,
209
00:13:29,740 --> 00:13:33,700
what will A transpose
A turn out to be?
210
00:13:36,940 --> 00:13:41,770
So big moment that's going
to reveal what the v's are.
211
00:13:41,770 --> 00:13:46,000
So if I form A transpose A--
212
00:13:46,000 --> 00:13:50,080
so A transpose-- so I got
to transpose this guy.
213
00:13:50,080 --> 00:13:57,390
So A transpose is V sigma
transpose U transpose, right?
214
00:13:57,390 --> 00:14:04,670
And then comes A, which is
this, U sigma V transpose.
215
00:14:04,670 --> 00:14:08,040
So why did I do that?
216
00:14:08,040 --> 00:14:12,810
Why is it that A transpose A
is the cool thing to look at
217
00:14:12,810 --> 00:14:14,550
to make the problem simpler?
218
00:14:14,550 --> 00:14:20,370
Well, what becomes simpler
in that line just written?
219
00:14:20,370 --> 00:14:25,960
U transpose U is the
identity, because I'm looking
220
00:14:25,960 --> 00:14:29,830
for orthogonal, in
fact orthonormal U's.
221
00:14:29,830 --> 00:14:31,240
So that's the identity.
222
00:14:31,240 --> 00:14:38,980
So this is V sigma
transpose sigma V transpose.
223
00:14:38,980 --> 00:14:42,080
And I'll put parentheses
around that because that's
224
00:14:42,080 --> 00:14:43,100
a diagonal matrix.
225
00:14:48,800 --> 00:14:50,180
What does that tell me?
226
00:14:50,180 --> 00:14:53,440
What does that tell all of us?
227
00:14:53,440 --> 00:14:55,840
A transpose A has this form.
228
00:14:55,840 --> 00:14:57,420
Now we've seen that form before.
229
00:14:57,420 --> 00:15:01,320
We know that this is a
symmetric matrix, symmetric
230
00:15:01,320 --> 00:15:03,090
and even positive semidefinite.
231
00:15:03,090 --> 00:15:06,160
So what are the v's?
232
00:15:06,160 --> 00:15:13,860
The v's are the eigenvectors
of A transpose A.
233
00:15:13,860 --> 00:15:20,770
This is the Q lambda Q transpose
for that symmetric matrix.
234
00:15:20,770 --> 00:15:24,360
So we know the v's
are the eigenvectors,
235
00:15:24,360 --> 00:15:32,040
v is the eigenvectors
of A transpose A.
236
00:15:32,040 --> 00:15:35,820
I guess we're also going
to get the singular values.
237
00:15:35,820 --> 00:15:41,760
So the sigma transpose sigma,
which will be the sigma squared
238
00:15:41,760 --> 00:15:51,800
are the eigenvalues of
A transpose A. Good!
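Written out as a worked equation, the step just described looks like this. It is a sketch that assumes the hoped-for factorization $A = U \Sigma V^T$ with orthonormal columns in $U$ and $V$; the $\lambda_i$ notation for the eigenvalues of $A^T A$ is added here for clarity.

```latex
\begin{aligned}
A^T A &= (U \Sigma V^T)^T (U \Sigma V^T) \\
      &= V \Sigma^T U^T U \Sigma V^T \\
      &= V (\Sigma^T \Sigma) V^T
      \qquad \text{since } U^T U = I .
\end{aligned}
% \Sigma^T \Sigma is diagonal with entries \sigma_1^2, \dots, \sigma_r^2, 0, \dots, 0,
% so this has the form Q \Lambda Q^T with Q = V and
% \lambda_i = \sigma_i^2, \quad \text{i.e.} \quad \sigma_i = \sqrt{\lambda_i} .
```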
239
00:15:55,010 --> 00:15:58,190
Sort of by looking for the
correct thing, U sigma V
240
00:15:58,190 --> 00:16:01,790
transpose and then just
using the U transpose U
241
00:16:01,790 --> 00:16:04,640
equal identity, we got
it back to something
242
00:16:04,640 --> 00:16:07,800
we perfectly recognize.
243
00:16:07,800 --> 00:16:09,430
A transpose A has that form.
244
00:16:09,430 --> 00:16:11,400
So now we know what the V's are.
245
00:16:11,400 --> 00:16:17,940
And if I do it the other way,
which, what's the other way?
246
00:16:17,940 --> 00:16:20,220
Instead of A transpose
A, the other way
247
00:16:20,220 --> 00:16:23,760
is to look at A, A transpose.
248
00:16:23,760 --> 00:16:26,910
And if I write all
that down, that A
249
00:16:26,910 --> 00:16:32,910
is the U sigma V transpose, and
the A transpose is the V sigma
250
00:16:32,910 --> 00:16:35,230
transpose U transpose.
251
00:16:35,230 --> 00:16:39,030
And again, this stuff
goes away and leaves me
252
00:16:39,030 --> 00:16:45,610
with U sigma, sigma
transpose U transpose.
253
00:16:45,610 --> 00:16:47,990
So I know what the U's are too.
254
00:16:47,990 --> 00:16:53,990
They are eigenvectors
of A, A transpose.
255
00:17:00,430 --> 00:17:03,670
Isn't that a beautiful symmetry?
256
00:17:03,670 --> 00:17:06,940
You just-- A transpose
A and A, A transpose
257
00:17:06,940 --> 00:17:08,440
are two different guys now.
258
00:17:08,440 --> 00:17:13,089
So each has its own
eigenvectors and we use both.
259
00:17:13,089 --> 00:17:15,680
It's just right.
260
00:17:15,680 --> 00:17:19,190
And I just have to
take the final step,
261
00:17:19,190 --> 00:17:23,730
and we've established the SVD.
262
00:17:23,730 --> 00:17:26,550
So the final step is to remember
what I'm going for here.
263
00:17:29,840 --> 00:17:33,030
A times a v is supposed
to be a sigma times a u.
264
00:17:36,210 --> 00:17:37,960
See, what I have
to deal with now
265
00:17:37,960 --> 00:17:41,870
is I haven't quite finished.
266
00:17:41,870 --> 00:17:44,020
It's just perfect
as far as it goes,
267
00:17:44,020 --> 00:17:48,710
but it hasn't gone to
the end yet because we
268
00:17:48,710 --> 00:17:51,650
could have double eigenvalues
and triple eigenvalues,
269
00:17:51,650 --> 00:17:56,590
and all those horrible
possibilities.
270
00:17:56,590 --> 00:18:01,400
And if I have triple eigenvalues
or double eigenvalues,
271
00:18:01,400 --> 00:18:04,030
then what's the deal
with eigenvectors
272
00:18:04,030 --> 00:18:05,750
if I have double eigenvalues?
273
00:18:05,750 --> 00:18:10,630
Suppose a matrix,
a symmetric matrix,
274
00:18:10,630 --> 00:18:12,770
has a double eigenvalue.
275
00:18:12,770 --> 00:18:14,610
Let me just take an example.
276
00:18:14,610 --> 00:18:20,950
So a symmetric matrix like,
say, diagonal 1, 1, 5.
277
00:18:20,950 --> 00:18:23,550
Why not?
278
00:18:23,550 --> 00:18:25,620
What's the deal
with eigenvectors
279
00:18:25,620 --> 00:18:29,510
for that matrix 1, 1, 5?
280
00:18:29,510 --> 00:18:31,610
So 5 has got an eigenvector.
281
00:18:31,610 --> 00:18:35,720
You can see what it is, 0, 0, 1.
282
00:18:35,720 --> 00:18:39,410
What about eigenvectors
that go with lambda equal 1
283
00:18:39,410 --> 00:18:42,440
for that matrix?
284
00:18:42,440 --> 00:18:43,230
What's up?
285
00:18:43,230 --> 00:18:48,530
What would be eigenvectors
for a lambda equal 1?
286
00:18:48,530 --> 00:18:52,040
Unfortunately, there is
a whole plane of them.
287
00:18:52,040 --> 00:18:56,840
Any vector of the form x, y, 0.
288
00:18:56,840 --> 00:19:04,220
Any vector in the x, y
plane would produce x, y, 0.
289
00:19:04,220 --> 00:19:06,440
So I have a whole
plane of eigenvectors.
290
00:19:06,440 --> 00:19:11,420
And I've got to pick two that
are orthogonal, which I can do.
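The double-eigenvalue example can be checked directly. This sketch assumes the matrix meant is the diagonal matrix with 1, 1, 5 on the diagonal; the particular vectors `x`, `q1`, `q2` are illustrative choices, not ones made in the lecture.

```python
import numpy as np

S = np.diag([1.0, 1.0, 5.0])          # symmetric, eigenvalues 1, 1, 5

# Any vector of the form (x, y, 0) is an eigenvector for lambda = 1.
x = np.array([3.0, -2.0, 0.0])
Sx = S @ x                             # should equal 1 * x

# One possible orthonormal pair picked from that plane (many others work).
q1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
q2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)

# The eigenvector for lambda = 5.
e3 = np.array([0.0, 0.0, 1.0])
```

Because the choice of `q1`, `q2` inside the plane is not unique, the u's cannot be picked independently of the v's, which is the point the lecture turns to next.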
291
00:19:11,420 --> 00:19:12,940
And then they have to be--
292
00:19:12,940 --> 00:19:15,860
in the SVD those
two orthogonal guys
293
00:19:15,860 --> 00:19:18,420
have to go to two
orthogonal guys.
294
00:19:18,420 --> 00:19:23,732
In other words, it's a
little bit of detail here,
295
00:19:23,732 --> 00:19:28,710
a little getting into
this exactly what is--
296
00:19:28,710 --> 00:19:38,950
well, actually, let
me tell you the steps.
297
00:19:38,950 --> 00:19:44,710
So I use this to conclude that
the V's the singular vectors
298
00:19:44,710 --> 00:19:46,270
should be eigenvectors.
299
00:19:46,270 --> 00:19:49,240
I concluded those
guys from this step.
300
00:19:49,240 --> 00:19:52,240
Now I'm not going to
use this step so much.
301
00:19:52,240 --> 00:19:57,010
Of course, it's in the back of
my mind but I'm not using it.
302
00:19:57,010 --> 00:20:00,820
I'm going to get
the u's from here.
303
00:20:00,820 --> 00:20:17,190
So u1 is A v1 over sigma
1, up to ur is A vr over sigma r.
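The recipe so far can be sketched numerically: take the v's and sigmas from A transpose A, then set each u to A v over sigma. This is only an illustration of the proof's construction on an assumed random 4 by 3 matrix; it is not how the SVD is computed in practice, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))        # illustrative m = 4, n = 3

# Eigen-decomposition of the symmetric matrix A^T A (ascending eigenvalues).
lam, V = np.linalg.eigh(A.T @ A)

# Reorder so the largest eigenvalue comes first, matching sigma_1 >= sigma_2 >= ...
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

# Keep the first r = rank(A) pairs; sigma_i is the square root of lambda_i.
r = np.linalg.matrix_rank(A)
sigma = np.sqrt(lam[:r])
V_r = V[:, :r]

# Column i of U_r is A v_i / sigma_i (broadcasting divides column by column).
U_r = (A @ V_r) / sigma
```

With these pieces, `U_r @ np.diag(sigma) @ V_r.T` should reproduce `A`, and the columns of `U_r` should come out orthonormal.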
304
00:20:17,190 --> 00:20:20,330
You see what I'm doing here?
305
00:20:20,330 --> 00:20:28,540
I'm picking in a possible
plane of things the one I want,
306
00:20:28,540 --> 00:20:29,490
the u's I want.
307
00:20:29,490 --> 00:20:31,510
So I've chosen the v's.
308
00:20:31,510 --> 00:20:33,640
I've chosen the sigmas.
309
00:20:33,640 --> 00:20:37,750
They were fixed
for A transpose A.
310
00:20:37,750 --> 00:20:40,780
The eigenvectors are
v's, the things--
311
00:20:40,780 --> 00:20:44,770
the eigenvalues
are sigma squared.
312
00:20:44,770 --> 00:20:49,460
And now then this
is the u I want.
313
00:20:49,460 --> 00:20:50,980
Are you with me?
314
00:20:50,980 --> 00:20:56,830
So I want to get
these u's correct.
315
00:20:56,830 --> 00:20:59,150
And if I have a whole
plane of possibilities,
316
00:20:59,150 --> 00:21:01,890
I got to pick the right one.
317
00:21:01,890 --> 00:21:05,600
And now finally, I have to
show that it's the right one.
318
00:21:05,600 --> 00:21:08,970
So what is left to show?
319
00:21:08,970 --> 00:21:14,100
I should show that these u's are
eigenvectors of A, A transpose.
320
00:21:16,850 --> 00:21:21,050
And I should show that
they're orthogonal.
321
00:21:21,050 --> 00:21:22,850
That's the key.
322
00:21:22,850 --> 00:21:27,320
I would like to show that
these are orthogonal.
323
00:21:27,320 --> 00:21:30,420
And that's what goes
in this picture.
324
00:21:30,420 --> 00:21:32,690
The v's-- I've got
orthogonal guys,
325
00:21:32,690 --> 00:21:36,410
because they're the eigenvectors
of a symmetric matrix.
326
00:21:36,410 --> 00:21:37,850
Pick them orthogonal.
327
00:21:37,850 --> 00:21:40,490
But now I'm multiplying
by A, so I'm
328
00:21:40,490 --> 00:21:46,370
getting the u which is Av over
sigma for the basis vectors.
329
00:21:46,370 --> 00:21:48,290
And I have to show
they're orthogonal.
330
00:21:48,290 --> 00:21:52,490
So this is like
the final moment.
331
00:21:52,490 --> 00:21:54,590
Does everything
come together right?
332
00:21:57,920 --> 00:22:02,480
If I've picked the v's as the
eigenvectors of A transpose A,
333
00:22:02,480 --> 00:22:09,420
and then I take these for
the u, are they orthogonal?
334
00:22:09,420 --> 00:22:12,900
So I would like to think
that we can check that fact
335
00:22:12,900 --> 00:22:15,550
and that it will come out.
336
00:22:15,550 --> 00:22:18,570
Could you just help
me through this one?
337
00:22:18,570 --> 00:22:23,716
I'll never ask for anything
again, just get the SVD one.
338
00:22:31,560 --> 00:22:37,580
So I would like
to show that u1--
339
00:22:37,580 --> 00:22:40,250
so let me put up what I'm doing.
340
00:22:40,250 --> 00:22:47,610
I'm trying to show that
u1 transpose u2 is 0.
341
00:22:47,610 --> 00:22:49,680
They're orthogonal.
342
00:22:49,680 --> 00:22:57,890
So u1 is A v1 over sigma 1.
343
00:22:57,890 --> 00:22:59,040
That's transpose.
344
00:22:59,040 --> 00:23:00,740
That's u1.
345
00:23:00,740 --> 00:23:06,000
And u2 is A v2 over sigma 2.
346
00:23:06,000 --> 00:23:08,280
And I want to get 0.
347
00:23:08,280 --> 00:23:13,560
The whole conversation
is ending right here.
348
00:23:13,560 --> 00:23:16,866
Why is that thing 0?
349
00:23:16,866 --> 00:23:19,370
The v's are orthogonal.
350
00:23:19,370 --> 00:23:21,850
We know the v's are orthogonal.
351
00:23:21,850 --> 00:23:25,070
They're orthogonal
eigenvectors of A transpose A.
352
00:23:25,070 --> 00:23:26,580
Let me repeat that.
353
00:23:26,580 --> 00:23:36,970
The v's are orthogonal
eigenvectors of A transpose A,
354
00:23:36,970 --> 00:23:40,150
which I know we can find them.
355
00:23:40,150 --> 00:23:42,520
Then I chose the u's to be this.
356
00:23:42,520 --> 00:23:44,920
And I want to get the answer 0.
357
00:23:44,920 --> 00:23:48,470
Are you ready to do it?
358
00:23:48,470 --> 00:23:53,210
We want to compute
that and get 0.
359
00:23:53,210 --> 00:23:56,580
So what do I get?
360
00:23:56,580 --> 00:23:57,680
We just have to do it.
361
00:23:57,680 --> 00:24:02,180
So I can see that the
denominator is that.
362
00:24:02,180 --> 00:24:09,030
So it is v1 transpose
A transpose times A v2.
363
00:24:14,870 --> 00:24:17,690
And I'm hoping to get 0.
364
00:24:17,690 --> 00:24:21,060
Do I get 0 here?
365
00:24:21,060 --> 00:24:23,240
You hope so.
366
00:24:23,240 --> 00:24:25,060
v1 is orthogonal to v2.
367
00:24:25,060 --> 00:24:28,900
But I've got A transpose A
stuck in the middle there.
368
00:24:28,900 --> 00:24:33,210
So what happens here?
369
00:24:33,210 --> 00:24:34,230
How do I look at that?
370
00:24:37,530 --> 00:24:44,960
v2 is an eigenvector of
A transpose A. Terrific!
371
00:24:47,630 --> 00:24:50,670
So this is v1 transpose.
372
00:24:50,670 --> 00:24:54,030
And this is the matrix times v2.
373
00:24:54,030 --> 00:25:00,420
So that's sigma 2
squared v2, isn't it?
374
00:25:00,420 --> 00:25:04,050
It's the eigenvector
with eigenvalue sigma
375
00:25:04,050 --> 00:25:07,550
2 squared times v2.
376
00:25:07,550 --> 00:25:10,825
Yeah, divided by
sigma 1 sigma 2.
377
00:25:16,410 --> 00:25:18,000
So the A's are out of there now.
378
00:25:21,310 --> 00:25:25,310
So I've just got these
numbers, sigma 2 squared.
379
00:25:25,310 --> 00:25:28,900
So that would be
sigma 2 over sigma 1--
380
00:25:28,900 --> 00:25:34,660
I've accounted for these numbers
here-- times v1 transpose v2.
381
00:25:34,660 --> 00:25:40,810
And now what's up?
382
00:25:40,810 --> 00:25:43,030
They're orthonormal.
383
00:25:43,030 --> 00:25:44,200
We got it.
384
00:25:44,200 --> 00:25:45,610
That's 0.
385
00:25:45,610 --> 00:25:49,020
That is 0 there, yeah.
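The board computation, collected into one line. It uses only what was stated: $u_i = A v_i / \sigma_i$, the $v$'s are orthonormal, and $A^T A v_2 = \sigma_2^2 v_2$.

```latex
u_1^T u_2
  = \left( \frac{A v_1}{\sigma_1} \right)^{T} \left( \frac{A v_2}{\sigma_2} \right)
  = \frac{v_1^T (A^T A\, v_2)}{\sigma_1 \sigma_2}
  = \frac{\sigma_2^2 \, v_1^T v_2}{\sigma_1 \sigma_2}
  = \frac{\sigma_2}{\sigma_1} \, v_1^T v_2
  = 0 .
```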
386
00:25:49,020 --> 00:25:53,710
So not only are the v's
orthogonal to each other,
387
00:25:53,710 --> 00:25:57,490
but because they're eigenvectors
of A transpose A, when
388
00:25:57,490 --> 00:26:00,460
I do this, I discover
that the Av's
389
00:26:00,460 --> 00:26:06,460
are orthogonal to each other
over in the column space.
390
00:26:06,460 --> 00:26:11,010
So orthogonal v's in the
row space, orthogonal Av's
391
00:26:11,010 --> 00:26:13,110
over in column space.
392
00:26:13,110 --> 00:26:19,260
That was discovered late--
long after eigenvectors.
393
00:26:19,260 --> 00:26:22,980
And it's an interesting history.
394
00:26:22,980 --> 00:26:26,430
And it just comes out right.
395
00:26:26,430 --> 00:26:30,930
And then it was discovered,
but not much used, for oh,
396
00:26:30,930 --> 00:26:32,670
100 years probably.
397
00:26:32,670 --> 00:26:38,880
And then people saw that it
was exactly the right thing,
398
00:26:38,880 --> 00:26:41,370
and data matrices
became important, which
399
00:26:41,370 --> 00:26:47,060
are large rectangular matrices.
400
00:26:47,060 --> 00:26:48,530
And we have not--
401
00:26:48,530 --> 00:26:52,040
oh, I better say a word, just
a word here about actually
402
00:26:52,040 --> 00:26:58,310
computing the v's and
sigmas and the u's
403
00:26:58,310 --> 00:27:02,030
So how would you
actually find them?
404
00:27:02,030 --> 00:27:05,870
What I most want to
say is you would not
405
00:27:05,870 --> 00:27:09,065
go this A transpose A route.
406
00:27:14,530 --> 00:27:15,750
Why is that?
407
00:27:15,750 --> 00:27:18,300
Is that a big mistake?
408
00:27:18,300 --> 00:27:24,180
If you have a matrix
A, say 5,000 by 10,000,
409
00:27:24,180 --> 00:27:27,480
why is it a mistake
to actually use
410
00:27:27,480 --> 00:27:29,790
A transpose A in
the computation?
411
00:27:29,790 --> 00:27:33,840
We used it heavily in the proof.
412
00:27:33,840 --> 00:27:37,530
And we could find another proof
that wouldn't use it so much.
413
00:27:37,530 --> 00:27:45,310
But why would I not
multiply these two together?
414
00:27:45,310 --> 00:27:48,320
It's very big, very expensive.
415
00:27:48,320 --> 00:27:55,450
It adds in a whole
lot of round off--
416
00:27:55,450 --> 00:27:57,780
you have a matrix that's now--
417
00:27:57,780 --> 00:28:02,580
its vulnerability to round
off errors is squared--
418
00:28:02,580 --> 00:28:05,220
that's called its condition
number-- gets squared.
419
00:28:05,220 --> 00:28:07,560
And you just don't go there.
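A small sketch of the condition-number remark, assuming a 6 by 4 matrix built on purpose with singular values 1, 0.1, 0.01, 0.001 (the construction through QR factors and all names are illustrative). In the 2-norm the condition number is the ratio of the largest to the smallest singular value, so forming A transpose A should square it.

```python
import numpy as np

rng = np.random.default_rng(3)

# Orthonormal factors from reduced QR of random matrices (illustrative construction).
Q1, _ = np.linalg.qr(rng.standard_normal((6, 4)))   # 6 by 4, orthonormal columns
Q2, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # 4 by 4, orthogonal

# Chosen singular values, so cond(A) = 1 / 1e-3 = 1e3 by construction.
s = np.array([1.0, 1e-1, 1e-2, 1e-3])
A = Q1 @ np.diag(s) @ Q2.T

# np.linalg.cond defaults to the 2-norm (sigma_max / sigma_min).
cond_A = np.linalg.cond(A)
cond_AtA = np.linalg.cond(A.T @ A)   # expected to be close to cond_A ** 2
```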
420
00:28:07,560 --> 00:28:12,330
So the actual computational
methods are quite different.
421
00:28:12,330 --> 00:28:14,770
And we'll talk about those.
422
00:28:14,770 --> 00:28:18,550
But the A transpose
A, because it's
423
00:28:18,550 --> 00:28:24,100
symmetric positive semidefinite,
made the proof so nice.
424
00:28:24,100 --> 00:28:30,640
You've seen the nicest
proof, I'd say, of the--
425
00:28:30,640 --> 00:28:33,560
Now I should think
about the geometry.
426
00:28:33,560 --> 00:28:38,150
So what does A
equal U sigma--
427
00:28:38,150 --> 00:28:45,570
Maybe I take another
board, but it will fill it.
428
00:28:45,570 --> 00:28:49,110
But it's a good U
sigma V transpose.
429
00:28:52,120 --> 00:28:55,190
So it's got three factors there.
430
00:28:55,190 --> 00:28:57,500
And I would say
each factor
431
00:28:57,500 --> 00:28:59,840
is kind of a special matrix.
432
00:28:59,840 --> 00:29:01,980
U and V are orthogonal matrices.
433
00:29:01,980 --> 00:29:05,090
So I think of
those as rotations.
434
00:29:05,090 --> 00:29:07,370
Sigma is a diagonal matrix.
435
00:29:07,370 --> 00:29:09,200
I think of it as stretching.
436
00:29:09,200 --> 00:29:11,310
So now I'm just going
to draw the picture.
437
00:29:11,310 --> 00:29:14,450
So here's unit vectors.
438
00:29:17,170 --> 00:29:22,420
And the first thing--
so if I multiply by x,
439
00:29:22,420 --> 00:29:24,370
this is the first
thing that happens.
440
00:29:24,370 --> 00:29:26,530
So that rotates.
441
00:29:26,530 --> 00:29:28,900
So here's x's.
442
00:29:28,900 --> 00:29:31,720
Then V transpose x's.
443
00:29:31,720 --> 00:29:35,730
That's still a circle.
Length doesn't change
444
00:29:35,730 --> 00:29:39,420
for those, when I multiply
by an orthogonal matrix.
445
00:29:39,420 --> 00:29:43,670
But the vectors turned.
446
00:29:43,670 --> 00:29:46,250
It's a rotation.
447
00:29:46,250 --> 00:29:49,580
Could be a reflection, but
let's keep it as a rotation.
448
00:29:49,580 --> 00:29:52,250
Now what does sigma do?
449
00:29:52,250 --> 00:29:55,450
So I have this unit circle.
450
00:29:55,450 --> 00:29:56,090
I'm in 2D.
451
00:29:59,460 --> 00:30:03,390
So I'm drawing a
picture of the vectors.
452
00:30:03,390 --> 00:30:08,120
These are the unit
vectors in 2D, x,y.
453
00:30:08,120 --> 00:30:12,620
They got turned by
the orthogonal matrix.
454
00:30:12,620 --> 00:30:17,160
What does sigma do
to that picture?
455
00:30:17,160 --> 00:30:20,220
It stretches, because
sigma multiplies
456
00:30:20,220 --> 00:30:22,500
by sigma 1 in the
first component,
457
00:30:22,500 --> 00:30:24,120
sigma 2 in the second.
458
00:30:24,120 --> 00:30:26,130
So it stretches these guys.
459
00:30:26,130 --> 00:30:30,000
And let's suppose this is
number 1 and this is number 2,
460
00:30:30,000 --> 00:30:32,190
this is number 1 and number 2.
461
00:30:32,190 --> 00:30:36,320
So sigma 1, our
convention is sigma 1--
462
00:30:36,320 --> 00:30:39,360
we always take sigma 1
greater or equal to sigma 2,
463
00:30:39,360 --> 00:30:43,720
greater or equal whatever,
greater equal, sigma rank.
464
00:30:43,720 --> 00:30:46,530
And they're all positive.
465
00:30:49,110 --> 00:30:51,540
And the rest are 0.
466
00:30:51,540 --> 00:30:53,890
So sigma 1 will be
bigger than sigma 2.
467
00:30:53,890 --> 00:30:58,660
So I'm expecting a
circle goes to an ellipse
468
00:30:58,660 --> 00:30:59,905
when you stretch--
469
00:31:02,680 --> 00:31:07,480
I didn't get it quite
perfect, but not bad.
470
00:31:07,480 --> 00:31:15,250
So this would be sigma
2 v2, sigma 1 v1,
471
00:31:15,250 --> 00:31:17,890
and this would be sigma 2 v2.
472
00:31:17,890 --> 00:31:19,105
And we now have an ellipse.
473
00:31:22,260 --> 00:31:25,190
So we started with
x's in a circle.
474
00:31:25,190 --> 00:31:26,540
We rotated.
475
00:31:26,540 --> 00:31:27,560
We stretched.
476
00:31:27,560 --> 00:31:31,610
And now the final step
is take these guys
477
00:31:31,610 --> 00:31:33,650
and multiply them by U.
478
00:31:33,650 --> 00:31:38,150
So this was the
sigma V transpose x.
479
00:31:38,150 --> 00:31:42,050
And now I'm ready for the
U part which comes last
480
00:31:42,050 --> 00:31:44,060
because it's at the left.
481
00:31:44,060 --> 00:31:45,200
And what happens?
482
00:31:45,200 --> 00:31:46,340
What's the picture now?
483
00:31:50,040 --> 00:31:52,710
What does U do to the ellipse?
484
00:31:52,710 --> 00:31:54,370
It rotates it.
485
00:31:54,370 --> 00:31:56,820
It's another orthogonal matrix.
486
00:31:56,820 --> 00:31:59,370
It rotates it
somewhere, maybe there.
487
00:32:03,060 --> 00:32:11,330
And now we see the
u's, u2 and u1.
488
00:32:20,950 --> 00:32:23,980
Well, let me think about that.
489
00:32:23,980 --> 00:32:26,460
Basically, that's--
no, that's right.
490
00:32:29,150 --> 00:32:34,010
So this SVD is telling us
something quite remarkable
491
00:32:34,010 --> 00:32:36,800
that every linear
transformation,
492
00:32:36,800 --> 00:32:40,040
every matrix
multiplication factors
493
00:32:40,040 --> 00:32:46,020
into a rotation times a stretch
times a different rotation,
494
00:32:46,020 --> 00:32:50,350
but possibly different.
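To see the rotate, stretch, rotate picture numerically, here is a minimal sketch. The 2 by 2 matrix `A` and the helper name `apply` are assumptions chosen for illustration, not something from the board. It pushes points of the unit circle through `A` and compares the longest and shortest images with the square roots of the eigenvalues of A transpose A.

```python
import math

# Hypothetical 2x2 example matrix (an assumption for illustration).
A = [[3.0, 0.0],
     [4.0, 5.0]]

def apply(M, x):
    """Multiply the 2x2 matrix M by the 2-vector x."""
    return (M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1])

# Lengths of A x for x at every whole degree around the unit circle.
lengths = []
for k in range(360):
    t = math.radians(k)
    y = apply(A, (math.cos(t), math.sin(t)))
    lengths.append(math.hypot(y[0], y[1]))

# Singular values from the eigenvalues of A^T A = [[p, q], [q, r]].
p = A[0][0] ** 2 + A[1][0] ** 2
q = A[0][0] * A[0][1] + A[1][0] * A[1][1]
r = A[0][1] ** 2 + A[1][1] ** 2
mean, diff = (p + r) / 2, math.hypot((p - r) / 2, q)
sigma1, sigma2 = math.sqrt(mean + diff), math.sqrt(mean - diff)

# The semi-axes of the image ellipse should be sigma1 and sigma2.
longest, shortest = max(lengths), min(lengths)
print(longest, sigma1)
print(shortest, sigma2)
```

For this particular matrix the extreme directions v1 and v2 happen to sit at 45 and 135 degrees, so the sampled extremes agree with the singular values up to rounding. For a general matrix the samples would only approximate them.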
495
00:32:50,350 --> 00:32:55,240
Actually, when would the
U be the same as the V?
496
00:32:55,240 --> 00:32:56,430
Here's a good question.
497
00:32:56,430 --> 00:33:00,280
When is U the same as V?
When are the two sets of singular
498
00:33:00,280 --> 00:33:02,030
vectors just the same?
499
00:33:02,030 --> 00:33:03,190
AUDIENCE: A square.
500
00:33:03,190 --> 00:33:06,220
PROFESSOR: Because A
would have to be square.
501
00:33:06,220 --> 00:33:12,340
And we want this to be
the same as Q lambda Q
502
00:33:12,340 --> 00:33:15,730
transpose if they're the same.
503
00:33:15,730 --> 00:33:18,430
So the U's would be
the same as the V's
504
00:33:18,430 --> 00:33:22,180
when the matrix is symmetric.
505
00:33:22,180 --> 00:33:26,050
And actually we need it
to be positive definite.
506
00:33:26,050 --> 00:33:27,830
Why is that?
507
00:33:27,830 --> 00:33:33,290
Because our convention is these
guys are greater or equal to 0.
508
00:33:33,290 --> 00:33:36,080
It's going to be
the same, then--
509
00:33:36,080 --> 00:33:42,080
so for a positive
definite symmetric matrix,
510
00:33:42,080 --> 00:33:47,740
the S that we started
with is the same
511
00:33:47,740 --> 00:33:50,070
as the A on the next line.
512
00:33:50,070 --> 00:33:55,340
Yeah, the Q is the U, the Q
transpose is the V transpose,
513
00:33:55,340 --> 00:33:57,780
the lambda is the sigma.
514
00:33:57,780 --> 00:33:59,520
So those are the good matrices.
515
00:33:59,520 --> 00:34:02,580
And they're the ones that
you can't improve basically.
516
00:34:02,580 --> 00:34:06,300
They're so good you can't make
a positive definite symmetric
517
00:34:06,300 --> 00:34:08,070
matrix better than it is.
518
00:34:08,070 --> 00:34:14,400
Well, maybe diagonalize
it or something, but OK.
519
00:34:14,400 --> 00:34:18,239
Now I think of like,
one question here
520
00:34:18,239 --> 00:34:26,830
that helps me anyway to
keep this figure straight,
521
00:34:26,830 --> 00:34:32,820
is how to count
the parameters
522
00:34:32,820 --> 00:34:38,699
in this factorization.
523
00:34:38,699 --> 00:34:40,980
So I am 2 by 2.
524
00:34:40,980 --> 00:34:43,199
I'm 2 by 2.
525
00:34:43,199 --> 00:34:48,110
So A has four
numbers, a, b, c, d.
526
00:34:52,929 --> 00:34:55,510
Then I guess I feel
that four numbers should
527
00:34:55,510 --> 00:34:59,440
appear on the right hand side.
528
00:34:59,440 --> 00:35:03,190
Somehow the U and the
sigma and the V transpose
529
00:35:03,190 --> 00:35:06,050
should use up a total
of four numbers.
530
00:35:06,050 --> 00:35:10,780
So we have a counting
match between the left side
531
00:35:10,780 --> 00:35:14,050
that's got four numbers a,
b, c, d, and the right side
532
00:35:14,050 --> 00:35:18,360
that's got four numbers
buried in there somewhere.
533
00:35:18,360 --> 00:35:21,690
So how can we dig them out?
534
00:35:21,690 --> 00:35:23,340
How many numbers in sigma?
535
00:35:23,340 --> 00:35:24,900
That's pretty clear.
536
00:35:28,310 --> 00:35:31,462
Two, sigma 1 and sigma 2.
537
00:35:31,462 --> 00:35:34,560
The two singular values.
538
00:35:34,560 --> 00:35:36,900
How many numbers
in this rotation?
539
00:35:39,670 --> 00:35:42,340
So if I had a
different color chalk,
540
00:35:42,340 --> 00:35:47,890
I would put 2 for the number of
things accounted for by sigma.
541
00:35:47,890 --> 00:35:53,520
How many parameters does a
two by two rotation require?
542
00:35:53,520 --> 00:35:54,020
One.
543
00:35:54,020 --> 00:35:56,570
And what's a good
word for that one?
544
00:35:59,961 --> 00:36:02,740
Is that one parameter?
545
00:36:02,740 --> 00:36:05,380
It's like I have
cos theta, sine theta,
546
00:36:05,380 --> 00:36:07,000
minus sine theta, cos theta.
547
00:36:07,000 --> 00:36:09,610
There's a number theta.
548
00:36:09,610 --> 00:36:12,230
It's the angle it rotates.
549
00:36:12,230 --> 00:36:18,380
So that's one guy to tell
the rotation angle, two guys
550
00:36:18,380 --> 00:36:25,740
to tell the stretchings, and
one more to tell the rotation
551
00:36:25,740 --> 00:36:29,070
from U, adding up to four.
552
00:36:29,070 --> 00:36:31,620
So those count--
that was a match up
553
00:36:31,620 --> 00:36:35,160
with the four numbers, a,
b, c, d that we start with.
554
00:36:35,160 --> 00:36:39,450
Of course, it's a complicated
relation between those four
555
00:36:39,450 --> 00:36:43,410
numbers and rotations
and stretches,
556
00:36:43,410 --> 00:36:45,320
but it's four
equals four anyway.
557
00:36:48,890 --> 00:36:51,720
And I guess if you
did three by threes--
558
00:36:51,720 --> 00:36:54,390
oh, three by threes.
559
00:36:54,390 --> 00:36:56,400
What would happen then?
560
00:36:56,400 --> 00:36:57,780
So let me take three.
561
00:36:57,780 --> 00:37:01,180
Do you want to care
for three by threes?
562
00:37:01,180 --> 00:37:04,690
Just, it's sort of satisfying
to get four equal four.
563
00:37:04,690 --> 00:37:07,590
But now what do we
get for three by three?
564
00:37:10,590 --> 00:37:13,050
We got how many numbers here?
565
00:37:13,050 --> 00:37:13,550
Nine.
566
00:37:17,280 --> 00:37:18,915
So where are those nine numbers?
567
00:37:23,730 --> 00:37:24,420
How many here?
568
00:37:24,420 --> 00:37:26,850
That's usually the easy--
569
00:37:26,850 --> 00:37:28,470
three.
570
00:37:28,470 --> 00:37:33,310
So what's your guess for
how many in a rotation?
571
00:37:33,310 --> 00:37:38,490
And a 3D rotation, you take
a sphere and you rotate it.
572
00:37:38,490 --> 00:37:42,570
How many numbers
to tell you what's what--
573
00:37:42,570 --> 00:37:44,410
to tell you what you did?
574
00:37:44,410 --> 00:37:44,910
Three.
575
00:37:44,910 --> 00:37:45,930
We hope three.
576
00:37:45,930 --> 00:37:51,900
Yeah, it's going to be three,
three, and three for the three
577
00:37:51,900 --> 00:37:53,520
dimensional world
that we live in.
578
00:37:53,520 --> 00:38:00,300
So people who do rotations for a
living understand that rotation
579
00:38:00,300 --> 00:38:02,378
in 3D, but how do you see this?
580
00:38:02,378 --> 00:38:03,670
AUDIENCE: Roll, pitch, and yaw.
581
00:38:03,670 --> 00:38:04,390
PROFESSOR: Sorry?
582
00:38:04,390 --> 00:38:05,280
AUDIENCE: Roll, pitch and yaw.
583
00:38:05,280 --> 00:38:06,750
PROFESSOR: Roll, pitch, and yaw.
584
00:38:06,750 --> 00:38:07,740
That sounds good.
585
00:38:07,740 --> 00:38:11,040
I mean, it's three words
and we've got it, right?
586
00:38:11,040 --> 00:38:11,970
OK, yeah.
587
00:38:11,970 --> 00:38:13,270
Roll, pitch and yaw.
588
00:38:13,270 --> 00:38:19,030
Yeah, I guess a pilot hopefully,
knows about those three.
589
00:38:19,030 --> 00:38:21,310
Yeah, yeah, yeah.
590
00:38:21,310 --> 00:38:22,260
Which is roll?
591
00:38:22,260 --> 00:38:24,030
When you are like
forward and back?
592
00:38:26,690 --> 00:38:29,280
Does anybody, anybody?
593
00:38:29,280 --> 00:38:30,840
Roll, pitch, and yaw?
594
00:38:33,780 --> 00:38:35,415
AUDIENCE: Pitch is
the up and down one.
595
00:38:35,415 --> 00:38:37,140
PROFESSOR: Pitch is
the up and down one.
596
00:38:37,140 --> 00:38:37,640
OK.
597
00:38:37,640 --> 00:38:41,550
AUDIENCE: Roll is like,
think of a barrel roll.
598
00:38:41,550 --> 00:38:43,305
And yaw is your
side-to-side motion.
599
00:38:43,305 --> 00:38:46,980
PROFESSOR: Oh, yaw, you
stay in a plane and you--
600
00:38:46,980 --> 00:38:48,270
OK, beautiful.
601
00:38:48,270 --> 00:38:50,040
Right, right.
602
00:38:50,040 --> 00:38:54,030
And that leads us to our
four-- four dimensions.
603
00:38:54,030 --> 00:38:57,090
What's your guess on 4D?
604
00:38:57,090 --> 00:38:59,130
Well, we could do
the count again.
605
00:38:59,130 --> 00:39:02,670
If it was 4 by 4, we would
have 16 numbers there.
606
00:39:02,670 --> 00:39:06,060
And in the middle, we always
have an easy time with that.
607
00:39:06,060 --> 00:39:07,890
That would be 4.
608
00:39:07,890 --> 00:39:10,620
So we've got 12
left to share out.
609
00:39:10,620 --> 00:39:13,110
So six somehow-- six--
610
00:39:13,110 --> 00:39:17,720
six angles in four dimensions.
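The counts for 2 by 2, 3 by 3 and 4 by 4 fit one pattern. As a sketch, assuming the standard fact that an n by n rotation has n(n-1)/2 free angles (one for each pair of coordinate axes), the parameters on the right side add up to the n squared entries of A:

```latex
\underbrace{\tfrac{n(n-1)}{2}}_{U}
+ \underbrace{n}_{\Sigma}
+ \underbrace{\tfrac{n(n-1)}{2}}_{V^{\mathsf T}}
= n^2,
\qquad
n = 2:\ 1 + 2 + 1 = 4, \quad
n = 3:\ 3 + 3 + 3 = 9, \quad
n = 4:\ 6 + 4 + 6 = 16.
```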
611
00:39:17,720 --> 00:39:19,220
Well, we'll leave it there.
612
00:39:19,220 --> 00:39:21,160
Yeah, yeah, yeah.
613
00:39:21,160 --> 00:39:23,210
OK.
614
00:39:23,210 --> 00:39:27,330
So there is the SVD
but without an example.
615
00:39:27,330 --> 00:39:30,990
Examples, you know, I would
have to compute A transpose A
616
00:39:30,990 --> 00:39:32,220
and find it.
617
00:39:32,220 --> 00:39:35,010
So the text will do that--
618
00:39:35,010 --> 00:39:37,690
does it for a particular matrix.
619
00:39:37,690 --> 00:39:39,930
Oh!
620
00:39:39,930 --> 00:39:43,740
Yeah, the text does it
for a matrix 3, 4, 0,
621
00:39:43,740 --> 00:39:48,260
5 that came out pretty well.
622
00:39:48,260 --> 00:39:50,810
A few facts we
could learn though.
623
00:39:50,810 --> 00:39:54,890
So if I multiply all
the eigenvalues together
624
00:39:54,890 --> 00:39:58,160
for a matrix A, what do I get?
625
00:39:58,160 --> 00:39:59,850
I get the determinant.
626
00:39:59,850 --> 00:40:04,430
What if I multiply the
singular values together?
627
00:40:04,430 --> 00:40:06,170
Well again, I get
the determinant.
628
00:40:06,170 --> 00:40:10,190
You can see it right away
from the big formula.
629
00:40:10,190 --> 00:40:15,000
Take determinant--
take determinant.
630
00:40:15,000 --> 00:40:17,220
Well, assuming the
matrix A is square.
631
00:40:17,220 --> 00:40:18,990
So it's got a determinant.
632
00:40:18,990 --> 00:40:21,990
Then I take determinant
of this product.
633
00:40:21,990 --> 00:40:24,660
I can take the
separate determinants.
634
00:40:24,660 --> 00:40:29,890
That has determinant
equal to one.
635
00:40:29,890 --> 00:40:39,120
An orthogonal matrix,
the determinant is plus or minus one.
636
00:40:39,120 --> 00:40:40,960
And similarly, here.
637
00:40:40,960 --> 00:40:45,710
So the product of the sigmas
is also the determinant.
638
00:40:45,710 --> 00:40:46,240
Yeah.
639
00:40:46,240 --> 00:40:49,410
Yeah, so the product of the
sigmas is also the determinant.
640
00:40:49,410 --> 00:40:52,130
The product of the
sigmas here will be 15.
641
00:40:52,130 --> 00:40:59,960
But you'll find that sigma
2 is smaller than lambda 1.
642
00:40:59,960 --> 00:41:02,650
So here are the
eigenvalues, lambda 1
643
00:41:02,650 --> 00:41:06,630
less or equal to lambda 2, say.
644
00:41:06,630 --> 00:41:12,380
But the singular values
are outside them.
645
00:41:12,380 --> 00:41:13,040
Yeah.
646
00:41:13,040 --> 00:41:14,930
But they still multiply.
647
00:41:14,930 --> 00:41:19,730
Sigma 1 times sigma
2 will still be 15.
648
00:41:19,730 --> 00:41:22,770
And that's the same as
lambda 1 times lambda 2.
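Here is a small check of those numbers, as a sketch. The entries 3, 4, 0, 5 are the ones mentioned for the text's example. Arranging them as [[3, 0], [4, 5]] is an assumption, and the transposed arrangement gives the same eigenvalues and singular values. Everything is computed from the trace and determinant, so no library is needed.

```python
import math

# Entries 3, 4, 0, 5 from the lecture; the layout [[3, 0], [4, 5]] is an assumption.
a, b, c, d = 3.0, 0.0, 4.0, 5.0

det = a * d - b * c
trace = a + d

# Eigenvalues of A from lambda^2 - trace*lambda + det = 0 (real for this A).
root = math.sqrt(trace ** 2 - 4 * det)
lam1, lam2 = (trace - root) / 2, (trace + root) / 2   # lam1 <= lam2

# Singular values are square roots of the eigenvalues of A^T A.
# trace(A^T A) is the sum of squared entries, and det(A^T A) = det(A)^2.
t = a * a + b * b + c * c + d * d
g = det ** 2
r = math.sqrt(t * t - 4 * g)
sigma1 = math.sqrt((t + r) / 2)   # larger singular value
sigma2 = math.sqrt((t - r) / 2)   # smaller singular value

print(lam1, lam2, lam1 * lam2)
print(sigma1, sigma2, sigma1 * sigma2)
```

The smaller singular value comes out below both eigenvalues and the larger one above both, while the two products agree. In general the product of the singular values is the absolute value of the determinant, because an orthogonal matrix can have determinant minus one.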
649
00:41:22,770 --> 00:41:23,270
Yeah.
650
00:41:26,230 --> 00:41:31,930
But overall, computing
the examples of the SVD
651
00:41:31,930 --> 00:41:35,510
takes more time because--
652
00:41:35,510 --> 00:41:40,270
well, yeah, you just compute
A transpose A and you've got
653
00:41:40,270 --> 00:41:40,930
the v's.
654
00:41:40,930 --> 00:41:42,820
And you're on your way.
655
00:41:42,820 --> 00:41:47,560
And you have to take the
square root of the eigenvalues.
656
00:41:47,560 --> 00:41:54,390
So that's the SVD as
a piece of pure math.
657
00:41:54,390 --> 00:41:58,170
But of course, what we'll do
next time starting right away
658
00:41:58,170 --> 00:42:01,010
is use SVD.
659
00:42:01,010 --> 00:42:04,540
And let me tell you
even today, the most--
660
00:42:04,540 --> 00:42:11,910
yeah, yeah most important
pieces of the SVD.
661
00:42:11,910 --> 00:42:14,300
So what do I mean by
pieces of the SVD?
662
00:42:14,300 --> 00:42:16,860
I've got one more blackboard
still to write on.
663
00:42:16,860 --> 00:42:18,200
So here we go.
664
00:42:20,930 --> 00:42:30,380
So let me write out A is
the u's times the sigmas--
665
00:42:30,380 --> 00:42:35,300
sigmas 1 to r times the v's--
666
00:42:35,300 --> 00:42:39,540
v transpose v1 transpose
down to vr transpose.
667
00:42:39,540 --> 00:42:43,240
So those are across.
668
00:42:43,240 --> 00:42:43,790
Yeah.
669
00:42:43,790 --> 00:42:48,940
Actually what I've
written here--
670
00:42:52,700 --> 00:42:57,130
so you could say there
is a big economy.
671
00:42:57,130 --> 00:43:01,400
There is a smaller size SVD
that has the real stuff that
672
00:43:01,400 --> 00:43:02,720
really counts.
673
00:43:02,720 --> 00:43:07,220
And then there's a larger SVD
that has a whole lot of zeros.
674
00:43:07,220 --> 00:43:10,415
So this would be the
smaller one, m by r.
675
00:43:13,250 --> 00:43:15,410
This would be r by r.
676
00:43:15,410 --> 00:43:17,570
And these would all be positive.
677
00:43:17,570 --> 00:43:19,430
And this would be r by n.
678
00:43:24,260 --> 00:43:29,720
So that's only using
the r non-zeros.
679
00:43:29,720 --> 00:43:33,360
All these guys are
greater than zero.
680
00:43:33,360 --> 00:43:37,590
Then the other one
we could fill out
681
00:43:37,590 --> 00:43:43,882
to get a square
orthogonal matrix,
682
00:43:43,882 --> 00:43:52,060
the sigmas and square v's v1
transpose to vn transpose.
683
00:43:52,060 --> 00:43:54,000
So what are the shapes now?
684
00:43:54,000 --> 00:43:57,030
This shape is m by m.
685
00:43:57,030 --> 00:43:59,370
It's a proper orthogonal matrix.
686
00:43:59,370 --> 00:44:01,680
This one also n by n.
687
00:44:01,680 --> 00:44:04,740
So this guy has to be--
this is the sigma now.
688
00:44:04,740 --> 00:44:07,280
So it has to be what size?
689
00:44:07,280 --> 00:44:08,220
m by n.
690
00:44:08,220 --> 00:44:10,800
That's the remaining space.
691
00:44:10,800 --> 00:44:17,460
So it starts with the sigmas,
and then it's all zeros,
692
00:44:17,460 --> 00:44:19,140
accounting for null space stuff.
693
00:44:23,130 --> 00:44:23,630
Yeah.
694
00:44:23,630 --> 00:44:29,130
So you should really see
that these two are possible.
695
00:44:29,130 --> 00:44:34,100
That all these zeros
when you multiply out,
696
00:44:34,100 --> 00:44:37,010
just give nothing, so
that really the only thing
697
00:44:37,010 --> 00:44:41,250
that's non-zero is in these bits.
698
00:44:41,250 --> 00:44:43,080
But there is a complete one.
699
00:44:43,080 --> 00:44:49,405
So what are these extra u's
that are in the null space of A,
700
00:44:49,405 --> 00:44:51,970
A transpose or A transpose A?
701
00:44:51,970 --> 00:44:57,290
Yeah, so two sizes, the large
size and the small size.
702
00:44:57,290 --> 00:45:03,500
But then the things that
count are all in there.
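The two sizes can be written side by side. This is a sketch in which the subscript r on U, Sigma and V (meaning "keep only the first r columns, or the r by r block") is notation assumed here for illustration:

```latex
\underbrace{A}_{m \times n}
= \underbrace{U_r}_{m \times r}\,
  \underbrace{\Sigma_r}_{r \times r}\,
  \underbrace{V_r^{\mathsf T}}_{r \times n}
= \underbrace{U}_{m \times m}\,
  \underbrace{\begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix}}_{m \times n}\,
  \underbrace{V^{\mathsf T}}_{n \times n},
\qquad
\Sigma_r = \operatorname{diag}(\sigma_1, \dots, \sigma_r), \quad
\sigma_1 \ge \dots \ge \sigma_r > 0.
```

The zero blocks multiply the extra columns of U and the extra rows of V transpose, so both products give the same A.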
703
00:45:03,500 --> 00:45:05,820
OK.
704
00:45:05,820 --> 00:45:09,940
So I was going to
do one more thing.
705
00:45:09,940 --> 00:45:13,140
Let me see what it was.
706
00:45:17,170 --> 00:45:21,130
So this is section
1.8 of the notes.
707
00:45:21,130 --> 00:45:25,350
And you'll see examples there.
708
00:45:25,350 --> 00:45:33,440
And you'll see a second
approach to finding the u's
709
00:45:33,440 --> 00:45:36,700
and v's and sigmas.
710
00:45:36,700 --> 00:45:38,200
I can tell you what that is.
711
00:45:38,200 --> 00:45:47,380
But maybe just to do
something nice at the end,
712
00:45:47,380 --> 00:45:57,380
let me tell you about another
factorization of A that's
713
00:45:57,380 --> 00:46:09,260
famous in engineering, and
it's famous in geometry.
714
00:46:09,260 --> 00:46:12,740
So this is-- A is
U sigma V transpose.
715
00:46:12,740 --> 00:46:14,070
We've got that.
716
00:46:14,070 --> 00:46:17,120
Now the other one
that I'm thinking of,
717
00:46:17,120 --> 00:46:18,450
I'll tell you its name.
718
00:46:18,450 --> 00:46:23,075
It's called the polar
decomposition of a matrix.
719
00:46:26,900 --> 00:46:31,400
And all I want you to see
is that it's virtually here.
720
00:46:31,400 --> 00:46:33,160
So a polar means--
721
00:46:33,160 --> 00:46:37,310
what's polar in--
for a complex number,
722
00:46:37,310 --> 00:46:40,990
what's the polar form
of a complex number?
723
00:46:40,990 --> 00:46:42,280
AUDIENCE: e to the i theta.
724
00:46:42,280 --> 00:46:44,990
PROFESSOR: Yeah, it's e
to the i theta times r.
725
00:46:44,990 --> 00:46:46,010
Yeah.
726
00:46:46,010 --> 00:46:49,100
A real guy-- so
the real guy r will
727
00:46:49,100 --> 00:46:52,490
translate into a symmetric guy.
728
00:46:52,490 --> 00:46:55,190
And the e to the i theta
will translate into--
729
00:46:58,970 --> 00:47:02,510
what kind of a matrix reminds
you of e to the i theta?
730
00:47:02,635 --> 00:47:03,510
AUDIENCE: Orthogonal.
731
00:47:03,510 --> 00:47:06,650
PROFESSOR: Orthogonal, size 1.
732
00:47:06,650 --> 00:47:08,022
So orthogonal.
733
00:47:10,680 --> 00:47:14,460
So that's a very,
very kind of nice.
734
00:47:14,460 --> 00:47:18,750
Every matrix factors
into a symmetric matrix
735
00:47:18,750 --> 00:47:20,830
times an orthogonal matrix.
736
00:47:20,830 --> 00:47:26,790
And I of course, describe these
as the most important classes
737
00:47:26,790 --> 00:47:27,600
of matrices.
738
00:47:27,600 --> 00:47:33,380
And here, we're saying every
matrix is an S times a Q.
739
00:47:33,380 --> 00:47:37,220
And I'm also saying that
I can get that quickly out
740
00:47:37,220 --> 00:47:39,410
of the SVD.
741
00:47:39,410 --> 00:47:43,210
So I just want to do it.
742
00:47:43,210 --> 00:47:47,190
So I want to find an S
and find a Q out of this.
743
00:47:47,190 --> 00:47:49,200
So to get an S--
744
00:47:49,200 --> 00:47:50,550
So let me just start it.
745
00:47:50,550 --> 00:47:55,350
U sigma-- but now
I'm looking for an S.
746
00:47:55,350 --> 00:47:59,530
So what shall I put in now?
747
00:47:59,530 --> 00:48:02,710
I better put in--
748
00:48:02,710 --> 00:48:04,950
if I've got U
sigma something,
749
00:48:04,950 --> 00:48:06,860
and I want it to
be symmetric, I
750
00:48:06,860 --> 00:48:13,860
should put in-- U
transpose would do it.
751
00:48:13,860 --> 00:48:15,600
But then if I put
in U transpose,
752
00:48:15,600 --> 00:48:20,970
I've got to put in U.
So now I've got U sigma.
753
00:48:20,970 --> 00:48:22,560
U transpose U is the identity.
754
00:48:22,560 --> 00:48:26,470
Then I've got to
get V transpose.
755
00:48:26,470 --> 00:48:30,490
And have I got what
the polar decomposition
756
00:48:30,490 --> 00:48:35,080
is asking for in this line?
757
00:48:35,080 --> 00:48:36,300
So, yeah.
758
00:48:36,300 --> 00:48:38,220
What have I got here?
759
00:48:38,220 --> 00:48:41,570
Where's the
S in this?
760
00:48:41,570 --> 00:48:46,460
So you see, I took the SVD and I
just put the identity in there,
761
00:48:46,460 --> 00:48:48,080
just shifted things a little.
762
00:48:48,080 --> 00:48:50,860
And now where's the S
that I can read off?
763
00:48:55,360 --> 00:49:01,100
The first three, that's an S.
That's a symmetric matrix.
764
00:49:01,100 --> 00:49:02,420
And where's the Q?
765
00:49:02,420 --> 00:49:06,020
Well, I guess we can see
where the Q has to be.
766
00:49:06,020 --> 00:49:11,060
It's here, yeah.
767
00:49:11,060 --> 00:49:14,000
Yeah, so just by
sticking in U transpose U
768
00:49:14,000 --> 00:49:17,120
and putting the
parentheses right,
769
00:49:17,120 --> 00:49:23,860
I recover that decomposition
of a matrix, which
770
00:49:23,860 --> 00:49:27,910
in mechanical engineering
language
771
00:49:27,910 --> 00:49:32,530
tells me that any
strain can be--
772
00:49:32,530 --> 00:49:36,460
which is like stretching
of an elastic thing,
773
00:49:36,460 --> 00:49:48,160
has a symmetric kind of a
stretch and an internal twist.
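The regrouping A = (U sigma U transpose)(U V transpose) can be checked on a small case. This is a minimal sketch, not a general routine: the helper names `svd_2x2`, `matmul` and `transpose` and the example matrix are assumptions for illustration, and the SVD is found from the eigenvectors of A transpose A, which is fine for a 2 by 2 invertible matrix but is not how production software computes it.

```python
import math

def svd_2x2(A):
    """Return (U, sigmas, V) for an invertible 2x2 matrix A, via A^T A."""
    (a, b), (c, d) = A
    # A^T A = [[p, q], [q, r]]
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    mean, diff = (p + r) / 2, math.hypot((p - r) / 2, q)
    sigmas = (math.sqrt(mean + diff), math.sqrt(mean - diff))
    theta = 0.5 * math.atan2(2 * q, p - r)       # direction of v1
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    # u_i = A v_i / sigma_i
    cols_u = [((a * v[0] + b * v[1]) / s, (c * v[0] + d * v[1]) / s)
              for v, s in zip((v1, v2), sigmas)]
    (u11, u21), (u12, u22) = cols_u
    U = [[u11, u12], [u21, u22]]
    V = [[v1[0], v2[0]], [v1[1], v2[1]]]
    return U, sigmas, V

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

A = [[3.0, 0.0], [4.0, 5.0]]   # hypothetical example matrix
U, (s1, s2), V = svd_2x2(A)
Sigma = [[s1, 0.0], [0.0, s2]]

# A = U Sigma V^T = (U Sigma U^T)(U V^T) = S Q
S = matmul(matmul(U, Sigma), transpose(U))   # symmetric factor
Q = matmul(U, transpose(V))                  # orthogonal factor
print(S)
print(Q)
print(matmul(S, Q))                          # should reproduce A
```

S is symmetric because U sigma U transpose equals its own transpose, and Q is orthogonal because it is a product of two orthogonal matrices.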
774
00:49:48,160 --> 00:49:50,320
Yeah.
775
00:49:50,320 --> 00:49:51,970
So that's good.
776
00:49:51,970 --> 00:50:00,130
Well, this was a 3, 6, 9
boards filled with matrices.
777
00:50:00,130 --> 00:50:02,840
Well, it is 18.065.
778
00:50:02,840 --> 00:50:04,630
So maybe that's all right.
779
00:50:04,630 --> 00:50:11,890
But the idea is to use
them on a matrix of data.
780
00:50:11,890 --> 00:50:16,230
And I'll just tell
you the key fact.
781
00:50:16,230 --> 00:50:27,240
The key fact-- if I have
a big matrix of data, A,
782
00:50:27,240 --> 00:50:30,000
and if I want to pull
out of that matrix
783
00:50:30,000 --> 00:50:33,790
the important part,
so that's what
784
00:50:33,790 --> 00:50:36,280
data science has to be doing.
785
00:50:36,280 --> 00:50:42,340
Out of a big matrix, some part
of it is noise, some part of it
786
00:50:42,340 --> 00:50:43,510
is signal.
787
00:50:43,510 --> 00:50:46,060
I'm looking for the most
important part of the signal
788
00:50:46,060 --> 00:50:46,840
here.
789
00:50:46,840 --> 00:50:49,240
So I'm looking for the most
important part of the matrix.
790
00:50:52,620 --> 00:50:57,420
In a way, the biggest
numbers, but of course,
791
00:50:57,420 --> 00:51:00,810
I don't look at
individual numbers.
792
00:51:00,810 --> 00:51:04,950
So what's the biggest
part of the matrix?
793
00:51:04,950 --> 00:51:06,930
What are the
principal components?
794
00:51:06,930 --> 00:51:08,700
Now we're really getting in--
795
00:51:11,670 --> 00:51:13,140
it could be data.
796
00:51:13,140 --> 00:51:15,060
And we want to do
statistics, or we
797
00:51:15,060 --> 00:51:18,120
want to see what has
high variance, what
798
00:51:18,120 --> 00:51:24,240
has low variance, we'll do these
connections with statistics.
799
00:51:24,240 --> 00:51:26,950
But what's the important
part of the matrix?
800
00:51:26,950 --> 00:51:33,130
Well, let me look at
U sigma V transpose.
801
00:51:33,130 --> 00:51:37,600
Here, yeah, let me look at it.
802
00:51:37,600 --> 00:51:43,900
So what's the one most
important part of that matrix?
803
00:51:43,900 --> 00:51:44,950
The rank one?
804
00:51:44,950 --> 00:51:46,430
It's a rank one piece.
805
00:51:46,430 --> 00:51:50,750
So when I say a part, of course
it's going to be a matrix part.
806
00:51:50,750 --> 00:51:52,900
So the simple matrix
building block
807
00:51:52,900 --> 00:51:57,850
is like a rank one matrix, a
something, something transpose.
808
00:51:57,850 --> 00:52:00,550
And what should I
pull out of that
809
00:52:00,550 --> 00:52:02,890
as being the most
important rank one
810
00:52:02,890 --> 00:52:06,230
matrix that's in that product?
811
00:52:06,230 --> 00:52:09,890
So I'll erase the
1.8 while you think
812
00:52:09,890 --> 00:52:16,640
what do I do to pick out the big
deal, the thing that the data
813
00:52:16,640 --> 00:52:19,200
is telling me first.
814
00:52:19,200 --> 00:52:24,360
Well, these are orthonormal.
815
00:52:24,360 --> 00:52:26,740
No one is bigger
than another one.
816
00:52:26,740 --> 00:52:29,580
These are orthonormal, no one
is bigger than another one.
817
00:52:29,580 --> 00:52:35,130
But here, I look here, which
is the most important number?
818
00:52:35,130 --> 00:52:37,200
Sigma 1.
819
00:52:37,200 --> 00:52:37,880
Sigma 1.
820
00:52:37,880 --> 00:52:41,930
So the part I pick out is
this biggest number times
821
00:52:41,930 --> 00:52:45,080
its column times its row.
822
00:52:45,080 --> 00:52:55,220
So it's u 1 sigma 1 v1 transpose
is the top principal part
823
00:52:55,220 --> 00:52:57,320
of the matrix A.
It's the leading
824
00:52:57,320 --> 00:53:01,700
part of the matrix A.
It's the biggest rank one
825
00:53:01,700 --> 00:53:04,040
part of the matrix is there.
826
00:53:04,040 --> 00:53:07,580
So computing those three
guys is the first step
827
00:53:07,580 --> 00:53:10,080
to understanding the data.
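One hedged way to get just those three guys without a full SVD is power iteration on A transpose A. That technique is swapped in here for brevity. It is not the method from the lecture, and real libraries use more careful algorithms. The matrix `A`, the helper names and the iteration count are assumptions for illustration. The sketch relies on sigma 1 being strictly larger than sigma 2 and on the starting vector not being orthogonal to v1.

```python
import math

# Hypothetical small "data" matrix (an assumption for illustration).
A = [[3.0, 0.0],
     [4.0, 5.0]]
m, n = len(A), len(A[0])

def times(M, x):
    """Matrix-vector product M x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def times_transpose(M, y):
    """Product M^T y without forming the transpose."""
    return [sum(M[i][j] * y[i] for i in range(len(M))) for j in range(len(M[0]))]

def norm(x):
    return math.sqrt(sum(t * t for t in x))

# Power iteration on A^T A: v converges to v1 when sigma1 > sigma2.
v = [1.0] + [0.0] * (n - 1)
for _ in range(100):
    w = times_transpose(A, times(A, v))
    length = norm(w)
    v = [t / length for t in w]

Av = times(A, v)
sigma1 = norm(Av)                 # largest singular value
u = [t / sigma1 for t in Av]      # u1 = A v1 / sigma1

# Leading rank-one piece sigma1 * u1 * v1^T (a column times a row).
piece = [[sigma1 * u[i] * v[j] for j in range(n)] for i in range(m)]
print(sigma1)
print(piece)
```

Subtracting `piece` from `A` leaves the remaining rank-one part, whose size is governed by sigma 2.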
828
00:53:10,080 --> 00:53:10,580
Yeah.
829
00:53:10,580 --> 00:53:12,920
So that's what's
coming next is--
830
00:53:12,920 --> 00:53:17,230
and I guess tomorrow,
since they moved--
831
00:53:17,230 --> 00:53:24,410
MIT declared Tuesday
to be Monday.
832
00:53:24,410 --> 00:53:25,860
They didn't change Wednesday.
833
00:53:25,860 --> 00:53:32,450
So I'll see you tomorrow for
the principal components.
834
00:53:32,450 --> 00:53:34,000
Good.