1
00:00:14,200 --> 00:00:21,050
GILBERT STRANG: OK, ready
for part three of this vision
2
00:00:21,050 --> 00:00:23,430
of linear algebra.
3
00:00:23,430 --> 00:00:26,810
So the key word in part
three is orthogonal, which
4
00:00:26,810 --> 00:00:29,330
again means perpendicular.
5
00:00:29,330 --> 00:00:32,119
So we have
perpendicular vectors.
6
00:00:32,119 --> 00:00:34,130
We can imagine those.
7
00:00:34,130 --> 00:00:37,950
We have something called
orthogonal matrices.
8
00:00:37,950 --> 00:00:41,180
That's when-- I've got one here.
9
00:00:41,180 --> 00:00:46,010
An orthogonal matrix is
when we have these columns.
10
00:00:46,010 --> 00:00:47,750
I'm always going
to use the letter
11
00:00:47,750 --> 00:00:50,260
Q for an orthogonal matrix.
12
00:00:50,260 --> 00:00:52,910
And I look at its
columns, and every column
13
00:00:52,910 --> 00:00:56,510
is perpendicular to
every other column.
14
00:00:56,510 --> 00:00:59,600
So I don't just have two
perpendicular vectors
15
00:00:59,600 --> 00:01:00,770
going like this.
16
00:01:00,770 --> 00:01:03,950
I have n of them because
I'm in n dimensions.
17
00:01:03,950 --> 00:01:09,800
And you just imagine
xyz axes or xyzw axes,
18
00:01:09,800 --> 00:01:13,895
go up to 4D for
relativity, go up to 8D
19
00:01:13,895 --> 00:01:17,450
for string theory, 8 dimensions.
20
00:01:17,450 --> 00:01:18,830
We just have vectors.
21
00:01:18,830 --> 00:01:23,210
After all, it's just this row of
numbers or a column of numbers.
22
00:01:23,210 --> 00:01:28,910
And we can decide when things
are perpendicular by that test.
23
00:01:28,910 --> 00:01:34,220
Like say the test for Q1
to be perpendicular to Qn
24
00:01:34,220 --> 00:01:38,600
is that row times that column.
25
00:01:38,600 --> 00:01:43,360
When I say times, I mean dot
product, multiply every pair.
26
00:01:43,360 --> 00:01:47,300
Q1 transpose Qn gives
that 0 up there.
27
00:01:47,300 --> 00:01:50,280
So the columns
are perpendicular.
28
00:01:50,280 --> 00:01:52,950
And those matrices are
the best to compute with.
29
00:01:52,950 --> 00:01:55,080
And again, they're called Q.
30
00:01:55,080 --> 00:01:58,270
And one way to, a
quick matrix way,
31
00:01:58,270 --> 00:02:03,020
because there's always a matrix
way to explain something,
32
00:02:03,020 --> 00:02:06,340
and you'll see how
quick it is here.
33
00:02:06,340 --> 00:02:09,060
This business of
having columns that
34
00:02:09,060 --> 00:02:11,880
are perpendicular to each
other, and actually, I'm
35
00:02:11,880 --> 00:02:15,990
going to make those lengths of
all those column vectors all 1,
36
00:02:15,990 --> 00:02:18,810
just to sort of normalize it.
37
00:02:18,810 --> 00:02:25,770
Then all that's expressed by,
if I multiply Q transpose by Q,
38
00:02:25,770 --> 00:02:28,120
I'm taking all
those dot products,
39
00:02:28,120 --> 00:02:33,150
and I'm getting 1s when
it's Q against itself.
40
00:02:33,150 --> 00:02:37,560
And I'm getting 0s when it's
one Q versus another Q.
41
00:02:37,560 --> 00:02:41,130
And again, just think of
three perpendicular axes.
42
00:02:41,130 --> 00:02:45,690
Those directions
are the Q1, Q2, Q3.
43
00:02:45,690 --> 00:02:46,830
OK?
44
00:02:46,830 --> 00:02:49,830
So we really want to
compute with those.
45
00:02:49,830 --> 00:02:51,540
Here's an example.
46
00:02:51,540 --> 00:02:54,030
Well, that has just
two perpendicular axes.
47
00:02:54,030 --> 00:02:56,520
I didn't have space
for the third one.
48
00:02:56,520 --> 00:03:03,000
So do you see that those two
columns are perpendicular?
49
00:03:03,000 --> 00:03:04,410
Again, what does that mean?
50
00:03:04,410 --> 00:03:06,030
I take the dot product.
51
00:03:06,030 --> 00:03:08,460
Minus 1 times 2, minus 2.
52
00:03:08,460 --> 00:03:10,680
2 times minus 1,
another minus 2.
53
00:03:10,680 --> 00:03:12,870
So I'm up to minus
4 at this point.
54
00:03:12,870 --> 00:03:14,820
And then 2 times
2 gives a plus 4.
55
00:03:14,820 --> 00:03:17,220
So it all washes out to 0.
56
00:03:17,220 --> 00:03:20,110
And why is that 1/3 there?
57
00:03:20,110 --> 00:03:21,400
Why is that?
58
00:03:21,400 --> 00:03:25,790
That's so that these
vectors will have length 1.
59
00:03:25,790 --> 00:03:27,970
They will be unit vectors.
60
00:03:27,970 --> 00:03:30,970
Yeah, and how do I figure,
the length of a vector,
61
00:03:30,970 --> 00:03:33,230
just while we're at it?
62
00:03:33,230 --> 00:03:37,060
I take 1 squared or minus
1 squared gives me 1.
63
00:03:37,060 --> 00:03:42,200
2 squared and 2 squared, I take
the dot product with itself.
64
00:03:42,200 --> 00:03:44,160
So minus 1 squared,
2 squared, and 2
65
00:03:44,160 --> 00:03:46,320
squared, that adds up to 9.
66
00:03:46,320 --> 00:03:48,570
The square root of
9 is the length.
67
00:03:48,570 --> 00:03:51,500
I'm just doing Pythagoras here.
68
00:03:51,500 --> 00:03:53,880
There is one side of a triangle.
69
00:03:53,880 --> 00:03:56,140
Here is a second
side of a triangle.
70
00:03:56,140 --> 00:03:58,500
It's a right triangle
because that vector
71
00:03:58,500 --> 00:04:00,510
is perpendicular to that one.
72
00:04:00,510 --> 00:04:04,060
It's in 3D because they
have three components.
73
00:04:04,060 --> 00:04:07,500
And I didn't write
a third direction.
74
00:04:07,500 --> 00:04:14,400
And they're length one
vectors because just that's
75
00:04:14,400 --> 00:04:16,950
how when I compute the
length and remember
76
00:04:16,950 --> 00:04:22,019
about the 1/3, which is put
in there to give a length 1.
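The arithmetic in this example can be checked directly. The following is a minimal sketch, assuming the two columns on the slide are (1/3)(-1, 2, 2) and (1/3)(2, -1, 2) as read off from the spoken dot product; the helper name `dot` is illustrative only.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product: multiply matching components and add them up."""
    return sum(a * b for a, b in zip(u, v))

third = Fraction(1, 3)
q1 = [third * c for c in (-1, 2, 2)]   # assumed first column
q2 = [third * c for c in (2, -1, 2)]   # assumed second column

print(dot(q1, q2))  # (1/9)(-2 - 2 + 4): the perpendicularity test
print(dot(q1, q1))  # (1/9)(1 + 4 + 4): squared length of q1
print(dot(q2, q2))  # (1/9)(4 + 1 + 4): squared length of q2
```

Exact fractions are used so the zero and the ones come out exactly rather than up to rounding.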
77
00:04:22,019 --> 00:04:23,670
So OK.
78
00:04:23,670 --> 00:04:28,320
So these matrices are with
Q transpose times Q equal I.
79
00:04:28,320 --> 00:04:31,260
That again, that's the
matrix shorthand for all
80
00:04:31,260 --> 00:04:33,340
I've just said.
81
00:04:33,340 --> 00:04:39,050
And those matrices are the
best because they don't
82
00:04:39,050 --> 00:04:40,970
change the length of anything.
83
00:04:40,970 --> 00:04:42,230
You don't have blow up.
84
00:04:42,230 --> 00:04:43,850
You don't have going to 0.
85
00:04:43,850 --> 00:04:46,370
You can multiply
together 1,000 matrices,
86
00:04:46,370 --> 00:04:50,440
and you'll still have
another orthogonal matrix.
87
00:04:50,440 --> 00:04:53,810
Yes, a little family
of beautiful matrices.
88
00:04:53,810 --> 00:04:57,250
OK, and very, very useful.
89
00:04:57,250 --> 00:04:59,320
OK, and there was
a good example.
90
00:04:59,320 --> 00:05:01,990
Oh, I think the way
I got that example,
91
00:05:01,990 --> 00:05:04,330
I just added a third row.
92
00:05:04,330 --> 00:05:05,560
The third column, sorry.
93
00:05:05,560 --> 00:05:06,770
The third column.
94
00:05:06,770 --> 00:05:09,680
So 2 squared plus 2 squared
plus minus 1 squared.
95
00:05:09,680 --> 00:05:11,380
That adds up to 9.
96
00:05:11,380 --> 00:05:13,930
When I take the
square root, I get 3.
97
00:05:13,930 --> 00:05:15,650
So that has length 3.
98
00:05:15,650 --> 00:05:17,010
I divided by 3.
99
00:05:17,010 --> 00:05:19,030
So it would have length 1.
100
00:05:19,030 --> 00:05:23,190
We always want to see
1s, like we do there.
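With the third column added, the whole statement "Q transpose times Q equals I" can be verified in a few lines. This is a sketch under the assumption that the third column is (1/3)(2, 2, -1), matching the squares read out above; `transpose` and `matmul` are small hypothetical helpers, not library functions.

```python
from fractions import Fraction

def transpose(M):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Plain row-times-column matrix product."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

# Assumed 3 by 3 example: columns (1/3)(-1,2,2), (1/3)(2,-1,2), (1/3)(2,2,-1).
Q = [[Fraction(c, 3) for c in row]
     for row in ((-1, 2, 2),
                 (2, -1, 2),
                 (2, 2, -1))]

QtQ = matmul(transpose(Q), Q)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(QtQ == identity)  # True
```

The diagonal 1s are the unit lengths and the off-diagonal 0s are the perpendicularity tests, all collected in one matrix product.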
101
00:05:23,190 --> 00:05:26,110
And if I-- here's a simple fact.
102
00:05:26,110 --> 00:05:28,070
But great.
103
00:05:28,070 --> 00:05:30,100
Then if I have two
of these matrices
104
00:05:30,100 --> 00:05:33,580
or 50 of these matrices, I
could multiply them together.
105
00:05:33,580 --> 00:05:35,920
And I would still
have length of 1.
106
00:05:35,920 --> 00:05:38,050
I'd still have
orthogonal matrices.
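The reason a product of orthogonal matrices stays orthogonal is a one-line calculation. Here it is for two factors, with Q_1 and Q_2 standing for any two square orthogonal matrices of the same size:

```latex
(Q_1 Q_2)^{T} (Q_1 Q_2)
  = Q_2^{T} \, Q_1^{T} Q_1 \, Q_2
  = Q_2^{T} \, I \, Q_2
  = Q_2^{T} Q_2
  = I .
```

Repeating the same step handles 50 factors or 1,000.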
107
00:05:38,050 --> 00:05:41,590
1 times 1 times 1 forever is 1.
108
00:05:41,590 --> 00:05:45,280
OK, so there's probably
something hiding here.
109
00:05:45,280 --> 00:05:46,860
Oh, yeah.
110
00:05:46,860 --> 00:05:51,150
Oh, yeah, to understand why
these matrices are important,
111
00:05:51,150 --> 00:05:55,980
this one, this line is telling
me that, if I have a vector x,
112
00:05:55,980 --> 00:06:00,860
and I multiply by Q, it
doesn't change the length.
113
00:06:00,860 --> 00:06:03,710
This is a symbol
for length squared.
114
00:06:03,710 --> 00:06:06,050
And that's equal to the
original length squared.
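The length statement follows directly from Q transpose times Q equal I. Written out, with the double bars denoting length:

```latex
\|Qx\|^{2} = (Qx)^{T}(Qx) = x^{T} Q^{T} Q \, x = x^{T} I \, x = x^{T} x = \|x\|^{2} .
```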
115
00:06:06,050 --> 00:06:08,510
Length is
preserved by these Qs.
116
00:06:08,510 --> 00:06:09,590
Everything is preserved.
117
00:06:09,590 --> 00:06:13,910
You're multiplying effectively
by the matrix versions
118
00:06:13,910 --> 00:06:16,490
of 1 and minus 1.
119
00:06:16,490 --> 00:06:22,430
And a rotation is a very
significant, very valuable
120
00:06:22,430 --> 00:06:25,970
orthogonal matrix, which
just has cosines and sines.
121
00:06:25,970 --> 00:06:29,510
And everybody's remembering
that cosine squared plus sine
122
00:06:29,510 --> 00:06:33,230
squared is 1 from trig.
123
00:06:33,230 --> 00:06:38,350
So that's an orthogonal matrix.
124
00:06:38,350 --> 00:06:41,000
Oh, it's also orthogonal
because the dot
125
00:06:41,000 --> 00:06:44,030
product between that
one and that one,
126
00:06:44,030 --> 00:06:45,770
you're OK for the dot product.
127
00:06:45,770 --> 00:06:51,260
That product gives me minus sine
cosine, plus sine cosine, 0.
128
00:06:51,260 --> 00:06:55,610
So the column 1 is
orthogonal to column 2.
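A small numerical check of the rotation example may help. This sketch assumes the usual 2 by 2 rotation with columns (cos, sin) and (-sin, cos); the angle 0.7 radians and the test vector (3, 4) are arbitrary choices for illustration.

```python
import math

def rotation(theta):
    """2 by 2 rotation matrix, stored as a list of rows."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply(M, x):
    """Matrix times vector."""
    return [dot(row, x) for row in M]

Q = rotation(0.7)            # arbitrary angle
col1 = [Q[0][0], Q[1][0]]    # (cos, sin)
col2 = [Q[0][1], Q[1][1]]    # (-sin, cos)

print(dot(col1, col2))       # -cos*sin + sin*cos
print(dot(col1, col1))       # cos^2 + sin^2

x = [3.0, 4.0]
Qx = apply(Q, x)
print(math.sqrt(dot(x, x)), math.sqrt(dot(Qx, Qx)))  # lengths before and after
```

Up to floating-point rounding, the first value is 0, the second is 1, and the two lengths agree.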
129
00:06:55,610 --> 00:06:56,360
That's good.
130
00:06:56,360 --> 00:06:58,740
OK.
131
00:06:58,740 --> 00:07:03,150
These lambdas that
you see here are
132
00:07:03,150 --> 00:07:04,470
something called eigenvalues.
133
00:07:04,470 --> 00:07:08,470
That's not allowed
until the next lecture.
134
00:07:08,470 --> 00:07:11,290
OK, all right, now,
here's something.
135
00:07:11,290 --> 00:07:13,510
Here's a computing thing.
136
00:07:13,510 --> 00:07:20,590
If we have a bunch of columns,
not orthogonal, not length 1,
137
00:07:20,590 --> 00:07:24,280
then, often, we would
like to convert them
138
00:07:24,280 --> 00:07:27,670
to, so we call those, A1 to An.
139
00:07:27,670 --> 00:07:30,590
Nothing special
about those columns.
140
00:07:30,590 --> 00:07:33,370
We would like to convert
them to orthogonal columns
141
00:07:33,370 --> 00:07:36,880
because they're the
beautiful ones, Q1 up to Qn.
142
00:07:36,880 --> 00:07:41,110
And two guys called Gram
and Schmidt figured out
143
00:07:41,110 --> 00:07:42,130
a way to do that.
144
00:07:42,130 --> 00:07:46,590
And a century later, we're
still using their idea.
145
00:07:46,590 --> 00:07:48,540
Well, I don't know whose
idea it was actually.
146
00:07:48,540 --> 00:07:49,920
I think Gram had the idea.
147
00:07:49,920 --> 00:07:54,270
And I'm not really sure what
Schmidt, how he got into it.
148
00:07:54,270 --> 00:07:56,190
Well, he may have
repeated the idea.
149
00:07:56,190 --> 00:08:01,380
So OK, so I won't
go into all the details.
150
00:08:01,380 --> 00:08:04,590
But here's what the
point is. The point
151
00:08:04,590 --> 00:08:10,130
is, if I have a bunch of
columns that are independent,
152
00:08:10,130 --> 00:08:12,130
they go in different
directions, but they're not
153
00:08:12,130 --> 00:08:14,860
90 degree directions.
154
00:08:14,860 --> 00:08:17,500
Then I can convert
it to a 90 degree one
155
00:08:17,500 --> 00:08:22,990
to perpendicular
axes with a matrix R,
156
00:08:22,990 --> 00:08:29,110
happens to be triangular, that
did the moving around, did
157
00:08:29,110 --> 00:08:30,820
take those combinations.
158
00:08:30,820 --> 00:08:35,260
So A equal QR is one of
the fundamental steps
159
00:08:35,260 --> 00:08:39,970
of linear algebra and
computational linear algebra.
160
00:08:39,970 --> 00:08:42,789
Very, very often,
we're given a matrix
161
00:08:42,789 --> 00:08:48,790
A. We want a nice matrix Q, so
we do this Gram-Schmidt step
162
00:08:48,790 --> 00:08:51,640
to make the columns orthogonal.
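To make the idea concrete, here is a minimal sketch of the classical Gram-Schmidt process, assuming the columns are independent as stated. The function name `gram_schmidt` and the two sample columns are illustrative assumptions, not taken from the slides, and production software typically uses more numerically stable variants.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(columns):
    """Classical Gram-Schmidt on a list of independent column vectors.

    Returns (qs, R): orthonormal columns qs and an upper triangular R
    such that columns[j] equals the sum over i of R[i][j] * qs[i] (A = QR).
    """
    n = len(columns)
    qs = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        v = list(a)
        for i, q in enumerate(qs):
            R[i][j] = dot(q, a)                       # component of a along q_i
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]
        R[j][j] = math.sqrt(dot(v, v))                # length of what is left
        qs.append([vk / R[j][j] for vk in v])         # normalize to length 1
    return qs, R

# Two made-up independent columns in 3D that are not perpendicular.
a1 = [1.0, 1.0, 0.0]
a2 = [1.0, 0.0, 1.0]
qs, R = gram_schmidt([a1, a2])

# Rebuild a2 from the q's and the second column of R to confirm A = QR.
rebuilt_a2 = [R[0][1] * x + R[1][1] * y for x, y in zip(qs[0], qs[1])]
print(dot(qs[0], qs[1]), rebuilt_a2)
```

R comes out triangular because each new column only needs the q's found so far.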
163
00:08:51,640 --> 00:08:56,080
And oh, here's a first
step of Gram-Schmidt.
164
00:08:56,080 --> 00:09:03,730
But you'll need practice
to see all the steps.
165
00:09:03,730 --> 00:09:04,960
Maybe not.
166
00:09:04,960 --> 00:09:09,250
OK, so here, what's
the advantage
167
00:09:09,250 --> 00:09:11,620
of perpendicular vectors?
168
00:09:11,620 --> 00:09:13,730
Suppose I have a triangle.
169
00:09:13,730 --> 00:09:17,350
And one side is perpendicular
to the second side.
170
00:09:17,350 --> 00:09:19,870
How does that help?
171
00:09:19,870 --> 00:09:22,420
Well, that's a
right triangle then.
172
00:09:22,420 --> 00:09:27,730
Side A perpendicular to side B.
And of course, Pythagoras, now
173
00:09:27,730 --> 00:09:30,700
we're really going
back, Pythagoras said,
174
00:09:30,700 --> 00:09:33,320
a squared plus b
squared is c squared.
175
00:09:33,320 --> 00:09:36,910
So we have beautiful formulas
when things are perpendicular.
176
00:09:36,910 --> 00:09:42,460
If the angles are not 90 degrees
when the cosine of 90 degrees
177
00:09:42,460 --> 00:09:46,800
is 1 or maybe the sine
of 90 degrees is 1,
178
00:09:46,800 --> 00:09:50,010
yeah, sine of 90 degrees is 1.
179
00:09:50,010 --> 00:09:56,550
For those perfect
angles, 0 and 90 degrees,
180
00:09:56,550 --> 00:09:58,780
we can do everything.
181
00:09:58,780 --> 00:10:03,265
And here is a place that Q fits.
182
00:10:05,980 --> 00:10:10,660
This is like the first big
application of linear algebra.
183
00:10:10,660 --> 00:10:13,900
So let me just say what it is.
184
00:10:13,900 --> 00:10:17,290
And it uses these Qs.
185
00:10:17,290 --> 00:10:20,560
So what's the application?
That's called least squares.
186
00:10:20,560 --> 00:10:26,340
And you start with
equations, Ax equal b.
187
00:10:26,340 --> 00:10:28,100
You always think
of that as a matrix
188
00:10:28,100 --> 00:10:30,710
times the unknown
vector, being known,
189
00:10:30,710 --> 00:10:33,950
right hand side b. Ax equal b.
190
00:10:33,950 --> 00:10:38,330
So suppose we have
too many equations.
191
00:10:38,330 --> 00:10:39,350
That often happens.
192
00:10:39,350 --> 00:10:40,940
If you take too
many measurements,
193
00:10:40,940 --> 00:10:43,740
you want to get an exact x.
194
00:10:43,740 --> 00:10:46,520
So you do more and
more measurements to b.
195
00:10:46,520 --> 00:10:48,860
You're pasting more and
more conditions on x.
196
00:10:48,860 --> 00:10:52,400
And you're not going to find
an exact x because you've
197
00:10:52,400 --> 00:10:55,690
got too many equations.
m is bigger than n.
198
00:10:55,690 --> 00:10:57,980
We might have
2,000 measurements,
199
00:10:57,980 --> 00:11:02,160
say, from medical things
or from satellites.
200
00:11:02,160 --> 00:11:04,250
And we might have
only two unknowns,
201
00:11:04,250 --> 00:11:07,430
fitting a straight line
with only two variables.
202
00:11:07,430 --> 00:11:10,460
So how am I going to solve
2,000 equations with two
203
00:11:10,460 --> 00:11:12,860
unknowns? Well, I'm not.
204
00:11:12,860 --> 00:11:16,680
But I look for
the best solution.
205
00:11:16,680 --> 00:11:18,050
How close can I come?
206
00:11:18,050 --> 00:11:20,370
And that's what least
squares is about.
207
00:11:20,370 --> 00:11:24,990
You get Ax as close
as possible to b.
208
00:11:24,990 --> 00:11:29,010
And probably, this
will show how the--
209
00:11:29,010 --> 00:11:29,840
yeah.
210
00:11:29,840 --> 00:11:31,800
Yeah, here's the right equation.
211
00:11:31,800 --> 00:11:34,720
When you-- here's my message.
212
00:11:34,720 --> 00:11:39,210
When you can't solve Ax
equal b, multiply both sides
213
00:11:39,210 --> 00:11:41,130
by A transpose.
214
00:11:41,130 --> 00:11:43,390
Then you can solve
this equation.
215
00:11:43,390 --> 00:11:44,760
That's the right equation.
216
00:11:44,760 --> 00:11:48,030
So I put a little
hat on that x to show
217
00:11:48,030 --> 00:11:52,650
that it doesn't solve the
original equation, Ax equal b,
218
00:11:52,650 --> 00:11:54,880
but it comes the closest.
219
00:11:54,880 --> 00:11:57,300
It's the closest
solution I could find.
220
00:11:57,300 --> 00:12:01,320
And it's discovered by
multiplying both sides
221
00:12:01,320 --> 00:12:03,480
by this A transpose matrix.
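For the straight-line case with two unknowns, the equation A transpose A x-hat equal A transpose b is only 2 by 2 and can be solved by hand. The sketch below assumes a line b = c + d t, so each row of A is (1, t); the function name `fit_line` and the three data points are made up for illustration.

```python
def fit_line(ts, bs):
    """Least-squares line b = c + d*t from the normal equations A^T A x = A^T b."""
    m = len(ts)
    # Entries of A^T A = [[m, s_t], [s_t, s_tt]] and A^T b = [s_b, s_tb].
    s_t = sum(ts)
    s_tt = sum(t * t for t in ts)
    s_b = sum(bs)
    s_tb = sum(t * b for t, b in zip(ts, bs))
    det = m * s_tt - s_t * s_t          # nonzero when the t's are not all equal
    c = (s_tt * s_b - s_t * s_tb) / det
    d = (m * s_tb - s_t * s_b) / det
    return c, d

# Three made-up measurements that do not lie on one line.
ts = [0, 1, 2]
bs = [6, 0, 0]
c, d = fit_line(ts, bs)
print(c, d)  # 5.0 -3.0

# The error e = b - A x-hat is perpendicular to both columns of A.
e = [b - (c + d * t) for t, b in zip(ts, bs)]
print(sum(e), sum(t * ei for t, ei in zip(ts, e)))
```

The same pattern applies to 2,000 measurements: A gets longer, but A transpose A stays 2 by 2.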
222
00:12:03,480 --> 00:12:06,990
So A transpose A is a
terrifically important matrix.
223
00:12:06,990 --> 00:12:09,840
It's a square matrix.
224
00:12:09,840 --> 00:12:11,410
See, A didn't have to be square.
225
00:12:11,410 --> 00:12:13,630
I could have lots of
measurements there,
226
00:12:13,630 --> 00:12:17,200
many, many equations,
long, thin matrix for A.
227
00:12:17,200 --> 00:12:23,510
But A transpose A always comes
out square and also symmetric.
228
00:12:23,510 --> 00:12:28,510
And it's just a great
matrix for theory.
229
00:12:28,510 --> 00:12:33,340
And this QR business
makes it work in practice.
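One way to see how QR enters, assuming A has independent columns so that R is invertible: substitute A = QR into the equation above and use Q transpose times Q equal I.

```latex
A^{T} A \, \hat{x} = A^{T} b
\;\Longrightarrow\;
R^{T} Q^{T} Q R \, \hat{x} = R^{T} Q^{T} b
\;\Longrightarrow\;
R^{T} R \, \hat{x} = R^{T} Q^{T} b
\;\Longrightarrow\;
R \, \hat{x} = Q^{T} b .
```

Because R is triangular, the last system can be solved by back substitution.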
230
00:12:33,340 --> 00:12:34,780
Let me see if there's more.
231
00:12:34,780 --> 00:12:36,610
So this is, oh, yeah.
232
00:12:36,610 --> 00:12:39,650
This is the geometry.
233
00:12:39,650 --> 00:12:45,430
So I start with a matrix A. It's
only got a few columns, maybe
234
00:12:45,430 --> 00:12:47,170
even only two columns.
235
00:12:47,170 --> 00:12:52,150
So its column space is just
a plane, not the whole space.
236
00:12:52,150 --> 00:12:56,690
But my right hand side b is
somewhere else in whole space.
237
00:12:56,690 --> 00:12:58,800
You see this situation.
238
00:12:58,800 --> 00:13:02,990
I can only solve Ax
equal b when b is
239
00:13:02,990 --> 00:13:04,550
a combination of the columns.
240
00:13:04,550 --> 00:13:06,110
And here, it's not.
241
00:13:06,110 --> 00:13:07,970
The measurements
weren't perfect.
242
00:13:07,970 --> 00:13:09,200
I'm off somewhere.
243
00:13:09,200 --> 00:13:12,840
So how do you deal with that?
244
00:13:12,840 --> 00:13:15,360
Geometry tells you.
245
00:13:15,360 --> 00:13:17,940
You can't deal with b.
246
00:13:17,940 --> 00:13:20,130
You can't solve Ax equal b.
247
00:13:20,130 --> 00:13:22,140
So you drop a perpendicular.
248
00:13:22,140 --> 00:13:25,850
You find the closest
point, the projection p,
249
00:13:25,850 --> 00:13:28,760
in the space where
you can solve.
250
00:13:28,760 --> 00:13:32,160
So then you solve Ax equal p.
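The picture connects to the equation with A transpose in a few steps. Writing p for the projection and e for the error (the symbol e is introduced here for illustration), the perpendicular dropped onto the column space gives:

```latex
p = A\hat{x}, \qquad e = b - p = b - A\hat{x}, \qquad
A^{T} e = 0
\;\Longrightarrow\;
A^{T}\,(b - A\hat{x}) = 0
\;\Longrightarrow\;
A^{T} A \, \hat{x} = A^{T} b .
```

The condition A transpose e equal 0 says that e is perpendicular to every column of A, which is exactly what dropping a perpendicular means.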
251
00:13:32,160 --> 00:13:35,160
That's what least squares is
all about, fitting the best
252
00:13:35,160 --> 00:13:37,560
straight line,
the best parabola,
253
00:13:37,560 --> 00:13:42,210
whatever, is all linear
algebra of perpendicular
254
00:13:42,210 --> 00:13:45,460
things and orthogonal matrices.
255
00:13:45,460 --> 00:13:50,610
OK, I think that's what I
can say about orthogonal.
256
00:13:50,610 --> 00:13:51,960
Well, it'll come in again.
257
00:13:51,960 --> 00:13:55,080
Orthogonal matrices,
perpendicular columns
258
00:13:55,080 --> 00:13:59,190
is so beautiful, but next
is coming eigenvectors.
259
00:13:59,190 --> 00:14:00,810
And that's another chapter.
260
00:14:00,810 --> 00:14:02,410
So I'll stop here.
261
00:14:02,410 --> 00:14:02,910
Good.
262
00:14:02,910 --> 00:14:04,460
Thanks.