1
00:00:00 --> 00:00:01
2
00:00:01 --> 00:00:02
The following content is
provided under a Creative
3
00:00:02 --> 00:00:03
Commons license.
4
00:00:03 --> 00:00:06
Your support will help MIT
OpenCourseWare continue to
5
00:00:06 --> 00:00:10
offer high-quality educational
resources for free.
6
00:00:10 --> 00:00:12
To make a donation, or to view
additional materials from
7
00:00:12 --> 00:00:20
hundreds of MIT courses, visit
MIT OpenCourseWare at ocw.mit.edu.
8
00:00:20 --> 00:00:24
PROFESSOR STRANG: Ready
for the least squares
9
00:00:24 --> 00:00:29
lecture, lecture 11?
10
00:00:29 --> 00:00:32
Homework is just being
posted on the web.
11
00:00:32 --> 00:00:39
It'll be due, it's really to
help you practice, get some
12
00:00:39 --> 00:00:44
experience on these sections
for the first exam.
13
00:00:44 --> 00:00:46
That's Tuesday evening.
14
00:00:46 --> 00:00:49
So eight days away.
15
00:00:49 --> 00:00:52
So the homework will
be due the day after.
16
00:00:52 --> 00:00:58
And actually, we'll try to move
the review session to Monday
17
00:00:58 --> 00:01:02
next week so you can ask me any
questions about the homework
18
00:01:02 --> 00:01:05
or any review material.
19
00:01:05 --> 00:01:09
So that's all a week away
and this week we get
20
00:01:09 --> 00:01:11
two great examples.
21
00:01:11 --> 00:01:16
Least squares is one
that comes today.
22
00:01:16 --> 00:01:19
But could I first, because I
keep learning more, and I've
23
00:01:19 --> 00:01:23
got your MATLAB homeworks to
return, I keep sort of learning
24
00:01:23 --> 00:01:28
a little more from your MATLAB
results and I think because we
25
00:01:28 --> 00:01:31
spoke about it, it would be
worth speaking just
26
00:01:31 --> 00:01:33
a little more.
27
00:01:33 --> 00:01:40
So I'm going to take ten
minutes about this convection
28
00:01:40 --> 00:01:45
diffusion equation in which I
put in a coefficient d, a
29
00:01:45 --> 00:01:49
diffusivity just to help
get the units right.
30
00:01:49 --> 00:01:51
So this is your example.
31
00:01:51 --> 00:01:55
And it had d=1 of course.
32
00:01:55 --> 00:02:01
Well first I realized that
later in the book I completely
33
00:02:01 --> 00:02:04
forgot that I discuss
this problem.
34
00:02:04 --> 00:02:07
About page 509 I think.
35
00:02:07 --> 00:02:09
I discussed it a little bit.
36
00:02:09 --> 00:02:16
And just because it's worth,
since we invested a little
37
00:02:16 --> 00:02:19
time, the little bit
more will pay off.
38
00:02:19 --> 00:02:25
So first of all, the point
is here we have convection
39
00:02:25 --> 00:02:27
competing with diffusion.
40
00:02:27 --> 00:02:32
And always there's some
non-dimensional number.
41
00:02:32 --> 00:02:34
Here it's called
the Peclet number.
42
00:02:34 --> 00:02:36
Actually, there's an
accent on one of those
43
00:02:36 --> 00:02:39
e's, Peclet number.
44
00:02:39 --> 00:02:43
Which measures the ratio, the
importance of convection
45
00:02:43 --> 00:02:45
relative to diffusion.
46
00:02:45 --> 00:02:52
So it's V times a length scale
in the problem divided by d.
47
00:02:52 --> 00:02:55
So then that has the same
units as that if the
48
00:02:55 --> 00:02:58
result is dimensionless.
49
00:02:58 --> 00:03:00
Maybe you know the
Reynolds number.
50
00:03:00 --> 00:03:04
This is very like the Reynolds
number, which also measures in
51
00:03:04 --> 00:03:10
Navier-Stokes equation the
importance of convection,
52
00:03:10 --> 00:03:14
advection and diffusion.
53
00:03:14 --> 00:03:20
There in that equation the
velocity, V, that's a
54
00:03:20 --> 00:03:23
non-linear equation,
Navier-Stokes, it's
55
00:03:23 --> 00:03:28
tremendously important and
many codes to solve it,
56
00:03:28 --> 00:03:33
lots of discussion, theory
still not complete.
57
00:03:33 --> 00:03:36
In that problem, the V is u.
58
00:03:36 --> 00:03:41
It's non-linear and the term
there that we took as a
59
00:03:41 --> 00:03:45
constant, as a given constant
V, it's the same as u.
60
00:03:45 --> 00:03:52
So in the Reynolds number, this
would be u, a typical velocity
61
00:03:52 --> 00:03:57
u, times a typical length
scale, which would be like one
62
00:03:57 --> 00:04:02
in our zero to one problem,
divided by d or mu or nu,
63
00:04:02 --> 00:04:05
whatever number we use.
64
00:04:05 --> 00:04:07
So it's like the
Reynolds number.
65
00:04:07 --> 00:04:12
And then it's turned out for
this problem that people also
66
00:04:12 --> 00:04:18
use a number that gets called
the cell Peclet number where
67
00:04:18 --> 00:04:23
the length is taken to be
half the cell size,
68
00:04:23 --> 00:04:24
delta x over two.
69
00:04:24 --> 00:04:26
Let me call that number P.
70
00:04:26 --> 00:04:28
So that's P.
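As a quick numeric check of that definition (a minimal sketch; the function name is just illustrative, and d defaults to 1 as in the homework):

```python
def cell_peclet(V, dx, d=1.0):
    """Cell Peclet number: convection strength V times the length
    scale delta_x / 2, divided by the diffusivity d."""
    return V * (dx / 2.0) / d
```

With V = 2 and delta x = 1 (and d = 1) this gives exactly 1, the transition value discussed below.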
71
00:04:28 --> 00:04:33
And what's my point?
72
00:04:33 --> 00:04:36
This equation's important
enough to sort of see a little
73
00:04:36 --> 00:04:40
more about it than just
the numbers that come out.
74
00:04:40 --> 00:04:49
So the MATLAB homework which
you did really well set up
75
00:04:49 --> 00:04:51
finite differences for this.
76
00:04:51 --> 00:04:55
Right?
77
00:04:55 --> 00:05:01
And found the eigenvalues
and solutions.
78
00:05:01 --> 00:05:05
It's the eigenvalues I want
to say a little more about.
79
00:05:05 --> 00:05:14
Because you set up a matrix
K over delta x squared
80
00:05:14 --> 00:05:21
and V times the center
difference over delta x.
81
00:05:21 --> 00:05:30
And I guess I called that whole
combination L and asked you
82
00:05:30 --> 00:05:34
about the eigenvalues of L.
83
00:05:34 --> 00:05:37
And you printed them
out correctly.
84
00:05:37 --> 00:05:43
But there's more there than
I think we have understood.
85
00:05:43 --> 00:05:47
And I want to make some
more comments about that.
86
00:05:47 --> 00:05:49
Because it's quite important.
87
00:05:49 --> 00:05:52
And the comments are
clearest if I just
88
00:05:52 --> 00:05:55
reduce to n equal 2.
89
00:05:55 --> 00:06:01
So that matrix, well the
off-diagonal part of that
90
00:06:01 --> 00:06:06
matrix had some number
b and some number c.
91
00:06:06 --> 00:06:10
Actually we could figure out
what was the b in this.
92
00:06:10 --> 00:06:14
This produced a minus
1, 2, minus 1, right?
93
00:06:14 --> 00:06:20
So part of the b was the minus
1 over delta x squared.
94
00:06:20 --> 00:06:27
And then from this was a plus
V and a 1 over, well it's
95
00:06:27 --> 00:06:31
a center difference so I
should divide by 2 delta x.
96
00:06:31 --> 00:06:32
Is that right?
97
00:06:32 --> 00:06:37
Is that the typical
off-diagonal thing in the
98
00:06:37 --> 00:06:39
matrix that you displayed?
99
00:06:39 --> 00:06:42
That's what's coming from
the off-diagonal of K.
100
00:06:42 --> 00:06:46
And this is what's coming from
the center difference c.
101
00:06:46 --> 00:06:50
And then what would
this c thing be?
102
00:06:50 --> 00:06:54
Well the c is below the
diagonal so it's also at minus
103
00:06:54 --> 00:06:56
1 over delta x squared.
104
00:06:56 --> 00:06:59
But now this is a difference,
so it's going to
105
00:06:59 --> 00:07:01
be a minus, right?
106
00:07:01 --> 00:07:09
I think those would have been
your entries for b and c.
107
00:07:09 --> 00:07:12
So can we just think first,
what are the eigenvalues
108
00:07:12 --> 00:07:14
of that matrix?
109
00:07:14 --> 00:07:17
It's a two by two
simple problem.
110
00:07:17 --> 00:07:18
The trace is zero plus zero.
111
00:07:20 --> 00:07:24
So that the eigenvalues will
be a plus minus pair because
112
00:07:24 --> 00:07:25
they have to add to zero.
113
00:07:25 --> 00:07:29
And I think that's the
plus minus pair you get.
114
00:07:29 --> 00:07:30
Let's just check.
115
00:07:30 --> 00:07:32
What's our other check?
116
00:07:32 --> 00:07:34
They will add to zero, the
plus the square root and
117
00:07:34 --> 00:07:37
minus the square root.
118
00:07:37 --> 00:07:40
And the product of the two
eigenvalues, lambda one times
119
00:07:40 --> 00:07:45
lambda two, will be, we have
one of them is plus, one with
120
00:07:45 --> 00:07:48
a minus, so it'd be minus bc.
121
00:07:48 --> 00:07:53
And that's correctly
the determinant.
122
00:07:53 --> 00:07:54
So it's good.
123
00:07:54 --> 00:07:57
These are the correct
eigenvalues.
124
00:07:57 --> 00:08:03
Now let me ask you about
the signs of b and c.
125
00:08:03 --> 00:08:07
If b and c have the same signs,
like maybe even equal, one,
126
00:08:07 --> 00:08:12
one, what are the eigenvalues?
127
00:08:12 --> 00:08:16
So in that symmetric case
if b and c are equal
128
00:08:16 --> 00:08:19
the eigenvalues are?
129
00:08:19 --> 00:08:21
Right here.
130
00:08:21 --> 00:08:26
If b and c are equal, say
equal to one, the eigenvalues
131
00:08:26 --> 00:08:28
are plus and minus one.
132
00:08:28 --> 00:08:32
But what if the
signs are opposite?
133
00:08:32 --> 00:08:33
Everything changes.
134
00:08:33 --> 00:08:37
What if b is one and
c is minus one?
135
00:08:37 --> 00:08:40
That matrix would then be
a 90 degree rotation.
136
00:08:40 --> 00:08:46
It would be anti-symmetric if b
was one and c was minus one.
137
00:08:46 --> 00:08:50
Our formula is still correct,
but what does it give us?
138
00:08:50 --> 00:08:56
If b is one and c is minus
one what have I got here?
139
00:08:56 --> 00:08:57
I've got i.
140
00:08:57 --> 00:09:01
So the eigenvalues change from
plus and minus one in the
141
00:09:01 --> 00:09:04
symmetric case to plus
and minus i in the
142
00:09:04 --> 00:09:05
anti-symmetric case.
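That two-by-two case is easy to verify numerically. A small sketch (the function name is illustrative) of the eigenvalue formula lambda = a ± sqrt(bc) for the matrix [[a, b], [c, a]]:

```python
import cmath

def eigs_2x2(a, b, c):
    """Eigenvalues of [[a, b], [c, a]]: the trace gives 2a, the
    determinant a^2 - bc, so lambda = a +/- sqrt(b*c)."""
    root = cmath.sqrt(b * c)  # goes imaginary when b*c < 0
    return a + root, a - root
```

Same signs (b = c = 1) give plus and minus one; opposite signs (b = 1, c = -1) give plus and minus i, exactly the symmetric-to-antisymmetric switch in the lecture, and a only shifts both eigenvalues.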
143
00:09:05 --> 00:09:10
And I think that's what
you guys saw at a
144
00:09:10 --> 00:09:12
certain level of V.
145
00:09:12 --> 00:09:16
I hope you did because that was
the point about eigenvalues.
146
00:09:16 --> 00:09:20
Now you may say, what
about the diagonal?
147
00:09:20 --> 00:09:22
Well I claim diagonal
is very simple.
148
00:09:22 --> 00:09:26
What's the diagonal?
149
00:09:26 --> 00:09:28
Now I'm going to allow
myself a diagonal and I'm
150
00:09:28 --> 00:09:30
just going to change.
151
00:09:30 --> 00:09:34
What happens if I have a and a?
152
00:09:34 --> 00:09:36
Same entry on the diagonal.
153
00:09:36 --> 00:09:39
What are the eigenvalues now?
154
00:09:39 --> 00:09:42
This is just like, a great
chance to do some basic
155
00:09:42 --> 00:09:43
eigenvalue stuff.
156
00:09:43 --> 00:09:47
What are the eigenvalues
of that matrix?
157
00:09:47 --> 00:09:50
Well I've added a
times the identity.
158
00:09:50 --> 00:09:54
I've just shifted
that matrix by a.
159
00:09:54 --> 00:09:57
So the eigenvalues
all shift by a.
160
00:09:57 --> 00:10:02
So the eigenvalues are
now a plus and minus.
161
00:10:02 --> 00:10:06
So no big deal.
162
00:10:06 --> 00:10:11
So you say that the a is
actually not important, not
163
00:10:11 --> 00:10:15
the key to this question
of are they real or
164
00:10:15 --> 00:10:18
do they go complex.
165
00:10:18 --> 00:10:22
So the eigenvalues of this
are real when b and c
166
00:10:22 --> 00:10:24
have the same sign.
167
00:10:24 --> 00:10:26
If b and c have the same
sign, I have a square
168
00:10:26 --> 00:10:29
root, no problem.
169
00:10:29 --> 00:10:33
When b and c have opposite
sign, what do I get?
170
00:10:33 --> 00:10:36
When b and c have opposite
sign, I'm taking the square
171
00:10:36 --> 00:10:38
root of a negative number
and I've gone complex.
172
00:10:38 --> 00:10:44
Do you see that the change from
real eigenvalues, which gives
173
00:10:44 --> 00:10:50
a nice curve, to complex
eigenvalues, which gives a very
174
00:10:50 --> 00:10:58
bumpy curve for the solution,
just happens when like, for
175
00:10:58 --> 00:11:04
example, b-- is it b that's
going to go to zero maybe?
176
00:11:04 --> 00:11:10
And then beyond that?
177
00:11:10 --> 00:11:14
Well this sign is for
sure negative, right?
178
00:11:14 --> 00:11:16
So c is staying negative.
179
00:11:16 --> 00:11:24
And originally for a little
delta x, b is also negative.
180
00:11:24 --> 00:11:31
What's happening here?
181
00:11:31 --> 00:11:36
I think that the transition
that you, I hope, observed
182
00:11:36 --> 00:11:39
comes when b hits zero.
183
00:11:39 --> 00:11:44
When the combination of V and
delta x is such that at b=0 we
184
00:11:44 --> 00:11:50
switch from real eigenvalues
to complex eigenvalues.
185
00:11:50 --> 00:11:51
And when is b=0?
186
00:11:51 --> 00:11:55
187
00:11:55 --> 00:12:01
That's when this negative
guy off the diagonal just
188
00:12:01 --> 00:12:04
exactly cancels this one.
189
00:12:04 --> 00:12:09
So b is zero when what?
190
00:12:09 --> 00:12:14
So if this equals this, one
over delta x squared is equal
191
00:12:14 --> 00:12:19
to V over two delta x, let me
multiply both sides by delta x
192
00:12:19 --> 00:12:24
squared so that I have a nice
one there, multiplying by delta
193
00:12:24 --> 00:12:28
x squared will put
a delta x up here.
194
00:12:28 --> 00:12:29
And what have we discovered?
195
00:12:29 --> 00:12:33
This is why I wanted
you to see it.
196
00:12:33 --> 00:12:39
That the transition comes when
the Peclet number is one.
197
00:12:39 --> 00:12:43
So that Peclet number, that
cell Peclet number is exactly
198
00:12:43 --> 00:12:50
the point where we observed the
transition from real
199
00:12:50 --> 00:12:53
eigenvalues to
complex eigenvalues.
200
00:12:53 --> 00:12:55
And that's the transition.
201
00:12:55 --> 00:13:00
So it's that combination, this
is the Peclet number, cell
202
00:13:00 --> 00:13:09
Peclet number, it's that
combination, P_cell maybe.
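The transition can be checked with the actual entries (a sketch assuming d = 1 scaling as in the homework; names are illustrative): b sits above the diagonal, c below, and b crosses zero exactly when the cell Peclet number V delta x / 2 reaches one.

```python
def off_diagonal_entries(V, dx, d=1.0):
    """Off-diagonal entries of the discretized convection-diffusion
    operator: -d/dx^2 from K, plus or minus V/(2 dx) from the centered
    difference.  b = 0 exactly when V*dx/(2*d) = 1 (cell Peclet = 1)."""
    b = -d / dx**2 + V / (2.0 * dx)  # above the diagonal
    c = -d / dx**2 - V / (2.0 * dx)  # below the diagonal
    return b, c
```

For b times c positive the eigenvalues a ± sqrt(bc) stay real; once b changes sign the square root goes imaginary and the discrete solution starts to oscillate.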
203
00:13:09 --> 00:13:15
We've done the computations
and now we gradually get
204
00:13:15 --> 00:13:17
back to the meaning.
205
00:13:17 --> 00:13:22
And I just wanted to take this
step back to the meaning to see
206
00:13:22 --> 00:13:25
when do those numbers
start going complex.
207
00:13:25 --> 00:13:28
You may have noticed or you may
not have noticed that it'll
208
00:13:28 --> 00:13:33
happen when one of those, when
that upper diagonal
209
00:13:33 --> 00:13:35
changes sign.
210
00:13:35 --> 00:13:40
Now you could say, ok
that's the eigenvalues.
211
00:13:40 --> 00:13:44
What's the consequences for
the shape of the solution?
212
00:13:44 --> 00:13:47
Well, I haven't
figured all that out.
213
00:13:47 --> 00:13:50
I'd be happy to have some
more thoughts about that.
214
00:13:50 --> 00:13:59
But what you noticed, I think,
in the computations is if V got
215
00:13:59 --> 00:14:05
too big so that that P was
bigger than one, if V got too
216
00:14:05 --> 00:14:11
big, so convection was
dominating and our delta x was
217
00:14:11 --> 00:14:16
not small enough to deal with
it, you should have seen that
218
00:14:16 --> 00:14:21
the discrete values
were oscillating instead
219
00:14:21 --> 00:14:22
of a proper smooth curve.
220
00:14:22 --> 00:14:27
I mean, the proper, with a
large V, the correct solution,
221
00:14:27 --> 00:14:31
I think, is practically nothing
for here and then it goes, this
222
00:14:31 --> 00:14:35
is a really large V, take V
to a thousand or something.
223
00:14:35 --> 00:14:36
It climbs up like mad.
224
00:14:36 --> 00:14:40
Here's the halfway point
where the load is.
225
00:14:40 --> 00:14:44
And then it goes along here and
then it climbs down like mad to
226
00:14:44 --> 00:14:46
satisfy the boundary condition.
227
00:14:46 --> 00:14:51
I didn't know that that's what
would happen for large V.
228
00:14:51 --> 00:14:54
What I'm saying is, and
undoubtedly it could be
229
00:14:54 --> 00:14:57
understood physically, so I
guess what I'm saying is
230
00:14:57 --> 00:15:04
there's just more good stuff
in any computation than
231
00:15:04 --> 00:15:06
purely the numbers.
232
00:15:06 --> 00:15:11
And this is part of the good
stuff in that example.
233
00:15:11 --> 00:15:12
I hope you liked that.
234
00:15:12 --> 00:15:16
Because I mean, here you
did the work but then, to
235
00:15:16 --> 00:15:22
understand it is frankly
still under way.
236
00:15:22 --> 00:15:28
More thinking to do.
237
00:15:28 --> 00:15:31
That's back to least squares.
238
00:15:31 --> 00:15:35
Here's today's lecture.
239
00:15:35 --> 00:15:38
So remember where we
started last time.
240
00:15:38 --> 00:15:38
Au=b.
241
00:15:39 --> 00:15:40
Last time I wrote f.
242
00:15:40 --> 00:15:42
I regret it terribly.
243
00:15:42 --> 00:15:43
I can't fix it.
244
00:15:43 --> 00:15:45
But it's b.
245
00:15:45 --> 00:15:49
I want b there to be
the right-hand side.
246
00:15:49 --> 00:15:56
And I jumped to the equation
that determines the best u.
247
00:15:56 --> 00:16:02
There's no exact u because
we've got too many equations.
248
00:16:02 --> 00:16:04
You remember the set-up, we
have too many equations.
249
00:16:04 --> 00:16:09
There's noise in the
measurements and we can't
250
00:16:09 --> 00:16:10
get the error down to zero.
251
00:16:10 --> 00:16:13
There's some error.
252
00:16:13 --> 00:16:18
And the best u was given
by that equation and
253
00:16:18 --> 00:16:21
we want to say why.
254
00:16:21 --> 00:16:26
And understand it from
two or three ways.
255
00:16:26 --> 00:16:28
Calculus, geometry, everything.
256
00:16:28 --> 00:16:34
Can I first, because I love my
little framework here, fit it
257
00:16:34 --> 00:16:36
in because it's quite
important, this example
258
00:16:36 --> 00:16:39
and then others fit in.
259
00:16:39 --> 00:16:42
So u is our unknown as always.
260
00:16:42 --> 00:16:46
Then the matrix A in the
problem produces an Au.
261
00:16:46 --> 00:16:49
262
00:16:49 --> 00:16:55
Now two things to notice about
e, which, that's the same
263
00:16:55 --> 00:16:59
letter I used for elongation,
here it's standing for error.
264
00:16:59 --> 00:17:00
Two things to notice.
265
00:17:00 --> 00:17:05
One is that the source term,
which is b, comes in at this
266
00:17:05 --> 00:17:10
point of the framework.
267
00:17:10 --> 00:17:16
When we had external forces
on springs and on masses
268
00:17:16 --> 00:17:19
it came in at this point.
269
00:17:19 --> 00:17:21
We had an f there.
270
00:17:21 --> 00:17:23
So that's why I'd like to
keep those two separate.
271
00:17:23 --> 00:17:28
The b's are like voltage
sources, they come in here.
272
00:17:28 --> 00:17:30
The f's are will be
like current sources,
273
00:17:30 --> 00:17:32
they'll come in there.
274
00:17:32 --> 00:17:35
Actually it's beautiful.
275
00:17:35 --> 00:17:37
One more thing to notice.
276
00:17:37 --> 00:17:40
A is coming with a minus sign.
277
00:17:40 --> 00:17:43
In mechanics in masses
and springs we had e=Au.
278
00:17:45 --> 00:17:49
Here it's natural to work
with this, the error
279
00:17:49 --> 00:17:51
or the residual b-Au.
280
00:17:53 --> 00:18:00
And that minus sign is natural
in physics and in electrical
281
00:18:00 --> 00:18:05
engineering and hydraulics.
282
00:18:05 --> 00:18:07
Where's that minus sign
coming from in flow?
283
00:18:07 --> 00:18:11
Well, flow goes from the
higher point to the lower.
284
00:18:11 --> 00:18:14
Higher voltage to
the lower voltage.
285
00:18:14 --> 00:18:18
And that usually produces
that minus sign.
286
00:18:18 --> 00:18:23
No big deal, of course.
287
00:18:23 --> 00:18:27
So that step is fine
with the framework.
288
00:18:27 --> 00:18:32
What do we expect in
that middle step?
289
00:18:32 --> 00:18:37
So what's our name for the
matrix that goes there?
290
00:18:37 --> 00:18:40
Everybody's gotta
know this framework.
291
00:18:40 --> 00:18:42
C, right?
292
00:18:42 --> 00:18:46
Only I've been taking
unweighted least squares.
293
00:18:46 --> 00:18:49
So for unweighted
least squares, C will
294
00:18:49 --> 00:18:52
be the identity.
295
00:18:52 --> 00:18:55
And C doesn't show
in our equations.
296
00:18:55 --> 00:18:59
So C is the identity when there
are no weights, when all the
297
00:18:59 --> 00:19:02
equations are equally reliable.
298
00:19:02 --> 00:19:05
And that's pretty
common, of course.
299
00:19:05 --> 00:19:07
But not always.
300
00:19:07 --> 00:19:11
And we'll think, ok
there is a weight C.
301
00:19:11 --> 00:19:20
So w, which is Ce, is weighted
errors, you could say.
302
00:19:20 --> 00:19:23
So the letter w comes up
appropriately again.
303
00:19:23 --> 00:19:25
Weighted errors.
304
00:19:25 --> 00:19:28
And then what's the
good weighting?
305
00:19:28 --> 00:19:31
May I stay with C equal the
identity for the moment?
306
00:19:31 --> 00:19:33
Unweighted least squares,
because that's by
307
00:19:33 --> 00:19:35
far the most common.
308
00:19:35 --> 00:19:38
And then w and e are the same.
309
00:19:38 --> 00:19:39
C is the identity.
310
00:19:39 --> 00:19:42
And finally, there's the last
step in our framework where we
311
00:19:42 --> 00:19:45
always expect to
see A transpose.
312
00:19:45 --> 00:19:47
And we do.
313
00:19:47 --> 00:19:48
And we have to say why.
314
00:19:48 --> 00:19:51
So that's where I
left it last time.
315
00:19:51 --> 00:19:53
That this was the picture.
316
00:19:53 --> 00:19:54
This is the equation.
317
00:19:54 --> 00:19:59
If I had a matrix C, it
would go there and there.
318
00:19:59 --> 00:20:00
Right?
319
00:20:00 --> 00:20:06
Because I'd have b-Au and then
I'd apply C before A transpose.
320
00:20:06 --> 00:20:11
So C would slip in there before
A transpose on both sides.
321
00:20:11 --> 00:20:13
So that would, with the
C's there, that would
322
00:20:13 --> 00:20:17
be the weighted least
squares equation.
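In symbols, the weighted version just described is:

```latex
\[
A^{\mathsf{T}} A\,\hat{u} = A^{\mathsf{T}} b
\quad\longrightarrow\quad
A^{\mathsf{T}} C A\,\hat{u} = A^{\mathsf{T}} C\, b ,
\]
```

with C equal to the identity recovering the unweighted equation.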
323
00:20:17 --> 00:20:20
You see that it would be A
transpose C A instead of A
324
00:20:20 --> 00:20:26
transpose A, but still the
main facts are there.
325
00:20:26 --> 00:20:30
So where does the
equation come from?
326
00:20:30 --> 00:20:36
So one source, one way to get
the equation is from calculus.
327
00:20:36 --> 00:20:42
From minimizing,
from minimizing.
328
00:20:42 --> 00:20:45
Set a derivative to
zero, calculus.
329
00:20:45 --> 00:20:48
And what's the quantity
we're minimizing?
330
00:20:48 --> 00:20:51
We're minimizing that
squared error because
331
00:20:51 --> 00:20:55
this is least squares.
332
00:20:55 --> 00:20:58
We're minimizing this,
e transpose e, the
333
00:20:58 --> 00:20:59
length of e squared.
334
00:20:59 --> 00:21:01
The sum of the squares
of the errors.
335
00:21:01 --> 00:21:06
Which is (b-Au) transpose (b-Au).
336
00:21:06 --> 00:21:09
337
00:21:09 --> 00:21:13
Again I could say where
to slip in the C matrix.
338
00:21:13 --> 00:21:15
If there was one, it
would go in there.
339
00:21:15 --> 00:21:19
C would go in there,
C would go in there.
340
00:21:19 --> 00:21:20
There'd be a C in the equation.
341
00:21:20 --> 00:21:26
But let's keep C to
be the identity.
342
00:21:26 --> 00:21:28
So I minimized.
343
00:21:28 --> 00:21:29
It's a quadratic.
344
00:21:29 --> 00:21:35
It's got u's times u's,
so second degree.
345
00:21:35 --> 00:21:38
And what's the coefficient
in that second degree part?
346
00:21:38 --> 00:21:42
Well, the second degree part is
coming from Au transpose Au.
347
00:21:44 --> 00:21:45
Right?
348
00:21:45 --> 00:21:47
This times this is
going to be linear.
349
00:21:47 --> 00:21:50
This times this is
going to be linear.
350
00:21:50 --> 00:21:52
That times that is just
going to be a constant,
351
00:21:52 --> 00:21:54
its derivative is zero.
352
00:21:54 --> 00:21:58
But this times this is
altogether, that times that is
353
00:21:58 --> 00:22:02
the u transpose A transpose Au.
354
00:22:02 --> 00:22:04
Right?
355
00:22:04 --> 00:22:07
So that's the quadratic part.
356
00:22:07 --> 00:22:16
And my only point is it's like
our old stiffness matrix.
357
00:22:16 --> 00:22:21
We're seeing the matrix in
here is A transpose A.
358
00:22:21 --> 00:22:29
In other words, when I do
calculus and maybe I'd prefer
359
00:22:29 --> 00:22:33
to see something than
just compute away, take
360
00:22:33 --> 00:22:35
derivatives mechanically.
361
00:22:35 --> 00:22:42
So I'm going to leave that
which is done in the text,
362
00:22:42 --> 00:22:44
finding the derivative,
setting to zero.
363
00:22:44 --> 00:22:45
And what does it give?
364
00:22:45 --> 00:22:48
It gives us our equation.
365
00:22:48 --> 00:22:52
So that equation will come
when I set the derivatives
366
00:22:52 --> 00:22:54
of this thing to zero.
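The derivative computation being left to the text goes like this:

```latex
\[
\|b - Au\|^{2} = (b - Au)^{\mathsf{T}}(b - Au)
= b^{\mathsf{T}} b \;-\; 2\,u^{\mathsf{T}} A^{\mathsf{T}} b \;+\; u^{\mathsf{T}} A^{\mathsf{T}} A\, u ,
\]
\[
\nabla_{u} = -2 A^{\mathsf{T}} b + 2 A^{\mathsf{T}} A\, u = 0
\quad\Longrightarrow\quad
A^{\mathsf{T}} A\,\hat{u} = A^{\mathsf{T}} b .
\]
```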
367
00:22:54 --> 00:22:58
So that's one totally
ok approach.
368
00:22:58 --> 00:23:02
But I like to see a
picture with it.
369
00:23:02 --> 00:23:03
I hope that's alright.
370
00:23:03 --> 00:23:09
To take the second
approach is to see why A
371
00:23:09 --> 00:23:11
transpose w equal zero.
372
00:23:11 --> 00:23:13
Why is that?
373
00:23:13 --> 00:23:16
What's going on in
that key step?
374
00:23:16 --> 00:23:17
This is always the key step.
375
00:23:17 --> 00:23:20
This is like the set-up step.
376
00:23:20 --> 00:23:23
This is the weighting step
with constants coming in.
377
00:23:23 --> 00:23:26
And here's the key step.
378
00:23:26 --> 00:23:28
Let's see that.
379
00:23:28 --> 00:23:32
So my picture.
380
00:23:32 --> 00:23:39
Let me draw that picture again.
381
00:23:39 --> 00:23:45
And my example was in
three dimensions, so m=3.
382
00:23:45 --> 00:23:48
383
00:23:48 --> 00:23:51
I've got three equations.
384
00:23:51 --> 00:23:55
The matrix A, oh I'm afraid I
don't remember what it was, but
385
00:23:55 --> 00:24:00
I think it was something like
[1, 1, 1; 0, 1, 3],
386
00:24:00 --> 00:24:02
was that maybe it?
387
00:24:02 --> 00:24:04
Just to connect to last time.
388
00:24:04 --> 00:24:12
And what I'm now calling b was
the vector, was it?
389
00:24:12 --> 00:24:13
Or was it not?
390
00:24:13 --> 00:24:15
It was maybe?
391
00:24:15 --> 00:24:18
That's right?
392
00:24:18 --> 00:24:20
And what was the point?
393
00:24:20 --> 00:24:24
If I draw the vector b it
goes there somewhere.
394
00:24:24 --> 00:24:27
If I draw the first column of
A, it goes here somewhere.
395
00:24:27 --> 00:24:31
If I draw the second column of
A, it goes there somewhere.
396
00:24:31 --> 00:24:38
And if I draw all combinations
of these columns, all
397
00:24:38 --> 00:24:43
combinations of that vector and
that vector, what do I get?
398
00:24:43 --> 00:24:45
I get a plane.
399
00:24:45 --> 00:24:47
I get a plane.
400
00:24:47 --> 00:24:48
There it is.
401
00:24:48 --> 00:24:49
That's the plane.
402
00:24:49 --> 00:24:50
That's the plane.
403
00:24:50 --> 00:24:52
This is from column one.
404
00:24:52 --> 00:24:54
Here's column two.
405
00:24:54 --> 00:24:58
This plane is the column
plane or column space.
406
00:24:58 --> 00:25:03
It's the column space
of A because it comes
407
00:25:03 --> 00:25:05
from the columns of A.
408
00:25:05 --> 00:25:08
Now what's the point
about this plane?
409
00:25:08 --> 00:25:15
The point is that if b is on
the plane then I'm golden.
410
00:25:15 --> 00:25:20
If b is on the plane then b is
a combination of the columns,
411
00:25:20 --> 00:25:24
that's what the plane is, and
I have a solution to Au=b.
412
00:25:26 --> 00:25:35
So b on a plane, b on the
plane means Au=b is solvable.
413
00:25:35 --> 00:25:40
And it could happen, of course.
414
00:25:40 --> 00:25:42
Like perfect measurements.
415
00:25:42 --> 00:25:46
But we can't expect it.
416
00:25:46 --> 00:25:50
When we have three measurements
or 100 measurements or 10,000
417
00:25:50 --> 00:25:53
measurements we can't
expect perfection.
418
00:25:53 --> 00:25:56
So usually b will
be off the plane.
419
00:25:56 --> 00:25:57
Now what?
420
00:25:57 --> 00:25:59
What happens when b
is off the plane?
421
00:25:59 --> 00:26:01
Let me just complete
that picture.
422
00:26:01 --> 00:26:06
And you know what's coming.
423
00:26:06 --> 00:26:15
If we're going to get-- Au or
Au hat is going to be on the
424
00:26:15 --> 00:26:19
plane so I'm looking
for the best u hat.
425
00:26:19 --> 00:26:22
Can I just erase this to make
space for what you know
426
00:26:22 --> 00:26:25
I'm going to draw?
427
00:26:25 --> 00:26:28
Here are these little columns,
let me put them there.
428
00:26:28 --> 00:26:31
What am I going to draw?
429
00:26:31 --> 00:26:32
The projection.
430
00:26:32 --> 00:26:33
The projection.
431
00:26:33 --> 00:26:36
I'm going to draw,
what's the projection?
432
00:26:36 --> 00:26:40
The projection is the nearest
point that is in the plane to
433
00:26:40 --> 00:26:42
the b that's not in the plane.
434
00:26:42 --> 00:26:45
So here's the projection p.
435
00:26:45 --> 00:26:47
I drop down this thing.
436
00:26:47 --> 00:26:50
There's the projection
p, little p.
437
00:26:50 --> 00:26:53
That's the projection
of b onto the plane.
438
00:26:53 --> 00:26:58
I think your mind says yeah,
that's the right choice.
439
00:26:58 --> 00:27:03
And do you want to
tell me what this?
440
00:27:03 --> 00:27:06
That is the part that
we can't deal with.
441
00:27:06 --> 00:27:09
The part we can't improve.
442
00:27:09 --> 00:27:13
We've made it as small
as we could and it's e.
443
00:27:13 --> 00:27:18
That's the error e and
this p is the best guy
444
00:27:18 --> 00:27:20
that is in the plane.
445
00:27:20 --> 00:27:24
Do you see that this
is the picture.
446
00:27:24 --> 00:27:27
You get an actual picture
of what's going on.
447
00:27:27 --> 00:27:32
You're splitting b, the
measurements into the part you
448
00:27:32 --> 00:27:37
can deal with, the projection,
the Au hat that is in
449
00:27:37 --> 00:27:38
the column space.
450
00:27:38 --> 00:27:40
It is a combination
of the columns.
451
00:27:40 --> 00:27:42
Those points do lie on
a line if I'm doing
452
00:27:42 --> 00:27:43
straight line fitting.
453
00:27:43 --> 00:27:47
And the part that you can't
deal with, the e, the
454
00:27:47 --> 00:27:53
difference, b-Au, which
is not in the plane.
455
00:27:53 --> 00:27:56
And now I'm still looking
for the equations.
456
00:27:56 --> 00:27:56
Right?
457
00:27:56 --> 00:27:58
I've just named some stuff.
458
00:27:58 --> 00:28:05
But I haven't got an equation
for that projection.
459
00:28:05 --> 00:28:08
So what's the key fact?
460
00:28:08 --> 00:28:12
What's the key fact in this
picture that's going to lead
461
00:28:12 --> 00:28:20
me to an equation for p and
e and u hat and everything?
462
00:28:20 --> 00:28:27
The key fact is that that
dotted line is perpendicular,
463
00:28:27 --> 00:28:29
perpendicular to the plane.
464
00:28:29 --> 00:28:33
If I'm looking for the closest
point, everybody knows
465
00:28:33 --> 00:28:36
project, that's what
projection involves.
466
00:28:36 --> 00:28:37
Go perpendicular.
467
00:28:37 --> 00:28:40
This is a right angle.
468
00:28:40 --> 00:28:44
That e is perpendicular
to the whole plane.
469
00:28:44 --> 00:28:46
Not only perpendicular to
p, it's perpendicular to
470
00:28:46 --> 00:28:48
everybody in that plane.
471
00:28:48 --> 00:28:49
Right?
472
00:28:49 --> 00:28:51
I'm dropping the
perpendicular to the plane.
473
00:28:51 --> 00:28:54
Do you accept that?
474
00:28:54 --> 00:28:56
Because if you do,
we're through.
475
00:28:56 --> 00:29:00
We just write down the
equations for perpendicular and
476
00:29:00 --> 00:29:03
we've got what we want from
the picture instead of
477
00:29:03 --> 00:29:06
from a calculation.
478
00:29:06 --> 00:29:09
So what's the idea?
479
00:29:09 --> 00:29:14
So e is perpendicular
to the first column.
480
00:29:14 --> 00:29:18
So b in the plane,
we would be golden.
481
00:29:18 --> 00:29:20
Let's suppose we're
not in the plane.
482
00:29:20 --> 00:29:26
So now we have this 90 degree
angle, this perpendicular
483
00:29:26 --> 00:29:27
projection.
484
00:29:27 --> 00:29:31
And it tells me that the
first column-- oh I
485
00:29:31 --> 00:29:35
better name the columns.
486
00:29:35 --> 00:29:36
Can I just call
this column a_1?
487
00:29:37 --> 00:29:41
That first column is a_1 and
the second column is a_2.
488
00:29:42 --> 00:29:48
So those two columns, whatever
they are, are the guys whose
489
00:29:48 --> 00:29:51
combinations give us the plane.
490
00:29:51 --> 00:29:54
And it's the plane that
we're projecting onto.
491
00:29:54 --> 00:29:57
It's the plane of all
combinations that
492
00:29:57 --> 00:29:59
comes up here.
493
00:29:59 --> 00:30:01
So what's this 90 degree angle?
494
00:30:01 --> 00:30:06
It says that a_1 is
perpendicular to p, right?
495
00:30:06 --> 00:30:09
Sorry!
496
00:30:09 --> 00:30:11
Say that right for me.
497
00:30:11 --> 00:30:17
The first equation says that
a_1 and what are perpendicular?
498
00:30:17 --> 00:30:19
e, thank you, e.
499
00:30:19 --> 00:30:26
So the first equation says
that a_1 transpose e is zero.
500
00:30:26 --> 00:30:33
And the second equation says
that a_2 transpose e is zero.
501
00:30:33 --> 00:30:40
Those are my two equations.
502
00:30:40 --> 00:30:43
I have to convert those now
into matrix language because
503
00:30:43 --> 00:30:48
I've done them two separate--
vector, I mean vector
504
00:30:48 --> 00:30:50
language, and I want to get
into matrix language.
505
00:30:50 --> 00:30:52
But it's easy to do.
506
00:30:52 --> 00:30:58
Here I have, look, if
I have two equations,
507
00:30:58 --> 00:31:04
let's get a matrix here.
508
00:31:04 --> 00:31:09
What's it saying? a_1
transpose and a_2
509
00:31:09 --> 00:31:13
transpose, what are those?
510
00:31:13 --> 00:31:14
They're the rows
of A transpose.
511
00:31:14 --> 00:31:19
So the matrix way to say that
is A transpose e equal zero.
512
00:31:19 --> 00:31:24
In other words, this is
saying both at once, right?
513
00:31:24 --> 00:31:28
The first row of A transpose
times e gives zero, the
514
00:31:28 --> 00:31:31
second row of A transpose
times e is zero.
515
00:31:31 --> 00:31:35
So it's A transpose e equal
zero which is what we wanted
516
00:31:35 --> 00:31:40
in this case where w
and e are the same.
517
00:31:40 --> 00:31:42
Because C is the identity.
518
00:31:42 --> 00:31:45
And let's just go one
step further and see.
519
00:31:45 --> 00:31:51
That's A transpose
(b-Au hat) is zero.
520
00:31:51 --> 00:32:00
Remember this zero stands
for (0, 0), right?
521
00:32:00 --> 00:32:02
I wanted to put the two
equations together.
522
00:32:02 --> 00:32:07
So I've got two components
on the right-hand side.
523
00:32:07 --> 00:32:09
And then I just
plugged in what b is.
524
00:32:09 --> 00:32:11
And now everybody
sees it, right?
525
00:32:11 --> 00:32:16
Everybody sees that we've got
the picture, this 90 degree
526
00:32:16 --> 00:32:21
angle was the key to
these equations.
527
00:32:21 --> 00:32:27
Because if I put A transpose Au
hat onto the other side, I've
528
00:32:27 --> 00:32:34
got exactly the normal
equations that I wanted.
529
00:32:34 --> 00:32:42
We're taking the time to
see the picture and the
530
00:32:42 --> 00:32:44
form of the equations.
531
00:32:44 --> 00:32:51
Then I can plug in the numbers,
but the thinking is where the
532
00:32:51 --> 00:32:57
equations come from.
So we're there.
533
00:32:57 --> 00:32:59
Now what to do next?
534
00:32:59 --> 00:33:03
Now we've understood where
the equations come from.
535
00:33:03 --> 00:33:06
I didn't go through the steps
of taking the derivatives,
536
00:33:06 --> 00:33:09
but that would work.
537
00:33:09 --> 00:33:12
Or this picture.
538
00:33:12 --> 00:33:14
I love this picture.
539
00:33:14 --> 00:33:17
Let me stay with that
a little bit longer.
540
00:33:17 --> 00:33:19
What is u hat?
541
00:33:19 --> 00:33:29
Can I just go over here to say,
ok what have we got here?
542
00:33:29 --> 00:33:32
We started with Au=b
and then we got the
543
00:33:32 --> 00:33:37
projection was Au hat.
544
00:33:37 --> 00:33:39
But now what is u hat?
545
00:33:39 --> 00:33:45
I'm just going to assemble
things here. u hat we figured
546
00:33:45 --> 00:33:49
out by the 90 degree angle
comes from this equation, which
547
00:33:49 --> 00:33:54
is that equation, which is A
transpose Au hat equal
548
00:33:54 --> 00:33:59
A transpose b, the
central equation.
549
00:33:59 --> 00:34:02
That's the central equation.
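[The central equation can be checked numerically. A minimal numpy sketch, using the ones-and-times columns that come up in this lecture (times 0, 1, 3) and a made-up measurement vector b, since the actual numbers aren't on the board here:]

```python
import numpy as np

# Columns a_1 (all ones) and a_2 (the times 0, 1, 3); b is hypothetical data.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])

# The central equation: A^T A u_hat = A^T b
u_hat = np.linalg.solve(A.T @ A, A.T @ b)

# The error e = b - A u_hat should be perpendicular to both columns.
e = b - A @ u_hat
print(A[:, 0] @ e, A[:, 1] @ e)   # both essentially zero
```

[Both dot products come out at machine precision, which is exactly the picture: e is perpendicular to the whole plane of combinations of a_1 and a_2.]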
550
00:34:02 --> 00:34:08
Now plug in u hat here so I get
a formula for the projection.
551
00:34:08 --> 00:34:11
While we're doing all this
stuff we might just as well put
552
00:34:11 --> 00:34:14
those two pieces together and
have a formula for
553
00:34:14 --> 00:34:15
the projection.
554
00:34:15 --> 00:34:20
So it's A times u hat-- I
hope you like this formula.
555
00:34:20 --> 00:34:25
It's kind of goofy-looking
but you'll remember it.
556
00:34:25 --> 00:34:27
What is u hat?
557
00:34:27 --> 00:34:31
The whole point is that
this matrix is good.
558
00:34:31 --> 00:34:35
It's square, it's symmetric,
it's invertible, we'll have
559
00:34:35 --> 00:34:37
another word about that.
560
00:34:37 --> 00:34:48
And now I'll invert it
times A transpose b.
561
00:34:48 --> 00:34:53
That's the goofy formula
that I wanted you to see.
562
00:34:53 --> 00:35:02
The projection of vector b onto
these columns of A comes from
563
00:35:02 --> 00:35:08
applying this matrix, sometimes
I call it the matrix
564
00:35:08 --> 00:35:11
of four A's.
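[The matrix of four A's can be written out directly. A sketch with the same hypothetical 3-by-2 matrix of ones and times; P here is built exactly from the formula on the board, P = A (A^T A)^{-1} A^T:]

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])   # made-up measurements

# The "four A's" projection matrix onto the column space of A.
P = A @ np.linalg.inv(A.T @ A) @ A.T
p = P @ b        # the projection of b
e = b - p        # the perpendicular error

print(P @ e)     # projecting e gives the zero vector: e is outside the plane
```

[Note p + e = b splits any b into its piece in the column space and its perpendicular piece.]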
565
00:35:11 --> 00:35:16
Now it's worth looking
at that matrix.
566
00:35:16 --> 00:35:19
Often I'll call that
matrix capital P.
567
00:35:19 --> 00:35:22
It's the projection matrix.
568
00:35:22 --> 00:35:26
You give me any vector b, I
multiply it by this matrix
569
00:35:26 --> 00:35:28
and I get the projection.
570
00:35:28 --> 00:35:34
It's just worth seeing what
this matrix P, these four
571
00:35:34 --> 00:35:41
A's, what are projection
matrices like.
572
00:35:41 --> 00:35:47
Now first of all, when I have
an inverse of a product any
573
00:35:47 --> 00:35:52
reasonable person would say ok,
split that into A inverse times
574
00:35:52 --> 00:35:55
A transpose inverse and
simplify the whole thing.
575
00:35:55 --> 00:35:58
And what will happen?
576
00:35:58 --> 00:36:01
It's not going to be legal,
but let's just pretend.
577
00:36:01 --> 00:36:06
If I split this into A inverse
times A transpose inverse and
578
00:36:06 --> 00:36:09
simplify, what do I get for P?
579
00:36:09 --> 00:36:10
Do you see it?
580
00:36:10 --> 00:36:18
I'll get A and if I try
to split this into that,
581
00:36:18 --> 00:36:19
what do I have here?
582
00:36:19 --> 00:36:22
I've got the identity.
583
00:36:22 --> 00:36:23
That's the identity.
584
00:36:23 --> 00:36:25
That's the identity.
585
00:36:25 --> 00:36:29
The result is the identity.
586
00:36:29 --> 00:36:33
That doesn't look good, right?
587
00:36:33 --> 00:36:36
The identity is not the same as P.
588
00:36:36 --> 00:36:41
This matrix cannot be split
into these two pieces.
589
00:36:41 --> 00:36:45
A is rectangular,
that's its problem.
590
00:36:45 --> 00:36:49
If A was square, oh yeah,
think about the case
591
00:36:49 --> 00:36:50
when A is square.
592
00:36:50 --> 00:36:52
Suppose m equals n.
593
00:36:52 --> 00:36:54
That case'll be included here.
594
00:36:54 --> 00:36:58
If m equals n and my matrix is
square and invertible and
595
00:36:58 --> 00:37:01
golden then all this works.
596
00:37:01 --> 00:37:04
The projection is the
identity matrix.
597
00:37:04 --> 00:37:06
And what's with my picture?
598
00:37:06 --> 00:37:10
What's my picture look
like in the case where
599
00:37:10 --> 00:37:13
A is a square matrix?
600
00:37:13 --> 00:37:14
Give it another column.
601
00:37:14 --> 00:37:17
Fit this thing by a quadratic.
602
00:37:17 --> 00:37:21
So if I was fitting instead of
by a straight line, by a
603
00:37:21 --> 00:37:25
quadratic, it turns out I'd
have zero squared, one squared
604
00:37:25 --> 00:37:29
and three squared
in that column.
605
00:37:29 --> 00:37:32
I'd have a three
by three matrix.
606
00:37:32 --> 00:37:35
It comes out to be invertible.
607
00:37:35 --> 00:37:37
Now what's going on?
608
00:37:37 --> 00:37:39
Now what's my problem Au=b?
609
00:37:40 --> 00:37:49
Now suddenly m is still three,
but now n is three. b is?
610
00:37:49 --> 00:37:53
And what happened to the
plane? b is in there.
611
00:37:53 --> 00:37:57
And now what's there?
612
00:37:57 --> 00:38:01
It's now the
combinations of what?
613
00:38:01 --> 00:38:03
Why did that plane come in?
614
00:38:03 --> 00:38:05
That was the combinations
of two columns.
615
00:38:05 --> 00:38:06
But now I've got three.
616
00:38:06 --> 00:38:10
The combinations of three
columns, those three columns of
617
00:38:10 --> 00:38:14
an invertible matrix is what?
618
00:38:14 --> 00:38:17
Are you with me?
619
00:38:17 --> 00:38:20
If I have a three by three
invertible matrix, these three
620
00:38:20 --> 00:38:24
columns independent, pointing
off different directions, not
621
00:38:24 --> 00:38:31
in a plane, then when I take
the combinations I get?
622
00:38:31 --> 00:38:31
I get R^3.
623
00:38:32 --> 00:38:34
I get the whole space.
624
00:38:34 --> 00:38:35
I get everybody.
625
00:38:35 --> 00:38:39
Every vector including this b
and any other b you want to
626
00:38:39 --> 00:38:42
suggest will be a combination
of these three guys.
627
00:38:42 --> 00:38:44
So what's my picture here?
628
00:38:44 --> 00:38:49
My picture is that plane
grew to be the whole space.
629
00:38:49 --> 00:38:56
So what's the projection of b
onto the whole space? b itself.
630
00:38:56 --> 00:38:58
And what's the error?
631
00:38:58 --> 00:38:58
Zero.
632
00:38:58 --> 00:38:59
Good.
633
00:38:59 --> 00:39:01
So that's the nice case.
634
00:39:01 --> 00:39:04
That's the standard case that
we've thought about in the
635
00:39:04 --> 00:39:07
past when m equalled n.
636
00:39:07 --> 00:39:11
In that case P is the identity
and that'd be all true.
637
00:39:11 --> 00:39:14
But normally it's not.
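[The square case can be checked too. Adding the column of squares 0^2, 1^2, 3^2 makes A a 3-by-3 invertible matrix, and then the four A's collapse to the identity, just as the picture says:]

```python
import numpy as np

# Quadratic fit: columns of ones, times, and times squared at t = 0, 1, 3.
# A is now square (m = n = 3) and invertible.
A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 1.0],
              [1.0, 3.0, 9.0]])

P = A @ np.linalg.inv(A.T @ A) @ A.T
print(np.allclose(P, np.eye(3)))   # True: every b projects to itself, e = 0
```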
638
00:39:14 --> 00:39:19
So I want to come back to
this P just to mention an
639
00:39:19 --> 00:39:22
important fact about P.
640
00:39:22 --> 00:39:24
And it comes again
from the picture.
641
00:39:24 --> 00:39:26
So this is a projection.
642
00:39:26 --> 00:39:31
This is what I'm calling
the projection matrix.
643
00:39:31 --> 00:39:34
It's the matrix that
does the projection.
644
00:39:34 --> 00:39:36
And there it is.
645
00:39:36 --> 00:39:40
Four A's in a row
that multiplies b.
646
00:39:40 --> 00:39:42
Now here's my little question.
647
00:39:42 --> 00:39:46
So linear algebra's full
of these different
648
00:39:46 --> 00:39:49
kinds of matrices.
649
00:39:49 --> 00:39:54
Rotations, reflections,
symmetric matrices, Markov
650
00:39:54 --> 00:39:58
matrices, so it's just every
problem has matrices.
651
00:39:58 --> 00:40:01
Now here we have a
projection matrix.
652
00:40:01 --> 00:40:07
Now what I want to know is what
happens if I project again?
653
00:40:07 --> 00:40:10
If I take the vector b, any
vector b, I project it
654
00:40:10 --> 00:40:13
and then I project again.
655
00:40:13 --> 00:40:18
So project twice and just tell
me, you know what will happen.
656
00:40:18 --> 00:40:21
I'm back to this picture.
657
00:40:21 --> 00:40:26
I project b to P and
now I project again.
658
00:40:26 --> 00:40:28
Where do I go?
659
00:40:28 --> 00:40:31
Same place, right?
660
00:40:31 --> 00:40:34
Once I'm in the plane
the projection stays
661
00:40:34 --> 00:40:35
right where it is.
662
00:40:35 --> 00:40:36
So what does that tell me?
663
00:40:36 --> 00:40:44
That tells me that P squared
on b is the same as P on b.
664
00:40:44 --> 00:40:47
If I project twice, no change.
665
00:40:47 --> 00:40:50
It's the same as
projecting once.
666
00:40:50 --> 00:40:52
So the projection matrix
has the property
667
00:40:52 --> 00:40:55
that P squared is P.
668
00:40:55 --> 00:41:00
And actually, we should be able
to see it if I write out this
669
00:41:00 --> 00:41:01
whole miserable thing twice.
670
00:41:01 --> 00:41:04
So now I'm going to
be up to eight A's.
671
00:41:04 --> 00:41:09
Sorry about this, but I
promise not to do P cubed.
672
00:41:09 --> 00:41:12
A times A transpose
A inverse times A
673
00:41:12 --> 00:41:14
transpose, that's one P.
674
00:41:14 --> 00:41:20
I'll write it again.
675
00:41:20 --> 00:41:22
There's the second P.
676
00:41:22 --> 00:41:26
So that's P squared.
677
00:41:26 --> 00:41:27
Do you see anything good there?
678
00:41:27 --> 00:41:33
Do you see in here A transpose
A, that combination and
679
00:41:33 --> 00:41:35
that combination there.
680
00:41:35 --> 00:41:39
This cancels that to
give the identity.
681
00:41:39 --> 00:41:42
And what am I left with?
682
00:41:42 --> 00:41:46
I'm left with A times (A transpose
A) inverse times A transpose,
683
00:41:46 --> 00:41:49
which was exactly P.
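[That cancellation of the inner A transpose A is easy to confirm numerically. Same sketch matrix as before:]

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 3.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T

# Project twice: the inner (A^T A)^{-1} (A^T A) cancels, leaving P again.
print(np.allclose(P @ P, P))   # True
```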
684
00:41:49 --> 00:41:53
The algebra is just coming
along with the understanding
685
00:41:53 --> 00:41:54
that we know.
686
00:41:54 --> 00:41:58
So that's the
projection matrix.
687
00:41:58 --> 00:42:02
So this is the theory
of projections in a
688
00:42:02 --> 00:42:05
nutshell, in a nutshell.
689
00:42:05 --> 00:42:08
This is projections onto
the column space of A.
690
00:42:08 --> 00:42:14
Now I have to remind you
about one little math point.
691
00:42:14 --> 00:42:16
Not so little, I guess.
692
00:42:16 --> 00:42:20
How could I say
little for math?
693
00:42:20 --> 00:42:22
Is A transpose A invertible?
694
00:42:22 --> 00:42:26
We're plowing along as
if it is, that's going
695
00:42:26 --> 00:42:28
to be our assumption.
696
00:42:28 --> 00:42:33
But what's the condition for A
transpose A to be invertible,
697
00:42:33 --> 00:42:35
which allows all this to work?
698
00:42:35 --> 00:42:43
When is A transpose
A invertible?
699
00:42:43 --> 00:42:47
What I'm doing here now
is I'm separating the
700
00:42:47 --> 00:42:50
positive definite one.
701
00:42:50 --> 00:42:54
When A transpose A is positive
definite, the good normal case
702
00:42:54 --> 00:42:59
when all our equations work,
from the semi-definite one
703
00:42:59 --> 00:43:04
where we overlook the fact that
A transpose A, where somehow
704
00:43:04 --> 00:43:08
the experiment wasn't well set
up, we got an A transpose
705
00:43:08 --> 00:43:12
A that is singular.
706
00:43:12 --> 00:43:15
And just to see when
could that happen.
707
00:43:15 --> 00:43:20
Let me just remind you.
708
00:43:20 --> 00:43:21
This is important.
709
00:43:21 --> 00:43:26
Why don't I give it some space.
710
00:43:26 --> 00:43:28
It's really straightforward.
711
00:43:28 --> 00:43:34
Let me just go through
those steps again.
712
00:43:34 --> 00:43:42
If it's not invertible, if
some A transpose A u is zero.
713
00:43:42 --> 00:43:47
This is always the risk that we
have to check out and be sure
714
00:43:47 --> 00:43:49
we don't have and understand.
715
00:43:49 --> 00:43:54
So if A transpose Au is zero,
then that would lead us, I
716
00:43:54 --> 00:44:01
could multiply both sides by u
transpose. u transpose
717
00:44:01 --> 00:44:03
zero, right?
718
00:44:03 --> 00:44:04
Safe.
719
00:44:04 --> 00:44:07
Multiply whatever that u
might be, multiply both
720
00:44:07 --> 00:44:09
sides by u transpose.
721
00:44:09 --> 00:44:12
But what is u transpose zero?
722
00:44:12 --> 00:44:17
Zero, nothing there.
723
00:44:17 --> 00:44:20
Now how do I
understand this guy?
724
00:44:20 --> 00:44:22
Well you remember the key.
725
00:44:22 --> 00:44:24
Everybody remembers the key?
726
00:44:24 --> 00:44:27
You look at that thing and you
say hey, if I put in
727
00:44:27 --> 00:44:30
parentheses in the right place
that's the length
728
00:44:30 --> 00:44:34
of Au squared.
729
00:44:34 --> 00:44:38
So that's the small trick that
this multiplying by u transpose
730
00:44:38 --> 00:44:43
and then seeing what you've got
that we've done and
731
00:44:43 --> 00:44:44
you should know it.
732
00:44:44 --> 00:44:47
And now if the length
squared is zero, what does
733
00:44:47 --> 00:44:48
that tell me about Au?
734
00:44:48 --> 00:44:51
735
00:44:51 --> 00:44:54
If I have a vector here
whose length is zero,
736
00:44:54 --> 00:44:56
that vector must be?
737
00:44:56 --> 00:44:57
Zero.
738
00:44:57 --> 00:45:01
Zero vector's the only one
for which the sum of the
739
00:45:01 --> 00:45:05
squares will give zero.
740
00:45:05 --> 00:45:08
And if Au is zero I could
multiply both sides
741
00:45:08 --> 00:45:14
by A transpose and
complete the loop.
742
00:45:14 --> 00:45:16
Actually I thought of that
when I was swimming this
743
00:45:16 --> 00:45:21
morning, that line.
744
00:45:21 --> 00:45:27
Just to see once again when,
it's sort of interesting then.
745
00:45:27 --> 00:45:33
A transpose Au equal zero which
is the bad thing we hope
746
00:45:33 --> 00:45:34
we don't deal with.
747
00:45:34 --> 00:45:36
And when does it happen?
748
00:45:36 --> 00:45:41
It happens when A u is zero.
749
00:45:41 --> 00:45:46
So our assumption always has to
be this; that there aren't any
750
00:45:46 --> 00:45:50
u's, except the zero vector of
course, that's always going to
751
00:45:50 --> 00:45:59
happen, but we always have to
assume that Au is never zero.
752
00:45:59 --> 00:46:01
So we have to avoid this.
753
00:46:01 --> 00:46:12
So to avoid that assume A
has, this is the key word,
754
00:46:12 --> 00:46:18
independent columns.
755
00:46:18 --> 00:46:21
Since this is a combination
of the columns, independent
756
00:46:21 --> 00:46:23
columns means what?
757
00:46:23 --> 00:46:26
It means that the only
combination of the columns
758
00:46:26 --> 00:46:31
to give zero is the
zero combination.
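[A quick illustration of what goes wrong without independent columns. In this made-up example the second column is twice the first, so some nonzero u has Au = 0 and A transpose A is singular:]

```python
import numpy as np

# Dependent columns: the second column is 2 times the first.
A_bad = np.array([[1.0, 2.0],
                  [1.0, 2.0],
                  [1.0, 2.0]])

# Then A^T A is singular and the normal equations break down.
print(np.linalg.det(A_bad.T @ A_bad))   # essentially zero
```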
759
00:46:31 --> 00:46:34
So did I have independent
columns over here?
760
00:46:34 --> 00:46:35
I sure did.
761
00:46:35 --> 00:46:38
That column and that column
were off in different
762
00:46:38 --> 00:46:40
directions, they
were independent.
763
00:46:40 --> 00:46:43
And that's why I
knew we were fine.
764
00:46:43 --> 00:46:46
A transpose A was invertible.
765
00:46:46 --> 00:46:53
I would have to really struggle
to find a, well I'd have to
766
00:46:53 --> 00:46:58
think a bit to find an example
where we run into trouble.
767
00:46:58 --> 00:47:04
With least squares, well I certainly
could in many applications, but
768
00:47:04 --> 00:47:07
the straightforward
applications of fitting a
769
00:47:07 --> 00:47:12
straight line, A is going to be
a column vector of ones and a
770
00:47:12 --> 00:47:15
column vector of times and
those are different
771
00:47:15 --> 00:47:20
directions and no problem.
772
00:47:20 --> 00:47:23
So that's A transpose A.
773
00:47:23 --> 00:47:30
What else to do
with this topic?
774
00:47:30 --> 00:47:33
Because there's a whole
world of estimation.
775
00:47:33 --> 00:47:38
I mean, statistics is looking
over our shoulder I guess.
776
00:47:38 --> 00:47:42
Really, we should realize that
a statistician would say yeah, I know
777
00:47:42 --> 00:47:46
that, but-- and then go on.
778
00:47:46 --> 00:47:52
And what is that guy, what
more does he have to say?
779
00:47:52 --> 00:47:55
So you've got the
central ideas.
780
00:47:55 --> 00:48:01
I guess the statistician comes
in in this, that's the
781
00:48:01 --> 00:48:05
statistical constant now.
782
00:48:05 --> 00:48:08
And what do
statisticians compute?
783
00:48:08 --> 00:48:14
They say you've got
errors, right?
784
00:48:14 --> 00:48:17
And of course, in any
particular case we don't know
785
00:48:17 --> 00:48:21
what that error is, otherwise
we could take it out and
786
00:48:21 --> 00:48:23
we'd get exact solutions.
787
00:48:23 --> 00:48:25
We don't know what
the error is.
788
00:48:25 --> 00:48:32
What is reasonable to
know about errors?
789
00:48:32 --> 00:48:43
We're doing a little
statistics here.
790
00:48:43 --> 00:48:46
Somehow that error, that
particular error of the
791
00:48:46 --> 00:48:49
experiment that we happen to
run, and if we ran it again
792
00:48:49 --> 00:48:53
we'd get a different error,
those errors come out of some
793
00:48:53 --> 00:48:56
sort of error population.
794
00:48:56 --> 00:48:59
Like dark matter or something.
795
00:48:59 --> 00:49:03
Just like, a bunch of errors
are out there, noise.
796
00:49:03 --> 00:49:08
And what could we reasonably
assume that we know
797
00:49:08 --> 00:49:09
about the noise?
798
00:49:09 --> 00:49:14
We could assume that its
average is zero, mean zero.
799
00:49:14 --> 00:49:18
So statisticians always,
that just resets the meter.
800
00:49:18 --> 00:49:19
Right?
801
00:49:19 --> 00:49:23
If you had a meter or a clock
that was always three minutes
802
00:49:23 --> 00:49:27
ahead (like this one)
you would reset it.
803
00:49:27 --> 00:49:30
And we'll do that one day.
804
00:49:30 --> 00:49:34
So you'd reset to get
the average zero.
805
00:49:34 --> 00:49:36
But that doesn't mean every
error is zero, right?
806
00:49:36 --> 00:49:39
That just means the
average error is zero.
807
00:49:39 --> 00:49:42
So what's the other number?
808
00:49:42 --> 00:49:46
What's the other number that
statisticians live on?
809
00:49:46 --> 00:49:48
It's the deviation or
its square, which is
810
00:49:48 --> 00:49:51
called the variance.
811
00:49:51 --> 00:49:52
Right.
812
00:49:52 --> 00:49:53
Variance.
813
00:49:53 --> 00:49:58
So that's the thing that
you could assume that the
814
00:49:58 --> 00:50:01
errors have mean zero
and have some variance.
815
00:50:01 --> 00:50:05
You could suppose that you knew
something about the variance.
816
00:50:05 --> 00:50:08
You don't know the individual
errors, but you know whether
817
00:50:08 --> 00:50:15
the errors are like, are very
small or close to
818
00:50:15 --> 00:50:20
zero or large.
819
00:50:20 --> 00:50:22
So this is a small variance.
820
00:50:22 --> 00:50:26
So one over sigma is
sort of that distance.
821
00:50:26 --> 00:50:27
One over sigma.
822
00:50:27 --> 00:50:33
Here, this is a large variance
where the magnitude of the
823
00:50:33 --> 00:50:37
error could be much
larger than this.
824
00:50:37 --> 00:50:40
So those are the two numbers,
mean zero, that leaves us just
825
00:50:40 --> 00:50:45
one number, and the variance,
the standard deviation sigma or
826
00:50:45 --> 00:50:48
the variance sigma squared.
827
00:50:48 --> 00:50:55
One moment on these squares.
828
00:50:55 --> 00:50:58
Let me just say what the
weighting matrix would be.
829
00:50:58 --> 00:51:02
And then I can tell
you in a moment why.
830
00:51:02 --> 00:51:09
What would the weighting matrix
be if our three equations,
831
00:51:09 --> 00:51:12
you know, that came from one
measurement and this came from
832
00:51:12 --> 00:51:14
a second measurement and this
came from a third measurement.
833
00:51:14 --> 00:51:19
If they came from different
meter readers with different
834
00:51:19 --> 00:51:25
variances, suppose, then the
right C matrix will be a
835
00:51:25 --> 00:51:28
diagonal matrix, beautiful.
836
00:51:28 --> 00:51:34
And what sits up there, what
sits there, what sits there?
837
00:51:34 --> 00:51:36
We don't have spring
constants anymore.
838
00:51:36 --> 00:51:39
We have statistics constants.
839
00:51:39 --> 00:51:43
And what's the number
that goes there?
840
00:51:43 --> 00:51:45
That one is the third guy.
841
00:51:45 --> 00:51:49
So it's associated with
the third measurement.
842
00:51:49 --> 00:51:55
It's one over sigma
three squared.
843
00:51:55 --> 00:51:57
Those are the numbers that
go on the diagonal, the
844
00:51:57 --> 00:52:00
inverses of the variances.
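[The weighted normal equations A^T C A u_hat = A^T C b can be sketched the same way. The standard deviations here are made up; the second meter is the reliable one (small sigma), so it gets the large weight:]

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])        # hypothetical measurements

sigma = np.array([1.0, 0.5, 2.0])    # hypothetical per-meter deviations
C = np.diag(1.0 / sigma**2)          # weighting matrix: inverses of variances

# Weighted normal equations: A^T C A u_hat = A^T C b
u_hat = np.linalg.solve(A.T @ C @ A, A.T @ C @ b)
print(u_hat)
```

[With C in place, the error e = b - A u_hat is perpendicular to the columns in the weighted sense: A^T C e = 0.]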
845
00:52:00 --> 00:52:03
And just to see that
that makes sense.
846
00:52:03 --> 00:52:11
If that number is unreliable,
if it has a large variance
847
00:52:11 --> 00:52:18
then I want to give it
little weight, right?
848
00:52:18 --> 00:52:22
If this third meter is very
unreliable I'm not going to
849
00:52:22 --> 00:52:25
throw it out entirely, but I
know that its variance is
850
00:52:25 --> 00:52:31
large and therefore I'll weight
that equation only a little,
851
00:52:31 --> 00:52:33
with a small weight.
852
00:52:33 --> 00:52:37
Suppose sigma two, so this
guy is one over sigma two
853
00:52:37 --> 00:52:44
squared, suppose this is an
extremely reliable meter.
854
00:52:44 --> 00:52:48
That measurement has
little expected error.
855
00:52:48 --> 00:52:49
Then I want to
weight it heavily.
856
00:52:49 --> 00:52:56
So it has a small sigma two and
that gives it a large weight.
857
00:52:56 --> 00:52:58
And sigma one similarly.
858
00:52:58 --> 00:53:05
So that's the weighting for the
case that you can actually
859
00:53:05 --> 00:53:08
hope to use in practice.
860
00:53:08 --> 00:53:11
I'll just mention that
statisticians would also
861
00:53:11 --> 00:53:13
say, wait a minute.
862
00:53:13 --> 00:53:17
Measurement two and measurement
three might be interconnected.
863
00:53:17 --> 00:53:19
They might not be independent.
864
00:53:19 --> 00:53:21
There might be a covariance.
865
00:53:21 --> 00:53:25
And then that gets them
into more great linear
866
00:53:25 --> 00:53:26
algebra actually.
867
00:53:26 --> 00:53:32
But if I want a diagonal matrix
C that's the case when my
868
00:53:32 --> 00:53:34
measurements are independent.
869
00:53:34 --> 00:53:41
And basically, I'm
whitening the system.
870
00:53:41 --> 00:53:44
I'm making the system
white, making it all equal
871
00:53:44 --> 00:53:46
variances by rescaling.
872
00:53:46 --> 00:53:48
By weighting the equations.
873
00:53:48 --> 00:53:51
Ok thanks.
874
00:53:51 --> 00:53:55
Wednesday is the next
big example of the
875
00:53:55 --> 00:53:57
framework with b and f.
876
00:53:57 --> 00:53:59
See you then.