WEBVTT
00:00:00.499 --> 00:00:01.950
The following
content is provided
00:00:01.950 --> 00:00:04.900
by MIT OpenCourseWare under
a Creative Commons License.
00:00:04.900 --> 00:00:08.230
Additional information
about our license,
00:00:08.230 --> 00:00:10.560
and MIT OpenCourseWare
in general,
00:00:10.560 --> 00:00:11.780
is available at ocw.mit.edu.
00:00:16.570 --> 00:00:17.280
PROFESSOR: OK.
00:00:17.280 --> 00:00:21.230
Now, where am I
with this problem?
00:00:21.230 --> 00:00:28.697
Well, last time I spoke about
what the situation's like
00:00:28.697 --> 00:00:29.780
as alpha goes to infinity.
00:00:29.780 --> 00:00:37.530
And I want to say also a word
about -- more than a word --
00:00:37.530 --> 00:00:39.000
about alpha going to 0.
00:00:41.960 --> 00:00:48.900
And then, the real problems
come when alpha is in between.
00:00:48.900 --> 00:00:51.930
The real problem
-- the situations,
00:00:51.930 --> 00:00:58.570
these ill-posed problems that
come from inverse problems,
00:00:58.570 --> 00:01:01.730
trying to find out what's
inside your brain by taking
00:01:01.730 --> 00:01:03.520
measurements at the skull.
00:01:03.520 --> 00:01:12.350
All sorts of applications
involve a finite alpha.
00:01:12.350 --> 00:01:18.310
And I'm not quite ready
to discuss those topics.
00:01:18.310 --> 00:01:23.020
I mean, roughly speaking --
00:01:23.020 --> 00:01:26.240
I'll write down a reminder now.
00:01:26.240 --> 00:01:29.290
What happened when
alpha went to infinity?
00:01:29.290 --> 00:01:33.460
When alpha went to
infinity, this part
00:01:33.460 --> 00:01:35.070
became the important part.
00:01:35.070 --> 00:01:42.230
So as alpha went to infinity,
the limit was u_infinity,
00:01:42.230 --> 00:01:43.710
shall I call it? u_infinity.
00:01:47.210 --> 00:01:50.860
Well, so u_infinity was
a minimizer of this term,
00:01:50.860 --> 00:02:02.100
u_infinity minimized
B*u minus d squared.
00:02:02.100 --> 00:02:10.490
In fact, in my last lecture,
I was taking B*u equal d
00:02:10.490 --> 00:02:13.390
as an equation that
had exact solutions,
00:02:13.390 --> 00:02:17.060
and saying how did we
actually solve B*u equal d.
00:02:17.060 --> 00:02:21.450
So u_infinity
minimizes B*u minus d.
00:02:21.450 --> 00:02:24.080
But that might
leave some freedom.
00:02:24.080 --> 00:02:26.760
If B doesn't have
that many rows,
00:02:26.760 --> 00:02:30.350
if its rank is not
that big, then this
00:02:30.350 --> 00:02:32.510
doesn't finish the job.
00:02:32.510 --> 00:02:39.610
So among these, if
there are many --
00:02:39.610 --> 00:02:43.760
and that's what we're
interested in --
00:02:43.760 --> 00:02:51.810
u hat infinity, that limit,
will minimize the other bit,
00:02:51.810 --> 00:02:55.540
A*u minus b squared.
00:02:55.540 --> 00:02:58.110
Does that make sense somehow?
00:02:58.110 --> 00:03:01.480
This is in the problem
here, for any finite alpha.
00:03:01.480 --> 00:03:03.600
As alpha gets bigger
and bigger, we
00:03:03.600 --> 00:03:05.920
push harder and
harder on this one,
00:03:05.920 --> 00:03:10.440
so we get a u that's
a winner for this one,
00:03:10.440 --> 00:03:14.720
but the trace of this
first part is still around
00:03:14.720 --> 00:03:19.770
and if there are many winners,
then having this first part
00:03:19.770 --> 00:03:24.970
in there will give us, among
the winners, the one that
00:03:24.970 --> 00:03:29.110
does the best on that term.
00:03:29.110 --> 00:03:33.620
And small alpha,
going to 0, will just
00:03:33.620 --> 00:03:35.540
be the opposite right?
00:03:35.540 --> 00:03:37.880
This finally struck
me over the weekend,
00:03:37.880 --> 00:03:40.300
you know, like I could
divide this quantity,
00:03:40.300 --> 00:03:44.380
this whole expression by
alpha, so then I have a 1
00:03:44.380 --> 00:03:48.440
there, and a 1 over alpha
here, and now as alpha
00:03:48.440 --> 00:03:51.710
goes 0 this is the big term.
00:03:51.710 --> 00:03:56.490
So now u -- shall
I call this u_0?
00:03:56.490 --> 00:03:58.700
Brilliant notation, right?
00:03:58.700 --> 00:04:02.640
So this produces a u_alpha.
00:04:06.190 --> 00:04:10.200
In that limit it converges to
a u_infinity that focuses first
00:04:10.200 --> 00:04:13.200
on this problem, but
in the other limit,
00:04:13.200 --> 00:04:15.970
when alpha's going to 0, it's
this term that's biggest.
00:04:15.970 --> 00:04:25.530
So u_0 minimizes A*u minus b
squared and if there are many
00:04:25.530 --> 00:04:31.265
minimizers, among these --
well, you know what I'm going
00:04:31.265 --> 00:04:35.590
to write. u_0, see I
put a little hat there.
00:04:35.590 --> 00:04:36.330
Did I?
00:04:36.330 --> 00:04:38.560
I don't know, I haven't
stayed with these hats
00:04:38.560 --> 00:04:43.060
very much but maybe
I'll add them.
00:04:43.060 --> 00:04:48.870
u hat minimizes the term
that's not so important,
00:04:48.870 --> 00:04:51.290
B*u minus d squared.
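As a hedged aside (my own NumPy illustration, not part of the lecture; the course itself uses MATLAB), the two limits just described can be checked numerically on the small example that appears later in this lecture: A equal the identity, b equal 0, and the single constraint u_1 minus u_2 equals 6.

```python
import numpy as np

# Penalized least squares: minimize ||A u - b||^2 + alpha ||B u - d||^2.
# Illustrative data (the example used later in the lecture): A = I, b = 0,
# and one constraint u_1 - u_2 = 6.
A = np.eye(2)
b = np.zeros(2)
B = np.array([[1.0, -1.0]])
d = np.array([6.0])

def u_alpha(alpha):
    # Normal equations: (A^T A + alpha B^T B) u = A^T b + alpha B^T d.
    lhs = A.T @ A + alpha * (B.T @ B)
    rhs = A.T @ b + alpha * (B.T @ d)
    return np.linalg.solve(lhs, rhs)

u_big = u_alpha(1e8)     # alpha -> infinity: enforces B u = d, picks small ||u||
u_small = u_alpha(1e-8)  # alpha -> 0: minimizes ||u||^2 alone, so u -> 0
```

Here u_big comes out close to [3, -3], the point on the line u_1 minus u_2 equals 6 nearest the origin, matching the u_infinity discussion above.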
00:04:51.290 --> 00:05:00.780
OK, so today's lecture is still
about these limiting cases.
00:05:00.780 --> 00:05:09.210
As I said, the scientific
problems, ill-posed problems,
00:05:09.210 --> 00:05:12.470
especially these
inverse problems,
00:05:12.470 --> 00:05:16.180
give situations in which
these limiting problems are
00:05:16.180 --> 00:05:21.170
really bad, and you don't get
to the limit, you don't want to.
00:05:21.170 --> 00:05:27.040
The whole point is to
have a finite alpha.
00:05:27.040 --> 00:05:33.020
But choosing that alpha
correctly is the art --
00:05:33.020 --> 00:05:36.460
let me just say
why -- so I almost,
00:05:36.460 --> 00:05:41.030
I'm sort of anticipating what
I'm not ready to do properly.
00:05:41.030 --> 00:05:49.010
So, I'll say why a finite
alpha, on Wednesday.
00:05:49.010 --> 00:05:51.590
Why?
00:05:51.590 --> 00:05:54.440
Because noisy data.
00:06:04.500 --> 00:06:14.360
Because of the noise, at
best u is only determined,
00:06:14.360 --> 00:06:20.100
up to some order,
00:06:20.100 --> 00:06:23.110
say, order of some
small quantity
00:06:23.110 --> 00:06:25.390
delta that measures the noise.
00:06:25.390 --> 00:06:27.250
This is like a
measure of the noise.
00:06:32.750 --> 00:06:35.920
Then there's no reason to
do what we did last time,
00:06:35.920 --> 00:06:39.450
like forcing B*u equal d.
00:06:39.450 --> 00:06:43.970
There's no point in forcing
B*u equal d. If the d in that
00:06:43.970 --> 00:06:51.980
equation has noise in it,
then pushing it all the way
00:06:51.980 --> 00:06:59.570
to the limit is unreasonable,
and may produce a very,
00:06:59.570 --> 00:07:03.960
you know, a
catastrophic illness.
00:07:03.960 --> 00:07:08.350
So that's when -- so it's
really the presence of noise,
00:07:08.350 --> 00:07:15.180
the presence of uncertainty in
the first place that says OK,
00:07:15.180 --> 00:07:18.710
a finite alpha is fine, you're
not looking for perfection,
00:07:18.710 --> 00:07:21.110
what you're looking
for is some stability,
00:07:21.110 --> 00:07:25.460
some control on the stability.
00:07:25.460 --> 00:07:27.040
OK, right.
00:07:27.040 --> 00:07:30.670
But now -- so that's Wednesday.
00:07:34.300 --> 00:07:43.310
Today, let me go -- I didn't
give an example, so today,
00:07:43.310 --> 00:07:51.990
two topics, one is an
example with B*u equal d.
00:07:51.990 --> 00:07:54.260
That was last
lecture's, and that's
00:07:54.260 --> 00:07:58.450
the case when alpha
goes to infinity,
00:07:58.450 --> 00:08:03.010
then secondly is something
called a pseudoinverse.
00:08:03.010 --> 00:08:06.440
You may have seen that
expression, the pseudoinverse
00:08:06.440 --> 00:08:11.160
of A, and sometimes it's
written A with a dagger or A
00:08:11.160 --> 00:08:13.390
with a plus sign.
00:08:13.390 --> 00:08:15.460
And that is worth knowing about.
00:08:15.460 --> 00:08:18.140
So this is a topic
in linear algebra.
00:08:18.140 --> 00:08:19.910
It would be in my
linear algebra book,
00:08:19.910 --> 00:08:23.970
but it's a topic that never
gets into the 18.06 course because
00:08:23.970 --> 00:08:26.490
it's sort of a little late.
00:08:26.490 --> 00:08:30.620
And that will appear
as alpha goes to 0.
00:08:30.620 --> 00:08:31.120
Right.
00:08:31.120 --> 00:08:34.860
So that's what
today is about, it's
00:08:34.860 --> 00:08:41.790
linear algebra, because I'm
not ready for the noise yet.
00:08:41.790 --> 00:08:48.310
But it's the noisy data
that we have in reality
00:08:48.310 --> 00:08:55.340
and that's why, in reality,
alpha will be chosen finite.
00:08:55.340 --> 00:08:55.950
OK.
00:08:55.950 --> 00:09:02.760
So part one, then, is to do a
very simple example with B*u
00:09:02.760 --> 00:09:03.900
equal d.
00:09:03.900 --> 00:09:06.060
And here is the example.
00:09:06.060 --> 00:09:07.160
OK.
00:09:07.160 --> 00:09:11.859
So this is my sum of
squares in which I plan
00:09:11.859 --> 00:09:13.150
to let alpha go to infinity.
00:09:18.130 --> 00:09:21.870
So A is the identity
matrix and b is 0.
00:09:21.870 --> 00:09:26.120
So that quantity is simple.
00:09:26.120 --> 00:09:36.580
Here, I have just one equation,
so p is 1; p by n is 1 by 2.
00:09:36.580 --> 00:09:41.940
I've just one equation u_1 minus
u_2 equals 6, and in the limit,
00:09:41.940 --> 00:09:43.930
as alpha goes to
infinity, I expect
00:09:43.930 --> 00:09:46.220
to see that that
equation is enforced.
00:09:49.930 --> 00:09:54.790
So there are two ways to do it:
we can let alpha go to infinity
00:09:54.790 --> 00:10:02.820
and look at u_alpha
going toward u_infinity,
00:10:02.820 --> 00:10:05.690
maybe with their little hats.
00:10:05.690 --> 00:10:12.670
Or the second method which is
the null space method, which is
00:10:12.670 --> 00:10:15.500
what I spoke about last time.
00:10:15.500 --> 00:10:24.290
The null space method solves the
constraint B*u equal d, which is
00:10:24.290 --> 00:10:26.540
just u_1 minus u_2 equal 6.
00:10:26.540 --> 00:10:27.040
OK.
00:10:27.040 --> 00:10:33.390
And that's -- maybe I'll
start with that one.
00:10:33.390 --> 00:10:35.313
Which looks so simple,
of course, just
00:10:35.313 --> 00:10:37.620
to solve u_1 minus u_2 equal 6.
00:10:37.620 --> 00:10:41.980
I mean, everybody
would say, OK, solve it
00:10:41.980 --> 00:10:45.290
for u_2 equals u_1 minus 6.
00:10:49.580 --> 00:10:54.650
So here is the method any
sensible person would use.
00:10:54.650 --> 00:10:56.210
But this course doesn't.
00:10:56.210 --> 00:11:03.630
OK, the sensible method
would be u_2 is u_1 minus 6;
00:11:03.630 --> 00:11:12.440
plug that into the
squares and minimize.
00:11:12.440 --> 00:11:18.090
So when I plug this in,
of course, this is exact,
00:11:18.090 --> 00:11:23.470
and this becomes u
-- so I'm minimizing,
00:11:23.470 --> 00:11:28.610
minimizing u_1 squared
plus, what was it?
00:11:28.610 --> 00:11:30.190
u_1 minus 6 squared.
00:11:34.930 --> 00:11:39.410
So that's reduced the
problem to one unknown,
00:11:39.410 --> 00:11:41.250
this is the null space method.
00:11:41.250 --> 00:11:45.040
The null space method
is to solve the equation
00:11:45.040 --> 00:11:48.260
and remove unknowns.
00:11:48.260 --> 00:11:51.970
Remove p unknowns coming
from the p constraints,
00:11:51.970 --> 00:11:54.150
and here p is 1.
00:11:54.150 --> 00:11:54.660
OK.
00:11:54.660 --> 00:11:58.170
And by the way, can we
just guess, or not guess,
00:11:58.170 --> 00:12:05.770
but pretty well be sure,
what's the minimizer here?
00:12:05.770 --> 00:12:09.120
Anybody just tell me what
u_1 would minimize that?
00:12:09.120 --> 00:12:10.570
Just make a guess, maybe?
00:12:13.090 --> 00:12:19.520
I'm looking for a number sort
of halfway between 0 and 6
00:12:19.520 --> 00:12:21.110
somehow.
00:12:21.110 --> 00:12:27.630
You won't be surprised
that u_1 is 3.
00:12:27.630 --> 00:12:32.450
And then, from this equation, I
should learn that u_2 is minus
00:12:32.450 --> 00:12:34.700
3 -- u_2, no, u_(2, infinity).
00:12:37.740 --> 00:12:42.130
Now I've got too many --
u_(2, infinity) is minus 3.
00:12:42.130 --> 00:12:47.430
Anyway, simple calculus, if you
just set the derivative to 0,
00:12:47.430 --> 00:12:50.430
you'll get 3 and then
you get minus 3 for u_2.
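That substitution can be sketched in a few lines; this is my own Python illustration of the reduced problem, not code from the lecture.

```python
# Null space method by substitution: enforce u_1 - u_2 = 6 exactly via
# u_2 = u_1 - 6, then minimize the reduced objective u_1^2 + (u_1 - 6)^2.
# Setting the derivative 2 u_1 + 2 (u_1 - 6) to 0 gives u_1 = 3.
u1 = 3.0
u2 = u1 - 6.0  # back-substitute the constraint: u_2 = -3

# Check: u_1 = 3 beats nearby values of the reduced objective.
f = lambda t: t ** 2 + (t - 6.0) ** 2
assert f(u1) <= min(f(u1 - 0.1), f(u1 + 0.1))
```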
00:12:50.430 --> 00:12:53.650
So that's the null space
method, except that I
00:12:53.650 --> 00:13:02.180
didn't follow my complicated
QR orthogonalization.
00:13:02.180 --> 00:13:05.220
And I just want
to do that quickly
00:13:05.220 --> 00:13:07.320
to reach the same answer.
00:13:10.270 --> 00:13:15.250
And to say, why don't
I just do this anyway?
00:13:15.250 --> 00:13:18.860
This is what -- this
would be the row --
00:13:18.860 --> 00:13:23.030
this would be the standard
method in the first month
00:13:23.030 --> 00:13:27.180
of linear algebra would be to
use the row reduced echelon
00:13:27.180 --> 00:13:30.300
form, which of course
is going to be really,
00:13:30.300 --> 00:13:32.520
really simple for
this matrix; in fact,
00:13:32.520 --> 00:13:37.440
that's already in row reduced
echelon form -- elimination,
00:13:37.440 --> 00:13:40.930
row reduction has nothing
to do to improve that --
00:13:40.930 --> 00:13:44.280
and then solve and then
plug in and then go with it.
00:13:44.280 --> 00:13:49.290
OK, well the thing
is that that row
00:13:49.290 --> 00:13:53.960
reduced echelon form, the
stuff you teach, is not,
00:13:53.960 --> 00:13:59.730
for large systems,
guaranteed stable.
00:13:59.730 --> 00:14:01.640
It's not numerically stable.
00:14:01.640 --> 00:14:07.200
And the option of using--
of orthogonalizing
00:14:07.200 --> 00:14:11.200
is the right one to
know for a large system.
00:14:11.200 --> 00:14:16.010
So you'll have to allow me,
on this really small example,
00:14:16.010 --> 00:14:19.820
to use a method that
I described last time.
00:14:19.820 --> 00:14:24.970
And I just want to recap with
an example on the small system.
00:14:24.970 --> 00:14:27.780
OK, so what was that method?
00:14:27.780 --> 00:14:30.950
So this is the null space
method using qr now.
00:14:34.920 --> 00:14:39.710
The MATLAB command qr, so
what did we -- qr of B prime.
00:14:39.710 --> 00:14:43.130
Do you remember
that we took that --
00:14:43.130 --> 00:14:47.140
that's the MATLAB command
that eventually will,
00:14:47.140 --> 00:14:50.750
or is actually already in
the notes for this section,
00:14:50.750 --> 00:14:53.460
and those notes
will get updated --
00:14:53.460 --> 00:14:57.540
but that's step one in
the null space method,
00:14:57.540 --> 00:14:58.940
qr B prime.
00:14:58.940 --> 00:15:00.920
And this gives me a
chance to say what's
00:15:00.920 --> 00:15:04.790
up with this qr algorithm.
00:15:04.790 --> 00:15:11.150
I mean after lu, qr is the most
important algorithm in MATLAB.
00:15:11.150 --> 00:15:13.340
And so what does it do?
00:15:13.340 --> 00:15:20.460
B prime, the transpose of B,
is just 1, minus 1, right?
00:15:20.460 --> 00:15:23.200
OK.
00:15:23.200 --> 00:15:30.800
Now what does Gram-Schmidt
do to that matrix?
00:15:34.240 --> 00:15:37.250
Well, the idea of
Gram-Schmidt is
00:15:37.250 --> 00:15:41.370
to produce orthonormal columns.
00:15:41.370 --> 00:15:45.220
So the most basic Gram-Schmidt
idea would say, so what would
00:15:45.220 --> 00:15:46.390
Gram and Schmidt say?
00:15:46.390 --> 00:15:49.020
They'd say, well, we
only have one column,
00:15:49.020 --> 00:15:52.340
and all we would have
to do is normalize it
00:15:52.340 --> 00:16:01.940
So Gram-Schmidt would produce
the normalized thing --
00:16:01.940 --> 00:16:05.830
times square root of 2.
00:16:05.830 --> 00:16:09.810
That would be the q, and
this would be the r, 1 by 1,
00:16:09.810 --> 00:16:12.650
in Gram-Schmidt.
00:16:12.650 --> 00:16:20.540
OK, but here's the point, that
the qr algorithm in MATLAB,
00:16:20.540 --> 00:16:23.560
which no longer uses
the Gram-Schmidt idea,
00:16:23.560 --> 00:16:31.000
instead uses a Householder idea,
and one nice thing about this
00:16:31.000 --> 00:16:39.950
is that it produces not just
this column, but another one,
00:16:39.950 --> 00:16:47.520
it produces a column for the
-- it completes the basis
00:16:47.520 --> 00:16:50.220
to a full orthonormal basis.
00:16:50.220 --> 00:16:54.020
So it finds a second vector.
00:16:54.020 --> 00:16:56.770
So ordinary
Gram-Schmidt just had
00:16:56.770 --> 00:16:59.270
one column times one number.
00:16:59.270 --> 00:17:05.480
What qr actually does is it
ends up with two columns.
00:17:05.480 --> 00:17:12.410
And well, everybody can see
what's the other column --
00:17:12.410 --> 00:17:16.610
that has length 1, of course,
and is orthogonal to the first
00:17:16.610 --> 00:17:17.500
column.
00:17:17.500 --> 00:17:21.770
And now, that is
multiplied by 0.
00:17:27.310 --> 00:17:30.370
So this is what qr does.
00:17:30.370 --> 00:17:35.160
We have this 2 by 1 matrix,
it produces a 2 by 2 times a 2
00:17:35.160 --> 00:17:35.690
by 1.
00:17:38.820 --> 00:17:43.000
And you might say, it
was wasting its time,
00:17:43.000 --> 00:17:48.810
to find this part, because
it's multiplied by 0,
00:17:48.810 --> 00:17:54.910
but what are we learning
from the vector?
00:17:54.910 --> 00:17:58.410
From this [1, 1] vector or
1 over square root of 2,
00:17:58.410 --> 00:18:00.190
1 over square root of 2 vector?
00:18:00.190 --> 00:18:03.880
What good can that do us?
00:18:03.880 --> 00:18:11.490
It's the null space of
B. So B was 1, minus 1,
00:18:11.490 --> 00:18:16.340
So let me just -- so that's
the connection with null space
00:18:16.340 --> 00:18:23.120
of B. If I look at vectors
-- there's my matrix B,
00:18:23.120 --> 00:18:29.330
and if I'm solving B*u equal
d, if I'm solving B*u equal d,
00:18:29.330 --> 00:18:37.200
then u is u_particular
and u null space,
00:18:37.200 --> 00:18:42.960
and if I want u null space, then
that's where this -- and these,
00:18:42.960 --> 00:18:46.950
whatever extra columns, this
might be p columns and then
00:18:46.950 --> 00:18:50.640
this would be n minus p columns,
that's what that's good for.
00:18:50.640 --> 00:18:53.180
And of course that
column tells me
00:18:53.180 --> 00:18:56.380
about the null space
which, for this matrix,
00:18:56.380 --> 00:19:05.400
is one-dimensional
and easy to find, OK.
00:19:05.400 --> 00:19:13.210
So that may be just to, so you
know the difference between
00:19:13.210 --> 00:19:16.650
Gram-Schmidt's qr
which stops with --
00:19:16.650 --> 00:19:19.450
if you had one column
you end with one column,
00:19:19.450 --> 00:19:24.500
and the MATLAB Householder
qr which finds a full square
00:19:24.500 --> 00:19:25.170
matrix.
00:19:25.170 --> 00:19:29.410
OK, just good to know and
here we've found a use for it.
00:19:29.410 --> 00:19:29.910
OK.
00:19:29.910 --> 00:19:36.010
So then, the algorithm
that I gave last time --
00:19:36.010 --> 00:19:39.510
and I'll give the
code in the notes --
00:19:39.510 --> 00:19:45.740
goes through the steps of
finding a u_particular,
00:19:45.740 --> 00:19:51.600
and actually, the u_particular
that it would find happens
00:19:51.600 --> 00:19:59.540
to be -- [3, minus 3] happens
to be the actual winner.
00:19:59.540 --> 00:20:06.085
And therefore, the u null space
that that algorithm would find
00:20:06.085 --> 00:20:08.950
-- if I went through
all the steps,
00:20:08.950 --> 00:20:15.020
you would see that because I'm
in this special case of b being
00:20:15.020 --> 00:20:16.360
0 and so on,
00:20:16.360 --> 00:20:19.110
that the vector that
it would choose --
00:20:19.110 --> 00:20:21.440
this is the basis
for the null space,
00:20:21.440 --> 00:20:28.620
but it would choose 0 of that
basis vector and would come up
00:20:28.620 --> 00:20:30.500
with that answer.
00:20:30.500 --> 00:20:34.240
OK so that's what the
algorithm from last time
00:20:34.240 --> 00:20:39.010
would have done to this problem.
00:20:39.010 --> 00:20:46.120
I also, over the weekend,
thought OK, if it's all true,
00:20:46.120 --> 00:20:51.460
I should be able to
use my first method.
00:20:51.460 --> 00:20:55.190
The large alpha method
and just find the answer
00:20:55.190 --> 00:20:58.880
to the original problem and
let alpha go to infinity.
00:20:58.880 --> 00:21:01.260
Are you willing to do that?
00:21:01.260 --> 00:21:04.010
That might take a
little more calculation,
00:21:04.010 --> 00:21:06.400
but let me try that.
00:21:06.400 --> 00:21:10.430
I'm hoping, you know, that
it approaches this answer.
00:21:10.430 --> 00:21:13.810
This is the answer
I'm looking for.
00:21:13.810 --> 00:21:22.360
OK so, in your mind just
-- suppose I had to do that
00:21:22.360 --> 00:21:24.650
minimization.
00:21:24.650 --> 00:21:27.230
Again, now I'm not using
the null space method,
00:21:27.230 --> 00:21:30.610
so I'm not reducing, I'm not
getting u_2 out of the problem.
00:21:30.610 --> 00:21:38.730
I'm doing the minimum as it
stands, and so what do I get?
00:21:38.730 --> 00:21:41.750
Well, I've got two
variables u_1 and u_2.
00:21:41.750 --> 00:21:45.210
So I take the derivatives
with respect to u_1 --
00:21:45.210 --> 00:21:47.520
I'm minimizing --
everybody, when I point,
00:21:47.520 --> 00:21:50.360
I'm pointing at that top line.
00:21:50.360 --> 00:21:56.030
So it's 2*u_1, and
what do I have here?
00:21:56.030 --> 00:22:03.990
2*alpha times (u_1 minus u_2
minus 6), equaling 0.
00:22:03.990 --> 00:22:08.350
Is that -- did I take the
u_1 derivative correctly?
00:22:08.350 --> 00:22:12.590
Now if I take the u_2
derivative I get two u_2's.
00:22:12.590 --> 00:22:15.800
And now, the chain rule is
going to give me a minus sign,
00:22:15.800 --> 00:22:22.850
so it would be minus 2*alpha
times (u_1 minus u_2 minus 6) equals 0.
00:22:22.850 --> 00:22:26.770
So those two equations
will determine u_1 and u_2
00:22:26.770 --> 00:22:29.370
for a finite alpha.
00:22:29.370 --> 00:22:33.360
And then I'll let alpha head to
infinity and see what happens.
00:22:33.360 --> 00:22:37.730
OK, first I'll
multiply by a half
00:22:37.730 --> 00:22:42.470
and get rid of those
useless 2's, and then
00:22:42.470 --> 00:22:45.360
solve this equation.
00:22:45.360 --> 00:22:48.080
OK, so what do I have here?
00:22:48.080 --> 00:22:54.160
I've got a matrix -- u_1 is
multiplying 1 plus alpha,
00:22:54.160 --> 00:22:57.250
u_2 has a minus alpha.
00:22:57.250 --> 00:23:06.340
In this line, u_1 has a minus
alpha, u_2 has a 1 minus,
00:23:06.340 --> 00:23:09.470
minus, plus alpha, am I right?
00:23:09.470 --> 00:23:14.950
Times [u_1, u_2] equals --
what's my right-hand side?
00:23:14.950 --> 00:23:17.940
I guess the right-hand
side has alphas in it.
00:23:17.940 --> 00:23:23.300
6*alpha and minus
6*alpha, I think.
00:23:23.300 --> 00:23:23.810
OK.
00:23:27.010 --> 00:23:29.510
Two equations, two unknowns.
00:23:29.510 --> 00:23:34.190
These are the normal equations
for this problem, written out
00:23:34.190 --> 00:23:35.610
explicitly.
00:23:35.610 --> 00:23:39.630
And probably I can
find the solution
00:23:39.630 --> 00:23:41.750
and let alpha go to infinity.
00:23:41.750 --> 00:23:44.680
You could say,
what are you doing,
00:23:44.680 --> 00:23:47.870
Professor Strang, this
elementary calculation?
00:23:47.870 --> 00:23:50.810
But there is something sort
of satisfying about seeing
00:23:50.810 --> 00:23:53.030
a small example actually work.
00:23:53.030 --> 00:23:54.550
At least to me.
00:23:54.550 --> 00:23:58.850
OK, so how do I solve
those equations?
00:23:58.850 --> 00:24:00.050
Well, good question.
00:24:03.430 --> 00:24:05.610
Should I -- with
a 2 by 2 matrix,
00:24:05.610 --> 00:24:09.560
can I do the unforgivable and
actually find its inverse?
00:24:09.560 --> 00:24:14.260
I mean, it's like not allowed
in true linear algebra
00:24:14.260 --> 00:24:18.750
to find the inverse, but
maybe we could do it here.
00:24:18.750 --> 00:24:25.410
So [u_1, u_2] is going to be
the inverse matrix, which is --
00:24:25.410 --> 00:24:31.120
so my little recipe for finding
inverses is take the entries,
00:24:31.120 --> 00:24:38.790
this entry goes up here,
that entry goes down there --
00:24:38.790 --> 00:24:41.150
well, you couldn't
see the difference --
00:24:41.150 --> 00:24:46.640
this entry stays in
place, those change sign,
00:24:46.640 --> 00:24:49.560
and then I have to divide
by the determinant.
00:24:49.560 --> 00:24:51.680
So what was the
determinant of this?
00:24:51.680 --> 00:24:55.190
1 plus 2*alpha plus alpha
squared minus alpha squared,
00:24:55.190 --> 00:24:57.290
I get 1 plus 2*alpha.
00:24:57.290 --> 00:25:02.830
And that's the inverse matrix
now that's multiplying 6*alpha
00:25:02.830 --> 00:25:06.650
and minus 6*alpha, OK.
00:25:06.650 --> 00:25:10.100
And if I can do that
multiplication, I have -- well,
00:25:10.100 --> 00:25:13.390
there's this factor 1
over 1 plus 2*alpha,
00:25:13.390 --> 00:25:15.880
and what do I have?
00:25:15.880 --> 00:25:20.180
6*alpha plus 6*alpha squared
minus 6*alpha squared --
00:25:20.180 --> 00:25:22.730
I think 6*alpha?
00:25:22.730 --> 00:25:27.640
6*alpha squared minus 6*alpha
squared minus 6*alpha --
00:25:27.640 --> 00:25:30.200
I think it's that,
minus 6*alpha.
00:25:30.200 --> 00:25:36.320
And, ready for the great moment?
00:25:36.320 --> 00:25:44.100
Let alpha go to infinity,
and what do I get?
00:25:44.100 --> 00:25:52.580
As alpha goes to infinity,
the 1 becomes insignificant,
00:25:52.580 --> 00:25:59.660
the alpha cancels the alpha,
so that approaches [3, -3].
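The same limit can be watched numerically; this is a hedged NumPy sketch (my own illustration) of the 2-by-2 system on the board.

```python
import numpy as np

# [[1+a, -a], [-a, 1+a]] [u_1, u_2]^T = [6a, -6a]^T, solved for growing alpha.
# The closed form is (1/(1+2a)) [6a, -6a], which approaches [3, -3].
def u_of_alpha(a):
    M = np.array([[1 + a, -a], [-a, 1 + a]])
    rhs = np.array([6 * a, -6 * a])
    return np.linalg.solve(M, rhs)

for a in (1.0, 100.0, 1e6):
    # check against the closed form with determinant 1 + 2a
    assert np.allclose(u_of_alpha(a), np.array([6 * a, -6 * a]) / (1 + 2 * a))

u_limit = u_of_alpha(1e9)  # essentially [3, -3]
```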
00:25:59.660 --> 00:26:03.590
So there you see the large
alpha method in practice.
00:26:03.590 --> 00:26:04.370
OK.
00:26:04.370 --> 00:26:11.130
And you see what -- well,
there's something quite
00:26:11.130 --> 00:26:13.550
important here.
00:26:13.550 --> 00:26:15.482
Something quite important,
and it's connected
00:26:15.482 --> 00:26:16.440
with the pseudoinverse.
00:26:19.470 --> 00:26:27.350
The pseudoinverse -- so now,
we've got this answer.
00:26:27.350 --> 00:26:35.300
And what I want to say is that
the alpha, the limiting alpha
00:26:35.300 --> 00:26:43.300
system, has produced
this pseudoinverse.
00:26:43.300 --> 00:26:46.160
So now I have to tell you
about the pseudoinverse
00:26:46.160 --> 00:26:47.810
and what it means.
00:26:47.810 --> 00:26:50.210
And basically, the essential
thing that it means
00:26:50.210 --> 00:26:59.840
is, the pseudoinverse
gives the solution u which
00:26:59.840 --> 00:27:03.300
has no null space component.
00:27:03.300 --> 00:27:05.120
That's what the
pseudoinverse is about.
00:27:05.120 --> 00:27:07.840
I'll draw a picture to
say what I'm saying.
00:27:07.840 --> 00:27:12.010
But it's this fact that
means that this part,
00:27:12.010 --> 00:27:20.150
which was this number,
is the output --
00:27:20.150 --> 00:27:28.530
this is the pseudoinverse
of B applied to [6, 6].
00:27:28.530 --> 00:27:29.670
You see the point?
00:27:29.670 --> 00:27:33.370
B hasn't got an inverse.
00:27:33.370 --> 00:27:34.410
B is 1, minus 1.
00:27:34.410 --> 00:27:40.200
It's a rectangular matrix.
00:27:40.200 --> 00:27:50.220
And it's not invertible
in the normal sense.
00:27:50.220 --> 00:27:53.590
I can't find a
two-sided inverse;
00:27:53.590 --> 00:27:58.490
a B inverse doesn't exist.
00:27:58.490 --> 00:28:02.100
But a pseudoinverse exists.
00:28:02.100 --> 00:28:06.020
So, just to give a MATLAB -- as
long as I've written a MATLAB
00:28:06.020 --> 00:28:11.910
command here, why don't I
write the other MATLAB command?
00:28:11.910 --> 00:28:17.940
u is the pseudoinverse -- you
remember that pseudo starts
00:28:17.940 --> 00:28:27.530
with a letter p, so P-I-N-V
-- of B multiplying d.
00:28:27.530 --> 00:28:34.660
That's what we
got automatically.
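In NumPy the same line would use np.linalg.pinv; a hedged sketch (my own illustration, not from the lecture):

```python
import numpy as np

B = np.array([[1.0, -1.0]])
d = np.array([6.0])

u = np.linalg.pinv(B) @ d  # NumPy's version of pinv(B) * d: min-norm solution

# u solves B u = d and has no null-space component:
# the null space of B is spanned by [1, 1], and u is orthogonal to it.
```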
00:28:34.660 --> 00:28:38.370
And it's what we get --
and there's a reason we got
00:28:38.370 --> 00:28:40.660
the pseudoinverse.
00:28:40.660 --> 00:28:43.420
So let me just say
what was special here.
00:28:43.420 --> 00:28:46.060
What was special that
produced this pseudoinverse --
00:28:46.060 --> 00:28:48.300
that I'm going to
speak about more --
00:28:48.300 --> 00:28:54.040
was this choice A equal
the identity and b equal 0,
00:28:54.040 --> 00:28:59.530
the fact that we just put the
norm of u squared there --
00:28:59.530 --> 00:29:02.470
well, the idea is this
produces the pseudoinverse.
00:29:06.030 --> 00:29:12.110
And if you like -- so, can I
say a little more about this
00:29:12.110 --> 00:29:14.710
pseudoinverse before drawing
the picture that shows what
00:29:14.710 --> 00:29:15.620
it's about?
00:29:15.620 --> 00:29:19.570
So I took this thing and
let alpha go to infinity.
00:29:19.570 --> 00:29:24.340
OK, so I could equally well
have divided it by alpha,
00:29:24.340 --> 00:29:27.370
the whole -- if I divide
the whole thing by alpha,
00:29:27.370 --> 00:29:32.500
that won't change the minimizer;
certainly the same u's will
00:29:32.500 --> 00:29:33.650
win.
00:29:33.650 --> 00:29:37.830
And now I see one
over alpha going to 0.
00:29:37.830 --> 00:29:41.510
And that's where the
pseudoinverse is usually seen.
00:29:41.510 --> 00:29:46.550
We take the given problem,
which does not completely
00:29:46.550 --> 00:29:49.920
determine u_1 and
u_2, and we throw
00:29:49.920 --> 00:29:54.800
in a small amount
of norm u squared,
00:29:54.800 --> 00:29:59.590
and find the minimum
for that, right.
00:29:59.590 --> 00:30:01.380
So yeah.
00:30:03.990 --> 00:30:06.440
Let me say it, somehow.
00:30:06.440 --> 00:30:15.870
I take the B transpose B
plus the 1 over alpha I --
00:30:15.870 --> 00:30:23.560
now alpha is still going to
infinity in this lecture,
00:30:23.560 --> 00:30:30.890
so 1 over alpha, the whole
thing is headed for 0 --
00:30:30.890 --> 00:30:34.390
times the norm of u squared.
00:30:34.390 --> 00:30:37.720
This is the u_1 squared
plus u_2 squared.
00:30:37.720 --> 00:30:40.650
OK.
00:30:40.650 --> 00:30:49.350
And that inverse, that quantity
inverse approaches the -- well,
00:30:49.350 --> 00:30:54.020
once I -- I'm not giving
the complete formula,
00:30:54.020 --> 00:30:59.020
but that is what's entering
here and it leads to --
00:30:59.020 --> 00:31:05.830
may I use the vague word -- leads
toward the pseudoinverse B
00:31:05.830 --> 00:31:07.370
plus.
00:31:07.370 --> 00:31:07.870
Yeah.
00:31:07.870 --> 00:31:11.130
And I'll do better with that.
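That limit can be tested directly; here is a hedged numerical sketch (my own illustration), with eps standing for 1 over alpha:

```python
import numpy as np

# As eps = 1/alpha -> 0, (B^T B + eps I)^{-1} B^T approaches pinv(B).
# B^T B alone is singular here, so the eps*I term is what makes it invertible.
B = np.array([[1.0, -1.0]])
eps = 1e-8
approx = np.linalg.solve(B.T @ B + eps * np.eye(2), B.T)
exact = np.linalg.pinv(B)  # equals [[0.5], [-0.5]] for this B
```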
00:31:11.130 --> 00:31:13.750
OK, I want to go
on to the picture.
00:31:13.750 --> 00:31:17.770
OK, so right.
00:31:17.770 --> 00:31:20.430
Do you know the most important
picture of linear algebra?
00:31:20.430 --> 00:31:24.610
The whole picture of what a
matrix is actually doing?
00:31:24.610 --> 00:31:29.120
Here we have a great example
to draw that picture.
00:31:29.120 --> 00:31:33.690
So here's the picture
that 18.06 is --
00:31:33.690 --> 00:31:34.860
it's at the center of 18.06.
00:31:34.860 --> 00:31:39.060
For our 1 by 2 matrix.
00:31:39.060 --> 00:31:43.680
So our matrix
is B equals [1, minus 1].
00:31:43.680 --> 00:31:45.810
This is the picture
for that matrix.
00:31:45.810 --> 00:31:51.660
OK, so that matrix
has a row space.
00:31:51.660 --> 00:31:54.210
The row space is the
set of all vectors that
00:31:54.210 --> 00:31:56.430
are combinations of the rows.
00:31:56.430 --> 00:32:00.400
But there's only one row, so
the row space is only a line.
00:32:00.400 --> 00:32:03.720
I guess it's probably that line.
00:32:03.720 --> 00:32:09.770
So the row space
of B, of my matrix,
00:32:09.770 --> 00:32:16.180
is all multiples of [1, -1].
00:32:16.180 --> 00:32:18.070
So it's a line.
00:32:18.070 --> 00:32:21.020
Let's put the zero point in.
00:32:21.020 --> 00:32:24.540
OK, then the matrix
also has a null space.
00:32:24.540 --> 00:32:30.420
The null space is the set
of solutions to B*u equals 0.
00:32:30.420 --> 00:32:36.040
It's a line, and in fact
it's a perpendicular line.
00:32:36.040 --> 00:32:45.830
So this is the null space
of B, and it contains all --
00:32:45.830 --> 00:32:47.480
what does it contain?
00:32:47.480 --> 00:32:51.340
All the solutions to B*u
equals 0 which, in this case,
00:32:51.340 --> 00:32:56.690
are all multiples of [1, 1].
00:32:56.690 --> 00:33:00.860
And just to come back
to my early comment,
00:33:00.860 --> 00:33:06.500
that's what the qr, the extra
half of the qr algorithm
00:33:06.500 --> 00:33:10.710
is telling us; it's giving
us a beautiful basis
00:33:10.710 --> 00:33:11.920
for the null space.
00:33:11.920 --> 00:33:16.610
And so the key point is that
the null space is always
00:33:16.610 --> 00:33:22.130
perpendicular to the row space,
which of course we see here.
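A quick numerical check of that remark -- not from the lecture, just a NumPy sketch of how a full QR factorization's "extra half" hands you an orthonormal basis for the null space, perpendicular to the row space:

```python
import numpy as np

# B is the 1-by-2 matrix from the lecture.  A *complete* QR of B transpose
# gives a 2x2 orthogonal Q: its first column spans the row space of B,
# and the extra column spans the null space of B.
B = np.array([[1.0, -1.0]])
Q, R = np.linalg.qr(B.T, mode="complete")
row_space_basis = Q[:, 0]    # multiples of [1, -1], up to sign
null_space_basis = Q[:, 1]   # multiples of [1, 1], up to sign

print(B @ null_space_basis)               # ~[0]: B kills the null space vector
print(row_space_basis @ null_space_basis) # ~0: the two lines are perpendicular
```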
00:33:22.130 --> 00:33:28.240
This z is what we had to
compute when there were
00:33:28.240 --> 00:33:30.590
p components and not just one.
00:33:30.590 --> 00:33:38.040
And now, where is -- let's
see, what else goes into this
00:33:38.040 --> 00:33:39.450
picture?
00:33:39.450 --> 00:33:44.740
Where are the solutions to
my equation B*u equal d?
00:33:44.740 --> 00:33:50.620
So my equation was -- my
equation was u_1 minus u_2
00:33:50.620 --> 00:33:56.690
equal a particular number, 6,
and where are the solutions
00:33:56.690 --> 00:33:58.870
to u_1 minus u_2 equals 6?
00:34:02.250 --> 00:34:11.080
OK, so now I want to draw all
the -- Where are all the --
00:34:11.080 --> 00:34:13.700
so this is the u_1, u_2 plane?
00:34:13.700 --> 00:34:20.320
OK, so one solution is
take c equal to 3. [3, -3],
00:34:20.320 --> 00:34:24.090
the combination [3, -3],
which is right there,
00:34:24.090 --> 00:34:29.410
is my particular solution, so
u_particular, or u row space,
00:34:29.410 --> 00:34:30.650
is [3, -3].
00:34:33.540 --> 00:34:38.530
That solves the equation,
and it lies in the row space.
00:34:38.530 --> 00:34:41.090
And now, if you
understand the whole point
00:34:41.090 --> 00:34:47.400
of linear equations, where
are the rest of the solutions?
00:34:47.400 --> 00:34:50.710
How do I draw the
rest of the solutions?
00:34:50.710 --> 00:34:55.890
Well, to a particular solution I
add on any null space solution.
00:34:55.890 --> 00:34:59.310
The null space
solutions go this way.
00:34:59.310 --> 00:35:06.100
So I add on -- so this is my
whole line of all solutions,
00:35:06.100 --> 00:35:08.890
so this is the line
of all solutions.
00:35:17.960 --> 00:35:24.230
And now, the key question is,
which solution is the smallest?
00:35:24.230 --> 00:35:26.990
So this is the idea
of the pseudoinverse.
00:35:26.990 --> 00:35:31.260
When there are many solutions,
pick the smallest one,
00:35:31.260 --> 00:35:34.720
pick the shortest one, it's
the most stable somehow.
00:35:34.720 --> 00:35:36.530
It's the natural one.
00:35:36.530 --> 00:35:39.890
And which one is it?
00:35:39.890 --> 00:35:42.860
OK which -- so
here is the origin.
00:35:42.860 --> 00:35:46.640
What point on that line
is closest to the origin?
00:35:46.640 --> 00:35:51.930
What point minimizes u_1
square plus u_2 square?
00:35:51.930 --> 00:35:53.070
Everybody can see.
00:35:53.070 --> 00:36:01.000
This guy, that minimizes it
-- so the pseudoinverse says,
00:36:01.000 --> 00:36:04.470
wait a minute, when you've
got a whole line of solutions,
00:36:04.470 --> 00:36:06.490
just tell me a good one.
00:36:06.490 --> 00:36:08.250
Tell me the special one.
00:36:08.250 --> 00:36:11.430
And the special one is
the one in the row space.
00:36:14.110 --> 00:36:16.740
And that's the one that
the pseudoinverse picks.
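To see that choice numerically -- a NumPy sketch, not part of the lecture -- the pseudoinverse applied to d lands exactly on the row-space point [3, -3] from the picture:

```python
import numpy as np

# The lecture's example: B*u = d with B = [1, -1] and d = 6.
# B+ d is the particular solution lying in the row space -- the shortest one.
B = np.array([[1.0, -1.0]])
d = np.array([6.0])
u_best = np.linalg.pinv(B) @ d
print(u_best)   # [ 3. -3.]: the row-space solution

# Any other solution [3 + c, -3 + c] on the solution line is longer:
c = 5.0
u_other = u_best + c * np.array([1.0, 1.0])
print(np.linalg.norm(u_best) < np.linalg.norm(u_other))   # True
```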
00:36:16.740 --> 00:36:23.210
So the pseudoinverse of a matrix
-- so the general rule is,
00:36:23.210 --> 00:36:26.170
and part of the lecture
was the fact that,
00:36:26.170 --> 00:36:29.550
as alpha goes to
infinity in this problem,
00:36:29.550 --> 00:36:31.290
the pseudoinverse will do it.
00:36:31.290 --> 00:36:35.310
Or I could say, just directly,
what does the pseudoinverse do?
00:36:35.310 --> 00:36:46.450
The pseudoinverse -- so B plus,
the pseudoinverse, chooses,
00:36:46.450 --> 00:36:55.500
it chooses u_p, if you like,
u_p -- that's the B plus,
00:36:55.500 --> 00:37:01.290
that multiplies d -- the solution
-- I can't say B inverse d.
00:37:01.290 --> 00:37:05.070
Everybody knows my
equation is B*u equal d.
00:37:05.070 --> 00:37:08.850
So this is my
equation, B*u equal d.
00:37:08.850 --> 00:37:14.000
And my particular solution,
my pseudo-solution, my best
00:37:14.000 --> 00:37:16.790
solution, is going
to be B plus d,
00:37:16.790 --> 00:37:21.380
and it's going to
be in the row space,
00:37:21.380 --> 00:37:26.930
because it's the
smallest solution.
00:37:30.650 --> 00:37:33.640
So if you meet the
idea of pseudoinverses,
00:37:33.640 --> 00:37:38.130
now you know what
it's talking about.
00:37:38.130 --> 00:37:40.060
Because we don't
have a true inverse,
00:37:40.060 --> 00:37:45.230
we have a whole line of
solutions, we want to pick one,
00:37:45.230 --> 00:37:48.200
and the pseudoinverse
picks this one.
00:37:48.200 --> 00:37:50.580
It's the one in the row
space, and it's the shortest,
00:37:50.580 --> 00:37:53.260
because these are orthogonal.
00:37:53.260 --> 00:38:02.210
Because these are orthogonal
-- u is u_p plus u_n,
00:38:02.210 --> 00:38:03.900
and because those
are orthogonal,
00:38:03.900 --> 00:38:06.800
the length of u
squared, by Pythagoras,
00:38:06.800 --> 00:38:11.670
is the length of u_p squared
plus the length of u_n squared.
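That Pythagoras step can be checked directly -- a small NumPy sketch (not from the lecture) over the solution line [3 + c, -3 + c]:

```python
import numpy as np

# u_p = [3, -3] lies in the row space; u_n = c*[1, 1] lies in the null space.
# Because the two parts are orthogonal, ||u||^2 = ||u_p||^2 + ||u_n||^2,
# so c = 0 gives the shortest solution.
u_p = np.array([3.0, -3.0])
for c in [0.0, 1.0, -2.5, 10.0]:
    u_n = c * np.array([1.0, 1.0])
    u = u_p + u_n
    assert np.isclose(u @ u, u_p @ u_p + u_n @ u_n)   # Pythagoras holds

print("shortest at c = 0, length squared:", u_p @ u_p)   # 18.0
```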
00:38:11.670 --> 00:38:15.820
And which one is shortest?
00:38:15.820 --> 00:38:18.830
The one that has no u_n.
00:38:18.830 --> 00:38:22.710
That orthogonal
component might as well
00:38:22.710 --> 00:38:26.300
be 0 if you want the shortest.
00:38:26.300 --> 00:38:30.420
So all solutions have
this, and this is
00:38:30.420 --> 00:38:32.030
the length of the shortest one.
00:38:32.030 --> 00:38:32.860
OK.
00:38:32.860 --> 00:38:35.740
So that tells you what
the pseudoinverse is.
00:38:35.740 --> 00:38:41.550
At least it tells you what
it is for a 1 by 2 matrix.
00:38:41.550 --> 00:38:48.710
As long as I'm trying to
speak about the pseudoinverse,
00:38:48.710 --> 00:38:52.170
let me complete this thought.
00:38:52.170 --> 00:38:56.570
But you saw the idea,
that the thought was --
00:38:56.570 --> 00:38:59.320
there are two
ways to get it, again.
00:38:59.320 --> 00:39:02.940
The null space method
that goes for it directly,
00:39:02.940 --> 00:39:09.070
or the big alpha method that
we checked actually works.
00:39:09.070 --> 00:39:10.920
So that was the point
of this board here.
00:39:10.920 --> 00:39:15.780
That the big alpha method also
produces, in the limit as alpha
00:39:15.780 --> 00:39:17.610
goes to infinity, u_p.
00:39:17.610 --> 00:39:23.850
And there's a little
-- it doesn't have --
00:39:23.850 --> 00:39:27.390
if alpha was 1,000, I wouldn't
get exactly the right answer,
00:39:27.390 --> 00:39:32.320
because this would be
2,001 in the denominator.
00:39:32.320 --> 00:39:36.780
But as 1,000 becomes a million
and alpha goes to infinity,
00:39:36.780 --> 00:39:40.110
I get the exact one.
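Here is a NumPy sketch of that big-alpha method -- assuming, as in the earlier lecture, the penalized problem min ||u||^2 + alpha*||B*u - d||^2, whose normal equations are (I + alpha*B^T B) u = alpha*B^T d:

```python
import numpy as np

# Big-alpha method for B = [1, -1], d = 6.
B = np.array([[1.0, -1.0]])
d = np.array([6.0])
I = np.eye(2)

def u_alpha(alpha):
    # Solve the normal equations (I + alpha*B^T B) u = alpha*B^T d.
    return np.linalg.solve(I + alpha * B.T @ B, alpha * B.T @ d)

print(u_alpha(1000.0))   # ~[ 2.9985 -2.9985]: 6000/2001, the "2,001" denominator
print(u_alpha(1e9))      # ~[ 3. -3.]: approaches the pseudoinverse answer
```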
00:39:40.110 --> 00:39:42.570
OK, so here I was going
to draw the picture,
00:39:42.570 --> 00:39:53.170
so if I draw row space -- can I
imagine this is the row space,
00:39:53.170 --> 00:39:58.580
whose dimension is the
rank of the matrix.
00:39:58.580 --> 00:40:07.510
Perpendicular to it is the
null space whose dimension is
00:40:07.510 --> 00:40:12.940
the rest -- the rank, that was
the rank that I always call r,
00:40:12.940 --> 00:40:16.420
then this will have the
dimension n minus r,
00:40:16.420 --> 00:40:19.680
the number of --
this is exactly,
00:40:19.680 --> 00:40:27.590
these are the two things
that MATLAB found here.
00:40:27.590 --> 00:40:34.030
These were the r vectors in the
row space, turned into columns,
00:40:34.030 --> 00:40:37.680
and these were the n minus r
-- but that was only one --
00:40:37.680 --> 00:40:40.340
vectors in the null space.
00:40:40.340 --> 00:40:46.690
So normally, we're up in n
dimensions, not just two.
00:40:46.690 --> 00:40:50.810
With two dimensions, I just
had lines; in n dimensions
00:40:50.810 --> 00:40:54.040
I have an r-dimensional
subspace perpendicular
00:40:54.040 --> 00:40:56.870
to an n minus r
dimensional subspace.
00:40:56.870 --> 00:41:01.150
And now B. What does B do?
00:41:01.150 --> 00:41:02.030
OK.
00:41:02.030 --> 00:41:06.990
So suppose I take a vector
u_n in the null space
00:41:06.990 --> 00:41:10.130
of B. Then B takes it to 0.
00:41:10.130 --> 00:41:13.180
So can I just draw
that with an arrow?
00:41:13.180 --> 00:41:14.630
This'll be 0.
00:41:14.630 --> 00:41:20.210
B*u_n is 0, that's
the whole idea.
00:41:20.210 --> 00:41:21.380
OK.
00:41:21.380 --> 00:41:26.810
But a vector in a row
space is not taken to 0.
00:41:26.810 --> 00:41:32.110
B will take that -- dot
dot dot dot -- into the --
00:41:32.110 --> 00:41:43.740
I better draw here -- the
column space of B. OK.
00:41:43.740 --> 00:41:49.350
Which I'm drawing as a subspace
whose dimension is also,
00:41:49.350 --> 00:41:57.590
is this same rank r, that's
the great fact about matrices,
00:41:57.590 --> 00:41:59.650
that the number of
independent rows
00:41:59.650 --> 00:42:02.280
equals the number of
independent columns.
00:42:02.280 --> 00:42:11.550
So this guy heads off
to some B u row space.
00:42:11.550 --> 00:42:12.730
OK.
00:42:12.730 --> 00:42:16.900
And if I complete the
picture, as I really should,
00:42:16.900 --> 00:42:23.360
there's another
subspace over here,
00:42:23.360 --> 00:42:28.250
which happened to be the zero
subspace in this example,
00:42:28.250 --> 00:42:29.970
but usually it's here.
00:42:29.970 --> 00:42:35.230
It's the null space
of B transpose.
00:42:39.040 --> 00:42:45.470
In that example, B
transpose was the column [1, -1]
00:42:45.470 --> 00:42:51.140
and its column was independent,
so there was no null space.
00:42:51.140 --> 00:42:53.300
So I had a simple
picture, and that's
00:42:53.300 --> 00:42:57.270
why I wanted to draw you
a bigger picture with it.
00:42:57.270 --> 00:43:02.950
Its dimension will be,
well, not n minus r,
00:43:02.950 --> 00:43:10.850
but if B is m by
n, let's say, then
00:43:10.850 --> 00:43:13.190
it turns out that
this null space will
00:43:13.190 --> 00:43:15.250
have dimension m minus r.
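Those dimension counts are easy to verify for the lecture's example -- a NumPy sketch, not from the lecture itself:

```python
import numpy as np

# For B = [1, -1]: m = 1 row, n = 2 columns, rank r = 1.
# dim(null space of B)   = n - r = 1  (a line)
# dim(null space of B^T) = m - r = 0  (just the zero vector --
#                                      why that fourth subspace vanished here)
B = np.array([[1.0, -1.0]])
m, n = B.shape
r = np.linalg.matrix_rank(B)
print(n - r)   # 1
print(m - r)   # 0
```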
00:43:15.250 --> 00:43:16.100
No problem.
00:43:16.100 --> 00:43:16.720
OK.
00:43:16.720 --> 00:43:20.420
Now, in the last
three minutes, I
00:43:20.420 --> 00:43:25.740
want to draw the pseudoinverse.
00:43:25.740 --> 00:43:28.910
So what I'm saying is
that every matrix B,
00:43:28.910 --> 00:43:32.020
every rectangular
or square matrix B,
00:43:32.020 --> 00:43:34.450
has these four spaces.
00:43:34.450 --> 00:43:37.890
Four fundamental subspaces
they've come to be called.
00:43:37.890 --> 00:43:46.290
OK and the null space is the
vectors which B takes to 0.
00:43:46.290 --> 00:43:50.030
B takes any vector
into its column space.
00:43:50.030 --> 00:43:55.950
So now let me just draw what
happens to u equal u null
00:43:55.950 --> 00:43:59.010
space plus u row space.
00:43:59.010 --> 00:44:03.220
So this was a guy
in the row space.
00:44:03.220 --> 00:44:08.210
If I -- B, what will B do
when it multiplies this vector?
00:44:08.210 --> 00:44:12.010
This vector has a part
that's in the null space,
00:44:12.010 --> 00:44:14.320
and a part that's
in the row space.
00:44:14.320 --> 00:44:19.860
But when I multiply by B,
what happens to this part?
00:44:19.860 --> 00:44:21.180
Gone.
00:44:21.180 --> 00:44:24.060
When I multiply that
by B, where does it go?
00:44:24.060 --> 00:44:24.740
There.
00:44:24.740 --> 00:44:31.000
So this, all these guys
feed into that same point.
00:44:31.000 --> 00:44:37.400
B*u is also going there.
00:44:37.400 --> 00:44:39.350
That's why it's not invertible.
00:44:39.350 --> 00:44:40.440
Of course.
00:44:40.440 --> 00:44:42.800
That's why it's not invertible.
00:44:42.800 --> 00:44:51.120
Here, I guess --
yeah, here I -- sorry.
00:44:51.120 --> 00:44:51.750
Yeah.
00:44:51.750 --> 00:44:54.986
This was the null space of
B. I didn't write in what
00:44:54.986 --> 00:44:56.380
it was the null space of.
00:44:56.380 --> 00:44:57.310
OK.
00:44:57.310 --> 00:45:00.630
So the matrix couldn't be
invertible, and actually,
00:45:00.630 --> 00:45:04.300
because it has a null space,
and they all send those --
00:45:04.300 --> 00:45:07.880
so what is the
pseudoinverse, finally?
00:45:07.880 --> 00:45:13.640
Finally, last moment, the
pseudoinverse is the matrix --
00:45:13.640 --> 00:45:17.540
it's like an inverse matrix
that comes backwards, right?
00:45:17.540 --> 00:45:21.650
It reverses what B does.
00:45:21.650 --> 00:45:28.000
What it cannot do is reverse
stuff that's ended up at 0.
00:45:28.000 --> 00:45:31.960
No matrix could send
0 back to u_n right?
00:45:31.960 --> 00:45:34.650
If I multiply by
the zero vector,
00:45:34.650 --> 00:45:36.890
I'm only going to
get the zero vector.
00:45:36.890 --> 00:45:44.730
So the pseudoinverse has to --
what it can do is it can send
00:45:44.730 --> 00:45:50.220
this stuff back to this.
00:45:50.220 --> 00:45:51.760
This is what the
pseudoinverse does.
00:45:51.760 --> 00:45:55.460
If I had a different color
chalk I would use it now.
00:45:55.460 --> 00:45:59.920
But let me use two
arrows or even three.
00:45:59.920 --> 00:46:02.470
This is what the
pseudoinverse does.
00:46:02.470 --> 00:46:08.570
It takes the column space and
sends it back to the row space.
00:46:08.570 --> 00:46:13.710
And because these have the same
dimension r -- the point is,
00:46:13.710 --> 00:46:18.880
inside B is this r by
r matrix that's cool.
00:46:18.880 --> 00:46:20.640
It's totally invertible.
00:46:20.640 --> 00:46:23.270
And B plus inverts it.
00:46:23.270 --> 00:46:25.900
So from row space
to column space
00:46:25.900 --> 00:46:29.860
goes B; from column
space back to row space
00:46:29.860 --> 00:46:34.420
comes the pseudoinverse, but I
can't call it a genuine inverse
00:46:34.420 --> 00:46:40.300
because all this stuff,
including 0, the best I can do
00:46:40.300 --> 00:46:43.790
is send those all back to 0.
00:46:43.790 --> 00:46:45.400
There.
00:46:45.400 --> 00:46:49.320
Now I've really wiped
out that figure.
00:46:49.320 --> 00:46:52.490
But I'll put the
three arrows there
00:46:52.490 --> 00:46:56.410
that makes it crystal clear.
00:46:56.410 --> 00:46:58.740
So this, those three
arrows are indicating
00:46:58.740 --> 00:47:00.895
what the pseudoinverse does.
00:47:00.895 --> 00:47:08.580
It takes the column space -- Its
column space is the row space.
00:47:08.580 --> 00:47:13.370
The column space of B plus
is the row space of B.
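The three-arrow picture can also be checked in NumPy -- a sketch, not from the lecture: B+ B projects any vector onto the row space, and B B+ acts as the identity on the column space (here all of R^1):

```python
import numpy as np

# B+ B is the projection onto the row space of B (multiples of [1, -1]);
# B B+ is the projection onto the column space of B, which here is all of R^1.
B = np.array([[1.0, -1.0]])
Bplus = np.linalg.pinv(B)
P_row = Bplus @ B
print(P_row)               # [[ 0.5 -0.5] [-0.5  0.5]]

u = np.array([3.0, 7.0])   # any vector: null space part plus row space part
print(P_row @ u)           # [-2.  2.]: its row-space component survives
print(B @ Bplus)           # [[1.]]: identity on the column space
```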
00:47:13.370 --> 00:47:16.700
You know, sort of, in these
two spaces, that's where
00:47:16.700 --> 00:47:19.010
the pseudoinverse is alive.
00:47:19.010 --> 00:47:25.410
And B kills the null space and
B plus kills the null space --
00:47:25.410 --> 00:47:26.420
the other null space.
00:47:26.420 --> 00:47:28.000
The null space of B transpose.
00:47:28.000 --> 00:47:35.690
Anyway, that pseudoinverse is at
the center of the whole theory
00:47:35.690 --> 00:47:38.240
here.
00:47:38.240 --> 00:47:40.340
You know, when I take out
books from the library
00:47:40.340 --> 00:47:44.530
about regularizing
least squares,
00:47:44.530 --> 00:47:49.470
they begin by explaining
the pseudoinverse.
00:47:49.470 --> 00:47:55.470
Which, as we've seen, arises
as alpha goes to infinity
00:47:55.470 --> 00:47:59.920
or 0, whichever end we're at.
00:47:59.920 --> 00:48:02.220
And what I have
still to do next time
00:48:02.220 --> 00:48:06.740
is, what happens if
I'm not prepared to go
00:48:06.740 --> 00:48:08.820
all the way to
the pseudoinverse,
00:48:08.820 --> 00:48:17.110
because it blows up on me,
and I want a finite alpha,
00:48:17.110 --> 00:48:19.610
what should that alpha be?
00:48:19.610 --> 00:48:26.710
And that alpha will be
determined by, as I said,
00:48:26.710 --> 00:48:30.270
somehow by the noise
level in the system.
00:48:30.270 --> 00:48:30.860
Right.
00:48:30.860 --> 00:48:34.540
And just to emphasize another
example that I'll probably
00:48:34.540 --> 00:48:38.200
mention, you know,
CT scans, MRI,
00:48:38.200 --> 00:48:42.420
all those things that
are trying to reconstruct
00:48:42.420 --> 00:48:46.990
the results from a limited number
of measurements, measurements
00:48:46.990 --> 00:48:51.270
that are not really enough
for perfect reconstruction,
00:48:51.270 --> 00:48:55.760
so this is the theory of
imperfect reconstruction,
00:48:55.760 --> 00:49:00.640
if I can invent an
expression, having
00:49:00.640 --> 00:49:03.490
met perfect reconstruction
in the world of wavelets
00:49:03.490 --> 00:49:06.480
and signal processing,
this is the subject
00:49:06.480 --> 00:49:09.280
of imperfect
reconstruction and I'll
00:49:09.280 --> 00:49:12.270
hope to do justice
to it on Wednesday.
00:49:12.270 --> 00:49:12.920
OK.
00:49:12.920 --> 00:49:14.490
Thank you.