WEBVTT
00:00:07.222 --> 00:00:08.430
DAVID SHIROKOFF: Hi everyone.
00:00:08.430 --> 00:00:09.950
Welcome back.
00:00:09.950 --> 00:00:14.370
So today I'd like to tackle
a problem on pseudoinverses.
00:00:14.370 --> 00:00:17.300
So given a matrix A,
which is not square,
00:00:17.300 --> 00:00:19.560
so it's just 1 and 2.
00:00:19.560 --> 00:00:21.640
First, what is
its pseudoinverse?
00:00:21.640 --> 00:00:25.300
So A plus I'm using to
denote the pseudoinverse.
00:00:25.300 --> 00:00:30.460
Then secondly, compute
A plus A and A A plus.
00:00:30.460 --> 00:00:33.980
And then thirdly, if x is
in the null space of A,
00:00:33.980 --> 00:00:37.070
what is A plus A acting on x?
00:00:37.070 --> 00:00:42.540
And lastly, if x is in the
column space of A transpose,
00:00:42.540 --> 00:00:44.935
what is A plus A*x?
00:00:44.935 --> 00:00:47.060
So I'll let you think about
this problem for a bit,
00:00:47.060 --> 00:00:48.268
and I'll be back in a second.
00:00:59.942 --> 00:01:01.040
Hi everyone.
00:01:01.040 --> 00:01:02.540
Welcome back.
00:01:02.540 --> 00:01:05.209
OK, so let's take a
look at this problem.
00:01:05.209 --> 00:01:08.030
Now first off, what
is a pseudoinverse?
00:01:08.030 --> 00:01:12.690
Well, we define the
pseudoinverse using the SVD.
00:01:17.940 --> 00:01:20.160
So in actuality,
this is nothing new.
00:01:25.220 --> 00:01:28.410
Now, we note that
because A is not square,
00:01:28.410 --> 00:01:31.390
the regular inverse of A
doesn't necessarily exist.
00:01:31.390 --> 00:01:35.400
However, we do know that the
SVD exists for every matrix A
00:01:35.400 --> 00:01:39.980
whether it's square or not.
00:01:39.980 --> 00:01:43.170
So how do we compute
the SVD of a matrix?
00:01:43.170 --> 00:01:45.760
Well let's just recall
that the SVD of a matrix
00:01:45.760 --> 00:01:54.780
has the form of U sigma V
transpose, where U and V are
00:01:54.780 --> 00:01:59.480
orthogonal matrices
and sigma is a matrix
00:01:59.480 --> 00:02:03.090
with positive values
along the diagonal
00:02:03.090 --> 00:02:05.810
or 0's along the diagonal.
00:02:05.810 --> 00:02:07.850
And let's just take a
look at the dimensions
00:02:07.850 --> 00:02:10.020
of these matrices for a second.
00:02:10.020 --> 00:02:13.860
So we know that A
is a 1 by 2 matrix.
00:02:13.860 --> 00:02:16.320
And the way to figure
out what the dimensions
00:02:16.320 --> 00:02:18.740
of these matrices
are I usually always
00:02:18.740 --> 00:02:21.132
start with the
center matrix, sigma,
00:02:21.132 --> 00:02:23.590
and sigma is always going to
have the same dimensions as A,
00:02:23.590 --> 00:02:26.230
so it's going to
be a 1 by 2 matrix.
00:02:26.230 --> 00:02:30.140
U and V are always
square matrices.
00:02:30.140 --> 00:02:32.480
So to make this
multiplication work out,
00:02:32.480 --> 00:02:34.660
we need V to have
2, and because it's
00:02:34.660 --> 00:02:37.220
square it has to be 2 by 2.
00:02:37.220 --> 00:02:41.250
And likewise, U
has to be 1 by 1.
00:02:41.250 --> 00:02:45.500
So we now have the dimensions
of U, sigma, and V.
00:02:45.500 --> 00:02:49.300
And note, because U
is a 1 by 1 matrix,
00:02:49.300 --> 00:02:52.790
the only orthogonal 1
by 1 matrix is just 1.
00:02:52.790 --> 00:02:57.130
So u we already
know is just going
00:02:57.130 --> 00:03:00.590
to be the matrix, the
identity matrix, which is a 1
00:03:00.590 --> 00:03:02.870
by 1 matrix.
00:03:02.870 --> 00:03:05.750
OK, now how do we
compute V and sigma?
00:03:05.750 --> 00:03:13.130
Well, we can take
A transpose and A,
00:03:13.130 --> 00:03:21.070
and if we do that, we end up
getting the matrix V sigma
00:03:21.070 --> 00:03:27.220
transpose sigma V transpose.
00:03:27.220 --> 00:03:30.670
And this matrix is going
to be a square matrix where
00:03:30.670 --> 00:03:34.300
the diagonal elements are
squares of the singular values.
00:03:34.300 --> 00:03:38.790
So computing V and
the values along sigma
00:03:38.790 --> 00:03:43.520
just boil down to
diagonalizing A transpose A.
00:03:43.520 --> 00:03:44.960
So what is A transpose A?
00:03:44.960 --> 00:03:53.080
Well, in our case is
[1; 2] times [1, 2],
00:03:53.080 --> 00:03:59.340
which gives us [1, 2; 2, 4].
00:03:59.340 --> 00:04:03.940
And note that the second row
is just a constant multiple
00:04:03.940 --> 00:04:05.770
times the first row.
00:04:05.770 --> 00:04:10.360
Now what this means is we
have a zero eigenvalue.
00:04:10.360 --> 00:04:14.470
So we already know that
lambda_1 is going to be 0.
00:04:14.470 --> 00:04:17.540
So one of the eigenvalues
of this matrix is 0.
00:04:17.540 --> 00:04:19.959
And of course, when
we square root it,
00:04:19.959 --> 00:04:22.070
this is going to give
us a singular value
00:04:22.070 --> 00:04:25.350
sigma, which is also 0.
00:04:25.350 --> 00:04:30.070
And this is generally
a case when we have
00:04:30.070 --> 00:04:32.940
a sigma which is not square.
00:04:32.940 --> 00:04:36.320
We typically always have
zero singular values.
00:04:36.320 --> 00:04:38.502
Now to compute the
second eigenvalue,
00:04:38.502 --> 00:04:39.960
well we already
know how to compute
00:04:39.960 --> 00:04:41.626
the eigenvalues of a
matrix, so I'm just
00:04:41.626 --> 00:04:43.770
going to tell you what it is.
00:04:43.770 --> 00:04:47.430
The second one is lambda is 5.
00:04:47.430 --> 00:04:50.600
And if we just take
a quick look what
00:04:50.600 --> 00:04:57.210
the corresponding eigenvector
is going to be to lambda is 5,
00:04:57.210 --> 00:05:00.135
it's going to satisfy
this equation.
00:05:02.900 --> 00:05:08.170
So we can take the
eigenvector u to be 1 and 2.
00:05:08.170 --> 00:05:10.280
However, remember
that when we compute
00:05:10.280 --> 00:05:13.630
the eigenvector for this
orthogonal matrix V,
00:05:13.630 --> 00:05:16.842
they always have to
have a unit length.
00:05:16.842 --> 00:05:19.050
And this vector right now
doesn't have a unit length.
00:05:19.050 --> 00:05:22.200
We have to divide by the
length of this vector, which
00:05:22.200 --> 00:05:26.750
in our case is 1 over root 5.
00:05:26.750 --> 00:05:30.970
And if I go back to the
lambda equals 0 case,
00:05:30.970 --> 00:05:34.200
we also have
another eigenvector,
00:05:34.200 --> 00:05:37.440
which I'll just state.
00:05:37.440 --> 00:05:39.820
You can actually
compute it quite quickly
00:05:39.820 --> 00:05:44.540
just by noting that it has to be
orthogonal to this eigenvector,
00:05:44.540 --> 00:05:45.110
2 and 1.
00:05:47.870 --> 00:05:52.730
So what this means is A has a
singular value decomposition,
00:05:52.730 --> 00:06:01.510
which looks like: 1, so
this is u, times sigma,
00:06:01.510 --> 00:06:05.250
which is going to be root 5, 0.
00:06:05.250 --> 00:06:09.160
Remember that the first sigma
is actually the square root
00:06:09.160 --> 00:06:09.910
of the eigenvalue.
00:06:12.740 --> 00:06:16.660
Times a matrix which
looks like, now we
00:06:16.660 --> 00:06:20.460
have to order the eigenvalues
up in the correct order.
00:06:20.460 --> 00:06:22.170
Because 5 appears
in the first column,
00:06:22.170 --> 00:06:26.430
we have to take this vector to
be in the first column as well.
00:06:26.430 --> 00:06:34.170
So this is 1 over root 5, this
is 2 over root 5, negative 2
00:06:34.170 --> 00:06:40.050
over root 5, and 1 over root 5.
00:06:40.050 --> 00:06:48.670
And now this is V, but the
singular value decomposition
00:06:48.670 --> 00:06:49.940
is defined by V transpose.
00:06:55.120 --> 00:06:57.840
So this gives us a
representation for A. And now
00:06:57.840 --> 00:07:00.140
once we have the SVD of
A, how do we actually
00:07:00.140 --> 00:07:04.110
compute A plus, or the
pseudoinverse of A?
00:07:04.110 --> 00:07:14.700
Well just note if
A was invertible,
00:07:14.700 --> 00:07:20.670
then the inverse of
A in terms of the SVD
00:07:20.670 --> 00:07:26.280
would be V transpose times
the inverse of sigma.
00:07:30.800 --> 00:07:33.860
Sorry, this is not V
transpose, this is just V.
00:07:33.860 --> 00:07:36.095
So it'd be V sigma
inverse U transpose.
00:07:39.120 --> 00:07:45.400
And when A is invertible,
sigma inverse exists.
00:07:45.400 --> 00:07:49.900
So in our case, sigma
inverse doesn't necessarily
00:07:49.900 --> 00:07:52.510
exist because
sigma-- note, this is
00:07:52.510 --> 00:07:57.290
sigma-- sigma is root 5 and 0.
00:07:57.290 --> 00:08:03.970
So we have to construct a
pseudoinverse for sigma.
00:08:03.970 --> 00:08:07.640
So the way that we
do that is we take 1
00:08:07.640 --> 00:08:11.790
over each singular value, and
we take the transpose of sigma.
00:08:14.850 --> 00:08:17.720
So when A is not
invertible, we can still
00:08:17.720 --> 00:08:20.560
construct a
pseudoinverse by taking
00:08:20.560 --> 00:08:29.230
V, an approximation for sigma
inverse, which in our case
00:08:29.230 --> 00:08:33.480
is going to be 1 over
the singular value and 0.
00:08:33.480 --> 00:08:37.870
So note where sigma
is invertible,
00:08:37.870 --> 00:08:42.480
we take the inverse, and then we
fill in 0's in the other areas.
00:08:42.480 --> 00:08:43.234
Times U transpose.
00:08:46.500 --> 00:08:47.920
And we can work this out.
00:08:47.920 --> 00:09:01.880
We get 1 over root 5, 1, minus
2; 2, 1, 1 over root 5, 0.
00:09:07.760 --> 00:09:18.380
And if I multiply things
out, I get 1/5, [1; 2].
00:09:18.380 --> 00:09:22.640
So this is an approximation
for A inverse,
00:09:22.640 --> 00:09:25.270
which is the pseudoinverse.
00:09:25.270 --> 00:09:27.262
So this finishes up part one.
00:09:27.262 --> 00:09:28.970
And I'll started on
part two in a second.
00:09:35.780 --> 00:09:40.050
So now that we've just computed
A plus, the pseudoinverse of A.
00:09:40.050 --> 00:09:42.530
We're going to investigate
some properties
00:09:42.530 --> 00:09:44.620
of the pseudoinverse.
00:09:44.620 --> 00:09:47.130
So for part two
we need to compute
00:09:47.130 --> 00:09:52.630
A times A plus and
A plus times A.
00:09:52.630 --> 00:09:56.150
So we can just go
ahead and do this.
00:09:56.150 --> 00:10:04.590
So A A plus you can
do fairly quickly.
00:10:04.590 --> 00:10:08.000
1/5, [1; 2].
00:10:08.000 --> 00:10:14.720
And when we multiply it out we
get 1 plus 4 divided by 5 is 1.
00:10:14.720 --> 00:10:17.960
So we just get the one
by one matrix, which
00:10:17.960 --> 00:10:20.860
is 1, the identity matrix.
00:10:20.860 --> 00:10:27.280
And secondly, if we
take A plus times A
00:10:27.280 --> 00:10:37.930
we're going to get 1/5,
[1; 2] times [1, 2].
00:10:37.930 --> 00:10:40.640
And we can just
fill in this matrix.
00:10:40.640 --> 00:10:46.335
This is 1/5, [1, 2; 2, 1].
00:10:52.070 --> 00:10:54.330
And this concludes part two.
00:10:54.330 --> 00:10:58.300
So now let's take a look at
what happens when a vector x is
00:10:58.300 --> 00:11:00.380
in the null space of
A, and then secondly,
00:11:00.380 --> 00:11:05.280
what happens when x is in the
column space of A transpose.
00:11:05.280 --> 00:11:09.970
So for part three,
let's assume x
00:11:09.970 --> 00:11:13.590
is in the null space of A. Well
what's the null space of A?
00:11:13.590 --> 00:11:19.070
We can quickly check
that the null space of A
00:11:19.070 --> 00:11:25.730
is a constant times
any vector minus 2, 1.
00:11:25.730 --> 00:11:28.270
So that's the null space.
00:11:28.270 --> 00:11:32.980
So if x is, for example, i.e.
00:11:32.980 --> 00:11:38.710
if we take x is
equal to minus 2, 1,
00:11:38.710 --> 00:11:48.700
and we were to, say, multiply
it by A plus A, acting on x,
00:11:48.700 --> 00:11:51.840
we see that we get 0.
00:11:51.840 --> 00:11:54.480
And this isn't very
surprising because, well,
00:11:54.480 --> 00:11:58.180
if x is in the null space of
A, we know that A acting on x
00:11:58.180 --> 00:11:58.990
is going to be 0.
00:12:02.920 --> 00:12:09.130
So that no matter what matrix A
plus is, when we multiply by 0,
00:12:09.130 --> 00:12:10.320
we'll always end up with 0.
00:12:13.740 --> 00:12:18.260
And then lastly, let's
take a look at the column
00:12:18.260 --> 00:12:19.135
space of A transpose.
00:12:22.640 --> 00:12:25.900
Well, A transpose
is [1, 2], so it's
00:12:25.900 --> 00:12:28.460
any constant times
the vector [1; 2].
00:12:31.880 --> 00:12:35.720
And specifically, if we were
to take, say, x is equal to [1;
00:12:35.720 --> 00:12:42.920
2], we can work at A plus A
acting on the vector [1; 2].
00:12:49.070 --> 00:12:56.000
So we have 1/5 [1, 2; 2, 1].
00:12:56.000 --> 00:13:03.020
So recall this is A plus
A. And if we multiply it
00:13:03.020 --> 00:13:09.650
on the vector [1; 2], we get
1 plus 4 is 5, divided by 5,
00:13:09.650 --> 00:13:11.980
so we get 1.
00:13:11.980 --> 00:13:21.610
2 plus 2 is 4-- sorry, I
copied the matrix down.
00:13:21.610 --> 00:13:27.590
So it's 2 plus 8, which
is 10, divided by 5 is 2.
00:13:30.920 --> 00:13:34.030
And we see that at the end
we recover the vector x.
00:13:37.170 --> 00:13:41.500
So in general, if we
take A plus A acting
00:13:41.500 --> 00:13:47.320
on x, where x is in the
column space of A transpose,
00:13:47.320 --> 00:13:50.940
we always recover x
at the end of the day.
00:13:50.940 --> 00:13:54.520
So intuitively, what does
this matrix A plus A do?
00:13:54.520 --> 00:14:02.410
Well, if x is in the null
space of A, it just kills it.
00:14:02.410 --> 00:14:04.770
We just get 0.
00:14:04.770 --> 00:14:09.710
If x is not in the null space
of A, then we just get x back.
00:14:09.710 --> 00:14:11.910
So it's essentially
the identity matrix
00:14:11.910 --> 00:14:17.520
acting on x whenever x is in
the column space of A transpose.
00:14:17.520 --> 00:14:20.480
Now specifically,
if A is invertible,
00:14:20.480 --> 00:14:22.490
then A doesn't
have a null space.
00:14:22.490 --> 00:14:25.660
So what that means is:
when A is invertible,
00:14:25.660 --> 00:14:30.160
A plus A recovers the identity
because when we multiply it
00:14:30.160 --> 00:14:34.540
on any vector, we
get that vector back.
00:14:34.540 --> 00:14:39.040
So I'd like to conclude here,
and I'll see you next time.