WEBVTT
00:00:07.560 --> 00:00:08.640
OK.
00:00:08.640 --> 00:00:12.290
So, coming nearer the
end of the course,
00:00:12.290 --> 00:00:20.340
this lecture will be a mixture
of the linear algebra that
00:00:20.340 --> 00:00:22.610
comes with a change of basis.
00:00:22.610 --> 00:00:26.340
And a change of basis from
one basis to another basis
00:00:26.340 --> 00:00:30.270
is something you really
do in applications.
00:00:30.270 --> 00:00:33.710
And, I would like to talk
about those applications.
00:00:33.710 --> 00:00:36.920
I got a little bit
involved with compression.
00:00:36.920 --> 00:00:39.820
Compressing a signal,
compressing an image.
00:00:39.820 --> 00:00:43.910
And that's exactly
change-of-basis.
00:00:43.910 --> 00:00:48.510
And then, the main
theme in this chapter
00:00:48.510 --> 00:00:54.320
is th- the connection between
a linear transformation, which
00:00:54.320 --> 00:00:56.870
doesn't have to
have coordinates,
00:00:56.870 --> 00:01:01.610
and the matrix that tells
us that transformation
00:01:01.610 --> 00:01:04.069
with respect to coordinates.
00:01:04.069 --> 00:01:08.420
So the matrix is the
coordinate-based description
00:01:08.420 --> 00:01:11.090
of the linear transformation.
00:01:11.090 --> 00:01:14.480
Let me start out with
the nice part, which
00:01:14.480 --> 00:01:18.770
is just to tell you something
about image compression.
00:01:18.770 --> 00:01:21.070
Those of you -- well,
everybody's going to meet
00:01:21.070 --> 00:01:25.950
compression, because you know
that the amount of data that
00:01:25.950 --> 00:01:27.730
we're getting --
00:01:27.730 --> 00:01:31.640
well, these lectures
are compressed.
00:01:31.640 --> 00:01:37.500
So that, actually, probably
you see my motion as
00:01:37.500 --> 00:01:38.030
jerky?
00:01:38.030 --> 00:01:39.050
Shall I use that word?
00:01:39.050 --> 00:01:41.760
Have you looked on the web?
00:01:41.760 --> 00:01:46.470
I should like to
find a better word.
00:01:46.470 --> 00:01:47.770
Compressed, let's say.
00:01:47.770 --> 00:01:53.900
So the complete signal is, of
course, in those video cameras,
00:01:53.900 --> 00:01:56.360
and in the videotape,
but that goes
00:01:56.360 --> 00:01:59.540
to the bottom of building
nine, and out of that
00:01:59.540 --> 00:02:06.240
comes a jumpy motion because
it uses a standard system
00:02:06.240 --> 00:02:09.020
for compressing images.
00:02:09.020 --> 00:02:13.990
And, you'll notice that the
stuff that sits on the board
00:02:13.990 --> 00:02:20.200
comes very clearly, but
it's my motion that needs
00:02:20.200 --> 00:02:22.150
a whole lot of bits, right?
00:02:22.150 --> 00:02:26.060
So, and if I were to run up
and back up there and back,
00:02:26.060 --> 00:02:29.380
that would need too
many bits, and I'd
00:02:29.380 --> 00:02:32.860
be compressed even more.
00:02:32.860 --> 00:02:36.180
So, what does compression mean?
00:02:36.180 --> 00:02:39.100
Let me just think
of a still image.
00:02:39.100 --> 00:02:44.570
And of course, satellites,
and computations
00:02:44.570 --> 00:02:47.420
of the climate,
computations of combustion,
00:02:47.420 --> 00:02:52.410
the computers and
sensors of all kinds
00:02:52.410 --> 00:02:55.910
are just giving us
overwhelming amounts of data.
00:02:55.910 --> 00:02:58.720
The Web is, too.
00:02:58.720 --> 00:03:02.910
Now, some compression
can be done with no loss.
00:03:02.910 --> 00:03:07.150
Lossless compression is possible
just using, sort of, the fact
00:03:07.150 --> 00:03:10.410
that there are redundancies.
00:03:10.410 --> 00:03:13.550
But I'm talking here
about lossy compression.
00:03:13.550 --> 00:03:18.730
So I'm talking about
-- here's an image.
00:03:18.730 --> 00:03:20.960
And what does an
image consist of?
00:03:20.960 --> 00:03:24.770
It consists of a lot of
little pixels, right?
00:03:24.770 --> 00:03:28.870
Maybe five hundred and twelve
by five hundred and twelve.
00:03:28.870 --> 00:03:31.930
Two to the ninth by two
to the ninth pixels,
00:03:31.930 --> 00:03:40.200
and so this is pixel number
one, one, so that's a pixel.
00:03:40.200 --> 00:03:45.860
And if we're in black and
white, the typical pixel
00:03:45.860 --> 00:03:52.640
would tell us a gray-scale,
from zero to two fifty five.
00:03:52.640 --> 00:03:59.250
So a pixel is usually a
value of one of the xi,
00:03:59.250 --> 00:04:02.190
so this would be the
i-th pixel, is --
00:04:02.190 --> 00:04:05.260
it's usually a real
number on a scale
00:04:05.260 --> 00:04:06.860
from zero to two fifty five.
00:04:06.860 --> 00:04:10.260
In other words, two to
the eighth possibilities.
00:04:10.260 --> 00:04:19.720
So usually, that's the standard,
so that's eight -- eight bits.
00:04:22.620 --> 00:04:26.610
But then we have
that for every pixel,
00:04:26.610 --> 00:04:29.490
so we have five hundred
and twelve squared pixels,
00:04:29.490 --> 00:04:36.640
we're really operating x is a
vector in R^n, but what is n?
00:04:36.640 --> 00:04:41.080
n is five hundred
and twelve squared.
00:04:41.080 --> 00:04:45.470
That's our problem, right there.
00:04:45.470 --> 00:04:49.040
A pixel is a vector
that gives us
00:04:49.040 --> 00:04:53.010
the information about the image.
00:04:53.010 --> 00:04:55.240
I'm sorry.
00:04:55.240 --> 00:05:02.530
The image that comes through is
a vector of that length that --
00:05:02.530 --> 00:05:04.800
that's the information that
we have about the image,
00:05:04.800 --> 00:05:08.550
if it's a color image, we would
have three times that length,
00:05:08.550 --> 00:05:12.920
because we'd need three
coordinates to get color.
00:05:12.920 --> 00:05:16.830
So it would be three times five
hundred and twelve squared.
00:05:16.830 --> 00:05:18.830
It's an enormous
amount of information,
00:05:18.830 --> 00:05:22.810
and we couldn't send out
the image for these lectures
00:05:22.810 --> 00:05:25.510
without compressing it.
00:05:25.510 --> 00:05:29.070
It would overload the system.
00:05:29.070 --> 00:05:31.100
So it has to be compressed.
00:05:31.100 --> 00:05:37.920
The standard compression,
and still used with lectures
00:05:37.920 --> 00:05:39.810
is, called JPEG.
00:05:42.950 --> 00:05:46.870
I think that stands for Joint
Photographic Experts Group.
00:05:46.870 --> 00:05:50.390
They established a
system of compression.
00:05:50.390 --> 00:05:53.080
And I just want to tell
you what it's about.
00:05:53.080 --> 00:05:55.190
It's a change-of-basis.
00:05:55.190 --> 00:05:58.360
What basis do we have?
00:05:58.360 --> 00:06:00.860
The current basis
we have is, you
00:06:00.860 --> 00:06:07.050
could say, the standard basis
is, every pixel, give a value.
00:06:07.050 --> 00:06:12.440
So that's like we have a
vector x which is five hundred
00:06:12.440 --> 00:06:16.380
and twelve squared long
and, in the i-th position,
00:06:16.380 --> 00:06:21.420
we get a number like one
twenty one or something.
00:06:21.420 --> 00:06:27.720
The pixel next to it might
be one twenty four, maybe
00:06:27.720 --> 00:06:34.470
where my tie begins to enter,
so if it was mostly blue shirt,
00:06:34.470 --> 00:06:37.240
this would be a slight
difference in shading,
00:06:37.240 --> 00:06:41.430
but pretty close, then the tie
would be a different color,
00:06:41.430 --> 00:06:45.180
so we might have
quite a few pixels
00:06:45.180 --> 00:06:47.820
for the blue shirt,
and a whole lot
00:06:47.820 --> 00:06:52.450
more for the blackboard,
that are very close.
00:06:52.450 --> 00:06:55.430
And that's what are
very correlated.
00:06:55.430 --> 00:06:59.250
And that's what gives us the
possibility of compression.
00:06:59.250 --> 00:07:02.770
For example, before
the lecture starts,
00:07:02.770 --> 00:07:07.660
if we had a blank blackboard,
then there's an image,
00:07:07.660 --> 00:07:12.250
but it would make no
sense to take that image
00:07:12.250 --> 00:07:17.030
and tell you what it
is pixel by pixel.
00:07:17.030 --> 00:07:21.720
I mean, there's a case in
which all pixel values,
00:07:21.720 --> 00:07:24.180
all gray levels are the same --
00:07:24.180 --> 00:07:26.950
or practically the same,
depending on the erasing
00:07:26.950 --> 00:07:29.240
of the board, but
extremely close --
00:07:29.240 --> 00:07:35.260
and, so that's an image where
the standard basis is lousy.
00:07:35.260 --> 00:07:40.490
That's the basic fact, that
the standard basis which gives
00:07:40.490 --> 00:07:44.640
the value of every pixel
makes no use of the fact that
00:07:44.640 --> 00:07:49.250
we're getting a whole lot of
pixels whose gray levels --
00:07:49.250 --> 00:07:54.450
the neighboring pixels tend
to have the same gray level
00:07:54.450 --> 00:07:57.260
as their neighbors.
00:07:57.260 --> 00:07:59.950
So how do we take
advantage of that fact?
00:07:59.950 --> 00:08:04.820
Well, one basis
vector that would
00:08:04.820 --> 00:08:07.810
be extremely nice to
include in the basis
00:08:07.810 --> 00:08:12.830
would be a vector of all ones.
00:08:12.830 --> 00:08:14.830
That's not in our
standard basis,
00:08:14.830 --> 00:08:23.890
so let me just write again,
the standard basis is our one,
00:08:23.890 --> 00:08:29.440
and all the rest zeroes, zero,
one, and all the rest, zeroes,
00:08:29.440 --> 00:08:33.820
everybody knows what
these standard basis is.
00:08:33.820 --> 00:08:39.320
Now, any other basis
for R -- so this is --
00:08:39.320 --> 00:08:43.100
for this very
high-dimensional space --
00:08:43.100 --> 00:08:46.180
now I'm going to speak
about a better basis.
00:08:46.180 --> 00:08:51.810
Better basis -- and
let me just emphasize,
00:08:51.810 --> 00:08:54.580
one vector that would be
extremely nice to have in that
00:08:54.580 --> 00:08:56.770
basis is the vector of all ones.
00:08:59.790 --> 00:09:00.540
Why is that?
00:09:00.540 --> 00:09:04.730
Let me just say again, because
that vector of all ones,
00:09:04.730 --> 00:09:11.180
by itself, one vector is
able to completely give
00:09:11.180 --> 00:09:15.940
the information
on a solid image.
00:09:15.940 --> 00:09:17.650
Of course, our image
won't be solid,
00:09:17.650 --> 00:09:26.070
it will have a mix
of solid and signal.
00:09:26.070 --> 00:09:30.470
So having that one
vector in the basis
00:09:30.470 --> 00:09:33.260
is going to save us a whole lot.
00:09:33.260 --> 00:09:36.400
Now, the question is, what other
vectors should be in the basis?
00:09:36.400 --> 00:09:41.050
The extreme vector
in the basis might
00:09:41.050 --> 00:09:47.050
be a vector of one minus one,
one minus one, one minus one.
00:09:47.050 --> 00:09:49.420
That would be a
vector that shows --
00:09:49.420 --> 00:09:52.560
I mean, that's like a
checkerboard vector, right?
00:09:52.560 --> 00:09:58.630
That's a vector that
would, if the image was
00:09:58.630 --> 00:10:01.040
like a huge checkerboard
of plus, minus,
00:10:01.040 --> 00:10:03.720
plus, minus, plus,
minus, that vector
00:10:03.720 --> 00:10:06.260
would carry the whole signal.
00:10:06.260 --> 00:10:08.740
But much more common
would be maybe
00:10:08.740 --> 00:10:14.760
to have half the image, darker
and the other half lighter.
00:10:14.760 --> 00:10:18.320
So another vector that might
be quite useful in here
00:10:18.320 --> 00:10:21.480
would be half ones
and half minus ones.
00:10:24.950 --> 00:10:31.780
I'm just trying to get across
the idea of that a basis could
00:10:31.780 --> 00:10:33.950
be where, that
first of all, we've
00:10:33.950 --> 00:10:38.330
got the bases at our disposal.
00:10:38.330 --> 00:10:39.960
Like, we're free to choose that.
00:10:39.960 --> 00:10:45.330
And it's a billion-dollar
decision what we choose.
00:10:45.330 --> 00:10:50.270
So, and TV people
would rather pre-
00:10:50.270 --> 00:10:55.480
would prefer one basis based on
the way the signal is scanned,
00:10:55.480 --> 00:10:58.040
and movie people
would prefer another,
00:10:58.040 --> 00:11:02.540
I mean, there's giant
politics in this question that
00:11:02.540 --> 00:11:05.710
really reduces to a
linear algebra problem,
00:11:05.710 --> 00:11:07.900
what basis to choose.
00:11:07.900 --> 00:11:15.540
I'll just mention the best
known basis, which JPEG uses, --
00:11:15.540 --> 00:11:17.760
let me put that here --
00:11:17.760 --> 00:11:21.490
is the Fourier basis.
00:11:21.490 --> 00:11:27.730
So when you use the Fourier
basis, that includes --
00:11:27.730 --> 00:11:31.820
this is the constant vector, the
D C vector if we're electrical
00:11:31.820 --> 00:11:37.170
engineers, the l- vector of all
ones, so it would include one,
00:11:37.170 --> 00:11:39.070
one, one, one.
00:11:39.070 --> 00:11:41.370
Often eight by eight
is a good choice.
00:11:44.770 --> 00:11:46.580
Eight by eight is a good choice.
00:11:46.580 --> 00:11:50.340
So, what do I mean by
this eight by eight?
00:11:50.340 --> 00:11:55.430
I mean that the big signal,
which is five twelve by five
00:11:55.430 --> 00:12:00.480
twelve, gets broken down, and
JPEG does this, into eight
00:12:00.480 --> 00:12:02.540
by eight blocks.
00:12:02.540 --> 00:12:06.800
And we -- sort of, this is
too much to deal with at once.
00:12:06.800 --> 00:12:10.090
So what JPEG does is take
this eight by eight block,
00:12:10.090 --> 00:12:16.250
which is sixty four
coefficients, sixty four,
00:12:16.250 --> 00:12:21.030
pixels, and changes
the basis on that
00:12:21.030 --> 00:12:22.090
piece.
00:12:22.090 --> 00:12:26.260
And then, now, let's see, I was
going to write down Fourier,
00:12:26.260 --> 00:12:31.120
so you remember Fourier as this
vector of all ones, and then,
00:12:31.120 --> 00:12:35.100
the vector -- oh,
well, actually,
00:12:35.100 --> 00:12:39.480
I gave a lecture earlier
about the Fourier matrix,
00:12:39.480 --> 00:12:50.040
this matrix whose columns are
powers of a complex number w.
00:12:50.040 --> 00:12:52.820
I won't repeat that,
because I don't
00:12:52.820 --> 00:12:56.600
want to go into the details
of the Fourier basis,
00:12:56.600 --> 00:13:00.640
just to tell you how
compression works.
00:13:00.640 --> 00:13:02.880
So what happens in JPEG?
00:13:02.880 --> 00:13:11.640
What happens to the video, to
each image, of these lectures?
00:13:11.640 --> 00:13:16.270
It gets broken into
eight by eight blocks.
00:13:16.270 --> 00:13:16.910
OK.
00:13:16.910 --> 00:13:21.280
Within each block, we have
sixty four coefficients,
00:13:21.280 --> 00:13:24.210
sixty four basis vectors,
sixty four pixels,
00:13:24.210 --> 00:13:28.870
and we change basis in
sixty four dimensional space
00:13:28.870 --> 00:13:31.662
using these Fourier vectors.
00:13:35.000 --> 00:13:38.140
Just note, that was
a lossless step.
00:13:38.140 --> 00:13:40.240
Let me emphasize.
00:13:40.240 --> 00:13:46.150
In comes the signal x.
00:13:46.150 --> 00:13:47.460
We change basis.
00:13:47.460 --> 00:13:49.490
This is the basis change.
00:13:49.490 --> 00:13:50.800
Change basis.
00:13:50.800 --> 00:13:54.080
Choose a better basis.
00:13:54.080 --> 00:14:03.350
So it produces,
the coefficients c.
00:14:03.350 --> 00:14:07.590
So sixty four pixels come
in, sixty four coefficients
00:14:07.590 --> 00:14:08.400
come out.
00:14:08.400 --> 00:14:11.150
Now comes the compression.
00:14:11.150 --> 00:14:12.735
Now come -- this was lossless.
00:14:16.320 --> 00:14:18.860
It's just -- we know that R --
00:14:18.860 --> 00:14:25.250
R sixty four has plenty of
bases, and we've chosen one.
00:14:25.250 --> 00:14:32.040
Now, in that basis, we write
the signal in that basis,
00:14:32.040 --> 00:14:34.350
and that's what my lecture
-- that's the math part
00:14:34.350 --> 00:14:35.710
of my lecture.
00:14:35.710 --> 00:14:37.610
Now here's the application part.
00:14:37.610 --> 00:14:40.340
The next part is going to
be the compression step.
00:14:43.790 --> 00:14:46.030
And that's lossy.
00:14:46.030 --> 00:14:47.840
We're going to lose information.
00:14:47.840 --> 00:14:52.220
And what will actually
happen at that step?
00:14:52.220 --> 00:14:56.500
Well, one thing we could
do is just throw away
00:14:56.500 --> 00:14:58.680
the small coefficients.
00:14:58.680 --> 00:15:03.520
So that's called thresholding,
we set some threshold.
00:15:03.520 --> 00:15:07.150
Every coefficient, every
basis vector that's
00:15:07.150 --> 00:15:10.510
not in there more than
the threshold value,
00:15:10.510 --> 00:15:14.240
and we set them threshold
so that our eye can't
00:15:14.240 --> 00:15:19.650
see the difference, or can
hardly see the difference,
00:15:19.650 --> 00:15:23.230
whether we throw away that
little bit of that basis vector
00:15:23.230 --> 00:15:24.220
or keep it.
00:15:24.220 --> 00:15:28.004
So this compression step
produces a compressed set of
00:15:28.004 --> 00:15:29.670
I'll just keep going
here. coefficients.
00:15:32.900 --> 00:15:36.220
So it keeps going,
this compression step
00:15:36.220 --> 00:15:40.190
produces some coefficient c hat.
00:15:40.190 --> 00:15:42.560
And with many zeroes.
00:15:45.220 --> 00:15:48.160
So that's where the
compression came.
00:15:48.160 --> 00:15:52.480
Probably, there is enough of
this vector of all ones --
00:15:52.480 --> 00:15:55.240
we very seldom throw that away.
00:15:55.240 --> 00:15:57.950
Usually, its coefficient
will be large.
00:15:57.950 --> 00:16:01.110
But the coefficient of
something like this,
00:16:01.110 --> 00:16:04.500
that quickly alternative
vector, there's
00:16:04.500 --> 00:16:07.560
probably very little of
that in any smooth signal.
00:16:07.560 --> 00:16:12.450
That's high-frequency -- this is
low-frequency, zero frequency.
00:16:12.450 --> 00:16:15.350
This stuff is the highest
frequency we could have,
00:16:15.350 --> 00:16:23.470
and if the noise, the jitter is
producing that sort of output,
00:16:23.470 --> 00:16:26.470
but a smooth lecture
like this one
00:16:26.470 --> 00:16:30.740
is, has very little of
that highest frequency,
00:16:30.740 --> 00:16:32.700
very little noise
in this lecture.
00:16:32.700 --> 00:16:37.150
OK, so we throw away
whatever there is,
00:16:37.150 --> 00:16:40.860
and we're left with
just a few coefficients,
00:16:40.860 --> 00:16:48.560
and then we reconstruct a
signal using those coefficients.
00:16:48.560 --> 00:16:53.580
We take those coefficients,
times their basis vectors,
00:16:53.580 --> 00:16:59.040
but this sum doesn't have
sixty four terms any more.
00:16:59.040 --> 00:17:02.130
Probably, it has about
two or three terms.
00:17:02.130 --> 00:17:03.230
So that would --
00:17:03.230 --> 00:17:05.109
say it has three terms.
00:17:05.109 --> 00:17:07.000
From sixty four down
to three, that's
00:17:07.000 --> 00:17:09.670
compression of
twenty one to one.
00:17:09.670 --> 00:17:12.500
That's the kind of compression
you're looking for.
00:17:12.500 --> 00:17:17.480
And everybody is looking for
that sort of compression.
00:17:17.480 --> 00:17:19.890
Let's see, I guess
I met the problem
00:17:19.890 --> 00:17:23.990
with the FBI and fingerprints.
00:17:23.990 --> 00:17:26.390
So there's a whole
lot of still images.
00:17:26.390 --> 00:17:30.590
You know, with your thumb,
you make these inky marks
00:17:30.590 --> 00:17:34.300
which go somewhere.
00:17:34.300 --> 00:17:38.460
it used to go to Washington
and get stored in a big file.
00:17:38.460 --> 00:17:44.240
So Washington had a file of
thirty million murderers,
00:17:44.240 --> 00:17:50.670
cheaters on quizzes,
other stuff,
00:17:50.670 --> 00:17:56.280
and actually, there was no
way to retrieve them in time.
00:17:56.280 --> 00:17:59.740
So suppose you're at the
police station, they say, OK,
00:17:59.740 --> 00:18:03.650
this person may have done
this, check with Washington,
00:18:03.650 --> 00:18:07.410
have they got -- are his or
her fingerprints on file?
00:18:07.410 --> 00:18:12.220
Well, Washington won't know
the answer within a week
00:18:12.220 --> 00:18:15.870
if it's got filing cabinets
full of fingerprints.
00:18:15.870 --> 00:18:21.190
So of course, the natural
step is digitizing.
00:18:21.190 --> 00:18:24.380
So all fingerprints
are now digitized,
00:18:24.380 --> 00:18:30.090
so now it's at least electronic,
but still there's too much
00:18:30.090 --> 00:18:33.580
information in each one.
00:18:33.580 --> 00:18:39.260
I mean, you can't search
through that many, fingerprints
00:18:39.260 --> 00:18:45.040
if the digital image is five
twelve squared by five twelve
00:18:45.040 --> 00:18:47.280
squared, if it's
that many pixels.
00:18:47.280 --> 00:18:49.110
So you get compressed.
00:18:49.110 --> 00:18:54.970
So the FBI had to decide what
basis to choose for compression
00:18:54.970 --> 00:18:55.900
of fingerprints.
00:18:55.900 --> 00:18:59.510
And then they built a big new
facility in West Virginia,
00:18:59.510 --> 00:19:03.220
and that's where
fingerprints now are sent.
00:19:03.220 --> 00:19:06.110
So I think, if you get
your fingerprints done now
00:19:06.110 --> 00:19:09.610
at the police station, if it's
an up-to-date police station,
00:19:09.610 --> 00:19:13.990
it happens digitally, and
the signal is sent digitally,
00:19:13.990 --> 00:19:18.560
and then in West Virginia,
it's compressed and indexed.
00:19:18.560 --> 00:19:21.960
And then, if they
want to find you,
00:19:21.960 --> 00:19:26.710
they can do it within minutes
instead of within a week.
00:19:26.710 --> 00:19:27.410
OK.
00:19:27.410 --> 00:19:31.860
So this compression comes
up for signals, for images,
00:19:31.860 --> 00:19:36.020
for video -- which is,
like these lectures --
00:19:36.020 --> 00:19:38.400
there's another aspect.
00:19:38.400 --> 00:19:43.980
You could treat the video as one
still image after another one,
00:19:43.980 --> 00:19:50.770
and compress each one, and
then run them and make a video.
00:19:50.770 --> 00:19:52.950
But that misses --
00:19:52.950 --> 00:19:55.990
well, you can see why
that's not optimal.
00:19:55.990 --> 00:20:04.570
In a video thing, you
have a sequence of images,
00:20:04.570 --> 00:20:14.340
so video is really a sequence
of images but what about one
00:20:14.340 --> 00:20:16.800
image to the next image?
00:20:16.800 --> 00:20:18.780
They're extremely correlated.
00:20:18.780 --> 00:20:21.930
I mean that I'm getting an
image every split-second,
00:20:21.930 --> 00:20:24.200
and also, I'm moving slightly.
00:20:24.200 --> 00:20:29.690
That's what's producing the,
jumpy motion on the video.
00:20:29.690 --> 00:20:35.010
But I'm not, like, you know --
00:20:35.010 --> 00:20:39.270
each image in the sequence is
pretty close to the one before.
00:20:39.270 --> 00:20:44.480
So you have to use, like,
prediction and correction.
00:20:44.480 --> 00:20:49.700
I mean, the image of me one
instant -- one time-step later,
00:20:49.700 --> 00:20:51.830
you would assume
would be the same,
00:20:51.830 --> 00:20:54.260
and then plus a
small correction.
00:20:54.260 --> 00:20:58.620
And you would only code and
digitize the correction,
00:20:58.620 --> 00:21:00.970
and compress the correction.
00:21:00.970 --> 00:21:05.350
So a sequence of images
that's highly correlated
00:21:05.350 --> 00:21:11.250
and the problem in
compression is always
00:21:11.250 --> 00:21:14.010
to use this
correlation, this fact
00:21:14.010 --> 00:21:17.960
that, in time, or
in space, things
00:21:17.960 --> 00:21:23.650
don't change instantly, they're
very often smooth changes,
00:21:23.650 --> 00:21:29.170
and, you can predict one
value from the previous value.
00:21:29.170 --> 00:21:29.670
OK.
00:21:29.670 --> 00:21:36.240
So those are applications
which are pure linear algebra.
00:21:36.240 --> 00:21:42.680
I could, well, maybe you'll
allow me to tell you,
00:21:42.680 --> 00:21:47.650
and the book describes,
the new basis that's
00:21:47.650 --> 00:21:50.150
the competition for Fourier.
00:21:50.150 --> 00:21:54.680
So the competition for
Fourier is called wavelets,
00:21:54.680 --> 00:21:58.460
and I can describe what
that basis is like,
00:21:58.460 --> 00:22:00.680
say, in the eight by eight case.
00:22:00.680 --> 00:22:03.240
So the eight by
eight wavelet basis
00:22:03.240 --> 00:22:08.410
is the vector of all
ones, eight ones, then
00:22:08.410 --> 00:22:13.030
the vector of four ones
and four minus ones,
00:22:13.030 --> 00:22:18.770
then the vector of two
ones, and two minus ones,
00:22:18.770 --> 00:22:20.620
and four zeroes.
00:22:20.620 --> 00:22:24.930
And also the vector
of four zeroes
00:22:24.930 --> 00:22:28.480
and two ones and two minus ones.
00:22:28.480 --> 00:22:31.630
So now I'm up to four, and
I need four more, right?
00:22:31.630 --> 00:22:32.870
For R^8?
00:22:32.870 --> 00:22:39.690
The next basis vector will be
one minus one and six zeroes,
00:22:39.690 --> 00:22:44.910
and then three more like
that, with the one minus one
00:22:44.910 --> 00:22:47.400
there, and there, and there.
00:22:47.400 --> 00:22:52.490
So those are eight vectors
in eight-dimensional space,
00:22:52.490 --> 00:22:59.830
those are called wavelets,
and it's a very simple wavelet
00:22:59.830 --> 00:23:03.131
choice, it's a
more sophisticated
00:23:03.131 --> 00:23:03.630
choice.
00:23:03.630 --> 00:23:09.320
This is a little jumpy, to
jump between one and minus one.
00:23:09.320 --> 00:23:12.920
And, actually, you
can see, now, suppose
00:23:12.920 --> 00:23:16.820
you compare the wavelet basis
with the Fourier basis above.
00:23:16.820 --> 00:23:22.450
How could I write this guy,
which is in the Fourier basis,
00:23:22.450 --> 00:23:24.780
it's an eight --
it's a vector in R^8.
00:23:24.780 --> 00:23:28.350
How would I write
that as a combination
00:23:28.350 --> 00:23:30.920
of the wavelet basis?
00:23:30.920 --> 00:23:34.250
Have I told you enough about the
wavelet basis that you can see,
00:23:34.250 --> 00:23:37.750
how does this very fast guy --
00:23:37.750 --> 00:23:41.700
what combination of the wavelet
basis is that very fast guy?
00:23:41.700 --> 00:23:45.020
It would be this one --
00:23:45.020 --> 00:23:47.430
it would be the sum
of these four, right?
00:23:47.430 --> 00:23:50.360
That very fast guy will
be that one minus one,
00:23:50.360 --> 00:23:53.740
and the next one, and the
next one, and the next one.
00:23:53.740 --> 00:23:59.350
So this is the sum of
those last four wavelets.
00:23:59.350 --> 00:24:02.010
This one, we've kept, and so on.
00:24:02.010 --> 00:24:06.410
So, each -- well, every --
well, that's what a basis does.
00:24:06.410 --> 00:24:10.250
Every vector in R^8 is
some combination of those,
00:24:10.250 --> 00:24:15.210
and for the linear algebra --
so the linear algebra is this
00:24:15.210 --> 00:24:18.860
step, find the coefficient.
00:24:18.860 --> 00:24:21.900
That's the step we want to take.
00:24:21.900 --> 00:24:28.010
What if I give you the basis,
like this wavelet basis,
00:24:28.010 --> 00:24:34.360
and I give you the pixel -- so
here are the pixel values, P1,
00:24:34.360 --> 00:24:35.660
P2, down to P8 --
00:24:38.810 --> 00:24:39.930
what's the job?
00:24:39.930 --> 00:24:42.220
What's the linear algebra here?
00:24:42.220 --> 00:24:48.400
So these are the values, this
is in the standard basis, right?
00:24:48.400 --> 00:24:53.990
Those are just the values
at eight successive points.
00:24:53.990 --> 00:24:57.560
I guess I'm dropping down to
one dimension, instead of eight
00:24:57.560 --> 00:25:01.180
by eight, I'm just going
to take eight pixel values
00:25:01.180 --> 00:25:04.540
along that first top row.
00:25:04.540 --> 00:25:06.160
So what do I want to do?
00:25:06.160 --> 00:25:10.720
In standard basis, here
are the pixel values.
00:25:10.720 --> 00:25:15.880
I want to write that as a
combination of c1 times this
00:25:15.880 --> 00:25:21.220
guy, plus c2 times
this guy, plus c3,
00:25:21.220 --> 00:25:25.590
these are the coefficients,
plus c4 times this one --
00:25:25.590 --> 00:25:26.720
do you see what I'm doing?
00:25:26.720 --> 00:25:32.010
I want to write this vector P
as a combination of c1 times
00:25:32.010 --> 00:25:36.395
the first wavelet plus c8
times the eighth wavelet.
00:25:39.910 --> 00:25:42.640
That's the transform step.
00:25:42.640 --> 00:25:43.970
That's the lossless step.
00:25:43.970 --> 00:25:47.580
That's the step from P --
oh, I'm calling it P here,
00:25:47.580 --> 00:25:50.790
and I called it x
there, so let me --
00:25:50.790 --> 00:25:54.520
at the risk of moving, and
therefore making this jumpy --
00:25:54.520 --> 00:26:00.780
suppose the signal I'm now
calling P, that a pixel values,
00:26:00.780 --> 00:26:02.860
and I'm looking for
the coefficients.
00:26:02.860 --> 00:26:05.360
OK, tell me how to do it.
00:26:08.360 --> 00:26:12.960
If I give you eight
basis vectors,
00:26:12.960 --> 00:26:17.930
and I give you the input signal,
and I ask for the coefficients,
00:26:17.930 --> 00:26:19.950
what do I do?
00:26:19.950 --> 00:26:22.690
What's the step?
00:26:22.690 --> 00:26:28.650
I'm trying to solve this, I want
to know the eight coefficients,
00:26:28.650 --> 00:26:32.200
so I'm changing from the
standard basis, which is just
00:26:32.200 --> 00:26:37.220
the eight gray-scale values
to the wavelet basis, where
00:26:37.220 --> 00:26:41.490
the same vector is
represented by eight numbers.
00:26:41.490 --> 00:26:44.970
It's got to take eight numbers
to tell you a vector in R^8,
00:26:44.970 --> 00:26:49.460
and those eight numbers are
the coefficients of the basis.
00:26:49.460 --> 00:26:54.720
Look, we've done
this thing before.
00:26:54.720 --> 00:26:58.510
There is the equation
in vector notation,
00:26:58.510 --> 00:27:01.610
we want to see it as a matrix.
00:27:01.610 --> 00:27:06.870
This is a combination of
columns of the wavelet matrix,
00:27:06.870 --> 00:27:07.430
right?
00:27:07.430 --> 00:27:14.440
This is P equals
c1, c2, down to c8,
00:27:14.440 --> 00:27:17.370
and these guys are the columns.
00:27:17.370 --> 00:27:19.990
I mean, this is the step
that we're constantly
00:27:19.990 --> 00:27:22.700
taking in this course,
the first basis vector
00:27:22.700 --> 00:27:25.920
goes in the first column,
the second basis vector
00:27:25.920 --> 00:27:29.170
goes in the second
column, and so on,
00:27:29.170 --> 00:27:32.760
the eight columns of
this wavelet matrix
00:27:32.760 --> 00:27:35.130
are the eight basis vectors.
00:27:35.130 --> 00:27:40.710
This is a wavelet matrix W.
00:27:40.710 --> 00:27:46.330
So, the step to change basis
-- so now I'm finally coming
00:27:46.330 --> 00:27:51.780
to this change-of-basis, so
the change of basis that,
00:27:51.780 --> 00:27:55.370
let me stay with this
board, but -- well,
00:27:55.370 --> 00:27:56.790
let me just go above it, here.
00:28:00.500 --> 00:28:07.200
So the standard basis, we know,
the wavelet basis we have here,
00:28:07.200 --> 00:28:14.300
and the transform is
simply, solve the equations,
00:28:14.300 --> 00:28:23.300
P=W C. So the coefficients
are W inverse P. Right.
00:28:23.300 --> 00:28:26.920
This shows a critical point.
00:28:26.920 --> 00:28:31.630
A good basis has a
nice, fast, inverse.
00:28:31.630 --> 00:28:38.260
So good basis means what?
00:28:38.260 --> 00:28:40.470
So this is like the
billion-dollar competition,
00:28:40.470 --> 00:28:41.710
Eh? and it's not over yet.
00:28:41.710 --> 00:28:46.360
People are going to come up
with better bases than these.
00:28:46.360 --> 00:28:53.420
So a good basis will be, first
good thing would be fast.
00:28:53.420 --> 00:28:58.600
I have to be able to multiply
by W fast, and multiply by W --
00:28:58.600 --> 00:29:00.930
by its inverse fast.
00:29:00.930 --> 00:29:04.630
That's -- if a basis doesn't
allow you to do that fast,
00:29:04.630 --> 00:29:09.920
then it's going to take so much
time that you can't afford it.
00:29:09.920 --> 00:29:12.210
So these bases --
00:29:12.210 --> 00:29:15.900
the Fourier basis,
everybody said, OK, I
00:29:15.900 --> 00:29:18.830
know how to deal quickly
with the Fourier basis,
00:29:18.830 --> 00:29:23.690
because we have something called
the Fast Fourier Transform.
00:29:23.690 --> 00:29:27.510
So there's a FFT that came
in my earlier lecture,
00:29:27.510 --> 00:29:30.270
and comes in the last
chapter of the book,
00:29:30.270 --> 00:29:34.400
so change-of-basis is done
-- if, for the Fourier basis,
00:29:34.400 --> 00:29:39.380
it's done fast by the FFT
and there's a fast wavelet
00:29:39.380 --> 00:29:40.420
transform.
00:29:40.420 --> 00:29:47.040
I can change, for
this wavelet example,
00:29:47.040 --> 00:29:49.320
this matrix is easy to invert.
00:29:49.320 --> 00:29:52.720
It's just somebody
had a smart idea
00:29:52.720 --> 00:29:56.930
in choosing that wavelet
basis and inverting it,
00:29:56.930 --> 00:29:58.580
it has a nice inverse.
00:29:58.580 --> 00:30:01.290
Actually, you can see why
it has a nice inverse.
00:30:01.290 --> 00:30:06.340
Do you see any property of
these eight basis vectors?
00:30:06.340 --> 00:30:07.940
Well, I've only
written five of them,
00:30:07.940 --> 00:30:11.220
but if you see that
property for those five,
00:30:11.220 --> 00:30:13.220
you'll see it for the three
00:30:13.220 --> 00:30:14.030
remaining.
00:30:14.030 --> 00:30:16.890
Well, if I give you
those eight vectors
00:30:16.890 --> 00:30:20.650
and ask, what's a nice property?
00:30:20.650 --> 00:30:23.770
Well, you would say, first,
they're all ones and minus ones
00:30:23.770 --> 00:30:25.120
and zeroes.
00:30:25.120 --> 00:30:32.530
So every multiplication is very
fast using -- just in binary.
00:30:32.530 --> 00:30:34.820
But what's the other great
property of those vectors?
00:30:34.820 --> 00:30:37.750
Anybody see it?
00:30:37.750 --> 00:30:41.210
So, of course, when I
think about a basis,
00:30:41.210 --> 00:30:42.480
one nice property --
00:30:42.480 --> 00:30:45.320
I don't have to have it, but
I'm happy if it's there --
00:30:45.320 --> 00:30:48.830
is that they're orthogonal.
00:30:48.830 --> 00:30:51.070
If the basis vectors
are orthogonal,
00:30:51.070 --> 00:30:52.780
then I'm in good shape.
00:30:52.780 --> 00:30:54.380
And these are... do you see?
00:30:54.380 --> 00:30:56.840
Take the dot product
of that with that,
00:30:56.840 --> 00:31:00.910
you get four plus ones and
four minus ones, you get zero.
00:31:00.910 --> 00:31:03.060
Take the dot product
of that with that.
00:31:03.060 --> 00:31:05.880
You get two plus ones
and two minus ones.
00:31:05.880 --> 00:31:07.660
Or the dot product
of that with that.
00:31:07.660 --> 00:31:10.620
Two plus ones and
two minus ones.
00:31:10.620 --> 00:31:14.060
You can easily check that
that's an orthogonal basis.
00:31:14.060 --> 00:31:15.875
It's not orthonormal.
00:31:18.870 --> 00:31:22.470
To fix it up, I should
divide by the length,
00:31:22.470 --> 00:31:23.920
to make them unit vectors.
00:31:23.920 --> 00:31:26.000
Let's suppose I do that.
00:31:26.000 --> 00:31:29.320
So somewhere in here, I've
got to account for the fact
00:31:29.320 --> 00:31:32.590
that this has length
square root of eight, that
00:31:32.590 --> 00:31:34.520
has length square
root of four, that
00:31:34.520 --> 00:31:36.750
has length square root of two.
00:31:36.750 --> 00:31:40.750
But that's just a constant
factor that's easy to --
00:31:40.750 --> 00:31:42.620
so suppose we've done that.
00:31:42.620 --> 00:31:47.420
Then, tell me what's W inverse?
00:31:47.420 --> 00:31:52.270
That's what chapter four,
section four point four
00:31:52.270 --> 00:31:53.630
was about.
00:31:53.630 --> 00:31:57.470
If we have orthonormal
columns then
00:31:57.470 --> 00:32:02.330
the inverse is the
same as the transpose.
00:32:02.330 --> 00:32:06.470
So if we have a fast way to
multiply by W, which we do,
00:32:06.470 --> 00:32:09.010
the inverse is going
to look just the same,
00:32:09.010 --> 00:32:10.450
and we'll have a fast way to do
00:32:10.450 --> 00:32:11.680
W inverse.
00:32:11.680 --> 00:32:17.760
So that's the wavelet basis
passes this requirement
00:32:17.760 --> 00:32:19.020
for fast.
00:32:19.020 --> 00:32:21.250
We can use it fast.
00:32:21.250 --> 00:32:24.530
But there's a second
requirement, is it any good?
00:32:24.530 --> 00:32:26.420
Because the the
very fastest thing
00:32:26.420 --> 00:32:30.060
we could do is not to
change basis at all.
00:32:30.060 --> 00:32:30.560
Right?
00:32:30.560 --> 00:32:33.660
The fastest thing would be, OK,
stay with the standard basis,
00:32:33.660 --> 00:32:36.020
stay with eight pixel values.
00:32:36.020 --> 00:32:40.320
But that was poor from
compression point of view,
00:32:40.320 --> 00:32:41.020
right?
00:32:41.020 --> 00:32:45.380
Those eight pixel values, if I
just took those eight numbers,
00:32:45.380 --> 00:32:48.800
I can't throw some
of those away.
00:32:48.800 --> 00:32:52.550
If I throw away
ninety percent --
00:32:52.550 --> 00:32:55.110
if I compress ten to
one, and throw away
00:32:55.110 --> 00:32:57.610
ninety percent of
my pixel values,
00:32:57.610 --> 00:33:00.660
well, my picture's
just gone dark.
00:33:00.660 --> 00:33:03.470
Whereas, the basis
that was good,
00:33:03.470 --> 00:33:05.960
the wavelet basis or
the Fourier basis,
00:33:05.960 --> 00:33:12.910
if I throw away c5, c6, c7,
and c8, all I'm throwing away
00:33:12.910 --> 00:33:16.060
is little blips that
are probably there
00:33:16.060 --> 00:33:18.870
in very small amounts.
00:33:18.870 --> 00:33:23.210
So the second property that
we need is good compression.
00:33:23.210 --> 00:33:32.470
So first, it has to be fast, and
secondly, a few basis vectors
00:33:32.470 --> 00:33:36.270
should come close to the signal.
00:33:36.270 --> 00:33:40.140
So a few is enough.
00:33:40.140 --> 00:33:41.840
Can I write it that way?
00:33:41.840 --> 00:33:50.470
A few basis vectors are enough
to reproduce the image just
00:33:50.470 --> 00:33:54.060
exactly as on a video
of these 18.06 lectures.
00:33:54.060 --> 00:33:58.700
Uh, I don't know what the
compression rate is, I'll ask,
00:33:58.700 --> 00:34:03.270
David, who does the
compression -- and, by the way,
00:34:03.270 --> 00:34:09.639
I'll try to get the lectures,
that are relevant for the quiz
00:34:09.639 --> 00:34:13.310
up onto the Web in time.
00:34:13.310 --> 00:34:15.980
So I'll send them
a message today.
00:34:15.980 --> 00:34:21.679
So, he's using the Fourier
basis because the JPEG --
00:34:21.679 --> 00:34:26.219
so JPEG two thousand, which will
be the next standard for image
00:34:26.219 --> 00:34:29.520
compression, will
include wavelets.
00:34:29.520 --> 00:34:33.139
So, I mean, you're actually
getting a kind of up-to-date,
00:34:33.139 --> 00:34:40.030
picture of where this big world
of signal and image processing
00:34:40.030 --> 00:34:41.230
is.
00:34:41.230 --> 00:34:45.090
That Fourier is
what everybody knew,
00:34:45.090 --> 00:34:47.949
and what people
automatically used,
00:34:47.949 --> 00:34:53.460
and the new one is
wavelets, where this is
00:34:53.460 --> 00:34:55.580
the simplest set of wavelets.
00:34:55.580 --> 00:34:59.650
And this isn't the one that
the FBI uses, by the way,
00:34:59.650 --> 00:35:03.730
the FBI uses a smoother
wavelet, instead
00:35:03.730 --> 00:35:08.690
of jumping from one to minus
one, it's a smooth, Cutoff.
00:35:08.690 --> 00:35:15.010
and, that's what we'll be
in in JPEG two thousand.
00:35:15.010 --> 00:35:17.130
OK, so that's that application.
00:35:17.130 --> 00:35:23.520
Now, let me come to the
math, the linear algebra
00:35:23.520 --> 00:35:25.780
part of the lecture.
00:35:25.780 --> 00:35:29.470
Well, we've actually
seen a change-of-basis.
00:35:29.470 --> 00:35:33.420
So let -- let me just review
that eh-eh change-of-basis
00:35:33.420 --> 00:35:37.980
idea, and then the i- and then
the transformation to a matrix.
00:35:37.980 --> 00:35:38.590
OK.
00:35:38.590 --> 00:35:43.630
So this, I hope you see
that these applications are
00:35:43.630 --> 00:35:45.360
really big.
00:35:45.360 --> 00:35:49.100
Now, I have to talk a little
about change-of-basis,
00:35:49.100 --> 00:35:50.960
and a little about that.
00:35:50.960 --> 00:35:52.380
The matrix.
00:35:52.380 --> 00:35:53.300
OK.
00:35:53.300 --> 00:35:55.880
OK.
00:35:55.880 --> 00:35:56.380
OK.
00:35:56.380 --> 00:35:57.171
So change-of-basis.
00:36:01.820 --> 00:36:10.390
Basically, forgive
that put, OK, I have,
00:36:10.390 --> 00:36:15.250
I have my vector in
one basis, and I want
00:36:15.250 --> 00:36:16.500
to change to a different one.
00:36:16.500 --> 00:36:19.590
Actually, you saw it
for the wavelet case.
00:36:19.590 --> 00:36:22.680
So I need the --
00:36:22.680 --> 00:36:31.300
let the matrix W,
and the columns of W
00:36:31.300 --> 00:36:34.330
be the new basis vectors.
00:36:40.830 --> 00:36:43.820
Then the change-of-basis
involves, just
00:36:43.820 --> 00:36:46.050
as it did there, W inverse.
00:36:46.050 --> 00:36:52.510
So we have the vector,
say, x, in the old basis,
00:36:52.510 --> 00:37:04.050
and that converts to a vector,
let's say, c, in the new basis,
00:37:04.050 --> 00:37:16.655
and the relation is exactly what
we had there, that x is W c.
00:37:20.400 --> 00:37:22.680
That's the step we have to take.
00:37:22.680 --> 00:37:26.640
There's a matrix W that
gives us a change-of-basis.
00:37:26.640 --> 00:37:28.650
OK.
00:37:28.650 --> 00:37:38.110
What I want to do is think about
transformations on matrices.
00:37:38.110 --> 00:37:43.790
So here's the question
to complete this lecture.
00:37:43.790 --> 00:37:50.930
Suppose I have a linear
transformation T.
00:37:50.930 --> 00:37:57.390
So we would think of it as an
eight -- as a n by n matrix.
00:37:57.390 --> 00:38:02.400
And it's computed with
respect to a certain basis.
00:38:02.400 --> 00:38:04.960
So T -- no, I'm sorry.
00:38:04.960 --> 00:38:07.780
I've got the
transformation T, period.
00:38:07.780 --> 00:38:09.760
That's taking
eight-dimensional space
00:38:09.760 --> 00:38:12.240
to eight-dimensional space.
00:38:12.240 --> 00:38:15.340
Now, let's get
matrices in there.
00:38:15.340 --> 00:38:16.650
OK.
00:38:16.650 --> 00:38:27.350
So, with respect to a first
basis, say v1 up to v8,
00:38:27.350 --> 00:38:35.550
it has a matrix A.
00:38:35.550 --> 00:38:39.380
I'm just setting
up letters here.
00:38:39.380 --> 00:38:48.000
With respect to a second basis,
say, I'll make it u1 up to --
00:38:48.000 --> 00:39:00.000
or w1, since I've used (w)s,
w1 up to w8, it has a matrix B.
00:39:00.000 --> 00:39:05.830
And my question is, what's
the connection between A
00:39:05.830 --> 00:39:09.890
How is the matrix -- the
transformation T is settled.
00:39:09.890 --> 00:39:11.240
and B?
00:39:11.240 --> 00:39:16.660
We could say, it's a
rotation, for example.
00:39:16.660 --> 00:39:18.490
So that would be
one transformation
00:39:18.490 --> 00:39:21.640
of eight-dimensional space,
just spin it a little.
00:39:24.740 --> 00:39:25.930
Or project it.
00:39:25.930 --> 00:39:32.070
Or whatever linear
transformation we've got.
00:39:32.070 --> 00:39:35.240
Now, we have to remember --
00:39:35.240 --> 00:39:41.420
my first step is to remind you
how you create that matrix A.
00:39:41.420 --> 00:39:45.530
Then my second step is, we would
use the same method to create
00:39:45.530 --> 00:39:49.520
B, but because it came from
the same transformation,
00:39:49.520 --> 00:39:52.550
there's got to be a
relation between A and B.
00:39:52.550 --> 00:39:55.660
What's the relation
between A and B?
00:39:55.660 --> 00:40:02.230
And let me jump to the
answer on that one.
00:40:02.230 --> 00:40:05.890
That if I have the
same transformation,
00:40:05.890 --> 00:40:10.600
and I'm compute on its matrix in
one basis, and then I computer
00:40:10.600 --> 00:40:16.910
it in another basis, those
two matrices are similar.
00:40:16.910 --> 00:40:19.350
So these two
matrices are similar.
00:40:19.350 --> 00:40:23.080
Now, do you remember what
similar matrices meant?
00:40:23.080 --> 00:40:23.630
Similar.
00:40:23.630 --> 00:40:27.935
A is similar to -- the
two matrices are similar.
00:40:27.935 --> 00:40:28.435
Similar.
00:40:31.280 --> 00:40:33.560
And what do I mean by that?
00:40:33.560 --> 00:40:40.700
I mean that I take the matrix
B, and I can compute it
00:40:40.700 --> 00:40:47.320
from the matrix A using
some similarity, some matrix
00:40:47.320 --> 00:40:52.180
M on one side, and M
inverse on the other.
00:40:52.180 --> 00:40:55.530
And this M will be the
change-of-basis matrix.
00:40:59.490 --> 00:41:03.740
This part of the lecture
is, admittedly, compressed.
00:41:03.740 --> 00:41:05.820
What I wanted you to --
00:41:05.820 --> 00:41:12.670
it's really the conclusion
that I want you to spot.
00:41:12.670 --> 00:41:17.980
Now, I have to go back and
say, what does it mean for A
00:41:17.980 --> 00:41:21.830
to be the matrix of
this transformation T.
00:41:21.830 --> 00:41:23.710
So I have to remind
you what that meant,
00:41:23.710 --> 00:41:26.280
that was in the last lecture.
00:41:26.280 --> 00:41:32.360
Then this is the conclusion
that if I change to a different
00:41:32.360 --> 00:41:36.720
basis, we now know -- see, if
I change to a different basis,
00:41:36.720 --> 00:41:38.340
two things happen.
00:41:38.340 --> 00:41:41.880
Every vector has
new coordinates.
00:41:41.880 --> 00:41:45.190
There, the rule is this one,
between the old coordinates
00:41:45.190 --> 00:41:46.660
and the new ones.
00:41:46.660 --> 00:41:50.570
Every matrix changes, every
transformation has a new
00:41:50.570 --> 00:41:51.560
matrix.
00:41:51.560 --> 00:41:55.120
And the new matrix
is related this way,
00:41:55.120 --> 00:41:57.710
the M could be
the same as the W.
00:41:57.710 --> 00:42:00.300
The M there would be the W here.
00:42:00.300 --> 00:42:00.960
OK.
00:42:00.960 --> 00:42:04.790
So, can I, in the
remaining minutes,
00:42:04.790 --> 00:42:09.620
recapture my lecture -- the
end of my lecture that was just
00:42:09.620 --> 00:42:13.740
before Thanksgiving,
about the matrix?
00:42:13.740 --> 00:42:15.660
OK.
00:42:15.660 --> 00:42:16.550
What's the matrix?
00:42:19.090 --> 00:42:21.040
And I'll just take one basis.
00:42:21.040 --> 00:42:26.830
So now this part is going
to go onto this board here.
00:42:26.830 --> 00:42:27.970
What is the matrix?
00:42:27.970 --> 00:42:28.740
What is A?
00:42:32.830 --> 00:42:33.930
OK.
00:42:33.930 --> 00:42:42.280
Using a basis v1 up to v8.
00:42:42.280 --> 00:42:43.161
Mm.
00:42:43.161 --> 00:42:43.660
OK.
00:42:43.660 --> 00:42:46.350
What's the point?
00:42:46.350 --> 00:42:51.180
The point is, if I know
what the transformation does
00:42:51.180 --> 00:42:56.990
to those eight basis vectors,
I know it completely.
00:42:56.990 --> 00:42:59.140
I know T, I know
everything about T,
00:42:59.140 --> 00:43:11.200
I know T completely from knowing
T of V -- what T does to v1,
00:43:11.200 --> 00:43:15.040
what T does to v2,
what T does to v8.
00:43:18.020 --> 00:43:19.760
Why is that?
00:43:19.760 --> 00:43:23.170
It's because T is a
linear transformation.
00:43:23.170 --> 00:43:26.700
So that if I know what
these outputs are --
00:43:26.700 --> 00:43:29.360
so these are the
inputs v1 up to v8,
00:43:29.360 --> 00:43:31.560
these are the outputs
from the transformation,
00:43:31.560 --> 00:43:35.500
like everyone rotated,
everyone projected,
00:43:35.500 --> 00:43:38.010
whatever transformation
I've done,
00:43:38.010 --> 00:43:42.090
then why is it that
I know everything?
00:43:42.090 --> 00:43:43.660
How does linearity work?
00:43:43.660 --> 00:43:45.020
Why?
00:43:45.020 --> 00:43:56.150
This is because every x is
some combination of these basis
00:43:56.150 --> 00:43:57.870
vectors, right?
00:43:57.870 --> 00:44:05.570
c1v1, c2v2, c8v8,
they were a basis.
00:44:05.570 --> 00:44:07.010
That's the whole
point of a basis,
00:44:07.010 --> 00:44:09.790
that every vector is a
combination of the basis
00:44:09.790 --> 00:44:12.210
vectors in exactly one way.
00:44:12.210 --> 00:44:14.737
And then, what is T of x?
00:44:18.670 --> 00:44:23.460
The point is, I claim that
we know T of x completely
00:44:23.460 --> 00:44:27.770
for every x, because every x
is a combination of those --
00:44:27.770 --> 00:44:31.740
and now we use the linear
transformation part to say that
00:44:31.740 --> 00:44:37.890
the output from x has to be c1
times the output from v1 plus
00:44:37.890 --> 00:44:43.330
v2 times the output
from v2, and so on.
00:44:43.330 --> 00:44:46.360
Up through c8 times
the output from v8.
00:44:46.360 --> 00:44:49.790
So this is like just saying, OK.
00:44:49.790 --> 00:44:54.900
We know everything
when we know what
00:44:54.900 --> 00:44:57.790
T does to each basis vector.
00:44:57.790 --> 00:44:58.720
OK.
00:44:58.720 --> 00:45:02.000
So those are the
eight things we need.
00:45:02.000 --> 00:45:08.330
Now -- but we need these
answers in this basis.
00:45:08.330 --> 00:45:13.450
So this first output
is some combination
00:45:13.450 --> 00:45:14.665
of the eight basis vectors.
00:45:17.480 --> 00:45:24.680
So write T acting on the
first input -- in other words,
00:45:24.680 --> 00:45:27.880
write the first output as
a combination of the basis
00:45:27.880 --> 00:45:38.040
vectors, say a11 v1 +
a21 v2 and so on a81 v8.
00:45:41.740 --> 00:45:50.110
Write T of v2 as some
combination a12 of v1,
00:45:50.110 --> 00:45:53.240
a22 of v2 and so on.
00:45:53.240 --> 00:45:57.270
I'm creating the matrix
A, column by column.
00:45:57.270 --> 00:46:00.050
Those numbers go in
the first column,
00:46:00.050 --> 00:46:06.200
these numbers go in the second
column, the matrix A that thi-
00:46:06.200 --> 00:46:11.900
this -- this is our matrix that
represents T in this basis is
00:46:11.900 --> 00:46:13.110
these numbers.
00:46:13.110 --> 00:46:22.290
a11 down to a18, a21
down to a28, and so on.
00:46:22.290 --> 00:46:25.060
OK.
00:46:25.060 --> 00:46:26.120
That's the recipe.
00:46:26.120 --> 00:46:31.090
In other words, if I give you
a transformation, and a basis.
00:46:31.090 --> 00:46:32.720
So that's what I
have to give you.
00:46:32.720 --> 00:46:37.550
The inputs are the
basis and to tell you
00:46:37.550 --> 00:46:38.641
what the transformation
00:46:38.641 --> 00:46:39.140
is.
00:46:39.140 --> 00:46:42.840
And then, you tell me --
00:46:42.840 --> 00:46:47.880
you compute T for
each basis, expand
00:46:47.880 --> 00:46:51.270
that result in the
basis, and that gives you
00:46:51.270 --> 00:46:55.900
the sixty four numbers
that go into the matrix A.
00:46:55.900 --> 00:47:01.460
Let me suppose -- let's close
with the best example of all.
00:47:01.460 --> 00:47:08.810
Suppose v1 to v8, this
basis, is the eigenvectors.
00:47:08.810 --> 00:47:22.950
Suppose we have an eigenvector
basis so that T(vi)
00:47:22.950 --> 00:47:27.660
is in the same direction of vi.
00:47:27.660 --> 00:47:29.790
Now, my question is, what is A?
00:47:35.370 --> 00:47:37.370
Can you carry through the steps?
00:47:37.370 --> 00:47:40.270
Let's do them together, because
we can do it in one minute.
00:47:43.200 --> 00:47:47.120
So, we've chosen
this perfect basis.
00:47:47.120 --> 00:47:52.010
And, actually, with
signal image processing,
00:47:52.010 --> 00:47:55.520
they might look for
the eigenvectors.
00:47:55.520 --> 00:47:57.790
But that would take
more calculation
00:47:57.790 --> 00:48:02.040
time that just saying, OK,
we'll use the wavelet basis.
00:48:02.040 --> 00:48:04.510
Or, OK, we'll use
the Fourier basis.
00:48:04.510 --> 00:48:08.130
But the very best basis
is the eigenvector basis.
00:48:08.130 --> 00:48:10.320
OK, what's the matrix?
00:48:10.320 --> 00:48:12.480
So, what's the first
column of the matrix?
00:48:15.180 --> 00:48:17.130
How do I get the first column?
00:48:17.130 --> 00:48:19.185
I take the first
basis vector v1.
00:48:22.000 --> 00:48:25.070
I opt -- I look to see, what
does the transformation do
00:48:25.070 --> 00:48:26.940
to it?
00:48:26.940 --> 00:48:29.610
The output is lambda one v1.
00:48:29.610 --> 00:48:36.420
I express that output as a
combination so the first input
00:48:36.420 --> 00:48:39.590
is v1.
00:48:39.590 --> 00:48:47.150
Its output is lambda one v1.
00:48:47.150 --> 00:48:51.900
Now write lambda one v1 as
a combination of the basis
00:48:51.900 --> 00:48:54.330
vectors, well,
it's already done.
00:48:54.330 --> 00:48:56.720
It's just lambda one times
the first basis vector
00:48:56.720 --> 00:48:58.360
and zero times the others.
00:48:58.360 --> 00:49:03.490
So this first column will
have lambda one and zeroes.
00:49:03.490 --> 00:49:05.060
OK.
00:49:05.060 --> 00:49:10.030
Second input is v2.
00:49:10.030 --> 00:49:15.220
Output is lambda two v2.
00:49:15.220 --> 00:49:19.060
OK, write that output as
a combination of the (v)s.
00:49:19.060 --> 00:49:20.470
It's already done.
00:49:20.470 --> 00:49:23.120
It's just lambda two
times the second v.
00:49:23.120 --> 00:49:25.190
So we need, in
the second column,
00:49:25.190 --> 00:49:28.040
we have lambda two
times the second v.
00:49:28.040 --> 00:49:31.410
Well, you see what's
coming, that in that basis,
00:49:31.410 --> 00:49:37.470
in the eigenvector basis,
the matrix is diagonal.
00:49:37.470 --> 00:49:40.950
So that's the
perfect basis, that's
00:49:40.950 --> 00:49:43.840
the basis we'd love to
have for image processing,
00:49:43.840 --> 00:49:47.770
but to find the eigenvectors
of our pixel matrix
00:49:47.770 --> 00:49:49.640
would be too expensive.
00:49:49.640 --> 00:49:54.110
So we do something
cheaper and close,
00:49:54.110 --> 00:49:58.200
which is to choose a
good basis like wavelets.
00:49:58.200 --> 00:49:59.130
OK, thanks.
00:49:59.130 --> 00:50:04.490
So I'll -- quiz review
on Wednesday, all day.
00:50:04.490 --> 00:50:06.040
Thanks.