1
00:00:07,560 --> 00:00:08,640
OK.
2
00:00:08,640 --> 00:00:12,290
So, coming nearer the
end of the course,
3
00:00:12,290 --> 00:00:20,340
this lecture will be a mixture
of the linear algebra that
4
00:00:20,340 --> 00:00:22,610
comes with a change of basis.
5
00:00:22,610 --> 00:00:26,340
And a change of basis from
one basis to another basis
6
00:00:26,340 --> 00:00:30,270
is something you really
do in applications.
7
00:00:30,270 --> 00:00:33,710
And, I would like to talk
about those applications.
8
00:00:33,710 --> 00:00:36,920
I got a little bit
involved with compression.
9
00:00:36,920 --> 00:00:39,820
Compressing a signal,
compressing an image.
10
00:00:39,820 --> 00:00:43,910
And that's exactly
change-of-basis.
11
00:00:43,910 --> 00:00:48,510
And then, the main
theme in this chapter
12
00:00:48,510 --> 00:00:54,320
is th- the connection between
a linear transformation, which
13
00:00:54,320 --> 00:00:56,870
doesn't have to
have coordinates,
14
00:00:56,870 --> 00:01:01,610
and the matrix that tells
us that transformation
15
00:01:01,610 --> 00:01:04,069
with respect to coordinates.
16
00:01:04,069 --> 00:01:08,420
So the matrix is the
coordinate-based description
17
00:01:08,420 --> 00:01:11,090
of the linear transformation.
18
00:01:11,090 --> 00:01:14,480
Let me start out with
the nice part, which
19
00:01:14,480 --> 00:01:18,770
is just to tell you something
about image compression.
20
00:01:18,770 --> 00:01:21,070
Those of you -- well,
everybody's going to meet
21
00:01:21,070 --> 00:01:25,950
compression, because you know
that the amount of data that
22
00:01:25,950 --> 00:01:27,730
we're getting --
23
00:01:27,730 --> 00:01:31,640
well, these lectures
are compressed.
24
00:01:31,640 --> 00:01:37,500
So that, actually, probably
you see my motion as
25
00:01:37,500 --> 00:01:38,030
jerky?
26
00:01:38,030 --> 00:01:39,050
Shall I use that word?
27
00:01:39,050 --> 00:01:41,760
Have you looked on the web?
28
00:01:41,760 --> 00:01:46,470
I should like to
find a better word.
29
00:01:46,470 --> 00:01:47,770
Compressed, let's say.
30
00:01:47,770 --> 00:01:53,900
So the complete signal is, of
course, in those video cameras,
31
00:01:53,900 --> 00:01:56,360
and in the videotape,
but that goes
32
00:01:56,360 --> 00:01:59,540
to the bottom of building
nine, and out of that
33
00:01:59,540 --> 00:02:06,240
comes a jumpy motion because
it uses a standard system
34
00:02:06,240 --> 00:02:09,020
for compressing images.
35
00:02:09,020 --> 00:02:13,990
And, you'll notice that the
stuff that sits on the board
36
00:02:13,990 --> 00:02:20,200
comes very clearly, but
it's my motion that needs
37
00:02:20,200 --> 00:02:22,150
a whole lot of bits, right?
38
00:02:22,150 --> 00:02:26,060
So, and if I were to run up
and back up there and back,
39
00:02:26,060 --> 00:02:29,380
that would need too
many bits, and I'd
40
00:02:29,380 --> 00:02:32,860
be compressed even more.
41
00:02:32,860 --> 00:02:36,180
So, what does compression mean?
42
00:02:36,180 --> 00:02:39,100
Let me just think
of a still image.
43
00:02:39,100 --> 00:02:44,570
And of course, satellites,
and computations
44
00:02:44,570 --> 00:02:47,420
of the climate,
computations of combustion,
45
00:02:47,420 --> 00:02:52,410
the computers and
sensors of all kinds
46
00:02:52,410 --> 00:02:55,910
are just giving us
overwhelming amounts of data.
47
00:02:55,910 --> 00:02:58,720
The Web is, too.
48
00:02:58,720 --> 00:03:02,910
Now, some compression
can be done with no loss.
49
00:03:02,910 --> 00:03:07,150
Lossless compression is possible
just using, sort of, the fact
50
00:03:07,150 --> 00:03:10,410
that there are redundancies.
51
00:03:10,410 --> 00:03:13,550
But I'm talking here
about lossy compression.
52
00:03:13,550 --> 00:03:18,730
So I'm talking about
-- here's an image.
53
00:03:18,730 --> 00:03:20,960
And what does an
image consist of?
54
00:03:20,960 --> 00:03:24,770
It consists of a lot of
little pixels, right?
55
00:03:24,770 --> 00:03:28,870
Maybe five hundred and twelve
by five hundred and twelve.
56
00:03:28,870 --> 00:03:31,930
Two to the ninth by two
to the ninth pixels,
57
00:03:31,930 --> 00:03:40,200
and so this is pixel number
one, one, so that's a pixel.
58
00:03:40,200 --> 00:03:45,860
And if we're in black and
white, the typical pixel
59
00:03:45,860 --> 00:03:52,640
would tell us a gray-scale,
from zero to two fifty five.
60
00:03:52,640 --> 00:03:59,250
So a pixel is usually a
value of one of the xi,
61
00:03:59,250 --> 00:04:02,190
so this would be the
i-th pixel, is --
62
00:04:02,190 --> 00:04:05,260
it's usually a real
number on a scale
63
00:04:05,260 --> 00:04:06,860
from zero to two fifty five.
64
00:04:06,860 --> 00:04:10,260
In other words, two to
the eighth possibilities.
65
00:04:10,260 --> 00:04:19,720
So usually, that's the standard,
so that's eight -- eight bits.
66
00:04:22,620 --> 00:04:26,610
But then we have
that for every pixel,
67
00:04:26,610 --> 00:04:29,490
so we have five hundred
and twelve squared pixels,
68
00:04:29,490 --> 00:04:36,640
we're really operating x is a
vector in R^n, but what is n?
69
00:04:36,640 --> 00:04:41,080
n is five hundred
and twelve squared.
70
00:04:41,080 --> 00:04:45,470
That's our problem, right there.
71
00:04:45,470 --> 00:04:49,040
A pixel is a vector
that gives us
72
00:04:49,040 --> 00:04:53,010
the information about the image.
73
00:04:53,010 --> 00:04:55,240
I'm sorry.
74
00:04:55,240 --> 00:05:02,530
The image that comes through is
a vector of that length that --
75
00:05:02,530 --> 00:05:04,800
that's the information that
we have about the image,
76
00:05:04,800 --> 00:05:08,550
if it's a color image, we would
have three times that length,
77
00:05:08,550 --> 00:05:12,920
because we'd need three
coordinates to get color.
78
00:05:12,920 --> 00:05:16,830
So it would be three times five
hundred and twelve squared.
79
00:05:16,830 --> 00:05:18,830
It's an enormous
amount of information,
80
00:05:18,830 --> 00:05:22,810
and we couldn't send out
the image for these lectures
81
00:05:22,810 --> 00:05:25,510
without compressing it.
82
00:05:25,510 --> 00:05:29,070
It would overload the system.
83
00:05:29,070 --> 00:05:31,100
So it has to be compressed.
84
00:05:31,100 --> 00:05:37,920
The standard compression,
and still used with lectures
85
00:05:37,920 --> 00:05:39,810
is, called JPEG.
86
00:05:42,950 --> 00:05:46,870
I think that stands for Joint
Photographic Experts Group.
87
00:05:46,870 --> 00:05:50,390
They established a
system of compression.
88
00:05:50,390 --> 00:05:53,080
And I just want to tell
you what it's about.
89
00:05:53,080 --> 00:05:55,190
It's a change-of-basis.
90
00:05:55,190 --> 00:05:58,360
What basis do we have?
91
00:05:58,360 --> 00:06:00,860
The current basis
we have is, you
92
00:06:00,860 --> 00:06:07,050
could say, the standard basis
is, every pixel, give a value.
93
00:06:07,050 --> 00:06:12,440
So that's like we have a
vector x which is five hundred
94
00:06:12,440 --> 00:06:16,380
and twelve squared long
and, in the i-th position,
95
00:06:16,380 --> 00:06:21,420
we get a number like one
twenty one or something.
96
00:06:21,420 --> 00:06:27,720
The pixel next to it might
be one twenty four, maybe
97
00:06:27,720 --> 00:06:34,470
where my tie begins to enter,
so if it was mostly blue shirt,
98
00:06:34,470 --> 00:06:37,240
this would be a slight
difference in shading,
99
00:06:37,240 --> 00:06:41,430
but pretty close, then the tie
would be a different color,
100
00:06:41,430 --> 00:06:45,180
so we might have
quite a few pixels
101
00:06:45,180 --> 00:06:47,820
for the blue shirt,
and a whole lot
102
00:06:47,820 --> 00:06:52,450
more for the blackboard,
that are very close.
103
00:06:52,450 --> 00:06:55,430
And that's what are
very correlated.
104
00:06:55,430 --> 00:06:59,250
And that's what gives us the
possibility of compression.
105
00:06:59,250 --> 00:07:02,770
For example, before
the lecture starts,
106
00:07:02,770 --> 00:07:07,660
if we had a blank blackboard,
then there's an image,
107
00:07:07,660 --> 00:07:12,250
but it would make no
sense to take that image
108
00:07:12,250 --> 00:07:17,030
and tell you what it
is pixel by pixel.
109
00:07:17,030 --> 00:07:21,720
I mean, there's a case in
which all pixel values,
110
00:07:21,720 --> 00:07:24,180
all gray levels are the same --
111
00:07:24,180 --> 00:07:26,950
or practically the same,
depending on the erasing
112
00:07:26,950 --> 00:07:29,240
of the board, but
extremely close --
113
00:07:29,240 --> 00:07:35,260
and, so that's an image where
the standard basis is lousy.
114
00:07:35,260 --> 00:07:40,490
That's the basic fact, that
the standard basis which gives
115
00:07:40,490 --> 00:07:44,640
the value of every pixel
makes no use of the fact that
116
00:07:44,640 --> 00:07:49,250
we're getting a whole lot of
pixels whose gray levels --
117
00:07:49,250 --> 00:07:54,450
the neighboring pixels tend
to have the same gray level
118
00:07:54,450 --> 00:07:57,260
as their neighbors.
119
00:07:57,260 --> 00:07:59,950
So how do we take
advantage of that fact?
120
00:07:59,950 --> 00:08:04,820
Well, one basis
vector that would
121
00:08:04,820 --> 00:08:07,810
be extremely nice to
include in the basis
122
00:08:07,810 --> 00:08:12,830
would be a vector of all ones.
123
00:08:12,830 --> 00:08:14,830
That's not in our
standard basis,
124
00:08:14,830 --> 00:08:23,890
so let me just write again,
the standard basis is our one,
125
00:08:23,890 --> 00:08:29,440
and all the rest zeroes, zero,
one, and all the rest, zeroes,
126
00:08:29,440 --> 00:08:33,820
everybody knows what
these standard basis is.
127
00:08:33,820 --> 00:08:39,320
Now, any other basis
for R -- so this is --
128
00:08:39,320 --> 00:08:43,100
for this very
high-dimensional space --
129
00:08:43,100 --> 00:08:46,180
now I'm going to speak
about a better basis.
130
00:08:46,180 --> 00:08:51,810
Better basis -- and
let me just emphasize,
131
00:08:51,810 --> 00:08:54,580
one vector that would be
extremely nice to have in that
132
00:08:54,580 --> 00:08:56,770
basis is the vector of all ones.
133
00:08:59,790 --> 00:09:00,540
Why is that?
134
00:09:00,540 --> 00:09:04,730
Let me just say again, because
that vector of all ones,
135
00:09:04,730 --> 00:09:11,180
by itself, one vector is
able to completely give
136
00:09:11,180 --> 00:09:15,940
the information
on a solid image.
137
00:09:15,940 --> 00:09:17,650
Of course, our image
won't be solid,
138
00:09:17,650 --> 00:09:26,070
it will have a mix
of solid and signal.
139
00:09:26,070 --> 00:09:30,470
So having that one
vector in the basis
140
00:09:30,470 --> 00:09:33,260
is going to save us a whole lot.
141
00:09:33,260 --> 00:09:36,400
Now, the question is, what other
vectors should be in the basis?
142
00:09:36,400 --> 00:09:41,050
The extreme vector
in the basis might
143
00:09:41,050 --> 00:09:47,050
be a vector of one minus one,
one minus one, one minus one.
144
00:09:47,050 --> 00:09:49,420
That would be a
vector that shows --
145
00:09:49,420 --> 00:09:52,560
I mean, that's like a
checkerboard vector, right?
146
00:09:52,560 --> 00:09:58,630
That's a vector that
would, if the image was
147
00:09:58,630 --> 00:10:01,040
like a huge checkerboard
of plus, minus,
148
00:10:01,040 --> 00:10:03,720
plus, minus, plus,
minus, that vector
149
00:10:03,720 --> 00:10:06,260
would carry the whole signal.
150
00:10:06,260 --> 00:10:08,740
But much more common
would be maybe
151
00:10:08,740 --> 00:10:14,760
to have half the image, darker
and the other half lighter.
152
00:10:14,760 --> 00:10:18,320
So another vector that might
be quite useful in here
153
00:10:18,320 --> 00:10:21,480
would be half ones
and half minus ones.
154
00:10:24,950 --> 00:10:31,780
I'm just trying to get across
the idea of that a basis could
155
00:10:31,780 --> 00:10:33,950
be where, that
first of all, we've
156
00:10:33,950 --> 00:10:38,330
got the bases at our disposal.
157
00:10:38,330 --> 00:10:39,960
Like, we're free to choose that.
158
00:10:39,960 --> 00:10:45,330
And it's a billion-dollar
decision what we choose.
159
00:10:45,330 --> 00:10:50,270
So, and TV people
would rather pre-
160
00:10:50,270 --> 00:10:55,480
would prefer one basis based on
the way the signal is scanned,
161
00:10:55,480 --> 00:10:58,040
and movie people
would prefer another,
162
00:10:58,040 --> 00:11:02,540
I mean, there's giant
politics in this question that
163
00:11:02,540 --> 00:11:05,710
really reduces to a
linear algebra problem,
164
00:11:05,710 --> 00:11:07,900
what basis to choose.
165
00:11:07,900 --> 00:11:15,540
I'll just mention the best
known basis, which JPEG uses, --
166
00:11:15,540 --> 00:11:17,760
let me put that here --
167
00:11:17,760 --> 00:11:21,490
is the Fourier basis.
168
00:11:21,490 --> 00:11:27,730
So when you use the Fourier
basis, that includes --
169
00:11:27,730 --> 00:11:31,820
this is the constant vector, the
D C vector if we're electrical
170
00:11:31,820 --> 00:11:37,170
engineers, the l- vector of all
ones, so it would include one,
171
00:11:37,170 --> 00:11:39,070
one, one, one.
172
00:11:39,070 --> 00:11:41,370
Often eight by eight
is a good choice.
173
00:11:44,770 --> 00:11:46,580
Eight by eight is a good choice.
174
00:11:46,580 --> 00:11:50,340
So, what do I mean by
this eight by eight?
175
00:11:50,340 --> 00:11:55,430
I mean that the big signal,
which is five twelve by five
176
00:11:55,430 --> 00:12:00,480
twelve, gets broken down, and
JPEG does this, into eight
177
00:12:00,480 --> 00:12:02,540
by eight blocks.
178
00:12:02,540 --> 00:12:06,800
And we -- sort of, this is
too much to deal with at once.
179
00:12:06,800 --> 00:12:10,090
So what JPEG does is take
this eight by eight block,
180
00:12:10,090 --> 00:12:16,250
which is sixty four
coefficients, sixty four,
181
00:12:16,250 --> 00:12:21,030
pixels, and changes
the basis on that
182
00:12:21,030 --> 00:12:22,090
piece.
183
00:12:22,090 --> 00:12:26,260
And then, now, let's see, I was
going to write down Fourier,
184
00:12:26,260 --> 00:12:31,120
so you remember Fourier as this
vector of all ones, and then,
185
00:12:31,120 --> 00:12:35,100
the vector -- oh,
well, actually,
186
00:12:35,100 --> 00:12:39,480
I gave a lecture earlier
about the Fourier matrix,
187
00:12:39,480 --> 00:12:50,040
this matrix whose columns are
powers of a complex number w.
188
00:12:50,040 --> 00:12:52,820
I won't repeat that,
because I don't
189
00:12:52,820 --> 00:12:56,600
want to go into the details
of the Fourier basis,
190
00:12:56,600 --> 00:13:00,640
just to tell you how
compression works.
191
00:13:00,640 --> 00:13:02,880
So what happens in JPEG?
192
00:13:02,880 --> 00:13:11,640
What happens to the video, to
each image, of these lectures?
193
00:13:11,640 --> 00:13:16,270
It gets broken into
eight by eight blocks.
194
00:13:16,270 --> 00:13:16,910
OK.
195
00:13:16,910 --> 00:13:21,280
Within each block, we have
sixty four coefficients,
196
00:13:21,280 --> 00:13:24,210
sixty four basis vectors,
sixty four pixels,
197
00:13:24,210 --> 00:13:28,870
and we change basis in
sixty four dimensional space
198
00:13:28,870 --> 00:13:31,662
using these Fourier vectors.
199
00:13:35,000 --> 00:13:38,140
Just note, that was
a lossless step.
200
00:13:38,140 --> 00:13:40,240
Let me emphasize.
201
00:13:40,240 --> 00:13:46,150
In comes the signal x.
202
00:13:46,150 --> 00:13:47,460
We change basis.
203
00:13:47,460 --> 00:13:49,490
This is the basis change.
204
00:13:49,490 --> 00:13:50,800
Change basis.
205
00:13:50,800 --> 00:13:54,080
Choose a better basis.
206
00:13:54,080 --> 00:14:03,350
So it produces,
the coefficients c.
207
00:14:03,350 --> 00:14:07,590
So sixty four pixels come
in, sixty four coefficients
208
00:14:07,590 --> 00:14:08,400
come out.
209
00:14:08,400 --> 00:14:11,150
Now comes the compression.
210
00:14:11,150 --> 00:14:12,735
Now come -- this was lossless.
211
00:14:16,320 --> 00:14:18,860
It's just -- we know that R --
212
00:14:18,860 --> 00:14:25,250
R sixty four has plenty of
bases, and we've chosen one.
213
00:14:25,250 --> 00:14:32,040
Now, in that basis, we write
the signal in that basis,
214
00:14:32,040 --> 00:14:34,350
and that's what my lecture
-- that's the math part
215
00:14:34,350 --> 00:14:35,710
of my lecture.
216
00:14:35,710 --> 00:14:37,610
Now here's the application part.
217
00:14:37,610 --> 00:14:40,340
The next part is going to
be the compression step.
218
00:14:43,790 --> 00:14:46,030
And that's lossy.
219
00:14:46,030 --> 00:14:47,840
We're going to lose information.
220
00:14:47,840 --> 00:14:52,220
And what will actually
happen at that step?
221
00:14:52,220 --> 00:14:56,500
Well, one thing we could
do is just throw away
222
00:14:56,500 --> 00:14:58,680
the small coefficients.
223
00:14:58,680 --> 00:15:03,520
So that's called thresholding,
we set some threshold.
224
00:15:03,520 --> 00:15:07,150
Every coefficient, every
basis vector that's
225
00:15:07,150 --> 00:15:10,510
not in there more than
the threshold value,
226
00:15:10,510 --> 00:15:14,240
and we set them threshold
so that our eye can't
227
00:15:14,240 --> 00:15:19,650
see the difference, or can
hardly see the difference,
228
00:15:19,650 --> 00:15:23,230
whether we throw away that
little bit of that basis vector
229
00:15:23,230 --> 00:15:24,220
or keep it.
230
00:15:24,220 --> 00:15:28,004
So this compression step
produces a compressed set of
231
00:15:28,004 --> 00:15:29,670
I'll just keep going
here. coefficients.
232
00:15:32,900 --> 00:15:36,220
So it keeps going,
this compression step
233
00:15:36,220 --> 00:15:40,190
produces some coefficient c hat.
234
00:15:40,190 --> 00:15:42,560
And with many zeroes.
235
00:15:45,220 --> 00:15:48,160
So that's where the
compression came.
236
00:15:48,160 --> 00:15:52,480
Probably, there is enough of
this vector of all ones --
237
00:15:52,480 --> 00:15:55,240
we very seldom throw that away.
238
00:15:55,240 --> 00:15:57,950
Usually, its coefficient
will be large.
239
00:15:57,950 --> 00:16:01,110
But the coefficient of
something like this,
240
00:16:01,110 --> 00:16:04,500
that quickly alternative
vector, there's
241
00:16:04,500 --> 00:16:07,560
probably very little of
that in any smooth signal.
242
00:16:07,560 --> 00:16:12,450
That's high-frequency -- this is
low-frequency, zero frequency.
243
00:16:12,450 --> 00:16:15,350
This stuff is the highest
frequency we could have,
244
00:16:15,350 --> 00:16:23,470
and if the noise, the jitter is
producing that sort of output,
245
00:16:23,470 --> 00:16:26,470
but a smooth lecture
like this one
246
00:16:26,470 --> 00:16:30,740
is, has very little of
that highest frequency,
247
00:16:30,740 --> 00:16:32,700
very little noise
in this lecture.
248
00:16:32,700 --> 00:16:37,150
OK, so we throw away
whatever there is,
249
00:16:37,150 --> 00:16:40,860
and we're left with
just a few coefficients,
250
00:16:40,860 --> 00:16:48,560
and then we reconstruct a
signal using those coefficients.
251
00:16:48,560 --> 00:16:53,580
We take those coefficients,
times their basis vectors,
252
00:16:53,580 --> 00:16:59,040
but this sum doesn't have
sixty four terms any more.
253
00:16:59,040 --> 00:17:02,130
Probably, it has about
two or three terms.
254
00:17:02,130 --> 00:17:03,230
So that would --
255
00:17:03,230 --> 00:17:05,109
say it has three terms.
256
00:17:05,109 --> 00:17:07,000
From sixty four down
to three, that's
257
00:17:07,000 --> 00:17:09,670
compression of
twenty one to one.
258
00:17:09,670 --> 00:17:12,500
That's the kind of compression
you're looking for.
259
00:17:12,500 --> 00:17:17,480
And everybody is looking for
that sort of compression.
260
00:17:17,480 --> 00:17:19,890
Let's see, I guess
I met the problem
261
00:17:19,890 --> 00:17:23,990
with the FBI and fingerprints.
262
00:17:23,990 --> 00:17:26,390
So there's a whole
lot of still images.
263
00:17:26,390 --> 00:17:30,590
You know, with your thumb,
you make these inky marks
264
00:17:30,590 --> 00:17:34,300
which go somewhere.
265
00:17:34,300 --> 00:17:38,460
it used to go to Washington
and get stored in a big file.
266
00:17:38,460 --> 00:17:44,240
So Washington had a file of
thirty million murderers,
267
00:17:44,240 --> 00:17:50,670
cheaters on quizzes,
other stuff,
268
00:17:50,670 --> 00:17:56,280
and actually, there was no
way to retrieve them in time.
269
00:17:56,280 --> 00:17:59,740
So suppose you're at the
police station, they say, OK,
270
00:17:59,740 --> 00:18:03,650
this person may have done
this, check with Washington,
271
00:18:03,650 --> 00:18:07,410
have they got -- are his or
her fingerprints on file?
272
00:18:07,410 --> 00:18:12,220
Well, Washington won't know
the answer within a week
273
00:18:12,220 --> 00:18:15,870
if it's got filing cabinets
full of fingerprints.
274
00:18:15,870 --> 00:18:21,190
So of course, the natural
step is digitizing.
275
00:18:21,190 --> 00:18:24,380
So all fingerprints
are now digitized,
276
00:18:24,380 --> 00:18:30,090
so now it's at least electronic,
but still there's too much
277
00:18:30,090 --> 00:18:33,580
information in each one.
278
00:18:33,580 --> 00:18:39,260
I mean, you can't search
through that many, fingerprints
279
00:18:39,260 --> 00:18:45,040
if the digital image is five
twelve squared by five twelve
280
00:18:45,040 --> 00:18:47,280
squared, if it's
that many pixels.
281
00:18:47,280 --> 00:18:49,110
So you get compressed.
282
00:18:49,110 --> 00:18:54,970
So the FBI had to decide what
basis to choose for compression
283
00:18:54,970 --> 00:18:55,900
of fingerprints.
284
00:18:55,900 --> 00:18:59,510
And then they built a big new
facility in West Virginia,
285
00:18:59,510 --> 00:19:03,220
and that's where
fingerprints now are sent.
286
00:19:03,220 --> 00:19:06,110
So I think, if you get
your fingerprints done now
287
00:19:06,110 --> 00:19:09,610
at the police station, if it's
an up-to-date police station,
288
00:19:09,610 --> 00:19:13,990
it happens digitally, and
the signal is sent digitally,
289
00:19:13,990 --> 00:19:18,560
and then in West Virginia,
it's compressed and indexed.
290
00:19:18,560 --> 00:19:21,960
And then, if they
want to find you,
291
00:19:21,960 --> 00:19:26,710
they can do it within minutes
instead of within a week.
292
00:19:26,710 --> 00:19:27,410
OK.
293
00:19:27,410 --> 00:19:31,860
So this compression comes
up for signals, for images,
294
00:19:31,860 --> 00:19:36,020
for video -- which is,
like these lectures --
295
00:19:36,020 --> 00:19:38,400
there's another aspect.
296
00:19:38,400 --> 00:19:43,980
You could treat the video as one
still image after another one,
297
00:19:43,980 --> 00:19:50,770
and compress each one, and
then run them and make a video.
298
00:19:50,770 --> 00:19:52,950
But that misses --
299
00:19:52,950 --> 00:19:55,990
well, you can see why
that's not optimal.
300
00:19:55,990 --> 00:20:04,570
In a video thing, you
have a sequence of images,
301
00:20:04,570 --> 00:20:14,340
so video is really a sequence
of images but what about one
302
00:20:14,340 --> 00:20:16,800
image to the next image?
303
00:20:16,800 --> 00:20:18,780
They're extremely correlated.
304
00:20:18,780 --> 00:20:21,930
I mean that I'm getting an
image every split-second,
305
00:20:21,930 --> 00:20:24,200
and also, I'm moving slightly.
306
00:20:24,200 --> 00:20:29,690
That's what's producing the,
jumpy motion on the video.
307
00:20:29,690 --> 00:20:35,010
But I'm not, like, you know --
308
00:20:35,010 --> 00:20:39,270
each image in the sequence is
pretty close to the one before.
309
00:20:39,270 --> 00:20:44,480
So you have to use, like,
prediction and correction.
310
00:20:44,480 --> 00:20:49,700
I mean, the image of me one
instant -- one time-step later,
311
00:20:49,700 --> 00:20:51,830
you would assume
would be the same,
312
00:20:51,830 --> 00:20:54,260
and then plus a
small correction.
313
00:20:54,260 --> 00:20:58,620
And you would only code and
digitize the correction,
314
00:20:58,620 --> 00:21:00,970
and compress the correction.
315
00:21:00,970 --> 00:21:05,350
So a sequence of images
that's highly correlated
316
00:21:05,350 --> 00:21:11,250
and the problem in
compression is always
317
00:21:11,250 --> 00:21:14,010
to use this
correlation, this fact
318
00:21:14,010 --> 00:21:17,960
that, in time, or
in space, things
319
00:21:17,960 --> 00:21:23,650
don't change instantly, they're
very often smooth changes,
320
00:21:23,650 --> 00:21:29,170
and, you can predict one
value from the previous value.
321
00:21:29,170 --> 00:21:29,670
OK.
322
00:21:29,670 --> 00:21:36,240
So those are applications
which are pure linear algebra.
323
00:21:36,240 --> 00:21:42,680
I could, well, maybe you'll
allow me to tell you,
324
00:21:42,680 --> 00:21:47,650
and the book describes,
the new basis that's
325
00:21:47,650 --> 00:21:50,150
the competition for Fourier.
326
00:21:50,150 --> 00:21:54,680
So the competition for
Fourier is called wavelets,
327
00:21:54,680 --> 00:21:58,460
and I can describe what
that basis is like,
328
00:21:58,460 --> 00:22:00,680
say, in the eight by eight case.
329
00:22:00,680 --> 00:22:03,240
So the eight by
eight wavelet basis
330
00:22:03,240 --> 00:22:08,410
is the vector of all
ones, eight ones, then
331
00:22:08,410 --> 00:22:13,030
the vector of four ones
and four minus ones,
332
00:22:13,030 --> 00:22:18,770
then the vector of two
ones, and two minus ones,
333
00:22:18,770 --> 00:22:20,620
and four zeroes.
334
00:22:20,620 --> 00:22:24,930
And also the vector
of four zeroes
335
00:22:24,930 --> 00:22:28,480
and two ones and two minus ones.
336
00:22:28,480 --> 00:22:31,630
So now I'm up to four, and
I need four more, right?
337
00:22:31,630 --> 00:22:32,870
For R^8?
338
00:22:32,870 --> 00:22:39,690
The next basis vector will be
one minus one and six zeroes,
339
00:22:39,690 --> 00:22:44,910
and then three more like
that, with the one minus one
340
00:22:44,910 --> 00:22:47,400
there, and there, and there.
341
00:22:47,400 --> 00:22:52,490
So those are eight vectors
in eight-dimensional space,
342
00:22:52,490 --> 00:22:59,830
those are called wavelets,
and it's a very simple wavelet
343
00:22:59,830 --> 00:23:03,131
choice, it's a
more sophisticated
344
00:23:03,131 --> 00:23:03,630
choice.
345
00:23:03,630 --> 00:23:09,320
This is a little jumpy, to
jump between one and minus one.
346
00:23:09,320 --> 00:23:12,920
And, actually, you
can see, now, suppose
347
00:23:12,920 --> 00:23:16,820
you compare the wavelet basis
with the Fourier basis above.
348
00:23:16,820 --> 00:23:22,450
How could I write this guy,
which is in the Fourier basis,
349
00:23:22,450 --> 00:23:24,780
it's an eight --
it's a vector in R^8.
350
00:23:24,780 --> 00:23:28,350
How would I write
that as a combination
351
00:23:28,350 --> 00:23:30,920
of the wavelet basis?
352
00:23:30,920 --> 00:23:34,250
Have I told you enough about the
wavelet basis that you can see,
353
00:23:34,250 --> 00:23:37,750
how does this very fast guy --
354
00:23:37,750 --> 00:23:41,700
what combination of the wavelet
basis is that very fast guy?
355
00:23:41,700 --> 00:23:45,020
It would be this one --
356
00:23:45,020 --> 00:23:47,430
it would be the sum
of these four, right?
357
00:23:47,430 --> 00:23:50,360
That very fast guy will
be that one minus one,
358
00:23:50,360 --> 00:23:53,740
and the next one, and the
next one, and the next one.
359
00:23:53,740 --> 00:23:59,350
So this is the sum of
those last four wavelets.
360
00:23:59,350 --> 00:24:02,010
This one, we've kept, and so on.
361
00:24:02,010 --> 00:24:06,410
So, each -- well, every --
well, that's what a basis does.
362
00:24:06,410 --> 00:24:10,250
Every vector in R^8 is
some combination of those,
363
00:24:10,250 --> 00:24:15,210
and for the linear algebra --
so the linear algebra is this
364
00:24:15,210 --> 00:24:18,860
step, find the coefficient.
365
00:24:18,860 --> 00:24:21,900
That's the step we want to take.
366
00:24:21,900 --> 00:24:28,010
What if I give you the basis,
like this wavelet basis,
367
00:24:28,010 --> 00:24:34,360
and I give you the pixel -- so
here are the pixel values, P1,
368
00:24:34,360 --> 00:24:35,660
P2, down to P8 --
369
00:24:38,810 --> 00:24:39,930
what's the job?
370
00:24:39,930 --> 00:24:42,220
What's the linear algebra here?
371
00:24:42,220 --> 00:24:48,400
So these are the values, this
is in the standard basis, right?
372
00:24:48,400 --> 00:24:53,990
Those are just the values
at eight successive points.
373
00:24:53,990 --> 00:24:57,560
I guess I'm dropping down to
one dimension, instead of eight
374
00:24:57,560 --> 00:25:01,180
by eight, I'm just going
to take eight pixel values
375
00:25:01,180 --> 00:25:04,540
along that first top row.
376
00:25:04,540 --> 00:25:06,160
So what do I want to do?
377
00:25:06,160 --> 00:25:10,720
In standard basis, here
are the pixel values.
378
00:25:10,720 --> 00:25:15,880
I want to write that as a
combination of c1 times this
379
00:25:15,880 --> 00:25:21,220
guy, plus c2 times
this guy, plus c3,
380
00:25:21,220 --> 00:25:25,590
these are the coefficients,
plus c4 times this one --
381
00:25:25,590 --> 00:25:26,720
do you see what I'm doing?
382
00:25:26,720 --> 00:25:32,010
I want to write this vector P
as a combination of c1 times
383
00:25:32,010 --> 00:25:36,395
the first wavelet plus c8
times the eighth wavelet.
384
00:25:39,910 --> 00:25:42,640
That's the transform step.
385
00:25:42,640 --> 00:25:43,970
That's the lossless step.
386
00:25:43,970 --> 00:25:47,580
That's the step from P --
oh, I'm calling it P here,
387
00:25:47,580 --> 00:25:50,790
and I called it x
there, so let me --
388
00:25:50,790 --> 00:25:54,520
at the risk of moving, and
therefore making this jumpy --
389
00:25:54,520 --> 00:26:00,780
suppose the signal I'm now
calling P, that a pixel values,
390
00:26:00,780 --> 00:26:02,860
and I'm looking for
the coefficients.
391
00:26:02,860 --> 00:26:05,360
OK, tell me how to do it.
392
00:26:08,360 --> 00:26:12,960
If I give you eight
basis vectors,
393
00:26:12,960 --> 00:26:17,930
and I give you the input signal,
and I ask for the coefficients,
394
00:26:17,930 --> 00:26:19,950
what do I do?
395
00:26:19,950 --> 00:26:22,690
What's the step?
396
00:26:22,690 --> 00:26:28,650
I'm trying to solve this, I want
to know the eight coefficients,
397
00:26:28,650 --> 00:26:32,200
so I'm changing from the
standard basis, which is just
398
00:26:32,200 --> 00:26:37,220
the eight gray-scale values
to the wavelet basis, where
399
00:26:37,220 --> 00:26:41,490
the same vector is
represented by eight numbers.
400
00:26:41,490 --> 00:26:44,970
It's got to take eight numbers
to tell you a vector in R^8,
401
00:26:44,970 --> 00:26:49,460
and those eight numbers are
the coefficients of the basis.
402
00:26:49,460 --> 00:26:54,720
Look, we've done
this thing before.
403
00:26:54,720 --> 00:26:58,510
There is the equation
in vector notation,
404
00:26:58,510 --> 00:27:01,610
we want to see it as a matrix.
405
00:27:01,610 --> 00:27:06,870
This is a combination of
columns of the wavelet matrix,
406
00:27:06,870 --> 00:27:07,430
right?
407
00:27:07,430 --> 00:27:14,440
This is P equals
c1, c2, down to c8,
408
00:27:14,440 --> 00:27:17,370
and these guys are the columns.
409
00:27:17,370 --> 00:27:19,990
I mean, this is the step
that we're constantly
410
00:27:19,990 --> 00:27:22,700
taking in this course,
the first basis vector
411
00:27:22,700 --> 00:27:25,920
goes in the first column,
the second basis vector
412
00:27:25,920 --> 00:27:29,170
goes in the second
column, and so on,
413
00:27:29,170 --> 00:27:32,760
the eight columns of
this wavelet matrix
414
00:27:32,760 --> 00:27:35,130
are the eight basis vectors.
415
00:27:35,130 --> 00:27:40,710
This is a wavelet matrix W.
416
00:27:40,710 --> 00:27:46,330
So, the step to change basis
-- so now I'm finally coming
417
00:27:46,330 --> 00:27:51,780
to this change-of-basis, so
the change of basis that,
418
00:27:51,780 --> 00:27:55,370
let me stay with this
board, but -- well,
419
00:27:55,370 --> 00:27:56,790
let me just go above it, here.
420
00:28:00,500 --> 00:28:07,200
So the standard basis, we know,
the wavelet basis we have here,
421
00:28:07,200 --> 00:28:14,300
and the transform is
simply, solve the equations,
422
00:28:14,300 --> 00:28:23,300
P=W C. So the coefficients
are W inverse P. Right.
423
00:28:23,300 --> 00:28:26,920
This shows a critical point.
424
00:28:26,920 --> 00:28:31,630
A good basis has a
nice, fast, inverse.
425
00:28:31,630 --> 00:28:38,260
So good basis means what?
426
00:28:38,260 --> 00:28:40,470
So this is like the
billion-dollar competition,
427
00:28:40,470 --> 00:28:41,710
Eh? and it's not over yet.
428
00:28:41,710 --> 00:28:46,360
People are going to come up
with better bases than these.
429
00:28:46,360 --> 00:28:53,420
So a good basis will be, first
good thing would be fast.
430
00:28:53,420 --> 00:28:58,600
I have to be able to multiply
by W fast, and multiply by W --
431
00:28:58,600 --> 00:29:00,930
by its inverse fast.
432
00:29:00,930 --> 00:29:04,630
That's -- if a basis doesn't
allow you to do that fast,
433
00:29:04,630 --> 00:29:09,920
then it's going to take so much
time that you can't afford it.
434
00:29:09,920 --> 00:29:12,210
So these bases --
435
00:29:12,210 --> 00:29:15,900
the Fourier basis,
everybody said, OK, I
436
00:29:15,900 --> 00:29:18,830
know how to deal quickly
with the Fourier basis,
437
00:29:18,830 --> 00:29:23,690
because we have something called
the Fast Fourier Transform.
438
00:29:23,690 --> 00:29:27,510
So there's a FFT that came
in my earlier lecture,
439
00:29:27,510 --> 00:29:30,270
and comes in the last
chapter of the book,
440
00:29:30,270 --> 00:29:34,400
so change-of-basis is done
-- if, for the Fourier basis,
441
00:29:34,400 --> 00:29:39,380
it's done fast by the FFT
and there's a fast wavelet
442
00:29:39,380 --> 00:29:40,420
transform.
443
00:29:40,420 --> 00:29:47,040
I can change, for
this wavelet example,
444
00:29:47,040 --> 00:29:49,320
this matrix is easy to invert.
445
00:29:49,320 --> 00:29:52,720
It's just somebody
had a smart idea
446
00:29:52,720 --> 00:29:56,930
in choosing that wavelet
basis and inverting it,
447
00:29:56,930 --> 00:29:58,580
it has a nice inverse.
448
00:29:58,580 --> 00:30:01,290
Actually, you can see why
it has a nice inverse.
449
00:30:01,290 --> 00:30:06,340
Do you see any property of
these eight basis vectors?
450
00:30:06,340 --> 00:30:07,940
Well, I've only
written five of them,
451
00:30:07,940 --> 00:30:11,220
but if you see that
property for those five,
452
00:30:11,220 --> 00:30:13,220
you'll see it for the three
453
00:30:13,220 --> 00:30:14,030
remaining.
454
00:30:14,030 --> 00:30:16,890
Well, if I give you
those eight vectors
455
00:30:16,890 --> 00:30:20,650
and ask, what's a nice property?
456
00:30:20,650 --> 00:30:23,770
Well, you would say, first,
they're all ones and minus ones
457
00:30:23,770 --> 00:30:25,120
and zeroes.
458
00:30:25,120 --> 00:30:32,530
So every multiplication is very
fast using -- just in binary.
459
00:30:32,530 --> 00:30:34,820
But what's the other great
property of those vectors?
460
00:30:34,820 --> 00:30:37,750
Anybody see it?
461
00:30:37,750 --> 00:30:41,210
So, of course, when I
think about a basis,
462
00:30:41,210 --> 00:30:42,480
one nice property --
463
00:30:42,480 --> 00:30:45,320
I don't have to have it, but
I'm happy if it's there --
464
00:30:45,320 --> 00:30:48,830
is that they're orthogonal.
465
00:30:48,830 --> 00:30:51,070
If the basis vectors
are orthogonal,
466
00:30:51,070 --> 00:30:52,780
then I'm in good shape.
467
00:30:52,780 --> 00:30:54,380
And these are... do you see?
468
00:30:54,380 --> 00:30:56,840
Take the dot product
of that with that,
469
00:30:56,840 --> 00:31:00,910
you get four plus ones and
four minus ones, you get zero.
470
00:31:00,910 --> 00:31:03,060
Take the dot product
of that with that.
471
00:31:03,060 --> 00:31:05,880
You get two plus ones
and two minus ones.
472
00:31:05,880 --> 00:31:07,660
Or the dot product
of that with that.
473
00:31:07,660 --> 00:31:10,620
Two plus ones and
two minus ones.
474
00:31:10,620 --> 00:31:14,060
You can easily check that
that's an orthogonal basis.
475
00:31:14,060 --> 00:31:15,875
It's not orthonormal.
476
00:31:18,870 --> 00:31:22,470
To fix it up, I should
divide by the length,
477
00:31:22,470 --> 00:31:23,920
to make them unit vectors.
478
00:31:23,920 --> 00:31:26,000
Let's suppose I do that.
479
00:31:26,000 --> 00:31:29,320
So somewhere in here, I've
got to account for the fact
480
00:31:29,320 --> 00:31:32,590
that this has length
square root of eight, that
481
00:31:32,590 --> 00:31:34,520
has length square
root of four, that
482
00:31:34,520 --> 00:31:36,750
has length square root of two.
483
00:31:36,750 --> 00:31:40,750
But that's just a constant
factor that's easy to --
484
00:31:40,750 --> 00:31:42,620
so suppose we've done that.
485
00:31:42,620 --> 00:31:47,420
Then, tell me what's W inverse?
486
00:31:47,420 --> 00:31:52,270
That's what chapter four,
section four point four
487
00:31:52,270 --> 00:31:53,630
was about.
488
00:31:53,630 --> 00:31:57,470
If we have orthonormal
columns then
489
00:31:57,470 --> 00:32:02,330
the inverse is the
same as the transpose.
490
00:32:02,330 --> 00:32:06,470
So if we have a fast way to
multiply by W, which we do,
491
00:32:06,470 --> 00:32:09,010
the inverse is going
to look just the same,
492
00:32:09,010 --> 00:32:10,450
and we'll have a fast way to do
493
00:32:10,450 --> 00:32:11,680
W inverse.
494
00:32:11,680 --> 00:32:17,760
So that's the wavelet basis
passes this requirement
495
00:32:17,760 --> 00:32:19,020
for fast.
496
00:32:19,020 --> 00:32:21,250
We can use it fast.
497
00:32:21,250 --> 00:32:24,530
But there's a second
requirement, is it any good?
498
00:32:24,530 --> 00:32:26,420
Because the the
very fastest thing
499
00:32:26,420 --> 00:32:30,060
we could do is not to
change basis at all.
500
00:32:30,060 --> 00:32:30,560
Right?
501
00:32:30,560 --> 00:32:33,660
The fastest thing would be, OK,
stay with the standard basis,
502
00:32:33,660 --> 00:32:36,020
stay with eight pixel values.
503
00:32:36,020 --> 00:32:40,320
But that was poor from
compression point of view,
504
00:32:40,320 --> 00:32:41,020
right?
505
00:32:41,020 --> 00:32:45,380
Those eight pixel values, if I
just took those eight numbers,
506
00:32:45,380 --> 00:32:48,800
I can't throw some
of those away.
507
00:32:48,800 --> 00:32:52,550
If I throw away
ninety percent --
508
00:32:52,550 --> 00:32:55,110
if I compress ten to
one, and throw away
509
00:32:55,110 --> 00:32:57,610
ninety percent of
my pixel values,
510
00:32:57,610 --> 00:33:00,660
well, my picture's
just gone dark.
511
00:33:00,660 --> 00:33:03,470
Whereas, the basis
that was good,
512
00:33:03,470 --> 00:33:05,960
the wavelet basis or
the Fourier basis,
513
00:33:05,960 --> 00:33:12,910
if I throw away c5, c6, c7,
and c8, all I'm throwing away
514
00:33:12,910 --> 00:33:16,060
is little blips that
are probably there
515
00:33:16,060 --> 00:33:18,870
in very small amounts.
516
00:33:18,870 --> 00:33:23,210
So the second property that
we need is good compression.
517
00:33:23,210 --> 00:33:32,470
So first, it has to be fast, and
secondly, a few basis vectors
518
00:33:32,470 --> 00:33:36,270
should come close to the signal.
519
00:33:36,270 --> 00:33:40,140
So a few is enough.
520
00:33:40,140 --> 00:33:41,840
Can I write it that way?
521
00:33:41,840 --> 00:33:50,470
A few basis vectors are enough
to reproduce the image just
522
00:33:50,470 --> 00:33:54,060
exactly as on a video
of these 18.06 lectures.
523
00:33:54,060 --> 00:33:58,700
Uh, I don't know what the
compression rate is, I'll ask,
524
00:33:58,700 --> 00:34:03,270
David, who does the
compression -- and, by the way,
525
00:34:03,270 --> 00:34:09,639
I'll try to get the lectures,
that are relevant for the quiz
526
00:34:09,639 --> 00:34:13,310
up onto the Web in time.
527
00:34:13,310 --> 00:34:15,980
So I'll send them
a message today.
528
00:34:15,980 --> 00:34:21,679
So, he's using the Fourier
basis because the JPEG --
529
00:34:21,679 --> 00:34:26,219
so JPEG two thousand, which will
be the next standard for image
530
00:34:26,219 --> 00:34:29,520
compression, will
include wavelets.
531
00:34:29,520 --> 00:34:33,139
So, I mean, you're actually
getting a kind of up-to-date,
532
00:34:33,139 --> 00:34:40,030
picture of where this big world
of signal and image processing
533
00:34:40,030 --> 00:34:41,230
is.
534
00:34:41,230 --> 00:34:45,090
That Fourier is
what everybody knew,
535
00:34:45,090 --> 00:34:47,949
and what people
automatically used,
536
00:34:47,949 --> 00:34:53,460
and the new one is
wavelets, where this is
537
00:34:53,460 --> 00:34:55,580
the simplest set of wavelets.
538
00:34:55,580 --> 00:34:59,650
And this isn't the one that
the FBI uses, by the way,
539
00:34:59,650 --> 00:35:03,730
the FBI uses a smoother
wavelet, instead
540
00:35:03,730 --> 00:35:08,690
of jumping from one to minus
one, it's a smooth, Cutoff.
541
00:35:08,690 --> 00:35:15,010
and, that's what we'll be
in in JPEG two thousand.
542
00:35:15,010 --> 00:35:17,130
OK, so that's that application.
543
00:35:17,130 --> 00:35:23,520
Now, let me come to the
math, the linear algebra
544
00:35:23,520 --> 00:35:25,780
part of the lecture.
545
00:35:25,780 --> 00:35:29,470
Well, we've actually
seen a change-of-basis.
546
00:35:29,470 --> 00:35:33,420
So let -- let me just review
that eh-eh change-of-basis
547
00:35:33,420 --> 00:35:37,980
idea, and then the i- and then
the transformation to a matrix.
548
00:35:37,980 --> 00:35:38,590
OK.
549
00:35:38,590 --> 00:35:43,630
So this, I hope you see
that these applications are
550
00:35:43,630 --> 00:35:45,360
really big.
551
00:35:45,360 --> 00:35:49,100
Now, I have to talk a little
about change-of-basis,
552
00:35:49,100 --> 00:35:50,960
and a little about that.
553
00:35:50,960 --> 00:35:52,380
The matrix.
554
00:35:52,380 --> 00:35:53,300
OK.
555
00:35:53,300 --> 00:35:55,880
OK.
556
00:35:55,880 --> 00:35:56,380
OK.
557
00:35:56,380 --> 00:35:57,171
So change-of-basis.
558
00:36:01,820 --> 00:36:10,390
Basically, forgive
that put, OK, I have,
559
00:36:10,390 --> 00:36:15,250
I have my vector in
one basis, and I want
560
00:36:15,250 --> 00:36:16,500
to change to a different one.
561
00:36:16,500 --> 00:36:19,590
Actually, you saw it
for the wavelet case.
562
00:36:19,590 --> 00:36:22,680
So I need the --
563
00:36:22,680 --> 00:36:31,300
let the matrix W,
and the columns of W
564
00:36:31,300 --> 00:36:34,330
be the new basis vectors.
565
00:36:40,830 --> 00:36:43,820
Then the change-of-basis
involves, just
566
00:36:43,820 --> 00:36:46,050
as it did there, W inverse.
567
00:36:46,050 --> 00:36:52,510
So we have the vector,
say, x, in the old basis,
568
00:36:52,510 --> 00:37:04,050
and that converts to a vector,
let's say, c, in the new basis,
569
00:37:04,050 --> 00:37:16,655
and the relation is exactly what
we had there, that x is W c.
570
00:37:20,400 --> 00:37:22,680
That's the step we have to take.
571
00:37:22,680 --> 00:37:26,640
There's a matrix W that
gives us a change-of-basis.
572
00:37:26,640 --> 00:37:28,650
OK.
573
00:37:28,650 --> 00:37:38,110
What I want to do is think about
transformations on matrices.
574
00:37:38,110 --> 00:37:43,790
So here's the question
to complete this lecture.
575
00:37:43,790 --> 00:37:50,930
Suppose I have a linear
transformation T.
576
00:37:50,930 --> 00:37:57,390
So we would think of it as an
eight -- as a n by n matrix.
577
00:37:57,390 --> 00:38:02,400
And it's computed with
respect to a certain basis.
578
00:38:02,400 --> 00:38:04,960
So T -- no, I'm sorry.
579
00:38:04,960 --> 00:38:07,780
I've got the
transformation T, period.
580
00:38:07,780 --> 00:38:09,760
That's taking
eight-dimensional space
581
00:38:09,760 --> 00:38:12,240
to eight-dimensional space.
582
00:38:12,240 --> 00:38:15,340
Now, let's get
matrices in there.
583
00:38:15,340 --> 00:38:16,650
OK.
584
00:38:16,650 --> 00:38:27,350
So, with respect to a first
basis, say v1 up to v8,
585
00:38:27,350 --> 00:38:35,550
it has a matrix A.
586
00:38:35,550 --> 00:38:39,380
I'm just setting
up letters here.
587
00:38:39,380 --> 00:38:48,000
With respect to a second basis,
say, I'll make it u1 up to --
588
00:38:48,000 --> 00:39:00,000
or w1, since I've used (w)s,
w1 up to w8, it has a matrix B.
589
00:39:00,000 --> 00:39:05,830
And my question is, what's
the connection between A
590
00:39:05,830 --> 00:39:09,890
How is the matrix -- the
transformation T is settled.
591
00:39:09,890 --> 00:39:11,240
and B?
592
00:39:11,240 --> 00:39:16,660
We could say, it's a
rotation, for example.
593
00:39:16,660 --> 00:39:18,490
So that would be
one transformation
594
00:39:18,490 --> 00:39:21,640
of eight-dimensional space,
just spin it a little.
595
00:39:24,740 --> 00:39:25,930
Or project it.
596
00:39:25,930 --> 00:39:32,070
Or whatever linear
transformation we've got.
597
00:39:32,070 --> 00:39:35,240
Now, we have to remember --
598
00:39:35,240 --> 00:39:41,420
my first step is to remind you
how you create that matrix A.
599
00:39:41,420 --> 00:39:45,530
Then my second step is, we would
use the same method to create
600
00:39:45,530 --> 00:39:49,520
B, but because it came from
the same transformation,
601
00:39:49,520 --> 00:39:52,550
there's got to be a
relation between A and B.
602
00:39:52,550 --> 00:39:55,660
What's the relation
between A and B?
603
00:39:55,660 --> 00:40:02,230
And let me jump to the
answer on that one.
604
00:40:02,230 --> 00:40:05,890
That if I have the
same transformation,
605
00:40:05,890 --> 00:40:10,600
and I'm compute on its matrix in
one basis, and then I computer
606
00:40:10,600 --> 00:40:16,910
it in another basis, those
two matrices are similar.
607
00:40:16,910 --> 00:40:19,350
So these two
matrices are similar.
608
00:40:19,350 --> 00:40:23,080
Now, do you remember what
similar matrices meant?
609
00:40:23,080 --> 00:40:23,630
Similar.
610
00:40:23,630 --> 00:40:27,935
A is similar to -- the
two matrices are similar.
611
00:40:27,935 --> 00:40:28,435
Similar.
612
00:40:31,280 --> 00:40:33,560
And what do I mean by that?
613
00:40:33,560 --> 00:40:40,700
I mean that I take the matrix
B, and I can compute it
614
00:40:40,700 --> 00:40:47,320
from the matrix A using
some similarity, some matrix
615
00:40:47,320 --> 00:40:52,180
M on one side, and M
inverse on the other.
616
00:40:52,180 --> 00:40:55,530
And this M will be the
change-of-basis matrix.
617
00:40:59,490 --> 00:41:03,740
This part of the lecture
is, admittedly, compressed.
618
00:41:03,740 --> 00:41:05,820
What I wanted you to --
619
00:41:05,820 --> 00:41:12,670
it's really the conclusion
that I want you to spot.
620
00:41:12,670 --> 00:41:17,980
Now, I have to go back and
say, what does it mean for A
621
00:41:17,980 --> 00:41:21,830
to be the matrix of
this transformation T.
622
00:41:21,830 --> 00:41:23,710
So I have to remind
you what that meant,
623
00:41:23,710 --> 00:41:26,280
that was in the last lecture.
624
00:41:26,280 --> 00:41:32,360
Then this is the conclusion
that if I change to a different
625
00:41:32,360 --> 00:41:36,720
basis, we now know -- see, if
I change to a different basis,
626
00:41:36,720 --> 00:41:38,340
two things happen.
627
00:41:38,340 --> 00:41:41,880
Every vector has
new coordinates.
628
00:41:41,880 --> 00:41:45,190
There, the rule is this one,
between the old coordinates
629
00:41:45,190 --> 00:41:46,660
and the new ones.
630
00:41:46,660 --> 00:41:50,570
Every matrix changes, every
transformation has a new
631
00:41:50,570 --> 00:41:51,560
matrix.
632
00:41:51,560 --> 00:41:55,120
And the new matrix
is related this way,
633
00:41:55,120 --> 00:41:57,710
the M could be
the same as the W.
634
00:41:57,710 --> 00:42:00,300
The M there would be the W here.
635
00:42:00,300 --> 00:42:00,960
OK.
636
00:42:00,960 --> 00:42:04,790
So, can I, in the
remaining minutes,
637
00:42:04,790 --> 00:42:09,620
recapture my lecture -- the
end of my lecture that was just
638
00:42:09,620 --> 00:42:13,740
before Thanksgiving,
about the matrix?
639
00:42:13,740 --> 00:42:15,660
OK.
640
00:42:15,660 --> 00:42:16,550
What's the matrix?
641
00:42:19,090 --> 00:42:21,040
And I'll just take one basis.
642
00:42:21,040 --> 00:42:26,830
So now this part is going
to go onto this board here.
643
00:42:26,830 --> 00:42:27,970
What is the matrix?
644
00:42:27,970 --> 00:42:28,740
What is A?
645
00:42:32,830 --> 00:42:33,930
OK.
646
00:42:33,930 --> 00:42:42,280
Using a basis v1 up to v8.
647
00:42:42,280 --> 00:42:43,161
Mm.
648
00:42:43,161 --> 00:42:43,660
OK.
649
00:42:43,660 --> 00:42:46,350
What's the point?
650
00:42:46,350 --> 00:42:51,180
The point is, if I know
what the transformation does
651
00:42:51,180 --> 00:42:56,990
to those eight basis vectors,
I know it completely.
652
00:42:56,990 --> 00:42:59,140
I know T, I know
everything about T,
653
00:42:59,140 --> 00:43:11,200
I know T completely from knowing
T of V -- what T does to v1,
654
00:43:11,200 --> 00:43:15,040
what T does to v2,
what T does to v8.
655
00:43:18,020 --> 00:43:19,760
Why is that?
656
00:43:19,760 --> 00:43:23,170
It's because T is a
linear transformation.
657
00:43:23,170 --> 00:43:26,700
So that if I know what
these outputs are --
658
00:43:26,700 --> 00:43:29,360
so these are the
inputs v1 up to v8,
659
00:43:29,360 --> 00:43:31,560
these are the outputs
from the transformation,
660
00:43:31,560 --> 00:43:35,500
like everyone rotated,
everyone projected,
661
00:43:35,500 --> 00:43:38,010
whatever transformation
I've done,
662
00:43:38,010 --> 00:43:42,090
then why is it that
I know everything?
663
00:43:42,090 --> 00:43:43,660
How does linearity work?
664
00:43:43,660 --> 00:43:45,020
Why?
665
00:43:45,020 --> 00:43:56,150
This is because every x is
some combination of these basis
666
00:43:56,150 --> 00:43:57,870
vectors, right?
667
00:43:57,870 --> 00:44:05,570
c1v1, c2v2, c8v8,
they were a basis.
668
00:44:05,570 --> 00:44:07,010
That's the whole
point of a basis,
669
00:44:07,010 --> 00:44:09,790
that every vector is a
combination of the basis
670
00:44:09,790 --> 00:44:12,210
vectors in exactly one way.
671
00:44:12,210 --> 00:44:14,737
And then, what is T of x?
672
00:44:18,670 --> 00:44:23,460
The point is, I claim that
we know T of x completely
673
00:44:23,460 --> 00:44:27,770
for every x, because every x
is a combination of those --
674
00:44:27,770 --> 00:44:31,740
and now we use the linear
transformation part to say that
675
00:44:31,740 --> 00:44:37,890
the output from x has to be c1
times the output from v1 plus
676
00:44:37,890 --> 00:44:43,330
v2 times the output
from v2, and so on.
677
00:44:43,330 --> 00:44:46,360
Up through c8 times
the output from v8.
678
00:44:46,360 --> 00:44:49,790
So this is like just saying, OK.
679
00:44:49,790 --> 00:44:54,900
We know everything
when we know what
680
00:44:54,900 --> 00:44:57,790
T does to each basis vector.
681
00:44:57,790 --> 00:44:58,720
OK.
682
00:44:58,720 --> 00:45:02,000
So those are the
eight things we need.
683
00:45:02,000 --> 00:45:08,330
Now -- but we need these
answers in this basis.
684
00:45:08,330 --> 00:45:13,450
So this first output
is some combination
685
00:45:13,450 --> 00:45:14,665
of the eight basis vectors.
686
00:45:17,480 --> 00:45:24,680
So write T acting on the
first input -- in other words,
687
00:45:24,680 --> 00:45:27,880
write the first output as
a combination of the basis
688
00:45:27,880 --> 00:45:38,040
vectors, say a11 v1 +
a21 v2 and so on a81 v8.
689
00:45:41,740 --> 00:45:50,110
Write T of v2 as some
combination a12 of v1,
690
00:45:50,110 --> 00:45:53,240
a22 of v2 and so on.
691
00:45:53,240 --> 00:45:57,270
I'm creating the matrix
A, column by column.
692
00:45:57,270 --> 00:46:00,050
Those numbers go in
the first column,
693
00:46:00,050 --> 00:46:06,200
these numbers go in the second
column, the matrix A that thi-
694
00:46:06,200 --> 00:46:11,900
this -- this is our matrix that
represents T in this basis is
695
00:46:11,900 --> 00:46:13,110
these numbers.
696
00:46:13,110 --> 00:46:22,290
a11 down to a18, a21
down to a28, and so on.
697
00:46:22,290 --> 00:46:25,060
OK.
698
00:46:25,060 --> 00:46:26,120
That's the recipe.
699
00:46:26,120 --> 00:46:31,090
In other words, if I give you
a transformation, and a basis.
700
00:46:31,090 --> 00:46:32,720
So that's what I
have to give you.
701
00:46:32,720 --> 00:46:37,550
The inputs are the
basis and to tell you
702
00:46:37,550 --> 00:46:38,641
what the transformation
703
00:46:38,641 --> 00:46:39,140
is.
704
00:46:39,140 --> 00:46:42,840
And then, you tell me --
705
00:46:42,840 --> 00:46:47,880
you compute T for
each basis, expand
706
00:46:47,880 --> 00:46:51,270
that result in the
basis, and that gives you
707
00:46:51,270 --> 00:46:55,900
the sixty four numbers
that go into the matrix A.
708
00:46:55,900 --> 00:47:01,460
Let me suppose -- let's close
with the best example of all.
709
00:47:01,460 --> 00:47:08,810
Suppose v1 to v8, this
basis, is the eigenvectors.
710
00:47:08,810 --> 00:47:22,950
Suppose we have an eigenvector
basis so that T(vi)
711
00:47:22,950 --> 00:47:27,660
is in the same direction of vi.
712
00:47:27,660 --> 00:47:29,790
Now, my question is, what is A?
713
00:47:35,370 --> 00:47:37,370
Can you carry through the steps?
714
00:47:37,370 --> 00:47:40,270
Let's do them together, because
we can do it in one minute.
715
00:47:43,200 --> 00:47:47,120
So, we've chosen
this perfect basis.
716
00:47:47,120 --> 00:47:52,010
And, actually, with
signal image processing,
717
00:47:52,010 --> 00:47:55,520
they might look for
the eigenvectors.
718
00:47:55,520 --> 00:47:57,790
But that would take
more calculation
719
00:47:57,790 --> 00:48:02,040
time that just saying, OK,
we'll use the wavelet basis.
720
00:48:02,040 --> 00:48:04,510
Or, OK, we'll use
the Fourier basis.
721
00:48:04,510 --> 00:48:08,130
But the very best basis
is the eigenvector basis.
722
00:48:08,130 --> 00:48:10,320
OK, what's the matrix?
723
00:48:10,320 --> 00:48:12,480
So, what's the first
column of the matrix?
724
00:48:15,180 --> 00:48:17,130
How do I get the first column?
725
00:48:17,130 --> 00:48:19,185
I take the first
basis vector v1.
726
00:48:22,000 --> 00:48:25,070
I opt -- I look to see, what
does the transformation do
727
00:48:25,070 --> 00:48:26,940
to it?
728
00:48:26,940 --> 00:48:29,610
The output is lambda one v1.
729
00:48:29,610 --> 00:48:36,420
I express that output as a
combination so the first input
730
00:48:36,420 --> 00:48:39,590
is v1.
731
00:48:39,590 --> 00:48:47,150
Its output is lambda one v1.
732
00:48:47,150 --> 00:48:51,900
Now write lambda one v1 as
a combination of the basis
733
00:48:51,900 --> 00:48:54,330
vectors, well,
it's already done.
734
00:48:54,330 --> 00:48:56,720
It's just lambda one times
the first basis vector
735
00:48:56,720 --> 00:48:58,360
and zero times the others.
736
00:48:58,360 --> 00:49:03,490
So this first column will
have lambda one and zeroes.
737
00:49:03,490 --> 00:49:05,060
OK.
738
00:49:05,060 --> 00:49:10,030
Second input is v2.
739
00:49:10,030 --> 00:49:15,220
Output is lambda two v2.
740
00:49:15,220 --> 00:49:19,060
OK, write that output as
a combination of the (v)s.
741
00:49:19,060 --> 00:49:20,470
It's already done.
742
00:49:20,470 --> 00:49:23,120
It's just lambda two
times the second v.
743
00:49:23,120 --> 00:49:25,190
So we need, in
the second column,
744
00:49:25,190 --> 00:49:28,040
we have lambda two
times the second v.
745
00:49:28,040 --> 00:49:31,410
Well, you see what's
coming, that in that basis,
746
00:49:31,410 --> 00:49:37,470
in the eigenvector basis,
the matrix is diagonal.
747
00:49:37,470 --> 00:49:40,950
So that's the
perfect basis, that's
748
00:49:40,950 --> 00:49:43,840
the basis we'd love to
have for image processing,
749
00:49:43,840 --> 00:49:47,770
but to find the eigenvectors
of our pixel matrix
750
00:49:47,770 --> 00:49:49,640
would be too expensive.
751
00:49:49,640 --> 00:49:54,110
So we do something
cheaper and close,
752
00:49:54,110 --> 00:49:58,200
which is to choose a
good basis like wavelets.
753
00:49:58,200 --> 00:49:59,130
OK, thanks.
754
00:49:59,130 --> 00:50:04,490
So I'll -- quiz review
on Wednesday, all day.
755
00:50:04,490 --> 00:50:06,040
Thanks.