WEBVTT

00:00:00.000 --> 00:00:02.350
The following content is
provided under a Creative

00:00:02.350 --> 00:00:03.640
Commons license.

00:00:03.640 --> 00:00:06.540
Your support will help MIT
OpenCourseWare continue to

00:00:06.540 --> 00:00:09.970
offer high quality educational
resources for free.

00:00:09.970 --> 00:00:12.780
To make a donation or to view
additional materials from

00:00:12.780 --> 00:00:16.550
hundreds of MIT courses, visit
MIT OpenCourseWare at

00:00:16.550 --> 00:00:17.800
OCW.MIT.edu.

00:00:23.950 --> 00:00:24.520
PROFESSOR: OK.

00:00:24.520 --> 00:00:32.300
We are talking about jointly
Gaussian random variables.

00:00:32.300 --> 00:00:35.240
One comment through all of this
and through all of the

00:00:35.240 --> 00:00:41.860
notes is that you can add a
mean to Gaussian random

00:00:41.860 --> 00:00:44.230
variables, or you
can talk about

00:00:44.230 --> 00:00:47.470
zero-mean random variables.

00:00:47.470 --> 00:00:49.540
Here we're using random
variables mostly

00:00:49.540 --> 00:00:51.370
to talk about noise.

00:00:51.370 --> 00:00:56.440
When we're talking about noise,
you really should be

00:00:56.440 --> 00:01:00.150
talking about zero-mean random
variables, because you can

00:01:00.150 --> 00:01:02.750
always take the mean out.

00:01:02.750 --> 00:01:06.320
And because of that, I don't
like to state everything twice

00:01:06.320 --> 00:01:10.830
once for variables of processes
without a mean, and

00:01:10.830 --> 00:01:14.640
once for variables of processes
with a mean.

00:01:14.640 --> 00:01:17.620
And after looking at the notes,
I think I've been a

00:01:17.620 --> 00:01:21.700
little inconsistent
about that.

00:01:21.700 --> 00:01:25.830
I think the point is you can
keep yourself straight by just

00:01:25.830 --> 00:01:27.840
saying the only thing
important is the

00:01:27.840 --> 00:01:30.350
case without a mean.

00:01:30.350 --> 00:01:34.270
Putting the mean in is something
unfortunately done

00:01:34.270 --> 00:01:37.410
by people who like complexity.

00:01:37.410 --> 00:01:41.440
And they have unfortunately
got in the

00:01:41.440 --> 00:01:43.310
notation on their side.

00:01:43.310 --> 00:01:45.640
So anytime you want
to talk about a

00:01:45.640 --> 00:01:47.680
zero-mean random variable.

00:01:47.680 --> 00:01:49.910
You have to say zero-mean
random variable.

00:01:49.910 --> 00:01:53.060
And if you say random variable,
it means it could

00:01:53.060 --> 00:01:55.220
have a mean or not
have a mean.

00:01:55.220 --> 00:01:58.940
And of course the way the
notation should be stated is

00:01:58.940 --> 00:02:01.300
you talk about random variables
as things which

00:02:01.300 --> 00:02:02.470
don't have means.

00:02:02.470 --> 00:02:05.960
And then you can talk about
random variables plus means as

00:02:05.960 --> 00:02:10.340
things which do have means,
which would make life easier

00:02:10.340 --> 00:02:12.040
but unfortunately it's
not that way.

00:02:12.040 --> 00:02:15.260
So anytime you see something
and are wondering about

00:02:15.260 --> 00:02:20.170
whether I've been careful about
the mean or not, the

00:02:20.170 --> 00:02:23.230
answer is I well might
not have been.

00:02:23.230 --> 00:02:25.720
And two, it's not
very important.

00:02:25.720 --> 00:02:28.160
So anyway.

00:02:28.160 --> 00:02:32.110
Here I'll be talking all
about zero-mean things.

00:02:32.110 --> 00:02:37.210
A k-tuple of zero-mean random
variables is said to be

00:02:37.210 --> 00:02:41.770
jointly Gaussian if you can
express them in this way here.

00:02:41.770 --> 00:02:48.490
Namely, as a linear combination
of IID normal Gaussian

00:02:48.490 --> 00:02:50.220
random variables.
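
[A minimal sketch of this definition, assuming Python with NumPy; the matrix A and the seed are arbitrary choices, not from the lecture.]

```python
import numpy as np

rng = np.random.default_rng(0)
k = 3
A = rng.standard_normal((k, k))  # any fixed k-by-k matrix
N = rng.standard_normal(k)       # one sample of k IID normal (0, 1) variables
Z = A @ N                        # one sample of the jointly Gaussian k-tuple
print(Z)
```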

00:02:50.220 --> 00:02:50.540
OK.

00:02:50.540 --> 00:02:54.240
In your homework, those of you
who've done it already, have

00:02:54.240 --> 00:02:59.160
realized that just having two
Gaussian random variables is

00:02:59.160 --> 00:03:03.070
not enough to make
those two random

00:03:03.070 --> 00:03:05.670
variables be jointly Gaussian.

00:03:05.670 --> 00:03:07.760
They can be individually
Gaussian

00:03:07.760 --> 00:03:09.710
but not jointly Gaussian.
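
[A standard counterexample, sketched here assuming Python with NumPy; the random-sign construction is the usual textbook one, not taken from the lecture. Z1 and Z2 = S*Z1 are each individually Gaussian, but Z1 + Z2 has a point mass at zero, so the pair is not jointly Gaussian.]

```python
import numpy as np

rng = np.random.default_rng(0)
z1 = rng.standard_normal(100_000)
s = rng.choice([-1.0, 1.0], size=z1.shape)  # random sign, independent of z1
z2 = s * z1                                 # also N(0, 1) marginally
print(np.mean(z1 + z2 == 0.0))              # about 0.5: a point mass at zero
```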

00:03:09.710 --> 00:03:12.510
This is sort of important
because when you start

00:03:12.510 --> 00:03:17.120
manipulating things as we will
do when we're generating

00:03:17.120 --> 00:03:20.650
signals to send and things like
this, you can very easily

00:03:20.650 --> 00:03:24.280
wind up with things which look
Gaussian and are appropriately

00:03:24.280 --> 00:03:28.030
modeled as Gaussian, but which
are not jointly Gaussian.

00:03:28.030 --> 00:03:30.870
Things which are jointly
Gaussian are

00:03:30.870 --> 00:03:32.250
defined in this way.

00:03:32.250 --> 00:03:33.840
We will come up with
a couple of other

00:03:33.840 --> 00:03:36.440
definitions of them later.

00:03:36.440 --> 00:03:40.500
But you've undoubtedly been
taught that any old Gaussian

00:03:40.500 --> 00:03:42.840
random variables are
jointly Gaussian.

00:03:42.840 --> 00:03:45.950
You've probably seen joint
densities for them

00:03:45.950 --> 00:03:47.300
and things like this.

00:03:47.300 --> 00:03:50.730
Those joint densities only apply
when you have jointly

00:03:50.730 --> 00:03:52.510
Gaussian random variables.

00:03:52.510 --> 00:03:55.750
And the fundamental definition
is this.

00:03:55.750 --> 00:04:02.010
This fundamental definition
makes sense, because the way

00:04:02.010 --> 00:04:06.540
that you generate these noise
random variables is usually

00:04:06.540 --> 00:04:10.830
from some very large underlying
set of very, very

00:04:10.830 --> 00:04:12.570
small random variables.

00:04:12.570 --> 00:04:15.320
And the central limit theorem
says that when you add up a

00:04:15.320 --> 00:04:18.800
very, very large number of
small underlying random

00:04:18.800 --> 00:04:24.460
variables, and normalize the sum
so it has some reasonable

00:04:24.460 --> 00:04:27.150
variance, then that random
variable is going to be

00:04:27.150 --> 00:04:30.440
appropriately modeled
as being Gaussian.

00:04:30.440 --> 00:04:35.950
If you at the same time look
at linear combinations of

00:04:35.950 --> 00:04:39.880
those things for the same
reason, it's appropriate to

00:04:39.880 --> 00:04:43.780
model each of the random
variables you're looking at as

00:04:43.780 --> 00:04:47.770
a linear combination of some
underlying set of noise

00:04:47.770 --> 00:04:49.800
variables which is
very, very large.

00:04:49.800 --> 00:04:51.490
The underlying variables themselves are probably not Gaussian.

00:04:51.490 --> 00:04:55.290
But here when we're trying to
do this, because of the

00:04:55.290 --> 00:04:59.590
central limit theorem, we just
model these as normal Gaussian

00:04:59.590 --> 00:05:01.810
random variables
to start with.

00:05:01.810 --> 00:05:05.330
So that's where that definition
comes from.

00:05:05.330 --> 00:05:09.680
It's why when you're looking at
noise processes in the real

00:05:09.680 --> 00:05:14.170
world, and trying to model them
in a sensible way this

00:05:14.170 --> 00:05:16.510
jointly Gaussian is the
thing you almost

00:05:16.510 --> 00:05:19.270
always come up with.

00:05:19.270 --> 00:05:20.160
OK.

00:05:20.160 --> 00:05:24.280
So one important point here, and
the notes point this out.

00:05:24.280 --> 00:05:32.280
If each of these random
variables Z sub i is

00:05:32.280 --> 00:05:35.060
independent of each of the
others, and they have

00:05:35.060 --> 00:05:41.355
arbitrary variances, then
because of this formula, the

00:05:41.355 --> 00:05:45.150
set of Z i's are going to
be jointly Gaussian.

00:05:45.150 --> 00:05:46.300
Why is that?

00:05:46.300 --> 00:05:51.960
Well you simply make a matrix
here A, which is diagonal.

00:05:51.960 --> 00:05:55.560
And the elements of this
diagonal matrix are sigma 1

00:05:55.560 --> 00:05:59.840
squared, sigma 2 squared, sigma
3 squared, and so forth.

00:06:06.310 --> 00:06:15.580
If you have a vector Z, which
is then sigma 1 squared down

00:06:15.580 --> 00:06:25.480
to sigma k squared times a noise
vector N1 to N sub k.

00:06:25.480 --> 00:06:32.340
What you wind up with is Z sub
i is going to be equal to --

00:06:32.340 --> 00:06:34.540
I guess I don't want those
squares in there.

00:06:34.540 --> 00:06:36.010
Sigma 1 up to sigma k --

00:06:36.010 --> 00:06:42.670
Z sub i is going to be equal to
sigma i, N sub i; N sub i

00:06:42.670 --> 00:06:45.240
is a Gaussian random variable
with variance one and

00:06:45.240 --> 00:06:50.040
therefore Z sub i is a Gaussian
random variable with variance

00:06:50.040 --> 00:06:51.640
sigma sub i squared.
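
[A sketch of this diagonal special case, assuming NumPy; the variances are arbitrary.]

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = np.array([1.0, 2.0, 0.5])   # arbitrary standard deviations
A = np.diag(sigma)                  # the diagonal matrix just described
Z = A @ rng.standard_normal(3)      # Z_i = sigma_i * N_i
```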

00:06:51.640 --> 00:06:52.980
OK.

00:06:52.980 --> 00:06:58.270
So anyway, one special case of
this formula is anytime that

00:06:58.270 --> 00:07:01.840
you want to deal with a set of
independent Gaussian random

00:07:01.840 --> 00:07:08.720
variables with arbitrary
variances, they are always

00:07:08.720 --> 00:07:11.690
going to be jointly Gaussian.

00:07:11.690 --> 00:07:13.280
Saying that they're
uncorrelated is

00:07:13.280 --> 00:07:14.650
not enough for that.

00:07:14.650 --> 00:07:17.930
You really need the statement
that they're independent of

00:07:17.930 --> 00:07:20.010
each other.

00:07:20.010 --> 00:07:21.820
OK.

00:07:21.820 --> 00:07:24.980
That's sort of where
we were last time.

00:07:27.950 --> 00:07:31.750
When you look at this formula,
you can look at it in terms of

00:07:31.750 --> 00:07:33.880
sample values.

00:07:33.880 --> 00:07:38.980
And if we look at a sample value
what's happening is that

00:07:38.980 --> 00:07:44.940
the sample value of the random
vector Z, namely little z, is

00:07:44.940 --> 00:07:49.320
going to be defined as some
matrix A times this sample

00:07:49.320 --> 00:07:53.650
value for the normal
vector N. OK.

00:07:53.650 --> 00:07:57.060
And what we want to look
at is geometrically

00:07:57.060 --> 00:07:59.090
what happens there.

00:07:59.090 --> 00:08:07.210
Well this matrix A is going to
map a unit vector, e sub i,

00:08:07.210 --> 00:08:10.750
into the i'th column
of A. Why is that?

00:08:10.750 --> 00:08:17.140
Well you look at A, which is
whatever it is, A sub 1, 1,

00:08:17.140 --> 00:08:22.280
blah, blah, blah up
to A sub k, k.

00:08:22.280 --> 00:08:28.040
And you look at multiplying it
by a vector which is nonzero only

00:08:28.040 --> 00:08:31.090
in the i'th position.

00:08:31.090 --> 00:08:37.060
And what's this matrix
multiplication going to do?

00:08:37.060 --> 00:08:40.720
It's going to simply pick
out the i'th column

00:08:40.720 --> 00:08:42.490
of this matrix here.

00:08:42.490 --> 00:08:43.040
OK.

00:08:43.040 --> 00:08:48.670
So A is going to map e sub i
into the i'th column of A. OK.

00:08:48.670 --> 00:08:55.710
So then the question is what is
this matrix A going to do

00:08:55.710 --> 00:09:01.030
to some small cube
in the n plane?

00:09:01.030 --> 00:09:01.340
OK.

00:09:01.340 --> 00:09:07.110
If you take a small cube in the
n plane, the side from 0 to delta

00:09:07.110 --> 00:09:15.170
along the n1 line is going to
map into 0 to this point here,

00:09:15.170 --> 00:09:20.390
which is A e sub 1.

00:09:20.390 --> 00:09:26.630
This point here is going to
map into A times e sub 2.

00:09:26.630 --> 00:09:29.600
This is just drawing for two
dimensions of course.

00:09:29.600 --> 00:09:33.280
So in fact all the points in
this cube are going to get mapped

00:09:33.280 --> 00:09:35.890
into this little
rectangle here.

00:09:35.890 --> 00:09:37.450
OK.

00:09:37.450 --> 00:09:41.740
Namely, that's what a matrix
times a vector is going to do.

00:09:41.740 --> 00:09:42.820
Anybody awake out there?

00:09:42.820 --> 00:09:45.720
You're all looking at me as
if I'm totally insane.

00:09:50.180 --> 00:09:50.480
OK.

00:09:50.480 --> 00:09:54.050
Is everyone following this?

00:09:54.050 --> 00:09:55.780
OK.

00:09:55.780 --> 00:09:58.870
So perhaps this is
just too trivial.

00:09:58.870 --> 00:10:00.690
I hope not.

00:10:00.690 --> 00:10:01.080
OK.

00:10:01.080 --> 00:10:05.770
So unit cubes get mapped
into rectangles here.

00:10:05.770 --> 00:10:10.030
If I take a unit cube up here,
it's going to get mapped into

00:10:10.030 --> 00:10:12.780
the same kind of unit
rectangle here.

00:10:12.780 --> 00:10:16.520
If I visualize tiling this plane
here with little tiny

00:10:16.520 --> 00:10:20.320
cubes, delta on a side, what's
going to happen?

00:10:20.320 --> 00:10:24.690
Each of these little cubes
is going to map into a

00:10:24.690 --> 00:10:28.460
parallelogram over here, and
these parallelograms are going to

00:10:28.460 --> 00:10:31.120
tile this space here.

00:10:31.120 --> 00:10:31.490
OK.

00:10:31.490 --> 00:10:35.700
Which means that each little
cube here maps into one of

00:10:35.700 --> 00:10:36.940
these rectangles.

00:10:36.940 --> 00:10:40.940
Each rectangle here maps back
into a little cube here.

00:10:40.940 --> 00:10:42.490
Which means that I'm
looking at a very

00:10:42.490 --> 00:10:45.080
special case of A here.

00:10:45.080 --> 00:10:48.330
I'm looking at a case where
A is non-singular.

00:10:48.330 --> 00:10:51.820
In other words, I can get to
any point in this plane by

00:10:51.820 --> 00:10:54.420
starting with some point
in this plane.

00:10:54.420 --> 00:10:58.440
Which means for any point here,
I can go back here also.

00:10:58.440 --> 00:11:05.250
In other words, I can also write
n is equal to A to the

00:11:05.250 --> 00:11:10.310
minus 1 times Z. And this
inverse matrix has to exist.

00:11:10.310 --> 00:11:10.570
OK.

00:11:10.570 --> 00:11:13.380
That's what you mean
geometrically by a

00:11:13.380 --> 00:11:14.420
non-singular matrix.

00:11:14.420 --> 00:11:18.150
It means that all points in
this plane get mapped into

00:11:18.150 --> 00:11:19.910
points in this plane.

00:11:19.910 --> 00:11:23.650
Each point gets mapped into only
one point in this plane, and each

00:11:23.650 --> 00:11:26.030
point there is the image of only
one point in this plane.

00:11:26.030 --> 00:11:28.490
Where every point in this
plane is the map

00:11:28.490 --> 00:11:29.830
of some point here.

00:11:29.830 --> 00:11:31.820
In other words you can go
from here to there.

00:11:31.820 --> 00:11:35.060
You can also go back again.

00:11:35.060 --> 00:11:35.440
OK.

00:11:35.440 --> 00:11:38.900
The volume of a parallelepiped
here, and this is in an

00:11:38.900 --> 00:11:42.610
arbitrary number of dimensions,
is going to be the magnitude of the determinant

00:11:42.610 --> 00:11:46.420
of A. And you all know how
to find determinants.

00:11:46.420 --> 00:11:48.680
Namely you program a computer
to do it, and the

00:11:48.680 --> 00:11:50.110
computer does it.

00:11:50.110 --> 00:11:52.940
I mean it used to be we had to
do this by an enormous amount

00:11:52.940 --> 00:11:54.340
of calculation.

00:11:54.340 --> 00:11:57.780
And of course nobody
does that anymore.

00:11:57.780 --> 00:11:58.240
OK.

00:11:58.240 --> 00:12:01.060
So we can find the volume of
this parallelepiped, it's this

00:12:01.060 --> 00:12:02.970
determinant.
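
[A quick numeric sketch of the volume fact, assuming NumPy; the 2-by-2 matrix is arbitrary. The unit square spanned by e1 and e2 maps to the parallelogram spanned by the columns of A, with area equal to the magnitude of det A.]

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.5, 1.5]])
area = abs(np.linalg.det(A))  # area of the image parallelogram
print(area)                   # 2*1.5 - 1*0.5 = 2.5
```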

00:12:02.970 --> 00:12:05.680
And what does all
of that mean?

00:12:05.680 --> 00:12:08.410
Well first let me ask
you the question.

00:12:08.410 --> 00:12:11.280
What's going to happen if
the determinant of A

00:12:11.280 --> 00:12:13.760
is equal to zero?

00:12:13.760 --> 00:12:15.410
What does that mean
geometrically?

00:12:18.380 --> 00:12:20.080
What does that mean here
in terms of this

00:12:20.080 --> 00:12:21.650
two dimensional diagram?

00:12:21.650 --> 00:12:23.580
AUDIENCE: [INAUDIBLE]

00:12:23.580 --> 00:12:23.980
PROFESSOR: What?

00:12:23.980 --> 00:12:25.860
AUDIENCE: Projection
onto a line.

00:12:25.860 --> 00:12:26.550
PROFESSOR: Yeah.

00:12:26.550 --> 00:12:30.220
This little cube here is simply
going to get projected

00:12:30.220 --> 00:12:31.620
onto some line here.

00:12:31.620 --> 00:12:34.890
Like for example that.

00:12:34.890 --> 00:12:39.140
In other words it means that
this matrix is not invertible

00:12:39.140 --> 00:12:42.960
for one thing, but it also means
everything here gets

00:12:42.960 --> 00:12:45.220
mapped onto some lower
dimensional

00:12:45.220 --> 00:12:47.810
sub-space here in general.

00:12:47.810 --> 00:12:48.750
OK.

00:12:48.750 --> 00:12:50.890
Now remember that for a minute,
and we'll come back to

00:12:50.890 --> 00:12:53.610
that when we start talking about
probability densities.

00:12:58.890 --> 00:13:03.820
OK because of the picture we
were just looking at, the

00:13:03.820 --> 00:13:11.710
density of the random variables
Z at A times N,

00:13:11.710 --> 00:13:17.640
namely the density of Z at this
particular value here is

00:13:17.640 --> 00:13:23.170
just the density that we get
corresponding to the density

00:13:23.170 --> 00:13:26.830
of some point here mapped
over into here.

00:13:26.830 --> 00:13:29.620
And what's going to happen when
we take a point here and

00:13:29.620 --> 00:13:31.540
map it over into here?

00:13:31.540 --> 00:13:33.960
If you have a certain amount
of density here, which is

00:13:33.960 --> 00:13:37.220
probability per unit volume.

00:13:37.220 --> 00:13:40.770
Now when you map it into here,
and the determinant is bigger

00:13:40.770 --> 00:13:44.330
than zero, what you're doing
is mapping a little volume

00:13:44.330 --> 00:13:46.930
into a big volume.

00:13:46.930 --> 00:13:49.660
And if you're doing that over
a small enough region where the

00:13:49.660 --> 00:13:53.370
probability density is
essentially fixed, what's

00:13:53.370 --> 00:13:57.240
going to happen to the
probability density over here?

00:13:57.240 --> 00:14:01.400
It's going to get scaled down,
and it's going to get scaled

00:14:01.400 --> 00:14:04.850
down precisely by that
determinant.

00:14:04.850 --> 00:14:05.560
OK.

00:14:05.560 --> 00:14:09.800
So what this is saying is the
probability density of the

00:14:09.800 --> 00:14:13.000
random variable z, which is
linear combination of these

00:14:13.000 --> 00:14:17.810
normal random variables, is in
fact the probability density

00:14:17.810 --> 00:14:21.530
of this normal vector N, and we
know what that probability

00:14:21.530 --> 00:14:26.250
density is, divided by this
determinant of A.

00:14:26.250 --> 00:14:28.480
In fact this is a general
formula for any old

00:14:28.480 --> 00:14:30.290
probability density at all.

00:14:30.290 --> 00:14:32.660
You can start out with anything
which you call a

00:14:32.660 --> 00:14:37.210
random vector N and you can
derive the probability density

00:14:37.210 --> 00:14:41.030
of any linear combination
of those elements in

00:14:41.030 --> 00:14:42.680
precisely this way.
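
[A sketch checking this change-of-variables formula numerically, assuming NumPy; the matrix and the test point are arbitrary. The claim is that f_N(A^{-1} z), divided by the magnitude of det A, agrees with the Gaussian density with covariance A A transpose.]

```python
import numpy as np

def f_n(n):
    """Density of k IID normal (0, 1) variables at the point n."""
    k = len(n)
    return np.exp(-0.5 * n @ n) / (2 * np.pi) ** (k / 2)

A = np.array([[2.0, 1.0], [0.5, 1.5]])
z = np.array([0.3, -0.7])
lhs = f_n(np.linalg.solve(A, z)) / abs(np.linalg.det(A))

K = A @ A.T                          # covariance of Z = A N
rhs = np.exp(-0.5 * z @ np.linalg.solve(K, z)) / (
    2 * np.pi * np.sqrt(np.linalg.det(K)))  # (2 pi)^(k/2) with k = 2
print(lhs, rhs)                      # the two agree
```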

00:14:42.680 --> 00:14:49.620
So long as this volume element
is non-zero, which means

00:14:49.620 --> 00:14:53.640
you're not mapping an entire
space into a sub-space.

00:14:53.640 --> 00:14:57.720
When you're mapping an entire
space into a sub-space and you

00:14:57.720 --> 00:15:02.270
define density as being density
per unit volume, of

00:15:02.270 --> 00:15:06.140
course the density in this map
space doesn't exist anymore.

00:15:06.140 --> 00:15:09.370
Which is exactly what
this formula says.

00:15:09.370 --> 00:15:12.710
If this determinant is zero, it
means this density here is

00:15:12.710 --> 00:15:17.980
going to be infinite in the
regions where this z exists at

00:15:17.980 --> 00:15:21.830
all, which is just in this
linear sub-space, and what do

00:15:21.830 --> 00:15:23.920
you do about that?

00:15:23.920 --> 00:15:27.970
I mean do you get all
frustrated about it?

00:15:27.970 --> 00:15:31.540
Or do you say what's going
on and treat it in

00:15:31.540 --> 00:15:32.790
some sensible way?

00:15:37.200 --> 00:15:41.080
I mean the thing is
happening here.

00:15:41.080 --> 00:15:42.830
And this talks about
it a little bit.

00:15:42.830 --> 00:15:46.210
If A is singular, then
it is going to map Rk

00:15:46.210 --> 00:15:48.240
into a proper sub-space.

00:15:48.240 --> 00:15:50.700
Determinant A is going
to be equal to 0.

00:15:50.700 --> 00:15:53.400
The density doesn't exist.

00:15:53.400 --> 00:15:56.230
So what do you do about it?

00:15:56.230 --> 00:16:01.820
I mean what does this mean
if you're mapping

00:16:01.820 --> 00:16:03.330
into a smaller sub-space?

00:16:06.840 --> 00:16:11.180
What does it mean in terms
of this diagram here?

00:16:13.680 --> 00:16:16.150
Well in the diagram here
it's pretty clear.

00:16:16.150 --> 00:16:18.580
Because these little cubes here
are getting mapped into

00:16:18.580 --> 00:16:20.880
straight lines here.

00:16:20.880 --> 00:16:23.310
Yeah?

00:16:23.310 --> 00:16:23.630
What?

00:16:23.630 --> 00:16:27.110
AUDIENCE: [INAUDIBLE]

00:16:27.110 --> 00:16:31.640
PROFESSOR: Some linear
combinations of this are being

00:16:31.640 --> 00:16:32.650
mapped into 0.

00:16:32.650 --> 00:16:39.050
Namely if this straight line
is this way, any Z which is

00:16:39.050 --> 00:16:43.700
going in this direction is
being mapped into 0.

00:16:43.700 --> 00:16:50.520
Any vector Z which is going in
this direction has to be

00:16:50.520 --> 00:16:53.060
identically equal to 0.

00:16:53.060 --> 00:16:59.010
In other words some linear
combination of n1 and n2 is a

00:16:59.010 --> 00:17:03.740
random variable which takes on
the value 0 identically.

00:17:03.740 --> 00:17:09.420
Why would you try to represent that
in a probabilistic sense?

00:17:09.420 --> 00:17:12.840
Why don't you just take it out
of consideration altogether?

00:17:12.840 --> 00:17:17.890
Here what it means is that z1
and z2 are simply linear

00:17:17.890 --> 00:17:20.390
combinations of each other.

00:17:20.390 --> 00:17:20.750
OK.

00:17:20.750 --> 00:17:24.990
In other words once you know
what the sample value of z1

00:17:24.990 --> 00:17:29.220
is, you can find the
sample values z2.

00:17:29.220 --> 00:17:35.070
In other words z2 is a linear
combination of z1.

00:17:35.070 --> 00:17:39.600
It's linearly dependent on z1,
which means that you can

00:17:39.600 --> 00:17:43.030
identify it exactly once
you know what z1 is.

00:17:43.030 --> 00:17:45.150
Which means you might as
well not call it a

00:17:45.150 --> 00:17:47.340
random variable at all.

00:17:47.340 --> 00:17:50.880
Which means you might as well
view this situation where the

00:17:50.880 --> 00:17:56.650
determinant is 0 as where the
vector Z is really just one

00:17:56.650 --> 00:18:00.320
random variable, and everything
else is determined.

00:18:00.320 --> 00:18:05.570
So you throw out these extra
random variables,

00:18:05.570 --> 00:18:08.130
pseudo-random variables, which
are really just linear

00:18:08.130 --> 00:18:09.830
combinations of the others.

00:18:09.830 --> 00:18:12.060
So you deal with a smaller
dimensional set.

00:18:12.060 --> 00:18:14.690
You find the probability
density of the smaller

00:18:14.690 --> 00:18:18.270
dimensional set, and you don't
worry about all of these

00:18:18.270 --> 00:18:22.390
mathematical peculiarities that
would arise otherwise.

00:18:22.390 --> 00:18:22.850
OK.

00:18:22.850 --> 00:18:28.270
So once we do that, A is going
to be non-singular.

00:18:28.270 --> 00:18:30.500
Because we're going to be left
with a set of random

00:18:30.500 --> 00:18:32.880
variables, which are
not linearly

00:18:32.880 --> 00:18:34.580
dependent on each other.

00:18:34.580 --> 00:18:37.580
They can be statistically
dependent on each other, but

00:18:37.580 --> 00:18:40.490
not linearly dependent, OK.

00:18:40.490 --> 00:18:46.170
So for all z then, since
determinant A is not 0, the

00:18:46.170 --> 00:18:51.090
probability density at some
arbitrary vector Z is going to

00:18:51.090 --> 00:18:59.400
be the normal joint density
evaluated at A to the minus 1 z

00:18:59.400 --> 00:19:03.590
divided by the determinant
of A. OK.

00:19:03.590 --> 00:19:05.850
In other words we're just
working the map backwards.

00:19:05.850 --> 00:19:09.500
This formula is the same as this
formula, except instead

00:19:09.500 --> 00:19:12.230
of writing An here, we're
writing z here.

00:19:12.230 --> 00:19:17.450
And when A n is equal to z, then
little n has to be equal to A

00:19:17.450 --> 00:19:19.370
to the minus 1 times z.

00:19:19.370 --> 00:19:20.410
And what does that say?

00:19:20.410 --> 00:19:23.870
It says that the joint
probability density has to be

00:19:23.870 --> 00:19:28.180
equal to this quantity here.

00:19:28.180 --> 00:19:31.075
Which in matrix terms looks
rather simple, it

00:19:31.075 --> 00:19:32.890
looks rather nice.

00:19:32.890 --> 00:19:34.570
You can rewrite this.

00:19:34.570 --> 00:19:38.340
This is a norm here,
so it's this vector

00:19:38.340 --> 00:19:41.950
there times this vector.

00:19:41.950 --> 00:19:44.960
Where in fact when you want to
multiply vectors in this way,

00:19:44.960 --> 00:19:47.890
you're taking the inner product of
the vector with itself.

00:19:47.890 --> 00:19:50.050
These are real vectors
we're looking at.

00:19:50.050 --> 00:19:52.700
Because we're trying to model
real noise, because we're

00:19:52.700 --> 00:19:55.840
modeling the noise on
communication channels.

00:19:55.840 --> 00:20:01.270
So this is going to be the
inner product of A to the

00:20:01.270 --> 00:20:05.670
minus 1z with itself, which
means you want to look at the

00:20:05.670 --> 00:20:08.490
transform of a to
the minus 1z.

00:20:08.490 --> 00:20:11.530
Now what's the transform
of a to the minus 1z?

00:20:11.530 --> 00:20:15.590
It's z transform times a to
the minus 1 transform.

00:20:15.590 --> 00:20:19.190
So we wind up with a to the
minus 1 transform times a to

00:20:19.190 --> 00:20:21.030
the minus 1 times z.

00:20:21.030 --> 00:20:25.700
So we have some kind of peculiar
bilinear form here.

00:20:25.700 --> 00:20:32.270
So for any sample value of the
random vector Z we can find

00:20:32.270 --> 00:20:37.510
the probability density in terms
of this quantity here.

00:20:37.510 --> 00:20:40.920
Which looks a little bit
peculiar, but it

00:20:40.920 --> 00:20:42.170
doesn't look too bad.
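
[Written out, the density being described is the following; the (2 pi)^(k/2) factor is the standard normalization for k IID normal variables, supplied here for completeness.]

```latex
f_{\mathbf{Z}}(\mathbf{z})
  = \frac{f_{\mathbf{N}}(A^{-1}\mathbf{z})}{|\det A|}
  = \frac{1}{(2\pi)^{k/2}\,|\det A|}
    \exp\!\Bigl(-\tfrac{1}{2}\,\mathbf{z}^{\mathsf T}(A^{-1})^{\mathsf T}A^{-1}\mathbf{z}\Bigr)
```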

00:20:45.720 --> 00:20:46.040
OK.

00:20:46.040 --> 00:20:48.280
Now we want to simplify
that a little bit.

00:20:51.670 --> 00:20:54.770
Anytime you're dealing with
zero-mean random variables --

00:20:54.770 --> 00:20:57.430
now remember I'm going to forget
to say zero-mean half

00:20:57.430 --> 00:20:59.940
the time because everything is
concerned with zero-mean

00:20:59.940 --> 00:21:01.470
random variables.

00:21:01.470 --> 00:21:08.240
The covariance of Z1 and Z2 is
expected value of Z1 times Z2.

00:21:08.240 --> 00:21:11.630
So if you have a k-tuple Z,
the covariance is a matrix

00:21:11.630 --> 00:21:17.370
whose i,j element is expected
value of Zi times Zj.

00:21:17.370 --> 00:21:21.180
And what that means is that the
covariance matrix, this is

00:21:21.180 --> 00:21:26.330
a matrix now, it's going to be
the expected value of z times

00:21:26.330 --> 00:21:28.010
z transpose.

00:21:28.010 --> 00:21:33.400
Z is a random vector, which is
a column random vector, Z

00:21:33.400 --> 00:21:38.890
transpose is a row random
vector, which is simply this

00:21:38.890 --> 00:21:42.550
turned upside down, turned
by 90 degrees.

00:21:42.550 --> 00:21:44.820
Now when you multiply the
components of this vector

00:21:44.820 --> 00:21:48.580
together, you can see that what
you get is the elements

00:21:48.580 --> 00:21:50.130
of this covariance matrix.

00:21:50.130 --> 00:21:53.770
In other words this is just
standard matrix manipulation,

00:21:53.770 --> 00:21:58.560
which I hope most of you are at
least partly familiar with.

00:21:58.560 --> 00:21:59.000
OK.

00:21:59.000 --> 00:22:02.280
When we then talk about the
expected value of Z times Z

00:22:02.280 --> 00:22:07.210
transpose we can write this as
the expected value of A times

00:22:07.210 --> 00:22:09.410
N, which is what z is.

00:22:09.410 --> 00:22:15.640
N transpose times A transpose,
which is what Z transpose is.

00:22:15.640 --> 00:22:20.000
And miraculously here the N and
the N transpose are in the

00:22:20.000 --> 00:22:24.330
middle here, where it's easy
to deal with them, because

00:22:24.330 --> 00:22:28.750
these are normal Gaussian
random variables.

00:22:28.750 --> 00:22:33.380
And when you look at this column
times this row, since

00:22:33.380 --> 00:22:37.000
all the components of N are
independent of each other, and

00:22:37.000 --> 00:22:40.670
all of them have variance one,
the expected value of N times

00:22:40.670 --> 00:22:44.940
N transpose is simply
the identity matrix.

00:22:44.940 --> 00:22:48.810
All the randomness goes out of
this which it obviously has to

00:22:48.810 --> 00:22:51.540
because we're looking at a
covariance which is just a

00:22:51.540 --> 00:22:54.000
matrix and not something
random.

00:22:54.000 --> 00:22:57.010
So you wind up with this
covariance matrix is equal to

00:22:57.010 --> 00:23:02.830
a rather arbitrary matrix A,
but not singular, times the

00:23:02.830 --> 00:23:06.850
transpose of that matrix.
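
[A quick empirical sketch of this identity, assuming NumPy; the matrix A is arbitrary. The sample covariance of many draws of Z = A N approaches A times A transpose.]

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0], [0.5, 1.5]])
N = rng.standard_normal((2, 200_000))  # each column is an IID normal vector
Z = A @ N
print(np.cov(Z))                       # approximately A @ A.T
print(A @ A.T)
```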

00:23:06.850 --> 00:23:07.230
OK.

00:23:07.230 --> 00:23:10.340
We've assumed that A is
non-singular and therefore

00:23:10.340 --> 00:23:17.620
it's not too hard to see that K
sub Z is non-singular also.

00:23:17.620 --> 00:23:22.600
And explicitly for K sub Z, namely
its covariance matrix,

00:23:22.600 --> 00:23:28.980
the inverse of it, is A to the
minus 1 transpose times A to

00:23:28.980 --> 00:23:29.940
the minus 1.

00:23:29.940 --> 00:23:32.330
Namely you take the inverse and
you start flipping things

00:23:32.330 --> 00:23:35.210
and you do all of these
neat matrix things

00:23:35.210 --> 00:23:37.560
that you always do.

00:23:37.560 --> 00:23:41.290
And you should review them
if you've forgotten that.

00:23:41.290 --> 00:23:44.930
So that when we write our
probability density, which was

00:23:44.930 --> 00:23:49.180
this in terms of this
transformation here, what we

00:23:49.180 --> 00:23:54.270
get is the density of Z is
equal to, in place of

00:23:54.270 --> 00:23:57.620
determinant of A, we get the
square root of the determinant

00:23:57.620 --> 00:24:03.170
of K sub Z. You probably
don't remember that, but that's

00:24:03.170 --> 00:24:03.890
what you get.

00:24:03.890 --> 00:24:07.140
And it's sort of a blah.

00:24:07.140 --> 00:24:09.110
Here this is more interesting.

00:24:09.110 --> 00:24:13.590
This is minus 1/2 z transpose
times Kz to the

00:24:13.590 --> 00:24:15.560
minus 1 times z.
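
[Written out with the standard normalization, the density being described is:]

```latex
f_{\mathbf{Z}}(\mathbf{z})
  = \frac{1}{(2\pi)^{k/2}\sqrt{\det K_{\mathbf{Z}}}}
    \exp\!\Bigl(-\tfrac{1}{2}\,\mathbf{z}^{\mathsf T}K_{\mathbf{Z}}^{-1}\mathbf{z}\Bigr),
  \qquad K_{\mathbf{Z}} = A A^{\mathsf T}
```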

00:24:15.560 --> 00:24:18.640
What does this tell you?

00:24:18.640 --> 00:24:21.490
Look at this formula.

00:24:21.490 --> 00:24:26.510
Is it just a big formula or
what does it say to you?

00:24:26.510 --> 00:24:30.860
You got to look at these things
and see what they say.

00:24:30.860 --> 00:24:33.890
I mean we've gone through
a lot of work to derive

00:24:33.890 --> 00:24:35.140
something here.

00:24:43.140 --> 00:24:46.170
AUDIENCE: [INAUDIBLE]

00:24:46.170 --> 00:24:47.320
PROFESSOR: Well it
is Gaussian.

00:24:47.320 --> 00:24:48.890
Yes.

00:24:48.890 --> 00:24:53.400
I mean that's the way we define
jointly Gaussian.

00:24:53.400 --> 00:24:55.760
But what's the funny
thing about

00:24:55.760 --> 00:24:57.780
this probability density?

00:24:57.780 --> 00:24:59.030
What does it depend on?

00:25:09.090 --> 00:25:12.300
What do I have to tell you in
order for you to calculate

00:25:12.300 --> 00:25:15.140
this probability density
for every z you

00:25:15.140 --> 00:25:18.500
want to plug in here?

00:25:18.500 --> 00:25:21.750
I have to tell you what this
covariance matrix is.

00:25:21.750 --> 00:25:24.980
And once I tell you what the
covariance matrix is, there is

00:25:24.980 --> 00:25:26.960
nothing more to be specified.

00:25:26.960 --> 00:25:30.200
In other words, a jointly
Gaussian random vector is

00:25:30.200 --> 00:25:34.020
completely specified by
its covariance matrix.

00:25:34.020 --> 00:25:36.710
And it's specified exactly
this way by

00:25:36.710 --> 00:25:38.880
its covariance matrix.

00:25:38.880 --> 00:25:39.100
OK.

00:25:39.100 --> 00:25:40.350
There's nothing more there.

00:25:43.830 --> 00:25:46.950
So this says anytime you're
dealing with jointly Gaussian,

00:25:46.950 --> 00:25:50.100
the only thing you have to
be interested in is this

00:25:50.100 --> 00:25:52.520
covariance here.

00:25:52.520 --> 00:25:55.780
Namely all you need to have
jointly Gaussian is somebody

00:25:55.780 --> 00:26:00.290
has to tell you what the
covariance is, and somebody

00:26:00.290 --> 00:26:07.230
has to tell you also that
it's jointly Gaussian.

00:26:07.230 --> 00:26:10.420
Jointly Gaussian plus a given
covariance specifies the

00:26:10.420 --> 00:26:13.870
probability density.

00:26:13.870 --> 00:26:14.420
OK.

00:26:14.420 --> 00:26:17.790
What does that tell you?

00:26:17.790 --> 00:26:22.100
Well let's look at an example
where we just have two random

00:26:22.100 --> 00:26:25.020
variables, Z1 and Z2.

00:26:25.020 --> 00:26:28.870
then the expected value of Z1
squared is the upper

00:26:28.870 --> 00:26:35.520
left-hand element of that
covariance matrix, which we'll

00:26:35.520 --> 00:26:37.380
call sigma 1 squared.

00:26:37.380 --> 00:26:44.050
The lower right-hand element of
the matrix is K22, which we'll

00:26:44.050 --> 00:26:45.540
call sigma 2 squared.

00:26:48.490 --> 00:26:51.430
And we're going to let rho be
the normalized covariance.

00:26:51.430 --> 00:26:54.690
We're just defining a bunch of
things here, because this is

00:26:54.690 --> 00:26:57.470
the way people usually
define this.

00:26:57.470 --> 00:27:02.230
So rho will be the normalized
cross covariance.

00:27:02.230 --> 00:27:06.100
Then the determinant of
K sub z is this mess here.

00:27:06.100 --> 00:27:09.960
For A to be non-singular, we
have to have the magnitude of rho less than 1.

00:27:09.960 --> 00:27:13.060
If rho is equal to 1 then this
determinant is going to be

00:27:13.060 --> 00:27:16.480
equal to 0, and we're back in
this awful case that we don't

00:27:16.480 --> 00:27:18.260
want to think about.

00:27:18.260 --> 00:27:21.600
So then if we go through the
trouble of finding out what

00:27:21.600 --> 00:27:27.580
the inverse of K sub z is, we
find this 1 over 1 minus rho

00:27:27.580 --> 00:27:30.440
squared times this matrix here.

00:27:30.440 --> 00:27:34.710
The probability density plugging
this in is this big

00:27:34.710 --> 00:27:36.500
thing here.
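
[For reference, the "big thing" is the standard zero-mean bivariate Gaussian density; it is reconstructed here from the definitions above rather than read from the slide.]

```latex
f_{Z_1 Z_2}(z_1, z_2)
  = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}
    \exp\!\left(-\frac{1}{2(1-\rho^2)}
      \left[\frac{z_1^2}{\sigma_1^2}
        - \frac{2\rho\,z_1 z_2}{\sigma_1\sigma_2}
        + \frac{z_2^2}{\sigma_2^2}\right]\right)
```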

00:27:36.500 --> 00:27:38.140
OK what does that tell you?

00:27:45.580 --> 00:27:49.080
Well the thing that it tells
me is that I never want to

00:27:49.080 --> 00:27:51.880
deal with this, and I
particularly don't want to

00:27:51.880 --> 00:27:56.010
deal with it if I'm dealing with
three or more variables.

00:27:56.010 --> 00:27:56.400
OK.

00:27:56.400 --> 00:27:59.250
In other words the interesting
thing here is the simple

00:27:59.250 --> 00:28:07.470
formula we had before, which
is this formula.

00:28:07.470 --> 00:28:08.470
OK.

00:28:08.470 --> 00:28:13.150
And we have computers these
days, and given nice

00:28:13.150 --> 00:28:17.130
formulas like this, there are
standard computer routines to

00:28:17.130 --> 00:28:19.930
calculate things like this.

00:28:19.930 --> 00:28:24.910
And you never want to look at
some awful mess like this.

00:28:24.910 --> 00:28:26.700
OK.

00:28:26.700 --> 00:28:29.560
And if you put a mean into here,
which you will see in

00:28:29.560 --> 00:28:34.020
every textbook on random
variables and probability you

00:28:34.020 --> 00:28:38.410
ever look at, this thing becomes
so ugly that you were

00:28:38.410 --> 00:28:41.530
probably convinced before you
took this class that jointly

00:28:41.530 --> 00:28:44.420
Gaussian random variables were
things you wanted to avoid

00:28:44.420 --> 00:28:45.700
like the plague.

00:28:45.700 --> 00:28:48.460
And if you really have to deal
with explicit formulas like

00:28:48.460 --> 00:28:50.000
this, you're absolutely right.

00:28:50.000 --> 00:28:53.130
You do want to avoid them like
the plague, because you can't

00:28:53.130 --> 00:28:57.140
get any insight from what that
says, or at least I can't.

00:28:57.140 --> 00:29:00.420
So I say OK, we have
to deal with this.

00:29:00.420 --> 00:29:03.070
But yet we like to get a little
more insight about what

00:29:03.070 --> 00:29:05.190
this means.

00:29:05.190 --> 00:29:08.070
And to do this, we like to find
a little bit more about

00:29:08.070 --> 00:29:11.250
what these bilinear forms
are all about.

00:29:11.250 --> 00:29:14.540
Those of you who have taken any
course on linear algebra

00:29:14.540 --> 00:29:18.190
have dealt with these
bilinear forms.

00:29:18.190 --> 00:29:21.100
And played with them forever.

00:29:21.100 --> 00:29:24.720
And those of you who haven't are
probably puzzled about how

00:29:24.720 --> 00:29:26.270
to deal with them.

00:29:26.270 --> 00:29:30.850
The notes have an appendix,
which is about two pages long

00:29:30.850 --> 00:29:36.880
which tells you what you have to
know about these matrices.

00:29:36.880 --> 00:29:42.610
And I will just sort of quote
those results as we go.

00:29:42.610 --> 00:29:47.880
Incidentally those results
are not hard to derive.

00:29:47.880 --> 00:29:50.550
And not hard to find
out about.

00:29:50.550 --> 00:29:53.660
You can simply derive them on
your own, or you can look at

00:29:53.660 --> 00:29:56.290
Strang's book on linear algebra,
which is about the

00:29:56.290 --> 00:29:58.620
simplest way to get them.

00:29:58.620 --> 00:30:04.680
And that's all you need to do.

00:30:04.680 --> 00:30:05.070
OK.

00:30:05.070 --> 00:30:08.560
We've said the probability
density depends on this

00:30:08.560 --> 00:30:12.040
bilinear form z transpose
times Kz to the

00:30:12.040 --> 00:30:14.600
minus 1 time z.

00:30:14.600 --> 00:30:15.200
What is this?

00:30:15.200 --> 00:30:17.830
Is this a matrix or
a vector or what?

00:30:22.590 --> 00:30:24.190
How many people think
it's a matrix?

00:30:26.720 --> 00:30:30.110
How many people think
it's a vector?

00:30:30.110 --> 00:30:31.395
You think it's a vector?

00:30:31.395 --> 00:30:31.900
OK.

00:30:31.900 --> 00:30:34.630
Well in a very peculiar
sense it is.

00:30:34.630 --> 00:30:37.330
How many people think
it's a number?

00:30:37.330 --> 00:30:37.690
Good.

00:30:37.690 --> 00:30:39.460
OK.

00:30:39.460 --> 00:30:42.220
It is a number, and it's
a number because

00:30:42.220 --> 00:30:44.450
this is a row vector.

00:30:44.450 --> 00:30:46.090
This is a matrix.

00:30:46.090 --> 00:30:47.890
This is a column vector.

00:30:47.890 --> 00:30:50.650
And if you think of multiplying
a matrix times a

00:30:50.650 --> 00:30:53.760
column vector, you get
a column vector.

00:30:53.760 --> 00:30:55.940
And if you take a row vector
times a column

00:30:55.940 --> 00:30:57.750
vector, you got a number.

00:30:57.750 --> 00:30:58.370
OK.

00:30:58.370 --> 00:31:04.420
So this is just a number which
depends on little z.

00:31:04.420 --> 00:31:05.150
OK.

00:31:05.150 --> 00:31:08.880
Kz is called a positive
definite matrix.

00:31:08.880 --> 00:31:11.600
And it's called a positive
definite matrix, because this

00:31:11.600 --> 00:31:14.730
thing is always non-negative.

00:31:14.730 --> 00:31:17.800
And it always has to be
non-negative because this

00:31:17.800 --> 00:31:19.260
refers to the --

00:31:28.020 --> 00:31:30.950
if I put capital Z in here,
namely if I put the random

00:31:30.950 --> 00:31:36.300
vector in here Z, then what this
is, is the variance of a

00:31:36.300 --> 00:31:39.360
particular random variable.

00:31:39.360 --> 00:31:42.100
So it has to be greater
than or equal to zero.

00:31:42.100 --> 00:31:47.790
So anyway K sub z is always
non-negative definite.

00:31:47.790 --> 00:31:51.100
Here it's going to be positive
definite, because we've

00:31:51.100 --> 00:31:54.030
already assumed that the matrix
A was non-singular, and

00:31:54.030 --> 00:31:57.050
therefore the matrix Kz has
to be non-singular.

00:31:57.050 --> 00:31:59.370
So this has to be positive
definite.

00:31:59.370 --> 00:32:03.470
So it has an inverse, K sub
z to the minus 1 is also positive

00:32:03.470 --> 00:32:07.130
definite which means this
quantity is always greater

00:32:07.130 --> 00:32:10.910
than zero, if z is non-zero.

00:32:10.910 --> 00:32:14.340
You can take these positive
definite matrices and you can

00:32:14.340 --> 00:32:17.640
find eigenvectors and
eigenvalues for them.

00:32:17.640 --> 00:32:20.340
Do you ever actually calculate
these eigenvectors and

00:32:20.340 --> 00:32:22.660
eigenvalues?

00:32:22.660 --> 00:32:23.460
I hope not.

00:32:23.460 --> 00:32:26.310
It's a mess to do it.

00:32:26.310 --> 00:32:29.170
I mean it's just as bad as
writing that awful formula we

00:32:29.170 --> 00:32:30.470
had before.

00:32:30.470 --> 00:32:33.420
So you don't want to actually do
this, but it's nice to know

00:32:33.420 --> 00:32:35.910
that these things exist.

00:32:35.910 --> 00:32:44.120
And because these vectors exist,
and in fact if you have

00:32:44.120 --> 00:32:48.990
a matrix which is k by k, little
k by little k, then

00:32:48.990 --> 00:32:53.130
there are k such eigenvectors
and they can be chosen

00:32:53.130 --> 00:32:54.570
orthonormal.

00:32:54.570 --> 00:32:55.010
OK.

00:32:55.010 --> 00:33:00.770
In other words each of these Q
sub i is orthogonal to each

00:33:00.770 --> 00:33:01.870
of the others.

00:33:01.870 --> 00:33:04.550
You can clearly scale them,
because you scale this and

00:33:04.550 --> 00:33:06.420
scale this together.

00:33:06.420 --> 00:33:08.070
And it's going to maintain
equality.

00:33:08.070 --> 00:33:18.380
So you just scale them down so
you can make them normalized.

00:33:18.380 --> 00:33:21.060
If you have a bunch of
eigenvectors with the same

00:33:21.060 --> 00:33:26.660
eigenvalue, then everything in the whole
linear sub-space formed by

00:33:26.660 --> 00:33:33.540
that set of eigenvectors has
the same eigenvalue

00:33:33.540 --> 00:33:34.540
lambda sub i.

00:33:34.540 --> 00:33:38.560
Namely you take any linear
combination of these things

00:33:38.560 --> 00:33:42.080
which satisfy this equation
for a particular lambda.

00:33:42.080 --> 00:33:44.860
And any linear combination
will satisfy the

00:33:44.860 --> 00:33:47.800
same equation.

00:33:47.800 --> 00:33:51.910
So you can simply choose an
orthonormal set among them to

00:33:51.910 --> 00:33:53.300
satisfy this.

00:33:53.300 --> 00:33:58.290
And if you look at Q sub i and
Q sub j, which have different

00:33:58.290 --> 00:34:03.080
eigenvalues, then it's pretty
easy to show that in fact they

00:34:03.080 --> 00:34:05.770
have to be orthogonal
to each other.

00:34:05.770 --> 00:34:13.060
So anyway when you do this you
wind up with this form becomes

00:34:13.060 --> 00:34:17.280
just the sum over i of lambda
sub i to the minus 1.

00:34:17.280 --> 00:34:19.770
Namely these eigenvalues
to the minus 1.

00:34:19.770 --> 00:34:23.020
These eigenvalues are
all positive.

00:34:23.020 --> 00:34:26.730
Times the squared inner product
of z with Q sub i.

00:34:26.730 --> 00:34:29.440
In other words, you take
whatever vector Z you're

00:34:29.440 --> 00:34:35.070
interested in here, you
project it on these k

00:34:35.070 --> 00:34:38.220
orthonormal vectors Q sub i.

00:34:38.220 --> 00:34:41.600
You get those k values.

00:34:41.600 --> 00:34:47.680
And then this form here is
just that sum there.

00:34:47.680 --> 00:34:49.530
So when you write
the probability

00:34:49.530 --> 00:34:52.640
density in that way --

00:34:52.640 --> 00:34:54.510
we still have this here
we'll get rid of that

00:34:54.510 --> 00:34:55.640
in the minute --

00:34:55.640 --> 00:35:00.840
you have e to the minus sum over
i, these inner product

00:35:00.840 --> 00:35:04.230
terms squared divided by
2 times lambda sub i.

00:35:04.230 --> 00:35:07.010
That's just because this
is equal to this.

00:35:07.010 --> 00:35:11.500
It's just substituting this for
this in the formula for

00:35:11.500 --> 00:35:14.180
the probability density.
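
[A sketch verifying this eigenvector form numerically, assuming NumPy; the matrix and the test point are arbitrary. The form z transpose Kz inverse z equals the sum over i of the squared inner product of z with q sub i, divided by lambda sub i.]

```python
import numpy as np

A = np.array([[2.0, 1.0], [0.5, 1.5]])
K = A @ A.T                          # a positive definite covariance matrix
lam, Q = np.linalg.eigh(K)           # columns of Q are orthonormal q_i

z = np.array([0.3, -0.7])
direct = z @ np.linalg.solve(K, z)
via_eig = sum((z @ Q[:, i]) ** 2 / lam[i] for i in range(len(lam)))
print(direct, via_eig)               # the two agree
```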

00:35:14.180 --> 00:35:14.530
OK.

00:35:14.530 --> 00:35:18.480
What does that say
pictorially?

00:35:18.480 --> 00:35:22.210
Let me show you a picture
of it first.

00:35:22.210 --> 00:35:27.840
It says that for any positive
definite matrix and therefore

00:35:27.840 --> 00:35:37.440
for any covariance matrix,
you can always find these

00:35:37.440 --> 00:35:38.370
orthonormal vectors.

00:35:38.370 --> 00:35:41.010
I've drawn them here
for two dimensions.

00:35:41.010 --> 00:35:45.060
They're just some arbitrary
vector q1; q2 has to be

00:35:45.060 --> 00:35:47.930
orthogonal to it.

00:35:47.930 --> 00:35:52.370
And if you look at the square
root of lambda 1 times q1, you

00:35:52.370 --> 00:35:53.640
got a point here.

00:35:53.640 --> 00:35:57.100
You look at the square root of
lambda 2 times q2, you got a

00:35:57.100 --> 00:35:58.180
point here.

00:35:58.180 --> 00:36:01.710
If you then look at this
probability density here, you

00:36:01.710 --> 00:36:07.760
see that all the points on this
ellipse here have to have

00:36:07.760 --> 00:36:15.730
the same weighted sum of squared
inner products of z with Q sub i.

00:36:15.730 --> 00:36:18.170
OK.

00:36:18.170 --> 00:36:20.120
It looked a little
better over here.

00:36:20.120 --> 00:36:21.120
Yes.

00:36:21.120 --> 00:36:21.820
OK.

00:36:21.820 --> 00:36:27.380
Namely the points little z for
which this is constant are the

00:36:27.380 --> 00:36:29.690
points which form
this ellipse.

00:36:29.690 --> 00:36:34.940
And it's the ellipse which has
the axes square root of lambda

00:36:34.940 --> 00:36:37.930
i times Q sub i.

00:36:37.930 --> 00:36:44.170
And then you just imagine it
if it's lined up this way.

00:36:44.170 --> 00:36:51.720
And think of what you would get
if you were looking at the

00:36:51.720 --> 00:36:55.590
lines of equal probability
density for independent

00:36:55.590 --> 00:36:58.880
Gaussian random variables with
different variances.

00:36:58.880 --> 00:37:01.150
I mean we already pointed out
if the variances are all the

00:37:01.150 --> 00:37:06.990
same, these equal probability
contours are spheres.

00:37:06.990 --> 00:37:13.040
If you now expand on some of
the axes, you get ellipses.

00:37:13.040 --> 00:37:16.700
And now we've taken these
arbitrary vectors, so these

00:37:16.700 --> 00:37:19.860
are not in the directions we
started out with, but in some

00:37:19.860 --> 00:37:23.230
arbitrary directions.

00:37:23.230 --> 00:37:27.140
We have q1 and q2 are
orthonormal to each other,

00:37:27.140 --> 00:37:29.850
because that's the way
we've chosen them.

00:37:29.850 --> 00:37:33.640
And then the probability
density has this form

00:37:33.640 --> 00:37:36.490
which is this form.

00:37:36.490 --> 00:37:45.480
And the other thing we can do is
to take this form here and

00:37:45.480 --> 00:37:49.260
say gee this is just a
probability density for a

00:37:49.260 --> 00:37:52.870
bunch of independent random
variables, where the

00:37:52.870 --> 00:37:59.540
independent random variables are
the inner product of the

00:37:59.540 --> 00:38:07.030
random vector Z with Q sub 1 up
to the inner product of z

00:38:07.030 --> 00:38:08.440
with Q sub k.

00:38:08.440 --> 00:38:11.490
So these are independent
Gaussian random variables.

00:38:11.490 --> 00:38:14.480
They have variances
lambda sub i.

00:38:14.480 --> 00:38:19.850
And this is the nicest formula
for the probability density of

00:38:19.850 --> 00:38:23.990
an arbitrary set of
jointly Gaussian

00:38:23.990 --> 00:38:25.900
random variables.

00:38:25.900 --> 00:38:26.990
OK.

00:38:26.990 --> 00:38:30.450
In other words what this says
is in general if you have a

00:38:30.450 --> 00:38:34.190
set of jointly Gaussian random
variables and I have this

00:38:34.190 --> 00:38:38.340
messy form here, all you're
doing is looking at them in a

00:38:38.340 --> 00:38:40.240
wrong coordinate system.

00:38:40.240 --> 00:38:43.440
If you switch them around, you
look at them this way instead

00:38:43.440 --> 00:38:47.990
of this way, you're going to
have independent Gaussian

00:38:47.990 --> 00:38:49.800
random variables.

00:38:49.800 --> 00:38:55.240
And the way to look at them
is found by solving this

00:38:55.240 --> 00:38:58.760
eigenvector eigenvalue equation,
which will tell you

00:38:58.760 --> 00:39:01.720
what these orthogonal
directions are.

00:39:01.720 --> 00:39:05.290
And then it'll tell you how to
get this nice picture that

00:39:05.290 --> 00:39:06.780
looks this way.

00:39:06.780 --> 00:39:09.010
OK.

00:39:09.010 --> 00:39:13.040
OK so that tells us what we
need to know, maybe even a

00:39:13.040 --> 00:39:16.360
little more than we have to
know, about jointly Gaussian

00:39:16.360 --> 00:39:19.320
random variables.

00:39:19.320 --> 00:39:21.080
But there's one more
bonus we get.

00:39:24.530 --> 00:39:27.980
And the bonus is
the following.

00:39:27.980 --> 00:39:34.780
If you create a matrix here B
where the i'th row of B is

00:39:34.780 --> 00:39:38.930
this vector Q sub i divided by
the square root of lambda sub

00:39:38.930 --> 00:39:43.820
i, then what this is going to
do is the corresponding

00:39:43.820 --> 00:39:48.700
element here is going to be a
normalized Gaussian random

00:39:48.700 --> 00:39:52.340
variable with variance one and
all of these are going to be

00:39:52.340 --> 00:39:55.150
independent of each other.
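
[A sketch of this whitening construction, assuming NumPy; B is built row by row as just described, and the covariance of N prime = B Z comes out as the identity.]

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0], [0.5, 1.5]])
K = A @ A.T
lam, Q = np.linalg.eigh(K)                 # columns of Q are the q_i
B = Q.T / np.sqrt(lam)[:, None]            # row i is q_i / sqrt(lambda_i)

Z = A @ rng.standard_normal((2, 200_000))  # many samples of Z = A N
print(np.cov(B @ Z))                       # approximately the identity
```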

00:39:55.150 --> 00:39:55.300
OK.

00:39:55.300 --> 00:39:58.330
That's essentially what
we were saying before.

00:39:58.330 --> 00:40:01.660
This is just another way of
saying the same thing.

00:40:01.660 --> 00:40:08.720
That when you squish this
probability density around and

00:40:08.720 --> 00:40:11.660
look at it in a different frame
of reference, and then

00:40:11.660 --> 00:40:15.460
you scale the random variables
down or up, what you wind up

00:40:15.460 --> 00:40:20.640
with is IID normal Gaussian
random variables.

00:40:20.640 --> 00:40:21.000
OK.

00:40:21.000 --> 00:40:23.620
But that says that Z is
equal to B to the

00:40:23.620 --> 00:40:26.470
minus 1 times N prime.

00:40:26.470 --> 00:40:28.470
Well so what?

00:40:28.470 --> 00:40:29.980
Here's the reason for so what.

00:40:29.980 --> 00:40:32.800
We started out with a definition
of jointly

00:40:32.800 --> 00:40:36.820
Gaussian, which is probably not
the definition of jointly

00:40:36.820 --> 00:40:41.190
Gaussian you've ever seen if
you've seen this before.

00:40:41.190 --> 00:40:44.700
Then what we did was to say,
OK if you have jointly

00:40:44.700 --> 00:40:47.770
Gaussian random variables
and they're linearly

00:40:47.770 --> 00:40:54.700
independent and none of them are
linearly dependent on the

00:40:54.700 --> 00:40:58.610
others, then this matrix
A is invertible.

00:40:58.610 --> 00:41:03.250
From that we derive the
probability density.

00:41:03.250 --> 00:41:07.670
From the probability density
we derive this.

00:41:07.670 --> 00:41:07.970
OK.

00:41:07.970 --> 00:41:10.100
The only thing you need
to get this is

00:41:10.100 --> 00:41:12.010
the probability density.

00:41:12.010 --> 00:41:16.520
And this says that anytime you
have a random vector Z with

00:41:16.520 --> 00:41:19.340
this probability density that
we just wrote down,

00:41:23.170 --> 00:41:27.100
then you have the property that
N prime is equal to BZ,

00:41:27.100 --> 00:41:30.600
and Z is equal to B to
the minus 1 N prime.

00:41:30.600 --> 00:41:35.120
Which says that if you have this
probability density, then

00:41:35.120 --> 00:41:37.440
you have jointly Gaussian
random variables.

00:41:37.440 --> 00:41:38.460
So we have an alternate

00:41:38.460 --> 00:41:41.170
definition of jointly Gaussian.

00:41:41.170 --> 00:41:44.320
Random variables are jointly
Gaussian if they have this

00:41:44.320 --> 00:41:45.700
probability density.

00:41:49.150 --> 00:41:52.400
You can somehow represent them
as linear combinations of

00:41:52.400 --> 00:41:55.000
normal random variables.

00:41:55.000 --> 00:41:55.260
OK.

00:41:55.260 --> 00:41:57.330
Then there's something
even simpler.

00:41:57.330 --> 00:42:00.080
It says that if all linear
combinations of a random

00:42:00.080 --> 00:42:04.490
vector Z are Gaussian, then
Z is jointly Gaussian.

00:42:04.490 --> 00:42:05.910
Why is that?

00:42:05.910 --> 00:42:10.060
Well, if you look at this
here, it says take any

00:42:10.060 --> 00:42:14.750
old random vector at all that
has a covariance matrix.

00:42:14.750 --> 00:42:18.980
From that covariance matrix, we
can solve for all of these

00:42:18.980 --> 00:42:20.230
eigenvectors.

00:42:37.510 --> 00:42:42.140
If I find the appropriate random
variables here from

00:42:42.140 --> 00:42:46.250
this transformation, those
random variables are then

00:42:46.250 --> 00:42:50.320
uncorrelated with each other, and
they are all statistically

00:42:50.320 --> 00:42:53.970
independent of each other. And
it follows from that, that if

00:42:53.970 --> 00:42:58.700
all these linear combinations
are Gaussian then Z has to be

00:42:58.700 --> 00:43:00.150
jointly Gaussian.

00:43:00.150 --> 00:43:00.560
OK.

00:43:00.560 --> 00:43:03.330
So we now have three
definitions.

00:43:03.330 --> 00:43:05.740
This one is the simplest
one to state.

00:43:05.740 --> 00:43:08.450
It's the hardest one
to work with.

00:43:08.450 --> 00:43:11.210
That's life you know.

00:43:11.210 --> 00:43:15.230
This one is probably the most
straightforward, because you

00:43:15.230 --> 00:43:19.730
don't have to know anything to
understand this definition.

00:43:19.730 --> 00:43:24.200
This original definition is
physically the most appealing

00:43:24.200 --> 00:43:27.780
because it shows why noise
vectors actually

00:43:27.780 --> 00:43:29.050
do have this property.

00:43:32.370 --> 00:43:32.660
OK.

00:43:32.660 --> 00:43:35.330
So here's a summary
of all of this.

00:43:35.330 --> 00:43:39.280
It says if Kz is singular, you
want to remove the linearly

00:43:39.280 --> 00:43:41.820
dependent random variables.

00:43:41.820 --> 00:43:44.500
You just take them away because
they're uniquely

00:43:44.500 --> 00:43:48.600
specified in terms of
the other variables.

00:43:48.600 --> 00:43:55.540
And then you take the resulting
non-singular matrix,

00:43:55.540 --> 00:43:59.300
and Z is going to be jointly
Gaussian if and only if Z is

00:43:59.300 --> 00:44:04.470
equal to AN for some normal
random vector N, if Z has

00:44:04.470 --> 00:44:08.550
jointly Gaussian density, or if
all linear combinations of Z

00:44:08.550 --> 00:44:12.930
are Gaussian. All of this is for zero
mean; with a mean, it applies to

00:44:12.930 --> 00:44:14.990
the fluctuation.

00:44:14.990 --> 00:44:17.030
OK.

00:44:17.030 --> 00:44:20.240
So why do we have to know all of
that about jointly Gaussian

00:44:20.240 --> 00:44:23.360
random variables?

00:44:23.360 --> 00:44:26.680
Well because everything about
Gaussian processes depends on

00:44:26.680 --> 00:44:28.470
jointly Gaussian random
variables.

00:44:28.470 --> 00:44:32.910
You can't do anything with
Gaussian processes without

00:44:32.910 --> 00:44:35.330
being able to look
at these jointly

00:44:35.330 --> 00:44:36.790
Gaussian random variables.

00:44:36.790 --> 00:44:40.770
And the reason is that we said
that Z of t is a Gaussian

00:44:40.770 --> 00:44:47.200
process if for every k and every
set of time instants,

00:44:47.200 --> 00:44:50.420
every set of times
t1 up to t sub k,

00:44:50.420 --> 00:44:54.050
Z of t1 up to Z of
tk is a jointly

00:44:54.050 --> 00:44:56.840
Gaussian random vector.

00:44:56.840 --> 00:44:57.180
OK.

00:44:57.180 --> 00:45:00.250
So that directly links the
definition of a Gaussian

00:45:00.250 --> 00:45:04.120
process to Gaussian
random vectors.

00:45:04.120 --> 00:45:04.450
OK.

00:45:04.450 --> 00:45:07.450
Suppose the sample functions
of Z of t are L2 with

00:45:07.450 --> 00:45:08.420
probability one.

00:45:08.420 --> 00:45:12.020
I want to say a little bit about
this because otherwise

00:45:12.020 --> 00:45:16.800
you can't sort out any of these
things about how L2

00:45:16.800 --> 00:45:21.910
theory connects with
random processes.

00:45:21.910 --> 00:45:22.220
OK.

00:45:22.220 --> 00:45:25.030
So I'm going to start out by
just assuming that all these

00:45:25.030 --> 00:45:30.280
sample functions are going
to be L2 functions with

00:45:30.280 --> 00:45:33.040
probability 1.

00:45:33.040 --> 00:45:36.290
One way to ensure this is to
look only at processes of the

00:45:36.290 --> 00:45:42.210
form Z of t equals some sum
of Z sub i times phi sub i of t.

00:45:42.210 --> 00:45:44.950
OK remember at the beginning
we started looking at this

00:45:44.950 --> 00:45:50.470
process the sum of a set of
normalized Gaussian random

00:45:50.470 --> 00:45:52.980
variables times displaced

00:51:52.980 --> 00:51:55.800
sinc functions.

00:45:55.800 --> 00:45:57.750
You can also do the same
thing with Fourier

00:45:57.750 --> 00:46:00.200
coefficients or anything.

00:46:00.200 --> 00:46:03.020
You got a fairly general set
of processes that way.

00:46:06.890 --> 00:46:09.780
Unfortunately they don't quite
work, because if you look at

00:46:09.780 --> 00:46:12.730
the sinc functions and you
look at noise, which is

00:46:12.730 --> 00:46:16.510
independent and identically
distributed in time, then the

00:46:16.510 --> 00:46:20.540
sample functions are going
to have infinite energy.

00:46:20.540 --> 00:46:24.370
I mean that kind of process
just runs on forever.

00:46:24.370 --> 00:46:27.140
It runs on with finite
power forever.

00:46:27.140 --> 00:46:30.510
And therefore it has
infinite energy.

00:46:30.510 --> 00:46:35.550
And therefore for the simplest
process to look at, the sample

00:46:35.550 --> 00:46:41.680
functions are non-L2 with
probability one, which is sort

00:46:41.680 --> 00:46:43.250
of unfortunate.

00:46:43.250 --> 00:46:47.490
So we say OK we don't care about
that, because if you

00:46:47.490 --> 00:46:52.160
want to look at that process,
a sum of Z sub i times sinc

00:46:52.160 --> 00:46:56.410
functions, what do
you care about?

00:46:56.410 --> 00:47:00.500
I mean you only care about the
terms in that expansion, which

00:47:00.500 --> 00:47:07.600
run from the big bang until
the next big bang.

00:47:07.600 --> 00:47:09.270
OK.

00:47:09.270 --> 00:47:12.620
We certainly don't care about it
before that or after that.

00:47:12.620 --> 00:47:15.630
And if we look within those
finite time limits, then all

00:47:15.630 --> 00:47:19.570
these functions are
going to be L2.

00:47:19.570 --> 00:47:22.700
Because they just last for
a finite amount of time.

00:47:22.700 --> 00:47:26.890
So all we need to do is to
truncate these things somehow.

00:47:26.890 --> 00:47:29.640
And we're going to diddle around
a little bit with the

00:47:29.640 --> 00:47:33.650
question of how to truncate
these series.

00:47:33.650 --> 00:47:36.990
But for the time being we
just say we can do that.

00:47:36.990 --> 00:47:38.200
And we will do it.

00:47:38.200 --> 00:47:41.450
So we can look at any process of
the form sum of Zi

00:47:41.450 --> 00:47:45.660
times phi i of t, where the Zi
are independent and the phi i

00:47:45.660 --> 00:47:48.510
of t are orthonormal.

00:47:48.510 --> 00:47:54.180
And to make things L2, we're
going to assume that the sum

00:47:54.180 --> 00:48:04.770
over i of Zi squared bar
is less than infinity.

00:48:04.770 --> 00:48:05.090
OK.

00:48:05.090 --> 00:48:08.690
In other words we only take a
finite number of these things,

00:48:08.690 --> 00:48:11.230
or if we want to take an infinite
number of them, the

00:48:11.230 --> 00:48:15.040
variances are going
to go off to zero.

00:48:15.040 --> 00:48:17.600
And I don't know whether you're
proving it in the

00:48:17.600 --> 00:48:20.590
homework this time or you will
prove it in the homework next

00:48:20.590 --> 00:48:24.460
time, I forget, but you're going
to look at the question

00:48:24.460 --> 00:48:29.040
of why this finite variance
condition makes these sample

00:48:29.040 --> 00:48:37.120
functions be L2 with
probability one.
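
A sketch of the argument behind that homework problem (standard, but the notation below is mine): by orthonormality of the phi sub i,

```latex
\[
\mathbb{E}\Big[\int Z^2(t)\,dt\Big] \;=\; \sum_i \overline{Z_i^2} \;<\; \infty ,
\]
```

so the energy of a sample function is a nonnegative random variable with finite mean, and by the Markov inequality it is finite with probability one; that is, the sample functions are L2 almost surely.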

00:48:37.120 --> 00:48:37.530
OK.

00:48:37.530 --> 00:48:43.050
So if you had this condition,
then all your sample functions

00:48:43.050 --> 00:48:44.840
are going to be L2.

00:48:44.840 --> 00:48:46.670
I'm going to get off
all of this L2

00:48:46.670 --> 00:48:48.870
business relatively shortly.

00:48:48.870 --> 00:48:51.270
I want to do a little bit
of it to start with.

00:48:51.270 --> 00:48:55.600
Because if any of you start
doing research in

00:48:55.600 --> 00:48:59.150
this area, at some point you're
going to be merrily

00:48:59.150 --> 00:49:02.880
working away calculating
all sorts of things.

00:49:02.880 --> 00:49:06.440
And suddenly you're going to
find that none of it exists,

00:49:06.440 --> 00:49:09.380
because of these problems
of infinite energy.

00:49:09.380 --> 00:49:11.530
And you're going to
get very puzzled.

00:49:11.530 --> 00:49:14.260
So one of the things I tried to
do in the notes is to write

00:49:14.260 --> 00:49:17.240
them in a way that you can
understand them at a first

00:49:17.240 --> 00:49:20.370
reading without worrying
about any of this.

00:49:20.370 --> 00:49:23.860
And then when you go back for a
second reading, you can pick

00:49:23.860 --> 00:49:26.610
up all the mathematics
that you need.

00:49:26.610 --> 00:49:29.810
So that in fact you won't have
the problem of suddenly

00:49:29.810 --> 00:49:32.310
finding out that three-quarters
of your thesis

00:49:32.310 --> 00:49:35.150
has to be thrown away, because
you've been dealing with

00:49:35.150 --> 00:49:37.850
things that don't
make any sense.

00:49:37.850 --> 00:49:38.170
OK.

00:49:38.170 --> 00:49:41.180
So, we're going to define linear
functionals in the

00:49:41.180 --> 00:49:42.050
following way.

00:49:42.050 --> 00:49:46.340
We're going to first look at the
sample functions of this

00:49:46.340 --> 00:49:48.820
random process Z. OK.

00:49:48.820 --> 00:49:51.600
Now we talked about
this last time.

00:49:51.600 --> 00:49:56.170
If you have a random process Z
then really what you have is a

00:49:56.170 --> 00:49:59.530
set of functions defined
on some sample space.

00:49:59.530 --> 00:50:03.560
So the quantity you're
interested in is the

00:50:03.560 --> 00:50:11.760
value of the random process at
time t for sample point omega.

00:50:11.760 --> 00:50:14.540
OK.

00:50:14.540 --> 00:50:19.490
If we look at that
for a given omega, this thing

00:50:19.490 --> 00:50:21.270
becomes a function of t.

00:50:21.270 --> 00:50:24.200
In fact for a given omega, this
is just what we've been

00:50:24.200 --> 00:50:28.000
calling a sample element
of the random process.

00:50:28.000 --> 00:50:30.770
So if we take this sample
element, look at the inner

00:50:30.770 --> 00:50:34.600
product of that with some
function g of t.

00:50:34.600 --> 00:50:38.720
In other words we look at the
integral of Z of t omega times

00:50:38.720 --> 00:50:41.290
g of t, dt.

00:50:41.290 --> 00:50:45.620
And if all these sample
functions are L2 and if g of t

00:50:45.620 --> 00:50:49.170
is L2, what happens when you
take the integral of an L2

00:50:49.170 --> 00:50:56.020
function times an L2 function?
That is the inner product of

00:50:56.020 --> 00:51:00.310
something L2 with something L2:
the inner product of something

00:51:00.310 --> 00:51:02.930
with finite energy with

00:51:02.930 --> 00:51:05.950
something with finite energy.

00:51:05.950 --> 00:51:11.560
Well the Schwarz inequality
tells you that if this has

00:51:11.560 --> 00:51:15.310
finite energy and this has
finite energy, the inner

00:51:15.310 --> 00:51:17.740
product exists.

00:51:17.740 --> 00:51:19.920
That's the reason why we went
through the Schwarz

00:51:19.920 --> 00:51:20.610
inequality.

00:51:20.610 --> 00:51:22.900
It's the main reason
for doing that.

00:51:22.900 --> 00:51:25.250
So these things have
finite value.

00:51:25.250 --> 00:51:30.220
So V of omega, the result of
doing this, namely V as a

00:51:30.220 --> 00:51:33.380
function on the sample space,
is a real number.

00:51:33.380 --> 00:51:38.380
And it's a real number for
sample points omega with

00:51:38.380 --> 00:51:42.280
probability one, which means
we can talk about V as a

00:51:42.280 --> 00:51:43.800
random variable.

00:51:43.800 --> 00:51:45.060
OK.

00:51:45.060 --> 00:51:49.290
And now V is a random variable
which is defined in this way.

00:51:49.290 --> 00:51:52.140
And from now on we will call
these things linear

00:51:52.140 --> 00:51:55.530
functionals which are in fact
the integral of a random

00:51:55.530 --> 00:51:58.940
process times a function.
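
A numerical sketch of one of these linear functionals (the grid, the displaced-sinc family, and the Gaussian-shaped g below are illustrative assumptions, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(-20, 20, 2001)
dt = t[1] - t[0]

# Orthonormal family: displaced sinc functions with unit spacing.
idx = np.arange(-5, 6)
Phi = np.stack([np.sinc(t - i) for i in idx])

g = np.exp(-t**2 / 2)                    # an arbitrary L2 function

# Sample functions Z(t, omega) = sum_i Z_i(omega) phi_i(t), Z_i IID normal.
Zcoef = rng.standard_normal((5000, len(idx)))
samples = Zcoef @ Phi                    # one sample function per row
V = samples @ g * dt                     # Riemann sum for integral of Z(t) g(t) dt

# V should be zero-mean Gaussian with variance sum_i <g, phi_i>^2.
proj = Phi @ g * dt
print(V.var(), np.sum(proj**2))          # the two numbers should be close
```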

00:51:58.940 --> 00:52:02.520
And we can take that
kind of integral.

00:52:02.520 --> 00:52:05.830
It sort of looks like the linear
combinations of things

00:52:05.830 --> 00:52:11.930
we were doing before when we
were talking about matrices

00:52:11.930 --> 00:52:13.180
and random vectors.

00:52:21.280 --> 00:52:21.800
OK.

00:52:21.800 --> 00:52:24.830
If we restrict the random
process to have the following

00:52:24.830 --> 00:52:27.110
form, where these are
independent and these are

00:52:27.110 --> 00:52:33.160
orthonormal, then one of these
linear functionals,

00:52:33.160 --> 00:52:36.640
the random variable V, is going
to be the integral of Z of t

00:52:36.640 --> 00:52:40.790
times g of t, but
Z of t is this.

00:52:40.790 --> 00:52:45.190
And at this point we're not
going to fuss about

00:52:45.190 --> 00:52:47.910
interchanging integrals
with summations.

00:52:47.910 --> 00:52:50.610
You have the machinery to do it,
because we're now dealing

00:52:50.610 --> 00:52:52.870
with an L2 space.

00:52:52.870 --> 00:52:54.730
We're not going to
fuss about it.

00:52:54.730 --> 00:52:57.810
And I advise you not
to fuss about it.

00:52:57.810 --> 00:53:02.080
So we have a sum of these random
variables here times

00:53:02.080 --> 00:53:03.170
these integrals here.

00:53:03.170 --> 00:53:07.320
These integrals here are just
projections of g of t on this

00:53:07.320 --> 00:53:09.440
space of orthonormal
functions.

00:53:09.440 --> 00:53:12.590
So whatever space of orthonormal
functions gives

00:53:12.590 --> 00:53:16.810
you your jollies, use it to talk
about the inner products on

00:53:16.810 --> 00:53:18.130
that space.

00:53:18.130 --> 00:53:22.070
This gives you a nice inner
product space of

00:53:22.070 --> 00:53:24.760
sequences of numbers.

00:53:24.760 --> 00:53:28.210
And then if the Zi are jointly
Gaussian, then V is

00:53:28.210 --> 00:53:30.110
going to be Gaussian.
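
Written out, the step just described is (a sketch):

```latex
\[
V \;=\; \int Z(t)\,g(t)\,dt
  \;=\; \sum_i Z_i \int \phi_i(t)\,g(t)\,dt
  \;=\; \sum_i Z_i\,\langle g, \phi_i\rangle ,
\]
```

so V is a linear combination of the Zi, and if the Zi are jointly Gaussian, V is Gaussian.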

00:53:30.110 --> 00:53:35.020
And then to generalize this one
little bit further, if you

00:53:35.020 --> 00:53:40.010
take a whole bunch of L2
functions, g1 of t, g2 of t, and

00:53:40.010 --> 00:53:43.030
so forth, you can talk about
a whole bunch of random

00:53:43.030 --> 00:53:47.670
variables V1 up to V sub J.

00:53:47.670 --> 00:53:50.880
And V sub j is going to be
the integral of Z of

00:53:50.880 --> 00:53:53.450
t times gj of t, dt.

00:53:53.450 --> 00:53:57.540
Remember this thing
looks very simple.

00:53:57.540 --> 00:54:00.180
It looks like the --

00:54:00.180 --> 00:54:04.500
like the convolutions you've
been doing all your life.

00:54:04.500 --> 00:54:05.080
It's not.

00:54:05.080 --> 00:54:08.510
It's really a rather
peculiar quantity.

00:54:08.510 --> 00:54:12.510
This in fact is what we call a
linear functional, and it is the

00:54:12.510 --> 00:54:16.720
integral of a random
process times this.

00:54:16.720 --> 00:54:19.650
Which we have defined in terms
of the sample functions of the

00:54:19.650 --> 00:54:22.270
random process.

00:54:22.270 --> 00:54:26.450
And now we said OK now that we
understand what it is, we will

00:54:26.450 --> 00:54:28.770
just write this all the time.

00:54:28.770 --> 00:54:32.300
But I just caution
you not to let

00:54:32.300 --> 00:54:33.980
familiarity breed contempt.

00:54:33.980 --> 00:54:39.440
Because this is a rather
peculiar notion.

00:54:39.440 --> 00:54:41.610
And a rather powerful notion.

00:54:41.610 --> 00:54:42.100
OK.

00:54:42.100 --> 00:54:45.800
So these things are
jointly Gaussian.

00:54:45.800 --> 00:54:49.700
We want to take the expected
value of V sub i times V sub

00:54:49.700 --> 00:54:55.020
j, and now we're going to do
this without worrying about

00:54:55.020 --> 00:54:56.640
being careful at all.

00:54:56.640 --> 00:55:00.900
We have the expected value of
the integral of Z of t, gi of

00:55:00.900 --> 00:55:07.520
t, dt times the integral of
Z of tau, gj of tau, d tau.

00:55:07.520 --> 00:55:11.400
And now we're going to slide
this expected value inside of

00:55:11.400 --> 00:55:14.010
both of these integrals.

00:55:14.010 --> 00:55:18.890
And not worry about it.

00:55:18.890 --> 00:55:21.400
And therefore what we're going
to have is a double integral

00:55:21.400 --> 00:55:27.500
of gi of t expected value of z
of t times z of tau times gj of

00:55:27.500 --> 00:55:34.470
tau dt, d tau, which
is this thing here.
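
In symbols, the computation just described is (a sketch):

```latex
\[
\mathbb{E}[V_i V_j]
  \;=\; \int\!\!\int g_i(t)\, K_Z(t,\tau)\, g_j(\tau)\, dt\, d\tau ,
\qquad K_Z(t,\tau) = \mathbb{E}[Z(t)Z(\tau)] ,
\]
```

the continuous analog of the bilinear form g_i^T K_Z g_j for a covariance matrix.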

00:55:34.470 --> 00:55:38.210
Which you should compare with
what we've been dealing with

00:55:38.210 --> 00:55:41.300
most of the lecture today.

00:55:41.300 --> 00:55:47.200
This is the same kind of form
for a covariance function as

00:55:47.200 --> 00:55:50.420
we've been dealing with for
covariance matrices.

00:55:50.420 --> 00:55:55.660
It has very similar effects.

00:55:55.660 --> 00:55:57.900
I mean before you were just
talking about finite

00:55:57.900 --> 00:56:02.060
dimensional matrices, which are
all simple mathematically, with

00:56:02.060 --> 00:56:04.590
eigenvectors and
eigenvalues.

00:56:04.590 --> 00:56:07.120
You have eigenfunctions
and eigenvalues of

00:56:07.120 --> 00:56:10.110
these things also.

00:56:10.110 --> 00:56:14.690
And so long as these are defined
nicely by these L2

00:56:14.690 --> 00:56:17.270
properties we've been
talking about,

00:56:17.270 --> 00:56:20.130
you can in fact deal with these
in virtually the same

00:56:20.130 --> 00:56:25.370
way that you can deal with
the matrices we were

00:56:25.370 --> 00:56:27.130
dealing with before.

00:56:27.130 --> 00:56:30.240
If you just remember what the
results are from matrices, you

00:56:30.240 --> 00:56:34.770
can guess what they are for
these covariance functions.

00:56:34.770 --> 00:56:35.020
OK.

00:56:35.020 --> 00:56:39.270
But anyway you can find the
expected value of Vi times Vj

00:56:39.270 --> 00:56:40.630
by this formula.

00:56:40.630 --> 00:56:44.050
Again we're dealing with
zero-mean and therefore we

00:56:44.050 --> 00:56:46.820
don't have to worry about the
mean; we can put that in later.

00:56:51.070 --> 00:56:54.390
And that all exists.

00:56:54.390 --> 00:56:54.740
OK.

00:56:54.740 --> 00:56:58.540
So the next thing we want
to deal with -- I'm hitting

00:56:58.540 --> 00:56:59.590
you with a lot today.

00:56:59.590 --> 00:57:05.500
But I mean the trouble is a lot
of this is half familiar

00:57:05.500 --> 00:57:07.790
to most of you.

00:57:07.790 --> 00:57:10.680
People who have taken various
communication courses at

00:57:10.680 --> 00:57:15.320
various places have all been
exposed to random processes in

00:57:15.320 --> 00:57:18.980
some highly trivialized sense.

00:57:18.980 --> 00:57:21.480
But the major results are the
same as the results we're

00:57:21.480 --> 00:57:22.460
going through here.

00:57:22.460 --> 00:57:25.010
And all we're doing here is
adding a little bit of

00:57:25.010 --> 00:57:28.380
carefulness about what works
and what doesn't work.

00:57:28.380 --> 00:57:32.850
Incidentally, in the notes
towards the end of

00:57:32.850 --> 00:57:39.250
lectures 14 and 15, we give
three examples which let you

00:57:39.250 --> 00:57:43.960
know why in fact we want to
look primarily at random

00:57:43.960 --> 00:57:48.790
processes which are defined in
terms of a sum of independent

00:57:48.790 --> 00:57:52.790
Gaussian random variables times
orthonormal functions.

00:57:52.790 --> 00:57:56.430
And if you look at those
three examples, some

00:57:56.430 --> 00:57:58.410
of them have problems.

00:57:58.410 --> 00:58:02.560
Because of the fact that
everything you're dealing with

00:58:02.560 --> 00:58:04.610
has infinite energy.

00:58:04.610 --> 00:58:07.810
And therefore it doesn't
really make any sense.

00:58:07.810 --> 00:58:10.980
And one of them I should
talk about just a

00:58:10.980 --> 00:58:11.850
little bit in class.

00:58:11.850 --> 00:58:14.730
And I think I still have a
couple of minutes, is a very

00:58:14.730 --> 00:58:24.140
strange process where
Z of t is IID.

00:58:24.140 --> 00:58:26.510
In fact just let it be normal.

00:58:30.310 --> 00:58:34.600
And independent for all t.

00:58:39.170 --> 00:58:39.470
OK.

00:58:39.470 --> 00:58:45.810
In other words you generate a
random process by looking at

00:58:45.810 --> 00:58:48.710
an uncountably infinite
collection of

00:58:48.710 --> 00:58:52.260
normal random variables.

00:58:52.260 --> 00:58:56.310
How do you deal with
such a process?

00:58:56.310 --> 00:58:58.920
I don't know how to
deal with it.

00:58:58.920 --> 00:59:02.900
I mean it sounds like
it's simple.

00:59:02.900 --> 00:59:06.610
If I put this on a quiz,
three-quarters of you would

00:59:06.610 --> 00:59:08.310
say oh that's very simple.

00:59:08.310 --> 00:59:13.280
What we're dealing with is a
family of impulse functions.

00:59:13.280 --> 00:59:15.800
Spaced arbitrarily
closely together.

00:59:15.800 --> 00:59:19.410
This is not an impulse function.

00:59:19.410 --> 00:59:22.720
Impulse functions are even worse
than this, but this is

00:59:22.720 --> 00:59:25.000
bad enough.

00:59:25.000 --> 00:59:28.050
When we start talking about
spectral density, we can

00:59:28.050 --> 00:59:33.020
explain this a little bit
better by thinking about why this kind

00:59:33.020 --> 00:59:36.990
of process doesn't
make any sense.

00:59:36.990 --> 00:59:39.680
But this kind of process, if
you look at its spectral

00:59:39.680 --> 00:59:46.440
density, it's going to have a
spectral density which is zero

00:59:46.440 --> 00:59:50.390
everywhere, but whose integral
over all frequencies is one.

00:59:53.060 --> 00:59:54.800
OK.

00:59:54.800 --> 00:59:57.780
In other words it's not
something you want to wish on

00:59:57.780 --> 00:59:59.670
your on your worse friend.

00:59:59.670 --> 01:00:02.200
It makes a certain amount of
sense as a limit of things.

01:00:02.200 --> 01:00:06.450
You can look at a very broadband
process where in

01:00:06.450 --> 01:00:10.180
fact you spread the process
out enormously.

01:00:10.180 --> 01:00:13.270
You can make pseudo noise which
looks sort of like this.

01:00:13.270 --> 01:00:16.280
And you make the process broader
and broader and lower

01:00:16.280 --> 01:00:18.370
and lower intensity
everywhere.

01:00:18.370 --> 01:00:20.750
But it still has this
energy of one.

01:00:20.750 --> 01:00:24.770
It still has a power
of one everywhere.

01:00:24.770 --> 01:00:26.150
And it just is ugly.

01:00:26.150 --> 01:00:28.630
OK.

01:00:28.630 --> 01:00:32.480
Now if you never worried about
these questions of L2, you

01:00:32.480 --> 01:00:36.010
would look at a process like
that and say, gee there must

01:00:36.010 --> 01:00:39.140
be some easy way to handle that
because it's probably the

01:00:39.140 --> 01:00:42.440
easiest process you
can define.

01:00:42.440 --> 01:00:45.050
I mean everything is normal.

01:00:45.050 --> 01:00:48.430
If you look at it at any set of
different times, you get a set

01:00:48.430 --> 01:00:52.560
of IID normal random
variables.

01:00:52.560 --> 01:00:56.380
You try to put it together, and
it doesn't mean anything.

01:00:56.380 --> 01:00:59.700
If you pass it through a filter,
the filter is going to

01:00:59.700 --> 01:01:02.220
cancel it all out.
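
A discrete caricature of that last claim (my own construction, not from the lecture: IID unit-variance samples at grid spacing delta standing in for "IID at every t", and a rectangular filter on [0, 1]):

```python
import numpy as np

rng = np.random.default_rng(2)

# Refine the grid and watch the filter output variance vanish.
for delta in [0.1, 0.01, 0.001]:
    n = int(1.0 / delta)                    # filter length in samples
    z = rng.standard_normal(100_000)        # IID noise at every grid point
    v = np.convolve(z, np.ones(n), mode='valid') * delta   # Riemann sum
    print(delta, v.var())                   # variance = n * delta**2 = delta
```

The output variance shrinks in proportion to the grid spacing: the filter averages the IID noise away, which is the sense in which it cancels it all out.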

01:01:02.220 --> 01:01:09.190
So anyway, that's one reason
why we want to look at this

01:01:09.190 --> 01:01:14.470
restricted class of random
processes we're looking at.

01:01:14.470 --> 01:01:14.880
OK.

01:01:14.880 --> 01:01:19.430
What we're interested in now is
we want to take a Gaussian

01:01:19.430 --> 01:01:24.140
random process really, but you
can take any random process.

01:01:24.140 --> 01:01:27.370
We want to pass it through a
filter, and we want to look at

01:01:27.370 --> 01:01:30.180
the random process
that comes out.

01:01:30.180 --> 01:01:30.440
OK.

01:01:30.440 --> 01:01:33.720
And that certainly is a very
physical kind of operation.

01:01:33.720 --> 01:01:37.320
I mean any kind of communication
system that you

01:01:37.320 --> 01:01:42.870
build is going to have
noise on the channel.

01:01:42.870 --> 01:01:45.740
And one of the first things
you're going to do is you're

01:01:45.740 --> 01:01:48.860
going to filter what
you've received.

01:01:48.860 --> 01:01:52.190
So you have to have some way
of dealing with this.

01:01:52.190 --> 01:01:53.060
OK.

01:01:53.060 --> 01:01:58.190
And the way we've sort of been
dealing with it all along, in

01:01:58.190 --> 01:02:02.750
terms of the transmitted
waveforms we've been dealing with,

01:02:02.750 --> 01:02:03.860
is to say OK.

01:02:03.860 --> 01:02:07.280
What we're going to do is to
first look what happens when

01:02:07.280 --> 01:02:10.360
we take sample functions of
this, pass them through the

01:02:10.360 --> 01:02:13.570
filter, and then what comes
out is going to be some

01:02:13.570 --> 01:02:15.220
function again.

01:02:15.220 --> 01:02:18.600
And then we're back into
what you studied as an

01:02:18.600 --> 01:02:22.410
undergraduate talking about
functions through filters.

01:02:22.410 --> 01:02:25.330
We're going to jazz it up a
little bit by saying these

01:02:25.330 --> 01:02:27.680
functions are going to
be L2 and the filters

01:02:27.680 --> 01:02:29.000
are going to be L2.

01:02:29.000 --> 01:02:31.020
So that in fact you know
you'd get something

01:02:31.020 --> 01:02:33.490
out that makes sense.

01:02:33.490 --> 01:02:37.480
So what we're doing is looking
at these sample functions: V,

01:02:37.480 --> 01:02:43.020
the output at time tau for
sample point omega is going to

01:02:43.020 --> 01:02:51.760
be the convolution of the filter
with the input at sample

01:02:51.760 --> 01:02:52.570
point omega.

01:02:52.570 --> 01:02:57.260
Remember this one sample point
exists for all time.

01:02:57.260 --> 01:03:00.850
That's why these sample points
and sample spaces are so damn

01:03:00.850 --> 01:03:02.260
complicated.

01:03:02.260 --> 01:03:04.650
Because they have everything
in them.

01:03:04.650 --> 01:03:05.440
OK.

01:03:05.440 --> 01:03:09.410
So there's one sample point
which exists for all of them.

01:03:09.410 --> 01:03:11.650
This is a sample function.

01:03:11.650 --> 01:03:14.830
You're passing the sample
function through a filter,

01:03:14.830 --> 01:03:17.920
which is just normal
convolution.

01:03:17.920 --> 01:03:20.350
What comes out then,

01:03:20.350 --> 01:03:26.240
If in fact we express this
random process in terms of

01:03:26.240 --> 01:03:30.230
this orthonormal sum the way
we've been doing before,

01:03:30.230 --> 01:03:36.060
is you get the sum over j of
this, which is a sample value

01:03:36.060 --> 01:03:39.470
of the j'th random variable
coming out times the integral

01:03:39.470 --> 01:03:43.800
of phi j of t times h of tau
minus t, dt.
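
In symbols (a sketch, with phi sub j for the orthonormal functions):

```latex
\[
V(\tau,\omega) \;=\; \int Z(t,\omega)\, h(\tau - t)\, dt
  \;=\; \sum_j z_j(\omega) \int \phi_j(t)\, h(\tau - t)\, dt .
\]
```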

01:03:43.800 --> 01:03:44.260
OK.

01:03:44.260 --> 01:03:48.430
For each tau that you look at,
this is just a sample value of

01:03:48.430 --> 01:03:50.030
a linear functional.

01:03:50.030 --> 01:03:50.570
OK.

01:03:50.570 --> 01:03:54.500
If I want to look at this at one
value of tau, I have this

01:03:54.500 --> 01:03:59.990
integral here which is

01:03:59.990 --> 01:04:02.740
a sample value of a
random process at

01:04:02.740 --> 01:04:05.230
omega times a function.

01:04:05.230 --> 01:04:08.500
This is just a function
of t for a given tau.

01:04:08.500 --> 01:04:09.080
OK.

01:04:09.080 --> 01:04:11.770
So this is a linear
functional.

01:04:11.770 --> 01:04:15.400
Of the type we've been
talking about.

01:04:15.400 --> 01:04:17.740
And that linear functional
is then given by

01:04:17.740 --> 01:04:20.860
this for each tau.

01:04:20.860 --> 01:04:22.460
This is a sample value
of a linear

01:04:22.460 --> 01:04:26.020
functional we can talk about.

01:04:26.020 --> 01:04:26.330
OK.

01:04:26.330 --> 01:04:30.620
These things are then, if you
look over the whole sample

01:04:30.620 --> 01:04:35.230
space omega, V of tau becomes
a random variable.

01:04:35.230 --> 01:04:36.520
OK.

01:04:36.520 --> 01:04:41.940
V of tau is the random variable
whose sample values

01:04:41.940 --> 01:04:49.200
are V of tau, omega, and they're
given by this.

01:04:49.200 --> 01:04:55.050
So if Z of t is a Gaussian
process, you get jointly

01:04:55.050 --> 01:05:02.400
Gaussian linear functionals at
any set of times tau

01:05:02.400 --> 01:05:05.150
1, tau 2 up to tau sub k.

01:05:05.150 --> 01:05:09.650
So this just gives you a whole
set of linear functionals.

01:05:09.650 --> 01:05:14.930
And if z of t is a Gaussian
process, then all these linear

01:05:14.930 --> 01:05:19.520
functionals are going to be
jointly Gaussian also.

01:05:19.520 --> 01:05:20.460
And bingo.

01:05:20.460 --> 01:05:23.310
What we have then is an
alternate way to generate

01:05:23.310 --> 01:05:25.370
Gaussian processes.

01:05:25.370 --> 01:05:25.730
OK.

01:05:25.730 --> 01:05:29.860
In other words you can generate
a Gaussian process by

01:05:29.860 --> 01:05:34.320
specifying that at each set of k
times you have jointly

01:05:34.320 --> 01:05:36.140
Gaussian random variables.

01:05:36.140 --> 01:05:39.890
But once you do that, once you
understand one Gaussian

01:05:39.890 --> 01:05:42.460
process, you're off
and running.

01:05:42.460 --> 01:05:46.060
Because then you can pass
it through any L2 filter

01:05:46.060 --> 01:05:46.850
that you want to.

01:05:46.850 --> 01:05:50.160
And you generate another
Gaussian process.
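
A sketch of that recipe (the grid, the truncated sinc expansion, and the two-sided exponential filter below are illustrative assumptions): filter a sinc-type Gaussian process and check that the output at a fixed time still has Gaussian statistics.

```python
import numpy as np

rng = np.random.default_rng(3)
delta = 0.1
t = np.arange(-15.0, 15.0, delta)

# Truncated sinc-type Gaussian process: Z(t) = sum_i Z_i sinc(t - i).
idx = np.arange(-10, 11)
Phi = np.stack([np.sinc(t - i) for i in idx])
Z = rng.standard_normal((4000, len(idx))) @ Phi    # one sample function per row

h = np.exp(-np.abs(t))                             # an L2 impulse response
V = np.array([np.convolve(z, h, mode='same') * delta for z in Z])

# Output at a fixed tau: zero mean, and fourth moment about 3 * variance^2,
# as it must be for a Gaussian random variable.
v0 = V[:, len(t) // 2]
print(v0.mean(), (v0**4).mean() / v0.var()**2)     # ~0 and ~3
```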

01:05:50.160 --> 01:05:57.000
So for example, if you start
out with this sinc type

01:05:57.000 --> 01:06:04.820
process, I mean we'll see that
that has a spectral density,

01:06:04.820 --> 01:06:07.470
which is flat over
all frequencies.

01:06:07.470 --> 01:06:10.670
And we'll talk about spectral
density tomorrow.

01:06:10.670 --> 01:06:12.950
But then you pass it through a
linear filter, and you can

01:06:12.950 --> 01:06:16.260
give it any spectral density
that you want.

01:06:16.260 --> 01:06:20.120
So at this point, we really
have enough to talk about

01:06:20.120 --> 01:06:26.810
arbitrary covariance functions
just by starting out with

01:06:26.810 --> 01:06:30.000
random processes, which are
defined in terms of some

01:06:30.000 --> 01:06:33.180
sequence of independent random
variables times orthonormal functions.

01:06:40.280 --> 01:06:40.860
OK.

01:06:40.860 --> 01:06:45.140
Now we can get the covariance
function of a filtered process

01:06:45.140 --> 01:06:52.200
in the same way as we got the
covariance matrix for linear functionals.

01:06:52.200 --> 01:06:52.500
OK.

01:06:52.500 --> 01:06:54.020
And this is just computation.

01:06:54.020 --> 01:06:55.380
OK.

01:06:55.380 --> 01:06:56.870
So what is it?

01:06:56.870 --> 01:07:01.720
The covariance function of
this output process V

01:07:01.720 --> 01:07:06.510
evaluated at one time
r and another time s.

01:07:06.510 --> 01:07:10.550
One of the nasty things about
notation when you start

01:07:10.550 --> 01:07:13.570
dealing with a covariance
function of the input and the

01:07:13.570 --> 01:07:18.860
output to a linear filter is
you suddenly need to worry

01:07:18.860 --> 01:07:21.440
about two times at the
input and two other

01:07:21.440 --> 01:07:23.160
times at the output.

01:07:23.160 --> 01:07:23.640
OK.

01:07:23.640 --> 01:07:29.870
Because this thing is then the
expected value of V sub r

01:07:29.870 --> 01:07:31.895
times V sub s.

01:07:31.895 --> 01:07:32.360
OK.

01:07:32.360 --> 01:07:36.510
This is a random variable, which
is the process V

01:07:36.510 --> 01:07:38.600
evaluated at time r.

01:07:38.600 --> 01:07:43.410
This is the random variable,
which is the process V of t

01:07:43.410 --> 01:07:47.570
evaluated at a particular
time s.

01:07:47.570 --> 01:07:52.060
This is going to be the
expected value of -- what this

01:07:52.060 --> 01:07:56.980
random variable now is, is the
integral of the random process
integral of the random process

01:07:56.980 --> 01:08:03.365
z of t times this function
here, which is now

01:08:03.365 --> 01:08:04.420
a function of t.

01:08:04.420 --> 01:08:06.980
Because we're looking at
a fixed value of r.

01:08:06.980 --> 01:08:10.290
So this is a linear functional,

01:08:10.290 --> 01:08:12.430
which is a random variable.

01:08:12.430 --> 01:08:14.210
This is another linear
functional.

01:08:14.210 --> 01:08:18.440
This is evaluated at some time
s, which is the output of the

01:08:18.440 --> 01:08:20.730
filter at time s.

01:08:20.730 --> 01:08:24.940
Then we will throw caution to
the wind and interchange integrals

01:08:24.940 --> 01:08:27.720
with expectation
and everything.

01:08:27.720 --> 01:08:31.210
And then in the middle we'll
have expected value of z of t

01:08:31.210 --> 01:08:36.860
times z of tau, which is the
covariance function of z.

01:08:36.860 --> 01:08:37.200
OK.

01:08:37.200 --> 01:08:42.150
So the covariance function of
z then specifies what the

01:08:42.150 --> 01:08:46.370
covariance function
of the output is.
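
Putting that computation into one formula (a sketch):

```latex
\[
K_V(r,s) \;=\; \mathbb{E}[V(r)\,V(s)]
  \;=\; \int\!\!\int h(r-t)\, K_Z(t,\tau)\, h(s-\tau)\, dt\, d\tau .
\]
```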

01:08:46.370 --> 01:08:47.310
OK.

01:08:47.310 --> 01:08:53.810
So whenever you pass a random
process through a filter if

01:08:53.810 --> 01:08:57.060
you know what the covariance
function of the input to the

01:08:57.060 --> 01:09:00.610
filter is, you can find the
covariance function of the

01:09:00.610 --> 01:09:02.830
output of the filter.

01:09:02.830 --> 01:09:07.200
That's kind of a nasty
formula, it's not very nice.

01:09:07.200 --> 01:09:12.800
But anyway the thing that it
tells you is that whether this

01:09:12.800 --> 01:09:16.110
random process is
Gaussian or not,

01:09:16.110 --> 01:09:19.140
the only thing that determines
the covariance function of the

01:09:19.140 --> 01:09:22.300
output of the filter is the
covariance function of the

01:09:22.300 --> 01:09:25.420
input to the filter plus of
course the filter response,

01:09:25.420 --> 01:09:28.440
which is needed. OK.

01:09:31.300 --> 01:09:35.360
And this is just the same kind
of bilinear form that we were

01:09:35.360 --> 01:09:37.700
dealing with before.

01:09:37.700 --> 01:09:41.410
Next time we will talk a little
bit about the fact that

01:09:41.410 --> 01:09:45.210
when you're dealing with a
bilinear form like this, you

01:09:45.210 --> 01:09:49.220
can take these covariance
functions and they have the

01:09:49.220 --> 01:09:52.870
same kind of eigenvalues and
eigenvectors that we had

01:09:52.870 --> 01:09:55.030
before for a matrix.

01:09:55.030 --> 01:09:59.630
Namely this is again going to
be positive definite as a

01:09:59.630 --> 01:10:04.070
function, and we will be able
to find its eigenvectors and

01:10:04.070 --> 01:10:05.180
its eigenvalues.

01:10:05.180 --> 01:10:06.460
We can't calculate them.

01:10:06.460 --> 01:10:09.000
Computers can calculate them.

01:10:09.000 --> 01:10:11.310
People who've spend their
lives doing this

01:10:11.310 --> 01:10:13.060
can calculate them.

01:10:13.060 --> 01:10:16.770
I wouldn't suggest that you
spend your life doing this.

01:10:16.770 --> 01:10:20.710
Because again you would be
setting yourself up as a

01:10:20.710 --> 01:10:25.060
second class computer, and
you don't make any

01:10:25.060 --> 01:10:26.910
profit out of that.

01:10:26.910 --> 01:10:30.270
But anyway, we can find this
in principle from this.

01:10:30.270 --> 01:10:30.610
OK.

01:10:30.610 --> 01:10:33.280
One of the things that we
haven't talked about at all

01:10:33.280 --> 01:10:37.350
yet, and which we will start
talking about next time, and

01:10:37.350 --> 01:10:42.720
which the next set of lecture
notes, lecture 16, will deal

01:10:42.720 --> 01:10:45.870
with, is the question
of stationarity.

01:10:45.870 --> 01:10:49.940
Let me say just a little bit
about that to get into it.

01:10:49.940 --> 01:10:52.535
And then we'll talk a lot
more about it next time.

01:10:52.535 --> 01:10:57.620
The notes will probably be on
the web some time tomorrow.

01:10:57.620 --> 01:11:01.770
I hope before noon if you
want to look at them.

01:11:01.770 --> 01:11:08.670
Physically, suppose you look at a
stochastic process and you

01:11:08.670 --> 01:11:10.380
want to model it.

01:11:10.380 --> 01:11:14.010
I mean suppose you want to
model a noise process.

01:11:14.010 --> 01:11:16.850
How do you model a
noise process?

01:11:16.850 --> 01:11:20.850
Well you look at it over
a long period of time.

01:11:20.850 --> 01:11:23.600
You start taking statistics
about it over a

01:11:23.600 --> 01:11:26.480
long period of time.

01:11:26.480 --> 01:11:30.840
And somehow you want to model
it in such a way, I mean the

01:11:30.840 --> 01:11:32.720
only thing you can look at
is statistics over a

01:11:32.720 --> 01:11:34.420
long period of time.

01:11:34.420 --> 01:11:37.580
So if you're only looking at one
process, you can look at

01:11:37.580 --> 01:11:41.060
it for a year and then you can
model it, and then you can use

01:11:41.060 --> 01:11:44.250
that model for the
next 10 years.

01:11:44.250 --> 01:11:47.960
And what that's assuming is that
the noise process looks

01:11:47.960 --> 01:11:52.830
the same way this year
as it does next year.

01:11:52.830 --> 01:11:55.360
You can go further than that
and say, OK I'm going to

01:11:55.360 --> 01:11:59.540
manufacture cell phones or some
other kind of widget.

01:11:59.540 --> 01:12:02.820
And what I'm interested in then
is what these noise

01:12:02.820 --> 01:12:06.250
waveforms are to the whole
collection of my widgets.

01:12:06.250 --> 01:12:08.270
Namely different people
will buy my widgets.

01:12:08.270 --> 01:12:10.270
They will use them in
different places.

01:12:10.270 --> 01:12:13.700
So I'm interested in modeling
the noise over this whole set

01:12:13.700 --> 01:12:15.270
of widgets.

01:12:15.270 --> 01:12:19.810
But still if you're doing that
you're still almost forced to

01:12:19.810 --> 01:12:25.090
deal with models which have the
same statistics over at

01:12:25.090 --> 01:12:28.820
least a broad range of times.

01:12:28.820 --> 01:12:38.550
Sometimes when we're dealing
with wireless communication we

01:12:38.550 --> 01:12:41.100
say, no, the channel keeps
changing in time, and the

01:12:41.100 --> 01:12:44.420
channel keeps changing
slowly in time.

01:12:44.420 --> 01:12:47.360
And therefore you don't have
the same statistics

01:12:47.360 --> 01:12:50.660
now as you have then.

01:12:50.660 --> 01:12:54.740
If you want to understand that,
believe me, the only way

01:12:54.740 --> 01:12:58.180
you're going to understand it is
to first understand how to

01:12:58.180 --> 01:13:00.770
deal with statistics for
the channel which

01:13:00.770 --> 01:13:03.010
stay the same forever.

01:13:03.010 --> 01:13:07.010
And once you understand those
statistics, you will then be

01:13:07.010 --> 01:13:10.200
in a position to start to
understand what happens when

01:13:10.200 --> 01:13:13.970
these statistics
change slowly.

01:13:13.970 --> 01:13:14.360
OK.

01:13:14.360 --> 01:13:18.490
In other words, what our
modeling assumption is in this

01:13:18.490 --> 01:13:22.750
course, and I believe it's the
right modeling assumption for

01:13:22.750 --> 01:13:27.840
all engineers, is that you never
start with some physical

01:13:27.840 --> 01:13:31.770
phenomenon and say, I want to
test the hell out of this

01:13:31.770 --> 01:13:36.500
until I find an appropriate
statistical model for it.

01:13:36.500 --> 01:13:39.310
You do that only after
you know enough

01:13:39.310 --> 01:13:41.270
about random processes.

01:13:41.270 --> 01:13:44.680
That you know how to deal with
an enormous variety of

01:13:44.680 --> 01:13:48.520
different home cooked
random processes.

01:13:48.520 --> 01:13:48.850
OK.

01:13:48.850 --> 01:13:51.920
So what we do in a course like
this is we deal with lots of

01:13:51.920 --> 01:13:55.300
different home cooked random
processes, which is in fact

01:13:55.300 --> 01:13:58.880
why we've done rather peculiar
things like saying, let's look

01:13:58.880 --> 01:14:03.880
at a random process which
comes from a sum of

01:14:03.880 --> 01:14:06.830
independent random variables
multiplied

01:14:06.830 --> 01:14:08.920
by orthonormal functions.

01:14:08.920 --> 01:14:12.110
And you see what we've
accomplished by that already.

01:14:12.110 --> 01:14:15.660
Namely by starting out that way
we've been able to define

01:14:15.660 --> 01:14:18.270
Gaussian processes.

01:14:18.270 --> 01:14:20.840
We've been able to define what
happens when one of those

01:14:20.840 --> 01:14:24.110
Gaussian processes goes
through a filter.

01:14:24.110 --> 01:14:27.630
And in fact, that gives a way
of generating a lot more

01:14:27.630 --> 01:14:29.250
random processes.

01:14:29.250 --> 01:14:34.380
And next time what we're going
to do is not to say how is it

01:14:34.380 --> 01:14:36.920
that we know that processes
are stationary.

01:14:36.920 --> 01:14:38.760
How do you test whether
processes are

01:14:38.760 --> 01:14:40.120
stationary or not?

01:14:40.120 --> 01:14:42.930
But we're just going to assume
that they're stationary.

01:14:42.930 --> 01:14:45.910
In other words if they have the
same statistics now as they're

01:14:45.910 --> 01:14:47.400
going to have next year.

01:14:47.400 --> 01:14:51.080
And those statistics stay
the same forever.

01:14:51.080 --> 01:14:53.830
And then see what we
can say about it.

01:14:53.830 --> 01:14:56.940
To give you a clue as to how
to start looking at this,

01:14:56.940 --> 01:15:00.350
remember what we said quite
a long time ago

01:15:00.350 --> 01:15:03.640
about Markov chains.

01:15:03.640 --> 01:15:04.230
OK.

01:15:04.230 --> 01:15:07.820
And now when you look at Markov
chains, remember that

01:15:07.820 --> 01:15:11.270
what happens at one time is
statistically a function of

01:15:11.270 --> 01:15:14.780
what happened at the unit
of time before it.

01:15:14.780 --> 01:15:17.690
OK.

01:15:17.690 --> 01:15:21.940
But we can still model those
Markov chains as being

01:15:21.940 --> 01:15:23.410
stationary.

01:15:23.410 --> 01:15:27.670
Because the dependence at this
time on the previous unit of

01:15:27.670 --> 01:15:33.020
time is the same now as it will
be five years from now.

01:15:33.020 --> 01:15:33.390
OK.

01:15:33.390 --> 01:15:36.590
In other words you can't just
look at the process at one

01:15:36.590 --> 01:15:38.980
instant of time and say
this is independent

01:15:38.980 --> 01:15:40.540
of all other times.

01:15:40.540 --> 01:15:42.370
That's not what stationary
means.

01:15:42.370 --> 01:15:47.230
What stationary means is the way
the process depends on the

01:15:47.230 --> 01:15:51.720
past at time t is the same as
the way it depends on the past

01:15:51.720 --> 01:15:54.430
at some later time tau.

01:15:54.430 --> 01:15:56.260
And that in fact is the way
we're going to define

01:15:56.260 --> 01:15:58.380
stationarity.

01:15:58.380 --> 01:16:03.740
It's these joint sample
times that we're going to be

01:16:03.740 --> 01:16:06.350
looking at.

01:16:06.350 --> 01:16:09.390
I wish I had a better
word for that.

01:16:09.390 --> 01:16:12.140
Joint sets of epochs that
we'll be looking at.

01:16:12.140 --> 01:16:15.570
The joint statistics over a set
of epochs are going to be

01:16:15.570 --> 01:16:19.350
the same now as it will be at
sometime in the future.

01:16:19.350 --> 01:16:22.040
And that's the way we're going
to define stationarity.

01:16:22.040 --> 01:16:25.370
A prelude of what we're going
to find when we do that is

01:16:25.370 --> 01:16:30.690
that this covariance function,
if the covariance function at

01:16:30.690 --> 01:16:36.160
time t and time tau is the same
if you translate it up to

01:16:36.160 --> 01:16:41.560
t plus t1 and tau plus t1, then
in fact this function is

01:16:41.560 --> 01:16:45.740
going to be a function only of
the difference t minus tau.

01:16:45.740 --> 01:16:47.850
It's going to be a function
of one variable

01:16:47.850 --> 01:16:50.840
instead of two variables.

01:16:50.840 --> 01:16:51.250
OK.

01:16:51.250 --> 01:16:53.710
So as soon as we get this
function being a function of

01:16:53.710 --> 01:16:58.200
one variable instead of two
variables, the first thing we're

01:16:58.200 --> 01:17:02.010
going to do is to take the
Fourier transform of this.

01:17:02.010 --> 01:17:05.240
Because then we'll be taking
the Fourier transform of a

01:17:05.240 --> 01:17:07.710
function of a single variable.

01:17:07.710 --> 01:17:09.530
We're going to call
that the spectral

01:17:09.530 --> 01:17:12.440
density of the process.
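
In symbols, the plan just outlined (a sketch):

```latex
\[
K_Z(t,\tau) = K_Z(t+t_1,\,\tau+t_1)\ \text{for all } t_1
  \;\Longrightarrow\; K_Z(t,\tau) = \tilde{K}(t-\tau),
\qquad
S_Z(f) = \int \tilde{K}(u)\, e^{-2\pi i f u}\, du .
\]
```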

01:17:12.440 --> 01:17:17.100
And we're going to find that for
stationary processes the

01:17:17.100 --> 01:17:20.560
spectral densities tell you

01:17:20.560 --> 01:17:21.980
everything if they're Gaussian.

01:17:21.980 --> 01:17:22.930
Why is that?

01:17:22.930 --> 01:17:24.940
Well the inverse Fourier
transform is

01:17:24.940 --> 01:17:28.470
this covariance function.

01:17:28.470 --> 01:17:31.330
And we've now seen that the
covariance function tells you

01:17:31.330 --> 01:17:34.200
everything about a
Gaussian process.

01:17:34.200 --> 01:17:36.600
So if you know the spectral
density for a stationary

01:17:36.600 --> 01:17:38.850
process, it will tell
you everything.

01:17:38.850 --> 01:17:42.720
We will also have to fiddle
around a little bit about how

01:17:42.720 --> 01:17:45.280
we define stationarity,

01:17:45.280 --> 01:17:46.860
but at the same time
don't have this

01:17:46.860 --> 01:17:49.580
infinite energy problem.

01:17:49.580 --> 01:17:50.930
And the way we're going
to do it is the way

01:17:50.930 --> 01:17:52.010
we've done it all along.

01:17:52.010 --> 01:17:54.330
We're going to take something
that looked stationary, we're

01:17:54.330 --> 01:17:58.210
going to truncate it over some
long period of time, and we're

01:17:58.210 --> 01:18:00.570
going to have our cake and
eat it too that way.

01:18:00.570 --> 01:18:01.040
OK.

01:18:01.040 --> 01:18:03.660
So we'll do that next time.