WEBVTT
00:00:00.530 --> 00:00:02.960
The following content is
provided under a Creative
00:00:02.960 --> 00:00:04.370
Commons license.
00:00:04.370 --> 00:00:07.410
Your support will help MIT
OpenCourseWare continue to
00:00:07.410 --> 00:00:11.060
offer high quality educational
resources for free.
00:00:11.060 --> 00:00:13.960
To make a donation or view
additional materials from
00:00:13.960 --> 00:00:17.890
hundreds of MIT courses, visit
MIT OpenCourseWare at
00:00:17.890 --> 00:00:19.140
ocw.mit.edu.
00:00:24.600 --> 00:00:25.230
PROFESSOR: OK.
00:00:25.230 --> 00:00:28.710
Today we're all through with
Markov chains, or at least
00:00:28.710 --> 00:00:32.530
with finite state
Markov chains.
00:00:32.530 --> 00:00:36.440
And we're going on to
renewal processes.
00:00:36.440 --> 00:00:40.660
As part of that, we will spend
a good deal of time talking
00:00:40.660 --> 00:00:44.950
about the strong law of large
numbers, and convergence with
00:00:44.950 --> 00:00:46.830
probability one.
00:00:46.830 --> 00:00:51.440
The idea of convergence with
probability one, at least to
00:00:51.440 --> 00:00:56.540
me is by far the most difficult
part of the course.
00:00:56.540 --> 00:00:59.890
It's very abstract
mathematically.
00:00:59.890 --> 00:01:03.170
It looks like it's simple, and
it's one of those things you
00:01:03.170 --> 00:01:06.630
start to think you understand
it, and then at a certain
00:01:06.630 --> 00:01:09.460
point, you realize
that you don't.
00:01:09.460 --> 00:01:12.740
And this has been happening
to me for 20 years now.
00:01:12.740 --> 00:01:16.650
I keep thinking I really
understand this idea of
00:01:16.650 --> 00:01:19.800
convergence with probability
one, and then I see some
00:01:19.800 --> 00:01:22.180
strange example again.
00:01:22.180 --> 00:01:25.130
And I say there's something
very peculiar
00:01:25.130 --> 00:01:26.500
about this whole idea.
00:01:26.500 --> 00:01:29.750
And I'm going to illustrate
that to you at the end of the
00:01:29.750 --> 00:01:30.980
lecture today.
00:01:30.980 --> 00:01:35.100
But for the most part, I will
be talking not so much about
00:01:35.100 --> 00:01:40.260
renewal processes, but this
set of mathematical issues
00:01:40.260 --> 00:01:43.990
that we have to understand in
order to be able to look at
00:01:43.990 --> 00:01:47.440
renewal processes in
the simplest way.
00:01:47.440 --> 00:01:50.830
One of the funny things about
the strong law of large
00:01:50.830 --> 00:01:55.420
numbers and how it gets applied
to renewal processes
00:01:55.420 --> 00:02:01.220
is that although the idea of
convergence with probability
00:02:01.220 --> 00:02:06.990
one is sticky and strange, once
you understand it, it is
00:02:06.990 --> 00:02:11.920
one of the easiest things
to use there is.
00:02:11.920 --> 00:02:16.220
And therefore, once you become
comfortable with it, you can
00:02:16.220 --> 00:02:19.150
use it to do things which would
be very hard to do in
00:02:19.150 --> 00:02:21.260
any other way.
00:02:21.260 --> 00:02:24.630
And because of that, most people
feel they understand it
00:02:24.630 --> 00:02:26.700
better than they actually do.
00:02:26.700 --> 00:02:30.670
And that's the reason why it
sometimes crops up when you're
00:02:30.670 --> 00:02:33.230
least expecting it, and
you find there's
00:02:33.230 --> 00:02:35.630
something very peculiar.
00:02:35.630 --> 00:02:39.050
OK, so let's start out by
talking a little bit about
00:02:39.050 --> 00:02:41.580
renewal processes.
00:02:41.580 --> 00:02:47.570
And then talking about this
convergence, and the strong
00:02:47.570 --> 00:02:52.190
law of large numbers, and what
it does to all of this.
00:02:52.190 --> 00:02:54.090
This is just review.
00:02:54.090 --> 00:02:57.230
We talked about arrival
processes when we started
00:02:57.230 --> 00:03:00.090
talking about Poisson
processes.
00:03:00.090 --> 00:03:03.740
Renewal processes are a special
kind of arrival
00:03:03.740 --> 00:03:07.510
process, and Poisson processes
are a special kind
00:03:07.510 --> 00:03:09.290
of renewal process.
00:03:09.290 --> 00:03:13.710
So this is something you're
already sort of familiar with.
00:03:13.710 --> 00:03:20.310
All arrival processes we
will tend to treat in one of
00:03:20.310 --> 00:03:23.450
three equivalent ways, which is
the same thing we did with
00:03:23.450 --> 00:03:24.700
Poisson processes.
00:03:27.250 --> 00:03:30.250
A stochastic process,
we said, is a
00:03:30.250 --> 00:03:32.410
family of random variables.
00:03:32.410 --> 00:03:38.070
But in this case, we always view
it as three families of
00:03:38.070 --> 00:03:40.770
random variables, which
are all related.
00:03:40.770 --> 00:03:43.720
And all of which define
the other.
00:03:43.720 --> 00:03:46.780
And you jump back and forth from
looking at one to looking
00:03:46.780 --> 00:03:51.650
at the other, which is, as you
saw with Poisson processes,
00:03:51.650 --> 00:03:55.250
you really want to do this,
because if you stick to only
00:03:55.250 --> 00:04:00.060
one way of looking at it, you
really only pick up about a
00:04:00.060 --> 00:04:02.736
quarter, or a half
of the picture.
00:04:02.736 --> 00:04:04.840
OK.
00:04:04.840 --> 00:04:08.810
So this one picture gives us
a relationship between the
00:04:08.810 --> 00:04:14.170
arrival epochs of an arrival
process, the inter-arrival
00:04:14.170 --> 00:04:21.560
intervals, the x1, x2, x3, and
the counting process, n of t,
00:04:21.560 --> 00:04:26.100
and whichever one you use, you
use the one which is easiest,
00:04:26.100 --> 00:04:28.330
for whatever you plan to do.
00:04:28.330 --> 00:04:31.980
For defining a renewal process,
the easy thing to do
00:04:31.980 --> 00:04:36.330
is to look at the inter-arrival
intervals,
00:04:36.330 --> 00:04:40.280
because the definition of a
renewal process is it's an
00:04:40.280 --> 00:04:45.930
arrival process for which the
inter-renewal intervals are independent
00:04:45.930 --> 00:04:48.290
and identically distributed.
00:04:48.290 --> 00:04:53.000
So any arrival process where the
inter-arrivals have
00:04:53.000 --> 00:04:56.260
that property, namely are IID, is a renewal process.
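As a concrete sketch of this definition (my own illustration, not from the lecture): draw IID inter-arrival intervals, sum them to get the arrival epochs, and count epochs up to t to get the counting process. The exponential distribution here is just an assumed example; any positive IID distribution gives a renewal process.

```python
import random

random.seed(1)

# IID inter-arrival intervals X1, X2, ... (exponential, rate 2, as an
# example; any positive IID distribution would give a renewal process)
X = [random.expovariate(2.0) for _ in range(1000)]

# Arrival epochs S_n = X_1 + ... + X_n
S = []
total = 0.0
for x in X:
    total += x
    S.append(total)

# Counting process N(t) = number of arrival epochs S_n <= t
def N(t):
    return sum(1 for s in S if s <= t)

print(N(10.0))   # arrivals in [0, 10]; roughly rate * t on average
```

Any one of the three views — intervals X, epochs S, or the counting process N — determines the other two, which is why the lecture jumps back and forth between them.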
00:04:56.260 --> 00:05:00.640
OK, renewal processes are
characterized, and the name
00:05:00.640 --> 00:05:06.700
comes from the idea that you
start over at each interval.
00:05:06.700 --> 00:05:11.180
This idea of starting over is
something that we talk about
00:05:11.180 --> 00:05:13.350
more later on.
00:05:13.350 --> 00:05:17.310
And it's a little bit strange,
and a little bit fishy.
00:05:17.310 --> 00:05:21.910
It's like with a Poisson
process, you look at different
00:05:21.910 --> 00:05:25.400
intervals, and they're
independent of each other.
00:05:25.400 --> 00:05:28.860
And we sort of know what
that means by now.
00:05:28.860 --> 00:05:34.540
OK, you look at the arrival
epochs for a Poisson process.
00:05:34.540 --> 00:05:37.280
Are they independent
of each other?
00:05:37.280 --> 00:05:39.210
Of course not.
00:05:39.210 --> 00:05:41.450
The arrival epochs are
the sums of the
00:05:41.450 --> 00:05:43.520
inter-arrival intervals.
00:05:43.520 --> 00:05:45.950
The inter-arrival intervals
are the things that are
00:05:45.950 --> 00:05:47.190
independent.
00:05:47.190 --> 00:05:49.260
And the arrival epochs
are the sums of
00:05:49.260 --> 00:05:50.810
inter-arrival intervals.
00:05:50.810 --> 00:05:54.250
If you know that the first
arrival epoch takes 10 times
00:05:54.250 --> 00:05:59.480
longer than its mean, then
the second arrival epoch
00:05:59.480 --> 00:06:01.590
is going to be kind
of long, too.
00:06:01.590 --> 00:06:05.910
It's got to be at least 10 times
as long as the mean of
00:06:05.910 --> 00:06:10.330
the inter-arrival intervals,
because each arrival epoch is
00:06:10.330 --> 00:06:13.130
a sum of these inter-arrival
intervals.
00:06:13.130 --> 00:06:15.980
It's the inter-arrival intervals
that are independent.
00:06:15.980 --> 00:06:19.720
So when you say that the one
interval is independent of the
00:06:19.720 --> 00:06:23.720
other, yes, you know exactly
what you mean.
00:06:23.720 --> 00:06:26.160
And the idea is very simple.
00:06:26.160 --> 00:06:28.890
It's the same idea here.
00:06:28.890 --> 00:06:32.490
But then you start to think you
understand this, and you
00:06:32.490 --> 00:06:35.980
start to use it in
a funny way.
00:06:35.980 --> 00:06:39.710
And suddenly you're starting to
say that the arrival epochs
00:06:39.710 --> 00:06:42.280
are independent from one time
to the other, which they
00:06:42.280 --> 00:06:44.050
certainly aren't.
00:06:44.050 --> 00:06:49.070
What renewal theory does is it
lets you treat the gross
00:06:49.070 --> 00:06:53.000
characteristics of a process
in a very simple and
00:06:53.000 --> 00:06:54.640
straightforward way.
00:06:54.640 --> 00:06:58.270
So you're breaking up the
process into two sets
00:06:58.270 --> 00:06:59.700
of views about it.
00:06:59.700 --> 00:07:03.470
One is the long term behavior,
which you treat by renewal
00:07:03.470 --> 00:07:10.110
theory, and you use this one
exotic theory in a simple and
00:07:10.110 --> 00:07:14.550
straightforward way for every
different process, for every
00:07:14.550 --> 00:07:17.150
different renewal process
you look at.
00:07:17.150 --> 00:07:21.230
And then you have this usually
incredibly complicated kind of
00:07:21.230 --> 00:07:26.260
thing in the inside of
each arrival epoch.
00:07:26.260 --> 00:07:29.650
And the nice thing about renewal
theory is it lets you
00:07:29.650 --> 00:07:33.180
look at that complicated thing
without worrying about what's
00:07:33.180 --> 00:07:35.090
going on outside.
00:07:35.090 --> 00:07:38.950
So the local characteristics
can be studied without
00:07:38.950 --> 00:07:43.410
worrying about the long
term interactions.
00:07:43.410 --> 00:07:48.470
One example of this, and one
of the reasons we are now
00:07:48.470 --> 00:07:51.280
looking at Markov chains before
we look at renewal
00:07:51.280 --> 00:07:57.000
processes is that a Markov chain
is one of the nicest
00:07:57.000 --> 00:08:01.890
examples there is of a renewal
process, when you look at it
00:08:01.890 --> 00:08:03.450
in the right way.
00:08:03.450 --> 00:08:11.150
If you have a recurrent Markov
chain, then the interval from
00:08:11.150 --> 00:08:15.360
one time entering a particular
recurrent state
00:08:15.360 --> 00:08:17.190
until the next time
you enter that
00:08:17.190 --> 00:08:21.500
recurrent state is a renewal.
00:08:21.500 --> 00:08:27.570
So we look at the sequence of
times at which we enter this
00:08:27.570 --> 00:08:29.110
one given state.
00:08:29.110 --> 00:08:31.150
Enter state one over here.
00:08:31.150 --> 00:08:33.520
We enter state one
again over here.
00:08:33.520 --> 00:08:36.780
We enter state one again,
and so forth.
00:08:36.780 --> 00:08:39.710
We're ignoring everything
that goes on between
00:08:39.710 --> 00:08:41.539
entries to state one.
00:08:41.539 --> 00:08:44.810
But every time you enter state
1, you're in the same
00:08:44.810 --> 00:08:49.490
situation as you were the last
time you entered state one.
00:08:49.490 --> 00:08:53.150
You're in the same situation,
in the sense that the
00:08:53.150 --> 00:08:58.010
inter-arrivals from state one
to state one again are
00:08:58.010 --> 00:08:59.910
independent of what
they were before.
00:08:59.910 --> 00:09:03.780
In other words, when you enter
state one, your successive
00:09:03.780 --> 00:09:07.650
state transitions from
there are the same
00:09:07.650 --> 00:09:08.850
as they were before.
00:09:08.850 --> 00:09:15.500
So it's the same situation as we
saw with Poisson processes,
00:09:15.500 --> 00:09:19.930
and it's the same kind of
renewal where when you talk
00:09:19.930 --> 00:09:25.800
about renewal, you have to be
very careful about what it is
00:09:25.800 --> 00:09:26.740
that's a renewal.
00:09:26.740 --> 00:09:30.420
Once you're careful about it,
it's clear what's going on.
00:09:30.420 --> 00:09:33.000
One of the things we're going to
find out now is one of the
00:09:33.000 --> 00:09:36.610
things that we failed to point
out before when we talked
00:09:36.610 --> 00:09:39.400
about finite state
Markov chains.
00:09:39.400 --> 00:09:43.080
One of the most interesting
characteristics is the
00:09:43.080 --> 00:09:47.620
expected amount of time from one
entry to a recurrent state
00:09:47.620 --> 00:09:51.380
until the next time you enter
that recurrent state is 1 over
00:09:51.380 --> 00:09:55.390
pi sub i, where pi sub i is the
steady state probability of
00:09:55.390 --> 00:09:57.670
that state.
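A quick numerical sketch of that fact (my own example with an assumed two-state transition matrix, not from the lecture): simulate the chain, record the intervals between successive entries to one state, and compare the average interval with 1 over that state's steady-state probability.

```python
import random

random.seed(0)

# Assumed two-state chain: P[i][j] = probability of moving i -> j
P = [[0.9, 0.1],
     [0.5, 0.5]]
# Steady state solves pi = pi P: pi_0 = 0.5/0.6 = 5/6, so 1/pi_0 = 1.2

def step(i):
    return 0 if random.random() < P[i][0] else 1

state = 0
returns = []        # intervals between successive entries to state 0
steps_since = 0
for _ in range(200_000):
    state = step(state)
    steps_since += 1
    if state == 0:
        returns.append(steps_since)
        steps_since = 0

mean_return = sum(returns) / len(returns)
print(mean_return)  # close to 1 / pi_0 = 1.2
</test>```

The entries to state 0 are the renewal epochs here; everything the chain does between two entries is the "complicated inside" that renewal theory lets us ignore.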
00:09:57.670 --> 00:09:59.600
Namely, we didn't do that.
00:09:59.600 --> 00:10:03.290
It's a little tricky to do that
in terms of Markov chains;
00:10:03.290 --> 00:10:07.530
it's almost trivial to do it in
terms of renewal processes.
00:10:07.530 --> 00:10:11.450
And what's more, when we do it
in terms of renewal processes,
00:10:11.450 --> 00:10:13.230
you will see that it's
obvious, and you will
00:10:13.230 --> 00:10:14.410
never forget it.
00:10:14.410 --> 00:10:17.120
If we did it in terms of Markov
chains, it would be
00:10:17.120 --> 00:10:21.700
some long, tedious derivation,
and you'd get this nice
00:10:21.700 --> 00:10:24.970
answer, and you say, why did
that nice answer occur?
00:10:24.970 --> 00:10:26.500
And you wouldn't
have any idea.
00:10:26.500 --> 00:10:29.000
When you look at renewal
processes, it's
00:10:29.000 --> 00:10:31.100
obvious why it happens.
00:10:31.100 --> 00:10:34.130
And we'll see why that
is very soon.
00:10:34.130 --> 00:10:38.310
Also, after we finish renewal
processes, the next thing
00:10:38.310 --> 00:10:41.330
we're going to do is to
talk about accountable
00:10:41.330 --> 00:10:43.090
state Markov chains.
00:10:43.090 --> 00:10:46.740
Markov chains with a countably
infinite
00:10:46.740 --> 00:10:48.025
set of states.
00:10:52.520 --> 00:10:56.100
If you don't have a background
in renewal theory when you
00:10:56.100 --> 00:10:59.600
start to look at that, you
get very confused.
00:10:59.600 --> 00:11:02.940
So renewal theory will give us
the right tool to look at
00:11:02.940 --> 00:11:07.140
those more complicated
Markov chains.
00:11:07.140 --> 00:11:07.830
OK.
00:11:07.830 --> 00:11:10.620
So the theory for Markov chains
with countably infinite
00:11:10.620 --> 00:11:14.940
state spaces comes largely
from renewal processes.
00:11:14.940 --> 00:11:19.350
So yes, we'll be interested
in understanding that.
00:11:19.350 --> 00:11:19.750
OK.
00:11:19.750 --> 00:11:25.430
Another example is the G/G/m queue.
00:11:25.430 --> 00:11:28.980
The text talked a little bit,
and we might have talked in
00:11:28.980 --> 00:11:32.810
class a little bit about this
strange notation queueing
00:11:32.810 --> 00:11:34.710
theorists use.
00:11:34.710 --> 00:11:37.360
There are always at least three
letters separated by
00:11:37.360 --> 00:11:41.140
slashes to talk about
what kind of queue
00:11:41.140 --> 00:11:42.670
you're talking about.
00:11:42.670 --> 00:11:47.990
The first letter describes the
arrival process for the queue.
00:11:47.990 --> 00:11:52.860
G means it's a general arrival
process, which doesn't really
00:11:52.860 --> 00:11:54.960
mean it's a general
arrival process.
00:11:54.960 --> 00:11:58.210
It means the arrival
process is renewal.
00:11:58.210 --> 00:12:03.210
Namely, it says the arrival
process has IID
00:12:03.210 --> 00:12:05.280
inter-arrivals.
00:12:05.280 --> 00:12:07.890
But you don't know what
their distribution is.
00:12:07.890 --> 00:12:11.060
You would call that M if you
meant a Poisson process, which
00:12:11.060 --> 00:12:14.780
would mean memoryless
inter-arrivals.
00:12:14.780 --> 00:12:19.940
The second G stands for the
service time distribution.
00:12:19.940 --> 00:12:23.690
Again, we assume that no matter
how many servers you
00:12:23.690 --> 00:12:27.700
have, no matter how the servers
work, the time to
00:12:27.700 --> 00:12:31.980
serve one user is independent
of the time
00:12:31.980 --> 00:12:33.600
to serve other users.
00:12:33.600 --> 00:12:37.370
But that the distribution of
that time has a general
00:12:37.370 --> 00:12:38.600
distribution.
00:12:38.600 --> 00:12:42.800
It would be M if you meant a
memoryless distribution,
00:12:42.800 --> 00:12:45.680
which would mean exponential
distribution.
00:12:45.680 --> 00:12:51.110
Finally, the thing at the end
says we're talking about a queue
00:12:51.110 --> 00:12:52.950
with m servers.
00:12:52.950 --> 00:12:55.180
So the point here is we're
talking about a relatively
00:12:55.180 --> 00:12:57.690
complicated thing.
00:12:57.690 --> 00:13:00.550
Can you talk about this
in terms of renewals?
00:13:00.550 --> 00:13:05.290
Yes, you can, but it's not quite
obvious how to do it.
00:13:05.290 --> 00:13:09.550
You would think that the obvious
way of viewing a
00:13:09.550 --> 00:13:14.620
complicated queue like this is
to look at what happens from
00:13:14.620 --> 00:13:18.040
one busy period to the
next busy period.
00:13:18.040 --> 00:13:20.590
You would think the busy periods
would be independent
00:13:20.590 --> 00:13:21.800
of each other.
00:13:21.800 --> 00:13:23.610
But they're not quite.
00:13:23.610 --> 00:13:27.820
Suppose you finish one busy
period, and when you finish
00:13:27.820 --> 00:13:33.525
the one busy period, one
customer has just finished
00:13:33.525 --> 00:13:34.930
being served.
00:13:34.930 --> 00:13:38.630
But at that point, you're in the
middle of waiting for
00:13:38.630 --> 00:13:41.230
the next customer to arrive.
00:13:41.230 --> 00:13:43.630
And as that's a general
distribution, the amount of
00:13:43.630 --> 00:13:47.380
time you have to wait for that
next customer to arrive
00:13:47.380 --> 00:13:50.120
depends on a whole
lot of things in
00:13:50.120 --> 00:13:52.000
the previous interval.
00:13:52.000 --> 00:13:54.540
So how can you talk about
renewals here?
00:13:54.540 --> 00:13:57.520
You talk about renewals by
waiting until that next
00:13:57.520 --> 00:13:58.730
arrival comes.
00:13:58.730 --> 00:14:04.270
When that next arrival comes to
terminate the idle period
00:14:04.270 --> 00:14:07.970
between busy periods, at that
time you're in the same state
00:14:07.970 --> 00:14:10.730
that you were in when the whole
thing started before.
00:14:10.730 --> 00:14:14.180
When you had the first
arrival come in.
00:14:14.180 --> 00:14:17.480
And at that point, you had one
arrival there being served. You
00:14:17.480 --> 00:14:20.360
go through some long
complicated thing.
00:14:20.360 --> 00:14:22.730
Eventually the busy
period is over.
00:14:22.730 --> 00:14:25.760
Eventually, then, another
arrival comes in.
00:14:25.760 --> 00:14:28.920
And presto, at that point,
you're statistically back
00:14:28.920 --> 00:14:30.070
where you started.
00:14:30.070 --> 00:14:33.560
You're statistically back where
you started in terms of
00:14:33.560 --> 00:14:37.400
all inter-arrival times
at that point.
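One way to see those regeneration points numerically (my own sketch for a single server, i.e. G/G/1 rather than G/G/m, with assumed uniform distributions): track each customer's waiting time with the Lindley recursion W(n+1) = max(0, W(n) + S(n) - A(n+1)). An arrival that finds W = 0 is exactly the arrival that ends an idle period, and those arrivals are the renewal epochs described above.

```python
import random

random.seed(2)

n = 10_000
# Assumed distributions: uniform inter-arrival gaps, uniform service
# times with a smaller mean, so the queue empties now and then
A = [random.uniform(0.5, 1.5) for _ in range(n)]  # inter-arrival gaps
S = [random.uniform(0.2, 1.0) for _ in range(n)]  # service times

renewal_indices = [0]   # customer 0 starts the first busy period
W = 0.0                 # waiting time of the current customer
for i in range(1, n):
    W = max(0.0, W + S[i - 1] - A[i])
    if W == 0.0:
        # this arrival found an empty system: a renewal epoch
        renewal_indices.append(i)

print(len(renewal_indices))   # number of busy periods started
```

Everything between two of these indices — however messy the queue gets inside a busy period — is statistically a fresh copy of everything between the previous two.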
00:14:37.400 --> 00:14:40.220
And we will have to, even
though it's intuitively
00:14:40.220 --> 00:14:44.300
obvious that those things are
independent of each other,
00:14:44.300 --> 00:14:47.380
we're really going to have to
sort that out a little bit,
00:14:47.380 --> 00:14:51.290
because you come upon
many situations
00:14:51.290 --> 00:14:53.820
where this is not obvious.
00:14:53.820 --> 00:14:56.310
So if you don't know how to
sort it out when it is
00:14:56.310 --> 00:14:59.480
obvious, you're not going to
know how to sort it out when
00:14:59.480 --> 00:15:00.490
it's not obvious.
00:15:00.490 --> 00:15:03.840
But anyway, that's another
example of
00:15:03.840 --> 00:15:06.650
where we have renewals.
00:15:06.650 --> 00:15:08.710
OK.
00:15:08.710 --> 00:15:11.760
We want to talk about
convergence now.
00:15:11.760 --> 00:15:15.170
This idea of convergence
with probability one.
00:15:15.170 --> 00:15:17.990
It's based on the
idea of numbers
00:15:17.990 --> 00:15:21.440
converging to some limit.
00:15:21.440 --> 00:15:27.430
And I'm always puzzled about how
much to talk about this,
00:15:27.430 --> 00:15:31.440
because all of you, when you
first study calculus, talk
00:15:31.440 --> 00:15:32.990
about limits.
00:15:32.990 --> 00:15:36.650
Most of you, if you're
engineers, when you talk about
00:15:36.650 --> 00:15:40.400
calculus, it goes in one ear and
it goes out the other ear,
00:15:40.400 --> 00:15:43.900
because you don't have to
understand this very much.
00:15:43.900 --> 00:15:46.820
Because all the things you deal
with, the limits exist
00:15:46.820 --> 00:15:49.330
very nicely, and there's
no problem.
00:15:49.330 --> 00:15:51.170
So you can ignore it.
00:15:51.170 --> 00:15:55.310
And then you hear about these
epsilons and deltas, and I do
00:15:55.310 --> 00:15:56.930
the same thing.
00:15:56.930 --> 00:15:59.840
I can deal with an epsilon,
but as soon as you have an
00:15:59.840 --> 00:16:03.890
epsilon and a delta,
I go into orbit.
00:16:03.890 --> 00:16:07.210
I have no idea what's going on
anymore until I sit down and
00:16:07.210 --> 00:16:09.350
think about it very,
very carefully.
00:16:09.350 --> 00:16:13.200
Fortunately, when we have a
sequence of numbers, we only
00:16:13.200 --> 00:16:14.060
have an epsilon.
00:16:14.060 --> 00:16:15.610
We don't have a delta.
00:16:15.610 --> 00:16:17.095
So things are a little
bit simpler.
00:16:19.860 --> 00:16:23.260
I should warn you, though, that
you can't let this go in
00:16:23.260 --> 00:16:27.250
one ear and out the other ear,
because at this point, we are
00:16:27.250 --> 00:16:30.590
using the convergence of numbers
to be able to talk
00:16:30.590 --> 00:16:34.330
about convergence of random
variables, and convergence of
00:16:34.330 --> 00:16:38.500
random variables is indeed
not a simple topic.
00:16:38.500 --> 00:16:43.540
Convergence of numbers is a
simple topic made complicated
00:16:43.540 --> 00:16:44.790
by mathematicians.
00:16:46.800 --> 00:16:49.760
Any good mathematician,
when they hear me say
00:16:49.760 --> 00:16:51.430
this will be furious.
00:16:51.430 --> 00:16:55.210
Because in fact, when you think
about what they've done,
00:16:55.210 --> 00:16:59.080
they've taken something which
is simple but looks
00:16:59.080 --> 00:17:03.820
complicated, and they've turned
it into something which
00:17:03.820 --> 00:17:06.609
looks complicated in another
way, but is really the
00:17:06.609 --> 00:17:08.359
simplest way to deal with it.
00:17:08.359 --> 00:17:12.680
So let's do that and be done
with it, and then we can start
00:17:12.680 --> 00:17:15.140
using it for random variables.
00:17:15.140 --> 00:17:19.880
A sequence, b1, b2, b3, so
forth of real numbers.
00:17:19.880 --> 00:17:22.520
Real numbers or complex numbers,
it doesn't make any
00:17:22.520 --> 00:17:30.640
difference, is said to converge
to a limit, b.
00:17:30.640 --> 00:17:33.470
If for each real epsilon greater
than zero, there is an
00:17:33.470 --> 00:17:37.700
integer m such that bn minus
b is less than or equal to
00:17:37.700 --> 00:17:41.460
epsilon for all n greater
than or equal to m.
00:17:41.460 --> 00:17:44.580
Now, how many people can look
at that and understand it?
00:17:44.580 --> 00:17:45.510
Be honest.
00:17:45.510 --> 00:17:46.250
Good.
00:17:46.250 --> 00:17:48.410
Some of you can.
00:17:48.410 --> 00:17:53.760
How many people look at that,
and their mind just, ah!
00:17:53.760 --> 00:17:57.010
How many people are
in that category?
00:17:57.010 --> 00:17:58.020
I am.
00:17:58.020 --> 00:18:00.660
But if I'm the only
one, that's good.
00:18:00.660 --> 00:18:03.070
OK.
00:18:03.070 --> 00:18:06.250
There's an equivalent way
to talk about this.
00:18:06.250 --> 00:18:10.110
A sequence of numbers, real or
complex, is said to converge
00:18:10.110 --> 00:18:11.420
to limit b.
00:18:11.420 --> 00:18:16.480
If for each integer k greater
than zero, there's an integer
00:18:16.480 --> 00:18:21.450
m of k, such that bn minus b
is less than or equal to 1
00:18:21.450 --> 00:18:25.825
over k for all n greater
than or equal to m.
00:18:25.825 --> 00:18:26.210
OK.
00:18:26.210 --> 00:18:30.160
And the argument there is pick
any epsilon you want to, no
00:18:30.160 --> 00:18:32.670
matter how small.
00:18:32.670 --> 00:18:35.930
And then you pick a k, such that
1 over k is less than or
00:18:35.930 --> 00:18:37.340
equal to epsilon.
00:18:37.340 --> 00:18:42.890
According to this definition, bn
minus b less than or equal
00:18:42.890 --> 00:18:48.990
to 1 over k ensures that you
have this condition up here
00:18:48.990 --> 00:18:51.120
that we're talking about.
00:18:51.120 --> 00:18:56.160
When bn minus b is less than
or equal to 1 over k, then
00:18:56.160 --> 00:18:59.950
also bn minus b is less than
or equal to epsilon.
00:18:59.950 --> 00:19:02.780
In other words, when you look
at this, you're starting to
00:19:02.780 --> 00:19:06.640
see what this definition
really means.
00:19:06.640 --> 00:19:12.700
Here, you don't really care
about all epsilon.
00:19:12.700 --> 00:19:18.200
All you care about is that
this holds true for small
00:19:18.200 --> 00:19:19.740
enough epsilon.
00:19:19.740 --> 00:19:23.890
And the trouble is there's
no way to specify a
00:19:23.890 --> 00:19:25.570
small enough epsilon.
00:19:25.570 --> 00:19:29.670
So the only way we can do this
is to say for all epsilon.
00:19:29.670 --> 00:19:35.700
But what the argument is is if
you can assert this statement
00:19:35.700 --> 00:19:39.650
for a sequence of smaller and
smaller values of epsilon,
00:19:39.650 --> 00:19:41.110
that's all you need.
00:19:41.110 --> 00:19:45.510
Because as soon as this is true
for one value of epsilon,
00:19:45.510 --> 00:19:48.200
it's true for all smaller
values of epsilon.
00:19:48.200 --> 00:19:53.110
Now, let me show you a picture
which, unfortunately, is
00:19:53.110 --> 00:19:56.680
a kind of a complicated
picture.
00:19:56.680 --> 00:20:00.900
It's the picture that says what
that argument was really
00:20:00.900 --> 00:20:02.410
talking about.
00:20:02.410 --> 00:20:05.720
So if you don't understand the
picture, you were kidding
00:20:05.720 --> 00:20:08.230
yourself when you said you
thought you understood what
00:20:08.230 --> 00:20:10.090
the definition said.
00:20:10.090 --> 00:20:12.970
So what the picture says,
it's in terms of
00:20:12.970 --> 00:20:16.800
this 1 over k business.
00:20:16.800 --> 00:20:21.610
It says if you have a sequence
of numbers, b1, b2, b3, excuse
00:20:21.610 --> 00:20:24.220
me for insulting you
by talking about
00:20:24.220 --> 00:20:26.302
something so trivial.
00:20:26.302 --> 00:20:29.350
But believe me, as soon as we
start talking about random
00:20:29.350 --> 00:20:33.500
variables, this trivial thing
mixed with so many other
00:20:33.500 --> 00:20:37.160
things will start to become less
trivial, and you really
00:20:37.160 --> 00:20:40.320
need to understand what
this is saying.
00:20:40.320 --> 00:20:46.020
So we're saying if we have a
sequence, b1, b2, b3, b4, b5
00:20:46.020 --> 00:20:52.030
and so forth, what that second
idea of convergence says is
00:20:52.030 --> 00:21:02.760
that there's an M1 which says
that for all n greater than or
00:21:02.760 --> 00:21:12.500
equal to M1, b4, b5, b6, b7
minus b all lie within this
00:21:12.500 --> 00:21:16.660
limit here between b plus
1 and b minus 1.
00:21:16.660 --> 00:21:20.360
There's a number M2, which says
that as soon as you get
00:21:20.360 --> 00:21:23.570
bigger than M of 2, all
these numbers lie
00:21:23.570 --> 00:21:25.390
between these two limits.
00:21:25.390 --> 00:21:28.890
There's a number M3, which says
all of these numbers lie
00:21:28.890 --> 00:21:30.130
between these limits.
00:21:30.130 --> 00:21:34.670
So it's saying that, it's
essentially saying that you
00:21:34.670 --> 00:21:40.340
have a pipe, and as n increases,
you squeeze this
00:21:40.340 --> 00:21:42.000
pipe gradually down.
00:21:42.000 --> 00:21:44.550
You don't know how fast you
can squeeze it down when
00:21:44.550 --> 00:21:46.440
you're talking about
convergence.
00:21:46.440 --> 00:21:49.460
You might have something that
converges very slowly, and
00:21:49.460 --> 00:21:51.790
then M1 will be way out here.
00:21:51.790 --> 00:21:53.700
M2 will be way over there.
00:21:53.700 --> 00:21:58.480
M3 will be off on the
other side of Vassar
00:21:58.480 --> 00:22:01.160
Street, and so forth.
00:22:01.160 --> 00:22:05.650
But there always is such an
M1, M2, and M3, which says
00:22:05.650 --> 00:22:09.070
these numbers are getting
closer and closer to b.
00:22:09.070 --> 00:22:12.280
And they're staying closer
and closer to b.
00:22:12.280 --> 00:22:16.800
An example, which we'll come
back to, where you don't have
00:22:16.800 --> 00:22:20.840
convergence is the following
kind of thing.
00:22:20.840 --> 00:22:25.030
b1 is equal to 3/4,
in this case.
00:22:25.030 --> 00:22:27.470
b5 is equal to 3/4.
00:22:27.470 --> 00:22:30.110
b25 is equal to 3/4.
00:22:30.110 --> 00:22:34.540
b, 5 to the third, is equal to 3/4.
00:22:34.540 --> 00:22:35.380
And so forth.
00:22:35.380 --> 00:22:43.110
These values at which b sub n
is equal to 3/4, b sub n equal to
00:22:43.110 --> 00:22:46.530
little b plus 3/4, get
more and more rare.
00:22:46.530 --> 00:22:51.560
So in some sense, this sequence
here where b2 up to
00:22:51.560 --> 00:22:53.190
b4 is zero.
00:22:53.190 --> 00:22:57.310
b6 up to b24 is zero
and so forth.
00:22:57.310 --> 00:23:00.580
This is some kind of
convergence, also.
00:23:00.580 --> 00:23:05.340
But it's not what anyone
would call convergence.
00:23:05.340 --> 00:23:08.610
I mean, as far as numbers are
concerned, there's only one
00:23:08.610 --> 00:23:11.950
kind of convergence that people
ever talk about, and
00:23:11.950 --> 00:23:13.930
it's this kind of convergence
here.
00:23:13.930 --> 00:23:17.070
This, although these numbers
are getting close
00:23:17.070 --> 00:23:18.820
to b in some sense.
00:23:18.820 --> 00:23:21.630
That's not viewed
as convergence.
00:23:21.630 --> 00:23:26.950
So here, even though almost all
the numbers are close to
00:23:26.950 --> 00:23:30.420
b, they don't stay close
to b, in a sense.
00:23:30.420 --> 00:23:34.290
They always pop up at some place
in the future, and that
00:23:34.290 --> 00:23:37.160
destroys the whole idea
of convergence.
00:23:37.160 --> 00:23:41.920
It destroys most theorems
about convergence.
00:23:41.920 --> 00:23:45.430
That's an example where you
don't have convergence.
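That example can be checked numerically (my own sketch, assuming the limit candidate is b = 0): set b_n = 3/4 when n is a power of 5 and b_n = 0 otherwise. The fraction of nonzero terms goes to zero, yet every tail of the sequence still contains a 3/4, so the sequence does not converge to 0.

```python
def b(n):
    # b_n = 3/4 when n is a power of 5 (n = 1, 5, 25, 125, ...), else 0
    while n % 5 == 0:
        n //= 5
    return 0.75 if n == 1 else 0.0

N = 100_000
values = [b(n) for n in range(1, N + 1)]

frac_nonzero = sum(1 for v in values if v != 0) / N
tail_sup = max(values[1000:])    # largest term beyond n = 1000

print(frac_nonzero)  # tiny: only about log_5(N) terms are nonzero
print(tail_sup)      # still 0.75: the excursions never stop
```

The "pipe" from the earlier picture never squeezes down: no matter how large an M you pick, some term past M pops back up to 3/4.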
00:23:45.430 --> 00:23:49.100
OK, random variables are really
a lot more complicated
00:23:49.100 --> 00:23:50.290
than numbers.
00:23:50.290 --> 00:23:57.000
I mean, a random variable is a
function from the sample space
00:23:57.000 --> 00:23:59.010
to real numbers.
00:23:59.010 --> 00:24:00.800
All of you know that's
not really what a
00:24:00.800 --> 00:24:02.470
random variable is.
00:24:02.470 --> 00:24:06.100
All of you know that a random
variable is a number that
00:24:06.100 --> 00:24:10.110
wiggles around a little bit,
rather than being fixed at
00:24:10.110 --> 00:24:12.695
what you ordinarily think of
a number as being, right?
00:24:16.990 --> 00:24:22.350
Since that's a very imprecise
notion, and the precise notion
00:24:22.350 --> 00:24:26.150
is very complicated, to build up
your intuition about this,
00:24:26.150 --> 00:24:29.830
you have to really think hard
about what convergence of
00:24:29.830 --> 00:24:32.300
random variables means.
00:24:32.300 --> 00:24:35.230
For convergence in
distribution, it's not the
00:24:35.230 --> 00:24:38.140
random variables, but the
distribution function of the
00:24:38.140 --> 00:24:40.100
random variables
that converge.
00:24:40.100 --> 00:24:43.710
In other words, the
distribution function of z sub
00:24:43.710 --> 00:24:46.940
n, where you have a sequence of
random variables, z1, z2,
00:24:46.940 --> 00:24:50.520
z3 and so forth, the
distribution function
00:24:50.520 --> 00:24:56.890
evaluated at each real value z
converges for each z in the
00:24:56.890 --> 00:25:01.370
case where the distribution
function of this final
00:25:01.370 --> 00:25:04.740
convergent random variable
is continuous.
00:25:04.740 --> 00:25:06.600
We all studied that.
00:25:06.600 --> 00:25:08.370
We know what that means now.
00:25:08.370 --> 00:25:11.490
For convergence in probability,
we talked about
00:25:11.490 --> 00:25:14.270
convergence in probability
in two ways.
00:25:14.270 --> 00:25:16.910
One with an epsilon
and a delta.
00:25:16.910 --> 00:25:19.480
And then saying for every
epsilon and delta, once n is
00:25:19.480 --> 00:25:22.130
big enough, something happens.
00:25:22.130 --> 00:25:25.730
And then we saw that it was a
little easier to describe.
00:25:25.730 --> 00:25:28.570
It was a little easier to
describe by saying that, for
00:25:28.570 --> 00:25:30.880
convergence in probability,
00:25:30.880 --> 00:25:33.460
these distribution
functions have to
00:25:33.460 --> 00:25:34.810
converge to a unit step.
00:25:34.810 --> 00:25:35.620
And that's enough.
00:25:35.620 --> 00:25:39.230
They converge to a unit step
at every z except
00:25:39.230 --> 00:25:41.570
where the step is.
00:25:41.570 --> 00:25:42.920
We talked about that.
00:25:42.920 --> 00:25:46.130
For convergence with probability
one, and this is
00:25:46.130 --> 00:25:50.030
the thing we want to talk about
today, this is the one
00:25:50.030 --> 00:25:54.020
that sounds so easy, and
which is really tricky.
00:25:54.020 --> 00:25:55.990
I don't want to scare
you about this.
00:25:55.990 --> 00:26:00.660
If you're not scared about
it to start with, I don't
00:26:00.660 --> 00:26:01.450
want to scare you.
00:26:01.450 --> 00:26:05.670
But I would like to convince
you that if you think you
00:26:05.670 --> 00:26:09.260
understand it and you haven't
spent a lot of time thinking
00:26:09.260 --> 00:26:12.070
about it, you're probably
due for a rude
00:26:12.070 --> 00:26:14.370
awakening at some point.
00:26:14.370 --> 00:26:17.870
So for convergence with
probability one, the set of
00:26:17.870 --> 00:26:22.370
sample paths that converge
has probability one.
00:26:22.370 --> 00:26:27.930
In other words, the sequence Y1,
Y2 converges to zero with
00:26:27.930 --> 00:26:30.290
probability one.
00:26:30.290 --> 00:26:33.070
And now I'm going to talk
about converging to zero
00:26:33.070 --> 00:26:35.820
rather than converging to
some random variable.
00:26:35.820 --> 00:26:39.120
Because if you're interested
in a sequence of random
00:26:39.120 --> 00:26:43.550
variables Z1, Z2 that converge
to some other random variable
00:26:43.550 --> 00:26:47.570
Z, you can get rid of a lot of
the complication by just
00:26:47.570 --> 00:26:51.310
saying, let's define a random
variable y sub n, which is
00:26:51.310 --> 00:26:53.570
equal to Z sub n minus Z.
00:26:53.570 --> 00:26:56.430
And then what we're interested
in is do these random
00:26:56.430 --> 00:26:59.490
variables Y sub n
converge to 0.
00:26:59.490 --> 00:27:02.980
We can forget about what it's
converging to, and only worry
00:27:02.980 --> 00:27:05.560
about it converging to 0.
00:27:05.560 --> 00:27:11.270
OK, so when we do that, this
sequence of random variables
00:27:11.270 --> 00:27:15.160
converges to 0 with
probability 1.
00:27:15.160 --> 00:27:22.800
If the set of
sample points for which the
00:27:22.800 --> 00:27:25.410
sample path converges to 0--
00:27:25.410 --> 00:27:31.310
if that set of sample paths
has probability 1--
00:27:31.310 --> 00:27:36.050
namely, for almost everything
in the space, for almost
00:27:36.050 --> 00:27:39.820
everything in its peculiar
sense of probability--
00:27:39.820 --> 00:27:44.050
if that holds true, then you say
you have convergence with
00:27:44.050 --> 00:27:45.300
probability 1.
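To make the definition concrete, here is a minimal simulation sketch of my own (not from the lecture): take Y sub n to be the sample average of n iid Uniform(-1, 1) variables, whose sample paths do converge to 0 with probability 1, and estimate the probability of the set of paths whose entire tail stays near 0. The tail window and epsilon are arbitrary choices, since a finite simulation can only approximate the event "converges".

```python
import numpy as np

rng = np.random.default_rng(0)
paths, N, eps = 2000, 5000, 0.05

# Y_n = S_n / n: sample averages of iid Uniform(-1, 1) variables (mean 0).
X = rng.uniform(-1.0, 1.0, size=(paths, N))
Y = np.cumsum(X, axis=1) / np.arange(1, N + 1)

# Approximate the convergence event: the tail of the path (n > 1000)
# never leaves the interval (-eps, eps).
tail_ok = np.all(np.abs(Y[:, 1000:]) < eps, axis=1)
print("fraction of paths whose tail stays within eps:", tail_ok.mean())
```

The fraction comes out close to 1, consistent with convergence with probability 1.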
00:27:49.440 --> 00:27:51.440
Now, that looks straightforward,
00:27:51.440 --> 00:27:54.770
and I hope it is.
00:27:54.770 --> 00:27:56.590
You can memorize it
or do whatever you
00:27:56.590 --> 00:27:59.270
want to do with it.
00:27:59.270 --> 00:28:02.340
We're going to go on now and
prove an important theorem
00:28:02.340 --> 00:28:05.260
about convergence with
probability 1.
00:28:05.260 --> 00:28:09.160
I'm going to give a proof here
in class that's a little more
00:28:09.160 --> 00:28:12.610
detailed than the proof
I give in the notes.
00:28:12.610 --> 00:28:15.280
I don't like to give
proofs in class.
00:28:15.280 --> 00:28:19.275
I think it's a lousy idea
because when you're studying a
00:28:19.275 --> 00:28:22.940
proof, you have to go
at your own pace.
00:28:22.940 --> 00:28:29.420
But the problem is, I
know that students--
00:28:29.420 --> 00:28:31.200
and I was once a
student myself,
00:28:31.200 --> 00:28:33.170
and I'm still a student.
00:28:33.170 --> 00:28:38.165
If I see a proof, I will only
look at enough of it to say,
00:28:38.165 --> 00:28:40.000
ah, I get the idea of it.
00:28:40.000 --> 00:28:41.610
And then I will stop.
00:28:41.610 --> 00:28:44.840
And for this one, you need a
little more than the idea of
00:28:44.840 --> 00:28:46.410
it because it's something
we're going to
00:28:46.410 --> 00:28:48.020
build on all the time.
00:28:48.020 --> 00:28:51.300
So I want to go through
this proof carefully.
00:28:51.300 --> 00:28:56.990
And I hope that most of you
will follow most of it.
00:28:56.990 --> 00:28:59.560
And the parts of it that you
don't follow, I hope you'll go
00:28:59.560 --> 00:29:02.010
back and think about
it, because
00:29:02.010 --> 00:29:05.290
this is really important.
00:29:05.290 --> 00:29:10.630
OK, so the theorem says, let
this sequence of random
00:29:10.630 --> 00:29:17.080
variables satisfy the condition
that the sum from n equals 1
00:29:17.080 --> 00:29:23.360
to infinity of the expected
value of the magnitude of Y
00:29:23.360 --> 00:29:24.610
sub n is less than infinity.
00:29:27.740 --> 00:29:29.990
As usual there's a
misprint there.
00:29:29.990 --> 00:29:33.280
The sum, the expected
value of Yn, the
00:29:33.280 --> 00:29:34.970
bracket should be there.
00:29:34.970 --> 00:29:36.560
It's supposed to be less
than infinity.
00:29:40.820 --> 00:29:42.780
Let me write that down.
00:29:42.780 --> 00:29:49.720
The sum from n equals 1 to
infinity of expected value of
00:29:49.720 --> 00:29:56.330
the magnitude of Y sub n
is less than infinity.
00:29:56.330 --> 00:29:59.340
So it's a finite sum.
00:29:59.340 --> 00:30:03.960
So we're talking about these
Yn's when we start talking
00:30:03.960 --> 00:30:07.350
about the strong law
of large numbers.
00:30:07.350 --> 00:30:12.300
Yn is going to be something like
the sum of the X sub n's from n
00:30:12.300 --> 00:30:15.730
equals 1 to m, divided by m.
00:30:15.730 --> 00:30:17.800
In other words, it's going to
be the sample average, or
00:30:17.800 --> 00:30:19.090
something like that.
00:30:19.090 --> 00:30:22.650
And these sample averages, if
you have a mean 0, are going
00:30:22.650 --> 00:30:23.850
to get small.
00:30:23.850 --> 00:30:26.620
The question is, when you sum
all of these things that are
00:30:26.620 --> 00:30:31.120
getting small, do you still get
something which is small?
00:30:31.120 --> 00:30:34.490
When you're dealing with the
weak law of large numbers,
00:30:34.490 --> 00:30:37.640
it's not necessary that
that sum gets small.
00:30:37.640 --> 00:30:40.910
It's only necessary that each
of the terms get small.
00:30:40.910 --> 00:30:43.660
Here we're saying, let's
assume also that
00:30:43.660 --> 00:30:46.560
this sum gets small.
00:30:46.560 --> 00:30:50.930
OK, so we want to prove that
under this condition, all of
00:30:50.930 --> 00:30:57.430
these sequences converge to 0
with probability 1--
00:30:57.430 --> 00:31:00.690
the individual sample
sequences converge.
00:31:00.690 --> 00:31:04.520
OK, so let's go through
the proof now.
00:31:04.520 --> 00:31:08.560
And as I say, I won't do
this to you very often.
00:31:08.560 --> 00:31:14.110
But I think for this one,
it's sort of necessary.
00:31:14.110 --> 00:31:18.000
OK, so first we'll use the
Markov inequality.
00:31:18.000 --> 00:31:22.310
And I'm dealing with a finite
value of m here.
00:31:22.310 --> 00:31:26.900
The probability that the sum of
a finite set of these Y sub
00:31:26.900 --> 00:31:30.960
n's is greater than alpha is
less than or equal to the
00:31:30.960 --> 00:31:33.890
expected value of that
random variable.
00:31:33.890 --> 00:31:36.730
Namely, this random
variable here.
00:31:36.730 --> 00:31:40.930
Sum from n equals 1 to m of
magnitude of Y sub n.
00:31:40.930 --> 00:31:43.560
That's just a random variable.
00:31:43.560 --> 00:31:46.640
And the probability that that
random variable is greater
00:31:46.640 --> 00:31:49.610
than alpha is less than or equal
to the expected value of
00:31:49.610 --> 00:31:52.980
that random variable
divided by alpha.
00:31:52.980 --> 00:32:02.850
OK, well now, this quantity here
is increasing in m.
00:32:02.850 --> 00:32:06.510
The magnitude of Y sub n is
a non-negative quantity.
00:32:06.510 --> 00:32:10.860
You take the expectation of a
non-negative quantity, if it
00:32:10.860 --> 00:32:15.470
has an expectation, which we're
assuming here for this
00:32:15.470 --> 00:32:18.680
to be less than infinity, all of
these things have to have
00:32:18.680 --> 00:32:20.800
expectations.
00:32:20.800 --> 00:32:26.990
So as we increase m, this
gets bigger and bigger.
00:32:26.990 --> 00:32:31.720
So this quantity here is going
to be less than or equal to
00:32:31.720 --> 00:32:35.370
the sum from n equals 1 to
infinity of expected value of
00:32:35.370 --> 00:32:37.470
the magnitude of Y sub n,
divided by alpha.
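As a quick numerical sanity check of this step, with a made-up sequence of my own rather than one from the course: taking Y sub n = X sub n / 2^n with X sub n uniform on (0, 1), the sum of the expected magnitudes is 1/2, and the Markov bound holds for every finite partial sum.

```python
import numpy as np

rng = np.random.default_rng(1)
trials, m, alpha = 100_000, 30, 0.9

# Y_n = X_n / 2^n with X_n ~ Uniform(0, 1), so E[|Y_n|] = (1/2) / 2^n
# and the infinite sum of the expectations is 1/2 < infinity.
X = rng.uniform(0.0, 1.0, size=(trials, m))
Y = X / 2.0 ** np.arange(1, m + 1)
partial = Y.sum(axis=1)                # sum of |Y_n| for n = 1, ..., m

empirical = (partial >= alpha).mean()  # P(sum_{n<=m} |Y_n| >= alpha)
markov_bound = 0.5 / alpha             # (sum over all n of E[|Y_n|]) / alpha
print(empirical, "<=", markov_bound)
```

The empirical probability lands well under the bound, as Markov's inequality promises for every finite m.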
00:32:37.470 --> 00:32:41.830
What I'm being careful about
here is all of the things that
00:32:41.830 --> 00:32:47.170
happen when you go from finite
m to infinite m.
00:32:47.170 --> 00:32:52.260
And I'm using what you know
about finite m, and then being
00:32:52.260 --> 00:32:56.080
very careful about going
to infinite m.
00:32:56.080 --> 00:32:59.340
And I'm going to try to explain
why as we do it.
00:32:59.340 --> 00:33:01.570
But here, it's straightforward.
00:33:01.570 --> 00:33:07.090
The expected value of a finite
sum is equal to the finite sum
00:33:07.090 --> 00:33:08.580
of an expected value.
00:33:08.580 --> 00:33:11.560
When you go to the limit, m goes
to infinity, you don't
00:33:11.560 --> 00:33:15.230
know whether these expected
values exist or not.
00:33:15.230 --> 00:33:18.990
You're sort of confused on both
sides of this equation.
00:33:18.990 --> 00:33:21.550
So we're sticking to
finite values here.
00:33:21.550 --> 00:33:25.710
Then, we're taking this
quantity, going to the limit
00:33:25.710 --> 00:33:27.450
as m goes to infinity.
00:33:27.450 --> 00:33:30.570
This quantity has to get bigger
and bigger as m goes to
00:33:30.570 --> 00:33:33.870
infinity, so this quantity
has to be less
00:33:33.870 --> 00:33:36.070
than or equal to this.
00:33:36.070 --> 00:33:40.980
This now, for a given alpha,
is just a number.
00:33:40.980 --> 00:33:43.820
It's nothing more than a number,
so we can deal with
00:33:43.820 --> 00:33:47.420
this pretty easily as we
make alpha big enough.
00:33:47.420 --> 00:33:50.050
But for most of the argument,
we're going to view alpha as
00:33:50.050 --> 00:33:51.700
being fixed.
00:33:51.700 --> 00:33:58.590
OK, so now the probability that
this sum, this finite sum, is
00:33:58.590 --> 00:34:01.920
greater than alpha, is less
than or equal to this.
00:34:01.920 --> 00:34:03.620
This was the thing we just
finished proving
00:34:03.620 --> 00:34:04.870
on the other page.
00:34:09.110 --> 00:34:11.600
This is less than or
equal to that.
00:34:11.600 --> 00:34:17.969
That's what I repeated, so I'm
not cheating you at all here.
00:34:17.969 --> 00:34:21.400
Now, it's a pain to write
that down all the time.
00:34:21.400 --> 00:34:27.580
So let's let the set, A sub m,
be the set of sample points
00:34:27.580 --> 00:34:32.380
such that this finite sum
of Y sub n of omega is
00:34:32.380 --> 00:34:34.170
greater than alpha.
00:34:34.170 --> 00:34:35.420
This is a random--
00:34:39.260 --> 00:34:43.150
for each value of omega,
this is just a number.
00:34:43.150 --> 00:34:49.710
The sum of the magnitude of Y
sub n is a random variable.
00:34:49.710 --> 00:34:53.270
It takes on a numerical
value for every omega
00:34:53.270 --> 00:34:54.760
in the sample space.
00:34:54.760 --> 00:34:59.140
So A sub m is the set of points
in the sample space for
00:34:59.140 --> 00:35:02.600
which this quantity here
is bigger than alpha.
00:35:02.600 --> 00:35:08.010
So we can rewrite this
now as just the
00:35:08.010 --> 00:35:10.440
probability of A sub m.
00:35:10.440 --> 00:35:14.680
So this is equivalent to saying,
the probability of A
00:35:14.680 --> 00:35:18.120
sub m is less than or equal
to this number here.
00:35:18.120 --> 00:35:20.700
For a fixed alpha,
this is a number.
00:35:20.700 --> 00:35:24.500
This is something which
can vary with m.
00:35:24.500 --> 00:35:29.560
Since these numbers here now,
now we're dealing with a
00:35:29.560 --> 00:35:32.710
sample space, which is
a little strange.
00:35:32.710 --> 00:35:39.570
We're talking about sample
points and we're saying, this
00:35:39.570 --> 00:35:46.320
number here, this magnitude of Y
sub n at a particular sample
00:35:46.320 --> 00:35:50.040
point omega is greater
than or equal to 0.
00:35:50.040 --> 00:35:55.050
Therefore, A sub m is a subset
of A sub m plus 1.
00:35:55.050 --> 00:36:03.650
In other words, as m gets larger
and larger here, the number of
00:36:03.650 --> 00:36:05.450
terms gets larger and larger.
00:36:05.450 --> 00:36:08.950
Therefore, this sum here
gets larger and larger.
00:36:08.950 --> 00:36:14.240
Therefore, the set of omega for
which this increasing sum
00:36:14.240 --> 00:36:17.370
is greater than alpha gets
bigger and bigger.
00:36:17.370 --> 00:36:20.430
And that's the thing that we're
saying here, A sub m is
00:36:20.430 --> 00:36:24.990
included in A sub m plus 1 for
m greater than or equal to 1.
00:36:24.990 --> 00:36:28.920
OK, so the left side of this
quantity here, as a function
00:36:28.920 --> 00:36:32.370
of m, is a non-decreasing
bounded
00:36:32.370 --> 00:36:34.305
sequence of real numbers.
00:36:41.420 --> 00:36:43.090
Yes, the probability
of something
00:36:43.090 --> 00:36:44.120
is just a real number.
00:36:44.120 --> 00:36:46.970
A probability is a number.
00:36:46.970 --> 00:36:50.230
So this quantity here
is a real number.
00:36:50.230 --> 00:36:53.810
It's a real number which is
non-decreasing, so it keeps
00:36:53.810 --> 00:36:54.920
moving upward.
00:36:54.920 --> 00:36:59.330
What I'm trying to do now
is this: I went to
00:36:59.330 --> 00:37:00.300
the limit over here.
00:37:00.300 --> 00:37:03.110
I want to go to the
limit here.
00:37:03.110 --> 00:37:08.680
And so I have a sequence
of numbers in m.
00:37:08.680 --> 00:37:12.500
This sequence of numbers
is non-decreasing.
00:37:12.500 --> 00:37:14.250
So it's moving up.
00:37:14.250 --> 00:37:17.110
Every one of those quantities
is bounded by
00:37:17.110 --> 00:37:18.840
this quantity here.
00:37:18.840 --> 00:37:23.350
So I have an increasing sequence
of real numbers,
00:37:23.350 --> 00:37:25.960
which is bounded on the top.
00:37:25.960 --> 00:37:28.280
What happens?
00:37:28.280 --> 00:37:30.810
When you have a sequence
of real
00:37:30.810 --> 00:37:32.780
numbers which is bounded--
00:37:36.200 --> 00:37:38.810
I have a slide to prove this,
but I'm not going to prove it
00:37:38.810 --> 00:37:40.953
because it's tedious.
00:37:46.470 --> 00:37:53.210
Here we have this probability,
which I'm calling the
00:37:53.210 --> 00:37:59.740
probability of A sub m.
00:37:59.740 --> 00:38:07.730
Here I have the probability of
A sub m plus 1, and so forth.
00:38:07.730 --> 00:38:09.980
Here I have this
limit up here.
00:38:09.980 --> 00:38:13.850
All of this sequence of numbers,
there's an infinite
00:38:13.850 --> 00:38:15.770
sequence of them.
00:38:15.770 --> 00:38:17.650
They're all non-decreasing.
00:38:17.650 --> 00:38:20.780
They're all bounded by
this number here.
00:38:20.780 --> 00:38:23.300
And what happens?
00:38:23.300 --> 00:38:27.600
Well, either we go up to there
as a limit or else we stop
00:38:27.600 --> 00:38:30.890
sometime earlier as a limit.
00:38:30.890 --> 00:38:36.600
I should prove this, but it's
something we use all the time.
00:38:36.600 --> 00:38:43.190
It's a sequence of increasing
or non-decreasing numbers.
00:38:43.190 --> 00:38:46.420
If it's bounded by something, it
has to have a finite limit.
00:38:46.420 --> 00:38:49.390
The limit is less than or
equal to this quantity.
00:38:49.390 --> 00:38:53.690
It might be strictly less, but
the limit has to exist.
00:38:53.690 --> 00:38:55.620
And the limit has to be less
than or equal to b.
00:38:59.900 --> 00:39:02.240
OK, that's what we're
saying here.
00:39:02.240 --> 00:39:10.510
When we go to this limit, this
limit of the probability of A
00:39:10.510 --> 00:39:14.380
sub m is less than or equal
to this number here.
00:39:14.380 --> 00:39:19.310
OK, if I use this property of
nested sets, when you
00:39:19.310 --> 00:39:26.860
have A sub 1 nested inside of
A sub 2, nested inside of A
00:39:26.860 --> 00:39:33.380
sub 3, what we'd like to go
do is go to this limit.
00:39:33.380 --> 00:39:35.210
The limit, unfortunately,
doesn't make
00:39:35.210 --> 00:39:36.460
any sense in general.
00:39:39.130 --> 00:39:43.240
With this property of the
axioms, it's equation number 9
00:39:43.240 --> 00:39:47.490
in chapter 1 says that
we can do something
00:39:47.490 --> 00:39:51.190
that's almost as good.
00:39:51.190 --> 00:40:00.630
What it says is that as we go to
this limit here, what we get
00:40:00.630 --> 00:40:06.830
is that this limit is
the probability of
00:40:06.830 --> 00:40:08.960
this infinite union.
00:40:08.960 --> 00:40:13.410
That's equal to the limit
as m goes to infinity of
00:40:13.410 --> 00:40:16.655
probability of A sub m.
00:40:16.655 --> 00:40:20.810
OK, look up equation 9,
and you'll see that's
00:40:20.810 --> 00:40:22.120
exactly what it says.
00:40:22.120 --> 00:40:25.870
If you think this is
obvious, it's not.
00:40:25.870 --> 00:40:29.350
It ain't obvious at all because
it's not even clear
00:40:29.350 --> 00:40:32.380
that this--
00:40:32.380 --> 00:40:35.090
well, nothing very much about
this union is clear.
00:40:35.090 --> 00:40:38.570
We know that this union must
be a measurable set.
00:40:38.570 --> 00:40:40.280
It must have a probability.
00:40:40.280 --> 00:40:42.090
We don't know much
more about it.
00:40:42.090 --> 00:40:45.340
But anyway, that property tells
us that this is true.
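That nesting property can be sanity-checked on a simple example of my own, not one from the lecture: for a geometric random variable X, the events A sub m = {X <= m} are nested and increasing, and P(A sub m) climbs monotonically up to the probability of the union, P(X < infinity) = 1.

```python
# Continuity of probability for nested increasing sets, illustrated with a
# geometric random variable X, P(X = k) = p * (1 - p)^(k - 1) for k >= 1.
# With A_m = {X <= m}, A_m is contained in A_{m+1}, and
# P(A_m) = 1 - (1 - p)^m increases up to P(union of the A_m) = 1.
p = 0.3
probs = [1 - (1 - p) ** m for m in range(1, 101)]

print(probs[0], probs[-1])  # non-decreasing, bounded by 1, limit 1
```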
00:41:03.190 --> 00:41:06.040
OK, so here's where we
are at this point.
00:41:06.040 --> 00:41:09.680
I don't think I've skipped
something, have I?
00:41:09.680 --> 00:41:13.430
Oh, no, that's the thing I
didn't want to talk about.
00:41:13.430 --> 00:41:19.720
OK, so A sub m is the set
of omega which satisfy
00:41:19.720 --> 00:41:21.970
this for finite m.
00:41:21.970 --> 00:41:28.230
This union
is then the union of all of
00:41:28.230 --> 00:41:30.470
these sets A sub m over all m.
00:41:30.470 --> 00:41:35.050
And this is less than or equal
to this bound that we had.
00:41:35.050 --> 00:41:44.890
OK, so I even hate giving proofs
of this sort because
00:41:44.890 --> 00:41:46.960
it's a set of simple ideas.
00:41:46.960 --> 00:41:49.980
To track down every one
of them is difficult.
00:41:49.980 --> 00:41:53.320
The text doesn't track down
every one of them.
00:41:53.320 --> 00:41:55.750
And that's what I'm
trying to do here.
00:42:06.580 --> 00:42:11.870
We have two possibilities here,
and we're looking at
00:42:11.870 --> 00:42:13.330
this limit here.
00:42:13.330 --> 00:42:20.390
This limiting sum, which for
each omega is just a sequence,
00:42:20.390 --> 00:42:23.500
a non-decreasing sequence
of real numbers.
00:42:23.500 --> 00:42:27.940
So one possibility is that this
sequence of real numbers
00:42:27.940 --> 00:42:30.000
is bigger than alpha.
00:42:30.000 --> 00:42:32.390
The other possibility
is that it's less
00:42:32.390 --> 00:42:34.050
than or equal to alpha.
00:42:34.050 --> 00:42:37.660
If it's less than or equal to
alpha, then every one of these
00:42:37.660 --> 00:42:43.920
numbers is less than or equal
to alpha, and omega cannot be
00:42:43.920 --> 00:42:47.230
in this union here.
00:42:47.230 --> 00:42:50.800
If the sum is bigger than
alpha, then one of the
00:42:50.800 --> 00:42:53.500
elements in this set is
bigger than alpha and
00:42:53.500 --> 00:42:55.550
omega is in this set.
00:42:55.550 --> 00:43:00.300
So what all of that says, and
you're just going to have to
00:43:00.300 --> 00:43:03.301
look at that because
it's not--
00:43:03.301 --> 00:43:05.760
it's one of these tedious
arguments.
00:43:05.760 --> 00:43:09.690
So the probability of omega such
that this sum is greater
00:43:09.690 --> 00:43:13.030
than alpha is less than or equal
to this number here.
00:43:13.030 --> 00:43:16.770
At this point, we have
made a major change
00:43:16.770 --> 00:43:19.000
in what we're doing.
00:43:19.000 --> 00:43:24.750
Before we were talking about
numbers like probabilities,
00:43:24.750 --> 00:43:27.400
numbers like expected values.
00:43:27.400 --> 00:43:33.020
Here, suddenly, we are talking
about sample points.
00:43:33.020 --> 00:43:35.870
We're talking about the
probability of a set of sample
00:43:35.870 --> 00:43:40.340
points, such that the sum
is greater than alpha.
00:43:40.340 --> 00:43:41.358
Yes?
00:43:41.358 --> 00:43:43.207
AUDIENCE: I understand how, if
the whole sum is less than or
00:43:43.207 --> 00:43:45.840
equal to alpha, then
every element is.
00:43:45.840 --> 00:43:49.160
But did you say that if it's
greater than alpha, then at
00:43:49.160 --> 00:43:50.360
least one element is
greater than alpha?
00:43:50.360 --> 00:43:51.610
Why is that?
00:43:57.660 --> 00:44:00.430
PROFESSOR: Well, because either
the sum is less than or
00:44:00.430 --> 00:44:03.670
equal to alpha or it's
greater than alpha.
00:44:03.670 --> 00:44:09.230
And if it's less than or equal
to alpha, then omega
00:44:09.230 --> 00:44:11.570
is not in this set.
00:44:11.570 --> 00:44:17.860
So the alternative is that omega
has to be in this set.
00:44:17.860 --> 00:44:20.540
Except the other way of looking
at it is if you have a
00:44:20.540 --> 00:44:25.390
sequence of numbers, which is
approaching a limit, and that
00:44:25.390 --> 00:44:31.870
limit is bigger than alpha, then
one of the terms has to
00:44:31.870 --> 00:44:33.990
be bigger than alpha.
00:44:33.990 --> 00:44:34.924
Yes?
00:44:34.924 --> 00:44:36.406
AUDIENCE: I think the confusion
is between the
00:44:36.406 --> 00:44:38.876
partial sums and the
terms of the sum.
00:44:38.876 --> 00:44:40.852
That's what he's confusing.
00:44:40.852 --> 00:44:43.322
Does that make sense?
00:44:43.322 --> 00:44:45.298
He's saying instead of
each partial sum, not
00:44:45.298 --> 00:44:46.548
each term in the sum.
00:44:50.750 --> 00:44:52.000
PROFESSOR: Yes.
00:44:58.540 --> 00:45:00.510
Except I don't see how that
answers the question.
00:45:11.000 --> 00:45:16.630
Except the point here is, if
each partial sum is less than
00:45:16.630 --> 00:45:20.260
or equal to alpha, then the
limit has to be less than or
00:45:20.260 --> 00:45:20.960
equal to alpha.
00:45:20.960 --> 00:45:22.450
That's what I was saying
on the other page.
00:45:22.450 --> 00:45:25.130
If you have a sequence of
numbers, which has an upper
00:45:25.130 --> 00:45:28.810
bound on them, then you
have to have a limit.
00:45:28.810 --> 00:45:31.900
And that limit has to be less
than or equal to alpha.
00:45:31.900 --> 00:45:34.090
So that's this case here.
00:45:34.090 --> 00:45:38.660
We have a sum of numbers as
we're going to the limit as m
00:45:38.660 --> 00:45:42.490
gets larger and larger,
these partial sums
00:45:42.490 --> 00:45:45.210
have to go to a limit.
00:45:45.210 --> 00:45:48.070
The partial sums are all less
than or equal to alpha.
00:45:48.070 --> 00:45:51.130
Then the infinite sum is less
than or equal to alpha, and
00:45:51.130 --> 00:45:53.810
omega is not in this set here.
00:45:53.810 --> 00:45:55.540
And otherwise, it is.
00:45:55.540 --> 00:45:59.220
OK, if I talk more about it,
I'll get more confused.
00:45:59.220 --> 00:46:01.985
So I think the slides
are clear.
00:46:09.020 --> 00:46:17.710
Now, if we look at the case
where alpha is greater than or
00:46:17.710 --> 00:46:26.180
equal to this sum, and we take
the complement of the set, the
00:46:26.180 --> 00:46:30.380
probability of the set of omega
for which this sum is
00:46:30.380 --> 00:46:32.900
less than or equal
to alpha has--
00:46:32.900 --> 00:46:36.830
oh, let's forget about
this for the moment.
00:46:36.830 --> 00:46:40.440
If I take the complement of this
set, the probability of
00:46:40.440 --> 00:46:43.560
the set of omega, such that the
sum is less than or equal
00:46:43.560 --> 00:46:47.190
to alpha, is greater
than 1 minus this
00:46:47.190 --> 00:46:48.820
expected value here.
00:46:48.820 --> 00:46:51.480
Now I'm saying, let's look at
the case where alpha is big
00:46:51.480 --> 00:46:55.280
enough that it's greater
than this number here.
00:46:55.280 --> 00:47:00.820
So this probability is greater
than 1 minus this number.
00:47:00.820 --> 00:47:06.720
So if the sum is less than or
equal to alpha for any given
00:47:06.720 --> 00:47:12.220
omega, then this quantity
here converges.
00:47:12.220 --> 00:47:14.620
Now I'm talking about
sample sequences.
00:47:14.620 --> 00:47:18.630
I'm saying I have an increasing
sequence of numbers
00:47:18.630 --> 00:47:21.540
corresponding to one particular
sample point.
00:47:21.540 --> 00:47:25.500
This increasing set of numbers
is less than or equal.
00:47:25.500 --> 00:47:28.760
Each element of it is less than
or equal to alpha, so the
00:47:28.760 --> 00:47:31.820
limit of it is less than
or equal to alpha.
00:47:31.820 --> 00:47:37.050
And what that says is the limit
of Y sub n of omega,
00:47:37.050 --> 00:47:39.760
this has to be equal to 0
for that sample point.
00:47:39.760 --> 00:47:41.500
This is all the sample
point argument.
00:47:46.280 --> 00:47:53.500
And what that says then is the
probability of omega, such
00:47:53.500 --> 00:47:57.580
that this limit here is equal
to 0, that's this quantity
00:47:57.580 --> 00:48:01.310
here, which is the same as this
quantity, which has to be
00:48:01.310 --> 00:48:02.935
greater than this quantity.
00:48:08.570 --> 00:48:11.100
This implies this.
00:48:11.100 --> 00:48:16.260
Therefore, the probability of
this has to be bigger than
00:48:16.260 --> 00:48:17.950
this probability here.
00:48:17.950 --> 00:48:22.480
Now, if we let alpha go to
infinity, what that says is
00:48:22.480 --> 00:48:26.310
this quantity goes to 0 and the
probability of the set of
00:48:26.310 --> 00:48:30.730
omega, such that this limit is
equal to 0, is equal to 1.
00:48:35.010 --> 00:48:39.640
I think if I try to spend 20
more minutes talking about
00:48:39.640 --> 00:48:42.860
that in more detail, it
won't get any clearer.
00:48:42.860 --> 00:48:47.020
It is one of these very tedious
arguments where you
00:48:47.020 --> 00:48:51.380
have to sit down and follow
it step by step.
00:48:51.380 --> 00:48:54.500
I wrote the steps out
very carefully.
00:48:54.500 --> 00:49:02.180
And at this point, I have
to leave it as it is.
00:49:02.180 --> 00:49:05.540
But the theorem has been proven,
at least in what's
00:49:05.540 --> 00:49:08.800
written, if not in
what I've said.
00:49:08.800 --> 00:49:13.050
OK, let's look at an example
of this now.
00:49:13.050 --> 00:49:17.110
Let's look at the example where
these random variables Y
00:49:17.110 --> 00:49:21.930
sub n for n greater than or
equal to 1, have this
00:49:21.930 --> 00:49:23.170
following property.
00:49:23.170 --> 00:49:26.190
It's almost the same as the
sequence of numbers I talked
00:49:26.190 --> 00:49:27.830
about before.
00:49:27.830 --> 00:49:31.990
But what I'm going
to do now is--
00:49:31.990 --> 00:49:34.630
these are not IID random
variables.
00:49:34.630 --> 00:49:39.130
If they're IID random variables,
you're never going
00:49:39.130 --> 00:49:41.310
to talk about the sum
being finite.
00:49:41.310 --> 00:49:44.260
Sum of the expected values
being finite.
00:49:44.260 --> 00:49:52.620
How they behave is that for n
from 1 up to 5,
00:49:52.620 --> 00:49:55.520
you pick one of these random
variables in here and make it
00:49:55.520 --> 00:49:56.690
equal to 1.
00:49:56.690 --> 00:49:59.040
And all the rest
are equal to 0.
00:49:59.040 --> 00:50:03.560
From 5 to 25, you pick one of
the random variables, make it
00:50:03.560 --> 00:50:06.400
equal to 1, and all the
others are equal to 0.
00:50:06.400 --> 00:50:08.130
You choose randomly in here.
00:50:08.130 --> 00:50:12.380
From 25 to 125, you pick
one random variable,
00:50:12.380 --> 00:50:14.020
set it equal to 1.
00:50:14.020 --> 00:50:16.910
All the other random variables,
set them equal to 0,
00:50:16.910 --> 00:50:19.500
and so forth forever after.
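This construction is easy to simulate. A sketch under my own choice of parameters: in each block from 5^j up to 5^(j+1), exactly one uniformly chosen index gets the value 1, so every sample path keeps popping up to 1 (ruling out convergence with probability 1), while P(Y sub n = 1) shrinks with each block (giving convergence in probability).

```python
import numpy as np

rng = np.random.default_rng(2)
J, paths = 6, 1000              # blocks [5^j, 5^(j+1)) for j = 0, ..., J-1
N = 5 ** J

Y = np.zeros((paths, N + 1), dtype=int)  # column n holds Y_n; column 0 unused
for j in range(J):
    lo, hi = 5 ** j, 5 ** (j + 1)
    # in each block, exactly one uniformly chosen index n gets Y_n = 1
    n_one = rng.integers(lo, hi, size=paths)
    Y[np.arange(paths), n_one] = 1

# Every sample path pops up to 1 once per block: no pathwise convergence.
ones_per_path = Y.sum(axis=1)
print("ones per path:", ones_per_path.min(), "to", ones_per_path.max())

# But P(Y_n = 1) = 1 / (5^(j+1) - 5^j) -> 0: convergence in probability.
p_last_block = Y[:, 5 ** (J - 1): 5 ** J].mean()
print("empirical P(Y_n = 1) in the last block:", p_last_block)
```

Every simulated path carries exactly one 1 per block forever, yet the chance of seeing a 1 at any particular late n is tiny.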
00:50:19.500 --> 00:50:23.600
OK, so what does that say
for the sample points?
00:50:23.600 --> 00:50:29.080
If I look at any particular
sample point, what I find is
00:50:29.080 --> 00:50:36.610
that there's one occurrence of a
sample value equal to 1 from
00:50:36.610 --> 00:50:38.340
here to here.
00:50:38.340 --> 00:50:42.370
There's exactly one that's equal
to 1 from here to here.
00:50:42.370 --> 00:50:45.970
There's exactly one that's equal
to 1 from here to way
00:50:45.970 --> 00:50:49.890
out here at 125, and so forth.
00:50:49.890 --> 00:50:55.480
This is not a sequence of sample
values which converges
00:50:55.480 --> 00:51:00.790
because it keeps popping up
to 1 at all these values.
00:51:00.790 --> 00:51:05.300
So for every omega, Yn
of omega is 1 for
00:51:05.300 --> 00:51:07.530
some n in this interval,
00:51:07.530 --> 00:51:10.110
for every j, and it's
0 elsewhere.
00:51:10.110 --> 00:51:15.830
This Yn of omega doesn't
converge for any omega.
00:51:15.830 --> 00:51:19.310
So the probability that that
sequence converges
00:51:19.310 --> 00:51:22.390
is not 1, it's 0.
00:51:22.390 --> 00:51:26.080
So this is a particularly
awful example.
00:51:26.080 --> 00:51:29.490
This is a sequence of random
variables, which does not
00:51:29.490 --> 00:51:31.970
converge with probability 1.
00:51:31.970 --> 00:51:38.440
At the same time, the expected
value of Y sub n is 1 over 5
00:51:38.440 --> 00:51:42.320
to the j plus 1 minus
5 to the j.
00:51:42.320 --> 00:51:46.680
That's the probability that you
pick that particular n for
00:51:46.680 --> 00:51:50.310
a random variable to
be equal to 1.
00:51:50.310 --> 00:51:54.130
It's equal to this for 5 to the
j less than or equal to n,
00:51:54.130 --> 00:51:57.350
less than 5 to the j plus 1.
00:51:57.350 --> 00:52:01.000
When you add up all of these
things, when you add up
00:52:01.000 --> 00:52:04.330
expected value of Yn equal
to that over this
00:52:04.330 --> 00:52:06.120
interval, you get 1.
00:52:06.120 --> 00:52:09.460
When you add it up over the next
interval, which is much,
00:52:09.460 --> 00:52:11.400
much bigger, you get 1 again.
00:52:11.400 --> 00:52:12.950
When you add it up
over the next
00:52:12.950 --> 00:52:14.830
interval, you get 1 again.
00:52:14.830 --> 00:52:19.925
So the expected value
of the sum--
00:52:19.925 --> 00:52:22.850
the sum of the expected
value of the Y sub
00:52:22.850 --> 00:52:26.520
n's is equal to infinity.
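The block-by-block bookkeeping here can be checked exactly; this is just a small verification sketch of the arithmetic in the lecture. Each block of length 5^(j+1) - 5^j contributes exactly 1 to the sum of expectations, so the infinite sum diverges and the hypothesis of the theorem fails.

```python
from fractions import Fraction

# E[Y_n] = 1 / (5^(j+1) - 5^j) for each n in block j, i.e. 5^j <= n < 5^(j+1).
block_sums = []
for j in range(4):
    block_len = 5 ** (j + 1) - 5 ** j
    e_yn = Fraction(1, block_len)        # E[Y_n] for every n in the block
    block_sums.append(block_len * e_yn)  # sum of E[Y_n] over the block

print(block_sums)  # each block contributes exactly 1, so the total diverges
```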
00:52:26.520 --> 00:52:34.850
And what you wind up with
then is that this
00:52:34.850 --> 00:52:37.195
sequence does not converge--
00:52:49.690 --> 00:52:52.430
This says the theorem doesn't
apply at all.
00:52:52.430 --> 00:52:57.510
This says that the Y sub n of
omega does not converge for
00:52:57.510 --> 00:52:59.330
any sample function at all.
00:53:03.050 --> 00:53:06.910
This says that according to the
theorem, it doesn't have
00:53:06.910 --> 00:53:09.200
to converge.
00:53:09.200 --> 00:53:11.810
I mean, when you look at an
example after working very
00:53:11.810 --> 00:53:17.110
hard to prove a theorem, you
would like to find that if the
00:53:17.110 --> 00:53:25.990
conditions of the theorem are
satisfied, what the theorem
00:53:25.990 --> 00:53:28.120
says is satisfied also.
00:53:28.120 --> 00:53:31.700
Here, the conditions
are not satisfied.
00:53:31.700 --> 00:53:33.810
And you also don't
have convergence
00:53:33.810 --> 00:53:35.370
with probability 1.
00:53:35.370 --> 00:53:39.280
You do have convergence in
probability, however.
00:53:39.280 --> 00:53:42.520
So this gives you a nice example
of where you have a
00:53:42.520 --> 00:53:47.320
sequence of random variables
that converges in probability.
00:53:47.320 --> 00:53:51.710
It converges in probability
because as n gets larger and
00:53:51.710 --> 00:53:56.730
larger, the probability that Y
sub n is going to be equal to
00:53:56.730 --> 00:54:00.900
anything other than 0 gets
very, very small.
00:54:00.900 --> 00:54:05.450
So the limit as n goes to
infinity of the probability
00:54:05.450 --> 00:54:07.920
that Y sub n is greater
than epsilon--
00:54:07.920 --> 00:54:11.750
for any epsilon greater than 0,
this probability is equal
00:54:11.750 --> 00:54:13.760
to 0 for all epsilon.
00:54:13.760 --> 00:54:17.670
So this quantity does converge
in probability.
00:54:17.670 --> 00:54:20.480
It does not converge
with probability 1.
00:54:20.480 --> 00:54:24.420
It's the simplest example I know
of where you don't have
00:54:24.420 --> 00:54:30.390
convergence with probability 1
and you do have convergence in
00:54:30.390 --> 00:54:32.290
probability.
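A small Python sketch (my own illustration, not from the lecture; the function name sample_path and the block count are assumptions) of the counterexample just described: in each block 5^j <= n < 5^(j+1), exactly one index n, chosen uniformly, gets Y_n = 1, and every other Y_n in the block is 0.

```python
import random

def sample_path(num_blocks, seed=None):
    # For each block [5^j, 5^(j+1)), pick the single index n with Y_n = 1.
    rng = random.Random(seed)
    ones = []
    for j in range(num_blocks):
        lo, hi = 5**j, 5**(j + 1)
        ones.append(rng.randrange(lo, hi))
    return ones

path = sample_path(num_blocks=6, seed=0)

# Every sample path has Y_n = 1 once in every block, so Y_n(omega) = 1
# infinitely often: the sequence converges for NO sample point.
assert len(path) == 6

# But P(Y_n = 1) = 1 / (5^(j+1) - 5^j) for n in block j, which shrinks to 0:
# that is convergence in probability to 0.
p_block = [1 / (5**(j + 1) - 5**j) for j in range(6)]
assert all(p_block[j + 1] < p_block[j] for j in range(5))

# And the expected values still sum to 1 over each block, so the sum of
# E[Y_n] over all n is infinite, as stated in the lecture.
block_sum = (5**3 - 5**2) * (1 / (5**3 - 5**2))
assert abs(block_sum - 1) < 1e-12
```

The point of the sketch is that the per-path behavior (a 1 appears in every block, forever) and the per-n probability (going to 0) can disagree, which is exactly the gap between the two modes of convergence.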
00:54:32.290 --> 00:54:43.440
How about if you're looking at a
sequence of sample averages?
00:54:43.440 --> 00:54:47.720
Suppose you're looking at S sub
n over n where S sub n is
00:54:47.720 --> 00:54:51.680
a sum of IID random variables.
00:54:51.680 --> 00:54:57.460
Can you find an example there
where when you have a--
00:55:01.290 --> 00:55:05.850
can you find an example where
this sequence S sub n over n
00:55:05.850 --> 00:55:10.960
does converge in probability,
but does not converge with
00:55:10.960 --> 00:55:12.470
probability 1?
00:55:12.470 --> 00:55:16.660
Unfortunately, that's
very hard to do.
00:55:16.660 --> 00:55:20.900
And the reason is the main
theorem, which we will never
00:55:20.900 --> 00:55:30.210
get around to proving here is
that if you have a random
00:55:30.210 --> 00:55:36.510
variable x, and the expected
value of the magnitude of x is
00:55:36.510 --> 00:55:40.780
finite, then the strong law
of large numbers holds.
00:55:40.780 --> 00:55:47.530
Also, the weak law of large
numbers holds, which says that
00:55:47.530 --> 00:55:50.600
you're not going to find an
example where one holds and
00:55:50.600 --> 00:55:53.490
the other doesn't hold.
00:55:53.490 --> 00:55:57.910
So you have to go to strange
things like this in order to
00:55:57.910 --> 00:56:00.110
get these examples that
you're looking at.
00:56:04.220 --> 00:56:09.510
OK, let's now go from
convergence with probability 1
00:56:09.510 --> 00:56:14.240
to applying this to the sequence
of random variables
00:56:14.240 --> 00:56:23.210
where Y sub n is now equal to
the sum of n IID random
00:56:23.210 --> 00:56:24.520
variables divided by n.
00:56:24.520 --> 00:56:29.130
Namely, it's the sample average,
and we're looking at
00:56:29.130 --> 00:56:32.110
the limit as n goes to infinity
00:56:32.110 --> 00:56:33.320
of this sample average.
00:56:33.320 --> 00:56:40.500
What's the probability of the
set of sample points for which
00:56:40.500 --> 00:56:47.580
this sample path converges
to X bar?
00:56:47.580 --> 00:56:54.330
And the theorem says that this
quantity is equal to 1 if the
00:56:54.330 --> 00:56:57.940
expected value of the magnitude
of X is finite.
00:56:57.940 --> 00:57:01.540
We're not going to prove that,
but what we are going to prove
00:57:01.540 --> 00:57:07.360
is that if the expected value
of the fourth moment of X is
00:57:07.360 --> 00:57:09.730
finite, then we're going
to prove that
00:57:09.730 --> 00:57:12.640
this theorem is true.
00:57:12.640 --> 00:57:20.480
OK, when we write this from now
on, we will sometimes get
00:57:20.480 --> 00:57:21.810
more terse.
00:57:21.810 --> 00:57:27.350
And instead of writing the
probability of an omega in the
00:57:27.350 --> 00:57:33.240
set of sample points such that
this limit for a sample point
00:57:33.240 --> 00:57:36.510
is equal to X bar, this whole
thing is equal to 1.
00:57:36.510 --> 00:57:39.480
We can sometimes write it as
the probability that this
00:57:39.480 --> 00:57:43.940
limit, which is now a limit
of Sn of omega over n
00:57:43.940 --> 00:57:45.100
is equal to X bar.
00:57:45.100 --> 00:57:47.450
But that's equal to 1.
00:57:47.450 --> 00:57:51.430
Some people write it even more
tersely as the limit of S sub
00:57:51.430 --> 00:57:56.590
n over n is equal to X bar
with probability 1.
00:57:56.590 --> 00:58:01.570
This is a very strange statement
here because this--
00:58:07.640 --> 00:58:11.320
I mean, what you're saying with
this statement is not
00:58:11.320 --> 00:58:16.770
that this limit is equal to
X bar with probability 1.
00:58:16.770 --> 00:58:21.610
It's saying, with probability 1,
this limit here exists for
00:58:21.610 --> 00:58:25.260
a sample point, and that limit
is equal to X bar.
00:58:25.260 --> 00:58:28.380
The thing which makes the strong
law of large numbers
00:58:28.380 --> 00:58:34.520
difficult is not proving
that the limit has
00:58:34.520 --> 00:58:36.170
a particular value.
00:58:36.170 --> 00:58:39.170
If there is a limit, it's
always easy to find
00:58:39.170 --> 00:58:40.230
what the value is.
00:58:40.230 --> 00:58:43.790
The thing which is difficult is
figuring out whether it has
00:58:43.790 --> 00:58:44.800
a limit or not.
00:58:44.800 --> 00:58:51.380
So this statement is fine for
people who understand what it
00:58:51.380 --> 00:58:55.700
says, but it's kind of
confusing otherwise.
00:58:55.700 --> 00:59:00.690
Still more tersely, people talk
about it as Sn over n
00:59:00.690 --> 00:59:04.790
goes to limit X bar with
probability 1.
00:59:04.790 --> 00:59:07.940
This is probably an even better
way to say it than this
00:59:07.940 --> 00:59:09.580
is because this is--
00:59:09.580 --> 00:59:12.580
I mean, this says that there's
something strange
00:59:12.580 --> 00:59:14.440
in the limit here.
00:59:14.440 --> 00:59:19.210
But I would suggest that you
write it this way until you
00:59:19.210 --> 00:59:20.720
get used to what it's saying.
00:59:20.720 --> 00:59:24.620
Because then, when you write it
this way, you realize that
00:59:24.620 --> 00:59:28.480
what you're talking about is
the limit over individual
00:59:28.480 --> 00:59:33.520
sample points rather than some
kind of more general limit.
00:59:33.520 --> 00:59:38.630
And convergence with probability
1 is always that
00:59:38.630 --> 00:59:42.420
sort of convergence.
00:59:42.420 --> 00:59:46.170
OK, this strong law and the
idea of convergence with
00:59:46.170 --> 00:59:49.240
probability 1 is really pretty
different from the other forms
00:59:49.240 --> 00:59:50.610
of convergence.
00:59:50.610 --> 00:59:54.420
In the sense that it focuses
directly on sample paths.
00:59:54.420 --> 00:59:59.080
The other forms of convergence
focus on things like the
00:59:59.080 --> 01:00:02.900
sequence of expected values,
or the sequence of
01:00:02.900 --> 01:00:06.980
probabilities, or sequences of
numbers, which are the things
01:00:06.980 --> 01:00:09.840
you're used to dealing with.
01:00:09.840 --> 01:00:15.820
Here you're dealing directly
with sample points, and it
01:00:15.820 --> 01:00:18.740
makes it more difficult to
talk about the rate of
01:00:18.740 --> 01:00:21.070
convergence as n approaches
infinity.
01:00:21.070 --> 01:00:24.180
You can't talk about the rate
of convergence here as n
01:00:24.180 --> 01:00:25.860
approaches infinity.
01:00:25.860 --> 01:00:28.720
If you have any n less than
infinity, if you're only
01:00:28.720 --> 01:00:34.280
looking at a finite sequence,
you have no way of saying
01:00:34.280 --> 01:00:38.210
whether any of the sample values
over that sequence are
01:00:38.210 --> 01:00:41.360
going to converge or whether
they're not going to converge,
01:00:41.360 --> 01:00:44.670
because you don't know what
the rest of them are.
01:00:44.670 --> 01:00:48.570
So talking about a rate of
convergence with respect to
01:00:48.570 --> 01:00:50.680
the strong law of large numbers
01:00:50.680 --> 01:00:53.270
doesn't make any sense.
01:00:53.270 --> 01:00:55.880
It's connected directly to
the standard notion of a
01:00:55.880 --> 01:01:00.640
convergence of a sequence of
numbers when you look at those
01:01:00.640 --> 01:01:04.690
numbers applied to
a sample path.
01:01:04.690 --> 01:01:07.900
This is what gives the strong
law of large numbers its
01:01:07.900 --> 01:01:13.920
power, the fact that it's
related to this standard idea
01:01:13.920 --> 01:01:14.690
of convergence.
01:01:14.690 --> 01:01:18.530
The standard idea of convergence
is what the whole
01:01:18.530 --> 01:01:22.310
theory of analysis
is built on.
01:01:22.310 --> 01:01:26.030
And there are some very powerful
things you can do
01:01:26.030 --> 01:01:27.030
with analysis.
01:01:27.030 --> 01:01:31.740
And it's because convergence is
defined the way that it is.
01:01:31.740 --> 01:01:35.400
When we talk about the strong
law of large numbers, we are
01:01:35.400 --> 01:01:39.170
locked into that particular
notion of convergence.
01:01:39.170 --> 01:01:41.690
And therefore, it's going
to have a lot of power.
01:01:41.690 --> 01:01:44.050
We will see this as soon
as we start talking
01:01:44.050 --> 01:01:45.750
about renewal theory.
01:01:45.750 --> 01:01:47.890
And in fact, we'll see it in
the proof of the strong law
01:01:47.890 --> 01:01:50.640
that we're going
to go through.
01:01:50.640 --> 01:01:53.470
Most of the heavy lifting with
the strong law of large
01:01:53.470 --> 01:01:57.780
numbers has been done by the
analysis of convergence with
01:01:57.780 --> 01:01:58.740
probability 1.
01:01:58.740 --> 01:02:01.540
The hard thing is this theorem
we've just proven.
01:02:01.540 --> 01:02:02.990
And that's tricky.
01:02:02.990 --> 01:02:05.750
And I apologize for getting a
little confused about it as we
01:02:05.750 --> 01:02:08.720
went through it, and not
explaining all the steps
01:02:08.720 --> 01:02:10.910
completely.
01:02:10.910 --> 01:02:12.950
But as I said, it's hard
to follow proofs
01:02:12.950 --> 01:02:14.950
in real time anyway.
01:02:14.950 --> 01:02:17.080
But all of that is done now.
01:02:17.080 --> 01:02:19.700
How do we go through the strong
law of large numbers
01:02:19.700 --> 01:02:22.990
now if we accept this
convergence
01:02:22.990 --> 01:02:25.370
with probability 1?
01:02:25.370 --> 01:02:29.370
Well, it turns out to
be pretty easy.
01:02:29.370 --> 01:02:32.300
We're going to assume that the
expected value of the fourth
01:02:32.300 --> 01:02:35.890
moment of this underlying
random variable
01:02:35.890 --> 01:02:38.040
is less than infinity.
01:02:38.040 --> 01:02:43.830
So let's look at the expected
value of the sum of n random
01:02:43.830 --> 01:02:47.120
variables taken to
the fourth power.
01:02:47.120 --> 01:02:48.790
OK, so what is that?
01:02:48.790 --> 01:02:57.760
It's the expected value of S
sub n times S sub n times S
01:02:57.760 --> 01:03:00.030
sub n times S sub n.
01:03:00.030 --> 01:03:05.220
S sub n is the sum of
Xi from 1 to n.
01:03:05.220 --> 01:03:07.020
It's also this.
01:03:07.020 --> 01:03:08.840
It's also this.
01:03:08.840 --> 01:03:09.970
It's also this.
01:03:09.970 --> 01:03:14.150
So the expected value of S sub
n to the fourth is the expected
01:03:14.150 --> 01:03:17.810
value of this entire
product here.
01:03:17.810 --> 01:03:22.800
I should have a big bracket
around all of that.
01:03:22.800 --> 01:03:27.190
If I multiply all of these terms
out, each of these terms
01:03:27.190 --> 01:03:29.900
goes from 1 to n, what I'm
going to get is the
01:03:29.900 --> 01:03:32.710
sum from 1 to n.
01:03:32.710 --> 01:03:35.010
Sum over j from 1 to n.
01:03:35.010 --> 01:03:37.310
Sum over k from 1 to n.
01:03:37.310 --> 01:03:40.810
And a sum over l from 1 to n.
01:03:40.810 --> 01:03:44.710
So I'm going to have the
expected value of X sub i
01:03:44.710 --> 01:03:48.370
times X sub j times X
sub k times X sub l.
01:03:48.370 --> 01:03:50.100
Let's review what this is.
01:03:50.100 --> 01:04:01.340
X sub i is the random variable
for the i-th of these X's.
01:04:01.340 --> 01:04:02.640
I have n X's--
01:04:02.640 --> 01:04:06.310
X1, X2, X3, up to X sub n.
01:04:06.310 --> 01:04:11.000
What I'm trying to find is the
expected value of this sum to
01:04:11.000 --> 01:04:12.460
the fourth power.
01:04:12.460 --> 01:04:15.702
When you look at the sum of
something, if I look at the
01:04:15.702 --> 01:04:34.410
sum of numbers, say the sum over i of
a sub i, times the sum of a
01:04:34.410 --> 01:04:38.740
sub j, where I write the second index as j.
01:04:38.740 --> 01:04:40.830
If I just do this, what's
it equal to?
01:04:40.830 --> 01:04:45.380
It's equal to the sum over
i and j of a sub
01:04:45.380 --> 01:04:47.890
i times a sub j.
01:04:47.890 --> 01:04:50.650
I'm doing exactly the same thing
here, but I'm taking the
01:04:50.650 --> 01:04:52.270
expected value of it.
01:04:52.270 --> 01:04:55.540
That's a finite sum. The
expected value of the sum is
01:04:55.540 --> 01:05:00.540
equal to the sum of the
expected values.
01:05:00.540 --> 01:05:07.240
So if I look at any particular
value of X--
01:05:07.240 --> 01:05:08.580
of this first X here.
01:05:08.580 --> 01:05:11.390
Suppose I look at i equals 1.
01:05:11.390 --> 01:05:17.990
Suppose I look at the expected
value of X1 times--
01:05:17.990 --> 01:05:20.460
and I'll make this anything
other than 1.
01:05:20.460 --> 01:05:24.350
I'll make this anything other
than 1, and this anything
01:05:24.350 --> 01:05:24.870
other than 1.
01:05:24.870 --> 01:05:27.370
For example, suppose I'm trying
to find the expected
01:05:27.370 --> 01:05:35.490
value of X1 times X2
times X10 times X3.
01:05:35.490 --> 01:05:38.110
OK, what is that?
01:05:38.110 --> 01:05:42.910
Since X1, X2, X3 are all
independent of each other, the
01:05:42.910 --> 01:05:47.460
expected value of X1 times the
product of all these
01:05:47.460 --> 01:05:52.420
other things is the expected
value of X1 conditional on the
01:05:52.420 --> 01:05:54.610
values of these other
quantities.
01:05:54.610 --> 01:05:57.105
And then I average over all
the other quantities.
01:06:00.210 --> 01:06:03.460
Now, if these are independent
random variables, the expected
01:06:03.460 --> 01:06:08.080
value of this given the values
of these other quantities is
01:06:08.080 --> 01:06:11.130
just the expected value of X1.
01:06:11.130 --> 01:06:14.740
I'm dealing with a case where
the expected value of X is
01:06:14.740 --> 01:06:17.800
equal to 0.
01:06:17.800 --> 01:06:20.330
Assuming X bar equals 0.
01:06:20.330 --> 01:06:26.080
So when I pick i equal to 1
and all of these equal to
01:06:26.080 --> 01:06:31.680
something other than 1, this
expected value is equal to 0.
01:06:31.680 --> 01:06:34.760
That's a whole bunch of expected
values because that
01:06:34.760 --> 01:06:39.580
includes j equals 2 to n, k
equals 2 to n, and l
01:06:39.580 --> 01:06:41.090
equals 2 to n.
01:06:41.090 --> 01:06:45.520
Now, I can do this for i
equals 2, i equals 3,
01:06:45.520 --> 01:06:46.770
and so forth.
01:06:48.950 --> 01:06:53.770
If i is different from j, and k,
and l, this expected value
01:06:53.770 --> 01:06:55.020
is equal to 0.
01:06:58.340 --> 01:07:02.160
And the same thing if
X sub j is different
01:07:02.160 --> 01:07:03.430
than all the others.
01:07:03.430 --> 01:07:05.640
The expected value
is equal to 0.
01:07:05.640 --> 01:07:09.150
So how can I get anything
that's nonzero?
01:07:09.150 --> 01:07:15.240
Well, if I look at X sub 1 times
X sub 1 times X sub 1
01:07:15.240 --> 01:07:18.130
times X sub 1, that
gives me expected
01:07:18.130 --> 01:07:19.690
value of X to the fourth.
01:07:19.690 --> 01:07:22.950
That's not 0, presumably.
01:07:22.950 --> 01:07:24.860
And I have n terms like that.
01:07:29.050 --> 01:07:33.350
Well, I'm getting
down to here.
01:07:33.350 --> 01:07:37.540
What we have is two kinds
of nonzero terms.
01:07:37.540 --> 01:07:41.930
One of them is where i is equal
to j is equal to k is
01:07:41.930 --> 01:07:43.060
equal to l.
01:07:43.060 --> 01:07:46.830
And then we have X sub i
to the fourth power.
01:07:46.830 --> 01:07:49.980
And we're assuming that's
some finite quantity.
01:07:49.980 --> 01:07:52.890
That's the basic assumption
we're using here, expected
01:07:52.890 --> 01:07:55.740
value of X fourth is
less than infinity.
01:07:55.740 --> 01:07:58.470
What other kinds of things
can we have?
01:07:58.470 --> 01:08:05.130
Well, if i is equal to j, and
if k is equal to l, then I
01:08:05.130 --> 01:08:13.920
have the expected value of Xi
squared times expected value
01:08:13.920 --> 01:08:16.890
of Xk squared.
01:08:16.890 --> 01:08:17.950
What is that?
01:08:17.950 --> 01:08:22.510
Xi squared is independent of
Xk squared because i is
01:08:22.510 --> 01:08:23.710
unequal to k.
01:08:23.710 --> 01:08:26.060
These are independent
random variables.
01:08:26.060 --> 01:08:31.250
So I have the expected value
of Xi squared is what?
01:08:31.250 --> 01:08:35.720
It's just a variance of X.
This quantity here is the
01:08:35.720 --> 01:08:37.819
variance of X also.
01:08:37.819 --> 01:08:43.729
So I have the variance of X,
which is sigma squared, squared.
01:08:43.729 --> 01:08:50.040
So I have sigma to
the fourth power.
01:08:50.040 --> 01:08:55.720
So those are the only terms that
I have for this second
01:08:55.720 --> 01:08:59.850
kind of nonzero term
where Xi--
01:08:59.850 --> 01:09:02.160
excuse me, not Xi
is equal to Xj.
01:09:02.160 --> 01:09:03.689
That's not what we're
talking about.
01:09:03.689 --> 01:09:06.819
Where i is equal to j.
01:09:06.819 --> 01:09:12.330
Namely, we have a sum where i
runs from 1 to n, where j runs
01:09:12.330 --> 01:09:16.670
from 1 to n, k runs from 1 to
n, and l runs from 1 to n.
01:09:16.670 --> 01:09:21.040
What we're looking at is, for
what values of i, j, k, and l
01:09:21.040 --> 01:09:24.500
is this quantity
not equal to 0?
01:09:24.500 --> 01:09:28.430
We're saying that if i is equal
to j is equal to k is
01:09:28.430 --> 01:09:32.229
equal to l, then for all of
those terms, we have the
01:09:32.229 --> 01:09:34.550
expected value of X fourth.
01:09:34.550 --> 01:09:39.640
For all terms in which i is
equal to j and k is equal to
01:09:39.640 --> 01:09:44.689
l, for all of those terms, we
have the expected value of X
01:09:44.689 --> 01:09:47.609
sub i squared, quantity squared.
01:09:47.609 --> 01:09:50.560
Now, how many of those
terms are there?
01:09:50.560 --> 01:09:55.180
Well, X sub i can be
any one of n terms.
01:09:55.180 --> 01:10:01.220
X sub j can be any one
of how many terms?
01:10:01.220 --> 01:10:02.470
It can't be equal.
01:10:06.634 --> 01:10:11.120
i is equal to j, how many
things can k be?
01:10:11.120 --> 01:10:17.450
It can't be equal to i because
then we would wind up with X
01:10:17.450 --> 01:10:19.550
sub i to the fourth power.
01:10:19.550 --> 01:10:24.650
So we're looking at n minus
1 possible values for k, n
01:10:24.650 --> 01:10:27.430
possible values for i.
01:10:27.430 --> 01:10:30.820
So there are n times n minus
1 of those terms.
01:10:30.820 --> 01:10:32.070
I can also have--
01:10:41.868 --> 01:10:44.600
let me write in this way.
01:10:44.600 --> 01:10:51.580
Xi Xj with j equal to i, times Xk Xl with l equal to k.
01:10:51.580 --> 01:10:52.800
I can have those terms.
01:10:52.800 --> 01:11:00.470
I can also have Xi
Xj unequal to i.
01:11:00.470 --> 01:11:08.720
Xk equal to j and
Xl equal to i.
01:11:08.720 --> 01:11:10.640
I can have terms like this.
01:11:10.640 --> 01:11:13.650
And that gives me a sigma
fourth term also.
01:11:13.650 --> 01:11:18.630
I can also have Xi
Xj unequal to i.
01:11:18.630 --> 01:11:22.370
k can be equal to i and
l can be equal to j.
01:11:22.370 --> 01:11:24.880
So I really have three
kinds of terms.
01:11:24.880 --> 01:11:33.840
I have three times n times n
minus 1 times the expected
01:11:33.840 --> 01:11:43.690
value of X squared, this
quantity squared.
01:11:43.690 --> 01:11:48.190
So that's the total value of
expected value of S sub n to
01:11:48.190 --> 01:11:50.130
the fourth.
01:11:50.130 --> 01:11:55.640
It's the n terms for which i is
equal to j is equal to k is
01:11:55.640 --> 01:12:02.700
equal to l plus the 3n times n
minus 1 terms in which we have
01:12:02.700 --> 01:12:05.790
two pairs of equal terms.
01:12:05.790 --> 01:12:07.470
So we have that quantity here.
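The counting result just stated, that for IID mean-zero X the expected value of S sub n to the fourth is n times E of X fourth plus 3n(n-1) times the square of E of X squared, can be checked exactly in a simple case. A sketch (my own check, not part of the lecture; the choice of Rademacher plus-or-minus-1 variables, for which both moments are 1, is an assumed example):

```python
from itertools import product

def exact_fourth_moment(n):
    # Enumerate all 2^n equally likely sign patterns of Rademacher X's,
    # average S_n^4 exactly over them.
    total = sum(sum(signs) ** 4 for signs in product((-1, 1), repeat=n))
    return total / 2 ** n

# For Rademacher X: E[X^4] = 1 and (E[X^2])^2 = 1, so the lecture's
# formula predicts E[S_n^4] = n + 3n(n-1) = 3n^2 - 2n.
for n in range(1, 9):
    predicted = n * 1 + 3 * n * (n - 1) * 1
    assert exact_fourth_moment(n) == predicted
```

Because the enumeration is exhaustive rather than sampled, the match is exact, which is a useful sanity check that the n terms with all four indices equal and the 3n(n-1) terms with two equal pairs were counted correctly.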
01:12:07.470 --> 01:12:13.010
Now, expected value of X fourth
is the second moment of
01:12:13.010 --> 01:12:16.410
the random variable X squared.
01:12:16.410 --> 01:12:23.870
So the expected value of X
squared squared is the mean of
01:12:23.870 --> 01:12:26.610
X squared, squared.
01:12:26.610 --> 01:12:30.580
And that's less than or equal to
the second moment of X squared,
01:12:30.580 --> 01:12:32.100
which is this quantity.
01:12:32.100 --> 01:12:36.190
The expected value
of Sn fourth is--
01:12:38.980 --> 01:12:43.310
well, actually it's less than
or equal to 3n squared times
01:12:43.310 --> 01:12:45.690
the expected value
of X fourth.
01:12:45.690 --> 01:12:50.020
And blah, blah, blah, until we
get to 3 times the expected
01:12:50.020 --> 01:12:53.300
value of X fourth times the
sum from n equals 1 to
01:12:53.300 --> 01:12:56.140
infinity of 1 over n squared.
01:12:56.140 --> 01:12:59.695
Now, is that quantity finite
or is it infinite?
01:13:06.050 --> 01:13:09.100
Well, let's talk of three
different ways of showing that
01:13:09.100 --> 01:13:13.220
this sum is going
to be finite.
01:13:13.220 --> 01:13:17.710
One of the ways is that this is
an approximation, a crude
01:13:17.710 --> 01:13:20.860
approximation, of the
integral from 1 to
01:13:20.860 --> 01:13:24.395
infinity of 1 over X squared.
01:13:24.395 --> 01:13:26.990
You know that that integral
is finite.
01:13:26.990 --> 01:13:30.560
Another way of doing it is you
already know that if you take
01:13:30.560 --> 01:13:35.650
1 over n times 1 over n plus 1,
you know how to sum that.
01:13:35.650 --> 01:13:37.340
That sum is finite.
01:13:37.340 --> 01:13:41.040
You can bound this by that.
01:13:43.940 --> 01:13:48.570
And the other way of doing it is
simply to know that the sum
01:13:48.570 --> 01:13:50.140
of 1 over n squared is finite.
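The second method mentioned, bounding 1 over n squared by a telescoping sum, can be checked numerically. A quick sketch (my own illustration; the cutoff N is arbitrary): for n at least 2, 1/n^2 is at most 1/(n(n-1)) = 1/(n-1) - 1/n, which telescopes.

```python
N = 100_000

# Partial sum of 1/n^2, and the telescoping bound from the lecture.
partial = sum(1 / n**2 for n in range(1, N + 1))
telescope = sum(1 / (n * (n - 1)) for n in range(2, N + 1))  # = 1 - 1/N

# The telescoping identity: each term is 1/(n-1) - 1/n.
assert abs(telescope - (1 - 1 / N)) < 1e-9

# The bound: sum 1/n^2 = 1 + sum_{n>=2} 1/n^2 < 1 + (1 - 1/N) < 2.
assert partial < 1 + telescope

# And the partial sums are in fact settling near pi^2/6 = 1.6449...
assert abs(partial - 1.6449340668) < 1e-4
```

Any of the three arguments in the lecture gives finiteness; the telescoping one is the only one that needs no calculus, which is why it is worth writing out.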
01:13:53.990 --> 01:13:59.740
So what this says is that the
sum of the expected values of S sub n
01:13:59.740 --> 01:14:04.290
fourth over n fourth is
less than infinity.
01:14:04.290 --> 01:14:14.210
That says that the probability
of the set of omega for which
01:14:14.210 --> 01:14:20.450
the limit of S sub n to the fourth over n fourth
is equal to 0 is equal to 1.
01:14:20.450 --> 01:14:24.750
In other words, it's saying
that S sub n to the fourth of
01:14:24.750 --> 01:14:30.850
omega over n fourth
converges to 0.
01:14:30.850 --> 01:14:34.200
That's not quite what
we want, is it?
01:14:34.200 --> 01:14:37.770
But the set of sample points
for which this quantity
01:14:37.770 --> 01:14:44.790
converges has probability 1.
01:14:44.790 --> 01:14:47.860
And here is where you see the
real power of the strong law
01:14:47.860 --> 01:14:49.410
of large numbers.
01:14:49.410 --> 01:14:57.170
Because if these numbers
converge to 0 with probability
01:14:57.170 --> 01:15:11.310
1, what happens to the sequence of
numbers Sn to the fourth of
01:15:11.310 --> 01:15:15.920
omega divided by n to the
fourth, this limit--
01:15:23.488 --> 01:15:28.940
if this was equal to 0, then
what is the limit as n
01:15:28.940 --> 01:15:35.541
approaches infinity of Sn of
omega over n?
01:15:38.880 --> 01:15:43.240
If I take the fourth root
of this, I get this.
01:15:43.240 --> 01:15:47.220
If this quantity is converging
to 0, the fourth root of this
01:15:47.220 --> 01:15:53.830
also has to be converging to 0
on a sample path basis. The
01:15:53.830 --> 01:15:58.300
fact that this converges means
that this converges also.
01:16:00.910 --> 01:16:03.350
Now, you see if you were dealing
with convergence in
01:16:03.350 --> 01:16:07.060
probability or something like
that, you couldn't play this
01:16:07.060 --> 01:16:09.090
funny game.
01:16:09.090 --> 01:16:12.450
And the ability to play this
game is really what makes
01:16:12.450 --> 01:16:16.490
convergence with probability 1
a powerful concept.
01:16:16.490 --> 01:16:19.350
You can do all sorts of strange
things with it.
01:16:19.350 --> 01:16:23.590
And we'll talk about
that next time.
01:16:23.590 --> 01:16:29.920
But that's why all
of this works.
01:16:29.920 --> 01:16:33.960
So that's what says that the
probability of the set of
01:16:33.960 --> 01:16:37.570
omega for which the limit
of Sn of omega over n
01:16:37.570 --> 01:16:38.820
equals 0 equals 1.
01:16:41.840 --> 01:16:45.590
Now, let's look at the
strange aspect of
01:16:45.590 --> 01:16:47.880
what we've just done.
01:16:47.880 --> 01:16:51.940
And this is where things
get very peculiar.
01:16:51.940 --> 01:16:55.640
Let's look at the Bernoulli
case, which by now we all
01:16:55.640 --> 01:16:57.470
understand.
01:16:57.470 --> 01:17:03.260
So we consider a Bernoulli
process, all
01:17:03.260 --> 01:17:04.860
moments of X exist.
01:17:04.860 --> 01:17:08.110
Moment-generating functions
of X exist.
01:17:08.110 --> 01:17:11.590
X is about as well-behaved as
you can expect because it only
01:17:11.590 --> 01:17:14.030
has the values 1 or 0.
01:17:14.030 --> 01:17:16.630
So it's very nice.
01:17:16.630 --> 01:17:19.690
The expected value of X
is going to be equal
01:17:19.690 --> 01:17:22.070
to p in this case.
01:17:22.070 --> 01:17:28.380
The set of sample paths for
which the limit of Sn of omega over n is
01:17:28.380 --> 01:17:31.270
equal to p has probability 1.
01:17:31.270 --> 01:17:37.300
In other words, with probability
1, when you look
01:17:37.300 --> 01:17:40.305
at a sample path and you look
at the whole thing from n
01:17:40.305 --> 01:17:43.730
equals 1 off to infinity, and
you take the limit of that
01:17:43.730 --> 01:17:47.750
sample path as n goes to
infinity, what you get is p.
01:17:47.750 --> 01:17:52.090
And the probability that you
get p is equal to 1.
01:17:52.090 --> 01:17:55.880
Well, now, the thing that's
disturbing is, if you look at
01:17:55.880 --> 01:17:59.930
another Bernoulli process where
the probability of the 1
01:17:59.930 --> 01:18:03.160
is p prime instead of p.
01:18:03.160 --> 01:18:06.630
What happens then?
01:18:06.630 --> 01:18:12.440
With probability 1, you get
convergence of Sn of omega
01:18:12.440 --> 01:18:19.820
over n, but the convergence is
to p prime instead of to p.
01:18:19.820 --> 01:18:24.790
The events in these two spaces
are exactly the same.
01:18:24.790 --> 01:18:28.470
We've changed the probability
measure, but we've kept all
01:18:28.470 --> 01:18:30.740
the events the same.
01:18:30.740 --> 01:18:34.040
And by changing the probability
measure, we have
01:18:34.040 --> 01:18:43.160
changed one set of probability 1
into a set of probability 0.
01:18:43.160 --> 01:18:46.500
And we changed another set of
probability 0 into set of
01:18:46.500 --> 01:18:48.550
probability 1.
01:18:48.550 --> 01:18:52.130
So we have two different
events here.
01:18:52.130 --> 01:18:56.450
On one probability measure, this
event has probability 1.
01:18:56.450 --> 01:18:58.700
On the other one, it
has probability 0.
01:18:58.700 --> 01:19:04.930
They're both very nice, very
well-behaved probabilistic
01:19:04.930 --> 01:19:06.750
situations.
01:19:06.750 --> 01:19:08.560
So that's a little disturbing.
01:19:08.560 --> 01:19:14.140
But then you say, you can pick
p in an uncountably infinite
01:19:14.140 --> 01:19:15.940
number of ways.
01:19:15.940 --> 01:19:18.680
And for each way you
count p, you have
01:19:18.680 --> 01:19:20.160
uncountably many events.
01:19:23.250 --> 01:19:28.790
Excuse me, for each value of
p, you have one event of
01:19:28.790 --> 01:19:31.790
probability 1 for that p.
01:19:31.790 --> 01:19:35.750
So as you go through this
uncountable number of events,
01:19:35.750 --> 01:19:39.480
you go through this uncountable
number of p's, you
01:19:39.480 --> 01:19:43.560
have an uncountable number of
events, each of which has
01:19:43.560 --> 01:19:47.890
probability 1 for its own p.
01:19:47.890 --> 01:19:52.600
And now the set of sequences
that converge is, in fact, a
01:19:52.600 --> 01:19:55.270
rather peculiar set
to start with.
01:19:55.270 --> 01:19:57.656
So if you look at all the other
things that are going to
01:19:57.656 --> 01:20:01.010
happen, there are an awful
lot of those events also.
01:20:04.300 --> 01:20:08.780
So what is happening here is
that these events that we're
01:20:08.780 --> 01:20:14.750
talking about are indeed very,
very peculiar events.
01:20:14.750 --> 01:20:16.530
I mean, all the mathematics
works out.
01:20:16.530 --> 01:20:17.850
The mathematics is fine.
01:20:17.850 --> 01:20:21.420
There's no doubt about it.
01:20:21.420 --> 01:20:24.280
In fact, the mathematics
of probability
01:20:24.280 --> 01:20:26.570
theory was worked out.
01:20:26.570 --> 01:20:29.940
People like Kolmogorov went to
great efforts to make sure
01:20:29.940 --> 01:20:31.960
that all of this worked out.
01:20:31.960 --> 01:20:34.330
But then he wound up with
this peculiar kind
01:20:34.330 --> 01:20:36.660
of situation here.
01:20:36.660 --> 01:20:40.170
And that's what happens when you
go to an infinite number
01:20:40.170 --> 01:20:43.080
of random variables.
01:20:43.080 --> 01:20:49.320
And it's ugly, but that's
the way it is.
01:20:49.320 --> 01:20:55.140
So what I'm arguing here
is that when you go from
01:20:55.140 --> 01:21:00.220
finite n to infinite n, and
you start interchanging
01:21:00.220 --> 01:21:06.010
limits, and you start taking
limits without much care and
01:21:06.010 --> 01:21:09.280
you start doing all the things
that you would like to do,
01:21:09.280 --> 01:21:15.700
thinking that infinite n is sort
of the same as finite n.
01:21:15.700 --> 01:21:18.990
In most places in probability,
you can do that and you can
01:21:18.990 --> 01:21:20.170
get away with it.
01:21:20.170 --> 01:21:22.800
As soon as you start dealing
with the strong law of large
01:21:22.800 --> 01:21:26.130
numbers, you suddenly really
have to start being careful
01:21:26.130 --> 01:21:27.810
about this.
01:21:27.810 --> 01:21:31.900
So from now on, we have to be
just a little bit careful
01:21:31.900 --> 01:21:36.120
about interchanging limits,
interchanging summation and
01:21:36.120 --> 01:21:40.190
integration, interchanging all
sorts of things, as soon as we
01:21:40.190 --> 01:21:43.540
have an infinite number
of random variables.
01:21:43.540 --> 01:21:47.600
So that's a care that we have
to worry about from here on.
01:21:47.600 --> 01:21:48.850
OK, thank you.