1
00:00:00,530 --> 00:00:02,960
The following content is
provided under a Creative
2
00:00:02,960 --> 00:00:04,370
Commons license.
3
00:00:04,370 --> 00:00:07,410
Your support will help MIT
OpenCourseWare continue to
4
00:00:07,410 --> 00:00:11,060
offer high quality educational
resources for free.
5
00:00:11,060 --> 00:00:13,960
To make a donation or view
additional materials from
6
00:00:13,960 --> 00:00:17,890
hundreds of MIT courses, visit
MIT OpenCourseWare at
7
00:00:17,890 --> 00:00:19,140
ocw.mit.edu.
8
00:00:24,600 --> 00:00:25,230
PROFESSOR: OK.
9
00:00:25,230 --> 00:00:28,710
Today we're all through with
Markov chains, or at least
10
00:00:28,710 --> 00:00:32,530
with finite state
Markov chains.
11
00:00:32,530 --> 00:00:36,440
And we're going on to
renewal processes.
12
00:00:36,440 --> 00:00:40,660
As part of that, we will spend
a good deal of time talking
13
00:00:40,660 --> 00:00:44,950
about the strong law of large
numbers, and convergence with
14
00:00:44,950 --> 00:00:46,830
probability one.
15
00:00:46,830 --> 00:00:51,440
The idea of convergence with
probability one, at least to
16
00:00:51,440 --> 00:00:56,540
me is by far the most difficult
part of the course.
17
00:00:56,540 --> 00:00:59,890
It's very abstract
mathematically.
18
00:00:59,890 --> 00:01:03,170
It looks like it's simple, and
it's one of those things you
19
00:01:03,170 --> 00:01:06,630
start to think you understand
it, and then at a certain
20
00:01:06,630 --> 00:01:09,460
point, you realize
that you don't.
21
00:01:09,460 --> 00:01:12,740
And this has been happening
to me for 20 years now.
22
00:01:12,740 --> 00:01:16,650
I keep thinking I really
understand this idea of
23
00:01:16,650 --> 00:01:19,800
convergence with probability
one, and then I see some
24
00:01:19,800 --> 00:01:22,180
strange example again.
25
00:01:22,180 --> 00:01:25,130
And I say there's something
very peculiar
26
00:01:25,130 --> 00:01:26,500
about this whole idea.
27
00:01:26,500 --> 00:01:29,750
And I'm going to illustrate
that to you at the end of the
28
00:01:29,750 --> 00:01:30,980
lecture today.
29
00:01:30,980 --> 00:01:35,100
But for the most part, I will
be talking not so much about
30
00:01:35,100 --> 00:01:40,260
renewal processes, but this
set of mathematical issues
31
00:01:40,260 --> 00:01:43,990
that we have to understand in
order to be able to look at
32
00:01:43,990 --> 00:01:47,440
renewal processes in
the simplest way.
33
00:01:47,440 --> 00:01:50,830
One of the funny things about
the strong law of large
34
00:01:50,830 --> 00:01:55,420
numbers and how it gets applied
to renewal processes
35
00:01:55,420 --> 00:02:01,220
is that although the idea of
convergence with probability
36
00:02:01,220 --> 00:02:06,990
one is sticky and strange, once
you understand it, it is
37
00:02:06,990 --> 00:02:11,920
one of the easiest things
there is to use.
38
00:02:11,920 --> 00:02:16,220
And therefore, once you become
comfortable with it, you can
39
00:02:16,220 --> 00:02:19,150
use it to do things which would
be very hard to do in
40
00:02:19,150 --> 00:02:21,260
any other way.
41
00:02:21,260 --> 00:02:24,630
And because of that, most people
feel they understand it
42
00:02:24,630 --> 00:02:26,700
better than they actually do.
43
00:02:26,700 --> 00:02:30,670
And that's the reason why it
sometimes crops up when you're
44
00:02:30,670 --> 00:02:33,230
least expecting it, and
you find there's
45
00:02:33,230 --> 00:02:35,630
something very peculiar.
46
00:02:35,630 --> 00:02:39,050
OK, so let's start out by
talking a little bit about
47
00:02:39,050 --> 00:02:41,580
renewal processes.
48
00:02:41,580 --> 00:02:47,570
And then talking about this
convergence, and the strong
49
00:02:47,570 --> 00:02:52,190
law of large numbers, and what
it does to all of this.
50
00:02:52,190 --> 00:02:54,090
This is just review.
51
00:02:54,090 --> 00:02:57,230
We talked about arrival
processes when we started
52
00:02:57,230 --> 00:03:00,090
talking about Poisson
processes.
53
00:03:00,090 --> 00:03:03,740
Renewal processes are a special
kind of arrival
54
00:03:03,740 --> 00:03:07,510
processes, and Poisson processes
are a special kind
55
00:03:07,510 --> 00:03:09,290
of renewal process.
56
00:03:09,290 --> 00:03:13,710
So this is something you're
already sort of familiar with.
57
00:03:13,710 --> 00:03:20,310
All arrival processes we
will tend to treat in one of
58
00:03:20,310 --> 00:03:23,450
three equivalent ways, which is
the same thing we did with
59
00:03:23,450 --> 00:03:24,700
Poisson processes.
60
00:03:27,250 --> 00:03:30,250
A stochastic process,
we said, is a
61
00:03:30,250 --> 00:03:32,410
family of random variables.
62
00:03:32,410 --> 00:03:38,070
But in this case, we always view
it as three families of
63
00:03:38,070 --> 00:03:40,770
random variables, which
are all related.
64
00:03:40,770 --> 00:03:43,720
And each of which defines
the others.
65
00:03:43,720 --> 00:03:46,780
And you jump back and forth from
looking at one to looking
66
00:03:46,780 --> 00:03:51,650
at the other, which is, as you
saw with Poisson processes,
67
00:03:51,650 --> 00:03:55,250
you really want to do this,
because if you stick to only
68
00:03:55,250 --> 00:04:00,060
one way of looking at it, you
really only pick up about a
69
00:04:00,060 --> 00:04:02,736
quarter, or a half
of the picture.
70
00:04:02,736 --> 00:04:04,840
OK.
71
00:04:04,840 --> 00:04:08,810
So this one picture gives us
a relationship between the
72
00:04:08,810 --> 00:04:14,170
arrival epochs of an arrival
process, the inter-arrival
73
00:04:14,170 --> 00:04:21,560
intervals, the x1, x2, x3, and
the counting process, n of t,
74
00:04:21,560 --> 00:04:26,100
and whichever one you use, you
use the one which is easiest,
75
00:04:26,100 --> 00:04:28,330
for whatever you plan to do.
76
00:04:28,330 --> 00:04:31,980
For defining a renewal process,
the easy thing to do
77
00:04:31,980 --> 00:04:36,330
is to look at the inter-arrival
intervals,
78
00:04:36,330 --> 00:04:40,280
because the definition of a
renewal process is it's an
79
00:04:40,280 --> 00:04:45,930
arrival process for which the
inter-renewal intervals are independent
80
00:04:45,930 --> 00:04:48,290
and identically distributed.
81
00:04:48,290 --> 00:04:53,000
So any arrival process where the
inter-arrivals have
82
00:04:53,000 --> 00:04:56,260
that property is a renewal process.
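The three equivalent views the lecture describes (IID inter-arrival intervals, arrival epochs as their partial sums, and the counting process N(t)) can be sketched in a few lines. This illustration is mine, not from the lecture; the function names are assumptions, and the exponential inter-arrival distribution is just one choice, the one that makes the process Poisson.

```python
import random

def renewal_arrivals(n, draw_interarrival):
    """Generate the first n arrival epochs of a renewal process.

    The inter-arrival intervals X1, X2, ... are IID draws from
    draw_interarrival(); each arrival epoch is the running sum of
    the inter-arrival intervals so far.
    """
    epochs = []
    t = 0.0
    for _ in range(n):
        t += draw_interarrival()  # one IID inter-arrival interval
        epochs.append(t)
    return epochs

def counting_process(epochs, t):
    """N(t): the number of arrivals in (0, t]."""
    return sum(1 for s in epochs if s <= t)

random.seed(0)
# Exponential inter-arrivals make this the Poisson special case.
epochs = renewal_arrivals(1000, lambda: random.expovariate(2.0))
print(counting_process(epochs, 10.0))  # roughly rate * t = 20 on average
```

Any other IID inter-arrival distribution (uniform, deterministic, heavy-tailed) plugged into `draw_interarrival` still gives a renewal process.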
83
00:04:56,260 --> 00:05:00,640
OK, renewal processes are
characterized, and the name
84
00:05:00,640 --> 00:05:06,700
comes from the idea that you
start over at each interval.
85
00:05:06,700 --> 00:05:11,180
This idea of starting over is
something that we talk about
86
00:05:11,180 --> 00:05:13,350
more later on.
87
00:05:13,350 --> 00:05:17,310
And it's a little bit strange,
and a little bit fishy.
88
00:05:17,310 --> 00:05:21,910
It's like with a Poisson
process, you look at different
89
00:05:21,910 --> 00:05:25,400
intervals, and they're
independent of each other.
90
00:05:25,400 --> 00:05:28,860
And we sort of know what
that means by now.
91
00:05:28,860 --> 00:05:34,540
OK, you look at the arrival
epochs for a Poisson process.
92
00:05:34,540 --> 00:05:37,280
Are they independent
of each other?
93
00:05:37,280 --> 00:05:39,210
Of course not.
94
00:05:39,210 --> 00:05:41,450
The arrival epochs are
the sums of the
95
00:05:41,450 --> 00:05:43,520
inter-arrival intervals.
96
00:05:43,520 --> 00:05:45,950
The inter-arrival intervals
are the things that are
97
00:05:45,950 --> 00:05:47,190
independent.
98
00:05:47,190 --> 00:05:49,260
And the arrival epochs
are the sums of
99
00:05:49,260 --> 00:05:50,810
inter-arrival intervals.
100
00:05:50,810 --> 00:05:54,250
If you know that the first
arrival epoch takes 10 times
101
00:05:54,250 --> 00:05:59,480
longer than its mean, then
the second arrival epoch
102
00:05:59,480 --> 00:06:01,590
is going to be kind
of long, too.
103
00:06:01,590 --> 00:06:05,910
It's got to be at least 10 times
as long as the mean of
104
00:06:05,910 --> 00:06:10,330
the inter-arrival intervals,
because each arrival epoch is
105
00:06:10,330 --> 00:06:13,130
a sum of these inter-arrival
intervals.
106
00:06:13,130 --> 00:06:15,980
It's the inter-arrival intervals
that are independent.
107
00:06:15,980 --> 00:06:19,720
So when you say that the one
interval is independent of the
108
00:06:19,720 --> 00:06:23,720
other, yes, you know exactly
what you mean.
109
00:06:23,720 --> 00:06:26,160
And the idea is very simple.
110
00:06:26,160 --> 00:06:28,890
It's the same idea here.
111
00:06:28,890 --> 00:06:32,490
But then you start to think you
understand this, and you
112
00:06:32,490 --> 00:06:35,980
start to use it in
a funny way.
113
00:06:35,980 --> 00:06:39,710
And suddenly you're starting to
say that the arrival epochs
114
00:06:39,710 --> 00:06:42,280
are independent from one time
to the other, which they
115
00:06:42,280 --> 00:06:44,050
certainly aren't.
116
00:06:44,050 --> 00:06:49,070
What renewal theory does is it
lets you treat the gross
117
00:06:49,070 --> 00:06:53,000
characteristics of a process
in a very simple and
118
00:06:53,000 --> 00:06:54,640
straightforward way.
119
00:06:54,640 --> 00:06:58,270
So you're breaking up the
process into two sets
120
00:06:58,270 --> 00:06:59,700
of views about it.
121
00:06:59,700 --> 00:07:03,470
One is the long term behavior,
which you treat by renewal
122
00:07:03,470 --> 00:07:10,110
theory, and you use this one
exotic theory in a simple and
123
00:07:10,110 --> 00:07:14,550
straightforward way for every
different process, for every
124
00:07:14,550 --> 00:07:17,150
different renewal process
you look at.
125
00:07:17,150 --> 00:07:21,230
And then you have this usually
incredibly complicated kind of
126
00:07:21,230 --> 00:07:26,260
thing in the inside of
each arrival epoch.
127
00:07:26,260 --> 00:07:29,650
And the nice thing about renewal
theory is it lets you
128
00:07:29,650 --> 00:07:33,180
look at that complicated thing
without worrying about what's
129
00:07:33,180 --> 00:07:35,090
going on outside.
130
00:07:35,090 --> 00:07:38,950
So the local characteristics
can be studied without
131
00:07:38,950 --> 00:07:43,410
worrying about the long
term interactions.
132
00:07:43,410 --> 00:07:48,470
One example of this, and one
of the reasons we are now
133
00:07:48,470 --> 00:07:51,280
looking at Markov chains before
we look at renewal
134
00:07:51,280 --> 00:07:57,000
processes is that a Markov chain
is one of the nicest
135
00:07:57,000 --> 00:08:01,890
examples there is of a renewal
process, when you look at it
136
00:08:01,890 --> 00:08:03,450
in the right way.
137
00:08:03,450 --> 00:08:11,150
If you have a recurrent Markov
chain, then the interval from
138
00:08:11,150 --> 00:08:15,360
one time entering a particular
recurrent state
139
00:08:15,360 --> 00:08:17,190
until the next time
you enter that
140
00:08:17,190 --> 00:08:21,500
recurrent state is a renewal.
141
00:08:21,500 --> 00:08:27,570
So we look at the sequence of
times at which we enter this
142
00:08:27,570 --> 00:08:29,110
one given state.
143
00:08:29,110 --> 00:08:31,150
Enter state one over here.
144
00:08:31,150 --> 00:08:33,520
We enter state one
again over here.
145
00:08:33,520 --> 00:08:36,780
We enter state one again,
and so forth.
146
00:08:36,780 --> 00:08:39,710
We're ignoring everything
that goes on between
147
00:08:39,710 --> 00:08:41,539
entries to state one.
148
00:08:41,539 --> 00:08:44,810
But every time you enter state
1, you're in the same
149
00:08:44,810 --> 00:08:49,490
situation as you were the last
time you entered state one.
150
00:08:49,490 --> 00:08:53,150
You're in the same situation,
in the sense that the
151
00:08:53,150 --> 00:08:58,010
inter-arrivals from state one
to state one again are
152
00:08:58,010 --> 00:08:59,910
independent of what
they were before.
153
00:08:59,910 --> 00:09:03,780
In other words, when you enter
state one, your successive
154
00:09:03,780 --> 00:09:07,650
state transitions from
there are the same
155
00:09:07,650 --> 00:09:08,850
as they were before.
156
00:09:08,850 --> 00:09:15,500
So it's the same situation as we
saw with Poisson processes,
157
00:09:15,500 --> 00:09:19,930
and it's the same kind of
renewal where when you talk
158
00:09:19,930 --> 00:09:25,800
about renewal, you have to be
very careful about what it is
159
00:09:25,800 --> 00:09:26,740
that's a renewal.
160
00:09:26,740 --> 00:09:30,420
Once you're careful about it,
it's clear what's going on.
161
00:09:30,420 --> 00:09:33,000
One of the things we're going to
find out now is one of the
162
00:09:33,000 --> 00:09:36,610
things that we failed to point
out before when we talked
163
00:09:36,610 --> 00:09:39,400
about finite state
Markov chains.
164
00:09:39,400 --> 00:09:43,080
One of the most interesting
characteristics is that the
165
00:09:43,080 --> 00:09:47,620
expected amount of time from one
entry to a recurrent state
166
00:09:47,620 --> 00:09:51,380
until the next time you enter
that recurrent state is 1 over
167
00:09:51,380 --> 00:09:55,390
pi sub i, where pi sub i is a
steady state probability of
168
00:09:55,390 --> 00:09:57,670
that steady state.
169
00:09:57,670 --> 00:09:59,600
Namely, we didn't do that.
170
00:09:59,600 --> 00:10:03,290
It's a little tricky to do that
in terms of Markov chains;
171
00:10:03,290 --> 00:10:07,530
it's almost trivial to do it in
terms of renewal processes.
172
00:10:07,530 --> 00:10:11,450
And what's more, when we do it
in terms of renewal processes,
173
00:10:11,450 --> 00:10:13,230
you will see that it's
obvious, and you will
174
00:10:13,230 --> 00:10:14,410
never forget it.
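Here is a small numerical check of that claim, that the expected return time to a recurrent state is 1 over pi sub i. The two-state chain and its numbers are my own example, not the lecture's; solving pi P = pi for this chain gives pi = (2/3, 1/3), so returns to state 0 should average 1/pi_0 = 1.5 steps.

```python
import random

def mean_return_time(P, state, trials=100_000, seed=1):
    """Average number of steps between successive entries to `state`
    in a two-state Markov chain with transition matrix P."""
    rng = random.Random(seed)
    s = state
    steps = 0
    for _ in range(trials):
        # Take steps until we re-enter `state`; each such cycle
        # is one renewal interval.
        while True:
            s = 0 if rng.random() < P[s][0] else 1
            steps += 1
            if s == state:
                break
    return steps / trials

# Two-state chain; pi P = pi gives pi = (2/3, 1/3), so the expected
# return time to state 0 should be 1 / pi_0 = 1.5.
P = [[0.9, 0.1],
     [0.2, 0.8]]
print(mean_return_time(P, 0))  # close to 1.5
```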
175
00:10:14,410 --> 00:10:17,120
If we did it in terms of Markov
chains, it would be
176
00:10:17,120 --> 00:10:21,700
some long, tedious derivation,
and you'd get this nice
177
00:10:21,700 --> 00:10:24,970
answer, and you say, why did
that nice answer occur?
178
00:10:24,970 --> 00:10:26,500
And you wouldn't
have any idea.
179
00:10:26,500 --> 00:10:29,000
When you look at renewal
processes, it's
180
00:10:29,000 --> 00:10:31,100
obvious why it happens.
181
00:10:31,100 --> 00:10:34,130
And we'll see why that
is very soon.
182
00:10:34,130 --> 00:10:38,310
Also, after we finish renewal
processes, the next thing
183
00:10:38,310 --> 00:10:41,330
we're going to do is to
talk about countable
184
00:10:41,330 --> 00:10:43,090
state Markov chains.
185
00:10:43,090 --> 00:10:46,740
Markov chains with a countably
infinite
186
00:10:46,740 --> 00:10:48,025
set of states.
187
00:10:52,520 --> 00:10:56,100
If you don't have a background
in renewal theory when you
188
00:10:56,100 --> 00:10:59,600
start to look at that, you
get very confused.
189
00:10:59,600 --> 00:11:02,940
So renewal theory will give us
the right tool to look at
190
00:11:02,940 --> 00:11:07,140
those more complicated
Markov chains.
191
00:11:07,140 --> 00:11:07,830
OK.
192
00:11:07,830 --> 00:11:10,620
So the theory of Markov chains
with a countably infinite
193
00:11:10,620 --> 00:11:14,940
state space comes largely
from renewal processes.
194
00:11:14,940 --> 00:11:19,350
So yes, we'll be interested
in understanding that.
195
00:11:19,350 --> 00:11:19,750
OK.
196
00:11:19,750 --> 00:11:25,430
Another example is the G/G/m queue.
197
00:11:25,430 --> 00:11:28,980
The text talked a little bit,
and we might have talked in
198
00:11:28,980 --> 00:11:32,810
class a little bit about this
strange notation queueing
199
00:11:32,810 --> 00:11:34,710
theorists use.
200
00:11:34,710 --> 00:11:37,360
There are always at least three
letters separated by
201
00:11:37,360 --> 00:11:41,140
slashes to talk about
what kind of queue
202
00:11:41,140 --> 00:11:42,670
you're talking about.
203
00:11:42,670 --> 00:11:47,990
The first letter describes the
arrival process for the queue.
204
00:11:47,990 --> 00:11:52,860
G means it's a general arrival
process, which doesn't really
205
00:11:52,860 --> 00:11:54,960
mean it's a general
arrival process.
206
00:11:54,960 --> 00:11:58,210
It means the arrival
process is renewal.
207
00:11:58,210 --> 00:12:03,210
Namely, it says the arrival
process has IID
208
00:12:03,210 --> 00:12:05,280
inter-arrivals.
209
00:12:05,280 --> 00:12:07,890
But you don't know what
their distribution is.
210
00:12:07,890 --> 00:12:11,060
You would call that M if you
meant a Poisson process, which
211
00:12:11,060 --> 00:12:14,780
would mean memoryless
inter-arrivals.
212
00:12:14,780 --> 00:12:19,940
The second G stands for the
service time distribution.
213
00:12:19,940 --> 00:12:23,690
Again, we assume that no matter
how many servers you
214
00:12:23,690 --> 00:12:27,700
have, no matter how the servers
work, the time to
215
00:12:27,700 --> 00:12:31,980
serve one user is independent
of the time
216
00:12:31,980 --> 00:12:33,600
to serve other users.
217
00:12:33,600 --> 00:12:37,370
But that the distribution of
that time has a general
218
00:12:37,370 --> 00:12:38,600
distribution.
219
00:12:38,600 --> 00:12:42,800
It would be M if you meant a
memoryless distribution,
220
00:12:42,800 --> 00:12:45,680
which would mean exponential
distribution.
221
00:12:45,680 --> 00:12:51,110
Finally, the thing at the end
says we're talking about a queue
222
00:12:52,950 --> 00:12:55,180
with m servers.
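As an illustration of the single-server G/G/1 case, successive queueing delays obey Lindley's recursion, W_{k+1} = max(0, W_k + S_k - X_{k+1}). This sketch is mine, not from the lecture; any IID inter-arrival and service distributions can be plugged in, and exponential choices give the M/M/1 special case.

```python
import random

def lindley_waits(n, draw_interarrival, draw_service):
    """Queueing delays of the first n customers in a G/G/1 queue,
    via Lindley's recursion W_{k+1} = max(0, W_k + S_k - X_{k+1}),
    where S_k is the k-th service time and X_{k+1} the next
    inter-arrival interval.  Both sequences are IID ("G" = general).
    """
    waits = [0.0]  # the first customer finds an empty system
    for _ in range(n - 1):
        s = draw_service()
        x = draw_interarrival()
        waits.append(max(0.0, waits[-1] + s - x))
    return waits

random.seed(0)
# M/M/1 special case: memoryless (exponential) arrivals and service.
w = lindley_waits(50_000,
                  lambda: random.expovariate(0.8),  # arrival rate 0.8
                  lambda: random.expovariate(1.0))  # service rate 1.0
print(sum(w) / len(w))  # near the steady-state mean wait, 4 in theory
```

With deterministic times the recursion is easy to check by hand: if service always takes less than an inter-arrival interval, every wait is zero.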
223
00:12:52,950 --> 00:12:55,180
So the point here is we're
talking about a relatively
224
00:12:55,180 --> 00:12:57,690
complicated thing.
225
00:12:57,690 --> 00:13:00,550
Can you talk about this
in terms of renewals?
226
00:13:00,550 --> 00:13:05,290
Yes, you can, but it's not quite
obvious how to do it.
227
00:13:05,290 --> 00:13:09,550
You would think that the obvious
way of viewing a
228
00:13:09,550 --> 00:13:14,620
complicated queue like this is
to look at what happens from
229
00:13:14,620 --> 00:13:18,040
one busy period to the
next busy period.
230
00:13:18,040 --> 00:13:20,590
You would think the busy periods
would be independent
231
00:13:20,590 --> 00:13:21,800
of each other.
232
00:13:21,800 --> 00:13:23,610
But they're not quite.
233
00:13:23,610 --> 00:13:27,820
Suppose you finish one busy
period, and when you finish
234
00:13:27,820 --> 00:13:33,525
the one busy period, one
customer has just finished
235
00:13:33,525 --> 00:13:34,930
being served.
236
00:13:34,930 --> 00:13:38,630
But at that point, you're in the
middle of waiting for
237
00:13:38,630 --> 00:13:41,230
the next customer to arrive.
238
00:13:41,230 --> 00:13:43,630
And as that's a general
distribution, the amount of
239
00:13:43,630 --> 00:13:47,380
time you have to wait for that
next customer to arrive
240
00:13:47,380 --> 00:13:50,120
depends on a whole
lot of things in
241
00:13:50,120 --> 00:13:52,000
the previous interval.
242
00:13:52,000 --> 00:13:54,540
So how can you talk about
renewals here?
243
00:13:54,540 --> 00:13:57,520
You talk about renewals by
waiting until that next
244
00:13:57,520 --> 00:13:58,730
arrival comes.
245
00:13:58,730 --> 00:14:04,270
When that next arrival comes to
terminate the idle period
246
00:14:04,270 --> 00:14:07,970
between busy periods, at that
time you're in the same state
247
00:14:07,970 --> 00:14:10,730
that you were in when the whole
thing started before.
248
00:14:10,730 --> 00:14:14,180
When you had the first
arrival come in.
249
00:14:14,180 --> 00:14:17,480
And at that point, you had one
arrival there being served. You
250
00:14:17,480 --> 00:14:20,360
go through some long
complicated thing.
251
00:14:20,360 --> 00:14:22,730
Eventually the busy
period is over.
252
00:14:22,730 --> 00:14:25,760
Eventually, then, another
arrival comes in.
253
00:14:25,760 --> 00:14:28,920
And presto, at that point,
you're statistically back
254
00:14:28,920 --> 00:14:30,070
where you started.
255
00:14:30,070 --> 00:14:33,560
You're statistically back where
you started in terms of
256
00:14:33,560 --> 00:14:37,400
all inter-arrival times
at that point.
257
00:14:37,400 --> 00:14:40,220
And we will have to, even
though it's intuitively
258
00:14:40,220 --> 00:14:44,300
obvious that those things are
independent of each other,
259
00:14:44,300 --> 00:14:47,380
we're really going to have to
sort that out a little bit,
260
00:14:47,380 --> 00:14:51,290
because you come upon
many situations
261
00:14:51,290 --> 00:14:53,820
where this is not obvious.
262
00:14:53,820 --> 00:14:56,310
So if you don't know how to
sort it out when it is
263
00:14:56,310 --> 00:14:59,480
obvious, you're not going to
know how to sort it out when
264
00:14:59,480 --> 00:15:00,490
it's not obvious.
265
00:15:00,490 --> 00:15:03,840
But anyway, that's another
example of
266
00:15:03,840 --> 00:15:06,650
where we have renewals.
267
00:15:06,650 --> 00:15:08,710
OK.
268
00:15:08,710 --> 00:15:11,760
We want to talk about
convergence now.
269
00:15:11,760 --> 00:15:15,170
This idea of convergence
with probability one.
270
00:15:15,170 --> 00:15:17,990
It's based on the
idea of numbers
271
00:15:17,990 --> 00:15:21,440
converging to some limit.
272
00:15:21,440 --> 00:15:27,430
And I'm always puzzled about how
much to talk about this,
273
00:15:27,430 --> 00:15:31,440
because all of you, when you
first study calculus, talk
274
00:15:31,440 --> 00:15:32,990
about limits.
275
00:15:32,990 --> 00:15:36,650
Most of you, if you're
engineers, when you talk about
276
00:15:36,650 --> 00:15:40,400
calculus, it goes in one ear and
it goes out the other ear,
277
00:15:40,400 --> 00:15:43,900
because you don't have to
understand this very much.
278
00:15:43,900 --> 00:15:46,820
Because all the things you deal
with, the limits exist
279
00:15:46,820 --> 00:15:49,330
very nicely, and there's
no problem.
280
00:15:49,330 --> 00:15:51,170
So you can ignore it.
281
00:15:51,170 --> 00:15:55,310
And then you hear about these
epsilons and deltas, and I do
282
00:15:55,310 --> 00:15:56,930
the same thing.
283
00:15:56,930 --> 00:15:59,840
I can deal with an epsilon,
but as soon as you have an
284
00:15:59,840 --> 00:16:03,890
epsilon and a delta,
I go into orbit.
285
00:16:03,890 --> 00:16:07,210
I have no idea what's going on
anymore until I sit down and
286
00:16:07,210 --> 00:16:09,350
think about it very,
very carefully.
287
00:16:09,350 --> 00:16:13,200
Fortunately, when we have a
sequence of numbers, we only
288
00:16:13,200 --> 00:16:14,060
have an epsilon.
289
00:16:14,060 --> 00:16:15,610
We don't have a delta.
290
00:16:15,610 --> 00:16:17,095
So things are a little
bit simpler.
291
00:16:19,860 --> 00:16:23,260
I should warn you, though, that
you can't let this go in
292
00:16:23,260 --> 00:16:27,250
one ear and out the other ear,
because at this point, we are
293
00:16:27,250 --> 00:16:30,590
using the convergence of numbers
to be able to talk
294
00:16:30,590 --> 00:16:34,330
about convergence of random
variables, and convergence of
295
00:16:34,330 --> 00:16:38,500
random variables is indeed
not a simple topic.
296
00:16:38,500 --> 00:16:43,540
Convergence of numbers is a
simple topic made complicated
297
00:16:43,540 --> 00:16:44,790
by mathematicians.
298
00:16:46,800 --> 00:16:49,760
Any good mathematician,
when they hear me say
299
00:16:49,760 --> 00:16:51,430
this, will be furious.
300
00:16:51,430 --> 00:16:55,210
Because in fact, when you think
about what they've done,
301
00:16:55,210 --> 00:16:59,080
they've taken something which
is simple but looks
302
00:16:59,080 --> 00:17:03,820
complicated, and they've turned
it into something which
303
00:17:03,820 --> 00:17:06,609
looks complicated in another
way, but is really the
304
00:17:06,609 --> 00:17:08,359
simplest way to deal with it.
305
00:17:08,359 --> 00:17:12,680
So let's do that and be done
with it, and then we can start
306
00:17:12,680 --> 00:17:15,140
using it for random variables.
307
00:17:15,140 --> 00:17:19,880
A sequence, b1, b2, b3, and so
forth, of real numbers.
308
00:17:19,880 --> 00:17:22,520
Real numbers or complex
numbers, it doesn't make any
309
00:17:22,520 --> 00:17:30,640
difference, is said to converge
to a limit, b.
310
00:17:30,640 --> 00:17:33,470
If for each real epsilon greater
than zero, there is an
311
00:17:33,470 --> 00:17:37,700
integer m such that bn minus
b is less than or equal to
312
00:17:37,700 --> 00:17:41,460
epsilon for all n greater
than or equal to m.
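The definition can be exercised numerically. This helper is my own sketch, not from the text: it scans a finite prefix of a sequence and, for each epsilon, reports the first index m beyond which every term stays within epsilon of b. For b_n = 1/n the required m grows as epsilon shrinks, exactly as the definition anticipates.

```python
def first_m(b_seq, b, eps):
    """Smallest index m such that |b_n - b| <= eps for every n >= m
    within this finite prefix (a finite-horizon check, not a proof)."""
    m = len(b_seq)  # default: no such m found within the horizon
    for i in range(len(b_seq) - 1, -1, -1):
        if abs(b_seq[i] - b) > eps:
            break  # the tail starting here already violates eps
        m = i
    return m

# b_n = 1/n (0-based here, so seq[i] = 1/(i+1)) converges to 0,
# but the threshold m you need grows as epsilon shrinks.
seq = [1.0 / n for n in range(1, 10_001)]
print([first_m(seq, 0.0, eps) for eps in (0.1, 0.01, 0.001)])  # [9, 99, 999]
```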
313
00:17:41,460 --> 00:17:44,580
Now, how many people can look
at that and understand it?
314
00:17:44,580 --> 00:17:45,510
Be honest.
315
00:17:45,510 --> 00:17:46,250
Good.
316
00:17:46,250 --> 00:17:48,410
Some of you can.
317
00:17:48,410 --> 00:17:53,760
How many people look at that,
and their mind just, ah!
318
00:17:53,760 --> 00:17:57,010
How many people are
in that category?
319
00:17:57,010 --> 00:17:58,020
I am.
320
00:17:58,020 --> 00:18:00,660
But if I'm the only
one, that's good.
321
00:18:00,660 --> 00:18:03,070
OK.
322
00:18:03,070 --> 00:18:06,250
There's an equivalent way
to talk about this.
323
00:18:06,250 --> 00:18:10,110
A sequence of numbers, real or
complex, is said to converge
324
00:18:10,110 --> 00:18:11,420
to limit b.
325
00:18:11,420 --> 00:18:16,480
If for each integer k greater
than zero, there's an integer
326
00:18:16,480 --> 00:18:21,450
m of k, such that bn minus b
is less than or equal to 1
327
00:18:21,450 --> 00:18:25,825
over k for all n greater
than or equal to m.
328
00:18:25,825 --> 00:18:26,210
OK.
329
00:18:26,210 --> 00:18:30,160
And the argument there is pick
any epsilon you want to, no
330
00:18:30,160 --> 00:18:32,670
matter how small.
331
00:18:32,670 --> 00:18:35,930
And then you pick a k, such that
1 over k is less than or
332
00:18:35,930 --> 00:18:37,340
equal to epsilon.
333
00:18:37,340 --> 00:18:42,890
According to this definition, bn
minus b less than or equal
334
00:18:42,890 --> 00:18:48,990
to 1 over k ensures that you
have this condition up here
335
00:18:48,990 --> 00:18:51,120
that we're talking about.
336
00:18:51,120 --> 00:18:56,160
When bn minus b is less than
or equal to 1 over k, then
337
00:18:56,160 --> 00:18:59,950
also bn minus b is less than
or equal to epsilon.
338
00:18:59,950 --> 00:19:02,780
In other words, when you look
at this, you're starting to
339
00:19:02,780 --> 00:19:06,640
see what this definition
really means.
340
00:19:06,640 --> 00:19:12,700
Here, you don't really care
about all epsilon.
341
00:19:12,700 --> 00:19:18,200
All you care about is that
this holds true for small
342
00:19:18,200 --> 00:19:19,740
enough epsilon.
343
00:19:19,740 --> 00:19:23,890
And the trouble is there's
no way to specify a
344
00:19:23,890 --> 00:19:25,570
small enough epsilon.
345
00:19:25,570 --> 00:19:29,670
So the only way we can do this
is to say for all epsilon.
346
00:19:29,670 --> 00:19:35,700
But what the argument is is if
you can assert this statement
347
00:19:35,700 --> 00:19:39,650
for a sequence of smaller and
smaller values of epsilon,
348
00:19:39,650 --> 00:19:41,110
that's all you need.
349
00:19:41,110 --> 00:19:45,510
Because as soon as this is true
for one value of epsilon,
350
00:19:45,510 --> 00:19:48,200
it's true for all smaller
values of epsilon.
351
00:19:48,200 --> 00:19:53,110
Now, let me show you a picture
which, unfortunately, there's
352
00:19:53,110 --> 00:19:56,680
a kind of a complicated
picture.
353
00:19:56,680 --> 00:20:00,900
It's the picture that says what
that argument was really
354
00:20:00,900 --> 00:20:02,410
talking about.
355
00:20:02,410 --> 00:20:05,720
So if you don't understand the
picture, you were kidding
356
00:20:05,720 --> 00:20:08,230
yourself when you said you
thought you understood what
357
00:20:08,230 --> 00:20:10,090
the definition said.
358
00:20:10,090 --> 00:20:12,970
So what the picture says,
it's in terms of
359
00:20:12,970 --> 00:20:16,800
this 1 over k business.
360
00:20:16,800 --> 00:20:21,610
It says if you have a sequence
of numbers, b1, b2, b3, excuse
361
00:20:21,610 --> 00:20:24,220
me for insulting you
by talking about
362
00:20:24,220 --> 00:20:26,302
something so trivial.
363
00:20:26,302 --> 00:20:29,350
But believe me, as soon as we
start talking about random
364
00:20:29,350 --> 00:20:33,500
variables, this trivial thing
mixed with so many other
365
00:20:33,500 --> 00:20:37,160
things will start to become less
trivial, and you really
366
00:20:37,160 --> 00:20:40,320
need to understand what
this is saying.
367
00:20:40,320 --> 00:20:46,020
So we're saying if we have a
sequence, b1, b2, b3, b4, b5
368
00:20:46,020 --> 00:20:52,030
and so forth, what that second
idea of convergence says is
369
00:20:52,030 --> 00:21:02,760
that there's an M1 which says
that for all n greater than or
370
00:21:02,760 --> 00:21:12,500
equal to M1, b4, b5, b6, b7
minus b all lie within this
371
00:21:12,500 --> 00:21:16,660
limit here between b plus
1 and b minus 1.
372
00:21:16,660 --> 00:21:20,360
There's a number M2, which says
that as soon as you get
373
00:21:20,360 --> 00:21:23,570
bigger than M of 2, all
these numbers lie
374
00:21:23,570 --> 00:21:25,390
between these two limits.
375
00:21:25,390 --> 00:21:28,890
There's a number M3, which says
all of these numbers lie
376
00:21:28,890 --> 00:21:30,130
between these limits.
377
00:21:30,130 --> 00:21:34,670
So it's saying that, it's
essentially saying that you
378
00:21:34,670 --> 00:21:40,340
can form a pipe, and as n
you squeeze this
379
00:21:40,340 --> 00:21:42,000
pipe gradually down.
380
00:21:42,000 --> 00:21:44,550
You don't know how fast you
can squeeze it down when
381
00:21:44,550 --> 00:21:46,440
you're talking about
convergence.
382
00:21:46,440 --> 00:21:49,460
You might have something that
converges very slowly, and
383
00:21:49,460 --> 00:21:51,790
then M1 will be way out here.
384
00:21:51,790 --> 00:21:53,700
M2 will be way over there.
385
00:21:53,700 --> 00:21:58,480
M3 will be off on the
other side of Vassar
386
00:21:58,480 --> 00:22:01,160
Street, and so forth.
387
00:22:01,160 --> 00:22:05,650
But there always is such an
M1, M2, and M3, which says
388
00:22:05,650 --> 00:22:09,070
these numbers are getting
closer and closer to b.
389
00:22:09,070 --> 00:22:12,280
And they're staying closer
and closer to b.
390
00:22:12,280 --> 00:22:16,800
An example, which we'll come
back to, where you don't have
391
00:22:16,800 --> 00:22:20,840
convergence is the following
kind of thing.
392
00:22:20,840 --> 00:22:25,030
b1 is equal to 3/4,
in this case.
393
00:22:25,030 --> 00:22:27,470
b5 is equal to 3/4.
394
00:22:27,470 --> 00:22:30,110
b25 is equal to 3/4.
395
00:22:30,110 --> 00:22:34,540
b sub 5 cubed is equal to 3/4.
396
00:22:34,540 --> 00:22:35,380
And so forth.
397
00:22:35,380 --> 00:22:43,110
These values at which b sub n
is equal to 3/4, that is, to
398
00:22:43,110 --> 00:22:46,530
little b plus 3/4, get
more and more rare.
399
00:22:46,530 --> 00:22:51,560
So in some sense, this sequence
here where b2 up to
400
00:22:51,560 --> 00:22:53,190
b4 is zero.
401
00:22:53,190 --> 00:22:57,310
b6 up to b24 is zero
and so forth.
402
00:22:57,310 --> 00:23:00,580
This is some kind of
convergence, also.
403
00:23:00,580 --> 00:23:05,340
But it's not what anyone
would call convergence.
404
00:23:05,340 --> 00:23:08,610
I mean, as far as numbers are
concerned, there's only one
405
00:23:08,610 --> 00:23:11,950
kind of convergence that people
ever talk about, and
406
00:23:11,950 --> 00:23:13,930
it's this kind of convergence
here.
407
00:23:13,930 --> 00:23:17,070
This, although these numbers
are getting close
408
00:23:17,070 --> 00:23:18,820
to b in some sense.
409
00:23:18,820 --> 00:23:21,630
That's not viewed
as convergence.
410
00:23:21,630 --> 00:23:26,950
So here, even though almost all
the numbers are close to
411
00:23:26,950 --> 00:23:30,420
b, they don't stay close
to b, in a sense.
412
00:23:30,420 --> 00:23:34,290
They always pop up at some place
in the future, and that
413
00:23:34,290 --> 00:23:37,160
destroys the whole idea
of convergence.
414
00:23:37,160 --> 00:23:41,920
It destroys most theorems
about convergence.
415
00:23:41,920 --> 00:23:45,430
That's an example where you
don't have convergence.
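That counterexample is easy to make concrete. Following the b1, b5, b25 pattern in the lecture, take b_n = 3/4 when n is a power of 5 and b_n = 0 otherwise; the helper below is my own sketch. Almost every term equals 0, yet for any threshold m there is a later n with b_n = 3/4, so no epsilon smaller than 3/4 can ever work.

```python
def b(n):
    """The lecture's counterexample: b_n = 3/4 when n is a power of 5
    (n = 1, 5, 25, 125, ...), and b_n = 0 otherwise."""
    while n % 5 == 0:
        n //= 5
    return 0.75 if n == 1 else 0.0

# The exceptional indices get rarer and rarer, but they never stop:
# whatever m you pick, some later n still has |b_n - 0| = 3/4,
# so the sequence does not converge in the standard sense.
print([n for n in range(1, 700) if b(n) == 0.75])  # [1, 5, 25, 125, 625]
```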
416
00:23:45,430 --> 00:23:49,100
OK, random variables are really
a lot more complicated
417
00:23:49,100 --> 00:23:50,290
than numbers.
418
00:23:50,290 --> 00:23:57,000
I mean, a random variable is a
function from the sample space
419
00:23:57,000 --> 00:23:59,010
to real numbers.
420
00:23:59,010 --> 00:24:00,800
All of you know that's
not really what a
421
00:24:00,800 --> 00:24:02,470
random variable is.
422
00:24:02,470 --> 00:24:06,100
All of you know that a random
variable is a number that
423
00:24:06,100 --> 00:24:10,110
wiggles around a little bit,
rather than being fixed at
424
00:24:10,110 --> 00:24:12,695
what you ordinarily think of
a number as being, right?
425
00:24:16,990 --> 00:24:22,350
Since that's a very imprecise
notion, and the precise notion
426
00:24:22,350 --> 00:24:26,150
is very complicated, to build up
your intuition about this,
427
00:24:26,150 --> 00:24:29,830
you have to really think hard
about what convergence of
428
00:24:29,830 --> 00:24:32,300
random variables means.
429
00:24:32,300 --> 00:24:35,230
For convergence in
distribution, it's not the
430
00:24:35,230 --> 00:24:38,140
random variables, but the
distribution function of the
431
00:24:38,140 --> 00:24:40,100
random variables
that converge.
432
00:24:40,100 --> 00:24:43,710
In other words, in the
distribution function of z sub
433
00:24:43,710 --> 00:24:46,940
n, where you have a sequence of
random variables, z1, z2,
434
00:24:46,940 --> 00:24:50,520
z3 and so forth, the
distribution function
435
00:24:50,520 --> 00:24:56,890
evaluated at each real value z
converges for each z in the
436
00:24:56,890 --> 00:25:01,370
case where the distribution
function of this final
437
00:25:01,370 --> 00:25:04,740
convergent random variable
is continuous.
438
00:25:04,740 --> 00:25:06,600
We all studied that.
439
00:25:06,600 --> 00:25:08,370
We know what that means now.
440
00:25:08,370 --> 00:25:11,490
For convergence in probability,
we talked about
441
00:25:11,490 --> 00:25:14,270
convergence in probability
in two ways.
442
00:25:14,270 --> 00:25:16,910
One with an epsilon
and a delta.
443
00:25:16,910 --> 00:25:19,480
And then saying for every
epsilon and delta that isn't
444
00:25:19,480 --> 00:25:22,130
big enough, something happens.
445
00:25:22,130 --> 00:25:25,730
And then we saw that it was a
little easier to describe.
446
00:25:25,730 --> 00:25:28,570
It was a little easier to
describe by saying the
447
00:25:28,570 --> 00:25:30,880
convergence in probability.
448
00:25:30,880 --> 00:25:33,460
These distribution
functions have to
449
00:25:33,460 --> 00:25:34,810
converge to a unit step.
450
00:25:34,810 --> 00:25:35,620
And that's enough.
451
00:25:35,620 --> 00:25:39,230
They converge to a unit step
at every z except
452
00:25:39,230 --> 00:25:41,570
where the step is.
453
00:25:41,570 --> 00:25:42,920
We talked about that.
454
00:25:42,920 --> 00:25:46,130
For convergence with probability
one, and this is
455
00:25:46,130 --> 00:25:50,030
the thing we want to talk about
today, this is the one
456
00:25:50,030 --> 00:25:54,020
that sounds so easy, and
which is really tricky.
457
00:25:54,020 --> 00:25:55,990
I don't want to scare
you about this.
458
00:25:55,990 --> 00:26:00,660
If you're not scared about
it to start with, I don't
459
00:26:00,660 --> 00:26:01,450
want to scare you.
460
00:26:01,450 --> 00:26:05,670
But I would like to convince
you that if you think you
461
00:26:05,670 --> 00:26:09,260
understand it and you haven't
spent a lot of time thinking
462
00:26:09,260 --> 00:26:12,070
about it, you're probably
due for a rude
463
00:26:12,070 --> 00:26:14,370
awakening at some point.
464
00:26:14,370 --> 00:26:17,870
So for convergence with
probability one, the set of
465
00:26:17,870 --> 00:26:22,370
sample paths that converge
has probability one.
466
00:26:22,370 --> 00:26:27,930
In other words, the sequence Y1,
Y2 converges to zero with
467
00:26:27,930 --> 00:26:30,290
probability one.
468
00:26:30,290 --> 00:26:33,070
And now I'm going to talk
about converging to zero
469
00:26:33,070 --> 00:26:35,820
rather than converging to
some random variable.
470
00:26:35,820 --> 00:26:39,120
Because if you're interested
in a sequence of random
471
00:26:39,120 --> 00:26:43,550
variables Z1, Z2 that converge
to some other random variable
472
00:26:43,550 --> 00:26:47,570
Z, you can get rid of a lot of
the complication by just
473
00:26:47,570 --> 00:26:51,310
saying, let's define a random
variable y sub n, which is
474
00:26:51,310 --> 00:26:53,570
equal to z sub n minus z.
475
00:26:53,570 --> 00:26:56,430
And then what we're interested
in is do these random
476
00:26:56,430 --> 00:26:59,490
variables y sub n
converge to 0.
477
00:26:59,490 --> 00:27:02,980
We can forget about what it's
converging to, and only worry
478
00:27:02,980 --> 00:27:05,560
about it converging to 0.
479
00:27:05,560 --> 00:27:11,270
OK, so when we do that, this
sequence of random variables
480
00:27:11,270 --> 00:27:15,160
converges to 0 with
probability 1.
481
00:27:15,160 --> 00:27:22,800
If the probability of the set of
sample points for which the
482
00:27:22,800 --> 00:27:25,410
sample path converges to 0.
483
00:27:25,410 --> 00:27:31,310
If that set of sample paths
has probability 1--
484
00:27:31,310 --> 00:27:36,050
namely, for almost everything
in the space, for almost
485
00:27:36,050 --> 00:27:39,820
everything in its peculiar
sense of probability--
486
00:27:39,820 --> 00:27:44,050
if that holds true, then you say
you have convergence with
487
00:27:44,050 --> 00:27:45,300
probability 1.
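In symbols, the definition just stated, convergence to 0 with probability 1, reads:

```latex
% Convergence with probability 1 (almost-sure convergence) to 0:
% the set of sample points whose sample paths converge has probability 1.
\Pr\Bigl\{\,\omega \in \Omega \;:\; \lim_{n \to \infty} Y_n(\omega) = 0 \,\Bigr\} = 1 .
```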
488
00:27:49,440 --> 00:27:51,440
Now, that looks straightforward,
489
00:27:51,440 --> 00:27:54,770
and I hope it is.
490
00:27:54,770 --> 00:27:56,590
You can memorize it
or do whatever you
491
00:27:56,590 --> 00:27:59,270
want to do with it.
492
00:27:59,270 --> 00:28:02,340
We're going to go on now and
prove an important theorem
493
00:28:02,340 --> 00:28:05,260
about convergence with
probability 1.
494
00:28:05,260 --> 00:28:09,160
I'm going to give a proof here
in class that's a little more
495
00:28:09,160 --> 00:28:12,610
detailed than the proof
I give in the notes.
496
00:28:12,610 --> 00:28:15,280
I don't like to give
proofs in class.
497
00:28:15,280 --> 00:28:19,275
I think it's a lousy idea
because when you're studying a
498
00:28:19,275 --> 00:28:22,940
proof, you have to go
at your own pace.
499
00:28:22,940 --> 00:28:29,420
But the problem is, I
know that students--
500
00:28:29,420 --> 00:28:31,200
and I was once a
student myself,
501
00:28:31,200 --> 00:28:33,170
and I'm still a student.
502
00:28:33,170 --> 00:28:38,165
If I see a proof, I will only
look at enough of it to say,
503
00:28:38,165 --> 00:28:40,000
ah, I get the idea of it.
504
00:28:40,000 --> 00:28:41,610
And then I will stop.
505
00:28:41,610 --> 00:28:44,840
And for this one, you need a
little more than the idea of
506
00:28:44,840 --> 00:28:46,410
it because it's something
we're going to
507
00:28:46,410 --> 00:28:48,020
build on all the time.
508
00:28:48,020 --> 00:28:51,300
So I want to go through
this proof carefully.
509
00:28:51,300 --> 00:28:56,990
And I hope that most of you
will follow most of it.
510
00:28:56,990 --> 00:28:59,560
And the parts of it that you
don't follow, I hope you'll go
511
00:28:59,560 --> 00:29:02,010
back and think about
it, because
512
00:29:02,010 --> 00:29:05,290
this is really important.
513
00:29:05,290 --> 00:29:10,630
OK, so the theorem says, let
this sequence of random
514
00:29:10,630 --> 00:29:17,080
variables satisfy the expected
value of the magnitude of Y
515
00:29:17,080 --> 00:29:23,360
sub n, the sum from n equals 1
to infinity of this is less
516
00:29:23,360 --> 00:29:24,610
than infinity.
517
00:29:27,740 --> 00:29:29,990
As usual there's a
misprint there.
518
00:29:29,990 --> 00:29:33,280
The sum, the expected
value of Yn, the
519
00:29:33,280 --> 00:29:34,970
bracket should be there.
520
00:29:34,970 --> 00:29:36,560
It's supposed to be less
than infinity.
521
00:29:40,820 --> 00:29:42,780
Let me write that down.
522
00:29:42,780 --> 00:29:49,720
The sum from n equals 1 to
infinity of expected value of
523
00:29:49,720 --> 00:29:56,330
the magnitude of Y sub n
is less than infinity.
524
00:29:56,330 --> 00:29:59,340
So it's a finite sum.
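Written out, the hypothesis and conclusion of the theorem are:

```latex
% Theorem: a summability condition on the expected magnitudes
% forces convergence to 0 with probability 1.
\sum_{n=1}^{\infty} \mathsf{E}\bigl[\,|Y_n|\,\bigr] < \infty
\quad \Longrightarrow \quad
\Pr\Bigl\{\,\omega : \lim_{n\to\infty} Y_n(\omega) = 0 \,\Bigr\} = 1 .
```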
525
00:29:59,340 --> 00:30:03,960
So we're talking about these
Yn's when we start talking
526
00:30:03,960 --> 00:30:07,350
about the strong law
of large numbers.
527
00:30:07,350 --> 00:30:12,300
Yn is going to be something like
the sum of X sub i from i equals 1
528
00:30:12,300 --> 00:30:15,730
to m, divided by m.
529
00:30:15,730 --> 00:30:17,800
In other words, it's going to
be the sample average, or
530
00:30:17,800 --> 00:30:19,090
something like that.
531
00:30:19,090 --> 00:30:22,650
And these sample averages, if
you have a mean 0, are going
532
00:30:22,650 --> 00:30:23,850
to get small.
533
00:30:23,850 --> 00:30:26,620
The question is, when you sum
all of these things that are
534
00:30:26,620 --> 00:30:31,120
getting small, do you still get
something which is small?
535
00:30:31,120 --> 00:30:34,490
When you're dealing with the
weak law of large numbers,
536
00:30:34,490 --> 00:30:37,640
it's not necessary that
that sum gets small.
537
00:30:37,640 --> 00:30:40,910
It's only necessary that each
of the terms get small.
538
00:30:40,910 --> 00:30:43,660
Here we're saying, let's
assume also that
539
00:30:43,660 --> 00:30:46,560
this sum gets small.
540
00:30:46,560 --> 00:30:50,930
OK, so we want to prove that
under this condition, all of
541
00:30:50,930 --> 00:30:57,430
these sequences with probability
1 converge to 0,
542
00:30:57,430 --> 00:31:00,690
the individual sequences
converge.
543
00:31:00,690 --> 00:31:04,520
OK, so let's go through
the proof now.
544
00:31:04,520 --> 00:31:08,560
And as I say, I won't do
this to you very often.
545
00:31:08,560 --> 00:31:14,110
But I think for this one,
it's sort of necessary.
546
00:31:14,110 --> 00:31:18,000
OK, so first we'll use the
Markov inequality.
547
00:31:18,000 --> 00:31:22,310
And I'm dealing with a finite
value of m here.
548
00:31:22,310 --> 00:31:26,900
The probability that the sum of
a finite set of these Y sub
549
00:31:26,900 --> 00:31:30,960
n's is greater than alpha is
less than or equal to the
550
00:31:30,960 --> 00:31:33,890
expected value of that
random variable.
551
00:31:33,890 --> 00:31:36,730
Namely, this random
variable here.
552
00:31:36,730 --> 00:31:40,930
Sum from n equals 1 to m of
magnitude of Y sub n.
553
00:31:40,930 --> 00:31:43,560
That's just a random variable.
554
00:31:43,560 --> 00:31:46,640
And the probability that that
random variable is greater
555
00:31:46,640 --> 00:31:49,610
than alpha is less than or equal
to the expected value of
556
00:31:49,610 --> 00:31:52,980
that random variable
divided by alpha.
557
00:31:52,980 --> 00:32:02,850
OK, well now, this quantity here
is increasing in Y sub n.
558
00:32:02,850 --> 00:32:06,510
The magnitude of Y sub n is
a non-negative quantity.
559
00:32:06,510 --> 00:32:10,860
You take the expectation of a
non-negative quantity, if it
560
00:32:10,860 --> 00:32:15,470
has an expectation, which we're
assuming here for this
561
00:32:15,470 --> 00:32:18,680
to be less than infinity, all of
these things have to have
562
00:32:18,680 --> 00:32:20,800
expectations.
563
00:32:20,800 --> 00:32:26,990
So as we increase m, this
gets bigger and bigger.
564
00:32:26,990 --> 00:32:31,720
So this quantity here is going
to be less than or equal to
565
00:32:31,720 --> 00:32:35,370
the sum from n equals 1 to
infinity of expected value of
566
00:32:35,370 --> 00:32:37,470
Y sub n divided by alpha.
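In symbols, the chain of steps just traced is the Markov inequality on the finite partial sum, then linearity of expectation, then monotonicity in m:

```latex
% Markov inequality on the finite partial sum, then expectation of a
% finite sum = finite sum of expectations, then bounding by the
% infinite (finite by hypothesis) sum:
\Pr\Bigl\{\,\sum_{n=1}^{m} |Y_n| > \alpha\,\Bigr\}
\;\le\; \frac{1}{\alpha}\,\mathsf{E}\Bigl[\sum_{n=1}^{m} |Y_n|\Bigr]
\;=\; \frac{1}{\alpha}\sum_{n=1}^{m} \mathsf{E}\bigl[\,|Y_n|\,\bigr]
\;\le\; \frac{1}{\alpha}\sum_{n=1}^{\infty} \mathsf{E}\bigl[\,|Y_n|\,\bigr] .
```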
567
00:32:37,470 --> 00:32:41,830
What I'm being careful about
here is all of the things that
568
00:32:41,830 --> 00:32:47,170
happen when you go from finite
m to infinite m.
569
00:32:47,170 --> 00:32:52,260
And I'm using what you know
about finite m, and then being
570
00:32:52,260 --> 00:32:56,080
very careful about going
to infinite m.
571
00:32:56,080 --> 00:32:59,340
And I'm going to try to explain
why as we do it.
572
00:32:59,340 --> 00:33:01,570
But here, it's straightforward.
573
00:33:01,570 --> 00:33:07,090
The expected value of a finite
sum is equal to the finite sum
574
00:33:07,090 --> 00:33:08,580
of an expected value.
575
00:33:08,580 --> 00:33:11,560
When you go to the limit, m goes
to infinity, you don't
576
00:33:11,560 --> 00:33:15,230
know whether these expected
values exist or not.
577
00:33:15,230 --> 00:33:18,990
You're sort of confused on both
sides of this equation.
578
00:33:18,990 --> 00:33:21,550
So we're sticking to
finite values here.
579
00:33:21,550 --> 00:33:25,710
Then, we're taking this
quantity, going to the limit
580
00:33:25,710 --> 00:33:27,450
as m goes to infinity.
581
00:33:27,450 --> 00:33:30,570
This quantity has to get bigger
and bigger as m goes to
582
00:33:30,570 --> 00:33:33,870
infinity, so this quantity
has to be less
583
00:33:33,870 --> 00:33:36,070
than or equal to this.
584
00:33:36,070 --> 00:33:40,980
This now, for a given alpha,
is just a number.
585
00:33:40,980 --> 00:33:43,820
It's nothing more than a number,
so we can deal with
586
00:33:43,820 --> 00:33:47,420
this pretty easily as we
make alpha big enough.
587
00:33:47,420 --> 00:33:50,050
But for most of the argument,
we're going to view alpha as
588
00:33:50,050 --> 00:33:51,700
being fixed.
589
00:33:51,700 --> 00:33:58,590
OK, so now the probability that
this sum, finite sum is
590
00:33:58,590 --> 00:34:01,920
greater than alpha, is less
than or equal to this.
591
00:34:01,920 --> 00:34:03,620
This was the thing we just
finished proving
592
00:34:03,620 --> 00:34:04,870
on the other page.
593
00:34:09,110 --> 00:34:11,600
This is less than or
equal to that.
594
00:34:11,600 --> 00:34:17,969
That's what I repeated, so I'm
not cheating you at all here.
595
00:34:17,969 --> 00:34:21,400
Now, it's a pain to write
that down all the time.
596
00:34:21,400 --> 00:34:27,580
So let's let the set, A sub m,
be the set of sample points
597
00:34:27,580 --> 00:34:32,380
such that this finite sum
of Y sub n of omega is
598
00:34:32,380 --> 00:34:34,170
greater than alpha.
599
00:34:34,170 --> 00:34:35,420
This is a random--
600
00:34:39,260 --> 00:34:43,150
for each value of omega,
this is just a number.
601
00:34:43,150 --> 00:34:49,710
The sum of the magnitude of Y
sub n is a random variable.
602
00:34:49,710 --> 00:34:53,270
It takes on a numerical
value for every omega
603
00:34:53,270 --> 00:34:54,760
in the sample space.
604
00:34:54,760 --> 00:34:59,140
So A sub m is the set of points
in the sample space for
605
00:34:59,140 --> 00:35:02,600
which this quantity here
is bigger than alpha.
606
00:35:02,600 --> 00:35:08,010
So we can rewrite this
now as just the
607
00:35:08,010 --> 00:35:10,440
probability of A sub m.
608
00:35:10,440 --> 00:35:14,680
So this is equivalent to saying,
the probability of A
609
00:35:14,680 --> 00:35:18,120
sub m is less than or equal
to this number here.
610
00:35:18,120 --> 00:35:20,700
For a fixed alpha,
this is a number.
611
00:35:20,700 --> 00:35:24,500
This is something which
can vary with m.
612
00:35:24,500 --> 00:35:29,560
Since these numbers here now,
now we're dealing with a
613
00:35:29,560 --> 00:35:32,710
sample space, which is
a little strange.
614
00:35:32,710 --> 00:35:39,570
We're talking about sample
points and we're saying, this
615
00:35:39,570 --> 00:35:46,320
number here, this magnitude of Y
sub n at a particular sample
616
00:35:46,320 --> 00:35:50,040
point omega is greater
than or equal to 0.
617
00:35:50,040 --> 00:35:55,050
Therefore, A sub m is a subset
of A sub m plus 1.
618
00:35:55,050 --> 00:36:03,650
In other words, as m gets larger
and larger here, m here
619
00:36:03,650 --> 00:36:05,450
gets larger and larger.
620
00:36:05,450 --> 00:36:08,950
Therefore, this sum here
gets larger and larger.
621
00:36:08,950 --> 00:36:14,240
Therefore, the set of omega for
which this increasing sum
622
00:36:14,240 --> 00:36:17,370
is greater than alpha gets
bigger and bigger.
623
00:36:17,370 --> 00:36:20,430
And that's the thing that we're
saying here, A sub m is
624
00:36:20,430 --> 00:36:24,990
included in A sub m plus 1 for
m greater than or equal to 1.
625
00:36:24,990 --> 00:36:28,920
OK, so the left side of this
quantity here, as a function
626
00:36:28,920 --> 00:36:32,370
of m, is a non-decreasing
bounded
627
00:36:32,370 --> 00:36:34,305
sequence of real numbers.
628
00:36:41,420 --> 00:36:43,090
Yes, the probability
of something
629
00:36:43,090 --> 00:36:44,120
is just a real number.
630
00:36:44,120 --> 00:36:46,970
A probability is a number.
631
00:36:46,970 --> 00:36:50,230
So this quantity here
is a real number.
632
00:36:50,230 --> 00:36:53,810
It's a real number which is
non-decreasing, so it keeps
633
00:36:53,810 --> 00:36:54,920
moving upward.
634
00:36:54,920 --> 00:36:59,330
What I'm trying to do now
is this: I went to
635
00:36:59,330 --> 00:37:00,300
the limit over here.
636
00:37:00,300 --> 00:37:03,110
I want to go to the
limit here.
637
00:37:03,110 --> 00:37:08,680
And so I have a sequence
of numbers in m.
638
00:37:08,680 --> 00:37:12,500
This sequence of numbers
is non-decreasing.
639
00:37:12,500 --> 00:37:14,250
So it's moving up.
640
00:37:14,250 --> 00:37:17,110
Every one of those quantities
is bounded by
641
00:37:17,110 --> 00:37:18,840
this quantity here.
642
00:37:18,840 --> 00:37:23,350
So I have an increasing sequence
of real numbers,
643
00:37:23,350 --> 00:37:25,960
which is bounded on the top.
644
00:37:25,960 --> 00:37:28,280
What happens?
645
00:37:28,280 --> 00:37:30,810
When you have a sequence
of real
646
00:37:30,810 --> 00:37:32,780
numbers which is bounded--
647
00:37:36,200 --> 00:37:38,810
I have a slide to prove this,
but I'm not going to prove it
648
00:37:38,810 --> 00:37:40,953
because it's tedious.
649
00:37:46,470 --> 00:37:53,210
Here we have this probability
which I'm calling the
650
00:37:53,210 --> 00:37:59,740
probability of A sub m.
651
00:37:59,740 --> 00:38:07,730
Here I have the probability of
A sub m plus 1, and so forth.
652
00:38:07,730 --> 00:38:09,980
Here I have this
limit up here.
653
00:38:09,980 --> 00:38:13,850
All of this sequence of numbers,
there's an infinite
654
00:38:13,850 --> 00:38:15,770
sequence of them.
655
00:38:15,770 --> 00:38:17,650
They're all non-decreasing.
656
00:38:17,650 --> 00:38:20,780
They're all bounded by
this number here.
657
00:38:20,780 --> 00:38:23,300
And what happens?
658
00:38:23,300 --> 00:38:27,600
Well, either we go up to there
as a limit or else we stop
659
00:38:27,600 --> 00:38:30,890
sometime earlier as a limit.
660
00:38:30,890 --> 00:38:36,600
I should prove this, but it's
something we use all the time.
661
00:38:36,600 --> 00:38:43,190
It's a sequence of increasing
or non-decreasing numbers.
662
00:38:43,190 --> 00:38:46,420
If it's bounded by something, it
has to have a finite limit.
663
00:38:46,420 --> 00:38:49,390
The limit is less than or
equal to this quantity.
664
00:38:49,390 --> 00:38:53,690
It might be strictly less, but
the limit has to exist.
665
00:38:53,690 --> 00:38:55,620
And the limit has to be less
than or equal to b.
666
00:38:59,900 --> 00:39:02,240
OK, that's what we're
saying here.
667
00:39:02,240 --> 00:39:10,510
When we go to this limit, this
limit of the probability of A
668
00:39:10,510 --> 00:39:14,380
sub m is less than or equal
to this number here.
669
00:39:14,380 --> 00:39:19,310
OK, if I use this property of
nesting intervals, when you
670
00:39:19,310 --> 00:39:26,860
have A sub 1 nested inside of
A sub 2, nested inside of A
671
00:39:26,860 --> 00:39:33,380
sub 3, what we'd like to go
do is go to this limit.
672
00:39:33,380 --> 00:39:35,210
The limit, unfortunately,
doesn't make
673
00:39:35,210 --> 00:39:36,460
any sense in general.
674
00:39:39,130 --> 00:39:43,240
With this property of the
axioms, equation number 9
675
00:39:43,240 --> 00:39:47,490
in chapter 1 says that
we can do something
676
00:39:47,490 --> 00:39:51,190
that's almost as good.
677
00:39:51,190 --> 00:40:00,630
What it says that as we go to
this limit here, what we get
678
00:40:00,630 --> 00:40:06,830
is that this limit is
the probability of
679
00:40:06,830 --> 00:40:08,960
this infinite union.
680
00:40:08,960 --> 00:40:13,410
That's equal to the limit
as m goes to infinity of
681
00:40:13,410 --> 00:40:16,655
probability of A sub m.
682
00:40:16,655 --> 00:40:20,810
OK, look up equation 9,
and you'll see that's
683
00:40:20,810 --> 00:40:22,120
exactly what it says.
684
00:40:22,120 --> 00:40:25,870
If you think this is
obvious, it's not.
685
00:40:25,870 --> 00:40:29,350
It ain't obvious at all because
it's not even clear
686
00:40:29,350 --> 00:40:32,380
that this--
687
00:40:32,380 --> 00:40:35,090
well, nothing very much about
this union is clear.
688
00:40:35,090 --> 00:40:38,570
We know that this union must
be a measurable set.
689
00:40:38,570 --> 00:40:40,280
It must have a probability.
690
00:40:40,280 --> 00:40:42,090
We don't know much
more about it.
691
00:40:42,090 --> 00:40:45,340
But anyway, that property tells
us that this is true.
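The property being invoked (equation 9 of chapter 1 in the notes) can be written out for nested events as:

```latex
% Continuity of probability along a nested (increasing) sequence of
% events A_1 \subseteq A_2 \subseteq A_3 \subseteq \cdots :
\Pr\Bigl\{\,\bigcup_{m=1}^{\infty} A_m\,\Bigr\}
\;=\; \lim_{m \to \infty} \Pr\{A_m\} .
```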
692
00:41:03,190 --> 00:41:06,040
OK, so where we are
at this point.
693
00:41:06,040 --> 00:41:09,680
I don't think I've skipped
something, have I?
694
00:41:09,680 --> 00:41:13,430
Oh, no, that's the thing I
didn't want to talk about.
695
00:41:13,430 --> 00:41:19,720
OK, so A sub m is a set
of omega which satisfy
696
00:41:19,720 --> 00:41:21,970
this for finite m.
697
00:41:21,970 --> 00:41:28,230
The probability of this union
is then the union of all of
698
00:41:28,230 --> 00:41:30,470
these quantities over all m.
699
00:41:30,470 --> 00:41:35,050
And this is less than or equal
to this bound that we had.
700
00:41:35,050 --> 00:41:44,890
OK, so I even hate giving proofs
of this sort because
701
00:41:44,890 --> 00:41:46,960
it's a set of simple ideas.
702
00:41:46,960 --> 00:41:49,980
To track down every one
of them is difficult.
703
00:41:49,980 --> 00:41:53,320
The text doesn't track down
every one of them.
704
00:41:53,320 --> 00:41:55,750
And that's what I'm
trying to do here.
705
00:42:06,580 --> 00:42:11,870
We have two possibilities here,
and we're looking at
706
00:42:11,870 --> 00:42:13,330
this limit here.
707
00:42:13,330 --> 00:42:20,390
This limiting sum, which for
each omega is just a sequence,
708
00:42:20,390 --> 00:42:23,500
a non-decreasing sequence
of real numbers.
709
00:42:23,500 --> 00:42:27,940
So one possibility is that this
sequence of real numbers
710
00:42:27,940 --> 00:42:30,000
is bigger than alpha.
711
00:42:30,000 --> 00:42:32,390
The other possibility
is that it's less
712
00:42:32,390 --> 00:42:34,050
than or equal to alpha.
713
00:42:34,050 --> 00:42:37,660
If it's less than or equal to
alpha, then every one of these
714
00:42:37,660 --> 00:42:43,920
numbers is less than or equal
to alpha and omega has to be
715
00:42:43,920 --> 00:42:47,230
not in this union here.
716
00:42:47,230 --> 00:42:50,800
If the sum is bigger than
alpha, then one of the
717
00:42:50,800 --> 00:42:53,500
elements in this set is
bigger than alpha and
718
00:42:53,500 --> 00:42:55,550
omega is in this set.
719
00:42:55,550 --> 00:43:00,300
So what all of that says, and
you're just going to have to
720
00:43:00,300 --> 00:43:03,301
look at that because
it's not--
721
00:43:03,301 --> 00:43:05,760
it's one of these tedious
arguments.
722
00:43:05,760 --> 00:43:09,690
So the probability of omega such
that this sum is greater
723
00:43:09,690 --> 00:43:13,030
than alpha is less than or equal
to this number here.
724
00:43:13,030 --> 00:43:16,770
At this point, we have
made a major change
725
00:43:16,770 --> 00:43:19,000
in what we're doing.
726
00:43:19,000 --> 00:43:24,750
Before we were talking about
numbers like probabilities,
727
00:43:24,750 --> 00:43:27,400
numbers like expected values.
728
00:43:27,400 --> 00:43:33,020
Here, suddenly, we are talking
about sample points.
729
00:43:33,020 --> 00:43:35,870
We're talking about the
probability of a set of sample
730
00:43:35,870 --> 00:43:40,340
points, such that the sum
is greater than alpha.
731
00:43:40,340 --> 00:43:41,358
Yes?
732
00:43:41,358 --> 00:43:43,207
AUDIENCE: I understand how if
the whole sum is less than or
733
00:43:43,207 --> 00:43:45,840
equal to alpha, then
every element is.
734
00:43:45,840 --> 00:43:49,160
But did you say that if it's
greater than alpha, then at
735
00:43:49,160 --> 00:43:50,360
least one element is
greater than alpha?
736
00:43:50,360 --> 00:43:51,610
Why is that?
737
00:43:57,660 --> 00:44:00,430
PROFESSOR: Well, because either
the sum is less than or
738
00:44:00,430 --> 00:44:03,670
equal to alpha or it's
greater than alpha.
739
00:44:03,670 --> 00:44:09,230
And if it's less than or equal
to alpha, then omega
740
00:44:09,230 --> 00:44:11,570
is not in this set.
741
00:44:11,570 --> 00:44:17,860
So the alternative is that omega
has to be in this set.
742
00:44:17,860 --> 00:44:20,540
Except the other way of looking
at it is if you have a
743
00:44:20,540 --> 00:44:25,390
sequence of numbers, which is
approaching a limit, and that
744
00:44:25,390 --> 00:44:31,870
limit is bigger than alpha, then
one of the terms has to
745
00:44:31,870 --> 00:44:33,990
be bigger than alpha.
746
00:44:33,990 --> 00:44:34,924
Yes?
747
00:44:34,924 --> 00:44:36,406
AUDIENCE: I think the confusion
is between the
748
00:44:36,406 --> 00:44:38,876
partial sums and the
terms of the sum.
749
00:44:38,876 --> 00:44:40,852
That's what he's confusing.
750
00:44:40,852 --> 00:44:43,322
Does that make sense?
751
00:44:43,322 --> 00:44:45,298
He's saying instead of
each partial sum, not
752
00:44:45,298 --> 00:44:46,548
each term in the sum.
753
00:44:50,750 --> 00:44:52,000
PROFESSOR: Yes.
754
00:44:58,540 --> 00:45:00,510
Except I don't see how that
answers the question.
755
00:45:11,000 --> 00:45:16,630
Except the point here is, if
each partial sum is less than
756
00:45:16,630 --> 00:45:20,260
or equal to alpha, then the
limit has to be less than or
757
00:45:20,260 --> 00:45:20,960
equal to alpha.
758
00:45:20,960 --> 00:45:22,450
That's what I was saying
on the other page.
759
00:45:22,450 --> 00:45:25,130
If you have a sequence of
numbers, which has an upper
760
00:45:25,130 --> 00:45:28,810
bound on them, then you
have to have a limit.
761
00:45:28,810 --> 00:45:31,900
And that limit has to be less
than or equal to alpha.
762
00:45:31,900 --> 00:45:34,090
So that's this case here.
763
00:45:34,090 --> 00:45:38,660
We have a sum of numbers as
we're going to the limit as m
764
00:45:38,660 --> 00:45:42,490
gets larger and larger,
these partial sums
765
00:45:42,490 --> 00:45:45,210
have to go to a limit.
766
00:45:45,210 --> 00:45:48,070
The partial sums are all less
than or equal to alpha.
767
00:45:48,070 --> 00:45:51,130
Then the infinite sum is less
than or equal to alpha, and
768
00:45:51,130 --> 00:45:53,810
omega is not in this set here.
769
00:45:53,810 --> 00:45:55,540
And otherwise, it is.
770
00:45:55,540 --> 00:45:59,220
OK, if I talk more about it,
I'll get more confused.
771
00:45:59,220 --> 00:46:01,985
So I think the slides
are clear.
772
00:46:09,020 --> 00:46:17,710
Now, if we look at the case
where alpha is greater than or
773
00:46:17,710 --> 00:46:26,180
equal to this sum, and we take
the complement of the set, the
774
00:46:26,180 --> 00:46:30,380
probability of the set of omega
for which this sum is
775
00:46:30,380 --> 00:46:32,900
less than or equal
to alpha has--
776
00:46:32,900 --> 00:46:36,830
oh, let's forget about
this for the moment.
777
00:46:36,830 --> 00:46:40,440
If I take the complement of this
set, the probability of
778
00:46:40,440 --> 00:46:43,560
the set of omega, such that the
sum is less than or equal
779
00:46:43,560 --> 00:46:47,190
to alpha, is greater
than 1 minus this
780
00:46:47,190 --> 00:46:48,820
expected value here.
781
00:46:48,820 --> 00:46:51,480
Now I'm saying, let's look at
the case where alpha is big
782
00:46:51,480 --> 00:46:55,280
enough that it's greater
than this number here.
783
00:46:55,280 --> 00:47:00,820
So this probability is greater
than 1 minus this number.
784
00:47:00,820 --> 00:47:06,720
So if the sum is less than or
equal to alpha for any given
785
00:47:06,720 --> 00:47:12,220
omega, then this quantity
here converges.
786
00:47:12,220 --> 00:47:14,620
Now I'm talking about
sample sequences.
787
00:47:14,620 --> 00:47:18,630
I'm saying I have an increasing
sequence of numbers
788
00:47:18,630 --> 00:47:21,540
corresponding to one particular
sample point.
789
00:47:21,540 --> 00:47:25,500
This increasing set of numbers
is less than or equal.
790
00:47:25,500 --> 00:47:28,760
Each element of it is less than
or equal to alpha, so the
791
00:47:28,760 --> 00:47:31,820
limit of it is less than
or equal to alpha.
792
00:47:31,820 --> 00:47:37,050
And what that says is the limit
of Y sub n of omega,
793
00:47:37,050 --> 00:47:39,760
this has to be equal to 0
for that sample point.
794
00:47:39,760 --> 00:47:41,500
This is all the sample
point argument.
795
00:47:46,280 --> 00:47:53,500
And what that says then is the
probability of omega, such
796
00:47:53,500 --> 00:47:57,580
that this limit here is equal
to 0, that's this quantity
797
00:47:57,580 --> 00:48:01,310
here, which is the same as this
quantity, which has to be
798
00:48:01,310 --> 00:48:02,935
greater than this quantity.
799
00:48:08,570 --> 00:48:11,100
This implies this.
800
00:48:11,100 --> 00:48:16,260
Therefore, the probability of
this has to be bigger than
801
00:48:16,260 --> 00:48:17,950
this probability here.
802
00:48:17,950 --> 00:48:22,480
Now, if we let alpha go to
infinity, what that says is
803
00:48:22,480 --> 00:48:26,310
this quantity goes to 0 and the
probability of the set of
804
00:48:26,310 --> 00:48:30,730
omega, such that this limit is
equal to 0, is equal to 1.
805
00:48:35,010 --> 00:48:39,640
I think if I try to spend 20
more minutes talking about
806
00:48:39,640 --> 00:48:42,860
that in more detail, it
won't get any clearer.
807
00:48:42,860 --> 00:48:47,020
It is one of these very tedious
arguments where you
808
00:48:47,020 --> 00:48:51,380
have to sit down and follow
it step by step.
809
00:48:51,380 --> 00:48:54,500
I wrote the steps out
very carefully.
810
00:48:54,500 --> 00:49:02,180
And at this point, I have
to leave it as it is.
811
00:49:02,180 --> 00:49:05,540
But the theorem has been proven,
at least in what's
812
00:49:05,540 --> 00:49:08,800
written, if not in
what I've said.
813
00:49:08,800 --> 00:49:13,050
OK, let's look at an example
of this now.
814
00:49:13,050 --> 00:49:17,110
Let's look at the example where
these random variables Y
815
00:49:17,110 --> 00:49:21,930
sub n for n greater than or
equal to 1, have this
816
00:49:21,930 --> 00:49:23,170
following property.
817
00:49:23,170 --> 00:49:26,190
It's almost the same as the
sequence of numbers I talked
818
00:49:26,190 --> 00:49:27,830
about before.
819
00:49:27,830 --> 00:49:31,990
But what I'm going
to do now is--
820
00:49:31,990 --> 00:49:34,630
these are not IID random
variables.
821
00:49:34,630 --> 00:49:39,130
If they're IID random variables,
you're never going
822
00:49:39,130 --> 00:49:41,310
to talk about the sum
being finite.
823
00:49:41,310 --> 00:49:44,260
Sum of the expected values
being finite.
824
00:49:44,260 --> 00:49:52,620
How they behave is that from
1 to 5,
825
00:49:52,620 --> 00:49:55,520
you pick one of these random
variables in here and make it
826
00:49:55,520 --> 00:49:56,690
equal to 1.
827
00:49:56,690 --> 00:49:59,040
And all the rest
are equal to 0.
828
00:49:59,040 --> 00:50:03,560
From 5 to 25, you pick one of
the random variables, make it
829
00:50:03,560 --> 00:50:06,400
equal to 1, and all the
others are equal to 0.
830
00:50:06,400 --> 00:50:08,130
You choose randomly in here.
831
00:50:08,130 --> 00:50:12,380
From 25 to 125, you pick
one random variable,
832
00:50:12,380 --> 00:50:14,020
set it equal to 1.
833
00:50:14,020 --> 00:50:16,910
All the other random variables,
set it equal to 0,
834
00:50:16,910 --> 00:50:19,500
and so forth forever after.
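This construction can be sketched in a few lines of Python (the function name, seed, and number of intervals are illustrative, not from the lecture):

```python
import random

def sample_Y(num_intervals=4, seed=0):
    """Draw one sample path of the Y_n example: within each interval
    [5^j, 5^(j+1)), exactly one uniformly chosen Y_n equals 1; the rest are 0."""
    rng = random.Random(seed)
    length = 5 ** num_intervals
    y = [0] * length          # y[n] holds Y_n for n = 1 .. length-1 (index 0 unused)
    for j in range(num_intervals):
        lo, hi = 5 ** j, 5 ** (j + 1)   # the interval [5^j, 5^(j+1))
        y[rng.randrange(lo, hi)] = 1
    return y

y = sample_Y()
# Each interval contains exactly one 1, so the sample path keeps
# popping back up to 1 and never settles down.
for j in range(4):
    lo, hi = 5 ** j, 5 ** (j + 1)
    print(j, sum(y[lo:hi]))
```

Every printed interval count is 1, which is exactly why no sample path converges.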
835
00:50:19,500 --> 00:50:23,600
OK, so what does that say
for the sample points?
836
00:50:23,600 --> 00:50:29,080
If I look at any particular
sample point, what I find is
837
00:50:29,080 --> 00:50:36,610
that there's one occurrence of a
sample value equal to 1 from
838
00:50:36,610 --> 00:50:38,340
here to here.
839
00:50:38,340 --> 00:50:42,370
There's exactly one that's equal
to 1 from here to here.
840
00:50:42,370 --> 00:50:45,970
There's exactly one that's equal
to 1 from here to way
841
00:50:45,970 --> 00:50:49,890
out here at 125, and so forth.
842
00:50:49,890 --> 00:50:55,480
This is not a sequence of sample
values which converges
843
00:50:55,480 --> 00:51:00,790
because it keeps popping up
to 1 at all these values.
844
00:51:00,790 --> 00:51:05,300
So for every omega, Yn
of omega is 1 for
845
00:51:05,300 --> 00:51:07,530
some n in this interval.
846
00:51:07,530 --> 00:51:10,110
For every j, and it's
0 elsewhere.
847
00:51:10,110 --> 00:51:15,830
This Yn of omega doesn't
converge for any omega.
848
00:51:15,830 --> 00:51:19,310
So the probability that that
sequence converges
849
00:51:19,310 --> 00:51:22,390
is not 1, it's 0.
850
00:51:22,390 --> 00:51:26,080
So this is a particularly
awful example.
851
00:51:26,080 --> 00:51:29,490
This is a sequence of random
variables, which does not
852
00:51:29,490 --> 00:51:31,970
converge with probability 1.
853
00:51:31,970 --> 00:51:38,440
At the same time, the expected
value of Y sub n is 1 over 5
854
00:51:38,440 --> 00:51:42,320
to the j plus 1 minus
5 to the j.
855
00:51:42,320 --> 00:51:46,680
That's the probability that you
pick that particular n for
856
00:51:46,680 --> 00:51:50,310
a random variable to
be equal to 1.
857
00:51:50,310 --> 00:51:54,130
It's equal to this for 5 to the
j less than or equal to n,
858
00:51:54,130 --> 00:51:57,350
less than 5 to the j plus 1.
859
00:51:57,350 --> 00:52:01,000
When you add up all of these
things, when you add up
860
00:52:01,000 --> 00:52:04,330
expected value of Yn equal
to that over this
861
00:52:04,330 --> 00:52:06,120
interval, you get 1.
862
00:52:06,120 --> 00:52:09,460
When you add it up over the next
interval, which is much,
863
00:52:09,460 --> 00:52:11,400
much bigger, you get 1 again.
864
00:52:11,400 --> 00:52:12,950
When you add it up
over the next
865
00:52:12,950 --> 00:52:14,830
interval, you get 1 again.
866
00:52:14,830 --> 00:52:19,925
So the expected value
of the sum--
867
00:52:19,925 --> 00:52:22,850
the sum of the expected
value of the Y sub
868
00:52:22,850 --> 00:52:26,520
n's is equal to infinity.
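The interval-by-interval bookkeeping can be checked exactly (a sketch; the helper name is mine):

```python
from fractions import Fraction

def interval_expectation_sum(j):
    """Sum of E[Y_n] over 5^j <= n < 5^(j+1): there are 5^(j+1) - 5^j
    indices in the interval, each with E[Y_n] = 1 / (5^(j+1) - 5^j)."""
    width = 5 ** (j + 1) - 5 ** j
    return width * Fraction(1, width)

# Each interval contributes exactly 1 to the sum of expected values,
# and there are infinitely many intervals, so the sum diverges.
print([interval_expectation_sum(j) for j in range(4)])
```

Since every interval contributes 1, the sum of the E[Y_n] over all n is infinite, as the lecture says.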
869
00:52:26,520 --> 00:52:34,850
And what you wind up with
then is that this
870
00:52:34,850 --> 00:52:37,195
sequence does not converge--
871
00:52:49,690 --> 00:52:52,430
This says the theorem doesn't
apply at all.
872
00:52:52,430 --> 00:52:57,510
This says that the Y sub n of
omega does not converge for
873
00:52:57,510 --> 00:52:59,330
any sample function at all.
874
00:53:03,050 --> 00:53:06,910
This says that according to the
theorem, it doesn't have
875
00:53:06,910 --> 00:53:09,200
to converge.
876
00:53:09,200 --> 00:53:11,810
I mean, when you look at an
example after working very
877
00:53:11,810 --> 00:53:17,110
hard to prove a theorem, you
would like to find that if the
878
00:53:17,110 --> 00:53:25,990
conditions of the theorem are
satisfied, what the theorem
879
00:53:25,990 --> 00:53:28,120
says is satisfied also.
880
00:53:28,120 --> 00:53:31,700
Here, the conditions
are not satisfied.
881
00:53:31,700 --> 00:53:33,810
And you also don't
have convergence
882
00:53:33,810 --> 00:53:35,370
with probability 1.
883
00:53:35,370 --> 00:53:39,280
You do have convergence in
probability, however.
884
00:53:39,280 --> 00:53:42,520
So this gives you a nice example
of where you have a
885
00:53:42,520 --> 00:53:47,320
sequence of random variables
that converges in probability.
886
00:53:47,320 --> 00:53:51,710
It converges in probability
because as n gets larger and
887
00:53:51,710 --> 00:53:56,730
larger, the probability that Y
sub n is going to be equal to
888
00:53:56,730 --> 00:54:00,900
anything other than 0 gets
very, very small.
889
00:54:00,900 --> 00:54:05,450
So the limit as n goes to
infinity of the probability
890
00:54:05,450 --> 00:54:07,920
that Y sub n is greater
than epsilon--
891
00:54:07,920 --> 00:54:11,750
for any epsilon greater than 0,
this probability is equal
892
00:54:11,750 --> 00:54:13,760
to 0 for all epsilon.
893
00:54:13,760 --> 00:54:17,670
So this quantity does converge
in probability.
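Convergence in probability is easy to see numerically for this example (the function name is illustrative): P(Y_n = 1) is 1 over the width of the interval containing n, which shrinks to 0.

```python
def prob_Yn_nonzero(n):
    """P(Y_n = 1): Y_n is the uniformly chosen index in its interval
    [5^j, 5^(j+1)), so the probability is 1 over that interval's width."""
    j = 0
    while 5 ** (j + 1) <= n:
        j += 1
    return 1.0 / (5 ** (j + 1) - 5 ** j)

# P(|Y_n| > eps) shrinks to 0 as n grows: convergence in probability,
# even though no individual sample path converges.
for n in [1, 5, 25, 125]:
    print(n, prob_Yn_nonzero(n))
```

The printed probabilities decrease toward 0, while each sample path still hits 1 in every interval.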
894
00:54:17,670 --> 00:54:20,480
It does not converge
with probability 1.
895
00:54:20,480 --> 00:54:24,420
It's the simplest example I know
of where you don't have
896
00:54:24,420 --> 00:54:30,390
convergence with probability 1
and you do have convergence in
897
00:54:30,390 --> 00:54:32,290
probability.
898
00:54:32,290 --> 00:54:43,440
How about if you're looking at a
sequence of sample averages.
899
00:54:43,440 --> 00:54:47,720
Suppose you're looking at S sub
n over n where S sub n is
900
00:54:47,720 --> 00:54:51,680
a sum of IID random variables.
901
00:54:51,680 --> 00:54:57,460
Can you find an example there
where when you have a--
902
00:55:01,290 --> 00:55:05,850
can you find an example where
this sequence S sub n over n
903
00:55:05,850 --> 00:55:10,960
does converge in probability,
but does not converge with
904
00:55:10,960 --> 00:55:12,470
probability 1?
905
00:55:12,470 --> 00:55:16,660
Unfortunately, that's
very hard to do.
906
00:55:16,660 --> 00:55:20,900
And the reason is the main
theorem, which we will never
907
00:55:20,900 --> 00:55:30,210
get around to proving here is
that if you have a random
908
00:55:30,210 --> 00:55:36,510
variable x, and the expected
value of the magnitude of x is
909
00:55:36,510 --> 00:55:40,780
finite, then the strong law
of large numbers holds.
910
00:55:40,780 --> 00:55:47,530
Also, the weak law of large
numbers holds, which means that
911
00:55:47,530 --> 00:55:50,600
you're not going to find an
example where one holds and
912
00:55:50,600 --> 00:55:53,490
the other doesn't hold.
913
00:55:53,490 --> 00:55:57,910
So you have to go to strange
things like this in order to
914
00:55:57,910 --> 00:56:00,110
get these examples that
you're looking at.
915
00:56:04,220 --> 00:56:09,510
OK, let's now go from
convergence with probability 1
916
00:56:09,510 --> 00:56:14,240
to applying this to the sequence
of random variables
917
00:56:14,240 --> 00:56:23,210
where Y sub n is now equal to
the sum of n IID random
918
00:56:23,210 --> 00:56:24,520
variable divided by n.
919
00:56:24,520 --> 00:56:29,130
Namely, it's the sample average,
and we're looking at
920
00:56:29,130 --> 00:56:32,110
the limit as n goes to infinity
921
00:56:32,110 --> 00:56:33,320
of this sample average.
922
00:56:33,320 --> 00:56:40,500
What's the probability of the
set of sample points for which
923
00:56:40,500 --> 00:56:47,580
this sample path converges
to X bar?
924
00:56:47,580 --> 00:56:54,330
And the theorem says that this
quantity is equal to 1 if the
925
00:56:54,330 --> 00:56:57,940
expected value of the magnitude of X is
less than infinity.
926
00:56:57,940 --> 00:57:01,540
We're not going to prove that,
but what we are going to prove
927
00:57:01,540 --> 00:57:07,360
is that if the expected value
of the fourth moment of X is
928
00:57:07,360 --> 00:57:09,730
finite, then we're going
to prove that
929
00:57:09,730 --> 00:57:12,640
this theorem is true.
930
00:57:12,640 --> 00:57:20,480
OK, when we write this from now
on, we will sometimes get
931
00:57:20,480 --> 00:57:21,810
more terse.
932
00:57:21,810 --> 00:57:27,350
And instead of writing the
probability of an omega in the
933
00:57:27,350 --> 00:57:33,240
set of sample points such that
this limit for a sample point
934
00:57:33,240 --> 00:57:36,510
is equal to X bar, this whole
thing is equal to 1.
935
00:57:36,510 --> 00:57:39,480
We can sometimes write it as
the probability that this
936
00:57:39,480 --> 00:57:43,940
limit, which is now a limit
of Sn of omega over n
937
00:57:43,940 --> 00:57:45,100
is equal to X bar.
938
00:57:45,100 --> 00:57:47,450
But that's equal to 1.
939
00:57:47,450 --> 00:57:51,430
Some people write it even more
tersely as the limit of S sub
940
00:57:51,430 --> 00:57:56,590
n over n is equal to X bar
with probability 1.
941
00:57:56,590 --> 00:58:01,570
This is a very strange statement
here because this--
942
00:58:07,640 --> 00:58:11,320
I mean, what you're saying with
this statement is not
943
00:58:11,320 --> 00:58:16,770
that this limit is equal to
X bar with probability 1.
944
00:58:16,770 --> 00:58:21,610
It's saying, with probability 1,
this limit here exists for
945
00:58:21,610 --> 00:58:25,260
a sample point, and that limit
is equal to X bar.
946
00:58:25,260 --> 00:58:28,380
The thing which makes the strong
law of large numbers
947
00:58:28,380 --> 00:58:34,520
difficult is not proving
that the limit has
948
00:58:34,520 --> 00:58:36,170
a particular value.
949
00:58:36,170 --> 00:58:39,170
If there is a limit, it's
always easy to find
950
00:58:39,170 --> 00:58:40,230
what the value is.
951
00:58:40,230 --> 00:58:43,790
The thing which is difficult is
figuring out whether it has
952
00:58:43,790 --> 00:58:44,800
a limit or not.
953
00:58:44,800 --> 00:58:51,380
So this statement is fine for
people who understand what it
954
00:58:51,380 --> 00:58:55,700
says, but it's kind of
confusing otherwise.
955
00:58:55,700 --> 00:59:00,690
Still more tersely, people talk
about it as Sn over n
956
00:59:00,690 --> 00:59:04,790
goes to limit X bar with
probability 1.
957
00:59:04,790 --> 00:59:07,940
This is probably an even better
way to say it than this
958
00:59:07,940 --> 00:59:09,580
is because this is--
959
00:59:09,580 --> 00:59:12,580
I mean, this says that there's
something strange
960
00:59:12,580 --> 00:59:14,440
in the limit here.
961
00:59:14,440 --> 00:59:19,210
But I would suggest that you
write it this way until you
962
00:59:19,210 --> 00:59:20,720
get used to what it's saying.
963
00:59:20,720 --> 00:59:24,620
Because then, when you write it
this way, you realize that
964
00:59:24,620 --> 00:59:28,480
what you're talking about is
the limit over individual
965
00:59:28,480 --> 00:59:33,520
sample points rather than some
kind of more general limit.
966
00:59:33,520 --> 00:59:38,630
And convergence with probability
1 is always that
967
00:59:38,630 --> 00:59:42,420
sort of convergence.
968
00:59:42,420 --> 00:59:46,170
OK, this strong law and the
idea of convergence with
969
00:59:46,170 --> 00:59:49,240
probability 1 is really pretty
different from the other forms
970
00:59:49,240 --> 00:59:50,610
of convergence.
971
00:59:50,610 --> 00:59:54,420
In the sense that it focuses
directly on sample paths.
972
00:59:54,420 --> 00:59:59,080
The other forms of convergence
focus on things like the
973
00:59:59,080 --> 01:00:02,900
sequence of expected values,
or where the sequence of
974
01:00:02,900 --> 01:00:06,980
probabilities, or sequences of
numbers, which are the things
975
01:00:06,980 --> 01:00:09,840
you're used to dealing with.
976
01:00:09,840 --> 01:00:15,820
Here you're dealing directly
with sample points, and it
977
01:00:15,820 --> 01:00:18,740
makes it more difficult to
talk about the rate of
978
01:00:18,740 --> 01:00:21,070
convergence as n approaches
infinity.
979
01:00:21,070 --> 01:00:24,180
You can't talk about the rate
of convergence here as n
980
01:00:24,180 --> 01:00:25,860
approaches infinity.
981
01:00:25,860 --> 01:00:28,720
If you have any n less than
infinity, if you're only
982
01:00:28,720 --> 01:00:34,280
looking at a finite sequence,
you have no way of saying
983
01:00:34,280 --> 01:00:38,210
whether any of the sample values
over that sequence are
984
01:00:38,210 --> 01:00:41,360
going to converge or whether
they're not going to converge,
985
01:00:41,360 --> 01:00:44,670
because you don't know what
the rest of them are.
986
01:00:44,670 --> 01:00:48,570
So talking about a rate of
convergence with respect to
987
01:00:48,570 --> 01:00:50,680
the strong law of large numbers
988
01:00:50,680 --> 01:00:53,270
doesn't make any sense.
989
01:00:53,270 --> 01:00:55,880
It's connected directly to
the standard notion of a
990
01:00:55,880 --> 01:01:00,640
convergence of a sequence of
numbers when you look at those
991
01:01:00,640 --> 01:01:04,690
numbers applied to
a sample path.
992
01:01:04,690 --> 01:01:07,900
This is what gives the strong
law of large numbers its
993
01:01:07,900 --> 01:01:13,920
power, the fact that it's
related to this standard idea
994
01:01:13,920 --> 01:01:14,690
of convergence.
995
01:01:14,690 --> 01:01:18,530
The standard idea of convergence
is what the whole
996
01:01:18,530 --> 01:01:22,310
theory of analysis
is built on.
997
01:01:22,310 --> 01:01:26,030
And there are some very powerful
things you can do
998
01:01:26,030 --> 01:01:27,030
with analysis.
999
01:01:27,030 --> 01:01:31,740
And it's because convergence is
defined the way that it is.
1000
01:01:31,740 --> 01:01:35,400
When we talk about the strong
law of large numbers, we are
1001
01:01:35,400 --> 01:01:39,170
locked into that particular
notion of convergence.
1002
01:01:39,170 --> 01:01:41,690
And therefore, it's going
to have a lot of power.
1003
01:01:41,690 --> 01:01:44,050
We will see this as soon
as we start talking
1004
01:01:44,050 --> 01:01:45,750
about renewal theory.
1005
01:01:45,750 --> 01:01:47,890
And in fact, we'll see it in
the proof of the strong law
1006
01:01:47,890 --> 01:01:50,640
that we're going
to go through.
1007
01:01:50,640 --> 01:01:53,470
Most of the heavy lifting with
the strong law of large
1008
01:01:53,470 --> 01:01:57,780
numbers has been done by the
analysis of convergence with
1009
01:01:57,780 --> 01:01:58,740
probability 1.
1010
01:01:58,740 --> 01:02:01,540
The hard thing is this theorem
we've just proven.
1011
01:02:01,540 --> 01:02:02,990
And that's tricky.
1012
01:02:02,990 --> 01:02:05,750
And I apologize for getting a
little confused about it as we
1013
01:02:05,750 --> 01:02:08,720
went through it, and not
explaining all the steps
1014
01:02:08,720 --> 01:02:10,910
completely.
1015
01:02:10,910 --> 01:02:12,950
But as I said, it's hard
to follow proofs
1016
01:02:12,950 --> 01:02:14,950
in real time anyway.
1017
01:02:14,950 --> 01:02:17,080
But all of that is done now.
1018
01:02:17,080 --> 01:02:19,700
How do we go through the strong
law of large numbers
1019
01:02:19,700 --> 01:02:22,990
now if we accept this
convergence
1020
01:02:22,990 --> 01:02:25,370
with probability 1?
1021
01:02:25,370 --> 01:02:29,370
Well, it turns out to
be pretty easy.
1022
01:02:29,370 --> 01:02:32,300
We're going to assume that the
expected value of the fourth
1023
01:02:32,300 --> 01:02:35,890
moment of this underlying
random variable
1024
01:02:35,890 --> 01:02:38,040
is less than infinity.
1025
01:02:38,040 --> 01:02:43,830
So let's look at the expected
value of the sum of n random
1026
01:02:43,830 --> 01:02:47,120
variables taken to
the fourth power.
1027
01:02:47,120 --> 01:02:48,790
OK, so what is that?
1028
01:02:48,790 --> 01:02:57,760
It's the expected value of S
sub n times S sub n times S
1029
01:02:57,760 --> 01:03:00,030
sub n times S sub n.
1030
01:03:00,030 --> 01:03:05,220
S sub n is the sum of
Xi from 1 to n.
1031
01:03:05,220 --> 01:03:07,020
It's also this.
1032
01:03:07,020 --> 01:03:08,840
It's also this.
1033
01:03:08,840 --> 01:03:09,970
It's also this.
1034
01:03:09,970 --> 01:03:14,150
So the expected value of S sub
n to the fourth is the expected
1035
01:03:14,150 --> 01:03:17,810
value of this entire
product here.
1036
01:03:17,810 --> 01:03:22,800
I should have a big bracket
around all of that.
1037
01:03:22,800 --> 01:03:27,190
If I multiply all of these terms
out, each of these terms
1038
01:03:27,190 --> 01:03:29,900
goes from 1 to n, what I'm
going to get is the
1039
01:03:29,900 --> 01:03:32,710
sum over i from 1 to n.
1040
01:03:32,710 --> 01:03:35,010
Sum over j from 1 to n.
1041
01:03:35,010 --> 01:03:37,310
Sum over k from 1 to n.
1042
01:03:37,310 --> 01:03:40,810
And a sum over l from 1 to n.
1043
01:03:40,810 --> 01:03:44,710
So I'm going to have the
expected value of X sub i
1044
01:03:44,710 --> 01:03:48,370
times X sub j times X
sub k times X sub l.
1045
01:03:48,370 --> 01:03:50,100
Let's review what this is.
1046
01:03:50,100 --> 01:04:01,340
X sub i is the random variable
for the i-th of these X's.
1047
01:04:01,340 --> 01:04:02,640
I have n X's--
1048
01:04:02,640 --> 01:04:06,310
X1, X2, X3, up to X sub n.
1049
01:04:06,310 --> 01:04:11,000
What I'm trying to find is the
expected value of this sum to
1050
01:04:11,000 --> 01:04:12,460
the fourth power.
1051
01:04:12,460 --> 01:04:15,702
When you look at the sum of
something, if I look at the
1052
01:04:15,702 --> 01:04:34,410
sum of numbers, the sum over i of
a sub i, times the sum of b
1053
01:04:34,410 --> 01:04:38,740
sub i, which I'll index by j.
1054
01:04:38,740 --> 01:04:40,830
If i just do this, what's
it equal to?
1055
01:04:40,830 --> 01:04:45,380
It's equal to the sum over
i and j of a sub
1056
01:04:45,380 --> 01:04:47,890
i times b sub j.
1057
01:04:47,890 --> 01:04:50,650
I'm doing exactly the same thing
here, but I'm taking the
1058
01:04:50,650 --> 01:04:52,270
expected value of it.
1059
01:04:52,270 --> 01:04:55,540
That's a finite sum. The
expected value of the sum is
1060
01:04:55,540 --> 01:05:00,540
equal to the sum of the
expected values.
1061
01:05:00,540 --> 01:05:07,240
So if I look at any particular
value of X--
1062
01:05:07,240 --> 01:05:08,580
of this first X here.
1063
01:05:08,580 --> 01:05:11,390
Suppose I look at i equals 1.
1064
01:05:11,390 --> 01:05:17,990
Suppose I look at the expected
value of X1 times--
1065
01:05:17,990 --> 01:05:20,460
and I'll make this anything
other than 1.
1066
01:05:20,460 --> 01:05:24,350
I'll make this anything other
than 1, and this anything
1067
01:05:24,350 --> 01:05:24,870
other than 1.
1068
01:05:24,870 --> 01:05:27,370
For example, suppose I'm trying
to find the expected
1069
01:05:27,370 --> 01:05:35,490
value of X1 times X2
times X10 times X3.
1070
01:05:35,490 --> 01:05:38,110
OK, what is that?
1071
01:05:38,110 --> 01:05:42,910
Since X1, X2, X3 are all
independent of each other, the
1072
01:05:42,910 --> 01:05:47,460
expected value of X1 times all
these
1073
01:05:47,460 --> 01:05:52,420
other things is the expected
value of X1 conditional on the
1074
01:05:52,420 --> 01:05:54,610
values of these other
quantities.
1075
01:05:54,610 --> 01:05:57,105
And then I average over all
the other quantities.
1076
01:06:00,210 --> 01:06:03,460
Now, if these are independent
random variables, the expected
1077
01:06:03,460 --> 01:06:08,080
value of this given the values
of these other quantities is
1078
01:06:08,080 --> 01:06:11,130
just the expected value of X1.
1079
01:06:11,130 --> 01:06:14,740
I'm dealing with a case where
the expected value of X is
1080
01:06:14,740 --> 01:06:17,800
equal to 0.
1081
01:06:17,800 --> 01:06:20,330
Assuming X bar equals 0.
1082
01:06:20,330 --> 01:06:26,080
So when I pick i equal to 1
and all of these equal to
1083
01:06:26,080 --> 01:06:31,680
something other than 1, this
expected value is equal to 0.
1084
01:06:31,680 --> 01:06:34,760
That's a whole bunch of expected
values because that
1085
01:06:34,760 --> 01:06:39,580
includes j equals 2 to n, k
equals 2 to n, and l
1086
01:06:39,580 --> 01:06:41,090
equals 2 to n.
1087
01:06:41,090 --> 01:06:45,520
Now, I can do this for i
equals 2, i equals 3,
1088
01:06:45,520 --> 01:06:46,770
and so forth.
1089
01:06:48,950 --> 01:06:53,770
If i is different from j, and k,
and l, this expected value
1090
01:06:53,770 --> 01:06:55,020
is equal to 0.
1091
01:06:58,340 --> 01:07:02,160
And the same thing if
X sub j is different
1092
01:07:02,160 --> 01:07:03,430
than all the others.
1093
01:07:03,430 --> 01:07:05,640
The expected value
is equal to 0.
1094
01:07:05,640 --> 01:07:09,150
So how can I get anything
that's nonzero?
1095
01:07:09,150 --> 01:07:15,240
Well, if I look at X sub 1 times
X sub 1 times X sub 1
1096
01:07:15,240 --> 01:07:18,130
times X sub 1, that
gives me expected
1097
01:07:18,130 --> 01:07:19,690
value of X to the fourth.
1098
01:07:19,690 --> 01:07:22,950
That's not 0, presumably.
1099
01:07:22,950 --> 01:07:24,860
And I have n terms like that.
1100
01:07:29,050 --> 01:07:33,350
Well, I'm getting
down to here.
1101
01:07:33,350 --> 01:07:37,540
What we have is two kinds
of nonzero terms.
1102
01:07:37,540 --> 01:07:41,930
One of them is where i is equal
to j is equal to k is
1103
01:07:41,930 --> 01:07:43,060
equal to l.
1104
01:07:43,060 --> 01:07:46,830
And then we have X sub i
to the fourth power.
1105
01:07:46,830 --> 01:07:49,980
And we're assuming that's
some finite quantity.
1106
01:07:49,980 --> 01:07:52,890
That's the basic assumption
we're using here, expected
1107
01:07:52,890 --> 01:07:55,740
value of X fourth is
less than infinity.
1108
01:07:55,740 --> 01:07:58,470
What other kinds of things
can we have?
1109
01:07:58,470 --> 01:08:05,130
Well, if i is equal to j, and
if k is equal to l, then I
1110
01:08:05,130 --> 01:08:13,920
have the expected value of Xi
squared times Xk squared, which is the
1111
01:08:13,920 --> 01:08:16,890
expected value of Xi squared times
the expected value of Xk squared.
1112
01:08:16,890 --> 01:08:17,950
What is that?
1113
01:08:17,950 --> 01:08:22,510
Xi squared is independent of
Xk squared because i is
1114
01:08:22,510 --> 01:08:23,710
unequal to k.
1115
01:08:23,710 --> 01:08:26,060
These are independent
random variables.
1116
01:08:26,060 --> 01:08:31,250
So I have the expected value
of Xi squared is what?
1117
01:08:31,250 --> 01:08:35,720
It's just a variance of X.
This quantity here is the
1118
01:08:35,720 --> 01:08:37,819
variance of X also.
1119
01:08:37,819 --> 01:08:43,729
So I have the variance of X,
quantity squared.
1120
01:08:43,729 --> 01:08:50,040
So I have sigma to
the fourth power.
1121
01:08:50,040 --> 01:08:55,720
So those are the only terms that
I have for this second
1122
01:08:55,720 --> 01:08:59,850
kind of nonzero term
where Xi--
1123
01:08:59,850 --> 01:09:02,160
Excuse me, not Xi
is equal to Xj.
1124
01:09:02,160 --> 01:09:03,689
That's not what we're
talking about.
1125
01:09:03,689 --> 01:09:06,819
Where i is equal to j.
1126
01:09:06,819 --> 01:09:12,330
Namely, we have a sum where i
runs from 1 to n, where j runs
1127
01:09:12,330 --> 01:09:16,670
from 1 to n, k runs from 1 to
n, and l runs from 1 to n.
1128
01:09:16,670 --> 01:09:21,040
What we're looking at is, for
what values of i, j, k, and l
1129
01:09:21,040 --> 01:09:24,500
is this quantity
not equal to 0?
1130
01:09:24,500 --> 01:09:28,430
We're saying that if i is equal
to j is equal to k is
1131
01:09:28,430 --> 01:09:32,229
equal to l, then for all of
those terms, we have the
1132
01:09:32,229 --> 01:09:34,550
expected value of X fourth.
1133
01:09:34,550 --> 01:09:39,640
For all terms in which i is
equal to j and k is equal to
1134
01:09:39,640 --> 01:09:44,689
l, for all of those terms, we
have the expected value of X
1135
01:09:44,689 --> 01:09:47,609
sub i quantity squared.
1136
01:09:47,609 --> 01:09:50,560
Now, how many of those
terms are there?
1137
01:09:50,560 --> 01:09:55,180
Well, X sub i can be
any one of n terms.
1138
01:09:55,180 --> 01:10:01,220
X sub j can be any one
of how many terms?
1139
01:10:01,220 --> 01:10:02,470
It can't be equal.
1140
01:10:06,634 --> 01:10:11,120
i is equal to j, how many
things can k be?
1141
01:10:11,120 --> 01:10:17,450
It can't be equal to i because
then we would wind up with X
1142
01:10:17,450 --> 01:10:19,550
sub i to the fourth power.
1143
01:10:19,550 --> 01:10:24,650
So we're looking at n minus
1 possible values for k, n
1144
01:10:24,650 --> 01:10:27,430
possible values for i.
1145
01:10:27,430 --> 01:10:30,820
So there are n times n minus
1 of those terms.
1146
01:10:30,820 --> 01:10:32,070
I can also have--
1147
01:10:41,868 --> 01:10:44,600
let me write it this way.
1148
01:10:44,600 --> 01:10:51,580
Xi, Xj equal to i, times Xk, Xl equal to k.
1149
01:10:51,580 --> 01:10:52,800
I can have those terms.
1150
01:10:52,800 --> 01:11:00,470
I can also have Xi
Xj unequal to i.
1151
01:11:00,470 --> 01:11:08,720
Xk equal to j and
Xl equal to i.
1152
01:11:08,720 --> 01:11:10,640
I can have terms like this.
1153
01:11:10,640 --> 01:11:13,650
And that gives me a sigma
fourth term also.
1154
01:11:13,650 --> 01:11:18,630
I can also have Xi
Xj unequal to i.
1155
01:11:18,630 --> 01:11:22,370
k can be equal to i and
l can be equal to j.
1156
01:11:22,370 --> 01:11:24,880
So I really have three
kinds of terms.
1157
01:11:24,880 --> 01:11:33,840
I have three times n times n
minus 1 times the expected
1158
01:11:33,840 --> 01:11:43,690
value of X squared, this
quantity squared.
1159
01:11:43,690 --> 01:11:48,190
So that's the total value of
expected value of S sub n to
1160
01:11:48,190 --> 01:11:50,130
the fourth.
1161
01:11:50,130 --> 01:11:55,640
It's the n terms for which i is
equal to j is equal to k is
1162
01:11:55,640 --> 01:12:02,700
equal to l plus the 3n times n
minus 1 terms in which we have
1163
01:12:02,700 --> 01:12:05,790
two pairs of equal terms.
1164
01:12:05,790 --> 01:12:07,470
So we have that quantity here.
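This term count can be sanity-checked by brute force. As an illustration (not from the lecture), take X to be Rademacher, plus or minus 1 with equal probability, which has mean 0, E[X^4] = 1, and sigma^4 = 1, so the formula predicts E[S_n^4] = n + 3n(n-1):

```python
from itertools import product

def exact_fourth_moment(n):
    """E[S_n^4] for S_n a sum of n IID Rademacher (+1/-1, mean 0) variables,
    computed exactly by enumerating all 2^n equally likely sign sequences."""
    total = sum(sum(signs) ** 4 for signs in product((-1, 1), repeat=n))
    return total / 2 ** n

# The counting argument gives E[S_n^4] = n*E[X^4] + 3n(n-1)*sigma^4,
# which for Rademacher X is n + 3n(n-1).
for n in range(1, 8):
    print(n, exact_fourth_moment(n), n + 3 * n * (n - 1))
```

The brute-force value and the n plus 3n(n-1) count agree for every n, which is the whole content of the pairing argument.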
1165
01:12:07,470 --> 01:12:13,010
Now, expected value of X fourth
is the second moment of
1166
01:12:13,010 --> 01:12:16,410
the random variable X squared.
1167
01:12:16,410 --> 01:12:23,870
So the expected value of X
squared squared is the mean of
1168
01:12:23,870 --> 01:12:26,610
X squared squared.
1169
01:12:26,610 --> 01:12:30,580
And that's less than or equal to
the second moment of X squared,
1170
01:12:30,580 --> 01:12:32,100
which is this quantity.
1171
01:12:32,100 --> 01:12:36,190
The expected value
of Sn fourth is--
1172
01:12:38,980 --> 01:12:43,310
well, actually it's less than
or equal to 3n squared times
1173
01:12:43,310 --> 01:12:45,690
the expected value
of X fourth.
1174
01:12:45,690 --> 01:12:50,020
And blah, blah, blah, until we
get to 3 times the expected
1175
01:12:50,020 --> 01:12:53,300
value of X fourth times the
sum from n equals 1 to
1176
01:12:53,300 --> 01:12:56,140
infinity of 1 over n squared.
1177
01:12:56,140 --> 01:12:59,695
Now, is that quantity finite
or is it infinite?
1178
01:13:06,050 --> 01:13:09,100
Well, let's talk of three
different ways of showing that
1179
01:13:09,100 --> 01:13:13,220
this sum is going
to be finite.
1180
01:13:13,220 --> 01:13:17,710
One of the ways is that this is
an approximation, a crude
1181
01:13:17,710 --> 01:13:20,860
approximation, of the
integral from 1 to
1182
01:13:20,860 --> 01:13:24,395
infinity of 1 over X squared.
1183
01:13:24,395 --> 01:13:26,990
You know that that integral
is finite.
1184
01:13:26,990 --> 01:13:30,560
Another way of doing it is you
already know that if you take
1185
01:13:30,560 --> 01:13:35,650
1 over n times n plus 1,
you know how to sum that.
1186
01:13:35,650 --> 01:13:37,340
That sum is finite.
1187
01:13:37,340 --> 01:13:41,040
You can bound this by that.
1188
01:13:43,940 --> 01:13:48,570
And the other way of doing it is
simply to know that the sum
1189
01:13:48,570 --> 01:13:50,140
of 1 over n squared is finite.
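These bounds are easy to check numerically. A sketch (the telescoping bound here uses 1/(n(n-1)) for n at least 2, a standard variant of the comparison mentioned in the lecture):

```python
N = 100000
partial = sum(1.0 / n ** 2 for n in range(1, N + 1))

# Telescoping bound: for n >= 2, 1/n^2 <= 1/(n(n-1)) = 1/(n-1) - 1/n,
# so the tail from n = 2 sums to at most 1, and the whole sum to at most 2.
# (The integral comparison with integral of 1/x^2 from 1 gives the same bound.)
telescoped = 1 + sum(1.0 / (n * (n - 1)) for n in range(2, N + 1))

print(partial, telescoped)   # partial stays below 2; in fact it tends to pi^2/6
```

Both approaches confirm the sum is finite, which is all the proof needs.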
1190
01:13:53,990 --> 01:13:59,740
So what this says is that the sum
of the expected values of S sub n
1191
01:13:59,740 --> 01:14:04,290
fourth over n fourth is
less than infinity.
1192
01:14:04,290 --> 01:14:14,210
That says that the probability
of the set of omega for which
1193
01:14:14,210 --> 01:14:20,450
the limit of Sn of omega to the
fourth over n fourth is 0, is equal to 1.
1194
01:14:20,450 --> 01:14:24,750
In other words, it's saying
that Sn to the fourth of
1195
01:14:24,750 --> 01:14:30,850
omega over n fourth
converges to 0.
1196
01:14:30,850 --> 01:14:34,200
That's not quite what
we want, is it?
1197
01:14:34,200 --> 01:14:37,770
But the set of sample points
for which this quantity
1198
01:14:37,770 --> 01:14:44,790
converges has probability 1.
1199
01:14:44,790 --> 01:14:47,860
And here is where you see the
real power of the strong law
1200
01:14:47,860 --> 01:14:49,410
of large numbers.
1201
01:14:49,410 --> 01:14:57,170
Because if these numbers
converge to 0 with probability
1202
01:14:57,170 --> 01:15:11,310
1, what happens to the sequence
of numbers Sn to the fourth of
1203
01:15:11,310 --> 01:15:15,920
omega divided by n to the
fourth, this limit--
1204
01:15:23,488 --> 01:15:28,940
if this was equal to 0, then
what is the limit as n
1205
01:15:28,940 --> 01:15:35,541
approaches infinity of Sn of
omega over n?
1206
01:15:38,880 --> 01:15:43,240
If I take the fourth root
of this, I get this.
1207
01:15:43,240 --> 01:15:47,220
If this quantity is converging
to 0, the fourth root of this
1208
01:15:47,220 --> 01:15:53,830
also has to be converging to 0
on a sample path basis. The
1209
01:15:53,830 --> 01:15:58,300
fact that this converges means
that this converges also.
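Pathwise, this step is ordinary real analysis, nothing probabilistic: if a sequence of numbers a_n^4 tends to 0, then a_n = (a_n^4)^(1/4) tends to 0 too, because the fourth root is continuous at 0. A tiny numerical sketch:

```python
# a_n^4 is a sequence of numbers tending to 0 ...
fourth_powers = [1.0 / n for n in range(1, 1001)]
# ... so the fourth roots a_n tend to 0 as well.
roots = [x ** 0.25 for x in fourth_powers]
print(fourth_powers[-1], roots[-1])
```

Applied with a_n = S_n(omega)/n for each fixed omega, this is exactly the "funny game" the strong law lets you play.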
1210
01:16:00,910 --> 01:16:03,350
Now, you see if you were dealing
with convergence in
1211
01:16:03,350 --> 01:16:07,060
probability or something like
that, you couldn't play this
1212
01:16:07,060 --> 01:16:09,090
funny game.
1213
01:16:09,090 --> 01:16:12,450
And the ability to play this
game is really what makes
1214
01:16:12,450 --> 01:16:16,490
convergence with probability 1
a powerful concept.
1215
01:16:16,490 --> 01:16:19,350
You can do all sorts of strange
things with it.
1216
01:16:19,350 --> 01:16:23,590
And we'll talk about
that next time.
1217
01:16:23,590 --> 01:16:29,920
But that's why all
of this works.
1218
01:16:29,920 --> 01:16:33,960
So that's what says that the
probability of the set of
1219
01:16:33,960 --> 01:16:37,570
omega for which the limit
of Sn of omega over n
1220
01:16:37,570 --> 01:16:38,820
equals 0 equals 1.
1221
01:16:41,840 --> 01:16:45,590
Now, let's look at the
strange aspect of
1222
01:16:45,590 --> 01:16:47,880
what we've just done.
1223
01:16:47,880 --> 01:16:51,940
And this is where things
get very peculiar.
1224
01:16:51,940 --> 01:16:55,640
Let's look at the Bernoulli
case, which by now we all
1225
01:16:55,640 --> 01:16:57,470
understand.
1226
01:16:57,470 --> 01:17:03,260
So we consider a Bernoulli
process, all
1227
01:17:03,260 --> 01:17:04,860
moments of X exist.
1228
01:17:04,860 --> 01:17:08,110
Moment-generating functions
of X exist.
1229
01:17:08,110 --> 01:17:11,590
X is about as well-behaved as
you can expect because it only
1230
01:17:11,590 --> 01:17:14,030
has the values 1 or 0.
1231
01:17:14,030 --> 01:17:16,630
So it's very nice.
1232
01:17:16,630 --> 01:17:19,690
The expected value of X
is going to be equal
1233
01:17:19,690 --> 01:17:22,070
to p in this case.
1234
01:17:22,070 --> 01:17:28,380
The set of sample paths for
which the limit of Sn of omega over n is
1235
01:17:28,380 --> 01:17:31,270
equal to p has probability 1.
1236
01:17:31,270 --> 01:17:37,300
In other words, with probability
1, when you look
1237
01:17:37,300 --> 01:17:40,305
at a sample path and you look
at the whole thing from n
1238
01:17:40,305 --> 01:17:43,730
equals 1 off to infinity, and
you take the limit of that
1239
01:17:43,730 --> 01:17:47,750
sample path as n goes to
infinity, what you get is p.
1240
01:17:47,750 --> 01:17:52,090
And the probability that you
get p is equal to 1.
1241
01:17:52,090 --> 01:17:55,880
Well, now, the thing that's
disturbing is, if you look at
1242
01:17:55,880 --> 01:17:59,930
another Bernoulli process where
the probability of the 1
1243
01:17:59,930 --> 01:18:03,160
is p prime instead of p.
1244
01:18:03,160 --> 01:18:06,630
What happens then?
1245
01:18:06,630 --> 01:18:12,440
With probability 1, you get
convergence of Sn of omega
1246
01:18:12,440 --> 01:18:19,820
over n, but the convergence is
to p prime instead of to p.
1247
01:18:19,820 --> 01:18:24,790
The events in these two spaces
are exactly the same.
1248
01:18:24,790 --> 01:18:28,470
We've changed the probability
measure, but we've kept all
1249
01:18:28,470 --> 01:18:30,740
the events the same.
1250
01:18:30,740 --> 01:18:34,040
And by changing the probability
measure, we have
1251
01:18:34,040 --> 01:18:43,160
changed one set of probability 1
into a set of probability 0.
1252
01:18:43,160 --> 01:18:46,500
And we changed another set of
probability 0 into a set of
1253
01:18:46,500 --> 01:18:48,550
probability 1.
1254
01:18:48,550 --> 01:18:52,130
So we have two different
events here.
1255
01:18:52,130 --> 01:18:56,450
On one probability measure, this
event has probability 1.
1256
01:18:56,450 --> 01:18:58,700
On the other one, it
has probability 0.
1257
01:18:58,700 --> 01:19:04,930
They're both very nice, very
well-behaved probabilistic
1258
01:19:04,930 --> 01:19:06,750
situations.
1259
01:19:06,750 --> 01:19:08,560
So that's a little disturbing.
1260
01:19:08,560 --> 01:19:14,140
But then you say, you can pick
p in an uncountably infinite
1261
01:19:14,140 --> 01:19:15,940
number of ways.
1262
01:19:15,940 --> 01:19:18,680
And for each way you
pick p, you have
1263
01:19:18,680 --> 01:19:20,160
uncountably many events.
1264
01:19:23,250 --> 01:19:28,790
Excuse me, for each value of
p, you have one event of
1265
01:19:28,790 --> 01:19:31,790
probability 1 for that p.
1266
01:19:31,790 --> 01:19:35,750
So as you go through this
uncountable number of events,
1267
01:19:35,750 --> 01:19:39,480
you go through this uncountable
number of p's, you
1268
01:19:39,480 --> 01:19:43,560
have an uncountable number of
events, each of which has
1269
01:19:43,560 --> 01:19:47,890
probability 1 for its own p.
1270
01:19:47,890 --> 01:19:52,600
And now the set of sequences
that converge is, in fact, a
1271
01:19:52,600 --> 01:19:55,270
rather peculiar set
to start with.
1272
01:19:55,270 --> 01:19:57,656
So if you look at all the other
things that are going to
1273
01:19:57,656 --> 01:20:01,010
happen, there are an awful
lot of those events also.
1274
01:20:04,300 --> 01:20:08,780
So what is happening here is
that these events that we're
1275
01:20:08,780 --> 01:20:14,750
talking about are indeed very,
very peculiar events.
1276
01:20:14,750 --> 01:20:16,530
I mean, all the mathematics
works out.
1277
01:20:16,530 --> 01:20:17,850
The mathematics is fine.
1278
01:20:17,850 --> 01:20:21,420
There's no doubt about it.
1279
01:20:21,420 --> 01:20:24,280
In fact, the mathematics
of probability
1280
01:20:24,280 --> 01:20:26,570
theory was worked out.
1281
01:20:26,570 --> 01:20:29,940
People like Kolmogorov went to
great efforts to make sure
1282
01:20:29,940 --> 01:20:31,960
that all of this worked out.
1283
01:20:31,960 --> 01:20:34,330
But then he wound up with
this peculiar kind
1284
01:20:34,330 --> 01:20:36,660
of situation here.
1285
01:20:36,660 --> 01:20:40,170
And that's what happens when you
go to an infinite number
1286
01:20:40,170 --> 01:20:43,080
of random variables.
1287
01:20:43,080 --> 01:20:49,320
And it's ugly, but that's
the way it is.
1288
01:20:49,320 --> 01:20:55,140
So what I'm arguing here
is that when you go from
1289
01:20:55,140 --> 01:21:00,220
finite n to infinite n, and
you start interchanging
1290
01:21:00,220 --> 01:21:06,010
limits, and you start taking
limits without much care and
1291
01:21:06,010 --> 01:21:09,280
you start doing all the things
that you would like to do,
1292
01:21:09,280 --> 01:21:15,700
thinking that infinite n is sort
of the same as finite n.
1293
01:21:15,700 --> 01:21:18,990
In most places in probability,
you can do that and you can
1294
01:21:18,990 --> 01:21:20,170
get away with it.
1295
01:21:20,170 --> 01:21:22,800
As soon as you start dealing
with the strong law of large
1296
01:21:22,800 --> 01:21:26,130
numbers, you suddenly really
have to start being careful
1297
01:21:26,130 --> 01:21:27,810
about this.
1298
01:21:27,810 --> 01:21:31,900
So from now on, we have to be
just a little bit careful
1299
01:21:31,900 --> 01:21:36,120
about interchanging limits,
interchanging summation and
1300
01:21:36,120 --> 01:21:40,190
integration, interchanging all
sorts of things, as soon as we
1301
01:21:40,190 --> 01:21:43,540
have an infinite number
of random variables.
1302
01:21:43,540 --> 01:21:47,600
So that's a concern that we have
to worry about from here on.
1303
01:21:47,600 --> 01:21:48,850
OK, thank you.