1
00:00:00,000 --> 00:00:00,040
2
00:00:00,040 --> 00:00:02,460
The following content is
provided under a Creative
3
00:00:02,460 --> 00:00:03,870
Commons license.
4
00:00:03,870 --> 00:00:06,910
Your support will help MIT
OpenCourseWare continue to
5
00:00:06,910 --> 00:00:10,560
offer high quality educational
resources for free.
6
00:00:10,560 --> 00:00:13,460
To make a donation or view
additional materials from
7
00:00:13,460 --> 00:00:19,290
hundreds of MIT courses, visit
MIT OpenCourseWare at
8
00:00:19,290 --> 00:00:20,540
ocw.mit.edu.
9
00:00:20,540 --> 00:00:22,840
10
00:00:22,840 --> 00:00:25,260
JOHN TSITSIKLIS: Today we're
going to finish our discussion
11
00:00:25,260 --> 00:00:27,480
of the Poisson process.
12
00:00:27,480 --> 00:00:31,280
We're going to see a few of
its properties, do a few
13
00:00:31,280 --> 00:00:35,660
interesting problems, some more
interesting than others.
14
00:00:35,660 --> 00:00:39,000
So go through a few examples and
then we're going to talk
15
00:00:39,000 --> 00:00:42,170
about some quite strange things
that happen with the
16
00:00:42,170 --> 00:00:43,980
Poisson process.
17
00:00:43,980 --> 00:00:46,500
So the first thing is to
remember what the Poisson
18
00:00:46,500 --> 00:00:48,200
process is.
19
00:00:48,200 --> 00:00:52,300
It's a model, let's say, of
arrivals of customers that
20
00:00:52,300 --> 00:00:55,940
are, in some sense, quote
unquote, completely random,
21
00:00:55,940 --> 00:00:58,990
that is a customer can arrive
at any point in time.
22
00:00:58,990 --> 00:01:01,670
All points in time are
equally likely.
23
00:01:01,670 --> 00:01:05,330
And different points in time
are sort of independent of
24
00:01:05,330 --> 00:01:06,450
other points in time.
25
00:01:06,450 --> 00:01:10,160
So the fact that I got an
arrival now doesn't tell me
26
00:01:10,160 --> 00:01:13,050
anything about whether there's
going to be an arrival at some
27
00:01:13,050 --> 00:01:14,950
other time.
28
00:01:14,950 --> 00:01:18,230
In some sense, it's a continuous
time version of the
29
00:01:18,230 --> 00:01:19,760
Bernoulli process.
30
00:01:19,760 --> 00:01:23,130
So the best way to think about
the Poisson process is that we
31
00:01:23,130 --> 00:01:26,240
divide time into extremely
tiny slots.
32
00:01:26,240 --> 00:01:29,750
And in each time slot, there's
an independent possibility of
33
00:01:29,750 --> 00:01:31,120
having an arrival.
34
00:01:31,120 --> 00:01:33,860
Different time slots are
independent of each other.
35
00:01:33,860 --> 00:01:36,660
On the other hand, when the slot
is tiny, the probability
36
00:01:36,660 --> 00:01:39,760
for obtaining an arrival during
that tiny slot is
37
00:01:39,760 --> 00:01:41,910
itself going to be tiny.
38
00:01:41,910 --> 00:01:45,950
So we capture these properties
into a formal definition of what
39
00:01:45,950 --> 00:01:48,380
the Poisson process is.
40
00:01:48,380 --> 00:01:51,390
We have a probability mass
function for the number of
41
00:01:51,390 --> 00:01:56,250
arrivals, k, during an interval
of a given length.
42
00:01:56,250 --> 00:02:00,590
So this is the sort of basic
description of the
43
00:02:00,590 --> 00:02:03,200
distribution of the number
of arrivals.
44
00:02:03,200 --> 00:02:07,520
So tau is fixed.
45
00:02:07,520 --> 00:02:09,350
And k is the parameter.
46
00:02:09,350 --> 00:02:13,660
So when we add over all k's, the
sum of these probabilities
47
00:02:13,660 --> 00:02:15,780
has to be equal to 1.
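[A quick numerical aside, not part of the lecture: the normalization just mentioned can be checked for the Poisson PMF P(k) = (lambda tau)^k e^(-lambda tau) / k!. The rate 0.6 and interval length 2 below are just illustrative values.]

```python
import math

def poisson_pmf(k, lam, tau):
    """Probability of k arrivals in an interval of length tau, at rate lam."""
    mu = lam * tau
    return mu ** k * math.exp(-mu) / math.factorial(k)

lam, tau = 0.6, 2.0            # illustrative values, not fixed by the lecture
total = sum(poisson_pmf(k, lam, tau) for k in range(100))
print(total)  # sums to 1, up to a negligible truncated tail
```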
48
00:02:15,780 --> 00:02:19,330
There's a time homogeneity
assumption, which is hidden in
49
00:02:19,330 --> 00:02:22,840
this, namely, the only thing
that matters is the duration
50
00:02:22,840 --> 00:02:25,880
of the time interval, not where
the time interval sits
51
00:02:25,880 --> 00:02:28,230
on the real axis.
52
00:02:28,230 --> 00:02:31,260
Then we have an independence
assumption.
53
00:02:31,260 --> 00:02:34,570
Intervals that are disjoint are
statistically independent
54
00:02:34,570 --> 00:02:35,650
from each other.
55
00:02:35,650 --> 00:02:39,110
So any information you give me
about arrivals during this
56
00:02:39,110 --> 00:02:43,200
time interval doesn't change my
beliefs about what's going
57
00:02:43,200 --> 00:02:45,930
to happen during another
time interval.
58
00:02:45,930 --> 00:02:49,050
So this is a generalization
of the idea that we had in
59
00:02:49,050 --> 00:02:51,970
Bernoulli processes that
different time slots are
60
00:02:51,970 --> 00:02:53,930
independent of each other.
61
00:02:53,930 --> 00:02:56,770
And then to specify this
function, the distribution of
62
00:02:56,770 --> 00:03:00,120
the number of arrivals, we
sort of go in stages.
63
00:03:00,120 --> 00:03:03,750
We first specify this function
for the case where the time
64
00:03:03,750 --> 00:03:05,920
interval is very small.
65
00:03:05,920 --> 00:03:09,630
And I'm telling you what those
probabilities will be.
66
00:03:09,630 --> 00:03:13,540
And based on these then, we do
some calculations to find
67
00:03:13,540 --> 00:03:16,510
the formula for the distribution
of the number of
68
00:03:16,510 --> 00:03:19,420
arrivals for intervals of
a general duration.
69
00:03:19,420 --> 00:03:23,000
So for a small duration, delta,
the probability of
70
00:03:23,000 --> 00:03:26,730
obtaining 1 arrival
is lambda delta.
71
00:03:26,730 --> 00:03:30,220
The remaining probability is
assigned to the event that we
72
00:03:30,220 --> 00:03:32,980
get no arrivals during
that interval.
73
00:03:32,980 --> 00:03:36,330
The probability of obtaining
more than 1 arrival in a tiny
74
00:03:36,330 --> 00:03:40,630
interval is essentially 0.
75
00:03:40,630 --> 00:03:45,190
And when we say essentially,
it means modulo terms that
76
00:03:45,190 --> 00:03:47,620
are of order delta squared.
77
00:03:47,620 --> 00:03:50,440
And when delta is very small,
anything which is delta
78
00:03:50,440 --> 00:03:52,690
squared can be ignored.
79
00:03:52,690 --> 00:03:55,850
So up to delta squared terms,
that's what happened during a
80
00:03:55,850 --> 00:03:57,660
little interval.
81
00:03:57,660 --> 00:04:01,210
Now if we know the probability
distribution for the number of
82
00:04:01,210 --> 00:04:03,260
arrivals in a little interval.
83
00:04:03,260 --> 00:04:06,470
We can use this to get the
distribution for the number of
84
00:04:06,470 --> 00:04:08,370
arrivals over several
intervals.
85
00:04:08,370 --> 00:04:09,870
How do we do that?
86
00:04:09,870 --> 00:04:13,850
The big interval is composed
of many little intervals.
87
00:04:13,850 --> 00:04:16,410
Each little interval is
independent from any other
88
00:04:16,410 --> 00:04:20,720
little interval, so it is
as if we have a sequence of
89
00:04:20,720 --> 00:04:22,310
Bernoulli trials.
90
00:04:22,310 --> 00:04:24,910
Each Bernoulli trial is
associated with a little
91
00:04:24,910 --> 00:04:29,580
interval and has a small
probability of obtaining a
92
00:04:29,580 --> 00:04:32,850
success or an arrival during
that mini-slot.
93
00:04:32,850 --> 00:04:35,800
On the other hand, when delta
is small, and you take a big
94
00:04:35,800 --> 00:04:38,680
interval and chop it
up, you get a large
95
00:04:38,680 --> 00:04:41,410
number of little intervals.
96
00:04:41,410 --> 00:04:45,240
So what we essentially have here
is a Bernoulli process,
97
00:04:45,240 --> 00:04:48,690
in which the number of
trials is huge but the
98
00:04:48,690 --> 00:04:53,030
probability of success during
any given trial is tiny.
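[A numerical sketch of this limit, not from the lecture: the Bernoulli count with many slots and a tiny per-slot probability lambda*delta is already very close to the Poisson PMF. The slot count and the rate below are my own illustrative choices.]

```python
import math

def binom_pmf(k, n, p):
    """Bernoulli-process count: P(k successes in n slots, probability p each)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, mu):
    return mu ** k * math.exp(-mu) / math.factorial(k)

lam, tau = 0.6, 2.0
n = 100_000          # number of tiny slots the interval is chopped into
p = lam * tau / n    # per-slot arrival probability, i.e. lambda * delta
for k in range(4):
    print(k, binom_pmf(k, n, p), poisson_pmf(k, lam * tau))
```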
99
00:04:53,030 --> 00:05:02,580
The average number of arrivals
ends up being proportional to
100
00:05:02,580 --> 00:05:04,740
the length of the interval.
101
00:05:04,740 --> 00:05:07,410
If you have twice as large an
interval, it's as if you're
102
00:05:07,410 --> 00:05:10,860
having twice as many over these
mini-trials, so the
103
00:05:10,860 --> 00:05:14,080
expected number of arrivals will
increase proportionately.
104
00:05:14,080 --> 00:05:18,380
There's also this parameter
lambda, which we interpret as
105
00:05:18,380 --> 00:05:22,600
expected number of arrivals
per unit time.
106
00:05:22,600 --> 00:05:25,940
And it comes in those
probabilities here.
107
00:05:25,940 --> 00:05:28,810
When you double lambda, this
means that a little interval
108
00:05:28,810 --> 00:05:31,160
is twice as likely to
get an arrival.
109
00:05:31,160 --> 00:05:33,300
So you would expect
to get twice as
110
00:05:33,300 --> 00:05:34,880
many arrivals as well.
111
00:05:34,880 --> 00:05:37,990
That's why the expected number
of arrivals during an interval
112
00:05:37,990 --> 00:05:40,580
of length tau also scales
proportional to
113
00:05:40,580 --> 00:05:42,850
this parameter lambda.
114
00:05:42,850 --> 00:05:45,740
Somewhat unexpectedly, it turns
out that the variance of
115
00:05:45,740 --> 00:05:48,750
the number of arrivals is also
the same as the mean.
116
00:05:48,750 --> 00:05:50,540
This is a peculiarity
that happens
117
00:05:50,540 --> 00:05:52,310
in the Poisson process.
118
00:05:52,310 --> 00:05:56,100
So this is one way of thinking
about Poisson process, in
119
00:05:56,100 --> 00:05:59,640
terms of little intervals, each
one of which has a tiny
120
00:05:59,640 --> 00:06:01,690
probability of success.
121
00:06:01,690 --> 00:06:04,370
And we think of the distribution
associated with
122
00:06:04,370 --> 00:06:07,030
that process as being
described by
123
00:06:07,030 --> 00:06:09,100
this particular PMF.
124
00:06:09,100 --> 00:06:12,800
So this is the PMF for the
number of arrivals during an
125
00:06:12,800 --> 00:06:15,880
interval of a fixed
duration, tau.
126
00:06:15,880 --> 00:06:20,970
It's a PMF that extends all
over the entire range of
127
00:06:20,970 --> 00:06:22,810
non-negative integers.
128
00:06:22,810 --> 00:06:25,730
So the number of arrivals you
can get during an interval of a
129
00:06:25,730 --> 00:06:27,890
certain length can
be anything.
130
00:06:27,890 --> 00:06:31,690
You can get as many arrivals
as you want.
131
00:06:31,690 --> 00:06:34,400
Of course the probability of
getting a zillion arrivals is
132
00:06:34,400 --> 00:06:35,620
going to be tiny.
133
00:06:35,620 --> 00:06:38,110
But in principle, this
is possible.
134
00:06:38,110 --> 00:06:41,440
And that's because an interval,
even if it's a fixed
135
00:06:41,440 --> 00:06:47,460
length, consists of an infinite
number of mini-slots
136
00:06:47,460 --> 00:06:48,770
in some sense.
137
00:06:48,770 --> 00:06:50,880
You can divide, chop
it up, into as many
138
00:06:50,880 --> 00:06:52,130
mini-slots as you want.
139
00:06:52,130 --> 00:06:54,400
So in principle, it's
possible that every
140
00:06:54,400 --> 00:06:55,860
mini-slot gets an arrival.
141
00:06:55,860 --> 00:06:59,190
In principle, it's possible to
get an arbitrarily large
142
00:06:59,190 --> 00:07:01,210
number of arrivals.
143
00:07:01,210 --> 00:07:05,250
So this particular formula here
is not very intuitive
144
00:07:05,250 --> 00:07:06,560
when you look at it.
145
00:07:06,560 --> 00:07:08,630
But it's a legitimate PMF.
146
00:07:08,630 --> 00:07:10,360
And it's called the
Poisson PMF.
147
00:07:10,360 --> 00:07:13,970
It's the PMF that describes
the number of arrivals.
148
00:07:13,970 --> 00:07:17,660
So that's one way of thinking
about the Poisson process,
149
00:07:17,660 --> 00:07:21,650
where the basic object of
interest would be this PMF and
150
00:07:21,650 --> 00:07:23,520
you try to work with it.
151
00:07:23,520 --> 00:07:26,600
There's another way of thinking
about what happens in
152
00:07:26,600 --> 00:07:28,060
the Poisson process.
153
00:07:28,060 --> 00:07:31,780
And this has to do with letting
things evolve in time.
154
00:07:31,780 --> 00:07:34,160
You start at time 0.
155
00:07:34,160 --> 00:07:37,080
There's going to be a time at
which the first arrival
156
00:07:37,080 --> 00:07:40,000
occurs, and call that time T1.
157
00:07:40,000 --> 00:07:44,130
This time turns out to have an
exponential distribution with
158
00:07:44,130 --> 00:07:46,340
parameter lambda.
159
00:07:46,340 --> 00:07:49,120
Once you get an arrival,
it's as if the
160
00:07:49,120 --> 00:07:53,300
process starts fresh.
161
00:07:53,300 --> 00:07:55,830
The best way to understand why
this is the case is by
162
00:07:55,830 --> 00:07:57,740
thinking in terms of
the analogy with
163
00:07:57,740 --> 00:07:58,840
the Bernoulli process.
164
00:07:58,840 --> 00:08:01,660
If you believe that statement
for the Bernoulli process,
165
00:08:01,660 --> 00:08:05,510
since this is a limiting case,
it should also be true.
166
00:08:05,510 --> 00:08:09,150
So starting from this time,
we're going to wait a random
167
00:08:09,150 --> 00:08:12,710
amount of time until we get the
second arrival. This random
168
00:08:12,710 --> 00:08:15,250
amount of time, let's
call it T2.
169
00:08:15,250 --> 00:08:18,360
This time, T2 is also going
to have an exponential
170
00:08:18,360 --> 00:08:21,140
distribution with the same
parameter, lambda.
171
00:08:21,140 --> 00:08:26,615
And these two are going to be
independent of each other.
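[This description also gives a direct way to simulate a Poisson process, as a sketch outside the lecture: draw i.i.d. exponential interarrival times and accumulate them. The rate and horizon below are arbitrary choices of mine.]

```python
import random

def poisson_arrivals(lam, horizon, rng):
    """Arrival times in [0, horizon]: cumulative sums of exponential gaps."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(lam)   # exponential interarrival time, rate lam
        if t > horizon:
            return times
        times.append(t)

rng = random.Random(0)
arrivals = poisson_arrivals(0.6, 10_000.0, rng)
print(len(arrivals) / 10_000.0)  # empirical arrival rate, close to 0.6
```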
172
00:08:26,615 --> 00:08:27,750
OK?
173
00:08:27,750 --> 00:08:31,520
So the Poisson process has all
the same memorylessness
174
00:08:31,520 --> 00:08:34,820
properties that the Bernoulli
process has.
175
00:08:34,820 --> 00:08:37,630
What's another way of thinking
of this property?
176
00:08:37,630 --> 00:08:43,360
So think of a process where
you have a light bulb.
177
00:08:43,360 --> 00:08:47,070
The time at which the light bulb burns
out, you can model it by
178
00:08:47,070 --> 00:08:48,855
an exponential random
variable.
179
00:08:48,855 --> 00:08:51,680
180
00:08:51,680 --> 00:08:58,170
And suppose that they tell you
that so far, we are sitting
181
00:08:58,170 --> 00:09:01,550
at some time, T. And I tell you
that the light bulb has
182
00:09:01,550 --> 00:09:04,510
not yet burned out.
183
00:09:04,510 --> 00:09:08,290
What does this tell you about
the future of the light bulb?
184
00:09:08,290 --> 00:09:11,700
Is the fact that it didn't
burn out so far good
185
00:09:11,700 --> 00:09:13,720
news or is it bad news?
186
00:09:13,720 --> 00:09:17,640
Would you rather keep this light
bulb that has worked for
187
00:09:17,640 --> 00:09:20,950
t time steps and is still OK?
188
00:09:20,950 --> 00:09:25,770
Or would you rather use a new
light bulb that starts new at
189
00:09:25,770 --> 00:09:27,740
that point in time?
190
00:09:27,740 --> 00:09:30,920
Because of the memorylessness
property, the past of that
191
00:09:30,920 --> 00:09:33,220
light bulb doesn't matter.
192
00:09:33,220 --> 00:09:37,040
So the future of this light bulb
is statistically the same
193
00:09:37,040 --> 00:09:40,740
as the future of a
new light bulb.
194
00:09:40,740 --> 00:09:43,700
For both of them, the time until
they burn out is going
195
00:09:43,700 --> 00:09:46,580
to be described by an exponential
distribution.
196
00:09:46,580 --> 00:09:50,990
So one way that people describe
the situation is to
197
00:09:50,990 --> 00:09:55,450
say that used is exactly
as good as new.
198
00:09:55,450 --> 00:09:59,220
So a used one is no worse
than a new one.
199
00:09:59,220 --> 00:10:01,950
A used one is no better
than a new one.
200
00:10:01,950 --> 00:10:06,130
So a used light bulb that
hasn't yet burnt out is
201
00:10:06,130 --> 00:10:09,180
exactly as good as
a new light bulb.
202
00:10:09,180 --> 00:10:11,740
So that's another way of
thinking about the
203
00:10:11,740 --> 00:10:17,150
memorylessness that we have
in the Poisson process.
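[The "used is as good as new" statement is exactly the memorylessness identity P(T > s + t | T > s) = P(T > t) for an exponential T. A small numeric check, not part of the lecture; s = 4 and t = 1 are arbitrary.]

```python
import math

lam, s, t = 0.6, 4.0, 1.0                # arbitrary illustrative numbers
surv = lambda x: math.exp(-lam * x)      # P(T > x) for an exponential(lam)

cond = surv(s + t) / surv(s)             # P(T > s + t | T > s)
print(cond, surv(t))                     # the two agree: no memory
```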
204
00:10:17,150 --> 00:10:19,350
Back to this picture.
205
00:10:19,350 --> 00:10:22,410
The time until the second
arrival is the sum of two
206
00:10:22,410 --> 00:10:24,990
independent exponential
random variables.
207
00:10:24,990 --> 00:10:28,050
So, in principle, you can use
the convolution formula to
208
00:10:28,050 --> 00:10:32,330
find the distribution of T1
plus T2, and that would be
209
00:10:32,330 --> 00:10:36,750
what we call Y2, the time until
the second arrival.
210
00:10:36,750 --> 00:10:39,210
But there's also a direct
way of obtaining the
211
00:10:39,210 --> 00:10:42,580
distribution of Y2, and this is
the calculation that we did
212
00:10:42,580 --> 00:10:44,340
last time on the blackboard.
213
00:10:44,340 --> 00:10:46,320
And actually, we did
it more generally.
214
00:10:46,320 --> 00:10:49,990
We found the time until the
k-th arrival occurs.
215
00:10:49,990 --> 00:10:53,860
It has a closed form formula,
which is called the Erlang
216
00:10:53,860 --> 00:10:56,960
distribution with k degrees
of freedom.
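[For reference, and not from the lecture itself: the Erlang density with k degrees of freedom is f(y) = lambda^k y^(k-1) e^(-lambda y) / (k-1)!. A crude numerical sketch (grid and step size are my choices) confirming it integrates to 1 with mean k / lambda.]

```python
import math

def erlang_pdf(y, k, lam):
    """Density of Y_k, the time of the k-th arrival in a rate-lam process."""
    return lam ** k * y ** (k - 1) * math.exp(-lam * y) / math.factorial(k - 1)

k, lam, dy = 2, 0.6, 0.001
ys = [i * dy for i in range(1, 60_000)]              # crude Riemann grid
total = sum(erlang_pdf(y, k, lam) * dy for y in ys)
mean = sum(y * erlang_pdf(y, k, lam) * dy for y in ys)
print(total, mean)  # roughly 1 and k / lam = 3.33...
```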
217
00:10:56,960 --> 00:11:00,170
So let's see what's
going on here.
218
00:11:00,170 --> 00:11:03,230
It's a distribution
of what kind?
219
00:11:03,230 --> 00:11:05,210
It's a continuous
distribution.
220
00:11:05,210 --> 00:11:07,150
It's a probability
density function.
221
00:11:07,150 --> 00:11:10,620
This is because the time is a
continuous random variable.
222
00:11:10,620 --> 00:11:11,580
Time is continuous.
223
00:11:11,580 --> 00:11:14,320
Arrivals can happen
at any time.
224
00:11:14,320 --> 00:11:17,090
So we're talking
about the PDF.
225
00:11:17,090 --> 00:11:20,230
This k is just the parameter
of the distribution.
226
00:11:20,230 --> 00:11:22,450
We're talking about the
k-th arrival, so
227
00:11:22,450 --> 00:11:24,210
k is a fixed number.
228
00:11:24,210 --> 00:11:27,440
Lambda is another parameter of
the distribution, which is the
229
00:11:27,440 --> 00:11:32,660
arrival rate. So it's a PDF over
the Y's, whereas lambda
230
00:11:32,660 --> 00:11:36,060
and k are parameters of
the distribution.
231
00:11:36,060 --> 00:11:40,530
232
00:11:40,530 --> 00:11:40,860
OK.
233
00:11:40,860 --> 00:11:45,630
So this was what we knew
from last time.
234
00:11:45,630 --> 00:11:51,550
Just to get some practice, let
us do a problem that's not too
235
00:11:51,550 --> 00:11:55,730
difficult, but just to see how
we use the various formulas
236
00:11:55,730 --> 00:11:57,470
that we have.
237
00:11:57,470 --> 00:12:01,930
So Poisson was a mathematician,
but Poisson
238
00:12:01,930 --> 00:12:04,730
also means fish in French.
239
00:12:04,730 --> 00:12:07,130
So Poisson goes fishing.
240
00:12:07,130 --> 00:12:11,680
And let's assume that fish
are caught according
241
00:12:11,680 --> 00:12:13,420
to a Poisson process.
242
00:12:13,420 --> 00:12:15,310
That's not too bad
an assumption.
243
00:12:15,310 --> 00:12:18,180
At any given point in time, you
have a little probability
244
00:12:18,180 --> 00:12:19,840
that a fish would be caught.
245
00:12:19,840 --> 00:12:22,930
And whether you catch one now
is sort of independent about
246
00:12:22,930 --> 00:12:28,210
whether at some later time a
fish will be caught or not.
247
00:12:28,210 --> 00:12:30,030
So let's just make
this assumption.
248
00:12:30,030 --> 00:12:35,270
And suppose that the rules of
the game are that you--
249
00:12:35,270 --> 00:12:40,350
Fish are being caught at a
certain rate of 0.6 per hour.
250
00:12:40,350 --> 00:12:44,390
You fish for 2 hours,
no matter what.
251
00:12:44,390 --> 00:12:46,190
And then there are two
possibilities.
252
00:12:46,190 --> 00:12:50,710
If I have caught a fish,
I stop and go home.
253
00:12:50,710 --> 00:12:54,320
So if some fish have been
caught, so there's at least 1
254
00:12:54,320 --> 00:12:57,250
arrival during this interval,
I go home.
255
00:12:57,250 --> 00:13:01,760
Or if nothing has been caught,
I continue fishing
256
00:13:01,760 --> 00:13:03,630
until I catch something.
257
00:13:03,630 --> 00:13:05,300
And then I go home.
258
00:13:05,300 --> 00:13:09,410
So that's the description of
what is going to happen.
259
00:13:09,410 --> 00:13:12,940
And now let's start asking
questions of all sorts.
260
00:13:12,940 --> 00:13:16,450
What is the probability that
I'm going to be fishing for
261
00:13:16,450 --> 00:13:19,060
more than 2 hours?
262
00:13:19,060 --> 00:13:23,200
I will be fishing for more than
2 hours, if and only if
263
00:13:23,200 --> 00:13:28,400
no fish were caught during those
2 hours, in which case,
264
00:13:28,400 --> 00:13:30,140
I will have to continue.
265
00:13:30,140 --> 00:13:33,600
Therefore, this is just
this quantity.
266
00:13:33,600 --> 00:13:38,630
The probability of catching
2 fish in--
267
00:13:38,630 --> 00:13:43,450
of catching 0 fish in the next
2 hours, and according to the
268
00:13:43,450 --> 00:13:47,170
formula that we have, this is
going to be e to the minus
269
00:13:47,170 --> 00:13:50,820
lambda times how much
time we have.
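[Plugging in this example's numbers, as a sketch outside the lecture: rate 0.6 per hour over 2 hours.]

```python
import math

lam = 0.6                              # catch rate, per hour
p_no_fish_2h = math.exp(-lam * 2)      # P(0 catches in the first 2 hours)
print(p_no_fish_2h)                    # about 0.30
```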
270
00:13:50,820 --> 00:13:53,040
There's another way of
thinking about this.
271
00:13:53,040 --> 00:13:55,790
The probability that I fish for
more than 2 hours is the
272
00:13:55,790 --> 00:14:01,230
probability that the first catch
happens after time 2,
273
00:14:01,230 --> 00:14:04,990
which would be the integral
from 2 to infinity of the
274
00:14:04,990 --> 00:14:09,610
density of the first
arrival time.
275
00:14:09,610 --> 00:14:11,770
And that density is
an exponential.
276
00:14:11,770 --> 00:14:14,910
So you do the integral of an
exponential, and, of course,
277
00:14:14,910 --> 00:14:17,160
you would get the same answer.
278
00:14:17,160 --> 00:14:17,550
OK.
279
00:14:17,550 --> 00:14:18,730
That's easy.
280
00:14:18,730 --> 00:14:22,880
So what's the probability of
fishing for more than 2 but
281
00:14:22,880 --> 00:14:25,420
less than 5 hours?
282
00:14:25,420 --> 00:14:28,570
What does it take for
this to happen?
283
00:14:28,570 --> 00:14:35,540
For this to happen, we need to
catch 0 fish from time 0 to 2
284
00:14:35,540 --> 00:14:43,020
and catch the first fish
sometime between 2 and 5.
285
00:14:43,020 --> 00:14:44,400
So if you--
286
00:14:44,400 --> 00:14:47,510
one way of thinking about what's
happening here might be
287
00:14:47,510 --> 00:14:49,900
to say that there's a
Poisson process that
288
00:14:49,900 --> 00:14:52,770
keeps going on forever.
289
00:14:52,770 --> 00:14:57,090
But as soon as I catch the
first fish, instead of
290
00:14:57,090 --> 00:15:00,990
continuing fishing and obtaining
those other fish, I
291
00:15:00,990 --> 00:15:04,070
just go home right now.
292
00:15:04,070 --> 00:15:11,060
Now the fact that I go home
before time 5 means that, if I
293
00:15:11,060 --> 00:15:13,990
were to stay until time
5, I would have
294
00:15:13,990 --> 00:15:15,850
caught at least 1 fish.
295
00:15:15,850 --> 00:15:18,350
I might have caught
more than 1.
296
00:15:18,350 --> 00:15:22,970
So the event of interest here
is that the first catch
297
00:15:22,970 --> 00:15:26,560
happens between times 2 and 5.
298
00:15:26,560 --> 00:15:32,050
So one way of calculating
this quantity would be--
299
00:15:32,050 --> 00:15:35,300
It's the probability that the
first catch happens between
300
00:15:35,300 --> 00:15:37,700
times 2 and 5.
301
00:15:37,700 --> 00:15:40,060
Another way to deal with it
is to say, this is the
302
00:15:40,060 --> 00:15:44,880
probability that I caught 0 fish
in the first 2 hours and
303
00:15:44,880 --> 00:15:49,170
then the probability that I
catch at least 1 fish during
304
00:15:49,170 --> 00:15:51,130
the next 3 hours.
305
00:15:51,130 --> 00:15:53,890
306
00:15:53,890 --> 00:15:54,780
This.
307
00:15:54,780 --> 00:15:56,080
What is this?
308
00:15:56,080 --> 00:15:59,180
The probability of 0 fish in
the next 3 hours is the
309
00:15:59,180 --> 00:16:01,600
probability of 0 fish
during this time.
310
00:16:01,600 --> 00:16:04,480
1 minus this is the probability
of catching at
311
00:16:04,480 --> 00:16:07,850
least 1 fish, of having
at least 1 arrival,
312
00:16:07,850 --> 00:16:09,730
between times 2 and 5.
313
00:16:09,730 --> 00:16:13,310
If there's at least 1 arrival
between times 2 and 5, then I
314
00:16:13,310 --> 00:16:17,140
would have gone home
by time 5.
315
00:16:17,140 --> 00:16:20,660
So both of these, if you plug-in
numbers and all that,
316
00:16:20,660 --> 00:16:24,170
of course, are going to give
you the same answer.
317
00:16:24,170 --> 00:16:26,820
Now next, what's the probability
that I catch at
318
00:16:26,820 --> 00:16:29,560
least 2 fish?
319
00:16:29,560 --> 00:16:32,370
In which scenario are we?
320
00:16:32,370 --> 00:16:36,570
Under this scenario, I go home
when I catch my first fish.
321
00:16:36,570 --> 00:16:39,560
So in order to catch
at least 2 fish, it
322
00:16:39,560 --> 00:16:41,340
must be in this case.
323
00:16:41,340 --> 00:16:44,830
So this is the same as the event
that I catch at least 2
324
00:16:44,830 --> 00:16:49,020
fish during the first
2 time steps.
325
00:16:49,020 --> 00:16:52,410
So it's going to be the
probability from 2 to
326
00:16:52,410 --> 00:16:56,780
infinity, the probability that
I catch 2 fish, or that I
327
00:16:56,780 --> 00:17:01,860
catch 3 fish, or I catch
more than that.
328
00:17:01,860 --> 00:17:04,109
So it's this quantity.
329
00:17:04,109 --> 00:17:06,730
k is the number of fish
that I catch.
330
00:17:06,730 --> 00:17:09,599
At least 2, so k goes
from 2 to infinity.
331
00:17:09,599 --> 00:17:13,180
These are the probabilities of
catching a number k of fish
332
00:17:13,180 --> 00:17:14,859
during this interval.
333
00:17:14,859 --> 00:17:17,920
And if you want a simpler form
without an infinite sum, this
334
00:17:17,920 --> 00:17:20,619
would be 1 minus the probability
of catching 0
335
00:17:20,619 --> 00:17:24,880
fish, minus the probability of
catching 1 fish, during a time
336
00:17:24,880 --> 00:17:28,050
interval of length 2.
337
00:17:28,050 --> 00:17:29,520
Another way to think of it.
338
00:17:29,520 --> 00:17:34,230
I'm going to catch 2 fish, at
least 2 fish, if and only if
339
00:17:34,230 --> 00:17:40,630
the second fish caught in this
process happens before time 2.
340
00:17:40,630 --> 00:17:43,950
So that's another way of
thinking about the same event.
341
00:17:43,950 --> 00:17:46,230
So it's going to be the
probability that the random
342
00:17:46,230 --> 00:17:51,440
variable Y2, the arrival time
of the second fish, is less
343
00:17:51,440 --> 00:17:52,690
than or equal to 2.
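[Again both routes can be checked numerically, as a sketch outside the lecture: the complement 1 - P(0) - P(1) over a length-2 interval, against the Erlang CDF of Y2 at 2 (for k = 2 the CDF is 1 - e^(-mu)(1 + mu), with mu = lambda * tau).]

```python
import math

lam, tau = 0.6, 2.0
mu = lam * tau
pmf = lambda k: mu ** k * math.exp(-mu) / math.factorial(k)

route1 = 1 - pmf(0) - pmf(1)                 # at least 2 arrivals in [0, 2]
route2 = 1 - math.exp(-mu) * (1 + mu)        # P(Y2 <= 2), Erlang-2 CDF
print(route1, route2)  # identical
```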
344
00:17:52,690 --> 00:17:55,387
345
00:17:55,387 --> 00:17:56,310
OK.
346
00:17:56,310 --> 00:18:00,000
The next one is a
little trickier.
347
00:18:00,000 --> 00:18:03,380
Here we need to do a little
bit of divide and conquer.
348
00:18:03,380 --> 00:18:06,490
Overall, in this expedition,
what is the expected number of
349
00:18:06,490 --> 00:18:08,840
fish to be caught?
350
00:18:08,840 --> 00:18:11,550
One way to think about it is
to try to use the total
351
00:18:11,550 --> 00:18:13,100
expectation theorem.
352
00:18:13,100 --> 00:18:17,830
And think of expected number of
fish, given this scenario,
353
00:18:17,830 --> 00:18:21,010
or expected number of fish,
given this scenario.
354
00:18:21,010 --> 00:18:24,190
That's a little more complicated
than the way I'm
355
00:18:24,190 --> 00:18:25,290
going to do it.
356
00:18:25,290 --> 00:18:28,240
The way I'm going to do it is
to think as follows--
357
00:18:28,240 --> 00:18:32,310
Expected number of fish is the
expected number of fish caught
358
00:18:32,310 --> 00:18:37,520
between times 0 and 2 plus
expected number of fish caught
359
00:18:37,520 --> 00:18:39,800
after time 2.
360
00:18:39,800 --> 00:18:45,580
So what's the expected number
caught between time 0 and 2?
361
00:18:45,580 --> 00:18:47,860
This is lambda t.
362
00:18:47,860 --> 00:18:52,310
So lambda is 0.6 times 2.
363
00:18:52,310 --> 00:18:55,380
This is the expected number of
fish that are caught between
364
00:18:55,380 --> 00:18:57,260
times 0 and 2.
365
00:18:57,260 --> 00:19:00,440
Now let's think about the
expected number of fish caught
366
00:19:00,440 --> 00:19:01,630
afterwards.
367
00:19:01,630 --> 00:19:04,300
How many fish are being
caught afterwards?
368
00:19:04,300 --> 00:19:06,110
Well it depends on
the scenario.
369
00:19:06,110 --> 00:19:08,750
If we're in this scenario,
we've gone home
370
00:19:08,750 --> 00:19:10,800
and we catch 0.
371
00:19:10,800 --> 00:19:14,570
If we're in this scenario, then
we continue fishing until
372
00:19:14,570 --> 00:19:15,980
we catch one.
373
00:19:15,980 --> 00:19:19,970
So the expected number of fish
to be caught after time 2 is
374
00:19:19,970 --> 00:19:24,520
going to be the probability
of this scenario times 1.
375
00:19:24,520 --> 00:19:29,020
And the probability of that
scenario is the probability
376
00:19:29,020 --> 00:19:33,490
that I caught 0 fish
during the first 2 time steps
377
00:19:33,490 --> 00:19:37,420
times 1, which is the number of
fish I'm going to catch if
378
00:19:37,420 --> 00:19:39,790
I continue.
379
00:19:39,790 --> 00:19:43,960
The expected total fishing time
we can calculate exactly
380
00:19:43,960 --> 00:19:46,150
the same way.
381
00:19:46,150 --> 00:19:47,890
I'm jumping to the last one.
382
00:19:47,890 --> 00:19:51,580
My total fishing time has a
period of 2 time steps.
383
00:19:51,580 --> 00:19:54,910
I'm going to fish for 2 time
steps no matter what.
384
00:19:54,910 --> 00:19:59,190
And then if I caught 0 fish,
which happens with this
385
00:19:59,190 --> 00:20:04,540
probability, my expected time
is going to be the expected
386
00:20:04,540 --> 00:20:08,920
time from here onwards, which is
the expected value of this
387
00:20:08,920 --> 00:20:12,490
exponential random variable
with parameter lambda.
388
00:20:12,490 --> 00:20:15,430
So the expected time
is 1 over lambda.
389
00:20:15,430 --> 00:20:22,460
And in our case, this
is 1/0.6.
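[Putting the two expectation calculations together with the lecture's numbers, as a sketch that simply re-evaluates the formulas stated above.]

```python
import math

lam = 0.6
p0 = math.exp(-lam * 2)          # P(no fish during the first 2 hours)

# E[fish] = (expected fish in [0, 2]) + (one more fish if we had to continue)
expected_fish = lam * 2 + p0 * 1
# E[time] = 2 hours, plus an exponential wait of mean 1/lam if we continue
expected_time = 2 + p0 * (1 / lam)
print(expected_fish, expected_time)
```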
390
00:20:22,460 --> 00:20:31,180
Finally, if I tell you that I
have been fishing for 4 hours
391
00:20:31,180 --> 00:20:37,800
and nothing has been caught so
far, how much do you expect
392
00:20:37,800 --> 00:20:41,630
this quantity to be?
393
00:20:41,630 --> 00:20:46,330
Here the story is, again,
that for the Poisson process
394
00:20:46,330 --> 00:20:48,720
used is as good as new.
395
00:20:48,720 --> 00:20:51,060
The process does not
have any memory.
396
00:20:51,060 --> 00:20:54,930
What happens in the past
doesn't matter for the future.
397
00:20:54,930 --> 00:20:58,430
It's as if the process starts
new at this point in time.
398
00:20:58,430 --> 00:21:02,420
So this one is going to be,
again, the same exponentially
399
00:21:02,420 --> 00:21:04,910
distributed random
variable with the
400
00:21:04,910 --> 00:21:08,270
same parameter lambda.
401
00:21:08,270 --> 00:21:12,270
So the expected time until an
arrival comes is an
402
00:21:12,270 --> 00:21:13,740
exponential distribut --
403
00:21:13,740 --> 00:21:15,910
has an exponential distribution
with parameter
404
00:21:15,910 --> 00:21:19,660
lambda, no matter what has
happened in the past.
405
00:21:19,660 --> 00:21:22,730
Starting from now and looking
into the future, it's as if
406
00:21:22,730 --> 00:21:24,910
the process has just started.
407
00:21:24,910 --> 00:21:32,440
So it's going to be 1 over
lambda, which is 1/0.6.
408
00:21:32,440 --> 00:21:33,690
OK.
409
00:21:33,690 --> 00:21:37,540
410
00:21:37,540 --> 00:21:41,500
Now our next example is going
to be a little more
411
00:21:41,500 --> 00:21:43,780
complicated or subtle.
412
00:21:43,780 --> 00:21:46,800
But before we get to the
example, let's refresh our
413
00:21:46,800 --> 00:21:50,300
memory about what we discussed
last time about merging
414
00:21:50,300 --> 00:21:53,110
independent
Poisson processes.
415
00:21:53,110 --> 00:21:56,090
Instead of drawing the picture
that way, another way we could
416
00:21:56,090 --> 00:21:58,260
draw it could be this.
417
00:21:58,260 --> 00:22:01,260
We have a Poisson process with
rate lambda1, and a Poisson
418
00:22:01,260 --> 00:22:03,440
process with rate lambda2.
419
00:22:03,440 --> 00:22:07,320
Each one of these
has its own arrivals.
420
00:22:07,320 --> 00:22:09,780
And then we form the
merged process.
421
00:22:09,780 --> 00:22:13,580
And the merged process records
an arrival whenever there's an
422
00:22:13,580 --> 00:22:16,930
arrival in either of
the two processes.
423
00:22:16,930 --> 00:22:19,990
424
00:22:19,990 --> 00:22:23,730
This process and that process are
assumed to be independent
425
00:22:23,730 --> 00:22:26,760
of each other.
426
00:22:26,760 --> 00:22:32,590
Now different times in this
process and that process are
427
00:22:32,590 --> 00:22:34,780
independent of each other.
428
00:22:34,780 --> 00:22:39,400
So what happens in these two
time intervals is independent
429
00:22:39,400 --> 00:22:41,780
from what happens in these
two time intervals.
430
00:22:41,780 --> 00:22:45,560
These two time intervals
determine what happens here.
431
00:22:45,560 --> 00:22:48,750
These two time intervals
determine what happens there.
432
00:22:48,750 --> 00:22:53,740
So because these are independent
from these, this
433
00:22:53,740 --> 00:22:56,600
means that this is also
independent from that.
434
00:22:56,600 --> 00:22:59,020
So the independence assumption
is satisfied
435
00:22:59,020 --> 00:23:01,150
for the merged process.
436
00:23:01,150 --> 00:23:05,030
And the merged process turns out
to be a Poisson process.
437
00:23:05,030 --> 00:23:10,340
And if you want to find the
arrival rate for that process,
438
00:23:10,340 --> 00:23:12,550
you argue as follows.
439
00:23:12,550 --> 00:23:15,000
During a little interval of
length delta, we have
440
00:23:15,000 --> 00:23:17,280
probability lambda1
delta of having an
441
00:23:17,280 --> 00:23:18,620
arrival in this process.
442
00:23:18,620 --> 00:23:21,700
We have probability lambda2
delta of an arrival in this
443
00:23:21,700 --> 00:23:24,890
process, plus second
order terms in
444
00:23:24,890 --> 00:23:26,860
delta, which we're ignoring.
445
00:23:26,860 --> 00:23:29,270
And then you do the calculation
and you find that
446
00:23:29,270 --> 00:23:31,870
in this process, you're going
to have an arrival
447
00:23:31,870 --> 00:23:37,830
probability, which is (lambda1
plus lambda2) times delta, again ignoring
448
00:23:37,830 --> 00:23:40,490
second order in delta--
449
00:23:40,490 --> 00:23:42,650
terms that are second
order in delta.
450
00:23:42,650 --> 00:23:46,130
So the merged process is a
Poisson process whose arrival
451
00:23:46,130 --> 00:23:48,760
rate is the sum of the
arrival rates of
452
00:23:48,760 --> 00:23:52,080
the individual processes.
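[The merged-rate claim can be sanity-checked by simulation. A rough sketch, with rates lambda1 = 0.5 and lambda2 = 1.5 chosen arbitrarily:]

```python
import random

random.seed(1)
lam1, lam2 = 0.5, 1.5
horizon = 10_000.0

def poisson_arrival_times(lam, horizon):
    """Arrival times built by summing exponential interarrival times."""
    times, t = [], 0.0
    while True:
        t += random.expovariate(lam)
        if t > horizon:
            return times
        times.append(t)

merged = sorted(poisson_arrival_times(lam1, horizon) +
                poisson_arrival_times(lam2, horizon))

# Merged interarrival times should average 1 / (lam1 + lam2) = 0.5.
gaps = [b - a for a, b in zip(merged, merged[1:])]
print(round(sum(gaps) / len(gaps), 2))
```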
453
00:23:52,080 --> 00:23:55,290
And the calculation we did at
the end of the last lecture--
454
00:23:55,290 --> 00:23:59,240
If I tell you that the new
arrival happened here, where
455
00:23:59,240 --> 00:24:00,610
did that arrival come from?
456
00:24:00,610 --> 00:24:02,910
Did it come from here
or from there?
457
00:24:02,910 --> 00:24:06,720
If the lambda1 is equal to
lambda2, then by symmetry you
458
00:24:06,720 --> 00:24:09,240
would say that it's equally
likely to have come from here
459
00:24:09,240 --> 00:24:10,660
or to come from there.
460
00:24:10,660 --> 00:24:13,720
But if this lambda is much
bigger than that lambda, the
461
00:24:13,720 --> 00:24:16,850
arrival that you saw is more
likely to have come
462
00:24:16,850 --> 00:24:17,850
from there.
463
00:24:17,850 --> 00:24:22,410
And the formula that captures
this is the following.
464
00:24:22,410 --> 00:24:27,300
This is the probability that my
arrival has come from this
465
00:24:27,300 --> 00:24:32,360
particular stream rather than
that particular stream.
466
00:24:32,360 --> 00:24:38,900
So when an arrival comes and you
ask, what is the origin of
467
00:24:38,900 --> 00:24:39,690
that arrival?
468
00:24:39,690 --> 00:24:43,760
It's as if I'm flipping a
coin with these odds.
469
00:24:43,760 --> 00:24:46,910
And depending on the outcome of that
coin, I'm going to tell
470
00:24:46,910 --> 00:24:49,790
you it came from here or
it came from there.
471
00:24:49,790 --> 00:24:53,850
So the origin of an arrival
is either this
472
00:24:53,850 --> 00:24:55,610
stream or that stream.
473
00:24:55,610 --> 00:24:58,190
And this is the probability that
the origin of the arrival
474
00:24:58,190 --> 00:24:59,510
is that one.
475
00:24:59,510 --> 00:25:04,160
Now if we look at 2 different
arrivals, and we ask about
476
00:25:04,160 --> 00:25:05,570
their origins--
477
00:25:05,570 --> 00:25:08,130
So let's think about the origin
of this arrival and
478
00:25:08,130 --> 00:25:12,060
compare it with the origin
that arrival.
479
00:25:12,060 --> 00:25:14,010
The origin of this arrival
is random.
480
00:25:14,010 --> 00:25:16,720
It could be either
this or that.
481
00:25:16,720 --> 00:25:18,840
And this is the relevant
probability.
482
00:25:18,840 --> 00:25:20,750
The origin of that arrival
is random.
483
00:25:20,750 --> 00:25:24,360
It could be either here or
there, and again, with the
484
00:25:24,360 --> 00:25:26,880
same relevant probability.
485
00:25:26,880 --> 00:25:27,730
Question.
486
00:25:27,730 --> 00:25:31,780
The origin of this arrival, is
it dependent or independent
487
00:25:31,780 --> 00:25:34,710
from the origin of that arrival?
488
00:25:34,710 --> 00:25:37,500
And here's how the
argument goes.
489
00:25:37,500 --> 00:25:40,740
Separate times are
independent.
490
00:25:40,740 --> 00:25:45,050
Whatever has happened in the
process during this set of
491
00:25:45,050 --> 00:25:48,040
times is independent from
whatever happened in the
492
00:25:48,040 --> 00:25:50,980
process during that
set of times.
493
00:25:50,980 --> 00:25:55,040
Because different times have
nothing to do with each other,
494
00:25:55,040 --> 00:25:59,650
the origin of this, of an
arrival here, has nothing to
495
00:25:59,650 --> 00:26:02,480
do with the origin of
an arrival there.
496
00:26:02,480 --> 00:26:06,890
So the origins of different
arrivals are also independent
497
00:26:06,890 --> 00:26:08,850
random variables.
498
00:26:08,850 --> 00:26:12,710
So if I tell you that--
499
00:26:12,710 --> 00:26:14,150
yeah.
500
00:26:14,150 --> 00:26:15,310
OK.
501
00:26:15,310 --> 00:26:19,600
So it's as if, each time that
you have an arrival in
502
00:26:19,600 --> 00:26:22,820
the merge process, it's as if
you're flipping a coin to
503
00:26:22,820 --> 00:26:26,410
determine where that arrival
came from, and these
504
00:26:26,410 --> 00:26:31,516
coins are independent
of each other.
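[The odds of that coin flip, lambda1 over lambda1 plus lambda2, can also be checked by racing the two streams. A minimal sketch, with rates 1 and 3 chosen arbitrarily:]

```python
import random

random.seed(2)
lam1, lam2 = 1.0, 3.0
n = 100_000

# The next merged arrival comes from stream 1 exactly when stream 1's
# next interarrival time is smaller than stream 2's.
from_stream_1 = sum(
    random.expovariate(lam1) < random.expovariate(lam2) for _ in range(n)
)
print(round(from_stream_1 / n, 2))  # ≈ lam1 / (lam1 + lam2) = 0.25
```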
505
00:26:31,516 --> 00:26:32,766
OK.
506
00:26:32,766 --> 00:26:35,550
507
00:26:35,550 --> 00:26:35,920
OK.
508
00:26:35,920 --> 00:26:37,770
Now we're going to use this--
509
00:26:37,770 --> 00:26:42,970
what we know about merged
processes to solve the problem
510
00:26:42,970 --> 00:26:48,240
that would be harder to do, if
you were not using ideas from
511
00:26:48,240 --> 00:26:49,720
Poisson processes.
512
00:26:49,720 --> 00:26:52,250
So the formulation of the
problem has nothing to do with
513
00:26:52,250 --> 00:26:54,370
the Poisson process.
514
00:26:54,370 --> 00:26:57,450
The formulation is
the following.
515
00:26:57,450 --> 00:26:59,870
We have 3 light bulbs.
516
00:26:59,870 --> 00:27:03,490
And each light bulb is
independent and is going to
517
00:27:03,490 --> 00:27:07,920
die out at a time that's
exponentially distributed.
518
00:27:07,920 --> 00:27:11,170
So 3 light bulbs.
519
00:27:11,170 --> 00:27:16,630
They start their lives and
then at some point
520
00:27:16,630 --> 00:27:21,260
they die or burn out.
521
00:27:21,260 --> 00:27:26,150
So let's think of this as X,
this as Y, and this as Z.
522
00:27:26,150 --> 00:27:31,220
And we're interested in the
time until the last
523
00:27:31,220 --> 00:27:33,200
light bulb burns out.
524
00:27:33,200 --> 00:27:36,930
So we're interested in the
maximum of the 3 random
525
00:27:36,930 --> 00:27:41,480
variables, X, Y, and Z. And in
particular, we want to find
526
00:27:41,480 --> 00:27:43,170
the expected value
of this maximum.
527
00:27:43,170 --> 00:27:45,770
528
00:27:45,770 --> 00:27:47,490
OK.
529
00:27:47,490 --> 00:27:50,760
So you can do derived
distribution, use the expected
530
00:27:50,760 --> 00:27:52,880
value rule, anything you want.
531
00:27:52,880 --> 00:27:56,230
You can get this answer using
the tools that you already
532
00:27:56,230 --> 00:27:58,180
have in your hands.
533
00:27:58,180 --> 00:28:02,070
But now let us see how we can
connect this picture with a
534
00:28:02,070 --> 00:28:05,550
Poisson picture and come up
with the answer in a very
535
00:28:05,550 --> 00:28:07,240
simple way.
536
00:28:07,240 --> 00:28:09,630
What is an exponential
random variable?
537
00:28:09,630 --> 00:28:14,450
An exponential random variable
is the first act in the long
538
00:28:14,450 --> 00:28:19,570
play that involves a whole
Poisson process.
539
00:28:19,570 --> 00:28:23,020
So an exponential random
variable is the first act of a
540
00:28:23,020 --> 00:28:24,650
Poisson movie.
541
00:28:24,650 --> 00:28:25,660
Same thing here.
542
00:28:25,660 --> 00:28:29,700
You can think of this random
variable as being part of some
543
00:28:29,700 --> 00:28:31,850
Poisson process that
has been running.
544
00:28:31,850 --> 00:28:35,360
545
00:28:35,360 --> 00:28:38,040
So it's part of this
bigger picture.
546
00:28:38,040 --> 00:28:42,370
We're still interested in
the maximum of the 3.
547
00:28:42,370 --> 00:28:45,780
The other arrivals are not going
to affect our answers.
548
00:28:45,780 --> 00:28:49,640
It's just, conceptually
speaking, we can think of the
549
00:28:49,640 --> 00:28:52,840
exponential random variable as
being embedded in a bigger
550
00:28:52,840 --> 00:28:55,110
Poisson picture.
551
00:28:55,110 --> 00:29:00,980
So we have 3 Poisson processes
that are running in parallel.
552
00:29:00,980 --> 00:29:06,150
Let us split the expected time
until the last burnout into
553
00:29:06,150 --> 00:29:09,800
pieces, which is time until the
first burnout, time from
554
00:29:09,800 --> 00:29:11,810
the first until the second,
and time from the
555
00:29:11,810 --> 00:29:13,690
second until the third.
556
00:29:13,690 --> 00:29:16,780
557
00:29:16,780 --> 00:29:20,570
And find the expected values of
each one of these pieces.
558
00:29:20,570 --> 00:29:24,620
What can we say about the
expected value of this?
559
00:29:24,620 --> 00:29:29,310
This is the first arrival
out of all of
560
00:29:29,310 --> 00:29:31,540
these 3 Poisson processes.
561
00:29:31,540 --> 00:29:34,070
It's the first event that
happens when you look at all
562
00:29:34,070 --> 00:29:36,080
of these processes
simultaneously.
563
00:29:36,080 --> 00:29:39,660
So 3 Poisson processes
running in parallel.
564
00:29:39,660 --> 00:29:43,750
We're interested in the time
until one of them, any one of
565
00:29:43,750 --> 00:29:46,380
them, gets an arrival.
566
00:29:46,380 --> 00:29:47,690
Rephrase.
567
00:29:47,690 --> 00:29:51,330
We merged the 3 Poisson
processes, and we ask for the
568
00:29:51,330 --> 00:29:56,820
time until we observe an arrival
in the merged process.
569
00:29:56,820 --> 00:30:01,250
When 1 of the 3 gets an arrival
for the first time,
570
00:30:01,250 --> 00:30:03,880
the merged process gets
its first arrival.
571
00:30:03,880 --> 00:30:06,300
So what's the expected
value of this time
572
00:30:06,300 --> 00:30:08,820
until the first burnout?
573
00:30:08,820 --> 00:30:11,940
It's going to be the
expected value of an
574
00:30:11,940 --> 00:30:13,720
exponential random variable.
575
00:30:13,720 --> 00:30:17,050
So the first burnout is going
to have an expected
576
00:30:17,050 --> 00:30:20,430
value, which is--
577
00:30:20,430 --> 00:30:21,540
OK.
578
00:30:21,540 --> 00:30:23,690
It's a Poisson process.
579
00:30:23,690 --> 00:30:28,530
The merged process of the 3 has
a collective arrival rate,
580
00:30:28,530 --> 00:30:32,750
which is 3 times lambda.
581
00:30:32,750 --> 00:30:36,250
So this is the parameter of
the exponential distribution
582
00:30:36,250 --> 00:30:39,870
that describes the time until
the first arrival in the
583
00:30:39,870 --> 00:30:41,220
merged process.
584
00:30:41,220 --> 00:30:42,990
And the expected value
of this random
585
00:30:42,990 --> 00:30:45,670
variable is 1 over that.
586
00:30:45,670 --> 00:30:48,190
When you have an exponential
random variable with parameter
587
00:30:48,190 --> 00:30:50,150
lambda, the expected value
of that random
588
00:30:50,150 --> 00:30:52,330
variable is 1 over lambda.
589
00:30:52,330 --> 00:30:56,660
Here we're talking about the
first arrival time in a
590
00:30:56,660 --> 00:30:58,720
process with rate 3 lambda.
591
00:30:58,720 --> 00:31:00,680
The expected time until
the first arrival
592
00:31:00,680 --> 00:31:03,000
is 1 over (3 lambda).
593
00:31:03,000 --> 00:31:03,870
Alright.
594
00:31:03,870 --> 00:31:08,710
So at this time, when this
arrival happened, this
595
00:31:08,710 --> 00:31:11,490
bulb has been burned.
596
00:31:11,490 --> 00:31:15,760
So we don't care about
that bulb anymore.
597
00:31:15,760 --> 00:31:21,610
We start at this time,
and we look forward.
598
00:31:21,610 --> 00:31:23,640
This bulb has been burned.
599
00:31:23,640 --> 00:31:27,810
So let's just look forward
from now on.
600
00:31:27,810 --> 00:31:28,900
What have we got?
601
00:31:28,900 --> 00:31:34,030
We have two bulbs that
are burning.
602
00:31:34,030 --> 00:31:37,320
We have a Poisson process that's
the bigger picture of
603
00:31:37,320 --> 00:31:40,270
what could happen to that light
bulb, if we were to keep
604
00:31:40,270 --> 00:31:41,190
replacing it.
605
00:31:41,190 --> 00:31:42,880
Another Poisson process.
606
00:31:42,880 --> 00:31:45,610
These two processes are,
again, independent.
607
00:31:45,610 --> 00:31:50,850
From this time until that time,
how long does it take?
608
00:31:50,850 --> 00:31:53,930
It's the time until either
this process records an
609
00:31:53,930 --> 00:31:57,090
arrival or that process
records and arrival.
610
00:31:57,090 --> 00:32:01,210
That's the same as the time
that the merged process of
611
00:32:01,210 --> 00:32:03,810
these two records an arrival.
612
00:32:03,810 --> 00:32:06,430
So we're talking about the
expected time until the first
613
00:32:06,430 --> 00:32:08,710
arrival in a merged process.
614
00:32:08,710 --> 00:32:11,030
The merged process is Poisson.
615
00:32:11,030 --> 00:32:14,240
It's Poisson with
rate 2 lambda.
616
00:32:14,240 --> 00:32:17,690
So that extra time is
going to take--
617
00:32:17,690 --> 00:32:21,390
the expected value is going to
be 1 over the (rate of that
618
00:32:21,390 --> 00:32:22,580
Poisson process).
619
00:32:22,580 --> 00:32:25,170
So 1 over (2 lambda) is
the expected value
620
00:32:25,170 --> 00:32:26,980
of this random variable.
621
00:32:26,980 --> 00:32:30,870
So at this point, this bulb
now is also burned.
622
00:32:30,870 --> 00:32:33,620
So we start looking
from this time on.
623
00:32:33,620 --> 00:32:37,110
That part of the picture
disappears.
624
00:32:37,110 --> 00:32:40,150
Starting from this time, what's
the expected value
625
00:32:40,150 --> 00:32:43,650
until that remaining light bulb
burns out?
626
00:32:43,650 --> 00:32:47,130
Well, as we said before, in
a Poisson process or with
627
00:32:47,130 --> 00:32:50,090
exponential random variables,
we have memorylessness.
628
00:32:50,090 --> 00:32:53,120
A used bulb is as good
as a new one.
629
00:32:53,120 --> 00:32:55,990
So it's as if we're starting
from scratch here.
630
00:32:55,990 --> 00:32:58,700
So this is going to be an
exponential random variable
631
00:32:58,700 --> 00:33:00,690
with parameter lambda.
632
00:33:00,690 --> 00:33:05,540
And the expected value of it is
going to be 1 over lambda.
633
00:33:05,540 --> 00:33:07,990
So the beauty of approaching
this problem in this
634
00:33:07,990 --> 00:33:10,930
particular way is, of course,
that we manage to do
635
00:33:10,930 --> 00:33:14,100
everything without any calculus
at all, without
636
00:33:14,100 --> 00:33:16,990
striking an integral, without
trying to calculate
637
00:33:16,990 --> 00:33:19,220
expectations in any form.
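[The staged answer, 1/(3 lambda) + 1/(2 lambda) + 1/lambda, can be verified by simulating the maximum directly. A quick sketch, with lambda = 0.6 and a seed of my choosing:]

```python
import random

random.seed(3)
lam = 0.6
n = 200_000

# Average of max(X, Y, Z) over many independent triples of
# exponential(lam) lifetimes.
mean_max = sum(
    max(random.expovariate(lam) for _ in range(3)) for _ in range(n)
) / n

# The lecture's piecewise answer: 1/(3 lam) + 1/(2 lam) + 1/lam.
staged = 1 / (3 * lam) + 1 / (2 * lam) + 1 / lam
print(round(mean_max, 2), round(staged, 2))
```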
638
00:33:19,220 --> 00:33:23,150
Most of the non-trivial problems
that you encounter in
639
00:33:23,150 --> 00:33:28,540
the Poisson world basically
involve tricks of this kind.
640
00:33:28,540 --> 00:33:31,830
You have a question and you try
to rephrase it, trying to
641
00:33:31,830 --> 00:33:35,240
think in terms of what might
happen in the Poisson setting,
642
00:33:35,240 --> 00:33:39,200
use memorylessness, use merging,
et cetera, et cetera.
643
00:33:39,200 --> 00:33:43,360
644
00:33:43,360 --> 00:33:46,080
Now we talked about merging.
645
00:33:46,080 --> 00:33:49,480
It turns out that the splitting
of Poisson processes
646
00:33:49,480 --> 00:33:53,400
also works in a nice way.
647
00:33:53,400 --> 00:33:57,160
The story here is exactly
the same as for
648
00:33:57,160 --> 00:33:58,820
the Bernoulli process.
649
00:33:58,820 --> 00:34:01,870
So I have a Poisson
process.
650
00:34:01,870 --> 00:34:06,060
It has some rate
lambda, and each time that an
651
00:34:06,060 --> 00:34:09,790
arrival comes, I'm going to send
it to this stream and
652
00:34:09,790 --> 00:34:13,179
record an arrival here with some
probability P. And I'm
653
00:34:13,179 --> 00:34:16,120
going to send it to the other
stream with some probability 1
654
00:34:16,120 --> 00:34:19,469
minus P. So either of this
will happen or that will
655
00:34:19,469 --> 00:34:21,940
happen, depending on
the outcome of the
656
00:34:21,940 --> 00:34:23,550
coin flip that I do.
657
00:34:23,550 --> 00:34:27,449
Each time that an arrival
occurs, I flip a coin and I
658
00:34:27,449 --> 00:34:30,929
decide whether to record
it here or there.
659
00:34:30,929 --> 00:34:32,620
This is called splitting
a Poisson
660
00:34:32,620 --> 00:34:34,719
process into two pieces.
661
00:34:34,719 --> 00:34:37,120
What kind of process
do we get here?
662
00:34:37,120 --> 00:34:40,250
If you look at a little
interval of length delta,
663
00:34:40,250 --> 00:34:41,810
what's the probability
that this little
664
00:34:41,810 --> 00:34:44,090
interval gets an arrival?
665
00:34:44,090 --> 00:34:47,739
It's the probability that this
one gets an arrival, which is
666
00:34:47,739 --> 00:34:51,260
lambda delta times the
probability that after I get
667
00:34:51,260 --> 00:34:55,210
an arrival my coin flip came out
to be that way, so that it
668
00:34:55,210 --> 00:34:56,270
sends me there.
669
00:34:56,270 --> 00:34:58,740
So this means that this little
interval is going to have
670
00:34:58,740 --> 00:35:03,620
probability lambda delta P. Or
maybe more suggestively, I
671
00:35:03,620 --> 00:35:09,480
should write it as lambda
P times delta.
672
00:35:09,480 --> 00:35:12,350
So every little interval has
a probability of an arrival
673
00:35:12,350 --> 00:35:13,470
proportional to delta.
674
00:35:13,470 --> 00:35:16,780
The proportionality factor is
lambda P. So lambda P is the
675
00:35:16,780 --> 00:35:18,590
rate of that process.
676
00:35:18,590 --> 00:35:22,500
And then you go through the
mental exercise that you went
677
00:35:22,500 --> 00:35:25,170
through for the Bernoulli
process to argue that a
678
00:35:25,170 --> 00:35:28,520
different intervals here are
independent and so on.
679
00:35:28,520 --> 00:35:31,710
And that completes checking that
this process is going to
680
00:35:31,710 --> 00:35:33,360
be a Poisson process.
681
00:35:33,360 --> 00:35:38,060
So when you split a Poisson
process by doing independent
682
00:35:38,060 --> 00:35:41,040
coin flips each time that
something happens, the
683
00:35:41,040 --> 00:35:44,330
process that you get is again
a Poisson process, but
684
00:35:44,330 --> 00:35:46,490
of course with a reduced rate.
685
00:35:46,490 --> 00:35:50,040
So instead of the word
splitting, sometimes people
686
00:35:50,040 --> 00:35:54,330
also use the words
thinning-out.
687
00:35:54,330 --> 00:35:57,650
That is, out of the arrivals
that came, you keep a few but
688
00:35:57,650 --> 00:35:59,000
throw away a few.
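[Thinning can be checked the same way as merging. A minimal sketch, with lam = 2 and keep-probability p = 0.3 chosen arbitrarily, verifying that the kept stream behaves like Poisson with rate lam times p:]

```python
import random

random.seed(4)
lam, p = 2.0, 0.3
horizon = 10_000.0

# Run a rate-lam Poisson process and keep each arrival with probability p.
t, kept = 0.0, []
while True:
    t += random.expovariate(lam)
    if t > horizon:
        break
    if random.random() < p:  # independent coin flip per arrival
        kept.append(t)

# The kept stream should be Poisson with rate lam * p = 0.6, so its
# mean interarrival time should be about 1 / 0.6 ≈ 1.67.
gaps = [b - a for a, b in zip(kept, kept[1:])]
print(round(sum(gaps) / len(gaps), 2))
```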
689
00:35:59,000 --> 00:36:01,820
690
00:36:01,820 --> 00:36:02,730
OK.
691
00:36:02,730 --> 00:36:08,570
So now the last topic of
this lecture is a quite
692
00:36:08,570 --> 00:36:11,270
curious phenomenon that
goes under the
693
00:36:11,270 --> 00:36:12,595
name of random incidence.
694
00:36:12,595 --> 00:36:15,550
695
00:36:15,550 --> 00:36:18,950
So here's the story.
696
00:36:18,950 --> 00:36:22,550
Buses have been running
on Mass Ave. from time
697
00:36:22,550 --> 00:36:24,070
immemorial.
698
00:36:24,070 --> 00:36:29,060
And the bus company that runs
the buses claims that they
699
00:36:29,060 --> 00:36:33,150
come as a Poisson process with
some rate, let's say, of 4
700
00:36:33,150 --> 00:36:34,970
buses per hour.
701
00:36:34,970 --> 00:36:39,250
So that the expected time
between bus arrivals is going
702
00:36:39,250 --> 00:36:42,500
to be 15 minutes.
703
00:36:42,500 --> 00:36:45,180
OK.
704
00:36:45,180 --> 00:36:45,840
Alright.
705
00:36:45,840 --> 00:36:48,130
So people have been complaining
that they have
706
00:36:48,130 --> 00:36:49,150
been showing up there.
707
00:36:49,150 --> 00:36:51,500
They think the buses are
taking too long.
708
00:36:51,500 --> 00:36:54,270
So you are asked
to investigate.
709
00:36:54,270 --> 00:36:56,840
Is the company--
710
00:36:56,840 --> 00:37:00,730
Does it operate according
to its promises or not?
711
00:37:00,730 --> 00:37:05,880
So you send an undercover agent
to go and check the
712
00:37:05,880 --> 00:37:07,940
interarrival times
of the buses.
713
00:37:07,940 --> 00:37:09,660
Are they 15 minutes?
714
00:37:09,660 --> 00:37:11,690
Or are they longer?
715
00:37:11,690 --> 00:37:17,660
So you put on your dark glasses
and you show up at the bus
716
00:37:17,660 --> 00:37:21,110
stop at some random time.
717
00:37:21,110 --> 00:37:25,530
And you go and ask the guy in
the falafel truck, how long
718
00:37:25,530 --> 00:37:28,370
has it been since the
last arrival?
719
00:37:28,370 --> 00:37:31,310
So of course that guy works
for the FBI, right?
720
00:37:31,310 --> 00:37:36,900
So they tell you, well, it's
been, let's say, 12 minutes
721
00:37:36,900 --> 00:37:39,360
since the last bus arrival.
722
00:37:39,360 --> 00:37:40,960
And then you say,
"Oh, 12 minutes.
723
00:37:40,960 --> 00:37:42,780
Average time is 15.
724
00:37:42,780 --> 00:37:47,000
So a bus should be coming
any time now."
725
00:37:47,000 --> 00:37:48,230
Is that correct?
726
00:37:48,230 --> 00:37:49,660
No, you wouldn't
think that way.
727
00:37:49,660 --> 00:37:51,010
It's a Poisson process.
728
00:37:51,010 --> 00:37:53,810
It doesn't matter how long
it has been since
729
00:37:53,810 --> 00:37:55,270
the last bus arrival.
730
00:37:55,270 --> 00:37:56,920
So you don't go through
that fallacy.
731
00:37:56,920 --> 00:37:59,970
Instead of predicting how long
it's going to be, you just sit
732
00:37:59,970 --> 00:38:03,300
down there and wait and
measure the time.
733
00:38:03,300 --> 00:38:08,820
And you find that this is,
let's say, 11 minutes.
734
00:38:08,820 --> 00:38:13,260
And you go to your boss and
report, "Well, it took--
735
00:38:13,260 --> 00:38:16,410
I went there and the time from
the previous bus to the next
736
00:38:16,410 --> 00:38:18,310
one was 23 minutes.
737
00:38:18,310 --> 00:38:20,360
It's more than the 15
that they said."
738
00:38:20,360 --> 00:38:21,830
So go and do that again.
739
00:38:21,830 --> 00:38:23,590
You go day after day.
740
00:38:23,590 --> 00:38:28,350
You keep these statistics of the
length of this interval.
741
00:38:28,350 --> 00:38:32,160
And you tell your boss it's
a lot more than 15.
742
00:38:32,160 --> 00:38:36,720
It tends to be more
like 30 or so.
743
00:38:36,720 --> 00:38:39,170
So the bus company
is cheating us.
744
00:38:39,170 --> 00:38:43,490
Does the bus company really run
Poisson buses at the rate
745
00:38:43,490 --> 00:38:46,490
that they have promised?
746
00:38:46,490 --> 00:38:51,270
Well let's analyze the situation
here and figure out
747
00:38:51,270 --> 00:38:55,010
what the length of
this interval
748
00:38:55,010 --> 00:38:57,900
should be, on the average.
749
00:38:57,900 --> 00:39:01,120
The naive argument is that
this interval is an
750
00:39:01,120 --> 00:39:02,590
interarrival time.
751
00:39:02,590 --> 00:39:06,410
And interarrival times, on the
average, are 15 minutes, if
752
00:39:06,410 --> 00:39:10,610
the company indeed runs Poisson
processes with these
753
00:39:10,610 --> 00:39:11,850
interarrival times.
754
00:39:11,850 --> 00:39:14,970
But actually the situation is
a little more subtle because
755
00:39:14,970 --> 00:39:19,940
this is not a typical
interarrival interval.
756
00:39:19,940 --> 00:39:23,440
This interarrival interval
consists of two pieces.
757
00:39:23,440 --> 00:39:28,810
Let's call them T1
and T1 prime.
758
00:39:28,810 --> 00:39:32,250
What can you tell me about those
two random variables?
759
00:39:32,250 --> 00:39:35,940
What kind of random
variable is T1?
760
00:39:35,940 --> 00:39:39,950
Starting from this time, with
the Poisson process, the past
761
00:39:39,950 --> 00:39:41,290
doesn't matter.
762
00:39:41,290 --> 00:39:43,870
It's the time until an
arrival happens.
763
00:39:43,870 --> 00:39:49,110
So T1 is going to be an
exponential random variable
764
00:39:49,110 --> 00:39:50,425
with parameter lambda.
765
00:39:50,425 --> 00:39:53,300
766
00:39:53,300 --> 00:39:56,620
So in particular, the expected
value of T1 is
767
00:39:56,620 --> 00:40:00,260
going to be 15 by itself.
768
00:40:00,260 --> 00:40:02,720
How about the random
variable T1 prime?
769
00:40:02,720 --> 00:40:07,130
What kind of random
variable is it?
770
00:40:07,130 --> 00:40:14,180
This is like the first arrival
in a Poisson process that runs
771
00:40:14,180 --> 00:40:17,650
backwards in time.
772
00:40:17,650 --> 00:40:20,330
What kind of process is a
Poisson process running
773
00:40:20,330 --> 00:40:21,200
backwards in time?
774
00:40:21,200 --> 00:40:23,030
Let's think of coin flips.
775
00:40:23,030 --> 00:40:26,130
Suppose you have a movie
of coin flips.
776
00:40:26,130 --> 00:40:29,480
And for some accident, that
fascinating movie, you happen
777
00:40:29,480 --> 00:40:31,100
to watch it backwards.
778
00:40:31,100 --> 00:40:33,610
Will it look any different
statistically?
779
00:40:33,610 --> 00:40:33,780
No.
780
00:40:33,780 --> 00:40:36,940
It's going to be just the
sequence of random coin flips.
781
00:40:36,940 --> 00:40:40,770
So a Bernoulli process that
runs in reverse time is
782
00:40:40,770 --> 00:40:42,410
statistically identical
to a Bernoulli
783
00:40:42,410 --> 00:40:44,290
process in forward time.
784
00:40:44,290 --> 00:40:46,600
The Poisson process is a
limit of the Bernoulli.
785
00:40:46,600 --> 00:40:48,950
So, same story with the
Poisson process.
786
00:40:48,950 --> 00:40:51,410
If you run it backwards in
time it looks the same.
787
00:40:51,410 --> 00:40:55,190
So looking backwards in time,
this is a Poisson process.
788
00:40:55,190 --> 00:40:58,930
And T1 prime is the time until
the first arrival in this
789
00:40:58,930 --> 00:41:00,260
backward process.
790
00:41:00,260 --> 00:41:04,910
So T1 prime is also going to
be an exponential random
791
00:41:04,910 --> 00:41:07,340
variable with the same
parameter, lambda.
792
00:41:07,340 --> 00:41:11,000
And the expected value
of T1 prime is 15.
793
00:41:11,000 --> 00:41:15,860
Conclusion is that the expected
length of this
794
00:41:15,860 --> 00:41:22,860
interval is going to
be 30 minutes.
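[This random incidence effect is easy to reproduce: simulate the buses, drop in at a uniformly random time, and measure the interval you land in. A sketch, with lambda = 1/15 per minute matching the lecture's 4 buses per hour:]

```python
import bisect
import random

random.seed(5)
lam = 1 / 15  # 4 buses per hour = one bus per 15 minutes on average
horizon = 1_000_000.0

# Simulate bus arrival times over a long horizon.
arrivals, t = [0.0], 0.0
while t < horizon:
    t += random.expovariate(lam)
    arrivals.append(t)

# Show up at uniformly random times and measure the full interarrival
# interval containing each visit.
observed = []
for _ in range(50_000):
    u = random.uniform(0.0, horizon)
    i = bisect.bisect(arrivals, u)
    observed.append(arrivals[i] - arrivals[i - 1])

print(round(sum(observed) / len(observed)))  # ≈ 30 minutes, not 15
```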
795
00:41:22,860 --> 00:41:26,690
And the fact that this agent
found the average to be
796
00:41:26,690 --> 00:41:31,230
something like 30 does not
contradict the claims of the
797
00:41:31,230 --> 00:41:35,010
bus company that they're running
Poisson buses with a
798
00:41:35,010 --> 00:41:38,370
rate of lambda equal to 4.
799
00:41:38,370 --> 00:41:38,780
OK.
800
00:41:38,780 --> 00:41:43,390
So maybe this way the
company can defend
801
00:41:43,390 --> 00:41:44,970
themselves in court.
802
00:41:44,970 --> 00:41:47,490
But there's something
puzzling here.
803
00:41:47,490 --> 00:41:50,360
How long is the interarrival
time?
804
00:41:50,360 --> 00:41:51,910
Is it 15?
805
00:41:51,910 --> 00:41:53,216
Or is it 30?
806
00:41:53,216 --> 00:41:55,750
On the average.
807
00:41:55,750 --> 00:41:59,960
The issue is what do we
mean by a typical
808
00:41:59,960 --> 00:42:01,360
interarrival time.
809
00:42:01,360 --> 00:42:04,940
When we say typical, we mean
some kind of average.
810
00:42:04,940 --> 00:42:08,690
But average over what?
811
00:42:08,690 --> 00:42:13,280
And here's two different ways
of thinking about averages.
812
00:42:13,280 --> 00:42:15,080
You number the buses.
813
00:42:15,080 --> 00:42:17,120
And you have bus number 100.
814
00:42:17,120 --> 00:42:21,120
You have bus number 101,
bus number 102, bus
815
00:42:21,120 --> 00:42:24,660
number 110, and so on.
816
00:42:24,660 --> 00:42:29,370
One way of thinking about
averages is that you pick a
817
00:42:29,370 --> 00:42:32,150
bus number at random.
818
00:42:32,150 --> 00:42:36,070
I pick, let's say, that bus,
all buses being sort of
819
00:42:36,070 --> 00:42:37,760
equally likely to be picked.
820
00:42:37,760 --> 00:42:41,610
And I measure this interarrival
time.
821
00:42:41,610 --> 00:42:45,380
So for a typical bus.
822
00:42:45,380 --> 00:42:50,390
Then, starting from here until
there, the expected time has
823
00:42:50,390 --> 00:42:56,600
to be 1 over lambda, for
the Poisson process.
824
00:42:56,600 --> 00:42:58,370
But what we did in
this experiment
825
00:42:58,370 --> 00:42:59,720
was something different.
826
00:42:59,720 --> 00:43:02,040
We didn't pick a
bus at random.
827
00:43:02,040 --> 00:43:05,090
We picked a time at random.
828
00:43:05,090 --> 00:43:08,870
And if the picture is, let's
say, this way, I'm much more
829
00:43:08,870 --> 00:43:12,770
likely to pick this interval
and therefore this
830
00:43:12,770 --> 00:43:16,290
interarrival time, rather
than that interval.
831
00:43:16,290 --> 00:43:20,480
Because this interval
corresponds to very few times.
832
00:43:20,480 --> 00:43:23,680
So if I'm picking a time at
random and, in some sense,
833
00:43:23,680 --> 00:43:27,430
let's say, uniform, so that all
times are equally likely,
834
00:43:27,430 --> 00:43:31,190
I'm much more likely to fall
inside a big interval rather
835
00:43:31,190 --> 00:43:32,710
than a small interval.
836
00:43:32,710 --> 00:43:37,140
So a person who shows up at the
bus stop at a random time.
837
00:43:37,140 --> 00:43:42,040
They're selecting an interval in
a biased way, with the bias
838
00:43:42,040 --> 00:43:44,350
in favor of longer intervals.
839
00:43:44,350 --> 00:43:47,850
And that's why what they observe
is a random variable
840
00:43:47,850 --> 00:43:51,830
that has a larger expected
value than the ordinary
841
00:43:51,830 --> 00:43:53,080
expected value.
842
00:43:53,080 --> 00:43:56,780
So the subtlety here is to
realize that we're talking
843
00:43:56,780 --> 00:43:59,590
about two different kinds
of experiments.
844
00:43:59,590 --> 00:44:05,250
Picking a bus number at random
versus picking an interval at
845
00:44:05,250 --> 00:44:11,500
random with a bias in favor
of longer intervals.
846
00:44:11,500 --> 00:44:14,840
Lots of paradoxes that one
can cook up using Poisson
847
00:44:14,840 --> 00:44:19,190
processes and random processes
in general often have to do
848
00:44:19,190 --> 00:44:21,340
with a story of this kind.
849
00:44:21,340 --> 00:44:24,780
The phenomenon that we had in
this particular example also
850
00:44:24,780 --> 00:44:28,970
shows up in general, whenever
you have other kinds of
851
00:44:28,970 --> 00:44:30,470
arrival processes.
852
00:44:30,470 --> 00:44:34,210
So the Poisson process is the
simplest arrival process there
853
00:44:34,210 --> 00:44:36,830
is, where the interarrival
times are
854
00:44:36,830 --> 00:44:38,820
exponential random variables.
855
00:44:38,820 --> 00:44:40,280
There's a larger class
of models.
856
00:44:40,280 --> 00:44:43,580
They're called renewal
processes, in which, again, we
857
00:44:43,580 --> 00:44:46,900
have a sequence of successive
arrivals, interarrival times
858
00:44:46,900 --> 00:44:50,100
are identically distributed and
independent, but they may
859
00:44:50,100 --> 00:44:52,320
come from a general
distribution.
860
00:44:52,320 --> 00:44:55,100
So to make the same point as in the
previous example, but in a
861
00:44:55,100 --> 00:44:59,250
much simpler setting, suppose
that bus interarrival times
862
00:44:59,250 --> 00:45:02,830
are either 5 or 10
minutes apart.
863
00:45:02,830 --> 00:45:05,930
So you get some intervals
that are of length 5.
864
00:45:05,930 --> 00:45:08,790
You get some that are
of length 10.
865
00:45:08,790 --> 00:45:12,810
And suppose that these
are equally likely.
866
00:45:12,810 --> 00:45:16,990
So we have -- not exactly --
867
00:45:16,990 --> 00:45:20,380
In the long run, we have as many
5 minute intervals as we
868
00:45:20,380 --> 00:45:22,490
have 10 minute intervals.
869
00:45:22,490 --> 00:45:30,590
So the average interarrival
time is 7 and 1/2.
870
00:45:30,590 --> 00:45:35,850
But if a person shows up at a
random time, what are they
871
00:45:35,850 --> 00:45:37,100
going to see?
872
00:45:37,100 --> 00:45:40,520
873
00:45:40,520 --> 00:45:43,150
Well, we have as many 5s as 10s.
874
00:45:43,150 --> 00:45:47,490
But every 10 covers twice
as much space.
875
00:45:47,490 --> 00:45:52,640
So if I show up at a random
time, I have probability 2/3
876
00:45:52,640 --> 00:45:57,180
of falling inside an interval
of duration 10.
877
00:45:57,180 --> 00:46:00,990
And I have a 1/3 probability
of falling inside an interval
878
00:46:00,990 --> 00:46:02,460
of duration 5.
879
00:46:02,460 --> 00:46:06,710
That's because, out of the whole
real line, 2/3 of it is
880
00:46:06,710 --> 00:46:08,810
covered by intervals
of length 10, just
881
00:46:08,810 --> 00:46:09,590
because they're longer.
882
00:46:09,590 --> 00:46:12,280
1/3 is covered by the
smaller intervals.
883
00:46:12,280 --> 00:46:19,530
Now if I fall inside an interval
of length 10 and I
884
00:46:19,530 --> 00:46:23,260
measure the length of the
interval that I fell into,
885
00:46:23,260 --> 00:46:25,030
that's going to be 10.
886
00:46:25,030 --> 00:46:27,780
But if I fall inside an interval
of length 5 and I
887
00:46:27,780 --> 00:46:30,320
measure how long it is,
I'm going to get a 5.
888
00:46:30,320 --> 00:46:37,270
And that, of course, is going
to be different than 7.5.
889
00:46:37,270 --> 00:46:38,010
OK.
890
00:46:38,010 --> 00:46:42,310
And which number should
be bigger?
891
00:46:42,310 --> 00:46:45,110
It's the second number that's
bigger because this one is
892
00:46:45,110 --> 00:46:48,930
biased in favor of the
longer intervals.
893
00:46:48,930 --> 00:46:51,380
So that's, again, another
illustration of the different
894
00:46:51,380 --> 00:46:54,640
results that you get when you
have this random incidence
895
00:46:54,640 --> 00:46:55,990
phenomenon.
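As a quick sanity check (this simulation is our own sketch, not part of the lecture), we can confirm both numbers for the 5-or-10-minute bus example: the plain average interarrival time of 7.5, and the length-biased average (2/3)·10 + (1/3)·5 = 25/3 ≈ 8.33 seen by a rider who shows up at a uniformly random time.

```python
import bisect
import itertools
import random

rng = random.Random(0)

# Interarrival times: 5 or 10 minutes, each with probability 1/2.
gaps = [rng.choice([5, 10]) for _ in range(100_000)]
arrivals = list(itertools.accumulate(gaps))  # bus arrival times

# Average over bus numbers: (5 + 10)/2 = 7.5 in the long run.
plain_avg = sum(gaps) / len(gaps)

def observed_gap():
    """Length of the interval containing a uniformly random time.

    Longer intervals cover more of the timeline, so they are picked
    with probability proportional to their length.
    """
    t = rng.uniform(0, arrivals[-1])
    i = min(bisect.bisect_right(arrivals, t), len(gaps) - 1)
    return gaps[i]

samples = [observed_gap() for _ in range(5_000)]
biased_avg = sum(samples) / len(samples)

print(round(plain_avg, 2))   # close to 7.5
print(round(biased_avg, 2))  # close to 25/3, i.e. about 8.33
```

The rider's estimate comes out near 8.33, not 7.5, precisely because the 10-minute intervals are hit twice as often as the 5-minute ones.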
896
00:46:55,990 --> 00:46:59,320
So the bottom line, again, is
that if you talk about a
897
00:46:59,320 --> 00:47:03,380
typical interarrival time, one
must be very precise in
898
00:47:03,380 --> 00:47:05,370
specifying what we
mean by typical.
899
00:47:05,370 --> 00:47:08,120
So typical means
sort of random.
900
00:47:08,120 --> 00:47:11,250
But to use the word random,
you must specify very
901
00:47:11,250 --> 00:47:15,070
precisely what is the random
experiment that you are using.
902
00:47:15,070 --> 00:47:18,920
And if you're not careful, you
can get into apparent puzzles,
903
00:47:18,920 --> 00:47:20,770
such as the following.
904
00:47:20,770 --> 00:47:25,170
Suppose somebody tells you the
average family size is 4, but
905
00:47:25,170 --> 00:47:30,340
the average person lives
in a family of size 6.
906
00:47:30,340 --> 00:47:33,330
Is that compatible?
907
00:47:33,330 --> 00:47:36,610
Family size is 4 on the average,
but typical people
908
00:47:36,610 --> 00:47:40,110
live, on the average, in
families of size 6.
909
00:47:40,110 --> 00:47:41,590
Well yes.
910
00:47:41,590 --> 00:47:43,080
There's no contradiction here.
911
00:47:43,080 --> 00:47:45,450
We're talking about two
different experiments.
912
00:47:45,450 --> 00:47:50,000
In one experiment, I pick a
family at random, and I tell
913
00:47:50,000 --> 00:47:51,960
you the average family is 4.
914
00:47:51,960 --> 00:47:55,910
In another experiment, I pick a
person at random and I tell
915
00:47:55,910 --> 00:47:58,310
you that this person, on the
average, will be in a
916
00:47:58,310 --> 00:48:00,080
family of size 6.
917
00:48:00,080 --> 00:48:01,140
And what is the catch here?
918
00:48:01,140 --> 00:48:05,440
That if I pick a person at
random, large families are
919
00:48:05,440 --> 00:48:08,160
more likely to be picked.
920
00:48:08,160 --> 00:48:11,710
So there's a bias in favor
of large families.
921
00:48:11,710 --> 00:48:15,270
Or if you want to survey, let's
say, are trains crowded
922
00:48:15,270 --> 00:48:16,495
in your city?
923
00:48:16,495 --> 00:48:19,170
Or are buses crowded?
924
00:48:19,170 --> 00:48:22,040
One choice is to pick a bus
at random and inspect
925
00:48:22,040 --> 00:48:23,220
how crowded it is.
926
00:48:23,220 --> 00:48:27,260
Another choice is to pick a
typical person and ask them,
927
00:48:27,260 --> 00:48:29,080
"Did you ride the bus today?
928
00:48:29,080 --> 00:48:33,500
Was it crowded?" Well suppose
that in this city
929
00:48:33,500 --> 00:48:36,265
there's one bus that's extremely
crowded and all the
930
00:48:36,265 --> 00:48:38,520
other buses are completely
empty.
931
00:48:38,520 --> 00:48:42,300
If you ask a person, "Was your
bus crowded?" they will tell
932
00:48:42,300 --> 00:48:46,040
you, "Yes, my bus was crowded."
There's no witness
933
00:48:46,040 --> 00:48:49,460
from the empty buses to testify
in their favor.
934
00:48:49,460 --> 00:48:52,780
So by sampling people instead
of sampling buses, you're
935
00:48:52,780 --> 00:48:54,940
going to get a different result.
936
00:48:54,940 --> 00:48:58,320
And in the process industry, if
your job is to inspect and
937
00:48:58,320 --> 00:49:01,450
check cookies, you will be
faced with a big dilemma.
938
00:49:01,450 --> 00:49:05,190
Do you want to find out how many
chocolate chips there are
939
00:49:05,190 --> 00:49:06,940
on a typical cookie?
940
00:49:06,940 --> 00:49:09,990
Are you going to interview
cookies or are you going to
941
00:49:09,990 --> 00:49:13,880
interview chocolate chips and
ask them how many other chips
942
00:49:13,880 --> 00:49:16,520
were there on your cookie?
943
00:49:16,520 --> 00:49:18,020
And you're going to
get different
944
00:49:18,020 --> 00:49:19,210
answers in these cases.
945
00:49:19,210 --> 00:49:22,670
So the moral is, one has to be
very precise about how you
946
00:49:22,670 --> 00:49:26,160
formulate the sampling procedure
that you have.
947
00:49:26,160 --> 00:49:28,330
And you'll get different
answers.