WEBVTT
00:00:00.040 --> 00:00:02.460
The following content is
provided under a Creative
00:00:02.460 --> 00:00:03.870
Commons license.
00:00:03.870 --> 00:00:06.910
Your support will help MIT
OpenCourseWare continue to
00:00:06.910 --> 00:00:10.560
offer high quality educational
resources for free.
00:00:10.560 --> 00:00:13.460
To make a donation or view
additional materials from
00:00:13.460 --> 00:00:17.390
hundreds of MIT courses, visit
MIT OpenCourseWare at
00:00:17.390 --> 00:00:18.640
ocw.mit.edu.
00:00:22.710 --> 00:00:26.630
PROFESSOR: So by now you have
seen pretty much every
00:00:26.630 --> 00:00:30.273
possible trick there is in
basic probability theory,
00:00:30.273 --> 00:00:33.100
about how to calculate
distributions, and so on.
00:00:33.100 --> 00:00:37.230
You have the basic tools to
do pretty much anything.
00:00:37.230 --> 00:00:40.610
So what's coming after this?
00:00:40.610 --> 00:00:45.370
Well, probability is useful for
developing the science of
00:00:45.370 --> 00:00:48.120
inference, and this is a subject
to which we're going
00:00:48.120 --> 00:00:51.280
to come back at the end
of the semester.
00:00:51.280 --> 00:00:55.240
Another chapter, which is what
we will be doing over the next
00:00:55.240 --> 00:00:59.210
few weeks, is to deal with
phenomena that evolve in time.
00:00:59.210 --> 00:01:03.300
So so-called random processes
or stochastic processes.
00:01:03.300 --> 00:01:05.069
So what is this about?
00:01:05.069 --> 00:01:08.100
So in the real world, you don't
just throw two random
00:01:08.100 --> 00:01:09.520
variables and go home.
00:01:09.520 --> 00:01:11.410
Rather the world goes on.
00:01:11.410 --> 00:01:14.880
So you generate the random
variable, then you get more
00:01:14.880 --> 00:01:18.100
random variables, and things
evolve in time.
00:01:18.100 --> 00:01:21.560
And random processes are
supposed to be models that
00:01:21.560 --> 00:01:25.680
capture the evolution of random
phenomena over time.
00:01:25.680 --> 00:01:27.620
So that's what we
will be doing.
00:01:27.620 --> 00:01:31.230
Now when we have evolution in
time, mathematically speaking,
00:01:31.230 --> 00:01:34.840
you can use discrete time
or continuous time.
00:01:34.840 --> 00:01:36.800
Of course, discrete
time is easier.
00:01:36.800 --> 00:01:39.280
And that's where we're
going to start from.
00:01:39.280 --> 00:01:43.000
And we're going to start from
the easiest, simplest random
00:01:43.000 --> 00:01:46.740
process, which is the so-called
Bernoulli process,
00:01:46.740 --> 00:01:50.250
which is nothing but a
sequence of coin flips.
00:01:50.250 --> 00:01:54.650
You keep flipping a coin
and keep going forever.
00:01:54.650 --> 00:01:56.380
That's what the Bernoulli
process is.
00:01:56.380 --> 00:01:58.290
So in some sense it's something
that you have
00:01:58.290 --> 00:01:59.160
already seen.
00:01:59.160 --> 00:02:03.180
But we're going to introduce a
few additional ideas here that
00:02:03.180 --> 00:02:07.460
will be useful and relevant as
we go along and we move on to
00:02:07.460 --> 00:02:09.949
continuous time processes.
00:02:09.949 --> 00:02:12.640
So we're going to define the
Bernoulli process, talk about
00:02:12.640 --> 00:02:17.800
some basic properties that the
process has, and derive a few
00:02:17.800 --> 00:02:21.440
formulas, and exploit the
special structure that it has
00:02:21.440 --> 00:02:24.930
to do a few quite interesting
things.
00:02:24.930 --> 00:02:29.550
By the way, where does the
word Bernoulli come from?
00:02:29.550 --> 00:02:33.010
Well the Bernoulli's were a
family of mathematicians,
00:02:33.010 --> 00:02:37.200
Swiss mathematicians and
scientists around the 1700s.
00:02:37.200 --> 00:02:39.680
There were so many of
them that actually--
00:02:39.680 --> 00:02:42.440
and some of them had the
same first name--
00:02:42.440 --> 00:02:46.920
historians even have difficulty
figuring out who
00:02:46.920 --> 00:02:48.660
exactly did what.
00:02:48.660 --> 00:02:51.430
But in any case, you can imagine
that at the dinner
00:02:51.430 --> 00:02:53.710
table they were probably
flipping coins and doing
00:02:53.710 --> 00:02:55.570
Bernoulli trials.
00:02:55.570 --> 00:02:58.290
So maybe that was
their pastime.
00:02:58.290 --> 00:02:58.610
OK.
00:02:58.610 --> 00:03:02.160
So what is the Bernoulli
process?
00:03:02.160 --> 00:03:05.750
The Bernoulli process is nothing
but a sequence of
00:03:05.750 --> 00:03:08.700
independent Bernoulli
trials that you can
00:03:08.700 --> 00:03:11.120
think of as coin flips.
00:03:11.120 --> 00:03:13.530
So you can think of the result
of each trial as
00:03:13.530 --> 00:03:15.350
being heads or tails.
00:03:15.350 --> 00:03:19.370
It's a little more convenient
maybe to talk about successes
00:03:19.370 --> 00:03:21.690
and failures instead
of heads or tails.
00:03:21.690 --> 00:03:24.470
Or if you wish numerical values,
to use a 1 for a
00:03:24.470 --> 00:03:27.330
success and 0 for a failure.
00:03:27.330 --> 00:03:30.820
So the model is that each one
of these trials has the same
00:03:30.820 --> 00:03:34.240
probability of success, p.
00:03:34.240 --> 00:03:37.050
And the other assumption is
that these trials are
00:03:37.050 --> 00:03:40.880
statistically independent
of each other.
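As a rough illustration of the model just described, the process can be simulated in a few lines. This is my own sketch (Python is my choice, not something from the lecture, and the function name `bernoulli_process` is made up): each slot is a success with the same probability p, independently of all the others.

```python
import random

def bernoulli_process(p, n, seed=None):
    """Simulate n independent Bernoulli(p) trials: 1 = success, 0 = failure."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

# Every slot has the same success probability p, independent of all others.
flips = bernoulli_process(p=0.5, n=20, seed=1)
```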
00:03:40.880 --> 00:03:43.520
So what could be some examples
of Bernoulli trials?
00:03:43.520 --> 00:03:48.010
You buy a lottery ticket every
week and you win or lose.
00:03:48.010 --> 00:03:50.080
Presumably, these are
independent of each other.
00:03:50.080 --> 00:03:53.240
And if it's the same kind of
lottery, the probability of
00:03:53.240 --> 00:03:55.930
winning should be the same
during every week.
00:03:55.930 --> 00:03:58.840
Maybe you want to model
the financial markets.
00:03:58.840 --> 00:04:02.810
And a crude model could be that
on any given day the Dow
00:04:02.810 --> 00:04:05.860
Jones is going to go up
or down with a certain
00:04:05.860 --> 00:04:07.300
probability.
00:04:07.300 --> 00:04:11.770
Well that probability must be
somewhere around 0.5, or so.
00:04:11.770 --> 00:04:14.890
This is a crude model of
financial markets.
00:04:14.890 --> 00:04:17.910
You say, probably there
is more to them.
00:04:17.910 --> 00:04:19.670
Life is not that simple.
00:04:19.670 --> 00:04:23.770
But actually it's a pretty
reasonable model.
00:04:23.770 --> 00:04:26.260
It takes quite a bit of work
to come up with more
00:04:26.260 --> 00:04:29.590
sophisticated models that can
do better predictions than
00:04:29.590 --> 00:04:32.150
just pure heads and tails.
00:04:32.150 --> 00:04:36.270
Now, more relevant perhaps
to the examples we will be
00:04:36.270 --> 00:04:37.975
dealing with in this class--
00:04:37.975 --> 00:04:43.760
a Bernoulli process is a good
model for streams of arrivals
00:04:43.760 --> 00:04:45.940
of any kind to a facility.
00:04:45.940 --> 00:04:49.790
So it could be a bank, and
you are sitting at
00:04:49.790 --> 00:04:50.710
the door of the bank.
00:04:50.710 --> 00:04:54.180
And at every second, you check
whether a customer came in
00:04:54.180 --> 00:04:56.210
during that second or not.
00:04:56.210 --> 00:05:00.670
Or you can think about arrivals
of jobs to a server.
00:05:00.670 --> 00:05:04.890
Or any other kind of requests
to a service system.
00:05:04.890 --> 00:05:08.560
So requests, or jobs, arrive
at random times.
00:05:08.560 --> 00:05:12.360
You split the time
into time slots.
00:05:12.360 --> 00:05:15.180
And during each time slot
something comes or something
00:05:15.180 --> 00:05:16.610
does not come.
00:05:16.610 --> 00:05:20.110
And for many applications, it's
a reasonable assumption
00:05:20.110 --> 00:05:24.510
to make that arrivals on any
given slot are independent of
00:05:24.510 --> 00:05:27.530
arrivals in any other
time slot.
00:05:27.530 --> 00:05:30.910
So each time slot can be viewed
as a trial, where
00:05:30.910 --> 00:05:32.810
either something comes
or doesn't come.
00:05:32.810 --> 00:05:35.830
And different trials are
independent of each other.
00:05:35.830 --> 00:05:38.000
Now there's two assumptions
that we're making here.
00:05:38.000 --> 00:05:40.000
One is the independence
assumption.
00:05:40.000 --> 00:05:42.190
The other is that this number,
p, probability
00:05:42.190 --> 00:05:44.520
of success, is constant.
00:05:44.520 --> 00:05:47.840
Now if you think about the bank
example, if you stand
00:05:47.840 --> 00:05:53.330
outside the bank at 9:30 in
the morning, you'll see
00:05:53.330 --> 00:05:55.830
arrivals happening at
a certain rate.
00:05:55.830 --> 00:05:59.370
If you stand outside the bank
at 12:00 noon, probably
00:05:59.370 --> 00:06:01.240
arrivals are more frequent.
00:06:01.240 --> 00:06:03.680
Which means that a given
time slot has a higher
00:06:03.680 --> 00:06:08.030
probability of seeing an arrival
around noon time.
00:06:08.030 --> 00:06:11.780
This means that the assumption
of a constant p is probably
00:06:11.780 --> 00:06:15.080
not correct in that setting,
if you're talking
00:06:15.080 --> 00:06:16.630
about the whole day.
00:06:16.630 --> 00:06:19.670
So the probability of successes
or arrivals in the
00:06:19.670 --> 00:06:24.340
morning is going to be smaller
than what it would be at noon.
00:06:24.340 --> 00:06:27.775
But if you're talking about a
time period, let's say 10:00
00:06:27.775 --> 00:06:31.980
to 10:15, probably all slots
have the same probability of
00:06:31.980 --> 00:06:34.880
seeing an arrival and it's
a good approximation.
00:06:34.880 --> 00:06:37.450
So we're going to stick with
the assumption that p is
00:06:37.450 --> 00:06:41.110
constant, doesn't change
with time.
00:06:41.110 --> 00:06:44.480
Now that we have our model
what do we do with it?
00:06:44.480 --> 00:06:47.200
Well, we start talking about
the statistical properties
00:06:47.200 --> 00:06:48.510
that it has.
00:06:48.510 --> 00:06:52.400
And here there's two slightly
different perspectives on
00:06:52.400 --> 00:06:55.680
thinking about what a
random process is.
00:06:55.680 --> 00:06:59.060
The simplest version is to think
about the random process
00:06:59.060 --> 00:07:01.605
as being just a sequence
of random variables.
00:07:04.210 --> 00:07:06.900
We know what random
variables are.
00:07:06.900 --> 00:07:09.440
We know what multiple random
variables are.
00:07:09.440 --> 00:07:12.760
So it's just an experiment that
has associated with it a
00:07:12.760 --> 00:07:14.730
bunch of random variables.
00:07:14.730 --> 00:07:17.410
So once you have random
variables, what do you do
00:07:17.410 --> 00:07:18.480
instinctively?
00:07:18.480 --> 00:07:20.140
You talk about the
distribution of
00:07:20.140 --> 00:07:21.610
these random variables.
00:07:21.610 --> 00:07:25.690
We already specified for the
Bernoulli process that each Xi
00:07:25.690 --> 00:07:27.830
is a Bernoulli random variable,
with probability of
00:07:27.830 --> 00:07:29.480
success equal to p.
00:07:29.480 --> 00:07:31.640
That specifies the distribution
of the random
00:07:31.640 --> 00:07:35.500
variable X, or Xt, for
general time t.
00:07:35.500 --> 00:07:37.780
Then you can calculate
expected values and
00:07:37.780 --> 00:07:39.420
variances, and so on.
00:07:39.420 --> 00:07:44.050
So the expected value is, with
probability p, you get a 1.
00:07:44.050 --> 00:07:46.650
And with probability
1 - p, you get a 0.
00:07:46.650 --> 00:07:49.510
So the expected value
is equal to p.
00:07:49.510 --> 00:07:52.890
And then we have seen before a
formula for the variance of
00:07:52.890 --> 00:07:57.040
the Bernoulli random variable,
which is p times 1-p.
00:07:57.040 --> 00:08:00.350
So this way we basically now
have all the statistical
00:08:00.350 --> 00:08:04.680
properties of the random
variable Xt, and we have those
00:08:04.680 --> 00:08:06.670
properties for every t.
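As a tiny sanity check (my own sketch, not part of the lecture), the mean and variance of one trial can be computed straight from the two-point PMF and compared against the formulas just quoted:

```python
# Mean and variance of a single Bernoulli(p) trial, straight from the PMF:
# X is 1 with probability p and 0 with probability 1 - p.
p = 0.3
mean = 1 * p + 0 * (1 - p)                              # E[X_t] = p
var = (1 - mean) ** 2 * p + (0 - mean) ** 2 * (1 - p)   # Var(X_t) = p * (1 - p)
```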
00:08:06.670 --> 00:08:10.140
Is this enough of a
probabilistic description of a
00:08:10.140 --> 00:08:11.580
random process?
00:08:11.580 --> 00:08:12.240
Well, no.
00:08:12.240 --> 00:08:15.070
You need to know how the
different random variables
00:08:15.070 --> 00:08:16.820
relate to each other.
00:08:16.820 --> 00:08:20.870
If you're talking about a
general random process, you
00:08:20.870 --> 00:08:23.490
would like to know things.
00:08:23.490 --> 00:08:26.370
For example, the joint
distribution of X2,
00:08:26.370 --> 00:08:29.220
with X5, and X7.
00:08:29.220 --> 00:08:31.680
For example, that might be
something that you're
00:08:31.680 --> 00:08:32.809
interested in.
00:08:32.809 --> 00:08:38.820
And the way you specify it is
by giving the joint PMF of
00:08:38.820 --> 00:08:40.530
these random variables.
00:08:40.530 --> 00:08:44.440
And you have to do that for
every collection, or any
00:08:44.440 --> 00:08:46.150
subset, of the random
variables you
00:08:46.150 --> 00:08:47.300
are interested in.
00:08:47.300 --> 00:08:49.860
So to have a complete
description of a random
00:08:49.860 --> 00:08:54.770
process, you need to specify
for me all the possible joint
00:08:54.770 --> 00:08:55.960
distributions.
00:08:55.960 --> 00:08:58.780
And once you have all the
possible joint distributions,
00:08:58.780 --> 00:09:01.580
then you can answer, in
principle, any questions you
00:09:01.580 --> 00:09:03.280
might be interested in.
00:09:03.280 --> 00:09:05.670
How did we get around
this issue for
00:09:05.670 --> 00:09:06.690
the Bernoulli process?
00:09:06.690 --> 00:09:10.030
I didn't give you the joint
distributions explicitly.
00:09:10.030 --> 00:09:12.160
But I gave them to
you implicitly.
00:09:12.160 --> 00:09:15.250
And this is because I told you
that the different random
00:09:15.250 --> 00:09:18.390
variables are independent
of each other.
00:09:18.390 --> 00:09:21.490
So at least for the Bernoulli
process, where we make the
00:09:21.490 --> 00:09:24.320
independence assumption, we know
that this is going to be
00:09:24.320 --> 00:09:25.970
the product of the PMFs.
00:09:29.000 --> 00:09:33.850
And since I have told you what
the individual PMFs are, this
00:09:33.850 --> 00:09:37.090
means that you automatically
know all the joint PMFs.
00:09:37.090 --> 00:09:40.940
And we can get down to business
based on that.
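The factorization just described can be sketched directly. This is my own illustration with made-up helper names (`bernoulli_pmf`, `joint_pmf`): independence means any joint PMF of the Bernoulli process is just the product of the marginal PMFs.

```python
from itertools import product

def bernoulli_pmf(x, p):
    # Marginal PMF of a single trial: P(X = 1) = p, P(X = 0) = 1 - p.
    return p if x == 1 else 1 - p

def joint_pmf(xs, p):
    # Independence means the joint PMF is just the product of the marginals.
    prob = 1.0
    for x in xs:
        prob *= bernoulli_pmf(x, p)
    return prob

# The joint PMF over all length-3 outcomes sums to 1, as it must.
total = sum(joint_pmf(xs, 0.3) for xs in product([0, 1], repeat=3))
```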
00:09:40.940 --> 00:09:41.310
All right.
00:09:41.310 --> 00:09:45.160
So this is one view of what a
random process is, just a
00:09:45.160 --> 00:09:47.180
collection of random
variables.
00:09:47.180 --> 00:09:50.660
There's another view that's a
little more abstract, which is
00:09:50.660 --> 00:09:53.170
the following.
00:09:53.170 --> 00:09:57.100
The entire process is to be
thought of as one long
00:09:57.100 --> 00:09:58.520
experiment.
00:09:58.520 --> 00:10:01.970
So we go back to the
chapter one view of
00:10:01.970 --> 00:10:03.550
probabilistic models.
00:10:03.550 --> 00:10:06.240
So there must be a sample
space involved.
00:10:06.240 --> 00:10:07.660
What is the sample space?
00:10:07.660 --> 00:10:11.760
If I do my infinitely long
experiment of flipping an
00:10:11.760 --> 00:10:15.330
infinite number of coins,
a typical outcome of the
00:10:15.330 --> 00:10:21.140
experiment would be a sequence
of 0's and 1's.
00:10:21.140 --> 00:10:25.640
So this could be one possible
outcome of the experiment,
00:10:25.640 --> 00:10:28.630
just an infinite sequence
of 0's and 1's.
00:10:28.630 --> 00:10:33.020
My sample space is the
set of all possible
00:10:33.020 --> 00:10:35.050
outcomes of this kind.
00:10:35.050 --> 00:10:40.660
Here's another possible
outcome, and so on.
00:10:40.660 --> 00:10:44.060
And essentially we're dealing
with a sample space, which is
00:10:44.060 --> 00:10:46.980
the space of all sequences
of 0's and 1's.
00:10:46.980 --> 00:10:50.470
And we're making some sort of
probabilistic assumption about
00:10:50.470 --> 00:10:53.330
what may happen in
that experiment.
00:10:53.330 --> 00:10:56.350
So one particular sequence that
we may be interested in
00:10:56.350 --> 00:10:59.660
is the sequence of obtaining
all 1's.
00:10:59.660 --> 00:11:05.510
So this is the sequence that
gives you 1's forever.
00:11:05.510 --> 00:11:08.330
Once you take the point of view
that this is our sample
00:11:08.330 --> 00:11:10.760
space-- it's the space of all
infinite sequences--
00:11:10.760 --> 00:11:13.770
you can start asking questions
that have to do
00:11:13.770 --> 00:11:15.470
with infinite sequences.
00:11:15.470 --> 00:11:19.120
Such as the question, what's the
probability of obtaining
00:11:19.120 --> 00:11:23.180
the infinite sequence that
consists of all 1's?
00:11:23.180 --> 00:11:24.690
So what is this probability?
00:11:24.690 --> 00:11:27.240
Let's see how we could
calculate it.
00:11:27.240 --> 00:11:34.000
So the probability of obtaining
all 1's is certainly
00:11:34.000 --> 00:11:39.890
less than or equal to the
probability of obtaining 1's,
00:11:39.890 --> 00:11:42.335
just in the first 10 tosses.
00:11:45.075 --> 00:11:47.030
OK.
00:11:47.030 --> 00:11:50.810
This is asking for more things
to happen than this.
00:11:50.810 --> 00:11:55.780
If this event is true, then
this is also true.
00:11:55.780 --> 00:11:58.660
Therefore the probability of
this is smaller than the
00:11:58.660 --> 00:11:59.540
probability of that.
00:11:59.540 --> 00:12:03.410
This event is contained
in that event.
00:12:03.410 --> 00:12:04.950
This implies this.
00:12:04.950 --> 00:12:06.880
So we have this inequality.
00:12:06.880 --> 00:12:12.360
Now what's the probability of
obtaining 1's in 10 trials?
00:12:12.360 --> 00:12:15.980
This is just p to the 10th
because the trials are
00:12:15.980 --> 00:12:18.530
independent.
00:12:18.530 --> 00:12:22.780
Now of course there's no reason
why I chose 10 here.
00:12:22.780 --> 00:12:26.160
The same argument goes
through if I use an
00:12:26.160 --> 00:12:29.850
arbitrary number, k.
00:12:29.850 --> 00:12:34.250
And this has to be
true for all k.
00:12:34.250 --> 00:12:38.690
So this probability is less
than p to the k, no matter
00:12:38.690 --> 00:12:41.670
what k I choose.
00:12:41.670 --> 00:12:46.350
Therefore, this must be less
than or equal to the limit of
00:12:46.350 --> 00:12:48.660
this, as k goes to infinity.
00:12:48.660 --> 00:12:51.210
This is smaller than
that for all k's.
00:12:51.210 --> 00:12:55.860
Let k go to infinity, take k
arbitrarily large, this number
00:12:55.860 --> 00:12:57.770
is going to become arbitrarily
small.
00:12:57.770 --> 00:12:59.190
It goes to 0.
00:12:59.190 --> 00:13:02.480
And that proves that the
probability of an infinite
00:13:02.480 --> 00:13:06.080
sequence of 1's is equal to 0.
00:13:06.080 --> 00:13:09.800
So take limits of both sides.
00:13:13.220 --> 00:13:16.217
It's going to be less than
or equal to the limit--
00:13:16.217 --> 00:13:18.380
I shouldn't take a limit here.
00:13:18.380 --> 00:13:21.745
The probability is less than or
equal to the limit of p to
00:13:21.745 --> 00:13:26.000
the k, as k goes to infinity,
which is 0.
00:13:26.000 --> 00:13:30.880
So this proves in a formal way
that the sequence of all 1's
00:13:30.880 --> 00:13:32.650
has 0 probability.
00:13:32.650 --> 00:13:35.770
If you have an infinite number
of coin flips, what's the
00:13:35.770 --> 00:13:40.610
probability that all of the coin
flips result in heads?
00:13:40.610 --> 00:13:43.690
The probability of this
happening is equal to zero.
00:13:43.690 --> 00:13:48.310
So this particular sequence
has 0 probability.
00:13:48.310 --> 00:13:51.480
Of course, I'm assuming here
that p is less than 1,
00:13:51.480 --> 00:13:53.420
strictly less than 1.
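As a quick numerical illustration of this bounding argument (a Python sketch of my own; the lecture does this on the board): the bound p to the k can be driven below any tolerance by taking k large, so the probability of all 1's, which sits below p to the k for every k, must be 0.

```python
# The bound P(all 1's) <= p**k holds for every k, and p**k -> 0 as k grows
# (assuming p is strictly less than 1, as in the lecture).
p = 0.9
bounds = [p ** k for k in (10, 100, 1000)]
# Each bound is smaller than the last; by k = 1000 it is astronomically small.
```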
00:13:53.420 --> 00:13:56.280
Now the interesting thing is
that if you look at any other
00:13:56.280 --> 00:13:59.690
infinite sequence, and you try
to calculate the probability
00:13:59.690 --> 00:14:03.526
of that infinite sequence, you
would get a product of (1-p)
00:14:03.526 --> 00:14:07.600
times (1-p), times (1-p),
times p, times p,
00:14:07.600 --> 00:14:09.570
times (1-p), and so on.
00:14:09.570 --> 00:14:13.560
You keep multiplying numbers
that are less than 1.
00:14:13.560 --> 00:14:16.760
Again, I'm making the
assumption that p is
00:14:16.760 --> 00:14:17.940
between 0 and 1.
00:14:17.940 --> 00:14:21.140
So 1-p is less than 1,
p is less than 1.
00:14:21.140 --> 00:14:23.600
You keep multiplying numbers
less than 1.
00:14:23.600 --> 00:14:26.190
If you multiply infinitely
many such numbers, the
00:14:26.190 --> 00:14:28.470
infinite product becomes 0.
00:14:28.470 --> 00:14:33.310
So any individual sequence in
this sample space actually has
00:14:33.310 --> 00:14:35.230
0 probability.
00:14:35.230 --> 00:14:39.560
And that is a little bit
counter-intuitive perhaps.
00:14:39.560 --> 00:14:42.670
But the situation is more like
the situation where we deal
00:14:42.670 --> 00:14:44.680
with continuous random
variables.
00:14:44.680 --> 00:14:47.430
So if you could draw a
continuous random variable,
00:14:47.430 --> 00:14:50.810
every possible outcome
has 0 probability.
00:14:50.810 --> 00:14:52.320
And that's fine.
00:14:52.320 --> 00:14:54.860
But all of the outcomes
collectively still have
00:14:54.860 --> 00:14:56.280
positive probability.
00:14:56.280 --> 00:14:59.580
So the situation here is
very much similar.
00:14:59.580 --> 00:15:03.940
So the space of infinite
sequences of 0's and 1's, that
00:15:03.940 --> 00:15:07.340
sample space is very much
like a continuous space.
00:15:07.340 --> 00:15:10.340
If you want to push that analogy
further, you could
00:15:10.340 --> 00:15:15.830
think of this as the expansion
of a real number.
00:15:15.830 --> 00:15:18.950
Or the representation of a
real number in binary.
00:15:18.950 --> 00:15:22.540
Take a real number, write it
down in binary, you are going
00:15:22.540 --> 00:15:25.580
to get an infinite sequence
of 0's and 1's.
00:15:25.580 --> 00:15:28.780
So you can think of each
possible outcome here
00:15:28.780 --> 00:15:30.920
essentially as a real number.
00:15:30.920 --> 00:15:36.060
So the experiment of doing an
infinite number of coin flips
00:15:36.060 --> 00:15:39.670
is sort of similar to the
experiment of picking a real
00:15:39.670 --> 00:15:41.060
number at random.
00:15:41.060 --> 00:15:44.990
When you pick real numbers at
random, any particular real
00:15:44.990 --> 00:15:46.500
number has 0 probability.
00:15:46.500 --> 00:15:49.780
So similarly here, any
particular infinite sequence
00:15:49.780 --> 00:15:52.440
has 0 probability.
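To make the binary-expansion analogy concrete, here is a small hypothetical helper of my own (not from the lecture) that reads a finite prefix of a 0/1 outcome as the binary digits of a number in [0, 1):

```python
def flips_to_real(bits):
    # Read a finite prefix of a 0/1 sequence as binary digits of x in [0, 1).
    x = 0.0
    for i, b in enumerate(bits, start=1):
        x += b * 2.0 ** (-i)
    return x

x = flips_to_real([1, 0, 1])   # 1/2 + 1/8 = 0.625
```

Longer and longer prefixes pin down the real number more and more precisely, which is the sense in which an infinite coin-flip sequence behaves like one point drawn from a continuous space.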
00:15:52.440 --> 00:15:55.260
So if we were to push that
analogy further, there would
00:15:55.260 --> 00:15:57.290
be a few interesting
things we could do.
00:15:57.290 --> 00:15:59.880
But we will not push
it further.
00:15:59.880 --> 00:16:05.400
This is just to give you an
indication that things can get
00:16:05.400 --> 00:16:08.710
pretty subtle and interesting
once you start talking about
00:16:08.710 --> 00:16:12.640
random processes that involve
forever, over the infinite
00:16:12.640 --> 00:16:13.900
time horizon.
00:16:13.900 --> 00:16:17.960
So things get interesting even
in this context of the simple
00:16:17.960 --> 00:16:19.750
Bernoulli process.
00:16:19.750 --> 00:16:23.130
Just to give you a preview of
what's coming further, today
00:16:23.130 --> 00:16:26.170
we're going to talk just about
the Bernoulli process.
00:16:26.170 --> 00:16:30.810
And you should make sure before
the next lecture--
00:16:30.810 --> 00:16:34.590
I guess between the exam
and the next lecture--
00:16:34.590 --> 00:16:36.740
to understand everything
we do today.
00:16:36.740 --> 00:16:39.930
Because next time we're going
to do everything once more,
00:16:39.930 --> 00:16:41.640
but in continuous time.
00:16:41.640 --> 00:16:46.360
And in continuous time, things
become more subtle and a
00:16:46.360 --> 00:16:47.700
little more difficult.
00:16:47.700 --> 00:16:50.580
But we are going to build on
what we understand for the
00:16:50.580 --> 00:16:52.030
discrete time case.
00:16:52.030 --> 00:16:55.370
Now both the Bernoulli process
and its continuous time analog
00:16:55.370 --> 00:16:58.470
has a property that we call
memorylessness, whatever
00:16:58.470 --> 00:17:01.590
happened in the past does
not affect the future.
00:17:01.590 --> 00:17:03.920
Later on in this class we're
going to talk about more
00:17:03.920 --> 00:17:07.310
general random processes,
so-called Markov chains, in
00:17:07.310 --> 00:17:10.890
which there are certain
dependences across time.
00:17:10.890 --> 00:17:15.349
That is, what has happened in
the past will have some
00:17:15.349 --> 00:17:18.390
bearing on what may happen
in the future.
00:17:18.390 --> 00:17:22.440
So it's like having coin flips
where the outcome of the next
00:17:22.440 --> 00:17:25.720
coin flip has some dependence
on the previous coin flip.
00:17:25.720 --> 00:17:28.400
And that gives us a richer
class of models.
00:17:28.400 --> 00:17:31.670
And once we get there,
essentially we will have
00:17:31.670 --> 00:17:34.400
covered all possible models.
00:17:34.400 --> 00:17:38.070
So for random processes that
are practically useful and
00:17:38.070 --> 00:17:41.480
which you can manipulate, Markov
chains are a pretty
00:17:41.480 --> 00:17:43.250
general class of models.
00:17:43.250 --> 00:17:47.260
And almost any real world
phenomenon that evolves in
00:17:47.260 --> 00:17:52.190
time can be approximately
modeled using Markov chains.
00:17:52.190 --> 00:17:55.580
So even though this is a first
class in probability, we will
00:17:55.580 --> 00:17:59.560
get pretty far in
that direction.
00:17:59.560 --> 00:17:59.950
All right.
00:17:59.950 --> 00:18:04.300
So now let's start doing a few
calculations and answer some
00:18:04.300 --> 00:18:06.690
questions about the
Bernoulli process.
00:18:06.690 --> 00:18:11.010
So again, the best way to think
in terms of models that
00:18:11.010 --> 00:18:13.350
correspond to the Bernoulli
process is in terms of
00:18:13.350 --> 00:18:15.870
arrivals of jobs
to a facility.
00:18:15.870 --> 00:18:18.230
And there's two types of
questions that you can ask.
00:18:18.230 --> 00:18:21.990
In a given amount of time,
how many jobs arrived?
00:18:21.990 --> 00:18:26.250
Or conversely, for a given
number of jobs, how much time
00:18:26.250 --> 00:18:28.720
did it take for them
to arrive?
00:18:28.720 --> 00:18:31.420
So we're going to deal with
these two questions, starting
00:18:31.420 --> 00:18:32.450
with the first.
00:18:32.450 --> 00:18:34.370
For a given amount of time--
00:18:34.370 --> 00:18:37.870
that is, for a given number
of time periods--
00:18:37.870 --> 00:18:40.150
how many arrivals have we had?
00:18:40.150 --> 00:18:44.180
How many of those Xi's
happen to be 1's?
00:18:44.180 --> 00:18:46.270
We fix the number
of time slots--
00:18:46.270 --> 00:18:47.970
let's say n time slots--
00:18:47.970 --> 00:18:50.920
and you measure the number
of successes.
00:18:50.920 --> 00:18:54.320
Well this is a very familiar
random variable.
00:18:54.320 --> 00:18:58.990
The number of successes in n
independent coin flips--
00:18:58.990 --> 00:19:01.430
or in n independent trials--
00:19:01.430 --> 00:19:03.660
is a binomial random variable.
00:19:03.660 --> 00:19:10.380
So we know its distribution is
given by the binomial PMF, and
00:19:10.380 --> 00:19:15.820
it's just this, for k going
from 0 up to n.
00:19:15.820 --> 00:19:18.940
And we know everything by now
about this random variable.
00:19:18.940 --> 00:19:21.990
We know its expected
value is n times p.
00:19:21.990 --> 00:19:27.980
And we know the variance, which
is n times p, times 1-p.
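These binomial facts can be checked directly from the PMF. A sketch of my own (the name `binomial_pmf` is made up): summing over k recovers the mean n times p and the variance n times p times (1-p).

```python
from math import comb

def binomial_pmf(k, n, p):
    # P(S = k) = C(n, k) * p**k * (1 - p)**(n - k), for k = 0, ..., n.
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))               # n * p
var = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))  # n * p * (1 - p)
```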
00:19:27.980 --> 00:19:31.740
So there's nothing new here.
00:19:31.740 --> 00:19:34.290
That's the easy part.
00:19:34.290 --> 00:19:37.590
So now let's look at the
opposite kind of question.
00:19:37.590 --> 00:19:42.210
Instead of fixing the time and
asking how many arrivals, now
00:19:42.210 --> 00:19:46.160
let us fix the number of
arrivals and ask how much time
00:19:46.160 --> 00:19:47.810
did it take.
00:19:47.810 --> 00:19:52.780
And let's start with the time
until the first arrival.
00:19:52.780 --> 00:19:59.470
So the process starts.
00:19:59.470 --> 00:20:00.720
We got our slots.
00:20:04.070 --> 00:20:08.250
And we see, perhaps, a sequence
of 0's and then at
00:20:08.250 --> 00:20:10.660
some point we get a 1.
00:20:10.660 --> 00:20:14.810
The number of trials it took
until we get a 1, we're going
00:20:14.810 --> 00:20:16.500
to call it T1.
00:20:16.500 --> 00:20:19.020
And it's the time of
the first arrival.
00:20:23.020 --> 00:20:23.680
OK.
00:20:23.680 --> 00:20:27.130
What is the probability
distribution of T1?
00:20:27.130 --> 00:20:30.430
What kind of random
variable is it?
00:20:30.430 --> 00:20:31.835
We've gone through
this before.
00:20:34.360 --> 00:20:40.560
The event that the first arrival
happens at time little
00:20:40.560 --> 00:20:48.400
t is the event that the first
t-1 trials were failures, and
00:20:48.400 --> 00:20:52.850
the trial number t happens
to be a success.
00:20:52.850 --> 00:20:57.960
So for the first success to
happen at time slot number 5,
00:20:57.960 --> 00:21:02.390
it means that the first 4 slots
had failures and the 5th
00:21:02.390 --> 00:21:04.800
slot had a success.
00:21:04.800 --> 00:21:08.440
So the probability of this
happening is the probability
00:21:08.440 --> 00:21:13.580
of having failures in the first
t-1 trials, and having
00:21:13.580 --> 00:21:16.300
a success at trial number t.
00:21:16.300 --> 00:21:20.060
And this is the formula for
t equal 1,2, and so on.
00:21:20.060 --> 00:21:22.460
So we know what this
distribution is.
00:21:22.460 --> 00:21:25.200
It's the so-called geometric
distribution.
00:21:30.900 --> 00:21:35.010
Let me just go through
this for a minute.
00:21:35.010 --> 00:21:38.360
In the past, we did calculate
the expected value of the
00:21:38.360 --> 00:21:42.240
geometric distribution,
and it's 1/p.
00:21:42.240 --> 00:21:46.010
Which means that if p is small,
you expect to take a
00:21:46.010 --> 00:21:48.810
long time until the
first success.
00:21:48.810 --> 00:21:52.540
And then there's a formula also
for the variance of T1,
00:21:52.540 --> 00:21:56.410
which we never formally derived
in class, but it was
00:21:56.410 --> 00:22:01.600
in your textbook and it just
happens to be this.
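The geometric PMF and the two formulas just quoted (mean 1/p, variance (1-p)/p squared) can be checked numerically. In this sketch of mine, the infinite sums are truncated at a large T, which is an approximation; the neglected tail is vanishingly small.

```python
def geometric_pmf(t, p):
    # P(T1 = t) = (1 - p)**(t - 1) * p, for t = 1, 2, ...
    return (1 - p) ** (t - 1) * p

p = 0.25
T = 2000   # truncation point for the infinite sums; the tail is negligible
mean = sum(t * geometric_pmf(t, p) for t in range(1, T + 1))               # ~ 1/p
var = sum((t - mean) ** 2 * geometric_pmf(t, p) for t in range(1, T + 1))  # ~ (1 - p) / p**2
```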
00:22:01.600 --> 00:22:02.310
All right.
00:22:02.310 --> 00:22:05.920
So nothing new until
this point.
00:22:05.920 --> 00:22:08.700
Now, let's talk about
this property, the
00:22:08.700 --> 00:22:10.210
memorylessness property.
00:22:10.210 --> 00:22:13.240
We kind of touched on this
property when we discussed--
00:22:13.240 --> 00:22:15.760
when we did the derivation
in class of the
00:22:15.760 --> 00:22:18.010
expected value of T1.
00:22:18.010 --> 00:22:20.040
Now what is the memoryless
property?
00:22:20.040 --> 00:22:22.980
It's essentially a consequence
of independence.
00:22:22.980 --> 00:22:26.740
If I tell you the results of my
coin flips up to a certain
00:22:26.740 --> 00:22:30.550
time, this, because of
independence, doesn't give you
00:22:30.550 --> 00:22:34.410
any information about the coin
flips after that time.
00:22:37.180 --> 00:22:41.180
So knowing that we had lots of
0's here does not change what
00:22:41.180 --> 00:22:44.920
I believe about the future coin
flips, because the future
00:22:44.920 --> 00:22:47.580
coin flips are going to be just
independent coin flips
00:22:47.580 --> 00:22:53.170
with a given probability,
p, for obtaining heads.
00:22:53.170 --> 00:22:58.240
So this is a statement that I
made about a specific time.
00:22:58.240 --> 00:23:02.270
That is, you do coin flips
until 12 o'clock.
00:23:02.270 --> 00:23:05.210
And then at 12 o'clock,
you start watching.
00:23:05.210 --> 00:23:09.900
No matter what happens before 12
o'clock, after 12:00, what
00:23:09.900 --> 00:23:12.890
you're going to see is just
a sequence of independent
00:23:12.890 --> 00:23:15.610
Bernoulli trials with the
same probability, p.
00:23:15.610 --> 00:23:18.450
Whatever happened in the
past is irrelevant.
00:23:18.450 --> 00:23:21.590
Now instead of talking about
the fixed time at which you
00:23:21.590 --> 00:23:26.940
start watching, let's think
about a situation where your
00:23:26.940 --> 00:23:31.240
sister sits in the next room,
flips the coins until she
00:23:31.240 --> 00:23:35.760
observes the first success,
and then calls you inside.
00:23:35.760 --> 00:23:38.700
And you start watching
after this time.
00:23:38.700 --> 00:23:40.970
What are you going to see?
00:23:40.970 --> 00:23:45.160
Well, you're going to see a coin
flip with probability p
00:23:45.160 --> 00:23:46.850
of success.
00:23:46.850 --> 00:23:49.870
You're going to see another
trial that has probability p
00:23:49.870 --> 00:23:53.250
as a success, and these are all
independent of each other.
00:23:53.250 --> 00:23:56.800
So what you're going to see
starting at that time is going
00:23:56.800 --> 00:24:02.610
to be just a sequence of
independent Bernoulli trials,
00:24:02.610 --> 00:24:06.190
as if the process was starting
at this time.
00:24:06.190 --> 00:24:10.880
How long it took for the first
success to occur doesn't have
00:24:10.880 --> 00:24:15.850
any bearing on what is going
to happen afterwards.
00:24:15.850 --> 00:24:19.170
What happens afterwards
is still a sequence of
00:24:19.170 --> 00:24:21.230
independent coin flips.
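The memorylessness just described can be sketched numerically: condition on an unusual past, here a run of ten failures in a row, and check that the very next flip is still a success with probability p. The run length and p are illustrative choices, not from the lecture.

```python
import random

# Condition on ten failures in a row, then look at the next flip.
# Memorylessness says it is still a success with probability p.
random.seed(1)
p = 0.3
hits = trials = 0
for _ in range(300_000):
    flips = [random.random() < p for _ in range(11)]
    if not any(flips[:10]):      # unusual past: ten failures in a row
        trials += 1
        hits += flips[10]        # is the next flip a success anyway?

rate = hits / trials
# rate should be close to p = 0.3 despite the unusual past
```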
00:24:21.230 --> 00:24:24.680
And this story is actually
even more general.
00:24:24.680 --> 00:24:28.980
So your sister watches the coin
flips and at some point
00:24:28.980 --> 00:24:31.690
tells you, oh, something
really interesting is
00:24:31.690 --> 00:24:32.440
happening here.
00:24:32.440 --> 00:24:35.250
I got this string of a
hundred 1's in a row.
00:24:35.250 --> 00:24:37.250
Come and watch.
00:24:37.250 --> 00:24:40.260
Now when you go in there and
you start watching, do you
00:24:40.260 --> 00:24:43.890
expect to see something
unusual?
00:24:43.890 --> 00:24:46.830
There were unusual things
that happened before
00:24:46.830 --> 00:24:48.180
you were called in.
00:24:48.180 --> 00:24:50.620
Does this mean that you're
going to see unusual things
00:24:50.620 --> 00:24:51.780
afterwards?
00:24:51.780 --> 00:24:52.180
No.
00:24:52.180 --> 00:24:55.060
Afterwards, what you're going
to see is, again, just a
00:24:55.060 --> 00:24:57.700
sequence of independent
coin flips.
00:24:57.700 --> 00:25:00.640
The fact that some strange
things happened before doesn't
00:25:00.640 --> 00:25:03.940
have any bearing as to what is
going to happen in the future.
00:25:03.940 --> 00:25:08.560
So if the roulettes in the
casino are properly made, the
00:25:08.560 --> 00:25:12.610
fact that there were 3 reds in
a row doesn't affect the odds
00:25:12.610 --> 00:25:16.430
of whether the next
spin is going to
00:25:16.430 --> 00:25:19.570
be a red or a black.
00:25:19.570 --> 00:25:22.850
So whatever happens in
the past-- no matter
00:25:22.850 --> 00:25:25.010
how unusual it is--
00:25:25.010 --> 00:25:28.910
at the time when you're called
in, what's going to happen in
00:25:28.910 --> 00:25:32.510
the future is going to be just
independent Bernoulli trials,
00:25:32.510 --> 00:25:34.170
with the same probability, p.
00:25:36.730 --> 00:25:41.900
The only case where this story
changes is if your sister has
00:25:41.900 --> 00:25:43.850
a little bit of foresight.
00:25:43.850 --> 00:25:48.500
So your sister can look ahead
into the future and knows that
00:25:48.500 --> 00:25:54.230
the next 10 coin flips will be
heads, and calls you before
00:25:54.230 --> 00:25:56.430
those 10 flips happen.
00:25:56.430 --> 00:25:59.660
If she calls you in, then what
are you going to see?
00:25:59.660 --> 00:26:02.310
You're not going to see
independent Bernoulli trials,
00:26:02.310 --> 00:26:05.510
since she has psychic powers
and she knows that the next
00:26:05.510 --> 00:26:06.910
ones will be 1's.
00:26:06.910 --> 00:26:12.770
She called you in and you will
see a sequence of 1's.
00:26:12.770 --> 00:26:15.470
So it's no longer independent
Bernoulli trials.
00:26:15.470 --> 00:26:19.420
So what's the subtle
difference here?
00:26:19.420 --> 00:26:24.010
The future is independent from
the past, provided that the
00:26:24.010 --> 00:26:28.310
time that you are called and
asked to start watching is
00:26:28.310 --> 00:26:31.460
determined by someone who
doesn't have any foresight,
00:26:31.460 --> 00:26:33.470
who cannot see the future.
00:26:33.470 --> 00:26:36.960
If you are called in, just on
the basis of what has happened
00:26:36.960 --> 00:26:39.630
so far, then you don't have any
00:26:39.630 --> 00:26:41.430
information about the future.
00:26:41.430 --> 00:26:44.870
And one special case is
the picture here.
00:26:44.870 --> 00:26:47.240
You have your coin flips.
00:26:47.240 --> 00:26:51.060
Once you see a 1 happen--
once you see a
00:26:51.060 --> 00:26:53.310
success, you are called in.
00:26:53.310 --> 00:26:57.690
You are called in on the basis
of what happened in the past,
00:26:57.690 --> 00:26:59.070
but without any foresight.
00:27:02.208 --> 00:27:03.140
OK.
00:27:03.140 --> 00:27:07.020
And this subtle distinction is
what's going to make our next
00:27:07.020 --> 00:27:10.420
example interesting
and subtle.
00:27:10.420 --> 00:27:13.380
So here's the question.
00:27:13.380 --> 00:27:17.790
You buy a lottery ticket every
day, so we have a Bernoulli
00:27:17.790 --> 00:27:21.880
process that's running
in time.
00:27:21.880 --> 00:27:26.050
And you're interested in the
length of the first string of
00:27:26.050 --> 00:27:26.860
losing days.
00:27:26.860 --> 00:27:28.410
What does that mean?
00:27:28.410 --> 00:27:33.700
So suppose that a typical
sequence of events
00:27:33.700 --> 00:27:35.040
could be this one.
00:27:40.970 --> 00:27:43.470
So what are we discussing
here?
00:27:43.470 --> 00:27:47.180
We're looking at the first
string of losing days, where
00:27:47.180 --> 00:27:49.140
losing days means 0's.
00:27:51.940 --> 00:27:56.910
So the string of losing days
is this string here.
00:27:56.910 --> 00:28:02.030
Let's call the length of that
string, L. We're interested in
00:28:02.030 --> 00:28:06.520
the random variable, which is
the length of this interval.
00:28:06.520 --> 00:28:08.670
What kind of random
variable is it?
00:28:11.230 --> 00:28:11.600
OK.
00:28:11.600 --> 00:28:16.190
Here's one possible way you
might think about the problem.
00:28:16.190 --> 00:28:16.550
OK.
00:28:16.550 --> 00:28:24.140
Starting from this time, and
looking until this time here,
00:28:24.140 --> 00:28:27.420
what are we looking at?
00:28:27.420 --> 00:28:31.640
We're looking at the time,
starting from here, until the
00:28:31.640 --> 00:28:35.040
first success.
00:28:35.040 --> 00:28:40.530
So the past doesn't matter.
00:28:40.530 --> 00:28:43.550
Starting from here we
have coin flips
00:28:43.550 --> 00:28:45.870
until the first success.
00:28:45.870 --> 00:28:48.160
The time until the
first success
00:28:48.160 --> 00:28:50.280
in a Bernoulli process--
00:28:50.280 --> 00:28:54.700
we just discussed that it's a
geometric random variable.
00:28:54.700 --> 00:28:58.090
So your first conjecture would
be that this random variable
00:28:58.090 --> 00:29:02.960
here, which is 1 longer than the
one we are interested in,
00:29:02.960 --> 00:29:06.310
that perhaps is a geometric
random variable.
00:29:14.040 --> 00:29:18.460
And if this were so, then you
could say that the random
00:29:18.460 --> 00:29:23.300
variable, L, is a geometric,
minus 1.
00:29:23.300 --> 00:29:26.160
Can that be the correct
answer?
00:29:26.160 --> 00:29:29.590
A geometric random variable,
what values does it take?
00:29:29.590 --> 00:29:33.160
It takes values 1,
2, 3, and so on.
00:29:33.160 --> 00:29:37.576
A geometric minus 1 would
take values 0,
00:29:37.576 --> 00:29:40.940
1, 2, and so on.
00:29:40.940 --> 00:29:45.200
Can the random variable
L be 0?
00:29:45.200 --> 00:29:45.920
No.
00:29:45.920 --> 00:29:48.500
The random variable L
is the length of a
00:29:48.500 --> 00:29:50.390
string of losing days.
00:29:50.390 --> 00:29:56.190
So the shortest that L could
be, would be just 1.
00:29:56.190 --> 00:29:59.770
If you get just one losing day
and then you start winning, L
00:29:59.770 --> 00:30:01.520
would be equal to 1.
00:30:01.520 --> 00:30:05.338
So L cannot be 0 by definition,
which means that L
00:30:05.338 --> 00:30:09.820
+ 1 cannot be 1,
by definition.
00:30:09.820 --> 00:30:14.310
But if L + 1 were geometric,
it could be equal to 1.
00:30:14.310 --> 00:30:16.226
Therefore this random
variable, L
00:30:16.226 --> 00:30:18.830
+ 1, is not a geometric.
00:30:23.550 --> 00:30:23.920
OK.
00:30:23.920 --> 00:30:26.720
Why is it not geometric?
00:30:26.720 --> 00:30:29.100
I started watching
at this time.
00:30:29.100 --> 00:30:33.810
From this time until the first
success, that should be a
00:30:33.810 --> 00:30:35.690
geometric random variable.
00:30:35.690 --> 00:30:38.180
Where's the catch?
00:30:38.180 --> 00:30:42.670
If I'm asked to start watching
at this time, it's because my
00:30:42.670 --> 00:30:48.260
sister knows that the next
one will be a failure.
00:30:48.260 --> 00:30:52.360
This is the time where the
string of failures starts.
00:30:52.360 --> 00:30:56.050
In order to know that I
should start watching here,
00:30:56.050 --> 00:30:58.770
it's the same as if
I'm told that the
00:30:58.770 --> 00:31:01.240
next one is a failure.
00:31:01.240 --> 00:31:05.180
So to be asked to start watching
at this time requires
00:31:05.180 --> 00:31:08.050
that someone looked
in the future.
00:31:08.050 --> 00:31:13.210
And in that case, it's no longer
true that these will be
00:31:13.210 --> 00:31:14.850
independent Bernoulli trials.
00:31:14.850 --> 00:31:16.000
In fact, they're not.
00:31:16.000 --> 00:31:18.990
If you start watching here,
you're certain that the next
00:31:18.990 --> 00:31:20.210
one is a failure.
00:31:20.210 --> 00:31:23.510
The next one is not an
independent Bernoulli trial.
00:31:23.510 --> 00:31:26.860
That's why the argument that
would claim that this L + 1 is
00:31:26.860 --> 00:31:30.400
geometric would be incorrect.
00:31:30.400 --> 00:31:33.680
So if this is not the correct
answer, what
00:31:33.680 --> 00:31:35.150
is the correct answer?
00:31:35.150 --> 00:31:37.700
The correct answer
goes as follows.
00:31:37.700 --> 00:31:39.400
Your sister is watching.
00:31:39.400 --> 00:31:44.080
Your sister sees the first
failure, and then tells you,
00:31:44.080 --> 00:31:45.530
OK, the failures--
00:31:45.530 --> 00:31:46.670
or losing days--
00:31:46.670 --> 00:31:47.790
have started.
00:31:47.790 --> 00:31:49.260
Come in and watch.
00:31:49.260 --> 00:31:51.550
So you start watching
at this time.
00:31:51.550 --> 00:31:56.060
And you keep watching until
the first success comes.
00:31:56.060 --> 00:31:59.290
This will be a geometric
random variable.
00:31:59.290 --> 00:32:05.005
So from here to here, this
will be geometric.
00:32:09.430 --> 00:32:11.560
So things happen.
00:32:11.560 --> 00:32:14.270
You are asked to
start watching.
00:32:14.270 --> 00:32:18.660
After you start watching, the
future is just a sequence of
00:32:18.660 --> 00:32:20.540
independent Bernoulli trials.
00:32:20.540 --> 00:32:23.930
And the time until the first
success occurs, this is going
00:32:23.930 --> 00:32:27.470
to be a geometric random
variable with parameter p.
00:32:27.470 --> 00:32:31.830
And then you notice that the
interval of interest is
00:32:31.830 --> 00:32:35.030
exactly the same as the length
of this interval.
00:32:35.030 --> 00:32:37.550
This starts one time step
later, and ends
00:32:37.550 --> 00:32:39.260
one time step later.
00:32:39.260 --> 00:32:43.370
So conclusion is that L is
actually geometric, with
00:32:43.370 --> 00:32:44.620
parameter p.
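A hedged simulation of the lottery example, checking the conclusion that L is geometric with parameter p, so its smallest possible value is 1 and its mean is 1/p. The value p = 0.4 is an arbitrary illustrative choice.

```python
import random

# Simulate the length L of the first string of losing days (0's):
# skip any winning days at the start, count losses from the first loss,
# and stop at the first win after the losing string.
random.seed(2)
p = 0.4
lengths = []
for _ in range(100_000):
    seen_failure = False
    run = 0
    while True:
        win = random.random() < p
        if not win:
            seen_failure = True
            run += 1
        elif seen_failure:   # first success after the losing string ends it
            break
        # wins before the first loss are simply skipped
    lengths.append(run)

min_L = min(lengths)
mean_L = sum(lengths) / len(lengths)
# Geometric(p): smallest value 1, mean 1/p = 2.5
```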
00:33:33.090 --> 00:33:36.160
OK, it looks like I'm
missing one slide.
00:33:36.160 --> 00:33:38.540
Can I cheat a little
from here?
00:33:46.830 --> 00:33:48.080
OK.
00:33:48.080 --> 00:33:52.550
So now that we dealt with the
time until the first arrival,
00:33:52.550 --> 00:33:56.110
we can start talking about
the time until the second
00:33:56.110 --> 00:33:57.190
arrival, and so on.
00:33:57.190 --> 00:33:59.550
How do we define these?
00:33:59.550 --> 00:34:02.920
After the first arrival happens,
we're going to have a
00:34:02.920 --> 00:34:06.100
sequence of time slots with no
arrivals, and then the next
00:34:06.100 --> 00:34:08.199
arrival is going to happen.
00:34:08.199 --> 00:34:11.080
So we call this time
that elapses--
00:34:11.080 --> 00:34:14.760
or number of time slots after
the first arrival
00:34:14.760 --> 00:34:16.730
until the next one--
00:34:16.730 --> 00:34:18.500
we call it T2.
00:34:18.500 --> 00:34:22.070
This is the second inter-arrival
time, that is,
00:34:22.070 --> 00:34:23.780
time between arrivals.
00:34:23.780 --> 00:34:28.260
Once this arrival has happened,
then we wait and see
00:34:28.260 --> 00:34:31.600
how many more it takes until
the third arrival.
00:34:31.600 --> 00:34:37.040
And we call this
time here, T3.
00:34:37.040 --> 00:34:42.429
We're interested in the time of
the k-th arrival, which is
00:34:42.429 --> 00:34:45.230
going to be just the
sum of the first k
00:34:45.230 --> 00:34:46.889
inter-arrival times.
00:34:46.889 --> 00:34:51.909
So for example, let's say Y3
is the time that the third
00:34:51.909 --> 00:34:53.510
arrival comes.
00:34:53.510 --> 00:34:58.940
Y3 is just the sum of T1,
plus T2, plus T3.
00:35:01.840 --> 00:35:06.310
So we're interested in this
random variable, Y3, and it's
00:35:06.310 --> 00:35:08.765
the sum of inter-arrival
times.
00:35:08.765 --> 00:35:12.190
To understand what kind of
random variable it is, I guess
00:35:12.190 --> 00:35:16.970
we should understand what kind
of random variables these are
00:35:16.970 --> 00:35:18.230
going to be.
00:35:18.230 --> 00:35:22.040
So what kind of random
variable is T2?
00:35:22.040 --> 00:35:27.440
Your sister is doing her coin
flips until a success is
00:35:27.440 --> 00:35:29.750
observed for the first time.
00:35:29.750 --> 00:35:32.540
Based on that information about
what has happened so
00:35:32.540 --> 00:35:34.730
far, you are called
into the room.
00:35:34.730 --> 00:35:39.320
And you start watching until a
success is observed again.
00:35:39.320 --> 00:35:42.160
So after you start watching,
what you have is just a
00:35:42.160 --> 00:35:45.190
sequence of independent
Bernoulli trials.
00:35:45.190 --> 00:35:47.110
So each one of these
has probability
00:35:47.110 --> 00:35:49.400
p of being a success.
00:35:49.400 --> 00:35:52.180
The time it's going to take
until the first success, this
00:35:52.180 --> 00:35:57.600
number, T2, is going to be again
just another geometric
00:35:57.600 --> 00:35:58.840
random variable.
00:35:58.840 --> 00:36:01.860
It's as if the process
just started.
00:36:01.860 --> 00:36:06.280
After you are called into the
room, you have no foresight,
00:36:06.280 --> 00:36:09.540
you don't have any information
about the future, other than
00:36:09.540 --> 00:36:11.020
the fact that these
are going to be
00:36:11.020 --> 00:36:13.280
independent Bernoulli trials.
00:36:13.280 --> 00:36:18.480
So T2 itself is going to be
geometric with the same
00:36:18.480 --> 00:36:20.590
parameter p.
00:36:20.590 --> 00:36:24.700
And then you can continue the
arguments and argue that T3 is
00:36:24.700 --> 00:36:27.880
also geometric with the
same parameter p.
00:36:27.880 --> 00:36:30.850
Furthermore, whatever happened,
how long it took
00:36:30.850 --> 00:36:34.330
until you were called in, it
doesn't change the statistics
00:36:34.330 --> 00:36:36.810
about what's going to happen
in the future.
00:36:36.810 --> 00:36:39.130
So whatever happens
in the future is
00:36:39.130 --> 00:36:41.460
independent from the past.
00:36:41.460 --> 00:36:47.340
So T1, T2, and T3 are
independent random variables.
00:36:47.340 --> 00:36:54.220
So conclusion is that the time
until the third arrival is the
00:36:54.220 --> 00:37:00.510
sum of 3 independent geometric
random variables, with the
00:37:00.510 --> 00:37:02.150
same parameter.
00:37:02.150 --> 00:37:04.550
And this is true
more generally.
00:37:04.550 --> 00:37:08.820
The time until the k-th arrival
is going to be the sum
00:37:08.820 --> 00:37:14.900
of k independent random
variables.
00:37:14.900 --> 00:37:21.540
So in general, Yk is going to be
T1 plus T2, all the way up to Tk, where the Ti's
00:37:21.540 --> 00:37:26.850
are geometric, with the same
parameter p, and independent.
00:37:30.440 --> 00:37:33.520
So now what's more natural
than trying to find the
00:37:33.520 --> 00:37:37.100
distribution of the random
variable Yk?
00:37:37.100 --> 00:37:38.350
How can we find it?
00:37:38.350 --> 00:37:40.260
So I fixed k for you.
00:37:40.260 --> 00:37:41.680
Let's say k is 100.
00:37:41.680 --> 00:37:43.580
I'm interested in how
long it takes until
00:37:43.580 --> 00:37:46.180
100 customers arrive.
00:37:46.180 --> 00:37:48.850
How can we find the distribution
of Yk?
00:37:48.850 --> 00:37:51.980
Well one way of doing
it is to use this
00:37:51.980 --> 00:37:54.200
lovely convolution formula.
00:37:54.200 --> 00:37:57.950
Take a geometric, convolve it
with another geometric, you
00:37:57.950 --> 00:37:59.430
get something.
00:37:59.430 --> 00:38:02.040
Take that something that you
got, convolve it with a
00:38:02.040 --> 00:38:07.090
geometric once more, do this 99
times, and this gives you
00:38:07.090 --> 00:38:09.450
the distribution of Yk.
00:38:09.450 --> 00:38:14.300
So that's definitely doable,
and it's extremely tedious.
00:38:14.300 --> 00:38:16.900
Let's try to find
the distribution
00:38:16.900 --> 00:38:22.520
of Yk using a shortcut.
00:38:22.520 --> 00:38:28.660
So the probability that
Yk is equal to t.
00:38:28.660 --> 00:38:31.620
So we're trying to find
the PMF of Yk.
00:38:31.620 --> 00:38:34.030
k has been fixed for us.
00:38:34.030 --> 00:38:36.810
And we want to calculate this
probability for the various
00:38:36.810 --> 00:38:39.580
values of t, because this
is going to give
00:38:39.580 --> 00:38:43.642
us the PMF of Yk.
00:38:43.642 --> 00:38:45.850
OK.
00:38:45.850 --> 00:38:47.540
What is this event?
00:38:47.540 --> 00:38:53.210
What does it take for the k-th
arrival to be at time t?
00:38:53.210 --> 00:38:56.580
For that to happen, we
need two things.
00:38:56.580 --> 00:39:00.960
In the first t -1 slots,
how many arrivals
00:39:00.960 --> 00:39:03.130
should we have gotten?
00:39:03.130 --> 00:39:04.830
k - 1.
00:39:04.830 --> 00:39:09.400
And then in the last slot, we
get one more arrival, and
00:39:09.400 --> 00:39:11.430
that's the k-th one.
00:39:11.430 --> 00:39:20.340
So this is the probability that
we have k - 1 arrivals in
00:39:20.340 --> 00:39:24.250
the time interval
from 1 up to t - 1.
00:39:28.670 --> 00:39:34.120
And then, an arrival
at time t.
00:39:39.860 --> 00:39:43.210
That's the only way that it
can happen, that the k-th
00:39:43.210 --> 00:39:45.450
arrival happens at time t.
00:39:45.450 --> 00:39:48.470
We need to have an arrival
at time t.
00:39:48.470 --> 00:39:50.150
And before that time,
we need to have
00:39:50.150 --> 00:39:53.380
exactly k - 1 arrivals.
00:39:53.380 --> 00:39:55.590
Now this is an event
that refers--
00:39:58.710 --> 00:39:59.960
to the slots from 1 up to t - 1.
00:40:02.830 --> 00:40:07.460
In the previous time slots we
had exactly k - 1 arrivals.
00:40:07.460 --> 00:40:10.680
And then at the last time slot
we get one more arrival.
00:40:10.680 --> 00:40:14.560
Now the interesting thing is
that this event here has to do
00:40:14.560 --> 00:40:18.620
with what happened from time
1 up to time t -1.
00:40:18.620 --> 00:40:22.350
This event has to do with
what happened at time t.
00:40:22.350 --> 00:40:25.510
Different time slots are
independent of each other.
00:40:25.510 --> 00:40:31.130
So this event and that event
are independent.
00:40:31.130 --> 00:40:34.930
So this means that we can
multiply their probabilities.
00:40:34.930 --> 00:40:37.280
So take the probability
of this.
00:40:37.280 --> 00:40:38.590
What is that?
00:40:38.590 --> 00:40:41.600
Well probability of having a
certain number of arrivals in
00:40:41.600 --> 00:40:44.650
a certain number of time slots,
these are just the
00:40:44.650 --> 00:40:46.530
binomial probabilities.
00:40:46.530 --> 00:40:51.250
So this is, out of t - 1 slots,
to get exactly k - 1
00:40:51.250 --> 00:41:02.670
arrivals, p to the (k-1), (1-p)
to the (t-1) - (k-1),
00:41:02.670 --> 00:41:05.480
this gives us t-k.
00:41:05.480 --> 00:41:07.960
And then we multiply with this
probability, the probability
00:41:07.960 --> 00:41:14.300
of an arrival, at time
t is equal to p.
00:41:14.300 --> 00:41:21.310
And so this is the formula for
the PMF of the number--
00:41:21.310 --> 00:41:27.410
of the time it takes until
the k-th arrival happens.
00:41:32.760 --> 00:41:35.730
Does it agree with the formula
in your handout?
00:41:35.730 --> 00:41:37.850
Or is it not there?
00:41:37.850 --> 00:41:38.710
It's not there.
00:41:38.710 --> 00:41:39.960
OK.
00:41:48.018 --> 00:41:49.010
Yeah.
00:41:49.010 --> 00:41:50.020
OK.
00:41:50.020 --> 00:41:57.338
So that's the formula and it is
true for what values of t?
00:41:57.338 --> 00:41:58.588
[INAUDIBLE].
00:42:03.182 --> 00:42:08.350
It takes at least k time slots
in order to get k arrivals, so
00:42:08.350 --> 00:42:12.370
this formula should be
true for t larger
00:42:12.370 --> 00:42:13.990
than or equal to k.
00:42:20.330 --> 00:42:23.391
For t larger than
or equal to k.
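The PMF just derived is the Pascal (negative binomial) distribution. As a sketch, not from the lecture, it can be cross-checked against the brute-force convolution of k geometric PMFs mentioned earlier; the parameters below are illustrative.

```python
from math import comb

# P(Yk = t) = C(t-1, k-1) * p^k * (1-p)^(t-k), for t >= k.
p = 0.3
k = 3

def pascal_pmf(t, k, p):
    return comb(t - 1, k - 1) * p**k * (1 - p) ** (t - k)

# Convolve k geometric PMFs, truncated at t_max (exact for t <= t_max).
t_max = 60
geom = [0.0] + [p * (1 - p) ** (t - 1) for t in range(1, t_max + 1)]
dist = geom[:]
for _ in range(k - 1):
    new = [0.0] * (t_max + 1)
    for a in range(1, t_max + 1):
        for b in range(1, t_max + 1 - a):
            new[a + b] += dist[a] * geom[b]
    dist = new

# Largest discrepancy between convolution and the closed-form PMF.
check = max(abs(dist[t] - pascal_pmf(t, k, p)) for t in range(k, t_max + 1))
```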
00:42:30.130 --> 00:42:31.040
All right.
00:42:31.040 --> 00:42:34.950
So this gives us the PMF of
the random variable Yk.
00:42:34.950 --> 00:42:37.430
Of course, we may also be
interested in the mean and
00:42:37.430 --> 00:42:39.260
variance of Yk.
00:42:39.260 --> 00:42:42.150
But this is a lot easier.
00:42:42.150 --> 00:42:46.350
Since Yk is the sum of
independent random variables,
00:42:46.350 --> 00:42:50.310
the expected value of Yk is
going to be just k times the
00:42:50.310 --> 00:42:52.980
expected value of
your typical T.
00:42:52.980 --> 00:43:03.080
So the expected value of Yk is
going to be just k times 1/p,
00:43:03.080 --> 00:43:06.600
which is the mean of
the geometric.
00:43:06.600 --> 00:43:09.470
And similarly for the variance,
it's going to be k
00:43:09.470 --> 00:43:12.960
times the variance
of a geometric.
00:43:12.960 --> 00:43:16.950
So we have everything there is
to know about the distribution
00:43:16.950 --> 00:43:19.965
of how long it takes until
the k-th arrival comes.
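A small check of E[Yk] = k/p and Var(Yk) = k(1-p)/p^2, simulating Yk as a sum of k independent geometrics; the parameters are illustrative, not from the lecture.

```python
import random

# Yk is a sum of k independent Geometric(p) inter-arrival times.
random.seed(3)
p, k, n = 0.5, 4, 200_000

def geometric(p):
    t = 1
    while random.random() >= p:
        t += 1
    return t

ys = [sum(geometric(p) for _ in range(k)) for _ in range(n)]
mean = sum(ys) / n
var = sum((y - mean) ** 2 for y in ys) / n
# Theory: E[Yk] = k/p = 8, Var(Yk) = k(1-p)/p^2 = 8
```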
00:43:23.760 --> 00:43:25.420
OK.
00:43:25.420 --> 00:43:27.810
Finally, let's do a few
more things about
00:43:27.810 --> 00:43:30.680
the Bernoulli process.
00:43:30.680 --> 00:43:34.970
It's interesting to talk about
several processes at the same time.
00:43:34.970 --> 00:43:39.600
So the situation here, of
splitting a Bernoulli process,
00:43:39.600 --> 00:43:43.730
is where you have arrivals
that come to a server.
00:43:43.730 --> 00:43:46.370
And that's a picture of which
slots get arrivals.
00:43:46.370 --> 00:43:48.900
But actually maybe you
have two servers.
00:43:48.900 --> 00:43:53.110
And whenever an arrival comes to
the system, you flip a coin
00:43:53.110 --> 00:43:56.620
and with some probability, q,
you send it to one server.
00:43:56.620 --> 00:44:00.390
And with probability 1-q, you
send it to another server.
00:44:00.390 --> 00:44:03.520
So there is a single
arrival stream, but
00:44:03.520 --> 00:44:05.040
two possible servers.
00:44:05.040 --> 00:44:07.280
And whenever there's an arrival,
you either send it
00:44:07.280 --> 00:44:09.270
here or you send it there.
00:44:09.270 --> 00:44:13.780
And each time you decide where
you send it by flipping an
00:44:13.780 --> 00:44:17.950
independent coin that
has its own bias q.
00:44:17.950 --> 00:44:22.450
The coin flips that decide
where do you send it are
00:44:22.450 --> 00:44:27.460
assumed to be independent from
the arrival process itself.
00:44:27.460 --> 00:44:30.480
So there's two coin flips
that are happening.
00:44:30.480 --> 00:44:33.850
At each time slot, there's a
coin flip that decides whether
00:44:33.850 --> 00:44:37.150
you have an arrival in this
process here, and that coin
00:44:37.150 --> 00:44:39.630
flip is with parameter p.
00:44:39.630 --> 00:44:43.000
And if you have something that
arrives, you flip another coin
00:44:43.000 --> 00:44:47.050
with probabilities q, and 1-q,
that decides whether you send
00:44:47.050 --> 00:44:49.770
it up there or you send
it down there.
00:44:49.770 --> 00:44:55.460
So what kind of arrival process
does this server see?
00:44:55.460 --> 00:44:59.510
At any given time slot, there's
probability p that
00:44:59.510 --> 00:45:01.480
there's an arrival here.
00:45:01.480 --> 00:45:04.300
And there's a further
probability q that this
00:45:04.300 --> 00:45:07.320
arrival gets sent up there.
00:45:07.320 --> 00:45:10.860
So the probability that this
server sees an arrival at any
00:45:10.860 --> 00:45:14.090
given time is p times q.
00:45:14.090 --> 00:45:18.900
So this process here is going to
be a Bernoulli process, but
00:45:18.900 --> 00:45:21.810
with a different parameter,
p times q.
00:45:21.810 --> 00:45:24.820
And this one down here, with the
same argument, is going to
00:45:24.820 --> 00:45:29.860
be Bernoulli with parameter
p times (1-q).
00:45:29.860 --> 00:45:33.500
So by taking a Bernoulli
stream of arrivals and
00:45:33.500 --> 00:45:36.460
splitting it into
two, you get two
00:45:36.460 --> 00:45:39.050
separate Bernoulli processes.
00:45:39.050 --> 00:45:40.890
This is going to be a Bernoulli
process, that's
00:45:40.890 --> 00:45:42.980
going to be a Bernoulli
process.
00:45:42.980 --> 00:45:45.630
Well actually, I'm running
a little too fast.
00:45:45.630 --> 00:45:49.330
What does it take to verify that
it's a Bernoulli process?
00:45:49.330 --> 00:45:52.650
At each time slot,
it's a 0 or 1.
00:45:52.650 --> 00:45:55.330
And it's going to be a 1, you're
going to see an arrival
00:45:55.330 --> 00:45:57.450
with probability p times q.
00:45:57.450 --> 00:46:00.510
What else do we need to verify,
to be able to tell--
00:46:00.510 --> 00:46:02.820
to say that it's a Bernoulli
process?
00:46:02.820 --> 00:46:05.620
We need to make sure that
whatever happens in this
00:46:05.620 --> 00:46:09.340
process, in different time
slots, are statistically
00:46:09.340 --> 00:46:11.240
independent from each other.
00:46:11.240 --> 00:46:13.030
Is that property true?
00:46:13.030 --> 00:46:16.900
For example, what happens in
this time slot-- whether you got
00:46:16.900 --> 00:46:20.100
an arrival or not, is it
independent from what happened
00:46:20.100 --> 00:46:22.660
at that time slot?
00:46:22.660 --> 00:46:26.850
The answer is yes for the
following reason.
00:46:26.850 --> 00:46:30.760
What happens in this time slot
has to do with the coin flip
00:46:30.760 --> 00:46:34.840
associated with the original
process at this time, and the
00:46:34.840 --> 00:46:38.340
coin flip that decides
where to send things.
00:46:38.340 --> 00:46:41.370
What happens at that time slot
has to do with the coin flip
00:46:41.370 --> 00:46:45.010
here, and the additional coin
flip that decides where to
00:46:45.010 --> 00:46:47.130
send it if something came.
00:46:47.130 --> 00:46:50.570
Now all these coin flips are
independent of each other.
00:46:50.570 --> 00:46:53.460
The coin flips that determine
whether we have an arrival
00:46:53.460 --> 00:46:56.860
here is independent from the
coin flips that determined
00:46:56.860 --> 00:46:59.280
whether we had an
arrival there.
00:46:59.280 --> 00:47:02.770
And you can generalize this
argument and conclude that,
00:47:02.770 --> 00:47:07.390
indeed, every time slot here
is independent from
00:47:07.390 --> 00:47:09.030
any other time slot.
00:47:09.030 --> 00:47:12.020
And this does make it
a Bernoulli process.
00:47:12.020 --> 00:47:15.590
And the reason is that, in the
original process, every time
00:47:15.590 --> 00:47:18.390
slot is independent from
every other time slot.
00:47:18.390 --> 00:47:21.000
And the additional assumption
that the coin flips that we're
00:47:21.000 --> 00:47:24.370
using to decide where to send
things, these are also
00:47:24.370 --> 00:47:26.020
independent of each other.
00:47:26.020 --> 00:47:30.220
So we're using here the basic
property that functions of
00:47:30.220 --> 00:47:33.283
independent things remain
independent.
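The splitting argument can be sketched as follows: a Bernoulli(p) stream is routed by an independent coin with bias q, and the two resulting streams should themselves be Bernoulli with arrival rates p times q and p times (1-q). The values of p and q below are illustrative choices.

```python
import random

# Split a Bernoulli(p) arrival stream: each arrival goes to server A
# with probability q, otherwise to server B.
random.seed(4)
p, q, n = 0.6, 0.3, 200_000
a = b = 0
for _ in range(n):
    if random.random() < p:          # an arrival in this slot?
        if random.random() < q:      # independent routing coin flip
            a += 1
        else:
            b += 1

rate_a, rate_b = a / n, b / n
# Theory: rate_a ~ p*q = 0.18, rate_b ~ p*(1-q) = 0.42
```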
00:47:36.390 --> 00:47:38.710
There's a converse
picture of this.
00:47:38.710 --> 00:47:41.970
Instead of taking one stream
and splitting it into two
00:47:41.970 --> 00:47:44.720
streams, you can do
the opposite.
00:47:44.720 --> 00:47:48.040
You could start from two
streams of arrivals.
00:47:48.040 --> 00:47:51.300
Let's say you have arrivals of
men and you have arrivals of
00:47:51.300 --> 00:47:54.000
women, but you don't
care about gender.
00:47:54.000 --> 00:47:57.430
And the only thing you record
is whether, in a given time
00:47:57.430 --> 00:48:00.450
slot, you had an
arrival or not.
00:48:00.450 --> 00:48:04.320
Notice that here we may have
an arrival of a man and the
00:48:04.320 --> 00:48:05.790
arrival of a woman.
00:48:05.790 --> 00:48:11.180
We just record it with a 1, by
saying there was an arrival.
00:48:11.180 --> 00:48:14.260
So in the merged process, we're
not keeping track of how
00:48:14.260 --> 00:48:16.880
many arrivals we had total.
00:48:16.880 --> 00:48:18.840
We just record whether
there was an
00:48:18.840 --> 00:48:21.400
arrival or not an arrival.
00:48:21.400 --> 00:48:25.830
So an arrival gets recorded here
if, and only if, one or
00:48:25.830 --> 00:48:28.800
both of these streams
had an arrival.
00:48:28.800 --> 00:48:31.780
So that's what we call a merging
of two Bernoulli
00:48:31.780 --> 00:48:34.470
processes-- two arrival
processes.
00:48:34.470 --> 00:48:37.840
So let's make the assumption
that this arrival process is
00:48:37.840 --> 00:48:41.440
independent from that
arrival process.
00:48:41.440 --> 00:48:44.330
So what happens at the
typical slot here?
00:48:44.330 --> 00:48:49.680
I'm going to see an arrival,
unless none of
00:48:49.680 --> 00:48:51.950
these had an arrival.
00:48:51.950 --> 00:48:56.380
So the probability of an arrival
in a typical time slot
00:48:56.380 --> 00:49:02.650
is going to be 1 minus the
probability of no arrival.
00:49:02.650 --> 00:49:07.370
And the event of no arrival
corresponds to the first
00:49:07.370 --> 00:49:10.330
process having no arrival,
and the second
00:49:10.330 --> 00:49:14.350
process having no arrival.
00:49:14.350 --> 00:49:18.110
So there's no arrival in the
merged process if, and only
00:49:18.110 --> 00:49:21.270
if, there's no arrival in the
first process and no arrival
00:49:21.270 --> 00:49:22.710
in the second process.
00:49:22.710 --> 00:49:26.080
We're assuming that the two
processes are independent and
00:49:26.080 --> 00:49:29.160
that's why we can multiply
probabilities here.
00:49:29.160 --> 00:49:34.270
And then you can take this
formula and it simplifies to p
00:49:34.270 --> 00:49:38.120
+ q, minus p times q.
00:49:38.120 --> 00:49:41.360
So each time slot of the merged
process has a certain
00:49:41.360 --> 00:49:44.620
probability of seeing
an arrival.
00:49:44.620 --> 00:49:47.360
Is the merged process
a Bernoulli process?
00:49:47.360 --> 00:49:51.260
Yes, it is after you verify the
additional property that
00:49:51.260 --> 00:49:54.650
different slots are independent
of each other.
00:49:54.650 --> 00:49:56.560
Why are they independent?
00:49:56.560 --> 00:50:01.070
What happens in this slot has to
do with that slot, and that
00:50:01.070 --> 00:50:03.160
slot down here.
00:50:03.160 --> 00:50:05.790
These two slots--
00:50:05.790 --> 00:50:08.570
so what happens here,
has to do with what
00:50:08.570 --> 00:50:11.430
happens here and there.
00:50:11.430 --> 00:50:16.680
What happens in this slot has
to do with whatever happened
00:50:16.680 --> 00:50:19.200
here and there.
00:50:19.200 --> 00:50:23.190
Now, whatever happens here and
there is independent from
00:50:23.190 --> 00:50:25.330
whatever happens
here and there.
00:50:25.330 --> 00:50:29.180
Therefore, what happens here
is independent from what
00:50:29.180 --> 00:50:30.220
happens there.
00:50:30.220 --> 00:50:33.310
So the independence property
is preserved.
00:50:33.310 --> 00:50:36.640
The different slots of this
merged process are independent
00:50:36.640 --> 00:50:37.590
of each other.
00:50:37.590 --> 00:50:41.970
So the merged process is itself
a Bernoulli process.
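And a matching sketch of merging, again with illustrative parameters: two independent Bernoulli streams with rates p and q, where the merged slot records an arrival whenever either stream has one, giving rate 1 - (1-p)(1-q) = p + q - pq.

```python
import random

# Merge two independent Bernoulli streams: a slot of the merged process
# is a 1 unless neither stream had an arrival in that slot.
random.seed(5)
p, q, n = 0.2, 0.5, 200_000
hits = 0
for _ in range(n):
    first = random.random() < p      # arrival in stream 1?
    second = random.random() < q     # arrival in stream 2? (independent)
    if first or second:
        hits += 1

rate = hits / n
# Theory: 1 - (1-p)(1-q) = p + q - p*q = 0.6
```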
00:50:41.970 --> 00:50:45.900
So please digest these two
pictures of merging and
00:50:45.900 --> 00:50:48.960
splitting, because we're going
to revisit them in continuous
00:50:48.960 --> 00:50:52.510
time, where things are a little
subtler than that.
00:50:52.510 --> 00:50:53.160
OK.
00:50:53.160 --> 00:50:56.240
Good luck on the exam and
see you in a week.