WEBVTT
00:00:00.265 --> 00:00:02.550
NARRATOR: The following content
is provided under a
00:00:02.550 --> 00:00:04.370
Creative Commons license.
00:00:04.370 --> 00:00:07.410
Your support will help MIT
OpenCourseWare continue to
00:00:07.410 --> 00:00:11.060
offer high quality educational
resources for free.
00:00:11.060 --> 00:00:13.960
To make a donation or view
additional materials from
00:00:13.960 --> 00:00:19.790
hundreds of MIT courses, visit
MIT OpenCourseWare at
00:00:19.790 --> 00:00:21.040
ocw.mit.edu.
00:00:22.975 --> 00:00:25.140
PROFESSOR: OK, so I guess
we're ready to start.
00:00:28.430 --> 00:00:31.370
Clearly, you can talk to
us about the exams
00:00:31.370 --> 00:00:32.820
later if you want to.
00:00:32.820 --> 00:00:37.036
And we will both be around after
class for a bit-- or
00:00:37.036 --> 00:00:39.310
I'll be around after
class for a bit and
00:00:39.310 --> 00:00:42.670
regular office hours.
00:00:42.670 --> 00:00:46.220
Wanted to finish talking about
[INAUDIBLE] state Markov
00:00:46.220 --> 00:00:53.500
chains today and go on to talk
about Markov processes.
00:00:53.500 --> 00:00:57.520
And the first thing we want
to talk about is what does
00:00:57.520 --> 00:01:00.520
reversibility mean.
00:01:00.520 --> 00:01:04.459
I think reversibility is one
of these very, very tricky
00:01:04.459 --> 00:01:09.170
concepts that you think you
understand about five times,
00:01:09.170 --> 00:01:11.140
and then you realize you
don't understand
00:01:11.140 --> 00:01:12.830
it about five times.
00:01:12.830 --> 00:01:16.270
And hopefully by the sixth time,
and we will see it about
00:01:16.270 --> 00:01:21.110
six times, so hopefully by the
end of the term, it will look
00:01:21.110 --> 00:01:23.410
almost obvious to you.
00:01:23.410 --> 00:01:28.380
And then, we're going to talk
about branching processes.
00:01:28.380 --> 00:01:31.880
I said in what got passed out
to you that I'd be talking
00:01:31.880 --> 00:01:36.010
about round robin and
processor sharing.
00:01:36.010 --> 00:01:39.240
I decided not to do that.
00:01:39.240 --> 00:01:42.250
It was too much complexity for
this part of the course.
00:01:42.250 --> 00:01:45.640
I will talk about it just
for a few minutes.
00:01:45.640 --> 00:01:48.470
And then we'll go into
Markov processes.
00:01:48.470 --> 00:01:54.270
And we will see most of the
things we saw in Markov chains
00:01:54.270 --> 00:01:58.230
again but in a different context
and in a slightly more
00:01:58.230 --> 00:02:01.770
complicated context.
00:02:01.770 --> 00:02:08.479
So for any Markov chain, we
have these equations.
00:02:12.140 --> 00:02:14.800
Typically, you just state the
equation of the probability of
00:02:14.800 --> 00:02:19.180
X sub n plus 1 given all the
previous terms is equal to
00:02:19.180 --> 00:02:22.820
probability of X sub n
plus 1 given X sub n.
00:02:22.820 --> 00:02:25.070
It's an easy extension
to write it this way.
00:02:25.070 --> 00:02:28.930
The probability of lots of
things in the future given
00:02:28.930 --> 00:02:32.280
everything in the past is equal
to lots of things in the
00:02:32.280 --> 00:02:35.600
future just given the most
recent thing in the past.
00:02:35.600 --> 00:02:41.570
So what we did last time was to
say let's let A plus be any
00:02:41.570 --> 00:02:47.290
function of all of these things
here, and let's let A minus
00:02:47.290 --> 00:02:52.280
be any function of all of these
things here except for X
00:02:52.280 --> 00:02:56.000
sub n, namely X sub n
minus 1 down to X0.
00:02:56.000 --> 00:03:00.990
And then what this says more
generally is the probability
00:03:00.990 --> 00:03:05.190
of all these future things
conditioned on Xn, and all the
00:03:05.190 --> 00:03:09.540
past things is equal to
probability of the future
00:03:09.540 --> 00:03:11.910
given just Xn.
00:03:11.910 --> 00:03:15.370
Then, we wrote that by
multiplying by the probability
00:03:15.370 --> 00:03:17.740
of A minus given Xn.
00:03:17.740 --> 00:03:21.400
And you can write it in this
nice symmetric form here.
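As a sketch of the algebra being described (reconstructed from the definitions above, with A plus the future event and A minus the past event; not verbatim from the slides):

```latex
% Markov property, stated for blocks of the future:
%   \Pr\{A^+ \mid X_n, A^-\} = \Pr\{A^+ \mid X_n\}.
% Multiplying both sides by \Pr\{A^- \mid X_n\} gives the symmetric form:
\Pr\{A^+, A^- \mid X_n\}
  = \Pr\{A^+ \mid X_n, A^-\}\,\Pr\{A^- \mid X_n\}
  = \Pr\{A^+ \mid X_n\}\,\Pr\{A^- \mid X_n\}.
```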
00:03:24.620 --> 00:03:27.720
I'm hoping that these two laser
pointers, one of them
00:03:27.720 --> 00:03:29.770
will keep working.
00:03:29.770 --> 00:03:32.730
And as soon as you write it in
this symmetric form, it's
00:03:32.730 --> 00:03:35.910
clear that you can again turn
it around and write in the
00:03:35.910 --> 00:03:40.780
past given the present and the
future is equal to the past
00:03:40.780 --> 00:03:42.850
given the present.
00:03:42.850 --> 00:03:46.640
So this formula really is
the most symmetric form
00:03:46.640 --> 00:03:51.800
[? for it ?] and it really shows
the symmetry of past and
00:03:51.800 --> 00:03:54.730
future, at least as far as
Markov chains are concerned.
00:03:54.730 --> 00:03:55.426
Yeah?
00:03:55.426 --> 00:03:57.290
AUDIENCE: I don't understand
[INAUDIBLE]
00:03:57.290 --> 00:03:58.222
write that down though.
00:03:58.222 --> 00:03:59.620
I feel like I'm missing
a step.
00:03:59.620 --> 00:04:10.260
For example, let's say I
[INAUDIBLE], I can't infer
00:04:10.260 --> 00:04:11.510
where I came from?
00:04:14.270 --> 00:04:16.190
PROFESSOR: No, that's
not what this says.
00:04:21.490 --> 00:04:26.290
I mean all it says is a
probabilistic statement.
00:04:26.290 --> 00:04:33.040
It says everything you can say
about X sub n plus 1 which was
00:04:33.040 --> 00:04:34.580
the first way we stated it.
00:04:34.580 --> 00:04:39.040
Everything you know about X sub
n plus 1, you can find out
00:04:39.040 --> 00:04:41.300
by just looking at X sub n.
00:04:41.300 --> 00:04:46.370
And knowing the things before
that doesn't help you at all.
00:04:46.370 --> 00:04:51.030
When you write out a Markov
chain in terms of a graph, you
00:04:51.030 --> 00:04:54.080
can see this because you see
transitions going from one
00:04:54.080 --> 00:04:56.860
state to the next state.
00:04:56.860 --> 00:04:59.220
And you don't remember
what the past is.
00:04:59.220 --> 00:05:01.280
The only part of the past
you remember is
00:05:01.280 --> 00:05:02.530
just that last state.
00:05:07.420 --> 00:05:08.670
It looks like you're still puzzled.
00:05:13.540 --> 00:05:17.670
So it's not saying that we
can't tell anything about the
00:05:17.670 --> 00:05:19.100
past and the future.
00:05:19.100 --> 00:05:25.050
In fact, if you don't condition
on X sub n, this
00:05:25.050 --> 00:05:27.330
stuff back here has a great
deal to do with
00:05:27.330 --> 00:05:28.780
the stuff up here.
00:05:28.780 --> 00:05:33.650
I mean it's only when you do
this conditioning, it is
00:05:33.650 --> 00:05:38.530
saying that the conditioning
at the present is the only
00:05:38.530 --> 00:05:43.410
linkage you have between
past and future.
00:05:43.410 --> 00:05:47.490
If you know where you are now,
you don't have to know
00:05:47.490 --> 00:05:49.640
anything about the past to know
what's going to happen in
00:05:49.640 --> 00:05:50.810
the future.
00:05:50.810 --> 00:05:52.400
That's not the way life is.
00:05:52.400 --> 00:05:54.920
I mean life is not
a Markov chain.
00:05:54.920 --> 00:05:57.810
It's just the way these
Markov chains are.
00:05:57.810 --> 00:06:01.990
But this very symmetric
statement says that as far as
00:06:01.990 --> 00:06:07.630
Markov chains are concerned,
past and future look the same.
00:06:07.630 --> 00:06:11.440
And that's the idea that we're
trying to use when we get into
00:06:11.440 --> 00:06:12.410
reversibility.
00:06:12.410 --> 00:06:15.700
This isn't saying anything about
reversibility, yet this
00:06:15.700 --> 00:06:18.110
is just giving a general
property that
00:06:18.110 --> 00:06:20.140
Markov chains have.
00:06:20.140 --> 00:06:24.280
And when you write this out,
it says the probability of
00:06:24.280 --> 00:06:29.510
this past state given Xn and
everything in the future is
00:06:29.510 --> 00:06:33.510
equal to the probability of the
past state given X sub n.
00:06:33.510 --> 00:06:40.370
So this is really the Markov
condition running from future
00:06:40.370 --> 00:06:41.250
down to past.
00:06:41.250 --> 00:06:45.440
And it's saying that if you
want to evaluate these
00:06:45.440 --> 00:06:50.830
probabilities of where you were
given anything now and
00:06:50.830 --> 00:06:55.130
further on, or put it in a more
sensible way, if you know
00:06:55.130 --> 00:07:00.950
everything over the past year,
and from knowing everything
00:07:00.950 --> 00:07:04.880
over the past year, you want
to decide what can you tell
00:07:04.880 --> 00:07:09.260
about what happens the year
before, what it's saying is
00:07:09.260 --> 00:07:13.600
the probability of what happened
the year before is
00:07:13.600 --> 00:07:18.050
statistically a function only
of the first
00:07:18.050 --> 00:07:21.390
day of this year that you're
conditioning on.
00:07:25.880 --> 00:07:29.710
So Markov condition works
in both directions.
00:07:29.710 --> 00:07:35.430
You need steady state in the
forward chain to be there in
00:07:35.430 --> 00:07:37.720
order to have homogeneity
in a backward chain.
00:07:37.720 --> 00:07:40.650
In other words, usually, we
define a Markov chain by
00:07:40.650 --> 00:07:44.850
starting off at time zero and
then evolving from there.
00:07:44.850 --> 00:07:49.520
So when you go backwards, that
fact that you started at time
00:07:49.520 --> 00:07:53.480
zero and said something about
time zero destroys the
00:07:53.480 --> 00:07:55.740
symmetry between past
and future.
00:07:55.740 --> 00:08:00.190
But if you start off in steady
state, then everything is as
00:08:00.190 --> 00:08:01.440
it should be.
00:08:05.800 --> 00:08:07.920
So if you have a
positive-recurrent Markov
00:08:07.920 --> 00:08:11.480
chain in steady state, it
can't be in steady state
00:08:11.480 --> 00:08:14.620
unless it's positive-recurrent,
because
00:08:14.620 --> 00:08:16.980
otherwise, you can't evaluate
the steady-state
00:08:16.980 --> 00:08:18.150
probabilities.
00:08:18.150 --> 00:08:20.900
The steady-state probabilities
don't exist.
00:08:20.900 --> 00:08:24.420
And the backward probabilities
are probability that X sub n
00:08:24.420 --> 00:08:29.690
minus 1 equals j given that
X sub n equals i is the
00:08:29.690 --> 00:08:34.270
transition probability from j
to i times the steady-state
00:08:34.270 --> 00:08:37.350
probability pi sub
j over pi sub i.
00:08:37.350 --> 00:08:41.309
This looks more sensible if you
bring the pi sub i over
00:08:41.309 --> 00:08:45.770
there, pi sub i times
probability of Xn minus 1
00:08:45.770 --> 00:08:55.930
equals j given Xn equals i is
really the probability of Xn
00:08:55.930 --> 00:08:59.450
equals i and Xn minus
1 equals j.
00:08:59.450 --> 00:09:09.400
So what this statement is really
saying is it's pi i
00:09:09.400 --> 00:09:12.750
times the probability of Xn
minus one equals j given Xn
00:09:12.750 --> 00:09:20.330
equals i is really the
probability of being in state
00:09:20.330 --> 00:09:26.280
j at time n minus 1 and state
i at time [? n. ?]
00:09:26.280 --> 00:09:28.400
And we're just writing that
in two different ways.
00:09:28.400 --> 00:09:31.360
It's the Bayes law way of
writing things in two
00:09:31.360 --> 00:09:32.460
different ways.
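A small numerical sketch of this Bayes-law relation (the chain and its numbers are made up for illustration, not from the lecture):

```python
import numpy as np

# A made-up 3-state transition matrix (each row sums to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.4, 0.1, 0.5]])

# Steady state: the left eigenvector of P with eigenvalue 1, normalized.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

# Bayes' law: P*_ij = Pr{X_{n-1} = j | X_n = i} = pi_j P_ji / pi_i.
P_star = P.T * pi[None, :] / pi[:, None]

# Each row of P* is a probability distribution, as it must be.
assert np.allclose(P_star.sum(axis=1), 1.0)
print(P_star)
```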
00:09:32.460 --> 00:09:38.050
If we define this backward
probability, which we said you
00:09:38.050 --> 00:09:42.010
can find by Bayes law if you
want to work at it, if we
00:09:42.010 --> 00:09:46.795
define this to be the backward
transition probability, p sub
00:09:46.795 --> 00:09:52.450
ij star, in other words, p sub
ij star is in this world where
00:09:52.450 --> 00:09:57.450
things are moving backwards, it
corresponds to p sub ij in
00:09:57.450 --> 00:09:59.710
the world where things
are moving forward.
00:09:59.710 --> 00:10:07.180
P sub ij star is then the
probability of being in state
00:10:07.180 --> 00:10:12.600
j at the next time back given
that you're in state i at this
00:10:12.600 --> 00:10:16.970
time, that is, in state
i at the present.
00:10:16.970 --> 00:10:21.290
In other words, if you're
visualizing moving from future
00:10:21.290 --> 00:10:25.510
time back to backward time,
that's what your Markov chain
00:10:25.510 --> 00:10:26.860
is doing now.
00:10:26.860 --> 00:10:29.810
These star transition
probabilities are the
00:10:29.810 --> 00:10:33.310
probabilities of moving
backward by one step,
00:10:33.310 --> 00:10:36.930
conditional on where you were
at time n, to where you're
00:10:36.930 --> 00:10:41.040
going to be at time n minus
1, if you will.
00:10:41.040 --> 00:10:44.260
As I said, these things are much
easier to deal with if
00:10:44.260 --> 00:10:49.020
you view them on a line, and you
have a right moving chain
00:10:49.020 --> 00:10:53.870
which is what we usually think
of as the chain moving from
00:10:53.870 --> 00:10:55.270
past to future.
00:10:55.270 --> 00:10:57.790
And then you have a left moving
chain, which is what
00:10:57.790 --> 00:11:05.210
you view as moving from
future down to past.
00:11:05.210 --> 00:11:09.340
OK, we define a chain as
reversible if these backward
00:11:09.340 --> 00:11:13.000
probabilities are equal to
the forward transition
00:11:13.000 --> 00:11:14.060
probabilities.
00:11:14.060 --> 00:11:19.360
So if a chain is reversible,
it says that pi i times P
00:11:19.360 --> 00:11:26.170
sub ij, this is the probability
that at
00:11:26.170 --> 00:11:30.130
time n minus 1, you were in
state i, and then you
00:11:30.130 --> 00:11:31.380
move to state j.
00:11:31.380 --> 00:11:36.140
So it's the probability of
being in one state at one
00:11:36.140 --> 00:11:38.760
time, the next state
at the next time.
00:11:38.760 --> 00:11:45.540
It's the probability that Xn
minus 1 and Xn are i and j.
00:11:45.540 --> 00:11:51.000
And this probability here is--
00:12:03.130 --> 00:12:06.150
this equation is moving
forward in time.
00:12:06.150 --> 00:12:09.580
So this equation here is the
probability that you were in
00:12:09.580 --> 00:12:12.400
state j, and you move
to state i.
00:12:12.400 --> 00:12:16.330
So what we're saying is the
probability of being in i
00:12:16.330 --> 00:12:20.570
moving to j is the same as the
probability of being in j and
00:12:20.570 --> 00:12:21.620
moving to i.
00:12:21.620 --> 00:12:24.820
It's the condition you have
on any birth-death chain.
00:12:24.820 --> 00:12:28.790
We said that on any birth-death
chain, the
00:12:28.790 --> 00:12:34.560
fraction of transitions from i
to j has to be equal to the
00:12:34.560 --> 00:12:37.570
fraction of transitions
from j to i.
00:12:37.570 --> 00:12:41.890
It's not that the probability
of moving up given i is the
00:12:41.890 --> 00:12:43.280
same as that of moving back.
00:12:43.280 --> 00:12:44.740
That's not what it's saying.
00:12:44.740 --> 00:12:48.480
It's saying that the fraction
of transitions from i to j
00:12:48.480 --> 00:12:57.250
over time
is pi i times Pij.
00:12:57.250 --> 00:13:02.090
Reversibility says that you make
as many up transitions
00:13:02.090 --> 00:13:05.110
over time as you make down
transitions over the
00:13:05.110 --> 00:13:06.360
same pair of states.
00:13:09.070 --> 00:13:11.900
I think that's the simplest
way to state the idea of
00:13:11.900 --> 00:13:13.380
reversibility.
00:13:13.380 --> 00:13:17.810
The fraction of time that you
move from state i to state j
00:13:17.810 --> 00:13:21.240
is the same as the fraction of
time in which you move from
00:13:21.240 --> 00:13:23.160
state j to state i.
00:13:23.160 --> 00:13:26.290
It's what always happens on a
birth-death chain, because
00:13:26.290 --> 00:13:31.350
every time you go up, if you
ever get back to the lower
00:13:31.350 --> 00:13:35.140
part of the chain, you have to
move back over that same path.
00:13:35.140 --> 00:13:38.810
You can easily visualize other
situations where you have the
00:13:38.810 --> 00:13:42.140
same condition if you have
enough symmetry between the
00:13:42.140 --> 00:13:44.150
various probabilities
involved.
00:13:44.150 --> 00:13:46.400
But the simplest way is
to have this sort of--
00:13:52.260 --> 00:13:55.120
well, not only the simplest, but
also the most common way
00:13:55.120 --> 00:13:58.340
is to have a birth-death
chain.
00:13:58.340 --> 00:14:01.140
OK, so this leads us to
the statement, all
00:14:01.140 --> 00:14:06.060
positive-recurrent birth-death
chains are reversible, and
00:14:06.060 --> 00:14:06.830
that's the theorem.
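A quick numerical check of that theorem, as a sketch (the five-state birth-death chain below is made up, not from the lecture):

```python
import numpy as np

# Birth-death chain on states 0..4: transitions only to neighbors,
# with a self loop taking up the slack in each row.
p_up = [0.4, 0.3, 0.3, 0.2]   # P(i -> i+1) for i = 0..3
p_dn = [0.5, 0.4, 0.3, 0.6]   # P(i -> i-1) for i = 1..4
n = 5
P = np.zeros((n, n))
for i in range(n - 1):
    P[i, i + 1] = p_up[i]
    P[i + 1, i] = p_dn[i]
for i in range(n):
    P[i, i] = 1.0 - P[i].sum()

# Steady state from the usual balance equations.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

# Reversibility: pi_i P_ij = pi_j P_ji for every pair of states, i.e.,
# diag(pi) P is symmetric.  Holds for any birth-death chain.
assert np.allclose(pi[:, None] * P, (pi[:, None] * P).T)
```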
00:14:06.830 --> 00:14:08.880
Now the question is what
do you do with that?
00:14:12.040 --> 00:14:15.200
Let's have a more general
example than
00:14:15.200 --> 00:14:16.880
a birth-death chain.
00:14:16.880 --> 00:14:19.260
Suppose the non-zero
transitions of a
00:14:19.260 --> 00:14:23.280
positive-recurrent Markov
chain form a tree.
00:14:23.280 --> 00:14:33.100
Before we had the states going
on a line, and from each state
00:14:33.100 --> 00:14:36.450
to the next state, there were
transition probabilities, you
00:14:36.450 --> 00:14:39.800
could only go up or
down on this line.
00:14:39.800 --> 00:14:43.590
What I'm saying now is if you
make a tree you have the same
00:14:43.590 --> 00:14:50.210
sort of condition that you had
before if the transitions on
00:14:50.210 --> 00:14:53.995
the states look like a tree.
00:15:07.520 --> 00:15:10.950
So these are the only
transitions that exist in this
00:15:10.950 --> 00:15:11.820
Markov chain.
00:15:11.820 --> 00:15:13.230
These are the states here.
00:15:16.530 --> 00:15:18.720
Again, you have this
condition.
00:15:18.720 --> 00:15:22.390
The only way to get from this
state out to this state is to
00:15:22.390 --> 00:15:28.470
move through here so that the
number of transitions that go
00:15:28.470 --> 00:15:32.170
from here to there must be
within one of the number of
00:15:32.170 --> 00:15:36.160
transitions that go from
here back to there.
00:15:36.160 --> 00:15:41.260
So you have this reversibility
condition again on any tree.
00:15:41.260 --> 00:15:44.040
And these birth-death chains
are just very, very skinny
00:15:44.040 --> 00:15:46.670
trees where everything is
laid out on a line.
00:15:46.670 --> 00:15:48.600
But this is the more
general case.
00:15:48.600 --> 00:15:51.395
And you'll see cases of
this as we move along.
00:15:57.700 --> 00:16:02.020
The following theorem is one of
these things that you use
00:16:02.020 --> 00:16:04.650
all the time in solving
problems.
00:16:04.650 --> 00:16:07.840
And it's extraordinarily
useful.
00:16:07.840 --> 00:16:12.190
It says for a Markov chain with
transition probabilities
00:16:12.190 --> 00:16:20.570
P sub ij, if a set of numbers pi
sub i exists so that all of
00:16:20.570 --> 00:16:24.500
them are positive, and
they sum to one.
00:16:24.500 --> 00:16:28.400
If you can find such a set of
numbers, and if they satisfy
00:16:28.400 --> 00:16:35.370
this equation here, then you
know that the chain is
00:16:35.370 --> 00:16:38.930
reversible, and you know that
those numbers are the
00:16:38.930 --> 00:16:40.430
steady-state probabilities.
00:16:40.430 --> 00:16:43.350
So you get everything at once.
00:16:43.350 --> 00:16:45.430
It's sort of like a
guessing theorem.
00:16:45.430 --> 00:16:49.060
And I usually call it a guessing
theorem, because
00:16:49.060 --> 00:16:53.180
starting out, it's not obvious
that these equations have to
00:16:53.180 --> 00:16:54.060
be satisfied.
00:16:54.060 --> 00:16:59.740
They're only satisfied if you
have a chain which is
00:16:59.740 --> 00:17:00.760
reversible.
00:17:00.760 --> 00:17:04.329
But if you can find a solution
to these equations, then, in
00:17:04.329 --> 00:17:09.230
fact, you know it's reversible,
and you know you
00:17:09.230 --> 00:17:11.339
found steady--state
probabilities.
00:17:11.339 --> 00:17:15.060
It's a whole lot easier to solve
this equation usually
00:17:15.060 --> 00:17:18.210
than to solve the usual
equation we have for
00:17:18.210 --> 00:17:21.839
steady-state probabilities.
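The way one typically uses the theorem, sketched in code (the same made-up birth-death chain as in the earlier sketch): guess that detailed balance holds, build the pi's by recursing up the chain, normalize, and the theorem then hands you the steady-state vector.

```python
import numpy as np

p_up = [0.4, 0.3, 0.3, 0.2]
p_dn = [0.5, 0.4, 0.3, 0.6]
n = 5
P = np.zeros((n, n))
for i in range(n - 1):
    P[i, i + 1], P[i + 1, i] = p_up[i], p_dn[i]
for i in range(n):
    P[i, i] = 1.0 - P[i].sum()

# Guess via detailed balance: pi_{i+1} = pi_i P(i, i+1) / P(i+1, i).
pi = np.ones(n)
for i in range(n - 1):
    pi[i + 1] = pi[i] * P[i, i + 1] / P[i + 1, i]
pi /= pi.sum()

# The guess satisfies pi_i P_ij = pi_j P_ji, so by the theorem it must
# also solve the usual steady-state equations pi P = pi.
assert np.allclose(pi[:, None] * P, (pi[:, None] * P).T)
assert np.allclose(pi @ P, pi)
```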
00:17:21.839 --> 00:17:26.490
But the proof of the theorem--
00:17:26.490 --> 00:17:31.320
I just restated the theorem
here, leaving out all of the
00:17:31.320 --> 00:17:34.700
boilerplate.
00:17:34.700 --> 00:17:40.360
If we take this equation for
fixed j, and we sum over i,
00:17:40.360 --> 00:17:41.780
what happens?
00:17:41.780 --> 00:17:46.130
When you sum over i over on
this side, you get the sum
00:17:46.130 --> 00:17:50.040
over i of pi sub i P sub ij.
00:17:50.040 --> 00:17:54.610
When you sum over i on this
side, you get pi sub j,
00:17:54.610 --> 00:18:00.220
because when you sum P sub ji
over i, you have to get one.
00:18:00.220 --> 00:18:03.310
When you're in state j, you
have to go someplace.
00:18:03.310 --> 00:18:07.110
And you can only go one place,
each with different
00:18:07.110 --> 00:18:08.810
probabilities.
00:18:08.810 --> 00:18:13.910
So that gives you the usual
steady-state conditions.
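In symbols, the step just described (a reconstruction: the equations pi sub i P sub ij = pi sub j P sub ji, summed over i for fixed j):

```latex
\sum_i \pi_i P_{ij} \;=\; \sum_i \pi_j P_{ji}
  \;=\; \pi_j \sum_i P_{ji} \;=\; \pi_j ,
```

which is the usual steady-state equation.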
00:18:13.910 --> 00:18:18.075
If you can solve those steady
state conditions, then you
00:18:18.075 --> 00:18:20.880
know from what we did before
that the chain is
00:18:20.880 --> 00:18:22.510
positive-recurrent.
00:18:22.510 --> 00:18:24.950
You know there are steady-state
probabilities.
00:18:24.950 --> 00:18:28.460
You know the probabilities
are all greater than zero.
00:18:28.460 --> 00:18:32.560
So if there's any solution to
these steady-state equations,
00:18:32.560 --> 00:18:36.140
then you know the chain has
to be positive-recurrent.
00:18:36.140 --> 00:18:40.990
And you know it has to be
reversible in this case.
00:18:40.990 --> 00:18:45.220
OK, here are a bunch of sanity
checks for reversibility.
00:18:45.220 --> 00:18:47.510
In other words, if you're going
to guess at something
00:18:47.510 --> 00:18:51.200
that's reversible and try to
solve these equations, you
00:18:51.200 --> 00:18:54.130
might as well do a sanity
check first.
00:18:54.130 --> 00:19:00.430
The simplest and most useful
sanity check is if you want it
00:19:00.430 --> 00:19:05.510
to be reversible, and there's a
transition from i to j, then
00:19:05.510 --> 00:19:09.240
there has to be a transition
from j to i also.
00:19:09.240 --> 00:19:12.150
Namely, the number of transitions
going from i to j
00:19:12.150 --> 00:19:15.450
has to be the same over the
long term as the number of
00:19:15.450 --> 00:19:17.810
transitions going from j to i.
00:19:17.810 --> 00:19:20.850
If there's a zero transition
probability one way and not
00:19:20.850 --> 00:19:23.330
the other way, you can't
satisfy that equation.
00:19:26.840 --> 00:19:32.860
If the chain is periodic, the
period has to be two.
00:19:32.860 --> 00:19:33.980
Why is that?
00:19:33.980 --> 00:19:35.910
Well, it's a long proof
in the notes.
00:19:35.910 --> 00:19:38.370
And if you write everything down
in algebra, it looks a
00:19:38.370 --> 00:19:39.500
little long.
00:19:39.500 --> 00:19:43.140
If you just think about it,
it's a lot shorter.
00:19:43.140 --> 00:19:47.650
If you're going around on a
cycle of, say, length three, if
00:19:47.650 --> 00:19:53.550
the chain is periodic, and it's
periodic with some period
00:19:53.550 --> 00:19:58.590
others than two, then you know
that the set of states has to
00:19:58.590 --> 00:20:01.820
partition into a
set of subsets.
00:20:01.820 --> 00:20:05.820
And you have to move from one
subset, to the next subset, to
00:20:05.820 --> 00:20:09.140
the next subset, and so forth.
00:20:09.140 --> 00:20:12.310
When you go backwards, you're
moving around that cycle in
00:20:12.310 --> 00:20:14.200
the opposite direction.
00:20:14.200 --> 00:20:18.410
Now, the only way that moving
around a cycle one way and
00:20:18.410 --> 00:20:22.010
moving around it the other way
works out is when the cycle
00:20:22.010 --> 00:20:23.610
only has two states
[? set in, ?]
00:20:23.610 --> 00:20:25.070
because then you're
moving, and you're
00:20:25.070 --> 00:20:27.020
moving right back again.
00:20:27.020 --> 00:20:29.545
OK, so the period has to be
two if it's periodic.
00:20:34.430 --> 00:20:38.980
If there's any set of
transitions i to j, j to k,
00:20:38.980 --> 00:20:42.670
and k to i, namely if you can
move around this way with some
00:20:42.670 --> 00:20:45.760
probability, then the
probability of moving back
00:20:45.760 --> 00:20:48.670
again has to be the
same thing.
00:20:48.670 --> 00:20:51.190
And that's what this
is saying.
00:20:51.190 --> 00:20:57.940
This is moving around this cycle
of length three one way.
00:20:57.940 --> 00:21:01.140
These are the forward
probabilities for moving
00:21:01.140 --> 00:21:05.580
around the cycle the opposite way.
And to have reversibility,
00:21:05.580 --> 00:21:08.340
the probability of going one way
has to be the same as the
00:21:08.340 --> 00:21:09.710
probability going
the other way.
00:21:14.810 --> 00:21:21.150
Now, that sounds peculiar, and
it gives me a good excuse to
00:21:21.150 --> 00:21:26.150
point out one of the main things
that's going on here.
00:21:26.150 --> 00:21:29.920
When you say something is
reversible, it doesn't usually
00:21:29.920 --> 00:21:36.060
mean that P sub ij is
equal to P sub ji.
00:21:36.060 --> 00:21:39.640
What it means is that
pi sub i times Pij
00:21:39.640 --> 00:21:43.900
equals pi j times Pji.
00:21:43.900 --> 00:21:48.100
Namely, the fraction of
transitions here is the same
00:21:48.100 --> 00:21:50.500
as the fraction of
transitions here.
00:21:50.500 --> 00:21:55.380
Why is it that here I'm only
using the probabilities, and
00:21:55.380 --> 00:21:59.130
I'm not saying anything about
the initial probability?
00:21:59.130 --> 00:22:04.260
It's because both of these
cycles start with state i.
00:22:04.260 --> 00:22:08.120
So what you really want to do
is say pi i times Pij times
00:22:08.120 --> 00:22:12.096
Pjk times Pki is the same as
pi i times [INAUDIBLE]
00:22:17.120 --> 00:22:18.570
And then you cancel
out the pi.
00:22:18.570 --> 00:22:23.490
So when you have a cycle, you
don't need that initial
00:22:23.490 --> 00:22:27.480
steady-state probability
in there.
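The same cancellation gives a sanity check you can run mechanically, sketched here on a made-up three-state chain (not from the lecture): around any cycle, the product of transition probabilities one way must equal the product the opposite way, with no pi's involved.

```python
import numpy as np

# Detailed balance, multiplied around the cycle i -> j -> k -> i,
# cancels the pi's and leaves P_ij P_jk P_ki == P_ik P_kj P_ji.
def cycle_condition(P, i, j, k):
    return np.isclose(P[i, j] * P[j, k] * P[k, i],
                      P[i, k] * P[k, j] * P[j, i])

# A made-up chain that circulates 0 -> 1 -> 2 -> 0 and so fails.
P = np.array([[0.2, 0.6, 0.2],
              [0.2, 0.2, 0.6],
              [0.6, 0.2, 0.2]])
print(cycle_condition(P, 0, 1, 2))   # False: this chain is not reversible
```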
00:22:27.480 --> 00:22:30.300
There's a nice generalization
of the guessing theorem to
00:22:30.300 --> 00:22:35.280
non-reversible chains, and that
generalization is proved
00:22:35.280 --> 00:22:38.420
the same way that
this is proved.
00:22:38.420 --> 00:22:42.810
If you can find a set of
transition probabilities, P
00:22:42.810 --> 00:22:47.800
sub ij star, and to be a set of
transition probabilities,
00:22:47.800 --> 00:22:50.030
they have to be non-negative.
00:22:50.030 --> 00:22:53.710
When you sum this over j,
you have to get one.
00:22:53.710 --> 00:22:55.790
That's what you need to have
a set of transition
00:22:55.790 --> 00:22:56.950
probabilities.
00:22:56.950 --> 00:23:02.110
Then, all you need is pi sub i
times P sub ij is equal to pi
00:23:02.110 --> 00:23:04.760
j times P sub ji star.
00:23:04.760 --> 00:23:09.620
In other words, when you look
at this backward transition
00:23:09.620 --> 00:23:13.510
probability for an arbitrary
Markov chain which is
00:23:13.510 --> 00:23:16.840
positive-recurrent, this
has to equal this.
00:23:16.840 --> 00:23:18.500
This is one of the conditions
that you
00:23:18.500 --> 00:23:21.580
have on a Markov chain.
00:23:21.580 --> 00:23:24.880
The interesting thing here
is this is enough.
00:23:24.880 --> 00:23:27.930
If you can guess a set of
backward transition
00:23:27.930 --> 00:23:33.460
probabilities to satisfy this
equation for all i and j, then
00:23:33.460 --> 00:23:38.230
you know you must have a set of
steady-state probabilities
00:23:38.230 --> 00:23:41.480
where the steady-state
probabilities are [INAUDIBLE].
00:23:41.480 --> 00:23:44.070
And the way to prove this
is the same as before.
00:23:44.070 --> 00:23:54.410
Namely, you sum this over j.
00:23:54.410 --> 00:23:57.410
And when you sum this over
j, you get the steady-state equation for the backward
00:23:57.410 --> 00:23:58.820
transition probabilities.
00:23:58.820 --> 00:24:02.460
So I'm not going to prove it.
00:24:02.460 --> 00:24:04.450
I mean the proof is in the
notes, and it's really the
00:24:04.450 --> 00:24:05.950
same proof as we went
through before.
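A sketch of that generalization in code, using the circulating three-state chain from the cycle check above (all numbers made up): the chain is not reversible, but guessing a uniform pi together with backward probabilities P*_ij = P_ji satisfies the hypotheses, so pi must be the steady-state vector.

```python
import numpy as np

P = np.array([[0.2, 0.6, 0.2],
              [0.2, 0.2, 0.6],
              [0.6, 0.2, 0.2]])

# Guess: uniform pi and P*_ij = P_ji.  (This works here because the
# columns of P, like its rows, each sum to one.)
pi = np.full(3, 1 / 3)
P_star = P.T

# Hypotheses of the generalized guessing theorem:
assert np.allclose(P_star.sum(axis=1), 1.0)   # rows of P* are distributions
assert np.allclose(pi[:, None] * P, (pi[:, None] * P_star).T)  # pi_i P_ij = pi_j P*_ji

# Conclusion of the theorem: pi is the steady-state vector.
assert np.allclose(pi @ P, pi)
```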
00:24:09.100 --> 00:24:13.680
And incidentally, if you read
the section on round robin,
00:24:13.680 --> 00:24:17.920
you will find the key to finding
out what's going on
00:24:17.920 --> 00:24:21.910
there is, in fact,
that theorem.
00:24:21.910 --> 00:24:24.090
It's that way of solving for
what the steady-state
00:24:24.090 --> 00:24:26.810
probabilities have to be.
00:24:26.810 --> 00:24:31.660
While I'm at it, let me pause
for just a second, because
00:24:31.660 --> 00:24:35.050
we're not going to go through
that section on round robin.
00:24:35.050 --> 00:24:39.610
Let me talk about what it is,
what processor sharing is, and
00:24:39.610 --> 00:24:42.660
why that result is
pretty important.
00:24:42.660 --> 00:24:44.090
If you're at all
interested in--
00:24:47.110 --> 00:24:48.190
well, let's see.
00:24:48.190 --> 00:24:53.300
First, packet communication
is something important.
00:24:53.300 --> 00:24:56.720
Second, computer systems of
all types is important.
00:24:56.720 --> 00:25:00.620
There was an enormous transition
probably 20 years
00:25:00.620 --> 00:25:06.490
ago from computer systems
solving one job at a time, and
00:25:06.490 --> 00:25:09.650
then it went to the system
of solving many jobs
00:25:09.650 --> 00:25:09.930
concurrently.
00:25:09.930 --> 00:25:12.250
It would work a little bit on
one job, a little bit in
00:25:12.250 --> 00:25:15.400
another, a little bit in
another, and so forth.
00:25:15.400 --> 00:25:20.200
And it turns out to be a very
good idea for doing that.
00:25:20.200 --> 00:25:24.850
Or if you're interested in
killing systems, what happens
00:25:24.850 --> 00:25:29.650
if you have a killing system--
00:25:29.650 --> 00:25:33.470
suppose it's a GG1 queue.
00:25:33.470 --> 00:25:35.500
So you have a different service
00:25:35.500 --> 00:25:38.120
time for each customer.
00:25:38.120 --> 00:25:40.550
Or let's make it an MG1 queue.
00:25:40.550 --> 00:25:42.770
Makes the argument cleaner.
00:25:42.770 --> 00:25:46.370
Different customers have
different service times.
00:25:46.370 --> 00:25:52.160
We've seen that in an MG1 queue,
everybody can be held
00:25:52.160 --> 00:25:55.390
up by one slow customer.
00:25:55.390 --> 00:26:00.620
And if the customers have an
enormously, widely varied
00:26:00.620 --> 00:26:04.320
service times, some of them
requiring enormously long
00:26:04.320 --> 00:26:08.330
service time, that causes an
enormous amount of queuing.
00:26:08.330 --> 00:26:13.130
What happens if you use
processor sharing? You have
00:26:13.130 --> 00:26:14.560
one server.
00:26:14.560 --> 00:26:18.350
And it's simultaneously
allocating service to every
00:26:18.350 --> 00:26:20.660
customer which is in queue.
00:26:20.660 --> 00:26:24.830
So it takes a service
capability, and it splits it
00:26:24.830 --> 00:26:26.440
up n-ways.
00:26:26.440 --> 00:26:29.030
And when you talk about
processor sharing, you assume
00:26:29.030 --> 00:26:31.860
that there's no overhead for
doing the splitting.
00:26:31.860 --> 00:26:34.570
And if there's no overhead for
doing the splitting, you can
00:26:34.570 --> 00:26:36.880
see intuitively what happens.
00:26:36.880 --> 00:26:42.010
The customers that don't need
much service are going to be
00:26:42.010 --> 00:26:45.620
held up a little bit by these
customers who require enormous
00:26:45.620 --> 00:26:49.570
amounts of service,
but not too much.
00:26:49.570 --> 00:26:52.680
Because this customer that
requires enormous service is
00:26:52.680 --> 00:26:56.140
getting the same rate of
service as you are.
00:26:56.140 --> 00:27:00.170
If that customer requires 100
hours of service, and you only
00:27:00.170 --> 00:27:03.750
require one second of service,
you're going to get out very
00:27:03.750 --> 00:27:07.670
much faster than they do.
00:27:07.670 --> 00:27:10.600
What happens when you
analyze all of this?
00:27:10.600 --> 00:27:12.990
It turns out that you've
turned the MG1
00:27:12.990 --> 00:27:15.240
queue into an MM1 queue.
00:27:19.710 --> 00:27:25.170
In other words, if you're doing
processor sharing, it
00:27:25.170 --> 00:27:29.070
takes the same expected amount
of time for you to get out as
00:27:29.070 --> 00:27:31.980
it would if all of the service
times were exponential.
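To make that claim concrete, here is a small event-driven simulation sketch (not from the lecture; lambda, mu, the service distributions, and the horizon are all made-up choices). With processor sharing, exponential and deterministic service with the same mean both give a mean time in system close to the MM1 value 1/(mu - lambda).

```python
import random

def ps_mean_time_in_system(lam, draw_service, horizon=200000.0, seed=1):
    """MG1 processor-sharing queue: the server splits its unit rate
    equally over all jobs present, with no splitting overhead."""
    rng = random.Random(seed)
    t, next_arrival = 0.0, rng.expovariate(lam)
    jobs = []     # [remaining_work, arrival_time] for each job in system
    times = []    # completed times in system
    while t < horizon:
        n = len(jobs)
        # With n jobs each served at rate 1/n, the smallest remaining
        # work finishes after (min remaining work) * n time units.
        next_dep = t + min(w for w, _ in jobs) * n if n else float("inf")
        t_next = min(next_arrival, next_dep)
        for job in jobs:                  # everyone gets dt/n of service
            job[0] -= (t_next - t) / n
        t = t_next
        if t == next_arrival:             # arrival event
            jobs.append([draw_service(rng), t])
            next_arrival = t + rng.expovariate(lam)
        else:                             # departure event
            j = min(range(n), key=lambda j: jobs[j][0])
            times.append(t - jobs.pop(j)[1])
    return sum(times) / len(times)

lam, mu = 0.5, 1.0
print("MM1 value:", 1 / (mu - lam))   # 2.0
print("exponential service:", ps_mean_time_in_system(lam, lambda r: r.expovariate(mu)))
print("deterministic service:", ps_mean_time_in_system(lam, lambda r: 1 / mu))
```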
00:27:34.970 --> 00:27:37.720
Now, that is why people
went to time
00:27:37.720 --> 00:27:41.130
sharing a long time ago.
00:27:41.130 --> 00:27:45.090
Most of the arguments for it,
especially in the computer
00:27:45.090 --> 00:27:49.190
science fraternity, were all
sorts of other things.
00:27:49.190 --> 00:27:50.750
But there's this very
simple queuing
00:27:50.750 --> 00:27:52.680
argument that led to that.
00:27:52.680 --> 00:27:55.590
Unfortunately, it's a fairly
complicated queuing argument,
00:27:55.590 --> 00:27:57.940
which is why we're not
going through it.
00:27:57.940 --> 00:28:00.720
But it's a very important
argument.
00:28:00.720 --> 00:28:04.790
Why, at the same time, did we
go to packet communication?
00:28:04.790 --> 00:28:06.970
Well, there are all sorts of
reasons for going to packet
00:28:06.970 --> 00:28:10.320
communication instead of
sending messages, long
00:28:10.320 --> 00:28:12.780
messages, one at a time.
00:28:12.780 --> 00:28:16.740
But one of them, and one of them
is very important, is the
00:28:16.740 --> 00:28:19.060
same processor-sharing
result.
00:28:19.060 --> 00:28:22.220
If you split things up into
small pieces, then what it
00:28:22.220 --> 00:28:27.090
means is effectively things are
being served in a processor-
00:28:27.090 --> 00:28:28.250
sharing manner.
00:28:28.250 --> 00:28:33.030
So again, you lose this
slow truck effect.
00:28:33.030 --> 00:28:36.650
And everybody gets through
effectively in a
00:28:36.650 --> 00:28:39.570
fair amount of time.
00:28:39.570 --> 00:28:40.820
OK.
00:28:45.070 --> 00:28:49.120
I probably said just the wrong
amount about that, so you
00:28:49.120 --> 00:28:52.080
can't understand what
I was saying.
00:28:52.080 --> 00:28:57.500
But I think if you read it,
you will get the idea of
00:28:57.500 --> 00:28:59.130
what's going on.
00:28:59.130 --> 00:29:01.060
OK.
00:29:01.060 --> 00:29:03.620
Let's look at an
MM1 queue now.
00:29:03.620 --> 00:29:06.980
An MM1 queue, you remember,
is a queue where you have
00:29:06.980 --> 00:29:09.570
customers coming in
00:29:09.570 --> 00:29:14.470
in a [? Poisson ?] manner; the
interval between customers is
00:29:14.470 --> 00:29:15.400
exponential.
00:29:15.400 --> 00:29:18.470
That's a good model
for a lot of things.
00:29:18.470 --> 00:29:21.675
The service time
is exponential.
00:29:24.320 --> 00:29:28.710
And what we're going to do to
try to analyze this in terms
00:29:28.710 --> 00:29:35.340
of Markov chains, is to say
well, let's look at sampling
00:29:35.340 --> 00:29:40.800
the state of the MM1 queue at
some very finely spaced
00:29:40.800 --> 00:29:42.180
interval of time.
00:29:42.180 --> 00:29:46.670
And we'll make the interval of
time, delta, so small that
00:29:46.670 --> 00:29:49.470
there's a negligible probability
of having two
00:29:49.470 --> 00:29:52.400
customers come in in
the same interval.
00:29:52.400 --> 00:29:55.400
And so there's a negligible
probability of having a
00:29:55.400 --> 00:29:57.710
customer come in and
a customer go
00:29:57.710 --> 00:29:59.650
out in the same interval.
00:29:59.650 --> 00:30:03.720
It's effectively the same
argument that we use to say
00:30:03.720 --> 00:30:10.510
that a Poisson process is
effectively the same as a
00:30:10.510 --> 00:30:26.990
Bernoulli process, if you make
the time interval-- the step
00:30:26.990 --> 00:30:29.970
size for the Bernoulli process--
very, very small, and the
00:30:29.970 --> 00:30:32.580
probability of success
very, very small.
00:30:32.580 --> 00:30:35.870
As you make that time interval
smaller and smaller, it goes
00:30:35.870 --> 00:30:40.120
into a Poisson process as we
showed a long time ago.
00:30:40.120 --> 00:30:43.010
This is the same
argument here.
00:30:43.010 --> 00:30:51.400
And what we get then is this
system, which now has a state.
00:30:51.400 --> 00:30:53.810
And the state is the
number of customers
00:30:53.810 --> 00:30:55.010
that are in the system.
00:30:55.010 --> 00:30:58.490
As one customer is being served,
rest of the customers
00:30:58.490 --> 00:31:01.270
are sitting in a queue.
00:31:01.270 --> 00:31:07.880
The transitions over some very
small time, delta, there's a
00:31:07.880 --> 00:31:11.550
probability lambda delta that
a new customer comes in.
00:31:11.550 --> 00:31:13.940
So there's a transition
to the right.
00:31:13.940 --> 00:31:19.080
There's a probability mu delta,
if there's a server
00:31:19.080 --> 00:31:23.260
being served, that that service
gets finished in this
00:31:23.260 --> 00:31:25.470
time delta.
00:31:25.470 --> 00:31:29.440
And if you're in state zero,
then of course, you can get a
00:31:29.440 --> 00:31:31.410
new arrival coming in,
but you can't get any
00:31:31.410 --> 00:31:32.830
service being done.
00:31:32.830 --> 00:31:36.910
So it's saying, as you're all
familiar with, you have this
00:31:36.910 --> 00:31:41.150
system where any time there are
customers in the system,
00:31:41.150 --> 00:31:43.780
they're getting served
at rate mu.
00:31:43.780 --> 00:31:45.640
Mu has to be bigger
than lambda to
00:31:45.640 --> 00:31:47.300
make this thing stable.
00:31:47.300 --> 00:31:50.080
And you can see that
intuitively, I think.
00:31:50.080 --> 00:31:53.340
And when you're in state
zero, then the
00:31:53.340 --> 00:31:55.560
server isn't doing anything.
00:31:55.560 --> 00:31:58.470
So the server is resting,
because the server is faster
00:31:58.470 --> 00:32:01.050
than the arrival process.
00:32:01.050 --> 00:32:03.660
And then the only thing that
can happen is a new arrival
00:32:03.660 --> 00:32:06.380
comes in, and then the server
starts to work again, and
00:32:06.380 --> 00:32:08.920
you're back in state 1.
00:32:08.920 --> 00:32:16.480
So this is just a time sampled
version of the MM1 queue.
00:32:16.480 --> 00:32:21.080
And if you analyze this either
from the guessing theorem that
00:32:21.080 --> 00:32:25.320
I just was talking about or the
general result for birth-
00:32:25.320 --> 00:32:28.550
death chains that we talked
about last time,
00:32:28.550 --> 00:32:37.330
you see that pi sub n minus 1
times lambda delta is equal to
00:32:37.330 --> 00:32:39.510
pi sub n times mu delta.
00:32:39.510 --> 00:32:42.730
The fraction of transitions
going up is equal to the
00:32:42.730 --> 00:32:44.970
fraction of transitions
going down.
00:32:44.970 --> 00:32:48.020
You take the steady state
probability of being in
00:32:48.020 --> 00:32:50.080
state n minus 1.
00:32:50.080 --> 00:32:52.580
You multiply it by the
probability of an up
00:32:52.580 --> 00:32:54.150
transition.
00:32:54.150 --> 00:32:56.660
And you get the same thing, as
you take the probability of
00:32:56.660 --> 00:33:00.380
being in state by n and
multiply it by a down
00:33:00.380 --> 00:33:01.710
transition.
00:33:01.710 --> 00:33:07.020
If you define rho as being
lambda over mu, then what this
00:33:07.020 --> 00:33:10.780
equation says is a steady state
probability of being in
00:33:10.780 --> 00:33:14.750
state n is rho times the steady
state probability of
00:33:14.750 --> 00:33:18.210
being in a state n minus 1.
00:33:18.210 --> 00:33:21.330
This is the same as the general
birth-death result,
00:33:21.330 --> 00:33:24.510
except that rho is a constant
over all states
00:33:24.510 --> 00:33:30.450
rather than state-dependent.
00:33:30.450 --> 00:33:35.340
Pi sub n is then equal to a
rho to the n times pi zero.
00:33:35.340 --> 00:33:39.550
That is, if you recurse on
00:33:39.550 --> 00:33:43.820
this, you get this.
00:33:43.820 --> 00:33:46.880
Then you use the condition that
the pi sub i's have to
00:33:46.880 --> 00:33:48.220
add up to 1.
00:33:48.220 --> 00:33:51.620
And you get pi sub n has to
be equal to 1 minus rho
00:33:51.620 --> 00:33:53.610
times rho to the n.
00:33:53.610 --> 00:33:54.100
OK.
00:33:54.100 --> 00:33:58.360
This is all very simple
and straightforward.
00:33:58.360 --> 00:34:03.220
What's curious about
this is it doesn't
00:34:03.220 --> 00:34:05.996
depend on delta at all.
00:34:05.996 --> 00:34:09.400
You can make delta anything
you want to.
00:34:09.400 --> 00:34:11.820
And we know that if we shrink
delta enough, it's going to
00:34:11.820 --> 00:34:14.730
look very much like
an MM1 queue.
00:34:14.730 --> 00:34:19.139
But it looks like an MM1 queue
no matter what delta is.
00:34:19.139 --> 00:34:22.460
Just so long as lambda plus mu
times delta is less than or
00:34:22.460 --> 00:34:23.120
equal to 1.
00:34:23.120 --> 00:34:27.090
You don't want transition
probabilities to add up to
00:34:27.090 --> 00:34:27.900
more than 1.
00:34:27.900 --> 00:34:31.980
And you have these self loops
here which take up the slack.
00:34:31.980 --> 00:34:38.540
And we saw before that the
steady state probabilities
00:34:38.540 --> 00:34:41.739
didn't have anything to do with
these self transitions.
00:34:41.739 --> 00:34:44.969
And that will turn out to be
sort of useful later on.
00:34:44.969 --> 00:34:47.989
So we get these nice
probabilities which are
00:34:47.989 --> 00:34:57.580
independent of the time
increment that we're taking.
00:34:57.580 --> 00:35:01.680
So we think that this is
probably pretty much operating
00:35:01.680 --> 00:35:05.500
like an actual MM1 queue
would operate.
00:35:05.500 --> 00:35:06.050
OK.
00:35:06.050 --> 00:35:08.530
Now here's this diagram that I
showed you last time, and I
00:35:08.530 --> 00:35:11.230
told you was going
to be confusing.
00:35:11.230 --> 00:35:14.620
And I hope it's a little less
confusing at this point.
00:35:14.620 --> 00:35:16.780
We've now talked about
reversibility.
00:35:16.780 --> 00:35:18.770
We know what reversibility
means.
00:35:18.770 --> 00:35:22.650
We know that we have
reversibility here.
00:35:22.650 --> 00:35:24.770
And what's going on?
00:35:24.770 --> 00:35:28.750
We have this diagram on the
top, which is the usual
00:35:28.750 --> 00:35:34.560
diagram for the way that
an MM1 queue operates.
00:35:34.560 --> 00:35:37.850
You start out in state zero.
00:35:37.850 --> 00:35:40.775
The only thing that can happen
from state zero is at some
00:35:40.775 --> 00:35:44.410
point you get an arrival.
00:35:44.410 --> 00:35:47.160
So the arrival takes
you up there.
00:35:47.160 --> 00:35:49.520
You have no more arrivals
for a while.
00:35:49.520 --> 00:35:51.670
Some later time, you get
another arrival.
00:35:51.670 --> 00:35:52.610
[INAUDIBLE]
00:35:52.610 --> 00:35:56.910
So this is just the arrival
process here.
00:35:56.910 --> 00:36:01.420
This is the number of arrivals
up until time T. At the same
00:36:01.420 --> 00:36:07.650
time, when you have arrivals,
eventually since the server is
00:36:07.650 --> 00:36:11.110
working now, at some point
there can be a departure.
00:36:11.110 --> 00:36:15.240
So we go over to here in
the sample sequence.
00:36:15.240 --> 00:36:17.360
There's eventually a
departure there.
00:36:17.360 --> 00:36:18.860
There's a departure there.
00:36:18.860 --> 00:36:21.330
And then you're back in
state zero again.
00:36:21.330 --> 00:36:24.110
You go along until there's
another arrival.
00:36:24.110 --> 00:36:27.460
Corresponding to this sample
path of arrivals and
00:36:27.460 --> 00:36:30.550
departures, we can say
what the state is.
00:36:30.550 --> 00:36:32.860
The state is just the difference
between the
00:36:32.860 --> 00:36:35.970
arrivals and the departures
for this sample path.
00:36:35.970 --> 00:36:45.190
So the state here starts
out at time 1.
00:36:45.190 --> 00:36:47.040
x1 is equal to 0.
00:36:47.040 --> 00:36:51.340
Then at time x2, suddenly
an arrival comes in.
00:36:51.340 --> 00:36:56.150
x2 is equal to 1, x3 is equal to
1, x4 is equal to 1, x5 is
00:36:56.150 --> 00:36:57.270
equal to 1.
00:36:57.270 --> 00:36:59.260
Another arrival comes in.
00:36:59.260 --> 00:37:00.930
So we have a queue of 1.
00:37:00.930 --> 00:37:04.020
We have the server operating
on one customer.
00:37:04.020 --> 00:37:07.270
Then in the sample path, we
suppose there's a departure.
00:37:07.270 --> 00:37:09.920
And we suppose that
the second arrival
00:37:09.920 --> 00:37:11.830
required hardly any service.
00:37:11.830 --> 00:37:15.430
So there's a very fast
departure there.
00:37:15.430 --> 00:37:19.860
Now, what we're going to do is
to look at what happens.
00:37:19.860 --> 00:37:23.790
This is the picture that we
have for the Markov chain.
00:37:23.790 --> 00:37:26.930
This with the picture we had
for the sample path of
00:37:26.930 --> 00:37:29.830
arrivals and departures for what
we thought was the real
00:37:29.830 --> 00:37:32.390
life thing that was going on.
00:37:32.390 --> 00:37:34.940
We now have the state diagram.
00:37:34.940 --> 00:37:36.780
And now what we're going
to do is say, let's
00:37:36.780 --> 00:37:39.320
look at this backwards.
00:37:39.320 --> 00:37:42.360
And since looking at it
backwards in time is
00:37:42.360 --> 00:37:48.110
complicated, let's look at
it coming in this way.
00:37:48.110 --> 00:37:52.780
So we have the state diagram,
and we try to figure out what,
00:37:52.780 --> 00:37:56.070
going backwards, is going
on here from these state
00:37:56.070 --> 00:37:57.980
transitions.
00:37:57.980 --> 00:38:04.090
Well in going backwards, the
state is increasing by 1.
00:38:04.090 --> 00:38:08.630
So that looks like something
that we would call an arrival.
00:38:08.630 --> 00:38:12.330
Now why am I calling these
arrivals and departures?
00:38:12.330 --> 00:38:18.300
It's because the probability of
any sample path along here
00:38:18.300 --> 00:38:23.040
is going to be the same as a
backward sample path, the same
00:38:23.040 --> 00:38:25.240
sample path, going backwards.
00:38:25.240 --> 00:38:28.390
That's what we've already
established.
00:38:28.390 --> 00:38:33.430
And the probabilities going
backwards are going to be the
00:38:33.430 --> 00:38:36.110
same as the probabilities
going forward.
00:38:36.110 --> 00:38:39.510
Since we can interpret this
going forward as arrivals
00:38:39.510 --> 00:38:42.250
causing up transitions,
departures causing down
00:38:42.250 --> 00:38:48.410
transitions, going backwards we
can say this is an arrival
00:38:48.410 --> 00:38:50.420
in this backward going chain.
00:38:50.420 --> 00:38:53.620
This is an arrival in a
backward going chain.
00:38:53.620 --> 00:38:56.170
This is a departure in the
backward going chain.
00:38:56.170 --> 00:38:57.440
We go along here.
00:38:57.440 --> 00:38:59.690
Finally, there's another
departure in the backward
00:38:59.690 --> 00:39:00.710
going chain.
00:39:00.710 --> 00:39:01.960
This state diagram--
00:39:07.718 --> 00:39:11.660
with two of them, we
might make it.
00:39:11.660 --> 00:39:13.200
Yes, OK.
00:39:13.200 --> 00:39:19.650
The state diagram here
determines this diagram here.
00:39:19.650 --> 00:39:22.590
If I tell you what this
is, you can draw this.
00:39:22.590 --> 00:39:25.620
You can draw every up transition
as an arrival,
00:39:25.620 --> 00:39:28.030
every down transition
as a departure.
00:39:28.030 --> 00:39:32.010
So this diagram is specified
by this diagram.
00:39:32.010 --> 00:39:35.500
This diagram is also specified
by this diagram.
00:39:35.500 --> 00:39:39.600
So this and this each
specify each other.
00:39:39.600 --> 00:39:42.220
Now if we interpret this
as arrivals and this is
00:39:42.220 --> 00:39:47.470
departures, and we have the
probabilities of an MM1 chain,
00:39:47.470 --> 00:39:51.770
then we say the statistics of
these arrivals here are the
00:39:51.770 --> 00:39:58.190
same as a Bernoulli process,
which is coming along the
00:39:58.190 --> 00:40:00.370
other way and leading
to arrivals.
00:40:00.370 --> 00:40:07.360
What that says is the departure
process here is a
00:40:07.360 --> 00:40:08.610
Bernoulli process.
00:40:11.460 --> 00:40:13.440
Now you really have to
wrap your head around
00:40:13.440 --> 00:40:15.050
that a little bit.
00:40:15.050 --> 00:40:18.610
Because we know that departures
only occur when
00:40:18.610 --> 00:40:22.170
you're in states greater
than or equal to 1.
00:40:24.840 --> 00:40:28.060
So what's going on?
00:40:28.060 --> 00:40:33.450
When you're looking at it in
forward time, a departure can
00:40:33.450 --> 00:40:41.680
only occur from a positive
state to some other state.
00:40:41.680 --> 00:40:47.930
Namely from some positive
state to some smaller state.
00:40:47.930 --> 00:40:49.630
Now, when I look at
it backwards in
00:40:49.630 --> 00:40:52.880
time, what do I find?
00:40:52.880 --> 00:40:55.520
I can be in state zero.
00:40:55.520 --> 00:41:01.070
And there could have been
a departure which may--
00:41:01.070 --> 00:41:05.870
if I'm in state zero at time
n, and I say there was a
00:41:05.870 --> 00:41:10.560
departure between n minus 1 and
n, that just says that the
00:41:10.560 --> 00:41:13.740
state at time n minus
1 was equal to 1.
00:41:13.740 --> 00:41:18.060
Not that the state at time
n was equal to 1.
00:41:18.060 --> 00:41:21.400
Because I'm running along here
looking at these arrivals
00:41:21.400 --> 00:41:24.400
going this way, departures
going this way.
00:41:24.400 --> 00:41:29.150
When I'm in state zero,
I can get an arrival.
00:41:29.150 --> 00:41:31.633
I can't when I'm in state 1.
00:41:35.000 --> 00:41:39.450
If I were here, I couldn't
get a departure in the
00:41:39.450 --> 00:41:40.650
next unit of time.
00:41:40.650 --> 00:41:43.100
Because the state is
equal to zero.
00:41:43.100 --> 00:41:46.250
But I can be coming from
a departure in
00:41:46.250 --> 00:41:47.260
the previous state.
00:41:47.260 --> 00:41:49.790
Because in the previous state,
the state was 1.
00:41:53.840 --> 00:41:55.410
I mean, you really have
to say this to
00:41:55.410 --> 00:41:57.180
yourself a dozen times.
00:42:03.050 --> 00:42:04.450
And you have to reason
about it.
00:42:04.450 --> 00:42:08.320
You have to look at the diagram,
read the notes, talk
00:42:08.320 --> 00:42:09.570
to your friends about it.
00:42:12.210 --> 00:42:14.230
And after you do all of
this, it will start to
00:42:14.230 --> 00:42:16.680
make sense to you.
00:42:16.680 --> 00:42:20.740
But I hope I'm at least making
it seem plausible to you.
00:42:20.740 --> 00:42:23.590
So each sample path corresponds
to both a right
00:42:23.590 --> 00:42:25.630
and left moving chain.
00:42:25.630 --> 00:42:26.970
And each of them are MM1.
00:42:29.480 --> 00:42:30.480
So we have Burke's theorem.
00:42:30.480 --> 00:42:34.720
And Burke's theorem says given
an MM1 sample time Markov
00:42:34.720 --> 00:42:38.950
chain in steady state, first,
the departure process is
00:42:38.950 --> 00:42:42.000
Bernoulli at rate lambda.
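A simulation sketch of that first claim (parameters made up): start the sampled-time chain in steady state and count the intervals containing a departure; the rate comes out lambda*delta, not mu*delta.

```python
import random

lam, mu, delta = 1.0, 2.0, 0.1      # made-up parameters; rho = 1/2
rho = lam / mu
rng = random.Random(0)

# Start in steady state: Pr{X = k} = (1 - rho) rho^k.
x = 0
while rng.random() < rho:
    x += 1

steps, deps = 1000000, 0
for _ in range(steps):
    u = rng.random()
    if u < lam * delta:                          # arrival
        x += 1
    elif u < (lam + mu) * delta and x > 0:       # departure
        x -= 1
        deps += 1
    # otherwise: self loop

print(deps / steps, "vs lambda*delta =", lam * delta)   # both about 0.1
```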
00:42:42.000 --> 00:42:42.880
OK.
00:42:42.880 --> 00:42:45.660
Let me put it another way now.
00:42:45.660 --> 00:42:50.040
When we look at it in the
customary way, we're looking
00:42:50.040 --> 00:42:52.320
at things moving
upward in time.
00:42:52.320 --> 00:42:54.340
We know there can't be
a departure when
00:42:54.340 --> 00:42:56.380
you're in state zero.
00:42:56.380 --> 00:43:00.430
That's because we're looking
at departures after
00:43:00.430 --> 00:43:02.220
you're in state zero.
00:43:02.220 --> 00:43:05.520
When we look at time coming in
backwards, we're not being
00:43:05.520 --> 00:43:13.660
dependent on the state to the
left of that departure.
00:43:13.660 --> 00:43:16.850
We're only dependent on the
state after the departure.
00:43:16.850 --> 00:43:20.950
The state after departure
can be anything.
00:43:20.950 --> 00:43:21.850
OK?
00:43:21.850 --> 00:43:25.430
And therefore, after departure
you can be in
00:43:25.430 --> 00:43:26.860
any state at all.
00:43:26.860 --> 00:43:30.440
And therefore, you can always
have a departure, which leaves
00:43:30.440 --> 00:43:31.770
you in state zero.
00:43:36.420 --> 00:43:38.510
That's exactly what this
theorem is saying.
00:43:38.510 --> 00:43:39.340
It's saying--
00:43:39.340 --> 00:43:39.680
yes.
00:43:39.680 --> 00:43:40.930
AUDIENCE: [INAUDIBLE]
00:43:44.004 --> 00:43:45.000
departure process?
00:43:45.000 --> 00:43:47.260
PROFESSOR: Well, a couple
of reasons.
00:43:49.980 --> 00:43:54.080
If you had a Bernoulli process
and departure rate was mu,
00:43:54.080 --> 00:43:56.690
over a long period of time,
you'll have more departures
00:43:56.690 --> 00:43:57.940
than you have arrivals.
00:44:01.860 --> 00:44:07.040
But the other, better reason is
that now you're amortizing
00:44:07.040 --> 00:44:09.820
those departures
over all time.
00:44:09.820 --> 00:44:12.970
And before, you were amortizing
them only over
00:44:12.970 --> 00:44:18.930
times when the state of the
chain was greater than what?
00:44:18.930 --> 00:44:20.720
The probability of the
state of the chain is
00:44:20.720 --> 00:44:22.880
greater than zero is rho.
00:44:22.880 --> 00:44:25.460
And that's the difference
between lambda and mu.
00:44:25.460 --> 00:44:27.780
OK?
00:44:27.780 --> 00:44:30.250
It's not nice, but that's
the way it is.
00:44:30.250 --> 00:44:32.720
Well actually, it is nice when
you're solving problems with
00:44:32.720 --> 00:44:34.280
these things.
00:44:34.280 --> 00:44:38.010
I mean some of you might have
noticed when you were looking
00:44:38.010 --> 00:44:42.120
at the quiz problem dealing with
Poisson processes, that
00:44:42.120 --> 00:44:47.240
it was very, very sticky to say
things about what happens
00:44:47.240 --> 00:44:49.580
at some time in the
past, given what's
00:44:49.580 --> 00:44:52.460
going on in the future.
00:44:52.460 --> 00:44:54.920
Those are nasty problems
to deal with.
00:44:54.920 --> 00:44:58.010
This makes those problems
very easy to deal with.
00:44:58.010 --> 00:45:01.350
Because it's saying, if you go
backward in time, you reverse
00:45:01.350 --> 00:45:03.490
the role of departures
and arrivals.
00:45:03.490 --> 00:45:03.957
Yes.
00:45:03.957 --> 00:45:06.914
AUDIENCE: Can you explain that
one more time, why it's lambda
00:45:06.914 --> 00:45:07.700
and not mu?
00:45:07.700 --> 00:45:10.265
Just the last thing you
said [INAUDIBLE].
00:45:10.265 --> 00:45:11.270
PROFESSOR: OK.
00:45:11.270 --> 00:45:15.950
Last thing I said was that the
probability that the state is
00:45:15.950 --> 00:45:18.390
bigger than zero is rho.
00:45:22.700 --> 00:45:26.770
Because the probability of the
state is zero is 1 minus rho.
00:45:26.770 --> 00:45:32.630
I mean that's not obvious, but
it's just the way it is.
00:45:32.630 --> 00:45:37.970
So that if you're trying to
find the probability of a
00:45:37.970 --> 00:45:42.710
departure and you don't know
what the state is, and you
00:45:42.710 --> 00:45:47.080
just look in at any old time,
it's sort of like a random
00:45:47.080 --> 00:45:48.770
incidence problem.
00:45:48.770 --> 00:45:53.020
I mean, you're looking into this
process, and all you know
00:45:53.020 --> 00:45:55.260
is you're in steady state.
00:45:55.260 --> 00:45:58.740
And you don't know what
the state is.
00:45:58.740 --> 00:46:02.460
I mean you can talk about
the earlier state.
00:46:02.460 --> 00:46:05.350
You can't talk about--
00:46:05.350 --> 00:46:08.270
I mean usually when we talk
about these Markov chains,
00:46:08.270 --> 00:46:12.930
we're talking about state of
time n, transition from time n
00:46:12.930 --> 00:46:15.480
to n plus 1.
00:46:15.480 --> 00:46:19.140
And in that case, you can't have
a departure if you're in
00:46:19.140 --> 00:46:21.880
state zero at time n.
00:46:21.880 --> 00:46:26.760
Now the transition from time n
to n plus 1, if we're moving
00:46:26.760 --> 00:46:32.000
the other way in time, we're
starting out at time n plus 1.
00:46:32.000 --> 00:46:35.380
00:46:35.380 --> 00:46:40.100
If you're in state zero there,
you can still be coming out of
00:46:40.100 --> 00:46:43.960
a departure from time n.
00:46:43.960 --> 00:46:48.200
I mean suppose at time n the
state is 1, and at time n plus
00:46:48.200 --> 00:46:51.290
1 the state is zero.
00:46:51.290 --> 00:46:53.250
That means there was
a departure between
00:46:53.250 --> 00:46:55.420
n and n plus 1.
00:46:55.420 --> 00:46:58.870
But when you're looking at it
from the right, what you see
00:46:58.870 --> 00:47:03.440
is the state at time
n plus 1 is zero.
00:47:03.440 --> 00:47:05.510
And there's a probability
of a departure.
00:47:05.510 --> 00:47:08.570
And it's exactly the same as the
probability of a departure
00:47:08.570 --> 00:47:11.520
given any other state.
00:47:11.520 --> 00:47:12.770
OK?
00:47:19.870 --> 00:47:22.450
If you're just doing this as
mathematicians, you could look
00:47:22.450 --> 00:47:24.820
at these formulas and say
yes, I agree with that.
00:47:24.820 --> 00:47:27.180
It's all very simple.
00:47:27.180 --> 00:47:29.870
Since we're struggling here to
get some insight as to what's
00:47:29.870 --> 00:47:31.620
going on and some
understanding of
00:47:31.620 --> 00:47:35.630
it, it's very tricky.
00:47:35.630 --> 00:47:39.850
Now, the other part of Burke's
theorem says the state at n
00:47:39.850 --> 00:47:45.270
delta is independent of
departures prior to n delta.
00:47:45.270 --> 00:47:47.880
And that seems even worse.
00:47:47.880 --> 00:47:51.145
It says you're looking at
this Markov chain at
00:47:51.145 --> 00:47:52.840
a particular time.
00:47:52.840 --> 00:47:56.340
And you're saying the state of
it is independent of all those
00:47:56.340 --> 00:48:01.980
departures which happened
before that.
00:48:01.980 --> 00:48:04.700
That's really saying
something.
00:48:04.700 --> 00:48:10.170
But if you use this
reversibility condition that
00:48:10.170 --> 00:48:14.060
says, when you look at things
going from right to left,
00:48:14.060 --> 00:48:17.730
arrivals become departures and
departures become arrivals,
00:48:17.730 --> 00:48:22.000
then that statement there is
exactly the same as saying the
00:48:22.000 --> 00:48:26.880
state of a forward going chain
at time n is independent of
00:48:26.880 --> 00:48:30.350
the arrivals that come
after time n.
00:48:30.350 --> 00:48:32.680
Now, you all know
that to be true.
00:48:32.680 --> 00:48:36.200
Because you're all used to
looking at these things moving
00:48:36.200 --> 00:48:37.450
forward in time.
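As a rough empirical check of that independence claim -- again a sketch of my own, not from the course -- one can correlate the steady-state state with a departure indicator from several steps earlier; the correlation comes out near zero. Correlation is weaker than independence, but it is an easy thing to test.

```python
import random

# A rough check (my own sketch, not from the course) of the claim that
# the steady-state state is independent of earlier departures.  We
# estimate the correlation between the state at time n and a departure
# indicator from `lag` steps earlier; it should come out near zero.
lam_d, mu_d = 0.03, 0.05
rng = random.Random(2)

state, deps, states = 0, [], []
for _ in range(1_000_000):
    u = rng.random()
    if u < lam_d:                            # arrival
        state += 1
        deps.append(0)
    elif u < lam_d + mu_d and state > 0:     # departure
        state -= 1
        deps.append(1)
    else:                                    # self-loop
        deps.append(0)
    states.append(state)

def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

lag = 10                                     # departure 10 steps earlier
print(corr(states[lag:], deps[:-lag]))       # close to 0
```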
00:48:42.650 --> 00:48:46.020
So whenever you see a statement
like that, just in
00:48:46.020 --> 00:48:52.460
your head reverse time, or turn
your head around so that
00:48:52.460 --> 00:48:55.020
right becomes left and
left becomes right.
00:48:55.020 --> 00:48:58.040
And then departures become
arrivals and arrivals become
00:48:58.040 --> 00:48:59.410
departures.
00:48:59.410 --> 00:49:01.190
You can't do one without
the other.
00:49:01.190 --> 00:49:04.350
You've got to do both
of them together.
00:49:04.350 --> 00:49:05.050
OK.
00:49:05.050 --> 00:49:10.060
So everything we know about the
M/M/1 sampled-time chain has
00:49:10.060 --> 00:49:14.120
a corresponding statement with
time reversed and arrivals and
00:49:14.120 --> 00:49:15.110
departure switched.
00:49:15.110 --> 00:49:17.290
So it's not only Burke's
theorem.
00:49:17.290 --> 00:49:21.590
I mean, you can write down
100 theorems now.
00:49:21.590 --> 00:49:24.520
And they're all the same idea.
00:49:24.520 --> 00:49:26.810
But the critical idea
is the question
00:49:26.810 --> 00:49:29.760
that two of you asked.
00:49:29.760 --> 00:49:35.260
And that is, why is the
departure rate going to be
00:49:35.260 --> 00:49:40.120
lambda when you look at things
coming in backwards?
00:49:40.120 --> 00:49:45.010
And the answer again is that
it's lambda because we're not
00:49:45.010 --> 00:49:49.645
conditioning it on knowing
what the prior state was.
00:49:49.645 --> 00:49:55.370
And everything else you know
about these things, you always
00:49:55.370 --> 00:49:57.450
condition things on
the prior state.
00:49:57.450 --> 00:50:00.050
So now we're getting used
to conditioning them
00:50:00.050 --> 00:50:03.220
on the later state.
00:50:03.220 --> 00:50:03.640
OK.
00:50:03.640 --> 00:50:05.590
Let's talk about branching
processes.
00:50:05.590 --> 00:50:09.390
Branching processes have
nothing to do with
00:50:09.390 --> 00:50:11.570
reversibility.
00:50:11.570 --> 00:50:17.790
Again, these are just very
curious kinds of processes.
00:50:17.790 --> 00:50:25.870
They have a lot to do with all
kinds of genetic
00:50:25.870 --> 00:50:32.080
questions, and with lots of
physics experiments.
00:50:32.080 --> 00:50:36.200
I don't think a branching
process corresponds very
00:50:36.200 --> 00:50:38.690
closely to any one
of those things.
00:50:38.690 --> 00:50:42.330
This is the same kind of
modeling issue that we come up
00:50:42.330 --> 00:50:43.980
against all the time.
00:50:43.980 --> 00:50:48.030
What we do is, we pick very,
very simple models to try to
00:50:48.030 --> 00:50:51.410
understand one aspect of
a physical problem.
00:50:51.410 --> 00:50:54.860
And if you try to ask for a
model that understands all
00:50:54.860 --> 00:50:57.630
aspects of that physical
problem, you've got a model
00:50:57.630 --> 00:50:59.870
that's too complicated to
say anything about.
00:50:59.870 --> 00:51:07.140
But here's a model that says,
if you believe that from one
00:51:07.140 --> 00:51:11.210
generation to the next, the
only thing that's happening is
00:51:11.210 --> 00:51:17.360
the individuals in one
generation are spawning
00:51:17.360 --> 00:51:22.100
children or are spawning
whatever it is in that next
00:51:22.100 --> 00:51:26.760
generation, and every individual
does this in an
00:51:26.760 --> 00:51:31.430
independent way, then this is
what you have to live with.
00:51:31.430 --> 00:51:33.120
I mean that's what the
mathematics says.
00:51:33.120 --> 00:51:38.000
The model may be no good, but
the mathematics is fine.
00:51:38.000 --> 00:51:42.990
So the mathematics is we suppose
that x of n is the
00:51:42.990 --> 00:51:48.080
state of the Markov chain at
time n, and the Markov chain
00:51:48.080 --> 00:51:52.160
is described in the following
way. x of n, we think of as
00:51:52.160 --> 00:51:58.090
being the number of elements in
generation n, and for each
00:51:58.090 --> 00:52:07.790
element k, out of that x of n,
each element gives rise to a
00:52:07.790 --> 00:52:09.820
number of new elements.
00:52:09.820 --> 00:52:12.850
And the number of new elements
it gives rise to we
00:52:12.850 --> 00:52:14.990
call it y sub kn.
00:52:14.990 --> 00:52:19.190
The n at the end is for the
generation, the k is for the
00:52:19.190 --> 00:52:22.370
particular element in
that generation.
00:52:22.370 --> 00:52:27.345
So y sub kn is the number of
offspring of the element k in
00:52:27.345 --> 00:52:29.130
the n-th generation.
00:52:29.130 --> 00:52:33.610
After the element in
the n-th generation
00:52:33.610 --> 00:52:36.640
gives birth, it dies.
00:52:36.640 --> 00:52:42.700
So it's kind of a cruel world,
but that's this particular
00:52:42.700 --> 00:52:44.300
kind of model.
00:52:44.300 --> 00:52:48.490
So the number of elements in
the n plus first generation
00:52:48.490 --> 00:52:52.830
then, is the sum of the number
of offspring of the elements
00:52:52.830 --> 00:52:54.840
in the n-th generation.
00:52:54.840 --> 00:53:03.640
So it says x of n plus 1 is
equal to this sum. y sub kn is the
00:53:03.640 --> 00:53:08.680
number of offspring of element
k, and we sum that number of
00:53:08.680 --> 00:53:14.290
offspring for k from 1 to
x of n, and that's
00:53:14.290 --> 00:53:15.640
the equation we get.
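In symbols, the equation just described is:

```latex
x_{n+1} \;=\; \sum_{k=1}^{x_n} Y_{k,n}.
```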
00:53:15.640 --> 00:53:19.580
The assumption we make is that
the non-negative integer
00:53:19.580 --> 00:53:22.220
random variables y sub kn--
00:53:22.220 --> 00:53:24.020
these random variables--
00:53:24.020 --> 00:53:28.080
are independent and identically
distributed over
00:53:28.080 --> 00:53:30.080
both n and k.
00:53:30.080 --> 00:53:35.250
There's this usual peculiar
problem that we have where
00:53:35.250 --> 00:53:37.400
we're defining random variables
that might not
00:53:37.400 --> 00:53:41.910
exist, but we should be
used to that by now.
00:53:41.910 --> 00:53:45.720
I mean we just have the random
variables there, and we pick
00:53:45.720 --> 00:53:47.510
them out when we need
them; that's the best way
00:53:47.510 --> 00:53:48.950
to think about it.
00:53:48.950 --> 00:53:52.850
The initial generation x of 0
can be an arbitrary positive
00:53:52.850 --> 00:53:56.660
random variable, but it's
usually taken to be 1.
00:53:56.660 --> 00:54:01.180
So you start out with one
element, and this thing goes
00:54:01.180 --> 00:54:04.410
on from one generation
to the next.
00:54:04.410 --> 00:54:09.920
It might all die out, or it
might continue, it might blow
00:54:09.920 --> 00:54:14.700
up explosively, and we want
to find out which it does.
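To make the model concrete, here is a minimal simulation sketch of one run -- my own code with a made-up offspring pmf, not an example from the lecture:

```python
import random

# A minimal sketch (mine; the pmf below is a made-up example).  Each of
# the x_n elements draws an iid offspring count y, and x_{n+1} is the sum.
def next_generation(x_n, pmf, rng):
    counts, probs = zip(*pmf.items())
    return sum(rng.choices(counts, probs)[0] for _ in range(x_n))

rng = random.Random(0)
pmf = {0: 0.25, 1: 0.25, 2: 0.5}   # hypothetical offspring pmf, y-bar = 1.25
x = 1                               # generation 0: a single element
for n in range(10):
    print(f"generation {n}: {x} elements")
    x = next_generation(x, pmf, rng)
```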
00:54:14.700 --> 00:54:17.600
OK so here's the critical
equation.
00:54:17.600 --> 00:54:20.660
Let's look at a couple of
examples. If y sub kn is
00:54:20.660 --> 00:54:26.050
deterministic and equals 1, then
xn is equal to xn minus 1
00:54:26.050 --> 00:54:28.980
is equal to x0 for all n greater
than or equal to 1.
00:54:28.980 --> 00:54:31.870
So this example isn't
very interesting.
00:54:31.870 --> 00:54:37.300
If y kn is equal to 2, then each
generation has twice as
00:54:37.300 --> 00:54:39.980
many elements as the previous
generation.
00:54:39.980 --> 00:54:43.270
Each element has
two offspring.
00:54:43.270 --> 00:54:47.600
So you have something that looks
like a tree, which is
00:54:47.600 --> 00:54:50.200
where the name branching process
comes from because
00:54:50.200 --> 00:54:52.390
people think of these things
in terms of trees.
00:54:55.270 --> 00:55:08.020
One element here, two
offspring, two elements here,
00:55:08.020 --> 00:55:10.860
and now if you visualize this
kind of chain, you can think
00:55:10.860 --> 00:55:12.580
of this as being random.
00:55:12.580 --> 00:55:17.140
So perhaps in this
first generation
00:55:17.140 --> 00:55:19.020
there are two offspring.
00:55:19.020 --> 00:55:23.840
Perhaps this one has no
offspring the next time, so
00:55:23.840 --> 00:55:25.540
this dies out.
00:55:25.540 --> 00:55:27.130
This one has two offspring.
00:55:30.070 --> 00:55:33.400
This one has two offspring.
00:55:33.400 --> 00:55:37.420
This one has no offspring,
this one has two.
00:55:37.420 --> 00:55:39.420
Four, we're up to four.
00:55:39.420 --> 00:55:42.540
And then all of them die out.
00:55:42.540 --> 00:55:46.000
So we're talking about that kind
of process, which you can
00:55:46.000 --> 00:55:49.040
visualize as a tree just
as easily as you can
00:55:49.040 --> 00:55:50.520
visualize it this way.
00:55:50.520 --> 00:55:53.030
Personally, I find it easier
to do this as a tree.
00:55:53.030 --> 00:55:59.790
But that's just personal
preference.
00:55:59.790 --> 00:56:06.280
OK, so let's talk about this
third kind of animal here.
00:56:06.280 --> 00:56:12.490
If the probability of no
offspring is 1/2, and the
00:56:12.490 --> 00:56:17.640
probability of twins is 1/2,
then xn is a rather
00:56:17.640 --> 00:56:20.000
peculiar Markov chain.
00:56:20.000 --> 00:56:24.180
It can grow explosively,
or it can die out.
00:56:24.180 --> 00:56:26.420
Who would guess that it's going
to grow explosively on
00:56:26.420 --> 00:56:27.010
the average?
00:56:27.010 --> 00:56:29.280
And who would guess that it will
die out on the average?
00:56:33.600 --> 00:56:36.020
I mean would anybody hazard
a guess that this will
00:56:36.020 --> 00:56:40.310
die out with probability one?
00:56:40.310 --> 00:56:44.670
Well it will, and we'll
see that today.
00:56:44.670 --> 00:56:50.180
It can grow for quite a while,
but eventually it gets killed.
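Here is a quick empirical check of that claim -- a sketch of mine, not from the lecture. With P(0 offspring) = P(2 offspring) = 1/2, so y-bar is 1, essentially every simulated run is extinct within the horizon:

```python
import random

# A sketch (mine) of the 0-or-2 offspring example: P(0) = P(2) = 1/2,
# so y-bar = 1.  Almost every run dies out, even though runs can grow
# for a long while first.
rng = random.Random(7)
trials, horizon = 2_000, 1_000
extinct = 0
for _ in range(trials):
    x = 1
    for _ in range(horizon):
        if x == 0:
            extinct += 1
            break
        # each element independently has two children with probability 1/2
        x = 2 * sum(rng.random() < 0.5 for _ in range(x))
print(f"fraction extinct within {horizon} generations: {extinct / trials:.3f}")
```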
00:56:50.180 --> 00:56:54.020
When we look at this process
now, state
00:56:54.020 --> 00:56:55.950
0 is a trapping state.
00:56:55.950 --> 00:56:58.790
State 0 is always
a trapping state
00:56:58.790 --> 00:57:00.720
for branching processes.
00:57:00.720 --> 00:57:03.860
Because once you get to state
0, there's nothing to have
00:57:03.860 --> 00:57:07.040
offspring anymore.
00:57:07.040 --> 00:57:11.220
So state 0 is always
a trapping state.
00:57:11.220 --> 00:57:16.600
But in other states you can have
rather explosive growth.
00:57:16.600 --> 00:57:22.420
For this particular thing here
the even numbered states all
00:57:22.420 --> 00:57:26.270
communicate, but they
are transient.
00:57:26.270 --> 00:57:28.710
Each odd numbered state doesn't
communicate with any
00:57:28.710 --> 00:57:30.090
other state.
00:57:30.090 --> 00:57:33.820
As you see from this diagram
here, you're always dealing
00:57:33.820 --> 00:57:35.950
with an even number
of elements here.
00:57:35.950 --> 00:57:42.340
Because each
element has
00:57:42.340 --> 00:57:46.490
either two or zero offspring.
00:57:46.490 --> 00:57:51.310
So you're summing up a bunch of
even numbers, and you never
00:57:51.310 --> 00:57:55.960
get anything odd, except this
initial state of one, which
00:57:55.960 --> 00:57:58.390
you get out of right away.
00:57:58.390 --> 00:58:01.580
OK so how do we analyze
these things?
00:58:01.580 --> 00:58:05.100
We want to find the probability
for the general
00:58:05.100 --> 00:58:08.240
case that the process
dies out.
00:58:08.240 --> 00:58:12.095
So let's simplify our notation
a little bit.
00:58:14.760 --> 00:58:22.820
We're going to let the pmf on
y-- we have a pmf on y because
00:58:22.820 --> 00:58:25.750
y is an integer random
variable.
00:58:25.750 --> 00:58:29.150
It's 0, or 1, or
2, or so forth.
00:58:29.150 --> 00:58:32.770
We'll call that p sub k.
00:58:32.770 --> 00:58:38.620
For the Markov chain namely
x0, x1, x2, and so forth,
00:58:38.620 --> 00:58:43.380
we're going to let P sub ij,
as usual, be the transition
00:58:43.380 --> 00:58:46.280
probabilities in the
Markov chain.
00:58:46.280 --> 00:58:49.820
And here it's very useful to
talk about the probability that
00:58:49.820 --> 00:58:53.050
state j is reached on
or before step n,
00:58:53.050 --> 00:58:54.630
starting from state i.
00:58:54.630 --> 00:58:57.510
Remember we talked
about that--
00:58:57.510 --> 00:58:59.030
I forget whether we
talked about it last
00:58:59.030 --> 00:59:00.780
time, or the time before--
00:59:00.780 --> 00:59:05.630
but you can look it up
and see what it is.
00:59:05.630 --> 00:59:10.080
The thing we derive for it is
the probability that you will
00:59:10.080 --> 00:59:16.600
have touched state j in one of
the n previous tries starting
00:59:16.600 --> 00:59:17.890
in state i.
00:59:17.890 --> 00:59:19.760
It's p sub ij.
00:59:19.760 --> 00:59:21.980
That's the probability you reach
it right away so you're
00:59:21.980 --> 00:59:23.330
successful.
00:59:23.330 --> 00:59:26.620
And then for every other state you
might reach on the first
00:59:26.620 --> 00:59:31.130
try, there's the probability
of going
00:59:31.130 --> 00:59:32.910
to that state, initially.
00:59:32.910 --> 00:59:35.500
So this is what happens
in the first trial.
00:59:35.500 --> 00:59:38.300
And then there's the probability
you will have gone
00:59:38.300 --> 00:59:43.320
from state k to state j, in
any one of the n minus 1
00:59:43.320 --> 00:59:45.550
steps after that.
00:59:45.550 --> 00:59:48.570
So f ij of 1 is p ij.
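In symbols, the recursion just described is:

```latex
f_{ij}(n) \;=\; p_{ij} \;+\; \sum_{k \neq j} p_{ik}\, f_{kj}(n-1),
\qquad f_{ij}(1) = p_{ij}.
```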
00:59:48.570 --> 00:59:53.410
And now what we're interested
in is we start with one
00:59:53.410 --> 00:59:57.190
element, and we're interested
in the probability that it
00:59:57.190 --> 00:59:59.510
dies out, before it explodes.
00:59:59.510 --> 01:00:01.780
Or just the probability
it dies out.
01:00:01.780 --> 01:00:06.310
So f sub 1,0 of n is the
probability starting in state
01:00:06.310 --> 01:00:13.530
1 that you're going to reach
state 0 within n steps.
01:00:13.530 --> 01:00:20.340
So it's p 0 plus this sum of
p sub k, the probability you go
01:00:20.340 --> 01:00:24.160
immediately to state k, and now
here's the only hard thing
01:00:24.160 --> 01:00:25.810
about this.
01:00:25.810 --> 01:00:31.050
What I claim now is if we go
to state k, then in state k we
01:00:31.050 --> 01:00:35.780
have k elements in this
first generation.
01:00:35.780 --> 01:00:40.950
Now what's the probability that
starting with k elements
01:00:40.950 --> 01:00:45.590
we're going to be dead after
n minus 1 transitions?
01:00:45.590 --> 01:00:49.720
Well to be dead after n minus
1 transitions every one of
01:00:49.720 --> 01:00:54.880
these elements has to die out.
01:00:54.880 --> 01:00:55.790
And they're independent.
01:00:55.790 --> 01:00:57.840
Everything that's going on
from each element is
01:00:57.840 --> 01:01:00.440
independent of everything
from each other element.
01:01:00.440 --> 01:01:06.240
So the probability that this
first one dies out is f sub 1
01:01:06.240 --> 01:01:09.450
0 of n minus 1 steps.
01:01:09.450 --> 01:01:11.510
Probability the second
one dies out--
01:01:11.510 --> 01:01:14.000
same thing. You take the
product of them.
01:01:14.000 --> 01:01:22.050
So this p 0 is the probability that
we die out immediately, and
01:01:22.050 --> 01:01:26.290
this sum from k equals
1 to infinity
01:01:26.290 --> 01:01:30.980
is the probability that each of
the k descendants dies out
01:01:30.980 --> 01:01:34.740
within time n minus 1.
01:01:34.740 --> 01:01:38.650
We can take the p 0 into the sum
here, so we sum from 0 up to
01:01:38.650 --> 01:01:43.720
infinity, because f sub 1 0 to
the 0 power is just equal to 1.
01:01:43.720 --> 01:01:48.980
So we get this nice looking
formula here.
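Written out, that formula is:

```latex
f_{1,0}(n) \;=\; p_0 \;+\; \sum_{k \ge 1} p_k \bigl[f_{1,0}(n-1)\bigr]^{k}
\;=\; \sum_{k \ge 0} p_k \bigl[f_{1,0}(n-1)\bigr]^{k}.
```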
01:01:48.980 --> 01:01:50.760
Let me do this quickly,
and then we'll go back
01:01:50.760 --> 01:01:52.070
and talk about it.
01:01:52.070 --> 01:01:58.006
Let's talk about
the z transform
01:01:58.006 --> 01:02:00.670
of this birth process.
01:02:00.670 --> 01:02:06.315
OK so we have this discrete
random variable y with pmf p
01:02:06.315 --> 01:02:12.670
sub k. h of z is the sum over k
of p sub k times z to the k.
01:02:12.670 --> 01:02:15.130
It's just another kind of
transform; we have all kinds
01:02:15.130 --> 01:02:17.520
of transforms in this course.
01:02:17.520 --> 01:02:20.510
And this is one transform.
01:02:20.510 --> 01:02:26.240
Given a pmf, you
can define a function of z
01:02:26.240 --> 01:02:28.470
in this way.
01:02:28.470 --> 01:02:40.550
So f 1 0 of n is then equal to
h of f 1 0 of n minus 1.
01:02:40.550 --> 01:02:44.750
It's amazing that all this mess
turns into something that
01:02:44.750 --> 01:02:47.490
looks so simple.
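Numerically the recursion is a two-line loop. The sketch below is mine, with the same made-up pmf as before; iterating f = h(f) from f 1 0 of 0 equals 0 converges to the extinction probability:

```python
# A sketch (mine) of the recursion f_10(n) = h(f_10(n-1)), where
# h(z) = sum_k p_k z^k.  The made-up pmf has y-bar = 1.25 > 1, and the
# iteration converges to an extinction probability of 1/2, not 1.
pmf = {0: 0.25, 1: 0.25, 2: 0.5}

def h(z):
    return sum(pk * z**k for k, pk in pmf.items())

f = 0.0                       # f_10(0) = 0: can't be extinct in zero steps
for n in range(1, 41):
    f = h(f)                  # f is now f_10(n)
print(f"extinction probability ~ {f:.6f}")   # -> 0.5 for this pmf
```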
01:02:47.490 --> 01:02:52.630
So this will be the probability
that we will die
01:02:52.630 --> 01:02:56.860
out by time n, if in fact we
know what the probability of
01:02:56.860 --> 01:02:59.450
dying out at time
n minus 1 is.
01:02:59.450 --> 01:03:00.700
So let's try to solve
this equation.
01:03:03.510 --> 01:03:05.445
And it's not as hard to solve
as you would think.
01:03:08.250 --> 01:03:14.240
There's this z transform h of
z, h of z is given there.
01:03:14.240 --> 01:03:16.590
What do I know about h of z?
01:03:16.590 --> 01:03:19.520
I know its value at
z equal to 1.
01:03:19.520 --> 01:03:25.590
Because at z equal to 1, I'm
just summing p sub k times 1.
01:03:25.590 --> 01:03:29.830
So h of 1 is equal to 1.
01:03:29.830 --> 01:03:34.040
That's what this is in
both cases here.
01:03:34.040 --> 01:03:36.280
What else do I know about it?
01:03:36.280 --> 01:03:39.055
h of 0 is equal to p 0.
01:03:41.890 --> 01:03:46.440
And if you take the second
derivative of this, you find
01:03:46.440 --> 01:03:49.930
out immediately that the second
derivative is positive.
01:03:49.930 --> 01:03:54.260
So this curve is convex;
it goes like that,
01:03:54.260 --> 01:03:57.260
as it's been drawn here.
01:03:57.260 --> 01:04:04.030
The other thing we know is
this derivative at 1. The
01:04:04.030 --> 01:04:08.310
derivative of h of z is equal to
the sum over k of p sub
01:04:08.310 --> 01:04:14.520
k, times k, times z
to the k minus 1.
01:04:14.520 --> 01:04:16.780
I set z equal to 1,
and what is that?
01:04:16.780 --> 01:04:20.380
It's the sum of p sub k times k.
01:04:20.380 --> 01:04:26.150
So this derivative
here is y-bar.
01:04:26.150 --> 01:04:28.920
In this case, I'm looking at a
case where y-bar is equal to
01:04:28.920 --> 01:04:32.440
1, in this case, I'm looking
at a case where y-bar is
01:04:32.440 --> 01:04:33.690
bigger than 1.
01:04:36.450 --> 01:04:37.730
Everybody with me so far?
01:04:41.104 --> 01:04:43.514
AUDIENCE: So what
is the y-bar?
01:04:43.514 --> 01:04:47.030
PROFESSOR: y-bar is the expected
value of the random
01:04:47.030 --> 01:04:48.600
variable y.
01:04:48.600 --> 01:04:51.500
And the random variable y is the
number of offspring than
01:04:51.500 --> 01:04:53.960
any one element will have.
01:04:53.960 --> 01:04:58.900
The whole thing is defined
in terms of this y.
01:04:58.900 --> 01:05:01.500
I mean it's the only thing that
I've given you except all
01:05:01.500 --> 01:05:03.520
these independence conditions.
01:05:03.520 --> 01:05:05.030
It's like the Poisson process.
01:05:05.030 --> 01:05:10.140
There's only one element
in it, which is lambda.
01:05:10.140 --> 01:05:13.940
A complicated process, but
it's defined in terms of
01:05:13.940 --> 01:05:16.970
everything being independent
of everything else.
01:05:16.970 --> 01:05:20.440
And that's the same kind
of thing we have here.
01:05:20.440 --> 01:05:28.210
Well what I've drawn here
is a graphical way of finding
01:05:28.210 --> 01:05:35.790
what f 1 0 of 1, f 1 0 of 2,
f 1 0 of 3 and so forth are.
01:05:35.790 --> 01:05:43.530
And we start out with one
element here, and I want to
01:05:43.530 --> 01:05:55.320
find f 1 0 of 1, which
is h of f 1 0 of 0.
01:05:55.320 --> 01:05:58.450
What is f 1 0 of 0?
01:05:58.450 --> 01:05:59.730
It has to be 0.
01:05:59.730 --> 01:06:06.030
So f 1 0 of 1 is just
equal to h of 0, which is p 0.
01:06:06.030 --> 01:06:12.170
So I trace from here, from p 0,
over to this line of slope one.
01:06:12.170 --> 01:06:20.950
So this is p 0 down here, and
this point here is f 1 0 of 1,
01:06:20.950 --> 01:06:22.840
at this point.
01:06:22.840 --> 01:06:29.310
Starting here I go over to here,
and down here this is f
01:06:29.310 --> 01:06:34.300
1 0 of 1, as advertised.
01:06:34.300 --> 01:06:38.520
I can move up to the curve
and I get f 1 0 of 2.
01:06:38.520 --> 01:06:47.640
That's h of z, where z
is equal to f 1 0 of 1.
01:06:47.640 --> 01:06:50.370
And so forth along here.
01:06:50.370 --> 01:06:53.990
I'm not going to spend a lot
of time explaining the
01:06:53.990 --> 01:06:57.460
graphical procedure, because
this is something that you
01:06:57.460 --> 01:07:01.130
look at on your own, and you
sort it out in two minutes,
01:07:01.130 --> 01:07:04.240
and if I explained it, I mean
you'd be looking at it at a
01:07:04.240 --> 01:07:06.330
different speed than
I'm explaining it
01:07:06.330 --> 01:07:08.500
at, so it won't work.
01:07:08.500 --> 01:07:12.900
But what happens is, starting
out with some p 0, you just
01:07:12.900 --> 01:07:17.769
move along; each of these points
is f 1 0 of 1, f 1 0
01:07:17.769 --> 01:07:22.010
of 2, f 1 0 of 3, up to
f 1 0 of infinity.
01:07:22.010 --> 01:07:28.210
This is the probability that
the process will die out
01:07:28.210 --> 01:07:31.040
eventually.
01:07:31.040 --> 01:07:37.840
So it's the point at which
h of z equals z.
01:07:37.840 --> 01:07:41.430
That's the root of the equation,
h of z equals z.
01:07:41.430 --> 01:07:45.400
We already know what that is
pretty much, because we know
01:07:45.400 --> 01:07:50.460
that we're looking at a case
here where y-bar is
01:07:50.460 --> 01:07:51.730
greater than 1.
01:07:51.730 --> 01:07:58.820
So it means the slope here
is bigger than 1.
01:07:58.820 --> 01:08:02.270
We have a convex curve which
starts on this side of this
01:08:02.270 --> 01:08:05.500
line, that ends on the other
side of this line.
01:08:05.500 --> 01:08:07.920
There's got to be a root
in the middle.
01:08:07.920 --> 01:08:11.650
And there can only be one root,
so we eventually get to
01:08:11.650 --> 01:08:12.640
that point.
01:08:12.640 --> 01:08:15.380
And that's the probability
of dying out.
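Equivalently, one can go after that root of h of z equals z directly. This bisection sketch is my own shortcut, not the lecture's graphical method; it assumes y-bar greater than 1 so the bracket is valid, and it degrades gracefully to 1 otherwise:

```python
# A sketch (mine): the extinction probability as the smallest root of
# h(z) = z on [0, 1), found by bisection on g(z) = h(z) - z.  When
# y-bar > 1 we have g(0) = p_0 > 0 and g(z) < 0 just below 1, so the
# bracket is valid; when y-bar <= 1 the answer comes out as 1.
def extinction_prob(pmf, tol=1e-12):
    h = lambda z: sum(pk * z**k for k, pk in pmf.items())
    lo, hi = 0.0, 1.0 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) - mid > 0:      # still left of the smallest root
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(extinction_prob({0: 0.25, 1: 0.25, 2: 0.5}))   # ~0.5 for this pmf
```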
01:08:18.640 --> 01:08:23.529
Now, over in this case,
y-bar is equal to 1.
01:08:23.529 --> 01:08:27.330
Or I could look at a case where
y-bar is less than 1. And what
happens then?
happens then?
01:08:28.878 --> 01:08:33.560
Keep moving around the same
way, I end up at that point,
01:08:33.560 --> 01:08:37.830
and in fact f 1 0 of infinity
is equal to 1.
01:08:37.830 --> 01:08:41.399
Which says the probability of
dying out is equal to 1.
01:08:46.620 --> 01:08:49.130
These things that I'm
calculating here are in fact
01:08:49.130 --> 01:08:53.060
the probability of dying out by
time 1, the probability of
01:08:53.060 --> 01:08:56.430
dying out by time 2, and so
forth all the way up.
01:08:56.430 --> 01:08:59.649
In here we start out on this
side of the curve.
01:08:59.649 --> 01:09:01.920
We keep getting crunched in.
01:09:01.920 --> 01:09:06.140
We wind up at that point, and in
this case, we keep getting
01:09:06.140 --> 01:09:08.960
crunched up, and we wind
up at that point.
01:09:08.960 --> 01:09:14.279
So the general behavior of these
branching processes is
01:09:14.279 --> 01:09:18.790
so long as there's a possibility
of an element
01:09:18.790 --> 01:09:22.819
having no children, there's a
possibility that the whole
01:09:22.819 --> 01:09:25.920
process will die out.
01:09:25.920 --> 01:09:31.689
But if the expected number of
offspring is greater than 1,
01:09:31.689 --> 01:09:36.770
then that probability of dying
out is less than 1.
01:09:36.770 --> 01:09:42.410
Whereas if the expected number of
offspring is less than or
01:09:42.410 --> 01:09:46.529
equal to 1, then the probability
of dying out is in
01:09:46.529 --> 01:09:47.779
fact equal to 1.
01:09:50.180 --> 01:09:52.920
So that was just this graphical
picture, and that
01:09:52.920 --> 01:09:55.990
does the whole thing, and if
you think about it for 10
01:09:55.990 --> 01:09:59.880
minutes in a quiet room, I think
it will be obvious to
01:09:59.880 --> 01:10:02.440
you, because there's no
rocket science here.
01:10:02.440 --> 01:10:07.500
It's just a simple graphical
argument.
01:10:07.500 --> 01:10:10.490
I have to think about it every
time I do it, because it
01:10:10.490 --> 01:10:13.220
always looks implausible.
01:10:13.220 --> 01:10:18.700
So it says the process can
explode if the expected number
01:10:18.700 --> 01:10:23.390
of elements from each element
is larger than 1.
01:10:23.390 --> 01:10:26.030
But it doesn't have
to explode.
01:10:26.030 --> 01:10:28.450
There's an interesting theorem
that we'll talk about when we
01:10:28.450 --> 01:10:31.800
start talking about
martingales.
01:10:31.800 --> 01:10:36.540
And that is that the number of
elements in generation n
01:10:36.540 --> 01:10:50.760
divided by the expected value of
y to the n-th power: x sub n
01:10:50.760 --> 01:10:55.700
divided by y-bar to
the n-th power.
01:10:55.700 --> 01:10:57.570
This is something that
looks like it ought
01:10:57.570 --> 01:11:00.650
to be kind of stable.
01:11:00.650 --> 01:11:04.380
And it says that this approaches
a random variable.
01:11:04.380 --> 01:11:12.180
Namely with probability 1, this
has some random value
01:11:12.180 --> 01:11:13.430
that you can calculate.
01:11:18.170 --> 01:11:21.650
With a certain probability,
this is equal to 0.
01:11:21.650 --> 01:11:28.880
With a certain probability this
is some larger constant.
01:11:28.880 --> 01:11:31.550
And it can be any old constant
at all with different
01:11:31.550 --> 01:11:32.940
probabilities.
01:11:32.940 --> 01:11:35.860
And you can sort of see
why this is happening.
01:11:35.860 --> 01:11:38.580
Suppose you have this process.
01:11:38.580 --> 01:11:40.800
Suppose y-bar is
bigger than 1.
01:11:40.800 --> 01:11:43.600
Suppose it's equal
to 2 for example.
01:11:43.600 --> 01:11:46.340
So the expected number of
offspring of each of these
01:11:46.340 --> 01:11:53.080
elements is two, so the expected
number of offspring of 10 to the 6
01:11:53.080 --> 01:11:56.650
elements is 2 times
10 to the sixth.
01:11:56.650 --> 01:12:00.630
What this is doing is dividing
by that multiplying factor.
01:12:00.630 --> 01:12:03.280
What's going to happen then is
after a certain amount of
01:12:03.280 --> 01:12:07.630
time, you have so many elements,
and each one of them
01:12:07.630 --> 01:12:10.840
is doing something
independently, so the number
01:12:10.840 --> 01:12:16.280
of offspring in each generation
divided by an extra
01:12:16.280 --> 01:12:20.350
y-bar is almost constant.
01:12:20.350 --> 01:12:22.400
And that's what this
theorem is saying.
01:12:22.400 --> 01:12:27.190
So that after a while it says
the growth rate becomes fixed.
01:12:27.190 --> 01:12:28.890
And that's sort of obvious
intuitively.
01:12:32.040 --> 01:12:33.500
That's enough for that.
01:12:38.980 --> 01:12:40.190
I shouldn't have been so talkative
01:12:40.190 --> 01:12:41.530
about the earlier things.
01:12:41.530 --> 01:12:48.930
But Markov processes turn out
to be pretty simple, given
01:12:48.930 --> 01:12:51.190
what we know about
Markov chains.
01:12:51.190 --> 01:12:54.560
There's not a lot of new things
to be learned here.
01:12:54.560 --> 01:12:56.460
Just a few.
01:12:56.460 --> 01:13:00.840
A countable-state Markov process
is most easily viewed
01:13:00.840 --> 01:13:02.940
as a simple extension
of a countable-
state Markov chain.
state Markov chain.
01:13:05.250 --> 01:13:10.500
And along with each state in
the Markov chain, there's a
01:13:10.500 --> 01:13:12.090
holding time.
01:13:12.090 --> 01:13:17.650
So what happens in this process
is it goes along.
01:13:17.650 --> 01:13:22.590
At a certain point there's
a state change.
01:13:22.590 --> 01:13:26.770
The state change is according
to the Markov chain, and the
01:13:26.770 --> 01:13:32.190
amount of time that it takes
is an exponential random
01:13:32.190 --> 01:13:35.540
variable which depends on
the state you are in.
01:13:35.540 --> 01:13:38.840
So in some states you
move quickly.
01:13:38.840 --> 01:13:41.430
In some states you
move slowly.
01:13:41.430 --> 01:13:44.080
But the only thing that's going
on is you have a Markov
01:13:44.080 --> 01:13:48.740
chain, and in each state of the
Markov chain, there's some
01:13:48.740 --> 01:13:52.590
rate which determines how long
it's going to take to get to
01:13:52.590 --> 01:13:58.100
the next state change.
01:13:58.100 --> 01:14:02.490
So that you can visualize what
the process looks like--
01:14:07.910 --> 01:14:11.560
This is the state at time 0.
01:14:11.560 --> 01:14:15.820
This determines some
holding time u1.
01:14:15.820 --> 01:14:21.340
It also determines some
state at time 1.
01:14:21.340 --> 01:14:25.160
The state you go to is
independent of how long it
01:14:25.160 --> 01:14:27.520
takes you to get there.
01:14:27.520 --> 01:14:30.950
This then determines the rate,
so it tells you the rate of
01:14:30.950 --> 01:14:33.850
this exponential random
variable.
01:14:33.850 --> 01:14:37.850
And we have this holding-time
process leading off, plus we
01:14:37.850 --> 01:14:41.860
have this Markov process leading
along here, and for
01:14:41.860 --> 01:14:44.390
each state of the Markov
process, you have
01:14:44.390 --> 01:14:45.770
this holding time.
01:14:45.770 --> 01:14:49.800
You will ask-- as I do every
time I look at this--
01:14:49.800 --> 01:14:52.130
why did I make this
u1 instead of u0?
01:14:54.970 --> 01:14:57.450
It's because of the
next slide.
01:14:57.450 --> 01:14:58.240
OK?
01:14:58.240 --> 01:15:00.940
Here's the next slide, which
shows what's going on.
01:15:00.940 --> 01:15:10.790
So we start off at time 0, x
of 0 is in some state i.
01:15:10.790 --> 01:15:15.260
We stay in state i until
some time u1, at
01:15:15.260 --> 01:15:17.400
which the state changes.
01:15:17.400 --> 01:15:21.140
The state changes now
to some state j.
01:15:21.140 --> 01:15:25.230
We stay in that same state j
until the next state change.
01:15:25.230 --> 01:15:30.000
We stay in that state until the
next state change, and it
01:15:30.000 --> 01:15:33.370
is since we want to make the
first state change time s of
01:15:33.370 --> 01:15:41.810
one, we sort of have to make the
first interval between 0
01:15:41.810 --> 01:15:44.650
on the state u1.
01:15:44.650 --> 01:15:48.430
So these things are off
base from the u's.
01:15:48.430 --> 01:15:54.680
And this is the way
that a Markov process evolves.
01:15:54.680 --> 01:15:58.780
You simply have what looks
like a Poisson process with a
01:15:58.780 --> 01:16:02.340
variable rate, and the variable
rate is varying
01:16:02.340 --> 01:16:05.930
according to the state of a
Markov chain. Every time you
01:16:05.930 --> 01:16:10.760
have an arrival in the variable-
rate Poisson process,
01:16:10.760 --> 01:16:13.720
you change the rate according
to this Markov chain.
01:16:13.720 --> 01:16:18.040
So it's everything about Markov
chains, plus Poisson
01:16:18.040 --> 01:16:20.530
processes all put together.
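As a minimal sketch of that description -- my own illustration, with a made-up embedded chain and made-up holding-time rates -- a simulation alternates between drawing an exponential holding time at the current state's rate and letting the embedded Markov chain pick the next state:

```python
import random

# A sketch (mine; chain and rates are made up).  A countable-state
# Markov process: the embedded Markov chain P picks each next state,
# and an exponential holding time with rate nu[i] says how long the
# process sits in state i before that change.
P = {0: [(1, 1.0)],                     # from 0, always go to 1
     1: [(0, 0.6), (2, 0.4)],
     2: [(1, 1.0)]}
nu = {0: 2.0, 1: 1.0, 2: 0.5}           # holding-time rates per state

rng = random.Random(0)
t, state = 0.0, 0
for _ in range(8):
    u = rng.expovariate(nu[state])      # holding time in the current state
    t += u                              # epoch of the next state change
    nexts, probs = zip(*P[state])
    nxt = rng.choices(nexts, probs)[0]  # next state, independent of u
    print(f"held in {state} for {u:.3f}; jump at t = {t:.3f} to {nxt}")
    state = nxt
```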
01:16:20.530 --> 01:16:24.730
OK, I think I'll stop
there, and we will
01:16:24.730 --> 01:16:26.030
continue next time.