WEBVTT
00:00:00.530 --> 00:00:02.960
The following content is
provided under a Creative
00:00:02.960 --> 00:00:04.370
Commons license.
00:00:04.370 --> 00:00:07.410
Your support will help MIT
OpenCourseWare continue to
00:00:07.410 --> 00:00:11.060
offer high quality educational
resources for free.
00:00:11.060 --> 00:00:13.960
To make a donation or view
additional materials from
00:00:13.960 --> 00:00:17.890
hundreds of MIT courses, visit
MIT OpenCourseWare at
00:00:17.890 --> 00:00:19.140
ocw.mit.edu.
00:00:22.940 --> 00:00:24.030
PROFESSOR: So this
is the outline of
00:00:24.030 --> 00:00:25.900
the lecture for today.
00:00:25.900 --> 00:00:29.530
So we just need to talk about
the conditional densities for
00:00:29.530 --> 00:00:30.620
a Poisson process.
00:00:30.620 --> 00:00:33.650
So these are the things that
we've seen before in previous
00:00:33.650 --> 00:00:38.390
sections, so I'm just going to
give a simple proof for them
00:00:38.390 --> 00:00:41.000
and some intuition.
00:00:41.000 --> 00:00:46.130
So if you remember the Poisson
process, this was the famous
00:00:46.130 --> 00:00:48.210
figure that we had.
00:00:48.210 --> 00:00:52.190
So our time starts at zero,
and then after zero we
00:00:52.190 --> 00:00:53.440
had some arrivals.
00:00:58.800 --> 00:01:01.160
So our time is then t.
00:01:01.160 --> 00:01:06.550
We had N of t arrivals,
which is equal to n.
00:01:06.550 --> 00:01:07.850
This is Sn.
00:01:26.960 --> 00:01:28.740
This is the figure
that we had.
00:01:28.740 --> 00:01:33.610
So for the first equation,
we're looking at--
00:01:33.610 --> 00:01:37.250
We want to find the conditional
distribution of the
00:01:37.250 --> 00:01:48.040
inter-arrival times conditioned
on the last arrival time.
00:01:48.040 --> 00:01:51.840
So first of all, we want to find
the joint distribution of
00:01:51.840 --> 00:01:53.680
the inter-arrival
times and the last
00:01:53.680 --> 00:01:58.970
arrival, this equation.
00:01:58.970 --> 00:02:01.970
So you know that the inter-arrival
times are independent
00:02:01.970 --> 00:02:19.880
in the Poisson process, so just
looking at this thing,
00:02:19.880 --> 00:02:23.640
you know that these things are
independent of each other so
00:02:23.640 --> 00:02:26.250
we will have lambda to
the power of n, times e to
00:02:26.250 --> 00:02:27.500
the power minus lambda.
00:02:31.790 --> 00:02:35.230
And then, this last one
corresponds to the fact that
00:02:35.230 --> 00:02:43.630
xn plus 1 is equal to sn plus
1 minus the summation of the xi.
00:02:43.630 --> 00:02:47.880
These two events are equivalent
to each other, so
00:02:47.880 --> 00:02:50.420
this is going to correspond
to lambda e to the
00:02:50.420 --> 00:02:52.330
power of minus lambda.
00:02:55.890 --> 00:02:57.690
So we just did it the
other way, so
00:02:57.690 --> 00:02:58.940
it's going to be like--
00:03:11.020 --> 00:03:13.910
So you can just get rid of these
terms and it's going to
00:03:13.910 --> 00:03:15.660
be that equation.
00:03:15.660 --> 00:03:18.570
So it's very simple, it's
just the independence of
00:03:18.570 --> 00:03:20.330
inter-arrival times.
00:03:20.330 --> 00:03:23.870
So you look at the last term as
the inter-arrival time for
00:03:23.870 --> 00:03:26.310
the last arrival, and then
they're going to be
00:03:26.310 --> 00:03:28.660
independent, and the terms kind
of cancel out and you
00:03:28.660 --> 00:03:30.430
have this equation.
00:03:30.430 --> 00:03:35.060
So this conditional density
will be the joint
00:03:35.060 --> 00:03:35.730
distribution--
00:03:35.730 --> 00:03:41.910
over the distribution of Sn plus
1, which is like this.
00:03:41.910 --> 00:03:44.240
So the conditional density
will be like this.
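Written out, the argument above can be sketched as follows. The symbols are assumptions chosen to match the spoken notation: $\lambda$ is the rate, $x_1,\dots,x_n$ are the inter-arrival times, and $s_{n+1}$ is the epoch of arrival $n+1$, whose marginal density is taken to be the standard Erlang density.

```latex
\begin{align*}
f_{X_1 \cdots X_n S_{n+1}}(x_1,\dots,x_n,s_{n+1})
  &= \Bigl(\prod_{i=1}^{n} \lambda e^{-\lambda x_i}\Bigr)\,
     \lambda e^{-\lambda\,(s_{n+1}-\sum_{i=1}^{n} x_i)}
   = \lambda^{n+1} e^{-\lambda s_{n+1}}, \\
f_{S_{n+1}}(s_{n+1})
  &= \frac{\lambda^{n+1}\, s_{n+1}^{\,n}\, e^{-\lambda s_{n+1}}}{n!}, \\
f_{X_1 \cdots X_n \mid S_{n+1}}(x_1,\dots,x_n \mid s_{n+1})
  &= \frac{n!}{s_{n+1}^{\,n}},
  \qquad x_i > 0,\quad \sum_{i=1}^{n} x_i < s_{n+1}.
\end{align*}
```

The conditional density does not depend on the individual $x_i$, which is the uniformity the lecture refers to.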
00:03:44.240 --> 00:03:46.820
So what does the intuition
behind this mean?
00:03:46.820 --> 00:03:48.300
This means that--
00:03:48.300 --> 00:03:49.500
you've seen this term before.
00:03:49.500 --> 00:03:52.510
This means that we have a uniform
distribution over the
00:03:52.510 --> 00:03:57.180
interval of zero to one,
conditioned on something.
00:03:57.180 --> 00:04:03.630
So previously, what we had was
the conditional distribution
00:04:03.630 --> 00:04:08.560
of arrival times conditioned on
the last arrival, so f of s1 to
00:04:08.560 --> 00:04:10.740
sn conditioned on sn plus 1.
00:04:10.740 --> 00:04:15.320
But here we have the distribution
of x1 to xn
00:04:15.320 --> 00:04:17.290
conditioned on sn plus 1.
00:04:17.290 --> 00:04:21.779
And previously, the constraint
that we had was that we should
00:04:21.779 --> 00:04:23.845
have an ordering over the
arrival times.
00:04:28.260 --> 00:04:31.920
So because of this constraint
we had this n factorial, if
00:04:31.920 --> 00:04:34.400
you remember, the order
statistics that you
00:04:34.400 --> 00:04:35.500
talked about.
00:04:35.500 --> 00:04:38.215
Now here, what we have is
something like this.
00:04:38.215 --> 00:04:42.670
So all the inter-arrival times
should be positive, and
00:04:42.670 --> 00:04:49.220
the summation of them should
be less than t.
00:04:49.220 --> 00:04:54.740
Because, well, the summation
of them up to n plus 1 is
00:04:54.740 --> 00:04:56.330
equal to sn plus 1.
00:04:58.940 --> 00:05:00.780
And the last term is positive,
so this thing should
00:05:00.780 --> 00:05:02.030
be less than t.
00:05:04.880 --> 00:05:08.480
So these two constraints are
sort of dual to each other.
00:05:08.480 --> 00:05:11.890
So because of this constraint,
I had an n factorial in
00:05:11.890 --> 00:05:15.660
the conditional distribution of
arrival times, conditioned on
00:05:15.660 --> 00:05:17.720
the last arrival time.
00:05:17.720 --> 00:05:21.620
Now, I have the
conditional distribution of
00:05:21.620 --> 00:05:24.100
inter-arrival times conditioned
on the last arrival time, and we
00:05:24.100 --> 00:05:25.560
should have this n
factorial here.
00:05:28.060 --> 00:05:31.650
The other interesting thing is
that if I condition on N
00:05:31.650 --> 00:05:36.030
of t, so the number of arrivals
at time [INAUDIBLE]
00:05:36.030 --> 00:05:37.180
t.
00:05:37.180 --> 00:05:40.270
The result is going to be very
similar to the previous one.
00:05:43.710 --> 00:05:46.060
Just to prove this thing,
it's very simple again.
00:05:56.190 --> 00:05:57.520
I just wanted to show you--
00:06:09.229 --> 00:06:10.740
So what does this thing mean?
00:06:10.740 --> 00:06:14.290
This thing means that these are
the distributions of the
00:06:14.290 --> 00:06:18.100
inter-arrival times and the last
event corresponds to the
00:06:18.100 --> 00:06:23.420
fact that xn plus 1 is bigger
than t minus summation of xi.
00:06:26.830 --> 00:06:31.890
Because we have an arrival
time at sn,
00:06:31.890 --> 00:06:36.210
and then after that there's
no arrival.
00:06:36.210 --> 00:06:39.660
Because at time t we
have n arrivals.
00:06:39.660 --> 00:06:41.770
So this corresponds
to this thing.
00:06:41.770 --> 00:06:45.521
So again, we'll have something
like lambda to the n, e
00:06:45.521 --> 00:06:46.771
to the power of--
00:06:51.210 --> 00:06:53.890
This is for the first event
and you see that these are
00:06:53.890 --> 00:06:57.030
independent because of the
properties of the
00:06:57.030 --> 00:06:58.300
Poisson process.
00:06:58.300 --> 00:06:59.680
And the last term will be--
00:07:03.320 --> 00:07:05.940
the probability of an exponential
random variable being bigger
00:07:05.940 --> 00:07:07.190
than something is--
00:07:17.930 --> 00:07:20.290
Yes, these cancel out.
00:07:20.290 --> 00:07:23.490
And then we'll have this term,
this first one, lambda to the
00:07:23.490 --> 00:07:26.310
power of n, just forget
about n plus 1.
00:07:26.310 --> 00:07:31.890
And the probability of N
of t equal to n is this term
00:07:31.890 --> 00:07:33.020
without n plus 1.
00:07:33.020 --> 00:07:37.050
So lambda t to the n, and e
to the power minus lambda t
00:07:37.050 --> 00:07:38.470
over n factorial.
00:07:38.470 --> 00:07:41.870
So these terms cancel out and
we'll have this thing.
00:07:41.870 --> 00:07:45.940
So again, we should have these
constraints that summation of
00:07:45.940 --> 00:07:48.570
xk, k equal to 1 to n, should
be less than t.
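The cancellation described above can be sketched as one line of algebra. This is a reconstruction under the assumptions that the numerator is the density of the first $n$ inter-arrival times multiplied by the probability that the next inter-arrival time exceeds $t - \sum x_i$, and that the denominator is the Poisson probability of $N(t) = n$.

```latex
\begin{align*}
f_{X_1 \cdots X_n \mid N(t)}(x_1,\dots,x_n \mid n)
  &= \frac{\lambda^{n} e^{-\lambda \sum_{i=1}^{n} x_i}\;
           e^{-\lambda\,(t-\sum_{i=1}^{n} x_i)}}
          {(\lambda t)^{n} e^{-\lambda t}/n!}
   = \frac{n!}{t^{n}},
  \qquad x_i > 0,\quad \sum_{k=1}^{n} x_k < t.
\end{align*}
```

This has the same form as the density conditioned on $S_{n+1}$, with $t$ in place of $s_{n+1}$.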
00:07:56.980 --> 00:08:00.390
No matter if we are
conditioning on N of t or sn
00:08:00.390 --> 00:08:05.380
plus 1, the last arrival time
or the number of arrivals at
00:08:05.380 --> 00:08:09.560
some time instant, the conditional
distribution is
00:08:09.560 --> 00:08:12.952
uniform subject to
some constraints.
00:08:12.952 --> 00:08:14.220
Are there any questions?
00:08:22.510 --> 00:08:29.540
So if you look at the limit, and
so again, looking at this
00:08:29.540 --> 00:08:40.280
figure, you see that this
corresponds to sn plus 1 and
00:08:40.280 --> 00:08:43.710
you see that sn plus
1 is bigger than t.
00:08:43.710 --> 00:08:47.710
So if you take the limit and
take the t1 going to t, these
00:08:47.710 --> 00:08:49.400
two distributions are going
to be the same.
00:08:53.060 --> 00:08:56.270
So t1 corresponds
to sn plus 1.
00:08:56.270 --> 00:08:57.520
t1.
00:09:05.390 --> 00:09:08.550
So basically what it's saying
is that the conditional
00:09:08.550 --> 00:09:12.350
distribution of the arrival times
is uniform, independent of
00:09:12.350 --> 00:09:14.030
what is going to happen
in the future.
00:09:14.030 --> 00:09:17.200
So in future, there might be
some, I mean, the knowledge
00:09:17.200 --> 00:09:20.110
that we have about future could
be the fact that knowing
00:09:20.110 --> 00:09:23.940
what happens in this interval
or the next arrival is going
00:09:23.940 --> 00:09:27.080
to happen at some point. I
mean, conditioned on these
00:09:27.080 --> 00:09:30.910
things the arrival times are
going to be uniform.
00:09:43.330 --> 00:09:46.570
The other fact is that these
distributions that I showed
00:09:46.570 --> 00:09:51.910
you are all symmetric
with respect to the
00:09:51.910 --> 00:09:53.450
inter-arrival times.
00:09:53.450 --> 00:09:58.120
So it doesn't matter, I mean,
no arrival has priority
00:09:58.120 --> 00:10:01.630
over any other one, and they're not
different from each other.
00:10:01.630 --> 00:10:05.180
So if you look at the
permutation of them, they're
00:10:05.180 --> 00:10:08.380
going to be similar.
00:10:08.380 --> 00:10:16.500
So now, I want to look at the
distribution of x1 or s1, they
00:10:16.500 --> 00:10:17.750
are equal to each other.
00:10:20.030 --> 00:10:24.260
I can easily calculate the
probability of x1 bigger than
00:10:24.260 --> 00:10:32.350
some tau conditioned on, for example,
N of t equal to n.
00:10:32.350 --> 00:10:34.065
So what is this conditional
distribution?
00:10:38.440 --> 00:10:42.690
So conditioned on N of t equal
to n, I know that in this
00:10:42.690 --> 00:10:50.750
interval I have n arrivals and
n inter-arrival times, and I
00:10:50.750 --> 00:10:57.140
want to know if x1 is in
this interval, or s1.
00:10:57.140 --> 00:10:59.870
s1 is equal to x1.
00:10:59.870 --> 00:11:03.460
And you know that si's
are also uniform.
00:11:03.460 --> 00:11:09.860
So you know that these are
independent of each other, and
00:11:09.860 --> 00:11:14.241
so you can say that the
probability of this thing is
00:11:14.241 --> 00:11:17.140
like this one.
00:11:17.140 --> 00:11:19.041
So all of them should
be bigger than tau.
00:11:24.120 --> 00:11:26.430
And since this is symmetric--
00:11:26.430 --> 00:11:29.880
so this corresponds to x1.
00:11:29.880 --> 00:11:31.080
Since this was--
00:11:31.080 --> 00:11:33.330
everything was symmetric for
all inter-arrival times.
00:11:33.330 --> 00:11:37.490
This is also the complementary
distribution function for xk.
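The complementary distribution function discussed above, the probability that X1 exceeds tau given N(t) = n, comes out as ((t - tau)/t) to the power n, since all n uniform arrivals must land after tau. A minimal sketch for checking this numerically follows. The function names, the rejection-sampling approach, and the parameter values are illustrative assumptions, not something from the lecture.

```python
import random


def simulate_arrivals(rate, t, rng):
    """Return the arrival epochs in (0, t] of a Poisson process with the given rate."""
    arrivals = []
    s = rng.expovariate(rate)  # first inter-arrival time
    while s <= t:
        arrivals.append(s)
        s += rng.expovariate(rate)  # add the next inter-arrival time
    return arrivals


def estimate_tail(rate, t, n, tau, trials, seed=0):
    """Estimate P(X1 > tau | N(t) = n) by keeping only sample paths with N(t) = n."""
    rng = random.Random(seed)
    kept = 0
    hits = 0
    while kept < trials:
        arrivals = simulate_arrivals(rate, t, rng)
        if len(arrivals) != n:
            continue  # reject paths that do not satisfy the conditioning event
        kept += 1
        if arrivals[0] > tau:  # X1 equals S1, the first arrival epoch
            hits += 1
    return hits / kept


def tail_formula(t, n, tau):
    """Closed form ((t - tau) / t) ** n for 0 <= tau <= t."""
    return ((t - tau) / t) ** n


if __name__ == "__main__":
    print(estimate_tail(1.0, 3.0, 3, 1.0, 20000), tail_formula(3.0, 3, 1.0))
```

The estimate should not depend on the rate, because the conditional distribution given N(t) = n does not involve lambda.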
00:11:49.330 --> 00:11:50.580
Are there any questions?
00:11:56.640 --> 00:11:59.340
So the other thing that is not
in the slides that I want
00:11:59.340 --> 00:12:02.144
to tell you is the conditional
distribution of
00:12:02.144 --> 00:12:05.380
si given n of t.
00:12:14.270 --> 00:12:20.380
So I know that there are n arrivals
in this interval, and
00:12:20.380 --> 00:12:29.540
I want to see if the i-th arrival
is something like this.
00:12:29.540 --> 00:12:31.170
So the probability that--
00:12:31.170 --> 00:12:34.390
so this event corresponds to
the fact that there is one
00:12:34.390 --> 00:12:40.020
arrival in this interval and i
minus 1 arrivals here and n
00:12:40.020 --> 00:12:43.990
minus i minus 1 arrivals
here.
00:12:46.620 --> 00:12:49.720
Oh, no, sorry, n minus
i arrivals are here.
00:12:49.720 --> 00:12:52.990
So the probability of
having one arrival
00:12:52.990 --> 00:12:56.630
here corresponds to--
00:12:56.630 --> 00:12:59.780
since the arrival times are uniform,
and I want to have
00:12:59.780 --> 00:13:03.880
one arrival in this interval
like this, and I want to have
00:13:03.880 --> 00:13:12.110
i minus 1 arrivals here,
and n minus
00:13:12.110 --> 00:13:13.240
i arrivals here.
00:13:13.240 --> 00:13:17.010
So it's like a binomial
probability distribution, so
00:13:17.010 --> 00:13:21.280
out of the n arrival-- n minus
1 remaining arrivals, I want
00:13:21.280 --> 00:13:25.283
to have some of them in
this interval and
00:13:25.283 --> 00:13:26.533
some of them here.
00:13:35.000 --> 00:13:37.050
So this is the distribution
of this thing.
00:13:41.670 --> 00:13:45.980
Yeah, so just as a check
we know that the last
00:13:45.980 --> 00:13:48.260
inter-arrival--
00:13:48.260 --> 00:13:51.030
last arrival time,
the distribution
00:13:51.030 --> 00:13:53.120
of that thing is--
00:13:53.120 --> 00:13:55.450
Here, you just cancel these
things out, too--
00:13:55.450 --> 00:13:56.700
is--
00:13:59.900 --> 00:14:03.860
We need to get back to this
equation, but it's just the
00:14:03.860 --> 00:14:07.040
same as this one but you
let i equal to n.
00:14:12.000 --> 00:14:13.540
This is OK?
00:14:13.540 --> 00:14:17.672
Just take i equal to
n, you get this one.
00:14:17.672 --> 00:14:20.037
AUDIENCE: Does this equation
also apply to i equals
00:14:20.037 --> 00:14:21.930
[INAUDIBLE]?
00:14:21.930 --> 00:14:23.180
PROFESSOR: Yeah, sure.
00:14:26.810 --> 00:14:33.520
Yeah, i equal to 1, we will
have n times t minus tau.
00:14:48.280 --> 00:14:54.720
And if you look at this thing,
so this is a complementary
00:14:54.720 --> 00:14:55.650
distribution function.
00:14:55.650 --> 00:14:57.413
You just take the derivative
and you get that.
00:15:03.350 --> 00:15:04.725
This is just a check of
this formulation.
00:15:09.640 --> 00:15:10.520
Yeah.
00:15:10.520 --> 00:15:12.020
So is there any other
questions?
00:15:12.020 --> 00:15:15.830
Did anybody get why we have
this kind of formulation?
00:15:15.830 --> 00:15:19.190
It was very simple, it's just
the combination of a uniform
00:15:19.190 --> 00:15:23.014
distribution and a binomial
distribution.
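The combination of a uniform and a binomial distribution described above can be sketched in code. This is a reconstruction under the assumption that the density of Si at tau, given N(t) = n, is (n/t) times the binomial probability that i - 1 of the other n - 1 uniform arrivals fall before tau. The function names and the midpoint integrator are illustrative, not from the lecture.

```python
import math


def arrival_density(tau, i, n, t):
    """Density of the i-th arrival epoch at tau, given N(t) = n, for 0 < tau < t."""
    if not (1 <= i <= n) or not (0.0 < tau < t):
        return 0.0
    before = (tau / t) ** (i - 1)       # i - 1 of the other arrivals fall in (0, tau)
    after = ((t - tau) / t) ** (n - i)  # the remaining n - i fall in (tau, t)
    # n / t: any one of the n uniform arrivals may be the one that lands at tau
    return (n / t) * math.comb(n - 1, i - 1) * before * after


def integrate(f, a, b, steps=20000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    # The density should integrate to 1 over (0, t).
    print(integrate(lambda u: arrival_density(u, 2, 5, 3.0), 0.0, 3.0))
```

Setting i equal to n gives n tau^(n-1) / t^n, and setting i equal to 1 gives n (t - tau)^(n-1) / t^n, which are the two checks mentioned in the lecture.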
00:15:23.014 --> 00:15:24.508
AUDIENCE: What is tau?
00:15:24.508 --> 00:15:27.910
PROFESSOR: Tau is this thing.
00:15:27.910 --> 00:15:29.160
Tau is the si.
00:15:33.250 --> 00:15:33.954
Change of notation.
00:15:33.954 --> 00:15:35.610
AUDIENCE: [INAUDIBLE]?
00:15:35.610 --> 00:15:39.670
PROFESSOR: I said that I want
one arrival in this interval
00:15:39.670 --> 00:15:44.930
and i minus 1 arrivals here
and n minus i arrivals here.
00:15:44.930 --> 00:15:48.380
So I define a
Bernoulli process.
00:15:48.380 --> 00:15:51.880
So the probability of falling in
this interval is tau
00:15:51.880 --> 00:15:55.030
over t, and t minus tau over t for the other one.
00:15:55.030 --> 00:15:55.870
And I say that--
00:15:55.870 --> 00:15:57.890
AUDIENCE: [INAUDIBLE]?
00:15:57.890 --> 00:16:00.180
PROFESSOR: OK, so falling in
this interval in a uniform
00:16:00.180 --> 00:16:02.700
distribution corresponds
to success.
00:16:02.700 --> 00:16:08.440
I want, among n minus 1
arrivals, i minus 1 of them to
00:16:08.440 --> 00:16:12.040
be successful and the rest of
them to be [INAUDIBLE].
00:16:12.040 --> 00:16:12.390
[INAUDIBLE]
00:16:12.390 --> 00:16:14.632
that one of them should fall
here and the rest of them
00:16:14.632 --> 00:16:15.882
should fall here.
00:16:17.790 --> 00:16:18.225
Richard?
00:16:18.225 --> 00:16:20.532
AUDIENCE: So that's
the [INAUDIBLE]?
00:16:20.532 --> 00:16:24.413
PROFESSOR: Yeah, you just get
rid of this dt, that's it.
00:16:24.413 --> 00:16:27.299
[? AUDIENCE: Why do ?]
those stay, n over t.
00:16:27.299 --> 00:16:28.261
PROFESSOR: Which one?
00:16:28.261 --> 00:16:29.704
AUDIENCE: Can you explain
again the first term?
00:16:29.704 --> 00:16:31.345
The first term there,
the n dt over t?
00:16:31.345 --> 00:16:33.510
PROFESSOR: Oh, OK, so I want
to have one arrival
00:16:33.510 --> 00:16:39.080
definitely here, and among
the n arrivals I want
00:16:39.080 --> 00:16:40.050
one of them to fall here.
00:16:40.050 --> 00:16:42.690
So any one of them
could be that one.
00:16:42.690 --> 00:16:46.210
But then i minus 1 of them
should fall here and n minus i
00:16:46.210 --> 00:16:49.395
arrivals should fall here.
00:16:49.395 --> 00:16:52.190
Any other questions?
00:16:52.190 --> 00:16:53.720
It's beautiful, isn't it?
00:17:01.240 --> 00:17:02.490
There's a problem.
00:17:07.800 --> 00:17:13.369
I studied x1 to xn, but what
is the distribution of this
00:17:13.369 --> 00:17:14.140
remaining part?
00:17:14.140 --> 00:17:19.050
So if I condition on N of t or
condition on sn plus 1, what
00:17:19.050 --> 00:17:21.900
is the distribution of this
one or this one, or the joint
00:17:21.900 --> 00:17:24.619
distribution of all these?
00:17:24.619 --> 00:17:27.510
So this is what we're
going to look at.
00:17:27.510 --> 00:17:33.960
So first of all, I do the
conditioning over sn plus 1,
00:17:33.960 --> 00:17:36.910
and I find the distribution
of--
00:17:36.910 --> 00:17:43.010
the distribution of x1 to xn
conditioned on sn plus 1, is
00:17:43.010 --> 00:17:46.560
there any randomness in
distribution of xn plus 1?
00:17:46.560 --> 00:17:48.540
So this is going to
be sn plus 1.
00:18:03.940 --> 00:18:06.200
So this is going to
be xn plus 1.
00:18:06.200 --> 00:18:10.060
So if I find the distribution of
x1 to xn conditioned on this,
00:18:10.060 --> 00:18:15.860
conditioned on sn plus 1, it's
very easy to show that xn plus
00:18:15.860 --> 00:18:18.220
1 is equal to sn plus 1 minus the sum of the xi.
00:18:22.360 --> 00:18:23.880
So there's no randomness here.
00:18:28.470 --> 00:18:32.440
But looking at the figure,
you see that--
00:18:32.440 --> 00:18:36.200
and the distributions, you see
that everything is symmetric.
00:18:36.200 --> 00:18:41.510
So I've found the distribution
x1 to xn and I can find xn
00:18:41.510 --> 00:18:43.650
plus 1 easily from them.
00:18:43.650 --> 00:18:48.080
But what if I find the
distribution of x2 to xn plus
00:18:48.080 --> 00:18:53.640
1, condition on this thing, and
then find x1 from them?
00:18:53.640 --> 00:18:56.420
So you can say that x1
is equal to sn plus
00:18:56.420 --> 00:19:02.250
1 minus the sum from 2 to n plus 1 of xi.
00:19:02.250 --> 00:19:04.120
And then we have
this solution.
00:19:04.120 --> 00:19:07.970
So it seems that there's no
difference between them and
00:19:07.970 --> 00:19:10.000
actually, there's not.
00:19:10.000 --> 00:19:15.360
So conditioned on sn plus 1, we
have the same distribution for
00:19:15.360 --> 00:19:19.800
x1 to xn or x2 to xn plus 1.
00:19:19.800 --> 00:19:23.060
But there are only n free
parameters because the n plus
00:19:23.060 --> 00:19:27.525
first one can be calculated
from these equations.
00:19:33.530 --> 00:19:37.430
If you take any n random
variables out of this n plus 1
00:19:37.430 --> 00:19:40.490
random variables, we have
the same story.
00:19:40.490 --> 00:19:46.670
So it's going to be uniform, over
0 to t for this interval.
00:19:46.670 --> 00:19:49.160
So it's symmetric.
00:19:49.160 --> 00:19:53.020
Now, the very nice thing is
about conditioning on N of t--
00:19:53.020 --> 00:19:56.560
so N of t must be equal to n.
00:19:56.560 --> 00:20:00.420
So this is easy because we had
these equations, but what if I
00:20:00.420 --> 00:20:04.470
call this thing x star
n plus 1 and I do the
00:20:04.470 --> 00:20:08.270
conditioning over n of t?
00:20:08.270 --> 00:20:13.130
So I don't know the arrival time
for the n plus first arrival
00:20:13.130 --> 00:20:15.625
but I do know the number of
arrivals at time [INAUDIBLE]
00:20:15.625 --> 00:20:16.650
t.
00:20:16.650 --> 00:20:19.230
Conditioned on that, what is the
distribution, for example, of
00:20:19.230 --> 00:20:22.120
x star n plus 1 or x1 to xn?
00:20:26.250 --> 00:20:29.800
Again, easily we can say
that it's the same
00:20:29.800 --> 00:20:33.760
story, so it's uniform.
00:20:33.760 --> 00:20:38.970
So out of x1 to xn
and x star n plus 1, you
00:20:38.970 --> 00:20:47.810
can take n of them and find
the distribution of that.
00:20:54.130 --> 00:20:57.540
OK, so the other way to look at
this is that if I find the
00:20:57.540 --> 00:21:01.530
distribution of x1 to xn
conditioned on N of t, I will
00:21:01.530 --> 00:21:07.280
have the fact that x star n
plus 1 is equal to t minus sn,
00:21:07.280 --> 00:21:12.421
which is t minus the summation
of the xi.
00:21:12.421 --> 00:21:15.110
Can anyone of you see this?
00:21:15.110 --> 00:21:15.460
Hope so.
00:21:15.460 --> 00:21:16.710
Oh, OK.
00:21:19.720 --> 00:21:23.240
So again, if I find the
distribution of x1 to xn, I
00:21:23.240 --> 00:21:26.530
can find x star n plus
1 deterministically.
00:21:29.650 --> 00:21:40.980
Just as a sanity check, you can
find that f of the star of
00:21:40.980 --> 00:21:46.030
x-- oh, sorry, f of x star
n plus 1 conditioned on--
00:22:00.600 --> 00:22:05.160
So looking at this formulation
and just replacing everything,
00:22:05.160 --> 00:22:08.080
we can find this thing, and
you can see that this is
00:22:08.080 --> 00:22:09.280
similar to this one again.
00:22:09.280 --> 00:22:12.760
So this is the distribution
for--
00:22:12.760 --> 00:22:15.860
well, this is the distribution
for x1.
00:22:15.860 --> 00:22:18.220
So you can find the
density and this
00:22:18.220 --> 00:22:18.990
is going to be density.
00:22:18.990 --> 00:22:23.010
So what I'm saying is that
x1 has the same marginal
00:22:23.010 --> 00:22:26.870
distribution as xn
plus 1 star.
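The claim that x1 and x star n plus 1 share a marginal distribution can be sketched as follows. This is a reconstruction under the assumption that, conditioned on $N(t) = n$, the arrival epochs behave like the order statistics of $n$ independent uniform random variables on $(0, t)$, so that $S_n \le t - \tau$ exactly when all $n$ of them fall in $(0, t - \tau]$.

```latex
\begin{align*}
\Pr\{X^{*}_{n+1} > \tau \mid N(t) = n\}
  &= \Pr\{t - S_n > \tau \mid N(t) = n\}
   = \Pr\{S_n < t - \tau \mid N(t) = n\} \\
  &= \Bigl(\frac{t-\tau}{t}\Bigr)^{n}
   = \Pr\{X_1 > \tau \mid N(t) = n\},
  \qquad 0 \le \tau \le t.
\end{align*}
```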
00:22:30.410 --> 00:22:33.220
So you can get it with this kind
of reasoning, saying that
00:22:33.220 --> 00:22:36.330
the marginal distribution should
be the same or looking
00:22:36.330 --> 00:22:37.580
at this kind of formulation.
00:22:44.480 --> 00:22:47.400
When you have this distribution
you can find the
00:22:47.400 --> 00:22:53.280
distribution of summation of
xi, i equal to 1 to n.
00:22:53.280 --> 00:22:54.480
And then you can find--
00:22:54.480 --> 00:22:56.910
So t is determined, so
you can just replace
00:22:56.910 --> 00:22:58.980
everything and find it.
00:22:58.980 --> 00:23:01.188
And you can see that these
two are the same.
00:23:01.188 --> 00:23:02.580
AUDIENCE: [INAUDIBLE]
00:23:02.580 --> 00:23:04.250
the same distribution
that is one?
00:23:04.250 --> 00:23:05.410
PROFESSOR: Yeah.
00:23:05.410 --> 00:23:06.660
Conditioned on N of t [INAUDIBLE].
00:23:09.410 --> 00:23:12.010
So everything is symmetric
and uniform.
00:23:12.010 --> 00:23:14.980
But you can only choose n free
parameters because the n
00:23:14.980 --> 00:23:19.420
plus first is a combination
of the--
00:23:19.420 --> 00:23:20.670
Right?
00:23:27.720 --> 00:23:29.870
So this is the end of the
chapter [INAUDIBLE]
00:23:33.160 --> 00:23:35.670
for Poisson processes. Are there
any problems or any
00:23:35.670 --> 00:23:38.360
questions about it?
00:23:38.360 --> 00:23:41.360
So let's start
Markov chains.
00:23:41.360 --> 00:23:44.150
So in this chapter we only talk
about finite-state
00:23:44.150 --> 00:23:47.160
Markov chains.
00:23:47.160 --> 00:23:51.830
Markov chains are
processes that change at
00:23:51.830 --> 00:23:53.680
integer time instants.
00:23:53.680 --> 00:23:56.960
So it's like we're quantizing
time and any kind of change
00:23:56.960 --> 00:24:00.750
can happen only at an
integer time instant.
00:24:00.750 --> 00:24:05.530
And it's different in this way
from Poisson processes because
00:24:05.530 --> 00:24:10.780
we can have the definition of
Poisson processes for any
00:24:10.780 --> 00:24:15.820
continuous time, but Markov
chains are only defined for
00:24:15.820 --> 00:24:17.730
integer-times.
00:24:17.730 --> 00:24:23.060
And a finite-state Markov chain
is a Markov chain where the state
00:24:23.060 --> 00:24:28.275
can only be in a finite set, so
we usually call it 1 to M,
00:24:28.275 --> 00:24:30.630
but you can name it in
any way you like.
00:24:42.930 --> 00:24:47.430
What we are looking for in
finite-state Markov chains is
00:24:47.430 --> 00:24:52.080
the probability distribution of
the next state conditioned
00:24:52.080 --> 00:24:53.370
on whatever [? history ?]
00:24:53.370 --> 00:24:55.220
that I have.
00:24:55.220 --> 00:24:57.950
So I know that at integer times
00:24:57.950 --> 00:24:59.540
there can be some change.
00:24:59.540 --> 00:25:01.100
So what is this change?
00:25:01.100 --> 00:25:03.190
We model it with the probability
distribution of
00:25:03.190 --> 00:25:06.490
this change conditioned
on the history.
00:25:06.490 --> 00:25:08.970
So we can model any discrete
integer-time
00:25:08.970 --> 00:25:12.020
process in this way.
00:25:12.020 --> 00:25:17.460
Now, the nice thing about Markov
chains or homogeneous
00:25:17.460 --> 00:25:20.470
Markov chains is that
this distribution is
00:25:20.470 --> 00:25:22.320
equal to these things.
00:25:22.320 --> 00:25:28.360
So you see i and j, so i is the
previous state and j is
00:25:28.360 --> 00:25:31.940
the state that I want to go to,
so the distribution of the
00:25:31.940 --> 00:25:36.110
next state conditioned on all
the history only depends on
00:25:36.110 --> 00:25:38.800
the last state.
00:25:38.800 --> 00:25:43.740
So given the previous state, the
new state is independent
00:25:43.740 --> 00:25:47.420
of all the earlier things
that might have happened.
00:25:47.420 --> 00:25:56.570
So there's a very nice
way to show it.
00:25:56.570 --> 00:26:00.160
So in general, we say
that xn plus 1 is
00:26:00.160 --> 00:26:07.700
independent of xn minus 1, given xn.
00:26:11.930 --> 00:26:13.790
So if I know the present--
00:26:13.790 --> 00:26:25.950
and these things are true for
all the possible states.
00:26:25.950 --> 00:26:31.840
So whatever the history in the
earlier process that I had, I
00:26:31.840 --> 00:26:36.110
only care about the state that
I was in at the previous time.
00:26:36.110 --> 00:26:38.840
And this is going to show me
what is the distribution of
00:26:38.840 --> 00:26:41.320
the next state.
00:26:41.320 --> 00:26:45.830
So the two important things that
you can see in this
00:26:45.830 --> 00:26:48.070
formulation are that, first
of all, this kind of
00:26:48.070 --> 00:26:54.150
independence is true for
all i, j, k, m, and so on.
00:26:54.150 --> 00:26:58.810
The other thing is that it's
not time-dependent.
00:26:58.810 --> 00:27:03.920
So if I'm in some state, the
probability distribution of my
00:27:03.920 --> 00:27:10.000
next state is going to be the
same if the same thing happens
00:27:10.000 --> 00:27:12.030
in 100 years.
00:27:12.030 --> 00:27:13.470
So it's not time-dependent.
00:27:13.470 --> 00:27:16.230
That's why we call it
homogeneous Markov chains.
00:27:16.230 --> 00:27:19.620
And we can have non-homogeneous
Markov chains,
00:27:19.620 --> 00:27:22.700
but there are not many nice
results about them.
00:27:22.700 --> 00:27:25.990
I mean, we cannot find
many good things.
00:27:25.990 --> 00:27:31.620
And all the information that
we know to characterize the
00:27:31.620 --> 00:27:35.250
Markov chain is this kind
of distribution.
00:27:35.250 --> 00:27:38.850
So we call these things
transition probabilities.
00:27:38.850 --> 00:27:46.820
And something else which is
called initial probabilities,
00:27:46.820 --> 00:27:51.320
which tell me, at time
instant 0, what is the
00:27:51.320 --> 00:27:53.520
probability distribution
of the states.
00:27:53.520 --> 00:27:57.320
So knowing Markov chains, you
can find that the probability
00:27:57.320 --> 00:28:01.370
distribution of x1 is equal to the
probability of x1 conditioned
00:28:01.370 --> 00:28:03.960
on x0 [INAUDIBLE].
00:28:03.960 --> 00:28:05.880
So knowing this thing
and these transition
00:28:05.880 --> 00:28:07.650
probabilities, I can find
the probability
00:28:07.650 --> 00:28:09.820
distribution of x1.
00:28:09.820 --> 00:28:11.190
Well, very easily,
the probability
00:28:11.190 --> 00:28:14.599
of xn is equal to--
00:28:14.599 --> 00:28:15.849
sorry.
00:28:24.840 --> 00:28:27.460
so you can do this thing
iteratively.
00:28:27.460 --> 00:28:30.450
And just knowing the transition
probabilities and
00:28:30.450 --> 00:28:32.510
initial probabilities, you
can find the probability
00:28:32.510 --> 00:28:36.065
distribution of the states at
any time instant, forever.
00:28:39.900 --> 00:28:40.830
Do you see it's very easy?
00:28:40.830 --> 00:28:43.260
You just do this thing
iteratively.
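The iteration described above can be sketched in a few lines. This is a minimal illustration, assuming states are indexed from 0, the initial distribution is a list of probabilities, and P[i][j] is the transition probability from state i to state j. The function names and the example numbers are hypothetical.

```python
def step(dist, P):
    """One time step: Pr{X_next = j} = sum over i of Pr{X = i} * P[i][j]."""
    m = len(P)
    return [sum(dist[i] * P[i][j] for i in range(m)) for j in range(m)]


def distribution_at(n, initial, P):
    """Distribution of the state at integer time n, starting from `initial` at time 0."""
    dist = list(initial)
    for _ in range(n):
        dist = step(dist, P)
    return dist


if __name__ == "__main__":
    P = [[0.9, 0.1],
         [0.5, 0.5]]      # hypothetical two-state chain
    start = [1.0, 0.0]    # start in state 0 with probability 1
    print(distribution_at(1, start, P))
    print(distribution_at(2, start, P))
```

Because the chain is homogeneous, the same matrix is applied at every step.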
00:28:48.180 --> 00:28:52.850
So I talked about the
independence and initial.
00:28:57.120 --> 00:29:03.820
Yeah, so what we do is, we
characterize the Markov chains
00:29:03.820 --> 00:29:06.090
with this set of transition
probabilities
00:29:06.090 --> 00:29:08.080
and the initial state.
00:29:08.080 --> 00:29:11.070
And you see that in these
transition probabilities, we
00:29:11.070 --> 00:29:19.340
have, well, M times M minus 1
free parameters,
00:29:19.340 --> 00:29:23.190
because each of these distributions
should sum to 1.
00:29:23.190 --> 00:29:27.980
So for each initial state, I can
have a distribution over
00:29:27.980 --> 00:29:30.420
the next state.
00:29:30.420 --> 00:29:33.180
So I'm going to talk about
it in the matrix form.
00:29:33.180 --> 00:29:37.930
And what we usually do in
practice is assume
00:29:37.930 --> 00:29:40.120
the initial state to be
a constant state.
00:29:40.120 --> 00:29:44.290
So we usually just define a
state, call it initial state.
00:29:44.290 --> 00:29:46.520
Usually, we call it x0.
00:29:46.520 --> 00:29:49.300
I mean, it's what you're
usually going to see if you
00:29:49.300 --> 00:29:51.670
want to study the behavior
of Markov chains.
00:29:54.250 --> 00:29:57.850
So these are the two ways that
we can visualize the
00:29:57.850 --> 00:29:59.980
transition probabilities.
00:29:59.980 --> 00:30:01.970
So matrix form is very easy.
00:30:01.970 --> 00:30:05.060
So you just look at
the M probability
00:30:05.060 --> 00:30:06.790
distributions that you have.
00:30:06.790 --> 00:30:12.780
With a Markov chain with M
states, you will
00:30:12.780 --> 00:30:17.600
have M distributions, each of
them corresponding to the
00:30:17.600 --> 00:30:22.010
conditional distribution of the next
state, conditioned on the
00:30:22.010 --> 00:30:24.730
present state being equal to i.
00:30:24.730 --> 00:30:26.600
So i can be anything.
00:30:26.600 --> 00:30:30.530
So this was the thing I was
saying, the number of free
00:30:30.530 --> 00:30:31.260
parameters.
00:30:31.260 --> 00:30:35.600
So I have M distributions, and
each distribution has M minus
00:30:35.600 --> 00:30:39.290
1 free parameters, because the
summation of them should be
00:30:39.290 --> 00:30:41.590
equal to 1.
00:30:41.590 --> 00:30:43.726
So this is the number of free
parameters that I have.
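The matrix form and the parameter count can be sketched as below. This is an illustration under the assumption that the matrix is stored as a list of rows, with row i holding the conditional distribution of the next state given the present state i. The function names and the tolerance are hypothetical choices.

```python
def is_transition_matrix(P, tol=1e-9):
    """Check that P is square, non-negative, and that every row sums to 1."""
    m = len(P)
    for row in P:
        if len(row) != m:
            return False
        if any(p < 0 for p in row):
            return False
        if abs(sum(row) - 1.0) > tol:
            return False
    return True


def free_parameters(m):
    """Each of the m rows has m entries tied by one sum constraint: m * (m - 1)."""
    return m * (m - 1)


if __name__ == "__main__":
    P = [[0.9, 0.1],
         [0.5, 0.5]]  # hypothetical two-state chain
    print(is_transition_matrix(P), free_parameters(len(P)))
```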
00:31:05.090 --> 00:31:06.750
So is there any problem
with the matrix form?
00:31:06.750 --> 00:31:10.680
I'm assuming that you've
all seen
00:31:10.680 --> 00:31:13.690
this sometime before.
00:31:13.690 --> 00:31:14.550
Fine?
00:31:14.550 --> 00:31:19.810
So the nice thing about the matrix
form is that you can do many
00:31:19.810 --> 00:31:21.220
nice things with it.
00:31:21.220 --> 00:31:23.710
We can look at the notes for
them, and we will see these
00:31:23.710 --> 00:31:25.270
kinds of properties later.
00:31:25.270 --> 00:31:28.690
But you can imagine that just
looking at the matrix form and
00:31:28.690 --> 00:31:33.920
doing some algebra over them,
we can get very nice results
00:31:33.920 --> 00:31:35.790
about the Markov chains.
00:31:35.790 --> 00:31:40.950
The other representation of the
Markov chains is using a
00:31:40.950 --> 00:31:45.250
graph, a directed graph in which
each node corresponds to
00:31:45.250 --> 00:31:49.440
a state and each arc, or each
edge, corresponds to a
00:31:49.440 --> 00:31:51.450
transition probability.
00:31:51.450 --> 00:31:57.280
And the very important thing
about the graphical model is that
00:31:57.280 --> 00:32:01.800
there's a very clear difference
between 0
00:32:01.800 --> 00:32:04.660
transition probabilities
and non-0 transition
00:32:04.660 --> 00:32:05.850
probabilities.
00:32:05.850 --> 00:32:09.180
So if there's any possibility
of going from one state to
00:32:09.180 --> 00:32:14.580
another state, we have an
edge or an arc here.
00:32:14.580 --> 00:32:17.850
So probability of 10 to the
power of minus 5 is
00:32:17.850 --> 00:32:19.240
different from 0.
00:32:19.240 --> 00:32:21.900
We have an arc in one case and
no arc in another case,
00:32:21.900 --> 00:32:24.210
because there's a chance of
going from this state to the
00:32:24.210 --> 00:32:25.280
next state.
00:32:25.280 --> 00:32:27.790
That's true even when the probability
is only 10 to the power of minus 5.
00:32:34.560 --> 00:32:38.710
And OK, so we can do a lot of
inference by just looking at
00:32:38.710 --> 00:32:40.330
the graphical model.
00:32:40.330 --> 00:32:43.580
And it's easier to see some
properties of the Markov chain
00:32:43.580 --> 00:32:45.060
by looking at the graphical
models.
00:32:45.060 --> 00:32:49.350
We will see these properties
that we can find by looking at
00:32:49.350 --> 00:32:52.570
the graphical model.
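As a small illustration of the two representations, the sketch below turns a transition matrix into the edge list of its graph. The 3-state matrix is a made-up example, not the chain on the slide, and the function name `edges` is only an assumption for this sketch. It keeps an edge for every non-0 entry, however small, and drops every entry that is exactly 0.

```python
# Hypothetical 3-state transition matrix (not the chain from the lecture slide).
# Row i is the conditional distribution of the next state given the present state i.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.99999, 0.00001],
    [0.0, 1.0, 0.0],
]

def edges(P):
    """Return the directed edges (i, j) with P[i][j] > 0, using 1-based state labels."""
    return [(i + 1, j + 1)
            for i, row in enumerate(P)
            for j, p in enumerate(row)
            if p > 0]

# Each row sums to 1, which is why a row has only n - 1 free parameters.
for row in P:
    assert abs(sum(row) - 1.0) < 1e-12

# The transition 2 -> 3 has probability 1e-5, so it still gets an edge;
# the transition 1 -> 3 has probability 0, so it does not.
print(edges(P))
```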
00:32:52.570 --> 00:32:59.180
So this is not really
00:32:59.180 --> 00:33:01.830
classification of states, actually.
00:33:01.830 --> 00:33:04.150
So there are some definitions,
very
00:33:04.150 --> 00:33:06.170
intuitive definitions here.
00:33:06.170 --> 00:33:15.190
So there's something called a
walk, which says that this is
00:33:15.190 --> 00:33:19.940
an ordered string of the nodes,
where the probability
00:33:19.940 --> 00:33:23.940
of going from each node to
the next one is non-0.
00:33:23.940 --> 00:33:27.340
So for example, looking at this
walk, you can have the
00:33:27.340 --> 00:33:32.240
probability of going from 4 to 4
is positive, 4 to 1, 1 to 2,
00:33:32.240 --> 00:33:34.760
2 to 3, and 3 to 2.
00:33:34.760 --> 00:33:38.790
So there is no kind of
constraints in the definition
00:33:38.790 --> 00:33:39.330
of the walk.
00:33:39.330 --> 00:33:42.020
There should be just positive
probabilities in going from
00:33:42.020 --> 00:33:43.870
one state to the next state.
00:33:43.870 --> 00:33:45.200
So we can have repetition.
00:33:45.200 --> 00:33:47.230
We can have whatever we like.
00:33:50.260 --> 00:33:52.520
And the number of-- well,
we can find the number
00:33:52.520 --> 00:33:54.250
of states for sure.
00:33:54.250 --> 00:33:58.790
So what is the maximum number
of steps in a walk?
00:33:58.790 --> 00:34:00.550
Or minimum number?
00:34:00.550 --> 00:34:03.612
Minimum number of
states is two--
00:34:03.612 --> 00:34:04.760
[? one. ?]
00:34:04.760 --> 00:34:06.355
So what is the maximum
number of steps?
00:34:12.935 --> 00:34:16.580
Well, there's no constraint,
so it can be anything.
00:34:16.580 --> 00:34:19.930
So the next thing that
we look at is a path.
00:34:19.930 --> 00:34:24.489
A path is a walk where there
are no repeated nodes.
00:34:24.489 --> 00:34:28.659
So I never go through
one node twice.
00:34:28.659 --> 00:34:31.780
So for example, I have this
path, 4, 1, 2, 3.
00:34:31.780 --> 00:34:34.770
We can see here, 4, 1, 2, 3.
00:34:34.770 --> 00:34:38.210
And now we can say, what
is the maximum number
00:34:38.210 --> 00:34:39.469
of steps in a path?
00:34:43.453 --> 00:34:44.449
AUDIENCE: [INAUDIBLE].
00:34:44.449 --> 00:34:46.616
PROFESSOR: Steps, not states.
00:34:46.616 --> 00:34:47.429
AUDIENCE: [INAUDIBLE].
00:34:47.429 --> 00:34:50.190
PROFESSOR: n minus 1, yeah.
00:34:50.190 --> 00:34:52.330
Because you cannot go through
one state twice.
00:34:52.330 --> 00:34:57.030
So you have a maximum number
of steps of n minus 1.
00:34:57.030 --> 00:35:02.230
And a cycle is a walk in which
the first and last nodes
00:35:02.230 --> 00:35:03.540
are the same.
00:35:03.540 --> 00:35:07.280
So first of all, there's not
a really great difference
00:35:07.280 --> 00:35:09.590
between these two cycles,
so it doesn't
00:35:09.590 --> 00:35:11.580
matter where I start.
00:35:11.580 --> 00:35:13.550
And we don't care about
repetition in the
00:35:13.550 --> 00:35:14.800
definition of a cycle.
00:35:22.572 --> 00:35:23.510
Oh, OK.
00:35:23.510 --> 00:35:24.330
Yeah, sorry.
00:35:24.330 --> 00:35:25.410
No node is repeated.
00:35:25.410 --> 00:35:27.560
So sorry, something was wrong.
00:35:27.560 --> 00:35:30.700
Yeah, we shouldn't have any
repetition except for the
00:35:30.700 --> 00:35:31.660
first and last nodes.
00:35:31.660 --> 00:35:33.900
So the first and last node
should be the same, but there
00:35:33.900 --> 00:35:35.480
shouldn't be any repetition.
00:35:35.480 --> 00:35:38.005
So what is the maximum number
of steps in this case?
00:35:41.872 --> 00:35:42.870
AUDIENCE: [INAUDIBLE].
00:35:42.870 --> 00:35:47.070
PROFESSOR: n, yeah, because we
have an additional step.
00:35:47.070 --> 00:35:49.070
Yeah?
00:35:49.070 --> 00:35:51.320
AUDIENCE: You said in a path,
the maximum number of
00:35:51.320 --> 00:35:52.570
steps is n minus 1?
00:35:52.570 --> 00:35:53.570
PROFESSOR: Yeah.
00:35:53.570 --> 00:35:57.070
AUDIENCE: I mean, if you have
n equals, like, 6, couldn't
00:35:57.070 --> 00:35:59.350
there be 1, 2, 3, 4, 5, 6?
00:35:59.350 --> 00:36:02.830
PROFESSOR: No, it's the
steps, not the states.
00:36:02.830 --> 00:36:08.270
So if I have a path from 1 to
2, this is the path.
00:36:08.270 --> 00:36:12.360
So there's one step.
00:36:12.360 --> 00:36:14.740
Just whenever you're confused,
make a simple model.
00:36:14.740 --> 00:36:18.770
It's going to make everything
very, very easy.
00:36:18.770 --> 00:36:22.010
Any other questions?
00:36:22.010 --> 00:36:24.060
So this is another definition.
00:36:24.060 --> 00:36:27.930
We say that a node j is
accessible from i if there is
00:36:27.930 --> 00:36:30.840
a walk from i to j.
00:36:30.840 --> 00:36:37.350
And by just looking at the
graphical model, you can
00:36:37.350 --> 00:36:41.540
verify the existence of
this walk very easily.
00:36:41.540 --> 00:36:44.030
But the nice thing, again, about
this thing is what I
00:36:44.030 --> 00:36:46.720
emphasized in the
graphical model.
00:36:46.720 --> 00:36:51.690
It talks about if there's any
positive probability of going
00:36:51.690 --> 00:36:53.910
from i to j.
00:36:53.910 --> 00:37:01.420
So if j is accessible from i
and you start at node i,
00:37:01.420 --> 00:37:04.770
there's a positive probability
that you will end in node j
00:37:04.770 --> 00:37:07.290
sometime in the future.
00:37:07.290 --> 00:37:13.550
There's nothing said about the
number of steps needed to get
00:37:13.550 --> 00:37:17.010
there, or the probability of
that, except that this
00:37:17.010 --> 00:37:19.720
probability is non-0.
00:37:19.720 --> 00:37:20.970
It's positive.
00:37:24.110 --> 00:37:28.750
So for example, if there's a
state like k, that we have p
00:37:28.750 --> 00:37:33.250
of ik is positive and p of kj is
positive, then we will have
00:37:33.250 --> 00:37:36.320
p of ij in two steps is positive.
00:37:36.320 --> 00:37:38.980
So we have this kind
of notation.
00:37:38.980 --> 00:37:40.230
I don't have a pointer.
00:37:42.810 --> 00:37:43.510
Weird.
00:37:43.510 --> 00:37:50.060
So yeah, we have this kind of
notation. pijn means
00:37:50.060 --> 00:37:55.600
the probability of going from
state i to state j in n steps.
00:37:55.600 --> 00:37:59.990
And this is in exactly
n steps.
00:37:59.990 --> 00:38:02.770
Actually, this value could
be different for
00:38:02.770 --> 00:38:05.910
different values of n.
00:38:05.910 --> 00:38:12.860
For example, if we have a Markov
chain like this, p of
00:38:12.860 --> 00:38:22.320
1, 3 in one step is equal to 0, but
p of 1, 3 in two steps is non-0.
00:38:22.320 --> 00:38:31.680
So node j is accessible from i
if pijn is positive for some n.
00:38:31.680 --> 00:38:37.610
So we say that j is not
accessible from i if this is 0
00:38:37.610 --> 00:38:40.680
for all possible n's.
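The n-step probabilities and the accessibility test can be sketched with plain matrix powers. The chain below is a hypothetical 3-state example chosen so that p of 1, 3 is 0 in one step but positive in two steps; it is not the chain drawn on the board. The helper names `mat_mul`, `n_step`, and `accessible` are assumptions for this sketch. It relies on the fact that in a chain with N states, if any walk from i to j exists, then one with at most N steps exists, so checking n from 1 to N is enough.

```python
# Hypothetical chain, states indexed 0, 1, 2 (standing for states 1, 2, 3).
P = [
    [0.0, 1.0, 0.0],   # state 1 always goes to state 2
    [0.5, 0.0, 0.5],   # state 2 goes to state 1 or state 3
    [0.0, 0.0, 1.0],   # state 3 stays in state 3
]

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def n_step(P, n):
    """Return the matrix of n-step transition probabilities, P^n, for n >= 1."""
    result = P
    for _ in range(n - 1):
        result = mat_mul(result, P)
    return result

def accessible(P, i, j):
    """True if the n-step probability from i to j is positive for some n in 1..N."""
    return any(n_step(P, n)[i][j] > 0 for n in range(1, len(P) + 1))

print(n_step(P, 1)[0][2])   # one-step probability from state 1 to state 3
print(n_step(P, 2)[0][2])   # two-step probability from state 1 to state 3
print(accessible(P, 0, 2), accessible(P, 2, 0))
```

Each entry of the matrix power adds up the probabilities of all walks with exactly n steps, which is the sum over walks described in the lecture.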
00:38:40.680 --> 00:38:42.434
Mercedes?
00:38:42.434 --> 00:38:45.236
AUDIENCE: Are those actually
greater than or equal to?
00:38:45.236 --> 00:38:46.210
PROFESSOR: No, greater.
00:38:46.210 --> 00:38:52.990
They are always greater
than or equal to 0.
00:38:52.990 --> 00:38:56.510
Probability is always
non-negative.
00:38:56.510 --> 00:39:00.700
So what I want is positive
probability, meaning that
00:39:00.700 --> 00:39:02.930
there's a chance that
I will get there.
00:39:02.930 --> 00:39:04.955
I don't care how small this
chance is, but it
00:39:04.955 --> 00:39:07.366
shouldn't be 0.
00:39:07.366 --> 00:39:08.616
AUDIENCE: [INAUDIBLE].
00:39:12.688 --> 00:39:18.060
So p of ij in two steps, though, couldn't it
be equal to pik times pkj?
00:39:18.060 --> 00:39:19.440
PROFESSOR: Yeah, exactly.
00:39:19.440 --> 00:39:26.510
So in this case, p of 1, 3 in two steps is
equal to p12 times p23.
00:39:26.510 --> 00:39:29.480
AUDIENCE: So I guess I'm asking
if p of ij in two steps should be
00:39:29.480 --> 00:39:33.450
greater than or equal to pik times pkj.
00:39:33.450 --> 00:39:35.350
PROFESSOR: Oh, OK.
00:39:35.350 --> 00:39:36.050
No, actually.
00:39:36.050 --> 00:39:36.990
You know why?
00:39:36.990 --> 00:39:40.870
Because there can exist some
other state like here.
00:39:40.870 --> 00:39:41.250
AUDIENCE: Right.
00:39:41.250 --> 00:39:43.740
But if that doesn't exist and
there's only one path.
00:39:43.740 --> 00:39:44.360
PROFESSOR: Yeah, sure.
00:39:44.360 --> 00:39:45.690
But there's no guarantee.
00:39:45.690 --> 00:39:50.150
What I'm saying is that Pik is
positive, Pkj is positive.
00:39:50.150 --> 00:39:53.710
So this thing is positive,
but it can be bigger.
00:39:53.710 --> 00:39:55.510
In this case, it
can be bigger.
00:39:55.510 --> 00:39:58.840
I don't care about the quantity,
about the amount of
00:39:58.840 --> 00:39:59.350
the probability.
00:39:59.350 --> 00:40:03.430
I care about its positiveness,
that it's greater than 0.
00:40:03.430 --> 00:40:05.920
I just want it to be non-0.
00:40:11.810 --> 00:40:18.540
The other thing is that when we
look at pijn, I don't care
00:40:18.540 --> 00:40:22.090
what kind of walk I have
between i and j.
00:40:22.090 --> 00:40:25.040
It's a walk, so it can have
repetition, it can have
00:40:25.040 --> 00:40:27.480
cycles, or it can
have anything.
00:40:27.480 --> 00:40:28.560
I don't care.
00:40:28.560 --> 00:40:34.510
So if you really want to
calculate pij for n equal to
00:40:34.510 --> 00:40:34.800
[INAUDIBLE]
00:40:34.800 --> 00:40:38.940
1,000, you should really find
all possible walks from i to j
00:40:38.940 --> 00:40:43.495
and add the probabilities
to find this value.
00:40:43.495 --> 00:40:45.205
And all the walks
with n steps.
00:40:48.700 --> 00:40:50.950
Not here.
00:40:50.950 --> 00:40:55.100
Let's look at some examples.
00:40:55.100 --> 00:40:59.910
So is node 3 accessible
from node 1?
00:41:03.870 --> 00:41:08.840
So you see that there's a
walk like this, 1, 2, 3.
00:41:08.840 --> 00:41:11.270
So there's a positive
probability of going to state
00:41:11.270 --> 00:41:13.230
3 from node 1.
00:41:13.230 --> 00:41:18.380
So node 3 is accessible
from 1.
00:41:18.380 --> 00:41:20.880
But if you really want to
calculate the probability, you
00:41:20.880 --> 00:41:24.850
should also look at the fact
that we can have cycles.
00:41:24.850 --> 00:41:27.110
So 1, 2, 3 is a walk.
00:41:27.110 --> 00:41:29.550
But 1, 1, 2, 3 is also a walk.
00:41:29.550 --> 00:41:34.070
1, 2, 3, 2, 3 is also a walk.
00:41:34.070 --> 00:41:34.370
You see?
00:41:34.370 --> 00:41:37.420
So you have to count all these
things to find the probability
00:41:37.420 --> 00:41:41.780
of p13n for any n.
00:41:41.780 --> 00:41:46.370
What about state 5?
00:41:46.370 --> 00:41:49.990
So is the node 3 accessible
from node 5?
00:41:52.594 --> 00:41:56.350
You see that it's not.
00:41:56.350 --> 00:41:59.560
Actually, if you're going to
state 5, you never go out.
00:41:59.560 --> 00:42:01.910
With [INAUDIBLE]
00:42:01.910 --> 00:42:03.930
probability of 1, you
stay there forever.
00:42:03.930 --> 00:42:10.990
So actually, no state except
state 5 is accessible from 5.
00:42:15.570 --> 00:42:20.965
So is node 2 accessible
from itself?
00:42:23.590 --> 00:42:28.760
So accessible means that I
should have a walk from 2 to 2
00:42:28.760 --> 00:42:31.400
in some number of steps.
00:42:31.400 --> 00:42:35.870
So you can see that we
can have 2, 3, 2,
00:42:35.870 --> 00:42:36.740
or many other walks.
00:42:36.740 --> 00:42:39.900
So it's accessible.
00:42:39.900 --> 00:42:43.435
But as you see, node 6 is not
accessible from itself.
00:42:45.980 --> 00:42:48.420
If you are in node 6,
you always go out.
00:42:53.460 --> 00:42:55.990
So it's not accessible.
00:42:55.990 --> 00:42:58.680
So let's go back to
these definitions.
00:43:01.450 --> 00:43:04.050
Yeah, this is what I said, and
I'm emphasizing again.
00:43:04.050 --> 00:43:09.070
If you want to say that j is
not accessible from i, I
00:43:09.070 --> 00:43:13.770
should have a pijn equal
to 0 for all n.
00:43:18.970 --> 00:43:24.010
The other thing is that, if
there's a walk from i to j and
00:43:24.010 --> 00:43:30.100
from j to k, we can prove easily
that there's a walk
00:43:30.100 --> 00:43:32.440
from i to k.
00:43:32.440 --> 00:43:46.520
So having a walk from i to j
means that for some m, this
00:43:46.520 --> 00:43:47.880
thing is positive.
00:43:47.880 --> 00:43:50.140
So this is i to j.
00:43:50.140 --> 00:43:54.230
From j to k, this means
that p of jk, for
00:43:54.230 --> 00:43:56.960
some n, this is positive.
00:43:56.960 --> 00:44:03.890
So looking at i to k, I can say
that p of ik, m plus
00:44:03.890 --> 00:44:06.068
n, is greater than or equal to p of ij, m, times p of jk, n.
00:44:09.820 --> 00:44:14.260
And I have this thing here
because of the reason that I
00:44:14.260 --> 00:44:16.600
explained right now, because
there might be other paths
00:44:16.600 --> 00:44:24.190
from i to k, except the paths
going through j, or except the
00:44:24.190 --> 00:44:30.010
walks that have m steps to get
to j and n more steps to get to k.
00:44:32.680 --> 00:44:37.120
So we know that, well, we can
concatenate the walks from--
00:44:37.120 --> 00:44:39.660
so if there's a walk from i to
j and j to k, we have a walk
00:44:39.660 --> 00:44:40.910
from i to k.
00:44:56.260 --> 00:45:00.710
So we say that states i and j
communicate if there's a walk
00:45:00.710 --> 00:45:04.050
from i to j and from j to i.
00:45:04.050 --> 00:45:05.880
So it can go back and forth.
00:45:05.880 --> 00:45:10.650
So it means that there's
a cycle from i
00:45:10.650 --> 00:45:12.460
to i or j to j.
00:45:12.460 --> 00:45:13.710
This is the implication.
00:45:17.260 --> 00:45:20.650
So it's, again, very simple to
prove that if i communicates
00:45:20.650 --> 00:45:25.880
with j and j communicates with
k, then i communicates with k.
00:45:25.880 --> 00:45:28.200
In order to prove that,
I should assume that i
00:45:28.200 --> 00:45:34.810
communicates with j and j with k. I need to
prove that there is a walk
00:45:34.810 --> 00:45:39.480
from i to k, and I need to prove
that there is a walk
00:45:39.480 --> 00:45:41.610
from k to i.
00:45:41.610 --> 00:45:46.700
So this means that i
communicates with k.
00:45:46.700 --> 00:45:52.050
So these two things can be
proved easily from the
00:45:52.050 --> 00:45:54.710
concatenation of the walks
and the fact that i and j
00:45:54.710 --> 00:45:56.040
communicate and j and
k communicate.
00:46:00.240 --> 00:46:04.050
Now, what I can define
is something
00:46:04.050 --> 00:46:07.660
called a class of states.
00:46:07.660 --> 00:46:18.160
So a class is a non-empty
set of states, where
00:46:18.160 --> 00:46:21.440
all the pairs of states in the
class communicate with each
00:46:21.440 --> 00:46:25.290
other, and none of them
communicate with any other
00:46:25.290 --> 00:46:28.060
state in the Markov chain.
00:46:28.060 --> 00:46:31.070
So this is the definition
of class.
00:46:31.070 --> 00:46:37.900
So I just group the pairs that
can communicate with the state
00:46:37.900 --> 00:46:40.830
and just get rid of all those
who do not communicate to the
00:46:40.830 --> 00:46:42.080
[INAUDIBLE].
00:46:46.110 --> 00:46:55.180
So for defining a class or for
finding a class, or for naming
00:46:55.180 --> 00:46:59.810
a class, we can have a
representative state.
00:46:59.810 --> 00:47:05.320
So I want to find all the states
that communicate with
00:47:05.320 --> 00:47:07.490
each other in a class.
00:47:07.490 --> 00:47:11.280
So I can just pick one of the
states in this class and find
00:47:11.280 --> 00:47:13.540
all the states that communicate
with this single
00:47:13.540 --> 00:47:19.410
state, because if two states
communicate with one state,
00:47:19.410 --> 00:47:23.520
then these two states
communicate with each other.
00:47:23.520 --> 00:47:32.210
And if there's a state that
doesn't communicate with me,
00:47:32.210 --> 00:47:35.490
it doesn't communicate with
anybody else whom I'm
00:47:35.490 --> 00:47:37.740
communicating with.
00:47:37.740 --> 00:47:41.860
I'm going to prove it now,
just in a few moments.
00:47:41.860 --> 00:47:45.530
Just, I want to look at
this figure again.
00:47:45.530 --> 00:47:53.080
So first I take state 2 and
find a class that has this
00:47:53.080 --> 00:47:54.790
state in itself.
00:47:54.790 --> 00:48:00.170
So you see that in this class,
we have states 2 and 3, because
00:48:00.170 --> 00:48:04.110
there's only state 3 that
communicates with 2.
00:48:04.110 --> 00:48:08.700
And correspondingly,
we can have a class
00:48:08.700 --> 00:48:11.200
having states 4 and 5.
00:48:11.200 --> 00:48:14.440
And you see that state 1 is
communicating with itself, so
00:48:14.440 --> 00:48:15.410
it's a class by itself.
00:48:15.410 --> 00:48:18.120
But it doesn't communicate
with anyone else.
00:48:18.120 --> 00:48:20.530
So we have this class also.
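The representative-state idea can be sketched in a few lines. The 4-state chain below is hypothetical (it is not the chain on the slide), and the names `reachable` and `classes` are assumptions for this sketch. For each unassigned state i, the class is i itself together with every state j such that j is accessible from i and i is accessible from j. A state that never returns to itself still ends up as a class of its own, so that the classes cover all the states.

```python
# Hypothetical chain, states indexed 0..3.
P = [
    [0.5, 0.5, 0.0, 0.0],   # state 0 loops on itself or moves to state 1
    [0.0, 0.0, 1.0, 0.0],   # state 1 goes to state 2
    [0.0, 1.0, 0.0, 0.0],   # state 2 goes back to state 1
    [1.0, 0.0, 0.0, 0.0],   # state 3 always leaves and is never re-entered
]

def reachable(P, i):
    """States accessible from i in one or more steps (search over non-0 entries)."""
    seen = set()
    stack = [i]
    while stack:
        u = stack.pop()
        for v, p in enumerate(P[u]):
            if p > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def classes(P):
    """Partition the states into classes, using each unassigned state as representative."""
    n = len(P)
    reach = [reachable(P, i) for i in range(n)]
    assigned = set()
    result = []
    for i in range(n):
        if i in assigned:
            continue
        # i together with every state that communicates with i
        cls = {i} | {j for j in reach[i] if i in reach[j]}
        assigned |= cls
        result.append(sorted(cls))
    return result

print(classes(P))
```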
00:48:20.530 --> 00:48:26.070
So next question, why do
we call C4 a class?
00:48:31.070 --> 00:48:32.320
So it doesn't communicate
with itself.
00:48:35.620 --> 00:48:39.810
If you're in state 6, you go out
of it with probability of
00:48:39.810 --> 00:48:41.100
1, eventually.
00:48:41.100 --> 00:48:42.465
So why do we call it a class?
00:48:49.200 --> 00:48:52.100
Actually, so this is the
definition of the classes.
00:48:52.100 --> 00:48:55.630
But we want to have some very
nice property of the classes
00:48:55.630 --> 00:48:59.720
which says that we can partition
the states in a
00:48:59.720 --> 00:49:03.870
Markov chain by using
the classes.
00:49:03.870 --> 00:49:08.355
So if I don't count this case as
a class, I cannot partition
00:49:08.355 --> 00:49:11.910
it, because partitioning means
that I should cover all the
00:49:11.910 --> 00:49:14.010
states with the classes.
00:49:14.010 --> 00:49:19.550
What I want to do is to do some
kind of partitioning in a
00:49:19.550 --> 00:49:22.370
Markov chain that I have, by
using the classes, so that I
00:49:22.370 --> 00:49:25.690
can have a representative
state for each class.
00:49:25.690 --> 00:49:31.510
And this is one way to partition
the Markov chains.
00:49:34.010 --> 00:49:37.000
And why do I say it's
partitioning?
00:49:37.000 --> 00:49:44.220
Well, it's covering the whole
finite space of the states.
00:49:44.220 --> 00:49:46.820
But I need to prove that
there's no intersection
00:49:46.820 --> 00:49:49.770
between classes also.
00:49:49.770 --> 00:49:51.020
Why is it like that?
00:49:55.100 --> 00:49:59.900
So meaning that I cannot have
two classes where there is an
00:49:59.900 --> 00:50:02.430
intersection between them,
because if there's an
00:50:02.430 --> 00:50:03.300
intersection--
00:50:03.300 --> 00:50:09.660
for example, i belongs to
C1 and i belongs to C2--
00:50:09.660 --> 00:50:14.320
it means that i communicates
with all the states in C1 and
00:50:14.320 --> 00:50:17.170
i communicates with
all states in C2.
00:50:17.170 --> 00:50:20.340
So you can say that all states
in C1 communicate with all
00:50:20.340 --> 00:50:21.610
states in C2.
00:50:21.610 --> 00:50:23.040
So actually, they should
be the same.
00:50:28.220 --> 00:50:30.070
And there's only these
states that
00:50:30.070 --> 00:50:31.280
communicate with each other.
00:50:31.280 --> 00:50:34.690
We have this exclusivity
condition.
00:50:34.690 --> 00:50:37.980
So we can have this kind
of partitioning.
00:50:37.980 --> 00:50:40.090
Is there any question?
00:50:40.090 --> 00:50:41.340
Everybody's fine?
00:50:44.650 --> 00:50:47.690
So another definition which
is going to be very, very
00:50:47.690 --> 00:50:52.220
important in the future for
us is the recurrency.
00:50:52.220 --> 00:50:53.470
And it's actually very simple.
00:51:00.310 --> 00:51:10.170
So a state i is called recurrent
if for all the states j
00:51:10.170 --> 00:51:17.820
that are accessible from
i, we also have that i is
00:51:17.820 --> 00:51:19.770
accessible from j.
00:51:19.770 --> 00:51:25.320
So if from some state i, in some
number of steps, I can
00:51:25.320 --> 00:51:29.400
go to some state j, I can
get back to i from
00:51:29.400 --> 00:51:31.030
this state for sure.
00:51:31.030 --> 00:51:32.780
And this should be
true for all the
00:51:32.780 --> 00:51:36.000
states in a Markov chain.
00:51:36.000 --> 00:51:39.550
So if there's a walk from i to
j, there should be a walk from
00:51:39.550 --> 00:51:41.170
j to i, too.
00:51:41.170 --> 00:51:45.730
If this is true, then
i is recurrent.
00:51:45.730 --> 00:51:47.610
This is the definition
of recurrence.
00:51:47.610 --> 00:51:50.080
Is it OK?
00:51:50.080 --> 00:51:53.610
And if it's not recurrent,
we call it transient.
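The definition translates directly into a check on the graph. This is a minimal sketch on a hypothetical 4-state chain, not the one on the slide, and the names `reachable` and `is_recurrent` are assumptions. State i is reported recurrent when every state j accessible from i has i accessible from it; otherwise it is transient.

```python
# Hypothetical chain, states indexed 0..3.
P = [
    [0.5, 0.5, 0.0, 0.0],   # state 0 can leave toward state 1 and never return
    [0.0, 0.0, 1.0, 0.0],   # states 1 and 2 only move between each other
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],   # state 3 leaves in one step and never returns
]

def reachable(P, i):
    """States accessible from i in one or more steps (search over non-0 entries)."""
    seen = set()
    stack = [i]
    while stack:
        u = stack.pop()
        for v, p in enumerate(P[u]):
            if p > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_recurrent(P, i):
    """True if i is accessible from every state j that is accessible from i."""
    return all(i in reachable(P, j) for j in reachable(P, i))

for i in range(len(P)):
    print(i, "recurrent" if is_recurrent(P, i) else "transient")
```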
00:51:53.610 --> 00:51:56.720
And why do we call it transient?
00:51:56.720 --> 00:51:59.130
I mean, the intuition
is very nice.
00:51:59.130 --> 00:52:01.790
So what is a transient state?
00:52:01.790 --> 00:52:05.100
A transient state says that
there might be some kind of
00:52:05.100 --> 00:52:09.240
walk from i to some k.
00:52:09.240 --> 00:52:11.560
And then I can't go back.
00:52:11.560 --> 00:52:14.910
So there's a positive
probability that I go out of
00:52:14.910 --> 00:52:18.370
state i in some way, in
some walk, and never
00:52:18.370 --> 00:52:19.620
come back to it.
00:52:19.620 --> 00:52:23.008
So it's kind of transitional
behavior.
00:52:23.008 --> 00:52:25.984
AUDIENCE: [INAUDIBLE]?
00:52:25.984 --> 00:52:28.363
PROFESSOR: No, no, with
some probability.
00:52:31.000 --> 00:52:32.570
So there exists some
probability.
00:52:32.570 --> 00:52:34.270
There is a positive probability
that I go out of
00:52:34.270 --> 00:52:36.240
it and never come back.
00:52:36.240 --> 00:52:39.900
It's enough for definition
of transience.
00:52:39.900 --> 00:52:43.700
So you know why?
00:52:43.700 --> 00:52:47.510
Because I have the definition of
recurrence for all the j's
00:52:47.510 --> 00:52:50.740
that are accessible from i.
00:52:50.740 --> 00:52:51.810
So with probability 1--
00:52:51.810 --> 00:52:54.240
oh, OK, so I cannot say
with probability 1.
00:52:57.220 --> 00:52:59.420
Yeah, I cannot say probability
of 1 for recurrency.
00:52:59.420 --> 00:53:04.190
But for transient behavior,
there exists some probability.
00:53:04.190 --> 00:53:05.800
There's a positive probability
that I go out
00:53:05.800 --> 00:53:07.050
and never come back.
00:53:21.770 --> 00:53:24.140
State 1 is transient.
00:53:26.680 --> 00:53:29.215
And state 3, is it recurrent
or transient?
00:53:36.860 --> 00:53:37.260
Some idea?
00:53:37.260 --> 00:53:37.530
AUDIENCE: [INAUDIBLE].
00:53:37.530 --> 00:53:38.710
PROFESSOR: Transient?
00:53:38.710 --> 00:53:41.290
What about state 2?
00:53:41.290 --> 00:53:44.430
It's wrong, because there should
be something going on
00:53:44.430 --> 00:53:45.470
[INAUDIBLE].
00:53:45.470 --> 00:53:47.570
So now here, it's recurrent.
00:53:53.680 --> 00:53:54.930
Good?
00:53:58.410 --> 00:54:00.650
Oh yeah, we have
examples here.
00:54:00.650 --> 00:54:04.970
So states 2 and 3 are recurrent,
because the only
00:54:04.970 --> 00:54:09.660
state that I can go out from
this state is to themselves.
00:54:09.660 --> 00:54:13.000
So only they are accessible
from themselves.
00:54:13.000 --> 00:54:14.790
And there's a positive
probability of going from one
00:54:14.790 --> 00:54:15.410
to the other.
00:54:15.410 --> 00:54:17.280
So they're recurrent.
00:54:17.280 --> 00:54:22.060
But states 4 and 5
are transient.
00:54:22.060 --> 00:54:28.020
And the reason is here, because
there is a positive
00:54:28.020 --> 00:54:30.980
probability that I go out of
state 4 in that direction and
00:54:30.980 --> 00:54:33.750
never come back.
00:54:33.750 --> 00:54:35.000
And there's no way
to come back.
00:54:37.950 --> 00:54:41.200
And state 6 and 1 are
also transient.
00:54:41.200 --> 00:54:44.740
State 6 is very, very transient
because, well, I go
00:54:44.740 --> 00:54:47.290
out of it in one step
and never come back.
00:54:47.290 --> 00:54:51.540
And state 1 is also transient,
because I can go out of it in
00:54:51.540 --> 00:54:55.111
this direction and never
be able to come back.
00:54:55.111 --> 00:54:56.361
Do you see?
00:55:00.320 --> 00:55:04.960
So this is a very, very, very
important theorem, which says
00:55:04.960 --> 00:55:09.450
that if we have some classes,
the states in the class are
00:55:09.450 --> 00:55:12.590
all recurrent or
all transient.
00:55:12.590 --> 00:55:15.110
This is what we are going to
use a lot in the future, so
00:55:15.110 --> 00:55:17.147
you should remember it.
00:55:17.147 --> 00:55:21.070
And the proof is very easy.
00:55:21.070 --> 00:55:29.385
So let's assume that state
i is recurrent.
00:55:29.385 --> 00:55:37.160
And let's define Si, all the
states that communicate with
00:55:37.160 --> 00:55:40.660
i, that are accessible from i.
00:55:40.660 --> 00:55:45.950
And you know that since i is
recurrent, being accessible
00:55:45.950 --> 00:55:52.230
from i means that i is
accessible from them.
00:55:52.230 --> 00:55:56.690
So if j is accessible from i,
i is also accessible from j.
00:55:56.690 --> 00:56:05.460
So we know that i and j
communicate with each other if
00:56:05.460 --> 00:56:10.960
and only if j is in this set.
00:56:10.960 --> 00:56:12.370
So this is a class.
00:56:12.370 --> 00:56:15.950
This is the class
that contains i.
00:56:15.950 --> 00:56:18.610
So this is like the class that I
told you, that we can have a
00:56:18.610 --> 00:56:23.360
class where this state
i is representative.
00:56:23.360 --> 00:56:26.970
And actually, any state in a
class can be representative.
00:56:26.970 --> 00:56:29.920
It doesn't matter.
00:56:29.920 --> 00:56:33.440
So by just looking at state
i, I can define a class.
00:56:33.440 --> 00:56:39.410
The states that are accessible
from i-- and actually, this is
00:56:39.410 --> 00:56:41.190
the class that contains i.
00:56:43.800 --> 00:56:51.720
So let's assume that there's
a state called k which is
00:56:51.720 --> 00:56:55.720
accessible from some j
which is in this set.
00:56:55.720 --> 00:56:59.660
So k is accessible from j, and
j is accessible from i.
00:56:59.660 --> 00:57:03.020
So k is accessible from i.
00:57:03.020 --> 00:57:07.060
But k accessible from i
implies that i is also
00:57:07.060 --> 00:57:09.250
accessible from k, because
i is recurrent.
00:57:12.204 --> 00:57:16.510
So you see that i is accessible
from k.
00:57:16.510 --> 00:57:20.020
And you know that j is also
accessible from i because,
00:57:20.020 --> 00:57:24.590
well, Si was also a class,
recurrent class.
00:57:24.590 --> 00:57:27.060
So you see that j is
also recurrent.
00:57:27.060 --> 00:57:32.440
So if k is accessible from j,
then j is also accessible from
00:57:32.440 --> 00:57:34.490
k for any k.
00:57:34.490 --> 00:57:37.390
So this is the definition
of recurrency.
00:57:37.390 --> 00:57:42.795
So what I want to say is that if
i is recurrent and it's in
00:57:42.795 --> 00:57:46.370
the same class as j, then
j is recurrent for sure.
00:57:49.180 --> 00:57:52.540
And we didn't prove it here,
but if one of them is
00:57:52.540 --> 00:57:54.070
transient, then all of them
are transient too.
00:57:56.900 --> 00:57:58.150
So the proof is very simple.
00:58:01.700 --> 00:58:04.950
I proved that if there is any
state like k that is
00:58:04.950 --> 00:58:07.480
accessible from j, I need
to prove that j is also
00:58:07.480 --> 00:58:09.640
accessible from k.
00:58:09.640 --> 00:58:10.890
And I proved it like this.
00:58:15.550 --> 00:58:16.800
It's easy.
00:58:18.720 --> 00:58:23.830
So the next definition that we
have is the definition of the
00:58:23.830 --> 00:58:27.480
periodic states and classes.
00:58:27.480 --> 00:58:33.590
So I told you that the number of
steps in a walk-- so when I
00:58:33.590 --> 00:58:37.460
say that there's a walk from
some state to the other state,
00:58:37.460 --> 00:58:41.740
I didn't specify the number of
steps needed to get from the
00:58:41.740 --> 00:58:43.240
state to another state.
00:58:43.240 --> 00:58:47.610
So assuming that there is a
walk from state i to i,
00:58:47.610 --> 00:58:51.410
meaning that i is accessible
from i, it's not always true.
00:58:51.410 --> 00:58:54.560
You might go out of i and
never come back to it.
00:58:54.560 --> 00:58:58.260
So assuming that there's a
positive probability that we
00:58:58.260 --> 00:59:02.140
can go from state i to i, you
can look at the number of
00:59:02.140 --> 00:59:04.880
steps needed for this walk.
00:59:04.880 --> 00:59:08.610
And we can find the greatest
common divisor of
00:59:08.610 --> 00:59:10.510
this number of steps.
00:59:10.510 --> 00:59:15.180
And it's called the
period of i.
00:59:15.180 --> 00:59:19.720
And if this number is greater
than 1, then i is
00:59:19.720 --> 00:59:21.710
said to be periodic.
00:59:21.710 --> 00:59:24.640
And if it's not,
it's aperiodic.
00:59:24.640 --> 00:59:28.840
So the very simple example
of this thing
00:59:28.840 --> 00:59:31.190
is this Markov chain.
00:59:31.190 --> 00:59:40.900
So probability of 1, 1
in one step is 0.
00:59:40.900 --> 00:59:44.990
Probability of 1, 1 in two
steps is positive.
00:59:44.990 --> 00:59:49.420
And actually, probability of
1, 1 for all even number of
00:59:49.420 --> 00:59:51.440
steps is positive.
00:59:51.440 --> 00:59:55.567
But for all odd number
of steps, it's 0.
00:59:58.490 --> 01:00:01.940
And actually, you can
prove that 2 is the
01:00:01.940 --> 01:00:03.390
greatest common divisor.
01:00:03.390 --> 01:00:05.487
So this is the very
easy example to
01:00:05.487 --> 01:00:13.660
show that it's periodic.
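The period can be sketched as a running greatest common divisor over the step counts n at which a return from i to i is possible. Two things here are assumptions of the sketch and not part of the lecture: the function name `period`, and the finite horizon `max_n`, which truncates the search, so the result is the true period only if the horizon is large enough for the chain at hand. The code tracks only whether an n-step walk exists, not its probability.

```python
from math import gcd

def period(P, i, max_n=50):
    """gcd of all n <= max_n with a possible n-step return from i to i (0 if none)."""
    n_states = len(P)
    # support[u][v] is True when a one-step transition u -> v has positive probability.
    support = [[p > 0 for p in row] for row in P]
    current = [row[:] for row in support]   # existence of walks with exactly 1 step
    d = 0
    for n in range(1, max_n + 1):
        if current[i][i]:
            d = gcd(d, n)                   # gcd(0, n) == n, so the first return sets d
        # extend every walk by one more step
        current = [[any(current[u][k] and support[k][v] for k in range(n_states))
                    for v in range(n_states)]
                   for u in range(n_states)]
    return d

# The two-state chain from the lecture: each state always moves to the other.
two_state = [[0.0, 1.0],
             [1.0, 0.0]]

# Hypothetical variant with a self-loop on the first state.
with_loop = [[0.5, 0.5],
             [1.0, 0.0]]

print(period(two_state, 0))
print(period(with_loop, 0))
```

In the first chain, returns are possible only in an even number of steps, so the state is periodic with period 2. In the variant, a return in one step is possible, so the period is 1 and the state is aperiodic.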
01:00:18.850 --> 01:00:23.620
So there is a very simple
thing to check if
01:00:23.620 --> 01:00:25.920
something is aperiodic.
01:00:25.920 --> 01:00:28.090
But it doesn't tell me--
01:00:28.090 --> 01:00:31.380
I mean, if it doesn't exist, we
don't know its periodicity.
01:00:31.380 --> 01:00:41.780
So if there is a walk from state
1 or state i to itself,
01:00:41.780 --> 01:00:47.930
and in this walk, we go from a
state called, like, j, and
01:00:47.930 --> 01:00:50.560
there is a loop here--
01:00:50.560 --> 01:00:52.610
so Pjj is positive.
01:00:55.460 --> 01:01:00.062
What can we say about the
periodicity of i?
01:01:00.062 --> 01:01:02.040
No, the period is 1.
01:01:02.040 --> 01:01:04.490
Yeah, exactly.
01:01:04.490 --> 01:01:07.820
So in this case, state i
is always aperiodic.
01:01:11.080 --> 01:01:13.900
So if there's a walk, and in
this walk, there is a loop, we
01:01:13.900 --> 01:01:14.860
always say that.
01:01:14.860 --> 01:01:18.230
But if it doesn't exist, can
we say anything about the
01:01:18.230 --> 01:01:20.800
periodicity?
01:01:20.800 --> 01:01:21.770
No.
01:01:21.770 --> 01:01:22.670
It might be periodic.
01:01:22.670 --> 01:01:24.940
It might be aperiodic.
01:01:24.940 --> 01:01:25.820
It's just a check.
01:01:25.820 --> 01:01:29.040
So whenever you see a loop,
it's aperiodic.
01:01:29.040 --> 01:01:32.350
Whenever you see a loop in the
walk from i to i, if there's a
01:01:32.350 --> 01:01:34.740
loop in the other side of the
Markov chain, we don't care.
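A minimal sketch of the two points above, assuming a chain is described only by which transitions have positive probability. The names `return_times`, `period`, `flip`, and `with_loop` are hypothetical and not from the lecture. The period is taken as the greatest common divisor of the return times found up to a cutoff `max_n`, which is an assumption that works for small examples like these.

```python
from math import gcd

def return_times(adj, i, max_n):
    """List every n in 1..max_n for which some n-step walk goes from i back to i."""
    times = []
    reachable = {i}  # states reachable from i in exactly 0 steps
    for n in range(1, max_n + 1):
        # advance every reachable state by one positive-probability transition
        reachable = {k for s in reachable for k in adj[s]}
        if i in reachable:
            times.append(n)
    return times

def period(adj, i, max_n=50):
    """gcd of the return times of state i, looking only at walks of up to max_n steps."""
    d = 0
    for n in return_times(adj, i, max_n):
        d = gcd(d, n)
    return d

# Two-state chain that always flips: 1 -> 2 -> 1 -> ...
flip = {1: [2], 2: [1]}
# Same chain, but state 2 (which lies on the walk from 1 to 1) also has a self-loop.
with_loop = {1: [2], 2: [1, 2]}

print(return_times(flip, 1, 8))  # [2, 4, 6, 8]
print(period(flip, 1))           # 2
print(period(with_loop, 1))      # 1
```

The self-loop on state 2 adds return walks of length 3, 4, 5, and so on, which is why the greatest common divisor drops to 1.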
01:01:38.370 --> 01:01:41.740
So the definition is fine.
01:01:41.740 --> 01:01:46.030
So just looking at this example,
so if we're going
01:01:46.030 --> 01:01:50.880
from state 4 to 4, the number
of steps that we need is,
01:01:50.880 --> 01:01:52.710
like, 4, 6, 8.
01:01:52.710 --> 01:02:01.940
So 4, 1, 2, 3, 4 is a cycle, or
is a walk from 4 to 4, and
01:02:01.940 --> 01:02:05.440
the number of steps is 4.
01:02:05.440 --> 01:02:10.340
4, 5, 6, 7, 8, 9, 4
is another walk.
01:02:10.340 --> 01:02:13.340
So this corresponds
to n equal to 6.
01:02:13.340 --> 01:02:18.230
And 4, 1, 2, 3, 4, 1, 2, 3,
4 is another walk which
01:02:18.230 --> 01:02:21.080
corresponds to n equal to 8.
01:02:21.080 --> 01:02:27.640
So you see that we can go like
this or this or this.
01:02:27.640 --> 01:02:29.680
So these are different n's.
01:02:29.680 --> 01:02:33.640
But we see that the greatest
common divisor is 2.
01:02:33.640 --> 01:02:36.232
So the period of state 4 is 2.
01:02:36.232 --> 01:02:40.120
For state 7, we have
this thing.
01:02:40.120 --> 01:02:44.730
So the minimum number of steps
to get from 7 to itself is 6,
01:02:44.730 --> 01:02:48.840
and then you can get from
7 to 7 in 10 steps.
01:02:48.840 --> 01:02:49.960
And I hope you see it.
01:02:49.960 --> 01:02:53.650
So it's going to be like this.
01:02:53.650 --> 01:02:57.690
And so again, the greatest
common divisor is 2.
01:03:04.400 --> 01:03:18.790
So we proved that if one state
in a class is recurrent,
01:03:18.790 --> 01:03:20.830
then all the states
are recurrent.
01:03:20.830 --> 01:03:23.440
And we said that recurrency
corresponds to having a cycle,
01:03:23.440 --> 01:03:26.200
having a walk from each
state to itself.
01:03:29.690 --> 01:03:36.310
So this is the result, very
similar to that one, which
01:03:36.310 --> 01:03:38.460
says that all the states
in the same class
01:03:38.460 --> 01:03:41.170
have the same period.
01:03:41.170 --> 01:03:45.600
And in this example,
you see it too.
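The claim that all states in a class share one period can be checked numerically. The sketch below assumes the nine-state graph is the one implied by the walks read out above (4, 1, 2, 3, 4 and 4, 5, 6, 7, 8, 9, 4); the slide itself is not in the transcript, so `nine` is a reconstruction. The function `class_period` is a hypothetical helper that uses a standard graph technique: run a breadth-first search from a root and take the gcd of `dist[u] + 1 - dist[v]` over all edges `u -> v`.

```python
from collections import deque
from math import gcd

def class_period(adj, root):
    """Period of the class containing root, assuming every state can reach every other."""
    # breadth-first search gives the shortest walk length from root to each state
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # each edge u -> v contributes dist[u] + 1 - dist[v]; the gcd of these is the period
    d = 0
    for u in dist:
        for v in adj[u]:
            d = gcd(d, dist[u] + 1 - dist[v])
    return d

# Assumed transition graph: a 4-cycle and a 6-cycle that share state 4.
nine = {1: [2], 2: [3], 3: [4], 4: [1, 5], 5: [6],
        6: [7], 7: [8], 8: [9], 9: [4]}

periods = {s: class_period(nine, s) for s in nine}
print(periods)
```

Whichever state is used as the root, the result is the same, which is what the theorem says should happen.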
01:03:45.600 --> 01:03:48.590
The proof is not very
complicated, but it takes time
01:03:48.590 --> 01:03:49.120
to do that.
01:03:49.120 --> 01:03:50.210
So I'm not going to do it.
01:03:50.210 --> 01:03:51.920
But it's all nice.
01:03:51.920 --> 01:03:57.330
So it's good if you look at
the text, you'll see that.
01:03:57.330 --> 01:03:58.686
AUDIENCE: [INAUDIBLE]
01:03:58.686 --> 01:04:01.700
only for recurrent states?
01:04:01.700 --> 01:04:02.250
PROFESSOR: Yeah.
01:04:02.250 --> 01:04:04.217
For non-recurrent ones, it's--
01:04:04.217 --> 01:04:05.680
AUDIENCE: [INAUDIBLE].
01:04:05.680 --> 01:04:06.510
PROFESSOR: It's 1.
01:04:06.510 --> 01:04:08.570
It's aperiodic.
01:04:08.570 --> 01:04:09.220
OK, yeah.
01:04:09.220 --> 01:04:11.596
Periodicity is only defined for
recurrent states, yeah.
01:04:15.290 --> 01:04:16.710
We have another example here.
01:04:16.710 --> 01:04:17.780
I just want to show you.
01:04:17.780 --> 01:04:24.480
So I have two recurrent classes
in this example.
01:04:24.480 --> 01:04:25.750
Actually, three.
01:04:25.750 --> 01:04:28.610
So one of them is this class.
01:04:28.610 --> 01:04:33.030
What is the period for a class
corresponding to state 1?
01:04:33.030 --> 01:04:34.460
It's 1, because I have a loop.
01:04:34.460 --> 01:04:35.570
It's very simple.
01:04:35.570 --> 01:04:39.440
For any n, there is a positive
probability.
01:04:39.440 --> 01:04:43.320
What is the period for
this class, states--
01:04:43.320 --> 01:04:47.390
containing states 4 and 5?
01:04:47.390 --> 01:04:50.510
Look at it.
01:04:50.510 --> 01:04:52.850
There is definitely a
loop in this class.
01:04:52.850 --> 01:04:54.785
So it's 1.
01:04:54.785 --> 01:04:58.100
No, I said that whenever there
is a loop, the greatest common
01:04:58.100 --> 01:04:58.910
divisor is 1.
01:04:58.910 --> 01:05:03.080
So for going from state 5 to
5, we can do it in one step,
01:05:03.080 --> 01:05:05.580
two steps, three steps,
four steps, five.
01:05:05.580 --> 01:05:09.420
So the greatest common
divisor is 1.
01:05:09.420 --> 01:05:14.200
So here, I showed you that if
there is a loop, then it's
01:05:14.200 --> 01:05:16.650
definitely aperiodic, meaning
that the greatest common
01:05:16.650 --> 01:05:18.840
divisor is 1.
01:05:18.840 --> 01:05:20.218
AUDIENCE: [INAUDIBLE] have
self-transitions [INAUDIBLE]
01:05:20.218 --> 01:05:21.670
2?
01:05:21.670 --> 01:05:24.410
PROFESSOR: No, I'm talking
about 4 and 5.
01:05:24.410 --> 01:05:25.260
In what?
01:05:25.260 --> 01:05:27.204
AUDIENCE: If they didn't
have self-transitions?
01:05:27.204 --> 01:05:28.180
[INAUDIBLE].
01:05:28.180 --> 01:05:28.780
PROFESSOR: Oh yeah.
01:05:28.780 --> 01:05:30.140
It would be like this one?
01:05:30.140 --> 01:05:30.980
AUDIENCE: Yeah, [INAUDIBLE].
01:05:30.980 --> 01:05:33.310
PROFESSOR: Yeah, definitely.
01:05:33.310 --> 01:05:38.685
So the class containing states
2 and 3, they are periodic.
01:05:38.685 --> 01:05:39.651
AUDIENCE: Oh, wait.
01:05:39.651 --> 01:05:40.901
You just said [INAUDIBLE].
01:05:44.000 --> 01:05:44.640
PROFESSOR: Oh, OK.
01:05:44.640 --> 01:05:45.890
AUDIENCE: [INAUDIBLE].
01:05:53.396 --> 01:05:55.097
PROFESSOR: Yeah, actually,
we can define
01:05:55.097 --> 01:05:56.850
for transient states.
01:05:56.850 --> 01:05:59.380
Yeah, for transient classes,
we can define periodicity,
01:05:59.380 --> 01:06:00.630
like this case.
01:06:04.480 --> 01:06:05.730
Yeah, why not?
01:06:20.300 --> 01:06:26.360
This is another very important
thing that we can
01:06:26.360 --> 01:06:29.100
do in Markov chains.
01:06:29.100 --> 01:06:32.770
So I have a class of states
in a Markov chain,
01:06:32.770 --> 01:06:35.590
and they are periodic.
01:06:35.590 --> 01:06:42.130
So the period of each state in
a class is greater than 1.
01:06:42.130 --> 01:06:45.250
And you know that all the states
in a class have the
01:06:45.250 --> 01:06:47.480
same period.
01:06:47.480 --> 01:06:52.920
So it means that I can partition
the class into these
01:06:52.920 --> 01:06:58.700
subclasses, where
there is only--
01:06:58.700 --> 01:07:01.040
OK, so--
01:07:01.040 --> 01:07:02.510
do I have that?
01:07:02.510 --> 01:07:03.760
I don't know.
01:07:15.270 --> 01:07:18.160
So let's assume that
d is equal to 3.
01:07:18.160 --> 01:07:21.620
And the whole thing is the
class of states that I'm
01:07:21.620 --> 01:07:22.750
talking about.
01:07:22.750 --> 01:07:29.020
I can partition into three
classes, in which I only have
01:07:29.020 --> 01:07:32.906
transitions from one of these
to the other one.
01:07:36.720 --> 01:07:41.440
So there's no transition from
this subclass to itself.
01:07:41.440 --> 01:07:47.270
And the only transitions
are from one
01:07:47.270 --> 01:07:48.540
class to the next one.
01:07:51.290 --> 01:07:52.790
So this is sort of intuitive.
01:07:52.790 --> 01:07:59.430
So just looking at three states,
if the period is 3, I
01:07:59.430 --> 01:08:04.060
can partition it in this way.
01:08:04.060 --> 01:08:08.300
Or I can have it like two of
these in this case, where I
01:08:08.300 --> 01:08:11.222
have like this.
01:08:11.222 --> 01:08:11.840
You see?
01:08:11.840 --> 01:08:14.190
So there's a transition from
this to these two states, and
01:08:14.190 --> 01:08:16.930
from these two states to here,
and from here to here.
01:08:16.930 --> 01:08:19.720
But I cannot have any transition
from here to
01:08:19.720 --> 01:08:23.979
itself, or from here to
here, or here to here.
01:08:23.979 --> 01:08:28.460
Just look at the text for the
proof and illustration.
01:08:28.460 --> 01:08:35.195
But there exists this kind
of partitioning.
01:08:35.195 --> 01:08:36.445
You should know that.
01:08:40.220 --> 01:08:44.300
Yeah, so if I am in a class, in
a subclass, the next state
01:08:44.300 --> 01:08:47.500
will be in the next subclass,
for sure.
01:08:47.500 --> 01:08:50.640
So I know this subclass
in [INAUDIBLE].
01:08:50.640 --> 01:08:53.950
And in two steps, I will be
in the other [INAUDIBLE].
01:08:53.950 --> 01:08:57.710
So you know the subclass
that I will be in
01:08:57.710 --> 01:09:04.840
at step nd plus m.
01:09:04.840 --> 01:09:08.970
So the other definition that
you can say is that, again,
01:09:08.970 --> 01:09:13.420
you can choose one of the
states in a class that I
01:09:13.420 --> 01:09:14.359
talked about.
01:09:14.359 --> 01:09:18.220
So let's say that I
choose a state 1.
01:09:18.220 --> 01:09:20.130
And I define Sm--
01:09:20.130 --> 01:09:23.200
Sm corresponds to
that subclass--
01:09:23.200 --> 01:09:26.240
are all the j's where P1j in nd plus m
steps is positive for some n.
01:09:42.399 --> 01:09:47.450
So I told you that I will have
d subclasses, and I can define each
01:09:47.450 --> 01:09:49.580
subclass like this.
01:09:49.580 --> 01:09:54.320
So this is all the possible
states that I can be in after nd
01:09:54.320 --> 01:09:57.260
plus m steps.
01:09:57.260 --> 01:10:01.320
So starting from state 1, in
nd plus m steps, I can be in
01:10:01.320 --> 01:10:03.690
this set of states, for some n.
01:10:03.690 --> 01:10:05.090
So n can be big.
01:10:05.090 --> 01:10:09.628
But anyway, I call this
the subclass number m.
01:10:12.830 --> 01:10:24.865
So let's just talk
about something.
01:10:28.680 --> 01:10:30.990
So I said that in order to
characterize the Markov
01:10:30.990 --> 01:10:37.460
chains, I need to show you the
initial state or initial state
01:10:37.460 --> 01:10:38.220
distribution.
01:10:38.220 --> 01:10:42.940
So it can be deterministic,
like I start from some
01:10:42.940 --> 01:10:45.660
specific state all the time,
or there can be some
01:10:45.660 --> 01:10:49.460
distribution, like px0, meaning
that this is the
01:10:49.460 --> 01:10:53.490
distribution that I would have
at my initial state.
01:10:53.490 --> 01:10:59.720
I don't have it here, but using,
I think, chain rules,
01:10:59.720 --> 01:11:03.230
we can easily find a
distribution of states at each
01:11:03.230 --> 01:11:10.500
time instant by having the
transition probabilities and
01:11:10.500 --> 01:11:13.260
the initial state
distribution.
01:11:13.260 --> 01:11:15.230
I just wrote it for you.
01:11:15.230 --> 01:11:19.990
So for characterizing the Markov
chain, I just need to
01:11:19.990 --> 01:11:22.650
tell you the transition
probabilities
01:11:22.650 --> 01:11:26.570
and the initial state.
01:11:26.570 --> 01:11:31.480
So a very good question is that,
is there any kind of
01:11:31.480 --> 01:11:40.330
stable behavior when time goes,
like in very far future?
01:11:40.330 --> 01:11:45.490
So I had an example here,
just looking at here.
01:11:45.490 --> 01:11:50.150
So this is a very simple thing
that I can say about this
01:11:50.150 --> 01:11:51.370
Markov chain.
01:11:51.370 --> 01:11:55.370
I know that in the very far
future, there is a 0
01:11:55.370 --> 01:11:59.680
probability that
I'm in state 6.
01:11:59.680 --> 01:12:00.380
And you know why?
01:12:00.380 --> 01:12:03.690
Because any time that I am in
state 6, I will go out of it
01:12:03.690 --> 01:12:05.060
with probability 1.
01:12:05.060 --> 01:12:08.330
So there's no chance that
I will be there.
01:12:08.330 --> 01:12:20.610
Or I can say that, if I start
from state 4, in very, very
01:12:20.610 --> 01:12:27.444
far future, I will not be in
state 1, 4, or 5, for sure.
01:12:27.444 --> 01:12:29.030
You know why?
01:12:29.030 --> 01:12:33.020
Because there is a positive
probability of going out of
01:12:33.020 --> 01:12:35.740
these three states,
like here.
01:12:35.740 --> 01:12:40.080
And then if I go out of it,
I can never come back.
01:12:40.080 --> 01:12:44.910
So there's a chance of going
from state 4 to state 2.
01:12:44.910 --> 01:12:47.360
And if I ever go there, I
will never come back.
01:12:50.180 --> 01:12:55.830
So these are the statements that
I can say about the steady-state
01:12:55.830 --> 01:13:00.920
behavior or the very,
very far future behavior
01:13:00.920 --> 01:13:02.300
of the Markov chain.
01:13:02.300 --> 01:13:07.690
So the question is that, can
we always say these kind of
01:13:07.690 --> 01:13:09.200
statements?
01:13:09.200 --> 01:13:12.160
And what kind of statements,
actually, we can have?
01:13:12.160 --> 01:13:16.090
And so you see that, for
example, for a state six
01:13:16.090 --> 01:13:16.550
[INAUDIBLE]
01:13:16.550 --> 01:13:20.870
example, I could say that the
probability of being in state
01:13:20.870 --> 01:13:24.620
6 as n goes to infinity is 0.
01:13:24.620 --> 01:13:27.620
So can I always have some kind
of probability distribution
01:13:27.620 --> 01:13:32.910
over the states in the future
as n goes to infinity?
01:13:32.910 --> 01:13:35.780
This is a very good question.
01:13:35.780 --> 01:13:39.020
And actually, it's related to a
lot of applications that we
01:13:39.020 --> 01:13:42.600
have for Markov chains, like
the queuing theory.
01:13:42.600 --> 01:13:45.700
And you can have queues
for almost anything.
01:13:45.700 --> 01:13:50.800
So one of the most fundamental
and interesting classes of
01:13:50.800 --> 01:13:55.870
states are the states that are
called ergodic, meaning that
01:13:55.870 --> 01:13:57.560
they are recurrent
and aperiodic.
01:14:00.240 --> 01:14:09.230
So if I have a Markov chain that
has only one class, and
01:14:09.230 --> 01:14:14.430
this class is recurrent and
aperiodic, then we call it an
01:14:14.430 --> 01:14:15.980
ergodic Markov chain.
01:14:15.980 --> 01:14:21.880
So we had two theorems saying
that if a state in a class is
01:14:21.880 --> 01:14:25.170
recurrent, then all the
states are recurrent.
01:14:25.170 --> 01:14:30.010
And if a state in a class is
aperiodic, then all the states
01:14:30.010 --> 01:14:31.550
are aperiodic.
01:14:31.550 --> 01:14:37.070
So we can say that some classes
are ergodic and some
01:14:37.070 --> 01:14:38.970
are not ergodic.
01:14:38.970 --> 01:14:42.940
So if the Markov chain has only
one class, and this class
01:14:42.940 --> 01:14:46.460
is aperiodic and recurrent,
then we call it an ergodic
01:14:46.460 --> 01:14:47.980
Markov chain.
01:14:47.980 --> 01:14:58.970
And the very important and
nice property of ergodic
01:14:58.970 --> 01:15:03.200
Markov chains is that they
lose memory as n goes to
01:15:03.200 --> 01:15:06.740
infinity, meaning that whatever
initial distribution
01:15:06.740 --> 01:15:11.240
that I have for the initial
state, I will
01:15:11.240 --> 01:15:12.160
lose memory of that.
01:15:12.160 --> 01:15:14.500
So whatever state that I start
in, or whatever distribution
01:15:14.500 --> 01:15:23.200
that I start in, after a while,
for a large enough n,
01:15:23.200 --> 01:15:26.940
the distribution of states
does not depend on that.
01:15:26.940 --> 01:15:31.760
So again, looking at that chain
rule, I could say that I
01:15:31.760 --> 01:15:34.760
can find the probability
distribution of x by looking
01:15:34.760 --> 01:15:36.010
at this thing.
01:15:40.400 --> 01:15:45.420
And looking at these
recursively, it all depends on
01:15:45.420 --> 01:15:49.540
P of the initial distribution.
01:15:49.540 --> 01:15:57.140
And the ergodic Markov chains
have this property that, after
01:15:57.140 --> 01:15:59.890
a while, this distribution
doesn't depend on initial
01:15:59.890 --> 01:16:01.220
distribution anymore.
01:16:01.220 --> 01:16:02.850
So they lose memory.
01:16:02.850 --> 01:16:05.920
And actually, usually we can
calculate the stable
01:16:05.920 --> 01:16:06.490
distribution.
01:16:06.490 --> 01:16:11.140
So this thing goes to a limit
which is called pj.
01:16:11.140 --> 01:16:14.470
So the important thing is that
it doesn't depend on i.
01:16:14.470 --> 01:16:17.780
It doesn't depend on
where I start.
01:16:17.780 --> 01:16:20.960
Eventually, I will converge
to the distribution.
01:16:20.960 --> 01:16:25.070
And then this distribution
doesn't change.
01:16:25.070 --> 01:16:29.610
So for a large enough n, I have
this property for ergodic
01:16:29.610 --> 01:16:30.860
Markov chains.
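The loss of memory can be seen numerically on a small example. The two-state transition matrix `P` below is hypothetical (it is not from the lecture); it has a single recurrent, aperiodic class, so it is ergodic. The sketch pushes two different initial distributions through the chain with the one-step update and compares where they end up.

```python
def step(dist, P):
    """One step of the chain: new_dist[j] = sum over i of dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def propagate(dist, P, n_steps):
    """Distribution over states after n_steps, starting from dist."""
    for _ in range(n_steps):
        dist = step(dist, P)
    return dist

# Hypothetical ergodic chain on two states (each row sums to 1).
P = [[0.9, 0.1],
     [0.5, 0.5]]

from_state_0 = propagate([1.0, 0.0], P, 50)  # start in the first state for sure
from_state_1 = propagate([0.0, 1.0], P, 50)  # start in the second state for sure
print(from_state_0)
print(from_state_1)
```

Both runs settle near (5/6, 1/6), which is the distribution that satisfies pi = pi P for this matrix, so the starting state no longer matters after enough steps.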
01:16:32.925 --> 01:16:35.050
We will have a lot
to do with these
01:16:35.050 --> 01:16:37.380
properties in the future.
01:16:37.380 --> 01:16:47.990
So I was saying that this pj,
pi j, should be positive.
01:16:47.990 --> 01:16:54.500
So being in each state has
a non-0 probability.
01:16:54.500 --> 01:17:01.730
The first thing that I need to
prove is that Pij to the n is
01:17:01.730 --> 01:17:07.620
nonzero for a large enough n for
all j and all the initial
01:17:07.620 --> 01:17:09.250
distributions.
01:17:09.250 --> 01:17:11.580
And I want to prove that
this is true for
01:17:11.580 --> 01:17:12.920
ergodic Markov chains.
01:17:12.920 --> 01:17:14.170
This is not true, generally.
01:17:16.852 --> 01:17:24.180
Well, this is more a
combinatorial issue, but there
01:17:24.180 --> 01:17:28.480
is a theorem here which says
that for an ergodic Markov
01:17:28.480 --> 01:17:35.880
chain, for all n greater than
this value, I have non-0
01:17:35.880 --> 01:17:39.300
probability of going
from i to j.
01:17:39.300 --> 01:17:43.190
So the thing that you should be
careful about here is that it's for
01:17:43.190 --> 01:17:45.850
all n greater than this value.
01:17:45.850 --> 01:18:02.860
So for going from a state 1 to
1, I can do it in six steps
01:18:02.860 --> 01:18:05.360
and 12 steps and so on.
01:18:05.360 --> 01:18:11.170
But I cannot do it in
24 steps, I think.
01:18:11.170 --> 01:18:17.720
I cannot go from 1
to 1 in 25 steps.
01:18:17.720 --> 01:18:24.470
But I can go from 1 to 1 in
26, 27, 28, 29, to 30.
01:18:24.470 --> 01:18:28.300
So for n greater than or equal to M minus
1 squared plus 1, where M is the number of states, I can go
01:18:28.300 --> 01:18:30.880
from any state to any state.
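The bound can be checked by brute force on a small graph. The sketch below assumes a six-state chain made of a 6-cycle 1, 2, ..., 6, 1 plus one extra transition from 6 to 2, which creates a 5-cycle. This is a guess at the kind of chain on the slide, not a copy of it. States are 0-indexed in the code, so state 1 is index 0. The helper names `bool_mult` and `positive_pattern` are hypothetical.

```python
def bool_mult(A, B):
    """Boolean matrix product: entry (i, j) is True if some k has A[i][k] and B[k][j]."""
    n = len(A)
    return [[any(A[i][k] and B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def positive_pattern(adj_matrix, n_steps):
    """Boolean matrix whose (i, j) entry says whether an n_steps walk from i to j exists."""
    result = adj_matrix
    for _ in range(n_steps - 1):
        result = bool_mult(result, adj_matrix)
    return result

M = 6
A = [[False] * M for _ in range(M)]
for i in range(M):
    A[i][(i + 1) % M] = True  # the 6-cycle: 1 -> 2 -> ... -> 6 -> 1
A[M - 1][1] = True            # assumed extra transition 6 -> 2 (a 5-cycle)

bound = (M - 1) ** 2 + 1      # 26 for M = 6
print(positive_pattern(A, bound - 1)[0][0])                 # False: no 25-step return to state 1
print(all(all(row) for row in positive_pattern(A, bound)))  # True: every pair is connected in 26 steps
```

In this assumed graph, a return to state 1 must use the 6-cycle at least once and the 5-cycle any number of times, so its length has the form 6a + 5b with a at least 1. The value 25 cannot be written that way, while every value from 26 on can.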
01:18:30.880 --> 01:18:38.770
So I think this bound
is tight for state 4.
01:18:41.630 --> 01:18:43.890
Sorry, maybe what I said
is true for state 4.
01:18:49.614 --> 01:18:50.580
Yeah.
01:18:50.580 --> 01:18:53.770
So I'm not going to prove
this theorem.
01:18:53.770 --> 01:18:56.130
Well, actually you don't
have the proof
01:18:56.130 --> 01:18:57.567
in the notes, either.
01:18:57.567 --> 01:19:05.910
But yeah, you can look at the
example and the cases that are
01:19:05.910 --> 01:19:08.570
discussed in the notes.
01:19:08.570 --> 01:19:09.820
Are there any questions?
01:19:12.870 --> 01:19:14.120
Fine?