1
00:00:00,530 --> 00:00:02,960
The following content is
provided under a Creative
2
00:00:02,960 --> 00:00:04,370
Commons license.
3
00:00:04,370 --> 00:00:07,410
Your support will help MIT
OpenCourseWare continue to
4
00:00:07,410 --> 00:00:11,060
offer high-quality educational
resources for free.
5
00:00:11,060 --> 00:00:13,960
To make a donation or view
additional materials from
6
00:00:13,960 --> 00:00:19,790
hundreds of MIT courses, visit
MIT OpenCourseWare at
7
00:00:19,790 --> 00:00:21,040
ocw.mit.edu.
8
00:00:22,860 --> 00:00:23,150
SHAN-YUAN HO: OK.
9
00:00:23,150 --> 00:00:24,860
So today's lecture
is going to be on
10
00:00:24,860 --> 00:00:26,570
Finite-state Markov Chains.
11
00:00:26,570 --> 00:00:28,500
And we're going to use
the matrix approach.
12
00:00:28,500 --> 00:00:32,180
So in the last lecture, we saw
that a Markov chain can
13
00:00:32,180 --> 00:00:35,760
be represented as a directed
graph or as a matrix.
14
00:00:35,760 --> 00:00:40,860
So the outline is we will look
at this transition matrix and
15
00:00:40,860 --> 00:00:42,220
its powers.
16
00:00:42,220 --> 00:00:45,640
And then we'll want to know
whether this p of n is going
17
00:00:45,640 --> 00:00:48,490
to converge for very,
very large n.
18
00:00:48,490 --> 00:00:53,790
Then we will extend this to
Ergodic Markov chains, Ergodic
19
00:00:53,790 --> 00:00:57,460
unichains, and other
finite-state Markov chains.
20
00:00:57,460 --> 00:01:00,170
So remember, by Markovity,
in these Markov chains, the
21
00:01:00,170 --> 00:01:03,250
effect of the past on the future
is totally summarized
22
00:01:03,250 --> 00:01:04,269
by its state.
23
00:01:04,269 --> 00:01:07,590
So we want to analyze the
probabilities of properties of
24
00:01:07,590 --> 00:01:09,150
the sequence of these states.
25
00:01:09,150 --> 00:01:12,430
So whatever the state you are
in, all the past is totally
26
00:01:12,430 --> 00:01:13,330
summarized in that state.
27
00:01:13,330 --> 00:01:15,970
And that's the only thing
that affects the future.
28
00:01:15,970 --> 00:01:20,310
So an ergodic Markov chain is
a Markov chain that has a
29
00:01:20,310 --> 00:01:23,150
single recurrent class
and is aperiodic.
30
00:01:23,150 --> 00:01:25,150
So this chain doesn't contain
any transient states.
31
00:01:25,150 --> 00:01:28,000
And it doesn't contain
any periodicity.
32
00:01:28,000 --> 00:01:32,455
So an ergodic unichain is just an
ergodic Markov chain, but it
33
00:01:32,455 --> 00:01:34,176
has some transient
states in it.
34
00:01:36,990 --> 00:01:41,370
So the state x sub n of this
Markov chain at step n depends
35
00:01:41,370 --> 00:01:43,450
only on the past through
the previous step.
36
00:01:43,450 --> 00:01:47,500
So for n steps, we want
to be at state j.
37
00:01:47,500 --> 00:01:48,810
And then we have this path.
38
00:01:48,810 --> 00:01:51,510
x sub n minus 1 is i, and
so forth, up to x0.
39
00:01:51,510 --> 00:01:54,040
It's just the probability
from i to j, from
40
00:01:54,040 --> 00:01:55,620
state i to state j.
41
00:01:55,620 --> 00:01:59,800
So this means that we can write
the joint probability of
42
00:01:59,800 --> 00:02:02,310
all these states that we're in,
so x0, x1, all the way up
43
00:02:02,310 --> 00:02:07,110
to xn, as a function of these
transition probabilities.
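As a quick illustration of this factorization, here is a minimal Python sketch; the two-state chain and the initial distribution below are hypothetical, not the chain on the slide:

```python
# Hypothetical two-state chain; row i holds the transition
# probabilities out of state i.
P = [[0.25, 0.75],
     [0.75, 0.25]]
p0 = [0.5, 0.5]  # assumed initial distribution over states 0 and 1

def path_probability(path, p0, P):
    """Joint probability Pr{X0=path[0], ..., Xn=path[n]}:
    the initial probability times the one-step transitions."""
    prob = p0[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= P[a][b]
    return prob
```

So the probability of the path 0, 1, 1 is just 0.5 times 0.75 times 0.25.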
44
00:02:07,110 --> 00:02:10,060
So in this transition
probability matrix, we can
45
00:02:10,060 --> 00:02:13,950
represent these transition
probabilities.
46
00:02:13,950 --> 00:02:19,220
We see that here, in this
example, this is a 6-state
47
00:02:19,220 --> 00:02:20,210
Markov chain.
48
00:02:20,210 --> 00:02:23,380
So if I want to go from, say,
state 2 to state 1 in one
49
00:02:23,380 --> 00:02:26,780
step, it would just
be p of 2,1.
50
00:02:26,780 --> 00:02:29,990
If I want to go from
state 6 to itself--
51
00:02:29,990 --> 00:02:33,380
this is last one, which
is p of 6,6.
52
00:02:33,380 --> 00:02:35,700
So this is a probability
transition matrix.
53
00:02:35,700 --> 00:02:38,960
So if we condition on the state
at time 0 and then we
54
00:02:38,960 --> 00:02:44,850
define this P sub ij of n as
the probability that we're
55
00:02:44,850 --> 00:02:49,410
in state j at the n-th step,
given that we start x0 is
56
00:02:49,410 --> 00:02:54,920
equal to i, let's look at what
happens when n is equal to 2.
57
00:02:54,920 --> 00:02:59,580
So in a 2-step transition,
we go from i to j.
58
00:02:59,580 --> 00:03:04,100
It's just the probability that
at step 2, x2 is equal to j,
59
00:03:04,100 --> 00:03:06,870
x1 is equal to some k,
and x0 is equal to i.
60
00:03:06,870 --> 00:03:09,420
So remember, we started
in state i.
61
00:03:09,420 --> 00:03:13,210
But this has to be multiplied
by probability that x1 is
62
00:03:13,210 --> 00:03:15,490
equal to k, given that
x0 is equal to i.
63
00:03:15,490 --> 00:03:18,500
And we have to sum this over all
the states k, in order to
64
00:03:18,500 --> 00:03:21,754
get the total probability
from--
65
00:03:21,754 --> 00:03:22,670
Oh, stand back?
66
00:03:22,670 --> 00:03:23,590
OK.
67
00:03:23,590 --> 00:03:24,160
There.
68
00:03:24,160 --> 00:03:25,860
OK.
69
00:03:25,860 --> 00:03:28,770
So this is just probability
of ij in two steps.
70
00:03:28,770 --> 00:03:31,850
It's just the probability of
i going to k times the
71
00:03:31,850 --> 00:03:36,700
probability of k going to j,
summed over all k states.
72
00:03:36,700 --> 00:03:42,270
So we notice that this term
right here, the sum over k of
73
00:03:42,270 --> 00:03:45,690
P sub ik times P sub kj, is just
the ij term of the product of the transition
74
00:03:45,690 --> 00:03:47,480
matrix P with itself.
75
00:03:47,480 --> 00:03:49,425
So we represent this
as P squared.
76
00:03:49,425 --> 00:03:51,760
So we multiply the transition
matrix by itself.
77
00:03:51,760 --> 00:03:55,300
This gives us the 2-step
transition matrix of this
78
00:03:55,300 --> 00:03:56,320
Markov chain.
79
00:03:56,320 --> 00:04:01,400
So if you want to go from i to j,
you just look at the ij element in
80
00:04:01,400 --> 00:04:02,400
this matrix.
81
00:04:02,400 --> 00:04:04,240
And that gives you the
probability in two steps,
82
00:04:04,240 --> 00:04:05,490
going from state i to state j.
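A minimal Python sketch of this 2-step computation; the matrix is a hypothetical two-state example, not the 6-state chain on the slide:

```python
# Hypothetical two-state transition matrix (rows sum to 1).
P = [[0.25, 0.75],
     [0.75, 0.25]]

def mat_mul(A, B):
    """Product of two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

# 2-step transition matrix: entry (i, j) is
# the sum over k of P[i][k] * P[k][j].
P2 = mat_mul(P, P)
```

The (i, j) entry of P2 is then the probability of going from state i to state j in exactly two steps.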
83
00:04:10,200 --> 00:04:14,060
So for n, we just iterate
on this for
84
00:04:14,060 --> 00:04:16,220
successively larger n.
85
00:04:16,220 --> 00:04:23,520
So for n steps, to get from state
i to state j, we just
86
00:04:23,520 --> 00:04:27,680
have this probability that x sub n
equals j and that at the
87
00:04:27,680 --> 00:04:31,950
previous step, x sub n minus 1,
we were in some state k,
88
00:04:31,950 --> 00:04:34,940
given x0 is equal to i,
summing over all k.
89
00:04:34,940 --> 00:04:38,700
So this means that we broke
up the n-th step.
90
00:04:38,700 --> 00:04:41,200
At step n minus 1,
we visited state k.
91
00:04:41,200 --> 00:04:43,730
And then we multiply by the
one-step transition from k to
92
00:04:43,730 --> 00:04:46,560
j because we want to arrive
at j starting at i.
93
00:04:46,560 --> 00:04:50,570
But again, we have to sum over
all the k's in order to get
94
00:04:50,570 --> 00:04:54,610
the probability from
i to j in n steps.
95
00:04:54,610 --> 00:04:59,960
So p of n right here, this
representation is just the
96
00:04:59,960 --> 00:05:02,770
transition matrix multiplied
by itself n times.
97
00:05:02,770 --> 00:05:05,430
And this gives you the n-th step
transition probabilities
98
00:05:05,430 --> 00:05:06,890
of this Markov chain.
99
00:05:06,890 --> 00:05:09,860
So computationally, what you do
is you take p, p squared, p
100
00:05:09,860 --> 00:05:10,680
to the fourth.
101
00:05:10,680 --> 00:05:13,600
If you wanted P to the 9th,
you'd just take P to the eighth
102
00:05:13,600 --> 00:05:17,570
and multiply it
by P once more.
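That squaring trick can be sketched in Python; the matrix values below are a hypothetical two-state example:

```python
# Hypothetical two-state transition matrix.
P = [[0.25, 0.75],
     [0.75, 0.25]]

def mat_mul(A, B):
    """Product of two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def mat_pow(P, n):
    """P**n by repeated squaring: compute P, P^2, P^4, P^8, ...
    and multiply together the squares picked out by n's bits."""
    result = None
    square = P
    while n > 0:
        if n & 1:
            result = square if result is None else mat_mul(result, square)
        square = mat_mul(square, square)
        n >>= 1
    return result
```

For P to the 9th, this squares up to P to the 8th and then multiplies by P once, instead of doing eight multiplications in a row.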
103
00:05:17,570 --> 00:05:20,140
So this gives us this thing
called the Chapman-Kolmogorov
104
00:05:20,140 --> 00:05:23,580
equations, which means that when
we want to go from state i
105
00:05:23,580 --> 00:05:27,800
to state j, we can go through an
intermediate state and then
106
00:05:27,800 --> 00:05:29,520
sum over all the choices of the
107
00:05:29,520 --> 00:05:30,890
intermediate state.
108
00:05:30,890 --> 00:05:35,980
So in this case, if the step is
m plus n transition, we can
109
00:05:35,980 --> 00:05:38,330
break it up into m and n.
110
00:05:38,330 --> 00:05:41,670
So it's the probability that it
goes from i to k in exactly
111
00:05:41,670 --> 00:05:45,660
m steps and k to j in n steps,
summing over all the k's that
112
00:05:45,660 --> 00:05:50,230
it visits on its way
from i to j.
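A small numerical check of the Chapman-Kolmogorov identity, sketched in Python with a hypothetical two-state chain:

```python
# Hypothetical two-state transition matrix.
P = [[0.25, 0.75],
     [0.75, 0.25]]

def mat_mul(A, B):
    """Product of two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

# P^(m+n) = P^m * P^n: split a 5-step transition as 2+3
# or as 1+4; both splits must give the same matrix.
P2 = mat_mul(P, P)
P3 = mat_mul(P2, P)
P4 = mat_mul(P2, P2)
P5_a = mat_mul(P2, P3)  # m = 2, n = 3
P5_b = mat_mul(P, P4)   # m = 1, n = 4
```

However you split the walk at an intermediate time, summing over the intermediate state gives the same m+n step probabilities.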
113
00:05:50,230 --> 00:05:53,110
So this is a very useful
quantity with which we can
114
00:05:53,110 --> 00:05:56,280
manipulate our transition
probabilities when we get
115
00:05:56,280 --> 00:05:59,180
higher orders of n.
116
00:05:59,180 --> 00:06:00,840
So the convergence
of p to the n.
117
00:06:00,840 --> 00:06:04,250
So a very important question we
like to ask is as n goes to
118
00:06:04,250 --> 00:06:07,960
infinity whether this goes
to a limit or not.
119
00:06:07,960 --> 00:06:12,810
In other words, does the
initial state matter--
120
00:06:12,810 --> 00:06:16,380
do any of the initial states
matter in this Markov chain?
121
00:06:16,380 --> 00:06:17,750
So the Markov chain is going
to go on for a long, long,
122
00:06:17,750 --> 00:06:19,280
long, long, long time.
123
00:06:19,280 --> 00:06:22,350
And at the n-th state where n is
very large, is it going to
124
00:06:22,350 --> 00:06:25,980
depend on i?
125
00:06:25,980 --> 00:06:29,840
Or is it going to depend on n,
which is the number of steps?
126
00:06:29,840 --> 00:06:33,820
If it goes to this quantity,
some limit, then it won't
127
00:06:33,820 --> 00:06:35,040
depend on this.
128
00:06:35,040 --> 00:06:37,610
So let's assume that
this limit exists.
129
00:06:37,610 --> 00:06:43,140
If this limit does exist, we can
take the sum of this limit
130
00:06:43,140 --> 00:06:48,860
and then multiply it by p of
jk, summing over all j.
131
00:06:48,860 --> 00:06:50,790
So we do a sum of over j.
132
00:06:50,790 --> 00:06:53,490
So we're going from j to
k on both sides, and we
133
00:06:53,490 --> 00:06:54,790
sum over all j.
134
00:06:54,790 --> 00:06:57,020
So we take this limit
right here.
135
00:06:57,020 --> 00:07:02,180
We notice that this left side,
going from i to k in n plus 1
136
00:07:02,180 --> 00:07:06,500
steps, converges to this
limit at state k.
137
00:07:06,500 --> 00:07:08,790
Because we assumed up here
that this exists for
138
00:07:08,790 --> 00:07:10,190
all i and all j.
139
00:07:10,190 --> 00:07:14,086
So therefore, if we take the n
plus 1 step, we take this n
140
00:07:14,086 --> 00:07:18,860
going to infinity of i to k,
it has to go to pi of k.
141
00:07:18,860 --> 00:07:20,790
So when we do this,
we could simplify
142
00:07:20,790 --> 00:07:22,570
this equation up here.
143
00:07:22,570 --> 00:07:27,810
And if it does exist, we have
this pi sub k for all the
144
00:07:27,810 --> 00:07:29,370
states in the Markov chain.
145
00:07:29,370 --> 00:07:30,780
So this is just a vector.
146
00:07:30,780 --> 00:07:34,885
So pi sub k is equal to pi sub j
times the probability from j
147
00:07:34,885 --> 00:07:37,120
to k, summed over all j.
148
00:07:37,120 --> 00:07:39,490
So if you have an m state Markov
chain, you have exactly
149
00:07:39,490 --> 00:07:42,160
m of these equations.
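For a two-state chain, these equations have a simple closed form; here is a Python sketch, with hypothetical entries not taken from the slide:

```python
# Hypothetical two-state chain (rows sum to 1).
P = [[0.5, 0.5],
     [0.75, 0.25]]

# Solving pi[k] = sum over j of pi[j] * P[j][k], with pi
# summing to 1, gives this closed form for two states:
p12, p21 = P[0][1], P[1][0]
pi = [p21 / (p12 + p21), p12 / (p12 + p21)]
```

The vector pi is proportional to the opposite crossing probabilities, normalized to sum to 1, and it satisfies every one of the steady-state equations.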
150
00:07:42,160 --> 00:07:45,280
And this one, we'll call it the
vector pi, which consists
151
00:07:45,280 --> 00:07:50,460
of each element of this
equation, if the limit is
152
00:07:50,460 --> 00:07:51,000
going to exist.
153
00:07:51,000 --> 00:07:52,400
But we don't know whether
it does or not, at
154
00:07:52,400 --> 00:07:54,750
this point in time.
155
00:07:54,750 --> 00:07:57,010
So if it does exist, what's
going to happen?
156
00:07:57,010 --> 00:08:00,770
So that means I'm going to
multiply this probability
157
00:08:00,770 --> 00:08:03,940
matrix, P times P times P,
and so on, all the way.
158
00:08:03,940 --> 00:08:07,980
And if the limit exists, then
that means for each row, they
159
00:08:07,980 --> 00:08:09,490
must be all identical.
160
00:08:09,490 --> 00:08:14,610
Because we said the limit
exists, then going from 1 to
161
00:08:14,610 --> 00:08:16,855
j, 2 to j, 3 to j, 4 to
j, they should be
162
00:08:16,855 --> 00:08:18,050
all exactly the same.
163
00:08:18,050 --> 00:08:20,600
This is also the equivalent of
saying, when I look at this
164
00:08:20,600 --> 00:08:24,190
large limit, as n is very,
very large if the limit
165
00:08:24,190 --> 00:08:27,040
exists, that all the elements
within each column should be
166
00:08:27,040 --> 00:08:30,170
exactly the same--equivalently,
all the rows are identical.
167
00:08:30,170 --> 00:08:33,220
So the elements are equal to
each other, or all the rows,
168
00:08:33,220 --> 00:08:37,419
if I look at the row, which is
going to be this pi vector.
169
00:08:37,419 --> 00:08:39,220
They should be the same.
170
00:08:39,220 --> 00:08:40,400
So we define this vector.
171
00:08:40,400 --> 00:08:43,340
If this limit exists, the
probability vector
172
00:08:43,340 --> 00:08:45,830
is this vector pi.
173
00:08:45,830 --> 00:08:48,930
Because we said it was an
m state Markov chain.
174
00:08:48,930 --> 00:08:52,060
Each pi sub i is non-negative,
and they obviously have
175
00:08:52,060 --> 00:08:53,620
to sum up to 1.
176
00:08:53,620 --> 00:08:57,380
So this is what we call a
probability vector, called the
177
00:08:57,380 --> 00:08:59,020
steady-state vector,
for this transition
178
00:08:59,020 --> 00:09:01,690
matrix P, if it exists.
179
00:09:01,690 --> 00:09:06,290
So what happens is this limit
is easy to study.
180
00:09:06,290 --> 00:09:14,030
Later in the course,
we will study these pi P, this
181
00:09:14,030 --> 00:09:18,130
steady-state vector for
various Markov chains.
182
00:09:18,130 --> 00:09:21,870
And so you see, it is quite
interesting--many things
183
00:09:21,870 --> 00:09:24,930
come out of it.
184
00:09:24,930 --> 00:09:29,330
So we notice that this
solution can
185
00:09:29,330 --> 00:09:31,600
contain more than one.
186
00:09:31,600 --> 00:09:32,460
It may not be unique.
187
00:09:32,460 --> 00:09:34,880
So if it contains more than one,
it's very possible that
188
00:09:34,880 --> 00:09:37,000
it has more than one solution,
more than one probability
189
00:09:37,000 --> 00:09:37,880
vector solution.
190
00:09:37,880 --> 00:09:40,450
But just because a solution
exists to that, it doesn't
191
00:09:40,450 --> 00:09:43,010
mean that this limit exists.
192
00:09:43,010 --> 00:09:44,840
So we have to prove that the
limit exists, first.
193
00:09:47,890 --> 00:09:54,490
So for an ergodic Markov chain,
here we have another way to
194
00:09:54,490 --> 00:10:01,320
express that this matrix
converges: the rows
195
00:10:01,320 --> 00:10:02,570
become identical--
196
00:10:05,260 --> 00:10:08,360
the elements in each column are
all the same for each i.
197
00:10:08,360 --> 00:10:09,530
So we have this theorem.
198
00:10:09,530 --> 00:10:12,950
And today's lecture is going to
be devoted entirely to this theorem.
199
00:10:12,950 --> 00:10:16,450
This theorem says that if you
have an ergodic finite-state
200
00:10:16,450 --> 00:10:18,890
Markov chain-- so when we say
"ergodic," remember it means
201
00:10:18,890 --> 00:10:22,540
that there's only one class,
every single state in this is
202
00:10:22,540 --> 00:10:24,290
recurrent, you have no transient
states, and you have
203
00:10:24,290 --> 00:10:25,030
no periodicity.
204
00:10:25,030 --> 00:10:27,330
So it's an aperiodic chain.
205
00:10:27,330 --> 00:10:32,590
And then for each j, if you take
the maximum path from i
206
00:10:32,590 --> 00:10:36,880
to j in n steps, this is
non-increasing in n.
207
00:10:36,880 --> 00:10:41,800
So in other words, this right
here, this is non-increasing.
208
00:10:41,800 --> 00:10:44,976
So I take the maximum, over the
starting state i, of the probability
209
00:10:44,976 --> 00:10:46,600
of reaching j in
exactly n steps.
210
00:10:46,600 --> 00:10:48,920
So that means this is
maximized over all
211
00:10:48,920 --> 00:10:50,180
initial states i.
212
00:10:50,180 --> 00:10:52,390
So it doesn't matter what
state you start in, and I take
213
00:10:52,390 --> 00:10:53,370
the maximum path.
214
00:10:53,370 --> 00:10:58,180
And if I increase n, and I take
maximum of that again,
215
00:10:58,180 --> 00:11:01,310
the maximum path, this
is not increasing.
216
00:11:01,310 --> 00:11:04,330
And the minimum is
non-decreasing in n.
217
00:11:04,330 --> 00:11:10,480
So as we take n, the path from
i to j, this n getting larger
218
00:11:10,480 --> 00:11:16,370
and larger, we have that the
maximum of this path, which is
219
00:11:16,370 --> 00:11:19,960
the most probable path,
is non-increasing.
220
00:11:19,960 --> 00:11:24,610
And then the minimum of this
path, the least likely path,
221
00:11:24,610 --> 00:11:26,340
is going to be non-decreasing.
222
00:11:26,340 --> 00:11:29,600
So we're wondering whether
this limit is going to
223
00:11:29,600 --> 00:11:30,460
converge or not.
224
00:11:30,460 --> 00:11:33,460
This theorem said that
for an ergodic finite-state
225
00:11:33,460 --> 00:11:36,000
Markov chain, this limit
actually does converge.
226
00:11:36,000 --> 00:11:39,850
So in other words, the lim sup
is equal to the lim inf of this and
227
00:11:39,850 --> 00:11:44,250
will equal pi sub j, which is
the steady-state distribution.
228
00:11:44,250 --> 00:11:46,670
And not only that, this
convergence is going to be
229
00:11:46,670 --> 00:11:49,440
exponential in n.
230
00:11:49,440 --> 00:11:53,230
So this is the theorem that
we will prove today.
231
00:11:53,230 --> 00:11:58,180
So the key to this theorem is
this pair of statements, that the
232
00:11:58,180 --> 00:12:01,010
most probable path from i
to j, given n steps--
233
00:12:01,010 --> 00:12:02,460
so this is the most
probable path--
234
00:12:02,460 --> 00:12:06,550
is non-increasing in n,
and the minimum is
235
00:12:06,550 --> 00:12:08,500
non-decreasing in n.
236
00:12:08,500 --> 00:12:12,320
So the proof is almost trivial,
but let's see what
237
00:12:12,320 --> 00:12:14,220
happens in this.
238
00:12:14,220 --> 00:12:18,410
So we have a probability
transition matrix.
239
00:12:18,410 --> 00:12:21,160
So this is the statement
right here.
240
00:12:21,160 --> 00:12:24,050
And the transition is just one
here and one here, with
241
00:12:24,050 --> 00:12:25,960
probability 1, 1.
242
00:12:25,960 --> 00:12:30,580
In this case, we want to ask:
what is the maximum probability
243
00:12:30,580 --> 00:12:34,490
that we end in state 2,
given n steps?
244
00:12:34,490 --> 00:12:37,510
So we know that the chain
alternates between states
245
00:12:37,510 --> 00:12:40,150
1 and 2, so this maximum is
non-increasing and non-decreasing--
246
00:12:40,150 --> 00:12:41,740
it's always the same.
247
00:12:41,740 --> 00:12:47,890
So those two bounds are
met with equality.
248
00:12:47,890 --> 00:12:48,660
So in this here.
249
00:12:48,660 --> 00:12:50,240
So the second example is this.
250
00:12:50,240 --> 00:12:52,690
We have a two-state
chain again.
251
00:12:52,690 --> 00:12:58,520
But this time, from 1 to 2, we
have the transition of 3/4.
252
00:12:58,520 --> 00:13:01,326
So that means that we have
a self-loop here of 1/4.
253
00:13:01,326 --> 00:13:03,090
See, the minute we put a
self-loop in here, it
254
00:13:03,090 --> 00:13:05,590
completely destroys
the periodicity.
255
00:13:05,590 --> 00:13:08,760
Any Markov chain, you put a
self-loop in it, and the
256
00:13:08,760 --> 00:13:09,910
periodicity is destroyed.
257
00:13:09,910 --> 00:13:12,050
So here we have 3/4.
258
00:13:12,050 --> 00:13:15,180
So this has to come
back with 1/4.
259
00:13:15,180 --> 00:13:16,220
All right.
260
00:13:16,220 --> 00:13:22,590
So in this one, let's look at
the n step going from 1 to 2.
261
00:13:22,590 --> 00:13:27,060
So basically, we want
to end up in state 2
262
00:13:27,060 --> 00:13:28,850
in exactly n steps.
263
00:13:28,850 --> 00:13:34,810
So when n is equal to 1,
what is the maximum?
264
00:13:34,810 --> 00:13:36,950
The maximum is if you start
in this state and then you
265
00:13:36,950 --> 00:13:37,890
went to state 2.
266
00:13:37,890 --> 00:13:40,600
The other alternative is you
start at state 2, and you stay
267
00:13:40,600 --> 00:13:41,040
in state 2.
268
00:13:41,040 --> 00:13:43,250
Because we want to end at state
2 in exactly one step.
269
00:13:43,250 --> 00:13:45,300
So the maximum is going to
be 3/4, and the minimum
270
00:13:45,300 --> 00:13:48,190
is going to be 1/4.
271
00:13:48,190 --> 00:13:50,170
You get n is equal to 2.
272
00:13:50,170 --> 00:13:53,430
Now we want to end up in
state 2 in two steps.
273
00:13:53,430 --> 00:13:56,360
So what is going to
be the maximum?
274
00:13:56,360 --> 00:13:58,960
The maximum is going
to be if you visit
275
00:13:58,960 --> 00:14:00,580
state 1 and then back.
276
00:14:00,580 --> 00:14:02,840
So n is equal to 1.
277
00:14:02,840 --> 00:14:08,850
Then P1 from 1 to 2
is equal to 3/4.
278
00:14:08,850 --> 00:14:15,830
So the probability from 1 to 2
in two steps is equal to 3/8.
279
00:14:18,690 --> 00:14:22,270
So it goes 1/4 times 3/4,
plus 3/4 times 1/4.
280
00:14:22,270 --> 00:14:23,280
It should be equal
to 3/8, right?
281
00:14:23,280 --> 00:14:25,510
Is that right?
282
00:14:25,510 --> 00:14:28,220
OK.
283
00:14:28,220 --> 00:14:32,570
And then for P1,2 of 3, if
there are three transitions
284
00:14:32,570 --> 00:14:35,770
from 1 to 2, then it's
equal to 9/16.
285
00:14:35,770 --> 00:14:38,900
So for state 2, if I want
to transition
286
00:14:38,900 --> 00:14:40,350
from 2 to 2 in n steps--
287
00:14:40,350 --> 00:14:44,030
so P2,2 in one step is equal to 1/4.
288
00:14:44,030 --> 00:14:47,090
So it just stayed by itself.
289
00:14:47,090 --> 00:14:54,600
So P2,2 in two steps, you
don't have a choice.
290
00:14:54,600 --> 00:14:55,850
You have to go from
3/4 to 3/4.
291
00:15:03,020 --> 00:15:03,580
So that's 9/16.
292
00:15:03,580 --> 00:15:07,970
But the other thing is, I can also
stay here, with 1/4 times 1/4.
293
00:15:07,970 --> 00:15:14,740
So that gives me 5/8
and so forth.
294
00:15:14,740 --> 00:15:18,305
So basically, the sequence going
from 1 to 2 is going to
295
00:15:18,305 --> 00:15:24,030
be oscillating between 3/4,
3/8, 9/16, and so forth.
296
00:15:24,030 --> 00:15:27,530
And then going from 2,2, it's
going to be oscillating too.
297
00:15:27,530 --> 00:15:30,440
We can see that's
1/4, 5/8, 7/16.
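These numbers are easy to check by computing powers of the example's matrix; a minimal Python sketch:

```python
# The lecture's two-state chain: P[0] = [1/4, 3/4], P[1] = [3/4, 1/4].
P = [[0.25, 0.75],
     [0.75, 0.25]]

def mat_mul(A, B):
    """Product of two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

# Collect P12(n) and P22(n) for n = 1, 2, 3.
p12, p22 = [], []
Pn = P
for _ in range(3):
    p12.append(Pn[0][1])
    p22.append(Pn[1][1])
    Pn = mat_mul(Pn, P)
```

This reproduces 3/4, 3/8, 9/16 for P12 and 1/4, 5/8, 7/16 for P22, both oscillating in toward 1/2.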
298
00:15:30,440 --> 00:15:34,980
So what happens is this
oscillation is going to
299
00:15:34,980 --> 00:15:37,860
converge-- it's going to
approach, actually, 1/2.
300
00:15:37,860 --> 00:15:44,540
So if we take the maximum of
these two, so P1,2 and P2,2--
301
00:15:44,540 --> 00:15:48,410
because that means that we're
going to end at state 2.
302
00:15:48,410 --> 00:15:53,060
And maximum over n steps, then
we just look at these two
303
00:15:53,060 --> 00:15:56,670
numbers, the 3/4 and 1/4, if we
want the maximum, then it's
304
00:15:56,670 --> 00:15:57,350
going to be 3/4.
305
00:15:57,350 --> 00:15:59,410
For the 3/8 and 5/8, the maximum
is going to be 5/8,
306
00:15:59,410 --> 00:16:02,690
the 9/16 and 7/16, the 9/16
will be the maximum.
307
00:16:02,690 --> 00:16:04,975
And similarly, we compare it,
and we take the minimum.
308
00:16:04,975 --> 00:16:07,730
And the minimum is 1/4,
3/8, and 7/16.
309
00:16:07,730 --> 00:16:10,620
So we see that the maximum
is going to be--
310
00:16:10,620 --> 00:16:12,150
it starts high.
311
00:16:12,150 --> 00:16:14,490
And then it's going to
decrease toward 1/2.
312
00:16:14,490 --> 00:16:16,970
And the minimum, what happens
is it's going to start low,
313
00:16:16,970 --> 00:16:21,060
and then it's going to
increase to 1/2.
314
00:16:21,060 --> 00:16:23,600
So this is exactly
this one here.
315
00:16:23,600 --> 00:16:26,800
So now let P be the transition
matrix of an arbitrary finite-state
316
00:16:26,800 --> 00:16:27,980
Markov chain.
317
00:16:27,980 --> 00:16:32,130
Then for each j, this
maximum path, the most probable
318
00:16:32,130 --> 00:16:34,630
path from i to j in n steps,
is non-increasing in n.
319
00:16:34,630 --> 00:16:36,860
And the minimum is
non-decreasing in n.
320
00:16:36,860 --> 00:16:40,480
So you take n plus 1
steps from i to j.
321
00:16:40,480 --> 00:16:43,570
So we're going to use that
Chapman-Kolmogorov equation.
322
00:16:43,570 --> 00:16:46,790
So we take the first step
to some state k.
323
00:16:46,790 --> 00:16:50,260
And then we go from
k to j in n steps.
324
00:16:50,260 --> 00:16:53,180
But then we sum this
over all k.
325
00:16:53,180 --> 00:16:59,620
But for this P of n from state k
to j in n steps, I can just take
326
00:16:59,620 --> 00:17:00,620
the maximum path.
327
00:17:00,620 --> 00:17:05,950
So I take the most probable
path, the state that gives me
328
00:17:05,950 --> 00:17:08,680
the most probable path, and
I substitute this in.
329
00:17:08,680 --> 00:17:12,359
When I substitute this in,
obviously every one of these
330
00:17:12,359 --> 00:17:14,670
guys is going to be less
than or equal to this.
331
00:17:14,670 --> 00:17:16,859
Therefore, this outside
term is going to be
332
00:17:16,859 --> 00:17:17,900
less than or equal.
333
00:17:17,900 --> 00:17:20,190
So now this is just going
to be a constant.
334
00:17:20,190 --> 00:17:24,589
So I sum over all k, and
then this term remains.
335
00:17:24,589 --> 00:17:29,740
So therefore, what we know is
if I want to end up in state
336
00:17:29,740 --> 00:17:37,430
j, and for n steps, if I
increase the step more, to n
337
00:17:37,430 --> 00:17:41,930
plus 1, we know that this
probability is going to stay
338
00:17:41,930 --> 00:17:43,010
the same or decrease.
339
00:17:43,010 --> 00:17:44,330
It's not going to increase.
340
00:17:44,330 --> 00:17:47,520
So you could do exactly the same
thing for the minimum.
341
00:17:47,520 --> 00:17:50,480
So if this is going to be true,
then of course, if I
342
00:17:50,480 --> 00:17:53,000
take the maximum of this,
it's also going to
343
00:17:53,000 --> 00:17:53,830
be less than that.
344
00:17:53,830 --> 00:17:57,270
Because this limit's true
for Markov chain.
345
00:17:57,270 --> 00:17:59,620
It doesn't matter.
346
00:17:59,620 --> 00:18:02,830
It just has to be a finite-state
Markov chain.
347
00:18:02,830 --> 00:18:05,250
So this is true for any
finite-state Markov chain.
348
00:18:05,250 --> 00:18:07,850
So if I take the maximum of
this, it's less than or equal
349
00:18:07,850 --> 00:18:10,560
to the maximum of
the n-th step.
350
00:18:10,560 --> 00:18:16,440
So with n plus 1 steps, the path is
going to be less probable when
351
00:18:16,440 --> 00:18:18,510
I take the maximum path--the chance
that I end up at state j is
352
00:18:18,510 --> 00:18:19,760
no larger than with n steps.
353
00:18:25,570 --> 00:18:29,050
So before we complete the proof
of this theorem, let's
354
00:18:29,050 --> 00:18:32,450
look at this case where P
is greater than zero.
355
00:18:32,450 --> 00:18:35,440
So if we say P is greater than
zero, this means that every
356
00:18:35,440 --> 00:18:39,020
entry in this matrix is greater
than 0 for all i, j,
357
00:18:39,020 --> 00:18:42,310
which means that this graph
is fully connected.
358
00:18:42,310 --> 00:18:46,130
So that means you could get from
i to j in one step with
359
00:18:46,130 --> 00:18:49,610
nonzero probability.
360
00:18:49,610 --> 00:18:52,250
So if P is greater than 0--
361
00:18:52,250 --> 00:18:53,600
and let this be the
transition matrix.
362
00:18:53,600 --> 00:18:56,330
So we'll prove this first, and
then we'll extend it to the
363
00:18:56,330 --> 00:18:59,170
arbitrary finite Markov chain.
364
00:18:59,170 --> 00:19:02,560
So let alpha here be equal
to the minimum.
365
00:19:02,560 --> 00:19:04,420
So it's going to be the minimum
element in this
366
00:19:04,420 --> 00:19:05,560
transition matrix.
367
00:19:05,560 --> 00:19:09,590
That means it's going to be the
entry that contains the
368
00:19:09,590 --> 00:19:11,580
minimum transition probability.
369
00:19:11,580 --> 00:19:15,935
So let's call alpha-- it's
the minimum probability.
370
00:19:15,935 --> 00:19:19,080
Excuse me.
371
00:19:19,080 --> 00:19:21,660
So this is for all
states i and j.
372
00:19:21,660 --> 00:19:25,040
And for n greater than or equal
to 1, we have these
373
00:19:25,040 --> 00:19:26,740
three expressions.
374
00:19:26,740 --> 00:19:34,060
So this first expression says
this, that if I have an n plus
375
00:19:34,060 --> 00:19:38,590
1 walk from i to j, I take
the most probable of
376
00:19:38,590 --> 00:19:41,960
this walk over i.
377
00:19:41,960 --> 00:19:44,860
So my choices, I can choose
my initial starting state.
378
00:19:44,860 --> 00:19:46,960
In n plus 1 steps, I want
to end in state j.
379
00:19:46,960 --> 00:19:48,690
So I pick the most
probable path.
380
00:19:48,690 --> 00:19:52,680
If I subtract this, which is the
least probable path--
381
00:19:52,680 --> 00:19:56,690
but you get to minimize this
over i, over the initial
382
00:19:56,690 --> 00:19:57,890
starting state.
383
00:19:57,890 --> 00:20:02,800
So this is less than or
equal to the n step.
384
00:20:02,800 --> 00:20:06,670
It's exactly this term
here, the n step
385
00:20:06,670 --> 00:20:08,130
times 1 minus 2 alpha.
386
00:20:08,130 --> 00:20:12,390
So alpha is the minimum
transition probability in this
387
00:20:12,390 --> 00:20:14,780
probability transition matrix.
388
00:20:14,780 --> 00:20:17,940
So this one, it's not so
obvious right now.
389
00:20:17,940 --> 00:20:20,370
But we are going to prove
that in the next slide.
390
00:20:20,370 --> 00:20:26,830
So once we have this, we can
iterate on n to get the
391
00:20:26,830 --> 00:20:28,670
second term.
392
00:20:28,670 --> 00:20:35,170
So for this term inside here,
the most probable path to
393
00:20:35,170 --> 00:20:38,550
state j in n steps, minus the
least probable path to state j
394
00:20:38,550 --> 00:20:43,670
in n steps, is equal to exactly
the same thing in n
395
00:20:43,670 --> 00:20:46,090
minus 1 steps times
1 minus 2 alpha.
396
00:20:46,090 --> 00:20:49,300
So we just keep on iterating
this over n, and then we
397
00:20:49,300 --> 00:20:50,170
should get this.
398
00:20:50,170 --> 00:20:52,640
So to prove this, we
prove it by induction.
399
00:20:52,640 --> 00:20:58,690
We just have to prove the
initial step, that the maximum
400
00:20:58,690 --> 00:21:02,810
single transition from l to j,
minus the minimum single
401
00:21:02,810 --> 00:21:05,650
transition from l to j,
is less than or equal
402
00:21:05,650 --> 00:21:07,700
to 1 minus 2 alpha.
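The shrinking spread is easy to see numerically; a Python sketch using the same two-state example as before, where alpha is 1/4:

```python
# Two-state chain from the earlier example; alpha is the
# minimum entry, so 1 - 2*alpha = 1/2 here.
P = [[0.25, 0.75],
     [0.75, 0.25]]
alpha = min(min(row) for row in P)

def mat_mul(A, B):
    """Product of two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

# Spread of the target column: max over i of Pn[i][1]
# minus min over i of Pn[i][1], for n = 1 through 6.
spreads = []
Pn = P
for _ in range(6):
    col = [row[1] for row in Pn]
    spreads.append(max(col) - min(col))
    Pn = mat_mul(Pn, P)
```

Each successive spread is at most (1 - 2*alpha) times the previous one, which is the exponential convergence the theorem claims.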
403
00:21:07,700 --> 00:21:10,510
So this one is proved
by induction.
404
00:21:10,510 --> 00:21:14,440
So as n goes to infinity, notice
that this term is going
405
00:21:14,440 --> 00:21:16,290
to go to 0.
406
00:21:16,290 --> 00:21:19,030
Because alpha is going to
be less than 1/2.
407
00:21:19,030 --> 00:21:23,560
Because if it's not, then we can
choose 1 minus alpha to be
408
00:21:23,560 --> 00:21:24,870
this minimum.
409
00:21:24,870 --> 00:21:27,340
So if this is going to 0, this
tells us the difference
410
00:21:27,340 --> 00:21:31,260
between the most probable path
minus the least probable path,
411
00:21:31,260 --> 00:21:33,390
of ending up in state j,
goes to 0.
412
00:21:33,390 --> 00:21:37,140
So if we take the limit as n
goes to infinity of both of
413
00:21:37,140 --> 00:21:39,240
these, they should equal.
414
00:21:39,240 --> 00:21:41,420
Because the difference of this,
we notice that it's
415
00:21:41,420 --> 00:21:45,290
going down exponentially in n.
416
00:21:45,290 --> 00:21:49,500
So this shows us that this
limit indeed does
417
00:21:49,500 --> 00:21:51,450
exist and is equal.
418
00:21:51,450 --> 00:21:55,760
We want to prove this first
statement over here.
419
00:21:55,760 --> 00:21:57,870
So in order to prove this first
statement, what we're
420
00:21:57,870 --> 00:22:03,290
going to do is we're going to
take this i, j transition in n
421
00:22:03,290 --> 00:22:04,990
plus 1 transitions.
422
00:22:04,990 --> 00:22:08,090
And then we're going to express
it as a function of n
423
00:22:08,090 --> 00:22:09,500
transitions.
424
00:22:09,500 --> 00:22:11,150
So the idea is this.
425
00:22:11,150 --> 00:22:14,900
We're going to use the
Chapman-Kolmogorov equations
426
00:22:14,900 --> 00:22:18,540
to have an intermediary step.
427
00:22:18,540 --> 00:22:24,310
So in order to do this i to j
in n plus 1 steps, the most
428
00:22:24,310 --> 00:22:29,530
probable path, we're going to
go to this intermediate step
429
00:22:29,530 --> 00:22:34,220
and then on to the final step.
430
00:22:34,220 --> 00:22:36,270
In this intermediate step,
it's going to be
431
00:22:36,270 --> 00:22:36,970
a function of n.
432
00:22:36,970 --> 00:22:39,650
So we're going to take one step
and then n more steps.
433
00:22:39,650 --> 00:22:43,310
So what we're going to do is,
the intuition is, we're going
434
00:22:43,310 --> 00:22:47,890
to remove the least
probable path.
435
00:22:47,890 --> 00:22:50,110
So we remove that from
the sum in this
436
00:22:50,110 --> 00:22:52,780
Chapman-Kolmogorov equation.
437
00:22:52,780 --> 00:22:54,870
And then we have the sum
of everything else
438
00:22:54,870 --> 00:22:56,010
except for that path.
439
00:22:56,010 --> 00:22:59,080
And then the sum of everything
else, we're going to bound it.
440
00:22:59,080 --> 00:23:02,760
Once we bound it, then we
have this expression.
441
00:23:02,760 --> 00:23:05,850
The probability of i to j in n
plus 1 steps is going to be a
442
00:23:05,850 --> 00:23:09,050
function of a max and
a min over n steps
443
00:23:09,050 --> 00:23:10,840
with a bunch of terms.
444
00:23:10,840 --> 00:23:14,720
So that's the intuition of
how we're going to do it.
445
00:23:14,720 --> 00:23:17,750
So the probability of ij going
from state i to state j in
446
00:23:17,750 --> 00:23:20,290
exactly n plus 1 steps
is equal to this.
447
00:23:20,290 --> 00:23:23,100
So it's the probability of
going from i to k, this
448
00:23:23,100 --> 00:23:23,550
intermediate step.
449
00:23:23,550 --> 00:23:26,840
We're going to take one
step to a state k.
450
00:23:26,840 --> 00:23:29,860
And then we're going from
k to j in n steps,
451
00:23:29,860 --> 00:23:31,000
summing over all k.
452
00:23:31,000 --> 00:23:34,260
So this is exactly equal to this
with Chapman-Kolmogorov.
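That Chapman-Kolmogorov step can be verified directly with matrix powers. A small sketch (the matrix here is a made-up example, not from the lecture):

```python
import numpy as np

# Made-up 3-state transition matrix, just to exercise the identity.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])

n = 4
Pn = np.linalg.matrix_power(P, n)       # P_kj(n): k to j in n steps
Pn1 = np.linalg.matrix_power(P, n + 1)  # P_ij(n+1): i to j in n+1 steps

# One step from i to an intermediate state k, then n more steps from k
# to j, summed over all k -- exactly the Chapman-Kolmogorov sum here.
assert np.allclose(Pn1, P @ Pn)
```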
453
00:23:34,260 --> 00:23:37,685
So now what happens is
we're going to take--
454
00:23:41,100 --> 00:23:43,660
Before we get to this next step,
let's define this l min
455
00:23:43,660 --> 00:23:47,690
to be the state that minimizes
P of lj, n over l.
456
00:23:47,690 --> 00:23:51,430
So l min is going to be the
state such that, among
457
00:23:51,430 --> 00:23:57,060
the states I could pick as the
intermediate state, arriving at j
458
00:23:57,060 --> 00:23:59,670
in n steps is going to be the
least probable.
459
00:23:59,670 --> 00:24:03,160
So this is l min over here.
460
00:24:03,160 --> 00:24:04,860
It's the l min that
satisfies this.
461
00:24:04,860 --> 00:24:06,840
Then I'm going to remove this.
462
00:24:06,840 --> 00:24:09,670
So this is one state. l min is
just one state that i is going
463
00:24:09,670 --> 00:24:12,690
to go to in this first step.
464
00:24:12,690 --> 00:24:15,440
So we're going to remove
it from the sum.
465
00:24:15,440 --> 00:24:18,660
So then, this is just here.
466
00:24:18,660 --> 00:24:28,120
So that path goes from i to l
min times l to j in n steps.
467
00:24:28,120 --> 00:24:30,160
So remove that one
path from here.
468
00:24:30,160 --> 00:24:33,610
Now we have the sum over the
rest of the cases because we
469
00:24:33,610 --> 00:24:34,620
just removed that.
470
00:24:34,620 --> 00:24:38,890
So we have ik, kj to
n, where k is not
471
00:24:38,890 --> 00:24:39,700
equal to that element.
472
00:24:39,700 --> 00:24:44,450
So we removed that path, the one
that goes to that state.
473
00:24:44,450 --> 00:24:49,890
But p of kj, n, the path that
goes from k to j in n steps,
474
00:24:49,890 --> 00:24:54,650
we can just bound this term
by the maximum over l
475
00:24:54,650 --> 00:24:56,390
from l to j of n.
476
00:24:56,390 --> 00:24:58,640
So then we're going to take the
most probable path in n
477
00:24:58,640 --> 00:25:02,720
steps such that we end
up in state j in n.
478
00:25:02,720 --> 00:25:06,250
So this term right here is
bounded by this term.
479
00:25:06,250 --> 00:25:08,150
Because it's bounded by this,
that's why we have this less
480
00:25:08,150 --> 00:25:10,240
than or equal sign.
481
00:25:10,240 --> 00:25:13,670
So we just do two things from
this step, the first step, to
482
00:25:13,670 --> 00:25:14,600
the second step.
483
00:25:14,600 --> 00:25:20,420
So we took out the path that's
going to minimize that right
484
00:25:20,420 --> 00:25:24,670
at the j-th node in n steps.
485
00:25:24,670 --> 00:25:30,870
And then we bounded the rest
of this sum by this.
486
00:25:30,870 --> 00:25:35,840
So when we sum this all up, this
is just a constant here.
487
00:25:35,840 --> 00:25:41,800
And ik here is just all the
states that i is going to
488
00:25:41,800 --> 00:25:45,330
visit except for this
one state, l min.
489
00:25:45,330 --> 00:25:48,450
Since it's just all of them
except for that, it's just 1
490
00:25:48,450 --> 00:25:52,730
minus the probability that it
goes from state i to l min.
491
00:25:52,730 --> 00:25:57,580
So this sum here is just
equal to this sum here.
492
00:25:57,580 --> 00:25:59,120
So this arrives here.
493
00:25:59,120 --> 00:26:02,860
And this term is still here.
494
00:26:02,860 --> 00:26:10,620
So going from here, what
happens is we just
495
00:26:10,620 --> 00:26:12,420
rearrange the terms.
496
00:26:12,420 --> 00:26:13,440
So nothing happens right here.
497
00:26:13,440 --> 00:26:14,690
It's just rearranging.
498
00:26:17,560 --> 00:26:20,330
Now we have this term here.
499
00:26:20,330 --> 00:26:23,620
So we look at this term, P
from i going to l min--
500
00:26:23,620 --> 00:26:27,970
Remember, we chose alpha to be
the minimum single transition
501
00:26:27,970 --> 00:26:31,760
probability, single
transition in that
502
00:26:31,760 --> 00:26:33,020
probability transition matrix.
503
00:26:33,020 --> 00:26:36,310
So i to l has to be
greater than that.
504
00:26:36,310 --> 00:26:39,050
But the negative of this has
to go the other way, so the
505
00:26:39,050 --> 00:26:39,800
negative is at most minus alpha.
506
00:26:39,800 --> 00:26:42,320
So this we can substitute
here.
507
00:26:46,670 --> 00:26:48,010
So now we have this.
508
00:26:48,010 --> 00:26:51,872
So the maximum over i of this
n plus 1 step actually shows
509
00:26:51,872 --> 00:26:52,760
you the probability.
510
00:26:52,760 --> 00:26:56,190
Because this I can write
as an n plus 1 step
511
00:26:56,190 --> 00:26:57,380
path from i to j.
512
00:26:57,380 --> 00:27:02,030
So if this is less than this
entire term, of course I can
513
00:27:02,030 --> 00:27:05,740
write the maximum path
from i to j.
514
00:27:05,740 --> 00:27:07,320
It also has to be less than
this because this is
515
00:27:07,320 --> 00:27:10,900
satisfied for all i, j.
516
00:27:10,900 --> 00:27:15,570
So therefore, we arrive at
this expression here.
517
00:27:15,570 --> 00:27:21,750
So now we're kind of in good
business because we have the n
518
00:27:21,750 --> 00:27:24,110
plus 1 step transition, the
maximum path from i to j
519
00:27:24,110 --> 00:27:26,750
in n plus 1 steps as a function
of n, which is what
520
00:27:26,750 --> 00:27:29,525
we wanted, and a function
of this alpha.
521
00:27:34,430 --> 00:27:35,980
So we repeat that
last statement.
522
00:27:38,490 --> 00:27:42,170
And the last one is here,
the last line.
523
00:27:45,160 --> 00:27:46,140
So now we have the maximum.
524
00:27:46,140 --> 00:27:48,910
So now we want to do is we
want to get the minimum.
525
00:27:48,910 --> 00:27:52,750
So we do exactly the same thing,
with the same proof.
526
00:27:52,750 --> 00:27:55,593
And with the minimum, what we're
going to do is we're
527
00:27:55,593 --> 00:28:01,180
going to look at the ij
transition in n plus 1 steps.
528
00:28:01,180 --> 00:28:03,180
And then what we're going to do
is we're going to pull out
529
00:28:03,180 --> 00:28:04,540
the maximum this time.
530
00:28:04,540 --> 00:28:08,830
So we pull out the most probable
path in n steps such
531
00:28:08,830 --> 00:28:11,250
that it arrives in state j.
532
00:28:11,250 --> 00:28:12,200
Then we play the same game.
533
00:28:12,200 --> 00:28:13,170
We bound everything--
534
00:28:13,170 --> 00:28:16,830
above, this time-- by the
minimum of the n step
535
00:28:16,830 --> 00:28:18,890
transition probabilities
to get to j.
536
00:28:18,890 --> 00:28:24,440
So once we do that, we get this
expression, very similar
537
00:28:24,440 --> 00:28:26,650
to this one up here.
538
00:28:26,650 --> 00:28:31,330
So now we have the maximum path,
which is n plus 1 steps
539
00:28:31,330 --> 00:28:36,440
to j, and the minimum of
n plus 1 steps to j.
540
00:28:36,440 --> 00:28:40,770
So we could take the difference
between these two.
541
00:28:40,770 --> 00:28:44,000
So if you subtract these
equations here, so this first
542
00:28:44,000 --> 00:28:48,010
equation minus the second
equation, we have this on the
543
00:28:48,010 --> 00:28:53,170
right-hand side here and then
these terms over here on the
544
00:28:53,170 --> 00:28:55,950
left-hand side.
545
00:28:55,950 --> 00:28:59,660
So these terms over here on the
left-hand side exactly prove
546
00:28:59,660 --> 00:29:01,620
the first line of the lemma.
547
00:29:06,670 --> 00:29:13,110
So the first line of
the lemma was here.
548
00:29:13,110 --> 00:29:15,860
So now, to prove the second of
the lemma, remember, we're
549
00:29:15,860 --> 00:29:17,330
going to prove this
by induction.
550
00:29:17,330 --> 00:29:20,900
In order to prove this by
induction, we first need the
551
00:29:20,900 --> 00:29:23,320
initial step.
552
00:29:23,320 --> 00:29:24,670
So the initial step is this.
553
00:29:24,670 --> 00:29:30,660
So if I take the minimum
transition probability from l
554
00:29:30,660 --> 00:29:33,230
to j, it has to be greater
than here with the alpha.
555
00:29:33,230 --> 00:29:35,670
Because we said that alpha was
the absolute minimum of all
556
00:29:35,670 --> 00:29:38,510
the single-step transition
probabilities.
557
00:29:38,510 --> 00:29:41,770
Then the maximum transition
probability has to be greater
558
00:29:41,770 --> 00:29:44,250
than or equal to
1 minus alpha.
559
00:29:44,250 --> 00:29:46,450
It's just by definition
of what we choose.
560
00:29:46,450 --> 00:29:49,840
So therefore, if I take this
term, the maximize minus the
561
00:29:49,840 --> 00:29:52,750
minimum is just 1
minus 2 alpha.
562
00:29:52,750 --> 00:29:55,850
So that's your first step in
the induction process.
563
00:29:55,850 --> 00:29:58,215
So we iterate on n.
564
00:29:58,215 --> 00:30:01,630
When we iterate on n,
one arrives at this
565
00:30:01,630 --> 00:30:02,880
equation down here.
566
00:30:16,610 --> 00:30:24,760
So this shows us from here that
if we take the limit as n
567
00:30:24,760 --> 00:30:27,360
goes to infinity of this
term, this goes down
568
00:30:27,360 --> 00:30:29,880
exponentially in n.
569
00:30:29,880 --> 00:30:32,590
And both of these limits are
going to converge, and they
570
00:30:32,590 --> 00:30:34,795
exist, and they're going
to be greater than 0.
571
00:30:34,795 --> 00:30:37,690
So they'll be greater than 0
because of our initial state
572
00:30:37,690 --> 00:30:41,250
that we chose this path with
a positive probability.
573
00:30:41,250 --> 00:30:43,735
Yeah, go ahead.
574
00:30:43,735 --> 00:30:46,847
AUDIENCE: It seems to me that
alpha is the minimum, the
575
00:30:46,847 --> 00:30:48,934
smallest number in the
transition matrix, right?
576
00:30:48,934 --> 00:30:51,250
SHAN-YUAN HO: Alpha is the
smallest number, correct.
577
00:30:51,250 --> 00:30:51,665
AUDIENCE: Yeah.
578
00:30:51,665 --> 00:30:53,716
How does it fall from
that, like that?
579
00:30:53,716 --> 00:31:01,460
So my question is, the convergence
rate is related to alpha?
580
00:31:01,460 --> 00:31:03,890
SHAN-YUAN HO: Yes,
it is, yeah.
581
00:31:03,890 --> 00:31:06,570
In general, it doesn't really
matter because it's still
582
00:31:06,570 --> 00:31:09,250
going to go down exponentially
in n.
583
00:31:09,250 --> 00:31:12,640
But it does depend on
that alpha, yes.
584
00:31:16,570 --> 00:31:17,820
Any other questions?
585
00:31:22,110 --> 00:31:23,050
Yes.
586
00:31:23,050 --> 00:31:25,515
AUDIENCE: Is the strength of
that bound proportional to the
587
00:31:25,515 --> 00:31:27,820
size of that matrix, right?
588
00:31:27,820 --> 00:31:28,470
SHAN-YUAN HO: Excuse me?
589
00:31:28,470 --> 00:31:29,800
AUDIENCE: The strength
of that bound is
590
00:31:29,800 --> 00:31:31,074
proportional to the size?
591
00:31:31,074 --> 00:31:33,490
I mean, for a very large
finite-state Markov chain, the
592
00:31:33,490 --> 00:31:35,210
strength of the bound is going
to be somewhat weak because
593
00:31:35,210 --> 00:31:37,190
alpha is going to be--
594
00:31:37,190 --> 00:31:39,300
SHAN-YUAN HO: Alpha has
to be less than 1/2.
595
00:31:39,300 --> 00:31:40,060
AUDIENCE: OK, yes.
596
00:31:40,060 --> 00:31:43,974
But the strength of the bound,
though, it's not a very tight
597
00:31:43,974 --> 00:31:47,902
bound on max minus min.
598
00:31:47,902 --> 00:31:50,400
Because in a large--
599
00:31:50,400 --> 00:31:50,660
SHAN-YUAN HO: Yes.
600
00:31:50,660 --> 00:31:52,760
This is just a bound.
601
00:31:52,760 --> 00:31:54,940
And the bound is from when
we took out that
602
00:31:54,940 --> 00:31:59,380
minimum-probability path,
the l min, remember?
603
00:31:59,380 --> 00:32:02,050
The bound was actually
in here.
604
00:32:02,050 --> 00:32:04,070
So we took the
minimum-probability path in n
605
00:32:04,070 --> 00:32:09,620
steps, this l min that minimizes
this over i.
606
00:32:09,620 --> 00:32:11,910
And then this is where this less
than or equal to here is
607
00:32:11,910 --> 00:32:13,160
just a substitution.
608
00:32:18,250 --> 00:32:19,500
Any other questions?
609
00:32:23,810 --> 00:32:27,750
So what we know is
that these
610
00:32:27,750 --> 00:32:29,820
limiting state probabilities
exist.
611
00:32:29,820 --> 00:32:34,095
So we have a finite
ergodic chain.
612
00:32:37,310 --> 00:32:41,950
So if the elements
in this transition
613
00:32:41,950 --> 00:32:43,630
matrix are all greater
than 0, we know
614
00:32:43,630 --> 00:32:44,990
that this limit exists.
615
00:32:44,990 --> 00:32:47,950
But we know that in general,
that may not be the case.
616
00:32:47,950 --> 00:32:50,415
We're going to have some 0's
in our transition matrix.
617
00:32:54,570 --> 00:32:57,240
So let's go back to the
arbitrary finite-state ergodic
618
00:32:57,240 --> 00:33:03,740
chain with probability
transition matrix P. So in the
619
00:33:03,740 --> 00:33:08,610
last slide, we showed that this
transition matrix P of h
620
00:33:08,610 --> 00:33:13,560
is positive for h is equal to
M minus 1 squared plus 1.
621
00:33:13,560 --> 00:33:19,130
So what we do is, we can apply
lemma 2 to P of h with this
622
00:33:19,130 --> 00:33:22,025
alpha equal to the minimum of
going from i to j
623
00:33:22,025 --> 00:33:23,275
in exactly h steps.
624
00:33:26,340 --> 00:33:29,850
So why is this M minus
1 squared plus 1?
625
00:33:29,850 --> 00:33:33,020
So in the last lecture--
626
00:33:33,020 --> 00:33:34,490
so what it means is this.
627
00:33:37,670 --> 00:33:39,020
So what it says is here.
628
00:33:39,020 --> 00:33:41,075
This was an example given
in the last lecture.
629
00:33:41,075 --> 00:33:43,430
It was a 6-state Markov chain.
630
00:33:43,430 --> 00:33:48,020
So what it says is that if n is
greater than or equal to M
631
00:33:48,020 --> 00:33:50,510
minus 1 squared plus 1-- in this
case, it's going to be 6.
632
00:33:50,510 --> 00:33:56,570
So if n is greater than or equal
to 26, then I take P to
633
00:33:56,570 --> 00:33:59,000
the 26th power, it means
it's greater than zero.
634
00:33:59,000 --> 00:34:02,350
That means if I take P to the
26th power, every single
635
00:34:02,350 --> 00:34:06,590
element in this transition
matrix is going to be
636
00:34:06,590 --> 00:34:14,260
non-zero, which means that you
can go from any state to any
637
00:34:14,260 --> 00:34:17,389
state with nonzero probability,
as long as n is
638
00:34:17,389 --> 00:34:18,290
bigger than that.
639
00:34:18,290 --> 00:34:20,889
So basically, in this Markov
chain, if you go long enough,
640
00:34:20,889 --> 00:34:22,190
long enough.
641
00:34:22,190 --> 00:34:25,905
Then I say, OK, I want to go
from state i to state j in
642
00:34:25,905 --> 00:34:28,469
exactly how many steps, there is
a positive probability that
643
00:34:28,469 --> 00:34:31,600
this is going to happen.
644
00:34:31,600 --> 00:34:34,380
So how did this bound
come about?
645
00:34:34,380 --> 00:34:44,980
Well, for instance, in this
chain, if we look at P1,1 that
646
00:34:44,980 --> 00:34:46,340
we have here.
647
00:34:46,340 --> 00:34:50,449
So I'm going to look
at the transition
648
00:34:50,449 --> 00:34:51,940
starting at state 1.
649
00:34:51,940 --> 00:34:53,600
And I want to come back to 1.
650
00:34:53,600 --> 00:34:56,860
So you definitely could come
back at 6, because these are
651
00:34:56,860 --> 00:34:59,300
all positive probability.
652
00:34:59,300 --> 00:35:00,350
So 6 is possible.
653
00:35:00,350 --> 00:35:02,160
So n is equal to
6 is possible.
654
00:35:02,160 --> 00:35:03,680
So what's the next one
that's possible? n is
655
00:35:03,680 --> 00:35:05,720
equal to 11, right?
656
00:35:05,720 --> 00:35:09,100
Then the next one is what?
657
00:35:09,100 --> 00:35:11,690
16 is possible, right?
658
00:35:11,690 --> 00:35:15,100
So 0 to 5 is impossible; it's 0.
659
00:35:15,100 --> 00:35:19,560
So if I pick n between 0 and 5,
and 7 and 10, you're toast.
660
00:35:19,560 --> 00:35:22,750
You can't get back to 1.
661
00:35:22,750 --> 00:35:23,840
And so forth.
662
00:35:23,840 --> 00:35:25,300
So 18 is possible.
663
00:35:29,250 --> 00:35:30,410
21--
664
00:35:30,410 --> 00:35:32,480
let's see, is 17 possible?
665
00:35:32,480 --> 00:35:33,740
Yeah, 17 is also possible.
666
00:35:37,040 --> 00:35:38,750
AUDIENCE: Why is 16 possible?
667
00:35:38,750 --> 00:35:41,630
SHAN-YUAN HO: So I go around
here twice, and
668
00:35:41,630 --> 00:35:42,880
then the last one.
669
00:35:46,060 --> 00:35:48,810
Is that right?
670
00:35:48,810 --> 00:35:52,160
So if I go from here to here
to here to here, if I go
671
00:35:52,160 --> 00:35:54,696
twice, and then one more
in the final loop.
672
00:35:54,696 --> 00:35:55,550
AUDIENCE: That's 12.
673
00:35:55,550 --> 00:35:57,340
SHAN-YUAN HO: Oh, it's 12?
674
00:35:57,340 --> 00:35:57,870
No.
675
00:35:57,870 --> 00:36:01,640
I'm going to go this inner
loop right here.
676
00:36:01,640 --> 00:36:07,234
So if I go from 1 to 2 to 3
to 4 to 5 to 6, down to 2.
677
00:36:07,234 --> 00:36:10,130
Then I go 3, 4, 5, 6, 1.
678
00:36:10,130 --> 00:36:11,380
That's 11, isn't it?
679
00:36:14,300 --> 00:36:17,090
So for 16, I'm going to go around
the inner loop twice.
680
00:36:19,630 --> 00:36:19,870
OK.
681
00:36:19,870 --> 00:36:20,610
Go ahead.
682
00:36:20,610 --> 00:36:20,980
Question?
683
00:36:20,980 --> 00:36:23,670
AUDIENCE: So everything 20 and
under is possible, right?
684
00:36:23,670 --> 00:36:24,670
SHAN-YUAN HO: No.
685
00:36:24,670 --> 00:36:25,730
Is 25 possible?
686
00:36:25,730 --> 00:36:28,092
Tell me how you're going
to go 25 on this.
687
00:36:28,092 --> 00:36:30,760
AUDIENCE: You just do the 5
loop 5 times.
688
00:36:30,760 --> 00:36:32,140
SHAN-YUAN HO: Yeah, but I
want to go from 1 to 1.
689
00:36:32,140 --> 00:36:33,240
You're starting in state 1.
690
00:36:33,240 --> 00:36:33,825
AUDIENCE: Oh, oh, sorry.
691
00:36:33,825 --> 00:36:34,150
OK.
692
00:36:34,150 --> 00:36:35,662
SHAN-YUAN HO: 1 to 1, right?
693
00:36:35,662 --> 00:36:36,594
AUDIENCE: OK, cool.
694
00:36:36,594 --> 00:36:37,844
OK, I see.
695
00:36:40,030 --> 00:36:42,230
SHAN-YUAN HO: So you know for
this one that this bound is
696
00:36:42,230 --> 00:36:44,160
actually tight.
697
00:36:44,160 --> 00:36:46,880
So 25 is impossible.
698
00:36:46,880 --> 00:36:50,520
So P1,1 of 25 is equal to 0.
699
00:36:50,520 --> 00:36:51,920
There's no way you
can do that.
700
00:36:51,920 --> 00:36:54,850
But for 26 on, then you can.
701
00:36:54,850 --> 00:36:58,090
So what you're noticing is that
you need this loop of 6
702
00:36:58,090 --> 00:37:02,830
here and that any combination
of 5 or 6 is possible.
703
00:37:02,830 --> 00:37:08,310
So basically, in this particular
example, if n is
704
00:37:08,310 --> 00:37:18,710
equal to 6k plus 5j, where k
is greater than or equal to
705
00:37:18,710 --> 00:37:21,020
1-- because I need that final
loop to get back--
706
00:37:21,020 --> 00:37:23,350
and j is greater than
or equal to 0--
707
00:37:23,350 --> 00:37:28,730
So any combination of this one,
then I can express n.
708
00:37:28,730 --> 00:37:31,140
I can go around it to give me
a positive probability of
709
00:37:31,140 --> 00:37:33,660
going from state 1 to state 1.
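This counting can be checked in a few lines. A sketch, assuming (my reconstruction of the figure) the chain is the 6-cycle with the inner 5-loop, so a return to state 1 takes n = 6k + 5j steps with k >= 1 and j >= 0:

```python
def can_return(n, M=6):
    # Is n = M*k + (M-1)*j for some k >= 1, j >= 0?
    # k ranges up to n // M, so n - M*k stays non-negative.
    return any((n - M * k) % (M - 1) == 0 for k in range(1, n // M + 1))

# The feasible return times start 6, 11, 12, 16, 17, 18, 21, ...
assert can_return(6) and can_return(11) and can_return(16) and can_return(17)
assert not can_return(25)                          # the last impossible n
assert all(can_return(n) for n in range(26, 500))  # (M-1)^2 + 1 = 26 on
```

The 25/26 boundary is exactly the (M-1) squared plus 1 bound quoted in the lecture.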
710
00:37:33,660 --> 00:37:38,910
So I'm going to prove this
using an extremal property.
711
00:37:38,910 --> 00:37:41,330
So we're going to take the
absolute worst case.
712
00:37:41,330 --> 00:37:46,970
So the absolute worst case
for an M state finite Markov
713
00:37:46,970 --> 00:37:49,540
chain is if you have a loop
of M and you have a
714
00:37:49,540 --> 00:37:51,005
loop of M minus 1.
715
00:37:51,005 --> 00:37:52,460
You can't just have
a loop of M.
716
00:37:52,460 --> 00:37:53,950
The problem is now this
becomes periodic.
717
00:37:56,580 --> 00:38:00,680
So we have to get rid
of the periodicity.
718
00:38:00,680 --> 00:38:03,390
If you add a single loop here,
that doesn't help you.
719
00:38:03,390 --> 00:38:05,590
Then after 6, I get
7, 8, 9, 10, 11, 12.
720
00:38:05,590 --> 00:38:08,930
That didn't have this.
721
00:38:08,930 --> 00:38:11,880
So the absolute worst case for
an M state chain is going to
722
00:38:11,880 --> 00:38:13,430
be something that
looks like this.
723
00:38:13,430 --> 00:38:16,550
1 that goes to 2-- you're
forced to go to 2--
724
00:38:16,550 --> 00:38:20,840
so forth, until state M. And
then this M is going to go
725
00:38:20,840 --> 00:38:23,830
back to 2 or is going
to go back to 1.
726
00:38:23,830 --> 00:38:31,290
So in other words, the worst
case is if you have--
727
00:38:31,290 --> 00:38:39,210
n has to be some combination
of Mk plus M minus 1 j.
728
00:38:39,210 --> 00:38:43,430
So this will be the worst
possible case for M state
729
00:38:43,430 --> 00:38:44,220
Markov chain.
730
00:38:44,220 --> 00:38:50,480
So it'll be Mk plus
M minus 1 j.
731
00:38:50,480 --> 00:38:52,570
So k has to be greater
than or equal to 1.
732
00:38:52,570 --> 00:38:55,840
And then j has to be greater
than or equal to 0, because
733
00:38:55,840 --> 00:38:56,950
you need to come back.
734
00:38:56,950 --> 00:39:00,000
So I'm just looking at the case
probability that I start
735
00:39:00,000 --> 00:39:02,500
in state 1 and I come
back in state 1.
736
00:39:02,500 --> 00:39:04,670
So all right.
737
00:39:04,670 --> 00:39:09,770
So how do we get this bound?
738
00:39:09,770 --> 00:39:14,860
Well, there is an identity
that says this.
739
00:39:14,860 --> 00:39:29,370
If a and b are relatively prime,
then the largest n such
740
00:39:29,370 --> 00:39:32,750
that it cannot be written-- so
we want to find the largest n
741
00:39:32,750 --> 00:39:42,430
such that ak plus bj--
742
00:39:42,430 --> 00:39:48,860
but this is k and j greater
than or equal to 0--
743
00:39:48,860 --> 00:39:52,260
that it cannot be written
in this form.
744
00:39:55,860 --> 00:39:58,820
The largest integer that
cannot be written this way is ab
745
00:39:58,820 --> 00:40:00,680
minus a minus b.
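The identity is easy to check by brute force. A sketch of my own (a = 6, b = 5 as an assumed example, matching the chain above):

```python
from math import gcd

def representable(n, a, b):
    # Is n = a*k + b*j for some k, j >= 0?
    return any((n - a * k) % b == 0 for k in range(n // a + 1))

a, b = 6, 5
assert gcd(a, b) == 1     # relatively prime
frob = a * b - a - b      # largest non-representable n: 19 here

assert not representable(frob, a, b)
assert all(representable(n, a, b) for n in range(frob + 1, frob + 200))
```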
746
00:40:00,680 --> 00:40:02,960
This takes a little bit to
prove, but it's not too hard.
747
00:40:02,960 --> 00:40:05,480
If you want to know this proof,
come see me offline
748
00:40:05,480 --> 00:40:09,090
after class.
749
00:40:09,090 --> 00:40:10,170
This is the largest integer.
750
00:40:10,170 --> 00:40:13,160
If n is equal to this,
it cannot be
751
00:40:13,160 --> 00:40:14,310
written in this form.
752
00:40:14,310 --> 00:40:17,690
But if n is greater than
this, then it can.
753
00:40:17,690 --> 00:40:21,680
So all we do is substitute M
for a and M minus 1 for b
754
00:40:21,680 --> 00:40:24,440
because M and M minus 1
are relatively prime.
755
00:40:24,440 --> 00:40:29,460
But remember, we have a k here
that has to be greater than or
756
00:40:29,460 --> 00:40:29,980
equal to 1.
757
00:40:29,980 --> 00:40:31,330
We need at least one k.
758
00:40:31,330 --> 00:40:34,690
But this identity is for
k and j greater than or equal to 0.
759
00:40:34,690 --> 00:40:39,420
So therefore, we have to
subtract out that k.
760
00:40:39,420 --> 00:40:46,030
So therefore, we have M times
(M minus 1), minus M,
761
00:40:46,030 --> 00:40:48,200
minus (M minus 1).
762
00:40:48,200 --> 00:40:58,960
But the thing is we have to add
the extra M, because this
763
00:40:58,960 --> 00:41:00,360
k is greater than
or equal to 1.
764
00:41:00,360 --> 00:41:05,330
So we have to add back one of
the M's because of this.
765
00:41:05,330 --> 00:41:13,830
So this is just equal to
M minus 1, squared.
766
00:41:13,830 --> 00:41:18,920
So this number, if n is equal
to this, it's the largest
767
00:41:18,920 --> 00:41:20,490
number that it cannot be
written like that.
768
00:41:20,490 --> 00:41:21,810
So therefore, we
have to add 1.
769
00:41:21,810 --> 00:41:24,080
So that's why the bound
is equal to 1.
770
00:41:24,080 --> 00:41:30,660
So the upper bound that n can
be written is going to be M
771
00:41:30,660 --> 00:41:34,320
minus 1, squared plus 1.
772
00:41:34,320 --> 00:41:36,410
AUDIENCE: Why did you add
the 1 at the end?
773
00:41:36,410 --> 00:41:36,960
SHAN-YUAN HO: This one?
774
00:41:36,960 --> 00:41:40,476
AUDIENCE: No, we've got to
do the 1 at the end.
775
00:41:40,476 --> 00:41:42,380
AUDIENCE: We already
have that in there.
776
00:41:42,380 --> 00:41:42,810
SHAN-YUAN HO: Oh, where is it?
777
00:41:42,810 --> 00:41:44,245
No, it's in here, right?
778
00:41:44,245 --> 00:41:45,495
AUDIENCE: No, it's not here.
779
00:41:48,420 --> 00:41:49,670
SHAN-YUAN HO: Did I--
780
00:41:53,980 --> 00:41:54,720
What are you talking about?
781
00:41:54,720 --> 00:41:57,088
Where's the 1?
782
00:41:57,088 --> 00:41:59,030
AUDIENCE: At the end,
the last equation.
783
00:41:59,030 --> 00:41:59,500
SHAN-YUAN HO: This one?
784
00:41:59,500 --> 00:42:01,290
AUDIENCE: Yes.
785
00:42:01,290 --> 00:42:02,270
SHAN-YUAN HO: OK.
786
00:42:02,270 --> 00:42:12,420
This is the largest
n which you cannot write.
787
00:42:12,420 --> 00:42:15,170
You cannot write this.
788
00:42:15,170 --> 00:42:18,220
So this bound is tight.
789
00:42:18,220 --> 00:42:22,890
It means that this is the
one that you can.
790
00:42:22,890 --> 00:42:24,990
So if n is greater
than or equal to
791
00:42:24,990 --> 00:42:26,470
this, then it's possible.
792
00:42:26,470 --> 00:42:28,970
This is the largest
one it cannot.
793
00:42:28,970 --> 00:42:30,790
Based on this, it cannot.
794
00:42:30,790 --> 00:42:32,070
So we have to add the 1.
795
00:42:32,070 --> 00:42:34,970
So therefore, in here,
you could do 26.
796
00:42:34,970 --> 00:42:38,470
So starting from 26, 27,
28, you can do that.
797
00:42:41,310 --> 00:42:42,890
Any questions?
798
00:42:42,890 --> 00:42:45,590
AUDIENCE: Relatively prime,
what do you mean by
799
00:42:45,590 --> 00:42:45,920
"relatively"?
800
00:42:45,920 --> 00:42:47,982
SHAN-YUAN HO: It means the
greatest common divisor is 1.
801
00:42:51,900 --> 00:42:56,790
So if we take h here, h is
going to be positive.
802
00:42:56,790 --> 00:43:00,400
So if h is equal to M minus 1,
squared plus 1, then now all
803
00:43:00,400 --> 00:43:01,530
the elements are positive.
804
00:43:01,530 --> 00:43:04,590
Because we just proved that
we can write this--
805
00:43:10,230 --> 00:43:13,390
every state can be reached from
any other state, with positive
806
00:43:13,390 --> 00:43:15,260
probability.
807
00:43:15,260 --> 00:43:21,620
So we say, looking at P, we know
that P of h is positive
808
00:43:21,620 --> 00:43:23,510
for h greater than or
equal to this bound.
809
00:43:23,510 --> 00:43:28,330
So what we do is we apply
lemma 2 to
810
00:43:28,330 --> 00:43:32,980
this transition matrix P of h,
where we have picked alpha--
811
00:43:32,980 --> 00:43:34,620
remember, alpha is the minimum
single-step transition
812
00:43:34,620 --> 00:43:35,340
probability.
813
00:43:35,340 --> 00:43:38,840
So instead of the single
transition, we have lumped
814
00:43:38,840 --> 00:43:42,700
this P into P to the h power.
815
00:43:42,700 --> 00:43:45,470
So it's h steps.
816
00:43:45,470 --> 00:43:50,790
Because we proved the result
before for positive P. So this
817
00:43:50,790 --> 00:43:53,760
P to the h is positive, so we
take alpha as the minimum from
818
00:43:53,760 --> 00:43:59,510
i to j of P to the
h in this matrix.
819
00:43:59,510 --> 00:44:02,360
So it doesn't really matter what
the value of alpha is,
820
00:44:02,360 --> 00:44:03,950
only that it's going
to be positive.
821
00:44:03,950 --> 00:44:06,290
And it has to be positive
because it's a probability.
822
00:44:06,290 --> 00:44:12,770
So what happens is, if we follow
the proof of what we
823
00:44:12,770 --> 00:44:17,260
just showed in the lemma, then
we show that the maximum path
824
00:44:17,260 --> 00:44:19,190
from l to j--
825
00:44:22,260 --> 00:44:25,240
h times m. So m is going
to be an integer, so in
826
00:44:25,240 --> 00:44:27,220
multiples of h--
827
00:44:27,220 --> 00:44:30,930
this upper limit is going to be
equal to the lower limit.
828
00:44:30,930 --> 00:44:34,730
So the most probable path
is equal to the
829
00:44:34,730 --> 00:44:36,590
least probable path.
830
00:44:40,240 --> 00:44:42,380
So this is multiple of h's.
831
00:44:42,380 --> 00:44:44,730
So if we take this as m goes
to infinity, this has
832
00:44:44,730 --> 00:44:47,510
got to equal to--
833
00:44:47,510 --> 00:44:53,110
Oops, this should be going
to pi sub j, excuse me.
834
00:44:53,110 --> 00:44:55,230
This little symbol here.
835
00:44:55,230 --> 00:44:57,950
And this is going to
be greater than 0.
836
00:44:57,950 --> 00:45:01,040
So the problem is now we've
shown it for multiples of h's,
837
00:45:01,040 --> 00:45:04,180
what about the n's in between?
838
00:45:04,180 --> 00:45:10,510
But the fact is that lemma 1,
we showed that this maximum
839
00:45:10,510 --> 00:45:14,170
path from l to j in n is
non-increasing in n.
840
00:45:14,170 --> 00:45:18,620
So all those states, all those
paths, the transition
841
00:45:18,620 --> 00:45:21,460
probability for the paths in
between these multiples of
842
00:45:21,460 --> 00:45:25,110
h's, in between them it's
going to be
843
00:45:25,110 --> 00:45:26,100
non-increasing in n.
844
00:45:26,100 --> 00:45:30,852
So even if we're taking these
multiples of h at n
845
00:45:30,852 --> 00:45:33,150
here, here, here, and we know
that this limit converges,
846
00:45:33,150 --> 00:45:37,690
we know that all the ones in
between them are also going to
847
00:45:37,690 --> 00:45:42,350
be converging to the same limit
because of lemma 1.
848
00:45:42,350 --> 00:45:45,335
Remember, the maximum is
going to be non-increasing,
849
00:45:45,335 --> 00:45:46,460
and the minimum is going to be
850
00:45:46,460 --> 00:45:48,760
non-decreasing in any one path.
851
00:45:48,760 --> 00:45:54,280
So this must have the
same limit as
852
00:45:54,280 --> 00:45:56,040
this multiple of this.
853
00:45:56,040 --> 00:45:58,390
So the same limit applies.
854
00:45:58,390 --> 00:46:00,220
So any questions on this?
855
00:46:00,220 --> 00:46:03,390
So this is how we prove it for
the arbitrary finite-state
856
00:46:03,390 --> 00:46:07,790
ergodic chain when we have some
0 probability transition
857
00:46:07,790 --> 00:46:13,490
elements in the matrix P. So
the proof is the same.
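As a hedged aside (not part of the lecture itself), Lemma 1 can be checked numerically. The 3-state matrix below is a made-up example with some zero transition elements: the column maxima of P to the n never increase, the column minima never decrease, and the two squeeze together to pi sub j.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A made-up 3-state ergodic chain with some zero transition probabilities.
P = [[0.0, 0.5, 0.5],
     [0.7, 0.0, 0.3],
     [0.4, 0.6, 0.0]]

Pn = P
j = 0                      # watch column j of P^n
prev_max, prev_min = 1.0, 0.0
for n in range(1, 40):
    col = [Pn[i][j] for i in range(3)]
    cur_max, cur_min = max(col), min(col)
    assert cur_max <= prev_max + 1e-12   # maximum is non-increasing
    assert cur_min >= prev_min - 1e-12   # minimum is non-decreasing
    prev_max, prev_min = cur_max, cur_min
    Pn = mat_mul(Pn, P)

print(abs(prev_max - prev_min) < 1e-9)  # squeezed to the same limit pi_0
```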
858
00:46:17,680 --> 00:46:19,880
So now for ergodic unichain.
859
00:46:19,880 --> 00:46:26,880
So we see that this limit as n
approaches infinity from i to
860
00:46:26,880 --> 00:46:30,120
j in n steps is going to just end up
at the steady-state probability
861
00:46:30,120 --> 00:46:32,380
pi of j for all i.
862
00:46:32,380 --> 00:46:35,040
So it doesn't matter what
your initial state is.
863
00:46:35,040 --> 00:46:39,170
As n goes to infinity of this
path, as this Markov chain
864
00:46:39,170 --> 00:46:42,360
goes on and on, you will end up
in state j with probability
865
00:46:42,360 --> 00:46:47,440
pi sub j, where pi is this
probability vector.
866
00:46:47,440 --> 00:46:50,090
So now we have this steady-state
vector, and then
867
00:46:50,090 --> 00:46:54,130
we can solve for the
steady-state vector solution.
868
00:46:54,130 --> 00:46:59,600
So this pi P is equal to pi.
869
00:46:59,600 --> 00:46:59,860
Yeah?
870
00:46:59,860 --> 00:47:00,330
Go ahead.
871
00:47:00,330 --> 00:47:02,270
AUDIENCE: Where did you prove
that the sum of all the pi j's
872
00:47:02,270 --> 00:47:04,030
equal to one?
873
00:47:04,030 --> 00:47:06,563
Because you say that we
proved that this is
874
00:47:06,563 --> 00:47:07,355
the probability vector.
875
00:47:07,355 --> 00:47:08,780
But didn't we prove only that
it is non-negative?
876
00:47:08,780 --> 00:47:09,290
SHAN-YUAN HO: It's
non-negative.
877
00:47:09,290 --> 00:47:13,090
But the thing is because as n
goes to infinity, you have to
878
00:47:13,090 --> 00:47:15,200
land up somewhere, right?
879
00:47:15,200 --> 00:47:16,960
This is a finite-state
Markov chain.
880
00:47:16,960 --> 00:47:19,030
You have to be somewhere.
881
00:47:19,030 --> 00:47:21,060
And the fact that you have to be
somewhere, your whole state
882
00:47:21,060 --> 00:47:23,480
space has to add up to 1.
883
00:47:23,480 --> 00:47:24,550
Because it's a constant,
remember?
884
00:47:24,550 --> 00:47:29,290
For every j, as n goes to
infinity, it goes to pi sub j.
885
00:47:29,290 --> 00:47:31,020
So you have that for
every single state.
886
00:47:31,020 --> 00:47:32,530
And then you have to
end up somewhere.
887
00:47:32,530 --> 00:47:34,570
So if you have to end up
somewhere, the space has to
888
00:47:34,570 --> 00:47:35,896
add up to one.
889
00:47:35,896 --> 00:47:37,980
Yeah, good question.
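A minimal sketch of solving pi P equals pi numerically, using a hypothetical two-state chain that is not from the lecture: start from any probability vector and repeatedly apply P; for an ergodic chain the iterates converge to the unique steady-state vector, and the entries sum to 1 at every step, which is the point of the question above.

```python
def step(pi, P):
    """One application of pi <- pi P (a row vector times the matrix)."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

# A hypothetical two-state chain (an assumption for illustration).
P = [[0.9, 0.1],
     [0.2, 0.8]]

pi = [1.0, 0.0]            # any starting probability vector works
for _ in range(200):
    pi = step(pi, P)

print([round(x, 4) for x in pi])  # -> [0.6667, 0.3333], and it sums to 1
```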
890
00:47:37,980 --> 00:47:41,990
So why are we interested
in this pi sub j?
891
00:47:41,990 --> 00:47:45,210
The question is that because
in this recurrent class, it
892
00:47:45,210 --> 00:47:48,910
tells us that as this goes to
infinity, we see this sequence
893
00:47:48,910 --> 00:47:50,880
of states going back and
forth, back and forth.
894
00:47:50,880 --> 00:47:53,430
And we know that as n goes
to infinity, we have some
895
00:47:53,430 --> 00:47:56,285
probability, pi sub j, of
landing in state j, pi sub i
896
00:47:56,285 --> 00:47:57,810
of landing in state
i, and so forth.
897
00:47:57,810 --> 00:48:01,980
So it says that over n steps,
as n goes to infinity, that
898
00:48:01,980 --> 00:48:04,290
this is the fraction of time
that that state is actually
899
00:48:04,290 --> 00:48:05,440
going to be visited.
900
00:48:05,440 --> 00:48:08,590
Because at each step, you have
to make a transition.
901
00:48:08,590 --> 00:48:15,050
So it's kind of the expected
number of times per unit time.
902
00:48:15,050 --> 00:48:16,280
So you divide by n.
903
00:48:16,280 --> 00:48:18,260
It's going to be that fraction
of time that you're going to
904
00:48:18,260 --> 00:48:19,110
visit that state.
905
00:48:19,110 --> 00:48:20,110
It's the fraction of time
that you're going
906
00:48:20,110 --> 00:48:21,660
to be in that state.
907
00:48:21,660 --> 00:48:26,940
It's this limiting state as
n gets very, very large.
908
00:48:26,940 --> 00:48:32,520
So we will see that in the next
few chapters when we do
909
00:48:32,520 --> 00:48:35,210
renewal theory that this will
come into useful play.
910
00:48:35,210 --> 00:48:39,670
And we give a slightly different
viewpoint of it.
911
00:48:39,670 --> 00:48:42,270
So it's very easy to extend this
result to a more general
912
00:48:42,270 --> 00:48:44,510
class of ergodic unichains.
913
00:48:44,510 --> 00:48:46,350
So remember the ergodic
unichains, now we have
914
00:48:46,350 --> 00:48:48,020
introduced these transient
states.
915
00:48:48,020 --> 00:48:50,120
So before, we proved this.
916
00:48:50,120 --> 00:48:53,270
We just proved it for when it
contains exactly one class.
917
00:48:53,270 --> 00:48:59,300
It's aperiodic, so we have no
cycles, no periodicity in this
918
00:48:59,300 --> 00:49:00,170
Markov chain.
919
00:49:00,170 --> 00:49:03,080
And so we know that the
steady-state transition
920
00:49:03,080 --> 00:49:04,510
probabilities have a limit.
921
00:49:04,510 --> 00:49:06,596
And the upper limit and the
lower limit of these paths as
922
00:49:06,596 --> 00:49:07,680
they go to infinity--
923
00:49:07,680 --> 00:49:09,770
in fact, they end up in
a particular state--
924
00:49:09,770 --> 00:49:10,430
has a limit.
925
00:49:10,430 --> 00:49:14,570
And we have this steady-state
probability vector that
926
00:49:14,570 --> 00:49:15,610
describes this.
927
00:49:15,610 --> 00:49:17,940
So now we have these
transient states.
928
00:49:17,940 --> 00:49:20,290
So these transient states of
this Markov chain, what
929
00:49:20,290 --> 00:49:25,480
happens is there exists a path
by which this transient state is
930
00:49:25,480 --> 00:49:27,370
going to go to a recurrent
state.
931
00:49:27,370 --> 00:49:29,930
So once it leaves this transient
state, it goes to
932
00:49:29,930 --> 00:49:30,550
a recurrent state.
933
00:49:30,550 --> 00:49:31,810
It's never going to come back.
934
00:49:31,810 --> 00:49:40,500
So there is some probability,
alpha, of leaving the
935
00:49:40,500 --> 00:49:41,700
class at each step.
936
00:49:41,700 --> 00:49:43,750
So there's some transition
probability in this transient
937
00:49:43,750 --> 00:49:45,630
state that's going
to be alpha.
938
00:49:45,630 --> 00:49:48,730
And the probability of remaining
in this transient
939
00:49:48,730 --> 00:49:50,970
state is just 1 minus
alpha to the n.
940
00:49:50,970 --> 00:49:52,840
And this goes down
exponentially.
941
00:49:52,840 --> 00:49:56,340
So what this says is that
eventually, as n gets very
942
00:49:56,340 --> 00:49:59,290
large, it's very, very hard to
stay in that transient state.
943
00:49:59,290 --> 00:50:01,340
So it's going to go out of
the transient state.
944
00:50:01,340 --> 00:50:04,900
And then it will go into
the recurrent class.
945
00:50:04,900 --> 00:50:09,410
So when one does the analysis
for this, what happens in the
946
00:50:09,410 --> 00:50:13,960
probability in this steady-state
vector is those
947
00:50:13,960 --> 00:50:17,580
transient states, this pi,
will be equal to 0.
948
00:50:17,580 --> 00:50:21,610
So this distribution is only
going to be non-zero for
949
00:50:21,610 --> 00:50:23,640
recurrent states in this
Markov chain.
950
00:50:23,640 --> 00:50:27,510
And the transient states will
have probability equal to 0.
951
00:50:27,510 --> 00:50:31,080
In the notes, they just
extend the argument.
952
00:50:31,080 --> 00:50:35,340
But you need a little bit
more care to show this.
953
00:50:35,340 --> 00:50:38,210
And it divides the transient
states into a block and then
954
00:50:38,210 --> 00:50:40,660
the recurrent classes into
another block and then shows
955
00:50:40,660 --> 00:50:45,630
that these transient states'
limiting probability is going
956
00:50:45,630 --> 00:50:46,880
to go to 0.
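A made-up unichain sketch (the matrix is an assumption, not from the notes): state 0 is transient, leaking into the recurrent class {1, 2} with probability alpha = 0.5 per step and never returning, so its occupancy decays like (1 minus alpha) to the n and the limiting rows of P to the n put probability 0 on it.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Made-up unichain: state 0 is transient (it leaks into the recurrent
# class {1, 2} with alpha = 0.5 per step and never comes back).
P = [[0.5, 0.5, 0.0],
     [0.0, 0.3, 0.7],
     [0.0, 0.6, 0.4]]

Pn = P
for _ in range(100):
    Pn = mat_mul(Pn, P)

# Probability of still sitting in the transient state decays like
# (1 - alpha)^n, so the limiting row puts 0 on state 0.
print([round(x, 4) for x in Pn[0]])  # -> [0.0, 0.4615, 0.5385]
```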
957
00:50:52,020 --> 00:50:55,080
So let's see.
958
00:50:55,080 --> 00:50:59,490
So this says just what I said,
that these transient states
959
00:50:59,490 --> 00:51:02,180
decay exponentially, and one
of the paths will be taken,
960
00:51:02,180 --> 00:51:03,880
eventually, out of it.
961
00:51:03,880 --> 00:51:07,440
So for ergodic unichains, the
ergodic class is eventually
962
00:51:07,440 --> 00:51:09,180
entered, and then steady state
in that class is reached.
963
00:51:09,180 --> 00:51:13,370
So every state j, we
have exactly this.
964
00:51:13,370 --> 00:51:18,270
The maximum path from
i to j in n steps--
965
00:51:18,270 --> 00:51:19,100
and the minimum path.
966
00:51:19,100 --> 00:51:21,820
We look at the minimum path in
n steps and the maximum path
967
00:51:21,820 --> 00:51:22,700
in n steps.
968
00:51:22,700 --> 00:51:25,820
And for each n, we take the
limit as n goes to infinity.
969
00:51:25,820 --> 00:51:29,220
These guys, these limits are
exactly equal, and it equals
970
00:51:29,220 --> 00:51:32,680
to this pi sub j, which is
equal to the steady-state probability of state j.
971
00:51:32,680 --> 00:51:39,150
So your initial state, and
the path that you
972
00:51:39,150 --> 00:51:40,560
have taken is completely
wiped out.
973
00:51:40,560 --> 00:51:44,470
And all that matters is
this final state,
974
00:51:44,470 --> 00:51:45,820
as n gets very large.
975
00:51:45,820 --> 00:51:49,200
So the difference here is that
pi sub j equals 0 for each
976
00:51:49,200 --> 00:51:51,380
transient state, and it's
greater than 0 for the
977
00:51:51,380 --> 00:51:52,630
recurrent state.
978
00:51:55,770 --> 00:51:57,580
So other finite Markov chains.
979
00:51:57,580 --> 00:51:59,280
So we can consider a
Markov chain with
980
00:51:59,280 --> 00:52:00,330
several ergodic classes.
981
00:52:00,330 --> 00:52:03,340
Because we just considered it
with one ergodic class.
982
00:52:03,340 --> 00:52:05,790
So if the classes don't
communicate, then you just
983
00:52:05,790 --> 00:52:06,740
consider it separately.
984
00:52:06,740 --> 00:52:08,920
So you figure out the
steady-state transition
985
00:52:08,920 --> 00:52:11,170
probabilities for each of
the classes separately.
986
00:52:11,170 --> 00:52:17,080
But if you insist on
analyzing the entire chain P,
987
00:52:17,080 --> 00:52:19,760
then this P will have m
independent steady-state
988
00:52:19,760 --> 00:52:29,180
vectors, each one non-zero
only in its own class.
989
00:52:29,180 --> 00:52:32,690
So this P sub n is still going
to converge, but the rows are
990
00:52:32,690 --> 00:52:33,590
not going to be the same.
991
00:52:33,590 --> 00:52:35,210
So basically, you're going
to have blocks.
992
00:52:35,210 --> 00:52:38,510
So if you have one class, say 1
through k is going to be in
993
00:52:38,510 --> 00:52:41,680
one class, and then k through
l is going to be another
994
00:52:41,680 --> 00:52:45,070
class, and then l through z is
going to be another class, you
995
00:52:45,070 --> 00:52:45,960
have a block.
996
00:52:45,960 --> 00:52:49,170
So this steady-state vector
is going to be in blocks.
997
00:52:51,770 --> 00:52:56,480
So you can see the recurrent
classes only communicate
998
00:52:56,480 --> 00:52:57,520
within themselves.
999
00:52:57,520 --> 00:52:59,350
Because these don't
1000
00:52:59,350 --> 00:53:01,570
communicate, so they're separate.
1001
00:53:01,570 --> 00:53:11,450
So you could have a lot of 0's
in the limiting matrix, if you look
1002
00:53:11,450 --> 00:53:15,690
at this P sub n as n goes
to infinity.
1003
00:53:15,690 --> 00:53:18,010
So there are m sets of rows,
one for each class.
1004
00:53:18,010 --> 00:53:20,220
And a row for each class k
will be non-zero for the
1005
00:53:20,220 --> 00:53:22,280
elements of that class.
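The block structure can be sketched with a made-up chain holding two ergodic classes that never communicate (the matrix below is an assumption for illustration): P to the n still converges, but rows belonging to different classes converge to different steady-state vectors, each zero outside its own block.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two recurrent classes that never communicate: {0, 1} and {2, 3}.
P = [[0.9, 0.1, 0.0, 0.0],
     [0.2, 0.8, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [0.0, 0.0, 0.5, 0.5]]

Pn = P
for _ in range(200):
    Pn = mat_mul(Pn, P)

# P^n converges, but the rows differ by class, and each row is zero
# outside its own block.
print([round(x, 4) for x in Pn[0]])  # -> [0.6667, 0.3333, 0.0, 0.0]
print([round(x, 4) for x in Pn[2]])  # -> [0.0, 0.0, 0.5, 0.5]
```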
1006
00:53:22,280 --> 00:53:26,350
So then finally, if we
have periodicity.
1007
00:53:26,350 --> 00:53:32,540
So now if we have a periodic
recurrent chain with period d.
1008
00:53:32,540 --> 00:53:34,440
We had the example where it's
just a period of 2.
1009
00:53:34,440 --> 00:53:39,130
So with periodicity, what you
do is you're going to divide
1010
00:53:39,130 --> 00:53:41,770
these classes into d
different subclasses.
1011
00:53:41,770 --> 00:53:44,880
So you have to go
to one state--
1012
00:53:44,880 --> 00:53:50,070
So if there's d states, this is
a period of d, you separate
1013
00:53:50,070 --> 00:53:54,410
or you partition the states into
d of them, d subclasses,
1014
00:53:54,410 --> 00:53:56,190
with a cycle rotation
between them.
1015
00:53:56,190 --> 00:54:00,380
So basically, each time unit,
you have to go from one class
1016
00:54:00,380 --> 00:54:01,910
to the next class.
1017
00:54:01,910 --> 00:54:05,080
And then we do that, then for
each class, you could have the
1018
00:54:05,080 --> 00:54:07,210
limiting-state probability.
1019
00:54:07,210 --> 00:54:11,290
So in other words, you are
looking at this transition
1020
00:54:11,290 --> 00:54:13,460
matrix, P to the d.
1021
00:54:13,460 --> 00:54:15,820
Because when it cycles, it
totally depends on which one
1022
00:54:15,820 --> 00:54:17,960
you start out at.
1023
00:54:17,960 --> 00:54:22,560
But if you look at the d
intervals, then that becomes
1024
00:54:22,560 --> 00:54:24,640
the ergodic class by itself.
1025
00:54:24,640 --> 00:54:27,130
And there are exactly
d of them.
1026
00:54:27,130 --> 00:54:31,020
So the limit as n approaches
infinity of P of nd, this
1027
00:54:31,020 --> 00:54:36,220
thing also exists, but exists in
the subclass sense of there
1028
00:54:36,220 --> 00:54:40,070
are d subclasses if it
has a period of d.
1029
00:54:40,070 --> 00:54:42,570
So that means a steady state
is reached within each
1030
00:54:42,570 --> 00:54:44,640
subclass, but the chain
rotates from
1031
00:54:44,640 --> 00:54:47,240
one subclass to another.
1032
00:54:47,240 --> 00:54:47,950
Yeah, go ahead.
1033
00:54:47,950 --> 00:54:49,410
AUDIENCE: In this case, if we
do a simple check with 1 and
1034
00:54:49,410 --> 00:54:52,700
2, with 1 and 1, it
doesn't converge.
1035
00:54:52,700 --> 00:54:53,570
SHAN-YUAN HO: No, it does.
1036
00:54:53,570 --> 00:54:56,380
It is 1, converges to 1.
1037
00:54:56,380 --> 00:54:58,740
So it's 1, and then it's
going to be 1.
1038
00:54:58,740 --> 00:55:00,970
AUDIENCE: It's 1, 1,
1, 1, 1, 1, 1, 1.
1039
00:55:00,970 --> 00:55:01,500
So you go here?
1040
00:55:01,500 --> 00:55:02,904
Like, it's reached--?
1041
00:55:02,904 --> 00:55:03,372
SHAN-YUAN HO: No, no.
1042
00:55:03,372 --> 00:55:04,790
It converges for here.
1043
00:55:04,790 --> 00:55:08,510
But this d is equal to
2, in that case.
1044
00:55:08,510 --> 00:55:10,534
So you have to do nd,
so you've got
1045
00:55:10,534 --> 00:55:12,200
to look at P squared.
1046
00:55:12,200 --> 00:55:14,536
So if I look at P squared,
I'm always a 1--
1047
00:55:14,536 --> 00:55:16,196
1, 1, 1, 1, 1, 1, 1, 1.
1048
00:55:16,196 --> 00:55:17,380
That's converging.
1049
00:55:17,380 --> 00:55:19,300
The other one is 2,
2, 2, 2, 2, 2.
1050
00:55:19,300 --> 00:55:20,550
That's also converging.
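The period-2 exchange above can be checked directly: for the swap chain P, the powers of P alternate between P and the identity, so P to the n itself never settles down, but the subsequence P to the nd with d = 2 is constant, which is the subclass-wise convergence being described.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Period-2 chain: the two states just swap every step.
P = [[0.0, 1.0],
     [1.0, 0.0]]

P2 = mat_mul(P, P)     # the identity matrix
P3 = mat_mul(P2, P)    # back to P
P4 = mat_mul(P2, P2)   # identity again

# P^n alternates forever, but looking only at even n it is constant.
print(P3 == P, P4 == P2)  # prints: True True
```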
1051
00:55:23,690 --> 00:55:24,150
OK.
1052
00:55:24,150 --> 00:55:26,040
So is there any other questions
about this?
1053
00:55:28,680 --> 00:55:30,050
OK, that's it.
1054
00:55:30,050 --> 00:55:31,300
Thank you.