1
00:00:06,279 --> 00:00:07,380
PROFESSOR: Hi, everyone.
2
00:00:07,380 --> 00:00:08,520
Welcome back.
3
00:00:08,520 --> 00:00:12,390
So today, I'd like to tackle
a problem in Markov matrices.
4
00:00:12,390 --> 00:00:15,220
Specifically, we're going to
start with this problem which
5
00:00:15,220 --> 00:00:17,540
almost has a physics origin.
6
00:00:17,540 --> 00:00:21,240
If we have a particle that
jumps between positions A and B
7
00:00:21,240 --> 00:00:23,470
with the following
probabilities--
8
00:00:23,470 --> 00:00:25,910
I'll just state it-- if
it starts at A and jumps
9
00:00:25,910 --> 00:00:29,110
to B with probability
0.4 or starts at A
10
00:00:29,110 --> 00:00:34,300
and stays at A with probability
0.6, or if it starts at B
11
00:00:34,300 --> 00:00:36,870
then it goes to A
with probability 0.2
12
00:00:36,870 --> 00:00:40,010
or stays at B with
probability 0.8,
13
00:00:40,010 --> 00:00:42,660
we'd like to know the
evolution of the probability
14
00:00:42,660 --> 00:00:45,690
of this particle over
a long period of time.
15
00:00:45,690 --> 00:00:47,110
So specifically
the problem we're
16
00:00:47,110 --> 00:00:49,830
interested in today is:
if we have a particle
17
00:00:49,830 --> 00:00:52,040
and we know that it
starts at position A,
18
00:00:52,040 --> 00:00:53,940
what is the
probability that it is
19
00:00:53,940 --> 00:00:56,150
at position A and
the probability
20
00:00:56,150 --> 00:01:01,890
that it's at position B after
one step, after n steps,
21
00:01:01,890 --> 00:01:04,112
and then finally after an
infinite number of steps?
22
00:01:04,112 --> 00:01:06,320
So I'll let you think about
this problem for a moment
23
00:01:06,320 --> 00:01:07,028
and I'll be back.
24
00:01:17,460 --> 00:01:18,580
Hi everyone.
25
00:01:18,580 --> 00:01:20,370
Welcome back.
26
00:01:20,370 --> 00:01:23,630
So the main difficulty
with this problem
27
00:01:23,630 --> 00:01:25,646
is that it's phrased
as a physics problem.
28
00:01:25,646 --> 00:01:28,020
And we need to convert it into
some mathematical language
29
00:01:28,020 --> 00:01:29,899
to get a handle on it.
30
00:01:29,899 --> 00:01:31,440
So specifically,
what we'd like to do
31
00:01:31,440 --> 00:01:35,670
is to convert this into
a matrix formalism.
32
00:01:35,670 --> 00:01:41,120
So what we can do is we can
write this little graph down
33
00:01:41,120 --> 00:01:44,990
and describe everything in
this graph using a matrix.
34
00:01:44,990 --> 00:01:47,860
So I'm going to
call this matrix A,
35
00:01:47,860 --> 00:01:52,700
and I'm going to associate
the rows of A
36
00:01:52,700 --> 00:01:58,580
with particle position A
and particle position B.
37
00:01:58,580 --> 00:02:06,320
And I'll associate the
first and second columns
38
00:02:06,320 --> 00:02:08,990
with particle
positions A and B.
39
00:02:08,990 --> 00:02:10,490
And then what I'm
going to do is I'm
40
00:02:10,490 --> 00:02:14,040
going to fill in this
matrix with the probability
41
00:02:14,040 --> 00:02:15,270
distributions.
42
00:02:15,270 --> 00:02:18,490
So, specifically what's going
to go in this top left corner?
43
00:02:18,490 --> 00:02:24,350
Well, the number 0.6, which
represents the probability
44
00:02:24,350 --> 00:02:26,870
that I stay at position
A, given that I
45
00:02:26,870 --> 00:02:30,780
start at position A.
What's going to go here
46
00:02:30,780 --> 00:02:33,060
in the bottom left-hand corner?
47
00:02:33,060 --> 00:02:39,890
Well, we're going to put 0.4,
because this is the probability
48
00:02:39,890 --> 00:02:45,230
that I wind up at B, given that
I start at A. And then lastly,
49
00:02:45,230 --> 00:02:50,490
we'll fill in these other two
entries, the second column,
50
00:02:50,490 --> 00:02:56,690
with 0.2 on top and 0.8 below.
51
00:02:56,690 --> 00:02:58,340
So I'll just state
briefly this is
52
00:02:58,340 --> 00:03:00,210
what's called a Markov matrix.
53
00:03:00,210 --> 00:03:03,600
And it's called Markov, because
first off, every element
54
00:03:03,600 --> 00:03:06,460
is positive or 0.
55
00:03:06,460 --> 00:03:11,840
And secondly, the sum of the
elements in each column is 1.
56
00:03:11,840 --> 00:03:16,520
So if we note 0.4 plus 0.6
is 1, 0.8 plus 0.2 is 1.
57
00:03:16,520 --> 00:03:18,490
And these matrices
come up all the time
58
00:03:18,490 --> 00:03:20,890
when we're talking about
probabilities and the evolution
59
00:03:20,890 --> 00:03:23,530
of probability distributions.
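As a quick check of the two Markov properties just stated, here is a minimal Python sketch. It is not part of the lecture, and the helper name `is_markov` is hypothetical. It assumes the column convention used here: column 0 is "starts at A", column 1 is "starts at B".

```python
# Transition matrix from the lecture: each column lists where the
# particle can go from one starting position.
A = [
    [0.6, 0.2],  # probability of ending at A (from A, from B)
    [0.4, 0.8],  # probability of ending at B (from A, from B)
]

def is_markov(matrix, tol=1e-12):
    """Return True if every entry is >= 0 and every column sums to 1."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    nonnegative = all(entry >= 0 for row in matrix for entry in row)
    column_sums = [sum(matrix[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
    return nonnegative and all(abs(s - 1.0) < tol for s in column_sums)

print(is_markov(A))
```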
60
00:03:23,530 --> 00:03:24,110
OK.
61
00:03:24,110 --> 00:03:28,250
So now, once we've encoded
this graph using this matrix A,
62
00:03:28,250 --> 00:03:30,480
we now want to
tackle this problem.
63
00:03:30,480 --> 00:03:35,394
So I'm going to
introduce the vector p,
64
00:03:35,394 --> 00:03:36,810
and I'm going to
use a subscript 0
65
00:03:36,810 --> 00:03:41,630
to denote the probability
distribution of the particle at time 0.
66
00:03:41,630 --> 00:03:50,400
So we're told that the
particle starts at position A.
67
00:03:50,400 --> 00:03:55,190
So at time 0, I'm going
to use the vector [1, 0].
68
00:03:55,190 --> 00:04:03,810
Again, I'm going to match the
top component of this vector
69
00:04:03,810 --> 00:04:06,310
with the first row
of this matrix
70
00:04:06,310 --> 00:04:09,700
and the first column
of this matrix.
71
00:04:09,700 --> 00:04:12,120
And then likewise, the second
component of this vector
72
00:04:12,120 --> 00:04:17,480
with the second row and
second column of this matrix.
73
00:04:17,480 --> 00:04:19,670
And we're interested
in: how does
74
00:04:19,670 --> 00:04:26,620
this probability evolve as
the particle takes many steps?
75
00:04:26,620 --> 00:04:35,610
So for one step, what's the
probability of the particle
76
00:04:35,610 --> 00:04:36,770
going to be?
77
00:04:36,770 --> 00:04:41,150
Well, this is the beauty of
introducing matrix notation.
78
00:04:41,150 --> 00:04:44,510
I'm going to denote p_1
to be the probability
79
00:04:44,510 --> 00:04:48,160
of the particle after one step.
80
00:04:48,160 --> 00:04:53,080
And we see that we can write
this as the matrix A multiplied
81
00:04:53,080 --> 00:04:54,160
by p_0.
82
00:04:58,030 --> 00:05:07,570
So the answer is 0.6 and 0.4.
83
00:05:07,570 --> 00:05:09,680
And I achieve this just
by multiplying this matrix
84
00:05:09,680 --> 00:05:11,291
by this vector.
85
00:05:11,291 --> 00:05:11,790
OK?
86
00:05:11,790 --> 00:05:13,050
So this concludes part one.
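Part one can be reproduced with a few lines of Python. This is a minimal sketch, not the lecturer's own code, and the helper `mat_vec` is a hypothetical name for a plain matrix-vector product.

```python
# Transition matrix and starting distribution from the lecture.
A = [[0.6, 0.2],
     [0.4, 0.8]]
p0 = [1.0, 0.0]  # the particle starts at A with certainty

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

p1 = mat_vec(A, p0)  # distribution after one step
print(p1)
```

Multiplying by [1, 0] simply picks out the first column of A, which is why the answer is 0.6 and 0.4.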
87
00:05:16,770 --> 00:05:19,240
Now part two is a
little trickier.
88
00:05:19,240 --> 00:05:20,435
So part two is n steps.
89
00:05:26,510 --> 00:05:28,340
And to tackle this
problem, we need
90
00:05:28,340 --> 00:05:30,750
to use a little more machinery.
91
00:05:30,750 --> 00:05:35,550
So first off, I'm going to
note that p_1 is A times p_0.
92
00:05:38,620 --> 00:05:42,302
Likewise, p_2-- so
this is the position
93
00:05:42,302 --> 00:05:44,510
of the-- the probability
distribution of the particle
94
00:05:44,510 --> 00:05:45,940
after two steps.
95
00:05:45,940 --> 00:05:52,860
This is A times p_1, which
is A squared times p_0.
96
00:05:52,860 --> 00:05:55,000
And we note that
there's a general trend.
97
00:05:55,000 --> 00:06:01,070
After n steps-- so
p_n-- the general trend
98
00:06:01,070 --> 00:06:05,970
is, it's going to be this matrix
A raised to the n-th power,
99
00:06:05,970 --> 00:06:09,230
multiplying the vector p_0.
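The pattern above, where each step multiplies the current distribution by A once more, can be sketched directly as a loop. This is an illustrative brute-force version, assuming the matrix and starting vector from the lecture. The function name `step_n` is hypothetical.

```python
A = [[0.6, 0.2],
     [0.4, 0.8]]

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def step_n(matrix, p0, n):
    """Apply the matrix n times to p0, i.e. compute (matrix^n) p0."""
    p = list(p0)
    for _ in range(n):
        p = mat_vec(matrix, p)
    return p

p2 = step_n(A, [1.0, 0.0], 2)  # same as A times (A times p_0)
print(p2)
```

The loop costs n matrix-vector products. The eigenvalue approach that follows gives a formula valid for every n at once.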
100
00:06:09,230 --> 00:06:11,920
So how do we take the
n-th power of a matrix?
101
00:06:11,920 --> 00:06:16,760
Well, this is where we use
eigenvectors and eigenvalues.
102
00:06:16,760 --> 00:06:27,320
So recall, that we can take any
matrix A that's diagonalizable
103
00:06:27,320 --> 00:06:35,360
and write it as U D U inverse,
where D is a diagonal matrix
104
00:06:35,360 --> 00:06:39,180
and this matrix U is a matrix
whose columns correspond
105
00:06:39,180 --> 00:06:41,242
to the eigenvectors of A.
106
00:06:41,242 --> 00:06:42,700
So for this problem,
I'm just going
107
00:06:42,700 --> 00:06:44,830
to state what the eigenvalues
and eigenvectors are.
108
00:06:44,830 --> 00:06:48,050
And I'll let you work them out.
109
00:06:48,050 --> 00:06:52,800
So because it's a
Markov matrix, we always
110
00:06:52,800 --> 00:06:55,780
have an eigenvalue which is 1.
111
00:06:55,780 --> 00:07:01,130
And in this case, we have an
eigenvector u which is [1, 2].
112
00:07:04,160 --> 00:07:11,440
In addition, the second
eigenvalue is 0.4.
113
00:07:11,440 --> 00:07:14,520
And the eigenvector
corresponding to this one
114
00:07:14,520 --> 00:07:17,070
is [1, -1].
115
00:07:17,070 --> 00:07:19,750
And I'll just call these
u_1 and u_2, like that.
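The stated eigenpairs can be checked by confirming that A times each vector equals the eigenvalue times that vector. A minimal sketch, with the hypothetical helper name `is_eigenpair`:

```python
A = [[0.6, 0.2],
     [0.4, 0.8]]

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def is_eigenpair(matrix, eigenvalue, vector, tol=1e-12):
    """Check that matrix @ vector == eigenvalue * vector within tol."""
    left = mat_vec(matrix, vector)
    right = [eigenvalue * x for x in vector]
    return all(abs(l - r) < tol for l, r in zip(left, right))

u1 = [1.0, 2.0]   # eigenvector for eigenvalue 1
u2 = [1.0, -1.0]  # eigenvector for eigenvalue 0.4

print(is_eigenpair(A, 1.0, u1), is_eigenpair(A, 0.4, u2))
```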
116
00:07:30,670 --> 00:07:37,420
OK, we can now write this
big matrix U with columns [1, 2], [1, -1].
117
00:07:41,580 --> 00:07:44,452
D is going to be-- now I
have to match things up.
118
00:07:44,452 --> 00:07:46,160
If I'm going to put
the first eigenvector
119
00:07:46,160 --> 00:07:51,750
in the first column, we have
to stick 1 in the first column
120
00:07:51,750 --> 00:07:57,620
as well and then 0.4 like this.
121
00:07:57,620 --> 00:08:01,410
And then lastly, we also have
U inverse which I can just
122
00:08:01,410 --> 00:08:16,670
work out to be minus 1/3,
one over the determinant,
123
00:08:16,670 --> 00:08:29,740
times -1, -1; -2, and 1,
which simplifies to this.
124
00:08:29,740 --> 00:08:40,860
OK, so now if we take A and
raise it to the power of n,
125
00:08:40,860 --> 00:08:43,809
we have this nice identity
that all the U and U inverses
126
00:08:43,809 --> 00:08:46,040
collapse in the middle.
127
00:08:46,040 --> 00:08:51,900
And we're left with U, D
to the n, U inverse, p_0.
128
00:08:54,460 --> 00:08:57,810
Now raising a diagonal
matrix to the power of n
129
00:08:57,810 --> 00:08:59,400
is a relatively
simple thing to do.
130
00:08:59,400 --> 00:09:03,655
We just take the eigenvalues and
raise them to the power of n.
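To see the collapse of the U and U inverse factors numerically, the following sketch compares U D^n U^{-1} with A multiplied by itself n times. It assumes U has the eigenvectors [1, 2] and [1, -1] as columns. The function names are hypothetical.

```python
def mat_mul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

A = [[0.6, 0.2],
     [0.4, 0.8]]
U = [[1.0, 1.0],
     [2.0, -1.0]]        # columns are the eigenvectors [1, 2] and [1, -1]
U_inv = [[1 / 3, 1 / 3],
         [2 / 3, -1 / 3]]  # inverse of U; the determinant of U is -3

def power_via_eig(n):
    """Compute A^n as U D^n U^{-1}, powering only the diagonal entries."""
    D_n = [[1.0 ** n, 0.0],
           [0.0, 0.4 ** n]]
    return mat_mul(mat_mul(U, D_n), U_inv)

def power_via_products(n):
    """Compute A^n by multiplying A together n times."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = mat_mul(A, result)
    return result

print(power_via_eig(5))
print(power_via_products(5))
```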
131
00:09:03,655 --> 00:09:05,780
So when we compute this
product, there's a question
132
00:09:05,780 --> 00:09:07,720
of what order do we do things?
133
00:09:07,720 --> 00:09:09,930
Now these are 2 by 2
matrices, so in theory we
134
00:09:09,930 --> 00:09:11,830
could just multiply
out, 2 by 2 matrix, 2
135
00:09:11,830 --> 00:09:15,050
by 2 matrix, 2 by 2 matrix, and
then on a vector which is a 2
136
00:09:15,050 --> 00:09:16,540
by 1 matrix.
137
00:09:16,540 --> 00:09:19,030
But if you're in a test and
you're cramped for time,
138
00:09:19,030 --> 00:09:21,574
you want to do as few
computations as possible.
139
00:09:21,574 --> 00:09:22,990
So what you want
to do is you want
140
00:09:22,990 --> 00:09:28,170
to start on the right-hand
side and then work backwards.
141
00:09:28,170 --> 00:09:40,170
So if we do this, we
end up obtaining U times D,
142
00:09:40,170 --> 00:09:47,795
where D is raised to the
power of n, times 1/3 [1, 2].
143
00:09:51,870 --> 00:09:53,520
OK, so for this
last part, I'm just
144
00:09:53,520 --> 00:09:55,570
going to write down
the final answer.
145
00:09:55,570 --> 00:09:59,890
And I'll let you work out the
multiplication of matrices.
146
00:09:59,890 --> 00:10:15,650
So we have for p_n: 1/3 times [2 times
0.4 to the n plus 1, -2 times 0.4
147
00:10:15,650 --> 00:10:21,430
to the n plus 2].
148
00:10:21,430 --> 00:10:26,160
And this is the final
vector for p of n.
149
00:10:26,160 --> 00:10:27,980
So this finishes up Part 2.
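The closed form for p_n can be sanity-checked against brute-force stepping. This is a minimal sketch with hypothetical function names, assuming the start vector [1, 0] used in the lecture.

```python
def p_closed_form(n):
    """p_n = (1/3) * [2 * 0.4**n + 1, -2 * 0.4**n + 2]."""
    r = 0.4 ** n
    return [(2 * r + 1) / 3, (-2 * r + 2) / 3]

def p_iterated(n):
    """p_n obtained by applying the transition matrix n times to [1, 0]."""
    A = [[0.6, 0.2],
         [0.4, 0.8]]
    p = [1.0, 0.0]
    for _ in range(n):
        p = [A[0][0] * p[0] + A[0][1] * p[1],
             A[1][0] * p[0] + A[1][1] * p[1]]
    return p

for n in (0, 1, 2, 10):
    print(n, p_closed_form(n), p_iterated(n))
```

At n = 0 the formula returns the starting vector [1, 0], and at n = 1 it gives 0.6 and 0.4, matching part one.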
150
00:10:27,980 --> 00:10:30,620
And then lastly,
for Part 3, what
151
00:10:30,620 --> 00:10:33,930
happens when n goes to infinity?
152
00:10:33,930 --> 00:10:36,740
Well, we have the
answer for any n.
153
00:10:36,740 --> 00:10:39,410
So we can just take the
limit as n goes to infinity.
154
00:10:39,410 --> 00:10:41,950
Now, specifically as
n goes to infinity,
155
00:10:41,950 --> 00:10:46,040
0.4 raised to some very
large power vanishes.
156
00:10:46,040 --> 00:10:48,150
So these two terms drop off.
157
00:10:48,150 --> 00:10:52,550
And at the end of the day,
we're left with p_infinity
158
00:10:52,550 --> 00:10:58,130
is 1/3 [1, 2].
159
00:10:58,130 --> 00:10:59,240
OK?
160
00:10:59,240 --> 00:11:04,430
So just to recap, we started off
with a particle starting at A,
161
00:11:04,430 --> 00:11:08,550
and then after a very
long time, the particle
162
00:11:08,550 --> 00:11:11,560
winds up with a
probability distribution
163
00:11:11,560 --> 00:11:16,030
which is 1/3 times [1, 2].
164
00:11:16,030 --> 00:11:20,290
And this is quite characteristic
of Markov chains.
165
00:11:20,290 --> 00:11:24,290
Specifically, we note
that 1/3 * [1, 2]
166
00:11:24,290 --> 00:11:31,340
is a multiple of the eigenvector
corresponding to eigenvalue 1.
167
00:11:31,340 --> 00:11:34,667
So even though the particle
started at position A,
168
00:11:34,667 --> 00:11:36,250
after a long period
of time, it tended
169
00:11:36,250 --> 00:11:40,130
to forget where it started
and approached, diffused
170
00:11:40,130 --> 00:11:43,130
into this steady-state distribution.
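The "forgetting" behaviour can be illustrated by evolving several different starting distributions under the same matrix. This is a sketch only. The starting points other than [1, 0] and the choice of 50 steps are assumptions made for illustration, not values from the lecture.

```python
A = [[0.6, 0.2],
     [0.4, 0.8]]

def evolve(p, steps):
    """Apply the transition matrix A to the distribution p `steps` times."""
    for _ in range(steps):
        p = [A[0][0] * p[0] + A[0][1] * p[1],
             A[1][0] * p[0] + A[1][1] * p[1]]
    return p

# Start at A, start at B, and start with an even split.
for start in ([1.0, 0.0], [0.0, 1.0], [0.5, 0.5]):
    print(start, "->", evolve(start, 50))
```

Each run lands near [1/3, 2/3], because the component along the second eigenvector is multiplied by 0.4 at every step and dies out.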
171
00:11:43,130 --> 00:11:43,860
OK.
172
00:11:43,860 --> 00:11:45,470
I'd like to finish up here.
173
00:11:45,470 --> 00:11:47,680
And I'll see you next time.