The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: Today we're going to study stochastic processes, and among them one particular type: discrete time. We'll focus on discrete time, and I'll talk about what that is right now.

So a stochastic process is a collection of random variables indexed by time. A very simple definition. So we have either random variables indexed by discrete times, let's start from 0, so X_0, X_1, X_2, and so on, or we have random variables X_t given for all continuous times t. So the time variable can be discrete, or it can be continuous. The first ones we'll call discrete-time stochastic processes, and the second ones continuous-time.

So for example, a discrete-time stochastic process can be something like this, and so on. So these are the values X_0, X_1, X_2, X_3, and so on, and they are random variables. This is just one realization of the stochastic process.
But all these values are supposed to be random. And then a continuous-time stochastic process can be something like that. And it doesn't have to be continuous, so it can jump, and jump again, and so on. And all these values are random values.

So that's just a very informal description. A slightly different point of view, which is preferred when you want to do some math with it, is this alternative definition: a stochastic process is a probability distribution over paths, over a space of paths. So you have a bunch of possible paths that you can take, and you're given some probability distribution over them. And then one draw from that distribution is one realization; another realization will look somewhat different, and so on.

So the first definition, a collection of random variables indexed by time, is the more intuitive one. But the second one, if you want to do some math with it, from the formal point of view, will be more helpful. And you'll see why that's the case later. So let me show you some more examples.
For example, here is one way to describe a stochastic process. Let me show you three stochastic processes.

Number one: f(t) = t, with probability 1.

Number two: f(t) = t for all t, with probability 1/2, or f(t) = -t for all t, with probability 1/2.

And the third one: for each t, f(t) = t or f(t) = -t, each with probability 1/2, independently.

The first one is quite easy to picture. There's really nothing random in here; this happens with probability 1. Your path just is f(t) = t. And we're only looking at t greater than or equal to 0 here. So that's number one.

Number two, it's either this line or that line. So it is a stochastic process. If you think about it as a collection of random variables, it doesn't really look like a stochastic process. But under the alternative definition, you have two possible paths that you can take: you either take this path, with probability 1/2, or that path, with probability 1/2. Now, at each point t, your value X(t) is a random variable. It's either t or minus t.
And it's the same for all t. But the values are dependent on each other: if you know one value, you automatically know all the other values.

And the third one is even more interesting. Now, for each t, we get rid of this dependency. So what you'll have is these two lines going on; at every single point, you'll be either on the top one or the bottom one. But if you really want to draw the picture, it will bounce back and forth, up and down, infinitely often, and it'll just look like two lines.

So I hope this gives you some feeling about stochastic processes, at least a tiny bit of why we want to describe them in terms of this language. Any questions?

So, when you use a stochastic process to model something going on in real life, like a stock price, usually what happens is you stand at time t. You know all the values in the past, and you don't know the future. But you want to know something about it; you want to draw some intelligent conclusion, some intelligent information about the future, based on the past.

For this first stochastic process, it's easy.
No matter where you stand, you know exactly what's going to happen in the future. For the second one, it's also the same. Even though it's random, once you know what happened at some point, you know it has to be this line, if you're here, or that line, if you're there.

But the third one is slightly different. No matter what you know about the past, even if you know all the values in the past, it gives you essentially no information about the future. Though it's not quite true if I say no information at all: we know that each value has to be t or minus t. You just don't know which one it is.

So when you're given a stochastic process and you're standing at some time, you don't know what the future is, but most of the time you have at least some level of control, given by the probability distribution. Here, you can really determine the line. There, the probability distribution at each point only gives t or minus t, so you know each value will be one of those two points, but you don't know more than that.
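As a quick illustration (this code is a sketch, not part of the lecture, and the function names are made up), the three example processes can be simulated at integer times t = 0, 1, 2, ...:

```python
import random

def process1(T):
    # Number 1: f(t) = t with probability 1. Nothing random.
    return [t for t in range(T)]

def process2(T):
    # Number 2: a single coin flip picks the whole path,
    # f(t) = t for all t, or f(t) = -t for all t, each with probability 1/2.
    sign = random.choice([1, -1])
    return [sign * t for t in range(T)]

def process3(T):
    # Number 3: an independent coin flip at every time,
    # f(t) = t or f(t) = -t, each with probability 1/2.
    return [random.choice([1, -1]) * t for t in range(T)]
```

In process2, one observed value determines the entire path; in process3, the past values tell you nothing about the future signs, which is exactly the distinction being made above.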
So the study of stochastic processes is, basically, you look at the given probability distribution, and you want to say something intelligent about the future as t goes on. And there are three types of questions that we mainly study here.

(a) The first type is: what are the dependencies in the sequence of values? For example, if you know the price of a stock on all past dates, up to today, can you say anything intelligent about the future stock prices? Those types of questions.

(b) What is the long-term behavior of the sequence? So think about the law of large numbers that we talked about last time, or the central limit theorem.

(c) And the third type, this one is less relevant for our course, but still, I'll just write it down: what are the boundary events? How often will something extreme happen? Like, how often will a stock price drop by more than 10% for 5 consecutive days, those kinds of events. How often will that happen?
And for a different example, if you model a call center, you might want to know, over a period of time, the probability that at least 90% of the phones are idle, those kinds of things.

So that was the introduction. Any questions?

Now, there are really lots of stochastic processes. One of the most important ones is the simple random walk. So today, I will focus on discrete-time stochastic processes. Later in the course, we'll go on to continuous-time stochastic processes, and then you'll see things like Brownian motion, and Ito's lemma, and all those things will appear later. Right now, we'll study discrete time, and later you'll see that the two theories are really parallel. For this simple random walk, you'll see the corresponding thing in continuous-time stochastic processes later. I think it's easier to understand discrete-time processes; that's why we start with them. But later, it will really help if you understand them well, because for continuous time, all the knowledge will just carry over.

So what is a simple random walk?
Let Y_i be i.i.d., independent identically distributed, random variables taking values 1 or minus 1, each with probability 1/2. Then define, for each time t, X_t as the sum of the Y_i, from i = 1 to t, with X_0 equal to 0. Then the sequence of random variables X_0, X_1, X_2, and so on is called a one-dimensional simple random walk. But I'll just refer to it as a simple random walk, or a random walk. That's the definition.

Let's try to plot it. At time 0, we start at 0. Then, depending on the value of Y_1, you will either go up or go down. Let's say we went up; that's at time 1. Then at time 2, depending on your value of Y_2, you will either go up one step from there or go down one step from there. Let's say we went up again, then down, then up, up, something like that. And it continues.

Another way to look at it, the reason we call it a random walk, is this: if you just plot your values of X_t, over time, on a line, then you start at 0, and you go to the right, right, left, right, right, left, left, left.
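The definition can be turned into code directly (a sketch, not part of the lecture):

```python
import random

def simple_random_walk(t_max, seed=None):
    """One realization X_0, X_1, ..., X_{t_max} of the simple random walk."""
    rng = random.Random(seed)
    path = [0]                       # X_0 = 0 by definition
    for _ in range(t_max):
        y = rng.choice([1, -1])      # Y_i = +1 or -1, each with probability 1/2
        path.append(path[-1] + y)    # X_t = X_{t-1} + Y_t
    return path
```

Calling it repeatedly gives different realizations, just as each run of the coin tosses gives a different trajectory.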
So the trajectory is like a walk you take on this line, but it's random. Each time, you go to the right or left, right or left. So those were two representations. The first picture looks a little bit more clear; here on the line, I just lost everything I drew. Something like that is the trajectory.

So from what we learned last time, we can already say something intelligent about the simple random walk. For example, if you apply the central limit theorem to this sequence, what is the information you get? Over a long time, let's say t is very far away, a huge number, a very large number, what can you say about the distribution of X_t?

AUDIENCE: Is it close to 0?

PROFESSOR: Close to 0. But by close to 0, what do you mean? There should be a scale. I mean, some would say that 1 is close to 0; some people would say that 100 is close to 0. So do you have some degree of how close it will be to 0? Anybody?

AUDIENCE: So the variance will be small.

PROFESSOR: Sorry?

AUDIENCE: The variance will be small.
PROFESSOR: Variance will be small. About how much will the variance be?

AUDIENCE: 1 over n.

PROFESSOR: 1 over n? 1 over t?

AUDIENCE: Over t.

PROFESSOR: 1 over t? Anybody else want to offer something different?

AUDIENCE: [INAUDIBLE]

PROFESSOR: 1 over square root of t, probably.

AUDIENCE: [INAUDIBLE]

AUDIENCE: The variance would be [INAUDIBLE].

PROFESSOR: Oh, you're right, sorry. Variance will be 1 over t, and the standard deviation will be 1 over square root of t. What I'm saying is, by the central limit theorem.

AUDIENCE: [INAUDIBLE] Are you looking at the sums, or are you looking at the?

PROFESSOR: I'm looking at X_t. Ah, that's a very good point. t and square root of t. Thank you.

AUDIENCE: That's very different.

PROFESSOR: Yeah, very, very different. I was confused. Sorry about that.
The reason is because we saw last time that 1 over the square root of t times X_t, if t is really, really large, is close to the standard normal distribution N(0,1). So X_t over the square root of t will look like a standard normal. That means the value at time t will be distributed like a normal distribution with mean 0 and variance t, so with standard deviation square root of t. So what you said was right: it's close to 0, and the scale you're looking at is about the square root of t. So it won't go too far away from 0.

That means, if you draw these two curves, square root of t and minus square root of t, your simple random walk, on a very large scale, won't go too far away from these two curves. Even though the extreme values it can take (I didn't draw it correctly) are t and minus t, because all steps can be 1 or all steps can be minus 1. Even though, theoretically, you can be that far away from your x-axis, in reality, what's going to happen is you're going to stay really close to these curves. You're going to play within this area, mostly.

AUDIENCE: I think that [INAUDIBLE].
PROFESSOR: So, yeah, that was a very vague statement: you won't deviate too much. To say a bit more, if you take 100 times the square root of t, you will be inside that interval like 90% of the time; if you take it to be 10,000 times the square root of t, almost 99.9% of the time, or something like that.

And there's even a theorem saying you will hit these two curves infinitely often. So if you go over a very, very long period of time, if you live long enough, then even if you go down here, even if, in this picture, you might think that in some cases you always play in the negative region, there's a theorem saying that that's not the case. With probability 1, as you go to infinity, you will cross this axis infinitely often. And in fact, you will meet these two curves infinitely often.

So those are some interesting things about the simple random walk. Really, there are a lot more interesting things, but I'm just giving an overview in this course. Unfortunately, I can't talk about all of this fun stuff. But let me still try to show you some properties and one nice computation on it.
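The square-root-of-t scale can be checked empirically. This sketch (not part of the lecture) estimates the standard deviation of X_t over many independent walks, which should come out near sqrt(t):

```python
import math
import random

def sample_std_of_walk(t, n_trials, seed=0):
    # Estimate the standard deviation of X_t from n_trials independent walks.
    rng = random.Random(seed)
    endpoints = []
    for _ in range(n_trials):
        x = 0
        for _ in range(t):
            x += rng.choice([1, -1])   # one +/-1 step of the walk
        endpoints.append(x)
    mean = sum(endpoints) / n_trials
    var = sum((x - mean) ** 2 for x in endpoints) / n_trials
    return math.sqrt(var)
```

For t = 400 the estimate lands near sqrt(400) = 20, even though the theoretical extremes are plus or minus 400, which illustrates why the walk stays inside the square-root curves.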
So, some properties of a simple random walk. First, the expectation of X_k is equal to 0. That's really easy to prove.

The second important property is called independent increments. If you look at times t_0 <= t_1 <= ... <= t_k, then the random variables X_(t_(i+1)) minus X_(t_i) are mutually independent. So what this says is, if you look at what happens from time 1 to 10, that is irrelevant to what happens from time 20 to 30. And that can easily be shown from the definition. I won't do it here, but try to do it as an exercise.

The third one is called stationarity. That means, for all h >= 1 and t >= 0, the distribution of X_(t+h) minus X_t is the same as the distribution of X_h. And again, this easily follows from the definition. What it says is, if you look at an interval of the same length, then the distribution of what happens inside that interval is irrelevant of your starting point; the distribution is the same. And moreover, from the second property, if the intervals do not overlap, they're independent. So those are the two main properties we're talking about here.
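Both properties can be seen in a short simulation (a sketch, not part of the lecture). To sample the increment X_(t+h) - X_t, only the h coin flips inside the interval matter; the first t steps can be generated and thrown away, which makes stationarity explicit. The sample variance of the increment should come out near h, the variance of X_h:

```python
import random

def increment_samples(t, h, n_trials, seed=1):
    # Sample X_{t+h} - X_t over many independent walks.
    rng = random.Random(seed)
    samples = []
    for _ in range(n_trials):
        for _ in range(t):
            rng.choice([1, -1])    # the first t steps; their values never enter
        inc = sum(rng.choice([1, -1]) for _ in range(h))  # the h steps in the interval
        samples.append(inc)
    return samples
```

The increment is just a fresh sum of h independent plus-or-minus-1 steps, no matter what t is.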
And you'll see these properties appearing again and again, because stochastic processes having these properties are really good, in some sense; they are fundamental stochastic processes. And the simple random walk is like the fundamental stochastic process.

So let's try to see one interesting problem about the simple random walk. For example, you play a game, a coin toss game. I play with, let's say, Peter. I bet $1 at each turn, and then Peter tosses a fair coin. It's either heads or tails. If it's heads, he wins the $1. If it's tails, I win the $1. So from my point of view, in this coin toss game, at each turn my balance goes up by $1 or down by $1.

Now, let's say I started from a $0.00 balance, even though that's not possible. Then my balance will exactly follow the simple random walk, assuming that the coin is a fair coin, a 50-50 chance. So my balance is a simple random walk.

And then I say the following: you know what, I'm going to play. I want to make money.
So let's say I play until I win $100 or I lose $100. What is the probability that I will stop after winning $100?

AUDIENCE: 1/2.

PROFESSOR: 1/2, because?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yes. So each happens with probability 1/2, and this is by symmetry: every chain of coin tosses which gives a winning sequence, when you flip it, gives a losing sequence. We have a one-to-one correspondence between those two things. That was good.

Now let me change it. What if I play until I win $100 or I lose $50? In other words, I look at the random walk, I look at the first time that it hits either this line or that line, and then I stop. What is the probability that I will stop after winning $100?

AUDIENCE: [INAUDIBLE]

PROFESSOR: 1/3? Let me see. Why 1/3?

AUDIENCE: [INAUDIBLE]

PROFESSOR: So you're saying the probability of hitting this line first is p, and the probability that you hit that line first is also p, right?
It's 1/2 and 1/2. But then you're saying that from there, it's the same again, so it should be 1/4 here, 1/2 times 1/2. You've got a good intuition. It is 1/3, actually.

AUDIENCE: [INAUDIBLE]

PROFESSOR: And then once you hit it, it's like the same afterwards? I'm not sure if there is a way to make an argument out of that. I really don't know; there might be or there might not be. I was thinking of a different way, but there might be a way to make an argument out of it. I just don't see it right now.

So in general, if you put a line at B above and a line at minus A below, then the probability of hitting B first is A over (A + B), and the probability of hitting minus A first is B over (A + B). So, in this case, with 100 and 50, it's 100 over 150: that one is 2/3 and this one is 1/3.

This can be proved, and it's actually not that difficult to prove; it's just hard to find the right way to look at it. So fix your B and A.
For each k between minus A and B, define f(k) as the probability that you hit the line B first, before hitting minus A, when you start at k. This kind of builds on what you were saying: now, instead of looking at one fixed starting point, we're going to change our starting point and look at all possibilities. So when you start at k, f(k) is the probability that you hit this line first, before hitting that line. What we are interested in is computing f(0).

What we know is that f(B) is equal to 1, and f(minus A) is equal to 0. And then there's one recursive formula that matters to us. If you start at k, you either go up or go down. You go up with probability 1/2, and you go down with probability 1/2, and then the process starts afresh, because of the stationarity property. So if you go up, the probability that you then hit B first is exactly f(k+1). If you go down, it's f(k-1).
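Putting the two cases together gives f(k) = (1/2) f(k+1) + (1/2) f(k-1), with f(-A) = 0 and f(B) = 1. This boundary-value recursion can be solved numerically; the sketch below (not part of the lecture) simply relaxes the interior values until they converge:

```python
def hit_probability(A, B, sweeps=20000):
    # f[i] is the probability of reaching +B before -A, starting from k = i - A
    # (index 0 corresponds to k = -A, index A + B corresponds to k = B).
    f = [0.0] * (A + B + 1)
    f[A + B] = 1.0                    # boundary values: f(-A) = 0, f(B) = 1
    for _ in range(sweeps):
        for i in range(1, A + B):     # update interior points only
            f[i] = 0.5 * (f[i + 1] + f[i - 1])
    return f[A]                       # starting balance 0 sits at index A
```

For example, hit_probability(50, 100) comes out at about 1/3 and hit_probability(100, 100) at 1/2, matching the answers above. In fact the recursion forces f to be linear in k, which is exactly why f(0) = A / (A + B).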
445 00:31:06,320 --> 00:31:08,510 And then that gives you a recursive formula 446 00:31:08,510 --> 00:31:09,990 with two boundary values. 447 00:31:09,990 --> 00:31:12,970 If you look at it, you can solve it. 448 00:31:12,970 --> 00:31:17,180 When you solve it, you'll get that answer. 449 00:31:17,180 --> 00:31:20,070 So I won't go into details, but what I wanted to show 450 00:31:20,070 --> 00:31:23,770 is that simple random walk really has these two 451 00:31:23,770 --> 00:31:24,840 properties. 452 00:31:24,840 --> 00:31:28,120 It has these properties and even more powerful ones. 453 00:31:28,120 --> 00:31:30,200 So it's really easy to control. 454 00:31:30,200 --> 00:31:32,080 And at the same time it's quite universal. 455 00:31:32,080 --> 00:31:36,790 It's not a very weak model. 456 00:31:36,790 --> 00:31:43,280 It's rather restricted, but it's a really good model 457 00:31:43,280 --> 00:31:46,880 for a mathematician. 458 00:31:46,880 --> 00:31:49,300 From the practical point of view, 459 00:31:49,300 --> 00:31:53,560 you'll have to twist some things slightly and so on. 460 00:31:53,560 --> 00:31:56,800 But in many cases, you can approximate it 461 00:31:56,800 --> 00:31:59,830 by simple random walk. 462 00:31:59,830 --> 00:32:04,885 And as you can see, you can do computations 463 00:32:04,885 --> 00:32:06,500 with simple random walk by hand. 464 00:32:10,500 --> 00:32:11,985 So that was it. 465 00:32:11,985 --> 00:32:14,119 I talked about the most important example 466 00:32:14,119 --> 00:32:15,035 of a stochastic process. 467 00:32:18,620 --> 00:32:23,780 Now, let's talk about more stochastic processes. 468 00:32:27,629 --> 00:32:31,701 The second one is called the Markov chain. 469 00:32:31,701 --> 00:32:34,376 Let me write that part, actually. 470 00:32:49,550 --> 00:32:52,180 So a Markov chain, unlike the simple random walk, 471 00:32:52,180 --> 00:32:53,775 is not a single stochastic process.
472 00:32:56,490 --> 00:33:00,000 A stochastic process is called a Markov chain 473 00:33:00,000 --> 00:33:02,110 if it has some property. 474 00:33:02,110 --> 00:33:05,690 And what we want to capture in a Markov chain 475 00:33:05,690 --> 00:33:09,690 is the following statement. 476 00:33:09,690 --> 00:33:17,090 This is a collection of stochastic processes having 477 00:33:17,090 --> 00:33:32,660 the property that the effect of the past on the future 478 00:33:32,660 --> 00:33:39,077 is summarized only by the current state. 479 00:33:45,760 --> 00:33:48,840 That's quite a vague statement. 480 00:33:48,840 --> 00:33:59,620 But what we're trying to capture here is-- now, 481 00:33:59,620 --> 00:34:05,840 look at some generic stochastic process at time t. 482 00:34:05,840 --> 00:34:08,260 You know all the history up to time t. 483 00:34:08,260 --> 00:34:12,280 You want to say something about the future. 484 00:34:12,280 --> 00:34:14,949 Then, if it's a Markov chain, what it's saying is, 485 00:34:14,949 --> 00:34:17,699 you don't even have to know all about this. 486 00:34:17,699 --> 00:34:19,199 Like this part is really irrelevant. 487 00:34:22,310 --> 00:34:27,853 What matters is the value at this last point, last time. 488 00:34:27,853 --> 00:34:30,600 So if it's a Markov chain, you don't 489 00:34:30,600 --> 00:34:32,480 have to know all this history. 490 00:34:32,480 --> 00:34:34,889 All you have to know is this single value. 491 00:34:34,889 --> 00:34:37,949 And all of the effect of the past on the future 492 00:34:37,949 --> 00:34:40,679 is contained in this value. 493 00:34:40,679 --> 00:34:42,190 Nothing else matters. 494 00:34:42,190 --> 00:34:44,000 Of course, this is a very special type 495 00:34:44,000 --> 00:34:45,690 of stochastic process. 496 00:34:45,690 --> 00:34:47,830 For most other stochastic processes, the future 497 00:34:47,830 --> 00:34:51,060 will depend on the whole history.
498 00:34:51,060 --> 00:34:53,480 And in that case, it's more difficult to analyze. 499 00:34:53,480 --> 00:34:56,280 But these ones are more manageable. 500 00:34:56,280 --> 00:34:58,250 And still, lots of interesting things 501 00:34:58,250 --> 00:35:00,380 turn out to be Markov chains. 502 00:35:00,380 --> 00:35:02,080 So if you look at simple random walk, 503 00:35:02,080 --> 00:35:06,310 it is a Markov chain, right? 504 00:35:06,310 --> 00:35:14,680 So simple random walk, let's say you went like that. 505 00:35:14,680 --> 00:35:20,160 Then what happens after time t really just depends 506 00:35:20,160 --> 00:35:23,460 on how high this point is. 507 00:35:23,460 --> 00:35:25,580 What happened before doesn't matter at all. 508 00:35:25,580 --> 00:35:29,070 Because we're just having new coin tosses every time. 509 00:35:29,070 --> 00:35:31,155 But this value can affect the future, 510 00:35:31,155 --> 00:35:32,530 because that's where you're going 511 00:35:32,530 --> 00:35:34,990 to start your process from. 512 00:35:34,990 --> 00:35:38,240 Like that's where you're starting your process. 513 00:35:38,240 --> 00:35:41,590 So that is a Markov chain. 514 00:35:41,590 --> 00:35:42,790 This part is irrelevant. 515 00:35:42,790 --> 00:35:45,412 Only the value matters. 516 00:35:45,412 --> 00:35:47,370 So let me define it a little bit more formally. 517 00:36:05,240 --> 00:36:27,814 A discrete-time stochastic process is a Markov chain 518 00:36:27,814 --> 00:36:36,230 if the probability that X at some time, t plus 1, 519 00:36:36,230 --> 00:36:43,230 is equal to something, some value, 520 00:36:43,230 --> 00:36:49,830 given the whole history up to time t, 521 00:36:49,830 --> 00:36:55,810 is equal to the probability that X_(t+1) is equal to that value, 522 00:36:55,810 --> 00:37:04,950 given only the value X_t, for all t 523 00:37:04,950 --> 00:37:10,260 greater than or equal to 0 and all values s.
524 00:37:10,260 --> 00:37:14,990 This is a mathematical way of writing that down. 525 00:37:14,990 --> 00:37:20,690 The value at X_(t+1), given all the values up to time t, 526 00:37:20,690 --> 00:37:23,830 is the same as the value at time t plus 1, 527 00:37:23,830 --> 00:37:26,993 the probability of it, given only the last value. 528 00:37:39,090 --> 00:37:41,750 And the reason simple random walk is a Markov chain 529 00:37:41,750 --> 00:37:45,560 is because both of them are just 1/2. 530 00:37:45,560 --> 00:37:50,920 I mean, if it's for-- let me write it down. 531 00:37:54,680 --> 00:37:59,470 So example: random walk. 532 00:38:03,943 --> 00:38:10,420 The probability that X_(t+1) is equal to s, given X_t, 533 00:38:10,420 --> 00:38:20,096 is equal to 1/2, if s is equal to X_t plus 1, or X_t minus 1, 534 00:38:20,096 --> 00:38:21,436 and 0 otherwise. 535 00:38:24,840 --> 00:38:30,185 So it really depends only on the last value of X_t. 536 00:38:30,185 --> 00:38:31,870 Any questions? 537 00:38:31,870 --> 00:38:32,910 All right. 538 00:38:36,460 --> 00:38:39,350 Now, in the case when you're looking 539 00:38:39,350 --> 00:38:41,610 at a stochastic process, a Markov chain, 540 00:38:41,610 --> 00:38:50,020 and all X_i have values in some set S, which 541 00:38:50,020 --> 00:38:59,120 is finite, a finite set, in that case, 542 00:38:59,120 --> 00:39:01,640 it's really easy to describe Markov chains. 543 00:39:04,380 --> 00:39:09,360 So now denote by P_(i,j) 544 00:39:09,360 --> 00:39:15,530 the probability that, if at time t 545 00:39:15,530 --> 00:39:18,520 you are at i, you 546 00:39:18,520 --> 00:39:33,942 jump to j at time t plus 1, for all pairs of points i, j. 547 00:39:38,100 --> 00:39:40,530 I mean, it's a finite set, so I might just as well 548 00:39:40,530 --> 00:39:45,160 call it the integer set from 1 to m, 549 00:39:45,160 --> 00:39:49,490 just to make the notation easier.
550 00:39:49,490 --> 00:39:57,710 Then, first of all, if you sum over all j in S, P_(i,j), 551 00:39:57,710 --> 00:39:59,216 that is equal to 1. 552 00:39:59,216 --> 00:40:01,060 Because if you start at i, you'll 553 00:40:01,060 --> 00:40:03,770 have to jump somewhere in your next step. 554 00:40:03,770 --> 00:40:06,650 So if you sum over all possible states you can have, 555 00:40:06,650 --> 00:40:09,680 you have to sum up to 1. 556 00:40:09,680 --> 00:40:12,690 And really, a very interesting thing 557 00:40:12,690 --> 00:40:16,620 is this matrix, called the transition probability 558 00:40:16,620 --> 00:40:24,740 matrix, defined as follows. 559 00:40:34,460 --> 00:40:40,540 So we put P_(i,j) at the i-th row and j-th column. 560 00:40:40,540 --> 00:40:42,130 And really, this tells you everything 561 00:40:42,130 --> 00:40:44,640 about the Markov chain. 562 00:40:44,640 --> 00:40:46,540 Everything about the stochastic process 563 00:40:46,540 --> 00:40:47,900 is contained in this matrix. 564 00:41:00,470 --> 00:41:02,070 That's because a future state only 565 00:41:02,070 --> 00:41:04,550 depends on the current state. 566 00:41:04,550 --> 00:41:08,210 So if you know what happens at time t, where it's at time t, 567 00:41:08,210 --> 00:41:10,800 you look at the matrix, you can decode 568 00:41:10,800 --> 00:41:12,030 all the information you want. 569 00:41:12,030 --> 00:41:14,990 What is the probability that it will be at-- let's say, 570 00:41:14,990 --> 00:41:15,824 it's at 0 right now. 571 00:41:15,824 --> 00:41:17,281 What's the probability that it will 572 00:41:17,281 --> 00:41:18,410 jump to 1 at the next time? 573 00:41:18,410 --> 00:41:21,180 Just look at 0 comma 1, here. 574 00:41:21,180 --> 00:41:23,040 There is no 0, 1, here, so it's 1 and 2. 575 00:41:23,040 --> 00:41:28,690 Just look at 1 and 2, 1 and 2, i and j. 576 00:41:28,690 --> 00:41:29,814 Actually, I made a mistake. 577 00:41:37,074 --> 00:41:39,010 That should be the right one.
578 00:41:42,410 --> 00:41:45,180 Not only that, that's a one-step. 579 00:41:45,180 --> 00:41:46,840 So what happened is it describes what 580 00:41:46,840 --> 00:41:48,910 happens in a single step, the probability 581 00:41:48,910 --> 00:41:51,410 that you jump from i to j. 582 00:41:51,410 --> 00:41:53,330 But using that, you can also model 583 00:41:53,330 --> 00:41:58,260 what's the probability that you jump from i to j in two steps. 584 00:41:58,260 --> 00:42:03,110 So let's define q sub i, j as the probability 585 00:42:03,110 --> 00:42:08,440 that X at time t plus 2 is equal to j, given that X at time t 586 00:42:08,440 --> 00:42:12,070 is equal to i. 587 00:42:12,070 --> 00:42:25,020 Then the matrix, defined this way, 588 00:42:25,020 --> 00:42:27,100 can you describe it in terms of the matrix A? 589 00:42:33,620 --> 00:42:34,800 Anybody? 590 00:42:34,800 --> 00:42:35,980 Multiplication? 591 00:42:35,980 --> 00:42:36,810 Very good. 592 00:42:36,810 --> 00:42:37,700 So it's A square. 593 00:42:42,990 --> 00:42:44,200 Why is that? 594 00:42:44,200 --> 00:42:46,930 So let me write this down in a different way. 595 00:42:46,930 --> 00:42:55,150 q_(i,j) is, you sum over all intermediate values 596 00:42:55,150 --> 00:43:03,680 the probability that you jump from i to k, first, 597 00:43:03,680 --> 00:43:05,900 and then the probability that you jump from k to j. 598 00:43:12,480 --> 00:43:14,940 And if you look at what this means, 599 00:43:14,940 --> 00:43:20,910 each entry here is described by the dot 600 00:43:20,910 --> 00:43:24,840 product of a row and a column. 601 00:43:24,840 --> 00:43:26,932 And that's exactly what occurs. 602 00:43:26,932 --> 00:43:29,140 And if you want to look at the three-step, four-step, 603 00:43:29,140 --> 00:43:31,390 all you have to do is just multiply it again and again 604 00:43:31,390 --> 00:43:33,230 and again.
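The identity q_(i,j) = sum over k of P_(i,k) times P_(k,j) is exactly the rule for matrix multiplication, so the two-step matrix is A squared. A small sketch with a made-up two-state transition matrix (the numbers here are mine, not from the lecture):

```python
def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical 2-state transition matrix; each row sums to 1.
A = [[0.9, 0.1],
     [0.4, 0.6]]

A2 = matmul(A, A)  # A2[i][j] = P(X_{t+2} = j | X_t = i)
print(A2[0][0])    # by hand: 0.9*0.9 + 0.1*0.4 = 0.85
```

Three-step, four-step, and so on are just further powers of A, and each row of every power still sums to 1, since the chain has to land somewhere.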
605 00:43:33,230 --> 00:43:35,430 Really, this matrix contains all the information 606 00:43:35,430 --> 00:43:40,290 you want if you have a Markov chain and it's finite. 607 00:43:40,290 --> 00:43:41,882 That's very important. 608 00:43:41,882 --> 00:43:44,310 For random walk, simple random walk, 609 00:43:44,310 --> 00:43:46,840 I told you that it is a Markov chain. 610 00:43:46,840 --> 00:43:50,570 But it does not have a transition probability matrix, 611 00:43:50,570 --> 00:43:53,191 because the state space is not finite. 612 00:43:53,191 --> 00:43:54,045 So be careful. 613 00:43:57,740 --> 00:44:00,680 However, for finite Markov chains, really, there's 614 00:44:00,680 --> 00:44:08,280 one matrix that describes everything. 615 00:44:08,280 --> 00:44:13,110 I mean, I said it like it's something very interesting. 616 00:44:13,110 --> 00:44:15,790 But if you think about it, you just 617 00:44:15,790 --> 00:44:17,766 wrote down all the probabilities. 618 00:44:17,766 --> 00:44:19,140 So it should describe everything. 619 00:44:34,542 --> 00:44:35,125 So an example. 620 00:44:41,152 --> 00:44:48,900 You have a machine, and it's broken 621 00:44:48,900 --> 00:44:53,180 or working on a given day. 622 00:45:00,580 --> 00:45:02,070 That's a silly example. 623 00:45:02,070 --> 00:45:13,388 So if it's working today, then tomorrow it's 624 00:45:13,388 --> 00:45:25,260 broken with probability 0.01 and working with probability 0.99. 625 00:45:25,260 --> 00:45:29,300 If it's broken, the probability that it's repaired 626 00:45:29,300 --> 00:45:35,500 on the next day is 0.8. 627 00:45:35,500 --> 00:45:40,450 And it stays broken with probability 0.2. 628 00:45:40,450 --> 00:45:42,380 Suppose you have something like this. 629 00:45:47,170 --> 00:45:50,854 This is an example of a Markov chain used in engineering 630 00:45:50,854 --> 00:45:51,395 applications. 631 00:45:56,560 --> 00:46:01,296 In this case, S is also called the state space, actually.
632 00:46:01,296 --> 00:46:04,170 And the reason is because, in many cases, 633 00:46:04,170 --> 00:46:07,990 what you're modeling is these kinds of states of some system, 634 00:46:07,990 --> 00:46:13,750 like broken or working, or rainy, sunny, cloudy as weather. 635 00:46:13,750 --> 00:46:18,380 And all these things that you model 636 00:46:18,380 --> 00:46:20,210 represent states a lot of the time. 637 00:46:20,210 --> 00:46:22,505 So you call it a state set as well. 638 00:46:22,505 --> 00:46:24,175 So that's an example. 639 00:46:24,175 --> 00:46:26,000 And let's see what happens for this matrix. 640 00:46:28,520 --> 00:46:30,720 We have two states, working and broken. 641 00:46:35,680 --> 00:46:37,680 Working to working is 0.99. 642 00:46:37,680 --> 00:46:40,530 Working to broken is 0.01. 643 00:46:40,530 --> 00:46:42,600 Broken to working is 0.8. 644 00:46:42,600 --> 00:46:53,590 Broken to broken is 0.2. 645 00:46:53,590 --> 00:46:55,512 So that's what we've learned so far. 646 00:46:55,512 --> 00:47:00,660 And the question is, what happens if you start from some state, 647 00:47:00,660 --> 00:47:04,030 let's say it was working today, and you 648 00:47:04,030 --> 00:47:12,900 go a very, very long time, like a year or 10 years, 649 00:47:12,900 --> 00:47:16,720 then the distribution, after 10 years, on that day, 650 00:47:16,720 --> 00:47:20,300 is A to the 3,650. 651 00:47:20,300 --> 00:47:24,680 So that will be-- that times [1, 0] 652 00:47:24,680 --> 00:47:27,440 will be the probability [p, q]. 653 00:47:27,440 --> 00:47:30,030 p will be the probability that it's working at that time. 654 00:47:30,030 --> 00:47:32,414 q will be the probability that it's broken at that time. 655 00:47:35,760 --> 00:47:37,630 What will p and q be? 656 00:47:45,340 --> 00:47:46,655 What will p and q be? 657 00:47:46,655 --> 00:47:48,530 That's the question that we're trying to ask.
658 00:47:55,130 --> 00:47:57,030 We didn't learn, so far, how to do this, 659 00:47:57,030 --> 00:47:58,400 but let's think about it. 660 00:48:01,220 --> 00:48:06,946 I'm going to cheat a little bit and just say, 661 00:48:06,946 --> 00:48:12,400 you know what, I think, over a long period of time, 662 00:48:12,400 --> 00:48:20,760 the probability distribution on day 3,650 and that on day 3,651 663 00:48:20,760 --> 00:48:22,490 shouldn't be that different. 664 00:48:22,490 --> 00:48:25,246 They should be about the same. 665 00:48:25,246 --> 00:48:26,370 Let's make that assumption. 666 00:48:26,370 --> 00:48:27,770 I don't know if it's true or not. 667 00:48:27,770 --> 00:48:32,470 Well, I know it's true, but that's what I'm telling you. 668 00:48:32,470 --> 00:48:38,300 Under that assumption, now you can solve what p and q are. 669 00:48:38,300 --> 00:48:49,180 So approximately, I hope, p, q-- so A^3650 * [1, 670 00:48:49,180 --> 00:48:56,350 0] is approximately the same as A to the 3651, [1, 0]. 671 00:48:56,350 --> 00:48:58,555 That means that this is [p, q]. 672 00:48:58,555 --> 00:49:01,121 [p, q] is about the same as A times [p, q]. 673 00:49:04,970 --> 00:49:07,510 Anybody remember what this is? 674 00:49:07,510 --> 00:49:09,030 Yes. 675 00:49:09,030 --> 00:49:11,475 So [p, q] will be the eigenvector of this matrix. 676 00:49:14,090 --> 00:49:17,350 Over a long period of time, the probability distribution 677 00:49:17,350 --> 00:49:20,470 that you will observe will be the eigenvector. 678 00:49:23,650 --> 00:49:26,510 And what's the eigenvalue? 679 00:49:26,510 --> 00:49:30,752 1, at least in this case, it looks like it's 1. 680 00:49:30,752 --> 00:49:33,210 Now I'll make one more connection. 681 00:49:33,210 --> 00:49:36,954 Do you remember the Perron-Frobenius theorem? 682 00:49:36,954 --> 00:49:40,400 So this is a matrix. 683 00:49:40,400 --> 00:49:43,560 All entries are positive.
684 00:49:43,560 --> 00:49:45,980 So there is a largest eigenvalue, 685 00:49:45,980 --> 00:49:49,870 which is positive and real. 686 00:49:49,870 --> 00:49:52,670 And there is an all-positive eigenvector corresponding 687 00:49:52,670 --> 00:49:53,415 to it. 688 00:49:56,555 --> 00:49:58,930 What I'm trying to say is that's going to be your [p, q]. 689 00:50:06,380 --> 00:50:09,050 But let me not jump to the conclusion yet. 690 00:50:27,060 --> 00:50:37,090 And one more thing we know is, by Perron-Frobenius, there 691 00:50:37,090 --> 00:50:41,330 exists an eigenvalue, the largest one, lambda 692 00:50:41,330 --> 00:50:50,699 greater than 0, and eigenvector [v 1, v 2], where [v 1, v 2] 693 00:50:50,699 --> 00:50:51,240 are positive. 694 00:50:54,340 --> 00:50:57,100 Moreover, lambda has multiplicity 1. 695 00:50:57,100 --> 00:50:58,650 I'll get back to it later. 696 00:50:58,650 --> 00:51:00,250 So let's write this down. 697 00:51:00,250 --> 00:51:07,032 A times [v 1, v 2] is equal to lambda times [v 1, v2]. 698 00:51:07,032 --> 00:51:08,740 A times [v 1, v 2], we can write it down. 699 00:51:08,740 --> 00:51:14,430 It's 0.99 v_1 plus 0.01 v_2. 700 00:51:14,430 --> 00:51:22,169 And that 0.8 v_1 plus 0.2 v_2, which is equal to [v1, v2]. 701 00:51:26,140 --> 00:51:28,190 You can solve v_1 and v_2, but before doing 702 00:51:28,190 --> 00:51:41,501 that-- sorry about that. 703 00:51:41,501 --> 00:51:42,487 This is flipped. 704 00:51:51,544 --> 00:51:52,960 Yeah, so everybody, it should have 705 00:51:52,960 --> 00:51:55,466 been flipped in the beginning. 706 00:51:55,466 --> 00:51:57,876 So that's 0.8. 707 00:52:02,710 --> 00:52:10,190 So sum these two values, and you get lambda times [v 1, v 2]. 708 00:52:10,190 --> 00:52:14,101 On the left, what you get is v_1 plus v_2, 709 00:52:14,101 --> 00:52:15,833 you sum the two coordinates. 710 00:52:18,611 --> 00:52:20,880 So on the left, you get v_1 plus v_2.
711 00:52:20,880 --> 00:52:25,320 On the right, you get lambda times v_1 plus v_2. 712 00:52:25,320 --> 00:52:27,642 That means your lambda is equal to 1. 713 00:52:34,064 --> 00:52:38,600 So that eigenvalue, guaranteed by the Perron-Frobenius theorem, 714 00:52:38,600 --> 00:52:41,630 is 1, an eigenvalue of 1. 715 00:52:41,630 --> 00:52:45,670 So what you'll find here will be the eigenvector 716 00:52:45,670 --> 00:52:49,857 corresponding to the largest eigenvalue, 717 00:52:49,857 --> 00:52:52,440 and that largest eigenvalue, as we just saw, 718 00:52:52,440 --> 00:52:53,710 is equal to 1. 719 00:52:53,710 --> 00:52:56,250 And that's something very general. 720 00:52:56,250 --> 00:53:00,770 It's not just about this matrix and this special example. 721 00:53:00,770 --> 00:53:03,940 In general, if you have a transition matrix, 722 00:53:03,940 --> 00:53:09,460 if you're given a Markov chain and given a transition matrix, 723 00:53:09,460 --> 00:53:11,310 the Perron-Frobenius theorem guarantees 724 00:53:11,310 --> 00:53:14,180 that there exists such a vector as long as all the entries are 725 00:53:14,180 --> 00:53:15,520 positive. 726 00:53:15,520 --> 00:53:25,150 So in general, if the transition matrix of a Markov chain 727 00:53:25,150 --> 00:53:39,170 has positive entries, then there exists a vector pi_1 up 728 00:53:39,170 --> 00:53:49,600 to pi_m such that-- I'll just call it v-- Av is equal to v. 729 00:53:49,600 --> 00:53:52,400 And that will be the long-term behavior as explained. 730 00:53:52,400 --> 00:53:56,790 Over a long term, if it converges to some state, 731 00:53:56,790 --> 00:53:59,470 it has to satisfy that. 732 00:53:59,470 --> 00:54:01,630 And by the Perron-Frobenius theorem, we 733 00:54:01,630 --> 00:54:04,440 know that there is a vector satisfying it. 734 00:54:04,440 --> 00:54:09,090 So if it converges, it will converge to that.
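For the working/broken machine, the long-run behavior can be checked by simply iterating the chain for 3,650 days. A sketch of that computation (I keep the distribution as a row vector [p_working, p_broken] and update it with new_v[j] = sum over i of v[i] * P[i][j], which sidesteps the row-versus-column flip that came up at the board):

```python
def step(v, P):
    """One day of evolution: new_v[j] = sum_i v[i] * P[i][j]."""
    n = len(P)
    return [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]

# Machine example; rows and columns ordered (working, broken).
P = [[0.99, 0.01],
     [0.80, 0.20]]

v = [1.0, 0.0]          # certainly working on day 0
for _ in range(3650):   # ten years
    v = step(v, P)

print(v)  # the stationary distribution: p = 0.8/0.81, q = 0.01/0.81
```

Solving v = vP directly together with p + q = 1 gives the same answer, and that v is exactly the positive eigenvector with eigenvalue 1 promised by Perron-Frobenius.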
735 00:54:09,090 --> 00:54:11,990 And what it's saying is, if all the entries are positive, 736 00:54:11,990 --> 00:54:13,280 then it converges. 737 00:54:13,280 --> 00:54:15,450 And there is such a state. 738 00:54:15,450 --> 00:54:17,810 We know the long-term behavior of the system. 739 00:54:26,050 --> 00:54:28,330 So this is called the stationary distribution. 740 00:54:32,480 --> 00:54:36,290 Such a vector v is called a stationary distribution. 741 00:54:44,090 --> 00:54:46,230 It's not really right to say that a vector is 742 00:54:46,230 --> 00:54:47,670 a stationary distribution. 743 00:54:47,670 --> 00:54:52,080 But if I give this distribution to the state space, 744 00:54:52,080 --> 00:55:03,340 what I mean is consider a probability distribution over S 745 00:55:03,340 --> 00:55:10,810 such that the probability that a random variable X is 746 00:55:10,810 --> 00:55:12,730 equal to i is equal to pi_i. 747 00:55:15,660 --> 00:55:18,830 If you start from this distribution, in the next step, 748 00:55:18,830 --> 00:55:22,050 you'll have the exact same distribution. 749 00:55:22,050 --> 00:55:23,570 That's what I'm trying to say here. 750 00:55:23,570 --> 00:55:25,952 That's called a stationary distribution. 751 00:55:34,930 --> 00:55:35,590 Any questions? 752 00:55:38,518 --> 00:55:41,836 AUDIENCE: So [INAUDIBLE]? 753 00:55:46,535 --> 00:55:47,160 PROFESSOR: Yes. 754 00:55:47,160 --> 00:55:48,023 Very good question. 755 00:55:51,741 --> 00:55:53,366 Yeah, but the Perron-Frobenius theorem says 756 00:55:53,366 --> 00:55:55,282 there is exactly one eigenvector corresponding 757 00:55:55,282 --> 00:55:58,100 to the largest eigenvalue. 758 00:55:58,100 --> 00:56:00,280 And that turns out to be 1. 759 00:56:00,280 --> 00:56:02,740 The largest eigenvalue turns out to be 1. 760 00:56:02,740 --> 00:56:06,400 So there will be a unique stationary distribution 761 00:56:06,400 --> 00:56:09,818 if all the entries are positive. 762 00:56:14,226 --> 00:56:15,142 AUDIENCE: [INAUDIBLE]?
763 00:56:21,920 --> 00:56:23,135 PROFESSOR: This one? 764 00:56:23,135 --> 00:56:24,051 AUDIENCE: [INAUDIBLE]? 765 00:56:33,991 --> 00:56:36,476 PROFESSOR: Maybe. 766 00:56:36,476 --> 00:56:37,967 It's a good point. 767 00:56:57,350 --> 00:56:58,344 Huh? 768 00:56:58,344 --> 00:56:59,835 Something is wrong. 769 00:57:06,310 --> 00:57:07,310 Can anybody help me? 770 00:57:07,310 --> 00:57:09,170 This part looks questionable. 771 00:57:09,170 --> 00:57:11,154 AUDIENCE: Just kind of [INAUDIBLE] question, 772 00:57:11,154 --> 00:57:13,898 is that topic covered in portions of [INAUDIBLE]? 773 00:57:17,874 --> 00:57:21,850 The other eigenvalues in the matrix are smaller than 1. 774 00:57:21,850 --> 00:57:26,390 And so when you take products of the transition probability 775 00:57:26,390 --> 00:57:33,150 matrix, those eigenvalues that are smaller than 1 scale 776 00:57:33,150 --> 00:57:37,740 after repeated multiplication to 0. 777 00:57:37,740 --> 00:57:41,750 So in the limit, they're 0, but until you get to the limit, 778 00:57:41,750 --> 00:57:43,739 you still have them. 779 00:57:43,739 --> 00:57:45,155 Essentially, that kind of behavior 780 00:57:45,155 --> 00:57:49,065 is transitionary behavior that dissipates. 781 00:57:49,065 --> 00:57:53,470 But the behavior corresponding to the stationary distribution 782 00:57:53,470 --> 00:57:53,970 persists. 783 00:57:57,320 --> 00:57:58,850 PROFESSOR: But, as you mentioned, 784 00:57:58,850 --> 00:58:02,000 this argument seems to be giving that all lambda has 785 00:58:02,000 --> 00:58:02,625 to be 1, right? 786 00:58:02,625 --> 00:58:05,882 Is that your point? 787 00:58:05,882 --> 00:58:06,970 You're right. 788 00:58:06,970 --> 00:58:09,167 I don't see what the problem is right now. 789 00:58:09,167 --> 00:58:10,250 I'll think about it later. 790 00:58:10,250 --> 00:58:14,850 I don't want to waste my time on trying to find what's wrong. 791 00:58:14,850 --> 00:58:16,660 But the conclusion is right. 
792 00:58:16,660 --> 00:58:18,510 There will be a unique one and so on. 793 00:58:24,405 --> 00:58:26,020 Now let me make a note here. 794 00:58:35,910 --> 00:58:39,720 So let me move on to the final topic. 795 00:58:39,720 --> 00:58:40,930 It's called a martingale. 796 00:58:52,850 --> 00:58:57,030 This is another collection 797 00:58:57,030 --> 00:58:58,990 of stochastic processes. 798 00:58:58,990 --> 00:59:04,750 And what we're trying to model here is a fair game. 799 00:59:04,750 --> 00:59:13,930 Stochastic processes which are a fair game. 800 00:59:19,670 --> 00:59:35,990 And formally, what I mean is a stochastic process is 801 00:59:35,990 --> 01:00:20,770 a martingale if the expectation of X_(t+1), given everything up to time t, equals X_t. 802 01:00:20,770 --> 01:00:22,310 Let me reiterate it. 803 01:00:22,310 --> 01:00:28,100 So what we have here is, at time t, 804 01:00:28,100 --> 01:00:30,910 if you look at what's going to happen at time t plus 1, 805 01:00:30,910 --> 01:00:33,660 take the expectation, then it has 806 01:00:33,660 --> 01:00:36,640 to be exactly equal to the value of X_t. 807 01:00:36,640 --> 01:00:41,920 So we have this stochastic process, and, at time t, 808 01:00:41,920 --> 01:00:44,180 you are at X_t. 809 01:00:44,180 --> 01:00:49,250 At time t plus 1, lots of things can happen. 810 01:00:49,250 --> 01:00:52,570 It might go to this point, that point, that point, or so on. 811 01:00:52,570 --> 01:00:54,080 But the probability distribution is 812 01:00:54,080 --> 01:00:59,290 designed so that the expected value over all these 813 01:00:59,290 --> 01:01:02,730 is exactly equal to the value at X_t. 814 01:01:02,730 --> 01:01:06,260 So it's kind of centered at X_t, centered meaning 815 01:01:06,260 --> 01:01:09,620 in the probabilistic sense. 816 01:01:09,620 --> 01:01:12,590 The expectation is equal to that.
817 01:01:12,590 --> 01:01:16,040 So if your value at time t was something else, 818 01:01:16,040 --> 01:01:19,070 your values at time t plus 1 will 819 01:01:19,070 --> 01:01:21,587 be centered at this value instead of that value. 820 01:01:24,720 --> 01:01:27,980 And the reason I'm saying it models 821 01:01:27,980 --> 01:01:34,790 a fair game is because, if this is 822 01:01:34,790 --> 01:01:41,510 like your balance over some game, in expectation, 823 01:01:41,510 --> 01:01:47,670 you're not supposed to win any money at all. 824 01:01:47,670 --> 01:01:50,070 And I will tell you more about that later. 825 01:01:55,610 --> 01:01:59,380 So example, a random walk is a martingale. 826 01:02:18,710 --> 01:02:19,690 What else? 827 01:02:24,490 --> 01:02:28,820 Second one, now let's say you're in a casino 828 01:02:28,820 --> 01:02:31,580 and you're playing roulette. 829 01:02:31,580 --> 01:02:40,150 The balance of a roulette player is not a martingale. 830 01:02:46,700 --> 01:02:49,610 Because it's designed so that the expected value 831 01:02:49,610 --> 01:02:52,150 is less than 0. 832 01:02:52,150 --> 01:02:53,880 You're supposed to lose money. 833 01:02:53,880 --> 01:02:57,850 Of course, in one instance, you might win money. 834 01:02:57,850 --> 01:03:02,310 But in expected value, you're designed to go down. 835 01:03:05,400 --> 01:03:06,730 So it's not a martingale. 836 01:03:06,730 --> 01:03:09,420 It's not a fair game. 837 01:03:09,420 --> 01:03:11,820 The game is designed for the casino, not for you. 838 01:03:15,470 --> 01:03:18,106 Third one is some funny example. 839 01:03:18,106 --> 01:03:24,596 I just made it up to show that there are many possible ways 840 01:03:24,596 --> 01:03:28,130 that a stochastic process can be a martingale.
841 01:03:28,130 --> 01:03:35,450 So if Y_i are IID random variables such 842 01:03:35,450 --> 01:03:45,048 that Y_i is equal to 2, with probability 1/3, and 1/2, 843 01:03:45,048 --> 01:04:00,854 with probability 2/3, then let X_0 equal 1 and X_k equal 844 01:04:05,255 --> 01:04:07,827 the product Y_1 times Y_2 up to Y_k. Then that is a martingale. 845 01:04:11,170 --> 01:04:14,960 So at each step, you'll either multiply by 2 or, 846 01:04:14,960 --> 01:04:18,140 with the 1/2, just divide by 2. 847 01:04:18,140 --> 01:04:23,260 And the probability distribution is given as 1/3 and 2/3. 848 01:04:23,260 --> 01:04:26,910 Then X_k is a martingale. 849 01:04:26,910 --> 01:04:32,910 The reason is-- so you can compute the expected value. 850 01:04:32,910 --> 01:04:45,800 The expected value of X_(k+1), given X_k up to X_0, 851 01:04:45,800 --> 01:04:58,880 is equal to-- what you have is the expected value of Y_(k+1) times 852 01:04:58,880 --> 01:05:03,942 Y_k up to Y_1. 853 01:05:03,942 --> 01:05:05,436 That part is X_k. 854 01:05:08,930 --> 01:05:12,438 But this is designed so that the expected value is equal to 1. 855 01:05:20,030 --> 01:05:21,146 So it's a martingale. 856 01:05:26,460 --> 01:05:29,240 I mean it will fluctuate a lot, your balance, 857 01:05:29,240 --> 01:05:32,510 double, double, double, half, half, half, and so on. 858 01:05:32,510 --> 01:05:36,999 But still, in expectation, you will always maintain it. 859 01:05:36,999 --> 01:05:39,040 I mean the expectation at all times is equal to 1, 860 01:05:39,040 --> 01:05:40,910 if you look at it from the beginning. 861 01:05:40,910 --> 01:05:43,880 You look at time 1, then the expected value of X_1 862 01:05:43,880 --> 01:05:44,870 and so on. 863 01:05:48,340 --> 01:05:50,410 Any questions on definition or example? 864 01:05:53,090 --> 01:05:56,580 So the random walk is an example which is both a Markov 865 01:05:56,580 --> 01:05:58,820 chain and a martingale. 866 01:05:58,820 --> 01:06:02,640 But these two concepts are really two different concepts.
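This example can be sanity-checked numerically: E[Y] = 2*(1/3) + (1/2)*(2/3) = 1, so E[X_k] = 1 for every k, even though individual paths swing wildly. A small simulation sketch (the function name and the run parameters are mine):

```python
import random

def simulate_X(k, rng):
    """X_k = Y_1 * ... * Y_k, with Y_i = 2 w.p. 1/3 and 1/2 w.p. 2/3."""
    x = 1.0
    for _ in range(k):
        x *= 2.0 if rng.random() < 1 / 3 else 0.5
    return x

rng = random.Random(42)
n = 200_000
avg = sum(simulate_X(10, rng) for _ in range(n)) / n
print(avg)  # hovers near 1, the martingale expectation
```

Note that a typical single path decays (it halves twice as often as it doubles), but the rare paths that double many times pull the average back up to 1.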
867 01:06:02,640 --> 01:06:04,800 Try not to be confused between the two. 868 01:06:04,800 --> 01:06:06,339 They're just two different things. 869 01:06:11,330 --> 01:06:13,670 There are Markov chains which are not martingales. 870 01:06:13,670 --> 01:06:16,510 There are martingales which are not Markov chains. 871 01:06:16,510 --> 01:06:19,030 And there are some things which are both, 872 01:06:19,030 --> 01:06:21,240 like a simple random walk. 873 01:06:21,240 --> 01:06:24,400 And there is some stuff which is neither of them. 874 01:06:24,400 --> 01:06:26,350 They really are just two separate things.
892 01:07:35,540 --> 01:07:37,230 Of course, there are technical conditions 893 01:07:37,230 --> 01:07:38,160 that have to be there. 894 01:07:42,320 --> 01:07:46,930 So if you're playing a martingale game, 895 01:07:46,930 --> 01:07:50,470 then you're not supposed to win or lose, 896 01:07:50,470 --> 01:07:51,470 at least in expectation. 897 01:07:53,995 --> 01:07:56,080 So before stating the theorem, I have 898 01:07:56,080 --> 01:07:59,524 to define what a stopping time means. 899 01:08:05,820 --> 01:08:27,279 So given a stochastic process, a non-negative integer 900 01:08:27,279 --> 01:08:39,896 valued random variable tau is called a stopping time 901 01:08:39,896 --> 01:08:48,350 if, for all integers k greater than or equal to 0, the event that tau is 902 01:08:48,350 --> 01:09:00,380 less than or equal to k depends only on X_1 to X_k. 903 01:09:00,380 --> 01:09:04,960 So that may look very, very strange. 904 01:09:04,960 --> 01:09:07,950 I want to define something called a stopping time. 905 01:09:07,950 --> 01:09:11,550 It will be a non-negative integer valued random variable. 906 01:09:11,550 --> 01:09:14,649 So it will be 0, 1, 2, or so on. 907 01:09:14,649 --> 01:09:18,560 That means it will be some time index. 908 01:09:18,560 --> 01:09:22,229 And if you look at the event that tau is less than 909 01:09:22,229 --> 01:09:27,800 or equal to k-- so if you look at the event 910 01:09:27,800 --> 01:09:32,229 that you stop at a time less than or equal to k-- 911 01:09:32,229 --> 01:09:34,760 your decision depends only on the values 912 01:09:34,760 --> 01:09:40,410 of the stochastic process 913 01:09:40,410 --> 01:09:43,340 up to time k. 914 01:09:43,340 --> 01:09:45,540 In other words, suppose this is some strategy 915 01:09:45,540 --> 01:09:49,930 you want to use-- by strategy I mean some rule 916 01:09:49,930 --> 01:09:53,540 by which you stop playing at some point. 
917 01:09:53,540 --> 01:09:55,840 You have a strategy that is defined 918 01:09:55,840 --> 01:10:00,040 as: you play some k rounds, and then you look at the outcome. 919 01:10:00,040 --> 01:10:02,480 You say, OK, now I think it's in my favor. 920 01:10:02,480 --> 01:10:03,460 I'm going to stop. 921 01:10:03,460 --> 01:10:05,225 You have a pre-defined set of rules. 922 01:10:08,130 --> 01:10:12,540 And if that strategy depends only 923 01:10:12,540 --> 01:10:16,570 on the values of the stochastic process up to right now, 924 01:10:16,570 --> 01:10:18,880 then it's a stopping time. 925 01:10:18,880 --> 01:10:21,370 If it's some strategy that depends on future values, 926 01:10:21,370 --> 01:10:23,680 it's not a stopping time. 927 01:10:23,680 --> 01:10:25,468 Let me show you by example. 928 01:10:28,150 --> 01:10:31,640 Remember the coin toss game whose balance was a random walk-- 929 01:10:31,640 --> 01:10:35,790 at each step you either win $1 or lose $1. 930 01:10:35,790 --> 01:10:49,980 So in the coin toss game, first, let tau be the first time 931 01:10:49,980 --> 01:11:02,778 at which the balance becomes $100; then tau is a stopping time. 932 01:11:10,770 --> 01:11:15,410 Or, second, you stop at either $100 or negative 933 01:11:15,410 --> 01:11:17,850 $50; that's still a stopping time. 934 01:11:17,850 --> 01:11:21,370 Remember that we discussed it? 935 01:11:21,370 --> 01:11:22,780 We look at our balance. 936 01:11:22,780 --> 01:11:27,300 We stop either at the time when we win $100 or when we lose $50. 937 01:11:27,300 --> 01:11:29,824 That is a stopping time. 938 01:11:29,824 --> 01:11:32,320 But I think it's better to show you an example of what is not 939 01:11:32,320 --> 01:11:33,700 a stopping time. 940 01:11:33,700 --> 01:11:36,660 That will really help. 941 01:11:36,660 --> 01:11:50,280 So let tau be-- in the same game-- the time of the first peak. 942 01:11:50,280 --> 01:11:54,310 By peak, I mean the time right before you go down, 943 01:11:54,310 --> 01:11:57,794 so that would be your tau. 
944 01:11:57,794 --> 01:12:00,250 So the first time when you start to go down, 945 01:12:00,250 --> 01:12:02,150 you're going to stop. 946 01:12:02,150 --> 01:12:04,680 That's not a stopping time. 947 01:12:04,680 --> 01:12:06,640 Not a stopping time. 948 01:12:12,000 --> 01:12:15,710 To see formally why that's the case: first of all, if you want 949 01:12:15,710 --> 01:12:18,470 to decide whether it's a peak or not at time t, 950 01:12:18,470 --> 01:12:21,900 you have to refer to the value at time t plus 1. 951 01:12:21,900 --> 01:12:23,983 If you're just looking at values up to time t, 952 01:12:23,983 --> 01:12:25,955 you don't know if it's going to be a peak 953 01:12:25,955 --> 01:12:28,440 or if it's going to continue up. 954 01:12:28,440 --> 01:12:32,860 So the event that you stop at time t 955 01:12:32,860 --> 01:12:38,150 depends on time t plus 1 as well, which doesn't 956 01:12:38,150 --> 01:12:41,022 fit this definition. 957 01:12:41,022 --> 01:12:43,050 So that's what we're trying to distinguish 958 01:12:43,050 --> 01:12:45,580 by defining a stopping time. 959 01:12:45,580 --> 01:12:48,330 In those first cases it was clear: at the time, 960 01:12:48,330 --> 01:12:50,110 you know whether you have to stop or not. 961 01:12:50,110 --> 01:12:51,610 But if you define tau 962 01:12:51,610 --> 01:12:53,170 in this way, 963 01:12:53,170 --> 01:12:56,820 your decision 964 01:12:56,820 --> 01:12:59,670 depends on future values of the outcome. 965 01:12:59,670 --> 01:13:04,035 So it's not a stopping time under this definition. 966 01:13:04,035 --> 01:13:04,618 Any questions? 967 01:13:04,618 --> 01:13:07,082 Does it make sense? 968 01:13:07,082 --> 01:13:07,582 Yes? 969 01:13:07,582 --> 01:13:11,534 AUDIENCE: Could you still have tau as a stopping time, 970 01:13:11,534 --> 01:13:14,498 if you were referring to t, and then t minus 1 971 01:13:14,498 --> 01:13:16,990 was greater than [INAUDIBLE]? 
972 01:13:16,990 --> 01:13:18,005 PROFESSOR: So. 973 01:13:18,005 --> 01:13:20,442 AUDIENCE: Let's say, yeah, it was [INAUDIBLE]. 974 01:13:20,442 --> 01:13:21,900 PROFESSOR: So the time after the peak, 975 01:13:21,900 --> 01:13:22,640 the first time after the peak? 976 01:13:22,640 --> 01:13:23,223 AUDIENCE: Yes. 977 01:13:23,223 --> 01:13:25,640 PROFESSOR: Yes, that will be a stopping time. 978 01:13:25,640 --> 01:13:38,030 So, example three: if tau is tau_0 plus 1, where tau_0 is the time of the first peak, 979 01:13:38,030 --> 01:13:39,630 then it is a stopping time. 980 01:13:39,630 --> 01:13:41,106 It's a stopping time. 981 01:14:06,200 --> 01:14:10,210 So the optional stopping theorem that I promised 982 01:14:10,210 --> 01:14:13,150 says the following. 983 01:14:13,150 --> 01:14:25,898 Suppose we have a martingale, and tau is a stopping time. 984 01:14:29,834 --> 01:14:36,900 And further suppose that there exists 985 01:14:36,900 --> 01:14:43,540 a constant T such that tau is less than or equal to T always. 986 01:14:46,180 --> 01:14:49,780 So you have some strategy which is a finite strategy. 987 01:14:49,780 --> 01:14:51,720 You can't go on forever. 988 01:14:51,720 --> 01:14:54,460 You have some bound on the time, 989 01:14:54,460 --> 01:14:58,390 and your stopping time always comes before that time. 990 01:14:58,390 --> 01:15:08,110 In that case, the expectation of your value at the stopping 991 01:15:08,110 --> 01:15:11,000 time-- your balance when you've stopped, 992 01:15:11,000 --> 01:15:14,160 if that's what it's modeling-- is always 993 01:15:14,160 --> 01:15:18,220 equal to the expected balance at the beginning. 994 01:15:18,220 --> 01:15:21,890 So no matter what strategy you use, if you're a mortal being, 995 01:15:21,890 --> 01:15:24,610 then you cannot win. 996 01:15:24,610 --> 01:15:27,670 That's the content of this theorem. 997 01:15:27,670 --> 01:15:30,430 I wanted to prove it, but I won't, 998 01:15:30,430 --> 01:15:32,990 because I'm running out of time. 
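The stopping-time examples above can be sketched in code (my own illustration, not from the lecture): a rule that inspects only the path seen so far, like stopping at a barrier, is a stopping time, while the first-peak rule has to peek one step into the future.

```python
def tau_hit(path, hi=100, lo=-50):
    """First time the balance reaches hi or lo.
    The decision to stop at time k inspects only path[0..k] -> a stopping time."""
    for k, x in enumerate(path):
        if x >= hi or x <= lo:
            return k
    return None  # never stopped on this (finite) path

def tau_peak(path):
    """First peak: the first time k whose next value is lower.
    The decision at time k needs path[k+1] -> NOT a stopping time."""
    for k in range(len(path) - 1):
        if path[k] > path[k + 1]:
            return k
    return None

walk = [0, 1, 2, 3, 2, 1, 2]
print(tau_peak(walk))              # 3: known only after seeing the value at index 4
print(tau_hit(walk, hi=3, lo=-3))  # 3: hitting the barrier is decided on the spot
```

Both functions return 3 on this path, but for different reasons: the loop in `tau_hit` never reads past index k, whereas `tau_peak` compares index k against index k+1, which is exactly the look-ahead the definition forbids.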
999 01:15:32,990 --> 01:15:37,470 But let me show you one very interesting corollary of this, 1000 01:15:37,470 --> 01:15:38,810 applied to that example with the two barriers. 1001 01:15:42,370 --> 01:15:45,250 So that tau is a stopping time. 1002 01:15:45,250 --> 01:15:49,610 It's not clear that there is a bounded time by which you always 1003 01:15:49,610 --> 01:15:51,830 stop, 1004 01:15:51,830 --> 01:15:54,160 but the theorem does apply to that case. 1005 01:15:54,160 --> 01:15:57,080 So I'll just forget about that technical issue. 1006 01:15:57,080 --> 01:16:03,080 So, corollary: the theorem applies, not immediately, 1007 01:16:03,080 --> 01:16:09,430 but it does apply, to the stopping time given above. 1008 01:16:09,430 --> 01:16:15,130 And then what it says is that the expectation of X_tau 1009 01:16:15,130 --> 01:16:15,920 is equal to 0. 1010 01:16:18,720 --> 01:16:23,390 But the expectation of X_tau is-- X at tau 1011 01:16:23,390 --> 01:16:26,370 is either 100 or negative 50, because you're always 1012 01:16:26,370 --> 01:16:29,910 going to stop at the first time you either 1013 01:16:29,910 --> 01:16:33,280 hit $100 or minus $50. 1014 01:16:33,280 --> 01:16:37,880 So this is 100 times some probability 1015 01:16:37,880 --> 01:16:41,970 p, plus 1 minus p times minus 50. 1016 01:16:41,970 --> 01:16:44,320 There's some probability p that you stop at 100. 1017 01:16:44,320 --> 01:16:46,991 In all the remaining cases, you're going to stop at minus 50. 1018 01:16:46,991 --> 01:16:47,740 And we know 1019 01:16:47,740 --> 01:16:49,960 it's equal to 0. 1020 01:16:49,960 --> 01:16:55,130 What it gives is-- I hope it gives me the right thing I'm 1021 01:16:55,130 --> 01:16:57,030 thinking of. 1022 01:16:57,030 --> 01:16:59,660 p, 100, yes. 1023 01:16:59,660 --> 01:17:02,770 It's 150p minus 50 equals 0, 1024 01:17:02,770 --> 01:17:04,540 so p is 1/3. 1025 01:17:04,540 --> 01:17:07,274 And if you remember, that was exactly the computation we got before. 
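The corollary's computation can also be checked numerically. Here is a rough Monte Carlo sketch (mine, not from the lecture), with the barriers scaled down to +$10/-$5 -- the same 2-to-1 ratio, so the symmetric walk's hit probability is still 1/3 -- to keep the simulation fast; the average stopped balance E[X_tau] should come out near the starting balance, 0.

```python
import random

def hit_probability(hi=10, lo=-5, trials=30_000, seed=1):
    """Estimate P(simple random walk from 0 hits hi before lo),
    together with the average stopped value X_tau."""
    rng = random.Random(seed)
    hits = 0
    stopped_sum = 0
    for _ in range(trials):
        x = 0
        while lo < x < hi:
            x += 1 if rng.random() < 0.5 else -1
        hits += (x == hi)
        stopped_sum += x
    return hits / trials, stopped_sum / trials

p, mean_stop = hit_probability()
print(round(p, 3))          # close to 1/3, as the optional stopping argument predicts
print(round(mean_stop, 2))  # close to 0, the starting balance
```

The same algebra as in the lecture applies at this scale: 10p - 5(1 - p) = 0 gives 15p = 5, so p = 1/3, matching the simulation.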
1026 01:17:10,970 --> 01:17:13,560 So that's just a neat application. 1027 01:17:13,560 --> 01:17:16,350 But the content of this is really interesting. 1028 01:17:16,350 --> 01:17:21,090 So try to contemplate it-- it's something rather philosophical. 1029 01:17:21,090 --> 01:17:23,810 If something can be modeled using martingales 1030 01:17:23,810 --> 01:17:26,450 perfectly, if it really fits into 1031 01:17:26,450 --> 01:17:28,630 the mathematical formulation of a martingale, 1032 01:17:28,630 --> 01:17:30,454 then you're not supposed to win. 1033 01:17:33,190 --> 01:17:35,510 So that's it for today. 1034 01:17:35,510 --> 01:17:39,470 And next week, Peter will give wonderful lectures. 1035 01:17:39,470 --> 01:17:41,620 See you next week.