The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: Today we're going to study stochastic processes, and among them one particular type: discrete time. We'll focus on discrete time, and I'll talk about what that is right now.

So a stochastic process is a collection of random variables indexed by time. A very simple definition. So we have either random variables indexed by discrete times, let's start from 0, so X_0, X_1, X_2, and so on, or we have random variables X_t given for all continuous times t. So the time variable can be discrete, or it can be continuous. The first ones we'll call discrete-time stochastic processes, and the second ones continuous-time.

So for example, a discrete-time stochastic process can be something like this, and so on. So these are the values X_0, X_1, X_2, X_3, and so on, and they are random variables. This is just one realization of the stochastic process.
But all these values are supposed to be random. And then a continuous-time stochastic process can be something like that. And it doesn't have to be continuous, so it can jump, and jump again, and so on. And all these values are random values.

So that's just a very informal description. A slightly different point of view, which is preferred when you want to do some math with it, is this alternative definition: a stochastic process is a probability distribution over paths, over a space of paths. So you have a bunch of possible paths that you can take, and you're given some probability distribution over them. And then one draw from that distribution is one realization; another realization will look somewhat different, and so on.

So the first definition, a collection of random variables indexed by time, is the more intuitive one. But the second one, if you want to do some math with it, from the formal point of view, will be more helpful. And you'll see why that's the case later. So let me show you some more examples.
For example, here is one way to describe a stochastic process. Let me show you three stochastic processes.

Number one: f(t) = t, with probability 1.

Number two: f(t) = t for all t, with probability 1/2, or f(t) = -t for all t, with probability 1/2.

And the third one: for each t, f(t) = t or f(t) = -t, each with probability 1/2, independently.

The first one is quite easy to picture. There's really nothing random in here; this happens with probability 1. Your path just is f(t) = t. And we're only looking at t greater than or equal to 0 here. So that's number one.

Number two, it's either this line or that line. So it is a stochastic process. If you think about it as a collection of random variables, it doesn't really look like a stochastic process. But under the alternative definition, you have two possible paths that you can take: you either take this path, with probability 1/2, or that path, with probability 1/2. Now, at each point t, your value X(t) is a random variable. It's either t or minus t.
And it's the same for all t. But the values are dependent on each other: if you know one value, you automatically know all the other values.

And the third one is even more interesting. Now, for each t, we get rid of this dependency. So what you'll have is these two lines going on; at every single point, you'll be either on the top one or the bottom one. But if you really want to draw the picture, it will bounce back and forth, up and down, infinitely often, and it'll just look like two lines.

So I hope this gives you some feeling about stochastic processes, at least a tiny bit of why we want to describe them in terms of this language. Any questions?

So, when you use a stochastic process to model something going on in real life, like a stock price, usually what happens is you stand at time t. You know all the values in the past, and you don't know the future. But you want to know something about it; you want to draw some intelligent conclusion, some intelligent information about the future, based on the past.

For this first stochastic process, it's easy.
No matter where you stand, you know exactly what's going to happen in the future. For the second one, it's also the same. Even though it's random, once you know what happened at some point, you know it has to be this line, if you're here, or that line, if you're there.

But the third one is slightly different. No matter what you know about the past, even if you know all the values in the past, it gives you essentially no information about the future. Though it's not quite true if I say no information at all: we know that each value has to be t or minus t. You just don't know which one it is.

So when you're given a stochastic process and you're standing at some time, you don't know what the future is, but most of the time you have at least some level of control, given by the probability distribution. Here, you can really determine the line. There, the probability distribution at each point only gives t or minus t, so you know each value will be one of those two points, but you don't know more than that.
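As a quick illustration (this code is a sketch, not part of the lecture, and the function names are made up), the three example processes can be simulated at integer times t = 0, 1, 2, ...:

```python
import random

def process1(T):
    # Number 1: f(t) = t with probability 1. Nothing random.
    return [t for t in range(T)]

def process2(T):
    # Number 2: a single coin flip picks the whole path,
    # f(t) = t for all t, or f(t) = -t for all t, each with probability 1/2.
    sign = random.choice([1, -1])
    return [sign * t for t in range(T)]

def process3(T):
    # Number 3: an independent coin flip at every time,
    # f(t) = t or f(t) = -t, each with probability 1/2.
    return [random.choice([1, -1]) * t for t in range(T)]
```

In process2, one observed value determines the entire path; in process3, the past values tell you nothing about the future signs, which is exactly the distinction being made above.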
So the study of stochastic processes is, basically, you look at the given probability distribution, and you want to say something intelligent about the future as t goes on. And there are three types of questions that we mainly study here.

(a) The first type is: what are the dependencies in the sequence of values? For example, if you know the price of a stock on all past dates, up to today, can you say anything intelligent about the future stock prices? Those types of questions.

(b) What is the long-term behavior of the sequence? So think about the law of large numbers that we talked about last time, or the central limit theorem.

(c) And the third type, this one is less relevant for our course, but still, I'll just write it down: what are the boundary events? How often will something extreme happen? Like, how often will a stock price drop by more than 10% for 5 consecutive days, those kinds of events. How often will that happen?
And for a different example, if you model a call center, you might want to know, over a period of time, the probability that at least 90% of the phones are idle, those kinds of things.

So that was the introduction. Any questions?

Now, there are really lots of stochastic processes. One of the most important ones is the simple random walk. So today, I will focus on discrete-time stochastic processes. Later in the course, we'll go on to continuous-time stochastic processes, and then you'll see things like Brownian motion, and Ito's lemma, and all those things will appear later. Right now, we'll study discrete time, and later you'll see that the two theories are really parallel. For this simple random walk, you'll see the corresponding thing in continuous-time stochastic processes later. I think it's easier to understand discrete-time processes; that's why we start with them. But later, it will really help if you understand them well, because for continuous time, all the knowledge will just carry over.

So what is a simple random walk?
Let Y_i be i.i.d., independent identically distributed, random variables taking values 1 or minus 1, each with probability 1/2. Then define, for each time t, X_t as the sum of the Y_i, from i = 1 to t, with X_0 equal to 0. Then the sequence of random variables X_0, X_1, X_2, and so on is called a one-dimensional simple random walk. But I'll just refer to it as a simple random walk, or a random walk. That's the definition.

Let's try to plot it. At time 0, we start at 0. Then, depending on the value of Y_1, you will either go up or go down. Let's say we went up; that's at time 1. Then at time 2, depending on your value of Y_2, you will either go up one step from there or go down one step from there. Let's say we went up again, then down, then up, up, something like that. And it continues.

Another way to look at it, the reason we call it a random walk, is this: if you just plot your values of X_t, over time, on a line, then you start at 0, and you go to the right, right, left, right, right, left, left, left.
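The definition can be turned into code directly (a sketch, not part of the lecture):

```python
import random

def simple_random_walk(t_max, seed=None):
    """One realization X_0, X_1, ..., X_{t_max} of the simple random walk."""
    rng = random.Random(seed)
    path = [0]                       # X_0 = 0 by definition
    for _ in range(t_max):
        y = rng.choice([1, -1])      # Y_i = +1 or -1, each with probability 1/2
        path.append(path[-1] + y)    # X_t = X_{t-1} + Y_t
    return path
```

Calling it repeatedly gives different realizations, just as each run of the coin tosses gives a different trajectory.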
So the trajectory is like a walk you take on this line, but it's random. Each time, you go to the right or left, right or left. So those were two representations. The first picture looks a little bit more clear; here on the line, I just lost everything I drew. Something like that is the trajectory.

So from what we learned last time, we can already say something intelligent about the simple random walk. For example, if you apply the central limit theorem to this sequence, what is the information you get? Over a long time, let's say t is very far away, a huge number, a very large number, what can you say about the distribution of X_t?

AUDIENCE: Is it close to 0?

PROFESSOR: Close to 0. But by close to 0, what do you mean? There should be a scale. I mean, some would say that 1 is close to 0; some people would say that 100 is close to 0. So do you have some degree of how close it will be to 0? Anybody?

AUDIENCE: So the variance will be small.

PROFESSOR: Sorry?

AUDIENCE: The variance will be small.
PROFESSOR: Variance will be small. About how much will the variance be?

AUDIENCE: 1 over n.

PROFESSOR: 1 over n? 1 over t?

AUDIENCE: Over t.

PROFESSOR: 1 over t? Anybody else want to offer something different?

AUDIENCE: [INAUDIBLE]

PROFESSOR: 1 over square root of t, probably.

AUDIENCE: [INAUDIBLE]

AUDIENCE: The variance would be [INAUDIBLE].

PROFESSOR: Oh, you're right, sorry. Variance will be 1 over t, and the standard deviation will be 1 over square root of t. What I'm saying is, by the central limit theorem.

AUDIENCE: [INAUDIBLE] Are you looking at the sums, or are you looking at the?

PROFESSOR: I'm looking at X_t. Ah, that's a very good point. t and square root of t. Thank you.

AUDIENCE: That's very different.

PROFESSOR: Yeah, very, very different. I was confused. Sorry about that.
The reason is because we saw last time that 1 over the square root of t times X_t, if t is really, really large, is close to the standard normal distribution N(0,1). So X_t over the square root of t will look like a standard normal. That means the value at time t will be distributed like a normal distribution with mean 0 and variance t, so with standard deviation square root of t. So what you said was right: it's close to 0, and the scale you're looking at is about the square root of t. So it won't go too far away from 0.

That means, if you draw these two curves, square root of t and minus square root of t, your simple random walk, on a very large scale, won't go too far away from these two curves. Even though the extreme values it can take (I didn't draw it correctly) are t and minus t, because all steps can be 1 or all steps can be minus 1. Even though, theoretically, you can be that far away from your x-axis, in reality, what's going to happen is you're going to stay really close to these curves. You're going to play within this area, mostly.

AUDIENCE: I think that [INAUDIBLE].
PROFESSOR: So, yeah, that was a very vague statement: you won't deviate too much. To say a bit more, if you take 100 times the square root of t, you will be inside that interval like 90% of the time; if you take it to be 10,000 times the square root of t, almost 99.9% of the time, or something like that.

And there's even a theorem saying you will hit these two curves infinitely often. So if you go over a very, very long period of time, if you live long enough, then even if you go down here, even if, in this picture, you might think that in some cases you always play in the negative region, there's a theorem saying that that's not the case. With probability 1, as you go to infinity, you will cross this axis infinitely often. And in fact, you will meet these two curves infinitely often.

So those are some interesting things about the simple random walk. Really, there are a lot more interesting things, but I'm just giving an overview in this course. Unfortunately, I can't talk about all of this fun stuff. But let me still try to show you some properties and one nice computation on it.
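The square-root-of-t scale can be checked empirically. This sketch (not part of the lecture) estimates the standard deviation of X_t over many independent walks, which should come out near sqrt(t):

```python
import math
import random

def sample_std_of_walk(t, n_trials, seed=0):
    # Estimate the standard deviation of X_t from n_trials independent walks.
    rng = random.Random(seed)
    endpoints = []
    for _ in range(n_trials):
        x = 0
        for _ in range(t):
            x += rng.choice([1, -1])   # one +/-1 step of the walk
        endpoints.append(x)
    mean = sum(endpoints) / n_trials
    var = sum((x - mean) ** 2 for x in endpoints) / n_trials
    return math.sqrt(var)
```

For t = 400 the estimate lands near sqrt(400) = 20, even though the theoretical extremes are plus or minus 400, which illustrates why the walk stays inside the square-root curves.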
So, some properties of a simple random walk. First, the expectation of X_k is equal to 0. That's really easy to prove.

The second important property is called independent increments. If you look at times t_0 <= t_1 <= ... <= t_k, then the random variables X_(t_(i+1)) minus X_(t_i) are mutually independent. So what this says is, if you look at what happens from time 1 to 10, that is irrelevant to what happens from time 20 to 30. And that can easily be shown from the definition. I won't do it here, but try to do it as an exercise.

The third one is called stationarity. That means, for all h >= 1 and t >= 0, the distribution of X_(t+h) minus X_t is the same as the distribution of X_h. And again, this easily follows from the definition. What it says is, if you look at an interval of the same length, then the distribution of what happens inside that interval is irrelevant of your starting point; the distribution is the same. And moreover, from the second property, if the intervals do not overlap, they're independent. So those are the two main properties we're talking about here.
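Both properties can be seen in a short simulation (a sketch, not part of the lecture). To sample the increment X_(t+h) - X_t, only the h coin flips inside the interval matter; the first t steps can be generated and thrown away, which makes stationarity explicit. The sample variance of the increment should come out near h, the variance of X_h:

```python
import random

def increment_samples(t, h, n_trials, seed=1):
    # Sample X_{t+h} - X_t over many independent walks.
    rng = random.Random(seed)
    samples = []
    for _ in range(n_trials):
        for _ in range(t):
            rng.choice([1, -1])    # the first t steps; their values never enter
        inc = sum(rng.choice([1, -1]) for _ in range(h))  # the h steps in the interval
        samples.append(inc)
    return samples
```

The increment is just a fresh sum of h independent plus-or-minus-1 steps, no matter what t is.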
And you'll see these properties appearing again and again, because stochastic processes having these properties are really good, in some sense; they are fundamental stochastic processes. And the simple random walk is like the fundamental stochastic process.

So let's try to see one interesting problem about the simple random walk. For example, you play a game, a coin toss game. I play with, let's say, Peter. I bet $1 at each turn, and then Peter tosses a fair coin. It's either heads or tails. If it's heads, he wins the $1. If it's tails, I win the $1. So from my point of view, in this coin toss game, at each turn my balance goes up by $1 or down by $1.

Now, let's say I started from a $0.00 balance, even though that's not possible. Then my balance will exactly follow the simple random walk, assuming that the coin is a fair coin, a 50-50 chance. So my balance is a simple random walk.

And then I say the following: you know what, I'm going to play. I want to make money.
So let's say I play until I win $100 or I lose $100. What is the probability that I will stop after winning $100?

AUDIENCE: 1/2.

PROFESSOR: 1/2, because?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Yes. So each happens with probability 1/2, and this is by symmetry: every chain of coin tosses which gives a winning sequence, when you flip it, gives a losing sequence. We have a one-to-one correspondence between those two things. That was good.

Now let me change it. What if I play until I win $100 or I lose $50? In other words, I look at the random walk, I look at the first time that it hits either this line or that line, and then I stop. What is the probability that I will stop after winning $100?

AUDIENCE: [INAUDIBLE]

PROFESSOR: 1/3? Let me see. Why 1/3?

AUDIENCE: [INAUDIBLE]

PROFESSOR: So you're saying the probability of hitting this line first is p, and the probability that you hit that line first is also p, right?
It's 1/2 and 1/2. But then you're saying that from there, it's the same again, so it should be 1/4 here, 1/2 times 1/2. You've got a good intuition. It is 1/3, actually.

AUDIENCE: [INAUDIBLE]

PROFESSOR: And then once you hit it, it's like the same afterwards? I'm not sure if there is a way to make an argument out of that. I really don't know; there might be or there might not be. I was thinking of a different way, but there might be a way to make an argument out of it. I just don't see it right now.

So in general, if you put a line at B above and a line at minus A below, then the probability of hitting B first is A over (A + B), and the probability of hitting minus A first is B over (A + B). So, in this case, with 100 and 50, it's 100 over 150: that one is 2/3 and this one is 1/3.

This can be proved, and it's actually not that difficult to prove; it's just hard to find the right way to look at it. So fix your B and A.
For each k between minus A and B, define f(k) as the probability that you hit the line B first, before hitting minus A, when you start at k. This kind of builds on what you were saying: now, instead of looking at one fixed starting point, we're going to change our starting point and look at all possibilities. So when you start at k, f(k) is the probability that you hit this line first, before hitting that line. What we are interested in is computing f(0).

What we know is that f(B) is equal to 1, and f(minus A) is equal to 0. And then there's one recursive formula that matters to us. If you start at k, you either go up or go down. You go up with probability 1/2, and you go down with probability 1/2, and then the process starts afresh, because of the stationarity property. So if you go up, the probability that you then hit B first is exactly f(k+1). If you go down, it's f(k-1).
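Putting the two cases together gives f(k) = (1/2) f(k+1) + (1/2) f(k-1), with f(-A) = 0 and f(B) = 1. This boundary-value recursion can be solved numerically; the sketch below (not part of the lecture) simply relaxes the interior values until they converge:

```python
def hit_probability(A, B, sweeps=20000):
    # f[i] is the probability of reaching +B before -A, starting from k = i - A
    # (index 0 corresponds to k = -A, index A + B corresponds to k = B).
    f = [0.0] * (A + B + 1)
    f[A + B] = 1.0                    # boundary values: f(-A) = 0, f(B) = 1
    for _ in range(sweeps):
        for i in range(1, A + B):     # update interior points only
            f[i] = 0.5 * (f[i + 1] + f[i - 1])
    return f[A]                       # starting balance 0 sits at index A
```

For example, hit_probability(50, 100) comes out at about 1/3 and hit_probability(100, 100) at 1/2, matching the answers above. In fact the recursion forces f to be linear in k, which is exactly why f(0) = A / (A + B).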
445 00:31:06,320 --> 00:31:08,510 And then that gives you a recursive formula 446 00:31:08,510 --> 00:31:09,990 with two boundary values. 447 00:31:09,990 --> 00:31:12,970 If you look at it, you can solve it. 448 00:31:12,970 --> 00:31:17,180 When you solve it, you'll get that answer. 449 00:31:17,180 --> 00:31:20,070 So I won't go into details, but what I wanted to show 450 00:31:20,070 --> 00:31:23,770 is that simple random walk really has these two 451 00:31:23,770 --> 00:31:24,840 properties. 452 00:31:24,840 --> 00:31:28,120 It has these properties and even more powerful ones. 453 00:31:28,120 --> 00:31:30,200 So it's really easy to control. 454 00:31:30,200 --> 00:31:32,080 And at the same time it's quite universal. 455 00:31:32,080 --> 00:31:36,790 It's not a very weak model. 456 00:31:36,790 --> 00:31:43,280 It's rather restricted, but it's a really good model 457 00:31:43,280 --> 00:31:46,880 for a mathematician. 458 00:31:46,880 --> 00:31:49,300 From the practical point of view, 459 00:31:49,300 --> 00:31:53,560 you'll have to twist some things slightly and so on. 460 00:31:53,560 --> 00:31:56,800 But in many cases, you can approximate it 461 00:31:56,800 --> 00:31:59,830 by simple random walk. 462 00:31:59,830 --> 00:32:04,885 And as you can see, you can do computations 463 00:32:04,885 --> 00:32:06,500 with simple random walk by hand. 464 00:32:10,500 --> 00:32:11,985 So that was it. 465 00:32:11,985 --> 00:32:14,119 I talked about the most important example 466 00:32:14,119 --> 00:32:15,035 of a stochastic process. 467 00:32:18,620 --> 00:32:23,780 Now, let's talk about more stochastic processes. 468 00:32:27,629 --> 00:32:31,701 The second one is called the Markov chain. 469 00:32:31,701 --> 00:32:34,376 Let me write that part, actually. 470 00:32:49,550 --> 00:32:52,180 So a Markov chain, unlike the simple random walk, 471 00:32:52,180 --> 00:32:53,775 is not a single stochastic process.
472 00:32:56,490 --> 00:33:00,000 A stochastic process is called a Markov chain 473 00:33:00,000 --> 00:33:02,110 if it has some property. 474 00:33:02,110 --> 00:33:05,690 And what we want to capture in a Markov chain 475 00:33:05,690 --> 00:33:09,690 is the following statement. 476 00:33:09,690 --> 00:33:17,090 This is a collection of stochastic processes having 477 00:33:17,090 --> 00:33:32,660 the property that the effect of the past on the future 478 00:33:32,660 --> 00:33:39,077 is summarized only by the current state. 479 00:33:45,760 --> 00:33:48,840 That's quite a vague statement. 480 00:33:48,840 --> 00:33:59,620 But what we're trying to capture here is-- now, 481 00:33:59,620 --> 00:34:05,840 look at some generic stochastic process at time t. 482 00:34:05,840 --> 00:34:08,260 You know all the history up to time t. 483 00:34:08,260 --> 00:34:12,280 You want to say something about the future. 484 00:34:12,280 --> 00:34:14,949 Then, if it's a Markov chain, what it's saying is, 485 00:34:14,949 --> 00:34:17,699 you don't even have to know all about this. 486 00:34:17,699 --> 00:34:19,199 Like this part is really irrelevant. 487 00:34:22,310 --> 00:34:27,853 What matters is the value at this last point, last time. 488 00:34:27,853 --> 00:34:30,600 So if it's a Markov chain, you don't 489 00:34:30,600 --> 00:34:32,480 have to know all this history. 490 00:34:32,480 --> 00:34:34,889 All you have to know is this single value. 491 00:34:34,889 --> 00:34:37,949 And all of the effect of the past on the future 492 00:34:37,949 --> 00:34:40,679 is contained in this value. 493 00:34:40,679 --> 00:34:42,190 Nothing else matters. 494 00:34:42,190 --> 00:34:44,000 Of course, this is a very special type 495 00:34:44,000 --> 00:34:45,690 of stochastic process. 496 00:34:45,690 --> 00:34:47,830 For most other stochastic processes, the future 497 00:34:47,830 --> 00:34:51,060 will depend on the whole history.
498 00:34:51,060 --> 00:34:53,480 And in that case, it's more difficult to analyze. 499 00:34:53,480 --> 00:34:56,280 But these ones are more manageable. 500 00:34:56,280 --> 00:34:58,250 And still, lots of interesting things 501 00:34:58,250 --> 00:35:00,380 turn out to be Markov chains. 502 00:35:00,380 --> 00:35:02,080 So if you look at simple random walk, 503 00:35:02,080 --> 00:35:06,310 it is a Markov chain, right? 504 00:35:06,310 --> 00:35:14,680 So simple random walk, let's say you went like that. 505 00:35:14,680 --> 00:35:20,160 Then what happens after time t really just depends 506 00:35:20,160 --> 00:35:23,460 on how high this point is. 507 00:35:23,460 --> 00:35:25,580 What happened before doesn't matter at all. 508 00:35:25,580 --> 00:35:29,070 Because we're just having new coin tosses every time. 509 00:35:29,070 --> 00:35:31,155 But this value can affect the future, 510 00:35:31,155 --> 00:35:32,530 because that's where you're going 511 00:35:32,530 --> 00:35:34,990 to start your process from. 512 00:35:34,990 --> 00:35:38,240 Like that's where you're starting your process. 513 00:35:38,240 --> 00:35:41,590 So that is a Markov chain. 514 00:35:41,590 --> 00:35:42,790 This part is irrelevant. 515 00:35:42,790 --> 00:35:45,412 Only the value matters. 516 00:35:45,412 --> 00:35:47,370 So let me define it a little bit more formally. 517 00:36:05,240 --> 00:36:27,814 A discrete-time stochastic process is a Markov chain 518 00:36:27,814 --> 00:36:36,230 if the probability that X at some time, t plus 1, 519 00:36:36,230 --> 00:36:43,230 is equal to something, some value, 520 00:36:43,230 --> 00:36:49,830 given the whole history up to time t, 521 00:36:49,830 --> 00:36:55,810 is equal to the probability that X_(t+1) is equal to that value, 522 00:36:55,810 --> 00:37:04,950 given only the value X_t, for all t 523 00:37:04,950 --> 00:37:10,260 greater than or equal to 0 and all values s.
524 00:37:10,260 --> 00:37:14,990 This is a mathematical way of writing that down. 525 00:37:14,990 --> 00:37:20,690 The value at X_(t+1), given all the values up to time t, 526 00:37:20,690 --> 00:37:23,830 is the same as the value at time t plus 1, 527 00:37:23,830 --> 00:37:26,993 the probability of it, given only the last value. 528 00:37:39,090 --> 00:37:41,750 And the reason simple random walk is a Markov chain 529 00:37:41,750 --> 00:37:45,560 is because both of them are just 1/2. 530 00:37:45,560 --> 00:37:50,920 I mean, if it's for-- let me write it down. 531 00:37:54,680 --> 00:37:59,470 So example: random walk. 532 00:38:03,943 --> 00:38:10,420 The probability that X_(t+1) is equal to s, given X_t, 533 00:38:10,420 --> 00:38:20,096 is equal to 1/2, if s is equal to X_t plus 1, or X_t minus 1, 534 00:38:20,096 --> 00:38:21,436 and 0 otherwise. 535 00:38:24,840 --> 00:38:30,185 So it really depends only on the last value of X_t. 536 00:38:30,185 --> 00:38:31,870 Any questions? 537 00:38:31,870 --> 00:38:32,910 All right. 538 00:38:36,460 --> 00:38:39,350 Now, in the case when you're looking 539 00:38:39,350 --> 00:38:41,610 at a stochastic process, a Markov chain, 540 00:38:41,610 --> 00:38:50,020 and all X_i have values in some set S, which 541 00:38:50,020 --> 00:38:59,120 is finite, a finite set, in that case, 542 00:38:59,120 --> 00:39:01,640 it's really easy to describe Markov chains. 543 00:39:04,380 --> 00:39:09,360 So now denote by P_(i,j) 544 00:39:09,360 --> 00:39:15,530 the probability that, if at time t 545 00:39:15,530 --> 00:39:18,520 you are at i, you 546 00:39:18,520 --> 00:39:33,942 jump to j at time t plus 1, for all pairs of points i, j. 547 00:39:38,100 --> 00:39:40,530 I mean, it's a finite set, so I might just as well 548 00:39:40,530 --> 00:39:45,160 call it the integer set from 1 to m, 549 00:39:45,160 --> 00:39:49,490 just to make the notation easier.
550 00:39:49,490 --> 00:39:57,710 Then, first of all, if you sum over all j in S, P_(i,j), 551 00:39:57,710 --> 00:39:59,216 that is equal to 1. 552 00:39:59,216 --> 00:40:01,060 Because if you start at i, you'll 553 00:40:01,060 --> 00:40:03,770 have to jump somewhere in your next step. 554 00:40:03,770 --> 00:40:06,650 So if you sum over all possible states you can have, 555 00:40:06,650 --> 00:40:09,680 you have to sum up to 1. 556 00:40:09,680 --> 00:40:12,690 And really, a very interesting thing 557 00:40:12,690 --> 00:40:16,620 is this matrix, called the transition probability 558 00:40:16,620 --> 00:40:24,740 matrix, defined as follows. 559 00:40:34,460 --> 00:40:40,540 So we put P_(i,j) at the i-th row and j-th column. 560 00:40:40,540 --> 00:40:42,130 And really, this tells you everything 561 00:40:42,130 --> 00:40:44,640 about the Markov chain. 562 00:40:44,640 --> 00:40:46,540 Everything about the stochastic process 563 00:40:46,540 --> 00:40:47,900 is contained in this matrix. 564 00:41:00,470 --> 00:41:02,070 That's because a future state only 565 00:41:02,070 --> 00:41:04,550 depends on the current state. 566 00:41:04,550 --> 00:41:08,210 So if you know what happens at time t, where it's at time t, 567 00:41:08,210 --> 00:41:10,800 you look at the matrix, you can decode 568 00:41:10,800 --> 00:41:12,030 all the information you want. 569 00:41:12,030 --> 00:41:14,990 What is the probability that it will be at-- let's say, 570 00:41:14,990 --> 00:41:15,824 it's at 0 right now. 571 00:41:15,824 --> 00:41:17,281 What's the probability that it will 572 00:41:17,281 --> 00:41:18,410 jump to 1 at the next time? 573 00:41:18,410 --> 00:41:21,180 Just look at 0 comma 1, here. 574 00:41:21,180 --> 00:41:23,040 There is no 0, 1, here, so it's 1 and 2. 575 00:41:23,040 --> 00:41:28,690 Just look at 1 and 2, 1 and 2, i and j. 576 00:41:28,690 --> 00:41:29,814 Actually, I made a mistake. 577 00:41:37,074 --> 00:41:39,010 That should be the right one.
578 00:41:42,410 --> 00:41:45,180 Not only that, that's a one-step. 579 00:41:45,180 --> 00:41:46,840 So what happened is it describes what 580 00:41:46,840 --> 00:41:48,910 happens in a single step, the probability 581 00:41:48,910 --> 00:41:51,410 that you jump from i to j. 582 00:41:51,410 --> 00:41:53,330 But using that, you can also model 583 00:41:53,330 --> 00:41:58,260 what's the probability that you jump from i to j in two steps. 584 00:41:58,260 --> 00:42:03,110 So let's define q sub i, j as the probability 585 00:42:03,110 --> 00:42:08,440 that X at time t plus 2 is equal to j, given that X at time t 586 00:42:08,440 --> 00:42:12,070 is equal to i. 587 00:42:12,070 --> 00:42:25,020 Then the matrix, defined this way, 588 00:42:25,020 --> 00:42:27,100 can you describe it in terms of the matrix A? 589 00:42:33,620 --> 00:42:34,800 Anybody? 590 00:42:34,800 --> 00:42:35,980 Multiplication? 591 00:42:35,980 --> 00:42:36,810 Very good. 592 00:42:36,810 --> 00:42:37,700 So it's A square. 593 00:42:42,990 --> 00:42:44,200 Why is that? 594 00:42:44,200 --> 00:42:46,930 So let me write this down in a different way. 595 00:42:46,930 --> 00:42:55,150 q_(i,j) is, you sum over all intermediate values 596 00:42:55,150 --> 00:43:03,680 the probability that you jump from i to k, first, 597 00:43:03,680 --> 00:43:05,900 and then the probability that you jump from k to j. 598 00:43:12,480 --> 00:43:14,940 And if you look at what this means, 599 00:43:14,940 --> 00:43:20,910 each entry here is described by the dot 600 00:43:20,910 --> 00:43:24,840 product of a row and a column. 601 00:43:24,840 --> 00:43:26,932 And that's exactly what occurs. 602 00:43:26,932 --> 00:43:29,140 And if you want to look at the three-step, four-step, 603 00:43:29,140 --> 00:43:31,390 all you have to do is just multiply it again and again 604 00:43:31,390 --> 00:43:33,230 and again.
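The identity q_(i,j) = sum over k of P_(i,k) times P_(k,j) is exactly the rule for matrix multiplication, so the two-step matrix is A squared. A small sketch with a made-up two-state transition matrix (the numbers here are mine, not from the lecture):

```python
def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical 2-state transition matrix; each row sums to 1.
A = [[0.9, 0.1],
     [0.4, 0.6]]

A2 = matmul(A, A)  # A2[i][j] = P(X_{t+2} = j | X_t = i)
print(A2[0][0])    # by hand: 0.9*0.9 + 0.1*0.4 = 0.85
```

Three-step, four-step, and so on are just further powers of A, and each row of every power still sums to 1, since the chain has to land somewhere.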
605 00:43:33,230 --> 00:43:35,430 Really, this matrix contains all the information 606 00:43:35,430 --> 00:43:40,290 you want if you have a Markov chain and it's finite. 607 00:43:40,290 --> 00:43:41,882 That's very important. 608 00:43:41,882 --> 00:43:44,310 For random walk, simple random walk, 609 00:43:44,310 --> 00:43:46,840 I told you that it is a Markov chain. 610 00:43:46,840 --> 00:43:50,570 But it does not have a transition probability matrix, 611 00:43:50,570 --> 00:43:53,191 because the state space is not finite. 612 00:43:53,191 --> 00:43:54,045 So be careful. 613 00:43:57,740 --> 00:44:00,680 However, for finite Markov chains, really, there's 614 00:44:00,680 --> 00:44:08,280 one matrix that describes everything. 615 00:44:08,280 --> 00:44:13,110 I mean, I said it like it's something very interesting. 616 00:44:13,110 --> 00:44:15,790 But if you think about it, you just 617 00:44:15,790 --> 00:44:17,766 wrote down all the probabilities. 618 00:44:17,766 --> 00:44:19,140 So it should describe everything. 619 00:44:34,542 --> 00:44:35,125 So an example. 620 00:44:41,152 --> 00:44:48,900 You have a machine, and it's broken 621 00:44:48,900 --> 00:44:53,180 or working on a given day. 622 00:45:00,580 --> 00:45:02,070 That's a silly example. 623 00:45:02,070 --> 00:45:13,388 So if it's working today, then tomorrow it's 624 00:45:13,388 --> 00:45:25,260 broken with probability 0.01 and working with probability 0.99. 625 00:45:25,260 --> 00:45:29,300 If it's broken, the probability that it's repaired 626 00:45:29,300 --> 00:45:35,500 on the next day is 0.8. 627 00:45:35,500 --> 00:45:40,450 And it stays broken with probability 0.2. 628 00:45:40,450 --> 00:45:42,380 Suppose you have something like this. 629 00:45:47,170 --> 00:45:50,854 This is an example of a Markov chain used in engineering 630 00:45:50,854 --> 00:45:51,395 applications. 631 00:45:56,560 --> 00:46:01,296 In this case, S is also called the state space, actually.
632 00:46:01,296 --> 00:46:04,170 And the reason is because, in many cases, 633 00:46:04,170 --> 00:46:07,990 what you're modeling is these kinds of states of some system, 634 00:46:07,990 --> 00:46:13,750 like broken or working, or rainy, sunny, cloudy as weather. 635 00:46:13,750 --> 00:46:18,380 And all these things that you model 636 00:46:18,380 --> 00:46:20,210 represent states a lot of the time. 637 00:46:20,210 --> 00:46:22,505 So you call it a state set as well. 638 00:46:22,505 --> 00:46:24,175 So that's an example. 639 00:46:24,175 --> 00:46:26,000 And let's see what happens for this matrix. 640 00:46:28,520 --> 00:46:30,720 We have two states, working and broken. 641 00:46:35,680 --> 00:46:37,680 Working to working is 0.99. 642 00:46:37,680 --> 00:46:40,530 Working to broken is 0.01. 643 00:46:40,530 --> 00:46:42,600 Broken to working is 0.8. 644 00:46:42,600 --> 00:46:53,590 Broken to broken is 0.2. 645 00:46:53,590 --> 00:46:55,512 So that's what we've learned so far. 646 00:46:55,512 --> 00:47:00,660 And the question is, what happens if you start from some state, 647 00:47:00,660 --> 00:47:04,030 let's say it was working today, and you 648 00:47:04,030 --> 00:47:12,900 go a very, very long time, like a year or 10 years, 649 00:47:12,900 --> 00:47:16,720 then the distribution, after 10 years, on that day, 650 00:47:16,720 --> 00:47:20,300 is A to the 3,650. 651 00:47:20,300 --> 00:47:24,680 So that will be-- that times [1, 0] 652 00:47:24,680 --> 00:47:27,440 will be the probability [p, q]. 653 00:47:27,440 --> 00:47:30,030 p will be the probability that it's working at that time. 654 00:47:30,030 --> 00:47:32,414 q will be the probability that it's broken at that time. 655 00:47:35,760 --> 00:47:37,630 What will p and q be? 656 00:47:45,340 --> 00:47:46,655 What will p and q be? 657 00:47:46,655 --> 00:47:48,530 That's the question that we're trying to ask.
658 00:47:55,130 --> 00:47:57,030 We didn't learn, so far, how to do this, 659 00:47:57,030 --> 00:47:58,400 but let's think about it. 660 00:48:01,220 --> 00:48:06,946 I'm going to cheat a little bit and just say, 661 00:48:06,946 --> 00:48:12,400 you know what, I think, over a long period of time, 662 00:48:12,400 --> 00:48:20,760 the probability distribution on day 3,650 and that on day 3,651 663 00:48:20,760 --> 00:48:22,490 shouldn't be that different. 664 00:48:22,490 --> 00:48:25,246 They should be about the same. 665 00:48:25,246 --> 00:48:26,370 Let's make that assumption. 666 00:48:26,370 --> 00:48:27,770 I don't know if it's true or not. 667 00:48:27,770 --> 00:48:32,470 Well, I know it's true, but that's what I'm telling you. 668 00:48:32,470 --> 00:48:38,300 Under that assumption, now you can solve what p and q are. 669 00:48:38,300 --> 00:48:49,180 So approximately, I hope, p, q-- so A^3650 * [1, 670 00:48:49,180 --> 00:48:56,350 0] is approximately the same as A to the 3651, [1, 0]. 671 00:48:56,350 --> 00:48:58,555 That means that this is [p, q]. 672 00:48:58,555 --> 00:49:01,121 [p, q] is about the same as A times [p, q]. 673 00:49:04,970 --> 00:49:07,510 Anybody remember what this is? 674 00:49:07,510 --> 00:49:09,030 Yes. 675 00:49:09,030 --> 00:49:11,475 So [p, q] will be the eigenvector of this matrix. 676 00:49:14,090 --> 00:49:17,350 Over a long period of time, the probability distribution 677 00:49:17,350 --> 00:49:20,470 that you will observe will be the eigenvector. 678 00:49:23,650 --> 00:49:26,510 And what's the eigenvalue? 679 00:49:26,510 --> 00:49:30,752 1, at least in this case, it looks like it's 1. 680 00:49:30,752 --> 00:49:33,210 Now I'll make one more connection. 681 00:49:33,210 --> 00:49:36,954 Do you remember the Perron-Frobenius theorem? 682 00:49:36,954 --> 00:49:40,400 So this is a matrix. 683 00:49:40,400 --> 00:49:43,560 All entries are positive.
684 00:49:43,560 --> 00:49:45,980 So there is a largest eigenvalue, 685 00:49:45,980 --> 00:49:49,870 which is positive and real. 686 00:49:49,870 --> 00:49:52,670 And there is an all-positive eigenvector corresponding 687 00:49:52,670 --> 00:49:53,415 to it. 688 00:49:56,555 --> 00:49:58,930 What I'm trying to say is that's going to be your [p, q]. 689 00:50:06,380 --> 00:50:09,050 But let me not jump to the conclusion yet. 690 00:50:27,060 --> 00:50:37,090 And one more thing we know is, by Perron-Frobenius, there 691 00:50:37,090 --> 00:50:41,330 exists an eigenvalue, the largest one, lambda 692 00:50:41,330 --> 00:50:50,699 greater than 0, and eigenvector [v 1, v 2], where [v 1, v 2] 693 00:50:50,699 --> 00:50:51,240 are positive. 694 00:50:54,340 --> 00:50:57,100 Moreover, lambda has multiplicity 1. 695 00:50:57,100 --> 00:50:58,650 I'll get back to it later. 696 00:50:58,650 --> 00:51:00,250 So let's write this down. 697 00:51:00,250 --> 00:51:07,032 A times [v 1, v 2] is equal to lambda times [v 1, v2]. 698 00:51:07,032 --> 00:51:08,740 A times [v 1, v 2], we can write it down. 699 00:51:08,740 --> 00:51:14,430 It's 0.99 v_1 plus 0.01 v_2. 700 00:51:14,430 --> 00:51:22,169 And that 0.8 v_1 plus 0.2 v_2, which is equal to [v1, v2]. 701 00:51:26,140 --> 00:51:28,190 You can solve v_1 and v_2, but before doing 702 00:51:28,190 --> 00:51:41,501 that-- sorry about that. 703 00:51:41,501 --> 00:51:42,487 This is flipped. 704 00:51:51,544 --> 00:51:52,960 Yeah, so everybody, it should have 705 00:51:52,960 --> 00:51:55,466 been flipped in the beginning. 706 00:51:55,466 --> 00:51:57,876 So that's 0.8. 707 00:52:02,710 --> 00:52:10,190 So sum these two values, and you get lambda times [v 1, v 2]. 708 00:52:10,190 --> 00:52:14,101 On the left, what you get is v_1 plus v_2, 709 00:52:14,101 --> 00:52:15,833 you sum the two coordinates. 710 00:52:18,611 --> 00:52:20,880 So on the left, you get v_1 plus v_2.
711 00:52:20,880 --> 00:52:25,320 On the right, you get lambda times v_1 plus v_2. 712 00:52:25,320 --> 00:52:27,642 That means your lambda is equal to 1. 713 00:52:34,064 --> 00:52:38,600 So that eigenvalue, guaranteed by the Perron-Frobenius theorem, 714 00:52:38,600 --> 00:52:41,630 is 1, an eigenvalue of 1. 715 00:52:41,630 --> 00:52:45,670 So what you'll find here will be the eigenvector 716 00:52:45,670 --> 00:52:49,857 corresponding to the largest eigenvalue, 717 00:52:49,857 --> 00:52:52,440 and that largest eigenvalue, as we just saw, 718 00:52:52,440 --> 00:52:53,710 is equal to 1. 719 00:52:53,710 --> 00:52:56,250 And that's something very general. 720 00:52:56,250 --> 00:53:00,770 It's not just about this matrix and this special example. 721 00:53:00,770 --> 00:53:03,940 In general, if you have a transition matrix, 722 00:53:03,940 --> 00:53:09,460 if you're given a Markov chain and given a transition matrix, 723 00:53:09,460 --> 00:53:11,310 the Perron-Frobenius theorem guarantees 724 00:53:11,310 --> 00:53:14,180 that there exists such a vector as long as all the entries are 725 00:53:14,180 --> 00:53:15,520 positive. 726 00:53:15,520 --> 00:53:25,150 So in general, if the transition matrix of a Markov chain 727 00:53:25,150 --> 00:53:39,170 has positive entries, then there exists a vector pi_1 up 728 00:53:39,170 --> 00:53:49,600 to pi_m such that-- I'll just call it v-- Av is equal to v. 729 00:53:49,600 --> 00:53:52,400 And that will be the long-term behavior as explained. 730 00:53:52,400 --> 00:53:56,790 Over a long term, if it converges to some state, 731 00:53:56,790 --> 00:53:59,470 it has to satisfy that. 732 00:53:59,470 --> 00:54:01,630 And by the Perron-Frobenius theorem, we 733 00:54:01,630 --> 00:54:04,440 know that there is a vector satisfying it. 734 00:54:04,440 --> 00:54:09,090 So if it converges, it will converge to that.
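For the working/broken machine, the long-run behavior can be checked by simply iterating the chain for 3,650 days. A sketch of that computation (I keep the distribution as a row vector [p_working, p_broken] and update it with new_v[j] = sum over i of v[i] * P[i][j], which sidesteps the row-versus-column flip that came up at the board):

```python
def step(v, P):
    """One day of evolution: new_v[j] = sum_i v[i] * P[i][j]."""
    n = len(P)
    return [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]

# Machine example; rows and columns ordered (working, broken).
P = [[0.99, 0.01],
     [0.80, 0.20]]

v = [1.0, 0.0]          # certainly working on day 0
for _ in range(3650):   # ten years
    v = step(v, P)

print(v)  # the stationary distribution: p = 0.8/0.81, q = 0.01/0.81
```

Solving v = vP directly together with p + q = 1 gives the same answer, and that v is exactly the positive eigenvector with eigenvalue 1 promised by Perron-Frobenius.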
735 00:54:09,090 --> 00:54:11,990 And what it's saying is, if all the entries are positive, 736 00:54:11,990 --> 00:54:13,280 then it converges. 737 00:54:13,280 --> 00:54:15,450 And there is such a state. 738 00:54:15,450 --> 00:54:17,810 We know the long-term behavior of the system. 739 00:54:26,050 --> 00:54:28,330 So this is called the stationary distribution. 740 00:54:32,480 --> 00:54:36,290 Such a vector v is called a stationary distribution. 741 00:54:44,090 --> 00:54:46,230 It's not really right to say that a vector is 742 00:54:46,230 --> 00:54:47,670 a stationary distribution. 743 00:54:47,670 --> 00:54:52,080 But if I give this distribution to the state space, 744 00:54:52,080 --> 00:55:03,340 what I mean is consider a probability distribution over S 745 00:55:03,340 --> 00:55:10,810 such that the probability that a random variable X is 746 00:55:10,810 --> 00:55:12,730 equal to i is equal to pi_i. 747 00:55:15,660 --> 00:55:18,830 If you start from this distribution, in the next step, 748 00:55:18,830 --> 00:55:22,050 you'll have the exact same distribution. 749 00:55:22,050 --> 00:55:23,570 That's what I'm trying to say here. 750 00:55:23,570 --> 00:55:25,952 That's called a stationary distribution. 751 00:55:34,930 --> 00:55:35,590 Any questions? 752 00:55:38,518 --> 00:55:41,836 AUDIENCE: So [INAUDIBLE]? 753 00:55:46,535 --> 00:55:47,160 PROFESSOR: Yes. 754 00:55:47,160 --> 00:55:48,023 Very good question. 755 00:55:51,741 --> 00:55:53,366 Yeah, but the Perron-Frobenius theorem says 756 00:55:53,366 --> 00:55:55,282 there is exactly one eigenvector corresponding 757 00:55:55,282 --> 00:55:58,100 to the largest eigenvalue. 758 00:55:58,100 --> 00:56:00,280 And that turns out to be 1. 759 00:56:00,280 --> 00:56:02,740 The largest eigenvalue turns out to be 1. 760 00:56:02,740 --> 00:56:06,400 So there will be a unique stationary distribution 761 00:56:06,400 --> 00:56:09,818 if all the entries are positive. 762 00:56:14,226 --> 00:56:15,142 AUDIENCE: [INAUDIBLE]?
763 00:56:21,920 --> 00:56:23,135 PROFESSOR: This one? 764 00:56:23,135 --> 00:56:24,051 AUDIENCE: [INAUDIBLE]? 765 00:56:33,991 --> 00:56:36,476 PROFESSOR: Maybe. 766 00:56:36,476 --> 00:56:37,967 It's a good point. 767 00:56:57,350 --> 00:56:58,344 Huh? 768 00:56:58,344 --> 00:56:59,835 Something is wrong. 769 00:57:06,310 --> 00:57:07,310 Can anybody help me? 770 00:57:07,310 --> 00:57:09,170 This part looks questionable. 771 00:57:09,170 --> 00:57:11,154 AUDIENCE: Just kind of [INAUDIBLE] question, 772 00:57:11,154 --> 00:57:13,898 is that topic covered in portions of [INAUDIBLE]? 773 00:57:17,874 --> 00:57:21,850 The other eigenvalues in the matrix are smaller than 1. 774 00:57:21,850 --> 00:57:26,390 And so when you take products of the transition probability 775 00:57:26,390 --> 00:57:33,150 matrix, those eigenvalues that are smaller than 1 scale 776 00:57:33,150 --> 00:57:37,740 after repeated multiplication to 0. 777 00:57:37,740 --> 00:57:41,750 So in the limit, they're 0, but until you get to the limit, 778 00:57:41,750 --> 00:57:43,739 you still have them. 779 00:57:43,739 --> 00:57:45,155 Essentially, that kind of behavior 780 00:57:45,155 --> 00:57:49,065 is transitionary behavior that dissipates. 781 00:57:49,065 --> 00:57:53,470 But the behavior corresponding to the stationary distribution 782 00:57:53,470 --> 00:57:53,970 persists. 783 00:57:57,320 --> 00:57:58,850 PROFESSOR: But, as you mentioned, 784 00:57:58,850 --> 00:58:02,000 this argument seems to be giving that all lambda has 785 00:58:02,000 --> 00:58:02,625 to be 1, right? 786 00:58:02,625 --> 00:58:05,882 Is that your point? 787 00:58:05,882 --> 00:58:06,970 You're right. 788 00:58:06,970 --> 00:58:09,167 I don't see what the problem is right now. 789 00:58:09,167 --> 00:58:10,250 I'll think about it later. 790 00:58:10,250 --> 00:58:14,850 I don't want to waste my time on trying to find what's wrong. 791 00:58:14,850 --> 00:58:16,660 But the conclusion is right. 
792 00:58:16,660 --> 00:58:18,510 There will be a unique one and so on. 793 00:58:24,405 --> 00:58:26,020 Now let me make a note here. 794 00:58:35,910 --> 00:58:39,720 So let me move on to the final topic. 795 00:58:39,720 --> 00:58:40,930 It's called a martingale. 796 00:58:52,850 --> 00:58:57,030 This is another collection 797 00:58:57,030 --> 00:58:58,990 of stochastic processes. 798 00:58:58,990 --> 00:59:04,750 And what we're trying to model here is a fair game. 799 00:59:04,750 --> 00:59:13,930 Stochastic processes which are a fair game. 800 00:59:19,670 --> 00:59:35,990 And formally, what I mean is a stochastic process is 801 00:59:35,990 --> 01:00:20,770 a martingale if the expectation of X_(t+1), given everything up to time t, equals X_t. 802 01:00:20,770 --> 01:00:22,310 Let me reiterate it. 803 01:00:22,310 --> 01:00:28,100 So what we have here is, at time t, 804 01:00:28,100 --> 01:00:30,910 if you look at what's going to happen at time t plus 1, 805 01:00:30,910 --> 01:00:33,660 take the expectation, then it has 806 01:00:33,660 --> 01:00:36,640 to be exactly equal to the value of X_t. 807 01:00:36,640 --> 01:00:41,920 So we have this stochastic process, and, at time t, 808 01:00:41,920 --> 01:00:44,180 you are at X_t. 809 01:00:44,180 --> 01:00:49,250 At time t plus 1, lots of things can happen. 810 01:00:49,250 --> 01:00:52,570 It might go to this point, that point, that point, or so on. 811 01:00:52,570 --> 01:00:54,080 But the probability distribution is 812 01:00:54,080 --> 01:00:59,290 designed so that the expected value over all these 813 01:00:59,290 --> 01:01:02,730 is exactly equal to the value at X_t. 814 01:01:02,730 --> 01:01:06,260 So it's kind of centered at X_t, centered meaning 815 01:01:06,260 --> 01:01:09,620 in the probabilistic sense. 816 01:01:09,620 --> 01:01:12,590 The expectation is equal to that.
817 01:01:12,590 --> 01:01:16,040 So if your value at time t was something else, 818 01:01:16,040 --> 01:01:19,070 your values at time t plus 1 will 819 01:01:19,070 --> 01:01:21,587 be centered at this value instead of that value. 820 01:01:24,720 --> 01:01:27,980 And the reason I'm saying it models 821 01:01:27,980 --> 01:01:34,790 a fair game is because, if this is 822 01:01:34,790 --> 01:01:41,510 like your balance over some game, in expectation, 823 01:01:41,510 --> 01:01:47,670 you're not supposed to win any money at all. 824 01:01:47,670 --> 01:01:50,070 And I will tell you more about that later. 825 01:01:55,610 --> 01:01:59,380 So example, a random walk is a martingale. 826 01:02:18,710 --> 01:02:19,690 What else? 827 01:02:24,490 --> 01:02:28,820 Second one, now let's say you're in a casino 828 01:02:28,820 --> 01:02:31,580 and you're playing roulette. 829 01:02:31,580 --> 01:02:40,150 The balance of a roulette player is not a martingale. 830 01:02:46,700 --> 01:02:49,610 Because it's designed so that the expected value 831 01:02:49,610 --> 01:02:52,150 is less than 0. 832 01:02:52,150 --> 01:02:53,880 You're supposed to lose money. 833 01:02:53,880 --> 01:02:57,850 Of course, in one instance, you might win money. 834 01:02:57,850 --> 01:03:02,310 But in expected value, you're designed to go down. 835 01:03:05,400 --> 01:03:06,730 So it's not a martingale. 836 01:03:06,730 --> 01:03:09,420 It's not a fair game. 837 01:03:09,420 --> 01:03:11,820 The game is designed for the casino, not for you. 838 01:03:15,470 --> 01:03:18,106 Third one is some funny example. 839 01:03:18,106 --> 01:03:24,596 I just made it up to show that there are many possible ways 840 01:03:24,596 --> 01:03:28,130 that a stochastic process can be a martingale.
841 01:03:28,130 --> 01:03:35,450 So if Y_i are IID random variables such 842 01:03:35,450 --> 01:03:45,048 that Y_i is equal to 2, with probability 1/3, and 1/2, 843 01:03:45,048 --> 01:04:00,854 with probability 2/3, then let X_0 equal 1 and X_k equal 844 01:04:05,255 --> 01:04:07,827 the product Y_1 times Y_2 up to Y_k. Then that is a martingale. 845 01:04:11,170 --> 01:04:14,960 So at each step, you'll either multiply by 2 or, 846 01:04:14,960 --> 01:04:18,140 with the 1/2, just divide by 2. 847 01:04:18,140 --> 01:04:23,260 And the probability distribution is given as 1/3 and 2/3. 848 01:04:23,260 --> 01:04:26,910 Then X_k is a martingale. 849 01:04:26,910 --> 01:04:32,910 The reason is-- so you can compute the expected value. 850 01:04:32,910 --> 01:04:45,800 The expected value of X_(k+1), given X_k up to X_0, 851 01:04:45,800 --> 01:04:58,880 is equal to-- what you have is the expected value of Y_(k+1) times 852 01:04:58,880 --> 01:05:03,942 Y_k up to Y_1. 853 01:05:03,942 --> 01:05:05,436 That part is X_k. 854 01:05:08,930 --> 01:05:12,438 But this is designed so that the expected value is equal to 1. 855 01:05:20,030 --> 01:05:21,146 So it's a martingale. 856 01:05:26,460 --> 01:05:29,240 I mean it will fluctuate a lot, your balance, 857 01:05:29,240 --> 01:05:32,510 double, double, double, half, half, half, and so on. 858 01:05:32,510 --> 01:05:36,999 But still, in expectation, you will always maintain it. 859 01:05:36,999 --> 01:05:39,040 I mean the expectation at all times is equal to 1, 860 01:05:39,040 --> 01:05:40,910 if you look at it from the beginning. 861 01:05:40,910 --> 01:05:43,880 You look at time 1, then the expected value of X_1 862 01:05:43,880 --> 01:05:44,870 and so on. 863 01:05:48,340 --> 01:05:50,410 Any questions on definition or example? 864 01:05:53,090 --> 01:05:56,580 So the random walk is an example which is both a Markov 865 01:05:56,580 --> 01:05:58,820 chain and a martingale. 866 01:05:58,820 --> 01:06:02,640 But these two concepts are really two different concepts.
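This example can be sanity-checked numerically: E[Y] = 2*(1/3) + (1/2)*(2/3) = 1, so E[X_k] = 1 for every k, even though individual paths swing wildly. A small simulation sketch (the function name and the run parameters are mine):

```python
import random

def simulate_X(k, rng):
    """X_k = Y_1 * ... * Y_k, with Y_i = 2 w.p. 1/3 and 1/2 w.p. 2/3."""
    x = 1.0
    for _ in range(k):
        x *= 2.0 if rng.random() < 1 / 3 else 0.5
    return x

rng = random.Random(42)
n = 200_000
avg = sum(simulate_X(10, rng) for _ in range(n)) / n
print(avg)  # hovers near 1, the martingale expectation
```

Note that a typical single path decays (it halves twice as often as it doubles), but the rare paths that double many times pull the average back up to 1.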
867 01:06:02,640 --> 01:06:04,800 Try not to be confused between the two. 868 01:06:04,800 --> 01:06:06,339 They're just two different things. 869 01:06:11,330 --> 01:06:13,670 There are Markov chains which are not martingales. 870 01:06:13,670 --> 01:06:16,510 There are martingales which are not Markov chains. 871 01:06:16,510 --> 01:06:19,030 And there are some things which are both, 872 01:06:19,030 --> 01:06:21,240 like a simple random walk. 873 01:06:21,240 --> 01:06:24,400 And there is some stuff which is neither of them. 874 01:06:24,400 --> 01:06:26,350 They really are just two separate things.
892 01:07:35,540 --> 01:07:37,230 Of course, there are technical conditions 893 01:07:37,230 --> 01:07:38,160 that have to be there. 894 01:07:42,320 --> 01:07:46,930 So if you're playing a martingale game, 895 01:07:46,930 --> 01:07:50,470 then you're not supposed to win or lose, 896 01:07:50,470 --> 01:07:51,470 at least in expectation. 897 01:07:53,995 --> 01:07:56,080 So before stating the theorem, I have 898 01:07:56,080 --> 01:07:59,524 to define what a stopping time means. 899 01:08:05,820 --> 01:08:27,279 So given a stochastic process, a non-negative integer 900 01:08:27,279 --> 01:08:39,896 valued random variable tau is called a stopping time 901 01:08:39,896 --> 01:08:48,350 if, for all integers k greater than or equal to 0, the event that tau is 902 01:08:48,350 --> 01:09:00,380 less than or equal to k depends only on X_1 to X_k. 903 01:09:00,380 --> 01:09:04,960 So that may look very, very strange. 904 01:09:04,960 --> 01:09:07,950 I want to define something called a stopping time. 905 01:09:07,950 --> 01:09:11,550 It will be a non-negative integer valued random variable. 906 01:09:11,550 --> 01:09:14,649 So it will be 0, 1, 2, or so on. 907 01:09:14,649 --> 01:09:18,560 That means it will be some time index. 908 01:09:18,560 --> 01:09:22,229 And if you look at the event that tau is less than 909 01:09:22,229 --> 01:09:27,800 or equal to k-- so if you look at the event 910 01:09:27,800 --> 01:09:32,229 that you stop at a time less than or equal to k-- 911 01:09:32,229 --> 01:09:34,760 your decision depends only on the values 912 01:09:34,760 --> 01:09:40,410 of the stochastic process 913 01:09:40,410 --> 01:09:43,340 up to time k. 914 01:09:43,340 --> 01:09:45,540 In other words, suppose this is some strategy 915 01:09:45,540 --> 01:09:49,930 you want to use-- by strategy I mean some rule 916 01:09:49,930 --> 01:09:53,540 by which you stop playing at some point. 
917 01:09:53,540 --> 01:09:55,840 You have a strategy that is defined 918 01:09:55,840 --> 01:10:00,040 as: you play some k rounds, and then you look at the outcome. 919 01:10:00,040 --> 01:10:02,480 You say, OK, now I think it's in my favor. 920 01:10:02,480 --> 01:10:03,460 I'm going to stop. 921 01:10:03,460 --> 01:10:05,225 You have a pre-defined set of rules. 922 01:10:08,130 --> 01:10:12,540 And if that strategy depends only 923 01:10:12,540 --> 01:10:16,570 on the values of the stochastic process up to right now, 924 01:10:16,570 --> 01:10:18,880 then it's a stopping time. 925 01:10:18,880 --> 01:10:21,370 If it's some strategy that depends on future values, 926 01:10:21,370 --> 01:10:23,680 it's not a stopping time. 927 01:10:23,680 --> 01:10:25,468 Let me show you by example. 928 01:10:28,150 --> 01:10:31,640 Remember the coin toss game whose balance was a random walk-- 929 01:10:31,640 --> 01:10:35,790 at each step you either win $1 or lose $1. 930 01:10:35,790 --> 01:10:49,980 So in the coin toss game, first, let tau be the first time 931 01:10:49,980 --> 01:11:02,778 at which the balance becomes $100; then tau is a stopping time. 932 01:11:10,770 --> 01:11:15,410 Or, second, you stop at either $100 or negative 933 01:11:15,410 --> 01:11:17,850 $50; that's still a stopping time. 934 01:11:17,850 --> 01:11:21,370 Remember that we discussed it? 935 01:11:21,370 --> 01:11:22,780 We look at our balance. 936 01:11:22,780 --> 01:11:27,300 We stop either at the time when we win $100 or when we lose $50. 937 01:11:27,300 --> 01:11:29,824 That is a stopping time. 938 01:11:29,824 --> 01:11:32,320 But I think it's better to show you an example of what is not 939 01:11:32,320 --> 01:11:33,700 a stopping time. 940 01:11:33,700 --> 01:11:36,660 That will really help. 941 01:11:36,660 --> 01:11:50,280 So let tau be-- in the same game-- the time of the first peak. 942 01:11:50,280 --> 01:11:54,310 By peak, I mean the time right before you go down, 943 01:11:54,310 --> 01:11:57,794 so that would be your tau. 
944 01:11:57,794 --> 01:12:00,250 So the first time when you start to go down, 945 01:12:00,250 --> 01:12:02,150 you're going to stop. 946 01:12:02,150 --> 01:12:04,680 That's not a stopping time. 947 01:12:04,680 --> 01:12:06,640 Not a stopping time. 948 01:12:12,000 --> 01:12:15,710 To see formally why that's the case: first of all, if you want 949 01:12:15,710 --> 01:12:18,470 to decide whether it's a peak or not at time t, 950 01:12:18,470 --> 01:12:21,900 you have to refer to the value at time t plus 1. 951 01:12:21,900 --> 01:12:23,983 If you're just looking at values up to time t, 952 01:12:23,983 --> 01:12:25,955 you don't know if it's going to be a peak 953 01:12:25,955 --> 01:12:28,440 or if it's going to continue up. 954 01:12:28,440 --> 01:12:32,860 So the event that you stop at time t 955 01:12:32,860 --> 01:12:38,150 depends on time t plus 1 as well, which doesn't 956 01:12:38,150 --> 01:12:41,022 fit this definition. 957 01:12:41,022 --> 01:12:43,050 So that's what we're trying to distinguish 958 01:12:43,050 --> 01:12:45,580 by defining a stopping time. 959 01:12:45,580 --> 01:12:48,330 In those first cases it was clear: at the time, 960 01:12:48,330 --> 01:12:50,110 you know whether you have to stop or not. 961 01:12:50,110 --> 01:12:51,610 But if you define tau 962 01:12:51,610 --> 01:12:53,170 in this way, 963 01:12:53,170 --> 01:12:56,820 your decision 964 01:12:56,820 --> 01:12:59,670 depends on future values of the outcome. 965 01:12:59,670 --> 01:13:04,035 So it's not a stopping time under this definition. 966 01:13:04,035 --> 01:13:04,618 Any questions? 967 01:13:04,618 --> 01:13:07,082 Does it make sense? 968 01:13:07,082 --> 01:13:07,582 Yes? 969 01:13:07,582 --> 01:13:11,534 AUDIENCE: Could you still have tau as a stopping time, 970 01:13:11,534 --> 01:13:14,498 if you were referring to t, and then t minus 1 971 01:13:14,498 --> 01:13:16,990 was greater than [INAUDIBLE]? 
972 01:13:16,990 --> 01:13:18,005 PROFESSOR: So. 973 01:13:18,005 --> 01:13:20,442 AUDIENCE: Let's say, yeah, it was [INAUDIBLE]. 974 01:13:20,442 --> 01:13:21,900 PROFESSOR: So the time after the peak, 975 01:13:21,900 --> 01:13:22,640 the first time after the peak? 976 01:13:22,640 --> 01:13:23,223 AUDIENCE: Yes. 977 01:13:23,223 --> 01:13:25,640 PROFESSOR: Yes, that will be a stopping time. 978 01:13:25,640 --> 01:13:38,030 So, example three: if tau is tau_0 plus 1, where tau_0 is the time of the first peak, 979 01:13:38,030 --> 01:13:39,630 then it is a stopping time. 980 01:13:39,630 --> 01:13:41,106 It's a stopping time. 981 01:14:06,200 --> 01:14:10,210 So the optional stopping theorem that I promised 982 01:14:10,210 --> 01:14:13,150 says the following. 983 01:14:13,150 --> 01:14:25,898 Suppose we have a martingale, and tau is a stopping time. 984 01:14:29,834 --> 01:14:36,900 And further suppose that there exists 985 01:14:36,900 --> 01:14:43,540 a constant T such that tau is less than or equal to T always. 986 01:14:46,180 --> 01:14:49,780 So you have some strategy which is a finite strategy. 987 01:14:49,780 --> 01:14:51,720 You can't go on forever. 988 01:14:51,720 --> 01:14:54,460 You have some bound on the time, 989 01:14:54,460 --> 01:14:58,390 and your stopping time always comes before that time. 990 01:14:58,390 --> 01:15:08,110 In that case, the expectation of your value at the stopping 991 01:15:08,110 --> 01:15:11,000 time-- your balance when you've stopped, 992 01:15:11,000 --> 01:15:14,160 if that's what it's modeling-- is always 993 01:15:14,160 --> 01:15:18,220 equal to the expected balance at the beginning. 994 01:15:18,220 --> 01:15:21,890 So no matter what strategy you use, if you're a mortal being, 995 01:15:21,890 --> 01:15:24,610 then you cannot win. 996 01:15:24,610 --> 01:15:27,670 That's the content of this theorem. 997 01:15:27,670 --> 01:15:30,430 I wanted to prove it, but I won't, 998 01:15:30,430 --> 01:15:32,990 because I'm running out of time. 
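The stopping-time examples above can be sketched in code (my own illustration, not from the lecture): a rule that inspects only the path seen so far, like stopping at a barrier, is a stopping time, while the first-peak rule has to peek one step into the future.

```python
def tau_hit(path, hi=100, lo=-50):
    """First time the balance reaches hi or lo.
    The decision to stop at time k inspects only path[0..k] -> a stopping time."""
    for k, x in enumerate(path):
        if x >= hi or x <= lo:
            return k
    return None  # never stopped on this (finite) path

def tau_peak(path):
    """First peak: the first time k whose next value is lower.
    The decision at time k needs path[k+1] -> NOT a stopping time."""
    for k in range(len(path) - 1):
        if path[k] > path[k + 1]:
            return k
    return None

walk = [0, 1, 2, 3, 2, 1, 2]
print(tau_peak(walk))              # 3: known only after seeing the value at index 4
print(tau_hit(walk, hi=3, lo=-3))  # 3: hitting the barrier is decided on the spot
```

Both functions return 3 on this path, but for different reasons: the loop in `tau_hit` never reads past index k, whereas `tau_peak` compares index k against index k+1, which is exactly the look-ahead the definition forbids.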
999 01:15:32,990 --> 01:15:37,470 But let me show you one very interesting corollary of this, 1000 01:15:37,470 --> 01:15:38,810 applied to that example with the two barriers. 1001 01:15:42,370 --> 01:15:45,250 So that tau is a stopping time. 1002 01:15:45,250 --> 01:15:49,610 It's not clear that there is a bounded time by which you always 1003 01:15:49,610 --> 01:15:51,830 stop, 1004 01:15:51,830 --> 01:15:54,160 but the theorem does apply to that case. 1005 01:15:54,160 --> 01:15:57,080 So I'll just forget about that technical issue. 1006 01:15:57,080 --> 01:16:03,080 So, corollary: the theorem applies, not immediately, 1007 01:16:03,080 --> 01:16:09,430 but it does apply, to the stopping time given above. 1008 01:16:09,430 --> 01:16:15,130 And then what it says is that the expectation of X_tau 1009 01:16:15,130 --> 01:16:15,920 is equal to 0. 1010 01:16:18,720 --> 01:16:23,390 But the expectation of X_tau is-- X at tau 1011 01:16:23,390 --> 01:16:26,370 is either 100 or negative 50, because you're always 1012 01:16:26,370 --> 01:16:29,910 going to stop at the first time you either 1013 01:16:29,910 --> 01:16:33,280 hit $100 or minus $50. 1014 01:16:33,280 --> 01:16:37,880 So this is 100 times some probability 1015 01:16:37,880 --> 01:16:41,970 p, plus 1 minus p times minus 50. 1016 01:16:41,970 --> 01:16:44,320 There's some probability p that you stop at 100. 1017 01:16:44,320 --> 01:16:46,991 In all the remaining cases, you're going to stop at minus 50. 1018 01:16:46,991 --> 01:16:47,740 And we know 1019 01:16:47,740 --> 01:16:49,960 it's equal to 0. 1020 01:16:49,960 --> 01:16:55,130 What it gives is-- I hope it gives me the right thing I'm 1021 01:16:55,130 --> 01:16:57,030 thinking of. 1022 01:16:57,030 --> 01:16:59,660 p, 100, yes. 1023 01:16:59,660 --> 01:17:02,770 It's 150p minus 50 equals 0, 1024 01:17:02,770 --> 01:17:04,540 so p is 1/3. 1025 01:17:04,540 --> 01:17:07,274 And if you remember, that was exactly the computation we got before. 
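The corollary's computation can also be checked numerically. Here is a rough Monte Carlo sketch (mine, not from the lecture), with the barriers scaled down to +$10/-$5 -- the same 2-to-1 ratio, so the symmetric walk's hit probability is still 1/3 -- to keep the simulation fast; the average stopped balance E[X_tau] should come out near the starting balance, 0.

```python
import random

def hit_probability(hi=10, lo=-5, trials=30_000, seed=1):
    """Estimate P(simple random walk from 0 hits hi before lo),
    together with the average stopped value X_tau."""
    rng = random.Random(seed)
    hits = 0
    stopped_sum = 0
    for _ in range(trials):
        x = 0
        while lo < x < hi:
            x += 1 if rng.random() < 0.5 else -1
        hits += (x == hi)
        stopped_sum += x
    return hits / trials, stopped_sum / trials

p, mean_stop = hit_probability()
print(round(p, 3))          # close to 1/3, as the optional stopping argument predicts
print(round(mean_stop, 2))  # close to 0, the starting balance
```

The same algebra as in the lecture applies at this scale: 10p - 5(1 - p) = 0 gives 15p = 5, so p = 1/3, matching the simulation.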
1026 01:17:10,970 --> 01:17:13,560 So that's just a neat application. 1027 01:17:13,560 --> 01:17:16,350 But the content of this is really interesting. 1028 01:17:16,350 --> 01:17:21,090 So try to contemplate it-- it's something rather philosophical. 1029 01:17:21,090 --> 01:17:23,810 If something can be modeled using martingales 1030 01:17:23,810 --> 01:17:26,450 perfectly, if it really fits into 1031 01:17:26,450 --> 01:17:28,630 the mathematical formulation of a martingale, 1032 01:17:28,630 --> 01:17:30,454 then you're not supposed to win. 1033 01:17:33,190 --> 01:17:35,510 So that's it for today. 1034 01:17:35,510 --> 01:17:39,470 And next week, Peter will give wonderful lectures. 1035 01:17:39,470 --> 01:17:41,620 See you next week.