1 00:00:00,000 --> 00:00:00,810 2 00:00:00,810 --> 00:00:03,200 In this exercise, we'll be working with the notion of 3 00:00:03,200 --> 00:00:06,100 convergence in probability, as well as some other notion of 4 00:00:06,100 --> 00:00:09,810 converge of random variables that we'll introduce later. 5 00:00:09,810 --> 00:00:14,930 First type of random variable is xn, where xn has 6 00:00:14,930 --> 00:00:24,570 probability 1 minus 1 minus over n to be as 0 and 7 00:00:24,570 --> 00:00:28,010 probability of 1 over n to be a 1. 8 00:00:28,010 --> 00:00:30,776 9 00:00:30,776 --> 00:00:37,230 And graphically, we see that we have a pretty big mess. 10 00:00:37,230 --> 00:00:42,470 1 minus 1 over n at location 0, and a tiny bit somewhere 11 00:00:42,470 --> 00:00:45,040 here, only 1 over n. 12 00:00:45,040 --> 00:00:48,315 So this will be the PMF for x. 13 00:00:48,315 --> 00:00:52,330 14 00:00:52,330 --> 00:00:54,110 On the other hand, we have the sequence of 15 00:00:54,110 --> 00:00:56,890 random variables, yn. 16 00:00:56,890 --> 00:01:00,410 Fairly similar to xn with a slight tweak. 17 00:01:00,410 --> 00:01:04,410 The similar part says it also has a very high probability of 18 00:01:04,410 --> 00:01:08,230 being at 0, mass 1 over 1 minus n. 19 00:01:08,230 --> 00:01:12,670 But on the off chance that yn is not at 0, it has a pretty 20 00:01:12,670 --> 00:01:14,080 big value n. 21 00:01:14,080 --> 00:01:17,810 So it has probability 1 over n of somewhere out there. 22 00:01:17,810 --> 00:01:21,800 So to contrast the two graphs, we see at 0, they have the 23 00:01:21,800 --> 00:01:26,400 same amount of mass, 1 over 1 minus n, but for y, it's all 24 00:01:26,400 --> 00:01:29,820 the way out there that has a small mass 1 over n. 25 00:01:29,820 --> 00:01:33,260 So this will be our Pyn of y. 26 00:01:33,260 --> 00:01:37,770 27 00:01:37,770 --> 00:01:40,180 And for the remainder of the problem, we'll be looking at 28 00:01:40,180 --> 00:01:44,980 the regime where the number n tends to infinity, and study 29 00:01:44,980 --> 00:01:49,990 what will happen to these two sequences of random variables. 30 00:01:49,990 --> 00:01:53,250 In part A, we're to compute the expected value n variance 31 00:01:53,250 --> 00:01:55,230 for both xn and yn. 32 00:01:55,230 --> 00:01:56,650 Let's get started. 33 00:01:56,650 --> 00:02:03,330 The expected value of xn is given by the probability-- 34 00:02:03,330 --> 00:02:08,669 it's at one, which is 1 over n times 1 plus the probability 35 00:02:08,669 --> 00:02:12,670 being at 0, 1 over n times value 0. 36 00:02:12,670 --> 00:02:15,920 And that gives us 1 over n. 37 00:02:15,920 --> 00:02:21,550 To calculate the variance of xn, see that variance is 38 00:02:21,550 --> 00:02:27,280 simply the expected value of xn minus expected value of xn, 39 00:02:27,280 --> 00:02:31,890 which in this case is 1 over n from the previous calculation 40 00:02:31,890 --> 00:02:33,130 we have here. 41 00:02:33,130 --> 00:02:35,490 We take the square of this value and compute the whole 42 00:02:35,490 --> 00:02:44,560 expectation, and this gives us 1 over n, 1 minus 1 over n 43 00:02:44,560 --> 00:02:49,700 plus the remainder probability 1 over 1 minus n of x being at 44 00:02:49,700 --> 00:02:52,956 0, so 0 minus 1 over n squared. 45 00:02:52,956 --> 00:02:55,560 46 00:02:55,560 --> 00:03:01,310 And if we carry out the calculations here, we'll get n 47 00:03:01,310 --> 00:03:03,730 minus 1 over n squared. 48 00:03:03,730 --> 00:03:10,440 49 00:03:10,440 --> 00:03:13,080 Now, let's turn to yn. 50 00:03:13,080 --> 00:03:17,530 The expected value of yn is equal to probability of being 51 00:03:17,530 --> 00:03:25,270 at 0, 0 plus the probability of being at n and 52 00:03:25,270 --> 00:03:27,000 times the value n. 53 00:03:27,000 --> 00:03:30,060 And this gives us 1. 54 00:03:30,060 --> 00:03:33,274 The variance of yn. 55 00:03:33,274 --> 00:03:39,840 We do the same thing as before, we have 1 minus 1 over 56 00:03:39,840 --> 00:03:44,950 n probability of being at 0, multiplied 0 minus 1 squared, 57 00:03:44,950 --> 00:03:49,100 where 1 is the expected value of y. 58 00:03:49,100 --> 00:03:54,150 And with probability 1 over n, out there, equal to n, and 59 00:03:54,150 --> 00:03:58,620 this is multiplied by n minus 1 squared. 60 00:03:58,620 --> 00:04:02,560 And this gives us n minus 1 61 00:04:02,560 --> 00:04:07,070 Already, we can see that while the expected value for x was 1 62 00:04:07,070 --> 00:04:11,490 over n, the expected value for y is sitting right at 1. 63 00:04:11,490 --> 00:04:15,400 It does not decrease as it increases. 64 00:04:15,400 --> 00:04:20,310 And also, while the variance for x is n minus 1 over n 65 00:04:20,310 --> 00:04:23,860 squared, the variance for y is much bigger. 66 00:04:23,860 --> 00:04:27,930 It is actually increasing to infinity as n goes infinity. 67 00:04:27,930 --> 00:04:30,180 So these intuitions will be helpful for the remainder of 68 00:04:30,180 --> 00:04:31,370 the problem. 69 00:04:31,370 --> 00:04:34,700 In part B, we're asked to use Chebyshev's Inequality and see 70 00:04:34,700 --> 00:04:38,660 whether xn or yn converges to any number in probability. 71 00:04:38,660 --> 00:04:41,430 Let's first recall what the inequality is about. 72 00:04:41,430 --> 00:04:46,020 It says that if we have random variable x, in our case, xn, 73 00:04:46,020 --> 00:04:51,500 then the probability of xn minus the expected value of 74 00:04:51,500 --> 00:04:57,040 xn, in our case, 1 over n, that this deviation, the 75 00:04:57,040 --> 00:05:00,380 absolute value of this difference is greater than 76 00:05:00,380 --> 00:05:08,290 epsilon is bounded above by the variance of xn divided by 77 00:05:08,290 --> 00:05:11,290 the value of epsilon squared. 78 00:05:11,290 --> 00:05:16,100 Well, in our case, we know the variance is n minus 1 over n 79 00:05:16,100 --> 00:05:22,300 squared, hence this whole term is this term divided by 80 00:05:22,300 --> 00:05:24,530 epsilon squared. 81 00:05:24,530 --> 00:05:27,700 Now, we know that as n gets really big, the probability of 82 00:05:27,700 --> 00:05:30,100 xn being at 0 is very big. 83 00:05:30,100 --> 00:05:32,270 It's 1 minus 1 over n. 84 00:05:32,270 --> 00:05:35,750 So a safe bet to guess is that if xn work to converge 85 00:05:35,750 --> 00:05:38,590 anywhere on the real line, it might just converge 86 00:05:38,590 --> 00:05:39,940 to the point 0. 87 00:05:39,940 --> 00:05:41,940 And let's see if that is true. 88 00:05:41,940 --> 00:05:45,860 Now, to show that xn converges to 0 in probability, formally 89 00:05:45,860 --> 00:05:50,800 we need to show that for every fixed epsilon greater than 0, 90 00:05:50,800 --> 00:05:58,600 the probability that xn minus 0 greater than epsilon has to 91 00:05:58,600 --> 00:06:02,690 be 0, and the limit has n going to infinity. 92 00:06:02,690 --> 00:06:05,660 And hopefully, the inequalities above will help 93 00:06:05,660 --> 00:06:07,480 us achieve this goal. 94 00:06:07,480 --> 00:06:09,350 And let's see how that is done. 95 00:06:09,350 --> 00:06:12,860 I would like to have an estimate, in fact, an upper 96 00:06:12,860 --> 00:06:17,850 bound of the probability xn absolute value greater or 97 00:06:17,850 --> 00:06:20,040 equal to epsilon. 98 00:06:20,040 --> 00:06:21,850 And now, we're going to do some massaging to this 99 00:06:21,850 --> 00:06:25,730 equation so that it looks like what we know before, which is 100 00:06:25,730 --> 00:06:27,990 right here. 101 00:06:27,990 --> 00:06:33,420 Now, we see that this equation is in fact, less than 102 00:06:33,420 --> 00:06:40,190 probability xn minus 1 over n greater or equal to epsilon 103 00:06:40,190 --> 00:06:42,000 plus 1 over n. 104 00:06:42,000 --> 00:06:45,180 Now, I will justify this inequality in one second. 105 00:06:45,180 --> 00:06:48,490 But suppose that you believe me for this inequality, we can 106 00:06:48,490 --> 00:06:53,020 simply plug-in the value right here, namely substituting 107 00:06:53,020 --> 00:06:57,560 epsilon plus 1 over n, in the place of epsilon right here 108 00:06:57,560 --> 00:07:00,880 and use the Chebyshev Inequality we did earlier to 109 00:07:00,880 --> 00:07:05,060 arrive at the following inequality, which is n minus 1 110 00:07:05,060 --> 00:07:09,840 over n squared times, instead of epsilon, now we have 111 00:07:09,840 --> 00:07:12,955 epsilon plus 1 over n squared. 112 00:07:12,955 --> 00:07:16,710 113 00:07:16,710 --> 00:07:20,280 Now, if we take n to infinity in this 114 00:07:20,280 --> 00:07:22,410 equation, see what happens. 115 00:07:22,410 --> 00:07:27,230 Well, this term here converges to 0 because n squared is much 116 00:07:27,230 --> 00:07:29,490 bigger than n minus 1. 117 00:07:29,490 --> 00:07:32,180 And this term here converges to number 1 118 00:07:32,180 --> 00:07:34,650 over epsilon squared. 119 00:07:34,650 --> 00:07:38,420 So it becomes 0 times 1 over epsilon squared, hence the 120 00:07:38,420 --> 00:07:40,830 whole term converges to 0. 121 00:07:40,830 --> 00:07:46,640 And this proves that indeed, the limit of the term here as 122 00:07:46,640 --> 00:07:52,220 n going to infinity is equal to 0, and that implies xn 123 00:07:52,220 --> 00:07:54,410 converges to 0 in probability. 124 00:07:54,410 --> 00:07:58,810 125 00:07:58,810 --> 00:08:01,050 Now, there is the one thing I did not justify in the 126 00:08:01,050 --> 00:08:06,650 process, which is why is probability of absolute value 127 00:08:06,650 --> 00:08:12,460 xn greater than epsilon less then the term right here? 128 00:08:12,460 --> 00:08:14,480 So let's take a look. 129 00:08:14,480 --> 00:08:18,080 Well, the easiest way to see this is to see what ranges of 130 00:08:18,080 --> 00:08:21,010 xn are we talking about in each case. 131 00:08:21,010 --> 00:08:26,220 Well, in the first case, we're looking at interval around 0 132 00:08:26,220 --> 00:08:30,850 plus minus epsilon and xn can lie anywhere here. 133 00:08:30,850 --> 00:08:33,539 134 00:08:33,539 --> 00:08:36,740 While in the second case, right here, we can see that 135 00:08:36,740 --> 00:08:41,440 the set of range values for xn is precisely this interval 136 00:08:41,440 --> 00:08:45,280 here, which was the same as before, but now, we actually 137 00:08:45,280 --> 00:08:49,810 have less on this side, where the starting point and the 138 00:08:49,810 --> 00:08:54,720 interval on the right is epsilon plus 2 over n. 139 00:08:54,720 --> 00:08:58,890 And therefore, the right hand style captures strictly less 140 00:08:58,890 --> 00:09:02,390 values of xn than the left hand side, hence the 141 00:09:02,390 --> 00:09:04,820 inequality is true. 142 00:09:04,820 --> 00:09:07,490 Now, we wonder if we can use the same trick, Chebyshev 143 00:09:07,490 --> 00:09:10,870 Inequality, to derive the result for yn as well. 144 00:09:10,870 --> 00:09:12,390 Let's take a look. 145 00:09:12,390 --> 00:09:17,880 The probability of yn minus it's mean, 1, greater or equal 146 00:09:17,880 --> 00:09:19,490 to epsilon. 147 00:09:19,490 --> 00:09:25,700 From the Chebyshev Inequality, we have variance of yn divided 148 00:09:25,700 --> 00:09:28,420 by epsilon squared. 149 00:09:28,420 --> 00:09:29,410 Now, there is a problem. 150 00:09:29,410 --> 00:09:31,420 The variance of yn is very big. 151 00:09:31,420 --> 00:09:34,510 In fact, it is equal to n minus 1. 152 00:09:34,510 --> 00:09:37,870 And we calculated in part A, divided by epsilon squared. 153 00:09:37,870 --> 00:09:41,940 And this quantity here diverges as n going to 154 00:09:41,940 --> 00:09:45,360 infinity to infinity itself. 155 00:09:45,360 --> 00:09:48,810 So in this case, the Chebyshev Inequality does not tell us 156 00:09:48,810 --> 00:09:54,220 much information of whether yn converges or not. 157 00:09:54,220 --> 00:09:58,420 Now, going to part C, the question is although we don't 158 00:09:58,420 --> 00:10:02,030 know anything about yn from just the Chebyshev Inequality, 159 00:10:02,030 --> 00:10:04,660 does yn converge to anything at all? 160 00:10:04,660 --> 00:10:07,520 Well, it turns out it does. 161 00:10:07,520 --> 00:10:09,270 In fact, we don't have to go through anything more 162 00:10:09,270 --> 00:10:12,830 complicated than distribution yn itself. 163 00:10:12,830 --> 00:10:17,600 So from the distribution yn, we know that absolute value of 164 00:10:17,600 --> 00:10:22,980 yn greater or equal to epsilon is equal to 1 over n whenever 165 00:10:22,980 --> 00:10:25,135 epsilon is less than n. 166 00:10:25,135 --> 00:10:30,150 And this is true because we know yn has a lot of mass at 0 167 00:10:30,150 --> 00:10:36,320 and a tiny bit a mass at value 1 over n at location n. 168 00:10:36,320 --> 00:10:39,670 So if we draw the cutoff here at epsilon, then the 169 00:10:39,670 --> 00:10:44,420 probability of yn landing to the right of epsilon is simply 170 00:10:44,420 --> 00:10:46,250 equal to 1 over n. 171 00:10:46,250 --> 00:10:49,800 And this tells us, if we take the limit of n going to 172 00:10:49,800 --> 00:10:54,640 infinity and measure the probability that yn-- 173 00:10:54,640 --> 00:10:55,870 just to write it clearly-- 174 00:10:55,870 --> 00:11:00,800 deviates from 0 by more than epsilon, this is equal to the 175 00:11:00,800 --> 00:11:04,600 limit as n going to infinity of 1 over n. 176 00:11:04,600 --> 00:11:07,560 And that is equal to 0. 177 00:11:07,560 --> 00:11:12,270 From this calculation, we know that yn does converge to 0 in 178 00:11:12,270 --> 00:11:16,260 probability as n going to infinity. 179 00:11:16,260 --> 00:11:20,350 180 00:11:20,350 --> 00:11:24,540 For part D, we'd like to know whether the convergence in 181 00:11:24,540 --> 00:11:28,970 probability implies the convergence in expectation. 182 00:11:28,970 --> 00:11:32,170 That is, if we have a sequence of random variables, let's 183 00:11:32,170 --> 00:11:39,980 call it zn, that converges to number c in probability as n 184 00:11:39,980 --> 00:11:45,960 going to infinity, does it also imply that the limit as n 185 00:11:45,960 --> 00:11:52,930 going to infinity of the expected value of zn also 186 00:11:52,930 --> 00:11:55,180 converges to c. 187 00:11:55,180 --> 00:11:56,250 Is that true? 188 00:11:56,250 --> 00:11:59,520 Well, intuitively it is true, because in the limit, zn 189 00:11:59,520 --> 00:12:02,890 almost looks like it concentrates on c solely, 190 00:12:02,890 --> 00:12:06,530 hence we might expect that the expected value is also going 191 00:12:06,530 --> 00:12:08,040 to c itself. 192 00:12:08,040 --> 00:12:10,430 Well, unfortunately, that is not quite true. 193 00:12:10,430 --> 00:12:14,580 In fact, we have the proof right here by looking at yn. 194 00:12:14,580 --> 00:12:19,140 For yn, we know that the expected value of yn is equal 195 00:12:19,140 --> 00:12:21,210 to 1 for all n. 196 00:12:21,210 --> 00:12:23,840 It does not matter how big n gets. 197 00:12:23,840 --> 00:12:28,900 But we also know front part C that yn does converge to 0 in 198 00:12:28,900 --> 00:12:31,200 probability. 199 00:12:31,200 --> 00:12:35,180 And this means somehow, yn can get very close to 0, yet it's 200 00:12:35,180 --> 00:12:39,500 expected value still stays one away. 201 00:12:39,500 --> 00:12:42,220 And the reason again, we go back to the way yn was 202 00:12:42,220 --> 00:12:43,710 constructed. 203 00:12:43,710 --> 00:12:47,810 Now, as n goes to infinity, the probability of yn being at 204 00:12:47,810 --> 00:12:52,960 0, 1 minus 1 over n, approaches 1. 205 00:12:52,960 --> 00:12:58,740 So it's very likely that yn is having a value 0, but whenever 206 00:12:58,740 --> 00:13:02,600 on the off chance that yn takes a value other than 0, 207 00:13:02,600 --> 00:13:03,640 it's a huge number. 208 00:13:03,640 --> 00:13:08,560 It is n, even though it has a small probability of 1 over n. 209 00:13:08,560 --> 00:13:11,110 Adding these two factors together, it tells us the 210 00:13:11,110 --> 00:13:15,320 expected value of yn always stays at 1. 211 00:13:15,320 --> 00:13:18,460 And however, in probability, it's very likely 212 00:13:18,460 --> 00:13:21,100 that y is around 0. 213 00:13:21,100 --> 00:13:24,070 So this example tells us converges in probability is 214 00:13:24,070 --> 00:13:25,020 not that strong. 215 00:13:25,020 --> 00:13:28,010 That tells us something about the random variables but it 216 00:13:28,010 --> 00:13:31,420 does not tell us whether the mean value of the random 217 00:13:31,420 --> 00:13:34,150 variables converge to the same number. 218 00:13:34,150 --> 00:13:35,400