The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PHILIPPE RIGOLLET: We're talking about tests. And to be fair, we spend most of our time talking about the new jargon that we're using. The main goal is to make a binary decision, yes or no. So just so that we're clear and we make sure that we all speak the same language, let me just remind you what the key words are for tests.

So the first thing is that we split Theta into Theta 0 and Theta 1. Both are included in Theta, and they are disjoint. So I have my set of possible parameters, and then Theta 0 is here, Theta 1 is here. And there might be something that I leave out. And so what we're doing is, we have two hypotheses. So here's our hypothesis testing problem: it's H0, theta belongs to Theta 0, versus H1, theta belongs to Theta 1. This guy was called the null, and this guy was called the alternative.
And why we give them special names is because we saw that they have an asymmetric role. The null represents the status quo, and the data is here to bring evidence against this guy. And we can really never conclude that H0 is true, because all we could conclude is that H1 is not true, or may not be true.

So that was the first thing. The second thing was the hypotheses. The third thing is, what is a test? Well, psi -- it's a statistic, and it takes the data and maps it into 0 or 1. And I didn't really mention it, but there's such a thing as randomized tests, which is, well, if I cannot really make a decision, I might as well flip a coin. That coin tends to be biased, but that's really -- I mean, think about it in practice. You probably don't want to make decisions based on flipping a coin. And so what people typically do -- this is happening, typically, at one specific value. So rather than flipping a coin for this very specific value, what people typically do is they say, OK, I'm going to side with H0, because that's the most conservative choice I can make.
So in a way, they think of flipping this coin, but it always falls on heads, say.

So associated to this test was something called the rejection region, R psi, which is just the set of data (x1, ..., xn) such that psi(x1, ..., xn) is equal to 1. So that means we reject H0 when the test is 1. And those are the sets of data points that actually are going to lead me to reject H0.

And then the things that were actually slightly more important, and really peculiar to tests, specific to tests, were the type I and type II errors. So the type I error is when you reject whereas H0 is correct. And the type II error is the opposite, so it's when you fail to reject whereas H1 is correct.

So those are the two types of errors you can make. And we quantified their probabilities. So alpha psi is the probability of type I error. It's a function: alpha psi of theta is the probability under theta that psi rejects, that is, P theta of psi equals 1. And it's defined for theta in Theta 0, so for the different values of theta in Theta 0.
So H0 being correct means there exists a theta in Theta 0 for which that actually is the right distribution. So for different values of theta, I might make different errors. So if you think, for example, about the coin example, I'm testing if the coin is biased towards heads or biased towards tails. So if I'm testing whether p is larger than 1/2 or less than 1/2 -- let's say our H0 is that p is larger than 1/2 -- then when p is equal to 1, it's actually very difficult for me to make a mistake, because I only see heads. Then as p is getting closer to 1/2, I'm going to start making a larger and larger probability of error.

And so the type II error -- so that's the probability of type II error -- is denoted by beta psi. And it's the function that does the opposite and, this time, is defined for theta in Theta 1.

And finally, we defined something called the power, pi of psi. And this time, this is actually a number. And this number is equal to the minimum over theta in Theta 1 -- I mean, that could be an infimum, but think of it as being a minimum -- of P theta of psi equals 1.
So this is not making a mistake: if theta is in Theta 1 and I conclude 1, this is a good thing. I want this number to be large. And I'm looking at the worst case -- what is the smallest value this number can be?

So what I want to show you a little bit is a picture. So now I'm going to take theta, and think of it as being a p. So I'm going to take p as the parameter of the coin experiment. So p can range between 0 and 1, that's for sure. And what I'm going to try to test is whether p is less than 1/2 or larger than 1/2. So this is going to be, let's say, Theta 0, and this guy here is Theta 1. I'm just trying to give you a picture of what those guys are. So I have my y-axis, and now I'm going to start drawing numbers. All these things -- this function, this function, and this number -- are all numbers between 0 and 1.

So now I'm claiming that -- so when I move from left to right, what is my probability of rejecting going to do? So what I'm going to plot is the probability under theta.
The first thing I want to plot is the probability under theta that psi is equal to 1. And let's say psi -- think of psi as being just the indicator that square root of n, times Xn bar minus p, over the square root of Xn bar times 1 minus Xn bar, is larger than some constant c, for a properly chosen c.

So what we choose is c in such a way that, at 1/2, when we're testing for 1/2, what we wanted was this number to be equal to alpha, basically. So we fix this number alpha so that this guy -- so I want alpha psi of theta less than alpha, for an alpha given in advance. So think of it as being equal to, say, 5%. So I'm fixing this number, and I want this to be controlled for all theta in Theta 0.

So if you're going to give me this budget, well, I'm actually going to make it equal where I can. If you're telling me I can make it equal to alpha -- we know that if I increase my type I error, I'm going to decrease my type II error. If I start putting everyone in jail, or if I start letting everyone go free -- that's what we were discussing last time.
So since we have this trade-off, and you're giving me a budget for one guy, I'm just going to max it out. And where am I going to max it out? Exactly at 1/2, at the boundary. So this is going to be 5%.

So what I know is that alpha psi of theta is less than alpha for all theta in Theta 0 -- sorry, that's for Theta 0, that's where alpha psi is defined. So for theta in Theta 0, I know that my function is going to look like this. It's going to be somewhere in this rectangle. Everybody agrees?

So this function for this guy is going to look like this. When I'm at 0, when p is equal to 0, which means I only observe 0's, then I know that Xn bar is going to be 0, and I will certainly not conclude that p is larger than 1/2 -- this test will never conclude that p is larger than 1/2, just because Xn bar is going to be equal to 0. Well, the statistic is actually not well-defined there, so maybe I need to do something -- say, set it equal to 0 if Xn bar is equal to 0. So basically, I get something which is negative, and so it's never going to be larger than what I want. And so here, I'm actually starting at 0.
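To make this concrete, here is a minimal Python sketch of the test just drawn on the board -- my own code, not the lecture's -- testing H0: p less than or equal to 1/2 against H1: p larger than 1/2 at asymptotic level alpha. The function name and the handling of the degenerate cases Xn bar equal to 0 or 1 are my own choices, with the Xn bar equal to 0 case resolved as discussed above.

```python
import math
from statistics import NormalDist

def coin_test(xs, p0=0.5, alpha=0.05):
    """Return 1 (reject H0: p <= p0) or 0 (fail to reject), for 0/1 data xs."""
    n = len(xs)
    xbar = sum(xs) / n
    # Degenerate cases: the standardization below would divide by zero.
    if xbar == 0.0:
        return 0          # all tails: certainly no evidence for p > 1/2
    if xbar == 1.0:
        return 1          # all heads: overwhelming evidence for p > 1/2
    tn = math.sqrt(n) * (xbar - p0) / math.sqrt(xbar * (1.0 - xbar))
    c = NormalDist().inv_cdf(1 - alpha)   # ~1.645 for alpha = 5%
    return 1 if tn > c else 0

print(coin_test([1] * 80 + [0] * 20))     # 80 heads in 100 tosses: rejects, prints 1
print(coin_test([1] * 50 + [0] * 50))     # 50/50: fails to reject, prints 0
```

Siding with 0 in the degenerate all-tails case is exactly the conservative "side with H0" convention from the start of the lecture.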
So now, this is this function here that increases -- I mean, it should increase smoothly. This function here is alpha psi of theta -- or alpha psi of p, let's say, because we're talking about p. Then it reaches alpha here.

Now, when I go on the other side, I'm actually looking at beta. When I'm on Theta 1, the function that matters is the probability of type II error, which is beta psi. And this beta psi is actually going to decrease as p moves away from 1/2.

So what is beta psi? Well, beta psi is -- sorry, that's the probability of psi being equal to 0. So what I'm going to do is, I'm going to look at the probability of rejecting. So let me draw this function all the way. It's going to look like this. Now here, if I look at this function here or here, this is the probability under theta that psi is equal to 1. And we just said that, in this region, this function is called alpha psi. In that region, it's not called alpha psi. It's not called anything -- it's just the probability of rejection. So it's not an error at all; it's actually what you should be doing.
What we're looking at in this region is 1 minus this guy. We're looking at the probability of not rejecting. So I basically need to look at 1 minus this thing, which here is going to be 95%. So I'm going to mark 95%. And this is my probability. And I'm just basically drawing the mirror image of this guy. So this here is the probability under theta that psi is equal to 0, which is 1 minus P theta of psi equals 1. So it's just 1 minus the white curve. And it's actually, by definition, equal to beta psi of theta.

Now, where do I read pi psi? What is pi psi on this picture? Is pi psi a number or a function?

AUDIENCE: A number.

PHILIPPE RIGOLLET: It's a number, right? It's the minimum of a function. What is this function? It's the probability under theta that psi is equal to 1. I drew this entire function between Theta 0 and Theta 1. This is this entire white curve. This is this probability. Now I'm saying, look at the smallest value this probability can take on the set Theta 1. What is this?
This guy. This is where my pi -- this thing here is pi psi, and so it's equal to 5%.

So that's for this particular test, because this test has a continuous curve for this psi. And so if I want to make sure that I'm at 5% when I come to the right end of Theta 0, then where it touches Theta 1, I'd better have 5% on the other side if the function is continuous. So basically, if this function is increasing, which will be the case for most tests, and continuous, then what's going to happen is that the level of the test, which is alpha, is actually going to be equal to the power of the test.

Now, there's something I didn't mention, and I'm just mentioning it in passing. Here, I defined the power itself. This function, this entire white curve here, is actually called the power function -- this thing. That's the entire white curve. And what you could have is a test whose entire curve is dominated by that of another test. So here, if I look at this test -- and let's assume I can build another test that has this curve.
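The white curve itself, the map from p to the probability of rejecting, can be sketched numerically. Here is a Monte Carlo sketch of my own (not from the lecture) for the coin test with n = 100 observations, using the plug-in statistic with p = 1/2; the sample sizes and trial count are arbitrary choices.

```python
import random
from statistics import NormalDist

random.seed(0)

def rejection_prob(p, n=100, alpha=0.05, trials=5000):
    """Monte Carlo estimate of the power function p -> P_p(psi = 1)."""
    c = NormalDist().inv_cdf(1 - alpha)   # asymptotic 5% threshold, ~1.645
    rejects = 0
    for _ in range(trials):
        xbar = sum(random.random() < p for _ in range(n)) / n
        if 0.0 < xbar < 1.0:
            tn = n**0.5 * (xbar - 0.5) / (xbar * (1.0 - xbar))**0.5
            rejects += tn > c
        else:
            rejects += xbar == 1.0        # degenerate case: all heads rejects
    return rejects / trials

# The curve stays below alpha on Theta 0, hits roughly alpha at the
# boundary p = 1/2, and climbs toward 1 deep inside Theta 1:
for p in (0.3, 0.5, 0.6, 0.8):
    print(p, round(rejection_prob(p), 3))
```

Deep inside Theta 0 the estimate is essentially 0, at p = 1/2 it hovers around the 5% budget, and it rises quickly on Theta 1 -- the shape of the white curve on the board.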
Let's say it's the same here, but then here, it looks like this. What is the power of this test?

AUDIENCE: It's the same.

PHILIPPE RIGOLLET: It's the same. It's 5%, because this point touches here at exactly the same point. However, for any value other than the worst possible one, this guy is doing better than this guy. Can you see that? Having a curve that's higher on the right-hand side is a good thing, because it means that you tend to reject more when you're actually in H1. So this guy is definitely better than this guy. And so what we say, in this case, is that the test with the dashed line is uniformly more powerful than the other test. But we're not going to go into those details, because, basically, all the tests that we will describe are already the most powerful ones. In particular, for this guy, there's no such thing -- all the other guys you can come up with are going to actually be below.

So we saw a couple of tests, then we saw how to pick this threshold, and we defined those two things.

AUDIENCE: Question.
PHILIPPE RIGOLLET: Yes?

AUDIENCE: But in that case, the dashed line -- if it were also higher in the region of Theta 0, would you still consider it better?

PHILIPPE RIGOLLET: Yeah.

AUDIENCE: OK.

PHILIPPE RIGOLLET: Because you're given this budget of 5%. So in this paradigm where you're given the -- actually, if the dashed line were this dashed line, I would still be happy. I mean, I don't care what this thing does here, as long as it's below 5%. But here, I'm going to try to discover. Think about, again, the drug discovery example. You're trying to find -- let's say you're a scientist and you're trying to prove that your drug works. What do you want to see? Well, the FDA puts on you this constraint that your probability of type I error should never exceed 5%. You're going to work under this assumption. But what you're going to do is, you're going to try to find a test that will make you find something as often as possible. And so you're going to max out this constraint of 5%.
And then you're going to try to make this curve as high as possible. That means -- this number here, for any point here, is the probability that you publish your paper. That's the probability that you can release your drug to market. That's the probability that it works. And so you want this curve to be as high as possible. You want to make sure that if there's evidence in the data that H1 is the truth, you squeeze out as much of this evidence as possible. And the test that has the highest possible curve is the most powerful one.

Now, you have to also understand that having two curves where one is on top of the other completely, everywhere, is a rare phenomenon. It's not always the case that there is a test that's uniformly more powerful than any other test. It might be that you have some trade-off -- it might be better here, but then you're losing power there. I mean, things like this. Well, actually, maybe it should not go down. But let's say it goes like this, and then, maybe, this guy goes like this.
Then you have to, basically, make an educated guess whether you think that the theta you're going to find is here or is here, and then you pick your test.

Any other questions? Yes?

AUDIENCE: Can you explain the green curve again? That's just the type II error?

PHILIPPE RIGOLLET: So the green curve is -- exactly. So that's beta psi of theta. So it's really the type II error. And it's defined only here, on Theta 1. So here, it's not a definition; I'm really just mapping it to this point. So it's defined only here, and it's the probability of type II error.

So here, it's pretty large. I'm making it basically as large as it could be, because I'm at the boundary. And that means, at the boundary, since the status quo is H0, I'm always going to go for H0 if I don't have any evidence, which means that what's going to pay for it is the type II error -- that's basically what pays for this.

Any other questions?

So let's move on. So, did we do this? No, I think we stopped here, right? I didn't cover that part.
So as I said, in this paradigm, we're going to actually fix this guy to be something. And this thing is actually called the level of the test. I'm sorry, this is, again, more words. Actually, the good news is that we split this into two lectures. So we have: what is a test? What is a hypothesis? What is the null? What is the alternative? What is the type I error? What is the type II error? And now, I'm telling you there's another thing.

So we defined the power, which was some sort of a lower bound on the -- or it's 1 minus the upper bound on the type II error, basically. And so the power is the smallest probability of rejecting when you're in the alternative, when you're in Theta 1. So that's my power: I looked here, and I looked at the smallest value. And I can look at this side and say, well, what is the largest probability that I make a type I error? And this largest probability is the level of the test. So this is alpha, equal, by definition, to the maximum for theta in Theta 0 of alpha psi of theta.
So here, I just put the level itself. As you can see, it essentially says that if I'm of level 5%, I'm also of level 10%, and I'm also of level 15%. So here, it's really an upper bound. Whatever you guys want to take, this is what it is. But as we said, if this number is 4.5%, you're losing in your type II error. So if you're allowed to have -- if this maximum here is 4.5% and the FDA told you you can go to 5%, you're losing in your type II error. So you actually want to make sure that this is exactly the 5% that's given to you. So the way it works is that you give me the alpha, then I'm going to go back and pick c, which depends on alpha here, so that this thing is actually equal to 5%.

And of course, in many instances, we do not know how to compute the probability of type I error. This level is the maximum value of the probability of type I error, and we don't know how to compute it. I mean, it might be a very complicated random variable. Maybe it's a weird binomial -- we could compute it, but it would be painful. But what we do know how to compute is its asymptotic value.
413 00:22:21,960 --> 00:22:24,330 Just because of the central limit theorem, convergence 414 00:22:24,330 --> 00:22:28,020 and distribution tells me that the probability of type I error 415 00:22:28,020 --> 00:22:30,650 is basically going towards the probability 416 00:22:30,650 --> 00:22:33,110 that some Gaussian is in some region. 417 00:22:33,110 --> 00:22:36,240 And so we're going to compute, not the level itself, 418 00:22:36,240 --> 00:22:37,500 but the asymptotic level. 419 00:22:43,700 --> 00:22:48,320 And that's basically the limit as n 420 00:22:48,320 --> 00:22:56,830 goes to infinity of alpha psi of theta. 421 00:22:56,830 --> 00:22:58,540 And then I'm going to make the max here. 422 00:23:06,300 --> 00:23:08,240 So how am I going to compute this? 423 00:23:08,240 --> 00:23:13,440 Well, if I take a test that has rejection region of the form 424 00:23:13,440 --> 00:23:14,910 tn-- 425 00:23:14,910 --> 00:23:17,970 because it depends on the data, that's tn of x1 xn-- 426 00:23:17,970 --> 00:23:23,440 my observation's larger than some number c. 427 00:23:23,440 --> 00:23:26,430 Of course, I can almost always write 428 00:23:26,430 --> 00:23:28,402 tests like that, except that sometimes, 429 00:23:28,402 --> 00:23:30,860 it's going to be an absolute value, which essentially means 430 00:23:30,860 --> 00:23:32,552 I'm going away from some value. 431 00:23:32,552 --> 00:23:34,260 Maybe, actually, I'm less than something, 432 00:23:34,260 --> 00:23:37,090 but I can always put a negative sign in front of everything. 433 00:23:37,090 --> 00:23:39,780 So this is not without much of generality. 434 00:23:39,780 --> 00:23:47,890 So this includes something that looks like-- 435 00:23:51,520 --> 00:23:56,290 something is larger than the constants, so that means-- 436 00:23:56,290 --> 00:24:02,330 which is equivalent to-- well, let me write that as tq, 437 00:24:02,330 --> 00:24:05,510 because then that means that-- 438 00:24:05,510 --> 00:24:07,940 so that's tn. 
439 00:24:07,940 --> 00:24:10,040 But this actually encompasses the fact 440 00:24:10,040 --> 00:24:21,370 that qn is larger than c or qn is less than minus c. 441 00:24:21,370 --> 00:24:22,480 So that includes this guy. 442 00:24:22,480 --> 00:24:26,320 That also includes qn less than c, 443 00:24:26,320 --> 00:24:32,840 because this is equivalent to minus qn is larger than minus c. 444 00:24:32,840 --> 00:24:33,830 And minus qn is-- 445 00:24:33,830 --> 00:24:35,240 and so that's going to be my tn. 446 00:24:37,810 --> 00:24:42,430 So I can actually encode several types of things-- 447 00:24:42,430 --> 00:24:44,230 rejection regions. 448 00:24:44,230 --> 00:24:47,020 So here, in this case, I have a rejection region 449 00:24:47,020 --> 00:24:50,110 that looks like this, or a rejection region 450 00:24:50,110 --> 00:24:53,380 that looks like this, or a rejection 451 00:24:53,380 --> 00:24:54,550 region that looks like this. 452 00:24:57,209 --> 00:24:58,750 And here, I don't really represent it 453 00:24:58,750 --> 00:25:02,420 for the whole data, but maybe for the average, for example, 454 00:25:02,420 --> 00:25:04,018 or the normalized average. 455 00:25:17,950 --> 00:25:23,950 So if I write this, then-- 456 00:25:23,950 --> 00:25:25,470 yeah. 457 00:25:25,470 --> 00:25:32,970 And in this case, this tn that shows up 458 00:25:32,970 --> 00:25:35,168 is called the test statistic. 459 00:25:41,460 --> 00:25:43,730 I mean, this is not set in stone. 460 00:25:43,730 --> 00:25:46,930 Here, for example, q could be the test statistic. 461 00:25:46,930 --> 00:25:48,640 It doesn't have to be minus q itself 462 00:25:48,640 --> 00:25:50,750 that's the test statistic. 463 00:25:50,750 --> 00:25:52,000 So what is the test statistic? 464 00:25:52,000 --> 00:25:55,170 Well, it's what you're going to build from your data 465 00:25:55,170 --> 00:25:57,790 and then compare to some fixed value.
466 00:25:57,790 --> 00:26:01,167 So in the example we had here, what is our test statistic? 467 00:26:01,167 --> 00:26:02,000 Well, it's this guy. 468 00:26:05,620 --> 00:26:09,200 This was our test statistic. 469 00:26:09,200 --> 00:26:12,830 And is this thing a statistic? 470 00:26:12,830 --> 00:26:14,510 What are the criteria for a statistic? 471 00:26:14,510 --> 00:26:15,810 What is a statistic? 472 00:26:21,046 --> 00:26:23,633 I know you know the answer. 473 00:26:23,633 --> 00:26:25,050 AUDIENCE: Measurable function. 474 00:26:25,050 --> 00:26:26,080 PHILIPPE RIGOLLET: Yeah, it's a measurable function 475 00:26:26,080 --> 00:26:29,064 of the data that does not depend on the parameter. 476 00:26:29,064 --> 00:26:32,995 Is this guy a statistic? 477 00:26:32,995 --> 00:26:33,981 AUDIENCE: It's not. 478 00:26:35,949 --> 00:26:37,490 PHILIPPE RIGOLLET: Let's think again. 479 00:26:40,490 --> 00:26:45,360 When I implemented the test, what did I do? 480 00:26:45,360 --> 00:26:47,490 I was able to compute my test. 481 00:26:47,490 --> 00:26:49,800 My test did not depend on some unknown parameter. 482 00:26:49,800 --> 00:26:52,640 How did we do it? 483 00:26:52,640 --> 00:26:57,187 We just plugged in 0.5 here, remember? 484 00:26:57,187 --> 00:26:59,020 That was the value for which we computed it, 485 00:26:59,020 --> 00:27:02,150 because under h0, that was the value we're seeing. 486 00:27:02,150 --> 00:27:05,910 And if theta 0 is actually an entire set, 487 00:27:05,910 --> 00:27:09,935 I'm just going to take the value that's the closest to h1. 488 00:27:09,935 --> 00:27:11,060 We'll see that in a second. 489 00:27:11,060 --> 00:27:13,400 I mean, I did not guarantee that to you. 490 00:27:13,400 --> 00:27:18,950 But just taking the worst type I error and bounding it by alpha 491 00:27:18,950 --> 00:27:22,310 is equivalent to taking p and taking the value of p that's 492 00:27:22,310 --> 00:27:26,836 the closest to theta 1, which is completely intuitive.
493 00:27:26,836 --> 00:27:29,460 The worst type I error is going to be attained for the p that's 494 00:27:29,460 --> 00:27:32,260 the closest to the alternative. 495 00:27:32,260 --> 00:27:36,510 So even if the null is actually just an entire set, 496 00:27:36,510 --> 00:27:38,940 it's as if it was just the point that's 497 00:27:38,940 --> 00:27:41,840 the closest to the alternative. 498 00:27:41,840 --> 00:27:44,520 So now we can compute this, because there's 499 00:27:44,520 --> 00:27:46,440 no unknown parameter that shows up. 500 00:27:46,440 --> 00:27:48,210 We replace p by 0.5. 501 00:27:48,210 --> 00:27:50,406 And so that was our test statistic. 502 00:27:53,952 --> 00:27:55,410 So when you're building a test, you 503 00:27:55,410 --> 00:27:58,000 want to first build a test statistic, 504 00:27:58,000 --> 00:28:01,230 and then see what threshold you should be getting. 505 00:28:01,230 --> 00:28:08,640 So now, let's go back to our example where we want to have-- 506 00:28:08,640 --> 00:28:16,340 we have x1 xn, they're IID Bernoulli p. 507 00:28:16,340 --> 00:28:25,050 And I want to test if p is 1/2 versus p not equal to 1/2, 508 00:28:25,050 --> 00:28:27,630 which, as I said, is what you want to do if you 509 00:28:27,630 --> 00:28:33,560 want to test if a coin is fair. 510 00:28:33,560 --> 00:28:36,560 And so here, I'm going to build a test statistic. 511 00:28:36,560 --> 00:28:39,180 And we concluded last time that-- 512 00:28:39,180 --> 00:28:41,790 what do we want for this statistic? 513 00:28:41,790 --> 00:28:44,700 We want it to have a distribution which, 514 00:28:44,700 --> 00:28:49,820 under the null, does not depend on the parameters, 515 00:28:49,820 --> 00:28:54,650 a distribution that I can actually compute quantiles of.
516 00:28:54,650 --> 00:28:56,150 So what we did is, we said, well, 517 00:28:56,150 --> 00:28:59,210 if I look at-- the central limit theorem tells me that square 518 00:28:59,210 --> 00:29:03,500 root of n xn bar minus p divided by-- 519 00:29:03,500 --> 00:29:06,871 so if I do central limit theorem plus Slutsky, for example, 520 00:29:06,871 --> 00:29:08,120 I'm going to have square root. 521 00:29:12,006 --> 00:29:13,880 And we've had this discussion whether we want 522 00:29:13,880 --> 00:29:15,004 to use Slutsky or not here. 523 00:29:15,004 --> 00:29:17,900 But let's assume we're taking Slutsky wherever we can. 524 00:29:17,900 --> 00:29:20,060 So this thing tells me that, by the central limit 525 00:29:20,060 --> 00:29:23,510 theorem, as n goes to infinity, this thing converges 526 00:29:23,510 --> 00:29:25,190 in distribution to some n01. 527 00:29:28,260 --> 00:29:31,550 Now, as we said, this guy is not something we know. 528 00:29:31,550 --> 00:29:34,020 But under the null, we actually know it. 529 00:29:34,020 --> 00:29:37,100 And we can actually replace it by 1/2. 530 00:29:37,100 --> 00:29:41,300 So this thing holds under h0. 531 00:29:41,300 --> 00:29:44,300 When I write under h0, it means when this is the truth. 532 00:29:47,120 --> 00:29:49,270 So now I have something that converges 533 00:29:49,270 --> 00:29:52,570 to something that has no dependence on anything I 534 00:29:52,570 --> 00:29:53,170 don't know. 535 00:29:53,170 --> 00:29:56,890 And in particular, if you have any statistics textbook, which 536 00:29:56,890 --> 00:29:59,260 you don't because I didn't require one-- 537 00:29:59,260 --> 00:30:04,444 and you should be thankful, because these things cost $350. 538 00:30:04,444 --> 00:30:05,860 Actually, if you look at the back, 539 00:30:05,860 --> 00:30:12,250 you actually have a table for a standard Gaussian. 540 00:30:12,250 --> 00:30:13,780 I could have anything else here. 
541 00:30:13,780 --> 00:30:15,760 I could have an exponential distribution. 542 00:30:15,760 --> 00:30:17,560 I could have a-- 543 00:30:17,560 --> 00:30:20,320 I don't know-- well, we'll see the chi squared 544 00:30:20,320 --> 00:30:22,030 distribution in a minute. 545 00:30:22,030 --> 00:30:24,040 Any distribution from which you can actually 546 00:30:24,040 --> 00:30:25,600 see a table that somebody actually 547 00:30:25,600 --> 00:30:27,516 computed this thing for which you can actually 548 00:30:27,516 --> 00:30:30,430 draw the pdf and start computing whatever probability you want 549 00:30:30,430 --> 00:30:32,410 on them, then this is what you want 550 00:30:32,410 --> 00:30:35,110 to see at the right-hand side. 551 00:30:35,110 --> 00:30:36,500 This is any distribution. 552 00:30:36,500 --> 00:30:38,380 It's called pivotal. 553 00:30:38,380 --> 00:30:39,880 I think we've mentioned that before. 554 00:30:39,880 --> 00:30:41,713 Pivotal means it does not depend on anything 555 00:30:41,713 --> 00:30:43,610 that you don't know. 556 00:30:43,610 --> 00:30:45,616 And maybe it's easy to compute those things. 557 00:30:45,616 --> 00:30:47,990 Probably, typically, you need a computer to simulate them 558 00:30:47,990 --> 00:30:50,985 for you because computing probabilities for Gaussians 559 00:30:50,985 --> 00:30:51,860 is not an easy thing. 560 00:30:51,860 --> 00:30:53,985 We don't know how to solve those integrals exactly, 561 00:30:53,985 --> 00:30:56,520 we have to do it numerically. 562 00:30:56,520 --> 00:31:08,420 So now I want to do this test. 563 00:31:08,420 --> 00:31:12,950 My test statistic will be declared to be what? 564 00:31:12,950 --> 00:31:17,740 Well, I'm going to reject if what 565 00:31:17,740 --> 00:31:18,970 is larger than some number? 566 00:31:24,140 --> 00:31:27,500 The absolute value of this guy. 
567 00:31:27,500 --> 00:31:29,550 So my test statistic is going to be 568 00:31:29,550 --> 00:31:35,790 square root of n times xn bar minus 0.5 divided by square root of xn 569 00:31:35,790 --> 00:31:38,240 bar 1 minus xn bar. 570 00:31:41,160 --> 00:31:43,470 That's my test statistic, absolute value of this guy, 571 00:31:43,470 --> 00:31:45,900 because I want to reject either when this guy is too large 572 00:31:45,900 --> 00:31:47,150 or when this guy is too small. 573 00:31:50,260 --> 00:31:51,760 I don't know ahead of time whether I'm going 574 00:31:51,760 --> 00:31:55,470 to see p larger than 1/2 or less than 1/2. 575 00:31:55,470 --> 00:31:59,190 So now I need to compute c such that the probability 576 00:31:59,190 --> 00:32:05,210 that tn is larger than c. 577 00:32:05,210 --> 00:32:11,290 So that's the probability under p, which is unknown. 578 00:32:11,290 --> 00:32:17,230 I want this probability to be less than some level alpha, 579 00:32:17,230 --> 00:32:18,410 asymptotically. 580 00:32:18,410 --> 00:32:24,740 So I want the limit of this guy to be less than alpha, 581 00:32:24,740 --> 00:32:26,810 and that's the level of my test. 582 00:32:26,810 --> 00:32:32,010 So that's the given level. 583 00:32:32,010 --> 00:32:33,720 So I want this thing to happen. 584 00:32:33,720 --> 00:32:35,280 Now, what I know is that this limit-- 585 00:32:38,090 --> 00:32:40,397 actually, I should say given asymptotic level. 586 00:32:48,520 --> 00:32:50,130 So what is this thing? 587 00:32:54,300 --> 00:33:00,600 Well, OK, that's the probability that something 588 00:33:00,600 --> 00:33:03,000 that looks like under p. 589 00:33:03,000 --> 00:33:05,520 So under p, this guy-- 590 00:33:05,520 --> 00:33:08,730 so what I know is that tn is square root of n 591 00:33:08,730 --> 00:33:15,490 times xn bar minus 0.5 divided by square root of xn bar 592 00:33:15,490 --> 00:33:18,700 1 minus xn bar exceeds c.
593 00:33:23,770 --> 00:33:26,882 Is this true that as n to infinity, 594 00:33:26,882 --> 00:33:28,840 this probability is the same as the probability 595 00:33:28,840 --> 00:33:30,460 that the absolute value of a Gaussian 596 00:33:30,460 --> 00:33:33,610 exceeds c of a standard Gaussian? 597 00:33:33,610 --> 00:33:34,351 Is this true? 598 00:33:37,281 --> 00:33:39,530 AUDIENCE: The absolute value of the standard Gaussian. 599 00:33:39,530 --> 00:33:41,113 PHILIPPE RIGOLLET: Yeah, the absolute. 600 00:33:41,113 --> 00:33:43,830 So you're saying that this, as n becomes large enough, this 601 00:33:43,830 --> 00:33:48,500 should be the probability that some absolute value of n01 602 00:33:48,500 --> 00:33:49,898 exceeds c, right? 603 00:33:49,898 --> 00:33:51,990 AUDIENCE: Yes. 604 00:33:51,990 --> 00:33:54,780 PHILIPPE RIGOLLET: So I claim that this is not correct. 605 00:33:54,780 --> 00:33:56,077 Somebody tell me why. 606 00:33:56,077 --> 00:33:57,360 AUDIENCE: Even in the limit it's not correct? 607 00:33:57,360 --> 00:33:59,651 PHILIPPE RIGOLLET: Even in the limit, it's not correct. 608 00:34:03,164 --> 00:34:04,334 AUDIENCE: OK. 609 00:34:04,334 --> 00:34:05,917 PHILIPPE RIGOLLET: So what do you see? 610 00:34:05,917 --> 00:34:07,625 AUDIENCE: It's because, at the beginning, 611 00:34:07,625 --> 00:34:11,368 we picked the worst possible true parameter, 0.5. 612 00:34:11,368 --> 00:34:13,915 So we don't actually know that this 0.5 is the mean. 613 00:34:13,915 --> 00:34:15,040 PHILIPPE RIGOLLET: Exactly. 614 00:34:15,040 --> 00:34:19,500 So we pick this 0.5 here, but this is for any p. 615 00:34:19,500 --> 00:34:21,360 But what is the only p I can get? 616 00:34:21,360 --> 00:34:26,949 So what I want is that this is true for all p in theta 0. 617 00:34:26,949 --> 00:34:31,750 But the only p that's in theta 0 is actually p is equal to 0.5. 
618 00:34:31,750 --> 00:34:33,780 So yes, what you said was true, but it 619 00:34:33,780 --> 00:34:38,130 required to specify p to be equal to 0.5. 620 00:34:38,130 --> 00:34:40,409 So this, in general, is not true. 621 00:34:40,409 --> 00:34:47,909 But it happens to be true if p belongs to theta 0, which 622 00:34:47,909 --> 00:34:53,489 is strictly equivalent to p is equal to 0.5, 623 00:34:53,489 --> 00:34:59,320 because theta 0 is really just this one point, 0.5. 624 00:34:59,320 --> 00:35:01,820 So now, this becomes true. 625 00:35:01,820 --> 00:35:03,760 And so what I need to do is to find c such 626 00:35:03,760 --> 00:35:05,140 that this guy is equal to what? 627 00:35:11,650 --> 00:35:14,020 I mean, let's just follow. 628 00:35:14,020 --> 00:35:16,950 So I want this to be less than alpha. 629 00:35:16,950 --> 00:35:19,850 But then we said that this was equal to this, 630 00:35:19,850 --> 00:35:21,970 which is equal to this. 631 00:35:21,970 --> 00:35:24,960 So all I want is that this guy is less than alpha. 632 00:35:24,960 --> 00:35:28,230 But we said we might as well just make it equal to alpha 633 00:35:28,230 --> 00:35:30,540 if you allow me to make it as big as I want, 634 00:35:30,540 --> 00:35:32,050 as long as it's less than alpha. 635 00:35:32,050 --> 00:35:33,790 AUDIENCE: So this is a true statement. 636 00:35:33,790 --> 00:35:35,748 PHILIPPE RIGOLLET: So this is a true statement. 637 00:35:35,748 --> 00:35:38,354 But it's under this condition. 638 00:35:38,354 --> 00:35:39,104 AUDIENCE: Exactly. 639 00:35:43,010 --> 00:35:48,680 PHILIPPE RIGOLLET: So I'm going to set it equal to alpha, 640 00:35:48,680 --> 00:35:52,310 and then I'm going to try to solve for c. 
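The claim just made — set the limit equal to alpha, solve for c, and the test has asymptotic level alpha — can be sanity-checked by simulation. A sketch under hypothetical settings (n = 500 tosses per experiment, 5000 repeated experiments, a fair coin under h0, and taking for granted the standard fact that 1.96 is the two-sided Gaussian threshold at 5%):

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so this sketch is reproducible

def rejects(n, c=1.96):
    """Simulate n fair-coin tosses under h0 and apply the test |Tn| > c."""
    xbar = sum(random.random() < 0.5 for _ in range(n)) / n
    tn = sqrt(n) * (xbar - 0.5) / sqrt(xbar * (1 - xbar))
    return abs(tn) > c

trials = 5000  # hypothetical number of repeated experiments
rate = sum(rejects(500) for _ in range(trials)) / trials
print(rate)  # empirical type I error: hovers around the asymptotic level 0.05
```

The empirical rejection rate fluctuates around 5%, as the central limit theorem promises for large n.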
641 00:36:10,390 --> 00:36:13,540 So what I'm looking for is a c such that 642 00:36:13,540 --> 00:36:17,390 if I draw a standard Gaussian-- 643 00:36:17,390 --> 00:36:20,530 so that's the pdf of some n01-- 644 00:36:20,530 --> 00:36:23,200 I want the probability of the absolute value of my Gaussian 645 00:36:23,200 --> 00:36:25,630 exceeding this guy-- 646 00:36:25,630 --> 00:36:29,350 so that means being either here or here. 647 00:36:29,350 --> 00:36:31,220 So that's minus c and c. 648 00:36:31,220 --> 00:36:36,200 I want the sum of those two things to be equal to alpha. 649 00:36:36,200 --> 00:36:53,570 So I want the sum of these areas to equal alpha. 650 00:36:53,570 --> 00:36:56,240 So by symmetry, each of them should 651 00:36:56,240 --> 00:36:58,190 be equal to alpha over 2. 652 00:37:02,710 --> 00:37:08,310 And so what I'm looking for is c such that the probability 653 00:37:08,310 --> 00:37:15,410 that my n01 exceeds c, which is just this area to the right, 654 00:37:15,410 --> 00:37:20,830 now, equals alpha over 2, which is equivalent to taking c 655 00:37:20,830 --> 00:37:26,020 equal to q alpha over 2, and that's q alpha over 2 656 00:37:26,020 --> 00:37:28,240 by definition of q alpha over 2. 657 00:37:28,240 --> 00:37:30,370 That's just what q alpha over 2 is. 658 00:37:30,370 --> 00:37:34,420 And that's what the tables at the back of the book give you. 659 00:37:34,420 --> 00:37:42,400 Who has already seen a table for Gaussian probabilities? 660 00:37:42,400 --> 00:37:44,200 What it does, it's just a table. 661 00:37:44,200 --> 00:37:45,696 I mean, it's pretty ancient. 662 00:37:45,696 --> 00:37:47,320 I mean, of course, you can actually ask 663 00:37:47,320 --> 00:37:49,240 Google to do it for you now. 664 00:37:49,240 --> 00:37:52,180 I mean, it's basically standard issue. 665 00:37:52,180 --> 00:37:56,110 But back in the day, they actually had to look at tables. 
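Nowadays the table lookup is one library call. A minimal sketch using Python's standard library to recover the quantile q alpha over 2 just defined:

```python
from statistics import NormalDist

def q(u):
    """Gaussian quantile: the value a standard normal exceeds with probability u."""
    return NormalDist().inv_cdf(1 - u)

alpha = 0.05
print(round(q(alpha / 2), 2))  # 1.96, the two-sided threshold at level 5%
```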
666 00:37:56,110 --> 00:37:59,140 And since the values alphas were pretty standard, 667 00:37:59,140 --> 00:38:01,510 the values alpha that people were requesting 668 00:38:01,510 --> 00:38:04,810 were typically 1%, 5%, 10%, all you 669 00:38:04,810 --> 00:38:07,000 could do is to compute these different values 670 00:38:07,000 --> 00:38:08,560 for different values of alpha. 671 00:38:08,560 --> 00:38:10,390 That was it. 672 00:38:10,390 --> 00:38:13,450 So there's really not much to give you. 673 00:38:13,450 --> 00:38:15,750 So for the Gaussian, I can tell you 674 00:38:15,750 --> 00:38:20,210 that alpha is equal to-- if alpha is equal to 5%, 675 00:38:20,210 --> 00:38:27,100 then q alpha over 2, q 2.5% is equal to 1.96, for example. 676 00:38:27,100 --> 00:38:28,840 So those are just fixed numbers that 677 00:38:28,840 --> 00:38:31,030 are functions of the Gaussian. 678 00:38:31,030 --> 00:38:32,410 So everybody agrees? 679 00:38:32,410 --> 00:38:37,443 We've done that before for our confidence intervals. 680 00:38:40,350 --> 00:38:42,040 And so now we know that if I actually 681 00:38:42,040 --> 00:38:48,460 plug in this guy to be q alpha over 2, then 682 00:38:48,460 --> 00:38:51,430 this limit is actually equal to alpha. 683 00:38:51,430 --> 00:38:53,247 And so now I've actually constrained this. 684 00:39:01,040 --> 00:39:07,800 So q alpha over 2 here for alpha equals 5%, as I said, is 1.96. 685 00:39:07,800 --> 00:39:13,790 So in the example 1, the number that we found was 3.54, 686 00:39:13,790 --> 00:39:18,800 I think, or something like that, 3.55 for t. 687 00:39:18,800 --> 00:39:29,290 So if we scroll back very quickly, 3.45-- 688 00:39:29,290 --> 00:39:30,980 that was example 1. 689 00:39:30,980 --> 00:39:33,770 Example two-- negative 0.77. 
690 00:39:33,770 --> 00:39:40,970 So if I look at tn in example 1, tn 691 00:39:40,970 --> 00:39:46,040 was just the absolute value of 3.45, which-- 692 00:39:46,040 --> 00:39:50,390 don't pull out your calculators-- is equal to 3.45. 693 00:39:50,390 --> 00:39:54,500 Example 2, absolute value of negative 0.77 694 00:39:54,500 --> 00:39:57,050 was equal to 0.77. 695 00:39:57,050 --> 00:39:59,450 And so all I need to check is, is this number 696 00:39:59,450 --> 00:40:01,610 larger or smaller than 1.96? 697 00:40:01,610 --> 00:40:06,530 That's what my test ends up being. 698 00:40:06,530 --> 00:40:12,860 So in example 1, 3.45 being larger 699 00:40:12,860 --> 00:40:18,885 than 1.96, that means that I reject 700 00:40:18,885 --> 00:40:22,790 fairness of my coin. In example 2, 701 00:40:22,790 --> 00:40:27,230 0.77 being smaller than 1.96-- 702 00:40:27,230 --> 00:40:29,370 what do I do? 703 00:40:29,370 --> 00:40:30,270 I fail to reject. 704 00:40:44,084 --> 00:40:45,000 So here is a question. 705 00:40:47,730 --> 00:40:54,270 In example 1, for what level alpha would psi alpha-- 706 00:40:57,530 --> 00:41:00,090 OK, so here, what's going to happen 707 00:41:00,090 --> 00:41:04,350 if I start decreasing my level? 708 00:41:04,350 --> 00:41:07,020 When I decrease my level, I'm actually 709 00:41:07,020 --> 00:41:09,100 making this area smaller and smaller, 710 00:41:09,100 --> 00:41:13,360 which means that I push this c to the right. 711 00:41:13,360 --> 00:41:17,080 So now I'm asking, what is the smallest c 712 00:41:17,080 --> 00:41:22,360 I should pick so that now, I actually do not reject h0? 713 00:41:22,360 --> 00:41:29,642 What is the smallest c I should be taking here? 714 00:41:29,642 --> 00:41:30,600 What is the smallest c? 715 00:41:37,520 --> 00:41:43,300 So c here, in the example I gave you for 5%, was 1.96. 
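The decision rule just applied is a one-line comparison. A sketch, using the two observed statistics from the lecture's examples (3.45 and 0.77) and the 5% threshold 1.96:

```python
def decide(tn_abs, c=1.96):
    """Golden rule: reject h0 iff the observed |Tn| exceeds the threshold c."""
    return "reject h0" if tn_abs > c else "fail to reject h0"

print(decide(3.45))  # reject h0 (example 1: the coin is declared unfair)
print(decide(0.77))  # fail to reject h0 (example 2)
```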
716 00:41:43,300 --> 00:41:49,330 What is the smallest c I should be taking so that now, 717 00:41:49,330 --> 00:41:50,730 this inequality is reversed? 718 00:41:54,980 --> 00:41:55,990 3.45. 719 00:41:55,990 --> 00:41:58,900 I ask only trivial questions, don't be worried. 720 00:41:58,900 --> 00:42:02,890 So 3.45 is the smallest c that I'm actually 721 00:42:02,890 --> 00:42:04,870 willing to tolerate. 722 00:42:04,870 --> 00:42:07,870 So let's say this was my 5%. 723 00:42:07,870 --> 00:42:09,730 If this was 2.5-- 724 00:42:09,730 --> 00:42:11,230 if here, let's say, in this picture, 725 00:42:11,230 --> 00:42:16,490 alpha is 5%, that means maybe I need to push here. 726 00:42:16,490 --> 00:42:18,150 And this number should be what? 727 00:42:18,150 --> 00:42:20,580 So this is going to be 1.96. 728 00:42:20,580 --> 00:42:26,460 And this number here is going to be 3.45, clearly to scale. 729 00:42:26,460 --> 00:42:30,330 And so now, what I want to ask you is, 730 00:42:30,330 --> 00:42:33,570 well, there's two ways I can understand this number 3.45. 731 00:42:33,570 --> 00:42:36,120 It is the number 3.45, but I can also 732 00:42:36,120 --> 00:42:40,340 try to understand what is the area to the right of this guy. 733 00:42:40,340 --> 00:42:42,780 And if I understand what the area to the right of this guy 734 00:42:42,780 --> 00:42:47,700 is, this is actually some alpha prime over 2. 735 00:42:47,700 --> 00:42:49,560 And that means that if I actually 736 00:42:49,560 --> 00:42:53,840 fix this level alpha prime, that would 737 00:42:53,840 --> 00:42:57,260 be exactly the tipping point at which I would 738 00:42:57,260 --> 00:43:01,420 go from accepting to rejecting. 739 00:43:01,420 --> 00:43:04,390 So I knew, in terms of absolute thresholds, 740 00:43:04,390 --> 00:43:07,300 3.45 is the trivial answer to the question. 741 00:43:07,300 --> 00:43:09,040 That's the tipping point, because I'm 742 00:43:09,040 --> 00:43:11,350 comparing a number to 3.45. 
743 00:43:11,350 --> 00:43:13,330 But now, if I try to map this back 744 00:43:13,330 --> 00:43:16,570 and understand what level would have been giving me 745 00:43:16,570 --> 00:43:18,910 this particular tipping point, that's 746 00:43:18,910 --> 00:43:21,430 a number between 0 and 1. 747 00:43:21,430 --> 00:43:25,830 The smaller the number, the larger this number here, 748 00:43:25,830 --> 00:43:28,280 which means that the more evidence I have in my data 749 00:43:28,280 --> 00:43:30,990 against h0. 750 00:43:30,990 --> 00:43:36,140 And so this number is actually something called the p-value. 751 00:43:36,140 --> 00:43:38,640 And so saying, for example 2, there's 752 00:43:38,640 --> 00:43:40,880 the tipping point alpha at which I 753 00:43:40,880 --> 00:43:44,150 go from failing to reject to rejecting. 754 00:43:44,150 --> 00:43:47,990 And that's exactly the number, the area under the curve, 755 00:43:47,990 --> 00:43:53,990 such that here, I see 0.77. 756 00:43:53,990 --> 00:43:56,710 And this is this alpha prime prime over 2. 757 00:43:59,660 --> 00:44:04,090 Alpha prime prime is clearly larger than 5%. 758 00:44:04,090 --> 00:44:06,630 So what's the advantage of thinking and mapping back 759 00:44:06,630 --> 00:44:08,170 these numbers? 760 00:44:08,170 --> 00:44:11,790 Well, now, I'm actually going to spit out some number which 761 00:44:11,790 --> 00:44:12,900 is between 0 and 1. 762 00:44:12,900 --> 00:44:18,827 And that should be the only scale you should have in mind. 763 00:44:18,827 --> 00:44:20,410 Remember, we discussed that last time. 764 00:44:20,410 --> 00:44:22,750 I was like, well, if I actually spit out 765 00:44:22,750 --> 00:44:26,200 a number which is 3.45, maybe you can try to think, 766 00:44:26,200 --> 00:44:29,230 is 3.45 a large number for a Gaussian? 767 00:44:29,230 --> 00:44:29,897 That's a number. 
768 00:44:29,897 --> 00:44:32,355 But if I had another random variable that was not Gaussian, 769 00:44:32,355 --> 00:44:33,910 maybe it was a double exponential, 770 00:44:33,910 --> 00:44:36,220 you would have to have another scale in your mind. 771 00:44:36,220 --> 00:44:42,880 Is 3.45 so large that it's unlikely for it 772 00:44:42,880 --> 00:44:44,680 to come from a double exponential? 773 00:44:44,680 --> 00:44:46,300 If I had a gamma distribution-- 774 00:44:46,300 --> 00:44:48,508 I can think of any distribution, and then that means, 775 00:44:48,508 --> 00:44:51,040 for each distribution, you would have to have a scale in mind. 776 00:44:51,040 --> 00:44:53,290 So of course, you can have the Gaussian scale in mind. 777 00:44:53,290 --> 00:44:55,270 I mean, I have the Gaussian scale in mind. 778 00:44:55,270 --> 00:44:59,740 But then, if I map it back into this number between 0 and 1, 779 00:44:59,740 --> 00:45:02,260 all the distributions play the same role. 780 00:45:02,260 --> 00:45:05,920 So whether my limiting distribution is 781 00:45:05,920 --> 00:45:09,874 normal or exponential or gamma, or whatever you want, 782 00:45:09,874 --> 00:45:11,290 for all these guys, I'm just going 783 00:45:11,290 --> 00:45:13,450 to map it into one number between 0 and 1. 784 00:45:13,450 --> 00:45:16,210 A small number means lots of evidence against h0. 785 00:45:16,210 --> 00:45:21,040 A large number means very little evidence 786 00:45:21,040 --> 00:45:25,210 against h0. 787 00:45:25,210 --> 00:45:27,800 And this is the only number you need to keep in mind. 788 00:45:27,800 --> 00:45:29,710 And the question is, am I willing 789 00:45:29,710 --> 00:45:34,570 to tolerate this number between 5%, 6%, or maybe 10%, 12%? 790 00:45:34,570 --> 00:45:37,720 And this is the only scale you have to have in mind. 791 00:45:37,720 --> 00:45:41,030 And this scale is the scale of p-values. 
792 00:45:41,030 --> 00:45:48,120 So the p-value is the tipping point in terms of alpha. 793 00:45:48,120 --> 00:45:52,050 In words, I can make it formal, because tipping point, 794 00:45:52,050 --> 00:45:54,510 as far as I know, is not a mathematical term. 795 00:45:54,510 --> 00:45:58,950 So a p-value of a test is the smallest, 796 00:45:58,950 --> 00:46:01,740 potentially asymptotic, level if I talk about an asymptotic 797 00:46:01,740 --> 00:46:02,910 p-value-- 798 00:46:02,910 --> 00:46:05,520 and that's what we do when we talk about the central limit theorem-- 799 00:46:05,520 --> 00:46:09,410 at which the test rejects h0. 800 00:46:09,410 --> 00:46:10,820 If I were to go any smaller-- 801 00:46:14,640 --> 00:46:17,750 sorry, it's the smallest level-- 802 00:46:17,750 --> 00:46:19,360 yeah, if I were to go any smaller, 803 00:46:19,360 --> 00:46:21,250 I would fail to reject. 804 00:46:21,250 --> 00:46:25,200 The smaller the level, the less likely it is for me to reject. 805 00:46:25,200 --> 00:46:26,710 And if I were to go any smaller, I 806 00:46:26,710 --> 00:46:31,010 would start failing to reject. 807 00:46:31,010 --> 00:46:33,210 And so it is a random number. 808 00:46:33,210 --> 00:46:35,790 It depends on what I actually observe. 809 00:46:35,790 --> 00:46:39,240 So here, of course, I instantiated those two numbers, 810 00:46:39,240 --> 00:46:44,202 3.45 and 0.77, as realizations of random variables. 811 00:46:44,202 --> 00:46:46,410 But if you think of those as being the random numbers 812 00:46:46,410 --> 00:46:50,190 before I see my data, this was a random number, 813 00:46:50,190 --> 00:46:53,550 and therefore, the area under the curve to the right of it 814 00:46:53,550 --> 00:46:55,830 is also a random area. 815 00:46:55,830 --> 00:46:58,920 If this thing fluctuates, then the area under the curve 816 00:46:58,920 --> 00:47:00,090 fluctuates. 817 00:47:00,090 --> 00:47:02,150 And that's what the p-value is. 
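The tipping point can be computed rather than read off a curve: the two-sided asymptotic p-value is the Gaussian area beyond the observed |Tn| on both sides. A sketch applied to the lecture's two realizations:

```python
from statistics import NormalDist

def p_value(tn_abs):
    """Two-sided asymptotic p-value: P(|N(0,1)| > |Tn|), the tipping-point level."""
    return 2 * (1 - NormalDist().cdf(tn_abs))

print(p_value(3.45))  # far below 0.05: reject at the 5% level
print(p_value(0.77))  # far above 0.05: fail to reject
```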
818 00:47:02,150 --> 00:47:05,950 That's what-- what is his name? 819 00:47:05,950 --> 00:47:06,780 I forget. 820 00:47:06,780 --> 00:47:10,470 John Oliver talks about when he talks about p-hacking. 821 00:47:10,470 --> 00:47:14,150 And so we talked about this in the first lecture. 822 00:47:14,150 --> 00:47:18,194 So p-hacking is, how do I do-- oh, if I'm a scientist, 823 00:47:18,194 --> 00:47:20,360 do I want to see a small p-value or a large p-value? 824 00:47:20,360 --> 00:47:21,027 AUDIENCE: Small. 825 00:47:21,027 --> 00:47:22,359 PHILIPPE RIGOLLET: Small, right? 826 00:47:22,359 --> 00:47:24,960 Scientists want to see small p-values because small p-values 827 00:47:24,960 --> 00:47:28,020 equals rejecting, which equals discovery, 828 00:47:28,020 --> 00:47:31,230 which equals publications, which equals promotion. 829 00:47:31,230 --> 00:47:34,550 So that's what people want to see. 830 00:47:34,550 --> 00:47:37,970 So people are tempted to see small p-values. 831 00:47:37,970 --> 00:47:41,660 And what's called p-hacking is, well, find a way to cheat. 832 00:47:41,660 --> 00:47:44,450 Maybe look at your data, formulate your hypothesis 833 00:47:44,450 --> 00:47:49,840 in such a way that you will actually have a smaller 834 00:47:49,840 --> 00:47:51,375 p-value than you should have. 835 00:47:51,375 --> 00:47:53,000 So here, for example, there's one thing 836 00:47:53,000 --> 00:47:54,958 I did not insist on because, again, this is not 837 00:47:54,958 --> 00:47:57,200 a particular course on statistical thinking, 838 00:47:57,200 --> 00:47:59,420 but one thing that we implicitly did 839 00:47:59,420 --> 00:48:04,750 was set those theta 0 and theta 1 ahead of time. 840 00:48:04,750 --> 00:48:08,378 I fixed them, and I'm trying to test this. 841 00:48:08,378 --> 00:48:11,280 This is to be contrasted with the following approach. 842 00:48:11,280 --> 00:48:13,350 I draw my data. 
843 00:48:13,350 --> 00:48:15,224 So I draw-- 844 00:48:15,224 --> 00:48:16,890 I run this experiment, which is probably 845 00:48:16,890 --> 00:48:18,840 going to get me a publication in nature. 846 00:48:18,840 --> 00:48:23,010 I'm trying to test if a coin is fair. 847 00:48:23,010 --> 00:48:24,760 And I draw my data, and I see that there's 848 00:48:24,760 --> 00:48:31,020 13 out of 30 of my observations that are heads. 849 00:48:31,020 --> 00:48:32,610 That means that, from this data, it 850 00:48:32,610 --> 00:48:36,850 looks like p is less than 1/2. 851 00:48:36,850 --> 00:48:38,560 So if I look at this data and then 852 00:48:38,560 --> 00:48:42,870 decide that my alternative is not p not equal to 1/2, 853 00:48:42,870 --> 00:48:47,300 but rather p less than 1/2, that's p-hacking. 854 00:48:47,300 --> 00:48:50,550 I'm actually making my p-value strictly smaller 855 00:48:50,550 --> 00:48:53,130 by first looking at the data, and then deciding what 856 00:48:53,130 --> 00:48:54,660 my alternative is going to be. 857 00:48:54,660 --> 00:48:58,770 And that's cheating, because all the things we did, 858 00:48:58,770 --> 00:49:02,970 we're assuming that this 0.5, or the alternative, 859 00:49:02,970 --> 00:49:05,240 was actually a fixed-- everything was deterministic. 860 00:49:05,240 --> 00:49:07,164 The only randomness came from the data. 861 00:49:07,164 --> 00:49:08,580 But if I start looking at the data 862 00:49:08,580 --> 00:49:11,130 and designing my experiment or my alternatives 863 00:49:11,130 --> 00:49:13,232 and null hypothesis based on the data, 864 00:49:13,232 --> 00:49:15,690 it's as if I started putting randomness all over the place. 865 00:49:15,690 --> 00:49:18,300 And then I cannot control it because I don't know how it 866 00:49:18,300 --> 00:49:22,960 just intermingles with each other. 867 00:49:22,960 --> 00:49:26,170 So that was for the John Oliver moment. 868 00:49:29,940 --> 00:49:32,340 So the p-value is nice. 
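The cheat described above can be quantified: if you pick the one-sided alternative after seeing which way the data leans, your p-value is exactly half of the honest two-sided one. A sketch with a hypothetical observed statistic of -1.8 (data leaning toward p less than 1/2):

```python
from statistics import NormalDist

phi = NormalDist().cdf

def p_two_sided(tn):
    """Honest p-value for the pre-registered alternative p != 1/2."""
    return 2 * (1 - phi(abs(tn)))

def p_one_sided(tn):
    """'Hacked' p-value: alternative p < 1/2 chosen after seeing the data."""
    return 1 - phi(abs(tn))

tn = -1.8  # hypothetical observed statistic
print(p_two_sided(tn))  # above 5%: the honest test fails to reject
print(p_one_sided(tn))  # below 5%: the hacked test 'discovers' an unfair coin
```

At this hypothetical value, the halving pushes the p-value across the 5% line, which is exactly why the practice is cheating.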
869 00:49:32,340 --> 00:49:35,500 So maybe I mentioned that, before, my wife 870 00:49:35,500 --> 00:49:36,610 works in market research. 871 00:49:36,610 --> 00:49:40,240 And maybe every two years, she seems 872 00:49:40,240 --> 00:49:42,370 to run into a statistician in the hallway, 873 00:49:42,370 --> 00:49:45,070 and she comes home and says, what is a p-value again? 874 00:49:45,070 --> 00:49:48,190 And for her, a p-value is just the number 875 00:49:48,190 --> 00:49:50,080 in an Excel spreadsheet. 876 00:49:50,080 --> 00:49:55,360 And actually, small equals good and large equals bad. 877 00:49:55,360 --> 00:49:57,820 And that's all she needs to know at this point. 878 00:49:57,820 --> 00:50:01,310 Actually, they do the job for her-- small is green, 879 00:50:01,310 --> 00:50:02,290 large is red. 880 00:50:02,290 --> 00:50:06,590 And so for her, a p-value is just green or red. 881 00:50:06,590 --> 00:50:08,740 But so what she's really implicitly doing 882 00:50:08,740 --> 00:50:12,980 with this color code is just applying the golden rule. 883 00:50:12,980 --> 00:50:16,210 What the statisticians do for her in the Excel spreadsheet 884 00:50:16,210 --> 00:50:18,760 is that they take the numbers for the p-values that 885 00:50:18,760 --> 00:50:20,440 are less than some fixed level. 886 00:50:20,440 --> 00:50:22,390 So depending on the field in which she works-- 887 00:50:22,390 --> 00:50:24,410 so she works for pharmaceutical companies-- 888 00:50:24,410 --> 00:50:26,560 so the p-values are typically compared-- 889 00:50:26,560 --> 00:50:31,030 the tests are usually performed at level 1%, rather than 5%. 890 00:50:31,030 --> 00:50:33,040 So 5% is maybe your gold standard 891 00:50:33,040 --> 00:50:36,550 if you're doing sociology or trying to-- 892 00:50:36,550 --> 00:50:39,970 I don't know-- release a new blueberry flavor 893 00:50:39,970 --> 00:50:40,990 for your toothpaste. 
894 00:50:40,990 --> 00:50:43,630 Something that's not going to change the life of people, 895 00:50:43,630 --> 00:50:45,040 maybe you're going to run at 5%. 896 00:50:45,040 --> 00:50:46,150 It's OK to make a mistake. 897 00:50:46,150 --> 00:50:47,858 See, people are just going to feel gross, 898 00:50:47,858 --> 00:50:50,770 but that's about it, whereas here, 899 00:50:50,770 --> 00:50:53,296 if you have this p-value which is less than 1%, 900 00:50:53,296 --> 00:50:55,420 it might be more important for some drug discovery, 901 00:50:55,420 --> 00:50:56,550 for example. 902 00:50:56,550 --> 00:50:59,810 And so let's say you run at 1%. 903 00:50:59,810 --> 00:51:02,470 And so what they do in this Excel spreadsheet is 904 00:51:02,470 --> 00:51:05,800 that all the numbers that are below 1% show up in green 905 00:51:05,800 --> 00:51:09,010 and all the numbers that are above 1% show up in red. 906 00:51:09,010 --> 00:51:09,790 And that's it. 907 00:51:09,790 --> 00:51:11,710 That's just applying the golden rule. 908 00:51:11,710 --> 00:51:13,640 If the number is green, reject. 909 00:51:13,640 --> 00:51:18,024 If the number is red, fail to reject. 910 00:51:18,024 --> 00:51:18,524 Yeah? 911 00:51:18,524 --> 00:51:20,512 AUDIENCE: So going back to 912 00:51:20,512 --> 00:51:23,991 the prior example where you 913 00:51:23,991 --> 00:51:26,476 want to cheat by looking at the data 914 00:51:26,476 --> 00:51:32,450 and then formulating, say, theta 1 to be p less than 1/2. 915 00:51:32,450 --> 00:51:33,914 PHILIPPE RIGOLLET: Yeah. 916 00:51:33,914 --> 00:51:38,306 AUDIENCE: So how would you achieve your goal 917 00:51:38,306 --> 00:51:40,804 by changing the theta-- 918 00:51:40,804 --> 00:51:42,470 PHILIPPE RIGOLLET: By achieving my goal, 919 00:51:42,470 --> 00:51:45,826 you mean leaving ethics aside, right? 920 00:51:45,826 --> 00:51:46,700 AUDIENCE: Yeah, yeah. 921 00:51:46,700 --> 00:51:47,930 PHILIPPE RIGOLLET: Ah, you want to be published.
922 00:51:47,930 --> 00:51:48,555 AUDIENCE: Yeah. 923 00:51:48,555 --> 00:51:54,410 PHILIPPE RIGOLLET: [LAUGHS] So let me teach you how, then. 924 00:51:54,410 --> 00:51:58,160 So well, here, what do you do? 925 00:51:58,160 --> 00:52:03,380 You want to-- at the end of the day, 926 00:52:03,380 --> 00:52:06,200 a test is only telling you whether you found evidence 927 00:52:06,200 --> 00:52:11,150 in your data that h1 was more likely than h0, basically. 928 00:52:11,150 --> 00:52:12,830 How do you make h1 more likely? 929 00:52:12,830 --> 00:52:18,500 Well, you just basically target h1 to be what it is-- 930 00:52:18,500 --> 00:52:21,740 what the data is going to make it more likely to be. 931 00:52:21,740 --> 00:52:26,210 So if, for example, I say h1 can be on both sides, 932 00:52:26,210 --> 00:52:29,177 then my data is going to have to take into account fluctuations 933 00:52:29,177 --> 00:52:31,760 on both sides, and I'm going to lose a factor of two somewhere 934 00:52:31,760 --> 00:52:33,690 because things are not symmetric. 935 00:52:33,690 --> 00:52:38,240 Here is the ultimate way of making this work. 936 00:52:38,240 --> 00:52:42,920 I'm going back to my example of flipping coins. 937 00:52:42,920 --> 00:52:45,790 And now, so here, what I did is, I said, 938 00:52:45,790 --> 00:52:54,690 oh, this number 0.43 is actually smaller than 0.5, 939 00:52:54,690 --> 00:52:56,640 so I'm just going to test whether I'm 0.5 940 00:52:56,640 --> 00:52:58,780 or I'm less than 0.5. 941 00:52:58,780 --> 00:53:01,050 But here is something that-- I can promise you, 942 00:53:01,050 --> 00:53:04,380 I did not make the computation-- will reject. 943 00:53:04,380 --> 00:53:06,180 So here, this one actually-- 944 00:53:06,180 --> 00:53:08,280 yeah, this one fails to reject. 945 00:53:08,280 --> 00:53:11,080 So here is one that will certainly reject. 946 00:53:11,080 --> 00:53:24,950 h0: p is 0.5, versus h1: p is 0.43.
947 00:53:24,950 --> 00:53:27,830 Now, you can try, but I can promise you 948 00:53:27,830 --> 00:53:32,030 that your data will tell you that h1 is the right one. 949 00:53:32,030 --> 00:53:36,110 I mean, you can check very quickly that this is really 950 00:53:36,110 --> 00:53:37,780 extremely likely to happen. 951 00:53:40,795 --> 00:53:41,670 Actually, what am I-- 952 00:53:45,050 --> 00:53:52,220 no, actually, that's not true, because here, 953 00:53:52,220 --> 00:53:56,330 the test that I derive that's based on this kind of stuff, 954 00:53:56,330 --> 00:53:59,960 here at some point, somewhere under some layers, 955 00:53:59,960 --> 00:54:04,450 I assume that all our tests are going to have this form. 956 00:54:04,450 --> 00:54:06,030 But here, this is only when you're 957 00:54:06,030 --> 00:54:09,030 trying to test one region versus another region next to it, 958 00:54:09,030 --> 00:54:11,030 or one point versus a region around it, 959 00:54:11,030 --> 00:54:13,140 or something like this, whereas for this guy, 960 00:54:13,140 --> 00:54:15,640 there's another test that could come up with, 961 00:54:15,640 --> 00:54:18,720 which is, what is the probability that I get 0.43, 962 00:54:18,720 --> 00:54:21,786 and what is the probability that I get 0.5? 963 00:54:21,786 --> 00:54:23,160 Now, what I'm going to do is, I'm 964 00:54:23,160 --> 00:54:25,680 going to just conclude it's whichever 965 00:54:25,680 --> 00:54:27,607 has the largest probability. 966 00:54:27,607 --> 00:54:29,940 Then maybe I'm going to have to make some adjustments so 967 00:54:29,940 --> 00:54:32,310 that the level is actually 5%. 968 00:54:32,310 --> 00:54:33,690 But I can make this happen. 969 00:54:33,690 --> 00:54:36,890 I can make the level be 5% and always conclude this guy, 970 00:54:36,890 --> 00:54:38,610 but I would have to use a different test. 
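The "pick whichever has the largest probability" test just described can be written down for this point-versus-point problem. A hedged Python sketch, reusing the 13-heads-out-of-30 data from the earlier example and skipping the level adjustment the lecture mentions would still be needed:

```python
from math import comb

n, k = 30, 13  # observed: 13 heads out of 30 flips

def binom_likelihood(p, n, k):
    # Probability of exactly k heads in n flips of a coin with bias p
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

l0 = binom_likelihood(0.50, n, k)  # likelihood under h0: p = 0.5
l1 = binom_likelihood(0.43, n, k)  # likelihood under h1: p = 0.43

# The alternative tailored to the data is more likely, so this naive
# maximum-likelihood comparison sides with h1.
print(l1 > l0)  # True
```

Since the alternative was chosen by looking at the data first, it is essentially guaranteed to win this comparison, which is the point of the example.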
971 00:54:38,610 --> 00:54:40,290 Now, the test that I described, again, 972 00:54:40,290 --> 00:54:42,870 those tn larger than c are built in 973 00:54:42,870 --> 00:54:46,170 to be tests that are resilient to these kind of manipulations 974 00:54:46,170 --> 00:54:48,840 because they're oblivious towards what 975 00:54:48,840 --> 00:54:50,490 the alternative looks like. 976 00:54:50,490 --> 00:54:51,960 I mean, they're just saying it's either to the left 977 00:54:51,960 --> 00:54:53,334 or to the right, but whether it's 978 00:54:53,334 --> 00:54:55,440 a point or an entire half-line doesn't matter. 979 00:54:59,170 --> 00:55:01,200 So if you try to look at your data 980 00:55:01,200 --> 00:55:05,520 and just put the data itself into your hypothesis testing 981 00:55:05,520 --> 00:55:10,745 problem, then you're failing the statistical principle. 982 00:55:10,745 --> 00:55:12,120 And that's what people are doing. 983 00:55:12,120 --> 00:55:13,572 I mean, how can I check? 984 00:55:13,572 --> 00:55:15,030 I mean, of course, here, it's going 985 00:55:15,030 --> 00:55:16,488 to be pretty blatant if you publish 986 00:55:16,488 --> 00:55:17,820 a paper that looks like this. 987 00:55:17,820 --> 00:55:19,650 But there's ways to do it differently. 988 00:55:19,650 --> 00:55:21,734 For example, one way to do it is to just do mult-- 989 00:55:21,734 --> 00:55:23,233 so typically, what people do is they 990 00:55:23,233 --> 00:55:24,780 do multiple hypothesis testing. 991 00:55:24,780 --> 00:55:27,570 They're doing 100 tests at a time. 992 00:55:27,570 --> 00:55:30,714 Then you have random fluctuations every time. 993 00:55:30,714 --> 00:55:32,130 And so they just pick the one that 994 00:55:32,130 --> 00:55:34,422 has the random fluctuations that go their way. 
995 00:55:34,422 --> 00:55:36,130 I mean, sometimes it's going in your way, 996 00:55:36,130 --> 00:55:37,879 and sometimes it's going the opposite way, 997 00:55:37,879 --> 00:55:39,780 so you just pick the one that works for you. 998 00:55:39,780 --> 00:55:41,990 We'll talk about multiple hypothesis testing soon 999 00:55:41,990 --> 00:55:44,860 if you want to increase your publication count. 1000 00:55:49,779 --> 00:55:50,820 There are actually journals-- 1001 00:55:50,820 --> 00:55:53,040 I think it was big news that some journals, 1002 00:55:53,040 --> 00:55:54,720 I think, in psychology or psychometrics, 1003 00:55:54,720 --> 00:55:57,900 actually refuse to publish p-values now. 1004 00:56:03,630 --> 00:56:05,880 Where were we? 1005 00:56:05,880 --> 00:56:07,230 Here's the golden rule. 1006 00:56:07,230 --> 00:56:11,970 So one thing that I like to show is this thing, 1007 00:56:11,970 --> 00:56:14,340 just so you know how you apply the golden rule 1008 00:56:14,340 --> 00:56:16,050 and how you apply the standard tests. 1009 00:56:16,050 --> 00:56:25,450 So the standard paradigm is the following. 1010 00:56:25,450 --> 00:56:29,430 You have a black box, which is your test. 1011 00:56:29,430 --> 00:56:32,230 For my wife, this is the 4th floor of the building. 1012 00:56:32,230 --> 00:56:33,840 That's where the statisticians sit. 1013 00:56:33,840 --> 00:56:35,980 What she sends there is data-- 1014 00:56:38,620 --> 00:56:41,140 let's say x1, ..., xn. 1015 00:56:41,140 --> 00:56:43,870 And she says, well, this one is about toothpaste, 1016 00:56:43,870 --> 00:56:45,190 so here's a level-- 1017 00:56:45,190 --> 00:56:47,220 let's say 5%. 1018 00:56:47,220 --> 00:56:50,890 What the 4th floor brings back is that answer-- yes, 1019 00:56:50,890 --> 00:56:53,290 no, green, red, just an answer. 1020 00:56:58,050 --> 00:56:59,670 So that's the standard testing.
1021 00:56:59,670 --> 00:57:02,040 You just feed it the data and the level at which you 1022 00:57:02,040 --> 00:57:04,330 want to perform the test, maybe asymptotic, 1023 00:57:04,330 --> 00:57:06,580 and it spits out a yes, no answer. 1024 00:57:06,580 --> 00:57:15,340 What the p-value does is, you just feed it the data itself. 1025 00:57:18,380 --> 00:57:22,210 And what it spits out is the p-value. 1026 00:57:22,210 --> 00:57:23,870 And now it's just up to you. 1027 00:57:23,870 --> 00:57:27,910 I mean, hopefully your brain has the computational power 1028 00:57:27,910 --> 00:57:31,090 of deciding whether a number is larger or smaller than 5% 1029 00:57:31,090 --> 00:57:33,790 without having to call a statistician for this. 1030 00:57:33,790 --> 00:57:35,360 And that's what it does. 1031 00:57:35,360 --> 00:57:37,600 So now we're on a scale. 1032 00:57:37,600 --> 00:57:41,500 Now, I see some of you nodding when I talk about p-hacking, 1033 00:57:41,500 --> 00:57:43,095 so that means you've seen p-values. 1034 00:57:43,095 --> 00:57:45,220 If you've seen more than 100 p-values in your life, 1035 00:57:45,220 --> 00:57:47,330 you have an entire scale. 1036 00:57:47,330 --> 00:57:50,770 A good p-value is less than 10 to the minus 4. 1037 00:57:50,770 --> 00:57:53,440 That's the ultimate sweet spot. 1038 00:57:53,440 --> 00:57:56,830 Actually, statistical software spits out 1039 00:57:56,830 --> 00:58:01,370 an output which says less than 10 to the minus 4. 1040 00:58:01,370 --> 00:58:02,830 But then maybe you want a p-val-- 1041 00:58:05,870 --> 00:58:08,960 if you tell me my p-value was 4.65%, then I will say, 1042 00:58:08,960 --> 00:58:10,970 you've been doing some p-hacking until you found 1043 00:58:10,970 --> 00:58:12,680 a number that was below 5%. 1044 00:58:12,680 --> 00:58:14,660 That's typically what people will do.
1045 00:58:14,660 --> 00:58:16,590 But if you tell me-- 1046 00:58:16,590 --> 00:58:18,800 if you're doing the test, if you're saying, 1047 00:58:18,800 --> 00:58:21,590 I published my result, my test at 5% 1048 00:58:21,590 --> 00:58:27,020 said yes, that means that maybe your p-value was 4.99%, 1049 00:58:27,020 --> 00:58:29,762 or your p-value was 10 to the minus 4, I will never know. 1050 00:58:29,762 --> 00:58:31,220 I will never know how much evidence 1051 00:58:31,220 --> 00:58:34,310 you had against the null. 1052 00:58:34,310 --> 00:58:36,260 But if you tell me what the p-value is, 1053 00:58:36,260 --> 00:58:37,492 I can make my own decision. 1054 00:58:37,492 --> 00:58:39,450 You don't have to tell me whether it's a yes or no. 1055 00:58:39,450 --> 00:58:42,620 You tell me it's 4.99%, I'm going to say, well, maybe yes, 1056 00:58:42,620 --> 00:58:45,120 but I'm going to take it with a grain of salt. 1057 00:58:45,120 --> 00:58:48,100 And so that's why p-values are good numbers to have in mind. 1058 00:58:48,100 --> 00:58:51,940 Now, I say this as if it was like an old trick 1059 00:58:51,940 --> 00:58:54,310 that you start mastering when you're 45 years old. 1060 00:58:54,310 --> 00:58:57,490 No, it's just, how small is the number between 0 and 1? 1061 00:58:57,490 --> 00:59:00,670 That's really what you need to know. 1062 00:59:00,670 --> 00:59:03,385 Maybe on the log scale-- if it's 10 to the minus 1, 1063 00:59:03,385 --> 00:59:07,240 10 to the minus 2, 10 to the minus 3, et cetera-- 1064 00:59:07,240 --> 00:59:09,790 that's probably the extent of the mastery here. 1065 00:59:12,820 --> 00:59:16,420 So this traditional standard paradigm that I showed 1066 00:59:16,420 --> 00:59:21,160 is actually commonly referred to as the Neyman-Pearson paradigm. 1067 00:59:21,160 --> 00:59:23,320 So here, it says Neyman-Pearson's theory, 1068 00:59:23,320 --> 00:59:25,970 so there's an entire theory that comes with it.
1069 00:59:25,970 --> 00:59:27,130 But it's really a paradigm. 1070 00:59:27,130 --> 00:59:29,296 It's a way of thinking about hypothesis testing that 1071 00:59:29,296 --> 00:59:32,530 says, well, if I'm not going to be able to optimize both 1072 00:59:32,530 --> 00:59:34,750 my type I and type II error, I'm actually 1073 00:59:34,750 --> 00:59:37,780 going to lock in my type I error below some level 1074 00:59:37,780 --> 00:59:42,550 and just minimize the type II error under this constraint. 1075 00:59:42,550 --> 00:59:45,100 That's what the Neyman-Pearson paradigm is. 1076 00:59:45,100 --> 00:59:48,490 And it sort of makes sense for hypothesis testing problems. 1077 00:59:48,490 --> 00:59:50,560 Now, if you were doing some other applications 1078 00:59:50,560 --> 00:59:52,210 with multi-objective optimization, 1079 00:59:52,210 --> 00:59:54,310 you would maybe come up with something different. 1080 00:59:54,310 --> 00:59:58,600 For example, machine learning is not performing typically 1081 00:59:58,600 --> 01:00:01,120 under Neyman-Pearson paradigm. 1082 01:00:01,120 --> 01:00:05,360 So if you do spam filtering, you could say, well, 1083 01:00:05,360 --> 01:00:08,650 I want to constrain the probability as much as I can 1084 01:00:08,650 --> 01:00:10,810 of taking somebody's important emails 1085 01:00:10,810 --> 01:00:14,590 and throwing them out as spam, and under this constraint, 1086 01:00:14,590 --> 01:00:17,350 not send too much spam to that person. 1087 01:00:17,350 --> 01:00:19,290 That sort of makes sense for spams. 1088 01:00:19,290 --> 01:00:23,620 Now, if you're labeling cats versus dogs, that's probably 1089 01:00:23,620 --> 01:00:27,100 not like you want to make sure that no more than 5% 1090 01:00:27,100 --> 01:00:30,670 of the dogs are labeled cat because, I mean, 1091 01:00:30,670 --> 01:00:31,469 it doesn't matter. 
1092 01:00:31,469 --> 01:00:33,010 So what you typically do is, you just 1093 01:00:33,010 --> 01:00:34,810 sum up the two types of errors you can make, 1094 01:00:34,810 --> 01:00:36,851 and you minimize the sum without putting any more 1095 01:00:36,851 --> 01:00:38,260 weight on one or the other. 1096 01:00:38,260 --> 01:00:42,880 So here's an example where, when making a binary decision, 1097 01:00:42,880 --> 01:00:45,070 with the two errors you can make, you don't 1098 01:00:45,070 --> 01:00:47,840 actually have to be like that. 1099 01:00:47,840 --> 01:00:50,300 So this example here-- 1100 01:00:50,300 --> 01:00:55,200 the trivial test psi is equal to 0, what was it 1101 01:00:55,200 --> 01:01:00,350 in the US trial court example? 1102 01:01:00,350 --> 01:01:03,530 What is psi equals 0? 1103 01:01:03,530 --> 01:01:05,500 That was concluding always to the null. 1104 01:01:05,500 --> 01:01:08,000 What was the null? 1105 01:01:08,000 --> 01:01:08,930 AUDIENCE: Innocent. 1106 01:01:08,930 --> 01:01:10,388 PHILIPPE RIGOLLET: Innocent, right? 1107 01:01:10,388 --> 01:01:11,340 That's the status quo. 1108 01:01:11,340 --> 01:01:14,790 So that means that this guy never rejects h0. 1109 01:01:14,790 --> 01:01:16,910 Everybody's going away free. 1110 01:01:16,910 --> 01:01:18,800 So you're sure you're not actually 1111 01:01:18,800 --> 01:01:25,250 going against the Constitution because alpha is 0%, which 1112 01:01:25,250 --> 01:01:26,940 is certainly less than 5%. 1113 01:01:26,940 --> 01:01:30,260 But the power, the fact that a lot of criminals 1114 01:01:30,260 --> 01:01:34,130 go back outside in the free world 1115 01:01:34,130 --> 01:01:37,580 is actually formulated in terms of low power, which, 1116 01:01:37,580 --> 01:01:39,100 in this case, is actually 0. 1117 01:01:39,100 --> 01:01:41,690 Again, the power is a number between 0 and 1. 1118 01:01:41,690 --> 01:01:43,100 Close to 1, good. 1119 01:01:43,100 --> 01:01:45,227 Close to 0, bad.
1120 01:01:45,227 --> 01:01:51,510 Now, what is the definition of the p-value? 1121 01:01:51,510 --> 01:01:54,300 That's going to be something-- it's a mouthful. 1122 01:01:54,300 --> 01:01:58,520 The definition of the p-value is a mouthful. 1123 01:01:58,520 --> 01:02:00,240 It's the tipping point. 1124 01:02:00,240 --> 01:02:02,620 It is the smallest level at which blah, blah, blah, blah, 1125 01:02:02,620 --> 01:02:03,120 blah. 1126 01:02:03,120 --> 01:02:05,410 It's complicated to remember it. 1127 01:02:05,410 --> 01:02:09,910 Now, I think that at my 6th explanation, my wife, 1128 01:02:09,910 --> 01:02:12,994 after saying, oh, so it's the probability of making an error-- 1129 01:02:12,994 --> 01:02:14,910 I said, yeah, that's the probability of making 1130 01:02:14,910 --> 01:02:16,870 an error because, of course, she can 1131 01:02:16,870 --> 01:02:22,270 think: probability of making an error-- small, good; large, bad. 1132 01:02:22,270 --> 01:02:24,940 So that's actually a good way to remember it. 1133 01:02:24,940 --> 01:02:26,521 I'm pretty sure that at least 50% 1134 01:02:26,521 --> 01:02:28,270 of people who are using p-values out there 1135 01:02:28,270 --> 01:02:31,220 think that the p-value is the probability of making an error. 1136 01:02:31,220 --> 01:02:33,040 Now, for all intents and purposes, 1137 01:02:33,040 --> 01:02:35,620 if your goal is to just threshold the p-value, 1138 01:02:35,620 --> 01:02:37,900 this is OK to have in mind. 1139 01:02:37,900 --> 01:02:42,220 But for now, at least until December 22, 1140 01:02:42,220 --> 01:02:44,830 I would recommend trying to actually memorize 1141 01:02:44,830 --> 01:02:46,780 the right definition for the p-value. 1142 01:02:53,020 --> 01:02:55,240 So the idea, again, is fix the level 1143 01:02:55,240 --> 01:02:57,112 and try to optimize the power. 1144 01:03:01,360 --> 01:03:05,370 So we're going to try to compute some p-values from now on.
1145 01:03:05,370 --> 01:03:06,900 How do you compute the p-value? 1146 01:03:06,900 --> 01:03:10,670 Well, you can actually see it from this picture over there. 1147 01:03:14,047 --> 01:03:16,130 One thing I didn't show on this picture-- so here, 1148 01:03:16,130 --> 01:03:19,010 it was my q alpha over 2 that had alpha here, 1149 01:03:19,010 --> 01:03:21,230 alpha over 2 here. 1150 01:03:21,230 --> 01:03:22,580 That was my q alpha over 2. 1151 01:03:22,580 --> 01:03:26,870 And I said, if tn is to the right of this guy, 1152 01:03:26,870 --> 01:03:27,746 I'm going to reject. 1153 01:03:27,746 --> 01:03:29,120 If tn is to the left of this guy, 1154 01:03:29,120 --> 01:03:31,550 I'm going to fail to reject. 1155 01:03:31,550 --> 01:03:34,720 Pictorially, you can actually represent the p-value. 1156 01:03:34,720 --> 01:03:36,770 It's when I replace this guy by tn itself. 1157 01:03:41,170 --> 01:03:44,770 Sorry, that's p-value over 2. 1158 01:03:44,770 --> 01:03:47,360 No, actually, that's p-value. 1159 01:03:47,360 --> 01:03:51,290 So let me just keep it like that and put the absolute value 1160 01:03:51,290 --> 01:03:51,790 here. 1161 01:03:54,530 --> 01:03:58,560 So if you replace the role of q alpha over 2, by your test 1162 01:03:58,560 --> 01:04:01,630 statistic, the area under the curve 1163 01:04:01,630 --> 01:04:03,160 is actually the p-value itself up 1164 01:04:03,160 --> 01:04:06,730 to a scale because of the symmetric thing. 1165 01:04:06,730 --> 01:04:09,280 So there's a good way to see, pictorially, 1166 01:04:09,280 --> 01:04:10,930 what the p-value is. 1167 01:04:10,930 --> 01:04:13,750 It's just the probability that some Gaussians-- 1168 01:04:13,750 --> 01:04:17,680 it's just the probability that some absolute value of n01 1169 01:04:17,680 --> 01:04:18,370 exceeds tn. 1170 01:04:22,480 --> 01:04:24,384 That's what the p-value is. 
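That tail probability needs nothing more than the standard Gaussian cdf. A small Python sketch (the value tn = 1.96 is just an illustrative number, not one from the lecture):

```python
from math import erf, sqrt

def std_gaussian_cdf(x):
    # Phi(x) for a standard Gaussian, via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def two_sided_p_value(tn):
    # P(|N(0, 1)| > |tn|) = 2 * (1 - Phi(|tn|))
    return 2.0 * (1.0 - std_gaussian_cdf(abs(tn)))

print(two_sided_p_value(1.96))  # about 0.05, the classical 5% threshold
```

The only ingredient specific to this test is the standard Gaussian cdf; swapping in another limiting cdf gives the p-value for a different limiting distribution, which is the point made next.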
1171 01:04:24,384 --> 01:04:26,300 Now, this guy has nothing to do with this guy, 1172 01:04:26,300 --> 01:04:32,680 so this is really just 1 minus the Gaussian cdf of tn, 1173 01:04:32,680 --> 01:04:34,650 and that's it. 1174 01:04:34,650 --> 01:04:36,460 So that's how I would compute p-values. 1175 01:04:36,460 --> 01:04:40,210 Now, as I said, the p-value is a beauty 1176 01:04:40,210 --> 01:04:43,330 because you don't have to understand 1177 01:04:43,330 --> 01:04:47,050 the fact that your limiting distribution is a Gaussian. 1178 01:04:47,050 --> 01:04:49,264 It's already factored in this construction. 1179 01:04:49,264 --> 01:04:50,680 The fact that I'm actually looking 1180 01:04:50,680 --> 01:04:54,880 at this cumulative distribution function of a standard Gaussian 1181 01:04:54,880 --> 01:04:57,302 makes my p-value automatically adjust to what 1182 01:04:57,302 --> 01:04:58,510 the limiting distribution is. 1183 01:04:58,510 --> 01:05:00,460 And if this was the cumulative distribution 1184 01:05:00,460 --> 01:05:03,520 function of an exponential, I would just 1185 01:05:03,520 --> 01:05:06,055 have a different function here denoted by f, for example, 1186 01:05:06,055 --> 01:05:07,866 and I would just compute a different value. 1187 01:05:07,866 --> 01:05:10,240 But in the end, regardless of what the limiting distribution is, 1188 01:05:10,240 --> 01:05:13,540 my p-value would still be a number between 0 and 1. 1189 01:05:13,540 --> 01:05:16,650 And so to illustrate that, let's look 1190 01:05:16,650 --> 01:05:20,340 at other weird distributions that we could get in place 1191 01:05:20,340 --> 01:05:22,620 of the standard Gaussian. 1192 01:05:22,620 --> 01:05:24,980 And we're not going to see many, but we'll see one. 1193 01:05:24,980 --> 01:05:27,311 And it's not called the chi squared distribution.
1194 01:05:27,311 --> 01:05:29,310 It's actually called the Student's distribution, 1195 01:05:29,310 --> 01:05:31,920 but it involves the chi squared distribution 1196 01:05:31,920 --> 01:05:34,320 as a building block. 1197 01:05:34,320 --> 01:05:38,820 So I don't know if my phonetics are not really right there, 1198 01:05:38,820 --> 01:05:43,050 so I try to say, well, it's chi squared. 1199 01:05:43,050 --> 01:05:47,190 Maybe it's "kee" squared above, in Canada, who knows. 1200 01:05:47,190 --> 01:05:50,589 So for a positive integer, so there's only 1 parameter. 1201 01:05:50,589 --> 01:05:52,380 So for the Gaussian, you have 2 parameters, 1202 01:05:52,380 --> 01:05:54,180 which are mu and sigma squared. 1203 01:05:54,180 --> 01:05:55,350 Those are real numbers. 1204 01:05:55,350 --> 01:05:57,090 Sigma squared's positive. 1205 01:05:57,090 --> 01:05:59,280 Here, I have 1 integer parameter. 1206 01:06:03,030 --> 01:06:05,160 Then the chi squared distribution 1207 01:06:05,160 --> 01:06:07,642 with d degrees of freedom-- 1208 01:06:07,642 --> 01:06:09,600 so the parameter is called a degree of freedom, 1209 01:06:09,600 --> 01:06:11,700 just like mu is called the expected value and sigma 1210 01:06:11,700 --> 01:06:12,600 squared is called the variance. 1211 01:06:12,600 --> 01:06:14,370 Here, we call it degrees of freedom. 1212 01:06:14,370 --> 01:06:17,290 You don't have to really understand why. 1213 01:06:17,290 --> 01:06:19,800 So that's the law that you would get-- 1214 01:06:19,800 --> 01:06:21,300 that's the random variable you would 1215 01:06:21,300 --> 01:06:26,260 get if you were to sum d squares of independent standard 1216 01:06:26,260 --> 01:06:26,890 Gaussians. 1217 01:06:29,570 --> 01:06:33,680 So I take the square of an independent random Gaussian. 1218 01:06:33,680 --> 01:06:34,730 I take another one. 1219 01:06:34,730 --> 01:06:36,380 I sum them, and that's a chi squared 1220 01:06:36,380 --> 01:06:39,370 with 2 degrees of freedom. 
1221 01:06:39,370 --> 01:06:40,740 That's how you get it. 1222 01:06:40,740 --> 01:06:46,960 Now, I could define it using its probability density function. 1223 01:06:46,960 --> 01:06:49,150 I mean, after all, this is the sum 1224 01:06:49,150 --> 01:06:51,730 of positive random variables, so it 1225 01:06:51,730 --> 01:06:53,800 is a positive random variable. 1226 01:06:53,800 --> 01:06:56,680 It has a density on the positive real line. 1227 01:06:56,680 --> 01:07:03,420 And the pdf of chi squared with d degrees of freedom is what? 1228 01:07:03,420 --> 01:07:07,900 Well, it's fd of x is-- 1229 01:07:07,900 --> 01:07:13,930 what is it?-- x to the d/2 minus 1 e to the minus x/2. 1230 01:07:13,930 --> 01:07:16,510 And then here, I have a gamma of d/2. 1231 01:07:16,510 --> 01:07:20,470 And the other one is, I think, 2 to the d/2 minus 1. 1232 01:07:23,158 --> 01:07:26,530 No, 2 to the d/2. 1233 01:07:26,530 --> 01:07:28,690 That's what it is. 1234 01:07:28,690 --> 01:07:30,580 That's the density. 1235 01:07:30,580 --> 01:07:32,110 If you are very good at probability, 1236 01:07:32,110 --> 01:07:33,640 you can make the change of variable 1237 01:07:33,640 --> 01:07:35,789 and write your Jacobian and do all this stuff 1238 01:07:35,789 --> 01:07:37,330 and actually check that this is true. 1239 01:07:37,330 --> 01:07:40,540 I do not recommend doing that. 1240 01:07:40,540 --> 01:07:44,070 So this is the density, but it's better understood like that. 1241 01:07:44,070 --> 01:07:46,270 I think it was just something that you 1242 01:07:46,270 --> 01:07:48,160 built from standard Gaussian. 1243 01:07:48,160 --> 01:07:50,800 So for example, an example of a chi 1244 01:07:50,800 --> 01:07:52,870 squared with 2 degrees of freedom 1245 01:07:52,870 --> 01:07:54,790 is actually the following thing. 1246 01:07:54,790 --> 01:07:56,860 Let's assume I have a target like this. 1247 01:08:00,170 --> 01:08:02,960 And I don't aim very well. 
1248 01:08:02,960 --> 01:08:05,996 And I'm trying to hit the center. 1249 01:08:05,996 --> 01:08:07,370 And I'm not going to have, maybe, 1250 01:08:07,370 --> 01:08:10,380 a deviation, which is standard Gaussian left, right 1251 01:08:10,380 --> 01:08:16,520 and standard Gaussian north, south. 1252 01:08:16,520 --> 01:08:18,710 So I'm throwing, and then I'm here, 1253 01:08:18,710 --> 01:08:22,279 and I'm claiming that this number here, by Pythagoras 1254 01:08:22,279 --> 01:08:24,290 theorem, the square distance here 1255 01:08:24,290 --> 01:08:25,790 is the sum of this square distance 1256 01:08:25,790 --> 01:08:30,060 here, which is the square of a Gaussian by assumption. 1257 01:08:30,060 --> 01:08:31,854 This is plus the square of this distance, 1258 01:08:31,854 --> 01:08:34,020 which is the square of another independent Gaussian. 1259 01:08:34,020 --> 01:08:35,486 I assume those are independent. 1260 01:08:35,486 --> 01:08:37,819 And so the square distance from this point to this point 1261 01:08:37,819 --> 01:08:40,580 is the chi squared with 2 degrees of freedom. 1262 01:08:40,580 --> 01:08:45,029 So this guy here is n01 squared. 1263 01:08:45,029 --> 01:08:48,140 This is n01 squared. 1264 01:08:48,140 --> 01:08:50,120 And so this guy here, this distance here, 1265 01:08:50,120 --> 01:08:53,285 is chi squared with 2 degrees of freedom. 1266 01:08:53,285 --> 01:08:54,410 I mean the square distance. 1267 01:08:54,410 --> 01:08:58,569 I'm talking about square distances here. 1268 01:08:58,569 --> 01:09:02,380 So now you can see that, actually, Pythagoras 1269 01:09:02,380 --> 01:09:05,899 is basically why the chi squared arises. 1270 01:09:05,899 --> 01:09:07,810 That's why it has its own name. 1271 01:09:07,810 --> 01:09:10,479 I mean, I could define this random variable. 1272 01:09:10,479 --> 01:09:13,075 I mean, it's actually a gamma distribution. 1273 01:09:13,075 --> 01:09:15,700 It's a special case of something called the gamma distribution.
1274 01:09:15,700 --> 01:09:17,658 The fact that the special case has its own name 1275 01:09:17,658 --> 01:09:19,075 is because there's many times when 1276 01:09:19,075 --> 01:09:20,491 we're going to take sums of squares 1277 01:09:20,491 --> 01:09:23,140 of independent Gaussians because Gaussians, the sum of squares 1278 01:09:23,140 --> 01:09:25,330 is really the norm, the Euclidean norm squared, 1279 01:09:25,330 --> 01:09:26,830 just by Pythagoras theorem. 1280 01:09:26,830 --> 01:09:28,319 If I'm in higher dimension, I can 1281 01:09:28,319 --> 01:09:30,370 start to sum more squared coordinates, 1282 01:09:30,370 --> 01:09:32,119 and I'm going to measure the norm squared. 1283 01:09:34,240 --> 01:09:37,880 So if you want to draw this picture, it looks like this. 1284 01:09:37,880 --> 01:09:39,620 Again, it's the sum of positive numbers, 1285 01:09:39,620 --> 01:09:43,000 so it's going to be on 0 plus infinity. 1286 01:09:43,000 --> 01:09:44,680 That's fd. 1287 01:09:44,680 --> 01:09:52,850 And so f1 looks like this, f2 looks like this. 1288 01:09:52,850 --> 01:09:57,370 So the tails become heavier and heavier as d increases. 1289 01:09:57,370 --> 01:10:00,560 And then at d equal to 3, it starts 1290 01:10:00,560 --> 01:10:01,960 to have a different shape. 1291 01:10:01,960 --> 01:10:04,210 It starts from 0 and it looks like this. 1292 01:10:04,210 --> 01:10:06,850 And then, as d increases, it's basically 1293 01:10:06,850 --> 01:10:09,210 as if you were to push this thing to the right. 1294 01:10:09,210 --> 01:10:14,979 It's just like, psh, so it's just falling like a big blob. 1295 01:10:14,979 --> 01:10:16,270 Everybody sees what's going on? 1296 01:10:16,270 --> 01:10:19,637 So there's just this fat thing that's just going there. 1297 01:10:19,637 --> 01:10:21,470 What is the expected value of a chi squared?
1298 01:10:28,670 --> 01:10:30,900 So it's the expected value of the sum 1299 01:10:30,900 --> 01:10:37,938 of Gaussian random variables, squared. 1300 01:10:37,938 --> 01:10:40,358 I know I said that. 1301 01:10:40,358 --> 01:10:42,790 AUDIENCE: So it's the sum of their second moments, right? 1302 01:10:42,790 --> 01:10:43,956 PHILIPPE RIGOLLET: Which is? 1303 01:10:46,240 --> 01:10:47,570 Those are n01. 1304 01:10:47,570 --> 01:10:50,386 AUDIENCE: It's like-- oh, I see, 1. 1305 01:10:50,386 --> 01:10:51,386 PHILIPPE RIGOLLET: Yeah. 1306 01:10:51,386 --> 01:10:53,294 AUDIENCE: So n times 1 or d times 1. 1307 01:10:53,294 --> 01:10:55,070 PHILIPPE RIGOLLET: Yeah, which is d. 1308 01:10:55,070 --> 01:10:56,990 So one thing you can check quickly 1309 01:10:56,990 --> 01:11:00,280 is that the expected value of a chi squared is d. 1310 01:11:00,280 --> 01:11:04,280 And so you see, that's why the mass is shifting to the right 1311 01:11:04,280 --> 01:11:05,240 as d increases. 1312 01:11:05,240 --> 01:11:06,350 It's just going there. 1313 01:11:06,350 --> 01:11:08,320 Actually, the variance is also increasing. 1314 01:11:08,320 --> 01:11:10,180 The variance is 2d. 1315 01:11:14,070 --> 01:11:16,130 So this is one thing. 1316 01:11:16,130 --> 01:11:19,230 And so why do we care about this? 1317 01:11:19,230 --> 01:11:22,640 In basic statistics, it's not like we actually 1318 01:11:22,640 --> 01:11:25,140 have statistics much about throwing darts 1319 01:11:25,140 --> 01:11:28,590 at high-dimensional boards. 
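[Editor's note: the mean d and variance 2d can be verified numerically. A minimal Python sketch, not part of the lecture; d, the seed, and the draw count are arbitrary.]

```python
import random

random.seed(1)
d, n_draws = 5, 200_000

# Build chi squared draws with d degrees of freedom as sums of d
# squared standard Gaussians, then check that the sample mean is
# close to d and the sample variance close to 2d.
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(d))
         for _ in range(n_draws)]

mean = sum(draws) / n_draws
var = sum((x - mean) ** 2 for x in draws) / n_draws
print(mean, var)  # close to d = 5 and 2d = 10
```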
1320 01:11:28,590 --> 01:11:31,380 So what's happening is that if I look at the sample variance, 1321 01:11:31,380 --> 01:11:36,590 the average of the squares centered by their mean, 1322 01:11:36,590 --> 01:11:38,480 then I can actually expand this as the sum 1323 01:11:38,480 --> 01:11:42,320 of the squares minus the average squared. 1324 01:11:42,320 --> 01:11:44,680 It's just the same trick that we have 1325 01:11:44,680 --> 01:11:49,360 for the variance-- second moment minus first moment squared. 1326 01:11:49,360 --> 01:11:53,785 And then I claim that Cochran's theorem-- 1327 01:11:53,785 --> 01:11:56,640 and I will tell you in a second what Cochran's theorem tells me 1328 01:11:56,640 --> 01:11:58,390 is that this sample variance is actually-- 1329 01:11:58,390 --> 01:12:01,330 so if I had only this-- 1330 01:12:01,330 --> 01:12:04,560 look at those guys. 1331 01:12:04,560 --> 01:12:07,170 Those guys are Gaussian with mean mu and variance 1332 01:12:07,170 --> 01:12:08,370 sigma squared. 1333 01:12:08,370 --> 01:12:13,290 Think for 1 second of mu being 0 and sigma squared being 1. 1334 01:12:13,290 --> 01:12:16,920 Now, this part would be a chi squared with n degrees 1335 01:12:16,920 --> 01:12:19,162 of freedom divided by n. 1336 01:12:19,162 --> 01:12:21,300 Now I get another thing here, which 1337 01:12:21,300 --> 01:12:24,850 is the square of something that looks like a Gaussian as well. 1338 01:12:24,850 --> 01:12:27,750 So it looks like I have something else here, which 1339 01:12:27,750 --> 01:12:29,580 looks also like a chi squared. 1340 01:12:29,580 --> 01:12:31,890 Now, Cochran's theorem is essentially telling you 1341 01:12:31,890 --> 01:12:35,130 that those things are independent, 1342 01:12:35,130 --> 01:12:39,720 and so that in a way, you can think of those guys as being, 1343 01:12:39,720 --> 01:12:43,780 here, n degrees of freedom minus 1 degree of freedom. 
1344 01:12:43,780 --> 01:12:47,820 Now, here, as I said, these do not have mean 0 and variance 1. 1345 01:12:47,820 --> 01:12:50,730 The fact that it's not mean 0 is not a problem 1346 01:12:50,730 --> 01:12:54,790 because I can remove the mean here and remove the mean here. 1347 01:12:54,790 --> 01:12:57,130 And so this thing has the same distribution, 1348 01:12:57,130 --> 01:12:59,100 regardless of what the actual mean is. 1349 01:12:59,100 --> 01:13:00,600 So without loss of generality, I can 1350 01:13:00,600 --> 01:13:02,097 assume that mu is equal to 0. 1351 01:13:02,097 --> 01:13:03,930 Now, the variance, I'm going to have to pay, 1352 01:13:03,930 --> 01:13:06,660 because if I multiply all these numbers by 10, 1353 01:13:06,660 --> 01:13:09,647 then this sn is going to be multiplied by 100. 1354 01:13:09,647 --> 01:13:11,730 So this thing is going to scale with the variance. 1355 01:13:11,730 --> 01:13:13,813 And not surprisingly, it's scaling like sigma 1356 01:13:13,813 --> 01:13:15,550 squared, the variance. 1357 01:13:15,550 --> 01:13:18,120 So if I look at sn, it's distributed 1358 01:13:18,120 --> 01:13:21,960 as sigma squared times the chi squared 1359 01:13:21,960 --> 01:13:25,060 with n minus 1 degrees of freedom divided by n. 1360 01:13:25,060 --> 01:13:28,230 And we don't really write that, because a chi squared 1361 01:13:28,230 --> 01:13:30,650 times sigma squared divided by n is not a distribution, 1362 01:13:30,650 --> 01:13:32,292 so we put everything to the left, 1363 01:13:32,292 --> 01:13:34,500 and we say that n times sn over sigma squared is actually a chi squared with n 1364 01:13:34,500 --> 01:13:36,880 minus 1 degrees of freedom. 1365 01:13:36,880 --> 01:13:40,570 So here, I'm actually dropping a fact on you, 1366 01:13:40,570 --> 01:13:43,810 but you can see the building block. 1367 01:13:43,810 --> 01:13:46,750 What is the thing that's fuzzy at this point, 1368 01:13:46,750 --> 01:13:48,820 but the rest should be crystal clear to you? 
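[Editor's note: the claim that n times the sample variance over sigma squared is a chi squared with n minus 1 degrees of freedom can be sanity-checked by simulation. A minimal Python sketch, not part of the lecture; n, mu, sigma, the seed, and the repetition count are arbitrary.]

```python
import random

random.seed(2)
n, mu, sigma = 10, 3.0, 2.0
n_reps = 100_000

# Sn is the sample variance: 1/n times the sum of squared deviations
# from the sample mean.  The claim is that n * Sn / sigma^2 follows a
# chi squared with n - 1 degrees of freedom, whose mean is n - 1.
stats = []
for _ in range(n_reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    sn = sum((x - xbar) ** 2 for x in xs) / n
    stats.append(n * sn / sigma ** 2)

mean = sum(stats) / n_reps
print(mean)  # should be close to n - 1 = 9
```

Note that the result does not depend on mu, matching the without-loss-of-generality argument above.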
1369 01:13:48,820 --> 01:13:52,060 The thing that's fuzzy is that removing this squared guy 1370 01:13:52,060 --> 01:13:55,770 here is actually removing 1 degree of freedom. 1371 01:13:55,770 --> 01:13:59,122 That should be weird, but that's what Cochran's theorem tells us. 1372 01:13:59,122 --> 01:14:00,800 It's essentially stating something 1373 01:14:00,800 --> 01:14:04,760 about orthogonality of subspaces with the span 1374 01:14:04,760 --> 01:14:07,580 of the constant vector, something like that. 1375 01:14:07,580 --> 01:14:09,770 So you don't have to think about it too much, 1376 01:14:09,770 --> 01:14:11,630 but that's what it's telling me. 1377 01:14:11,630 --> 01:14:15,170 But the rest, if you plug in-- so the scaling in sigma squared 1378 01:14:15,170 --> 01:14:18,450 and in n, so that should be completely clear to you. 1379 01:14:18,450 --> 01:14:20,810 So in particular, if I remove that part, 1380 01:14:20,810 --> 01:14:24,950 it should be clear to you that this thing, if the mean is 0, 1381 01:14:24,950 --> 01:14:27,240 this thing is actually distributed-- 1382 01:14:27,240 --> 01:14:30,153 well, if mu is 0, what is the distribution of this guy? 1383 01:14:35,810 --> 01:14:37,739 So I remove that part, just this part. 1384 01:14:46,840 --> 01:14:50,700 So I have xi, which are n0 sigma squared. 1385 01:14:50,700 --> 01:14:53,505 And I'm asking, what is the distribution of 1/n sum from i 1386 01:14:53,505 --> 01:14:57,382 equal 1 to n of xi squared? 1387 01:14:57,382 --> 01:15:00,350 So it's a sum, and they're IID. 1388 01:15:00,350 --> 01:15:03,620 So it's the sum of squares of independent Gaussians, but not standard. 1389 01:15:03,620 --> 01:15:05,300 So the first thing to make them standard 1390 01:15:05,300 --> 01:15:07,450 is that I divide all of them by sigma squared. 1391 01:15:10,690 --> 01:15:17,050 Now, this guy is of the form zi squared where zi is n01. 1392 01:15:20,710 --> 01:15:25,740 So now, this thing here has what distribution? 
1393 01:15:25,740 --> 01:15:27,420 AUDIENCE: Chi squared n. 1394 01:15:27,420 --> 01:15:30,198 PHILIPPE RIGOLLET: Chi squared n. 1395 01:15:30,198 --> 01:15:33,450 And now, sigma squared over n times chi squared n-- 1396 01:15:33,450 --> 01:15:35,940 so if I have sigma squared divided by n times chi 1397 01:15:35,940 --> 01:15:37,830 squared-- 1398 01:15:37,830 --> 01:15:41,970 sorry, so sn times n divided by sigma squared. 1399 01:15:41,970 --> 01:15:45,570 So if I take this thing and I multiply it 1400 01:15:45,570 --> 01:15:48,360 by n divided by sigma squared, it means I remove this term, 1401 01:15:48,360 --> 01:15:49,860 and now I am left with a chi squared 1402 01:15:49,860 --> 01:15:51,130 with n degrees of freedom. 1403 01:15:51,130 --> 01:15:55,320 Now, the effect of centering with the sample mean here 1404 01:15:55,320 --> 01:15:57,450 is only to lose 1 degree of freedom. 1405 01:15:57,450 --> 01:15:58,108 That's it. 1406 01:16:01,460 --> 01:16:05,210 So if I want to do a test about variance, since this 1407 01:16:05,210 --> 01:16:08,000 is supposedly a good estimator of variance, 1408 01:16:08,000 --> 01:16:10,880 this could be my pivotal distribution. 1409 01:16:10,880 --> 01:16:12,710 This could play the role of a Gaussian. 1410 01:16:12,710 --> 01:16:16,100 If I want to know if my variance is equal to 1 or larger than 1, 1411 01:16:16,100 --> 01:16:21,720 I could actually build a test based on this statistic alone 1412 01:16:21,720 --> 01:16:23,910 and test if the variance is larger than 1 or not. 1413 01:16:23,910 --> 01:16:25,451 Now, this is not asymptotic because I 1414 01:16:25,451 --> 01:16:28,470 started with the very assumption that my data was 1415 01:16:28,470 --> 01:16:29,160 Gaussian itself. 
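[Editor's note: the contrast between the uncentered statistic, which is chi squared with n degrees of freedom, and the statistic centered by the sample mean, which loses exactly one degree of freedom, can also be simulated. A minimal Python sketch, not part of the lecture; n, sigma, the seed, and the repetition count are arbitrary.]

```python
import random

random.seed(3)
n, sigma = 10, 2.0
n_reps = 100_000

# With mu = 0: rescaling the uncentered sum of squares by 1/sigma^2
# gives a chi squared with n degrees of freedom (mean n), while
# centering by the sample mean first loses exactly 1 degree of
# freedom (mean n - 1).
uncentered, centered = [], []
for _ in range(n_reps):
    xs = [random.gauss(0, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    uncentered.append(sum(x ** 2 for x in xs) / sigma ** 2)
    centered.append(sum((x - xbar) ** 2 for x in xs) / sigma ** 2)

mean_unc = sum(uncentered) / n_reps
mean_cen = sum(centered) / n_reps
print(mean_unc, mean_cen)  # close to n = 10 and n - 1 = 9
```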
1416 01:16:32,390 --> 01:16:33,830 Now, just a side remark-- you can 1417 01:16:33,830 --> 01:16:37,220 check that a chi squared with 2 degrees of freedom 1418 01:16:37,220 --> 01:16:38,990 is an exponential with parameter 1/2, which is certainly not 1419 01:16:38,990 --> 01:16:42,350 clear from the fact that z1 squared plus z2 squared 1420 01:16:42,350 --> 01:16:44,690 is a chi squared with 2 degrees of freedom. 1421 01:16:44,690 --> 01:16:46,440 If I give you the sum of the squares 1422 01:16:46,440 --> 01:16:50,540 of 2 independent Gaussians, this is actually an exponential. 1423 01:16:50,540 --> 01:16:53,140 That's not super clear, right? 1424 01:16:53,140 --> 01:17:00,260 But if you look at what was here-- 1425 01:17:00,260 --> 01:17:03,430 I don't know if you took notes, but let me rewrite it for you. 1426 01:17:03,430 --> 01:17:08,760 So it was x to the d/2 minus 1 e to the minus x/2 divided 1427 01:17:08,760 --> 01:17:14,470 by 2 to the d/2 gamma of d/2. 1428 01:17:14,470 --> 01:17:18,200 So if I plug in d is equal to 2, gamma of 2/2 1429 01:17:18,200 --> 01:17:21,920 is gamma of 1, which is 1. 1430 01:17:21,920 --> 01:17:23,750 It's factorial of 0. 1431 01:17:23,750 --> 01:17:26,290 So it's 1, so this guy goes away. 1432 01:17:26,290 --> 01:17:33,310 2 to the d/2 is 2 to the 1, so that's just 1. 1433 01:17:33,310 --> 01:17:36,490 No, that's just 2. 1434 01:17:36,490 --> 01:17:40,990 Then x to the d/2 minus 1 is x to the 0, goes away. 1435 01:17:40,990 --> 01:17:47,080 And so I have e to the minus x/2 times 1/2, which is really, indeed, 1436 01:17:47,080 --> 01:17:50,050 of the form lambda e to the minus lambda 1437 01:17:50,050 --> 01:17:53,290 x for lambda is equal to 1/2, which was 1438 01:17:53,290 --> 01:17:54,851 our exponential distribution. 1439 01:17:59,200 --> 01:18:05,290 Well, next week is, well, Columbus Day? 1440 01:18:05,290 --> 01:18:08,510 So not next Monday-- 1441 01:18:08,510 --> 01:18:12,310 so next week, we'll talk about Student's distribution. 
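[Editor's note: the density calculation above can be checked directly. A minimal Python sketch, not part of the lecture, using the board's formula for the chi squared density.]

```python
import math

# Chi squared density with d degrees of freedom, as written on the board:
#   f_d(x) = x^(d/2 - 1) * exp(-x/2) / (2^(d/2) * Gamma(d/2))
def chi2_pdf(x, d):
    return x ** (d / 2 - 1) * math.exp(-x / 2) / (2 ** (d / 2) * math.gamma(d / 2))

# Exponential density: lambda * exp(-lambda * x)
def exp_pdf(x, lam):
    return lam * math.exp(-lam * x)

# Plugging in d = 2 should collapse f_2 to the Exp(1/2) density.
for x in (0.1, 1.0, 2.5, 7.0):
    assert abs(chi2_pdf(x, 2) - exp_pdf(x, 0.5)) < 1e-12
print("chi squared with 2 degrees of freedom matches Exp(1/2)")
```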
1442 01:18:12,310 --> 01:18:15,430 And so that was discovered by a guy 1443 01:18:15,430 --> 01:18:19,400 who pretended his name was Student, but was not Student. 1444 01:18:19,400 --> 01:18:23,260 And I challenge you to find why in the meantime. 1445 01:18:23,260 --> 01:18:24,940 So I'll see you next week. 1446 01:18:24,940 --> 01:18:28,090 Your homework is going to be outside 1447 01:18:28,090 --> 01:18:31,733 so we can release the room.