The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PHILIPPE RIGOLLET: We're talking about tests. And to be fair, we spend most of our time talking about the new jargon that we're using. The main goal is to make a binary decision, yes or no. So just so that we're clear and we make sure that we all speak the same language, let me just remind you what the key words are for tests.

So the first thing is that we split Theta into Theta 0 and Theta 1. Both are included in Theta, and they are disjoint. So I have my set of possible parameters, and then Theta 0 is here, Theta 1 is here. And there might be something that I leave out. And so what we're doing is, we have two hypotheses. So here's our hypothesis testing problem: it's H0, theta belongs to Theta 0, versus H1, theta belongs to Theta 1. This guy was called the null, and this guy was called the alternative.
And why we give them special names is because we saw that they have an asymmetric role. The null represents the status quo, and the data is here to bring evidence against this guy. And we can really never conclude that H0 is true, because all we could conclude is that H1 is not true, or may not be true.

So that was the first thing. The second thing was the hypotheses. The third thing is, what is a test? Well, psi -- it's a statistic, and it takes the data and maps it into 0 or 1. And I didn't really mention it, but there's such a thing as randomized tests, which is, well, if I cannot really make a decision, I might as well flip a coin. That coin tends to be biased, but that's really -- I mean, think about it in practice. You probably don't want to make decisions based on flipping a coin. And so what people typically do -- this is happening, typically, at one specific value. So rather than flipping a coin for this very specific value, what people typically do is they say, OK, I'm going to side with H0, because that's the most conservative choice I can make.
So in a way, they think of flipping this coin, but it always falls on heads, say.

So associated to this test was something called the rejection region, R psi, which is just the set of data (x1, ..., xn) such that psi(x1, ..., xn) is equal to 1. So that means we reject H0 when the test is 1. And those are the sets of data points that actually are going to lead me to reject H0.

And then the things that were actually slightly more important, and really peculiar to tests, specific to tests, were the type I and type II errors. So the type I error is when you reject whereas H0 is correct. And the type II error is the opposite, so it's when you fail to reject whereas H1 is correct.

So those are the two types of errors you can make. And we quantified their probabilities. So alpha psi is the probability of type I error. It's a function: alpha psi of theta is the probability under theta that psi rejects, that is, P theta of psi equals 1. And it's defined for theta in Theta 0, so for the different values of theta in Theta 0.
So H0 being correct means there exists a theta in Theta 0 for which that actually is the right distribution. So for different values of theta, I might make different errors. So if you think, for example, about the coin example, I'm testing if the coin is biased towards heads or biased towards tails. So if I'm testing whether p is larger than 1/2 or less than 1/2 -- let's say our H0 is that p is larger than 1/2 -- then when p is equal to 1, it's actually very difficult for me to make a mistake, because I only see heads. Then as p is getting closer to 1/2, I'm going to start making a larger and larger probability of error.

And so the type II error -- so that's the probability of type II error -- is denoted by beta psi. And it's the function that does the opposite and, this time, is defined for theta in Theta 1.

And finally, we defined something called the power, pi of psi. And this time, this is actually a number. And this number is equal to the minimum over theta in Theta 1 -- I mean, that could be an infimum, but think of it as being a minimum -- of P theta of psi equals 1.
So this is not making a mistake: if theta is in Theta 1 and I conclude 1, this is a good thing. I want this number to be large. And I'm looking at the worst case -- what is the smallest value this number can be?

So what I want to show you a little bit is a picture. So now I'm going to take theta, and think of it as being a p. So I'm going to take p as the parameter of the coin experiment. So p can range between 0 and 1, that's for sure. And what I'm going to try to test is whether p is less than 1/2 or larger than 1/2. So this is going to be, let's say, Theta 0, and this guy here is Theta 1. I'm just trying to give you a picture of what those guys are. So I have my y-axis, and now I'm going to start drawing numbers. All these things -- this function, this function, and this number -- are all numbers between 0 and 1.

So now I'm claiming that -- so when I move from left to right, what is my probability of rejecting going to do? So what I'm going to plot is the probability under theta.
The first thing I want to plot is the probability under theta that psi is equal to 1. And let's say psi -- think of psi as being just the indicator that square root of n, times Xn bar minus p, over the square root of Xn bar times 1 minus Xn bar, is larger than some constant c, for a properly chosen c.

So what we choose is c in such a way that, at 1/2, when we're testing for 1/2, what we wanted was this number to be equal to alpha, basically. So we fix this number alpha so that this guy -- so I want alpha psi of theta less than alpha, for an alpha given in advance. So think of it as being equal to, say, 5%. So I'm fixing this number, and I want this to be controlled for all theta in Theta 0.

So if you're going to give me this budget, well, I'm actually going to make it equal where I can. If you're telling me I can make it equal to alpha -- we know that if I increase my type I error, I'm going to decrease my type II error. If I start putting everyone in jail, or if I start letting everyone go free -- that's what we were discussing last time.
So since we have this trade-off, and you're giving me a budget for one guy, I'm just going to max it out. And where am I going to max it out? Exactly at 1/2, at the boundary. So this is going to be 5%.

So what I know is that alpha psi of theta is less than alpha for all theta in Theta 0 -- sorry, that's for Theta 0, that's where alpha psi is defined. So for theta in Theta 0, I know that my function is going to look like this. It's going to be somewhere in this rectangle. Everybody agrees?

So this function for this guy is going to look like this. When I'm at 0, when p is equal to 0, which means I only observe 0's, then I know that Xn bar is going to be 0, and I will certainly not conclude that p is larger than 1/2 -- this test will never conclude that p is larger than 1/2, just because Xn bar is going to be equal to 0. Well, the statistic is actually not well-defined there, so maybe I need to do something -- say, set it equal to 0 if Xn bar is equal to 0. So basically, I get something which is negative, and so it's never going to be larger than what I want. And so here, I'm actually starting at 0.
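To make this concrete, here is a minimal Python sketch of the test just drawn on the board -- my own code, not the lecture's -- testing H0: p less than or equal to 1/2 against H1: p larger than 1/2 at asymptotic level alpha. The function name and the handling of the degenerate cases Xn bar equal to 0 or 1 are my own choices, with the Xn bar equal to 0 case resolved as discussed above.

```python
import math
from statistics import NormalDist

def coin_test(xs, p0=0.5, alpha=0.05):
    """Return 1 (reject H0: p <= p0) or 0 (fail to reject), for 0/1 data xs."""
    n = len(xs)
    xbar = sum(xs) / n
    # Degenerate cases: the standardization below would divide by zero.
    if xbar == 0.0:
        return 0          # all tails: certainly no evidence for p > 1/2
    if xbar == 1.0:
        return 1          # all heads: overwhelming evidence for p > 1/2
    tn = math.sqrt(n) * (xbar - p0) / math.sqrt(xbar * (1.0 - xbar))
    c = NormalDist().inv_cdf(1 - alpha)   # ~1.645 for alpha = 5%
    return 1 if tn > c else 0

print(coin_test([1] * 80 + [0] * 20))     # 80 heads in 100 tosses: rejects, prints 1
print(coin_test([1] * 50 + [0] * 50))     # 50/50: fails to reject, prints 0
```

Siding with 0 in the degenerate all-tails case is exactly the conservative "side with H0" convention from the start of the lecture.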
So now, this is this function here that increases -- I mean, it should increase smoothly. This function here is alpha psi of theta -- or alpha psi of p, let's say, because we're talking about p. Then it reaches alpha here.

Now, when I go on the other side, I'm actually looking at beta. When I'm on Theta 1, the function that matters is the probability of type II error, which is beta psi. And this beta psi is actually going to decrease as p moves away from 1/2.

So what is beta psi? Well, beta psi is -- sorry, that's the probability of psi being equal to 0. So what I'm going to do is, I'm going to look at the probability of rejecting. So let me draw this function all the way. It's going to look like this. Now here, if I look at this function here or here, this is the probability under theta that psi is equal to 1. And we just said that, in this region, this function is called alpha psi. In that region, it's not called alpha psi. It's not called anything -- it's just the probability of rejection. So it's not an error at all; it's actually what you should be doing.
What we're looking at in this region is 1 minus this guy. We're looking at the probability of not rejecting. So I basically need to look at 1 minus this thing, which here is going to be 95%. So I'm going to mark 95%. And this is my probability. And I'm just basically drawing the mirror image of this guy. So this here is the probability under theta that psi is equal to 0, which is 1 minus P theta of psi equals 1. So it's just 1 minus the white curve. And it's actually, by definition, equal to beta psi of theta.

Now, where do I read pi psi? What is pi psi on this picture? Is pi psi a number or a function?

AUDIENCE: A number.

PHILIPPE RIGOLLET: It's a number, right? It's the minimum of a function. What is this function? It's the probability under theta that psi is equal to 1. I drew this entire function between Theta 0 and Theta 1. This is this entire white curve. This is this probability. Now I'm saying, look at the smallest value this probability can take on the set Theta 1. What is this?
This guy. This is where my pi -- this thing here is pi psi, and so it's equal to 5%.

So that's for this particular test, because this test has a continuous curve for this psi. And so if I want to make sure that I'm at 5% when I come to the right end of Theta 0, then where it touches Theta 1, I'd better have 5% on the other side if the function is continuous. So basically, if this function is increasing, which will be the case for most tests, and continuous, then what's going to happen is that the level of the test, which is alpha, is actually going to be equal to the power of the test.

Now, there's something I didn't mention, and I'm just mentioning it in passing. Here, I defined the power itself. This function, this entire white curve here, is actually called the power function -- this thing. That's the entire white curve. And what you could have is a test whose entire curve is dominated by that of another test. So here, if I look at this test -- and let's assume I can build another test that has this curve.
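The white curve itself, the map from p to the probability of rejecting, can be sketched numerically. Here is a Monte Carlo sketch of my own (not from the lecture) for the coin test with n = 100 observations, using the plug-in statistic with p = 1/2; the sample sizes and trial count are arbitrary choices.

```python
import random
from statistics import NormalDist

random.seed(0)

def rejection_prob(p, n=100, alpha=0.05, trials=5000):
    """Monte Carlo estimate of the power function p -> P_p(psi = 1)."""
    c = NormalDist().inv_cdf(1 - alpha)   # asymptotic 5% threshold, ~1.645
    rejects = 0
    for _ in range(trials):
        xbar = sum(random.random() < p for _ in range(n)) / n
        if 0.0 < xbar < 1.0:
            tn = n**0.5 * (xbar - 0.5) / (xbar * (1.0 - xbar))**0.5
            rejects += tn > c
        else:
            rejects += xbar == 1.0        # degenerate case: all heads rejects
    return rejects / trials

# The curve stays below alpha on Theta 0, hits roughly alpha at the
# boundary p = 1/2, and climbs toward 1 deep inside Theta 1:
for p in (0.3, 0.5, 0.6, 0.8):
    print(p, round(rejection_prob(p), 3))
```

Deep inside Theta 0 the estimate is essentially 0, at p = 1/2 it hovers around the 5% budget, and it rises quickly on Theta 1 -- the shape of the white curve on the board.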
Let's say it's the same here, but then here, it looks like this. What is the power of this test?

AUDIENCE: It's the same.

PHILIPPE RIGOLLET: It's the same. It's 5%, because this point touches here at exactly the same point. However, for any value other than the worst possible one, this guy is doing better than this guy. Can you see that? Having a curve that's higher on the right-hand side is a good thing, because it means that you tend to reject more when you're actually in H1. So this guy is definitely better than this guy. And so what we say, in this case, is that the test with the dashed line is uniformly more powerful than the other test. But we're not going to go into those details, because, basically, all the tests that we will describe are already the most powerful ones. In particular, for this guy, there's no such thing -- all the other guys you can come up with are going to actually be below.

So we saw a couple of tests, then we saw how to pick this threshold, and we defined those two things.

AUDIENCE: Question.
PHILIPPE RIGOLLET: Yes?

AUDIENCE: But in that case, the dashed line -- if it were also higher in the region of Theta 0, would you still consider it better?

PHILIPPE RIGOLLET: Yeah.

AUDIENCE: OK.

PHILIPPE RIGOLLET: Because you're given this budget of 5%. So in this paradigm where you're given the -- actually, if the dashed line were this dashed line, I would still be happy. I mean, I don't care what this thing does here, as long as it's below 5%. But here, I'm going to try to discover. Think about, again, the drug discovery example. You're trying to find -- let's say you're a scientist and you're trying to prove that your drug works. What do you want to see? Well, the FDA puts on you this constraint that your probability of type I error should never exceed 5%. You're going to work under this assumption. But what you're going to do is, you're going to try to find a test that will make you find something as often as possible. And so you're going to max out this constraint of 5%.
And then you're going to try to make this curve as high as possible. That means -- this number here, for any point here, is the probability that you publish your paper. That's the probability that you can release your drug to market. That's the probability that it works. And so you want this curve to be as high as possible. You want to make sure that if there's evidence in the data that H1 is the truth, you squeeze out as much of this evidence as possible. And the test that has the highest possible curve is the most powerful one.

Now, you have to also understand that having two curves where one is on top of the other completely, everywhere, is a rare phenomenon. It's not always the case that there is a test that's uniformly more powerful than any other test. It might be that you have some trade-off -- it might be better here, but then you're losing power there. I mean, things like this. Well, actually, maybe it should not go down. But let's say it goes like this, and then, maybe, this guy goes like this.
Then you have to, basically, make an educated guess whether you think that the theta you're going to find is here or is here, and then you pick your test.

Any other questions? Yes?

AUDIENCE: Can you explain the green curve again? That's just the type II error?

PHILIPPE RIGOLLET: So the green curve is -- exactly. So that's beta psi of theta. So it's really the type II error. And it's defined only here, on Theta 1. So here, it's not a definition; I'm really just mapping it to this point. So it's defined only here, and it's the probability of type II error.

So here, it's pretty large. I'm making it basically as large as it could be, because I'm at the boundary. And that means, at the boundary, since the status quo is H0, I'm always going to go for H0 if I don't have any evidence, which means that what's going to pay for it is the type II error -- that's basically what pays for this.

Any other questions?

So let's move on. So, did we do this? No, I think we stopped here, right? I didn't cover that part.
So as I said, in this paradigm, we're going to actually fix this guy to be something. And this thing is actually called the level of the test. I'm sorry, this is, again, more words. Actually, the good news is that we split this into two lectures. So we have: what is a test? What is a hypothesis? What is the null? What is the alternative? What is the type I error? What is the type II error? And now, I'm telling you there's another thing.

So we defined the power, which was some sort of a lower bound on the -- or it's 1 minus the upper bound on the type II error, basically. And so the power is the smallest probability of rejecting when you're in the alternative, when you're in Theta 1. So that's my power: I looked here, and I looked at the smallest value. And I can look at this side and say, well, what is the largest probability that I make a type I error? And this largest probability is the level of the test. So this is alpha, equal, by definition, to the maximum for theta in Theta 0 of alpha psi of theta.
So here, I just put the level itself. As you can see, it essentially says that if I'm of level 5%, I'm also of level 10%, and I'm also of level 15%. So here, it's really an upper bound. Whatever you guys want to take, this is what it is. But as we said, if this number is 4.5%, you're losing in your type II error. So if you're allowed to have -- if this maximum here is 4.5% and the FDA told you you can go to 5%, you're losing in your type II error. So you actually want to make sure that this is exactly the 5% that's given to you. So the way it works is that you give me the alpha, then I'm going to go back and pick c, which depends on alpha here, so that this thing is actually equal to 5%.

And of course, in many instances, we do not know how to compute the probability of type I error. This level is the maximum value of the probability of type I error, and we don't know how to compute it. I mean, it might be a very complicated random variable. Maybe it's a weird binomial -- we could compute it, but it would be painful. But what we do know how to compute is its asymptotic value.
413 00:22:21,960 --> 00:22:24,330 Just because of the central limit theorem, convergence 414 00:22:24,330 --> 00:22:28,020 and distribution tells me that the probability of type I error 415 00:22:28,020 --> 00:22:30,650 is basically going towards the probability 416 00:22:30,650 --> 00:22:33,110 that some Gaussian is in some region. 417 00:22:33,110 --> 00:22:36,240 And so we're going to compute, not the level itself, 418 00:22:36,240 --> 00:22:37,500 but the asymptotic level. 419 00:22:43,700 --> 00:22:48,320 And that's basically the limit as n 420 00:22:48,320 --> 00:22:56,830 goes to infinity of alpha psi of theta. 421 00:22:56,830 --> 00:22:58,540 And then I'm going to make the max here. 422 00:23:06,300 --> 00:23:08,240 So how am I going to compute this? 423 00:23:08,240 --> 00:23:13,440 Well, if I take a test that has rejection region of the form 424 00:23:13,440 --> 00:23:14,910 tn-- 425 00:23:14,910 --> 00:23:17,970 because it depends on the data, that's tn of x1 xn-- 426 00:23:17,970 --> 00:23:23,440 my observation's larger than some number c. 427 00:23:23,440 --> 00:23:26,430 Of course, I can almost always write 428 00:23:26,430 --> 00:23:28,402 tests like that, except that sometimes, 429 00:23:28,402 --> 00:23:30,860 it's going to be an absolute value, which essentially means 430 00:23:30,860 --> 00:23:32,552 I'm going away from some value. 431 00:23:32,552 --> 00:23:34,260 Maybe, actually, I'm less than something, 432 00:23:34,260 --> 00:23:37,090 but I can always put a negative sign in front of everything. 433 00:23:37,090 --> 00:23:39,780 So this is not without much of generality. 434 00:23:39,780 --> 00:23:47,890 So this includes something that looks like-- 435 00:23:51,520 --> 00:23:56,290 something is larger than the constants, so that means-- 436 00:23:56,290 --> 00:24:02,330 which is equivalent to-- well, let me write that as tq, 437 00:24:02,330 --> 00:24:05,510 because then that means that-- 438 00:24:05,510 --> 00:24:07,940 so that's tn. 
439 00:24:07,940 --> 00:24:10,040 But this actually encompasses the fact 440 00:24:10,040 --> 00:24:21,370 that qn is larger than c or qn is less than minus c. 441 00:24:21,370 --> 00:24:22,480 So that includes this guy. 442 00:24:22,480 --> 00:24:26,320 That also includes qn less than c, 443 00:24:26,320 --> 00:24:32,840 because this is equivalent to minus qn is larger than minus c. 444 00:24:32,840 --> 00:24:33,830 And minus qn is-- 445 00:24:33,830 --> 00:24:35,240 and so that's going to be my tn. 446 00:24:37,810 --> 00:24:42,430 So I can actually encode several types of things-- 447 00:24:42,430 --> 00:24:44,230 rejection regions. 448 00:24:44,230 --> 00:24:47,020 So here, in this case, I have a rejection region 449 00:24:47,020 --> 00:24:50,110 that looks like this, or a rejection region 450 00:24:50,110 --> 00:24:53,380 that looks like this, or a rejection 451 00:24:53,380 --> 00:24:54,550 region that looks like this. 452 00:24:57,209 --> 00:24:58,750 And here, I don't really represent it 453 00:24:58,750 --> 00:25:02,420 for the whole data, but maybe for the average, for example, 454 00:25:02,420 --> 00:25:04,018 or the normalized average. 455 00:25:17,950 --> 00:25:23,950 So if I write this, then-- 456 00:25:23,950 --> 00:25:25,470 yeah. 457 00:25:25,470 --> 00:25:32,970 And in this case, this tn that shows up 458 00:25:32,970 --> 00:25:35,168 is called the test statistic. 459 00:25:41,460 --> 00:25:43,730 I mean, this is not set in stone. 460 00:25:43,730 --> 00:25:46,930 Here, for example, q could be the test statistic. 461 00:25:46,930 --> 00:25:48,640 It doesn't have to be minus q itself 462 00:25:48,640 --> 00:25:50,750 that's the test statistic. 463 00:25:50,750 --> 00:25:52,000 So what is the test statistic? 464 00:25:52,000 --> 00:25:55,170 Well, it's what you're going to build from your data 465 00:25:55,170 --> 00:25:57,790 and then compare to some fixed value.
466 00:25:57,790 --> 00:26:01,167 So in the example we had here, what is our test statistic? 467 00:26:01,167 --> 00:26:02,000 Well, it's this guy. 468 00:26:05,620 --> 00:26:09,200 This was our test statistic. 469 00:26:09,200 --> 00:26:12,830 And is this thing a statistic? 470 00:26:12,830 --> 00:26:14,510 What are the criteria for a statistic? 471 00:26:14,510 --> 00:26:15,810 What is a statistic? 472 00:26:21,046 --> 00:26:23,633 I know you know the answer. 473 00:26:23,633 --> 00:26:25,050 AUDIENCE: Measurable function. 474 00:26:25,050 --> 00:26:26,080 PHILIPPE RIGOLLET: Yeah, it's a measurable function 475 00:26:26,080 --> 00:26:29,064 of the data that does not depend on the parameter. 476 00:26:29,064 --> 00:26:32,995 Is this guy a statistic? 477 00:26:32,995 --> 00:26:33,981 AUDIENCE: It's not. 478 00:26:35,949 --> 00:26:37,490 PHILIPPE RIGOLLET: Let's think again. 479 00:26:40,490 --> 00:26:45,360 When I implemented the test, what did I do? 480 00:26:45,360 --> 00:26:47,490 I was able to compute my test. 481 00:26:47,490 --> 00:26:49,800 My test did not depend on some unknown parameter. 482 00:26:49,800 --> 00:26:52,640 How did we do it? 483 00:26:52,640 --> 00:26:57,187 We just plugged in 0.5 here, remember? 484 00:26:57,187 --> 00:26:59,020 That was the value for which we computed it, 485 00:26:59,020 --> 00:27:02,150 because under h0, that was the value we're seeing. 486 00:27:02,150 --> 00:27:05,910 And if theta 0 is actually an entire set, 487 00:27:05,910 --> 00:27:09,935 I'm just going to take the value that's the closest to h1. 488 00:27:09,935 --> 00:27:11,060 We'll see that in a second. 489 00:27:11,060 --> 00:27:13,400 I mean, I did not guarantee that to you. 490 00:27:13,400 --> 00:27:18,950 But just taking the worst type I error and bounding it by alpha 491 00:27:18,950 --> 00:27:22,310 is equivalent to taking p and taking the value of p that's 492 00:27:22,310 --> 00:27:26,836 the closest to theta 1, which is completely intuitive.
493 00:27:26,836 --> 00:27:29,460 The worst type I error is going to be attained for the p that's 494 00:27:29,460 --> 00:27:32,260 the closest to the alternative. 495 00:27:32,260 --> 00:27:36,510 So even if the null is actually just an entire set, 496 00:27:36,510 --> 00:27:38,940 it's as if it was just the point that's 497 00:27:38,940 --> 00:27:41,840 the closest to the alternative. 498 00:27:41,840 --> 00:27:44,520 So now we can compute this, because there's 499 00:27:44,520 --> 00:27:46,440 no unknown parameter that shows up. 500 00:27:46,440 --> 00:27:48,210 We replace p by 0.5. 501 00:27:48,210 --> 00:27:50,406 And so that was our test statistic. 502 00:27:53,952 --> 00:27:55,410 So when you're building a test, you 503 00:27:55,410 --> 00:27:58,000 want to first build a test statistic, 504 00:27:58,000 --> 00:28:01,230 and then see what threshold you should be getting. 505 00:28:01,230 --> 00:28:08,640 So now, let's go back to our example where we want to have-- 506 00:28:08,640 --> 00:28:16,340 we have x1 xn, they're IID Bernoulli p. 507 00:28:16,340 --> 00:28:25,050 And I want to test if p is 1/2 versus p not equal to 1/2, 508 00:28:25,050 --> 00:28:27,630 which, as I said, is what you want to do if you 509 00:28:27,630 --> 00:28:33,560 want to test if a coin is fair. 510 00:28:33,560 --> 00:28:36,560 And so here, I'm going to build a test statistic. 511 00:28:36,560 --> 00:28:39,180 And we concluded last time that-- 512 00:28:39,180 --> 00:28:41,790 what do we want for this statistic? 513 00:28:41,790 --> 00:28:44,700 We want it to have a distribution which, 514 00:28:44,700 --> 00:28:49,820 under the null, does not depend on the parameters, 515 00:28:49,820 --> 00:28:54,650 a distribution that I can actually compute quantiles of.
516 00:28:54,650 --> 00:28:56,150 So what we did is, we said, well, 517 00:28:56,150 --> 00:28:59,210 if I look at-- the central limit theorem tells me that square 518 00:28:59,210 --> 00:29:03,500 root of n xn bar minus p divided by-- 519 00:29:03,500 --> 00:29:06,871 so if I do central limit theorem plus Slutsky, for example, 520 00:29:06,871 --> 00:29:08,120 I'm going to have square root. 521 00:29:12,006 --> 00:29:13,880 And we've had this discussion whether we want 522 00:29:13,880 --> 00:29:15,004 to use Slutsky or not here. 523 00:29:15,004 --> 00:29:17,900 But let's assume we're taking Slutsky wherever we can. 524 00:29:17,900 --> 00:29:20,060 So this thing tells me that, by the central limit 525 00:29:20,060 --> 00:29:23,510 theorem, as n goes to infinity, this thing converges 526 00:29:23,510 --> 00:29:25,190 in distribution to some n01. 527 00:29:28,260 --> 00:29:31,550 Now, as we said, this guy is not something we know. 528 00:29:31,550 --> 00:29:34,020 But under the null, we actually know it. 529 00:29:34,020 --> 00:29:37,100 And we can actually replace it by 1/2. 530 00:29:37,100 --> 00:29:41,300 So this thing holds under h0. 531 00:29:41,300 --> 00:29:44,300 When I write under h0, it means when this is the truth. 532 00:29:47,120 --> 00:29:49,270 So now I have something that converges 533 00:29:49,270 --> 00:29:52,570 to something that has no dependence on anything I 534 00:29:52,570 --> 00:29:53,170 don't know. 535 00:29:53,170 --> 00:29:56,890 And in particular, if you have any statistics textbook, which 536 00:29:56,890 --> 00:29:59,260 you don't because I didn't require one-- 537 00:29:59,260 --> 00:30:04,444 and you should be thankful, because these things cost $350. 538 00:30:04,444 --> 00:30:05,860 Actually, if you look at the back, 539 00:30:05,860 --> 00:30:12,250 you actually have a table for a standard Gaussian. 540 00:30:12,250 --> 00:30:13,780 I could have anything else here. 
541 00:30:13,780 --> 00:30:15,760 I could have an exponential distribution. 542 00:30:15,760 --> 00:30:17,560 I could have a-- 543 00:30:17,560 --> 00:30:20,320 I don't know-- well, we'll see the chi squared 544 00:30:20,320 --> 00:30:22,030 distribution in a minute. 545 00:30:22,030 --> 00:30:24,040 Any distribution from which you can actually 546 00:30:24,040 --> 00:30:25,600 see a table that somebody actually 547 00:30:25,600 --> 00:30:27,516 computed this thing for which you can actually 548 00:30:27,516 --> 00:30:30,430 draw the pdf and start computing whatever probability you want 549 00:30:30,430 --> 00:30:32,410 on them, then this is what you want 550 00:30:32,410 --> 00:30:35,110 to see at the right-hand side. 551 00:30:35,110 --> 00:30:36,500 This is any distribution. 552 00:30:36,500 --> 00:30:38,380 It's called pivotal. 553 00:30:38,380 --> 00:30:39,880 I think we've mentioned that before. 554 00:30:39,880 --> 00:30:41,713 Pivotal means it does not depend on anything 555 00:30:41,713 --> 00:30:43,610 that you don't know. 556 00:30:43,610 --> 00:30:45,616 And maybe it's easy to compute those things. 557 00:30:45,616 --> 00:30:47,990 Probably, typically, you need a computer to simulate them 558 00:30:47,990 --> 00:30:50,985 for you because computing probabilities for Gaussians 559 00:30:50,985 --> 00:30:51,860 is not an easy thing. 560 00:30:51,860 --> 00:30:53,985 We don't know how to solve those integrals exactly, 561 00:30:53,985 --> 00:30:56,520 we have to do it numerically. 562 00:30:56,520 --> 00:31:08,420 So now I want to do this test. 563 00:31:08,420 --> 00:31:12,950 My test statistic will be declared to be what? 564 00:31:12,950 --> 00:31:17,740 Well, I'm going to reject if what 565 00:31:17,740 --> 00:31:18,970 is larger than some number? 566 00:31:24,140 --> 00:31:27,500 The absolute value of this guy. 
567 00:31:27,500 --> 00:31:29,550 So my test statistic is going to be 568 00:31:29,550 --> 00:31:35,790 square root of n times xn bar minus 0.5 divided by square root of xn 569 00:31:35,790 --> 00:31:38,240 bar 1 minus xn bar. 570 00:31:41,160 --> 00:31:43,470 That's my test statistic, absolute value of this guy, 571 00:31:43,470 --> 00:31:45,900 because I want to reject either when this guy is too large 572 00:31:45,900 --> 00:31:47,150 or when this guy is too small. 573 00:31:50,260 --> 00:31:51,760 I don't know ahead of time whether I'm going 574 00:31:51,760 --> 00:31:55,470 to see p larger than 1/2 or less than 1/2. 575 00:31:55,470 --> 00:31:59,190 So now I need to compute c such that the probability 576 00:31:59,190 --> 00:32:05,210 that tn is larger than c. 577 00:32:05,210 --> 00:32:11,290 So that's the probability under p, which is unknown. 578 00:32:11,290 --> 00:32:17,230 I want this probability to be less than some level alpha, 579 00:32:17,230 --> 00:32:18,410 asymptotically. 580 00:32:18,410 --> 00:32:24,740 So I want the limit of this guy to be less than alpha, 581 00:32:24,740 --> 00:32:26,810 and that's the level of my test. 582 00:32:26,810 --> 00:32:32,010 So that's the given level. 583 00:32:32,010 --> 00:32:33,720 So I want this thing to happen. 584 00:32:33,720 --> 00:32:35,280 Now, what I know is that this limit-- 585 00:32:38,090 --> 00:32:40,397 actually, I should say given asymptotic level. 586 00:32:48,520 --> 00:32:50,130 So what is this thing? 587 00:32:54,300 --> 00:33:00,600 Well, OK, that's the probability that something 588 00:33:00,600 --> 00:33:03,000 that looks like under p. 589 00:33:03,000 --> 00:33:05,520 So under p, this guy-- 590 00:33:05,520 --> 00:33:08,730 so what I know is that tn is square root of n 591 00:33:08,730 --> 00:33:15,490 times xn bar minus 0.5 divided by square root of xn bar 592 00:33:15,490 --> 00:33:18,700 1 minus xn bar exceeds c.
593 00:33:23,770 --> 00:33:26,882 Is this true that as n to infinity, 594 00:33:26,882 --> 00:33:28,840 this probability is the same as the probability 595 00:33:28,840 --> 00:33:30,460 that the absolute value of a Gaussian 596 00:33:30,460 --> 00:33:33,610 exceeds c of a standard Gaussian? 597 00:33:33,610 --> 00:33:34,351 Is this true? 598 00:33:37,281 --> 00:33:39,530 AUDIENCE: The absolute value of the standard Gaussian. 599 00:33:39,530 --> 00:33:41,113 PHILIPPE RIGOLLET: Yeah, the absolute. 600 00:33:41,113 --> 00:33:43,830 So you're saying that this, as n becomes large enough, this 601 00:33:43,830 --> 00:33:48,500 should be the probability that some absolute value of n01 602 00:33:48,500 --> 00:33:49,898 exceeds c, right? 603 00:33:49,898 --> 00:33:51,990 AUDIENCE: Yes. 604 00:33:51,990 --> 00:33:54,780 PHILIPPE RIGOLLET: So I claim that this is not correct. 605 00:33:54,780 --> 00:33:56,077 Somebody tell me why. 606 00:33:56,077 --> 00:33:57,360 AUDIENCE: Even in the limit it's not correct? 607 00:33:57,360 --> 00:33:59,651 PHILIPPE RIGOLLET: Even in the limit, it's not correct. 608 00:34:03,164 --> 00:34:04,334 AUDIENCE: OK. 609 00:34:04,334 --> 00:34:05,917 PHILIPPE RIGOLLET: So what do you see? 610 00:34:05,917 --> 00:34:07,625 AUDIENCE: It's because, at the beginning, 611 00:34:07,625 --> 00:34:11,368 we picked the worst possible true parameter, 0.5. 612 00:34:11,368 --> 00:34:13,915 So we don't actually know that this 0.5 is the mean. 613 00:34:13,915 --> 00:34:15,040 PHILIPPE RIGOLLET: Exactly. 614 00:34:15,040 --> 00:34:19,500 So we pick this 0.5 here, but this is for any p. 615 00:34:19,500 --> 00:34:21,360 But what is the only p I can get? 616 00:34:21,360 --> 00:34:26,949 So what I want is that this is true for all p in theta 0. 617 00:34:26,949 --> 00:34:31,750 But the only p that's in theta 0 is actually p is equal to 0.5. 
618 00:34:31,750 --> 00:34:33,780 So yes, what you said was true, but it 619 00:34:33,780 --> 00:34:38,130 required to specify p to be equal to 0.5. 620 00:34:38,130 --> 00:34:40,409 So this, in general, is not true. 621 00:34:40,409 --> 00:34:47,909 But it happens to be true if p belongs to theta 0, which 622 00:34:47,909 --> 00:34:53,489 is strictly equivalent to p is equal to 0.5, 623 00:34:53,489 --> 00:34:59,320 because theta 0 is really just this one point, 0.5. 624 00:34:59,320 --> 00:35:01,820 So now, this becomes true. 625 00:35:01,820 --> 00:35:03,760 And so what I need to do is to find c such 626 00:35:03,760 --> 00:35:05,140 that this guy is equal to what? 627 00:35:11,650 --> 00:35:14,020 I mean, let's just follow. 628 00:35:14,020 --> 00:35:16,950 So I want this to be less than alpha. 629 00:35:16,950 --> 00:35:19,850 But then we said that this was equal to this, 630 00:35:19,850 --> 00:35:21,970 which is equal to this. 631 00:35:21,970 --> 00:35:24,960 So all I want is that this guy is less than alpha. 632 00:35:24,960 --> 00:35:28,230 But we said we might as well just make it equal to alpha 633 00:35:28,230 --> 00:35:30,540 if you allow me to make it as big as I want, 634 00:35:30,540 --> 00:35:32,050 as long as it's less than alpha. 635 00:35:32,050 --> 00:35:33,790 AUDIENCE: So this is a true statement. 636 00:35:33,790 --> 00:35:35,748 PHILIPPE RIGOLLET: So this is a true statement. 637 00:35:35,748 --> 00:35:38,354 But it's under this condition. 638 00:35:38,354 --> 00:35:39,104 AUDIENCE: Exactly. 639 00:35:43,010 --> 00:35:48,680 PHILIPPE RIGOLLET: So I'm going to set it equal to alpha, 640 00:35:48,680 --> 00:35:52,310 and then I'm going to try to solve for c. 
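The claim just made — set the limit equal to alpha, solve for c, and the test has asymptotic level alpha — can be sanity-checked by simulation. A sketch under hypothetical settings (n = 500 tosses per experiment, 5000 repeated experiments, a fair coin under h0, and taking for granted the standard fact that 1.96 is the two-sided Gaussian threshold at 5%):

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so this sketch is reproducible

def rejects(n, c=1.96):
    """Simulate n fair-coin tosses under h0 and apply the test |Tn| > c."""
    xbar = sum(random.random() < 0.5 for _ in range(n)) / n
    tn = sqrt(n) * (xbar - 0.5) / sqrt(xbar * (1 - xbar))
    return abs(tn) > c

trials = 5000  # hypothetical number of repeated experiments
rate = sum(rejects(500) for _ in range(trials)) / trials
print(rate)  # empirical type I error: hovers around the asymptotic level 0.05
```

The empirical rejection rate fluctuates around 5%, as the central limit theorem promises for large n.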
641 00:36:10,390 --> 00:36:13,540 So what I'm looking for is a c such that 642 00:36:13,540 --> 00:36:17,390 if I draw a standard Gaussian-- 643 00:36:17,390 --> 00:36:20,530 so that's the pdf of some n01-- 644 00:36:20,530 --> 00:36:23,200 I want the probability of the absolute value of my Gaussian 645 00:36:23,200 --> 00:36:25,630 exceeding this guy-- 646 00:36:25,630 --> 00:36:29,350 so that means being either here or here. 647 00:36:29,350 --> 00:36:31,220 So that's minus c and c. 648 00:36:31,220 --> 00:36:36,200 I want the sum of those two things to be equal to alpha. 649 00:36:36,200 --> 00:36:53,570 So I want the sum of these areas to equal alpha. 650 00:36:53,570 --> 00:36:56,240 So by symmetry, each of them should 651 00:36:56,240 --> 00:36:58,190 be equal to alpha over 2. 652 00:37:02,710 --> 00:37:08,310 And so what I'm looking for is c such that the probability 653 00:37:08,310 --> 00:37:15,410 that my n01 exceeds c, which is just this area to the right, 654 00:37:15,410 --> 00:37:20,830 now, equals alpha over 2, which is equivalent to taking c 655 00:37:20,830 --> 00:37:26,020 equal to q alpha over 2, and that's q alpha over 2 656 00:37:26,020 --> 00:37:28,240 by definition of q alpha over 2. 657 00:37:28,240 --> 00:37:30,370 That's just what q alpha over 2 is. 658 00:37:30,370 --> 00:37:34,420 And that's what the tables at the back of the book give you. 659 00:37:34,420 --> 00:37:42,400 Who has already seen a table for Gaussian probabilities? 660 00:37:42,400 --> 00:37:44,200 What it does, it's just a table. 661 00:37:44,200 --> 00:37:45,696 I mean, it's pretty ancient. 662 00:37:45,696 --> 00:37:47,320 I mean, of course, you can actually ask 663 00:37:47,320 --> 00:37:49,240 Google to do it for you now. 664 00:37:49,240 --> 00:37:52,180 I mean, it's basically standard issue. 665 00:37:52,180 --> 00:37:56,110 But back in the day, they actually had to look at tables. 
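Nowadays the table lookup is one library call. A minimal sketch using Python's standard library to recover the quantile q alpha over 2 just defined:

```python
from statistics import NormalDist

def q(u):
    """Gaussian quantile: the value a standard normal exceeds with probability u."""
    return NormalDist().inv_cdf(1 - u)

alpha = 0.05
print(round(q(alpha / 2), 2))  # 1.96, the two-sided threshold at level 5%
```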
666 00:37:56,110 --> 00:37:59,140 And since the values alphas were pretty standard, 667 00:37:59,140 --> 00:38:01,510 the values alpha that people were requesting 668 00:38:01,510 --> 00:38:04,810 were typically 1%, 5%, 10%, all you 669 00:38:04,810 --> 00:38:07,000 could do is to compute these different values 670 00:38:07,000 --> 00:38:08,560 for different values of alpha. 671 00:38:08,560 --> 00:38:10,390 That was it. 672 00:38:10,390 --> 00:38:13,450 So there's really not much to give you. 673 00:38:13,450 --> 00:38:15,750 So for the Gaussian, I can tell you 674 00:38:15,750 --> 00:38:20,210 that alpha is equal to-- if alpha is equal to 5%, 675 00:38:20,210 --> 00:38:27,100 then q alpha over 2, q 2.5% is equal to 1.96, for example. 676 00:38:27,100 --> 00:38:28,840 So those are just fixed numbers that 677 00:38:28,840 --> 00:38:31,030 are functions of the Gaussian. 678 00:38:31,030 --> 00:38:32,410 So everybody agrees? 679 00:38:32,410 --> 00:38:37,443 We've done that before for our confidence intervals. 680 00:38:40,350 --> 00:38:42,040 And so now we know that if I actually 681 00:38:42,040 --> 00:38:48,460 plug in this guy to be q alpha over 2, then 682 00:38:48,460 --> 00:38:51,430 this limit is actually equal to alpha. 683 00:38:51,430 --> 00:38:53,247 And so now I've actually constrained this. 684 00:39:01,040 --> 00:39:07,800 So q alpha over 2 here for alpha equals 5%, as I said, is 1.96. 685 00:39:07,800 --> 00:39:13,790 So in the example 1, the number that we found was 3.54, 686 00:39:13,790 --> 00:39:18,800 I think, or something like that, 3.55 for t. 687 00:39:18,800 --> 00:39:29,290 So if we scroll back very quickly, 3.45-- 688 00:39:29,290 --> 00:39:30,980 that was example 1. 689 00:39:30,980 --> 00:39:33,770 Example two-- negative 0.77. 
690 00:39:33,770 --> 00:39:40,970 So if I look at tn in example 1, tn 691 00:39:40,970 --> 00:39:46,040 was just the absolute value of 3.45, which-- 692 00:39:46,040 --> 00:39:50,390 don't pull out your calculators-- is equal to 3.45. 693 00:39:50,390 --> 00:39:54,500 Example 2, absolute value of negative 0.77 694 00:39:54,500 --> 00:39:57,050 was equal to 0.77. 695 00:39:57,050 --> 00:39:59,450 And so all I need to check is, is this number 696 00:39:59,450 --> 00:40:01,610 larger or smaller than 1.96? 697 00:40:01,610 --> 00:40:06,530 That's what my test ends up being. 698 00:40:06,530 --> 00:40:12,860 So in example 1, 3.45 being larger 699 00:40:12,860 --> 00:40:18,885 than 1.96, that means that I reject 700 00:40:18,885 --> 00:40:22,790 fairness of my coin. In example 2, 701 00:40:22,790 --> 00:40:27,230 0.77 being smaller than 1.96-- 702 00:40:27,230 --> 00:40:29,370 what do I do? 703 00:40:29,370 --> 00:40:30,270 I fail to reject. 704 00:40:44,084 --> 00:40:45,000 So here is a question. 705 00:40:47,730 --> 00:40:54,270 In example 1, for what level alpha would psi alpha-- 706 00:40:57,530 --> 00:41:00,090 OK, so here, what's going to happen 707 00:41:00,090 --> 00:41:04,350 if I start decreasing my level? 708 00:41:04,350 --> 00:41:07,020 When I decrease my level, I'm actually 709 00:41:07,020 --> 00:41:09,100 making this area smaller and smaller, 710 00:41:09,100 --> 00:41:13,360 which means that I push this c to the right. 711 00:41:13,360 --> 00:41:17,080 So now I'm asking, what is the smallest c 712 00:41:17,080 --> 00:41:22,360 I should pick so that now, I actually do not reject h0? 713 00:41:22,360 --> 00:41:29,642 What is the smallest c I should be taking here? 714 00:41:29,642 --> 00:41:30,600 What is the smallest c? 715 00:41:37,520 --> 00:41:43,300 So c here, in the example I gave you for 5%, was 1.96. 
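The decision rule just applied is a one-line comparison. A sketch, using the two observed statistics from the lecture's examples (3.45 and 0.77) and the 5% threshold 1.96:

```python
def decide(tn_abs, c=1.96):
    """Golden rule: reject h0 iff the observed |Tn| exceeds the threshold c."""
    return "reject h0" if tn_abs > c else "fail to reject h0"

print(decide(3.45))  # reject h0 (example 1: the coin is declared unfair)
print(decide(0.77))  # fail to reject h0 (example 2)
```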
716 00:41:43,300 --> 00:41:49,330 What is the smallest c I should be taking so that now, 717 00:41:49,330 --> 00:41:50,730 this inequality is reversed? 718 00:41:54,980 --> 00:41:55,990 3.45. 719 00:41:55,990 --> 00:41:58,900 I ask only trivial questions, don't be worried. 720 00:41:58,900 --> 00:42:02,890 So 3.45 is the smallest c that I'm actually 721 00:42:02,890 --> 00:42:04,870 willing to tolerate. 722 00:42:04,870 --> 00:42:07,870 So let's say this was my 5%. 723 00:42:07,870 --> 00:42:09,730 If this was 2.5-- 724 00:42:09,730 --> 00:42:11,230 if here, let's say, in this picture, 725 00:42:11,230 --> 00:42:16,490 alpha is 5%, that means maybe I need to push here. 726 00:42:16,490 --> 00:42:18,150 And this number should be what? 727 00:42:18,150 --> 00:42:20,580 So this is going to be 1.96. 728 00:42:20,580 --> 00:42:26,460 And this number here is going to be 3.45, clearly to scale. 729 00:42:26,460 --> 00:42:30,330 And so now, what I want to ask you is, 730 00:42:30,330 --> 00:42:33,570 well, there's two ways I can understand this number 3.45. 731 00:42:33,570 --> 00:42:36,120 It is the number 3.45, but I can also 732 00:42:36,120 --> 00:42:40,340 try to understand what is the area to the right of this guy. 733 00:42:40,340 --> 00:42:42,780 And if I understand what the area to the right of this guy 734 00:42:42,780 --> 00:42:47,700 is, this is actually some alpha prime over 2. 735 00:42:47,700 --> 00:42:49,560 And that means that if I actually 736 00:42:49,560 --> 00:42:53,840 fix this level alpha prime, that would 737 00:42:53,840 --> 00:42:57,260 be exactly the tipping point at which I would 738 00:42:57,260 --> 00:43:01,420 go from accepting to rejecting. 739 00:43:01,420 --> 00:43:04,390 So I knew, in terms of absolute thresholds, 740 00:43:04,390 --> 00:43:07,300 3.45 is the trivial answer to the question. 741 00:43:07,300 --> 00:43:09,040 That's the tipping point, because I'm 742 00:43:09,040 --> 00:43:11,350 comparing a number to 3.45. 
743 00:43:11,350 --> 00:43:13,330 But now, if I try to map this back 744 00:43:13,330 --> 00:43:16,570 and understand what level would have been giving me 745 00:43:16,570 --> 00:43:18,910 this particular tipping point, that's 746 00:43:18,910 --> 00:43:21,430 a number between 0 and 1. 747 00:43:21,430 --> 00:43:25,830 The smaller the number, the larger this number here, 748 00:43:25,830 --> 00:43:28,280 which means that the more evidence I have in my data 749 00:43:28,280 --> 00:43:30,990 against h0. 750 00:43:30,990 --> 00:43:36,140 And so this number is actually something called the p-value. 751 00:43:36,140 --> 00:43:38,640 And so saying, for example 2, there's 752 00:43:38,640 --> 00:43:40,880 the tipping point alpha at which I 753 00:43:40,880 --> 00:43:44,150 go from failing to reject to rejecting. 754 00:43:44,150 --> 00:43:47,990 And that's exactly the number, the area under the curve, 755 00:43:47,990 --> 00:43:53,990 such that here, I see 0.77. 756 00:43:53,990 --> 00:43:56,710 And this is this alpha prime prime over 2. 757 00:43:59,660 --> 00:44:04,090 Alpha prime prime is clearly larger than 5%. 758 00:44:04,090 --> 00:44:06,630 So what's the advantage of thinking and mapping back 759 00:44:06,630 --> 00:44:08,170 these numbers? 760 00:44:08,170 --> 00:44:11,790 Well, now, I'm actually going to spit out some number which 761 00:44:11,790 --> 00:44:12,900 is between 0 and 1. 762 00:44:12,900 --> 00:44:18,827 And that should be the only scale you should have in mind. 763 00:44:18,827 --> 00:44:20,410 Remember, we discussed that last time. 764 00:44:20,410 --> 00:44:22,750 I was like, well, if I actually spit out 765 00:44:22,750 --> 00:44:26,200 a number which is 3.45, maybe you can try to think, 766 00:44:26,200 --> 00:44:29,230 is 3.45 a large number for a Gaussian? 767 00:44:29,230 --> 00:44:29,897 That's a number. 
768 00:44:29,897 --> 00:44:32,355 But if I had another random variable that was not Gaussian, 769 00:44:32,355 --> 00:44:33,910 maybe it was a double exponential, 770 00:44:33,910 --> 00:44:36,220 you would have to have another scale in your mind. 771 00:44:36,220 --> 00:44:42,880 Is 3.45 so large that it's unlikely for it 772 00:44:42,880 --> 00:44:44,680 to come from a double exponential? 773 00:44:44,680 --> 00:44:46,300 If I had a gamma distribution-- 774 00:44:46,300 --> 00:44:48,508 I can think of any distribution, and then that means, 775 00:44:48,508 --> 00:44:51,040 for each distribution, you would have to have a scale in mind. 776 00:44:51,040 --> 00:44:53,290 So of course, you can have the Gaussian scale in mind. 777 00:44:53,290 --> 00:44:55,270 I mean, I have the Gaussian scale in mind. 778 00:44:55,270 --> 00:44:59,740 But then, if I map it back into this number between 0 and 1, 779 00:44:59,740 --> 00:45:02,260 all the distributions play the same role. 780 00:45:02,260 --> 00:45:05,920 So whether my limiting distribution is 781 00:45:05,920 --> 00:45:09,874 normal or exponential or gamma, or whatever you want, 782 00:45:09,874 --> 00:45:11,290 for all these guys, I'm just going 783 00:45:11,290 --> 00:45:13,450 to map it into one number between 0 and 1. 784 00:45:13,450 --> 00:45:16,210 A small number means lots of evidence against h0. 785 00:45:16,210 --> 00:45:21,040 A large number means very little evidence 786 00:45:21,040 --> 00:45:25,210 against h0. 787 00:45:25,210 --> 00:45:27,800 And this is the only number you need to keep in mind. 788 00:45:27,800 --> 00:45:29,710 And the question is, am I willing 789 00:45:29,710 --> 00:45:34,570 to tolerate this number between 5%, 6%, or maybe 10%, 12%? 790 00:45:34,570 --> 00:45:37,720 And this is the only scale you have to have in mind. 791 00:45:37,720 --> 00:45:41,030 And this scale is the scale of p-values. 
792 00:45:41,030 --> 00:45:48,120 So the p-value is the tipping point in terms of alpha. 793 00:45:48,120 --> 00:45:52,050 In words, I can make it formal, because tipping point, 794 00:45:52,050 --> 00:45:54,510 as far as I know, is not a mathematical term. 795 00:45:54,510 --> 00:45:58,950 So a p-value of a test is the smallest, 796 00:45:58,950 --> 00:46:01,740 potentially asymptotic, level if I talk about an asymptotic 797 00:46:01,740 --> 00:46:02,910 p-value-- 798 00:46:02,910 --> 00:46:05,520 and that's what we do when we talk about the central limit theorem-- 799 00:46:05,520 --> 00:46:09,410 at which the test rejects h0. 800 00:46:09,410 --> 00:46:10,820 If I were to go any smaller-- 801 00:46:14,640 --> 00:46:17,750 sorry, it's the smallest level-- 802 00:46:17,750 --> 00:46:19,360 yeah, if I were to go any smaller, 803 00:46:19,360 --> 00:46:21,250 I would fail to reject. 804 00:46:21,250 --> 00:46:25,200 The smaller the level, the less likely it is for me to reject. 805 00:46:25,200 --> 00:46:26,710 And if I were to go any smaller, I 806 00:46:26,710 --> 00:46:31,010 would start failing to reject. 807 00:46:31,010 --> 00:46:33,210 And so it is a random number. 808 00:46:33,210 --> 00:46:35,790 It depends on what I actually observe. 809 00:46:35,790 --> 00:46:39,240 So here, of course, I instantiated those two numbers, 810 00:46:39,240 --> 00:46:44,202 3.45 and 0.77, as realizations of random variables. 811 00:46:44,202 --> 00:46:46,410 But if you think of those as being the random numbers 812 00:46:46,410 --> 00:46:50,190 before I see my data, this was a random number, 813 00:46:50,190 --> 00:46:53,550 and therefore, the area under the curve to the right of it 814 00:46:53,550 --> 00:46:55,830 is also a random area. 815 00:46:55,830 --> 00:46:58,920 If this thing fluctuates, then the area under the curve 816 00:46:58,920 --> 00:47:00,090 fluctuates. 817 00:47:00,090 --> 00:47:02,150 And that's what the p-value is. 
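The tipping point can be computed rather than read off a curve: the two-sided asymptotic p-value is the Gaussian area beyond the observed |Tn| on both sides. A sketch applied to the lecture's two realizations:

```python
from statistics import NormalDist

def p_value(tn_abs):
    """Two-sided asymptotic p-value: P(|N(0,1)| > |Tn|), the tipping-point level."""
    return 2 * (1 - NormalDist().cdf(tn_abs))

print(p_value(3.45))  # far below 0.05: reject at the 5% level
print(p_value(0.77))  # far above 0.05: fail to reject
```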
818 00:47:02,150 --> 00:47:05,950 That's what-- what is his name? 819 00:47:05,950 --> 00:47:06,780 I forget. 820 00:47:06,780 --> 00:47:10,470 John Oliver talks about when he talks about p-hacking. 821 00:47:10,470 --> 00:47:14,150 And so we talked about this in the first lecture. 822 00:47:14,150 --> 00:47:18,194 So p-hacking is, how do I do-- oh, if I'm a scientist, 823 00:47:18,194 --> 00:47:20,360 do I want to see a small p-value or a large p-value? 824 00:47:20,360 --> 00:47:21,027 AUDIENCE: Small. 825 00:47:21,027 --> 00:47:22,359 PHILIPPE RIGOLLET: Small, right? 826 00:47:22,359 --> 00:47:24,960 Scientists want to see small p-values because small p-values 827 00:47:24,960 --> 00:47:28,020 equals rejecting, which equals discovery, 828 00:47:28,020 --> 00:47:31,230 which equals publications, which equals promotion. 829 00:47:31,230 --> 00:47:34,550 So that's what people want to see. 830 00:47:34,550 --> 00:47:37,970 So people are tempted to see small p-values. 831 00:47:37,970 --> 00:47:41,660 And what's called p-hacking is, well, find a way to cheat. 832 00:47:41,660 --> 00:47:44,450 Maybe look at your data, formulate your hypothesis 833 00:47:44,450 --> 00:47:49,840 in such a way that you will actually have a smaller 834 00:47:49,840 --> 00:47:51,375 p-value than you should have. 835 00:47:51,375 --> 00:47:53,000 So here, for example, there's one thing 836 00:47:53,000 --> 00:47:54,958 I did not insist on because, again, this is not 837 00:47:54,958 --> 00:47:57,200 a particular course on statistical thinking, 838 00:47:57,200 --> 00:47:59,420 but one thing that we implicitly did 839 00:47:59,420 --> 00:48:04,750 was set those theta 0 and theta 1 ahead of time. 840 00:48:04,750 --> 00:48:08,378 I fixed them, and I'm trying to test this. 841 00:48:08,378 --> 00:48:11,280 This is to be contrasted with the following approach. 842 00:48:11,280 --> 00:48:13,350 I draw my data. 
843 00:48:13,350 --> 00:48:15,224 So I draw-- 844 00:48:15,224 --> 00:48:16,890 I run this experiment, which is probably 845 00:48:16,890 --> 00:48:18,840 going to get me a publication in nature. 846 00:48:18,840 --> 00:48:23,010 I'm trying to test if a coin is fair. 847 00:48:23,010 --> 00:48:24,760 And I draw my data, and I see that there's 848 00:48:24,760 --> 00:48:31,020 13 out of 30 of my observations that are heads. 849 00:48:31,020 --> 00:48:32,610 That means that, from this data, it 850 00:48:32,610 --> 00:48:36,850 looks like p is less than 1/2. 851 00:48:36,850 --> 00:48:38,560 So if I look at this data and then 852 00:48:38,560 --> 00:48:42,870 decide that my alternative is not p not equal to 1/2, 853 00:48:42,870 --> 00:48:47,300 but rather p less than 1/2, that's p-hacking. 854 00:48:47,300 --> 00:48:50,550 I'm actually making my p-value strictly smaller 855 00:48:50,550 --> 00:48:53,130 by first looking at the data, and then deciding what 856 00:48:53,130 --> 00:48:54,660 my alternative is going to be. 857 00:48:54,660 --> 00:48:58,770 And that's cheating, because all the things we did, 858 00:48:58,770 --> 00:49:02,970 we're assuming that this 0.5, or the alternative, 859 00:49:02,970 --> 00:49:05,240 was actually a fixed-- everything was deterministic. 860 00:49:05,240 --> 00:49:07,164 The only randomness came from the data. 861 00:49:07,164 --> 00:49:08,580 But if I start looking at the data 862 00:49:08,580 --> 00:49:11,130 and designing my experiment or my alternatives 863 00:49:11,130 --> 00:49:13,232 and null hypothesis based on the data, 864 00:49:13,232 --> 00:49:15,690 it's as if I started putting randomness all over the place. 865 00:49:15,690 --> 00:49:18,300 And then I cannot control it because I don't know how it 866 00:49:18,300 --> 00:49:22,960 just intermingles with each other. 867 00:49:22,960 --> 00:49:26,170 So that was for the John Oliver moment. 868 00:49:29,940 --> 00:49:32,340 So the p-value is nice. 
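The cheat described above can be quantified: if you pick the one-sided alternative after seeing which way the data leans, your p-value is exactly half of the honest two-sided one. A sketch with a hypothetical observed statistic of -1.8 (data leaning toward p less than 1/2):

```python
from statistics import NormalDist

phi = NormalDist().cdf

def p_two_sided(tn):
    """Honest p-value for the pre-registered alternative p != 1/2."""
    return 2 * (1 - phi(abs(tn)))

def p_one_sided(tn):
    """'Hacked' p-value: alternative p < 1/2 chosen after seeing the data."""
    return 1 - phi(abs(tn))

tn = -1.8  # hypothetical observed statistic
print(p_two_sided(tn))  # above 5%: the honest test fails to reject
print(p_one_sided(tn))  # below 5%: the hacked test 'discovers' an unfair coin
```

At this hypothetical value, the halving pushes the p-value across the 5% line, which is exactly why the practice is cheating.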
869 00:49:32,340 --> 00:49:35,500 So maybe I mentioned that, before, my wife 870 00:49:35,500 --> 00:49:36,610 works in market research. 871 00:49:36,610 --> 00:49:40,240 And maybe every two years, she seems 872 00:49:40,240 --> 00:49:42,370 to run into a statistician in the hallway, 873 00:49:42,370 --> 00:49:45,070 and she comes home and says, what is a p-value again? 874 00:49:45,070 --> 00:49:48,190 And for her, a p-value is just the number 875 00:49:48,190 --> 00:49:50,080 in an Excel spreadsheet. 876 00:49:50,080 --> 00:49:55,360 And actually, small equals good and large equals bad. 877 00:49:55,360 --> 00:49:57,820 And that's all she needs to know at this point. 878 00:49:57,820 --> 00:50:01,310 Actually, they do the job for her-- small is green, 879 00:50:01,310 --> 00:50:02,290 large is red. 880 00:50:02,290 --> 00:50:06,590 And so for her, a p-value is just green or red. 881 00:50:06,590 --> 00:50:08,740 But so what she's really implicitly doing 882 00:50:08,740 --> 00:50:12,980 with this color code is just applying the golden rule. 883 00:50:12,980 --> 00:50:16,210 What the statisticians do for her in the Excel spreadsheet 884 00:50:16,210 --> 00:50:18,760 is that they take the numbers for the p-values that 885 00:50:18,760 --> 00:50:20,440 are less than some fixed level. 886 00:50:20,440 --> 00:50:22,390 So depending on the field in which she works-- 887 00:50:22,390 --> 00:50:24,410 so she works for pharmaceutical companies-- 888 00:50:24,410 --> 00:50:26,560 so the p-values are typically compared-- 889 00:50:26,560 --> 00:50:31,030 the tests are usually performed at level 1%, rather than 5%. 890 00:50:31,030 --> 00:50:33,040 So 5% is maybe your gold standard 891 00:50:33,040 --> 00:50:36,550 if you're doing sociology or trying to-- 892 00:50:36,550 --> 00:50:39,970 I don't know-- release a new blueberry flavor 893 00:50:39,970 --> 00:50:40,990 for your toothpaste. 
894 00:50:40,990 --> 00:50:43,630 Something that's not going to change the life of people, 895 00:50:43,630 --> 00:50:45,040 maybe you're going to run at 5%. 896 00:50:45,040 --> 00:50:46,150 It's OK to make a mistake. 897 00:50:46,150 --> 00:50:47,858 See, people are just going to feel gross, 898 00:50:47,858 --> 00:50:50,770 but that's about it, whereas here, 899 00:50:50,770 --> 00:50:53,296 if you have this p-value which is less than 1%, 900 00:50:53,296 --> 00:50:55,420 it might be more important for some drug discovery, 901 00:50:55,420 --> 00:50:56,550 for example. 902 00:50:56,550 --> 00:50:59,810 And so let's say you run at 1%. 903 00:50:59,810 --> 00:51:02,470 And so what they do in this Excel spreadsheet is 904 00:51:02,470 --> 00:51:05,800 that all the numbers that are below 1% show up in green 905 00:51:05,800 --> 00:51:09,010 and all the numbers that are above 1% show up in red. 906 00:51:09,010 --> 00:51:09,790 And that's it. 907 00:51:09,790 --> 00:51:11,710 That's just applying the golden rule. 908 00:51:11,710 --> 00:51:13,640 If the number is green, reject. 909 00:51:13,640 --> 00:51:18,024 If the number is red, fail to reject. 910 00:51:18,024 --> 00:51:18,524 Yeah? 911 00:51:18,524 --> 00:51:20,512 AUDIENCE: So going back to 912 00:51:20,512 --> 00:51:23,991 the prior example where you 913 00:51:23,991 --> 00:51:26,476 want to cheat by looking at the data 914 00:51:26,476 --> 00:51:32,450 and then formulating, say, theta 1 to be p less than 1/2. 915 00:51:32,450 --> 00:51:33,914 PHILIPPE RIGOLLET: Yeah. 916 00:51:33,914 --> 00:51:38,306 AUDIENCE: So how would you achieve your goal 917 00:51:38,306 --> 00:51:40,804 by changing the theta-- 918 00:51:40,804 --> 00:51:42,470 PHILIPPE RIGOLLET: By achieving my goal, 919 00:51:42,470 --> 00:51:45,826 you mean leaving ethics aside, right? 920 00:51:45,826 --> 00:51:46,700 AUDIENCE: Yeah, yeah. 921 00:51:46,700 --> 00:51:47,930 PHILIPPE RIGOLLET: Ah, you want to be published.
922 00:51:47,930 --> 00:51:48,555 AUDIENCE: Yeah. 923 00:51:48,555 --> 00:51:54,410 PHILIPPE RIGOLLET: [LAUGHS] So let me teach you how, then. 924 00:51:54,410 --> 00:51:58,160 So well, here, what do you do? 925 00:51:58,160 --> 00:52:03,380 You want to-- at the end of the day, 926 00:52:03,380 --> 00:52:06,200 a test is only telling you whether you found evidence 927 00:52:06,200 --> 00:52:11,150 in your data that h1 was more likely than h0, basically. 928 00:52:11,150 --> 00:52:12,830 How do you make h1 more likely? 929 00:52:12,830 --> 00:52:18,500 Well, you just basically target h1 to be what it is-- 930 00:52:18,500 --> 00:52:21,740 what the data is going to make it more likely to be. 931 00:52:21,740 --> 00:52:26,210 So if, for example, I say h1 can be on both sides, 932 00:52:26,210 --> 00:52:29,177 then my data is going to have to take into account fluctuations 933 00:52:29,177 --> 00:52:31,760 on both sides, and I'm going to lose a factor of two somewhere 934 00:52:31,760 --> 00:52:33,690 because things are not symmetric. 935 00:52:33,690 --> 00:52:38,240 Here is the ultimate way of making this work. 936 00:52:38,240 --> 00:52:42,920 I'm going back to my example of flipping coins. 937 00:52:42,920 --> 00:52:45,790 And now, so here, what I did is, I said, 938 00:52:45,790 --> 00:52:54,690 oh, this number 0.43 is actually smaller than 0.5, 939 00:52:54,690 --> 00:52:56,640 so I'm just going to test whether I'm 0.5 940 00:52:56,640 --> 00:52:58,780 or I'm less than 0.5. 941 00:52:58,780 --> 00:53:01,050 But here is something that-- I can promise you, 942 00:53:01,050 --> 00:53:04,380 I did not make the computation-- will reject. 943 00:53:04,380 --> 00:53:06,180 So here, this one actually-- 944 00:53:06,180 --> 00:53:08,280 yeah, this one fails to reject. 945 00:53:08,280 --> 00:53:11,080 So here is one that will certainly reject. 946 00:53:11,080 --> 00:53:24,950 h0: p is 0.5, versus h1: p is 0.43.
947 00:53:24,950 --> 00:53:27,830 Now, you can try, but I can promise you 948 00:53:27,830 --> 00:53:32,030 that your data will tell you that h1 is the right one. 949 00:53:32,030 --> 00:53:36,110 I mean, you can check very quickly that this is really 950 00:53:36,110 --> 00:53:37,780 extremely likely to happen. 951 00:53:40,795 --> 00:53:41,670 Actually, what am I-- 952 00:53:45,050 --> 00:53:52,220 no, actually, that's not true, because here, 953 00:53:52,220 --> 00:53:56,330 the test that I derive that's based on this kind of stuff, 954 00:53:56,330 --> 00:53:59,960 here at some point, somewhere under some layers, 955 00:53:59,960 --> 00:54:04,450 I assume that all our tests are going to have this form. 956 00:54:04,450 --> 00:54:06,030 But here, this is only when you're 957 00:54:06,030 --> 00:54:09,030 trying to test one region versus another region next to it, 958 00:54:09,030 --> 00:54:11,030 or one point versus a region around it, 959 00:54:11,030 --> 00:54:13,140 or something like this, whereas for this guy, 960 00:54:13,140 --> 00:54:15,640 there's another test that could come up with, 961 00:54:15,640 --> 00:54:18,720 which is, what is the probability that I get 0.43, 962 00:54:18,720 --> 00:54:21,786 and what is the probability that I get 0.5? 963 00:54:21,786 --> 00:54:23,160 Now, what I'm going to do is, I'm 964 00:54:23,160 --> 00:54:25,680 going to just conclude it's whichever 965 00:54:25,680 --> 00:54:27,607 has the largest probability. 966 00:54:27,607 --> 00:54:29,940 Then maybe I'm going to have to make some adjustments so 967 00:54:29,940 --> 00:54:32,310 that the level is actually 5%. 968 00:54:32,310 --> 00:54:33,690 But I can make this happen. 969 00:54:33,690 --> 00:54:36,890 I can make the level be 5% and always conclude this guy, 970 00:54:36,890 --> 00:54:38,610 but I would have to use a different test. 
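The "pick whichever has the largest probability" test just described can be written down for this point-versus-point problem. A hedged Python sketch, reusing the 13-heads-out-of-30 data from the earlier example and skipping the level adjustment the lecture mentions would still be needed:

```python
from math import comb

n, k = 30, 13  # observed: 13 heads out of 30 flips

def binom_likelihood(p, n, k):
    # Probability of exactly k heads in n flips of a coin with bias p
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

l0 = binom_likelihood(0.50, n, k)  # likelihood under h0: p = 0.5
l1 = binom_likelihood(0.43, n, k)  # likelihood under h1: p = 0.43

# The alternative tailored to the data is more likely, so this naive
# maximum-likelihood comparison sides with h1.
print(l1 > l0)  # True
```

Since the alternative was chosen by looking at the data first, it is essentially guaranteed to win this comparison, which is the point of the example.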
971 00:54:38,610 --> 00:54:40,290 Now, the test that I described, again, 972 00:54:40,290 --> 00:54:42,870 those tn larger than c are built in 973 00:54:42,870 --> 00:54:46,170 to be tests that are resilient to these kind of manipulations 974 00:54:46,170 --> 00:54:48,840 because they're oblivious towards what 975 00:54:48,840 --> 00:54:50,490 the alternative looks like. 976 00:54:50,490 --> 00:54:51,960 I mean, they're just saying it's either to the left 977 00:54:51,960 --> 00:54:53,334 or to the right, but whether it's 978 00:54:53,334 --> 00:54:55,440 a point or an entire half-line doesn't matter. 979 00:54:59,170 --> 00:55:01,200 So if you try to look at your data 980 00:55:01,200 --> 00:55:05,520 and just put the data itself into your hypothesis testing 981 00:55:05,520 --> 00:55:10,745 problem, then you're failing the statistical principle. 982 00:55:10,745 --> 00:55:12,120 And that's what people are doing. 983 00:55:12,120 --> 00:55:13,572 I mean, how can I check? 984 00:55:13,572 --> 00:55:15,030 I mean, of course, here, it's going 985 00:55:15,030 --> 00:55:16,488 to be pretty blatant if you publish 986 00:55:16,488 --> 00:55:17,820 a paper that looks like this. 987 00:55:17,820 --> 00:55:19,650 But there's ways to do it differently. 988 00:55:19,650 --> 00:55:21,734 For example, one way to do it is to just do mult-- 989 00:55:21,734 --> 00:55:23,233 so typically, what people do is they 990 00:55:23,233 --> 00:55:24,780 do multiple hypothesis testing. 991 00:55:24,780 --> 00:55:27,570 They're doing 100 tests at a time. 992 00:55:27,570 --> 00:55:30,714 Then you have random fluctuations every time. 993 00:55:30,714 --> 00:55:32,130 And so they just pick the one that 994 00:55:32,130 --> 00:55:34,422 has the random fluctuations that go their way. 
995 00:55:34,422 --> 00:55:36,130 I mean, sometimes it's going in your way, 996 00:55:36,130 --> 00:55:37,879 and sometimes it's going the opposite way, 997 00:55:37,879 --> 00:55:39,780 so you just pick the one that works for you. 998 00:55:39,780 --> 00:55:41,990 We'll talk about multiple hypothesis testing soon 999 00:55:41,990 --> 00:55:44,860 if you want to increase your publication count. 1000 00:55:49,779 --> 00:55:50,820 There are actually journals-- 1001 00:55:50,820 --> 00:55:53,040 I think it was big news that some journals, 1002 00:55:53,040 --> 00:55:54,720 I think, in psychology or psychometrics, 1003 00:55:54,720 --> 00:55:57,900 actually refuse to publish p-values now. 1004 00:56:03,630 --> 00:56:05,880 Where were we? 1005 00:56:05,880 --> 00:56:07,230 Here's the golden rule. 1006 00:56:07,230 --> 00:56:11,970 So one thing that I like to show is this thing, 1007 00:56:11,970 --> 00:56:14,340 just so you know how you apply the golden rule 1008 00:56:14,340 --> 00:56:16,050 and how you apply the standard tests. 1009 00:56:16,050 --> 00:56:25,450 So the standard paradigm is the following. 1010 00:56:25,450 --> 00:56:29,430 You have a black box, which is your test. 1011 00:56:29,430 --> 00:56:32,230 For my wife, this is the 4th floor of the building. 1012 00:56:32,230 --> 00:56:33,840 That's where the statisticians sit. 1013 00:56:33,840 --> 00:56:35,980 What she sends there is data-- 1014 00:56:38,620 --> 00:56:41,140 let's say x1, ..., xn. 1015 00:56:41,140 --> 00:56:43,870 And she says, well, this one is about toothpaste, 1016 00:56:43,870 --> 00:56:45,190 so here's a level-- 1017 00:56:45,190 --> 00:56:47,220 let's say 5%. 1018 00:56:47,220 --> 00:56:50,890 What the 4th floor brings back is that answer-- yes, 1019 00:56:50,890 --> 00:56:53,290 no, green, red, just an answer. 1020 00:56:58,050 --> 00:56:59,670 So that's the standard testing.
1021 00:56:59,670 --> 00:57:02,040 You just feed it the data and the level at which you 1022 00:57:02,040 --> 00:57:04,330 want to perform the test, maybe asymptotic, 1023 00:57:04,330 --> 00:57:06,580 and it spits out a yes, no answer. 1024 00:57:06,580 --> 00:57:15,340 What the p-value does is, you just feed it the data itself. 1025 00:57:18,380 --> 00:57:22,210 And what it spits out is the p-value. 1026 00:57:22,210 --> 00:57:23,870 And now it's just up to you. 1027 00:57:23,870 --> 00:57:27,910 I mean, hopefully your brain has the computational power 1028 00:57:27,910 --> 00:57:31,090 of deciding whether a number is larger or smaller than 5% 1029 00:57:31,090 --> 00:57:33,790 without having to call a statistician for this. 1030 00:57:33,790 --> 00:57:35,360 And that's what it does. 1031 00:57:35,360 --> 00:57:37,600 So now we're on a scale. 1032 00:57:37,600 --> 00:57:41,500 Now, I see some of you nodding when I talk about p-hacking, 1033 00:57:41,500 --> 00:57:43,095 so that means you've seen p-values. 1034 00:57:43,095 --> 00:57:45,220 If you've seen more than 100 p-values in your life, 1035 00:57:45,220 --> 00:57:47,330 you have an entire scale. 1036 00:57:47,330 --> 00:57:50,770 A good p-value is less than 10 to the minus 4. 1037 00:57:50,770 --> 00:57:53,440 That's the ultimate sweet spot. 1038 00:57:53,440 --> 00:57:56,830 Actually, statistical software spits out 1039 00:57:56,830 --> 00:58:01,370 an output which says less than 10 to the minus 4. 1040 00:58:01,370 --> 00:58:02,830 But then maybe you want a p-val-- 1041 00:58:05,870 --> 00:58:08,960 if you tell me my p-value was 4.65%, then I will say, 1042 00:58:08,960 --> 00:58:10,970 you've been doing some p-hacking until you found 1043 00:58:10,970 --> 00:58:12,680 a number that was below 5%. 1044 00:58:12,680 --> 00:58:14,660 That's typically what people will do.
1045 00:58:14,660 --> 00:58:16,590 But if you tell me-- 1046 00:58:16,590 --> 00:58:18,800 if you're doing the test, if you're saying, 1047 00:58:18,800 --> 00:58:21,590 I published my result, my test at 5% 1048 00:58:21,590 --> 00:58:27,020 said yes, that means that maybe your p-value was 4.99%, 1049 00:58:27,020 --> 00:58:29,762 or your p-value was 10 to the minus 4, I will never know. 1050 00:58:29,762 --> 00:58:31,220 I will never know how much evidence 1051 00:58:31,220 --> 00:58:34,310 you had against the null. 1052 00:58:34,310 --> 00:58:36,260 But if you tell me what the p-value is, 1053 00:58:36,260 --> 00:58:37,492 I can make my own decision. 1054 00:58:37,492 --> 00:58:39,450 You don't have to tell me whether it's a yes or no. 1055 00:58:39,450 --> 00:58:42,620 You tell me it's 4.99%, I'm going to say, well, maybe yes, 1056 00:58:42,620 --> 00:58:45,120 but I'm going to take it with a grain of salt. 1057 00:58:45,120 --> 00:58:48,100 And so that's why p-values are good numbers to have in mind. 1058 00:58:48,100 --> 00:58:51,940 Now, I say this as if it was like an old trick 1059 00:58:51,940 --> 00:58:54,310 that you start mastering when you're 45 years old. 1060 00:58:54,310 --> 00:58:57,490 No, it's just, how small is the number between 0 and 1? 1061 00:58:57,490 --> 00:59:00,670 That's really what you need to know. 1062 00:59:00,670 --> 00:59:03,385 Maybe on the log scale-- if it's 10 to the minus 1, 1063 00:59:03,385 --> 00:59:07,240 10 to the minus 2, 10 to the minus 3, et cetera-- 1064 00:59:07,240 --> 00:59:09,790 that's probably the extent of the mastery here. 1065 00:59:12,820 --> 00:59:16,420 So this traditional standard paradigm that I showed 1066 00:59:16,420 --> 00:59:21,160 is actually commonly referred to as the Neyman-Pearson paradigm. 1067 00:59:21,160 --> 00:59:23,320 So here, it says Neyman-Pearson's theory, 1068 00:59:23,320 --> 00:59:25,970 so there's an entire theory that comes with it.
1069 00:59:25,970 --> 00:59:27,130 But it's really a paradigm. 1070 00:59:27,130 --> 00:59:29,296 It's a way of thinking about hypothesis testing that 1071 00:59:29,296 --> 00:59:32,530 says, well, if I'm not going to be able to optimize both 1072 00:59:32,530 --> 00:59:34,750 my type I and type II error, I'm actually 1073 00:59:34,750 --> 00:59:37,780 going to lock in my type I error below some level 1074 00:59:37,780 --> 00:59:42,550 and just minimize the type II error under this constraint. 1075 00:59:42,550 --> 00:59:45,100 That's what the Neyman-Pearson paradigm is. 1076 00:59:45,100 --> 00:59:48,490 And it sort of makes sense for hypothesis testing problems. 1077 00:59:48,490 --> 00:59:50,560 Now, if you were doing some other applications 1078 00:59:50,560 --> 00:59:52,210 with multi-objective optimization, 1079 00:59:52,210 --> 00:59:54,310 you would maybe come up with something different. 1080 00:59:54,310 --> 00:59:58,600 For example, machine learning is not performing typically 1081 00:59:58,600 --> 01:00:01,120 under Neyman-Pearson paradigm. 1082 01:00:01,120 --> 01:00:05,360 So if you do spam filtering, you could say, well, 1083 01:00:05,360 --> 01:00:08,650 I want to constrain the probability as much as I can 1084 01:00:08,650 --> 01:00:10,810 of taking somebody's important emails 1085 01:00:10,810 --> 01:00:14,590 and throwing them out as spam, and under this constraint, 1086 01:00:14,590 --> 01:00:17,350 not send too much spam to that person. 1087 01:00:17,350 --> 01:00:19,290 That sort of makes sense for spams. 1088 01:00:19,290 --> 01:00:23,620 Now, if you're labeling cats versus dogs, that's probably 1089 01:00:23,620 --> 01:00:27,100 not like you want to make sure that no more than 5% 1090 01:00:27,100 --> 01:00:30,670 of the dogs are labeled cat because, I mean, 1091 01:00:30,670 --> 01:00:31,469 it doesn't matter. 
1092 01:00:31,469 --> 01:00:33,010 So what you typically do is, you just 1093 01:00:33,010 --> 01:00:34,810 sum up the two types of errors you can make, 1094 01:00:34,810 --> 01:00:36,851 and you minimize the sum without putting any more 1095 01:00:36,851 --> 01:00:38,260 weight on one or the other. 1096 01:00:38,260 --> 01:00:42,880 So here's an example where, when making a binary decision, 1097 01:00:42,880 --> 01:00:45,070 with the two errors you can make, you don't 1098 01:00:45,070 --> 01:00:47,840 actually have to be like that. 1099 01:00:47,840 --> 01:00:50,300 So this example here-- 1100 01:00:50,300 --> 01:00:55,200 the trivial test psi is equal to 0, what was it 1101 01:00:55,200 --> 01:01:00,350 in the US trial court example? 1102 01:01:00,350 --> 01:01:03,530 What is psi equals 0? 1103 01:01:03,530 --> 01:01:05,500 That was concluding always to the null. 1104 01:01:05,500 --> 01:01:08,000 What was the null? 1105 01:01:08,000 --> 01:01:08,930 AUDIENCE: Innocent. 1106 01:01:08,930 --> 01:01:10,388 PHILIPPE RIGOLLET: Innocent, right? 1107 01:01:10,388 --> 01:01:11,340 That's the status quo. 1108 01:01:11,340 --> 01:01:14,790 So that means that this guy never rejects h0. 1109 01:01:14,790 --> 01:01:16,910 Everybody's going away free. 1110 01:01:16,910 --> 01:01:18,800 So you're sure you're not actually 1111 01:01:18,800 --> 01:01:25,250 going against the Constitution because alpha is 0%, which 1112 01:01:25,250 --> 01:01:26,940 is certainly less than 5%. 1113 01:01:26,940 --> 01:01:30,260 But the power, the fact that a lot of criminals 1114 01:01:30,260 --> 01:01:34,130 go back outside in the free world 1115 01:01:34,130 --> 01:01:37,580 is actually formulated in terms of low power, which, 1116 01:01:37,580 --> 01:01:39,100 in this case, is actually 0. 1117 01:01:39,100 --> 01:01:41,690 Again, the power is a number between 0 and 1. 1118 01:01:41,690 --> 01:01:43,100 Close to 1, good. 1119 01:01:43,100 --> 01:01:45,227 Close to 0, bad.
1120 01:01:45,227 --> 01:01:51,510 Now, what is the definition of the p-value? 1121 01:01:51,510 --> 01:01:54,300 That's going to be something-- it's a mouthful. 1122 01:01:54,300 --> 01:01:58,520 The definition of the p-value is a mouthful. 1123 01:01:58,520 --> 01:02:00,240 It's the tipping point. 1124 01:02:00,240 --> 01:02:02,620 It is the smallest level at which blah, blah, blah, blah, 1125 01:02:02,620 --> 01:02:03,120 blah. 1126 01:02:03,120 --> 01:02:05,410 It's complicated to remember it. 1127 01:02:05,410 --> 01:02:09,910 Now, I think that at my 6th explanation, my wife, 1128 01:02:09,910 --> 01:02:12,994 after saying, oh, so it's the probability of making an error-- 1129 01:02:12,994 --> 01:02:14,910 I said, yeah, that's the probability of making 1130 01:02:14,910 --> 01:02:16,870 an error because, of course, she can 1131 01:02:16,870 --> 01:02:22,270 think: probability of making an error-- small, good; large, bad. 1132 01:02:22,270 --> 01:02:24,940 So that's actually a good way to remember it. 1133 01:02:24,940 --> 01:02:26,521 I'm pretty sure that at least 50% 1134 01:02:26,521 --> 01:02:28,270 of people who are using p-values out there 1135 01:02:28,270 --> 01:02:31,220 think that the p-value is the probability of making an error. 1136 01:02:31,220 --> 01:02:33,040 Now, for all intents and purposes, 1137 01:02:33,040 --> 01:02:35,620 if your goal is to just threshold the p-value, 1138 01:02:35,620 --> 01:02:37,900 this is OK to have in mind. 1139 01:02:37,900 --> 01:02:42,220 But for now, at least until December 22, 1140 01:02:42,220 --> 01:02:44,830 I would recommend trying to actually memorize 1141 01:02:44,830 --> 01:02:46,780 the right definition for the p-value. 1142 01:02:53,020 --> 01:02:55,240 So the idea, again, is fix the level 1143 01:02:55,240 --> 01:02:57,112 and try to optimize the power. 1144 01:03:01,360 --> 01:03:05,370 So we're going to try to compute some p-values from now on.
1145 01:03:05,370 --> 01:03:06,900 How do you compute the p-value? 1146 01:03:06,900 --> 01:03:10,670 Well, you can actually see it from this picture over there. 1147 01:03:14,047 --> 01:03:16,130 One thing I didn't show on this picture-- so here, 1148 01:03:16,130 --> 01:03:19,010 it was my q alpha over 2 that had alpha here, 1149 01:03:19,010 --> 01:03:21,230 alpha over 2 here. 1150 01:03:21,230 --> 01:03:22,580 That was my q alpha over 2. 1151 01:03:22,580 --> 01:03:26,870 And I said, if tn is to the right of this guy, 1152 01:03:26,870 --> 01:03:27,746 I'm going to reject. 1153 01:03:27,746 --> 01:03:29,120 If tn is to the left of this guy, 1154 01:03:29,120 --> 01:03:31,550 I'm going to fail to reject. 1155 01:03:31,550 --> 01:03:34,720 Pictorially, you can actually represent the p-value. 1156 01:03:34,720 --> 01:03:36,770 It's when I replace this guy by tn itself. 1157 01:03:41,170 --> 01:03:44,770 Sorry, that's p-value over 2. 1158 01:03:44,770 --> 01:03:47,360 No, actually, that's p-value. 1159 01:03:47,360 --> 01:03:51,290 So let me just keep it like that and put the absolute value 1160 01:03:51,290 --> 01:03:51,790 here. 1161 01:03:54,530 --> 01:03:58,560 So if you replace the role of q alpha over 2, by your test 1162 01:03:58,560 --> 01:04:01,630 statistic, the area under the curve 1163 01:04:01,630 --> 01:04:03,160 is actually the p-value itself up 1164 01:04:03,160 --> 01:04:06,730 to a scale because of the symmetric thing. 1165 01:04:06,730 --> 01:04:09,280 So there's a good way to see, pictorially, 1166 01:04:09,280 --> 01:04:10,930 what the p-value is. 1167 01:04:10,930 --> 01:04:13,750 It's just the probability that some Gaussians-- 1168 01:04:13,750 --> 01:04:17,680 it's just the probability that some absolute value of n01 1169 01:04:17,680 --> 01:04:18,370 exceeds tn. 1170 01:04:22,480 --> 01:04:24,384 That's what the p-value is. 
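That tail probability needs nothing more than the standard Gaussian cdf. A small Python sketch (the value tn = 1.96 is just an illustrative number, not one from the lecture):

```python
from math import erf, sqrt

def std_gaussian_cdf(x):
    # Phi(x) for a standard Gaussian, via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def two_sided_p_value(tn):
    # P(|N(0, 1)| > |tn|) = 2 * (1 - Phi(|tn|))
    return 2.0 * (1.0 - std_gaussian_cdf(abs(tn)))

print(two_sided_p_value(1.96))  # about 0.05, the classical 5% threshold
```

The only ingredient specific to this test is the standard Gaussian cdf; swapping in another limiting cdf gives the p-value for a different limiting distribution, which is the point made next.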
1171 01:04:24,384 --> 01:04:26,300 Now, this guy has nothing to do with this guy, 1172 01:04:26,300 --> 01:04:32,680 so this is really just 1 minus the Gaussian cdf of tn, 1173 01:04:32,680 --> 01:04:34,650 and that's it. 1174 01:04:34,650 --> 01:04:36,460 So that's how I would compute p-values. 1175 01:04:36,460 --> 01:04:40,210 Now, as I said, the p-value is a beauty 1176 01:04:40,210 --> 01:04:43,330 because you don't have to understand 1177 01:04:43,330 --> 01:04:47,050 the fact that your limiting distribution is a Gaussian. 1178 01:04:47,050 --> 01:04:49,264 It's already factored in this construction. 1179 01:04:49,264 --> 01:04:50,680 The fact that I'm actually looking 1180 01:04:50,680 --> 01:04:54,880 at this cumulative distribution function of a standard Gaussian 1181 01:04:54,880 --> 01:04:57,302 makes my p-value automatically adjust to what 1182 01:04:57,302 --> 01:04:58,510 the limiting distribution is. 1183 01:04:58,510 --> 01:05:00,460 And if this was the cumulative distribution 1184 01:05:00,460 --> 01:05:03,520 function of an exponential, I would just 1185 01:05:03,520 --> 01:05:06,055 have a different function here denoted by f, for example, 1186 01:05:06,055 --> 01:05:07,866 and I would just compute a different value. 1187 01:05:07,866 --> 01:05:10,240 But in the end, regardless of what the limiting distribution is, 1188 01:05:10,240 --> 01:05:13,540 my p-value would still be a number between 0 and 1. 1189 01:05:13,540 --> 01:05:16,650 And so to illustrate that, let's look 1190 01:05:16,650 --> 01:05:20,340 at other weird distributions that we could get in place 1191 01:05:20,340 --> 01:05:22,620 of the standard Gaussian. 1192 01:05:22,620 --> 01:05:24,980 And we're not going to see many, but we'll see one. 1193 01:05:24,980 --> 01:05:27,311 And it's not called the chi squared distribution.
1194 01:05:27,311 --> 01:05:29,310 It's actually called the Student's distribution, 1195 01:05:29,310 --> 01:05:31,920 but it involves the chi squared distribution 1196 01:05:31,920 --> 01:05:34,320 as a building block. 1197 01:05:34,320 --> 01:05:38,820 So I don't know if my phonetics are not really right there, 1198 01:05:38,820 --> 01:05:43,050 so I try to say, well, it's chi squared. 1199 01:05:43,050 --> 01:05:47,190 Maybe it's "kee" squared above, in Canada, who knows. 1200 01:05:47,190 --> 01:05:50,589 So for a positive integer, so there's only 1 parameter. 1201 01:05:50,589 --> 01:05:52,380 So for the Gaussian, you have 2 parameters, 1202 01:05:52,380 --> 01:05:54,180 which are mu and sigma squared. 1203 01:05:54,180 --> 01:05:55,350 Those are real numbers. 1204 01:05:55,350 --> 01:05:57,090 Sigma squared's positive. 1205 01:05:57,090 --> 01:05:59,280 Here, I have 1 integer parameter. 1206 01:06:03,030 --> 01:06:05,160 Then the chi squared distribution 1207 01:06:05,160 --> 01:06:07,642 with d degrees of freedom-- 1208 01:06:07,642 --> 01:06:09,600 so the parameter is called a degree of freedom, 1209 01:06:09,600 --> 01:06:11,700 just like mu is called the expected value and sigma 1210 01:06:11,700 --> 01:06:12,600 squared is called the variance. 1211 01:06:12,600 --> 01:06:14,370 Here, we call it degrees of freedom. 1212 01:06:14,370 --> 01:06:17,290 You don't have to really understand why. 1213 01:06:17,290 --> 01:06:19,800 So that's the law that you would get-- 1214 01:06:19,800 --> 01:06:21,300 that's the random variable you would 1215 01:06:21,300 --> 01:06:26,260 get if you were to sum d squares of independent standard 1216 01:06:26,260 --> 01:06:26,890 Gaussians. 1217 01:06:29,570 --> 01:06:33,680 So I take the square of an independent random Gaussian. 1218 01:06:33,680 --> 01:06:34,730 I take another one. 1219 01:06:34,730 --> 01:06:36,380 I sum them, and that's a chi squared 1220 01:06:36,380 --> 01:06:39,370 with 2 degrees of freedom. 
1221 01:06:39,370 --> 01:06:40,740 That's how you get it. 1222 01:06:40,740 --> 01:06:46,960 Now, I could define it using its probability density function. 1223 01:06:46,960 --> 01:06:49,150 I mean, after all, this is the sum 1224 01:06:49,150 --> 01:06:51,730 of positive random variables, so it 1225 01:06:51,730 --> 01:06:53,800 is a positive random variable. 1226 01:06:53,800 --> 01:06:56,680 It has a density on the positive real line. 1227 01:06:56,680 --> 01:07:03,420 And the pdf of chi squared with d degrees of freedom is what? 1228 01:07:03,420 --> 01:07:07,900 Well, it's fd of x is-- 1229 01:07:07,900 --> 01:07:13,930 what is it?-- x to the d/2 minus 1 e to the minus x/2. 1230 01:07:13,930 --> 01:07:16,510 And then here, I have a gamma of d/2. 1231 01:07:16,510 --> 01:07:20,470 And the other one is, I think, 2 to the d/2 minus 1. 1232 01:07:23,158 --> 01:07:26,530 No, 2 to the d/2. 1233 01:07:26,530 --> 01:07:28,690 That's what it is. 1234 01:07:28,690 --> 01:07:30,580 That's the density. 1235 01:07:30,580 --> 01:07:32,110 If you are very good at probability, 1236 01:07:32,110 --> 01:07:33,640 you can make the change of variable 1237 01:07:33,640 --> 01:07:35,789 and write your Jacobian and do all this stuff 1238 01:07:35,789 --> 01:07:37,330 and actually check that this is true. 1239 01:07:37,330 --> 01:07:40,540 I do not recommend doing that. 1240 01:07:40,540 --> 01:07:44,070 So this is the density, but it's better understood like that. 1241 01:07:44,070 --> 01:07:46,270 I think it was just something that you 1242 01:07:46,270 --> 01:07:48,160 built from standard Gaussian. 1243 01:07:48,160 --> 01:07:50,800 So for example, an example of a chi 1244 01:07:50,800 --> 01:07:52,870 squared with 2 degrees of freedom 1245 01:07:52,870 --> 01:07:54,790 is actually the following thing. 1246 01:07:54,790 --> 01:07:56,860 Let's assume I have a target like this. 1247 01:08:00,170 --> 01:08:02,960 And I don't aim very well. 
1248 01:08:02,960 --> 01:08:05,996 And I'm trying to hit the center. 1249 01:08:05,996 --> 01:08:07,370 And I'm not going to have, maybe, 1250 01:08:07,370 --> 01:08:10,380 a deviation, which is standard Gaussian left, right 1251 01:08:10,380 --> 01:08:16,520 and standard Gaussian north, south. 1252 01:08:16,520 --> 01:08:18,710 So I'm throwing, and then I'm here, 1253 01:08:18,710 --> 01:08:22,279 and I'm claiming that this number here, by Pythagoras 1254 01:08:22,279 --> 01:08:24,290 theorem, the square distance here 1255 01:08:24,290 --> 01:08:25,790 is the sum of this square distance 1256 01:08:25,790 --> 01:08:30,060 here, which is the square of a Gaussian by assumption. 1257 01:08:30,060 --> 01:08:31,854 This is plus the square of this distance, 1258 01:08:31,854 --> 01:08:34,020 which is the square of another independent Gaussian. 1259 01:08:34,020 --> 01:08:35,486 I assume those are independent. 1260 01:08:35,486 --> 01:08:37,819 And so the square distance from this point to this point 1261 01:08:37,819 --> 01:08:40,580 is the chi squared with 2 degrees of freedom. 1262 01:08:40,580 --> 01:08:45,029 So this guy here is n01 squared. 1263 01:08:45,029 --> 01:08:48,140 This is n01 squared. 1264 01:08:48,140 --> 01:08:50,120 And so this guy here, this distance here, 1265 01:08:50,120 --> 01:08:53,285 is chi squared with 2 degrees of freedom. 1266 01:08:53,285 --> 01:08:54,410 I mean the square distance. 1267 01:08:54,410 --> 01:08:58,569 I'm talking about square distances here. 1268 01:08:58,569 --> 01:09:02,380 So now you can see that, actually, Pythagoras 1269 01:09:02,380 --> 01:09:05,899 is basically why the chi squared arises. 1270 01:09:05,899 --> 01:09:07,810 That's why it has its own name. 1271 01:09:07,810 --> 01:09:10,479 I mean, I could define this random variable. 1272 01:09:10,479 --> 01:09:13,075 I mean, it's actually a gamma distribution. 1273 01:09:13,075 --> 01:09:15,700 It's a special case of something called the gamma distribution.
1274 01:09:15,700 --> 01:09:17,658 The fact that the special case has its own name 1275 01:09:17,658 --> 01:09:19,075 is because there's many times when 1276 01:09:19,075 --> 01:09:20,491 we're going to take sums of squares 1277 01:09:20,491 --> 01:09:23,140 of independent Gaussians because Gaussians, the sum of squares 1278 01:09:23,140 --> 01:09:25,330 is really the norm, the Euclidean norm squared, 1279 01:09:25,330 --> 01:09:26,830 just by Pythagoras theorem. 1280 01:09:26,830 --> 01:09:28,319 If I'm in higher dimension, I can 1281 01:09:28,319 --> 01:09:30,370 start to sum more squared coordinates, 1282 01:09:30,370 --> 01:09:32,119 and I'm going to measure the norm squared. 1283 01:09:34,240 --> 01:09:37,880 So if you want to draw this picture, it looks like this. 1284 01:09:37,880 --> 01:09:39,620 Again, it's the sum of positive numbers, 1285 01:09:39,620 --> 01:09:43,000 so it's going to be on 0 plus infinity. 1286 01:09:43,000 --> 01:09:44,680 That's fd. 1287 01:09:44,680 --> 01:09:52,850 And so f1 looks like this, f2 looks like this. 1288 01:09:52,850 --> 01:09:57,370 So the tails become heavier and heavier as d increases. 1289 01:09:57,370 --> 01:10:00,560 And then at d equal to 3, it starts 1290 01:10:00,560 --> 01:10:01,960 to have a different shape. 1291 01:10:01,960 --> 01:10:04,210 It starts from 0 and it looks like this. 1292 01:10:04,210 --> 01:10:06,850 And then, as d increases, it's basically 1293 01:10:06,850 --> 01:10:09,210 as if you were to push this thing to the right. 1294 01:10:09,210 --> 01:10:14,979 It's just like, psh, so it's just falling like a big blob. 1295 01:10:14,979 --> 01:10:16,270 Everybody sees what's going on? 1296 01:10:16,270 --> 01:10:19,637 So there's just this fat thing that's just going there. 1297 01:10:19,637 --> 01:10:21,470 What is the expected value of a chi squared?
1298 01:10:28,670 --> 01:10:30,900 So it's the expected value of the sum 1299 01:10:30,900 --> 01:10:37,938 of Gaussian random variables, squared. 1300 01:10:37,938 --> 01:10:40,358 I know I said that. 1301 01:10:40,358 --> 01:10:42,790 AUDIENCE: So it's the sum of their second moments, right? 1302 01:10:42,790 --> 01:10:43,956 PHILIPPE RIGOLLET: Which is? 1303 01:10:46,240 --> 01:10:47,570 Those are n01. 1304 01:10:47,570 --> 01:10:50,386 AUDIENCE: It's like-- oh, I see, 1. 1305 01:10:50,386 --> 01:10:51,386 PHILIPPE RIGOLLET: Yeah. 1306 01:10:51,386 --> 01:10:53,294 AUDIENCE: So n times 1 or d times 1. 1307 01:10:53,294 --> 01:10:55,070 PHILIPPE RIGOLLET: Yeah, which is d. 1308 01:10:55,070 --> 01:10:56,990 So one thing you can check quickly 1309 01:10:56,990 --> 01:11:00,280 is that the expected value of a chi squared is d. 1310 01:11:00,280 --> 01:11:04,280 And so you see, that's why the mass is shifting to the right 1311 01:11:04,280 --> 01:11:05,240 as d increases. 1312 01:11:05,240 --> 01:11:06,350 It's just going there. 1313 01:11:06,350 --> 01:11:08,320 Actually, the variance is also increasing. 1314 01:11:08,320 --> 01:11:10,180 The variance is 2d. 1315 01:11:14,070 --> 01:11:16,130 So this is one thing. 1316 01:11:16,130 --> 01:11:19,230 And so why do we care about this? 1317 01:11:19,230 --> 01:11:22,640 In basic statistics, it's not like we actually 1318 01:11:22,640 --> 01:11:25,140 have statistics much about throwing darts 1319 01:11:25,140 --> 01:11:28,590 at high-dimensional boards. 
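[Editor's note: the mean d and variance 2d can be verified numerically. A minimal Python sketch, not part of the lecture; d, the seed, and the draw count are arbitrary.]

```python
import random

random.seed(1)
d, n_draws = 5, 200_000

# Build chi squared draws with d degrees of freedom as sums of d
# squared standard Gaussians, then check that the sample mean is
# close to d and the sample variance close to 2d.
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(d))
         for _ in range(n_draws)]

mean = sum(draws) / n_draws
var = sum((x - mean) ** 2 for x in draws) / n_draws
print(mean, var)  # close to d = 5 and 2d = 10
```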
1320 01:11:28,590 --> 01:11:31,380 So what's happening is that if I look at the sample variance, 1321 01:11:31,380 --> 01:11:36,590 the average of the squares centered by their mean, 1322 01:11:36,590 --> 01:11:38,480 then I can actually expand this as the sum 1323 01:11:38,480 --> 01:11:42,320 of the squares minus the average squared. 1324 01:11:42,320 --> 01:11:44,680 It's just the same trick that we have 1325 01:11:44,680 --> 01:11:49,360 for the variance-- second moment minus first moment squared. 1326 01:11:49,360 --> 01:11:53,785 And then I claim that Cochran's theorem-- 1327 01:11:53,785 --> 01:11:56,640 and I will tell you in a second what Cochran's theorem tells me 1328 01:11:56,640 --> 01:11:58,390 is that this sample variance is actually-- 1329 01:11:58,390 --> 01:12:01,330 so if I had only this-- 1330 01:12:01,330 --> 01:12:04,560 look at those guys. 1331 01:12:04,560 --> 01:12:07,170 Those guys are Gaussian with mean mu and variance 1332 01:12:07,170 --> 01:12:08,370 sigma squared. 1333 01:12:08,370 --> 01:12:13,290 Think for 1 second of mu being 0 and sigma squared being 1. 1334 01:12:13,290 --> 01:12:16,920 Now, this part would be a chi squared with n degrees 1335 01:12:16,920 --> 01:12:19,162 of freedom divided by n. 1336 01:12:19,162 --> 01:12:21,300 Now I get another thing here, which 1337 01:12:21,300 --> 01:12:24,850 is the square of something that looks like a Gaussian as well. 1338 01:12:24,850 --> 01:12:27,750 So it looks like I have something else here, which 1339 01:12:27,750 --> 01:12:29,580 looks also like a chi squared. 1340 01:12:29,580 --> 01:12:31,890 Now, Cochran's theorem is essentially telling you 1341 01:12:31,890 --> 01:12:35,130 that those things are independent, 1342 01:12:35,130 --> 01:12:39,720 and so that in a way, you can think of those guys as being, 1343 01:12:39,720 --> 01:12:43,780 here, n degrees of freedom minus 1 degree of freedom. 
1344 01:12:43,780 --> 01:12:47,820 Now, here, as I said, these do not have mean 0 and variance 1. 1345 01:12:47,820 --> 01:12:50,730 The fact that it's not mean 0 is not a problem 1346 01:12:50,730 --> 01:12:54,790 because I can remove the mean here and remove the mean here. 1347 01:12:54,790 --> 01:12:57,130 And so this thing has the same distribution, 1348 01:12:57,130 --> 01:12:59,100 regardless of what the actual mean is. 1349 01:12:59,100 --> 01:13:00,600 So without loss of generality, I can 1350 01:13:00,600 --> 01:13:02,097 assume that mu is equal to 0. 1351 01:13:02,097 --> 01:13:03,930 Now, the variance, I'm going to have to pay, 1352 01:13:03,930 --> 01:13:06,660 because if I multiply all these numbers by 10, 1353 01:13:06,660 --> 01:13:09,647 then this sn is going to be multiplied by 100. 1354 01:13:09,647 --> 01:13:11,730 So this thing is going to scale with the variance. 1355 01:13:11,730 --> 01:13:13,813 And not surprisingly, it's scaling like sigma 1356 01:13:13,813 --> 01:13:15,550 squared, the variance. 1357 01:13:15,550 --> 01:13:18,120 So if I look at sn, it's distributed 1358 01:13:18,120 --> 01:13:21,960 as sigma squared times the chi squared 1359 01:13:21,960 --> 01:13:25,060 with n minus 1 degrees of freedom divided by n. 1360 01:13:25,060 --> 01:13:28,230 And we don't really write that, because a chi squared 1361 01:13:28,230 --> 01:13:30,650 times sigma squared divided by n is not a distribution, 1362 01:13:30,650 --> 01:13:32,292 so we put everything to the left, 1363 01:13:32,292 --> 01:13:34,500 and we say that n times sn over sigma squared is actually a chi squared with n 1364 01:13:34,500 --> 01:13:36,880 minus 1 degrees of freedom. 1365 01:13:36,880 --> 01:13:40,570 So here, I'm actually dropping a fact on you, 1366 01:13:40,570 --> 01:13:43,810 but you can see the building block. 1367 01:13:43,810 --> 01:13:46,750 What is the thing that's fuzzy at this point, 1368 01:13:46,750 --> 01:13:48,820 but the rest should be crystal clear to you? 
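[Editor's note: the claim that n times the sample variance over sigma squared is a chi squared with n minus 1 degrees of freedom can be sanity-checked by simulation. A minimal Python sketch, not part of the lecture; n, mu, sigma, the seed, and the repetition count are arbitrary.]

```python
import random

random.seed(2)
n, mu, sigma = 10, 3.0, 2.0
n_reps = 100_000

# Sn is the sample variance: 1/n times the sum of squared deviations
# from the sample mean.  The claim is that n * Sn / sigma^2 follows a
# chi squared with n - 1 degrees of freedom, whose mean is n - 1.
stats = []
for _ in range(n_reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    sn = sum((x - xbar) ** 2 for x in xs) / n
    stats.append(n * sn / sigma ** 2)

mean = sum(stats) / n_reps
print(mean)  # should be close to n - 1 = 9
```

Note that the result does not depend on mu, matching the without-loss-of-generality argument above.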
1369 01:13:48,820 --> 01:13:52,060 The thing that's fuzzy is that removing this squared guy 1370 01:13:52,060 --> 01:13:55,770 here is actually removing 1 degree of freedom. 1371 01:13:55,770 --> 01:13:59,122 That should be weird, but that's what Cochran's theorem tells us. 1372 01:13:59,122 --> 01:14:00,800 It's essentially stating something 1373 01:14:00,800 --> 01:14:04,760 about orthogonality of subspaces with the span 1374 01:14:04,760 --> 01:14:07,580 of the constant vector, something like that. 1375 01:14:07,580 --> 01:14:09,770 So you don't have to think about it too much, 1376 01:14:09,770 --> 01:14:11,630 but that's what it's telling me. 1377 01:14:11,630 --> 01:14:15,170 But the rest, if you plug in-- so the scaling in sigma squared 1378 01:14:15,170 --> 01:14:18,450 and in n, so that should be completely clear to you. 1379 01:14:18,450 --> 01:14:20,810 So in particular, if I remove that part, 1380 01:14:20,810 --> 01:14:24,950 it should be clear to you that this thing, if the mean is 0, 1381 01:14:24,950 --> 01:14:27,240 this thing is actually distributed-- 1382 01:14:27,240 --> 01:14:30,153 well, if mu is 0, what is the distribution of this guy? 1383 01:14:35,810 --> 01:14:37,739 So I remove that part, just this part. 1384 01:14:46,840 --> 01:14:50,700 So I have xi, which are n0 sigma squared. 1385 01:14:50,700 --> 01:14:53,505 And I'm asking, what is the distribution of 1/n sum from i 1386 01:14:53,505 --> 01:14:57,382 equal 1 to n of xi squared? 1387 01:14:57,382 --> 01:15:00,350 So it's a sum, and they're IID. 1388 01:15:00,350 --> 01:15:03,620 So it's the sum of squares of independent Gaussians, but not standard. 1389 01:15:03,620 --> 01:15:05,300 So the first thing to make them standard 1390 01:15:05,300 --> 01:15:07,450 is that I divide all of them by sigma squared. 1391 01:15:10,690 --> 01:15:17,050 Now, this guy is of the form zi squared where zi is n01. 1392 01:15:20,710 --> 01:15:25,740 So now, this thing here has what distribution? 
1393 01:15:25,740 --> 01:15:27,420 AUDIENCE: Chi squared n. 1394 01:15:27,420 --> 01:15:30,198 PHILIPPE RIGOLLET: Chi squared n. 1395 01:15:30,198 --> 01:15:33,450 And now, sigma squared over n times chi squared n-- 1396 01:15:33,450 --> 01:15:35,940 so if I have sigma squared divided by n times chi 1397 01:15:35,940 --> 01:15:37,830 squared-- 1398 01:15:37,830 --> 01:15:41,970 sorry, so sn times n divided by sigma squared. 1399 01:15:41,970 --> 01:15:45,570 So if I take this thing and I multiply it 1400 01:15:45,570 --> 01:15:48,360 by n divided by sigma squared, it means I remove this term, 1401 01:15:48,360 --> 01:15:49,860 and now I am left with a chi squared 1402 01:15:49,860 --> 01:15:51,130 with n degrees of freedom. 1403 01:15:51,130 --> 01:15:55,320 Now, the effect of centering with the sample mean here 1404 01:15:55,320 --> 01:15:57,450 is only to lose 1 degree of freedom. 1405 01:15:57,450 --> 01:15:58,108 That's it. 1406 01:16:01,460 --> 01:16:05,210 So if I want to do a test about variance, since this 1407 01:16:05,210 --> 01:16:08,000 is supposedly a good estimator of variance, 1408 01:16:08,000 --> 01:16:10,880 this could be my pivotal distribution. 1409 01:16:10,880 --> 01:16:12,710 This could play the role of a Gaussian. 1410 01:16:12,710 --> 01:16:16,100 If I want to know if my variance is equal to 1 or larger than 1, 1411 01:16:16,100 --> 01:16:21,720 I could actually build a test based on this statistic alone 1412 01:16:21,720 --> 01:16:23,910 and test if the variance is larger than 1 or not. 1413 01:16:23,910 --> 01:16:25,451 Now, this is not asymptotic because I 1414 01:16:25,451 --> 01:16:28,470 started with the very assumption that my data was 1415 01:16:28,470 --> 01:16:29,160 Gaussian itself. 
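[Editor's note: the contrast between the uncentered statistic, which is chi squared with n degrees of freedom, and the statistic centered by the sample mean, which loses exactly one degree of freedom, can also be simulated. A minimal Python sketch, not part of the lecture; n, sigma, the seed, and the repetition count are arbitrary.]

```python
import random

random.seed(3)
n, sigma = 10, 2.0
n_reps = 100_000

# With mu = 0: rescaling the uncentered sum of squares by 1/sigma^2
# gives a chi squared with n degrees of freedom (mean n), while
# centering by the sample mean first loses exactly 1 degree of
# freedom (mean n - 1).
uncentered, centered = [], []
for _ in range(n_reps):
    xs = [random.gauss(0, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    uncentered.append(sum(x ** 2 for x in xs) / sigma ** 2)
    centered.append(sum((x - xbar) ** 2 for x in xs) / sigma ** 2)

mean_unc = sum(uncentered) / n_reps
mean_cen = sum(centered) / n_reps
print(mean_unc, mean_cen)  # close to n = 10 and n - 1 = 9
```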
1416 01:16:32,390 --> 01:16:33,830 Now, just a side remark-- you can 1417 01:16:33,830 --> 01:16:37,220 check that a chi squared with 2 degrees of freedom 1418 01:16:37,220 --> 01:16:38,990 is an exponential with parameter 1/2, which is certainly not 1419 01:16:38,990 --> 01:16:42,350 clear from the fact that z1 squared plus z2 squared 1420 01:16:42,350 --> 01:16:44,690 is a chi squared with 2 degrees of freedom. 1421 01:16:44,690 --> 01:16:46,440 If I give you the sum of the squares 1422 01:16:46,440 --> 01:16:50,540 of 2 independent Gaussians, this is actually an exponential. 1423 01:16:50,540 --> 01:16:53,140 That's not super clear, right? 1424 01:16:53,140 --> 01:17:00,260 But if you look at what was here-- 1425 01:17:00,260 --> 01:17:03,430 I don't know if you took notes, but let me rewrite it for you. 1426 01:17:03,430 --> 01:17:08,760 So it was x to the d/2 minus 1 e to the minus x/2 divided 1427 01:17:08,760 --> 01:17:14,470 by 2 to the d/2 gamma of d/2. 1428 01:17:14,470 --> 01:17:18,200 So if I plug in d is equal to 2, gamma of 2/2 1429 01:17:18,200 --> 01:17:21,920 is gamma of 1, which is 1. 1430 01:17:21,920 --> 01:17:23,750 It's factorial of 0. 1431 01:17:23,750 --> 01:17:26,290 So it's 1, so this guy goes away. 1432 01:17:26,290 --> 01:17:33,310 2 to the d/2 is 2 to the 1, so that's just 1. 1433 01:17:33,310 --> 01:17:36,490 No, that's just 2. 1434 01:17:36,490 --> 01:17:40,990 Then x to the d/2 minus 1 is x to the 0, goes away. 1435 01:17:40,990 --> 01:17:47,080 And so I have e to the minus x/2 times 1/2, which is really, indeed, 1436 01:17:47,080 --> 01:17:50,050 of the form lambda e to the minus lambda 1437 01:17:50,050 --> 01:17:53,290 x for lambda is equal to 1/2, which was 1438 01:17:53,290 --> 01:17:54,851 our exponential distribution. 1439 01:17:59,200 --> 01:18:05,290 Well, next week is, well, Columbus Day? 1440 01:18:05,290 --> 01:18:08,510 So not next Monday-- 1441 01:18:08,510 --> 01:18:12,310 so next week, we'll talk about Student's distribution. 
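[Editor's note: the density calculation above can be checked directly. A minimal Python sketch, not part of the lecture, using the board's formula for the chi squared density.]

```python
import math

# Chi squared density with d degrees of freedom, as written on the board:
#   f_d(x) = x^(d/2 - 1) * exp(-x/2) / (2^(d/2) * Gamma(d/2))
def chi2_pdf(x, d):
    return x ** (d / 2 - 1) * math.exp(-x / 2) / (2 ** (d / 2) * math.gamma(d / 2))

# Exponential density: lambda * exp(-lambda * x)
def exp_pdf(x, lam):
    return lam * math.exp(-lam * x)

# Plugging in d = 2 should collapse f_2 to the Exp(1/2) density.
for x in (0.1, 1.0, 2.5, 7.0):
    assert abs(chi2_pdf(x, 2) - exp_pdf(x, 0.5)) < 1e-12
print("chi squared with 2 degrees of freedom matches Exp(1/2)")
```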
1442 01:18:12,310 --> 01:18:15,430 And so that was discovered by a guy 1443 01:18:15,430 --> 01:18:19,400 who pretended his name was Student, but was not Student. 1444 01:18:19,400 --> 01:18:23,260 And I challenge you to find why in the meantime. 1445 01:18:23,260 --> 01:18:24,940 So I'll see you next week. 1446 01:18:24,940 --> 01:18:28,090 Your homework is going to be outside 1447 01:18:28,090 --> 01:18:31,733 so we can release the room.