The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: OK, I think we'll make a start since it's almost 10 past. Can everyone hear me in Singapore? Wave if you can. No? Oh, yes you can. Excellent. Right, I think that there's a couple of announcements before we start. I'm going to have office hours tomorrow between 5:00 and 6:00 in case you have any questions about the current problem set, which is due on Thursday at 5:00 PM MIT time. And we are marking the quizzes as fast as we can, so you'll have them back soon. If you have any comments about the quiz, how you found the questions and so forth, then I'd be delighted to hear them.

Any questions? OK. Singapore? No. So anyway, today I'm going to introduce a tool that's going to be very helpful to us as we look to have ways of studying a process and understanding exactly how its inputs relate to its outputs, so that by doing that, we can eventually model the process, control it better, and improve its performance. And that technique is called the analysis of variance, or ANOVA.

I want first to quickly review the tools that we've looked at in the course so far so that we can put ANOVA into perspective. We talked for the first couple of lectures about a systems view of manufacturing processes. We talked about a variety of parameters that were important in determining the outputs, some of which we have control over, some of which we don't-- some of which are actually disturbances or inherent properties of the equipment or the material that we're processing. And so we have this idea that there must be some relationship between the inputs and outputs.
But we've actually focused on developing tools that just focus on the output-- on interpreting the output, the geometry, some other property. And that's fine. That's an important tool. And here are some of the tools we've looked at for statistical testing purposes. There are, of course, the T and the F tests. And we looked at control charts, which are essentially running hypothesis tests that a process is in control. We're testing the hypothesis that the process mean, or variation, is of a certain value. We looked at cumulative sum charts, and we looked at moving average charts. Then briefly, two lectures ago, we looked at chi-squared and T-squared charts.

So let's look at the properties of those things just to remind ourselves. The key thing with the hypothesis tests and with the control charts is that we're interpreting changes in one particular, specific output variable. So whether that be diameter, or threshold voltage, or something, all these hypothesis tests are comparing sample means and sample variances based on one output variable. And with the chi-squared and the T-squared charts, we introduced the ability to look at two outputs which might have different dimensions, and which we expect to be somehow interdependent. So we're allowing ourselves to interpret two different outputs.

Thinking about the number of samples that you can interpret with any one of these given tests: with a T or an F test, you're taking two samples and you're comparing them. So in the case of a T test, you're asking the question, is there evidence that there's a significant difference between the means of the underlying distributions of the two samples you've taken? And with the F test, we're asking, is there a significant difference between the underlying variances of the two distributions?
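A minimal sketch of both of those tests, on hypothetical data (the samples and the names here are illustrative, not from the course):

```python
import numpy as np
from scipy import stats

# Hypothetical measurements of one output variable under two conditions.
a = np.array([10.1, 9.8, 10.4, 10.0, 9.9])
b = np.array([10.6, 10.3, 10.8, 10.5, 10.7])

# T test: is there evidence of a difference between the underlying means?
t_stat, t_p = stats.ttest_ind(a, b)  # assumes equal variances by default

# F test: is there evidence of a difference between the underlying variances?
f_stat = np.var(a, ddof=1) / np.var(b, ddof=1)
dfa, dfb = len(a) - 1, len(b) - 1
# Two-sided p value for the variance ratio.
f_p = 2 * min(stats.f.cdf(f_stat, dfa, dfb), stats.f.sf(f_stat, dfa, dfb))

print(t_stat, t_p, f_stat, f_p)
```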
With a control chart, that's just a running hypothesis test. So you take many samples, and you're constantly doing a hypothesis test to see whether the process is at the place where you want it to be-- some predetermined operating point that you're content is optimal, I suppose. And with the chi-squared and the T-squared charts, again, the same is true-- many samples.

But when we come to ask, well, how many inputs we were dealing with in these tests, we don't necessarily have any idea. We might be told that some input to the process was changed, here are some samples, is there evidence that the input has had an effect on the output? But sometimes, we might just have samples and have no clear understanding of what the relevant inputs were. So there's nothing inherent in these tools that allows us to get at the relationship between inputs and outputs. And similarly, we don't know how many levels or different values any given input to a process might take-- how many voltages we set a particular control voltage to. And so I've said that they're unknown. But in a sense, I put two with a question mark. Because you might think, in a very clearly defined T test, say, you might say, well, an operator varies one input. And here are some samples. Did the output change? So in that case, you'd have two samples-- two values for a particular input. And that might be the most precisely defined type of test you can do.

We've also talked about yield modeling, where we're starting to try to get some physical concept of the link between properties of the process-- the actual nature of the process-- and what comes out at the other end in terms of yield; and process capability, where we're trying to ask questions about how well the process performs based on being centered at a particular output mean. But those tools, again, don't necessarily let us model the process, or indeed give us any insight with which to improve it.
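As a quick reminder of the capability index mentioned here, a minimal sketch, assuming hypothetical specification limits and data (the numbers are illustrative):

```python
import numpy as np

# Hypothetical output measurements and specification limits.
y = np.array([9.9, 10.2, 10.0, 10.1, 9.8, 10.3])
lsl, usl = 9.5, 10.5

mu, sigma = y.mean(), y.std(ddof=1)
# Cpk penalizes a process mean that sits off-center between the limits.
cpk = min(usl - mu, mu - lsl) / (3 * sigma)
print(cpk)
```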
So we need some tools that will start letting us do that, especially if we look at some typical data. Now you'll recall this set of data; this was used in problem set two. This is from an injection molding experiment done at MIT for the course a couple of years ago. And so what we've got here is an output, which is diameter. We've got one output that we've measured, but we've got two input parameters that we think are relevant. There may be others. And in fact, it's almost certain that there are. Because when we look at the sample of parts with a given combination of these input parameters, velocity and hold time, we see that there is substantial variation. And that is either due to inherent random variation in properties of the materials, or some input variable that we haven't fully controlled. But we've got two particular input parameters, and two values that we happen to have chosen for each of them.

Now, we could start to look for significant differences between the means for each setting by doing a whole raft of T tests or F tests. But you know, that would become tedious quite quickly, especially if you're starting to look at many variables at many settings. So we need a more systematic way of going about this, and eventually being able to say which of the inputs is important in determining the value of the output.

Eventually, once we have those tools, well, there are a few aims that we might have. And first among those is developing a process model. So we want to be able to relate inputs and disturbances-- inputs that we can't control-- to the outputs. And we want to know which inputs are relevant-- which inputs, when they vary by a reasonable amount, actually have a significant effect on the outputs that we're interested in understanding. Once we've done that, we can think about optimizing the process.
A target that we might have is to maximize Cpk, or minimize the cost associated with deviation from our target output value. We might have particular models that allow us to look at the impact of, say, a candidate process change on the output mean, or on the output variance. These are all saying essentially the same thing: if we can model a process, we can control it better.

If you had perfect understanding of the physics surrounding the process that you had, you could work forward from first principles. And you could think about every physical phenomenon going on in your system and predict what the output would be. But we don't. We don't have an understanding of all the variables that are at play. And what this means is that to get control of a process, we need to work backwards from the physical output. We need to take measurements of what the process is doing, and model backwards to understand what the inputs do.

So when we're doing this kind of empirical modeling, there are some questions we need to ask ourselves. What are we trying to achieve? Do we want to minimize quality loss, maximize Cpk? Do we just want to reduce the variance for some specific reason? We need to define our system very carefully. What is the output? Are there multiple outputs? Are there multiple inputs? What do we want to vary? What does it cost us least to vary in exploring the process space, or what does it cost us least to control in the factory? And we need to come up with some sort of form for the model that we can use. We haven't yet described any particular model. This function phi we've been talking about in previous lectures hasn't been specific to any particular process. What's also important is to think about how easy it is to collect data from the process.
If you have 20 variables and they can each vary over a factor of ten, one needs to start thinking about how one can simplify the process of doing these experiments and working towards a model, especially if there are a lot of measurements of the outputs involved-- a lot of laborious measurement.

So going back to that chart of the tools we have so far, what we're going to do today is add a new one. And that is the analysis of variance. Now, the key advantage of this is that we're now starting to be specific about looking at the actual inputs that we're interested in and how they impact the outputs. And so we might just be interested in studying one input. But equally, there are ANOVA techniques that will let us study multiple input parameters. Is there a question in Singapore?

AUDIENCE: Yeah, I have a question.

PROFESSOR: OK.

AUDIENCE: What is a chi-squared chart, and why does it have two outputs?

PROFESSOR: Right, well this I think we covered two lectures ago-- certainly the T-squared chart. What that does is, if you have two output variables that you are maintaining control charts for, but they are functionally interdependent-- you might not know exactly how, but you know that there's some relationship between them-- a part that is out of specification may not trigger an out-of-control alarm on one of those control charts. But if you plot both of them together and look at the results in conjunction with one another, that allows you to infer more about whether the process is in or out of control. It's covered in Montgomery. We may have a problem set question about it as well to help you develop understanding.

Anyway, I'm sorry-- I want to focus on ANOVA today. So we have the option of looking at more than one input and taking more than one input value for each input. And when we have different levels for an input, we're going to refer to those as treatments.
So a variable may be treated in one of a number of ways. The outputs-- well, we're dealing in this particular case with one particular quantity that we're going to measure and analyze. And we're going to have two or more samples that we are able to deal with.

So this is what we're going to do. We're going to look first at the ANOVA technique applied to one input variable. And then, we're going to work through an example of that. Then we'll look at multivariable analysis of variance-- in this particular case, we're going to look at two input variables and how we look at the interactions of those two variables in setting the output.

So here, we're going to start with the single-variable ANOVA. And this is a diagram that shows you what might happen to the output of a process-- the output being this axis-- as we vary one particular input parameter. So we're assuming we have good control over this, and we can command that input parameter to be whatever we choose. In this particular case, we're saying that there are three possible values we might be interested in-- A, B, and C. And so these are the three treatments. And what we've shown here, schematically, is the distribution of the output variable under each of those three conditions.

We're assuming that the variation in the output for each particular treatment is normally distributed, but that the mean of the output is determined directly by the value of this input parameter. So there's a one-to-one deterministic relationship between the mean and the value of the input. So these tau values-- tau 1, tau 2, and tau 3-- are actually determined directly by the populations. But then, if we imagine doing experiments under these three conditions, we're going to set the input to the value for sample A, say. And then, we're going to take a number of output parts and measure them, and those outputs will have a sample mean, y1.
And there will be, for each of these samples, a sample mean. We can take all the data together and come up with a grand average for the collected data. Then the differences between those individual treatment sample means and the grand mean will be estimators for the values of tau-- the effects of the different treatments that we're applying.

So here's what we're doing. We're considering multiple settings for some variable of interest. And there are these real effects-- deltas between the output mean for different conditions. And we've observed these samples. So the question is, based on these observations, we want to be able to see whether the observed differences in sample means are real. So we're going to do some kind of hypothesis test. The null hypothesis is going to be that changing the input variable doesn't make a blind bit of difference to the output mean. And the alternative hypothesis is that there is a significant difference, which we'll probably want to model somehow later.

What's important in ANOVA is this assumption that the variance of the output for each treatment is equivalent. And here, we've written it as sigma sub-0 squared. So the width of each of those subpopulations has to be assumed to be equal. And when one is doing ANOVA, that's something that one needs to check. And we'll talk about that a little later.

Now, I'm going to describe the underlying assumption that allows the analysis of variance to be done in this way. And this underlying assumption is that you can model the value of a sample part as the sum of three quantities. The first quantity is the process mean, around which the output means of all the possible treatments center. The second is the effect of the treatment. So that's how far the output mean is offset when you have this particular treatment, tau, taking place-- whether there's a certain voltage setting of the inputs, or whatever.
And the third is this residual term, which we write with epsilon. And that describes the random variation in the output beyond the systematic effect of the treatment. Sometimes we take this mu and the tau sub-t together and write it as mu sub-t. That's the output mean for a particular treatment.

At the bottom of this graph, I've just sketched that out to show what I mean. Here is the overall process mean. Here is the output mean for a particular treatment. And if we take this particular sample, for example, then there's a residual-- a difference between the treatment mean and the actual value of that sample, which is epsilon sub-ti. Sorry, that squared shouldn't be there. So the sum of these three quantities is thought of as giving us the output quantity.

Now, I'm going to describe the three steps that one is going to go through to test this hypothesis that changing the value of the input has no effect on the treatment means. The first step is to come up with an estimate of the underlying population variance. That involves looking at each treatment sample in turn-- so looking at this sample, then this sample-- looking at the sample variance individually for each sample, and then taking a pooled estimate of the random component of variation based on all those sample variances. That's the first step.

The second step is to look at the between-group variation. That means looking at the sample means, which I've denoted with these horizontal lines here, looking at the variance of those, and making an inference from that about how much variation there is in the output as we change the input setting. Once we've got those two estimates-- within-group variation, between-group variation-- we want to compare those estimates, and infer from that whether there is a difference between the different treatments.
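In symbols (standard one-way ANOVA notation, matching the lecture's mu, tau, and epsilon, with k treatments, n_t parts in treatment t, and N parts in total), the model and the two estimates just described are:

```latex
y_{ti} = \mu + \tau_t + \epsilon_{ti}, \qquad
\epsilon_{ti} \sim N(0, \sigma_0^2), \qquad
\mu_t = \mu + \tau_t

s_r^2 = \frac{1}{N-k} \sum_{t=1}^{k} \sum_{i=1}^{n_t} \left( y_{ti} - \bar{y}_t \right)^2
\quad \text{(within groups)}

s_t^2 = \frac{1}{k-1} \sum_{t=1}^{k} n_t \left( \bar{y}_t - \bar{\bar{y}} \right)^2
\quad \text{(between groups)}
```

where the double bar denotes the grand mean of all the data.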
If there is a difference, then we're going to expect the second estimate that we make to be larger than the first estimate. And just by looking at the schematic in the top right of the view graph, you can perhaps see intuitively how that might be. The variance associated with the between-group changes is going to be larger if there's a significant impact of the input on the output. But if not, then those two variance estimates should be equal. There's no effect had by the input.

I'm going to go through each of those steps in turn, and go through the mechanics of how we're going to do that. I've already said that we're assuming that the output for each group-- and by group, I mean the parts sampled for a given treatment, so for a given setting of the input variable-- we're saying that that variation is described by a variance of sigma 0 squared, and it's the same for each group. That may or may not be true, in which case we need to go back and think again.

So the second bullet point here is giving us the sum of the squared deviations for the t-th group. And this is just saying: we have a sample value for the output, y sub-ti. We've taken a mean for the group, y bar sub-t, which is here, and we're summing the squares of the deviations between these values and the sample mean. Once we've done that, it's a simple step to go to the group variance. So this is an unbiased estimate of the group variance: the sum of the squares divided by the number of degrees of freedom in that treatment. That's the number of parts sampled minus 1. And the minus 1 is because the value of the mean, y bar sub-t, is dependent on all the individual y sub-ti's. So there's one less degree of freedom than there are data points.

Once we've got each of those group variances based on the data, we're going to take a pooled estimate of the common within-group variance. A short sketch of that pooling follows.
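A minimal sketch of the pooled estimate, assuming one hypothetical sample of parts per treatment (the numbers are illustrative, chosen only to be consistent with the summary statistics quoted in the worked example later in the lecture):

```python
import numpy as np

# One hypothetical sample of outputs per treatment.
groups = [np.array([10.0, 12.0, 11.0]),
          np.array([10.0, 12.0, 11.0]),
          np.array([6.0, 10.0, 8.0])]

# Pooled (weighted) estimate of the common within-group variance:
# sum the squared deviations within each group, then divide by the
# total number of degrees of freedom, N - k.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
df_within = sum(len(g) - 1 for g in groups)
s_r2 = ss_within / df_within
print(s_r2)  # 2.0 for these illustrative numbers
```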
So inherent in this, again, is the assumption that the variance can be modeled as being equal for each group. (And actually, that should be a sub-2.) What we're doing here is effectively taking a weighted average based on the number of data points that we have for each group. So we might take more samples for a particular treatment, but we can deal with that by using the correct number of degrees of freedom for each treatment. And what we get out of that is this pooled estimate of within-group variance, s sub-r squared.

So that was the within-group variance. And now, we're going to talk about the between-group variance. As I said, we're testing the hypothesis that varying the input does not cause the output mean to vary. If that were true, then what I've sketched on the bottom left here would be the case. You would have three populations that looked absolutely identical. And it wouldn't matter what value you chose for the input-- you'd get the same output. However, when you sample, we know that, of course, because there's random variation in the output, the sample mean will not always be the same. That will vary. The sample mean will, itself, be normally distributed with a variance of sigma 0 squared over n, where n is the number of data points in the sample. So the bottom right is looking a little bit like a control chart, where the y-axis could be thought of as being linked to sigma 0 squared divided by the sample size.

And so what this all means is that if the hypothesis were true-- inputs don't affect the output mean-- then we could form a second estimate of the within-group variation by looking at it this way: we take the treatment means, which are these values here, subtract the grand mean of all the data from each treatment mean in turn, and take the square of those deviations.
And this n sub-t here accounts for the fact that, because of the central limit theorem, the variance of the sample mean is inversely proportional to the sample size; and k is the number of different treatments. So in this case, k is 3. So we're making this estimate, s sub-t squared, which, in the case I've sketched here, we would expect to be equal to s sub-r squared.

However, if that isn't the case-- if there is a significant relationship between the input and the output-- then s sub-t squared will be larger than s sub-r squared. And that's the quantity by which it's larger: this sum here. And this value, tau sub-t, is the real difference-- the actual systematic change in output mean that occurs when we vary the input. So we're causing an inflation in the value of s sub-t squared based on the fact that changing the input changes the output mean.

So we've got these two estimates-- the within-group variation and the between-group variation. We want to look at them, compare them, and say: is there a significant impact on the output when we go through these different treatments? So we're comparing two variances. And that means that what we're interested in doing is an F test. We're asking ourselves, how big is the chance that, if the null hypothesis were true, we would observe these two variances? We're going to make the test statistic the ratio s sub-t squared divided by s sub-r squared. So that's usually going to be a value greater than 1. And we reject the null hypothesis if that value is significantly greater than 1.

What's worth remembering here-- go ahead.

AUDIENCE: When you say significantly, you mean factor of ten, factor of--

PROFESSOR: Well, we're going to do an F test at a chosen level of significance to work that out. And that's exactly what this slide says, so good question. We have this F statistic. And we're now going to interpret that.
We're going to pick a particular significance level and ask: at the 5% level, say, is it significant that there's a relationship between the treatment that's chosen and the output mean? It's important to remember that this is a one-sided F test that we're interested in doing. The possibility we're considering is that sigma sub-t squared is greater than sigma sub-r squared. The case where the real value of the between-group variance is less than the real value of the within-group variance is not something that has a physical meaning. So it's a one-sided F test where we're expecting the statistic to be bigger than 1-- much bigger than 1. And here are the degrees of freedom that we need to use for that evaluation. And that's, based on what we've learned so far, a fairly straightforward test to do.

We can also make this additional estimate, which is based on the sum of the squared deviations from the grand mean among all samples. And that can be useful in a number of ways, although it's not central to evaluating this F statistic and testing the hypothesis that we've described.

This slide is pretty important. It shows you how one usually would lay out an ANOVA analysis. We would put real quantities in the spaces shown by expressions here. We'd evaluate the sums of squares between treatments and within treatments. And we'd figure out what the number of degrees of freedom was. We'd look at the estimates of our variances, we'd take their ratio, and we would then use F tables to find the probability that, by chance, that value of f sub-0 would be observed if the null hypothesis of zero treatment effects were true.

And it's worth remembering this word "residual." Before, when I highlighted that quantity epsilon-- the variation associated with randomness, the thing that we're not trying to model in looking at these different treatments-- that is accounted for by this sum of squares within treatments.
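Evaluating the one-sided test is short once the two mean squares are in hand; a sketch using scipy, with the degrees of freedom and F ratio that will appear in the worked example below:

```python
from scipy import stats

alpha = 0.05                 # chosen significance level
k, N = 3, 9                  # treatments and total observations, as in the example below
df_t, df_r = k - 1, N - k    # between-group and within-group degrees of freedom

f0 = 4.5                     # observed ratio s_t^2 / s_r^2
f_crit = stats.f.ppf(1 - alpha, df_t, df_r)  # upper critical value, about 5.14
p_value = stats.f.sf(f0, df_t, df_r)         # one-sided p value, about 0.064

print(f_crit, p_value, f0 > f_crit)          # reject H0 only if f0 exceeds f_crit
```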
What I'm going to do now is work through a very simple example to show how this single-variable ANOVA can be done. And it will hopefully give you a feel for the steps that have to be gone through. This sort of thing is automated in a number of programs; there are macros available in Excel to do it. But it's worth knowing exactly what the steps are so that you understand what's going on.

What we've got here are three samples for each of three treatments of a particular variable. So this is the output that we've measured, in some arbitrary units. And here are the treatments. For each particular treatment, we have three samples. We evaluate a sample mean. Here, it's 11. Here, it's 8. And we take the grand average of all samples, and we know that as well.

We can also evaluate the sum of squared deviations of the sample values from the sample mean. And in this case, the sum of squares for treatment one is the square of that difference, plus the square of that difference, plus 0, because that particular sample lies on the mean. We do that for each particular treatment, and evaluate these sums of squares. We know that there are two degrees of freedom for each particular sample, so we can go straight to estimates of the within-group variances for each group or treatment. And based on that, we can evaluate that pooled estimate of within-group variance, which is the thing that's meant to be excluding the effect of the treatments.

Now what we're going to do is make the second estimate-- the between-group variance estimate. And here, we're interested purely in the sample means that we've evaluated. Take this value, this value, and this value. That sample mean is 11. The grand average for all samples is 10.
We're looking at that difference, and we're squaring that deviation here, and then we're scaling it by the number of samples for that particular treatment, which is 3-- three samples. And that's, again, dealing with the fact that the variance of a sample mean is inversely proportional to the number of data points in the sample. So we do this for each treatment-- t equals 1, 2, and 3-- and we evaluate our estimate of the between-group variance. Folded into that estimate will be the effect of the random variation within the group and the effect of changing the treatment.

We have these two estimates-- s sub-r squared and s sub-t squared. Now, we're going to do the F test to see whether there's significant evidence that changing the input changes the output. Here's how we might lay it out in Excel. We've evaluated the sums of squares between groups and within groups. We've got the degrees of freedom here. The mean squared value is just taken from over here. And the F statistic is just the between-group estimate divided by the within-group estimate-- so 4 1/2. And then we can go to the tables and say, at the 5% level, what's the critical value of F? It turns out to be 5.14. So in this case, we would, at the 5% level, accept the null hypothesis that there was no significant effect of inputs on outputs.

AUDIENCE: Is the ratio always larger than 1? How can it be smaller--

PROFESSOR: There's a chance that it will be smaller than 1, even if the actual output means were unaffected by the input. But again-- so if this is the value of f and this is the PDF, this would be what the F distribution looks like. So there are values of f that are less than 1, down here. But at the 5% level, you would only reject the hypothesis-- so f crit here is 5.14 at the 5% level.
You would reject the hypothesis that there was no significant impact of inputs on outputs-- of this particular input, obviously-- if the value of f were greater than that. So yeah, there's a small chance that f will be less than 1. I think that's right. Is that right?

AUDIENCE: Yeah, certainly. If the observed f is less than 1, that would for sure tell you that the treatments are not having an effect. It means your treatment deltas are so small that, in fact, you got lucky: your sample, with zero treatment effects, showed an even smaller between-group variance than the within-group variance, which happened purely by chance. So if you actually observed an f less than 1, exactly as Hayden said, you would reject the alternative hypothesis. You'd just have to say, yeah, there's no effect.

PROFESSOR: OK, well, that is the whole example. That's what you would do if you had nine data points and you could write it down on one bit of paper. Obviously, not everything is that simple. Yeah?

AUDIENCE: OK, go back to the previous slide. So the p value means that if alpha is larger than 0.064, we will reject the null hypothesis, right?

PROFESSOR: Yeah, the p value is the level of significance at which you would just reject the null hypothesis. So that's the level of significance for which f is equal to f crit. And in this case, it's 6.4%. So if the level of significance is 5%, that places a more stringent requirement to reject the null hypothesis than 6.4% does. So you would keep the null hypothesis at the 5% level.

AUDIENCE: So in fact, this is a good example where, if I asked you just directly the question, did the treatment have an effect, your answer is dependent on what level of confidence you wanted [INAUDIBLE] the evidence tells you there was an effect. If I asked you, I want to be 95% confident there was an effect, your answer would be, I don't have enough evidence. No, there's no effect.
If I asked you instead to be 90% confident that there's an effect, your answer is, yes, there is an effect, to 90% confidence. So you're right at that interesting point with that 6.4-- what is it?

PROFESSOR: Yeah, 6.4.

AUDIENCE: 6.4%. And if you just look at the scatter of the data, it's kind of fuzzy. You don't have a lot of data-- you only have three data points in each sample. So that whole idea of a confidence interval that we talked about early in the term-- it's very important.

PROFESSOR: OK, any more questions from anyone? Right. Well, we mentioned at the start that inherent in doing this analysis is the assumption that the within-group variance is the same for every group. (And this actually should be a sigma, not a 3/4.) It's important to check that to make sure that the ANOVA is valid. And there are ways of looking at the problem differently if you can't really make that assumption.

But what we also want to do is try to make sure that the analysis we're doing really does capture as much as possible of the treatment effect. And there are various ways of doing that. We can take the residuals-- in other words, the difference between the sample value and the sample mean. We can plot those residuals against the time order in which the samples were taken. We could look at the distribution of those residuals-- do a Q-Q norm plot or some other kind of plot. And we can make checks that this underlying assumption is reasonable. So that's all I'm going to say about this, but I think it's an important thing to bear in mind.

So, ANOVA for one variable: though it takes quite a lot of effort to get it conceptually, the actual mechanics of doing it are fairly straightforward-- a compact sketch of the whole calculation follows.
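An end-to-end sketch of the single-variable ANOVA just worked through. The nine data points are hypothetical, chosen only to reproduce the summary statistics quoted above (treatment means 11, 11, 8; grand mean 10; F = 4.5; p of about 0.064):

```python
import numpy as np
from scipy import stats

# Hypothetical data: three samples for each of three treatments.
groups = [np.array([10.0, 12.0, 11.0]),
          np.array([10.0, 12.0, 11.0]),
          np.array([6.0, 10.0, 8.0])]

k = len(groups)
N = sum(len(g) for g in groups)
grand = np.concatenate(groups).mean()                   # 10.0

# Between-group (treatment) mean square.
ss_t = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
s_t2 = ss_t / (k - 1)                                   # 9.0

# Within-group (residual) mean square, pooled over treatments.
ss_r = sum(((g - g.mean()) ** 2).sum() for g in groups)
s_r2 = ss_r / (N - k)                                   # 2.0

f0 = s_t2 / s_r2                                        # 4.5
p = stats.f.sf(f0, k - 1, N - k)                        # about 0.064
f_crit = stats.f.ppf(0.95, k - 1, N - k)                # about 5.14
print(f0, p, f_crit)

# scipy's built-in one-way ANOVA agrees with the hand calculation:
print(stats.f_oneway(*groups))
```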
770 00:47:55,170 --> 00:47:57,060 There are good reasons why we'd be 771 00:47:57,060 --> 00:48:02,640 interested in exploiting a number of inputs 772 00:48:02,640 --> 00:48:07,295 to achieve the output we want. 773 00:48:07,295 --> 00:48:10,050 If we have several variables we can control, 774 00:48:10,050 --> 00:48:11,610 we can do things like controlling 775 00:48:11,610 --> 00:48:14,790 the output mean and the output variance 776 00:48:14,790 --> 00:48:17,400 to be what we want them to be. 777 00:48:17,400 --> 00:48:26,580 Or we can control the output to improve Cpk, 778 00:48:26,580 --> 00:48:29,820 while at the same time reducing sensitivity 779 00:48:29,820 --> 00:48:32,268 to some other disturbance. 780 00:48:32,268 --> 00:48:41,070 And that is why we need to be able to do analysis of variance 781 00:48:41,070 --> 00:48:43,260 for multiple inputs. 782 00:48:47,610 --> 00:48:52,860 One way of modeling the effect of these multiple inputs 783 00:48:52,860 --> 00:48:58,050 would be to have this simple additive model, where 784 00:48:58,050 --> 00:49:06,900 we are saying that the output value is 785 00:49:06,900 --> 00:49:10,020 the sum of four quantities-- 786 00:49:10,020 --> 00:49:13,755 the process mean, as we mentioned before, and then 787 00:49:13,755 --> 00:49:17,820 two separate treatment effects, tau sub-t 788 00:49:17,820 --> 00:49:21,880 as we described for the one variable case, and then 789 00:49:21,880 --> 00:49:28,410 an equivalent value for another input, beta sub-q. 790 00:49:28,410 --> 00:49:31,560 And in this case, we would say that there 791 00:49:31,560 --> 00:49:36,450 were k possible treatments for the first variable, 792 00:49:36,450 --> 00:49:38,890 and n possible treatments for the second. 793 00:49:38,890 --> 00:49:41,170 So you could have any one of the treatments 794 00:49:41,170 --> 00:49:43,560 for the first variable, with any one 795 00:49:43,560 --> 00:49:47,070 of the treatments for the second variable in combination. 796 00:49:47,070 --> 00:49:51,390 The fourth quantity is, again, this residual-- this random 797 00:49:51,390 --> 00:49:53,400 variation that's not something we're 798 00:49:53,400 --> 00:50:01,620 trying to account for with these two input variables. 799 00:50:01,620 --> 00:50:05,860 And again, this is what we call a fixed effects model. 800 00:50:05,860 --> 00:50:08,940 So we're saying that there's a deterministic relationship 801 00:50:08,940 --> 00:50:15,990 between the input values and the values of tau sub-t and beta 802 00:50:15,990 --> 00:50:17,310 sub-q. 803 00:50:17,310 --> 00:50:21,810 There are cases in which there's a probabilistic relationship 804 00:50:21,810 --> 00:50:24,930 between the input and those treatment effects. 805 00:50:24,930 --> 00:50:26,430 There are ways of dealing with that 806 00:50:26,430 --> 00:50:27,900 that we won't describe here. 807 00:50:30,790 --> 00:50:35,000 The model that is up on the board now is pretty simple. 808 00:50:35,000 --> 00:50:38,590 It assumes that the effects of these two inputs are additive. 809 00:50:38,590 --> 00:50:41,380 There are plenty of cases, plenty of processes, where 810 00:50:41,380 --> 00:50:44,290 that simply isn't realistic, 811 00:50:44,290 --> 00:50:49,480 and there's some synergism between the two inputs.
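Before turning to that, it may help to write the additive model out in symbols. The bar notation below for the various averages is a rendering consistent with the description just given, not a copy of the slide itself.

```latex
% Additive fixed-effects model: output with the first input at
% treatment t (t = 1, ..., k) and the second at treatment q (q = 1, ..., n)
y_{tq} = \mu + \tau_t + \beta_q + \epsilon_{tq}

% Natural estimates: the grand mean, and the deviations of the
% factor-level means from the grand mean
\hat{\mu} = \bar{\bar{y}}, \qquad
\hat{\tau}_t = \bar{y}_{t} - \bar{\bar{y}}, \qquad
\hat{\beta}_q = \bar{y}_{q} - \bar{\bar{y}}
```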
812 00:50:49,480 --> 00:50:52,990 An example that I can think of that's 813 00:50:52,990 --> 00:50:55,420 been relevant in my research is 814 00:50:55,420 --> 00:51:00,970 in modeling the etching rate in a silicon plasma etching 815 00:51:00,970 --> 00:51:03,910 chamber, where there are-- 816 00:51:03,910 --> 00:51:07,480 I guess you could think of it as there being two really 817 00:51:07,480 --> 00:51:09,940 important input quantities, which 818 00:51:09,940 --> 00:51:16,120 are the flux of ions of reactant onto the surface of the wafer, 819 00:51:16,120 --> 00:51:20,920 and the flux of uncharged fluorine radicals, which 820 00:51:20,920 --> 00:51:24,462 are the chemical species that are responsible for chemical 821 00:51:24,462 --> 00:51:25,420 etching of the circuit. 822 00:51:25,420 --> 00:51:28,840 So you have these two fluxes approaching 823 00:51:28,840 --> 00:51:30,130 the surface of the wafer. 824 00:51:30,130 --> 00:51:33,910 And it's not just the case that the rate of removal of silicon 825 00:51:33,910 --> 00:51:38,603 is proportional to some weighted sum of those two fluxes. 826 00:51:38,603 --> 00:51:40,270 You can't really have etching unless you 827 00:51:40,270 --> 00:51:43,510 have a substantial flux of both ions and these fluorine 828 00:51:43,510 --> 00:51:45,700 neutrals, because the ions provide the energy 829 00:51:45,700 --> 00:51:47,330 for the reaction to occur. 830 00:51:47,330 --> 00:51:50,950 And that means that the model I just described 831 00:51:50,950 --> 00:51:52,510 wouldn't get us very far. 832 00:51:52,510 --> 00:51:56,500 We need to be able to deal with an interaction 833 00:51:56,500 --> 00:51:58,615 between those two inputs. 834 00:52:02,800 --> 00:52:05,880 This is how we can represent it. 835 00:52:05,880 --> 00:52:10,680 We can add in a fifth term, which 836 00:52:10,680 --> 00:52:18,060 is specific to the combination of treatments t and q. 837 00:52:21,910 --> 00:52:26,950 Let's look at how we might incorporate that into ANOVA. 838 00:52:31,600 --> 00:52:42,480 Here, we look at the within group variance 839 00:52:42,480 --> 00:52:43,560 for a particular-- 840 00:52:47,730 --> 00:52:50,490 for a particular input variable. 841 00:52:50,490 --> 00:52:53,880 So this is for the first input variable. 842 00:52:53,880 --> 00:52:55,380 The treatment is tau. 843 00:52:55,380 --> 00:52:58,260 And then, we have the equivalent expression 844 00:52:58,260 --> 00:53:04,350 for what happens with the second variable. 845 00:53:04,350 --> 00:53:08,880 Then, we have this interaction term, 846 00:53:08,880 --> 00:53:13,320 which is an estimate of the variance that's to do with-- 847 00:53:13,320 --> 00:53:16,570 that can't be explained by this additive idea, 848 00:53:16,570 --> 00:53:23,370 and finally, the residuals-- taking the actual values 849 00:53:23,370 --> 00:53:30,090 of the outputs minus the grand mean, and squaring those residuals. 850 00:53:30,090 --> 00:53:39,330 And what this leads us to is a two-way ANOVA table 851 00:53:39,330 --> 00:53:42,750 where we can evaluate three F statistics, 852 00:53:42,750 --> 00:53:48,450 and apply three F tests to look for significant relationships 853 00:53:48,450 --> 00:53:54,570 between factor 1 and the output, factor 2 and the output, 854 00:53:54,570 --> 00:53:59,010 and for a significant amount of interaction between those two 855 00:53:59,010 --> 00:54:00,540 factors in setting the output. 856 00:54:00,540 --> 00:54:03,763 Katerina, you had a question?
857 00:54:03,763 --> 00:54:14,810 AUDIENCE: [INAUDIBLE] or whether [INAUDIBLE] 858 00:54:14,810 --> 00:54:16,310 graphic you were showing us earlier, 859 00:54:16,310 --> 00:54:20,960 with a [INAUDIBLE] Would this be between groups, 860 00:54:20,960 --> 00:54:23,270 or is it within one? 861 00:54:23,270 --> 00:54:26,553 PROFESSOR: Yeah, this is-- 862 00:54:26,553 --> 00:54:28,220 sorry, this is between groups, isn't it? 863 00:54:28,220 --> 00:54:29,480 AUDIENCE: Could you repeat the question? 864 00:54:29,480 --> 00:54:30,355 PROFESSOR: Oh, right. 865 00:54:30,355 --> 00:54:34,640 Yes, Katerina was asking, are these estimates 866 00:54:34,640 --> 00:54:38,510 within group estimates or between group estimates? 867 00:54:38,510 --> 00:54:41,480 And yes, absolutely, you're right. 868 00:54:41,480 --> 00:54:44,210 These are actually between group estimates. 869 00:54:44,210 --> 00:54:49,700 So we're taking a sample mean and looking at its deviation 870 00:54:49,700 --> 00:54:52,730 from the grand mean for all data. 871 00:54:52,730 --> 00:54:56,330 And we're making an estimate, based on that, 872 00:54:56,330 --> 00:54:59,090 of the between group variance. 873 00:54:59,090 --> 00:55:04,790 So this is folding into it some of the effect of varying 874 00:55:04,790 --> 00:55:06,560 the input. 875 00:55:06,560 --> 00:55:08,780 Question in Singapore? 876 00:55:08,780 --> 00:55:11,870 AUDIENCE: Yes, for s sub-I squared, shouldn't it 877 00:55:11,870 --> 00:55:21,750 be y sub-tq minus y sub-t, minus y sub-q, plus y, instead of minus y, 878 00:55:21,750 --> 00:55:25,220 according to a slide you showed just before this slide? 879 00:55:25,220 --> 00:55:26,330 PROFESSOR: Did I show-- 880 00:55:26,330 --> 00:55:28,340 oh, yeah, OK. 881 00:55:28,340 --> 00:55:29,890 Yes, I'm sorry. 882 00:55:29,890 --> 00:55:30,810 That's a mistake. 883 00:55:30,810 --> 00:55:31,940 Yeah, you're quite right. 884 00:55:31,940 --> 00:55:33,780 Thank you very much. 885 00:55:33,780 --> 00:55:34,983 Absolutely. 886 00:55:38,770 --> 00:55:43,480 Anyway, we have these estimates of the between group variances 887 00:55:43,480 --> 00:55:47,620 and the interaction variance. 888 00:55:47,620 --> 00:55:50,620 And we have these three F statistics 889 00:55:50,620 --> 00:55:57,100 that we can use separately to test separate null hypotheses: 890 00:55:57,100 --> 00:55:59,500 that there's no impact of factor 1 891 00:55:59,500 --> 00:56:02,200 on the output, no impact of factor 2 on the output, 892 00:56:02,200 --> 00:56:06,220 and to test the hypothesis that there's 893 00:56:06,220 --> 00:56:11,695 no important interaction between those factors. 894 00:56:14,780 --> 00:56:22,990 So now, I want to give an example of where this analysis 895 00:56:22,990 --> 00:56:26,140 could be relevant and useful. 896 00:56:26,140 --> 00:56:31,120 Now, we often think about the relationship between inputs 897 00:56:31,120 --> 00:56:36,890 and outputs being described in a setting where 898 00:56:36,890 --> 00:56:40,100 we're varying some inputs over time, 899 00:56:40,100 --> 00:56:43,610 and the output is changing over time. 900 00:56:43,610 --> 00:56:46,830 The output mean could be shifting over time. 901 00:56:46,830 --> 00:56:49,160 So we have a machine, and we're trying 902 00:56:49,160 --> 00:56:51,920 different settings for it. 903 00:56:51,920 --> 00:56:59,990 But in fact, in a lot of cases in semiconductor process control, 904 00:56:59,990 --> 00:57:03,740 we're interested in spatial variations.
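Before that spatial example, here is a sketch of how the two-way table just described could be computed, again in illustrative Python with invented data and layout. Note that the interaction sum of squares uses the corrected form just pointed out from Singapore: cell mean, minus both factor-level means, plus the grand mean.

```python
# A sketch of two-way ANOVA with replication: k levels of factor 1,
# n levels of factor 2, m replicates per (t, q) cell. Data are random
# here, purely for illustration, with a factor-1 effect built in.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k, n, m = 3, 4, 2
y = rng.normal(10.0, 0.5, size=(k, n, m))
y[2] += 1.0                      # make level t=2 of factor 1 differ

grand = y.mean()
row = y.mean(axis=(1, 2))        # means per level of factor 1
col = y.mean(axis=(0, 2))        # means per level of factor 2
cell = y.mean(axis=2)            # means per (t, q) combination

ss_a  = n * m * ((row - grand) ** 2).sum()                       # factor 1
ss_b  = k * m * ((col - grand) ** 2).sum()                       # factor 2
ss_ab = m * ((cell - row[:, None] - col[None, :] + grand) ** 2).sum()
ss_e  = ((y - cell[:, :, None]) ** 2).sum()                      # residual

df_a, df_b, df_ab, df_e = k - 1, n - 1, (k - 1) * (n - 1), k * n * (m - 1)

for name, ss, df in [("factor 1", ss_a, df_a),
                     ("factor 2", ss_b, df_b),
                     ("interaction", ss_ab, df_ab)]:
    f = (ss / df) / (ss_e / df_e)
    p = stats.f.sf(f, df, df_e)
    print(f"{name:12s} F = {f:6.2f}  p = {p:.4f}")
```

Each of the three printed rows corresponds to one of the three F tests: factor 1 against the output, factor 2 against the output, and the interaction between them.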
905 00:57:03,740 --> 00:57:08,720 And you can think of the case I'm going to show you. 906 00:57:08,720 --> 00:57:13,520 This is for a metal etching process where we're-- 907 00:57:16,160 --> 00:57:21,590 some work that we've started with one of our collaborators. 908 00:57:21,590 --> 00:57:25,580 The idea is that we're etching a metal layer 909 00:57:25,580 --> 00:57:29,540 to form interconnect wires, and we want 910 00:57:29,540 --> 00:57:33,470 to be able to model how uniformly that etches 911 00:57:33,470 --> 00:57:37,310 across a wafer, depending on how densely the features are 912 00:57:37,310 --> 00:57:41,480 packed, what their individual sizes are, 913 00:57:41,480 --> 00:57:45,050 and where they're situated on the wafer. 914 00:57:45,050 --> 00:57:47,720 One problem that you can encounter 915 00:57:47,720 --> 00:57:57,950 when processing these metal layers is this: 916 00:57:57,950 --> 00:58:02,390 you're masking a blanket layer of the metal with photoresist, 917 00:58:02,390 --> 00:58:06,560 and then applying a plasma to the wafer 918 00:58:06,560 --> 00:58:09,140 to etch the exposed metal away. 919 00:58:09,140 --> 00:58:11,960 But as that process happens, you 920 00:58:11,960 --> 00:58:15,440 can get sideways etching of the metal. 921 00:58:15,440 --> 00:58:25,550 And imagine you have a photoresist layer. 922 00:58:25,550 --> 00:58:27,890 This is a cross-section I'm sketching here. 923 00:58:27,890 --> 00:58:34,770 And you're etching a trench into the metal 924 00:58:34,770 --> 00:58:37,400 down to some insulating layer. 925 00:58:42,180 --> 00:58:45,060 Etching agents have to enter this gap. 926 00:58:45,060 --> 00:58:47,220 Depending on the size of the gap, 927 00:58:47,220 --> 00:58:52,960 that transport process will vary. 928 00:58:52,960 --> 00:58:55,330 It will be harder in narrower gaps 929 00:58:55,330 --> 00:58:57,850 than in wider gaps for the reactants to get in 930 00:58:57,850 --> 00:58:59,530 and for the products to get out. 931 00:58:59,530 --> 00:59:03,550 But also, there's going to be some lateral etching 932 00:59:03,550 --> 00:59:04,780 of the metal. 933 00:59:04,780 --> 00:59:07,930 And the rate at which that lateral etching happens 934 00:59:07,930 --> 00:59:12,940 will depend on the availability of reactants 935 00:59:12,940 --> 00:59:14,650 in the region of that feature, which 936 00:59:14,650 --> 00:59:17,190 might vary across the wafer. 937 00:59:17,190 --> 00:59:19,960 If there is this lateral etching, if it varies, 938 00:59:19,960 --> 00:59:21,910 then it's going to affect the final width 939 00:59:21,910 --> 00:59:26,140 of the wire, its final resistance, and therefore 940 00:59:26,140 --> 00:59:33,000 the speed at which an individual capacitor, 941 00:59:33,000 --> 00:59:36,670 parasitic or otherwise, in the circuit will charge. 942 00:59:36,670 --> 00:59:40,560 So you might find that if you can't reduce variation 943 00:59:40,560 --> 00:59:48,740 in this lateral etching process, the circuit properties 944 00:59:48,740 --> 00:59:51,620 of the devices produced will vary substantially 945 00:59:51,620 --> 00:59:52,640 across the wafer. 946 00:59:57,320 --> 01:00:00,860 So there are several things that can 947 01:00:00,860 --> 01:00:05,570 affect the availability of reactants at a given feature. 948 01:00:05,570 --> 01:00:09,740 Firstly, there is the position in the chamber. 949 01:00:09,740 --> 01:00:13,250 And what I've sketched in the top left 950 01:00:13,250 --> 01:00:16,530 is a plan view of a wafer.
951 01:00:16,530 --> 01:00:20,090 This is, say, maybe one of several wafers sitting 952 01:00:20,090 --> 01:00:24,590 in a large plasma etching chamber. 953 01:00:24,590 --> 01:00:30,320 The gases flow through this chamber with some path, 954 01:00:30,320 --> 01:00:34,520 some velocity distribution. 955 01:00:34,520 --> 01:00:36,470 The design of the chamber will have 956 01:00:36,470 --> 01:00:41,450 an impact on the density of reactants 957 01:00:41,450 --> 01:00:43,580 and how it varies across the wafer. 958 01:00:43,580 --> 01:00:47,120 So you might find that there's greater availability 959 01:00:47,120 --> 01:00:48,830 of reactants at the center of the wafer. 960 01:00:48,830 --> 01:00:53,150 That could be because there's an inlet above the center. 961 01:00:53,150 --> 01:00:58,280 And so that's one thing that can affect the amount 962 01:00:58,280 --> 01:01:00,830 of lateral metal etching. 963 01:01:00,830 --> 01:01:07,270 Then, you have the actual geometry 964 01:01:07,270 --> 01:01:10,000 of the pattern that is being etched. 965 01:01:10,000 --> 01:01:12,400 If you're trying to etch a large amount of metal 966 01:01:12,400 --> 01:01:15,100 in a given region of the wafer, that 967 01:01:15,100 --> 01:01:17,193 will act as a sink for reactants. 968 01:01:17,193 --> 01:01:18,610 There will be a lot of competition 969 01:01:18,610 --> 01:01:19,900 for these reactants. 970 01:01:19,900 --> 01:01:22,510 The concentration locally at the surface 971 01:01:22,510 --> 01:01:25,000 is likely to be depressed. 972 01:01:25,000 --> 01:01:27,790 And that's going to reduce the lateral etching rate 973 01:01:27,790 --> 01:01:29,890 for any individual feature. 974 01:01:29,890 --> 01:01:33,250 So we have this thing that we're going to call the pattern density 975 01:01:33,250 --> 01:01:34,300 effect-- 976 01:01:34,300 --> 01:01:38,200 the density of exposed metal for etching. 977 01:01:38,200 --> 01:01:42,460 Finally, the thing that can affect the availability 978 01:01:42,460 --> 01:01:44,350 of reactants for this lateral etching 979 01:01:44,350 --> 01:01:46,330 is the size of the feature, as I mentioned. 980 01:01:46,330 --> 01:01:50,800 Narrower features provide a greater impediment 981 01:01:50,800 --> 01:01:56,750 to the transport of reactants to the side wall of the feature. 982 01:01:56,750 --> 01:02:01,300 So in a way, you could think of these phenomena 983 01:02:01,300 --> 01:02:04,660 as being input variables. 984 01:02:04,660 --> 01:02:07,510 Some of them you can control. 985 01:02:07,510 --> 01:02:11,320 If you wanted to, you could place constraints 986 01:02:11,320 --> 01:02:13,720 on what kind of density of features 987 01:02:13,720 --> 01:02:16,810 were available, and what the smallest feature available was 988 01:02:16,810 --> 01:02:20,080 to the designer of these chips. 989 01:02:20,080 --> 01:02:22,870 And to an extent, you can control 990 01:02:22,870 --> 01:02:25,720 the tool-related variation. 991 01:02:25,720 --> 01:02:28,450 You can choose a process that will 992 01:02:28,450 --> 01:02:31,300 give a more uniform distribution of gases in the chamber. 993 01:02:31,300 --> 01:02:34,930 You could redesign the chamber. 994 01:02:34,930 --> 01:02:38,290 But some of that variation you don't have an easy way 995 01:02:38,290 --> 01:02:39,490 to control. 996 01:02:39,490 --> 01:02:43,180 In other words, you don't have complete control over all 997 01:02:43,180 --> 01:02:43,700 the inputs.
998 01:02:43,700 --> 01:02:45,910 But what I'm saying is that this is 999 01:02:45,910 --> 01:02:49,810 a case where you have multiple geometrical input 1000 01:02:49,810 --> 01:02:53,260 variations where you're trying to manufacture 1001 01:02:53,260 --> 01:02:55,990 many identical chips. 1002 01:02:55,990 --> 01:03:02,920 Each square in this graph is one chip. 1003 01:03:02,920 --> 01:03:07,540 And they're all supposed to be identical to one another. 1004 01:03:07,540 --> 01:03:12,140 But because of the effects I described, they will not, 1005 01:03:12,140 --> 01:03:14,990 in fact, be identical. 1006 01:03:14,990 --> 01:03:19,180 So you can think of this as being-- 1007 01:03:19,180 --> 01:03:24,460 a wafer as being many interdependent samples 1008 01:03:24,460 --> 01:03:33,150 of the output of a process where the input is varying-- 1009 01:03:33,150 --> 01:03:36,160 in some cases in an uncontrolled way, 1010 01:03:36,160 --> 01:03:38,750 in some cases in a way you can control. 1011 01:03:38,750 --> 01:03:43,720 So this would be a really meaty problem for multivariate ANOVA 1012 01:03:43,720 --> 01:03:44,470 to deal with. 1013 01:03:48,340 --> 01:03:52,480 What we've got within each chip-- 1014 01:03:52,480 --> 01:03:53,950 this is actually a test chip that 1015 01:03:53,950 --> 01:03:56,408 was designed to do some experiments and build a model. 1016 01:03:56,408 --> 01:03:57,700 So it's not actually a product. 1017 01:03:57,700 --> 01:04:00,730 But within each chip, what we have 1018 01:04:00,730 --> 01:04:05,050 is many copies of actually the same sorts of features. 1019 01:04:05,050 --> 01:04:10,180 In fact, what they are is just snake-shaped wires 1020 01:04:10,180 --> 01:04:16,660 which have a total length that amplifies any resistance 1021 01:04:16,660 --> 01:04:18,950 variations caused by the lateral etching. 1022 01:04:18,950 --> 01:04:21,460 You can go into the chip and you can probe 1023 01:04:21,460 --> 01:04:23,200 the resistance of these wires. 1024 01:04:23,200 --> 01:04:27,370 As the lateral etching gets faster, 1025 01:04:27,370 --> 01:04:30,820 the resistance gets higher, because the wires are narrower. 1026 01:04:30,820 --> 01:04:32,710 So you have many of these snake features 1027 01:04:32,710 --> 01:04:35,860 within each of these individual squares. 1028 01:04:35,860 --> 01:04:41,800 And what we do is we surround those same features 1029 01:04:41,800 --> 01:04:46,720 with a different amount of padding metal, which is not 1030 01:04:46,720 --> 01:04:49,690 electrically connected to these snakes 1031 01:04:49,690 --> 01:04:53,180 but is sitting right next to the snake structures, 1032 01:04:53,180 --> 01:05:00,040 so that it perturbs the transport of the etching gas. 1033 01:05:00,040 --> 01:05:03,380 So in areas where there's a larger amount of metal exposed 1034 01:05:03,380 --> 01:05:05,560 for etching, there's going to be greater competition 1035 01:05:05,560 --> 01:05:07,870 for reactants locally, and there's 1036 01:05:07,870 --> 01:05:12,650 going to be a lower etch rate, including a lower lateral etch 1037 01:05:12,650 --> 01:05:13,150 rate. 1038 01:05:16,210 --> 01:05:18,080 It's not necessarily true, of course, 1039 01:05:18,080 --> 01:05:22,000 that the pattern density effects are confined 1040 01:05:22,000 --> 01:05:23,600 to one of these squares.
1041 01:05:23,600 --> 01:05:29,320 You may find that the length scales over which competition 1042 01:05:29,320 --> 01:05:33,340 for reactants occurs are larger than the diameter of one 1043 01:05:33,340 --> 01:05:35,750 of these patches. 1044 01:05:35,750 --> 01:05:40,210 And that is something that would need 1045 01:05:40,210 --> 01:05:47,190 to be modeled and dealt with in understanding the process. 1046 01:05:53,350 --> 01:05:57,180 So I'm not going to go through the analysis of variance 1047 01:05:57,180 --> 01:05:57,900 for this problem. 1048 01:05:57,900 --> 01:06:02,420 I just wanted to highlight the fact 1049 01:06:02,420 --> 01:06:10,380 that there are these complicated sets of geometrical input 1050 01:06:10,380 --> 01:06:14,130 variables that we want to try to understand, often 1051 01:06:14,130 --> 01:06:17,880 in semiconductor manufacturing processes. 1052 01:06:17,880 --> 01:06:22,020 And very often, we want to go 1053 01:06:22,020 --> 01:06:24,030 beyond just finding a functional relationship 1054 01:06:24,030 --> 01:06:26,250 between some geometrical property 1055 01:06:26,250 --> 01:06:29,490 and the performance of the products. 1056 01:06:29,490 --> 01:06:34,980 We want to build a physical model that will work as far 1057 01:06:34,980 --> 01:06:39,420 back as the settings on the machine-- the flow 1058 01:06:39,420 --> 01:06:42,420 rates of gases, the amount of electrical power 1059 01:06:42,420 --> 01:06:45,720 that's going into generating the plasma in the chamber-- 1060 01:06:45,720 --> 01:06:50,670 to try to work back with enough detail 1061 01:06:50,670 --> 01:06:56,460 that we can start to decide what good input variable 1062 01:06:56,460 --> 01:06:57,480 values would be. 1063 01:07:01,320 --> 01:07:05,070 I'm just going to show you a little bit of data 1064 01:07:05,070 --> 01:07:08,370 from one of these test wafers. 1065 01:07:08,370 --> 01:07:13,050 You can see that we've identified a clear relationship 1066 01:07:13,050 --> 01:07:17,400 between the pattern density-- 1067 01:07:17,400 --> 01:07:22,920 the amount of padding metal within one of those sets 1068 01:07:22,920 --> 01:07:24,840 of features-- and the average resistance 1069 01:07:24,840 --> 01:07:26,130 of one of the snakes. 1070 01:07:26,130 --> 01:07:35,040 There are several hundred snakes within each given 1071 01:07:35,040 --> 01:07:36,060 patch of features. 1072 01:07:36,060 --> 01:07:38,550 And what we've done here is just average the resistance 1073 01:07:38,550 --> 01:07:41,850 of all of them to give a quick estimate 1074 01:07:41,850 --> 01:07:47,280 of the effect of local pattern density, an input variable, 1075 01:07:47,280 --> 01:07:51,160 on one important output. 1076 01:07:51,160 --> 01:07:53,910 So we can see that the pattern density has 1077 01:07:53,910 --> 01:07:56,880 an effect that we could think of inventing 1078 01:07:56,880 --> 01:07:59,391 a functional model for. 1079 01:07:59,391 --> 01:08:04,080 But what we also have is the wafer scale non-uniformity. 1080 01:08:04,080 --> 01:08:09,300 This is to do with the way gases flow around the chamber, 1081 01:08:09,300 --> 01:08:12,670 approach the wafer, and are transported across the wafer. 1082 01:08:12,670 --> 01:08:18,240 And this graph shows you a subset of the data we have. 1083 01:08:18,240 --> 01:08:21,990 Down in the bottom left is a diagram of the wafer.
1084 01:08:21,990 --> 01:08:24,359 And what we've done on this graph 1085 01:08:24,359 --> 01:08:30,510 is plotted the resistance of a particular test 1086 01:08:30,510 --> 01:08:41,729 feature, a resistive snake, within each chip on the wafer 1087 01:08:41,729 --> 01:08:43,380 as a function of location. 1088 01:08:43,380 --> 01:08:45,750 We slice up the wafer. 1089 01:08:45,750 --> 01:08:53,040 And the x-axis corresponds to the slices of the wafer being 1090 01:08:53,040 --> 01:08:53,939 concatenated. 1091 01:08:53,939 --> 01:08:56,970 The first slice is here, the second slice 1092 01:08:56,970 --> 01:09:00,930 is here, and so forth. 1093 01:09:00,930 --> 01:09:05,479 What we see is that the resistances that result 1094 01:09:05,479 --> 01:09:07,760 tend to be larger in the center of the wafer. 1095 01:09:07,760 --> 01:09:12,680 Here's a central part of the wafer, and here is an edge. 1096 01:09:20,319 --> 01:09:22,930 So that relationship is clear. 1097 01:09:22,930 --> 01:09:25,210 But then, if we look within each chip, 1098 01:09:25,210 --> 01:09:29,859 if we look at the features that are near this 5% local metal 1099 01:09:29,859 --> 01:09:34,520 density, then we see this amount of variation. 1100 01:09:34,520 --> 01:09:36,189 And if we look in the region where 1101 01:09:36,189 --> 01:09:41,080 there's a much higher amount of metal density, 1102 01:09:41,080 --> 01:09:43,970 and therefore less of the metal is being etched away, 1103 01:09:43,970 --> 01:09:50,800 we see that the resistance is higher, and-- 1104 01:09:50,800 --> 01:09:52,979 I'm sorry, by pattern density we mean 1105 01:09:52,979 --> 01:09:56,920 the proportion of the chip that is open for etching. 1106 01:09:56,920 --> 01:09:59,050 And we see that the resistance is higher 1107 01:09:59,050 --> 01:10:06,490 because there's more lateral etching. 1108 01:10:06,490 --> 01:10:09,460 But what this gives us an indication of 1109 01:10:09,460 --> 01:10:12,790 is that there must be an interaction, 1110 01:10:12,790 --> 01:10:16,810 because the size of the wafer scale non-uniformity 1111 01:10:16,810 --> 01:10:19,810 depends on the local pattern density. 1112 01:10:19,810 --> 01:10:24,820 It's not as if we're taking the variation for the 5% pattern 1113 01:10:24,820 --> 01:10:26,110 and just shifting it up. 1114 01:10:26,110 --> 01:10:30,760 When we change to 85% pattern density, 1115 01:10:30,760 --> 01:10:38,560 there is a change in the shape of the location dependence 1116 01:10:38,560 --> 01:10:41,800 as we change the local pattern density. 1117 01:10:41,800 --> 01:10:45,525 And that is an example of one of those interaction effects 1118 01:10:45,525 --> 01:10:46,900 that we would need to capture, 1119 01:10:46,900 --> 01:10:52,090 either through some multiplicative model 1120 01:10:52,090 --> 01:10:58,300 or something more physically based. 1121 01:10:58,300 --> 01:11:04,090 Anyway, that is the end of today's lecture. 1122 01:11:04,090 --> 01:11:09,580 Next time, we're going to use these techniques, ANOVA, 1123 01:11:09,580 --> 01:11:16,510 as the basis for starting to build real models where we're 1124 01:11:16,510 --> 01:11:19,240 actually fleshing out the functional 1125 01:11:19,240 --> 01:11:23,260 relationship between inputs and outputs, 1126 01:11:23,260 --> 01:11:26,170 and designing experiments that will give us 1127 01:11:26,170 --> 01:11:30,730 that information as efficiently as possible. 1128 01:11:30,730 --> 01:11:34,780 OK, are there any questions from either side?
1129 01:11:34,780 --> 01:11:35,440 Hello. 1130 01:11:35,440 --> 01:11:37,200 AUDIENCE: Yeah, I have one question. 1131 01:11:37,200 --> 01:11:41,080 As for the ANOVA, you have, like, 1132 01:11:41,080 --> 01:11:44,080 three parameters-- a k, n, and m, right? 1133 01:11:44,080 --> 01:11:48,880 So does m always equal n multiplied by k? 1134 01:11:48,880 --> 01:11:51,310 PROFESSOR: Actually, yeah, m in that 1135 01:11:51,310 --> 01:11:57,510 case was the number of samples per combination of treatments. 1136 01:11:57,510 --> 01:12:02,560 So actually, in MANOVA, the quantity that I termed m 1137 01:12:02,560 --> 01:12:07,150 is a bit like n in the single variable ANOVA. 1138 01:12:07,150 --> 01:12:09,310 I know that's confusing. 1139 01:12:09,310 --> 01:12:15,760 But here, m is the number of replicates 1140 01:12:15,760 --> 01:12:19,105 for a given combination of input variables t and q. 1141 01:12:22,570 --> 01:12:25,660 Does that make sense? 1142 01:12:25,660 --> 01:12:28,450 So we have a combination of inputs t and q. 1143 01:12:28,450 --> 01:12:29,800 We keep them constant. 1144 01:12:29,800 --> 01:12:34,600 We sample a few parts for that combination of inputs. 1145 01:12:34,600 --> 01:12:36,235 And there are m of those parts. 1146 01:12:38,850 --> 01:12:39,630 AUDIENCE: OK. 1147 01:12:39,630 --> 01:12:41,520 PROFESSOR: Thanks. 1148 01:12:41,520 --> 01:12:42,270 Anyone else? 1149 01:12:46,890 --> 01:12:49,140 OK, well, thank you. 1150 01:12:49,140 --> 01:12:53,010 The problem set is due on Thursday. 1151 01:12:53,010 --> 01:12:56,620 Let me know if you have any questions about it. 1152 01:12:56,620 --> 01:12:57,720 I'm sure you will. 1153 01:12:57,720 --> 01:12:59,363 Oh, hello? 1154 01:12:59,363 --> 01:13:00,780 AUDIENCE: Can I check in with you? 1155 01:13:00,780 --> 01:13:02,042 PROFESSOR: Sure. 1156 01:13:02,042 --> 01:13:05,330 AUDIENCE: For the quiz, I'm not sure which question. 1157 01:13:05,330 --> 01:13:08,370 I think it would probably be the last section of problem 1. 1158 01:13:08,370 --> 01:13:11,200 I think I needed a [INAUDIBLE] table with alpha equal 1159 01:13:11,200 --> 01:13:15,870 to 0.025, but we were not provided with that. 1160 01:13:15,870 --> 01:13:16,560 PROFESSOR: Yes. 1161 01:13:16,560 --> 01:13:18,520 AUDIENCE: I'm not sure whether I'm wrong, or-- 1162 01:13:18,520 --> 01:13:20,520 PROFESSOR: About ten people asked me at this end 1163 01:13:20,520 --> 01:13:21,030 about that. 1164 01:13:21,030 --> 01:13:26,370 Well, you know, I actually think that a one-sided F 1165 01:13:26,370 --> 01:13:29,700 test would have been appropriate in that case. 1166 01:13:29,700 --> 01:13:32,760 In which case, the table-- 1167 01:13:32,760 --> 01:13:35,640 the 0.05 table would have been appropriate to do 1168 01:13:35,640 --> 01:13:39,630 a one-sided 5% F test. 1169 01:13:39,630 --> 01:13:41,970 It was also possible to answer that question 1170 01:13:41,970 --> 01:13:45,300 by looking at the confidence intervals from part B 1171 01:13:45,300 --> 01:13:47,310 and seeing whether they overlapped. 1172 01:13:47,310 --> 01:13:49,200 And I think they didn't overlap, did they? 1173 01:13:49,200 --> 01:13:51,010 So they didn't overlap. 1174 01:13:51,010 --> 01:13:53,610 So we said there was a significant difference. 1175 01:13:53,610 --> 01:13:57,190 But anyway, we'll publish solutions. 1176 01:13:57,190 --> 01:14:01,150 Thanks. Anyone else? 1177 01:14:01,150 --> 01:14:01,750 No? 1178 01:14:01,750 --> 01:14:02,362 Good. 1179 01:14:02,362 --> 01:14:03,320 AUDIENCE: One question.
1180 01:14:03,320 --> 01:14:03,820 PROFESSOR: Yeah? 1181 01:14:03,820 --> 01:14:04,653 AUDIENCE: I'm sorry. 1182 01:14:04,653 --> 01:14:05,830 PROFESSOR: That's OK. 1183 01:14:05,830 --> 01:14:09,880 AUDIENCE: For the MANOVA, there-- 1184 01:14:09,880 --> 01:14:13,150 OK, for ANOVA there's a term that 1185 01:14:13,150 --> 01:14:17,110 is the 0 mean normal residual. 1186 01:14:17,110 --> 01:14:19,780 And for MANOVA, there's the same term. 1187 01:14:19,780 --> 01:14:26,340 But in the second line, the term disappeared. 1188 01:14:26,340 --> 01:14:29,090 Do you know what I'm talking about-- on the previous slide? 1189 01:14:29,090 --> 01:14:31,580 PROFESSOR: On the previous-- this slide? 1190 01:14:31,580 --> 01:14:32,120 This slide? 1191 01:14:32,120 --> 01:14:33,787 AUDIENCE: No, previous, previous slide-- 1192 01:14:33,787 --> 01:14:34,418 two slides ago. 1193 01:14:34,418 --> 01:14:35,210 Oh, yeah, this one. 1194 01:14:35,210 --> 01:14:36,043 PROFESSOR: This one. 1195 01:14:39,220 --> 01:14:42,680 AUDIENCE: The term disappears later. 1196 01:14:42,680 --> 01:14:44,790 The last term on the-- 1197 01:14:44,790 --> 01:14:49,220 PROFESSOR: Yeah, oh, you mean the term-- 1198 01:14:49,220 --> 01:14:51,320 there's a term here, but there isn't a term here. 1199 01:14:51,320 --> 01:14:52,340 Is that what you mean? 1200 01:14:52,340 --> 01:14:52,750 AUDIENCE: Yeah. 1201 01:14:52,750 --> 01:14:53,417 PROFESSOR: Yeah. 1202 01:14:53,417 --> 01:15:00,080 Ah, right, well, the second line is an estimate 1203 01:15:00,080 --> 01:15:04,010 of the value of the output for the combination of inputs t 1204 01:15:04,010 --> 01:15:06,260 and q. 1205 01:15:06,260 --> 01:15:10,730 An estimate is not trying to make any predictions about what 1206 01:15:10,730 --> 01:15:12,350 the residual will be. 1207 01:15:12,350 --> 01:15:18,530 It's really dealing just with means for a given treatment. 1208 01:15:18,530 --> 01:15:23,570 And it's saying, if, say, you set your machine to input 1 1209 01:15:23,570 --> 01:15:28,610 having value t, input 2 having value q, 1210 01:15:28,610 --> 01:15:32,420 what's your best estimate of what the output will be? 1211 01:15:32,420 --> 01:15:37,100 And that estimate has to be the mean output. 1212 01:15:37,100 --> 01:15:39,980 There's no point estimating anything other than the mean. 1213 01:15:39,980 --> 01:15:44,720 And so the residual, that epsilon term, 1214 01:15:44,720 --> 01:15:47,780 is present in real data, because there's 1215 01:15:47,780 --> 01:15:54,938 random variation in the output around the expected mean. 1216 01:15:54,938 --> 01:15:57,230 AUDIENCE: Another way to say that is your best estimate 1217 01:15:57,230 --> 01:15:59,790 of epsilon [INAUDIBLE]. 1218 01:15:59,790 --> 01:16:00,483 PROFESSOR: Yeah. 1219 01:16:00,483 --> 01:16:02,795 AUDIENCE: So you could put, like, a-- 1220 01:16:02,795 --> 01:16:05,510 you could put a plus 0 there if that's your best guess. 1221 01:16:08,740 --> 01:16:11,005 PROFESSOR: OK, did you hear that in Singapore? 1222 01:16:11,005 --> 01:16:11,630 AUDIENCE: Yeah. 1223 01:16:11,630 --> 01:16:12,887 PROFESSOR: Yeah, good. 1224 01:16:12,887 --> 01:16:13,720 AUDIENCE: Thank you. 1225 01:16:16,420 --> 01:16:17,430 PROFESSOR: Anyone else? 1226 01:16:20,660 --> 01:16:23,500 OK, see you next time.
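That last point about the vanishing epsilon is easy to see numerically. In this sketch (invented data, not from the lecture), the fitted estimate at a given treatment combination is just the mean of the replicates there, and the residuals around that estimate average to zero-- so "plus epsilon" becomes "plus 0" in a prediction, exactly as suggested above.

```python
# A tiny numerical illustration, with invented data, of why the epsilon
# term drops out of the estimate: its best guess is zero.
import numpy as np

rng = np.random.default_rng(1)

# Replicates at one fixed combination of inputs (t, q):
y_cell = 12.0 + rng.normal(0.0, 0.3, size=1000)

y_hat = y_cell.mean()          # the estimate for that treatment combination
residuals = y_cell - y_hat     # the epsilon term, sample by sample

print(f"estimate = {y_hat:.3f}")
print(f"mean residual = {residuals.mean():.2e}")  # ~0 by construction
```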