The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

DUANE BONING: OK, so on Tuesday, Hayden brought us up through analysis of variance, which had an implicit model of the process hidden underneath. It was mostly focused on the idea of, are there effects if you run different kinds of conditions? What we want to do is pick up on that, but get back to the model a little bit more, and talk a little more explicitly about ways to express, from an empirical point of view, experimental data or other kinds of run data that we might have. So I'm going to pick up a little bit on this idea of modeling effects when we've got multiple inputs. Much of the ANOVA analysis was described for a single input. And then what we will do is look at linear and quadratic models, and other kinds of models for fitting that kind of data.
We'll talk a little bit about calculation or fitting of model coefficients, both from a fairly powerful and general regression idea, but also, we'll start to build up the lingo for design of experiments. One advantage of a design of experiments approach is that there are shorthand ways for rapidly assessing and, in fact, building models based on experimental data points through careful selection of where you sample from your process or your conditions. So things like contrasts will perhaps be new terminology for you that summarizes a little bit what it is we're trying to do with some of these designs of experiments, or DOE.

So to pick up a little bit, here's, again, the injection molding data. We've looked at this from a number of different perspectives, and we'll look at it a little bit more today. So we've got some dimension that we're measuring with an injection molding, where we have a couple of different parameters-- one of which might be the velocity that we're using for the injection. Another may be the hold time. Perhaps these are minutes-- one minute, two minutes.
And one would ask, is there an effect from the point of view of ANOVA? And then we can look and compare things like the spread within that particular condition, say, at one minute-- compare this spread and its mean to this spread and its mean, and make some assessment of whether there is a statistically significant effect. But we might also want to build up a causal relationship, and actually a functional relationship, between hold time and dimension. And then the question becomes, well, what do we use to fit such a set of data? Would we fit it with a line? Would we fit it with a curve that looks something like that, or a curve something like that? What is appropriate? What are appropriate kinds of functions to use for fitting these things?

Now, hopefully you've got some intuition. And I'm not going to try to destroy any of that intuition today. I'm just going to try to extend it. One question for you-- hopefully you have an intuitive feel-- would you be justified in fitting this blue curve, in most cases, to this data? Well, I've now made it a red curve.
Yeah?

AUDIENCE: Well, [INAUDIBLE].

DUANE BONING: Yeah.

AUDIENCE: So [INAUDIBLE] a physical model, or something, [INAUDIBLE]

DUANE BONING: Good-- because usually the answer is it depends, and you already got the "depends" in there. If all you had was the pure data, you've really only got two sample points, so all you can fit is a line. But your intuition is very good there. If you have a lot of physical knowledge that, in fact, the relationship would have to be parabolic, two data points might give you enough information to be able to justifiably fit some more complicated function. We're going to be mostly focusing on empirical data and the limits of what you can infer from the data alone. We're not going to be, at least today, trying to mix in both the physical functional dependencies and the empirical data. So one key question then is, what functional form might be justified by the data? In this case, if I've only got two data points, maybe it's only a very simple linear model.
If I had more data, well, I might have enough to fit a parabolic or other, more complicated model. But in addition now, when we start to go to two inputs-- this is, in fact, the same data, where we have not only hold time, as we did on the previous screen, but we may also be playing with the velocity and then doing some number of replicates-- something like 14 or 15 runs at the low condition for velocity and low condition for hold time-- and then measuring the data. And what we have here is conditions where we've changed both of those conditions, and it may well be the case that these two inputs interact-- that the effect of velocity gets amplified when I've got a longer hold time rather than a shorter hold time. So it's not necessarily just an additive model. So we also need to mix in this issue of interaction when we start thinking about model form-- not just the functional dependence in terms of powers, or the shape of the dependence on a single input, but also, when we have multiple inputs, whether there are interactions between them. So we can also turn this question around.
I can say, depending on how much data I have, what kind of model might I fit? Or I can turn it around and say, I would like to be able to fit a model up to a certain kind of form-- you can turn it around and ask the question, what data points do I need? How much data do I need, and what's the location of those data points, in order to be able to fit different functional forms? And that becomes the design of the experiment in order to be able to meet your modeling or estimation needs.

So what we'll be doing is dealing with some models of the process where we have multiple inputs or multiple treatments-- not just different conditions or levels of these, but in fact, we may have multiple different kinds of treatments, such as the velocity and the hold time. And so in general, we might talk about processes that have k inputs. So in our previous case, we had two inputs. And then we can look at different combinations of the data, depending on how many levels we pick for each of those inputs. So a level is simply where you're sampling along the axis of that particular factor or input.
And we'll be talking especially about full factorial models, where we're looking at every combination of the discrete selection of those levels for those factors that we're interested in. So if I've got a two-level selection or choice for each of our factors, then, in general, we can have 2 to the k combinations of different discrete sample points, different discrete treatment sets. You can extend this, of course. You might be looking at three-level experiments for each of our factors, and then I would have 3 to the k combinations. And you'll detect a trend here that's not especially encouraging. What happens as k gets big? We have exponential growth in the number of experimental points that we need. Similarly, as the number of levels goes up, you can fit more powerful models, but again, you need a lot more data fairly rapidly. Today we'll be pretty much focusing on relatively small numbers of k and small numbers of levels. In fact, the traditional 2 to the k design is what we'll be talking the most about.
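[A quick sketch, not from the lecture, of that exponential growth: enumerating every treatment combination of a full factorial design. The factor names and level codes here are made up for illustration.]

```python
from itertools import product

def full_factorial(levels_per_factor):
    """Enumerate every treatment combination (full factorial design).

    levels_per_factor: dict mapping factor name -> list of levels.
    Returns a list of dicts, one per experimental run.
    """
    names = list(levels_per_factor)
    return [dict(zip(names, combo))
            for combo in product(*(levels_per_factor[n] for n in names))]

# Two factors at two levels each (like velocity and hold time): 2^2 = 4 runs.
runs_2k = full_factorial({"velocity": [-1, +1], "hold_time": [-1, +1]})
print(len(runs_2k))   # 4

# Three levels per factor grows as 3^k: already 81 runs with k = 4 factors.
runs_3k = full_factorial({f"x{i}": [-1, 0, +1] for i in range(4)})
print(len(runs_3k))   # 81
```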
When you get back from spring break, we'll also come back to clever approaches out of design of experiments for subsampling the space-- basically, not having to explore all of the combinations, because, in fact, all of the combinations may give you more data and fit a more powerful model than you're even really interested in. If I were doing, say, a 3 to the k, I might be able to actually estimate all kinds of very subtle three-factor interactions or four-factor interactions, but I might actually not believe they exist. I may have prior knowledge that they don't exist. And so we'll look at reduced models next time.

There is kind of just a general question. Why would you even want to model or use more than one input? Why wouldn't you want to simplify your process, make it as simple as possible? Why might more than one input be especially important for manufacturing process control? These are fairly obvious, so don't overthink too much. Why would you need more than one input?

AUDIENCE: Because you have more than one input in the world--

DUANE BONING: You have more than one input in the real world.
Why do you have more than one input in the real world?

AUDIENCE: [INAUDIBLE] complex. You can manipulate them in a number of ways, so [INAUDIBLE] process might have temperature and pressure, different chemical baths you could use, [INAUDIBLE] processing.

DUANE BONING: Good.

AUDIENCE: So they all create slightly different outputs, but [INAUDIBLE] very similar outputs at the same time.

DUANE BONING: Yes. OK, so I think you've touched on both of them there, as you spoke freeform. One is you often have more than one output. So it would actually be odd if you were trying to achieve just one parameter-- say, a length parameter-- with a dozen different inputs all at once. In fact, that would be a really challenging design problem. Often, they do interact, but very often, what you're actually after are fairly good, solid one-to-one correspondences where, if I play with this knob, I have good control over this output. So you often strive for almost univariate, very simple direct control.
You can't always achieve that, and so often you've got a multiple-input, multiple-output problem with interactions. But even if you only had one output, you might actually want two inputs, because one of them may actually give you control over the mean of the parameter, and another input might let you deal with, or compensate for, or generate, or change the variance of the process. And so it's actually very nice to be able to deal with the process robustness-- to be able to freely recenter the process without changing your variance, for example. So I'm just highlighting that in the practical world, you often do have more than one input, and there are good reasons why you can actually use those.

So we're going to mostly be dealing with empirical models here, and here's an example of a fairly simple general linear model. Question? Is there a question from Singapore? Somebody maybe pushed their button? [INAUDIBLE] like the microphone was-- we got a close-up view of you. We know you're awake, though-- good.
OK, so here's just a quick overview of the kind of models that we'll be looking at. We've got some output. Here it's expressed in a univariate case. We'll extend this to multiple outputs in a minute. But I might have a single output, and the true value here may be eta. Again, I'm using the Greek a little bit to indicate truth, if you will. And it's got some overall mean. We might model that with a single coefficient, some beta 0. And then it's got single-factor direct dependencies on some x sub i input. So x sub i here is indicating which factor it is. So I might have x1 and x2 for a two-factor model-- two different kinds of input, sort of like our injection molding problem. I might have a direct linear dependence and coefficient on each of those. And then we'll also be looking at interaction terms, ways of looking at a synergistic effect between the two of these terms-- still first-order. These are only direct x1 times x2 kinds of interactions, but we'll see that actually gives you quite a bit of power for dealing with some interesting behavior.
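[A small numerical sketch, not from the lecture, of what an interaction term buys you: with y = beta0 + beta1*x1 + beta2*x2 + beta12*x1*x2, the effect of one input depends on the level of the other. The coefficient values are made up for illustration.]

```python
def predict(x1, x2, b0, b1, b2, b12):
    """Two-input linear model with an interaction term:
    y = b0 + b1*x1 + b2*x2 + b12*x1*x2."""
    return b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2

# Made-up coefficients, just to show how the interaction changes the
# apparent effect of x1 depending on the level of x2.
b0, b1, b2, b12 = 10.0, 2.0, 1.0, 1.5

# Effect of moving x1 from -1 to +1 when x2 is at its low vs. high level:
effect_x2_low = predict(+1, -1, b0, b1, b2, b12) - predict(-1, -1, b0, b1, b2, b12)
effect_x2_high = predict(+1, +1, b0, b1, b2, b12) - predict(-1, +1, b0, b1, b2, b12)
print(effect_x2_low)   # 2*b1 - 2*b12 = 1.0
print(effect_x2_high)  # 2*b1 + 2*b12 = 7.0
```

With beta12 = 0 the two effects would be identical; the interaction term is what lets the velocity effect get "amplified" at the longer hold time.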
Then we'll also talk about these higher-order terms. So in truth, the model may have a more complicated functional dependence. What we're hoping, in many cases, is that the linear version of the model captures most of the behavior, and that these higher-order model terms are smaller effects. And so often, we'll actually be ignoring these higher-order terms, absorbing whatever errors there are as a result of that, and referring to those as a model form or model structure error, and distinguishing that from residual error or other kinds of experimental error, or just replication error.

So let's see-- a couple of other definitions that we've got in here. You'll notice the k subscript indicates the number of factors, just like we were describing before-- two-factor, three-factor kinds of experiments. And then we often are forming sums over some particular factor, or over input levels within those factors. So I'll try to be a little bit careful about what we've got there.
So with that simple two-input model, where k is equal to 2 in this case, we've got something that we might try to use with our injection molding case. That was a two-input model-- injection velocity and hold time. We've got four coefficients. Quick question for you-- how many data points, at a minimum, do you think you need in order to be able to fit those four coefficients? If I did just one experiment and got one output, would that be enough data to estimate four coefficients? No. Four? Or was that just a wave of your hand?

AUDIENCE: No.

DUANE BONING: It was close.

AUDIENCE: Yeah.

DUANE BONING: So this is a simple extension of that linear case. If you had only one input and you wanted to fit a line, you would need at least two data points. Yeah?

AUDIENCE: [INAUDIBLE]

DUANE BONING: Oh, OK.

AUDIENCE: Because that's a zero--

DUANE BONING: Yes.

AUDIENCE: [INAUDIBLE] relative to your [INAUDIBLE]?

DUANE BONING: Yes.
That's a good point. So if these are normalized in some way, such that there is a 0 in between, such that also the average of these is 0, then this would be the mean. So this would be more general-- some offset term. It may not necessarily be the mean-- good point, good point.

OK, so part of the question is-- I think there's fairly clear intuition that you're probably going to need at least four data points in order to be able to fit a four-coefficient model. We need at least that number, but there are some additional subtleties. In this case, we've got at least two factors, so we know we've got two factors at work here. And I've got to at least sample across both of those input factors, both of those treatments. But then there's a question, especially with this crazy interaction term-- how many levels do you really need in each case? Notice also here that, if I only have four data points, there's no way for me to really estimate higher order terms.
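[A numerical check, not from the lecture, of the normalization point just raised: if the two levels are coded symmetrically as -1 and +1, the fitted intercept beta 0 equals the mean of the data; with raw, uncoded levels it is just an offset term. The y values are made up for illustration.]

```python
import numpy as np

y = np.array([4.0, 6.0])           # one measurement at each level

# Coded levels -1, +1: columns of the design matrix are [1, x].
X_coded = np.array([[1.0, -1.0],
                    [1.0, +1.0]])
b_coded = np.linalg.solve(X_coded, y)
print(b_coded[0])                  # 5.0 -- the mean of y

# Raw, uncoded levels, e.g. hold times of 1 and 2 minutes:
X_raw = np.array([[1.0, 1.0],
                  [1.0, 2.0]])
b_raw = np.linalg.solve(X_raw, y)
print(b_raw[0])                    # 2.0 -- just an offset, not the mean
```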
And we'll also chat a little bit, because if I have exactly four data points, it's also not easy for me to estimate error terms, because if I have four data points, I can exactly fit-- I get a direct solution for the four coefficients. And I can perfectly fit those-- I may be wrong in terms of actually interpolating between those, but I have a direct solution for those four coefficients. So part of the thing we'll get at is, how might you select levels and data samples so that you have more data to be able to do additional inferences on your model-- things like, what is the actual confidence level I have that an effect is real? What's my estimate, not just as a point estimate for a model coefficient-- a beta 0 here or a beta 1-- but also, what is an interval estimate for those kinds of coefficients? So we'll get to many of those things.

OK, here's back to our injection molding case-- dimension versus hold time. I might try to fit that with a very simple one-input linear model, some offset beta 0 and some beta 1 linear dependence.
And we've already said it looks like I'm going to need at least two levels. I'm going to label those with an x sub minus and an x sub plus-- so a low level and a high level. We'll actually talk about different labeling strategies for these things that make some of the calculations a little bit easier. So if I, again, do just one trial at each level, what I've got-- there's a little bit more notation for us in very simple vector notation-- what I've got is one factor, two levels, if I'm just doing a one-input model. So what I will have is two measurements, an eta 1 and an eta 2. Those are my two actual sample points. And what I've done is run them at two different conditions. So in fact, these indicate the two different conditions that I ran at. And what I'm trying to do is fit the beta 0 and the beta 1 for this case. Now, because I've got exactly the right amount of data, I'm going to have to estimate that the noise is 0, as we'll see in just a second.
Now, I can express the two sets of samples-- so I've got, basically, an eta 1 equal to some beta 0 plus beta 1 times x at the low condition. And I've, similarly, got eta 2, which I'm estimating as beta 0 plus beta 1 times x at the high condition. That's just a system of two equations. I can write it in matrix form, as shown down here. So I've got this condition times my two coefficients. So I've got my beta 0 and my beta 1. That's my first equation, giving me my eta 1-- and similarly, for my eta 2. Now, since we've got exactly the same number of model coefficients and model runs, my X is square, and I've got a direct solution to this equation-- where, again, the error is 0, because I don't have any replicates, I don't have any noise, and so I can simply directly invert that and solve directly for my coefficients. So my purpose here was both to remind us that there is some vector notation that just follows what you would think of naturally, and that, in the special case where you've only got exactly just barely enough data, there's a direct solution. So that's pretty straightforward.
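[That direct solution can be sketched in a few lines, not from the lecture; the hold-time levels and measured eta values below are made-up numbers.]

```python
import numpy as np

# Two runs, one at each hold-time level (made-up values).
x_lo, x_hi = 1.0, 2.0                  # minutes
eta = np.array([3.2, 4.8])             # measured dimensions

# Matrix form eta = X @ [beta0, beta1], with one row per run.
X = np.array([[1.0, x_lo],
              [1.0, x_hi]])

# X is square (two coefficients, two runs), so we can invert it directly.
beta = np.linalg.inv(X) @ eta
print(beta)                            # beta0 = 1.6, beta1 = 1.6

# With exactly as many runs as coefficients, the fit is exact: zero residual.
print(X @ beta - eta)
```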
But what we'd often like to do is remember that there's noise in our process-- that, if I were to, in fact, run at the same condition multiple times, I've probably got some variance in there. It may be measurement error. It may be underlying process variation. So what do we do when we've got replicates? First off, we probably want to design the process, or the experimental design for the process, so that we do sample at the same condition multiple times, so we can estimate these variances. I'm going to need that information for things like confidence intervals on the coefficients. So how do we deal with data where there is some spread, even at the same factor level for hold time? Well, now we're into an over-specified set of linear equations. I've got multiple sets of data for the same input. I've got multiple y's, multiple outputs. And what we'll generally be doing is trying to do our best fit in the same intuitive way you've probably been doing regression modeling or fitting of lines to data for a long time.
448 00:24:27,400 --> 00:24:30,280 We have a general approach of trying 449 00:24:30,280 --> 00:24:34,840 to get the best fit in a conceptual approach 450 00:24:34,840 --> 00:24:37,660 of minimizing some error metric. 451 00:24:37,660 --> 00:24:42,100 The most standard one is a sum of squared deviations 452 00:24:42,100 --> 00:24:45,140 of your data from your prediction. 453 00:24:45,140 --> 00:24:49,030 So we often talk about least squares or minimum 454 00:24:49,030 --> 00:24:51,460 squared error fits. 455 00:24:51,460 --> 00:24:54,310 So in those kinds of conditions, what we're trying to do 456 00:24:54,310 --> 00:24:57,640 is find the beta 0 and the beta 1 that give us 457 00:24:57,640 --> 00:25:02,560 the best line that, in aggregate across all of our data, 458 00:25:02,560 --> 00:25:08,710 minimizes the sum of all squared error terms, where each error 459 00:25:08,710 --> 00:25:09,700 term-- 460 00:25:09,700 --> 00:25:13,880 each e sub i is-- 461 00:25:13,880 --> 00:25:17,200 so this would be my e sub i term. 462 00:25:17,200 --> 00:25:24,700 When I ran run sub i for input at, 463 00:25:24,700 --> 00:25:28,940 let's say, level 1 for my single factor here. 464 00:25:28,940 --> 00:25:32,620 This is my prediction. 465 00:25:32,620 --> 00:25:34,100 This is my data. 466 00:25:34,100 --> 00:25:38,680 So I've got a deviation between my best prediction 467 00:25:38,680 --> 00:25:41,590 at that input level and my actual measurement. 468 00:25:41,590 --> 00:25:44,080 I'm squaring that, and then what we're looking at 469 00:25:44,080 --> 00:25:47,830 is the sum of all of those across all of my data. 470 00:25:47,830 --> 00:25:51,940 And you can formulate it as sort of that optimization problem 471 00:25:51,940 --> 00:25:56,830 and do a search for the best beta 0, beta 1, conceptually. 
472 00:25:56,830 --> 00:25:58,930 Turns out, in the simple linear cases, 473 00:25:58,930 --> 00:26:03,370 the power of linear models is that one can still 474 00:26:03,370 --> 00:26:07,240 solve the matrix formulation of the problem using 475 00:26:07,240 --> 00:26:10,960 a pseudo-inverse in the over-specified case, where 476 00:26:10,960 --> 00:26:15,250 I've got multiple samples at the same condition. 477 00:26:15,250 --> 00:26:17,560 And we need to do that, because now, 478 00:26:17,560 --> 00:26:19,750 if I've got multiple rows where I've 479 00:26:19,750 --> 00:26:23,650 sampled the same conditions multiple times, 480 00:26:23,650 --> 00:26:26,290 my x matrix is no longer square, so I 481 00:26:26,290 --> 00:26:27,490 can't do a direct solution. 482 00:26:31,000 --> 00:26:36,370 There's an aside here, which is simply 483 00:26:36,370 --> 00:26:41,740 the derivation of this relationship, this beta 484 00:26:41,740 --> 00:26:49,090 solution, based on the minimum squared error condition. 485 00:26:49,090 --> 00:26:51,310 Yes, question? 486 00:26:51,310 --> 00:26:52,690 AUDIENCE: Yes. 487 00:26:52,690 --> 00:26:54,970 For the previous equation, isn't it 488 00:26:54,970 --> 00:27:01,810 beta 0 plus beta 1 times x instead of beta 0 minus beta 1 times x? 489 00:27:01,810 --> 00:27:02,620 DUANE BONING: Good. 490 00:27:02,620 --> 00:27:04,960 Yeah. 491 00:27:04,960 --> 00:27:06,880 I guess I could change the sign in the middle, 492 00:27:06,880 --> 00:27:08,500 but I probably shouldn't do that. 493 00:27:08,500 --> 00:27:12,510 Yeah, that should probably be a plus there. 494 00:27:12,510 --> 00:27:14,280 Thank you. 495 00:27:14,280 --> 00:27:20,490 So we could, again, formulate our problem just 496 00:27:20,490 --> 00:27:21,990 to give you a quick sense of where 497 00:27:21,990 --> 00:27:28,770 that funky description of this-- using this pseudo-inverse idea 498 00:27:28,770 --> 00:27:29,670 comes from. 
499 00:27:29,670 --> 00:27:31,860 Again, if I have my linear model, 500 00:27:31,860 --> 00:27:37,920 I can express based on my measured data. 501 00:27:37,920 --> 00:27:40,050 These are my different conditions 502 00:27:40,050 --> 00:27:42,640 and these are my model terms. 503 00:27:42,640 --> 00:27:48,700 I've got a way to express the error for all of my runs, 504 00:27:48,700 --> 00:27:51,810 and now the squared error can be written in matrix form 505 00:27:51,810 --> 00:27:57,630 as the transpose of that error vector times the error vector. 506 00:27:57,630 --> 00:28:00,270 And now I simply substitute in, and you 507 00:28:00,270 --> 00:28:02,740 get something like that. 508 00:28:02,740 --> 00:28:05,250 And now you can do the derivative. 509 00:28:05,250 --> 00:28:07,380 What we're looking for is the point 510 00:28:07,380 --> 00:28:11,430 at which this quadratic error is minimum. 511 00:28:11,430 --> 00:28:15,870 So I can do the derivative there, and at that point, 512 00:28:15,870 --> 00:28:21,420 I'm looking for the bottom of that squared error 513 00:28:21,420 --> 00:28:24,430 curve, that quadratic error-- 514 00:28:24,430 --> 00:28:28,030 which, if I differentiate this, I get this. 515 00:28:28,030 --> 00:28:30,210 And now I can simply solve, and then, 516 00:28:30,210 --> 00:28:35,280 when you go ahead and plug that in, that drives for us-- 517 00:28:35,280 --> 00:28:38,820 this funky thing, where you take your data condition 518 00:28:38,820 --> 00:28:41,700 x transpose x, then you do the inverse 519 00:28:41,700 --> 00:28:50,060 of that times the x transpose with your measured outputs 520 00:28:50,060 --> 00:28:52,910 to solve for beta. 521 00:28:52,910 --> 00:28:56,600 So there is a matrix algebra formulation here 522 00:28:56,600 --> 00:29:01,130 that's basically giving us the best minimum error fit 523 00:29:01,130 --> 00:29:02,460 for all of our data. 
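As a quick sketch of that result-- with made-up replicate data, not the actual experiment-- the pseudo-inverse formula beta = (X^T X)^-1 X^T y can be applied directly once replicates make X tall rather than square:

```python
import numpy as np

# Replicated runs at two levels of one factor make the system
# over-specified; the least-squares (pseudo-inverse) solution
# beta = (X^T X)^-1 X^T y minimizes the summed squared error.
x = np.array([1.0, 1.0, 1.0, 2.0, 2.0, 2.0])        # 3 replicates per level
y = np.array([74.9, 75.1, 75.0, 75.7, 75.9, 75.8])  # hypothetical outputs
X = np.column_stack([np.ones_like(x), x])           # rows are [1, x_i]
beta = np.linalg.inv(X.T @ X) @ X.T @ y             # pseudo-inverse fit
```

In practice `np.linalg.lstsq(X, y, rcond=None)` computes the same least-squares fit more robustly than forming the inverse explicitly, but the expression above mirrors the matrix algebra in the derivation.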
524 00:29:02,460 --> 00:29:06,200 So that's a very nice, powerful general linear regression 525 00:29:06,200 --> 00:29:08,770 result. 526 00:29:08,770 --> 00:29:10,630 Here's another picture of the same thing. 527 00:29:10,630 --> 00:29:13,200 This is the same idea-- not the derivation, but what I meant 528 00:29:13,200 --> 00:29:17,430 by when I've got replicates. 529 00:29:17,430 --> 00:29:19,530 Again, this is my measured vector. 530 00:29:22,240 --> 00:29:24,010 These are my conditions. 531 00:29:24,010 --> 00:29:28,780 Again, I'm either running at the low condition 532 00:29:28,780 --> 00:29:31,240 or at the high condition for each of my-- 533 00:29:31,240 --> 00:29:33,760 for my single input here. 534 00:29:33,760 --> 00:29:38,470 I've got two coefficients, so my x times my beta, 535 00:29:38,470 --> 00:29:44,440 again, gives me my beta 0 plus my either x high or x low times 536 00:29:44,440 --> 00:29:45,430 beta 1. 537 00:29:45,430 --> 00:29:48,160 And for each case, I get some measured output. 538 00:29:48,160 --> 00:29:52,540 And the point is now, if I have replicates-- 539 00:29:52,540 --> 00:29:56,800 there's an x low, here's another x low, and if I continued on, 540 00:29:56,800 --> 00:30:00,400 there may be multiple places where my conditions were 541 00:30:00,400 --> 00:30:02,030 exactly the same-- 542 00:30:02,030 --> 00:30:04,570 oops-- my coefficients are exactly the same, 543 00:30:04,570 --> 00:30:07,060 but I get different measured outputs. 544 00:30:07,060 --> 00:30:08,170 So that's how my-- 545 00:30:08,170 --> 00:30:12,690 what used to be my square matrix becomes non-square. 546 00:30:12,690 --> 00:30:19,090 I'm filling in additional data rows as I add more replicates. 
547 00:30:19,090 --> 00:30:23,110 And so in that case, I still got the simple linear model, 548 00:30:23,110 --> 00:30:25,570 but now I have to use not a direct inverse, 549 00:30:25,570 --> 00:30:28,300 but I got to use this sort of pseudo-inverse 550 00:30:28,300 --> 00:30:31,153 in order to fit my beta coefficients. 551 00:30:34,540 --> 00:30:39,550 OK, so that's the very powerful, general linear regression 552 00:30:39,550 --> 00:30:43,420 kind of approach, and it's based on minimizing a squared error 553 00:30:43,420 --> 00:30:45,100 term. 554 00:30:45,100 --> 00:30:48,070 But there's a lot more intuitive way 555 00:30:48,070 --> 00:30:50,920 of looking at this and some other observations that 556 00:30:50,920 --> 00:30:56,590 make estimation easier, and also connect up, 557 00:30:56,590 --> 00:30:59,440 I think, very nicely with our intuition. 558 00:30:59,440 --> 00:31:03,410 One important point is, when I've only got two levels-- 559 00:31:03,410 --> 00:31:05,290 so our one input, two levels-- 560 00:31:05,290 --> 00:31:07,930 we know a line has to fit that data. 561 00:31:07,930 --> 00:31:12,610 And if I've got replicates, the line 562 00:31:12,610 --> 00:31:15,700 actually has to go through-- 563 00:31:15,700 --> 00:31:18,880 provably, and we'll see a very quick little derivation-- 564 00:31:18,880 --> 00:31:21,670 the line has to go through the mean at each 565 00:31:21,670 --> 00:31:23,740 of the data points, my replicate points, 566 00:31:23,740 --> 00:31:25,030 at each of my two levels. 
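A quick numerical check of that claim, with hypothetical replicate data: fitting the line through the two group means gives the same beta 0 and beta 1 that the full pseudo-inverse regression would, since the least-squares line must pass through the mean at each level.

```python
import numpy as np

# With one factor at two levels, the least-squares line passes through
# the mean of the replicates at each level, so the fit reduces to
# two averages and the slope between them (values below are made up).
x_low, x_high = 1.0, 2.0
y_low = np.array([74.9, 75.1, 75.0])    # replicates at the low level
y_high = np.array([75.7, 75.9, 75.8])   # replicates at the high level
beta1 = (y_high.mean() - y_low.mean()) / (x_high - x_low)  # slope
beta0 = y_low.mean() - beta1 * x_low                       # offset
```

This is the "easier on a quiz" route mentioned below: two averages and one difference, no matrix inversion required.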
567 00:31:28,340 --> 00:31:30,860 So in order to fit the data, one is 568 00:31:30,860 --> 00:31:32,420 you could put it all on a big matrix 569 00:31:32,420 --> 00:31:35,210 and send it off to Matlab, or Excel, 570 00:31:35,210 --> 00:31:38,540 or whatever to do this big matrix pseudo-inverse, 571 00:31:38,540 --> 00:31:44,120 or you can simply calculate, look at your data 572 00:31:44,120 --> 00:31:47,690 under your low condition, your data under your high condition, 573 00:31:47,690 --> 00:31:51,050 find the mean, and your line has to go 574 00:31:51,050 --> 00:31:55,650 through the mean of those two conditions. 575 00:31:55,650 --> 00:31:58,640 So one of the two might be a little bit easier on a quiz. 576 00:32:02,710 --> 00:32:04,740 And this is simply working out and showing you, 577 00:32:04,740 --> 00:32:10,860 if, in this simple two-input condition, where I only 578 00:32:10,860 --> 00:32:16,380 had two data points, then you could see how this works. 579 00:32:16,380 --> 00:32:19,320 But with the mean, what's very nice 580 00:32:19,320 --> 00:32:21,990 is now this is the mean of my output, 581 00:32:21,990 --> 00:32:25,920 and what that gives you is a direct expression using 582 00:32:25,920 --> 00:32:30,930 that fact to be able to calculate your offset term 583 00:32:30,930 --> 00:32:32,130 and your beta 1 term. 584 00:32:37,960 --> 00:32:38,580 Let me see. 585 00:32:38,580 --> 00:32:39,663 Am I going to do this yet? 586 00:32:39,663 --> 00:32:40,390 No. 587 00:32:40,390 --> 00:32:44,650 So here's a simple application of that for our real data, 588 00:32:44,650 --> 00:32:46,600 this injection molding case. 589 00:32:46,600 --> 00:32:50,590 Here now I'm doing something kind of sneaky. 590 00:32:50,590 --> 00:32:53,500 This is I'm treating all of that injection molding 591 00:32:53,500 --> 00:32:56,170 experimental design as if I'd only 592 00:32:56,170 --> 00:33:00,010 done an experimental design in hold time. 
593 00:33:00,010 --> 00:33:04,240 So what's really hidden inside of this spread of data 594 00:33:04,240 --> 00:33:08,230 is it is all replicates at the low value x minus 595 00:33:08,230 --> 00:33:09,890 for hold time. 596 00:33:09,890 --> 00:33:13,720 Similarly, these are replicates at x plus, 597 00:33:13,720 --> 00:33:16,300 but they're not really good pure replicates, 598 00:33:16,300 --> 00:33:19,360 because other things were changing. 599 00:33:19,360 --> 00:33:21,890 First off, time might often be changing, 600 00:33:21,890 --> 00:33:24,250 but what was going on in here is I was also 601 00:33:24,250 --> 00:33:28,570 playing with the injection velocity at the same time. 602 00:33:28,570 --> 00:33:32,320 So in this case, this is saying I'm 603 00:33:32,320 --> 00:33:34,570 going to treat these other things that 604 00:33:34,570 --> 00:33:37,570 might be changing, like injection velocity, 605 00:33:37,570 --> 00:33:39,340 as just noise factors. 606 00:33:39,340 --> 00:33:45,880 They're what contributes to the spread at the condition 607 00:33:45,880 --> 00:33:47,710 that I'm actually interested in. 608 00:33:47,710 --> 00:33:50,680 And so you will often do that. 609 00:33:50,680 --> 00:33:53,650 I'm just trying to fit a very simple 610 00:33:53,650 --> 00:33:58,090 first-order linear model, just relating my one factor here 611 00:33:58,090 --> 00:34:01,910 of hold time to my output. 612 00:34:01,910 --> 00:34:04,900 And it's going to have some variance and error 613 00:34:04,900 --> 00:34:09,969 in my estimate because of all these other higher order 614 00:34:09,969 --> 00:34:12,969 terms that I'm not including, other factors that I'm not 615 00:34:12,969 --> 00:34:16,810 including, pure replication error, measurement error. 616 00:34:16,810 --> 00:34:19,030 But what I am trying to do is understand 617 00:34:19,030 --> 00:34:24,429 as closely as possible the effect of that single factor. 
618 00:34:24,429 --> 00:34:27,250 So very often, I really do want to know just the hold time 619 00:34:27,250 --> 00:34:31,060 effect, even when it's contaminated with or spread out 620 00:34:31,060 --> 00:34:33,009 with these other effects. 621 00:34:36,708 --> 00:34:38,500 This is just showing, for that actual data, 622 00:34:38,500 --> 00:34:41,469 I can fit my two means for the spread, 623 00:34:41,469 --> 00:34:46,810 and then fit a very simple linear model. 624 00:34:46,810 --> 00:34:50,560 Now, in this case, I'm treating hold time 625 00:34:50,560 --> 00:34:53,500 as kind of a continuous factor, so you might think, 626 00:34:53,500 --> 00:34:56,480 if these are hold time in minutes, 627 00:34:56,480 --> 00:35:00,160 now you can interpolate. 628 00:35:00,160 --> 00:35:03,820 And I could ask what I think the hold 629 00:35:03,820 --> 00:35:06,490 time-- or the effect on my dimension 630 00:35:06,490 --> 00:35:09,850 might be at 1.5 minutes. 631 00:35:09,850 --> 00:35:11,980 So I'm interpolating within my data. 632 00:35:11,980 --> 00:35:15,130 I could also extrapolate and ask questions. 633 00:35:15,130 --> 00:35:19,035 Well, what do I think it'll be out here at 2.5 minutes? 634 00:35:19,035 --> 00:35:20,410 That might be a little bit risky, 635 00:35:20,410 --> 00:35:23,590 and we'll talk about that a little bit 636 00:35:23,590 --> 00:35:27,700 in a later lecture, when we talk about the confidence intervals. 637 00:35:27,700 --> 00:35:29,740 But another point I want to make 638 00:35:29,740 --> 00:35:40,300 is this same picture would apply even if these 639 00:35:40,300 --> 00:35:43,760 weren't continuous parameters. 
640 00:35:43,760 --> 00:35:46,670 And I find it easier to think about the problem in terms 641 00:35:46,670 --> 00:35:49,040 of continuous parameters, because you're often 642 00:35:49,040 --> 00:35:52,400 changing a pressure, or a velocity, or a hold time, 643 00:35:52,400 --> 00:35:54,950 and looking at a continuous variation in some output 644 00:35:54,950 --> 00:35:55,760 parameter. 645 00:35:55,760 --> 00:36:01,310 But a very important point that I do want to make here is, 646 00:36:01,310 --> 00:36:09,680 if I were to use exactly the same chart and relabel the axes 647 00:36:09,680 --> 00:36:13,730 and say something like, this is the dimension-- 648 00:36:13,730 --> 00:36:15,740 this is still dimension-- 649 00:36:15,740 --> 00:36:19,730 and I'm doing injection molding, but here I'm 650 00:36:19,730 --> 00:36:30,030 using it from a tool from different manufacturers. 651 00:36:30,030 --> 00:36:32,670 So here, I run the tool on-- 652 00:36:32,670 --> 00:36:36,330 I don't know-- Fritze Inc's injection molding tool, 653 00:36:36,330 --> 00:36:44,670 and over here it's on Xenon Corp's tool. 654 00:36:44,670 --> 00:36:48,960 I could have essentially the same model, the same formalism 655 00:36:48,960 --> 00:36:53,100 if these were discrete levels, discrete choices 656 00:36:53,100 --> 00:36:54,270 in some factor. 657 00:36:54,270 --> 00:36:56,670 Here my factor is, which choice of tool 658 00:36:56,670 --> 00:36:58,560 would I want to be making? 659 00:36:58,560 --> 00:37:00,330 I want to know, is there an effect? 660 00:37:00,330 --> 00:37:04,830 I maybe even want to build a model with model coefficients 661 00:37:04,830 --> 00:37:11,280 for the effect, depending on what factor level I pick. 
662 00:37:11,280 --> 00:37:12,840 Now, if these were discrete choices, 663 00:37:12,840 --> 00:37:17,380 it no longer makes sense to interpolate or extrapolate, 664 00:37:17,380 --> 00:37:20,730 but I do want to just highlight that the same formalism-- 665 00:37:20,730 --> 00:37:23,340 especially for factorial designs-- 666 00:37:23,340 --> 00:37:26,670 applies to both continuous parameters 667 00:37:26,670 --> 00:37:28,388 and discrete parameters. 668 00:37:28,388 --> 00:37:30,180 I'm going to come back to that a little bit 669 00:37:30,180 --> 00:37:33,390 when we talk about what 670 00:37:33,390 --> 00:37:35,670 slightly different ways of looking at the model 671 00:37:35,670 --> 00:37:37,260 coefficients might be in those cases. 672 00:37:39,850 --> 00:37:42,030 But I just want to remind you or highlight 673 00:37:42,030 --> 00:37:46,530 that you can use these for both continuous and discrete kinds 674 00:37:46,530 --> 00:37:49,390 of designs and factors. 675 00:37:49,390 --> 00:37:54,870 OK, so continuing with our plan here-- 676 00:37:54,870 --> 00:37:57,270 we've talked about multiple inputs and effects-- 677 00:37:57,270 --> 00:38:03,030 what I want to do next is say, if I do have multiple inputs, 678 00:38:03,030 --> 00:38:06,030 how does our ANOVA formalism apply 679 00:38:06,030 --> 00:38:10,830 to being able to decide whether both factors matter 680 00:38:10,830 --> 00:38:14,520 or not-- or, with three factors, which factors actually matter. 681 00:38:14,520 --> 00:38:16,800 And based on my observations, do I 682 00:38:16,800 --> 00:38:20,080 think that that factor actually has an effect or not? 683 00:38:20,080 --> 00:38:22,950 So I want to derive first real quickly, 684 00:38:22,950 --> 00:38:25,200 or show for us the different ANOVA 685 00:38:25,200 --> 00:38:27,220 formalisms in these cases. 686 00:38:27,220 --> 00:38:29,160 And then we'll come back a little bit more 687 00:38:29,160 --> 00:38:34,720 to the estimation of these model coefficients. 
688 00:38:34,720 --> 00:38:35,890 Oh, what happened? 689 00:38:35,890 --> 00:38:37,750 There we go. 690 00:38:37,750 --> 00:38:38,980 OK. 691 00:38:38,980 --> 00:38:41,380 So back to our injection molding problem, 692 00:38:41,380 --> 00:38:45,310 now with two factors, where we have the velocity 693 00:38:45,310 --> 00:38:47,960 and we have the hold time-- 694 00:38:47,960 --> 00:38:50,860 and this is where I said I was being a little bit sneaky 695 00:38:50,860 --> 00:38:53,360 when I was showing you the one factor data. 696 00:38:53,360 --> 00:38:56,980 In fact, the setup here was to do four different combinations 697 00:38:56,980 --> 00:38:58,960 of these two factors-- 698 00:38:58,960 --> 00:39:01,690 two-factor, two-level experimental design-- 699 00:39:01,690 --> 00:39:04,240 with replicates at each factor-- 700 00:39:04,240 --> 00:39:06,970 something like 14 or 15 replicates. 701 00:39:06,970 --> 00:39:09,580 And here nothing was intentionally changing, 702 00:39:09,580 --> 00:39:12,880 so these are more like pure replicates 703 00:39:12,880 --> 00:39:16,450 within that set of experimental conditions. 704 00:39:16,450 --> 00:39:19,490 And what we can do is look at, in that case, where I've 705 00:39:19,490 --> 00:39:22,210 got pure replicates, what is the average 706 00:39:22,210 --> 00:39:24,620 at that test condition-- 707 00:39:24,620 --> 00:39:27,700 so test condition 1, where it was low and low 708 00:39:27,700 --> 00:39:32,710 for the two factors, and so on for test condition 2, 3, and 4. 709 00:39:32,710 --> 00:39:34,480 And now we can ask the question-- 710 00:39:34,480 --> 00:39:38,440 not just in the univariate case, but in the multivariate case, 711 00:39:38,440 --> 00:39:41,350 what kind of model works? 
712 00:39:41,350 --> 00:39:43,870 And this is actually a description 713 00:39:43,870 --> 00:39:46,330 of the model that might be more appropriate if they 714 00:39:46,330 --> 00:39:51,720 were discrete choices and you weren't 715 00:39:51,720 --> 00:39:54,150 trying to do interpolation. 716 00:39:54,150 --> 00:39:56,160 But it's also implicitly the model 717 00:39:56,160 --> 00:39:58,890 that we're using for ANOVA analysis, 718 00:39:58,890 --> 00:40:01,710 because the ANOVA doesn't care whether it's 719 00:40:01,710 --> 00:40:05,460 a continuous parameter or a discrete parameter 720 00:40:05,460 --> 00:40:07,510 that we're using. 721 00:40:07,510 --> 00:40:10,890 So the basic idea for a two-- 722 00:40:10,890 --> 00:40:13,230 this is a two-factor kind of model. 723 00:40:13,230 --> 00:40:18,940 This is very close to what we saw with the ANOVA on Tuesday. 724 00:40:18,940 --> 00:40:20,760 One of the factors might be-- 725 00:40:20,760 --> 00:40:23,880 or the treatment effect that we might be looking for might be-- 726 00:40:23,880 --> 00:40:26,790 the velocity effect. 727 00:40:26,790 --> 00:40:28,530 The other might be the hold time effect, 728 00:40:28,530 --> 00:40:31,740 and then we may be looking for interactions between the two. 729 00:40:31,740 --> 00:40:34,068 And the picture you should have in your mind 730 00:40:34,068 --> 00:40:35,610 is that what we're going to try to do 731 00:40:35,610 --> 00:40:37,830 is model this with an overall mean-- 732 00:40:37,830 --> 00:40:42,360 some mu-- and then we're asking the question-- 733 00:40:42,360 --> 00:40:44,910 think back to the ANOVA question-- 734 00:40:44,910 --> 00:40:48,330 is there a deviation from the grand mean, 735 00:40:48,330 --> 00:40:54,840 from the overall mean, if I have velocity at the high condition? 736 00:40:54,840 --> 00:40:57,240 Is there a statistically significant shift 737 00:40:57,240 --> 00:41:00,580 up for a high setting? 
738 00:41:00,580 --> 00:41:05,070 And similarly, is there a statistically significant 739 00:41:05,070 --> 00:41:06,780 low effect? 740 00:41:06,780 --> 00:41:10,020 So is there an up or down shift, depending on 741 00:41:10,020 --> 00:41:14,160 whether I pick the high or low condition for velocity? 742 00:41:14,160 --> 00:41:17,550 And similarly, I would ask the question, is there an effect-- 743 00:41:17,550 --> 00:41:23,400 a beta 1 effect or a beta 2 effect-- 744 00:41:23,400 --> 00:41:27,735 depending on whether I selected the low-- 745 00:41:31,910 --> 00:41:35,420 I guess this is the high-- the high condition 746 00:41:35,420 --> 00:41:41,730 or the low condition for that actual variable? 747 00:41:41,730 --> 00:41:45,860 So these are discrete offsets, depending 748 00:41:45,860 --> 00:41:50,600 on whether I was at the low or the high condition. 749 00:41:50,600 --> 00:41:53,450 So i indicates, in our earlier notation, 750 00:41:53,450 --> 00:41:54,950 either the plus or minus. 751 00:41:54,950 --> 00:41:57,020 Here, we're switching notation a little bit 752 00:41:57,020 --> 00:41:59,690 to just be a numerical index-- 753 00:41:59,690 --> 00:42:02,690 either 1 or 2 in this case, but if I 754 00:42:02,690 --> 00:42:07,370 had multiple levels of that factor, I could extend that. 755 00:42:07,370 --> 00:42:12,440 I can have little deltas with a coefficient that 756 00:42:12,440 --> 00:42:15,680 would go for each delta, depending on the discrete value 757 00:42:15,680 --> 00:42:18,250 that I picked for that level. 758 00:42:18,250 --> 00:42:18,750 Yeah? 759 00:42:18,750 --> 00:42:20,542 AUDIENCE: What if it's a shift of variance, 760 00:42:20,542 --> 00:42:23,442 and not a shift in the mean altering one of them? 761 00:42:23,442 --> 00:42:24,900 DUANE BONING: It's a good question. 762 00:42:24,900 --> 00:42:27,200 The question was, what if it's a shift in variance? 
763 00:42:27,200 --> 00:42:31,580 Here we are looking for just-- 764 00:42:31,580 --> 00:42:37,650 we're just modeling the mean value of the output with this. 765 00:42:37,650 --> 00:42:41,870 You could similarly start looking at building up 766 00:42:41,870 --> 00:42:43,190 models of variance. 767 00:42:47,030 --> 00:42:49,280 You would probably not be thinking of it 768 00:42:49,280 --> 00:42:52,980 as additive variance terms in those cases, 769 00:42:52,980 --> 00:42:55,520 so you might also end up doing some other kinds 770 00:42:55,520 --> 00:42:57,860 of transformations on your data to get 771 00:42:57,860 --> 00:43:02,480 to additive models of variance. 772 00:43:02,480 --> 00:43:06,020 But basically, if I had enough data, 773 00:43:06,020 --> 00:43:10,670 you could start to try to model variance. 774 00:43:10,670 --> 00:43:13,280 If you think back to the ANOVA formulation, 775 00:43:13,280 --> 00:43:17,360 the simple ANOVA is all assuming your variance 776 00:43:17,360 --> 00:43:21,120 is exactly the same across all of your conditions. 777 00:43:21,120 --> 00:43:30,650 There are wonderful books on more powerful modeling 778 00:43:30,650 --> 00:43:33,440 and how ANOVA extends to those. 779 00:43:33,440 --> 00:43:34,730 In fact, one of my favorites-- 780 00:43:34,730 --> 00:43:37,280 I lent it to Hayden-- 781 00:43:37,280 --> 00:43:40,730 is a book by George Milliken, the title of which 782 00:43:40,730 --> 00:43:44,240 is Analysis of Messy Data. 783 00:43:44,240 --> 00:43:46,790 And it deals with all kinds of wonderful conditions-- 784 00:43:46,790 --> 00:43:49,640 not only modeling of variance, but also 785 00:43:49,640 --> 00:43:53,060 when you've got very complicated design situations 786 00:43:53,060 --> 00:43:57,500 with partially replicated data, unbalanced designs. 
787 00:43:57,500 --> 00:43:59,720 You run the whole experimental design and you're 788 00:43:59,720 --> 00:44:02,750 missing a data point, because the experiment crashed-- 789 00:44:02,750 --> 00:44:05,570 what do you do now? 790 00:44:05,570 --> 00:44:07,620 Looks at every possible thing-- 791 00:44:07,620 --> 00:44:12,740 so here, we're really looking just at mean modeling. 792 00:44:12,740 --> 00:44:14,420 Just to finish this up, I should also 793 00:44:14,420 --> 00:44:17,510 point out that there would be discrete offsets, depending 794 00:44:17,510 --> 00:44:24,480 on every combination of our interaction factors as well. 795 00:44:24,480 --> 00:44:30,690 And so I might have discrete model terms for those as well. 796 00:44:30,690 --> 00:44:34,220 Now, an important point just to make here is that, 797 00:44:34,220 --> 00:44:38,390 in this kind of a model-- and it'll maybe become clearer 798 00:44:38,390 --> 00:44:39,770 a little bit later-- 799 00:44:39,770 --> 00:44:42,260 each of these model coefficients-- 800 00:44:42,260 --> 00:44:48,770 this tau 1 and a tau 2 effect, for example, for the velocity-- 801 00:44:48,770 --> 00:44:52,010 they're not really independent. 802 00:44:52,010 --> 00:44:54,560 And this gets back to, I think, your question about, 803 00:44:54,560 --> 00:44:56,540 in this model, this-- 804 00:44:56,540 --> 00:44:59,030 if I were modeling that as a beta 0 term, 805 00:44:59,030 --> 00:45:01,010 this really is the mean, and that 806 00:45:01,010 --> 00:45:03,050 imposes an additional linear constraint 807 00:45:03,050 --> 00:45:04,700 on my other coefficients. 808 00:45:04,700 --> 00:45:07,400 And in particular, that tau 1 and tau 2 809 00:45:07,400 --> 00:45:09,380 would have to be equal and opposite here. 810 00:45:13,370 --> 00:45:17,570 This is a great model for being able to conceptualize and plug 811 00:45:17,570 --> 00:45:20,750 in, depending on what factor levels you have. 
812 00:45:20,750 --> 00:45:23,540 You just have to be careful in fitting the model, 813 00:45:23,540 --> 00:45:27,500 and recognize that, in this kind of a formulation, 814 00:45:27,500 --> 00:45:33,650 there's some implied dependencies within this-- 815 00:45:33,650 --> 00:45:36,620 in particular, that each of the factor levels-- 816 00:45:36,620 --> 00:45:39,830 factor effects have to be 0 mean overall, 817 00:45:39,830 --> 00:45:41,690 and the interaction terms, similarly, 818 00:45:41,690 --> 00:45:43,640 have to be, in aggregate, 0 mean. 819 00:45:46,160 --> 00:45:49,740 That kind of makes sense, because-- 820 00:45:49,740 --> 00:45:55,110 think of the case where I've got two inputs, two outputs-- 821 00:45:55,110 --> 00:45:59,010 or one output, got my two factors-- 822 00:45:59,010 --> 00:46:02,220 velocity and hold time-- and I'm just doing two levels for each. 823 00:46:02,220 --> 00:46:04,710 I've got four different conditions. 824 00:46:04,710 --> 00:46:06,270 I may have replication, but I've only 825 00:46:06,270 --> 00:46:07,900 got four different conditions. 826 00:46:07,900 --> 00:46:15,300 How could I possibly fit a mean, two taus for factor 1, 827 00:46:15,300 --> 00:46:18,240 two coefficients-- taus for-- 828 00:46:18,240 --> 00:46:23,010 or betas here for factor 2, and four interaction terms? 829 00:46:23,010 --> 00:46:25,020 That's four, five, six, seven, eight-- 830 00:46:25,020 --> 00:46:28,545 nine model coefficients, four data points. 831 00:46:31,780 --> 00:46:34,860 So clearly, there's got to be some additional constraints 832 00:46:34,860 --> 00:46:40,220 on the model coefficients hidden under that. 833 00:46:40,220 --> 00:46:46,040 OK, so that's one way of looking at the experimental design 834 00:46:46,040 --> 00:46:47,610 and the data. 
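One way to see those constraints in action is to compute the effects from the four cell means of the 2x2 design. The cell averages below are hypothetical, chosen only to illustrate the bookkeeping: the taus come out equal and opposite, the betas come out equal and opposite, and the interaction terms sum to zero across every row and column, so the nine nominal coefficients really carry only four independent values.

```python
import numpy as np

# Effects-model coefficients from the four cell means of a 2x2 design.
# Rows index factor A (velocity) low/high; columns index factor B
# (hold time) low/high. The cell averages here are made up.
cell = np.array([[71.5, 72.5],
                 [74.5, 77.5]])
mu = cell.mean()                         # grand mean
tau = cell.mean(axis=1) - mu             # A effects: equal and opposite
beta = cell.mean(axis=0) - mu            # B effects: equal and opposite
inter = cell - mu - tau[:, None] - beta[None, :]  # interaction offsets
# tau.sum() == 0, beta.sum() == 0, and each row/column of inter sums
# to zero: the zero-mean constraints discussed above.
```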
835 00:46:47,610 --> 00:46:52,640 Here's another way, simply pointing out-- 836 00:46:52,640 --> 00:46:56,690 what we'll often be doing is building tables 837 00:46:56,690 --> 00:47:00,110 to show the high and low condition 838 00:47:00,110 --> 00:47:02,840 for our different factors and the levels. 839 00:47:02,840 --> 00:47:06,170 We might label those, and then often, we'll 840 00:47:06,170 --> 00:47:12,260 illustrate our design space or experimental space 841 00:47:12,260 --> 00:47:18,360 with a design matrix or a design plot. 842 00:47:18,360 --> 00:47:20,450 So here I'm showing simply-- 843 00:47:20,450 --> 00:47:23,990 the four different conditions for velocity and hold 844 00:47:23,990 --> 00:47:26,250 time, either low or high. 845 00:47:26,250 --> 00:47:29,840 And I'm also doing something a little bit subtle, 846 00:47:29,840 --> 00:47:33,260 which is labeling them with not just a plus or a minus, 847 00:47:33,260 --> 00:47:37,325 but in fact, giving them a normalized value of plus 1 848 00:47:37,325 --> 00:47:38,120 or minus 1. 849 00:47:41,150 --> 00:47:43,350 There's some subtle reasons for doing that. 850 00:47:43,350 --> 00:47:50,720 One is, by definition, when I now fit for a two-level 851 00:47:50,720 --> 00:47:52,550 experiment-- 852 00:47:52,550 --> 00:47:54,560 when I fit a model coefficient to that, 853 00:47:54,560 --> 00:47:57,410 it will always be 0 mean, and so now 854 00:47:57,410 --> 00:48:02,270 that beta 0 term really does turn into the mean estimate. 855 00:48:02,270 --> 00:48:06,120 But-- we might see that a little bit later. 856 00:48:06,120 --> 00:48:08,330 But the point is here now this is basically 857 00:48:08,330 --> 00:48:11,780 expressing all four of my conditions, my test conditions 858 00:48:11,780 --> 00:48:15,350 from the previous design. 859 00:48:15,350 --> 00:48:18,500 And I can also plot that, each one of these test conditions, 860 00:48:18,500 --> 00:48:21,930 on the x1 axis. 
861 00:48:21,930 --> 00:48:24,830 We said this is velocity, basically. 862 00:48:24,830 --> 00:48:30,490 And this is my x2 axis, which is hold time. 863 00:48:30,490 --> 00:48:33,760 So it's either at the plus 1 or the minus 1 level 864 00:48:33,760 --> 00:48:35,950 for that factor. 865 00:48:35,950 --> 00:48:39,340 And so I've got an experimental design, 866 00:48:39,340 --> 00:48:42,280 where I've got four sample points 867 00:48:42,280 --> 00:48:46,330 at the plus minus every combination 868 00:48:46,330 --> 00:48:48,790 of those particular factors. 869 00:48:48,790 --> 00:48:53,590 And so we'll often talk about this as a corner 870 00:48:53,590 --> 00:49:04,400 point DOE or a full factorial DOE, 871 00:49:04,400 --> 00:49:08,270 where I've got every combination of those discrete factor 872 00:49:08,270 --> 00:49:11,990 levels being mapped out. 873 00:49:11,990 --> 00:49:14,070 And then I go and I run my tests. 874 00:49:14,070 --> 00:49:17,330 And here I've just written down the results 875 00:49:17,330 --> 00:49:21,200 of the average of the output over all replicates 876 00:49:21,200 --> 00:49:22,770 for that point. 877 00:49:22,770 --> 00:49:27,500 So this was over the 14 or 15 replicates. 878 00:49:27,500 --> 00:49:30,320 Notice here, this is just mapping out the design space, 879 00:49:30,320 --> 00:49:31,970 my x1 versus x2. 880 00:49:31,970 --> 00:49:37,730 It's not really showing you on a third axis what the output is. 881 00:49:37,730 --> 00:49:40,610 We could also do that and have a 3D plot, 882 00:49:40,610 --> 00:49:43,680 and we'll see that in a second. 
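The coded design matrix can be sketched like this, using hypothetical cell averages as the response. Because the plus 1 / minus 1 columns are zero mean and orthogonal to the intercept column, the fitted beta 0 comes out exactly as the grand mean of the responses, which is the subtle payoff of the normalized coding mentioned above.

```python
import itertools
import numpy as np

# Full factorial design for two factors at coded levels -1 and +1.
runs = np.array(list(itertools.product([-1, 1], [-1, 1])))  # 4 corner points
X = np.column_stack([np.ones(4), runs])  # columns: intercept, x1, x2
y = np.array([71.5, 72.5, 74.5, 77.5])   # hypothetical cell averages
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# With +/-1 coding, beta[0] equals y.mean(), and beta[1], beta[2] are
# each half the average change from a factor's low to high level.
```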
883 00:49:43,680 --> 00:49:47,660 But the point for doing this formulation, this full effects 884 00:49:47,660 --> 00:49:52,130 model with these effects offsets and just 885 00:49:52,130 --> 00:49:55,460 reminding us what the model exploration here 886 00:49:55,460 --> 00:49:59,270 is, is that what we would also like to do, 887 00:49:59,270 --> 00:50:01,200 even when I've got more than one factor-- 888 00:50:01,200 --> 00:50:03,410 I've got these x1 and x2 factors-- 889 00:50:03,410 --> 00:50:07,880 is ask the ANOVA question, which would be-- 890 00:50:07,880 --> 00:50:11,300 from an ANOVA point of view, I would like to ask, 891 00:50:11,300 --> 00:50:15,830 does velocity have an effect? 892 00:50:15,830 --> 00:50:19,490 If I change from the low to the high, 893 00:50:19,490 --> 00:50:23,720 do I get an appreciable and significant change 894 00:50:23,720 --> 00:50:25,380 in the output? 895 00:50:25,380 --> 00:50:27,590 So from an ANOVA point of view, if I'm just 896 00:50:27,590 --> 00:50:31,530 looking at velocity, I would like to ask a question-- 897 00:50:31,530 --> 00:50:34,670 just looking together across all of the data 898 00:50:34,670 --> 00:50:37,040 that I've got at the low condition for x1 899 00:50:37,040 --> 00:50:39,980 and all of the data at the high condition for x1, 900 00:50:39,980 --> 00:50:42,660 I might ask that question and say, 901 00:50:42,660 --> 00:50:47,000 is there a shift from that perspective or not? 902 00:50:47,000 --> 00:50:50,220 So really, we're back to a hypothesis test-- 903 00:50:50,220 --> 00:50:52,530 oh, what happened there? 904 00:50:52,530 --> 00:50:57,200 This should be a dot, dot, dot. 905 00:50:57,200 --> 00:51:00,680 [INAUDIBLE]-- which is basically asking the question, 906 00:51:00,680 --> 00:51:03,140 in the null hypothesis, there is no effect-- 907 00:51:03,140 --> 00:51:11,780 all of my taus are equal and 0, and all my interactions are 0. 
908 00:51:11,780 --> 00:51:14,450 And in the alternative hypothesis, 909 00:51:14,450 --> 00:51:17,070 something has an effect. 910 00:51:17,070 --> 00:51:19,460 So that's the key question. 911 00:51:19,460 --> 00:51:22,130 Now, what you can do is start formulating, exactly 912 00:51:22,130 --> 00:51:25,880 like we did with ANOVA, sums of squared deviations 913 00:51:25,880 --> 00:51:29,630 across, for example, all of the A levels 914 00:51:29,630 --> 00:51:33,170 averaged over the B factors and any replicates. 915 00:51:33,170 --> 00:51:36,130 Remember, this is replicates. 916 00:51:36,130 --> 00:51:37,670 This is the B factor. 917 00:51:37,670 --> 00:51:43,390 So for example, this sum right here would be-- 918 00:51:43,390 --> 00:51:45,460 let's see-- forgot which one-- 919 00:51:48,160 --> 00:51:50,740 I'm looking at the responses for the first level, 920 00:51:50,740 --> 00:51:53,080 so call it the A level. 921 00:51:53,080 --> 00:51:58,990 So that would be basically the big red thing 922 00:51:58,990 --> 00:52:00,700 that I had drawn here. 923 00:52:00,700 --> 00:52:06,370 I basically call this first one factor A, 924 00:52:06,370 --> 00:52:08,650 and what I'm simply doing is averaging over 925 00:52:08,650 --> 00:52:11,950 all of the n replicates at each of these points 926 00:52:11,950 --> 00:52:15,190 and over the B levels at that point. 927 00:52:18,110 --> 00:52:23,860 So I can form a sum of squared set of deviations from the mean 928 00:52:23,860 --> 00:52:26,680 as I change factor A. I can similarly 929 00:52:26,680 --> 00:52:29,180 do that as I change factor B. 
930 00:52:29,180 --> 00:52:31,240 Then I can also look at these interaction 931 00:52:31,240 --> 00:52:34,840 sum of squared terms, and once I have these, 932 00:52:34,840 --> 00:52:37,690 I can now start to break down the total sum 933 00:52:37,690 --> 00:52:45,310 of squared deviations in my data into components due to factor 934 00:52:45,310 --> 00:52:50,530 A, sum of squares from-- 935 00:52:50,530 --> 00:52:56,830 deviations from the grand mean for factor B, and terms 936 00:52:56,830 --> 00:52:59,260 here from interactions. 937 00:53:03,470 --> 00:53:08,910 And then this is residuals, sum of squared error. 938 00:53:08,910 --> 00:53:11,420 So basically, it's just like we did with the single factor 939 00:53:11,420 --> 00:53:15,410 ANOVA, but now I can do that, well, looking and seeing, 940 00:53:15,410 --> 00:53:16,760 did factor A have an effect? 941 00:53:16,760 --> 00:53:19,160 What's the sum of squared deviations 942 00:53:19,160 --> 00:53:21,230 that I observed in my data from the mean 943 00:53:21,230 --> 00:53:23,480 as I change that factor-- 944 00:53:23,480 --> 00:53:26,000 and similarly, for factor B. And I 945 00:53:26,000 --> 00:53:28,490 could extend this on to more factors, 946 00:53:28,490 --> 00:53:30,590 if I wanted-- if I had three factors or four. 
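The sum-of-squares bookkeeping described here can be sketched for a balanced two-factor layout with replicates. This is an illustrative helper, not the lecture's code; the formulas are the standard two-way ANOVA identities (SS for factor A, factor B, the AB interaction, and the within-cell residual), and the nested-list data layout is just one convenient choice.

```python
def two_way_ss(y):
    """Sum-of-squares decomposition for a balanced two-factor layout.
    y[i][j] is the list of n replicate observations at level i of
    factor A and level j of factor B.  Returns (SS_A, SS_B, SS_AB, SS_E)."""
    a, b = len(y), len(y[0])
    n = len(y[0][0])
    # Cell means, marginal means, and the grand mean.
    cell = [[sum(y[i][j]) / n for j in range(b)] for i in range(a)]
    grand = sum(sum(row) for row in cell) / (a * b)
    a_mean = [sum(cell[i]) / b for i in range(a)]
    b_mean = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]
    # Deviations from the grand mean as each factor changes...
    ss_a = b * n * sum((m - grand) ** 2 for m in a_mean)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_mean)
    # ...plus what the cell means do beyond the additive A and B pieces...
    ss_ab = n * sum((cell[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    # ...plus pure replication error within each cell.
    ss_e = sum((obs - cell[i][j]) ** 2
               for i in range(a) for j in range(b) for obs in y[i][j])
    return ss_a, ss_b, ss_ab, ss_e

# Tiny made-up layout: a = b = 2 levels, n = 2 replicates per cell.
parts = two_way_ss([[[1.0, 2.0], [3.0, 4.0]],
                    [[5.0, 6.0], [7.0, 8.0]]])
```

By construction the four pieces add up to the total sum of squared deviations from the grand mean, which is exactly the decomposition written on the slide.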
947 00:53:33,920 --> 00:53:36,230 This is also just reminding us or laying out 948 00:53:36,230 --> 00:53:39,500 that there are different degrees of freedom in each case, which 949 00:53:39,500 --> 00:53:41,960 we're going to need when we want to estimate 950 00:53:41,960 --> 00:53:44,540 the mean square, because what we'll do 951 00:53:44,540 --> 00:53:47,480 is we'll take the sum of squared deviations, 952 00:53:47,480 --> 00:53:50,880 divide it by its degrees of freedom-- 953 00:53:50,880 --> 00:53:53,840 which is the number of factor levels I had minus 1 954 00:53:53,840 --> 00:53:55,640 in each of those cases-- 955 00:53:55,640 --> 00:54:00,230 and put these together to get a mean square estimate 956 00:54:00,230 --> 00:54:03,680 for that factor, which is essentially 957 00:54:03,680 --> 00:54:08,450 an estimate of the variance due to changing factor A, 958 00:54:08,450 --> 00:54:11,240 just like in the ANOVA formulation. 959 00:54:11,240 --> 00:54:14,660 And you can put that into an ANOVA table 960 00:54:14,660 --> 00:54:17,030 for multiple factors, where, again, we 961 00:54:17,030 --> 00:54:19,400 have the sum of squared deviations 962 00:54:19,400 --> 00:54:21,050 due to each of those factors. 963 00:54:21,050 --> 00:54:24,770 This is just plugging in those formulas that we saw earlier. 964 00:54:24,770 --> 00:54:29,330 And the interaction-- if I've got pure replicates, that's 965 00:54:29,330 --> 00:54:34,730 a great case where I can separate out exactly just 966 00:54:34,730 --> 00:54:36,950 the replication error, perhaps coming 967 00:54:36,950 --> 00:54:41,390 from measurement or pure process replication error. 968 00:54:41,390 --> 00:54:44,540 And then, with the appropriate degrees of freedom, 969 00:54:44,540 --> 00:54:47,810 I can now form the mean square estimates. 970 00:54:47,810 --> 00:54:51,230 Remember, these are variance estimates. 
971 00:54:51,230 --> 00:54:57,030 And then I can form the right F ratio to decide, 972 00:54:57,030 --> 00:55:00,830 does this mean square, with respect to my 973 00:55:00,830 --> 00:55:04,580 within group mean square-- 974 00:55:04,580 --> 00:55:08,870 is that a large enough effect to be 975 00:55:08,870 --> 00:55:12,780 statistically significant using an F-test? 976 00:55:12,780 --> 00:55:16,250 So it's exactly the same formulation for the ANOVA, 977 00:55:16,250 --> 00:55:18,740 but now where we can break it out 978 00:55:18,740 --> 00:55:23,720 across multiple factors or multiple interactions. 979 00:55:23,720 --> 00:55:24,740 OK. 980 00:55:24,740 --> 00:55:28,830 You actually have to try it and use this to get it, 981 00:55:28,830 --> 00:55:32,150 but the key point here is, when we 982 00:55:32,150 --> 00:55:35,300 are building these multiple factor models, 983 00:55:35,300 --> 00:55:39,620 it's still an ANOVA formalism to decide whether or not 984 00:55:39,620 --> 00:55:43,010 that factor is statistically significant or not. 985 00:55:46,180 --> 00:55:50,560 So let's look at this in the case of a linear model. 986 00:55:50,560 --> 00:55:53,830 And I'm going to give you a little bit more terminology 987 00:55:53,830 --> 00:55:58,060 here and start to get to this idea of contrasts, which 988 00:55:58,060 --> 00:56:02,620 are easy ways of estimating some of these models. 989 00:56:02,620 --> 00:56:08,140 So this is still our injection molding data. 990 00:56:08,140 --> 00:56:12,460 Factor x1 here was velocity. 991 00:56:12,460 --> 00:56:16,930 This was hold time. 992 00:56:16,930 --> 00:56:19,060 And I've got every combination. 
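The mean-square and F-ratio step can be sketched as below. This is an illustrative helper with made-up sums of squares, not the lecture's injection-molding numbers; the degrees of freedom for a balanced two-factor design with replicates are a-1, b-1, (a-1)(b-1), and ab(n-1), and each resulting F would be compared against the appropriate upper-tail F-distribution quantile (from a table or a stats library) to judge significance.

```python
def f_ratios(ss_a, ss_b, ss_ab, ss_e, a, b, n):
    """Mean squares and F ratios for a balanced two-factor ANOVA
    with a levels of A, b levels of B, and n replicates per cell."""
    ms_a = ss_a / (a - 1)
    ms_b = ss_b / (b - 1)
    ms_ab = ss_ab / ((a - 1) * (b - 1))
    ms_e = ss_e / (a * b * (n - 1))  # within-cell (pure replication) error
    # Each mean square is compared against the replication-error mean square.
    return ms_a / ms_e, ms_b / ms_e, ms_ab / ms_e

# Illustrative sums of squares only, for a 2 x 2 design with n = 2 replicates.
f_a, f_b, f_ab = f_ratios(32.0, 8.0, 0.0, 2.0, a=2, b=2, n=2)
```

A large F for a factor says the variance attributable to changing that factor is big compared to the pure replication variance, which is exactly the single-factor ANOVA logic carried over to multiple factors.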
993 00:56:19,060 --> 00:56:22,840 What's new on this page that I want to draw your attention to 994 00:56:22,840 --> 00:56:26,470 is some additional terminology that we often 995 00:56:26,470 --> 00:56:30,040 use that's shorthand for indicating 996 00:56:30,040 --> 00:56:36,370 the different combinations of those factor levels, that 997 00:56:36,370 --> 00:56:40,360 tell us exactly what condition combination 998 00:56:40,360 --> 00:56:44,260 we're running for each of our tests. 999 00:56:44,260 --> 00:56:46,150 So I already showed you this table 1000 00:56:46,150 --> 00:56:49,490 which was saying, OK, for my first factor-- 1001 00:56:49,490 --> 00:56:52,330 which we've also labeled as factor A-- 1002 00:56:52,330 --> 00:56:59,490 factor x2 is our second factor. 1003 00:56:59,490 --> 00:57:02,850 So note here, we've got three different kinds 1004 00:57:02,850 --> 00:57:05,400 of-- ways of referring to the same factor. 1005 00:57:05,400 --> 00:57:09,750 The first factor-- factor A, the x1 value-- 1006 00:57:09,750 --> 00:57:11,850 is the velocity term. 1007 00:57:11,850 --> 00:57:17,220 What I can also do is indicate in shorthand which 1008 00:57:17,220 --> 00:57:19,950 condition is actually at work. 1009 00:57:19,950 --> 00:57:23,200 And so it takes a little bit of getting used to, 1010 00:57:23,200 --> 00:57:24,960 but what you'll see here is-- 1011 00:57:24,960 --> 00:57:29,340 let's look at this condition first. 1012 00:57:29,340 --> 00:57:32,190 What I've selected here for test number two 1013 00:57:32,190 --> 00:57:34,770 is the high condition for factor A 1014 00:57:34,770 --> 00:57:38,130 and the low condition for factor B. 1015 00:57:38,130 --> 00:57:39,840 So what we're doing in shorthand is 1016 00:57:39,840 --> 00:57:44,580 indicating with a lowercase version of that factor level 1017 00:57:44,580 --> 00:57:51,550 all of the high selections for that factor. 
1018 00:57:51,550 --> 00:57:54,690 So the high here meant I picked the high condition 1019 00:57:54,690 --> 00:57:58,110 for factor A, and I denote that with a little a. 1020 00:57:58,110 --> 00:58:01,740 And the low factors we just leave off, or think 1021 00:58:01,740 --> 00:58:04,960 of those as a 1 setting. 1022 00:58:04,960 --> 00:58:07,600 So my high condition for factor B-- 1023 00:58:07,600 --> 00:58:12,360 I'll indicate that test setup with a little b. 1024 00:58:12,360 --> 00:58:14,790 And this interaction, where I vary and pick 1025 00:58:14,790 --> 00:58:17,040 the high condition for both of those, 1026 00:58:17,040 --> 00:58:20,490 gives me the AB combination. 1027 00:58:20,490 --> 00:58:24,990 And the overall low condition we often refer to 1028 00:58:24,990 --> 00:58:29,410 as the 1 test condition. 1029 00:58:29,410 --> 00:58:33,130 Now, the reason for doing this will maybe not 1030 00:58:33,130 --> 00:58:34,810 be completely clear, but let me 1031 00:58:34,810 --> 00:58:36,010 give you some intuition. 1032 00:58:36,010 --> 00:58:39,880 When we extend this to larger numbers of factors, 1033 00:58:39,880 --> 00:58:45,370 you want a very shorthand way of describing which combination-- 1034 00:58:45,370 --> 00:58:48,250 which test condition, which combination of your factor 1035 00:58:48,250 --> 00:58:52,720 levels you want to run-- and ways of forming 1036 00:58:52,720 --> 00:58:56,920 sums on the outputs at those conditions 1037 00:58:56,920 --> 00:59:00,520 in order to be able to do our estimates. 1038 00:59:00,520 --> 00:59:02,950 So that's a little bit of additional notation. 1039 00:59:02,950 --> 00:59:07,660 Now, the model that we might want to fit 1040 00:59:07,660 --> 00:59:10,090 is, in fact, perhaps a regression model. 
1041 00:59:10,090 --> 00:59:16,900 It's a little bit different than the factor effects model, 1042 00:59:16,900 --> 00:59:20,140 which becomes a very nice way to do it 1043 00:59:20,140 --> 00:59:26,470 when we've normalized our inputs. 1044 00:59:26,470 --> 00:59:30,790 This works best with normalized input levels-- 1045 00:59:33,310 --> 00:59:39,210 in other words, my x sub i taking on values of plus 1 1046 00:59:39,210 --> 00:59:40,200 and minus 1. 1047 00:59:45,690 --> 00:59:50,160 And we'll see how those relate to the offset or effects model 1048 00:59:50,160 --> 00:59:51,820 in a second. 1049 00:59:51,820 --> 00:59:57,990 So this is now the same plot, my input x1 and my input x2, 1050 00:59:57,990 --> 01:00:00,880 but now I'm also adding the output. 1051 01:00:00,880 --> 01:00:05,560 So I'm getting to my 3D view of the results. 1052 01:00:05,560 --> 01:00:07,560 This is the same graph as before-- 1053 01:00:07,560 --> 01:00:10,830 my factor A and my factor B. 1054 01:00:10,830 --> 01:00:13,170 But then the height here-- each of the values 1055 01:00:13,170 --> 01:00:19,770 is the y bar of my n replicates averaged together 1056 01:00:19,770 --> 01:00:22,350 to give me a sense of what my actual process 1057 01:00:22,350 --> 01:00:29,040 output was at each of my four input conditions. 1058 01:00:29,040 --> 01:00:33,270 So we can talk about here the mean values 1059 01:00:33,270 --> 01:00:36,120 of each of those different treatment conditions, 1060 01:00:36,120 --> 01:00:40,600 again, where each of those is an output across those. 1061 01:00:40,600 --> 01:00:42,430 And now we can start to look and say, 1062 01:00:42,430 --> 01:00:47,520 OK, are there trends as a function of factor A, 1063 01:00:47,520 --> 01:00:49,770 are there trends as a function of factor B, 1064 01:00:49,770 --> 01:00:52,020 and fit model coefficients that 1065 01:00:52,020 --> 01:00:56,190 capture those trends? 
1066 01:00:56,190 --> 01:01:02,030 So a couple more pieces of terminology-- we can be asking, 1067 01:01:02,030 --> 01:01:05,270 is there an effect in the ANOVA sense-- 1068 01:01:05,270 --> 01:01:07,030 and we now want to estimate what 1069 01:01:07,030 --> 01:01:11,140 the effect is with a model coefficient-- 1070 01:01:11,140 --> 01:01:15,110 is there an effect or a main effect-- 1071 01:01:15,110 --> 01:01:17,860 which is the direct linear term-- 1072 01:01:17,860 --> 01:01:19,960 and are there these interaction effects, 1073 01:01:19,960 --> 01:01:28,510 these AB interactions or multiple factor interactions? 1074 01:01:28,510 --> 01:01:31,960 And the perspective that we take is very close 1075 01:01:31,960 --> 01:01:35,620 now to that sneaky trick that I showed-- 1076 01:01:35,620 --> 01:01:38,560 described with the injection molding data. 1077 01:01:38,560 --> 01:01:43,840 If I'm trying to estimate the main effect of my first factor, 1078 01:01:43,840 --> 01:01:47,440 I basically am treating my second factor 1079 01:01:47,440 --> 01:01:48,970 as just noise conditions. 1080 01:01:48,970 --> 01:01:52,240 And I form my aggregate perspective for, 1081 01:01:52,240 --> 01:01:55,540 is there a velocity effect, averaging together 1082 01:01:55,540 --> 01:01:59,860 all of the other data, just considering 1083 01:01:59,860 --> 01:02:02,860 holding the velocity at its low condition or the velocity 1084 01:02:02,860 --> 01:02:04,150 at its high condition. 1085 01:02:04,150 --> 01:02:06,130 And then I basically do a linear fit 1086 01:02:06,130 --> 01:02:09,020 through the mean of those two cases. 1087 01:02:09,020 --> 01:02:12,250 So it's just like we were doing in the linear case, 1088 01:02:12,250 --> 01:02:16,270 but ignoring or lumping together all of the other data 1089 01:02:16,270 --> 01:02:20,030 to estimate the main effect in each case. 
1090 01:02:20,030 --> 01:02:23,170 So there's a way to look at this then in our data 1091 01:02:23,170 --> 01:02:26,260 here for estimating the main effect. 1092 01:02:26,260 --> 01:02:30,600 We would label the main effect for A 1093 01:02:30,600 --> 01:02:33,340 as basically looking and saying, what I'd like to do 1094 01:02:33,340 --> 01:02:37,570 is take both of my high settings for A-- 1095 01:02:37,570 --> 01:02:40,600 this is the plus setting for A-- 1096 01:02:40,600 --> 01:02:43,705 average the outputs together for those two cases-- 1097 01:02:46,360 --> 01:02:51,770 similarly, take the low condition for A-- 1098 01:02:51,770 --> 01:02:54,830 let me use a slightly different color here, make this blue-- 1099 01:02:54,830 --> 01:02:59,600 look at the low condition, and average those together, 1100 01:02:59,600 --> 01:03:02,870 and then look at it in aggregate as I 1101 01:03:02,870 --> 01:03:07,760 went from the low to the high for condition A. 1102 01:03:07,760 --> 01:03:09,830 Is there an average jump up? 1103 01:03:09,830 --> 01:03:14,660 Is there an effect of factor A in aggregate? 1104 01:03:14,660 --> 01:03:19,460 And so what you can actually do is very simply form-- 1105 01:03:19,460 --> 01:03:23,465 or model that overall delta coming from factor A 1106 01:03:23,465 --> 01:03:27,680 as just the overall average between these two points 1107 01:03:27,680 --> 01:03:30,380 and the overall average between these two points, 1108 01:03:30,380 --> 01:03:37,850 and simply fit the line through the mean of those two cases. 1109 01:03:37,850 --> 01:03:40,270 So essentially, I project all of my data 1110 01:03:40,270 --> 01:03:46,240 just down onto my A axis and ask the univariate question, 1111 01:03:46,240 --> 01:03:49,690 does my y output change as a function of A-- 1112 01:03:49,690 --> 01:03:53,000 lumping together all of the other data? 1113 01:03:53,000 --> 01:03:56,290 And so that's simply what we're doing with this equation. 
1114 01:03:56,290 --> 01:03:59,290 And now that terminology that I introduced in the earlier 1115 01:03:59,290 --> 01:04:03,770 picture comes into play. 1116 01:04:03,770 --> 01:04:08,110 You can talk about now the outputs at y sub 1117 01:04:08,110 --> 01:04:13,480 A high as being those cases where 1118 01:04:13,480 --> 01:04:19,030 I had the plus value for A and the low value for B. 1119 01:04:19,030 --> 01:04:22,540 And the other one would be the plus value for A 1120 01:04:22,540 --> 01:04:25,990 and the high condition for B. So these two points 1121 01:04:25,990 --> 01:04:27,900 I'm simply referring to-- 1122 01:04:27,900 --> 01:04:29,650 it's getting kind of messy here, isn't it? 1123 01:04:32,660 --> 01:04:38,030 I can refer to this average right 1124 01:04:38,030 --> 01:04:43,310 here as the A and AB outputs averaged. 1125 01:04:43,310 --> 01:04:45,260 I've got two of those levels, and then n 1126 01:04:45,260 --> 01:04:48,030 replicates at each of those points. 1127 01:04:48,030 --> 01:04:51,020 So that's just a way of expressing that average, 1128 01:04:51,020 --> 01:04:53,630 using our test condition terminology. 1129 01:04:56,940 --> 01:04:57,960 OK. 1130 01:04:57,960 --> 01:04:59,220 Everybody get main effects? 1131 01:05:02,670 --> 01:05:05,310 We can also do interaction effects. 1132 01:05:05,310 --> 01:05:10,050 And they're a little bit more subtle to form the contrast, 1133 01:05:10,050 --> 01:05:16,410 but it's essentially forming a different set of averages, 1134 01:05:16,410 --> 01:05:21,690 and trying to say, as I change both A and B together, 1135 01:05:21,690 --> 01:05:25,590 do I get something more than the A effect alone or the B effect 1136 01:05:25,590 --> 01:05:26,400 alone? 
1137 01:05:26,400 --> 01:05:30,610 And what we actually do in this case is take the AB condition-- 1138 01:05:30,610 --> 01:05:32,070 this one right here-- 1139 01:05:32,070 --> 01:05:35,940 take the low-- so I'm taking the high-high and the low-low, 1140 01:05:35,940 --> 01:05:38,770 forming its average. 1141 01:05:38,770 --> 01:05:44,940 Then I take the other corner, the just A high and the just B 1142 01:05:44,940 --> 01:05:46,050 high-- 1143 01:05:46,050 --> 01:05:49,920 take its average, and then basically take the difference 1144 01:05:49,920 --> 01:05:51,460 between those two. 1145 01:05:51,460 --> 01:05:56,400 So I'm averaging the corners together, 1146 01:05:56,400 --> 01:05:59,130 in contrast to the main effect, where I just averaged 1147 01:05:59,130 --> 01:06:02,700 along the two factor levels. 1148 01:06:02,700 --> 01:06:05,130 And that forms an estimate of the AB offset. 1149 01:06:11,280 --> 01:06:16,237 Basically, this difference of the A high and the A 1150 01:06:16,237 --> 01:06:25,410 low averages we'll refer to as the contrast, the right combination 1151 01:06:25,410 --> 01:06:27,690 of the output values that gives me 1152 01:06:27,690 --> 01:06:30,570 an estimate for the A effect. 1153 01:06:30,570 --> 01:06:33,780 And similarly, we'll talk about the contrast here 1154 01:06:33,780 --> 01:06:35,700 for the B factor. 1155 01:06:35,700 --> 01:06:37,980 Which combinations of output values 1156 01:06:37,980 --> 01:06:42,330 do you use to estimate that effect and the contrast 1157 01:06:42,330 --> 01:06:45,420 for the interaction terms? 1158 01:06:45,420 --> 01:06:51,540 And what's cool, if I go back to that early slide where I said, 1159 01:06:51,540 --> 01:06:53,910 if I'm just fitting a linear output, y 1160 01:06:53,910 --> 01:06:58,770 as a function of one factor, and I've got two data points, 1161 01:06:58,770 --> 01:07:05,430 my line goes through the average of those two conditions. 
1162 01:07:05,430 --> 01:07:09,450 If I form the contrast now in a multiple output case, 1163 01:07:09,450 --> 01:07:13,170 again, the best estimate of the linear fit 1164 01:07:13,170 --> 01:07:15,430 goes through the averages-- 1165 01:07:15,430 --> 01:07:19,500 these are the averages at each of those conditions. 1166 01:07:19,500 --> 01:07:22,710 And what I've simply got is, directly 1167 01:07:22,710 --> 01:07:26,130 from the contrast, a direct estimate 1168 01:07:26,130 --> 01:07:33,330 of the model coefficient as a function in the normalized axis 1169 01:07:33,330 --> 01:07:34,500 space. 1170 01:07:34,500 --> 01:07:37,740 If our x sub i's are in this normalized plus 1 1171 01:07:37,740 --> 01:07:41,070 to minus 1 range, directly calculating 1172 01:07:41,070 --> 01:07:43,080 the contrast of my outputs gives me 1173 01:07:43,080 --> 01:07:47,250 directly an estimate of what the effects 1174 01:07:47,250 --> 01:07:50,250 are-- the main effects as well as the interaction effects. 1175 01:07:53,337 --> 01:07:54,920 So I didn't actually have to go and do 1176 01:07:54,920 --> 01:07:59,160 the optimization problem, do the linear regression. 1177 01:07:59,160 --> 01:08:01,760 I'm using the fact that the linear regression forces 1178 01:08:01,760 --> 01:08:03,770 the fit to go through the averages. 1179 01:08:03,770 --> 01:08:06,740 And I can simply look at the data conditions that I've got, 1180 01:08:06,740 --> 01:08:11,000 the averages of the output at each of my four combinations, 1181 01:08:11,000 --> 01:08:13,790 and then look at it either from the A factor perspective 1182 01:08:13,790 --> 01:08:17,840 and estimate the A effect, or from the B factor perspective 1183 01:08:17,840 --> 01:08:20,779 and estimate the B effect, and then also look and see 1184 01:08:20,779 --> 01:08:23,750 if there's an interaction between the two 1185 01:08:23,750 --> 01:08:28,529 because of cross-coupling between those two factors. 
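The equivalence between the contrast arithmetic and the regression fit can be sketched concretely: because the coded ±1 columns of the model matrix are orthogonal, the least-squares coefficients collapse to simple dot products with the outputs. The cell means below are made-up for illustration, not the lecture's injection-molding numbers, and note the common convention that the regression coefficient in coded units is half the "main effect" (the A-high average minus the A-low average).

```python
def coded_fit(cells):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + b12*x1*x2 over a 2^2
    design in coded units.  cells maps (x1, x2) in {-1,+1}^2 to the mean
    response there.  Orthogonality makes each coefficient a contrast:
    beta_j = (column_j . y) / N, with no matrix inversion needed."""
    pts = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]
    y = [cells[p] for p in pts]
    cols = {
        "b0": [1] * 4,                     # intercept column
        "b1": [p[0] for p in pts],         # x1 column
        "b2": [p[1] for p in pts],         # x2 column
        "b12": [p[0] * p[1] for p in pts], # interaction column
    }
    return {k: sum(c * yi for c, yi in zip(col, y)) / 4
            for k, col in cols.items()}

# Illustrative cell means at the four corner points.
coefs = coded_fit({(-1, -1): 10.0, (+1, -1): 14.0,
                   (-1, +1): 12.0, (+1, +1): 16.0})
```

Here b0 comes out as the grand mean of the four cell means (the payoff of the ±1 normalization mentioned earlier), and 2*b1 equals the main-effect difference of averages formed from the contrast.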
1186 01:08:28,529 --> 01:08:32,510 So what you've got is the regression perspective 1187 01:08:32,510 --> 01:08:36,080 and this contrast perspective are exactly the same. 1188 01:08:38,819 --> 01:08:40,819 Now, I said this was a linear model, 1189 01:08:40,819 --> 01:08:44,840 and what's cool is it actually is from either the A 1190 01:08:44,840 --> 01:08:45,760 perspective-- 1191 01:08:48,859 --> 01:08:50,210 I guess that's the B factor. 1192 01:08:52,779 --> 01:08:56,020 That's the slope, if you will, versus my B 1193 01:08:56,020 --> 01:08:59,680 factor, my x2 variable. 1194 01:08:59,680 --> 01:09:04,450 I could similarly have the slope from my A factor. 1195 01:09:04,450 --> 01:09:08,439 This was my A axis, or x1 axis. 1196 01:09:08,439 --> 01:09:13,060 So I've got a simple linear dependence of output 1197 01:09:13,060 --> 01:09:16,390 on the input in each of those two cases, 1198 01:09:16,390 --> 01:09:23,350 and the interaction basically gives me linear dependencies 1199 01:09:23,350 --> 01:09:28,029 that add additional offsets, but that are locally linear 1200 01:09:28,029 --> 01:09:29,180 in each case. 1201 01:09:29,180 --> 01:09:35,529 So what we get is this so-called 3D ruled surface that 1202 01:09:35,529 --> 01:09:39,850 is linear every time I project it 1203 01:09:39,850 --> 01:09:42,399 on either the A axis or the B axis. 1204 01:09:42,399 --> 01:09:44,649 But it's got this kind of funky ability 1205 01:09:44,649 --> 01:09:49,510 to shift because of the interaction, 1206 01:09:49,510 --> 01:09:52,920 so it's kind of a subtle non-linearity at work. 1207 01:09:52,920 --> 01:09:55,470 It's not a curvature non-linearity 1208 01:09:55,470 --> 01:09:59,790 in the sense of quadratic curvature. 1209 01:09:59,790 --> 01:10:02,580 It is a ruled surface projection down 1210 01:10:02,580 --> 01:10:11,020 to linear dependencies on A and B for each of my combinations. 
1211 01:10:11,020 --> 01:10:15,690 So here's an example of the surface 1212 01:10:15,690 --> 01:10:18,180 that results if I don't have the interaction. 1213 01:10:18,180 --> 01:10:23,820 And a very simple model example is x1 and x2 1214 01:10:23,820 --> 01:10:26,610 with some linear coefficient, and then 1215 01:10:26,610 --> 01:10:30,090 what I've simply got for my ruled surface is a plane. 1216 01:10:30,090 --> 01:10:36,420 It's a slanted plane as a function of x1 and x2 factors. 1217 01:10:36,420 --> 01:10:40,020 Sometimes we also draw these interaction plots, 1218 01:10:40,020 --> 01:10:43,320 and all that this is, is just looking at one of the factors-- 1219 01:10:43,320 --> 01:10:49,260 say, along x2-- and picking some value for the other factor-- 1220 01:10:49,260 --> 01:10:54,930 say, x1 is-- let's see-- x1 is low-- 1221 01:10:54,930 --> 01:10:58,320 that would be this line right there-- 1222 01:10:58,320 --> 01:11:03,060 or x1 is high, which would be this line right there-- 1223 01:11:03,060 --> 01:11:07,170 and then simply plotting y just as the univariate function 1224 01:11:07,170 --> 01:11:11,760 of x2, where I picked or set some other value for my x1. 1225 01:11:11,760 --> 01:11:15,030 And in this case without interaction, 1226 01:11:15,030 --> 01:11:19,380 you'll notice those lines have to be parallel-- 1227 01:11:19,380 --> 01:11:27,050 what I've got as a simple linear additive model 1228 01:11:27,050 --> 01:11:33,200 without interaction is I just get offsets 1229 01:11:33,200 --> 01:11:35,600 because of whatever value of x1 I pick, 1230 01:11:35,600 --> 01:11:39,510 or offsets because of whatever value of x2 I pick. 1231 01:11:39,510 --> 01:11:45,620 So this is simply saying the x1 effect 1232 01:11:45,620 --> 01:11:50,150 is shown right here as an additional shift in the output 1233 01:11:50,150 --> 01:11:53,960 dependence on x2. 
1234 01:11:53,960 --> 01:11:57,860 So this is just another slice through the 3D picture, 1235 01:11:57,860 --> 01:11:59,510 but what's important here is you would 1236 01:11:59,510 --> 01:12:04,790 see this kind of interaction pattern on your output 1237 01:12:04,790 --> 01:12:07,070 as a function of your input, if there's 1238 01:12:07,070 --> 01:12:11,285 no cross term between x1 and x2. 1239 01:12:14,200 --> 01:12:18,580 And in contrast, here's the response surface of output 1240 01:12:18,580 --> 01:12:20,930 as a function of my two inputs. 1241 01:12:20,930 --> 01:12:23,350 This is, again, a ruled surface. 1242 01:12:23,350 --> 01:12:25,900 Each one of these is just a line, 1243 01:12:25,900 --> 01:12:30,490 but now it's got this funky non-linearity because 1244 01:12:30,490 --> 01:12:32,990 of this interaction term. 1245 01:12:32,990 --> 01:12:37,090 And in this case, if I take a slice at different low and high 1246 01:12:37,090 --> 01:12:39,970 of x1, you can see that there's something at work 1247 01:12:39,970 --> 01:12:44,470 more than just an additive effect from x1, 1248 01:12:44,470 --> 01:12:46,810 depending on my value of x2. 1249 01:12:46,810 --> 01:12:51,040 There is synergy between whatever x1 value I pick 1250 01:12:51,040 --> 01:12:53,500 and x2 value I picked that together-- 1251 01:12:53,500 --> 01:12:57,040 I need both to explain the output 1252 01:12:57,040 --> 01:12:59,650 and how they work together. 1253 01:12:59,650 --> 01:13:02,140 You can also have negative interactions, 1254 01:13:02,140 --> 01:13:08,710 where the cross term causes a negative synergistic effect 1255 01:13:08,710 --> 01:13:11,930 between the two. 1256 01:13:11,930 --> 01:13:15,680 OK, so what we'll pick up on next time 1257 01:13:15,680 --> 01:13:21,380 is extending this a little bit to a more general way 1258 01:13:21,380 --> 01:13:23,870 of talking about these contrasts. 
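The parallel-lines diagnostic behind these interaction plots can be sketched in a couple of lines. This is an illustrative helper with made-up coefficients, not the lecture's data: for the coded model y = b0 + b1*x1 + b2*x2 + b12*x1*x2, the slope of y versus x2 is b2 + b12*x1, so the two traces at x1 low and x1 high are parallel exactly when the cross term b12 is zero.

```python
def x2_slopes(b2, b12):
    """Slopes of y versus x2 at x1 = -1 and at x1 = +1, for the coded
    ruled-surface model y = b0 + b1*x1 + b2*x2 + b12*x1*x2.
    Since dy/dx2 = b2 + b12*x1, the traces differ only through b12."""
    return b2 - b12, b2 + b12

# No cross term (b12 = 0): the two interaction-plot traces are parallel.
parallel = x2_slopes(b2=1.0, b12=0.0)
# Nonzero cross term: the x2 slope itself depends on the x1 setting,
# which is the synergy (or, for negative b12, anti-synergy) on the slide.
crossed = x2_slopes(b2=1.0, b12=0.5)
```

This is the "subtle non-linearity" of the ruled surface: every slice is still a straight line, but the slope of the slice shifts with the other factor's setting.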
1259 01:13:23,870 --> 01:13:25,940 I think hopefully now you've got the picture 1260 01:13:25,940 --> 01:13:30,560 from a univariate kind of modeling and regression picture 1261 01:13:30,560 --> 01:13:33,890 up to a two-factor kind of picture 1262 01:13:33,890 --> 01:13:35,460 with these interactions. 1263 01:13:35,460 --> 01:13:38,330 Next time, what we'll do is extend that-- 1264 01:13:38,330 --> 01:13:43,280 oops-- extend that or generalize that to when I might have three 1265 01:13:43,280 --> 01:13:44,210 factors-- 1266 01:13:44,210 --> 01:13:47,680 an A, B, and a C factor, for example. 1267 01:13:47,680 --> 01:13:51,500 And how do you visualize your design space in those cases 1268 01:13:51,500 --> 01:13:55,190 and how do you form the contrast to very quickly estimate model 1269 01:13:55,190 --> 01:13:58,110 coefficients in those cases? 1270 01:13:58,110 --> 01:13:59,660 And then we'll also get to-- 1271 01:14:02,940 --> 01:14:04,980 oops-- why'd that happen? 1272 01:14:04,980 --> 01:14:10,470 We'll also get next time to looking a little bit more at, 1273 01:14:10,470 --> 01:14:13,710 how do I check the adequacy of these model terms, 1274 01:14:13,710 --> 01:14:16,020 are they significant, as well as, 1275 01:14:16,020 --> 01:14:21,780 what are good estimates for confidence intervals 1276 01:14:21,780 --> 01:14:24,540 on these things-- and then come to these other subtle points, 1277 01:14:24,540 --> 01:14:28,410 like what happens if I don't want 1278 01:14:28,410 --> 01:14:32,580 to do 2 to the k, when k is 5? 1279 01:14:32,580 --> 01:14:35,160 I don't necessarily need every combination. 1280 01:14:35,160 --> 01:14:39,720 And we'll talk about fractional factorial designs. 1281 01:14:39,720 --> 01:14:46,700 So one thing you should do in preparation for next time 1282 01:14:46,700 --> 01:14:49,640 is start reading the experimental design chapter. 1283 01:14:49,640 --> 01:14:52,460 And I can't remember which chapter it is in Montgomery. 
1284 01:14:52,460 --> 01:14:55,550 I'll post that on the-- 1285 01:14:55,550 --> 01:14:57,560 as an announcement on the website. 1286 01:14:57,560 --> 01:15:00,530 But there's a lot of lingo to get used to, 1287 01:15:00,530 --> 01:15:03,870 but it all comes back to nice intuitive relationships, 1288 01:15:03,870 --> 01:15:04,370 I think. 1289 01:15:04,370 --> 01:15:06,620 But you do need to start at least scanning 1290 01:15:06,620 --> 01:15:11,750 that chapter in Montgomery for the experimental design stuff. 1291 01:15:11,750 --> 01:15:15,290 I think Montgomery has a more thorough description than we 1292 01:15:15,290 --> 01:15:17,480 can find [INAUDIBLE]. 1293 01:15:17,480 --> 01:15:19,280 Actually, [INAUDIBLE] might be a good place 1294 01:15:19,280 --> 01:15:21,260 to get the quick read on it first, 1295 01:15:21,260 --> 01:15:25,230 and then Montgomery in more detail. 1296 01:15:25,230 --> 01:15:29,660 So with that, I hope you guys have a great spring break. 1297 01:15:29,660 --> 01:15:34,580 And enjoy the MIT spring break, for our MIT students 1298 01:15:34,580 --> 01:15:35,630 in Singapore. 1299 01:15:35,630 --> 01:15:38,720 You can rub it in to all your classmates 1300 01:15:38,720 --> 01:15:43,550 who are still meeting for other classes in Singapore. 1301 01:15:43,550 --> 01:15:48,380 You get a break to work on your projects out in Singapore. 1302 01:15:48,380 --> 01:15:51,490 So we'll see you in two weeks.