SPEAKER: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

DUANE BONING: So what I want to do today is pick up a little bit from last time and round out our discussion of process optimization with a little bit of a different perspective. Last time we talked about response surface methods and using those in optimization. What I want to do today is especially focus on -- geez, this arrow is not letting me draw -- I want to focus on process robustness, and that's a little bit of a different perspective. So far, most of the optimization we talked about was looking at getting a model of the mean response and then using that to drive the process to an optimum target, if you will. So we want to come back a little bit to ideas like Cpk, the quality loss function, and so on, that start to fold additional requirements or additional goals into the optimization. Coming out of this is a little bit more of an explicit consideration of variation modeling: not simply the mean response, but also thinking about this assumption we've made that the variance is the same everywhere across the process, and what to do when that is not the case -- because in fact, it often is not the case.

So looking at the family of things that we might seek to optimize, there are really multiple things, right? Some of these are easy to assess quantitatively, build models of, and then optimize against, and some are a little bit more qualitative or are additional factors. Certainly quality is the main effect that we've been looking at.
That is to say, we're looking at the response of one or more outputs as a function of one or more control factors or inputs, and then seeking to optimize the overall yield, the overall output of the process. You want to be on target, but you also want to be as immune as possible to variance. And we talked earlier in the term -- not so much last time on Tuesday, but earlier in the term -- about Cpk and quality loss functions.

But also, of course, another key driver in all manufacturing is cost, and there are costs associated with yield, of course. You're only getting revenue for things that are good. But there are also other cost factors; for example, picking different points in the operating space may imply different things about the cost of running the process. And in fact, one of the important components of cost is associated with rate. Very often your process factors, or what operating point you pick for your process, have a direct parts-per-hour or other kind of throughput or rate implication, right? In fact, you'll very often have this trade-off where you might expect that if you run things faster, your quality might be a little bit reduced. Whereas if you run things at a reduced rate, you may be able to boost up yield or quality, but there's a trade-off then in terms of how much production you can get out or how rapidly you can get that out.

In many of these cases, if you can actually fold all of those down into a direct dollar trade-off, then you can start to formulate an overall cost function or objective function -- sort of an overall J function in our terminology from last time -- that might actually trade off some of those things. But often it's actually quite difficult to get really detailed information and build models of all of these factors.

Now, an additional factor is flexibility. That's a little bit harder to get quantitative about. What do you think we mean by flexibility? I mean, we could find the process optimum, lock in our equipment, lock down the control knobs, and say that's where we're going to operate for the next four years.
That would probably not be very flexible. But give me some thoughts on what kind of flexibility you might want in a process.

AUDIENCE: Changeover time, like the process is easy to modify and change.

DUANE BONING: Excellent. Yeah, so the idea there was changeover time. Some of those might actually be goals in the equipment design or the line design, not necessarily in picking an operating point, but definitely flexibility is the ability to rapidly do changeovers. By the way, that often folds into some of these other factors, right? It can affect cost, quality, rate; changeover time -- how rapidly you can do that -- can have an impact on throughput. Changeover sometimes requires additional steps to warm up or prove in the new setup. So furthermore, if your equipment requires a little bit of adjustment after a changeover, it can have a quality impact. That's a good one. I like that. Other kinds of flexibility?

AUDIENCE: You could [INAUDIBLE] and try to make it more insensitive to the exact operating point, to try to [INAUDIBLE] development, rather, something that gives more room for the process to [INAUDIBLE] on either side [INAUDIBLE].

DUANE BONING: Yeah, so what Neil is saying is you might want a process that is perhaps less sensitive at an optimum. So this would be less sensitive so that as your inputs vary, your outputs would vary less. I think that's an important characteristic. That's a notion of robustness. I'm not sure I would associate that with flexibility, though; I think it's a slightly different idea. I mean, certainly -- I'm trying to think how to think of that -- it may be that what you would like is a process where this same characteristic might be true of multiple operating points, so that one could rapidly tune in the process, rapidly change over the process. Yeah?

AUDIENCE: Different rates. Output rates.
DUANE BONING: Different output rates. So what would the benefit of that be?

AUDIENCE: [INAUDIBLE] demand. We don't want to overproduce.

DUANE BONING: Interesting. Sure. So rate flexibility might be another characteristic. So I think we could identify multiple of these flexibility ideas. I think these are really important characteristics, especially in overall product design as well as process design. We haven't really talked that much about it; in this subject we're really focused mostly on the manufacturing process control and the quality issues. But I did want to have that in there just as a reminder that there are a lot of these other kinds of factors that are important in a manufacturing setting.

But let's focus a little bit and look at minimizing quality loss, and think a little bit more about other cost factors. We've been saying that one of the important goals is we want the -- I'm not sure I would use x bar here; let's call that y bar. If y is my output, I want my y bar output to at least hit, or come as close as possible to, some target. So again, if we assume that there is, in fact, some spread in the manufacturing process, I can never completely squeeze out all the variance. We have our typical Cpk-like picture where we've got some target, we've got spec limits, upper and lower, and again, we want to maximize yield. So one important characteristic of that is getting onto target.

It turns out if you have a fairly rich process -- rich in the sense that you've got multiple inputs that you can use to affect the process -- maybe you also have multiple outputs. But for the moment, imagine that I've got one output of concern, but I've got more than one x control factor or input factor. Conceptually, do you think there's going to be more than one solution, or just one solution, for the combination of those input factors that hits the particular output?
If I only had one input and one output, it seems like most likely I'm only going to have one setting for that input that is going to correspond to a maximum on that output. But if I've got multiple inputs and they each have some influence, you've got an interesting situation here, right? You end up with multiple solutions, if you will, to the maximum output problem. That is to say, you may have contours in the multidimensional space -- an x1, x2, y space -- that all achieve the same output. It turns out that's wonderful. It means, well, yes, you can match to target. But now that second input, or the additional inputs, give you a little bit of freedom to try to achieve other goals as well.

Now, that's assuming that match to target is the paramount goal -- and getting close to that is pretty close to a paramount goal because of the direct effect on yield and quality loss -- but it wouldn't necessarily have to be so. Here I'm sort of assuming you've got to get to target first, and then after that it's an opportunity to choose your operating point to do other things: to do things like minimize the cycle time to get as fast a rate as possible, or to start to fold in other qualitative or quantitative cost drivers.

A couple of simple examples: at different operating points you might actually use more input materials or, say, manufacturing gases in a plasma reactor. At different operating points, you might be more or less efficient in the use of energy or other material resources. Another factor might be that different operating points have different side effects on the equipment, like wear of a machining tool. OK? So these are places where you can start to add those in.

And that actually means that you really would like to have multivariable models, in the sense of some output y that, in fact, does let you trade off two or more inputs. And this is just a visualization of what I was describing earlier.
Here's a very simple linear model, two inputs with an interaction. And now let's say that the target value for y is 5. Well, that plane intersects with my response surface along some line, and you can sort of see that dashed line there. That indicates any combination of the x1 and x2 factors that can achieve that output. And so now, in fact, to find and decide what x1 and x2 I'm going to use, I actually need to think about some other criteria. Right? I guess if it doesn't matter, you can pick any of them and you might randomly pick. I guess randomly picking is about the only thing you can do where you're not, I think, intuitively applying some additional filter or some additional criteria to the x1 and x2. So again, some of these might be picking x1 and x2 based on things like cost or rate.

What are some other things that you might use to decide on which point along this space you would actually like to pick for an operating point? Yeah?

AUDIENCE: Minimum variability.

DUANE BONING: Minimum variability. Good. And we will talk about that for sure. Excellent.

AUDIENCE: If you were having [INAUDIBLE] x1 but not x2, then you will use x1 to control most of the [INAUDIBLE].

DUANE BONING: Interesting. Yeah, so that's getting at the notion of flexibility. So the idea was, if I had some other product that also was using x1 and x2, there might be value in keeping one of those variables constant, or using it to optimize all of them and then focusing in and letting one variable be more associated with the flexibility, if you will -- the thing that you use to tune in across different products or different processes. I like that. Yep?

AUDIENCE: Physical constraints. Your machine might not be able to go that fast.

DUANE BONING: Physical constraints, absolutely.
So it depends on whether these are rate constraints or other things in terms of whether the machine can go that fast, but certainly you may have explored a space in these coded variables x1 and x2 that covers this broad range, and perhaps your equipment really doesn't want to run out here at one of the extreme low points. So that might actually drive you away from these extreme points, or maybe it drives you towards one of those extreme points.

In general, I think -- let's see, I believe in one of the case studies, either case study two or three, Dave Hardt will come in and talk about run-by-run control, a little bit of use of some of these kinds of models for feedback control. And in general, an observation that comes out of that is you might want to avoid operating at one of the boundaries of your process, right? If you were, in fact, picking one of these points that's at the edge of the space that you used to build your model, well, you're not really sure what happens if you adjust x a little bit beyond that space. You might hope that the process continues to be well behaved and extrapolate out into that region, but that's a little bit scarier to do. In general, if you pick a point that's closer to somewhere in the interior, maybe even the center of your space, that gives you the maximum freedom in an active control setting to make small modifications to your x1, x2 factors to drive the process back onto target. So I would be inclined to also try to keep my factors in a place where I know I can keep moving them to control the process.

AUDIENCE: [INAUDIBLE], doesn't it make sense that, if possible, you actually need to test further out, because you could have more optimal conditions if you could, say, push the boundary?

DUANE BONING: Yeah. So the question is kind of a general one about response surface models and optimal points:
if you are on some boundary, should you push further out? Well, first off, with linear models you very often end up with the optimum sitting at a boundary constraint, especially if I fold in some additional factor. And so you could always ask the question, should I extend my experimental space and explore a little bit further? And the answer is, if your variables naturally can go beyond your initial guess of the range of exploration, absolutely. And as we saw last time, for example, in incremental or iterative optimization, you may have actually already -- I guess I'm sort of assuming you've already explored the space to cover the optimum region as much as you can. That's a good observation as well. Good. Any others? Any other ideas?

OK. Good. I did want to hit on some of these. Now, we'll come back to this minimum variance idea, because that is a great one, right? So far this picture is still looking at just the output and matching that output to the target -- that shaded plane is hitting the target. And a bigger goal [INAUDIBLE], even just focused on quality, is overall yield and minimizing either the effective variation or the variation itself.

So if we go back to our quality and variation equation, my output can vary due to a couple of different factors. We said the alphas are sort of noise factors, and these are the actual process settings, if you will, or control factors. And so the process will have some sensitivity to each of these kinds of factors. If I tweak my x factor or u factor a little bit, my y is going to change some amount, and normally I can control, or I have power over, those process settings, those options -- but sometimes I don't. Sometimes I can dial it in, but I don't always get exactly what I dialed in.
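In symbols, the variance-transmission idea being described is, to first order, something like the following sketch (the exact equation on the slide is not reproduced in the transcript, so this is an illustrative form, with the x's denoting control factors and the alphas denoting noise factors):

$$
\sigma_y^2 \;\approx\; \sum_i \left(\frac{\partial y}{\partial x_i}\right)^{2} \sigma_{x_i}^{2} \;+\; \sum_j \left(\frac{\partial y}{\partial \alpha_j}\right)^{2} \sigma_{\alpha_j}^{2},
$$

with the partial derivatives evaluated at the chosen operating point, which is why the transmitted variance can be different at different points in the process space.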
So there might be differences in sensitivity of the output, where little deviations in x -- especially with a non-linear or polynomial kind of model -- might actually imply different amounts of delta y at different operating points. So that's an example where our assumption about -- let me just get some of this down here -- equal variance everywhere in the process space might actually not apply.

But also, we have these noise factors. We have these variation sources, uncontrollable factors, things that we can't completely eliminate, and we might want to find an operating point where we minimize the sensitivity to those noise factors. OK? So one way of thinking about this is to look back at just a sensitivity kind of analysis, where I can either use my response surface model, which usually has this in it, and minimize some sensitivity metric on that, or start to try to deal a little bit with these unmodeled noise factors.

One approach that we're going to talk about is, what if I want to include those in the model? I actually want to understand how variance may change as a function of operating point. This is going to mean overcoming, a little bit, the assumptions that we made. Now, how can I get minimum variation? One approach is to think about it from the sensitivity point of view. The other is to think about it from a Cpk perspective. We talked about needing to center the process as one part of that, but the other part would be the variance part. And again, here, if I'm doing anything other than mean centering, I have to know how the variance actually depends on the operating point. Our assumptions all along, in most of what we've done so far -- ANOVA and other kinds of tools that we've looked at -- have sort of assumed that the variance was constant. We want to back off and relax that as well.

And another perspective that we'll look at for a little bit here is to look back at the quality loss function, which can fold these together.
Remember the quality loss function that says, as I deviate my output from my target, I incur more and more loss? It actually has a penalty -- not just "it's good as long as it's above the lower spec limit and below the upper spec limit," but as I deviate from the overall target I have continuing, incrementally worse behavior in the product. So that's a bit of a generalization of the focus, away from just the target to also the variance.

So what this is saying is that for many of these approaches, whether you look at it from a Cpk or a quality loss function perspective, you need to know more about the variance, or more about the influence of the noise factors, in order to really get to process robustness. With what we've done so far already, you could do a little bit of robustness just with the response surface models, just in terms of reaction to modeled inputs, but that's a pretty weak notion of robustness. That's only looking at some of the variances.

Well, let's explore the Cpk one. This is actually all just reminder. We've said Cpk is a nice measure that looks at two effects, right? Cpk has an effect from the deviation of the mean of your process -- where the true mean is and how far away it is from your spec limits -- so that's sort of a centering or targeting part, and then it also has a notion of folding in the spread of your process, right? So one could, in fact, formulate something like Cp or Cpk as your cost function, your penalty function, your J, and try to build a model -- either use your model or build a separate model -- and minimize some cost. Well, in this case, maximize the Cpk, because a bigger Cpk, we said, corresponds to higher yield. In fact, we had formulas for that: you can actually relate a Cpk to a fraction of nonconforming parts, given different values.
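As a small numerical illustration of that last relationship -- a sketch only, assuming a normally distributed output, with made-up values for the mean, standard deviation, and spec limits rather than anything from the lecture:

from math import erf, sqrt

def norm_cdf(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def cpk(mu, sigma, lsl, usl):
    # Cpk = min(USL - mu, mu - LSL) / (3 sigma)
    return min(usl - mu, mu - lsl) / (3.0 * sigma)

def fraction_nonconforming(mu, sigma, lsl, usl):
    # P(y < LSL) + P(y > USL) for a normal output
    return norm_cdf((lsl - mu) / sigma) + (1.0 - norm_cdf((usl - mu) / sigma))

# hypothetical process: target 10.0, spec limits 9.0 and 11.0
for mu, sigma in [(10.0, 0.33), (10.0, 0.25), (10.3, 0.25)]:
    print(mu, sigma, cpk(mu, sigma, 9.0, 11.0), fraction_nonconforming(mu, sigma, 9.0, 11.0))

A centered process with Cpk of about 1 gives roughly 0.27% nonconforming; shrinking sigma or recentering the mean moves both numbers together, which is exactly the kind of lookup being described.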
Another approach would actually be to extend that almost directly and say, what I'd like to do is actually use J directly, or a model of Cpk: build a response surface model, if you will, of Cpk, and that would be a way to fold in both mean information and process variance information.

What are some gotchas in trying to do this? What are some difficulties? We said one of our purposes here might be, in fact, to pick your process operating point to maximize Cpk. Why not just directly use that? Use this function right here -- there's my function, I want to maximize Cpk. Let me broaden the question: what are some good things or some bad things about this? Yeah?

AUDIENCE: [INAUDIBLE].

DUANE BONING: OK. I'm sorry?

AUDIENCE: [INAUDIBLE] is going to depend on y.

DUANE BONING: OK, so the comment there -- by the way, you need to try to speak loudly. I'm trying to repeat the questions or the answers as much as I can, but I think it's a little hard for people to hear. So I'm sorry, now I lost the question -- or the answer.

AUDIENCE: One of the things might be that the sigma might depend on mu, or [INAUDIBLE] may depend on [INAUDIBLE].

DUANE BONING: So one of the complexities, you said, is this might actually be a function of the process conditions. Yes. In fact, I would almost say this approach here -- where I build an overall Cpk and, if I were able to, actually evaluate Cpk at each of my process operating points -- might actually be a way to incorporate or accommodate a change in variance or a change in standard deviation. Whereas in some sense, if I build a response surface model up front with the assumption of a fixed variance, it might then be hard to actually use that fixed model with just J as a cost function. So these are two slightly different ideas, and I'm glad you mentioned that. This idea, the first idea up here, is basically saying: I've already built my response model, maybe for y -- I might, might, also have a response model for s -- and then I just use the model, or use Cpk, as the cost that I'm optimizing.
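In symbols, that first idea amounts to something like the following sketch, where y-hat(x) is the fitted mean response surface and s-hat(x) is a fitted standard deviation model (which, under the fixed-variance assumption, would just be a single pooled estimate s):

$$
\widehat{C}_{pk}(x) \;=\; \frac{\min\!\bigl(\mathrm{USL}-\hat{y}(x),\;\hat{y}(x)-\mathrm{LSL}\bigr)}{3\,\hat{s}(x)},
\qquad
x^{*} \;=\; \arg\max_{x}\; \widehat{C}_{pk}(x).
$$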
The second idea, in contrast, down here, is to define a new response variable and actually build a model directly of Cpk as a function of my input conditions, and then I optimize on that. They're slightly different ideas, because the first case is, in some sense, perhaps assuming we mostly have a fixed s; this one, as I said, can perhaps more easily accommodate a standard deviation that varies across the process.

There's one other potential challenge with the model down below -- well, with either approach, really, with these two functions. What's something that's a little nasty about these functions? They have things like a min function in them. A minimum. And the minimum of two quantities can be very non-linear, so it may not be that smooth of a function. There may be a point at which you actually start to break more to the left side or to the right side, and you don't have a continuously varying function. So it can be a little bit of a challenge from a modeling point of view as well as in optimization. It's kind of a nasty function. OK?

And in fact, that nonlinear, nasty behavior with its possible discontinuity is -- let's see, I want to -- oop, I guess I don't have it in here -- it's kind of one of the main drivers for the whole notion of the quality loss function. That E of L, that quality loss function with a quadratic dependence on deviation from the target, is a nice way of folding together, in a more continuously varying function, both deviation from target and effects of variance. So we'll come back to that, but in fact, that's one of the main reasons why the quality loss function is used and has an advantage: it has smoother behavior, a behavior that drives towards an optimum, that pulls in or folds in both of those effects and doesn't have these discontinuities -- not quite as much of the nastiness of these min functions. OK?
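For reference, the expected quadratic loss he is pointing to decomposes in the standard way as

$$
E[L] \;=\; k\,E\bigl[(y-T)^{2}\bigr] \;=\; k\bigl[(\mu_{y}-T)^{2} + \sigma_{y}^{2}\bigr],
$$

so the off-target term and the variance term both enter smoothly and can be traded off in one objective, with no min() kink in the way of the optimization.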
OK, so I guess this discussion, and everything we've been leading up to, is really starting to say: OK, look, in order to be able to fold in both deviations from target and also variation, we need better ways to model the variance -- in particular, variance that might not be constant across your entire process space but may vary as a function of, or depend on, the operating point. OK?

Now remember, in some cases we've been careful to remind ourselves that there were assumptions of constant variance in our approaches. But sometimes we haven't been explicitly mentioning that, and I want to highlight that some of the methods we've been using and developing actually implicitly continued that assumption that sigma squared, the variance, was constant throughout the space. So in ANOVA -- the simple ANOVA that we did -- in order to decide if an effect is real, we were always comparing a fixed offset to the natural variance in the process, and we were always pooling the estimate of variance across all operating points equally. OK? So we were enforcing an assumption that there was only one natural underlying process variance at work, and then we were trying to say, are there variances associated with fixed offsets that are larger than the replicate noise? So that was in there, and we were careful in talking about that.

But another place was actually in regression model fitting. When we formed a least squares regression problem and then solved it, that actually weights all of our data, and all of our deviations from the model prediction, equally -- and essentially, that is also an equal-variance assumption. Think of it this way. If I had, in fact -- wait, let me draw it. If I in fact had different operating points or locations where the variance was very different, and I'm trying to fit a line through this -- but over here I have a huge spread.
It may in fact be the case that I have less confidence, or a wider spread, here in picking a point that corresponds to where that line is going to go than in other regions of this space. Now, you can still appeal to best estimates, but in terms of confidence in the use of each of your data points, there are alternative regression approaches -- weighted regression approaches -- that might say: in regions where I have less variance, where I have more confidence, where I know those data points are more solid, I might actually give them more influence in the weighting in the regression than other points that I'm not so sure about. So weighted regression approaches are available, and probably the most common weighted regression is to apply a weight that's inversely proportional to the variance associated with each of the operating regions in your space; there's a small sketch of that idea just below. OK?

So I just wanted to highlight that there are approaches that go beyond this assumption. We've been assuming, in both our regression analysis and our ANOVA analysis, that we have one variance. I think I alluded to a book earlier this semester called Analysis of Messy Data. It's a wonderful book -- or actually, there's a series of books, either two or three of them. But one of the examples of messy data is when the variance is not constant, and one of the volumes in that series is focused, I believe, on analysis of variance when you don't have constant variance throughout your space. So there are places you can look when things get messy.

OK. So what do we do? We've been assuming constant variance, but in reality -- I think we've made the point -- the process variation may in fact vary with, or depend on, our operating point. I've already talked about the impact in our variance equation: if I have imperfect control of a control factor, I may have different sensitivity -- that dy/du -- which, especially with a non-linear model, can vary.
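Here is a minimal sketch of that inverse-variance weighting in Python with NumPy. The factor levels, replicate means, and replicate variances are made-up numbers for illustration, not data from the lecture:

import numpy as np

# hypothetical replicate summaries at four settings of one coded factor
x     = np.array([-1.0, 0.0, 1.0, 2.0])       # coded factor levels
y_bar = np.array([ 2.1, 3.0, 4.2, 5.1])       # mean response at each level
s2    = np.array([0.05, 0.04, 0.40, 0.90])    # replicate variance at each level

X = np.column_stack([np.ones_like(x), x])      # model: y = b0 + b1*x

# ordinary least squares: every point pulls on the fit equally
b_ols, *_ = np.linalg.lstsq(X, y_bar, rcond=None)

# weighted least squares: weight each point by 1/variance, so the
# low-variance (high-confidence) points have more influence on the fit
w = np.sqrt(1.0 / s2)
b_wls, *_ = np.linalg.lstsq(X * w[:, None], y_bar * w, rcond=None)

print("OLS coefficients:", b_ols)
print("WLS coefficients:", b_wls)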
But it's also just simply the case that the effect of these noise factors may be such that I have different variances at different operating points. So what are some approaches? All I've said so far is that variance may depend on your process settings. How are you going to study that and understand it? Well, let's do design of experiments, but also include an explicit model for variance or standard deviation. So all we're doing here is simply saying: treat variance as another process output. And so it is possible to include another response variable -- either s squared or s or some variance model -- and do our typical DOE here on our input factors, but now get enough data that I can actually build a model of variance at each of my operating points.

Can I possibly do that without replication? No. I have to have replicates at each of my design points, each of my DOE points. I have to gather multiple replicates in order to have an estimate of the noise at that point. So fundamentally, you have to do more than a non-replicated, full factorial DOE like we were talking about, which just explores the operating space. You've got to get more data if you want to model variance; you always have to have much more data. So you have to pick some number of replicates at each of your design points.

And now this goes back to the tools we talked about earlier in the term for deciding what sample size you need in order to have different confidence intervals on your estimate of variance or standard deviation. Right? So depending on how close, how accurate, you want to be on your estimate of a variance, that will influence how many replicates you need. Yeah?

AUDIENCE: To get this at each level, you do [INAUDIBLE].

DUANE BONING: Right. Right. Absolutely. So the comment or question was, to do this you actually need replicates at all of the points.
Yeah. So in some sense, in the DOE that we were talking about -- this would be, say, a full factorial -- if I picked my center point, we talked about doing something where I had replicates only at the center points. And that's wonderful if you're continuing to assume that there's only one variance and it applies everywhere. In fact, that's kind of the assumption we made, even without necessarily having good evidence for it. But clearly here, if I want a model of both the average output and the variance at each point, I need replicates at each point. And you would normally want to do that with the same number of replicates at each point. It's possible you could do it without a balanced experiment, where you had different numbers of replicates, but then it gets really messy, because then your confidence interval for your estimate of variance at each of your points is different. You'd like to have at least the same confidence interval, or the same number of replicates.

Now, there is some danger in some of these kinds of models. I wouldn't quite say danger; I would say there's some approximation going on. So for example, one thing that makes me a little bit uncomfortable is actually building a model directly of standard deviation, because directly estimating standard deviation is difficult -- it's got bias in it. Whereas building a model, for example, of variance, you can use the data and get an unbiased estimate of variance. You're susceptible to slight biases if you go in and directly model standard deviation. But for the most part, people pretty much ignore those, and are basically, I think, assuming that those kinds of biases get washed out in the noise because I don't have so many replicates. They tend not to be all that careful with it, but I would just put a little side note here: we should be a little bit careful about the influence of bias in these factors.
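For reference, with normally distributed replicates the usual sample variance is unbiased, while the sample standard deviation is not:

$$
E[s^{2}] = \sigma^{2}, \qquad E[s] = c_{4}\,\sigma, \qquad
c_{4} = \sqrt{\frac{2}{n-1}}\,\frac{\Gamma(n/2)}{\Gamma\!\bigl((n-1)/2\bigr)},
$$

so with, say, n = 4 replicates, c4 is about 0.921 and s underestimates sigma by roughly 8% on average -- small, but a systematic effect if you fit a response surface directly to s.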
And in fact, I think in one of the case studies we'll talk about approaches for modeling variance and some of the biases that can creep in. But the main lever here, for being able to get some power over the problem, is simply doing a DOE that includes explicit modeling of the variance, so that one can build up some kind of a variance or standard deviation response surface. And then you can use that, in combination with your output, to form a cost function that seeks to minimize, in some way, both of those factors.

So for example, now we can go back to that surface that we had -- that x1, x2 surface where I found the line, or some function, that minimized the error in my output from the target. But then I can explore over that, within my x1 and x2 constrained space, and look for the place where I've got the minimum variance. Right? So if I go back here to slide five -- I just heard a bell. Is everything still working there? You still have the slides in Singapore?

AUDIENCE: No, we don't. We don't have the slides; it was just disconnected.

DUANE BONING: Yeah, I just--

AUDIENCE: You might want to share the desktop again.

DUANE BONING: OK. Yeah, it looked like it dropped the call. So while that's calling, the basic point was: if I have an additional response surface for variance or standard deviation, the simplest use of that is simply to still do my optimization on the target and then just use this to disambiguate among my x1, x2 choices.

AUDIENCE: When you have [INAUDIBLE] experiment that we're doing, [INAUDIBLE]?

DUANE BONING: Yeah. So the question was, let's say you only did four replicates at each point. Well, you ask the question, how good is your estimate of variance at that point? And you know how to calculate that, right?
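For reference, the usual confidence interval for a variance estimated from n replicates of a normally distributed response is

$$
\frac{(n-1)\,s^{2}}{\chi^{2}_{0.975,\;n-1}} \;\le\; \sigma^{2} \;\le\; \frac{(n-1)\,s^{2}}{\chi^{2}_{0.025,\;n-1}},
$$

where the subscripts denote the 97.5th and 2.5th percentiles of the chi-square distribution with n - 1 degrees of freedom. With n = 4, the 95% interval runs from roughly 0.32 s-squared up to roughly 13.9 s-squared, which is the limited resolving power described next.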
787 00:45:08,810 --> 00:45:12,140 And your point is with only four replicates, 788 00:45:12,140 --> 00:45:15,132 it's kind of iffy about how good you're-- 789 00:45:15,132 --> 00:45:19,160 you're only going to have resolving power 790 00:45:19,160 --> 00:45:23,390 to be able to see really big differences in variance. 791 00:45:23,390 --> 00:45:25,610 If it's small differences in variance 792 00:45:25,610 --> 00:45:27,410 with that number of replicates, you're 793 00:45:27,410 --> 00:45:30,020 not going to really be able to disambiguate. 794 00:45:30,020 --> 00:45:34,160 So in essence with that small number-- oh, thank you-- 795 00:45:34,160 --> 00:45:37,100 that small number of replicates, you're really just looking 796 00:45:37,100 --> 00:45:39,410 for gross variance differences. 797 00:45:39,410 --> 00:45:43,740 Which if they exist, you definitely want to use. 798 00:45:43,740 --> 00:45:47,570 Now, if you're looking for really trying 799 00:45:47,570 --> 00:45:53,660 to get a good process that has the ability to take advantage 800 00:45:53,660 --> 00:45:56,540 of small variance differences-- because 801 00:45:56,540 --> 00:45:59,330 over many, many thousands or millions of parts, 802 00:45:59,330 --> 00:46:06,530 even small variances can really make a difference. 803 00:46:06,530 --> 00:46:08,840 Then you'd want to run more replicates 804 00:46:08,840 --> 00:46:13,920 in your design of experiments. 805 00:46:13,920 --> 00:46:16,770 OK, are we back at Singapore now? 806 00:46:16,770 --> 00:46:19,320 You have the slides there? 807 00:46:19,320 --> 00:46:20,280 AUDIENCE: No we don't. 808 00:46:20,280 --> 00:46:21,520 AUDIENCE: They had it and then they dropped it. 809 00:46:21,520 --> 00:46:23,187 AUDIENCE: Yeah, we can't see the slides. 810 00:46:30,572 --> 00:46:31,280 DUANE BONING: OK. 811 00:46:31,280 --> 00:46:34,175 So this one, it thinks we're in a call. 812 00:46:39,360 --> 00:46:40,410 Should I unshare? 813 00:46:40,410 --> 00:46:41,800 What should I-- 814 00:46:41,800 --> 00:46:43,425 AUDIENCE: No, I think it's coming back. 815 00:46:43,425 --> 00:46:44,883 DUANE BONING: Oh, it's coming back? 816 00:46:44,883 --> 00:46:46,040 AUDIENCE: [INAUDIBLE]. 817 00:46:46,040 --> 00:46:46,540 Yeah, yeah. 818 00:46:46,540 --> 00:46:47,010 It's back. 819 00:46:47,010 --> 00:46:47,850 DUANE BONING: OK, great. 820 00:46:47,850 --> 00:46:48,240 Great. 821 00:46:48,240 --> 00:46:48,740 Great. 822 00:46:51,000 --> 00:46:51,500 Good. 823 00:46:57,940 --> 00:47:02,530 Great, so here again, the first and easiest idea 824 00:47:02,530 --> 00:47:13,450 for combining models of both mean with a y and variance 825 00:47:13,450 --> 00:47:18,310 is use the variance to simply pick 826 00:47:18,310 --> 00:47:22,990 which of the x1, x2 combinations such that you're on target, 827 00:47:22,990 --> 00:47:25,540 give you minimum variance. 828 00:47:25,540 --> 00:47:29,050 Now you can also do that directly 829 00:47:29,050 --> 00:47:31,390 and form an overall optimization, 830 00:47:31,390 --> 00:47:33,160 and this is just working through that. 831 00:47:33,160 --> 00:47:35,140 This is the direct approach where 832 00:47:35,140 --> 00:47:38,740 if I have a response surface for my mean 833 00:47:38,740 --> 00:47:42,940 and a response surface for my variance, 834 00:47:42,940 --> 00:47:44,710 I have a constraint here first off 835 00:47:44,710 --> 00:47:51,370 that I need an optimum point that satisfies 836 00:47:51,370 --> 00:47:53,680 my output hitting a target. 
837 00:47:53,680 --> 00:47:56,290 And then I can simply solve together, 838 00:47:56,290 --> 00:48:00,670 pick some x that satisfies that, and plug that 839 00:48:00,670 --> 00:48:03,040 into the s surface. 840 00:48:03,040 --> 00:48:08,030 Continuing on, substitute that in into the s surface 841 00:48:08,030 --> 00:48:13,450 so that you basically get some function 842 00:48:13,450 --> 00:48:19,420 where I need to find the x sub 2 that minimizes my variance. 843 00:48:19,420 --> 00:48:23,530 And then I can do a direct analytic variance minimization 844 00:48:23,530 --> 00:48:28,710 to solve for x2 and once I have x2, I can solve back for x1. 845 00:48:28,710 --> 00:48:33,550 So all I'm saying there is you can just algebraically use 846 00:48:33,550 --> 00:48:35,110 your two response surfaces. 847 00:48:35,110 --> 00:48:38,620 If they are simple and linear then 848 00:48:38,620 --> 00:48:44,650 it's nice and easy to drive that towards where 849 00:48:44,650 --> 00:48:50,050 you have the best y hat that's on target 850 00:48:50,050 --> 00:48:51,640 that also is minimum variance. 851 00:48:56,410 --> 00:49:01,690 Another approach, once I have models for both-- 852 00:49:01,690 --> 00:49:04,090 again, if I have a model for the mean response 853 00:49:04,090 --> 00:49:06,550 and a model for variance-- 854 00:49:09,300 --> 00:49:12,240 is think about other cost functions that 855 00:49:12,240 --> 00:49:17,340 combine both of these effects. 856 00:49:17,340 --> 00:49:23,690 So this is a bit more general because it 857 00:49:23,690 --> 00:49:27,800 allows us to look at, say, the quality loss function 858 00:49:27,800 --> 00:49:33,235 and have terms that penalize us for deviation from-- 859 00:49:39,200 --> 00:49:46,220 I don't-- well, penalizes for deviations from the target. 860 00:49:46,220 --> 00:49:49,700 I'm just looking at this and saying, I don't know why we-- 861 00:49:49,700 --> 00:49:53,160 let's just stick with y here everywhere. 862 00:49:53,160 --> 00:49:54,770 I don't know why we switched to x. 863 00:49:58,620 --> 00:50:03,120 So when I have deviations from my optimum point from the mean, 864 00:50:03,120 --> 00:50:04,500 that's going to cost me something 865 00:50:04,500 --> 00:50:08,850 and then I also have a term associated with the variance. 866 00:50:08,850 --> 00:50:15,370 And if both of those are functions of my input 867 00:50:15,370 --> 00:50:20,350 parameters, now I'm finding again the best input parameter 868 00:50:20,350 --> 00:50:23,600 that minimizes the overall cost function. 869 00:50:23,600 --> 00:50:25,890 Now there's a subtle difference here between-- 870 00:50:25,890 --> 00:50:30,310 or more power between one of these approaches. 871 00:50:30,310 --> 00:50:33,280 This approach versus the direct solution 872 00:50:33,280 --> 00:50:36,400 approach in the previous slide. 873 00:50:36,400 --> 00:50:38,290 The direct solution on the previous slide 874 00:50:38,290 --> 00:50:44,400 basically said, I have two outputs and two inputs 875 00:50:44,400 --> 00:50:47,940 and I'm trying to pull those two things 876 00:50:47,940 --> 00:50:52,220 into one overall optimum. 877 00:50:52,220 --> 00:50:52,720 Yes? 878 00:50:52,720 --> 00:50:53,250 AUDIENCE: Professor? 879 00:50:53,250 --> 00:50:54,417 DUANE BONING: Yes, question? 880 00:50:54,417 --> 00:50:56,937 AUDIENCE: I think we cannot see-- 881 00:50:56,937 --> 00:50:59,520 I think we can see a slide but then we cannot see any drawing, 882 00:50:59,520 --> 00:51:03,330 and actually we are now stuck at slide five. 
883 00:51:03,330 --> 00:51:04,320 DUANE BONING: Oh, OK. 884 00:51:04,320 --> 00:51:06,870 So that doesn't sound like it's changing there. 885 00:51:09,397 --> 00:51:11,230 AUDIENCE: We don't really know which slide-- 886 00:51:11,230 --> 00:51:12,730 DUANE BONING: Yeah, I'm on slide 14. 887 00:51:15,300 --> 00:51:17,115 Yeah, I'm sorry? 888 00:51:17,115 --> 00:51:18,990 AUDIENCE: What we have right now is slide 11. 889 00:51:29,312 --> 00:51:30,770 DUANE BONING: Yeah, so I'm not sure 890 00:51:30,770 --> 00:51:32,960 if the sharing is working right. 891 00:51:32,960 --> 00:51:35,420 Are there just long delays? 892 00:51:35,420 --> 00:51:37,393 Should I unshare and reshare? 893 00:51:37,393 --> 00:51:40,231 AUDIENCE: Yeah, I would do that first and then [INAUDIBLE].. 894 00:51:44,961 --> 00:51:46,020 DUANE BONING: Oh, OK. 895 00:51:46,020 --> 00:51:47,790 Well, let me reshare here. 896 00:52:19,495 --> 00:52:20,940 Oh, we killed the coffee. 897 00:52:26,890 --> 00:52:29,740 OK, do you guys see slide 14 there in Singapore yet? 898 00:52:37,240 --> 00:52:38,992 AUDIENCE: Yeah, we can see it right now. 899 00:52:38,992 --> 00:52:39,700 DUANE BONING: OK. 900 00:52:39,700 --> 00:52:41,710 Great. 901 00:52:41,710 --> 00:52:44,140 So let's see if this works. 902 00:52:44,140 --> 00:52:47,500 We'll try continuing here. 903 00:52:47,500 --> 00:52:51,040 The point I was going to make is if I go back to slide 13-- 904 00:52:51,040 --> 00:52:52,630 see if you got 13 now. 905 00:52:52,630 --> 00:52:53,680 Did 13 come up? 906 00:52:57,160 --> 00:52:58,720 AUDIENCE: We can't see on the screen, 907 00:52:58,720 --> 00:53:00,475 but we can see behind your projectors. 908 00:53:00,475 --> 00:53:01,350 DUANE BONING: Oh, OK. 909 00:53:01,350 --> 00:53:03,040 So that's what we're using in the-- 910 00:53:03,040 --> 00:53:03,898 all right. 911 00:53:03,898 --> 00:53:05,190 AUDIENCE: Yeah, it's all right. 912 00:53:05,190 --> 00:53:08,970 DUANE BONING: OK, we'll just use the projector then. 913 00:53:08,970 --> 00:53:11,700 The point I was making here on slide 12 and 13 914 00:53:11,700 --> 00:53:15,760 is in some sense I've got two outputs and two inputs. 915 00:53:15,760 --> 00:53:21,450 I can directly solve the x1 and x2 to achieve both a y 916 00:53:21,450 --> 00:53:24,510 on target and then a minimum s. 917 00:53:24,510 --> 00:53:33,690 If I had three x's or four input factors, this approach-- 918 00:53:33,690 --> 00:53:37,620 well, I might have additional power 919 00:53:37,620 --> 00:53:40,530 in terms of picking points. 920 00:53:40,530 --> 00:53:44,940 But this approach actually can-- 921 00:53:44,940 --> 00:53:48,210 I guess it can work as well if my variance actually 922 00:53:48,210 --> 00:53:50,610 is different. 923 00:53:50,610 --> 00:53:53,140 It has a unique minimum. 924 00:53:53,140 --> 00:53:54,660 But the point I wanted to make here 925 00:53:54,660 --> 00:53:59,230 is that with minimizing of the quality loss function, 926 00:53:59,230 --> 00:54:01,950 it's a little bit more natural in the output 927 00:54:01,950 --> 00:54:04,170 and can combine or led to explore 928 00:54:04,170 --> 00:54:07,890 over a different mixture of input factors a little bit more 929 00:54:07,890 --> 00:54:09,160 naturally. 930 00:54:09,160 --> 00:54:10,930 OK? 931 00:54:10,930 --> 00:54:12,960 So it's a little bit more general than assuming 932 00:54:12,960 --> 00:54:15,390 immediately I've just got an x1 and an x2 933 00:54:15,390 --> 00:54:21,180 and I'm directly solving for the outputs individually. 
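Here is a small symbolic sketch of that direct solution, with made-up response surfaces: a linear mean surface y_hat(x1, x2) constrained to a target T, and a variance surface s2(x1, x2) that is then minimized along the constraint. The coefficients and the target value are illustrative only.

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)
T = 50.0                                            # hypothetical target

y_hat = 30 + 8 * x1 + 5 * x2                        # made-up mean response surface
s2 = 4 - 1.5 * x2 + 0.5 * x2**2 + 0.2 * x1 * x2     # made-up variance response surface

# Step 1: impose the on-target constraint y_hat = T and express x1 in terms of x2.
x1_of_x2 = sp.solve(sp.Eq(y_hat, T), x1)[0]

# Step 2: substitute into the variance surface so it depends on x2 alone.
s2_on_target = sp.expand(s2.subs(x1, x1_of_x2))

# Step 3: minimize analytically (d s2 / d x2 = 0; the x2**2 coefficient is
# positive here, so this stationary point is a minimum) and back-solve for x1.
x2_star = sp.solve(sp.diff(s2_on_target, x2), x2)[0]
x1_star = x1_of_x2.subs(x2, x2_star)
print("x1* =", x1_star, " x2* =", x2_star, " s2* =", s2_on_target.subs(x2, x2_star))
```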
934 00:54:21,180 --> 00:54:23,970 What the quality loss approach is letting me do is essentially combine them 935 00:54:23,970 --> 00:54:27,060 into a single cost function, and it also 936 00:54:27,060 --> 00:54:31,500 gives another little bit of flexibility 937 00:54:31,500 --> 00:54:34,470 that is an important difference between this 938 00:54:34,470 --> 00:54:37,140 and the previous approach. 939 00:54:37,140 --> 00:54:38,850 In the previous approach, we said 940 00:54:38,850 --> 00:54:45,280 the constraint is your output is exactly on target. 941 00:54:45,280 --> 00:54:46,180 Is that true here? 942 00:54:55,280 --> 00:54:56,330 No. 943 00:54:56,330 --> 00:55:01,430 When we're minimizing this combined penalty function 944 00:55:01,430 --> 00:55:05,120 with two terms in it, it's actually a little bit more 945 00:55:05,120 --> 00:55:07,190 flexible in another important way, which 946 00:55:07,190 --> 00:55:12,320 is to say it's allowing you to maybe pick a point that's 947 00:55:12,320 --> 00:55:17,690 off target slightly because it's a lower variance 948 00:55:17,690 --> 00:55:24,920 point and overall you'll win in terms of yield or quality loss. 949 00:55:24,920 --> 00:55:27,100 So it's a little bit more flexible in letting 950 00:55:27,100 --> 00:55:30,220 you make that trade off between variance 951 00:55:30,220 --> 00:55:32,800 and being exactly on target. 952 00:55:32,800 --> 00:55:36,820 It doesn't presuppose that your mean on target 953 00:55:36,820 --> 00:55:40,800 is absolutely your best point. 954 00:55:40,800 --> 00:55:44,220 So I like the quality loss formalism a little bit 955 00:55:44,220 --> 00:55:50,020 if, in fact, I have a variance that also depends 956 00:55:50,020 --> 00:55:55,670 on what operating point I'm at, what x factors 957 00:55:55,670 --> 00:55:58,720 and therefore also what y point I'm at. 958 00:55:58,720 --> 00:56:02,110 It lets me combine those and trade those off 959 00:56:02,110 --> 00:56:04,720 in a natural way. 960 00:56:04,720 --> 00:56:08,030 Now by the way, the classic quality loss function, 961 00:56:08,030 --> 00:56:13,960 they also use a constant factor between the variance 962 00:56:13,960 --> 00:56:15,490 and being off target. 963 00:56:15,490 --> 00:56:17,710 A generalization of that is you can actually 964 00:56:17,710 --> 00:56:22,290 use different weighting factors in those two cases. 965 00:56:22,290 --> 00:56:27,660 You may actually have, I don't know, a customer that says, 966 00:56:27,660 --> 00:56:31,220 I really want my mean to be-- 967 00:56:31,220 --> 00:56:39,680 maybe your customer actually has some reason 968 00:56:39,680 --> 00:56:44,940 to weight or be more concerned about variance. 969 00:56:44,940 --> 00:56:48,950 He likes a tighter variance, but actually has-- 970 00:56:48,950 --> 00:56:53,430 as long as a mean is somewhere in my spec region, I'm fine. 971 00:56:53,430 --> 00:56:57,230 It's kind of an odd combination, but there might be reasons 972 00:56:57,230 --> 00:57:00,860 where you might weight variance or being 973 00:57:00,860 --> 00:57:03,000 near target a little bit differently, 974 00:57:03,000 --> 00:57:05,720 and so that's one nice thing about this minimizing 975 00:57:05,720 --> 00:57:07,640 of the expected loss. 976 00:57:07,640 --> 00:57:08,960 You can actually weight those. 977 00:57:11,740 --> 00:57:14,960 OK, so what are some challenges here when the variance is 978 00:57:14,960 --> 00:57:15,460 varying?
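And here is a hedged sketch of minimizing such a weighted expected-loss function, L(x) = w1*(y_hat(x) - T)^2 + w2*s2(x), reusing the same made-up surfaces as the sketch above; the weights are arbitrary and simply illustrate a customer who cares more about variance.

```python
import numpy as np
from scipy.optimize import minimize

T = 50.0
w1, w2 = 1.0, 2.0            # hypothetical weights: variance weighted more heavily

def y_hat(x):                # made-up mean response surface
    x1, x2 = x
    return 30 + 8 * x1 + 5 * x2

def s2(x):                   # made-up variance response surface
    x1, x2 = x
    return 4 - 1.5 * x2 + 0.5 * x2**2 + 0.2 * x1 * x2

def expected_loss(x):
    return w1 * (y_hat(x) - T) ** 2 + w2 * s2(x)

res = minimize(expected_loss, x0=np.array([1.0, 1.0]))
print("x* =", res.x, " y_hat(x*) =", y_hat(res.x), " s2(x*) =", s2(res.x))
# Unlike the direct solution, the optimum here is allowed to sit slightly off
# target if that buys a lower variance; being exactly on target is no longer
# a hard constraint.
```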
979 00:57:21,690 --> 00:57:24,690 One challenge with minimizing or variance [INAUDIBLE] 980 00:57:24,690 --> 00:57:27,850 that we've already talked about is you have to model it. 981 00:57:27,850 --> 00:57:29,580 And what I want to explore a little bit 982 00:57:29,580 --> 00:57:31,650 are some of the response surface model 983 00:57:31,650 --> 00:57:33,930 or approaches for actually building up 984 00:57:33,930 --> 00:57:37,690 models of these dependencies. 985 00:57:37,690 --> 00:57:38,550 Are you guys back? 986 00:57:38,550 --> 00:57:41,550 I just heard another bell. 987 00:57:41,550 --> 00:57:43,700 I'm just going to keep going. 988 00:57:43,700 --> 00:57:45,200 AUDIENCE: No, we're not but it's OK. 989 00:57:45,200 --> 00:57:46,242 We can see the projector. 990 00:57:46,242 --> 00:57:49,290 DUANE BONING: OK, we'll continue with that. 991 00:57:49,290 --> 00:57:52,140 I'll ignore the little bells and the birds chirping and whatnot. 992 00:57:55,150 --> 00:57:57,450 Now as we start to look at ways to build 993 00:57:57,450 --> 00:57:59,873 the model of the process we can return 994 00:57:59,873 --> 00:58:01,290 to the question of what is it that 995 00:58:01,290 --> 00:58:04,290 causes non-constant variance. 996 00:58:04,290 --> 00:58:09,090 And part of the effect is that our noise factors 997 00:58:09,090 --> 00:58:12,150 may have different influence at different parts 998 00:58:12,150 --> 00:58:14,140 of the operating space. 999 00:58:14,140 --> 00:58:17,250 So one approach for building the model 1000 00:58:17,250 --> 00:58:20,520 and trying to get places that are robust, 1001 00:58:20,520 --> 00:58:24,030 that is to say insensitive to variation, 1002 00:58:24,030 --> 00:58:29,070 is actually explicitly play with some of these noise 1003 00:58:29,070 --> 00:58:31,680 factors, additional factors in our process 1004 00:58:31,680 --> 00:58:35,400 that we're not using directly to control the process 1005 00:58:35,400 --> 00:58:42,000 and try to explore and model what effect those noise factors 1006 00:58:42,000 --> 00:58:44,550 have directly on the process. 1007 00:58:44,550 --> 00:58:45,060 OK? 1008 00:58:45,060 --> 00:58:49,950 And so that gets to, essentially, 1009 00:58:49,950 --> 00:58:53,520 an approach where we can do an extension 1010 00:58:53,520 --> 00:58:57,390 to our design of experiments where we explicitly 1011 00:58:57,390 --> 00:59:02,110 look not just at, say, our main control factors. 1012 00:59:02,110 --> 00:59:08,070 So these might be our A and B control factors. 1013 00:59:08,070 --> 00:59:11,820 But at each of our design points, what we might do 1014 00:59:11,820 --> 00:59:16,920 is not just do pure replicates but actually take 1015 00:59:16,920 --> 00:59:21,780 some of these other factors, things we can kind of change 1016 00:59:21,780 --> 00:59:24,480 in the process but in normal operation 1017 00:59:24,480 --> 00:59:27,250 we would not want to be. 1018 00:59:27,250 --> 00:59:29,740 Maybe it's a machine setting that 1019 00:59:29,740 --> 00:59:33,220 has a big cost associated with changing it 1020 00:59:33,220 --> 00:59:34,830 or a big time lag in it. 1021 00:59:34,830 --> 00:59:40,730 It's a noise factor or some other factor. 
1022 00:59:40,730 --> 00:59:42,730 What we might want to do instead of just running 1023 00:59:42,730 --> 00:59:46,900 pure replicates is actually look at these additional noise 1024 00:59:46,900 --> 00:59:50,470 factors and explicitly, in the design of experiments, 1025 00:59:50,470 --> 00:59:53,230 tweak them, play with them, vary them 1026 00:59:53,230 --> 00:59:58,180 in a small region around that particular operating point. 1027 00:59:58,180 --> 01:00:02,200 And what that does is back to our variation equation. 1028 01:00:02,200 --> 01:00:05,860 That's explicitly saying, I know what my alpha factors are, 1029 01:00:05,860 --> 01:00:09,280 these noise factors, I'm going to actually modify 1030 01:00:09,280 --> 01:00:12,340 them a little bit and build a model that 1031 01:00:12,340 --> 01:00:19,130 gives me information about sensitivity to those factors. 1032 01:00:19,130 --> 01:00:23,980 So this is referred to as inner and outer factors. 1033 01:00:23,980 --> 01:00:28,060 What we've got if I do the same thing at each of these corner 1034 01:00:28,060 --> 01:00:30,310 points, I've got the inner design 1035 01:00:30,310 --> 01:00:33,760 and then I've also got an outer design 1036 01:00:33,760 --> 01:00:35,787 where I play with these additional factors 1037 01:00:35,787 --> 01:00:36,745 at each of the corners. 1038 01:00:39,930 --> 01:00:44,220 This, by the way, gets very close to approaches often 1039 01:00:44,220 --> 01:00:46,200 referred to as Taguchi approaches-- 1040 01:00:46,200 --> 01:00:55,290 after one of the big spokesmen for explicitly thinking about 1041 01:00:55,290 --> 01:00:58,770 dealing with noise and modeling of noise and optimizing 1042 01:00:58,770 --> 01:01:01,380 processes to be robust to noise-- 1043 01:01:01,380 --> 01:01:08,530 where you've got this inner and outer factor mixed design. 1044 01:01:08,530 --> 01:01:10,930 So let's see how this works. 1045 01:01:10,930 --> 01:01:12,840 Here's a very simple example. 1046 01:01:12,840 --> 01:01:18,550 We've got two main factors in our inner array. 1047 01:01:18,550 --> 01:01:21,360 So if I do just a full factorial in that, 1048 01:01:21,360 --> 01:01:23,940 we've got my four combinations. 1049 01:01:23,940 --> 01:01:24,720 Right? 1050 01:01:24,720 --> 01:01:29,100 And then here, instead of running pure replicates, 1051 01:01:29,100 --> 01:01:32,130 what we're doing is an outer array 1052 01:01:32,130 --> 01:01:36,570 where we vary the levels in another little full factorial 1053 01:01:36,570 --> 01:01:41,010 around each of those corner points. 1054 01:01:41,010 --> 01:01:43,680 And so now you can imagine that that's giving us 1055 01:01:43,680 --> 01:01:48,150 an ability to model the mean response, again, 1056 01:01:48,150 --> 01:01:55,500 at that corner point, but also a notion of variance. 1057 01:01:55,500 --> 01:01:59,760 If I just treat those as pure replicates 1058 01:01:59,760 --> 01:02:03,030 and build a model or an estimate of variance, 1059 01:02:03,030 --> 01:02:08,490 that gives me kind of a funny, slightly different way 1060 01:02:08,490 --> 01:02:10,200 of thinking about the variance.
1061 01:02:10,200 --> 01:02:13,200 It's actually a perturbed variance, 1062 01:02:13,200 --> 01:02:18,180 but I'm thinking of that as in my normal operation 1063 01:02:18,180 --> 01:02:20,610 those might be noise factors that I'm not explicitly 1064 01:02:20,610 --> 01:02:23,130 controlling and might randomly be picked 1065 01:02:23,130 --> 01:02:25,410 at one of those settings and so I'm 1066 01:02:25,410 --> 01:02:28,050 going to treat them, for the purpose of my process 1067 01:02:28,050 --> 01:02:35,100 optimization, for the purpose of robustness, as overall noise. 1068 01:02:35,100 --> 01:02:35,940 Right? 1069 01:02:35,940 --> 01:02:39,150 So in essence, it's lumping together these other factors 1070 01:02:39,150 --> 01:02:45,420 in order to explicitly and proactively explore 1071 01:02:45,420 --> 01:02:51,060 the space in those factors and get an estimate of variance 1072 01:02:51,060 --> 01:02:55,900 at each of our operating points, at each of our corner points. 1073 01:02:55,900 --> 01:02:56,530 OK? 1074 01:02:56,530 --> 01:02:59,080 So then you can build a model. 1075 01:02:59,080 --> 01:03:02,170 You could build a model both of the average output 1076 01:03:02,170 --> 01:03:03,850 and of the variance. 1077 01:03:03,850 --> 01:03:06,970 Another part of the methodology popularized 1078 01:03:06,970 --> 01:03:10,720 by Taguchi is to say, I don't want two outputs 1079 01:03:10,720 --> 01:03:13,270 that I have to somehow then optimize separately. 1080 01:03:13,270 --> 01:03:16,660 I want to fold these things in together into one combined cost 1081 01:03:16,660 --> 01:03:19,840 function that I then optimize. 1082 01:03:19,840 --> 01:03:23,470 And the particular cost function that he refers to or uses 1083 01:03:23,470 --> 01:03:28,180 is a signal to noise ratio, where what you're trying to do 1084 01:03:28,180 --> 01:03:32,410 is get on target or maximize some output, 1085 01:03:32,410 --> 01:03:35,140 but also then minimize variance. 1086 01:03:35,140 --> 01:03:41,750 And he actually has multiple different factors, 1087 01:03:41,750 --> 01:03:45,310 but one can imagine building a signal to noise ratio 1088 01:03:45,310 --> 01:03:51,880 that looks at the ratio of output mean to variance. 1089 01:03:51,880 --> 01:03:56,560 You take the log so that it becomes more of a linear model 1090 01:03:56,560 --> 01:03:58,210 and form a signal to noise. 1091 01:03:58,210 --> 01:04:00,550 And so here, for example in the picture, 1092 01:04:00,550 --> 01:04:05,620 is an example of an overall integrated cost function, 1093 01:04:05,620 --> 01:04:11,500 a signal to noise ratio function where you might want 1094 01:04:11,500 --> 01:04:15,220 the maximum signal to noise-- 1095 01:04:15,220 --> 01:04:17,860 larger output, smaller variance-- and this 1096 01:04:17,860 --> 01:04:22,330 would drive you towards an optimum point in your space 1097 01:04:22,330 --> 01:04:30,300 based on where you're going to achieve that notion of best. 1098 01:04:30,300 --> 01:04:32,560 OK? 1099 01:04:32,560 --> 01:04:34,640 So there's a couple of ideas on this slide.
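A minimal sketch of such a crossed (inner-times-outer) array: a 2^2 full factorial in two control factors A and B, crossed with a 2^2 outer array in two noise factors z1 and z2. The process function, its coefficients, and the noise level are all invented for illustration; the signal-to-noise ratio shown is one common "nominal-the-best" form.

```python
import itertools
import numpy as np

inner = list(itertools.product([-1, 1], repeat=2))   # control-factor settings (A, B)
outer = list(itertools.product([-1, 1], repeat=2))   # noise-factor settings (z1, z2)

def process(a, b, z1, z2, rng):
    # Hypothetical response: the sensitivity to z1 depends on the control factor A.
    return 10 + 3 * a + 2 * b + (1.0 + 0.8 * a) * z1 + 0.5 * z2 + rng.normal(0, 0.2)

rng = np.random.default_rng(1)
for a, b in inner:
    y = np.array([process(a, b, z1, z2, rng) for z1, z2 in outer])
    ybar, s2 = y.mean(), y.var(ddof=1)
    sn = 10 * np.log10(ybar**2 / s2)   # "nominal-the-best" signal-to-noise ratio, in dB
    print(f"A={a:+d} B={b:+d}:  ybar={ybar:6.2f}  s2={s2:5.2f}  S/N={sn:5.1f} dB")
# The corners with A = -1 show a much smaller noise-driven variance (the z1
# coefficient shrinks there) and hence a higher S/N than the A = +1 corners.
```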
1100 01:04:34,640 --> 01:04:36,970 One idea here is simply this notion of how 1101 01:04:36,970 --> 01:04:39,970 you do the design with the inner and outer array 1102 01:04:39,970 --> 01:04:46,120 and then the other is this idea of a natural way of combining 1103 01:04:46,120 --> 01:04:49,870 both variance and target, different than the expected 1104 01:04:49,870 --> 01:04:53,230 loss function, but kind of related to it 1105 01:04:53,230 --> 01:04:55,460 are the signal to noise ratios. 1106 01:04:55,460 --> 01:04:57,040 And if you look in the literature, 1107 01:04:57,040 --> 01:05:02,050 there's different ones depending on what kind of objectives 1108 01:05:02,050 --> 01:05:04,270 you have qualitatively. 1109 01:05:04,270 --> 01:05:08,320 For example, if you really want to be close to the nominal 1110 01:05:08,320 --> 01:05:11,140 and be insensitive to the noise, then 1111 01:05:11,140 --> 01:05:14,770 that cost function that we just described on the previous slide 1112 01:05:14,770 --> 01:05:16,300 is a pretty good one. 1113 01:05:16,300 --> 01:05:19,900 There's other places where maybe your main driver is really 1114 01:05:19,900 --> 01:05:21,010 trying to get as-- 1115 01:05:24,200 --> 01:05:25,280 do I have this right? 1116 01:05:25,280 --> 01:05:26,270 I may have this-- 1117 01:05:26,270 --> 01:05:31,460 no, I think I have this right because of the negative there. 1118 01:05:31,460 --> 01:05:34,040 You might have a slightly different signal to noise ratio 1119 01:05:34,040 --> 01:05:36,950 if you're more driven towards you 1120 01:05:36,950 --> 01:05:41,000 want a large output, a large y, or another one 1121 01:05:41,000 --> 01:05:43,880 if you want a small y, you want to minimize y. 1122 01:05:43,880 --> 01:05:47,210 So you can still formulate these things in slightly different 1123 01:05:47,210 --> 01:05:48,320 signal to noise metrics. 1124 01:05:50,860 --> 01:05:55,000 So let's explore this inner and outer array design 1125 01:05:55,000 --> 01:05:56,740 approach a little bit more. 1126 01:05:56,740 --> 01:06:01,240 This is often referred to as a crossed array method. 1127 01:06:01,240 --> 01:06:04,850 One question that comes up is, gee, 1128 01:06:04,850 --> 01:06:07,900 looks like an awful lot of experiments. 1129 01:06:07,900 --> 01:06:10,910 So we do want to understand how many experiments there are. 1130 01:06:10,910 --> 01:06:16,200 Are there ways to minimize that or reduce that? 1131 01:06:16,200 --> 01:06:19,440 And overall, it's clear that I've 1132 01:06:19,440 --> 01:06:23,190 got something that requires both control factor 1133 01:06:23,190 --> 01:06:28,590 tests in the inner array, but also noise factor 1134 01:06:28,590 --> 01:06:32,220 tests associated with additional variations 1135 01:06:32,220 --> 01:06:35,850 right around each of the points. 1136 01:06:35,850 --> 01:06:37,520 OK? 1137 01:06:37,520 --> 01:06:41,210 And it can actually be even worse 1138 01:06:41,210 --> 01:06:43,790 than just sort of what I've drawn here, 1139 01:06:43,790 --> 01:06:49,220 because if our goal is really to get to an optimum point that 1140 01:06:49,220 --> 01:06:55,470 hits the target and really is an optimum point in a target, 1141 01:06:55,470 --> 01:06:58,950 you may even need something more than just a full factorial 1142 01:06:58,950 --> 01:07:02,970 on your x1 and x2 parameters. 1143 01:07:02,970 --> 01:07:05,340 Think back to what we talked about on Tuesday.
1144 01:07:05,340 --> 01:07:07,050 We said if you're trying to drive 1145 01:07:07,050 --> 01:07:11,480 the process to an optimum y, you might 1146 01:07:11,480 --> 01:07:13,950 have locally linear models. 1147 01:07:13,950 --> 01:07:17,900 But if you really have an optimum y-- 1148 01:07:17,900 --> 01:07:21,440 not just at some target, but some overall optimum-- 1149 01:07:21,440 --> 01:07:25,910 what you need to get to is some natural maximum or minimum 1150 01:07:25,910 --> 01:07:28,280 point, which is a place where you've got something 1151 01:07:28,280 --> 01:07:31,380 like, say, quadratic curvature. 1152 01:07:31,380 --> 01:07:35,420 So ultimately your model for y, if you really 1153 01:07:35,420 --> 01:07:38,930 are trying to get to a true optimum it 1154 01:07:38,930 --> 01:07:46,660 may require some notion of a quadratic model, at least 1155 01:07:46,660 --> 01:07:50,110 in terms of your output as a function of your input. 1156 01:07:50,110 --> 01:07:52,990 So one approach was it might actually 1157 01:07:52,990 --> 01:07:59,300 require, for example, a three level model. 1158 01:07:59,300 --> 01:08:02,560 And in that case, things get bad really fast, right? 1159 01:08:02,560 --> 01:08:06,490 If I really needed a true three level model 1160 01:08:06,490 --> 01:08:13,720 on each of my x1, x2 inputs, now I go as 3 to the K sub C 1161 01:08:13,720 --> 01:08:16,720 times 2 to the K sub N, where K sub C is the number of control factors 1162 01:08:16,720 --> 01:08:19,100 and K sub N is the number of noise factors. 1163 01:08:19,100 --> 01:08:21,160 So that could grow really, really fast. 1164 01:08:21,160 --> 01:08:23,380 Fortunately we know I don't necessarily 1165 01:08:23,380 --> 01:08:28,050 have to do a full three level factorial. 1166 01:08:28,050 --> 01:08:30,899 There are other approaches like central composite 1167 01:08:30,899 --> 01:08:35,580 that can build me a quadratic model in the output 1168 01:08:35,580 --> 01:08:41,040 without needing so much replication. 1169 01:08:41,040 --> 01:08:44,370 I don't have to have necessarily full three 1170 01:08:44,370 --> 01:08:47,819 levels on all of my factors because I don't really have 1171 01:08:47,819 --> 01:08:51,590 that much interaction going on. 1172 01:08:51,590 --> 01:08:53,260 But the point here is if I'm really 1173 01:08:53,260 --> 01:08:57,670 trying to explore the space for building up 1174 01:08:57,670 --> 01:09:00,640 my estimate of my output, I might well 1175 01:09:00,640 --> 01:09:04,100 need a quadratic model in that. 1176 01:09:04,100 --> 01:09:06,590 Do you think it gets that bad for the noise factors? 1177 01:09:06,590 --> 01:09:10,264 Do I need a quadratic kind of model for my noise factors? 1178 01:09:14,290 --> 01:09:16,170 In other words, why not? 1179 01:09:16,170 --> 01:09:18,800 If I look at this little corner point, 1180 01:09:18,800 --> 01:09:21,102 why don't I need a quadratic? 1181 01:09:23,640 --> 01:09:25,380 Should I include a center point there? 1182 01:09:25,380 --> 01:09:29,460 Should I include multiple exploration around that 1183 01:09:29,460 --> 01:09:30,029 as well? 1184 01:09:30,029 --> 01:09:33,750 Do I need a quadratic model of variance at each corner? 1185 01:09:40,350 --> 01:09:45,060 What are we using this exploration 1186 01:09:45,060 --> 01:09:48,920 in this corner array to do? 1187 01:09:48,920 --> 01:09:51,710 We're just using it to exercise these noise factors 1188 01:09:51,710 --> 01:09:55,040 and build up an aggregate into one estimate of variance 1189 01:09:55,040 --> 01:09:56,430 at that point.
1190 01:09:56,430 --> 01:10:03,920 We are rarely actually trying to build a local quadratic model. 1191 01:10:03,920 --> 01:10:05,840 I'm just using it-- 1192 01:10:05,840 --> 01:10:10,760 think of it almost as structured replicates at the corner. 1193 01:10:10,760 --> 01:10:16,520 I don't really need that kind of resolution. 1194 01:10:16,520 --> 01:10:20,120 So you typically will not see things 1195 01:10:20,120 --> 01:10:23,600 like doing a central composite design at each corner, 1196 01:10:23,600 --> 01:10:25,010 something like that. 1197 01:10:25,010 --> 01:10:30,020 Something very simple like a factorial, 1198 01:10:30,020 --> 01:10:33,240 either full or even fractional factorial. 1199 01:10:33,240 --> 01:10:37,190 This is a beautiful place for fractional factorial 1200 01:10:37,190 --> 01:10:38,540 in those corner points. 1201 01:10:38,540 --> 01:10:41,810 If I had eight noise factors, I do not really 1202 01:10:41,810 --> 01:10:47,230 need to do the 2 to the eighth exploration 1203 01:10:47,230 --> 01:10:50,110 at each of my corner points because I don't really 1204 01:10:50,110 --> 01:10:52,840 care if it's a third or fourth or sixth order 1205 01:10:52,840 --> 01:10:55,810 interaction between noise factors, what 1206 01:10:55,810 --> 01:10:56,860 I would like to do-- 1207 01:10:56,860 --> 01:10:58,780 I don't care about confounding. 1208 01:10:58,780 --> 01:11:01,060 Confounding is great, no problem. 1209 01:11:01,060 --> 01:11:03,130 I want to lump them all together and just get 1210 01:11:03,130 --> 01:11:07,000 one estimate, in fact, of overall noise variance there. 1211 01:11:07,000 --> 01:11:08,830 So very reduced models-- 1212 01:11:08,830 --> 01:11:10,742 whoops. 1213 01:11:10,742 --> 01:11:13,490 I didn't want to do that. 1214 01:11:13,490 --> 01:11:15,920 Very reduced models. 1215 01:11:15,920 --> 01:11:18,920 Heavily fractional factorial in those corner points 1216 01:11:18,920 --> 01:11:22,390 are very effective for this purpose. 1217 01:11:26,300 --> 01:11:28,720 Another approach that we can use here 1218 01:11:28,720 --> 01:11:34,680 is we could actually go beyond that notion of just lumping 1219 01:11:34,680 --> 01:11:37,500 it all together just trying to get an estimate for variance 1220 01:11:37,500 --> 01:11:42,360 at that point and actually treat my control factors 1221 01:11:42,360 --> 01:11:46,470 and my noise factors almost on equal basis. 1222 01:11:46,470 --> 01:11:50,340 So for the purposes of our design, why distinguish them? 1223 01:11:50,340 --> 01:11:54,570 I might actually want a functional dependence 1224 01:11:54,570 --> 01:11:59,070 on the noise factor, maybe because that 1225 01:11:59,070 --> 01:12:01,650 allows me to sort of more naturally 1226 01:12:01,650 --> 01:12:05,470 fold it all in together into one big experiment. 1227 01:12:05,470 --> 01:12:08,580 So in that case, you can easily imagine 1228 01:12:08,580 --> 01:12:12,790 building a single integrated model that 1229 01:12:12,790 --> 01:12:15,670 has linear, interaction, maybe even 1230 01:12:15,670 --> 01:12:20,320 quadratic dependencies on my control factors. 1231 01:12:20,320 --> 01:12:23,710 But also that has terms in it that 1232 01:12:23,710 --> 01:12:30,990 look like essentially different parts of the problem 1233 01:12:30,990 --> 01:12:33,390 that we might be interested in. 1234 01:12:33,390 --> 01:12:36,360 One of those would be terms like this. 1235 01:12:36,360 --> 01:12:39,032 I've got a gamma times z1. 1236 01:12:39,032 --> 01:12:40,480 Well, what's that doing?
1237 01:12:40,480 --> 01:12:43,260 That's telling me essentially the sensitivity. 1238 01:12:43,260 --> 01:12:47,460 That's Dy to Dz1. 1239 01:12:47,460 --> 01:12:51,240 That's one of these alpha noise factors. 1240 01:12:51,240 --> 01:12:54,090 I directly have a measure there of sensitivity 1241 01:12:54,090 --> 01:12:58,030 to noise, that noise factor at that point. 1242 01:12:58,030 --> 01:12:59,580 So that might be a very useful thing 1243 01:12:59,580 --> 01:13:03,450 to know and actually model and distinguish. 1244 01:13:03,450 --> 01:13:07,970 A second one are terms that look like this. 1245 01:13:07,970 --> 01:13:10,510 Here's an x1 and z1. 1246 01:13:10,510 --> 01:13:13,330 What's this telling me? 1247 01:13:13,330 --> 01:13:16,120 Well, this is telling me there's an interaction 1248 01:13:16,120 --> 01:13:20,480 between my x variable and that noise factor. 1249 01:13:20,480 --> 01:13:23,950 So I know, depending on where I pick my x variable, 1250 01:13:23,950 --> 01:13:28,730 I've got a different sensitivity to noise at that point. 1251 01:13:28,730 --> 01:13:32,200 So that's kind of breaking that part of the overall response 1252 01:13:32,200 --> 01:13:35,680 out and letting you explicitly say, yes, now I 1253 01:13:35,680 --> 01:13:37,450 understand that interaction. 1254 01:13:37,450 --> 01:13:40,780 I understand how my noise interacts 1255 01:13:40,780 --> 01:13:42,910 with that particular factor and I 1256 01:13:42,910 --> 01:13:46,330 can use that to help guide my optimization 1257 01:13:46,330 --> 01:13:48,650 and selection of input points. 1258 01:13:48,650 --> 01:13:53,440 So there is some value in going ahead and doing 1259 01:13:53,440 --> 01:13:59,870 an integrated effect and noise response surface 1260 01:13:59,870 --> 01:14:04,400 approach where you can build up an integrated model together 1261 01:14:04,400 --> 01:14:05,270 in here. 1262 01:14:05,270 --> 01:14:07,190 And then you can use the same techniques 1263 01:14:07,190 --> 01:14:10,610 that we talked about last time, about aliasing and worrying 1264 01:14:10,610 --> 01:14:13,940 about which interaction factors and how high your order 1265 01:14:13,940 --> 01:14:16,280 interaction really needs so you can 1266 01:14:16,280 --> 01:14:18,950 go ahead and get reduced models and fold some of those 1267 01:14:18,950 --> 01:14:20,100 together. 1268 01:14:20,100 --> 01:14:22,940 But it essentially allows us to explicitly think 1269 01:14:22,940 --> 01:14:28,940 about both control factor responses, 1270 01:14:28,940 --> 01:14:32,005 but interaction with the noise factor. 1271 01:14:32,005 --> 01:14:32,505 Yeah? 1272 01:14:32,505 --> 01:14:34,710 AUDIENCE: Do we have a control on triggering 1273 01:14:34,710 --> 01:14:36,078 the level of noise? 1274 01:14:36,078 --> 01:14:36,870 DUANE BONING: Yeah. 1275 01:14:36,870 --> 01:14:39,520 So the big assumption in here is, in fact, 1276 01:14:39,520 --> 01:14:42,480 many of these noise factors in the design of experiments, 1277 01:14:42,480 --> 01:14:46,170 while you're doing your upfront process optimization, 1278 01:14:46,170 --> 01:14:49,348 that you actually can vary them. 1279 01:14:49,348 --> 01:14:50,390 AUDIENCE: That's a noise. 1280 01:14:50,390 --> 01:14:51,543 [INAUDIBLE] 1281 01:14:51,543 --> 01:14:53,460 DUANE BONING: Yeah, it's kind of a funny term. 1282 01:14:53,460 --> 01:14:56,580 Yeah, I'm glad you raised that question. 
1283 01:14:56,580 --> 01:15:00,810 The question is we've got these noise factors and the term 1284 01:15:00,810 --> 01:15:02,410 calling the noise factors-- 1285 01:15:02,410 --> 01:15:05,070 one strong interpretation of noise 1286 01:15:05,070 --> 01:15:07,410 is it's something I have no control over. 1287 01:15:07,410 --> 01:15:08,970 This is a weaker notion. 1288 01:15:08,970 --> 01:15:11,130 This is basically saying there are 1289 01:15:11,130 --> 01:15:16,440 things like in the normal operation of the equipment 1290 01:15:16,440 --> 01:15:22,140 I can't vary them or in the normal operation 1291 01:15:22,140 --> 01:15:24,840 they will be naturally varied over 1292 01:15:24,840 --> 01:15:31,950 and I can't set and specify and keep it at that noise factor 1293 01:15:31,950 --> 01:15:32,740 at one of these. 1294 01:15:32,740 --> 01:15:36,610 So here's what I'm thinking as a good example. 1295 01:15:36,610 --> 01:15:39,900 Temperature, room temperature, say. 1296 01:15:39,900 --> 01:15:43,530 So you know in normal operation throughout the day, 1297 01:15:43,530 --> 01:15:46,110 the temperature from the morning to the afternoon 1298 01:15:46,110 --> 01:15:52,320 might be varying in your factory and you've only got a limit. 1299 01:15:52,320 --> 01:15:54,630 You can't really completely control that. 1300 01:15:54,630 --> 01:15:58,480 Maybe you're operating in a place where it has a big range 1301 01:15:58,480 --> 01:16:01,870 and you don't have any control over it. 1302 01:16:01,870 --> 01:16:04,770 So what you would want to do is understand that while you're 1303 01:16:04,770 --> 01:16:06,870 doing your process design. 1304 01:16:06,870 --> 01:16:11,040 Treat it as a noise factor, know its impact, 1305 01:16:11,040 --> 01:16:13,920 but then in actual operation, I can't pick my temperature. 1306 01:16:13,920 --> 01:16:16,140 It's going to just vary, but you would 1307 01:16:16,140 --> 01:16:19,290 like your process to be as robust as possible 1308 01:16:19,290 --> 01:16:21,330 to that factor. 1309 01:16:21,330 --> 01:16:23,040 So that's the kind of noise factor 1310 01:16:23,040 --> 01:16:27,060 here is things that you can either actively 1311 01:16:27,060 --> 01:16:29,070 control during the design of experiments 1312 01:16:29,070 --> 01:16:33,690 or naturally sample from during the design of experiments. 1313 01:16:33,690 --> 01:16:35,640 Other things would be operator. 1314 01:16:39,270 --> 01:16:41,890 So that's a very good question. 1315 01:16:41,890 --> 01:16:45,000 I'm glad you asked that. 1316 01:16:45,000 --> 01:16:45,570 OK. 1317 01:16:45,570 --> 01:16:49,230 Once you've got a noise response kind of surface, 1318 01:16:49,230 --> 01:16:53,430 now what you can also do is build outputs. 1319 01:16:53,430 --> 01:16:55,170 Not only as a function of-- 1320 01:16:58,120 --> 01:17:00,940 mean responses as a function of your input 1321 01:17:00,940 --> 01:17:05,950 where you would essentially ignore the noise factors, 1322 01:17:05,950 --> 01:17:08,290 but also if you just simply take the variance 1323 01:17:08,290 --> 01:17:10,930 of this relationship that gives you 1324 01:17:10,930 --> 01:17:16,450 a model for the variance of the response 1325 01:17:16,450 --> 01:17:21,160 under the assumption that the x's are constant. 1326 01:17:23,970 --> 01:17:27,770 So in other words, what we do is we can build two of these, 1327 01:17:27,770 --> 01:17:33,920 one under the assumption such that my noise factors 1328 01:17:33,920 --> 01:17:35,310 are at the mean.
1329 01:17:35,310 --> 01:17:38,120 So I'm basically just lumping all of those 1330 01:17:38,120 --> 01:17:41,750 into some overall noise from my noise factors. 1331 01:17:41,750 --> 01:17:44,030 And then in my variance case, I just 1332 01:17:44,030 --> 01:17:50,480 look at the terms that relate to the interactions assuming 1333 01:17:50,480 --> 01:17:54,710 some x is relatively constant. 1334 01:17:54,710 --> 01:17:58,160 So we ignore these terms or lump all these terms 1335 01:17:58,160 --> 01:18:02,810 into just the overall gamma and then include 1336 01:18:02,810 --> 01:18:08,900 some of the interaction terms with my z. 1337 01:18:08,900 --> 01:18:11,120 So essentially what this lets us do 1338 01:18:11,120 --> 01:18:15,800 if we built an overall response surface is 1339 01:18:15,800 --> 01:18:19,970 separate out and explore both the mean response and also 1340 01:18:19,970 --> 01:18:25,790 variance as we try to pick an x1 and our x2 to minimize 1341 01:18:25,790 --> 01:18:29,120 the sensitivity to those noise factors. 1342 01:18:29,120 --> 01:18:30,500 Yes, there was a question? 1343 01:18:30,500 --> 01:18:32,047 Singapore? 1344 01:18:32,047 --> 01:18:33,505 AUDIENCE: Prof, do we need to check 1345 01:18:33,505 --> 01:18:36,560 the r squared before calculating the mean and then the variance. 1346 01:18:36,560 --> 01:18:39,030 Whatever the r squared is, it's not very good. 1347 01:18:39,030 --> 01:18:40,040 It's [? small. ?] 1348 01:18:40,040 --> 01:18:40,450 DUANE BONING: Yeah. 1349 01:18:40,450 --> 01:18:40,950 Yeah. 1350 01:18:40,950 --> 01:18:44,240 So this sort of goes without saying. 1351 01:18:44,240 --> 01:18:48,110 I was just assuming that you use all of the normal response 1352 01:18:48,110 --> 01:18:51,270 surface modeling methodology. 1353 01:18:51,270 --> 01:18:55,430 You have to check significance of each of these terms. 1354 01:18:55,430 --> 01:18:58,110 You would only include terms if they are significant. 1355 01:18:58,110 --> 01:19:00,020 So it's more than just checking r squared, 1356 01:19:00,020 --> 01:19:02,150 you really are running ANOVA. 1357 01:19:02,150 --> 01:19:05,220 Does that noise factor have an effect or not? 1358 01:19:05,220 --> 01:19:07,670 If not, you just lump it into an overall aspect. 1359 01:19:07,670 --> 01:19:10,685 So all of the typical model construction 1360 01:19:10,685 --> 01:19:11,810 you would still want to do. 1361 01:19:14,710 --> 01:19:19,030 So anyway, the point here is now if you explicitly 1362 01:19:19,030 --> 01:19:21,190 do a response surface model, now you can 1363 01:19:21,190 --> 01:19:23,680 build models of the variance. 1364 01:19:23,680 --> 01:19:26,980 This is just an explosion or expansion 1365 01:19:26,980 --> 01:19:31,660 of the variance surface in general 1366 01:19:31,660 --> 01:19:34,570 you will observe because of this squaring 1367 01:19:34,570 --> 01:19:36,400 in the variance expression. 1368 01:19:36,400 --> 01:19:40,990 It typically is quadratic in your x1 and x2, which is nice 1369 01:19:40,990 --> 01:19:44,950 because then if you're trying to minimize that variance if it's 1370 01:19:44,950 --> 01:19:47,830 quadratic in x1 and x2, it does tend to drive 1371 01:19:47,830 --> 01:19:50,320 towards a nice minimum point. 1372 01:19:53,300 --> 01:19:55,730 I have one quick example and then we'll be done. 
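Written out generically (a sketch with a single noise factor z1 and invented coefficient names, not the exact notation on the slides), the combined model and the mean and variance surfaces derived from it look like:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \gamma_1 z_1 + \delta_{11} x_1 z_1 + \delta_{21} x_2 z_1 + \varepsilon$$

Treating $z_1$ as zero-mean noise with variance $\sigma_{z_1}^2$ and $\varepsilon$ as residual noise, and holding $x_1$ and $x_2$ fixed:

$$\hat{y}(x_1, x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2$$

$$\operatorname{Var}[y \mid x_1, x_2] = \left(\gamma_1 + \delta_{11} x_1 + \delta_{21} x_2\right)^2 \sigma_{z_1}^2 + \sigma_\varepsilon^2$$

The squared sensitivity term is the squaring referred to above; it is what typically makes the derived variance surface quadratic in $x_1$ and $x_2$.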
1373 01:19:55,730 --> 01:19:58,808 This example is alluding back to our robust bending 1374 01:19:58,808 --> 01:20:00,350 where I've got a punch and I'm trying 1375 01:20:00,350 --> 01:20:02,990 to get a bend in a piece of sheet metal, 1376 01:20:02,990 --> 01:20:05,720 and we might have a couple of control factors 1377 01:20:05,720 --> 01:20:11,670 that again, we can use to optimize the process. 1378 01:20:11,670 --> 01:20:14,100 Like how deep I push the punch and what 1379 01:20:14,100 --> 01:20:17,010 the width of the initial punch die 1380 01:20:17,010 --> 01:20:20,160 is, but here's an example of a couple additional noise 1381 01:20:20,160 --> 01:20:21,060 factors. 1382 01:20:21,060 --> 01:20:25,590 It may be we know that some of the incoming material 1383 01:20:25,590 --> 01:20:30,210 will have a range of yield points and thicknesses. 1384 01:20:30,210 --> 01:20:33,510 Again, I'm not going to be able to necessarily select those. 1385 01:20:33,510 --> 01:20:36,540 I want a process that's going to be robust to those. 1386 01:20:36,540 --> 01:20:38,250 But in a design of experiments, I 1387 01:20:38,250 --> 01:20:41,250 can certainly sample from different incoming material 1388 01:20:41,250 --> 01:20:44,410 types that explore that. 1389 01:20:44,410 --> 01:20:47,250 So I could then do a design of experiments 1390 01:20:47,250 --> 01:20:57,720 where I vary both yield point and thickness of the input. 1391 01:20:57,720 --> 01:21:00,570 I can build up a model of z, I can build up 1392 01:21:00,570 --> 01:21:04,650 a model of the angle, I can go in and do estimation 1393 01:21:04,650 --> 01:21:11,150 of both my main effect on my control factors, 1394 01:21:11,150 --> 01:21:13,040 on interaction terms. 1395 01:21:13,040 --> 01:21:16,250 I could then estimate the size of those coefficients, 1396 01:21:16,250 --> 01:21:18,890 go in and make sure that they are significant, 1397 01:21:18,890 --> 01:21:21,380 and overall get down to say something 1398 01:21:21,380 --> 01:21:25,940 that looks like a reduced model where I might 1399 01:21:25,940 --> 01:21:27,680 be looking and saying, aw, OK. 1400 01:21:27,680 --> 01:21:31,610 That little beta 1, 2, 3 is insignificant. 1401 01:21:31,610 --> 01:21:36,320 I'm not going to include an x1, x2, x3 factor, for example, 1402 01:21:36,320 --> 01:21:39,110 in the model. 1403 01:21:39,110 --> 01:21:41,430 Here's the example of the reduced order model. 1404 01:21:41,430 --> 01:21:46,130 In this case, the mean surface is nice and linear 1405 01:21:46,130 --> 01:21:49,810 with an interaction term. 1406 01:21:49,810 --> 01:21:53,640 So we could go and do optimization purely 1407 01:21:53,640 --> 01:21:58,680 on the mean surface trying to hit some target angle, 1408 01:21:58,680 --> 01:22:01,540 but I can also look at the variance surface. 1409 01:22:01,540 --> 01:22:05,950 And in this case, the variance surface has-- 1410 01:22:05,950 --> 01:22:09,580 it is a quadratic, it's just a weakly quadratic. 1411 01:22:09,580 --> 01:22:13,330 But you can see a very, very strong dependence 1412 01:22:13,330 --> 01:22:15,730 in the variance on this x2 factor. 1413 01:22:15,730 --> 01:22:19,500 I think that was the thickness factor. 1414 01:22:19,500 --> 01:22:21,690 Very big differences in variance.
1415 01:22:21,690 --> 01:22:24,930 And so then I could combine those two in different ways 1416 01:22:24,930 --> 01:22:27,750 to form an optimization and try to get 1417 01:22:27,750 --> 01:22:30,240 to a good robust operating point and that 1418 01:22:30,240 --> 01:22:32,640 might be a Taguchi signal to noise 1419 01:22:32,640 --> 01:22:35,970 or it might be an expected loss function. 1420 01:22:35,970 --> 01:22:38,520 In this case here, I've plotted out 1421 01:22:38,520 --> 01:22:43,080 what the expected loss surface is as a function of x1 and x2, 1422 01:22:43,080 --> 01:22:45,900 and if you're seeking to minimize that you can start 1423 01:22:45,900 --> 01:22:50,820 to see a region there where you would want to be in terms 1424 01:22:50,820 --> 01:22:55,320 of the x1, x2 to have a good on target or close 1425 01:22:55,320 --> 01:22:59,260 to target and a low variance process. 1426 01:22:59,260 --> 01:23:00,123 OK. 1427 01:23:00,123 --> 01:23:01,540 I'm going to skip over this slide. 1428 01:23:01,540 --> 01:23:05,560 It's basically just talking about a summary 1429 01:23:05,560 --> 01:23:08,800 of doing the combined response surface 1430 01:23:08,800 --> 01:23:11,530 model with a little bit of an assessment of maybe 1431 01:23:11,530 --> 01:23:13,180 the number of different design points 1432 01:23:13,180 --> 01:23:15,480 that might be required in those cases. 1433 01:23:15,480 --> 01:23:18,760 But what I wanted to do here is just conclude and give you 1434 01:23:18,760 --> 01:23:23,860 the high level perspective here is we did lots of machinery 1435 01:23:23,860 --> 01:23:26,200 on response surface modeling talking mostly 1436 01:23:26,200 --> 01:23:30,460 about mean output, but that same machinery can also 1437 01:23:30,460 --> 01:23:34,900 be used with some extra ideas of these inner and outer arrays 1438 01:23:34,900 --> 01:23:36,940 for also modeling variance when it 1439 01:23:36,940 --> 01:23:39,830 varies as a function of operating point. 1440 01:23:39,830 --> 01:23:42,070 And then we can combine those together 1441 01:23:42,070 --> 01:23:47,800 to try to get processes that are robust to not only 1442 01:23:47,800 --> 01:23:50,350 selection of operating point, but also to these kinds 1443 01:23:50,350 --> 01:23:52,220 of noise factors. 1444 01:23:52,220 --> 01:23:57,310 So with that, I'll conclude and we'll see you all on Tuesday. 1445 01:23:57,310 --> 01:24:00,718 I think on Tuesday we'll talk about nested variances, 1446 01:24:00,718 --> 01:24:02,510 so it's going to be a little bit different. 1447 01:24:02,510 --> 01:24:04,802 We're going to pull back from the design of experiments 1448 01:24:04,802 --> 01:24:10,780 and talk a little bit more about interesting variance structures 1449 01:24:10,780 --> 01:24:13,510 often arising in places like semiconductor manufacturing 1450 01:24:13,510 --> 01:24:14,210 and elsewhere. 1451 01:24:14,210 --> 01:24:16,320 So we'll see you on Tuesday.