1 00:00:00,000 --> 00:00:02,458 SPEAKER: The following content is provided under a Creative 2 00:00:02,458 --> 00:00:03,730 Commons license. 3 00:00:03,730 --> 00:00:06,030 Your support will help MIT OpenCourseWare 4 00:00:06,030 --> 00:00:10,060 continue to offer high-quality educational resources for free. 5 00:00:10,060 --> 00:00:12,690 To make a donation or to view additional materials 6 00:00:12,690 --> 00:00:16,560 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:16,560 --> 00:00:17,904 at ocw.mit.edu. 8 00:00:21,120 --> 00:00:23,870 PROFESSOR: It's a pleasure to get the chance to talk, 9 00:00:23,870 --> 00:00:26,370 particularly [AUDIO OUT] because I think this, in some ways, 10 00:00:26,370 --> 00:00:29,940 is the first big punch line of the term, something 11 00:00:29,940 --> 00:00:32,040 that-- material you've been hearing about 12 00:00:32,040 --> 00:00:33,000 has been leading up to. 13 00:00:33,000 --> 00:00:35,125 I guess it depends on whether you consider yourself 14 00:00:35,125 --> 00:00:36,480 an engineer or a theoretician. 15 00:00:36,480 --> 00:00:37,890 If you're a theoretician, this is 16 00:00:37,890 --> 00:00:41,047 going to seem kind of pretty simplistic. 17 00:00:41,047 --> 00:00:42,630 If you're a practitioner, an engineer, 18 00:00:42,630 --> 00:00:43,838 you're going to say this is-- 19 00:00:43,838 --> 00:00:47,490 I hope you'll say this is what we've been waiting for. 20 00:00:47,490 --> 00:00:50,220 And what it is is really how we take a lot of the stuff 21 00:00:50,220 --> 00:00:53,610 that you've been learning about physical processes 22 00:00:53,610 --> 00:00:59,250 and the reasons they go bad and some basic theory 23 00:00:59,250 --> 00:01:03,330 on statistics, particularly normal statistics, 24 00:01:03,330 --> 00:01:05,880 and actually put it into some sort of use. 
25 00:01:05,880 --> 00:01:08,910 And we'll do this-- 26 00:01:08,910 --> 00:01:12,060 this is really sort of SPC in-- 27 00:01:12,060 --> 00:01:17,050 I was going to say 90 minutes, but in less than 90 minutes. 28 00:01:17,050 --> 00:01:21,540 And if you follow everything that's in today's lecture 29 00:01:21,540 --> 00:01:24,090 and follow what's in the book, you kind of have it all, OK? 30 00:01:24,090 --> 00:01:25,590 And then it's just a matter of a lot 31 00:01:25,590 --> 00:01:27,720 of shades of gray around that. 32 00:01:27,720 --> 00:01:32,030 Making it more applicable, and understanding 33 00:01:32,030 --> 00:01:33,320 some of the subtleties. 34 00:01:33,320 --> 00:01:36,260 The other part of it, if I don't get too wordy 35 00:01:36,260 --> 00:01:38,750 and we don't have any technical problems, 36 00:01:38,750 --> 00:01:40,910 is process capability. 37 00:01:40,910 --> 00:01:43,490 And process capability in my mind 38 00:01:43,490 --> 00:01:48,110 is the meeting of the two great forces in manufacturing 39 00:01:48,110 --> 00:01:49,970 or in a manufacturing company. 40 00:01:49,970 --> 00:01:52,520 And that's the relationship-- 41 00:01:52,520 --> 00:01:56,055 those are the organizations of design and manufacturing. 42 00:01:56,055 --> 00:01:58,430 And it really is just where the two are brought together. 43 00:01:58,430 --> 00:02:01,678 And it's ultimately very simple. 44 00:02:01,678 --> 00:02:03,720 It has a lot of implications, but it's ultimately 45 00:02:03,720 --> 00:02:06,540 a very simple concept. 46 00:02:06,540 --> 00:02:10,380 And both of these things are things that historically you 47 00:02:10,380 --> 00:02:15,540 might say were discovered or put together and learned 48 00:02:15,540 --> 00:02:16,920 in the United States. 
49 00:02:16,920 --> 00:02:19,230 They were unlearned-- it's an interesting history 50 00:02:19,230 --> 00:02:23,400 on this-- completely unlearned and transferred 51 00:02:23,400 --> 00:02:26,400 to other countries, in particular Japan, 52 00:02:26,400 --> 00:02:28,390 and then relearned back in this country. 53 00:02:28,390 --> 00:02:30,060 So there's some interesting history. 54 00:02:30,060 --> 00:02:35,040 If you read about Shewhart, if you read about Edwards Deming, 55 00:02:35,040 --> 00:02:39,120 and you read about Juran, these are the people 56 00:02:39,120 --> 00:02:43,770 that kind of invented and then brought it back 57 00:02:43,770 --> 00:02:45,000 to this country. 58 00:02:45,000 --> 00:02:49,980 And it's now-- you know, again, I find myself now having 59 00:02:49,980 --> 00:02:53,370 worked in this area long enough to tell stories that you have 60 00:02:53,370 --> 00:02:56,250 to understand that your background, this has probably 61 00:02:56,250 --> 00:02:59,010 been in your entire adult life, this has just 62 00:02:59,010 --> 00:03:01,020 been standard practice in industry 63 00:03:01,020 --> 00:03:02,370 pretty much around the world. 64 00:03:02,370 --> 00:03:05,330 But I can tell you, 15 years ago, 65 00:03:05,330 --> 00:03:08,623 which for me wasn't-- which, for me, is yesterday, 66 00:03:08,623 --> 00:03:10,540 this stuff was still kind of being discovered. 67 00:03:10,540 --> 00:03:14,230 But now, at least, it maybe isn't dogma yet everywhere, 68 00:03:14,230 --> 00:03:17,490 but it's pretty close to that. 69 00:03:17,490 --> 00:03:18,330 OK. 70 00:03:18,330 --> 00:03:20,820 So enough of my soapbox. 71 00:03:20,820 --> 00:03:22,410 There's an interesting paper that 72 00:03:22,410 --> 00:03:26,630 was written by Shewhart back when 73 00:03:26,630 --> 00:03:28,430 he was with Bell Laboratories. 
74 00:03:28,430 --> 00:03:32,420 Bell Laboratories used to be the premier industrial research 75 00:03:32,420 --> 00:03:36,520 laboratory in the United States, maybe in the world. 76 00:03:36,520 --> 00:03:40,660 And of course, did a lot of work on communications and telephony 77 00:03:40,660 --> 00:03:42,910 and things like that and a lot of things 78 00:03:42,910 --> 00:03:45,640 on basic information-- on, of course, 79 00:03:45,640 --> 00:03:47,075 on the hardware of that. 80 00:03:47,075 --> 00:03:48,700 Software didn't really exist back then, 81 00:03:48,700 --> 00:03:52,600 but the hardware of it and other things was done there. 82 00:03:52,600 --> 00:03:56,630 And they had a large theoretical group. 83 00:03:56,630 --> 00:04:01,220 And Shewhart, as I understand, was a statistician. 84 00:04:01,220 --> 00:04:04,850 And Bell Labs also, in their connection 85 00:04:04,850 --> 00:04:05,810 with Western Electric-- 86 00:04:05,810 --> 00:04:09,950 Western Electric was the manufacturing arm 87 00:04:09,950 --> 00:04:12,110 of the Bell System, which was the one 88 00:04:12,110 --> 00:04:15,350 big monopoly in the United States at that time. 89 00:04:15,350 --> 00:04:17,870 So they owned the phone system, and they 90 00:04:17,870 --> 00:04:19,040 manufactured all the phones. 91 00:04:19,040 --> 00:04:21,477 That was an interesting era in history, too. 92 00:04:21,477 --> 00:04:23,060 You could get your phones from anybody 93 00:04:23,060 --> 00:04:24,518 as long as it was Western Electric. 94 00:04:24,518 --> 00:04:28,400 And you could get any color you wanted as long as it was black. 95 00:04:28,400 --> 00:04:32,490 They made very solid phones, but there wasn't a lot of variety. 96 00:04:32,490 --> 00:04:35,250 But they always worked. 
97 00:04:35,250 --> 00:04:38,730 Anyway, they did a lot of work on improving 98 00:04:38,730 --> 00:04:40,450 telephony and other things like that, 99 00:04:40,450 --> 00:04:43,290 and a lot of statistics and mathematics 100 00:04:43,290 --> 00:04:45,320 got developed around control systems. 101 00:04:45,320 --> 00:04:48,570 Those of you who've taken control courses, I think-- 102 00:04:51,090 --> 00:04:53,970 I'm trying to think, well, I might get some of these names 103 00:04:53,970 --> 00:04:56,070 wrong, but some of the techniques and other things 104 00:04:56,070 --> 00:04:58,740 that are there came from the concept 105 00:04:58,740 --> 00:05:03,360 of developing better amplifiers for undersea 106 00:05:03,360 --> 00:05:06,780 cables and other things like that for telephony. 107 00:05:06,780 --> 00:05:11,370 At the same time this development of statistics 108 00:05:11,370 --> 00:05:14,040 and applied statistics was being used in that area, 109 00:05:14,040 --> 00:05:18,520 this guy Shewhart comes along and says, wait a second. 110 00:05:18,520 --> 00:05:20,440 We have this manufacturing. 111 00:05:20,440 --> 00:05:24,755 And what this little quote here, which 112 00:05:24,755 --> 00:05:26,380 is the introduction to this one paper-- 113 00:05:26,380 --> 00:05:27,547 he wrote a series of papers. 114 00:05:27,547 --> 00:05:28,930 It basically says is that-- 115 00:05:28,930 --> 00:05:32,020 he's stating what maybe is the obvious back in, I think, 116 00:05:32,020 --> 00:05:35,470 1925, which is the objective of manufacturing 117 00:05:35,470 --> 00:05:38,200 typically is to create products that 118 00:05:38,200 --> 00:05:40,160 are as uniform as possible. 119 00:05:40,160 --> 00:05:41,830 And you can set up all the conditions 120 00:05:41,830 --> 00:05:45,190 you want around the production to make things 121 00:05:45,190 --> 00:05:47,320 as uniform as possible, which is exactly what we've 122 00:05:47,320 --> 00:05:48,130 been talking about. 
123 00:05:48,130 --> 00:05:49,780 Control your inputs. 124 00:05:49,780 --> 00:05:53,720 Control your process variables, things like that. 125 00:05:53,720 --> 00:05:57,260 But there will still be variation in it. 126 00:05:57,260 --> 00:05:59,360 He's basically saying, if you ever 127 00:05:59,360 --> 00:06:01,490 make things in any sort of quantity, 128 00:06:01,490 --> 00:06:04,100 you will always see variation. 129 00:06:04,100 --> 00:06:07,880 And he basically did from that is-- 130 00:06:07,880 --> 00:06:09,320 and I don't know how much of this 131 00:06:09,320 --> 00:06:13,580 came from the idea of quantum mechanics, which was becoming 132 00:06:13,580 --> 00:06:19,250 popular at that time or was coming into being well known 133 00:06:19,250 --> 00:06:21,320 by the scientific community at least, which 134 00:06:21,320 --> 00:06:26,180 is that at some point, once you reduce things down, 135 00:06:26,180 --> 00:06:28,880 sort of explain everything that is to be explained, 136 00:06:28,880 --> 00:06:31,680 there's still some inherent randomness in things. 137 00:06:31,680 --> 00:06:33,950 And so this view of the idea that there's inherent 138 00:06:33,950 --> 00:06:38,240 randomness in any physical thing. 139 00:06:38,240 --> 00:06:41,430 Our ability to describe exactly what's going on eventually 140 00:06:41,430 --> 00:06:42,540 gets limited. 141 00:06:42,540 --> 00:06:46,500 And so that kind of led to this idea of statistical process 142 00:06:46,500 --> 00:06:49,410 control, which is really based on the idea 143 00:06:49,410 --> 00:06:53,520 that if I do everything I can to know everything 144 00:06:53,520 --> 00:06:57,060 I can about a process, then the only thing that's left 145 00:06:57,060 --> 00:06:58,890 is that unknowable part, if you will. 146 00:06:58,890 --> 00:07:01,360 The purely random part. 
147 00:07:01,360 --> 00:07:04,420 And at one level, SPC is all about, 148 00:07:04,420 --> 00:07:06,970 first of all, determining, have I indeed 149 00:07:06,970 --> 00:07:09,580 taken care of all those things that I should know about 150 00:07:09,580 --> 00:07:12,920 or could know about? 151 00:07:12,920 --> 00:07:14,840 And those are things that are related 152 00:07:14,840 --> 00:07:17,810 to the physics of the process that we've 153 00:07:17,810 --> 00:07:19,560 been talking about so far. 154 00:07:19,560 --> 00:07:26,070 And once that's done, what is the underlying 155 00:07:26,070 --> 00:07:29,190 statistical behavior, random behavior of the process, 156 00:07:29,190 --> 00:07:31,530 and how well can I characterize that. 157 00:07:31,530 --> 00:07:35,110 And SPC all gets down to really basically saying, 158 00:07:35,110 --> 00:07:40,230 have I achieved this state of pure randomness? 159 00:07:40,230 --> 00:07:43,350 And SPC-- another way of looking at the Shewhart hypothesis 160 00:07:43,350 --> 00:07:48,090 is, if the process is acting in any way except purely random, 161 00:07:48,090 --> 00:07:50,380 then you haven't done your job. 162 00:07:50,380 --> 00:07:53,060 Then you can still improve the process. 163 00:07:53,060 --> 00:07:53,560 OK? 164 00:08:00,010 --> 00:08:00,510 OK. 165 00:08:00,510 --> 00:08:03,550 So I think you can probably read that on your print out. 166 00:08:03,550 --> 00:08:04,800 I can't read it here. 167 00:08:04,800 --> 00:08:06,360 So, OK. 168 00:08:06,360 --> 00:08:09,450 So the hypothesis in effect, or the approach, 169 00:08:09,450 --> 00:08:11,160 is to basically say, all processes 170 00:08:11,160 --> 00:08:13,128 have a certain degree of randomness. 171 00:08:16,120 --> 00:08:21,040 And this idea of assignable causes, which are really what-- 172 00:08:21,040 --> 00:08:24,340 if you're in a control systems, from a control systems 173 00:08:24,340 --> 00:08:27,520 background, it would say disturbances or changes. 
174 00:08:27,520 --> 00:08:30,380 If they've all been eliminated, then it 175 00:08:30,380 --> 00:08:33,179 is a purely random process. 176 00:08:33,179 --> 00:08:35,900 So this concept of common causes or, again, 177 00:08:35,900 --> 00:08:39,669 purely random effects, if that's all that's left, 178 00:08:39,669 --> 00:08:44,360 then you do indeed have a process 179 00:08:44,360 --> 00:08:47,670 that is in a state of statistical control. 180 00:08:47,670 --> 00:08:49,490 Now, the other problem here again, 181 00:08:49,490 --> 00:08:51,620 and the reason I go on about this, 182 00:08:51,620 --> 00:08:54,920 is I came to this whole field-- 183 00:08:54,920 --> 00:08:57,830 all my formal education, with the exception of one or two 184 00:08:57,830 --> 00:09:00,980 classes, never talked about the concept 185 00:09:00,980 --> 00:09:03,350 of uncertainty or randomness. 186 00:09:03,350 --> 00:09:07,590 And particularly in the area of control systems, 187 00:09:07,590 --> 00:09:11,130 you think of control as actively watching something and doing 188 00:09:11,130 --> 00:09:15,600 something about it in a closed loop fashion. 189 00:09:15,600 --> 00:09:18,880 Statistical process control is quite different from that. 190 00:09:18,880 --> 00:09:20,940 And the term is used quite differently. 191 00:09:20,940 --> 00:09:26,730 It really means, have I eliminated or controlled 192 00:09:26,730 --> 00:09:29,220 everything external to the process 193 00:09:29,220 --> 00:09:31,470 that could cause it to vary? 194 00:09:31,470 --> 00:09:34,980 And indeed, is it in a state of what 195 00:09:34,980 --> 00:09:38,460 you'll see in a moment we'll define as purely 196 00:09:38,460 --> 00:09:42,150 random, stationary behavior. 197 00:09:42,150 --> 00:09:46,150 There's nothing active about the control at all. 
198 00:09:46,150 --> 00:09:51,270 Now, for those of you who have come up with a control systems 199 00:09:51,270 --> 00:09:54,285 or system dynamics point of view-- 200 00:09:54,285 --> 00:09:56,910 and I think I warned you of this the last time I talked to you. 201 00:09:56,910 --> 00:09:59,350 I do everything with block diagrams. 202 00:09:59,350 --> 00:10:01,200 So I have to do this. 203 00:10:01,200 --> 00:10:03,750 We have this manufacturing process. 204 00:10:03,750 --> 00:10:10,350 And we have certain inputs to the process. 205 00:10:10,350 --> 00:10:14,820 And these again, are the things that we can control, 206 00:10:14,820 --> 00:10:16,380 and we should be able to control. 207 00:10:16,380 --> 00:10:21,150 And we either set them and leave them alone, 208 00:10:21,150 --> 00:10:23,790 or we control them in a way that makes the process do 209 00:10:23,790 --> 00:10:26,280 what we want it to do. 210 00:10:26,280 --> 00:10:27,840 And that produces a certain output 211 00:10:27,840 --> 00:10:29,490 through all of the different materials 212 00:10:29,490 --> 00:10:32,800 and other things like that. 213 00:10:32,800 --> 00:10:35,130 And then in a control system sense, 214 00:10:35,130 --> 00:10:38,790 there are two reasons that the relationship between these two 215 00:10:38,790 --> 00:10:41,560 would vary. 216 00:10:41,560 --> 00:10:44,950 And again, the purely deterministic view of the world 217 00:10:44,950 --> 00:10:45,500 says, look. 218 00:10:45,500 --> 00:10:48,560 If I set some inputs, I'll get a certain output. 219 00:10:48,560 --> 00:10:52,910 There's a unique mapping between the inputs and the outputs. 220 00:10:52,910 --> 00:10:54,290 But two things can happen. 221 00:10:57,080 --> 00:11:02,080 You can have things that are called noise. 222 00:11:02,080 --> 00:11:04,930 And you'll hear that term throughout the term, actually. 223 00:11:04,930 --> 00:11:06,070 Or disturbances. 
224 00:11:08,780 --> 00:11:11,495 And you'll notice that's sort of external to the-- this 225 00:11:11,495 --> 00:11:13,370 is shown as external to the process. 226 00:11:13,370 --> 00:11:15,680 And from a mathematical point of view, 227 00:11:15,680 --> 00:11:18,830 it's shown as being additive. 228 00:11:18,830 --> 00:11:21,590 So imagine that this is a purely deterministic process. 229 00:11:21,590 --> 00:11:24,290 I hold the input perfectly constant. 230 00:11:24,290 --> 00:11:26,540 The output stays perfectly constant, 231 00:11:26,540 --> 00:11:30,120 except that I have a noise up here that's described, 232 00:11:30,120 --> 00:11:34,180 let's say, by a normal distribution. 233 00:11:34,180 --> 00:11:37,410 So at any instant in time, this variable 234 00:11:37,410 --> 00:11:39,640 will be following a normal distribution. 235 00:11:39,640 --> 00:11:44,160 So the output will follow normal distribution. 236 00:11:44,160 --> 00:11:48,350 And you might say that one way of looking at the Shewhart 237 00:11:48,350 --> 00:11:51,050 hypothesis is, yeah, here's this process, 238 00:11:51,050 --> 00:11:52,830 and I've got it completely under control. 239 00:11:52,830 --> 00:11:54,622 I know everything that's going on about it. 240 00:11:54,622 --> 00:11:57,530 Nothing inherent in the process is changing, 241 00:11:57,530 --> 00:12:00,290 but there's this noise process out here somehow, 242 00:12:00,290 --> 00:12:01,310 and it adds in. 243 00:12:01,310 --> 00:12:04,400 And it's always there. 244 00:12:04,400 --> 00:12:09,230 Now, this turns out to be a bit of an artifice 245 00:12:09,230 --> 00:12:12,710 because what we all know is what's really 246 00:12:12,710 --> 00:12:14,600 causing most of the changes that we 247 00:12:14,600 --> 00:12:16,245 see is a change in the process. 248 00:12:23,460 --> 00:12:26,790 So we have things like a process noise or disturbance. 
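The additive-noise picture described here (hold the input constant, let a normally distributed disturbance add to the output) can be sketched numerically. This is a minimal sketch, not something from the lecture: the function `process`, the input value, and the noise standard deviation are all hypothetical choices made only for illustration.

```python
import random
import statistics

def process(u):
    # Hypothetical deterministic input-output map (assumed for illustration).
    return 2.0 * u + 1.0

def run(u, sigma, n, seed=0):
    # Hold the input constant; a normal disturbance adds to the output.
    rng = random.Random(seed)
    return [process(u) + rng.gauss(0.0, sigma) for _ in range(n)]

outputs = run(u=5.0, sigma=0.1, n=10000)
mean_out = statistics.mean(outputs)   # close to process(5.0)
std_out = statistics.stdev(outputs)   # close to the assumed noise sigma
print(mean_out, std_out)
```

With the input held fixed, the output scatters around the deterministic value with the same spread as the noise, which is the sense in which the output "follows a normal distribution" in this model.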
249 00:12:31,200 --> 00:12:33,560 And so that means something inside here, 250 00:12:33,560 --> 00:12:35,600 like those alphas that we talked about. 251 00:12:35,600 --> 00:12:38,330 The basic parameters of the process. 252 00:12:38,330 --> 00:12:43,580 The equipment states, the equipment parameters. 253 00:12:43,580 --> 00:12:45,260 The material again. 254 00:12:45,260 --> 00:12:46,940 Especially the material changes. 255 00:12:46,940 --> 00:12:48,930 Material is inherent in here. 256 00:12:48,930 --> 00:12:50,700 And so that's going to change. 257 00:12:50,700 --> 00:12:53,180 One of the reasons we actually do 258 00:12:53,180 --> 00:12:55,430 this out here is that again, mathematically, 259 00:12:55,430 --> 00:12:57,740 having this change, having the function change, 260 00:12:57,740 --> 00:12:59,840 becomes really hard to handle. 261 00:12:59,840 --> 00:13:02,246 Additive disturbances are easy to handle. 262 00:13:02,246 --> 00:13:03,100 Simon. 263 00:13:03,100 --> 00:13:04,930 AUDIENCE: Would you say that temperature 264 00:13:04,930 --> 00:13:08,480 and material and everything across [INAUDIBLE]? 265 00:13:08,480 --> 00:13:12,358 What examples do you have for the noise disturbance? 266 00:13:12,358 --> 00:13:13,650 PROFESSOR: Yeah, good question. 267 00:13:13,650 --> 00:13:17,140 I was going to ask you the same thing. 268 00:13:17,140 --> 00:13:20,170 You can see that everything we talked 269 00:13:20,170 --> 00:13:22,780 about when we talked about the process modeling 270 00:13:22,780 --> 00:13:24,740 had to be here. 271 00:13:24,740 --> 00:13:25,340 OK? 272 00:13:25,340 --> 00:13:27,240 So a couple of things here. 
273 00:13:27,240 --> 00:13:30,720 So if the output from here is a dimension 274 00:13:30,720 --> 00:13:32,760 and you think of the output of the process 275 00:13:32,760 --> 00:13:36,180 here being a dimension that results 276 00:13:36,180 --> 00:13:39,420 from the combination of temperature, pressure, force, 277 00:13:39,420 --> 00:13:44,950 displacement, machine parameters and all those, 278 00:13:44,950 --> 00:13:46,190 what would ever mess that up? 279 00:13:53,480 --> 00:13:54,310 Measurement. 280 00:13:54,310 --> 00:13:55,460 Yeah. 281 00:13:55,460 --> 00:13:55,960 OK. 282 00:13:55,960 --> 00:13:58,060 So now, we aren't going to actually talk 283 00:13:58,060 --> 00:14:01,390 a whole lot about measurement in this class. 284 00:14:01,390 --> 00:14:05,200 But indeed, one of the good examples of this 285 00:14:05,200 --> 00:14:05,980 is measurement. 286 00:14:05,980 --> 00:14:09,940 So your perceived output is different from the real output 287 00:14:09,940 --> 00:14:12,560 because of some uncertainty from the measurement. 288 00:14:12,560 --> 00:14:16,090 So this is also a way of dealing with measurement noise. 289 00:14:16,090 --> 00:14:17,950 And indeed, we deal with measurement noise 290 00:14:17,950 --> 00:14:21,740 exactly the same way we do with process noise. 291 00:14:21,740 --> 00:14:25,000 So that's why I was saying that this is a bit of an artifice. 292 00:14:25,000 --> 00:14:28,120 If I assume perfect measurement, then this 293 00:14:28,120 --> 00:14:31,760 is really a better model of doing it. 294 00:14:31,760 --> 00:14:34,700 Either way, our point is going to be 295 00:14:34,700 --> 00:14:37,190 that, and the reason I put this in here-- 296 00:14:37,190 --> 00:14:42,330 let's assume we did the best job we ever could with this. 297 00:14:42,330 --> 00:14:45,810 In some senses, the Shewhart hypothesis doesn't matter. 298 00:14:45,810 --> 00:14:47,700 Always going to have something here. 
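To illustrate the point that the perceived output differs from the real output because of measurement uncertainty, here is a small sketch. It assumes, beyond what the lecture states, that process noise and measurement noise are independent, additive, and normal; under that assumption their variances add. The nominal dimension and both sigma values are hypothetical.

```python
import random
import statistics

rng = random.Random(1)
nominal = 10.0          # hypothetical target dimension
sigma_process = 0.03    # assumed process noise (standard deviation)
sigma_measure = 0.04    # assumed measurement noise (standard deviation)

# True part dimensions: nominal value plus process noise.
true_parts = [nominal + rng.gauss(0.0, sigma_process) for _ in range(20000)]
# Perceived dimensions: each true value plus independent measurement noise.
measured = [x + rng.gauss(0.0, sigma_measure) for x in true_parts]

observed_std = statistics.stdev(measured)
# For independent additive noise the variances add.
expected_std = (sigma_process**2 + sigma_measure**2) ** 0.5
print(observed_std, expected_std)
```

From the measured data alone the two sources are indistinguishable, which is one way to read the remark that measurement noise is dealt with the same way as process noise.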
299 00:14:47,700 --> 00:14:50,602 And he doesn't really try to explain where it comes from. 300 00:14:50,602 --> 00:14:52,060 That's one of the differences here. 301 00:14:52,060 --> 00:14:54,815 This is not really physical origins of variation. 302 00:14:54,815 --> 00:14:55,440 It's just, hey. 303 00:14:55,440 --> 00:14:56,310 Look. 304 00:14:56,310 --> 00:15:00,090 Based on observation, no matter what you do, 305 00:15:00,090 --> 00:15:02,400 you get some of this variability. 306 00:15:02,400 --> 00:15:05,520 And your job is to get it down to the point where 307 00:15:05,520 --> 00:15:09,980 it's, in effect, unexplainable. 308 00:15:09,980 --> 00:15:11,270 OK. 309 00:15:11,270 --> 00:15:14,630 So here's the concept of being in control. 310 00:15:14,630 --> 00:15:18,470 And this is really an iconic diagram. 311 00:15:18,470 --> 00:15:22,500 And it tries to illustrate a number of things. 312 00:15:22,500 --> 00:15:24,740 And I think if you understand this diagram, 313 00:15:24,740 --> 00:15:30,980 then you really have the essence of statistical process control. 314 00:15:30,980 --> 00:15:33,590 Now, what you see here are-- 315 00:15:33,590 --> 00:15:36,620 again, this is, as the chart says, going in this way, 316 00:15:36,620 --> 00:15:38,930 you're going in time. 317 00:15:38,930 --> 00:15:41,510 And think of these as samples or instants of time 318 00:15:41,510 --> 00:15:43,790 or, from a manufacturing point of view, 319 00:15:43,790 --> 00:15:46,260 it's cycles of the process. 320 00:15:46,260 --> 00:15:49,070 So I make something at time I, and then 321 00:15:49,070 --> 00:15:53,210 I make something at time I plus 1 and I plus 2. 322 00:15:53,210 --> 00:15:55,100 Or a better way to think of it is 323 00:15:55,100 --> 00:15:59,720 I measure what I've made at I. I measure something else at I 324 00:15:59,720 --> 00:16:00,380 plus 1. 325 00:16:00,380 --> 00:16:04,730 I measure another product at I plus 1 and I plus 2, and so on. 
326 00:16:07,440 --> 00:16:10,440 Each of these, of course, is representing a probability 327 00:16:10,440 --> 00:16:12,000 distribution. 328 00:16:12,000 --> 00:16:15,560 And they're normal here because the Shewhart stuff is all 329 00:16:15,560 --> 00:16:21,410 based on assuming the underlying statistics are normal. 330 00:16:21,410 --> 00:16:25,700 What this is illustrating is not the data itself 331 00:16:25,700 --> 00:16:29,770 but the distribution of the data as time moves on. 332 00:16:29,770 --> 00:16:32,110 And what this diagram tells you is the distribution 333 00:16:32,110 --> 00:16:33,740 is identical. 334 00:16:33,740 --> 00:16:37,550 It doesn't mean we're getting the same part every time. 335 00:16:37,550 --> 00:16:42,700 It just means the probability of getting a particular dimension 336 00:16:42,700 --> 00:16:44,720 never changes. 337 00:16:44,720 --> 00:16:45,500 OK? 338 00:16:45,500 --> 00:16:49,190 This is the concept of statistical-- 339 00:16:49,190 --> 00:16:51,140 a state of statistical control. 340 00:16:51,140 --> 00:16:54,230 The underlying parent distribution, 341 00:16:54,230 --> 00:16:57,530 the true random behavior of the process, 342 00:16:57,530 --> 00:17:02,000 is following this curve and never changes. 343 00:17:02,000 --> 00:17:05,630 If you've achieved that, then by the Shewhart hypothesis, 344 00:17:05,630 --> 00:17:07,550 you're in a state of statistical control. 345 00:17:07,550 --> 00:17:09,380 And that's essentially the best you can do. 346 00:17:12,339 --> 00:17:13,730 The best you can do. 347 00:17:13,730 --> 00:17:18,369 And so your process will never get any better than that. 348 00:17:18,369 --> 00:17:20,369 That's why we have the second half of the class, 349 00:17:20,369 --> 00:17:23,640 by the way, because what comes up after this part, 350 00:17:23,640 --> 00:17:25,890 after statistical process control, we say, OK. 351 00:17:25,890 --> 00:17:28,630 Now, you've got it in the state of statistical control. 
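The idea that the parts differ but the distribution they come from never changes can be sketched with simulated data. This is an illustrative sketch under assumed numbers: the means, the sigma, the window size, and the size of the shift are hypothetical, and comparing window means is only one informal way to look at stationarity, not the method presented in the lecture.

```python
import random
import statistics

def make_parts(n, mean, sigma, rng):
    # Draw n part dimensions from a normal distribution.
    return [rng.gauss(mean, sigma) for _ in range(n)]

def window_stats(data, size):
    # Mean and standard deviation of consecutive, non-overlapping windows.
    stats = []
    for start in range(0, len(data) - size + 1, size):
        chunk = data[start:start + size]
        stats.append((statistics.mean(chunk), statistics.stdev(chunk)))
    return stats

rng = random.Random(42)
# In control: the same distribution for every cycle of the process.
in_control = make_parts(3000, 10.0, 0.05, rng)
# Not in control: the mean shifts halfway through (an assignable cause).
shifted = make_parts(1500, 10.0, 0.05, rng) + make_parts(1500, 10.2, 0.05, rng)

in_control_means = [m for m, _ in window_stats(in_control, 500)]
shifted_means = [m for m, _ in window_stats(shifted, 500)]
spread_in_control = max(in_control_means) - min(in_control_means)
spread_shifted = max(shifted_means) - min(shifted_means)
print(spread_in_control, spread_shifted)
```

In the first series every individual part is different, yet the window means barely move; in the second, the window means separate by roughly the size of the shift, so the distribution is no longer the same over time.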
352 00:17:28,630 --> 00:17:31,410 It's following this distribution for all time. 353 00:17:31,410 --> 00:17:32,190 That's great. 354 00:17:32,190 --> 00:17:33,930 What if that's not good enough, or what 355 00:17:33,930 --> 00:17:35,100 if it needs to be better? 356 00:17:35,100 --> 00:17:37,920 What if it still has too much variability compared 357 00:17:37,920 --> 00:17:40,055 to a design specification? 358 00:17:40,055 --> 00:17:40,680 What do you do? 359 00:17:45,660 --> 00:17:50,310 Any ideas on what you do if it's in perfect state of control, 360 00:17:50,310 --> 00:17:52,080 it's doing exactly what Shewhart says 361 00:17:52,080 --> 00:17:54,540 it should be doing to be in a state of control, 362 00:17:54,540 --> 00:17:57,600 and it's not good enough? 363 00:17:57,600 --> 00:18:04,110 AUDIENCE: [INAUDIBLE] parameters that narrow the distribution. 364 00:18:04,110 --> 00:18:05,290 PROFESSOR: Yes, exactly. 365 00:18:05,290 --> 00:18:07,377 Did you guys hear that in Singapore? 366 00:18:11,950 --> 00:18:12,680 No? 367 00:18:12,680 --> 00:18:14,260 OK. 368 00:18:14,260 --> 00:18:15,640 AUDIENCE: OK, I heard that. 369 00:18:15,640 --> 00:18:16,765 PROFESSOR: Oh, he heard it. 370 00:18:16,765 --> 00:18:18,580 OK, go ahead. 371 00:18:18,580 --> 00:18:19,540 Oh, you heard it? 372 00:18:19,540 --> 00:18:21,120 [LAUGHING] 373 00:18:21,120 --> 00:18:24,010 AUDIENCE: The research says about just change, optimize 374 00:18:24,010 --> 00:18:26,000 the parameters of the process. 375 00:18:26,000 --> 00:18:28,010 So try to optimize the process. 376 00:18:28,010 --> 00:18:30,262 So I was going to say that that is one way. 377 00:18:30,262 --> 00:18:31,720 But the other methods, for example, 378 00:18:31,720 --> 00:18:37,030 like positive controls, user feedback control of the system, 379 00:18:37,030 --> 00:18:38,840 and try to improve the process as well. 380 00:18:38,840 --> 00:18:39,520 So. 381 00:18:39,520 --> 00:18:40,780 PROFESSOR: Yeah, exactly. 
382 00:18:40,780 --> 00:18:45,370 If you go back to that, again, the somewhat iconic equation 383 00:18:45,370 --> 00:18:48,345 that we had in one of the first lectures on the variation 384 00:18:48,345 --> 00:18:49,720 equation and the three things you 385 00:18:49,720 --> 00:18:52,750 can do to reduce the variation, one 386 00:18:52,750 --> 00:18:55,750 was to eliminate all these influences, which 387 00:18:55,750 --> 00:18:59,220 is what we're talking about with SPC. 388 00:18:59,220 --> 00:19:04,110 Another is to try to reduce the sensitivity of the process 389 00:19:04,110 --> 00:19:06,270 to those inherent variations, and that's 390 00:19:06,270 --> 00:19:09,090 what Richard was talking about with changing process 391 00:19:09,090 --> 00:19:10,170 parameters. 392 00:19:10,170 --> 00:19:14,390 And then the third would be to actually use some-- 393 00:19:14,390 --> 00:19:18,000 well, we'll talk about using feedback control using 394 00:19:18,000 --> 00:19:21,103 some sort of feedback to improve the process. 395 00:19:21,103 --> 00:19:23,520 There's a fourth one, which we didn't really put in there. 396 00:19:26,265 --> 00:19:27,890 Well, that's what he was talking about. 397 00:19:27,890 --> 00:19:28,840 Active control. 398 00:19:28,840 --> 00:19:29,920 Yeah. 399 00:19:29,920 --> 00:19:30,920 Can you generalize that? 400 00:19:30,920 --> 00:19:31,420 Yeah. 401 00:19:31,420 --> 00:19:32,530 [INAUDIBLE] 402 00:19:32,530 --> 00:19:35,890 AUDIENCE: [INAUDIBLE] 403 00:19:38,780 --> 00:19:40,610 PROFESSOR: Oh, yeah, Yeah, that's cheating. 404 00:19:40,610 --> 00:19:42,140 Selective assembly? 405 00:19:42,140 --> 00:19:43,160 Ooh. 406 00:19:43,160 --> 00:19:44,930 [LAUGHTER] 407 00:19:44,930 --> 00:19:46,250 None of that. 408 00:19:46,250 --> 00:19:49,580 No, selective assembly-- indeed, selective assembly 409 00:19:49,580 --> 00:19:52,680 is something that you do when this is the best you can do, 410 00:19:52,680 --> 00:19:53,870 and it's not good enough. 
411 00:19:53,870 --> 00:19:56,450 You just take individual parts and say, will these fit? 412 00:19:56,450 --> 00:19:58,940 Or you gauge-- and that was also 413 00:19:58,940 --> 00:20:01,400 my aerospace example from a couple of weeks ago 414 00:20:01,400 --> 00:20:03,680 where you've got the people behind the curtains 415 00:20:03,680 --> 00:20:05,750 with the padded table and the padded hammers, 416 00:20:05,750 --> 00:20:09,890 and they're pounding things, and that sort of custom assembly. 417 00:20:09,890 --> 00:20:13,018 So you can do those sorts of things. 418 00:20:13,018 --> 00:20:14,435 But the last one I was thinking of 419 00:20:14,435 --> 00:20:16,310 is you go out and buy a new process. 420 00:20:16,310 --> 00:20:16,850 OK? 421 00:20:16,850 --> 00:20:18,590 If this is inherent in the process 422 00:20:18,590 --> 00:20:20,510 and that's the best that it can do, 423 00:20:20,510 --> 00:20:22,760 then you sort of invent a new process, if you will, 424 00:20:22,760 --> 00:20:25,130 or you do things a different way that gives you 425 00:20:25,130 --> 00:20:27,890 a higher degree of precision. 426 00:20:27,890 --> 00:20:30,690 So you can take all these steps. 427 00:20:30,690 --> 00:20:32,790 But in effect, this is-- 428 00:20:32,790 --> 00:20:35,250 as you'll see, this is the cheapest, the easiest, 429 00:20:35,250 --> 00:20:38,010 and the first one you always do. 430 00:20:38,010 --> 00:20:39,660 You never make a decision about what 431 00:20:39,660 --> 00:20:41,880 to do until the process is in the state 432 00:20:41,880 --> 00:20:44,030 of statistical control. 433 00:20:44,030 --> 00:20:45,800 Why is that? 434 00:20:45,800 --> 00:20:47,260 Why am I so adamant about that? 435 00:20:47,260 --> 00:20:50,770 You never go to the next step until we've done this. 436 00:20:50,770 --> 00:20:52,270 Any of these things we talked about. 437 00:20:59,300 --> 00:21:01,400 No ideas? 438 00:21:01,400 --> 00:21:02,096 Yeah. 
439 00:21:02,096 --> 00:21:11,000 AUDIENCE: [INAUDIBLE] involved a lot of efforts [INAUDIBLE] 440 00:21:11,000 --> 00:21:14,420 from an effort point of view, resources. 441 00:21:14,420 --> 00:21:17,390 [INAUDIBLE] inherent to the process [INAUDIBLE]. 442 00:21:22,450 --> 00:21:24,200 PROFESSOR: From an economic point of view, 443 00:21:24,200 --> 00:21:25,460 you certainly should do it. 444 00:21:25,460 --> 00:21:27,710 It is essentially almost free, although it 445 00:21:27,710 --> 00:21:29,720 takes a little bit of someone's effort 446 00:21:29,720 --> 00:21:31,170 and other things like that. 447 00:21:31,170 --> 00:21:33,932 But there's maybe even a better reason. 448 00:21:33,932 --> 00:21:36,390 AUDIENCE: The process starts doing something you don't like 449 00:21:36,390 --> 00:21:37,320 and you're going to do a feedback 450 00:21:37,320 --> 00:21:38,490 system or active control. 451 00:21:38,490 --> 00:21:40,170 You don't really know why, so you 452 00:21:40,170 --> 00:21:42,780 don't know what your feedback is going to do. 453 00:21:42,780 --> 00:21:45,150 It can actually make it worse because I don't know 454 00:21:45,150 --> 00:21:46,650 the underlying cause anymore. 455 00:21:46,650 --> 00:21:49,535 So it could actually spiral out of control. 456 00:21:49,535 --> 00:21:50,910 PROFESSOR: This is really getting 457 00:21:50,910 --> 00:21:52,280 at the heart of the issue. 458 00:21:52,280 --> 00:21:52,817 Richard? 459 00:21:52,817 --> 00:21:54,900 AUDIENCE: Another point is that you have no chance 460 00:21:54,900 --> 00:21:59,970 to actually get an approximation of the underlying distribution 461 00:21:59,970 --> 00:22:02,878 when the process [INAUDIBLE].
462 00:22:02,878 --> 00:22:04,920 AUDIENCE: Yeah, I was going to be building on Dan 463 00:22:04,920 --> 00:22:07,080 and just saying, it's understanding the root cause 464 00:22:07,080 --> 00:22:10,620 problem, and you can't really solve anything or understand 465 00:22:10,620 --> 00:22:12,888 how to improve it until you understand what's causing 466 00:22:12,888 --> 00:22:13,930 the [? error to ?] be on. 467 00:22:13,930 --> 00:22:14,597 PROFESSOR: Yeah. 468 00:22:14,597 --> 00:22:15,210 Yeah, exactly. 469 00:22:15,210 --> 00:22:19,780 The whole point with SPC is, if you remember in this-- in fact, 470 00:22:19,780 --> 00:22:21,780 I barely remember this, and I made up the slide. 471 00:22:21,780 --> 00:22:25,000 But maybe back in the first or second lecture, 472 00:22:25,000 --> 00:22:27,480 we had a list of things that lead up 473 00:22:27,480 --> 00:22:28,720 to good process control. 474 00:22:28,720 --> 00:22:30,870 The first thing was good housekeeping. 475 00:22:30,870 --> 00:22:33,300 Just have to have a sensible shop 476 00:22:33,300 --> 00:22:37,030 and don't play around with the process. 477 00:22:37,030 --> 00:22:38,820 So hold these things constant. 478 00:22:38,820 --> 00:22:40,930 The next-- so that's kind of obvious, 479 00:22:40,930 --> 00:22:43,740 but you'd be surprised how long it took companies 480 00:22:43,740 --> 00:22:47,235 to decide that things like standard operating procedures 481 00:22:47,235 --> 00:22:47,860 were important. 482 00:22:47,860 --> 00:22:50,910 The next thing after that is, OK. 483 00:22:50,910 --> 00:22:52,770 Let's take that one level better, 484 00:22:52,770 --> 00:22:55,860 which is if something's going wrong with the process 485 00:22:55,860 --> 00:22:58,885 by the definition that's here, then 486 00:22:58,885 --> 00:23:00,260 that means that there's something 487 00:23:00,260 --> 00:23:02,990 out there causing the process to deviate that I could fix. 488 00:23:02,990 --> 00:23:04,310 I could fix it pretty easily.
489 00:23:04,310 --> 00:23:06,890 Bad control over material variability. 490 00:23:06,890 --> 00:23:09,000 Bad temperature control. 491 00:23:09,000 --> 00:23:10,520 Poor maintenance of the machine. 492 00:23:10,520 --> 00:23:11,390 That sort of thing. 493 00:23:11,390 --> 00:23:13,100 Shouldn't you fix all those things first 494 00:23:13,100 --> 00:23:16,660 before you go investing in fancy controls or something 495 00:23:16,660 --> 00:23:17,160 like that? 496 00:23:17,160 --> 00:23:17,660 And sure. 497 00:23:17,660 --> 00:23:20,660 If you're going to then go on to do process optimization, 498 00:23:20,660 --> 00:23:22,250 it's going to assume the process is 499 00:23:22,250 --> 00:23:24,710 in a state of statistical control. 500 00:23:24,710 --> 00:23:28,490 So it's like you're applying a theory to a situation 501 00:23:28,490 --> 00:23:30,420 that it wasn't meant to be applied to. 502 00:23:30,420 --> 00:23:33,750 So this is definitely where you start. 503 00:23:33,750 --> 00:23:34,830 OK. 504 00:23:34,830 --> 00:23:36,480 I didn't mean to be so preachy today. 505 00:23:36,480 --> 00:23:37,800 I guess it's just the weather. 506 00:23:43,820 --> 00:23:44,450 OK. 507 00:23:44,450 --> 00:23:45,810 Why isn't this going forward? 508 00:23:51,790 --> 00:23:52,950 I have to do this now? 509 00:23:52,950 --> 00:23:53,750 OK. 510 00:23:53,750 --> 00:23:54,250 All right. 511 00:23:54,250 --> 00:23:58,270 So here's an example of a process being not in control. 512 00:23:58,270 --> 00:24:00,880 And this is maybe more important than the other one 513 00:24:00,880 --> 00:24:04,660 because this is the kind of stuff you're trying to detect. 514 00:24:04,660 --> 00:24:07,120 Unfortunately, when you take data, 515 00:24:07,120 --> 00:24:09,820 you don't get pictures that look like this. 516 00:24:09,820 --> 00:24:14,137 Because, for example, at time instance I, 517 00:24:14,137 --> 00:24:15,220 you get one piece of data. 
518 00:24:15,220 --> 00:24:16,428 You don't get a distribution. 519 00:24:16,428 --> 00:24:19,000 You get one piece of data. 520 00:24:19,000 --> 00:24:20,530 So this, in effect, is the challenge 521 00:24:20,530 --> 00:24:21,670 of working with real data. 522 00:24:21,670 --> 00:24:24,880 Now, keep in mind, and this is a hard thing to keep in mind here 523 00:24:24,880 --> 00:24:30,740 in a classroom, but put yourself in the position of a production 524 00:24:30,740 --> 00:24:35,960 supervisor or an operator who's making something and products 525 00:24:35,960 --> 00:24:37,710 come out one at a time. 526 00:24:37,710 --> 00:24:39,383 And as they come out, you measure them. 527 00:24:39,383 --> 00:24:41,300 And of course, by the time you've measured it, 528 00:24:41,300 --> 00:24:43,092 the next one is ready to be measured 529 00:24:43,092 --> 00:24:44,050 and that sort of thing. 530 00:24:44,050 --> 00:24:45,975 So things are happening pretty fast. 531 00:24:45,975 --> 00:24:47,600 And what you're trying to decide as you 532 00:24:47,600 --> 00:24:52,050 make those measurements is, is everything OK? 533 00:24:52,050 --> 00:24:54,810 And you don't have, with each measurement, 534 00:24:54,810 --> 00:24:56,380 you don't have enough data to say, 535 00:24:56,380 --> 00:24:58,900 well, here's the distribution. 536 00:24:58,900 --> 00:25:01,420 But we do the best we can. 537 00:25:01,420 --> 00:25:04,370 So here's how it's done. 538 00:25:04,370 --> 00:25:05,870 What you're trying to do with this 539 00:25:05,870 --> 00:25:07,940 or what this is basically saying is, 540 00:25:07,940 --> 00:25:09,320 well, look what's happening here. 541 00:25:11,830 --> 00:25:14,620 In the first two intervals, things looked OK. 542 00:25:14,620 --> 00:25:16,660 And then what happened here? 
543 00:25:16,660 --> 00:25:21,880 Well, obviously, as you can see from the diagram, 544 00:25:21,880 --> 00:25:25,420 the whole underlying random behavior 545 00:25:25,420 --> 00:25:28,600 shifted its mean value. 546 00:25:28,600 --> 00:25:33,780 And it's clear, it also got greater in variance. 547 00:25:33,780 --> 00:25:36,480 So the underlying parent distribution, 548 00:25:36,480 --> 00:25:39,300 meaning the underlying random behavior of the process, 549 00:25:39,300 --> 00:25:40,380 did not stay the same. 550 00:25:40,380 --> 00:25:42,870 It changed at interval 2. 551 00:25:42,870 --> 00:25:46,770 And then down here somewhat later, it got even worse. 552 00:25:46,770 --> 00:25:48,610 It followed a bimodal distribution. 553 00:25:48,610 --> 00:25:52,080 So it actually meant that the probability 554 00:25:52,080 --> 00:25:59,430 of getting a certain dimension had two peaks. 555 00:25:59,430 --> 00:26:02,883 And the least likely point was maybe somewhere near-- 556 00:26:02,883 --> 00:26:04,550 oh, I guess I've drawn it the other way. 557 00:26:04,550 --> 00:26:07,160 But yeah. 558 00:26:07,160 --> 00:26:10,090 Well, here's the point. 559 00:26:10,090 --> 00:26:13,820 It's anything but a normal distribution anymore. 560 00:26:13,820 --> 00:26:17,910 And it's anything but the original distribution. 561 00:26:17,910 --> 00:26:19,910 So these are some examples of the process 562 00:26:19,910 --> 00:26:22,290 being not in a state of statistical control. 563 00:26:22,290 --> 00:26:26,790 Now, the other thing, of course, that could be happening here-- 564 00:26:26,790 --> 00:26:28,860 this is basically saying, with the exception 565 00:26:28,860 --> 00:26:31,160 of the bimodal thing, it's basically saying, OK. 566 00:26:31,160 --> 00:26:36,920 You got a normal distribution, but its underlying statistics, 567 00:26:36,920 --> 00:26:39,613 the mean and the variance, are no longer constant.
568 00:26:39,613 --> 00:26:41,030 The other thing could be happening 569 00:26:41,030 --> 00:26:44,150 is that it's still random, but it's 570 00:26:44,150 --> 00:26:45,980 switched its distribution to something 571 00:26:45,980 --> 00:26:49,900 like a uniform distribution. 572 00:26:49,900 --> 00:26:52,720 The real fact of the matter is that it's probably not 573 00:26:52,720 --> 00:26:57,430 following any distribution because it's 574 00:26:57,430 --> 00:27:00,130 got a lot of deterministic behavior underneath it. 575 00:27:00,130 --> 00:27:03,850 But here's the challenge if you're actually applying SPC. 576 00:27:03,850 --> 00:27:07,300 How do I look at data in real time 577 00:27:07,300 --> 00:27:10,990 and distinguish the difference among something 578 00:27:10,990 --> 00:27:11,975 that looks like this? 579 00:27:11,975 --> 00:27:14,350 This is actually a mean shift here now that I look at it. 580 00:27:14,350 --> 00:27:16,270 Something that looks like this, which is 581 00:27:16,270 --> 00:27:18,070 exactly what I wanted to see. 582 00:27:18,070 --> 00:27:22,060 Something that looks like this, where you've got a mean shift. 583 00:27:22,060 --> 00:27:23,560 Something that looks like this where 584 00:27:23,560 --> 00:27:29,298 I've got a mean shift and standard deviation shift. 585 00:27:29,298 --> 00:27:30,840 And something like this, where I just 586 00:27:30,840 --> 00:27:34,100 change the whole probability distribution. 587 00:27:34,100 --> 00:27:37,330 So how could I look at individual data points 588 00:27:37,330 --> 00:27:41,020 one by one and decide what's going on there? 589 00:27:41,020 --> 00:27:43,850 So that's really the challenge. 590 00:27:43,850 --> 00:27:46,870 And when we do come up with these rules and say, ah, 591 00:27:46,870 --> 00:27:51,580 it's not in control, it's really because our hypothesis 592 00:27:51,580 --> 00:27:54,630 is that something like this has happened. 593 00:27:54,630 --> 00:27:55,890 OK. 
594 00:27:55,890 --> 00:27:59,100 So back to Shewhart, and back to practical. 595 00:28:02,150 --> 00:28:05,210 What the Shewhart chart does, the classic Shewhart chart, 596 00:28:05,210 --> 00:28:07,970 it's an x-bar, and often, it's done 597 00:28:07,970 --> 00:28:10,880 as an r-chart or a range chart. 598 00:28:10,880 --> 00:28:14,330 I like to use the s-chart for reasons 599 00:28:14,330 --> 00:28:15,570 I'll get to in just a second. 600 00:28:15,570 --> 00:28:20,810 But what it does is it plots not the data points themselves. 601 00:28:20,810 --> 00:28:24,850 Not the individual run data, but a sequential average 602 00:28:24,850 --> 00:28:25,350 of the data. 603 00:28:25,350 --> 00:28:28,680 And I'll show you this diagram in just a second. 604 00:28:28,680 --> 00:28:33,710 But the basic approach here is not to take each measurement 605 00:28:33,710 --> 00:28:39,570 and put it on a plot but instead to take groups of measurements 606 00:28:39,570 --> 00:28:49,960 and plot the average of those. 607 00:28:49,960 --> 00:28:52,090 A simple arithmetic average of them. 608 00:28:52,090 --> 00:28:52,720 OK? 609 00:28:52,720 --> 00:29:01,460 So if I get a plot, a data point here, this is actually 610 00:29:01,460 --> 00:29:03,800 the average of a set of data that 611 00:29:03,800 --> 00:29:06,500 occurred around the interval i. 612 00:29:06,500 --> 00:29:14,600 And the next one here is the average at i plus 1. 613 00:29:14,600 --> 00:29:16,850 OK? 614 00:29:16,850 --> 00:29:18,200 Why would we do that? 615 00:29:18,200 --> 00:29:22,525 Why not just plot the run data, meaning the actual data? 616 00:29:22,525 --> 00:29:23,900 Why start off and say, now, we're 617 00:29:23,900 --> 00:29:25,730 going to look at averages? 618 00:29:32,380 --> 00:29:34,480 AUDIENCE: Because the evidence or the distribution 619 00:29:34,480 --> 00:29:37,533 of the evidence [INAUDIBLE]. 620 00:29:37,533 --> 00:29:38,200 PROFESSOR: Yeah. 
621 00:29:38,200 --> 00:29:39,750 That's one very good reason. 622 00:29:39,750 --> 00:29:41,870 You get the narrowing effect with the sample. 623 00:29:41,870 --> 00:29:44,037 AUDIENCE: Also, you're able to calculate or estimate 624 00:29:44,037 --> 00:29:45,590 the standard deviation. 625 00:29:45,590 --> 00:29:47,090 PROFESSOR: And that's exactly right. 626 00:29:47,090 --> 00:29:52,640 So for each of these, I can come up with a standard deviation 627 00:29:52,640 --> 00:29:57,287 as long as I have enough data. 628 00:29:57,287 --> 00:29:57,870 Each of those. 629 00:29:57,870 --> 00:30:03,440 So each of these is x-bar sub i is-- well, 630 00:30:03,440 --> 00:30:05,120 let me just do it this way-- 631 00:30:05,120 --> 00:30:11,540 is, of course, a sum over n of some set of data. 632 00:30:14,070 --> 00:30:15,950 And so the other part of this now 633 00:30:15,950 --> 00:30:17,810 is how many data points do I take? 634 00:30:17,810 --> 00:30:20,000 OK, that's a really good reason. 635 00:30:20,000 --> 00:30:23,240 There's an even more fundamental reason 636 00:30:23,240 --> 00:30:28,470 why I would plot this x-bar. 637 00:30:28,470 --> 00:30:32,810 What is x-bar an estimate of? 638 00:30:32,810 --> 00:30:36,230 It's the estimate of the true theoretical mean. 639 00:30:36,230 --> 00:30:40,300 So, if you will, it's our estimate of the mean value. 640 00:30:40,300 --> 00:30:44,500 If the process is in a state of statistical control, 641 00:30:44,500 --> 00:30:49,300 this thing, what is the true value of the mean? 642 00:30:54,680 --> 00:30:57,440 Or maybe I didn't state that question 643 00:30:57,440 --> 00:30:59,370 as well as I should have. 644 00:30:59,370 --> 00:31:04,360 How does the true value of the mean vary over time? 645 00:31:04,360 --> 00:31:05,260 It doesn't. 646 00:31:05,260 --> 00:31:06,640 Exactly. 
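Written out, as a sketch in notation assumed here rather than copied from the slides, the two sample statistics at interval i look like the following. The symbol x_{i,k} for the k-th of the n measurements in that sample, and the n - 1 divisor of the usual sample standard deviation, are assumptions of this sketch.

```latex
\bar{x}_i = \frac{1}{n} \sum_{k=1}^{n} x_{i,k},
\qquad
s_i = \sqrt{\frac{1}{n-1} \sum_{k=1}^{n} \left( x_{i,k} - \bar{x}_i \right)^2 }
```

If the process is in a state of statistical control, each \bar{x}_i estimates the same constant mean and each s_i estimates the same constant standard deviation.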
647 00:31:06,640 --> 00:31:12,250 So what my ideal for in control is 648 00:31:12,250 --> 00:31:16,480 that if this is the theoretical mean right here of the process, 649 00:31:16,480 --> 00:31:20,900 then the true mean value should be a constant. 650 00:31:20,900 --> 00:31:25,900 So what I'm really plotting here in a sense 651 00:31:25,900 --> 00:31:28,090 are my sample statistics. 652 00:31:28,090 --> 00:31:30,940 Not the data, but my sample statistics. 653 00:31:30,940 --> 00:31:34,510 And the reason is, in effect, you have to say, at this level, 654 00:31:34,510 --> 00:31:35,890 I don't care about the data. 655 00:31:35,890 --> 00:31:40,860 What I care about are the statistics of the data 656 00:31:40,860 --> 00:31:44,790 and what they're telling me about the underlying behavior. 657 00:31:44,790 --> 00:31:48,780 So that's one of the key reasons for plotting this. 658 00:31:48,780 --> 00:31:50,010 We're not plotting the data. 659 00:31:50,010 --> 00:31:52,600 We're plotting the statistics of the data. 660 00:31:52,600 --> 00:31:55,770 There's another reason that's even maybe-- 661 00:31:55,770 --> 00:31:57,870 you'll see some other reasons why. 662 00:31:57,870 --> 00:32:01,830 But what other advantage do I get 663 00:32:01,830 --> 00:32:07,500 from calculating the average instead of-- 664 00:32:07,500 --> 00:32:10,740 doing this kind of averaging instead of just 665 00:32:10,740 --> 00:32:11,890 looking at the raw data? 666 00:32:11,890 --> 00:32:13,860 We already said it allows us to-- 667 00:32:13,860 --> 00:32:16,590 I've said here, it allows us to be actually plotting 668 00:32:16,590 --> 00:32:18,240 our estimate of the mean. 669 00:32:18,240 --> 00:32:20,760 It allows us to calculate at each interval 670 00:32:20,760 --> 00:32:22,350 a standard deviation. 671 00:32:22,350 --> 00:32:23,900 What else does it allow us to do? 
672 00:32:32,460 --> 00:32:35,100 Remember, I said earlier that this is all 673 00:32:35,100 --> 00:32:36,435 based on this being normal. 674 00:32:42,850 --> 00:32:48,730 What's the best way to guarantee that whatever 675 00:32:48,730 --> 00:32:52,270 it is that I'm looking at is-- 676 00:32:52,270 --> 00:32:54,820 whatever variable I'm actually plotting is following 677 00:32:54,820 --> 00:32:56,370 a normal distribution? 678 00:32:59,660 --> 00:33:00,350 Go ahead. 679 00:33:00,350 --> 00:33:01,230 AUDIENCE: Central limit theorem. 680 00:33:01,230 --> 00:33:02,605 PROFESSOR: Central limit theorem. 681 00:33:02,605 --> 00:33:03,740 Exactly. 682 00:33:03,740 --> 00:33:07,045 So if you go back to that, if I'm 683 00:33:07,045 --> 00:33:08,420 doing any sort of averaging here, 684 00:33:08,420 --> 00:33:10,045 I'm invoking the central limit theorem. 685 00:33:10,045 --> 00:33:12,710 So even if the underlying behavior of the process 686 00:33:12,710 --> 00:33:18,270 naturally is not purely normal, it'll 687 00:33:18,270 --> 00:33:20,320 tend to be normal when I do the average. 688 00:33:20,320 --> 00:33:24,540 So it's a sort of a self-fulfilling method 689 00:33:24,540 --> 00:33:25,840 by doing it that way. 690 00:33:25,840 --> 00:33:26,340 OK. 691 00:33:28,880 --> 00:33:33,910 Now, since I can, with respect to this S chart, 692 00:33:33,910 --> 00:33:37,340 since I can calculate these at each interval, you might say, 693 00:33:37,340 --> 00:33:40,170 I could have a second chart where 694 00:33:40,170 --> 00:33:44,970 I could be plotting sequentially the standard deviation. 695 00:33:44,970 --> 00:33:47,310 And so I can actually end up plotting-- 696 00:33:47,310 --> 00:33:50,620 let me put that over here. 697 00:33:50,620 --> 00:33:54,340 I can plot my sample standard deviation. 698 00:33:54,340 --> 00:33:57,100 And I could have another data point here 699 00:33:57,100 --> 00:33:59,980 and another data point here.
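The narrowing effect and the central limit theorem argument can be checked numerically. What follows is only a minimal sketch: the uniform "process", the sample size of 5, the 2000 samples, and the function names are all assumptions chosen for illustration, not anything from the lecture or the text.

```python
import random
from statistics import mean, pstdev


def fraction_within_one_sigma(values):
    """Fraction of values lying within one standard deviation of their own mean."""
    center = mean(values)
    spread = pstdev(values)
    inside = sum(1 for v in values if abs(v - center) <= spread)
    return inside / len(values)


def simulate(n=5, groups=2000, seed=0):
    """Draw raw data from a uniform (clearly non-normal) process and average it in groups of n."""
    rng = random.Random(seed)
    raw = [rng.random() for _ in range(n * groups)]
    xbars = [mean(raw[j * n:(j + 1) * n]) for j in range(groups)]
    return raw, xbars


raw, xbars = simulate()

# Narrowing effect: the spread of the averages should be close to the raw spread / sqrt(n).
raw_spread = pstdev(raw)
xbar_spread = pstdev(xbars)

# Shape: a uniform variable puts about 58% of its mass within one sigma,
# a normal variable about 68%. The averages should sit near the normal figure.
raw_fraction = fraction_within_one_sigma(raw)
xbar_fraction = fraction_within_one_sigma(xbars)

print(f"raw spread {raw_spread:.3f}, x-bar spread {xbar_spread:.3f}")
print(f"within one sigma: raw {raw_fraction:.3f}, x-bar {xbar_fraction:.3f}")
```

Even with only five measurements per average, the plotted statistic behaves much more like a normal variable than the raw data does, which is the reason given above for plotting averages.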
700 00:33:59,980 --> 00:34:03,070 And so I'm actually looking at both. 701 00:34:03,070 --> 00:34:06,490 I'm plotting the sample statistics for the mean 702 00:34:06,490 --> 00:34:10,610 and the sample statistics for the standard deviation. 703 00:34:10,610 --> 00:34:13,699 And again, what should the underlying standard deviation 704 00:34:13,699 --> 00:34:14,199 be? 705 00:34:14,199 --> 00:34:16,520 It should be a constant. 706 00:34:16,520 --> 00:34:19,690 And these will be samples, estimates of that. 707 00:34:19,690 --> 00:34:23,980 And they'll be following a normal distribution as well. 708 00:34:23,980 --> 00:34:26,699 And so again, we'll have to estimate what that value should 709 00:34:26,699 --> 00:34:29,060 be, but we can get this. 710 00:34:29,060 --> 00:34:33,949 Now, in your text, in most texts on this, they talk-- 711 00:34:33,949 --> 00:34:37,340 I know Montgomery talks about S charts. 712 00:34:37,340 --> 00:34:39,560 He also talks a lot about R-charts. 713 00:34:39,560 --> 00:34:43,460 And an R-chart basically just says, for the sample 714 00:34:43,460 --> 00:34:45,710 of n data, what's the range? 715 00:34:45,710 --> 00:34:48,170 The min-max range of it, OK? 716 00:34:48,170 --> 00:34:52,370 Now, range, why do you think we used range for so many years 717 00:34:52,370 --> 00:34:53,900 as opposed to-- 718 00:34:53,900 --> 00:34:58,940 we use R, and a lot of the literature is on R. Especially 719 00:34:58,940 --> 00:35:02,360 if, let's assume N is a number like 5. 720 00:35:02,360 --> 00:35:05,360 I'm taking five samples. 721 00:35:05,360 --> 00:35:06,590 I'm plotting data. 722 00:35:06,590 --> 00:35:09,230 Making decisions on the factory floor. 723 00:35:09,230 --> 00:35:11,360 Why would I use range which just involves 724 00:35:11,360 --> 00:35:14,360 a subtraction of a minimum-maximum number 725 00:35:14,360 --> 00:35:18,510 versus standard deviation, which is sum of the squares and all 726 00:35:18,510 --> 00:35:19,470 that other stuff?
727 00:35:19,470 --> 00:35:22,403 Why would anybody ever want to use range? 728 00:35:22,403 --> 00:35:24,768 AUDIENCE: [INAUDIBLE] 729 00:35:27,606 --> 00:35:30,110 PROFESSOR: Um, no. 730 00:35:30,110 --> 00:35:31,470 But I see where you're going. 731 00:35:31,470 --> 00:35:34,620 Yeah, it basically-- you know, it bounds the data. 732 00:35:34,620 --> 00:35:36,470 But again, what we're trying to get at 733 00:35:36,470 --> 00:35:38,880 is the underlying random behavior. 734 00:35:38,880 --> 00:35:40,805 So why would I ever use R? 735 00:35:40,805 --> 00:35:44,530 AUDIENCE: [INAUDIBLE] 736 00:35:44,530 --> 00:35:47,050 PROFESSOR: Yeah, think about the most basic user. 737 00:35:47,050 --> 00:35:49,300 We're talking about the simplicity of SPC. 738 00:35:49,300 --> 00:35:54,070 You got a pad and a pencil and paper. 739 00:35:54,070 --> 00:35:56,410 Averaging is pretty easy to do. 740 00:35:56,410 --> 00:35:58,810 Subtracting two numbers is pretty easy to do. 741 00:35:58,810 --> 00:35:59,980 You write them down. 742 00:35:59,980 --> 00:36:01,480 Doesn't take a calculator. 743 00:36:01,480 --> 00:36:06,190 Doesn't take a computer. 744 00:36:06,190 --> 00:36:08,860 And I know I'm really starting to sound old now, 745 00:36:08,860 --> 00:36:13,720 but pocket calculators weren't around for a long time. 746 00:36:13,720 --> 00:36:18,660 And something that could take a square or square root, 747 00:36:18,660 --> 00:36:20,910 forget it. 748 00:36:20,910 --> 00:36:24,158 So in the production floor basis, 749 00:36:24,158 --> 00:36:25,950 you can have a couple of columns of numbers 750 00:36:25,950 --> 00:36:28,117 and write these things down and get it very quickly. 751 00:36:28,117 --> 00:36:30,145 Now, the thing is that this range is actually 752 00:36:30,145 --> 00:36:31,770 a pretty good estimate of the variance, 753 00:36:31,770 --> 00:36:34,290 and there's some nice stuff in the text about this 754 00:36:34,290 --> 00:36:36,090 and in other books on this.
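To make the pencil-and-paper comparison concrete, here is a small sketch that computes both the range and the sample standard deviation for one sample of five. The measurements are made up, the function name is hypothetical, and the constant 2.326 is the usual tabulated d2 value for samples of five from a normal process, which the lecture does not quote; dividing the range by it is the standard way to turn a range into an estimate of sigma.

```python
from statistics import stdev

# Tabulated d2 constant for samples of size 5 (assumes normally distributed data).
D2_FOR_N5 = 2.326


def range_and_s(sample):
    """Return (range, sample standard deviation) for one sample of measurements."""
    r = max(sample) - min(sample)   # one subtraction: easy by hand
    s = stdev(sample)               # squares and a square root: needs a calculator
    return r, s


# Hypothetical measurements of one dimension on five consecutive parts.
sample = [10.1, 9.8, 10.3, 10.0, 9.9]

r, s = range_and_s(sample)
sigma_from_range = r / D2_FOR_N5

print(f"range = {r:.3f}, s = {s:.3f}, sigma estimated from range = {sigma_from_range:.3f}")
```

For a single small sample the two estimates will not agree exactly, but they are of the same size, which is why the range served for so long on the factory floor.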
755 00:36:36,090 --> 00:36:43,178 So range is an estimate of the sample standard deviation. 756 00:36:43,178 --> 00:36:44,970 So it's not bad, and there are correction factors 757 00:36:44,970 --> 00:36:46,120 and all that. 758 00:36:46,120 --> 00:36:48,780 But the reason I like to do the S is, look. 759 00:36:48,780 --> 00:36:50,010 That's what we do now. 760 00:36:50,010 --> 00:36:51,600 It's easy to take the data. 761 00:36:51,600 --> 00:36:54,505 And very few if any organizations 762 00:36:54,505 --> 00:36:55,630 are doing it the other way. 763 00:36:55,630 --> 00:36:57,970 So that's what we do. 764 00:36:57,970 --> 00:36:58,470 OK. 765 00:36:58,470 --> 00:37:01,770 Now, a little bit about the sampling because this is-- 766 00:37:01,770 --> 00:37:03,210 such as it is. 767 00:37:03,210 --> 00:37:08,220 That's the theory of SPC. Now, here's the practice of it. 768 00:37:08,220 --> 00:37:11,676 How do we actually do this? 769 00:37:11,676 --> 00:37:14,650 Well, here it is. 770 00:37:14,650 --> 00:37:18,850 Here's a comb, which is actually meant 771 00:37:18,850 --> 00:37:23,760 to show you instances in time or actual production instances. 772 00:37:23,760 --> 00:37:27,360 So each of these vertical lines is a product, an opportunity 773 00:37:27,360 --> 00:37:30,340 to make a measurement. 774 00:37:30,340 --> 00:37:33,180 That's the real product. 775 00:37:33,180 --> 00:37:35,430 When we do that sequence then, what 776 00:37:35,430 --> 00:37:39,120 we're saying with this sequential average, which 777 00:37:39,120 --> 00:37:41,100 is what the Shewhart approach does, 778 00:37:41,100 --> 00:37:44,930 it says, well, let's take a group of those, a sample 779 00:37:44,930 --> 00:37:50,090 size N, and we'll call that sample interval J. 780 00:37:50,090 --> 00:37:53,480 So our time, if you will, is now when did this set of samples-- 781 00:37:53,480 --> 00:37:55,700 when was this set of samples taken?
782 00:37:55,700 --> 00:38:00,380 And let's do our measurements, let's do our plotting 783 00:38:00,380 --> 00:38:01,440 based on that. 784 00:38:01,440 --> 00:38:05,060 So for this orange block, we can calculate, which has, 785 00:38:05,060 --> 00:38:08,360 in this case, six measurements in it. 786 00:38:11,320 --> 00:38:12,080 So yeah. 787 00:38:12,080 --> 00:38:12,850 So n is 6. 788 00:38:12,850 --> 00:38:16,940 So I can do an average value for that, for that set of data, 789 00:38:16,940 --> 00:38:18,950 and I can do a standard deviation for that. 790 00:38:18,950 --> 00:38:21,390 A sample standard deviation from that. 791 00:38:21,390 --> 00:38:25,150 And then I wait a certain amount of time. 792 00:38:25,150 --> 00:38:26,830 And I do it again. 793 00:38:26,830 --> 00:38:28,160 And so on. 794 00:38:28,160 --> 00:38:30,955 So for each of these, I get-- 795 00:38:30,955 --> 00:38:32,860 do I have one here? 796 00:38:32,860 --> 00:38:39,620 So for each of these, I can get, of course, an x-bar and an S. 797 00:38:39,620 --> 00:38:40,120 OK? 798 00:38:43,050 --> 00:38:45,660 So you can see there's a trade off between how many samples I 799 00:38:45,660 --> 00:38:49,280 take and how far apart I take them. 800 00:38:49,280 --> 00:38:49,780 All right. 801 00:38:49,780 --> 00:38:51,890 First question is-- 802 00:38:51,890 --> 00:38:54,560 I'll ask it before somebody does because somebody always does. 803 00:38:54,560 --> 00:38:55,768 How big does that need to be? 804 00:39:00,670 --> 00:39:02,720 Greater than 1. 805 00:39:02,720 --> 00:39:06,710 You can do it with 2, especially if you use the range. 806 00:39:06,710 --> 00:39:08,120 But how big does N need to be? 807 00:39:08,120 --> 00:39:09,470 How big should N be? 808 00:39:17,732 --> 00:39:21,620 AUDIENCE: [INAUDIBLE] 809 00:39:26,510 --> 00:39:27,990 PROFESSOR: Yeah. 810 00:39:27,990 --> 00:39:30,320 It's exactly the confidence interval issue. 811 00:39:30,320 --> 00:39:32,930 But look at this relationship here. 
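The sampling scheme described here, take n consecutive parts, compute an x-bar and an S, wait, then do it again, can be sketched in a few lines. This is a minimal illustration under assumptions: the function name, the `gap` parameter (the number of parts skipped between samples), and the stand-in data are all hypothetical.

```python
from statistics import mean, stdev


def subgroup_stats(measurements, n, gap):
    """Return a list of (x_bar, s) pairs, one per sample of n consecutive parts.

    After each sample, `gap` parts are skipped (not measured) before the next
    sample starts. A trailing partial sample is ignored.
    """
    if n < 2:
        raise ValueError("need at least 2 measurements per sample to compute s")
    stats = []
    start = 0
    while start + n <= len(measurements):
        group = measurements[start:start + n]
        stats.append((mean(group), stdev(group)))
        start += n + gap
    return stats


# Stand-in data: 20 parts "measured" as 1, 2, ..., 20 so the result is easy to check by hand.
parts = list(range(1, 21))
stats = subgroup_stats(parts, n=3, gap=2)

for j, (x_bar, s) in enumerate(stats, start=1):
    print(f"sample {j}: x-bar = {x_bar}, s = {s}")
```

Each pair is one point on the x-bar chart and one point on the S chart; choosing `n` and `gap` is exactly the trade-off between how many parts go into each sample and how far apart the samples are taken.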
812 00:39:32,930 --> 00:39:35,320 Well, yeah. 813 00:39:35,320 --> 00:39:40,810 You remember when we talked about the mean and then 814 00:39:40,810 --> 00:39:42,040 the variance of the mean. 815 00:39:42,040 --> 00:39:45,430 And the variance of the mean meaning how-- 816 00:39:45,430 --> 00:39:49,610 or, sorry, the variance of the estimate of the mean. 817 00:39:49,610 --> 00:39:50,930 This job. 818 00:40:01,710 --> 00:40:08,560 So if I'm plotting my estimate of the mean value 819 00:40:08,560 --> 00:40:13,270 and N is a really small number like 1 or 2, 820 00:40:13,270 --> 00:40:16,120 then I get the maximum variance of that estimate. 821 00:40:16,120 --> 00:40:18,280 It's going to be all over the place. 822 00:40:18,280 --> 00:40:20,570 If I increase N, that gets tighter and tighter. 823 00:40:20,570 --> 00:40:22,720 So I'm getting a better and better estimate 824 00:40:22,720 --> 00:40:24,640 of what the true mean value is. 825 00:40:24,640 --> 00:40:28,690 So obviously, maximizing N minimizes the variance 826 00:40:28,690 --> 00:40:30,490 of my estimate of the mean. 827 00:40:30,490 --> 00:40:32,890 And so I'm doing a better job of estimating the mean. 828 00:40:32,890 --> 00:40:36,660 So why not just make N a really big number? 829 00:40:36,660 --> 00:40:37,160 Adam. 830 00:40:37,160 --> 00:40:40,611 AUDIENCE: [INAUDIBLE] 831 00:40:43,853 --> 00:40:44,520 PROFESSOR: Yeah. 832 00:40:44,520 --> 00:40:47,103 Yeah, I mean, it could take you all day to get a decent sample 833 00:40:47,103 --> 00:40:48,390 size, right? 834 00:40:48,390 --> 00:40:54,140 And something else-- there's a real issue here that comes up 835 00:40:54,140 --> 00:40:57,230 called time variability. 836 00:40:57,230 --> 00:41:02,540 So within, if you took a large sample, 50, 100, 837 00:41:02,540 --> 00:41:03,680 the whole day's worth-- 838 00:41:03,680 --> 00:41:05,360 you could take a whole day's worth of production 839 00:41:05,360 --> 00:41:05,870 and say, OK.
840 00:41:05,870 --> 00:41:08,690 That'll give me a great estimate of the mean, right? 841 00:41:08,690 --> 00:41:10,190 But during that time, the process 842 00:41:10,190 --> 00:41:11,630 could have been going all over the place, 843 00:41:11,630 --> 00:41:13,172 changing here and there, and you just 844 00:41:13,172 --> 00:41:15,810 say, well, let's wait till the day is over and see how we did. 845 00:41:15,810 --> 00:41:16,310 OK? 846 00:41:16,310 --> 00:41:18,530 So it doesn't give you a lot of time to intervene. 847 00:41:18,530 --> 00:41:22,250 And again, I can't do it on the screen here, 848 00:41:22,250 --> 00:41:28,110 but imagine that I cover this up, 849 00:41:28,110 --> 00:41:30,610 and all you can see-- of course, you guys can't see anything 850 00:41:30,610 --> 00:41:34,590 now, but all you can see is what's happening right here. 851 00:41:34,590 --> 00:41:37,020 You don't know what's going to happen in the future. 852 00:41:37,020 --> 00:41:40,890 And you're trying to estimate what's going to happen. 853 00:41:40,890 --> 00:41:47,490 If you wait until here to do that, think of all the product 854 00:41:47,490 --> 00:41:52,680 that you've made and think about trying to make your next guess. 855 00:41:52,680 --> 00:41:56,970 It really gets kind of ridiculous. 856 00:41:56,970 --> 00:41:59,160 So there's a trade off. 857 00:41:59,160 --> 00:42:00,390 How big should N be? 858 00:42:00,390 --> 00:42:03,720 Well, big enough to get a reasonable estimate. 859 00:42:03,720 --> 00:42:09,300 Big enough to be able to detect changes in a timely manner. 860 00:42:09,300 --> 00:42:11,370 So if I take a lot of data points, 861 00:42:11,370 --> 00:42:13,950 it's just going to take me forever to get an estimate. 862 00:42:13,950 --> 00:42:16,200 And then there is the issue of how far apart they are. 863 00:42:16,200 --> 00:42:19,590 AUDIENCE: [INAUDIBLE] depend on the value add of the process 864 00:42:19,590 --> 00:42:20,530 that we're looking at? 
865 00:42:20,530 --> 00:42:22,200 Where if you're adding a lot of value, 866 00:42:22,200 --> 00:42:26,120 you might want to look at it more often because it's 867 00:42:26,120 --> 00:42:29,690 costly to miss minor changes. 868 00:42:29,690 --> 00:42:30,840 PROFESSOR: Yeah, sure. 869 00:42:30,840 --> 00:42:31,610 Sure. 870 00:42:31,610 --> 00:42:35,370 I think at this point, we look at this in a value-- 871 00:42:35,370 --> 00:42:37,820 let's say a value-neutral fashion, saying, look. 872 00:42:37,820 --> 00:42:41,630 At this point, there is-- 873 00:42:41,630 --> 00:42:44,550 your point is well taken in the following. 874 00:42:44,550 --> 00:42:46,460 There's a cost to every measurement we make. 875 00:42:46,460 --> 00:42:50,000 There's a cost to every average that we take 876 00:42:50,000 --> 00:42:51,260 and all that sort of thing. 877 00:42:51,260 --> 00:42:53,390 In the past, those costs could be pretty high. 878 00:42:53,390 --> 00:42:55,340 They're tending to be much lower these days 879 00:42:55,340 --> 00:42:57,660 with automated inspection and things like that 880 00:42:57,660 --> 00:42:59,060 as a matter of course. 881 00:42:59,060 --> 00:43:02,700 But yes, there's a cost benefit. 882 00:43:02,700 --> 00:43:05,210 So if it's a part where it's not that important, 883 00:43:05,210 --> 00:43:07,400 you know, so forth and so on. 884 00:43:07,400 --> 00:43:13,110 If you look downstream and you say it's not a critical-- 885 00:43:13,110 --> 00:43:15,080 what's the term I'm thinking of? 886 00:43:15,080 --> 00:43:21,090 It's not a key characteristic of the product, it can vary, 887 00:43:21,090 --> 00:43:22,470 then maybe this isn't important. 888 00:43:22,470 --> 00:43:24,387 But that actually gets to the next step, which 889 00:43:24,387 --> 00:43:25,620 is, how critical is a part? 890 00:43:25,620 --> 00:43:27,990 So if I hurry up, we'll get to that. 891 00:43:27,990 --> 00:43:29,400 But you're exactly right.
892 00:43:29,400 --> 00:43:32,423 So there is a cost of quality associated with this. 893 00:43:32,423 --> 00:43:33,840 Nowadays, we tend to think there's 894 00:43:33,840 --> 00:43:38,730 a cost, high cost associated with not doing this. 895 00:43:38,730 --> 00:43:39,390 OK. 896 00:43:39,390 --> 00:43:44,490 What's the effective-- why would we actually wait and not 897 00:43:44,490 --> 00:43:46,320 have these boxes right next to each other? 898 00:43:46,320 --> 00:43:48,690 Sample 6, OK. 899 00:43:48,690 --> 00:43:51,570 So 1 through 6 is the first sample, 7 to 12 900 00:43:51,570 --> 00:43:53,940 is the next sample, and so on. 901 00:43:53,940 --> 00:43:56,310 Why would I-- why am I actually showing a gap 902 00:43:56,310 --> 00:43:57,660 between these two? 903 00:43:57,660 --> 00:44:00,300 Why do I just not take any measurements? 904 00:44:00,300 --> 00:44:01,650 Ignore the data. 905 00:44:01,650 --> 00:44:03,546 Why would I ever do that? 906 00:44:03,546 --> 00:44:07,147 AUDIENCE: [INAUDIBLE] 907 00:44:07,147 --> 00:44:08,480 PROFESSOR: That's exactly right. 908 00:44:08,480 --> 00:44:10,160 Say a little bit more about that. 909 00:44:10,160 --> 00:44:19,950 AUDIENCE: I guess if there is certain variation [INAUDIBLE], 910 00:44:19,950 --> 00:44:22,223 it would be [INAUDIBLE] any more than other causes 911 00:44:22,223 --> 00:44:25,198 of the variation that we take in our data samples 912 00:44:25,198 --> 00:44:26,240 right next to each other. 913 00:44:26,240 --> 00:44:29,250 It will have the same cause of variation affect [? two ?] 914 00:44:29,250 --> 00:44:32,158 of our samples, and in effect, we [INAUDIBLE] same thing 915 00:44:32,158 --> 00:44:33,006 twice. 916 00:44:33,006 --> 00:44:38,062 So [INAUDIBLE] we will still have [INAUDIBLE] causes 917 00:44:38,062 --> 00:44:39,177 of variation. 918 00:44:39,177 --> 00:44:40,510 PROFESSOR: That's exactly right. 919 00:44:40,510 --> 00:44:42,177 I hope you guys heard that in Singapore.
920 00:44:42,177 --> 00:44:45,810 I couldn't say it any better, 921 00:44:45,810 --> 00:44:47,250 but that's basically it. 922 00:44:47,250 --> 00:44:51,030 And we'll talk about this I think a little bit 923 00:44:51,030 --> 00:44:52,660 more theoretically later. 924 00:44:52,660 --> 00:44:59,850 But if there is an underlying-- if something happens here 925 00:44:59,850 --> 00:45:02,640 that has some memory, some history to it, 926 00:45:02,640 --> 00:45:05,520 which causes correlation from one incident to the next, 927 00:45:05,520 --> 00:45:09,360 you're violating the assumption of independent random behavior. 928 00:45:09,360 --> 00:45:11,760 If you make this interval large enough, 929 00:45:11,760 --> 00:45:14,490 there's something actually called a correlation time. 930 00:45:14,490 --> 00:45:17,170 You can measure that for a process, 931 00:45:17,170 --> 00:45:20,340 and I think you'll see that near the end of the term. 932 00:45:20,340 --> 00:45:22,680 If you sample outside that correlation time, 933 00:45:22,680 --> 00:45:25,020 even though the process is not truly independent, 934 00:45:25,020 --> 00:45:27,208 it will appear to be independent, which 935 00:45:27,208 --> 00:45:28,500 is good for what we want to do. 936 00:45:28,500 --> 00:45:31,350 It's sort of like the other half of using the central limit 937 00:45:31,350 --> 00:45:32,130 theorem. 938 00:45:32,130 --> 00:45:34,260 We get normal, independent behavior, 939 00:45:34,260 --> 00:45:38,442 even when the individual data points don't follow that. 940 00:45:38,442 --> 00:45:39,400 There's another reason. 941 00:45:43,980 --> 00:45:46,290 Which is less important today than it was in the past, 942 00:45:46,290 --> 00:45:48,990 and that is it takes time to do this. 943 00:45:48,990 --> 00:45:51,390 And you just don't have time to take all this data. 944 00:45:51,390 --> 00:45:52,600 So you wait.
945 00:45:52,600 --> 00:45:55,830 You grab six, you take them over to the bench, 946 00:45:55,830 --> 00:45:57,450 and you do whatever you do with them, 947 00:45:57,450 --> 00:45:59,908 and then you come back as soon as you can for the next one. 948 00:45:59,908 --> 00:46:05,070 So you have to allow for this kind of behavior. 949 00:46:05,070 --> 00:46:09,960 Nowadays, again, most production for anything that's critical 950 00:46:09,960 --> 00:46:15,170 is actually taking the data, in many cases, on every part. 951 00:46:15,170 --> 00:46:18,390 Every part that's coming out has some measurement made on it. 952 00:46:18,390 --> 00:46:19,380 I shouldn't say most. 953 00:46:19,380 --> 00:46:20,430 Many. 954 00:46:20,430 --> 00:46:23,730 And this is all available, and you can choose to use it all. 955 00:46:23,730 --> 00:46:24,960 You can choose to ignore it. 956 00:46:24,960 --> 00:46:26,752 If you think you have a lot of correlation, 957 00:46:26,752 --> 00:46:28,800 you might ignore it. 958 00:46:28,800 --> 00:46:29,593 Question? 959 00:46:29,593 --> 00:46:32,910 AUDIENCE: [INAUDIBLE] 960 00:46:32,910 --> 00:46:34,140 PROFESSOR: It should be. 961 00:46:34,140 --> 00:46:35,280 For these reasons, yeah. 962 00:46:35,280 --> 00:46:35,880 It should be. 963 00:46:39,200 --> 00:46:42,680 Let me go on because I'm doing exactly what I didn't want 964 00:46:42,680 --> 00:46:44,010 to do, which is talk too much. 965 00:46:44,010 --> 00:46:44,840 So, OK. 966 00:46:44,840 --> 00:46:46,010 I think we know all that. 967 00:46:55,780 --> 00:46:57,630 OK. 968 00:46:57,630 --> 00:47:00,360 So this is what you would actually 969 00:47:00,360 --> 00:47:05,970 see if you took measurements, grouped them, in this case, 970 00:47:05,970 --> 00:47:09,010 in groups of N, and plotted the data. 
971 00:47:09,010 --> 00:47:11,910 This is what you would see if the process were governed 972 00:47:11,910 --> 00:47:18,995 by purely random, stationary, meaning fixed mean and variance, 973 00:47:18,995 --> 00:47:20,370 normal behavior, because this was 974 00:47:20,370 --> 00:47:23,850 generated, in fact, in the PowerPoint that's 975 00:47:23,850 --> 00:47:25,890 on the website. 976 00:47:25,890 --> 00:47:28,140 This is-- I think you can click on this, and the Excel 977 00:47:28,140 --> 00:47:30,300 spreadsheet that generated it is there. 978 00:47:30,300 --> 00:47:32,440 So you can re-randomize it and see what happens. 979 00:47:32,440 --> 00:47:33,940 But it's just a normal distribution. 980 00:47:33,940 --> 00:47:39,340 And this is what you would get with a normal distribution. 981 00:47:39,340 --> 00:47:41,070 In this case, a random number generator 982 00:47:41,070 --> 00:47:44,610 that's following a normal distribution. 983 00:47:44,610 --> 00:47:46,450 And you get stuff that looks like this. 984 00:47:46,450 --> 00:47:48,510 So these are your estimates of the mean value. 985 00:47:48,510 --> 00:47:50,010 So you can look at that and say, OK. 986 00:47:50,010 --> 00:47:53,920 Well, you know, what does it say? 987 00:47:53,920 --> 00:47:57,150 Well, I can observe that-- 988 00:47:57,150 --> 00:48:02,180 I can eyeball that the mean value of the mean value 989 00:48:02,180 --> 00:48:03,720 looks like a constant. 990 00:48:03,720 --> 00:48:06,270 So it looks to me, in this particular case, 991 00:48:06,270 --> 00:48:10,760 like x-bar is telling me-- my estimate of x-bar 992 00:48:10,760 --> 00:48:14,200 is that the underlying mean is a constant, 993 00:48:14,200 --> 00:48:16,910 and it's probably about right here. 994 00:48:16,910 --> 00:48:20,750 And likewise, I don't see that my estimate of the variance 995 00:48:20,750 --> 00:48:22,250 is telling me that the variance is 996 00:48:22,250 --> 00:48:24,030 anything but a constant.
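[Editor's sketch] The chart described here (normally distributed measurements, grouped into subgroups, with a mean and a standard deviation plotted per subgroup) can be reproduced roughly as follows. This is not the spreadsheet from the course website; the seed, the process mean of 10.0, the standard deviation of 0.5, and the subgroup size of 6 are all assumptions picked for illustration, and `subgroup_stats` is a hypothetical helper name.

```python
import random
import statistics


def subgroup_stats(measurements, n):
    """Split measurements into consecutive subgroups of size n.

    Returns (subgroup means, subgroup sample standard deviations).
    Leftover points that do not fill a whole subgroup are ignored.
    """
    groups = [measurements[i:i + n]
              for i in range(0, len(measurements) - n + 1, n)]
    xbars = [statistics.mean(g) for g in groups]
    sds = [statistics.stdev(g) for g in groups]
    return xbars, sds


# Stationary normal process: fixed mean and fixed variance (assumed values).
rng = random.Random(0)
data = [rng.gauss(10.0, 0.5) for _ in range(150)]

xbars, sds = subgroup_stats(data, 6)
for k, (xb, s) in enumerate(zip(xbars, sds), start=1):
    print(f"subgroup {k:2d}: x-bar = {xb:.3f}, s = {s:.3f}")
```

Changing the seed plays the role of re-randomizing the spreadsheet: the individual points move, but the subgroup means keep scattering around a constant level.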
997 00:48:24,030 --> 00:48:27,930 And again, I can look at its average value. 998 00:48:27,930 --> 00:48:29,870 So this is what you get. 999 00:48:29,870 --> 00:48:31,520 You plot a new data point, and this 1000 00:48:31,520 --> 00:48:32,870 is what you start to expect. 1001 00:48:32,870 --> 00:48:37,040 Now, of course, I can look at this now and get an overall-- 1002 00:48:37,040 --> 00:48:39,530 a grand mean, which is an average of all 1003 00:48:39,530 --> 00:48:45,020 the average values, and a grand standard deviation estimate. 1004 00:48:45,020 --> 00:48:47,510 I should say a grand average and a grand standard deviation 1005 00:48:47,510 --> 00:48:49,250 estimate. 1006 00:48:49,250 --> 00:48:50,990 And say, OK. 1007 00:48:50,990 --> 00:48:55,020 Now I know the underlying distribution of the process. 1008 00:48:55,020 --> 00:48:58,730 Once I say that, then I know what to expect, right? 1009 00:48:58,730 --> 00:49:01,590 Now, I can put some lines on this chart and say, 1010 00:49:01,590 --> 00:49:04,520 this is where the data should fall if everything's 1011 00:49:04,520 --> 00:49:05,450 in a state of control. 1012 00:49:08,040 --> 00:49:15,660 So this whole idea of setting chart limits 1013 00:49:15,660 --> 00:49:19,560 is extremely important and often abused and misused. 1014 00:49:19,560 --> 00:49:20,970 But here's the idea. 1015 00:49:20,970 --> 00:49:23,660 I now have all this data. 1016 00:49:23,660 --> 00:49:26,203 And this is before-- this is just taking raw data 1017 00:49:26,203 --> 00:49:27,370 and doing the average on it. 1018 00:49:27,370 --> 00:49:30,920 I haven't tried to quantify this other than that. 1019 00:49:30,920 --> 00:49:34,790 Now, I need to start to put some limits on this 1020 00:49:34,790 --> 00:49:37,225 or put some information on. 
1021 00:49:37,225 --> 00:49:39,350 So the first thing you do, of course, is say, well, 1022 00:49:39,350 --> 00:49:41,051 there's the-- 1023 00:49:41,051 --> 00:49:46,600 on this set of data, there's the average in each case. 1024 00:49:46,600 --> 00:49:48,790 Now, another thing to keep in mind-- this 1025 00:49:48,790 --> 00:49:50,350 has already happened. 1026 00:49:50,350 --> 00:49:52,130 I'm not making any decisions based 1027 00:49:52,130 --> 00:49:53,380 on this about what to do next. 1028 00:49:53,380 --> 00:49:54,942 This has already happened. 1029 00:49:54,942 --> 00:49:56,650 So the first thing I need to do, and this 1030 00:49:56,650 --> 00:50:02,360 is part of the art of SPC, is I have to look at that and say, 1031 00:50:02,360 --> 00:50:06,620 does it look like it's in a state of statistical control? 1032 00:50:06,620 --> 00:50:08,555 And frankly, at this point, you eyeball it 1033 00:50:08,555 --> 00:50:12,430 and you say, well, you know, I don't see anything 1034 00:50:12,430 --> 00:50:14,350 that makes this look non-random. 1035 00:50:14,350 --> 00:50:16,940 And we're going to quantify that in just a second. 1036 00:50:16,940 --> 00:50:18,780 But what else could I do looking at, 1037 00:50:18,780 --> 00:50:20,830 say, a run of data like this to say, yeah. 1038 00:50:20,830 --> 00:50:25,850 It kind of looks like it's in a state of statistical control. 1039 00:50:28,800 --> 00:50:32,130 In other words, if the process is truly stationary, 1040 00:50:32,130 --> 00:50:35,228 if it is truly following a normal distribution, 1041 00:50:35,228 --> 00:50:37,020 what would I expect this data to look like? 1042 00:50:43,050 --> 00:50:44,760 Yeah, it's supposed to look random. 1043 00:50:44,760 --> 00:50:46,050 But how do I look at it and say, oh, yeah. 1044 00:50:46,050 --> 00:50:46,633 That's random. 1045 00:50:49,862 --> 00:50:51,770 AUDIENCE: [INAUDIBLE] 1046 00:50:51,770 --> 00:50:53,000 PROFESSOR: Yeah. 1047 00:50:53,000 --> 00:50:54,330 If it has a pattern.
1048 00:50:54,330 --> 00:50:55,220 If it has a trend. 1049 00:50:57,880 --> 00:50:59,560 Things like that. 1050 00:50:59,560 --> 00:51:03,340 If it does improbable things. 1051 00:51:03,340 --> 00:51:04,570 That's the one that I think-- 1052 00:51:04,570 --> 00:51:05,800 obviously, a trend. 1053 00:51:05,800 --> 00:51:09,060 If it starts to go like this. 1054 00:51:09,060 --> 00:51:12,720 If it starts to move up instead of being flat, that's a trend. 1055 00:51:12,720 --> 00:51:18,966 If it goes like this, that's a trend. 1056 00:51:18,966 --> 00:51:21,653 Then those are common trends that we'll talk about. 1057 00:51:21,653 --> 00:51:23,820 There's some ones that are a little bit more subtle. 1058 00:51:26,410 --> 00:51:29,590 Think about confidence intervals. 1059 00:51:29,590 --> 00:51:31,120 OK? 1060 00:51:31,120 --> 00:51:33,440 Let's put some confidence intervals on here. 1061 00:51:33,440 --> 00:51:36,350 Let's just put some lines in here. 1062 00:51:36,350 --> 00:51:40,510 And I'll, just for fun, put three lines 1063 00:51:40,510 --> 00:51:44,077 above the average and three lines below the average. 1064 00:51:44,077 --> 00:51:46,160 And I don't even know if those are in the right place. 1065 00:51:46,160 --> 00:51:50,510 But remember, confidence intervals 1066 00:51:50,510 --> 00:51:54,060 that said that I should get 68%-- 1067 00:51:54,060 --> 00:51:54,650 is that right? 1068 00:51:54,650 --> 00:51:58,190 68% in the plus or minus 1 sigma. 1069 00:51:58,190 --> 00:51:59,960 So most of my data should fall in there. 1070 00:52:02,870 --> 00:52:03,950 That one interval. 1071 00:52:03,950 --> 00:52:07,370 Even more of it should fall within the next two intervals 1072 00:52:07,370 --> 00:52:10,100 and the plus or minus 3 sigma. 1073 00:52:10,100 --> 00:52:12,680 And it's very unlikely that I'll get very many points 1074 00:52:12,680 --> 00:52:14,970 in the 2 to 3 sigma.
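[Editor's sketch] For a normal distribution, the fraction of points expected in each sigma band can be computed directly from the error function. The snippet below is only an illustration of that arithmetic; `prob_within` and `prob_in_zone` are hypothetical helper names, not something from the lecture or the text.

```python
import math


def prob_within(k):
    """Probability that a normal variable lies within +/- k sigma of its mean."""
    return math.erf(k / math.sqrt(2.0))


def prob_in_zone(lo, hi):
    """Probability of landing between lo and hi sigma from the mean, both sides."""
    return prob_within(hi) - prob_within(lo)


print(f"within 1 sigma: {prob_within(1):.4f}")      # 0.6827
print(f"within 2 sigma: {prob_within(2):.4f}")      # 0.9545
print(f"within 3 sigma: {prob_within(3):.4f}")      # 0.9973
print(f"1 to 2 sigma:   {prob_in_zone(1, 2):.4f}")
print(f"2 to 3 sigma:   {prob_in_zone(2, 3):.4f}")
```

So roughly two thirds of the subgroup means should sit in the innermost band, and only a few percent should land between 2 and 3 sigma.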
1075 00:52:14,970 --> 00:52:17,960 A little bit more likely that I'll get them within the 1 to 2 1076 00:52:17,960 --> 00:52:18,770 sigma. 1077 00:52:18,770 --> 00:52:20,390 So if I look at this data and I start 1078 00:52:20,390 --> 00:52:26,720 to see a lot of data points up here or down here, 1079 00:52:26,720 --> 00:52:28,250 that's pretty unlikely. 1080 00:52:28,250 --> 00:52:30,520 That shouldn't be happening. 1081 00:52:30,520 --> 00:52:33,610 That's an indication that it's not following the behavior 1082 00:52:33,610 --> 00:52:34,780 that I expected. 1083 00:52:34,780 --> 00:52:37,360 Or what if I-- in fact, I just drew it here. 1084 00:52:37,360 --> 00:52:39,560 But what if, in fact, what I saw-- 1085 00:52:39,560 --> 00:52:40,810 let me get rid of those lines. 1086 00:52:40,810 --> 00:52:42,550 They're distracting. 1087 00:52:42,550 --> 00:52:50,170 What if I saw instead maybe the same data, 1088 00:52:50,170 --> 00:52:53,790 but it looked like this instead? 1089 00:52:53,790 --> 00:52:58,800 I was getting, I don't know, two or three data points in a row 1090 00:52:58,800 --> 00:53:00,890 all in the same band. 1091 00:53:00,890 --> 00:53:03,540 That's pretty unlikely, too. 1092 00:53:03,540 --> 00:53:06,050 So based on these things and actually using things 1093 00:53:06,050 --> 00:53:07,800 like confidence intervals, you can come up 1094 00:53:07,800 --> 00:53:11,390 with some rules on this. 1095 00:53:11,390 --> 00:53:12,470 I'm going to skip this. 1096 00:53:12,470 --> 00:53:14,510 It's covered in the text very nicely, 1097 00:53:14,510 --> 00:53:18,530 but there are ways of calculating 1098 00:53:18,530 --> 00:53:24,470 the sample mean and the sample standard deviation. 1099 00:53:24,470 --> 00:53:28,880 And there's some correction factors because of biases. 1100 00:53:28,880 --> 00:53:33,390 But-- for both the x-bar and the S-chart.
1101 00:53:33,390 --> 00:53:36,290 So what is typically done, and you always hear about this, 1102 00:53:36,290 --> 00:53:39,290 is the plus or minus 3 sigma limit. 1103 00:53:39,290 --> 00:53:41,990 So here's an x-bar chart. 1104 00:53:41,990 --> 00:53:43,670 Historical x-bar chart. 1105 00:53:43,670 --> 00:53:45,530 I've calculated the grand mean. 1106 00:53:45,530 --> 00:53:46,910 Arithmetic average. 1107 00:53:46,910 --> 00:53:50,420 I've calculated plus or minus 3 sigma 1108 00:53:50,420 --> 00:53:53,570 of the grand mean with the correction factors 1109 00:53:53,570 --> 00:53:54,870 and all that. 1110 00:53:54,870 --> 00:53:56,870 And so what do these two lines tell me? 1111 00:53:56,870 --> 00:53:58,830 These, often called control limits. 1112 00:53:58,830 --> 00:54:01,750 What do those lines mean? 1113 00:54:01,750 --> 00:54:10,680 They mean that I expect 99.7% of all the data 1114 00:54:10,680 --> 00:54:12,450 to fall within those two lines. 1115 00:54:12,450 --> 00:54:13,410 That's all it means. 1116 00:54:16,140 --> 00:54:19,890 So a simple SPC thing and one you 1117 00:54:19,890 --> 00:54:21,840 hear about all the time without much thought, 1118 00:54:21,840 --> 00:54:24,960 frankly, is, oh, there's a data point up there or there's 1119 00:54:24,960 --> 00:54:27,820 a data point up there. 1120 00:54:27,820 --> 00:54:28,970 Impossible. 1121 00:54:28,970 --> 00:54:30,550 Shut down the process. 1122 00:54:30,550 --> 00:54:33,430 Everything's wrong. 1123 00:54:33,430 --> 00:54:36,070 So you can see that's a good first order thing to do. 1124 00:54:36,070 --> 00:54:39,880 You know, there's only 3 chances in 1,000 1125 00:54:39,880 --> 00:54:41,410 that one would be out there, would 1126 00:54:41,410 --> 00:54:42,660 be in one of these two limits. 1127 00:54:42,660 --> 00:54:44,560 So that's a pretty good first cut. 
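[Editor's sketch] One common way to turn subgroup statistics into plus or minus 3 sigma control limits, including the bias correction for the sample standard deviation, is shown below. It uses the standard c4 constant; the course text may present the same thing through tabulated constants, so treat the function names (`c4`, `xbar_s_limits`) and the return layout as assumptions made for illustration.

```python
import math
import statistics


def c4(n):
    """Bias-correction factor: E[s] = c4(n) * sigma for normal subgroups of size n."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)


def xbar_s_limits(xbars, sds, n):
    """Return (lower limit, center line, upper limit) for the x-bar and S charts."""
    grand_mean = statistics.mean(xbars)   # average of the subgroup averages
    s_bar = statistics.mean(sds)          # average subgroup standard deviation
    sigma_hat = s_bar / c4(n)             # unbiased estimate of process sigma
    half_width = 3.0 * sigma_hat / math.sqrt(n)   # 3 sigma of a subgroup mean
    spread = 3.0 * math.sqrt(1.0 - c4(n) ** 2) / c4(n)
    return {
        "xbar": (grand_mean - half_width, grand_mean, grand_mean + half_width),
        # The S chart lower limit is clipped at zero for small subgroups.
        "s": (max(0.0, 1.0 - spread) * s_bar, s_bar, (1.0 + spread) * s_bar),
    }


limits = xbar_s_limits([10.1, 9.9, 10.0, 10.2, 9.8], [0.5, 0.4, 0.6, 0.5, 0.5], 6)
print(limits["xbar"])
print(limits["s"])
```

Note that the x-bar limits use sigma divided by the square root of the subgroup size, because the plotted points are averages, not individual measurements.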
1128 00:54:44,560 --> 00:54:51,520 But it's a little bit simplistic because it is possible 1129 00:54:51,520 --> 00:54:54,130 that you'd get a process that's perfectly in control, 1130 00:54:54,130 --> 00:54:56,500 and you'd have one point that's up here. 1131 00:54:56,500 --> 00:54:58,708 It's just not probable, but it's possible. 1132 00:54:58,708 --> 00:55:00,500 So we can do a little bit better than that. 1133 00:55:00,500 --> 00:55:02,050 We can-- and again, we can do the same thing 1134 00:55:02,050 --> 00:55:02,870 with the S-chart. 1135 00:55:02,870 --> 00:55:07,650 We can do upper and lower control limits. 1136 00:55:07,650 --> 00:55:09,720 But the more important thing, I think, 1137 00:55:09,720 --> 00:55:17,610 is not waiting for that one data point that pops up here. 1138 00:55:17,610 --> 00:55:21,460 For example, one of the reasons that might be a problem-- 1139 00:55:21,460 --> 00:55:25,500 what if I have underlying behavior that looks like this? 1140 00:55:25,500 --> 00:55:27,440 And you can see that this is, again, this 1141 00:55:27,440 --> 00:55:29,715 is truly normal data. 1142 00:55:29,715 --> 00:55:31,340 And it's a smaller sample set, so it's 1143 00:55:31,340 --> 00:55:33,890 highly improbable I'll get anything near the 3 sigma 1144 00:55:33,890 --> 00:55:35,720 limit. 1145 00:55:35,720 --> 00:55:37,160 And this is what you typically get 1146 00:55:37,160 --> 00:55:38,970 if you had a process that was in control. 1147 00:55:38,970 --> 00:55:40,220 But what if it was doing this? 1148 00:55:45,880 --> 00:55:47,000 And you eyeball that. 1149 00:55:47,000 --> 00:55:49,930 And would you say, is anything wrong with this process? 1150 00:55:53,650 --> 00:55:54,150 Yeah. 1151 00:55:54,150 --> 00:55:56,130 Look, it's trending up like crazy. 1152 00:55:56,130 --> 00:55:57,270 When did that trend begin? 1153 00:56:01,310 --> 00:56:03,690 Maybe here.
1154 00:56:03,690 --> 00:56:08,490 Now, if all I do is sort of have an alarm when 1155 00:56:08,490 --> 00:56:12,090 I cross over this upper limit, can you 1156 00:56:12,090 --> 00:56:16,860 see that I've been-- something's been wrong for a long time. 1157 00:56:16,860 --> 00:56:18,210 And this could be hours. 1158 00:56:18,210 --> 00:56:21,867 It could be milliseconds, but it could be hours. 1159 00:56:21,867 --> 00:56:23,450 Something going wrong for a long time. 1160 00:56:23,450 --> 00:56:26,150 It would be nice to have something a little bit more 1161 00:56:26,150 --> 00:56:28,820 subtle than just simply waiting for that one 1162 00:56:28,820 --> 00:56:33,120 random, potentially just purely random event. 1163 00:56:33,120 --> 00:56:36,240 So-- although we'll talk about that in just a second. 1164 00:56:36,240 --> 00:56:38,160 So one of the things you can do, as I said, 1165 00:56:38,160 --> 00:56:41,990 is look at the data on the basis of looking at confidence 1166 00:56:41,990 --> 00:56:46,190 intervals on the frequency of getting any sorts of extremes 1167 00:56:46,190 --> 00:56:46,730 and trends. 1168 00:56:46,730 --> 00:56:48,650 And so there are a number of different things. 1169 00:56:48,650 --> 00:56:50,810 Now, there's a text we used to use for the class 1170 00:56:50,810 --> 00:56:55,190 by DeVor, Chang, and Sutherland. 1171 00:56:55,190 --> 00:56:57,230 We stopped using it because it's just 1172 00:56:57,230 --> 00:57:02,770 got some rough edges and theoretical problems. 1173 00:57:02,770 --> 00:57:04,970 Not problems, but lack of-- 1174 00:57:04,970 --> 00:57:06,650 it's theoretically correct, but it's 1175 00:57:06,650 --> 00:57:09,187 got some gaps in the theory. 1176 00:57:09,187 --> 00:57:10,520 So you need another book anyway. 1177 00:57:10,520 --> 00:57:12,380 Montgomery is a little bit more basic on it.
1178 00:57:12,380 --> 00:57:14,463 This is a little bit better on some of the process 1179 00:57:14,463 --> 00:57:18,350 optimization and pragmatic-- these guys are engineers 1180 00:57:18,350 --> 00:57:20,123 and less statisticians. 1181 00:57:20,123 --> 00:57:22,040 But one of the things they did is they came up 1182 00:57:22,040 --> 00:57:23,360 with the so-called eight rules. 1183 00:57:26,030 --> 00:57:30,460 In fact, I'm going to-- let me go past those for just a second 1184 00:57:30,460 --> 00:57:33,190 because-- 1185 00:57:33,190 --> 00:57:35,680 oops. 1186 00:57:35,680 --> 00:57:37,420 OK, never mind. 1187 00:57:37,420 --> 00:57:38,962 I was going to talk about that, but I 1188 00:57:38,962 --> 00:57:40,087 don't know if it's in here. 1189 00:57:40,087 --> 00:57:40,600 OK. 1190 00:57:40,600 --> 00:57:43,180 So the eight rules are based on really just four tests. 1191 00:57:43,180 --> 00:57:44,980 The probability that the data falls 1192 00:57:44,980 --> 00:57:48,120 into one of these confidence interval bands. 1193 00:57:48,120 --> 00:57:49,560 A measure of periodicity. 1194 00:57:49,560 --> 00:57:51,930 You should have no periodicity in the data. 1195 00:57:51,930 --> 00:57:53,790 No regularity in it. 1196 00:57:53,790 --> 00:57:55,920 Linear trends. 1197 00:57:55,920 --> 00:57:59,880 Something that's changing up or down over time. 1198 00:57:59,880 --> 00:58:03,348 And mean shift-- a jump change in the mean. 1199 00:58:03,348 --> 00:58:04,890 All these things are things you're going 1200 00:58:04,890 --> 00:58:06,528 to be looking for in the data. 1201 00:58:06,528 --> 00:58:08,070 Now, mean, you're going to hear a lot 1202 00:58:08,070 --> 00:58:10,080 about mean shift throughout the term 1203 00:58:10,080 --> 00:58:12,205 because that's your most-- in some ways, 1204 00:58:12,205 --> 00:58:14,580 you can think of that as one of the most likely things 1205 00:58:14,580 --> 00:58:15,080 to happen.
1206 00:58:15,080 --> 00:58:18,120 You're making product with a certain batch of material, 1207 00:58:18,120 --> 00:58:20,200 and you change the batch. 1208 00:58:20,200 --> 00:58:21,090 Something changes. 1209 00:58:21,090 --> 00:58:23,310 You have a shift change. 1210 00:58:23,310 --> 00:58:26,100 You have PM on the machine, Preventive Maintenance 1211 00:58:26,100 --> 00:58:27,090 on the machine. 1212 00:58:27,090 --> 00:58:29,010 And the next time you run it, things jump up. 1213 00:58:29,010 --> 00:58:32,880 So the underlying behavior, something in this model, 1214 00:58:32,880 --> 00:58:35,760 just made a step change, and you saw that out there. 1215 00:58:35,760 --> 00:58:37,110 Linear trends, obviously. 1216 00:58:37,110 --> 00:58:38,010 Something. 1217 00:58:38,010 --> 00:58:40,200 Tool wearing. 1218 00:58:40,200 --> 00:58:42,990 Something not-- something changing over time. 1219 00:58:42,990 --> 00:58:44,840 Periodicity. 1220 00:58:44,840 --> 00:58:46,720 It could be temperature changes. 1221 00:58:46,720 --> 00:58:48,850 It could be something that's underlying. 1222 00:58:48,850 --> 00:58:50,670 That sort of thing. 1223 00:58:50,670 --> 00:58:51,570 OK. 1224 00:58:51,570 --> 00:58:53,657 So test for out of control. 1225 00:58:53,657 --> 00:58:54,990 I think this comes out to eight. 1226 00:58:54,990 --> 00:58:57,480 So if you have a set of data and you look at it, 1227 00:58:57,480 --> 00:58:58,950 first thing you might do is say is 1228 00:58:58,950 --> 00:59:02,200 anything outside the plus or minus 3 sigma limits. 1229 00:59:02,200 --> 00:59:03,640 That's really unlikely. 1230 00:59:03,640 --> 00:59:07,500 If I get any extreme points like that, pull the Andon cord. 1231 00:59:07,500 --> 00:59:08,550 Stop the production. 1232 00:59:08,550 --> 00:59:09,330 Something's wrong. 1233 00:59:09,330 --> 00:59:11,910 Let's go out and fix it. 1234 00:59:11,910 --> 00:59:14,030 Improbable points. 
1235 00:59:14,030 --> 00:59:16,650 Now, they've quantified this by saying, look. 1236 00:59:16,650 --> 00:59:21,830 If I get 2 out of 3 points, if I look at any triad of points, 1237 00:59:21,830 --> 00:59:25,210 if 2 of 3 of those are outside the plus or minus 2 sigma band, 1238 00:59:25,210 --> 00:59:30,140 that's pretty improbable based on confidence intervals. 1239 00:59:30,140 --> 00:59:34,550 4 out of 5 in the plus or minus, outside the plus or minus 1 1240 00:59:34,550 --> 00:59:36,150 band. 1241 00:59:36,150 --> 00:59:37,950 That's pretty improbable. 1242 00:59:37,950 --> 00:59:41,640 Or all the data inside the plus or minus 1 band. 1243 00:59:41,640 --> 00:59:45,850 That shouldn't happen, either, if my process is stationary. 1244 00:59:45,850 --> 00:59:46,350 OK? 1245 00:59:46,350 --> 00:59:47,808 So each of these things you can see 1246 00:59:47,808 --> 00:59:49,590 could indicate that it's not following 1247 00:59:49,590 --> 00:59:51,855 the expected normal behavior. 1248 00:59:54,770 --> 01:00:00,638 Another one is in runs of 8 or more points, am I always-- 1249 01:00:00,638 --> 01:00:01,930 is there a run of eight points 1250 01:00:01,930 --> 01:00:04,090 that's always above the mean value? 1251 01:00:04,090 --> 01:00:04,900 That's improbable. 1252 01:00:04,900 --> 01:00:08,920 It's supposed to be random on both sides. 1253 01:00:08,920 --> 01:00:10,330 Linear trends, of course. 1254 01:00:10,330 --> 01:00:11,170 Six points. 1255 01:00:11,170 --> 01:00:15,220 And they pick the numbers to trade off between sensitivity 1256 01:00:15,220 --> 01:00:18,060 and resolution. 1257 01:00:18,060 --> 01:00:20,040 Six points in a consistent direction. 1258 01:00:20,040 --> 01:00:22,920 Always heading upwards or downwards. 1259 01:00:22,920 --> 01:00:25,470 And bimodal data. 1260 01:00:25,470 --> 01:00:32,617 You got points-- if I get points outside of the central region, 1261 01:00:32,617 --> 01:00:34,950 that could indicate-- you go back to that picture I had.
1262 01:00:34,950 --> 01:00:36,075 I have a distribution here. 1263 01:00:36,075 --> 01:00:37,290 I have a distribution here. 1264 01:00:37,290 --> 01:00:39,390 It's actually following two different mean values 1265 01:00:39,390 --> 01:00:42,710 on either side of the mean value. 1266 01:00:42,710 --> 01:00:43,670 OK. 1267 01:00:43,670 --> 01:00:48,630 So how would I apply the Shewhart charting concept? 1268 01:00:48,630 --> 01:00:49,130 OK. 1269 01:00:49,130 --> 01:00:51,297 Well, keep in mind, the first thing I've got to do-- 1270 01:00:54,780 --> 01:00:57,997 well, let me back up. 1271 01:00:57,997 --> 01:00:58,830 Not the first thing. 1272 01:00:58,830 --> 01:01:02,840 What I'm trying to accomplish is this. 1273 01:01:02,840 --> 01:01:05,860 I'm in production. 1274 01:01:05,860 --> 01:01:07,330 Let's just look at the x-bar chart. 1275 01:01:07,330 --> 01:01:10,270 I'm in production, and this is now-- 1276 01:01:10,270 --> 01:01:13,150 I've got some data back here. 1277 01:01:13,150 --> 01:01:19,710 This is now-- and I took another piece of data. 1278 01:01:19,710 --> 01:01:21,820 And I want to say, OK. 1279 01:01:21,820 --> 01:01:23,640 Is everything all right? 1280 01:01:23,640 --> 01:01:26,500 Time to take any action? 1281 01:01:26,500 --> 01:01:32,830 I can't do that until I know I have at least done 1282 01:01:32,830 --> 01:01:38,950 my grand mean and I have at least my upper control limit, 1283 01:01:38,950 --> 01:01:43,610 my plus 3 sigma, and my lower control limit. 1284 01:01:43,610 --> 01:01:48,590 So I can't really do SPC until I have these limits. 1285 01:01:48,590 --> 01:01:50,438 Where do they come from? 1286 01:01:50,438 --> 01:01:51,730 They came from historical data. 1287 01:01:51,730 --> 01:01:54,670 They came from the stuff that already happened. 1288 01:01:54,670 --> 01:01:57,275 So this is a little bit of a catch-22. 1289 01:01:57,275 --> 01:01:59,650 What do you do if the process is horribly out of control. 
1290 01:02:02,440 --> 01:02:05,345 And back here, it's all over the place. 1291 01:02:05,345 --> 01:02:06,220 And you just say, oh. 1292 01:02:06,220 --> 01:02:09,940 Well, here's the mean value on the standard deviation. 1293 01:02:09,940 --> 01:02:12,760 You can see that these will be totally bogus limits. 1294 01:02:12,760 --> 01:02:14,260 And if it's out of control back here 1295 01:02:14,260 --> 01:02:16,610 and you put those limits on here, guess what? 1296 01:02:16,610 --> 01:02:18,620 It'll appear to be in control here, too. 1297 01:02:18,620 --> 01:02:21,310 It'll stay within these limits. 1298 01:02:21,310 --> 01:02:26,380 So it's a bit of a catch-22. 1299 01:02:26,380 --> 01:02:28,930 But what do you do with these-- 1300 01:02:28,930 --> 01:02:30,655 how would I take these, I don't know, 1301 01:02:30,655 --> 01:02:33,680 25 to 50 points before I actually start doing control? 1302 01:02:33,680 --> 01:02:36,010 How would I take those and say, OK. 1303 01:02:36,010 --> 01:02:37,740 Looks good. 1304 01:02:37,740 --> 01:02:40,050 What kind of things would you do if I 1305 01:02:40,050 --> 01:02:43,780 were taking data to say, yep. 1306 01:02:43,780 --> 01:02:47,740 Looks like I can now apply the chart. 1307 01:02:47,740 --> 01:02:50,440 What am I looking for in the data? 1308 01:02:50,440 --> 01:02:52,382 AUDIENCE: [INAUDIBLE] 1309 01:03:04,157 --> 01:03:06,490 PROFESSOR: That could be very dangerous for a reason I'm 1310 01:03:06,490 --> 01:03:08,920 going to show you in a second. 1311 01:03:08,920 --> 01:03:11,260 But yeah, you have certain expectations for the process. 1312 01:03:11,260 --> 01:03:12,610 Yeah, absolutely. 1313 01:03:12,610 --> 01:03:16,990 But let's say that, again, here's the historical data. 1314 01:03:16,990 --> 01:03:18,380 I've got the historical data. 1315 01:03:21,340 --> 01:03:23,120 I always draw it looking periodic. 1316 01:03:23,120 --> 01:03:26,410 So I've got the historical data. 
1317 01:03:26,410 --> 01:03:28,900 I've got to look at that data and say, 1318 01:03:28,900 --> 01:03:30,640 what do I need to say about it to say 1319 01:03:30,640 --> 01:03:33,300 it's useful for charting? 1320 01:03:33,300 --> 01:03:37,130 Number 1, it needs to be normal. 1321 01:03:37,130 --> 01:03:41,980 Number 2, it needs to be stationary. 1322 01:03:41,980 --> 01:03:43,750 Then I can start doing the charting. 1323 01:03:43,750 --> 01:03:50,690 What's the test I would do for it being normal? 1324 01:03:50,690 --> 01:03:53,070 AUDIENCE: [INAUDIBLE] 1325 01:03:54,110 --> 01:03:55,970 PROFESSOR: Yeah. 1326 01:03:55,970 --> 01:03:56,930 Exactly. 1327 01:03:56,930 --> 01:03:58,550 You could do things like-- that's 1328 01:03:58,550 --> 01:04:00,092 one of the reasons we looked at this. 1329 01:04:00,092 --> 01:04:01,870 Do a histogram. 1330 01:04:01,870 --> 01:04:02,500 What else? 1331 01:04:07,790 --> 01:04:08,340 Yeah, yeah. 1332 01:04:08,340 --> 01:04:13,560 You do a QQ or normal distribution plot. 1333 01:04:13,560 --> 01:04:15,450 Really good thing to do. 1334 01:04:15,450 --> 01:04:18,210 That's pretty fast and easy to do. 1335 01:04:18,210 --> 01:04:20,280 Is it following normal distribution? 1336 01:04:20,280 --> 01:04:23,490 If it's not, we got a real problem. 1337 01:04:23,490 --> 01:04:27,660 I just had a student of mine who plotted some measurement data, 1338 01:04:27,660 --> 01:04:31,200 and he plotted on a QQ, plot and it was nowhere near 1339 01:04:31,200 --> 01:04:31,770 following it. 1340 01:04:31,770 --> 01:04:33,160 Didn't make any sense. 1341 01:04:33,160 --> 01:04:34,980 And then I found out that, underlying, 1342 01:04:34,980 --> 01:04:37,980 he'd been doing some very deterministic things 1343 01:04:37,980 --> 01:04:39,480 between measurements. 1344 01:04:39,480 --> 01:04:41,520 Changing things in a very deterministic way. 1345 01:04:41,520 --> 01:04:43,990 And it came out exactly in the data. 
1346 01:04:43,990 --> 01:04:45,990 It should have followed the normal distribution, 1347 01:04:45,990 --> 01:04:47,790 and it didn't because of this. 1348 01:04:50,590 --> 01:04:51,090 What else? 1349 01:04:56,380 --> 01:04:58,820 Hayden, what else could we do? 1350 01:04:58,820 --> 01:05:02,320 AUDIENCE: Well, I'm thinking you could get the upper 1351 01:05:02,320 --> 01:05:05,140 and lower control limits there in your specification. 1352 01:05:05,140 --> 01:05:06,490 [INTERPOSING VOICES] 1353 01:05:06,490 --> 01:05:08,680 PROFESSOR: You got to be careful of that, though. 1354 01:05:08,680 --> 01:05:10,210 Yeah. 1355 01:05:10,210 --> 01:05:10,780 Good point. 1356 01:05:10,780 --> 01:05:11,890 That's the same issue of, you know, 1357 01:05:11,890 --> 01:05:13,140 here's what I expect it to be. 1358 01:05:13,140 --> 01:05:16,690 But the whole point about process capability, 1359 01:05:16,690 --> 01:05:18,970 which I'm going to get to in the last five minutes, 1360 01:05:18,970 --> 01:05:25,090 is process capability is what you're dreaming of getting. 1361 01:05:25,090 --> 01:05:27,460 This is what you actually get. 1362 01:05:27,460 --> 01:05:29,410 And they can be really far off. 1363 01:05:29,410 --> 01:05:32,140 The reason I ask you is you just put that derivation 1364 01:05:32,140 --> 01:05:33,815 of the kurtosis up there. 1365 01:05:33,815 --> 01:05:35,440 So you could do other tests on the data 1366 01:05:35,440 --> 01:05:36,862 like kurtosis and skewness. 1367 01:05:36,862 --> 01:05:38,320 There are all these different tests 1368 01:05:38,320 --> 01:05:40,180 to see if the data is normal. 1369 01:05:40,180 --> 01:05:41,720 And then that's all well and good, 1370 01:05:41,720 --> 01:05:42,720 but then you have to look at it. 1371 01:05:42,720 --> 01:05:43,840 Is it trending up or down? 1372 01:05:43,840 --> 01:05:45,640 Does it have any of these things?
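[Editor's example] The skewness and kurtosis checks mentioned above can be sketched in a few lines. This is only an illustration, not the course's own procedure: the function name `skewness_and_excess_kurtosis` and the synthetic data are assumptions made for the example, and the estimates use simple moment formulas that divide by n. For normal data both numbers should come out near zero.

```python
import random

def skewness_and_excess_kurtosis(data):
    """Return (skewness, excess kurtosis) from simple moment estimates."""
    n = len(data)
    mean = sum(data) / n
    # Central moments of order 2, 3, and 4 (population form, divide by n).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # a normal distribution gives 0
    return skew, excess_kurt

# Hypothetical data: one roughly normal sample and one clearly skewed sample.
random.seed(0)
normal_like = [random.gauss(10.0, 0.5) for _ in range(5000)]
skewed = [random.expovariate(1.0) for _ in range(5000)]

print("normal-like:", skewness_and_excess_kurtosis(normal_like))
print("skewed:     ", skewness_and_excess_kurtosis(skewed))
```

Numbers far from zero are a warning sign, but they do not replace looking at the histogram, the QQ plot, and the run of the data over time.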
1373 01:05:45,640 --> 01:05:47,470 Before I ever put a limit on it, can I 1374 01:05:47,470 --> 01:05:49,660 say, yeah, that kind of looks random? 1375 01:05:49,660 --> 01:05:52,240 And even without a limit, you can say, OK. 1376 01:05:52,240 --> 01:05:55,240 The data looks like it's not trending up. 1377 01:05:55,240 --> 01:05:57,310 I don't see any periodicity in the data. 1378 01:05:57,310 --> 01:05:59,380 In the case with my student's data, 1379 01:05:59,380 --> 01:06:01,150 there was this big hump in the data where 1380 01:06:01,150 --> 01:06:02,567 it spent a lot of time doing this, 1381 01:06:02,567 --> 01:06:04,930 and then it came back being [INAUDIBLE]. 1382 01:06:04,930 --> 01:06:10,330 It doesn't-- I may be wrong, but that doesn't look random to me. 1383 01:06:10,330 --> 01:06:11,140 OK. 1384 01:06:11,140 --> 01:06:12,400 So once I've done that-- 1385 01:06:12,400 --> 01:06:14,383 and this is where the art is-- 1386 01:06:14,383 --> 01:06:16,300 then I compute the center lines and the limits 1387 01:06:16,300 --> 01:06:18,520 and I begin plotting and do this. 1388 01:06:18,520 --> 01:06:21,790 And then I start to apply the eight rules. 1389 01:06:21,790 --> 01:06:24,220 Or there are some simpler rules, often called 1390 01:06:24,220 --> 01:06:29,290 the Western Electric rules, that are actually in your text. 1391 01:06:29,290 --> 01:06:31,900 And again, what these rules mean is these are alarm points. 1392 01:06:31,900 --> 01:06:34,840 These mean, to the best of our ability, 1393 01:06:34,840 --> 01:06:39,490 your process is not following a state of statistical control, 1394 01:06:39,490 --> 01:06:41,530 meaning it's not stationary. 1395 01:06:41,530 --> 01:06:43,780 There is some sort of cause on the outside that's 1396 01:06:43,780 --> 01:06:46,853 making a change, and it can be eliminated and identified. 1397 01:06:46,853 --> 01:06:48,520 And these are the Western Electric rules 1398 01:06:48,520 --> 01:06:51,400 that are in your text.
1399 01:06:51,400 --> 01:06:55,330 Any points outside the limit, 2 of 3 points outside 2 sigma 1400 01:06:55,330 --> 01:07:00,250 and 5 points outside 1 sigma, that sort of thing. 1401 01:07:00,250 --> 01:07:01,330 OK. 1402 01:07:01,330 --> 01:07:04,180 So you can see that just reviewing what we said, 1403 01:07:04,180 --> 01:07:05,390 this is in control. 1404 01:07:05,390 --> 01:07:09,850 Here's what the data should look like if it's in control. 1405 01:07:09,850 --> 01:07:14,290 And if you see these types of behaviors in the data, 1406 01:07:14,290 --> 01:07:17,140 it means something like this has happened. 1407 01:07:17,140 --> 01:07:20,920 Mean shift, trends, bimodal, whatever. 1408 01:07:20,920 --> 01:07:24,010 I'm going to skip over this thing on average run length. 1409 01:07:24,010 --> 01:07:26,870 It's very interesting, and it gives you an issue-- 1410 01:07:26,870 --> 01:07:29,245 some idea of what to do about sample sizes. 1411 01:07:31,960 --> 01:07:34,960 And it's very carefully covered in the text. 1412 01:07:34,960 --> 01:07:40,810 And you can play with this beautiful animation that 1413 01:07:40,810 --> 01:07:42,160 was made up a long time ago. 1414 01:07:44,870 --> 01:07:49,690 And let me just summarize this by saying, keep in mind, 1415 01:07:49,690 --> 01:07:52,970 what we're really doing here is hypothesis testing. 1416 01:07:52,970 --> 01:07:56,440 So all the stuff you just had on hypothesis testing 1417 01:07:56,440 --> 01:07:57,740 is exactly what we're doing. 1418 01:07:57,740 --> 01:08:02,350 We're testing the hypothesis that the data is defined 1419 01:08:02,350 --> 01:08:04,450 by a normal distribution. 1420 01:08:04,450 --> 01:08:06,400 And these charts are allowing us to test it, 1421 01:08:06,400 --> 01:08:10,900 but it's sort of testing it as we go along. 1422 01:08:10,900 --> 01:08:14,080 And these issues of alpha and beta probabilities 1423 01:08:14,080 --> 01:08:15,700 all fall into this.
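[Editor's example] The alarm rules listed above can be sketched as a small checker. This is an illustration under stated assumptions, not the textbook's exact wording: it uses the commonly published form of the first three Western Electric zone rules (1 point beyond 3 sigma; 2 of 3 consecutive points beyond 2 sigma on the same side; 4 of 5 consecutive points beyond 1 sigma on the same side). The function name `western_electric_alarms` and the sample data are hypothetical, and `center` and `sigma` are assumed to come from historical in-control data.

```python
def western_electric_alarms(points, center, sigma):
    """Return a list of (index, rule) alarms for a sequence of plotted points."""
    alarms = []
    # Signed distance of each point from the center line, in units of sigma.
    z = [(p - center) / sigma for p in points]
    for i in range(len(z)):
        # Rule 1: a single point beyond the 3-sigma control limits.
        if abs(z[i]) > 3:
            alarms.append((i, "1 point beyond 3 sigma"))
        # Rule 2: 2 of 3 consecutive points beyond 2 sigma, same side.
        if i >= 2:
            window = z[i - 2:i + 1]
            if sum(v > 2 for v in window) >= 2 or sum(v < -2 for v in window) >= 2:
                alarms.append((i, "2 of 3 beyond 2 sigma"))
        # Rule 3: 4 of 5 consecutive points beyond 1 sigma, same side.
        if i >= 4:
            window = z[i - 4:i + 1]
            if sum(v > 1 for v in window) >= 4 or sum(v < -1 for v in window) >= 4:
                alarms.append((i, "4 of 5 beyond 1 sigma"))
    return alarms

# Hypothetical chart with center line 10.0 and sigma 0.5.
in_control = [10.1, 9.8, 10.0, 10.3, 9.7, 10.2, 9.9]
shifted = [10.1, 9.9, 10.7, 10.6, 10.8, 10.7, 10.9]  # mean drifts upward

print(western_electric_alarms(in_control, 10.0, 0.5))
print(western_electric_alarms(shifted, 10.0, 0.5))
```

In the shifted series no single point is outside the 3-sigma limits, yet the run of points beyond 1 sigma raises an alarm. That is the kind of mean shift the extra rules are meant to catch earlier than the basic limit test.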
1424 01:08:15,700 --> 01:08:18,967 So the power of the test is important. 1425 01:08:24,700 --> 01:08:26,260 I think I'm getting redundant here. 1426 01:08:35,990 --> 01:08:37,010 OK. 1427 01:08:37,010 --> 01:08:41,840 So again, repeating what I said earlier, 1428 01:08:41,840 --> 01:08:45,620 we model the process as a normal, independent, random variable. 1429 01:08:45,620 --> 01:08:47,840 By doing sampling the way we did, 1430 01:08:47,840 --> 01:08:49,760 we sort of invoked the central limit theorem, 1431 01:08:49,760 --> 01:08:53,420 and potentially by spreading the samples out, 1432 01:08:53,420 --> 01:08:57,710 we have a better chance of it being independent, 1433 01:08:57,710 --> 01:08:59,450 even if it's not. 1434 01:08:59,450 --> 01:09:01,460 We say, as a result of it being normal, 1435 01:09:01,460 --> 01:09:04,399 there are only two things I need to know about the distribution. 1436 01:09:04,399 --> 01:09:06,691 That's the beautiful thing about a normal distribution. 1437 01:09:06,691 --> 01:09:08,630 Two parameters, mean and variance. 1438 01:09:08,630 --> 01:09:10,220 So it's completely described by that. 1439 01:09:10,220 --> 01:09:12,000 I only need to know those two things. 1440 01:09:12,000 --> 01:09:15,680 So I estimate those with x bar and S on the data 1441 01:09:15,680 --> 01:09:20,750 and then enforce this concept of stationary conditions. 1442 01:09:20,750 --> 01:09:25,290 So if it's a stationary, normally distributed process, 1443 01:09:25,290 --> 01:09:26,220 everything's fine. 1444 01:09:26,220 --> 01:09:28,189 If it's not, everything's not fine. 1445 01:09:31,300 --> 01:09:31,800 Oh, yeah. 1446 01:09:31,800 --> 01:09:33,526 What happens if it's not right? 1447 01:09:33,526 --> 01:09:34,109 This is great. 1448 01:09:34,109 --> 01:09:36,569 I read this in the statistics book a long time ago. 1449 01:09:36,569 --> 01:09:39,602 If it's not right, you call an engineer.
1450 01:09:39,602 --> 01:09:41,810 So if you're from a purely statistical point of view, 1451 01:09:41,810 --> 01:09:44,270 you're just saying something's not right. 1452 01:09:44,270 --> 01:09:45,380 OK. 1453 01:09:45,380 --> 01:09:48,470 Now, I'm rushing because I did want to mention this. 1454 01:09:48,470 --> 01:09:51,180 I'm sure [INAUDIBLE] will do some more on it. 1455 01:09:51,180 --> 01:09:54,140 But I really wanted to get to this because of some 1456 01:09:54,140 --> 01:09:56,940 of the questions that came up. 1457 01:09:56,940 --> 01:09:59,960 This is our empirical model of the process now. 1458 01:09:59,960 --> 01:10:02,780 It's an empirical, statistical model of the process. 1459 01:10:02,780 --> 01:10:06,980 The real world, our estimate of the real world, 1460 01:10:06,980 --> 01:10:11,490 our underlying model, is this theoretical parent 1461 01:10:11,490 --> 01:10:14,300 normal distribution. 1462 01:10:14,300 --> 01:10:15,840 That's what you get. 1463 01:10:15,840 --> 01:10:16,890 It's not what you want. 1464 01:10:16,890 --> 01:10:19,590 It's what you get. 1465 01:10:19,590 --> 01:10:23,030 So the question is, how good is the process? 1466 01:10:23,030 --> 01:10:24,080 This is what it is. 1467 01:10:24,080 --> 01:10:25,670 Is that good, or bad, or indifferent? 1468 01:10:25,670 --> 01:10:28,370 And that's where you get into the concept of process 1469 01:10:28,370 --> 01:10:29,490 capability. 1470 01:10:29,490 --> 01:10:32,380 So we assume the process is in control. 1471 01:10:32,380 --> 01:10:34,420 We assume it follows that normal distribution. 1472 01:10:34,420 --> 01:10:35,540 Then we compare it. 1473 01:10:35,540 --> 01:10:39,140 Then we compare it to things like tolerances. 1474 01:10:39,140 --> 01:10:41,903 Where do I want my parts to be? 
1475 01:10:41,903 --> 01:10:44,070 And there's also this concept of quality loss, which 1476 01:10:44,070 --> 01:10:46,260 is a better way, in some ways, of looking 1477 01:10:46,260 --> 01:10:49,270 at design specifications. 1478 01:10:49,270 --> 01:10:51,460 The simplest design specification we can think of 1479 01:10:51,460 --> 01:10:52,660 is an upper and lower limit. 1480 01:10:52,660 --> 01:10:54,220 A tolerance. 1481 01:10:54,220 --> 01:10:57,490 If you can make me a part that falls between these two limits, 1482 01:10:57,490 --> 01:11:00,590 it's a good part. 1483 01:11:00,590 --> 01:11:06,515 So you can think of that as a nominal value, 1484 01:11:06,515 --> 01:11:11,210 a target, and then an upper specification limit and a lower 1485 01:11:11,210 --> 01:11:13,920 specification limit. 1486 01:11:13,920 --> 01:11:16,110 Not control limit, but specification limit. 1487 01:11:18,840 --> 01:11:22,220 Tolerances basically say, anything in there 1488 01:11:22,220 --> 01:11:23,030 is a good part. 1489 01:11:23,030 --> 01:11:25,820 Has equal quality, if you will. 1490 01:11:25,820 --> 01:11:29,570 Quality loss, which was introduced by Taguchi 1491 01:11:29,570 --> 01:11:31,925 some time ago, basically said, that's baloney. 1492 01:11:35,070 --> 01:11:37,788 And there's some really good case examples of this, 1493 01:11:37,788 --> 01:11:38,580 but that's baloney. 1494 01:11:38,580 --> 01:11:39,690 He said, look. 1495 01:11:39,690 --> 01:11:43,170 If there's a target value for some specification, 1496 01:11:43,170 --> 01:11:46,650 it's a dimension or a property of something. 1497 01:11:46,650 --> 01:11:48,000 And you can specify that. 1498 01:11:48,000 --> 01:11:49,890 That's the best one. 1499 01:11:49,890 --> 01:11:53,670 Any deviation from that is less good, 1500 01:11:53,670 --> 01:11:58,260 and there is an actual cost associated with that deviation.
1501 01:11:58,260 --> 01:12:01,080 And he said, not knowing anything more about it, 1502 01:12:01,080 --> 01:12:02,920 we'll call it a quadratic loss. 1503 01:12:02,920 --> 01:12:07,200 So the farther you are from the specified value, 1504 01:12:07,200 --> 01:12:09,090 the worse the product is. 1505 01:12:09,090 --> 01:12:13,280 And of course, you should always work your way down this curve. 1506 01:12:13,280 --> 01:12:15,600 And there's a way to calibrate this, 1507 01:12:15,600 --> 01:12:17,000 which basically would say-- 1508 01:12:17,000 --> 01:12:20,890 one way to think of it is-- 1509 01:12:20,890 --> 01:12:24,200 a simple way to calibrate this. 1510 01:12:24,200 --> 01:12:26,720 If I actually have upper and lower specification limits 1511 01:12:26,720 --> 01:12:30,550 here, I can say, what's the cost of scrapping a part? 1512 01:12:30,550 --> 01:12:33,790 And where this curve crosses that, you say, OK. 1513 01:12:33,790 --> 01:12:37,258 That's my calibration constant. 1514 01:12:37,258 --> 01:12:39,050 But again, there's some great case examples 1515 01:12:39,050 --> 01:12:43,970 that show that if you work against quality loss, 1516 01:12:43,970 --> 01:12:45,680 you can actually end up with much-- 1517 01:12:45,680 --> 01:12:48,890 once you combine together parts into the whole thing, 1518 01:12:48,890 --> 01:12:54,440 you actually end up with a much better overall quality. 1519 01:12:54,440 --> 01:12:57,320 But for the moment, let's just look at the tolerances. 1520 01:12:57,320 --> 01:13:01,430 Process capability in its simplest form is just this. 1521 01:13:01,430 --> 01:13:03,920 I start with my desired. 1522 01:13:03,920 --> 01:13:06,560 I'd like the part to fall between these two limits. 1523 01:13:06,560 --> 01:13:08,810 Once I say that, it doesn't matter what the target is, 1524 01:13:08,810 --> 01:13:09,020 right? 1525 01:13:09,020 --> 01:13:10,340 It's just between the two limits.
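[Editor's example] The quadratic quality loss and its calibration described above can be sketched as follows. This assumes the usual form L(y) = k (y - target)^2 with specification limits placed symmetrically about the target, and it sets k so that the loss at a specification limit equals the cost of scrapping a part. The function names and the dollar figures are hypothetical and only illustrate the idea.

```python
def loss_constant(scrap_cost, target, usl):
    """Calibrate k so that the loss at the spec limit equals the scrap cost."""
    half_width = usl - target  # assumes limits symmetric about the target
    return scrap_cost / half_width ** 2

def quadratic_loss(y, target, k):
    """Quadratic loss for a part with measured value y."""
    return k * (y - target) ** 2

# Hypothetical part: target 10.0, limits 9.5 and 10.5, scrap cost 4.00 per part.
k = loss_constant(scrap_cost=4.0, target=10.0, usl=10.5)

for y in (10.0, 10.25, 10.5):
    print(y, quadratic_loss(y, 10.0, k))
```

Under a plain tolerance, the parts at 10.0 and 10.25 count as equally good. Under the quadratic loss, the part at 10.25 already carries a quarter of the scrap cost, which is why working down the loss curve differs from merely staying inside the limits.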
1526 01:13:10,340 --> 01:13:12,132 I'd like it to fall between the two limits. 1527 01:13:14,740 --> 01:13:16,660 There's my process. 1528 01:13:16,660 --> 01:13:17,950 That's the real world. 1529 01:13:17,950 --> 01:13:19,730 How did I do? 1530 01:13:19,730 --> 01:13:23,590 What's the likelihood that the part that I make 1531 01:13:23,590 --> 01:13:26,320 will fall inside those limits? 1532 01:13:26,320 --> 01:13:30,080 That's process capability. 1533 01:13:30,080 --> 01:13:32,480 You can, and there are numbers-- 1534 01:13:32,480 --> 01:13:34,070 this is the chart. 1535 01:13:34,070 --> 01:13:35,580 You can look at that and say, OK, 1536 01:13:35,580 --> 01:13:37,370 how likely is it that I make a bad part? 1537 01:13:37,370 --> 01:13:40,280 Well, it's the area of that little space right there. 1538 01:13:40,280 --> 01:13:43,150 The way I've drawn this. 1539 01:13:43,150 --> 01:13:46,703 It's that area right there, and it looks like nil here. 1540 01:13:46,703 --> 01:13:48,370 So I could look at that and say, that's 1541 01:13:48,370 --> 01:13:51,530 the probability that I make a bad part. 1542 01:13:51,530 --> 01:13:55,370 Again, to try and give us a more common basis for this, 1543 01:13:55,370 --> 01:13:57,180 things like this have been defined. 1544 01:13:57,180 --> 01:13:59,600 The CP parameter. 1545 01:13:59,600 --> 01:14:01,680 Process capability, which basically says, 1546 01:14:01,680 --> 01:14:07,540 what's the ratio of the width of that window to my plus or minus 1547 01:14:07,540 --> 01:14:10,150 3 sigma width? 1548 01:14:10,150 --> 01:14:13,570 So is the variability of my process 1549 01:14:13,570 --> 01:14:19,270 similar to or different from the underlying, the desired? 1550 01:14:19,270 --> 01:14:22,510 But it compares ranges only. 1551 01:14:22,510 --> 01:14:24,430 And if you look at this carefully, 1552 01:14:24,430 --> 01:14:28,570 that could say that I could have the following.
1553 01:14:28,570 --> 01:14:34,880 I could have a process where these are my specification 1554 01:14:34,880 --> 01:14:35,380 limits. 1555 01:14:41,070 --> 01:14:44,220 And here's my distribution. 1556 01:14:46,960 --> 01:14:48,840 It's pretty narrow. 1557 01:14:48,840 --> 01:14:51,120 So the width of this, plus or minus 3 sigma, 1558 01:14:51,120 --> 01:14:54,870 is well within this width. 1559 01:14:54,870 --> 01:14:58,830 So this number would actually be pretty large. 1560 01:14:58,830 --> 01:15:01,530 This window is wide compared to the window of this. 1561 01:15:01,530 --> 01:15:04,600 But how many bad parts do I make? 1562 01:15:04,600 --> 01:15:06,640 They're all bad, right? 1563 01:15:06,640 --> 01:15:10,242 So CP is one way of looking at how well you're 1564 01:15:10,242 --> 01:15:11,200 doing with the process. 1565 01:15:11,200 --> 01:15:13,920 But it's really only looking at variability. 1566 01:15:13,920 --> 01:15:15,490 So we come up with this CPK measure, 1567 01:15:15,490 --> 01:15:17,573 and there are a whole pile of these different ways 1568 01:15:17,573 --> 01:15:18,580 of looking at stuff. 1569 01:15:18,580 --> 01:15:24,700 CPK now penalizes deviations of the mean. 1570 01:15:24,700 --> 01:15:25,270 OK? 1571 01:15:25,270 --> 01:15:30,910 So now, this would have either negative or zero CPK. 1572 01:15:30,910 --> 01:15:33,280 And if you do the calculation, this 1573 01:15:33,280 --> 01:15:38,210 is sort of a blend between how well it's centered in here 1574 01:15:38,210 --> 01:15:39,470 and what the deviation is. 1575 01:15:39,470 --> 01:15:42,230 So if it's perfectly centered, then these two numbers-- 1576 01:15:46,610 --> 01:15:48,770 this is actually-- if it was perfectly centered 1577 01:15:48,770 --> 01:15:56,530 and it exactly fits like this, what's CPK going to be? 1578 01:15:56,530 --> 01:16:00,160 So here's the center specification and here's x-bar. 1579 01:16:00,160 --> 01:16:03,810 This is exactly plus or minus 3 sigma.
1580 01:16:03,810 --> 01:16:06,390 This distance here will be 3 sigma. 1581 01:16:06,390 --> 01:16:12,610 So your process capability, CPK, will be 1. 1582 01:16:12,610 --> 01:16:15,010 So let me just end up with this and just say 1583 01:16:15,010 --> 01:16:17,500 that there are a couple ways you can look at this. 1584 01:16:17,500 --> 01:16:20,380 If I have a process perfectly centered 1585 01:16:20,380 --> 01:16:23,650 and exactly plus or minus 3 sigma limits to my upper 1586 01:16:23,650 --> 01:16:28,480 and lower specification limit, both CP and CPK will be 1. 1587 01:16:28,480 --> 01:16:28,980 OK. 1588 01:16:28,980 --> 01:16:32,610 If that's the case, what's my parts 1589 01:16:32,610 --> 01:16:34,920 per million, which many companies will 1590 01:16:34,920 --> 01:16:36,930 use to measure their quality? 1591 01:16:36,930 --> 01:16:38,775 What's my parts per million bad parts? 1592 01:16:44,760 --> 01:16:46,492 AUDIENCE: [INAUDIBLE] 1593 01:16:48,430 --> 01:16:51,730 PROFESSOR: 3 out of 1,000. 1594 01:16:51,730 --> 01:16:53,170 So it's a lot. 1595 01:16:53,170 --> 01:16:53,950 So yeah. 1596 01:16:53,950 --> 01:17:00,320 So CP of 1's not particularly that good. 1597 01:17:00,320 --> 01:17:03,257 Companies like to have it a lot higher than that. 1598 01:17:03,257 --> 01:17:04,340 Here's an interesting one. 1599 01:17:04,340 --> 01:17:05,840 Look at the difference between this. 1600 01:17:05,840 --> 01:17:06,930 CP is 1. 1601 01:17:06,930 --> 01:17:08,420 CPK is 1. 1602 01:17:08,420 --> 01:17:11,150 CP is still 1 because the width of this distribution 1603 01:17:11,150 --> 01:17:13,610 didn't change. 1604 01:17:13,610 --> 01:17:17,870 But the mean shift was such that I actually have-- 1605 01:17:17,870 --> 01:17:22,130 half my parts will now be bad, and that, if you calculate, 1606 01:17:22,130 --> 01:17:25,520 gives you CPK of 0. 1607 01:17:25,520 --> 01:17:27,550 Here's another interesting one.
1608 01:17:27,550 --> 01:17:30,410 CP went up to 2 because I narrowed the distribution, 1609 01:17:30,410 --> 01:17:32,370 but it wasn't centered. 1610 01:17:32,370 --> 01:17:35,490 And so I got a CPK of 1. 1611 01:17:35,490 --> 01:17:37,540 This is really good. 1612 01:17:37,540 --> 01:17:39,160 This is really good. 1613 01:17:39,160 --> 01:17:41,450 This is now plus or minus-- 1614 01:17:41,450 --> 01:17:46,490 can you see that's plus or minus 6 sigma out to here? 1615 01:17:46,490 --> 01:17:48,670 So what's the likelihood of making a bad part now? 1616 01:17:51,570 --> 01:17:54,345 A couple per million, I think. 1617 01:17:54,345 --> 01:17:55,970 This-- I don't have time to go into it, 1618 01:17:55,970 --> 01:17:59,720 but this is sort of the origin of the 6 sigma 1619 01:17:59,720 --> 01:18:01,040 revolution of saying, wow. 1620 01:18:01,040 --> 01:18:04,840 If we could do this, we'd never make a bad part. 1621 01:18:04,840 --> 01:18:06,760 There's a little bit more to that, 1622 01:18:06,760 --> 01:18:10,655 but that's process capability. 1623 01:18:10,655 --> 01:18:12,530 There are all sorts of different forms of it. 1624 01:18:12,530 --> 01:18:14,655 But the last thing-- let me just say one more thing 1625 01:18:14,655 --> 01:18:16,940 about process capability. 1626 01:18:16,940 --> 01:18:20,360 All these things, CP, CPK, six sigma, they're all great. 1627 01:18:20,360 --> 01:18:25,670 Keep in mind that, again, if you can do it, 1628 01:18:25,670 --> 01:18:28,100 if you actually have the data to do it, it's useful to do, 1629 01:18:28,100 --> 01:18:28,880 I think. 1630 01:18:28,880 --> 01:18:35,870 What it really says is here's the design, 1631 01:18:35,870 --> 01:18:41,240 and somewhere here is manufacturing. 1632 01:18:46,260 --> 01:18:49,320 And both of these can change. 1633 01:18:49,320 --> 01:18:49,960 OK? 1634 01:18:49,960 --> 01:18:51,370 Both of these can change. 1635 01:18:51,370 --> 01:18:53,070 I can do better manufacturing.
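[Editor's example] The CP and CPK cases walked through above can be reproduced with a short sketch. It assumes the standard definitions, Cp = (USL - LSL) / (6 sigma) and Cpk = min(USL - mean, mean - LSL) / (3 sigma), and a normally distributed process. The function names, the specification window of -3 to +3, and the particular means and sigmas are hypothetical values chosen to match the pictures described in the lecture.

```python
import math

def cp(lsl, usl, sigma):
    """Cp: width of the specification window over the 6-sigma process width."""
    return (usl - lsl) / (6.0 * sigma)

def cpk(lsl, usl, mean, sigma):
    """Cpk: distance from the mean to the nearer spec limit, over 3 sigma."""
    return min(usl - mean, mean - lsl) / (3.0 * sigma)

def fraction_out_of_spec(lsl, usl, mean, sigma):
    """Normal tail area below LSL plus tail area above USL."""
    below = 0.5 * math.erfc((mean - lsl) / (sigma * math.sqrt(2.0)))
    above = 0.5 * math.erfc((usl - mean) / (sigma * math.sqrt(2.0)))
    return below + above

LSL, USL = -3.0, 3.0
cases = [
    ("centered, limits at +/-3 sigma", 0.0, 1.0),  # Cp = 1, Cpk = 1
    ("mean sitting on the upper limit", 3.0, 1.0),  # Cp = 1, Cpk = 0
    ("narrower but off center", 1.5, 0.5),          # Cp = 2, Cpk = 1
    ("centered, limits at +/-6 sigma", 0.0, 0.5),   # Cp = 2, Cpk = 2
    ("narrow but outside the window", 5.0, 0.3),    # Cp large, Cpk negative
]
for label, mean, sigma in cases:
    ppm = 1e6 * fraction_out_of_spec(LSL, USL, mean, sigma)
    print(f"{label}: Cp={cp(LSL, USL, sigma):.2f}, "
          f"Cpk={cpk(LSL, USL, mean, sigma):.2f}, bad parts per million={ppm:.4g}")
```

Under this pure normal model the centered 3-sigma case gives about 2,700 bad parts per million, which is the "3 out of 1,000" figure. The centered 6-sigma case gives roughly 0.002 per million; the often-quoted 3.4 per million for six sigma additionally assumes a 1.5 sigma shift of the mean.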
1636 01:18:53,070 --> 01:18:54,420 I can center it. 1637 01:18:54,420 --> 01:18:56,160 I can do worse manufacturing. 1638 01:18:56,160 --> 01:18:57,330 I can buy new equipment. 1639 01:18:57,330 --> 01:19:00,150 I can change this all the ways we've talked about. 1640 01:19:00,150 --> 01:19:05,110 I can also go back to design or start with design and say, 1641 01:19:05,110 --> 01:19:08,470 you know, this part really isn't that important. 1642 01:19:08,470 --> 01:19:11,200 This window can actually be here. 1643 01:19:11,200 --> 01:19:12,710 And if I move the window out there, 1644 01:19:12,710 --> 01:19:15,800 look at how the process capability just changed. 1645 01:19:15,800 --> 01:19:17,690 In case 1, the process capability 1646 01:19:17,690 --> 01:19:20,300 of this distribution against this design is terrible. 1647 01:19:20,300 --> 01:19:23,000 The CPK is almost 0. 1648 01:19:23,000 --> 01:19:29,150 Now, if I opened up the window, CPK now skyrockets. 1649 01:19:29,150 --> 01:19:31,250 I'm great. 1650 01:19:31,250 --> 01:19:36,050 The trend usually is just the opposite, though. 1651 01:19:36,050 --> 01:19:39,320 And this is certainly true in semiconductor manufacturing is 1652 01:19:39,320 --> 01:19:42,140 you might say, here's the inherent underlying capability 1653 01:19:42,140 --> 01:19:45,080 of many of the processes that we've had historically. 1654 01:19:45,080 --> 01:19:50,420 Here were the design specifications back in 1980. 1655 01:19:50,420 --> 01:19:52,860 Line width or something like this. 1656 01:19:52,860 --> 01:19:54,210 Characteristic dimension. 1657 01:19:54,210 --> 01:19:57,730 Here they are now. 1658 01:19:57,730 --> 01:20:00,430 Oops. 1659 01:20:00,430 --> 01:20:03,295 Can't make this product with this process. 1660 01:20:06,530 --> 01:20:08,510 And it was all because-- so you can 1661 01:20:08,510 --> 01:20:10,970 look at how I have to improve my process capability.
1662 01:20:10,970 --> 01:20:13,280 And it's interesting to talk to people who've 1663 01:20:13,280 --> 01:20:16,100 been in the semiconductor industry their whole lives, 1664 01:20:16,100 --> 01:20:18,200 and they'll tell you this exact story 1665 01:20:18,200 --> 01:20:21,260 that, well, we didn't worry about this back when 1666 01:20:21,260 --> 01:20:24,065 the specifications were this wide. 1667 01:20:24,065 --> 01:20:26,190 But all of a sudden, it became much more important, 1668 01:20:26,190 --> 01:20:29,950 so we had to do a lot of things, including not just SPC, 1669 01:20:29,950 --> 01:20:31,910 but some extraordinary things. 1670 01:20:31,910 --> 01:20:32,410 OK. 1671 01:20:32,410 --> 01:20:33,340 Thanks for your patience. 1672 01:20:33,340 --> 01:20:35,290 I ran over, but it was a technical difficulty 1673 01:20:35,290 --> 01:20:36,070 at the beginning. 1674 01:20:36,070 --> 01:20:36,880 OK. 1675 01:20:36,880 --> 01:20:38,620 Bye bye.