1 00:00:00,000 --> 00:00:00,880 2 00:00:00,880 --> 00:00:00,940 Hi. 3 00:00:00,940 --> 00:00:03,980 In this problem we'll work through an example of 4 00:00:03,980 --> 00:00:08,150 calculating a distribution for a random variable using the 5 00:00:08,150 --> 00:00:10,060 method of derived distributions. 6 00:00:10,060 --> 00:00:13,360 So in general, the process goes as follows. 7 00:00:13,360 --> 00:00:16,640 We know the distribution for some random variable X and 8 00:00:16,640 --> 00:00:19,090 what we want is the distribution for another 9 00:00:19,090 --> 00:00:22,070 random variable Y, which is somehow related to X through 10 00:00:22,070 --> 00:00:23,450 some function g. 11 00:00:23,450 --> 00:00:25,630 So Y is g of X. 12 00:00:25,630 --> 00:00:28,230 And the steps that we follow-- 13 00:00:28,230 --> 00:00:29,900 we can actually just kind of summarize them 14 00:00:29,900 --> 00:00:31,570 using these four steps. 15 00:00:31,570 --> 00:00:35,290 The first step is to write out the CDF of Y. So Y is the thing 16 00:00:35,290 --> 00:00:35,870 that we want. 17 00:00:35,870 --> 00:00:38,380 And what we'll do is we'll write out the CDF first. 18 00:00:38,380 --> 00:00:43,060 So remember, the CDF is just capital F sub Y of little y, which is the 19 00:00:43,060 --> 00:00:45,590 probability that the random variable Y is less than or 20 00:00:45,590 --> 00:00:48,960 equal to some value, little y. 21 00:00:48,960 --> 00:00:51,280 The next thing we'll do is, we'll use this relationship 22 00:00:51,280 --> 00:00:54,990 that we know, between Y and X. And we'll substitute in, 23 00:00:54,990 --> 00:00:57,960 instead of writing the random variable Y in here, we'll 24 00:00:57,960 --> 00:01:01,770 write it in terms of X. So instead of 25 00:01:01,770 --> 00:01:05,140 Y, we'll plug in an expression in X. And we'll use this function g in order 26 00:01:05,140 --> 00:01:06,510 to do that.
27 00:01:06,510 --> 00:01:09,570 So what we have now is that up to here, we would have that 28 00:01:09,570 --> 00:01:12,560 the CDF of Y is now the probability that g of the random 29 00:01:12,560 --> 00:01:16,240 variable X is less than or equal to some value, little y. 30 00:01:16,240 --> 00:01:18,190 Next what we'll do is we'll actually rewrite this 31 00:01:18,190 --> 00:01:22,810 probability in terms of the CDF of X. So the CDF of X, 32 00:01:22,810 --> 00:01:25,020 remember, would be-- 33 00:01:25,020 --> 00:01:31,130 capital F sub X of little x is the probability that X is less than or equal to 34 00:01:31,130 --> 00:01:33,440 some little x. 35 00:01:33,440 --> 00:01:36,145 And then once we have that, if we differentiate this-- 36 00:01:36,145 --> 00:01:38,860 37 00:01:38,860 --> 00:01:42,490 when we differentiate the CDF of X, we get the PDF of X. And 38 00:01:42,490 --> 00:01:45,650 what we presume is that we know this PDF already. 39 00:01:45,650 --> 00:01:49,760 And from that, what we get is, when we differentiate this 40 00:01:49,760 --> 00:01:52,850 thing, we get the PDF of Y. So through this whole process 41 00:01:52,850 --> 00:01:55,360 what we get is the relationship between the PDF 42 00:01:55,360 --> 00:02:00,030 of Y and the PDF of X. So that is the process for calculating 43 00:02:00,030 --> 00:02:03,380 the PDF of Y using X. 44 00:02:03,380 --> 00:02:05,050 So let's go into our specific example. 45 00:02:05,050 --> 00:02:08,530 In this case, what we're told is that X, the one that we 46 00:02:08,530 --> 00:02:11,200 know, is a standard normal random variable. 47 00:02:11,200 --> 00:02:14,300 Meaning that it has mean 0 and variance 1. 48 00:02:14,300 --> 00:02:17,040 And so we know the form of the PDF. 49 00:02:17,040 --> 00:02:20,130 The PDF of X is this: 1 over the square root of 2 pi, times e to the 50 00:02:20,130 --> 00:02:22,910 minus x squared over 2.
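The four steps described above, together with the standard normal PDF just quoted, can be written out in symbols. This is only a sketch in standard notation (capital $F$ for a CDF, lowercase $f$ for a PDF); how step 3 works out depends on the particular function $g$.

```latex
\begin{align*}
\text{Step 1:}\quad & F_Y(y) = P(Y \le y) \\
\text{Step 2:}\quad & F_Y(y) = P\big(g(X) \le y\big) \\
\text{Step 3:}\quad & \text{rewrite } P\big(g(X) \le y\big) \text{ in terms of } F_X \\
\text{Step 4:}\quad & f_Y(y) = \frac{d}{dy} F_Y(y) \\[6pt]
\text{Standard normal:}\quad & f_X(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}
\end{align*}
```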
51 00:02:22,910 --> 00:02:24,680 And then the next thing that we're told is this 52 00:02:24,680 --> 00:02:32,190 relationship between X and Y. So what we're told is, if X is 53 00:02:32,190 --> 00:02:37,940 negative, then Y is minus X. If X is positive, then Y is 54 00:02:37,940 --> 00:02:40,820 the square root of X. So this is a graphical 55 00:02:40,820 --> 00:02:45,030 representation of the relationship between X and Y. 56 00:02:45,030 --> 00:02:48,310 All right, so we have everything that we need. 57 00:02:48,310 --> 00:02:50,820 And now let's just go through this process and calculate 58 00:02:50,820 --> 00:02:52,420 what the PDF of Y is. 59 00:02:52,420 --> 00:02:57,690 So the first thing we do is we write out the CDF of Y. So the 60 00:02:57,690 --> 00:03:01,245 CDF of Y is what we've written. 61 00:03:01,245 --> 00:03:05,160 It's the probability that the random variable Y is less than 62 00:03:05,160 --> 00:03:06,410 or equal to some little y. 63 00:03:06,410 --> 00:03:08,480 64 00:03:08,480 --> 00:03:13,040 Now the next step that we do is we have to substitute: 65 00:03:13,040 --> 00:03:15,120 instead of in terms of Y, we want to write it in terms 66 00:03:15,120 --> 00:03:18,420 of X. Because we actually know stuff about X, but we don't 67 00:03:18,420 --> 00:03:22,440 know anything about Y. So what is the probability that Y, the 68 00:03:22,440 --> 00:03:23,516 random variable Y, is less than or equal 69 00:03:23,516 --> 00:03:24,820 to some little y? 70 00:03:24,820 --> 00:03:27,420 Well, let's go back to this relationship and see if we can 71 00:03:27,420 --> 00:03:28,240 figure that out. 72 00:03:28,240 --> 00:03:33,590 So let's pretend that here is our little y. 73 00:03:33,590 --> 00:03:37,402 Well, if the random variable Y is less than or equal to 74 00:03:37,402 --> 00:03:39,160 little y, it has to be underneath 75 00:03:39,160 --> 00:03:41,760 this horizontal line.
76 00:03:41,760 --> 00:03:44,360 And in order for it to be underneath this horizontal 77 00:03:44,360 --> 00:03:50,110 line, that means that X has to be within this range. 78 00:03:50,110 --> 00:03:51,050 And what is this range? 79 00:03:51,050 --> 00:03:56,790 This range goes from minus little y to little y squared. 80 00:03:56,790 --> 00:03:57,810 So why is that? 81 00:03:57,810 --> 00:04:03,160 It's because in this portion X and Y are related as, Y is 82 00:04:03,160 --> 00:04:08,120 negative X and here it's Y is square root of X. So if X is little y 83 00:04:08,120 --> 00:04:11,920 squared, then Y would be little y. If X is negative little y, then Y would 84 00:04:11,920 --> 00:04:16,130 be little y. All right, so this is the range that 85 00:04:16,130 --> 00:04:18,480 we're looking for. 86 00:04:18,480 --> 00:04:20,870 So if Y, the random variable Y, is less than or equal to 87 00:04:20,870 --> 00:04:27,580 little y, then this is the same as if the random variable 88 00:04:27,580 --> 00:04:32,700 X is between negative little y and little y squared. 89 00:04:32,700 --> 00:04:34,440 So let's plug that in. 90 00:04:34,440 --> 00:04:38,970 This is the same as the probability that X is between 91 00:04:38,970 --> 00:04:44,040 negative little y and little y squared. 92 00:04:44,040 --> 00:04:46,280 So those are the first two steps. 93 00:04:46,280 --> 00:04:49,070 Now the third step is, we have to rewrite this 94 00:04:49,070 --> 00:04:51,300 in terms of the CDF of X. 95 00:04:51,300 --> 00:04:56,530 So right now we have it in terms of a probability of some 96 00:04:56,530 --> 00:04:59,760 event related to X. Let's actually transform that to be 97 00:04:59,760 --> 00:05:04,840 explicitly in terms of the CDF of X. So how do we do that? 98 00:05:04,840 --> 00:05:06,840 Well, this is just the probability that X is within 99 00:05:06,840 --> 00:05:07,750 some range. 100 00:05:07,750 --> 00:05:11,710 So we can turn that into the CDF by writing it as a 101 00:05:11,710 --> 00:05:13,650 difference of two CDFs.
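The claim that the event "Y is at most little y" is the same as "X lies between minus little y and little y squared" can be checked by brute force. The following is a minimal Python sketch, not part of the lecture; the names `g`, `in_range` and `events_agree` are made up for illustration, and it assumes little y is non-negative.

```python
import math

def g(x):
    """Transformation from the example: Y = -X for X < 0, Y = sqrt(X) otherwise."""
    return -x if x < 0 else math.sqrt(x)

def in_range(x, y):
    """True when x lies in the interval [-y, y^2] read off the graph."""
    return -y <= x <= y * y

def events_agree(y, xs):
    """Check that {g(x) <= y} and {-y <= x <= y^2} pick out the same x values."""
    return all((g(x) <= y) == in_range(x, y) for x in xs)

# Grid of x values from -5 to 5 in steps of 0.01 (an arbitrary choice).
xs = [i / 100 for i in range(-500, 501)]
print(all(events_agree(y, xs) for y in (0.25, 0.5, 1.0, 1.5, 2.0)))
```

If the two descriptions of the event really coincide, the two conditions agree at every grid point for each of the sample levels.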
102 00:05:13,650 --> 00:05:17,600 So this is the same as the probability that X is less 103 00:05:17,600 --> 00:05:23,730 than or equal to little y squared minus the probability that X 104 00:05:23,730 --> 00:05:27,170 is less than or equal to negative little y. 105 00:05:27,170 --> 00:05:30,230 So in order to find the probability that X is within 106 00:05:30,230 --> 00:05:35,160 this range, we take the probability that it's less 107 00:05:35,160 --> 00:05:36,410 than little y squared, which is everything here. 108 00:05:36,410 --> 00:05:39,410 And then we subtract the probability that it's less 109 00:05:39,410 --> 00:05:42,450 than negative little y. So what we're left with is just within 110 00:05:42,450 --> 00:05:44,490 this range. 111 00:05:44,490 --> 00:05:52,810 So these actually are now exactly CDFs of X. So this is 112 00:05:52,810 --> 00:05:58,150 capital F sub X evaluated at little y squared and this is capital F sub X evaluated 113 00:05:58,150 --> 00:06:03,230 at negative little y. So now we've completed step three. 114 00:06:03,230 --> 00:06:05,570 And the last step that we need to do is differentiate. 115 00:06:05,570 --> 00:06:08,730 So if we differentiate both sides of this equation with 116 00:06:08,730 --> 00:06:14,070 respect to little y, on the left side we get what 117 00:06:14,070 --> 00:06:18,260 we want, which is the PDF of Y. Now when we differentiate the 118 00:06:18,260 --> 00:06:19,250 right side-- 119 00:06:19,250 --> 00:06:21,460 we'll have to invoke the chain rule. 120 00:06:21,460 --> 00:06:27,140 So the first thing that we do is, well, this is a CDF of X. 121 00:06:27,140 --> 00:06:32,660 So when we differentiate we'll get the PDF of X. 122 00:06:32,660 --> 00:06:35,230 But then we also have to invoke the chain rule for this 123 00:06:35,230 --> 00:06:36,010 argument inside. 124 00:06:36,010 --> 00:06:38,510 So the derivative of little y squared would give us 125 00:06:38,510 --> 00:06:42,340 an extra factor, 2y.
126 00:06:42,340 --> 00:06:46,800 And then similarly this would give us minus the PDF of X evaluated 127 00:06:46,800 --> 00:06:51,540 at negative little y, and the chain rule will give us an extra factor of 128 00:06:51,540 --> 00:06:53,890 negative 1. 129 00:06:53,890 --> 00:06:56,360 So let's just clean this up a little bit. 130 00:06:56,360 --> 00:07:07,260 So it's 2y times f sub X of y squared, plus f sub X of minus y. All right, so now 131 00:07:07,260 --> 00:07:07,850 we're almost done. 132 00:07:07,850 --> 00:07:08,690 We've differentiated. 133 00:07:08,690 --> 00:07:11,460 We have the PDF of Y, which is what we're looking for. 134 00:07:11,460 --> 00:07:15,260 And we've written it in terms of the PDF of X. And 135 00:07:15,260 --> 00:07:18,580 fortunately we know what that is, so once we plug that in, 136 00:07:18,580 --> 00:07:20,280 then we're essentially done. 137 00:07:20,280 --> 00:07:22,440 So what is the PDF? 138 00:07:22,440 --> 00:07:25,570 Well, 2y times the PDF of X evaluated at y squared is going to give 139 00:07:25,570 --> 00:07:32,250 us 2y times 1 over the square root of 2 pi, times e to the minus-- 140 00:07:32,250 --> 00:07:36,610 so in this case, x is y squared-- 141 00:07:36,610 --> 00:07:41,220 so we get y to the fourth over 2. 142 00:07:41,220 --> 00:07:45,750 And then we add another 1 over the square root of 2 pi, times e to the 143 00:07:45,750 --> 00:07:49,460 minus y squared over 2. 144 00:07:49,460 --> 00:07:51,880 OK, and now we're almost done. 145 00:07:51,880 --> 00:07:53,400 The last thing that we need to take care of 146 00:07:53,400 --> 00:07:55,050 is, what is the range? 147 00:07:55,050 --> 00:07:58,030 Now remember, it's important when you calculate PDFs to 148 00:07:58,030 --> 00:08:00,930 always think about the ranges where things are valid. 149 00:08:00,930 --> 00:08:05,510 So when we think about this, what is the range where this 150 00:08:05,510 --> 00:08:06,970 actually is valid? 151 00:08:06,970 --> 00:08:12,200 Well, Y, remember, is related to X through this relationship.
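For reference, the spoken derivation up to this point can be written out as follows. This is a reconstruction from the steps described, for a level $y > 0$ as drawn on the graph, with $f_X$ the standard normal PDF.

```latex
\begin{align*}
F_Y(y) &= P(-y \le X \le y^2) = F_X(y^2) - F_X(-y), \\
f_Y(y) &= 2y\, f_X(y^2) + f_X(-y) \\
       &= \frac{2y}{\sqrt{2\pi}}\, e^{-y^4/2} + \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}.
\end{align*}
```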
152 00:08:12,200 --> 00:08:17,560 So as we look at this, we see that Y can never be negative. 153 00:08:17,560 --> 00:08:22,040 Because no matter what X is, it gets transformed into some 154 00:08:22,040 --> 00:08:24,250 non-negative value of Y. 155 00:08:24,250 --> 00:08:30,650 So what we know is that this is now actually valid only for 156 00:08:30,650 --> 00:08:38,480 y greater than 0, and for y less than 0, the PDF is 0. 157 00:08:38,480 --> 00:08:50,080 So this gives us the final PDF of Y. 158 00:08:50,080 --> 00:08:53,600 All right, so at first, when you start doing 159 00:08:53,600 --> 00:08:54,990 these derived distribution problems, 160 00:08:54,990 --> 00:08:56,970 they can seem pretty difficult. 161 00:08:56,970 --> 00:09:00,830 But if we just remember that there are these pretty 162 00:09:00,830 --> 00:09:03,650 straightforward steps that we follow, and as long as you go 163 00:09:03,650 --> 00:09:06,380 through these steps and do them methodically, then you 164 00:09:06,380 --> 00:09:08,340 can actually come up with the solution for 165 00:09:08,340 --> 00:09:10,140 any of these problems. 166 00:09:10,140 --> 00:09:14,210 And one last thing to remember is to always think about 167 00:09:14,210 --> 00:09:16,050 the ranges where these things are valid. 168 00:09:16,050 --> 00:09:18,610 Because the relationship between these two random 169 00:09:18,610 --> 00:09:21,050 variables could be pretty complicated and you need to 170 00:09:21,050 --> 00:09:24,190 always be aware of when things are non-zero and 171 00:09:24,190 --> 00:09:25,440 when they are 0. 172 00:09:25,440 --> 00:09:26,333
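As a sanity check on the final answer, the PDF of Y can be coded up and tested numerically. The sketch below is not from the lecture: the function names are made up for illustration, the normal CDF is computed with `math.erf`, and the integration range from 0 to 10 and the step sizes are arbitrary choices. It checks that the PDF integrates to roughly 1 and that it matches a finite-difference slope of the CDF at y = 1.

```python
import math

def normal_pdf(x):
    """Standard normal PDF f_X(x)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def normal_cdf(x):
    """Standard normal CDF F_X(x), written with the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def cdf_y(y):
    """F_Y(y) = F_X(y^2) - F_X(-y) for y >= 0, and 0 for y < 0."""
    if y < 0:
        return 0.0
    return normal_cdf(y * y) - normal_cdf(-y)

def pdf_y(y):
    """f_Y(y) = 2y f_X(y^2) + f_X(-y) for y > 0, and 0 otherwise."""
    if y <= 0:
        return 0.0
    return 2 * y * normal_pdf(y * y) + normal_pdf(-y)

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Check 1: the PDF should integrate to (approximately) 1.
total = integrate(pdf_y, 0.0, 10.0)

# Check 2: the PDF should match the slope of the CDF, here at y = 1.
h = 1e-5
slope = (cdf_y(1.0 + h) - cdf_y(1.0 - h)) / (2 * h)

print(abs(total - 1.0) < 1e-6, abs(slope - pdf_y(1.0)) < 1e-6)
```

Both checks are only as good as the numerical tolerances chosen here, but a sign error in the chain rule step or a missing factor of 2y would make at least one of them fail.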