1
00:00:00,000 --> 00:00:00,880
2
00:00:00,880 --> 00:00:00,940
Hi.
3
00:00:00,940 --> 00:00:03,980
In this problem we'll work
through an example of
4
00:00:03,980 --> 00:00:08,150
calculating a distribution for
a random variable using the
5
00:00:08,150 --> 00:00:10,060
method of derived
distributions.
6
00:00:10,060 --> 00:00:13,360
So in general, the process
goes as follows.
7
00:00:13,360 --> 00:00:16,640
We know the distribution for
some random variable X and
8
00:00:16,640 --> 00:00:19,090
what we want is the distribution
for another
9
00:00:19,090 --> 00:00:22,070
random variable Y, which is
somehow related to X through
10
00:00:22,070 --> 00:00:23,450
some function g.
11
00:00:23,450 --> 00:00:25,630
So Y is g of X.
12
00:00:25,630 --> 00:00:28,230
And the steps that we follow--
13
00:00:28,230 --> 00:00:29,900
we can actually just kind
of summarize them
14
00:00:29,900 --> 00:00:31,570
using these four steps.
15
00:00:31,570 --> 00:00:35,290
The first step is to write out
the CDF of Y. So Y is the thing
16
00:00:35,290 --> 00:00:35,870
that we want.
17
00:00:35,870 --> 00:00:38,380
And what we'll do is we'll
write out the CDF first.
18
00:00:38,380 --> 00:00:43,060
So remember the CDF is just
capital F sub Y of y, which is the
19
00:00:43,060 --> 00:00:45,590
probability that random variable
Y is less than or
20
00:00:45,590 --> 00:00:48,960
equal to some value, little y.
21
00:00:48,960 --> 00:00:51,280
The next thing we'll do is,
we'll use this relationship
22
00:00:51,280 --> 00:00:54,990
that we know, between Y and X.
And we'll substitute in,
23
00:00:54,990 --> 00:00:57,960
instead of writing the random
variable Y in here, we'll
24
00:00:57,960 --> 00:01:01,770
write it in terms of X. So we'll
plug in for-- instead of
25
00:01:01,770 --> 00:01:05,140
Y, we'll plug in X. And we'll
use this function g in order
26
00:01:05,140 --> 00:01:06,510
to do that.
27
00:01:06,510 --> 00:01:09,570
So what we have now is that up
to here, we would have that
28
00:01:09,570 --> 00:01:12,560
the CDF of Y is now the
probability that the random
29
00:01:12,560 --> 00:01:16,240
variable g of X is less than or equal
to some value, little y.
30
00:01:16,240 --> 00:01:18,190
Next what we'll do is we'll
actually rewrite this
31
00:01:18,190 --> 00:01:22,810
probability as a CDF of
X. So the CDF of X,
32
00:01:22,810 --> 00:01:25,020
remember, would be--
33
00:01:25,020 --> 00:01:31,130
F of x is the probability
that X is less than or equal to
34
00:01:31,130 --> 00:01:33,440
some little x.
35
00:01:33,440 --> 00:01:36,145
And then once we have that,
if we differentiate this--
36
00:01:36,145 --> 00:01:38,860
37
00:01:38,860 --> 00:01:42,490
when we differentiate the CDF of
X, we get the PDF of X. And
38
00:01:42,490 --> 00:01:45,650
what we presume is that we
know this PDF already.
39
00:01:45,650 --> 00:01:49,760
And from that, what we get is,
when we differentiate this
40
00:01:49,760 --> 00:01:52,850
thing, we get the PDF of Y. So
through this whole process
41
00:01:52,850 --> 00:01:55,360
what we get is, we'll get the
relationship between the PDF
42
00:01:55,360 --> 00:02:00,030
of Y and the PDF of X. So that
is the process for calculating
43
00:02:00,030 --> 00:02:03,380
the PDF of Y using X.
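The four steps just described can be written compactly in symbols. This is only a sketch in generic notation: the subscripts on F and f and the lowercase f for a PDF are conventions assumed here, and how step 3 works out depends on the particular function g.

```latex
\begin{align*}
\text{Step 1:}\quad & F_Y(y) = P(Y \le y) \\
\text{Step 2:}\quad & F_Y(y) = P\bigl(g(X) \le y\bigr) \\
\text{Step 3:}\quad & \text{rewrite } P\bigl(g(X) \le y\bigr)
                      \text{ using } F_X(x) = P(X \le x) \\
\text{Step 4:}\quad & f_Y(y) = \frac{d}{dy}\, F_Y(y)
\end{align*}
```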
44
00:02:03,380 --> 00:02:05,050
So let's go into our
specific example.
45
00:02:05,050 --> 00:02:08,530
In this case, what we're told
is that X, the one that we
46
00:02:08,530 --> 00:02:11,200
know, is a standard normal
random variable.
47
00:02:11,200 --> 00:02:14,300
Meaning that it's mean
0 and variance 1.
48
00:02:14,300 --> 00:02:17,040
And so we know the
form of the PDF.
49
00:02:17,040 --> 00:02:20,130
The PDF of x is this, 1 over
square root of 2 pi e to the
50
00:02:20,130 --> 00:02:22,910
minus x squared over 2.
51
00:02:22,910 --> 00:02:24,680
And then the next thing that
we're told is this
52
00:02:24,680 --> 00:02:32,190
relationship between X and Y. So
what we're told is, if X is
53
00:02:32,190 --> 00:02:37,940
negative, then Y is minus X.
If X is positive, then Y is
54
00:02:37,940 --> 00:02:40,820
the square root of X. So
this is a graphical
55
00:02:40,820 --> 00:02:45,030
representation of the
relationship between X and Y.
56
00:02:45,030 --> 00:02:48,310
All right, so we have everything
that we need.
57
00:02:48,310 --> 00:02:50,820
And now let's just go through
this process and calculate
58
00:02:50,820 --> 00:02:52,420
what the PDF of Y is.
59
00:02:52,420 --> 00:02:57,690
So the first thing we do is we
write out the CDF of Y. So the
60
00:02:57,690 --> 00:03:01,245
CDF of Y is what
we've written.
61
00:03:01,245 --> 00:03:05,160
It's the probability that the
random variable Y is less than
62
00:03:05,160 --> 00:03:06,410
or equal to some little y.
63
00:03:06,410 --> 00:03:08,480
64
00:03:08,480 --> 00:03:13,040
Now the next step that we do is
we have to substitute in,
65
00:03:13,040 --> 00:03:15,120
instead of in terms of Y, we
want to substitute it in terms
66
00:03:15,120 --> 00:03:18,420
of X. Because we actually know
stuff about X, but we don't
67
00:03:18,420 --> 00:03:22,440
know anything about Y. So what
is the probability that Y, the
68
00:03:22,440 --> 00:03:23,516
random variable Y, is
less than or equal
69
00:03:23,516 --> 00:03:24,820
to some little y?
70
00:03:24,820 --> 00:03:27,420
Well, let's go back to this
relationship and see if we can
71
00:03:27,420 --> 00:03:28,240
figure that out.
72
00:03:28,240 --> 00:03:33,590
So let's pretend that here
is our little y.
73
00:03:33,590 --> 00:03:37,402
Well, if the random variable
Y is less than or equal to
74
00:03:37,402 --> 00:03:39,160
little y, it has to
be underneath
75
00:03:39,160 --> 00:03:41,760
this horizontal line.
76
00:03:41,760 --> 00:03:44,360
And in order for it to be
underneath this horizontal
77
00:03:44,360 --> 00:03:50,110
line, that means that X has
to be between this range.
78
00:03:50,110 --> 00:03:51,050
And what is this range?
79
00:03:51,050 --> 00:03:56,790
This range goes from minus
y to y squared.
80
00:03:56,790 --> 00:03:57,810
So why is that?
81
00:03:57,810 --> 00:04:03,160
It's because in this portion X
and Y are related as, Y is
82
00:04:03,160 --> 00:04:08,120
negative X and here it's Y is
square root of X. So if X is y
83
00:04:08,120 --> 00:04:11,920
squared, then Y would be y. If
X is negative y, then Y would
84
00:04:11,920 --> 00:04:16,130
be y. All right, so this
is the range that
85
00:04:16,130 --> 00:04:18,480
we're looking for.
86
00:04:18,480 --> 00:04:20,870
So if Y, the random variable
Y is less than or equal to
87
00:04:20,870 --> 00:04:27,580
little y, then this is the same
as if the random variable
88
00:04:27,580 --> 00:04:32,700
X is between negative
y and y squared.
89
00:04:32,700 --> 00:04:34,440
So let's plug that in.
90
00:04:34,440 --> 00:04:38,970
This is the same as the
probability that X is between
91
00:04:38,970 --> 00:04:44,040
negative y and y squared.
92
00:04:44,040 --> 00:04:46,280
So those are the first
two steps.
93
00:04:46,280 --> 00:04:49,070
Now the third step is, we
have to rewrite this
94
00:04:49,070 --> 00:04:51,300
as the CDF of x.
95
00:04:51,300 --> 00:04:56,530
So right now we have it in terms
of a probability of some
96
00:04:56,530 --> 00:04:59,760
event related to X. Let's
actually transform that to be
97
00:04:59,760 --> 00:05:04,840
explicitly in terms of the CDF
of X. So how do we do that?
98
00:05:04,840 --> 00:05:06,840
Well, this is just the
probability that X is within
99
00:05:06,840 --> 00:05:07,750
some range.
100
00:05:07,750 --> 00:05:11,710
So we can turn that into the
CDF by writing it as a
101
00:05:11,710 --> 00:05:13,650
difference of two CDFs.
102
00:05:13,650 --> 00:05:17,600
So this is the same as the
probability that X is less
103
00:05:17,600 --> 00:05:23,730
than or equal to y squared minus
the probability that X
104
00:05:23,730 --> 00:05:27,170
is less than or equal
to negative y.
105
00:05:27,170 --> 00:05:30,230
So in order to find the
probability that X is between
106
00:05:30,230 --> 00:05:35,160
this range, we take the
probability that it's less
107
00:05:35,160 --> 00:05:36,410
than y squared, which
is everything here.
108
00:05:36,410 --> 00:05:39,410
And then we subtract that
probability that it's less
109
00:05:39,410 --> 00:05:42,450
than negative y. So what
we're left with is just within
110
00:05:42,450 --> 00:05:44,490
this range.
111
00:05:44,490 --> 00:05:52,810
So these actually are now
exactly CDFs of X. So this is
112
00:05:52,810 --> 00:05:58,150
F of X evaluated at y squared
and this is F of X evaluated
113
00:05:58,150 --> 00:06:03,230
at negative y. So now we've
completed step three.
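Written in symbols, steps one through three for this example read roughly as follows. This is a sketch for y greater than or equal to 0, using lowercase y for the value and uppercase Y for the random variable; the subscript notation is an assumption made here, not something shown in the transcript.

```latex
\begin{align*}
F_Y(y) &= P(Y \le y) \\
       &= P(-y \le X \le y^2) \\
       &= P(X \le y^2) - P(X \le -y) \\
       &= F_X(y^2) - F_X(-y), \qquad y \ge 0 .
\end{align*}
```

Because X is continuous, including or excluding the endpoint at negative y does not change the probability.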
114
00:06:03,230 --> 00:06:05,570
And the last step that we need
to do is differentiate.
115
00:06:05,570 --> 00:06:08,730
So if we differentiate both
sides of this equation with
116
00:06:08,730 --> 00:06:14,070
respect to y, the left side
gives us exactly what
117
00:06:14,070 --> 00:06:18,260
we want, which is the PDF of
Y. Now we differentiate the
118
00:06:18,260 --> 00:06:19,250
right side--
119
00:06:19,250 --> 00:06:21,460
we'll have to invoke
the chain rule.
120
00:06:21,460 --> 00:06:27,140
So the first thing that we do
is, well, this is a CDF of X.
121
00:06:27,140 --> 00:06:32,660
So when we differentiate
we'll get the PDF of X.
122
00:06:32,660 --> 00:06:35,230
But then we also have to invoke
the chain rule for this
123
00:06:35,230 --> 00:06:36,010
argument inside.
124
00:06:36,010 --> 00:06:38,510
So the derivative of y
squared would give us
125
00:06:38,510 --> 00:06:42,340
an extra term, 2y.
126
00:06:42,340 --> 00:06:46,800
And then similarly this would
give us the PDF of X evaluated
127
00:06:46,800 --> 00:06:51,540
at negative y, plus the chain rule
will give us an extra term of
128
00:06:51,540 --> 00:06:53,890
negative 1.
129
00:06:53,890 --> 00:06:56,360
So let's just clean this
up a little bit.
130
00:06:56,360 --> 00:07:07,260
So it's 2y f sub X of y squared plus f
sub X of negative y. All right, so now
131
00:07:07,260 --> 00:07:07,850
we're almost done.
132
00:07:07,850 --> 00:07:08,690
We've differentiated.
133
00:07:08,690 --> 00:07:11,460
We have the PDF of Y, which
is what we're looking for.
134
00:07:11,460 --> 00:07:15,260
And we've written it in terms
of the PDF of X. And
135
00:07:15,260 --> 00:07:18,580
fortunately we know what that
is, so once we plug that in,
136
00:07:18,580 --> 00:07:20,280
then we're essentially done.
137
00:07:20,280 --> 00:07:22,440
So what is the PDF?
138
00:07:22,440 --> 00:07:25,570
Well, the PDF of X evaluated at
y squared is going to give
139
00:07:25,570 --> 00:07:32,250
us 1 over square root of
2 pi e to the minus--
140
00:07:32,250 --> 00:07:36,610
so in this case, X
is y squared--
141
00:07:36,610 --> 00:07:41,220
so we get y to the
fourth over 2.
142
00:07:41,220 --> 00:07:45,750
And then we get another 1 over
square root of 2 pi e to the
143
00:07:45,750 --> 00:07:49,460
minus y squared over 2.
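Putting the differentiation step and the substitution together, the spoken result can be written out as below. This is a sketch in assumed notation (lowercase f for a PDF), and it holds on the range of y that the lecture settles right after this point.

```latex
\begin{align*}
f_Y(y) &= \frac{d}{dy}\Bigl[ F_X(y^2) - F_X(-y) \Bigr] \\
       &= 2y\, f_X(y^2) + f_X(-y) \\
       &= \frac{1}{\sqrt{2\pi}} \Bigl( 2y\, e^{-y^4/2} + e^{-y^2/2} \Bigr)
\end{align*}
```

The minus sign in front of the second CDF and the factor of negative 1 from the chain rule cancel, which is why the second term enters with a plus sign.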
144
00:07:49,460 --> 00:07:51,880
OK, and now we're almost done.
145
00:07:51,880 --> 00:07:53,400
The last thing that we
need to take care of
146
00:07:53,400 --> 00:07:55,050
is, what is the range?
147
00:07:55,050 --> 00:07:58,030
Now remember, it's important
when you calculate out PDFs to
148
00:07:58,030 --> 00:08:00,930
always think about the ranges
where things are valid.
149
00:08:00,930 --> 00:08:05,510
So when we think about this,
what is the range where this
150
00:08:05,510 --> 00:08:06,970
actually is valid?
151
00:08:06,970 --> 00:08:12,200
Well, Y, remember is related
to X in this relationship.
152
00:08:12,200 --> 00:08:17,560
So as we look at this, we see
that Y can never be negative.
153
00:08:17,560 --> 00:08:22,040
Because no matter what X is, it
gets transformed into some
154
00:08:22,040 --> 00:08:24,250
non-negative version.
155
00:08:24,250 --> 00:08:30,650
So what we know is that this is
now actually valid only for
156
00:08:30,650 --> 00:08:38,480
y greater than 0, and for y less
than 0, the PDF is 0.
157
00:08:38,480 --> 00:08:50,080
So this gives us the
final PDF of Y.
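One way to sanity-check a derived PDF like this one is to compare it numerically against simulation. The sketch below is not part of the lecture: the function names (`pdf_y`, `cdf_y`, `sample_y`, `integrate`), the midpoint integration rule, the seed, and the sample size are all illustrative assumptions. It only uses the Python standard library.

```python
import math
import random


def pdf_y(y):
    """Derived PDF: 2y f_X(y^2) + f_X(-y) for y > 0, and 0 otherwise."""
    if y <= 0:
        return 0.0
    c = 1.0 / math.sqrt(2.0 * math.pi)
    return c * (2.0 * y * math.exp(-y**4 / 2.0) + math.exp(-y**2 / 2.0))


def std_normal_cdf(x):
    """CDF of a standard normal, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cdf_y(y):
    """Derived CDF: F_X(y^2) - F_X(-y) for y >= 0, and 0 otherwise."""
    if y < 0:
        return 0.0
    return std_normal_cdf(y * y) - std_normal_cdf(-y)


def sample_y(rng):
    """Draw X ~ N(0, 1) and apply the map: -X if X < 0, sqrt(X) otherwise."""
    x = rng.gauss(0.0, 1.0)
    return -x if x < 0 else math.sqrt(x)


def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    samples = [sample_y(rng) for _ in range(200_000)]

    # A valid PDF should integrate to (approximately) 1.
    print(f"integral of pdf over [0, 10]: {integrate(pdf_y, 0.0, 10.0):.4f}")

    # The empirical CDF from simulation should track the derived CDF.
    for y in (0.5, 1.0, 2.0):
        empirical = sum(s <= y for s in samples) / len(samples)
        print(f"y={y}: empirical {empirical:.4f}, derived {cdf_y(y):.4f}")
```

If the derivation is right, the integral should come out close to 1 and the empirical and derived CDF values should agree to within sampling noise.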
158
00:08:50,080 --> 00:08:53,600
All right, so it seems like at
first when you start doing
159
00:08:53,600 --> 00:08:54,990
these derived distribution
problems
160
00:08:54,990 --> 00:08:56,970
that it's pretty difficult.
161
00:08:56,970 --> 00:09:00,830
But if we just remember that
there are these pretty
162
00:09:00,830 --> 00:09:03,650
straightforward steps that we
follow, and as long as you go
163
00:09:03,650 --> 00:09:06,380
through these steps and do them
methodically, then you
164
00:09:06,380 --> 00:09:08,340
can actually come up with
the solution for
165
00:09:08,340 --> 00:09:10,140
any of these problems.
166
00:09:10,140 --> 00:09:14,210
And one last thing to remember
is to always think about what
167
00:09:14,210 --> 00:09:16,050
are the ranges where these
things are valid?
168
00:09:16,050 --> 00:09:18,610
Because the relationship between
these two random
169
00:09:18,610 --> 00:09:21,050
variables could be pretty
complicated and you need to
170
00:09:21,050 --> 00:09:24,190
always be aware of when things
are non-zero and
171
00:09:24,190 --> 00:09:25,440
when they are 0.
172
00:09:25,440 --> 00:09:26,333