WEBVTT
00:00:00.880 --> 00:00:00.940
Hi.
00:00:00.940 --> 00:00:03.980
In this problem we'll work
through an example of
00:00:03.980 --> 00:00:08.150
calculating a distribution for
a random variable using the
00:00:08.150 --> 00:00:10.060
method of derived
distributions.
00:00:10.060 --> 00:00:13.360
So in general, the process
goes as follows.
00:00:13.360 --> 00:00:16.640
We know the distribution for
some random variable X and
00:00:16.640 --> 00:00:19.090
what we want is the distribution
for another
00:00:19.090 --> 00:00:22.070
random variable Y, which is
somehow related to X through
00:00:22.070 --> 00:00:23.450
some function g.
00:00:23.450 --> 00:00:25.630
So Y is g of X.
00:00:25.630 --> 00:00:28.230
And the steps that we follow--
00:00:28.230 --> 00:00:29.900
we can actually just kind
of summarize them
00:00:29.900 --> 00:00:31.570
using these four steps.
00:00:31.570 --> 00:00:35.290
The first step is to write out
the CDF of Y. So Y is the thing
00:00:35.290 --> 00:00:35.870
that we want.
00:00:35.870 --> 00:00:38.380
And what we'll do is we'll
write out the CDF first.
00:00:38.380 --> 00:00:43.060
So remember the CDF is just
capital F sub Y of little y, which is the
00:00:43.060 --> 00:00:45.590
probability that random variable
Y is less than or
00:00:45.590 --> 00:00:48.960
equal to some value, little y.
00:00:48.960 --> 00:00:51.280
The next thing we'll do is,
we'll use this relationship
00:00:51.280 --> 00:00:54.990
that we know, between Y and X.
And we'll substitute in,
00:00:54.990 --> 00:00:57.960
instead of writing the random
variable Y in here, we'll
00:00:57.960 --> 00:01:01.770
write it in terms of X. So we'll
plug in for-- instead of
00:01:01.770 --> 00:01:05.140
Y, we'll plug in X. And we'll
use this function g in order
00:01:05.140 --> 00:01:06.510
to do that.
00:01:06.510 --> 00:01:09.570
So what we have now is that up
to here, we would have that
00:01:09.570 --> 00:01:12.560
the CDF of Y is now the
probability that the random
00:01:12.560 --> 00:01:16.240
variable g of X is less than or equal
to some value, little y.
00:01:16.240 --> 00:01:18.190
Next what we'll do is we'll
actually rewrite this
00:01:18.190 --> 00:01:22.810
probability as a CDF of
X. So the CDF of X,
00:01:22.810 --> 00:01:25.020
remember, would be--
00:01:25.020 --> 00:01:31.130
F of x is the probability
that X is less than or equal to
00:01:31.130 --> 00:01:33.440
some little x.
00:01:33.440 --> 00:01:36.145
And then once we have that,
if we differentiate this--
00:01:38.860 --> 00:01:42.490
when we differentiate the CDF of
X, we get the PDF of X. And
00:01:42.490 --> 00:01:45.650
what we presume is that we
know this PDF already.
00:01:45.650 --> 00:01:49.760
And from that, what we get is,
when we differentiate this
00:01:49.760 --> 00:01:52.850
thing, we get the PDF of Y. So
through this whole process
00:01:52.850 --> 00:01:55.360
what we get is, we'll get the
relationship between the PDF
00:01:55.360 --> 00:02:00.030
of Y and the PDF of X. So that
is the process for calculating
00:02:00.030 --> 00:02:03.380
the PDF of Y using X.
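The four steps just described can be summarized in symbols. This is only a sketch of the general recipe: the subscripted names F_X, F_Y, f_X, f_Y for the CDFs and PDFs are notation assumed here, and how the event in step 2 turns into values of F_X depends on the particular function g.

```latex
\begin{align*}
\text{Step 1:}\quad & F_Y(y) = P(Y \le y) \\
\text{Step 2:}\quad & F_Y(y) = P\big(g(X) \le y\big) \\
\text{Step 3:}\quad & \text{rewrite } P\big(g(X) \le y\big) \text{ in terms of } F_X \\
\text{Step 4:}\quad & f_Y(y) = \frac{d}{dy} F_Y(y)
\end{align*}
```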
00:02:03.380 --> 00:02:05.050
So let's go into our
specific example.
00:02:05.050 --> 00:02:08.530
In this case, what we're told
is that X, the one that we
00:02:08.530 --> 00:02:11.200
know, is a standard normal
random variable.
00:02:11.200 --> 00:02:14.300
Meaning that it's mean
0 and variance 1.
00:02:14.300 --> 00:02:17.040
And so we know the
form of the PDF.
00:02:17.040 --> 00:02:20.130
The PDF of x is this, 1 over
square root of 2 pi e to the
00:02:20.130 --> 00:02:22.910
minus x squared over 2.
00:02:22.910 --> 00:02:24.680
And then the next thing that
we're told is this
00:02:24.680 --> 00:02:32.190
relationship between X and Y. So
what we're told is, if X is
00:02:32.190 --> 00:02:37.940
negative, then Y is minus X.
If X is positive, then Y is
00:02:37.940 --> 00:02:40.820
the square root of X. So
this is a graphical
00:02:40.820 --> 00:02:45.030
representation of the
relationship between X and Y.
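As a rough illustration, the setup of this example can be written as two small Python functions. The names standard_normal_pdf and g are chosen here for illustration only. The case X equal to 0 is not spelled out in the lecture; it is assigned to the square root branch, which gives 0 either way.

```python
import math


def standard_normal_pdf(x: float) -> float:
    """PDF of a standard normal X: exp(-x^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def g(x: float) -> float:
    """Y = g(X): minus x when x is negative, square root of x otherwise.

    Assumption: x == 0 goes to the square-root branch (both branches give 0).
    """
    return -x if x < 0 else math.sqrt(x)


if __name__ == "__main__":
    # Both x = -2 and x = 4 map to the same value of y.
    print(g(-2.0), g(4.0))
```

Note that two different values of X can land on the same Y, which is why the event "Y is at most little y" corresponds to a whole interval of X values.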
00:02:45.030 --> 00:02:48.310
All right, so we have everything
that we need.
00:02:48.310 --> 00:02:50.820
And now let's just go through
this process and calculate
00:02:50.820 --> 00:02:52.420
what the PDF of Y is.
00:02:52.420 --> 00:02:57.690
So the first thing we do is we
write out the CDF of Y. So the
00:02:57.690 --> 00:03:01.245
CDF of Y is what
we've written.
00:03:01.245 --> 00:03:05.160
It's the probability that the
random variable Y is less than
00:03:05.160 --> 00:03:06.410
or equal to some little y.
00:03:08.480 --> 00:03:13.040
Now the next step that we do is
we have to substitute in,
00:03:13.040 --> 00:03:15.120
instead of in terms of Y, we
want to substitute it in terms
00:03:15.120 --> 00:03:18.420
of X. Because we actually know
stuff about X, but we don't
00:03:18.420 --> 00:03:22.440
know anything about Y. So what
is the probability that Y, the
00:03:22.440 --> 00:03:23.516
random variable Y, is
less than or equal
00:03:23.516 --> 00:03:24.820
to some little y?
00:03:24.820 --> 00:03:27.420
Well, let's go back to this
relationship and see if we can
00:03:27.420 --> 00:03:28.240
figure that out.
00:03:28.240 --> 00:03:33.590
So let's pretend that here
is our little y.
00:03:33.590 --> 00:03:37.402
Well, if the random variable
Y is less than or equal to
00:03:37.402 --> 00:03:39.160
little y, it has to
be underneath
00:03:39.160 --> 00:03:41.760
this horizontal line.
00:03:41.760 --> 00:03:44.360
And in order for it to be
underneath this horizontal
00:03:44.360 --> 00:03:50.110
line, that means that X has
to be between this range.
00:03:50.110 --> 00:03:51.050
And what is this range?
00:03:51.050 --> 00:03:56.790
This range goes from minus
Y to Y squared.
00:03:56.790 --> 00:03:57.810
So why is that?
00:03:57.810 --> 00:04:03.160
It's because in this portion X
and Y are related as, Y is
00:04:03.160 --> 00:04:08.120
negative X and here it's Y is
square root of X. So if X is Y
00:04:08.120 --> 00:04:11.920
squared, then Y would be Y. If
X is negative Y, then Y would
00:04:11.920 --> 00:04:16.130
be Y. All right, so this
is the range that
00:04:16.130 --> 00:04:18.480
we're looking for.
00:04:18.480 --> 00:04:20.870
So if Y, the random variable
Y is less than or equal to
00:04:20.870 --> 00:04:27.580
little y, then this is the same
as if the random variable
00:04:27.580 --> 00:04:32.700
X is between negative
Y and Y squared.
00:04:32.700 --> 00:04:34.440
So let's plug that in.
00:04:34.440 --> 00:04:38.970
This is the same as the
probability that X is between
00:04:38.970 --> 00:04:44.040
negative Y and Y squared.
00:04:44.040 --> 00:04:46.280
So those are the first
two steps.
00:04:46.280 --> 00:04:49.070
Now the third step is, we
have to rewrite this
00:04:49.070 --> 00:04:51.300
as the CDF of x.
00:04:51.300 --> 00:04:56.530
So right now we have it in terms
of a probability of some
00:04:56.530 --> 00:04:59.760
event related to X. Let's
actually transform that to be
00:04:59.760 --> 00:05:04.840
explicitly in terms of the CDF
of X. So how do we do that?
00:05:04.840 --> 00:05:06.840
Well, this is just the
probability that X is within
00:05:06.840 --> 00:05:07.750
some range.
00:05:07.750 --> 00:05:11.710
So we can turn that into the
CDF by writing it as a
00:05:11.710 --> 00:05:13.650
difference of two CDFs.
00:05:13.650 --> 00:05:17.600
So this is the same as the
probability that X is less
00:05:17.600 --> 00:05:23.730
than or equal to Y squared minus
the probability that X
00:05:23.730 --> 00:05:27.170
is less than or equal
to negative Y.
00:05:27.170 --> 00:05:30.230
So in order to find the
probability that X is between
00:05:30.230 --> 00:05:35.160
this range, we take the
probability that it's less
00:05:35.160 --> 00:05:36.410
than Y squared, which
is everything here.
00:05:36.410 --> 00:05:39.410
And then we subtract that
probability that it's less
00:05:39.410 --> 00:05:42.450
than negative Y. So what
we're left with is just within
00:05:42.450 --> 00:05:44.490
this range.
00:05:44.490 --> 00:05:52.810
So these actually are now
exactly CDFs of X. So this is
00:05:52.810 --> 00:05:58.150
F of X evaluated at Y squared
and this is F of X evaluated
00:05:58.150 --> 00:06:03.230
at negative Y. So now we've
completed step three.
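One way to sanity-check the result of step three is a small simulation. The sketch below is not part of the lecture: it draws standard normal samples, applies the transformation, and compares the empirical fraction with Y at most little y against the difference of the two CDF values. The function names, the sample size, and the seed are all arbitrary choices made for illustration.

```python
import math
import random


def phi_cdf(x: float) -> float:
    """CDF of a standard normal, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cdf_y(y: float) -> float:
    """Step-three result: F_Y(y) = F_X(y^2) - F_X(-y) for y >= 0, else 0."""
    if y < 0:
        return 0.0
    return phi_cdf(y * y) - phi_cdf(-y)


def empirical_cdf_y(y: float, n: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(Y <= y), where Y = -X if X < 0 else sqrt(X)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y_sample = -x if x < 0 else math.sqrt(x)
        if y_sample <= y:
            hits += 1
    return hits / n


if __name__ == "__main__":
    for y in (0.5, 1.0, 1.5):
        print(y, cdf_y(y), empirical_cdf_y(y))
```

The two columns should agree up to sampling noise, which shrinks as the number of samples grows.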
00:06:03.230 --> 00:06:05.570
And the last step that we need
to do is differentiate.
00:06:05.570 --> 00:06:08.730
So if we differentiate both
sides of this equation with
00:06:08.730 --> 00:06:14.070
respect to Y, we'll get that the
left side would get what
00:06:14.070 --> 00:06:18.260
we want, which is the PDF of
Y. Now we differentiate the
00:06:18.260 --> 00:06:19.250
right side--
00:06:19.250 --> 00:06:21.460
we'll have to invoke
the chain rule.
00:06:21.460 --> 00:06:27.140
So the first thing that we do
is, well, this is a CDF of X.
00:06:27.140 --> 00:06:32.660
So when we differentiate
we'll get the PDF of X.
00:06:32.660 --> 00:06:35.230
But then we also have to invoke
the chain rule for this
00:06:35.230 --> 00:06:36.010
argument inside.
00:06:36.010 --> 00:06:38.510
So the derivative of Y
squared would give us
00:06:38.510 --> 00:06:42.340
an extra term, 2Y.
00:06:42.340 --> 00:06:46.800
And then similarly this would
give us the PDF of X evaluated
00:06:46.800 --> 00:06:51.540
at negative Y, plus the chain rule
will give us an extra term of
00:06:51.540 --> 00:06:53.890
negative 1.
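The chain rule step just described can be written out as follows. The subscripted symbols F_X and f_X for the CDF and PDF of X, and f_Y for the PDF of Y, are notation assumed here; the expression applies for positive y.

```latex
\begin{align*}
f_Y(y) &= \frac{d}{dy}\Big[ F_X(y^2) - F_X(-y) \Big] \\
       &= f_X(y^2)\cdot 2y \;-\; f_X(-y)\cdot(-1) \\
       &= 2y\, f_X(y^2) + f_X(-y)
\end{align*}
```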
00:06:53.890 --> 00:06:56.360
So let's just clean this
up a little bit.
00:06:56.360 --> 00:07:07.260
So it's 2y f X of y squared plus f
X of minus y. All right, so now
00:07:07.260 --> 00:07:07.850
we're almost done.
00:07:07.850 --> 00:07:08.690
We've differentiated.
00:07:08.690 --> 00:07:11.460
We have the PDF of Y, which
is what we're looking for.
00:07:11.460 --> 00:07:15.260
And we've written it in terms
of the PDF of X. And
00:07:15.260 --> 00:07:18.580
fortunately we know what that
is, so once we plug that in,
00:07:18.580 --> 00:07:20.280
then we're essentially done.
00:07:20.280 --> 00:07:22.440
So what is the PDF?
00:07:22.440 --> 00:07:25.570
Well, the PDF of X evaluated at
Y squared is going to give
00:07:25.570 --> 00:07:32.250
us 1 over square root of
2 pi e to the minus--
00:07:32.250 --> 00:07:36.610
so in this case, X
is Y squared--
00:07:36.610 --> 00:07:41.220
so we get Y to the
fourth over 2.
00:07:41.220 --> 00:07:45.750
And then we get another 1 over
square root of 2 pi e to the
00:07:45.750 --> 00:07:49.460
minus Y squared over 2.
00:07:49.460 --> 00:07:51.880
OK, and now we're almost done.
00:07:51.880 --> 00:07:53.400
The last thing that we
need to take care of
00:07:53.400 --> 00:07:55.050
is, what is the range?
00:07:55.050 --> 00:07:58.030
Now remember, it's important
when you calculate out PDFs to
00:07:58.030 --> 00:08:00.930
always think about the ranges
where things are valid.
00:08:00.930 --> 00:08:05.510
So when we think about this,
what is the range where this
00:08:05.510 --> 00:08:06.970
actually is valid?
00:08:06.970 --> 00:08:12.200
Well, Y, remember is related
to X in this relationship.
00:08:12.200 --> 00:08:17.560
So as we look at this, we see
that Y can never be negative.
00:08:17.560 --> 00:08:22.040
Because no matter what X is, it
gets transformed into some
00:08:22.040 --> 00:08:24.250
non-negative version.
00:08:24.250 --> 00:08:30.650
So what we know is that this is
now actually valid only for
00:08:30.650 --> 00:08:38.480
Y greater than 0 and for Y less
than 0, the PDF is 0.
00:08:38.480 --> 00:08:50.080
So this gives us the
final PDF of Y.
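A quick numerical check, not part of the lecture, is to confirm that this final PDF integrates to 1. The sketch below uses a plain midpoint rule; the names pdf_y and integrate, the upper limit of 10, and the step count are illustrative assumptions. The single point y equal to 0 is included in the nonzero branch, which does not affect the integral.

```python
import math


def pdf_y(y: float) -> float:
    """Final PDF of Y: (2y e^{-y^4/2} + e^{-y^2/2}) / sqrt(2 pi) for y >= 0, else 0."""
    if y < 0:
        return 0.0
    c = 1.0 / math.sqrt(2.0 * math.pi)
    return c * (2.0 * y * math.exp(-(y ** 4) / 2.0) + math.exp(-(y * y) / 2.0))


def integrate(f, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# The density is negligible beyond y = 10, so [0, 10] stands in for [0, infinity).
total = integrate(pdf_y, 0.0, 10.0)

if __name__ == "__main__":
    print(total)  # should be very close to 1
```

Each of the two terms contributes one half: the first comes from the positive values of X and the second from the negative values.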
00:08:50.080 --> 00:08:53.600
All right, so it seems like at
first when you start doing
00:08:53.600 --> 00:08:54.990
these derived distribution
problems
00:08:54.990 --> 00:08:56.970
that it's pretty difficult.
00:08:56.970 --> 00:09:00.830
But if we just remember that
there are these pretty
00:09:00.830 --> 00:09:03.650
straightforward steps that we
follow, and as long as you go
00:09:03.650 --> 00:09:06.380
through these steps and do them
methodically, then you
00:09:06.380 --> 00:09:08.340
can actually come up with
the solution for
00:09:08.340 --> 00:09:10.140
any of these problems.
00:09:10.140 --> 00:09:14.210
And one last thing to remember
is to always think about what
00:09:14.210 --> 00:09:16.050
are the ranges where these
things are valid?
00:09:16.050 --> 00:09:18.610
Because the relationship between
these two random
00:09:18.610 --> 00:09:21.050
variables could be pretty
complicated and you need to
00:09:21.050 --> 00:09:24.190
always be aware of when things
are non-zero and
00:09:24.190 --> 00:09:25.440
when they are 0.