WEBVTT

00:00:00.700 --> 00:00:04.110
In this important segment, we
will develop a method for

00:00:04.110 --> 00:00:08.119
finding the PDF of a general
function of a continuous

00:00:08.119 --> 00:00:11.970
random variable, a function g of
X, which, in general, could

00:00:11.970 --> 00:00:13.970
be nonlinear.

00:00:13.970 --> 00:00:17.780
The method is very general
and involves two steps.

00:00:17.780 --> 00:00:22.310
The first step is to find the
CDF of Y. And then the second

00:00:22.310 --> 00:00:25.290
step is to take the derivative
of the CDF and

00:00:25.290 --> 00:00:27.210
then find the PDF.

00:00:27.210 --> 00:00:31.130
Most of the work lies here in
finding the CDF of Y. And how

00:00:31.130 --> 00:00:32.530
do we do that?

00:00:32.530 --> 00:00:36.630
Well, since Y is a function of
the random variable X, we

00:00:36.630 --> 00:00:41.130
replace Y by g of X. And now
we're dealing with a

00:00:41.130 --> 00:00:45.080
probability problem that
involves a random variable, X,

00:00:45.080 --> 00:00:46.810
with a known PDF.

00:00:46.810 --> 00:00:50.010
And we somehow calculate
this probability.

00:00:50.010 --> 00:00:52.800
So let us illustrate
this procedure

00:00:52.800 --> 00:00:54.050
through some examples.

00:00:56.350 --> 00:01:01.830
In our first example, we let X
be a random variable which is

00:01:01.830 --> 00:01:05.720
uniform on the range
from 0 to 2.

00:01:08.930 --> 00:01:11.520
And so the height of
the PDF is 1/2.

00:01:14.090 --> 00:01:18.860
And we wish to find the PDF of
the random variable Y which is

00:01:18.860 --> 00:01:21.720
defined as X cubed.

00:01:21.720 --> 00:01:25.600
So since X goes all the way
up to 2, Y goes all

00:01:25.600 --> 00:01:27.230
the way up to 8.

00:01:31.780 --> 00:01:44.430
The first step is to find the
CDF of Y. And since Y is a

00:01:44.430 --> 00:01:50.580
specific function of X, we
replace that functional form.

00:01:50.580 --> 00:01:54.759
And we write it this way.

00:01:54.759 --> 00:01:57.630
So we want to calculate the
probability that x cubed is

00:01:57.630 --> 00:02:01.550
less than or equal to
a certain number y.

00:02:01.550 --> 00:02:06.500
Let us take cubic roots of both
sides of this inequality.

00:02:06.500 --> 00:02:10.478
This is the same as the
probability that X is less

00:02:10.478 --> 00:02:16.480
than or equal to y to the 1/3.

00:02:16.480 --> 00:02:22.280
Now, we only care about values
of y that are between 0 and 8.

00:02:22.280 --> 00:02:26.780
So this calculation is going to
be for those values of y.

00:02:26.780 --> 00:02:31.710
For other values of y, we know
that the PDF is equal to 0.

00:02:31.710 --> 00:02:35.576
And there's no work that
needs to be done there.

00:02:35.576 --> 00:02:37.430
OK.

00:02:37.430 --> 00:02:43.030
Now, y is less than or equal to
8, so the cubic root of y

00:02:43.030 --> 00:02:45.070
is less than or equal to 2.

00:02:45.070 --> 00:02:49.550
So y to the 1/3 is going
to be a number

00:02:49.550 --> 00:02:51.280
somewhere in this range.

00:02:51.280 --> 00:02:54.900
Let's say this number.

00:02:54.900 --> 00:02:57.180
We want the probability
that X is less than or

00:02:57.180 --> 00:02:58.640
equal to that value.

00:02:58.640 --> 00:03:04.480
So that probability is equal to
this area under the PDF of

00:03:04.480 --> 00:03:09.330
X. And since it is uniform,
this area is easy to find.

00:03:09.330 --> 00:03:14.080
It's the height, which is
1/2 times the base,

00:03:14.080 --> 00:03:16.820
which is y to the 1/3.

00:03:16.820 --> 00:03:20.750
So we continue this calculation,
and we get 1/2

00:03:20.750 --> 00:03:23.620
times y to the 1/3.

00:03:23.620 --> 00:03:28.450
So this is the formula for the
CDF of Y for values of little

00:03:28.450 --> 00:03:31.200
y between 0 and 8.

00:03:31.200 --> 00:03:32.730
This completes step one.

00:03:32.730 --> 00:03:36.490
The second step is
simple calculus.

00:03:36.490 --> 00:03:39.270
We just need to take the
derivative of the CDF.

00:03:44.079 --> 00:03:53.340
And the derivative is 1/2 times
1/3, this exponent, y to

00:03:53.340 --> 00:03:58.250
the power of minus 2/3.

00:03:58.250 --> 00:04:05.400
Or in a cleaner form,
1/6 times 1 over y

00:04:05.400 --> 00:04:06.810
to the power 2/3.

00:04:09.740 --> 00:04:17.209
So the form of this PDF is
not a constant anymore.

00:04:17.209 --> 00:04:20.070
Y is not a uniform
random variable.

00:04:20.070 --> 00:04:24.390
The PDF becomes larger and
larger as y approaches 0.

00:04:24.390 --> 00:04:29.280
And in fact, in this example,
it even blows up when y

00:04:29.280 --> 00:04:33.159
becomes closer and
closer to 0.

00:04:33.159 --> 00:04:38.050
So this is the shape
of the PDF of Y.

00:04:38.050 --> 00:04:42.150
Our second example
is as follows.

00:04:42.150 --> 00:04:46.840
You go to the gym, you jump on
the treadmill, and you set the

00:04:46.840 --> 00:04:53.210
speed on the treadmill to some
random value which we call X.

00:04:53.210 --> 00:04:58.400
And that random value is
somewhere between 5 and 10

00:04:58.400 --> 00:05:00.000
kilometers per hour.

00:05:00.000 --> 00:05:03.780
And the way that you set it is
chosen at random and uniformly

00:05:03.780 --> 00:05:06.200
over this interval.

00:05:06.200 --> 00:05:09.520
So X is uniformly distributed
on the interval

00:05:09.520 --> 00:05:12.630
between 5 and 10.

00:05:12.630 --> 00:05:15.930
You want to run a total
of 10 kilometers.

00:05:15.930 --> 00:05:18.470
How long is it going
to take you?

00:05:18.470 --> 00:05:24.180
Let the time it takes you be
denoted by Y. And the time

00:05:24.180 --> 00:05:27.120
it's going to take you is the
distance you want to travel,

00:05:27.120 --> 00:05:30.720
which is 10 divided
by the speed with

00:05:30.720 --> 00:05:32.240
which you will be going.

00:05:32.240 --> 00:05:36.070
So the random variable y is
defined in terms of x through

00:05:36.070 --> 00:05:38.920
this particular expression.

00:05:38.920 --> 00:05:42.270
We want to find the PDF of y.

00:05:42.270 --> 00:05:49.320
First let us look at the range
of the random variable Y.

00:05:49.320 --> 00:05:54.590
Since x takes values between
5 and 10, Y takes values

00:05:54.590 --> 00:05:56.370
between 1 and 2.

00:06:02.240 --> 00:06:06.140
Therefore, the PDF of
Y is going to be 0

00:06:06.140 --> 00:06:07.990
outside that range.

00:06:07.990 --> 00:06:12.990
And let us now focus on values
of Y that belong to this

00:06:12.990 --> 00:06:14.950
interesting range.

00:06:14.950 --> 00:06:18.980
So 1 less than y less
than or equal to 2.

00:06:18.980 --> 00:06:21.740
And now we start with our
two-step program.

00:06:21.740 --> 00:06:27.060
We want to find the CDF of Y,
namely, the probability that

00:06:27.060 --> 00:06:31.350
capital Y takes a value less
than or equal to a certain

00:06:31.350 --> 00:06:33.950
little y in this range.

00:06:33.950 --> 00:06:39.540
We recall the definition of
capital Y. So now we're

00:06:39.540 --> 00:06:42.970
dealing with a probability
problem that involves the

00:06:42.970 --> 00:06:45.320
random variable capital
X, whose

00:06:45.320 --> 00:06:48.470
distribution is given to us.

00:06:48.470 --> 00:06:53.130
Now, we rewrite this
event as follows.

00:06:53.130 --> 00:06:55.409
We move X to the other side.

00:06:55.409 --> 00:06:59.730
This is the probability that X
is larger than or equal after

00:06:59.730 --> 00:07:03.380
we move the little y also
to the left-hand side.

00:07:03.380 --> 00:07:08.540
X being larger than or equal
to 10 over little y.

00:07:08.540 --> 00:07:12.340
Now, y is between 1 and 2.

00:07:12.340 --> 00:07:17.160
10/y is going to be a number
between 5 and 10.

00:07:17.160 --> 00:07:21.675
So 10/y is going to be somewhere
in this range.

00:07:24.210 --> 00:07:27.230
We're interested in the
probability that X is larger

00:07:27.230 --> 00:07:29.570
than or equal to that number.

00:07:29.570 --> 00:07:33.760
And this probability is going
to be the area of this

00:07:33.760 --> 00:07:36.670
rectangle here.

00:07:36.670 --> 00:07:42.070
And the area of that rectangle
is equal to the height of the

00:07:42.070 --> 00:07:43.240
rectangle--

00:07:43.240 --> 00:07:46.970
now, the height of this
rectangle is going to be 1/5.

00:07:46.970 --> 00:07:49.890
This is the choice that makes
the total area under this

00:07:49.890 --> 00:07:54.120
curve be equal to 1--

00:07:54.120 --> 00:07:55.600
times the base.

00:07:55.600 --> 00:07:58.240
And the length of the
base is this number

00:07:58.240 --> 00:08:00.770
10 minus that number.

00:08:00.770 --> 00:08:02.910
It's 10 minus 10/y.

00:08:06.640 --> 00:08:13.030
So this is the form of the CDF
of Y for y's in this range.

00:08:13.030 --> 00:08:18.420
To find the PDF of Y, we just
take the derivative.

00:08:18.420 --> 00:08:24.710
And we get 1/5 times the
derivative of this term, which

00:08:24.710 --> 00:08:31.944
is minus 10, divided
by y squared.

00:08:34.500 --> 00:08:37.530
But when we take the derivative
of 1/y, that gives

00:08:37.530 --> 00:08:39.740
us another minus sign.

00:08:39.740 --> 00:08:42.110
The two minus signs
cancel, and we

00:08:42.110 --> 00:08:46.410
obtain 2 over y squared.

00:08:46.410 --> 00:08:51.790
And if you wish to plot
this, it starts at 2.

00:08:51.790 --> 00:08:56.933
And then as y increases, the
PDF actually decreases.

00:08:59.500 --> 00:09:03.990
And this is the form of the PDF
of the random variable y.

00:09:03.990 --> 00:09:09.010
This is the form which is true
when y lies between 1 and 2.

00:09:09.010 --> 00:09:13.340
And of course, the PDF is
going to be 0 for other

00:09:13.340 --> 00:09:15.040
choices of little y.

00:09:18.990 --> 00:09:23.410
So what we have seen here is a
pretty systematic approach

00:09:23.410 --> 00:09:27.820
towards finding the PDF of the
random variable Y. Again, the

00:09:27.820 --> 00:09:32.450
first step is to look at the
CDF, write the CDF in terms of

00:09:32.450 --> 00:09:36.050
the random variable X, whose
distribution is known, and

00:09:36.050 --> 00:09:38.950
then solve a probability problem
that involves this

00:09:38.950 --> 00:09:41.230
particular random variable.

00:09:41.230 --> 00:09:44.230
And then in the last step, we
just need to differentiate the

00:09:44.230 --> 00:09:46.320
CDF in order to obtain the PDF.