WEBVTT
00:00:01.110 --> 00:00:04.280
We now move to the case of
continuous random variables.
00:00:04.280 --> 00:00:07.660
We will start with a special
case where we want to find the
00:00:07.660 --> 00:00:12.810
PDF of a linear function of a
continuous random variable.
00:00:12.810 --> 00:00:17.780
We will start by considering a
simple example, and study it
00:00:17.780 --> 00:00:20.410
using an intuitive argument.
00:00:20.410 --> 00:00:23.580
And afterwards, we will justify
our conclusions
00:00:23.580 --> 00:00:24.870
mathematically.
00:00:24.870 --> 00:00:28.190
So we start with a random
variable X that has a PDF of
00:00:28.190 --> 00:00:31.560
the form shown in this figure
so that it is a piecewise
00:00:31.560 --> 00:00:33.510
constant PDF.
00:00:33.510 --> 00:00:36.650
We then consider a random
variable Z, which is defined
00:00:36.650 --> 00:00:42.180
to be 2 times X. The random
variable X takes values
00:00:42.180 --> 00:00:44.250
between minus 1 and 1.
00:00:44.250 --> 00:00:51.560
So Z takes values between
minus 2 and 2.
00:00:51.560 --> 00:00:56.500
Now, values of X between minus
1 and 0 correspond to values
00:00:56.500 --> 00:00:59.810
of Z between minus 2 and 0.
00:00:59.810 --> 00:01:03.780
The different values of X in
this range are, in some sense,
00:01:03.780 --> 00:01:06.750
equally likely, because we
have a constant PDF.
00:01:06.750 --> 00:01:09.300
And that argues that the
corresponding values of Z
00:01:09.300 --> 00:01:12.200
should also be, in some
sense, equally likely.
00:01:12.200 --> 00:01:16.180
So the PDF should be constant
over this range.
00:01:16.180 --> 00:01:20.930
By a similar argument, the PDF
of Z should also be constant
00:01:20.930 --> 00:01:23.480
over the range from 0 to 2.
00:01:23.480 --> 00:01:28.200
And the PDF must, of course,
be 0 outside this range,
00:01:28.200 --> 00:01:31.780
because these are values of
Z that are impossible.
00:01:31.780 --> 00:01:35.759
Let us now try to figure out
the parameters of this PDF.
00:01:35.759 --> 00:01:40.729
The probability that X
is positive is the
00:01:40.729 --> 00:01:42.979
area of this rectangle.
00:01:42.979 --> 00:01:46.000
And the area of this
rectangle is 2/3.
00:01:46.000 --> 00:01:50.710
So the area of this rectangle
should also be 2/3.
00:01:50.710 --> 00:01:54.810
And that means that the height
of this rectangle should be
00:01:54.810 --> 00:01:56.830
equal to 1/3.
00:01:56.830 --> 00:02:00.770
Similarly, the probability that
X is negative is the area
00:02:00.770 --> 00:02:04.070
of this rectangle, and the
area of this rectangle is
00:02:04.070 --> 00:02:05.920
equal to 1/3.
00:02:05.920 --> 00:02:09.130
When X is negative, Z is also
negative, so the probability
00:02:09.130 --> 00:02:11.960
of a negative value should
be equal to 1/3.
00:02:11.960 --> 00:02:15.650
And for the area of this
rectangle to be 1/3, it means
00:02:15.650 --> 00:02:20.240
that the height of this
rectangle should be 1/6.
00:02:20.240 --> 00:02:21.530
So what happened here?
00:02:21.530 --> 00:02:26.420
We started with a PDF of X and
essentially stretched it out
00:02:26.420 --> 00:02:31.100
by a factor of 2 while keeping
the same shape.
00:02:31.100 --> 00:02:35.300
However, we also scaled
it down by a
00:02:35.300 --> 00:02:36.880
corresponding amount.
00:02:36.880 --> 00:02:41.660
So 2/3 became 1/3, and
1/3 became 1/6.
00:02:41.660 --> 00:02:45.590
The reason for this scaling down
is that we need the
00:02:45.590 --> 00:02:49.870
total probability, the total
area under this PDF, to be
00:02:49.870 --> 00:02:52.270
equal to 1.
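The stretching argument above can be checked with a few lines of Python. This is only a minimal sketch: the representation of the PDF as a list of (left, right, height) pieces and the helper names `stretch` and `total_area` are assumptions made for illustration, and the heights 1/3 and 2/3 are read off the areas quoted in the example.

```python
from fractions import Fraction

# PDF of X from the example: one (left, right, height) triple per constant piece.
pdf_x = [(-1, 0, Fraction(1, 3)), (0, 1, Fraction(2, 3))]

def stretch(pieces, factor):
    """Pieces of the PDF of factor * X, for factor > 0.

    Each piece becomes `factor` times wider, so its height is divided
    by `factor` to keep the area of the piece unchanged.
    """
    return [(factor * lo, factor * hi, h / factor) for lo, hi, h in pieces]

def total_area(pieces):
    """Total area under a piecewise-constant PDF."""
    return sum((hi - lo) * h for lo, hi, h in pieces)

pdf_z = stretch(pdf_x, 2)   # Z = 2X
```

Here `pdf_z` has heights 1/6 on the range from minus 2 to 0 and 1/3 on the range from 0 to 2, and `total_area(pdf_z)` is still 1.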
00:02:52.270 --> 00:02:57.540
If we now add a number, let's
say 3, to the random variable
00:02:57.540 --> 00:03:00.340
Z, what is going to happen?
00:03:00.340 --> 00:03:03.740
The new random variable Y, equal to Z plus 3,
will take values from
00:03:03.740 --> 00:03:05.290
minus 2 plus 3--
00:03:05.290 --> 00:03:07.010
this is plus 1--
00:03:07.010 --> 00:03:11.660
all the way up to 2 plus
3, which is plus 5.
00:03:14.170 --> 00:03:20.700
Values in the range from 1 to 3
correspond to values of Z in
00:03:20.700 --> 00:03:22.590
the range from minus 2 to 0.
00:03:22.590 --> 00:03:26.360
These values are all, in some
sense, equally likely.
00:03:26.360 --> 00:03:29.290
So they should also be
equally likely here.
00:03:29.290 --> 00:03:33.380
And by a similar argument, these
values in the range from
00:03:33.380 --> 00:03:36.600
3 to 5 should also be
equally likely.
00:03:36.600 --> 00:03:42.280
This rectangle corresponds
to this rectangle here.
00:03:42.280 --> 00:03:44.030
So the area should
be the same.
00:03:44.030 --> 00:03:47.320
And therefore, the height
should also be the same.
00:03:47.320 --> 00:03:50.090
Therefore, the height
here should be 1/6.
00:03:50.090 --> 00:03:52.150
And by the same argument,
the height here
00:03:52.150 --> 00:03:53.450
should be equal to 1/3.
00:03:56.130 --> 00:04:00.640
So what happens here is that
when we add 3 to a random
00:04:00.640 --> 00:04:06.590
variable, the PDF just gets
shifted by 3 but otherwise
00:04:06.590 --> 00:04:09.510
retains the same shape.
00:04:09.510 --> 00:04:12.220
So the story is entirely similar
to what happened in
00:04:12.220 --> 00:04:13.520
the discrete case.
00:04:13.520 --> 00:04:18.660
We start with a PDF of X. We
stretch it horizontally by a
00:04:18.660 --> 00:04:20.360
factor of 2.
00:04:20.360 --> 00:04:24.280
And then we shift it
horizontally by 3.
00:04:24.280 --> 00:04:27.640
The only difference is that here
in the continuous case,
00:04:27.640 --> 00:04:33.820
we also need to scale the plot
in the vertical dimension by a
00:04:33.820 --> 00:04:35.330
factor of 2.
00:04:35.330 --> 00:04:38.909
More precisely, we make it smaller
by a factor of 2.
00:04:38.909 --> 00:04:42.300
And this needs to be done in
order to keep the total area
00:04:42.300 --> 00:04:46.302
under the PDF equal to 1.
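As a rough empirical check of the heights 1/6 and 1/3 found for Y = 2X + 3, one can simulate X and look at how often Y lands in each range. This is a sketch under assumptions: the sampler `sample_x`, the sample size, and the seed are illustrative choices, not part of the lecture.

```python
import random

def sample_x(rng):
    """Draw X: uniform on [-1, 0] with probability 1/3, else uniform on [0, 1]."""
    if rng.random() < 1 / 3:
        return rng.uniform(-1, 0)
    return rng.uniform(0, 1)

def estimate_heights(n=200_000, seed=0):
    """Estimate the PDF heights of Y = 2X + 3 on [1, 3) and on [3, 5]."""
    rng = random.Random(seed)
    ys = [2 * sample_x(rng) + 3 for _ in range(n)]
    low = sum(1 for y in ys if 1 <= y < 3) / n    # estimate of P(1 <= Y < 3)
    high = sum(1 for y in ys if 3 <= y <= 5) / n  # estimate of P(3 <= Y <= 5)
    # Height = probability / width, and each range has width 2.
    return low / 2, high / 2

h_low, h_high = estimate_heights()
```

The two estimates should come out close to 1/6 and 1/3 respectively, up to sampling noise.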
00:04:46.302 --> 00:04:49.390
Let us now go through a
mathematical argument with the
00:04:49.390 --> 00:04:52.700
purpose of also finding a
formula that represents what
00:04:52.700 --> 00:04:55.730
we just did in our
previous example.
00:04:55.730 --> 00:04:58.980
Let Y be equal to aX plus b.
00:04:58.980 --> 00:05:03.350
Here, X is a random variable
with a given PDF.
00:05:03.350 --> 00:05:06.140
a and b are given constants.
00:05:06.140 --> 00:05:11.990
Now, if a is equal to 0, then
Y is identically equal to b.
00:05:11.990 --> 00:05:14.020
So it is a constant random
variable and
00:05:14.020 --> 00:05:15.560
does not have a PDF.
00:05:15.560 --> 00:05:20.080
So let us exclude this case and
start by assuming that a
00:05:20.080 --> 00:05:23.400
is a positive number.
00:05:23.400 --> 00:05:27.490
We can try to work, as in the
discrete case, and try
00:05:27.490 --> 00:05:29.440
something like the following.
00:05:29.440 --> 00:05:34.120
The probability that Y takes
on a specific value is the
00:05:34.120 --> 00:05:39.430
same as the probability that aX
plus b takes on a specific
00:05:39.430 --> 00:05:44.030
value, which is the same as the
probability that X takes
00:05:44.030 --> 00:05:48.390
on the specific value, y
minus b divided by a.
00:05:48.390 --> 00:05:50.909
This equality was useful
in the discrete case.
00:05:50.909 --> 00:05:52.610
Is it useful here?
00:05:52.610 --> 00:05:53.850
Unfortunately not.
00:05:53.850 --> 00:05:56.180
When we're dealing with
continuous random variables,
00:05:56.180 --> 00:05:58.710
the probability that the
continuous random variable is
00:05:58.710 --> 00:06:02.210
exactly equal to a given
number is
00:06:02.210 --> 00:06:03.440
going to be equal to 0.
00:06:03.440 --> 00:06:05.750
And the same applies to
this side as well.
00:06:05.750 --> 00:06:08.190
So we have that 0
is equal to 0.
00:06:08.190 --> 00:06:12.030
And this is uninformative, and
we have not made any progress.
00:06:12.030 --> 00:06:15.020
So instead of working with
probabilities of individual
00:06:15.020 --> 00:06:19.110
points which will always
be 0, we will work with
00:06:19.110 --> 00:06:23.140
probabilities of intervals that
generally have non-zero
00:06:23.140 --> 00:06:24.340
probability.
00:06:24.340 --> 00:06:26.930
The trick is to work
with CDFs.
00:06:26.930 --> 00:06:33.690
So let us try to find the CDF
of Y. The CDF of the random
00:06:33.690 --> 00:06:37.950
variable Y is defined as the
probability that the random
00:06:37.950 --> 00:06:42.000
variable is less than or equal
to a certain number.
00:06:42.000 --> 00:06:45.400
Now, in our case,
Y is aX plus b.
00:06:49.210 --> 00:06:53.310
We move b to the other side of
the inequality and then divide
00:06:53.310 --> 00:06:56.230
both sides of the
inequality by a.
00:06:56.230 --> 00:07:00.700
And we get that this is the same
as the probability that X
00:07:00.700 --> 00:07:07.010
is less than or equal to y minus
b divided by a, which is
00:07:07.010 --> 00:07:14.350
the same as the CDF of X
evaluated at y minus b over a.
00:07:14.350 --> 00:07:18.500
So we have a formula for the CDF
of Y in terms of the CDF
00:07:18.500 --> 00:07:20.280
of X.
00:07:20.280 --> 00:07:22.060
How can we find the PDF?
00:07:22.060 --> 00:07:24.080
Simply by differentiating.
00:07:24.080 --> 00:07:28.130
We differentiate both sides
of this equation.
00:07:28.130 --> 00:07:31.220
The derivative of
a CDF is a PDF.
00:07:31.220 --> 00:07:36.890
And therefore, the PDF of Y is
going to be equal to the
00:07:36.890 --> 00:07:38.730
derivative of this side.
00:07:38.730 --> 00:07:41.080
Here we need to use
the chain rule.
00:07:41.080 --> 00:07:45.380
First, we take the derivative
of this function.
00:07:45.380 --> 00:07:50.930
And the derivative of the CDF
is a PDF, so the PDF of X
00:07:50.930 --> 00:07:53.315
evaluated at this particular
number.
00:07:55.960 --> 00:07:59.680
But then we also need to take
the derivative of the argument
00:07:59.680 --> 00:08:02.230
inside with respect to y.
00:08:02.230 --> 00:08:06.480
And that derivative
is equal to 1/a.
00:08:06.480 --> 00:08:11.130
And this gives us a formula for
the PDF of Y in terms of
00:08:11.130 --> 00:08:13.680
the PDF of X.
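Written out, the steps just described for positive a look as follows. The symbols F_X, F_Y for the CDFs and f_X, f_Y for the PDFs are notation assumed here, since the transcript only refers to the formulas verbally.

```latex
\begin{align*}
F_Y(y) &= \mathbf{P}(Y \le y) = \mathbf{P}(aX + b \le y)
        = \mathbf{P}\!\left(X \le \frac{y-b}{a}\right)
        = F_X\!\left(\frac{y-b}{a}\right), \\
f_Y(y) &= \frac{d}{dy}\, F_X\!\left(\frac{y-b}{a}\right)
        = f_X\!\left(\frac{y-b}{a}\right) \cdot \frac{1}{a},
        \qquad a > 0.
\end{align*}
```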
00:08:13.680 --> 00:08:18.950
How about the case where
a is less than 0?
00:08:18.950 --> 00:08:20.370
What is going to change?
00:08:20.370 --> 00:08:24.570
The first step up to
here remains valid.
00:08:24.570 --> 00:08:29.140
But now when we divide both
sides of the inequality by a,
00:08:29.140 --> 00:08:32.520
the direction of the inequality
gets reversed.
00:08:32.520 --> 00:08:38.510
So we obtain instead the
probability that X is larger
00:08:38.510 --> 00:08:43.850
than or equal to y minus
b divided by a.
00:08:43.850 --> 00:08:50.840
And this is 1 minus the
probability that X is less
00:08:50.840 --> 00:08:54.040
than y minus b over a.
00:08:54.040 --> 00:08:57.070
Now, X is a continuous random
variable, so the probability
00:08:57.070 --> 00:09:00.860
is not going to change if here
we change the inequality to a
00:09:00.860 --> 00:09:02.900
less than or equal sign.
00:09:06.320 --> 00:09:12.760
And what we have here is 1 minus
the CDF of X evaluated
00:09:12.760 --> 00:09:16.890
at y minus b over a.
00:09:16.890 --> 00:09:22.300
We use the chain rule once more,
and we obtain that the
00:09:22.300 --> 00:09:32.920
PDF of Y, in this case, is equal
to minus the PDF of X
00:09:32.920 --> 00:09:36.970
evaluated at y minus
b over a times 1/a.
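The same steps for negative a, written out with the assumed notation F_X, F_Y for the CDFs and f_X, f_Y for the PDFs. Note that the result is nonnegative, as a PDF must be, because the minus sign is paired with a negative a.

```latex
\begin{align*}
F_Y(y) &= \mathbf{P}(aX + b \le y)
        = \mathbf{P}\!\left(X \ge \frac{y-b}{a}\right)
        = 1 - \mathbf{P}\!\left(X < \frac{y-b}{a}\right)
        = 1 - F_X\!\left(\frac{y-b}{a}\right), \\
f_Y(y) &= \frac{d}{dy}\left[\,1 - F_X\!\left(\frac{y-b}{a}\right)\right]
        = -\,f_X\!\left(\frac{y-b}{a}\right) \cdot \frac{1}{a},
        \qquad a < 0.
\end{align*}
```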
00:09:41.420 --> 00:09:45.240
Now, when a is positive,
a is the same as the
00:09:45.240 --> 00:09:47.052
absolute value of a.
00:09:47.052 --> 00:09:50.450
When a is negative and we have
this formula, we have here
00:09:50.450 --> 00:09:54.690
minus a, which is the same as
the absolute value of a.
00:09:54.690 --> 00:09:59.560
So we can unify these two
formulas by replacing the
00:09:59.560 --> 00:10:03.150
occurrences of a and that minus
sign by just using the
00:10:03.150 --> 00:10:04.670
absolute value.
00:10:04.670 --> 00:10:10.360
And this gives us this formula
for the PDF of Y in terms of
00:10:10.360 --> 00:10:14.270
the PDF of X. And it is a
formula that's valid whether a
00:10:14.270 --> 00:10:18.510
is positive or negative.
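The unified formula, the PDF of X evaluated at y minus b over a, divided by the absolute value of a, translates directly into code. The following is a minimal sketch: the function names `pdf_x` and `pdf_linear` are hypothetical, and `pdf_x` encodes the piecewise-constant PDF from the earlier example.

```python
def pdf_x(x):
    """Piecewise-constant PDF of X from the earlier example."""
    if -1 <= x < 0:
        return 1 / 3
    if 0 <= x <= 1:
        return 2 / 3
    return 0.0

def pdf_linear(pdf, a, b):
    """Return the PDF of Y = a*X + b, for a != 0, given the PDF of X."""
    if a == 0:
        raise ValueError("a must be nonzero; otherwise Y is a constant")
    # f_Y(y) = f_X((y - b) / a) / |a|
    return lambda y: pdf((y - b) / a) / abs(a)

pdf_y = pdf_linear(pdf_x, 2, 3)    # Y = 2X + 3, as in the example
pdf_w = pdf_linear(pdf_x, -2, 3)   # a negative a also flips the shape left to right
```

For instance, `pdf_y(2)` evaluates the PDF of X at minus 1/2 and halves it, giving 1/6, while `pdf_w(2)` evaluates the PDF of X at plus 1/2 and gives 1/3.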
00:10:18.510 --> 00:10:22.420
What this formula represents
is the following.
00:10:22.420 --> 00:10:27.700
Because of the factor of a that
we have here, we take the
00:10:27.700 --> 00:10:32.482
PDF of X and scale it
horizontally by a factor of a.
00:10:32.482 --> 00:10:37.040
Because of the term b that we
have here, the PDF also gets
00:10:37.040 --> 00:10:39.560
shifted horizontally by b.
00:10:39.560 --> 00:10:43.380
And finally, this term here
corresponds to a vertical
00:10:43.380 --> 00:10:45.920
scaling of the plot
that we have.
00:10:45.920 --> 00:10:49.980
And the reason that this term is
present is so that the PDF
00:10:49.980 --> 00:10:53.750
of Y integrates to 1.
00:10:53.750 --> 00:10:56.730
It is interesting to also
compare with the corresponding
00:10:56.730 --> 00:10:59.620
discrete formula that
we derived earlier.
00:10:59.620 --> 00:11:03.940
The discrete formula has exactly
the same appearance
00:11:03.940 --> 00:11:07.630
except that the scaling
factor is not present.
00:11:07.630 --> 00:11:10.340
So for the case of continuous
random variables, we need to
00:11:10.340 --> 00:11:12.880
scale the PDF vertically.
00:11:12.880 --> 00:11:16.370
But in the discrete case, such
a scaling is not present.
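For contrast, the discrete case can be sketched as follows: each probability simply moves to its new location, with no vertical scaling. The PMF values below are made up for illustration, and `pmf_linear` is a hypothetical helper name.

```python
def pmf_linear(pmf, a, b):
    """PMF of Y = a*X + b, for a != 0: probabilities move but are not rescaled."""
    return {a * x + b: p for x, p in pmf.items()}

pmf_x = {-1: 0.25, 0: 0.25, 1: 0.5}   # made-up illustrative PMF
pmf_y = pmf_linear(pmf_x, 2, 3)       # Y = 2X + 3
```

The probabilities in `pmf_y` are the same numbers as in `pmf_x`, now attached to the values 1, 3, and 5, so they still sum to 1 without any correction factor.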