WEBVTT
00:00:00.385 --> 00:00:04.180
In all of the examples that we
have seen so far, we have
00:00:04.180 --> 00:00:08.210
calculated the distribution of a
random variable, Y, which is
00:00:08.210 --> 00:00:11.950
defined as a function of another
random variable, X.
00:00:11.950 --> 00:00:15.880
What about the case where we
define a random variable, Z,
00:00:15.880 --> 00:00:19.000
as a function of multiple
random variables?
00:00:19.000 --> 00:00:20.820
For example, here
is the function
00:00:20.820 --> 00:00:22.170
of two random variables.
00:00:22.170 --> 00:00:24.710
How can we find a distribution
of Z?
00:00:24.710 --> 00:00:27.980
The general methodology
is exactly the same.
00:00:27.980 --> 00:00:32.680
We somehow calculate the CDF of
the random variable Z and
00:00:32.680 --> 00:00:35.340
then differentiate
to find its PDF.
00:00:35.340 --> 00:00:37.160
Let us illustrate
this methodology
00:00:37.160 --> 00:00:39.020
with a simple example.
00:00:39.020 --> 00:00:43.060
So suppose that X and Y are
independent random variables
00:00:43.060 --> 00:00:46.780
and each one of them is uniform
on the unit interval.
00:00:46.780 --> 00:00:51.340
So their joint distribution is
going to be a uniform PDF on
00:00:51.340 --> 00:00:53.210
the unit square.
00:00:53.210 --> 00:00:55.790
We're interested in the random
variable, which is defined as
00:00:55.790 --> 00:00:59.430
the ratio of Y divided by X.
00:00:59.430 --> 00:01:03.220
So we will now calculate
the CDF of Z and then
00:01:03.220 --> 00:01:05.019
differentiate.
00:01:05.019 --> 00:01:07.670
It is useful to work in
terms of a diagram.
00:01:07.670 --> 00:01:11.340
This is essentially our sample
space, the unit square.
00:01:11.340 --> 00:01:14.950
The PDF of X is 1 on
the unit interval.
00:01:14.950 --> 00:01:17.420
The PDF of Y is 1 on
the unit interval.
00:01:17.420 --> 00:01:20.350
Because of independence, the
joint PDF is the product of
00:01:20.350 --> 00:01:22.440
their individual PDFs.
00:01:22.440 --> 00:01:27.580
So the joint PDF is equal to 1
throughout this unit square.
00:01:27.580 --> 00:01:32.890
So now let us write an
expression for the CDF of Z,
00:01:32.890 --> 00:01:36.280
which, by definition, is the
probability that the random
00:01:36.280 --> 00:01:41.789
variable Z, which in our case
is Y divided by X, is less
00:01:41.789 --> 00:01:46.680
than or equal than a certain
number, little z.
00:01:46.680 --> 00:01:49.310
What is the probability
of this event?
00:01:49.310 --> 00:01:51.539
Let us consider a few
different cases.
00:01:51.539 --> 00:01:54.310
Suppose that z is negative.
00:01:54.310 --> 00:01:57.050
What is the probability that
this ratio is negative?
00:01:57.050 --> 00:02:00.530
Well, since X and Y are
non-negative numbers, there's
00:02:00.530 --> 00:02:03.220
no way that the ratio is
going to be negative.
00:02:03.220 --> 00:02:09.410
So if little z is a negative
number, the probability of
00:02:09.410 --> 00:02:13.160
this event is going
to be equal to 0.
00:02:13.160 --> 00:02:15.170
This is the easier case.
00:02:15.170 --> 00:02:18.610
Now suppose that z is
a positive number.
00:02:18.610 --> 00:02:23.841
Let us draw a line that has
a slope of little z.
00:02:26.607 --> 00:02:30.670
y/z being less than or equal
to little z is the same as
00:02:30.670 --> 00:02:36.250
saying that y is less than or
equal to little z times x.
00:02:36.250 --> 00:02:42.920
This is the line on which
y is equal to z times x.
00:02:42.920 --> 00:02:45.930
So below that line, y is going
to be less than or
00:02:45.930 --> 00:02:48.870
equal to z times x.
00:02:48.870 --> 00:02:53.440
So the event of interest is
actually this triangle here.
00:02:53.440 --> 00:02:56.090
And the probability of this
event, since we're dealing
00:02:56.090 --> 00:02:59.500
with a uniform distribution on
the unit square, is just the
00:02:59.500 --> 00:03:02.440
area of this triangle.
00:03:02.440 --> 00:03:06.700
Now, since this line rises at
slope z, this point here, this
00:03:06.700 --> 00:03:09.480
intercept is at z.
00:03:09.480 --> 00:03:13.800
And so the sides of the
triangle are 1 and z.
00:03:13.800 --> 00:03:19.740
And so this formula here gives
us the value of the CDF for
00:03:19.740 --> 00:03:21.750
the case where little
z is positive.
00:03:25.060 --> 00:03:28.220
And the same formula would also
be true if z also were
00:03:28.220 --> 00:03:32.060
equal to 0, in which case,
we get 0 probability.
00:03:32.060 --> 00:03:34.880
But is this correct for
all positive z's?
00:03:34.880 --> 00:03:36.680
Well, not really.
00:03:36.680 --> 00:03:38.980
This calculation was based
on this picture.
00:03:38.980 --> 00:03:43.600
And in this picture, this line
intercepted this side of the
00:03:43.600 --> 00:03:45.030
unit square.
00:03:45.030 --> 00:03:50.180
And for that to happen, this
slope must be less than or
00:03:50.180 --> 00:03:52.000
equal to 1.
00:03:52.000 --> 00:03:55.380
So this formula is only correct
in the case where we
00:03:55.380 --> 00:03:58.660
have a slope of less
than or equal to 1.
00:03:58.660 --> 00:04:03.010
And now we need to deal with
the remaining case in which
00:04:03.010 --> 00:04:06.940
little z is strictly
larger than 1.
00:04:06.940 --> 00:04:10.210
In this case, we get a somewhat
different picture.
00:04:10.210 --> 00:04:19.200
If we draw a line with slope,
again, little z, because
00:04:19.200 --> 00:04:23.140
little z is bigger than 1, it's
going to intercept this
00:04:23.140 --> 00:04:25.230
side of the rectangle.
00:04:25.230 --> 00:04:28.900
Now, the event that Y/X is less
than or equal to little z
00:04:28.900 --> 00:04:34.110
is, again, the event that the
pair, X, Y, lies below this
00:04:34.110 --> 00:04:36.290
line that has a slope of z.
00:04:36.290 --> 00:04:39.909
So all we need is to find
the area of this region.
00:04:39.909 --> 00:04:42.690
One way of finding the area of
this region is to take the
00:04:42.690 --> 00:04:46.840
area of the entire unit square,
which is equal to 1,
00:04:46.840 --> 00:04:49.730
and subtract the area
of this triangle.
00:04:49.730 --> 00:04:52.120
What is the area of
this triangle?
00:04:52.120 --> 00:04:55.580
Well, since this line has a
slope of z, in order for it to
00:04:55.580 --> 00:05:02.750
rise to a value of 1, x must be
equal to 1 over little z.
00:05:02.750 --> 00:05:09.730
Therefore, this side of
the triangle is 1/z.
00:05:09.730 --> 00:05:16.790
And therefore, the area of the
triangle is 1/2 times 1/z,
00:05:16.790 --> 00:05:19.390
which is this expression here.
00:05:19.390 --> 00:05:23.430
And so we have found the value
of the CDF for all possible
00:05:23.430 --> 00:05:26.390
choices of little z.
00:05:26.390 --> 00:05:28.200
We can draw the CDF.
00:05:37.860 --> 00:05:40.770
And the picture is as follows.
00:05:40.770 --> 00:05:44.980
For z negative, the
CDF is equal to 0.
00:05:44.980 --> 00:05:51.170
For z between 0 and 1, the
CDF rises linearly
00:05:51.170 --> 00:05:53.010
at a slope of 1/2.
00:05:53.010 --> 00:05:56.810
And so when z is equal to
1, the CDF has risen
00:05:56.810 --> 00:05:59.300
to a value of 1/2.
00:05:59.300 --> 00:06:03.590
And then as z goes to infinity,
this term disappears
00:06:03.590 --> 00:06:06.175
and the CDF will
converge to 1.
00:06:09.110 --> 00:06:11.940
So it converges to 1
monotonically but in a
00:06:11.940 --> 00:06:13.180
non-linear fashion.
00:06:13.180 --> 00:06:17.580
So we get a picture
of this type.
00:06:17.580 --> 00:06:23.810
The next step, the final step,
is to differentiate the CDF
00:06:23.810 --> 00:06:25.360
and obtain the PDF.
00:06:33.792 --> 00:06:37.990
In this region, the CDF is
constant, so its derivative is
00:06:37.990 --> 00:06:39.850
going to be equal to 0.
00:06:39.850 --> 00:06:44.420
In this region, the CDF is
linear, so its derivative is
00:06:44.420 --> 00:06:47.140
equal to this factor of 1/2.
00:06:47.140 --> 00:06:55.420
So the CDF is equal to 1/2
for z's between 0 and 1.
00:06:55.420 --> 00:06:57.909
And finally, in this
region, this is the
00:06:57.909 --> 00:06:59.210
formula for the CDF.
00:06:59.210 --> 00:07:04.100
When we take the derivative, we
get the expression 1 over
00:07:04.100 --> 00:07:10.630
2z squared, which is a function
that decreases as z
00:07:10.630 --> 00:07:11.810
goes to infinity.
00:07:11.810 --> 00:07:14.250
So it has a shape
like this one.
00:07:14.250 --> 00:07:16.600
So we have completed the
solution to this problem.
00:07:16.600 --> 00:07:20.410
We found the CDF, and we found
the corresponding PDF.
00:07:20.410 --> 00:07:24.640
This methodology works more
generally for more complicated
00:07:24.640 --> 00:07:28.360
functions of X and Y and for
more complicated distributions
00:07:28.360 --> 00:07:31.250
for X and Y. Of course, when
the functions or the
00:07:31.250 --> 00:07:33.680
distributions are more
complicated, the calculus
00:07:33.680 --> 00:07:37.600
involved and the geometry may
require a lot more work.
00:07:37.600 --> 00:07:39.510
But conceptually,
the methodology
00:07:39.510 --> 00:07:40.760
is exactly the same.