WEBVTT
00:00:00.040 --> 00:00:00.760
Hey guys.
00:00:00.760 --> 00:00:02.140
Welcome back.
00:00:02.140 --> 00:00:05.440
Today we're going to do a fun
problem that will test your
00:00:05.440 --> 00:00:08.010
knowledge of the law
of total variance.
00:00:08.010 --> 00:00:11.050
And in the process, we'll also
get more practice dealing with
00:00:11.050 --> 00:00:14.460
joint PDFs and computing
conditional expectations and
00:00:14.460 --> 00:00:16.810
conditional variances.
00:00:16.810 --> 00:00:22.030
So in this problem, we are given
a joint PDF for x and y.
00:00:22.030 --> 00:00:26.490
So we're told that x and y can
take on the following values
00:00:26.490 --> 00:00:27.320
in the shape of this
00:00:27.320 --> 00:00:29.660
parallelogram, which I've drawn.
00:00:29.660 --> 00:00:33.250
And moreover, that x and y are
uniformly distributed.
00:00:33.250 --> 00:00:37.630
So the joint PDF is just flat
over this parallelogram.
00:00:37.630 --> 00:00:41.430
And because the parallelogram
has an area of 1, the height
00:00:41.430 --> 00:00:47.100
of the PDF must also be 1 so
that the PDF integrates to 1.
00:00:47.100 --> 00:00:47.430
OK.
00:00:47.430 --> 00:00:49.670
And then we are asked
to compute the
00:00:49.670 --> 00:00:51.470
variance of x plus y.
00:00:51.470 --> 00:00:54.560
So you can think of x plus y as
a new random variable whose
00:00:54.560 --> 00:00:56.520
variance we want to compute.
00:00:56.520 --> 00:00:59.530
And moreover, we're told we
should compute this variance
00:00:59.530 --> 00:01:03.240
by using something called the
law of total variance.
00:01:03.240 --> 00:01:07.140
So from lecture, you should
remember or you should recall
00:01:07.140 --> 00:01:09.170
that the law of total variance
can be written
00:01:09.170 --> 00:01:11.470
in these two ways.
00:01:11.470 --> 00:01:14.310
And the reason why there's two
different forms for this case
00:01:14.310 --> 00:01:17.350
is because the formula
always has you
00:01:17.350 --> 00:01:18.590
conditioning on something.
00:01:18.590 --> 00:01:22.120
Here we condition on x, here
we condition on y.
00:01:22.120 --> 00:01:25.590
And for this problem, the
logical choice you have for
00:01:25.590 --> 00:01:28.400
what to condition
on is x or y.
00:01:28.400 --> 00:01:29.970
So again, we have this option.
00:01:29.970 --> 00:01:33.450
And my claim is that we
should condition on x.
00:01:33.450 --> 00:01:37.740
And the reason has to do with
the geometry of this diagram.
00:01:37.740 --> 00:01:42.220
So notice that if you freeze an
x and then you sort of vary
00:01:42.220 --> 00:01:47.120
x, the width of this
parallelogram stays constant.
00:01:47.120 --> 00:01:50.190
However, if you condition on y
and look at the width this
00:01:50.190 --> 00:01:53.070
way, you see that the width
of the slices you get by
00:01:53.070 --> 00:01:56.680
conditioning vary with y.
00:01:56.680 --> 00:02:00.280
So to make our lives easier,
we're going to condition on x.
00:02:00.280 --> 00:02:02.480
And I'm going to erase this
bottom one, because
00:02:02.480 --> 00:02:04.980
we're not using it.
00:02:04.980 --> 00:02:08.889
So this really can seem quite
intimidating, because we have
00:02:08.889 --> 00:02:11.590
nested variances and
expectations going on, but
00:02:11.590 --> 00:02:13.920
we'll just take it slowly
step by step.
00:02:13.920 --> 00:02:16.650
So first, I want to focus
on this term--
00:02:16.650 --> 00:02:23.600
the conditional expectation of
x plus y conditioned on x.
00:02:23.600 --> 00:02:28.160
So coming back over to this
picture, if you fix an
00:02:28.160 --> 00:02:32.100
arbitrary x in the interval,
0 to 1, we're restricting
00:02:32.100 --> 00:02:34.900
ourselves to this universe.
00:02:34.900 --> 00:02:39.860
So y can only vary between this
point and this point.
00:02:39.860 --> 00:02:42.560
Now, I've already written down
here that the formula for this
00:02:42.560 --> 00:02:45.555
line is given by y
is equal to x.
00:02:45.555 --> 00:02:48.510
And the formula for this
line is given by y is
00:02:48.510 --> 00:02:50.190
equal to x plus 1.
00:02:50.190 --> 00:02:53.990
So in particular, when we
condition on x, we know that y
00:02:53.990 --> 00:02:56.920
varies between x and x plus 1.
00:02:56.920 --> 00:02:58.960
But we actually know
more than that.
00:02:58.960 --> 00:03:01.810
We know that in the
unconditional universe, x and
00:03:01.810 --> 00:03:04.180
y were uniformly distributed.
00:03:04.180 --> 00:03:07.520
So it follows that in the
conditional universe, y should
00:03:07.520 --> 00:03:10.600
also be uniformly distributed,
because conditioning doesn't
00:03:10.600 --> 00:03:13.880
change the relative frequency
of outcomes.
00:03:13.880 --> 00:03:19.210
So that reasoning means that
we can draw the conditional
00:03:19.210 --> 00:03:22.270
PDF of y conditioned
on x as this.
00:03:22.270 --> 00:03:24.770
We said it varies between
x and x plus 1.
00:03:24.770 --> 00:03:27.770
And we also said that it's
uniform, which means that it
00:03:27.770 --> 00:03:29.670
must have a height of 1.
00:03:29.670 --> 00:03:34.540
So this is py given
x, y given x.
00:03:34.540 --> 00:03:37.820
Now, you might be concerned,
because, well, we're trying to
00:03:37.820 --> 00:03:43.150
compute the expectation of
x plus y and this is the
00:03:43.150 --> 00:03:47.450
conditional PDF of y, not of the
random variable, x plus y.
00:03:47.450 --> 00:03:50.990
But I claim that we're OK, this
is still useful, because
00:03:50.990 --> 00:03:53.680
if we're conditioning
on x, this x
00:03:53.680 --> 00:03:55.640
just acts as a constant.
00:03:55.640 --> 00:03:58.555
It's not really going to change
anything except shift
00:03:58.555 --> 00:04:02.400
the expectation of y
by an amount of x.
00:04:02.400 --> 00:04:05.350
So what I'm saying in math terms
is that this is actually
00:04:05.350 --> 00:04:11.820
just x plus the expectation
of y given x.
00:04:11.820 --> 00:04:16.470
And now our conditional
PDF comes into play.
00:04:16.470 --> 00:04:18.930
Conditioned on x, this
is the PDF of y.
00:04:18.930 --> 00:04:21.899
And because it's uniformly
distributed and because
00:04:21.899 --> 00:04:24.860
expectation acts like center
of mass, we know that the
00:04:24.860 --> 00:04:27.800
expectation should be
the midpoint, right?
00:04:27.800 --> 00:04:31.695
And so to compute this point, we
simply take the average of
00:04:31.695 --> 00:04:35.990
the endpoints, x plus 1 plus
x over 2, which gives us
00:04:35.990 --> 00:04:38.240
2x plus 1 over 2.
00:04:38.240 --> 00:04:45.805
So plugging this back up here,
we get 2x/2 plus 2x plus 1
00:04:45.805 --> 00:04:53.960
over 2, which is 4x plus 1
over 2, or 2x plus 1/2.
00:04:53.960 --> 00:04:54.540
OK.
00:04:54.540 --> 00:04:58.810
So now I want to look at the
next term, the next inner
00:04:58.810 --> 00:05:01.610
term, which is this guy.
00:05:01.610 --> 00:05:05.070
So this computation is
going to be very
00:05:05.070 --> 00:05:07.320
similar in nature, actually.
00:05:07.320 --> 00:05:10.180
So we already discussed
that the joint--
00:05:10.180 --> 00:05:13.140
sorry, not the joint, the
conditional PDF of y given x
00:05:13.140 --> 00:05:14.640
is this guy.
00:05:14.640 --> 00:05:18.520
So the variance of x plus y
conditioned on x, we sort of
00:05:18.520 --> 00:05:20.940
have a similar phenomenon
occurring.
00:05:20.940 --> 00:05:24.180
x now in this conditional
world just acts like a
00:05:24.180 --> 00:05:28.250
constant that shifts the PDF but
doesn't change the width
00:05:28.250 --> 00:05:30.700
of the distribution at all.
00:05:30.700 --> 00:05:35.550
So this is actually just equal
to the variance of y given x,
00:05:35.550 --> 00:05:39.060
because constants don't
affect the variance.
00:05:39.060 --> 00:05:42.480
And now we can look at this
conditional PDF to figure out
00:05:42.480 --> 00:05:44.350
what this is.
00:05:44.350 --> 00:05:47.530
So we're going to take a quick
tangent over here, and I'm
00:05:47.530 --> 00:05:51.070
just going to remind you guys
that we have a formula for
00:05:51.070 --> 00:05:53.880
computing the variance of a
random variable when it's
00:05:53.880 --> 00:05:57.420
uniformly distributed between
two endpoints.
00:05:57.420 --> 00:06:02.270
So say we have a random variable
whose PDF looks
00:06:02.270 --> 00:06:04.330
something like this.
00:06:04.330 --> 00:06:06.420
Let's call it, let's say, w.
00:06:06.420 --> 00:06:09.190
This is pww.
00:06:09.190 --> 00:06:13.510
We have a formula that says
variance of w is equal to b
00:06:13.510 --> 00:06:17.000
minus a squared over 12.
00:06:17.000 --> 00:06:20.280
So we can apply that
formula over here.
00:06:20.280 --> 00:06:23.000
b is x plus 1, a is x.
00:06:23.000 --> 00:06:26.220
So b minus a squared over
12 is just 1/12.
00:06:26.220 --> 00:06:29.150
So we get 1/12.
00:06:29.150 --> 00:06:32.250
So we're making good progress,
because we have this inner
00:06:32.250 --> 00:06:34.460
quantity and this
inner quantity.
00:06:34.460 --> 00:06:37.170
So now all we need to do is take
the outer variance and
00:06:37.170 --> 00:06:39.430
the outer expectation.
00:06:39.430 --> 00:06:44.220
So writing this all down, we
get variance of x plus y is
00:06:44.220 --> 00:06:51.390
equal to variance of this guy,
2x plus 1/2 plus the
00:06:51.390 --> 00:06:52.890
expectation of 1/12.
00:06:55.720 --> 00:06:59.230
So this term is quite simple.
00:06:59.230 --> 00:07:02.120
We know that the expectation of
a constant or of a scalar
00:07:02.120 --> 00:07:03.590
is simply that scalar.
00:07:03.590 --> 00:07:05.900
So this evaluates to 1/12.
00:07:05.900 --> 00:07:07.790
And this one is not
bad either.
00:07:07.790 --> 00:07:12.340
So similar to our discussion up
here, we know constants do
00:07:12.340 --> 00:07:13.930
not affect variance.
00:07:13.930 --> 00:07:16.180
You know they shift your
distribution, they don't
00:07:16.180 --> 00:07:17.480
change the variance.
00:07:17.480 --> 00:07:19.760
So we can ignore the 1/2.
00:07:19.760 --> 00:07:21.550
This scaling factor
of 2, however,
00:07:21.550 --> 00:07:23.670
will change the variance.
00:07:23.670 --> 00:07:25.190
But we know how to handle
this already
00:07:25.190 --> 00:07:26.600
from previous lectures.
00:07:26.600 --> 00:07:30.850
We know that you can just take
out this scalar scaling factor
00:07:30.850 --> 00:07:33.320
as long as we square it.
00:07:33.320 --> 00:07:36.820
So this becomes 2 squared,
or 4 times the
00:07:36.820 --> 00:07:40.790
variance of x plus 1/12.
00:07:40.790 --> 00:07:43.410
And now to compute the variance
of x, we're going to
00:07:43.410 --> 00:07:47.050
use that formula again,
and we're
00:07:47.050 --> 00:07:48.670
going to use this picture.
00:07:48.670 --> 00:07:54.650
So here we have the joint PDF of
x and y, but really we want
00:07:54.650 --> 00:07:57.810
now the PDF of x, so we
can figure out what
00:07:57.810 --> 00:08:00.380
the variance is.
00:08:00.380 --> 00:08:02.800
So hopefully you remember a
trick we taught you called
00:08:02.800 --> 00:08:04.360
marginalization.
00:08:04.360 --> 00:08:08.950
To get the PDF of x given
a joint PDF, you simply
00:08:08.950 --> 00:08:12.090
marginalize over the
values of y.
00:08:12.090 --> 00:08:17.750
So if you freeze x is equal to
0, you get the probability
00:08:17.750 --> 00:08:20.730
density line over x by
integrating over this
00:08:20.730 --> 00:08:22.130
interval, over y.
00:08:22.130 --> 00:08:25.000
So if you integrate over
this strip, you get 1.
00:08:25.000 --> 00:08:27.340
If you move x over a little
bit and you integrate over
00:08:27.340 --> 00:08:29.210
this strip, you get 1.
00:08:29.210 --> 00:08:32.480
This is the argument I was
making earlier that the width
00:08:32.480 --> 00:08:35.240
of this interval stays the same,
and hence, the variance
00:08:35.240 --> 00:08:36.600
stays the same.
00:08:36.600 --> 00:08:40.590
So based on that argument, which
was slightly hand wavy,
00:08:40.590 --> 00:08:42.059
let's come over here
and draw it.
00:08:44.860 --> 00:08:50.060
We're claiming that the PDF of
x, px of x, looks like this.
00:08:50.060 --> 00:08:53.390
It's just uniformly distributed
between 0 and 1.
00:08:53.390 --> 00:08:56.330
And if you buy that, then we're
done, we're home free,
00:08:56.330 --> 00:08:59.380
because we can apply this
formula, b minus a squared
00:08:59.380 --> 00:09:01.570
over 12, gives us
the variance.
00:09:01.570 --> 00:09:04.620
So b is 1, a is 0, which
gives variance of
00:09:04.620 --> 00:09:07.010
x is equal to 1/12.
00:09:07.010 --> 00:09:13.980
So coming back over here, we
get 4 times 1/12 plus 1/12,
00:09:13.980 --> 00:09:16.100
which is 5/12.
00:09:16.100 --> 00:09:19.340
And that is our answer.
00:09:19.340 --> 00:09:22.340
So this problem was
straightforward in the sense
00:09:22.340 --> 00:09:24.850
that our task was very clear.
00:09:24.850 --> 00:09:28.010
We had to compute this, and we
had to do so by using the law
00:09:28.010 --> 00:09:29.730
of total variance.
00:09:29.730 --> 00:09:33.170
But we sort of reviewed a lot
of concepts along the way.
00:09:33.170 --> 00:09:38.170
We saw how, given a joint
PDF, you marginalize to
00:09:38.170 --> 00:09:39.950
get the PDF of x.
00:09:39.950 --> 00:09:44.580
We saw how constants don't
change variance.
00:09:44.580 --> 00:09:47.780
We got a lot of practice
finding conditional
00:09:47.780 --> 00:09:49.700
distributions and computing
conditional
00:09:49.700 --> 00:09:51.670
expectations and variances.
00:09:51.670 --> 00:09:54.530
And we also saw this trick.
00:09:54.530 --> 00:09:58.040
And it might seem like cheating
to memorize formulas,
00:09:58.040 --> 00:09:59.960
but there's a few important
ones you should know.
00:09:59.960 --> 00:10:03.240
And it will help you sort of
become faster at doing
00:10:03.240 --> 00:10:04.170
computations.
00:10:04.170 --> 00:10:06.200
And that's important,
especially if you
00:10:06.200 --> 00:10:08.390
guys take the exams.
00:10:08.390 --> 00:10:09.020
So that's it.
00:10:09.020 --> 00:10:10.270
See you next time.