1
00:00:00,000 --> 00:00:00,040
2
00:00:00,040 --> 00:00:00,760
Hey guys.
3
00:00:00,760 --> 00:00:02,140
Welcome back.
4
00:00:02,140 --> 00:00:05,440
Today we're going to do a fun
problem that will test your
5
00:00:05,440 --> 00:00:08,010
knowledge of the law
of total variance.
6
00:00:08,010 --> 00:00:11,050
And in the process, we'll also
get more practice dealing with
7
00:00:11,050 --> 00:00:14,460
joint PDFs and computing
conditional expectations and
8
00:00:14,460 --> 00:00:16,810
conditional variances.
9
00:00:16,810 --> 00:00:22,030
So in this problem, we are given
a joint PDF for x and y.
10
00:00:22,030 --> 00:00:26,490
So we're told that x and y can
take on the following values
11
00:00:26,490 --> 00:00:27,320
in the shape of this
12
00:00:27,320 --> 00:00:29,660
parallelogram, which I've drawn.
13
00:00:29,660 --> 00:00:33,250
And moreover, that x and y are
uniformly distributed.
14
00:00:33,250 --> 00:00:37,630
So the joint PDF is just flat
over this parallelogram.
15
00:00:37,630 --> 00:00:41,430
And because the parallelogram
has an area of 1, the height
16
00:00:41,430 --> 00:00:47,100
of the PDF must also be 1 so
that the PDF integrates to 1.
17
00:00:47,100 --> 00:00:47,430
OK.
18
00:00:47,430 --> 00:00:49,670
And then we are asked
to compute the
19
00:00:49,670 --> 00:00:51,470
variance of x plus y.
20
00:00:51,470 --> 00:00:54,560
So you can think of x plus y as
a new random variable whose
21
00:00:54,560 --> 00:00:56,520
variance we want to compute.
22
00:00:56,520 --> 00:00:59,530
And moreover, we're told we
should compute this variance
23
00:00:59,530 --> 00:01:03,240
by using something called the
law of total variance.
24
00:01:03,240 --> 00:01:07,140
So from lecture, you should
remember or you should recall
25
00:01:07,140 --> 00:01:09,170
that the law of total variance
can be written
26
00:01:09,170 --> 00:01:11,470
in these two ways.
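For reference, the two forms on the board are presumably the standard statements of the law of total variance, one conditioning on x and one on y. Writing the random variables as capital X and Y is a notational assumption made here; the transcript says "x" and "y" throughout.

```latex
\begin{align*}
\operatorname{Var}(X+Y) &= \operatorname{Var}\big(E[X+Y \mid X]\big) + E\big[\operatorname{Var}(X+Y \mid X)\big] \\
\operatorname{Var}(X+Y) &= \operatorname{Var}\big(E[X+Y \mid Y]\big) + E\big[\operatorname{Var}(X+Y \mid Y)\big]
\end{align*}
```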
27
00:01:11,470 --> 00:01:14,310
And the reason why there are two
different forms for this case
28
00:01:14,310 --> 00:01:17,350
is because the formula
always has you
29
00:01:17,350 --> 00:01:18,590
conditioning on something.
30
00:01:18,590 --> 00:01:22,120
Here we condition on x, here
we condition on y.
31
00:01:22,120 --> 00:01:25,590
And for this problem, the
logical choice you have for
32
00:01:25,590 --> 00:01:28,400
what to condition
on is x or y.
33
00:01:28,400 --> 00:01:29,970
So again, we have this option.
34
00:01:29,970 --> 00:01:33,450
And my claim is that we
should condition on x.
35
00:01:33,450 --> 00:01:37,740
And the reason has to do with
the geometry of this diagram.
36
00:01:37,740 --> 00:01:42,220
So notice that if you freeze an
x and look at the slice, then as
37
00:01:42,220 --> 00:01:47,120
you vary x, the width of that slice
of the parallelogram stays constant.
38
00:01:47,120 --> 00:01:50,190
However, if you condition on y
and look at the width this
39
00:01:50,190 --> 00:01:53,070
way, you see that the width
of the slices you get by
40
00:01:53,070 --> 00:01:56,680
conditioning varies with y.
41
00:01:56,680 --> 00:02:00,280
So to make our lives easier,
we're going to condition on x.
42
00:02:00,280 --> 00:02:02,480
And I'm going to erase this
bottom one, because
43
00:02:02,480 --> 00:02:04,980
we're not using it.
44
00:02:04,980 --> 00:02:08,889
So this really can seem quite
intimidating, because we have
45
00:02:08,889 --> 00:02:11,590
nested variances and
expectations going on, but
46
00:02:11,590 --> 00:02:13,920
we'll just take it slowly
step by step.
47
00:02:13,920 --> 00:02:16,650
So first, I want to focus
on this term--
48
00:02:16,650 --> 00:02:23,600
the conditional expectation of
x plus y conditioned on x.
49
00:02:23,600 --> 00:02:28,160
So coming back over to this
picture, if you fix an
50
00:02:28,160 --> 00:02:32,100
arbitrary x in the interval,
0 to 1, we're restricting
51
00:02:32,100 --> 00:02:34,900
ourselves to this universe.
52
00:02:34,900 --> 00:02:39,860
So y can only vary between this
point and this point.
53
00:02:39,860 --> 00:02:42,560
Now, I've already written down
here that the formula for this
54
00:02:42,560 --> 00:02:45,555
line is given by y
is equal to x.
55
00:02:45,555 --> 00:02:48,510
And the formula for this
line is given by y is
56
00:02:48,510 --> 00:02:50,190
equal to x plus 1.
57
00:02:50,190 --> 00:02:53,990
So in particular, when we
condition on x, we know that y
58
00:02:53,990 --> 00:02:56,920
varies between x and x plus 1.
59
00:02:56,920 --> 00:02:58,960
But we actually know
more than that.
60
00:02:58,960 --> 00:03:01,810
We know that in the
unconditional universe, x and
61
00:03:01,810 --> 00:03:04,180
y were uniformly distributed.
62
00:03:04,180 --> 00:03:07,520
So it follows that in the
conditional universe, y should
63
00:03:07,520 --> 00:03:10,600
also be uniformly distributed,
because conditioning doesn't
64
00:03:10,600 --> 00:03:13,880
change the relative frequency
of outcomes.
65
00:03:13,880 --> 00:03:19,210
So that reasoning means that
we can draw the conditional
66
00:03:19,210 --> 00:03:22,270
PDF of y conditioned
on x as this.
67
00:03:22,270 --> 00:03:24,770
We said it varies between
x and x plus 1.
68
00:03:24,770 --> 00:03:27,770
And we also said that it's
uniform, which means that it
69
00:03:27,770 --> 00:03:29,670
must have a height of 1.
70
00:03:29,670 --> 00:03:34,540
So this is p_{Y|X}(y | x),
the PDF of y given x.
71
00:03:34,540 --> 00:03:37,820
Now, you might be concerned,
because, well, we're trying to
72
00:03:37,820 --> 00:03:43,150
compute the expectation of
x plus y and this is the
73
00:03:43,150 --> 00:03:47,450
conditional PDF of y, not of the
random variable, x plus y.
74
00:03:47,450 --> 00:03:50,990
But I claim that we're OK, this
is still useful, because
75
00:03:50,990 --> 00:03:53,680
if we're conditioning
on x, this x
76
00:03:53,680 --> 00:03:55,640
just acts as a constant.
77
00:03:55,640 --> 00:03:58,555
It's not really going to change
anything except shift
78
00:03:58,555 --> 00:04:02,400
the expectation of y
by an amount of x.
79
00:04:02,400 --> 00:04:05,350
So what I'm saying in math terms
is that this is actually
80
00:04:05,350 --> 00:04:11,820
just x plus the expectation
of y given x.
81
00:04:11,820 --> 00:04:16,470
And now our conditional
PDF comes into play.
82
00:04:16,470 --> 00:04:18,930
Conditioned on x, this
is the PDF of y.
83
00:04:18,930 --> 00:04:21,899
And because it's uniformly
distributed and because
84
00:04:21,899 --> 00:04:24,860
expectation acts like center
of mass, we know that the
85
00:04:24,860 --> 00:04:27,800
expectation should be
the midpoint, right?
86
00:04:27,800 --> 00:04:31,695
And so to compute this point, we
simply take the average of
87
00:04:31,695 --> 00:04:35,990
the endpoints, (x plus 1 plus
x) over 2, which gives us
88
00:04:35,990 --> 00:04:38,240
(2x plus 1) over 2.
89
00:04:38,240 --> 00:04:45,805
So plugging this back up here,
we get 2x/2 plus (2x plus 1)
90
00:04:45,805 --> 00:04:53,960
over 2, which is (4x plus 1)
over 2, or 2x plus 1/2.
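Written out as one chain, the spoken computation above amounts to the following. Capital letters for the random variables are a notational assumption; the midpoint step uses the fact that, given X, Y is uniform on [X, X+1].

```latex
\begin{align*}
E[X+Y \mid X] &= X + E[Y \mid X] \\
              &= X + \frac{X + (X+1)}{2} \\
              &= \frac{2X}{2} + \frac{2X+1}{2}
               = \frac{4X+1}{2}
               = 2X + \tfrac{1}{2}
\end{align*}
```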
91
00:04:53,960 --> 00:04:54,540
OK.
92
00:04:54,540 --> 00:04:58,810
So now I want to look at the
next term, the next inner
93
00:04:58,810 --> 00:05:01,610
term, which is this guy.
94
00:05:01,610 --> 00:05:05,070
So this computation is
going to be very
95
00:05:05,070 --> 00:05:07,320
similar in nature, actually.
96
00:05:07,320 --> 00:05:10,180
So we already discussed
that the joint--
97
00:05:10,180 --> 00:05:13,140
sorry, not the joint, the
conditional PDF of y given x
98
00:05:13,140 --> 00:05:14,640
is this guy.
99
00:05:14,640 --> 00:05:18,520
So the variance of x plus y
conditioned on x, we sort of
100
00:05:18,520 --> 00:05:20,940
have a similar phenomenon
occurring.
101
00:05:20,940 --> 00:05:24,180
x now in this conditional
world just acts like a
102
00:05:24,180 --> 00:05:28,250
constant that shifts the PDF but
doesn't change the width
103
00:05:28,250 --> 00:05:30,700
of the distribution at all.
104
00:05:30,700 --> 00:05:35,550
So this is actually just equal
to the variance of y given x,
105
00:05:35,550 --> 00:05:39,060
because constants don't
affect the variance.
106
00:05:39,060 --> 00:05:42,480
And now we can look at this
conditional PDF to figure out
107
00:05:42,480 --> 00:05:44,350
what this is.
108
00:05:44,350 --> 00:05:47,530
So we're going to take a quick
tangent over here, and I'm
109
00:05:47,530 --> 00:05:51,070
just going to remind you guys
that we have a formula for
110
00:05:51,070 --> 00:05:53,880
computing the variance of a
random variable when it's
111
00:05:53,880 --> 00:05:57,420
uniformly distributed between
two endpoints.
112
00:05:57,420 --> 00:06:02,270
So say we have a random variable
whose PDF looks
113
00:06:02,270 --> 00:06:04,330
something like this.
114
00:06:04,330 --> 00:06:06,420
Let's call it, let's say, w.
115
00:06:06,420 --> 00:06:09,190
This is p_W(w).
116
00:06:09,190 --> 00:06:13,510
We have a formula that says
variance of w is equal to (b
117
00:06:13,510 --> 00:06:17,000
minus a) squared over 12, where
a and b are the two endpoints.
118
00:06:17,000 --> 00:06:20,280
So we can apply that
formula over here.
119
00:06:20,280 --> 00:06:23,000
b is x plus 1, a is x.
120
00:06:23,000 --> 00:06:26,220
So (b minus a) squared over
12 is just 1/12.
121
00:06:26,220 --> 00:06:29,150
So we get 1/12.
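As a quick sanity check on this step, here is a minimal Python sketch of the uniform-variance formula. The helper name `uniform_variance` and the sample values of x are illustrative assumptions, not something from the lecture. It only confirms that the conditional variance comes out the same for every x, because the interval [x, x + 1] always has width 1.

```python
from fractions import Fraction

def uniform_variance(a, b):
    """Variance of a Uniform(a, b) random variable: (b - a)^2 / 12."""
    width = Fraction(b) - Fraction(a)
    return width ** 2 / 12

# Conditioned on X = x, Y is uniform on [x, x + 1], so the width is 1 for every x.
sample_xs = (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1))
conditional_variances = [uniform_variance(x, x + 1) for x in sample_xs]
print(conditional_variances)
```

Exact fractions are used instead of floats so that the comparison with 1/12 involves no rounding.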
122
00:06:29,150 --> 00:06:32,250
So we're making good progress,
because we have this inner
123
00:06:32,250 --> 00:06:34,460
quantity and this
inner quantity.
124
00:06:34,460 --> 00:06:37,170
So now all we need to do is take
the outer variance and
125
00:06:37,170 --> 00:06:39,430
the outer expectation.
126
00:06:39,430 --> 00:06:44,220
So writing this all down, we
get variance of x plus y is
127
00:06:44,220 --> 00:06:51,390
equal to variance of this guy,
2x plus 1/2 plus the
128
00:06:51,390 --> 00:06:52,890
expectation of 1/12.
129
00:06:52,890 --> 00:06:55,720
130
00:06:55,720 --> 00:06:59,230
So this term is quite simple.
131
00:06:59,230 --> 00:07:02,120
We know that the expectation of
a constant or of a scalar
132
00:07:02,120 --> 00:07:03,590
is simply that scalar.
133
00:07:03,590 --> 00:07:05,900
So this evaluates to 1/12.
134
00:07:05,900 --> 00:07:07,790
And this one is not
bad either.
135
00:07:07,790 --> 00:07:12,340
So similar to our discussion up
here, we know constants do
136
00:07:12,340 --> 00:07:13,930
not affect variance.
137
00:07:13,930 --> 00:07:16,180
You know they shift your
distribution, they don't
138
00:07:16,180 --> 00:07:17,480
change the variance.
139
00:07:17,480 --> 00:07:19,760
So we can ignore the 1/2.
140
00:07:19,760 --> 00:07:21,550
This scaling factor
of 2, however,
141
00:07:21,550 --> 00:07:23,670
will change the variance.
142
00:07:23,670 --> 00:07:25,190
But we know how to handle
this already
143
00:07:25,190 --> 00:07:26,600
from previous lectures.
144
00:07:26,600 --> 00:07:30,850
We know that you can just take
out this scalar scaling factor
145
00:07:30,850 --> 00:07:33,320
as long as we square it.
146
00:07:33,320 --> 00:07:36,820
So this becomes 2 squared,
or 4 times the
147
00:07:36,820 --> 00:07:40,790
variance of x plus 1/12.
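Putting the two inner results into the law of total variance, the steps just described can be summarized as follows (again writing X and Y in capitals as a notational assumption):

```latex
\begin{align*}
\operatorname{Var}(X+Y)
  &= \operatorname{Var}\big(E[X+Y \mid X]\big) + E\big[\operatorname{Var}(X+Y \mid X)\big] \\
  &= \operatorname{Var}\big(2X + \tfrac{1}{2}\big) + E\big[\tfrac{1}{12}\big] \\
  &= 2^2 \operatorname{Var}(X) + \tfrac{1}{12}
   = 4 \operatorname{Var}(X) + \tfrac{1}{12}
\end{align*}
```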
148
00:07:40,790 --> 00:07:43,410
And now to compute the variance
of x, we're going to
149
00:07:43,410 --> 00:07:47,050
use that formula again,
and we're
150
00:07:47,050 --> 00:07:48,670
going to use this picture.
151
00:07:48,670 --> 00:07:54,650
So here we have the joint PDF of
x and y, but really we want
152
00:07:54,650 --> 00:07:57,810
now the PDF of x, so we
can figure out what
153
00:07:57,810 --> 00:08:00,380
the variance is.
154
00:08:00,380 --> 00:08:02,800
So hopefully you remember a
trick we taught you called
155
00:08:02,800 --> 00:08:04,360
marginalization.
156
00:08:04,360 --> 00:08:08,950
To get the PDF of x given
a joint PDF, you simply
157
00:08:08,950 --> 00:08:12,090
marginalize over the
values of y.
158
00:08:12,090 --> 00:08:17,750
So if you freeze x equal to
0, you get the probability
159
00:08:17,750 --> 00:08:20,730
density of x at that point by
integrating over this
160
00:08:20,730 --> 00:08:22,130
interval, over y.
161
00:08:22,130 --> 00:08:25,000
So if you integrate over
this strip, you get 1.
162
00:08:25,000 --> 00:08:27,340
If you move x over a little
bit and you integrate over
163
00:08:27,340 --> 00:08:29,210
this strip, you get 1.
164
00:08:29,210 --> 00:08:32,480
This is the argument I was
making earlier that the width
165
00:08:32,480 --> 00:08:35,240
of this interval stays the same,
and hence, the variance
166
00:08:35,240 --> 00:08:36,600
stays the same.
167
00:08:36,600 --> 00:08:40,590
So based on that argument, which
was slightly hand wavy,
168
00:08:40,590 --> 00:08:42,059
let's come over here
and draw it.
169
00:08:42,059 --> 00:08:44,860
170
00:08:44,860 --> 00:08:50,060
We're claiming that the PDF of
x, p_X(x), looks like this.
171
00:08:50,060 --> 00:08:53,390
It's just uniformly distributed
between 0 and 1.
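The hand-wavy strip argument can be made precise with one integral. This sketch assumes the parallelogram is the region 0 ≤ x ≤ 1, x ≤ y ≤ x + 1, which matches the two boundary lines y = x and y = x + 1 named earlier, with the joint PDF equal to 1 on that region.

```latex
\[
p_X(x) = \int_{-\infty}^{\infty} p_{X,Y}(x, y)\, dy
       = \int_{x}^{x+1} 1 \, dy
       = 1, \qquad 0 \le x \le 1,
\]
```

and p_X(x) = 0 outside [0, 1], so x is uniform on [0, 1].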
172
00:08:53,390 --> 00:08:56,330
And if you buy that, then we're
done, we're home free,
173
00:08:56,330 --> 00:08:59,380
because we can apply this
formula, (b minus a) squared
174
00:08:59,380 --> 00:09:01,570
over 12, which gives us
the variance.
175
00:09:01,570 --> 00:09:04,620
So b is 1, a is 0, which
gives variance of
176
00:09:04,620 --> 00:09:07,010
x is equal to 1/12.
177
00:09:07,010 --> 00:09:13,980
So coming back over here, we
get 4 times 1/12 plus 1/12,
178
00:09:13,980 --> 00:09:16,100
which is 5/12.
179
00:09:16,100 --> 00:09:19,340
And that is our answer.
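One way to double-check the 5/12 is a small Monte Carlo simulation. The sketch below is an illustration, not part of the original solution: it assumes the parallelogram is 0 ≤ x ≤ 1, x ≤ y ≤ x + 1, and samples from it by drawing x uniformly on [0, 1] and then y uniformly on [x, x + 1], which is exactly the marginal and conditional distribution derived above. The function names, sample size, and seed are arbitrary choices.

```python
import random

def sample_sum(rng):
    """Draw one sample of X + Y from the uniform parallelogram."""
    x = rng.random()        # X ~ Uniform(0, 1)
    y = x + rng.random()    # given X = x, Y ~ Uniform(x, x + 1)
    return x + y

def estimate_variance(n=200_000, seed=0):
    """Estimate Var(X + Y) from n independent samples."""
    rng = random.Random(seed)
    samples = [sample_sum(rng) for _ in range(n)]
    mean = sum(samples) / n
    return sum((s - mean) ** 2 for s in samples) / n

estimate = estimate_variance()
print(estimate, 5 / 12)  # the two numbers should be close
```

The estimate fluctuates with the seed and sample size, so it should only be expected to agree with 5/12 to a couple of decimal places.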
180
00:09:19,340 --> 00:09:22,340
So this problem was
straightforward in the sense
181
00:09:22,340 --> 00:09:24,850
that our task was very clear.
182
00:09:24,850 --> 00:09:28,010
We had to compute this, and we
had to do so by using the law
183
00:09:28,010 --> 00:09:29,730
of total variance.
184
00:09:29,730 --> 00:09:33,170
But we sort of reviewed a lot
of concepts along the way.
185
00:09:33,170 --> 00:09:38,170
We saw how, given a joint
PDF, you marginalize to
186
00:09:38,170 --> 00:09:39,950
get the PDF of x.
187
00:09:39,950 --> 00:09:44,580
We saw how constants don't
change variance.
188
00:09:44,580 --> 00:09:47,780
We got a lot of practice
finding conditional
189
00:09:47,780 --> 00:09:49,700
distributions and computing
conditional
190
00:09:49,700 --> 00:09:51,670
expectations and variances.
191
00:09:51,670 --> 00:09:54,530
And we also saw this trick.
192
00:09:54,530 --> 00:09:58,040
And it might seem like cheating
to memorize formulas,
193
00:09:58,040 --> 00:09:59,960
but there are a few important
ones you should know.
194
00:09:59,960 --> 00:10:03,240
And it will help you sort of
become faster at doing
195
00:10:03,240 --> 00:10:04,170
computations.
196
00:10:04,170 --> 00:10:06,200
And that's important,
especially if you
197
00:10:06,200 --> 00:10:08,390
guys take the exams.
198
00:10:08,390 --> 00:10:09,020
So that's it.
199
00:10:09,020 --> 00:10:10,270
See you next time.
200
00:10:10,270 --> 00:10:11,220