1
00:00:00,000 --> 00:00:00,040
2
00:00:00,040 --> 00:00:00,710
Hey, guys.
3
00:00:00,710 --> 00:00:02,020
Welcome back.
4
00:00:02,020 --> 00:00:04,680
Today, we're going to do another
fun problem, which is
5
00:00:04,680 --> 00:00:07,980
a drill problem on joint PMFs.
6
00:00:07,980 --> 00:00:11,330
And the goal is that you will
feel more comfortable by the
7
00:00:11,330 --> 00:00:13,830
end of this problem,
manipulating joint PMFs.
8
00:00:13,830 --> 00:00:17,490
And we'll also review some ideas
about independence in
9
00:00:17,490 --> 00:00:19,380
the process.
10
00:00:19,380 --> 00:00:22,440
So just to go over what
I've drawn here, we
11
00:00:22,440 --> 00:00:25,090
are given an xy plane.
12
00:00:25,090 --> 00:00:27,810
And we're told what
the PMF is.
13
00:00:27,810 --> 00:00:30,080
And it's plotted for you here.
14
00:00:30,080 --> 00:00:34,220
What these stars indicate
is simply that
15
00:00:34,220 --> 00:00:35,130
there is a value there.
16
00:00:35,130 --> 00:00:36,230
But we don't know what it is.
17
00:00:36,230 --> 00:00:39,750
It could be anything
between 0 and 1.
18
00:00:39,750 --> 00:00:41,600
And so we're given this
list of questions.
19
00:00:41,600 --> 00:00:42,590
And we're just going
to work through
20
00:00:42,590 --> 00:00:45,000
them linearly together.
21
00:00:45,000 --> 00:00:48,970
So we start off pretty simply.
22
00:00:48,970 --> 00:00:52,900
We want to compute, in part a,
the probability that x takes
23
00:00:52,900 --> 00:00:54,590
on a value of 1.
24
00:00:54,590 --> 00:00:58,690
So for those of you who like
formulas, I'm going to use the
25
00:00:58,690 --> 00:01:01,390
formula, which is usually
referred to as
26
00:01:01,390 --> 00:01:03,380
marginalization.
27
00:01:03,380 --> 00:01:06,740
So the marginal over
x is given by
28
00:01:06,740 --> 00:01:08,840
summing over the joint.
29
00:01:08,840 --> 00:01:13,170
So here we are interested in the
probability that x is 1.
30
00:01:13,170 --> 00:01:15,650
So I'm just going to freeze
the value of 1 here.
31
00:01:15,650 --> 00:01:17,365
And we sum over y.
32
00:01:17,365 --> 00:01:20,530
And in particular,
1, 2, and 3.
33
00:01:20,530 --> 00:01:28,230
So carrying this out, this is
the Pxy of 1, 1, plus Pxy of
34
00:01:28,230 --> 00:01:34,290
1, 2, plus Pxy of 1, 3.
35
00:01:34,290 --> 00:01:37,990
And this, of course, reading
from the graph, is 1/12 plus
36
00:01:37,990 --> 00:01:45,700
2/12 plus 1/12, which is
equal to 4/12, or 1/3.
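If you'd like to check the marginalization by machine, here is a minimal sketch in Python. It assumes only the three masses read off the plot for x equal to 1; the dictionary name `joint` and the helper `marginal_x` are made up for illustration.

```python
from fractions import Fraction

# Masses read off the plot in the column x = 1 (values from the lecture).
joint = {
    (1, 1): Fraction(1, 12),
    (1, 2): Fraction(2, 12),
    (1, 3): Fraction(1, 12),
}

def marginal_x(joint, x):
    """Freeze x and sum the joint PMF over every y (hypothetical helper)."""
    return sum((p for (xi, _), p in joint.items() if xi == x), Fraction(0))

p_x1 = marginal_x(joint, 1)
print(p_x1)  # 1/3
```

Using exact fractions keeps the answer in the same form as on the board, 4/12 reduced to 1/3.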
37
00:01:45,700 --> 00:01:48,340
So now you guys know
the formula.
38
00:01:48,340 --> 00:01:51,040
Hopefully you'll remember the
term marginalization.
39
00:01:51,040 --> 00:01:54,170
But I want to point out that
intuitively you can come up
40
00:01:54,170 --> 00:01:57,040
with the answer much faster.
41
00:01:57,040 --> 00:02:01,010
So the probability that x is
equal to 1 is the probability
42
00:02:01,010 --> 00:02:02,840
that this dot happens
or this dot
43
00:02:02,840 --> 00:02:05,320
happens or this dot happens.
44
00:02:05,320 --> 00:02:10,280
Now, these dots, or outcomes,
they're disjoint.
45
00:02:10,280 --> 00:02:12,320
So you can just sum the
probabilities to get the
46
00:02:12,320 --> 00:02:15,280
probability of one of these
things happening.
47
00:02:15,280 --> 00:02:17,310
So it's the same computation.
48
00:02:17,310 --> 00:02:20,300
And you'll probably get there
a little bit faster.
49
00:02:20,300 --> 00:02:23,940
So we're done with part a already,
which is great.
50
00:02:23,940 --> 00:02:29,050
So for part b, conditioning on
x is equal to 1, we want to
51
00:02:29,050 --> 00:02:31,850
sketch the PMF of y.
52
00:02:31,850 --> 00:02:35,380
So if x is equal to
1 we are suddenly
53
00:02:35,380 --> 00:02:37,510
living in this universe.
54
00:02:37,510 --> 00:02:41,440
y can take values of 1, 2,
or 3 with these relative
55
00:02:41,440 --> 00:02:42,840
frequencies.
56
00:02:42,840 --> 00:02:45,790
So let's draw this here.
57
00:02:45,790 --> 00:02:47,400
So this is y.
58
00:02:47,400 --> 00:02:50,550
I said, already, y can take on
a value of 1. y can take on a
59
00:02:50,550 --> 00:02:51,270
value of 2.
60
00:02:51,270 --> 00:02:53,680
Or it can take on
a value of 3.
61
00:02:53,680 --> 00:02:58,700
And we're plotting here, Py
given x, y, conditioned on x
62
00:02:58,700 --> 00:03:00,390
is equal to 1.
63
00:03:00,390 --> 00:03:03,950
OK, so what I mean by preserving
the relative
64
00:03:03,950 --> 00:03:07,925
frequencies is that in
the unconditional world this
65
00:03:07,925 --> 00:03:10,320
dot is twice as likely
to happen as either
66
00:03:10,320 --> 00:03:12,470
this dot or this dot.
67
00:03:12,470 --> 00:03:16,110
And that relative likelihood
remains the same after
68
00:03:16,110 --> 00:03:17,590
conditioning.
69
00:03:17,590 --> 00:03:20,540
And the reason why we have to
change these values is because
70
00:03:20,540 --> 00:03:21,820
they have to sum to 1.
71
00:03:21,820 --> 00:03:24,410
So in other words, we have
to scale them up.
72
00:03:24,410 --> 00:03:26,420
So you can use a formula.
73
00:03:26,420 --> 00:03:28,780
But again, I'm here to show
you faster ways of
74
00:03:28,780 --> 00:03:30,390
thinking about it.
75
00:03:30,390 --> 00:03:34,620
So my little algorithm for
figuring out conditional PMFs
76
00:03:34,620 --> 00:03:36,600
is to take the numerators--
77
00:03:36,600 --> 00:03:38,500
so 1, 2, and 1--
78
00:03:38,500 --> 00:03:39,390
and sum them.
79
00:03:39,390 --> 00:03:41,300
So here that gives us 4.
80
00:03:41,300 --> 00:03:44,120
And then to preserve the
relative frequency, you
81
00:03:44,120 --> 00:03:47,630
actually keep the same
numerators but divide them by
82
00:03:47,630 --> 00:03:49,820
the sum, which you
just computed.
83
00:03:49,820 --> 00:03:51,530
So I'm going fast.
84
00:03:51,530 --> 00:03:53,950
I'll review in a second.
85
00:03:53,950 --> 00:03:57,390
But this is what you will
end up getting.
86
00:03:57,390 --> 00:04:02,890
So to recap, I did 1 plus 2
plus 1, which is 4, to get
87
00:04:02,890 --> 00:04:05,570
these denominators.
88
00:04:05,570 --> 00:04:08,080
And so I skipped a step here.
89
00:04:08,080 --> 00:04:11,880
This is really 2/4, which
is 1/2, obviously.
90
00:04:11,880 --> 00:04:14,030
So you add these
guys to get 4.
91
00:04:14,030 --> 00:04:15,710
And then you keep the
numerators and just
92
00:04:15,710 --> 00:04:16,649
divide them by 4.
93
00:04:16,649 --> 00:04:20,040
So 1/4, 2/4, which
is 1/2 and 1/4.
94
00:04:20,040 --> 00:04:21,959
And that's what we mean
by preserving
95
00:04:21,959 --> 00:04:23,230
the relative frequency.
96
00:04:23,230 --> 00:04:27,730
Except this thing now sums
to 1, which is what we want.
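The numerator trick can be sketched in a few lines of Python. This is just an illustration under the assumption that all the masses share a common denominator (here 12), so only the numerators matter; the function name `normalize` is made up.

```python
from fractions import Fraction

def normalize(numerators):
    """Keep each numerator and divide by the sum of all numerators."""
    total = sum(numerators.values())
    return {k: Fraction(n, total) for k, n in numerators.items()}

# Numerators of the three masses in the column x = 1: 1/12, 2/12, 1/12.
cond_y_given_x1 = normalize({1: 1, 2: 2, 3: 1})

for y, p in cond_y_given_x1.items():
    print(y, p)
# 1 1/4
# 2 1/2
# 3 1/4
```

The ratios 1 to 2 to 1 are untouched; only the scale changes so that the masses sum to 1.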
97
00:04:27,730 --> 00:04:32,520
OK, so we're done with part b.
98
00:04:32,520 --> 00:04:36,490
Part c actually follows almost
immediately from part b.
99
00:04:36,490 --> 00:04:39,540
In part c we're interested in
computing the conditional
100
00:04:39,540 --> 00:04:42,970
expectation of y given
that x is equal to 1.
101
00:04:42,970 --> 00:04:45,550
So we've already done most of
the legwork because we have
102
00:04:45,550 --> 00:04:49,070
the conditional PMF
that we need.
103
00:04:49,070 --> 00:04:52,390
And so expectation, you guys
have calculated a bunch of
104
00:04:52,390 --> 00:04:53,060
these by now.
105
00:04:53,060 --> 00:04:55,350
So I'm just going to
appeal to your
106
00:04:55,350 --> 00:04:57,640
intuition and to symmetry.
107
00:04:57,640 --> 00:05:00,490
Expectation acts like
center of mass.
108
00:05:00,490 --> 00:05:04,200
This is a symmetrical
distribution of mass.
109
00:05:04,200 --> 00:05:07,960
And so the center is
right here at 2.
110
00:05:07,960 --> 00:05:10,660
So this is simply 2.
111
00:05:10,660 --> 00:05:13,110
And if that went too fast,
just convince yourselves.
112
00:05:13,110 --> 00:05:15,050
Use the normal formula
for expectations.
113
00:05:15,050 --> 00:05:18,240
And your answer will
agree with ours.
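If you want to see the normal formula carried out, here is a small Python sketch. It assumes the conditional PMF from part b, 1/4, 1/2, 1/4 on y equal to 1, 2, 3; the variable names are just for illustration.

```python
from fractions import Fraction

# Conditional PMF of y given x = 1, from part b.
cond_y_given_x1 = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}

# E[Y | X = 1] = sum over y of y * P(y | x = 1)
expected_y = sum(y * p for y, p in cond_y_given_x1.items())
print(expected_y)  # 2
```

That is 1/4 plus 1 plus 3/4, which lands on the center of mass at 2.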
114
00:05:18,240 --> 00:05:21,670
OK, so d is a really
cool question.
115
00:05:21,670 --> 00:05:24,140
Because you can do
a lot of math.
116
00:05:24,140 --> 00:05:28,810
Or you can think and ask
yourself, at the most
117
00:05:28,810 --> 00:05:31,330
fundamental level, what
is independence?
118
00:05:31,330 --> 00:05:33,290
And if you think that
way you'll come to
119
00:05:33,290 --> 00:05:36,120
the answer very easily.
120
00:05:36,120 --> 00:05:41,740
So essentially, I rephrased this
to truncate it from the
121
00:05:41,740 --> 00:05:43,940
problem statement that
you guys are reading.
122
00:05:43,940 --> 00:05:47,710
But the idea is that
these stars are
123
00:05:47,710 --> 00:05:49,820
unknown probability masses.
124
00:05:49,820 --> 00:05:53,620
And this question is asking can
you figure out a way of
125
00:05:53,620 --> 00:05:58,390
assigning numbers between 0
and 1 to these values such
126
00:05:58,390 --> 00:06:02,395
that you end up with a valid
probability mass function, so
127
00:06:02,395 --> 00:06:06,020
everything sums to 1 and such
that x and y are independent?
128
00:06:06,020 --> 00:06:08,300
So it seems hard a priori.
129
00:06:08,300 --> 00:06:10,470
But let's think about
it a bit.
130
00:06:10,470 --> 00:06:12,540
And in the meantime I'm
going to erase this
131
00:06:12,540 --> 00:06:14,390
so I have more room.
132
00:06:14,390 --> 00:06:18,580
What does it mean for x and
y to be independent?
133
00:06:18,580 --> 00:06:23,480
Well, it means that they don't,
essentially, have
134
00:06:23,480 --> 00:06:25,760
information about each other.
135
00:06:25,760 --> 00:06:29,360
So if I tell you something about
x and if x and y are
136
00:06:29,360 --> 00:06:33,430
independent, your belief about
y shouldn't change.
137
00:06:33,430 --> 00:06:36,010
In other words, if you're a
rational person, x shouldn't
138
00:06:36,010 --> 00:06:38,950
change your belief about y.
139
00:06:38,950 --> 00:06:41,890
So let's look more closely
at this diagram.
140
00:06:41,890 --> 00:06:45,090
Now, the number 0 should
be popping out to you.
141
00:06:45,090 --> 00:06:49,500
Because this essentially means
that the point (3, 1) can't happen.
142
00:06:49,500 --> 00:06:52,850
Or it happens with
0 probability.
143
00:06:52,850 --> 00:06:57,630
So let's, say, fix x equal to 3.
144
00:06:57,630 --> 00:07:02,750
If you condition on x is equal
to 3, as I just said, this
145
00:07:02,750 --> 00:07:04,320
outcome can't happen.
146
00:07:04,320 --> 00:07:08,740
So y could only take on
values of 2 or 3.
147
00:07:08,740 --> 00:07:12,760
However, if you condition on x
is equal to 1, y could take on
148
00:07:12,760 --> 00:07:16,235
a value of 1 with probability
1/4 as we computed in the
149
00:07:16,235 --> 00:07:17,080
other problem.
150
00:07:17,080 --> 00:07:19,550
It could take on a value of
2 with probability of 1/2.
151
00:07:19,550 --> 00:07:23,220
Or it could take on a value
of 3 with probability 1/4.
152
00:07:23,220 --> 00:07:25,410
So these are actually very
different cases, right?
153
00:07:25,410 --> 00:07:28,450
Because if you observe
x is equal to 3 y can
154
00:07:28,450 --> 00:07:29,950
only be 2 or 3.
155
00:07:29,950 --> 00:07:35,030
But if you observe x is equal
to 1, y can be 1, 2, or 3.
156
00:07:35,030 --> 00:07:40,510
So actually, x, no matter what
values these stars take on, x
157
00:07:40,510 --> 00:07:43,000
always tells you something
about y.
158
00:07:43,000 --> 00:07:47,410
Therefore, the answer to
this, part d, is no.
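For those who like formulas, the same argument can be written with the product rule for independence. This is a sketch that assumes the column x equal to 3 carries some positive mass, which is what "y could only take on values of 2 or 3" suggests; the masses 0 at (3, 1) and 1/12 at (1, 1) are read off the plot.

```latex
P_{X,Y}(3,1) = 0,
\qquad
P_X(3)\,P_Y(1) \;\ge\; P_X(3)\,P_{X,Y}(1,1) \;=\; P_X(3)\cdot\tfrac{1}{12} \;>\; 0
\quad\Longrightarrow\quad
P_{X,Y}(3,1) \;\ne\; P_X(3)\,P_Y(1).
```

Independence would need the joint to equal the product of the marginals at every point, and it already fails at (3, 1) whatever the stars are.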
159
00:07:47,410 --> 00:07:50,600
So let's put a no with
an exclamation point.
160
00:07:50,600 --> 00:07:52,730
So I like that problem a lot.
161
00:07:52,730 --> 00:07:56,960
And hopefully it clarifies
independence for you guys.
162
00:07:56,960 --> 00:08:00,240
So parts e and f, we're going
to be thinking about
163
00:08:00,240 --> 00:08:02,390
independence again.
164
00:08:02,390 --> 00:08:05,330
To go over what the problem
statement gives you, we
165
00:08:05,330 --> 00:08:08,900
defined this event, B, which
is the event that x is less
166
00:08:08,900 --> 00:08:12,640
than or equal to 2 and y is
less than or equal to 2.
167
00:08:12,640 --> 00:08:14,250
So let's get some colors.
168
00:08:14,250 --> 00:08:15,510
So let's do bright pink.
169
00:08:15,510 --> 00:08:17,170
So that means we're essentially
170
00:08:17,170 --> 00:08:18,500
living in this world.
171
00:08:18,500 --> 00:08:22,320
There's only those four dots.
172
00:08:22,320 --> 00:08:25,370
And we're also told a very
important piece of information
173
00:08:25,370 --> 00:08:30,870
that, conditioned on B, x and y
are conditionally independent.
174
00:08:30,870 --> 00:08:33,409
OK, so part e, now that
we have this.
175
00:08:33,409 --> 00:08:36,860
And by the way, these two
assumptions apply to both
176
00:08:36,860 --> 00:08:39,100
parts e and part f.
177
00:08:39,100 --> 00:08:44,920
So in part e, we want to
find out Pxy of 2, 2.
178
00:08:44,920 --> 00:08:48,090
Or in English, what is the
probability that x takes on a
179
00:08:48,090 --> 00:08:50,650
value of 2 and y takes
on a value of 2?
180
00:08:50,650 --> 00:08:54,630
So determine the value
of this star.
181
00:08:54,630 --> 00:09:01,090
And the whole trick here is that
the possible values that
182
00:09:01,090 --> 00:09:04,500
this star could take on are
constrained by the fact that
183
00:09:04,500 --> 00:09:07,210
we need to make sure that x
and y are conditionally
184
00:09:07,210 --> 00:09:11,310
independent given B.
185
00:09:11,310 --> 00:09:17,670
So my claim is that if two
random variables are
186
00:09:17,670 --> 00:09:22,040
independent and you condition
on one of them, say we
187
00:09:22,040 --> 00:09:24,050
condition on x.
188
00:09:24,050 --> 00:09:27,180
If you condition on different
values of x, the relative
189
00:09:27,180 --> 00:09:30,460
frequencies of y should
be the same.
190
00:09:30,460 --> 00:09:33,930
So here, the relative frequency,
conditioned on x is
191
00:09:33,930 --> 00:09:35,060
equal to 1.
192
00:09:35,060 --> 00:09:37,500
The relative frequencies
of y are 2 to 1.
193
00:09:37,500 --> 00:09:41,170
This outcome is twice as likely
to happen as this one.
194
00:09:41,170 --> 00:09:44,370
If we condition on 2 this
outcome needs to be twice as
195
00:09:44,370 --> 00:09:46,990
likely to happen as
this outcome.
196
00:09:46,990 --> 00:09:51,210
If they weren't, x would tell
you information about y.
197
00:09:51,210 --> 00:09:55,450
Because you would know that the
distribution over 2 and 1
198
00:09:55,450 --> 00:09:56,530
would be different.
199
00:09:56,530 --> 00:09:57,900
OK?
200
00:09:57,900 --> 00:10:00,940
So because the relative
frequencies have to be the
201
00:10:00,940 --> 00:10:06,440
same and 2/12 is 2 times
1/12, this guy must
202
00:10:06,440 --> 00:10:09,800
also be 2 times 2/12.
203
00:10:09,800 --> 00:10:13,860
So that gives us our
answer for part e.
204
00:10:13,860 --> 00:10:16,220
Let me write up here.
205
00:10:16,220 --> 00:10:26,340
Part e, we need Pxy 2, 2
to be equal to 4/12.
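The ratio argument is short enough to sketch in Python. It assumes the three known masses inside B read off the plot, 1/12 at (1, 1), 2/12 at (1, 2) and 2/12 at (2, 1); the variable names are made up for illustration.

```python
from fractions import Fraction

# Known masses inside the event B (x <= 2 and y <= 2).
p11 = Fraction(1, 12)
p12 = Fraction(2, 12)
p21 = Fraction(2, 12)

# Conditional independence given B forces the same ratio in both columns:
#   p22 / p21 == p12 / p11
p22 = p21 * p12 / p11
print(p22)  # 1/3, which is 4/12
```

Any other value for the star would make the 2-to-1 ratio differ between the x equal to 1 and x equal to 2 columns.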
206
00:10:26,340 --> 00:10:32,630
And again, the way we got this
is simply we need x and y to
207
00:10:32,630 --> 00:10:37,250
be conditionally independent
given B. And if this were
208
00:10:37,250 --> 00:10:43,530
anything other than 4/12, the
relative frequency of y equal
209
00:10:43,530 --> 00:10:46,960
to 2 versus y equal to 1 would be
different from over here.
210
00:10:46,960 --> 00:10:49,540
So here, conditioned on
x is equal to 1.
211
00:10:49,540 --> 00:10:53,250
The outcome, y is equal to 2
is twice as likely as y is
212
00:10:53,250 --> 00:10:54,990
equal to 1.
213
00:10:54,990 --> 00:10:59,510
Here, if we put a value of 4/12
and you condition on x is
214
00:10:59,510 --> 00:11:05,090
equal to 2, the outcome y is
equal to 2 is still twice as
215
00:11:05,090 --> 00:11:09,610
likely as the outcome
y is equal to 1.
216
00:11:09,610 --> 00:11:11,630
And if you put any other number
there the relative
217
00:11:11,630 --> 00:11:13,080
frequencies would
be different.
218
00:11:13,080 --> 00:11:15,050
So x would be telling you
something about y.
219
00:11:15,050 --> 00:11:18,491
So they would not be
independent conditioned on B.
220
00:11:18,491 --> 00:11:19,730
OK, that was a mouthful.
221
00:11:19,730 --> 00:11:22,860
But hopefully you guys
have it now.
222
00:11:22,860 --> 00:11:28,080
And lastly, we have part f,
which follows pretty directly
223
00:11:28,080 --> 00:11:29,900
from part e.
224
00:11:29,900 --> 00:11:32,550
So we were still in the
unconditional universe.
225
00:11:32,550 --> 00:11:36,840
In part e, we were figuring out
what's the value of star
226
00:11:36,840 --> 00:11:39,010
in the whole unconditional
universe?
227
00:11:39,010 --> 00:11:41,890
Now, in part f, we want the
value of star in the
228
00:11:41,890 --> 00:11:44,250
conditional universe
where B occurred.
229
00:11:44,250 --> 00:11:49,180
So let's come over here and plot
a new graph so we don't
230
00:11:49,180 --> 00:11:51,130
confuse ourselves.
231
00:11:51,130 --> 00:11:53,442
So we have xy.
232
00:11:53,442 --> 00:11:55,130
x can be 1 or 2.
233
00:11:55,130 --> 00:11:56,740
y could be 1 or 2.
234
00:11:56,740 --> 00:12:00,690
So we have a plot that looks
something like this.
235
00:12:00,690 --> 00:12:05,360
And so again, same argument
as before.
236
00:12:05,360 --> 00:12:06,430
Let me just fill this in.
237
00:12:06,430 --> 00:12:08,930
From part e, we have
that this is 4/12.
238
00:12:08,930 --> 00:12:10,880
And we're going to use
my algorithm again.
239
00:12:10,880 --> 00:12:13,340
So in the conditional world,
the relative frequencies of
240
00:12:13,340 --> 00:12:14,990
these four dots should
be the same.
241
00:12:14,990 --> 00:12:19,000
But you need to scale them up so
that if you sum over all of
242
00:12:19,000 --> 00:12:20,780
them the probability
sums to 1.
243
00:12:20,780 --> 00:12:23,470
So you have a valid PMF.
244
00:12:23,470 --> 00:12:25,840
So my algorithm from before
was to add up all the
245
00:12:25,840 --> 00:12:26,920
numerators.
246
00:12:26,920 --> 00:12:30,520
So 1 plus 2 plus 4 plus
2 gives you 9.
247
00:12:30,520 --> 00:12:33,060
And then to preserve the
relative frequency you keep
248
00:12:33,060 --> 00:12:34,320
the same numerator.
249
00:12:34,320 --> 00:12:36,530
So here we had a
numerator of 1.
250
00:12:36,530 --> 00:12:38,790
That becomes 1/9.
251
00:12:38,790 --> 00:12:40,490
Here we had a numerator of 2.
252
00:12:40,490 --> 00:12:42,290
This becomes 2/9.
253
00:12:42,290 --> 00:12:43,900
Here we had a numerator of 4.
254
00:12:43,900 --> 00:12:45,310
That becomes 4/9.
255
00:12:45,310 --> 00:12:47,510
Here we had a numerator
of 2, so 2/9.
256
00:12:47,510 --> 00:12:51,040
And indeed, the relative
frequencies are preserved.
257
00:12:51,040 --> 00:12:52,870
And they all sum to 1.
258
00:12:52,870 --> 00:12:56,340
So our answer for part f--
259
00:12:56,340 --> 00:12:57,720
let's box it here--
260
00:12:57,720 --> 00:13:03,780
is that Pxy 2, 2 conditioned
on B is equal to 4/9,
261
00:13:03,780 --> 00:13:06,910
is just that guy.
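To double-check part f, here is a minimal Python sketch. It assumes the four numerators inside B are 1, 2, 2 and 4 over a common denominator of 12, as on the board, and it also verifies that the resulting conditional PMF factors into its marginals; all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Numerators of the four masses inside B, keyed by (x, y).
numerators = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 4}
total = sum(numerators.values())  # 9

# Conditional joint PMF given B: same numerators, new denominator.
cond_b = {xy: Fraction(n, total) for xy, n in numerators.items()}

# Conditional marginals of x and y given B.
px = {x: sum(p for (xi, _), p in cond_b.items() if xi == x) for x in (1, 2)}
py = {y: sum(p for (_, yi), p in cond_b.items() if yi == y) for y in (1, 2)}

# Independence given B: joint equals product of marginals at every point.
independent = all(
    cond_b[(x, y)] == px[x] * py[y] for x, y in product((1, 2), repeat=2)
)
print(cond_b[(2, 2)], independent)  # 4/9 True
```

The check passes because both marginals are 1/3 and 2/3, and 2/3 times 2/3 is exactly 4/9.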
262
00:13:06,910 --> 00:13:08,020
So we're done.
263
00:13:08,020 --> 00:13:09,780
Hopefully that wasn't
too painful.
264
00:13:09,780 --> 00:13:13,400
And this is a good drill
problem, because we got more
265
00:13:13,400 --> 00:13:16,880
comfortable working with
PMFs, joint PMFs.
266
00:13:16,880 --> 00:13:19,700
We went over marginalization.
267
00:13:19,700 --> 00:13:21,560
We went over conditioning.
268
00:13:21,560 --> 00:13:24,600
We went into
independence.
269
00:13:24,600 --> 00:13:30,500
And I also gave you this quick
algorithm for figuring out
270
00:13:30,500 --> 00:13:32,310
what conditional PMFs
are if you don't
271
00:13:32,310 --> 00:13:33,840
want to use the formulas.
272
00:13:33,840 --> 00:13:36,190
Namely, you sum all of the
numerators to get a new
273
00:13:36,190 --> 00:13:39,280
denominator and then divide all
the old numerators by the
274
00:13:39,280 --> 00:13:42,020
new denominator you computed.
275
00:13:42,020 --> 00:13:43,350
So I hope that was helpful.
276
00:13:43,350 --> 00:13:44,600
I'll see you next time.
277
00:13:44,600 --> 00:13:45,420