1
00:00:00,000 --> 00:00:00,970
2
00:00:00,970 --> 00:00:02,080
Hi.
3
00:00:02,080 --> 00:00:03,810
In this problem, we'll get
more practice using
4
00:00:03,810 --> 00:00:05,210
conditioning to help
us calculate
5
00:00:05,210 --> 00:00:07,090
expectations of variances.
6
00:00:07,090 --> 00:00:09,500
We'll see that in this problem,
which deals with
7
00:00:09,500 --> 00:00:12,170
widgets and crates, it's
actually similar in flavor to
8
00:00:12,170 --> 00:00:14,130
an earlier problem that we
did, involving breaking a
9
00:00:14,130 --> 00:00:15,616
stick twice.
10
00:00:15,616 --> 00:00:18,560
And you'll see that in this
problem, we'll again use the
11
00:00:18,560 --> 00:00:21,590
law of iterated expectations and
the law of total variance
12
00:00:21,590 --> 00:00:24,450
to help us calculate
expectations of variances.
13
00:00:24,450 --> 00:00:27,500
And again, we'll be taking the
approach of attacking the
14
00:00:27,500 --> 00:00:30,910
problem by splitting into the
stages and building up from
15
00:00:30,910 --> 00:00:32,840
the bottom up.
16
00:00:32,840 --> 00:00:36,370
So in this problem, what we
have is a crate, which
17
00:00:36,370 --> 00:00:38,220
contains some number of boxes.
18
00:00:38,220 --> 00:00:40,210
And we don't know how
many boxes are.
19
00:00:40,210 --> 00:00:40,940
It's random.
20
00:00:40,940 --> 00:00:45,200
And it's given by some discrete
random variable, n.
21
00:00:45,200 --> 00:00:48,990
And in each box, there are
some number of widgets.
22
00:00:48,990 --> 00:00:50,840
And again, this is
also random.
23
00:00:50,840 --> 00:00:56,240
And in each box, say for Box
I, there are xi number of
24
00:00:56,240 --> 00:00:58,750
widgets in each one.
25
00:00:58,750 --> 00:01:01,120
What we're really interested
in in this problem is, how
26
00:01:01,120 --> 00:01:03,800
many widgets are there
total in this crate?
27
00:01:03,800 --> 00:01:06,330
So in the crate, there are
boxes, and in the boxes, there
28
00:01:06,330 --> 00:01:06,820
are widgets.
29
00:01:06,820 --> 00:01:10,160
How many widgets are there
total within the crate?
30
00:01:10,160 --> 00:01:12,680
And we'll call that a
random variable, t.
31
00:01:12,680 --> 00:01:14,320
And the problem gives
us some information.
32
00:01:14,320 --> 00:01:18,200
It tells us that the expectation
of the number of
33
00:01:18,200 --> 00:01:21,930
widgets in each box for all
the boxes is the same.
34
00:01:21,930 --> 00:01:22,930
It's 10.
35
00:01:22,930 --> 00:01:25,650
And also, the expectation
of the number of
36
00:01:25,650 --> 00:01:27,760
boxes is also 10.
37
00:01:27,760 --> 00:01:30,790
And furthermore, the variance of
x of the number of widgets
38
00:01:30,790 --> 00:01:34,140
and the number of
boxes is all 16.
39
00:01:34,140 --> 00:01:38,070
And lastly, an important fact
is that all the xi's, so all
40
00:01:38,070 --> 00:01:41,730
the widgets for each box, and
the total number of boxes,
41
00:01:41,730 --> 00:01:45,590
these random variables
are all independent.
42
00:01:45,590 --> 00:01:54,520
So to calculate t, t is just
a sum of x1 through xn.
43
00:01:54,520 --> 00:01:57,800
So x1 is the number of widgets
in Box 1, z2 is the number of
44
00:01:57,800 --> 00:02:01,050
widgets in Box 2, and all
the way through Box n.
45
00:02:01,050 --> 00:02:03,660
So what makes this difficult
is that the
46
00:02:03,660 --> 00:02:05,060
n is actually random.
47
00:02:05,060 --> 00:02:07,200
We don't actually know how
many boxes there are.
48
00:02:07,200 --> 00:02:11,070
So we don't even know how many
terms there are in the sum.
49
00:02:11,070 --> 00:02:14,670
Well, let's take a slightly
simpler problem.
50
00:02:14,670 --> 00:02:16,290
Let's pretend that we
actually know there
51
00:02:16,290 --> 00:02:18,690
are exactly 12 boxes.
52
00:02:18,690 --> 00:02:23,180
And in that case, the only thing
that's random now is how
53
00:02:23,180 --> 00:02:25,280
many widgets there
are in each box.
54
00:02:25,280 --> 00:02:30,170
And so let's call [? sum ?] a
new random variable, s, the
55
00:02:30,170 --> 00:02:31,920
sum of x1 through x12.
56
00:02:31,920 --> 00:02:34,540
So this would tell us,
this is the number of
57
00:02:34,540 --> 00:02:37,910
widgets in 12 boxes.
58
00:02:37,910 --> 00:02:38,900
All right.
59
00:02:38,900 --> 00:02:42,280
And because each of these xi's
are independent, and they have
60
00:02:42,280 --> 00:02:45,260
the same expectation, just by
linearity of expectations, we
61
00:02:45,260 --> 00:02:51,740
know that the expectation of s
is just 12 copies of the same
62
00:02:51,740 --> 00:02:54,220
expectation of xi.
63
00:02:54,220 --> 00:02:57,500
And similarly, because we also
assume that all the xi's are
64
00:02:57,500 --> 00:03:01,100
independent, the variance of
s, we can just add the
65
00:03:01,100 --> 00:03:02,530
variances of each
of these terms.
66
00:03:02,530 --> 00:03:06,610
So again, there are 12 copies
of the variance of xi.
67
00:03:06,610 --> 00:03:09,010
So we've done a simpler version
of this problem, where
68
00:03:09,010 --> 00:03:12,910
we've assumed we know what
n is, that n is 12.
69
00:03:12,910 --> 00:03:16,000
And we've seen that in this
simpler case, it's pretty
70
00:03:16,000 --> 00:03:19,850
simple to calculate what the
expectation of the sum is.
71
00:03:19,850 --> 00:03:22,640
So let's try to use that
knowledge to help us calculate
72
00:03:22,640 --> 00:03:24,840
the actual problem, where
n is actually random.
73
00:03:24,840 --> 00:03:27,420
74
00:03:27,420 --> 00:03:35,080
So what we'll do is use the law
of iterated expectations.
75
00:03:35,080 --> 00:03:38,220
And so this is written in terms
of x and y, but we can
76
00:03:38,220 --> 00:03:41,010
very easily just substitute in
for the random variables that
77
00:03:41,010 --> 00:03:41,610
we care about.
78
00:03:41,610 --> 00:03:46,690
Where in this case, what we see
is that in order to build
79
00:03:46,690 --> 00:03:52,510
things up, it would be helpful
if we condition on something
80
00:03:52,510 --> 00:03:54,360
that is useful.
81
00:03:54,360 --> 00:03:57,940
And in this case, it's fairly
clear that it would be helpful
82
00:03:57,940 --> 00:04:00,340
if we condition on n,
the number of boxes.
83
00:04:00,340 --> 00:04:02,960
So if we knew how many boxes
there were, then we can drop
84
00:04:02,960 --> 00:04:05,610
down to the level of widgets
within each box.
85
00:04:05,610 --> 00:04:08,410
And then once we have that, we
can build up and average over
86
00:04:08,410 --> 00:04:11,060
the total number of boxes.
87
00:04:11,060 --> 00:04:14,100
So what we should do
is condition on n,
88
00:04:14,100 --> 00:04:16,370
the number of boxes.
89
00:04:16,370 --> 00:04:19,820
So what have we discovered
through this
90
00:04:19,820 --> 00:04:21,269
simpler exercise earlier?
91
00:04:21,269 --> 00:04:25,310
Well, we've discovered that if
we knew the number of boxes,
92
00:04:25,310 --> 00:04:27,650
then the expectation of the
total number of widgets is
93
00:04:27,650 --> 00:04:30,870
just the number of boxes times
the number of widgets in each
94
00:04:30,870 --> 00:04:33,850
one, or the expectation of the
number of widgets in each one.
95
00:04:33,850 --> 00:04:39,680
So we can use that information
to help us here.
96
00:04:39,680 --> 00:04:45,590
Because now, this is basically
the same scenario, except that
97
00:04:45,590 --> 00:04:47,360
the number of boxes
is now random.
98
00:04:47,360 --> 00:04:50,010
Instead of being 12, it
could be anything.
99
00:04:50,010 --> 00:04:55,090
But if we just condition on
the number of boxes being
100
00:04:55,090 --> 00:04:57,040
equal to n, then we know
that there are
101
00:04:57,040 --> 00:04:59,550
exactly n copies of this.
102
00:04:59,550 --> 00:05:02,090
But notice that n here
is still random.
103
00:05:02,090 --> 00:05:08,370
And so what we get is that the
expectation is n times the
104
00:05:08,370 --> 00:05:10,940
expectation of the number of
widgets in each box, which we
105
00:05:10,940 --> 00:05:11,990
know is 10.
106
00:05:11,990 --> 00:05:17,830
So it's expectation of 10
times n or 10 times the
107
00:05:17,830 --> 00:05:22,210
expectation of n, which
gives us 100.
108
00:05:22,210 --> 00:05:27,040
Because there are, on
expectation, 10 boxes.
109
00:05:27,040 --> 00:05:28,850
So this, again, makes
intuitive sense.
110
00:05:28,850 --> 00:05:31,860
Because we know that on average,
there are 10 boxes.
111
00:05:31,860 --> 00:05:35,650
And on average, each box
has 10 widgets inside.
112
00:05:35,650 --> 00:05:37,330
And so on average,
we expect that
113
00:05:37,330 --> 00:05:39,260
there will be 100 widgets.
114
00:05:39,260 --> 00:05:42,120
And the key thing here is that
we actually relied on this
115
00:05:42,120 --> 00:05:42,920
independence.
116
00:05:42,920 --> 00:05:47,750
So if the number of widgets in
each box vary depending on--
117
00:05:47,750 --> 00:05:50,640
or if the distribution of the
number of widgets in each box
118
00:05:50,640 --> 00:05:53,170
vary depending on how many
boxes there were, then we
119
00:05:53,170 --> 00:05:54,790
wouldn't be able to
do it this simply.
120
00:05:54,790 --> 00:05:57,850
121
00:05:57,850 --> 00:06:01,980
OK, so that gives us the answer
to the first part, the
122
00:06:01,980 --> 00:06:04,200
expectation of the total
number of widgets.
123
00:06:04,200 --> 00:06:08,820
Now let's do the second part,
which is the variance.
124
00:06:08,820 --> 00:06:14,850
The variance, we'll again use
this idea of conditioning and
125
00:06:14,850 --> 00:06:17,810
splitting things up, and use
the law of total variance.
126
00:06:17,810 --> 00:06:22,460
So the variance of t is going to
be equal to the expectation
127
00:06:22,460 --> 00:06:31,560
of the conditional variance
plus the variance of the
128
00:06:31,560 --> 00:06:32,810
conditional expectation.
129
00:06:32,810 --> 00:06:35,100
130
00:06:35,100 --> 00:06:37,840
So what we have to do now is
just to calculate what all of
131
00:06:37,840 --> 00:06:39,380
these pieces are.
132
00:06:39,380 --> 00:06:41,810
So let's start with
this thing here,
133
00:06:41,810 --> 00:06:43,910
the conditional variance.
134
00:06:43,910 --> 00:06:45,510
So what is the conditional
variance?
135
00:06:45,510 --> 00:06:48,640
136
00:06:48,640 --> 00:06:51,690
Well, again, let's go back
to our simpler case.
137
00:06:51,690 --> 00:06:55,530
We know that if we knew what n
is, then the variance would
138
00:06:55,530 --> 00:07:01,600
just be n times the variance
of each xi.
139
00:07:01,600 --> 00:07:02,860
So what does that tell us?
140
00:07:02,860 --> 00:07:06,240
That tells us that, well, if
we knew what n was, so
141
00:07:06,240 --> 00:07:11,180
condition on n, the variance
would just be n times the
142
00:07:11,180 --> 00:07:14,440
variance of each xi.
143
00:07:14,440 --> 00:07:18,350
So we've just taken this analogy
and generalized it to
144
00:07:18,350 --> 00:07:19,940
the case where we don't actually
know what n is.
145
00:07:19,940 --> 00:07:23,370
We just condition on n, and we
still have a random variable.
146
00:07:23,370 --> 00:07:26,160
147
00:07:26,160 --> 00:07:30,340
So then from that, we know that
the expectation now, to
148
00:07:30,340 --> 00:07:32,840
get this first term, take
the expectation of this
149
00:07:32,840 --> 00:07:40,030
conditional variance, it's just
the expectation of n and
150
00:07:40,030 --> 00:07:42,120
the variance of xi,
we're given that.
151
00:07:42,120 --> 00:07:43,720
That's equal to 16.
152
00:07:43,720 --> 00:07:49,430
So it's n times 16, which we
know is 160, because the
153
00:07:49,430 --> 00:07:53,030
expectation of n, we
also know, is 10.
154
00:07:53,030 --> 00:07:56,130
All right, let's do this
second term now.
155
00:07:56,130 --> 00:07:57,680
We need the variance
of the conditional
156
00:07:57,680 --> 00:08:01,270
expectation of t given n.
157
00:08:01,270 --> 00:08:05,990
Well, what is the conditional
expectation of t given n?
158
00:08:05,990 --> 00:08:08,100
We've already kind of
used that here.
159
00:08:08,100 --> 00:08:12,600
And again, it's using the fact
that if we knew what n was,
160
00:08:12,600 --> 00:08:16,340
the expectation would just be n
times the expectation of the
161
00:08:16,340 --> 00:08:18,040
number of widgets in each box.
162
00:08:18,040 --> 00:08:20,800
So it would be n times the
expectation of each xi.
163
00:08:20,800 --> 00:08:23,590
164
00:08:23,590 --> 00:08:25,490
Now, to get the second
term, we just take
165
00:08:25,490 --> 00:08:27,100
the variance of this.
166
00:08:27,100 --> 00:08:36,350
So the variance is the variance
of n times the
167
00:08:36,350 --> 00:08:37,909
expectation of each xi.
168
00:08:37,909 --> 00:08:41,330
And the expectation
of each xi is 10.
169
00:08:41,330 --> 00:08:43,679
So it's n times 10.
170
00:08:43,679 --> 00:08:46,540
And now remember, when you
calculate variances,
171
00:08:46,540 --> 00:08:48,810
[? if you ?] have a constant
term inside, when you pull it
172
00:08:48,810 --> 00:08:49,700
out, you have to square it.
173
00:08:49,700 --> 00:08:54,090
So you get 100 times
the variance of n.
174
00:08:54,090 --> 00:08:57,030
And we know that the variance
of n is also 16.
175
00:08:57,030 --> 00:09:01,340
So this gives us 1600.
176
00:09:01,340 --> 00:09:01,630
All right.
177
00:09:01,630 --> 00:09:04,140
So now we've calculated
both terms here.
178
00:09:04,140 --> 00:09:05,610
The first term is
equal to 160.
179
00:09:05,610 --> 00:09:07,570
The second term is
equal to 1600.
180
00:09:07,570 --> 00:09:09,290
So to get the final answer,
all we have to
181
00:09:09,290 --> 00:09:11,240
do is add this up.
182
00:09:11,240 --> 00:09:17,890
So we get that the final answer
is equal to 1760.
183
00:09:17,890 --> 00:09:21,480
And this is not as obvious as
the expectation, where you
184
00:09:21,480 --> 00:09:22,965
could have just kind
of guessed that it
185
00:09:22,965 --> 00:09:24,570
was equal to 100.
186
00:09:24,570 --> 00:09:28,320
So again, this was just another
example of using
187
00:09:28,320 --> 00:09:32,220
conditioning and the laws of
total variance and iterated
188
00:09:32,220 --> 00:09:34,360
expectations in order to help
you solve a problem.
189
00:09:34,360 --> 00:09:38,600
And in this case, you could kind
of see that there is a
190
00:09:38,600 --> 00:09:41,340
hierarchy, where you
start with widgets.
191
00:09:41,340 --> 00:09:43,920
Widgets are contained in boxes,
and then crates contain
192
00:09:43,920 --> 00:09:45,080
some number of boxes.
193
00:09:45,080 --> 00:09:49,260
And so it's easy to
just condition and
194
00:09:49,260 --> 00:09:50,140
do it level by level.
195
00:09:50,140 --> 00:09:52,210
So you condition on the
number of boxes.
196
00:09:52,210 --> 00:09:54,620
If you know what the number of
boxes are, then you can easily
197
00:09:54,620 --> 00:09:57,320
calculate how many widgets
there are, on average.
198
00:09:57,320 --> 00:09:59,300
And then you average over the
number of boxes to get the
199
00:09:59,300 --> 00:10:01,970
final answer.
200
00:10:01,970 --> 00:10:03,020
So I hope that was helpful.
201
00:10:03,020 --> 00:10:04,270
And we'll see you next time.
202
00:10:04,270 --> 00:10:07,334