1
00:00:00,000 --> 00:00:00,840
2
00:00:00,840 --> 00:00:01,980
Hi.
3
00:00:01,980 --> 00:00:03,910
In this video, we're going
to do some approximate
4
00:00:03,910 --> 00:00:07,850
calculations using the central
limit theorem.
5
00:00:07,850 --> 00:00:10,490
We're given that Xn is the
number of gadgets produced on
6
00:00:10,490 --> 00:00:12,140
day n by a factory.
7
00:00:12,140 --> 00:00:15,140
And it has a normal distribution
with mean 5 and
8
00:00:15,140 --> 00:00:16,460
variance 9.
9
00:00:16,460 --> 00:00:20,340
And they're all independent and
identically distributed.
10
00:00:20,340 --> 00:00:22,020
We're looking for the
probability that the total
11
00:00:22,020 --> 00:00:26,180
number of gadgets in 100
days is less than 440.
12
00:00:26,180 --> 00:00:33,660
To start, we can first write
this as the probability of the
13
00:00:33,660 --> 00:00:37,820
sum of the gadgets produced
on each of 100 days
14
00:00:37,820 --> 00:00:42,320
being less than 440.
15
00:00:42,320 --> 00:00:45,890
Notice that this is a sum of a
large number of independent
16
00:00:45,890 --> 00:00:46,960
random variables.
17
00:00:46,960 --> 00:00:50,370
So we can use the central limit
theorem and approximate
18
00:00:50,370 --> 00:00:53,435
the sum as a normal
random variable.
19
00:00:53,435 --> 00:00:56,780
And then, basically, in order
to compute this probability,
20
00:00:56,780 --> 00:00:59,790
we'd basically need to
standardize this and then use
21
00:00:59,790 --> 00:01:01,440
the standard normal table.
22
00:01:01,440 --> 00:01:03,490
So let's first compute
the expectation and
23
00:01:03,490 --> 00:01:06,920
variance of the sum.
24
00:01:06,920 --> 00:01:13,400
So I'm going to actually sum up
from 1 to n instead of 100,
25
00:01:13,400 --> 00:01:15,790
to do it more generally.
26
00:01:15,790 --> 00:01:21,700
So the linearity is preserved
for the expectation operator.
27
00:01:21,700 --> 00:01:24,660
So this is the sum of
the expected value.
28
00:01:24,660 --> 00:01:28,120
And since they're all
identically distributed, they
29
00:01:28,120 --> 00:01:31,210
all have the same expectation,
and there are n of them.
30
00:01:31,210 --> 00:01:35,240
And so we have this
being n times 5.
31
00:01:35,240 --> 00:01:45,240
For the variance of the sum is
also the sum of the variances
32
00:01:45,240 --> 00:01:48,710
because the independents.
33
00:01:48,710 --> 00:01:50,838
And so they're identically
distributed to the -- so we
34
00:01:50,838 --> 00:01:58,860
have n times the variance of
Xi, and this is n times 9.
35
00:01:58,860 --> 00:02:03,860
So now, we can standardize
it, or make it 0 mean
36
00:02:03,860 --> 00:02:04,980
and variance 1.
37
00:02:04,980 --> 00:02:09,750
So to do that we would
take these Xi's,
38
00:02:09,750 --> 00:02:11,220
subtract by their mean.
39
00:02:11,220 --> 00:02:17,670
So it's going to be 5 times 100
of them, so it's 500 over
40
00:02:17,670 --> 00:02:19,620
the square root of the variance,
which is going to be
41
00:02:19,620 --> 00:02:24,430
9 times 100 of them, so
it's going to be 900.
42
00:02:24,430 --> 00:02:31,400
So that's going to be less than
440 minus 500 over square
43
00:02:31,400 --> 00:02:34,980
root of 900.
44
00:02:34,980 --> 00:02:39,790
So notice what we're trying
to do here is--
45
00:02:39,790 --> 00:02:45,840
notice that the sum of Xi's
is a discrete quantity.
46
00:02:45,840 --> 00:02:47,780
So it's a discrete random
variable, so it may
47
00:02:47,780 --> 00:02:49,190
have a PMF like this.
48
00:02:49,190 --> 00:02:53,110
49
00:02:53,110 --> 00:02:54,140
And we're trying to
approximate it
50
00:02:54,140 --> 00:02:56,600
with a normal density.
51
00:02:56,600 --> 00:03:01,160
So this is not drawn to scale,
but let's say that this is 440
52
00:03:01,160 --> 00:03:04,700
and this is 439.
53
00:03:04,700 --> 00:03:07,170
Basically, we're trying to say
what's the probability of this
54
00:03:07,170 --> 00:03:10,990
being less than 440, so it's the
probability that it's 439,
55
00:03:10,990 --> 00:03:15,410
or 438, or 437.
56
00:03:15,410 --> 00:03:19,310
But in the continuous case, a
good approximation to this
57
00:03:19,310 --> 00:03:24,540
would be to take the middle,
say, 439.5, and compute the
58
00:03:24,540 --> 00:03:27,690
area below that.
59
00:03:27,690 --> 00:03:32,560
So in this case, when we do the
normal approximation, it
60
00:03:32,560 --> 00:03:37,130
works out better if we use
this half correction.
61
00:03:37,130 --> 00:03:42,320
And so, this, in this case,
probability, let's call Z the
62
00:03:42,320 --> 00:03:44,220
standard normal.
63
00:03:44,220 --> 00:03:47,390
And so this is approximately
equal to a standard normal
64
00:03:47,390 --> 00:03:52,800
with the probability of standard
normal being less
65
00:03:52,800 --> 00:03:53,950
than whatever that is.
66
00:03:53,950 --> 00:03:55,850
And if you plug that into
your calculator, you
67
00:03:55,850 --> 00:03:59,790
get negative 2.02.
68
00:03:59,790 --> 00:04:03,670
So now, if we try to figure
out what this--
69
00:04:03,670 --> 00:04:06,230
from the table, we'll
find that negative
70
00:04:06,230 --> 00:04:07,730
values are not tabulated.
71
00:04:07,730 --> 00:04:11,970
But we know that the normal,
the center of normal is
72
00:04:11,970 --> 00:04:15,520
symmetric, and so if we want
to compute the area in this
73
00:04:15,520 --> 00:04:18,010
region, it's the same
as the area in this
74
00:04:18,010 --> 00:04:20,019
region, above 2.02.
75
00:04:20,019 --> 00:04:22,970
So this is the same as the
probability that Z
76
00:04:22,970 --> 00:04:28,400
is bigger than 2.02.
77
00:04:28,400 --> 00:04:32,480
That's just 1 minus the
probability that Z is less
78
00:04:32,480 --> 00:04:36,370
than or equal to 2.02,
and so that's, by
79
00:04:36,370 --> 00:04:40,910
definition, phi of 2.02.
80
00:04:40,910 --> 00:04:45,950
And if we look it up on the
table, 2.02 has probability
81
00:04:45,950 --> 00:04:49,490
here of 0.9783.
82
00:04:49,490 --> 00:04:51,590
And we can just write that in.
83
00:04:51,590 --> 00:04:59,590
That's the answer for Part A.
84
00:04:59,590 --> 00:05:01,800
So now for Part B.
85
00:05:01,800 --> 00:05:07,500
We're asked what's the largest
n, approximately, so that it
86
00:05:07,500 --> 00:05:09,600
satisfies this.
87
00:05:09,600 --> 00:05:13,590
So again, we can use the
central limit theorem.
88
00:05:13,590 --> 00:05:20,360
Use similar steps here so that
we have, in this case, n
89
00:05:20,360 --> 00:05:25,290
greater than or equal
to 200 plus 5n.
90
00:05:25,290 --> 00:05:26,620
And standardized.
91
00:05:26,620 --> 00:05:35,320
So we have n and the mean here--
this is where this
92
00:05:35,320 --> 00:05:36,030
comes handy.
93
00:05:36,030 --> 00:05:40,450
It's going to be 5n and
the variance is 9n.
94
00:05:40,450 --> 00:05:42,060
It's greater than or equal to.
95
00:05:42,060 --> 00:05:44,010
5n's will cancel and
you subtract.
96
00:05:44,010 --> 00:05:49,460
And then you get 200 over
the square root of 9n.
97
00:05:49,460 --> 00:05:54,600
And we can, again, use the half
approximation here, half
98
00:05:54,600 --> 00:05:55,520
correction here.
99
00:05:55,520 --> 00:06:00,820
But I'm not going to do it, to
keep the problem simple.
100
00:06:00,820 --> 00:06:05,340
And so in this case, this is
approximately equal to the
101
00:06:05,340 --> 00:06:07,890
standard normal being greater
than probability of the center
102
00:06:07,890 --> 00:06:11,870
of normal being greater than
or equal to 200 over square
103
00:06:11,870 --> 00:06:13,520
root of 9n.
104
00:06:13,520 --> 00:06:19,160
And so same sort
of thing here.
105
00:06:19,160 --> 00:06:23,540
This is just 1 minus this.
106
00:06:23,540 --> 00:06:26,740
The equal sign doesn't matter
because Z is a continuous
107
00:06:26,740 --> 00:06:28,140
random variable.
108
00:06:28,140 --> 00:06:34,270
And so we have this here.
109
00:06:34,270 --> 00:06:40,310
And we want this to be less
than or equal 0.05.
110
00:06:40,310 --> 00:06:47,170
So that means that phi of 200
over square root of 9 has to
111
00:06:47,170 --> 00:06:49,970
be greater than or
equal to 0.95.
112
00:06:49,970 --> 00:07:00,020
So we're basically looking for
something here that ensures
113
00:07:00,020 --> 00:07:04,360
that this region's
at least 0.95.
114
00:07:04,360 --> 00:07:09,810
So if you look at the table,
0.95 lies somewhere in between
115
00:07:09,810 --> 00:07:13,140
1.64 and 1.65.
116
00:07:13,140 --> 00:07:17,780
And I'm going to use 1.65 to
be conservative, because we
117
00:07:17,780 --> 00:07:20,640
want this region to
be at least 0.95.
118
00:07:20,640 --> 00:07:24,030
So 1.65 works better here.
119
00:07:24,030 --> 00:07:30,320
And so we want this thing, this
here, which is going to
120
00:07:30,320 --> 00:07:35,560
be 200 over square root of n--
121
00:07:35,560 --> 00:07:43,030
square root of 9n, to be bigger
than or equal to 1.65.
122
00:07:43,030 --> 00:07:51,240
So n here is going to be less
than or equal to 200 over 1.65
123
00:07:51,240 --> 00:07:55,140
squared, 1 over 9.
124
00:07:55,140 --> 00:07:57,650
If you plug this into your
calculator, you might have a
125
00:07:57,650 --> 00:07:58,660
decimal in there.
126
00:07:58,660 --> 00:08:02,210
Then we just pick n,
the largest integer
127
00:08:02,210 --> 00:08:05,220
that satisfies this.
128
00:08:05,220 --> 00:08:09,200
So we can plug that into your
calculator, you'll find that
129
00:08:09,200 --> 00:08:12,920
it's going to be 1,632.
130
00:08:12,920 --> 00:08:13,290
That's
131
00:08:13,290 --> 00:08:16,690
part B. Last part.
132
00:08:16,690 --> 00:08:19,910
Let n be the first day when the
total number of gadgets is
133
00:08:19,910 --> 00:08:21,720
greater than 1,000.
134
00:08:21,720 --> 00:08:23,930
What's the probability
that n is greater
135
00:08:23,930 --> 00:08:25,350
than or equal to 220?
136
00:08:25,350 --> 00:08:29,090
Again, we want to use the
central limit theorem, but the
137
00:08:29,090 --> 00:08:35,650
trick here is to recognize that
this is actually equal to
138
00:08:35,650 --> 00:08:42,240
the probability that the sum
from i equals 1 to 219 of Xi,
139
00:08:42,240 --> 00:08:45,370
is less than or equal
to 1,000.
140
00:08:45,370 --> 00:08:48,200
So let's look at both directions
to check this.
141
00:08:48,200 --> 00:08:51,390
If n is greater than or
equal to 220, then
142
00:08:51,390 --> 00:08:52,500
this has to be true.
143
00:08:52,500 --> 00:08:56,320
Because if it weren't true, and
if this were greater than
144
00:08:56,320 --> 00:09:00,790
1,000, then n would have been
less than or equal to 219.
145
00:09:00,790 --> 00:09:04,190
So this direction works.
146
00:09:04,190 --> 00:09:05,270
The other direction.
147
00:09:05,270 --> 00:09:08,990
If this were the case, it has
to be the case that n is
148
00:09:08,990 --> 00:09:14,060
greater than or equal to 220,
because up till 219 it hasn't
149
00:09:14,060 --> 00:09:15,410
exceeded 1,000.
150
00:09:15,410 --> 00:09:20,050
And so, at some point beyond
that, it's going to exceed
151
00:09:20,050 --> 00:09:24,090
1,000 and n is going to be
greater than or equal to 220.
152
00:09:24,090 --> 00:09:25,710
So this is the key trick here.
153
00:09:25,710 --> 00:09:34,120
And once you see this, you
realize that this is very easy
154
00:09:34,120 --> 00:09:38,070
because we do the same steps
as we did before.
155
00:09:38,070 --> 00:09:44,160
So you're looking for this, this
is equal to, again, you
156
00:09:44,160 --> 00:09:46,923
do your standardization.
157
00:09:46,923 --> 00:09:49,850
158
00:09:49,850 --> 00:09:56,780
So this is from 219, and you get
5 times 219 for the mean,
159
00:09:56,780 --> 00:10:01,180
and 9 times 219 for the
variance, less than or equal
160
00:10:01,180 --> 00:10:07,300
to 1,000 minus 5 times
219 over square
161
00:10:07,300 --> 00:10:09,710
root of 9 times 219.
162
00:10:09,710 --> 00:10:14,130
Again, you can do the half
correction here, make it
163
00:10:14,130 --> 00:10:19,410
1,000.5, but I'm not going to
do that in this case, for
164
00:10:19,410 --> 00:10:20,760
simplicity.
165
00:10:20,760 --> 00:10:23,780
So this is approximately equal
to Z being less than
166
00:10:23,780 --> 00:10:27,170
whatever this is.
167
00:10:27,170 --> 00:10:29,080
And if you plug it in,
you'll find that
168
00:10:29,080 --> 00:10:31,950
this is negative 2.14.
169
00:10:31,950 --> 00:10:39,670
So in this case, we have this
is the probability that Z--
170
00:10:39,670 --> 00:10:41,410
again, we do the same thing--
171
00:10:41,410 --> 00:10:44,550
is greater than or
equal to 2.14.
172
00:10:44,550 --> 00:10:49,590
And this is 1 minus the
probability that Z is less
173
00:10:49,590 --> 00:10:53,750
than or equal to 2.14.
174
00:10:53,750 --> 00:11:00,640
And that's just phi of 2.14--
175
00:11:00,640 --> 00:11:03,570
1 minus Z of 2.14.
176
00:11:03,570 --> 00:11:04,830
And that's--
177
00:11:04,830 --> 00:11:07,850
if you look it up on the
table, it's 2.14.
178
00:11:07,850 --> 00:11:12,700
It's 0.9838.
179
00:11:12,700 --> 00:11:14,160
So here's your answer.
180
00:11:14,160 --> 00:11:16,440
So we're done with
Part C as well.
181
00:11:16,440 --> 00:11:20,330
So in this exercise, we did
a lot of approximate
182
00:11:20,330 --> 00:11:23,070
calculations using
the central--