WEBVTT
00:00:00.840 --> 00:00:01.980
Hi.
00:00:01.980 --> 00:00:03.910
In this video, we're going
to do some approximate
00:00:03.910 --> 00:00:07.850
calculations using the central
limit theorem.
00:00:07.850 --> 00:00:10.490
We're given that Xn is the
number of gadgets produced on
00:00:10.490 --> 00:00:12.140
day n by a factory.
00:00:12.140 --> 00:00:15.140
And it has a normal distribution
with mean 5 and
00:00:15.140 --> 00:00:16.460
variance 9.
00:00:16.460 --> 00:00:20.340
And they're all independent and
identically distributed.
00:00:20.340 --> 00:00:22.020
We're looking for the
probability that the total
00:00:22.020 --> 00:00:26.180
number of gadgets in 100
days is less than 440.
00:00:26.180 --> 00:00:33.660
To start, we can first write
this as the probability of the
00:00:33.660 --> 00:00:37.820
sum of the gadgets produced
on each of 100 days
00:00:37.820 --> 00:00:42.320
being less than 440.
00:00:42.320 --> 00:00:45.890
Notice that this is a sum of a
large number of independent
00:00:45.890 --> 00:00:46.960
random variables.
00:00:46.960 --> 00:00:50.370
So we can use the central limit
theorem and approximate
00:00:50.370 --> 00:00:53.435
the sum as a normal
random variable.
00:00:53.435 --> 00:00:56.780
And then, basically, in order
to compute this probability,
00:00:56.780 --> 00:00:59.790
we'd basically need to
standardize this and then use
00:00:59.790 --> 00:01:01.440
the standard normal table.
00:01:01.440 --> 00:01:03.490
So let's first compute
the expectation and
00:01:03.490 --> 00:01:06.920
variance of the sum.
00:01:06.920 --> 00:01:13.400
So I'm going to actually sum up
from 1 to n instead of 100,
00:01:13.400 --> 00:01:15.790
to do it more generally.
00:01:15.790 --> 00:01:21.700
So the linearity is preserved
for the expectation operator.
00:01:21.700 --> 00:01:24.660
So this is the sum of
the expected value.
00:01:24.660 --> 00:01:28.120
And since they're all
identically distributed, they
00:01:28.120 --> 00:01:31.210
all have the same expectation,
and there are n of them.
00:01:31.210 --> 00:01:35.240
And so we have this
being n times 5.
00:01:35.240 --> 00:01:45.240
For the variance of the sum is
also the sum of the variances
00:01:45.240 --> 00:01:48.710
because the independents.
00:01:48.710 --> 00:01:50.838
And so they're identically
distributed to the -- so we
00:01:50.838 --> 00:01:58.860
have n times the variance of
Xi, and this is n times 9.
00:01:58.860 --> 00:02:03.860
So now, we can standardize
it, or make it 0 mean
00:02:03.860 --> 00:02:04.980
and variance 1.
00:02:04.980 --> 00:02:09.750
So to do that we would
take these Xi's,
00:02:09.750 --> 00:02:11.220
subtract by their mean.
00:02:11.220 --> 00:02:17.670
So it's going to be 5 times 100
of them, so it's 500 over
00:02:17.670 --> 00:02:19.620
the square root of the variance,
which is going to be
00:02:19.620 --> 00:02:24.430
9 times 100 of them, so
it's going to be 900.
00:02:24.430 --> 00:02:31.400
So that's going to be less than
440 minus 500 over square
00:02:31.400 --> 00:02:34.980
root of 900.
00:02:34.980 --> 00:02:39.790
So notice what we're trying
to do here is--
00:02:39.790 --> 00:02:45.840
notice that the sum of Xi's
is a discrete quantity.
00:02:45.840 --> 00:02:47.780
So it's a discrete random
variable, so it may
00:02:47.780 --> 00:02:49.190
have a PMF like this.
00:02:53.110 --> 00:02:54.140
And we're trying to
approximate it
00:02:54.140 --> 00:02:56.600
with a normal density.
00:02:56.600 --> 00:03:01.160
So this is not drawn to scale,
but let's say that this is 440
00:03:01.160 --> 00:03:04.700
and this is 439.
00:03:04.700 --> 00:03:07.170
Basically, we're trying to say
what's the probability of this
00:03:07.170 --> 00:03:10.990
being less than 440, so it's the
probability that it's 439,
00:03:10.990 --> 00:03:15.410
or 438, or 437.
00:03:15.410 --> 00:03:19.310
But in the continuous case, a
good approximation to this
00:03:19.310 --> 00:03:24.540
would be to take the middle,
say, 439.5, and compute the
00:03:24.540 --> 00:03:27.690
area below that.
00:03:27.690 --> 00:03:32.560
So in this case, when we do the
normal approximation, it
00:03:32.560 --> 00:03:37.130
works out better if we use
this half correction.
00:03:37.130 --> 00:03:42.320
And so, this, in this case,
probability, let's call Z the
00:03:42.320 --> 00:03:44.220
standard normal.
00:03:44.220 --> 00:03:47.390
And so this is approximately
equal to a standard normal
00:03:47.390 --> 00:03:52.800
with the probability of standard
normal being less
00:03:52.800 --> 00:03:53.950
than whatever that is.
00:03:53.950 --> 00:03:55.850
And if you plug that into
your calculator, you
00:03:55.850 --> 00:03:59.790
get negative 2.02.
00:03:59.790 --> 00:04:03.670
So now, if we try to figure
out what this--
00:04:03.670 --> 00:04:06.230
from the table, we'll
find that negative
00:04:06.230 --> 00:04:07.730
values are not tabulated.
00:04:07.730 --> 00:04:11.970
But we know that the normal,
the center of normal is
00:04:11.970 --> 00:04:15.520
symmetric, and so if we want
to compute the area in this
00:04:15.520 --> 00:04:18.010
region, it's the same
as the area in this
00:04:18.010 --> 00:04:20.019
region, above 2.02.
00:04:20.019 --> 00:04:22.970
So this is the same as the
probability that Z
00:04:22.970 --> 00:04:28.400
is bigger than 2.02.
00:04:28.400 --> 00:04:32.480
That's just 1 minus the
probability that Z is less
00:04:32.480 --> 00:04:36.370
than or equal to 2.02,
and so that's, by
00:04:36.370 --> 00:04:40.910
definition, phi of 2.02.
00:04:40.910 --> 00:04:45.950
And if we look it up on the
table, 2.02 has probability
00:04:45.950 --> 00:04:49.490
here of 0.9783.
00:04:49.490 --> 00:04:51.590
And we can just write that in.
00:04:51.590 --> 00:04:59.590
That's the answer for Part A.
00:04:59.590 --> 00:05:01.800
So now for Part B.
00:05:01.800 --> 00:05:07.500
We're asked what's the largest
n, approximately, so that it
00:05:07.500 --> 00:05:09.600
satisfies this.
00:05:09.600 --> 00:05:13.590
So again, we can use the
central limit theorem.
00:05:13.590 --> 00:05:20.360
Use similar steps here so that
we have, in this case, n
00:05:20.360 --> 00:05:25.290
greater than or equal
to 200 plus 5n.
00:05:25.290 --> 00:05:26.620
And standardized.
00:05:26.620 --> 00:05:35.320
So we have n and the mean here--
this is where this
00:05:35.320 --> 00:05:36.030
comes handy.
00:05:36.030 --> 00:05:40.450
It's going to be 5n and
the variance is 9n.
00:05:40.450 --> 00:05:42.060
It's greater than or equal to.
00:05:42.060 --> 00:05:44.010
5n's will cancel and
you subtract.
00:05:44.010 --> 00:05:49.460
And then you get 200 over
the square root of 9n.
00:05:49.460 --> 00:05:54.600
And we can, again, use the half
approximation here, half
00:05:54.600 --> 00:05:55.520
correction here.
00:05:55.520 --> 00:06:00.820
But I'm not going to do it, to
keep the problem simple.
00:06:00.820 --> 00:06:05.340
And so in this case, this is
approximately equal to the
00:06:05.340 --> 00:06:07.890
standard normal being greater
than probability of the center
00:06:07.890 --> 00:06:11.870
of normal being greater than
or equal to 200 over square
00:06:11.870 --> 00:06:13.520
root of 9n.
00:06:13.520 --> 00:06:19.160
And so same sort
of thing here.
00:06:19.160 --> 00:06:23.540
This is just 1 minus this.
00:06:23.540 --> 00:06:26.740
The equal sign doesn't matter
because Z is a continuous
00:06:26.740 --> 00:06:28.140
random variable.
00:06:28.140 --> 00:06:34.270
And so we have this here.
00:06:34.270 --> 00:06:40.310
And we want this to be less
than or equal 0.05.
00:06:40.310 --> 00:06:47.170
So that means that phi of 200
over square root of 9 has to
00:06:47.170 --> 00:06:49.970
be greater than or
equal to 0.95.
00:06:49.970 --> 00:07:00.020
So we're basically looking for
something here that ensures
00:07:00.020 --> 00:07:04.360
that this region's
at least 0.95.
00:07:04.360 --> 00:07:09.810
So if you look at the table,
0.95 lies somewhere in between
00:07:09.810 --> 00:07:13.140
1.64 and 1.65.
00:07:13.140 --> 00:07:17.780
And I'm going to use 1.65 to
be conservative, because we
00:07:17.780 --> 00:07:20.640
want this region to
be at least 0.95.
00:07:20.640 --> 00:07:24.030
So 1.65 works better here.
00:07:24.030 --> 00:07:30.320
And so we want this thing, this
here, which is going to
00:07:30.320 --> 00:07:35.560
be 200 over square root of n--
00:07:35.560 --> 00:07:43.030
square root of 9n, to be bigger
than or equal to 1.65.
00:07:43.030 --> 00:07:51.240
So n here is going to be less
than or equal to 200 over 1.65
00:07:51.240 --> 00:07:55.140
squared, 1 over 9.
00:07:55.140 --> 00:07:57.650
If you plug this into your
calculator, you might have a
00:07:57.650 --> 00:07:58.660
decimal in there.
00:07:58.660 --> 00:08:02.210
Then we just pick n,
the largest integer
00:08:02.210 --> 00:08:05.220
that satisfies this.
00:08:05.220 --> 00:08:09.200
So we can plug that into your
calculator, you'll find that
00:08:09.200 --> 00:08:12.920
it's going to be 1,632.
00:08:12.920 --> 00:08:13.290
That's
00:08:13.290 --> 00:08:16.690
part B. Last part.
00:08:16.690 --> 00:08:19.910
Let n be the first day when the
total number of gadgets is
00:08:19.910 --> 00:08:21.720
greater than 1,000.
00:08:21.720 --> 00:08:23.930
What's the probability
that n is greater
00:08:23.930 --> 00:08:25.350
than or equal to 220?
00:08:25.350 --> 00:08:29.090
Again, we want to use the
central limit theorem, but the
00:08:29.090 --> 00:08:35.650
trick here is to recognize that
this is actually equal to
00:08:35.650 --> 00:08:42.240
the probability that the sum
from i equals 1 to 219 of Xi,
00:08:42.240 --> 00:08:45.370
is less than or equal
to 1,000.
00:08:45.370 --> 00:08:48.200
So let's look at both directions
to check this.
00:08:48.200 --> 00:08:51.390
If n is greater than or
equal to 220, then
00:08:51.390 --> 00:08:52.500
this has to be true.
00:08:52.500 --> 00:08:56.320
Because if it weren't true, and
if this were greater than
00:08:56.320 --> 00:09:00.790
1,000, then n would have been
less than or equal to 219.
00:09:00.790 --> 00:09:04.190
So this direction works.
00:09:04.190 --> 00:09:05.270
The other direction.
00:09:05.270 --> 00:09:08.990
If this were the case, it has
to be the case that n is
00:09:08.990 --> 00:09:14.060
greater than or equal to 220,
because up till 219 it hasn't
00:09:14.060 --> 00:09:15.410
exceeded 1,000.
00:09:15.410 --> 00:09:20.050
And so, at some point beyond
that, it's going to exceed
00:09:20.050 --> 00:09:24.090
1,000 and n is going to be
greater than or equal to 220.
00:09:24.090 --> 00:09:25.710
So this is the key trick here.
00:09:25.710 --> 00:09:34.120
And once you see this, you
realize that this is very easy
00:09:34.120 --> 00:09:38.070
because we do the same steps
as we did before.
00:09:38.070 --> 00:09:44.160
So you're looking for this, this
is equal to, again, you
00:09:44.160 --> 00:09:46.923
do your standardization.
00:09:49.850 --> 00:09:56.780
So this is from 219, and you get
5 times 219 for the mean,
00:09:56.780 --> 00:10:01.180
and 9 times 219 for the
variance, less than or equal
00:10:01.180 --> 00:10:07.300
to 1,000 minus 5 times
219 over square
00:10:07.300 --> 00:10:09.710
root of 9 times 219.
00:10:09.710 --> 00:10:14.130
Again, you can do the half
correction here, make it
00:10:14.130 --> 00:10:19.410
1,000.5, but I'm not going to
do that in this case, for
00:10:19.410 --> 00:10:20.760
simplicity.
00:10:20.760 --> 00:10:23.780
So this is approximately equal
to Z being less than
00:10:23.780 --> 00:10:27.170
whatever this is.
00:10:27.170 --> 00:10:29.080
And if you plug it in,
you'll find that
00:10:29.080 --> 00:10:31.950
this is negative 2.14.
00:10:31.950 --> 00:10:39.670
So in this case, we have this
is the probability that Z--
00:10:39.670 --> 00:10:41.410
again, we do the same thing--
00:10:41.410 --> 00:10:44.550
is greater than or
equal to 2.14.
00:10:44.550 --> 00:10:49.590
And this is 1 minus the
probability that Z is less
00:10:49.590 --> 00:10:53.750
than or equal to 2.14.
00:10:53.750 --> 00:11:00.640
And that's just phi of 2.14--
00:11:00.640 --> 00:11:03.570
1 minus Z of 2.14.
00:11:03.570 --> 00:11:04.830
And that's--
00:11:04.830 --> 00:11:07.850
if you look it up on the
table, it's 2.14.
00:11:07.850 --> 00:11:12.700
It's 0.9838.
00:11:12.700 --> 00:11:14.160
So here's your answer.
00:11:14.160 --> 00:11:16.440
So we're done with
Part C as well.
00:11:16.440 --> 00:11:20.330
So in this exercise, we did
a lot of approximate
00:11:20.330 --> 00:11:23.070
calculations using
the central--