WEBVTT
00:00:00.840 --> 00:00:04.950
We have introduced the concept
of expected value or mean,
00:00:04.950 --> 00:00:08.670
which tells us the average value
of a random variable.
00:00:08.670 --> 00:00:13.170
We will now introduce another
quantity, the variance, which
00:00:13.170 --> 00:00:15.700
quantifies the spread of the
00:00:15.700 --> 00:00:18.530
distribution of a random variable.
00:00:18.530 --> 00:00:25.100
So consider a random variable
with a given PMF, for example
00:00:25.100 --> 00:00:27.640
like the PMF shown
in this diagram.
00:00:31.930 --> 00:00:37.310
And consider another random
variable that happens to have
00:00:37.310 --> 00:00:40.280
the same mean, but
its distribution
00:00:40.280 --> 00:00:42.200
is more spread out.
00:00:45.100 --> 00:00:49.090
So both random variables have
the same mean, which we denote
00:00:49.090 --> 00:00:53.880
by mu, and which in this picture
would be somewhere
00:00:53.880 --> 00:00:55.130
around here.
00:00:58.310 --> 00:01:03.840
However, the second PMF, the
blue PMF, has typical outcomes
00:01:03.840 --> 00:01:08.020
that tend to have a larger
distance from the mean.
00:01:08.020 --> 00:01:13.920
By distance from the mean what
we mean is that if the result
00:01:13.920 --> 00:01:17.289
of the random variable, its
numerical value, happens to
00:01:17.289 --> 00:01:23.230
be, let's say for example, this
one, then this quantity
00:01:23.230 --> 00:01:27.700
here, X minus mu, is the
distance from the mean, how
00:01:27.700 --> 00:01:31.440
far away the outcome of the
random variable happens to be
00:01:31.440 --> 00:01:34.400
from the mean of that
random variable.
00:01:34.400 --> 00:01:37.820
Of course, the distance from the
mean is a random quantity.
00:01:37.820 --> 00:01:39.520
It is a random variable.
00:01:39.520 --> 00:01:43.210
Its value is determined once
we know the outcome of the
00:01:43.210 --> 00:01:47.140
experiment and the value
of the random variable.
00:01:47.140 --> 00:01:50.640
What can we say about the
distance from the mean?
00:01:50.640 --> 00:01:55.390
Let us calculate its average
or expected value.
00:01:55.390 --> 00:01:58.840
The expected value of the
distance from the mean, which
00:01:58.840 --> 00:02:03.290
is this quantity, using the
linearity of expectations, is
00:02:03.290 --> 00:02:08.288
equal to the expected value of
X minus the constant mu.
00:02:08.288 --> 00:02:12.040
But the expected value of X is by
definition equal to mu.
00:02:12.040 --> 00:02:15.420
And so we obtain zero.
00:02:15.420 --> 00:02:18.090
So we see that the average value
of the distance from the
00:02:18.090 --> 00:02:19.890
mean is always zero.
00:02:19.890 --> 00:02:22.240
And so it is uninformative.
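As a quick numerical check of this point, here is a minimal sketch. The PMF below is hypothetical, chosen only for illustration, and the names pmf, mu, and mean_deviation are not from the lecture.

```python
from fractions import Fraction

# Hypothetical PMF (illustrative values only): value -> probability.
pmf = {1: Fraction(1, 4), 2: Fraction(1, 2), 5: Fraction(1, 4)}

# Mean of X: sum of x * p_X(x).
mu = sum(x * p for x, p in pmf.items())

# Expected value of the signed distance from the mean, E[X - mu].
mean_deviation = sum((x - mu) * p for x, p in pmf.items())

print(mu, mean_deviation)
```

The positive and negative deviations cancel exactly, whatever PMF is used.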
00:02:22.240 --> 00:02:26.360
What we really want is the
average absolute value of the
00:02:26.360 --> 00:02:30.750
distance from the mean, or
something with this flavor.
00:02:30.750 --> 00:02:33.700
Mathematically, it turns out
that the average of the
00:02:33.700 --> 00:02:37.690
squared distance from the
mean is a better behaved
00:02:37.690 --> 00:02:39.280
mathematical object.
00:02:39.280 --> 00:02:42.220
And this is the quantity
that we will consider.
00:02:42.220 --> 00:02:43.490
It has a name.
00:02:43.490 --> 00:02:45.410
It is called the variance.
00:02:45.410 --> 00:02:50.190
And it is defined as the
expected value of the squared
00:02:50.190 --> 00:02:53.050
distance from the mean.
00:02:53.050 --> 00:02:56.350
The first thing to note is that
the variance is always
00:02:56.350 --> 00:02:57.600
non-negative.
00:02:59.990 --> 00:03:04.400
This is because it is the
expected value of non-negative
00:03:04.400 --> 00:03:06.590
quantities.
00:03:06.590 --> 00:03:10.040
How exactly do we compute
the variance?
00:03:10.040 --> 00:03:15.440
The squared distance from the
mean is really a function of
00:03:15.440 --> 00:03:24.010
the random variable X. So it is
a function of the form g of
00:03:24.010 --> 00:03:30.490
X, where g is a particular
function defined this way.
00:03:37.530 --> 00:03:42.460
So we can use the expected value
rule applied to this
00:03:42.460 --> 00:03:44.410
particular function g.
00:03:44.410 --> 00:03:46.700
And we obtain the following.
00:03:55.310 --> 00:04:00.790
So what we have to do is to go
over all numerical values of
00:04:00.790 --> 00:04:05.310
the random variable X. For
each one, calculate its
00:04:05.310 --> 00:04:10.630
squared distance from the mean
and weigh that quantity
00:04:10.630 --> 00:04:15.190
according to the corresponding
probability of that particular
00:04:15.190 --> 00:04:16.440
numerical value.
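The procedure just described can be sketched in a few lines. This is only an illustration under stated assumptions: the PMF is hypothetical, and the helper name variance is introduced here, not taken from the lecture.

```python
from fractions import Fraction

def variance(pmf):
    """Variance of a discrete random variable given as {value: probability}."""
    # Mean: sum of x * p_X(x) over all values x.
    mu = sum(x * p for x, p in pmf.items())
    # Expected value rule with g(x) = (x - mu)^2:
    # weigh each squared distance by the probability of that value.
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Hypothetical PMF (illustrative values only).
pmf = {1: Fraction(1, 4), 2: Fraction(1, 2), 5: Fraction(1, 4)}
var_x = variance(pmf)
print(var_x)
```

Exact fractions are used so that the result is not affected by floating-point rounding.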
00:04:19.149 --> 00:04:22.860
One final comment: the variance
is a bit hard to
00:04:22.860 --> 00:04:25.865
interpret, because it is
in the wrong units.
00:04:25.865 --> 00:04:29.790
If capital X corresponds to
meters, then the variance has
00:04:29.790 --> 00:04:32.590
units of meters squared.
00:04:32.590 --> 00:04:36.450
A more intuitive quantity is
the square root of the
00:04:36.450 --> 00:04:39.970
variance, which is called
the standard deviation.
00:04:39.970 --> 00:04:43.280
It has the same units as the
random variable and captures
00:04:43.280 --> 00:04:44.845
the width of the distribution.
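To make the units remark concrete, here is a small sketch assuming a hypothetical length X measured in meters. The numbers are made up for illustration.

```python
import math

# Hypothetical PMF of a length X in meters (illustrative values only).
pmf = {1.0: 0.25, 2.0: 0.5, 5.0: 0.25}

mu = sum(x * p for x, p in pmf.items())                  # meters
var_x = sum((x - mu) ** 2 * p for x, p in pmf.items())   # meters squared
std_x = math.sqrt(var_x)                                 # meters again

print(var_x, std_x)
```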
00:04:47.490 --> 00:04:50.360
Let us now take a quick look at
some of the properties of
00:04:50.360 --> 00:04:51.380
the variance.
00:04:51.380 --> 00:04:54.710
We know that expectations have
a linearity property.
00:04:54.710 --> 00:04:57.300
Is this the case for the
variance as well?
00:04:57.300 --> 00:04:58.500
Not quite.
00:04:58.500 --> 00:05:01.820
Instead we have this relation
for the variance of a linear
00:05:01.820 --> 00:05:03.880
function of a random variable.
00:05:03.880 --> 00:05:06.920
Let us see why it is true.
00:05:06.920 --> 00:05:11.590
We use the shorthand notation
mu for the expected value of
00:05:11.590 --> 00:05:15.950
X. We will proceed one step at a
time and first consider what
00:05:15.950 --> 00:05:18.160
happens to the variance
if we add a
00:05:18.160 --> 00:05:20.470
constant to a random variable.
00:05:20.470 --> 00:05:26.040
So let Y be X plus
some constant b.
00:05:26.040 --> 00:05:31.120
And let us just define nu to
be the expected value of Y,
00:05:31.120 --> 00:05:34.450
which, using linearity of
expectations, is the expected
00:05:34.450 --> 00:05:37.470
value of X plus b.
00:05:37.470 --> 00:05:40.030
Let us now calculate
the variance.
00:05:40.030 --> 00:05:45.890
By definition the variance of Y
is the expected value of the
00:05:45.890 --> 00:05:50.170
distance squared of
Y from its mean.
00:05:53.170 --> 00:05:58.290
Now we substitute, because
in this case Y is
00:05:58.290 --> 00:06:00.700
equal to X plus b.
00:06:00.700 --> 00:06:05.170
Whereas the mean, nu,
is mu plus b.
00:06:10.790 --> 00:06:16.890
And now we notice that this
b cancels with that b.
00:06:16.890 --> 00:06:25.080
And we are left with the
expected value of X minus mu
00:06:25.080 --> 00:06:34.190
squared, which is just the
variance of X. So this proves
00:06:34.190 --> 00:06:38.020
this relation for the case
where a is equal to 1.
00:06:38.020 --> 00:06:43.030
The variance of X plus b is
equal to the variance of X. So
00:06:43.030 --> 00:06:46.960
we see that when we add a
constant to a random variable,
00:06:46.960 --> 00:06:49.300
the variance remains
unchanged.
00:06:49.300 --> 00:06:53.890
Intuitively, adding a constant
just moves the entire PMF
00:06:53.890 --> 00:06:56.880
right or left by some
amount, but without
00:06:56.880 --> 00:06:58.750
changing its shape.
00:06:58.750 --> 00:07:03.370
And so the spread of this
PMF remains unchanged.
00:07:03.370 --> 00:07:07.080
Let us now see what happens if
we multiply a random variable
00:07:07.080 --> 00:07:08.866
by a constant.
00:07:08.866 --> 00:07:14.920
Let again nu be the expected
value of Y. And so in this
00:07:14.920 --> 00:07:19.000
case by linearity this is equal
to a times the expected
00:07:19.000 --> 00:07:22.880
value of X. So it
is a times mu.
00:07:22.880 --> 00:07:29.220
We calculate the variance once
more using the definition and
00:07:29.220 --> 00:07:33.490
substituting in the place of
Y what Y is in this case--
00:07:33.490 --> 00:07:34.909
it's aX--
00:07:34.909 --> 00:07:40.770
and subtracting the mean of
Y, which is a mu, squared.
00:07:40.770 --> 00:07:44.170
We take out a factor
of a squared.
00:07:47.150 --> 00:07:51.909
And then we use linearity of
expectations to note that this
00:07:51.909 --> 00:07:55.750
is a squared times the expected
value of X minus mu
00:07:55.750 --> 00:08:04.050
squared, which is a squared
times the variance of X.
00:08:04.050 --> 00:08:07.680
So this establishes this formula
for the case where b
00:08:07.680 --> 00:08:09.260
equals zero.
00:08:09.260 --> 00:08:12.900
Putting together these two
facts, if we multiply a random
00:08:12.900 --> 00:08:19.080
variable by a, the variance gets
multiplied by a squared.
00:08:19.080 --> 00:08:22.510
And if we add a constant, the
variance doesn't change.
00:08:22.510 --> 00:08:26.720
And this establishes this
particular fact.
00:08:26.720 --> 00:08:38.159
As an example, the variance of,
let's say, 3 minus 4X is
00:08:38.159 --> 00:08:45.100
going to be equal to minus 4
squared times the variance of
00:08:45.100 --> 00:08:54.230
X, which is 16 times
the variance of X.
00:08:54.230 --> 00:08:58.100
Finally, let me mention an
alternative way of computing
00:08:58.100 --> 00:09:03.260
variances, which is often
a bit quicker.
00:09:03.260 --> 00:09:06.020
We have this useful
formula here.
00:09:06.020 --> 00:09:09.770
We will see later a few examples
of how it is used,
00:09:09.770 --> 00:09:15.180
but for now let me just
show why it is true.
00:09:15.180 --> 00:09:21.800
We have by definition that the
variance of X is the expected
00:09:21.800 --> 00:09:27.410
value of X minus mu squared.
00:09:27.410 --> 00:09:32.700
Now let us rewrite what is
inside the expectation by just
00:09:32.700 --> 00:09:35.495
expanding this square, which
is X squared minus
00:09:35.495 --> 00:09:40.900
2 mu X plus mu squared.
00:09:40.900 --> 00:09:44.290
Using linearity of expectations,
this is broken
00:09:44.290 --> 00:09:52.340
down into expected value of X
squared minus the expected
00:09:52.340 --> 00:09:56.000
value of two times mu X.
But mu is a constant.
00:09:56.000 --> 00:09:58.450
So we can take it outside
the expected value.
00:09:58.450 --> 00:10:01.480
And we're left with
2mu expected value
00:10:01.480 --> 00:10:09.160
of X plus mu squared.
00:10:09.160 --> 00:10:13.970
But remember that mu is just the
same as the expected value
00:10:13.970 --> 00:10:18.480
of X. So what we have here is
minus twice the expected value of X,
00:10:18.480 --> 00:10:22.840
squared, plus the expected value
of X, squared, and that
00:10:22.840 --> 00:10:27.730
leaves us with just minus the
expected value of X, squared.
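The result of this derivation is that the variance of X equals the expected value of X squared minus the square of the expected value of X. A minimal sketch comparing the two ways of computing it, assuming a hypothetical PMF (the variable names are introduced here for illustration):

```python
from fractions import Fraction

# Hypothetical PMF (illustrative values only).
pmf = {1: Fraction(1, 4), 2: Fraction(1, 2), 5: Fraction(1, 4)}

mean = sum(x * p for x, p in pmf.items())                # E[X]
second_moment = sum(x ** 2 * p for x, p in pmf.items())  # E[X^2]

# Definition: E[(X - mu)^2].
var_by_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Alternative formula: E[X^2] - (E[X])^2.
var_by_shortcut = second_moment - mean ** 2

print(var_by_definition, var_by_shortcut)
```

Both routes give the same number. The second needs only the mean and the second moment, which is why it is often quicker.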
00:10:35.090 --> 00:10:38.280
So we will now move in the
next segment into a few
00:10:38.280 --> 00:10:40.190
examples of variance
calculations.