WEBVTT
00:00:00.740 --> 00:00:02.770
In this segment we start
a new topic.
00:00:02.770 --> 00:00:05.180
We will talk about the
covariance of two random
00:00:05.180 --> 00:00:08.430
variables, which gives us useful
information about the
00:00:08.430 --> 00:00:11.720
dependencies between these
two random variables.
00:00:11.720 --> 00:00:14.110
Let us motivate the concept
by looking first
00:00:14.110 --> 00:00:15.600
at a special case.
00:00:15.600 --> 00:00:18.690
Suppose that X and Y have zero
means and that they are
00:00:18.690 --> 00:00:20.870
discrete random variables.
00:00:20.870 --> 00:00:24.690
If X and Y are independent, then
the expectation of the
00:00:24.690 --> 00:00:28.880
product is the product
of the expectations.
00:00:28.880 --> 00:00:32.100
And since we have assumed zero
means, this is going to be
00:00:32.100 --> 00:00:33.830
equal to zero.
00:00:33.830 --> 00:00:39.190
But suppose instead that the
joint PMF of X and Y is of the
00:00:39.190 --> 00:00:40.990
following kind.
00:00:40.990 --> 00:00:45.080
Each point in this diagram is
equally likely, so we have
00:00:45.080 --> 00:00:49.600
here a discrete uniform
distribution on the discrete
00:00:49.600 --> 00:00:55.100
set which consists of the points
shown in this diagram.
00:00:55.100 --> 00:00:58.330
What we have in this particular
example is that at
00:00:58.330 --> 00:01:02.610
most outcomes, positive values
of X tend to go together with
00:01:02.610 --> 00:01:06.970
positive values of Y. And
negative values of X tend to
00:01:06.970 --> 00:01:11.100
go together with negative values
of Y. So most of the
00:01:11.100 --> 00:01:15.039
time we have outcomes in this
quadrant, in which x times y
00:01:15.039 --> 00:01:18.950
is positive, or in this quadrant
where x times y is,
00:01:18.950 --> 00:01:20.210
again, positive.
00:01:20.210 --> 00:01:24.190
But some of the time we fall in
this quadrant where x times
00:01:24.190 --> 00:01:26.410
y is negative, or in
this quadrant where
00:01:26.410 --> 00:01:28.830
x times y is negative.
00:01:28.830 --> 00:01:32.940
Since we have many more points
here and here, on the average,
00:01:32.940 --> 00:01:37.130
the value of x times y is
going to be positive.
00:01:41.350 --> 00:01:45.990
On the other hand, if the
diagram takes this form, then,
00:01:45.990 --> 00:01:50.140
most of the time, the pair x, y
lies in this quadrant or in
00:01:50.140 --> 00:01:51.870
that quadrant where
the product of
00:01:51.870 --> 00:01:54.280
x times y is negative.
00:01:54.280 --> 00:01:57.300
So the random variables X and
Y typically have opposite
00:01:57.300 --> 00:02:02.950
signs, and on the average, the
expected value of X times Y is
00:02:02.950 --> 00:02:06.170
going to be negative.
00:02:06.170 --> 00:02:09.530
So here we have a positive
expectation, here we have a
00:02:09.530 --> 00:02:14.110
negative expectation of X times
Y. This quantity, the
00:02:14.110 --> 00:02:18.300
expected value of X times Y,
tells us whether X and Y tend
00:02:18.300 --> 00:02:22.579
to move in the same or in
opposite directions.
00:02:22.579 --> 00:02:25.860
And this quantity is what we
call the covariance, in the
00:02:25.860 --> 00:02:28.420
zero mean case.
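
NOTE
A minimal sketch of the zero-mean argument above, in Python. The two point
sets are hypothetical (the diagrams on the slide are not reproduced in this
transcript); each is uniform with zero means, and expected_product is an
illustrative helper name, not something from the lecture.
```python
from fractions import Fraction
def expected_product(points):
    """E[XY] under a uniform PMF on the given (x, y) points."""
    p = Fraction(1, len(points))
    return sum(p * x * y for x, y in points)
# Hypothetical points: four in the same-sign quadrants, two in the others.
mostly_same_sign = [(1, 1), (2, 2), (-1, -1), (-2, -2), (1, -1), (-1, 1)]
# Flipping the sign of y moves most points to the opposite-sign quadrants.
mostly_opposite_sign = [(x, -y) for x, y in mostly_same_sign]
print(expected_product(mostly_same_sign))      # positive
print(expected_product(mostly_opposite_sign))  # negative
```
The sign of E[XY] follows whichever pair of quadrants holds most of the
probability mass, which is the point made with the two diagrams.
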
00:02:28.420 --> 00:02:30.329
Let us now generalize.
00:02:30.329 --> 00:02:33.260
The random variables do not
have to be discrete.
00:02:33.260 --> 00:02:35.430
This quantity is well
defined for any
00:02:35.430 --> 00:02:37.900
kind of random variables.
00:02:37.900 --> 00:02:43.040
And if we have non-zero means,
the covariance is defined by
00:02:43.040 --> 00:02:46.100
this expression.
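
NOTE
The expression referred to here is on the slide and is not in the
transcript. Assuming it is the standard definition, which matches the
description that follows (deviations of X and Y from their means), it
would read:
```latex
\operatorname{cov}(X, Y)
  = \mathbf{E}\bigl[(X - \mathbf{E}[X])\,(Y - \mathbf{E}[Y])\bigr]
```
With zero means this reduces to E[XY], the quantity discussed earlier.
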
00:02:46.100 --> 00:02:50.829
What we have here is that we
look at the deviation of X
00:02:50.829 --> 00:02:55.240
from its mean value, and the
deviation of Y from its mean
00:02:55.240 --> 00:02:59.720
value, and we're asking whether
these two deviations
00:02:59.720 --> 00:03:03.170
tend to have the same sign or
not, whether they move in the
00:03:03.170 --> 00:03:05.370
same direction or not.
00:03:05.370 --> 00:03:08.930
If the covariance is positive,
what it tells us is that
00:03:08.930 --> 00:03:12.210
whenever this quantity is
positive so that X is above
00:03:12.210 --> 00:03:18.410
its mean, then, typically or
usually, the deviation of Y
00:03:18.410 --> 00:03:22.600
from its mean will also
tend to be positive.
00:03:22.600 --> 00:03:26.160
To summarize, the covariance,
in general, tells us whether
00:03:26.160 --> 00:03:31.470
two random variables tend to
move together, both being high
00:03:31.470 --> 00:03:37.160
or both being low, in some
average or typical sense.
00:03:37.160 --> 00:03:40.140
Now, if the two random variables
are independent, we
00:03:40.140 --> 00:03:44.160
already saw that in the zero
mean case, this quantity--
00:03:44.160 --> 00:03:45.100
the covariance--
00:03:45.100 --> 00:03:46.550
is going to be 0.
00:03:46.550 --> 00:03:50.570
How about the case where
we have non-zero means?
00:03:50.570 --> 00:03:56.620
Well, if we have independence,
then we have the expected
00:03:56.620 --> 00:03:58.720
value of the product of
two random variables.
00:03:58.720 --> 00:04:03.540
X and Y are independent, so X
minus its expected value,
00:04:03.540 --> 00:04:06.710
which is a constant, is going to
be independent from Y minus
00:04:06.710 --> 00:04:08.920
its expected value.
00:04:08.920 --> 00:04:12.520
And so, the covariance is going
to be the product of two
00:04:12.520 --> 00:04:13.770
expectations.
00:04:25.220 --> 00:04:30.590
But the expected value of X
minus this constant is 0, and
00:04:30.590 --> 00:04:34.120
the same is true for
this term as well.
00:04:34.120 --> 00:04:37.805
So the covariance in this case
is going to be equal to 0.
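
NOTE
The steps just described can be written out as follows. This is a sketch
that assumes the standard definition of covariance; the second equality is
the step that uses independence.
```latex
\begin{align*}
\operatorname{cov}(X, Y)
  &= \mathbf{E}\bigl[(X - \mathbf{E}[X])\,(Y - \mathbf{E}[Y])\bigr] \\
  &= \mathbf{E}\bigl[X - \mathbf{E}[X]\bigr]\,\mathbf{E}\bigl[Y - \mathbf{E}[Y]\bigr] \\
  &= (\mathbf{E}[X] - \mathbf{E}[X])\,(\mathbf{E}[Y] - \mathbf{E}[Y]) = 0
\end{align*}
```
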
00:04:40.340 --> 00:04:44.790
So in the independent case,
we have zero covariances.
00:04:44.790 --> 00:04:48.570
On the other hand, the
converse is not true.
00:04:48.570 --> 00:04:54.085
There are examples in which we
have dependence but zero
00:04:54.085 --> 00:04:55.640
covariance.
00:04:55.640 --> 00:04:57.140
Here is one example.
00:04:57.140 --> 00:05:01.580
In this example there are
four possible outcomes.
00:05:01.580 --> 00:05:05.800
At any particular outcome,
either X or Y
00:05:05.800 --> 00:05:07.430
is going to be 0.
00:05:07.430 --> 00:05:11.150
So in this example the random
variable X times Y is
00:05:11.150 --> 00:05:13.200
identically equal to 0.
00:05:13.200 --> 00:05:16.940
The mean of X is also 0, the
mean of Y is also 0 by
00:05:16.940 --> 00:05:19.230
symmetry, so the covariance
is the expected
00:05:19.230 --> 00:05:20.780
value of this quantity.
00:05:20.780 --> 00:05:25.160
And so the covariance, in this
example, is equal to 0.
00:05:25.160 --> 00:05:27.810
On the other hand, the
two random variables,
00:05:27.810 --> 00:05:30.380
X and Y, are dependent.
00:05:30.380 --> 00:05:35.300
If I tell you that X is equal to
1, then you know that this
00:05:35.300 --> 00:05:38.460
outcome has occurred.
00:05:38.460 --> 00:05:44.010
And in that case, you are
certain that Y is equal to 0.
00:05:44.010 --> 00:05:47.070
So knowing the value of X tells
you a lot about the
00:05:47.070 --> 00:05:51.440
value of Y and, therefore, we
have dependence between these
00:05:51.440 --> 00:05:52.690
two random variables.
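
NOTE
A minimal check of this example, in Python. The four outcomes are an
assumption: the transcript does not list them, so the sketch uses
(1,0), (0,1), (-1,0), (0,-1), each with probability 1/4, which agrees with
what is said (X or Y is always 0, zero means by symmetry, X = 1 forces Y = 0).
```python
from fractions import Fraction
# Assumed outcomes, equally likely; not taken from the slide.
outcomes = [(1, 0), (0, 1), (-1, 0), (0, -1)]
p = Fraction(1, len(outcomes))
mean_x = sum(p * x for x, _ in outcomes)
mean_y = sum(p * y for _, y in outcomes)
covariance = sum(p * (x - mean_x) * (y - mean_y) for x, y in outcomes)
# Independence would require P(X=1, Y=0) to equal P(X=1) * P(Y=0).
p_joint = sum(p for x, y in outcomes if x == 1 and y == 0)
p_x1 = sum(p for x, _ in outcomes if x == 1)
p_y0 = sum(p for _, y in outcomes if y == 0)
print(covariance, p_joint, p_x1 * p_y0)
```
The covariance is zero, yet the joint probability differs from the product
of the marginals, so under these assumed outcomes X and Y are dependent.
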