WEBVTT

00:00:01.510 --> 00:00:05.570
Let us now revisit the subject
of expectations and develop an

00:00:05.570 --> 00:00:08.580
important linearity property
for the case where we're

00:00:08.580 --> 00:00:11.330
dealing with multiple
random variables.

00:00:11.330 --> 00:00:14.420
We already have a linearity
property.

00:00:14.420 --> 00:00:18.350
If we have a linear function of
a single random variable,

00:00:18.350 --> 00:00:22.220
then expectations behave
in a linear fashion.

00:00:22.220 --> 00:00:25.370
But now, if we have multiple
random variables, we have this

00:00:25.370 --> 00:00:26.930
additional property.

00:00:26.930 --> 00:00:30.630
The expected value of the sum
of two random variables is

00:00:30.630 --> 00:00:34.340
equal to the sum of their
expectations.

00:00:34.340 --> 00:00:37.610
Let us go through the derivation
of this very

00:00:37.610 --> 00:00:42.030
important fact because it is a
nice exercise in applying the

00:00:42.030 --> 00:00:45.630
expected value rule and
also manipulating

00:00:45.630 --> 00:00:47.195
PMFs and joint PMFs.

00:00:49.960 --> 00:00:54.740
We're dealing with the expected
value of a function

00:00:54.740 --> 00:00:58.110
of two random variables.

00:00:58.110 --> 00:00:59.700
Which function?

00:00:59.700 --> 00:01:08.300
If we write it this way, we are
dealing with the function

00:01:08.300 --> 00:01:12.275
g, which is just the sum
of its two entries.

00:01:17.230 --> 00:01:20.760
So now we can continue with
the application of the

00:01:20.760 --> 00:01:22.710
expected value rule.

00:01:22.710 --> 00:01:28.039
And we obtain the sum over
all possible x, y pairs.

00:01:28.039 --> 00:01:30.750
Here, we need to write
g of x,y.

00:01:30.750 --> 00:01:33.759
But in our case, the function
we're dealing with

00:01:33.759 --> 00:01:36.789
is just x plus y.

00:01:36.789 --> 00:01:40.210
And then we weigh, according
to the entries

00:01:40.210 --> 00:01:41.580
of the joint PMF.

00:01:41.580 --> 00:01:45.289
So this is just an application
of the expected value rule to

00:01:45.289 --> 00:01:47.729
this particular function.

00:01:47.729 --> 00:01:52.740
Now let us take this sum and
break it into two pieces--

00:01:52.740 --> 00:02:02.870
one involving only the x-term,
and another piece involving

00:02:02.870 --> 00:02:04.120
only the y-term.

00:02:14.640 --> 00:02:20.850
Now, if we look at this
double summation, look

00:02:20.850 --> 00:02:22.540
at the inner sum.

00:02:22.540 --> 00:02:24.490
It's a sum over y's.

00:02:24.490 --> 00:02:28.420
While we're adding over y's, the
value of x remains fixed.

00:02:28.420 --> 00:02:31.900
So x is a constant, as far
as the sum is concerned.

00:02:31.900 --> 00:02:35.585
So x can be pulled outside
this summation.

00:02:48.160 --> 00:02:54.930
Let us just continue with this
term, the first one, and see

00:02:54.930 --> 00:02:57.570
that a simplification happens.

00:02:57.570 --> 00:03:01.290
This quantity here is the sum
of the probabilities of the

00:03:01.290 --> 00:03:05.150
different y's that can go
together with a particular x.

00:03:05.150 --> 00:03:07.740
So it is just equal to
the probability of

00:03:07.740 --> 00:03:08.940
that particular x.

00:03:08.940 --> 00:03:10.430
It's the marginal PMF.
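
NOTE
A small numerical sketch of the step just described, not part of the lecture itself.
It assumes a made-up joint PMF stored as a dictionary (the name joint_pmf and its
values are hypothetical) and shows that summing the joint PMF over y for a fixed x
gives the marginal PMF of X, so the first double sum collapses to a single sum over x.
```python
# Hypothetical joint PMF p_{X,Y}(x, y), stored as {(x, y): probability}.
joint_pmf = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
def marginal_x(joint):
    """Sum the joint PMF over y for each fixed x to get the marginal PMF of X."""
    marginal = {}
    for (x, _y), prob in joint.items():
        marginal[x] = marginal.get(x, 0.0) + prob
    return marginal
p_x = marginal_x(joint_pmf)
# First term after the simplification: sum over x of x * p_X(x).
single_sum = sum(x * prob for x, prob in p_x.items())
# The same term as a double sum, before the inner sum over y is collapsed.
double_sum = sum(x * prob for (x, _y), prob in joint_pmf.items())
print(single_sum, double_sum)
```
The two printed numbers should agree up to floating-point rounding.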

00:03:17.630 --> 00:03:21.600
If we carry out a similar step
for the second term, we will

00:03:21.600 --> 00:03:23.780
obtain the sum over y's.

00:03:23.780 --> 00:03:27.940
It's just a symmetrical
argument.

00:03:27.940 --> 00:03:31.579
And at this point we recognize
that what we have in front of

00:03:31.579 --> 00:03:36.520
us is just the expected value of
X, this is the first term,

00:03:36.520 --> 00:03:40.170
plus the expected value of
Y. So this completes the

00:03:40.170 --> 00:03:42.870
derivation of this important
linearity property.
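
NOTE
The steps of the derivation are written on the slide and do not appear in the captions.
The block below is a reconstruction from the spoken description, assuming the usual
notation p_{X,Y} for the joint PMF and p_X, p_Y for the marginal PMFs.
```latex
\begin{align*}
E[X+Y] &= \sum_x \sum_y (x+y)\, p_{X,Y}(x,y) \\
       &= \sum_x \sum_y x\, p_{X,Y}(x,y) + \sum_x \sum_y y\, p_{X,Y}(x,y) \\
       &= \sum_x x \sum_y p_{X,Y}(x,y) + \sum_y y \sum_x p_{X,Y}(x,y) \\
       &= \sum_x x\, p_X(x) + \sum_y y\, p_Y(y) \\
       &= E[X] + E[Y].
\end{align*}
```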

00:03:45.500 --> 00:03:47.900
Of course, we proved the
linearity property for the

00:03:47.900 --> 00:03:51.460
case of the sum of two
random variables.

00:03:51.460 --> 00:03:55.130
But you can proceed in a similar
way, or maybe use

00:03:55.130 --> 00:03:59.930
induction, and one can easily
establish, by following the

00:03:59.930 --> 00:04:03.100
same kind of argument, that we
have a linearity property when

00:04:03.100 --> 00:04:07.320
we add any finite number
of random variables.

00:04:07.320 --> 00:04:09.750
The expected value of
a sum is the sum of

00:04:09.750 --> 00:04:12.720
the expected values.

00:04:12.720 --> 00:04:16.450
Just for a little bit of
practice, if, for example,

00:04:16.450 --> 00:04:19.200
we're dealing with this
expression, the expected value

00:04:19.200 --> 00:04:25.510
of that expression would be the
expected value of 2X plus

00:04:25.510 --> 00:04:33.370
the expected value of 3Y minus
the expected value of Z. And

00:04:33.370 --> 00:04:37.440
then, using the linearity
property for linear functions

00:04:37.440 --> 00:04:41.380
of a single random variable, we
can pull the constants out

00:04:41.380 --> 00:04:42.510
of the expectations.

00:04:42.510 --> 00:04:46.330
And this would be twice the
expected value of X plus 3

00:04:46.330 --> 00:04:56.380
times the expected value of Y
minus the expected value of Z.
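
NOTE
A minimal sketch that checks this practice example numerically. The joint PMF over
(x, y, z) triples is made up for illustration, and the helper name expectation is
hypothetical. The three variables are deliberately dependent in this toy PMF, and
the two sides are still expected to match.
```python
# Hypothetical joint PMF over (x, y, z) triples; the values are illustrative only.
joint_pmf = {(0, 1, 2): 0.2, (1, 1, 0): 0.3, (2, 0, 1): 0.1, (1, 2, 2): 0.4}
def expectation(g, joint):
    """Expected value rule: sum g(x, y, z) weighted by the joint PMF."""
    return sum(g(x, y, z) * prob for (x, y, z), prob in joint.items())
# Left-hand side: E[2X + 3Y - Z] computed directly from the joint PMF.
lhs = expectation(lambda x, y, z: 2 * x + 3 * y - z, joint_pmf)
# Right-hand side: 2 E[X] + 3 E[Y] - E[Z], one expectation at a time.
e_x = expectation(lambda x, y, z: x, joint_pmf)
e_y = expectation(lambda x, y, z: y, joint_pmf)
e_z = expectation(lambda x, y, z: z, joint_pmf)
rhs = 2 * e_x + 3 * e_y - e_z
print(lhs, rhs)
```
Both printed values should be equal up to floating-point rounding.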

00:04:56.380 --> 00:05:00.160
What we will do next is to use
the linearity property of

00:05:00.160 --> 00:05:04.920
expectations to solve a problem
that would otherwise

00:05:04.920 --> 00:05:07.540
be quite difficult.

00:05:07.540 --> 00:05:12.040
We will use the linearity
property to find the mean of a

00:05:12.040 --> 00:05:14.350
binomial random variable.

00:05:14.350 --> 00:05:17.170
Let X be a binomial random
variable with

00:05:17.170 --> 00:05:18.830
parameters n and p.

00:05:18.830 --> 00:05:22.780
And we can interpret X as the
number of successes in n

00:05:22.780 --> 00:05:25.880
independent trials where each
one of the trials has a

00:05:25.880 --> 00:05:29.320
probability p of resulting
in a success.

00:05:29.320 --> 00:05:32.470
Well, we know the PMF
of a binomial.

00:05:32.470 --> 00:05:37.420
And we can use the definition of
expectation to obtain this

00:05:37.420 --> 00:05:38.690
expression.

00:05:38.690 --> 00:05:41.420
This is just the PMF
of the binomial.

00:05:46.040 --> 00:05:49.240
And therefore, what we have here
is the usual definition

00:05:49.240 --> 00:05:50.800
of the expected value.
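
NOTE
The expression referred to here is shown on the slide and not in the captions.
Assuming the standard binomial PMF with parameters n and p, it would read:
```latex
E[X] = \sum_{k=0}^{n} k \binom{n}{k} p^{k} (1-p)^{n-k}
```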

00:05:50.800 --> 00:05:54.750
Now, if you look at this sum,
it appears quite formidable.

00:05:54.750 --> 00:05:58.390
And it would be quite
hard to evaluate it.

00:05:58.390 --> 00:06:02.190
Instead, we're going to use
a very useful trick.

00:06:02.190 --> 00:06:07.950
We will employ what we have
called indicator variables.

00:06:07.950 --> 00:06:11.880
So let's define a random
variable Xi, which is a one if

00:06:11.880 --> 00:06:16.050
the ith trial is a success,
and zero otherwise.

00:06:16.050 --> 00:06:20.050
Now if we want to count
successes, what we want to

00:06:20.050 --> 00:06:24.850
count is how many of the
Xi's are equal to 1.

00:06:24.850 --> 00:06:30.480
So if we add the Xi's, this will
have a contribution of 1

00:06:30.480 --> 00:06:32.320
from each one of
the successes.

00:06:32.320 --> 00:06:34.490
So when you add them
up, you obtain the

00:06:34.490 --> 00:06:36.710
total number of successes.

00:06:36.710 --> 00:06:40.659
So we have expressed a random
variable as a sum of much

00:06:40.659 --> 00:06:43.270
simpler random variables.

00:06:43.270 --> 00:06:47.620
So at this point, we can now use
linearity of expectations

00:06:47.620 --> 00:06:51.400
to write that the expected
value of X will be the

00:06:51.400 --> 00:06:57.330
expected value of X1 plus
all the way to the

00:06:57.330 --> 00:07:00.920
expected value of Xn.

00:07:00.920 --> 00:07:05.180
Now what is the expected
value of X1?

00:07:05.180 --> 00:07:08.920
It is a Bernoulli random
variable that takes the value

00:07:08.920 --> 00:07:13.060
1 with probability p and takes
the value of 0 with

00:07:13.060 --> 00:07:14.940
probability 1 minus p.

00:07:14.940 --> 00:07:19.370
The expected value of this
random variable is p.

00:07:19.370 --> 00:07:23.830
And similarly, for each one of
these terms in the summation.

00:07:23.830 --> 00:07:29.310
And so the final end result
is equal to n times p.

00:07:29.310 --> 00:07:32.770
This answer, of course, also
makes intuitive sense.

00:07:32.770 --> 00:07:40.190
If we have p equal to 1/2,
and we toss a coin 100 times,

00:07:40.190 --> 00:07:45.230
the expected number, or the
average number, of heads we

00:07:45.230 --> 00:07:50.650
expect to see will be 1/2
times 100, which is 50.
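
NOTE
A short sketch comparing the two routes to the binomial mean for the coin example
above (n = 100, p = 1/2). The function names are hypothetical. The first function
evaluates the definition of the expectation term by term, and the second adds up
the means of the n indicator variables.
```python
from math import comb
def binomial_mean_from_pmf(n, p):
    """Definition of expectation: weight each count k by the binomial PMF at k."""
    return sum(k * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1))
def binomial_mean_from_indicators(n, p):
    """Linearity: add E[X_i] = 1 * p + 0 * (1 - p) over the n trials."""
    return sum(1 * p + 0 * (1 - p) for _ in range(n))
n, p = 100, 0.5  # 100 tosses of a fair coin
direct = binomial_mean_from_pmf(n, p)
via_linearity = binomial_mean_from_indicators(n, p)
print(direct, via_linearity)
```
Both values should come out at 50, up to floating-point rounding in the direct sum.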

00:07:50.650 --> 00:07:55.270
The higher p is, the more
successes we expect to see.

00:07:55.270 --> 00:07:58.480
And of course, if we double
n, we expect to see

00:07:58.480 --> 00:08:01.120
twice as many successes.

00:08:01.120 --> 00:08:05.760
So this is an illustration of
the power of breaking up

00:08:05.760 --> 00:08:10.150
problems into simpler pieces
that are easier to analyze.

00:08:10.150 --> 00:08:14.440
And the linearity of
expectations is one more tool

00:08:14.440 --> 00:08:18.710
that we have in our hands
for breaking up perhaps

00:08:18.710 --> 00:08:22.220
complicated random variables
into simpler ones and then

00:08:22.220 --> 00:08:23.470
analyzing them separately.