WEBVTT
00:00:01.370 --> 00:00:04.160
In this segment, we
point out and discuss
00:00:04.160 --> 00:00:07.330
some important but also
intuitive properties
00:00:07.330 --> 00:00:09.580
of conditional expectations.
00:00:09.580 --> 00:00:13.900
The first property is the
one that is written up here.
00:00:13.900 --> 00:00:16.300
What is the intuitive meaning?
00:00:16.300 --> 00:00:21.140
If you condition on Y, then
the value of Y is known,
00:00:21.140 --> 00:00:23.710
and so g of Y is also known.
00:00:23.710 --> 00:00:25.340
There's no randomness in it.
00:00:25.340 --> 00:00:28.400
It can be treated as a
constant, and therefore it
00:00:28.400 --> 00:00:31.860
can be pulled outside
the expectation.
00:00:31.860 --> 00:00:33.560
So that's the intuition.
00:00:33.560 --> 00:00:37.160
How does one establish
such a result formally?
00:00:37.160 --> 00:00:38.850
Let us take the discrete case.
00:00:38.850 --> 00:00:43.250
So let us assume that X
and Y are both discrete.
00:00:45.850 --> 00:00:49.270
What does it take to
establish a fact of this kind?
00:00:49.270 --> 00:00:52.410
We want to show that two
random variables are equal,
00:00:52.410 --> 00:00:54.830
and that amounts
to the following.
00:00:54.830 --> 00:00:58.790
We consider an outcome
of the experiment,
00:00:58.790 --> 00:01:00.880
and we want to
show that whatever
00:01:00.880 --> 00:01:03.100
the outcome of
the experiment is,
00:01:03.100 --> 00:01:06.480
these two random variables
will be the same.
00:01:06.480 --> 00:01:16.370
So let us consider an outcome
for which the random variable,
00:01:16.370 --> 00:01:22.480
Y, takes a specific
value, little y.
00:01:22.480 --> 00:01:25.950
And of course, this has to
be a specific little y that
00:01:25.950 --> 00:01:27.630
is possible.
00:01:27.630 --> 00:01:33.759
Otherwise, conditioning on that
event would not be meaningful.
00:01:33.759 --> 00:01:39.500
So if an outcome has this
value for the random variable,
00:01:39.500 --> 00:01:43.740
Y, then what does the
random variable do?
00:01:43.740 --> 00:01:48.150
This is, by definition,
the random variable
00:01:48.150 --> 00:01:51.080
that takes the
value-- expected value
00:01:51.080 --> 00:01:57.770
of g of Y times X, conditional
expectation,
00:01:57.770 --> 00:02:01.610
given that capital Y
took on this value.
00:02:01.610 --> 00:02:04.220
This was our definition
of the concept
00:02:04.220 --> 00:02:06.610
of the abstract
conditional expectation
00:02:06.610 --> 00:02:08.030
as a random variable.
00:02:08.030 --> 00:02:10.990
This is the random
variable that takes
00:02:10.990 --> 00:02:13.430
this specific numerical
value whenever
00:02:13.430 --> 00:02:17.320
the random variable, capital
Y, takes the value, little y.
00:02:17.320 --> 00:02:22.860
And similarly, if the
random variable, capital Y,
00:02:22.860 --> 00:02:27.120
takes the value, little y,
this random variable here
00:02:27.120 --> 00:02:33.800
is the expected value of X,
given that Y is little y.
00:02:33.800 --> 00:02:36.190
And when capital
Y takes the value,
00:02:36.190 --> 00:02:39.110
little y, this
function, g of Y, takes
00:02:39.110 --> 00:02:42.990
on this particular
numerical value.
00:02:42.990 --> 00:02:45.720
So we want to show that
these two expressions
00:02:45.720 --> 00:02:49.500
will be equal no
matter what capital Y is.
00:02:49.500 --> 00:02:52.370
Now, when we place ourselves in
a conditional universe, where
00:02:52.370 --> 00:02:56.280
capital Y takes a
value, little y,
00:02:56.280 --> 00:02:59.660
then the joint PMF
of X and Y gets
00:02:59.660 --> 00:03:03.500
concentrated on those
values of capital Y
00:03:03.500 --> 00:03:06.080
that obey this relation.
00:03:06.080 --> 00:03:10.190
So conditioned on
this event, capital Y
00:03:10.190 --> 00:03:13.490
is, with certainty,
equal to little y.
00:03:13.490 --> 00:03:15.970
Therefore, this
random variable here,
00:03:15.970 --> 00:03:20.530
in the conditional universe,
is the same as this number.
00:03:27.210 --> 00:03:29.750
But now, since this
is a number, it
00:03:29.750 --> 00:03:31.825
can be pulled outside
the expectation.
00:03:38.740 --> 00:03:43.880
So we have concluded that
for any outcome for which
00:03:43.880 --> 00:03:47.600
the random variable, capital
Y, takes this specific value,
00:03:47.600 --> 00:03:52.260
little y, this random
variable takes this value.
00:03:52.260 --> 00:03:55.110
This random variable
takes this value.
00:03:55.110 --> 00:03:56.970
They are the same.
00:03:56.970 --> 00:04:00.050
So no matter what
the outcome is,
00:04:00.050 --> 00:04:02.740
these two random variables
take the same value,
00:04:02.740 --> 00:04:08.130
and therefore they are
the same random variables.
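NOTE
For a fixed possible value y, the argument above can be summarized as the chain below. This is a sketch for the discrete case only; the notation p_{X|Y} for the conditional PMF is an assumption, since the slide itself is not part of the transcript.
```latex
\begin{align*}
E[\,g(Y)X \mid Y = y\,]
  &= \sum_x g(y)\, x\, p_{X \mid Y}(x \mid y) \\
  &= g(y) \sum_x x\, p_{X \mid Y}(x \mid y) \\
  &= g(y)\, E[\,X \mid Y = y\,].
\end{align*}
```
Because this holds for every possible y, the two random variables E[g(Y)X | Y] and g(Y) E[X | Y] agree on every outcome.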
00:04:08.130 --> 00:04:13.290
Now, this is a correct proof
if the random variables
00:04:13.290 --> 00:04:14.625
are discrete.
00:04:14.625 --> 00:04:18.470
If the random variables
are continuous or general,
00:04:18.470 --> 00:04:23.160
then carrying out a rigorous
proof is actually quite subtle,
00:04:23.160 --> 00:04:26.430
and it is beyond our scope.
00:04:26.430 --> 00:04:29.960
However, the intuition
is still correct,
00:04:29.960 --> 00:04:31.990
and the result is correct.
00:04:31.990 --> 00:04:36.780
And we will be using it
freely whenever we need to.
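NOTE
A small numerical check of the first property in the discrete case may help. The joint PMF and the function g below are hypothetical, chosen only for illustration; they do not come from the lecture. Exact fractions are used so the two sides can be compared without rounding.
```python
from fractions import Fraction as F
# Hypothetical joint PMF p(x, y), for illustration only (not from the lecture).
joint_pmf = {
    (1, 0): F(1, 8), (2, 0): F(3, 8),
    (1, 2): F(1, 4), (3, 2): F(1, 4),
}
def g(y):
    # Any function of y would do; this choice is arbitrary.
    return y * y + 1
def cond_expectation(f, y):
    """E[f(X, Y) | Y = y], computed directly from the joint PMF."""
    p_y = sum(p for (_, yy), p in joint_pmf.items() if yy == y)
    if p_y == 0:
        raise ValueError("cannot condition on a value of Y that has zero probability")
    return sum(f(x, yy) * p for (x, yy), p in joint_pmf.items() if yy == y) / p_y
def check_pull_out():
    """Map each possible y to (E[g(Y)X | Y=y], g(y) * E[X | Y=y])."""
    results = {}
    for y in sorted({yy for (_, yy) in joint_pmf}):
        lhs = cond_expectation(lambda x, yy: g(yy) * x, y)
        rhs = g(y) * cond_expectation(lambda x, yy: x, y)
        results[y] = (lhs, rhs)
    return results
```
Calling check_pull_out() returns one pair per possible value of Y, and under these assumptions the two entries of each pair coincide.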
00:04:36.780 --> 00:04:39.980
Let us now move to a
second observation.
00:04:39.980 --> 00:04:43.260
Suppose that h is an
invertible function.
00:04:43.260 --> 00:04:44.260
What does that mean?
00:04:44.260 --> 00:04:46.770
That if I give you
the value of h,
00:04:46.770 --> 00:04:49.720
you can tell me the
value of the argument.
00:04:49.720 --> 00:04:54.980
So in some sense,
Y and h of Y can
00:04:54.980 --> 00:04:56.490
be recovered from each other.
00:04:56.490 --> 00:04:59.900
If I give you Y, you
can calculate h of Y.
00:04:59.900 --> 00:05:04.870
But also, if I give you h of Y,
you can figure out what Y was.
00:05:04.870 --> 00:05:09.890
An example could be
the function, h of Y,
00:05:09.890 --> 00:05:14.980
equals Y to the third power.
00:05:14.980 --> 00:05:18.220
If I tell you the value
of Y, you know Y cubed.
00:05:18.220 --> 00:05:21.710
But if I tell you Y cubed,
you can also figure out Y.
00:05:21.710 --> 00:05:27.180
So Y and Y cubed carry
exactly the same information.
00:05:27.180 --> 00:05:30.740
In that case, the
conditional expectation--
00:05:30.740 --> 00:05:33.930
what you expect, on the
average, X to be-- if I tell you
00:05:33.930 --> 00:05:39.159
the value of Y, should be the
same as what you would expect
00:05:39.159 --> 00:05:43.670
X to be if I give you the
value of, let's say, Y cubed.
00:05:43.670 --> 00:05:47.060
In both cases, I'm giving you
the same amount of information,
00:05:47.060 --> 00:05:50.520
so the conditional distribution
of X should be the same.
00:05:50.520 --> 00:05:54.060
And the conditional expectations
should also be the same.
00:05:54.060 --> 00:05:57.730
So this is, again, a
very intuitive fact.
00:05:57.730 --> 00:06:00.470
How do we verify that
this fact is true?
00:06:00.470 --> 00:06:03.610
Using the same method as before.
00:06:03.610 --> 00:06:09.950
So fix some particular
outcome for which
00:06:09.950 --> 00:06:14.570
the random variable, capital Y,
takes a specific value, little
00:06:14.570 --> 00:06:15.070
y.
00:06:18.610 --> 00:06:22.080
When that happens,
this random variable
00:06:22.080 --> 00:06:25.610
will take this value here.
00:06:25.610 --> 00:06:28.740
That's just by the definition
of conditional expectation.
00:06:28.740 --> 00:06:31.720
This is the random variable
that takes this value whenever
00:06:31.720 --> 00:06:34.070
capital Y happens to
be equal to little y.
00:06:36.850 --> 00:06:41.650
In that case, we also
have that h of capital
00:06:41.650 --> 00:06:46.025
Y takes on a specific
value, h of little y.
00:06:48.659 --> 00:06:54.440
When this random variable
takes this specific value,
00:06:54.440 --> 00:07:03.810
this random variable here will
take a value of this kind.
00:07:07.830 --> 00:07:11.350
So this is the
random variable that
00:07:11.350 --> 00:07:15.800
takes this value
whenever h of capital Y
00:07:15.800 --> 00:07:19.460
happens to be this
specific number.
00:07:19.460 --> 00:07:25.510
But now, the event that h of
Y takes this specific value,
00:07:25.510 --> 00:07:28.350
because the function,
h, is invertible,
00:07:28.350 --> 00:07:35.750
is identical to the event that
Y takes that particular value.
00:07:35.750 --> 00:07:39.620
And so, since this event
is identical to that event,
00:07:39.620 --> 00:07:42.207
the conditional probabilities,
given this event,
00:07:42.207 --> 00:07:44.540
would be the same as the
conditional probabilities given
00:07:44.540 --> 00:07:45.560
that event.
00:07:45.560 --> 00:07:48.720
And therefore, the
conditional expectations
00:07:48.720 --> 00:07:50.130
would also be the same.
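NOTE
The second property can also be checked numerically in the discrete case. The joint PMF below is hypothetical and not from the lecture. The cube is the invertible example mentioned above; the square is added here only as a contrast, to suggest why invertibility matters.
```python
from fractions import Fraction as F
# Hypothetical joint PMF p(x, y), for illustration only (not from the lecture).
joint_pmf = {
    (0, -1): F(1, 6), (4, -1): F(1, 6),
    (1, 1): F(1, 3),
    (2, 2): F(1, 6), (6, 2): F(1, 6),
}
def expect_x_given(event):
    """E[X | event], where event is a predicate on the value of Y."""
    p_event = sum(p for (_, y), p in joint_pmf.items() if event(y))
    return sum(x * p for (x, y), p in joint_pmf.items() if event(y)) / p_event
def compare(h):
    """Map each possible y to (E[X | Y = y], E[X | h(Y) = h(y)])."""
    ys = sorted({y for (_, y) in joint_pmf})
    return {
        y: (expect_x_given(lambda v: v == y),
            expect_x_given(lambda v: h(v) == h(y)))
        for y in ys
    }
cube = compare(lambda y: y ** 3)    # invertible: the two events are identical
square = compare(lambda y: y ** 2)  # not invertible: y = -1 and y = 1 collide
```
With the cube, the event that h(Y) equals h(y) is the same as the event that Y equals y, so both entries of every pair agree. With the square, the values -1 and 1 are merged into one event, and the two conditional expectations can differ.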
00:07:52.800 --> 00:07:54.780
Once more, this
is a proof that's
00:07:54.780 --> 00:07:57.110
entirely rigorous
if we are dealing
00:07:57.110 --> 00:07:59.630
with discrete random
variables, although
00:07:59.630 --> 00:08:01.640
in the continuous
case, there could
00:08:01.640 --> 00:08:03.870
be some subtleties involved.
00:08:03.870 --> 00:08:07.040
However, the result
is true in general.
00:08:07.040 --> 00:08:10.910
The technical details
are beyond our scope.