WEBVTT

00:00:00.690 --> 00:00:03.650
We will now take a step towards
abstraction, and

00:00:03.650 --> 00:00:05.300
discuss the issue of

00:00:05.300 --> 00:00:07.720
convergence of random variables.

00:00:07.720 --> 00:00:10.430
Let us look at the weak
law of large numbers.

00:00:10.430 --> 00:00:14.820
It tells us that with high
probability, the sample mean

00:00:14.820 --> 00:00:19.230
falls close to the true mean
as n goes to infinity.

00:00:19.230 --> 00:00:22.250
We would like to interpret this
statement by saying that

00:00:22.250 --> 00:00:25.190
the sample mean converges
to the true mean.

00:00:25.190 --> 00:00:28.840
However, before we can make such
a statement, we should

00:00:28.840 --> 00:00:34.050
first define carefully the word
"converges." And we need

00:00:34.050 --> 00:00:37.360
a notion of convergence that
refers to convergence of

00:00:37.360 --> 00:00:39.960
random variables.

00:00:39.960 --> 00:00:41.680
Here's a definition.

00:00:41.680 --> 00:00:44.980
Suppose that we have a sequence
of random variables Yn

00:00:44.980 --> 00:00:47.560
that are not necessarily
independent.

00:00:47.560 --> 00:00:51.870
We say that this sequence of
random variables converges in

00:00:51.870 --> 00:00:53.400
probability--

00:00:53.400 --> 00:00:56.580
that's a particular notion of
convergence we're defining.

00:00:56.580 --> 00:00:59.220
It converges to a certain
number a if the

00:00:59.220 --> 00:01:02.100
following is true--

00:01:02.100 --> 00:01:06.130
no matter what epsilon is, as
long as it is a positive

00:01:06.130 --> 00:01:10.440
number, the probability that the
random variable falls far

00:01:10.440 --> 00:01:11.600
from this number--

00:01:11.600 --> 00:01:16.100
that is, epsilon or further
away from that number--

00:01:16.100 --> 00:01:21.310
that probability converges
to 0 as n increases.

00:01:21.310 --> 00:01:25.470
That is, as n increases, there
is higher and higher

00:01:25.470 --> 00:01:29.060
probability that Yn will
be close to this

00:01:29.060 --> 00:01:31.150
particular number a.
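
NOTE
The spoken definition can be written compactly. This is a sketch, assuming the sequence is denoted Y_n and the limit a as in the cues above, and assuming amsmath for \text.
```latex
Y_n \to a \ \text{in probability}
\quad\Longleftrightarrow\quad
\text{for every } \epsilon > 0:\quad
\lim_{n\to\infty} \mathbf{P}\big(|Y_n - a| \ge \epsilon\big) = 0 .
```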

00:01:31.150 --> 00:01:32.910
This is the notion
of convergence

00:01:32.910 --> 00:01:34.350
that we have defined.

00:01:34.350 --> 00:01:37.229
And notice that this notion
of convergence corresponds

00:01:37.229 --> 00:01:39.700
exactly to what is happening
in the weak

00:01:39.700 --> 00:01:41.370
law of large numbers.

00:01:41.370 --> 00:01:45.190
And so in particular, we can
describe the weak law of large

00:01:45.190 --> 00:01:51.710
numbers as saying that Mn, the
sample mean, converges to mu

00:01:51.710 --> 00:01:55.820
as n goes to infinity, but
in a particular sense--

00:01:55.820 --> 00:02:00.370
in the sense of convergence
in probability.
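
NOTE
A small simulation can make this concrete. It is only a sketch under assumptions not in the lecture: fair coin flips (so mu = 1/2), epsilon = 1/10, 2000 trials, and a hypothetical helper name tail_probability. It estimates the probability that the sample mean M_n is epsilon or further from mu.
```python
import random
from fractions import Fraction
def tail_probability(n, eps=Fraction(1, 10), trials=2000, seed=0):
    """Estimate P(|M_n - mu| >= eps) for n fair coin flips (mu = 1/2)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    mu = Fraction(1, 2)
    far = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(n))
        m_n = Fraction(heads, n)  # exact sample mean, no rounding at the boundary
        if abs(m_n - mu) >= eps:
            far += 1
    return far / trials
# Estimated tail probabilities for increasing sample sizes.
estimates = {n: tail_probability(n) for n in (10, 100, 1000)}
for n, p in estimates.items():
    print(n, p)
```
The estimates should shrink toward 0 as n grows, which is what convergence in probability of M_n to mu asserts for this fixed epsilon.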

00:02:00.370 --> 00:02:04.030
Let us now try to understand a
little better what convergence

00:02:04.030 --> 00:02:07.080
in probability really
amounts to.

00:02:07.080 --> 00:02:10.690
And we will do that by making a
comparison with the ordinary

00:02:10.690 --> 00:02:13.900
notion of convergence
of real numbers.

00:02:13.900 --> 00:02:16.980
When we're dealing with
convergence of numbers, we

00:02:16.980 --> 00:02:21.270
start with a sequence of
numbers, and we are interested

00:02:21.270 --> 00:02:24.570
in the statement that this
sequence converges to a

00:02:24.570 --> 00:02:25.850
certain limit.

00:02:25.850 --> 00:02:27.380
What does that mean?

00:02:27.380 --> 00:02:32.410
What we mean is that the
elements of the sequence

00:02:32.410 --> 00:02:33.579
eventually--

00:02:33.579 --> 00:02:36.130
that is, when n is
large enough--

00:02:36.130 --> 00:02:40.230
will get and stay arbitrarily
close to this particular

00:02:40.230 --> 00:02:43.680
number a, which is the limit.

00:02:43.680 --> 00:02:50.020
In terms of a picture,
here is a, the limit.

00:02:50.020 --> 00:02:53.120
Here is n.

00:02:53.120 --> 00:02:58.930
We take a small band around
this number a.

00:02:58.930 --> 00:03:03.810
And what we require is that the
elements of the sequence

00:03:03.810 --> 00:03:09.790
eventually get within this
band around the number a.

00:03:09.790 --> 00:03:13.820
They might get outside the
band, get inside again.

00:03:13.820 --> 00:03:15.250
But eventually--

00:03:15.250 --> 00:03:17.070
that is, after some time--

00:03:17.070 --> 00:03:19.310
the elements of the
sequence will only

00:03:19.310 --> 00:03:21.750
stay inside this band.

00:03:21.750 --> 00:03:24.590
Now to translate this into a
more formal mathematical

00:03:24.590 --> 00:03:28.350
statement, which is the
mathematical definition of the

00:03:28.350 --> 00:03:31.430
notion of convergence, we
have the following--

00:03:31.430 --> 00:03:35.680
if I give you some epsilon,
epsilon could be

00:03:35.680 --> 00:03:37.860
a very small number.

00:03:37.860 --> 00:03:44.230
I form a band around a that goes
from a minus epsilon to a

00:03:44.230 --> 00:03:45.740
plus epsilon.

00:03:45.740 --> 00:03:50.770
What I want is that there exists
a certain time, n0--

00:03:50.770 --> 00:03:53.900
in this picture, n0
would be here--

00:03:53.900 --> 00:04:02.320
such that for all times after
n0, our sequence stays within

00:04:02.320 --> 00:04:03.840
epsilon of a.

00:04:03.840 --> 00:04:08.080
That is, our sequence stays
inside this band.
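
NOTE
The band picture corresponds to the usual formal definition. This is a sketch in which the sequence name a_n is an assumption (the speaker does not name it) and amsmath is assumed for \text.
```latex
\lim_{n\to\infty} a_n = a
\quad\Longleftrightarrow\quad
\text{for every } \epsilon > 0 \text{ there exists } n_0 \text{ such that }
|a_n - a| \le \epsilon \ \text{ for all } n \ge n_0 .
```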

00:04:08.080 --> 00:04:12.280
Now let us move to the case of
random variables, and see what

00:04:12.280 --> 00:04:16.480
convergence in probability
is talking about.

00:04:16.480 --> 00:04:20.480
Here, instead of a sequence of
numbers, we have a sequence of

00:04:20.480 --> 00:04:22.720
random variables.

00:04:22.720 --> 00:04:26.040
And we're interested in the
meaning of the convergence of

00:04:26.040 --> 00:04:28.280
the sequence of random
variables to

00:04:28.280 --> 00:04:30.070
a particular number.

00:04:30.070 --> 00:04:35.110
In words, what this means is
that if I fix a certain

00:04:35.110 --> 00:04:39.730
epsilon, as in this picture,
then the probability that the

00:04:39.730 --> 00:04:44.240
random variable falls outside
this band converges to 0.

00:04:44.240 --> 00:04:46.605
So the picture would
be as follows.

00:04:51.010 --> 00:04:53.640
We have, again, our limit.

00:04:53.640 --> 00:04:56.750
We fix some epsilon,
which could be an

00:04:56.750 --> 00:04:58.980
arbitrarily small number.

00:04:58.980 --> 00:05:03.800
For any fixed choice of epsilon,
we take this band,

00:05:03.800 --> 00:05:08.400
and for any given n, we look
into the probability that our

00:05:08.400 --> 00:05:11.430
random variable falls
inside that band.

00:05:11.430 --> 00:05:15.570
So if I am to draw the
distribution of our random

00:05:15.570 --> 00:05:20.570
variable, a distribution might
be something like this--

00:05:20.570 --> 00:05:23.250
so there is a certain
probability that it falls

00:05:23.250 --> 00:05:25.350
outside this band.

00:05:25.350 --> 00:05:31.410
But when n becomes large, this
probability of falling outside

00:05:31.410 --> 00:05:35.880
this band becomes very small.

00:05:35.880 --> 00:05:40.800
So the probability of falling
outside the band becomes tiny.

00:05:40.800 --> 00:05:43.300
So the bulk of the
distribution--

00:05:43.300 --> 00:05:45.132
that is, most of the
probability--

00:05:45.132 --> 00:05:48.140
is concentrated inside
this band.

00:05:48.140 --> 00:05:52.300
And this is true, no matter
how small epsilon is.

00:05:52.300 --> 00:05:59.430
If I take a much narrower band
around a, I still want almost all of

00:05:59.430 --> 00:06:01.700
the probability to
be eventually

00:06:01.700 --> 00:06:04.070
concentrated inside that band.

00:06:04.070 --> 00:06:05.850
Of course, it might
take longer.

00:06:05.850 --> 00:06:10.580
It might take a larger value of
n, but I want that when n

00:06:10.580 --> 00:06:15.160
is very large, the bulk of the
distribution is, again,

00:06:15.160 --> 00:06:19.660
concentrated inside
this narrow band.

00:06:19.660 --> 00:06:24.480
So in words, convergence in
probability means that almost

00:06:24.480 --> 00:06:30.590
all of the probability mass of
the random variable Yn, when n

00:06:30.590 --> 00:06:36.020
is large, gets concentrated
within a

00:06:36.020 --> 00:06:41.890
narrow band around the limit
of the sequence.

00:06:41.890 --> 00:06:45.090
We finally point out a few
useful properties of

00:06:45.090 --> 00:06:48.700
convergence in probability
that parallel well-known

00:06:48.700 --> 00:06:51.470
properties of convergence
of sequences.

00:06:51.470 --> 00:06:53.870
Suppose that we have a sequence
of random variables Xn

00:06:53.870 --> 00:06:57.850
that converges in probability
to a certain number a, and

00:06:57.850 --> 00:07:00.400
another sequence Yn that converges
in probability to

00:07:00.400 --> 00:07:02.400
some other number b.

00:07:02.400 --> 00:07:05.690
We do not make any assumptions
about independence.

00:07:05.690 --> 00:07:09.640
We do not assume the Xn's are
independent of each other.

00:07:09.640 --> 00:07:12.710
We do not assume that the
sequence of Xn's is

00:07:12.710 --> 00:07:15.640
independent of the Yn's.

00:07:15.640 --> 00:07:19.190
We then have the following
properties--

00:07:19.190 --> 00:07:23.260
if g is a continuous function,
then the function of the

00:07:23.260 --> 00:07:26.770
random variables, g of Xn, converges
in probability to g of a.

00:07:26.770 --> 00:07:30.990
So for example, the sequence of
random variables Xn squared

00:07:30.990 --> 00:07:34.790
is going to converge
to a squared.

00:07:34.790 --> 00:07:40.300
Another fact is that the sum of
these two random variables

00:07:40.300 --> 00:07:44.440
converges to the sum
of their limits.
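
NOTE
The two properties just stated, written out. This is a sketch using the names X_n, Y_n, a, b and g from the spoken text, with no independence assumed, and assuming amsmath for \text.
```latex
\text{If } X_n \to a \text{ and } Y_n \to b \text{ in probability, and } g \text{ is continuous, then}
\qquad g(X_n) \to g(a) \quad\text{and}\quad X_n + Y_n \to a + b \quad \text{in probability.}
```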

00:07:44.440 --> 00:07:48.080
Both of these properties are
analogous to what happens with

00:07:48.080 --> 00:07:50.760
ordinary convergence
of numbers.

00:07:50.760 --> 00:07:53.400
And they tell us that, in some
sense, convergence in

00:07:53.400 --> 00:07:56.640
probability is not a very
different notion.

00:07:56.640 --> 00:08:00.260
We will not prove those
properties at this point, but

00:08:00.260 --> 00:08:01.950
they're useful to know.

00:08:01.950 --> 00:08:05.100
However, there's an
important caveat.

00:08:05.100 --> 00:08:09.590
Xn might converge to a certain
number in probability.

00:08:09.590 --> 00:08:14.350
However, the expected value
of Xn does not necessarily

00:08:14.350 --> 00:08:16.630
converge to that same limit.

00:08:16.630 --> 00:08:19.370
So convergence in probability
does not imply

00:08:19.370 --> 00:08:21.370
convergence of expectations.

00:08:21.370 --> 00:08:25.310
And we will be seeing an example
where this convergence

00:08:25.310 --> 00:08:26.560
does not take place.
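
NOTE
One standard illustration of this caveat follows. It is not necessarily the example the lecture goes on to present, and the distribution and helper names are assumptions. Let X_n equal n squared with probability 1/n and 0 otherwise. Then X_n converges to 0 in probability, while its expected value grows without bound.
```python
from fractions import Fraction
def far_probability(n, eps):
    """Exact P(|X_n - 0| >= eps) when X_n = n**2 w.p. 1/n and 0 otherwise."""
    # X_n is far from 0 only on the event where it takes the value n**2.
    return Fraction(1, n) if n**2 >= eps else Fraction(0)
def expectation(n):
    """Exact E[X_n] = n**2 * (1/n) + 0 * (1 - 1/n)."""
    return n**2 * Fraction(1, n)
# Tail probability and expectation side by side for growing n.
for n in (10, 100, 1000):
    print(n, far_probability(n, Fraction(1, 2)), expectation(n))
```
For any fixed epsilon the tail probability is 1/n once n is large enough, so it goes to 0, whereas the expectation equals n and does not converge to 0.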