WEBVTT

00:00:00.680 --> 00:00:03.920
We have seen so far an example
of a probability law on a

00:00:03.920 --> 00:00:07.590
discrete and finite sample space
as well as an example

00:00:07.590 --> 00:00:10.550
with an infinite and continuous
sample space.

00:00:10.550 --> 00:00:14.830
Let us now look at an example
involving a discrete but

00:00:14.830 --> 00:00:17.350
infinite sample space.

00:00:17.350 --> 00:00:20.810
We carry out an experiment whose
outcome is an arbitrary

00:00:20.810 --> 00:00:22.890
positive integer.

00:00:22.890 --> 00:00:25.300
As an example of such an
experiment, suppose that we

00:00:25.300 --> 00:00:28.610
keep tossing a coin and the
outcome is the number of

00:00:28.610 --> 00:00:32.850
tosses until we observe heads
for the first time.
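
NOTE
Editorial sketch: a minimal simulation of the experiment just described.
It assumes a fair coin, which the lecture has not stated at this point.
The names tosses_until_first_heads, rng, and samples are illustrative only.
```python
import random
def tosses_until_first_heads(rng):
    # Assumption: fair coin, heads with probability 1/2 on each toss.
    n = 1
    while rng.random() >= 0.5:  # this toss came up tails, so toss again
        n += 1
    return n
rng = random.Random(0)  # fixed seed so repeated runs give the same samples
samples = [tosses_until_first_heads(rng) for _ in range(10_000)]
# Every outcome is a positive integer; there is no upper bound in principle.
print(min(samples), max(samples))
```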

00:00:32.850 --> 00:00:36.140
The first heads might appear
in the first toss or the

00:00:36.140 --> 00:00:39.150
second or the third,
and so on.

00:00:39.150 --> 00:00:42.890
So in this example, any positive
integer is possible.

00:00:42.890 --> 00:00:46.320
And so our sample space
is infinite.

00:00:46.320 --> 00:00:49.480
Let us now specify a
probability law.

00:00:49.480 --> 00:00:52.730
A probability law should
determine the probability of

00:00:52.730 --> 00:00:56.950
every event, of every subset
of the sample space.

00:00:56.950 --> 00:00:59.080
That is, the probability
of every

00:00:59.080 --> 00:01:01.970
set of positive integers.

00:01:01.970 --> 00:01:06.140
But instead I will just tell you
the probability of events

00:01:06.140 --> 00:01:08.860
that contain a single element.

00:01:08.860 --> 00:01:13.050
I'm going to tell you that there
is probability 1 over 2

00:01:13.050 --> 00:01:18.010
to the n that the outcome
is equal to n.

00:01:18.010 --> 00:01:19.860
Is this good enough?

00:01:19.860 --> 00:01:23.800
Is this information enough to
determine the probability of

00:01:23.800 --> 00:01:26.420
any subset?

00:01:26.420 --> 00:01:28.950
Before we look into that
question, let us first do a

00:01:28.950 --> 00:01:32.425
quick sanity check to see
whether these numbers that we

00:01:32.425 --> 00:01:35.420
are given look like legitimate
probabilities.

00:01:35.420 --> 00:01:37.410
Do they add to 1?

00:01:37.410 --> 00:01:39.410
Let's do a quick check.

00:01:39.410 --> 00:01:45.840
So the sum over all the possible
values of n of the

00:01:45.840 --> 00:01:49.610
probabilities that we're given,
which is an infinite

00:01:49.610 --> 00:01:55.520
sum starting from 1, all the way
up to infinity, of 1 over

00:01:55.520 --> 00:01:58.700
2 to the n, is equal
to the following.

00:01:58.700 --> 00:02:04.250
First we take out a factor of
1/2 from all of these terms,

00:02:04.250 --> 00:02:08.080
which reduces the exponent
from n to n minus 1.

00:02:08.080 --> 00:02:13.700
This is the same as running
the sum from n equals 0 to

00:02:13.700 --> 00:02:19.310
infinity of 1/2 to the n.

00:02:19.310 --> 00:02:24.980
And now we have a usual infinite
geometric series and

00:02:24.980 --> 00:02:27.730
we have a formula for this.

00:02:27.730 --> 00:02:33.320
The geometric series has a value
of 1 over 1 minus the

00:02:33.320 --> 00:02:36.665
number whose power we're
taking, which is 1/2.

00:02:39.280 --> 00:02:42.520
And after we do the arithmetic,
this turns out to

00:02:42.520 --> 00:02:44.240
be equal to 1.
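
NOTE
Editorial sketch: the sanity check above can be reproduced with exact
fractions. The function name partial_sum is an illustrative choice, not
something from the lecture.
```python
from fractions import Fraction
def partial_sum(N):
    # Exact sum of 1/2**n for n = 1, ..., N.
    return sum(Fraction(1, 2**n) for n in range(1, N + 1))
for N in (1, 2, 5, 10, 20):
    # Print the partial sum and how far it still is from 1.
    print(N, partial_sum(N), 1 - partial_sum(N))
```
The shortfall 1 - partial_sum(N) equals 1/2**N, which shrinks toward 0 as N
grows, consistent with the infinite series summing to 1.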

00:02:44.240 --> 00:02:50.860
So indeed, it appears that we
have the basic elements of

00:02:50.860 --> 00:02:54.360
what it would take to have a
legitimate probability law.

00:02:54.360 --> 00:02:57.870
But now let us look into how
we might calculate the

00:02:57.870 --> 00:03:00.510
probability of some
general event.

00:03:00.510 --> 00:03:05.370
For example, the probability
that the outcome is even.

00:03:05.370 --> 00:03:08.300
We proceed as follows.

00:03:08.300 --> 00:03:11.200
The probability that the outcome
is even, this is the

00:03:11.200 --> 00:03:15.840
probability of an infinite
set that consists of

00:03:15.840 --> 00:03:18.610
all the even integers.

00:03:22.280 --> 00:03:29.760
We can write this set as the
union of lots of little sets

00:03:29.760 --> 00:03:33.090
that contain a single
element each.

00:03:33.090 --> 00:03:36.530
So it's the set containing the
number 2, the set containing

00:03:36.530 --> 00:03:38.750
the number 4, the set
containing the

00:03:38.750 --> 00:03:41.120
number 6, and so on.

00:03:44.010 --> 00:03:47.170
At this point we notice that
we're talking about the

00:03:47.170 --> 00:03:51.430
probability of a union of sets
and these sets are disjoint

00:03:51.430 --> 00:03:54.760
because they contain
different elements.

00:03:54.760 --> 00:04:00.900
So we can use an additivity
property and say that this is

00:04:00.900 --> 00:04:05.280
the probability of obtaining a
2, plus the probability of

00:04:05.280 --> 00:04:08.190
obtaining a 4, plus
the probability of

00:04:08.190 --> 00:04:12.390
obtaining a 6 and so on.

00:04:12.390 --> 00:04:15.570
If you're curious about doing
this calculation and actually

00:04:15.570 --> 00:04:19.339
obtaining a numerical answer,
you would proceed as follows.

00:04:19.339 --> 00:04:26.030
You notice that this is 1 over
2 to the second power plus 1

00:04:26.030 --> 00:04:31.370
over 2 to the fourth power plus
1 over 2 to the sixth

00:04:31.370 --> 00:04:34.170
power and so on.

00:04:34.170 --> 00:04:43.260
Now you factor out a factor of
1/4 and what you're left with is 1

00:04:43.260 --> 00:04:48.400
plus 1 over 2 to the second
power, which is 1/4, plus 1

00:04:48.400 --> 00:04:56.000
over 2 to the fourth power,
which is the same as 1/4 to

00:04:56.000 --> 00:04:59.760
the second power and so on.

00:04:59.760 --> 00:05:05.440
And now we have 1/4 times the
infinite sum of a geometric

00:05:05.440 --> 00:05:12.620
series, which gives us
1 over 1 minus 1/4.

00:05:12.620 --> 00:05:16.240
And after you do the algebra you
obtain a numerical answer,

00:05:16.240 --> 00:05:17.750
which is equal to 1/3.
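
NOTE
Editorial sketch: a numerical check of the value 1/3, using the single-element
probabilities 1/2**n given in the lecture. The function name prob_even_up_to
is illustrative only.
```python
from fractions import Fraction
def prob_even_up_to(K):
    # Exact sum P(2) + P(4) + ... + P(2K), with P(n) = 1/2**n.
    return sum(Fraction(1, 2**(2 * k)) for k in range(1, K + 1))
for K in (1, 2, 5, 10):
    # The partial sums increase toward 1/3.
    print(K, prob_even_up_to(K), float(prob_even_up_to(K)))
```
This only sums finitely many terms. Passing to the infinite sum is the step
the lecture goes on to question.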

00:05:20.260 --> 00:05:23.550
But leaving the details of the
calculation aside, the more

00:05:23.550 --> 00:05:26.810
important question I want to
address is the following.

00:05:26.810 --> 00:05:29.430
Is this calculation correct?

00:05:29.430 --> 00:05:32.370
We seem to have used
an additivity

00:05:32.370 --> 00:05:34.370
property at this point.

00:05:37.720 --> 00:05:41.500
But the additivity properties
that we have in our hands at

00:05:41.500 --> 00:05:46.800
this point only talk about
disjoint unions of finitely

00:05:46.800 --> 00:05:48.290
many subsets.

00:05:48.290 --> 00:05:51.460
Our initial axiom talked about
a disjoint union of two

00:05:51.460 --> 00:05:54.990
subsets and then later on we
established a similar property

00:05:54.990 --> 00:05:58.820
for a disjoint union of
finitely many subsets.

00:05:58.820 --> 00:06:02.620
But here we're talking
about the union of

00:06:02.620 --> 00:06:05.770
infinitely many subsets.

00:06:05.770 --> 00:06:11.940
So this step here is not really
allowed by what we have

00:06:11.940 --> 00:06:13.140
in our hands.

00:06:13.140 --> 00:06:16.500
On the other hand, we would like
our theory to allow this

00:06:16.500 --> 00:06:18.540
kind of calculation.

00:06:18.540 --> 00:06:23.070
The way out of this dilemma is
to introduce an additional

00:06:23.070 --> 00:06:27.015
axiom that will indeed allow
this kind of calculation.

00:06:29.660 --> 00:06:32.836
The axiom that we introduce
is the following.

00:06:32.836 --> 00:06:39.700
If we have an infinite sequence
of disjoint events,

00:06:39.700 --> 00:06:42.430
as for example in
this picture.

00:06:42.430 --> 00:06:44.560
We have our sample space.

00:06:44.560 --> 00:06:46.909
We have a first event, A1.

00:06:46.909 --> 00:06:49.440
We have a second event, A2.

00:06:49.440 --> 00:06:51.690
The third event, A3.

00:06:51.690 --> 00:06:55.730
And so we keep continuing and
we have an infinite sequence

00:06:55.730 --> 00:06:57.400
of such events.

00:06:57.400 --> 00:07:02.770
Then the probability of the
union of these events, of

00:07:02.770 --> 00:07:07.600
these infinitely many events, is
the sum of their individual

00:07:07.600 --> 00:07:09.390
probabilities.
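
NOTE
Editorial sketch: in symbols, the axiom just stated can be written as below.
The notation is an assumption made for this note: A_1, A_2, A_3, ... is the
sequence of events, and disjoint means that A_i and A_j have no common
element whenever i and j differ.
```latex
\mathbf{P}(A_1 \cup A_2 \cup A_3 \cup \cdots)
  = \mathbf{P}(A_1) + \mathbf{P}(A_2) + \mathbf{P}(A_3) + \cdots,
\qquad A_i \cap A_j = \varnothing \ \text{for } i \neq j .
```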

00:07:09.390 --> 00:07:15.630
The key word here is
the word sequence.

00:07:15.630 --> 00:07:20.430
Namely, these events, these sets
that we're dealing with,

00:07:20.430 --> 00:07:25.120
can be arranged so that we can
talk about the first event,

00:07:25.120 --> 00:07:31.490
A1, the second event, A2, the
third one, A3, and so on.

00:07:31.490 --> 00:07:35.510
To appreciate the issue that
arises here and to see why the

00:07:35.510 --> 00:07:41.360
word sequence is so important,
let us consider the following

00:07:41.360 --> 00:07:43.110
calculation.

00:07:43.110 --> 00:07:45.680
Our sample space is
the unit square.

00:07:48.750 --> 00:07:52.290
And we consider a model where
the probability of a set is

00:07:52.290 --> 00:07:57.030
its area, as in the examples
that we considered earlier.

00:07:57.030 --> 00:08:00.550
Let us now look at the
probability of the overall

00:08:00.550 --> 00:08:02.180
sample space.

00:08:02.180 --> 00:08:07.890
Our sample space is the unit
square and the unit square can

00:08:07.890 --> 00:08:13.870
be thought of as the union of
various sets that consist of

00:08:13.870 --> 00:08:15.330
single points.

00:08:15.330 --> 00:08:22.780
So it's the union of subsets
with one element each.

00:08:22.780 --> 00:08:25.100
And it's a union taken
over all the

00:08:25.100 --> 00:08:28.770
points in the unit square.

00:08:28.770 --> 00:08:31.590
Then we think about
additivity.

00:08:31.590 --> 00:08:35.490
We observe that these subsets
are disjoint.

00:08:35.490 --> 00:08:39.080
If we're considering different
points, then we get disjoint

00:08:39.080 --> 00:08:40.890
single element sets.

00:08:40.890 --> 00:08:44.190
And then an additivity property
would tell us that

00:08:44.190 --> 00:08:47.450
the probability of this
union is the sum of the

00:08:47.450 --> 00:08:53.750
probabilities of the different
single element subsets.

00:08:53.750 --> 00:08:57.910
Now, as we discussed before,
single element subsets have 0

00:08:57.910 --> 00:08:58.770
probability.

00:08:58.770 --> 00:09:04.320
So we have a sum of lots of 0s
and the sum of 0s should be

00:09:04.320 --> 00:09:06.310
equal to 0.

00:09:06.310 --> 00:09:09.310
On the other hand, by the
probability axioms, the

00:09:09.310 --> 00:09:11.860
probability of the entire
sample space

00:09:11.860 --> 00:09:13.750
should be equal to 1.

00:09:13.750 --> 00:09:18.140
And so we have established
that 1 is equal to 0.

00:09:18.140 --> 00:09:20.120
This looks like a paradox.

00:09:20.120 --> 00:09:21.840
Is it?

00:09:21.840 --> 00:09:26.110
The catch is that there is
nothing in the axioms we have

00:09:26.110 --> 00:09:29.770
introduced so far or the
properties we have established

00:09:29.770 --> 00:09:32.600
that would justify this step.

00:09:32.600 --> 00:09:36.940
So this step here
is questionable.

00:09:36.940 --> 00:09:40.440
You might argue that the unit
square is the union of

00:09:40.440 --> 00:09:45.490
disjoint single element sets,
which is the case that we have

00:09:45.490 --> 00:09:47.340
in the additivity axiom.

00:09:47.340 --> 00:09:50.950
But the additivity axiom only
applies when we have a

00:09:50.950 --> 00:09:53.770
sequence of events.

00:09:53.770 --> 00:09:56.580
And this is not what
we have here.

00:09:56.580 --> 00:09:59.470
This is not a union
of a sequence of

00:09:59.470 --> 00:10:01.090
single element sets.

00:10:01.090 --> 00:10:04.160
In fact, there is no way that
the elements of the unit

00:10:04.160 --> 00:10:06.930
square can be arranged
in a sequence.

00:10:06.930 --> 00:10:13.440
The unit square is said to
be an uncountable set.

00:10:13.440 --> 00:10:16.950
This is a deep and fundamental
mathematical fact.

00:10:16.950 --> 00:10:19.980
What it essentially says is that
there are two kinds of

00:10:19.980 --> 00:10:21.510
infinite sets.

00:10:21.510 --> 00:10:26.450
Discrete ones, or in formal
terminology, countable ones.

00:10:26.450 --> 00:10:29.980
These are sets whose elements
can be arranged in a sequence,

00:10:29.980 --> 00:10:31.680
like the integers.

00:10:31.680 --> 00:10:36.910
And also uncountable sets, such
as the unit square or the

00:10:36.910 --> 00:10:40.030
real line, whose elements
cannot be

00:10:40.030 --> 00:10:42.140
arranged in a sequence.

00:10:42.140 --> 00:10:45.680
If you're curious, you can
find the proof of this

00:10:45.680 --> 00:10:48.400
important fact in the
supplementary materials that

00:10:48.400 --> 00:10:51.020
we are providing.

00:10:51.020 --> 00:10:53.680
After all this discussion, you
may now have legitimate

00:10:53.680 --> 00:10:57.340
suspicions about the models
we have been looking at.

00:10:57.340 --> 00:11:00.860
Is area a legitimate
probability law?

00:11:00.860 --> 00:11:05.600
Does it even satisfy countable
additivity?

00:11:05.600 --> 00:11:09.000
This question takes us into deep
waters and has to do with

00:11:09.000 --> 00:11:12.250
a deep subfield of mathematics
called measure theory.

00:11:12.250 --> 00:11:15.970
Fortunately, it turns out
that all is well.

00:11:15.970 --> 00:11:19.030
Area is a legitimate
probability law.

00:11:19.030 --> 00:11:23.600
It does indeed satisfy the
countable additivity axiom as

00:11:23.600 --> 00:11:29.270
long as we only deal with nice
subsets of the unit square.

00:11:29.270 --> 00:11:32.640
Fortunately, the subsets that
arise in whatever we do in

00:11:32.640 --> 00:11:35.220
this course will be "nice".

00:11:35.220 --> 00:11:39.890
Subsets that are not nice are
quite pathological and we will

00:11:39.890 --> 00:11:42.260
not encounter them.

00:11:42.260 --> 00:11:47.170
At this stage we are not in a
position to say anything more

00:11:47.170 --> 00:11:50.200
that would be meaningful about
these issues because they're

00:11:50.200 --> 00:11:53.230
quite complicated and
mathematically deep.

00:11:53.230 --> 00:11:57.550
We can only say that there are
some serious mathematical

00:11:57.550 --> 00:11:58.660
subtleties.

00:11:58.660 --> 00:12:01.620
But fortunately, they
can all be overcome

00:12:01.620 --> 00:12:03.190
in a rigorous manner.

00:12:03.190 --> 00:12:06.190
And for the rest of this class,
you can just forget

00:12:06.190 --> 00:12:07.710
about these subtle issues.