WEBVTT

00:00:01.500 --> 00:00:04.860
I now want to emphasize
an important point.

00:00:04.860 --> 00:00:08.200
Conditional probabilities are
just the same as ordinary

00:00:08.200 --> 00:00:12.020
probabilities applied to
a different situation.

00:00:12.020 --> 00:00:16.700
They do not taste or smell or
behave any differently than

00:00:16.700 --> 00:00:18.680
ordinary probabilities.

00:00:18.680 --> 00:00:20.260
What do I mean by that?

00:00:20.260 --> 00:00:24.770
I mean that they satisfy the
usual probability axioms.

00:00:24.770 --> 00:00:28.830
For example, ordinary
probabilities must be

00:00:28.830 --> 00:00:29.560
non-negative.

00:00:29.560 --> 00:00:32.420
Is this true for conditional
probabilities?

00:00:32.420 --> 00:00:35.250
Of course it is true, because
conditional probabilities are

00:00:35.250 --> 00:00:38.460
defined as a ratio of
two probabilities.

00:00:38.460 --> 00:00:40.150
Probabilities are
non-negative.

00:00:40.150 --> 00:00:43.950
So the ratio will also be
non-negative, of course as

00:00:43.950 --> 00:00:45.600
long as it is well-defined.

00:00:45.600 --> 00:00:49.460
And here we need to remember
that we only talk about

00:00:49.460 --> 00:00:53.370
conditional probabilities when
we condition on an event that

00:00:53.370 --> 00:00:56.590
itself has positive
probability.

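NOTE
In symbols, the definition and the non-negativity check can be sketched as follows. A stands for a generic event and B for the conditioning event. The notation is an assumption, since the transcript does not show the slide.
```latex
\mathbf{P}(A \mid B) = \frac{\mathbf{P}(A \cap B)}{\mathbf{P}(B)} \ge 0,
\qquad \text{assuming } \mathbf{P}(B) > 0 .
```
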
00:00:56.590 --> 00:00:59.170
How about another axiom?

00:00:59.170 --> 00:01:03.540
What is the probability of
the entire sample space,

00:01:03.540 --> 00:01:05.400
given the event B?

00:01:05.400 --> 00:01:06.940
Let's check it out.

00:01:06.940 --> 00:01:11.950
By definition, the conditional
probability is the probability

00:01:11.950 --> 00:01:17.200
of the intersection of the two
events involved divided by the

00:01:17.200 --> 00:01:21.380
probability of the conditioning
event.

00:01:21.380 --> 00:01:24.545
Now, what is the intersection
of omega with B?

00:01:24.545 --> 00:01:26.400
B is a subset of omega.

00:01:26.400 --> 00:01:29.880
So when we intersect
the two sets, we're

00:01:29.880 --> 00:01:32.740
left just with B itself.

00:01:32.740 --> 00:01:36.810
So the numerator becomes the
probability of B. We're

00:01:36.810 --> 00:01:39.570
dividing by the probability
of B, and so the

00:01:39.570 --> 00:01:41.330
answer is equal to 1.

00:01:41.330 --> 00:01:46.610
So indeed, the sample space
has unit probability, even

00:01:46.610 --> 00:01:49.229
under the conditional model.

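NOTE
The normalization check just described can be written out as a short chain of equalities. This is a sketch in assumed notation, with Omega denoting the sample space.
```latex
\mathbf{P}(\Omega \mid B)
  = \frac{\mathbf{P}(\Omega \cap B)}{\mathbf{P}(B)}
  = \frac{\mathbf{P}(B)}{\mathbf{P}(B)}
  = 1 .
```
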
00:01:49.229 --> 00:01:53.390
Now, remember that when we
condition on an event B, we

00:01:53.390 --> 00:01:56.320
could still work with the
original sample space.

00:01:56.320 --> 00:02:00.230
However, possible outcomes that
do not belong to B are

00:02:00.230 --> 00:02:04.140
considered impossible, so we
might as well think of B

00:02:04.140 --> 00:02:07.170
itself as being our
sample space.

00:02:07.170 --> 00:02:10.949
If we proceed like that and
think now of B as being our

00:02:10.949 --> 00:02:15.980
new sample space, what is the
probability of this new sample

00:02:15.980 --> 00:02:18.180
space in the conditional
model?

00:02:18.180 --> 00:02:20.960
Let's apply the definition
once more.

00:02:20.960 --> 00:02:25.490
It's the probability of the
intersection of the two events

00:02:25.490 --> 00:02:30.450
involved, B intersection B,
divided by the probability of

00:02:30.450 --> 00:02:32.980
the conditioning event.

00:02:32.980 --> 00:02:34.360
What is the numerator?

00:02:34.360 --> 00:02:38.490
The intersection of B with
itself is just B, so the

00:02:38.490 --> 00:02:42.290
numerator is the probability
of B. We're dividing by the

00:02:42.290 --> 00:02:47.370
probability of B. So the
answer is, again, 1.

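NOTE
Treating B itself as the new sample space, the same computation reads as follows (a sketch in assumed notation).
```latex
\mathbf{P}(B \mid B)
  = \frac{\mathbf{P}(B \cap B)}{\mathbf{P}(B)}
  = \frac{\mathbf{P}(B)}{\mathbf{P}(B)}
  = 1 .
```
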
00:02:47.370 --> 00:02:52.130
Finally, we need to check
the additivity axiom.

00:02:52.130 --> 00:02:55.210
Recall what the additivity
axiom says.

00:02:55.210 --> 00:02:59.070
If we have two events, two
subsets of the sample space

00:02:59.070 --> 00:03:03.140
that are disjoint, then the
probability of their union is

00:03:03.140 --> 00:03:07.250
equal to the sum of their
individual probabilities.

00:03:07.250 --> 00:03:10.950
Is this going to be the case
if we now condition on a

00:03:10.950 --> 00:03:13.290
certain event?

00:03:13.290 --> 00:03:17.430
What we want to prove is the
following statement.

00:03:17.430 --> 00:03:21.870
If we take two events that are
disjoint, they have empty

00:03:21.870 --> 00:03:26.210
intersection, then the
probability of the union is

00:03:26.210 --> 00:03:30.680
the sum of their individual
probabilities, but where now

00:03:30.680 --> 00:03:33.420
the probabilities that we're
employing are the conditional

00:03:33.420 --> 00:03:38.329
probabilities, given the event
B. So let us verify whether

00:03:38.329 --> 00:03:43.579
this relation, this fact
is correct or not.

00:03:43.579 --> 00:03:46.590
Let us take this quantity
and use the

00:03:46.590 --> 00:03:49.510
definition to write it out.

00:03:49.510 --> 00:03:52.630
By definition, this conditional
probability is the

00:03:52.630 --> 00:03:56.730
probability of the intersection
of the first

00:03:56.730 --> 00:04:01.040
event of interest, the one that
appears on this side of

00:04:01.040 --> 00:04:05.420
the conditioning, intersection
with the event on which we are

00:04:05.420 --> 00:04:07.580
conditioning.

00:04:07.580 --> 00:04:11.080
And then we divide by the
probability of the

00:04:11.080 --> 00:04:15.980
conditioning event, B. Now,
let's look at this quantity,

00:04:15.980 --> 00:04:16.820
what is it?

00:04:16.820 --> 00:04:20.990
We're taking the union of A with
C, and then intersecting it

00:04:20.990 --> 00:04:26.930
with B. This union consists
of these two pieces.

00:04:26.930 --> 00:04:30.130
When we intersect with
B, what is left is

00:04:30.130 --> 00:04:32.025
these two pieces here.

00:04:34.940 --> 00:04:42.770
So A union C intersected with B
is the union of two pieces.

00:04:42.770 --> 00:04:48.870
One piece is A intersection
B, this piece here.

00:04:48.870 --> 00:04:55.510
And another piece, which is C
intersection B, this is the

00:04:55.510 --> 00:04:56.980
second piece here.

00:04:56.980 --> 00:05:02.200
So here we basically used a
set theoretic identity.

00:05:02.200 --> 00:05:05.640
And now we divide by the
same denominator

00:05:05.640 --> 00:05:07.780
as before.

00:05:07.780 --> 00:05:10.330
And now let us continue.

00:05:10.330 --> 00:05:11.920
Here's an interesting
observation.

00:05:14.460 --> 00:05:18.180
The events A and
C are disjoint.

00:05:18.180 --> 00:05:22.870
The piece of A that also belongs
to B, therefore, is

00:05:22.870 --> 00:05:27.920
disjoint from the piece of
C that also belongs to B.

00:05:27.920 --> 00:05:35.400
Therefore, this set here and
that set here are disjoint.

00:05:35.400 --> 00:05:39.530
Since they are disjoint, the
probability of their union has

00:05:39.530 --> 00:05:42.850
to be equal to the sum
of their individual

00:05:42.850 --> 00:05:45.070
probabilities.

00:05:45.070 --> 00:05:49.040
So here we're using the
additivity axiom on the

00:05:49.040 --> 00:05:54.260
original probabilities to break
this probability up into

00:05:54.260 --> 00:05:55.510
two pieces.

00:05:59.770 --> 00:06:05.870
And now we observe that here
we have the ratio of the
probability of an

00:06:05.870 --> 00:06:09.560
intersection to the probability
of B. This is just

00:06:09.560 --> 00:06:14.270
the conditional probability of A
given B using the definition

00:06:14.270 --> 00:06:16.860
of conditional probabilities.

00:06:16.860 --> 00:06:21.750
And the second part is the
conditional probability of C

00:06:21.750 --> 00:06:25.240
given B, where, again, we're
using the definition of

00:06:25.240 --> 00:06:27.370
conditional probabilities.

00:06:27.370 --> 00:06:32.490
So we have indeed checked that
this additivity property is

00:06:32.490 --> 00:06:37.450
true for the case of conditional
probabilities when

00:06:37.450 --> 00:06:40.530
we consider two disjoint
events.

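NOTE
The whole additivity argument for two disjoint events A and C can be summarized in one chain. This is a sketch reconstructed from the spoken steps, and the typesetting is an assumption. The second equality is the set-theoretic identity, the third uses additivity of the original probabilities on the disjoint pieces, and the last applies the definition twice.
```latex
\begin{aligned}
\mathbf{P}(A \cup C \mid B)
  &= \frac{\mathbf{P}\big((A \cup C) \cap B\big)}{\mathbf{P}(B)} \\
  &= \frac{\mathbf{P}\big((A \cap B) \cup (C \cap B)\big)}{\mathbf{P}(B)} \\
  &= \frac{\mathbf{P}(A \cap B) + \mathbf{P}(C \cap B)}{\mathbf{P}(B)} \\
  &= \mathbf{P}(A \mid B) + \mathbf{P}(C \mid B).
\end{aligned}
```
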
00:06:40.530 --> 00:06:45.340
Now, we could repeat the same
derivation and verify that it

00:06:45.340 --> 00:06:52.810
is also true for the case of a
disjoint union of finitely

00:06:52.810 --> 00:06:57.260
many events, or even
for countably

00:06:57.260 --> 00:07:00.120
many disjoint events.

00:07:00.120 --> 00:07:05.720
So we do have finite and
countable additivity.

00:07:05.720 --> 00:07:10.240
We're not proving it, but the
argument is exactly the same

00:07:10.240 --> 00:07:13.010
as for the case of two events.

00:07:13.010 --> 00:07:18.510
So conditional probabilities do
satisfy all of the standard

00:07:18.510 --> 00:07:20.840
axioms of probability theory.

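NOTE
As a numerical sanity check, the three axioms can be tested on a small example. The fair six-sided die, the events A, B, C, and the helper names prob and cond_prob are all hypothetical choices made for illustration. None of them come from the lecture.
```python
from fractions import Fraction
# Hypothetical model: one roll of a fair six-sided die (not from the lecture).
OMEGA = frozenset(range(1, 7))
def prob(event):
    """Ordinary probability under the uniform law on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))
def cond_prob(event, given):
    """P(event | given), defined only when P(given) > 0."""
    if prob(given) == 0:
        raise ValueError("conditioning event must have positive probability")
    return prob(event & given) / prob(given)
B = frozenset({2, 4, 6})  # conditioning event: the roll is even
A = frozenset({1, 2})     # A and C are disjoint events
C = frozenset({4, 5})
# Non-negativity, checked on every single-outcome event.
nonneg = all(cond_prob(frozenset({k}), B) >= 0 for k in OMEGA)
# Normalization: the whole sample space has conditional probability 1.
total = cond_prob(OMEGA, B)
# Additivity for the disjoint events A and C.
lhs = cond_prob(A | C, B)
rhs = cond_prob(A, B) + cond_prob(C, B)
print(nonneg, total, lhs, rhs)  # prints: True 1 2/3 2/3
```
Exact fractions are used instead of floats so that the two sides of the additivity check compare equal without rounding error.
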
00:07:20.840 --> 00:07:23.700
So conditional probabilities
are just like ordinary

00:07:23.700 --> 00:07:24.970
probabilities.

00:07:24.970 --> 00:07:28.460
This actually has a very
important implication.

00:07:28.460 --> 00:07:30.800
Since conditional probabilities
satisfy all of

00:07:30.800 --> 00:07:34.710
the probability axioms, any
formula or theorem that we

00:07:34.710 --> 00:07:39.880
ever derive for ordinary
probabilities will remain true

00:07:39.880 --> 00:07:42.570
for conditional probabilities
as well.