WEBVTT
00:00:01.500 --> 00:00:04.860
I now want to emphasize
an important point.
00:00:04.860 --> 00:00:08.200
Conditional probabilities are
just the same as ordinary
00:00:08.200 --> 00:00:12.020
probabilities applied to
a different situation.
00:00:12.020 --> 00:00:16.700
They do not taste or smell or
behave any differently than
00:00:16.700 --> 00:00:18.680
ordinary probabilities.
00:00:18.680 --> 00:00:20.260
What do I mean by that?
00:00:20.260 --> 00:00:24.770
I mean that they satisfy the
usual probability axioms.
00:00:24.770 --> 00:00:28.830
For example, ordinary
probabilities must be
00:00:28.830 --> 00:00:29.560
non-negative.
00:00:29.560 --> 00:00:32.420
Is this true for conditional
probabilities?
00:00:32.420 --> 00:00:35.250
Of course it is true, because
conditional probabilities are
00:00:35.250 --> 00:00:38.460
defined as a ratio of
two probabilities.
00:00:38.460 --> 00:00:40.150
Probabilities are
non-negative.
00:00:40.150 --> 00:00:43.950
So the ratio will also be
non-negative, of course as
00:00:43.950 --> 00:00:45.600
long as it is well-defined.
00:00:45.600 --> 00:00:49.460
And here we need to remember
that we only talk about
00:00:49.460 --> 00:00:53.370
conditional probabilities when
we condition on an event that
00:00:53.370 --> 00:00:56.590
itself has positive
probability.
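
NOTE
Written out, the non-negativity argument above takes one line. The symbols A (the event of interest), B (the conditioning event) and \mathbf{P} (the probability law) are notation assumed here for illustration.
```latex
\mathbf{P}(A \mid B) = \frac{\mathbf{P}(A \cap B)}{\mathbf{P}(B)} \ge 0,
\qquad \text{defined only when } \mathbf{P}(B) > 0 .
```
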
00:00:56.590 --> 00:00:59.170
How about another axiom?
00:00:59.170 --> 00:01:03.540
What is the probability of
the entire sample space,
00:01:03.540 --> 00:01:05.400
given the event B?
00:01:05.400 --> 00:01:06.940
Let's check it out.
00:01:06.940 --> 00:01:11.950
By definition, the conditional
probability is the probability
00:01:11.950 --> 00:01:17.200
of the intersection of the two
events involved divided by the
00:01:17.200 --> 00:01:21.380
probability of the conditioning
event.
00:01:21.380 --> 00:01:24.545
Now, what is the intersection
of omega with B?
00:01:24.545 --> 00:01:26.400
B is a subset of omega.
00:01:26.400 --> 00:01:29.880
So when we intersect
the two sets, we're
00:01:29.880 --> 00:01:32.740
left just with B itself.
00:01:32.740 --> 00:01:36.810
So the numerator becomes the
probability of B. We're
00:01:36.810 --> 00:01:39.570
dividing by the probability
of B, and so the
00:01:39.570 --> 00:01:41.330
answer is equal to 1.
00:01:41.330 --> 00:01:46.610
So indeed, the sample space
has unit probability, even
00:01:46.610 --> 00:01:49.229
under the conditional model.
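
NOTE
The normalization check just described can be written as a short chain of equalities. The notation \mathbf{P} and \Omega is assumed here; the only fact used is that B is a subset of \Omega.
```latex
\mathbf{P}(\Omega \mid B)
  = \frac{\mathbf{P}(\Omega \cap B)}{\mathbf{P}(B)}
  = \frac{\mathbf{P}(B)}{\mathbf{P}(B)}
  = 1 .
```
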
00:01:49.229 --> 00:01:53.390
Now, remember that when we
condition on an event B, we
00:01:53.390 --> 00:01:56.320
could still work with the
original sample space.
00:01:56.320 --> 00:02:00.230
However, possible outcomes that
do not belong to B are
00:02:00.230 --> 00:02:04.140
considered impossible, so we
might as well think of B
00:02:04.140 --> 00:02:07.170
itself as being our
sample space.
00:02:07.170 --> 00:02:10.949
If we proceed like that and
think now of B as being our
00:02:10.949 --> 00:02:15.980
new sample space, what is the
probability of this new sample
00:02:15.980 --> 00:02:18.180
space in the conditional
model?
00:02:18.180 --> 00:02:20.960
Let's apply the definition
once more.
00:02:20.960 --> 00:02:25.490
It's the probability of the
intersection of the two events
00:02:25.490 --> 00:02:30.450
involved, B intersection B,
divided by the probability of
00:02:30.450 --> 00:02:32.980
the conditioning event.
00:02:32.980 --> 00:02:34.360
What is the numerator?
00:02:34.360 --> 00:02:38.490
The intersection of B with
itself is just B, so the
00:02:38.490 --> 00:02:42.290
numerator is the probability
of B. We're dividing by the
00:02:42.290 --> 00:02:47.370
probability of B. So the
answer is, again, 1.
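
NOTE
With B treated as the new sample space, the same check reads as follows (notation \mathbf{P} assumed, as a sketch of the spoken argument).
```latex
\mathbf{P}(B \mid B)
  = \frac{\mathbf{P}(B \cap B)}{\mathbf{P}(B)}
  = \frac{\mathbf{P}(B)}{\mathbf{P}(B)}
  = 1 .
```
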
00:02:47.370 --> 00:02:52.130
Finally, we need to check
the additivity axiom.
00:02:52.130 --> 00:02:55.210
Recall what the additivity
axiom says.
00:02:55.210 --> 00:02:59.070
If we have two events, two
subsets of the sample space
00:02:59.070 --> 00:03:03.140
that are disjoint, then the
probability of their union is
00:03:03.140 --> 00:03:07.250
equal to the sum of their
individual probabilities.
00:03:07.250 --> 00:03:10.950
Is this going to be the case
if we now condition on a
00:03:10.950 --> 00:03:13.290
certain event?
00:03:13.290 --> 00:03:17.430
What we want to prove is the
following statement.
00:03:17.430 --> 00:03:21.870
If we take two events that are
disjoint, they have empty
00:03:21.870 --> 00:03:26.210
intersection, then the
probability of the union is
00:03:26.210 --> 00:03:30.680
the sum of their individual
probabilities, but where now
00:03:30.680 --> 00:03:33.420
the probabilities that we're
employing are the conditional
00:03:33.420 --> 00:03:38.329
probabilities, given the event
B. So let us verify whether
00:03:38.329 --> 00:03:43.579
this relation, this fact
is correct or not.
00:03:43.579 --> 00:03:46.590
Let us take this quantity
and use the
00:03:46.590 --> 00:03:49.510
definition to write it out.
00:03:49.510 --> 00:03:52.630
By definition, this conditional
probability is the
00:03:52.630 --> 00:03:56.730
probability of the intersection
of the first
00:03:56.730 --> 00:04:01.040
event of interest, the one that
appears on this side of
00:04:01.040 --> 00:04:05.420
the conditioning, intersection
with the event on which we are
00:04:05.420 --> 00:04:07.580
conditioning.
00:04:07.580 --> 00:04:11.080
And then we divide by the
probability of the
00:04:11.080 --> 00:04:15.980
conditioning event, B. Now,
let's look at this quantity,
00:04:15.980 --> 00:04:16.820
what is it?
00:04:16.820 --> 00:04:20.990
We're taking the union of A with
C, and then intersecting it
00:04:20.990 --> 00:04:26.930
with B. This union consists
of these two pieces.
00:04:26.930 --> 00:04:30.130
When we intersect with
B, what is left is
00:04:30.130 --> 00:04:32.025
these two pieces here.
00:04:34.940 --> 00:04:42.770
So A union C intersected with B
is the union of two pieces.
00:04:42.770 --> 00:04:48.870
One piece is A intersection
B, this piece here.
00:04:48.870 --> 00:04:55.510
And another piece, which is C
intersection B, this is the
00:04:55.510 --> 00:04:56.980
second piece here.
00:04:56.980 --> 00:05:02.200
So here we basically used a
set theoretic identity.
00:05:02.200 --> 00:05:05.640
And now we divide by the
same [denominator]
00:05:05.640 --> 00:05:07.780
as before.
00:05:07.780 --> 00:05:10.330
And now let us continue.
00:05:10.330 --> 00:05:11.920
Here's an interesting
observation.
00:05:14.460 --> 00:05:18.180
The events A and
C are disjoint.
00:05:18.180 --> 00:05:22.870
The piece of A that also belongs
to B, therefore, is
00:05:22.870 --> 00:05:27.920
disjoint from the piece of
C that also belongs to B.
00:05:27.920 --> 00:05:35.400
Therefore, this set here and
that set here are disjoint.
00:05:35.400 --> 00:05:39.530
Since they are disjoint, the
probability of their union has
00:05:39.530 --> 00:05:42.850
to be equal to the sum
of their individual
00:05:42.850 --> 00:05:45.070
probabilities.
00:05:45.070 --> 00:05:49.040
So here we're using the
additivity axiom on the
00:05:49.040 --> 00:05:54.260
original probabilities to break
this probability up into
00:05:54.260 --> 00:05:55.510
two pieces.
00:05:59.770 --> 00:06:05.870
And now we observe that here
we have the ratio of the probability of an
00:06:05.870 --> 00:06:09.560
intersection to the probability
of B. This is just
00:06:09.560 --> 00:06:14.270
the conditional probability of A
given B using the definition
00:06:14.270 --> 00:06:16.860
of conditional probabilities.
00:06:16.860 --> 00:06:21.750
And the second part is the
conditional probability of C
00:06:21.750 --> 00:06:25.240
given B, where, again, we're
using the definition of
00:06:25.240 --> 00:06:27.370
conditional probabilities.
00:06:27.370 --> 00:06:32.490
So we have indeed checked that
this additivity property is
00:06:32.490 --> 00:06:37.450
true for the case of conditional
probabilities when
00:06:37.450 --> 00:06:40.530
we consider two disjoint
events.
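
NOTE
The additivity argument refers to quantities written on the board. A plausible reconstruction of that derivation is sketched below, assuming A and C are disjoint, \mathbf{P}(B) > 0, and \mathbf{P} denotes the original probability law.
```latex
\begin{align*}
\mathbf{P}(A \cup C \mid B)
  &= \frac{\mathbf{P}\big((A \cup C) \cap B\big)}{\mathbf{P}(B)}
     && \text{definition} \\
  &= \frac{\mathbf{P}\big((A \cap B) \cup (C \cap B)\big)}{\mathbf{P}(B)}
     && \text{set-theoretic identity} \\
  &= \frac{\mathbf{P}(A \cap B) + \mathbf{P}(C \cap B)}{\mathbf{P}(B)}
     && \text{additivity; } A \cap B \text{ and } C \cap B \text{ are disjoint} \\
  &= \mathbf{P}(A \mid B) + \mathbf{P}(C \mid B)
     && \text{definition}
\end{align*}
```
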
00:06:40.530 --> 00:06:45.340
Now, we could repeat the same
derivation and verify that it
00:06:45.340 --> 00:06:52.810
is also true for the case of a
disjoint union of finitely
00:06:52.810 --> 00:06:57.260
many events, or even
for countably
00:06:57.260 --> 00:07:00.120
many disjoint events.
00:07:00.120 --> 00:07:05.720
So we do have finite and
countable additivity.
00:07:05.720 --> 00:07:10.240
We're not proving it, but the
argument is exactly the same
00:07:10.240 --> 00:07:13.010
as for the case of two events.
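
NOTE
The lecture does not carry out the countable case. A sketch of how the same steps would go is given below, assuming a sequence of disjoint events named A_1, A_2, \dots (a naming chosen here) and countable additivity of the original law \mathbf{P}.
```latex
\begin{align*}
\mathbf{P}\Big(\bigcup_{i=1}^{\infty} A_i \,\Big|\, B\Big)
  &= \frac{\mathbf{P}\big(\bigcup_{i=1}^{\infty} (A_i \cap B)\big)}{\mathbf{P}(B)} \\
  &= \frac{\sum_{i=1}^{\infty} \mathbf{P}(A_i \cap B)}{\mathbf{P}(B)}
   = \sum_{i=1}^{\infty} \mathbf{P}(A_i \mid B) .
\end{align*}
```
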
00:07:13.010 --> 00:07:18.510
So conditional probabilities do
satisfy all of the standard
00:07:18.510 --> 00:07:20.840
axioms of probability theory.
00:07:20.840 --> 00:07:23.700
So conditional probabilities
are just like ordinary
00:07:23.700 --> 00:07:24.970
probabilities.
00:07:24.970 --> 00:07:28.460
This actually has a very
important implication.
00:07:28.460 --> 00:07:30.800
Since conditional probabilities
satisfy all of
00:07:30.800 --> 00:07:34.710
the probability axioms, any
formula or theorem that we
00:07:34.710 --> 00:07:39.880
ever derive for ordinary
probabilities will remain true
00:07:39.880 --> 00:07:42.570
for conditional probabilities
as well.
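
NOTE
As a small numerical illustration of this implication, the sketch below checks the three axioms and two derived formulas (the complement rule and inclusion-exclusion) under a conditional law. The model (two fair four-sided dice) and the names OMEGA, prob, cond_prob, A, B, C, D are assumptions chosen here, not part of the lecture. Exact fractions are used so the equalities hold without rounding.
```python
from fractions import Fraction
from itertools import product
# Hypothetical model: two fair four-sided dice, 16 equally likely outcomes.
OMEGA = frozenset(product(range(1, 5), repeat=2))
def prob(event):
    """Ordinary probability of an event under the uniform law on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))
def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given); needs P(given) > 0."""
    if prob(given) == 0:
        raise ValueError("conditioning event must have positive probability")
    return prob(event & given) / prob(given)
# Illustrative events (assumed for this example).
B = frozenset(w for w in OMEGA if min(w) == 2)  # conditioning event
A = frozenset(w for w in OMEGA if max(w) == 2)
C = frozenset(w for w in OMEGA if max(w) == 4)
D = frozenset(w for w in OMEGA if w[0] == 2)
# The three axioms, checked under the conditional law given B.
assert cond_prob(A, B) >= 0
assert cond_prob(OMEGA, B) == 1 and cond_prob(B, B) == 1
assert not (A & C)  # A and C are disjoint
assert cond_prob(A | C, B) == cond_prob(A, B) + cond_prob(C, B)
# Two derived formulas, with every probability conditioned on B.
assert cond_prob(OMEGA - A, B) == 1 - cond_prob(A, B)
assert cond_prob(A | D, B) == cond_prob(A, B) + cond_prob(D, B) - cond_prob(A & D, B)
```
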