WEBVTT
00:00:00.790 --> 00:00:03.450
In this segment, we develop
the inclusion-exclusion
00:00:03.450 --> 00:00:07.220
formula, which is a beautiful
generalization of a formula
00:00:07.220 --> 00:00:09.390
that we have seen before.
00:00:09.390 --> 00:00:11.500
Let us look at this
formula and remind
00:00:11.500 --> 00:00:13.520
ourselves what it says.
00:00:13.520 --> 00:00:19.110
If we have two sets, A1 and A2,
and we're interested in
00:00:19.110 --> 00:00:22.080
the probability of their union,
how can we find it?
00:00:22.080 --> 00:00:25.960
We take the probability of the
first set, we add to it the
00:00:25.960 --> 00:00:29.810
probability of the second set,
but then we realize that by
00:00:29.810 --> 00:00:32.710
doing so we have double counted
00:00:32.710 --> 00:00:34.770
this part of the diagram.
00:00:34.770 --> 00:00:38.740
And so we need to correct for
that and we need to subtract
00:00:38.740 --> 00:00:41.040
the probability of this
intersection.
00:00:41.040 --> 00:00:43.830
And that's how this formula
comes about.
00:00:43.830 --> 00:00:46.350
Can we generalize this thinking,
let's say, to the
00:00:46.350 --> 00:00:49.240
case of three events?
00:00:49.240 --> 00:01:01.240
Suppose that we have three
events, A1, A2, and A3.
00:01:01.240 --> 00:01:05.470
And we want to calculate the
probability of their union.
00:01:05.470 --> 00:01:08.920
We first start by adding
the probabilities of
00:01:08.920 --> 00:01:10.660
the different sets.
00:01:10.660 --> 00:01:14.260
But then we realize that, for
example, this part of the
00:01:14.260 --> 00:01:17.410
diagram has been
counted twice.
00:01:17.410 --> 00:01:21.090
It shows up once inside the
probability of A1 and once
00:01:21.090 --> 00:01:24.180
inside the probability of A2.
00:01:24.180 --> 00:01:27.740
So, for this reason, we need
to make a correction and we
00:01:27.740 --> 00:01:29.930
need to subtract the
probability of this
00:01:29.930 --> 00:01:30.670
intersection.
00:01:30.670 --> 00:01:32.630
Similarly, subtract the
probability of that
00:01:32.630 --> 00:01:34.940
intersection and of this one.
00:01:34.940 --> 00:01:37.630
So we subtract the probabilities
of these
00:01:37.630 --> 00:01:38.800
intersections.
00:01:38.800 --> 00:01:41.120
But, actually, the intersections
are not just
00:01:41.120 --> 00:01:42.340
what I drew here.
00:01:42.340 --> 00:01:45.780
The intersections also
involve this part.
00:01:45.780 --> 00:01:50.940
So now, let us just focus on
this part of the diagram here.
00:01:50.940 --> 00:01:57.570
A typical element that belongs
to all three of the sets will
00:01:57.570 --> 00:02:02.520
show up once here, once
here and once there.
00:02:02.520 --> 00:02:07.350
But it will also show up in all
of these intersections.
00:02:07.350 --> 00:02:10.610
And so it shows up three times
with a plus sign, three times
00:02:10.610 --> 00:02:13.390
with a minus sign, which means
that these elements will not
00:02:13.390 --> 00:02:14.990
to be counted at all.
00:02:14.990 --> 00:02:19.280
In order to count them, we
need to add one more term
00:02:19.280 --> 00:02:23.640
which is the probability of the
three way intersection.
00:02:23.640 --> 00:02:27.760
So this is the formula for the
probability of the union of
00:02:27.760 --> 00:02:30.550
three events.
00:02:30.550 --> 00:02:34.329
It has a rationale similar to
this formula, and you can
00:02:34.329 --> 00:02:37.829
convince yourself that it is
a correct formula by just
00:02:37.829 --> 00:02:40.490
looking at the different pieces
of this diagram and
00:02:40.490 --> 00:02:44.410
make sure that each one of them
is accounted properly.
00:02:44.410 --> 00:02:48.055
But instead of working in terms
of such a picture, let
00:02:48.055 --> 00:02:51.050
us think about a more
formal derivation.
00:02:51.050 --> 00:02:54.550
And the formal derivation will
use a beautiful trick.
00:02:54.550 --> 00:02:58.950
Namely, indicator functions.
00:02:58.950 --> 00:03:02.800
So here is the formula that
we want to establish.
00:03:02.800 --> 00:03:08.190
And let us remind ourselves what
indicator functions are.
00:03:08.190 --> 00:03:11.680
To any set or event,
we can associate
00:03:11.680 --> 00:03:13.640
an indicator function.
00:03:13.640 --> 00:03:16.440
Let's say that this
is the set Ai.
00:03:16.440 --> 00:03:20.900
We're going to associate an
indicator function, call it
00:03:20.900 --> 00:03:25.070
Xi, which is equal to 1 when
the outcome is inside this
00:03:25.070 --> 00:03:29.250
set, and it's going to be 0 when
the outcome is outside.
00:03:29.250 --> 00:03:32.829
What is the indicator function
of the complement?
00:03:32.829 --> 00:03:37.190
The indicator function of the
complement is 1 minus the
00:03:37.190 --> 00:03:39.600
indicator of the event.
00:03:39.600 --> 00:03:41.300
Why is this?
00:03:41.300 --> 00:03:47.180
If the outcome is in the
complement, then Xi is equal
00:03:47.180 --> 00:03:50.600
to 0, and this expression
is equal to 1.
00:03:50.600 --> 00:03:55.110
On the other hand, if the
outcome is inside Ai, then the
00:03:55.110 --> 00:03:58.610
indicator function will be equal
to 1 and this quantity
00:03:58.610 --> 00:04:01.520
is going to be equal to 0.
00:04:01.520 --> 00:04:07.520
If we have the intersection of
two events, Ai and Aj, what is
00:04:07.520 --> 00:04:09.360
their indicator function?
00:04:09.360 --> 00:04:12.930
It is Xi times Xj.
00:04:12.930 --> 00:04:17.149
This expression is equal to 1,
if and only if, Xi is equal to
00:04:17.149 --> 00:04:21.890
1 and Xj is equal to 1, which
happens, if and only if, the
00:04:21.890 --> 00:04:27.000
outcome is inside Ai
and also inside Aj.
00:04:27.000 --> 00:04:31.710
Now, what about the indicator
of the intersection of the
00:04:31.710 --> 00:04:32.960
complements?
00:04:35.409 --> 00:04:36.940
Well, it's an intersection.
00:04:36.940 --> 00:04:40.150
So the associated indicator
function is going to be the
00:04:40.150 --> 00:04:43.640
product of the indicator
function of the first set,
00:04:43.640 --> 00:04:48.560
which is 1 minus Xi times the
indicator function of the
00:04:48.560 --> 00:04:52.840
second set, which
is 1 minus Xj.
00:04:52.840 --> 00:04:55.930
And finally, what
is the indicator
00:04:55.930 --> 00:04:57.895
function of this event?
00:05:00.840 --> 00:05:04.380
Here we remember De
Morgan's Laws.
00:05:04.380 --> 00:05:09.770
De Morgan's Laws tell us that
the complement of this set--
00:05:09.770 --> 00:05:11.470
the complement of a union--
00:05:11.470 --> 00:05:13.880
is the intersection of
the complements.
00:05:13.880 --> 00:05:18.140
So this event here is the
complement of that event.
00:05:18.140 --> 00:05:21.850
And, therefore, the associated
indicator function is going to
00:05:21.850 --> 00:05:24.435
be 1 minus this expression.
00:05:31.440 --> 00:05:35.030
And if we were dealing with
more than two sets--
00:05:35.030 --> 00:05:38.600
and here we had, for example,
three way intersections--
00:05:38.600 --> 00:05:41.430
you would get the product
of three terms.
00:05:41.430 --> 00:05:44.770
And if we had a three way union,
we would get a similar
00:05:44.770 --> 00:05:47.460
expression, except that here
we would have, again, a
00:05:47.460 --> 00:05:51.250
product of three terms
instead of two.
00:05:51.250 --> 00:05:57.159
So now, let us put to use what
we have done so far.
00:05:57.159 --> 00:06:02.030
We are interested in the
probability that the outcome
00:06:02.030 --> 00:06:06.550
falls in the union
of three sets.
00:06:06.550 --> 00:06:09.760
Now, an important fact to
remember is that the
00:06:09.760 --> 00:06:15.300
probability of an event is the
same as the expected value of
00:06:15.300 --> 00:06:17.660
the indicator of that event.
00:06:22.810 --> 00:06:28.090
This is because the indicator is
equal to 1, if and only if,
00:06:28.090 --> 00:06:32.400
the outcome happens to
be inside that set.
00:06:32.400 --> 00:06:36.820
And so the contribution that we
get to the expectation is 1
00:06:36.820 --> 00:06:40.810
times the probability that the
indicator is 1, which is just
00:06:40.810 --> 00:06:43.070
this probability.
00:06:43.070 --> 00:06:51.600
Now, the indicator of a three
way union is going to be, by
00:06:51.600 --> 00:06:56.620
what we just discussed, 1 minus
a product of this kind,
00:06:56.620 --> 00:06:58.125
but now with three terms.
00:07:05.660 --> 00:07:10.320
Let us now calculate this
expectation by expanding the
00:07:10.320 --> 00:07:12.020
product involved.
00:07:12.020 --> 00:07:16.480
We have this first term, then,
when we multiply those three
00:07:16.480 --> 00:07:19.960
terms together, we're going to
get a bunch of contributions.
00:07:19.960 --> 00:07:25.170
One contribution with a minus
sign is 1 times 1 times 1.
00:07:25.170 --> 00:07:27.970
Another contribution would
be minus minus--
00:07:27.970 --> 00:07:28.580
that's a plus--
00:07:28.580 --> 00:07:32.420
X1 times 1 times 1.
00:07:32.420 --> 00:07:37.360
And similarly, we get a
contribution of X2 and X3.
00:07:37.360 --> 00:07:41.050
And then we have a contribution
such as X1 times
00:07:41.050 --> 00:07:43.310
X2 times 1.
00:07:43.310 --> 00:07:45.630
And if you look at
the minus signs--
00:07:45.630 --> 00:07:48.250
there are three minuses
involved-- so, overall, it's
00:07:48.250 --> 00:07:49.750
going to be a minus.
00:07:49.750 --> 00:07:52.510
Minus X1 times X2.
00:07:52.510 --> 00:07:55.350
And then there is going to
be similar terms, such as
00:07:55.350 --> 00:08:01.010
X1 X3 and X2 X3.
00:08:01.010 --> 00:08:03.680
And, finally, there's going
to be a term X1
00:08:03.680 --> 00:08:06.100
times X2 times X3.
00:08:06.100 --> 00:08:09.770
There's a total of four minus
signs involved, so everything
00:08:09.770 --> 00:08:12.230
shows up in the end
with a plus sign.
00:08:17.410 --> 00:08:20.280
So the probability of this
event is equal to the
00:08:20.280 --> 00:08:23.530
expectation of this random
variable here.
00:08:23.530 --> 00:08:26.270
We notice that the
ones cancel out.
00:08:26.270 --> 00:08:31.310
The expected value of X1 for an
indicator variable is just
00:08:31.310 --> 00:08:32.799
the probability of that event.
00:08:32.799 --> 00:08:33.820
And we get this term.
00:08:33.820 --> 00:08:39.120
The expected value of X2 and
X3 give us these terms.
00:08:39.120 --> 00:08:42.159
The expected value
of X1 times X2.
00:08:42.159 --> 00:08:46.110
This is the indicator random
variable of the intersection.
00:08:46.110 --> 00:08:50.380
So the expected value of this
term is just the probability
00:08:50.380 --> 00:08:53.740
of the intersection
of A1 and A2.
00:08:53.740 --> 00:08:58.800
And, similarly, these terms here
give rise to those two
00:08:58.800 --> 00:09:00.360
terms here.
00:09:00.360 --> 00:09:06.610
Finally, X1 times X2 times X3 is
the indicator variable for
00:09:06.610 --> 00:09:10.290
the event A1 intersection
A2 intersection A3.
00:09:10.290 --> 00:09:14.220
Therefore, the expected value of
this term, is equal to this
00:09:14.220 --> 00:09:15.600
probability.
00:09:15.600 --> 00:09:19.290
And, therefore, we have
established exactly the
00:09:19.290 --> 00:09:22.940
formula that we wanted
to establish.
00:09:22.940 --> 00:09:26.310
Now this derivation that we
carried out here, there's
00:09:26.310 --> 00:09:29.070
nothing special about
the case of three.
00:09:29.070 --> 00:09:32.770
We could have the union of many
more events, we would
00:09:32.770 --> 00:09:36.750
just have here the product of
more terms, and we would need
00:09:36.750 --> 00:09:40.010
to carry out the multiplication
and we would
00:09:40.010 --> 00:09:45.770
get cross terms of all types
involving just one of the
00:09:45.770 --> 00:09:48.400
indicator variables, or products
of two indicator
00:09:48.400 --> 00:09:50.680
variables, or products
of three indicator
00:09:50.680 --> 00:09:53.020
variables, and so on.
00:09:53.020 --> 00:09:56.790
And after you carry out this
exercise and keep track of the
00:09:56.790 --> 00:10:01.870
various terms, you end up with
this general version of what
00:10:01.870 --> 00:10:04.530
is called the
inclusion-exclusion formula.
00:10:04.530 --> 00:10:07.390
So the probability
of a union is--
00:10:07.390 --> 00:10:10.000
there's the sum of the
probabilities, but then you
00:10:10.000 --> 00:10:14.600
subtract all possible
probabilities of two way
00:10:14.600 --> 00:10:16.060
intersections.
00:10:16.060 --> 00:10:20.510
Then we add probabilities of
three way intersections, then
00:10:20.510 --> 00:10:23.560
you subtract probabilities of
four way intersections, and
00:10:23.560 --> 00:10:28.330
you keep going this way
alternating sings until you
00:10:28.330 --> 00:10:32.300
get to the last term, which
is the probability of the
00:10:32.300 --> 00:10:36.570
intersection of all the
events involved.
00:10:36.570 --> 00:10:39.810
And this exponent here of n
minus 1 is the exponent that
00:10:39.810 --> 00:10:44.370
you need so that the last term
has the correct sign.
00:10:44.370 --> 00:10:50.210
So, for example, if n is equal
to 3, the exponent would be 2,
00:10:50.210 --> 00:10:54.130
so this would be a plus sign,
which is consistent with what
00:10:54.130 --> 00:10:56.640
we got here.
00:10:56.640 --> 00:11:00.140
So this is a formula that is
quite useful when you want to
00:11:00.140 --> 00:11:03.730
calculate probabilities
of unions of events.
00:11:03.730 --> 00:11:08.640
But also, this derivation using
indicator functions, is
00:11:08.640 --> 00:11:09.890
quite beautiful.