WEBVTT
00:00:01.980 --> 00:00:04.910
Let us now study a very
important counting problem,
00:00:04.910 --> 00:00:07.360
the problem of counting
combinations.
00:00:07.360 --> 00:00:09.520
What is a combination?
00:00:09.520 --> 00:00:13.830
We start with a set
of n elements.
00:00:13.830 --> 00:00:18.580
And we're also given a
non-negative integer, k.
00:00:18.580 --> 00:00:22.780
And we want to construct or
to choose a subset of the
00:00:22.780 --> 00:00:27.480
original set that has
exactly k elements.
00:00:27.480 --> 00:00:30.990
In different language, we want
to pick a combination of k
00:00:30.990 --> 00:00:32.820
elements of the original set.
00:00:32.820 --> 00:00:36.160
In how many ways can
this be done?
00:00:36.160 --> 00:00:38.870
Let us introduce
some notation.
00:00:38.870 --> 00:00:44.460
We use this notation here, which
we read as "n-choose-k,"
00:00:44.460 --> 00:00:47.830
to denote exactly the quantity
that we want to calculate,
00:00:47.830 --> 00:00:52.540
namely the number of subsets
of a given n-element set,
00:00:52.540 --> 00:00:54.770
where we only count
those subsets that
00:00:54.770 --> 00:00:57.980
have exactly k elements.
00:00:57.980 --> 00:01:00.870
How are we going to calculate
this quantity?
00:01:00.870 --> 00:01:03.450
Instead of proceeding directly,
we're going to
00:01:03.450 --> 00:01:06.430
consider a somewhat different
counting problem which we're
00:01:06.430 --> 00:01:09.220
going to approach in two
different ways, get two
00:01:09.220 --> 00:01:12.050
different answers, compare
those answers, and by
00:01:12.050 --> 00:01:15.180
comparing them, we're going to
get an equation, which is
00:01:15.180 --> 00:01:19.150
going to give us in the end,
the desired number.
00:01:19.150 --> 00:01:21.610
The alternative problem
that we're going
00:01:21.610 --> 00:01:24.050
to use is the following.
00:01:24.050 --> 00:01:27.850
We start, as before, with
our given set that
00:01:27.850 --> 00:01:30.240
consists of n elements.
00:01:30.240 --> 00:01:35.229
But instead of picking a subset,
what we want to do is
00:01:35.229 --> 00:01:39.940
to construct a list, an
ordered sequence, that
00:01:39.940 --> 00:01:44.630
consists of k distinct elements
taken out of the
00:01:44.630 --> 00:01:46.060
original set.
00:01:46.060 --> 00:01:50.370
So we think of having k
different slots, and we want
00:01:50.370 --> 00:01:53.910
to fill each one of those slots
with one of the elements
00:01:53.910 --> 00:01:55.910
of the original set.
00:01:55.910 --> 00:01:58.030
In how many ways can
this be done?
00:02:00.590 --> 00:02:03.130
Well, we want to use the
counting principle.
00:02:03.130 --> 00:02:07.630
So we want to decompose this
problem into stages.
00:02:07.630 --> 00:02:11.300
So what we can do is to choose
each one of the k items that
00:02:11.300 --> 00:02:13.620
go into this list
one at a time.
00:02:13.620 --> 00:02:17.270
We first choose an item that
goes to the first position, to
00:02:17.270 --> 00:02:18.640
the first slot.
00:02:18.640 --> 00:02:22.120
Having used one of the items in
that set, we're left with n
00:02:22.120 --> 00:02:26.329
minus 1 choices for the
item that can go
00:02:26.329 --> 00:02:27.990
into the second slot.
00:02:27.990 --> 00:02:30.730
And we continue similarly.
00:02:30.730 --> 00:02:33.920
When we're ready to fill the
last slot, we have already
00:02:33.920 --> 00:02:39.230
used k minus one of the items,
which means that the number of
00:02:39.230 --> 00:02:42.420
choices that we're going to
have at that stage is n
00:02:42.420 --> 00:02:46.050
minus k plus 1.
00:02:46.050 --> 00:02:48.520
At this point, it's also
useful to simplify that
00:02:48.520 --> 00:02:49.670
expression a bit.
00:02:49.670 --> 00:02:56.180
We observe that this is the same
as n factorial divided by
00:02:56.180 --> 00:02:59.990
n the minus k factorial.
00:02:59.990 --> 00:03:01.510
Why is this the case?
00:03:01.510 --> 00:03:04.310
You can verify that this is
correct by moving the
00:03:04.310 --> 00:03:06.540
denominator to the other side.
00:03:06.540 --> 00:03:08.870
And when you do that, you
realize that you have the
00:03:08.870 --> 00:03:13.640
product of all terms from n
down to n minus k plus 1.
00:03:13.640 --> 00:03:17.780
And then you have the product of
n minus k going all the way
00:03:17.780 --> 00:03:19.210
down to one.
00:03:19.210 --> 00:03:22.320
And that's exactly the product,
which is the same as
00:03:22.320 --> 00:03:23.190
n factorial.
00:03:23.190 --> 00:03:28.390
It's a product of all integers
from n all the way down to 1.
00:03:28.390 --> 00:03:30.870
So this was the first method
of constructing the
00:03:30.870 --> 00:03:32.940
list that we wanted.
00:03:32.940 --> 00:03:34.215
How about a second method?
00:03:37.120 --> 00:03:43.620
What we can do is to first
choose k items out of the
00:03:43.620 --> 00:03:52.620
original set, and then take
those k terms and order them
00:03:52.620 --> 00:03:57.860
in a sequence to obtain
an ordered list.
00:03:57.860 --> 00:04:02.240
So we construct our ordered
list in two stages.
00:04:02.240 --> 00:04:04.770
In the first stage, how many
choices [do] we have?
00:04:04.770 --> 00:04:08.490
That's the number of subsets
with k elements out of the
00:04:08.490 --> 00:04:09.990
original set.
00:04:09.990 --> 00:04:11.950
This number, we don't
know what it is.
00:04:11.950 --> 00:04:13.650
That's what we're trying
to calculate.
00:04:13.650 --> 00:04:15.890
But we have a symbol for it.
00:04:15.890 --> 00:04:17.950
It's n-choose-k.
00:04:17.950 --> 00:04:20.459
How about the second stage?
00:04:20.459 --> 00:04:22.410
We have k elements,
and we want to
00:04:22.410 --> 00:04:24.830
arrange them in a sequence.
00:04:24.830 --> 00:04:26.780
That is, we want to form a
00:04:26.780 --> 00:04:29.020
permutation of those k elements.
00:04:29.020 --> 00:04:31.850
This is a problem that we have
already studied, and we know
00:04:31.850 --> 00:04:35.140
that the answer is
k factorial.
00:04:35.140 --> 00:04:39.070
According to the counting
principle, the number of ways
00:04:39.070 --> 00:04:42.790
that this two-stage construction
can be made is
00:04:42.790 --> 00:04:46.790
equal to the product of the
number of ways, number of
00:04:46.790 --> 00:04:50.390
options that we have in the
first stage times the number
00:04:50.390 --> 00:04:52.655
of options that we have
in the second stage.
00:04:55.659 --> 00:04:59.280
So this is one answer
for the number of
00:04:59.280 --> 00:05:01.280
possible ordered sequences.
00:05:01.280 --> 00:05:02.920
This is another answer.
00:05:02.920 --> 00:05:05.050
Of course, both of
them are correct.
00:05:05.050 --> 00:05:07.720
And therefore, they
have to be equal.
00:05:10.940 --> 00:05:15.600
And by using that equality, we
can now find a formula for
00:05:15.600 --> 00:05:19.380
this coefficient n-choose-k
simply by taking this k
00:05:19.380 --> 00:05:23.630
factorial factor and sending it
to the denominator of that
00:05:23.630 --> 00:05:24.700
expression.
00:05:24.700 --> 00:05:29.630
So by equating this expression
with that expression here, we
00:05:29.630 --> 00:05:35.920
find the final answer, which is
that the number of choices,
00:05:35.920 --> 00:05:40.830
n-choose-k, is equal to
this expression here.
00:05:40.830 --> 00:05:44.409
Now, this expression
is valid only for
00:05:44.409 --> 00:05:46.040
numbers that make sense.
00:05:46.040 --> 00:05:52.400
So n can be any integer, any
non-negative integer.
00:05:52.400 --> 00:05:56.240
And k, the only k's that make
sense, would be k's
00:05:56.240 --> 00:05:59.990
from 0, 1 up to n.
00:05:59.990 --> 00:06:03.410
You may be wondering about some
of the extreme cases of
00:06:03.410 --> 00:06:04.250
that formula.
00:06:04.250 --> 00:06:09.350
What does it mean for n to
be 0 or for k equal to 0?
00:06:09.350 --> 00:06:13.190
So let us consider now some of
these extreme cases and make a
00:06:13.190 --> 00:06:15.130
sanity check about
this formula.
00:06:18.990 --> 00:06:21.520
So this is the formula
that we have and
00:06:21.520 --> 00:06:22.770
that we want to check.
00:06:22.770 --> 00:06:26.900
The first case to consider
is the extreme case of
00:06:26.900 --> 00:06:29.310
n-choose-n.
00:06:29.310 --> 00:06:30.800
What does that correspond to?
00:06:30.800 --> 00:06:34.390
Out of a set with n elements,
we want to choose a subset
00:06:34.390 --> 00:06:36.030
that has n elements.
00:06:36.030 --> 00:06:37.700
There's not much of
a choice here.
00:06:37.700 --> 00:06:41.270
We just have to take all of the
elements of the original
00:06:41.270 --> 00:06:43.190
set and put them
in the subset.
00:06:43.190 --> 00:06:46.940
So the subset is the same
as the set itself.
00:06:46.940 --> 00:06:49.190
So we only have one
choice here.
00:06:49.190 --> 00:06:50.820
That should be the answer.
00:06:50.820 --> 00:06:52.920
Let's check it with
the formula.
00:06:52.920 --> 00:06:55.960
The formula gives
us n factorial
00:06:55.960 --> 00:06:58.909
divided by n factorial.
00:06:58.909 --> 00:07:05.250
And then, since k is equal to n,
here we get zero factorial.
00:07:05.250 --> 00:07:06.730
Is this correct?
00:07:06.730 --> 00:07:09.610
Well, it becomes correct
as long as we adopt the
00:07:09.610 --> 00:07:13.930
convention that zero factorial
is equal to 1.
00:07:13.930 --> 00:07:16.530
We're going to adopt this
convention and keep it
00:07:16.530 --> 00:07:18.040
throughout this course.
00:07:21.230 --> 00:07:24.920
Let's look at another extreme
case now, the
00:07:24.920 --> 00:07:27.150
coefficient n choose 0.
00:07:27.150 --> 00:07:29.250
This time let us start
from the formula.
00:07:29.250 --> 00:07:33.340
The formula tells us that this
should be n factorial divided
00:07:33.340 --> 00:07:39.370
by 0 factorial and divided by n
factorial, since the number
00:07:39.370 --> 00:07:41.740
k is equal to 0.
00:07:41.740 --> 00:07:45.520
Using the convention that we
have, this is equal to 1.
00:07:45.520 --> 00:07:48.630
So this is, again, equal to 1.
00:07:48.630 --> 00:07:50.420
Is it the correct answer?
00:07:50.420 --> 00:07:54.250
How many subsets of a given
set are there that have
00:07:54.250 --> 00:07:57.190
exactly zero elements?
00:07:57.190 --> 00:08:00.940
Well, there's only one subset
that has exactly 0 elements,
00:08:00.940 --> 00:08:04.420
and this is the empty set.
00:08:04.420 --> 00:08:07.300
So this explains this particular
answer and shows
00:08:07.300 --> 00:08:12.550
that it is meaningful and
that it makes sense.
00:08:12.550 --> 00:08:15.860
Now, let us use our
understanding of those
00:08:15.860 --> 00:08:22.440
coefficients to solve a somewhat
harder problem.
00:08:22.440 --> 00:08:24.320
Suppose that for some
reason, you want to
00:08:24.320 --> 00:08:26.340
calculate this sum.
00:08:26.340 --> 00:08:28.800
What is it going to be?
00:08:28.800 --> 00:08:33.390
One way of approaching this
problem is to use the formula
00:08:33.390 --> 00:08:37.710
for these coefficients,
do a lot of algebra.
00:08:37.710 --> 00:08:41.250
And if you're really patient
and careful, eventually you
00:08:41.250 --> 00:08:43.320
should be able to get
the right answer.
00:08:43.320 --> 00:08:45.110
But this is very painful.
00:08:45.110 --> 00:08:48.590
Let us think whether there's a
clever way, a shortcut, of
00:08:48.590 --> 00:08:50.050
obtaining this answer.
00:08:50.050 --> 00:08:55.170
Let us try to think what
this sum is all about.
00:08:55.170 --> 00:08:58.730
This sum includes this term,
which is the number of
00:08:58.730 --> 00:09:01.360
zero-element subsets.
00:09:01.360 --> 00:09:04.190
This number, which is the number
of subsets that have
00:09:04.190 --> 00:09:05.420
one element.
00:09:05.420 --> 00:09:09.930
And we keep going all the way to
the number of subsets that
00:09:09.930 --> 00:09:13.110
have exactly n elements.
00:09:13.110 --> 00:09:16.670
So we're counting zero-element
subsets, one-element subsets,
00:09:16.670 --> 00:09:18.820
all the way up to n.
00:09:18.820 --> 00:09:24.160
So what we're counting really
is the number of all subsets
00:09:24.160 --> 00:09:27.620
of our given set.
00:09:27.620 --> 00:09:31.450
But this is a number that
we know what it is.
00:09:31.450 --> 00:09:34.470
The number of subsets of
a given set with n
00:09:34.470 --> 00:09:38.290
elements is 2 to the n.
00:09:38.290 --> 00:09:42.680
So by thinking carefully and
interpreting the terms in this
00:09:42.680 --> 00:09:47.830
sum, we were able to solve
this problem very fast,
00:09:47.830 --> 00:09:50.810
something that would be
extremely tedious if we had
00:09:50.810 --> 00:09:52.060
tried to do it algebraically.
00:09:54.390 --> 00:09:57.680
For some practice with this
idea, why don't you pause at
00:09:57.680 --> 00:10:02.170
this point and try to solve a
problem of a similar nature?