Flash and JavaScript are required for this feature.

Download the video from iTunes U or the Internet Archive.

**Description:** Introduces and defines relationships between sets and covers how they are used to reason about counting.

**Speaker:** Marten van Dijk

Lecture 16: Counting Rules I

The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: So this week we are going to talk about counting. Tonight is a problem set eight due. For this week, we will post a new problem set tonight as well. Counting is very important. The rest of the semester after this week, we will actually explain probability theory. And it's all based on counting.

So I'm going to teach you this week a whole toolkit of all kinds of ways how to count. And as you can see, we're going to talk about a lot of different kinds of rules. A mapping rule we'll talk about, the pigeon hole principle, another rule, the product rule and the sum rule. And these are all ways to count one thing by counting another thing.

Actually, what we are going to talk about is how to count a difficult set the objects. And we will map those objects to something that we can count in a much easier way. OK. So let's start with a lot of definitions actually.

So we have to talk about sets, sequences, and permutations to start off with. So the definition of a set is as follows. A set is actually an unordered collection of distinct elements.

So as an example, we may have, say, a set that contains the elements a, b, and c. Well, if you reorder these elements, it doesn't really matter. It's still the same set.

We can also write it as the set c, a, b. On the other hand if you have a collection in which say the elements a appears twice, well, these two are not distinct. So this is not a set. So this is not a set.

But it is a collection. And you may call this a multiset. But we'll not go into that. So we will be talking about sets. And your interested usually in the cardinality of a set.

So what's that? The cardinality or size is defined as follows. Cardinality is just a number of elements that the set S really has. So it's a number of elements in S.

And how do we denote this/ This is denoted by two vertical bars around the letter that represents the set. So we denote this as follows.

Now, when we talk about sets, we may also be interested in ordered collection of elements. And that's what we call a sequence. So a sequence is defined as follows.

A sequence is an ordered collection of elements. And we also call these elements components or terms. And these elements do not necessarily have to be distinct-- so not necessarily distinct.

Now as an example, so how do we denote this? For sets, we use this type of notation like this symbol. For sequences, we just have a very simple type of bracket, just a round bracket. So an example could be, well, the elements a, b, and c.

The first entry or the first term of the sequence-- depending on whether you look at it from the left or the right, it may differ-- is here a, b, and then c. Another one could be a, b, and a. And as you can see, the element a occurs twice in the sequence.

So we're going to relate sets and sequences. And let's talk about the permutation. So a permutation of a set is defined as follows. A permutation of a set S is actually a sequence that contains every element in S exactly once.

So every element in S occurs exactly once. And as an example, we may look at the set that we have described over here, the set a,b, and c. And how many permutations are there?

So the first one I may just order the elements in a, b, and c. So I have, say, the sequence a and then b, and then c. There are many more. I can, for example, start with b, and then c, and then may cycle through to a. That's another permutation of these three elements.

And I can do this once more. I can, for example, start with c, and then a, and then b. And it turns out that we can do a few more. You can also start to c, and then b, and then a. We sort of reversed the order that we had over here into its opposite, so first c, then b, then a.

And we can look at this cycle there. There's shifts. And we start with b, a, and then we rotate through to c, which is what we did over here when you went from this permutation to this one.

So we have to b, a, and then c. And then we may start with a, and we have a, c, and b. It turns out that this is actually all the possible permutations of this set. So there's six of these.

And in general, how many permutations are there of a set of n elements? So let's have a look here what we did. So if I want to create a permutation of a set, I may start off by selecting for the first term a, and b, or c.

So I have actually three choices. So the first term has three choices. The second term over here, well, for example, suppose my first term was b. Then the second term, well, must be an element that is different from b, as yet part of S.

So I have only two choices for the second term, either c or a. So the second term has two choices. And once I have chosen the second one, well, if I've already chosen b and c, there's only one element left in the set over here, only a. So I have only one choice left.

So in this way, we may count the total number of permutations of this set as the number of choices that I have for the first term times the number of choices for the second term, 3 times 2 times the number of choices for the third term, which is only one. So the third term has only one choice.

Now in general, we can do this for any permutation. And if you want to count the number of permutations, of a set with n elements, well, it turns out that it's equal to n times n minus 1 just like here. We have three elements, 3 times 3 minus 1 and so on, n times n minus 1 times n minus 2, et cetera all the way up to one.

And this is n factorial. And you've seen this already when we talked about Stirling's formula and how to approximate this. Now, this type of reasoning we will generalize later on when we come to the Generalized Product rule. And But this is already a first example of how we go about this.

So permutations relates sets and sequences. So now we go on to define more special functions. So permutations is one kind of mapping.

So let's now define functions. And then we will talk about a few different flavors of those. We talk about surjective functions, injective functions, and bijective functions.

And the whole idea is is that if I can use a mapping from one set to another set that satisfies some of those properties, it can say something about how their cardinalities are related. And that's what we want. We want to count.

OK So the definition of a function is as follows. So a function f from x to y is actually a relation between the sets X and Y. And we say that-- oh-- with the property that every single element, every element, of X is actually related to exactly one element of Y.

And we will call x to be the domain of the function f. And Y we will call the range or image of the function f. So let's give an example. And to see a couple of examples of, first of all, a function and then relations to that are actually not functions.

So suppose you have a mapping from x which just contains the elements a, b, and c, just in line with this example over here. And we have a mapping f that maps to the set y. And the y is just the numbers 1, 2, and 3.

Well, i could map, for example, a to 1, b to 3, and c to 3. Now this is a function. Because every element of x is met to exactly one element of y. a is just mapped to 1. b is also mapped to an element, only 3. c is mapped to 3 as well.

And they usually write this as f evaluated in a is equal to 1. And f b is equal to 3. And f of c is equal to 3 as well.

Now what is not a function-- oh, I could, for example, add another edge over here if I wanted to. But this is not a function. Because b is now mapped to two elements, 2 and 3. And that's not what's covered by this definition. So this is not a function.

I can also remove, say, an edge. Well, in this case, b is not mapped to anything at all. And that's not a function either. So we really have the property for functions that there is six exactly one outgoing arrow, if you want to think about it as being a graph, from each element in x to exactly one element in y.

So now we can talk about a few definitions. So we will talk about these few properties, surjective, injective, and bijective. And then we can start to do a few interesting examples.

So a function f that goes from x to y is called surjective. if every single element of y is mapped to at least once. So what does that mean? To at least once.

So every element of y, so say 1 for example here, is mapped to at least once. Well, to the element 1 we have mapped a. So that's great. But for example element 2 is not mapped to at all.

So this particular example is not surjective. But we will come to a few examples that are. So here we have the distinction that every element of y, so every single element of y, is mapped to at least once.

The injective is defined in a similar fashion. But now, every element of y is not mapped to at least once, but at most once. So let's have look over here.

And that's also not true for this example actually. Because three is mapped to two times. So it's not mapped to at most once. So this example is also not injective. Because if the function is injective, every element of y of the range is mapped to at most once.

Bijective is if every element of y is mapped to exactly once. And we can see that the function is bijective if and only if it is both surjective and injective. So bijective if and only if we have both the properties surjectvie as well as injective.

So let's give a couple of examples. So as the first example, we may have the set x and y. We have 1, 2, and 3, and set it to elements a and b over here. 1 is mapped to a, 2 is mapped to a, and 3 is mapped to b.

And now we can see that every element in y is mapped to at least once. This one is mapped to two times. This one is mapped to once. So this one is actually surjective.

So that's great. Another example of something this is injective is if you have, say, 1, 2, and 3, and a, b, c, and d. 1 is mapped, say, to a. 2 to b, 3 to d. Well, in this case we have that it's injective.

Because every element of y is mapped to at most once, once, once, zero times, and once. So this one is injective. And this one is not injective, right?

Because this one is mapped to 2 times. This one is not surjective. Because this one is not covered at all. It's only mapped to once.

OK. So let us talk again about permutations. So let me talk about permutations. We can define a mapping using a permutation that is an example of a bijection. So let's do that.

And then we can come to the mapping rule. And we can start to do some counting. So for example if we have a permutation, a 1 up to a n, so let this be a permutation of the set S that contains all the elements a 1, up to a n.

So this is just one example of a permutation. And now we may define the following function. We say that pi evaluated in a i give us output i. Actually, what I mean here is that if you take an element in S, then this one is mapped to under this function to i if and only if a is in the i-th position in this permutation.

So if and only if today is in i-th term in the permutation. So in this case, we know that pi is bijective. And why is this? Well, we know from the definition of a permutation that any permutation is a sequence in which every element of S occurs exactly once.

So that means that every position is covered exactly once by an element of S. And that is exactly the definition over here which says that every element in the range is mapped to exactly once by an element in a domain. So this is an example of a bijection.

OK. So now that we have defined functions and the special properties, let's talk about the mapping rule, which we'll do over here. And now for the first time we start to talk about the cardinalities of sets and how they're related to one another. So the mapping rule is that first of all, if f is a function from x to y and if f is actually surjective, well, what do we know?

We actually know that the cardinality of the number of elements in the domain is at least the number of elements in the range. And why is that? If you look at a definition of surjectivity, we know that every element of y is covered by some element in x at least once. And all the elements in x and mapped to exactly one element in y.

So we know that the cardinality of x is at least y. Because every element in y is mapped by some unique distinct element in x. And if a function f is injective, well, in that case we know that the reverse relation holds, so inequality holds.

The cardinality of x is at most the cardinality of y. So why is that? Well, every element in an injective function, right? Every element is mapped to at most one element.

So every element in y is mapped by at most one element in x. So we know that all the elements in x are mapped to some element in y. But every element in y cannot map to by more than two times by something in the domain. So we know that this inequality holds.

OK. For a bijective function, we have that both these properties hold. And we will have an equality over here. So if this one is bijective, we have that the cardinalities are equal to one another.

And this is also called the bijection rule. So let's give an example where we want to find out how many ways there are to select 12 doughnuts from 5 varieties. So let's see how that would work.

And the whole idea is that we're going to define the set that we want to count, which is all these possible configurations of doughnuts over five varieties of flavors. And then we're going to map these to another structure that we can understand a little be better. So let's do this.

So as an example, let x be all the ways to select, say, 12 doughnuts from 5 varieties. So let's give an example. For example, we may have 2 doughnuts. And they are in the chocolate flavored basket. So we have chocolate.

And suppose we have no doughnuts in the lemon filled version of a doughnut. Suppose he have a whole bunch of doughnuts, say 6 of those, that are with sugar. We have some that are glazed, say 2. And finally, we have just a couple of plain doughnuts.

So this would be a configuration that is in x. Because we have 1, 2, 3, 4, 5, varieties. And we have 12 doughnuts, 2 over here, 6 here, 2, and another 2, 12 doughnuts, that are selected from these 5 varieties.

Now, if you are going to try to represent such a configuration, that's usually how we think about counting, then we may map this to a 01 sequence. So how do we do this? We can just map the doughnuts two 0s, the divider between the two baskets as a 1. So this is a 1.

Then we have no 0 between these two ones, because there are no doughnuts in the lemon filled basket. So we have one that is the mapping from this divider field over here. We've got 6 0s.

We have another one that is this divider, two 0s, two doughnuts in the glazed version, and so on. So what do we see here? We have a 01 sequence where we have 12 zeros and we have 1, 2, 3, 4 1s.

And we can see that this mapping is actually bijective. Because if I have two 0s, I can map them back to doughnuts. The 1 I can map back to the divider between two baskets. So let y be the set of all kinds of configurations of 12, all kinds of sequences.

Oops, maybe I will not take this one out. Let's do this one. So if you are going to define y as the set of all 16-bit sequences with exactly four 1s, then we know that by the bijection rule, we have created this bijection over here, that the cardinalities of x and y are exactly equal.

So now we know that by the bijection rule, we have been able to count the number of elements in x by counting something else, which is really how we proceed in these types of proofs. So we are now able to just count these types of objects. And later on next lecture, we'll actually figure out a formula for this one.

So this is an example of how we can use the bijection rule. Another example is one in which we are going to count subsets. So we'll give a lot of examples through these two lectures. And also the problem set, as you will see, will have a lot of small little parts with all kinds of countings that you will need to do applying different rules.

So let's talk about how to count subsets of a set x. So what we want is a bijection from subsets of a set x containing of, say, 1 up to n, so the integers 1 up to n, to n-bit sequences. We know that we can do this if you define a bijection as follows.

So we map a subset S under a mapping f to a bit sequence, b1, b1, all the way to bn. If I add the following relation, bi is computed as either a 1 or a 0, it's computed as a 1 if i is in S. And it's a 0 if i is not in S.

Now, we know that this is a bijection. So if we have a bit sequence, then we can construct from this mapping the corresponding subset. If you have a subset, we can use this mapping to construct the corresponding bit sequence.

So how many n-bit sequences are there? Well, there are 2 to the power n n-bit sequences. Why is that? Well, we have two choices for b1, a 0 or a 1, two choices for b2, 0 or a 1, and so on.

So we have 2 times 2 times 2 choices over here. So we have 2 to the power n choices for a bit sequence. So there are 2 to the power of n number of bit sequences. And this is actually equal.

Because of this bijection rule that we have described over here, this is equal to the total number of possible ways to select subsets of x. So this is the number of subsets of an n element set. So this is a very nice way to demonstrate how we can use a bijection rule to count something that appears to be much more harder to think about, to grasp. At least for me it's harder to grasp.

So I have a subset that can be any size in a set of n elements. And now I can find this really easy going mapping that I can show to be bijective and all of a sudden, I know how to count it. Because I can just look at the image and count those types of objects, in this case n-bit sequences.

I get a really easygoing number that I can compute fairly easily. And now I have counted something much more complex. So this is how we generally will think about these things.

OK. So let's talk now about the generalized pigeon hole principal. So we have covered quite a lot of definitions right now. So we explained the functions mapping rule.

So now we come to generalized pigeon hole principle and a few other rules. OK. So what about a generalized pigeon hole principal? This is actually the following counting argument.

If I know that the cardinality of a set x is more than k times the cardinality of a set y, what do I know? Well, I know that for all functions f that have domain x and range y, I know that there must exist k plus 1 different elements of x that are mapped to the same element in y. And if we take a specific case k=1, we will actually call this the pigeon hole principle.

And let me just demonstrate it by the famous example of pigeons. Well, if I have more pigeons that the number of holes than they can fly into, I know for sure there exists a hole that two pigeons will fit in. So that's where the name comes from.

So let me write it out. So an example is if I have more than n pigeons, so the pigeons from the set x, and say they fly into n holes, and the holes is my set y, well, then I know the cardinality of x is more than the cardinality of y. I have more pigeons than there are holes.

So I know that at least two pigeons will fly into the same hole. So for a generalized case, how can we prove something like that? Well, we could use, for example, something like a contradiction.

For example, suppose that for all k plus 1-- as opposed to the negation is true. So how do we prove this usually? So assume we have this.

We want to prove this. Well, suppose that's not true. So suppose there's a mapping f. So a set for all k plus 1 different elements of x, well, they are not mapped to the same elements in y. But what I really know then is that every element in y is mapped to at most by k distinct elements of x.

So that means that the total number of elements of x must be at most k times y. And that's not true. It's larger by assumption. So it's a contradiction.

So this is a very general principle though. And it's worth writing it all out. Because this is a famous rule that we will use in counting. And it leads to interestingly results.

OK. So let's give another example. Let's think about Boston. In Boston, we have say a half a million non-bald people. It turns out that there are at least 3 people that have the exact same number or hairs on the head.

So that's kind of weird. How do we know that? I cannot point out any three in Boston that have the same number of hairs. I have no idea.

But somehow I can count and use this principle and tell you that it must be true that in Boston with 500,000 people, there are 3 of them that are not bald. So we exclude the bald people. Because that would be easy. They all have 0 hairs. But say non-bald people that actually have the same number of hairs.

So how do we do that? How can we make such types of conclusions? So say Boston has about 500,000 and non-bald people. And let's call this set x. Because we're going to use the pigeon hole principle.

So our claim is that there exists 3 people in Boston such that they have the same number of hairs on their head. So how do we do this? Well, we know that we may generally assume that any person has at most 200,000 hairs on their head.

So the number of hairs on a head is at most 200,000. So how should I define my set y in order to apply this pigeon hole principle? So what do we do?

So I want to have mapping, right, from all the people the set x to the number of hairs. So the number of hairs on one's head is going to be the set y. And what do we know?

We know that the cardinality of y is at most 200,000. Actually, the way we defined it it's exactly 200,000. And the set x has a cardinality of about 500,000. So what do we know?

We can apply our generalized pigeon hole principle. It's very surprising, because we notice that x is more than two times the cardinality of y. 2 times 200,000 is less than 500,000. So I know that by this particular principle, this particular mapping must have the property, because this holds for all mappings, that there are at least k plus 1, 2 plus 1, 3 different people in Boston out of the set x that are mapped to the same element in y.

That means that they have the same number of hairs on their head. So this is kind of really surprising. We can make a statement without really inspecting every single person's head.

But we can still make a statement about the fact that there are 3 different people in Boston that have the exact same number of hairs. So this is an example of a non-constructive proof. And I will give another one.

And it's a very important principle. There's actually a new technique that you haven't seen before. So far we have been constructively proofing all kinds of properties using induction mainly.

And this is what is called a non-constructive proof. Because I cannot give a specific example that demonstrates that this claim is true. But yet, I've shown that it is true, but in a non-constructive way without an example.

OK. So what about another one? For example, we may pick 10 arbitrary two-digit numbers. So pick 10 arbitrary double-digit numbers.

And we can pick any sequence of numbers. I'm just picking a few. You may add a few, too. i don't know, 2, 7, 14, I don't know, 31, 25, 60, 92, and so on. So I have 1, 2, 3, 4, 5, 6, 7, 8, I don't know 9, and another one, say, 91 or something like that.

And so I have 10 double-digit numbers. It turns out that I can show to you that there are two subsets that if I look at the sum of their elements, so I look at sum of the elements of the first subset and I look at the sum of the elements of the second subset, that I can find two subsets that have an equal sum. Now if you just look at those numbers, and I've now picked 10 arbitrary double-digit numbers, well, usually it's pretty hard to figure out whether that's really true or not.

Maybe I have been selecting the numbers in such a way that it's easy to see. I mean, we can still try to wrap our minds around it and try to really solve this constructively by giving an example. It turns out that we can prove this statement. And we will use the pigeon hole principle.

And we do not even have to actually-- oh, you have a question?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Oh, sure. We can make it double-digits. So I could put this here. It'll be 4, 2 of I want to. But yeah, just select something else. It doesn't really matter.

Yeah. So what we are going to show now is that through the pigeon hole principle, we can prove that there are two subsets that have the same sum. And just by inspection it's a very hard problem to solve. So I did not even give you an example.

But we can still show this. So how can we go ahead with this? So let's think together about this problem.

So I want to choose two sets x and y. And somehow, I want to have a mapping, right, from any double-digit set of numbers. Somehow I want to map that to sums. Because that's what I'm interested in.

I'm interested in sums. And I want to show something about subsets of these double-digit numbers. So what do I do?

I take x as the collection of subsets of these numbers. And I want to that there are at least two subsets that map to the same sum. So let's first count how many we have here.

We already did this. We made a mapping from subsets to two binary sequences, bit sequences. In this case, we have 10 numbers.

So we have 2 to the power 10 possible subsets. So this is equal to 1,024. Now y is going to be the sum of a subset. So what do I know?

I know that the possible sums range from 0 all the way to, well, what's the maximum? sum that I can have out of a subset of 10 double-digit numbers? So I can select all the 10 elements in this set. And they are double digit.

So at most, they are 99. So I know that this set is really the set of all possible sums. Now we know that 1,024 is more than 990. So the cardinality of x is more than the cardinality of y.

So by the pigeon hole principle, we know that there exists at least two different elements of x. In our case, there exists two different subsets that map to the same elements in y, the same sum. So now we have shown that even though we have not shown any particular example that demonstrates this claim that there are two different subsets that have the same sum, we still got a proof using counting that this is true. So this is called a non-constructive proof.

Let me write it down. And this is a great way of proofing properties. So now we can continue with another definition where we look at another property.

Over here, we talks about surjectivity, injectivity, and bijectivity. And now we will talk about the following property. We say that a k to 1 function is f from x to y actually maps actually k elements of x to every element of y.

So what do we know? Well, we can have the following counting rule that we call the division rule. And it says that if f such a type of function, so if f is k to 1, well then we know that the cardinality of the domain is equal to k times the cardinality y.

And why is that? Well, exactly k elements of x map to each element of y. So the first element of y, we have k elements of x mapped to it. The second element of y, k elements mapped to that one.

So we know that the domain is exactly k times the range, k times the size of the range. Now this division rule actually generalizes the bijection rule, which I've put over there, the bijection rule. And why is that?

Well, that's because a function is a bijection if and only if it is actually 1 to 1. So if you replace k by 1, we have that exactly one element of x is mapped to every element in y. And that's the definition of a bijection.

And the bijection rule says that if you have a bijection, then the cardinality of the domain is equal to the cardinality of the range, so for k equals 1 here. So let's give an example on how this works. I think we can take this out actually.

So let's give an example using a chessboard, where we have 2 identical rooks and we want to count the number of ways we can put them on the chessboard in such a way that the 2 rooks cannot see one another, meaning that the rooks are on different rows and on different columns. So let's give an example. So the example is like this.

So how many ways do we have to place 2 identical rooks on a chessboard in such a way that no row or column is shared? So how can we do this? Well, for example, let's look at a chessboard.

And suppose we have a rook over here and a rook over here. And say the first rook is on row 1 and on column 1. And the second rook is on row 2, R2, on row R2, and on the column that is indexed by C2.

So how can we describe such configurations? Well, I could describe this by using a sequence in which I look at the placement of the first rook, and then describe the placement of the second rook. So I may have r1, c1, and then r2 and c2.

So this could be a way to describe the positioning of these rooks. And I could create a mapping f that is doing this for me. And so if I call this an example of a valid. So let y be the set of valid rook configuration.

And this is one example of it. So this is part of this set. And if I define x as all the sequences r1, c1, r2, and c2 such that, well, the rook over here does not share a row with the rook that is described by this position. So r1 is not equal to r2. And they also do not share a column.

So the first rook has column c1. The second one is on column c2. So also c1 and c2 should be different. So these sequences are really placements, right?

So this describes rook 1. This describes the rook 2. And the whole combination is really a placement.

So now I have described the function f that maps a sequence that describes the position of the first and the second rook, maps such a sequence to an element in y, which is a valid rook configuration. So now let's have a look at how we can apply the division rule. So is this function bijective? Is that true?

So is it true that every-- so I have a mapping that goes from here to here. But is it true that every valid rook configuration is mapped to exactly once? Is that true?

Is this the only sequence that will map using this function f into a valid configuration? Yup. It's true. So you can switch rook 1 and rook 2. And it will still look the same.

The 2 rooks are identical. They look exactly the same. So we have, again, the exact same configuration. And we can see that this particular sequence also maps to the same. It just swap the positions. So we have r2, c2, r1, and c1 also maps under f to the same configuration.

And those are the exact 2 possibilities that can happen that map to this configuration. Every valid configuration is mapped to exactly 2 times. So now we can use the division rule. Because f is 2 to 1.

So f is 2 to 1. What does that mean if you apply the division rule? It means that the cardinality of all those sequences is equal to 2 times the cardinality of valid configurations. Or in other words, the cardinality for all the valid configurations is the cardinality of all those possible sequences divided by 2.

So now we can start counting x over here. So how do we do that? Well, I'm going to use something similar as what we did when we were counting permutations. And we'll generalize it in a moment.

So how do we go about this? Well, let's have a look. If I have r1, c1, and r2, and c2, so this is a sequence. So how many choices do I have?

Well, a chessboard has 8 rows. So I can choose 8 possible rows for the first rook. It also has 8 columns. So I have 8 possible choices for the column.

But what about the second rook? Well, the second rook can be on any row except the one that I've already selected for the first. So this 8 minus 1, we have 7 possible choices to select the row for the second rook.

It must be different from the one that was already selected. And I have 7 possible choices. And similarly for this particular column as well, the column has to be different. So how many choices do I have?

Well, it's not 8. It's one less, because I've already selected the one for the first rook. So I have 7 choices. So the cardinality of x is actually equal to 8 times 8 times 7 times 7.

So it's 8 times 7 squared divided by 2. So now we have to counted, by using the division rule, we have to the divide this by 2. We have counted the total number of valid configurations.

So now we are going to generalize this principle that we have talked about here. And we will do that over here I think. Yup. And that's the generalized product rule.

So the generalized product rule is as follows. It's essentially saying that if we have a set of sequences of length k, then how can we count those? Well, if we know the following properties-- well, let me first write out the generalized product rule is as follows.

Let S be a set of length k sequences. Then I know that if there are n1 possible first entries and if I know that once I've selected my first entry, there are n2 possible second entries for each first entry. And if I continue like this, my choice for the third term in the sequence is I've always n3 possible choices given my selection for the first 2 entries.

So if I have that property that continues in that way, so let me write it out. So we have n3 possible third entries for each combination in this case of first together with second entries. And if I continue this all the way to nk, so nk possible kth entries for each combination of all the previous entries, then I know that the set S can be counted as, well, I've n1 possible choices for the first entry.

Once I've chosen to fix that one, I have n2 possible choices for the second, then n3 possible choice for the third. And I go all the way to nk. Well, let's first talk about it from the perspective of the chess problem here.

I got 8 possible choices for r1. Given r1, I don't care. I still have 8 possible choices for the column here. So I have 8 choices here. But for the third one, once I have selected r1 and c1, I only have 7 choices left for r2 and 7 choices left for c2.

So that's an example where we use this particular generalized product rule. Also when we were counting the number of permutations, we were saying we can fix the first entry of a permutation in n ways if I have a permutation for n elements. And then I have the second entry, the second term, of a permutation has only n minus 1 choices. Because I've already chosen one.

And the next one has n minus 3 choices. Because I've already selected 2 of them. So I have only n minus 2 choices left. Then I have n minus 3 choices. Because I've already selected 3 and so on.

And I get n factorial. So that's the same kind of principle that we have here. So let me give an example where we can see how this works.

So what do we do? In this example, we want to count the number of committees. So it's the exact same kind of principle that we are going to talk about.

So we are going to count the number of communities described by sequence x, y, z, where x the first one is, say, the leader of the committee. The second one indicates the secretary. The third one is some consultant. So they're all different. They have all different roles.

And such a committee is elected from n members. And in how many ways can I do this? Well, I have n ways to choose my first term in the sequence. I have n ways to choose the leader.

So there's n ways to choose x. How many ways do I have to choose y? Well, if I've chosen already a leader, I need to choose someone else. So I have n minus 1 members left, n minus 1 ways to choose y.

I'm just not allowed to choose x. And then I will have n minus 2 ways to choose a z except x and y. And so for each x, I have only n minus 1 ways to choose y. For each x and y, I have only n minus 2 ways to choose z.

So if I multiply all this together, I get n times n minus 1 times n minus 2 to choose all these committee-- this is the total number of possible committees that I can select from an n member set of people. So let's go to a little bit of a different example that uses the same principle. OK. Let's make some space.

In the second problem, I will define to you a defective dollar bill. It's not really defective. But it's a property that we will assign to dollar bill.

And you can check for yourself whether you have one in your wallet. So let's define a defective dollar bill to have the property that if you look at the 8-bit serial number, some of the digits appear more than once. So some digit appears more than once in the 8-bit serial number.

So you can check our own wallet and check for your $1 bills and check whether you have a defective dollar. This seems to be a pretty specific and rare property, right? Well, check you dollar bills.

You'll figure out that you have probably a defective dollar bill in your wallet. So that's kind of weird. But it seems this property. If you look at that, it seems to be something that is maybe a little bit more common than we thought it is.

It seems to be so special. So let's do a counting argument and find out what's happening here. So let's look at a fraction of the non-defective. So we are counting the opposite, the non-defective bills.

Well, that's the number of non-defective serial numbers divided by the total number of serial numbers. And let's call these small x and y and count these.

So let's see. So first of all, let's count y. Well, that's easy, I have 8 digits in my serial number. So I have 10 choices for the first digit, 10 choices for the second one, and so on. In total, I have 10 times 10 times 10 to the power 8 choices.

What about x? Well, I'm using, again, our generalized product rule over here. Well, if I'm going to have a non-defective dollar bill, then all the digits in the 8 digit serial number have to be different.

So for the first digit, I have 10 choices. Now that I've selected my first digit, I have 9 digits for my second choice for my second digit in the serial number. Then I have 8 possible choices, 7, because I've already selected 3. And I cannot choose those anymore-- times 6 times 5 times 4 times 3.

And now I have chosen an 8 digit serial number in which all the digits are different. OK So how many are these? Well, this is actually equal to 10 factorial divided by 2 factorial. And it turns out to be something like 1,814,400. possible choices.

So now let's look at the fraction. It turns out if you divide this by this, you get a really very small fraction. This is actually equal to 1.8144% So a very small fraction is non-defective.

So almost all the dollars are sort of defective. It simply means that they have this special property that some digit occurs more than once. So it's kind of interesting.

So we can already see that by counting, it's sometimes a little bit counterintuitive. Because if I would see this particular property, I would in first instance think that it's a very special property. But that's not true. It's very common it turns out.

Now a special case of the generalized product rule is the product rule. And this is defined as follows. We are going to first of all define a product over sets. The definition is that the product of a set A1 with A2 up to An is actually equal to the set of sequences.

So the first entry is selected from the first set, the second entry is selected from the second set, and so on. And now the product rule tells us by just applying that reasoning over here that the cardinality of the product of all those sets is actually equal to the cardinality of the first set multiplied by the cardinality of the second set and so on up to the cardinality of the last set. Because we have this number of choices for the very first element over here, this number of choices for the second one, and so on.

And we apply that rule, and we see that this is the result. Now when we use this, in specific to count all the n-bit sequences, we have exactly 2 choices for the first bit. We have to set 0,1 in our example times the set 0, 1 over here and so on.

And that's how we derived that we have 2 times 2, 2 the power n choices for an n-bit sequence. So now we come to the sum rule. And we will give an example for that.

So the sum rule states that if you look at sets, then we may be able to count their union. And we will consider a very specific case. In the next lecture, we will talk about the general case.

So the sum rule is the following counting mechanism. If the sets A1 up to An are all disjoint, so they are disjoint sets, then we know that if I try to count the union of all those set, it's going to be the sum of the separate cardinalities. So let me just write it out actually.

So it's the cardinality of A1 plus A2 all the way to An. Why is this? Well, all the sets are disjoint. So there are no intersections between sets that contain elements. All the intersections are empty.

So counting the union is really counting each separate set. And that's why we have the sum. And in the next lecture, we will talk about inclusion, exclusion rule. And then we will take into account that we have intersections that are not empty.

But let's give now an example where we count the number of passwords with certain properties. And we will apply all these different rules together. And that's the type of problems that you would like to be able to solve.

So in our last example here, we have that passwords have the following property. They are 6 to 8 symbols. So that's property 1. We have that the very first symbol must be special in the sense that it is a letter.

And this can an upper or lowercase. And say that the other symbols are actually letters or digits. So let's count the total number of possible passwords.

We're going to use the sum rule. So let's define what kinds of sets we are taking the symbols from. So the first set is for the first symbol, which we call f or first. We have all the letters a, b, c in lowercase, and then all of them in uppercase. And in total, we have 52 elements in this set.

For the second symbol, or the other symbols, we have all these letters, but also all the digits, 0, 1, up to 9. And this set actually has cardinality 62. We added 10 digits.

So let's talk about-- actually, we like to use this in the sum rule as well. So how do we count? What kind of possibilities do we really have? So let's describe the set of passwords explicitly in a formula. So let p be the set of possible passwords.

And this one is actually equal to-- well, I need to choose a first symbol. And then I need to choose a second symbol, and a third, and a fourth, and a fifth, and a sixth. That's one possibility. I use 6 symbols.

I can also use 7 or 8 symbols. But this is one of them. We also denote this as S to the power 5. That's an equivalent notation.

The other possibilities for passwords are that we first choose an entry from f, a letter. And then we will need to choose another 6 symbols. In total, we have 7.

And we have another possibility where we choose a first symbol, and then we choose 7 other symbols. So has 8 symbols. This has in total 7. This one has in total 6 symbols.

So this covers all the possible passwords. So let's count them. We know that these sets are all the different. They are distinct.

These are sequences that have 6 entries, 7, and 8 entries. So if you look at the cardinality of P, it's actually equal to the sum by the sum rule that we just described over there, is equal to the very first one, f times S to the power 5 plus the cardinality of the product of f with S to the power 6. And we have f times S to the power 7.

So this is simply by application of the sum rule. And now we can apply the product rule very simply, which is this one. So that's equal to the cardinality of f times the cardinality of S to the power 5. And then we have the same rule applied to this one.

It's the commonality of f times the cardinality of S to the power 6. And then we have the cardinality of f times the cardinality of S to the power 7. Now, we simply plug-in these numbers.

And then you will have the total number of passwords that you can select from. It turns out to be about 1.8 times 10 to the power 14. So here we have applied both to sum rule and the product rule.

So in general, you will see that you have to apply multiple rules together in order to find an answer to your counting problem. And you'll see that on a problem set. And we will give a few more examples in next lecture.

And we will start talking about the generalization of the sum room called inclusion exclusion. And we will give you another type of proof technique called combinatorial proofs. All right. Good luck with the problems.

## Free Downloads

### Video

- iTunes U (MP4 - 175MB)
- Internet Archive (MP4 - 175MB)

### Subtitle

- English - US (SRT)

## Welcome!

This is one of over 2,200 courses on OCW. Find materials for this course in the pages linked along the left.

**MIT OpenCourseWare** is a free & open publication of material from thousands of MIT courses, covering the entire MIT curriculum.

**No enrollment or registration.** Freely browse and use OCW materials at your own pace. There's no signup, and no start or end dates.

**Knowledge is your reward.** Use OCW to guide your own life-long learning, or to teach others. We don't offer credit or certification for using OCW.

**Made for sharing**. Download files for later. Send to friends and colleagues. Modify, remix, and reuse (just remember to cite OCW as the source.)

Learn more at Get Started with MIT OpenCourseWare