Description: In an unsuccessful attempt to prove Fermat’s last theorem, Schur showed that every finite coloring of the integers contains a monochromatic solution to x + y = z, an early result in Ramsey theory. Professor Zhao begins the course with a proof of Schur’s theorem via graph theory and how it led to the modern development of additive combinatorics. He then takes the class on a tour of modern highlights of the field: Roth’s theorem, Szemerédi’s theorem, and the Green–Tao theorem.
Instructor: Yufei Zhao
YUFEI ZHAO: OK, let's get started. Welcome to 18.217. So this is combinatorial theory, graph theory, and additive combinatorics. So the course website is up there. So all the course information is on there. So after around the middle of the class, I'll say a bit more about various course information, administrative things. But I want to jump directly into the mathematical content.
So this course roughly has two parts. The first part will look at graph theory, in particular problems in extremal graph theory. In the second part, we'll transition to additive combinatorics. But these are not two separate subjects. So I want to show you this topic in a way that connects these two areas and show you that they are quite related to each other. And many of the common themes that will come up in one part of the course will also show up in the other.
So the story between graph theory and additive combinatorics began about 100 years ago with Schur, the famous mathematician, Issai Schur. Well, he was, like many mathematicians of his era, trying to prove Fermat's Last Theorem.
So here's Schur's approach. He said, well, let's look at this equation that comes up in Fermat's Last Theorem, x to the n plus y to the n equals z to the n. And, well, one of the methods of elementary number theory to rule out solutions to an equation is to consider what happens when you mod p. If you can rule out, for infinitely many values of p, possible non-trivial solutions to this equation mod p, then you will rule out possibilities of solutions to Fermat's Last Theorem.
OK, so this was Schur's approach. As you can guess, unfortunately, this approach did not work. And Schur proved that this method definitely doesn't work. So that's the starting point of our discussion. So it turns out that for every value of n, there exist non-trivial solutions mod p for all p sufficiently large. So thereby ruling out the strategy. So let's see how Schur proved his theorem. So that will be the first half of today's lecture.
So this seems like a number theory question. So what does it have to do with graph theory? So I wanted to show you this connection.
Now, Schur deduced his theorem from another result that is known as Schur's Theorem, which says that if the positive integers are colored using finitely many colors, then there exists a monochromatic solution to the equation x plus y equals to z. So if you give me 10 colors and color the positive integers using those 10 colors, then I can find for you a solution to this equation where x, y, and z are all of the same color.
Now, this statement-- OK, so it's a perfectly understandable statement. But let me rephrase it in a somewhat different way. And this gets to a point that I want to discuss where many statements in additive combinatorics or just combinatorics in general have different formulations, one that comes in an infinitary form, which is more qualitative so to speak and another form that is known as finitary. And that's more quantitative in nature.
So Schur's Theorem is stated in an infinitary form. So it tells you if you color using finitely many colors, then there exists a monochromatic solution. So many, but not all, statements of that form have an equivalent finitary form that is sometimes more useful. And also, once you state the right finitary form, you can ask additional questions.
So here's what Schur's Theorem looks like in the equivalent finitary form. You give me an r. For every r, there exists some N as a function of r, such that if the numbers 1 through N-- so throughout this course, I'm going to use this bracket N to denote the integers from 1 up to N-- so if these numbers are colored using r colors, then necessarily, there exists a monochromatic solution to the equation x plus y equals to z, where x, y, and z are in the set that is being colored.
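For very small r, the finitary statement can be checked by brute force. The following is a minimal sketch, not something from the lecture: the function names are hypothetical, x = y is allowed in a solution, and the search simply tries every coloring of 1 through N.

```python
from itertools import product

def has_mono_schur_triple(coloring):
    """coloring[k - 1] is the color of k, for k = 1..N. Solutions with x = y count."""
    n = len(coloring)
    for x in range(1, n + 1):
        # y runs from x up to n - x, so that z = x + y stays inside 1..n
        for y in range(x, n - x + 1):
            if coloring[x - 1] == coloring[y - 1] == coloring[x + y - 1]:
                return True
    return False

def smallest_forcing_n(r, limit=20):
    """Smallest N <= limit such that every r-coloring of 1..N contains a
    monochromatic solution to x + y = z; None if no such N is found."""
    for n in range(1, limit + 1):
        if all(has_mono_schur_triple(c) for c in product(range(r), repeat=n)):
            return n
    return None

print(smallest_forcing_n(1))  # 2, because 1 + 1 = 2
print(smallest_forcing_n(2))  # 5, because {1, 4} and {2, 3} is a valid coloring of 1..4
```

The search space has r to the N colorings, so this is only practical for one or two colors.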
So it looks very similar to the first version I stated. But now, there are some more quantifiers. So for every r, there exists an N.
So why are these two versions equivalent to each other? So it's not too hard to deduce their equivalence. So let me do that now.
The fact that the finitary version implies the infinitary version should be fairly obvious. So once you know the finitary version, if you give me a coloring of the positive integers, well I just have to look far enough up to this N and I get the conclusion I want.
But now, in the other direction-- so in the other direction-- suppose I fix this r. So, OK, so I assume the infinitary version. I want to deduce the finitary version. So I start with this r. And let's suppose the conclusion were false. So suppose the conclusion were false, namely for every N there exists some coloring-- so for every N there exists some coloring of 1 through N-- which we will call phi sub N, that avoids monochromatic solutions to x plus y equals to z. So I'm going to use this Chi for shorthand for monochromatic. So suppose there exists such a coloring.
And now, I want to take this collection of colorings and produce for you a coloring of the positive integers. And you can do this basically by a standard diagonalization trick. Namely, we take an infinite sub-sequence of this phi sub N, such that phi sub N of k stabilizes along the sub-sequence for every k.
OK, so you can do this simply by a diagonalization trick. And then, we see that along the sub-sequence phi N converges point-wise to some coloring of the entire set of positive integers. And this coloring avoids monochromatic solutions to x plus y equals to z, because if there were a monochromatic solution in this coloring of the positive integers, then I can look back to where that came from. And that would have been a monochromatic solution in one of my phi N's.
So this is an argument that shows the equivalence between the finitary form and infinitary form. But now, when we look at the finitary form, you can ask additional questions, such as, how big does this N have to be as a function of r. It turns out those kind of questions in general are very difficult. And we know some things. For this type of questions, we know some bounds usually. But the truth is usually unknown. And there are major open problems in combinatorics of this type. So there's still a lot that we do not understand.
OK, so now we have Schur's Theorem in this form. Let me show you how to deduce his conclusion about ruling out this approach to proving Fermat's Last theorem.
The claim is the following: if you have a positive integer n, then for all sufficiently large primes p, there exist x, y, and z, all belonging to the integers from 1 up to p minus 1, such that x to the n plus y to the n is congruent to z to the n mod p. So it's a solution to Fermat's equation mod p. All right, so how can we deduce this from what we said about coloring? So what is the coloring? OK, so here's what Schur did, so proof assuming for now Schur's theorem.
So let's look at the multiplicative group of non-zero residues, mod p. So we know it's a cyclic group because there's a generator. So there's a primitive root generator. Let H denote the subgroup of n-th powers. Well, H is a pretty big subgroup. So what's the index of H in this multiplicative group? It's at most n.
So think about representing this as a cyclic group using a generator. So H then would be all the elements whose exponent is divisible by n. So the index is at most n. It could be smaller. But it's at most n.
And so in particular, I can use the H cosets to partition the multiplicative group of non-zero residues. And this is a coloring. A partition is the same thing as a coloring. There is a bounded number of colors. But I let p get large.
So by Schur's theorem, if p is sufficiently large, then one of my cosets should contain a solution to x plus y equals to z. What does that look like? So one H coset contains x, y, z, such that x plus y equals to z as integers. They belong to the same coset. So x, y, and z belong to some coset of H, which means then that x equals a times some n-th power, y equals a times some n-th power, and z equals a times some n-th power. You have this equation. Put them together.
So that is true. So now mod p, I can cancel the a's. And this produces a non-trivial solution to Fermat's equation, mod p.
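The claim is easy to test numerically for small n and p. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the search works directly with n-th power residues mod p rather than going through the coloring argument from the lecture. It also computes the index of H, which in a cyclic group of order p minus 1 equals gcd(n, p - 1) and is therefore at most n.

```python
def nth_power_subgroup(n, p):
    """The subgroup H of nonzero n-th powers mod a prime p."""
    return {pow(x, n, p) for x in range(1, p)}

def subgroup_index(n, p):
    """Index of H in the multiplicative group of nonzero residues mod p."""
    return (p - 1) // len(nth_power_subgroup(n, p))

def fermat_solution_mod_p(n, p):
    """Return (x, y, z) with 1 <= x, y, z <= p - 1 and
    x^n + y^n = z^n (mod p), or None if no such triple exists."""
    # Map each n-th power residue to one base that produces it.
    base_of = {}
    for x in range(1, p):
        base_of.setdefault(pow(x, n, p), x)
    for a, x in base_of.items():
        for b, y in base_of.items():
            # 0 is never a key, so sums that vanish mod p are skipped.
            z = base_of.get((a + b) % p)
            if z is not None:
                return x, y, z
    return None
```

For example, `fermat_solution_mod_p(3, 7)` returns `None`, because the nonzero cubes mod 7 are only 1 and 6, while for p = 11 every nonzero residue is a cube and a solution exists. This is consistent with the claim, which only promises solutions once p is sufficiently large.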
OK, so this was the proof of this claim that this method does not work for solving Fermat's Last Theorem. But, you know, we assumed this claim of Schur's theorem that every finite coloring of the positive integers contains a monochromatic solution to x plus y equals to z. So we still need to prove that claim. So we still need to prove this combinatorial claim. And so that's what we're going to do now.
This is where graph theory comes in. So let me state a very similar-looking theorem about graphs. And this is known as Ramsey's theorem, although Ramsey's theorem actually historically came after Schur's theorem. Here, we're going to use Ramsey's theorem specifically in the case of triangles.
So what does it say? That if you give me an r, the number of colors, then there exists some large N such that if the edges of the complete graph, K sub N, on N vertices are colored using r colors, then there exists a monochromatic triangle somewhere. Any questions so far about any of these statements?
So let's see how Ramsey's theorem for triangles is proved. By the way, I want to give you a historical note about Frank Ramsey. So he's someone who made significant contributions to many different areas, not just in mathematics. So he contributed to seminal works in mathematical logic where this theorem came from, but also to philosophy and to economics before his untimely death at the age of 26 from liver-related problems. So he's someone whose very short life contributed tremendously to academics.
So let's see how Ramsey's theorem, in this case, is proved. We'll do induction on r, the number of colors. So for every r, I need to show you some N, such that the statement is true.
In the first case, when r equals to 1, there's not much to do. Just one color, if I just have three vertices, that already is OK. Three vertices, that's already a monochromatic triangle.
So from now on, let r be at least 2. And suppose the claim holds for r minus 1 colors, with N prime being the corresponding number of vertices with r minus 1 colors. So now let me pick an arbitrary vertex. So pick an arbitrary vertex v and look what happens. So here's v.
And let me look at the outgoing edges. So we'll show that N being r times (N prime minus 1) plus 2 works. So now, we have a lot of outgoing edges. In particular, we have r times (N prime minus 1) plus 1 outgoing edges.
So by the pigeonhole principle, some color-- so there exist at least N prime outgoing edges with the same color, let's say, yellow. So suppose yellow is the outgoing color. And let me call the set of vertices on the other end of these edges v0.
So now let's think about what happens in v0. So in v0, either v0 contains a yellow edge, in which case you get a yellow triangle. Or we lose the color inside v0. So the number of colors goes down. Else v0 has at most r minus 1 colors. And v0 has at least N prime number of vertices. So by induction, v0 has a monochromatic triangle in the remaining colors.
So that completes the proof of Ramsey's theorem, in this case, for triangles. And if you wish to find out what is the bound that comes out of this argument, well, you can chase through the proof and get some bound.
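Chasing through the induction gives the recursion N(1) = 3 and N(r) = r times (N(r - 1) minus 1) plus 2. The following is a minimal sketch, with hypothetical function names, that evaluates this recursion and brute-force checks the two-color case by trying every edge coloring of a small complete graph.

```python
from itertools import combinations, product

def ramsey_bound(r):
    """N from the induction: N(1) = 3 and N(r) = r * (N(r - 1) - 1) + 2."""
    n = 3
    for colors in range(2, r + 1):
        n = colors * (n - 1) + 2
    return n

def every_coloring_has_mono_triangle(n, r):
    """Brute force over all r-colorings of the edges of K_n."""
    edges = list(combinations(range(n), 2))
    index = {edge: i for i, edge in enumerate(edges)}
    # Each triangle is stored as the positions of its three edges.
    triangles = [(index[(a, b)], index[(a, c)], index[(b, c)])
                 for a, b, c in combinations(range(n), 3)]
    for coloring in product(range(r), repeat=len(edges)):
        if not any(coloring[i] == coloring[j] == coloring[k]
                   for i, j, k in triangles):
            return False
    return True

print([ramsey_bound(r) for r in (1, 2, 3, 4)])  # [3, 6, 17, 66]
print(every_coloring_has_mono_triangle(6, 2))   # True
print(every_coloring_has_mono_triangle(5, 2))   # False
```

So for two colors the bound from the proof, 6, cannot be lowered to 5. The brute-force check is only feasible for tiny cases, since K_6 already has 2 to the 15 edge colorings.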
The remaining question now is, what does this all have to do with Schur's theorem? So so far, we've talked about some number theory. We've talked about some graph theory and how to link these two things together. And I think this is a great example. It's a fairly simple example, which I'm about to show you of how to link these two ideas together. And this connection, we'll see many times in the rest of this course.
I don't want to erase Schur's theorem. So let me--
So let's prove Schur's theorem. So let's start with a coloring. So let's start with the coloring of 1 through N. And I want to form a graph with colors on the edges that are somehow derived from this coloring on these integers. And here's what I'm going to do.
So let's color the complete graph-- let's color the edges of the complete graph on the vertex set having N plus 1 vertices, labeled by the positive integers up to N plus 1. By the Ramsey result we just proved, if N is large enough, then there exists a monochromatic triangle.
So what does it look like? So let me draw for you a monochromatic triangle. Suppose it-- so I haven't told you what the coloring is yet. So the coloring is that I'm going to color the edge between i and j, using the color derived by applying phi to the number j minus i, namely the length of that segment if I lay out all the vertices on the number line.
So now we have an r coloring of this complete graph. So Ramsey tells us that there exists a monochromatic triangle. The triangle sits on vertices i, j, and k, with i less than j less than k. And the rule tells us that the colors are phi of k minus i, phi of j minus i, and phi of k minus j.
So these three numbers, they have the same color. But, look, if I set these numbers to be x, y, and z-- so x being j minus i, y being k minus j, and z being k minus i-- then x plus y equals to z. And they all have the same color. So this monochromatic triangle gives us a monochromatic solution to x plus y equals to z, thereby concluding the proof of Schur's theorem.
OK, so this rounds out the discussion for now of-- well, we started with some statement about number theory. And then we took this detour to graph theory, looking at Ramsey's theorem for monochromatic triangles, and then went back to number theory and proved the result that Schur did. So how does going to graphs help? So why was this advantageous? What do you guys think?
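The reduction from colorings of numbers to colorings of edges is short enough to write out. The following is a minimal sketch, with a hypothetical function name, that builds the edge coloring of the complete graph on vertices 1 through N plus 1 from a coloring phi of 1 through N, scans for a monochromatic triangle, and reads off the corresponding solution.

```python
from itertools import combinations

def schur_triple_from_coloring(phi, n):
    """phi maps each of 1..n to a color. The edge {i, j} with i < j of the
    complete graph on vertices 1..n+1 gets color phi(j - i). A monochromatic
    triangle i < j < k yields x = j - i, y = k - j, z = k - i with x + y = z."""
    for i, j, k in combinations(range(1, n + 2), 3):
        x, y, z = j - i, k - j, k - i
        if phi(x) == phi(y) == phi(z):
            return x, y, z
    return None

# Color 1..5 by parity: some triangle must be monochromatic.
print(schur_triple_from_coloring(lambda m: m % 2, 5))

# Color 1 and 4 one way, 2 and 3 the other: no solution inside 1..4.
print(schur_triple_from_coloring(lambda m: 0 if m in (1, 4) else 1, 4))  # None
```

Note that the graph has about N squared over 2 edges but only N distinct edge labels, which is the extra room the graph formulation provides.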
So I claim that by going to graphs, we added some extra flexibility to what we can play with. For example, we started out with a problem where there were only N things being colored. And then we moved to graphs where about-- well, N choose 2 or N squared objects are being colored. And then we did an induction argument. So remember in the proof of Ramsey's theorem up there, there was an induction argument taking all vertices. And that argument doesn't make that much sense if you stayed within the numbers.
Somehow moving to graphs gave you that extra flexibility that allows you to do more things. And this is one of the advantages of moving from a problem about numbers to a problem about graphs. And we'll see this connection later on as well. Yeah?
AUDIENCE: Sort of related to that. Are there better bounds known for this specific, like Schur's result of that power on e, because the N's here would be pretty bad.
YUFEI ZHAO: Right, so Ashwan asked, so what about bounds? So what do we know about bounds? So I don't know off the top of my head the answers to those questions. But in general, they're quite open. So there are exponential gaps between lower and upper bounds on our knowledge of what is the optimal N you can put in the theorem. Any more questions?
All right so, I think this is a good point for us to-- so usually when I give 90-minute lectures, I like to take a short 2-minute break in between. So I want to do that. And then in the second half, I want to take you through a tour of additive combinatorics. So tell you about some of the modern developments.
Now, this is an exciting field where it started out, I think, roughly with Schur's theorem that we just discussed. That started about 100 years ago. But a lot has taken place in the past century. And there's still a lot of ongoing exciting research developments. So in the second half of this lecture, I want to give you a tour through those developments and show you some of the highlights from additive combinatorics.
So let's take a quick 2-minute break. And feel free to ask questions in the meantime.
So another part of the writing assignment in addition to course notes is a contribution to Wikipedia, which is, you know, nowadays, of course, you know, if you hear some word like Szemeredi's regularity lemma the first thing you do is type it into Google. And more often than not the first link that comes up is Wikipedia. And, you know, some of the articles, they are all right, and some of them are really not all right.
And it would be fantastic for future students and also for yourselves if there were better entry points to this area by having higher quality Wikipedia articles or articles that are simply missing about specific topics. So one of the assignments-- again, this can be collaborative. So I'll give you more information how to do that later-- is to contribute to Wikipedia and roughly contribute one high quality article or edit some existing articles so that they become high quality. Yep.
AUDIENCE: Can we do something similar to LMDB with creating a website that has all the information needed in combinatorics?
YUFEI ZHAO: So we can talk about that. So if there are other ideas about how to do this, I'm definitely open to chatting about that.
So the other thing is that instead of holding the usual office hours, what I like to do is-- so this class ends at 4:00 PM. So after 4:00, I'll go up to the Math Common Room, which is just right upstairs and hang out there for a bit. If you have questions, you want to chat, come talk to me. I'd be happy to chat about anything related or not related to the course.
And before homeworks are due, I will try to set up some special office hours for you in case you want to ask about homework problems. And if you want to meet with me individually, please just send me an email.
Oh, one more thing about the course notes. So because I want to do quality control, so here is the process that will happen with the course notes. So the first lecture is already online. So you can already see. So I've written up the lecture notes for the first lecture. And you can use that as an example of what I'm looking for.
So I'm looking for people to sign up starting from the next lecture, and I will send out a link tonight. For future lectures, so whoever writes up the lecture-- I'll give the lecture, and then within one day, so by the end of the day after the lecture, it would be good if there were already at least some sketch, some rough draft at least containing the theorem statements and whatnot from the day's lecture, so that the next person can start writing afterwards.
But once you are done, once you feel that you have a polished version of the lecture write-up, ideally within four days of the lecture-- so in terms of expectations and timelines, again all of this information is online-- so once you're finished with polishing your lecture notes, within four days send me an email, so both co-authors if there are two of you, and I will schedule an appointment, about half an hour, where I will sit down with you to go through what you've written and give you some comments. So you can go back and polish it further. And hopefully, that will just be a one round thing. If more rounds are needed, well, it's not ideal, but we'll make it happen until the notes are ready to use for future generations. OK, any questions about any of the course logistics?
All right, so in the second half of today's lecture, I want to take you through a tour of modern additive combinatorics. And this is an area of research which I am actively involved in. And it's something that I am quite excited about. And part of the reason why I teach this course-- this course is something that I developed a couple years ago when I taught for the first time then-- because I want to introduce you guys to this very active and exciting area of research.
Now, what is additive combinatorics? The term itself is actually fairly new. So the term, additive combinatorics, I believe was coined by Terry Tao back in the early 2000s as somewhat of a rebranding of an area that already existed, but then got a lot of exciting developments in the early 2000s. It's a deep and far reaching subject with many connections to areas like graph theory, harmonic analysis, or Fourier analysis, ergodic theory, discrete geometry, logic and model theory, and has many connections all over the place, and also has many deep theorems.
So let me take you through a tour historically of, I think, some of the major milestones and landmarks in additive combinatorics. So after Schur's theorem, which we discussed in the first half of today's lecture, the next big result I would say is Van der Waerden's theorem, which was 1927. Van der Waerden's theorem says that every coloring of the positive integers using finitely many colors contains arbitrarily long monochromatic arithmetic progressions.
So we'll see arithmetic progressions come up a lot. So from now on we'll abbreviate this word by AP. So AP stands for Arithmetic Progressions.
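Like Schur's theorem, Van der Waerden's theorem has a finitary form: for every r and k there is an N such that every r-coloring of 1 through N contains a monochromatic k-term AP. The following is a minimal sketch, with hypothetical function names and assuming k is at least 2, that finds the smallest such N by exhaustive search in the smallest interesting case.

```python
from itertools import product

def has_mono_ap(coloring, k):
    """True if some k-term AP (k >= 2) inside positions 0..n-1 is monochromatic."""
    n = len(coloring)
    for start in range(n):
        # Largest step that keeps the last term, start + (k - 1) * step, below n.
        max_step = (n - 1 - start) // (k - 1)
        for step in range(1, max_step + 1):
            if len({coloring[start + i * step] for i in range(k)}) == 1:
                return True
    return False

def smallest_n_forcing_ap(r, k, limit=12):
    """Smallest N <= limit such that every r-coloring of N consecutive
    integers contains a monochromatic k-term AP; None if not found."""
    for n in range(1, limit + 1):
        if all(has_mono_ap(c, k) for c in product(range(r), repeat=n)):
            return n
    return None

print(smallest_n_forcing_ap(2, 3))  # 9
```

With two colors and 3-term progressions the answer is 9. The coloring with pattern 0, 0, 1, 1, 0, 0, 1, 1 shows that 8 integers are not enough.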
So instead of Schur's theorem where you just find a single solution to x plus y equals to z, so now, we're finding a much bigger structure. Keep in mind, so a novice mistake people make is to confuse arbitrarily long arithmetic progressions with infinitely long. So these are definitely not the same. So you can think about. I'll leave it to you as an exercise, well, also homework exercise, that you can color the integers with just two colors in a way that destroys all possible infinitely long monochromatic arithmetic progressions. So arbitrarily long is very different from infinitely long.
Now, so this was a great result, but it provokes more questions. So Erdos-Turan in the '30s, they asked-- well, they conjectured that the true reason in Van der Waerden's theorem of having long arithmetic progressions, it's not so much that you're coloring. It's just because if you use finitely many colors, then one of the color classes must have fairly high density. So one of the classes if you use r colors has density at least 1 over r. And they conjectured that every subset of the positive integers, or the integers with positive density, contains long-- so arbitrarily long arithmetic progressions.
You may ask, what does it mean, density? So you can define density in many different ways. And it doesn't actually really matter that much which definition you use. But let me write down one definition. So given a subset A of the integers, you can define the upper density, or rather, let me just say that A has positive upper density if, when we take a scaling window and look at what fraction of that window is in A, and then take the lim sup as N goes to infinity, this lim sup is positive.
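The formula written on the board is not captured in the transcript. One standard way to write the condition, assuming the scaling window is the symmetric interval from -N to N, is:

```latex
\limsup_{N \to \infty} \frac{\lvert A \cap \{-N, \dots, N\} \rvert}{2N + 1} > 0
```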
So that's one definition of positive density. There are many other definitions, sometimes known as the Banach density. And you can take variations. I mean, for the purpose of this discussion, they're all roughly equivalent. So let's not worry too much about which definition of density we use here.
All right, so Erdos and Turan conjectured that the true reason for Van der Waerden's theorem is that one of the color classes has positive density. And this turned out to be an amazingly prescient question and that one had to wait several decades. So this conjecture was made in the '30s, in 1936. So you had to wait several decades before finding out what the answer is.
So in a foundational theorem in the subject, known as Roth's theorem-- so Roth proved it in the '50s, I think '53-- Roth proved that the case k equals to 3 is true, if I say that the conjecture asks for k-term arithmetic progressions for every k. So Roth proved that every positive density subset contains a 3-term arithmetic progression.
And already, Roth introduced very important ideas that we will see in this course in two different forms. So in the first half of the course, we'll see a graph theoretic proof that was found later in the '70s of Roth's theorem. And then in the second half, we'll see Roth's original proof that used Fourier analysis.
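To get a feel for the statement, one can compute how large a subset of 1 through N can be while avoiding 3-term APs. The following is a minimal sketch, with hypothetical function names, using exhaustive search, so it is only usable for small N.

```python
from itertools import combinations

def has_3ap(numbers):
    """True if the set contains a < b < c with c - b = b - a."""
    s = set(numbers)
    return any(2 * b - a in s for a, b in combinations(sorted(s), 2))

def largest_3ap_free(n):
    """Size of a largest subset of 1..n with no 3-term AP."""
    for size in range(n, 0, -1):
        for subset in combinations(range(1, n + 1), size):
            if not has_3ap(subset):
                return size
    return 0

print(largest_3ap_free(5))  # 4, for example {1, 2, 4, 5}
print(largest_3ap_free(9))  # 5, for example {1, 2, 4, 8, 9}
```

In finitary terms, Roth's theorem says that this maximum size divided by N tends to 0 as N grows, so any set of fixed positive density eventually has to contain a 3-term AP.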
So Fourier analysis in number theory is also known as the Hardy-Littlewood circle method. It's a powerful method in analytic number theory. But there are very interesting new ideas introduced by Roth as well in developing this result.
The full conjecture was settled by Szemeredi. It took another couple of decades. So in the late '70s, Szemeredi proved his landmark theorem that confirmed the Erdos-Turan conjecture.
Szemeredi's theorem is a deep theorem. So the original combinatorial proof is a tour de force. And you can look at the introduction of his paper, where there is an enormously complex diagram-- so you can see this in the course notes-- that lays out the logical dependencies of all the lemmas and propositions in his paper. And even if you assume every single statement is true, looking at that diagram, it's not immediately clear what is going on because the logical dependencies are so involved. So this was a really complex proof.
But not only that, Szemeredi's theorem actually motivated a lot of subsequent research. So later on, researchers from other areas came in and found also sophisticated proofs of Szemeredi's theorem from other areas and using other tools, including-- and here are some of the most important perspectives, later perspectives, of Szemeredi's theorem.
So there was a proof using ergodic theory that followed fairly shortly after Szemeredi's original proof. This is due to Furstenberg. And initially, it wasn't clear, because all of these proofs were so involved. It wasn't clear if the ergodic theoretic proof was genuinely something new, or it was a rephrasing of Szemeredi's combinatorial proof.
But then very quickly it was realized that there were extensions of Szemeredi's theorem, other combinatorial results that the ergodic theorists could establish using their methods, so using the same methods or extensions of the same methods that combinatorialists did not know how to do. And to this date, there are still theorems for which the only known proofs use ergodic theory, so extensions of Szemeredi's theorem. And I will mention one later on today.
So that's one of the perspectives. The other perspective that was also quite influential is something known as higher order Fourier analysis, which was pioneered by Tim Gowers around 2000. So Gowers won the Fields Medal, partly for his work on Banach spaces but also partly for this development.
So higher order Fourier analysis is in some sense an extension of Roth's theorem. So anyway, Roth also won a Fields Medal, although this is not his most famous theorem. I'll say it's his second most famous theorem. So Roth used this Fourier analysis in the sense of Hardy-Littlewood to control 3-term arithmetic progressions.
But it turns out that that method for very good fundamental reasons completely fails for 4-term arithmetic progressions. So we'll see later in the course why that's the case, why is it that you cannot do Fourier analysis to control 4-term APs. But Gowers managed to find a way to overcome that difficulty. And he came up with an extension, with a generalization of Fourier analysis, very powerful, very difficult to use, actually. But that allows you to understand longer arithmetic progressions.
Another very influential approach is called hypergraph regularity. So the hypergraph regularity method was also discovered in the early 2000s independently by a team led by Rodl and also by Gowers. So the hypergraph regularity method is an extension of what's known as Szemeredi's regularity, Szemeredi's graph regularity method. And this is the method that will be a central topic in the first half of this course.
And it's a method that is quite central, or at least some of the ideas are quite central, to Szemeredi's method. And he gave an alternative proof. He and Ruzsa gave an alternative proof of Roth's theorem using graph theory. And for a long time, people realized that one could extend some of those ideas to hypergraphs. But working out how that proof goes actually took an enormous amount of time and effort and resulted in this amazing theorem on hypergraphs.
Let me mention these are not the only methods that were used to extend Szemeredi's theorem or give alternate proofs. There are many others. For example, you may have heard of something called the polymath project. Raise your hand if you heard of the polymath project. OK, great. So maybe about half of you.
So this is an online collaborative project started by Tim Gowers and also involving famous people like Terry Tao. And they were all quite involved in various polymath projects. And the first successful polymath project produced a combinatorial proof of something known as the density Hales-Jewett theorem. So I won't explain what this is here. So it's something which is related to tic tac toe. But let me not go into that. So it's a deep combinatorial theorem that had been known earlier using ergodic theoretic methods, but they gave a new combinatorial proof, which in particular gave some concrete bounds on this theorem, and that in particular also implies Szemeredi's theorem. So this gave a new proof.
And as a result, they-- it's an online collaborative project-- so they published this paper under the pseudonym DHJ Polymath, where DHJ stands for Density Hales Jewett. And they kept the same name for all of the subsequent papers published by the polymath project.
So as you see through all of these examples, there is a lot of work that was motivated by Szemeredi's theorem. This is truly a foundational result, a foundational theorem that gave way to a lot of important research. And Szemeredi himself received an Abel Prize for his seminal contributions to combinatorics and also theoretical computer science.
We still don't understand in some sense completely what Szemeredi's theorem-- you know, for example, we do not understand the optimal bounds. And also more importantly, conceptually, we don't really understand how these methods are related to each other. So there's some vague sense that they all have some common things. But there is a lot of mystery as to what these methods coming from very different areas-- ergodic theory, harmonic analysis-- what do they all have to do with each other.
But there is a central theme. And this is also going to be a theme in this course, which goes under the name-- and I believe Terry Tao is the one who popularized this name-- the dichotomy between structure and randomness, structure and pseudo randomness. Somehow it's a really fancy way of saying signal versus noise.
So I give you some object, I give you some complex object, and there is some mathematical way to separate the structure from some noisy aspects, which behave random-like. So there will be many places in this course where this dichotomy will play an important role.
Any questions at this point?
I want to take you through some generalizations and extensions of Szemeredi's theorem. So first, let's look at what happens if we go to higher dimensions. Suppose we have a subset A of the d-dimensional lattice. So we can also define some notion of density. Again, it doesn't matter precisely which notion you use.
For example, we can say that A has positive upper density if this lim sup is positive. So Szemeredi's theorem in one dimension tells us that if you have some sort of positive density, then I can find arbitrarily long arithmetic progressions. So what should the corresponding generalization in higher dimensions be? Well, here's a notion that I can define, namely that we say that A contains arbitrary constellations to mean that-- so what does that mean?
So a constellation, you can think of it as some finite pattern, so a set of stars in the sky, so some pattern. And I want to find that pattern somewhere in A, where I'm allowed to dilate. So I'm allowed to multiply the pattern by some number and also translate.
So for a finite pattern-- so what I mean precisely is that for every finite subset F of the grid, there exists some translation and some dilation, such that once I apply this dilation and translation to my pattern F, meaning I'm looking at the image of this F under this transformation, then this set lies inside A. So you see that a k-term arithmetic progression is the constellation given by just the numbers 1 through k. So that's a definition.
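For readers without the board in front of them, here is one way the two definitions above might be written down. The box $[-N, N]^d$ in the density and the symbols $a$ (translation) and $t$ (dilation) are notational assumptions made for this note, not necessarily what was written in the lecture; as said above, the precise notion of density does not matter much.

```latex
% Positive upper density of A \subseteq \mathbb{Z}^d (one common convention):
\limsup_{N \to \infty} \frac{\lvert A \cap [-N, N]^d \rvert}{(2N+1)^d} > 0.

% A contains arbitrary constellations:
\text{for every finite } F \subset \mathbb{Z}^d
\text{ there exist } a \in \mathbb{Z}^d,\ t \in \mathbb{Z}_{>0}
\text{ such that } a + t \cdot F = \{\, a + t x : x \in F \,\} \subseteq A.
```

Taking $d = 1$ and $F = \{1, 2, \dots, k\}$ gives $a + t, a + 2t, \dots, a + kt$, which is a $k$-term arithmetic progression.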
And the multi-dimensional Szemeredi's theorem-- so the multi-dimensional generalization of Szemeredi's theorem says that for every subset-- so every subset of the d-dimensional lattice of positive density contains arbitrary constellations. You give me a pattern, and I can find this pattern inside A, provided that A has positive density.
So in particular, if I want to find a 10 by 10 square grid, so meaning suppose I want to find a pattern which consists of something like that, a 10 by 10 square grid, where all of these lengths are equal, but I don't specify what they are. But as long as they are equal, then the theorem tells me that as long as A has positive density, I can find such a pattern inside A.
So this theorem was proved by Furstenberg and Katznelson. So you see that it is a generalization of Szemeredi's. So the one-dimensional case is precisely Szemeredi's theorem. So Furstenberg and Katznelson, using ergodic theory, showed that one can generalize Szemeredi's theorem to the multi-dimensional setting. However, the combinatorial approaches employed by Szemeredi did not easily generalize. So it took another couple of decades at least for people to find a combinatorial proof of this result. And namely that happened with the hypergraph regularity method. So this was one of the motivations of this project.
And you say, OK, what's the point of having different proofs? Well, for one thing it's nice to know different perspectives on an important theorem. But there's also a concrete objective. In particular, it turns out that if you prove something using ergodic theory, because-- we will not discuss ergodic theory in this course. But roughly, one of the early steps in such a proof applies compactness. And that already destroys any chance of getting concrete quantitative bounds.
So you can ask if I want to find a 10 by 10 pattern and I have density 1%, how large do I need to look? How far do I have to look in order to find that pattern? So that's a quantitative question that is actually not at all addressed by ergodic theory. So the later methods using combinatorial methods gave you concrete bounds. And so there are some concrete differences between these methods.
So this theorem reminds me of the scene from the movie A Beautiful Mind, which is one of the greatest mathematical movies in some sense. And so there's a scene there where Russell Crowe playing John Nash-- so they were at this fancy party. And Nash was with his soon-to-be wife, Alicia. And he points to the sky and tells her, pick a shape. Pick a shape and I can find it for you among the stars. And so this is what the theorem allows you to do. So give me a shape and I can find that constellation inside A.
Let's look at other generalizations. So far, we are looking at linear patterns. So we're looking at linear dilations and translations. But what about polynomial patterns?
So here's a question. Suppose I give you a dense subset, a positive density subset of integers. Can you find two numbers whose difference is a perfect square? So this question was asked by Lovasz. And a positive answer was given in the late '70s by Furstenberg and Sarkozy independently. So Furstenberg and Sarkozy, they showed using different methods-- so one ergodic theoretic and the other more harmonic analytic-- that every subset of the integers, so every subset of positive integers, with positive density contains two numbers differing by a perfect square. So in other words, we can always find the pattern x, x plus y squared.
So what about other polynomial patterns? Instead of this y squared, suppose you just give me some other polynomial or maybe a collection of polynomials. So what can I say? Well, there are some things for which this is not true.
Can you give me an example where, if I put in the wrong polynomial, it's not true? What if the polynomial is the constant 1? If you take the even numbers, that set has density 1/2, but it doesn't contain any pattern of the form x and x plus 1.
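To make the contrast concrete, here is a small sketch that searches a finite set for the pattern x, x + P(y). The function name `find_pattern` and the cutoff `y_max` are hypothetical choices for this illustration; a finite search like this only demonstrates the two examples above and of course proves nothing about densities.

```python
def find_pattern(A, poly, y_max):
    """Return (x, y) with x and x + poly(y) both in A for some 1 <= y <= y_max,
    or None if no such pair is found in the searched range."""
    A = set(A)
    for y in range(1, y_max + 1):
        shift = poly(y)
        for x in sorted(A):
            if x + shift in A:
                return (x, y)
    return None


evens = range(0, 200, 2)

# Constant polynomial 1: two even numbers never differ by 1.
print(find_pattern(evens, lambda y: 1, 50))      # None

# Polynomial y^2 (zero constant term): 0 and 0 + 2^2 = 4 are both even.
print(find_pattern(evens, lambda y: y * y, 50))  # (0, 2)
```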
So I need to state some hypotheses about these polynomials. So a vast generalization of this result, known as the polynomial Szemeredi theorem, says that if A is a subset of integers with positive density, and if we have these polynomials, P1 through Pk with integer coefficients and zero constant terms, then I can always find a pattern. So there exists some x and positive integer y such that this pattern, x plus P1 of y, x plus P2 of y, and so on, x plus Pk of y, they all lie in A. So in other words, succinctly, every subset of integers with positive density contains arbitrary polynomial patterns.
So this was proved-- so this was an important result proved by Bergelson and Leibman using ergodic theory. And so far for this general statement, the only known proof uses ergodic theory. So there were some recent developments, recent pretty exciting developments, that for some specific cases where you have some additional restrictions on the P's, there are other methods coming from Fourier analytic, harmonic analytic methods that could give you a different proof that allows you to get some bounds. Remember, the ergodic proof gives you no bounds. But so far, in general, the only method known is ergodic theoretic.
And actually, Bergelson and Leibman proved something which is more general than what I've stated. So this is also true in a multidimensional setting. I won't state that precisely, but you can imagine what it is.
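Written out in symbols, the statement just given reads roughly as follows. This is a transcription of the spoken statement, with "positive density" understood as positive upper density (an assumption of this note).

```latex
\textbf{Polynomial Szemer\'edi theorem.}
Let $A \subseteq \mathbb{Z}$ have positive upper density, and let
$P_1, \dots, P_k \in \mathbb{Z}[y]$ satisfy $P_i(0) = 0$ for all $i$.
Then there exist $x \in \mathbb{Z}$ and $y \in \mathbb{Z}_{>0}$ such that
\[
  x + P_1(y),\; x + P_2(y),\; \dots,\; x + P_k(y) \in A .
\]
```

Two special cases tie it back to the earlier results: $P_i(y) = (i-1)\,y$ gives a $k$-term arithmetic progression, which is Szemeredi's theorem, and $k = 2$ with $P_1(y) = 0$, $P_2(y) = y^2$ gives two elements differing by a perfect square.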
Let me mention one more theorem that many of you I imagine have heard of. And this is the Green-Tao theorem. So the Green-Tao theorem says that the primes contain arbitrarily long arithmetic progressions.
So this is a famous theorem. And it's one of the most celebrated results of the past couple of decades. And it resolved some longstanding folklore conjectures in number theory.
The Green-Tao theorem, well, you see that in form it looks somewhat like Szemeredi's theorem. But it doesn't follow from Szemeredi's theorem. Well, the primes, they don't have positive density. The prime number theorem tells us that the density decays like 1 over log n.
So what about quantitative versions of Szemeredi's theorem? It is possible. Although we do not know how to prove such a statement, it is possible that the density of the primes alone might guarantee the Green-Tao theorem, in that it is possible that Szemeredi's theorem is true for any set whose density decays like the prime numbers, like 1 over log n.
But no, we're quite far from proving such a statement. And that's not what Green and Tao did. Instead, they took Szemeredi's theorem as a black box and applied it to some variant of the primes and showed that inside this variant, Szemeredi's theorem is also true, and that the primes sit inside this variant of the primes, known as pseudoprimes, as a set of relatively positive density, thereby somehow transferring Szemeredi's theorem from the dense setting to a sparser setting.
So this is a very exciting technique. And as a result, Green-Tao proved not just that the primes contain arbitrarily long arithmetic progressions, but every relatively dense, so relatively positive density subset, of the primes contains arbitrarily long arithmetic progressions. To prove this theorem they incorporated many different ideas coming from many different areas of mathematics, including harmonic analysis, some ideas coming from combinatorics, and number theory as well. So there were some innovations at the time in number theory that were employed in this result.
So this is certainly a landmark theorem. And although we will not discuss a full proof of the Green-Tao theorem, we will go into some of the ideas through this course. And I will show you bits and pieces that we will see throughout the course.
So this is meant to be a very fast tour of what happened in the last 100 years in additive combinatorics, taking you from Schur's theorem, which was really about 100 years ago, to something that is much more modern. But now, instead of being up in the stars, let's come back down to Earth. And I want to talk about what we'll do next.
So what are some of the things that we can actually prove that don't involve taking up 50 pages using a complex logical diagram, as Szemeredi did in his paper? So what are some of the simple things that we can start with? Well, so first, let's go back to Roth's theorem.
So Roth's theorem, we stated it up there. But let me restate it in a finitary form. So Roth's theorem is the statement that every subset of integers 1 through n that avoids 3-term arithmetic progressions must have size little o of n. So earlier we gave an infinitary statement that if you have a positive density subset of the integers, then it contains a 3-AP. This is an equivalent finitary statement.
Roth's original proof used Fourier analysis. And a different proof was given in the '70s by Ruzsa and Szemeredi using graph theoretic methods. So what does graph theory have to do with this result? And this shouldn't be surprising at this point, given that we already saw how we used Ramsey's theorem, a graph theoretic result, to prove Schur's theorem, which is something that is number theoretic.
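To get a feel for the finitary statement, here is a brute-force sketch that computes, for very small n, the largest size of a subset of 1 through n with no 3-term arithmetic progression. The names `is_3ap_free` and `r3` are hypothetical, and exhaustive search is only feasible for tiny n; it says nothing about the asymptotics, which is what the theorem is about.

```python
from itertools import combinations


def is_3ap_free(S):
    """True if S contains no nontrivial 3-term AP, i.e. no a < b < c with a + c = 2b."""
    S = set(S)
    return not any(
        (a + c) % 2 == 0 and (a + c) // 2 in S
        for a, c in combinations(sorted(S), 2)
    )


def r3(n):
    """Largest size of a 3-AP-free subset of {1, ..., n}, by exhaustive search."""
    for size in range(n, 0, -1):
        if any(is_3ap_free(c) for c in combinations(range(1, n + 1), size)):
            return size
    return 0


print([r3(n) for n in range(1, 10)])  # [1, 2, 2, 3, 4, 4, 4, 4, 5]
```

For example, {1, 2, 4, 5} is a 3-AP-free subset of 1 through 5 of size 4, and no subset of size 5 works within 1 through 8.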
So something similar happens. But now, the question is what is the graph theoretic problem that we need to look at? So for Schur's theorem it was Ramsey's theorem for triangles. But what about for Roth's theorem? A naive guess is the following. So what's the question that we should ask?
Here's a somewhat naive guess, which turns out not to be the right question, but still an interesting question, which is what is the maximum number of edges in a triangle-free graph on n vertices? Now, this is not totally a stupid guess, because as you imagine from what we said with Schur's theorem, somehow you want to set up a graph so that the triangles correspond to the 3-term arithmetic progressions.
And you want to set it up in such a way that this question about what's the maximum size subset of 1 through n without 3 APs translates into some question about what's the maximum number of edges in a graph that has some property? So what is that property? So this is not a totally stupid guess.
But it turns out this question is relatively easy. Still it has a name. So this was found by Mantel about 100 years ago, so known as Mantel's theorem. And the answer, well, we'll see a proof. So the first thing we'll do in the next lecture is prove Mantel's theorem, but I don't want to hold you in suspense.
I mean the answer, it turns out to be fairly simple to describe. Namely that you split the vertices into two basically equal halves. And you join all the possible edges between the two halves. So this is the complete bipartite graph with two equal-sized parts. And it turns out this graph, you see, is triangle-free and also turns out to have the maximum number of edges. Yeah, question.
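As a sanity check on this answer, the sketch below builds the balanced complete bipartite graph and compares its edge count with an exhaustive search over all graphs on n vertices for very small n. The function names are hypothetical and the search is exponential, so this is only an illustration; the actual proof comes in the next lecture.

```python
from itertools import combinations


def is_triangle_free(n, edges):
    """True if the graph on vertices 0..n-1 with the given edges has no triangle."""
    E = set(edges)
    return not any(
        (a, b) in E and (a, c) in E and (b, c) in E
        for a, b, c in combinations(range(n), 3)
    )


def complete_bipartite(n):
    """Edges of the complete bipartite graph with parts of size n//2 and n - n//2."""
    return [(u, v) for u in range(n // 2) for v in range(n // 2, n)]


def max_triangle_free_edges(n):
    """Maximum number of edges in a triangle-free graph on n vertices (brute force)."""
    all_edges = list(combinations(range(n), 2))
    best = 0
    for mask in range(1 << len(all_edges)):
        edges = [e for i, e in enumerate(all_edges) if (mask >> i) & 1]
        if len(edges) > best and is_triangle_free(n, edges):
            best = len(edges)
    return best


for n in range(2, 6):
    # Both counts agree with floor(n^2 / 4) for these small n.
    print(n, len(complete_bipartite(n)), max_triangle_free_edges(n))
```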
AUDIENCE: What are asymptotics for three arithmetic progression of--
YUFEI ZHAO: Let me get to that in a second. OK, so I'll talk about asymptotics in a second. So it turns out that this is not the right graph theoretic question to ask. So what is the right graph theoretic question to ask? I'll tell you what it is. I mean it shouldn't be clear to you at this point. It still seems like an interesting question, but it's also somewhat bizarre to think about if you've never seen this before.
So what is the maximum number of edges in an n vertex graph, where every edge lies in exactly one triangle? So I want a graph with lots and lots of edges where every edge sits in exactly one triangle. Now, you might have some difficulty even coming up with good graphs that have this property. And that's OK. These are very strange things to think about. But we'll see many examples of it later on.
We'll also see how Roth's theorem is connected to this graph theoretic question. Just to give you a hint, you know, where does exactly one triangle come from, it's because even if you avoid 3-term arithmetic progressions, there are still these trivial 3-term arithmetic progressions, where you keep the same number three times. And in the graph theoretic world, that corresponds to the unique triangle that every edge sits on.
So to address the question about quantitative bounds, for Roth's theorem, it turns out that we have upper bounds and lower bounds. And it is still a wide open question as to what these things should be. And roughly speaking, the best lower bound comes from a construction, which we'll see later in this course, of size around n divided by e to the c root log n. And the best upper bound is of the form roughly n over log n.
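The lecture defers the details, but for the curious, here is a sketch of one standard way such a graph can be built from a 3-AP-free set; this particular construction is an assumption of this note, not something stated so far. Take three copies X, Y, Z of the integers mod N (N odd), and join x to y when y - x is in A, y to z when z - y is in A, and x to z when (z - x)/2 is in A. A triangle then forces y - x, (z - x)/2, z - y to be a 3-term progression in A, so if A has only trivial ones, every edge lies in exactly one triangle. The function name `triangles_per_edge` is hypothetical.

```python
def triangles_per_edge(N, A):
    """For odd N and A a subset of Z/N, build the tripartite graph on X, Y, Z = Z/N with
    x~y iff y - x in A, y~z iff z - y in A, x~z iff (z - x)/2 in A,
    and return a dict mapping each edge to the number of triangles containing it."""
    assert N % 2 == 1
    A = {a % N for a in A}
    half = (N + 1) // 2  # multiplicative inverse of 2 modulo odd N
    counts = {}
    for u in range(N):
        for v in range(N):
            if (v - u) % N in A:
                counts[("XY", u, v)] = 0
                counts[("YZ", u, v)] = 0
            if ((v - u) * half) % N in A:
                counts[("XZ", u, v)] = 0
    for x in range(N):
        for y in range(N):
            for z in range(N):
                tri = (("XY", x, y), ("YZ", y, z), ("XZ", x, z))
                if all(e in counts for e in tri):
                    for e in tri:
                        counts[e] += 1
    return counts


# {1, 2, 4, 5} has no nontrivial 3-AP; taking N = 2*5 + 1 = 11 avoids wraparound.
good = triangles_per_edge(11, {1, 2, 4, 5})
print(len(good), set(good.values()))  # 132 edges, each in exactly one triangle: {1}

# {1, 2, 3} is itself a 3-AP, so some edges lie in more than one triangle.
bad = triangles_per_edge(7, {1, 2, 3})
print(max(bad.values()))
```

Here the graph has 3N vertices and 3N|A| edges, which is how a large 3-AP-free set would translate into a graph with many edges.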
That's maybe a little bit hard to think about how these numbers behave. So if you write both-- the denominator as e to the something, then it's maybe easier to compare. But it's still a pretty far gap. So still a pretty big gap.
There's a famous conjecture of Erdos some of you might have heard of, that if you have a subset of the positive integers with divergent harmonic series, then it contains arbitrarily long arithmetic progressions. That's a very attractive statement. But somehow I don't like the statement so much, because it seems to make it too pretty. And the statement really is about what are the bounds on Roth's theorem and on Szemeredi's theorem.
And having divergent harmonic series is roughly the same as trying to prove Roth's theorem slightly better than the bound that we currently have, somehow breaking this logarithmic barrier. So that conjecture, that having divergent harmonic series implies 3-term APs, is still open. That is still open. So the bound needed is very close to what we can prove, but it is still open.
For this graph theoretic question, we will see later in this course, once we've developed Szemeredi's regularity lemma, that we can prove an upper bound of little o of n squared. And that will suffice for proving Roth's theorem.
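The comparison being described can be written out as follows, using $r_3(n)$ (notation assumed for this note) for the largest size of a subset of 1 through n with no 3-term arithmetic progression, and ignoring constant factors and lower-order terms as the lecture does.

```latex
\frac{n}{e^{c\sqrt{\log n}}}
  \;\lesssim\; r_3(n) \;\lesssim\;
\frac{n}{\log n} \;=\; \frac{n}{e^{\log\log n}} .
```

So in the exponent of the denominator, the lower bound has $c\sqrt{\log n}$ while the upper bound has only $\log\log n$, and $\sqrt{\log n}$ grows much faster than $\log\log n$.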
It turns out that we don't know what the right answer should be. We don't know what is the best such graph. And it turns out the best construction for this graph there comes from over here, the best lower bound construction of a set, of a large set without 3-term arithmetic progressions. So I'm giving you a preview of more of these connections between additive combinatorics on one hand and graph theory on the other hand that we'll see throughout this course.
Any questions? OK. So just to tell you what's going to happen next, so the next thing that we're going to discuss is basically extremal graph theory. And in particular, if you forbid some structure, such as a triangle, maybe a four cycle, maybe some other graph, what can you say about the maximum number of edges? And there are still a lot of interesting open problems, even for that. I forbid some H. What's the maximum number of edges? So the next few lectures will be on that topic.