WEBVTT
00:00:17.880 --> 00:00:20.820
YUFEI ZHAO: Today we want
to look at the sum-product
00:00:20.820 --> 00:00:21.522
problem.
00:00:21.522 --> 00:00:22.980
So for the past
few lectures, we've
00:00:22.980 --> 00:00:25.260
been discussing the
structure of sets
00:00:25.260 --> 00:00:28.380
under the addition operation.
00:00:28.380 --> 00:00:31.050
Today we're going to throw
in one extra operation, so
00:00:31.050 --> 00:00:33.310
multiplication,
and understand how
00:00:33.310 --> 00:00:37.920
sets behave under both
addition and multiplication.
00:00:37.920 --> 00:00:43.380
And the basic problem here
is, can it be the case that A
00:00:43.380 --> 00:00:49.980
plus A, A times A,
which is, analogously,
00:00:49.980 --> 00:00:55.860
the set of all pairwise
products of elements from A--
00:00:55.860 --> 00:01:05.760
can these two sets be
simultaneously small,
00:01:05.760 --> 00:01:08.390
that is, both small
for some single set A?
00:01:11.710 --> 00:01:17.710
Can we have it so that
A plus A and A times A
00:01:17.710 --> 00:01:20.690
are simultaneously small?
00:01:20.690 --> 00:01:24.470
For example, it's easy to
make one of them small.
00:01:24.470 --> 00:01:29.210
We've seen examples where if
you take A to be an arithmetic
00:01:29.210 --> 00:01:32.570
progression, then A plus
A is more or less as
00:01:32.570 --> 00:01:34.400
small as it gets.
00:01:34.400 --> 00:01:39.710
But for such an example, you
see A times A is pretty large.
00:01:39.710 --> 00:01:42.620
It's actually not so clear
how to prove how large it
00:01:42.620 --> 00:01:43.820
is.
00:01:43.820 --> 00:01:45.320
And there are some
very nice proofs.
00:01:45.320 --> 00:01:48.620
And this problem has actually
been more or less pinned down.
00:01:48.620 --> 00:01:54.350
But the short version is
that A times A has size close
00:01:54.350 --> 00:01:55.790
to its maximum possible.
00:01:59.600 --> 00:02:05.690
So it turns out the size of A
times A is almost quadratic.
00:02:05.690 --> 00:02:08.009
So this number is actually
now known fairly precisely.
00:02:08.009 --> 00:02:12.410
So this problem of determining
the size of A times
00:02:12.410 --> 00:02:17.690
A for the interval 1 through
N is known as the Erdos
00:02:17.690 --> 00:02:19.160
multiplication table problem.
00:02:30.190 --> 00:02:32.910
So if you take an N by
N multiplication table,
00:02:32.910 --> 00:02:35.920
how many numbers do
you see in the table?
00:02:35.920 --> 00:02:39.570
So that turns out to be
sub-quadratic, but not too
00:02:39.570 --> 00:02:42.170
sub-quadratic.
00:02:42.170 --> 00:02:46.130
So this problem has been more
or less solved by Kevin Ford.
00:02:46.130 --> 00:02:48.840
And we now know a fairly
precise expression,
00:02:48.840 --> 00:02:50.840
but I don't want
to focus on that.
00:02:50.840 --> 00:02:54.200
That's not the topic
of today's lecture.
00:02:54.200 --> 00:02:55.700
This is just an example.
00:02:55.700 --> 00:02:57.950
Alternatively, you
can take A times A
00:02:57.950 --> 00:03:02.400
to be quite small by taking A
to be a geometric progression.
00:03:02.400 --> 00:03:04.820
Then it's not too hard
to convince yourself
00:03:04.820 --> 00:03:08.390
that A plus A must be
fairly large in that case.
00:03:08.390 --> 00:03:10.310
And the geometric
progression doesn't
00:03:10.310 --> 00:03:14.630
have so much additive structure,
so A plus A will be large.
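To make the two examples concrete, here is a small Python sketch that counts sums and products for an arithmetic and a geometric progression. The helper names `sumset` and `productset`, the ratio 2, and the size n = 20 are illustrative assumptions, not something fixed in the lecture.

```python
def sumset(A):
    """Return A + A, the set of all pairwise sums a + b with a, b in A."""
    return {a + b for a in A for b in A}


def productset(A):
    """Return A * A, the set of all pairwise products a * b with a, b in A."""
    return {a * b for a in A for b in A}


n = 20  # illustrative size, chosen arbitrarily
arithmetic = list(range(1, n + 1))       # 1, 2, ..., n
geometric = [2 ** i for i in range(n)]   # 1, 2, 4, ..., 2^(n-1) (ratio 2 is an assumption)

for name, A in [("arithmetic", arithmetic), ("geometric", geometric)]:
    print(name, "|A+A| =", len(sumset(A)), "|A*A| =", len(productset(A)))
```

For the arithmetic progression the sums fill 2 through 2n, so there are 2n - 1 of them, while the product set is much larger. For the geometric progression the roles swap: there are 2n - 1 products, but all n(n+1)/2 unordered sums are distinct.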
00:03:14.630 --> 00:03:20.380
So can you make A plus A and A
times A simultaneously small?
00:03:20.380 --> 00:03:24.470
So there's this conjecture
that the answer is no.
00:03:24.470 --> 00:03:27.480
And this is a famous
conjecture in this area, known
00:03:27.480 --> 00:03:36.400
as the Erdos-Szemeredi
conjecture on the sum-product
00:03:36.400 --> 00:03:40.810
problem, which states
that for all finite sets
00:03:40.810 --> 00:03:48.730
of real numbers, either
A plus A or A times A
00:03:48.730 --> 00:03:53.035
has to be close
to quadratic size.
00:04:00.300 --> 00:04:01.800
So that's the conjecture.
00:04:01.800 --> 00:04:03.810
It's still very much open.
00:04:03.810 --> 00:04:05.760
Today I want to show
you some progress
00:04:05.760 --> 00:04:08.810
towards this conjecture
via some partial results.
00:04:08.810 --> 00:04:12.780
And it will use a nice
combination of tools
00:04:12.780 --> 00:04:16.519
from graph theory and
incidence geometry,
00:04:16.519 --> 00:04:18.810
so it nicely ties in
together many of the things
00:04:18.810 --> 00:04:22.019
that we've seen in
this course so far.
00:04:22.019 --> 00:04:25.870
So Erdos and Szemeredi
proved some bound,
00:04:25.870 --> 00:04:30.610
which is like size of A to the power
1 plus c for some constant c.
00:04:30.610 --> 00:04:35.387
Today we'll show some bounds
for somewhat better c's.
00:04:35.387 --> 00:04:35.970
So you'll see.
00:04:39.350 --> 00:04:41.480
The first tool that
I want to introduce
00:04:41.480 --> 00:04:45.150
is a result from graph theory
known as the "crossing number
00:04:45.150 --> 00:04:46.312
inequality."
00:04:56.660 --> 00:04:58.340
So you know that
planar graphs are
00:04:58.340 --> 00:05:00.100
graphs that you can
draw in the plane
00:05:00.100 --> 00:05:02.230
so that the edges do not cross.
00:05:02.230 --> 00:05:05.000
And there are some famous
examples of non-planar graphs,
00:05:05.000 --> 00:05:09.070
like K5 and K 3, 3.
00:05:09.070 --> 00:05:11.560
But you can ask a more
quantitative question.
00:05:11.560 --> 00:05:14.320
If I give you a graph,
how many crossings
00:05:14.320 --> 00:05:18.280
must you have in every
drawing of this graph?
00:05:18.280 --> 00:05:19.840
And the crossing
number inequality
00:05:19.840 --> 00:05:22.990
provides some estimate
for such a quantity.
00:05:22.990 --> 00:05:28.600
So given a graph G, denote
by cr of G, so crossing number of G,
00:05:28.600 --> 00:05:40.130
to be the minimum number
of crossings in a planar
00:05:40.130 --> 00:05:46.570
drawing of G. There
is a bit of subtlety
00:05:46.570 --> 00:05:50.470
here, where by a planar drawing,
do I mean using line segments
00:05:50.470 --> 00:05:52.090
or do I mean using curves?
00:05:52.090 --> 00:05:55.990
It's actually not clear how
it affects this quantity here.
00:05:55.990 --> 00:05:57.520
That's a very subtle issue.
00:05:57.520 --> 00:06:00.490
So for planar graphs,
there's a famous result
00:06:00.490 --> 00:06:02.800
that more or less says
if a planar graph can
00:06:02.800 --> 00:06:04.270
be drawn using
continuous curves,
00:06:04.270 --> 00:06:06.830
then it can be drawn
using straight lines.
00:06:06.830 --> 00:06:09.352
But the minimum
number of crossings,
00:06:09.352 --> 00:06:10.810
the two different
ways of drawings,
00:06:10.810 --> 00:06:13.400
they might end up with
different crossing numbers.
00:06:13.400 --> 00:06:15.230
But for the purpose
of today's lecture,
00:06:15.230 --> 00:06:18.010
we'll use a more general notion,
although it doesn't actually
00:06:18.010 --> 00:06:19.840
matter for today
which one we'll use--
00:06:19.840 --> 00:06:21.745
so planar drawing using curves.
00:06:25.870 --> 00:06:28.810
Draw the graph where edges
are continuous curves.
00:06:28.810 --> 00:06:30.460
How many crossings do you get?
00:06:30.460 --> 00:06:35.450
The crossing is a pair
of edges that cross.
00:06:35.450 --> 00:06:38.260
You can ask-- it's just a
cross over point that can--
00:06:38.260 --> 00:06:39.035
it doesn't matter.
00:06:39.035 --> 00:06:40.660
So there are many
different subtle ways
00:06:40.660 --> 00:06:42.040
of defining these things.
00:06:42.040 --> 00:06:44.830
They won't really come
up for today's lecture.
00:06:44.830 --> 00:06:49.510
The crossing number
inequality is a result
00:06:49.510 --> 00:06:54.010
from the '80s, which gives
you a lower-bound estimate
00:06:54.010 --> 00:06:56.530
on the number of crossings.
00:06:56.530 --> 00:07:06.470
If G is a graph
with enough edges--
00:07:06.470 --> 00:07:08.220
the number of edges
is, let's say,
00:07:08.220 --> 00:07:10.790
at least four times the
number of vertices--
00:07:13.530 --> 00:07:21.220
then the number of crossings
of every drawing of G
00:07:21.220 --> 00:07:24.970
is at least the
number of edges cubed
00:07:24.970 --> 00:07:28.810
divided by the number
of vertices squared.
00:07:28.810 --> 00:07:32.530
And there's an extra constant
factor, which is some constant.
00:07:39.380 --> 00:07:43.570
So the constant does
not depend on the graph.
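Written out, the statement reads roughly as follows. The notation |V|, |E| and the name c for the constant are assumed here; the lecture only says that the constant does not depend on the graph.

```latex
\[
  |E| \ge 4|V|
  \quad\Longrightarrow\quad
  \operatorname{cr}(G) \;\ge\; c\,\frac{|E|^{3}}{|V|^{2}},
  \qquad c > 0 \text{ an absolute constant.}
\]
```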
00:07:43.570 --> 00:07:45.760
In particular, if it
has a lot of edges,
00:07:45.760 --> 00:07:50.815
then every drawing of G must
have a lot of crossings.
00:07:50.815 --> 00:07:52.300
So the crossing
number inequality
00:07:52.300 --> 00:07:56.080
was proved by two separate
independent works,
00:07:56.080 --> 00:07:59.680
one by Ajtai, Chvatal,
Newborn, Szemeredi and the
00:07:59.680 --> 00:08:02.470
other by Tom Leighton,
our very own Tom Leighton.
00:08:06.950 --> 00:08:11.090
So let me first give you some
consequences of this theorem,
00:08:11.090 --> 00:08:12.690
just for illustration.
00:08:12.690 --> 00:08:17.930
So if you have an n-vertex
graph with a quadratic number
00:08:17.930 --> 00:08:26.460
of edges, then how many
crossings must you have?
00:08:26.460 --> 00:08:29.070
You plug in these
parameters into the theorem.
00:08:29.070 --> 00:08:36.270
You see that it necessarily has on the
order of n to the 4th crossings.
00:08:39.100 --> 00:08:42.020
But if you just draw the
graph in some arbitrary way,
00:08:42.020 --> 00:08:44.570
you have at most n
to the 4th crossings,
00:08:44.570 --> 00:08:49.215
because a crossing
involves four points.
00:08:49.215 --> 00:08:51.090
So when you have a
quadratic number of edges,
00:08:51.090 --> 00:08:54.780
you must get basically the
maximum number of crossings.
00:08:54.780 --> 00:08:58.163
The leading constant term factor
is an interesting problem,
00:08:58.163 --> 00:08:59.580
which we're not
going to get into.
00:09:02.540 --> 00:09:06.810
Let's prove the crossing
number inequality.
00:09:06.810 --> 00:09:11.145
First, the base case of the
crossing number inequality
00:09:11.145 --> 00:09:14.065
is when you can draw a
graph with no crossings.
00:09:14.065 --> 00:09:16.420
And those are planar graphs.
00:09:16.420 --> 00:09:29.130
So for every connected
planar graph,
00:09:29.130 --> 00:09:31.770
if it has at least one cycle--
and you'll see why in a second,
00:09:31.770 --> 00:09:33.000
why I say this--
00:09:33.000 --> 00:09:43.380
if with at least one cycle,
so that's not a tree,
00:09:43.380 --> 00:09:48.150
we must have that 3
times the number of faces
00:09:48.150 --> 00:09:53.380
is at most 2 times
the number of edges.
00:09:53.380 --> 00:09:55.770
So here, we're going
to use the key tool
00:09:55.770 --> 00:10:01.380
being Euler's
formula, which we all
00:10:01.380 --> 00:10:05.310
know as the number of vertices
minus the number of edges
00:10:05.310 --> 00:10:09.080
plus the number of
faces equals to 2.
00:10:09.080 --> 00:10:12.470
F here is for faces, because
I draw a planar graph,
00:10:12.470 --> 00:10:18.040
and so I count the faces.
00:10:18.040 --> 00:10:21.520
Here there are two faces, outer
face, inner face, count edges
00:10:21.520 --> 00:10:25.380
and vertices, so you have
Euler's formula up there.
00:10:25.380 --> 00:10:28.390
And plug in Euler's
formula for a planar graph
00:10:28.390 --> 00:10:32.740
with at least one cycle, so
we can obtain this consequence
00:10:32.740 --> 00:10:42.030
over here, because every
face is adjacent to at least
00:10:42.030 --> 00:10:45.330
three edges.
00:10:45.330 --> 00:10:47.510
If you go around
the face, you see
00:10:47.510 --> 00:10:56.350
these three edges, and every
edge is counted exactly twice,
00:10:56.350 --> 00:11:02.320
is adjacent to
exactly two faces.
00:11:02.320 --> 00:11:04.785
So you do the double counting,
you get that inequality up
00:11:04.785 --> 00:11:05.285
there.
00:11:07.920 --> 00:11:15.910
So plugging these two into Euler
gets you that inequality up
00:11:15.910 --> 00:11:16.410
there.
00:11:23.730 --> 00:11:31.290
Plugging these two into Euler,
we get that the number of edges
00:11:31.290 --> 00:11:34.870
is at most 3 times the
number of vertices minus 6.
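The substitution can be spelled out in one line. This is a sketch in assumed notation (|V|, |E|, |F| for the numbers of vertices, edges, and faces) for a connected planar graph with at least one cycle.

```latex
\[
  3|F| \le 2|E|, \qquad |V| - |E| + |F| = 2
\]
\[
  2 \;=\; |V| - |E| + |F| \;\le\; |V| - |E| + \tfrac{2}{3}|E| \;=\; |V| - \tfrac{1}{3}|E|
  \quad\Longrightarrow\quad
  |E| \;\le\; 3|V| - 6 .
\]
```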
00:11:41.662 --> 00:11:43.120
So this gives
that inequality,
00:11:43.120 --> 00:11:48.760
but plug it into Euler, plug in
this into Euler, you get this.
00:11:48.760 --> 00:11:52.300
So we have that
the number of edges
00:11:52.300 --> 00:11:57.670
is at most 3 times the number
of vertices for every such graph G.
00:11:57.670 --> 00:12:02.230
So here, we require
that the graph is planar
00:12:02.230 --> 00:12:07.205
and has at least one cycle, but
even if we drop the condition
00:12:07.205 --> 00:12:08.830
that it has at least
one cycle but just
00:12:08.830 --> 00:12:15.760
require that it's planar,
every planar graph G
00:12:15.760 --> 00:12:18.645
satisfies this
inequality over here.
00:12:18.645 --> 00:12:20.020
So in other words,
you might have
00:12:20.020 --> 00:12:24.100
heard before, in a planar graph,
the average degree of a vertex
00:12:24.100 --> 00:12:25.150
is at most 6.
00:12:28.460 --> 00:12:34.790
So in particular, the
crossing number of a graph G
00:12:34.790 --> 00:12:40.430
is positive if the
number of edges
00:12:40.430 --> 00:12:42.380
exceeds 3 times the
number of vertices.
00:12:44.915 --> 00:12:46.810
It's not planar, so
it has at least one
00:12:46.810 --> 00:12:49.300
crossing in every drawing.
00:12:49.300 --> 00:13:04.730
And by deleting an edge
from each crossing,
00:13:04.730 --> 00:13:05.900
we get a planar graph.
00:13:11.360 --> 00:13:13.040
You draw the graph.
00:13:13.040 --> 00:13:14.040
You have some crossings.
00:13:14.040 --> 00:13:17.230
You get rid of an edge
associated with each crossing.
00:13:17.230 --> 00:13:19.450
Then you get a planar graph.
00:13:19.450 --> 00:13:21.610
If you look at this
inequality and you
00:13:21.610 --> 00:13:24.460
account for the number of
edges that you deleted,
00:13:24.460 --> 00:13:28.740
we obtain then the
inequality that the number
00:13:28.740 --> 00:13:31.470
of edges minus the
number of crossings
00:13:31.470 --> 00:13:34.710
is at most 3 times
the number of vertices.
00:13:43.130 --> 00:13:50.820
So we obtain the
inequality that lower
00:13:50.820 --> 00:13:56.640
bounds the number of crossings
by the number of edges
00:13:56.640 --> 00:14:01.110
minus 3 times the number
of vertices, this one.
00:14:06.820 --> 00:14:10.490
So that's some lower bound
on the crossing number.
00:14:10.490 --> 00:14:12.650
It's not quite the bound
that we have over there.
00:14:12.650 --> 00:14:14.990
And in fact, if you take a
graph with a quadratic number
00:14:14.990 --> 00:14:17.840
of edges, this bound
here only gives you
00:14:17.840 --> 00:14:20.810
quadratic lower bound on
the crossing number, some
00:14:20.810 --> 00:14:21.480
lower bound.
00:14:21.480 --> 00:14:22.860
But it's not a
great lower bound.
00:14:22.860 --> 00:14:24.750
And we would like to do better.
00:14:24.750 --> 00:14:28.710
So here's a trick that
is a very nice trick,
00:14:28.710 --> 00:14:33.410
where we're going to use
this inequality to upgrade it
00:14:33.410 --> 00:14:36.170
to a much better
inequality, bootstrap it
00:14:36.170 --> 00:14:38.430
to a much tighter inequality.
00:14:38.430 --> 00:14:40.910
So this involves the use of
the probabilistic method.
00:14:43.870 --> 00:14:48.410
Let me denote by p some
number between 0 and 1,
00:14:48.410 --> 00:14:49.760
to be decided later.
00:14:54.080 --> 00:14:57.320
And starting with
a graph G, let's
00:14:57.320 --> 00:15:05.820
let G prime, with vertices
and edges being V prime and E
00:15:05.820 --> 00:15:16.300
prime, be obtained from G
by randomly deleting some
00:15:16.300 --> 00:15:19.690
of the vertices,
or rather randomly
00:15:19.690 --> 00:15:29.600
keeping each vertex
with probability p,
00:15:29.600 --> 00:15:35.931
independently for each
of these vertices.
00:15:35.931 --> 00:15:37.940
So you have some graph G.
00:15:37.940 --> 00:15:41.330
I keep each vertex
with probability p.
00:15:41.330 --> 00:15:43.640
And I delete the
remaining vertices.
00:15:43.640 --> 00:15:45.800
And I get a smaller graph.
00:15:45.800 --> 00:15:49.230
I get some induced subgraph.
00:15:49.230 --> 00:15:51.440
And I would like
to know what can we
00:15:51.440 --> 00:15:55.910
say about the crossing number of
the smaller graph in comparison
00:15:55.910 --> 00:16:00.240
to the crossing number
of the original graph?
00:16:00.240 --> 00:16:03.860
For the smaller graph, because
it's still a planar graph
00:16:03.860 --> 00:16:05.770
so G prime--
00:16:05.770 --> 00:16:07.180
so it's still a graph.
00:16:07.180 --> 00:16:09.180
It's not a planar graph,
but it's still a graph,
00:16:09.180 --> 00:16:14.380
so G prime still satisfies
this inequality up here.
00:16:17.600 --> 00:16:21.950
So G prime still satisfies
that the number of crossings
00:16:21.950 --> 00:16:24.270
in every drawing of
G prime is at least
00:16:24.270 --> 00:16:28.500
the number of edges of G
prime minus 3 times the number
00:16:28.500 --> 00:16:30.520
of vertices of G prime.
00:16:33.560 --> 00:16:38.390
But note that G prime
is a random graph.
00:16:38.390 --> 00:16:40.400
G was fixed, given.
00:16:40.400 --> 00:16:44.100
G prime is a random graph.
00:16:44.100 --> 00:16:46.520
So let's evaluate
the expectation
00:16:46.520 --> 00:16:50.270
of both quantities, left-hand
side and right-hand side.
00:16:53.730 --> 00:16:56.280
If this inequality is
true for every G prime,
00:16:56.280 --> 00:16:59.370
the same inequality must
be true in expectation.
00:17:09.369 --> 00:17:13.510
Now what do we know about
all the expectations of each
00:17:13.510 --> 00:17:17.230
of these quantities?
00:17:17.230 --> 00:17:19.940
The number of vertices
in expectation--
00:17:19.940 --> 00:17:21.260
that's pretty easy.
00:17:21.260 --> 00:17:28.780
So this one here is p times the
original number of vertices.
00:17:28.780 --> 00:17:30.970
The number of edges
is also pretty easy.
00:17:30.970 --> 00:17:34.150
Each edge is kept if
both endpoints are kept.
00:17:34.150 --> 00:17:38.620
So this expectation on the
number of edges remaining
00:17:38.620 --> 00:17:43.640
is also pretty
easy to determine.
00:17:43.640 --> 00:17:49.070
The crossing number
of the new graph--
00:17:49.070 --> 00:17:51.830
that I have to be a little
bit more careful of,
00:17:51.830 --> 00:17:54.380
because when you look
at the smaller graph,
00:17:54.380 --> 00:17:56.780
maybe there's a
different way to draw it
00:17:56.780 --> 00:18:00.320
that's not just deleting
some of the vertices
00:18:00.320 --> 00:18:02.070
from the original graph.
00:18:02.070 --> 00:18:03.560
So even though
the original graph
00:18:03.560 --> 00:18:06.170
might have a lot of crossings,
when you go to a subgraph,
00:18:06.170 --> 00:18:09.070
maybe there's a
better way to draw it.
00:18:09.070 --> 00:18:11.320
But we just need an inequality
in the right direction.
00:18:11.320 --> 00:18:13.110
So we are still OK.
00:18:13.110 --> 00:18:16.170
And I claim that the
crossing number of G prime
00:18:16.170 --> 00:18:19.740
is in expectation
at most p to the 4th
00:18:19.740 --> 00:18:23.190
times the crossing number
of G. Because if you
00:18:23.190 --> 00:18:32.360
keep the same drawing, then the
expected number of crossings
00:18:32.360 --> 00:18:34.470
that are kept--
00:18:34.470 --> 00:18:38.850
each crossing is kept if
all four of its endpoints
00:18:38.850 --> 00:18:40.770
are kept.
00:18:40.770 --> 00:18:44.830
So each crossing is kept with
probability p to the 4th.
00:18:44.830 --> 00:18:47.320
So you can draw
it in expectation
00:18:47.320 --> 00:18:48.845
with this many crossings.
00:18:48.845 --> 00:18:49.720
Maybe it's much less.
00:18:49.720 --> 00:18:50.890
Maybe there's a
better way to draw it,
00:18:50.890 --> 00:18:53.575
but you have an inequality
going in the right direction.
00:19:01.500 --> 00:19:05.130
Looking at that inequality
up there in yellow,
00:19:05.130 --> 00:19:08.710
we find that the
crossing number of G
00:19:08.710 --> 00:19:17.960
is at least p to the minus 2 times E
minus 3 p to the minus 3 times V.
00:19:21.090 --> 00:19:25.690
And this is true for every
value of p between 0 and 1.
00:19:25.690 --> 00:19:30.540
So now you pick a value of p
that works most in your favor.
00:19:30.540 --> 00:19:33.690
And it turns out
you should do this
00:19:33.690 --> 00:19:37.710
by setting these
two terms to be
00:19:37.710 --> 00:19:39.100
roughly equal to each other.
00:19:43.050 --> 00:19:57.090
So setting p between 0 and
1 so that 4 times the--
00:19:57.090 --> 00:19:58.590
basically, set these
two terms to be
00:19:58.590 --> 00:20:00.020
roughly equal to each other.
00:20:03.890 --> 00:20:09.320
And then we get that
this quantity here
00:20:09.320 --> 00:20:13.910
is at least the
claimed quantity,
00:20:13.910 --> 00:20:19.270
which is E cubed
over V squared up
00:20:19.270 --> 00:20:24.140
to some constant factor, which
I don't really care about.
00:20:24.140 --> 00:20:27.020
In order to set p, I have
to be a little bit careful
00:20:27.020 --> 00:20:28.560
that p is between 0 and 1.
00:20:28.560 --> 00:20:30.500
If you set p to be 1.2,
this whole argument
00:20:30.500 --> 00:20:33.060
doesn't make any sense.
00:20:33.060 --> 00:20:33.980
So this is OK.
00:20:36.820 --> 00:20:46.260
So we know p is at most one
as long as E is at least 4V.
00:20:46.260 --> 00:20:49.470
I mean, the 4 here is not
optimal, but if 4 were 2,
00:20:49.470 --> 00:20:50.550
then it's not true.
00:20:50.550 --> 00:20:54.482
So if E is 2V, you can
have a planar graph,
00:20:54.482 --> 00:20:56.940
so you shouldn't have a lower
bound on the crossing number.
00:20:59.550 --> 00:21:02.620
So this is the proof of the
crossing number inequality.
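The whole bootstrapping step fits in a few lines. In this sketch the specific choice p = 4|V|/|E| is an assumption (the lecture only says to balance the two terms), and the resulting constant 1/64 comes from that choice.

```latex
\[
  p^{4}\operatorname{cr}(G) \;\ge\; \mathbb{E}\,\operatorname{cr}(G')
  \;\ge\; \mathbb{E}|E'| - 3\,\mathbb{E}|V'|
  \;=\; p^{2}|E| - 3p|V|
\]
\[
  \Longrightarrow\quad \operatorname{cr}(G) \;\ge\; p^{-2}|E| - 3p^{-3}|V|
  \qquad \text{for every } p \in (0,1].
\]
\[
  p = \frac{4|V|}{|E|} \le 1:
  \qquad
  \operatorname{cr}(G) \;\ge\; \frac{|E|^{3}}{16|V|^{2}} - \frac{3|E|^{3}}{64|V|^{2}}
  \;=\; \frac{|E|^{3}}{64|V|^{2}} .
\]
```

The requirement p at most 1 is exactly the hypothesis that the number of edges is at least 4 times the number of vertices.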
00:21:02.620 --> 00:21:04.770
As I said, if you
have lots of edges,
00:21:04.770 --> 00:21:08.960
then you must have
lots of crossings.
00:21:08.960 --> 00:21:10.002
Any questions?
00:21:13.250 --> 00:21:15.200
So let's use the crossing
number inequality
00:21:15.200 --> 00:21:19.220
to prove a fundamental
result in incidence geometry.
00:21:26.820 --> 00:21:30.120
Incidence geometry is
this area of discrete math
00:21:30.120 --> 00:21:33.510
that concerns fairly
basic-sounding questions
00:21:33.510 --> 00:21:37.530
about incidences between,
let's say, points and lines.
00:21:37.530 --> 00:21:39.730
And here's an example.
00:21:39.730 --> 00:21:52.530
So what's the maximum number
of incidences between n points
00:21:52.530 --> 00:21:57.660
and n lines,
where by "incidence"
00:21:57.660 --> 00:22:03.150
I mean if P-- so curly
P-- is a set of points,
00:22:03.150 --> 00:22:09.750
and curly L is a set of lines,
then I write I of P and L
to be the number of pairs,
one point, one line, such
00:22:20.770 --> 00:22:25.380
that the point lies on the line.
00:22:25.380 --> 00:22:29.440
So I'm counting incidences
between points and lines.
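As a quick illustration of the definition, here is a brute-force incidence counter in Python. The function name `incidences`, the (slope, intercept) encoding of lines (which leaves out vertical lines), and the small 3-by-3 example are all assumptions made for this sketch.

```python
def incidences(points, lines):
    """Count pairs (point, line) such that the point lies on the line.

    points: iterable of (x, y) integer pairs.
    lines:  iterable of (m, b) integer pairs encoding the line y = m*x + b
            (non-vertical lines only; this encoding is an assumption).
    """
    return sum(1 for (x, y) in points for (m, b) in lines if y == m * x + b)


# Hypothetical small example: a 3-by-3 grid of points and four lines.
points = {(x, y) for x in range(3) for y in range(3)}
lines = {(1, 0), (0, 0), (1, 1), (2, 0)}  # y = x, y = 0, y = x + 1, y = 2x

count = incidences(points, lines)
print(count, "incidences; trivial bound is", len(points) * len(lines))
```

Here the lines y = x and y = 0 each pass through three grid points, and y = x + 1 and y = 2x through two each, so the count is 10, well under the product of the number of points and the number of lines.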
00:22:29.440 --> 00:22:30.910
You can view this in many ways.
00:22:30.910 --> 00:22:34.000
You can view it as a bipartite
graph between points and lines,
00:22:34.000 --> 00:22:40.000
and we're counting the number of
edges in this bipartite graph.
00:22:40.000 --> 00:22:41.730
So I give you n
points, n lines.
00:22:41.730 --> 00:22:45.350
What's the maximum
number of incidences?
00:22:45.350 --> 00:22:47.670
It's not such an
obvious question.
00:22:47.670 --> 00:22:52.560
So let's see how we can
approach this question.
00:22:52.560 --> 00:22:58.280
But first, let me give
you some easy bounds.
00:22:58.280 --> 00:23:02.480
So here's a trivial bound--
00:23:07.680 --> 00:23:12.390
so here, I want to know if I
give you some number of points,
00:23:12.390 --> 00:23:16.250
some number of lines, what's the
maximum number of incidences.
00:23:16.250 --> 00:23:21.580
So a trivial bound is that
the number of incidences
00:23:21.580 --> 00:23:26.970
is at most the product
between the number of points
00:23:26.970 --> 00:23:30.120
and the number of lines.
00:23:30.120 --> 00:23:32.220
One point, one line,
at most one incidence.
00:23:32.220 --> 00:23:34.750
So that's pretty trivial.
00:23:34.750 --> 00:23:36.260
We can do better.
00:23:36.260 --> 00:23:42.840
So we can do better
because, well, you
00:23:42.840 --> 00:23:50.050
see, let's use this following
fact, that every line--
00:23:52.880 --> 00:24:02.735
so every pair of points
determine at most one line.
00:24:02.735 --> 00:24:03.810
I have two points.
00:24:03.810 --> 00:24:08.780
There's at most one line that
contains those two points.
00:24:08.780 --> 00:24:15.990
Using this fact, we see
that the number of--
00:24:15.990 --> 00:24:23.500
so let's count the number
of triples involving
00:24:23.500 --> 00:24:34.590
two points and one line
such that both points lie
00:24:34.590 --> 00:24:35.220
on the line.
00:24:39.730 --> 00:24:41.530
So how big can this set be?
00:24:41.530 --> 00:24:44.950
So let's try to count it
in two different ways.
00:24:44.950 --> 00:24:48.900
On one hand, this
quantity is at most
00:24:48.900 --> 00:24:52.270
the number of points squared,
because if I give you
00:24:52.270 --> 00:24:56.700
two points, then they
determine this line--
00:24:56.700 --> 00:25:01.580
so at most the number
of points squared.
00:25:01.580 --> 00:25:09.390
But on the other hand, we see
that if I give you a line,
00:25:09.390 --> 00:25:12.090
I just need to count
now the number of--
00:25:15.268 --> 00:25:17.560
let me also require that
these two points are distinct.
00:25:17.560 --> 00:25:20.860
So if I give you
a line, I now need
00:25:20.860 --> 00:25:26.990
to count the number of pairs
of points on this line.
00:25:26.990 --> 00:25:36.750
So I can enumerate over
lines and count line
00:25:36.750 --> 00:25:42.270
by line how many pairs of
points are on that line.
00:25:42.270 --> 00:25:45.790
So I get this
quantity over here.
00:25:45.790 --> 00:25:48.930
On each line, I have
that contribution.
00:25:48.930 --> 00:25:55.100
And now, using
the Cauchy-Schwarz inequality,
00:25:55.100 --> 00:26:01.130
we find that this
squared term is at least
00:26:01.130 --> 00:26:11.200
the number of incidences squared
divided by the number of lines.
00:26:11.200 --> 00:26:13.660
And the remaining minus
1 term contributes just
00:26:13.660 --> 00:26:16.134
to the number of incidences.
00:26:20.090 --> 00:26:22.370
So the first is by
Cauchy-Schwarz.
00:26:26.690 --> 00:26:30.450
So putting these two
inequalities together,
00:26:30.450 --> 00:26:34.920
we get some upper bound on
the number of incidences.
00:26:34.920 --> 00:26:37.990
If you invert
this inequality,
00:26:37.990 --> 00:26:42.810
you will get that the number
of incidences between points
00:26:42.810 --> 00:26:50.360
and lines is upper bounded
by the number of points
00:26:50.360 --> 00:26:53.930
times the square root of
the number of lines,
00:26:53.930 --> 00:27:00.450
plus the number of lines.
00:27:00.450 --> 00:27:03.290
So that's what you get from
this inequality over here.
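The double counting can be written out as follows. The notation is assumed: I stands for I(P, L), and the sum runs over lines in L, counting ordered pairs of distinct points of P on each line.

```latex
\[
  |\mathcal{P}|^{2}
  \;\ge\; \sum_{\ell \in \mathcal{L}} |\mathcal{P} \cap \ell|\,\bigl(|\mathcal{P} \cap \ell| - 1\bigr)
  \;\ge\; \frac{I^{2}}{|\mathcal{L}|} - I ,
  \qquad I = \sum_{\ell \in \mathcal{L}} |\mathcal{P} \cap \ell| .
\]
\[
  \text{If } I > |\mathcal{L}|:\quad
  (I - |\mathcal{L}|)^{2} \;\le\; I\,(I - |\mathcal{L}|) \;\le\; |\mathcal{P}|^{2}\,|\mathcal{L}|
  \quad\Longrightarrow\quad
  I \;\le\; |\mathcal{P}|\,|\mathcal{L}|^{1/2} + |\mathcal{L}| .
\]
```

The second inequality in the first line is Cauchy-Schwarz applied to the squared terms. If I is at most the number of lines, the final bound holds trivially.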
00:27:06.150 --> 00:27:08.720
By considering
point-line duality--
00:27:08.720 --> 00:27:12.450
so whenever you have this
kind of setup involving points
00:27:12.450 --> 00:27:15.870
and lines, you can take
the projective dual
00:27:15.870 --> 00:27:18.570
and transform the
configuration into--
00:27:18.570 --> 00:27:21.150
lines into points and points
into lines, and the incidences
00:27:21.150 --> 00:27:22.560
are preserved.
00:27:22.560 --> 00:27:25.630
So I also have an inequality.
00:27:25.630 --> 00:27:29.550
By duality-- I also
have an inequality
00:27:29.550 --> 00:27:32.760
where I switch the roles
of points and lines.
00:27:38.610 --> 00:27:40.890
So I is already a number.
00:27:40.890 --> 00:27:44.470
I don't need to put an
extra absolute value sign.
00:27:44.470 --> 00:27:46.590
So the number of incidences
between points and lines
00:27:46.590 --> 00:27:50.130
is upper bounded by
the number of lines
00:27:50.130 --> 00:27:52.740
times the square root
of the number of points
00:27:52.740 --> 00:27:58.200
plus an extra term, just in
case there are very few lines.
00:28:02.035 --> 00:28:03.910
So these are the bounds
that you have so far.
00:28:03.910 --> 00:28:06.160
And the only thing that
we have used so far
00:28:06.160 --> 00:28:09.340
is the fact that every two
points determine at most one
00:28:09.340 --> 00:28:12.760
line, and every two lines
meet at at most one point.
00:28:15.290 --> 00:28:17.200
So these are the
bounds that we get.
00:28:17.200 --> 00:28:23.720
And in particular, for
n points and n lines,
00:28:23.720 --> 00:28:26.530
we get the number
of incidences is--
00:28:26.530 --> 00:28:29.420
big O of n to the 3/2.
00:28:34.440 --> 00:28:36.930
This should remind you of
something we've done before.
00:28:40.300 --> 00:28:43.560
So in the first
part of this course,
00:28:43.560 --> 00:28:50.500
when we were looking at extremal
numbers, where did 3/2 come up?
00:28:50.500 --> 00:28:52.420
AUDIENCE: [INAUDIBLE] like C4?
00:28:52.420 --> 00:28:53.980
YUFEI ZHAO: C4, yeah.
00:28:53.980 --> 00:29:00.450
So if you compare this quantity
to the extremal number of C4,
00:29:00.450 --> 00:29:04.690
it's also n to the 3/2.
00:29:04.690 --> 00:29:07.750
And in fact, the proof
is exactly the same.
00:29:07.750 --> 00:29:12.110
All we're using here is that
the incidence graph is C4-free.
00:29:12.110 --> 00:29:17.700
So in fact, this is an
argument about C4-free graphs.
00:29:17.700 --> 00:29:21.060
So this fact here, every two
points determine at most one
00:29:21.060 --> 00:29:25.080
line, is saying that if you
look at the incidence graph,
00:29:25.080 --> 00:29:28.000
there's no C4.
00:29:28.000 --> 00:29:30.670
That's all we're using for now.
00:29:30.670 --> 00:29:31.712
Any questions?
00:29:35.410 --> 00:29:37.000
So is this the truth?
00:29:37.000 --> 00:29:39.640
Now, back when we were
discussing the extremal number
00:29:39.640 --> 00:29:43.650
for C4-free graphs, we
saw that, in fact, this
00:29:43.650 --> 00:29:45.240
is the correct order.
00:29:45.240 --> 00:29:46.740
And what was the
construction there?
00:29:53.550 --> 00:29:56.790
So the construction also
came from incidences,
00:29:56.790 --> 00:30:01.290
but incidences of
taking all lines
00:30:01.290 --> 00:30:07.950
and points in the finite
field plane, Fq squared.
00:30:07.950 --> 00:30:10.740
If you look at all the
lines and all the points
00:30:10.740 --> 00:30:13.740
in a finite field
plane, then you
00:30:13.740 --> 00:30:18.750
get the correct
lower bound for C4.
00:30:18.750 --> 00:30:23.490
But now we are actually
working in the real plane,
00:30:23.490 --> 00:30:28.560
so it turns out that the answer
is different when you're not
00:30:28.560 --> 00:30:30.120
working over a finite field.
00:30:30.120 --> 00:30:33.615
We're going to be using the
topology of the real plane.
00:30:33.615 --> 00:30:35.740
And we're going to come up
with a different answer.
00:30:35.740 --> 00:30:41.070
So it turns out that
the truth for the maximum
00:30:41.070 --> 00:30:44.210
number of
incidences in the plane,
00:30:44.210 --> 00:30:50.020
for points and lines in the
real plane, is not exponent 3/2,
00:30:50.020 --> 00:30:53.770
but turns out to be 4/3.
00:30:53.770 --> 00:30:56.710
And this is a consequence
of an important result
00:30:56.710 --> 00:30:58.950
in incidence geometry,
a fundamental result,
00:30:58.950 --> 00:31:00.800
known as the
Szemeredi-Trotter theorem.
00:31:07.080 --> 00:31:11.350
So the Szemeredi-Trotter
theorem says
00:31:11.350 --> 00:31:17.050
that the number of incidences
between points and lines
00:31:17.050 --> 00:31:19.720
is upper bounded by
this function where
00:31:19.720 --> 00:31:22.740
you look at the number of points
times the number of lines,
00:31:22.740 --> 00:31:34.250
and each raised to power 2/3
and plus some additional terms,
00:31:34.250 --> 00:31:38.360
just in case there are many more
lines compared to points or way
00:31:38.360 --> 00:31:41.170
more points compared to lines.
00:31:41.170 --> 00:31:44.070
So that's the
Szemeredi-Trotter theorem.
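In symbols, the theorem and its corollary read as follows. The name C for the absolute constant is assumed notation; the lecture does not specify its value.

```latex
\[
  I(\mathcal{P}, \mathcal{L})
  \;\le\; C\,\bigl( |\mathcal{P}|^{2/3}\,|\mathcal{L}|^{2/3} + |\mathcal{P}| + |\mathcal{L}| \bigr)
\]
\[
  |\mathcal{P}| = |\mathcal{L}| = n
  \quad\Longrightarrow\quad
  I(\mathcal{P}, \mathcal{L}) = O\bigl(n^{4/3}\bigr).
\]
```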
00:31:44.070 --> 00:31:52.190
And as a corollary, you see
that n points, n lines give you
00:31:52.190 --> 00:31:58.930
at most n to the 4/3
incidences, in contrast
00:31:58.930 --> 00:32:05.950
to the setting of the finite
field plane, where you can
00:32:05.950 --> 00:32:08.150
get n to the 3/2 incidences.
00:32:08.150 --> 00:32:10.870
So somehow, we have
to use the topology
00:32:10.870 --> 00:32:13.280
of the real plane for this one.
00:32:13.280 --> 00:32:15.370
And I want to show you a proof--
00:32:15.370 --> 00:32:17.080
turns out not the
original proof,
00:32:17.080 --> 00:32:19.490
but it's a proof that
uses the crossing number
00:32:19.490 --> 00:32:23.525
inequality to prove
Szemeredi-Trotter theorem.
00:32:23.525 --> 00:32:25.150
You see, in crossing
number inequality,
00:32:25.150 --> 00:32:29.290
we are using the topology
of the real plane.
00:32:29.290 --> 00:32:31.657
Where?
00:32:31.657 --> 00:32:32.740
AUDIENCE: Euler's formula.
00:32:32.740 --> 00:32:34.198
YUFEI ZHAO: Euler's
formula, right.
00:32:34.198 --> 00:32:36.020
So the very beginning,
Euler's formula
00:32:36.020 --> 00:32:40.290
has to do with the
topology of the real plane.
00:32:40.290 --> 00:32:43.790
Now, this bound turns
out to be tight.
00:32:43.790 --> 00:32:45.530
So let me give you
an example showing
00:32:45.530 --> 00:32:49.880
that the 4/3 exponent is tight.
00:32:49.880 --> 00:32:55.340
And the example
is, if you take P
00:32:55.340 --> 00:33:06.420
to be this rectangular
grid of points,
00:33:06.420 --> 00:33:09.750
and L to be a set
of lines-- so I'm
00:33:09.750 --> 00:33:13.710
going to write the
lines by their equation,
00:33:13.710 --> 00:33:17.730
where the slope is an
integer from 1 through k
00:33:17.730 --> 00:33:21.090
and the y-intercept
is an integer from 1
00:33:21.090 --> 00:33:23.610
through k squared.
00:33:23.610 --> 00:33:30.230
And you see here
that every line in L
00:33:30.230 --> 00:33:42.130
contains exactly k points
from P. So we got in total k
00:33:42.130 --> 00:33:51.040
to the 4th incidences, which is
on the order of n to the 4/3.
00:33:53.840 --> 00:33:55.740
So n to the 4/3
is the right answer.
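As a sanity check on this example, here is a minimal Python sketch that counts the incidences directly. The transcript does not record the grid dimensions, so the choice of points with x in 1..k and y in 1..2k^2 is an assumption (it is the usual choice for this construction); the function name `grid_incidences` is hypothetical.

```python
def grid_incidences(k):
    """Count incidences for an assumed k x 2k^2 grid and the lines y = m*x + b."""
    # Points: x in 1..k, y in 1..2k^2 (grid dimensions are an assumption).
    xs = range(1, k + 1)
    y_max = 2 * k * k
    num_points = k * y_max
    # Lines: slope m in 1..k, y-intercept b in 1..k^2, as stated in the lecture.
    lines = [(m, b) for m in range(1, k + 1) for b in range(1, k * k + 1)]
    # Any grid point on a line has x in 1..k, so it suffices to test those x.
    per_line = [sum(1 for x in xs if 1 <= m * x + b <= y_max) for m, b in lines]
    return num_points, len(lines), per_line


if __name__ == "__main__":
    for k in (2, 3, 4):
        n_points, n_lines, per_line = grid_incidences(k)
        # Columns: k, |P|, |L|, total incidences, distinct points-per-line values
        print(k, n_points, n_lines, sum(per_line), set(per_line))
```

Under these assumptions there are 2k^3 points and k^3 lines, and every line meets the grid in exactly k points, so the total is k^4, which is on the order of n to the 4/3 when n is the number of points.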
00:33:59.513 --> 00:34:01.930
Now let me show you how to
prove Szemeredi-Trotter theorem
00:34:01.930 --> 00:34:03.980
from the crossing
number inequality.
00:34:03.980 --> 00:34:06.700
It turns out to be a very
neat application that's
00:34:06.700 --> 00:34:10.610
almost a direct consequence
once you set up the right graph.
00:34:10.610 --> 00:34:15.550
And the idea is that we are
going to draw a graph based
00:34:15.550 --> 00:34:19.570
on our incidence configuration.
00:34:19.570 --> 00:34:26.900
So first, just to clean
things up a little bit,
00:34:26.900 --> 00:34:41.830
let's get rid of lines in
L with 1 or 0 points in P.
00:34:41.830 --> 00:34:45.219
So this operation doesn't
affect the bounds.
00:34:45.219 --> 00:34:46.810
So you can check.
00:34:46.810 --> 00:34:50.170
These lines don't contribute
much to the incidence bound,
00:34:50.170 --> 00:34:53.260
and only contribute
to this plus L.
00:34:53.260 --> 00:34:55.600
So you can get
rid of such lines.
00:34:55.600 --> 00:35:03.830
So let's assume
that every line in L
00:35:03.830 --> 00:35:14.390
contains at least
two points from P.
00:35:14.390 --> 00:35:19.010
And let's draw a graph based
on this incidence structure.
00:35:19.010 --> 00:35:19.970
So if I have--
00:35:27.850 --> 00:35:36.980
so suppose these are
my points and lines.
00:35:36.980 --> 00:35:39.560
I'll just draw a
graph where I keep
00:35:39.560 --> 00:35:47.560
the points as the vertices,
and I put in an edge.
00:35:47.560 --> 00:35:55.900
It's a finite edge that
connects two adjacent points
00:35:55.900 --> 00:35:56.620
on the same line.
00:36:02.467 --> 00:36:03.300
So I get some graph.
00:36:09.480 --> 00:36:12.938
Let me make this graph
a bit more interesting.
00:36:23.680 --> 00:36:25.080
So I get some graph.
00:36:25.080 --> 00:36:31.340
And how many crossings, at
most, does this graph have?
00:36:31.340 --> 00:36:42.030
So the number of
crossings of G is at most
00:36:42.030 --> 00:36:45.920
the number of lines
squared, because a crossing
00:36:45.920 --> 00:36:47.030
comes from two lines.
00:36:49.662 --> 00:36:50.870
So here, you have a crossing.
00:36:50.870 --> 00:36:52.448
A crossing comes from two lines.
00:36:52.448 --> 00:36:54.740
Number of crossings is at
most number of lines squared.
00:36:57.380 --> 00:36:59.500
On the other hand, we
can give a lower bound
00:36:59.500 --> 00:37:04.820
to the number of crossings from
the crossing number inequality.
00:37:04.820 --> 00:37:07.470
And to do that, I want to
estimate the number of edges.
00:37:07.470 --> 00:37:09.770
And this is the reason why
I assume every line contains
00:37:09.770 --> 00:37:16.580
at least two points from P,
because a line with now k
00:37:16.580 --> 00:37:23.720
incidences gives
k minus 1 edges.
00:37:26.690 --> 00:37:32.300
And if k is at least 2, then k
minus 1 is at least k over 2,
00:37:32.300 --> 00:37:33.410
let's say.
00:37:33.410 --> 00:37:35.990
I don't care about
constant factors.
00:37:35.990 --> 00:37:41.270
So by crossing
number inequality,
00:37:41.270 --> 00:37:46.130
the number of crossings
of G is at least
00:37:46.130 --> 00:37:50.210
the number of edges cubed
over the number of vertices
00:37:50.210 --> 00:38:00.320
squared, which is at least
the number of incidences
00:38:00.320 --> 00:38:05.980
of this configuration cubed over
the number of points squared.
00:38:05.980 --> 00:38:08.540
Actually, number of vertices
is the number of points.
00:38:08.540 --> 00:38:12.050
And number of edges,
by this argument here,
00:38:12.050 --> 00:38:15.428
is on the same order as
the number of incidences.
00:38:18.500 --> 00:38:24.470
Putting these two facts
together, we see--
00:38:24.470 --> 00:38:29.510
there was one extra hypothesis
in crossing number inequality.
00:38:29.510 --> 00:38:32.780
Provided that this
hypothesis holds,
00:38:32.780 --> 00:38:37.610
which is that the
number of incidences
00:38:37.610 --> 00:38:48.720
is at least 8 times
the number of points,
00:38:48.720 --> 00:38:52.200
so that the original
hypothesis holds.
00:38:54.810 --> 00:38:58.080
So putting everything
together, and rearranging
00:38:58.080 --> 00:39:02.190
all of these terms, and
using upper and lower bounds
00:39:02.190 --> 00:39:08.460
on the crossing number, we find
that the number of incidences
00:39:08.460 --> 00:39:10.980
is upper bounded by--
00:39:10.980 --> 00:39:22.750
the main term you see is
just coming from these two,
00:39:22.750 --> 00:39:27.640
but there are a few other terms
that we should put in, just
00:39:27.640 --> 00:39:31.170
in case this
hypothesis is violated,
00:39:31.170 --> 00:39:35.080
and also to take care of
this assumption over here,
00:39:35.080 --> 00:39:39.660
so adding a couple of
linear terms corresponding
00:39:39.660 --> 00:39:43.200
to the number of points
and the number of lines.
00:39:43.200 --> 00:39:46.677
If this hypothesis is
violated, then the inequality
00:39:46.677 --> 00:39:47.260
is still true.
00:39:51.650 --> 00:39:55.430
So this proves the
Szemeredi-Trotter theorem.
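The chain of inequalities in this argument can be written out as follows. This is a sketch with constants suppressed; the symbols I (the number of incidences), v, e, and cr(G) are notation introduced here, and \gtrsim, \lesssim hide absolute constants (they need the amssymb package).

```latex
% G: vertices are the points of P, edges join adjacent points on a line of L.
% A line with k >= 2 points gives k - 1 >= k/2 edges, so e >= I/2, and v = |P|.
\[
|L|^2 \;\ge\; \operatorname{cr}(G) \;\gtrsim\; \frac{e^3}{v^2}
\;\gtrsim\; \frac{I^3}{|P|^2}
\quad\Longrightarrow\quad
I \;\lesssim\; |P|^{2/3}\,|L|^{2/3}.
\]
% If the hypothesis of the crossing number inequality fails, then I \lesssim |P|.
% The discarded lines with at most one point add at most |L| incidences.
\[
I(P, L) \;\lesssim\; |P|^{2/3}\,|L|^{2/3} + |P| + |L|.
\]
```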
00:39:55.430 --> 00:39:56.638
Any questions?
00:40:00.790 --> 00:40:06.450
So we've done these
two very neat results.
00:40:06.450 --> 00:40:09.760
The question is, what do they
have to do with the sum product
00:40:09.760 --> 00:40:11.790
problem?
00:40:11.790 --> 00:40:15.390
So I want to show you how
you can give some lower bound
00:40:15.390 --> 00:40:20.010
on the sum product problem
using Szemeredi-Trotter theorem.
00:40:22.650 --> 00:40:25.750
So it turns out that the sum
product problem is intimately
00:40:25.750 --> 00:40:28.780
related to incidence geometry.
00:40:28.780 --> 00:40:32.500
And the reason-- you'll see in
a second precisely why they're
00:40:32.500 --> 00:40:36.640
related, but roughly speaking,
when you have addition
00:40:36.640 --> 00:40:39.580
and multiplication,
they're kind of
00:40:39.580 --> 00:40:43.090
like taking slope
and y-intercept
00:40:43.090 --> 00:40:45.020
of an equation of a line.
00:40:45.020 --> 00:40:47.500
So there are two operations
that are involved.
00:40:47.500 --> 00:40:52.660
So turns out, many incidence
geometry problems can be set up
00:40:52.660 --> 00:40:53.830
in a way--
00:40:53.830 --> 00:40:55.630
so many sum product
problems can be set up
00:40:55.630 --> 00:40:58.800
in a way that involves
incidence geometry.
00:40:58.800 --> 00:41:04.330
And a very short and clever
lower bound to the sum product
00:41:04.330 --> 00:41:10.270
problem was proved by
Elekes in the late '90s.
00:41:18.930 --> 00:41:25.470
So he showed the bound that if
you have a subset of finite,
00:41:25.470 --> 00:41:31.870
subset of reals, then the sum
set size times the product set
00:41:31.870 --> 00:41:35.665
size is at least A to the 5/2.
00:41:39.150 --> 00:41:46.995
As a corollary, one of these
two must be fairly large.
00:41:46.995 --> 00:41:53.560
The max of the sum set size
and the product set size
00:41:53.560 --> 00:41:57.495
is at least the size of A to the 5/4.
00:42:06.030 --> 00:42:07.440
Let me show you the proof.
00:42:07.440 --> 00:42:11.040
I'm going to construct a set
of points and a set of lines
00:42:11.040 --> 00:42:16.490
based on the set A. And
the set of points in R2
00:42:16.490 --> 00:42:22.700
is going to be pairs x comma y,
where the horizontal coordinate
00:42:22.700 --> 00:42:26.740
lies in the sum set, A plus
A, and the vertical coordinate
00:42:26.740 --> 00:42:38.140
lies in the product set, A
times A. And a set of lines
00:42:38.140 --> 00:42:40.350
is going to be these lines--
00:42:40.350 --> 00:42:52.350
y equals to a times x minus
a prime, where a and a prime
00:42:52.350 --> 00:42:56.810
lie in A.
00:42:56.810 --> 00:43:00.840
So these are some
points and some lines.
00:43:00.840 --> 00:43:07.270
And I want to show you that
they must have many incidences.
00:43:07.270 --> 00:43:09.180
So what are the incidences?
00:43:09.180 --> 00:43:17.080
So note that the line y equals
to a times x minus a prime--
00:43:17.080 --> 00:43:27.510
it contains the points
a prime plus b and ab,
00:43:27.510 --> 00:43:34.230
which lies in P for all
b in A. You plug it in.
00:43:34.230 --> 00:43:39.743
If you plug in a prime plus
b into here, you get ab.
00:43:39.743 --> 00:43:44.300
And this point lies in P,
because the first coordinate
00:43:44.300 --> 00:43:45.920
is the sum set.
00:43:45.920 --> 00:43:49.550
The second coordinate
lies in the product set.
00:43:49.550 --> 00:43:57.570
So each line in L
contains many incidences.
00:43:57.570 --> 00:44:01.960
So each line in L
contains |A| incidences.
00:44:01.960 --> 00:44:17.610
So this line, each line in
L contains |A| incidences.
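To make the configuration concrete, here is a small Python sketch that builds the point set and the lines for a given set of integers and counts, for each line y = a(x - a'), how many points of P it contains. The function name `elekes_configuration` and the sample set are illustrative assumptions, not part of the lecture.

```python
from itertools import product


def elekes_configuration(A):
    """Build P = (A+A) x (A.A) and the lines y = a*(x - a2), then count incidences."""
    A = sorted(set(A))
    sums = {a + b for a, b in product(A, repeat=2)}    # x-coordinates of P
    prods = {a * b for a, b in product(A, repeat=2)}   # y-coordinates of P
    # One line per ordered pair (a, a2), standing for y = a * (x - a2).
    incidences = {
        (a, a2): sum(1 for x in sums if a * (x - a2) in prods)
        for a, a2 in product(A, repeat=2)
    }
    return sums, prods, incidences


if __name__ == "__main__":
    A = [1, 2, 3, 5]
    sums, prods, incidences = elekes_configuration(A)
    print("points:", len(sums) * len(prods), "lines:", len(incidences))
    print("fewest points on a line:", min(incidences.values()))
```

Each line contains at least the |A| points (a' + b, ab) with b in A, so the minimum printed here should be at least |A|; a line may happen to contain more points of P than that.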
00:44:17.610 --> 00:44:23.490
Also, we can easily
compute the number of lines
00:44:23.490 --> 00:44:26.490
and the number of points.
00:44:26.490 --> 00:44:30.870
The number of points
is A plus A size
00:44:30.870 --> 00:44:36.540
times the size of A times
A. And the number of lines
00:44:36.540 --> 00:44:41.100
is just the size of A squared.
00:44:41.100 --> 00:44:52.541
So by Szemeredi-Trotter, we find
that the number of incidences
00:44:52.541 --> 00:44:58.250
is lower bounded by
noting this fact here.
00:44:58.250 --> 00:44:59.880
We have many incidences.
00:44:59.880 --> 00:45:07.120
So the number of lines, each
line contributes |A| incidences.
00:45:07.120 --> 00:45:09.100
But we also have an
upper bound coming
00:45:09.100 --> 00:45:11.496
from the
Szemeredi-Trotter theorem.
00:45:11.496 --> 00:45:18.100
So plugging in the upper
bound, we find that you have--
00:45:18.100 --> 00:45:20.080
so now I'm just
directly plugging
00:45:20.080 --> 00:45:22.895
in the statement of
Szemeredi-Trotter.
00:45:26.460 --> 00:45:28.317
The main term is the first term.
00:45:28.317 --> 00:45:30.150
You should still check
the latter two terms,
00:45:30.150 --> 00:45:31.650
but the main term
is the first term.
00:45:34.190 --> 00:45:39.540
So plugging in the
values for P and L,
00:45:39.540 --> 00:45:55.790
we find this is the case,
plus some additional terms,
00:45:55.790 --> 00:45:58.910
which you can check are
dominated by the first term.
00:45:58.910 --> 00:46:00.770
So let me just do
a big O over there.
00:46:03.530 --> 00:46:06.810
Now you put left
and right together,
00:46:06.810 --> 00:46:11.070
and we could obtain
some lower bound
00:46:11.070 --> 00:46:15.090
on the product of the sizes
of the sum set and the product
00:46:15.090 --> 00:46:18.320
set, thereby
yielding Elekes' bound.
00:46:23.710 --> 00:46:26.460
So this is some lower bound
on the sum product problem.
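Written out, the computation is roughly the following sketch. Constants are suppressed, I(P, L) denotes the number of incidences, and \lesssim, \gtrsim (from amssymb) hide absolute constants.

```latex
% |P| = |A+A| |A.A|, |L| = |A|^2, and each line carries at least |A| incidences.
\[
|A|^3 \;=\; |A|\,|L| \;\le\; I(P, L)
\;\lesssim\; |P|^{2/3}\,|L|^{2/3} + |P| + |L|
\;\lesssim\; \bigl(|A+A|\,|A \cdot A|\bigr)^{2/3}\,|A|^{4/3}.
\]
% The last step uses |L| <= |P| <= |L|^2, so the two linear terms are dominated.
\[
|A+A|\,|A \cdot A| \;\gtrsim\; |A|^{5/2},
\qquad
\max\bigl(|A+A|,\,|A \cdot A|\bigr) \;\gtrsim\; |A|^{5/4}.
\]
```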
00:46:26.460 --> 00:46:30.400
And you see, we went through
the crossing number inequality
00:46:30.400 --> 00:46:32.875
to prove Szemeredi-Trotter,
a basic result
00:46:32.875 --> 00:46:34.480
in incidence geometry.
00:46:34.480 --> 00:46:39.580
And viewing sum product as an
incidence geometry problem, one
00:46:39.580 --> 00:46:43.850
can obtain this lower
bound over here.
00:46:43.850 --> 00:46:44.878
Any questions?
00:46:47.950 --> 00:46:51.640
I want to show you a different
proof that was found later,
00:46:51.640 --> 00:46:54.550
that gives an improvement.
00:46:54.550 --> 00:47:01.670
And there's a question,
can you do better than 5/4?
00:47:01.670 --> 00:47:06.490
So it turns out that there was
a very nice result of Solymosi
00:47:06.490 --> 00:47:09.914
sometime later that
gives you an improvement.
00:47:16.210 --> 00:47:21.730
Solymosi proved
in 2009 that if A
00:47:21.730 --> 00:47:29.030
is a subset of positive reals,
then the size of A times
00:47:29.030 --> 00:47:33.070
A multiplied by the
size of A plus A squared
00:47:33.070 --> 00:47:37.540
is at least size of
A to the 4th divided
00:47:37.540 --> 00:47:45.610
by 4 ceiling log of the size
of A, where the log is base 2.
00:47:45.610 --> 00:47:48.600
So don't worry about
the specific constants.
00:47:51.240 --> 00:47:54.640
A being in the positive
reals is no big deal,
00:47:54.640 --> 00:47:58.630
because you can always separate
A as positive and negative
00:47:58.630 --> 00:48:00.790
and analyze each
part separately.
00:48:00.790 --> 00:48:05.260
So as a corollary to
Solymosi's theorem,
00:48:05.260 --> 00:48:12.730
we obtain that for A, a
subset of the reals, the sum
00:48:12.730 --> 00:48:17.110
set and the product set,
at least one of them
00:48:17.110 --> 00:48:24.990
must have size at
least A raised to 4/3
00:48:24.990 --> 00:48:34.820
divided by 2 times log base 2
size of A raised to 1/3.
00:48:34.820 --> 00:48:42.210
So basically, A to the 4/3 minus
little o of 1 in the exponent,
00:48:42.210 --> 00:48:43.370
so better than before.
00:48:43.370 --> 00:48:44.440
And this is a new bound.
00:48:47.580 --> 00:48:52.200
I want to note that in
this formulation, where
00:48:52.200 --> 00:48:58.320
we are looking at lower bounding
this quantity over here,
00:48:58.320 --> 00:49:06.160
this is tight up to
logarithmic factors,
00:49:06.160 --> 00:49:11.710
by considering A to be just
the interval from 1 to n.
00:49:11.710 --> 00:49:14.140
If A is the interval
from 1 to n,
00:49:14.140 --> 00:49:17.230
then the left-hand side, A
plus A, is around size n.
00:49:17.230 --> 00:49:18.660
So you have n squared.
00:49:18.660 --> 00:49:20.260
And A times A is
also, I mentioned,
00:49:20.260 --> 00:49:23.800
around size n squared.
00:49:23.800 --> 00:49:26.170
So this inequality
here is tight.
00:49:26.170 --> 00:49:28.820
The consequence is not tight,
but the first inequality
00:49:28.820 --> 00:49:29.320
is tight.
00:49:34.400 --> 00:49:36.260
So in the remainder
of today's lecture,
00:49:36.260 --> 00:49:39.960
I want to show you how to
prove Solymosi's lower bound.
00:49:39.960 --> 00:49:43.550
And it has some
similarities to the one
00:49:43.550 --> 00:49:50.210
that we've seen, because it also
looks at some geometric aspects
00:49:50.210 --> 00:49:52.610
of the sum product problem.
00:49:52.610 --> 00:49:57.950
But it doesn't use the exact
tools that we've seen earlier.
00:49:57.950 --> 00:50:00.590
It does use some tools
that were related
00:50:00.590 --> 00:50:04.520
to the lecture from Monday.
00:50:04.520 --> 00:50:07.070
So last time, we
discussed this thing
00:50:07.070 --> 00:50:09.830
called the "additive energy."
00:50:09.830 --> 00:50:12.560
You can come up with a similar
notion for the multiplication
00:50:12.560 --> 00:50:24.220
operation, so the
"multiplicative energy,"
00:50:24.220 --> 00:50:31.426
which we'll denote by E sub,
with the multiplication symbol,
00:50:31.426 --> 00:50:35.650
A. So the multiplicative energy
is like the additive energy,
00:50:35.650 --> 00:50:38.028
except that instead
of doing addition,
00:50:38.028 --> 00:50:39.820
we're going to do a
multiplication instead.
00:50:39.820 --> 00:50:45.650
So one way to define it is
the number of quadruples such
00:50:45.650 --> 00:50:55.530
that there exists some real
lambda such that a, comma,
00:50:55.530 --> 00:50:58.680
b equals to lambda c, comma, d.
00:51:06.400 --> 00:51:08.540
So basically the same
as additive energy,
00:51:08.540 --> 00:51:12.530
except that we're using
multiplications instead.
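For a small set of positive integers, this definition can be checked directly. The sketch below counts the quadruples in two ways: by brute force over all (a, b, c, d), and by grouping the pairs (a, b) according to the ratio b/a. The function names are illustrative, and the code assumes the elements are positive integers so that exact fractions can be used.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def energy_brute_force(A):
    """Count (a, b, c, d) in A^4 with (a, b) proportional to (c, d), i.e. a*d == b*c."""
    A = sorted(set(A))
    return sum(1 for a, b, c, d in product(A, repeat=4) if a * d == b * c)


def energy_by_ratio(A):
    """Group pairs (a, b) by the ratio b/a and sum the squares of the group sizes."""
    A = sorted(set(A))
    ratio_counts = Counter(Fraction(b, a) for a, b in product(A, repeat=2))
    return sum(c * c for c in ratio_counts.values())


if __name__ == "__main__":
    A = [1, 2, 4]
    print(energy_brute_force(A), energy_by_ratio(A))
```

For A = {1, 2, 4} both counts come out to 19. The second function is the more useful form: two pairs are proportional exactly when they share the same ratio, so the energy is a sum of squares over ratios.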
00:51:12.530 --> 00:51:15.330
By the Cauchy-Schwarz
inequality--
00:51:15.330 --> 00:51:20.980
and this is a calculation
we saw last time, as well--
00:51:20.980 --> 00:51:27.440
we see that if you have a
set with small product, then
00:51:27.440 --> 00:51:30.280
it must have high
multiplicative energy.
00:51:30.280 --> 00:51:34.030
So last time, we saw small
sum set implies high additive
00:51:34.030 --> 00:51:34.570
energy.
00:51:34.570 --> 00:51:38.560
Likewise, small product set
implies high multiplicative
00:51:38.560 --> 00:51:39.570
energy.
00:51:39.570 --> 00:51:43.690
In particular, the
multiplicative energy of A,
00:51:43.690 --> 00:51:50.440
you can rewrite it as
sum over all elements
00:51:50.440 --> 00:51:55.750
x in the product set of the
quantity, which tells you
00:51:55.750 --> 00:52:02.730
the number of ways to
write x as a product,
00:52:02.730 --> 00:52:05.860
this number squared and
then summed over all x.
00:52:05.860 --> 00:52:09.280
By Cauchy-Schwarz, we find
that this quantity here is lower
00:52:09.280 --> 00:52:12.430
bounded by the size of
A to the 4th divided
00:52:12.430 --> 00:52:20.200
by the size of A times A. So
to prove Solymosi's theorem,
00:52:20.200 --> 00:52:26.710
we are going to actually
prove a bound on the energy,
00:52:26.710 --> 00:52:28.680
instead of proving
it on the set.
00:52:28.680 --> 00:52:30.460
We're going to prove
it on the energy.
00:52:30.460 --> 00:52:42.480
So it suffices to show that
the multiplicative energy is
00:52:42.480 --> 00:52:49.900
at most 4 times the
sum set size squared times--
00:52:49.900 --> 00:53:01.832
so let me divide the
energy by log of A.
00:53:01.832 --> 00:53:04.580
So when you plug this
into this inequality,
00:53:04.580 --> 00:53:05.520
it would imply that.
00:53:05.520 --> 00:53:09.590
So it remains to
show this inequality
00:53:09.590 --> 00:53:12.170
over here upper bounding
the multiplicative energy.
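In symbols, the reduction looks like this. It is a sketch in which E_\times(A) denotes the multiplicative energy and the constant 4 and the ceiling of the base-2 logarithm are taken from the statement of Solymosi's theorem.

```latex
% Lower bound from Cauchy-Schwarz, upper bound still to be proved:
\[
\frac{|A|^4}{|A \cdot A|} \;\le\; E_\times(A)
\;\le\; 4\,|A+A|^2\,\bigl\lceil \log_2 |A| \bigr\rceil .
\]
% Rearranging the two ends gives the theorem:
\[
|A \cdot A|\,|A+A|^2 \;\ge\; \frac{|A|^4}{4\,\lceil \log_2 |A| \rceil}.
\]
```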
00:53:20.730 --> 00:53:22.710
There's an important
idea that we're
00:53:22.710 --> 00:53:25.440
going to use here, which is
also pretty common in analysis,
00:53:25.440 --> 00:53:32.850
is that instead of considering
that energy sum here,
00:53:32.850 --> 00:53:35.880
we're going to
consider a similar sum,
00:53:35.880 --> 00:53:40.110
except we're going to chop up
the sum into pieces according
00:53:40.110 --> 00:53:44.600
to how big the terms
are, so that we're only
00:53:44.600 --> 00:53:48.230
looking at contributions
of comparable size.
00:53:48.230 --> 00:53:50.948
And so this is called a
"dyadic decomposition."
00:54:02.420 --> 00:54:07.900
The idea is that we can write
the multiplicative energy
00:54:07.900 --> 00:54:10.400
similar to above, but
instead of summing over
00:54:10.400 --> 00:54:15.590
x in the product set, let me
sum over s in the quotient set.
00:54:15.590 --> 00:54:21.670
So you can interpret
what this quotient A is.
00:54:21.670 --> 00:54:27.140
This is the set of all a divided
by b, where a and b are in A. A
00:54:27.140 --> 00:54:29.150
is a set of positive
reals, so I don't need
00:54:29.150 --> 00:54:31.400
to worry about division by 0.
00:54:31.400 --> 00:54:36.770
So what remains, then,
is the intersection
00:54:36.770 --> 00:54:41.700
of s times A and A, squared.
00:54:41.700 --> 00:54:47.170
Remember, s times A is scaling
each element of A by s.
00:54:47.170 --> 00:54:49.760
So we have this
quantity over here.
00:54:49.760 --> 00:54:55.070
So I want to break up the
sum into a bunch of smaller
00:54:55.070 --> 00:54:59.990
sums, where I want to
break up the sum according
00:54:59.990 --> 00:55:05.570
to how big the terms are,
so that inside each group,
00:55:05.570 --> 00:55:08.990
all the terms are
roughly of the same size.
00:55:08.990 --> 00:55:11.480
And easiest way to do
this is to chop them up
00:55:11.480 --> 00:55:19.400
into groups where everything
inside the same collection
00:55:19.400 --> 00:55:21.540
differs by at most
a factor of 2.
00:55:21.540 --> 00:55:25.100
So that's why it's called
a dyadic decomposition,
00:55:25.100 --> 00:55:27.740
going from 0 to--
00:55:27.740 --> 00:55:32.960
the maximum possible
here is basically the size of A.
00:55:32.960 --> 00:55:39.410
So let's look at i going
from 0 to log base 2 of the size of A.
00:55:39.410 --> 00:55:41.150
So this is the number of bins.
00:55:44.350 --> 00:55:49.810
And partition the
sum into sub-sums
00:55:49.810 --> 00:55:54.190
where I'm looking at the
i-th sub-sum consisting
00:55:54.190 --> 00:55:58.510
of contributions involving
terms with size between 2
00:55:58.510 --> 00:56:01.700
to the i and 2 to the i plus 1.
00:56:09.530 --> 00:56:15.260
Break up the sum according
to the sizes of the summands.
00:56:15.260 --> 00:56:18.020
By pigeonhole principle,
one of these summands
00:56:18.020 --> 00:56:19.760
must be somewhat large.
00:56:23.520 --> 00:56:33.730
So by pigeonhole,
there exists a k
00:56:33.730 --> 00:56:47.920
such that setting D to be the
s such that that corresponds
00:56:47.920 --> 00:56:52.670
to the k-th term in the sum.
00:57:04.250 --> 00:57:17.790
So one has that this sum coming
from just contributions from D
00:57:17.790 --> 00:57:19.260
is at least--
00:57:23.760 --> 00:57:27.930
so it's at least the
multiplicative energy
00:57:27.930 --> 00:57:29.760
divided by the number of bins.
00:57:36.170 --> 00:57:39.100
All of that many bins--
00:57:39.100 --> 00:57:41.300
by pigeonhole, I
can find one bin
00:57:41.300 --> 00:57:44.510
that's a pretty large
contribution to the sum.
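The dyadic decomposition and the pigeonhole step can be sketched in Python for a small set of positive integers. Everything named here (`dyadic_bins`, `heaviest_bin`) is illustrative. The code assumes the bins are indexed by i with 2^i <= |sA ∩ A| < 2^(i+1), so the bin count it uses may differ by one from the constant in the lecture.

```python
from collections import Counter, defaultdict
from fractions import Fraction
from itertools import product


def dyadic_bins(A):
    """Group ratios s by i, where 2^i <= |sA ∩ A| < 2^(i+1)."""
    A = sorted(set(A))
    # counts[s] = number of pairs (a, b) in A x A with b = s * a = |sA ∩ A|.
    counts = Counter(Fraction(b, a) for a, b in product(A, repeat=2))
    bins = defaultdict(list)
    for s, c in counts.items():
        bins[c.bit_length() - 1].append(s)  # floor(log2(c)) for c >= 1
    return counts, bins


def heaviest_bin(A):
    """Return (k, D, contribution of bin k, total energy) for the heaviest bin."""
    counts, bins = dyadic_bins(A)
    energy = sum(c * c for c in counts.values())
    contribution = {i: sum(counts[s] ** 2 for s in D) for i, D in bins.items()}
    k = max(contribution, key=contribution.get)
    return k, sorted(bins[k]), contribution[k], energy


if __name__ == "__main__":
    k, D, contrib, energy = heaviest_bin([1, 2, 4])
    print(k, D, contrib, energy)
```

By pigeonhole the heaviest bin carries at least the energy divided by the number of nonempty bins, and since every term in bin k is below 2^(k+1), its contribution is at most |D| times 2^(2k+2).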
00:57:44.510 --> 00:57:52.910
And the right-hand side, we can
upper bound each term over here
00:57:52.910 --> 00:57:57.470
by 2 to the 2k plus
2, and the number
00:57:57.470 --> 00:58:05.590
of terms as the size of D. Let
me call the elements of D S1
00:58:05.590 --> 00:58:16.820
through Sm, where S1 through Sm
are sorted in increasing order.
00:58:22.880 --> 00:58:26.960
Now let me draw you a
picture of what's going on.
00:58:26.960 --> 00:58:40.050
Let's consider for each element
of D, so for each i from 1 to m,
00:58:40.050 --> 00:58:48.230
let's consider the line given
by the equation y equals to s
00:58:48.230 --> 00:58:51.960
sub i times x.
00:58:51.960 --> 00:58:56.290
Let me draw this
picture where I'm
00:58:56.290 --> 00:59:01.260
looking at the
positive quadrant,
00:59:01.260 --> 00:59:03.640
so I have a bunch of points
in the positive quadrant.
00:59:10.180 --> 00:59:13.110
And specifically, I'm
interested in these points whose
00:59:13.110 --> 00:59:18.205
coordinates, both coordinates
are elements of A.
00:59:18.205 --> 00:59:26.580
And I want to consider
lines through points of A,
00:59:26.580 --> 00:59:28.950
but I want to
consider lines where
00:59:28.950 --> 00:59:34.380
it intersects this A cross A in
the desired number of points.
00:59:37.030 --> 00:59:39.400
And we find those
set, and then let's
00:59:39.400 --> 00:59:48.120
draw these lines over here,
where this line here, L1
00:59:48.120 --> 00:59:53.640
has slope exactly S1,
and L2, L3, and so on.
00:59:58.990 --> 01:00:03.820
I want to draw one more line,
which is somewhat auxiliary,
01:00:03.820 --> 01:00:06.770
but just to make our
life a bit easier.
01:00:06.770 --> 01:00:13.240
Finally, let's let L of m
plus 1 be the vertical line,
01:00:13.240 --> 01:00:21.030
or rather be the
vertical ray, which
01:00:21.030 --> 01:00:31.230
goes to the minimum
element of A above Lm.
01:00:31.230 --> 01:00:34.660
So it's this line over here.
01:00:34.660 --> 01:00:36.370
That's Lm plus 1.
01:00:40.550 --> 01:00:44.920
So in A cross A, I
draw a bunch of lines.
01:00:44.920 --> 01:00:46.650
So now all the lines--
01:00:46.650 --> 01:00:50.700
so all these lines involve
some point of A and the origin,
01:00:50.700 --> 01:00:51.930
but I don't draw all of them.
01:00:51.930 --> 01:00:54.600
I draw a select set of them.
01:00:54.600 --> 01:00:59.620
And what we said earlier says
that the number of lines,
01:00:59.620 --> 01:01:02.880
the number of points on
each of these drawn lines,
01:01:02.880 --> 01:01:05.730
is roughly the same for
each of these lines.
01:01:12.350 --> 01:01:17.240
Let's let capital L
sub j denote the set
01:01:17.240 --> 01:01:24.350
of points in A cross A
that lie on the j-th line.
01:01:32.320 --> 01:01:37.930
So that's L1, L2, and so on.
01:01:41.230 --> 01:01:49.420
I claim that if you look
at two consecutive lines
01:01:49.420 --> 01:01:55.390
and look at the sum set
of the points in A cross A
01:01:55.390 --> 01:02:00.500
that intersect, you're
looking at two lines,
01:02:00.500 --> 01:02:02.550
and you're adding up
points on those two lines.
01:02:02.550 --> 01:02:05.680
So you form a grid.
01:02:05.680 --> 01:02:13.210
So you end up forming this grid.
01:02:13.210 --> 01:02:15.940
And the number of
points on this grid
01:02:15.940 --> 01:02:19.630
is precisely the product
of these two point sets.
01:02:28.940 --> 01:02:42.650
Moreover, the sets Lj
plus L sub j plus 1
01:02:42.650 --> 01:02:48.050
are disjoint for different j.
01:02:52.660 --> 01:02:56.440
And this is where we're using
the geometry of the plane here.
01:02:56.440 --> 01:03:02.960
Because the sum of L1
and L2 lies in the span,
01:03:02.960 --> 01:03:06.470
the sum of L2 and L3
in a different span,
01:03:06.470 --> 01:03:09.370
so they cannot intersect.
01:03:09.370 --> 01:03:11.430
So they lie in--
01:03:11.430 --> 01:03:38.370
so since they span disjoint
regions, L1 plus L2 lies here,
01:03:38.370 --> 01:03:41.590
L2 plus L3 lies
there, and so on.
01:03:41.590 --> 01:03:42.780
But they're all disjoint.
01:03:51.180 --> 01:03:53.700
Now let's put everything
that we know together.
01:03:58.720 --> 01:04:06.550
Remember, the goal is to upper
bound the multiplicative energy
01:04:06.550 --> 01:04:09.840
as a function of the sum set.
01:04:09.840 --> 01:04:12.300
So in other words, we want
to lower bound the sum set.
01:04:12.300 --> 01:04:19.360
So I want to show you that this
A plus A has a lot of elements.
01:04:19.360 --> 01:04:21.690
There's a lot of sums.
01:04:21.690 --> 01:04:25.360
And I have a bunch of disjoint
contributions to these sums.
01:04:25.360 --> 01:04:28.590
So let's add up those disjoint
contributions to the sums.
01:04:32.883 --> 01:04:38.220
You see that the size
of A plus A squared
01:04:38.220 --> 01:04:42.990
is the same as the size of
the set A cross A plus A cross A.
01:04:42.990 --> 01:04:45.540
So this is Cartesian product.
01:04:45.540 --> 01:04:55.110
Here is-- this is a
Cartesian product,
01:04:55.110 --> 01:04:58.740
in other words, the grid
that is drawn up there.
01:04:58.740 --> 01:05:01.150
I add this product to itself.
01:05:04.753 --> 01:05:06.170
So I should get
the same set here.
01:05:09.660 --> 01:05:11.800
But how big is this sum set?
01:05:11.800 --> 01:05:15.290
That grid, that lattice
grid added to itself,
01:05:15.290 --> 01:05:16.180
how big should it be?
01:05:16.180 --> 01:05:19.220
I want to lower bound
the number of sums.
01:05:19.220 --> 01:05:23.030
And the key observation
is up there.
01:05:23.030 --> 01:05:27.880
We can look at contributions
coming from distinct spans.
01:05:27.880 --> 01:05:34.980
In particular, this sum
here, so this sum set here,
01:05:34.980 --> 01:05:45.260
size is lower bounded by these
distinct Lj plus L j plus 1's.
01:05:45.260 --> 01:05:46.210
I threw away a lot.
01:05:46.210 --> 01:05:49.430
I only keep the lines on the
L's, and I only consider sums
01:05:49.430 --> 01:05:52.330
between consecutive L's.
01:05:52.330 --> 01:05:54.955
That should be a
lower bound to the sum
01:05:54.955 --> 01:05:56.280
set of the grid with itself.
01:05:58.860 --> 01:06:03.425
But you see, and here, we're
using these different--
01:06:03.425 --> 01:06:08.460
for different j's, these
contributions are disjoint.
01:06:08.460 --> 01:06:15.640
But by what we said up there,
Lj plus L j plus 1 is a grid.
01:06:15.640 --> 01:06:20.950
So it has size Lj
times L j plus 1.
01:06:23.900 --> 01:06:32.900
And the size of each Lj
is at least 2 to the k.
01:06:32.900 --> 01:06:38.150
So the sum here is at
least m times 2 to the 2k.
01:06:41.410 --> 01:06:48.340
But we saw over here that
the energy lower bounds
01:06:48.340 --> 01:06:50.290
this m times 2 to the 2k.
01:06:50.290 --> 01:06:58.060
So we have a lower bound that is
the multiplicative energy of A
01:06:58.060 --> 01:07:05.407
divided by 4 times the log
base 2 of the size of A.
01:07:05.407 --> 01:07:07.490
So don't worry so much
about the constant factors.
01:07:07.490 --> 01:07:13.000
That's just the order of
magnitude that is important.
01:07:13.000 --> 01:07:14.680
And that's it.
01:07:14.680 --> 01:07:15.309
Yep.
01:07:15.309 --> 01:07:20.130
AUDIENCE: How do you know that
the size of big L sub m plus 1?
01:07:20.130 --> 01:07:20.880
YUFEI ZHAO: Great.
01:07:20.880 --> 01:07:24.200
The question is, what do we
know about the size of big L sub
01:07:24.200 --> 01:07:25.172
m plus 1?
01:07:25.172 --> 01:07:26.130
So that's a good point.
01:07:28.940 --> 01:07:30.543
The easiest answer
is, if I don't
01:07:30.543 --> 01:07:31.960
care about these
constant factors,
01:07:31.960 --> 01:07:34.660
I don't need to worry about it.
01:07:34.660 --> 01:07:40.060
You can think about what
is the number of points
01:07:40.060 --> 01:07:47.430
on this line above that.
01:07:47.430 --> 01:07:53.110
It's essentially the number of
elements of A above the biggest
01:07:53.110 --> 01:07:57.450
element of s m, above s m.
01:08:01.197 --> 01:08:03.123
It's a good question.
01:08:03.123 --> 01:08:04.790
I think we don't need
to worry about it.
01:08:04.790 --> 01:08:08.935
I'm being slightly sloppy here.
01:08:08.935 --> 01:08:10.915
Yeah.
01:08:10.915 --> 01:08:16.890
AUDIENCE: [INAUDIBLE]
01:08:16.890 --> 01:08:18.390
YUFEI ZHAO: I think
the question is,
01:08:18.390 --> 01:08:22.540
how do we know for j equals to
m that you have this bound over
01:08:22.540 --> 01:08:23.040
here?
01:08:25.866 --> 01:08:33.770
AUDIENCE: [INAUDIBLE]
01:08:33.770 --> 01:08:34.520
YUFEI ZHAO: Great.
01:08:34.520 --> 01:08:35.260
So yes.
01:08:38.224 --> 01:08:48.877
AUDIENCE: [INAUDIBLE]
01:08:48.877 --> 01:08:50.710
YUFEI ZHAO: So there
are some ways to do it.
01:08:50.710 --> 01:08:53.010
You can notice that
the vertical line
01:08:53.010 --> 01:09:00.340
has at least as many points
as the first slanted line.
01:09:03.420 --> 01:09:08.700
So details that you can work on.
01:09:08.700 --> 01:09:11.109
So this proves
Solymosi's theorem,
01:09:11.109 --> 01:09:16.470
which gives you a lower bound
on the sum set and the product
01:09:16.470 --> 01:09:19.600
set sizes and the
maximum of those two.
01:09:19.600 --> 01:09:21.282
It's based on-- it's very short.
01:09:21.282 --> 01:09:21.990
It's very clever.
01:09:21.990 --> 01:09:24.540
It took a long time to find.
01:09:24.540 --> 01:09:28.649
And it gave a bound
on the sum product
01:09:28.649 --> 01:09:32.370
problem of 4/3 that
actually remained
01:09:32.370 --> 01:09:36.810
stuck for a very
long time, until just
01:09:36.810 --> 01:09:48.420
fairly recently there was
an improvement that gives--
01:09:48.420 --> 01:09:52.350
so by Konyagin
and Shkredov where
01:09:52.350 --> 01:09:58.260
they improved the Solymosi
bound from 4/3 to 4/3
01:09:58.260 --> 01:10:00.570
plus some really
small constant c.
01:10:04.300 --> 01:10:06.530
So it's some explicit constant.
01:10:06.530 --> 01:10:09.040
I think right now-- so that's
being improved over time,
01:10:09.040 --> 01:10:12.780
but right now, I think
c is around 1 over 1,000
01:10:12.780 --> 01:10:13.730
or a few thousand.
01:10:13.730 --> 01:10:17.600
So it's some small
but explicit constant.
01:10:17.600 --> 01:10:22.170
It remains a major open
problem to improve this bound
01:10:22.170 --> 01:10:25.730
and prove the Erdos-Szemeredi
conjecture,
01:10:25.730 --> 01:10:31.070
that if you have n elements,
then one of the sums
01:10:31.070 --> 01:10:33.830
or products must be
nearly quadratic in size.
01:10:33.830 --> 01:10:36.650
And people generally believe
that that's the case.
01:10:39.320 --> 01:10:40.468
Any questions?
01:10:47.950 --> 01:10:51.410
So this concludes all the topics
I want to cover in this course.
01:10:51.410 --> 01:10:52.470
So we went a long way.
01:10:52.470 --> 01:10:54.110
And so at the beginning
of this course,
01:10:54.110 --> 01:10:56.860
we started with
extremal graph theory,
01:10:56.860 --> 01:11:00.760
looking at the basic problem
of if you have a graph that
01:11:00.760 --> 01:11:05.830
doesn't contain some
subgraph, triangle, C4, what's
01:11:05.830 --> 01:11:08.440
the maximum number of edges.
01:11:08.440 --> 01:11:11.350
In fact, that showed
up even today.
01:11:11.350 --> 01:11:13.270
And then we went
on to other tools,
01:11:13.270 --> 01:11:15.640
like Szemeredi's
regularity lemma
01:11:15.640 --> 01:11:18.250
that allows us to deduce
important arithmetic
01:11:18.250 --> 01:11:20.710
consequences, such
as Roth's theorem.
01:11:20.710 --> 01:11:22.210
It's also an extremal
problem: if you
01:11:22.210 --> 01:11:25.300
have a set without a three-term
arithmetic progression,
01:11:25.300 --> 01:11:28.880
how many elements can it have?
01:11:28.880 --> 01:11:31.900
And so the important tool of
Szemeredi's regularity lemma
01:11:31.900 --> 01:11:34.270
then later showed up
in many different ways
01:11:34.270 --> 01:11:36.490
in this course,
especially the message
01:11:36.490 --> 01:11:38.800
of Szemeredi's regularity
lemma, that when
01:11:38.800 --> 01:11:41.290
you look at an object, it's
important to decompose it
01:11:41.290 --> 01:11:44.800
into its structural component
and its pseudo-random
01:11:44.800 --> 01:11:45.970
component.
01:11:45.970 --> 01:11:48.340
So this dichotomy,
this interplay
01:11:48.340 --> 01:11:50.590
between structure and
pseudo-randomness,
01:11:50.590 --> 01:11:54.190
is a key theme
throughout this course.
01:11:54.190 --> 01:11:56.170
And it showed up in
some of the later topics
01:11:56.170 --> 01:11:59.920
as well, when we discussed
spectral graph theory,
01:11:59.920 --> 01:12:03.250
quasi-randomness,
graph limits, and also
01:12:03.250 --> 01:12:07.300
in the later Fourier analytic
proof of Roth's theorem.
01:12:07.300 --> 01:12:09.490
All of these proofs,
all of these techniques,
01:12:09.490 --> 01:12:12.250
involve some kind of
interplay between structure
01:12:12.250 --> 01:12:15.990
and pseudo-randomness.
01:12:15.990 --> 01:12:17.700
In the past month
or so, we've been
01:12:17.700 --> 01:12:21.240
looking at Freiman's
theorem, this key result
01:12:21.240 --> 01:12:24.840
in additive combinatorics
concerning the structure
01:12:24.840 --> 01:12:27.020
of sets under addition.
01:12:27.020 --> 01:12:30.450
And there, we also saw many
different tools that came up,
01:12:30.450 --> 01:12:33.720
and also connections I
mentioned a few lectures ago,
01:12:33.720 --> 01:12:35.490
connections to really
important results
01:12:35.490 --> 01:12:38.220
in geometry and group theory.
01:12:38.220 --> 01:12:41.190
And it really
extends all around.
01:12:41.190 --> 01:12:43.710
And a few takeaways
from this course--
01:12:43.710 --> 01:12:47.260
one of them is that graph
theory, additive combinatorics,
01:12:47.260 --> 01:12:49.030
they are not isolated subjects.
01:12:49.030 --> 01:12:52.143
They're connected to a
lot within mathematics.
01:12:52.143 --> 01:12:54.060
And that's one of the
goals I want to show you
01:12:54.060 --> 01:12:57.600
in this course, is to
show these connections
01:12:57.600 --> 01:13:01.170
throughout mathematics,
some to analysis, to geometry,
01:13:01.170 --> 01:13:02.250
to topology.
01:13:02.250 --> 01:13:05.550
And even simple
questions can lead
01:13:05.550 --> 01:13:08.790
to really deep mathematics.
01:13:08.790 --> 01:13:11.280
And some of them I try to
show you, try to hint at you,
01:13:11.280 --> 01:13:13.980
or at least I mentioned
throughout this course.
01:13:13.980 --> 01:13:20.220
And what we've seen so far is
just the tip of the iceberg.
01:13:20.220 --> 01:13:23.430
And there is still a lot of
extremely exciting work
01:13:23.430 --> 01:13:24.600
that's to be done.
01:13:24.600 --> 01:13:28.500
And I've also tried to emphasize
many important open problems
01:13:28.500 --> 01:13:32.370
that have yet to be
better understood.
01:13:32.370 --> 01:13:36.373
And I expect in some future
iteration of this course,
01:13:36.373 --> 01:13:38.040
some of these problems
will be resolved,
01:13:38.040 --> 01:13:41.460
and I can show the next
generation of students
01:13:41.460 --> 01:13:43.740
in your seats some new
techniques, new methods,
01:13:43.740 --> 01:13:44.752
and new theorems.
01:13:44.752 --> 01:13:46.210
And I expect that
will be the case.
01:13:46.210 --> 01:13:47.670
This is a very exciting area.
01:13:47.670 --> 01:13:50.200
And it's an area that is
very close to my heart.
01:13:50.200 --> 01:13:54.170
It's something that I've been
thinking about since my PhD.
01:13:54.170 --> 01:13:56.730
The bulk of my
research work revolves
01:13:56.730 --> 01:13:59.460
around better understanding
connections between graph
01:13:59.460 --> 01:14:02.130
theory, on one hand, and
additive combinatorics
01:14:02.130 --> 01:14:03.470
on the other hand.
01:14:03.470 --> 01:14:05.220
It's been really fun
teaching this course,
01:14:05.220 --> 01:14:07.680
and happy to have
all of you here.
01:14:07.680 --> 01:14:08.580
Thank you.
01:14:08.580 --> 01:14:11.630
[APPLAUSE]