WEBVTT
00:00:18.590 --> 00:00:21.720
PROFESSOR: Last time, we
started discussing graph limits.
00:00:21.720 --> 00:00:24.908
And let me remind you of some of
the notions and definitions
00:00:24.908 --> 00:00:25.700
that were involved.
00:00:35.590 --> 00:00:37.490
One of the main
objects in graph limits
00:00:37.490 --> 00:00:46.670
is that of a graphon, which is a
symmetric, measurable function
00:00:46.670 --> 00:00:49.490
from the unit square
to the unit interval.
00:00:58.890 --> 00:01:02.570
So here, symmetric means
that w(x, y)
00:01:02.570 --> 00:01:04.670
equals w(y, x).
00:01:09.810 --> 00:01:11.520
We define a notion
of convergence
00:01:11.520 --> 00:01:13.980
for a sequence of graphons.
00:01:13.980 --> 00:01:21.080
And remember, the
notion of convergence
00:01:21.080 --> 00:01:33.330
is that a sequence is convergent
if the sequence of homomorphism
00:01:33.330 --> 00:01:43.330
densities converges as n goes
to infinity for every fixed
00:01:43.330 --> 00:01:45.680
F, every fixed graph.
00:01:49.480 --> 00:01:52.180
So this is how we
define convergence.
00:01:52.180 --> 00:01:53.920
So a sequence of
graphs or graphons,
00:01:53.920 --> 00:01:58.360
they converge if all the
homomorphism densities--
00:01:58.360 --> 00:02:01.200
so you should think of this
as subgraph statistics--
00:02:01.200 --> 00:02:04.520
if all of these
statistics converge.
00:02:04.520 --> 00:02:10.180
We also say that a sequence
converges to a particular limit
00:02:10.180 --> 00:02:16.180
if these homomorphism
densities converge
00:02:16.180 --> 00:02:20.170
to the corresponding
homomorphism density
00:02:20.170 --> 00:02:24.510
of the limit for every F.
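The definition of convergence just described can be written out as follows. This is a sketch in notation that is assumed here rather than taken from the spoken lecture: t(F, W) denotes the homomorphism density of a fixed graph F in a graphon W, and W_n is the sequence of graphons.

```latex
\[
  t(F, W) \;=\; \int_{[0,1]^{V(F)}} \prod_{ij \in E(F)} W(x_i, x_j) \prod_{i \in V(F)} dx_i ,
\]
\[
  (W_n) \text{ is convergent} \iff \lim_{n \to \infty} t(F, W_n) \text{ exists for every fixed graph } F ,
\]
\[
  W_n \to W \iff \lim_{n \to \infty} t(F, W_n) = t(F, W) \text{ for every fixed graph } F .
\]
```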
00:02:24.510 --> 00:02:25.010
OK.
00:02:25.010 --> 00:02:27.740
So this is how we
define convergence.
00:02:27.740 --> 00:02:29.870
We also define this
notion of a distance.
00:02:33.140 --> 00:02:35.170
And to do that, we
first define the cut
00:02:35.170 --> 00:02:41.900
norm to be the following
quantity defined
00:02:41.900 --> 00:02:49.340
by taking two subsets, S
and T, which are measurable.
00:02:49.340 --> 00:02:51.890
Everything so far is
going to be measurable.
00:02:51.890 --> 00:02:55.820
And look at what is the
maximum possible deviation
00:02:55.820 --> 00:03:00.350
of the integral of this
function on this box, S cross T.
00:03:00.350 --> 00:03:03.800
And here, w, you should think
of it as taking real values,
00:03:03.800 --> 00:03:06.133
allowing both positive
and negative values,
00:03:06.133 --> 00:03:07.550
because otherwise,
you should just
00:03:07.550 --> 00:03:11.410
take S and T to be
the whole interval.
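Written as a formula, the cut norm described above would read as below. The symbol for the norm is an assumed notation; here W is allowed to take real values, and S and T range over measurable subsets of [0, 1].

```latex
\[
  \|W\|_{\square} \;=\; \sup_{S,\,T \subseteq [0,1]}
  \left| \int_{S \times T} W(x, y) \, dx \, dy \right| .
\]
```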
00:03:11.410 --> 00:03:12.850
OK.
00:03:12.850 --> 00:03:14.950
And this definition
was motivated
00:03:14.950 --> 00:03:19.620
by our discussion of discrepancy
coming from quasirandomness.
00:03:19.620 --> 00:03:22.280
Now, if I give you
two graphs or graphons
00:03:22.280 --> 00:03:24.170
and ask you to
compare them, you are
00:03:24.170 --> 00:03:28.550
allowed to permute the
vertices in some sense,
00:03:28.550 --> 00:03:31.140
so to find the best overlay.
00:03:31.140 --> 00:03:34.040
And that notion is
captured in the definition
00:03:34.040 --> 00:03:40.610
of cut distance, which is
defined to be the following
00:03:40.610 --> 00:03:53.540
quantity, where we take the infimum over
all possible measure-preserving
00:03:53.540 --> 00:04:10.470
bijections from the interval
to itself of the cut norm of the difference
00:04:10.470 --> 00:04:14.130
between these two
graphons if I rotate
00:04:14.130 --> 00:04:18.750
one of them using this
measure-preserving bijection.
00:04:26.460 --> 00:04:29.175
So think of this as
permuting the vertices.
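As a formula, the cut distance described above can be sketched as follows. The notation is assumed: the infimum runs over measure-preserving bijections phi of [0, 1], and U^phi denotes the graphon U with its vertices relabeled by phi.

```latex
\[
  \delta_{\square}(W, U) \;=\; \inf_{\varphi} \, \| W - U^{\varphi} \|_{\square},
  \qquad
  U^{\varphi}(x, y) \;=\; U(\varphi(x), \varphi(y)) .
\]
```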
00:04:36.130 --> 00:04:39.660
So these were the definitions
that were involved last time.
00:04:39.660 --> 00:04:41.410
And at the end of
last lecture, I
00:04:41.410 --> 00:04:45.060
stated three main theorems
of graph limit theory.
00:04:45.060 --> 00:04:47.230
I forgot to
mention some
00:04:47.230 --> 00:04:49.820
of the history of this theory.
00:04:49.820 --> 00:04:52.360
So there were a number
of important papers
00:04:52.360 --> 00:04:57.250
that developed this very idea of
graph limits, which is actually
00:04:57.250 --> 00:05:00.100
somewhat-- if you think
about all of combinatorics,
00:05:00.100 --> 00:05:02.830
we like to deal with
discrete objects.
00:05:02.830 --> 00:05:06.610
And even the idea of taking
a limit is rather novel.
00:05:06.610 --> 00:05:11.830
So this work is due
to a number of people.
00:05:11.830 --> 00:05:14.830
In particular, László Lovász
played a very important
00:05:14.830 --> 00:05:17.200
central role in the
development of this theory.
00:05:17.200 --> 00:05:19.480
And various people
came to this theory
00:05:19.480 --> 00:05:21.460
from different
perspectives-- some
00:05:21.460 --> 00:05:24.160
from more pure
perspectives, and some
00:05:24.160 --> 00:05:26.290
from more applied perspectives.
00:05:26.290 --> 00:05:29.810
And this theory is now getting
used in more and more places,
00:05:29.810 --> 00:05:33.030
including statistics,
machine learning, and so on.
00:05:33.030 --> 00:05:37.990
And I'll explain where that
comes up in just a little bit.
00:05:37.990 --> 00:05:40.870
At the end of last lecture,
I stated three main theorems.
00:05:40.870 --> 00:05:44.560
And what I want to do
today is develop some tools
00:05:44.560 --> 00:05:47.777
so that we can prove those
theorems in the next lecture.
00:05:47.777 --> 00:05:48.277
OK.
00:05:48.277 --> 00:05:49.910
So I want to develop some tools.
00:05:49.910 --> 00:05:52.510
In particular, you'll see some
of the things that we've talked
00:05:52.510 --> 00:05:55.960
about in the chapter on
Szemerédi's regularity lemma
00:05:55.960 --> 00:05:59.140
come up again in a slightly
different language.
00:05:59.140 --> 00:06:02.320
So much of what I will say
today hopefully should already
00:06:02.320 --> 00:06:04.660
be familiar to you, but
you will see it again
00:06:04.660 --> 00:06:08.690
from the perspective
of graph limits.
00:06:08.690 --> 00:06:11.257
But first, before telling
you about the tools,
00:06:11.257 --> 00:06:12.840
I want to give you
some more examples.
00:06:15.580 --> 00:06:17.970
So one of the ways that
I motivated graph limits
00:06:17.970 --> 00:06:22.380
last time is this example of
an Erdős-Rényi random graph
00:06:22.380 --> 00:06:25.470
or a sequence of quasi-random
graphs converging
00:06:25.470 --> 00:06:26.540
to a constant.
00:06:26.540 --> 00:06:30.690
The constant graphon
is the limit.
00:06:30.690 --> 00:06:32.170
But what about generalizations?
00:06:32.170 --> 00:06:34.590
What about generalizations
of that construction when
00:06:34.590 --> 00:06:37.500
your limit is not the constant?
00:06:37.500 --> 00:06:43.530
So this leads to this idea
of a w random graph, which
00:06:43.530 --> 00:06:49.250
generalizes that of an
Erdős-Rényi random graph.
00:06:49.250 --> 00:06:58.390
So in Erdős-Rényi, we're
looking at every edge occurring
00:06:58.390 --> 00:07:03.260
with the same probability, p,
uniform throughout the graph.
00:07:03.260 --> 00:07:07.250
But what I want to do now is
allow you to change the edge
00:07:07.250 --> 00:07:08.920
probability somewhat.
00:07:08.920 --> 00:07:09.420
OK.
00:07:12.288 --> 00:07:14.330
So before giving you the
more general definition,
00:07:14.330 --> 00:07:19.160
a special case of this
is an important model
00:07:19.160 --> 00:07:22.090
of random graphs known as
the stochastic block model.
00:07:25.802 --> 00:07:31.700
And in particular, a two-block
model consists of the following
00:07:31.700 --> 00:07:39.330
data where I am looking
at two types of vertices--
00:07:44.750 --> 00:07:46.030
let's call them red and blue--
00:07:49.650 --> 00:07:54.830
where the vertices are
assigned to colors at random--
00:08:01.570 --> 00:08:03.050
for example, 50/50.
00:08:03.050 --> 00:08:05.690
But any other
probability is fine.
00:08:05.690 --> 00:08:10.330
And now I put down the edges
according to which colors
00:08:10.330 --> 00:08:12.340
the two endpoints are.
00:08:12.340 --> 00:08:23.500
So two red vertices are joined
with edge probability Prr.
00:08:23.500 --> 00:08:26.920
If I have a red
and a blue, then I
00:08:26.920 --> 00:08:32.380
may have a different probability
joining them, and likewise
00:08:32.380 --> 00:08:38.400
with blue-blue, like that.
00:08:38.400 --> 00:08:40.960
So in other words, I can encode
this probability information
00:08:40.960 --> 00:08:50.890
in the matrix, like that.
00:08:50.890 --> 00:08:54.800
So it's symmetric
across the diagonal.
00:08:54.800 --> 00:08:57.220
So this is a slightly
more general version
00:08:57.220 --> 00:08:59.950
of an Erdős-Rényi
random graph where now I
00:08:59.950 --> 00:09:02.320
have potentially different
types of vertices.
00:09:02.320 --> 00:09:04.090
And you can imagine
these kinds of models
00:09:04.090 --> 00:09:06.010
are very important in
applied mathematics
00:09:06.010 --> 00:09:09.740
for modeling certain situations
such as, for example,
00:09:09.740 --> 00:09:14.890
if you have people with
different political party
00:09:14.890 --> 00:09:16.270
affiliations.
00:09:16.270 --> 00:09:19.890
How likely are they
to talk to each other?
00:09:19.890 --> 00:09:22.050
So you can imagine
some of these numbers
00:09:22.050 --> 00:09:24.700
might be bigger than others.
00:09:24.700 --> 00:09:27.360
And there's an important
statistical problem.
00:09:27.360 --> 00:09:31.050
If I give you a graph, can
you cluster or classify
00:09:31.050 --> 00:09:33.330
the vertices according
to their types
00:09:33.330 --> 00:09:36.690
if I do not show you in advance
what the colors are but show
00:09:36.690 --> 00:09:39.340
you what the output graph is?
00:09:39.340 --> 00:09:41.490
So these are important
statistical questions
00:09:41.490 --> 00:09:45.750
with lots of applications.
00:09:45.750 --> 00:09:48.570
This is an example where
you have only two blocks.
00:09:48.570 --> 00:09:52.030
But of course, you can
have more than two blocks.
00:09:52.030 --> 00:09:55.810
And the graphon context
tells us that we should not
00:09:55.810 --> 00:09:58.540
limit ourselves to just blocks.
00:09:58.540 --> 00:10:02.200
If I give you any
graphon w, I can also
00:10:02.200 --> 00:10:06.040
construct a random graph.
00:10:06.040 --> 00:10:08.980
So what I would like
to do is to consider
00:10:08.980 --> 00:10:12.080
the following
construction where--
00:10:12.080 --> 00:10:19.420
OK, so let's just call
it a w random graph, denoted
00:10:19.420 --> 00:10:23.920
by G(n, w)--
00:10:23.920 --> 00:10:28.510
where I form the graph
using the following process.
00:10:28.510 --> 00:10:34.480
First, the vertex set is
labeled by 1 through n.
00:10:34.480 --> 00:10:44.640
And let me draw the vertex types
by taking uniform random x1
00:10:44.640 --> 00:10:46.946
through xn--
00:10:46.946 --> 00:10:51.080
OK, so uniform iid.
00:10:51.080 --> 00:10:54.170
So you think of them as the
vertex colors, the vertex
00:10:54.170 --> 00:10:55.560
types.
00:10:55.560 --> 00:11:03.440
And I put an edge
between i and j
00:11:03.440 --> 00:11:10.834
with probability
exactly w of xi,
00:11:10.834 --> 00:11:17.382
xj, so for all i less
than j independently.
00:11:21.160 --> 00:11:23.950
That's the definition
of a w random graph.
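The sampling process just described can be sketched in a few lines of Python. This is an illustrative sketch, not code from the lecture: the function names sample_w_random_graph and two_block_graphon, the parameter red_fraction, and the example probabilities are all assumptions made for the example.

```python
import random


def sample_w_random_graph(n, w, rng=None):
    """Sample an n-vertex w random graph; returns (vertex types, edge set)."""
    rng = rng or random.Random()
    # Step 1: draw the vertex types x_1, ..., x_n i.i.d. uniform on [0, 1).
    xs = [rng.random() for _ in range(n)]
    edges = set()
    # Step 2: for each pair i < j, independently include the edge
    # with probability w(x_i, x_j).
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < w(xs[i], xs[j]):
                edges.add((i, j))
    return xs, edges


def two_block_graphon(p_rr, p_rb, p_bb, red_fraction=0.5):
    """Step graphon for a two-block model (hypothetical helper).

    Types below red_fraction are treated as 'red', the rest as 'blue'.
    """
    def w(x, y):
        x_red, y_red = x < red_fraction, y < red_fraction
        if x_red and y_red:
            return p_rr
        if not x_red and not y_red:
            return p_bb
        return p_rb
    return w


if __name__ == "__main__":
    w = two_block_graphon(0.8, 0.1, 0.6)  # example values, chosen arbitrarily
    xs, edges = sample_w_random_graph(200, w, random.Random(0))
    print("number of edges:", len(edges))
```

Taking w to be a constant function p recovers the Erdős-Rényi model, and the two-block helper corresponds to the red-blue picture.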
00:11:23.950 --> 00:11:26.790
And the two-block
stochastic block model
00:11:26.790 --> 00:11:29.470
is a special case of
this w random graph
00:11:29.470 --> 00:11:31.720
for the graphon,
which corresponds
00:11:31.720 --> 00:11:35.310
to this red-blue picture here.
00:11:38.650 --> 00:11:49.300
So the generation process would
be I give you some x1, x2, x3,
00:11:49.300 --> 00:11:57.250
and then, likewise, x1, x3, x2.
00:11:57.250 --> 00:12:01.260
And then I evaluate, what
is the value of this graphon
00:12:01.260 --> 00:12:02.750
at these points?
00:12:11.450 --> 00:12:15.080
And those are my
edge probabilities.
00:12:15.080 --> 00:12:17.420
So what I described
is a special case
00:12:17.420 --> 00:12:19.580
of this general w random graph.
00:12:22.460 --> 00:12:25.570
Any questions?
00:12:25.570 --> 00:12:28.390
So like before, an important
statistical question
00:12:28.390 --> 00:12:31.250
is if I show you
the graph, can you
00:12:31.250 --> 00:12:37.210
tell me a good model for
where this graph came from?
00:12:37.210 --> 00:12:41.460
So that's one of the reasons
why people in applied math
00:12:41.460 --> 00:12:45.970
might care about these
types of constructions.
00:12:45.970 --> 00:12:47.350
Let me talk about some theorems.
00:12:51.050 --> 00:12:54.800
I've told you that the sequence
of Erdős-Rényi random graphs
00:12:54.800 --> 00:12:57.770
converges to the
constant graphon p.
00:12:57.770 --> 00:13:01.190
So instead of taking
a constant graphon p,
00:13:01.190 --> 00:13:04.190
now I start with w random graph.
00:13:04.190 --> 00:13:06.860
And you should expect,
and it is indeed true,
00:13:06.860 --> 00:13:12.500
that this sequence converges
to w as their limit.
00:13:12.500 --> 00:13:14.190
So let w be a graphon.
00:13:21.830 --> 00:13:28.450
And for each n, let me
draw this graph G sub
00:13:28.450 --> 00:13:34.472
n using the w random
graph model independently.
00:13:37.640 --> 00:13:47.810
Then with probability
1, the sequence
00:13:47.810 --> 00:13:50.480
converges to the graphon w.
00:13:53.680 --> 00:13:58.190
That is, it converges in the sense
described above.
00:13:58.190 --> 00:14:01.640
So this statement
tells us a couple
00:14:01.640 --> 00:14:04.900
of things-- one, that w random
graphs converge to the limit w,
00:14:04.900 --> 00:14:12.400
as you should expect; and
two, that every graphon w
00:14:12.400 --> 00:14:17.750
is the limit point of
some sequence of graphs.
00:14:17.750 --> 00:14:20.650
So this is something that
we never quite explicitly
00:14:20.650 --> 00:14:21.950
stated before.
00:14:21.950 --> 00:14:24.980
So let me make this remark.
00:14:24.980 --> 00:14:39.670
So in particular,
every w is the limit
00:14:39.670 --> 00:14:47.998
of some sequence of graphs,
just like every real number,
00:14:47.998 --> 00:14:49.540
in analogy to what
we said last time.
00:14:49.540 --> 00:14:52.340
Every real number is
the limit of a sequence
00:14:52.340 --> 00:14:55.760
of rational numbers through
rational approximation.
00:14:55.760 --> 00:14:59.570
And this is some form of
approximation of a graphon
00:14:59.570 --> 00:15:01.425
by a sequence of graphs.
00:15:01.425 --> 00:15:01.925
OK.
00:15:01.925 --> 00:15:03.740
So I'm not going to
prove this theorem.
00:15:03.740 --> 00:15:08.420
The proof is not difficult.
So using that definition
00:15:08.420 --> 00:15:11.240
of subgraph
convergence, the proof
00:15:11.240 --> 00:15:16.890
uses what's known as
Azuma's inequality.
00:15:16.890 --> 00:15:21.110
So by an appropriate application
of Azuma's inequality
00:15:21.110 --> 00:15:22.790
on the concentration
of martingales,
00:15:22.790 --> 00:15:27.110
one can prove this
theorem here by estimating
00:15:27.110 --> 00:15:28.970
the probability that--
00:15:35.180 --> 00:15:41.960
that is, by showing that
the F density in Gn
00:15:41.960 --> 00:15:47.330
is very close to
the F density in w
00:15:47.330 --> 00:15:49.336
with high probability.
00:15:52.252 --> 00:15:55.145
OK.
00:15:55.145 --> 00:15:56.020
Any questions so far?
00:15:58.820 --> 00:16:02.600
So this is an important
example of one
00:16:02.600 --> 00:16:06.220
of the motivations
of graph limits.
00:16:06.220 --> 00:16:09.460
But now, let's get back
to what I said earlier.
00:16:09.460 --> 00:16:11.810
I would like to develop
a sequence of tools
00:16:11.810 --> 00:16:14.150
that will allow us to prove
the main theorems stated
00:16:14.150 --> 00:16:18.000
at the end of the last lecture.
00:16:18.000 --> 00:16:19.470
And this will sound
very familiar,
00:16:19.470 --> 00:16:23.610
because we're going to write
down some lemmas that we did
00:16:23.610 --> 00:16:26.490
back in the chapter on
Szemerédi's regularity lemma
00:16:26.490 --> 00:16:29.450
but now in the
language of graphons.
00:16:29.450 --> 00:16:31.600
So the first is
a counting lemma.
00:16:38.270 --> 00:16:39.770
The goal of the
counting lemma is
00:16:39.770 --> 00:16:42.590
to show that if you
have two graphons which
00:16:42.590 --> 00:16:50.060
are close to each other in the
sense of cut distance, then
00:16:50.060 --> 00:16:55.530
their F densities are
similar to each other.
00:16:55.530 --> 00:16:57.190
So here's a statement.
00:16:57.190 --> 00:17:05.403
So if w and u are
graphons and F is
00:17:05.403 --> 00:17:19.460
a graph, then the F density
of w minus the F density of u,
00:17:19.460 --> 00:17:24.940
their difference is, in absolute value, no more
than a constant-- namely, the number
00:17:24.940 --> 00:17:32.110
of edges of F times the cut
distance between u and w.
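In symbols, the counting lemma just stated reads as below. The notation is assumed: t(F, W) is the F density of W, |E(F)| is the number of edges of F, and the right-hand side uses the cut distance between the two graphons.

```latex
\[
  \bigl| t(F, W) - t(F, U) \bigr| \;\le\; |E(F)| \cdot \delta_{\square}(W, U) .
\]
```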
00:17:37.670 --> 00:17:41.740
So maybe some of you already
see how to do this from
00:17:41.740 --> 00:17:45.930
our discussion on
Szemerédi's regularity lemma.
00:17:45.930 --> 00:17:48.790
In any case, I want to just
rewrite the proof again
00:17:48.790 --> 00:17:50.350
in the language of graphons.
00:17:50.350 --> 00:17:52.190
And this will hopefully--
00:17:52.190 --> 00:17:55.700
so we did two proofs of the
triangle counting lemma.
00:17:55.700 --> 00:17:58.445
One was hopefully more
intuitive for you,
00:17:58.445 --> 00:18:00.070
which is you pick a
typical vertex that
00:18:00.070 --> 00:18:01.528
has lots of neighbors
on both sides
00:18:01.528 --> 00:18:04.412
and therefore lots
of edges between.
00:18:04.412 --> 00:18:06.370
And then there was a
second proof, which I said
00:18:06.370 --> 00:18:08.470
was a more analytic
proof, where you took out
00:18:08.470 --> 00:18:10.420
one edge at a time.
00:18:10.420 --> 00:18:13.450
And that proof, I think
it's technically easier
00:18:13.450 --> 00:18:16.383
to implement, especially
for general H.
00:18:16.383 --> 00:18:17.800
But the first time
you see it, you
00:18:17.800 --> 00:18:20.680
might not quite see what
the calculation was about.
00:18:20.680 --> 00:18:23.320
So I want to do this exact
same calculation again
00:18:23.320 --> 00:18:24.547
in the language of graphons.
00:18:24.547 --> 00:18:26.380
And hopefully, it should
be clear this time.
00:18:29.600 --> 00:18:31.390
So this is the same
as the counting lemma
00:18:31.390 --> 00:18:34.800
over epsilon-regular pairs.
00:18:34.800 --> 00:18:44.120
So it suffices to
prove the inequality
00:18:44.120 --> 00:18:49.330
where the right-hand side
is replaced not by the cut
00:18:49.330 --> 00:18:53.440
distance but by the cut norm.
00:18:53.440 --> 00:18:57.550
And the reason is that once
you have the second inequality
00:18:57.550 --> 00:19:04.410
by taking an infimum over all
measure-preserving bijections
00:19:04.410 --> 00:19:05.290
phi--
00:19:05.290 --> 00:19:10.990
and notice that that change
does not affect the F density.
00:19:10.990 --> 00:19:12.900
By taking an infimum
over phi, you
00:19:12.900 --> 00:19:14.752
recover the first inequality.
00:19:17.590 --> 00:19:22.360
I want to give you a small
reformulation of the cut norm
00:19:22.360 --> 00:19:25.606
that will be useful for thinking
about this counting lemma.
00:19:29.980 --> 00:19:37.750
Here's a reformulation
of the cut norm--
00:19:37.750 --> 00:19:42.470
namely, that I can
define the cut norm.
00:19:42.470 --> 00:19:45.840
So here, w is taking
real values, so
00:19:45.840 --> 00:19:48.630
not necessarily non-negative.
00:19:48.630 --> 00:19:52.860
So the cut norm
we saw earlier is
00:19:52.860 --> 00:20:01.940
defined to be the supremum
over all measurable subsets
00:20:01.940 --> 00:20:08.900
of the 0, 1 interval of this
integral in absolute value.
00:20:08.900 --> 00:20:14.780
But it turns out I can rewrite
this supremum over a slightly
00:20:14.780 --> 00:20:16.850
larger set of objects.
00:20:16.850 --> 00:20:21.500
Instead of just looking
over measurable subsets
00:20:21.500 --> 00:20:26.330
of the interval, let me now
look at measurable functions.
00:20:26.330 --> 00:20:29.130
Little u.
00:20:29.130 --> 00:20:32.570
So OK, let me look at functions.
00:20:32.570 --> 00:20:40.860
So u and v from 0, 1 to 0, 1--
00:20:40.860 --> 00:20:46.530
and as always, everything
is measurable--
00:20:46.530 --> 00:20:49.650
of the following integral.
00:21:01.570 --> 00:21:04.260
So I claim this is true.
00:21:04.260 --> 00:21:09.370
So I consider this integral.
00:21:09.370 --> 00:21:11.480
Instead of integrating
over a box,
00:21:11.480 --> 00:21:16.160
now I'm integrating
this expression.
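The claimed reformulation can be written as follows, as a sketch in assumed notation. On the left the supremum is over measurable sets S and T; on the right it is over measurable functions u and v from [0, 1] to [0, 1].

```latex
\[
  \sup_{S,\,T \subseteq [0,1]}
  \left| \int_{S \times T} W(x, y) \, dx \, dy \right|
  \;=\;
  \sup_{u,\,v \colon [0,1] \to [0,1]}
  \left| \int_{[0,1]^2} W(x, y) \, u(x) \, v(y) \, dx \, dy \right| .
\]
```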
00:21:16.160 --> 00:21:16.660
OK.
00:21:16.660 --> 00:21:19.380
So why is this true?
00:21:19.380 --> 00:21:23.670
Well, one of the
directions is easy to see,
00:21:23.670 --> 00:21:27.630
because the right-hand side
is strictly an enlargement
00:21:27.630 --> 00:21:29.070
of the left-hand side.
00:21:29.070 --> 00:21:35.940
So by taking u to be the
indicator function of S
00:21:35.940 --> 00:21:38.750
and v to be the indicator
function of T,
00:21:38.750 --> 00:21:40.680
you see that the
right-hand side, in fact,
00:21:40.680 --> 00:21:42.690
includes the left-hand
side in terms
00:21:42.690 --> 00:21:45.330
of what you are allowed to do.
00:21:45.330 --> 00:21:48.160
But what about the
other direction?
00:21:48.160 --> 00:21:50.070
So for the other
direction, the main thing
00:21:50.070 --> 00:21:56.700
is to notice that the
integral or the integrand,
00:21:56.700 --> 00:22:05.800
what's inside this integral,
is bilinear in the values of u
00:22:05.800 --> 00:22:12.390
and v. So in particular, the
extrema of this integral,
00:22:12.390 --> 00:22:17.210
as you allow u and v
to vary, are attained.
00:22:17.210 --> 00:22:22.350
So they are attained
for u and v
00:22:22.350 --> 00:22:31.610
taking values in the
endpoints 0 and 1.
00:22:36.030 --> 00:22:39.160
It may be helpful to think about
the discrete setting, when,
00:22:39.160 --> 00:22:42.070
instead of this integral, you
have a matrix and two vectors
00:22:42.070 --> 00:22:43.870
multiplied from left and right.
00:22:43.870 --> 00:22:46.840
And you had to decide,
what are the coordinates
00:22:46.840 --> 00:22:48.560
of those vectors?
00:22:48.560 --> 00:22:50.260
It's a bilinear form.
00:22:50.260 --> 00:22:53.090
How do you maximize
it or minimize it?
00:22:53.090 --> 00:22:57.900
You have to change every entry
to one of its two endpoints.
00:22:57.900 --> 00:23:00.660
Otherwise, it can never be--
00:23:00.660 --> 00:23:04.610
you never lose by doing that.
00:23:04.610 --> 00:23:05.950
OK, so think about it.
00:23:05.950 --> 00:23:12.610
So this is not difficult once
you see it the right way.
00:23:12.610 --> 00:23:18.630
But now, we have this cut
norm expressed over not sets,
00:23:18.630 --> 00:23:22.220
but over bounded functions.
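The discrete picture mentioned above, a matrix multiplied by two vectors, can be checked numerically on a small example. The sketch below is only an illustration under assumptions: the function names are made up for this example, the quantity is left unnormalized, and random sampling of fractional vectors is a sanity check rather than a proof.

```python
import itertools
import random


def bilinear(a, u, v):
    """Discrete analogue of the integral: sum over i, j of a[i][j] * u[i] * v[j]."""
    return sum(a[i][j] * u[i] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def cut_norm_over_sets(a):
    """Maximum of |bilinear| over 0/1 vectors u, v, i.e. over subsets S, T."""
    n, m = len(a), len(a[0])
    return max(
        abs(bilinear(a, u, v))
        for u in itertools.product((0, 1), repeat=n)
        for v in itertools.product((0, 1), repeat=m)
    )


def max_over_random_fractional(a, trials, rng):
    """Largest |bilinear| seen over random vectors with entries in [0, 1)."""
    n, m = len(a), len(a[0])
    best = 0.0
    for _ in range(trials):
        u = [rng.random() for _ in range(n)]
        v = [rng.random() for _ in range(m)]
        best = max(best, abs(bilinear(a, u, v)))
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    a = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
    # Fractional vectors never beat the best 0/1 vectors, because the
    # form is linear in each coordinate of u and of v separately.
    print(max_over_random_fractional(a, 2000, rng) <= cut_norm_over_sets(a) + 1e-12)
```

The brute-force search is exponential in the matrix size, so this is only meant for very small matrices.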
00:23:22.220 --> 00:23:24.620
And now I'm ready to
prove the counting lemma.
00:23:32.400 --> 00:23:36.000
And instead of writing down
the whole proof for general H,
00:23:36.000 --> 00:23:40.650
let me write down the
calculation that illustrates
00:23:40.650 --> 00:23:42.600
this proof for triangles.
00:23:49.460 --> 00:23:50.840
And the general
proof is the same
00:23:50.840 --> 00:23:54.500
once you understand how
this argument works.
00:23:54.500 --> 00:24:00.770
And the argument works by
considering the difference
00:24:00.770 --> 00:24:09.890
between these two F densities.
00:24:09.890 --> 00:24:12.710
And what I want to do is--
00:24:12.710 --> 00:24:14.160
so this is some integral, right?
00:24:14.160 --> 00:24:17.090
So this is this integral,
which I'll write out.
00:24:41.780 --> 00:24:46.640
So we would like to show
that this quantity here
00:24:46.640 --> 00:24:51.730
is small if u and w
are close in cut norm.
00:24:51.730 --> 00:24:59.830
So let's write this integral
as a telescoping sum
00:24:59.830 --> 00:25:03.900
where the first term
is obtained by--
00:25:08.990 --> 00:25:11.150
so by this, I mean
w(x, y)
00:25:11.150 --> 00:25:12.440
minus u(x, y).
00:25:24.440 --> 00:25:27.290
And then the second term
of the telescoping sum--
00:25:27.290 --> 00:25:28.950
so you see what happens.
00:25:28.950 --> 00:25:31.040
I change one factor at a time.
00:25:51.570 --> 00:25:54.810
And finally, I change
the third factor.
00:26:09.300 --> 00:26:10.300
So this is the identity.
00:26:10.300 --> 00:26:12.280
If you expand out all
of these differences,
00:26:12.280 --> 00:26:15.630
you see that everything
intermediate cancels out.
00:26:15.630 --> 00:26:19.700
So it's a telescoping sum.
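For reference, the telescoping identity for triangles can be written out as below. This is a reconstruction in assumed notation, with W and U the two graphons and all integrals taken over the unit cube in x, y, z; the order in which the factors are changed is one possible choice.

```latex
\begin{align*}
  t(K_3, W) - t(K_3, U)
  &= \int \bigl( W(x,y)\,W(x,z)\,W(y,z) - U(x,y)\,U(x,z)\,U(y,z) \bigr) \, dx \, dy \, dz \\
  &= \int \bigl( W(x,y) - U(x,y) \bigr) \, W(x,z) \, W(y,z) \, dx \, dy \, dz \\
  &\quad + \int U(x,y) \, \bigl( W(x,z) - U(x,z) \bigr) \, W(y,z) \, dx \, dy \, dz \\
  &\quad + \int U(x,y) \, U(x,z) \, \bigl( W(y,z) - U(y,z) \bigr) \, dx \, dy \, dz .
\end{align*}
```

In the first term, for example, fixing z leaves a difference W minus U in (x, y) multiplied by a function of x alone and a function of y alone, each bounded between 0 and 1.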
00:26:19.700 --> 00:26:24.281
But now I want to show
that each term is small.
00:26:24.281 --> 00:26:28.170
So how can I show that
each term is small?
00:26:28.170 --> 00:26:32.400
Look at this expression here.
00:26:34.992 --> 00:26:38.280
I claim that for a
fixed value of z--
00:26:45.300 --> 00:26:47.490
so imagine fixing z.
00:26:47.490 --> 00:26:52.000
And let x and y vary
in this integral.
00:26:52.000 --> 00:26:55.760
It has the form up there, right?
00:26:55.760 --> 00:27:00.680
If you fix z, then
you have this u and v
00:27:00.680 --> 00:27:02.660
coming from these two factors.
00:27:02.660 --> 00:27:04.880
And they are both
bounded between 0 and 1.
00:27:08.090 --> 00:27:18.170
So for a fixed value of z,
this is at most w minus u--
00:27:18.170 --> 00:27:23.290
the cut norm difference between
w and u in absolute value.
00:27:27.520 --> 00:27:33.590
So if I let z vary, it is
still bounded in absolute value
00:27:33.590 --> 00:27:36.450
by that quantity.
00:27:36.450 --> 00:27:46.580
So therefore each term is
bounded by w minus u cut
00:27:46.580 --> 00:27:49.910
norm in absolute value.
00:27:49.910 --> 00:27:52.410
Add all three of them together.
00:27:52.410 --> 00:27:57.290
We find that the whole thing
is bounded in absolute value
00:27:57.290 --> 00:27:59.963
by 3 times the cut
norm of the difference.
00:28:03.350 --> 00:28:06.660
OK, and that finishes the
proof of the counting lemma
00:28:06.660 --> 00:28:10.050
for triangles. Of course,
if you have general H,
00:28:10.050 --> 00:28:12.600
then you just have more terms.
00:28:12.600 --> 00:28:18.040
You have a longer telescoping
sum, and you have this bound.
00:28:18.040 --> 00:28:18.540
OK.
00:28:18.540 --> 00:28:19.450
So this is a counting lemma.
00:28:19.450 --> 00:28:22.080
And I claim that it's exactly
the same proof as the second
00:28:22.080 --> 00:28:24.952
proof of the counting lemma
that we did when we discussed
00:28:24.952 --> 00:28:27.160
Szemerédi's regularity lemma
and this counting lemma.
00:28:30.220 --> 00:28:33.194
Any questions?
00:28:33.194 --> 00:28:33.694
Yeah.
00:28:37.082 --> 00:28:42.487
AUDIENCE: Why did it suffice
to prove over the [INAUDIBLE]?
00:28:42.487 --> 00:28:43.070
PROFESSOR: OK.
00:28:43.070 --> 00:28:45.460
So let me answer
that in a second.
00:28:45.460 --> 00:28:48.280
So first, this should
be H, not F. OK,
00:28:48.280 --> 00:28:55.000
so your question
was, up there, why
00:28:55.000 --> 00:28:59.350
was it sufficient to
prove this version instead
00:28:59.350 --> 00:29:00.365
of that version?
00:29:00.365 --> 00:29:01.240
Is that the question?
00:29:01.240 --> 00:29:02.177
AUDIENCE: Yeah.
00:29:02.177 --> 00:29:02.760
PROFESSOR: OK.
00:29:02.760 --> 00:29:04.970
Suppose I prove it
for this version.
00:29:04.970 --> 00:29:06.870
So I know this is true.
00:29:06.870 --> 00:29:09.610
Now I take the infimum
of both sides.
00:29:09.610 --> 00:29:17.990
So now I consider
the infimum of both sides.
00:29:17.990 --> 00:29:21.380
So then this is true, right?
00:29:21.380 --> 00:29:24.440
Because it's true for every phi.
00:29:24.440 --> 00:29:28.490
But the left-hand side doesn't
change, because the F density
00:29:28.490 --> 00:29:32.930
in a relabeling of the vertices,
it's still the same quantity,
00:29:32.930 --> 00:29:34.880
whereas this one
here is now that.
00:29:40.226 --> 00:29:41.198
All right.
00:29:44.600 --> 00:29:53.320
So what we see as a corollary
of this counting lemma
00:29:53.320 --> 00:29:58.540
is that if you have a Cauchy
sequence with respect
00:29:58.540 --> 00:30:06.940
to the cut distance,
then the sequence
00:30:06.940 --> 00:30:09.347
is automatically convergent.
00:30:15.663 --> 00:30:17.330
So recall the definition
of convergence.
00:30:17.330 --> 00:30:20.920
Convergence has to do with
F densities converging.
00:30:20.920 --> 00:30:22.940
And if you have a
Cauchy sequence,
00:30:22.940 --> 00:30:25.970
then the F densities converge.
00:30:25.970 --> 00:30:29.000
And also, a related
but different statement
00:30:29.000 --> 00:30:35.180
is that if you have
a sequence wn that
00:30:35.180 --> 00:30:41.040
converges to w in
cut distance, then
00:30:41.040 --> 00:30:45.810
it implies that wn
converges to w in the sense
00:30:45.810 --> 00:30:48.496
as defined for F densities.
00:30:51.880 --> 00:30:55.270
So qualitatively, what
the counting lemma says
00:30:55.270 --> 00:31:00.550
is that convergence in cut norm is stronger
than the notion of convergence
00:31:00.550 --> 00:31:05.260
coming from subgraph densities.
00:31:05.260 --> 00:31:08.668
So this is one part of
this regularity method, so
00:31:08.668 --> 00:31:09.460
the counting lemma.
00:31:09.460 --> 00:31:12.503
Of course, the other part is
the regularity lemma itself.
00:31:12.503 --> 00:31:13.920
So that's the next
thing we'll do.
00:31:17.020 --> 00:31:18.610
And it turns out
that we actually
00:31:18.610 --> 00:31:21.190
don't need the full strength
of the regularity lemma.
00:31:21.190 --> 00:31:23.740
We only need something called
a weak regularity lemma.
00:31:37.660 --> 00:31:41.690
What the weak regularity
lemma says is--
00:31:41.690 --> 00:31:44.850
I mean, you still have a
partition of the vertices.
00:31:44.850 --> 00:31:46.370
So let me now state
it for graphons.
00:31:46.370 --> 00:31:53.110
So for a partition p--
00:31:53.110 --> 00:31:56.920
so I have a partition
of the vertex set--
00:32:04.120 --> 00:32:13.100
and a symmetric,
measurable function w--
00:32:13.100 --> 00:32:16.080
I'm just going to omit the
word "measurable" from now on.
00:32:16.080 --> 00:32:18.990
Everything will be measurable.
00:32:18.990 --> 00:32:22.160
What I can do is, OK,
all of these sets
00:32:22.160 --> 00:32:24.463
are also measurable.
00:32:27.780 --> 00:32:38.130
I can define what's known as a
stepping operator that sends w
00:32:38.130 --> 00:32:43.190
to this object,
w sub p, obtained
00:32:43.190 --> 00:32:55.210
by averaging over
the steps si cross sj
00:32:55.210 --> 00:33:01.490
and replacing that graphon by
its average over each step.
00:33:01.490 --> 00:33:07.900
Precisely, so I
obtain a new graphon,
00:33:07.900 --> 00:33:11.630
a new symmetric, measurable
function, w sub p,
00:33:11.630 --> 00:33:20.100
where the value on
(x, y) is defined
00:33:20.100 --> 00:33:23.890
to be the following quantity--
00:33:30.610 --> 00:33:39.040
if (x, y) lies
in si cross sj.
00:33:39.040 --> 00:33:43.840
So pictorially, what happens is
that you look at your graphon.
00:33:47.540 --> 00:33:51.262
There's a partition
of the vertex set,
00:33:51.262 --> 00:33:52.853
so to speak, the interval.
00:33:52.853 --> 00:33:54.770
Doesn't have to be a
partition into intervals,
00:33:54.770 --> 00:33:57.850
but for illustration,
suppose it looks like that.
00:33:57.850 --> 00:34:01.850
And what I do is I take
this w, and I replace it
00:34:01.850 --> 00:34:06.590
by a new graphon, a new
symmetric, measurable function,
00:34:06.590 --> 00:34:12.749
w sub p, obtained by averaging.
00:34:16.421 --> 00:34:17.600
Take each box.
00:34:17.600 --> 00:34:18.860
Replace it by its average.
00:34:18.860 --> 00:34:22.310
Put that average into the box.
00:34:22.310 --> 00:34:26.920
So this is what w sub
p is supposed to be.
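A discrete version of this averaging can be sketched as follows, assuming the graphon is represented by a square matrix and the partition by lists of indices. The name step_matrix and this representation are assumptions made for illustration; the lecture's operator acts on functions on the unit square.

```python
def step_matrix(a, parts):
    """Replace each block parts[i] x parts[j] of the matrix a by its average.

    a is a square matrix given as a list of rows; parts is a list of
    disjoint index lists covering range(len(a)).
    """
    n = len(a)
    out = [[0.0] * n for _ in range(n)]
    for p in parts:
        for q in parts:
            if not p or not q:
                continue  # empty parts play the role of measure-zero sets
            avg = sum(a[x][y] for x in p for y in q) / (len(p) * len(q))
            for x in p:
                for y in q:
                    out[x][y] = avg
    return out


if __name__ == "__main__":
    a = [[0, 1, 1, 0],
         [1, 0, 0, 1],
         [1, 0, 0, 1],
         [0, 1, 1, 0]]
    for row in step_matrix(a, [[0, 1], [2, 3]]):
        print(row)
```

Applying step_matrix twice with the same partition gives the same result as applying it once, which matches the description of the operator as a projection.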
00:34:26.920 --> 00:34:29.710
Just a few minor technicalities.
00:34:29.710 --> 00:34:39.690
If this denominator is equal
to 0, let's ignore the set.
00:34:39.690 --> 00:34:42.679
I mean, then you have a
zero measure set, anyway,
00:34:42.679 --> 00:34:44.820
so we ignore that set.
00:34:44.820 --> 00:34:47.330
So everything will be
treated up to measure zero,
00:34:47.330 --> 00:34:49.850
changing the function
on measure zero sets.
00:34:49.850 --> 00:34:53.883
So it doesn't really matter
if you're not strictly
00:34:53.883 --> 00:34:55.050
allowed to do this division.
00:34:58.310 --> 00:34:59.200
OK.
00:34:59.200 --> 00:35:01.990
So this operator plays
an important role
00:35:01.990 --> 00:35:03.820
in the regularity
lemma, because it's
00:35:03.820 --> 00:35:07.050
how we think about partitioning,
what happens to a graph
00:35:07.050 --> 00:35:08.260
under partitioning.
00:35:08.260 --> 00:35:12.640
It has several other names if
you look at it from slightly
00:35:12.640 --> 00:35:14.060
different perspectives.
00:35:14.060 --> 00:35:19.400
So you can view
it as a projection
00:35:19.400 --> 00:35:22.280
in the sense of Hilbert space.
00:35:22.280 --> 00:35:35.170
So in the Hilbert space of
functions on the unit square,
00:35:35.170 --> 00:35:44.840
the stepping operator is a
projection onto the subspace
00:35:44.840 --> 00:35:52.090
of constants,
subspace of functions
00:35:52.090 --> 00:35:56.660
that are constant on each step.
00:36:05.210 --> 00:36:06.920
So that's one interpretation.
00:36:06.920 --> 00:36:09.860
Another interpretation is
that this operation is also
00:36:09.860 --> 00:36:11.870
a conditional expectation.
00:36:17.340 --> 00:36:21.900
If you know what a conditional
expectation actually
00:36:21.900 --> 00:36:25.130
is in the sense of
probability theory,
00:36:25.130 --> 00:36:26.940
so then that's
what happens here.
00:36:26.940 --> 00:36:30.720
If you view 0, 1 squared
as a probability space,
00:36:30.720 --> 00:36:35.340
then what we're doing is we're
doing conditional expectation
00:36:35.340 --> 00:36:39.750
relative to the sigma algebra
generated by these steps.
00:36:41.793 --> 00:36:43.710
So these are just a
couple of ways of thinking
00:36:43.710 --> 00:36:44.627
about what's going on.
00:36:44.627 --> 00:36:46.290
They might be somewhat
helpful later on
00:36:46.290 --> 00:36:47.873
if you're familiar
with these notions.
00:36:47.873 --> 00:36:49.705
But if you're not,
don't worry about it.
00:36:49.705 --> 00:36:51.330
Concretely, it's what
happens up there.
00:36:58.340 --> 00:36:58.990
OK.
00:36:58.990 --> 00:37:01.930
So now let me state the
weak regularity lemma.
00:37:13.530 --> 00:37:16.550
So the weak regularity
lemma is attributed
00:37:16.550 --> 00:37:25.800
to Frieze and Kannan,
although their work predates
00:37:25.800 --> 00:37:27.540
the language of graphons.
00:37:27.540 --> 00:37:29.720
So it's stated in the
language of graphs,
00:37:29.720 --> 00:37:30.720
but it's the same proof.
00:37:30.720 --> 00:37:33.410
So let me state it for you
both in terms of graphons
00:37:33.410 --> 00:37:35.070
and in graphs.
00:37:35.070 --> 00:37:48.160
What it says is that for every
epsilon and every graphon w,
00:37:48.160 --> 00:38:00.760
there exists a partition
denoted p of the 0, 1 interval.
00:38:00.760 --> 00:38:03.110
And now I tell you how
many sets there are.
00:38:03.110 --> 00:38:05.320
So it's a partition into--
00:38:05.320 --> 00:38:08.300
so not a tower-type
number of parts,
00:38:08.300 --> 00:38:11.920
but only roughly an
exponential number of parts--
00:38:11.920 --> 00:38:22.250
4 to the 1 over epsilon
squared measurable sets such
00:38:22.250 --> 00:38:29.710
that if we apply the stepping
operator to this graphon,
00:38:29.710 --> 00:38:35.538
we obtain an approximation of
the graphon in the cut norm.
00:38:40.520 --> 00:38:45.050
So that's the statement of
the weak regularity lemma.
00:38:45.050 --> 00:38:51.620
There exists a partition such
that if you do this stepping,
00:38:51.620 --> 00:38:53.460
then you obtain
an approximation.
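In symbols, the statement above can be sketched as follows, writing the cut norm as a supremum over measurable sets S and T (this is the usual definition; the exact notation is an assumption):

```latex
\forall\, \varepsilon > 0,\ \forall\, W \ \text{a graphon},\
\exists\, \mathcal{P} \ \text{a partition of } [0,1] \text{ into at most } 4^{1/\varepsilon^2} \text{ measurable sets such that}
\qquad
\| W - W_{\mathcal{P}} \|_\square \le \varepsilon,
\qquad \text{where} \quad
\| U \|_\square = \sup_{S, T \subseteq [0,1]} \left| \int_{S \times T} U(x, y) \, dx \, dy \right| .
```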
00:38:53.460 --> 00:38:56.120
So I want you to think about
what this has to do with
00:38:56.120 --> 00:38:58.600
the usual version of Szemerédi's
regularity lemma that
00:38:58.600 --> 00:39:00.030
you've seen earlier.
00:39:00.030 --> 00:39:01.970
So hopefully, you
should realize, morally,
00:39:01.970 --> 00:39:04.660
they're about the same
types of statements.
00:39:04.660 --> 00:39:07.980
But more importantly, how are
they different from each other?
00:39:07.980 --> 00:39:12.620
And now let me state a version
for graphs, which is similar
00:39:12.620 --> 00:39:17.090
but not exactly the same as
what we just saw for graphons.
00:39:17.090 --> 00:39:19.520
So let me state it.
00:39:19.520 --> 00:39:26.300
So for graphs, the
weak regularity lemma
00:39:26.300 --> 00:39:36.420
says that, OK, so for graphs,
let me define a partition
00:39:36.420 --> 00:39:55.130
p of the vertex set is
called weakly epsilon regular
00:39:55.130 --> 00:39:58.360
if the following is true.
00:39:58.360 --> 00:40:03.055
If it is the case that whenever
I look at two vertex subsets,
00:40:03.055 --> 00:40:08.650
A and B, of the
vertex set of g, then
00:40:08.650 --> 00:40:13.880
the number of edges
between A and B
00:40:13.880 --> 00:40:21.530
is what you should expect based
on the density information that
00:40:21.530 --> 00:40:24.710
comes out of this partition.
00:40:24.710 --> 00:40:32.830
Namely, if I sum over all
the parts of the partition,
00:40:32.830 --> 00:40:46.200
look at how many vertices from A and B
lie in the corresponding parts.
00:40:46.200 --> 00:40:51.090
And then multiply by the edge
density between these parts.
00:40:51.090 --> 00:40:53.820
So that's your predicted
value based on the data that
00:40:53.820 --> 00:40:55.900
comes out of the partition.
00:40:55.900 --> 00:40:58.170
So I claim that this is
the actual number of edges.
00:40:58.170 --> 00:41:00.720
This is the predicted
number of edges.
00:41:00.720 --> 00:41:07.395
And those two numbers should
differ from each other by
00:41:07.395 --> 00:41:11.380
at most epsilon n squared, where n
is the number of vertices.
00:41:11.380 --> 00:41:14.680
So this is the definition of
what it means for a partition
00:41:14.680 --> 00:41:18.700
to be weakly epsilon regular.
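Written out, with the partition taken to have parts V_1 through V_k, e(A, B) the number of edges between A and B, d(V_i, V_j) the edge density between two parts, and the error measured against n squared as in the standard statement (these symbols are assumed, since the board formula is not in the transcript):

```latex
\left| \, e(A, B) - \sum_{i=1}^{k} \sum_{j=1}^{k} d(V_i, V_j) \, |A \cap V_i| \, |B \cap V_j| \, \right|
\le \varepsilon n^2
\qquad \text{for all } A, B \subseteq V(G), \quad n = |V(G)| .
```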
00:41:18.700 --> 00:41:22.190
So it's important to think
about why this is weaker.
00:41:22.190 --> 00:41:23.190
It's called weak, right?
00:41:23.190 --> 00:41:28.150
So why is it weaker than a
notion of epsilon regularity?
00:41:28.150 --> 00:41:30.450
So why is it weaker?
00:41:30.450 --> 00:41:34.110
So previously, we had
epsilon-regular partition
00:41:34.110 --> 00:41:36.900
in the definition of
Szemerédi's regularity lemma,
00:41:36.900 --> 00:41:38.880
this epsilon-regular partition.
00:41:38.880 --> 00:41:43.350
And here, notion of
weakly epsilon regular.
00:41:43.350 --> 00:41:44.620
So why is this a lot weaker?
00:41:47.460 --> 00:41:52.050
It is not saying that
individual pairs of parts
00:41:52.050 --> 00:41:55.355
are epsilon regular.
00:41:55.355 --> 00:41:57.730
And eventually, we're going
to have this number of parts.
00:41:57.730 --> 00:42:00.210
So I'll state a
theorem in a second.
00:42:00.210 --> 00:42:04.070
So the sizes of the
parts are much smaller
00:42:04.070 --> 00:42:07.380
than epsilon fraction.
00:42:07.380 --> 00:42:12.080
But what this weak notion of
regularity says, if you look
00:42:12.080 --> 00:42:13.950
at it globally--
00:42:13.950 --> 00:42:15.740
so not looking at
specific parts,
00:42:15.740 --> 00:42:17.450
but looking at it globally--
00:42:17.450 --> 00:42:19.670
then this partition is
a good approximation
00:42:19.670 --> 00:42:24.280
of what's going on in the
actual graph, whereas--
00:42:24.280 --> 00:42:25.710
OK, so it's worth
thinking about.
00:42:25.710 --> 00:42:27.335
It's really worth
thinking about what's
00:42:27.335 --> 00:42:29.990
the difference between this weak
notion and the usual notion.
00:42:29.990 --> 00:42:33.380
But first, let me state
this regularity lemma.
00:42:33.380 --> 00:42:43.330
So the weak regularity
lemma for graphs
00:42:43.330 --> 00:42:50.820
says that for every
epsilon and every graph G,
00:42:50.820 --> 00:43:03.360
there exists a weakly
epsilon-regular partition
00:43:03.360 --> 00:43:09.090
of the vertex set
of G into at most 4
00:43:09.090 --> 00:43:11.570
to the 1 over epsilon
squared parts.
00:43:20.240 --> 00:43:24.640
Now, you might wonder why
did Frieze and Kannan come up
00:43:24.640 --> 00:43:29.010
with this notion of regularity.
00:43:29.010 --> 00:43:32.010
It's a weaker result if you
don't care about the bounds,
00:43:32.010 --> 00:43:38.070
because an epsilon-regular
partition will be automatically
00:43:38.070 --> 00:43:41.360
weakly epsilon regular.
00:43:41.360 --> 00:43:43.220
So maybe with small
changes of epsilon
00:43:43.220 --> 00:43:46.370
if you wish, but basically,
this is a weaker notion
00:43:46.370 --> 00:43:47.690
compared to what we had before.
00:43:50.780 --> 00:43:53.560
But of course, the advantage
is that you have a much more
00:43:53.560 --> 00:43:56.230
reasonable number of parts.
00:43:56.230 --> 00:43:58.210
It's not a tower.
00:43:58.210 --> 00:44:01.180
It's just a single exponential.
00:44:01.180 --> 00:44:02.110
And this is important.
00:44:02.110 --> 00:44:05.740
And their motivation was a
computer science and algorithm
00:44:05.740 --> 00:44:06.760
application.
00:44:06.760 --> 00:44:11.410
So I want to take a
brief detour and mention
00:44:11.410 --> 00:44:18.022
why you might care about weakly
epsilon-regular partitions.
00:44:22.180 --> 00:44:25.240
In particular, the problem
that is of interest
00:44:25.240 --> 00:44:30.980
is in approximating
something called a max cut.
00:44:30.980 --> 00:44:38.060
So the max cut problem asks you
to determine-- given a graph G,
00:44:38.060 --> 00:44:46.360
find the maximum over
all subsets of vertices,
00:44:46.360 --> 00:44:49.610
the maximum number of
edges between a set
00:44:49.610 --> 00:44:51.040
and its complement.
00:44:51.040 --> 00:44:52.430
That's called a cut.
00:44:52.430 --> 00:44:56.430
I give you a graph,
and I want to know--
00:44:56.430 --> 00:45:01.860
find this s so that it
can have as many edges
00:45:01.860 --> 00:45:05.488
across this set as possible.
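To make the definition concrete, here is a small sketch that finds a max cut by trying every subset. It takes exponential time and is only meant to illustrate the quantity being defined; it is not the approximation algorithm discussed in the lecture. The function name `max_cut` and the 5-cycle example are illustrative assumptions.

```python
def max_cut(n, edges):
    """Exhaustively search all subsets S of {0, ..., n-1}.

    Returns (size, S) where size is the largest number of edges
    with exactly one endpoint in S.
    """
    best_size, best_set = 0, frozenset()
    for mask in range(1 << n):
        # Decode the bitmask into a vertex subset.
        s = frozenset(v for v in range(n) if (mask >> v) & 1)
        # An edge is cut when its endpoints lie on different sides.
        size = sum((u in s) != (v in s) for u, v in edges)
        if size > best_size:
            best_size, best_set = size, s
    return best_size, best_set


# The 5-cycle: an odd cycle, so at least one edge is always uncut.
cycle5 = [(i, (i + 1) % 5) for i in range(5)]
size, side = max_cut(5, cycle5)
```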
00:45:05.488 --> 00:45:07.530
This is an important
problem in computer science,
00:45:07.530 --> 00:45:09.120
extremely important problem.
00:45:09.120 --> 00:45:12.450
And the status of
this problem is
00:45:12.450 --> 00:45:20.640
that it is known to be difficult
to get it even within 1%.
00:45:20.640 --> 00:45:24.188
So the best algorithm is due
to Goemans and Williamson.
00:45:30.410 --> 00:45:32.120
It's an important
algorithm that was
00:45:32.120 --> 00:45:33.560
one of the
foundational algorithms
00:45:33.560 --> 00:45:35.690
in semidefinite
programming, so related--
00:45:35.690 --> 00:45:37.340
the words "semidefinite
programming"
00:45:37.340 --> 00:45:40.070
came up earlier in this course
when we discussed Grothendieck's
00:45:40.070 --> 00:45:40.970
inequality.
00:45:40.970 --> 00:45:43.820
So they came up with an
approximation algorithm.
00:45:43.820 --> 00:45:47.100
So here, I'm only talking
about polynomial time,
00:45:47.100 --> 00:45:48.830
so efficient algorithms.
00:45:48.830 --> 00:45:53.600
Approximation algorithm
with approximation ratio
00:45:53.600 --> 00:45:56.120
around 0.878.
00:45:56.120 --> 00:46:03.900
So one can obtain a cut
that is within basically
00:46:03.900 --> 00:46:07.820
13% of the maximum.
00:46:07.820 --> 00:46:10.540
So it's an
approximation algorithm.
00:46:10.540 --> 00:46:17.380
However, it is known that it is
hard in the sense of complexity
00:46:17.380 --> 00:46:18.540
theory.
00:46:18.540 --> 00:46:29.830
It'd be hard to approximate
beyond the ratio 16 over 17,
00:46:29.830 --> 00:46:37.000
which is around 0.941.
00:46:37.000 --> 00:46:38.980
And there is an
important conjecture
00:46:38.980 --> 00:46:41.800
in computer science called
the unique games conjecture
00:46:41.800 --> 00:46:44.240
that, if that
conjecture were true,
00:46:44.240 --> 00:46:46.710
then it would be
difficult. It would
00:46:46.710 --> 00:46:52.010
be hard to approximate beyond
the Goemans-Williamson ratio.
00:46:52.010 --> 00:46:54.070
So this indicates the
status of this problem.
00:46:54.070 --> 00:46:59.760
It is difficult to do an
epsilon approximation.
00:46:59.760 --> 00:47:03.135
But if the graph I
give you is dense--
00:47:10.460 --> 00:47:13.040
"dense" meaning a
quadratic number
00:47:13.040 --> 00:47:17.970
of edges, where n is
a number of vertices--
00:47:17.970 --> 00:47:25.210
then it turns out that the
regularity-type algorithms--
00:47:25.210 --> 00:47:28.390
so that theorem combined
with the algorithmic versions
00:47:28.390 --> 00:47:35.360
allows you to get polynomial
time approximation algorithms.
00:47:35.360 --> 00:47:38.660
So this is polynomial time
approximation schemes.
00:47:41.620 --> 00:47:52.000
So one can approximate up
to 1 minus epsilon ratio.
00:47:52.000 --> 00:47:57.940
So one can approximate
up to epsilon
00:47:57.940 --> 00:48:07.796
n squared additive error
in polynomial time.
00:48:07.796 --> 00:48:12.730
So in particular, if I'm
willing to lose 0.01 n squared,
00:48:12.730 --> 00:48:16.540
then there is an algorithm to
approximate the size of the max
00:48:16.540 --> 00:48:17.040
cut.
00:48:17.040 --> 00:48:21.110
And that algorithm
basically comes from--
00:48:21.110 --> 00:48:23.320
without giving you any
details whatsoever,
00:48:23.320 --> 00:48:27.310
the algorithm essentially comes
from first finding a regularity
00:48:27.310 --> 00:48:28.334
partition.
00:48:35.110 --> 00:48:40.120
So the partition breaks
the set of vertices
00:48:40.120 --> 00:48:43.240
into some number of pieces.
00:48:43.240 --> 00:48:57.640
And now I search over
all possible ratios
00:48:57.640 --> 00:49:01.080
to divide each piece.
00:49:04.280 --> 00:49:06.210
So there is a bounded
number of parts.
00:49:06.210 --> 00:49:09.320
Each one of those, I decide,
do I cut this up half-half?
00:49:09.320 --> 00:49:13.270
Do I cut it up 1/3,
2/3, and so on?
00:49:13.270 --> 00:49:17.940
And those numbers alone,
because of this definition
00:49:17.940 --> 00:49:22.040
of weakly epsilon
regular, once you
00:49:22.040 --> 00:49:27.005
know what the intersection
of A and B-- where B is, let's say,
00:49:27.005 --> 00:49:29.780
A complement-- is with
individual sets,
00:49:29.780 --> 00:49:32.510
then I basically know
the number of edges.
00:49:32.510 --> 00:49:36.800
So I can approximate
the size of the max cut
00:49:36.800 --> 00:49:41.300
using a weakly
epsilon-regular partition.
00:49:41.300 --> 00:49:47.360
So that was the motivation
for these weakly epsilon-regular
00:49:47.360 --> 00:49:51.820
partitions, at least the
algorithmic application.
00:49:51.820 --> 00:49:52.320
OK.
00:49:52.320 --> 00:49:53.420
Any questions?
00:49:56.240 --> 00:49:56.740
OK.
00:49:56.740 --> 00:49:58.150
So let's take a quick break.
00:49:58.150 --> 00:50:00.100
And then afterwards,
I want to show
00:50:00.100 --> 00:50:03.160
you the proof of the
weak regularity lemma.
00:50:05.730 --> 00:50:06.230
All right.
00:50:06.230 --> 00:50:12.560
So let me start the proof of
the weak regularity lemma.
00:50:12.560 --> 00:50:14.775
And the proof is by this
energy increment argument.
00:50:14.775 --> 00:50:16.400
So let's see what
this energy increment
00:50:16.400 --> 00:50:19.700
argument looks like in
the language of graphons.
00:50:19.700 --> 00:50:27.610
So energy now means L2,
so L2 energy increment.
00:50:27.610 --> 00:50:29.600
So the statement
of this lemma is
00:50:29.600 --> 00:50:42.230
that if you have w, a graphon,
and p, a partition, of 0,
00:50:42.230 --> 00:50:46.740
comma, 1 interval such that--
00:50:50.120 --> 00:50:51.260
always measurable pieces.
00:50:51.260 --> 00:50:52.300
I'm not going to even write it.
00:50:52.300 --> 00:50:53.592
It's always measurable pieces--
00:50:57.320 --> 00:51:08.390
such that the difference between
w and w averaged over steps p
00:51:08.390 --> 00:51:11.420
is bigger than epsilon in cut norm.
00:51:11.420 --> 00:51:14.390
So this is the notion
of being not epsilon
00:51:14.390 --> 00:51:22.280
regular in the weak sense,
not weakly epsilon regular.
00:51:22.280 --> 00:51:33.100
Then there exists a
refinement, p prime of p,
00:51:33.100 --> 00:51:45.430
dividing each part of p
into at most four parts
00:51:45.430 --> 00:51:57.380
such that the squared L2 norm
increases by more than epsilon
00:51:57.380 --> 00:52:02.040
squared under this refinement.
00:52:02.040 --> 00:52:04.110
So it should be similar.
00:52:04.110 --> 00:52:06.450
It should be familiar to
you, because we have similar
00:52:06.450 --> 00:52:09.686
arguments from Szemerédi's
regularity lemma.
00:52:09.686 --> 00:52:10.644
So let's see the proof.
00:52:13.490 --> 00:52:18.380
Because you have violation
of weak epsilon regularity,
00:52:18.380 --> 00:52:23.250
there exists sets S and T,
measurable subsets of 0,
00:52:23.250 --> 00:52:29.510
1 interval, such that this
integral evaluated over S
00:52:29.510 --> 00:52:39.140
cross T is more than
epsilon in absolute value.
00:52:39.140 --> 00:52:55.690
So now let me take p prime to
be the common refinement of p
00:52:55.690 --> 00:53:07.890
by introducing S and
T into this partition.
00:53:07.890 --> 00:53:10.900
So throw S and T in
and break everything
00:53:10.900 --> 00:53:12.930
according to S and T.
00:53:12.930 --> 00:53:21.140
And so each part becomes
at most four subparts.
00:53:21.140 --> 00:53:22.960
So that's the at
most four subparts.
00:53:25.780 --> 00:53:29.060
I now need to show that I
have an energy increment.
00:53:29.060 --> 00:53:33.340
And to do this, let
me first perform
00:53:33.340 --> 00:53:36.530
the following calculation.
00:53:36.530 --> 00:53:41.590
So remember, this symbol
here is the inner product
00:53:41.590 --> 00:53:44.230
obtained by multiplying
and integrating
00:53:44.230 --> 00:53:46.890
over the entire box.
00:53:46.890 --> 00:53:52.400
I claim that that
inner product equals
00:53:52.400 --> 00:54:00.790
to the inner product
between wp and wp prime,
00:54:00.790 --> 00:54:08.580
because what happens here is
we are looking at a situation
00:54:08.580 --> 00:54:15.510
where wp prime is
constant on each part.
00:54:15.510 --> 00:54:20.920
So when I do this inner product,
I can replace w by its average.
00:54:20.920 --> 00:54:23.810
And likewise, over here, I can
also replace it by its average.
00:54:23.810 --> 00:54:26.990
And you end up having
the same average.
00:54:26.990 --> 00:54:33.340
And these two averages
are both just what happens
00:54:33.340 --> 00:54:35.170
if you do stepping by p.
00:54:38.440 --> 00:54:48.380
You also have that w has inner
product with 1 sub S cross T
00:54:48.380 --> 00:54:54.780
the same as that of w sub p
prime by the same reason,
00:54:54.780 --> 00:55:00.790
because over S
cross T. So S cross
00:55:00.790 --> 00:55:06.000
T is a union of the
parts of p prime.
00:55:06.000 --> 00:55:16.360
So S is union of
parts of p prime.
00:55:16.360 --> 00:55:16.860
OK.
00:55:16.860 --> 00:55:18.140
So let's see.
00:55:18.140 --> 00:55:21.770
With those observations,
you find that--
00:55:30.580 --> 00:55:33.580
so this is true.
00:55:33.580 --> 00:55:35.870
This is from the first equality.
00:55:35.870 --> 00:55:40.795
So now let me draw
you a right triangle.
00:55:49.890 --> 00:55:51.840
So you have a right
angle, because you have
00:55:51.840 --> 00:55:54.450
an inner product that is 0.
00:55:54.450 --> 00:56:04.530
So by Pythagorean theorem,
so what is this hypotenuse?
00:56:04.530 --> 00:56:06.520
So you add these two vectors.
00:56:06.520 --> 00:56:14.060
And you find out this wp prime.
00:56:14.060 --> 00:56:16.010
So by Pythagorean
theorem, you find
00:56:16.010 --> 00:56:20.540
that the L2 norm squared
of wp prime equals
00:56:20.540 --> 00:56:33.990
the sum of
the L2 norm squares of the two
00:56:33.990 --> 00:56:35.910
legs of this right triangle.
00:56:43.420 --> 00:56:48.153
On the other hand,
this quantity here.
00:56:48.153 --> 00:56:50.070
So let's think about
that quantity over there.
00:56:52.810 --> 00:56:54.420
It's an L2 norm.
00:56:54.420 --> 00:57:16.580
So in particular, it is at
least this quantity here,
00:57:16.580 --> 00:57:20.180
which you can derive
in one of many ways--
00:57:20.180 --> 00:57:25.840
for example, by Cauchy-Schwarz
inequality or go from L2 to L1
00:57:25.840 --> 00:57:28.330
and then pass down to the cut norm.
00:57:28.330 --> 00:57:31.890
So this is true.
00:57:31.890 --> 00:57:33.687
So let's say by Cauchy-Schwarz.
00:57:49.570 --> 00:57:55.580
But this quantity here, we
said was bigger than epsilon.
00:58:04.690 --> 00:58:12.180
So as a result,
this final quantity,
00:58:12.180 --> 00:58:17.300
this L2 norm of
the new refinement,
00:58:17.300 --> 00:58:20.540
increases from the previous one
by more than epsilon squared.
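The board work for this argument is not in the transcript. A reconstruction of the chain of steps, under the assumption that the inner product is the L2 inner product on the unit square and that S and T are the sets witnessing the violation:

```latex
\begin{align*}
\langle W_{\mathcal{P}'} - W_{\mathcal{P}},\, W_{\mathcal{P}} \rangle &= 0
  && \text{(averaging $W_{\mathcal{P}'}$ over each step of $\mathcal{P}$ gives $W_{\mathcal{P}}$)} \\
\| W_{\mathcal{P}'} \|_2^2 &= \| W_{\mathcal{P}} \|_2^2 + \| W_{\mathcal{P}'} - W_{\mathcal{P}} \|_2^2
  && \text{(Pythagorean theorem)} \\
\| W_{\mathcal{P}'} - W_{\mathcal{P}} \|_2
  &\ge \left| \langle W_{\mathcal{P}'} - W_{\mathcal{P}},\, 1_{S \times T} \rangle \right|
  && \text{(Cauchy--Schwarz, since $\| 1_{S \times T} \|_2 \le 1$)} \\
  &= \left| \langle W - W_{\mathcal{P}},\, 1_{S \times T} \rangle \right| > \varepsilon
  && \text{($S \times T$ is a union of steps of $\mathcal{P}'$)} \\
\| W_{\mathcal{P}'} \|_2^2 &> \| W_{\mathcal{P}} \|_2^2 + \varepsilon^2 .
\end{align*}
```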
00:58:24.620 --> 00:58:25.880
OK.
00:58:25.880 --> 00:58:27.910
So this is the L2 energy
increment argument.
00:58:27.910 --> 00:58:29.870
I claim it's the same
argument, basically,
00:58:29.870 --> 00:58:32.480
as the one that we did for
Szemerédi's regularity lemma.
00:58:32.480 --> 00:58:34.700
And I encourage you to
go back and compare them
00:58:34.700 --> 00:58:36.200
to see why they're the same.
00:58:40.280 --> 00:58:41.360
All right, moving on.
00:58:41.360 --> 00:58:45.230
So the other part
of regularity lemma
00:58:45.230 --> 00:58:48.820
is to iterate this approach.
00:58:48.820 --> 00:58:51.980
So if you have something
which is not epsilon regular,
00:58:51.980 --> 00:58:52.790
refine it.
00:58:52.790 --> 00:58:53.960
And then iterate.
00:58:53.960 --> 00:58:58.820
And you cannot proceed more
than a bounded number of times,
00:58:58.820 --> 00:59:02.390
because energy is always
bounded between 0 and 1.
00:59:02.390 --> 00:59:09.260
So for every epsilon bigger
than 0 and graphon w,
00:59:09.260 --> 00:59:17.210
suppose you have P0, a
partition of 0, 1 interval
00:59:17.210 --> 00:59:19.960
into measurable sets.
00:59:19.960 --> 00:59:38.280
Then there exists a partition
p that cuts up each part of P0
00:59:38.280 --> 00:59:47.460
into at most 4 to the
1 over epsilon squared parts
00:59:47.460 --> 00:59:55.920
such that w minus w sub
p is at most epsilon in cut norm.
00:59:55.920 --> 00:59:59.620
So I'm basically restating
the weak regularity lemma
00:59:59.620 --> 01:00:03.630
over there but with a
small difference, which
01:00:03.630 --> 01:00:07.020
will become useful later on
when we prove compactness.
01:00:07.020 --> 01:00:09.645
Namely, I'm allowed to
start with any partition.
01:00:09.645 --> 01:00:11.520
Instead of starting with
a trivial partition,
01:00:11.520 --> 01:00:14.100
I can start with any partition.
01:00:14.100 --> 01:00:16.382
This was also true when
we were talking about
01:00:16.382 --> 01:00:18.840
Szemerédi's regularity lemma,
although I didn't stress that
01:00:18.840 --> 01:00:20.320
point.
01:00:20.320 --> 01:00:21.858
That's certainly the case here.
01:00:21.858 --> 01:00:23.400
I mean, the proof
is exactly the same
01:00:23.400 --> 01:00:26.250
with or without this extra.
01:00:26.250 --> 01:00:30.780
This extra P0 really plays
an insignificant role.
01:00:30.780 --> 01:00:34.520
What happens, as in the proof
of Szemerédi's regularity lemma,
01:00:34.520 --> 01:00:42.770
is that we repeatedly apply
the previous lemma to obtain
01:00:42.770 --> 01:00:56.040
the sequence of partitions
of the 0, 1 interval where,
01:00:56.040 --> 01:01:10.160
each step, either we find that
we obtain some partition p sub
01:01:10.160 --> 01:01:15.790
i such that it's a good
approximation of w,
01:01:15.790 --> 01:01:33.150
in which case we stop, or the
L2 energy increases by more than
01:01:33.150 --> 01:01:34.750
epsilon squared.
01:01:40.630 --> 01:01:49.890
And since the final energy
is always at most 1--
01:01:49.890 --> 01:01:52.620
so it's always bounded
between 0 and 1--
01:01:52.620 --> 01:02:01.060
we must stop after at
most 1 over epsilon squared steps.
01:02:06.460 --> 01:02:14.160
And if you calculate
the number of parts,
01:02:14.160 --> 01:02:20.165
each part is subdivided
into at most four parts
01:02:20.165 --> 01:02:26.780
at each step, which
gives you the conclusion
01:02:26.780 --> 01:02:29.580
on the final number of parts.
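The counting can be sketched as follows, assuming the process runs for m refinement steps before stopping and that each step raises the squared L2 norm by more than epsilon squared:

```latex
0 \le \| W_{\mathcal{P}_0} \|_2^2
  < \| W_{\mathcal{P}_1} \|_2^2 - \varepsilon^2
  < \cdots
  < \| W_{\mathcal{P}_m} \|_2^2 - m \varepsilon^2
  \le 1 - m \varepsilon^2
\quad \Longrightarrow \quad m < \frac{1}{\varepsilon^2},
\qquad
\#\{\text{parts of } \mathcal{P}_m \text{ inside one part of } \mathcal{P}_0\} \le 4^{m} \le 4^{1/\varepsilon^2} .
```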
01:02:29.580 --> 01:02:31.640
OK, so very similar
to what we did before.
01:02:35.780 --> 01:02:36.720
All right.
01:02:36.720 --> 01:02:41.850
So that concludes the discussion
of the weak regularity lemma.
01:02:41.850 --> 01:02:44.360
So basically the same proof.
01:02:44.360 --> 01:02:48.403
Weaker conclusion and
better quantitative bounds.
01:02:48.403 --> 01:02:50.820
The next thing and the final
thing I want to discuss today
01:02:50.820 --> 01:02:55.140
is a new ingredient which
we haven't seen before
01:02:55.140 --> 01:02:58.110
but that will play an
important role in the proof
01:02:58.110 --> 01:02:59.580
of the compactness--
01:02:59.580 --> 01:03:03.160
in particular, the proof of
the existence of the limit.
01:03:03.160 --> 01:03:08.280
And this is something where I
need to discuss martingales.
01:03:12.410 --> 01:03:15.010
So a martingale is
an important object
01:03:15.010 --> 01:03:16.555
in probability theory.
01:03:16.555 --> 01:03:18.865
And it's a random sequence.
01:03:23.620 --> 01:03:28.350
So we'll look at discrete
sequences, so indexed
01:03:28.350 --> 01:03:32.010
by non-negative integers.
01:03:32.010 --> 01:03:36.620
And a martingale is
such a sequence where
01:03:36.620 --> 01:03:43.330
if I'm interested in the
expectation of the next term
01:03:43.330 --> 01:03:47.720
and even if you know
all the previous terms--
01:03:47.720 --> 01:03:51.530
so you have full knowledge of
the sequence before time n,
01:03:51.530 --> 01:03:55.000
and you want to predict
on the expectation what
01:03:55.000 --> 01:03:56.440
the nth term is--
01:03:56.440 --> 01:04:04.830
then you cannot do better than
simply predicting the last term
01:04:04.830 --> 01:04:06.800
that you saw.
01:04:06.800 --> 01:04:11.000
So this is the definition
of a martingale.
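In symbols, the informal definition reads as follows (a sketch that conditions on the earlier terms directly and skips filtrations):

```latex
\mathbb{E}\left[ X_n \mid X_0, X_1, \ldots, X_{n-1} \right] = X_{n-1}
\qquad \text{for every } n \ge 1 .
```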
01:04:11.000 --> 01:04:13.730
Now, to do this
formally, I need to talk
01:04:13.730 --> 01:04:18.080
about filtrations and what
not in measure theory.
01:04:18.080 --> 01:04:20.702
But let me not do that.
01:04:20.702 --> 01:04:22.160
OK, so this is how
you should think
01:04:22.160 --> 01:04:25.960
about martingales and a
couple of important examples
01:04:25.960 --> 01:04:27.080
of martingales.
01:04:27.080 --> 01:04:31.670
So the first one comes
from-- the reason
01:04:31.670 --> 01:04:35.000
why these things are called
martingales is that there
01:04:35.000 --> 01:04:37.280
is a gambling strategy
which is related
01:04:37.280 --> 01:04:44.720
to such a sequence where
let's say you consider
01:04:44.720 --> 01:04:48.130
a sequence of fair coin tosses.
01:04:48.130 --> 01:04:50.500
So here's what
we're going to do.
01:04:50.500 --> 01:04:53.850
So suppose we consider
a betting strategy.
01:05:03.240 --> 01:05:17.080
And x sub n is equal
to your balance at time n.
01:05:17.080 --> 01:05:21.120
And suppose that we're
looking at a fair casino
01:05:21.120 --> 01:05:27.700
where the expectation of
every game is exactly 0.
01:05:27.700 --> 01:05:31.260
Then this is a martingale.
01:05:31.260 --> 01:05:33.000
So imagine you have
a sequence of coin
01:05:33.000 --> 01:05:38.600
flips, and you win $1 for each
head and lose $1 for each tail.
01:05:38.600 --> 01:05:42.050
Say that at time five, you
have $2 in your pocket.
01:05:42.050 --> 01:05:46.420
Then time five
plus 1, you expect
01:05:46.420 --> 01:05:48.505
to also have that many dollars.
01:05:48.505 --> 01:05:49.130
It might go up.
01:05:49.130 --> 01:05:49.838
It might go down.
01:05:49.838 --> 01:05:52.726
But in expectation,
it doesn't change.
01:05:52.726 --> 01:05:53.670
Is there a question?
01:05:56.220 --> 01:05:56.720
OK.
01:05:56.720 --> 01:05:59.778
So they're asking about,
is there some independence
01:05:59.778 --> 01:06:00.570
condition required?
01:06:00.570 --> 01:06:02.190
And the answer is no.
01:06:02.190 --> 01:06:04.540
So there's no independence
condition that is required.
01:06:04.540 --> 01:06:06.270
So the definition
of a martingale
01:06:06.270 --> 01:06:10.020
is just if, even with complete
knowledge of the sequence up
01:06:10.020 --> 01:06:14.130
to a certain point, the
difference going forward
01:06:14.130 --> 01:06:16.565
is 0 in expectation.
01:06:22.570 --> 01:06:27.970
OK, so here's another
example of a martingale,
01:06:27.970 --> 01:06:31.490
which actually turns out to
be more relevant to our use--
01:06:34.250 --> 01:06:39.980
namely, that if I
have some hidden--
01:06:39.980 --> 01:06:44.540
think of x as some hidden
random variable, so something
01:06:44.540 --> 01:06:46.880
that you have no idea.
01:06:46.880 --> 01:06:56.170
But you can observe
it at time n based
01:06:56.170 --> 01:07:07.100
on information up to time n.
01:07:11.380 --> 01:07:16.600
So for example, suppose
you have no idea who
01:07:16.600 --> 01:07:21.910
is going to win the
presidential election.
01:07:21.910 --> 01:07:24.550
And really, nobody has any idea.
01:07:24.550 --> 01:07:28.990
But as time proceeds, you
make an educated guess
01:07:28.990 --> 01:07:30.790
based on the information
that you have,
01:07:30.790 --> 01:07:33.890
all the information you
have up to that point.
01:07:33.890 --> 01:07:36.590
And that information becomes
a larger and larger set
01:07:36.590 --> 01:07:38.420
as time moves forward.
01:07:38.420 --> 01:07:41.090
Your prediction is going to
be a random variable that
01:07:41.090 --> 01:07:43.790
goes up and down.
01:07:43.790 --> 01:07:48.120
And that will be a
martingale, because--
01:07:48.120 --> 01:07:52.980
so how I predict
today based on what
01:07:52.980 --> 01:07:56.660
are all the possibilities
happening going forward,
01:07:56.660 --> 01:08:00.300
well, one of many
things could happen.
01:08:00.300 --> 01:08:05.810
But if I knew that my prediction
is going to, in expectation,
01:08:05.810 --> 01:08:08.390
shift upwards, then
I shouldn't have
01:08:08.390 --> 01:08:09.710
predicted what I predict today.
01:08:09.710 --> 01:08:13.300
I should have predicted
upwards anyway.
01:08:13.300 --> 01:08:13.800
OK.
01:08:13.800 --> 01:08:19.819
So this is another
construction of martingales.
01:08:19.819 --> 01:08:21.410
So this also comes up.
01:08:21.410 --> 01:08:26.120
You could have other more pure
mathematics-type examples,
01:08:26.120 --> 01:08:29.930
where suppose I
want to know what
01:08:29.930 --> 01:08:34.490
is the chromatic number
of a random graph.
01:08:34.490 --> 01:08:38.960
And I show you that
graph one edge at a time.
01:08:38.960 --> 01:08:41.270
You can predict the expectation.
01:08:41.270 --> 01:08:44.540
You can find the expectation
of this graph's statistic
01:08:44.540 --> 01:08:47.630
based on what you've
seen up to time n.
01:08:47.630 --> 01:08:51.979
And that sequence
will be a martingale.
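This "prediction" construction can be checked exactly on a toy case. The sketch below uses fair coin flips in place of the random graph's edges, and the longest run of heads as the hidden statistic; both choices, and the names `doob_value`, `longest_run` and `is_martingale`, are illustrative assumptions rather than anything from the lecture.

```python
from fractions import Fraction
from itertools import product


def doob_value(f, prefix, total):
    """Exact E[f(flips) | the first len(prefix) flips equal prefix].

    The remaining flips are fair and independent, so the conditional
    expectation is a plain average over all completions of the prefix.
    """
    rest = total - len(prefix)
    outcomes = [f(prefix + tail) for tail in product((0, 1), repeat=rest)]
    return Fraction(sum(outcomes), len(outcomes))


def longest_run(flips):
    """Hidden statistic: length of the longest run of heads (1s)."""
    best = cur = 0
    for x in flips:
        cur = cur + 1 if x == 1 else 0
        best = max(best, cur)
    return best


def is_martingale(f, total):
    """Check that today's prediction equals the average of tomorrow's
    two possible predictions, for every history of revealed flips."""
    for n in range(total):
        for prefix in product((0, 1), repeat=n):
            now = doob_value(f, prefix, total)
            heads = doob_value(f, prefix + (1,), total)
            tails = doob_value(f, prefix + (0,), total)
            if now != (heads + tails) / 2:
                return False
    return True


ok = is_martingale(longest_run, 6)
```

Exact fractions are used so that the martingale identity is tested with equality rather than up to floating-point error.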
01:08:51.979 --> 01:08:56.149
An important property
of a martingale,
01:08:56.149 --> 01:08:59.990
which is known as the
martingale convergence theorem--
01:08:59.990 --> 01:09:06.740
and so that's what we'll need
for the proof of the existence
01:09:06.740 --> 01:09:07.790
of the limit next time--
01:09:15.689 --> 01:09:20.359
says that every
bounded martingale--
01:09:23.649 --> 01:09:27.229
so for example, suppose
your martingale only
01:09:27.229 --> 01:09:29.590
takes values between 0 and 1.
01:09:29.590 --> 01:09:33.500
So every bounded martingale
converges almost surely.
01:09:42.870 --> 01:09:46.715
You cannot have a martingale
which you expect to constantly
01:09:46.715 --> 01:09:47.340
go up and down.
01:09:53.040 --> 01:09:56.170
So I want to show you
a proof of this fact.
01:09:56.170 --> 01:09:59.090
Let me just mention that
the bounded condition is
01:09:59.090 --> 01:10:01.490
a little bit stronger than
what we actually need.
01:10:01.490 --> 01:10:03.470
From the proof, you'll
see that you really only
01:10:03.470 --> 01:10:08.010
need them to be L1 bounded.
01:10:08.010 --> 01:10:10.360
It's enough.
01:10:10.360 --> 01:10:12.190
And more generally,
there is a condition
01:10:12.190 --> 01:10:19.380
called uniform integrability,
which I won't explain.
01:10:22.368 --> 01:10:23.364
All right.
01:10:26.120 --> 01:10:26.620
OK.
01:10:26.620 --> 01:10:29.250
So let me show you a proof
of the martingale convergence
01:10:29.250 --> 01:10:29.750
theorem.
01:10:29.750 --> 01:10:33.520
And I'm going to be somewhat
informal and somewhat cavalier,
01:10:33.520 --> 01:10:35.650
because I don't want
to get into some
01:10:35.650 --> 01:10:38.550
of the fine details
of probability theory.
01:10:38.550 --> 01:10:43.840
But if you have taken something
like 18.675 probability theory,
01:10:43.840 --> 01:10:45.520
then you can fill in
all those details.
01:10:48.580 --> 01:10:50.290
So I like this
proof, because it's
01:10:50.290 --> 01:10:51.580
kind of a proof by gambling.
01:10:56.680 --> 01:11:00.070
So I want to tell you a story
which should convince you that
01:11:00.070 --> 01:11:04.380
a martingale cannot
keep going up and down.
01:11:04.380 --> 01:11:06.120
It must converge almost surely.
01:11:08.640 --> 01:11:15.970
So suppose x sub n
doesn't converge.
01:11:19.863 --> 01:11:21.280
OK, so this is why
I say I'm going
01:11:21.280 --> 01:11:23.040
to be somewhat cavalier
with probability theory.
01:11:23.040 --> 01:11:24.680
So when I say this
doesn't converge,
01:11:24.680 --> 01:11:28.060
I mean a specific instance of
the sequence doesn't converge
01:11:28.060 --> 01:11:30.050
or some specific realization.
01:11:30.050 --> 01:11:39.490
If it doesn't converge,
then there exists a and b,
01:11:39.490 --> 01:11:50.740
both rational numbers between
0 and 1, such that the sequence
01:11:50.740 --> 01:11:59.040
crosses the interval a,
b infinitely many times.
01:12:06.040 --> 01:12:11.060
So by crossing this interval,
what I mean is the following.
01:12:19.510 --> 01:12:20.010
OK.
01:12:20.010 --> 01:12:23.140
So there's an
important picture which
01:12:23.140 --> 01:12:25.900
will help a lot in
understanding this theorem.
01:12:31.550 --> 01:12:41.300
So imagine I have this
time n, and I have a and b.
01:12:41.300 --> 01:12:43.130
So I have this martingale.
01:12:43.130 --> 01:12:55.850
Its realization curve
will be like that.
01:12:55.850 --> 01:12:58.390
So that's an instance
of this martingale.
01:12:58.390 --> 01:13:03.950
And by crossing, I
mean a sequence that--
01:13:03.950 --> 01:13:07.390
OK, so here's what
I mean by crossing.
01:13:07.390 --> 01:13:15.192
I start below a and--
01:13:15.192 --> 01:13:16.400
let me use a different color.
01:13:19.170 --> 01:13:26.320
So I start below a, and I
go above b and then wait
01:13:26.320 --> 01:13:30.430
until I come back below a.
01:13:30.430 --> 01:13:32.740
And I go above b.
01:13:32.740 --> 01:13:36.040
Wait until I come back.
01:13:36.040 --> 01:13:37.500
So do like that.
01:13:45.592 --> 01:13:46.558
Like that.
01:13:52.860 --> 01:13:57.900
So I start below a until
the first time I go above b.
01:13:57.900 --> 01:13:59.700
And then I stop that sequence.
01:13:59.700 --> 01:14:05.705
So those are the upcrossings
of this martingale.
01:14:12.980 --> 01:14:15.960
So upcrossing is when
you start below a,
01:14:15.960 --> 01:14:18.720
and then you end up above b.
01:14:18.720 --> 01:14:26.040
So if you don't converge,
then there exist some a
01:14:26.040 --> 01:14:30.360
and b such that there are
infinitely many such crossings.
01:14:30.360 --> 01:14:32.950
So this is just a fact.
01:14:32.950 --> 01:14:36.910
It's not hard to see.
01:14:36.910 --> 01:14:40.000
And what we'll show is
that this doesn't happen
01:14:40.000 --> 01:14:42.280
except with probability 0.
01:14:42.280 --> 01:14:53.330
So we'll show that this
occurs with probability 0.
01:14:55.950 --> 01:15:02.930
And because there are
only countably many
01:15:02.930 --> 01:15:11.690
rational numbers, we find
that x sub n converges
01:15:11.690 --> 01:15:13.000
with probability 1.
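The countable-union step can be written out as a short sketch. The event name $A_{a,b}$ is a label introduced here, not notation from the lecture, and measurability details are glossed over in the same cavalier spirit.

```latex
\[
\{X_n \text{ does not converge}\}
\subseteq \bigcup_{\substack{a, b \in \mathbb{Q} \\ 0 < a < b < 1}} A_{a,b},
\qquad
A_{a,b} = \{(X_n) \text{ upcrosses } [a,b] \text{ infinitely many times}\},
\]
\[
\Pr\bigl(X_n \text{ does not converge}\bigr)
\le \sum_{\substack{a, b \in \mathbb{Q} \\ 0 < a < b < 1}} \Pr(A_{a,b}) = 0 .
\]
```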
01:15:22.440 --> 01:15:23.630
So these are upcrossings.
01:15:23.630 --> 01:15:25.920
So I didn't define
it, but hopefully you
01:15:25.920 --> 01:15:29.160
understood from my picture
and my description.
01:15:29.160 --> 01:15:36.270
And let me define
u sub n to be
01:15:36.270 --> 01:15:44.620
the number of
upcrossings up to time
01:15:44.620 --> 01:15:53.207
n, so the number of
such upcrossings.
01:15:55.950 --> 01:15:58.205
Now let me consider
a betting strategy.
01:16:05.790 --> 01:16:07.770
Basically, I want to make money.
01:16:07.770 --> 01:16:15.290
And I want to make money by
following these upcrossings.
01:16:15.290 --> 01:16:15.790
OK.
01:16:15.790 --> 01:16:20.050
So every time you
give me a number and--
01:16:20.050 --> 01:16:21.710
so think of this as
the stock market.
01:16:21.710 --> 01:16:26.647
So it's a fair stock market
where you tell me the price,
01:16:26.647 --> 01:16:28.230
and I get to decide,
do I want to buy?
01:16:28.230 --> 01:16:31.070
Or do I want to sell?
01:16:31.070 --> 01:16:45.720
So consider the betting
strategy where at any time,
01:16:45.720 --> 01:16:54.530
we're going to hold either 0
or 1 share of the stock, which
01:16:54.530 --> 01:16:57.590
has these moving prices.
01:16:57.590 --> 01:17:07.980
And what we're going to do
is if xn is less than a,
01:17:07.980 --> 01:17:12.060
is less than the lower
bound, then we're
01:17:12.060 --> 01:17:27.890
going to buy and hold, meaning
hold 1 share, until the first time
01:17:27.890 --> 01:17:42.450
that the price reaches
above b and then
01:17:42.450 --> 01:17:48.052
sell the first time
we see the price go above b.
01:17:50.950 --> 01:17:52.900
So this is the betting strategy.
01:17:52.900 --> 01:17:54.960
And it's something
which you can implement.
01:17:54.960 --> 01:17:57.030
If you see a sequence
of prices, you
01:17:57.030 --> 01:17:59.130
can implement this strategy.
01:17:59.130 --> 01:18:03.000
And you already hopefully see,
if you have many upcrossings,
01:18:03.000 --> 01:18:05.310
then each upcrossing,
you make money.
01:18:05.310 --> 01:18:07.620
Each upcrossing, you make money.
01:18:07.620 --> 01:18:09.880
And this is almost
too good to be true.
01:18:09.880 --> 01:18:15.160
And in fact, we see that the
total gain from this strategy--
01:18:15.160 --> 01:18:17.300
so if you start with
some balance, what
01:18:17.300 --> 01:18:18.460
you get at the end--
01:18:18.460 --> 01:18:22.750
is at least this
difference from a
01:18:22.750 --> 01:18:27.452
to b times the number
of upcrossings.
01:18:31.270 --> 01:18:33.610
You might start somewhere.
01:18:33.610 --> 01:18:35.790
You buy, and then you
just lose everything.
01:18:35.790 --> 01:18:38.840
So there might be
an initial cost.
01:18:38.840 --> 01:18:42.400
And that cost is
bounded, because we start
01:18:42.400 --> 01:18:44.680
with a bounded martingale.
01:18:44.680 --> 01:18:52.780
So suppose the martingale
is always between 0 and 1.
01:18:52.780 --> 01:18:54.915
We start with a
bounded martingale.
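The betting strategy just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the lecture: the names `upcrossing_strategy` and `bounded_martingale` are hypothetical, and the simulated price path is just one convenient example of a martingale with values in [0, 1]. Any share still held at the end is valued at the final price, which is where the initial cost of at most 1 comes from.

```python
import random

def upcrossing_strategy(prices, a, b):
    """Buy one share when the price is below a; sell the first time it is above b.

    Returns (gain, upcrossings). The gain includes any share still held at
    the end, valued at the final price.
    """
    holding = False
    buy_price = 0.0
    gain = 0.0
    upcrossings = 0
    for x in prices:
        if not holding and x < a:
            holding, buy_price = True, x   # start of a potential upcrossing
        elif holding and x > b:
            holding = False                # upcrossing completed: sell
            gain += x - buy_price
            upcrossings += 1
    if holding:
        gain += prices[-1] - buy_price     # unfinished position, possibly a loss
    return gain, upcrossings

def bounded_martingale(n, seed=None):
    """Hypothetical example of a martingale taking values in [0, 1]."""
    rng = random.Random(seed)
    x, path = 0.5, [0.5]
    for _ in range(n):
        step = 0.5 * min(x, 1.0 - x)       # symmetric step that stays inside [0, 1]
        x += step if rng.random() < 0.5 else -step
        path.append(x)
    return path

if __name__ == "__main__":
    a, b = 0.3, 0.7
    path = bounded_martingale(1000, seed=0)
    gain, u = upcrossing_strategy(path, a, b)
    # For any path in [0, 1], the gain is at least (b - a) * u - 1.
    print(gain, u, gain >= (b - a) * u - 1)
```

Each completed upcrossing earns more than b minus a, and the only possible loss is on the last unfinished purchase, which is why the bound holds path by path and not just on average.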
01:18:57.530 --> 01:19:01.730
But on the other hand,
there is a theorem
01:19:01.730 --> 01:19:04.670
about martingales, which
is not hard to deduce
01:19:04.670 --> 01:19:07.700
from the definition, that
no matter what the betting
01:19:07.700 --> 01:19:11.150
strategy is, the gain
at any particular time
01:19:11.150 --> 01:19:13.580
must be 0 in expectation.
01:19:16.940 --> 01:19:19.240
So this is just the
property of the martingale.
01:19:19.240 --> 01:19:24.190
So 0 equals the
expected gain, which
01:19:24.190 --> 01:19:27.520
is at least b minus a
times the expected number
01:19:27.520 --> 01:19:30.630
of upcrossings minus 1.
01:19:30.630 --> 01:19:35.430
And thus the expected number
of upcrossings up to time n
01:19:35.430 --> 01:19:41.600
is at most 1 over b minus a.
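Written as a formula, with $U_n$ standing for "u sub n" and the martingale assumed to take values in $[0,1]$, the chain of inequalities just stated is:

```latex
\[
0 = \mathbb{E}[\text{gain up to time } n]
\ge (b - a)\,\mathbb{E}[U_n] - 1
\quad\Longrightarrow\quad
\mathbb{E}[U_n] \le \frac{1}{b - a} .
\]
```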
01:19:41.600 --> 01:19:47.140
Now, we let n go to infinity.
01:19:47.140 --> 01:19:57.780
And let u sub infinity be the
total number of upcrossings.
01:20:02.030 --> 01:20:17.430
By the monotone convergence
theorem in this limit,
01:20:17.430 --> 01:20:20.310
the limit of these u sub
n's, it can never go down.
01:20:20.310 --> 01:20:23.740
It's always weakly increasing.
01:20:23.740 --> 01:20:28.020
The expectation of u sub n converges to the
expectation of the total number
01:20:28.020 --> 01:20:29.232
of upcrossings.
01:20:29.232 --> 01:20:31.440
So now, in particular, you
know that the expected total number
01:20:31.440 --> 01:20:38.120
of upcrossings is at
most some finite number.
01:20:38.120 --> 01:20:40.300
So in particular,
the probability
01:20:40.300 --> 01:20:45.630
that you have infinitely
many crossings is 0.
01:20:45.630 --> 01:20:50.330
So with probability 0, you
cross infinitely many times,
01:20:50.330 --> 01:20:52.880
which proves the
claim over there
01:20:52.880 --> 01:20:54.870
and which concludes
the proof of the claim
01:20:54.870 --> 01:20:58.535
that the martingale
converges almost surely.
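The limiting step can be summarized in one line. This is a sketch with $U_\infty$ standing for "u sub infinity". A nonnegative random variable with finite expectation is finite almost surely, which gives the last implication.

```latex
\[
U_n \uparrow U_\infty
\;\Longrightarrow\;
\mathbb{E}[U_\infty] = \lim_{n \to \infty} \mathbb{E}[U_n] \le \frac{1}{b - a} < \infty
\;\Longrightarrow\;
\Pr(U_\infty = \infty) = 0 .
\]
```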
01:20:58.535 --> 01:21:00.660
OK, so that proves the
martingale convergence theorem.
01:21:00.660 --> 01:21:02.430
So next time, we'll
combine everything
01:21:02.430 --> 01:21:05.640
that we did today to prove the
three main theorems that we
01:21:05.640 --> 01:21:09.230
stated last time
on graph limits.