1
00:00:00,500 --> 00:00:03,912
[SQUEAKING]
2
00:00:18,590 --> 00:00:21,720
PROFESSOR: Last time, we
started discussing graph limits.
3
00:00:21,720 --> 00:00:24,908
And let me remind you some of
the notions and definitions
4
00:00:24,908 --> 00:00:25,700
that were involved.
5
00:00:35,590 --> 00:00:37,490
One of the main
objects in graph limits
6
00:00:37,490 --> 00:00:46,670
is that of a graphon, which is
a symmetric, measurable function
7
00:00:46,670 --> 00:00:49,490
from the unit square
to the unit interval.
8
00:00:58,890 --> 00:01:02,570
So here, symmetric means
that w of x, comma, y
9
00:01:02,570 --> 00:01:04,670
equals w of y, comma, x.
10
00:01:09,810 --> 00:01:11,520
We define a notion
of convergence
11
00:01:11,520 --> 00:01:13,980
for a sequence of graphons.
12
00:01:13,980 --> 00:01:21,080
And remember, the
notion of convergence
13
00:01:21,080 --> 00:01:33,330
is that a sequence is convergent
if the sequence of homomorphism
14
00:01:33,330 --> 00:01:43,330
densities converges as n goes
to infinity for every fixed
15
00:01:43,330 --> 00:01:45,680
F, every fixed graph.
16
00:01:49,480 --> 00:01:52,180
So this is how we
define convergence.
17
00:01:52,180 --> 00:01:53,920
So a sequence of
graphs or graphons,
18
00:01:53,920 --> 00:01:58,360
they converge if all the
homomorphism densities--
19
00:01:58,360 --> 00:02:01,200
so you should think of this
as subgraph statistics--
20
00:02:01,200 --> 00:02:04,520
if all of these
statistics converge.
21
00:02:04,520 --> 00:02:10,180
We also say that a sequence
converges to a particular limit
22
00:02:10,180 --> 00:02:16,180
if these homomorphism
densities converge
23
00:02:16,180 --> 00:02:20,170
to the corresponding
homomorphism density
24
00:02:20,170 --> 00:02:24,510
of the limit for every F.
25
00:02:24,510 --> 00:02:25,010
OK.
26
00:02:25,010 --> 00:02:27,740
So this is how we
define convergence.
27
00:02:27,740 --> 00:02:29,870
We also define this
notion of a distance.
28
00:02:33,140 --> 00:02:35,170
And to do that, we
first define the cut
29
00:02:35,170 --> 00:02:41,900
norm to be the following
quantity defined
30
00:02:41,900 --> 00:02:49,340
by taking two subsets, S
and T, which are measurable.
31
00:02:49,340 --> 00:02:51,890
Everything so far is
going to be measurable.
32
00:02:51,890 --> 00:02:55,820
And look at what is the
maximum possible deviation
33
00:02:55,820 --> 00:03:00,350
of the integral of this
function on this box, S cross T.
34
00:03:00,350 --> 00:03:03,800
And here, w, you should think
of it as taking real values,
35
00:03:03,800 --> 00:03:06,133
allowing both positive
and negative values,
36
00:03:06,133 --> 00:03:07,550
because otherwise,
you should just
37
00:03:07,550 --> 00:03:11,410
take S and T to be
the whole interval.
38
00:03:11,410 --> 00:03:12,850
OK.
39
00:03:12,850 --> 00:03:14,950
And this definition
was motivated
40
00:03:14,950 --> 00:03:19,620
by our discussion of discrepancy
coming from quasirandomness.
41
00:03:19,620 --> 00:03:22,280
Now, if I give you
two graphs or graphons
42
00:03:22,280 --> 00:03:24,170
and ask you to
compare them, you are
43
00:03:24,170 --> 00:03:28,550
allowed to permute the
vertices in some sense,
44
00:03:28,550 --> 00:03:31,140
so to find the best overlay.
45
00:03:31,140 --> 00:03:34,040
And that notion is
captured in the definition
46
00:03:34,040 --> 00:03:40,610
of cut distance, which is
defined to be the following
47
00:03:40,610 --> 00:03:53,540
quantity, where we take the infimum over
all possible measure-preserving
48
00:03:53,540 --> 00:04:10,470
bijections from the interval
to itself of the difference
49
00:04:10,470 --> 00:04:14,130
between these two
graphons if I rotate
50
00:04:14,130 --> 00:04:18,750
one of them using this
measure-preserving bijection.
51
00:04:26,460 --> 00:04:29,175
So think of this as
permuting the vertices.
52
00:04:36,130 --> 00:04:39,660
So these were the definitions
that were involved last time.
53
00:04:39,660 --> 00:04:41,410
And at the end of
last lecture, I
54
00:04:41,410 --> 00:04:45,060
stated three main theorems
of graph limit theory.
55
00:04:45,060 --> 00:04:47,230
So I forgot to
mention some
56
00:04:47,230 --> 00:04:49,820
of the history of this theory.
57
00:04:49,820 --> 00:04:52,360
So there were a number
of important papers
58
00:04:52,360 --> 00:04:57,250
that developed this very idea of
graph limits, which is actually
59
00:04:57,250 --> 00:05:00,100
somewhat-- if you think
about all of combinatorics,
60
00:05:00,100 --> 00:05:02,830
we like to deal with
discrete objects.
61
00:05:02,830 --> 00:05:06,610
And even the idea of taking
a limit is rather novel.
62
00:05:06,610 --> 00:05:11,830
So this work is due
to a number of people.
63
00:05:11,830 --> 00:05:14,830
In particular, Laszlo Lovasz
played a very important
64
00:05:14,830 --> 00:05:17,200
central role in the
development of this theory.
65
00:05:17,200 --> 00:05:19,480
And various people
came to this theory
66
00:05:19,480 --> 00:05:21,460
from different
perspectives-- some
67
00:05:21,460 --> 00:05:24,160
from more pure
perspectives, and some
68
00:05:24,160 --> 00:05:26,290
from more applied perspectives.
69
00:05:26,290 --> 00:05:29,810
And this theory is now getting
used in more and more places,
70
00:05:29,810 --> 00:05:33,030
including statistics,
machine learning, and so on.
71
00:05:33,030 --> 00:05:37,990
And I'll explain where that
comes up just a little bit.
72
00:05:37,990 --> 00:05:40,870
At the end of last lecture,
I stated three main theorems.
73
00:05:40,870 --> 00:05:44,560
And what I want to do
today is develop some tools
74
00:05:44,560 --> 00:05:47,777
so that we can prove those
theorems in the next lecture.
75
00:05:47,777 --> 00:05:48,277
OK.
76
00:05:48,277 --> 00:05:49,910
So I want to develop some tools.
77
00:05:49,910 --> 00:05:52,510
In particular, you'll see some
of the things that we've talked
78
00:05:52,510 --> 00:05:55,960
about in the chapter on
Szemerédi's regularity lemma
79
00:05:55,960 --> 00:05:59,140
come up again in a slightly
different language.
80
00:05:59,140 --> 00:06:02,320
So much of what I will say
today hopefully should already
81
00:06:02,320 --> 00:06:04,660
be familiar to you, but
you will see it again
82
00:06:04,660 --> 00:06:08,690
from the perspective
of graph limits.
83
00:06:08,690 --> 00:06:11,257
But first, before telling
you about the tools,
84
00:06:11,257 --> 00:06:12,840
I want to give you
some more examples.
85
00:06:15,580 --> 00:06:17,970
So one of the ways that
I motivated graph limits
86
00:06:17,970 --> 00:06:22,380
last time is this example of
an Erdos-Renyi random graph
87
00:06:22,380 --> 00:06:25,470
or a sequence of quasi-random
graphs converging
88
00:06:25,470 --> 00:06:26,540
to a constant.
89
00:06:26,540 --> 00:06:30,690
The constant graphon
is the limit.
90
00:06:30,690 --> 00:06:32,170
But what about generalizations?
91
00:06:32,170 --> 00:06:34,590
What about generalizations
of that construction when
92
00:06:34,590 --> 00:06:37,500
your limit is not the constant?
93
00:06:37,500 --> 00:06:43,530
So this leads to this idea
of a w random graph, which
94
00:06:43,530 --> 00:06:49,250
generalizes that of an
Erdos-Renyi random graph.
95
00:06:49,250 --> 00:06:58,390
So in Erdos-Renyi, we're
looking at every edge occurring
96
00:06:58,390 --> 00:07:03,260
with the same probability, p,
uniform throughout the graph.
97
00:07:03,260 --> 00:07:07,250
But what I want to do now is
allow you to change the edge
98
00:07:07,250 --> 00:07:08,920
probability somewhat.
99
00:07:08,920 --> 00:07:09,420
OK.
100
00:07:12,288 --> 00:07:14,330
So before giving you the
more general definition,
101
00:07:14,330 --> 00:07:19,160
a special case of this
is an important model
102
00:07:19,160 --> 00:07:22,090
of random graphs known as
the stochastic block model.
103
00:07:25,802 --> 00:07:31,700
And in particular, a two-block
model consists of the following
104
00:07:31,700 --> 00:07:39,330
data where I am looking
at two types of vertices--
105
00:07:44,750 --> 00:07:46,030
let's call them red and blue--
106
00:07:49,650 --> 00:07:54,830
where the vertices are
assigned to colors at random--
107
00:08:01,570 --> 00:08:03,050
for example, 50/50.
108
00:08:03,050 --> 00:08:05,690
But any other
probability is fine.
109
00:08:05,690 --> 00:08:10,330
And now I put down the edges
according to the colors of
110
00:08:10,330 --> 00:08:12,340
the two endpoints.
111
00:08:12,340 --> 00:08:23,500
So two red vertices are joined
with edge probability Prr.
112
00:08:23,500 --> 00:08:26,920
If I have a red
and a blue, then I
113
00:08:26,920 --> 00:08:32,380
may have a different probability
joining them, and likewise
114
00:08:32,380 --> 00:08:38,400
with blue-blue, like that.
115
00:08:38,400 --> 00:08:40,960
So in other words, I can encode
this probability information
116
00:08:40,960 --> 00:08:50,890
in the matrix, like that.
117
00:08:50,890 --> 00:08:54,800
So it's symmetric
across the diagonal.
118
00:08:54,800 --> 00:08:57,220
So this is a slightly
more general version
119
00:08:57,220 --> 00:08:59,950
of an Erdos-Renyi
random graph where now I
120
00:08:59,950 --> 00:09:02,320
have potentially different
types of vertices.
121
00:09:02,320 --> 00:09:04,090
And you can imagine
these kinds of models
122
00:09:04,090 --> 00:09:06,010
are very important in
applied mathematics
123
00:09:06,010 --> 00:09:09,740
for modeling certain situations
such as, for example,
124
00:09:09,740 --> 00:09:14,890
if you have people with
different political party
125
00:09:14,890 --> 00:09:16,270
affiliations.
126
00:09:16,270 --> 00:09:19,890
How likely are they
to talk to each other?
127
00:09:19,890 --> 00:09:22,050
So you can imagine
some of these numbers
128
00:09:22,050 --> 00:09:24,700
might be bigger than others.
129
00:09:24,700 --> 00:09:27,360
And there's an important
statistical problem.
130
00:09:27,360 --> 00:09:31,050
If I give you a graph, can
you cluster or classify
131
00:09:31,050 --> 00:09:33,330
the vertices according
to their types
132
00:09:33,330 --> 00:09:36,690
if I do not show you in advance
what the colors are but show
133
00:09:36,690 --> 00:09:39,340
you what the output graph is?
134
00:09:39,340 --> 00:09:41,490
So these are important
statistical questions
135
00:09:41,490 --> 00:09:45,750
with lots of applications.
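The two-block model just described is easy to sketch in code. Here is a minimal illustration (the function name and the default 50/50 color split are my own choices, not notation from the lecture):

```python
import random

def sample_two_block_model(n, p_rr, p_rb, p_bb, red_prob=0.5, seed=0):
    """Sample a two-block stochastic block model on n vertices.

    Each vertex is independently colored red with probability red_prob,
    and each pair i < j is joined independently with probability p_rr,
    p_rb, or p_bb according to the colors of its two endpoints."""
    rng = random.Random(seed)
    is_red = [rng.random() < red_prob for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if is_red[i] and is_red[j]:
                p = p_rr
            elif is_red[i] or is_red[j]:
                p = p_rb  # one red endpoint, one blue endpoint
            else:
                p = p_bb
            if rng.random() < p:
                edges.add((i, j))
    return is_red, edges
```

Taking all three probabilities equal to the same p recovers the Erdos-Renyi model.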
136
00:09:45,750 --> 00:09:48,570
This is an example of if
you have only two blocks.
137
00:09:48,570 --> 00:09:52,030
But of course, you can
have more than two blocks.
138
00:09:52,030 --> 00:09:55,810
And the graphon context
tells us that we should not
139
00:09:55,810 --> 00:09:58,540
limit ourselves to just blocks.
140
00:09:58,540 --> 00:10:02,200
If I give you any
graphon w, I can also
141
00:10:02,200 --> 00:10:06,040
construct a random graph.
142
00:10:06,040 --> 00:10:08,980
So what I would like
to do is to consider
143
00:10:08,980 --> 00:10:12,080
the following
construction where--
144
00:10:12,080 --> 00:10:19,420
OK, so let's just call
it a w random graph, denoted
145
00:10:19,420 --> 00:10:23,920
by G(n, w)--
146
00:10:23,920 --> 00:10:28,510
where I form the graph
using the following process.
147
00:10:28,510 --> 00:10:34,480
First, the vertex set is
labeled by 1 through n.
148
00:10:34,480 --> 00:10:44,640
And let me draw the vertex types
by taking uniform random x1
149
00:10:44,640 --> 00:10:46,946
through xn--
150
00:10:46,946 --> 00:10:51,080
OK, so uniform iid.
151
00:10:51,080 --> 00:10:54,170
So you think of them as the
vertex colors, the vertex
152
00:10:54,170 --> 00:10:55,560
types.
153
00:10:55,560 --> 00:11:03,440
And I put an edge
between i and j
154
00:11:03,440 --> 00:11:10,834
with probability
exactly w of xi,
155
00:11:10,834 --> 00:11:17,382
xj, so for all i less
than j independently.
156
00:11:21,160 --> 00:11:23,950
That's the definition
of a w random graph.
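The generation process just defined can be written out directly. This is a minimal sketch (the function name is my own):

```python
import random

def sample_w_random_graph(n, w, seed=0):
    """Sample a w random graph: draw iid uniform vertex types
    x_1, ..., x_n, then join each pair i < j independently with
    probability w(x_i, x_j)."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]  # the vertex types
    edges = {(i, j)
             for i in range(n) for j in range(i + 1, n)
             if rng.random() < w(xs[i], xs[j])}
    return xs, edges
```

For a step-function w this recovers the stochastic block model, and for a constant w it recovers the Erdos-Renyi model.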
157
00:11:23,950 --> 00:11:26,790
And the two-block
stochastic block model
158
00:11:26,790 --> 00:11:29,470
is a special case of
this w random graph
159
00:11:29,470 --> 00:11:31,720
for the graphon,
which corresponds
160
00:11:31,720 --> 00:11:35,310
to this red-blue picture here.
161
00:11:38,650 --> 00:11:49,300
So the generation process would
be I give you some x1, x2, x3,
162
00:11:49,300 --> 00:11:57,250
and then, likewise, x1, x3, x2.
163
00:11:57,250 --> 00:12:01,260
And then I evaluate, what
is the value of this graphon
164
00:12:01,260 --> 00:12:02,750
at these points?
165
00:12:11,450 --> 00:12:15,080
And those are my
edge probabilities.
166
00:12:15,080 --> 00:12:17,420
So what I described
is a special case
167
00:12:17,420 --> 00:12:19,580
of this general w random graph.
168
00:12:22,460 --> 00:12:25,570
Any questions?
169
00:12:25,570 --> 00:12:28,390
So like before, an important
statistical question
170
00:12:28,390 --> 00:12:31,250
is if I show you
the graph, can you
171
00:12:31,250 --> 00:12:37,210
tell me a good model for
where this graph came from?
172
00:12:37,210 --> 00:12:41,460
So that's one of the reasons
why people in applied math
173
00:12:41,460 --> 00:12:45,970
might care about these
types of constructions.
174
00:12:45,970 --> 00:12:47,350
Let me talk about some theorems.
175
00:12:51,050 --> 00:12:54,800
I've told you that the sequence
of Erdos-Renyi random graphs
176
00:12:54,800 --> 00:12:57,770
converges to the
constant graphon p.
177
00:12:57,770 --> 00:13:01,190
So instead of taking
a constant graphon p,
178
00:13:01,190 --> 00:13:04,190
now I start with w random graph.
179
00:13:04,190 --> 00:13:06,860
And you should expect,
and it is indeed true,
180
00:13:06,860 --> 00:13:12,500
that this sequence converges
to w as their limit.
181
00:13:12,500 --> 00:13:14,190
So let w be a graphon.
182
00:13:19,695 --> 00:13:21,830
So let w be a graphon.
183
00:13:21,830 --> 00:13:28,450
And for each n, let me
draw this graph G sub
184
00:13:28,450 --> 00:13:34,472
n using the w random
graph model independently.
185
00:13:37,640 --> 00:13:47,810
Then with probability
1, the sequence
186
00:13:47,810 --> 00:13:50,480
converges to the graphon w.
187
00:13:53,680 --> 00:13:58,190
So this is in the sense
described above.
188
00:13:58,190 --> 00:14:01,640
So this statement
tells us a couple
189
00:14:01,640 --> 00:14:04,900
of things-- one, that w random
graphs converge to the limit w,
190
00:14:04,900 --> 00:14:12,400
as you should expect; and
two, that every graphon w
191
00:14:12,400 --> 00:14:17,750
is the limit point of
some sequence of graphs.
192
00:14:17,750 --> 00:14:20,650
So this is something that
we never quite explicitly
193
00:14:20,650 --> 00:14:21,950
stated before.
194
00:14:21,950 --> 00:14:24,980
So let me make this remark.
195
00:14:24,980 --> 00:14:39,670
So in particular,
every w is the limit
196
00:14:39,670 --> 00:14:47,998
of some sequence of graphs,
just like every real number,
197
00:14:47,998 --> 00:14:49,540
in analogy to what
we said last time.
198
00:14:49,540 --> 00:14:52,340
Every real number is
the limit of a sequence
199
00:14:52,340 --> 00:14:55,760
of rational numbers through
rational approximation.
200
00:14:55,760 --> 00:14:59,570
And this is some form of
approximation of a graphon
201
00:14:59,570 --> 00:15:01,425
by a sequence of graphs.
202
00:15:01,425 --> 00:15:01,925
OK.
203
00:15:01,925 --> 00:15:03,740
So I'm not going to
prove this theorem.
204
00:15:03,740 --> 00:15:08,420
The proof is not difficult.
So using that definition
205
00:15:08,420 --> 00:15:11,240
of subgraph
convergence, the proof
206
00:15:11,240 --> 00:15:16,890
uses what's known as
Azuma's inequality.
207
00:15:16,890 --> 00:15:21,110
So by an appropriate application
of Azuma's inequality
208
00:15:21,110 --> 00:15:22,790
on the concentration
of martingales,
209
00:15:22,790 --> 00:15:27,110
one can prove this
theorem here by estimating
210
00:15:27,110 --> 00:15:28,970
the probability that--
211
00:15:35,180 --> 00:15:41,960
to show that the F density
in Gn is very close
212
00:15:41,960 --> 00:15:47,330
to the F density in w
213
00:15:47,330 --> 00:15:49,336
with high probability.
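The flavor of this concentration can be seen numerically. The following small simulation (my own illustration; it is not the Azuma-based argument) samples a w random graph and checks that its edge density is close to the K2 density of the graphon:

```python
import random

def empirical_edge_density(n, w, seed=0):
    """Sample a w random graph on n vertices and return its edge density."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    edge_count = sum(1 for i in range(n) for j in range(i + 1, n)
                     if rng.random() < w(xs[i], xs[j]))
    return edge_count / (n * (n - 1) / 2)

# For w(x, y) = x*y, the K2 density of the graphon is the integral
# of x*y over the unit square, which equals 1/4.  The sampled
# density concentrates around this value as n grows.
density = empirical_edge_density(400, lambda x, y: x * y)
```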
214
00:15:52,252 --> 00:15:55,145
OK.
215
00:15:55,145 --> 00:15:56,020
Any questions so far?
216
00:15:58,820 --> 00:16:02,600
So this is an important
example of one
217
00:16:02,600 --> 00:16:06,220
of the motivations
of graph limits.
218
00:16:06,220 --> 00:16:09,460
But now, let's get back
to what I said earlier.
219
00:16:09,460 --> 00:16:11,810
I would like to develop
a sequence of tools
220
00:16:11,810 --> 00:16:14,150
that will allow us to prove
the main theorem stated
221
00:16:14,150 --> 00:16:18,000
at the end of the last lecture.
222
00:16:18,000 --> 00:16:19,470
And this will sound
very familiar,
223
00:16:19,470 --> 00:16:23,610
because we're going to write
down some lemmas that we did
224
00:16:23,610 --> 00:16:26,490
back in the chapter of
Szemerédi's regularity lemma
225
00:16:26,490 --> 00:16:29,450
but now in the
language of graphons.
226
00:16:29,450 --> 00:16:31,600
So the first is
a counting lemma.
227
00:16:38,270 --> 00:16:39,770
The goal of the
counting lemma is
228
00:16:39,770 --> 00:16:42,590
to show that if you
have two graphons which
229
00:16:42,590 --> 00:16:50,060
are close to each other in the
sense of cut distance, then
230
00:16:50,060 --> 00:16:55,530
their F densities are
similar to each other.
231
00:16:55,530 --> 00:16:57,190
So here's a statement.
232
00:16:57,190 --> 00:17:05,403
So if w and u are
graphons and F is
233
00:17:05,403 --> 00:17:19,460
a graph, then the F density
of w minus the F density of u,
234
00:17:19,460 --> 00:17:24,940
their difference is no more
than a constant-- so number
235
00:17:24,940 --> 00:17:32,110
of edges of F times the cut
distance between u and w.
236
00:17:37,670 --> 00:17:41,740
So maybe some of you already
see how to do this from
237
00:17:41,740 --> 00:17:45,930
our discussion on
Szemerédi's regularity lemma.
238
00:17:45,930 --> 00:17:48,790
In any case, I want to just
rewrite the proof again
239
00:17:48,790 --> 00:17:50,350
in the language of graphons.
240
00:17:50,350 --> 00:17:52,190
And this will hopefully--
241
00:17:52,190 --> 00:17:55,700
so we did two proofs of the
triangle counting lemma.
242
00:17:55,700 --> 00:17:58,445
One was hopefully more
intuitive for you,
243
00:17:58,445 --> 00:18:00,070
which is you pick a
typical vertex that
244
00:18:00,070 --> 00:18:01,528
has lots of neighbors
on both sides
245
00:18:01,528 --> 00:18:04,412
and therefore lots
of edges between.
246
00:18:04,412 --> 00:18:06,370
And then there was a
second proof, which I said
247
00:18:06,370 --> 00:18:08,470
was a more analytic
proof, where you took out
248
00:18:08,470 --> 00:18:10,420
one edge at a time.
249
00:18:10,420 --> 00:18:13,450
And that proof, I think
it's technically easier
250
00:18:13,450 --> 00:18:16,383
to implement, especially
for general H.
251
00:18:16,383 --> 00:18:17,800
But the first time
you see it, you
252
00:18:17,800 --> 00:18:20,680
might not quite see what
the calculation was about.
253
00:18:20,680 --> 00:18:23,320
So I want to do this exact
same calculation again
254
00:18:23,320 --> 00:18:24,547
in the language of graphons.
255
00:18:24,547 --> 00:18:26,380
And hopefully, it should
be clear this time.
256
00:18:29,600 --> 00:18:31,390
So this is the same
as the counting lemma
257
00:18:31,390 --> 00:18:34,800
over epsilon-regular pairs.
258
00:18:34,800 --> 00:18:44,120
So it suffices to
prove the inequality
259
00:18:44,120 --> 00:18:49,330
where the right-hand side
is replaced not by the cut
260
00:18:49,330 --> 00:18:53,440
distance but by the cut norm.
261
00:18:53,440 --> 00:18:57,550
And the reason is that once
you have the second inequality
262
00:18:57,550 --> 00:19:04,410
by taking an infimum over all
measure-preserving bijections
263
00:19:04,410 --> 00:19:05,290
phi--
264
00:19:05,290 --> 00:19:10,990
and notice that that change
does not affect the F density.
265
00:19:10,990 --> 00:19:12,900
By taking an infimum
over phi, you
266
00:19:12,900 --> 00:19:14,752
recover the first inequality.
267
00:19:17,590 --> 00:19:22,360
I want to give you a small
reformulation of the cut norm
268
00:19:22,360 --> 00:19:25,606
that will be useful for thinking
about this counting lemma.
269
00:19:29,980 --> 00:19:37,750
Here's a reformulation
of the cut norm--
270
00:19:37,750 --> 00:19:42,470
namely, that I can
define the cut norm.
271
00:19:42,470 --> 00:19:45,840
So here, w is taking
real values, so
272
00:19:45,840 --> 00:19:48,630
not necessarily non-negative.
273
00:19:48,630 --> 00:19:52,860
So the cut norm
we saw earlier is
274
00:19:52,860 --> 00:20:01,940
defined to be the supremum
over all measurable subsets
275
00:20:01,940 --> 00:20:08,900
of the 0, 1 interval of this
integral in absolute value.
276
00:20:08,900 --> 00:20:14,780
But it turns out I can rewrite
this supremum over a slightly
277
00:20:14,780 --> 00:20:16,850
larger set of objects.
278
00:20:16,850 --> 00:20:21,500
Instead of just looking
over measurable subsets
279
00:20:21,500 --> 00:20:26,330
of the interval, let me now
look at measurable functions.
280
00:20:26,330 --> 00:20:29,130
Little u.
281
00:20:29,130 --> 00:20:32,570
So OK, let me look at functions.
282
00:20:32,570 --> 00:20:40,860
So u and v from 0, 1 to 0, 1--
283
00:20:40,860 --> 00:20:46,530
and as always, everything
is measurable--
284
00:20:46,530 --> 00:20:49,650
of the following integral.
285
00:21:01,570 --> 00:21:04,260
So I claim this is true.
286
00:21:04,260 --> 00:21:09,370
So I consider this integral.
287
00:21:09,370 --> 00:21:11,480
Instead of integrating
over a box,
288
00:21:11,480 --> 00:21:16,160
now I'm integrating
this expression.
289
00:21:16,160 --> 00:21:16,660
OK.
290
00:21:16,660 --> 00:21:19,380
So why is this true?
291
00:21:19,380 --> 00:21:23,670
Well, one of the
directions is easy to see,
292
00:21:23,670 --> 00:21:27,630
because the right-hand side
is strictly an enlargement
293
00:21:27,630 --> 00:21:29,070
of the left-hand side.
294
00:21:29,070 --> 00:21:35,940
So by taking u to be the
indicator function of S
295
00:21:35,940 --> 00:21:38,750
and v to be the indicator
function of T,
296
00:21:38,750 --> 00:21:40,680
you see that the
right-hand side, in fact,
297
00:21:40,680 --> 00:21:42,690
includes the left-hand
side in terms
298
00:21:42,690 --> 00:21:45,330
of what you are allowed to do.
299
00:21:45,330 --> 00:21:48,160
But what about the
other direction?
300
00:21:48,160 --> 00:21:50,070
So for the other
direction, the main thing
301
00:21:50,070 --> 00:21:56,700
is to notice that the
integral or the integrand,
302
00:21:56,700 --> 00:22:05,800
what's inside this integral,
is bilinear in the values of u
303
00:22:05,800 --> 00:22:12,390
and v. So in particular, the
extrema of this integral,
304
00:22:12,390 --> 00:22:17,210
as you vary u
and v, are attained.
305
00:22:17,210 --> 00:22:22,350
So they are attained
for u and v,
306
00:22:22,350 --> 00:22:31,610
taking values in the
endpoints 0, comma, 1.
307
00:22:36,030 --> 00:22:39,160
It may be helpful to think about
the discrete setting, when,
308
00:22:39,160 --> 00:22:42,070
instead of this integral, you
have a matrix and two vectors
309
00:22:42,070 --> 00:22:43,870
multiplied from left and right.
310
00:22:43,870 --> 00:22:46,840
And you had to decide,
what are the coordinates
311
00:22:46,840 --> 00:22:48,560
of those vectors?
312
00:22:48,560 --> 00:22:50,260
It's a bilinear form.
313
00:22:50,260 --> 00:22:53,090
How do you maximize
it or minimize it?
314
00:22:53,090 --> 00:22:57,900
You have to change every entry
to one of its two endpoints.
315
00:22:57,900 --> 00:23:00,660
Otherwise, it can never be--
316
00:23:00,660 --> 00:23:04,610
you never lose by doing that.
317
00:23:04,610 --> 00:23:05,950
OK, so think about it.
318
00:23:05,950 --> 00:23:12,610
So this is not difficult once
you see it the right way.
319
00:23:12,610 --> 00:23:18,630
But now, we have this cut
norm expressed over not sets,
320
00:23:18,630 --> 00:23:22,220
but over bounded functions.
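In the discrete setting the professor alludes to, this can be checked by brute force. Here is a small sketch of my own: for a matrix, maximizing the bilinear form over 0/1 vectors, that is, over subsets S and T, already attains the maximum over all [0, 1]-valued vectors, by the bilinearity argument above.

```python
from itertools import product

def cut_norm(A):
    """Brute-force cut norm of a small real matrix A: the maximum of
    |sum over i in S, j in T of A[i][j]| over subsets S, T, encoded
    below as 0/1 vectors u and v.  Since the form is bilinear in the
    entries of u and v, restricting each entry to the endpoints 0 and
    1 loses nothing compared to all of the cube [0, 1]^n."""
    n, m = len(A), len(A[0])
    return max(abs(sum(u[i] * A[i][j] * v[j]
                       for i in range(n) for j in range(m)))
               for u in product((0, 1), repeat=n)
               for v in product((0, 1), repeat=m))
```

For the matrix [[1, -1], [-1, 1]], the best choice is S = {0}, T = {0}, giving cut norm 1, even though the sum over the whole matrix is 0. This is the point made earlier: for signed entries, taking S and T to be everything is not optimal.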
321
00:23:22,220 --> 00:23:24,620
And now I'm ready to
prove the counting lemma.
322
00:23:32,400 --> 00:23:36,000
And instead of writing down
the whole proof for general H,
323
00:23:36,000 --> 00:23:40,650
let me write down the
calculation that illustrates
324
00:23:40,650 --> 00:23:42,600
this proof for triangles.
325
00:23:49,460 --> 00:23:50,840
And the general
proof is the same
326
00:23:50,840 --> 00:23:54,500
once you understand how
this argument works.
327
00:23:54,500 --> 00:24:00,770
And the argument works by
considering the difference
328
00:24:00,770 --> 00:24:09,890
between these two F densities.
329
00:24:09,890 --> 00:24:12,710
And what I want to do is--
330
00:24:12,710 --> 00:24:14,160
so this is some integral, right?
331
00:24:14,160 --> 00:24:17,090
So this is this integral,
which I'll write out.
332
00:24:41,780 --> 00:24:46,640
So we would like to show
that this quantity here
333
00:24:46,640 --> 00:24:51,730
is small if u and w
are close in cut norm.
334
00:24:51,730 --> 00:24:59,830
So let's write this integral
as a telescoping sum
335
00:24:59,830 --> 00:25:03,900
where the first term
is obtained by--
336
00:25:08,990 --> 00:25:11,150
so by this, I mean
w of x, comma, y
337
00:25:11,150 --> 00:25:12,440
minus u of x, comma, y.
338
00:25:24,440 --> 00:25:27,290
And then the second term
of the telescoping sum--
339
00:25:27,290 --> 00:25:28,950
so you see what happens.
340
00:25:28,950 --> 00:25:31,040
I change one factor at a time.
341
00:25:51,570 --> 00:25:54,810
And finally, I change
the third factor.
342
00:26:09,300 --> 00:26:10,300
So this is the identity.
343
00:26:10,300 --> 00:26:12,280
If you expand out all
of these differences,
344
00:26:12,280 --> 00:26:15,630
you see that everything
intermediate cancels out.
345
00:26:15,630 --> 00:26:19,700
So it's a telescoping sum.
346
00:26:19,700 --> 00:26:24,281
But now I want to show
that each term is small.
347
00:26:24,281 --> 00:26:28,170
So how can I show that
each term is small?
348
00:26:28,170 --> 00:26:32,400
Look at this expression here.
349
00:26:34,992 --> 00:26:38,280
I claim that for a
fixed value of z--
350
00:26:45,300 --> 00:26:47,490
so imagine fixing z.
351
00:26:47,490 --> 00:26:52,000
And let x and y vary
in this integral.
352
00:26:52,000 --> 00:26:55,760
It has the form up there, right?
353
00:26:55,760 --> 00:27:00,680
If you fix z, then
you have this u and v
354
00:27:00,680 --> 00:27:02,660
coming from these two factors.
355
00:27:02,660 --> 00:27:04,880
And they are both
bounded between 0 and 1.
356
00:27:08,090 --> 00:27:18,170
So for a fixed value of z,
this is at most w minus u--
357
00:27:18,170 --> 00:27:23,290
the cut norm difference between
w and u in absolute value.
358
00:27:27,520 --> 00:27:33,590
So if I let z vary, it is
still bounded in absolute value
359
00:27:33,590 --> 00:27:36,450
by that quantity.
360
00:27:36,450 --> 00:27:46,580
So therefore each is
bounded by w minus u cut
361
00:27:46,580 --> 00:27:49,910
norm in absolute value.
362
00:27:49,910 --> 00:27:52,410
Add all three of them together.
363
00:27:52,410 --> 00:27:57,290
We find that the whole thing
is bounded in absolute value
364
00:27:57,290 --> 00:27:59,963
by 3 times the cut
norm of the difference.
365
00:28:03,350 --> 00:28:06,660
OK, and that finishes the
proof of the counting lemma.
366
00:28:06,660 --> 00:28:10,050
For triangles, of course,
if you have general H,
367
00:28:10,050 --> 00:28:12,600
then you just have more terms.
368
00:28:12,600 --> 00:28:18,040
You have a longer telescoping
sum, and you have this bound.
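For step graphons given by symmetric matrices with parts of equal measure, the triangle case of this counting lemma can be verified numerically. This is my own sketch, with the cut norm computed by brute force, so it is feasible only for small matrices:

```python
from itertools import product

def triangle_density(A):
    """Triangle homomorphism density t(K3, .) of the step graphon
    given by the symmetric n x n matrix A (each part has measure 1/n)."""
    n = len(A)
    return sum(A[i][j] * A[j][k] * A[k][i]
               for i in range(n) for j in range(n) for k in range(n)) / n**3

def cut_norm(A):
    """Normalized cut norm of a step function, by brute force over
    all pairs of subsets of the parts."""
    n = len(A)
    return max(abs(sum(u[i] * A[i][j] * v[j]
                       for i in range(n) for j in range(n)))
               for u in product((0, 1), repeat=n)
               for v in product((0, 1), repeat=n)) / n**2

def counting_lemma_holds(W, U):
    """Check |t(K3, W) - t(K3, U)| <= 3 * ||W - U||_cut."""
    n = len(W)
    D = [[W[i][j] - U[i][j] for j in range(n)] for i in range(n)]
    return abs(triangle_density(W) - triangle_density(U)) <= 3 * cut_norm(D) + 1e-9
```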
369
00:28:18,040 --> 00:28:18,540
OK.
370
00:28:18,540 --> 00:28:19,450
So this is a counting lemma.
371
00:28:19,450 --> 00:28:22,080
And I claim that it's exactly
the same proof as the second
372
00:28:22,080 --> 00:28:24,952
proof of the counting lemma
that we did when we discussed
373
00:28:24,952 --> 00:28:27,160
Szemerédi's regularity lemma
and this counting lemma.
374
00:28:30,220 --> 00:28:33,194
Any questions?
375
00:28:33,194 --> 00:28:33,694
Yeah.
376
00:28:37,082 --> 00:28:42,487
AUDIENCE: Why did it suffice
to prove over the [INAUDIBLE]??
377
00:28:42,487 --> 00:28:43,070
PROFESSOR: OK.
378
00:28:43,070 --> 00:28:45,460
So let me answer
that in a second.
379
00:28:45,460 --> 00:28:48,280
So first, this should
be H, not F. OK,
380
00:28:48,280 --> 00:28:55,000
so your question
was, up there, why
381
00:28:55,000 --> 00:28:59,350
was it sufficient to
prove this version instead
382
00:28:59,350 --> 00:29:00,365
of that version?
383
00:29:00,365 --> 00:29:01,240
Is that the question?
384
00:29:01,240 --> 00:29:02,177
AUDIENCE: Yeah.
385
00:29:02,177 --> 00:29:02,760
PROFESSOR: OK.
386
00:29:02,760 --> 00:29:04,970
Suppose I prove it
for this version.
387
00:29:04,970 --> 00:29:06,870
So I know this is true.
388
00:29:06,870 --> 00:29:09,610
Now I take infimum
of both sides.
389
00:29:09,610 --> 00:29:17,990
So now I consider
infimum of both sides.
390
00:29:17,990 --> 00:29:21,380
So then this is true, right?
391
00:29:21,380 --> 00:29:24,440
Because it's true for every phi.
392
00:29:24,440 --> 00:29:28,490
But the left-hand side doesn't
change, because the F density
393
00:29:28,490 --> 00:29:32,930
of a relabeling of the vertices
is still the same quantity,
394
00:29:32,930 --> 00:29:34,880
whereas this one
here is now that.
395
00:29:40,226 --> 00:29:41,198
All right.
396
00:29:44,600 --> 00:29:53,320
So what we see as a corollary
of this counting lemma
397
00:29:53,320 --> 00:29:58,540
is that if you are a Cauchy
sequence with respect
398
00:29:58,540 --> 00:30:06,940
to the cut distance,
then the sequence
399
00:30:06,940 --> 00:30:09,347
is automatically convergent.
400
00:30:15,663 --> 00:30:17,330
So recall the definition
of convergence.
401
00:30:17,330 --> 00:30:20,920
Convergence has to do with
F densities converging.
402
00:30:20,920 --> 00:30:22,940
And if you have a
Cauchy sequence,
403
00:30:22,940 --> 00:30:25,970
then the F densities converge.
404
00:30:25,970 --> 00:30:29,000
And also, a related
but different statement
405
00:30:29,000 --> 00:30:35,180
is that if you have
a sequence wn that
406
00:30:35,180 --> 00:30:41,040
converges to w in
cut distance, then
407
00:30:41,040 --> 00:30:45,810
it implies that wn
converges to w in the sense
408
00:30:45,810 --> 00:30:48,496
as defined for F densities.
409
00:30:51,880 --> 00:30:55,270
So qualitatively, what
the counting lemma says
410
00:30:55,270 --> 00:31:00,550
is that convergence in cut norm is stronger
than the notion of convergence
411
00:31:00,550 --> 00:31:05,260
coming from subgraph densities.
412
00:31:05,260 --> 00:31:08,668
So this is one part of
this regularity method, so
413
00:31:08,668 --> 00:31:09,460
the counting lemma.
414
00:31:09,460 --> 00:31:12,503
Of course, the other part is
the regularity lemma itself.
415
00:31:12,503 --> 00:31:13,920
So that's the next
thing we'll do.
416
00:31:17,020 --> 00:31:18,610
And it turns out
that we actually
417
00:31:18,610 --> 00:31:21,190
don't need the full strength
of the regularity lemma.
418
00:31:21,190 --> 00:31:23,740
We only need something called
a weak regularity lemma.
419
00:31:37,660 --> 00:31:41,690
What the weak regularity
lemma says is--
420
00:31:41,690 --> 00:31:44,850
I mean, you still have a
partition of the vertices.
421
00:31:44,850 --> 00:31:46,370
So let me now state
it for graphons.
422
00:31:46,370 --> 00:31:53,110
So for a partition p--
423
00:31:53,110 --> 00:31:56,920
so I have a partition
of the vertex set--
424
00:32:04,120 --> 00:32:13,100
and a symmetric,
measurable function w--
425
00:32:13,100 --> 00:32:16,080
I'm just going to omit the
word "measurable" from now on.
426
00:32:16,080 --> 00:32:18,990
Everything will be measurable.
427
00:32:18,990 --> 00:32:22,160
What I can do is, OK,
all of these sets
428
00:32:22,160 --> 00:32:24,463
are also measurable.
429
00:32:27,780 --> 00:32:38,130
I can define what's known as a
stepping operator that sends w
430
00:32:38,130 --> 00:32:43,190
to this object,
w sub p, obtained
431
00:32:43,190 --> 00:32:55,210
by averaging over
the steps si cross sj
432
00:32:55,210 --> 00:33:01,490
and replacing that graphon by
its average over each step.
433
00:33:01,490 --> 00:33:07,900
Precisely, so I
obtain a new graphon,
434
00:33:07,900 --> 00:33:11,630
a new symmetric, measurable
function, w sub p,
435
00:33:11,630 --> 00:33:20,100
where the value on x,
comma, y is defined
436
00:33:20,100 --> 00:33:23,890
to be the following quantity--
437
00:33:30,610 --> 00:33:39,040
if x, comma, y lies
in si cross sj.
438
00:33:39,040 --> 00:33:43,840
So pictorially, what happens is
that you look at your graphon.
439
00:33:47,540 --> 00:33:51,262
There's a partition
of the vertex set,
440
00:33:51,262 --> 00:33:52,853
so to speak, the interval.
441
00:33:52,853 --> 00:33:54,770
Doesn't have to be a
partition into intervals,
442
00:33:54,770 --> 00:33:57,850
but for illustration,
suppose it looks like that.
443
00:33:57,850 --> 00:34:01,850
And what I do is I take
this w, and I replace it
444
00:34:01,850 --> 00:34:06,590
by a new graphon, a new
symmetric, measurable function,
445
00:34:06,590 --> 00:34:12,749
w sub p, obtained by averaging.
446
00:34:16,421 --> 00:34:17,600
Take each box.
447
00:34:17,600 --> 00:34:18,860
Replace it by its average.
448
00:34:18,860 --> 00:34:22,310
Put that average into the box.
449
00:34:22,310 --> 00:34:26,920
So this is what w sub
p is supposed to be.
450
00:34:26,920 --> 00:34:29,710
Just a few minor technicalities.
451
00:34:29,710 --> 00:34:39,690
If this denominator is equal
to 0, let's ignore the set.
452
00:34:39,690 --> 00:34:42,679
I mean, then you have a
zero measure set, anyway,
453
00:34:42,679 --> 00:34:44,820
so we ignore that set.
454
00:34:44,820 --> 00:34:47,330
So everything will be
treated up to measure zero,
455
00:34:47,330 --> 00:34:49,850
changing the function
on measure zero sets.
456
00:34:49,850 --> 00:34:53,883
So it doesn't really matter
if you're not strictly
457
00:34:53,883 --> 00:34:55,050
allowed to do this division.
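As a concrete sketch (my own discretization, not from the lecture), the stepping operator acts on a matrix version of a graphon by block-averaging; the function name `step` and the index-array representation of the partition are mine:

```python
import numpy as np

def step(W, parts):
    """Stepping operator W -> W_P on a discretized graphon:
    replace each block S_i x S_j by its average value.
    `parts` is a list of index arrays partitioning range(len(W))."""
    WP = np.empty_like(W, dtype=float)
    for Si in parts:
        for Sj in parts:
            block = np.ix_(Si, Sj)
            WP[block] = W[block].mean()  # constant on each step
    return WP
```

As expected of a projection, stepping is idempotent: applying `step` twice with the same partition gives the same result as applying it once, and symmetry of W is preserved.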
458
00:34:58,310 --> 00:34:59,200
OK.
459
00:34:59,200 --> 00:35:01,990
So this operator plays
an important role
460
00:35:01,990 --> 00:35:03,820
in the regularity
lemma, because it's
461
00:35:03,820 --> 00:35:07,050
how we think about partitioning,
what happens to a graph
462
00:35:07,050 --> 00:35:08,260
under partitioning.
463
00:35:08,260 --> 00:35:12,640
It has several other names if
you look at it from slightly
464
00:35:12,640 --> 00:35:14,060
different perspectives.
465
00:35:14,060 --> 00:35:19,400
So you can view
it as a projection
466
00:35:19,400 --> 00:35:22,280
in the sense of Hilbert space.
467
00:35:22,280 --> 00:35:35,170
So in the Hilbert space of
functions on the unit square,
468
00:35:35,170 --> 00:35:44,840
the stepping operator is a
projection onto the subspace
469
00:35:44,840 --> 00:35:52,090
of constants,
subspace of functions
470
00:35:52,090 --> 00:35:56,660
that are constant on each step.
471
00:36:05,210 --> 00:36:06,920
So that's one interpretation.
472
00:36:06,920 --> 00:36:09,860
Another interpretation is
that this operation is also
473
00:36:09,860 --> 00:36:11,870
a conditional expectation.
474
00:36:17,340 --> 00:36:21,900
If you know what a conditional
expectation actually
475
00:36:21,900 --> 00:36:25,130
is in the sense of
probability theory,
476
00:36:25,130 --> 00:36:26,940
so then that's
what happens here.
477
00:36:26,940 --> 00:36:30,720
If you view 0, 1 squared
as a probability space,
478
00:36:30,720 --> 00:36:35,340
then what we're doing is we're
doing conditional expectation
479
00:36:35,340 --> 00:36:39,750
relative to the sigma algebra
generated by these steps.
480
00:36:41,793 --> 00:36:43,710
So these are just a
couple of ways of thinking
481
00:36:43,710 --> 00:36:44,627
about what's going on.
482
00:36:44,627 --> 00:36:46,290
They might be somewhat
helpful later on
483
00:36:46,290 --> 00:36:47,873
if you're familiar
with these notions.
484
00:36:47,873 --> 00:36:49,705
But if you're not,
don't worry about it.
485
00:36:49,705 --> 00:36:51,330
Concretely, it's what
happens up there.
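In symbols (notation mine), the two viewpoints say the same thing: stepping is the conditional expectation with respect to the sigma-algebra generated by the steps, which is also the closest step function in L2:

```latex
W_{\mathcal{P}}
\;=\; \mathbb{E}\!\left[\, W \,\middle|\, \sigma\bigl(\{S_i \times S_j\}\bigr) \right]
\;=\; \operatorname*{arg\,min}_{\substack{U \text{ constant on}\\ \text{each } S_i \times S_j}} \|W - U\|_2 .
```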
486
00:36:58,340 --> 00:36:58,990
OK.
487
00:36:58,990 --> 00:37:01,930
So now let me state the
weak regularity lemma.
488
00:37:13,530 --> 00:37:16,550
So the weak regularity
lemma is attributed
489
00:37:16,550 --> 00:37:25,800
to Frieze and Kannan,
although their work predates
490
00:37:25,800 --> 00:37:27,540
the language of graphons.
491
00:37:27,540 --> 00:37:29,720
So it's stated in the
language of graphs,
492
00:37:29,720 --> 00:37:30,720
but it's the same proof.
493
00:37:30,720 --> 00:37:33,410
So let me state it for you
both in terms of graphons
494
00:37:33,410 --> 00:37:35,070
and in graphs.
495
00:37:35,070 --> 00:37:48,160
What it says is that for every
epsilon and every graphon w,
496
00:37:48,160 --> 00:38:00,760
there exists a partition
denoted p of the 0, 1 interval.
497
00:38:00,760 --> 00:38:03,110
And now I tell you how
many sets there are.
498
00:38:03,110 --> 00:38:05,320
So it's a partition into--
499
00:38:05,320 --> 00:38:08,300
so not a tower-type
number of parts,
500
00:38:08,300 --> 00:38:11,920
but only roughly an
exponential number of parts--
501
00:38:11,920 --> 00:38:22,250
4 to the 1 over epsilon
squared measurable sets such
502
00:38:22,250 --> 00:38:29,710
that if we apply the stepping
operator to this graphon,
503
00:38:29,710 --> 00:38:35,538
we obtain an approximation of
the graphon in the cut norm.
504
00:38:40,520 --> 00:38:45,050
So that's the statement of
the weak regularity lemma.
505
00:38:45,050 --> 00:38:51,620
There exists a partition such
that if you do this stepping,
506
00:38:51,620 --> 00:38:53,460
then you obtain
an approximation.
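In symbols, the statement just made is:

```latex
\text{For every } \varepsilon > 0 \text{ and every graphon } W,
\text{ there is a partition } \mathcal{P} \text{ of } [0,1]
\text{ into at most } 4^{1/\varepsilon^2} \text{ measurable sets}
\text{ such that } \|W - W_{\mathcal{P}}\|_{\square} \le \varepsilon .
```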
507
00:38:53,460 --> 00:38:56,120
So I want you to think about
what this has to do with
508
00:38:56,120 --> 00:38:58,600
the usual version of Szemerédi's
regularity lemma that
509
00:38:58,600 --> 00:39:00,030
you've seen earlier.
510
00:39:00,030 --> 00:39:01,970
So hopefully, you
should realize, morally,
511
00:39:01,970 --> 00:39:04,660
they're about the same
types of statements.
512
00:39:04,660 --> 00:39:07,980
But more importantly, how are
they different from each other?
513
00:39:07,980 --> 00:39:12,620
And now let me state a version
for graphs, which is similar
514
00:39:12,620 --> 00:39:17,090
but not exactly the same as
what we just saw for graphons.
515
00:39:17,090 --> 00:39:19,520
So let me state it.
516
00:39:19,520 --> 00:39:26,300
So for graphs, the
weak regularity lemma
517
00:39:26,300 --> 00:39:36,420
says that, OK, so for graphs,
let me define a partition
518
00:39:36,420 --> 00:39:55,130
p of the vertex set is
called weakly epsilon regular
519
00:39:55,130 --> 00:39:58,360
if the following is true.
520
00:39:58,360 --> 00:40:03,055
If it is the case that whenever
I look at two vertex subsets,
521
00:40:03,055 --> 00:40:08,650
A and B, of the
vertex set of g, then
522
00:40:08,650 --> 00:40:13,880
the number of edges
between A and B
523
00:40:13,880 --> 00:40:21,530
is what you should expect based
on the density information that
524
00:40:21,530 --> 00:40:24,710
comes out of this partition.
525
00:40:24,710 --> 00:40:32,830
Namely, if I sum over all
the parts of the partition,
526
00:40:32,830 --> 00:40:46,200
look at how many vertices from A
lie in the corresponding parts.
527
00:40:46,200 --> 00:40:51,090
And then multiply by the edge
density between these parts.
528
00:40:51,090 --> 00:40:53,820
So that's your predicted
value based on the data that
529
00:40:53,820 --> 00:40:55,900
comes out of the partition.
530
00:40:55,900 --> 00:40:58,170
So I claim that this is
the actual number of edges.
531
00:40:58,170 --> 00:41:00,720
This is the predicted
number of edges.
532
00:41:00,720 --> 00:41:07,395
And those two numbers should
be similar to each other by
533
00:41:07,395 --> 00:41:11,380
at most epsilon n squared, where n
is the number of vertices.
534
00:41:11,380 --> 00:41:14,680
So this is the definition of
what it means for a partition
535
00:41:14,680 --> 00:41:18,700
to be weakly epsilon regular.
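In symbols, with n the number of vertices, V_1, ..., V_k the parts of the partition, and d(V_i, V_j) the edge density between parts (here the standard Frieze–Kannan normalization epsilon n squared is used), the condition reads:

```latex
\Bigl|\, e(A,B) \;-\; \sum_{i,j} d(V_i, V_j)\, \lvert A \cap V_i \rvert \, \lvert B \cap V_j \rvert \,\Bigr|
\;\le\; \varepsilon\, n^2
\qquad \text{for all } A, B \subseteq V(G) .
```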
536
00:41:18,700 --> 00:41:22,190
So it's important to think
about why this is weaker.
537
00:41:22,190 --> 00:41:23,190
It's called weak, right?
538
00:41:23,190 --> 00:41:28,150
So why is it weaker than a
notion of epsilon regularity?
539
00:41:28,150 --> 00:41:30,450
So why is it weaker?
540
00:41:30,450 --> 00:41:34,110
So previously, we had
epsilon-regular partition
541
00:41:34,110 --> 00:41:36,900
in the definition of
Szemerédi's regularity lemma,
542
00:41:36,900 --> 00:41:38,880
this epsilon-regular partition.
543
00:41:38,880 --> 00:41:43,350
And here, notion of
weakly epsilon regular.
544
00:41:43,350 --> 00:41:44,620
So why is this a lot weaker?
545
00:41:47,460 --> 00:41:52,050
It is not saying that
individual pairs of parts
546
00:41:52,050 --> 00:41:55,355
are epsilon regular.
547
00:41:55,355 --> 00:41:57,730
And eventually, we're going
to have this number of parts.
548
00:41:57,730 --> 00:42:00,210
So I'll state a
theorem in a second.
549
00:42:00,210 --> 00:42:04,070
So the sizes of the
parts are much smaller
550
00:42:04,070 --> 00:42:07,380
than epsilon fraction.
551
00:42:07,380 --> 00:42:12,080
But what this weak notion of
regularity says, if you look
552
00:42:12,080 --> 00:42:13,950
at it globally--
553
00:42:13,950 --> 00:42:15,740
so not looking at
specific parts,
554
00:42:15,740 --> 00:42:17,450
but looking at it globally--
555
00:42:17,450 --> 00:42:19,670
then this partition is
a good approximation
556
00:42:19,670 --> 00:42:24,280
of what's going on in the
actual graph, whereas--
557
00:42:24,280 --> 00:42:25,710
OK, so it's worth
thinking about.
558
00:42:25,710 --> 00:42:27,335
It's really worth
thinking about what's
559
00:42:27,335 --> 00:42:29,990
the difference between this weak
notion and the usual notion.
560
00:42:29,990 --> 00:42:33,380
But first, let me state
this regularity lemma.
561
00:42:33,380 --> 00:42:43,330
So the weak regularity
lemma for graphs
562
00:42:43,330 --> 00:42:50,820
says that for every
epsilon and every graph G,
563
00:42:50,820 --> 00:43:03,360
there exists a weakly
epsilon-regular partition
564
00:43:03,360 --> 00:43:09,090
of the vertex set
of G into at most 4
565
00:43:09,090 --> 00:43:11,570
to the 1 over epsilon
squared parts.
566
00:43:20,240 --> 00:43:24,640
Now, you might wonder why
did Frieze and Kannan come up
567
00:43:24,640 --> 00:43:29,010
with this notion of regularity.
568
00:43:29,010 --> 00:43:32,010
It's a weaker result if you
don't care about the bounds,
569
00:43:32,010 --> 00:43:38,070
because an epsilon-regular
partition will be automatically
570
00:43:38,070 --> 00:43:41,360
weakly epsilon regular.
571
00:43:41,360 --> 00:43:43,220
So maybe with small
changes of epsilon
572
00:43:43,220 --> 00:43:46,370
if you wish, but basically,
this is a weaker notion
573
00:43:46,370 --> 00:43:47,690
compared to what we had before.
574
00:43:50,780 --> 00:43:53,560
But of course, the advantage
is that you have a much more
575
00:43:53,560 --> 00:43:56,230
reasonable number of parts.
576
00:43:56,230 --> 00:43:58,210
It's not a tower.
577
00:43:58,210 --> 00:44:01,180
It's just a single exponential.
578
00:44:01,180 --> 00:44:02,110
And this is important.
579
00:44:02,110 --> 00:44:05,740
And their motivation was a
computer science and algorithm
580
00:44:05,740 --> 00:44:06,760
application.
581
00:44:06,760 --> 00:44:11,410
So I want to take a
brief detour and mention
582
00:44:11,410 --> 00:44:18,022
why you might care about weakly
epsilon-regular partitions.
583
00:44:22,180 --> 00:44:25,240
In particular, the problem
that is of interest
584
00:44:25,240 --> 00:44:30,980
is in approximating
something called a max cut.
585
00:44:30,980 --> 00:44:38,060
So the max cut problem asks you
to determine-- given a graph G,
586
00:44:38,060 --> 00:44:46,360
find the maximum over
all subsets of vertices,
587
00:44:46,360 --> 00:44:49,610
the maximum number of
edges between a set
588
00:44:49,610 --> 00:44:51,040
and its complement.
589
00:44:51,040 --> 00:44:52,430
That's called a cut.
590
00:44:52,430 --> 00:44:56,430
I give you a graph,
and I want to know--
591
00:44:56,430 --> 00:45:01,860
find this s so that it
can have as many edges
592
00:45:01,860 --> 00:45:05,488
across this set as possible.
593
00:45:05,488 --> 00:45:07,530
This is an important
problem in computer science,
594
00:45:07,530 --> 00:45:09,120
extremely important problem.
595
00:45:09,120 --> 00:45:12,450
And the status of
this problem is
596
00:45:12,450 --> 00:45:20,640
that it is known to be difficult
to get it even within 1%.
597
00:45:20,640 --> 00:45:24,188
So the best algorithm is due
to Goemans and Williamson.
598
00:45:30,410 --> 00:45:32,120
It's an important
algorithm that was
599
00:45:32,120 --> 00:45:33,560
one of the
foundational algorithms
600
00:45:33,560 --> 00:45:35,690
in semidefinite
programming, so related--
601
00:45:35,690 --> 00:45:37,340
the words "semidefinite
programming"
602
00:45:37,340 --> 00:45:40,070
came up earlier in this course
when we discussed Grothendieck's
603
00:45:40,070 --> 00:45:40,970
inequality.
604
00:45:40,970 --> 00:45:43,820
So they came up with an
approximation algorithm.
605
00:45:43,820 --> 00:45:47,100
So here, I'm only talking
about polynomial time,
606
00:45:47,100 --> 00:45:48,830
so efficient algorithms.
607
00:45:48,830 --> 00:45:53,600
Approximation algorithm
with approximation ratio
608
00:45:53,600 --> 00:45:56,120
around 0.878.
609
00:45:56,120 --> 00:46:03,900
So one can obtain a cut
that is within basically
610
00:46:03,900 --> 00:46:07,820
13% of the maximum.
611
00:46:07,820 --> 00:46:10,540
So it's an
approximation algorithm.
612
00:46:10,540 --> 00:46:17,380
However, it is known that it is
hard in the sense of complexity
613
00:46:17,380 --> 00:46:18,540
theory.
614
00:46:18,540 --> 00:46:29,830
It'd be hard to approximate
beyond the ratio 16 over 17,
615
00:46:29,830 --> 00:46:37,000
which is around 0.941.
616
00:46:37,000 --> 00:46:38,980
And there is an
important conjecture
617
00:46:38,980 --> 00:46:41,800
in computer science called
a unique games conjecture
618
00:46:41,800 --> 00:46:44,240
that, if that
conjecture were true,
619
00:46:44,240 --> 00:46:46,710
then it would be
difficult. It would
620
00:46:46,710 --> 00:46:52,010
be hard to approximate beyond
the Goemans-Williamson ratio.
621
00:46:52,010 --> 00:46:54,070
So this indicates the
status of this problem.
622
00:46:54,070 --> 00:46:59,760
It is difficult to do an
epsilon approximation.
623
00:46:59,760 --> 00:47:03,135
But if the graph I
give you is dense--
624
00:47:10,460 --> 00:47:13,040
"dense" meaning a
quadratic number
625
00:47:13,040 --> 00:47:17,970
of edges, where n is
a number of vertices--
626
00:47:17,970 --> 00:47:25,210
then it turns out that the
regularity-type algorithms--
627
00:47:25,210 --> 00:47:28,390
so that theorem combined
with the algorithmic versions
628
00:47:28,390 --> 00:47:35,360
allows you to get polynomial
time approximation algorithms.
629
00:47:35,360 --> 00:47:38,660
So this is polynomial time
approximation schemes.
630
00:47:41,620 --> 00:47:52,000
So one can approximate up
to 1 minus epsilon ratio.
631
00:47:52,000 --> 00:47:57,940
So one can approximate
up to epsilon
632
00:47:57,940 --> 00:48:07,796
n squared additive error
in polynomial time.
633
00:48:07,796 --> 00:48:12,730
So in particular, if I'm
willing to lose 0.01 n squared,
634
00:48:12,730 --> 00:48:16,540
then there is an algorithm to
approximate the size of the max
635
00:48:16,540 --> 00:48:17,040
cut.
636
00:48:17,040 --> 00:48:21,110
And that algorithm
basically comes from--
637
00:48:21,110 --> 00:48:23,320
without giving you any
details whatsoever,
638
00:48:23,320 --> 00:48:27,310
the algorithm essentially comes
from first finding a regularity
639
00:48:27,310 --> 00:48:28,334
partition.
640
00:48:35,110 --> 00:48:40,120
So the partition breaks
the set of vertices
641
00:48:40,120 --> 00:48:43,240
into some number of pieces.
642
00:48:43,240 --> 00:48:57,640
And now I search over
all possible ratios
643
00:48:57,640 --> 00:49:01,080
to divide each piece.
644
00:49:04,280 --> 00:49:06,210
So there is a bounded
number of parts.
645
00:49:06,210 --> 00:49:09,320
Each one of those, I decide,
do I cut this up half-half?
646
00:49:09,320 --> 00:49:13,270
Do I cut it up 1/3,
2/3, and so on?
647
00:49:13,270 --> 00:49:17,940
And those numbers alone,
because of this definition
648
00:49:17,940 --> 00:49:22,040
of weakly epsilon
regular, once you
649
00:49:22,040 --> 00:49:27,005
know what the intersections
of the set, call it A, and
650
00:49:27,005 --> 00:49:29,780
its complement are with the
individual parts,
651
00:49:29,780 --> 00:49:32,510
then I basically know
the number of edges.
652
00:49:32,510 --> 00:49:36,800
So I can approximate
the size of the max cut
653
00:49:36,800 --> 00:49:41,300
using a weakly
epsilon-regular partition.
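For concreteness, here is the exact quantity being approximated, computed by brute force on a tiny graph (an illustration only, with my own function name — the regularity-based scheme exists precisely to avoid this exponential search):

```python
from itertools import combinations

def max_cut(n, edges):
    """Exact MaxCut(G) by trying every vertex subset S and counting
    edges crossing between S and its complement.  Exponential in n,
    so only usable on tiny graphs; the regularity-based scheme
    replaces this with a search over split ratios of O(1) parts."""
    best = 0
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            S = set(subset)
            crossing = sum(1 for u, v in edges if (u in S) != (v in S))
            best = max(best, crossing)
    return best
```

On a 4-cycle every edge can be made to cross (the graph is bipartite), while on a triangle at most two of the three edges can cross any cut.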
654
00:49:41,300 --> 00:49:47,360
So that was the motivation
for these weakly epsilon
655
00:49:47,360 --> 00:49:51,820
partitions, at least the
algorithmic application.
656
00:49:51,820 --> 00:49:52,320
OK.
657
00:49:52,320 --> 00:49:53,420
Any questions?
658
00:49:56,240 --> 00:49:56,740
OK.
659
00:49:56,740 --> 00:49:58,150
So let's take a quick break.
660
00:49:58,150 --> 00:50:00,100
And then afterwards,
I want to show
661
00:50:00,100 --> 00:50:03,160
you the proof of the
weak regularity lemma.
662
00:50:05,730 --> 00:50:06,230
All right.
663
00:50:06,230 --> 00:50:12,560
So let me start the proof of
the weak regularity lemma.
664
00:50:12,560 --> 00:50:14,775
And the proof is by this
energy increment argument.
665
00:50:14,775 --> 00:50:16,400
So let's see what
this energy increment
666
00:50:16,400 --> 00:50:19,700
argument looks like in
the language of graphons.
667
00:50:19,700 --> 00:50:27,610
So energy now means L2,
so L2 energy increment.
668
00:50:27,610 --> 00:50:29,600
So the statement
of this lemma is
669
00:50:29,600 --> 00:50:42,230
that if you have w, a graphon,
and p, a partition, of 0,
670
00:50:42,230 --> 00:50:46,740
comma, 1 interval such that--
671
00:50:50,120 --> 00:50:51,260
always measurable pieces.
672
00:50:51,260 --> 00:50:52,300
I'm not going to even write it.
673
00:50:52,300 --> 00:50:53,592
It's always measurable pieces--
674
00:50:57,320 --> 00:51:08,390
such that the difference between
w and w averaged over steps p
675
00:51:08,390 --> 00:51:11,420
is bigger than epsilon.
676
00:51:11,420 --> 00:51:14,390
So this is the notion
of being not epsilon
677
00:51:14,390 --> 00:51:22,280
regular in the weak sense,
not weakly epsilon regular.
678
00:51:22,280 --> 00:51:33,100
Then there exists a
refinement, p prime of p,
679
00:51:33,100 --> 00:51:45,430
dividing each part of p
into at most four parts
680
00:51:45,430 --> 00:51:57,380
such that the L2 norm squared
increases by more than epsilon
681
00:51:57,380 --> 00:52:02,040
squared under this refinement.
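In symbols, the lemma is:

```latex
\|W - W_{\mathcal{P}}\|_{\square} > \varepsilon
\;\Longrightarrow\;
\exists\, \mathcal{P}' \text{ refining } \mathcal{P},
\text{ each part split into at most 4 pieces, with }
\|W_{\mathcal{P}'}\|_2^2 \;>\; \|W_{\mathcal{P}}\|_2^2 + \varepsilon^2 .
```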
682
00:52:02,040 --> 00:52:04,110
So it should be similar.
683
00:52:04,110 --> 00:52:06,450
It should be familiar to
you, because we have similar
684
00:52:06,450 --> 00:52:09,686
arguments from Szemerédi's
regularity lemma.
685
00:52:09,686 --> 00:52:10,644
So let's see the proof.
686
00:52:13,490 --> 00:52:18,380
Because you have violation
of weak epsilon regularity,
687
00:52:18,380 --> 00:52:23,250
there exists sets S and T,
measurable subsets of 0,
688
00:52:23,250 --> 00:52:29,510
1 interval, such that this
integral evaluated over S
689
00:52:29,510 --> 00:52:39,140
cross T is more than
epsilon in absolute value.
690
00:52:39,140 --> 00:52:55,690
So now let me take p prime to
be the common refinement of p
691
00:52:55,690 --> 00:53:07,890
by introducing S and
T into this partition.
692
00:53:07,890 --> 00:53:10,900
So throw S and T in
and break everything
693
00:53:10,900 --> 00:53:12,930
according to S and T.
694
00:53:12,930 --> 00:53:21,140
And so each part becomes
at most four subparts.
695
00:53:21,140 --> 00:53:22,960
So that's the at
most four subparts.
696
00:53:25,780 --> 00:53:29,060
I now need to show that I
have an energy increment.
697
00:53:29,060 --> 00:53:33,340
And to do this, let
me first perform
698
00:53:33,340 --> 00:53:36,530
the following calculation.
699
00:53:36,530 --> 00:53:41,590
So remember, this symbol
here is the inner product
700
00:53:41,590 --> 00:53:44,230
obtained by multiplying
and integrating
701
00:53:44,230 --> 00:53:46,890
over the entire box.
702
00:53:46,890 --> 00:53:52,400
I claim that that
inner product equals
703
00:53:52,400 --> 00:54:00,790
to the inner product
between wp and wp prime,
704
00:54:00,790 --> 00:54:08,580
because what happens here is
we are looking at a situation
705
00:54:08,580 --> 00:54:15,510
where wp prime is
constant on each part.
706
00:54:15,510 --> 00:54:20,920
So when I do this inner product,
I can replace w by its average.
707
00:54:20,920 --> 00:54:23,810
And likewise, over here, I can
also replace it by its average.
708
00:54:23,810 --> 00:54:26,990
And you end up having
the same average.
709
00:54:26,990 --> 00:54:33,340
And these two averages
are both just what happens
710
00:54:33,340 --> 00:54:35,170
if you do stepping by p.
711
00:54:38,440 --> 00:54:48,380
You also have that w has inner
product with 1 sub S cross T
712
00:54:48,380 --> 00:54:54,780
the same as that of w sub p
prime, for the same reason,
713
00:54:54,780 --> 00:55:00,790
because over S
cross T. So S cross
714
00:55:00,790 --> 00:55:06,000
T is a union of the
parts of p prime.
715
00:55:06,000 --> 00:55:16,360
So S is union of
parts of p prime.
716
00:55:16,360 --> 00:55:16,860
OK.
717
00:55:16,860 --> 00:55:18,140
So let's see.
718
00:55:18,140 --> 00:55:21,770
With those observations,
you find that--
719
00:55:30,580 --> 00:55:33,580
so this is true.
720
00:55:33,580 --> 00:55:35,870
This is from the first equality.
721
00:55:35,870 --> 00:55:40,795
So now let me draw
you a right triangle.
722
00:55:49,890 --> 00:55:51,840
So you have a right
angle, because you have
723
00:55:51,840 --> 00:55:54,450
an inner product that is 0.
724
00:55:54,450 --> 00:56:04,530
So by Pythagorean theorem,
so what is this hypotenuse?
725
00:56:04,530 --> 00:56:06,520
So you add these two vectors.
726
00:56:06,520 --> 00:56:14,060
And you find out this wp prime.
727
00:56:14,060 --> 00:56:16,010
So by Pythagorean
theorem, you find
728
00:56:16,010 --> 00:56:20,540
that the L2 norm squared
of wp prime equals
729
00:56:20,540 --> 00:56:33,990
the sum of
the L2 norm squares of the two
730
00:56:33,990 --> 00:56:35,910
legs of this right triangle.
731
00:56:43,420 --> 00:56:48,153
On the other hand,
this quantity here.
732
00:56:48,153 --> 00:56:50,070
So let's think about
that quantity over there.
733
00:56:52,810 --> 00:56:54,420
It's an L2 norm.
734
00:56:54,420 --> 00:57:16,580
So in particular, it is at
least this quantity here,
735
00:57:16,580 --> 00:57:20,180
which you can derive
in one of many ways--
736
00:57:20,180 --> 00:57:25,840
for example, by Cauchy-Schwarz
inequality or go from L2 to L1
737
00:57:25,840 --> 00:57:28,330
and then pass down to L1.
738
00:57:28,330 --> 00:57:31,890
So this is true.
739
00:57:31,890 --> 00:57:33,687
So let's say by Cauchy-Schwarz.
740
00:57:49,570 --> 00:57:55,580
But this quantity here, we
said was bigger than epsilon.
741
00:58:04,690 --> 00:58:12,180
So as a result,
this final quantity,
742
00:58:12,180 --> 00:58:17,300
this L2 norm of
the new refinement,
743
00:58:17,300 --> 00:58:20,540
increases from the previous one
by more than epsilon squared.
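Chaining the displayed steps together, using the Pythagorean identity and then Cauchy–Schwarz with the fact that the L2 norm of the indicator of S cross T is at most 1:

```latex
\|W_{\mathcal{P}'}\|_2^2 - \|W_{\mathcal{P}}\|_2^2
\;=\; \|W_{\mathcal{P}'} - W_{\mathcal{P}}\|_2^2
\;\ge\; \bigl\langle W_{\mathcal{P}'} - W_{\mathcal{P}},\, \mathbf{1}_{S \times T} \bigr\rangle^2
\;=\; \bigl\langle W - W_{\mathcal{P}},\, \mathbf{1}_{S \times T} \bigr\rangle^2
\;>\; \varepsilon^2 .
```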
744
00:58:24,620 --> 00:58:25,880
OK.
745
00:58:25,880 --> 00:58:27,910
So this is the L2 energy
increment argument.
746
00:58:27,910 --> 00:58:29,870
I claim it's the same
argument, basically,
747
00:58:29,870 --> 00:58:32,480
as the one that we did for
Szemerédi's regularity lemma.
748
00:58:32,480 --> 00:58:34,700
And I encourage you to
go back and compare them
749
00:58:34,700 --> 00:58:36,200
to see why they're the same.
750
00:58:40,280 --> 00:58:41,360
All right, moving on.
751
00:58:41,360 --> 00:58:45,230
So the other part
of regularity lemma
752
00:58:45,230 --> 00:58:48,820
is to iterate this approach.
753
00:58:48,820 --> 00:58:51,980
So if you have something
which is not epsilon regular,
754
00:58:51,980 --> 00:58:52,790
refine it.
755
00:58:52,790 --> 00:58:53,960
And then iterate.
756
00:58:53,960 --> 00:58:58,820
And you cannot proceed more
than a bounded number of times,
757
00:58:58,820 --> 00:59:02,390
because energy is always
bounded between 0 and 1.
758
00:59:02,390 --> 00:59:09,260
So for every epsilon bigger
than 0 and graphon w,
759
00:59:09,260 --> 00:59:17,210
suppose you have P0, a
partition of 0, 1 interval
760
00:59:17,210 --> 00:59:19,960
into measurable sets.
761
00:59:19,960 --> 00:59:38,280
Then there exists a partition
p that cuts up each part of P0
762
00:59:38,280 --> 00:59:47,460
into at most 4 to the
1 over epsilon squared parts
763
00:59:47,460 --> 00:59:55,920
such that w minus w sub
p is at most epsilon.
764
00:59:55,920 --> 00:59:59,620
So I'm basically restating
the weak regularity lemma
765
00:59:59,620 --> 01:00:03,630
over there but with a
small difference, which
766
01:00:03,630 --> 01:00:07,020
will become useful later on
when we prove compactness.
767
01:00:07,020 --> 01:00:09,645
Namely, I'm allowed to
start with any partition.
768
01:00:09,645 --> 01:00:11,520
Instead of starting with
a trivial partition,
769
01:00:11,520 --> 01:00:14,100
I can start with any partition.
770
01:00:14,100 --> 01:00:16,382
This was also true when
we were talking about
771
01:00:16,382 --> 01:00:18,840
Szemerédi's regularity lemma,
although I didn't stress that
772
01:00:18,840 --> 01:00:20,320
point.
773
01:00:20,320 --> 01:00:21,858
That's certainly the case here.
774
01:00:21,858 --> 01:00:23,400
I mean, the proof
is exactly the same
775
01:00:23,400 --> 01:00:26,250
with or without this extra.
776
01:00:26,250 --> 01:00:30,780
This extra P0 really plays
an insignificant role.
777
01:00:30,780 --> 01:00:34,520
What happens, as in the proof
of Szemerédi's regularity lemma,
778
01:00:34,520 --> 01:00:42,770
is that we repeatedly apply
the previous lemma to obtain
779
01:00:42,770 --> 01:00:56,040
the sequence of partitions
of the 0, 1 interval where,
780
01:00:56,040 --> 01:01:10,160
each step, either we find that
we obtain some partition p sub
781
01:01:10,160 --> 01:01:15,790
i such that it's a good
approximation of w,
782
01:01:15,790 --> 01:01:33,150
in which case we stop, or the
L2 energy increases by more than
783
01:01:33,150 --> 01:01:34,750
epsilon squared.
784
01:01:40,630 --> 01:01:49,890
And since the final energy
is always at most 1--
785
01:01:49,890 --> 01:01:52,620
so it's always bounded
between 0 and 1--
786
01:01:52,620 --> 01:02:01,060
we must stop after at
most 1 over epsilon squared steps.
787
01:02:06,460 --> 01:02:14,160
And if you calculate
the number of parts,
788
01:02:14,160 --> 01:02:20,165
each part is subdivided
into at most four parts
789
01:02:20,165 --> 01:02:26,780
at each step, which
gives you the conclusion
790
01:02:26,780 --> 01:02:29,580
on the final number of parts.
791
01:02:29,580 --> 01:02:31,640
OK, so very similar
to what we did before.
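The bookkeeping in symbols: the energy lies in the unit interval and jumps by more than epsilon squared at each refinement, so the iteration halts quickly, and each round multiplies the number of parts by at most 4:

```latex
\|W_{\mathcal{P}_i}\|_2^2 \in [0,1]
\quad\text{and}\quad
\|W_{\mathcal{P}_{i+1}}\|_2^2 > \|W_{\mathcal{P}_i}\|_2^2 + \varepsilon^2
\;\Longrightarrow\;
\#\text{steps} < \varepsilon^{-2},
\qquad
\#\text{parts} \le |\mathcal{P}_0| \cdot 4^{1/\varepsilon^2} .
```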
792
01:02:35,780 --> 01:02:36,720
All right.
793
01:02:36,720 --> 01:02:41,850
So that concludes the discussion
of the weak regularity lemma.
794
01:02:41,850 --> 01:02:44,360
So basically the same proof.
795
01:02:44,360 --> 01:02:48,403
Weaker conclusion and
better quantitative bounds.
796
01:02:48,403 --> 01:02:50,820
The next thing and the final
thing I want to discuss today
797
01:02:50,820 --> 01:02:55,140
is a new ingredient which
we haven't seen before
798
01:02:55,140 --> 01:02:58,110
but that will play an
important role in the proof
799
01:02:58,110 --> 01:02:59,580
of the compactness--
800
01:02:59,580 --> 01:03:03,160
in particular, the proof of
the existence of the limit.
801
01:03:03,160 --> 01:03:08,280
And this is something where I
need to discuss martingales.
802
01:03:12,410 --> 01:03:15,010
So a martingale is
an important object
803
01:03:15,010 --> 01:03:16,555
in probability theory.
804
01:03:16,555 --> 01:03:18,865
And it's a random sequence.
805
01:03:23,620 --> 01:03:28,350
So we'll look at discrete
sequences, so indexed
806
01:03:28,350 --> 01:03:32,010
by non-negative integers.
807
01:03:32,010 --> 01:03:36,620
And a martingale is
such a sequence where
808
01:03:36,620 --> 01:03:43,330
if I'm interested in the
expectation of the next term
809
01:03:43,330 --> 01:03:47,720
and even if you know
all the previous terms--
810
01:03:47,720 --> 01:03:51,530
so you have full knowledge of
the sequence before time n,
811
01:03:51,530 --> 01:03:55,000
and you want to predict
on the expectation what
812
01:03:55,000 --> 01:03:56,440
the nth term is--
813
01:03:56,440 --> 01:04:04,830
then you cannot do better than
simply predicting the last term
814
01:04:04,830 --> 01:04:06,800
that you saw.
815
01:04:06,800 --> 01:04:11,000
So this is the definition
of a martingale.
816
01:04:11,000 --> 01:04:13,730
Now, to do this
formally, I need to talk
817
01:04:13,730 --> 01:04:18,080
about filtrations and what
not in measure theory.
818
01:04:18,080 --> 01:04:20,702
But let me not do that.
819
01:04:20,702 --> 01:04:22,160
OK, so this is how
you should think
820
01:04:22,160 --> 01:04:25,960
about martingales and a
couple of important examples
821
01:04:25,960 --> 01:04:27,080
of martingales.
822
01:04:27,080 --> 01:04:31,670
So the first one comes
from-- the reason
823
01:04:31,670 --> 01:04:35,000
why these things are called
martingales is that there
824
01:04:35,000 --> 01:04:37,280
is a gambling strategy
which is related
825
01:04:37,280 --> 01:04:44,720
to such a sequence where
let's say you consider
826
01:04:44,720 --> 01:04:48,130
a sequence of fair coin tosses.
827
01:04:48,130 --> 01:04:50,500
So here's what
we're going to do.
828
01:04:50,500 --> 01:04:53,850
So suppose we consider
a betting strategy.
829
01:05:03,240 --> 01:05:17,080
And x sub n is equal
to your balance at time n.
830
01:05:17,080 --> 01:05:21,120
And suppose that we're
looking at a fair casino
831
01:05:21,120 --> 01:05:27,700
where the expectation of
every game is exactly 0.
832
01:05:27,700 --> 01:05:31,260
Then this is a martingale.
833
01:05:31,260 --> 01:05:33,000
So imagine you have
a sequence of coin
834
01:05:33,000 --> 01:05:38,600
flips, and you win $1 for each
head and lose $1 for each tail.
835
01:05:38,600 --> 01:05:42,050
Say, at time five, you
have $2 in your pocket.
836
01:05:42,050 --> 01:05:46,420
Then at time five
plus 1, you expect
837
01:05:46,420 --> 01:05:48,505
to also have that many dollars.
838
01:05:48,505 --> 01:05:49,130
It might go up.
839
01:05:49,130 --> 01:05:49,838
It might go down.
840
01:05:49,838 --> 01:05:52,726
But in expectation,
it doesn't change.
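A quick numerical sanity check of this fair-coin martingale (a sketch of mine, not from the lecture; the function names are invented): simulate many balance paths and verify the average drift is near zero.

```python
import random

def balance_path(n_steps, rng):
    """X_0, X_1, ..., X_n for the fair-coin game: win $1 on heads,
    lose $1 on tails, so E[X_{n+1} | X_0, ..., X_n] = X_n."""
    x, path = 0, [0]
    for _ in range(n_steps):
        x += 1 if rng.random() < 0.5 else -1
        path.append(x)
    return path

def empirical_drift(n_paths, n_steps, seed=0):
    """Average final balance over many simulated paths; for a
    martingale started at 0 this should be close to 0."""
    rng = random.Random(seed)
    total = sum(balance_path(n_steps, rng)[-1] for _ in range(n_paths))
    return total / n_paths
```

Over a couple of thousand simulated paths the empirical drift is close to 0, even though any individual path wanders up and down.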
841
01:05:52,726 --> 01:05:53,670
Is there a question?
842
01:05:56,220 --> 01:05:56,720
OK.
843
01:05:56,720 --> 01:05:59,778
So they're asking about,
is there some independence
844
01:05:59,778 --> 01:06:00,570
condition required?
845
01:06:00,570 --> 01:06:02,190
And the answer is no.
846
01:06:02,190 --> 01:06:04,540
So there's no independence
condition that is required.
847
01:06:04,540 --> 01:06:06,270
So the definition
of a martingale
848
01:06:06,270 --> 01:06:10,020
is just if, even with complete
knowledge of the sequence up
849
01:06:10,020 --> 01:06:14,130
to a certain point, the
difference going forward
850
01:06:14,130 --> 01:06:16,565
is 0 in expectation.
851
01:06:22,570 --> 01:06:27,970
OK, so here's another
example of a martingale,
852
01:06:27,970 --> 01:06:31,490
which actually turns out to
be more relevant to our use--
853
01:06:34,250 --> 01:06:39,980
namely, that if I
have some hidden--
854
01:06:39,980 --> 01:06:44,540
think of x as some hidden
random variable, so something
855
01:06:44,540 --> 01:06:46,880
that you have no idea.
856
01:06:46,880 --> 01:06:56,170
But you can estimate
it at time n based
857
01:06:56,170 --> 01:07:07,100
on information up to time n.
858
01:07:11,380 --> 01:07:16,600
So for example, suppose
you have no idea who
859
01:07:16,600 --> 01:07:21,910
is going to win the
presidential election.
860
01:07:21,910 --> 01:07:24,550
And really, nobody has any idea.
861
01:07:24,550 --> 01:07:28,990
But as time proceeds, you
make an educated guess
862
01:07:28,990 --> 01:07:30,790
based on the information
that you have,
863
01:07:30,790 --> 01:07:33,890
all the information you
have up to that point.
864
01:07:33,890 --> 01:07:36,590
And that information becomes
a larger and larger set
865
01:07:36,590 --> 01:07:38,420
as time moves forward.
866
01:07:38,420 --> 01:07:41,090
Your prediction is going to
be a random variable that
867
01:07:41,090 --> 01:07:43,790
goes up and down.
868
01:07:43,790 --> 01:07:48,120
And that will be a
martingale, because--
869
01:07:48,120 --> 01:07:52,980
so when I predict today,
I take into account what
870
01:07:52,980 --> 01:07:56,660
all the possibilities are
going forward,
871
01:07:56,660 --> 01:08:00,300
well, one of many
things could happen.
872
01:08:00,300 --> 01:08:05,810
But if I knew that my prediction
is going to, in expectation,
873
01:08:05,810 --> 01:08:08,390
shift upwards, then
I shouldn't have
874
01:08:08,390 --> 01:08:09,710
predicted what I predicted today.
875
01:08:09,710 --> 01:08:13,300
I should have predicted
upwards anyway.
876
01:08:13,300 --> 01:08:13,800
OK.
877
01:08:13,800 --> 01:08:19,819
So this is another
construction of martingales.
878
01:08:19,819 --> 01:08:21,410
So this also comes up.
879
01:08:21,410 --> 01:08:26,120
You could have other more pure
mathematics-type examples,
880
01:08:26,120 --> 01:08:29,930
where suppose I
want to know what
881
01:08:29,930 --> 01:08:34,490
is the chromatic number
of a random graph.
882
01:08:34,490 --> 01:08:38,960
And I show you that
graph one edge at a time.
883
01:08:38,960 --> 01:08:41,270
You can predict the expectation.
884
01:08:41,270 --> 01:08:44,540
You can find the expectation
of this graph statistic
885
01:08:44,540 --> 01:08:47,630
based on what you've
seen up to time n.
886
01:08:47,630 --> 01:08:51,979
And that sequence
will be a martingale.
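The lecture doesn't give code, but this edge-by-edge construction can be checked exactly on a tiny graph. Everything below is an illustrative choice: four vertices, the triangle count in place of the chromatic number (it's cheaper to compute), and edge probability 1/2.

```python
import itertools

VERTS = range(4)
EDGES = list(itertools.combinations(VERTS, 2))      # the 6 possible edges
TRIANGLES = list(itertools.combinations(VERTS, 3))  # the 4 possible triangles

def n_triangles(edge_set):
    """Number of triangles whose three edges are all present."""
    return sum(all(p in edge_set for p in itertools.combinations(t, 2))
               for t in TRIANGLES)

def cond_exp(revealed):
    """E[#triangles | the first len(revealed) edge indicators], with
    every unrevealed edge present independently with probability 1/2."""
    hidden = EDGES[len(revealed):]
    total = 0
    for bits in itertools.product((0, 1), repeat=len(hidden)):
        present = {e for e, b in zip(EDGES, tuple(revealed) + bits) if b}
        total += n_triangles(present)
    return total / 2 ** len(hidden)

# The martingale property, checked exactly: averaging over the two
# ways the next edge can be revealed recovers the current estimate.
hist = (1, 0, 1)   # first three edges: present, absent, present
avg = (cond_exp(hist + (0,)) + cond_exp(hist + (1,))) / 2
print(avg == cond_exp(hist))  # True
```

The same check passes for any history, which is exactly the statement that conditional expectations of a fixed statistic, given a growing set of information, form a martingale.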
887
01:08:51,979 --> 01:08:56,149
An important property
of a martingale,
888
01:08:56,149 --> 01:08:59,990
which is known as the
martingale convergence theorem--
889
01:08:59,990 --> 01:09:06,740
and so that's what we'll need
for the proof of the existence
890
01:09:06,740 --> 01:09:07,790
of the limit next time--
891
01:09:15,689 --> 01:09:20,359
says that every
bounded martingale--
892
01:09:23,649 --> 01:09:27,229
so for example, suppose
your martingale only
893
01:09:27,229 --> 01:09:29,590
takes values between 0 and 1.
894
01:09:29,590 --> 01:09:33,500
So every bounded martingale
converges almost surely.
895
01:09:42,870 --> 01:09:46,715
You cannot have a bounded martingale
which keeps
896
01:09:46,715 --> 01:09:47,340
going up and down forever.
897
01:09:53,040 --> 01:09:56,170
So I want to show you
a proof of this fact.
898
01:09:56,170 --> 01:09:59,090
Let me just mention that
the bounded condition is
899
01:09:59,090 --> 01:10:01,490
a little bit stronger than
what we actually need.
900
01:10:01,490 --> 01:10:03,470
From the proof, you'll
see that you really only
901
01:10:03,470 --> 01:10:08,010
need it to be L1-bounded.
902
01:10:08,010 --> 01:10:10,360
It's enough.
903
01:10:10,360 --> 01:10:12,190
And more generally,
there is a condition
904
01:10:12,190 --> 01:10:19,380
called uniform integrability,
which I won't explain.
905
01:10:22,368 --> 01:10:23,364
All right.
906
01:10:26,120 --> 01:10:26,620
OK.
907
01:10:26,620 --> 01:10:29,250
So let me show you a proof
of the martingale convergence
908
01:10:29,250 --> 01:10:29,750
theorem.
909
01:10:29,750 --> 01:10:33,520
And I'm going to be somewhat
informal and somewhat cavalier,
910
01:10:33,520 --> 01:10:35,650
because I don't want
to get into some
911
01:10:35,650 --> 01:10:38,550
of the fine details
of probability theory.
912
01:10:38,550 --> 01:10:43,840
But if you have taken something
like 18.675 probability theory,
913
01:10:43,840 --> 01:10:45,520
then you can fill in
all those details.
914
01:10:48,580 --> 01:10:50,290
So I like this
proof, because it's
915
01:10:50,290 --> 01:10:51,580
kind of a proof by gambling.
916
01:10:56,680 --> 01:11:00,070
So I want to tell you a story
which should convince you that
917
01:11:00,070 --> 01:11:04,380
a martingale cannot
keep going up and down.
918
01:11:04,380 --> 01:11:06,120
It must converge almost surely.
919
01:11:08,640 --> 01:11:15,970
So suppose x sub n
doesn't converge.
920
01:11:19,863 --> 01:11:21,280
OK, so this is why
I say I'm going
921
01:11:21,280 --> 01:11:23,040
to be somewhat cavalier
with probability theory.
922
01:11:23,040 --> 01:11:24,680
So when I say this
doesn't converge,
923
01:11:24,680 --> 01:11:28,060
I mean a specific instance of
the sequence doesn't converge
924
01:11:28,060 --> 01:11:30,050
or some specific realization.
925
01:11:30,050 --> 01:11:39,490
If it doesn't converge,
then there exists a and b,
926
01:11:39,490 --> 01:11:50,740
both rational numbers between
0 and 1, such that the sequence
927
01:11:50,740 --> 01:11:59,040
crosses the interval a,
b infinitely many times.
928
01:12:06,040 --> 01:12:11,060
So by crossing this interval,
what I mean is the following.
929
01:12:19,510 --> 01:12:20,010
OK.
930
01:12:20,010 --> 01:12:23,140
So there's an
important picture which
931
01:12:23,140 --> 01:12:25,900
will help a lot in
understanding this theorem.
932
01:12:31,550 --> 01:12:41,300
So imagine I have this
time n, and I have a and b.
933
01:12:41,300 --> 01:12:43,130
So I have this martingale.
934
01:12:43,130 --> 01:12:55,850
Its realization curve
will look like that.
935
01:12:55,850 --> 01:12:58,390
So that's an instance
of this martingale.
936
01:12:58,390 --> 01:13:03,950
And by crossing, I
mean a sequence that--
937
01:13:03,950 --> 01:13:07,390
OK, so here's what
I mean by crossing.
938
01:13:07,390 --> 01:13:15,192
I start below a and--
939
01:13:15,192 --> 01:13:16,400
let me use a different color.
940
01:13:19,170 --> 01:13:26,320
So I start below a, and I
go above b and then wait
941
01:13:26,320 --> 01:13:30,430
until I come back below a.
942
01:13:30,430 --> 01:13:32,740
And I go above b.
943
01:13:32,740 --> 01:13:36,040
Wait until I come back.
944
01:13:36,040 --> 01:13:37,500
So do like that.
945
01:13:45,592 --> 01:13:46,558
Like that.
946
01:13:52,860 --> 01:13:57,900
So I start below a until
the first time I go above b.
947
01:13:57,900 --> 01:13:59,700
And then I stop that sequence.
948
01:13:59,700 --> 01:14:05,705
So those are the upcrossings
of this martingale.
949
01:14:12,980 --> 01:14:15,960
So upcrossing is when
you start below a,
950
01:14:15,960 --> 01:14:18,720
and then you end up above b.
951
01:14:18,720 --> 01:14:26,040
So if you don't converge,
then there exists such a
952
01:14:26,040 --> 01:14:30,360
and b such that there are
infinitely many such crossings.
953
01:14:30,360 --> 01:14:32,950
So this is just a fact.
954
01:14:32,950 --> 01:14:36,910
It's not hard to see.
955
01:14:36,910 --> 01:14:40,000
And what we'll show is
that this doesn't happen
956
01:14:40,000 --> 01:14:42,280
except with probability 0.
957
01:14:42,280 --> 01:14:53,330
So we'll show that this
occurs with probability 0.
958
01:14:55,950 --> 01:15:02,930
And because there are
only countably many
959
01:15:02,930 --> 01:15:11,690
rational numbers, we find
that x sub n converges
960
01:15:11,690 --> 01:15:13,000
with probability 1.
961
01:15:22,440 --> 01:15:23,630
So these are upcrossings.
962
01:15:23,630 --> 01:15:25,920
So I didn't define
it, but hopefully you
963
01:15:25,920 --> 01:15:29,160
understood from my picture
and my description.
964
01:15:29,160 --> 01:15:36,270
And let me define
by u sub n to be
965
01:15:36,270 --> 01:15:44,620
the number of
upcrossings up to time
966
01:15:44,620 --> 01:15:53,207
n, so the number of
such upcrossings.
967
01:15:55,950 --> 01:15:58,205
Now let me consider
a betting strategy.
968
01:16:05,790 --> 01:16:07,770
Basically, I want to make money.
969
01:16:07,770 --> 01:16:15,290
And I want to make money by
following these upcrossings.
970
01:16:15,290 --> 01:16:15,790
OK.
971
01:16:15,790 --> 01:16:20,050
So every time you
give me a number and--
972
01:16:20,050 --> 01:16:21,710
so think of this as
the stock market.
973
01:16:21,710 --> 01:16:26,647
So it's a fair stock market
where you tell me the price,
974
01:16:26,647 --> 01:16:28,230
and I get to decide,
do I want to buy?
975
01:16:28,230 --> 01:16:31,070
Or do I want to sell?
976
01:16:31,070 --> 01:16:45,720
So consider the betting
strategy where at any time,
977
01:16:45,720 --> 01:16:54,530
we're going to hold either 0
or 1 share of the stock, which
978
01:16:54,530 --> 01:16:57,590
has these moving prices.
979
01:16:57,590 --> 01:17:07,980
And what we're going to do
is if xn is less than a,
980
01:17:07,980 --> 01:17:12,060
is less than the lower
bound, then we're
981
01:17:12,060 --> 01:17:27,890
going to buy and hold
one share until the first time
982
01:17:27,890 --> 01:17:42,450
that the price reaches
above b and then
983
01:17:42,450 --> 01:17:48,052
sell as soon as
we see the price go above b.
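As a sketch, the buy-below-a, sell-above-b strategy from the lecture is a few lines of Python; the price sequence and the thresholds below are made up for illustration.

```python
def upcrossing_gain(prices, a, b):
    """Hold 0 or 1 share: buy when the price drops below a while
    holding nothing, sell at the first price above b.
    Returns (total gain, number of completed upcrossings)."""
    holding = False
    buy_price = 0.0
    gain = 0.0
    ups = 0
    for p in prices:
        if not holding and p < a:
            holding, buy_price = True, p     # buy one share
        elif holding and p > b:
            holding = False                  # sell: one upcrossing done
            gain += p - buy_price
            ups += 1
    if holding:                              # still holding at the end
        gain += prices[-1] - buy_price
    return gain, ups

gain, ups = upcrossing_gain([0.1, 0.5, 0.9, 0.2, 0.8, 0.05], a=0.3, b=0.7)
print(ups)                             # 2 completed upcrossings
print(gain >= (0.7 - 0.3) * ups - 1)   # True: gain >= (b - a) * ups - 1
```

Each completed upcrossing earns at least b minus a, and (for a martingale with values in [0, 1]) the one possibly incomplete crossing at the end loses at most 1, which is the bound used next in the proof.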
984
01:17:50,950 --> 01:17:52,900
So this is the betting strategy.
985
01:17:52,900 --> 01:17:54,960
And it's something
which you can implement.
986
01:17:54,960 --> 01:17:57,030
If you see a sequence
of prices, you
987
01:17:57,030 --> 01:17:59,130
can implement this strategy.
988
01:17:59,130 --> 01:18:03,000
And you already hopefully see,
if you have many upcrossings,
989
01:18:03,000 --> 01:18:05,310
then each upcrossing,
you make money.
990
01:18:05,310 --> 01:18:07,620
Each upcrossing, you make money.
991
01:18:07,620 --> 01:18:09,880
And this is almost
too good to be true.
992
01:18:09,880 --> 01:18:15,160
And in fact, we see that the
total gain from this strategy--
993
01:18:15,160 --> 01:18:17,300
so if you start with
some balance, what
994
01:18:17,300 --> 01:18:18,460
you get at the end--
995
01:18:18,460 --> 01:18:22,750
is at least this
difference from a
996
01:18:22,750 --> 01:18:27,452
to b times the number
of upcrossings.
997
01:18:31,270 --> 01:18:33,610
You might start somewhere.
998
01:18:33,610 --> 01:18:35,790
You buy, and then you
just lose everything.
999
01:18:35,790 --> 01:18:38,840
So there might be
an initial cost.
1000
01:18:38,840 --> 01:18:42,400
And that cost is
bounded, because we start
1001
01:18:42,400 --> 01:18:44,680
with a bounded martingale.
1002
01:18:44,680 --> 01:18:52,780
So suppose the martingale
is always between 0 and 1.
1003
01:18:52,780 --> 01:18:54,915
We start with a
bounded martingale.
1004
01:18:57,530 --> 01:19:01,730
But on the other hand,
there is a theorem
1005
01:19:01,730 --> 01:19:04,670
about martingales, which
is not hard to deduce
1006
01:19:04,670 --> 01:19:07,700
from the definition, that
no matter what the betting
1007
01:19:07,700 --> 01:19:11,150
strategy is, the gain
at any particular time
1008
01:19:11,150 --> 01:19:13,580
must be 0 in expectation.
1009
01:19:16,940 --> 01:19:19,240
So this is just the
property of the martingale.
1010
01:19:19,240 --> 01:19:24,190
So 0 equals the
expected gain, which
1011
01:19:24,190 --> 01:19:27,520
is at least b minus a
times the expected number
1012
01:19:27,520 --> 01:19:30,630
of upcrossings minus 1.
1013
01:19:30,630 --> 01:19:35,430
And thus the expected number
of upcrossings up to time n
1014
01:19:35,430 --> 01:19:41,600
is at most 1 over b minus a.
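Written out (assuming, as above, a martingale taking values in [0, 1], so the one incomplete crossing costs at most 1):

```latex
0 \;=\; \mathbb{E}[\text{gain}]
  \;\ge\; (b - a)\,\mathbb{E}[U_n] \;-\; 1
\qquad\Longrightarrow\qquad
\mathbb{E}[U_n] \;\le\; \frac{1}{b - a}.
```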
1015
01:19:41,600 --> 01:19:47,140
Now, we let n go to infinity.
1016
01:19:47,140 --> 01:19:57,780
And let u sub infinity be the
total number of upcrossings.
1017
01:20:02,030 --> 01:20:17,430
By the monotone convergence
theorem, in this limit,
1018
01:20:17,430 --> 01:20:20,310
the expectation of these u sub
n's can never go down.
1019
01:20:20,310 --> 01:20:23,740
It's always weakly increasing.
1020
01:20:23,740 --> 01:20:28,020
It converges to the
expectation of the total number
1021
01:20:28,020 --> 01:20:29,232
of upcrossings.
1022
01:20:29,232 --> 01:20:31,440
So now, in particular, you
know that the total number
1023
01:20:31,440 --> 01:20:38,120
of upcrossings has
finite expectation.
1024
01:20:38,120 --> 01:20:40,300
So in particular,
the probability
1025
01:20:40,300 --> 01:20:45,630
that you have infinitely
many crossings is 0.
1026
01:20:45,630 --> 01:20:50,330
So with probability 0, you
cross infinitely many times,
1027
01:20:50,330 --> 01:20:52,880
which proves the
claim over there
1028
01:20:52,880 --> 01:20:54,870
and which concludes
the proof of the claim
1029
01:20:54,870 --> 01:20:58,535
that the martingale
converges almost surely.
1030
01:20:58,535 --> 01:21:00,660
OK, so that proves the
martingale convergence theorem.
1031
01:21:00,660 --> 01:21:02,430
So next time, we'll
combine everything
1032
01:21:02,430 --> 01:21:05,640
that we did today to prove the
three main theorems that we
1033
01:21:05,640 --> 01:21:09,230
stated last time
on graph limits.