1
00:00:07 --> 00:00:10
Good morning.
Today we're going to talk about
2
00:00:10 --> 00:00:14
a balanced search structure,
so a data structure that
3
00:00:14 --> 00:00:18
maintains a dynamic set subject
to insertion,
4
00:00:18 --> 00:00:21
deletion, and search called
skip lists.
5
00:00:21 --> 00:00:25
So, I'll call this a dynamic
search structure because it's a
6
00:00:25 --> 00:00:28
data structure.
It supports search,
7
00:00:28 --> 00:00:33
and it's dynamic,
meaning insert and delete.
8
00:00:33 --> 00:00:39
So, what other dynamic search
structures do we know,
9
00:00:39 --> 00:00:45
just for sake of comparison,
and to wake everyone up?
10
00:00:45 --> 00:00:50
Shout them out.
Efficient, I should say,
11
00:00:50 --> 00:00:55
also good, logarithmic time per
operation.
12
00:00:55 --> 00:01:01
So, this is a really easy
question to get us off the
13
00:01:01 --> 00:01:05
ground.
You've seen them all in the
14
00:01:05 --> 00:01:08
last week, so it shouldn't be so
hard.
15
00:01:08 --> 00:01:11
Treaps, good.
On the problem set we saw
16
00:01:11 --> 00:01:13
treaps.
That's, in some sense,
17
00:01:13 --> 00:01:17
the simplest dynamic search
structure you can get from first
18
00:01:17 --> 00:01:21
principles because all we needed
was a bound on a randomly
19
00:01:21 --> 00:01:26
constructed binary search tree.
And then treaps did well.
20
00:01:26 --> 00:01:30
So, that was sort of the first
one you saw depending on when
21
00:01:30 --> 00:01:34
you did your problem set.
What else?
22
00:01:34 --> 00:01:36
Charles?
Red black trees,
23
00:01:36 --> 00:01:40
good answer.
So, that was exactly one week
24
00:01:40 --> 00:01:44
ago.
I hope you still remember it.
25
00:01:44 --> 00:01:48
They have guaranteed log n
performance.
26
00:01:48 --> 00:01:55
So, this was an expected bound.
This was a worst-case order log
27
00:01:55 --> 00:01:58
n per operation,
insert, delete,
28
00:01:58 --> 00:02:02
and search.
And, there was one more for
29
00:02:02 --> 00:02:07
those who went to recitation on
Friday: B trees,
30
00:02:07 --> 00:02:10
good.
And, by B trees,
31
00:02:10 --> 00:02:14
I also include two-three trees,
two-three-four trees,
32
00:02:14 --> 00:02:16
and all those guys.
So, if B is a constant,
33
00:02:16 --> 00:02:19
or if you implement your B tree
nodes a little bit cleverly,
34
00:02:19 --> 00:02:22
then these have guaranteed
order log n performance,
35
00:02:22 --> 00:02:24
so, worst case,
order log n.
36
00:02:24 --> 00:02:27
So, you should know this.
These are all balanced search
37
00:02:27 --> 00:02:29
structures.
They are dynamic.
38
00:02:29 --> 00:02:31
They support insertions and
deletions.
39
00:02:31 --> 00:02:34
They support searches,
finding a given key.
40
00:02:34 --> 00:02:37
And if you don't find the key,
you find its predecessor and
41
00:02:37 --> 00:02:42
successor pretty easily in all
of these structures.
42
00:02:42 --> 00:02:44
If you want to augment some
data structure,
43
00:02:44 --> 00:02:48
you should think about which
one of these is easiest to
44
00:02:48 --> 00:02:53
augment, as in Monday's lecture.
So, the question I want to pose
45
00:02:53 --> 00:02:56
to you is: suppose I gave you
all a laptop right now,
46
00:02:56 --> 00:02:59
which would be great.
Then I asked you,
47
00:02:59 --> 00:03:03
in order to keep this laptop
you have to implement one of
48
00:03:03 --> 00:03:06
these data structures,
let's say, within this class
49
00:03:06 --> 00:03:09
hour.
Do you think you could do it?
50
00:03:09 --> 00:03:12
How many people think you could
do it?
51
00:03:12 --> 00:03:13
A couple people,
a few people,
52
00:03:13 --> 00:03:15
OK, all front row people,
good.
53
00:03:15 --> 00:03:19
I could probably do it.
My preference would be B trees.
54
00:03:19 --> 00:03:21
They're sort of the simplest in
my mind.
55
00:03:21 --> 00:03:23
This is without using the
textbook.
56
00:03:23 --> 00:03:25
This would be a closed book
exam.
57
00:03:25 --> 00:03:30
I don't have enough laptops to
do it, unfortunately.
58
00:03:30 --> 00:03:32
So, B trees are pretty
reasonable.
59
00:03:32 --> 00:03:35
Deletion, you have to remember
stealing from a sibling and
60
00:03:35 --> 00:03:37
whatnot.
So, deletions are a bit tricky.
61
00:03:37 --> 00:03:40
Red black trees,
I can never remember it.
62
00:03:40 --> 00:03:43
I'd have to look it up,
or re-derive the three cases.
63
00:03:43 --> 00:03:46
Treaps are a bit fancy.
So, that would take a little
64
00:03:46 --> 00:03:49
while to remember exactly how
those work.
65
00:03:49 --> 00:03:51
You'd have to solve your
problem set again,
66
00:03:51 --> 00:03:55
if you don't have it memorized.
Skip lists, on the other hand,
67
00:03:55 --> 00:03:57
are a data structure you will
never forget,
68
00:03:57 --> 00:04:00
and something you can implement
within an hour,
69
00:04:00 --> 00:04:03
no problem.
I've made this claim a couple
70
00:04:03 --> 00:04:05
times before,
and I always felt bad because I
71
00:04:05 --> 00:04:10
had never actually done it.
So, this morning,
72
00:04:10 --> 00:04:13
I implemented skip lists,
and it took me ten minutes to
73
00:04:13 --> 00:04:17
implement a linked list,
and 30 minutes to implement
74
00:04:17 --> 00:04:19
skip lists.
And another 30 minutes
75
00:04:19 --> 00:04:21
debugging them.
There you go.
76
00:04:21 --> 00:04:24
It can be done.
Skip lists are really simple.
77
00:04:24 --> 00:04:27
And, at no point writing the
code did I have to think,
78
00:04:27 --> 00:04:32
whereas every other structure I
would have to think.
79
00:04:32 --> 00:04:36
There was one moment when I
thought, ah, how do I flip a
80
00:04:36 --> 00:04:38
coin?
That was the entire amount of
81
00:04:38 --> 00:04:41
thinking.
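(A sketch, not from the lecture: the one moment of thinking — flipping a coin — is a one-liner in Python. The function name `flip_coin` is illustrative, not from any shown code.)

```python
import random

def flip_coin():
    # Heads with probability 1/2: in a skip list this decides
    # whether an element gets promoted to the next level up.
    return random.random() < 0.5
```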
So, skip lists are a randomized
82
00:04:41 --> 00:04:44
structure.
Let's add in another adjective
83
00:04:44 --> 00:04:46
here, and let's also add in
simple.
84
00:04:46 --> 00:04:49
So, we have a simple,
efficient, dynamic,
85
00:04:49 --> 00:04:53
randomized search structure:
all those things together.
86
00:04:53 --> 00:04:57
So, it's sort of like treaps
in that the bound is only a
87
00:04:57 --> 00:05:01
randomized bound.
But today, we're going to see a
88
00:05:01 --> 00:05:06
much stronger bound than an
expectation bound.
89
00:05:06 --> 00:05:11
So, in particular,
skip lists will run in order
90
00:05:11 --> 00:05:17
log n expected time.
So, the running time for each
91
00:05:17 --> 00:05:22
operation will be order log n in
expectation.
92
00:05:22 --> 00:05:28
But, we're going to prove a
much stronger result that their
93
00:05:28 --> 00:05:34
order log n, with high
probability.
94
00:05:34 --> 00:05:37
So, this is a very strong
claim.
95
00:05:37 --> 00:05:42
And it means that the running
time of each operation,
96
00:05:42 --> 00:05:48
the running time of every
operation is order log n almost
97
00:05:48 --> 00:05:54
always in a certain sense.
Why don't I foreshadow that?
98
00:05:54 --> 00:05:59
So, it's something like,
the probability that it's order
99
00:05:59 --> 00:06:05
log n is at least one minus one
over some polynomial
100
00:06:05 --> 00:06:08
in n.
And, you get to set the
101
00:06:08 --> 00:06:10
polynomial however large you
like.
102
00:06:10 --> 00:06:13
So, what this basically means
is that almost all the time,
103
00:06:13 --> 00:06:16
you take your skip lists,
you do a polynomial number of
104
00:06:16 --> 00:06:18
operations on it,
because presumably you are
105
00:06:18 --> 00:06:21
running a polynomial time
algorithm that is using this data
106
00:06:21 --> 00:06:23
structure.
Do polynomial numbers of
107
00:06:23 --> 00:06:26
inserts, deletes, searches,
every single one of them will
108
00:06:26 --> 00:06:30
take order log n time,
almost guaranteed.
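(Written out in standard notation — this is my phrasing, not the lecturer's board — the foreshadowed with-high-probability claim has this shape, where the hidden constant in the O(log n) depends on the exponent you pick:)

```latex
\Pr\bigl[\text{each operation takes } O(\lg n) \text{ time}\bigr]
  \;\ge\; 1 - \frac{1}{n^{\alpha}},
  \qquad \text{for any constant } \alpha \text{ of your choosing.}
```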
109
00:06:30 --> 00:06:33
So this is a really strong
bound on the tail of the
110
00:06:33 --> 00:06:36
distribution.
The mean is order log n.
111
00:06:36 --> 00:06:39
That's not so exciting.
But, in fact,
112
00:06:39 --> 00:06:43
almost all of the weight of
this probability distribution is
113
00:06:43 --> 00:06:47
right around the log n,
just tiny little epsilons,
114
00:06:47 --> 00:06:51
very tiny probabilities you
could be bigger than log n.
115
00:06:51 --> 00:06:55
So that's where we are going.
This is a data structure by
116
00:06:55 --> 00:07:00
Pugh in 1989.
This is the most recent.
117
00:07:00 --> 00:07:03
Actually, no,
sorry, treaps are more recent.
118
00:07:03 --> 00:07:06
They were like '93 or so,
but a fairly recent data
119
00:07:06 --> 00:07:09
structure for just insert,
delete, search.
120
00:07:09 --> 00:07:13
And, it's very simple.
You can derive it if you don't
121
00:07:13 --> 00:07:16
know anything about data
structures, well,
122
00:07:16 --> 00:07:19
almost nothing.
Now, analyzing that the
123
00:07:19 --> 00:07:21
performance is log n,
that, of course,
124
00:07:21 --> 00:07:25
takes our sophistication.
But the data structure itself
125
00:07:25 --> 00:07:30
is very simple.
We're going to start from
126
00:07:30 --> 00:07:34
scratch.
Suppose you don't know what a
127
00:07:34 --> 00:07:38
red black tree is.
You don't know what a B tree
128
00:07:38 --> 00:07:41
is.
Suppose you don't even know
129
00:07:41 --> 00:07:45
what a tree is.
What is the simplest data
130
00:07:45 --> 00:07:51
structure for storing a bunch of
items for storing a dynamic set?
131
00:07:51 --> 00:07:54
A list, good,
a linked list.
132
00:07:54 --> 00:07:58
Now, suppose that it's a sorted
linked list.
133
00:07:58 --> 00:08:05
So, I'm going to be a little
bit fancier there.
134
00:08:05 --> 00:08:10
So, if you have a linked list
of items, here it is,
135
00:08:10 --> 00:08:16
maybe we'll make it doubly
linked just for kicks,
136
00:08:16 --> 00:08:22
how long does it take to search
in a sorted linked list?
137
00:08:22 --> 00:08:26
Log n is one answer.
n is the other answer.
138
00:08:26 --> 00:08:31
Which one is right?
n is the right answer.
139
00:08:31 --> 00:08:35
So, even though it's sorted,
we can't do binary search
140
00:08:35 --> 00:08:38
because we don't have
random-access into a linked
141
00:08:38 --> 00:08:40
list.
So, suppose I'm only given a
142
00:08:40 --> 00:08:44
pointer to the head.
Otherwise, I'm assuming it's an
143
00:08:44 --> 00:08:46
array.
So, in a sorted array you can
144
00:08:46 --> 00:08:48
search in log n.
Sorted linked list:
145
00:08:48 --> 00:08:51
you've still got to scan
through the darn thing.
146
00:08:51 --> 00:08:53
So, theta n,
worst case search.
147
00:08:53 --> 00:08:56
Not so good,
but if we just try to improve
148
00:08:56 --> 00:08:59
it a little bit,
we will discover skip lists
149
00:08:59 --> 00:09:03
automatically.
So, this is our starting point:
150
00:09:03 --> 00:09:06
sorted linked lists,
theta n time.
151
00:09:06 --> 00:09:09
And, I'm not going to think too
much about insertions and
152
00:09:09 --> 00:09:12
deletions for the moment.
Let's just get search better,
153
00:09:12 --> 00:09:15
and then we'll worry about
updates.
154
00:09:15 --> 00:09:17
Updates are where randomization
will come in.
155
00:09:17 --> 00:09:21
Search: pretty easy idea.
So, how can we make a linked
156
00:09:21 --> 00:09:23
list better?
Suppose all we know about are
157
00:09:23 --> 00:09:26
linked lists.
What can I do to make it
158
00:09:26 --> 00:09:28
faster?
This is where you need a little
159
00:09:28 --> 00:09:32
bit of innovation,
some creativity.
160
00:09:32 --> 00:09:37
More links: that's a good idea.
So, you could try to maybe add
161
00:09:37 --> 00:09:40
pointers to go a couple steps
ahead.
162
00:09:40 --> 00:09:45
If I had log n pointers,
I could do all powers of two
163
00:09:45 --> 00:09:48
ahead.
That's a pretty good search
164
00:09:48 --> 00:09:51
structure.
Some people use that;
165
00:09:51 --> 00:09:56
like, some peer-to-peer
networks use that idea.
166
00:09:56 --> 00:10:01
But that's a little too fancy
for me.
167
00:10:01 --> 00:10:03
Ah, good.
You could try to build a tree
168
00:10:03 --> 00:10:07
on this linear structure.
That's essentially where we're
169
00:10:07 --> 00:10:09
going.
So, you could try to put
170
00:10:09 --> 00:10:12
pointers to, like,
the middle of the list from the
171
00:10:12 --> 00:10:14
roots.
So, you search between either
172
00:10:14 --> 00:10:16
here.
You point to the median,
173
00:10:16 --> 00:10:20
so you can compare against the
median, and know whether you
174
00:10:20 --> 00:10:23
should go in the first half or
the second half. That's
175
00:10:23 --> 00:10:27
definitely on the right track,
but also a bit too sophisticated.
176
00:10:27 --> 00:10:29
Another list:
yes.
177
00:10:29 --> 00:10:32
Yes, good.
So, we are going to use two
178
00:10:32 --> 00:10:34
lists.
That's sort of the next
179
00:10:34 --> 00:10:38
simplest thing you could do.
OK, and as you suggested,
180
00:10:38 --> 00:10:41
we could maybe have pointers
between them.
181
00:10:41 --> 00:10:46
So, maybe we have some elements
down here, some of the elements
182
00:10:46 --> 00:10:48
up here.
We want to have pointers
183
00:10:48 --> 00:10:51
between the lists.
OK, it gets a little bit crazy
184
00:10:51 --> 00:10:54
in how exactly you might do
that.
185
00:10:54 --> 00:10:56
But somehow,
this feels good.
186
00:10:56 --> 00:10:58
So this is one linked list:
L_1.
187
00:10:58 --> 00:11:02
This is another linked list:
L_2.
188
00:11:02 --> 00:11:12
And, to give you some
inspiration, I want to give you,
189
00:11:12 --> 00:11:19
so let's play a game.
The game is,
190
00:11:19 --> 00:11:29
what is this sequence?
So, the sequence is 14.
191
00:11:29 --> 00:11:38
If you know the answer,
shout it out.
192
00:11:38 --> 00:11:42
Anyone yet? OK, it's tricky.
193
00:11:42 --> 00:11:54
194
00:11:54 --> 00:11:58
It's a bit of a small class,
so I hope someone knows the
195
00:11:58 --> 00:11:59
answer.
196
00:11:59 --> 00:12:10
197
00:12:10 --> 00:12:14
How many TA's know the answer?
Just a couple,
198
00:12:14 --> 00:12:19
OK, if you're looking at the
slides, probably you know the
199
00:12:19 --> 00:12:21
answer.
That's cheating.
200
00:12:21 --> 00:12:26
OK, I'll give you a hint.
It is not a mathematical
201
00:12:26 --> 00:12:29
sequence.
This is a real-life sequence.
202
00:12:29 --> 00:12:32
Yeah?
Yeah, and what city?
203
00:12:32 --> 00:12:36
New York, yeah,
this is the 7th Ave line.
204
00:12:36 --> 00:12:40
This is my favorite subway line
in New York.
205
00:12:40 --> 00:12:46
But, what's a cool feature of
the New York City subway?
206
00:12:46 --> 00:12:49
OK, it's a skip list.
Good answer.
207
00:12:49 --> 00:12:54
[LAUGHTER] Indeed it is.
Skip lists are so practical.
208
00:12:54 --> 00:13:00
They've been implemented in the
subway system.
209
00:13:00 --> 00:13:03
How cool is that?
OK, Boston subway is pretty
210
00:13:03 --> 00:13:08
cool because it's the oldest
subway definitely in the United
211
00:13:08 --> 00:13:11
States, maybe in the world.
New York is close,
212
00:13:11 --> 00:13:16
and it has other nice features
like it's open 24 hours.
213
00:13:16 --> 00:13:20
That's a definite plus,
but it also has this feature of
214
00:13:20 --> 00:13:23
express lines.
So, it's a bit of an
215
00:13:23 --> 00:13:26
abstraction,
but the 7th Ave line has
216
00:13:26 --> 00:13:29
essentially two kinds of cars.
These are street numbers by the
217
00:13:29 --> 00:13:31
way.
This is, Penn Station,
218
00:13:31 --> 00:13:33
Times Square,
and so on.
219
00:13:33 --> 00:13:36
So, there are essentially two
lines.
220
00:13:36 --> 00:13:39
There's the express line which
goes 14, to 34,
221
00:13:39 --> 00:13:41
to 42, to 72,
to 96.
222
00:13:41 --> 00:13:45
And then, there's the local
line which stops at every stop.
223
00:13:45 --> 00:13:49
And, they accomplish this with
four sets of tracks.
224
00:13:49 --> 00:13:54
So, I mean, the express lines
have their own dedicated track.
225
00:13:54 --> 00:13:57
If you want to go to stop 59
from, let's say,
226
00:13:57 --> 00:14:00
Penn Station,
well, let's say from lower west
227
00:14:00 --> 00:14:05
side, you get on the express
line.
228
00:14:05 --> 00:14:10
You jump to 42 pretty quickly,
and then you switch over to the
229
00:14:10 --> 00:14:16
local line, and go on to 59 or
wherever I said I was going.
230
00:14:16 --> 00:14:21
OK, so this is express and
local lines, and we can
231
00:14:21 --> 00:14:25
represent that with a couple of
lists.
232
00:14:25 --> 00:14:29
We have one list,
sure, we have one list on the
233
00:14:29 --> 00:14:34
bottom, so leave some space up
here.
234
00:14:34 --> 00:14:48
This is the local line,
L_2, 34, 42,
235
00:14:48 --> 00:15:02
50, 59, 66, 72,
79, and so on.
236
00:15:02 --> 00:15:08
And then we had the express
line on top, which only stops at
237
00:15:08 --> 00:15:11
14, 34, 42, 72,
and so on.
238
00:15:11 --> 00:15:16
I'm not going to redraw the
whole list.
239
00:15:16 --> 00:15:21
You get the idea.
And so, what we're going to do
240
00:15:21 --> 00:15:27
is put links between in the
local and express lines,
241
00:15:27 --> 00:15:34
wherever they happen to meet.
And, that's our two linked list
242
00:15:34 --> 00:15:38
structure.
So, that's what I actually
243
00:15:38 --> 00:15:42
meant when I was trying to draw
some picture.
244
00:15:42 --> 00:15:47
Now, this has a property that
in one list, the bottom list,
245
00:15:47 --> 00:15:52
every element occurs.
And the top list just copies
246
00:15:52 --> 00:15:56
some of those elements.
And we're going to preserve
247
00:15:56 --> 00:16:00
that property.
So, L_2 stores all the
248
00:16:00 --> 00:16:05
elements, and L_1 stores some
subset.
249
00:16:05 --> 00:16:10
And, it's still open which ones
we should store.
250
00:16:10 --> 00:16:16
That's the one thing we need to
think about.
251
00:16:16 --> 00:16:23
But, our inspiration is from
the New York subway system.
252
00:16:23 --> 00:16:30
OK, there, that's the idea.
Of course, we're also going to
253
00:16:30 --> 00:16:36
use more than two lists.
OK, we also have links.
254
00:16:36 --> 00:16:44
Let's say it links between
equal keys in L_1 and L_2.
255
00:16:44 --> 00:16:46
Good.
So, just for the sake of
256
00:16:46 --> 00:16:50
completeness,
and because we will need this
257
00:16:50 --> 00:16:55
later, let's talk about searches
before we worry about how these
258
00:16:55 --> 00:17:00
lists are actually constructed.
Of course, I wanted that
259
00:17:00 --> 00:17:04
board.
So, if you want to search for
260
00:17:04 --> 00:17:06
an element, x,
what do you do?
261
00:17:06 --> 00:17:09
Well, this is the taking the
subway algorithm.
262
00:17:09 --> 00:17:14
And, suppose you always start
in the upper left corner of the
263
00:17:14 --> 00:17:17
subway system,
if you're always in the lower
264
00:17:17 --> 00:17:21
west side, 14th St,
and I don't know exactly where
265
00:17:21 --> 00:17:25
that is, but more or less,
somewhere down at the bottom of
266
00:17:25 --> 00:17:27
Manhattan.
And, you want to go to a
267
00:17:27 --> 00:17:33
particular station like 59.
Well, you'd stay on the express
268
00:17:33 --> 00:17:37
line as long as you can because
it happens that we started on
269
00:17:37 --> 00:17:39
the express line.
And then, you go down.
270
00:17:39 --> 00:17:43
And then you take the local
line the rest of the way.
271
00:17:43 --> 00:17:47
That's clearly the right thing
to do if you always start in the
272
00:17:47 --> 00:17:50
top left corner.
So, I'm going to write that
273
00:17:50 --> 00:17:54
down in some kind of an
algorithm because we will be
274
00:17:54 --> 00:17:56
generalizing it.
It's pretty obvious at this
275
00:17:56 --> 00:18:00
point.
It will remain obvious.
276
00:18:00 --> 00:18:06
So, I want to walk right in the
top list until that would go too
277
00:18:06 --> 00:18:09
far.
So, you imagine giving someone
278
00:18:09 --> 00:18:14
directions on the subway system
they've never been on.
279
00:18:14 --> 00:18:17
So, you say,
OK, you start at 14th.
280
00:18:17 --> 00:18:22
Take the express line,
and when you get to 72nd,
281
00:18:22 --> 00:18:25
you've gone too far.
Go back one,
282
00:18:25 --> 00:18:30
and then go down to the local
line.
283
00:18:30 --> 00:18:32
It's really annoying
directions.
284
00:18:32 --> 00:18:37
But this is what an algorithm
has to do because it's never
285
00:18:37 --> 00:18:41
taken the subway before.
So, it's going to check,
286
00:18:41 --> 00:18:45
so let's do it here.
So, suppose I'm aiming for 59.
287
00:18:45 --> 00:18:49
So, I started 14,
say the first thing I do is go
288
00:18:49 --> 00:18:51
to 34.
Then from there,
289
00:18:51 --> 00:18:54
I go to 42.
Still good because 59 is bigger
290
00:18:54 --> 00:18:56
than 42.
I go right again.
291
00:18:56 --> 00:18:59
I say, oops,
72 is too big.
292
00:18:59 --> 00:19:04
That was too far.
So, I go back to where I just
293
00:19:04 --> 00:19:07
was.
Then I go down and then I keep
294
00:19:07 --> 00:19:12
going right until I find the
element that I want,
295
00:19:12 --> 00:19:17
or discover that it's not in
the bottom list because bottom
296
00:19:17 --> 00:19:21
list has everyone.
So, that's the algorithm.
297
00:19:21 --> 00:19:27
Stop when going right would go
too far, and you discover that
298
00:19:27 --> 00:19:31
with a comparison.
Then you walk down to L_2.
299
00:19:31 --> 00:19:35
And then you walk right in L_2
until you find x,
300
00:19:35 --> 00:19:40
or you find something greater
than x, in which case x is
301
00:19:40 --> 00:19:46
definitely not on your list.
And you found the predecessor
302
00:19:46 --> 00:19:49
and successor,
which may be your goal.
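(The express/local search just described turns into code almost directly. The sketch below is my own illustrative Python — the lecture shows no code; `Node`, `build_two_level`, and the sentinel heads with key minus infinity, standing in for "always start in the upper-left corner," are all my choices.)

```python
import math

class Node:
    """One station: `right` is the next stop on this line,
    `down` is the same stop on the line below."""
    def __init__(self, key):
        self.key = key
        self.right = None
        self.down = None

def build_two_level(local_keys, express_keys):
    # local_keys: all keys, sorted; express_keys: a sorted subset.
    # Sentinel heads with key -inf let every search start top-left.
    bottom_head = Node(-math.inf)
    prev, bottom = bottom_head, {}
    for k in local_keys:
        prev.right = bottom[k] = Node(k)
        prev = prev.right
    top_head = Node(-math.inf)
    top_head.down = bottom_head
    prev = top_head
    for k in express_keys:
        node = Node(k)
        node.down = bottom[k]          # link between equal keys in L_1 and L_2
        prev.right = node
        prev = node
    return top_head

def search(top_head, x):
    node = top_head
    while node.right and node.right.key <= x:
        node = node.right              # ride the express line while we can
    node = node.down                   # walk down to L_2
    while node.right and node.right.key <= x:
        node = node.right              # take the local line the rest of the way
    return node.key == x               # node is now x or its predecessor

# 7th Ave line: all local stops, plus a few express stops
line = build_two_level([14, 34, 42, 50, 59, 66, 72, 79, 96],
                       [14, 34, 42, 72, 96])
```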
303
00:19:49 --> 00:19:52
If you didn't find where x was,
you should find where it would
304
00:19:52 --> 00:19:55
go if it were there,
because then maybe you could
305
00:19:55 --> 00:19:58
insert there.
We're going to use this
306
00:19:58 --> 00:20:00
algorithm in insertion.
OK, but that search:
307
00:20:00 --> 00:20:05
pretty easy at this point.
Now, what we haven't discussed
308
00:20:05 --> 00:20:08
is how fast the search algorithm
is, and it depends,
309
00:20:08 --> 00:20:12
of course, which elements we're
going to store in L_1,
310
00:20:12 --> 00:20:14
which subset of elements should
go in L_1.
311
00:20:14 --> 00:20:18
Now, in the subway system,
you probably put all the
312
00:20:18 --> 00:20:21
popular stations in L_1.
But here, we want worst-case
313
00:20:21 --> 00:20:24
performance.
So, we don't have some
314
00:20:24 --> 00:20:26
probability distribution on the
nodes.
315
00:20:26 --> 00:20:30
We just like every node to be
accessed sort of as quickly as
316
00:20:30 --> 00:20:35
possible, uniformly.
So, we want to minimize the
317
00:20:35 --> 00:20:39
maximum time over all queries.
So, any ideas what we should do
318
00:20:39 --> 00:20:42
with L_1?
Should I put all the nodes of
319
00:20:42 --> 00:20:46
L_1 in the beginning?
OK, it's a strict subset.
320
00:20:46 --> 00:20:49
Suppose I told you what the
size of L_1 was.
321
00:20:49 --> 00:20:53
I can tell you,
I could afford to build this
322
00:20:53 --> 00:20:56
many express stops.
How should you distribute them
323
00:20:56 --> 00:21:02
among the elements of L_2?
Uniformly, good.
324
00:21:02 --> 00:21:08
So, what nodes,
sorry, what keys,
325
00:21:08 --> 00:21:17
let's say, go in L_1?
Well, definitely the best thing
326
00:21:17 --> 00:21:24
to do is to spread them out
uniformly, OK,
327
00:21:24 --> 00:21:35
which is definitely not what
the 7th Ave line looks like.
328
00:21:35 --> 00:21:39
But, let's imagine that we
could reengineer everything.
329
00:21:39 --> 00:21:45
So, we're going to try to space
these things out a little bit
330
00:21:45 --> 00:21:47
more.
So, 34 and 42nd are way too
331
00:21:47 --> 00:21:50
close.
We'll skip a few more stops.
332
00:21:50 --> 00:21:54
And, now we can start to
analyze things.
333
00:21:54 --> 00:21:57
OK, as a function of the length
of L_1.
334
00:21:57 --> 00:22:03
So, the cost of a search is now
roughly, so, I want a function
335
00:22:03 --> 00:22:07
of the length of L_1,
and the length of L_2,
336
00:22:07 --> 00:22:11
which is all the elements,
n.
337
00:22:11 --> 00:22:18
What is the cost of the search
if I spread out all the elements
338
00:22:18 --> 00:22:20
in L_1 uniformly?
Yeah?
339
00:22:20 --> 00:22:26
Right, the total number of
elements in the top lists,
340
00:22:26 --> 00:22:33
plus the division between the
bottom and the top.
341
00:22:33 --> 00:22:36
So, I'll write the length of
L_1 plus the length of L_2
342
00:22:36 --> 00:22:39
divided by the length of L_1.
OK, this is roughly,
343
00:22:39 --> 00:22:42
I mean, there's maybe a plus
one or so here because in the
344
00:22:42 --> 00:22:46
worst case, I have to search
through all of L_1 because the
345
00:22:46 --> 00:22:49
station I could be looking for
could be the max.
346
00:22:49 --> 00:22:52
OK, and maybe I'm not lucky,
and the max is not on the
347
00:22:52 --> 00:22:54
express line.
So then, I have to go down to
348
00:22:54 --> 00:22:57
the local line.
And how many stops will I have
349
00:22:57 --> 00:23:01
to go on the local line?
Well, L_1 just evenly
350
00:23:01 --> 00:23:04
partitions L_2.
So this is the number of
351
00:23:04 --> 00:23:08
consecutive stations between two
express stops.
352
00:23:08 --> 00:23:12
So, I take the express,
possibly this long,
353
00:23:12 --> 00:23:15
but I take the local possibly
this long.
354
00:23:15 --> 00:23:18
And, this is in L_2.
And there is,
355
00:23:18 --> 00:23:20
plus, a constant,
for example,
356
00:23:20 --> 00:23:24
for walking down.
But that's basically the number
357
00:23:24 --> 00:23:28
of nodes that I visit.
So, I'd like to minimize this
358
00:23:28 --> 00:23:36
function.
Now, L_2, I'm going to call
359
00:23:36 --> 00:23:47
that n because that's the total
number of elements.
360
00:23:47 --> 00:23:55
L_1, I can choose to be
whatever I want.
361
00:23:55 --> 00:24:03
So, let's go over here.
So, I want to minimize L_1 plus
362
00:24:03 --> 00:24:07
n over L_1.
And I get to choose L_1.
363
00:24:07 --> 00:24:11
Now, I could differentiate
this, set it to zero,
364
00:24:11 --> 00:24:15
and go crazy.
Or, I could realize that,
365
00:24:15 --> 00:24:19
I mean, that's not hard.
But, that's a little bit too
366
00:24:19 --> 00:24:22
fancy for me.
So, I could say,
367
00:24:22 --> 00:24:26
well, this is clearly best when
L_1 is small.
368
00:24:26 --> 00:24:32
And this is clearly best when
L_1 is large.
369
00:24:32 --> 00:24:37
So, there's a trade-off there.
And, the trade-off will be
370
00:24:37 --> 00:24:44
roughly minimized up to constant
factors when these two terms are
371
00:24:44 --> 00:24:48
equal.
That's when I have pretty good
372
00:24:48 --> 00:24:53
balance between the two ends of
the trade-off.
373
00:24:53 --> 00:24:56
So, this is up to constant
factors.
374
00:24:56 --> 00:25:03
I can let L_1 equal n over L_1,
OK, because at most I'm losing
375
00:25:03 --> 00:25:10
a factor of two there when they
happen to be equal.
376
00:25:10 --> 00:25:14
So now, I just solve this.
This is really easy.
377
00:25:14 --> 00:25:18
This is (L_1)^2 equals n.
So, L_1 is the square root of
378
00:25:18 --> 00:25:20
n.
OK, so the cost that I'm
379
00:25:20 --> 00:25:24
getting over here,
L_1 plus L_2 over L_1 is the
380
00:25:24 --> 00:25:28
square root of n plus n over
root n, which is,
381
00:25:28 --> 00:25:32
again, root n.
So, I get two root n.
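(A quick numeric check of that balancing argument — my own sketch, not part of the lecture, with n = 10,000 as an arbitrary example: minimizing |L_1| + n/|L_1| over all integer choices of |L_1| really does land at the square root of n, with cost 2 root n.)

```python
import math

def two_list_cost(l1, n):
    # search cost with an express list of length l1 over n local elements:
    # at most l1 stops on the top list, then at most n/l1 on the bottom
    return l1 + n / l1

n = 10_000
best = min(range(1, n + 1), key=lambda l1: two_list_cost(l1, n))
print(best, two_list_cost(best, n))  # best is sqrt(n) = 100; cost is 2*sqrt(n) = 200
```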
382
00:25:32 --> 00:25:36
So, search cost,
and I'm caring about the
383
00:25:36 --> 00:25:39
constant here,
because it will matter in a
384
00:25:39 --> 00:25:41
moment.
Two square root of n:
385
00:25:41 --> 00:25:45
I'm not caring about the
additive constant,
386
00:25:45 --> 00:25:48
but the multiplicative constant
I care about.
387
00:25:48 --> 00:25:52
OK, that seems good.
We started with a linked list
388
00:25:52 --> 00:25:56
that searched in n time,
theta n time per operation.
389
00:25:56 --> 00:26:03
Now we have two linked lists,
search in theta root n time.
390
00:26:03 --> 00:26:07
It seems pretty good.
This is what the structure
391
00:26:07 --> 00:26:10
looks like.
We have root n guys here.
392
00:26:10 --> 00:26:15
This is in the local line.
And, we have one express stop
393
00:26:15 --> 00:26:19
which represents that.
But we have another root n
394
00:26:19 --> 00:26:24
values in the local line.
And we have one express stop
395
00:26:24 --> 00:26:28
that represents that.
And these two are linked,
396
00:26:28 --> 00:26:31
and so on.
397
00:26:31 --> 00:26:42
398
00:26:42 --> 00:26:44
Well, I should put some dot,
dot, dots in there.
399
00:26:44 --> 00:26:47
OK, so each of these chunks has
length root n,
400
00:26:47 --> 00:26:49
and the number of
representatives up here is
401
00:26:49 --> 00:26:52
square root of n.
The number of express stops is
402
00:26:52 --> 00:26:54
square root of n.
So clearly, things are balanced
403
00:26:54 --> 00:26:55
now.
I search for,
404
00:26:55 --> 00:26:57
at most, square root of n up
here.
405
00:26:57 --> 00:27:00
Then I search in one of these
lists for, at most,
406
00:27:00 --> 00:27:04
square root of n.
So, every search takes,
407
00:27:04 --> 00:27:10
at most, two root n.
Cool, what should we do next?
408
00:27:10 --> 00:27:15
So, again, ignore insertions
and deletions.
409
00:27:15 --> 00:27:22
I want to make searches faster
because square root of n is not
410
00:27:22 --> 00:27:25
so hot as we know.
Sorry?
411
00:27:25 --> 00:27:30
More lines.
Let's add a super express line,
412
00:27:30 --> 00:27:35
or another linked list.
OK, this was two.
413
00:27:35 --> 00:27:41
Why not do three?
So, we started with a sorted
414
00:27:41 --> 00:27:45
linked list.
Then we went to two.
415
00:27:45 --> 00:27:48
This gave us two square root of
n.
416
00:27:48 --> 00:27:52
Now, I want three sorted linked
lists.
417
00:27:52 --> 00:27:57
I didn't pluralize here.
Any guesses what the running
418
00:27:57 --> 00:28:02
time might be?
This is just guesswork.
419
00:28:02 --> 00:28:05
Don't think.
From two square root of n,
420
00:28:05 --> 00:28:08
you would go to,
sorry?
421
00:28:08 --> 00:28:12
Two square root of two,
fourth root of n?
422
00:28:12 --> 00:28:17
That's on the right track.
Both the constant and the root
423
00:28:17 --> 00:28:20
change, but not quite so
fancily.
424
00:28:20 --> 00:28:24
Three times the cubed root:
good.
425
00:28:24 --> 00:28:29
Intuition is very helpful here.
It doesn't matter what the
426
00:28:29 --> 00:28:35
right answer is.
Use your intuition.
427
00:28:35 --> 00:28:37
You can prove that.
It's not so hard.
428
00:28:37 --> 00:28:40
You now have three lists,
and what you want to balance
429
00:28:40 --> 00:28:44
are at the length of the top
list, the ratio between the top
430
00:28:44 --> 00:28:47
two lists, and the ratio between
the bottom two lists.
431
00:28:47 --> 00:28:50
So, you want these three to
multiply out to n,
432
00:28:50 --> 00:28:53
because the top times the ratio
times the ratio:
433
00:28:53 --> 00:28:56
that has to equal n.
And, so that's where you get
434
00:28:56 --> 00:28:59
the cubed root of n.
Each of these should be equal.
435
00:28:59 --> 00:29:03
So, you set them because the
cost is the sum of those three
436
00:29:03 --> 00:29:07
things.
So, you set each of them to
437
00:29:07 --> 00:29:11
cubed root of n,
and there are three of them.
438
00:29:11 --> 00:29:15
OK, check it at home if you
want to be more sure.
439
00:29:15 --> 00:29:21
Obviously, we want a few more.
So, let's think about k sorted
440
00:29:21 --> 00:29:24
lists.
k sorted lists will be k times
441
00:29:24 --> 00:29:28
the k'th root of n.
You probably guessed that by
442
00:29:28 --> 00:29:33
now.
So, what should we set k to?
443
00:29:33 --> 00:29:38
I don't want the exact minimum.
What's a good value for k?
444
00:29:38 --> 00:29:41
Should I set it to n?
n's kind of nice,
445
00:29:41 --> 00:29:44
because the n'th root of n is
just one.
446
00:29:44 --> 00:29:48
But then that's n.
So, this is why I cared about
447
00:29:48 --> 00:29:53
the lead constant because it's
going to grow as I add more
448
00:29:53 --> 00:29:56
lists.
What's the biggest reasonable
449
00:29:56 --> 00:30:03
value of k that I could use?
Log n, because I have a k out
450
00:30:03 --> 00:30:07
there.
I certainly don't want to use
451
00:30:07 --> 00:30:13
more than log n.
So, log n times the log n'th
452
00:30:13 --> 00:30:18
root, and this is a little hard
to draw of n.
453
00:30:18 --> 00:30:23
Now, what is the log n'th root
of n?
454
00:30:23 --> 00:30:27
That's what you're all thinking
about.
455
00:30:27 --> 00:30:34
What is the log n'th root of n
minus two?
456
00:30:34 --> 00:30:39
It's one of these good
questions whose answer is?
457
00:30:39 --> 00:30:43
Oh man.
Remember the definition of
458
00:30:43 --> 00:30:47
root?
OK, the root is n to the one
459
00:30:47 --> 00:30:51
over log n.
OK, good, remember the
460
00:30:51 --> 00:30:55
definition of having a power,
A to the B?
461
00:30:55 --> 00:30:59
It was like two to the power,
B log A?
462
00:30:59 --> 00:31:06
Does that sound familiar?
So, this is two to the log n
463
00:31:06 --> 00:31:11
over log n, which is,
I hope you can get it at this
464
00:31:11 --> 00:31:17
point, two.
Wow, so the log n'th root of n
465
00:31:17 --> 00:31:20
minus two is zero:
my favorite answer.
466
00:31:20 --> 00:31:23
OK, this is two.
So this whole thing is two log
467
00:31:23 --> 00:31:26
n: pretty nifty.
So, you could be a little
468
00:31:26 --> 00:31:31
fancier and tweak this a little
bit, but two log n is plenty
469
00:31:31 --> 00:31:36
good for me.
We clearly don't want to use
470
00:31:36 --> 00:31:41
any more lists,
but log n lists sounds pretty
471
00:31:41 --> 00:31:45
good.
I get, now, logarithmic search
472
00:31:45 --> 00:31:47
time.
Let's check.
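The cost trade-off just described can be checked numerically. This is a sketch, not the lecture's code; it just evaluates the cost model k times the k'th root of n for a few values of k:

```python
import math

def search_cost(n, k):
    # Cost model from the lecture: with k sorted lists, a search pays
    # roughly k (one pass per list) times the k'th root of n (steps
    # within each list, when the ratios are balanced).
    return k * n ** (1.0 / k)

n = 1024  # a power of two, so log2(n) = 10 exactly
print(search_cost(n, 1))                  # 1024.0: one list is linear
print(search_cost(n, 2))                  # 64.0: 2 * sqrt(n)
print(search_cost(n, 3))                  # about 30.2: 3 * cubed root of n
print(search_cost(n, int(math.log2(n))))  # about 20: 2 * log2(n)
```

As the lecture argues, k = log n is the biggest reasonable choice, and it gives 2 log n.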
473
00:31:47 --> 00:31:52
I mean, we sort of did this all
intuitively.
474
00:31:52 --> 00:31:56
Let's draw what the list looks
like.
475
00:31:56 --> 00:32:01
But, it will work.
So, I'm going to redraw this
476
00:32:01 --> 00:32:07
example because you have to,
also.
477
00:32:07 --> 00:32:14
So, let's redesign that New
York City subway system.
478
00:32:14 --> 00:32:22
And, I want you to leave three
blank lines up here.
479
00:32:22 --> 00:32:29
So, you should have this
memorized by now.
480
00:32:29 --> 00:32:34
But I don't.
So, we are not allowed to
481
00:32:34 --> 00:32:38
change the local line,
though it would be nice,
482
00:32:38 --> 00:32:43
add a few more stops there.
OK, we can stop at 79th Street.
483
00:32:43 --> 00:32:47
That's enough.
So now, we have log n lists.
484
00:32:47 --> 00:32:53
And here, log n is about four.
So, I want to make a bunch of
485
00:32:53 --> 00:32:55
lists here.
In particular,
486
00:32:55 --> 00:33:02
14 will appear on all of them.
So, why don't I draw those in?
487
00:33:02 --> 00:33:05
And, the question is,
which elements go in here?
488
00:33:05 --> 00:33:08
So, I have log n lists.
And, my goal is to balance the
489
00:33:08 --> 00:33:12
number of items up here,
and the ratio between these two
490
00:33:12 --> 00:33:15
lists, and the ratio between
these two lists,
491
00:33:15 --> 00:33:18
and the ratio between these two
lists.
492
00:33:18 --> 00:33:20
I want all these things to be
balanced.
493
00:33:20 --> 00:33:24
There are log n of them.
So, the product of all those
494
00:33:24 --> 00:33:27
ratios better be n,
the number of elements down
495
00:33:27 --> 00:33:29
here.
So, the product of all these
496
00:33:29 --> 00:33:36
ratios is n.
And there's log n of them;
497
00:33:36 --> 00:33:44
how big is each ratio?
So, I'll call the ratio r.
498
00:33:44 --> 00:33:52
The ratio's r.
I should have r to the power of
499
00:33:52 --> 00:33:56
log n equals n.
What's r?
500
00:33:56 --> 00:34:02
What's r minus two?
Zero.
501
00:34:02 --> 00:34:05
OK, r is two, because two to the
power of log n is n.
502
00:34:05 --> 00:34:09
So, if the ratio between the
number of elements here and here
503
00:34:09 --> 00:34:12
is two all the way down,
then I will have n elements at
504
00:34:12 --> 00:34:15
the bottom, which is what I
want.
505
00:34:15 --> 00:34:18
So, in other words,
I want half the elements here,
506
00:34:18 --> 00:34:22
a quarter of the elements here,
an eighth of the elements here,
507
00:34:22 --> 00:34:25
and so on.
So, I'm going to take half of
508
00:34:25 --> 00:34:28
the elements evenly spaced out:
34th, 50th, 66th,
509
00:34:28 --> 00:34:32
79th, and so on.
So, this is our new
510
00:34:32 --> 00:34:35
semi-express line:
not terribly fast,
511
00:34:35 --> 00:34:39
but you save a factor of two
for going up there.
512
00:34:39 --> 00:34:42
And, when you're done,
you go down,
513
00:34:42 --> 00:34:44
and you walk,
at most, one step.
514
00:34:44 --> 00:34:47
And you find what you're
looking for.
515
00:34:47 --> 00:34:52
OK, and then we do the same
thing over and over and over
516
00:34:52 --> 00:34:56
until we run out of elements.
I can't read my own writing.
517
00:34:56 --> 00:34:59
It's 79th.
518
00:34:59 --> 00:35:11
519
00:35:11 --> 00:35:14
OK, if I had a bigger example,
there would be more levels,
520
00:35:14 --> 00:35:19
but this is just barely enough.
Let's say two elements is where
521
00:35:19 --> 00:35:21
I stop.
So, this looks good.
522
00:35:21 --> 00:35:24
Does this look like a structure
you've seen before,
523
00:35:24 --> 00:35:25
at all, vaguely?
Yes?
524
00:35:25 --> 00:35:28
A tree: yes.
It looks a lot like a binary
525
00:35:28 --> 00:35:31
tree.
I'll just leave it at that.
526
00:35:31 --> 00:35:34
In your problem set,
you'll understand why skip
527
00:35:34 --> 00:35:38
lists are really like trees.
But it's more or less a tree.
528
00:35:38 --> 00:35:41
Let's say at this level,
it looks sort of like binary
529
00:35:41 --> 00:35:42
search.
You look at 14;
530
00:35:42 --> 00:35:44
you look at 50,
and therefore,
531
00:35:44 --> 00:35:48
you decide whether you are in
the left half or the right
532
00:35:48 --> 00:35:50
half.
And that's sort of like a tree.
533
00:35:50 --> 00:35:54
It's not quite a tree because
we have this element repeated
534
00:35:54 --> 00:35:55
all over.
But more or less,
535
00:35:55 --> 00:35:59
this is a binary tree.
At depth I, we have two to the
536
00:35:59 --> 00:36:04
I nodes, just like a tree,
just like a balanced tree.
537
00:36:04 --> 00:36:08
I'm going to call this
structure an ideal skip list.
538
00:36:08 --> 00:36:13
And, if all we are doing is
searches, ideal skip lists are
539
00:36:13 --> 00:36:15
pretty good.
Maybe in practice:
540
00:36:15 --> 00:36:20
not quite as good as a binary
search tree, but up to constant
541
00:36:20 --> 00:36:24
factors: just as good.
So, for example,
542
00:36:24 --> 00:36:28
I mean, we can generalize
search, just check that it's log
543
00:36:28 --> 00:36:32
n.
So, the search procedure is you
544
00:36:32 --> 00:36:36
start at the top left.
So, let's say we are looking
545
00:36:36 --> 00:36:38
for 72.
You start at the top left.
546
00:36:38 --> 00:36:41
14 is smaller than 72,
so I try to go right.
547
00:36:41 --> 00:36:44
79 is too big.
So, I follow this arrow,
548
00:36:44 --> 00:36:47
but I say, oops,
that's too much.
549
00:36:47 --> 00:36:49
So, instead,
I go down 14 still.
550
00:36:49 --> 00:36:53
I go to the right:
oh, 50, that's still smaller
551
00:36:53 --> 00:36:55
than 72: OK.
I tried to go right again.
552
00:36:55 --> 00:36:58
Oh: 79, that's too big.
That's no good.
553
00:36:58 --> 00:37:00
So, I go down.
So, I get 50.
554
00:37:00 --> 00:37:05
I do the same thing over and
over.
555
00:37:05 --> 00:37:07
I try to go to the right:
oh, 66, that's OK.
556
00:37:07 --> 00:37:09
Try to go to the right:
oh, 79, that's too big.
557
00:37:09 --> 00:37:11
So I go down.
Now I go to the right and,
558
00:37:11 --> 00:37:14
oh, 72: done.
Otherwise, I'd go too far and
559
00:37:14 --> 00:37:16
try to go down and say,
oops, element must not be
560
00:37:16 --> 00:37:18
there.
It's a very simple search
561
00:37:18 --> 00:37:21
algorithm: same as here except
just remove the L_1 and L_2.
562
00:37:21 --> 00:37:23
Go right until that would go
too far.
563
00:37:23 --> 00:37:25
Then go down.
Then go right until we'd go too
564
00:37:25 --> 00:37:28
far, and then go down.
You might have to do this log n
565
00:37:28 --> 00:37:30
times.
In each level,
566
00:37:30 --> 00:37:34
you're clearly only walking a
couple of steps because the
567
00:37:34 --> 00:37:37
ratio between these two sizes is
only two.
568
00:37:37 --> 00:37:40
So, this will cost two log n
for search.
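The right-then-down search just walked through can be sketched on a list-of-lists picture of the ideal structure. This is an illustrative sketch, not the lecture's implementation; the list-of-lists representation and the filler keys 23, 42, and 59 are assumptions:

```python
NEG_INF = float("-inf")  # sentinel so every list has a top-left start

def skiplist_search(levels, target):
    # levels[0] is the top (sparsest) list, levels[-1] the bottom list
    # holding every key; each level is sorted and starts at -infinity.
    i = 0
    for depth, level in enumerate(levels):
        # go right while that would not overshoot the target
        while i + 1 < len(level) and level[i + 1] <= target:
            i += 1
        if level[i] == target:
            return True
        if depth + 1 < len(levels):
            # go down: locate the same key in the next level
            i = levels[depth + 1].index(level[i])
    return False

# Subway-stop keys from the running example; each level keeps
# about half of the level below it.
levels = [
    [NEG_INF, 14, 79],
    [NEG_INF, 14, 50, 79],
    [NEG_INF, 14, 34, 50, 66, 79],
    [NEG_INF, 14, 23, 34, 42, 50, 59, 66, 72, 79],
]
print(skiplist_search(levels, 72))  # True
print(skiplist_search(levels, 73))  # False
```

With a real linked representation, the "go down" step is a single pointer follow rather than a scan.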
569
00:37:40 --> 00:37:42
Good, I mean,
so that was to check because we
570
00:37:42 --> 00:37:46
were using intuition over here;
a little bit shaky.
571
00:37:46 --> 00:37:50
So, this is an ideal skip list,
we have to support insertions
572
00:37:50 --> 00:37:53
and deletions.
As soon as we do an insert and
573
00:37:53 --> 00:37:57
delete, there's no way we're
going to maintain the structure.
574
00:37:57 --> 00:38:03
It's a bit too special.
There is only one of these
575
00:38:03 --> 00:38:09
where everything is perfectly
spaced out, and everything is
576
00:38:09 --> 00:38:13
beautiful.
So, we can't do that.
577
00:38:13 --> 00:38:20
We're going to maintain roughly
this structure as best we can.
578
00:38:20 --> 00:38:27
And, if anyone of you knows
someone in New York City subway
579
00:38:27 --> 00:38:31
planning, you can tell them
this.
580
00:38:31 --> 00:38:37
OK, so: skip lists.
So, I mean, this is basically
581
00:38:37 --> 00:38:42
our data structure.
You could use this as a
582
00:38:42 --> 00:38:46
starting point,
but then you start using skip
583
00:38:46 --> 00:38:49
lists.
And, we need to somehow
584
00:38:49 --> 00:38:54
implement insertions and
deletions, and maintain roughly
585
00:38:54 --> 00:39:01
this structure well enough that
the search still costs order log
586
00:39:01 --> 00:39:05
n time.
So, let's focus on insertions.
587
00:39:05 --> 00:39:09
If we do insertions right,
it turns out deletions are
588
00:39:09 --> 00:39:11
really trivial.
589
00:39:11 --> 00:39:28
590
00:39:28 --> 00:39:31
And again, this is all from
first principles.
591
00:39:31 --> 00:39:34
We're not allowed to use
anything fancy.
592
00:39:34 --> 00:39:38
But, it would be nice if we
used some good chalk.
593
00:39:38 --> 00:39:42
This one looks better.
So, suppose you want to insert
594
00:39:42 --> 00:39:46
an element, x.
We said how to search for an
595
00:39:46 --> 00:39:48
element.
So, how do we insert it?
596
00:39:48 --> 00:39:53
Well, the first thing we should
do is figure out where it goes.
597
00:39:53 --> 00:39:57
So, we search for x.
We call search of x to find
598
00:39:57 --> 00:40:03
where x fits in the bottom list,
not just any list.
599
00:40:03 --> 00:40:06
Pretty easy to find out where
it fits in the top list.
600
00:40:06 --> 00:40:08
That takes, like,
constant time.
601
00:40:08 --> 00:40:11
What we want to know:
because the top list has
602
00:40:11 --> 00:40:14
constant length,
we want to know where x goes in
603
00:40:14 --> 00:40:17
the bottom list.
So, let's say we want to insert
604
00:40:17 --> 00:40:19
a search for 80.
Well, it is a bit too big.
605
00:40:19 --> 00:40:22
Let's search for 75.
So, we'll find the 75 fits
606
00:40:22 --> 00:40:25
right here between 72 and 79
using the same path.
607
00:40:25 --> 00:40:29
OK, if it's there already,
we complain because I'm going
608
00:40:29 --> 00:40:32
to assume all keys are distinct
for now just so the picture
609
00:40:32 --> 00:40:38
stays simple.
But this works fine even if you
610
00:40:38 --> 00:40:42
are inserting the same key over
and over.
611
00:40:42 --> 00:40:47
So, that seems good.
One thing we should clearly do
612
00:40:47 --> 00:40:50
is insert x into the bottom
list.
613
00:40:50 --> 00:40:55
We now know where it fits.
It should go there.
614
00:40:55 --> 00:40:59
Because we want to maintain
this invariant,
615
00:40:59 --> 00:41:06
that the bottom list contains
all the elements.
616
00:41:06 --> 00:41:10
So, there we go.
We've maintained the invariant.
617
00:41:10 --> 00:41:14
The bottom list contains all
the elements.
618
00:41:14 --> 00:41:18
So, we search for 75.
We say, oh, 75 goes here,
619
00:41:18 --> 00:41:24
and we just sort of link in 75.
You know how to do a linked
620
00:41:24 --> 00:41:29
list, I hope.
Let me just erase that pointer.
621
00:41:29 --> 00:41:32
All the work in implementing
skip lists is the linked list
622
00:41:32 --> 00:41:34
manipulation.
Is that enough?
623
00:41:34 --> 00:41:38
No, it would be fine for now
because now there's only a chain
624
00:41:38 --> 00:41:41
of length three here that you'd
have to walk over if you're
625
00:41:41 --> 00:41:44
looking for something in this
range.
626
00:41:44 --> 00:41:47
But if I just keep inserting
75, and 76, then 76 plus
627
00:41:47 --> 00:41:51
epsilon, 76 plus two epsilon,
and so on, just pack a whole
628
00:41:51 --> 00:41:54
bunch of elements in here,
this chain will get really
629
00:41:54 --> 00:41:55
long.
Now, suddenly,
630
00:41:55 --> 00:41:58
things are not so balanced.
If I do a search,
631
00:41:58 --> 00:42:02
I'll pay an arbitrarily long
amount time here to search for
632
00:42:02 --> 00:42:05
someone.
If I insert k things,
633
00:42:05 --> 00:42:08
it'll take k time.
I want it to stay log n.
634
00:42:08 --> 00:42:11
If I only insert log n items,
it's OK for now.
635
00:42:11 --> 00:42:15
What I want to do is decide
which of these lists contain 75.
636
00:42:15 --> 00:42:17
So, clearly it goes on the
bottom.
637
00:42:17 --> 00:42:19
Every element goes in the
bottom.
638
00:42:19 --> 00:42:21
Should it go up a level?
Maybe.
639
00:42:21 --> 00:42:23
It depends.
It's not clear yet.
640
00:42:23 --> 00:42:27
If I insert a few items here,
definitely some of them should
641
00:42:27 --> 00:42:39
go on the next level.
Should it go two levels up?
642
00:42:39 --> 00:42:57
Maybe, but even less likely.
So, what should I do?
643
00:42:57 --> 00:43:01
Yeah?
Right, so you maintain the
644
00:43:01 --> 00:43:05
ideal partition size,
which may be like the length of
645
00:43:05 --> 00:43:07
this chain.
And you see,
646
00:43:07 --> 00:43:10
well, if that gets too long,
then I should split it in the
647
00:43:10 --> 00:43:14
middle, promote that guy up to
the next level,
648
00:43:14 --> 00:43:18
and do the same thing up here.
If this chain gets too long
649
00:43:18 --> 00:43:21
between two consecutive next
level express stops,
650
00:43:21 --> 00:43:23
then I'll promote the middle
guy.
651
00:43:23 --> 00:43:26
And that's what you'll do in
your problem set.
652
00:43:26 --> 00:43:30
That's too fancy for me.
I don't need no stinking
653
00:43:30 --> 00:43:34
counters.
What else could I do?
654
00:43:34 --> 00:43:46
655
00:43:46 --> 00:43:48
I could try to maintain the
ideal skip list structure.
656
00:43:48 --> 00:43:51
That will be too expensive.
Like, say 75 is the guy that
657
00:43:51 --> 00:43:54
gets promoted,
and this guy gets demoted all
658
00:43:54 --> 00:43:55
the way down.
But that will propagate
659
00:43:55 --> 00:43:58
everything to the right.
And that could cost linear time
660
00:43:58 --> 00:44:01
for update.
Other idea?
661
00:44:01 --> 00:44:07
If I only want half of them to
go up, I could flip a coin.
662
00:44:07 --> 00:44:11
Good idea.
All right, for that,
663
00:44:11 --> 00:44:16
I will give you a quarter.
It's a good one.
664
00:44:16 --> 00:44:19
It's the Old Line State,
Maryland.
665
00:44:19 --> 00:44:24
There you go.
However, you have to perform
666
00:44:24 --> 00:44:32
some services for that quarter,
namely, flip the coin.
667
00:44:32 --> 00:44:34
Can you flip a coin?
Good.
668
00:44:34 --> 00:44:38
What did you get?
Tails, OK, that's the first
669
00:44:38 --> 00:44:42
random bit.
What we are going to do is build
670
00:44:42 --> 00:44:45
a skip list.
Maybe I should tell you how
671
00:44:45 --> 00:44:48
first.
OK, but the idea is flip a
672
00:44:48 --> 00:44:50
coin.
If it's heads,
673
00:44:50 --> 00:44:55
so, sorry, if it's heads,
we will promote it to the next
674
00:44:55 --> 00:45:03
level, and flip again.
So, this is an answer to the
675
00:45:03 --> 00:45:10
question, which other lists
should store x?
676
00:45:10 --> 00:45:16
How many other lists should we
add x to?
677
00:45:16 --> 00:45:22
Well, the algorithm is,
flip a coin,
678
00:45:22 --> 00:45:28
and if it comes out heads,
then promote x.
679
00:45:28 --> 00:45:36
to the next level up,
and flip again.
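The flip-until-tails promotion rule can be sketched as follows. The height cap and the use of Python's random module are assumptions of the sketch, not part of the lecture:

```python
import random

def random_height(max_height=32):
    # Every inserted element goes into the bottom list (height 1).
    # Each heads promotes it one level higher, then we flip again;
    # the first tails stops the promotion.
    h = 1
    while h < max_height and random.random() < 0.5:
        h += 1
    return h

random.seed(1)
heights = [random_height() for _ in range(10_000)]
# Roughly half the elements stay at height 1, a quarter reach
# height 2, an eighth reach height 3, and so on.
print(heights.count(1) / len(heights))
print(heights.count(2) / len(heights))
```

This is exactly the ratio the lecture wants: each level holds about half the elements of the level below it, in expectation.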
680
00:45:36 --> 00:45:39
OK, that's key because we might
want this element to go
681
00:45:39 --> 00:45:41
arbitrarily high.
But for starters,
682
00:45:41 --> 00:45:43
we flip a coin.
It doesn't go to the next
683
00:45:43 --> 00:45:45
level.
Well, we'd like it to go to the
684
00:45:45 --> 00:45:49
next level with probability one
half because we want the ratio
685
00:45:49 --> 00:45:51
between these two sizes to be a
half, or sorry,
686
00:45:51 --> 00:45:54
two, depending which way you
take the ratio.
687
00:45:54 --> 00:45:56
So, I want roughly half the
elements up here.
688
00:45:56 --> 00:45:58
So, I flip a coin.
If it comes up heads,
689
00:45:58 --> 00:46:02
I go up here.
This is a fair coin.
690
00:46:02 --> 00:46:05
So I want it 50-50.
OK, then how likely should that
691
00:46:05 --> 00:46:07
element be to go up to the next
level?
692
00:46:07 --> 00:46:09
Well, with 50% probability
again.
693
00:46:09 --> 00:46:12
So, I flip another coin.
If it comes up heads,
694
00:46:12 --> 00:46:15
I'll go up another level.
And that will maintain the
695
00:46:15 --> 00:46:19
approximate ratio between these
two guys as being two.
696
00:46:19 --> 00:46:21
The expected ratio will
definitely be two,
697
00:46:21 --> 00:46:25
and so on, all the way up.
If I go up to the top and flip
698
00:46:25 --> 00:46:28
a coin, it comes up heads,
I'll make another level.
699
00:46:28 --> 00:46:33
This is the insertion
algorithm: dead simple.
700
00:46:33 --> 00:46:38
The fancier one you will see on
your problem set.
701
00:46:38 --> 00:46:40
So, let's do it.
702
00:46:40 --> 00:46:49
703
00:46:49 --> 00:46:53
OK, I also need someone to
generate random numbers.
704
00:46:53 --> 00:46:56
Who can generate random
numbers?
705
00:46:56 --> 00:47:00
Pseudo-random?
I'll give you a quarter.
706
00:47:00 --> 00:47:02
I have one here.
Here you go.
707
00:47:02 --> 00:47:05
That's a boring quarter.
Who would like to generate
708
00:47:05 --> 00:47:08
random numbers?
Someone volunteering someone
709
00:47:08 --> 00:47:10
else: that's a good way to do
it.
710
00:47:10 --> 00:47:13
Here you go.
You get a quarter,
711
00:47:13 --> 00:47:15
but you're not allowed to flip
it.
712
00:47:15 --> 00:47:18
No randomness for you;
well, OK, you can generate
713
00:47:18 --> 00:47:22
bits, and then compute a number.
So, give me a number.
714
00:47:22 --> 00:47:25
44, good answer.
OK, we already flipped a coin
715
00:47:25 --> 00:47:27
and I got tails.
Done.
716
00:47:27 --> 00:47:33
That's the insertion algorithm.
I'm going to make some more
717
00:47:33 --> 00:47:36
space actually,
put it way down here.
718
00:47:36 --> 00:47:41
OK, so 44 does not get promoted
because we got a tails.
719
00:47:41 --> 00:47:46
So, give me another number.
Nine, OK, I search for nine in
720
00:47:46 --> 00:47:49
this list.
I should mention one other
721
00:47:49 --> 00:47:53
thing, sorry.
I need a small change.
722
00:47:53 --> 00:47:57
This is just to make sure
searches still work.
723
00:47:57 --> 00:48:02
So, the worry is suppose I
insert something bigger and then
724
00:48:02 --> 00:48:07
I promote it.
This would look very bad for a
725
00:48:07 --> 00:48:11
skip list data structure because
I always want to start at the
726
00:48:11 --> 00:48:13
top left, and now there's no top
left.
727
00:48:13 --> 00:48:17
So, just minor change:
just let me remember that.
728
00:48:17 --> 00:48:21
The minor change is that I'm
going to store a special value
729
00:48:21 --> 00:48:25
minus infinity in every list.
So, minus infinity always gets
730
00:48:25 --> 00:48:29
promoted all the way to the top,
whatever the top happens to be
731
00:48:29 --> 00:48:32
now.
So, initially,
732
00:48:32 --> 00:48:35
that way I'll always have a top
left.
733
00:48:35 --> 00:48:38
Sorry, I forgot to mention
that.
734
00:48:38 --> 00:48:41
So, initially I'll just have
minus infinity.
735
00:48:41 --> 00:48:45
Then I insert 44.
I say, OK, 44 goes there,
736
00:48:45 --> 00:48:47
no promotion,
done.
737
00:48:47 --> 00:48:49
Now, we're going to insert
nine.
738
00:48:49 --> 00:48:53
Nine goes here.
So, minus infinity to nine,
739
00:48:53 --> 00:48:55
flip your coin,
heads.
740
00:48:55 --> 00:49:00
Did he actually flip it?
OK, good.
741
00:49:00 --> 00:49:02
He flipped it before,
yeah, sure.
742
00:49:02 --> 00:49:04
I'm just giving you a hard
time.
743
00:49:04 --> 00:49:09
So, we have nine up here.
We need to maintain this minus
744
00:49:09 --> 00:49:13
infinity just to make sure it
gets promoted along with
745
00:49:13 --> 00:49:16
everything else.
So, that looks like a nice skip
746
00:49:16 --> 00:49:18
list.
Flip it again.
747
00:49:18 --> 00:49:21
Tails, good.
OK, so this looks like an ideal
748
00:49:21 --> 00:49:23
skip list.
Isn't that great?
749
00:49:23 --> 00:49:27
It works every time.
OK, give me another number.
750
00:49:27 --> 00:49:32
26, OK, so I search for 26.
26 goes here.
751
00:49:32 --> 00:49:36
It clearly goes on the bottom
list.
752
00:49:36 --> 00:49:41
Here we go, 26,
and then I erased 44.
753
00:49:41 --> 00:49:46
Flip.
Tails, OK, another number.
754
00:49:46 --> 00:49:52
50, oh, a big one.
It costs me a little while to
755
00:49:52 --> 00:49:56
search, and I get over here.
757
00:49:56 --> 00:49:58
Flip.
Heads, good.
758
00:49:58 --> 00:50:05
So 50 gets promoted.
Flip it again.
759
00:50:05 --> 00:50:08
Tails, OK, still a reasonable
number.
760
00:50:08 --> 00:50:11
Another number?
12, it takes a little while to
761
00:50:11 --> 00:50:15
get exciting here.
OK, 12 goes here between nine
762
00:50:15 --> 00:50:18
and 26.
You're giving me a hard time
763
00:50:18 --> 00:50:20
here.
OK, flip.
764
00:50:20 --> 00:50:24
Heads, OK, 12 gets promoted.
I know you have to work a
765
00:50:24 --> 00:50:30
little bit, but we just came
here to search for 12.
766
00:50:30 --> 00:50:35
So, we know that nine was the
last point we went down.
767
00:50:35 --> 00:50:39
So, we promote 12.
It gets inserted up here.
768
00:50:39 --> 00:50:45
We are just inserting into this
particular linked list:
769
00:50:45 --> 00:50:48
nothing fancy.
We link the two twelves
770
00:50:48 --> 00:50:52
together.
It still looks kind of like a
771
00:50:52 --> 00:50:55
linked list.
Flip again.
772
00:50:55 --> 00:50:58
OK, tails, another number.
37.
774
00:50:58 --> 00:51:02
Jeez.
It's a good test of memory.
775
00:51:02 --> 00:51:05
37, what was it,
44 and 50?
776
00:51:05 --> 00:51:08
And 50 was at the next level
up.
777
00:51:08 --> 00:51:14
I think I should just keep
appending elements and have you
778
00:51:14 --> 00:51:18
flip coins.
OK, we just inserted 37.
779
00:51:18 --> 00:51:22
Tails.
OK, that's getting to be a long
780
00:51:22 --> 00:51:25
chain.
That looks a bit worse.
781
00:51:25 --> 00:51:29
OK, give me another number
larger than 50.
782
00:51:29 --> 00:51:34
51, good answer.
Thank you.
783
00:51:34 --> 00:51:37
OK, flip again.
And again.
784
00:51:37 --> 00:51:40
Tails.
Another number.
785
00:51:40 --> 00:51:45
Wait, someone else should pick
a number.
786
00:51:45 --> 00:51:49
It's not working.
What did you say?
787
00:51:49 --> 00:51:52
52, good answer.
Flip.
788
00:51:52 --> 00:51:58
Tails, not surprising.
We haven't gotten a lot of heads
789
00:51:58 --> 00:52:03
there.
OK, another number.
790
00:52:03 --> 00:52:06
53, thank you.
Flip.
791
00:52:06 --> 00:52:08
Heads, heads,
OK.
792
00:52:08 --> 00:52:13
Heads, heads,
you didn't flip.
793
00:52:13 --> 00:52:17
All right, 53,
you get the idea.
794
00:52:17 --> 00:52:26
If you get two consecutive
heads, then the guy goes up two
795
00:52:26 --> 00:52:32
levels.
OK, now flip for real.
796
00:52:32 --> 00:52:33
Heads.
Finally.
797
00:52:33 --> 00:52:39
Heads we've been waiting for.
If you flipped three heads in a
798
00:52:39 --> 00:52:44
row, you go three levels.
And each time,
799
00:52:44 --> 00:52:47
we keep promoting minus
infinity.
800
00:52:47 --> 00:52:50
Look again.
Heads, oh my God.
801
00:52:50 --> 00:52:54
Where were they before?
Flip again.
802
00:52:54 --> 00:53:00
It better be tails this time.
Tails, good.
803
00:53:00 --> 00:53:04
OK, you get the idea.
Eventually you run out of board
804
00:53:04 --> 00:53:06
space.
Now, it's pretty rare that you
805
00:53:06 --> 00:53:10
go too high.
What's the probability that you
806
00:53:10 --> 00:53:13
go higher than log n?
Another easy log computation.
807
00:53:13 --> 00:53:17
Each time, I have a 50%
probability of going up.
808
00:53:17 --> 00:53:22
One in n probability of going
up log n levels because half to
809
00:53:22 --> 00:53:24
the power of log n is one out of
n.
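That little calculation can be verified directly. In this sketch, n is picked as a power of two so the arithmetic is exact:

```python
import math

n = 1024
levels = int(math.log2(n))  # 10
p_exceed = 0.5 ** levels    # probability of log2(n) heads in a row
print(p_exceed)             # 0.0009765625
print(1 / n)                # 0.0009765625, the same value
```

So an element reaches more than log n levels with probability only one in n.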
810
00:53:24 --> 00:53:28
So, it depends on n,
but I'm not going to go too
811
00:53:28 --> 00:53:32
high.
And, intuitively,
812
00:53:32 --> 00:53:37
this is not so bad.
So, these are skip lists.
813
00:53:37 --> 00:53:44
You have the ratios right in
expectation, which is a pretty
814
00:53:44 --> 00:53:49
weak statement.
This doesn't say anything about
815
00:53:49 --> 00:53:54
the lengths of these chains.
But intuitively,
816
00:53:54 --> 00:53:59
it's pretty good.
Let's say pretty good on
817
00:53:59 --> 00:54:03
average.
So, I had two semi-random
818
00:54:03 --> 00:54:05
processes going on here.
One is picking the numbers,
819
00:54:05 --> 00:54:08
and that, I don't want to
assume anything about.
820
00:54:08 --> 00:54:09
The numbers could be
adversarial.
821
00:54:09 --> 00:54:12
It could be sequential.
It could be reverse sorted.
822
00:54:12 --> 00:54:14
It could be random.
I don't know.
823
00:54:14 --> 00:54:15
So, it didn't matter what he
said.
824
00:54:15 --> 00:54:18
At least, it shouldn't matter.
I mean, it matters here.
825
00:54:18 --> 00:54:20
Don't worry.
You're still loved.
826
00:54:20 --> 00:54:22
You still get your $0.25.
But what the algorithm cares
827
00:54:22 --> 00:54:24
about is the outcomes of these
coins.
828
00:54:24 --> 00:54:27
And the probability,
the statement that this data
829
00:54:27 --> 00:54:30
structure is fast with high
probability is only about the
830
00:54:30 --> 00:54:34
random coins.
Right, it doesn't matter what
831
00:54:34 --> 00:54:38
the adversary chooses for
numbers as long as those coins
832
00:54:38 --> 00:54:43
are random, and the adversary
doesn't know the coins.
833
00:54:43 --> 00:54:46
It doesn't know the outcomes of
the coins.
834
00:54:46 --> 00:54:50
So, in that case,
on average, overall of the coin
835
00:54:50 --> 00:54:55
flips, you should be OK.
But the claim is not just that
836
00:54:55 --> 00:54:58
it's pretty good on average.
But, it's really,
837
00:54:58 --> 00:55:03
really good almost always.
OK, with really high
838
00:55:03 --> 00:55:07
probability it's log n.
So, for example,
839
00:55:07 --> 00:55:10
with probability,
one minus one over n,
840
00:55:10 --> 00:55:15
it's order of log n,
with probability one minus one
841
00:55:15 --> 00:55:19
over n^2 it's log n,
probability one minus one over
842
00:55:19 --> 00:55:24
n^100, it's order log n.
All those statements are true
843
00:55:24 --> 00:55:30
for any value of 100.
So, that's where we're going.
844
00:55:30 --> 00:55:33
OK, I should mention,
how do you delete in a skip
845
00:55:33 --> 00:55:34
list?
Find the element.
846
00:55:34 --> 00:55:37
You delete it all the way.
There's nothing fancy with
847
00:55:37 --> 00:55:40
delete.
Because we have all these
848
00:55:40 --> 00:55:43
independent, random choices,
all of these elements are sort
849
00:55:43 --> 00:55:47
of independent from each other.
We don't really care.
850
00:55:47 --> 00:55:49
So, delete an element,
just throw it away.
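On the same list-of-lists picture used earlier, deletion really is that simple. The representation is an assumption of this sketch, not the lecture's implementation:

```python
def skiplist_delete(levels, key):
    # Find the key and splice it out of every list that contains it.
    # No rebalancing: the surviving heights came from independent
    # coin flips, so their distribution is unaffected.
    for level in levels:
        if key in level:
            level.remove(key)

levels = [[14, 79], [14, 50, 79], [14, 34, 50, 66, 72, 79]]
skiplist_delete(levels, 50)
print(levels)  # [[14, 79], [14, 79], [14, 34, 66, 72, 79]]
```

With linked lists, each removal is a constant-time splice once the element is found.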
851
00:55:49 --> 00:55:53
The tricky part is insertion.
When I insert an element,
852
00:55:53 --> 00:55:56
I'm just going to randomly see
how high it should go.
853
00:55:56 --> 00:56:00
With probability one over two
to the i, it will go to height
854
00:56:00 --> 00:56:04
i.
Good, that's my time.
855
00:56:04 --> 00:56:08
I've been having too much fun
here.
856
00:56:08 --> 00:56:14
I've got to go a little bit
faster, OK.
857
00:56:14 --> 00:56:25
858
00:56:25 --> 00:56:32
So here's the theorem.
Let's see exactly what we are
859
00:56:32 --> 00:56:38
proving first.
With high probability,
860
00:56:38 --> 00:56:46
this is a formal notion which I
will define a second.
861
00:56:46 --> 00:56:55
Every search in an n-element
skip list costs order of log n.
lists costs order of log n.
862
00:56:55 --> 00:57:03
So, that's the theorem.
Now I need to define with high
863
00:57:03 --> 00:57:06
probability.
So, with high probability.
864
00:57:06 --> 00:57:10
And, it's a bit of a long
phrase.
865
00:57:10 --> 00:57:15
So, often we will,
and you can abbreviate it WHP.
866
00:57:15 --> 00:57:20
So, if I have a random event,
and the random event here is
867
00:57:20 --> 00:57:26
that every search in an n
element skip list costs order
868
00:57:26 --> 00:57:32
log n, I want to know what it
means for that event E to occur
869
00:57:32 --> 00:57:36
with high probability.
870
00:57:36 --> 00:57:47
871
00:57:47 --> 00:57:53
So this is the definition.
So, the statement is that for
872
00:57:53 --> 00:58:00
any alpha greater than or equal
to one, there is a suitable
873
00:58:00 --> 00:58:04
choice of constants --
874
00:58:04 --> 00:58:16
875
00:58:16 --> 00:58:27
-- for which the event,
E, occurs with this probability
876
00:58:27 --> 00:58:37
I keep mentioning.
So, the probability at least
877
00:58:37 --> 00:58:46
one minus one over n to the
alpha.
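Written out as a formula (my transcription of the board definition, with c standing in for the "suitable choice of constants"), the event E occurring with high probability means:

```latex
\forall \alpha \ge 1:\qquad \Pr[E] \;\ge\; 1 - \frac{c}{n^{\alpha}}
```

where the constant c (and the constant hidden in the O(log n) bound) may depend on alpha, but not on n.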
878
00:58:46 --> 00:58:49
So, this is a bit imprecise,
but it will suffice for our
879
00:58:49 --> 00:58:52
purposes.
If you want a really formal
880
00:58:52 --> 00:58:55
definition, you can read the
lecture notes.
881
00:58:55 --> 00:58:59
There are special lecture notes
for this lecture on the stellar
882
00:58:59 --> 00:59:01
site.
And, there's the PowerPoint
883
00:59:01 --> 00:59:06
notes on the SMA site.
But, right, there's a bit of a
884
00:59:06 --> 00:59:08
subtlety in the choice of
constants here.
885
00:59:08 --> 00:59:11
There is a choice of this
constant.
886
00:59:11 --> 00:59:14
And there's a choice of this
constant.
887
00:59:14 --> 00:59:16
And, these are related.
And, there's alpha,
888
00:59:16 --> 00:59:19
which we get to whatever we
want.
889
00:59:19 --> 00:59:22
But the bottom line is,
we get to choose what
890
00:59:22 --> 00:59:24
probability we want this to be
true.
891
00:59:24 --> 00:59:28
If I want it to be true,
with probability one minus one
892
00:59:28 --> 00:59:32
over n^100, I can do that.
I just set alpha to a hundred,
893
00:59:32 --> 00:59:37
and up to this little constant
that's going to grow much slower
894
00:59:37 --> 00:59:41
than n to the alpha.
I get the error probability.
895
00:59:41 --> 00:59:45
So this thing is called the
error probability.
896
00:59:45 --> 00:59:48
The probability that I fail is
polynomially small,
897
00:59:48 --> 00:59:51
for any polynomial I want.
Now, with the same data
898
00:59:51 --> 00:59:54
structure, right,
I fixed the data structure.
899
00:59:54 --> 00:59:57
It doesn't depend on alpha.
Anything you want,
900
00:59:57 --> 1:00:01.717
any alpha value you want,
this data structure will take
901
1:00:01.717 --> 1:00:06.692
order of log n time.
Now, this constant will depend
902
1:00:06.692 --> 1:00:08.666
on alpha.
So, you know,
903
1:00:08.666 --> 1:00:14.141
you want error probability one
over n^100 is probably going to
904
1:00:14.141 --> 1:00:17.461
be, like, 100 log n.
It's still log n.
905
1:00:17.461 --> 1:00:22.128
OK, this is a very strong claim
about the tail of the
906
1:00:22.128 --> 1:00:27.064
distribution of the running time
of search, very strong.
907
1:00:27.064 --> 1:00:32
Let me give you an idea of how
strong it is.
908
1:00:32 --> 1:00:36.731
How many people know what
Boole's inequality is?
909
1:00:36.731 --> 1:00:42.671
How many people know what the
union bound is in probability?
910
1:00:42.671 --> 1:00:45.691
You should.
It's in Appendix C.
911
1:00:45.691 --> 1:00:49.214
Maybe you'll know it by the
theorem.
912
1:00:49.214 --> 1:00:55.154
It's good to know it by name.
It's sort of like linearity of
913
1:00:55.154 --> 1:00:58.476
expectations.
It's a lot easier to
914
1:00:58.476 --> 1:01:03.978
communicate to someone.
Linearity of expectations:
915
1:01:03.978 --> 1:01:07.554
instead of saying,
you know that thing where you
916
1:01:07.554 --> 1:01:11.51
sum up all the expectations of
things, and that's the
917
1:01:11.51 --> 1:01:15.086
expectation of the sum?
It's a lot easier to say
918
1:01:15.086 --> 1:01:18.815
linearity of expectation.
So, let me quiz you in a
919
1:01:18.815 --> 1:01:21.706
different way.
So, if I take a bunch of
920
1:01:21.706 --> 1:01:26.119
events, and I take their union,
either this happens or this
921
1:01:26.119 --> 1:01:29.847
happens, or so on.
So, this is the inclusive OR of
922
1:01:29.847 --> 1:01:31.521
k events.
And, instead,
923
1:01:31.521 --> 1:01:37
I look at the sum of the
probabilities of those events.
924
1:01:37 --> 1:01:40.111
OK, easy question:
are these equal?
925
1:01:40.111 --> 1:01:42.947
No, not unless the events are
disjoint.
926
1:01:42.947 --> 1:01:47.248
But can I say anything about
them, any relation?
927
1:01:47.248 --> 1:01:51.183
Smaller, yeah.
This is less than or equal to
928
1:01:51.183 --> 1:01:54.477
that.
OK, this should be intuitive to
929
1:01:54.477 --> 1:01:57.771
you from a probability point of
view.
930
1:01:57.771 --> 1:02:01.705
Look at the textbook.
OK: very basic result,
931
1:02:01.705 --> 1:02:07.041
trivial result almost.
What does this tell us?
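The union bound is easy to check empirically. Here is a minimal sketch, not from the lecture — the three events on one die roll are made up, and deliberately dependent, so the two sides are not equal; only the inequality holds:

```python
import random

# Monte Carlo check of Boole's inequality (the union bound):
# Pr[E_1 or E_2 or ... or E_k] <= sum over i of Pr[E_i].
random.seed(0)
TRIALS = 100_000

union_hits = 0
individual_hits = [0, 0, 0]
for _ in range(TRIALS):
    d = random.randint(1, 6)                 # one die roll shared by all events
    events = [d <= 2, d % 2 == 0, d == 6]    # E_1, E_2, E_3 (overlapping)
    if any(events):
        union_hits += 1
    for i, happened in enumerate(events):
        if happened:
            individual_hits[i] += 1

p_union = union_hits / TRIALS
p_sum = sum(h / TRIALS for h in individual_hits)
assert p_union <= p_sum    # Boole's inequality, checked empirically
```

Every union hit is witnessed by at least one individual hit, which is all the inequality says.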
932
1:02:07.041 --> 1:02:11.479
Well, suppose that E_i is some
kind of error event.
933
1:02:11.479 --> 1:02:15.295
We don't want it to happen.
OK, and suppose,
934
1:02:15.295 --> 1:02:19.467
let me fix some letters here.
Suppose I have a bunch of
935
1:02:19.467 --> 1:02:23.017
events which occur with high
probability.
936
1:02:23.017 --> 1:02:26.745
OK, call those E_i complement.
So, suppose,
937
1:02:26.745 --> 1:02:31.893
so this is the end of that
statement, E_i complement occurs
938
1:02:31.893 --> 1:02:37.063
with high probability.
OK, so then the probability of
939
1:02:37.063 --> 1:02:39.609
E_i is very small,
polynomially small.
940
1:02:39.609 --> 1:02:42.636
One over n to the alpha for any
alpha I want.
941
1:02:42.636 --> 1:02:46.007
Now, suppose I take a whole
bunch of these events,
942
1:02:46.007 --> 1:02:48.69
and let's say that k is
polynomial in n.
943
1:02:48.69 --> 1:02:52.405
So, I take a bunch of events,
which I'd like to happen.
944
1:02:52.405 --> 1:02:54.882
They all occur with high
probability.
945
1:02:54.882 --> 1:02:57.565
There is only polynomially many
of them.
946
1:02:57.565 --> 1:03:00.316
So let's say,
let me give this constant a
947
1:03:00.316 --> 1:03:03
name.
Let's call it c.
948
1:03:03 --> 1:03:05.873
Let's say I take n to the c
such events.
949
1:03:05.873 --> 1:03:09.926
Well, what's the probability
that all those events occur
950
1:03:09.926 --> 1:03:12.873
together?
Because they should, most of the
951
1:03:12.873 --> 1:03:17.073
time, occur together, because
each one occurs most of the
952
1:03:17.073 --> 1:03:19.578
time, occurs with high
probability.
953
1:03:19.578 --> 1:03:23.115
So, I want to look at E_1 bar
intersect, E_2 bar,
954
1:03:23.115 --> 1:03:25.842
and so on.
So, each of these occurs as
955
1:03:25.842 --> 1:03:29.378
high probability.
What's the chance that they all
956
1:03:29.378 --> 1:03:32.166
occur?
It's also with high
957
1:03:32.166 --> 1:03:34.316
probability.
I'm changing the alpha.
958
1:03:34.316 --> 1:03:37.817
So, the union bound tells me
the probability of any one of
959
1:03:37.817 --> 1:03:40.09
these failing,
the probability of this
960
1:03:40.09 --> 1:03:42.608
failing, or this failing,
or this failing,
961
1:03:42.608 --> 1:03:44.573
which is this thing,
is, at most,
962
1:03:44.573 --> 1:03:47.276
the sum of the probabilities of
each failure.
963
1:03:47.276 --> 1:03:49.303
These are the error
probabilities.
964
1:03:49.303 --> 1:03:52.619
I know that each of them is,
at most, one over n to the
965
1:03:52.619 --> 1:03:55.875
alpha, with a constant in front.
If I add them all up,
966
1:03:55.875 --> 1:03:57.779
there's only n to the c of
them.
967
1:03:57.779 --> 1:04:01.034
So, I take this error
probability, and I multiply by n
968
1:04:01.034 --> 1:04:05.4
to the c.
So, I get like n to the c over
969
1:04:05.4 --> 1:04:08.679
n to the alpha,
which is one over n to the
970
1:04:08.679 --> 1:04:11.96
alpha minus c.
I can set alpha as big as I
971
1:04:11.96 --> 1:04:13.88
want.
So, I set it much,
972
1:04:13.88 --> 1:04:17.88
much bigger than c,
and this event occurs with high
973
1:04:17.88 --> 1:04:21
probability.
I sort of made a mess here,
974
1:04:21 --> 1:04:25.719
but this event occurs with high
probability because of this.
975
1:04:25.719 --> 1:04:30.599
Whatever the constant is here,
however many events I'm taking,
976
1:04:30.599 --> 1:04:35
I just set alpha to be bigger
than that.
977
1:04:35 --> 1:04:37.951
And, this event will occur with
high probability,
978
1:04:37.951 --> 1:04:40.041
too.
So, when I say here that every
979
1:04:40.041 --> 1:04:42.992
search of cost order log n with
high probability,
980
1:04:42.992 --> 1:04:46.005
not only do I mean that if you
look at one search,
981
1:04:46.005 --> 1:04:48.587
it costs order log n with high
probability.
982
1:04:48.587 --> 1:04:51.969
You look at another search,
and it costs log n with high
983
1:04:51.969 --> 1:04:54.244
probability.
I mean, if you take every
984
1:04:54.244 --> 1:04:57.318
search, all of them take order
log n time with high
985
1:04:57.318 --> 1:04:59.593
probability.
So, this event that every
986
1:04:59.593 --> 1:05:03.036
single search you do takes order
log n, is true with high
987
1:05:03.036 --> 1:05:06.663
probability, assuming the number
of searches you are doing is
988
1:05:06.663 --> 1:05:10.887
polynomial in n.
So, I'm assuming that I'm not
989
1:05:10.887 --> 1:05:14.467
using this data structure
forever, just for a polynomial
990
1:05:14.467 --> 1:05:17.136
amount of time.
But, who's got more than a
991
1:05:17.136 --> 1:05:19.218
polynomial amount of time
anyway?
992
1:05:19.218 --> 1:05:21.757
This is MIT.
So, hopefully that's clear.
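The bookkeeping in that argument is just exponent arithmetic, and it can be sketched with exact fractions — the numbers n = 1000 and c = 2 below are hypothetical, not from the lecture:

```python
from fractions import Fraction

# Union bound over n^c events, each failing with probability <= 1/n^alpha:
# total failure probability is at most n^c / n^alpha = 1 / n^(alpha - c).
n, c = 1000, 2                      # hypothetical: a million (n^c) good events
for alpha in (3, 10, 100):
    per_event_error = Fraction(1, n ** alpha)
    total_error = (n ** c) * per_event_error     # Boole's inequality
    assert total_error == Fraction(1, n ** (alpha - c))
```

Whatever c the application forces on you, picking alpha well above c keeps the combined error polynomially small.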
993
1:05:21.757 --> 1:05:24.035
We'll see it a few more times.
Yeah?
994
1:05:24.035 --> 1:05:26.443
The algorithm doesn't depend on
Alpha.
995
1:05:26.443 --> 1:05:31
The question is how do you
choose alpha in the algorithm.
996
1:05:31 --> 1:05:33.925
So, we don't need to.
This is just sort of for an
997
1:05:33.925 --> 1:05:36.668
analysis tool.
This is saying that the farther
998
1:05:36.668 --> 1:05:39.838
out you get, so you say,
well, what's the probability
999
1:05:39.838 --> 1:05:43.19
that it's more than ten log n?
Well, it's like one over n^10.
1000
1:05:43.19 --> 1:05:46.238
Let's say it's linear.
Well, what's the chance that
1001
1:05:46.238 --> 1:05:49.407
you're more than 20 log n?
Well that's one over n^20.
1002
1:05:49.407 --> 1:05:52.942
So, the point is the tail of
this distribution is getting a
1003
1:05:52.942 --> 1:05:54.466
really small,
really fast.
1004
1:05:54.466 --> 1:05:57.758
And so, choosing alpha is more
like sort of for your own
1005
1:05:57.758 --> 1:06:00.135
peace of mind.
OK, you can set it to 100,
1006
1:06:00.135 --> 1:06:05.209
and then n is at least two.
So, that's like one over 2^100
1007
1:06:05.209 --> 1:06:08.082
chance that you fail.
That's damn small.
1008
1:06:08.082 --> 1:06:11.322
If you've got a real random
number generator,
1009
1:06:11.322 --> 1:06:15.668
the chance that you're going to
hit one over 2^200 is pretty
1010
1:06:15.668 --> 1:06:18.762
tiny, right?
So, let's say you set alpha to
1011
1:06:18.762 --> 1:06:21.266
256, which is always a good
number.
1012
1:06:21.266 --> 1:06:25.759
2^256 is much bigger than the
number of particles in the known
1013
1:06:25.759 --> 1:06:29
universe, so,
the light matter.
1014
1:06:29 --> 1:06:32.898
So, actually I think this even
accounts for some notion of dark
1015
1:06:32.898 --> 1:06:34.533
matter.
So, this is really,
1016
1:06:34.533 --> 1:06:37.615
really, really big.
So, the chance that you pick a
1017
1:06:37.615 --> 1:06:41.576
random particle in the universe
that happens to be your favorite
1018
1:06:41.576 --> 1:06:45.161
particle, this one right here,
that's about one over 2^256,
1019
1:06:45.161 --> 1:06:47.487
or even smaller.
So, set alpha to 256,
1020
1:06:47.487 --> 1:06:51.26
the chance to your algorithm
takes more than order log n time
1021
1:06:51.26 --> 1:06:54.907
is a lot smaller than the chance
that a meteor strikes your
1022
1:06:54.907 --> 1:06:58.68
computer at the same time that
it has a floating point error,
1023
1:06:58.68 --> 1:07:02.642
at the same time that the earth
explodes because they're putting
1024
1:07:02.642 --> 1:07:06.415
a transport through this part of
the solar system at the same
1025
1:07:06.415 --> 1:07:08.113
time, I mean,
I could go on,
1026
1:07:08.113 --> 1:07:10.752
right?
It's really,
1027
1:07:10.752 --> 1:07:13.51
really unlikely that you are
more than log n.
1028
1:07:13.51 --> 1:07:15.705
And how unlikely:
you get to choose.
1029
1:07:15.705 --> 1:07:19.467
But it's just in the analysis
the algorithm doesn't depend on
1030
1:07:19.467 --> 1:07:21.159
it.
It's the same algorithm,
1031
1:07:21.159 --> 1:07:23.04
very cool.
Sometimes, with high
1032
1:07:23.04 --> 1:07:25.297
probability, bounds depends on
alpha.
1033
1:07:25.297 --> 1:07:27.68
I mean, the algorithm depends
on alpha.
1034
1:07:27.68 --> 1:07:32.307
But here, it will not.
OK, away we go.
1035
1:07:32.307 --> 1:07:37.692
So now you all understand the
claim.
1036
1:07:37.692 --> 1:07:45.384
So let's do a warm up.
We will also need this fact.
1037
1:07:45.384 --> 1:07:52.769
But it's pretty easy.
The lemma is that with high
1038
1:07:52.769 --> 1:08:01.692
probability, the number of
levels in the skip list is order
1039
1:08:01.692 --> 1:08:06.266
log n.
I think it's order log n,
1040
1:08:06.266 --> 1:08:09.349
certainly.
So, how do we prove that
1041
1:08:09.349 --> 1:08:12.613
something happens with high
probably?
1042
1:08:12.613 --> 1:08:18.144
Compute the probability that it
happened; show that it's high.
1043
1:08:18.144 --> 1:08:22.677
Even if you don't know what
high probability means,
1044
1:08:22.677 --> 1:08:26.122
in fact, I used to ask that
earlier on.
1045
1:08:26.122 --> 1:08:30.746
So, let's compute the chance
that it doesn't happen,
1046
1:08:30.746 --> 1:08:35.551
the error probability,
because that's just one minus;
1047
1:08:35.551 --> 1:08:39.449
it's cleaner.
So, I'd like to say,
1048
1:08:39.449 --> 1:08:42.71
let's say, that it's,
at most, c log n levels.
1049
1:08:42.71 --> 1:08:46.115
So, what's the error
probability for that event?
1050
1:08:46.115 --> 1:08:50.028
This is sort of an event.
I'll put it in squiggles just
1051
1:08:50.028 --> 1:08:53
for clarity; all set.
This is the probability that
1052
1:08:53 --> 1:08:56.26
they are strictly greater than c
log n levels.
1053
1:08:56.26 --> 1:09:00.173
So, I want to say that that
probability is particularly
1054
1:09:00.173 --> 1:09:04.683
small, polynomially small.
Well, how do I make levels?
1055
1:09:04.683 --> 1:09:07.552
When I insert an element,
with probability a half,
1056
1:09:07.552 --> 1:09:09.984
it goes up.
And, the number of levels in
1057
1:09:09.984 --> 1:09:13.726
the skip list is the max over
all the elements of how high it
1058
1:09:13.726 --> 1:09:15.035
goes up.
But, max, oh,
1059
1:09:15.035 --> 1:09:17.779
that's a mess.
All right, you can compute the
1060
1:09:17.779 --> 1:09:21.022
expectation of the max if you
have a bunch of random
variables; each expectation
1061
1:09:21.022 --> 1:09:24.202
is a constant,
and you take the max.
It's like log n in
and you take the max.
It's like log in and
1063
1:09:26.759 --> 1:09:31
expectation, but we want a much
stronger statement.
1064
1:09:31 --> 1:09:35.815
And, we have this Boole's
inequality that says I have a
1065
1:09:35.815 --> 1:09:39.472
bunch of things,
polynomially many things.
1066
1:09:39.472 --> 1:09:43.842
Let's say we have n items.
Each one independently,
1067
1:09:43.842 --> 1:09:47.142
I don't even care if they're
dependent.
1068
1:09:47.142 --> 1:09:52.582
If it goes up more than c log
n, then the number of levels is
1069
1:09:52.582 --> 1:09:55.258
more than c log n.
So, this is,
1070
1:09:55.258 --> 1:10:00.163
at most, and then I want to
know, do any of those events
1071
1:10:00.163 --> 1:10:03.017
happen for any of the n
elements?
1072
1:10:03.017 --> 1:10:06.762
So, I just multiplied by n.
It's certainly,
1073
1:10:06.762 --> 1:10:10.597
at most, n times the
probability that x gets
1074
1:10:10.597 --> 1:10:15.502
promoted, this much here,
greater than or equal to c log n
1075
1:10:15.502 --> 1:10:18.734
times.
OK, if I pick,
1076
1:10:18.734 --> 1:10:21.041
for any element,
x, because it's the same for
1077
1:10:21.041 --> 1:10:23.191
each element.
They are done independently.
1078
1:10:23.191 --> 1:10:26.179
So, I'm just summing over x
here, and that's just a factor
1079
1:10:26.179 --> 1:10:26.756
of n.
Clear?
1080
1:10:26.756 --> 1:10:29.588
This is Boole's inequality.
Now, what's the probability
1081
1:10:29.588 --> 1:10:32
that x gets promoted c log n
times?
1082
1:10:32 --> 1:10:36.646
We did this before for log n.
It was one over n.
1083
1:10:36.646 --> 1:10:40.305
For c log n,
it's one over n to the c.
1084
1:10:40.305 --> 1:10:44.161
OK, this is n times two to --
Let's be nicer:
1085
1:10:44.161 --> 1:10:47.324
one half to the power of c log
n.
1086
1:10:47.324 --> 1:10:53.257
One half to the power of c log
n is one over two to the c log
1087
1:10:53.257 --> 1:10:55.926
n.
The log n comes out here,
1088
1:10:55.926 --> 1:10:58.991
becomes an n.
We get n to the c.
1089
1:10:58.991 --> 1:11:05.022
So, this is n divided by n to
the c, which is n to the c minus
1090
1:11:05.022 --> 1:11:09.904
one.
And, I get to choose c to be
1091
1:11:09.904 --> 1:11:14.676
whatever I want.
So, I choose c minus one to be
1092
1:11:14.676 --> 1:11:17.477
alpha.
I pick exactly that.
1093
1:11:17.477 --> 1:11:21.626
Oh, sorry, one over n to the c
minus one.
1094
1:11:21.626 --> 1:11:24.634
Thank you.
It better be small.
1095
1:11:24.634 --> 1:11:30.236
This is an upper bound.
So, probability is polynomially
1096
1:11:30.236 --> 1:11:32.956
small.
I get to choose,
1097
1:11:32.956 --> 1:11:36.484
and this is a bit of the trick.
I'm choosing this constant to
1098
1:11:36.484 --> 1:11:38.397
be large, large enough for
alpha.
1099
1:11:38.397 --> 1:11:40.61
The point is,
as c grows, alpha grows.
1100
1:11:40.61 --> 1:11:43.48
Therefore, I can set alpha to
be whatever I want,
1101
1:11:43.48 --> 1:11:46.29
set c accordingly.
So, there's a little bit more
1102
1:11:46.29 --> 1:11:49.459
words that have to go here.
But, they're in the notes.
1103
1:11:49.459 --> 1:11:51.851
I can set alpha to be as large
as I want.
1104
1:11:51.851 --> 1:11:55.199
So, I can make this probability
as small as I want in the
1105
1:11:55.199 --> 1:11:56.993
polynomial sense.
So, that's it.
1106
1:11:56.993 --> 1:11:58.727
Number of levels,
order log n:
1107
1:11:58.727 --> 1:12:02.224
wasn't that easy?
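The lemma is also easy to watch happen. A minimal simulation, not from the lecture — n and the constant c = 4 here are arbitrary choices: each element's height is one plus the number of consecutive heads, and the number of levels is the max height.

```python
import math
import random

# Each inserted element is promoted while a fair coin comes up heads,
# so its height is geometric.  The number of levels in the skip list is
# the max height over all n elements, which is O(log n) whp:
# Pr[max > c log n] <= n * (1/2)^(c log n) = 1 / n^(c-1).
random.seed(1)

def height():
    """Levels an element occupies: 1 plus the number of consecutive heads."""
    h = 1
    while random.random() < 0.5:
        h += 1
    return h

n = 10_000
levels = max(height() for _ in range(n))
bound = 4 * math.log2(n)     # c = 4; failure probability <= 1/n^3
assert levels <= bound
```

With c = 4 and n = 10,000 the failure probability is about 10^-12, so the assertion is effectively certain to hold.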
Boole's inequality:
1108
1:12:02.224 --> 1:12:06.026
the point is that when you're
dealing with high probability,
1109
1:12:06.026 --> 1:12:09.377
use Boole's inequality.
And, anything that's true for
1110
1:12:09.377 --> 1:12:12.664
one element is true for all of
them, just like that.
1111
1:12:12.664 --> 1:12:15.886
Just lose a factor of n,
but that's just one in the
1112
1:12:15.886 --> 1:12:18.271
alpha, and alpha is big:
big constant,
1113
1:12:18.271 --> 1:12:21.106
but it's big.
OK, so let's prove the theorem.
1114
1:12:21.106 --> 1:12:23.813
High probability searches cost
order log n.
1115
1:12:23.813 --> 1:12:27.422
We now know the height is order
log n, but it depends how
1116
1:12:27.422 --> 1:12:32.756
balanced this thing is.
It depends how long the chains
1117
1:12:32.756 --> 1:12:36.8
are to really know that a search
costs log n.
1118
1:12:36.8 --> 1:12:41.21
Just knowing a bound on the
height is not enough,
1119
1:12:41.21 --> 1:12:45.805
unlike a binary tree.
So, we have one cool idea for
1120
1:12:45.805 --> 1:12:49.389
this analysis.
And it's called backwards
1121
1:12:49.389 --> 1:12:52.697
analysis.
So, normally you think of a
1122
1:12:52.697 --> 1:12:58.21
search as starting in the top
left corner going left and down
1123
1:12:58.21 --> 1:13:04
until you get to the item that
you're looking for.
1124
1:13:04 --> 1:13:07.423
I'm going to look at the
reverse process.
1125
1:13:07.423 --> 1:13:12.558
You start at the item you're
looking for, and you go left and
1126
1:13:12.558 --> 1:13:15.896
up until you get to the top left
corner.
1127
1:13:15.896 --> 1:13:20.175
The number of steps in those
two walks is the same.
1128
1:13:20.175 --> 1:13:23.855
And, I'm not implementing an
algorithm here,
1129
1:13:23.855 --> 1:13:27.792
I'm just doing analysis.
So, those are the same
1130
1:13:27.792 --> 1:13:32.671
processes, just in reverse.
So, here's what it looks like.
1131
1:13:32.671 --> 1:13:35.409
You have a search,
and it starts,
1132
1:13:35.409 --> 1:13:42
which really means that it ends
at a node in the bottom list.
1133
1:13:42 --> 1:13:46.845
Then, each time you visit a
node in this search,
1134
1:13:46.845 --> 1:13:52.618
you either go left or up.
And, when do you go left or up?
1135
1:13:52.618 --> 1:13:56.639
Well, it depends what the coin
flip was.
1136
1:13:56.639 --> 1:14:02
So, if the node wasn't promoted
at this level.
1137
1:14:02 --> 1:14:08.317
So, if it wasn't promoted
higher, and that happened
1138
1:14:08.317 --> 1:14:14.003
exactly when we got a tails.
Then, we go left,
1139
1:14:14.003 --> 1:14:19.057
which really means we came from
the left.
1140
1:14:19.057 --> 1:14:25.754
Or, if we got a heads,
so if this node was promoted to
1141
1:14:25.754 --> 1:14:31.44
the next level,
which happened whenever we got
1142
1:14:31.44 --> 1:14:37
a heads at that particular
moment.
1143
1:14:37 --> 1:14:42.86
This is in the past some time
when we did the insertion.
1144
1:14:42.86 --> 1:14:45.844
Then we go, or came from,
up.
1145
1:14:45.844 --> 1:14:51.704
And, we stop at the root.
This is really where we start;
1146
1:14:51.704 --> 1:14:55.967
same thing.
So, either at the root or I'm
1147
1:14:55.967 --> 1:15:03
also going to think of this as
stopping at minus infinity.
1148
1:15:03 --> 1:15:05.562
OK, that was a bit messy,
but let me review.
1149
1:15:05.562 --> 1:15:08.602
So, normally we start up here.
Well, just looking at
1150
1:15:08.602 --> 1:15:11.344
everything backwards,
and in brackets is what's
1151
1:15:11.344 --> 1:15:13.966
really happening.
So, this search ends at the
1152
1:15:13.966 --> 1:15:17.364
node you were looking for.
It's always in the bottom list.
1153
1:15:17.364 --> 1:15:19.807
Then it says,
well, was this node promoted
1154
1:15:19.807 --> 1:15:21.953
higher?
If it was, I came from above.
1155
1:15:21.953 --> 1:15:25.41
If not, I came to the left.
It must have been in the bottom
1156
1:15:25.41 --> 1:15:28.033
chain somewhere.
OK, and that's true at every
1157
1:15:28.033 --> 1:15:31.87
node you visit.
It depends whether that coin
1158
1:15:31.87 --> 1:15:35.806
flipped heads or tails at the
time that you inserted that node
1159
1:15:35.806 --> 1:15:38.774
into that level.
But, these are just a bunch of
1160
1:15:38.774 --> 1:15:40.774
events.
I'm just going to check,
1161
1:15:40.774 --> 1:15:44.258
what is the probability that
its heads, and what is the
1162
1:15:44.258 --> 1:15:47.096
probability that a tails?
It's always a half.
1163
1:15:47.096 --> 1:15:50.516
Every time I look at a coin
flip, when it was flipped,
1164
1:15:50.516 --> 1:15:54
there was a probability of a
half of it going either way.
1165
1:15:54 --> 1:15:56.967
That's the magic.
And, I'm not using that these
1166
1:15:56.967 --> 1:16:02.248
events are independent anyway.
For every element that I search
1167
1:16:02.248 --> 1:16:05.584
for, for every value,
x, that's another search.
1168
1:16:05.584 --> 1:16:08.123
Those events may not be
independent.
1169
1:16:08.123 --> 1:16:12.112
I can still use Boole's
inequality and conclude that all
1170
1:16:12.112 --> 1:16:15.375
of them are order log n with
high probability.
1171
1:16:15.375 --> 1:16:19.582
As long as I can prove that any
one event happens with high
1172
1:16:19.582 --> 1:16:22.556
probability.
So, I don't need independence
1173
1:16:22.556 --> 1:16:26.835
between, I knew that these coin
flips in a single search are
1174
1:16:26.835 --> 1:16:30.969
independent, but everything
else, for different searches I
1175
1:16:30.969 --> 1:16:35.803
don't care.
So, how long can this process
1176
1:16:35.803 --> 1:16:39.283
go on?
We want to know how many steps
1177
1:16:39.283 --> 1:16:44.31
can there be in this walk?
Well, when I hit the root node,
1178
1:16:44.31 --> 1:16:47.983
I'm done.
Well, how quickly would I hit
1179
1:16:47.983 --> 1:16:51.56
the root node?
Well, with probability,
1180
1:16:51.56 --> 1:16:57.069
a half, I go up each step.
The number of times I go up is,
1181
1:16:57.069 --> 1:17:02
at most, the number of levels
minus one.
1182
1:17:02 --> 1:17:05.41
And that's order log n with
high probability.
1183
1:17:05.41 --> 1:17:07.813
So, this is the only other
idea.
1184
1:17:07.813 --> 1:17:10.682
So, we are now proving this
theorem.
1185
1:17:10.682 --> 1:17:15.333
So, the number of up moves in a
search, which are really down
1186
1:17:15.333 --> 1:17:19.054
moves, but same thing,
is less than the number of
1187
1:17:19.054 --> 1:17:22
levels.
Certainly, you can't go up more
1188
1:17:22 --> 1:17:24.713
than there are levels in the
search.
1189
1:17:24.713 --> 1:17:27.968
And in insert,
you can go arbitrarily high.
1190
1:17:27.968 --> 1:17:32
But in a search, that is
as high as you can go.
1191
1:17:32 --> 1:17:34.821
And this is,
at most, c log n with high
1192
1:17:34.821 --> 1:17:37.866
probability.
This is what we proved in the
1193
1:17:37.866 --> 1:17:40.242
lemma.
So, we have a bound on the
1194
1:17:40.242 --> 1:17:42.99
number of up moves.
Half of the moves,
1195
1:17:42.99 --> 1:17:45.44
roughly, are going to be up
moves.
1196
1:17:45.44 --> 1:17:49.004
So, this pretty much pins down
the number of moves.
1197
1:17:49.004 --> 1:17:51.752
Not quite.
So, what this means is that
1198
1:17:51.752 --> 1:17:54.797
with high probability,
so this is the same
1199
1:17:54.797 --> 1:17:58.955
probability, but I could choose
that as high as I want by
1200
1:17:58.955 --> 1:18:03.553
setting c large enough.
The number of moves,
1201
1:18:03.553 --> 1:18:06.893
in other words,
the cost of the search is at
1202
1:18:06.893 --> 1:18:11.32
most the number of coin flips
until we get c log n heads,
1203
1:18:11.32 --> 1:18:15.747
right, because in every step of
the search, I make a move,
1204
1:18:15.747 --> 1:18:19.009
and then I flip another coin,
conceptually.
1205
1:18:19.009 --> 1:18:22.504
There is another independent
coin lying there.
1206
1:18:22.504 --> 1:18:27.165
And it's either heads or tails.
Each of those is independent.
1207
1:18:27.165 --> 1:18:31.902
So, how many independent coin
flips does it take until I get c
1208
1:18:31.902 --> 1:18:37.206
log n heads?
The claim is that that's order
1209
1:18:37.206 --> 1:18:42.979
log n with high probability.
But we need to prove that.
1210
1:18:42.979 --> 1:18:48.324
So, this is a claim.
So, if you just sit there with
1211
1:18:48.324 --> 1:18:55.058
a coin, and you want to know how
many times does it take until I
1212
1:18:55.058 --> 1:19:00.082
get c log n heads,
the claim is that that number
1213
1:19:00.082 --> 1:19:05
is order log n with high
probability.
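Here is a minimal sketch of that claim, not from the lecture — c = 2 and the slack factor 10 (mirroring the "ten c log n" used shortly) are arbitrary: count fair flips until c log n heads have appeared, and check the total stays O(log n).

```python
import math
import random

# The search cost is at most the number of fair-coin flips needed to
# collect c log n heads; the claim is that this is O(log n) whp.
random.seed(2)

def flips_until(heads_needed):
    """Flip a fair coin until heads_needed heads appear; return total flips."""
    flips = heads = 0
    while heads < heads_needed:
        flips += 1
        heads += random.random() < 0.5
    return flips

n = 1_000_000
target = math.ceil(2 * math.log2(n))   # c log n heads, with c = 2
total = flips_until(target)
assert total < 10 * target             # fails only with tiny probability
```

The expected number of flips is just 2 x target, so the factor-10 budget leaves enormous slack.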
1214
1:19:05 --> 1:19:08.595
As long as I prove that,
I know that the total number of
1215
1:19:08.595 --> 1:19:11.276
steps I make,
which is the number of heads
1216
1:19:11.276 --> 1:19:15.394
and tails is order log n because
I definitely know the number of
1217
1:19:15.394 --> 1:19:17.094
heads is, at most,
c log n.
1218
1:19:17.094 --> 1:19:21.147
The claim is that the number of
tails can't be too much bigger.
1219
1:19:21.147 --> 1:19:23.174
Notice, I can't just say c
here.
1220
1:19:23.174 --> 1:19:25.985
OK, it's really important that
I have log n.
1221
1:19:25.985 --> 1:19:28.208
Why?
Because with high probability,
1222
1:19:28.208 --> 1:19:32
it depends on n.
This notion depends on n.
1223
1:19:32 --> 1:19:35.434
Log n: it's true.
Anything bigger that log n:
1224
1:19:35.434 --> 1:19:38.087
it's true, like n.
If I put n here,
1225
1:19:38.087 --> 1:19:41.756
this is also true.
But, if I put a constant or a
1226
1:19:41.756 --> 1:19:46.126
log log n, this is not true.
It's really important that I
1227
1:19:46.126 --> 1:19:50.185
have log n here because my
notion of high probability
1228
1:19:50.185 --> 1:19:54.321
depends on what's written here.
OK, it's clear so far.
1229
1:19:54.321 --> 1:19:57.912
We're almost done,
which is good because I just
1230
1:19:57.912 --> 1:20:01.19
ran out of time.
Sorry, we're going to go a
1231
1:20:01.19 --> 1:20:07.528
couple minutes over.
So, I want to compute the error
1232
1:20:07.528 --> 1:20:12.308
probability here.
So, I want to compute the
1233
1:20:12.308 --> 1:20:17.886
probability that there is less
than c log n heads.
1234
1:20:17.886 --> 1:20:23.691
Let me skip this step.
So, I will be approximate and
1235
1:20:23.691 --> 1:20:29.382
say, what's the probability that
there is, at most,
1236
1:20:29.382 --> 1:20:33.923
c log n heads?
So, I need to say how many
1237
1:20:33.923 --> 1:20:37.549
coins we are flipping here for
what this event is.
1238
1:20:37.549 --> 1:20:40.139
So, I need to specify this
constant.
1239
1:20:40.139 --> 1:20:42.729
Let's say we flip ten c log n
coins.
1240
1:20:42.729 --> 1:20:47.169
Now I want to look at the error
probability under that event.
1241
1:20:47.169 --> 1:20:51.312
The probability that there is
at most c log n heads among
1242
1:20:51.312 --> 1:20:55.382
those ten c log n flips.
So, the claim is this should be
1243
1:20:55.382 --> 1:20:58.416
pretty small.
It's going to depend on ten.
1244
1:20:58.416 --> 1:21:01.672
Then I'll choose ten to be
arbitrarily large,
1245
1:21:01.672 --> 1:21:05.076
and I'll be done,
OK, make my life a little bit
1246
1:21:05.076 --> 1:21:10.054
easier.
Well, I would ask you normally,
1247
1:21:10.054 --> 1:21:15.77
but this is 6.042 material.
So, what's the probability that
1248
1:21:15.77 --> 1:21:19.021
we have, at most,
this many heads?
1249
1:21:19.021 --> 1:21:23.653
Well, that means that nine c
log n of the coins,
1250
1:21:23.653 --> 1:21:29.368
because there are ten c log n
flips, c log n heads at most,
1251
1:21:29.368 --> 1:21:34
nine c log n at least better be
tails.
1252
1:21:34 --> 1:21:37.149
So this is the probability that
all those other guys become
1253
1:21:37.149 --> 1:21:39.104
tails, which is already pretty
small.
1254
1:21:39.104 --> 1:21:41.33
And then, there is this
permutation thing.
1255
1:21:41.33 --> 1:21:44.533
So, if I had exactly c log n
heads, this would be the number
1256
1:21:44.533 --> 1:21:47.574
of ways to rearrange c log n
heads among ten c log n coin
1257
1:21:47.574 --> 1:21:49.475
flips.
OK, that's just the number of
1258
1:21:49.475 --> 1:21:51.375
permutations.
So, this is a bit big,
1259
1:21:51.375 --> 1:21:53.601
which is kind of annoying.
This is really,
1260
1:21:53.601 --> 1:21:55.665
really small.
The claim is this is much
1261
1:21:55.665 --> 1:21:58
smaller than that is big.
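That comparison can be verified exactly with integer arithmetic. A sketch, not from the lecture — the values of k below just stand in for c log n: the probability of at most k heads among 10k fair flips is bounded by (10k choose k) times 2^(-9k), and the binomial coefficient in turn by (ey/x)^x.

```python
import math
from fractions import Fraction

# Exact check of the tail estimate: among 10k fair flips,
# Pr[at most k heads] <= C(10k, k) * 2^(-9k),
# and the binomial bound C(y, x) <= (e*y/x)^x used in the next step.
for k in (5, 10, 20, 40):              # k plays the role of c log n
    flips = 10 * k
    exact_tail = Fraction(sum(math.comb(flips, i) for i in range(k + 1)),
                          2 ** flips)
    bound = Fraction(math.comb(flips, k), 2 ** (9 * k))
    assert exact_tail <= bound         # the lecture's estimate holds
    assert math.comb(flips, k) <= (math.e * flips / k) ** k
```

Both inequalities hold for every k tried, and the bound itself shrinks rapidly, which is the whole point of the argument that follows.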
1262
1:21:58 --> 1:22:14
1263
1:22:14 --> 1:22:18.548
So, this is just some math.
I'm going to whiz through it.
1264
1:22:18.548 --> 1:22:21.39
So, you don't have to stay too
long.
1265
1:22:21.39 --> 1:22:26.02
But you should go over it.
You should know that y choose x
1266
1:22:26.02 --> 1:22:30
is, at most, ey over x to the x,
good fact.
1267
1:22:30 --> 1:22:35.033
Therefore, this is,
at most, ten c log n over c log
1268
1:22:35.033 --> 1:22:38.456
n, also known as ten.
These cancel.
1269
1:22:38.456 --> 1:22:43.691
There's an e out here.
And then I raise that to the c
1270
1:22:43.691 --> 1:22:48.02
log n power.
OK, then I divide by two to the
1271
1:22:48.02 --> 1:22:51.946
power, nine c log n.
OK, so what's this?
1272
1:22:51.946 --> 1:22:57.986
This is e times ten to the c
log n divided by two to the nine
1273
1:22:57.986 --> 1:23:02.355
c log n.
OK, claim this is very big.
1274
1:23:02.355 --> 1:23:06.367
This is not so big,
because I have a nine here.
1275
1:23:06.367 --> 1:23:09.769
So, let's work it out.
This e times ten,
1276
1:23:09.769 --> 1:23:13.345
that's a good number,
we can put upstairs.
1277
1:23:13.345 --> 1:23:17.096
So, we get two to the log of
ten times e,
1278
1:23:17.096 --> 1:23:21.109
and then c log n.
And then, we have over two to
1279
1:23:21.109 --> 1:23:25.121
the nine c log n.
So, we have this two to the c
1280
1:23:25.121 --> 1:23:31.946
log n in both cases.
So, this is two to the log of
1281
1:23:31.946 --> 1:23:38.669
ten e minus nine, times
c log n: some basic algebra.
1282
1:23:38.669 --> 1:23:43.199
So, I'm going to set,
not quite.
1283
1:23:43.199 --> 1:23:49.338
This is one over two to the
nine minus log of ten e:
1284
1:23:49.338 --> 1:23:58.253
so, just inverting everything
here, negating the sign in here.
1285
1:23:58.253 --> 1:24:06
And, this is my alpha because
the rest is n.
1286
1:24:06 --> 1:24:09.903
So, this is one over n to the
alpha when alpha is this
1287
1:24:09.903 --> 1:24:13.291
particular value:
nine minus log of ten times e
1288
1:24:13.291 --> 1:24:16.09
times c.
It's a bit of a strange thing.
1289
1:24:16.09 --> 1:24:19.184
But, the point is,
as ten goes to infinity,
1290
1:24:19.184 --> 1:24:22.424
nine here is the number one
smaller than ten,
1291
1:24:22.424 --> 1:24:24.855
right?
We subtracted one somewhere
1292
1:24:24.855 --> 1:24:27.949
along the way.
So, as ten goes to infinity,
1293
1:24:27.949 --> 1:24:32
this is basically,
this is ten minus one.
1294
1:24:32 --> 1:24:35.1
This is log of ten times e.
e doesn't really matter.
1295
1:24:35.1 --> 1:24:37.531
The point is,
this is logarithmic in ten.
1296
1:24:37.531 --> 1:24:40.692
This is linear in ten.
The thing that's linear in ten
1297
1:24:40.692 --> 1:24:44.035
is much bigger than the thing
that's logarithmic in ten.
1298
1:24:44.035 --> 1:24:45.919
This is called abusive
notation.
1299
1:24:45.919 --> 1:24:48.958
OK, as ten goes to infinity,
this goes to infinity,
1300
1:24:48.958 --> 1:24:51.329
gets bigger.
And, there is a c out here.
1301
1:24:51.329 --> 1:24:54.794
But, for any value of c that
you want, whatever value of c
1302
1:24:54.794 --> 1:24:58.015
you wanted in that claim,
I can make alpha arbitrarily
1303
1:24:58.015 --> 1:25:00.629
large by changing the constant
in the big O,
1304
1:25:00.629 --> 1:25:04.812
which here was ten.
OK, so that claim is true with
1305
1:25:04.812 --> 1:25:07.652
high probability.
Whatever probability you want,
1306
1:25:07.652 --> 1:25:10.673
which tells you alpha,
you set the constant in front of
1307
1:25:10.673 --> 1:25:13.089
the log n to be this number,
which grows,
1308
1:25:13.089 --> 1:25:15.929
and you're done.
You get the claim that it's order
1309
1:25:15.929 --> 1:25:19.312
log n heads in order log n flips
with high probability,
1310
1:25:19.312 --> 1:25:21.548
therefore.
The number of steps in the
1311
1:25:21.548 --> 1:25:24.146
search is order log n with high
probability.
1312
1:25:24.146 --> 1:25:26.14
Really cool stuff;
read the notes.
1313
1:25:26.14 --> 1:25:29
Sorry I went so fast at the
end.
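The whole theorem can also be watched end to end on a toy skip list. This is a sketch, not the lecture's code — n, the seed, and the slack constant 20 are arbitrary: build the level lists from coin-flip heights, count the moves each search makes, and check every search stays within O(log n).

```python
import bisect
import math
import random

# Toy skip list under the lecture's model: each key's height is
# 1 + number of consecutive heads; level lists hold keys of height > level.
random.seed(3)

def build(keys):
    heights = {}
    for k in keys:
        h = 1
        while random.random() < 0.5:
            h += 1
        heights[k] = h
    top = max(heights.values())
    return [[k for k in sorted(keys) if heights[k] > lvl]
            for lvl in range(top)]

def search_cost(levels, target):
    """Count moves: walk right while the next key <= target, then drop down."""
    steps = 0
    cur = None                          # None plays -infinity
    for lvl in range(len(levels) - 1, -1, -1):
        lst = levels[lvl]
        i = -1 if cur is None else bisect.bisect_left(lst, cur)
        while i + 1 < len(lst) and lst[i + 1] <= target:
            i += 1
            steps += 1                  # a right move
        if i >= 0:
            cur = lst[i]
        steps += 1                      # the down (or final) move
    return steps

n = 2000
levels = build(range(n))
costs = [search_cost(levels, k) for k in range(n)]
assert max(costs) < 20 * math.log2(n)   # every search, O(log n)
```

Note the assertion is over all n searches at once — exactly the "every search" form of the theorem, with a generous constant so the whp event essentially always holds.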