1 00:00:07 --> 00:00:10 Good morning. Today we're going to talk about 2 00:00:10 --> 00:00:14 a balanced search structure, so a data structure that 3 00:00:14 --> 00:00:18 maintains a dynamic set subject to insertion, 4 00:00:18 --> 00:00:21 deletion, and search called skip lists. 5 00:00:21 --> 00:00:25 So, I'll call this a dynamic search structure because it's a 6 00:00:25 --> 00:00:28 data structure. It supports search, 7 00:00:28 --> 00:00:33 and it's dynamic, meaning insert and delete. 8 00:00:33 --> 00:00:39 So, what other dynamic search structures do we know, 9 00:00:39 --> 00:00:45 just for sake of comparison, and to wake everyone up? 10 00:00:45 --> 00:00:50 Shout them out. Efficient, I should say, 11 00:00:50 --> 00:00:55 also good, logarithmic time per operation. 12 00:00:55 --> 00:01:01 So, this is a really easy question to get us off the 13 00:01:01 --> 00:01:05 ground. You've seen them all in the 14 00:01:05 --> 00:01:08 last week, so it shouldn't be so hard. 15 00:01:08 --> 00:01:11 Treap, good. On the problem set we saw 16 00:01:11 --> 00:01:13 treaps. That's, in some sense, 17 00:01:13 --> 00:01:17 the simplest dynamic search structure you can get from first 18 00:01:17 --> 00:01:21 principles because all we needed was a bound on a randomly 19 00:01:21 --> 00:01:26 constructed binary search tree. And then treaps did well. 20 00:01:26 --> 00:01:30 So, that was sort of the first one you saw depending on when 21 00:01:30 --> 00:01:34 you did your problem set. What else? 22 00:01:34 --> 00:01:36 Charles? Red black trees, 23 00:01:36 --> 00:01:40 good answer. So, that was exactly one week 24 00:01:40 --> 00:01:44 ago. I hope you still remember it. 25 00:01:44 --> 00:01:48 They have guaranteed log n performance. 26 00:01:48 --> 00:01:55 So, this was an expected bound. This was a worst-case order log 27 00:01:55 --> 00:01:58 n per operation, insert, delete, 28 00:01:58 --> 00:02:02 and search. 
And, there was one more for 29 00:02:02 --> 00:02:07 those who went to recitation on Friday: B trees, 30 00:02:07 --> 00:02:10 good. And, by B trees, 31 00:02:10 --> 00:02:14 I also include two-three trees, two-three-four trees, 32 00:02:14 --> 00:02:16 and all those guys. So, if B is a constant, 33 00:02:16 --> 00:02:19 or if you choose your B a little bit cleverly, 34 00:02:19 --> 00:02:22 then these have guaranteed order log n performance, 35 00:02:22 --> 00:02:24 so, worst case, order log n. 36 00:02:24 --> 00:02:27 So, you should know this. These are all balanced search 37 00:02:27 --> 00:02:29 structures. They are dynamic. 38 00:02:29 --> 00:02:31 They support insertions and deletions. 39 00:02:31 --> 00:02:34 They support searches, finding a given key. 40 00:02:34 --> 00:02:37 And if you don't find the key, you find its predecessor and 41 00:02:37 --> 00:02:42 successor pretty easily in all of these structures. 42 00:02:42 --> 00:02:44 If you want to augment some data structure, 43 00:02:44 --> 00:02:48 you should think about which one of these is easiest to 44 00:02:48 --> 00:02:53 augment, as in Monday's lecture. So, the question I want to pose 45 00:02:53 --> 00:02:56 to you is: suppose I gave you all a laptop right now, 46 00:02:56 --> 00:02:59 which would be great. Then I asked you, 47 00:02:59 --> 00:03:03 in order to keep this laptop you have to implement one of 48 00:03:03 --> 00:03:06 these data structures, let's say, within this class 49 00:03:06 --> 00:03:09 hour. Do you think you could do it? 50 00:03:09 --> 00:03:12 How many people think you could do it? 51 00:03:12 --> 00:03:13 A couple people, a few people, 52 00:03:13 --> 00:03:15 OK, all front row people, good. 53 00:03:15 --> 00:03:19 I could probably do it. My preference would be B trees. 54 00:03:19 --> 00:03:21 They're sort of the simplest in my mind. 55 00:03:21 --> 00:03:23 This is without using the textbook. 56 00:03:23 --> 00:03:25 This would be a closed book exam.
57 00:03:25 --> 00:03:30 I don't have enough laptops to do it, unfortunately. 58 00:03:30 --> 00:03:32 So, B trees are pretty reasonable. 59 00:03:32 --> 00:03:35 Deletion, you have to remember stealing from a sibling and 60 00:03:35 --> 00:03:37 whatnot. So, deletions are a bit tricky. 61 00:03:37 --> 00:03:40 Red black trees, I can never remember it. 62 00:03:40 --> 00:03:43 I'd have to look it up, or re-derive the three cases. 63 00:03:43 --> 00:03:46 treaps are a bit fancy. So, that would take a little 64 00:03:46 --> 00:03:49 while to remember exactly how those work. 65 00:03:49 --> 00:03:51 You'd have to solve your problem set again, 66 00:03:51 --> 00:03:55 if you don't have it memorized. Skip lists, on the other hand, 67 00:03:55 --> 00:03:57 are a data structure you will never forget, 68 00:03:57 --> 00:04:00 and something you can implement within an hour, 69 00:04:00 --> 00:04:03 no problem. I've made this claim a couple 70 00:04:03 --> 00:04:05 times before, and I always felt bad because I 71 00:04:05 --> 00:04:10 had never actually done it. So, this morning, 72 00:04:10 --> 00:04:13 I implemented skip lists, and it took me ten minutes to 73 00:04:13 --> 00:04:17 implement a linked list, and 30 minutes to implement 74 00:04:17 --> 00:04:19 skip lists. And another 30 minutes 75 00:04:19 --> 00:04:21 debugging them. There you go. 76 00:04:21 --> 00:04:24 It can be done. Skip lists are really simple. 77 00:04:24 --> 00:04:27 And, at no point writing the code did I have to think, 78 00:04:27 --> 00:04:32 whereas every other structure I would have to think. 79 00:04:32 --> 00:04:36 There was one moment when I thought, ah, how do I flip a 80 00:04:36 --> 00:04:38 coin? That was the entire amount of 81 00:04:38 --> 00:04:41 thinking. So, skip lists are a randomized 82 00:04:41 --> 00:04:44 structure. Let's add in another adjective 83 00:04:44 --> 00:04:46 here, and let's also add in simple. 
84 00:04:46 --> 00:04:49 So, we have a simple, efficient, dynamic, 85 00:04:49 --> 00:04:53 randomized search structure: all those things together. 86 00:04:53 --> 00:04:57 So, it's sort of like treaps in that the bound is only a 87 00:04:57 --> 00:05:01 randomized bound. But today, we're going to see a 88 00:05:01 --> 00:05:06 much stronger bound than an expectation bound. 89 00:05:06 --> 00:05:11 So, in particular, skip lists will run in order 90 00:05:11 --> 00:05:17 log n expected time. So, the running time for each 91 00:05:17 --> 00:05:22 operation will be order log n in expectation. 92 00:05:22 --> 00:05:28 But, we're going to prove a much stronger result: that they're 93 00:05:28 --> 00:05:34 order log n, with high probability. 94 00:05:34 --> 00:05:37 So, this is a very strong claim. 95 00:05:37 --> 00:05:42 And it means that the running time of each operation, 96 00:05:42 --> 00:05:48 the running time of every operation is order log n almost 97 00:05:48 --> 00:05:54 always in a certain sense. Why don't I foreshadow that? 98 00:05:54 --> 00:05:59 So, it's something like, the probability that it's order 99 00:05:59 --> 00:06:05 log n is at least one minus one over some polynomial 100 00:06:05 --> 00:06:08 in n. And, you get to set the 101 00:06:08 --> 00:06:10 polynomial however large you like. 102 00:06:10 --> 00:06:13 So, what this basically means is that almost all the time, 103 00:06:13 --> 00:06:16 you take your skip list, you do a polynomial number of 104 00:06:16 --> 00:06:18 operations on it, because presumably you are 105 00:06:18 --> 00:06:21 running a polynomial time algorithm that's using this data 106 00:06:21 --> 00:06:23 structure. Do a polynomial number of 107 00:06:23 --> 00:06:26 inserts, deletes, searches, every single one of them will 108 00:06:26 --> 00:06:30 take order log n time, almost guaranteed. 109 00:06:30 --> 00:06:33 So this is a really strong bound on the tail of the 110 00:06:33 --> 00:06:36 distribution. The mean is order log n.
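The foreshadowed claim can be written down roughly as follows. This is only a sketch: T stands for the running time of a single operation, and the symbols c and \alpha are introduced here for illustration, not taken from the lecture.

```latex
\Pr\bigl[\, T \le c \log n \,\bigr] \;\ge\; 1 - \frac{1}{n^{\alpha}}
```

Here n^{\alpha} plays the role of "some polynomial in n". Choosing a larger \alpha makes the failure probability smaller, typically at the price of a larger constant c hidden in the order log n.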
111 00:06:36 --> 00:06:39 That's not so exciting. But, in fact, 112 00:06:39 --> 00:06:43 almost all of the weight of this probability distribution is 113 00:06:43 --> 00:06:47 right around the log n, just tiny little epsilons, 114 00:06:47 --> 00:06:51 very tiny probabilities you could be bigger than log n. 115 00:06:51 --> 00:06:55 So that's where we are going. This is a data structure by 116 00:06:55 --> 00:07:00 Pugh in 1989. This is the most recent. 117 00:07:00 --> 00:07:03 Actually, no, sorry, treaps are more recent. 118 00:07:03 --> 00:07:06 They were like '93 or so, but a fairly recent data 119 00:07:06 --> 00:07:09 structure for just insert, delete, search. 120 00:07:09 --> 00:07:13 And, it's very simple. You can derive it if you don't 121 00:07:13 --> 00:07:16 know anything about data structures, well, 122 00:07:16 --> 00:07:19 almost nothing. Now, analyzing that the 123 00:07:19 --> 00:07:21 performance is log n, that, of course, 124 00:07:21 --> 00:07:25 takes some sophistication. But the data structure itself 125 00:07:25 --> 00:07:30 is very simple. We're going to start from 126 00:07:30 --> 00:07:34 scratch. Suppose you don't know what a 127 00:07:34 --> 00:07:38 red black tree is. You don't know what a B tree 128 00:07:38 --> 00:07:41 is. Suppose you don't even know 129 00:07:41 --> 00:07:45 what a tree is. What is the simplest data 130 00:07:45 --> 00:07:51 structure for storing a bunch of items, for storing a dynamic set? 131 00:07:51 --> 00:07:54 A list, good, a linked list. 132 00:07:54 --> 00:07:58 Now, suppose that it's a sorted linked list. 133 00:07:58 --> 00:08:05 So, I'm going to be a little bit fancier there. 134 00:08:05 --> 00:08:10 So, if you have a linked list of items, here it is, 135 00:08:10 --> 00:08:16 maybe we'll make it doubly linked just for kicks, 136 00:08:16 --> 00:08:22 how long does it take to search in a sorted linked list? 137 00:08:22 --> 00:08:26 Log n is one answer. n is the other answer.
138 00:08:26 --> 00:08:31 Which one is right? n is the right answer. 139 00:08:31 --> 00:08:35 So, even though it's sorted, we can't do binary search 140 00:08:35 --> 00:08:38 because we don't have random access into a linked 141 00:08:38 --> 00:08:40 list. So, suppose I'm only given a 142 00:08:40 --> 00:08:44 pointer to the head. Otherwise, I'm assuming it's an 143 00:08:44 --> 00:08:46 array. So, in a sorted array you can 144 00:08:46 --> 00:08:48 search in log n. Sorted linked list: 145 00:08:48 --> 00:08:51 you've still got to scan through the darn thing. 146 00:08:51 --> 00:08:53 So, theta n, worst case search. 147 00:08:53 --> 00:08:56 Not so good, but if we just try to improve 148 00:08:56 --> 00:08:59 it a little bit, we will discover skip lists 149 00:08:59 --> 00:09:03 automatically. So, this is our starting point: 150 00:09:03 --> 00:09:06 sorted linked lists, theta n time. 151 00:09:06 --> 00:09:09 And, I'm not going to think too much about insertions and 152 00:09:09 --> 00:09:12 deletions for the moment. Let's just get search better, 153 00:09:12 --> 00:09:15 and then we'll worry about updates. 154 00:09:15 --> 00:09:17 Updates are where randomization will come in. 155 00:09:17 --> 00:09:21 Search: pretty easy idea. So, how can we make a linked 156 00:09:21 --> 00:09:23 list better? Suppose all we know about is 157 00:09:23 --> 00:09:26 linked lists. What can I do to make it 158 00:09:26 --> 00:09:28 faster? This is where you need a little 159 00:09:28 --> 00:09:32 bit of innovation, some creativity. 160 00:09:32 --> 00:09:37 More links: that's a good idea. So, I could try to maybe add 161 00:09:37 --> 00:09:40 pointers to go a couple steps ahead. 162 00:09:40 --> 00:09:45 If I had log n pointers, I could do all powers of two 163 00:09:45 --> 00:09:48 ahead. That's a pretty good search 164 00:09:48 --> 00:09:51 structure. Some people use that; 165 00:09:51 --> 00:09:56 like, some peer-to-peer networks use that idea.
166 00:09:56 --> 00:10:01 But that's a little too fancy for me. 167 00:10:01 --> 00:10:03 Ah, good. You could try to build a tree 168 00:10:03 --> 00:10:07 on this linear structure. That's essentially where we're 169 00:10:07 --> 00:10:09 going. So, you could try to put 170 00:10:09 --> 00:10:12 pointers to, like, the middle of the list from the 171 00:10:12 --> 00:10:14 root. So, you search between either 172 00:10:14 --> 00:10:16 here. You point to the median, 173 00:10:16 --> 00:10:20 so you can compare against the median, and know whether you 174 00:10:20 --> 00:10:23 should go in the first half or the second half. That's 175 00:10:23 --> 00:10:27 definitely on the right track, but also a bit too sophisticated. 176 00:10:27 --> 00:10:29 Another list: yes. 177 00:10:29 --> 00:10:32 Yes, good. So, we are going to use two 178 00:10:32 --> 00:10:34 lists. That's sort of the next 179 00:10:34 --> 00:10:38 simplest thing you could do. OK, and as you suggested, 180 00:10:38 --> 00:10:41 we could maybe have pointers between them. 181 00:10:41 --> 00:10:46 So, maybe we have some elements down here, some of the elements 182 00:10:46 --> 00:10:48 up here. We want to have pointers 183 00:10:48 --> 00:10:51 between the lists. OK, it gets a little bit crazy 184 00:10:51 --> 00:10:54 in how exactly you might do that. 185 00:10:54 --> 00:10:56 But somehow, this feels good. 186 00:10:56 --> 00:10:58 So this is one linked list: L_1. 187 00:10:58 --> 00:11:02 This is another linked list: L_2. 188 00:11:02 --> 00:11:12 And, to give you some inspiration, I want to give you, 189 00:11:12 --> 00:11:19 so let's play a game. The game is, 190 00:11:19 --> 00:11:29 what is this sequence? So, the sequence is 14. 191 00:11:29 --> 00:11:38 If you know the answer, shout it out. 192 00:11:38 --> 00:11:42 Anyone yet? OK, it's tricky. 193 00:11:42 --> 00:11:54 194 00:11:54 --> 00:11:58 It's a bit of a small class, so I hope someone knows the 195 00:11:58 --> 00:11:59 answer.
196 00:11:59 --> 00:12:10 197 00:12:10 --> 00:12:14 How many TA's know the answer? Just a couple, 198 00:12:14 --> 00:12:19 OK, if you're looking at the slides, probably you know the 199 00:12:19 --> 00:12:21 answer. That's cheating. 200 00:12:21 --> 00:12:26 OK, I'll give you a hint. It is not a mathematical 201 00:12:26 --> 00:12:29 sequence. This is a real-life sequence. 202 00:12:29 --> 00:12:32 Yeah? Yeah, and what city? 203 00:12:32 --> 00:12:36 New York, yeah, this is the 7th Ave line. 204 00:12:36 --> 00:12:40 This is my favorite subway line in New York. 205 00:12:40 --> 00:12:46 But, what's a cool feature of the New York City subway? 206 00:12:46 --> 00:12:49 OK, it's a skip list. Good answer. 207 00:12:49 --> 00:12:54 [LAUGHTER] Indeed it is. Skip lists are so practical. 208 00:12:54 --> 00:13:00 They've been implemented in the subway system. 209 00:13:00 --> 00:13:03 How cool is that? OK, Boston subway is pretty 210 00:13:03 --> 00:13:08 cool because it's the oldest subway definitely in the United 211 00:13:08 --> 00:13:11 States, maybe in the world. New York is close, 212 00:13:11 --> 00:13:16 and it has other nice features like it's open 24 hours. 213 00:13:16 --> 00:13:20 That's a definite plus, but it also has this feature of 214 00:13:20 --> 00:13:23 express lines. So, it's a bit of an 215 00:13:23 --> 00:13:26 abstraction, but the 7th Ave line has 216 00:13:26 --> 00:13:29 essentially two kinds of cars. These are street numbers by the 217 00:13:29 --> 00:13:31 way. This is, Penn Station, 218 00:13:31 --> 00:13:33 Times Square, and so on. 219 00:13:33 --> 00:13:36 So, there are essentially two lines. 220 00:13:36 --> 00:13:39 There's the express line which goes 14, to 34, 221 00:13:39 --> 00:13:41 to 42, to 72, to 96. 222 00:13:41 --> 00:13:45 And then, there's the local line which stops at every stop. 223 00:13:45 --> 00:13:49 And, they accomplish this with four sets of tracks. 
224 00:13:49 --> 00:13:54 So, I mean, the express lines have their own dedicated track. 225 00:13:54 --> 00:13:57 If you want to go to stop 59 from, let's say, 226 00:13:57 --> 00:14:00 Penn Station, well, let's say from lower west 227 00:14:00 --> 00:14:05 side, you get on the express line. 228 00:14:05 --> 00:14:10 You jump to 42 pretty quickly, and then you switch over to the 229 00:14:10 --> 00:14:16 local line, and go on to 59 or wherever I said I was going. 230 00:14:16 --> 00:14:21 OK, so this is express and local lines, and we can 231 00:14:21 --> 00:14:25 represent that with a couple of lists. 232 00:14:25 --> 00:14:29 We have one list, sure, we have one list on the 233 00:14:29 --> 00:14:34 bottom, so leave some space up here. 234 00:14:34 --> 00:14:48 This is the local line, L_2, 34, 42, 235 00:14:48 --> 00:15:02 50, 59, 66, 72, 79, and so on. 236 00:15:02 --> 00:15:08 And then we have the express line on top, which only stops at 237 00:15:08 --> 00:15:11 14, 34, 42, 72, and so on. 238 00:15:11 --> 00:15:16 I'm not going to redraw the whole list. 239 00:15:16 --> 00:15:21 You get the idea. And so, what we're going to do 240 00:15:21 --> 00:15:27 is put links between the local and express lines, 241 00:15:27 --> 00:15:34 wherever they happen to meet. And, that's our two linked list 242 00:15:34 --> 00:15:38 structure. So, that's what I actually 243 00:15:38 --> 00:15:42 meant when I was trying to draw some picture. 244 00:15:42 --> 00:15:47 Now, this has a property that in one list, the bottom list, 245 00:15:47 --> 00:15:52 every element occurs. And the top list just copies 246 00:15:52 --> 00:15:56 some of those elements. And we're going to preserve 247 00:15:56 --> 00:16:00 that property. So, L_2 stores all the 248 00:16:00 --> 00:16:05 elements, and L_1 stores some subset. 249 00:16:05 --> 00:16:10 And, it's still open which ones we should store. 250 00:16:10 --> 00:16:16 That's the one thing we need to think about.
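The two-list structure described above can be sketched in a few lines of Python. The names `Node`, `build_list`, and `build_two_lists` are made up for this illustration, and the keys are only the stops actually named in the transcript, so the local line here is not the full 7th Ave line.

```python
class Node:
    """One stop in a sorted linked list; `down` points to the equal key one list below."""
    def __init__(self, key):
        self.key = key
        self.next = None
        self.down = None


def build_list(keys):
    """Link sorted keys left to right; return (head, {key: node})."""
    nodes = [Node(k) for k in keys]
    for left, right in zip(nodes, nodes[1:]):
        left.next = right
    head = nodes[0] if nodes else None
    return head, {node.key: node for node in nodes}


def build_two_lists(local_keys, express_keys):
    """L_2 stores every key, L_1 stores a subset; equal keys are linked L_1 -> L_2."""
    l2_head, l2_nodes = build_list(sorted(local_keys))
    l1_head, l1_nodes = build_list(sorted(express_keys))
    for key, node in l1_nodes.items():
        node.down = l2_nodes[key]  # raises KeyError if L_1 is not a subset of L_2
    return l1_head, l2_head


local = [14, 34, 42, 50, 59, 66, 72, 79]   # local line (stops named in the transcript)
express = [14, 34, 42, 72]                  # express line
l1, l2 = build_two_lists(local, express)
```

The subset property is enforced by the dictionary lookup: an express stop with no matching local stop fails loudly rather than leaving a dangling link.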
251 00:16:16 --> 00:16:23 But, our inspiration is from the New York subway system. 252 00:16:23 --> 00:16:30 OK, that's the idea. Of course, we're also going to 253 00:16:30 --> 00:16:36 use more than two lists. OK, we also have links. 254 00:16:36 --> 00:16:44 Let's say links between equal keys in L_1 and L_2. 255 00:16:44 --> 00:16:46 Good. So, just for the sake of 256 00:16:46 --> 00:16:50 completeness, and because we will need this 257 00:16:50 --> 00:16:55 later, let's talk about searches before we worry about how these 258 00:16:55 --> 00:17:00 lists are actually constructed. Of course, if I wanted that 259 00:17:00 --> 00:17:04 board. So, if you want to search for 260 00:17:04 --> 00:17:06 an element, x, what do you do? 261 00:17:06 --> 00:17:09 Well, this is the taking-the-subway algorithm. 262 00:17:09 --> 00:17:14 And, suppose you always start in the upper left corner of the 263 00:17:14 --> 00:17:17 subway system, so you're always in the lower 264 00:17:17 --> 00:17:21 west side, 14th St, and I don't know exactly where 265 00:17:21 --> 00:17:25 that is, but more or less, somewhere down at the bottom of 266 00:17:25 --> 00:17:27 Manhattan. And, you want to go to a 267 00:17:27 --> 00:17:33 particular station like 59. Well, you'd stay on the express 268 00:17:33 --> 00:17:37 line as long as you can because it happens that we started on 269 00:17:37 --> 00:17:39 the express line. And then, you go down. 270 00:17:39 --> 00:17:43 And then you take the local line the rest of the way. 271 00:17:43 --> 00:17:47 That's clearly the right thing to do if you always start in the 272 00:17:47 --> 00:17:50 top left corner. So, I'm going to write that 273 00:17:50 --> 00:17:54 down in some kind of an algorithm because we will be 274 00:17:54 --> 00:17:56 generalizing it. It's pretty obvious at this 275 00:17:56 --> 00:18:00 point. It will remain obvious.
276 00:18:00 --> 00:18:06 So, I want to walk right in the top list until that would go too 277 00:18:06 --> 00:18:09 far. So, you imagine giving someone 278 00:18:09 --> 00:18:14 directions on the subway system they've never been on. 279 00:18:14 --> 00:18:17 So, you say, OK, you start at 14th. 280 00:18:17 --> 00:18:22 Take the express line, and when you get to 72nd, 281 00:18:22 --> 00:18:25 you've gone too far. Go back one, 282 00:18:25 --> 00:18:30 and then go down to the local line. 283 00:18:30 --> 00:18:32 It's really annoying directions. 284 00:18:32 --> 00:18:37 But this is what an algorithm has to do because it's never 285 00:18:37 --> 00:18:41 taken the subway before. So, it's going to check, 286 00:18:41 --> 00:18:45 so let's do it here. So, suppose I'm aiming for 59. 287 00:18:45 --> 00:18:49 So, I start at 14, and the first thing I do is go 288 00:18:49 --> 00:18:51 to 34. Then from there, 289 00:18:51 --> 00:18:54 I go to 42. Still good because 59 is bigger 290 00:18:54 --> 00:18:56 than 42. I go right again. 291 00:18:56 --> 00:18:59 I say, oops, 72 is too big. 292 00:18:59 --> 00:19:04 That was too far. So, I go back to where I just 293 00:19:04 --> 00:19:07 was. Then I go down and then I keep 294 00:19:07 --> 00:19:12 going right until I find the element that I want, 295 00:19:12 --> 00:19:17 or discover that it's not in the bottom list because the bottom 296 00:19:17 --> 00:19:21 list has everyone. So, that's the algorithm. 297 00:19:21 --> 00:19:27 Stop when going right would go too far, and you discover that 298 00:19:27 --> 00:19:31 with a comparison. Then you walk down to L_2. 299 00:19:31 --> 00:19:35 And then you walk right in L_2 until you find x, 300 00:19:35 --> 00:19:40 or you find something greater than x, in which case x is 301 00:19:40 --> 00:19:46 definitely not on your list. And you found the predecessor 302 00:19:46 --> 00:19:49 and successor, which may be your goal.
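The walk-right, walk-down, walk-right search can be sketched as below. This is an illustrative version, not code from the lecture: `Node`, `make_list`, and `search` are hypothetical names, the keys are the stops named in the transcript, and the code peeks at the next key instead of literally stepping to 72 and back. It also assumes the smallest key (14) is in both lists and that x is at least that key.

```python
class Node:
    def __init__(self, key, next=None, down=None):
        self.key, self.next, self.down = key, next, down


def make_list(keys, below=None):
    """Build a sorted linked list; if `below` maps key -> node, set the down links."""
    head, index = None, {}
    for k in reversed(keys):
        head = Node(k, next=head, down=below[k] if below else None)
        index[k] = head
    return head, index


def search(l1_head, x):
    """Return (path, found), where path lists the (list, key) pairs visited."""
    node = l1_head
    path = [("L1", node.key)]
    # Walk right in L_1 until going right would go too far.
    while node.next is not None and node.next.key <= x:
        node = node.next
        path.append(("L1", node.key))
    # Walk down to L_2.
    node = node.down
    path.append(("L2", node.key))
    # Walk right in L_2 until we find x or the next key is greater than x.
    while node.next is not None and node.next.key <= x:
        node = node.next
        path.append(("L2", node.key))
    return path, node.key == x


l2_head, l2_index = make_list([14, 34, 42, 50, 59, 66, 72, 79])
l1_head, _ = make_list([14, 34, 42, 72], below=l2_index)
path, found = search(l1_head, 59)
print(path, found)
```

For 59 the path is 14, 34, 42 on the express line, then 42, 50, 59 on the local line. When x is absent, the last node on the path is its predecessor and that node's `next` is its successor, which is the position an insertion would use.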
303 00:19:49 --> 00:19:52 If you didn't find where x was, you should find where it would 304 00:19:52 --> 00:19:55 go if it were there, because then maybe you could 305 00:19:55 --> 00:19:58 insert there. We're going to use this 306 00:19:58 --> 00:20:00 algorithm in insertion. OK, so that's search: 307 00:20:00 --> 00:20:05 pretty easy at this point. Now, what we haven't discussed 308 00:20:05 --> 00:20:08 is how fast the search algorithm is, and it depends, 309 00:20:08 --> 00:20:12 of course, on which elements we're going to store in L_1, 310 00:20:12 --> 00:20:14 which subset of elements should go in L_1. 311 00:20:14 --> 00:20:18 Now, in the subway system, you probably put all the 312 00:20:18 --> 00:20:21 popular stations in L_1. But here, we want worst-case 313 00:20:21 --> 00:20:24 performance. So, we don't have some 314 00:20:24 --> 00:20:26 probability distribution on the nodes. 315 00:20:26 --> 00:20:30 We'd just like every node to be accessed sort of as quickly as 316 00:20:30 --> 00:20:35 possible, uniformly. So, we want to minimize the 317 00:20:35 --> 00:20:39 maximum time over all queries. So, any ideas what we should do 318 00:20:39 --> 00:20:42 with L_1? Should I put all the nodes of 319 00:20:42 --> 00:20:46 L_1 in the beginning? OK, it's a strict subset. 320 00:20:46 --> 00:20:49 Suppose I told you what the size of L_1 was. 321 00:20:49 --> 00:20:53 I can tell you, I could afford to build this 322 00:20:53 --> 00:20:56 many express stops. How should you distribute them 323 00:20:56 --> 00:21:02 among the elements of L_2? Uniformly, good. 324 00:21:02 --> 00:21:08 So, what nodes, sorry, what keys, 325 00:21:08 --> 00:21:17 let's say, go in L_1? Well, definitely the best thing 326 00:21:17 --> 00:21:24 to do is to spread them out uniformly, OK, 327 00:21:24 --> 00:21:35 which is definitely not what the 7th Ave line looks like. 328 00:21:35 --> 00:21:39 But, let's imagine that we could reengineer everything.
329 00:21:39 --> 00:21:45 So, we're going to try to space these things out a little bit 330 00:21:45 --> 00:21:47 more. So, 34 and 42nd are way too 331 00:21:47 --> 00:21:50 close. We'll take a few more stops. 332 00:21:50 --> 00:21:54 And, now we can start to analyze things. 333 00:21:54 --> 00:21:57 OK, as a function of the length of L_1. 334 00:21:57 --> 00:22:03 So, the cost of a search is now roughly, so, I want a function 335 00:22:03 --> 00:22:07 of the length of L_1, and the length of L_2, 336 00:22:07 --> 00:22:11 which is all the elements, n. 337 00:22:11 --> 00:22:18 What is the cost of the search if I spread out all the elements 338 00:22:18 --> 00:22:20 in L_1 uniformly? Yeah? 339 00:22:20 --> 00:22:26 Right, the total number of elements in the top list, 340 00:22:26 --> 00:22:33 plus the division between the bottom and the top. 341 00:22:33 --> 00:22:36 So, I'll write the length of L_1 plus the length of L_2 342 00:22:36 --> 00:22:39 divided by the length of L_1. OK, this is roughly, 343 00:22:39 --> 00:22:42 I mean, there's maybe a plus one or so here because in the 344 00:22:42 --> 00:22:46 worst case, I have to search through all of L_1 because the 345 00:22:46 --> 00:22:49 station I could be looking for could be the max. 346 00:22:49 --> 00:22:52 OK, and maybe I'm not lucky, and the max is not on the 347 00:22:52 --> 00:22:54 express line. So then, I have to go down to 348 00:22:54 --> 00:22:57 the local line. And how many stops will I have 349 00:22:57 --> 00:23:01 to go on the local line? Well, L_1 just evenly 350 00:23:01 --> 00:23:04 partitions L_2. So this is the number of 351 00:23:04 --> 00:23:08 consecutive stations between two express stops. 352 00:23:08 --> 00:23:12 So, I take the express, possibly this long, 353 00:23:12 --> 00:23:15 but I take the local possibly this long. 354 00:23:15 --> 00:23:18 And, this is in L_2. And there is, 355 00:23:18 --> 00:23:20 plus, a constant, for example, 356 00:23:20 --> 00:23:24 for walking down. 
But that's basically the number 357 00:23:24 --> 00:23:28 of nodes that I visit. So, I'd like to minimize this 358 00:23:28 --> 00:23:36 function. Now, L_2, I'm going to call 359 00:23:36 --> 00:23:47 that n because that's the total number of elements. 360 00:23:47 --> 00:23:55 L_1, I can choose to be whatever I want. 361 00:23:55 --> 00:24:03 So, let's go over here. So, I want to minimize L_1 plus 362 00:24:03 --> 00:24:07 n over L_1. And I get to choose L_1. 363 00:24:07 --> 00:24:11 Now, I could differentiate this, set it to zero, 364 00:24:11 --> 00:24:15 and go crazy. Or, I could realize that, 365 00:24:15 --> 00:24:19 I mean, that's not hard. But, that's a little bit too 366 00:24:19 --> 00:24:22 fancy for me. So, I could say, 367 00:24:22 --> 00:24:26 well, this is clearly best when L_1 is small. 368 00:24:26 --> 00:24:32 And this is clearly best when L_1 is large. 369 00:24:32 --> 00:24:37 So, there's a trade-off there. And, the trade-off will be 370 00:24:37 --> 00:24:44 roughly minimized up to constant factors when these two terms are 371 00:24:44 --> 00:24:48 equal. That's when I have pretty good 372 00:24:48 --> 00:24:53 balance between the two ends of the trade-off. 373 00:24:53 --> 00:24:56 So, this is up to constant factors. 374 00:24:56 --> 00:25:03 I can let L_1 equal n over L_1, OK, because at most I'm losing 375 00:25:03 --> 00:25:10 a factor of two there when they happen to be equal. 376 00:25:10 --> 00:25:14 So now, I just solve this. This is really easy. 377 00:25:14 --> 00:25:18 This is (L_1)^2 equals n. So, L_1 is the square root of 378 00:25:18 --> 00:25:20 n. OK, so the cost that I'm 379 00:25:20 --> 00:25:24 getting over here, L_1 plus L_2 over L_1 is the 380 00:25:24 --> 00:25:28 square root of n plus n over root n, which is, 381 00:25:28 --> 00:25:32 again, root n. So, I get two root n. 
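Written out, the balancing argument above looks like this. It is a restatement of the lecture's steps in LaTeX, using |L_1| for the length of the top list and n for the length of L_2, and ignoring additive constants.

```latex
\begin{aligned}
\text{search cost} &\approx |L_1| + \frac{|L_2|}{|L_1|} = |L_1| + \frac{n}{|L_1|} \\
|L_1| = \frac{n}{|L_1|} \;&\Longrightarrow\; |L_1|^2 = n \;\Longrightarrow\; |L_1| = \sqrt{n} \\
\text{search cost} &\approx \sqrt{n} + \frac{n}{\sqrt{n}} = 2\sqrt{n}
\end{aligned}
```

The differentiation route that the lecture skips gives the same point: setting the derivative 1 - n/|L_1|^2 to zero also yields |L_1| = \sqrt{n}.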
382 00:25:32 --> 00:25:36 So, search cost, and I'm caring about the 383 00:25:36 --> 00:25:39 constant here, because it will matter in a 384 00:25:39 --> 00:25:41 moment. Two square root of n: 385 00:25:41 --> 00:25:45 I'm not caring about the additive constant, 386 00:25:45 --> 00:25:48 but the multiplicative constant I care about. 387 00:25:48 --> 00:25:52 OK, that seems good. We started with a linked list 388 00:25:52 --> 00:25:56 that searched in n time, theta n time per operation. 389 00:25:56 --> 00:26:03 Now we have two linked lists, search in theta root n time. 390 00:26:03 --> 00:26:07 It seems pretty good. This is what the structure 391 00:26:07 --> 00:26:10 looks like. We have root n guys here. 392 00:26:10 --> 00:26:15 This is in the local line. And, we have one express stop 393 00:26:15 --> 00:26:19 which represents that. Then we have another root n 394 00:26:19 --> 00:26:24 values in the local line. And we have one express stop 395 00:26:24 --> 00:26:28 that represents that. And these two are linked, 396 00:26:28 --> 00:26:31 and so on. 397 00:26:31 --> 00:26:42 398 00:26:42 --> 00:26:44 Well, I should put some dot, dot, dots in there. 399 00:26:44 --> 00:26:47 OK, so each of these chunks has length root n, 400 00:26:47 --> 00:26:49 and the number of representatives up here is 401 00:26:49 --> 00:26:52 square root of n. The number of express stops is 402 00:26:52 --> 00:26:54 square root of n. So clearly, things are balanced 403 00:26:54 --> 00:26:55 now. I search for, 404 00:26:55 --> 00:26:57 at most, square root of n up here. 405 00:26:57 --> 00:27:00 Then I search in one of these lists for, at most, 406 00:27:00 --> 00:27:04 square root of n. So, every search takes, 407 00:27:04 --> 00:27:10 at most, two root n. Cool, what should we do next? 408 00:27:10 --> 00:27:15 So, again, ignore insertions and deletions. 409 00:27:15 --> 00:27:22 I want to make searches faster because square root of n is not 410 00:27:22 --> 00:27:25 so hot as we know. 
Sorry? 411 00:27:25 --> 00:27:30 More lines. Let's add a super express line, 412 00:27:30 --> 00:27:35 or another linked list. OK, this was two. 413 00:27:35 --> 00:27:41 Why not do three? So, we started with a sorted 414 00:27:41 --> 00:27:45 linked list. Then we went to two. 415 00:27:45 --> 00:27:48 This gave us two square root of n. 416 00:27:48 --> 00:27:52 Now, I want three sorted linked lists. 417 00:27:52 --> 00:27:57 I didn't pluralize here. Any guesses what the running 418 00:27:57 --> 00:28:02 time might be? This is just guesswork. 419 00:28:02 --> 00:28:05 Don't think. From two square root of n, 420 00:28:05 --> 00:28:08 you would go to, sorry? 421 00:28:08 --> 00:28:12 Two square root of two, fourth root of n? 422 00:28:12 --> 00:28:17 That's on the right track. Both the constant and the root 423 00:28:17 --> 00:28:20 change, but not quite so fancily. 424 00:28:20 --> 00:28:24 Three times the cube root: good. 425 00:28:24 --> 00:28:29 Intuition is very helpful here. It doesn't matter what the 426 00:28:29 --> 00:28:35 right answer is. Use your intuition. 427 00:28:35 --> 00:28:37 You can prove that. It's not so hard. 428 00:28:37 --> 00:28:40 You now have three lists, and what you want to balance 429 00:28:40 --> 00:28:44 are the length of the top list, the ratio between the top 430 00:28:44 --> 00:28:47 two lists, and the ratio between the bottom two lists. 431 00:28:47 --> 00:28:50 So, you want these three to multiply out to n, 432 00:28:50 --> 00:28:53 because the top times the ratio times the ratio: 433 00:28:53 --> 00:28:56 that has to equal n. And, so that's where you get 434 00:28:56 --> 00:28:59 the cube root of n. Each of these should be equal. 435 00:28:59 --> 00:29:03 So, you set them because the cost is the sum of those three 436 00:29:03 --> 00:29:07 things. So, you set each of them to 437 00:29:07 --> 00:29:11 cube root of n, and there are three of them. 438 00:29:11 --> 00:29:15 OK, check it at home if you want to be more sure.
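One way to check this at home is a small brute-force search, sketched below under the lecture's cost model: the length of the top list, plus the ratio between the top two lists, plus the ratio between the bottom two. The names `three_list_cost`, `top`, and `mid` are made up for the sketch, and n = 1000 is chosen only because its cube root is an integer.

```python
def three_list_cost(top, mid, n):
    """Top list length plus the two ratios between adjacent lists."""
    return top + mid / top + n / mid


n = 1000  # cube root is exactly 10
best = min(
    (three_list_cost(top, mid, n), top, mid)
    for top in range(1, n + 1)
    for mid in range(top, n + 1)
)
print(best)  # (cost, top list size, middle list size)
```

The minimum lands at a top list of 10 and a middle list of 100, for a cost of 30, which is three times the cube root of 1000. All three terms are equal there, as the balancing argument predicts.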
439 00:29:15 --> 00:29:21 Obviously, we want a few more. So, let's think about k sorted 440 00:29:21 --> 00:29:24 lists. k sorted lists will be k times 441 00:29:24 --> 00:29:28 the k'th root of n. You probably guessed that by 442 00:29:28 --> 00:29:33 now. So, what should we set k to? 443 00:29:33 --> 00:29:38 I don't want the exact minimum. What's a good value for k? 444 00:29:38 --> 00:29:41 Should I set it to n? n's kind of nice, 445 00:29:41 --> 00:29:44 because the n'th root of n is just one. 446 00:29:44 --> 00:29:48 Now that's n. So, this is why I cared about 447 00:29:48 --> 00:29:53 the lead constant because it's going to grow as I add more 448 00:29:53 --> 00:29:56 lists. What's the biggest reasonable 449 00:29:56 --> 00:30:03 value of k that I could use? Log n, because I have a k out 450 00:30:03 --> 00:30:07 there. I certainly don't want to use 451 00:30:07 --> 00:30:13 more than log n. So, log n times the log n'th 452 00:30:13 --> 00:30:18 root of n, and this is a little hard to draw. 453 00:30:18 --> 00:30:23 Now, what is the log n'th root of n? 454 00:30:23 --> 00:30:27 That's what you're all thinking about. 455 00:30:27 --> 00:30:34 What is the log n'th root of n minus two? 456 00:30:34 --> 00:30:39 It's one of these good questions whose answer is? 457 00:30:39 --> 00:30:43 Oh man. Remember the definition of 458 00:30:43 --> 00:30:47 root? OK, the root is n to the one 459 00:30:47 --> 00:30:51 over log n. OK, good, remember the 460 00:30:51 --> 00:30:55 definition of having a power, A to the B? 461 00:30:55 --> 00:30:59 It was like two to the power B log A? 462 00:30:59 --> 00:31:06 Does that sound familiar? So, this is two to the log n 463 00:31:06 --> 00:31:11 over log n, which is, I hope you can get it at this 464 00:31:11 --> 00:31:17 point, two. Wow, so the log n'th root of n 465 00:31:17 --> 00:31:20 minus two is zero: my favorite answer. 466 00:31:20 --> 00:31:23 OK, this is two. 
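A quick numeric check of the k times k'th-root-of-n estimate, as a sketch. The function name `search_cost` and the choice n = 2^16 are assumptions made for the illustration, and log here means log base two, matching the "two to the log n" step above.

```python
import math


def search_cost(n, k):
    """Lecture's estimate for k evenly spaced sorted lists: k * n^(1/k)."""
    return k * n ** (1.0 / k)


n = 2 ** 16
lg = int(math.log2(n))  # 16 lists when k = log n

for k in (1, 2, 3, lg, n):
    print(f"k = {k:>5}: cost ~ {search_cost(n, k):.2f}")
```

With one list the estimate is n itself, two lists give two root n (512 here), and k = log n gives about 2 log n (32 here) because the log n'th root of n is two. Pushing k all the way to n brings the cost back up to roughly n, which is why the leading constant mattered.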
So this whole thing is two log 467 00:31:23 --> 00:31:26 n: pretty nifty. So, you could be a little 468 00:31:26 --> 00:31:31 fancier and tweak this a little bit, but two log n is plenty 469 00:31:31 --> 00:31:36 good for me. We clearly don't want to use 470 00:31:36 --> 00:31:41 any more lists, but log n lists sounds pretty 471 00:31:41 --> 00:31:45 good. I get, now, logarithmic search 472 00:31:45 --> 00:31:47 time. Let's check. 473 00:31:47 --> 00:31:52 I mean, we sort of did this all intuitively. 474 00:31:52 --> 00:31:56 Let's draw what the list looks like. 475 00:31:56 --> 00:32:01 But, it will work. So, I'm going to redraw this 476 00:32:01 --> 00:32:07 example because you have to, also. 477 00:32:07 --> 00:32:14 So, let's redesign that New York City subway system. 478 00:32:14 --> 00:32:22 And, I want you to leave three blank lines up here. 479 00:32:22 --> 00:32:29 So, you should have this memorized by now. 480 00:32:29 --> 00:32:34 But I don't. So, we are not allowed to 481 00:32:34 --> 00:32:38 change the local line, though it would be nice, 482 00:32:38 --> 00:32:43 add a few more stops there. OK, we can stop at 79th Street. 483 00:32:43 --> 00:32:47 That's enough. So now, we have log n lists. 484 00:32:47 --> 00:32:53 And here, log n is about four. So, I want to make a bunch of 485 00:32:53 --> 00:32:55 lists here. In particular, 486 00:32:55 --> 00:33:02 14 will appear on all of them. So, why don't I draw those in? 487 00:33:02 --> 00:33:05 And, the question is, which elements go in here? 488 00:33:05 --> 00:33:08 So, I have log n lists. And, my goal is to balance the 489 00:33:08 --> 00:33:12 number of items up here, and the ratio between these two 490 00:33:12 --> 00:33:15 lists, and the ratio between these two lists, 491 00:33:15 --> 00:33:18 and the ratio between these two lists. 492 00:33:18 --> 00:33:20 I want all these things to be balanced. 493 00:33:20 --> 00:33:24 There are log n of them. 
So, the product of all those 494 00:33:24 --> 00:33:27 ratios better be n, the number of elements down 495 00:33:27 --> 00:33:29 here. So, the product of all these 496 00:33:29 --> 00:33:36 ratios is n. And there are log n of them; 497 00:33:36 --> 00:33:44 how big is each ratio? So, I'll call the ratio r. 498 00:33:44 --> 00:33:52 The ratio's r. I should have r to the power of 499 00:33:52 --> 00:33:56 log n equals n. What's r? 500 00:33:56 --> 00:34:02 What's r minus two? Zero. 501 00:34:02 --> 00:34:05 OK, this should be two to the power of log n. 502 00:34:05 --> 00:34:09 So, if the ratio between the number of elements here and here 503 00:34:09 --> 00:34:12 is two all the way down, then I will have n elements at 504 00:34:12 --> 00:34:15 the bottom, which is what I want. 505 00:34:15 --> 00:34:18 So, in other words, I want half the elements here, 506 00:34:18 --> 00:34:22 a quarter of the elements here, an eighth of the elements here, 507 00:34:22 --> 00:34:25 and so on. So, I'm going to take half of 508 00:34:25 --> 00:34:28 the elements evenly spaced out: 34th, 50th, 66th, 509 00:34:28 --> 00:34:32 79th, and so on. So, this is our new 510 00:34:32 --> 00:34:35 semi-express line: not terribly fast, 511 00:34:35 --> 00:34:39 but you save a factor of two for going up there. 512 00:34:39 --> 00:34:42 And, when you're done, you go down, 513 00:34:42 --> 00:34:44 and you walk, at most, one step. 514 00:34:44 --> 00:34:47 And you find what you're looking for. 515 00:34:47 --> 00:34:52 OK, and then we do the same thing over and over and over 516 00:34:52 --> 00:34:56 until we run out of elements. I can't read my own writing. 517 00:34:56 --> 00:34:59 It's 79th. 518 00:34:59 --> 00:35:11 519 00:35:11 --> 00:35:14 OK, if I had a bigger example, there would be more levels, 520 00:35:14 --> 00:35:19 but this is just barely enough. Let's say two elements is where 521 00:35:19 --> 00:35:21 I stop. So, this looks good.
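The construction just described (keep every other element on each higher line, and stop once a line has two elements) can be sketched in a few lines of Python. This is only an illustration: the function name `ideal_levels` is made up here, and the in-between local stops 23, 42, and 59 are assumed so that the upper lines match the stops named in the lecture.

```python
def ideal_levels(sorted_keys):
    """Return the lists of an ideal skip list, sparsest (top) list first.

    Each higher list keeps every other element of the list below it,
    always including the first element, until only two elements remain.
    """
    levels = [list(sorted_keys)]            # bottom list holds every key
    while len(levels[0]) > 2:
        levels.insert(0, levels[0][::2])    # halve the current top list
    return levels


# Local line; 23, 42 and 59 are assumed stops for illustration.
STOPS = [14, 23, 34, 42, 50, 59, 66, 72, 79]
LEVELS = ideal_levels(STOPS)
```

With nine stops this yields four lists, consistent with log n being about four in the example.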
522 00:35:21 --> 00:35:24 Does this look like a structure you've seen before, 523 00:35:24 --> 00:35:25 at all, vaguely? Yes? 524 00:35:25 --> 00:35:28 A tree: yes. It looks a lot like a binary 525 00:35:28 --> 00:35:31 tree. I'll just leave it at that. 526 00:35:31 --> 00:35:34 In your problem set, you'll understand why skip 527 00:35:34 --> 00:35:38 lists are really like trees. But it's more or less a tree. 528 00:35:38 --> 00:35:41 Let's say at this level, it looks sort of like binary 529 00:35:41 --> 00:35:42 search. You look at 14; 530 00:35:42 --> 00:35:44 you look at 50, and therefore, 531 00:35:44 --> 00:35:48 you decide whether you are in the left half or the right 532 00:35:48 --> 00:35:50 half. And that's sort of like a tree. 533 00:35:50 --> 00:35:54 It's not quite a tree because we have this element repeated 534 00:35:54 --> 00:35:55 all over. But more or less, 535 00:35:55 --> 00:35:59 this is a binary tree. At depth i, we have two to the 536 00:35:59 --> 00:36:04 i nodes, just like a tree, just like a balanced tree. 537 00:36:04 --> 00:36:08 I'm going to call this structure an ideal skip list. 538 00:36:08 --> 00:36:13 And, if all we are doing is searches, ideal skip lists are 539 00:36:13 --> 00:36:15 pretty good. Maybe in practice: 540 00:36:15 --> 00:36:20 not quite as good as a binary search tree, but up to constant 541 00:36:20 --> 00:36:24 factors: just as good. So, for example, 542 00:36:24 --> 00:36:28 I mean, we can generalize search, just check that it's log 543 00:36:28 --> 00:36:32 n. So, the search procedure is you 544 00:36:32 --> 00:36:36 start at the top left. So, let's say we are looking 545 00:36:36 --> 00:36:38 for 72. You start at the top left. 546 00:36:38 --> 00:36:41 14 is smaller than 72, so I try to go right. 547 00:36:41 --> 00:36:44 79 is too big. So, I follow this arrow, 548 00:36:44 --> 00:36:47 but I say, oops, that's too much. 549 00:36:47 --> 00:36:49 So, instead, I go down, still at 14.
550 00:36:49 --> 00:36:53 I go to the right: oh, 50, that's still smaller 551 00:36:53 --> 00:36:55 than 72: OK. I tried to go right again. 552 00:36:55 --> 00:36:58 Oh: 79, that's too big. That's no good. 553 00:36:58 --> 00:37:00 So, I go down. So, I get 50. 554 00:37:00 --> 00:37:05 I do the same thing over and over. 555 00:37:05 --> 00:37:07 I try to go to the right: oh, 66, that's OK. 556 00:37:07 --> 00:37:09 Try to go to the right: oh, 79, that's too big. 557 00:37:09 --> 00:37:11 So I go down. Now I go to the right and, 558 00:37:11 --> 00:37:14 oh, 72: done. Otherwise, I'd go too far and 559 00:37:14 --> 00:37:16 try to go down and say, oops, element must not be 560 00:37:16 --> 00:37:18 there. It's a very simple search 561 00:37:18 --> 00:37:21 algorithm: same as here except just remove the L_1 and L_2. 562 00:37:21 --> 00:37:23 Go right until that would go too far. 563 00:37:23 --> 00:37:25 Then go down. Then go right until we'd go too 564 00:37:25 --> 00:37:28 far, and then go down. You might have to do this log n 565 00:37:28 --> 00:37:30 times. In each level, 566 00:37:30 --> 00:37:34 you're clearly only walking a couple of steps because the 567 00:37:34 --> 00:37:37 ratio between these two sizes is only two. 568 00:37:37 --> 00:37:40 So, this will cost two log n for search. 569 00:37:40 --> 00:37:42 Good, I mean, so that was to check because we 570 00:37:42 --> 00:37:46 were using intuition over here; a little bit shaky. 571 00:37:46 --> 00:37:50 So, this is an ideal skip list, we have to support insertions 572 00:37:50 --> 00:37:53 and deletions. As soon as we do an insert and 573 00:37:53 --> 00:37:57 delete, there's no way we're going to maintain the structure. 574 00:37:57 --> 00:38:03 It's a bit too special. There is only one of these 575 00:38:03 --> 00:38:09 where everything is perfectly spaced out, and everything is 576 00:38:09 --> 00:38:13 beautiful. So, we can't do that. 
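The search rule above (go right until that would go too far, then go down) can be sketched as follows. This is a minimal sketch, not a definitive implementation: the `Node` class, `build`, and `search` are names invented here, the lists are hard-coded to the subway example, and the in-between local stops 23, 42, and 59 are assumed for illustration.

```python
class Node:
    """One stop on one line: a key plus right and down pointers."""
    def __init__(self, key, right=None, down=None):
        self.key, self.right, self.down = key, right, down


def build(levels):
    """Link sorted lists (top list first) and return the top-left node."""
    below, head = {}, None
    for keys in reversed(levels):              # start from the bottom list
        row, head = {}, None
        for key in reversed(keys):             # link each list right to left
            head = Node(key, right=head, down=below.get(key))
            row[key] = head
        below = row
    return head


def search(top_left, target):
    """Return (found, steps): go right while not overshooting, else go down."""
    node, steps = top_left, 0
    while True:
        while node.right is not None and node.right.key <= target:
            node = node.right                  # safe to ride this line further
            steps += 1
        if node.down is None:                  # already on the bottom list
            break
        node = node.down                       # going right would overshoot
        steps += 1
    return node.key == target, steps


LEVELS = [
    [14, 79],
    [14, 50, 79],
    [14, 34, 50, 66, 79],
    [14, 23, 34, 42, 50, 59, 66, 72, 79],      # 23, 42, 59 are assumed stops
]
top = build(LEVELS)
found_72, _ = search(top, 72)
found_75, _ = search(top, 75)
```

Searching for 72 follows the path from the walkthrough: down at 14, right to 50, down, right to 66, down, right to 72. Searching for 75 ends at 72 on the bottom list and reports that the key is absent.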
577 00:38:13 --> 00:38:20 We're going to maintain roughly this structure as best we can. 578 00:38:20 --> 00:38:27 And, if anyone of you knows someone in New York City subway 579 00:38:27 --> 00:38:31 planning, you can tell them this. 580 00:38:31 --> 00:38:37 OK, so: skip lists. So, I mean, this is basically 581 00:38:37 --> 00:38:42 our data structure. You could use this as a 582 00:38:42 --> 00:38:46 starting point, but then you start using skip 583 00:38:46 --> 00:38:49 lists. And, we need to somehow 584 00:38:49 --> 00:38:54 implement insertions and deletions, and maintain roughly 585 00:38:54 --> 00:39:01 this structure well enough that the search still costs order log 586 00:39:01 --> 00:39:05 n time. So, let's focus on insertions. 587 00:39:05 --> 00:39:09 If we do insertions right, it turns out deletions are 588 00:39:09 --> 00:39:11 really trivial. 589 00:39:11 --> 00:39:28 590 00:39:28 --> 00:39:31 And again, this is all from first principles. 591 00:39:31 --> 00:39:34 We're not allowed to use anything fancy. 592 00:39:34 --> 00:39:38 But, it would be nice if we used some good chalk. 593 00:39:38 --> 00:39:42 This one looks better. So, suppose you want to insert 594 00:39:42 --> 00:39:46 an element, x. We said how to search for an 595 00:39:46 --> 00:39:48 element. So, how do we insert it? 596 00:39:48 --> 00:39:53 Well, the first thing we should do is figure out where it goes. 597 00:39:53 --> 00:39:57 So, we search for x. We call search of x to find 598 00:39:57 --> 00:40:03 where x fits in the bottom list, not just any list. 599 00:40:03 --> 00:40:06 Pretty easy to find out where it fits in the top list. 600 00:40:06 --> 00:40:08 That takes, like, constant time. 601 00:40:08 --> 00:40:11 What we want to know: because the top list has 602 00:40:11 --> 00:40:14 constant length, we want to know where x goes in 603 00:40:14 --> 00:40:17 the bottom list. So, let's say we want to insert 604 00:40:17 --> 00:40:19 a search for 80. Well, it is a bit too big. 
605 00:40:19 --> 00:40:22 Let search for 75. So, we'll find the 75 fits 606 00:40:22 --> 00:40:25 right here between 72 and 79 using the same path. 607 00:40:25 --> 00:40:29 OK, if it's there already, we complain because I'm going 608 00:40:29 --> 00:40:32 to assume all keys are distinct for now just so the picture 609 00:40:32 --> 00:40:38 stays simple. But this works fine even if you 610 00:40:38 --> 00:40:42 are inserting the same key over and over. 611 00:40:42 --> 00:40:47 So, that seems good. One thing we should clearly do 612 00:40:47 --> 00:40:50 is insert x into the bottom list. 613 00:40:50 --> 00:40:55 We now know where it fits. It should go there. 614 00:40:55 --> 00:40:59 Because we want to maintain this invariant, 615 00:40:59 --> 00:41:06 that the bottom list contains all the elements. 616 00:41:06 --> 00:41:10 So, there we go. We've maintained the invariant. 617 00:41:10 --> 00:41:14 The bottom list contains all the elements. 618 00:41:14 --> 00:41:18 So, we search for 75. We say, oh, 75 goes here, 619 00:41:18 --> 00:41:24 and we just sort of link in 75. You know how to do a linked 620 00:41:24 --> 00:41:29 list, I hope. Let me just erase that pointer. 621 00:41:29 --> 00:41:32 All the work in implementing skip lists is the linked list 622 00:41:32 --> 00:41:34 manipulation. Is that enough? 623 00:41:34 --> 00:41:38 No, it would be fine for now because now there's only a chain 624 00:41:38 --> 00:41:41 of length three here that you'd have to walk over if you're 625 00:41:41 --> 00:41:44 looking for something in this range. 626 00:41:44 --> 00:41:47 But if I just keep inserting 75, and 76, than 76 plus 627 00:41:47 --> 00:41:51 epsilon, 76 plus two epsilon, and so on, just pack a whole 628 00:41:51 --> 00:41:54 bunch of elements in here, this chain will get really 629 00:41:54 --> 00:41:55 long. Now, suddenly, 630 00:41:55 --> 00:41:58 things are not so balanced. 
If I do a search, 631 00:41:58 --> 00:42:02 I'll pay an arbitrarily long amount of time here to search for 632 00:42:02 --> 00:42:05 something. If I insert k things, 633 00:42:05 --> 00:42:08 it'll take k time. I want it to stay log n. 634 00:42:08 --> 00:42:11 If I only insert log n items, it's OK for now. 635 00:42:11 --> 00:42:15 What I want to do is decide which of these lists should contain 75. 636 00:42:15 --> 00:42:17 So, clearly it goes on the bottom. 637 00:42:17 --> 00:42:19 Every element goes in the bottom. 638 00:42:19 --> 00:42:21 Should it go up a level? Maybe. 639 00:42:21 --> 00:42:23 It depends. It's not clear yet. 640 00:42:23 --> 00:42:27 If I insert a few items here, definitely some of them should 641 00:42:27 --> 00:42:39 go on the next level. Should it go two levels up? 642 00:42:39 --> 00:42:57 Maybe, but even less likely. So, what should I do? 643 00:42:57 --> 00:43:01 Yeah? Right, so you maintain the 644 00:43:01 --> 00:43:05 ideal partition size, which may be like the length of 645 00:43:05 --> 00:43:07 this chain. And you see, 646 00:43:07 --> 00:43:10 well, if that gets too long, then I should split it in the 647 00:43:10 --> 00:43:14 middle, promote that guy up to the next level, 648 00:43:14 --> 00:43:18 and do the same thing up here. If this chain gets too long 649 00:43:18 --> 00:43:21 between two consecutive next level express stops, 650 00:43:21 --> 00:43:23 then I'll promote the middle guy. 651 00:43:23 --> 00:43:26 And that's what you'll do in your problem set. 652 00:43:26 --> 00:43:30 That's too fancy for me. I don't need no stinking 653 00:43:30 --> 00:43:34 counters. What else could I do? 654 00:43:34 --> 00:43:46 655 00:43:46 --> 00:43:48 I could try to maintain the ideal skip list structure. 656 00:43:48 --> 00:43:51 That will be too expensive. Like, say, 75 is the guy that 657 00:43:51 --> 00:43:54 gets promoted, and this guy gets demoted all 658 00:43:54 --> 00:43:55 the way down.
But that will propagate 659 00:43:55 --> 00:43:58 everything to the right. And that could cost linear time 660 00:43:58 --> 00:44:01 for update. Other idea? 661 00:44:01 --> 00:44:07 If I only want half of them to go up, I could flip a coin. 662 00:44:07 --> 00:44:11 Good idea. All right, for that, 663 00:44:11 --> 00:44:16 I will give you a quarter. It's a good one. 664 00:44:16 --> 00:44:19 It's the old line state, Maryland. 665 00:44:19 --> 00:44:24 There you go. However, you have to perform 666 00:44:24 --> 00:44:32 some services for that quarter, namely, flip the coin. 667 00:44:32 --> 00:44:34 Can you flip a coin? Good. 668 00:44:34 --> 00:44:38 What did you get? Tails, OK, that's the first 669 00:44:38 --> 00:44:42 random bit. But we are going to do is build 670 00:44:42 --> 00:44:45 a skip list. Maybe I should tell you how 671 00:44:45 --> 00:44:48 first. OK, but the idea is flip a 672 00:44:48 --> 00:44:50 coin. If it's heads, 673 00:44:50 --> 00:44:55 so, sorry, if it's heads, we will promote it to the next 674 00:44:55 --> 00:45:03 level, and flip again. So, this is an answer to the 675 00:45:03 --> 00:45:10 question, which other lists should store x? 676 00:45:10 --> 00:45:16 How many other lists should we add x to? 677 00:45:16 --> 00:45:22 Well, the algorithm is, flip a coin, 678 00:45:22 --> 00:45:28 and if it comes out heads, then promote x. 679 00:45:28 --> 00:45:36 to the next level up, and flip again. 680 00:45:36 --> 00:45:39 OK, that's key because we might want this element to go 681 00:45:39 --> 00:45:41 arbitrarily high. But for starters, 682 00:45:41 --> 00:45:43 we flip a coin. It doesn't go to the next 683 00:45:43 --> 00:45:45 level. Well, we'd like it to go to the 684 00:45:45 --> 00:45:49 next level with probability one half because we want the ratio 685 00:45:49 --> 00:45:51 between these two sizes to be a half, or sorry, 686 00:45:51 --> 00:45:54 two, depending which way you take the ratio. 
687 00:45:54 --> 00:45:56 So, I want roughly half the elements up here. 688 00:45:56 --> 00:45:58 So, I flip a coin. If it comes up heads, 689 00:45:58 --> 00:46:02 I go up here. This is a fair coin. 690 00:46:02 --> 00:46:05 So I want it 50-50. OK, then should that 691 00:46:05 --> 00:46:07 element go up to the next level up? 692 00:46:07 --> 00:46:09 Well, with 50% probability again. 693 00:46:09 --> 00:46:12 So, I flip another coin. If it comes up heads, 694 00:46:12 --> 00:46:15 I'll go up another level. And that will maintain the 695 00:46:15 --> 00:46:19 approximate ratio between these two guys as being two. 696 00:46:19 --> 00:46:21 The expected ratio will definitely be two, 697 00:46:21 --> 00:46:25 and so on, all the way up. If I go up to the top and flip 698 00:46:25 --> 00:46:28 a coin and it comes up heads, I'll make another level. 699 00:46:28 --> 00:46:33 This is the insertion algorithm: dead simple. 700 00:46:33 --> 00:46:38 The fancier one you will see on your problem set. 701 00:46:38 --> 00:46:40 So, let's do it. 702 00:46:40 --> 00:46:49 703 00:46:49 --> 00:46:53 OK, I also need someone to generate random numbers. 704 00:46:53 --> 00:46:56 Who can generate random numbers? 705 00:46:56 --> 00:47:00 Pseudo-random? I'll give you a quarter. 706 00:47:00 --> 00:47:02 I have one here. Here you go. 707 00:47:02 --> 00:47:05 That's a boring quarter. Who would like to generate 708 00:47:05 --> 00:47:08 random numbers? Someone volunteering someone 709 00:47:08 --> 00:47:10 else: that's a good way to do it. 710 00:47:10 --> 00:47:13 Here you go. You get a quarter, 711 00:47:13 --> 00:47:15 but you're not allowed to flip it. 712 00:47:15 --> 00:47:18 No randomness for you; well, OK, you can generate 713 00:47:18 --> 00:47:22 bits, and then compute a number. So, give me a number. 714 00:47:22 --> 00:47:25 44, good answer. OK, we already flipped a coin 715 00:47:25 --> 00:47:27 and I got tails. Done.
716 00:47:27 --> 00:47:33 That's the insertion algorithm. I'm going to make some more 717 00:47:33 --> 00:47:36 space actually, put it way down here. 718 00:47:36 --> 00:47:41 OK, so 44 does not get promoted because we got a tails. 719 00:47:41 --> 00:47:46 So, give me another number. Nine, OK, I search for nine in 720 00:47:46 --> 00:47:49 this list. I should mention one other 721 00:47:49 --> 00:47:53 thing, sorry. I need a small change. 722 00:47:53 --> 00:47:57 This is just to make sure searches still work. 723 00:47:57 --> 00:48:02 So, the worry is suppose I insert something bigger and then 724 00:48:02 --> 00:48:07 I promote it. This would look very bad for a 725 00:48:07 --> 00:48:11 skip list data structure because I always want to start at the 726 00:48:11 --> 00:48:13 top left, and now there's no top left. 727 00:48:13 --> 00:48:17 So, just minor change: just let me remember that. 728 00:48:17 --> 00:48:21 The minor change is that I'm going to store a special value 729 00:48:21 --> 00:48:25 minus infinity in every list. So, minus infinity always gets 730 00:48:25 --> 00:48:29 promoted all the way to the top, whatever the top happens to be 731 00:48:29 --> 00:48:32 now. So, initially, 732 00:48:32 --> 00:48:35 that way I'll always have a top left. 733 00:48:35 --> 00:48:38 Sorry, I forgot to mention that. 734 00:48:38 --> 00:48:41 So, initially I'll just have minus infinity. 735 00:48:41 --> 00:48:45 Then I insert 44. I say, OK, 44 goes there, 736 00:48:45 --> 00:48:47 no promotion, done. 737 00:48:47 --> 00:48:49 Now, we're going to insert nine. 738 00:48:49 --> 00:48:53 Nine goes here. So, minus infinity to nine, 739 00:48:53 --> 00:48:55 flip your coin, heads. 740 00:48:55 --> 00:49:00 Did he actually flip it? OK, good. 741 00:49:00 --> 00:49:02 He flipped it before, yeah, sure. 742 00:49:02 --> 00:49:04 I'm just giving you a hard time. 743 00:49:04 --> 00:49:09 So, we have nine up here. 
We need to maintain this minus 744 00:49:09 --> 00:49:13 infinity just to make sure it gets promoted along with 745 00:49:13 --> 00:49:16 everything else. So, that looks like a nice skip 746 00:49:16 --> 00:49:18 list. Flip it again. 747 00:49:18 --> 00:49:21 Tails, good. OK, so this looks like an ideal 748 00:49:21 --> 00:49:23 skip list. Isn't that great? 749 00:49:23 --> 00:49:27 It works every time. OK, give me another number. 750 00:49:27 --> 00:49:32 26, OK, so I search for 26. 26 goes here. 751 00:49:32 --> 00:49:36 It clearly goes on the bottom list. 752 00:49:36 --> 00:49:41 Here we go, 26, and then I you raised 44. 753 00:49:41 --> 00:49:46 Flip. Tails, OK, another number. 754 00:49:46 --> 00:49:52 50, oh, a big one. It costs me a little while to 755 00:49:52 --> 50. search, and I get over here. 756 50. --> 00:49:56 757 00:49:56 --> 00:49:58 Flip. Heads, good. 758 00:49:58 --> 00:50:05 So 50 gets promoted. Flip it again. 759 00:50:05 --> 00:50:08 Tails, OK, still a reasonable number. 760 00:50:08 --> 00:50:11 Another number? 12, it takes a little while to 761 00:50:11 --> 00:50:15 get exciting here. OK, 12 goes here between nine 762 00:50:15 --> 00:50:18 and 26. You're giving me a hard time 763 00:50:18 --> 00:50:20 here. OK, flip. 764 00:50:20 --> 00:50:24 Heads, OK, 12 gets promoted. I know you have to work a 765 00:50:24 --> 00:50:30 little bit, but we just came here to search for 12. 766 00:50:30 --> 00:50:35 So, we know that nine was the last point we went down. 767 00:50:35 --> 00:50:39 So, we promote 12. It gets inserted up here. 768 00:50:39 --> 00:50:45 We are just inserting into this particular linked list: 769 00:50:45 --> 00:50:48 nothing fancy. We link the two twelves 770 00:50:48 --> 00:50:52 together. It still looks kind of like a 771 00:50:52 --> 00:50:55 linked list. Flip again. 772 00:50:55 --> 37. OK, tails, another number. 773 37. --> 00:50:58 774 00:50:58 --> 00:51:02 Jeez. It's a good test of memory. 
775 00:51:02 --> 00:51:05 37, what was it, 44 and 50? 776 00:51:05 --> 00:51:08 And 50 was at the next level up. 777 00:51:08 --> 00:51:14 I think I should just keep appending elements and have you 778 00:51:14 --> 00:51:18 flip coins. OK, we just inserted 37. 779 00:51:18 --> 00:51:22 Tails. OK, that's getting to be a long 780 00:51:22 --> 00:51:25 chain. That looks a bit worse. 781 00:51:25 --> 00:51:29 OK, give me another number larger than 50. 782 00:51:29 --> 00:51:34 51, good answer. Thank you. 783 00:51:34 --> 00:51:37 OK, flip again. And again. 784 00:51:37 --> 00:51:40 Tails. Another number. 785 00:51:40 --> 00:51:45 Wait, someone else should pick a number. 786 00:51:45 --> 00:51:49 It's not working. What did you say? 787 00:51:49 --> 00:51:52 52, good answer. Flip. 788 00:51:52 --> 00:51:58 Tails, not surprising. We've gotten a lot of heads 789 00:51:58 --> 00:52:03 there. OK, another number. 790 00:52:03 --> 00:52:06 53, thank you. Flip. 791 00:52:06 --> 00:52:08 Heads, heads, OK. 792 00:52:08 --> 00:52:13 Heads, heads, you didn't flip. 793 00:52:13 --> 00:52:17 All right, 53, you get the idea. 794 00:52:17 --> 00:52:26 If you get two consecutive heads, then the guy goes up two 795 00:52:26 --> 00:52:32 levels. OK, now flip for real. 796 00:52:32 --> 00:52:33 Heads. Finally. 797 00:52:33 --> 00:52:39 Heads we've been waiting for. If you flipped three heads in a 798 00:52:39 --> 00:52:44 row, you go three levels. And each time, 799 00:52:44 --> 00:52:47 we keep promoting minus infinity. 800 00:52:47 --> 00:52:50 Look again. Heads, oh my God. 801 00:52:50 --> 00:52:54 Where were they before? Flip again. 802 00:52:54 --> 00:53:00 It better be tails this time. Tails, good. 803 00:53:00 --> 00:53:04 OK, you get the idea. Eventually you run out of board 804 00:53:04 --> 00:53:06 space. Now, it's pretty rare that you 805 00:53:06 --> 00:53:10 go too high. What's the probability that you 806 00:53:10 --> 00:53:13 go higher than log n? 
Another easy log computation. 807 00:53:13 --> 00:53:17 Each time, I have a 50% probability of going up. 808 00:53:17 --> 00:53:22 There's a one in n probability of going up log n levels because one half to 809 00:53:22 --> 00:53:24 the power of log n is one over n. 810 00:53:24 --> 00:53:28 So, it depends on n, but I'm not going to go too 811 00:53:28 --> 00:53:32 high. And, intuitively, 812 00:53:32 --> 00:53:37 this is not so bad. So, these are skip lists. 813 00:53:37 --> 00:53:44 You have the ratios right in expectation, which is a pretty 814 00:53:44 --> 00:53:49 weak statement. This doesn't say anything about 815 00:53:49 --> 00:53:54 the lengths of these chains. But intuitively, 816 00:53:54 --> 00:53:59 it's pretty good. Let's say pretty good on 817 00:53:59 --> 00:54:03 average. So, I had two semi-random 818 00:54:03 --> 00:54:05 processes going on here. One is picking the numbers, 819 00:54:05 --> 00:54:08 and that, I don't want to assume anything about. 820 00:54:08 --> 00:54:09 The numbers could be adversarial. 821 00:54:09 --> 00:54:12 They could be sequential. They could be reverse sorted. 822 00:54:12 --> 00:54:14 They could be random. I don't know. 823 00:54:14 --> 00:54:15 So, it didn't matter what he said. 824 00:54:15 --> 00:54:18 At least, it shouldn't matter. I mean, it matters here. 825 00:54:18 --> 00:54:20 Don't worry. You're still loved. 826 00:54:20 --> 00:54:22 You still get your $0.25. But what the algorithm cares 827 00:54:22 --> 00:54:24 about is the outcomes of these coins. 828 00:54:24 --> 00:54:27 And the probability, the statement that this data 829 00:54:27 --> 00:54:30 structure is fast with high probability is only about the 830 00:54:30 --> 00:54:34 random coins. Right, it doesn't matter what 831 00:54:34 --> 00:54:38 the adversary chooses for numbers as long as those coins 832 00:54:38 --> 00:54:43 are random, and the adversary doesn't know the coins. 833 00:54:43 --> 00:54:46 It doesn't know the outcomes of the coins.
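The coin-flipping insertion, together with the minus-infinity sentinel that is promoted whenever a new level appears, can be sketched as below. This is a sketch under stated assumptions rather than the lecture's own code: the class and method names are invented here, each element is stored as one node with an array of forward pointers (a common compact stand-in for a separate node per list), keys are assumed distinct, and the `flip` parameter is a hypothetical hook so that the coin can be replaced by a fixed sequence.

```python
import random


class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height      # next[i]: successor in list i (0 = bottom)


class SkipList:
    def __init__(self, flip=lambda: random.random() < 0.5):
        self.head = Node(float("-inf"), 1)   # minus infinity, present in every list
        self.flip = flip                     # returns True for heads

    def _predecessors(self, key):
        """Walk from the top left; record the last node < key on each level."""
        preds = [None] * len(self.head.next)
        node = self.head
        for level in reversed(range(len(self.head.next))):
            while node.next[level] is not None and node.next[level].key < key:
                node = node.next[level]      # go right while that stays below key
            preds[level] = node              # then drop down one level
        return preds

    def contains(self, key):
        candidate = self._predecessors(key)[0].next[0]
        return candidate is not None and candidate.key == key

    def insert(self, key):
        height = 1                           # every key goes in the bottom list
        while self.flip():                   # heads: promote, then flip again
            height += 1
        while len(self.head.next) < height:  # promote minus infinity as well
            self.head.next.append(None)
        preds = self._predecessors(key)
        node = Node(key, height)
        for level in range(height):          # splice into each list it belongs to
            node.next[level] = preds[level].next[level]
            preds[level].next[level] = node

    def level_keys(self, level):
        """Keys stored in the given list, left to right (sentinel omitted)."""
        keys, node = [], self.head.next[level]
        while node is not None:
            keys.append(node.key)
            node = node.next[level]
        return keys
```

Feeding in the classroom sequence (44 with tails; 9 with heads, tails; 26 with tails; 50 with heads, tails; 12 with heads, tails) reproduces the board at that point: all five keys on the bottom list, and 9, 12, 50 one level up.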
834 00:54:46 --> 00:54:50 So, in that case, on average, overall of the coin 835 00:54:50 --> 00:54:55 flips, you should be OK. But the claim is not just that 836 00:54:55 --> 00:54:58 it's pretty good on average. But, it's really, 837 00:54:58 --> 00:55:03 really good almost always. OK, with really high 838 00:55:03 --> 00:55:07 probability it's log n. So, for example, 839 00:55:07 --> 00:55:10 with probability, one minus one over n, 840 00:55:10 --> 00:55:15 it's order of log n, with probability one minus one 841 00:55:15 --> 00:55:19 over n^2 it's log n, probability one minus one over 842 00:55:19 --> 00:55:24 n^100, it's order log n. All those statements are true 843 00:55:24 --> 00:55:30 for any value of 100. So, that's where we're going. 844 00:55:30 --> 00:55:33 OK, I should mention, how do you delete in a skip 845 00:55:33 --> 00:55:34 list? Find the element. 846 00:55:34 --> 00:55:37 You delete it all the way. There's nothing fancy with 847 00:55:37 --> 00:55:40 delete. Because we have all these 848 00:55:40 --> 00:55:43 independent, random choices, all of these elements are sort 849 00:55:43 --> 00:55:47 of independent from each other. We don't really care. 850 00:55:47 --> 00:55:49 So, delete an element, just throw it away. 851 00:55:49 --> 00:55:53 The tricky part is insertion. When I insert an element, 852 00:55:53 --> 00:55:56 I'm just going to randomly see how high it should go. 853 00:55:56 --> 00:56:00 With probability one over two to the i, it will go to height 854 00:56:00 --> 00:56:04 i. Good, that's my time. 855 00:56:04 --> 00:56:08 I've been having too much fun here. 856 00:56:08 --> 00:56:14 I've got to go a little bit faster, OK. 857 00:56:14 --> 00:56:25 858 00:56:25 --> 00:56:32 So here's the theorem. Let's see exactly what we are 859 00:56:32 --> 00:56:38 proving first. With high probability, 860 00:56:38 --> 00:56:46 this is a formal notion which I will define a second. 
861 00:56:46 --> 00:56:55 Every search in an n-element skip list costs order of log n. 862 00:56:55 --> 00:57:03 So, that's the theorem. Now I need to define with high 863 00:57:03 --> 00:57:06 probability. So, with high probability. 864 00:57:06 --> 00:57:10 And, it's a bit of a long phrase. 865 00:57:10 --> 00:57:15 So, often we will, and you can, abbreviate it WHP. 866 00:57:15 --> 00:57:20 So, if I have a random event, and the random event here is 867 00:57:20 --> 00:57:26 that every search in an n-element skip list costs order 868 00:57:26 --> 00:57:32 log n, I want to know what it means for that event E to occur 869 00:57:32 --> 00:57:36 with high probability. 870 00:57:36 --> 00:57:47 871 00:57:47 --> 00:57:53 So this is the definition. So, the statement is that for 872 00:57:53 --> 00:58:00 any alpha greater than or equal to one, there is a suitable 873 00:58:00 --> 00:58:04 choice of constants -- 874 00:58:04 --> 00:58:16 875 00:58:16 --> 00:58:27 -- for which the event, E, occurs with this probability 876 00:58:27 --> 00:58:37 I keep mentioning. So, with probability at least 877 00:58:37 --> 00:58:46 one minus one over n to the alpha. 878 00:58:46 --> 00:58:49 So, this is a bit imprecise, but it will suffice for our 879 00:58:49 --> 00:58:52 purposes. If you want a really formal 880 00:58:52 --> 00:58:55 definition, you can read the lecture notes. 881 00:58:55 --> 00:58:59 There are special lecture notes for this lecture on the Stellar 882 00:58:59 --> 00:59:01 site. And, there's the PowerPoint 883 00:59:01 --> 00:59:06 notes on the SMA site. But, right, there's a bit of a 884 00:59:06 --> 00:59:08 subtlety in the choice of constants here. 885 00:59:08 --> 00:59:11 There is a choice of this constant. 886 00:59:11 --> 00:59:14 And there's a choice of this constant. 887 00:59:14 --> 00:59:16 And, these are related. And, there's alpha, 888 00:59:16 --> 00:59:19 which we get to set to whatever we want.
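One way to write the definition down, making the two constants explicit, is sketched here. The names c_\alpha (the constant hidden in the order log n bound) and d_\alpha (the constant in front of the error probability) are labels introduced for this sketch; the formal definition in the lecture notes may be phrased differently.

```latex
\[
\text{For every } \alpha \ge 1 \text{ there exist constants } c_\alpha, d_\alpha > 0
\text{ such that}
\]
\[
\Pr\bigl[\,\text{every search costs at most } c_\alpha \lg n\,\bigr]
\;\ge\; 1 - \frac{d_\alpha}{n^{\alpha}} .
\]
```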
889 00:59:19 --> 00:59:22 But the bottom line is, we get to choose with what 890 00:59:22 --> 00:59:24 probability we want this to be true. 891 00:59:24 --> 00:59:28 If I want it to be true with probability one minus one 892 00:59:28 --> 00:59:32 over n^100, I can do that. I just set alpha to a hundred, 893 00:59:32 --> 00:59:37 and, up to this little constant that's going to grow much slower 894 00:59:37 --> 00:59:41 than n to the alpha, I get the error probability. 895 00:59:41 --> 00:59:45 So this thing is called the error probability. 896 00:59:45 --> 00:59:48 The probability that I fail is polynomially small, 897 00:59:48 --> 00:59:51 for any polynomial I want. Now, with the same data 898 00:59:51 --> 00:59:54 structure, right, I fixed the data structure. 899 00:59:54 --> 00:59:57 It doesn't depend on alpha. Anything you want, 900 00:59:57 --> 1:00:01.717 any alpha value you want, this data structure will take 901 1:00:01.717 --> 1:00:06.692 order of log n time. Now, this constant will depend 902 1:00:06.692 --> 1:00:08.666 on alpha. So, you know, 903 1:00:08.666 --> 1:00:14.141 if you want error probability one over n^100, it is probably going to 904 1:00:14.141 --> 1:00:17.461 be, like, 100 log n. It's still log n. 905 1:00:17.461 --> 1:00:22.128 OK, this is a very strong claim about the tail of the 906 1:00:22.128 --> 1:00:27.064 distribution of the running time of search, very strong. 907 1:00:27.064 --> 1:00:32 Let me give you an idea of how strong it is. 908 1:00:32 --> 1:00:36.731 How many people know what Boole's inequality is? 909 1:00:36.731 --> 1:00:42.671 How many people know what the union bound is in probability? 910 1:00:42.671 --> 1:00:45.691 You should. It's in Appendix C. 911 1:00:45.691 --> 1:00:49.214 Maybe you'll know it by the theorem. 912 1:00:49.214 --> 1:00:55.154 It's good to know it by name. It's sort of like linearity of 913 1:00:55.154 --> 1:00:58.476 expectations. It's a lot easier to 914 1:00:58.476 --> 1:01:03.978 communicate to someone.
Linearity of expectations: 915 1:01:03.978 --> 1:01:07.554 instead of saying, you know that thing where you 916 1:01:07.554 --> 1:01:11.51 sum up all the expectations of things, and that's the 917 1:01:11.51 --> 1:01:15.086 expectation of the sum? It's a lot easier to say 918 1:01:15.086 --> 1:01:18.815 linearity of expectation. So, let me quiz you in a 919 1:01:18.815 --> 1:01:21.706 different way. So, if I take a bunch of 920 1:01:21.706 --> 1:01:26.119 events, and I take their union, either this happens or this 921 1:01:26.119 --> 1:01:29.847 happens, or so on. So, this is the inclusive OR of 922 1:01:29.847 --> 1:01:31.521 k events. And, instead, 923 1:01:31.521 --> 1:01:37 I look at the sum of the probabilities of those events. 924 1:01:37 --> 1:01:40.111 OK, easy question: are these equal? 925 1:01:40.111 --> 1:01:42.947 No, unless they are mutually exclusive. 926 1:01:42.947 --> 1:01:47.248 But can I say anything about them, any relation? 927 1:01:47.248 --> 1:01:51.183 Smaller, yeah. The probability of the union is less than or equal to 928 1:01:51.183 --> 1:01:54.477 the sum of the probabilities. OK, this should be intuitive to 929 1:01:54.477 --> 1:01:57.771 you from a probability point of view. 930 1:01:57.771 --> 1:02:01.705 Look at the textbook. OK: very basic result, 931 1:02:01.705 --> 1:02:07.041 trivial result almost. What does this tell us? 932 1:02:07.041 --> 1:02:11.479 Well, suppose that E_i is some kind of error event. 933 1:02:11.479 --> 1:02:15.295 We don't want it to happen. OK, and suppose, 934 1:02:15.295 --> 1:02:19.467 mix some letters here. Suppose I have a bunch of 935 1:02:19.467 --> 1:02:23.017 events which occur with high probability. 936 1:02:23.017 --> 1:02:26.745 OK, call those E_i complement. So, suppose, 937 1:02:26.745 --> 1:02:31.893 so this is the end of that statement, E_i complement occurs 938 1:02:31.893 --> 1:02:37.063 with high probability. OK, so then the probability of 939 1:02:37.063 --> 1:02:39.609 E_i is very small, polynomially small.
940 1:02:39.609 --> 1:02:42.636 One over n to the alpha for any alpha I want. 941 1:02:42.636 --> 1:02:46.007 Now, suppose I take a whole bunch of these events, 942 1:02:46.007 --> 1:02:48.69 and let's say that k is polynomial in n. 943 1:02:48.69 --> 1:02:52.405 So, I take a bunch of events, which I'd like to happen. 944 1:02:52.405 --> 1:02:54.882 They all occur with high probability. 945 1:02:54.882 --> 1:02:57.565 There are only polynomially many of them. 946 1:02:57.565 --> 1:03:00.316 So let's say, let me give this constant a 947 1:03:00.316 --> 1:03:03 name. Let's call it c. 948 1:03:03 --> 1:03:05.873 Let's say I take n to the c such events. 949 1:03:05.873 --> 1:03:09.926 Well, what's the probability that all those events occur 950 1:03:09.926 --> 1:03:12.873 together? Because they should, most of the 951 1:03:12.873 --> 1:03:17.073 time, occur together, because each one occurs most of the 952 1:03:17.073 --> 1:03:19.578 time, occurs with high probability. 953 1:03:19.578 --> 1:03:23.115 So, I want to look at E_1 bar intersect E_2 bar, 954 1:03:23.115 --> 1:03:25.842 and so on. So, each of these occurs with 955 1:03:25.842 --> 1:03:29.378 high probability. What's the chance that they all 956 1:03:29.378 --> 1:03:32.166 occur? It's also with high 957 1:03:32.166 --> 1:03:34.316 probability. I'm changing the alpha. 958 1:03:34.316 --> 1:03:37.817 So, the union bound tells me the probability of any one of 959 1:03:37.817 --> 1:03:40.09 these failing, the probability of this 960 1:03:40.09 --> 1:03:42.608 failing, or this failing, or this failing, 961 1:03:42.608 --> 1:03:44.573 which is this thing, is, at most, 962 1:03:44.573 --> 1:03:47.276 the sum of the probabilities of each failure. 963 1:03:47.276 --> 1:03:49.303 These are the error probabilities. 964 1:03:49.303 --> 1:03:52.619 I know that each of them is, at most, one over n to the 965 1:03:52.619 --> 1:03:55.875 alpha, with a constant in front.
If I add them all up, 966 1:03:55.875 --> 1:03:57.779 there's only n to the c of them. 967 1:03:57.779 --> 1:04:01.034 So, I take this error probability, and I multiply by n 968 1:04:01.034 --> 1:04:05.4 to the c. So, I get like n to the c over 969 1:04:05.4 --> 1:04:08.679 n to the alpha, which is one over n to the 970 1:04:08.679 --> 1:04:11.96 alpha minus c. I can set alpha as big as I 971 1:04:11.96 --> 1:04:13.88 want. So, I set it much, 972 1:04:13.88 --> 1:04:17.88 much bigger than c, and this event occurs with high 973 1:04:17.88 --> 1:04:21 probability. I sort of made a mess here, 974 1:04:21 --> 1:04:25.719 but this event occurs with high probability because of this. 975 1:04:25.719 --> 1:04:30.599 Whatever the constant is here, however many events I'm taking, 976 1:04:30.599 --> 1:04:35 I just set alpha to be bigger than that. 977 1:04:35 --> 1:04:37.951 And, this event will occur with high probability, 978 1:04:37.951 --> 1:04:40.041 too. So, when I say here that every 979 1:04:40.041 --> 1:04:42.992 search costs order log n with high probability, 980 1:04:42.992 --> 1:04:46.005 not only do I mean that if you look at one search, 981 1:04:46.005 --> 1:04:48.587 it costs order log n with high probability. 982 1:04:48.587 --> 1:04:51.969 You look at another search, and it costs log n with high 983 1:04:51.969 --> 1:04:54.244 probability. I mean, if you take every 984 1:04:54.244 --> 1:04:57.318 search, all of them take order log n time with high 985 1:04:57.318 --> 1:04:59.593 probability. So, this event that every 986 1:04:59.593 --> 1:05:03.036 single search you do takes order log n, is true with high 987 1:05:03.036 --> 1:05:06.663 probability, assuming the number of searches you are doing is 988 1:05:06.663 --> 1:05:10.887 polynomial in n. So, I'm assuming that I'm not 989 1:05:10.887 --> 1:05:14.467 using this data structure forever, just for a polynomial 990 1:05:14.467 --> 1:05:17.136 amount of time.
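Written out, the union-bound step above is roughly the following. This is a sketch that assumes each error event E_i has probability at most 1/n^α and drops the constant factor mentioned in the lecture:

```latex
\Pr\Big[\bigcup_{i=1}^{n^c} E_i\Big]
  \;\le\; \sum_{i=1}^{n^c} \Pr[E_i]
  \;\le\; n^c \cdot \frac{1}{n^{\alpha}}
  \;=\; \frac{1}{n^{\alpha - c}}
\quad\Longrightarrow\quad
\Pr\Big[\bigcap_{i=1}^{n^c} \overline{E_i}\Big] \;\ge\; 1 - \frac{1}{n^{\alpha - c}} .
```

Since α can be chosen as large as needed, α − c can be made any desired constant, so the intersection still holds with high probability.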
But, who's got more than a 991 1:05:17.136 --> 1:05:19.218 polynomial amount of time anyway? 992 1:05:19.218 --> 1:05:21.757 This is MIT. So, hopefully that's clear. 993 1:05:21.757 --> 1:05:24.035 We'll see it a few more times. Yeah? 994 1:05:24.035 --> 1:05:26.443 The algorithm doesn't depend on alpha. 995 1:05:26.443 --> 1:05:31 The question is how do you choose alpha in the algorithm. 996 1:05:31 --> 1:05:33.925 So, we don't need to. This is just sort of an 997 1:05:33.925 --> 1:05:36.668 analysis tool. This is saying that the farther 998 1:05:36.668 --> 1:05:39.838 out you get, so you say, well, what's the probability 999 1:05:39.838 --> 1:05:43.19 that it's more than ten log n? Well, it's like one over n^10. 1000 1:05:43.19 --> 1:05:46.238 Let's say it's linear. Well, what's the chance that 1001 1:05:46.238 --> 1:05:49.407 you're more than 20 log n? Well that's one over n^20. 1002 1:05:49.407 --> 1:05:52.942 So, the point is the tail of this distribution is getting 1003 1:05:52.942 --> 1:05:54.466 really small, really fast. 1004 1:05:54.466 --> 1:05:57.758 And so choosing alpha is more like sort of for your own 1005 1:05:57.758 --> 1:06:00.135 feeling good. OK, you can set it to 100, 1006 1:06:00.135 --> 1:06:05.209 and then n is at least two. So, that's like one over 2^100 1007 1:06:05.209 --> 1:06:08.082 chance that you fail. That's damn small. 1008 1:06:08.082 --> 1:06:11.322 If you've got a real random number generator, 1009 1:06:11.322 --> 1:06:15.668 the chance that you're going to hit one over 2^200 is pretty 1010 1:06:15.668 --> 1:06:18.762 tiny, right? So, let's say you set alpha to 1011 1:06:18.762 --> 1:06:21.266 256, which is always a good number. 1012 1:06:21.266 --> 1:06:25.759 2^256 is much bigger than the number of particles in the known 1013 1:06:25.759 --> 1:06:29 universe, so, the light matter. 1014 1:06:29 --> 1:06:32.898 So, actually I think this even accounts for some notion of dark 1015 1:06:32.898 --> 1:06:34.533 matter.
So, this is really, 1016 1:06:34.533 --> 1:06:37.615 really, really big. So, the chance that you pick a 1017 1:06:37.615 --> 1:06:41.576 random particle in the universe that happens to be your favorite 1018 1:06:41.576 --> 1:06:45.161 particle, this one right here, that's one over 2^256, 1019 1:06:45.161 --> 1:06:47.487 or even smaller. So, set alpha to 256, 1020 1:06:47.487 --> 1:06:51.26 the chance that your algorithm takes more than order log n time 1021 1:06:51.26 --> 1:06:54.907 is a lot smaller than the chance that a meteor strikes your 1022 1:06:54.907 --> 1:06:58.68 computer at the same time that it has a floating point error, 1023 1:06:58.68 --> 1:07:02.642 at the same time that the earth explodes because they're putting 1024 1:07:02.642 --> 1:07:06.415 a transport through this part of the solar system at the same 1025 1:07:06.415 --> 1:07:08.113 time, I mean, I could go on, 1026 1:07:08.113 --> 1:07:10.752 right? It's really, 1027 1:07:10.752 --> 1:07:13.51 really unlikely that you are more than log n. 1028 1:07:13.51 --> 1:07:15.705 And how unlikely: you get to choose. 1029 1:07:15.705 --> 1:07:19.467 But it's just in the analysis; the algorithm doesn't depend on 1030 1:07:19.467 --> 1:07:21.159 it. It's the same algorithm, 1031 1:07:21.159 --> 1:07:23.04 very cool. Sometimes, with high 1032 1:07:23.04 --> 1:07:25.297 probability bounds, it depends on alpha. 1033 1:07:25.297 --> 1:07:27.68 I mean, the algorithm depends on alpha. 1034 1:07:27.68 --> 1:07:32.307 But here, it will not. OK, away we go. 1035 1:07:32.307 --> 1:07:37.692 So now you all understand the claim. 1036 1:07:37.692 --> 1:07:45.384 So let's do a warm up. We will also need this fact. 1037 1:07:45.384 --> 1:07:52.769 But it's pretty easy. The lemma is that with high 1038 1:07:52.769 --> 1:08:01.692 probability, the number of levels in the skip list is order 1039 1:08:01.692 --> 1:08:06.266 log n. I think it's order log n, 1040 1:08:06.266 --> 1:08:09.349 certainly.
So, how do we prove that 1041 1:08:09.349 --> 1:08:12.613 something happens with high probability? 1042 1:08:12.613 --> 1:08:18.144 Compute the probability that it happened; show that it's high. 1043 1:08:18.144 --> 1:08:22.677 Even if you don't know what high probability means, 1044 1:08:22.677 --> 1:08:26.122 in fact, I used to ask that earlier on. 1045 1:08:26.122 --> 1:08:30.746 So, let's compute the chance that it doesn't happen, 1046 1:08:30.746 --> 1:08:35.551 the error probability, because that's just one minus 1047 1:08:35.551 --> 1:08:39.449 the claim. So, I'd like to say, 1048 1:08:39.449 --> 1:08:42.71 let's say, that it's, at most, c log n levels. 1049 1:08:42.71 --> 1:08:46.115 So, what's the error probability for that event? 1050 1:08:46.115 --> 1:08:50.028 This is sort of an event. I'll put it in squiggles, just 1051 1:08:50.028 --> 1:08:53 like a set. This is the probability that 1052 1:08:53 --> 1:08:56.26 there are strictly greater than c log n levels. 1053 1:08:56.26 --> 1:09:00.173 So, I want to say that that probability is particularly 1054 1:09:00.173 --> 1:09:04.683 small, polynomially small. Well, how do I make levels? 1055 1:09:04.683 --> 1:09:07.552 When I insert an element, with probability a half, 1056 1:09:07.552 --> 1:09:09.984 it goes up. And, the number of levels in 1057 1:09:09.984 --> 1:09:13.726 the skip list is the max over all the elements of how high it 1058 1:09:13.726 --> 1:09:15.035 goes up. But, max, oh, 1059 1:09:15.035 --> 1:09:17.779 that's a mess. All right, you can compute the 1060 1:09:17.779 --> 1:09:21.022 expectation of the max if you have a bunch of random 1061 1:09:21.022 --> 1:09:24.202 variables; each one's expectation is a constant, 1062 1:09:24.202 --> 1:09:26.759 and you take the max. It's like log n in 1063 1:09:26.759 --> 1:09:31 expectation, but we want a much stronger statement.
1064 1:09:31 --> 1:09:35.815 And, we have this Boole's inequality that says I have a 1065 1:09:35.815 --> 1:09:39.472 bunch of things, polynomially many things. 1066 1:09:39.472 --> 1:09:43.842 Let's say we have n items. Each one independently, 1067 1:09:43.842 --> 1:09:47.142 I don't even care if it's independent. 1068 1:09:47.142 --> 1:09:52.582 If it goes up more than c log n, yeah, the number of levels is 1069 1:09:52.582 --> 1:09:55.258 more than c log n. So, this is, 1070 1:09:55.258 --> 1:10:00.163 at most, and then I want to know, do any of those events 1071 1:10:00.163 --> 1:10:03.017 happen for any of the n elements? 1072 1:10:03.017 --> 1:10:06.762 So, I just multiply by n. It's certainly, 1073 1:10:06.762 --> 1:10:10.597 at most, n times the probability that x gets 1074 1:10:10.597 --> 1:10:15.502 promoted, this much here, greater than or equal to c log n 1075 1:10:15.502 --> 1:10:18.734 times. OK, if I pick, 1076 1:10:18.734 --> 1:10:21.041 for any element, x, because it's the same for 1077 1:10:21.041 --> 1:10:23.191 each element. They are done independently. 1078 1:10:23.191 --> 1:10:26.179 So, I'm just summing over x here, and that's just a factor 1079 1:10:26.179 --> 1:10:26.756 of n. Clear? 1080 1:10:26.756 --> 1:10:29.588 This is Boole's inequality. Now, what's the probability 1081 1:10:29.588 --> 1:10:32 that x gets promoted c log n times? 1082 1:10:32 --> 1:10:36.646 We did this before for log n. It was one over n. 1083 1:10:36.646 --> 1:10:40.305 For c log n, it's one over n to the c. 1084 1:10:40.305 --> 1:10:44.161 OK, so this is n times, let's be nicer, 1085 1:10:44.161 --> 1:10:47.324 one half to the power of c log n. 1086 1:10:47.324 --> 1:10:53.257 One half to the power of c log n is one over two to the c log 1087 1:10:53.257 --> 1:10:55.926 n. The log n comes out here, 1088 1:10:55.926 --> 1:10:58.991 becomes an n. We get n to the c.
1089 1:10:58.991 --> 1:11:05.022 So, this is n divided by n to the c, which is n to the c minus 1090 1:11:05.022 --> 1:11:09.904 one. And, I get to choose c to be 1091 1:11:09.904 --> 1:11:14.676 whatever I want. So, I choose c minus one to be 1092 1:11:14.676 --> 1:11:17.477 alpha. I think exactly that. 1093 1:11:17.477 --> 1:11:21.626 Oh, sorry, one over n to the c minus one. 1094 1:11:21.626 --> 1:11:24.634 Thank you. It better be small. 1095 1:11:24.634 --> 1:11:30.236 This is an upper bound. So, the probability is polynomially 1096 1:11:30.236 --> 1:11:32.956 small. I get to choose, 1097 1:11:32.956 --> 1:11:36.484 and this is a bit of the trick. I'm choosing this constant to 1098 1:11:36.484 --> 1:11:38.397 be large, large enough for alpha. 1099 1:11:38.397 --> 1:11:40.61 The point is, as c grows, alpha grows. 1100 1:11:40.61 --> 1:11:43.48 Therefore, I can set alpha to be whatever I want, 1101 1:11:43.48 --> 1:11:46.29 and set c accordingly. So, there's a little bit more 1102 1:11:46.29 --> 1:11:49.459 words that have to go here. But, they're in the notes. 1103 1:11:49.459 --> 1:11:51.851 I can set alpha to be as large as I want. 1104 1:11:51.851 --> 1:11:55.199 So, I can make this probability as small as I want in the 1105 1:11:55.199 --> 1:11:56.993 polynomial sense. So, that's it. 1106 1:11:56.993 --> 1:11:58.727 Number of levels, order log n: 1107 1:11:58.727 --> 1:12:02.224 wasn't that easy? Boole's inequality: 1108 1:12:02.224 --> 1:12:06.026 the point is that when you're dealing with high probability, 1109 1:12:06.026 --> 1:12:09.377 use Boole's inequality. And, anything that's true for 1110 1:12:09.377 --> 1:12:12.664 one element is true for all of them, just like that. 1111 1:12:12.664 --> 1:12:15.886 You just lose a factor of n, but that's just one off the 1112 1:12:15.886 --> 1:12:18.271 alpha, and alpha is big: a constant, 1113 1:12:18.271 --> 1:12:21.106 but it's big. OK, so let's prove the theorem.
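The lemma's arithmetic can be checked numerically. The following is a minimal sketch, not part of the lecture: the function name `levels_tail` is made up for illustration, logs are base 2, and "promoted at least c lg n times" is taken to mean c lg n heads in a row. It compares the exact failure probability (which does use independence across elements) with the union bound n · (1/2)^(c lg n) = 1/n^(c−1).

```python
import math

def levels_tail(n, c):
    """Return (exact, union_bound) for P(some element is promoted >= c*lg n times).

    Hypothetical helper for illustration; assumes fair, independent coin flips.
    """
    k = math.ceil(c * math.log2(n))      # promotions needed to exceed the level budget
    p_one = 0.5 ** k                     # one element: k heads in a row
    # Exact value 1 - (1 - p_one)^n, computed stably for tiny p_one.
    exact = -math.expm1(n * math.log1p(-p_one))
    union_bound = n * p_one              # Boole's inequality; needs no independence
    return exact, union_bound

for n in (16, 1024, 2 ** 20):
    exact, bound = levels_tail(n, c=3)
    # With c = 3 the bound equals 1 / n^(c-1) = 1 / n^2.
    print(n, exact, bound, n ** -2)
```

The union bound is only slightly looser than the exact value here, which is why the lecture can afford to ignore independence between elements.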
1114 1:12:21.106 --> 1:12:23.813 With high probability, searches cost order log n. 1115 1:12:23.813 --> 1:12:27.422 We now know the height is order log n, but it depends how 1116 1:12:27.422 --> 1:12:32.756 balanced this thing is. It depends how long the chains 1117 1:12:32.756 --> 1:12:36.8 are to really know that a search costs log n. 1118 1:12:36.8 --> 1:12:41.21 Just knowing a bound on the height is not enough, 1119 1:12:41.21 --> 1:12:45.805 unlike a binary tree. So, we have one cool idea for 1120 1:12:45.805 --> 1:12:49.389 this analysis. And it's called backwards 1121 1:12:49.389 --> 1:12:52.697 analysis. So, normally you think of a 1122 1:12:52.697 --> 1:12:58.21 search as starting in the top left corner going right and down 1123 1:12:58.21 --> 1:13:04 until you get to the item that you're looking for. 1124 1:13:04 --> 1:13:07.423 I'm going to look at the reverse process. 1125 1:13:07.423 --> 1:13:12.558 You start at the item you're looking for, and you go left and 1126 1:13:12.558 --> 1:13:15.896 up until you get to the top left corner. 1127 1:13:15.896 --> 1:13:20.175 The number of steps in those two walks is the same. 1128 1:13:20.175 --> 1:13:23.855 And, I'm not implementing an algorithm here, 1129 1:13:23.855 --> 1:13:27.792 I'm just doing analysis. So, those are the same 1130 1:13:27.792 --> 1:13:32.671 processes, just in reverse. So, here's what it looks like. 1131 1:13:32.671 --> 1:13:35.409 You have a search, and it starts, 1132 1:13:35.409 --> 1:13:42 which really means that it ends, at a node in the bottom list. 1133 1:13:42 --> 1:13:46.845 Then, each time you visit a node in this search, 1134 1:13:46.845 --> 1:13:52.618 you either go left or up. And, when do you go left or up? 1135 1:13:52.618 --> 1:13:56.639 Well, it depends on what the coin flip was. 1136 1:13:56.639 --> 1:14:02 So, if the node wasn't promoted at this level.
1137 1:14:02 --> 1:14:08.317 So, if it wasn't promoted higher, and that happened 1138 1:14:08.317 --> 1:14:14.003 exactly when we got a tails, then we go left, 1139 1:14:14.003 --> 1:14:19.057 which really means we came from the left. 1140 1:14:19.057 --> 1:14:25.754 Or, if we got a heads, so if this node was promoted to 1141 1:14:25.754 --> 1:14:31.44 the next level, which happened whenever we got 1142 1:14:31.44 --> 1:14:37 a heads at that particular moment. 1143 1:14:37 --> 1:14:42.86 This is in the past some time when we did the insertion. 1144 1:14:42.86 --> 1:14:45.844 Then we go, or came from, up. 1145 1:14:45.844 --> 1:14:51.704 And, we stop at the root. This is really where we start; 1146 1:14:51.704 --> 1:14:55.967 same thing. So, either at the root or I'm 1147 1:14:55.967 --> 1:15:03 also going to think of this as stopping at minus infinity. 1148 1:15:03 --> 1:15:05.562 OK, that was a bit messy, but let me review. 1149 1:15:05.562 --> 1:15:08.602 So, normally we start up here. We're just looking at 1150 1:15:08.602 --> 1:15:11.344 everything backwards, and in brackets is what's 1151 1:15:11.344 --> 1:15:13.966 really happening. So, this search ends at the 1152 1:15:13.966 --> 1:15:17.364 node you were looking for. It's always in the bottom list. 1153 1:15:17.364 --> 1:15:19.807 Then it says, well, was this node promoted 1154 1:15:19.807 --> 1:15:21.953 higher? If it was, I came from above. 1155 1:15:21.953 --> 1:15:25.41 If not, I came from the left. It must have been in the bottom 1156 1:15:25.41 --> 1:15:28.033 chain somewhere. OK, and that's true at every 1157 1:15:28.033 --> 1:15:31.87 node you visit. It depends whether that coin 1158 1:15:31.87 --> 1:15:35.806 flipped heads or tails at the time that you inserted that node 1159 1:15:35.806 --> 1:15:38.774 into that level. But, these are just a bunch of 1160 1:15:38.774 --> 1:15:40.774 events.
I'm just going to check, 1161 1:15:40.774 --> 1:15:44.258 what is the probability that it's heads, and what is the 1162 1:15:44.258 --> 1:15:47.096 probability that it's tails? It's always a half. 1163 1:15:47.096 --> 1:15:50.516 Every time I look at a coin flip, when it was flipped, 1164 1:15:50.516 --> 1:15:54 there was a probability of a half of going either way. 1165 1:15:54 --> 1:15:56.967 That's the magic. And, I'm not using that these 1166 1:15:56.967 --> 1:16:02.248 events are independent in any way. For every element that I search 1167 1:16:02.248 --> 1:16:05.584 for, for every value, x, that's another search. 1168 1:16:05.584 --> 1:16:08.123 Those events may not be independent. 1169 1:16:08.123 --> 1:16:12.112 I can still use Boole's inequality and conclude that all 1170 1:16:12.112 --> 1:16:15.375 of them are order log n with high probability, 1171 1:16:15.375 --> 1:16:19.582 as long as I can prove that any one event happens with high 1172 1:16:19.582 --> 1:16:22.556 probability. So, I don't need independence 1173 1:16:22.556 --> 1:16:26.835 between, I know that these coin flips in a single search are 1174 1:16:26.835 --> 1:16:30.969 independent, but everything else, for different searches, I 1175 1:16:30.969 --> 1:16:35.803 don't care. So, how long can this process 1176 1:16:35.803 --> 1:16:39.283 go on? We want to know how many times 1177 1:16:39.283 --> 1:16:44.31 can I make this walk? Well, when I hit the root node, 1178 1:16:44.31 --> 1:16:47.983 I'm done. Well, how quickly would I hit 1179 1:16:47.983 --> 1:16:51.56 the root node? Well, with probability 1180 1:16:51.56 --> 1:16:57.069 a half, I go up each step. The number of times I go up is, 1181 1:16:57.069 --> 1:17:02 at most, the number of levels minus one. 1182 1:17:02 --> 1:17:05.41 And that's order log n with high probability. 1183 1:17:05.41 --> 1:17:07.813 So, this is the only other idea. 1184 1:17:07.813 --> 1:17:10.682 So, we are now proving this theorem.
1185 1:17:10.682 --> 1:17:15.333 So, the number of up moves in a search, which are really down 1186 1:17:15.333 --> 1:17:19.054 moves, but same thing, is less than the number of 1187 1:17:19.054 --> 1:17:22 levels. Certainly, you can't go up more 1188 1:17:22 --> 1:17:24.713 than there are levels in the search. 1189 1:17:24.713 --> 1:17:27.968 In an insert, you can go arbitrarily high. 1190 1:17:27.968 --> 1:17:32 But in a search, that's as high as you can go. 1191 1:17:32 --> 1:17:34.821 And this is, at most, c log n with high 1192 1:17:34.821 --> 1:17:37.866 probability. This is what we proved in the 1193 1:17:37.866 --> 1:17:40.242 lemma. So, we have a bound on the 1194 1:17:40.242 --> 1:17:42.99 number of up moves. Half of the moves, 1195 1:17:42.99 --> 1:17:45.44 roughly, are going to be up moves. 1196 1:17:45.44 --> 1:17:49.004 So, this pretty much bounds the number of moves. 1197 1:17:49.004 --> 1:17:51.752 Not quite. So, what this means is that 1198 1:17:51.752 --> 1:17:54.797 with high probability, so this is the same 1199 1:17:54.797 --> 1:17:58.955 probability, but I could choose that as high as I want by 1200 1:17:58.955 --> 1:18:03.553 setting c large enough, the number of moves, 1201 1:18:03.553 --> 1:18:06.893 in other words, the cost of the search, is at 1202 1:18:06.893 --> 1:18:11.32 most the number of coin flips until we get c log n heads, 1203 1:18:11.32 --> 1:18:15.747 right, because in every step of the search, I make a move, 1204 1:18:15.747 --> 1:18:19.009 and then I flip another coin, conceptually. 1205 1:18:19.009 --> 1:18:22.504 There is another independent coin lying there. 1206 1:18:22.504 --> 1:18:27.165 And it's either heads or tails. Each of those is independent. 1207 1:18:27.165 --> 1:18:31.902 So, how many independent coin flips does it take until I get c 1208 1:18:31.902 --> 1:18:37.206 log n heads? The claim is that that's order 1209 1:18:37.206 --> 1:18:42.979 log n with high probability. But we need to prove that.
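Before the proof, the coin-flip claim can be sanity-checked by simulation. This is only an illustrative sketch under stated assumptions: the helper name `flips_until_heads`, the seed, the choice n = 1024, c = 2, and the comparison against 10 · c lg n flips are all picked here for illustration rather than taken from the lecture notes.

```python
import math
import random

def flips_until_heads(target_heads, rng):
    """Flip a fair coin until `target_heads` heads have appeared; return total flips.

    Hypothetical helper: each flip models one step of the backward walk.
    """
    flips = heads = 0
    while heads < target_heads:
        flips += 1
        if rng.random() < 0.5:   # heads: the backward walk moves up a level
            heads += 1
        # tails: the walk moves left, costing a step without gaining height
    return flips

rng = random.Random(0)           # fixed seed so the run is repeatable
n, c = 1024, 2
target = c * int(math.log2(n))   # c lg n = 20 heads
worst = max(flips_until_heads(target, rng) for _ in range(1000))
print("worst of 1000 walks:", worst, "budget 10*c*lg n:", 10 * target)
```

A simulation like this proves nothing about the tail, of course; it only suggests that the number of tails stays within a constant factor of the number of heads, which is what the calculation that follows makes precise.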
1210 1:18:42.979 --> 1:18:48.324 So, this is a claim. So, if you just sit there with 1211 1:18:48.324 --> 1:18:55.058 a coin, and you want to know how many flips does it take until I 1212 1:18:55.058 --> 1:19:00.082 get c log n heads, the claim is that that number 1213 1:19:00.082 --> 1:19:05 is order log n with high probability. 1214 1:19:05 --> 1:19:08.595 As long as I prove that, I know that the total number of 1215 1:19:08.595 --> 1:19:11.276 steps I make, which is the number of heads 1216 1:19:11.276 --> 1:19:15.394 and tails, is order log n, because I definitely know the number of 1217 1:19:15.394 --> 1:19:17.094 heads is, at most, c log n. 1218 1:19:17.094 --> 1:19:21.147 The claim is that the number of tails can't be too much bigger. 1219 1:19:21.147 --> 1:19:23.174 Notice, I can't just say c here. 1220 1:19:23.174 --> 1:19:25.985 OK, it's really important that I have log n. 1221 1:19:25.985 --> 1:19:28.208 Why? Because with high probability, 1222 1:19:28.208 --> 1:19:32 it depends on n. This notion depends on n. 1223 1:19:32 --> 1:19:35.434 Log n: it's true. Anything bigger than log n: 1224 1:19:35.434 --> 1:19:38.087 it's true, like n. If I put n here, 1225 1:19:38.087 --> 1:19:41.756 this is also true. But, if I put a constant or a 1226 1:19:41.756 --> 1:19:46.126 log log n, this is not true. It's really important that I 1227 1:19:46.126 --> 1:19:50.185 have log n here because my notion of high probability 1228 1:19:50.185 --> 1:19:54.321 depends on what's written here. OK, is it clear so far? 1229 1:19:54.321 --> 1:19:57.912 We're almost done, which is good because I just 1230 1:19:57.912 --> 1:20:01.19 ran out of time. Sorry, we're going to go a 1231 1:20:01.19 --> 1:20:07.528 couple minutes over. So, I want to compute the error 1232 1:20:07.528 --> 1:20:12.308 probability here. So, I want to compute the 1233 1:20:12.308 --> 1:20:17.886 probability that there are fewer than c log n heads. 1234 1:20:17.886 --> 1:20:23.691 Let me skip this step.
So, I will be approximate and 1235 1:20:23.691 --> 1:20:29.382 say, what's the probability that there are, at most, 1236 1:20:29.382 --> 1:20:33.923 c log n heads? So, I need to say how many 1237 1:20:33.923 --> 1:20:37.549 coins we are flipping here for what this event is. 1238 1:20:37.549 --> 1:20:40.139 So, I need to specify this constant. 1239 1:20:40.139 --> 1:20:42.729 Let's say we flip ten c log n coins. 1240 1:20:42.729 --> 1:20:47.169 Now I want to look at the error probability under that event: 1241 1:20:47.169 --> 1:20:51.312 the probability that there are at most c log n heads among 1242 1:20:51.312 --> 1:20:55.382 those ten c log n flips. So, the claim is this should be 1243 1:20:55.382 --> 1:20:58.416 pretty small. It's going to depend on ten. 1244 1:20:58.416 --> 1:21:01.672 Then I'll choose ten to be arbitrarily large, 1245 1:21:01.672 --> 1:21:05.076 and I'll be done, OK, make my life a little bit 1246 1:21:05.076 --> 1:21:10.054 easier. Well, I would ask you normally, 1247 1:21:10.054 --> 1:21:15.77 but this is 6.042 material. So, what's the probability that 1248 1:21:15.77 --> 1:21:19.021 we have, at most, this many heads? 1249 1:21:19.021 --> 1:21:23.653 Well, that means that nine c log n of the coins, 1250 1:21:23.653 --> 1:21:29.368 because there are ten c log n flips, c log n heads at most, 1251 1:21:29.368 --> 1:21:34 nine c log n at least had better be tails. 1252 1:21:34 --> 1:21:37.149 So this is the probability that all those other guys become 1253 1:21:37.149 --> 1:21:39.104 tails, which is already pretty small. 1254 1:21:39.104 --> 1:21:41.33 And then, there is this permutation thing. 1255 1:21:41.33 --> 1:21:44.533 So, if I had exactly c log n heads, this would be the number 1256 1:21:44.533 --> 1:21:47.574 of ways to rearrange c log n heads among ten c log n coin 1257 1:21:47.574 --> 1:21:49.475 flips. OK, that's just the number of 1258 1:21:49.475 --> 1:21:51.375 permutations.
So, this is a bit big, 1259 1:21:51.375 --> 1:21:53.601 which is kind of annoying. This is really, 1260 1:21:53.601 --> 1:21:55.665 really small. The claim is this is much 1261 1:21:55.665 --> 1:21:58 smaller than that is big. 1262 1:22:14 --> 1:22:14 1263 1:22:14 --> 1:22:18.548 So, this is just some math. I'm going to whiz through it. 1264 1:22:18.548 --> 1:22:21.39 So, you don't have to stay too long. 1265 1:22:21.39 --> 1:22:26.02 But you should go over it. You should know that y choose x 1266 1:22:26.02 --> 1:22:30 is, at most, (ey over x) to the x, good fact. 1267 1:22:30 --> 1:22:35.033 Therefore, this is, at most, ten c log n over c log 1268 1:22:35.033 --> 1:22:38.456 n, also known as ten. These cancel. 1269 1:22:38.456 --> 1:22:43.691 There's an e out here. And then I raise that to the c 1270 1:22:43.691 --> 1:22:48.02 log n power. OK, then I divide by two to the 1271 1:22:48.02 --> 1:22:51.946 power nine c log n. OK, so what's this? 1272 1:22:51.946 --> 1:22:57.986 This is (e times ten) to the c log n divided by two to the nine 1273 1:22:57.986 --> 1:23:02.355 c log n. OK, claim this is very big. 1274 1:23:02.355 --> 1:23:06.367 This is not so big, because I have a nine here. 1275 1:23:06.367 --> 1:23:09.769 So, let's work it out. This e times ten, 1276 1:23:09.769 --> 1:23:13.345 that's a good number, we can put upstairs. 1277 1:23:13.345 --> 1:23:17.096 So, we get two to the log of e times ten, ten times e, 1278 1:23:17.096 --> 1:23:21.109 and then c log n. And then, we have over two to 1279 1:23:21.109 --> 1:23:25.121 the nine c log n. So, we have this two to the c 1280 1:23:25.121 --> 1:23:31.946 log n in both cases. So, this is two to the (log of 1281 1:23:31.946 --> 1:23:38.669 ten e, minus nine) times c log n: some basic algebra. 1282 1:23:38.669 --> 1:23:43.199 So, I'm going to set, not quite. 1283 1:23:43.199 --> 1:23:49.338 This is one over two to the (nine minus log of ten e) times c log n: 1284 1:23:49.338 --> 1:23:58.253 so, just inverting everything here, negating the sign in here.
1285 1:23:58.253 --> 1:24:06 And, this is my alpha because the rest is n. 1286 1:24:06 --> 1:24:09.903 So, this is one over n to the alpha when alpha is this 1287 1:24:09.903 --> 1:24:13.291 particular value: (nine minus log of ten times e) 1288 1:24:13.291 --> 1:24:16.09 times c. It's a bit of a strange thing. 1289 1:24:16.09 --> 1:24:19.184 But, the point is, as ten goes to infinity, 1290 1:24:19.184 --> 1:24:22.424 nine here is the number one smaller than ten, 1291 1:24:22.424 --> 1:24:24.855 right? We subtracted one somewhere 1292 1:24:24.855 --> 1:24:27.949 along the way. So, as ten goes to infinity, 1293 1:24:27.949 --> 1:24:32 this is basically, this is ten minus one. 1294 1:24:32 --> 1:24:35.1 This is log of ten times e. e doesn't really matter. 1295 1:24:35.1 --> 1:24:37.531 The point is, this is logarithmic in ten. 1296 1:24:37.531 --> 1:24:40.692 This is linear in ten. The thing that's linear in ten 1297 1:24:40.692 --> 1:24:44.035 is much bigger than the thing that's logarithmic in ten. 1298 1:24:44.035 --> 1:24:45.919 This is called abuse of notation. 1299 1:24:45.919 --> 1:24:48.958 OK, as ten goes to infinity, this goes to infinity, 1300 1:24:48.958 --> 1:24:51.329 gets bigger. And, there is a c out here. 1301 1:24:51.329 --> 1:24:54.794 But, for any value of c that you want, whatever value of c 1302 1:24:54.794 --> 1:24:58.015 you wanted in that claim, I can make alpha arbitrarily 1303 1:24:58.015 --> 1:25:00.629 large by changing the constant in the big O, 1304 1:25:00.629 --> 1:25:04.812 which here was ten. OK, so that claim is true with 1305 1:25:04.812 --> 1:25:07.652 high probability. Whatever probability you want, 1306 1:25:07.652 --> 1:25:10.673 which tells you alpha, you set the constant in front of 1307 1:25:10.673 --> 1:25:13.089 the log n to be this number, which grows, 1308 1:25:13.089 --> 1:25:15.929 and you're done.
You get the claim that to get order 1309 1:25:15.929 --> 1:25:19.312 log n heads takes order log n flips with high probability. 1310 1:25:19.312 --> 1:25:21.548 Therefore, the number of steps in the 1311 1:25:21.548 --> 1:25:24.146 search is order log n with high probability. 1312 1:25:24.146 --> 1:25:26.14 Really cool stuff; read the notes. 1313 1:25:26.14 --> 1:25:29 Sorry I went so fast at the end.
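The final calculation can be checked against exact binomial numbers. This is a minimal sketch under assumptions made here, not in the lecture: logs are base 2, n = 1024 and c = 1 are arbitrary choices, and the function names `few_heads_probability` and `lecture_bound` are hypothetical. It compares the exact probability of at most c lg n heads in 10 c lg n flips with the bound (10e)^(c lg n) / 2^(9 c lg n) = 1/n^α, where α = (9 − lg(10e)) · c.

```python
import math

def few_heads_probability(flips, max_heads):
    """Exact P(at most `max_heads` heads in `flips` fair coin flips)."""
    favourable = sum(math.comb(flips, i) for i in range(max_heads + 1))
    return favourable / 2 ** flips

def lecture_bound(heads, factor=10):
    """(e * factor)^heads / 2^((factor - 1) * heads), the bound from the derivation."""
    return (math.e * factor) ** heads / 2.0 ** ((factor - 1) * heads)

n, c = 1024, 1
heads = c * int(math.log2(n))                 # c lg n = 10
exact = few_heads_probability(10 * heads, heads)
bound = lecture_bound(heads)
alpha = (9 - math.log2(10 * math.e)) * c      # exponent alpha from the derivation
print(exact, bound, n ** -alpha)
```

The bound and 1/n^α agree up to rounding, and the exact probability is smaller still, consistent with the lecture's remark that the estimate is deliberately approximate.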