Description: In this lecture, Professor Devadas introduces divide-and-conquer algorithms and problems that can be solved using divide-and-conquer approaches.
Instructor: Srinivas Devadas
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
PROFESSOR: All right. Good morning, everyone. Let's get started. So we're going to start 6.046 in earnest today. We're going to start with our first module on divide and conquer.
You've all seen divide and conquer algorithms before. Merge sort is a classic divide and conquer algorithm. I'm going to spend just a couple minutes talking about the paradigm, give you a slightly more general setting than merge sort. And then we'll get into two really cool divide and conquer problems in the sense that these are problems for which divide and conquer works very well-- mainly, convex hall and median finding.
So before I get started on the material, let me remind you that you should be signing up for a recitation section on Stellar. And please do that even if you don't plan on attending sections. Because we need that so we can assign your problem sets to be graded, OK?
So that's our way of partitioning problem sets as well. And then the other thing is problem set one is going to go out today. And that it's a one week problem set. All problem sets are going to be a week in duration. Please read these problem sets the day that they come out. Spend 5, 10 minutes reading them.
Some things are going to look like they're magic, that they're-- how could I possibly prove this? If you think about it for a bit, it'll become obvious. We promise you that. But get started early. Don't get started at 7:00 PM when we have 11:59 PM deadline on Thursday, all right? That four hours or five hours of time may not be enough to go from magical to obvious, OK?
So let's get started with the paradigm associated with divide and conquer. It's just a beautiful notion that you can break up the problem into smaller parts and somehow compose the solutions to the smaller parts. And of course, the details are going to be what's important when we take a particular problem instance.
But let's say we're given a problem of size n. We're going to divide it into a sub problems-- I'll put that in quotes so you know it's a symbol-- a sub problems of size n over b. And here, a is an integer. And a is going to be greater than or equal to 1. It could be two. It could be three. It could be four.
This is the generalization I alluded to. And b does not have to be two or even an integer. But it has to be strictly greater than one. Otherwise, there's no notion of divide and conquer. You're not breaking things up into smaller problems. So b should be strictly greater than one. So that's the general setting. And then you'll solve each sub problem recursively. And the idea here is that once the sub problems become really small, they become constant size, it's relatively easy to solve them. You can just do exhaustive search.
If you have 10 elements and you're doing effectively a cubic search, well, 10 cubed is 1,000. That's a constant. You're in great shape as long as the constants are small enough. And so you're going to recurse until these problems get small. And then typically-- this is not true for all divide and conquer approaches. But for most of them, and certainly the ones we're going to cover today, the smarts is going to be in the combination step-- when you combine these problems, the solutions of these sub problems, into the overall solution. And so that's the story.
Typically, what happens in terms of efficiency is that you can write a recurrence that's associated with this divide and conquer algorithm. And you say t of n, which is a running time, for a problem of size n is going to be a times tfn over b-- and this is a recurrence-- plus the work that you need to do for the merge operational or the combine. This is the same as merge. And so you get a recurrence. And you're not quite done yet in terms of the analysis. Because once you have the recurrence, you do have to solve the recurrence. And it's usually not that hard and certainly it's not going to be particularly difficult for the divide and conquer examples that we're going to look, at least today.
But we also have this theorem that's called the master theorem that is essentially something where you can fairly mechanically plug in the a's and the b's and whatever you have there-- maybe it's theta n, maybe it's theta n square-- and get the solution to the recurrence.
I'm actually not going to do that today. But you'll hear once again about the massive theorem tomorrow in section. And it's a fairly straightforward template that you can use for most of the divide and conquer examples we're going to look at in 046 with one exception that we'll look at in median finding today that will simply give you the solution to the recurrence, OK?
So you've see most of these things before. That's a little bit of setup. And so let's dive right in into convex hull, which is my favorite problem when it comes to using divide and conquer. So convex hull, I got a little prop here which will save me from writing on the board and hopefully be more understandable. But the idea here is that in this case, we have a two dimensional problem with a bunch of points in a two dimensional plane. You can certainly do convex hull for three dimensions, many dimensions. And convexity is something that is a fundamental notion in optimization. And maybe we'll get to that in 6046 in advanced topics, maybe not. But in the context of today's lecture, what we're interested in doing is essentially finding an envelope or a hull associated with a collection of points on a two dimensional plane. And this hull obviously is going to be something, as you can guess, that encloses all of these points, OK?
So what I have here, if I make this string taut enough-- this is not working so well, but I think you get the picture. All right, so that's not a convex hull. This is not a convex hull for the reason that I have a bunch of points outside of the hull. All right, so let me just-- that is a convex hull. And now if I start stretching like that or like this or like that, that's still a convex hull, OK? So that's the game.
We have to find an algorithm. And we look at a couple of different ones that will find all of these segments that are associated with this convex hull, OK? So this is a segment that's part of the convex hull. That's a segment that's part of the convex hull. If, in fact, I had something like this-- and this was stretched out-- because I have those two points outside the convex hull, this may still be a segment that's part of the electronics hall but this one is not, right? So that's-- the game here is to find these segments. So if you're going to working with segments or tangents-- they're going to be used synonymously-- all of the tangents or segments associated with the entirety of the convex hull and we have to discover them. And only input that we have is the set of pointx-- xiy coordinates.
And there's just a variety of algorithms that you can use to do this. The one that I wish I had time to explain but I'll just mention is what's called a gift wrapping algorithm. You might not have done this, but I guarantee you I said you probably have taken a misshapen gift, right, and tried to wrap it in gift wrapping paper. And when you're doing that, you're essentially-- if you're doing it right you're essentially trying to find the convex hull of this three dimensional structure. You're trying to tighten it up. You're trying to find the minimum amount of gift wrapping paper.
I'm not sure if you've ever thought about minimizing gift wrapping paper, but you should have. And that's the convex hull of this three dimensional shape. But we'll stick to two dimensions because we'll have to draw things on the board. So let me just spec this out a bit. I've been given endpoints in a plane. And those set of points are s, xi, yi such that i equals 1, 2 to n. And we're just going to assume here, just to make things easy because we don't want to have segments that are null or segments that are a little bit different because they're discontinuous. But we're going to assume that no two have the same x-coordinate.
This is just a matter of convenience. And no two have the same y-coordinate. And then finally, no three in a line. Because we want to be able to look at pairs of points and find these segments. And it just gets kind of inconvenient. You have to do special cases if there of them are on a line. And so the convex hull itself is the smallest polygon containing all points in s. And we're going to call that ch of s-- convex hull of s.
STUDENT: Smallest convex polygon.
PROFESSOR: The smallest convex polygon-- thank you. And so just as an example on the board, when you have something like this, you're going to have your convex hull being that. This one is inside of it. These two points are inside of it. And all the other ones form the hull. And so we might have p, q, r, s, t, u. And v and x are inside of the hull. They're not part of the specification of ch of s, which I haven't quite told you how we're going to specify that.
But the way you're going to specify that is simply by representing it as a sequence of points that are on the boundary on the hull in clockwise order. And you can think of this as being a doubly linked list in terms of the data structure that you'd use if you coded this up. So in this case, it would be p to q to r to s. You're going to start with t in this case. It's a doubly linked list. So you could conceivably start with anything. But that's the representation of the convex hull. And we're going to use clockwise just because we want to be clear on as to what order we're enumerating these points. It's going to become important when we do the divide and conquer algorithm. So let's say that we didn't care about divide and conquer just for the heck of it and I gave you a bunch of points over here.
Can you think of a simple-- forget efficiency for just a couple of minutes. Can you think of a simple algorithm that would generate the segments of the convex hull? For example, I do not want to generate this segment vx.
If I think of a segment as being something that is defined by two points, then I don't want to generate the segment vx because clearly the segment is not part of the convex hull. But whereas the segment pq, qr, rs, et cetera, they're all part of the convex hull, right? So what is the obvious brute force algorithm, forgetting efficiency, that given this set of points will generate one by one the segments of the convex hull? Anybody? Did you have your head up? No? Go ahead. Yep.
STUDENT: Draw the line and check how many other lines intersect with it.
PROFESSOR: Draw the line and check how many lines it intersects with.
STUDENT: Yeah. PROFESSOR: Is there-- I think you got-- you draw the line. That's good, right?
STUDENT: [LAUGHS] AUDIENCE: [LAUGHING]
PROFESSOR: Well-- but you want to do a little more. Yeah, go ahead.
STUDENT: For every pair of points you see, make a half-plane and see where they complete all of their other points. [INAUDIBLE]
PROFESSOR: Ah, so that's good. That's good. That's good. All right, so the first person who breaks the ice here always gets a Frisbee. Sorry man. At least I only hit the lecturer-- no liability considerations here. OK, now I'm getting scared. Right, so I think there's a certain amount of when I throw this, am I going to choke or not, right? But it's going to get higher when one of you guys in the back answers a question. So you're exactly right. And you draw a line. And then you just look at it. And you look at the half plane. And if all the points are to one side, it is a segment of the convex hull. If they're not, it's not a segment-- beautiful. All right, are we done? Can we go and enjoy the good weather outside? No, we've got ways to go here. So this is not the segment whereas one-- let me draw that. I should draw these in a dotted way.
This is not a segment. This is not a segment. This is a segment. And I violated my rule of these three not being in a straight line. So I'll move this over here. And then that's a segment and so on and so forth, OK? Right?
STUDENT: It's no longer a side with the ones below it.
PROFESSOR: I'm sorry?
STUDENT: It would have to go directly to the bottom one from the left one.
PROFESSOR: Oh, you're right. That's a good point. That's an excellent point. SO what happened here was when I moved that out-- exactly right. Thank you. This is good. So when I moved this out here, what happened was-- and I drew this-- well, this one here, my convex hull, changed. The problem specification changed on me. It was my fault. But then what would happen, of course, is as I move this, that would become the segment that was part of the convex hull, OK?
So sorry to confuse people. But what we have here in terms of an algorithm, if I leave the points the same, works perfectly well. So let me just leave the points the same and just quickly recap, which is, I'm going to take a pair of points. And I'm going to draw-- and let me just draw this in a dotted fashion first. And I'm going to say that's the segment. And I'm going to take a look at that line and say this breaks up the plane into two half planes. Are all about points on one side? And if the answer is yes, I'm going to go ahead and, boom, say that is a segment of my convex hull. If the answers is no, like in this case, I'm going to drop that segment,
OK? So now let's talk about complexity. Let's say that there are n points here. And how many segments do I have? I have O n square theta n square segments. And what is the complexity of the test? What is the complexity of the test that's associated with, once I've drawn the segments, deciding whether the segment is going to be a tangent which is part of the convex hull or not? What is the complexity?
STUDENT: O n.
PROFESSOR: O n-- exactly right. So on test complexity-- and so we got over theta n cubed complexity, OK? So it makes sense to do divide and conquer if you can do better than this.
Because this is a really simple algorithm. The good news is we will be able to do better than that. And now that we have a particular algorithm-- I'm not quite ready to show you that yet. Now that we have a particular algorithm, we can think about how we can improve things. And of course we're going to use divide and conquer. So let's go ahead and do that. And so generally, the divide and conquer, as I mentioned before, in most cases, the division is pretty straightforward. And that's the case here as well. All the fun is going to be in the merge step. Right, so what we're going to do, as you can imagine, is we're going to take these points. And we're going to break them up. And the way we're going to break them up is by dividing them into half lengths. We're going to just draw a line. And we're going to say everything to the left of the line is one sub problem, everything to the right of the line is another sub problem, go off and find the convex hull for each of the sub problems. If you have two points, you're done, obviously. It's trivial.
And at some point, you can say I'm just going to deal with brute force. If we can go down to order n cubed, if n is small, I can just apply that algorithm. So it doesn't even have to be the base case of n equals 1 or n equals 2. That's a perfectly fine thing to do. But you could certainly go with n equals 10, as I mentioned before, and run this brute force algorithm. And so at that point, you know that you can get down to small enough size sub problems for which you can find the convex hull efficiently. And then you've got these two convex hulls which are clearly on two different half planes because that's the way you defined them. And now you've got to merge them. And that's where all the fun is, OK? So let's just write this out again.
You're going to sort the points by x-coordinates. And we're going to do this once and for all. We don't have to keep sorting here because we're just going to be partitioning based on x-coordinates. And we can keep splitting based on x-coordinates because we want to generate these half-lengths, right? So if we can do those once and for all-- and for the input set S, we're going to divide into the left half A and right half B by the x-coordinates. And then we're going to compute CH of A and CH of B recursively. And then we're going to combine. So the only difference here from what we had before is the specification of the division. It looked pretty generic.
It's similar to the paradigm that I wrote before. But I've specified exactly how I'm going to break this up. So let's start with the merge operation. We're going to spend most of our time specing that. And again, there's many ways you could do the merge. And we want the most efficient way. That's obviously going to determine complexity. So, big question-- how to merge.
So what I have here, if I look at the merge step, is I've created my two sub problems corresponding to these two half planes. And what I have here is-- let's say I've generated, at this point, a convex hull associated with each of these sub problems. So what I have here is a1, a2. I'm going to go clockwise to specify the convex hull. And the other thing that I'm going to do is in the sub problem case, my starting point is going to be for the left sub problem, the coordinate that has the highest x value, OK? So that's a1 in this case-- the highest x value going over. x is increasing to the right. And for the right half of the problem, it's going to be the coordinate that has the lowest x value. And I'm going to go clockwise in both of these cases. So when you see an ordering associated with the subscripts for these points, start with a1 or b1 and then go clockwise. And that's how we number this-- so just notational, nothing profound here.
So I got these two convex hulls-- these sub hulls, if you will. And what I need to do now is merge them together. And you can obviously look at this and it's kind of obvious what the overall convex hull is, right?
But the key thing is, I'm going to have to look at each of the pairs of points that are associated with this and that and try to generate the tangents, the new tangents, that are not part of the sub hulls, but they're part of the overall hull, right? So in this case, you can imagine an algorithm that is going to kind of do what this brute force algorithm does except that it's looking at a point from here and a point from here.
So you could imagine that I'm going to do a pairwise generation of segments. And then I'm going to check to see whether these segments are actually tangents that are part of the overall convex hull or not. So what I would do here is I'd look at this. And is that going to be part of the overall hull? No, and precisely why not? Someone tell me why this segment a1 b1 is not part of the overall hull? Yeah, go ahead.
STUDENT: If we were to draw a line through the whole thing there would be one on both sides.
PROFESSOR: Exactly right-- that's exactly right. So here you go. So that's not part of it. Now, if I look at this-- well, same reason that's not part of it. In this case-- and this is a fairly obvious example. I'm going to do something that's slightly less obvious in case you get your hopes up that we have this trivial algorithm,
OK? This is looking good, right? That's supposed to be a straight line, by the way. So a4 b2-- I mean, that's looking good, right? Because all the points are on one side. So a4 b2 is our upper tangent. Right, so our upper tangent is something that we're going to define as-- if I look at each of these things, I'm going to say they have a yij.
OK, what is yij? yij is the y-coordinate. of the segment that I'm looking at, the ij segment. So this yij is for ai and bj. So what I have here is y42 out here. And this is-- for the upper tangent, yij is going to be maximum, right? Because that's essentially something which would ensure me that there are no points higher than that, right? So if I go up all the way and I find this that has the maximum yij, that is going to be my upper tangent. Because only for that will I have no points ahead of that, OK? So yij is upper tangent. This is going to be maximum. And I'm not going to write this down, but it makes sense that the lower tangent is going to have the lowest yij. Are we all good here? Yeah, question.
STUDENT: So I am just wondering, I couldn't hear what she said why we moved out a1 b1.
PROFESSOR: OK, so good. Let me-- that reason we moved out a1 b1 is because if I just drew a1 b1 like this-- and I'm extrapolating this. This is again supposed to be a straight line. Then you clearly see that there are points on either side of the a1 b1 segment when you look at the overall problem, correct? You see that on a1 b1, b2 is on this side, b3 is on this side if I just extend this line all the way to infinity in both directions. And that violates the requirement that the segment be part of the overall hull, OK? That make sense? Good. So, everybody with me?
So clearly, there's a trivial merge algorithm here. And the trivial merge algorithm is to look at not every pair of points-- every ab pair, right? Every aibj pair. And so what is the complexity of doing that? If I have n total points, the complexity would be-- would be in square, right? Because maybe I'd have half here and half here, ignore constants. And you could say, well, it's going to be n squared divided by 4, but that's theta n squared. So there's an obvious merge algorithm that is theta n square looking at all pairs of points. And when I mean all pairs of points, I mean like an a and a b. Because I want to pick a pair when I go left of that dividing line and then right of the dividing line. But either way, it's theta n square, OK? So now you look at that and you go, huh. Can I do a better?
What if I just went for the highest a point and the highest b point and I just, no, that's it? I'm done-- constant time. Wouldn't that be wonderful? Yeah, wonderful, but incorrect, OK?
Right, so what is an example. And so this is something that I spent a little bit of time last night concocting. So I'm like you guys too. I do my problem set the night before. Well, don't do as I do. Do as I say. But I've done this before. So that's the difference.
But this particular example is new. So what I have here is I'm going to show you why there's not a trivial algorithm, OK, that-- I got to get these angles right-- that you can't just pick the highest points and keep going, right? And then that would be constant time. So that's my a over here. And let's assume that I have my dividing line like that. And then what I'm going to do here-- and I hope I get this right-- is I'm going to have something like this, like that. And then I'm going to have b1 here clockwise-- so b2, b3, and b4. So as you can see here, if I look at a4-- a little adjustment necessary. OK, so if I look at that, a4 to b1 versus-- I mean, just eyeball it. A3 to b1-- right, is a4 to b1 going to be the upper tangent? No, right? So now a3 is lower than a4. You guys see that?
And b1 is lower than b2, right? So it's clear that if I just took a4 to b2 that it will not be an upper tangent. Everybody see that? Yep, all right, good. So we can't have a constant time algorithm. We have theta and square in the back. So it is there something-- maybe theta n?
How would we do this merge and find the upper tangent by being a little smarter about searching for pairs of points that give us this maximum yij? I mean, the goal here is simple. At some level, if you looked at the brute force, I would generate each of these things. I would find the yj intercepts associated with this line. And I just pick the maximum. And the constant time algorithm doesn't work. The theta n squared algorithm definitely works. But we don't like it. So there has to be something in between. So, any ideas? Yeah, back there.
STUDENT: So... I had a question. [INAUDIBLE]
PROFESSOR: No, you're just finding-- no, you're maximizing the yij. So for once you have this segment-- so the question was, isn't the obvious merge algorithm theta n cubed, right? And my answer is no, because the theta n extra factor came from the fact that you had to check every point, every endpoint, to see on which side of the plane it was. Whereas here, what I'm doing is I've got this one line here that is basically y equals 0, if you like, or y equals some-- I'm sorry, x equals 0 or x equals some value. And I just need to, once I have the equation for the line associated with a4 b1 or a4 b2, I just have to find the intercept of it, which is constant time, right? And then once I find the intercept of it, I just maximize that intercept to get my yij. So I'm good, OK?
So it's only theta n squared, right? Good question. So this is actually quite-- very, very, very clever. This particular algorithm is called the two finger algorithm. And I do have multiple fingers, but it's going to work a lot better if I borrow Eric's finger. And we're going to demonstrate to you the two finger algorithm for merging these two convex hulls. And then we'll talk about the complexity of it. And my innovation again last night was to turn this from a two-finger algorithm. Not only did I have the bright idea of using Eric-- I decided it was going to become the two finger an string algorithm. So this is wild.
This is my contribution to 046 lore-- come on. So the way the two finger algorithm works-- this pseudo code should be incomprehensible. If you just look at it and you go, what, right? But this demo is going to clear everything up. Right so here's what you do. So now we're going to do a demo of the merge algorithm that is a clever merge algorithm than the one that uses order n square time. And it's correct. It's going to get you the correct upper tangent and what we are starting at here is with Erik’s left finger on A1, which is defined to be the point that's closest to the vertical line that you see here, the one that has the highest x-coordinate. And my finger is on B1, which is the point that has the smallest X-coordinate on the right hand side sub-hull. And what we do is we compute, for the segment A1 B1, we compute by Yij, in this case Y11, which is the intercept on the vertical line that you see here that Erik just marked with a red dot. And you can look at the pseudocode over on, to my right if I face the board. And what happens now is I'm going to move clockwise, and I'm going to go from B1 to B4. And what happened here?
Did the Yij increase or decrease? Well, as you can see it decreased. And so I'm going to go back to B1. And we're not quite done with this step here. Erik’s going to go counterclockwise over to A4. And we're going to check again, yeah, keep the string taught, check again whether Yij increased or decreased and as is clear from here Yij increased. So now we move to this point. And as of this moment we think that A4 B1 has the highest
Yij. But we have a while loop. We’re going to have to continue with this while loop, and now what happens is, I’m going to go from B1 clockwise again to B4. And when this happens, did Yij increase or decrease? Well it decreased. So I'm going to go back to B1 and Erik now is going to go counterclockwise to A3. And as you can see Y31 increased a little bit, so we're going to now stop this iteration of the algorithm and we're at A3 B1, which we think at this point is our upper tangent, but let's check that. Start over again on my side B1 to B4, what happened? Well Yij decreased. So I'm going to go back to B1. And then Erik’s going to try. He’s going conterclockwise, he's going to go A3 to A2 and, well, big decrease in Yij. Now Erik goes back to A3. At this point we've tried both moves, my clockwise move and Erik’s counterclockwise move. My move from B1 to B4 and Erik’s move from A3 to A2. So we've converged, we're out of the while loop, A3 B1 for this example is our upper tangent. All right. You can have your finger back Erik.
So the reason this works is because we have a convex hull here and a convex hull here. We are starting with the points that are closest to each other in terms of A1 being the closest to this vertical line, B1 being the closest to this vertical line, and we are moving upward in both directions because I went clockwise and Erik went counterclockwise. And that's the intuition of why this algorithm works. We're not going to do a formal proof of this algorithm, but the monotonicity property corresponding to the convexity of this subhull and the convexity of the subhull essentially can give you a formal proof of correctness of this algorithm, but as I said we won't cover that in 046. So all that remains now is to look at our pseudocode which matches the execution that you just saw and talk about the complexity of the pseudocode. So what is the complexity of this algorithm? It's order n, right? So what has happening here, if you look at this while loop, is that while I have two counters, I'm essentially looking at two operations per loop.
And either one of those counters is guaranteed to increment through the loop. And so since I have in this case p points, in one case p plus q equals n-- so let's say I had p points here and I have q points here. And got p plus q equals n. And I got a theta n merge simply because I'm going to be running through and incrementing-- as long as I'm in the loop, I'm going to be incrementing either the i or the j. And the maximum they can go to are p and q before I bounce out of the loop or before they rotate around.
And so that's why this is theta n. And so you put it all together in terms of what the merge corresponds to in terms of complexity and put that together with the overall divide and conquer. We have a case where this is looking like a recurrence that you've seen many a time t of n. I've broken it up into two sub problems. So I have 2. And I could certainly choose this l over here that's my line l to be such that I have a good partition between the two sets of points.
Now, if I choose l to be all the way on the right hand side, then I have this large sub problem-- makes no sense whatsoever. So what I can do-- there's nothing that's stopping me when I've sorted these points by the x-coordinates to do the division such that there's exactly the same number, assuming an even number of points n, exactly the same number on the left hand side or the right hand side. But I can get that right roughly certainly within one very easily. So that's where the n over 2 comes from, OK? In the next problem that we'll look at, the median finding problem, we'll find that trying to get the sub problems to be of roughly equal size is actually a little difficult, OK? But I want to point out that in this particular case, it's easy to get sub problems that are half the size because you've done the sorting.
And then you just choose the line, the vertical line such that you've got a bunch of points that are on either side. And then in terms of the merge operation, we have 2t n over 2 plus theta n. People recognize this recurrence? It's the old merge sort recurrence. So we did all of this in-- well, it's not merge sort. Clearly the algorithm is not merge sort. We got the same recurrence. And so this is theta n log n-- so a lot better than theta nq. And there's no convex hull algorithm that's in the general case better than this. Even the gift wrapping algorithm that I mentioned to you, with the right data structures, it gets down to that in terms of theta n log n, but no better.
OK, so good. That's pretty much what I had here. Again, like I said, happy to answer questions about the correctness of this loop algorithm for merge later. Any other questions associated with this?
STUDENT: Question. Yeah, back there.
STUDENT: If the input is recorded by x coordinates, can you do better than [INAUDIBLE]?
PROFESSOR: No, you can't, because-- I mean, the n log n for the pre-sorting, I mean, there's another theta n log n for the sorting at the top level. And we didn't actually use that, right? So the question was, can we do better if the input was pre sorted?
And I actually did not even use the complexity of the sort. We just matched it in this case. So theta n log n-- and then you can imagine maybe that you could do a theta n sort if these points were small enough and you rounded them up and you could use a bucket sort or a counting sort and lower that.
So this theta n log n is kind of fundamental to the divide and conquer algorithm. The only way you can improve that is by making a merge process that's even faster. And we obviously tried to cook up a theta one merge process. But that didn't work out, OK?
STUDENT: But are there algorithms that [INAUDIBLE] ?
PROFESSOR: First-- if you assume certain things about the input, you're absolutely, right? So one thing you'll discover in algorithms in 6046 as well is that we're never satisfied.
OK, so I just said, oh, you can't do better than theta n log n. But that's in the general case. And I think I mentioned that. You're on the right track. If the input is pre sorted, you can take that away-- no, it doesn't help in that particular instance if you have general settings. But if you-- the two dimensional case-- if the hull, all the segments have a certain characteristic-- not quite planar, but something that's a little more stringent than that-- you could imagine that you can do improvements. I don't know if any compelling special case input for convex hull from which you can do better than theta n log n.
But that's a fine exercise for you, which is in what cases, given some structure on the points, can I do better than theta n log n? So that's something that keeps coming up in the algorithm literature, if you can use that, OK? Yeah, back there-- question.
STUDENT: Where's your [INAUDIBLE] step? You also have to figure out which lines to remove from each of your two...
PROFESSOR: Ah, good point. And you're exactly, absolutely right. And I just realized that I skipped that step, right?
Thank you so much. So the question was, how do I remove the lines? And it's actually fairly straightforward. Let's keep this up here. And we don't need this incomprehensible pseudo code, right? So let's erase that. And thank you for asking that question. So it's a little simple cut and paste approach where let's say that I find the upper tangent ai bj. And I find the lower tangent.
Let's call it ak bm. And in this particular instance, what do I have? I have a1, a2, a3, a4 as being one of my sub hulls. And then I have b1, b2, b3, b4 as the other one. Now, what did we determine to be the upper tangent? Was it a3 b1? Right, a3 b1? So a3 b1 was my upper tangent. And I guess it was a1-- a1 b4? A1 b4 was my lower tangent.
So the big question is, now that I've found these two, how do I generate the collect representation of the overall convex hull? And so it turns out that you have to do this-- and then the complexity of this is important as well. And you need to do what's called a cut and paste that's associated with this where we're going to just look at this and that. So if we're going to have these two things, then we've got to generate a list of points. Now, clearly a4 is not going to be part of that, right? A4 is not going to be part of the overall hull. What is it that we want? We want something like a1, a2, a3, b1, b2, b3, b4, right? But there's a point that we have to discard here. Agree?
And so the way we do this is very mechanical. That's the good news here. I mean, you don't have to look at it pictorially. I just made that up looking at-- eyeballing it. Clearly, a computer doesn't have eyeballs, right? And so what we're going to do is we're going to say the first link-- in general, the first link is ai to bj. Because that's my upper tangent, OK? And in this case, it's going to be a3 d1, OK? And then I'm going to go down the b list until you see bm, which is the lower tangent. You're on the b list. So you're looking for the lower tangent point. And then you're going to jump until you see bm. You link it to ak, OK? You link it to ak and continue until you return to ai. And then you have your circular list, OK? So what you see here is you have a3 here. So I'm going to go ahead and write out the execution of what I just wrote here.
So I have a3. And I'm going to go jump over to b1. So I'm going to write down b1. Then I'm going to along the b's until I get to b4. In this case, I'm going to include all of the b's. So I got b1, b2, b3, b4. And then I'm going to jump from b4 to a1 because that's part of my lower tangent. And I got a1 here, a2. And then I'm back to a3, which is great. Because then I'm done, OK? And so exactly what I said happened, thank goodness, which is we dropped a4 but we kept all the other points. Does that answer your question? Good.
What is the complexity of cut and paste? It's order n. I'm just walking through these lists. So there's no hidden complexity here, OK? Good, good-- thank you. You definitely deserve a Frisbee. In fact, you deserve two, right? Where are you? I-- oh, could you stand up? Yeah, right-- two colors. All right. Oh, so he-- well, you can give it to him if you like. So good, thank you.
So are we done? Are we done with convex hull? OK, good. So let's go on and do median finding. Very different-- very different set of issues here. Still on divide and conquer, but a very different set of issues. The specification here is, of course, straightforward. You can think of it as I just want a better algorithm than sorting and looking for the median at the particular position-- in over two position, for example. Let's say n is odd. And it's floor of n over 2. You can find that median. Right, so it's pretty easy if you can do sorting. But we're never satisfied with using a standard algorithm. If we think that we can do better than that. So the whole game here is going to be I'm going to find the median. And I want to do it in better than theta n log n time. OK, so that's what median finding is all about. You're going to use divide and conquer for this.
And so in general, we're going to define, given a set of n numbers, define rank of x as the numbers in the set that are greater than-- I'm sorry, less than or equal to x. I mean, you could have defined it differently. We're going to go with less than or equal to. So in general, the rank, of course, is something that could be used very easily to find the median. So if you want to find the element of rank n plus 1 divided by 2 floor, that's what we call the lower median. And n plus 1 divided by 2 ceiling is the upper median. And they may be the same if n is odd. But that's what we want. So you can think of it as it's not median finding, but finding elements with a certain rank. And we want to do this in linear time, OK?
So we're going to apply divide and conquer here. And as always, the template can be instantiated. And the devil is in the details of either division or merge. And we had most of our fun with convex hull on the merge operation. It turns out most of the fun here with respect to median finding is in the divide, OK? So what I want is the definition of a select routine that takes a set of numbers s. And this is the rank. So I want a rank i. And that i might be n over 2-- well, floor of n plus 1 over 2, whatever?
And so what does the divide and conquer look like? Well, the first thing you need to do is divide. And as of now, we're just going to say you're going to pick some element x belonging to s. And this choice is going to be crucial. But at this point, I'm not ready to specify this choice yet, OK? So we're going to have to do this cleverly. And then what we're going to do is we're going to compute on k, which is the rank of x, and generate two sub arrays such that I want to find the fifth highest element. I want to find the median element. I want to find the 10th highest element. So I have to keep track of what happens in the sub problems. Because the sub problems are going to determine, depending on how many elements are inside those sub problems, which I can only determine after I've solved those sub problems. I'm going to have to collect that information and put it together in the merge operation.
So if I want to find the 10th highest element and I've broken it up relatively arbitrarily, it's quite possible that the 10th highest element is going to be discovered in the left one or the right one. And I have to show that it's the 10th highest. And it might be that there's four elements in the left and five on the right that are-- let's see. If I defined the rank as less than or equal to x, there's four on the left and five on the right that are smaller. And that's why this is the 10th highest element. And that's essentially what we have to look at. So b and c are going to correspond to the sub arrays that you can clearly eliminate one of them. You can count the number of elements in b, count the number of elements in c. And you can eliminate one of them in this recursion as you're discovering this element with the correct rank-- in this case, i. So let me write the rest of this out and make sure we're all on the same page.
What I have here pictorially is I've generated b here and c. So this is all of b and that's all of c. I have k minus 1 elements here in b. And let's say I have n minus k elements in c. And I'm going to do-- essentially take-- once I've selected a particular element, I'm going to look at all of the elements that are less than it and put it into the array b. I'm going to look at all the elements that are better than it. Let's assume all elements are unique. I'm going to put all of them into c. And I'm going to recur on b and c. Those two are my sub problems. But what I have to do is once I recur and I discover the ranks of the sub problems, I have to put them together. So what I have here is if k equals i-- so I computed the rank and I realized that if k equals-- equals i, I should say-- if k equals i, then I'm going to just return x. I'm done at this point. I got lucky. I picked an element x and it magically ended up having the correct rank, OK? Not always going to happen. And so in other case, if k is greater than i, then going to return select bi.
So what I've done here is if k is greater than i, then I'm going to say, oh, so now I'm going to have to find the element in b. I know that it's going to be in b because k is greater than i. And I've got to find the exact position depending on what i is over here. But it's going to be somewhere between 1 and k minus 1. And then the last case is if k is less than i, then this is a little more tricky. I'm going to turn on c of i minus k, OK? So what happens here is that my k is-- the rank for the x that I looked at over here is less than i.
So I know that I'm going to find this element that I'm looking for in c. But if I just look at c, I don't want to look at c and look for an element of rank i within c, right? That doesn't make sense because I'm looking for an element of rank i in the overall array that was given to me. So I have to subtract out the k elements that correspond to x and all of the k minus 1 elements that are in b to go figure out exactly what position or rank I'm looking for in the sub array corresponding to c, OK? So, people buy that. So that's just a small, little thing that you have to keep in mind as you do this. So that's pretty straightforward, looking pretty good. And you say, well, am I done here? And as you can imagine, the answer is no, because we haven't specified this value.
Now, can someone tell me, at least from an efficiency standpoint, what might happen, what we're looking for here? As you can imagine, we want to improve on theta n log n. And so you could you say, well, I'm happy with theta n. That theta n complexity algorithm is better than a theta n log n complexity algorithm, which is kind of in the bag. Because we know how to sort and we know how to index. So we want a theta n algorithm. Now, if you take this and if I just picked, let's say, the biggest element-- I kept picking x to be n or n minus 1 or just picked a constant value. I picked x to be in the middle.
I picked the index. I can always pick an element based on its index. I can always go for the middle one. So what is the worst case complexity of this algorithm? If I don't specify or I give you this arbitrary selection corresponding to x belonging to s, what is the worst case complexity of this algorithm? Yeah, go ahead.
STUDENT: N squared.
PROFESSOR: N squared-- why is that?
STUDENT: Because if you [INAUDIBLE] take like the least element.
PROFESSOR: Yep. STUDENT: How do you compare like N o against the other analysis?
PROFESSOR: Exactly right. That's exactly right. So what happens is that you're doing a bunch of work here with this theta n work. Right here, this is theta n work, OK? So given that you're doing theta n work here, you have to be really careful as to how you pick the x element. So what might happen is that you end up picking the x over here. And given the particular rank you're looking for, you have to now-- you're left with a large array that has n minus 1 elements in the worst case. You started with n. You did not go to n over 2 and n over 2, which is what divide and conquer is all about-- even n over b, OK?
You went to n minus 1. And then you go to n minus 2. And you go to n minus 3 because you're constantly picking-- this is worst case analysis. You're constantly picking these sub arrays to be extremely unbalanced. So when the sub arrays are extremely unbalanced, you end up doing theta n work in each level of the recursion. And those theta n's, because you're going down all the way from n to one, are going to be theta n square when you keep doing that, OK? So thanks for that analysis. And so this is theta n squared if you have a batch selection. So we won't talk about randomized algorithms, but the problem with randomized algorithms is that the analysis will be given a probability distribution. And it'll be expected time.
What we want here is a deterministic algorithm that is guaranteed to run in worst case theta n. So we want a deterministic way of picking x belonging to s such that all of this works out and when we get our recurrence and we solve it, somehow magically we're getting fully balanced partitions-- firmly balanced sub problems in the sense that it's not n minus 1 and 1. It's something like-- it could even be n over 10 and 9n over 10. But as long as you guarantee that, you're shaking things down geometrically. And the asymptotics is going to work out. but the determinism is what we need. And so we're going to pick x cleverly. And we don't want the rank x to be extreme. So this is not the only way you could do it, but this is really very clever.
There's a deterministic way. And you're going to see some arbitrary constants here. And we'll talk about them once I've described it. But what we're going to do is we're going to arrange s into columns of size 5, right? We're going to take this single array. And we're going to make it a two dimensional array where the number of rows is five and the number of columns that you have is n over 5-- the ceiling in this case. And then we're going to sort it each column, big elements on top. And we're going to do this in linear time. And you might say, how did that happen? Well, there's only five elements. So it's linear. You could do whatever you wanted. You could do n raised to four.
But it's five raised to four and it's constants. Don't you love theory? So then we're going to find what we're going to call the median of medians. So I'm going to explain this. This works for arbitrary rank, but it's a little easier to focus in on the median to just explain the particular example. Because as you can see, there's an intricacy here associated with the break up. And so here we go. I'm going to draw out a picture. And we're going to try and argue that this deterministic strategy that I'll specify gives you fairly balanced partitions in all cases, OK?
So what we see here is we see-- pictorially, you see columns of length five. Each of these dots corresponds to a number. This one dimensional array got turned into a two dimensional right. So I got four full columns. And it's suddenly possible, given n, that my fifth column is not full, right? So that's certainly possible. So that's why I have that up here. It so what I've here is I'm going to lay them out this way. And I'm going to look at that. I'm going to look at the middle elements of each of these n over five columns. That's exactly what I'm going to look at. Now, if I look at what I want, what I want over here is this x. If I want to find-- I'm going to find the median of medians. So is x. Now, it is true the first that these columns-- I'm just putting that up here imagining that that's x.
That's not guaranteed to be x because the columns themselves aren't-- well, these columns are sorted. And what I'm going to have to guarantee, of course, is that when I go find this median of medians is that it ends up being something that gives me balanced partitions. So maybe say a little bit more before I explain what's going on.
Each of these columns is sorted. And s is arranged into columns of size 5 like I just said here. These are the medians, OK? If I look at determining the medians and I say that once I've determined this x, which I've discovered that it's the median, then this is right there in the middle. There's going to be a bunch of columns to the left of it, a bunch of elements to the left of it, and a bunch of elements to the right of it. And in this case, I have five columns. I could have had more. It happens to be the third one.
So the idea is that once I find this median of medians, which corresponds to this x number, I can say that all of the columns-- these all correspond to columns that have their median element greater than x. These correspond to columns that have their median element less than x, OK? So what I have here in this picture is that these elements here are going to be greater than x. And these elements here are going to be less than x. So let me clear. What's happened here is we've not only sorted all of the columns such that you have large elements up here.
Each of these five columns have been sorted that way. On top of that, I've discovered the particular column that corresponds to the medians of medians. And this is my x over here. And it may be the case that these columns aren't sorted. This one may be larger than that or vice versa-- same thing over there. I have no idea. But it's guaranteed that once I find this median that I do know all of the columns that have elements in this position that are less than this x. And I know columns that in this position have elements that are greater than x, OK? Yep.
STUDENT: Shouldn't the two elements below x also be computed [INAUDIBLE] less than x.
PROFESSOR: You're exactly right. I would have probably been able to get the same asymptotic complexity if I dropped those because I had a constant number. But you're absolutely exactly right.
So the point that-- the question was-- I just redrew it. These two are clearly less than x as well because they're part of the sorting. And that's essentially I have here. Now, my goal here-- and you can kind of see from here as to where we're headed. What I've down here by this process of sorting each column and finding the median of medians is that I found this median of medians such that there's a bunch of columns on the left. And roughly half of those elements in those columns are less than x. And there are a bunch of columns on the right. And roughly half of those columns have elements that are greater than x. So what I now have to do is to do a little bit of math to show you exactly what the recurrence is. And let me do that over here.
So that's the last thing that we have to do. I probably won't solve the recurrence, but that can wait until tomorrow. The recurrence will be something that's not particularly difficult to solve. So I want to now make a more quantitative argument that the variable being n as to how many elements are guaranteed to be greater than x. And essentially what I'm saying, which is I'm writing out what I have on that picture there, half of the n over 5 groups contribute at least three elements greater than x except for one group with possibly less than five elements, which is the one that I have all the way to the right, and one group that contains x. So for all the other columns, I'm going to get three elements that are greater than x. And so if you write that out, this says there are at least three n over 10, because I have half of all of those groups, minus 2. And I'm not counting perfectly accurately here, but I have an at least. So this should all be fine. 3n over 1d-- 3 times n over 10 minus 2 elements are strictly greater than x. And that comes from that picture.
I'm going to be able to say the same thing for less than x as well. I can't count the one. Depending on how things go, maybe I could have played around and subtracted 1 instead of a 2 in the latter case. But I'm just being conservative here. It is clear that I'm going to have a bunch of columns that are full columns, that are going to be contributing three elements that are greater than x. And in this case, I have, well, two of them here for the less than x. And I got one for the greater than x. So that's all that I'm seeing over here with respect to the balance of the partitions. And it turns out that's enough. It turns out all I have to do with this observation is to go off and run the recurrence. And we're going to get an efficient algorithm. Yep.
STUDENT: Should it not be like greater than or equal to, because there's... [INAUDIBLE]
PROFESSOR: No, there's nothing that's equal.
STUDENT: So you are saying, that's all you need.
PROFESSOR: Yeah. Yeah, I assume that-- so, convenience, yeah. There's always a little bit of convenience thrown in here. We will assume that the a has unique elements. So there's nothing that's x, OK? Good.
So the recurrence, once you do that, is t of n equals-- we're going to just say it's order one for n less than or equal to 140. Where did that come from? Well, like 140. It's just a large number. It came from the fact that you're going to see 10 minus 3, which is 7. And then you want to multiply that by 2. So some reasonably large number-- we're going to go off and we're going to assume that's a constant. So you could sort those 140 numbers and find the median or whatever rank. It's all constant time once you get down to the base case. So you just want it to be large enough such that you could break it up and you have something interesting going on with respect to the number of columns. So don't worry much about that number. The key thing here is the recurrence, all right?
And this is what we have spent the rest of our time on. And I'll just write this out and explain where these numbers came from. So that's our recurrence for n less than or equal to 140. And else, you're going to do this. So what is going on here? What are all of these components corresponding to this recurrence?
Really quickly, this is simply something that says I'm finding the median of medians. I'm finding some element that has a certain rank. So this median of medians is going to be running on n over 5 columns. So I've got this-- there are n over 5 columns here. And I'm going to be calling this algorithm recursively, the median finding algorithm, to do that-- finding the median of medians. This thing over here is-- I'm going to be discarding at least regardless of what I do. Because I have these two statements here, I take the overall n. And I'm going to discard. In my paradigm over here, I'm either going to go with b or I'm either going to go with c depending on what I'm looking for. And given that b and c are not completely unbalanced,
I'm going to be discarding 3n over 10 minus 6 elements, which simply corresponds to me ignoring the ceiling here and multiplying the 3 out. So that's 3n over 10 minus 6. So then I have 7n over 10 plus 6. That's the maximum size partition that I'm going to recur on. It's only going to be exactly one of them, as you can see from that. It's either else. It's not recurring on both of them. It's recurring on one of them. So that's where the 7n over 10 plus 6 comes from. And then you ask where does this theta n come from. Well, the theta n comes from the fact that I do have to do some sorting. It's constant time sorting for every column, OK? Because it's only five elements.
So I'm going to do constant time sorting. But there's order n columns. Because it's-- then it's n over 5 columns. So this is the sorting of all of the columns, all right? So that's it. And I'll just leave you with-- you cannot apply the master theorem for solving this particular recurrence. But if you make the observation-- and you'll see this in section. You make the observation that n over 5 plus 7n over 10 is actually less than n. So you get 0.2n here and 0.7n there. That's actually less than n. This thing runs in linear time. And you'll see that in section tomorrow. So this whole thing is theta n time. See you next time.