Lecture 15: Single-Source Shortest Paths Problem

Flash and JavaScript are required for this feature.

Download the video from iTunes U or the Internet Archive.

Description: This lecture introduces weighted graphs and considers general approaches to the shortest paths problem. The lecture discusses single source shortest paths, negative-weight edges, and optimal substructure.

The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: [INAUDIBLE] audio. We're going to have a series of lectures on shortest paths. And one of the big differences between this module and the previous one, at least from a mathematical standpoint, is that we're going to be looking at graphs that have weights on their edges.

So when Eric talked about depth first search and breadth first search in the last couple of lectures, we had directed graphs and undirected graphs. But we didn't really have attributes on the edges. In particular, you have a much richer class of problems and applications if you allow for weights on graph edges.

And these weights can be integers. They could be real numbers, irrationals. They could be negative, what have you. And for different classes of graphs and different restrictions on weights, there's really a set of shortest path algorithms, that we'll look at in the next few lectures, which are kind of optimized for a particular application.

So we won't do specific algorithms today. But we'll set up the problem. We'll talk about the general approach that most shortest path algorithms take to solve a particular instance of a problem. And then we'll close with talking about a particular property that's pretty important.

That's called the optimum or optimal substructure property. That is a technique that most shortest path algorithms, or actually all shortest path algorithms use to get efficient complexity. So asymptotic complexity is important, obviously. And we're always looking for the best algorithm with the best asymptotic complexity. And optimal substructure is a hammer that we're going to use to get that.

So the canonical motivation, of course, for shortest paths is-- now, if you want to steal, or I guess borrow, a cannon from Caltech and bring it over to MIT, then you want the fastest way of getting here with your illegal goods. And you want to find the shortest way or the fastest way of getting from one location to another.

So Google Maps go from point A to point B. That's a classic application of the shortest path problem. In this case, you could imagine that distance would be something that would be a very simple metric that you could use for the weights on the edges.

So for this entire module, we're going to be looking at a graph G V, E, W. And you know what V and E are. They're the vertices and the edges.

And W is a weight function that maps edges to weights. And so we're adding that in here. And so W would be E to R. So the set of real numbers.

We're going to be looking at two different algorithms in subsequent lectures. And you'll implement one of them in your problem set. The simpler algorithm, which we'll look at first, is called Dijkstra, after Edsger Dijkstra, who did some similar work in concurrent programming, won the Turing Award.

But on the side invented this cool algorithm-- or at least gets credit for it-- called Dijkstra's algorithm that assumes non-negative weight edges. So I should really say non-negative. So read that as non-negative. And that has a complexity of order V log V plus E. All right.

So this is practically linear time. And typically, you're going to be dominated in many cases by E. In general, if you talk about a simple graph, what's the asymptotic relationship between E and V? Can you relate E to V? Can you give me a bound?

AUDIENCE: V squared.

PROFESSOR: Sorry? V squared, thanks. That's good.

So you can think of E as being order V square. And you can certainly have-- that's worth recursion. So now, you can kind of imagine a complete graph. And a complete graph is something that has an edge between each pair of vertices. And that's where you'll get E being k w squared.

So when you say simple graph, you're saying you have at most one edge between any pair of vertices. A multigraph is something that could have multiple edges between pairs of vertices. We won't really be talking about multigraphs in this sequence of lectures. But something to think about or keep in the back of your mind as we go through these algorithms.

And so the dominating factor here, and in many cases really, is E. And Dijkstra is a nice algorithm, because it's linear in the number of edges. So that's Dijkstra.

And that's the first of the algorithms that we'll look at next time. But we'll see the general structure of Dijkstra today. And then there's the Bellman-Ford algorithm that works on positive and negative edges, weight edges. And this has a complexity order V E.

So you could imagine a particular implementation of Bellman-Ford running in order V cubed time. Because E could be V square. And you've got this additional E factor. So it's order V cubed versus order V log V. So when you have a chance, use Dijkstra. When you're stuck, you'd want to do Bellman-Ford, because you have these negative weight edges.

And one of the challenges in negative weight edges, and I'll say a little bit more as we go along, is that you end up having to have to find cycles that are of a negative weight, because they kind of throw off your shortest path algorithm if you were just assuming that shortest path lengths are only going to decrease. But when you have negative weights, you might take a step and the overall weight might decrease.

So it's kind of a longer path in terms of the number of edges. But the weight is smaller. And that kind of makes the algorithm more complicated. And it has to do more work.

So that's really why there's a difference between these two complexities. And I guarantee you, you'll understand this much better after we're done with the lectures on Dijkstra and the lectures on Bellman-Ford.

So that's the set up for the problem. That's what we're going to be looking at. Let's look at a couple more definitions beyond what I have here with respect to just the notation. And you can think of path p as a sequence of vertices-- V0, V1, et cetera, to Vk.

And this is the path if Vi, Vi plus 1 belongs to E for 0 less than or equal to i less than or equal to k. So a path is a sequence of edges. And each of those edges has to be in the graph, has to be in the set of edges E.

And W of p, which is the weight of the path, we know that by the weight of edges, those are easy. They're given by the W function. The weight of the path is simply the summation of the weights of the edges. All right.

So fairly obvious definitions. But obviously, we have to get these right in order to actually solve the problem correctly. And the shortest path problem is, as you can imagine, something that tries to find a path p that has minimum weight. So in general, you have some set up for the problem. But it comes down to find p with--

And there are many, many possible paths. You have to understand that there are potentially an exponential number of paths in the graphs that we would consider. And here's a real simple example where you would have an exponential number of paths. And we'll come back to this example later in the lecture.

But let's assume that all the directions go this way. And it's a directed graph. Well, you could have the path that goes all the way here. But you could have the path that goes on top and all the way this way. You have basically two choices on getting to this vertex.

Then you've got, given the two ways you have of getting to this vertex. You've got four ways of getting here. And then, you have eight ways of getting there. So on and so forth. So there's an exponential number of paths potentially.

The other thing that's interesting here, which is important in terms of this complexity is, what's interesting about what you see here with respect to the complexity and what you see here. Anybody want to point that out?

So I have this complexity here and order VE out there. What's an interesting observation if you look at this board here and the two complexities? Anybody? Yeah, back there.

AUDIENCE: It's not a function of weight.

PROFESSOR: It's not a function of weight. Great. That's definitely worth recursion. And I'll let you throw this one. All the way back there, all right? Right.

All right, good. Good, good. That was good. That was better than what I could do. No, not really. But I would've been right in his hands right there.

Anyway, so that's a great observation, actually. And I should have pointed it out right up front. But I'm glad I got it out of you. W doesn't exist in the complexity. This is pretty important. W could be a large number. I mean, it could be 2 raised to 64.

The fact is that there's only E square different values possible for a weight, right. I mean, roughly speaking. If you have a complete graph, it's a simple graph, there's order E square possible weights.

But the range of the weights could be exponential. I could have an edge weight of 0.0001 and a different edge weight of 10 raised to 98. There's nothing that's stopping me from doing that or putting a specification like that.

But the nice thing about Dijkstra, and Bellman-Ford, and virtually all of the algorithms that are useful in practice is that they don't depend on the dynamic range of the weights. And so keep that in mind as you think of shortest path algorithms. And we'll talk a little bit about this in section tomorrow, or the TAs will, as to why breadth first search and depth first search aren't directly applicable to the shortest path problem. And the hint really is the dynamic range of the weights. So keep that in mind.

So a couple things why this is an interesting algorithm, or interesting problem to solve, and harder than the problems we've looked at so far like sorting and search, is that you have an exponential number of paths. And then the dynamic range of the weights can be very, very large. And it's not linear by any means. All right.

So these algorithms are going to have to have some smarts. And the optimal substructure property that we'll look at towards the end of today's lecture will give you a sense of how these algorithms actually work in basically linear time. Or VE, you could think of that as being cubic time in terms of the vertices.

So keep that in mind. Let's talk a little bit more about weighted graphs. I want a little bit more notation.

And what I have is V0 using path p to Vk. So I'm going to write that to say that there's a particular path of V0 to Vk. Sometimes I'm searching for the path p with a minimum weight. And that's how I'm going to represent that.

V0, which is a single vertex path, is the path from V0 to V0. So it's really a 0 length path. And it has weight 0.

So that's one condition. The other condition that we need to look at, which is the other case, is what if there isn't a path? So I want to put those two things together, the two extremes, and of course all of the cases in between, in this definition of the shortest pathway.

And so I'm going to talk about the shortest path value of the weight of the shortest path between u and v as delta, u, v. And my goal is to find delta. It's also to find the path.

It doesn't help you very much if you know that there's a way of getting from here to Lexington within 14 miles if you don't know what that path is, right. So that's one aspect of it, which is you want to get the weight. But you want to get the path as well. And these algorithms will do that for you.

And in particular, what we want is delta u, v to be the minimum over all the paths W p, such that p is in fact the path from u to v. And this is the case where if there exists any such path, and the last thing is you want this to be infinity, the weight to be infinity otherwise. So if you're only talking about roads going from here to Tokyo, should have length infinity. A little matter of the Pacific Ocean in between.

So that's the set up in terms of the numbers that we want to see. If you're starting from a particular point, you can think of the shortest path length from your source as being a 0. Initially, everything is infinity because you haven't found any paths yet. And what you're going to do is try and reduce these infinities down for all of the vertices that are reachable from the source vertex.

And it's quite possible that you may be given a graph where a particular vertices, or in your set V, that can't be reached from the particular source that you started with. And for those vertices, you're going to have your delta u, v. If v is unreachable from you, it will stay at infinity.

So let's look at an example. Let's take-- it's going to be an iterative process here of finding the shortest paths. And so let's take an example that corresponds to a fairly complex graph, or at least a nontrivial one, where that's my source vertex. And I've labeled these other vertices A through F. And I have a bunch of edges. 5.

I got one more here. So that's what's given to me. And I want to find delta S plugged in for u. And A, B, D, et cetera plugged in for V for this graph.

And let's just do this manually if you will. And just trying to do some kind of breadth first search. And we do know breadth first search. We know depth first search. You can imagine trying to use those notions to try and find the shortest paths here.

So now we have to prove afterwards when we are done that these are, in fact, the shortest paths. And that's the hard part of it. But we can certainly try and fill in some numbers associated with paths that we do know about.

So I'm going to say that the numbers that are inside each of these vertices-- d of u is the current weight. And so initially, I'm going to start with D of S being 0, because that's a source. And all of these other ones are going to have-- I'm not going to write this down-- but they're going to have infinity for their D of Vs.

So D of A is infinity. Do of B is infinity, et cetera. And what I want to do is decrease this D number to the point where I'm confident that all of the D numbers that are inside these vertices, these are the current weights, or end up being the delta numbers. So my algorithm is done when my d numbers shrink down. And I got the delta values, the correct delta values.

But if I wanted to do this, sort of a seat of the pants approach, just go off and try and iteratively reduce these numbers, you say, well, this one was infinity. But clearly, if I start from S and I follow the edges in S, I'm going to be able to mark this as a one. And similarly here, I'm going to be able to mark this as a 2.

Now, I could arbitrarily pick this one here and this A vertex, and then start looking at the edges that emanate from the A vertex. And I could go off and mark this as 6, for example. And if I start from here, I'd mark this as 3.

Now, is it in fact true that 6 equals delta S comma C equals 6? No. What is in fact-- is there a better way of getting to C? And what is the weight of that? What vertex do I have to go through?

I mean, one way is to go from S to B to D to C, right? And that would give me 5 right? So that's 5. Can I do better than 5? Not in this graph. OK

So it's not the case that the shortest length path gave you the smallest weight. I mean, that was one example of that. And I can go on an and bore you with filling in all of these numbers. But you can do that on your own. And it's really not particularly edifying to do that.

But you get a sense of what you need to be able to do in order to converge on the delta. And it might take some doing. Because you have to somehow enumerate in an implicit way-- you can't do it in an explicit way, because then there'd be an exponential number of paths. But you'd have to implicitly enumerate all the different ways that you can possibly get to a vertex and discover the shortest path through that process, all right. And so we have to be able to do that in these shortest path algorithms.

And this is a simple graph that has positive weights, non-negative weights with edges. It gets more complicated when you have negative weights. But before I get to that, there's one other thing that I want to talk about here with respect to discovering the actual path.

So what we did here was we had delta u, v that corresponded to the weight of the shortest path. But if you want the path itself, we need to have a way of finding the sequence of vertices that corresponds to the minimum weight path. And in particular, we're going to have to define what we call the predecessor relationship.

And so what I have is d of V is the value inside the circle, which is the current weight. And as d is something you're very interested in, eventually you want it to go to delta. The other thing that you're very interested in-- and this is really a fairly straightforward data structure corresponding to just the d number and this predecessor number. And pi of V is the predecessor vertex on the best path to V. And you said, pi of S equals NIL.

And then you can think of this as this is eventually what we want, and this gets modified as well. So right now, when you're working and trying to find the path, you have some particular path that happens to be the current best path. And that's a sequence of vertices that you can get by following the predecessors.

So once you're at a particular vertex E, you say all right, right now I can look at pi of E. And if that points me to C, then that's good. I'm going to look at pi of C. And that might point me to A, and so on and so forth.

In this particular instance, pi of E is going to, when you're finally done, is going to point to A. And pi of A is going to point to S, all right, because that's the path that is the best path is this one. Like so and like that.

And so those are the two data structures you need to keep in mind that you need to iterate on, this predecessor relationship and the current distance. And then this ends up being delta. You're done.

And at that point, your predecessor relationship is correct. So that's the set up. The last complication I want to talk about here is negative weights. And it's, I think, appropriate to talk about it when we have Bellman-Ford up here. Which is really the general algorithm.

So let's talk about-- so the first question is why. Why do these things exist, other than making our lives more difficult? So give me an example. What is the motivation for a graph with negative weights?

I mean, I really would like to know. The best motivation is definitely worth recursion. Then I can use it next time. Yeah, go ahead.

AUDIENCE: I'm just thinking like if your goal, if your goal [INAUDIBLE]. And some of them cost too much. Some of them get you money, and you want to know what-- you're trying to find [INAUDIBLE].

PROFESSOR: Sure. Yeah, I mean, I think that's a good motivation. I think driving, when you think about distances and so on, there's no notion of a negative distance, at least physically. But you can imagine that you could have a case where you're getting paid to drive or something, or it costs you to drive, and that would be one. Yeah, go ahead.

AUDIENCE: It sounds like Monopoly. So the vertices are supposed to be [INAUDIBLE].

PROFESSOR: Oh, if you land on something, you have to pay rent. Or sometimes you land on something and you actually get money.

AUDIENCE: [INAUDIBLE].

PROFESSOR: Takes you forward, backwards, right. Yeah go ahead.

AUDIENCE: [INAUDIBLE].

PROFESSOR: So that is such an interesting notion. Sometimes you may want to go. And maybe in this case, you're saying it's better to take your distance metric and go further away in order to get the best way of getting there, or something like that.

AUDIENCE: [INAUDIBLE].

PROFESSOR: Right. Sure. That'd be good. Right. Victor, you had your hand up.

AUDIENCE: Yeah. I'm going to give [INAUDIBLE] on the highway, you can't [INAUDIBLE] distances [INAUDIBLE] negative. Well, if a government uses [INAUDIBLE] police to regulate traffic, then you might have a negative distance. Because obviously, you could go a certain way minus the [INAUDIBLE].

PROFESSOR: Right. Yeah, that's a good example. One of the things that we have to think about is-- and this is something that might come up, by the way, in a problem set or a quiz-- which is, is there a way of shifting these weights to make them all positive? So the examples we've talked about, not clear to me that in the particular settings that we talked about that you can somehow create the base case to be 0 rather than being negative.

So it may not be possible in a particular scenario. But if you can do that-- and the reason I bring this up is if you can do that, suddenly instead of using an order V, E algorithm, if you can prove correctness of the final solution is exactly what you'd have gotten for the initial problem certification, you've gone from an order V, E algorithm to an order V log V algorithm. So that's a wonderful thing to do.

So keep that in mind. Try and get rid of negative weight edges if you can without changing the problem certification. I saw a hand back there.

AUDIENCE: Oh, no. I thought you were just asking the question, if we could do that? So I was just gettin ready to answer.

PROFESSOR: OK, yeah, so that's something to keep in mind. One example that I think has come up here, which came up, I think, the last time I lectured was imagine that you're driving and there are all these advertisements. And you get paid to drive on a freeway. So the reverse toll.

I mean, it's a reverse toll, because you get to go there and you have to see all these ads. And then I guess you drive pretty fast through those ads. But you have to go through. And so you get paid to go through those particular roads.

And then what about social networks? I mean, there's liking people and disliking people. I mean, that sounds pretty-- that's negative and positive. One could imagine that social networks would have positive weights and negative weights.

I'm surprised one of you-- I mean, I don't have an account on Facebook. But presumably, you guys do. So think of what's the-- yeah, that's right. Well, I'm not sure how this works. But you guys figure it out.

So why? Reverse tolls, social networks. Lots of things. Even if you're not convinced by the motivation, I will spend a whole lecture talking about Bellman-Ford. So that's just so that's clear.

So the issue with the negative weight cycles is something that is worth spending a minute on. And I talked about the fact that you had an exponential number of paths. And that causes a bit of a problem, even in the case where you have positive weights. And I will revisit that example.

But here's an even worse problem that corresponds to negative cycles. So eventually, you want to terminate. The faster you terminate, and if you can talk about asymptotic complexity, obviously that means that you've terminated within a worst case bound of time.

And if that's exponential, that's bad. You'd want it to be small. But what if you didn't even terminate?

So suppose you have something like this where you have a graph that has negative weights on some of the edges. But others are positive. So this one has a minus 6. I think I got those right.

So 2 for minus 6 over here, 3, 2, 1, and minus 2. So one thing that you notice from this graph is that you got this annoying cycle here. That's a negative weight cycle. And that's why I've picked this particular example.

Minus 6 plus 2 is minus 4. Minus 4 plus 3 is minus 1. So if you had something where you can't depend on the fact that the D's are going to keep reducing. And that eventually, they'll stop reducing.

Well, that's true. Eventually, they'll stop reducing. Because they're lower bounded by 0 when you have positive weight edges or non-negative weight edges. But if you have a graph with a negative cycle-- and I mean, this is a recipe for an infinite loop, right-- in your program, potentially a bug. But maybe not even a bug, not a bug in implementation, but a bug in the algorithm. Because this determination condition isn't set properly.

So you can imagine that you would get to B and the first time-- whoops, I'm missing a weight here. So you get to B. And you say, well, I'm done. Delta of SB is 4.

But that's not true. Because you could get to B. And then you could get back to B with the weight of 3. And then you could do it again with the weight of 2, and so on and so forth. So that's a problem.

So what would you like an algorithm to do? What would you like the Bellman-Ford to do here? It's not the case that all of the delta values, that is the shortest path values, are undefined for this graph. Some of them are well defined.

This one, you can't ever get back to it. So clearly, delta S, S is 0. Everybody buy that? What about this one?

It's 2, right. Delta S, A is 2. And everybody buys that, because there's just no way. You can't, you don't touch a negative weight cycle.

You, in fact, don't touch a negative weight edge. But more importantly, you don't touch a negative weight cycle in order to get to A. And there's no way of touching that.

On the other hand, anything that's in here you could run many times. And you could end up with whatever weight you wanted. There'd be a minus infinity weight.

So what you want an algorithm that handles in particular negative cycles, which are the hard part here. Negative weights aren't the hard part if you can't run through these edges more than once. It's actually the negative cycles that are hard.

And the negative cycles are going to make shortest path lengths indeterminate, but not necessarily for every node in the graph, like this example shows. So what you want your Bellman-Ford algorithm to do, or your shortest path algorithm that handles negative cycles to do, is to finish in reasonable amounts of time. Order V, E will take and give you the delta numbers for all of the vertices that actually have finite numbers and then mark all of these other vertices as being indeterminate, or essentially minus infinity. OK

So that's your termination condition. It's different from the termination condition if you simply had non-negative edge weights. All right. So remember, it's cycles that cause a problem, not just the edges. And you have to do something about the cycles. But they may not affect the entire part of the computation.

So if you don't know that you have a cycle or not, then you end up with having to use Bellman-Ford. And so that also tells you something which is interesting, which is Bellman-Ford has to detect negative cycles. If Bellman-Ford couldn't detect negative cycles, then how could it possibly be a correct algorithm for the arbitrary case?

So Dijkstra doesn't have to do that. And that's why Dijkstra is simpler. All right.

So let me talk about the general structure of shortest path algorithms. And the 2 important notions that I want to talk about here are the notion of relaxation, which we sort of did already when we ran through this example. But I need to formalize that. And then we'll go back and revisit this exponential graph example.

So the general structural of shortest path algorithms are as follows. We're going to initialize for all u belonging to the vertex set, we set d v to be infinity. And we set the predecessor to be NIL.

And then we'll set d of S to be 0. We're talking about a single source, here. We'll set that to be 0. And what we're going to do is essentially repeat.

Select some edge u comma v. And I'm not specifying how. This is going to result in a different algorithm depending on the specifics of how.

But the important notion is that we're going to relax edge u, v. And what the notion of relaxation is is that you're going to look at it. And you'll say, well, if d of v is greater than d of u plus w u, v, then I've discovered a better way of getting to v then I currently know.

So d of v would currently be infinity, which means I haven't found a way of getting to v yet. But I know that d of u, for example, is a finite number. And I do know that this edge exists from u to v, which means that I can update the value of d of v. And that's what we call relaxation of the edge u, v.

And so what you do here is if the if is true, then you set d, v to be d, u plus w u, v. And you'll also update the predecessor relationship, because the current best predecessor for v is going to be u. So that's the notion of relaxation.

And I kind of ran out of room here. But you keep doing this. This repeat. So what is the repeat? Well, the repeat is until all edges have d of v less than or equal to d of u plus w u, v.

And the assumption here is that you have no negative cycles. We need a different structure. The notion of relaxation is still going to be relevant. But don't think of this structure as being the structure that Bellman-Ford uses, or algorithms that can handle negative cycles use.

So hopefully, you got the notion of relaxation, which is from a pictorial standpoint, it's simply something that we did when we looked at updating the value of 6 to 5, for example. So we said through this process, if I relax this particular edge and d was already set up-- let's say d, the vertex here had 3. And this was originally 6. And I look at it and I say, D of C is 6.

On other hand, 6 is greater than d of the vertex D, which happens to be 3 plus 2. And since 5 is less than 6, I can relax this edge and update the value of 6 to 5. And then I update the predecessor relationship to have a pi of C to be D. That's the notion of relaxation. Fundamental notion. Going to use it in every algorithm that we talk about.

When do you stop? Well, when you don't have negative cycles, there's a fairly clean termination condition, which says that you can't relax any of the edges any more. OK You get to the point where you have values that are associated with each of these vertices inside. And it doesn't matter what edge you pick, you can't improve them.

So this termination condition, it could involve an order E check. So we're not talking complexity here yet in terms of being efficient.

But you can imagine when I say until all edges cannot be relaxed, that you'd have to look at all the edges. And if any one of them can be relaxed, it's possible that another one can now be relaxed. So you've got to keep going until you get to the point where none of the edges can be relaxed.

So this is a brute force algorithm. And it'll work. It'll just be slow. It'll work for known negative cycles. And if you just kind of randomly select these edges and just keep going, I'll give you an example where it works pretty badly in a minute. But this is an algorithm.

So I guess I lied when I said we weren't going to give you an algorithm. It is an algorithm. It's just an algorithm that you never want to implement. You do want to implement the relaxation condition. But not this random way of selecting edges and having this termination condition that, in of itself, is an order E check.

And one of the reasons why you don't want to implement this algorithm is coming up shortly in our exponential graph example. But let me make sure that people aren't bored. Any questions about the general structure, relaxation, anything? Are we good? OK.

So you guys, I walk away from lecture thinking I've given this spectacular lecture and everybody understands. And then Victor tells me when he shows up in section in the morning, he says did people understand graphs? And everyone says no. Or did people understand x? And people say no.

So at least tomorrow, tell Victor that you understood. Whether you did or not. So then I feel better.

AUDIENCE: That's going to make my life real easy.

PROFESSOR: Yeah, right. So good. Well, you probably like hearing stuff from Victor better than me anyway. That's the secret here, right?

All right, so one of the reasons why you don't want to implement this algorithm is precisely this example that I put up. And this is a really neat example that I like a lot, because it points out two different things. It points out that exponential number of paths, an exponential number of paths in a graph, could cause a problem with this algorithm.

The other thing that it points out is that we got issues with the weights of edges. One of the nice observations one of you made earlier on is that we had these neat algorithms that did not depend on the dynamic range of the weights.

So let's just say that I in fact had an exponential range for the weights. I know 4 isn't exponential, but at some level, you could imagine that it's exponentially related to 1 or 2. And the point here is that if I created a graph that looked like this, where I have V4, V5, V6, V7, V8, and it had this structure, then I'm going to end up having something like 2 raised to n over 2 weight if I have n vertices in this graph. Or at least the dynamic range of these weights is going to be 2 raised to n divided by 2. Everybody buy that?

So think of this graph as being a fragment of this large graph, which where n could be 100 and the weights could be 2 raised to 50. And 2 raised to 50 isn't a number that we can't handle on a computer, right? It's still less than 64 bits, right? So it's a pretty reasonable example.

And we talked about multiple precision arithmetic, infinite precision arithmetic. So we can handle arbitrary numbers of an arbitrary position. So there's nothing that's stopping us from putting square root of 2 and all sorts of things. We won't do imaginary numbers. But you could imagine putting numbers with a high dynamic range as edges in a particular graph and expect the Dijkstra, assuming that all of the edges are non-negative, that Dijkstra should be able to run on it.

So what happens with this example? Well, with this example, here's what happens. Let's say that I ran this algorithm. And initially, I just followed this chain here. And I get-- this starts with a 0. And this is a 4, because I get there with 4. This one is 8. And this is 10. And this is 12, 13, 14.

And that's the initial pass. That's the selection. What ends up happening is that you could now relax at this-- you see 14. And let's say you relax this edge. You see that 12 and 14, you've turned that into 13.

And then when you relax this edge, this turns into 12. So you go through that process. Now, this one stays 12. But now you relax this edge.

And so this 12 becomes 10. And then when this changes, you need to-- if you relax this edge first, then this 13 is going to become 11. It doesn't really matter. This becomes, I guess, 11.

And-- is that right? Yup. This is 11 and that's 11 as well. It might start out being 12 if you relax this edge and that edge. So you might go to 12 to 11, and so on and so forth.

So for a pathological ordering, I won't belabor the point. But you see that you're going 14, 13, 12, 11 with a bad ordering that corresponds to the selection of the edges. And so if the overall weight here and overall weight here, when you start out with, is going to be order 2 raised to n over 2. OK And you could be, in this particular graph, relaxing edges an exponential number of times in order to finish.

And so the number of times you relax an edge could be of the order of the weights that you start out with. And that makes this algorithm an exponential time algorithm. So clearly, we have to do better than that when it comes to Dijkstra or Bellman-Ford.

So how are we going to do better than that? Yeah, question back there.

AUDIENCE: Is it an issue that we're starting at the [INAUDIBLE]?

PROFESSOR: You're exactly right. There's an issue with the ordering that we've chosen. But what you have to show is that for any graph, the particular ordering that you choose will result in V log V plus E and so on and so forth. So you're exactly right.

I mean, it's an issue with the ordering we've chosen. This is a pathological ordering. It's just meaning to say that we have to be careful about how we select. If you selected wrong, you've got problems.

And so the purpose of next week is going to be how do we select these edges properly. And so I leave you with this notion of, very simple notion of, optimal substructure using two very simple terms that you can prove in literally a line of text. And the first one says as subpaths of a shortest path are shortest paths.

And all that means is if I had V0, and I went to V1, and I went to V2, and these are paths here. So this could be p01, p02, p03. And so there are many vertices potentially between V0 and V1. And if you tell me that V0 through V3, the concatenation of p01, p02, and, sorry, p03 are a shortest path. If this is an SP, shortest path, then that implies that each of these are shortest paths as well.

And that makes sense, because if in fact there was a better way of getting from V0 to V1 that was better than p01, why would you ever put p01 in here? You would use that better way. So very simple. That's what's called the optimum substructure property.

And this notion of the triangle inequality is also related to that. And that simply says that if I have something like this, that I have V0, V1, and V2, then when I look at the delta value of V0, V1, and I compare that with the delta values of V0, V2, and V2, V1, then this has got to be smaller than or equal to this plus that.

And that again makes sense. Because if this plus this was smaller than that, well remember I'm talking about paths here, not edges. And the better way of getting to V1 would be to follow-- go through V2 rather than following this path up on top.

Amazingly, these two notions are going to be enough to take this algorithm and turn it into essentially a linear time algorithm. And we'll do that next time.