1 00:00:00,000 --> 00:00:02,886 [SQUEAKING][RUSTLING][CLICKING] 2 00:00:12,510 --> 00:00:17,280 JASON KU: Welcome, everybody, to our lecture 14 of 6.006. 3 00:00:17,280 --> 00:00:21,240 This is our last lecture on graph algorithms, 4 00:00:21,240 --> 00:00:24,570 in particular, the last lecture we'll have 5 00:00:24,570 --> 00:00:28,350 on weighted shortest paths. 6 00:00:28,350 --> 00:00:31,290 But we're going to talk about a slightly different problem 7 00:00:31,290 --> 00:00:35,970 today, different than single source shortest paths. 8 00:00:35,970 --> 00:00:39,960 We're going to be talking about all-pairs shortest paths. 9 00:00:39,960 --> 00:00:43,380 But first, let's review our single source 10 00:00:43,380 --> 00:00:45,330 shortest paths algorithms. 11 00:00:45,330 --> 00:00:49,740 We had BFS, DAG relaxation, Dijkstra, 12 00:00:49,740 --> 00:00:54,570 which we saw last time, which gets pretty close to linear. 13 00:00:54,570 --> 00:00:59,190 V log V plus E is pretty close to linear. 14 00:00:59,190 --> 00:01:03,390 It's much closer to linear than the general algorithm we have 15 00:01:03,390 --> 00:01:05,880 for solving single source shortest paths, 16 00:01:05,880 --> 00:01:09,180 namely, Bellman-Ford, which is a little bit more like quadratic. 17 00:01:09,180 --> 00:01:11,190 This is like the difference between-- 18 00:01:11,190 --> 00:01:15,870 for sparse graphs, this is like the difference between sorting, 19 00:01:15,870 --> 00:01:20,430 using insertion sort and n squared, and merge sort 20 00:01:20,430 --> 00:01:22,410 in N log N, for example. 21 00:01:22,410 --> 00:01:24,630 We're going to get actually quite a big bonus 22 00:01:24,630 --> 00:01:29,970 for large input sizes by using Dijkstra when we can. 23 00:01:29,970 --> 00:01:32,100 Today, we're going to be focusing 24 00:01:32,100 --> 00:01:35,040 on this new problem called all-pairs shortest paths. 25 00:01:35,040 --> 00:01:37,410 It's not really complicated. 
26 00:01:37,410 --> 00:01:39,720 Instead of having a single source, 27 00:01:39,720 --> 00:01:42,300 we are essentially wanting to, given an input-- 28 00:01:45,192 --> 00:01:57,890 this is our weighted graph, where we've got a graph V, E, 29 00:01:57,890 --> 00:02:03,800 and we've got a weight function from the edges to the integers. 30 00:02:06,810 --> 00:02:08,620 This is our general weighted graph. 31 00:02:08,620 --> 00:02:10,560 We want our output to be something 32 00:02:10,560 --> 00:02:16,470 like, the shortest path distance from u 33 00:02:16,470 --> 00:02:26,130 to v for every u and v in our vertex set. 34 00:02:26,130 --> 00:02:27,790 That's what we want to return. 35 00:02:27,790 --> 00:02:32,310 Now, there's one caveat here that, 36 00:02:32,310 --> 00:02:36,190 if there's a negative weight cycle in our graph-- 37 00:02:36,190 --> 00:02:43,700 in other words, if any of these delta u, v's is minus infinity, 38 00:02:43,700 --> 00:02:47,810 there's a negative weight cycle in our graph. 39 00:02:47,810 --> 00:02:53,240 So unless, I guess-- 40 00:02:53,240 --> 00:03:09,515 or abort if G contains a negative weight cycle. 41 00:03:12,140 --> 00:03:15,650 So we're not actually going to worry about negative weight 42 00:03:15,650 --> 00:03:19,280 cycles in today's class. 43 00:03:19,280 --> 00:03:21,820 If we have a graph, it could have negative weights. 44 00:03:21,820 --> 00:03:23,960 These are any integers. 45 00:03:23,960 --> 00:03:28,460 It could include negative weight edges. 46 00:03:28,460 --> 00:03:32,270 But as long as all of our path distances 47 00:03:32,270 --> 00:03:38,920 are bounded from below, none of them 48 00:03:38,920 --> 00:03:41,830 are negative infinity, we don't have any negative weight 49 00:03:41,830 --> 00:03:45,430 cycles, then I want you to output all of these shortest 50 00:03:45,430 --> 00:03:46,330 path distances. 
51 00:03:46,330 --> 00:03:49,580 Now, in particular, this output could-- 52 00:03:49,580 --> 00:03:56,800 any of these outputs needs to have size theta of V squared. 53 00:03:56,800 --> 00:03:58,870 Because for every pair of vertices, 54 00:03:58,870 --> 00:04:02,230 I need to return to you a number, or infinity, 55 00:04:02,230 --> 00:04:04,400 or minus infinity or something like that. 56 00:04:04,400 --> 00:04:07,210 But we are not dealing with a case with minus infinity. 57 00:04:07,210 --> 00:04:11,320 The output could have size-- 58 00:04:11,320 --> 00:04:12,460 this is a theta here. 59 00:04:12,460 --> 00:04:16,750 It does have size V squared. 60 00:04:16,750 --> 00:04:18,850 But in particular, it's at least V 61 00:04:18,850 --> 00:04:24,010 squared because I need to give a number 62 00:04:24,010 --> 00:04:25,960 for each pair of vertices. 63 00:04:25,960 --> 00:04:30,260 And so we couldn't hope for linear time 64 00:04:30,260 --> 00:04:33,530 in the size of this graph for this problem, right? 65 00:04:33,530 --> 00:04:36,050 Single source shortest paths, for certain versions 66 00:04:36,050 --> 00:04:38,550 of the problem, we need to read the graph. 67 00:04:38,550 --> 00:04:41,580 And so we need to use linear time. 68 00:04:41,580 --> 00:04:44,240 But in this problem, our output has quadratic size 69 00:04:44,240 --> 00:04:45,990 in the number of vertices. 70 00:04:45,990 --> 00:04:49,620 So in a sense, we can't do better than this. 71 00:04:49,620 --> 00:04:52,040 We can't do better than quadratic. 72 00:04:52,040 --> 00:04:56,010 And actually, what's one way we could 73 00:04:56,010 --> 00:04:59,190 solve all-pairs shortest paths by using stuff 74 00:04:59,190 --> 00:05:02,170 we've already done in this class? 75 00:05:02,170 --> 00:05:04,650 That's why I put this slide up here. 76 00:05:04,650 --> 00:05:07,320 Yeah, we could just solve a single source 77 00:05:07,320 --> 00:05:12,180 shortest paths algorithm from every vertex in my graph. 
78 00:05:12,180 --> 00:05:15,060 That seems like a stupid thing to do. 79 00:05:15,060 --> 00:05:17,670 It's almost brute force on the vertices. 80 00:05:17,670 --> 00:05:19,980 But it's certainly a way we could solve this problem, 81 00:05:19,980 --> 00:05:23,450 in polynomial time. 82 00:05:23,450 --> 00:05:32,790 And we could definitely solve it in order 83 00:05:32,790 --> 00:05:38,980 V squared E time, using Bellman-Ford. 84 00:05:38,980 --> 00:05:44,010 We just take V steps of Bellman-Ford and deal 85 00:05:44,010 --> 00:05:48,130 with a graph on any set of vertices. 86 00:05:48,130 --> 00:05:51,170 We can do better than this. 87 00:05:51,170 --> 00:05:54,310 We can do better than this for graphs 88 00:05:54,310 --> 00:05:56,380 that are special in some way. 89 00:05:56,380 --> 00:06:02,140 We can do V times V plus E, V times linear. 90 00:06:02,140 --> 00:06:05,020 If our weights are positive and bounded, 91 00:06:05,020 --> 00:06:07,090 we can use BFS V times. 92 00:06:07,090 --> 00:06:08,920 Or if our graph doesn't have cycles, 93 00:06:08,920 --> 00:06:11,320 we could use DAG relaxation V times. 94 00:06:11,320 --> 00:06:15,470 Or if our graph had non-negative edge weights, we could get, 95 00:06:15,470 --> 00:06:21,160 basically, V squared log V plus V times E. 96 00:06:21,160 --> 00:06:22,540 And that's actually not bad. 97 00:06:22,540 --> 00:06:28,360 In sparse graphs, this is what Bellman-Ford would give us. 98 00:06:36,510 --> 00:06:40,230 But if we had Dijkstra's, for example, if we 99 00:06:40,230 --> 00:06:45,180 had all positive edge weights-- or non-negative, sorry, 100 00:06:45,180 --> 00:06:55,830 we could get V squared log V plus V, E time. 101 00:06:55,830 --> 00:07:00,720 This is V times Dijkstra. 102 00:07:08,330 --> 00:07:10,910 OK, so how do these running times compare? 103 00:07:10,910 --> 00:07:12,200 This is V times Bellman-Ford. 104 00:07:12,200 --> 00:07:15,140 This is V times Dijkstra. 
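As a rough illustration of the "V times Dijkstra" approach just described, here is a minimal Python sketch. It assumes every edge weight is non-negative and that the graph is stored as an adjacency dictionary; the names `dijkstra`, `all_pairs`, and the small example graph are hypothetical, not from the lecture. It uses Python's binary heap from `heapq`, so each run costs about (V + E) log V rather than the V log V + E bound quoted above, which relies on a faster priority queue.

```python
import heapq

def dijkstra(adj, s):
    """Shortest-path distances from s; assumes all edge weights are >= 0."""
    dist = {v: float("inf") for v in adj}
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter distance was already found
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

def all_pairs(adj):
    """Run single-source Dijkstra once from every vertex."""
    return {s: dijkstra(adj, s) for s in adj}

# Hypothetical example graph: vertex -> list of (neighbor, weight).
adj = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2)],
    "c": [("a", 0)],
}
apsp = all_pairs(adj)
```

The result `apsp[u][v]` holds the distance from `u` to `v`, so the output table has one entry per ordered pair of vertices, matching the V squared output size discussed earlier.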
105 00:07:15,140 --> 00:07:19,580 Let's just get a feel for this separation here. 106 00:07:19,580 --> 00:07:23,600 If we had a sparse graph where E is upper-bounded 107 00:07:23,600 --> 00:07:29,310 by a constant times the number of vertices, this one looks like V squared log V. 108 00:07:29,310 --> 00:07:33,900 This one looks like V cubed. 109 00:07:33,900 --> 00:07:38,670 And we need to spend at least V squared time. 110 00:07:38,670 --> 00:07:40,980 So actually, this is really close 111 00:07:40,980 --> 00:07:43,080 to linear in the size of the output, 112 00:07:43,080 --> 00:07:47,620 just off by a log factor, just like sorting would entail. 113 00:07:47,620 --> 00:07:54,670 And this one would have a linear factor. 114 00:07:54,670 --> 00:07:57,910 In the sparse graph, this would be a linear factor worse 115 00:07:57,910 --> 00:08:00,100 than this, instead of a logarithmic factor-- 116 00:08:00,100 --> 00:08:03,070 again, this linear to log separation. 117 00:08:03,070 --> 00:08:06,310 We don't want to have to do this running time if we 118 00:08:06,310 --> 00:08:09,250 don't have to. 119 00:08:09,250 --> 00:08:10,640 That's the name of the game. 120 00:08:10,640 --> 00:08:16,790 And really, all we're going to do in this lecture 121 00:08:16,790 --> 00:08:24,740 is try to solve how we can make this running time faster 122 00:08:24,740 --> 00:08:27,320 by doing something a little bit more intelligent 123 00:08:27,320 --> 00:08:30,110 than running a single source shortest path 124 00:08:30,110 --> 00:08:32,224 algorithm from every vertex. 125 00:08:37,419 --> 00:08:38,960 How are we going to do that? 126 00:08:38,960 --> 00:08:46,330 Well, we could-- let's see. 127 00:08:46,330 --> 00:08:47,860 What are we doing? 128 00:08:47,860 --> 00:08:48,910 Right. 129 00:08:48,910 --> 00:08:51,235 The idea here, if we had a graph-- 130 00:08:57,240 --> 00:09:01,320 should my graph be directed or undirected? 131 00:09:01,320 --> 00:09:02,550 I'm not sure. 
132 00:09:02,550 --> 00:09:05,180 Let's see if we can make a directed graph. 133 00:09:05,180 --> 00:09:08,140 OK, so here's a directed graph. 134 00:09:08,140 --> 00:09:10,270 Why do I not care about undirected graphs? 135 00:09:10,270 --> 00:09:13,450 Can anyone tell me? 136 00:09:13,450 --> 00:09:16,960 Yeah, it's because-- 137 00:09:16,960 --> 00:09:19,270 I don't care about undirected graphs 138 00:09:19,270 --> 00:09:24,710 because, if I had an undirected graph, 139 00:09:24,710 --> 00:09:26,570 I could detect whether I had negative weight 140 00:09:26,570 --> 00:09:36,470 cycles in constant time-- 141 00:09:36,470 --> 00:09:38,090 I'm sorry, in linear time. 142 00:09:38,090 --> 00:09:43,720 I could just check each edge, see if it has negative weight, 143 00:09:43,720 --> 00:09:46,630 because a negative weight edge, an undirected edge 144 00:09:46,630 --> 00:09:48,880 is a cycle of negative weight. 145 00:09:48,880 --> 00:09:53,440 So I could just-- if it has any negative edge weights, 146 00:09:53,440 --> 00:09:58,760 I could return in linear time that it does, and I can abort. 147 00:09:58,760 --> 00:10:02,450 Or it has only positive weights, and I can still use Dijkstra. 148 00:10:02,450 --> 00:10:03,540 So that's all good. 149 00:10:03,540 --> 00:10:06,660 So we're only concerned about needing to run Bellman-Ford 150 00:10:06,660 --> 00:10:09,920 on directed graphs that potentially have negative edge 151 00:10:09,920 --> 00:10:10,520 weight. 152 00:10:10,520 --> 00:10:13,410 OK, so here's a graph. 153 00:10:13,410 --> 00:10:14,540 Let's see. 154 00:10:14,540 --> 00:10:15,950 Is this a graph that I want? 155 00:10:18,920 --> 00:10:20,920 Sure. 156 00:10:20,920 --> 00:10:33,530 Let's say we've got that direction and this direction. 157 00:10:33,530 --> 00:10:36,780 Say we have a directed graph like this. 158 00:10:36,780 --> 00:10:38,390 And let's say this is s. 159 00:10:38,390 --> 00:10:41,610 This is our source. 
160 00:10:41,610 --> 00:10:49,120 And we have weights being 2-- 161 00:10:49,120 --> 00:10:58,450 sorry, weights being 4, 1, 1, 2, 2, 2, 2. 162 00:10:58,450 --> 00:11:01,030 So this is an example of a graph we 163 00:11:01,030 --> 00:11:05,840 might want to run all-pairs shortest paths on. 164 00:11:05,840 --> 00:11:08,030 Maybe we also have negative weights in this graph. 165 00:11:13,660 --> 00:11:16,090 In particular, this has a negative weight cycle. 166 00:11:16,090 --> 00:11:17,650 I don't want negative weight cycles, 167 00:11:17,650 --> 00:11:22,310 so I'm going to make this 0. 168 00:11:22,310 --> 00:11:24,610 So this graph doesn't have negative weight cycles. 169 00:11:24,610 --> 00:11:25,110 Great. 170 00:11:27,740 --> 00:11:30,210 That's true, great. 171 00:11:30,210 --> 00:11:32,940 All right, so here's an example that we might want 172 00:11:32,940 --> 00:11:35,070 to compute shortest paths on. 173 00:11:35,070 --> 00:11:37,770 There's no s in all-pairs shortest paths. 174 00:11:37,770 --> 00:11:40,610 But I'm going to be talking about a couple 175 00:11:40,610 --> 00:11:43,240 of shortest paths from s in my next argument, 176 00:11:43,240 --> 00:11:46,680 so I'm just labeling that vertex as s. 177 00:11:46,680 --> 00:11:50,190 OK, the claim-- the approach we're going to do, 178 00:11:50,190 --> 00:11:53,460 we're going to try to take a graph that has negative edge 179 00:11:53,460 --> 00:11:57,620 weights, directed graph. 180 00:11:57,620 --> 00:12:01,160 We don't know if it has negative cycles or not yet. 181 00:12:01,160 --> 00:12:03,770 But we want to compute all-pairs shortest paths, 182 00:12:03,770 --> 00:12:07,510 not in this running time, but in this running time. 183 00:12:07,510 --> 00:12:08,930 How could we do that? 
184 00:12:08,930 --> 00:12:11,560 Well, maybe it's possible that we 185 00:12:11,560 --> 00:12:16,380 can change the weight of every edge 186 00:12:16,380 --> 00:12:19,500 so that they're all positive, but shortest paths 187 00:12:19,500 --> 00:12:20,210 are preserved. 188 00:12:20,210 --> 00:12:24,060 So basically, if a particular path-- 189 00:12:24,060 --> 00:12:32,970 like OK, the shortest path from s to t here is 1, 2, 3. 190 00:12:32,970 --> 00:12:37,780 I could change edge weights in this graph. 191 00:12:37,780 --> 00:12:41,920 Say, for example, if I changed 1 to 0 here, 192 00:12:41,920 --> 00:12:46,680 that would still make this a shortest path. 193 00:12:46,680 --> 00:12:48,960 I haven't done-- I've reweighted the graph. 194 00:12:48,960 --> 00:12:51,315 Shortest paths have to be the same in this graph. 195 00:12:54,660 --> 00:12:59,510 But now-- sorry. 196 00:12:59,510 --> 00:13:05,910 Yeah, this is not a shortest path. 197 00:13:05,910 --> 00:13:11,640 OK, I'll make that minus 2, and then these both 2, 198 00:13:11,640 --> 00:13:16,300 and I think this 4. 199 00:13:16,300 --> 00:13:20,320 Man, I really should have done my example beforehand. 200 00:13:20,320 --> 00:13:23,050 OK, so this still doesn't have negative weight cycles. 201 00:13:23,050 --> 00:13:24,760 It has a negative weight edge. 202 00:13:24,760 --> 00:13:27,170 But this path is longer than this path. 203 00:13:27,170 --> 00:13:30,460 So when this was 1, this had length 204 00:13:30,460 --> 00:13:34,195 of 3, which was shorter than this path. 205 00:13:36,960 --> 00:13:40,150 That is length 4. 206 00:13:40,150 --> 00:13:42,160 OK, cool. 207 00:13:42,160 --> 00:13:46,300 So this is the shortest path from s to t. 208 00:13:46,300 --> 00:13:48,730 I could change weights in this graph, 209 00:13:48,730 --> 00:13:55,360 for example, changing 2 to 3 and 1 to 0. 
210 00:13:55,360 --> 00:13:57,820 That changed weights in my graph, 211 00:13:57,820 --> 00:14:01,320 but shortest paths remain the same. 212 00:14:01,320 --> 00:14:04,520 So maybe there's a way I could reweight my edges so 213 00:14:04,520 --> 00:14:06,110 that shortest paths stay the same, 214 00:14:06,110 --> 00:14:07,490 shortest paths are preserved. 215 00:14:11,620 --> 00:14:15,880 So I'm going to put this back to where it was. 216 00:14:15,880 --> 00:14:35,080 So idea-- make edge weights non-negative while 217 00:14:35,080 --> 00:14:40,780 preserving shortest paths. 218 00:14:43,810 --> 00:14:46,270 In other words, just reweight the edges 219 00:14:46,270 --> 00:14:50,680 here so that shortest paths in G-- 220 00:14:50,680 --> 00:14:56,650 this is G-- after we reweight will go to some graph G prime, 221 00:14:56,650 --> 00:14:59,950 with the same combinatorial structure just different edge 222 00:14:59,950 --> 00:15:00,610 weights. 223 00:15:00,610 --> 00:15:03,040 And we want shortest paths-- 224 00:15:03,040 --> 00:15:05,830 we want these weights to all be non-negative. 225 00:15:05,830 --> 00:15:09,130 And we want shortest paths, if there's a shortest path in G, 226 00:15:09,130 --> 00:15:12,190 it continues to be a shortest path in the reweighted graph. 227 00:15:12,190 --> 00:15:13,450 That's the goal for today. 228 00:15:16,560 --> 00:15:21,280 If we can do that transformation and this is not negative, 229 00:15:21,280 --> 00:15:22,500 then we're done. 230 00:15:22,500 --> 00:15:26,565 We're done because we can just run Dijkstra V 231 00:15:26,565 --> 00:15:32,940 times on that new graph, get the shortest path distances, 232 00:15:32,940 --> 00:15:36,360 construct a shortest path tree from those distances, 233 00:15:36,360 --> 00:15:40,290 and then traverse that tree in the original graph, 234 00:15:40,290 --> 00:15:43,320 and compute shortest paths along that tree. 235 00:15:43,320 --> 00:15:45,780 So that's the claim. 
236 00:15:45,780 --> 00:16:05,270 Claim-- we can compute distances in G-- 237 00:16:05,270 --> 00:16:09,360 so we're going to restrict G prime to have 238 00:16:09,360 --> 00:16:15,530 non-negative edge weights. 239 00:16:23,290 --> 00:16:26,935 If we have such a G prime with non-negative edge weights, 240 00:16:26,935 --> 00:16:34,150 we can compute distances in G from distances 241 00:16:34,150 --> 00:16:49,520 in G prime in V times V plus E, which is smaller 242 00:16:49,520 --> 00:16:51,200 than our Dijkstra running time. 243 00:16:51,200 --> 00:16:53,540 This is our Dijkstra running time. 244 00:16:53,540 --> 00:16:55,530 And this is smaller than that. 245 00:16:55,530 --> 00:16:58,220 So that would be fine, right? 246 00:16:58,220 --> 00:17:00,010 What do we do? 247 00:17:00,010 --> 00:17:03,580 We have our new graph G prime. 248 00:17:03,580 --> 00:17:05,079 It has all non-negative edge weights, 249 00:17:05,079 --> 00:17:07,540 so we run all-pairs shortest paths in here 250 00:17:07,540 --> 00:17:10,780 by just running Dijkstra V times. 251 00:17:10,780 --> 00:17:17,050 And then for each vertex s, we have some shortest path tree 252 00:17:17,050 --> 00:17:20,079 to all the things that it's connected to. 253 00:17:20,079 --> 00:17:25,130 We can look at that path in G. This 254 00:17:25,130 --> 00:17:27,380 is the same combinatorial graph, just 255 00:17:27,380 --> 00:17:28,850 with different edge weights. 256 00:17:28,850 --> 00:17:31,130 We can look at that same tree. 257 00:17:31,130 --> 00:17:35,870 I'm not going to be able to draw the exact same tree here. 258 00:17:35,870 --> 00:17:41,570 But what I can do is I can just BFS or DFS along this tree. 
259 00:17:41,570 --> 00:17:44,480 And every time I have an edge, because each one of these 260 00:17:44,480 --> 00:17:47,270 is the shortest path in this tree, 261 00:17:47,270 --> 00:17:53,780 I can just compute every time I traverse an edge what 262 00:17:53,780 --> 00:17:56,090 its shortest path distance is in G 263 00:17:56,090 --> 00:17:59,570 in linear time for that vertex for that particular s. 264 00:17:59,570 --> 00:18:04,570 I do that over all s's, and I get this running time. 265 00:18:04,570 --> 00:18:06,550 So that's the goal here. 266 00:18:06,550 --> 00:18:10,860 All right, so we first wanted to find-- 267 00:18:10,860 --> 00:18:12,930 we were wanting to make edge weights 268 00:18:12,930 --> 00:18:15,510 non-negative while preserving shortest paths. 269 00:18:15,510 --> 00:18:18,240 Because if we could do that, we could 270 00:18:18,240 --> 00:18:22,650 solve our original all-pairs shortest paths problem. 271 00:18:22,650 --> 00:18:24,990 So this is the proof sketch. 272 00:18:29,890 --> 00:18:31,780 But how can we do this? 273 00:18:31,780 --> 00:18:32,570 But how? 274 00:18:35,260 --> 00:18:39,490 It doesn't even seem like this could be possible, generally. 275 00:18:39,490 --> 00:18:41,800 How can I just reweight the edges 276 00:18:41,800 --> 00:18:45,580 and maintain that the shortest path trees are the same? 277 00:18:45,580 --> 00:18:47,160 That seems hard to do. 278 00:18:47,160 --> 00:18:51,440 And in particular, if there's a negative weight 279 00:18:51,440 --> 00:18:54,860 cycle in this graph, it's impossible to do. 280 00:18:54,860 --> 00:19:10,470 So claim-- not possible if G contains a negative weight 281 00:19:10,470 --> 00:19:10,970 cycle. 282 00:19:19,090 --> 00:19:23,230 OK, the exclamation point is just my comment here. 
283 00:19:25,940 --> 00:19:28,690 So if G contains a negative weight cycle, 284 00:19:28,690 --> 00:19:33,430 then in particular, the shortest path distance or a shortest 285 00:19:33,430 --> 00:19:37,300 path in G-- 286 00:19:37,300 --> 00:19:40,570 if this is G, say we have a negative weight 287 00:19:40,570 --> 00:19:44,990 cycle directed cycle C here. 288 00:19:44,990 --> 00:19:49,340 In particular, the shortest path from this vertex 289 00:19:49,340 --> 00:19:54,890 on the cycle to this vertex on the cycle, 290 00:19:54,890 --> 00:19:56,330 what is its shortest path? 291 00:19:56,330 --> 00:19:58,730 It's infinite. 292 00:19:58,730 --> 00:20:01,460 A shortest path is infinite around this cycle. 293 00:20:01,460 --> 00:20:03,740 You just keep going around the cycle over and over 294 00:20:03,740 --> 00:20:06,560 and over again because it has a negative weight. 295 00:20:06,560 --> 00:20:12,530 Weight of C is less than 0, strictly. 296 00:20:12,530 --> 00:20:14,420 That's what a negative weight cycle is. 297 00:20:14,420 --> 00:20:18,650 OK, so the shortest path-- a shortest path from s to t 298 00:20:18,650 --> 00:20:23,200 has infinite length and, in particular, is non-simple. 299 00:20:23,200 --> 00:20:42,830 However-- so shortest path from s to t is non-simple. 300 00:20:48,690 --> 00:20:52,350 But as we proved in the last lecture, 301 00:20:52,350 --> 00:21:10,500 shortest paths in a graph with non-negative weights are what? 302 00:21:10,500 --> 00:21:11,792 Are simple, right? 303 00:21:11,792 --> 00:21:13,500 Because they're just shortest path trees. 304 00:21:20,020 --> 00:21:22,780 So that's a contradiction. 305 00:21:22,780 --> 00:21:24,310 So this is not possible. 306 00:21:27,070 --> 00:21:34,330 So given a graph with negative weights but no negative cycle, 307 00:21:34,330 --> 00:21:37,810 it's still not clear how we could find such a reweighting 308 00:21:37,810 --> 00:21:39,390 of the graph. 309 00:21:39,390 --> 00:21:41,190 Can we do this? 
310 00:21:41,190 --> 00:21:45,150 Well, we're going to exploit a little idea here. 311 00:21:45,150 --> 00:21:48,810 How can we transform the weights of a path? 312 00:21:48,810 --> 00:21:51,500 Well, how-- what's a-- 313 00:21:51,500 --> 00:21:55,370 a silly idea, I have this silly idea. 314 00:21:55,370 --> 00:21:58,315 If I don't want negative edge weights in this graph-- 315 00:21:58,315 --> 00:22:00,290 ugh, this is messy in the back. 316 00:22:00,290 --> 00:22:06,700 You got edge weights 1, minus 2, 0, 1, 4, 5, and 1. 317 00:22:06,700 --> 00:22:10,270 There's only one negative edge weight here. 318 00:22:10,270 --> 00:22:14,170 What if I just added a large number, 319 00:22:14,170 --> 00:22:19,240 or in particular, the negative of the smallest 320 00:22:19,240 --> 00:22:23,200 edge in my graph to every edge in my graph? 321 00:22:23,200 --> 00:22:26,500 Then I'll have a graph with non-negative weights. 322 00:22:26,500 --> 00:22:29,060 Fantastic. 323 00:22:29,060 --> 00:22:32,400 Why is that not a good idea? 324 00:22:32,400 --> 00:22:34,500 Well, in particular, if I did that to this graph, 325 00:22:34,500 --> 00:22:39,640 if I added 2 to every edge, the weight of this path, 326 00:22:39,640 --> 00:22:45,800 which was the shortest path, changed from weight 3 327 00:22:45,800 --> 00:22:52,660 to weight 9, because I added 2 for every edge. 328 00:22:52,660 --> 00:22:55,630 But this path, which wasn't a shortest path in the original 329 00:22:55,630 --> 00:22:58,640 graph-- it had weight 4-- 330 00:22:58,640 --> 00:22:59,830 increased only by 2. 331 00:22:59,830 --> 00:23:02,360 Now that is a shortest path. 332 00:23:02,360 --> 00:23:04,890 Or it's a shorter path than this one, 333 00:23:04,890 --> 00:23:08,090 so this one can't be a shortest path. 334 00:23:08,090 --> 00:23:10,310 So that transformation, sure, would 335 00:23:10,310 --> 00:23:12,920 make all the weights non-negative, 336 00:23:12,920 --> 00:23:16,680 but would not preserve shortest paths. 
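To make the failure of the "add the same number to every edge" idea concrete, here is a small Python check. The edge weights are hypothetical and chosen only to match the totals mentioned above: a three-edge path of weight 3 and a one-edge path of weight 4. The helper `path_weight` is likewise just for illustration.

```python
def path_weight(weights, shift=0):
    """Total weight of a path after adding `shift` to each of its edges."""
    return sum(w + shift for w in weights)

# Hypothetical edge weights for two competing s-to-t paths.
three_edge_path = [4, -2, 1]  # total weight 3, the shortest path
one_edge_path = [4]           # total weight 4

# Shift every edge by the negative of the smallest edge weight.
shift = -min(three_edge_path + one_edge_path)

before = (path_weight(three_edge_path), path_weight(one_edge_path))
after = (path_weight(three_edge_path, shift), path_weight(one_edge_path, shift))
```

Here `before` is `(3, 4)` and `after` is `(9, 6)`: the three-edge path pays the shift three times while the one-edge path pays it once, so the ordering of the two paths flips.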
337 00:23:16,680 --> 00:23:23,390 In particular, if I added the same edge weight to every edge, 338 00:23:23,390 --> 00:23:29,210 I will bias toward taking paths that have fewer edges, not just 339 00:23:29,210 --> 00:23:31,110 smaller weight. 340 00:23:31,110 --> 00:23:34,560 So that first idea doesn't work. 341 00:23:34,560 --> 00:23:45,500 Idea-- add large number to each edge. 342 00:23:49,120 --> 00:23:51,940 This is bad. 343 00:23:51,940 --> 00:24:07,060 Makes weights non-negative, but does not 344 00:24:07,060 --> 00:24:12,030 preserve shortest paths. 345 00:24:12,030 --> 00:24:16,530 So this is not a good idea-- bad idea. 346 00:24:19,540 --> 00:24:22,060 Is there any way you can think of 347 00:24:22,060 --> 00:24:26,350 to modify the edge weights in a graph in any way that 348 00:24:26,350 --> 00:24:29,050 will preserve shortest paths? 349 00:24:29,050 --> 00:24:37,040 So here is an idea for you, which 350 00:24:37,040 --> 00:24:40,670 is kind of this critical step in Johnson's and in a lot of graph 351 00:24:40,670 --> 00:24:43,550 transformation algorithms. 352 00:24:43,550 --> 00:24:51,290 If I have a vertex, say this middle guy, say v, 353 00:24:51,290 --> 00:25:00,620 every path from v goes through an outgoing edge of v. 354 00:25:00,620 --> 00:25:08,570 And every path going into v goes through an edge going into v. 355 00:25:08,570 --> 00:25:11,180 I haven't said anything-- 356 00:25:11,180 --> 00:25:13,020 I've said very stupid things. 357 00:25:13,020 --> 00:25:15,680 But that observation is critical here. 358 00:25:15,680 --> 00:25:18,740 If I add a number-- 359 00:25:18,740 --> 00:25:20,960 or let me see if I got this right in terms 360 00:25:20,960 --> 00:25:24,320 of adding and subtracting. 
361 00:25:24,320 --> 00:25:29,940 If I add a number to all outgoing edges from a vertex, 362 00:25:29,940 --> 00:25:35,810 and I subtract that same number from the weights of all 363 00:25:35,810 --> 00:25:41,810 of the incoming edges to that vertex, then every path from v 364 00:25:41,810 --> 00:25:45,530 is changed by the same amount, because every path from v 365 00:25:45,530 --> 00:25:47,990 goes through one of those outgoing edges. 366 00:25:47,990 --> 00:25:51,440 And any path going into v has also 367 00:25:51,440 --> 00:25:53,450 changed by the same amount. 368 00:25:53,450 --> 00:25:57,950 In particular, it's changed by the negative of whatever 369 00:25:57,950 --> 00:26:01,480 we added to the outgoing edges. 370 00:26:01,480 --> 00:26:06,240 So such a transformation, adding a number 371 00:26:06,240 --> 00:26:09,130 to all the outgoing edges from a vertex 372 00:26:09,130 --> 00:26:13,140 and subtracting that same number from all the incoming edges, 373 00:26:13,140 --> 00:26:14,460 preserves shortest paths. 374 00:26:14,460 --> 00:26:15,480 That's a claim. 375 00:26:15,480 --> 00:26:19,810 Idea-- this is a better idea. 376 00:26:25,150 --> 00:26:32,695 Given vertex v, add-- 377 00:26:35,300 --> 00:26:39,180 I'm going to put this on two lines. 378 00:26:39,180 --> 00:26:55,670 Add weight h to all outgoing edges, 379 00:26:55,670 --> 00:27:04,315 and subtract weight h from all incoming edges. 380 00:27:07,310 --> 00:27:09,320 So that's the idea. 381 00:27:09,320 --> 00:27:19,040 And the claim is, this transformation, 382 00:27:19,040 --> 00:27:39,490 shortest paths are preserved under this transformation. 383 00:27:45,870 --> 00:27:48,290 And why is that? 384 00:27:48,290 --> 00:27:52,280 It's kind of the exact same argument that I had over there. 
385 00:27:52,280 --> 00:28:10,690 Proof-- consider any path in my graph, either if the path-- 386 00:28:10,690 --> 00:28:12,900 path could go through v many times, 387 00:28:12,900 --> 00:28:15,540 or it could go through not at all. 388 00:28:15,540 --> 00:28:23,830 If my path, if I have a path, in my original graph G, 389 00:28:23,830 --> 00:28:28,840 then with path weight w of pi-- 390 00:28:28,840 --> 00:28:31,580 this is my path-- 391 00:28:31,580 --> 00:28:41,390 it goes through v some number of times. 392 00:28:41,390 --> 00:28:45,640 So I'm going to say this is going from s to t. 393 00:28:45,640 --> 00:28:51,010 If it crosses v-- if it never crosses v, 394 00:28:51,010 --> 00:28:55,570 if it never touches v, the vertex that I transformed, 395 00:28:55,570 --> 00:28:58,090 then I argue that the path weight is the same 396 00:28:58,090 --> 00:29:01,400 because I didn't do anything to edges that are in this path. 397 00:29:01,400 --> 00:29:06,170 Alternatively, this thing goes through v sometimes. 398 00:29:06,170 --> 00:29:12,660 If it goes through v in the middle, 399 00:29:12,660 --> 00:29:14,980 how is the weight of my path changed? 400 00:29:14,980 --> 00:29:19,960 Well, it hasn't, because I added a number to all outgoing edges, 401 00:29:19,960 --> 00:29:23,040 so there's an outgoing edge here with weight 402 00:29:23,040 --> 00:29:25,320 I've changed by weight h. 403 00:29:25,320 --> 00:29:26,820 And there's an incoming edge here 404 00:29:26,820 --> 00:29:29,020 that I've changed by weight negative h. 405 00:29:29,020 --> 00:29:32,280 So these cancel out and you've got 0. 406 00:29:32,280 --> 00:29:36,250 So passing through a vertex doesn't 407 00:29:36,250 --> 00:29:39,130 change the weight of my path. 408 00:29:39,130 --> 00:29:42,040 The only way I could change the weight of my path 409 00:29:42,040 --> 00:29:44,575 is if v is the start vertex or the end vertex. 
410 00:29:47,560 --> 00:29:52,720 So it's possible that s is my vertex or t is my vertex. 411 00:29:52,720 --> 00:29:58,320 Well, for any path leaving v, I will 412 00:29:58,320 --> 00:30:02,790 have increased the weight of that path 413 00:30:02,790 --> 00:30:07,330 by h, because I added a weight h to all outgoing edges. 414 00:30:07,330 --> 00:30:10,110 So again, while the path weight has 415 00:30:10,110 --> 00:30:15,490 changed, since all of the paths leaving v 416 00:30:15,490 --> 00:30:18,340 have changed by the same amount, a shortest path 417 00:30:18,340 --> 00:30:19,900 will still be a shortest path. 418 00:30:19,900 --> 00:30:22,180 And same goes for t. 419 00:30:22,180 --> 00:30:26,950 If t, the end vertex is v, I'll have subtracted h 420 00:30:26,950 --> 00:30:30,070 from all of my incoming edges, which 421 00:30:30,070 --> 00:30:35,740 means that any path ending at t, any directed path ending at t, 422 00:30:35,740 --> 00:30:38,060 also changes by the same value. 423 00:30:38,060 --> 00:30:42,350 And so shortest paths must be preserved. 424 00:30:42,350 --> 00:30:46,130 So shortest paths preserved. 425 00:30:50,050 --> 00:30:51,730 So that's pretty cool transformation. 426 00:30:51,730 --> 00:30:57,870 I can assign for any vertex such a transformation which 427 00:30:57,870 --> 00:31:00,180 affects all of the edges surrounding it 428 00:31:00,180 --> 00:31:06,180 by this h additive factor, either added or subtracted. 429 00:31:06,180 --> 00:31:09,900 So maybe-- and I can do this independently for every vertex. 430 00:31:09,900 --> 00:31:11,970 The shortest paths were preserved 431 00:31:11,970 --> 00:31:14,250 by me doing this to one vertex. 432 00:31:14,250 --> 00:31:17,270 Then if I do it to another vertex, 433 00:31:17,270 --> 00:31:19,580 then shortest paths are still preserved. 434 00:31:19,580 --> 00:31:21,920 And let's prove that real quick. 
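The single-vertex argument above can be checked on a toy example. The sketch below is an assumption-laden illustration, not the lecture's code: the graph, the vertex names, and the helpers `reweight_at` and `path_weight` are all hypothetical. It adds h to every edge leaving v and subtracts h from every edge entering v, then compares path weights before and after.

```python
def reweight_at(edges, v, h):
    """Return new weights: +h on edges leaving v, -h on edges entering v."""
    new_edges = {}
    for (a, b), w in edges.items():
        if a == v:
            w += h  # outgoing edge of v
        if b == v:
            w -= h  # incoming edge of v
        new_edges[(a, b)] = w
    return new_edges

def path_weight(edges, path):
    """Sum of edge weights along a path given as a list of vertices."""
    return sum(edges[(a, b)] for a, b in zip(path, path[1:]))

# Hypothetical directed graph: (tail, head) -> weight.
edges = {
    ("s", "a"): 1, ("a", "v"): -2, ("v", "t"): 4,
    ("s", "t"): 4, ("s", "v"): 5,
}
h = 3
new_edges = reweight_at(edges, "v", h)

# Paths passing through v in the middle keep their weight.
through_old = path_weight(edges, ["s", "a", "v", "t"])
through_new = path_weight(new_edges, ["s", "a", "v", "t"])

# Paths ending at v all change by the same amount, -h.
end_changes = {
    path_weight(new_edges, p) - path_weight(edges, p)
    for p in (["s", "a", "v"], ["s", "v"])
}

# A path starting at v changes by +h.
start_change = path_weight(new_edges, ["v", "t"]) - path_weight(edges, ["v", "t"])
```

In this example the path through v stays at weight 3, both paths ending at v drop by exactly h, and the path leaving v rises by exactly h, so the relative order of competing paths between any fixed pair of endpoints is unchanged.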
435 00:31:27,980 --> 00:31:29,570 What I'm going to do is I'm going 436 00:31:29,570 --> 00:31:31,760 to want to do this to give me flexibility 437 00:31:31,760 --> 00:31:33,950 for changing all the edge weights in my graph 438 00:31:33,950 --> 00:31:35,960 to have this property. 439 00:31:35,960 --> 00:31:36,710 I'm going to set-- 440 00:31:39,220 --> 00:31:56,372 or define a potential function h that maps vertices to integers. 441 00:32:04,480 --> 00:32:16,900 So this is the potential h of v. And then we're 442 00:32:16,900 --> 00:32:36,760 going to make a graph, G prime based on above transformation 443 00:32:36,760 --> 00:32:44,330 for each vertex in v. 444 00:32:44,330 --> 00:32:49,670 So I'm going to set a number, an h for each vertex. 445 00:32:49,670 --> 00:32:52,420 These are independent now. 446 00:32:52,420 --> 00:32:54,850 And I'm going to add that potential 447 00:32:54,850 --> 00:32:56,063 to all outgoing edges. 448 00:32:56,063 --> 00:32:57,730 And I'm going to subtract that potential 449 00:32:57,730 --> 00:33:00,200 from all incoming edges. 450 00:33:00,200 --> 00:33:03,470 This transformation is going to preserve shortest paths. 451 00:33:03,470 --> 00:33:05,990 Let's actually be a little bit more rigorous 452 00:33:05,990 --> 00:33:09,690 that that's the case when we do this multiple times. 453 00:33:09,690 --> 00:33:24,030 So claim-- shortest paths are still preserved. 454 00:33:29,620 --> 00:33:32,630 All right, well, that's, again, not so difficult to see. 455 00:33:32,630 --> 00:33:40,362 Let's consider a path from s to t. 456 00:33:40,362 --> 00:33:42,165 It passes through a bunch of vertices. 457 00:33:42,165 --> 00:33:51,320 I'm going to label these as v0 to vk so that I can 458 00:33:51,320 --> 00:33:53,480 kind of number them. 459 00:33:53,480 --> 00:33:56,850 All right, this is v1 here. 460 00:33:56,850 --> 00:34:03,350 This is a directed path, v1, 2, 3, 4, all the way up to k. 
461 00:34:03,350 --> 00:34:07,200 There are k edges in this path. 462 00:34:07,200 --> 00:34:14,040 I claim to you that any path from v0 to vk, any shortest 463 00:34:14,040 --> 00:34:19,469 path from v0 to vk remains a shortest path after I reweight 464 00:34:19,469 --> 00:34:22,699 everything in this way. 465 00:34:22,699 --> 00:34:27,339 So let's say this is path pi, and so it has weight w 466 00:34:27,339 --> 00:34:33,199 pi, which is really just the sum of all of the edge 467 00:34:33,199 --> 00:34:42,949 weights from vi minus 1 to vi, for i equals 1 to k. 468 00:34:42,949 --> 00:34:44,070 This is poor notation. 469 00:34:44,070 --> 00:34:49,300 This is the weight of the edge from vi minus 1 to vi. 470 00:34:49,300 --> 00:34:53,040 And we've got-- it indexes from 1-- 471 00:34:53,040 --> 00:34:54,600 that's the first edge-- 472 00:34:54,600 --> 00:34:57,180 to k, which is the last edge. 473 00:34:57,180 --> 00:34:59,460 So that's the weight of my path. 474 00:34:59,460 --> 00:35:01,160 The weight of my transformed path-- 475 00:35:01,160 --> 00:35:02,740 I'm going to do it down here. 476 00:35:02,740 --> 00:35:06,017 It's a little iffy. 477 00:35:06,017 --> 00:35:08,100 The weight of my transformed path I'm going to say 478 00:35:08,100 --> 00:35:12,960 is the weight in this new weighted graph G prime. 479 00:35:16,390 --> 00:35:18,310 This weight of that same path-- 480 00:35:18,310 --> 00:35:20,470 it's the same path-- 481 00:35:20,470 --> 00:35:26,510 is just going to be the sum of all of the reweighted edges. 482 00:35:26,510 --> 00:35:36,650 So i equals 1 to k of my original weight of my edge, 483 00:35:36,650 --> 00:35:40,430 so from vi minus 1 to vi. 484 00:35:43,100 --> 00:35:45,310 But what did I do? 485 00:35:45,310 --> 00:35:49,570 This edge is outgoing from vi minus 1. 486 00:35:49,570 --> 00:35:53,140 So it's outgoing, so I add that weight-- 487 00:35:53,140 --> 00:35:54,280 that potential, sorry.
488 00:35:59,680 --> 00:36:04,450 But that edge is also incoming into vi. 489 00:36:04,450 --> 00:36:11,495 So when I reweighted the thing, I got a subtraction of h, vi. 490 00:36:16,850 --> 00:36:19,070 Now, what happens here in the sum, 491 00:36:19,070 --> 00:36:23,780 this term, if I just took the sum over this term, 492 00:36:23,780 --> 00:36:26,500 that's exactly my original path weight. 493 00:36:26,500 --> 00:36:27,170 So that's good. 494 00:36:31,280 --> 00:36:36,510 But you'll notice that this sum has k terms, 495 00:36:36,510 --> 00:36:40,830 and this sum has the subtraction of k other terms. 496 00:36:40,830 --> 00:36:45,450 But most of these terms are equal. 497 00:36:45,450 --> 00:36:52,860 Along the path, all the incoming and outgoing edges cancel out. 498 00:36:52,860 --> 00:36:58,070 So we're left with only adding the potential 499 00:36:58,070 --> 00:37:02,450 at the starting vertex and subtracting the potential 500 00:37:02,450 --> 00:37:04,000 at the final vertex. 501 00:37:04,000 --> 00:37:11,510 So we've got, add h, v0 minus h, vk. 502 00:37:15,440 --> 00:37:17,780 And why is that good? 503 00:37:17,780 --> 00:37:22,490 Well, that's good because every path from v0 504 00:37:22,490 --> 00:37:26,900 to vk starts at v0 and ends at vk. 505 00:37:26,900 --> 00:37:29,390 That's just-- that's how it is. 506 00:37:29,390 --> 00:37:33,380 That's how we've defined paths going from v0 to vk. 507 00:37:33,380 --> 00:37:38,060 But every such path, we transform the weight 508 00:37:38,060 --> 00:37:42,500 of that path by adding a constant associated 509 00:37:42,500 --> 00:37:46,730 with the start and subtracting a value associated with the end. 510 00:37:46,730 --> 00:37:51,140 And so every path going from v0 to vk changes 511 00:37:51,140 --> 00:37:54,060 by the same amount.
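Written out, the telescoping sum described on the board looks as follows. This is a reconstruction from the spoken description, using w' for the weights in G prime and h for the potential function; the symbols are the lecture's, the typesetting is an editorial sketch.

```latex
\begin{align*}
w'(\pi) &= \sum_{i=1}^{k} \bigl( w(v_{i-1}, v_i) + h(v_{i-1}) - h(v_i) \bigr) \\
        &= \sum_{i=1}^{k} w(v_{i-1}, v_i) \;+\; \sum_{i=1}^{k} h(v_{i-1}) \;-\; \sum_{i=1}^{k} h(v_i) \\
        &= w(\pi) + h(v_0) - h(v_k)
\end{align*}
```

The two potential sums share the terms h(v_1) through h(v_{k-1}), so only h(v_0) and h(v_k) survive, and the shift depends on the endpoints alone.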
512 00:37:54,060 --> 00:38:00,330 And so if this path pi was shortest, 513 00:38:00,330 --> 00:38:02,730 it's still shortest in the reweighted graph 514 00:38:02,730 --> 00:38:05,580 because I've just changed all paths between those two 515 00:38:05,580 --> 00:38:08,140 vertices by the same amount. 516 00:38:08,140 --> 00:38:11,520 This is kind of like a telescoping argument here 517 00:38:11,520 --> 00:38:12,440 in that kind of proof. 518 00:38:12,440 --> 00:38:12,940 Right. 519 00:38:12,940 --> 00:38:18,210 So we have, the weight changes. 520 00:38:18,210 --> 00:38:22,050 It could change, but it changes all paths between these two 521 00:38:22,050 --> 00:38:24,850 vertices by the same amount, which 522 00:38:24,850 --> 00:38:28,000 means that shortest paths are still shortest. 523 00:38:28,000 --> 00:38:29,540 Awesome. 524 00:38:29,540 --> 00:38:36,390 OK, so the name of the game here is now, 525 00:38:36,390 --> 00:38:39,330 we have this really flexible tool. 526 00:38:39,330 --> 00:38:43,950 We have this tool where we can add or subtract weight 527 00:38:43,950 --> 00:38:45,360 from various edges. 528 00:38:45,360 --> 00:38:48,670 But we have to do so in a kind of localized, constrained way. 529 00:38:48,670 --> 00:38:52,870 We have to do the same thing around each vertex. 530 00:38:52,870 --> 00:38:55,260 But it seems like a powerful transformation technique 531 00:38:55,260 --> 00:38:58,750 that maybe we can get this thing that we want, 532 00:38:58,750 --> 00:39:01,380 which is a G prime, a reweighting of the graph where 533 00:39:01,380 --> 00:39:07,130 all the edge weights are positive or non-negative. 534 00:39:07,130 --> 00:39:16,040 So does there exist an h such that the weights are all 535 00:39:16,040 --> 00:39:16,710 positive? 536 00:39:16,710 --> 00:39:17,720 What does that mean? 
537 00:39:17,720 --> 00:39:28,540 w prime u, v, the weight in my new graph, in G prime, 538 00:39:28,540 --> 00:39:36,250 I want these modified weights, this modified 539 00:39:36,250 --> 00:39:43,280 weight of my graph, I want each of these to be non-negative. 540 00:39:43,280 --> 00:39:46,271 So does there exist such a thing? 541 00:39:46,271 --> 00:39:47,670 Huh. 542 00:39:47,670 --> 00:39:50,520 Well, if I rearrange this equation a little bit, 543 00:39:50,520 --> 00:39:55,695 this side, I get something that looks like this. 544 00:39:55,695 --> 00:40:04,030 h of v needs to be less than or equal to h of u, 545 00:40:04,030 --> 00:40:08,570 plus the weight of some edge from u 546 00:40:08,570 --> 00:40:16,450 to v. What does that look like? 547 00:40:16,450 --> 00:40:19,810 That looks like almost exactly the definition 548 00:40:19,810 --> 00:40:21,520 of the triangle inequality. 549 00:40:24,270 --> 00:40:27,990 Shortest path from some vertex here and its shortest path 550 00:40:27,990 --> 00:40:30,210 distance from the same vertex here, 551 00:40:30,210 --> 00:40:35,660 this is just a statement of the triangle inequality. 552 00:40:35,660 --> 00:40:39,320 So if we can set these h's to be the shortest path 553 00:40:39,320 --> 00:40:42,320 distance from some vertex and those shortest path distances 554 00:40:42,320 --> 00:40:49,570 are finite, and not minus infinity, 555 00:40:49,570 --> 00:40:54,370 then this thing will hold by triangle inequality. 556 00:40:54,370 --> 00:40:58,360 And in particular, if we were to reweight the edges based 557 00:40:58,360 --> 00:41:01,780 on those values of h, then we get new edge weights 558 00:41:01,780 --> 00:41:04,630 that are non-negative. 559 00:41:04,630 --> 00:41:05,740 Awesome. 560 00:41:05,740 --> 00:41:07,380 OK. 561 00:41:07,380 --> 00:41:11,040 But there might not be any vertex 562 00:41:11,040 --> 00:41:15,000 from which we can access, which we can reach 563 00:41:15,000 --> 00:41:16,920 all vertices in the graph.
564 00:41:16,920 --> 00:41:19,650 In particular, my graph might not even be connected. 565 00:41:24,350 --> 00:41:27,170 If I want this property, I need all of these-- 566 00:41:27,170 --> 00:41:32,780 I don't gain any information if these things are infinite. 567 00:41:32,780 --> 00:41:34,940 It's vacuously true. 568 00:41:34,940 --> 00:41:37,760 Infinity is-- I don't even know how 569 00:41:37,760 --> 00:41:41,580 to compare infinity and infinity plus a constant. 570 00:41:41,580 --> 00:41:44,270 I don't know. 571 00:41:44,270 --> 00:41:46,560 So I need all of these things to be finite. 572 00:41:46,560 --> 00:41:49,310 So how can I make those things finite? 573 00:41:49,310 --> 00:41:50,750 So here's the next idea. 574 00:42:00,740 --> 00:42:18,500 Add new vertex s with 0-weight edge 575 00:42:18,500 --> 00:42:30,250 to every vertex v in V. We take our original graph. 576 00:42:30,250 --> 00:42:33,460 We add a new super node or auxiliary vertex 577 00:42:33,460 --> 00:42:37,468 s, with a 0-weight edge to every vertex in my graph. 578 00:42:37,468 --> 00:42:38,510 What does that look like? 579 00:42:38,510 --> 00:42:42,260 This is like-- there's my original graph, 580 00:42:42,260 --> 00:42:45,260 and now I have this vertex s. 581 00:42:45,260 --> 00:42:49,160 But it has directed edges into all of the vertices 582 00:42:49,160 --> 00:42:51,820 with 0-weight. 583 00:42:51,820 --> 00:42:52,930 That's our picture. 584 00:42:52,930 --> 00:43:03,470 And this new thing I'm going to call, maybe, my s graph now.
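As a minimal sketch of the construction just described, the graph G_s can be built like this. The representation (a list of vertices plus a dict mapping edge pairs to weights), the function name `add_supernode`, and the default supernode label `"s*"` are all assumptions made for illustration, not something from the lecture.

```python
def add_supernode(vertices, edges, s="s*"):
    """Return (vertices_s, edges_s) for G_s: G plus a new source vertex s.

    `edges` maps (u, v) pairs to integer weights. The name `s` is a
    hypothetical label; it only has to differ from every existing vertex.
    """
    if s in vertices:
        raise ValueError("supernode name must not clash with an existing vertex")
    vertices_s = list(vertices) + [s]
    edges_s = dict(edges)          # copy, so the original graph is untouched
    for v in vertices:
        edges_s[(s, v)] = 0        # directed 0-weight edge from s to v
    return vertices_s, edges_s


vertices_s, edges_s = add_supernode(["a", "b"], {("a", "b"): -1})
print(edges_s)  # {('a', 'b'): -1, ('s*', 'a'): 0, ('s*', 'b'): 0}
```

Every edge touching s points away from it, so s has no incoming edges and every original vertex is reachable from s.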
585 00:43:07,740 --> 00:43:13,630 And the claim is, well now, if I run some shortest path 586 00:43:13,630 --> 00:43:15,300 algorithm, single source shortest path 587 00:43:15,300 --> 00:43:19,560 algorithm this time, from s to compute the shortest path 588 00:43:19,560 --> 00:43:26,000 distance to all of the vertices, the shortest distance 589 00:43:26,000 --> 00:43:31,310 to each of the vertices can't be positive, 590 00:43:31,310 --> 00:43:34,090 because there's a 0-weight edge. 591 00:43:34,090 --> 00:43:38,260 So a minimum weight path is going to be no bigger than 0. 592 00:43:40,890 --> 00:43:43,470 If it's finite, then there's a finite length shortest path. 593 00:43:43,470 --> 00:43:46,790 If it's minus infinity, then there's 594 00:43:46,790 --> 00:43:51,290 a negative weight cycle in my graph and I can stop. 595 00:43:51,290 --> 00:43:58,050 So there are two situations. 596 00:43:58,050 --> 00:44:05,090 If delta s,v equals minus infinity-- 597 00:44:05,090 --> 00:44:18,290 so I guess this is, run single source shortest paths from s. 598 00:44:18,290 --> 00:44:21,920 And really, because this graph could contain negative edge 599 00:44:21,920 --> 00:44:24,170 weights and could contain negative cycles, 600 00:44:24,170 --> 00:44:28,040 we can't really do better than running Bellman-Ford here 601 00:44:28,040 --> 00:44:30,260 from s to compute these paths. 602 00:44:30,260 --> 00:44:36,890 If there exists in this new graph this Gs, if there exists 603 00:44:36,890 --> 00:44:41,210 a vertex that has negative infinite weight 604 00:44:41,210 --> 00:44:45,560 in the reweighted graph-- 605 00:44:45,560 --> 00:44:50,880 sorry, in the original graph G-- 606 00:44:50,880 --> 00:44:53,490 G hasn't been reweighted yet. 607 00:44:53,490 --> 00:45:01,380 If there's a minus infinity distance from s, 608 00:45:01,380 --> 00:45:09,030 then there was a negative weight cycle in the original graph. 609 00:45:09,030 --> 00:45:10,950 Why is that?
610 00:45:10,950 --> 00:45:14,010 Well, if this was set to minus infinity, 611 00:45:14,010 --> 00:45:18,300 then there is some negative weight cycle in the graph. 612 00:45:18,300 --> 00:45:20,190 The worry is that that negative weight 613 00:45:20,190 --> 00:45:27,030 cycle was added to my graph by adding this vertex s. 614 00:45:27,030 --> 00:45:29,520 But what do I know about vertex s? 615 00:45:29,520 --> 00:45:31,290 It has no incoming edges. 616 00:45:31,290 --> 00:45:35,430 So no negative weight cycle could go through s. 617 00:45:35,430 --> 00:45:38,610 So any negative weight cycle was in the original graph, 618 00:45:38,610 --> 00:45:41,420 and so I can abort. 619 00:45:41,420 --> 00:45:43,490 Abort. 620 00:45:43,490 --> 00:45:45,050 Yay. 621 00:45:45,050 --> 00:45:51,550 Otherwise, what do I do? 622 00:45:51,550 --> 00:45:56,710 Well, I know the shortest path distances here 623 00:45:56,710 --> 00:46:00,520 would satisfy the triangle inequality. 624 00:46:00,520 --> 00:46:18,960 So if I reweight with h of v equal to delta s of v, 625 00:46:18,960 --> 00:46:22,320 if we set our potentials in our reweighted graph 626 00:46:22,320 --> 00:46:28,900 to be the shortest path distance from our super node s, 627 00:46:28,900 --> 00:46:32,450 it satisfies the triangle inequality. 628 00:46:35,070 --> 00:46:39,480 And because there's no negative cycles, all of these values 629 00:46:39,480 --> 00:46:40,950 are finite. 630 00:46:40,950 --> 00:46:45,840 And then this reweighting will lead to a graph with strictly-- 631 00:46:45,840 --> 00:46:48,720 or not strictly-- 632 00:46:48,720 --> 00:46:53,700 strictly no negative weights or non-negative weights. 633 00:46:53,700 --> 00:46:56,440 OK, great. 634 00:46:56,440 --> 00:46:59,590 So that's basically it. 635 00:46:59,590 --> 00:47:02,231 That's the idea behind Johnson's algorithm. 636 00:47:08,620 --> 00:47:12,550 It's really a reduction problem or a reduction algorithm. 
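To make the reweighting concrete, here is a small sketch on a made-up three-vertex graph. The function name `reweight`, the dict-of-edges representation, and the example graph are assumptions for illustration; the potentials are worked out by hand as shortest path distances from a supernode s with 0-weight edges to a, b, and c.

```python
def reweight(edges, h):
    """Return new edge weights w'(u, v) = w(u, v) + h[u] - h[v]."""
    return {(u, v): w + h[u] - h[v] for (u, v), w in edges.items()}


# Hypothetical example: negative edges, but no negative-weight cycle.
edges = {("a", "b"): -2, ("b", "c"): -3, ("a", "c"): -4}

# h(v) = delta(s, v), computed by hand:
#   delta(s, a) = 0
#   delta(s, b) = min(0, 0 + -2) = -2
#   delta(s, c) = min(0, -2 + -3, 0 + -4) = -5
h = {"a": 0, "b": -2, "c": -5}

new_edges = reweight(edges, h)
print(new_edges)  # {('a', 'b'): 0, ('b', 'c'): 0, ('a', 'c'): 1}
```

Both paths from a to c shift by the same amount, h[a] - h[c] = 5: the path through b goes from -5 to 0, and the direct edge goes from -4 to 1, so the path through b stays the shortest.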
637 00:47:12,550 --> 00:47:18,580 We're reducing from solving kind of signed all-pairs 638 00:47:18,580 --> 00:47:21,340 shortest paths, graphs where their weights could 639 00:47:21,340 --> 00:47:24,230 be positive or negative, and we're 640 00:47:24,230 --> 00:47:27,770 reducing to creating a graph that 641 00:47:27,770 --> 00:47:30,590 has the same shortest paths properties, 642 00:47:30,590 --> 00:47:32,360 but only has non-negative edge weights. 643 00:47:32,360 --> 00:47:35,060 So we're reducing from a signed context 644 00:47:35,060 --> 00:47:37,325 to a non-negative weight context. 645 00:47:40,700 --> 00:47:46,130 So Johnson's algorithm, what are the steps? 646 00:47:46,130 --> 00:48:00,840 Construct Gs from G, just as up here. 647 00:48:00,840 --> 00:48:04,730 I make a new vertex s. 648 00:48:04,730 --> 00:48:09,190 I put a 0 weight directed edge from s to every vertex. 649 00:48:09,190 --> 00:48:11,390 So that's the first step. 650 00:48:11,390 --> 00:48:25,690 Second step-- compute delta s,v for all v in V, i.e-- 651 00:48:25,690 --> 00:48:29,315 or e.g-- I guess really it should 652 00:48:29,315 --> 00:48:31,690 be i.e. because I don't really have another option here-- 653 00:48:31,690 --> 00:48:36,070 but by Bellman-Ford. 654 00:48:39,750 --> 00:48:44,140 This is just a single run of Bellman-Ford here. 655 00:48:44,140 --> 00:48:46,270 Compute. 656 00:48:46,270 --> 00:48:49,190 And then there are two possibilities. 657 00:48:49,190 --> 00:49:04,440 If there exists a delta s, v that's minus infinity, 658 00:49:04,440 --> 00:49:05,835 then abort. 659 00:49:09,490 --> 00:49:22,750 Else, make-- or reweight the graph according to this 660 00:49:22,750 --> 00:49:37,170 reweighting scheme, by reweighting each edge 661 00:49:37,170 --> 00:49:40,140 in my original graph to have weight-- 662 00:49:44,130 --> 00:49:48,180 our new weight, which is our old weight, 663 00:49:48,180 --> 00:49:49,410 plus our transformation.
664 00:49:49,410 --> 00:49:53,760 Now, our transformation is now going to set h, v to this delta 665 00:49:53,760 --> 00:50:00,630 s, v. So I'm going to add delta s, 666 00:50:00,630 --> 00:50:06,570 u, and subtract delta s, v. That's our reweighting scheme. 667 00:50:06,570 --> 00:50:10,690 I'm just identifying h, v with this shortest path distance 668 00:50:10,690 --> 00:50:11,190 here. 669 00:50:16,300 --> 00:50:20,730 And after I reweighted that, I can just 670 00:50:20,730 --> 00:50:32,375 solve all-pairs shortest paths on G prime with Dijkstra. 671 00:50:36,170 --> 00:50:50,450 And then compute G shortest path distances from G prime shortest 672 00:50:50,450 --> 00:50:53,250 path distances. 673 00:50:57,440 --> 00:51:03,200 Compute these distances from the other using this algorithm up 674 00:51:03,200 --> 00:51:08,900 here-- can compute distances in G from distances in G prime 675 00:51:08,900 --> 00:51:10,640 in linear time-- 676 00:51:10,640 --> 00:51:17,150 or sorry, v times linear time, linear time for each s-- 677 00:51:17,150 --> 00:51:19,460 for each vertex in my graph. 678 00:51:19,460 --> 00:51:22,340 OK, so that's the algorithm. 679 00:51:22,340 --> 00:51:26,810 It's basically, correctness is trivial. 680 00:51:26,810 --> 00:51:29,840 We already proved-- the whole part of this lecture, 681 00:51:29,840 --> 00:51:31,640 the interesting part of this lecture 682 00:51:31,640 --> 00:51:35,300 was proving that, if we had a transformation based 683 00:51:35,300 --> 00:51:37,670 on a potential function that changed 684 00:51:37,670 --> 00:51:41,240 outgoing edges in a symmetrically opposite way 685 00:51:41,240 --> 00:51:47,670 as incoming edges, then that preserves shortest paths. 
686 00:51:47,670 --> 00:51:55,630 And then realizing that the triangle inequality enforces 687 00:51:55,630 --> 00:51:57,730 this condition that edge weights will 688 00:51:57,730 --> 00:52:04,360 be non-negative under this reweighting, 689 00:52:04,360 --> 00:52:07,120 so we find shortest path distances 690 00:52:07,120 --> 00:52:09,650 from some other arbitrary vertex, 691 00:52:09,650 --> 00:52:13,000 and set our potential functions to be those shortest path 692 00:52:13,000 --> 00:52:14,350 distance weights. 693 00:52:14,350 --> 00:52:18,010 We do the reweighting, because that reweighting 694 00:52:18,010 --> 00:52:20,260 preserves shortest paths, which we already argued. 695 00:52:24,000 --> 00:52:27,270 Then we can do-- then this has non-negative edge weights, 696 00:52:27,270 --> 00:52:29,100 so Dijkstra applies. 697 00:52:29,100 --> 00:52:31,420 And then computing this takes a small amount of time. 698 00:52:31,420 --> 00:52:35,230 OK, what is the running time of this algorithm? 699 00:52:35,230 --> 00:52:38,310 So this part, constructing this thing, 700 00:52:38,310 --> 00:52:40,118 this takes linear time. 701 00:52:40,118 --> 00:52:40,785 I'm just adding. 702 00:52:44,250 --> 00:52:47,490 I'm just making a new graph of the same size, 703 00:52:47,490 --> 00:52:51,810 except I added v edges and one vertex. 704 00:52:51,810 --> 00:52:56,490 Computing-- doing Bellman-Ford on this new modified graph, 705 00:52:56,490 --> 00:52:57,570 that's just-- 706 00:52:57,570 --> 00:52:59,280 I'm doing that once. 707 00:52:59,280 --> 00:53:01,290 That takes V times E time. 708 00:53:04,030 --> 00:53:07,390 Doing this check, that just takes-- 709 00:53:07,390 --> 00:53:08,830 I'm looping over my vertices. 710 00:53:08,830 --> 00:53:10,720 That just takes V time. 711 00:53:10,720 --> 00:53:15,940 Otherwise, doing this reweighting, 712 00:53:15,940 --> 00:53:18,040 I change the weight of every edge. 713 00:53:18,040 --> 00:53:20,200 That takes order E time.
714 00:53:23,520 --> 00:53:27,570 And then solving G prime-- 715 00:53:27,570 --> 00:53:31,260 solving all-pairs shortest paths on the modified edge weight 716 00:53:31,260 --> 00:53:35,500 graph with Dijkstra takes V times Dijkstra. 717 00:53:35,500 --> 00:53:40,890 That's-- I could use a little bit more board space here. 718 00:53:40,890 --> 00:53:56,560 That's V times V log V plus E time, 719 00:53:56,560 --> 00:53:59,930 which is actually the running time that we're looking for. 720 00:53:59,930 --> 00:54:02,890 I wanted to reduce to not using more than this time. 721 00:54:02,890 --> 00:54:04,520 We used this amount of time. 722 00:54:04,520 --> 00:54:08,050 Let's make sure we still didn't use even more. 723 00:54:08,050 --> 00:54:12,790 After that, we compute these paths, as proved before, 724 00:54:12,790 --> 00:54:21,150 in V times V plus E, which is smaller than that. 725 00:54:21,150 --> 00:54:24,690 And so summing up all of these running times, 726 00:54:24,690 --> 00:54:25,840 this one dominates. 727 00:54:25,840 --> 00:54:30,240 And so Johnson's can solve signed 728 00:54:30,240 --> 00:54:34,960 weighted all-pairs shortest paths, signed 729 00:54:34,960 --> 00:54:40,060 all-pairs shortest paths, not in V times Bellman-Ford, like we 730 00:54:40,060 --> 00:54:47,260 had before up here, but faster, in nearly 731 00:54:47,260 --> 00:54:53,180 linear for sparse graphs, just without this log factor. 732 00:54:53,180 --> 00:54:58,040 So we got quite a big improvement. 733 00:54:58,040 --> 00:55:00,950 So that's the nice thing about all-pairs 734 00:55:00,950 --> 00:55:04,160 shortest paths is that, really, we 735 00:55:04,160 --> 00:55:09,860 don't have to incur this big cost in the context 736 00:55:09,860 --> 00:55:11,000 of negative weights. 737 00:55:11,000 --> 00:55:14,180 Essentially, we just run Bellman-Ford once 738 00:55:14,180 --> 00:55:17,090 to see if there is a negative weight cycle in my graph.
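The steps of Johnson's algorithm can be put together in a short sketch. This is one possible implementation under stated assumptions, not the course's reference code: the function name `johnson`, the dict-of-edges input, and returning `None` to signal "abort" are illustrative choices. Two simplifications are worth flagging. The supernode is implicit: starting every potential at 0 has the same effect as relaxing the 0-weight edges out of s. And Dijkstra here uses Python's binary heap (`heapq`), which gives O((V + E) log V) per source rather than the V log V + E bound quoted in the lecture.

```python
import heapq


def johnson(vertices, edges):
    """All-pairs shortest path distances for integer edge weights.

    `edges` maps (u, v) -> weight. Returns dist[u][v] (float('inf') when v
    is unreachable from u), or None if there is a negative-weight cycle.
    """
    # Steps 1-2: Bellman-Ford from an implicit supernode s. Initialising
    # h[v] = 0 corresponds to the 0-weight edge (s, v).
    h = {v: 0 for v in vertices}
    for _ in range(len(vertices) - 1):
        for (u, v), w in edges.items():
            if h[u] + w < h[v]:
                h[v] = h[u] + w
    # If any edge can still be relaxed, some delta(s, v) is minus infinity.
    if any(h[u] + w < h[v] for (u, v), w in edges.items()):
        return None  # abort: negative-weight cycle

    # Step 3: reweight each edge to w + h[u] - h[v], which is non-negative.
    adj = {v: [] for v in vertices}
    for (u, v), w in edges.items():
        adj[u].append((v, w + h[u] - h[v]))

    # Step 4: Dijkstra from every vertex on the reweighted graph.
    dist = {}
    for src in vertices:
        d = {v: float("inf") for v in vertices}
        d[src] = 0
        heap = [(0, src)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > d[u]:
                continue  # stale heap entry
            for v, w in adj[u]:
                if du + w < d[v]:
                    d[v] = du + w
                    heapq.heappush(heap, (d[v], v))
        # Step 5: undo the shift, delta(u, v) = delta'(u, v) - h[u] + h[v].
        dist[src] = {v: d[v] - h[src] + h[v] for v in vertices}
    return dist


result = johnson(["a", "b", "c"], {("a", "b"): -2, ("b", "c"): -3, ("a", "c"): -4})
print(result["a"])  # {'a': 0, 'b': -2, 'c': -5}
```

Heap entries are (distance, vertex) tuples, so vertex labels need to be mutually comparable when distances tie; strings or integers both work.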
739 00:55:17,090 --> 00:55:22,460 If there is, I save a lot of work by stopping early. 740 00:55:22,460 --> 00:55:23,600 So that's Johnson's. 741 00:55:23,600 --> 00:55:27,440 That's the end of our graphs lectures. 742 00:55:27,440 --> 00:55:29,120 We'll be having a review and problem 743 00:55:29,120 --> 00:55:34,070 session about how to solve problems, graph problems using 744 00:55:34,070 --> 00:55:36,010 this material. 745 00:55:36,010 --> 00:55:38,950 But we've talked about a lot of different things so far. 746 00:55:38,950 --> 00:55:40,630 We've talked about graph reachability, 747 00:55:40,630 --> 00:55:44,260 connected components, detecting cycles, detecting 748 00:55:44,260 --> 00:55:47,417 topological sort orders of a DAG. 749 00:55:47,417 --> 00:55:49,500 We've talked about finding negative weight cycles, 750 00:55:49,500 --> 00:55:52,230 single source shortest path algorithms, 751 00:55:52,230 --> 00:55:55,350 and now finally, today, all-pairs shortest path 752 00:55:55,350 --> 00:55:57,990 algorithms, with a new algorithm that's really 753 00:55:57,990 --> 00:56:01,740 not an entirely new algorithm. 754 00:56:01,740 --> 00:56:06,160 We didn't have to do any proof by induction here. 755 00:56:06,160 --> 00:56:08,860 Really, the heavy work that's happening 756 00:56:08,860 --> 00:56:13,240 is we're reducing to using either Dijkstra or Bellman-Ford 757 00:56:13,240 --> 00:56:16,660 to do the heavy lifting of finding single source shortest 758 00:56:16,660 --> 00:56:19,440 paths efficiently. 759 00:56:19,440 --> 00:56:24,070 So Johnson's is really just glue to transform 760 00:56:24,070 --> 00:56:26,200 a graph in a clever way, and then 761 00:56:26,200 --> 00:56:29,020 reducing to using some of the shortest paths algorithms 762 00:56:29,020 --> 00:56:30,550 faster. 763 00:56:30,550 --> 00:56:32,770 So that's our unit on graphs.
764 00:56:32,770 --> 00:56:35,380 Our next lecture, we'll start talking 765 00:56:35,380 --> 00:56:38,950 about a general form of, not presenting you 766 00:56:38,950 --> 00:56:41,050 with an algorithm, but how to design 767 00:56:41,050 --> 00:56:45,310 your own algorithm in the context of dynamic programming. 768 00:56:45,310 --> 00:56:47,430 So see you next lecture.