WEBVTT
00:00:00.000 --> 00:00:01.458
[SQUEAKING]
00:00:01.458 --> 00:00:02.916
[RUSTLING]
00:00:02.916 --> 00:00:06.318
[CLICKING]
00:00:12.275 --> 00:00:13.650
JASON KU: Good
morning, everyone.
00:00:13.650 --> 00:00:18.190
Welcome to the 13th
lecture of 6.006.
00:00:18.190 --> 00:00:23.020
Just to recap from
last time, we've
00:00:23.020 --> 00:00:24.940
been talking about shortest--
00:00:24.940 --> 00:00:28.450
single source shortest
paths on weighted graphs
00:00:28.450 --> 00:00:31.330
for the past two lectures.
00:00:31.330 --> 00:00:34.510
Previously we were only talking
about unweighted graphs.
00:00:34.510 --> 00:00:40.000
And so far, up until today,
we've talked about three ways
00:00:40.000 --> 00:00:44.950
to solve single source shortest
paths on weighted graphs.
00:00:44.950 --> 00:00:47.590
Namely the first one used BFS.
00:00:47.590 --> 00:00:50.260
If you can kind of
transform your graph
00:00:50.260 --> 00:00:55.220
into a linear-sized graph that's
unweighted that corresponds
00:00:55.220 --> 00:00:58.520
to your weighted
problem, essentially
00:00:58.520 --> 00:01:02.390
replacing each weighted
edge of weight
00:01:02.390 --> 00:01:05.900
w with w single edges.
00:01:05.900 --> 00:01:09.410
Now that's only good for
positive weight things
00:01:09.410 --> 00:01:13.700
and if the sum of your
weights is small.
00:01:13.700 --> 00:01:15.830
But if the sum of
your weights is
00:01:15.830 --> 00:01:18.920
linear in the combinatorial
size of your graph,
00:01:18.920 --> 00:01:22.340
V plus E, then we can get
a linear time algorithm
00:01:22.340 --> 00:01:28.040
to solve weighted shortest paths
using breadth-first search.
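A minimal sketch of that edge-expansion idea, assuming positive integer weights and an adjacency-list dictionary. The function names and the ("dummy", k) labels for the inserted vertices are made up for illustration and are not from the lecture.

```python
from collections import deque
from itertools import count


def expand_to_unweighted(adj_w):
    """Replace each edge of weight w with a chain of w unweighted edges.

    adj_w maps each vertex to a list of (neighbor, weight) pairs,
    where every weight is a positive integer.
    """
    adj = {u: [] for u in adj_w}
    fresh = count()  # source of unique labels for the inserted vertices
    for u, edges in adj_w.items():
        for v, w in edges:
            prev = u
            for _ in range(w - 1):
                mid = ("dummy", next(fresh))  # hypothetical dummy label
                adj[mid] = []
                adj[prev].append(mid)
                prev = mid
            adj.setdefault(v, [])
            adj[prev].append(v)
    return adj


def bfs_weighted_distances(adj_w, s):
    """Weighted distances from s, found by BFS on the expanded graph."""
    adj = expand_to_unweighted(adj_w)
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Report only the original vertices, not the inserted dummy ones.
    originals = set(adj_w) | {v for edges in adj_w.values() for v, _ in edges}
    return {v: d for v, d in dist.items() if v in originals}
```

The expanded graph has one extra vertex per unit of weight beyond the first on each edge, so this only stays linear when the total weight is on the order of V plus E.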
00:01:28.040 --> 00:01:31.850
Then we talked
about how we could--
00:01:31.850 --> 00:01:35.630
if we-- the problem with
weighted shortest paths
00:01:35.630 --> 00:01:39.740
is if our weights were negative
and there could exist cycles,
00:01:39.740 --> 00:01:41.540
then we could have
negative weight cycles
00:01:41.540 --> 00:01:43.910
and that would be more
difficult to handle,
00:01:43.910 --> 00:01:47.180
because then you have
vertices where you have
00:01:47.180 --> 00:01:49.340
an unbounded number
of edges you might
00:01:49.340 --> 00:01:51.960
have to go through
for a shortest path.
00:01:51.960 --> 00:01:55.310
There might not be a finite
length shortest path.
00:01:55.310 --> 00:01:57.380
But in the condition
where we didn't
00:01:57.380 --> 00:01:58.790
have cycles in the graph--
00:01:58.790 --> 00:02:01.490
of course, we couldn't
have negative weight ones,
00:02:01.490 --> 00:02:05.750
so we were also able to
do that in linear time
00:02:05.750 --> 00:02:08.210
by exploiting the fact
that our vertices could
00:02:08.210 --> 00:02:11.180
be ordered in a
topological order,
00:02:11.180 --> 00:02:15.650
and then we could kind of
push shortest path information
00:02:15.650 --> 00:02:19.880
from the furthest one
back to the ones forward.
00:02:19.880 --> 00:02:21.800
By relaxing edges forward.
00:02:21.800 --> 00:02:27.140
By maintaining this invariant
that we had shortest paths as
00:02:27.140 --> 00:02:31.290
we were processing these
things in topological order.
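As a rough sketch of that DAG relaxation recap, assuming the topological order is already given as a list and the graph is an adjacency-list dictionary (the function name and argument layout are illustrative, not from the lecture):

```python
import math


def dag_relaxation(adj_w, order, s):
    """Single source shortest paths on a DAG.

    adj_w maps a vertex to a list of (neighbor, weight) pairs; weights
    may be negative. order lists every vertex in topological order.
    """
    dist = {v: math.inf for v in order}
    dist[s] = 0
    for u in order:
        if dist[u] == math.inf:
            continue  # u is not reachable from s, nothing to push forward
        for v, w in adj_w.get(u, []):
            # Relax edge (u, v): keep the better of the two estimates.
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist
```

Each vertex and each edge is touched once, which is where the linear running time comes from.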
00:02:31.290 --> 00:02:35.160
Then last time, we were talking
about general graphs, graphs
00:02:35.160 --> 00:02:37.590
that could contain cycles,
and this is our most general
00:02:37.590 --> 00:02:41.440
algorithm, because if there
are negative weight cycles,
00:02:41.440 --> 00:02:43.890
Bellman-Ford, which we
talked about last time,
00:02:43.890 --> 00:02:45.180
can detect them.
00:02:45.180 --> 00:02:48.720
And in particular,
for any vertex
00:02:48.720 --> 00:02:52.960
that had a finite
weight shortest paths--
00:02:52.960 --> 00:02:57.460
path, we could compute
that shortest path for it,
00:02:57.460 --> 00:02:58.390
compute its distance.
00:02:58.390 --> 00:03:00.190
And for any one
that is reachable
00:03:00.190 --> 00:03:03.550
from a negative weight
cycle, not only could we
00:03:03.550 --> 00:03:07.210
mark it as minus
infinity distance,
00:03:07.210 --> 00:03:10.750
but we could also find
a negative weight cycle
00:03:10.750 --> 00:03:14.500
essentially by duplicating
our graph to make it a DAG
00:03:14.500 --> 00:03:18.940
and being able to follow
pointers back in this expanded
00:03:18.940 --> 00:03:21.490
DAG that had multiple layers.
00:03:21.490 --> 00:03:24.440
So that's what we've
done up until now.
00:03:24.440 --> 00:03:28.640
We've gotten linear for
some types of graphs.
00:03:28.640 --> 00:03:31.790
And we've gotten kind
of quadratic V times
00:03:31.790 --> 00:03:36.880
E for general graphs, ones that
could contain negative cycles.
00:03:36.880 --> 00:03:39.850
Now how bad is this?
00:03:39.850 --> 00:03:45.970
Well, if the graph is sparse,
if the number of edges
00:03:45.970 --> 00:03:48.830
in our graph is
on the order of V,
00:03:48.830 --> 00:03:52.930
then this is quadratic
time in V, V squared.
00:03:52.930 --> 00:03:58.480
But if the graph is dense
where we have quadratic--
00:03:58.480 --> 00:04:01.790
like the complete graph
where every edge is present,
00:04:01.790 --> 00:04:06.010
then we have quadratically
many edges in our graph in V.
00:04:06.010 --> 00:04:08.380
And so this running
time is V cubed.
00:04:08.380 --> 00:04:11.080
V cubed is not great in
terms of its running time.
00:04:11.080 --> 00:04:15.040
We would like something
closer to linear.
00:04:15.040 --> 00:04:18.640
And so that's what
we're going to do today.
00:04:18.640 --> 00:04:21.519
If we have this
restriction where
00:04:21.519 --> 00:04:24.190
we have non-negative
weights, we can't
00:04:24.190 --> 00:04:26.920
have negative weight cycles.
00:04:26.920 --> 00:04:31.990
And this is a restriction that
comes up a lot for many graphs
00:04:31.990 --> 00:04:34.040
you might encounter.
00:04:34.040 --> 00:04:38.140
A lot of times you don't have
both positive and negative
00:04:38.140 --> 00:04:38.710
weight.
00:04:38.710 --> 00:04:40.870
I don't have a negative
distance to my house.
00:04:43.390 --> 00:04:47.740
In any metric we have
non-negative weights.
00:04:47.740 --> 00:04:51.430
So these things come up a
lot, and we can actually
00:04:51.430 --> 00:04:54.100
do quite a bit better, since
there are no negative weight
00:04:54.100 --> 00:04:58.630
cycles, we can
get almost linear.
00:04:58.630 --> 00:05:03.910
It's not going to be quite
V plus E as you see up here
00:05:03.910 --> 00:05:04.870
on the slide.
00:05:04.870 --> 00:05:07.270
We're going to get
something very close.
00:05:07.270 --> 00:05:10.720
It's V plus the E,
but on the V term,
00:05:10.720 --> 00:05:13.030
we have this
logarithmic factor in V.
00:05:13.030 --> 00:05:15.650
Which remember for all
intents and purposes,
00:05:15.650 --> 00:05:18.820
this log of that
thing in real life
00:05:18.820 --> 00:05:22.000
is not going to be bigger
than like a factor of 30
00:05:22.000 --> 00:05:24.550
or something like that.
00:05:24.550 --> 00:05:25.600
Maybe 60.
00:05:25.600 --> 00:05:27.790
But it's a small number.
00:05:27.790 --> 00:05:30.890
And so this is actually
pretty good performance.
00:05:30.890 --> 00:05:34.060
It's almost linear-- that's what
I'm saying almost linear here,
00:05:34.060 --> 00:05:37.310
and that's what we're
going to try to do today.
00:05:37.310 --> 00:05:40.430
So, how do we do this?
00:05:40.430 --> 00:05:44.280
Well, I'm going to make two
observations here, first off.
00:05:44.280 --> 00:05:50.500
Our idea is going to be to
generalize the notion of BFS.
00:05:50.500 --> 00:05:55.180
When we had BFS, we
split up our graph--
00:05:55.180 --> 00:05:59.140
to solve unweighted-- solve
weighted shortest paths in BFS,
00:05:59.140 --> 00:06:01.990
we could take our
positive edge weights,
00:06:01.990 --> 00:06:06.560
break them up into
individual edges.
00:06:06.560 --> 00:06:11.270
But if the total weight
of our edges was large,
00:06:11.270 --> 00:06:13.370
then we'd have a problem,
because now we've
00:06:13.370 --> 00:06:15.550
expanded the size of our graph.
00:06:15.550 --> 00:06:18.050
This is the same issue that we
had with something like radix
00:06:18.050 --> 00:06:21.350
sort where we don't
want our algorithm
00:06:21.350 --> 00:06:25.050
to run in the size of
the numbers in our input,
00:06:25.050 --> 00:06:28.610
we want our algorithm to
run in the number of numbers
00:06:28.610 --> 00:06:29.780
in our input.
00:06:29.780 --> 00:06:34.150
This is the difference
between N and U
00:06:34.150 --> 00:06:36.280
back when we were talking
about data structures.
00:06:36.280 --> 00:06:43.000
Here, if the size of our weights
is large compared to V and E,
00:06:43.000 --> 00:06:47.850
then doing this expansion
is going to be difficult.
00:06:47.850 --> 00:06:51.970
But if we had, say, some graph--
00:06:51.970 --> 00:06:55.125
this is my graph G, and
we had a source vertex
00:06:55.125 --> 00:06:59.650
s, the idea here
is going to still
00:06:59.650 --> 00:07:04.960
be to try to grow a
frontier of increasing
00:07:04.960 --> 00:07:12.920
distance from my source and try
to maintain all of the things
00:07:12.920 --> 00:07:16.520
within a certain
distance from my source.
00:07:16.520 --> 00:07:19.240
So that's the idea, grow a
sphere centered at my source,
00:07:19.240 --> 00:07:23.590
repeatedly explore
closer vertices
00:07:23.590 --> 00:07:25.120
before I get to further ones.
00:07:25.120 --> 00:07:29.350
But how can I explore
closer vertices
00:07:29.350 --> 00:07:32.920
if I don't know the
distances beforehand?
00:07:32.920 --> 00:07:36.100
This is kind of-- seems
like a circular logic.
00:07:36.100 --> 00:07:38.680
I'm going to use the
distance to my things
00:07:38.680 --> 00:07:41.350
to compute the
distances to my things.
00:07:41.350 --> 00:07:43.780
That doesn't work so well.
00:07:43.780 --> 00:07:45.320
So how do we do this?
00:07:45.320 --> 00:07:49.450
Well, the idea here is
to gradually compute
00:07:49.450 --> 00:07:53.080
the distances-- compute
the distances as we go so
00:07:53.080 --> 00:07:54.520
that we maintain this property.
00:07:54.520 --> 00:07:59.680
Now this property, this idea
wouldn't work necessarily
00:07:59.680 --> 00:08:03.190
in the context of
negative edge weights.
00:08:03.190 --> 00:08:06.460
Here, we have this
growing frontier,
00:08:06.460 --> 00:08:09.340
this ball around my source.
00:08:09.340 --> 00:08:13.420
And as I grow my
thing, these things are
00:08:13.420 --> 00:08:19.480
at further and further distance,
because any edge from something
00:08:19.480 --> 00:08:22.720
back here as I'm growing
my ball a certain distance,
00:08:22.720 --> 00:08:24.940
these things are
outside that distance.
00:08:24.940 --> 00:08:28.300
We're kind of using a
key observation here.
00:08:28.300 --> 00:08:30.070
Here's my observation 1.
00:08:35.100 --> 00:08:43.220
If weights greater
than or equal to 0,
00:08:43.220 --> 00:08:56.985
then distances increase
along shortest paths.
00:09:00.640 --> 00:09:04.328
Maybe weakly
monotonically increase
00:09:04.328 --> 00:09:05.620
if there are zero-weight edges.
00:09:05.620 --> 00:09:18.940
But in general, if I had a
path going from s to some v,
00:09:18.940 --> 00:09:23.080
and it's going
through some vertex u,
00:09:23.080 --> 00:09:24.460
I have some shortest path.
00:09:24.460 --> 00:09:27.130
This is the shortest
path from s to v,
00:09:27.130 --> 00:09:31.330
and it goes through some
point u, some vertex u.
00:09:31.330 --> 00:09:35.950
Then this monotonicity more
specifically means that
00:09:35.950 --> 00:09:42.550
the shortest path from s to u
and the shortest path from s
00:09:42.550 --> 00:09:51.140
to v, which is this
whole thing, how
00:09:51.140 --> 00:09:54.610
do these relate to each other?
00:09:54.610 --> 00:09:57.690
If this is along
that path, then this
00:09:57.690 --> 00:10:01.890
has to be at least as
large as the subpath.
00:10:01.890 --> 00:10:03.810
Because all of these--
00:10:03.810 --> 00:10:07.030
the weight of this path
cannot be negative.
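Written out, the observation can be stated as follows. Here delta denotes shortest-path distance and pi names the part of the path from u to v; both symbols are notation chosen for this note.

```latex
% Assume w(e) >= 0 for every edge e, and u lies on a shortest path from s to v.
% A subpath of a shortest path is itself a shortest path, so the prefix
% from s to u has weight \delta(s,u). Let \pi be the remainder from u to v.
\[
  \delta(s, v) \;=\; \delta(s, u) + w(\pi) \;\ge\; \delta(s, u),
  \qquad \text{since } w(\pi) \ge 0 .
\]
```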
00:10:07.030 --> 00:10:10.050
So that's the thing that
Dijkstra's going to exploit.
00:10:10.050 --> 00:10:11.640
It essentially
means that when I'm
00:10:11.640 --> 00:10:18.130
expanding this frontier
of distance away from s,
00:10:18.130 --> 00:10:24.680
it's possible if I had negative
weight, that this line--
00:10:24.680 --> 00:10:28.580
if I had some very negative
weight going from a vertex
00:10:28.580 --> 00:10:33.570
here to a vertex
here, this vertex
00:10:33.570 --> 00:10:37.920
could be within this boundary.
00:10:37.920 --> 00:10:42.999
Maybe if this distance is x,
this guy could be within x.
00:10:42.999 --> 00:10:47.330
The things that are
within distance x of s
00:10:47.330 --> 00:10:49.880
might not be all contained.
00:10:49.880 --> 00:10:54.560
There could be a path from
here to this other vertex with
00:10:54.560 --> 00:10:56.360
distance x.
00:10:56.360 --> 00:10:59.000
It doesn't have this property
because I could decrease
00:10:59.000 --> 00:11:01.070
in distance along the path.
00:11:01.070 --> 00:11:03.390
So that's the first observation.
00:11:03.390 --> 00:11:09.000
Second observation,
well, let's see
00:11:09.000 --> 00:11:15.590
if we can piggyback
on DAG relaxation.
00:11:15.590 --> 00:11:22.520
I claim to you that we can solve
single source shortest paths
00:11:22.520 --> 00:11:38.110
faster if we're given
an order of vertices
00:11:38.110 --> 00:11:46.330
in increasing
distance beforehand.
00:11:46.330 --> 00:11:48.250
Distance from s.
00:11:48.250 --> 00:11:49.600
So here's the idea.
00:11:49.600 --> 00:11:51.248
I'm not going to give
you the distances
00:11:51.248 --> 00:11:52.165
to all these vertices.
00:11:54.890 --> 00:11:57.530
Instead I'm going to
give you the order
00:11:57.530 --> 00:12:02.780
of the vertices in some
increasing distance from s.
00:12:02.780 --> 00:12:06.680
So basically I'm saying, if
I had some, I don't know,
00:12:06.680 --> 00:12:09.698
here's a graph.
00:12:09.698 --> 00:12:10.865
Let's see if I can remember.
00:12:18.380 --> 00:12:20.420
OK, and I'm going to
put some edges on here.
00:12:36.060 --> 00:12:36.560
OK.
00:12:36.560 --> 00:12:42.930
And I'm going to call these
vertices 0, 1, 2, 3, and 4.
00:12:42.930 --> 00:12:43.430
OK.
00:12:43.430 --> 00:12:44.195
So here's a graph.
00:12:46.740 --> 00:12:49.890
Maybe I put some
edge weights on here.
00:12:49.890 --> 00:12:57.990
I'm going to say this one is 3,
this one is 2, this one is 3,
00:12:57.990 --> 00:13:04.950
this is 1, this is 1,
this is 0, and this is 0.
00:13:04.950 --> 00:13:10.560
So from vertex 1 to
2, that was the 2
00:13:10.560 --> 00:13:13.680
for the labeling of that vertex.
00:13:13.680 --> 00:13:14.800
That edge is zero-weight.
00:13:14.800 --> 00:13:15.300
OK.
00:13:15.300 --> 00:13:21.000
So here's a weighted graph.
And I don't necessarily know--
00:13:21.000 --> 00:13:25.440
I could use Bellman-Ford to find
shortest paths from this vertex
00:13:25.440 --> 00:13:31.820
0, but the idea here is
I'm not going to give you
00:13:31.820 --> 00:13:34.520
shortest paths, I'm going to
try to compute shortest paths,
00:13:34.520 --> 00:13:36.770
but I'm going to give you
some additional information.
00:13:36.770 --> 00:13:43.640
I'm going to give you the order
of their shortest path distance
00:13:43.640 --> 00:13:45.620
from the source.
00:13:45.620 --> 00:13:46.460
And I can just--
00:13:46.460 --> 00:13:52.580
I'm going to eyeball
this and say--
00:13:52.580 --> 00:13:54.460
I'm going to change
this slightly
00:13:54.460 --> 00:13:56.680
to make it a little
bit more interesting.
00:13:56.680 --> 00:14:00.250
I'm going to say
this is distance 4.
00:14:00.250 --> 00:14:02.830
OK.
00:14:02.830 --> 00:14:06.490
All right, so now what we have
is the shortest path distance--
00:14:06.490 --> 00:14:07.810
I'm just eyeballing this.
00:14:07.810 --> 00:14:10.360
The shortest distance to--
00:14:16.280 --> 00:14:17.930
bad example.
00:14:17.930 --> 00:14:18.620
All right.
00:14:18.620 --> 00:14:21.950
So, these are the weights.
00:14:21.950 --> 00:14:25.310
Shortest-path distance
to 3 is going to be 2,
00:14:25.310 --> 00:14:28.670
I'm going to say, through there.
00:14:28.670 --> 00:14:33.220
Shortest-path distance
here is 2 also.
00:14:33.220 --> 00:14:35.590
Shortest-path distance
here is also 2
00:14:35.590 --> 00:14:39.090
because I can go through both
of these 0's and it's not
00:14:39.090 --> 00:14:40.020
a problem.
00:14:40.020 --> 00:14:41.490
And then the
shortest-path distance
00:14:41.490 --> 00:14:45.870
here is 2 to here
and 3 to there.
00:14:45.870 --> 00:14:50.930
So these are listed
in increasing distance
00:14:50.930 --> 00:14:53.140
from my source.
00:14:53.140 --> 00:14:58.820
I had to compute those
deltas to convince you
00:14:58.820 --> 00:15:00.380
that this was the
right ordering,
00:15:00.380 --> 00:15:03.110
but this is a right
ordering of these things.
00:15:03.110 --> 00:15:06.620
Now it's not the
only right ordering,
00:15:06.620 --> 00:15:07.850
but it is a right ordering.
00:15:07.850 --> 00:15:09.470
OK, so I'm told--
00:15:09.470 --> 00:15:12.470
I'm arguing to you that I could
solve single source shortest
00:15:12.470 --> 00:15:16.160
paths in linear time
if I were to give you
00:15:16.160 --> 00:15:18.740
the vertices in
increasing distance.
00:15:18.740 --> 00:15:21.230
How could I do that?
00:15:21.230 --> 00:15:27.430
Well, because of this
first observation,
00:15:27.430 --> 00:15:31.420
I know that if these are
increasing in distance,
00:15:31.420 --> 00:15:35.590
any edge going backwards
with respect to this ordering
00:15:35.590 --> 00:15:40.640
can't participate in shortest
paths with one exception.
00:15:40.640 --> 00:15:42.800
Anyone know what
that exception is?
00:15:42.800 --> 00:15:47.570
No edge can go backwards
in this ordering
00:15:47.570 --> 00:15:51.990
based on this observation
except under what condition?
00:15:51.990 --> 00:15:52.490
Yeah?
00:15:52.490 --> 00:15:53.698
AUDIENCE: If the weight is 0?
00:15:53.698 --> 00:15:55.430
JASON KU: If the
weight is 0, yeah.
00:15:55.430 --> 00:16:00.600
So if the weight is 0, just
like this situation here,
00:16:00.600 --> 00:16:03.630
then I could go backwards
in the ordering.
00:16:03.630 --> 00:16:05.520
See, it's problematic.
00:16:05.520 --> 00:16:09.300
The idea is I'm going to
want to construct a DAG
00:16:09.300 --> 00:16:11.910
so that I can run
DAG relaxation.
00:16:11.910 --> 00:16:22.120
Well, if I have a component
here that has 0 weights,
00:16:22.120 --> 00:16:26.440
I can coalesce this thing down--
00:16:26.440 --> 00:16:31.420
I can deal with this
component separately.
00:16:31.420 --> 00:16:33.790
Let's worry about
that separately.
00:16:33.790 --> 00:16:40.280
If we do, we can collapse this
edge down into a single vertex
00:16:40.280 --> 00:16:44.470
and transform this graph so
it does respect the ordering.
00:16:44.470 --> 00:16:50.680
So I'm going to transform
this graph into a new graph.
00:16:50.680 --> 00:16:53.590
This is a graph--
00:16:53.590 --> 00:17:00.220
contains vertex 2 and vertex
0, vertex 1 and 3 here,
00:17:00.220 --> 00:17:03.170
and vertex 4.
00:17:03.170 --> 00:17:05.690
OK, now we have--
00:17:05.690 --> 00:17:09.020
and I'm only going to keep
edges going forward in the--
00:17:15.329 --> 00:17:23.970
I'm going to need to collapse
this entire section down
00:17:23.970 --> 00:17:25.290
into one vertex.
00:17:25.290 --> 00:17:27.119
This doesn't quite work.
00:17:27.119 --> 00:17:28.079
OK.
00:17:28.079 --> 00:17:31.680
Let's ignore zero-weight
edges for now.
00:17:31.680 --> 00:17:32.760
Let's assume these are--
00:17:41.132 --> 00:17:42.840
all right, there's
something broken here.
00:17:42.840 --> 00:17:44.450
If I have a cycle here--
00:17:48.350 --> 00:17:51.750
right now I don't have
a cycle of zero-weight.
00:17:51.750 --> 00:17:54.810
So what I could do is I
could take this vertex
00:17:54.810 --> 00:17:57.660
and put it after both
of these vertices.
00:17:57.660 --> 00:17:59.225
And now I would--
00:17:59.225 --> 00:18:02.180
or I could rearrange the
order of these three vertices
00:18:02.180 --> 00:18:06.560
where there's a path of length
0 and get a new ordering that
00:18:06.560 --> 00:18:09.620
still satisfies the property.
00:18:09.620 --> 00:18:17.240
And that's always the case
because paths can't increase--
00:18:17.240 --> 00:18:19.910
paths can't decrease in weight.
00:18:19.910 --> 00:18:22.470
I can rearrange the
ordering of these things
00:18:22.470 --> 00:18:26.840
so that 3 comes
first, 1 comes second,
00:18:26.840 --> 00:18:29.645
and 2 comes third of
those three vertices.
00:18:33.450 --> 00:18:34.170
Yeah.
00:18:34.170 --> 00:18:43.380
So for every set of 0 edges, I
can just flip the relationship
00:18:43.380 --> 00:18:45.990
if they have the same distance.
00:18:45.990 --> 00:18:50.550
In my input, I'm given vertices
that have the same distance
00:18:50.550 --> 00:18:51.490
from the source.
00:18:51.490 --> 00:18:53.910
And so if those are the same
distance from the source
00:18:53.910 --> 00:18:56.040
and they're connected
by a zero-weight edge,
00:18:56.040 --> 00:18:57.870
it doesn't hurt me to
flip their ordering.
00:18:57.870 --> 00:18:59.940
So I'm going to do that.
00:18:59.940 --> 00:19:02.457
So let's convert
that into a graph
00:19:02.457 --> 00:19:03.540
with a different ordering.
00:19:13.030 --> 00:19:18.720
0 3 now, 1 2.
00:19:18.720 --> 00:19:23.560
OK and I have this distance,
this edge, this edge,
00:19:23.560 --> 00:19:24.735
this edge, this edge.
00:19:28.390 --> 00:19:28.995
This edge.
00:19:31.980 --> 00:19:33.150
What am I missing?
00:19:33.150 --> 00:19:36.190
2 to 3.
00:19:36.190 --> 00:19:38.530
And here.
00:19:38.530 --> 00:19:41.740
I think I have all
of those edges.
00:19:41.740 --> 00:19:43.550
Yeah?
00:19:43.550 --> 00:19:44.660
OK.
00:19:44.660 --> 00:19:47.960
Now I have the property
that every edge that
00:19:47.960 --> 00:19:50.360
could participate
in the shortest path
00:19:50.360 --> 00:19:57.270
are going forward in the
ordering, because all of these
00:19:57.270 --> 00:19:59.540
are zero-weight.
00:19:59.540 --> 00:20:01.100
So we flip those
around so they're
00:20:01.100 --> 00:20:04.250
going correct with
respect to the ordering.
00:20:04.250 --> 00:20:07.850
And any edge going
backwards that
00:20:07.850 --> 00:20:10.400
is positive weight
certainly can't
00:20:10.400 --> 00:20:12.390
be used in any shortest path.
00:20:12.390 --> 00:20:14.480
So I'm just going
to get rid of them.
00:20:17.790 --> 00:20:18.650
Yeah?
00:20:18.650 --> 00:20:20.882
AUDIENCE: What do I do if there's
a zero-weight cycle?
00:20:20.882 --> 00:20:22.590
JASON KU: If there's
a zero-weight cycle,
00:20:22.590 --> 00:20:25.410
I can just coalesce
them all together down
00:20:25.410 --> 00:20:28.500
to a single vertex, because
if I reach one of them,
00:20:28.500 --> 00:20:30.990
I can reach all of them.
00:20:30.990 --> 00:20:33.330
AUDIENCE: You're getting a
topological ordering of--
00:20:33.330 --> 00:20:34.080
JASON KU: Exactly.
00:20:34.080 --> 00:20:37.470
I'm computing-- so
the idea here is we're
00:20:37.470 --> 00:20:39.420
trying to construct a DAG.
00:20:39.420 --> 00:20:43.020
I can construct this
DAG in linear time.
00:20:43.020 --> 00:20:45.840
And then I can
run DAG relaxation
00:20:45.840 --> 00:20:50.070
on this graph in linear
time to get shortest paths.
00:20:50.070 --> 00:20:52.140
So that's an approach.
00:20:52.140 --> 00:20:54.420
If I knew the ordering
of the vertices
00:20:54.420 --> 00:20:59.020
in increasing distance, then
I could use DAG relaxation.
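A hedged sketch of that approach, assuming the order is already arranged so that every edge usable on a shortest path points forward (ties reordered as described, and no zero-weight cycles). The function name is hypothetical and the small graph in the usage note is made up, not the one on the board.

```python
import math


def sssp_from_order(adj_w, order, s):
    """Shortest paths from s, given vertices in non-decreasing distance.

    adj_w maps a vertex to a list of (neighbor, weight) pairs with
    weights >= 0. order lists every vertex by non-decreasing distance
    from s, with ties arranged so zero-weight edges point forward.
    """
    position = {v: i for i, v in enumerate(order)}
    dist = {v: math.inf for v in order}
    dist[s] = 0
    for u in order:
        for v, w in adj_w.get(u, []):
            if position[v] <= position[u]:
                continue  # backward edge: dropped, so what remains is a DAG
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist
```

For example, with edges 0 to 1 (weight 2), 0 to 3 (2), 3 to 1 (0), 1 to 2 (0), 2 to 4 (1), 0 to 4 (4), and 4 to 1 (3), the order 0, 3, 1, 2, 4 puts both zero-weight edges forward and leaves only the positive edge from 4 to 1 pointing backward.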
00:20:59.020 --> 00:21:01.760
So we're going to use both
of these observations.
00:21:01.760 --> 00:21:04.710
That's how we're going to solve
this single source shortest paths
00:21:04.710 --> 00:21:08.850
problem with non-negative
weights using Dijkstra.
00:21:08.850 --> 00:21:12.600
So that's finally now
where we're coming to.
00:21:12.600 --> 00:21:16.230
Sorry, I missed a case here
when I was writing up my notes,
00:21:16.230 --> 00:21:20.610
and I tried to fix it live and
hopefully you guys followed me.
00:21:20.610 --> 00:21:21.110
OK.
00:21:23.700 --> 00:21:28.200
Dijkstra's algorithm.
00:21:32.200 --> 00:21:33.660
Did I spell that right?
00:21:33.660 --> 00:21:35.070
Kind of.
00:21:35.070 --> 00:21:36.180
OK.
00:21:36.180 --> 00:21:37.065
What?
00:21:37.065 --> 00:21:37.565
Dijkstra.
00:21:42.480 --> 00:21:43.560
OK.
00:21:43.560 --> 00:21:49.260
Now Dijkstra was this
Dutch computer scientist.
00:21:49.260 --> 00:21:50.700
This is him.
00:21:50.700 --> 00:21:52.530
Pretty famous, he
wrote a monograph
00:21:52.530 --> 00:21:56.970
on why programming
languages should
00:21:56.970 --> 00:22:02.280
start with 0 indexing as opposed
to 1 indexing, so I like him.
00:22:02.280 --> 00:22:07.080
But in particular, he designed
this very nice generalization
00:22:07.080 --> 00:22:10.560
of BFS for weighted graphs.
00:22:10.560 --> 00:22:12.210
But maybe I didn't
spell this right
00:22:12.210 --> 00:22:15.210
because when he writes
his name, he writes it
00:22:15.210 --> 00:22:17.730
with a Y with a dash over it.
00:22:17.730 --> 00:22:25.440
So in reality on a
Dutch typewriter,
00:22:25.440 --> 00:22:29.280
you might have a character
that looks like this, Y
00:22:29.280 --> 00:22:31.960
with an umlaut on top of it.
00:22:31.960 --> 00:22:37.600
But on modern-- on
an English keyboard,
00:22:37.600 --> 00:22:42.920
this looks pretty
similar to an IJ.
00:22:42.920 --> 00:22:50.120
So in a lot of manuscripts,
we write it as D-I--
00:22:50.120 --> 00:22:52.640
there's no J sound in Dijkstra.
00:22:52.640 --> 00:22:55.460
It's coming from this Y here.
00:22:55.460 --> 00:22:59.690
That's an interesting way to
remember how to spell Dijkstra.
00:22:59.690 --> 00:23:04.760
But the basic idea behind
Dijkstra is the following idea.
00:23:09.830 --> 00:23:28.880
Relax edges from vertices
in increasing distance
00:23:28.880 --> 00:23:31.770
from source.
00:23:31.770 --> 00:23:32.300
OK.
00:23:32.300 --> 00:23:34.230
This is the same
kind of difficulty
00:23:34.230 --> 00:23:40.440
we had before when we were
trying to generalize BFS.
00:23:40.440 --> 00:23:45.320
So how do we know what
the next vertex is
00:23:45.320 --> 00:23:47.570
with increasing distance to s?
00:23:47.570 --> 00:24:03.400
Well, the second idea is find
the next vertex efficiently
00:24:03.400 --> 00:24:05.351
using a data structure.
00:24:08.440 --> 00:24:10.150
And the data structure
we're going to use
00:24:10.150 --> 00:24:14.020
is something I like to call
a changeable priority queue.
00:24:22.940 --> 00:24:27.460
So this is a little different
than a normal priority queue
00:24:27.460 --> 00:24:36.010
that we had at the end of
our data structures unit.
00:24:36.010 --> 00:24:40.280
This changeable priority
queue has three operations.
00:24:40.280 --> 00:24:42.970
We're going to say it's a queue.
00:24:42.970 --> 00:24:49.330
We can build it on an
iterable set of items.
00:24:49.330 --> 00:24:54.700
Just stick x-- like
n items in there.
00:24:54.700 --> 00:25:03.850
We can delete min
from the queue.
00:25:03.850 --> 00:25:06.700
OK, this is the same now
as the priority queue.
00:25:06.700 --> 00:25:09.190
It's this third operation
that's going to be different.
00:25:12.910 --> 00:25:22.420
Decrease the key of an
item that has id, id.
00:25:22.420 --> 00:25:25.630
OK, so this is a little strange.
00:25:25.630 --> 00:25:27.640
What the heck is this id?
00:25:27.640 --> 00:25:30.430
All right, with a changeable
priority queue,
00:25:30.430 --> 00:25:34.060
each of our items has two
values instead of one value.
00:25:34.060 --> 00:25:37.480
It has a key, but it also--
00:25:37.480 --> 00:25:42.760
on which the priority queue
is deleting the min item
00:25:42.760 --> 00:25:45.130
with the minimum key.
00:25:45.130 --> 00:25:48.010
But also, each item
has an ID associated
00:25:48.010 --> 00:25:51.790
with it, a unique integer.
00:25:51.790 --> 00:25:54.970
So that when we
perform this operation,
00:25:54.970 --> 00:26:01.690
decrease_key, it can find some
item in our data structure
00:26:01.690 --> 00:26:03.640
with the given ID.
00:26:03.640 --> 00:26:05.470
And if it's
contained there, it's
00:26:05.470 --> 00:26:12.400
going to change its key
to some smaller value k.
00:26:12.400 --> 00:26:16.000
And don't worry about
the edge cases here.
00:26:16.000 --> 00:26:18.070
We're always going to
make sure this k is
00:26:18.070 --> 00:26:20.440
going to be smaller
than whatever
00:26:20.440 --> 00:26:23.090
that key was to begin with.
00:26:23.090 --> 00:26:25.930
So this is really a kind
of a funky operation.
00:26:28.460 --> 00:26:33.450
If I had a priority queue, not
a changeable priority queue,
00:26:33.450 --> 00:26:35.210
but I had a priority
queue and I wanted
00:26:35.210 --> 00:26:38.420
to implement a changeable
priority queue,
00:26:38.420 --> 00:26:39.230
how could I do it?
00:26:43.980 --> 00:26:46.800
Well, a regular priority
queue is already
00:26:46.800 --> 00:26:49.680
going to get me
these two operations.
00:26:49.680 --> 00:26:50.790
It's just this one.
00:26:50.790 --> 00:26:55.540
I essentially need to
find something by an ID
00:26:55.540 --> 00:26:59.190
and then update its key.
00:26:59.190 --> 00:27:05.990
So the idea how
to implement this
00:27:05.990 --> 00:27:09.250
is going to be to use a
regular priority queue.
00:27:15.290 --> 00:27:17.840
I'm going to call it Q prime.
00:27:17.840 --> 00:27:30.630
And I'm going to cross-link
it with a dictionary D.
00:27:30.630 --> 00:27:34.230
So this is just a regular
priority queue on my items
00:27:34.230 --> 00:27:38.640
that has the key
as defined above.
00:27:38.640 --> 00:27:41.910
But I'm going to cross-link it
with a dictionary, a dictionary
00:27:41.910 --> 00:27:46.860
that maps IDs to their
location in the priority queue.
00:27:46.860 --> 00:27:50.610
We've done this many times in
the data structures section.
00:27:50.610 --> 00:27:53.820
We're trying to cross-link
two data structures
00:27:53.820 --> 00:27:58.630
to make a query on a
different type of key
00:27:58.630 --> 00:28:01.870
to find its place in
another data structure.
00:28:01.870 --> 00:28:06.980
So, if we had a
priority queue and a dictionary,
00:28:06.980 --> 00:28:09.430
we could do this
stuff pretty fast.
00:28:12.150 --> 00:28:14.220
In particular, I'm
going to assume
00:28:14.220 --> 00:28:18.360
that our IDs of our vertices
are the integers between 0
00:28:18.360 --> 00:28:19.830
and v minus 1.
00:28:19.830 --> 00:28:24.000
And so for my dictionary,
I could get constant time
00:28:24.000 --> 00:28:30.438
looking up of that ID by
using what data structure?
00:28:30.438 --> 00:28:32.220
AUDIENCE: Hash table.
00:28:32.220 --> 00:28:33.220
JASON KU: We could get--
00:28:33.220 --> 00:28:37.520
OK, so we could get
expected constant time
00:28:37.520 --> 00:28:40.980
if we used a hash table.
00:28:40.980 --> 00:28:45.040
But if we knew that
our vertex IDs were
00:28:45.040 --> 00:28:48.310
just the numbers
from 0 to v minus 1,
00:28:48.310 --> 00:28:51.190
we could get rid of
that expected time
00:28:51.190 --> 00:28:55.460
by using a direct access array.
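One way to sketch this cross-linking, assuming the regular priority queue is a binary min-heap and the IDs are the integers 0 to n minus 1. The class and method names are illustrative; the lecture leaves the choice of underlying priority queue open.

```python
class ChangeablePriorityQueue:
    """Binary min-heap of (key, id) pairs cross-linked with a direct
    access array mapping each id to its current index in the heap."""

    def __init__(self, items, num_ids):
        # items: iterable of (id, key) pairs with distinct ids in range(num_ids)
        self.heap = [(key, id_) for id_, key in items]
        self.pos = [None] * num_ids  # direct access array: id -> heap index
        for i, (_, id_) in enumerate(self.heap):
            self.pos[id_] = i
        for i in reversed(range(len(self.heap) // 2)):
            self._sift_down(i)  # linear-time heap build

    def __len__(self):
        return len(self.heap)

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i  # keep the cross-links current
        self.pos[self.heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.heap[i][0] >= self.heap[parent][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.heap[child][0] < self.heap[smallest][0]:
                    smallest = child
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

    def delete_min(self):
        """Remove and return (id, key) with the smallest key; queue must be non-empty."""
        self._swap(0, len(self.heap) - 1)
        key, id_ = self.heap.pop()
        self.pos[id_] = None
        if self.heap:
            self._sift_down(0)
        return id_, key

    def decrease_key(self, id_, key):
        """Lower the key of the item with this id, if it is still stored."""
        i = self.pos[id_]
        if i is not None and key < self.heap[i][0]:
            self.heap[i] = (key, id_)
            self._sift_up(i)
```

With a binary heap underneath, delete_min and decrease_key each take logarithmic time, and the direct access array makes finding an item by ID a constant-time lookup.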
00:28:55.460 --> 00:28:56.120
Great.
00:28:56.120 --> 00:28:57.900
OK, so that's the assumption.
00:28:57.900 --> 00:28:59.720
And so really, the
name of the game
00:28:59.720 --> 00:29:04.730
here is to choose a
priority queue here
00:29:04.730 --> 00:29:07.430
that's going to make
these things fast when we
00:29:07.430 --> 00:29:08.540
start to look at Dijkstra.
00:29:08.540 --> 00:29:12.530
OK, so we're going to
use this data structure
00:29:12.530 --> 00:29:18.470
to keep track of our
distance estimates
00:29:18.470 --> 00:29:21.895
to all of the
vertices away from s.
00:29:21.895 --> 00:29:25.220
OK, so this is
Dijkstra's algorithm.
00:29:25.220 --> 00:29:26.150
OK.
00:29:26.150 --> 00:29:31.430
Set-- so same
initialization step.
00:29:31.430 --> 00:29:33.020
We're going to set--
00:29:33.020 --> 00:29:37.430
this is a distance
estimate d, not delta.
00:29:37.430 --> 00:29:40.070
We're going to want
the d's to be our deltas
00:29:40.070 --> 00:29:41.540
at the end of the algorithm.
00:29:41.540 --> 00:29:43.430
That's what we're
going to have to prove.
00:29:43.430 --> 00:29:52.335
So we first set all of them to
infinity, and then set d of s,
00:29:52.335 --> 00:29:56.010
s equal to 0.
00:29:56.010 --> 00:29:58.620
And here, we're never
going to update it again,
00:29:58.620 --> 00:30:02.580
because our shortest
distances in a graph
00:30:02.580 --> 00:30:06.540
with non-negative edge weights
certainly can't go below 0.
00:30:09.090 --> 00:30:10.410
All right.
00:30:10.410 --> 00:30:14.130
Now we build our--
00:30:14.130 --> 00:30:18.540
build our changeable
priority queue--
00:30:18.540 --> 00:30:28.620
queue-- with an item--
00:30:28.620 --> 00:30:34.890
I'm going to say an item is--
00:30:34.890 --> 00:30:38.850
x is represented by
a tuple of its ID,
00:30:38.850 --> 00:30:43.410
and then its key just
for brevity here.
00:30:43.410 --> 00:30:49.020
With an item v, d of s, v.
00:30:49.020 --> 00:30:53.070
So I'm going to be storing in
my changeable priority queue
00:30:53.070 --> 00:30:58.260
the vertex label and its
shortest-path distance estimate
00:30:58.260 --> 00:30:59.190
d.
00:30:59.190 --> 00:31:01.500
And that's going to be the
key, the minimum that I'm
00:31:01.500 --> 00:31:11.820
going to be querying
on, for each v in V.
00:31:11.820 --> 00:31:13.410
So I'm going to
build that thing.
00:31:13.410 --> 00:31:18.180
It's going to then have all
of my vertices in my graph.
00:31:18.180 --> 00:31:25.710
Then while my changeable
priority queue still
00:31:25.710 --> 00:31:44.850
has items, not empty, I'm
going to delete some u, d s, u.
00:31:44.850 --> 00:31:49.620
So some item such
that its distance
00:31:49.620 --> 00:32:01.860
is minimized from Q that
has minimum distance.
00:32:05.830 --> 00:32:06.345
OK.
00:32:06.345 --> 00:32:07.720
So I'm
going to look
00:32:07.720 --> 00:32:09.370
at all the things in
my priority queue.
00:32:09.370 --> 00:32:11.830
At the start it's
just going to be s,
00:32:11.830 --> 00:32:14.500
because everything has
shortest-path distance
00:32:14.500 --> 00:32:17.080
estimate infinite except for s.
00:32:17.080 --> 00:32:18.940
And so that's
clearly the smallest.
00:32:18.940 --> 00:32:21.550
OK, so I'm going to
remove that from my queue,
00:32:21.550 --> 00:32:23.570
and then I'm going
to process it.
00:32:23.570 --> 00:32:25.150
How am I going to process it?
00:32:25.150 --> 00:32:28.600
It's the exact same kind
of thing as DAG relaxation.
00:32:28.600 --> 00:32:31.270
I'm going to relax all
its outgoing edges.
00:32:31.270 --> 00:32:40.100
So just for completeness for
v in the outgoing adjacencies
00:32:40.100 --> 00:32:50.070
of u, I'm going to relax--
00:32:50.070 --> 00:32:52.830
sorry.
00:32:52.830 --> 00:32:58.800
We have to check
whether we can relax it.
00:32:58.800 --> 00:33:09.710
Basically if the shortest-path
distance estimate to v
00:33:09.710 --> 00:33:22.760
is greater than going to u first
and then crossing that edge,
00:33:22.760 --> 00:33:24.830
if going through
that is better, this
00:33:24.830 --> 00:33:27.680
is violating our
triangle inequality.
00:33:27.680 --> 00:33:39.220
And so we relax edge u, v, and
by that we mean set this thing
00:33:39.220 --> 00:33:42.930
to be equal to that thing.
00:33:42.930 --> 00:33:44.370
That's what we meant by relax.
00:33:44.370 --> 00:33:46.950
And then we have one
other thing to do.
00:33:46.950 --> 00:33:53.560
We have changed these
distance estimates
00:33:53.560 --> 00:33:56.260
but our Q doesn't know that
we changed these things.
00:33:56.260 --> 00:33:58.000
We added these items in here.
00:34:01.170 --> 00:34:04.660
But it doesn't know that
my distances have changed.
00:34:04.660 --> 00:34:12.000
So we have to tell the Q to remember
to change its key value
00:34:12.000 --> 00:34:15.929
associated with the item v.
00:34:15.929 --> 00:34:22.320
So decrease-- what is it?
00:34:22.320 --> 00:34:33.810
Decrease key vertex
v in Q to the new d
00:34:33.810 --> 00:34:37.570
s, v, the one that I
just decreased here.
00:34:37.570 --> 00:34:39.989
And I know that I decreased
it because I set it
00:34:39.989 --> 00:34:40.980
to a smaller value.
00:34:40.980 --> 00:34:42.480
That makes sense.
00:34:42.480 --> 00:34:45.929
All right, so that's Dijkstra.
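A minimal Python sketch of the steps just described, under stated assumptions: vertices are the integers 0 to |V| - 1, and `adj` and `w` are hypothetical names for the adjacency lists and the edge weights. The changeable priority queue is stood in for by a direct access array of estimates with a linear-scan delete-min. That stand-in is chosen only for brevity; any structure supporting build, delete-min, and decrease-key would do.

```python
import math


def dijkstra(adj, w, s):
    """Return the list d with d[v] = shortest-path weight from s to v.

    Assumptions (not from the lecture): adj[u] lists the out-neighbors
    of u, w[(u, v)] is the non-negative weight of edge (u, v).
    """
    n = len(adj)
    d = [math.inf] * n        # distance estimates d(s, v), all infinite at first
    d[s] = 0                  # d(s, s) = 0 and is never updated again
    in_queue = [True] * n     # "build": every vertex starts in the queue

    for _ in range(n):        # the queue loses one vertex per iteration
        # delete_min: scan for the queued vertex with the smallest estimate
        u = min((v for v in range(n) if in_queue[v]), key=lambda v: d[v])
        in_queue[u] = False
        for v in adj[u]:      # relax every outgoing edge of u
            if d[v] > d[u] + w[(u, v)]:   # triangle inequality violated
                d[v] = d[u] + w[(u, v)]   # relax; writing d[v] is the decrease_key
    return d
```

Because the estimates live in the array that the scan reads, decrease-key here is nothing more than the assignment to `d[v]`.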
00:34:45.929 --> 00:34:50.550
Let's run it on an example.
00:34:50.550 --> 00:34:53.830
So here's an example.
00:34:53.830 --> 00:34:56.880
I have a directed graph.
00:34:56.880 --> 00:34:58.170
It does contain cycles.
00:34:58.170 --> 00:35:01.650
In particular, here
are some cycles.
00:35:01.650 --> 00:35:04.990
I think those are the main ones.
00:35:04.990 --> 00:35:06.740
There are definitely
cycles in this graph.
00:35:09.980 --> 00:35:12.320
But as you see,
all of the weights
00:35:12.320 --> 00:35:15.020
are non-negative, in
particular-- they're positive,
00:35:15.020 --> 00:35:16.100
actually.
00:35:16.100 --> 00:35:21.420
It's going to be just helpful
in writing out this example.
00:35:21.420 --> 00:35:25.310
So let's run Dijkstra
on this graph.
00:35:25.310 --> 00:35:28.400
First we initialize and we set
the shortest-path distance.
00:35:28.400 --> 00:35:32.760
I'm going to label it in white
here to all of the things.
00:35:32.760 --> 00:35:34.340
Then I'm going to,
as I update it,
00:35:34.340 --> 00:35:38.360
I'm just going to cross them
out and write a new number.
00:35:38.360 --> 00:35:40.750
So that's what it
is at the start.
00:35:40.750 --> 00:35:43.100
That's initialization,
that's after step 1.
00:35:43.100 --> 00:35:46.700
And then I stick things
into my Q. What's in my Q?
00:35:46.700 --> 00:35:52.100
Here's my Q. It's everything.
00:35:52.100 --> 00:35:58.520
It's vertices s, a, b, c, d.
00:35:58.520 --> 00:36:02.410
I got five items
in my Q. Really,
00:36:02.410 --> 00:36:05.740
it's the item pair with its
shortest distance estimate,
00:36:05.740 --> 00:36:08.440
I'm just not going
to rewrite that here.
00:36:08.440 --> 00:36:10.520
So the idea here is--
00:36:10.520 --> 00:36:11.670
the while loop, OK.
00:36:11.670 --> 00:36:13.990
Q is not empty, great.
00:36:13.990 --> 00:36:17.590
We're going to delete the one
with the smallest distance
00:36:17.590 --> 00:36:21.220
estimate, which
is s, right, yeah.
00:36:21.220 --> 00:36:26.980
So I remove that, and then
I relax edges out of s.
00:36:26.980 --> 00:36:30.766
So I relax edge here to a.
00:36:30.766 --> 00:36:32.890
That's better than the
distance estimate--
00:36:32.890 --> 00:36:36.340
10 is better than the
distance estimate infinite,
00:36:36.340 --> 00:36:39.730
so I'm going to
change this to 10.
00:36:39.730 --> 00:36:42.010
And then here's
another outgoing edge.
00:36:42.010 --> 00:36:46.390
3 is better than
infinite, so I'm
00:36:46.390 --> 00:36:49.540
going to change its delta to 3.
00:36:49.540 --> 00:36:50.290
OK.
00:36:50.290 --> 00:36:54.580
So now I go back in here and I
change the distance estimates
00:36:54.580 --> 00:36:58.140
associated with my Q.
00:36:58.140 --> 00:37:01.860
Now, next step of the
algorithm, s is done.
00:37:01.860 --> 00:37:06.460
I've processed everything
distance 0 away.
00:37:06.460 --> 00:37:08.320
But I'm now going to
use my priority queue
00:37:08.320 --> 00:37:12.160
to say which of my vertices
has the shortest distance
00:37:12.160 --> 00:37:14.590
estimate now.
00:37:14.590 --> 00:37:16.790
So which one is it?
00:37:16.790 --> 00:37:18.220
a, b, or c, or d?
00:37:20.860 --> 00:37:22.690
Yeah, it's 3 and c.
00:37:22.690 --> 00:37:24.370
3 is smaller than 10.
00:37:24.370 --> 00:37:28.360
So Q is going to
magically delete c for me,
00:37:28.360 --> 00:37:31.840
tell me what that is, and now
I'm going to process that.
00:37:31.840 --> 00:37:36.040
Now I've changed my
boundary to this.
00:37:36.040 --> 00:37:39.640
And now I relax edges out of c.
00:37:39.640 --> 00:37:43.450
So here's an edge
out of c, that's a 4.
00:37:43.450 --> 00:37:49.540
A 4 plus the 3 is smaller
than 10, so I update it.
00:37:49.540 --> 00:37:54.460
3 plus 8 is 11, that's smaller
than infinite, so I update it,
00:37:54.460 --> 00:37:56.540
I relax.
00:37:56.540 --> 00:37:59.360
3 plus 2 is smaller
than infinite,
00:37:59.360 --> 00:38:00.500
so I relax that as well.
00:38:03.170 --> 00:38:06.020
Now of the things
still left in my Q,
00:38:06.020 --> 00:38:08.940
I'm actually going to
remove it from my Q
00:38:08.940 --> 00:38:10.940
instead of crossing it
out, maybe that's better.
00:38:13.730 --> 00:38:18.920
Of the vertices still left in my
Q, which has smallest distance?
00:38:18.920 --> 00:38:19.640
Yeah.
00:38:19.640 --> 00:38:20.480
d.
00:38:20.480 --> 00:38:22.880
d has 5, 7, or 11.
00:38:22.880 --> 00:38:24.480
5 is the smallest.
00:38:24.480 --> 00:38:27.890
So I remove d from my Q
and I relax edges from it.
00:38:27.890 --> 00:38:33.560
And now my boundary looks
something like this.
00:38:33.560 --> 00:38:35.180
I relax edges out of it.
00:38:35.180 --> 00:38:37.640
5 plus 5, that's 10.
00:38:37.640 --> 00:38:42.470
10 is smaller than
11, so that's a 10.
00:38:42.470 --> 00:38:45.896
And that's the only
outgoing edge from d.
00:38:45.896 --> 00:38:47.990
So I'm done.
00:38:47.990 --> 00:38:53.690
And then the last, 7
is smaller than 10,
00:38:53.690 --> 00:38:55.400
I relax edges out of a.
00:38:58.030 --> 00:39:03.340
a to b, 7 plus 2
is smaller than 10.
00:39:09.660 --> 00:39:10.440
And now I'm done.
00:39:10.440 --> 00:39:13.970
So what I did every
time I removed s--
00:39:13.970 --> 00:39:17.810
or I removed a vertex, I set
its shortest-path distance
00:39:17.810 --> 00:39:19.880
to the small--
00:39:19.880 --> 00:39:22.290
the last value I assigned to it.
00:39:22.290 --> 00:39:29.670
So this was then 3, and
then a was 7, b was 9,
00:39:29.670 --> 00:39:33.150
and then d was 5.
00:39:33.150 --> 00:39:35.430
So that's Dijkstra in action.
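The run above can be written out as a table of estimates after each deletion. This is a reconstruction from the spoken walkthrough only, so treat it as a sketch: it assumes the edge weights w(s,a) = 10, w(s,c) = 3, w(c,a) = 4, w(c,b) = 8, w(c,d) = 2, w(d,b) = 5, and w(a,b) = 2, where the targets of the weight-8 and weight-2 edges out of c are inferred from the later updates. The board graph also has edges that close cycles; they are not named in the walkthrough and never improve an estimate there.

```latex
\begin{array}{l|ccccc}
\text{deleted from } Q & d(s,s) & d(s,a) & d(s,b) & d(s,c) & d(s,d) \\ \hline
\text{(initialization)} & 0 & \infty & \infty & \infty & \infty \\
s & 0 & 10 & \infty & 3 & \infty \\
c & 0 & 7 & 11 & 3 & 5 \\
d & 0 & 7 & 10 & 3 & 5 \\
a & 0 & 7 & 9 & 3 & 5 \\
b & 0 & 7 & 9 & 3 & 5
\end{array}
```

Each row shows the estimates after relaxing the outgoing edges of the vertex just deleted; the estimate of a vertex no longer changes once it has been deleted.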
00:39:35.430 --> 00:39:39.300
It seems like these are the
shortest-path distances,
00:39:39.300 --> 00:39:40.800
but how do we prove that?
00:39:40.800 --> 00:39:44.040
Did it do the right thing?
00:39:44.040 --> 00:39:45.530
Well, let's find out.
00:39:45.530 --> 00:39:49.100
So that's what we're going to
spend some time on right now,
00:39:49.100 --> 00:39:51.380
just talking about the
correctness of Dijkstra's
00:39:51.380 --> 00:39:52.263
algorithm.
00:39:57.690 --> 00:39:58.470
OK.
00:39:58.470 --> 00:40:06.550
Correctness follows from
two main observations.
00:40:06.550 --> 00:40:14.290
So the claim here that we're
trying to prove is that d of s, v
00:40:14.290 --> 00:40:20.860
equals delta of s, v--
so the estimates equal
00:40:20.860 --> 00:40:30.680
the shortest-path distances
at the end of Dijkstra for all v
00:40:30.680 --> 00:40:33.995
in V at end.
00:40:37.570 --> 00:40:40.100
And this is going to follow
from two observations.
00:40:40.100 --> 00:40:53.230
So the proof here,
first, if ever relaxation
00:40:53.230 --> 00:40:57.160
sets d of s, v--
00:40:57.160 --> 00:41:01.170
it sets the estimate equal to
the shortest-path distance,
00:41:01.170 --> 00:41:13.125
if it ever does that, I argue
to you that's still true at end.
00:41:16.280 --> 00:41:18.915
OK, that's not a very
strong statement.
00:41:18.915 --> 00:41:23.150
This is saying if I ever
set the distance estimate
00:41:23.150 --> 00:41:26.060
to the true distance,
I'm never going to set it
00:41:26.060 --> 00:41:28.550
to a different value later on.
00:41:28.550 --> 00:41:29.540
And why is that?
00:41:32.140 --> 00:41:36.370
Well, relaxation only ever
decreases the distance.
00:41:39.580 --> 00:41:51.840
Relaxation only
decreases d s, v.
00:41:51.840 --> 00:41:56.220
But we proved in lecture
11-- so two lectures ago
00:41:56.220 --> 00:41:57.645
that relaxation is safe.
00:42:02.130 --> 00:42:03.960
And what does safe mean?
00:42:03.960 --> 00:42:08.740
Safe means that relaxation--
00:42:08.740 --> 00:42:13.630
that relaxation will only ever
change these distance estimates
00:42:13.630 --> 00:42:18.280
to be either infinite--
00:42:18.280 --> 00:42:24.000
it was never-- there was
never a path to my vertex.
00:42:24.000 --> 00:42:38.750
Or it was the length of some
path to v. Length of some path.
00:42:38.750 --> 00:42:39.590
OK.
00:42:39.590 --> 00:42:42.200
So what does that mean?
00:42:42.200 --> 00:42:46.070
It only decreases,
but it's always
00:42:46.070 --> 00:42:49.250
the length of some
path to v. So if this
00:42:49.250 --> 00:42:50.765
is the length of
the shortest path
00:42:50.765 --> 00:42:54.230
to v, I could never set
it to a smaller length,
00:42:54.230 --> 00:42:56.990
because there are no paths
with shorter distance.
00:42:56.990 --> 00:42:58.080
That's the whole point.
00:42:58.080 --> 00:42:58.580
OK.
00:42:58.580 --> 00:43:02.840
So with this
observation, I'm going
00:43:02.840 --> 00:43:05.570
to argue this final claim.
00:43:05.570 --> 00:43:19.660
It suffices to show that my
estimate equals the shortest
00:43:19.660 --> 00:43:34.730
distance when v is
removed from the Q.
00:43:34.730 --> 00:43:41.770
And since I removed every vertex
from the Q in this while loop,
00:43:41.770 --> 00:43:45.460
I will eventually set
all of the distance estimates
00:43:45.460 --> 00:43:49.810
to the real distance
and we'll be golden.
00:43:49.810 --> 00:43:50.660
Happy days.
00:43:50.660 --> 00:43:51.160
All right.
00:43:51.160 --> 00:43:54.220
So we'll be done if we
can prove that statement.
00:43:54.220 --> 00:43:55.270
All right.
00:43:55.270 --> 00:44:01.360
So we're going to prove
this by induction obviously.
00:44:01.360 --> 00:44:21.950
Induction on first k
vertices removed from the Q.
00:44:21.950 --> 00:44:28.190
So the Q, we're popping vertices
from this Q in some order.
00:44:28.190 --> 00:44:31.580
So I'm going to just
argue that this claim is
00:44:31.580 --> 00:44:34.880
true for the first k.
00:44:34.880 --> 00:44:38.270
Clearly that's true
for k equals 1.
00:44:38.270 --> 00:44:44.710
Base case, k equals 1.
00:44:44.710 --> 00:44:45.640
What is k equals 1?
00:44:45.640 --> 00:44:47.560
That means the first
vertex that I
00:44:47.560 --> 00:44:50.980
pop has this property,
which is definitely true,
00:44:50.980 --> 00:44:54.823
because we set the shortest
distance to s to be 0.
00:44:54.823 --> 00:44:55.490
That's all good.
00:44:58.150 --> 00:44:59.650
Now we have our inductive step.
00:45:09.280 --> 00:45:17.920
Assume it's true for k prime--
00:45:17.920 --> 00:45:22.030
sorry, k less than k prime.
00:45:22.030 --> 00:45:30.780
And let's let v prime be
the k prime-th vertex popped.
00:45:33.460 --> 00:45:35.160
v prime.
00:45:35.160 --> 00:45:36.580
OK.
00:45:36.580 --> 00:45:46.830
And now let's look at
some shortest path from s
00:45:46.830 --> 00:45:49.060
to v prime.
00:45:49.060 --> 00:45:53.250
So we got the shortest
path from s to v prime.
00:45:53.250 --> 00:45:54.060
It exists.
00:45:54.060 --> 00:45:56.400
v prime is accessible.
00:45:56.400 --> 00:45:58.770
Let's say we pruned
our graph to be
00:45:58.770 --> 00:46:01.020
only the things
accessible from s
00:46:01.020 --> 00:46:07.610
so that, yeah, there exists
the shortest path to v prime.
00:46:07.610 --> 00:46:11.780
And now let's think
about these vertices.
00:46:11.780 --> 00:46:16.400
Some of them were removed
from the Q and some of them
00:46:16.400 --> 00:46:17.600
were not.
00:46:17.600 --> 00:46:21.260
s was definitely
removed from the Q.
00:46:21.260 --> 00:46:25.250
But some of these other
vertices might not be.
00:46:25.250 --> 00:46:27.530
I want to be able to
induct on this path,
00:46:27.530 --> 00:46:30.410
in particular, the
vertex before me
00:46:30.410 --> 00:46:32.330
so that I can say
that when I removed
00:46:32.330 --> 00:46:39.300
it and I relax the edge to v
prime, then we're all golden.
00:46:39.300 --> 00:46:40.840
But that might not be the case.
00:46:40.840 --> 00:46:43.830
There could be a vertex,
the vertex preceding me
00:46:43.830 --> 00:46:47.930
in the graph in this
shortest path that was not
00:46:47.930 --> 00:46:50.240
popped from Q. I need
to argue that it was
00:46:50.240 --> 00:46:52.790
or some other thing.
00:46:52.790 --> 00:47:01.850
So let's consider the first
vertex in this path from s
00:47:01.850 --> 00:47:05.910
to v. I'm going to
call it y, I think.
00:47:05.910 --> 00:47:06.410
Yeah.
00:47:09.660 --> 00:47:16.430
A vertex y that is
still in Q. After I
00:47:16.430 --> 00:47:21.740
pop v prime, this is the
first-- or before I pop v prime,
00:47:21.740 --> 00:47:25.790
y is still in the Q. Now these
might be the same vertex
00:47:25.790 --> 00:47:31.830
if all of the preceding ones
on this path were popped from the Q.
00:47:31.830 --> 00:47:35.280
But in particular, we're
going to look at this guy.
00:47:35.280 --> 00:47:39.130
And say its predecessor's
x in the path.
00:47:39.130 --> 00:47:41.760
Well what do I know?
00:47:41.760 --> 00:47:45.470
I know that x is not in the queue.
00:47:45.470 --> 00:47:51.610
Everything here was
popped from the Q--
00:47:51.610 --> 00:47:52.400
not in.
00:47:55.720 --> 00:47:59.800
Which means that by induction,
the shortest-path distance
00:47:59.800 --> 00:48:01.480
was set here correctly.
00:48:01.480 --> 00:48:07.990
So that the distance
estimate at y
00:48:07.990 --> 00:48:16.432
can't be bigger than the
shortest path to x plus w x, y.
00:48:20.580 --> 00:48:23.280
But this is on the
shortest path to y,
00:48:23.280 --> 00:48:27.340
because the subpaths of shortest
paths are shortest paths.
00:48:27.340 --> 00:48:32.640
So this has to equal delta
s, y, the distance to y.
00:48:32.640 --> 00:48:35.080
So actually, y is all good here.
00:48:35.080 --> 00:48:39.030
And so if v prime
were y, we'd be done.
00:48:39.030 --> 00:48:43.030
That's the same argument
as DAG relaxation.
00:48:43.030 --> 00:48:46.040
But we need to prove
something about v prime.
00:48:46.040 --> 00:48:50.200
Well, because we have
non-negative weights,
00:48:50.200 --> 00:48:53.890
the distance to v prime
has to be at least as big
00:48:53.890 --> 00:48:57.000
as this distance,
because it's a subpath.
00:48:57.000 --> 00:49:01.450
So this has to be less than
or equal to the true distance
00:49:01.450 --> 00:49:03.205
to v prime.
00:49:06.090 --> 00:49:11.100
Because of negative--
non-negative weights,
00:49:11.100 --> 00:49:15.050
because the weights
are non-negative.
00:49:15.050 --> 00:49:18.420
But because
relaxation is safe, we
00:49:18.420 --> 00:49:21.150
know that our distance
estimate for v prime
00:49:21.150 --> 00:49:23.340
has to be at least the
shortest-path distance.
00:49:28.320 --> 00:49:31.490
This is because it's safe.
00:49:31.490 --> 00:49:36.760
This is-- weights are
greater than or equal to 0.
00:49:40.060 --> 00:49:43.870
The last step here
is that because we're
00:49:43.870 --> 00:49:46.030
popping the minimum
from our priority
00:49:46.030 --> 00:49:49.660
queue, the thing
with the smallest
00:49:49.660 --> 00:49:52.720
shortest-path distance,
this has to be less than
00:49:52.720 --> 00:49:59.440
or equal to the shortest-path
distance estimate to y.
00:49:59.440 --> 00:50:02.410
Because this is the smallest
among all such vertices
00:50:02.410 --> 00:50:05.700
in my Q.
00:50:05.700 --> 00:50:07.600
But these are the same value.
00:50:07.600 --> 00:50:10.200
So everything between
here is the same value.
00:50:10.200 --> 00:50:14.370
In particular, the
estimate here is
00:50:14.370 --> 00:50:16.737
equal to my true
shortest-path distance,
00:50:16.737 --> 00:50:18.570
which is exactly what
we're trying to prove.
00:50:18.570 --> 00:50:21.430
OK, so that's why
Dijkstra's correct.
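The inductive step can be collected into a single chain of inequalities. The notation is an assumption layered on the spoken proof: d is the estimate, delta the true distance, v' the vertex being popped, y the first vertex on the shortest path to v' that has not yet been popped, and x its predecessor on that path (already popped, with its edge to y already relaxed).

```latex
\begin{align*}
\delta(s, v') &\le d(s, v')
  && \text{relaxation is safe} \\
&\le d(s, y)
  && v' \text{ has the minimum estimate among vertices in } Q \\
&\le \delta(s, x) + w(x, y)
  && \text{induction on } x \text{, and edge } (x, y) \text{ was relaxed} \\
&= \delta(s, y)
  && \text{subpaths of shortest paths are shortest paths} \\
&\le \delta(s, v')
  && \text{weights are non-negative}
\end{align*}
```

The chain starts and ends at the same value, so every inequality holds with equality, and in particular d(s, v') = delta(s, v') at the moment v' is popped.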
00:50:21.430 --> 00:50:24.990
I'm going to spend the last
five minutes on the running
00:50:24.990 --> 00:50:25.965
time of Dijkstra.
00:50:28.580 --> 00:50:35.810
We set this up so
that we did everything
00:50:35.810 --> 00:50:39.380
in terms of these Q operations.
00:50:39.380 --> 00:50:42.170
Right so we have
these Q operations,
00:50:42.170 --> 00:50:43.460
we have three of them.
00:50:43.460 --> 00:50:48.660
I'm going to say if I
have a build operation,
00:50:48.660 --> 00:50:51.210
let's say it takes
B time; delete min,
00:50:51.210 --> 00:50:54.330
I'm going to say it takes M
time; and this decrease key,
00:50:54.330 --> 00:50:57.360
I'm going to say
it takes D time.
00:50:57.360 --> 00:50:59.310
So what is the running
time of Dijkstra?
00:50:59.310 --> 00:51:03.400
If I take a look at that
algorithm over there--
00:51:03.400 --> 00:51:09.990
well I guess let's switch
these back up again.
00:51:09.990 --> 00:51:11.760
OK, so what does this do?
00:51:11.760 --> 00:51:12.630
We build once.
00:51:15.330 --> 00:51:21.080
Then we delete the minimum
from the Q how many times?
00:51:21.080 --> 00:51:22.040
v times.
00:51:22.040 --> 00:51:26.212
We remove every
vertex from our Q.
00:51:26.212 --> 00:51:30.880
Then for every
possible edge, we may
00:51:30.880 --> 00:51:35.410
need to relax and decrease
the key in our queue
00:51:35.410 --> 00:51:37.420
once for every outgoing edge.
00:51:40.370 --> 00:51:55.640
So the running time is B plus
V times M plus E times D. OK.
00:51:55.640 --> 00:51:58.340
So how could we implement
this priority queue?
00:51:58.340 --> 00:52:04.270
Well, if we use the stupidest
priority queue in the world,
00:52:04.270 --> 00:52:06.070
here's a list of
different implementations
00:52:06.070 --> 00:52:08.600
we could have for
our priority queues.
00:52:08.600 --> 00:52:12.820
And when I say priority queue,
I mean this priority queue.
00:52:12.820 --> 00:52:15.340
We're already implementing
the changeable priority queue
00:52:15.340 --> 00:52:18.800
by linking it with a
dictionary that's efficient.
00:52:18.800 --> 00:52:22.280
If I just use an array, I can
find the min in linear time,
00:52:22.280 --> 00:52:23.950
sure.
00:52:23.950 --> 00:52:26.635
And I don't have to update
that array in any way.
00:52:29.350 --> 00:52:32.950
I mean, I can just
keep the distances
00:52:32.950 --> 00:52:34.580
in my direct access array.
00:52:34.580 --> 00:52:36.580
I don't have to store a
separate data structure.
00:52:36.580 --> 00:52:40.400
I just store the distances
in my direct access array D,
00:52:40.400 --> 00:52:43.210
and so I can find
it in constant time
00:52:43.210 --> 00:52:45.208
and I can update the
values stored there.
00:52:45.208 --> 00:52:46.750
And then whenever
I want the minimum,
00:52:46.750 --> 00:52:49.280
I can just loop through
the whole thing.
00:52:49.280 --> 00:52:51.940
So that gives me a
really fast decrease key,
00:52:51.940 --> 00:52:54.190
but slow delete min.
00:52:54.190 --> 00:52:58.920
But if we take a look at
the running time bound here,
00:52:58.920 --> 00:53:01.560
we get something, if
we replace n with v,
00:53:01.560 --> 00:53:05.250
we get a quadratic
time algorithm
00:53:05.250 --> 00:53:10.410
in the number of vertices,
which for a dense graph,
00:53:10.410 --> 00:53:11.643
this is linear time.
00:53:11.643 --> 00:53:12.810
That's actually pretty good.
00:53:12.810 --> 00:53:15.205
Dense meaning that I have
at least a quadratic number
00:53:15.205 --> 00:53:15.705
of edges.
00:53:18.420 --> 00:53:20.150
So that's actually
really good, and it's
00:53:20.150 --> 00:53:23.180
the stupidest possible
data structure
00:53:23.180 --> 00:53:25.460
we could use for
this priority queue.
00:53:25.460 --> 00:53:29.480
Now we can do a little better,
actually, for not dense--
00:53:29.480 --> 00:53:32.840
I mean, for sparse graphs
where the number of edges
00:53:32.840 --> 00:53:40.133
is at most v, then this is
pretty bad, it's quadratic.
00:53:40.133 --> 00:53:41.800
We want to do something
a little better.
00:53:41.800 --> 00:53:45.340
Now if we're sparse,
a binary heap
00:53:45.340 --> 00:53:49.330
can delete min in
logarithmic time,
00:53:49.330 --> 00:53:53.500
but it can actually, if I know
where I am in the heap and I
00:53:53.500 --> 00:53:59.150
decrease the key and
I'm in a min heap,
00:53:59.150 --> 00:54:01.400
I can just swap with
my parent upwards
00:54:01.400 --> 00:54:05.900
in the tree in log n time
and rebalance the-- refix
00:54:05.900 --> 00:54:07.710
the binary heap property.
00:54:07.710 --> 00:54:11.240
And so I can do that
in logarithmic time.
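A sketch of how a binary heap can be made changeable, assuming item IDs are the integers 0 to n - 1 so that a direct access array can record where each item sits in the heap. The class and method names (`ChangeablePQ`, `delete_min`, `decrease_key`) are illustrative, not taken from the course code.

```python
class ChangeablePQ:
    """Binary min-heap of (key, id) pairs plus a direct access array
    mapping each id to its current index in the heap."""

    def __init__(self, keys):
        # keys[i] is the initial key of the item with ID i
        self.heap = [(k, i) for i, k in enumerate(keys)]
        self.pos = list(range(len(keys)))   # pos[id] = index in self.heap
        for i in reversed(range(len(self.heap) // 2)):
            self._sift_down(i)              # bottom-up build, linear time

    def __len__(self):
        return len(self.heap)

    def _swap(self, i, j):
        h = self.heap
        h[i], h[j] = h[j], h[i]
        self.pos[h[i][1]] = i               # keep the index array in sync
        self.pos[h[j][1]] = j

    def _sift_up(self, i):
        # swap with the parent while the heap property is violated
        while i > 0 and self.heap[i] < self.heap[(i - 1) // 2]:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            small = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and self.heap[c] < self.heap[small]:
                    small = c
            if small == i:
                return
            self._swap(i, small)
            i = small

    def delete_min(self):
        """Remove and return (id, key) of the item with the smallest key."""
        self._swap(0, len(self.heap) - 1)
        key, item = self.heap.pop()
        self.pos[item] = -1                 # marks the item as removed
        if self.heap:
            self._sift_down(0)
        return item, key

    def decrease_key(self, item, key):
        """Lower the key of `item`; ignored if removed or not smaller."""
        i = self.pos[item]                  # constant-time lookup by ID
        if i >= 0 and key < self.heap[i][0]:
            self.heap[i] = (key, item)
            self._sift_up(i)
```

For example, after `q = ChangeablePQ([5, 3, 8, 1])`, calling `q.delete_min()` returns the pair `(3, 1)`, and `q.decrease_key(2, 0)` moves item 2 to the root. The index array is what lets decrease-key skip the search and pay only for the upward swaps.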
00:54:11.240 --> 00:54:14.900
And if I do that and I
put it into this formula,
00:54:14.900 --> 00:54:16.790
I actually get n--
00:54:16.790 --> 00:54:24.830
or V plus V times log
V plus E times log V.
00:54:24.830 --> 00:54:29.780
And so that's going to give me
E log V if I'm assuming that I'm
00:54:29.780 --> 00:54:34.190
first pruning out all of the
things not connected to me,
00:54:34.190 --> 00:54:37.670
then E asymptotically
upper bounds V,
00:54:37.670 --> 00:54:42.440
and I get this E log V running
time, which is pretty good.
00:54:42.440 --> 00:54:44.930
That's just an extra
log factor on linear.
00:54:48.400 --> 00:54:50.740
Now there's an even better--
00:54:50.740 --> 00:54:54.910
well, better is hard to say.
00:54:54.910 --> 00:54:57.970
Really, there's a
different data structure
00:54:57.970 --> 00:55:03.940
that achieves both bounds
for sparse and dense graphs
00:55:03.940 --> 00:55:05.530
and everything in between.
00:55:05.530 --> 00:55:10.120
It gives us an E plus V
log V running time bound.
00:55:10.120 --> 00:55:12.280
This data structure is
called the Fibonacci heap.
00:55:12.280 --> 00:55:15.760
We're not going to
talk about it in 6.006.
00:55:15.760 --> 00:55:19.510
They talk about it-- and you
can look at chapter 19 in CLRS
00:55:19.510 --> 00:55:20.680
or you can look at--
00:55:20.680 --> 00:55:23.560
I think they talk
about it in 6.854
00:55:23.560 --> 00:55:26.200
if you're interested in
learning about Fibonacci heaps.
00:55:26.200 --> 00:55:27.550
But these are almost never--
00:55:27.550 --> 00:55:30.720
I mean, they get good
theoretical bounds.
00:55:30.720 --> 00:55:33.820
So what you want to say
is, whenever we give you
00:55:33.820 --> 00:55:36.970
a theory problem where you
might want to use Dijkstra,
00:55:36.970 --> 00:55:43.690
you want to use this
theoretical running time bound
00:55:43.690 --> 00:55:48.850
for your problem E plus
V log V. But if you
00:55:48.850 --> 00:55:55.000
happen to know that your graph
is sparse or dense, just using
00:55:55.000 --> 00:55:58.180
an array or a heap is
going to get you just as
00:55:58.180 --> 00:55:59.650
good of a running time.
00:55:59.650 --> 00:56:02.200
Very close to linear.
00:56:02.200 --> 00:56:06.520
And so in practice, most people,
when they are implementing
00:56:06.520 --> 00:56:08.710
a graph search
algorithm, they know
00:56:08.710 --> 00:56:10.720
if their graph is
sparse or dense,
00:56:10.720 --> 00:56:13.670
and so they never bother
implementing a Fibonacci heap,
00:56:13.670 --> 00:56:16.180
which is a little complicated.
00:56:16.180 --> 00:56:19.570
So they're usually either in
one of these first two cases
00:56:19.570 --> 00:56:23.350
where V squared is linear
when your graph is dense,
00:56:23.350 --> 00:56:29.072
or we're very close to linear,
E times log V, which is V log
00:56:29.072 --> 00:56:31.570
V if your graph is sparse.
00:56:31.570 --> 00:56:35.185
So that's the running
time of Dijkstra.
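The three cases can be lined up against the general bound. B, M, and D are the build, delete-min, and decrease-key costs named earlier. The per-operation Fibonacci heap costs are the standard amortized bounds rather than something derived in this lecture, and the binary heap total assumes the graph has been pruned to vertices reachable from s.

```latex
T_{\text{Dijkstra}} = B + |V| \cdot M + |E| \cdot D

\begin{array}{l|ccc|l}
\text{priority queue} & B & M & D & \text{total} \\ \hline
\text{array} & O(|V|) & O(|V|) & O(1) & O(|V|^2) \\
\text{binary heap} & O(|V|) & O(\log |V|) & O(\log |V|) & O(|E| \log |V|) \\
\text{Fibonacci heap} & O(|V|) & O(\log |V|) \text{ (amortized)} & O(1) \text{ (amortized)} & O(|E| + |V| \log |V|)
\end{array}
```

With |E| on the order of |V|^2 the array row is linear in the size of the graph, and with |E| on the order of |V| the binary heap row is |V| log |V|, which is why the Fibonacci heap row rarely needs to be implemented in practice.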
00:56:38.030 --> 00:56:45.820
So far, we've gotten
all of these nice bounds.
00:56:45.820 --> 00:56:48.130
Some special cases where we're--
00:56:48.130 --> 00:56:51.130
I mean, special cases
where we're linear.
00:56:51.130 --> 00:56:55.180
Dijkstra where we're
close to linear.
00:56:55.180 --> 00:56:59.140
And Bellman-Ford, if we throw
our hands up in the air,
00:56:59.140 --> 00:57:01.518
there might be negative
cycles in our graph,
00:57:01.518 --> 00:57:03.560
we gotta spend that
quadratic running time bound.
00:57:03.560 --> 00:57:05.290
Now there are faster
algorithms, but this
00:57:05.290 --> 00:57:07.990
is the fastest we're going
to teach you in this class.
00:57:07.990 --> 00:57:10.260
And in the
next lecture we're
00:57:10.260 --> 00:57:13.030
going to be talking about
all-pairs shortest paths,
00:57:13.030 --> 00:57:16.470
and we'll pick it up next time.