WEBVTT

00:00:00.070 --> 00:00:02.500
The following content is
provided under a Creative

00:00:02.500 --> 00:00:04.019
Commons license.

00:00:04.019 --> 00:00:06.360
Your support will help
MIT OpenCourseWare

00:00:06.360 --> 00:00:10.730
continue to offer high quality
educational resources for free.

00:00:10.730 --> 00:00:13.340
To make a donation or
view additional materials

00:00:13.340 --> 00:00:15.325
from hundreds of
MIT courses, visit

00:00:15.325 --> 00:00:16.575
MIT OpenCourseWare at ocw.mit.edu.

00:00:20.681 --> 00:00:21.680
ERIK DEMAINE: All right.

00:00:21.680 --> 00:00:24.740
Welcome to our second
lecture on what

00:00:24.740 --> 00:00:26.735
to do when you have
an NP-hard problem.

00:00:26.735 --> 00:00:30.260
So two lectures ago we saw how
to prove a problem is NP-hard.

00:00:30.260 --> 00:00:34.320
Last lecture we saw if
you want polynomial time

00:00:34.320 --> 00:00:37.980
but you're willing to put up
with a not perfect solution,

00:00:37.980 --> 00:00:40.190
but you want to get within
some factor of the best

00:00:40.190 --> 00:00:42.800
solution, that's
approximation algorithms.

00:00:42.800 --> 00:00:45.720
Today we're going to do
a different thing called

00:00:45.720 --> 00:00:47.110
fixed parameter algorithms.

00:00:47.110 --> 00:00:49.450
These are going to run in
exponential time in the worst

00:00:49.450 --> 00:00:49.950
case.

00:00:49.950 --> 00:00:54.310
But not so bad in a certain
sense, which we'll get to.

00:00:54.310 --> 00:01:03.530
In general, the theme of these
last two lectures and this one

00:01:03.530 --> 00:01:06.110
is that we'd really like
to solve hard problems.

00:01:06.110 --> 00:01:10.660
We'd like to solve them fast,
meaning polynomial time.

00:01:16.390 --> 00:01:20.970
And we would like correct
solutions, also known

00:01:20.970 --> 00:01:21.995
as exact solutions.

00:01:26.801 --> 00:01:27.300
OK.

00:01:27.300 --> 00:01:30.200
We'd love to solve NP-hard
problems in polynomial time

00:01:30.200 --> 00:01:31.470
exactly.

00:01:31.470 --> 00:01:34.450
But that's not possible
unless P equals NP.

00:01:34.450 --> 00:01:37.630
So pick any two.

00:01:37.630 --> 00:01:41.200
That's the general idea.

00:01:41.200 --> 00:01:48.930
This is a bastardization of a
joke which is-- sleep, friends,

00:01:48.930 --> 00:01:50.970
work-- pick any two.

00:01:50.970 --> 00:01:53.400
That's the MIT motto.

00:01:53.400 --> 00:01:57.180
Here in algorithms-- hard,
fast, exact-- pick any two.

00:01:57.180 --> 00:02:01.540
So most of this class is about
these two-- polynomial time

00:02:01.540 --> 00:02:03.970
algorithms give
you exact things.

00:02:03.970 --> 00:02:13.460
That's the class P. Last
lecture was about hard problems.

00:02:13.460 --> 00:02:15.120
We drop exactness.

00:02:15.120 --> 00:02:16.709
We still want polynomial time.

00:02:16.709 --> 00:02:18.250
We still want to
solve hard problems.

00:02:18.250 --> 00:02:20.900
So this is approximation
algorithms.

00:02:20.900 --> 00:02:25.160
And what we're doing today
is the other combination.

00:02:25.160 --> 00:02:29.850
So we want exact,
but we're going

00:02:29.850 --> 00:02:31.646
to sacrifice how
fast things are.

00:02:31.646 --> 00:02:33.270
They're not going to
be polynomial time

00:02:33.270 --> 00:02:35.170
in a strict sense,
but it's going

00:02:35.170 --> 00:02:38.830
to be somewhere in between
polynomial and exponential.

00:02:38.830 --> 00:02:43.450
This is an area called FPT for
fixed parameter tractability.

00:02:46.050 --> 00:02:48.600
So what's this
parameter business?

00:02:48.600 --> 00:03:01.630
In general, the idea is that we
really want an exact solution

00:03:01.630 --> 00:03:03.160
to an NP-hard
problem, which means

00:03:03.160 --> 00:03:06.330
it has to take exponential
time in the worst case.

00:03:06.330 --> 00:03:24.000
But we want to confine
the exponential dependence

00:03:24.000 --> 00:03:25.570
to something called a parameter.

00:03:32.450 --> 00:03:32.950
OK.

00:03:32.950 --> 00:03:35.480
We actually use
parameters all the time.

00:03:35.480 --> 00:03:38.104
For example, on a graph,
there's two typical parameters

00:03:38.104 --> 00:03:39.770
you think about-- the
number of vertices

00:03:39.770 --> 00:03:41.120
and the number of edges.

00:03:41.120 --> 00:03:45.050
If you're sorting an array, the
usual parameter you think about

00:03:45.050 --> 00:03:46.570
is the size of the array.

00:03:46.570 --> 00:03:47.070
OK.

00:03:47.070 --> 00:03:47.819
That's all I mean.

00:03:47.819 --> 00:03:55.550
A parameter, in general, is just
some kind of size or complexity

00:03:55.550 --> 00:03:56.520
measure.

00:03:56.520 --> 00:04:03.300
So in general, a
parameter is going

00:04:03.300 --> 00:04:10.940
to be-- we're going
to call it k of x--

00:04:10.940 --> 00:04:20.870
should be a non-negative
integer-- and x is the input.

00:04:20.870 --> 00:04:24.729
So you're thinking
about some problem,

00:04:24.729 --> 00:04:26.395
like a problem we'll
be looking at today

00:04:26.395 --> 00:04:30.390
is vertex cover, which we
saw in the last lecture.

00:04:30.390 --> 00:04:33.070
Vertex cover--
you're given a graph.

00:04:33.070 --> 00:04:36.810
And based on that
graph, we're going

00:04:36.810 --> 00:04:40.600
to define some
function of that graph.

00:04:40.600 --> 00:04:42.464
So this is the input
to the problem,

00:04:42.464 --> 00:04:44.880
and k is just going to be some
non-negative integer, which

00:04:44.880 --> 00:04:46.330
is a function of that input.

00:04:46.330 --> 00:04:50.227
Just some measure of how
tough your problem is.

00:04:50.227 --> 00:04:51.600
OK.

00:04:51.600 --> 00:04:55.080
And what we would
like is a running time

00:04:55.080 --> 00:04:59.420
that is exponential in k, but
polynomial in everything else.

00:04:59.420 --> 00:05:03.910
Polynomial in the size of
the problem, in v and E. OK.

00:05:03.910 --> 00:05:16.780
So that's the general goal
is-- polynomial in the problem

00:05:16.780 --> 00:05:29.750
size-- which we usually
call n-- and exponential

00:05:29.750 --> 00:05:36.600
in the parameter--
which I'm calling-- just

00:05:36.600 --> 00:05:37.980
going to call k--
in general, you

00:05:37.980 --> 00:05:40.220
could consider more
parameters, but we're

00:05:40.220 --> 00:05:44.090
just going to think of two--
the overall size of the problem,

00:05:44.090 --> 00:05:48.820
and some particular parameter
that we look at called k.

00:05:48.820 --> 00:05:51.550
So if you can
achieve this, which

00:05:51.550 --> 00:05:53.340
we'll call fixed
parameter tractability--

00:05:53.340 --> 00:05:54.830
I'll define it formally
in a little bit,

00:05:54.830 --> 00:05:56.210
because there's
more than one way

00:05:56.210 --> 00:05:57.390
you might think of defining it.

00:05:57.390 --> 00:05:58.014
Some are right.

00:05:58.014 --> 00:05:59.970
Some are wrong.

00:05:59.970 --> 00:06:01.950
If you can achieve
this, what you get

00:06:01.950 --> 00:06:04.760
is an exact algorithm
for your problem

00:06:04.760 --> 00:06:09.012
that runs really fast
provided k is small.

00:06:09.012 --> 00:06:10.720
So this is sort of a
way of saying, well,

00:06:10.720 --> 00:06:12.850
you know the problem
is NP-hard in general,

00:06:12.850 --> 00:06:16.600
but as long as this measure
k is reasonably small,

00:06:16.600 --> 00:06:18.590
I'm still able to
solve it really fast.

00:06:18.590 --> 00:06:22.220
So it's a way of
characterizing a wide family

00:06:22.220 --> 00:06:27.100
of-- a big subset of the
problem that you can solve.

00:06:27.100 --> 00:06:30.530
You know that in general you're
going to need exponential time,

00:06:30.530 --> 00:06:34.130
but this gives you a measure
of how hard your input is.

00:06:34.130 --> 00:06:35.620
May not be the only measure.

00:06:35.620 --> 00:06:37.870
May not be the best
one, in any sense.

00:06:37.870 --> 00:06:40.920
But if you can
define a parameter,

00:06:40.920 --> 00:06:43.450
and you know that in
your practical scenarios

00:06:43.450 --> 00:06:46.140
that parameter will be
small, then you're golden.

00:06:46.140 --> 00:06:47.790
Then you can actually
solve the problem

00:06:47.790 --> 00:06:50.320
in a reasonable amount of time
and get an exact solution.

00:06:50.320 --> 00:06:53.340
No approximation here.

00:06:53.340 --> 00:06:57.360
So that's the idea.

00:06:57.360 --> 00:07:03.140
So that was a parameter.

00:07:03.140 --> 00:07:06.530
We're also going to define
a parameterized problem.

00:07:15.540 --> 00:07:18.280
This is just a problem
plus a parameter.

00:07:25.070 --> 00:07:25.570
OK.

00:07:25.570 --> 00:07:27.822
So we already have some
notions of problems.

00:07:27.822 --> 00:07:29.530
We can take any problem
that we've looked

00:07:29.530 --> 00:07:31.550
at before, like vertex cover.

00:07:31.550 --> 00:07:34.080
And if we just define
some parameter,

00:07:34.080 --> 00:07:36.160
then we get a
parameterized problem

00:07:36.160 --> 00:07:37.680
when we put these
things together.

00:07:37.680 --> 00:07:39.430
And usually we would
write it as something

00:07:39.430 --> 00:07:43.160
like, oh, take this problem,
and then consider it

00:07:43.160 --> 00:07:45.940
with respect to this parameter.

00:07:49.180 --> 00:07:51.780
And in general, for
a single problem,

00:07:51.780 --> 00:07:54.234
there may be several
natural parameters

00:07:54.234 --> 00:07:55.400
that you want to care about.

00:07:55.400 --> 00:08:00.100
Usually there's actually
one obvious parameter.

00:08:00.100 --> 00:08:04.490
So let's do that
for vertex cover.

00:08:04.490 --> 00:08:07.020
But in general, we can talk
about a problem with respect

00:08:07.020 --> 00:08:08.020
to different parameters.

00:08:08.020 --> 00:08:11.300
And some of them may be
feasible to solve in this sense.

00:08:11.300 --> 00:08:13.804
Some maybe not.

00:08:13.804 --> 00:08:14.680
All right.

00:08:14.680 --> 00:08:18.910
So I'm going to
define k vertex cover.

00:08:18.910 --> 00:08:22.562
This is almost the
same as vertex cover.

00:08:22.562 --> 00:08:24.200
It just has a k in front.

00:08:24.200 --> 00:08:27.380
But the k means that it's a
parameterized problem instead

00:08:27.380 --> 00:08:29.170
of just a general problem.

00:08:32.039 --> 00:08:40.030
So as in vertex cover,
we're given a graph G.

00:08:40.030 --> 00:08:41.750
And I'm going to
think of the decision

00:08:41.750 --> 00:08:43.320
version of vertex cover.

00:08:43.320 --> 00:08:50.260
So we're given a
non-negative integer k.

00:08:50.260 --> 00:08:54.640
And we want to know-- is there
a vertex cover of size k-- say,

00:08:54.640 --> 00:09:00.370
less than or equal to k--
is there a vertex cover?

00:09:00.370 --> 00:09:03.880
Remember a vertex cover
is a set of vertices

00:09:03.880 --> 00:09:05.115
that cover all the edges.

00:09:11.579 --> 00:09:15.490
And we want the size of S to
be less than or equal to k.

00:09:19.850 --> 00:09:20.350
OK.

00:09:20.350 --> 00:09:22.150
So for every
edge, we need to choose

00:09:22.150 --> 00:09:23.233
one of the two end points.

00:09:23.233 --> 00:09:25.970
We want the total number of
chosen vertices to be, at most,

00:09:25.970 --> 00:09:27.350
k.

00:09:27.350 --> 00:09:32.250
And so that's a regular
decision problem.

00:09:32.250 --> 00:09:34.020
But for parameterized
problem, we also

00:09:34.020 --> 00:09:36.230
want to define a
parameter function.

00:09:36.230 --> 00:09:38.250
And that parameter
function-- guess what?

00:09:38.250 --> 00:09:39.690
k.

00:09:39.690 --> 00:09:43.010
Most obvious thing, given that
I wrote the letter k here.

00:09:43.010 --> 00:09:44.501
That's going to
be our parameter.

00:09:44.501 --> 00:09:45.000
OK.

00:09:45.000 --> 00:09:48.340
And most problems, a lot
of problems, especially

00:09:48.340 --> 00:09:50.670
decision versions of
optimization problems--

00:09:50.670 --> 00:09:52.974
like before, we were
minimizing the vertex cover--

00:09:52.974 --> 00:09:54.390
this is the decision
version where

00:09:54.390 --> 00:09:56.870
we want to decide whether
there's one of size at most k.

00:09:56.870 --> 00:09:58.244
If you can solve
this, of course,

00:09:58.244 --> 00:10:01.420
you can binary search on k,
like you did in your quiz.

00:10:01.420 --> 00:10:02.910
Hopefully.
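
NOTE
A minimal sketch of the binary search on k mentioned here, assuming you
already have some yes-or-no procedure decide(k) for the decision version.
The names smallest_yes and decide are hypothetical, not from the lecture.
It relies on monotonicity: if a cover of size at most k exists, one of
size at most k + 1 exists too, and all n vertices always form a cover.
```python
def smallest_yes(n, decide):
    """Return the smallest k in [0, n] with decide(k) True.
    Assumes decide is monotone (False ... False True ... True) and decide(n) is True.
    """
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(mid):
            hi = mid  # a solution of size at most mid exists; try smaller
        else:
            lo = mid + 1  # nothing that small; the optimum is larger
    return lo
```
This makes about log n calls to decide, so a fast decision procedure
gives a fast optimization procedure.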

00:10:02.910 --> 00:10:06.030
So that's all good.

00:10:06.030 --> 00:10:10.100
And a lot of problems have
some non-negative integer

00:10:10.100 --> 00:10:11.340
floating around.

00:10:11.340 --> 00:10:13.420
And that's the, kind
of the, obvious choice

00:10:13.420 --> 00:10:14.170
for the parameter.

00:10:14.170 --> 00:10:15.720
Doesn't have to be the only one.

00:10:15.720 --> 00:10:19.450
But today, we're just going
to look at vertex cover

00:10:19.450 --> 00:10:21.315
with this parameterization.

00:10:21.315 --> 00:10:23.440
In your problem set, you'll
look at another problem

00:10:23.440 --> 00:10:26.260
with another natural parameter.

00:10:26.260 --> 00:10:30.250
This is usually called
the natural parameter.

00:10:30.250 --> 00:10:33.820
But there's no formal
definition of natural.

00:10:33.820 --> 00:10:34.790
That's just intuition.

00:10:39.430 --> 00:10:40.910
All right.

00:10:40.910 --> 00:10:43.224
So that's the set up.

00:10:43.224 --> 00:10:44.265
Let's do some algorithms.

00:10:47.540 --> 00:10:52.970
I guess the first note is
that k can actually be small.

00:10:52.970 --> 00:10:54.665
Nice example is a star graph.

00:10:59.490 --> 00:11:04.920
So you have the vertices, but
what's the smallest vertex

00:11:04.920 --> 00:11:07.500
cover?

00:11:07.500 --> 00:11:08.650
1.

00:11:08.650 --> 00:11:10.380
Everyone's holding
up one finger.

00:11:10.380 --> 00:11:12.960
You choose this guy--
that in the center,

00:11:12.960 --> 00:11:14.440
that covers all the edges.

00:11:14.440 --> 00:11:17.780
So it can be that k is
much smaller than v,

00:11:17.780 --> 00:11:20.540
and our goal here
is that we're going

00:11:20.540 --> 00:11:22.800
to get some
polynomial dependence

00:11:22.800 --> 00:11:24.900
in the size of the
graph, but we're

00:11:24.900 --> 00:11:28.000
going to get an exponential
dependence on k.

00:11:28.000 --> 00:11:29.610
Now there are many
different ways

00:11:29.610 --> 00:11:31.360
you could think of
exponential dependence,

00:11:31.360 --> 00:11:32.990
but let's start
with-- what would

00:11:32.990 --> 00:11:40.450
be the really obvious brute
force solution to vertex cover?

00:11:40.450 --> 00:11:41.030
OK.

00:11:41.030 --> 00:11:43.210
I want exact.

00:11:43.210 --> 00:11:46.470
I'm not going to be clever.

00:11:46.470 --> 00:11:51.752
What's the obvious
algorithm to solve this?

00:11:51.752 --> 00:11:52.252
Yeah.

00:11:52.252 --> 00:11:52.744
AUDIENCE: Try any
combination of k vertices,

00:11:52.744 --> 00:11:54.380
and see if it's a vertex cover.

00:11:54.380 --> 00:11:56.380
ERIK DEMAINE: Try any
combination of k vertices.

00:11:56.380 --> 00:11:57.505
See if it's a vertex cover.

00:11:57.505 --> 00:12:02.210
How many combinations
of k vertices are there?

00:12:02.210 --> 00:12:03.448
n choose k.

00:12:03.448 --> 00:12:05.790
Good.

00:12:05.790 --> 00:12:06.302
Let's see.

00:12:06.302 --> 00:12:07.510
I'm a little out of practice.

00:12:07.510 --> 00:12:08.900
It's been a while.

00:12:08.900 --> 00:12:10.210
Close.

00:12:10.210 --> 00:12:12.380
Off by one.

00:12:12.380 --> 00:12:17.550
So try all n choose k.

00:12:17.550 --> 00:12:28.830
I guess, v choose k,
subsets of k vertices.

00:12:28.830 --> 00:12:30.980
If I wanted to match
this definition exactly,

00:12:30.980 --> 00:12:34.250
I should try all subsets of less
than or equal to k vertices.

00:12:34.250 --> 00:12:38.060
But, hey, if I choose
fewer than k vertices,

00:12:38.060 --> 00:12:40.850
why not add in a few
extras until I get up to k.

00:12:40.850 --> 00:12:44.050
So it's enough to look
at v choose k subsets,

00:12:44.050 --> 00:12:46.160
because-- subsets
of size exactly

00:12:46.160 --> 00:12:50.010
k-- because that will end
up giving the same answer

00:12:50.010 --> 00:12:52.290
as this question.

00:12:52.290 --> 00:12:53.290
OK.

00:12:53.290 --> 00:13:01.370
So for each-- test each of
those choices for coverage.

00:13:01.370 --> 00:13:05.360
So that just means
we loop over--

00:13:05.360 --> 00:13:07.590
I guess for every
vertex in our set,

00:13:07.590 --> 00:13:10.549
we mark all of the
incident edges as covered.

00:13:10.549 --> 00:13:12.090
And then we go
through all the edges,

00:13:12.090 --> 00:13:14.310
and see whether every
one got marked covered.

00:13:14.310 --> 00:13:18.151
If not, we reset and
try the next subset.
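
NOTE
A minimal sketch of the brute-force check just described, assuming the
graph is given as a vertex count n and a list of edges (u, v) over
vertices 0..n-1. The name brute_force_vertex_cover is hypothetical, and
the coverage test uses a set lookup per edge rather than marking edges.
```python
from itertools import combinations
def brute_force_vertex_cover(n, edges, k):
    """Return True if some set of at most k vertices touches every edge."""
    k = min(k, n)  # a cover never needs more than all n vertices
    # Subsets of size exactly k suffice: a smaller cover can be padded up to k.
    for subset in combinations(range(n), k):
        chosen = set(subset)
        # Coverage test: every edge needs at least one chosen endpoint.
        if all(u in chosen or v in chosen for u, v in edges):
            return True
    return False
```
There are n choose k subsets and each test looks at every edge, so the
total work is roughly (n choose k) times the number of edges.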

00:13:18.151 --> 00:13:18.650
OK.

00:13:18.650 --> 00:13:21.270
This is like not smart
dynamic programming.

00:13:21.270 --> 00:13:23.150
You just guess
what the subset is.

00:13:23.150 --> 00:13:24.970
And see if it covers.

00:13:24.970 --> 00:13:27.560
This is how you would prove
that this problem is in NP.

00:13:27.560 --> 00:13:28.320
Right?

00:13:28.320 --> 00:13:31.535
But now we're actually making
it an exponential algorithm.

00:13:31.535 --> 00:13:33.410
So what's the running
time of this algorithm?

00:13:41.800 --> 00:13:42.300
Yeah.

00:13:42.300 --> 00:13:43.508
AUDIENCE: E times v to the k.

00:13:43.508 --> 00:13:45.336
ERIK DEMAINE: E
times v to the k.

00:13:45.336 --> 00:13:45.836
Good.

00:13:50.090 --> 00:13:50.855
Must be.

00:13:55.220 --> 00:13:57.130
So that's obviously exponential.

00:13:57.130 --> 00:13:59.750
In a certain sense, the
dependence on E and v

00:13:59.750 --> 00:14:03.900
is in the bottom, which is good.

00:14:03.900 --> 00:14:06.420
And the k is in the
exponent, which makes sense.

00:14:06.420 --> 00:14:08.550
So this is not surprising.

00:14:08.550 --> 00:14:11.220
We also don't think
of it as good.

00:14:11.220 --> 00:14:14.140
We've defined this to be bad.

00:14:14.140 --> 00:14:15.040
OK.

00:14:15.040 --> 00:14:18.790
In general, we think of
a running time like n

00:14:18.790 --> 00:14:23.340
to the f of k, where n is sort
of the overall problem size

00:14:23.340 --> 00:14:24.000
here.

00:14:24.000 --> 00:14:26.650
Here n is basically v plus E.
And that's the overall input

00:14:26.650 --> 00:14:28.380
size for a graph.

00:14:28.380 --> 00:14:32.360
If we have a running time
where the exponent of n

00:14:32.360 --> 00:14:35.476
depends on k, in
a nontrivial way,

00:14:35.476 --> 00:14:37.100
we think of that as
a bad running time.

00:14:37.100 --> 00:14:39.430
This is a slow algorithm.

00:14:39.430 --> 00:14:42.200
It's slow because
even when k equals 2--

00:14:42.200 --> 00:14:44.650
if you have a large
graph-- this is probably

00:14:44.650 --> 00:14:45.900
not something you want to run.

00:14:45.900 --> 00:14:48.790
Definitely when k is 10,
you're completely hosed.

00:14:48.790 --> 00:14:50.572
This is very impractical.

00:14:50.572 --> 00:14:52.530
And the formal sense in
which it is impractical

00:14:52.530 --> 00:14:55.710
is that the exponent
in n depends on k.

00:14:55.710 --> 00:15:02.740
In general, you cannot say--
so I'd like to-- I mean fixed

00:15:02.740 --> 00:15:03.250
parameter.

00:15:03.250 --> 00:15:05.083
The whole point is to
think of the parameter

00:15:05.083 --> 00:15:07.160
as being fixed, like a constant.

00:15:07.160 --> 00:15:07.660
OK.

00:15:07.660 --> 00:15:09.570
Now if the parameter is fixed.

00:15:09.570 --> 00:15:12.110
If you think of
it as at most 100,

00:15:12.110 --> 00:15:16.910
then, indeed, this will be at
most n to the 101 or something.

00:15:16.910 --> 00:15:20.580
So it is polynomial
for any fixed k.

00:15:20.580 --> 00:15:24.780
The catch is that the exponent
of the polynomial depends on k.

00:15:24.780 --> 00:15:28.000
As you increase k, as you
increase your bound on k,

00:15:28.000 --> 00:15:29.720
the exponent increases.

00:15:29.720 --> 00:15:33.530
I can't say this is an n squared
algorithm for any fixed k.

00:15:33.530 --> 00:15:34.030
OK.

00:15:34.030 --> 00:15:38.840
So exponent depends on k.

00:15:38.840 --> 00:15:40.610
That's the bad case.

00:15:40.610 --> 00:15:45.540
So the good case,
we're going to define,

00:15:45.540 --> 00:15:49.750
is that the exponent
doesn't depend on k.

00:15:49.750 --> 00:15:51.650
That may seem like
a small change.

00:15:51.650 --> 00:15:53.280
It is a small change.

00:15:53.280 --> 00:15:54.270
But it's a big one.

00:15:57.747 --> 00:15:59.330
It's a small change
with a big effect.

00:16:04.650 --> 00:16:22.360
So I'm going to define, let's
say, a parameterized problem is

00:16:22.360 --> 00:16:33.980
fixed parameter
tractable-- which,

00:16:33.980 --> 00:16:36.520
given how many letters that
is, we're going to abbreviate

00:16:36.520 --> 00:16:55.985
to FPT-- if it can be solved in
f of k times polynomial in n.

00:17:00.286 --> 00:17:00.790
OK.

00:17:00.790 --> 00:17:09.089
Because this means
that the exponent here

00:17:09.089 --> 00:17:10.420
doesn't depend on anything.

00:17:15.810 --> 00:17:22.150
The exponent of n
doesn't depend on k.

00:17:29.501 --> 00:17:30.000
OK.

00:17:30.000 --> 00:17:34.820
So for this definition--
just to be explicit--

00:17:34.820 --> 00:17:40.070
I want the constant here to
be independent-- of course,

00:17:40.070 --> 00:17:42.010
it should be
independent of n, and it

00:17:42.010 --> 00:17:45.340
should be independent of k.
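
NOTE
Written out, the two kinds of running time being contrasted look like
this, with n the input size, k the parameter, and f any function of k
alone. The labels under the braces are a paraphrase of the spoken
definition, not notation from the board.
```latex
\underbrace{n^{f(k)}}_{\text{exponent grows with } k \text{ (bad)}}
\qquad\text{versus}\qquad
\underbrace{f(k)\cdot n^{O(1)}}_{\text{exponent independent of } k \text{ (FPT)}}
```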

00:17:45.340 --> 00:17:49.270
This can be any function.

00:17:49.270 --> 00:17:52.260
It's presumably an
exponential function,

00:17:52.260 --> 00:17:54.310
because if this is
an NP-hard problem,

00:17:54.310 --> 00:17:56.060
something's got
to be exponential.

00:17:56.060 --> 00:17:58.030
This clearly is not exponential.

00:17:58.030 --> 00:17:59.800
So it's got to be here.

00:17:59.800 --> 00:18:02.670
So this is a sense in which
we're exponential in k,

00:18:02.670 --> 00:18:04.610
polynomial in n.

00:18:04.610 --> 00:18:08.490
But it's much better than
this kind of running time.

00:18:08.490 --> 00:18:08.990
OK.

00:18:08.990 --> 00:18:10.540
We can think about
what-- in the sense

00:18:10.540 --> 00:18:12.998
in which it is much better once
we have an actual algorithm

00:18:12.998 --> 00:18:13.950
of this type.

00:18:13.950 --> 00:18:17.430
So let's do-- let's
try to solve vertex

00:18:17.430 --> 00:18:20.880
cover in this kind of time.

00:18:20.880 --> 00:18:23.030
I claim vertex cover is
fixed parameter tractable.

00:18:23.030 --> 00:18:24.155
There is such an algorithm.

00:18:36.650 --> 00:18:39.370
And the algorithm is
going to look familiar.

00:18:39.370 --> 00:18:42.710
Very similar to the
2-approximation algorithm

00:18:42.710 --> 00:18:48.532
that we had last class
for vertex cover.

00:18:48.532 --> 00:18:50.740
So-- but I'm going to give
it a different name, which

00:18:50.740 --> 00:18:52.620
is bounded-search-tree.

00:19:05.860 --> 00:19:06.360
OK.

00:19:06.360 --> 00:19:09.970
This algorithm is also going to
feel like dynamic programming.

00:19:09.970 --> 00:19:11.960
Or we're going to use guessing.

00:19:11.960 --> 00:19:13.630
In general,
exponential algorithms

00:19:13.630 --> 00:19:14.610
are naturally about guessing.

00:19:14.610 --> 00:19:18.450
But here, when I guess, I have
to try all the possibilities.

00:19:18.450 --> 00:19:21.110
Here this was one way of
trying all the possibilities.

00:19:21.110 --> 00:19:23.836
We're going to be a little
bit more sophisticated in how

00:19:23.836 --> 00:19:25.960
we try all the possibilities
that actually exploits

00:19:25.960 --> 00:19:27.460
the properties of vertex cover.

00:19:30.230 --> 00:19:35.390
First line is just like the
2-approximation algorithm.

00:19:35.390 --> 00:19:39.320
Look at any edge in the graph.

00:19:44.050 --> 00:19:44.680
OK.

00:19:44.680 --> 00:19:46.150
Here it is.

00:19:46.150 --> 00:19:51.360
From u to v. What do I
know about that picture?

00:19:58.030 --> 00:19:59.400
Yeah.

00:19:59.400 --> 00:20:00.370
AUDIENCE: One of those vertices
has to be in the cover.

00:20:00.370 --> 00:20:01.230
ERIK DEMAINE: One
of those vertices

00:20:01.230 --> 00:20:02.250
has to be in the cover.

00:20:02.250 --> 00:20:07.610
Either u or v or both are in
S for that edge to be covered.

00:20:07.610 --> 00:20:10.700
Now for the 2-approximation,
we just put those both in.

00:20:10.700 --> 00:20:12.450
Here we can't afford
to do that because we

00:20:12.450 --> 00:20:15.260
want an exact solution.

00:20:15.260 --> 00:20:18.050
So we'll try both options.

00:20:18.050 --> 00:20:19.770
We don't know which one belongs.

00:20:19.770 --> 00:20:22.690
Let's guess.

00:20:22.690 --> 00:20:30.750
So we know either u is in S or
v is in S. Don't know which.

00:20:30.750 --> 00:20:31.295
So guess.

00:20:34.470 --> 00:20:36.910
Sorry-- I should, to be
clear, mention or both.

00:20:39.601 --> 00:20:41.100
So we're going to
guess, which means

00:20:41.100 --> 00:20:43.730
we need to try both options.

00:20:43.730 --> 00:20:46.620
We're going to try putting
u in, and then we're

00:20:46.620 --> 00:20:48.490
going to try putting v in.

00:20:48.490 --> 00:20:50.610
So let's just see what
happens when we try that.

00:20:50.610 --> 00:20:57.050
So in the first guess, we
say, let's put u in S. OK.

00:20:57.050 --> 00:21:00.800
Well, if we put u in
S, that means we cover

00:21:00.800 --> 00:21:03.390
all of the edges incident to u.

00:21:03.390 --> 00:21:06.487
So I'd like to use recursion.

00:21:06.487 --> 00:21:07.820
I'd like to simplify my problem.

00:21:07.820 --> 00:21:10.417
Get another vertex
cover instance.

00:21:10.417 --> 00:21:12.000
So in order to do
that, I'm just going

00:21:12.000 --> 00:21:14.450
to delete u and all
of its incident edges.

00:21:14.450 --> 00:21:16.700
We did a similar thing in
the approximation algorithm

00:21:16.700 --> 00:21:18.800
but for u and v simultaneously.

00:21:18.800 --> 00:21:24.105
So delete u and its incident edges.

00:21:27.230 --> 00:21:29.520
Now we have a vertex
cover instance.

00:21:29.520 --> 00:21:30.750
There's one other thing.

00:21:30.750 --> 00:21:32.230
There's a new graph we have.

00:21:32.230 --> 00:21:34.050
But we also need to update k.

00:21:34.050 --> 00:21:37.900
Because we just
used one of those--

00:21:37.900 --> 00:21:39.940
we just added something
to S, and then

00:21:39.940 --> 00:21:41.270
we deleted that from the graph.

00:21:41.270 --> 00:21:43.640
Which means, in our new
graph, effectively k

00:21:43.640 --> 00:21:45.541
has gone down by 1.

00:21:45.541 --> 00:21:46.040
OK.

00:21:46.040 --> 00:21:47.610
So I'll say decrement k.

00:21:51.430 --> 00:21:53.020
Now I have a new instance.

00:21:53.020 --> 00:21:57.050
I have a new graph and
a different value of k.

00:21:57.050 --> 00:21:59.665
Recurse this algorithm.

00:22:04.500 --> 00:22:07.990
I would say-- I'll call the new
graph G prime and the integer

00:22:07.990 --> 00:22:11.600
k prime. k prime
equals k minus 1.

00:22:11.600 --> 00:22:14.320
And then the second case
is do the same thing

00:22:14.320 --> 00:22:17.140
for v. I won't write the
code, exactly the same,

00:22:17.140 --> 00:22:19.180
but I delete v and
its incident edges.

00:22:19.180 --> 00:22:20.810
I still decrement k by 1.

00:22:20.810 --> 00:22:21.990
And I recurse.

00:22:21.990 --> 00:22:26.140
And then I just return the
or of these two answers.

00:22:26.140 --> 00:22:27.960
So if this one finds
a solution, great.

00:22:27.960 --> 00:22:29.910
I found a solution to
the overall problem.

00:22:29.910 --> 00:22:31.330
This one finds a
solution, great.

00:22:31.330 --> 00:22:33.535
Maybe both return yes.

00:22:33.535 --> 00:22:34.160
Doesn't matter.

00:22:34.160 --> 00:22:37.380
In general, I just
take the inclusive

00:22:37.380 --> 00:22:39.390
or of those two Boolean values.

00:22:39.390 --> 00:22:44.440
That gives me an overall yes-or-
no answer to k vertex cover.

00:22:44.440 --> 00:22:45.820
Cool?

00:22:45.820 --> 00:22:48.610
So next question is what
the running time is.

00:22:48.610 --> 00:22:51.640
But you can think of this
as a dynamic program.

00:22:51.640 --> 00:22:55.180
It's just, here we recurse,
and we don't bother memoizing.

00:22:55.180 --> 00:22:59.420
Because, in general, memoization
will never help us here.

00:22:59.420 --> 00:23:01.540
And you may have even
thought of algorithms

00:23:01.540 --> 00:23:04.174
like this in the dynamic
programming world.

00:23:04.174 --> 00:23:06.090
And we just say, well,
that's not good enough,

00:23:06.090 --> 00:23:09.010
because in dynamic programming
we want polynomial time.

00:23:09.010 --> 00:23:11.335
This is like a dynamic
program, but the running time

00:23:11.335 --> 00:23:13.410
is exponential.

00:23:13.410 --> 00:23:16.690
But it turns out it will be
fixed parameter tractable.

00:23:16.690 --> 00:23:17.570
That's the good news.

00:23:23.022 --> 00:23:24.480
Let's think about
the running time.

00:23:38.930 --> 00:23:41.920
So if I draw-- let's
draw a recursion tree.

00:23:41.920 --> 00:23:42.420
Right?

00:23:42.420 --> 00:23:47.600
This is a divide-and-conquer
algorithm in a very weak sense.

00:23:47.600 --> 00:23:59.730
We start up here with a problem
of size n and a parameter k.

00:23:59.730 --> 00:24:01.435
And we make two recursive calls.

00:24:04.390 --> 00:24:04.890
OK.

00:24:04.890 --> 00:24:07.880
We deleted a vertex
and maybe some edges.

00:24:07.880 --> 00:24:09.950
So let's say, we
have a new problem

00:24:09.950 --> 00:24:12.240
of size something
like n minus 1.

00:24:12.240 --> 00:24:15.680
But what really saves us
is that k went down by 1.

00:24:15.680 --> 00:24:18.090
And we have two recursive calls.

00:24:18.090 --> 00:24:20.816
Each of them k is 1 smaller.

00:24:20.816 --> 00:24:21.530
OK.

00:24:21.530 --> 00:24:24.350
And then each of those
has two recursive calls.

00:24:24.350 --> 00:24:26.230
I don't really know
what happens to n.

00:24:26.230 --> 00:24:28.460
It probably doesn't
get that much smaller,

00:24:28.460 --> 00:24:30.940
but k goes down by another 1.

00:24:30.940 --> 00:24:31.730
OK.

00:24:31.730 --> 00:24:34.590
So I'm writing here the
size of the problems

00:24:34.590 --> 00:24:37.140
and the parameters
of the problems.

00:24:37.140 --> 00:24:39.150
n minus 2.

00:24:39.150 --> 00:24:41.760
k minus 2.

00:24:41.760 --> 00:24:43.720
OK.

00:24:43.720 --> 00:24:47.550
How much time do I spend
in each of these nodes?

00:24:47.550 --> 00:24:50.932
How much work am I
doing-- non-recursive work

00:24:50.932 --> 00:24:52.140
am I doing in this algorithm?

00:24:57.965 --> 00:24:58.465
Yeah.

00:24:58.465 --> 00:25:01.215
AUDIENCE: O of E,
right? [INAUDIBLE].

00:25:01.215 --> 00:25:02.340
ERIK DEMAINE: O of E. Yeah.

00:25:02.340 --> 00:25:04.570
Certainly at most
order E. Probably

00:25:04.570 --> 00:25:09.060
at most order v, because
there's only at most v incident

00:25:09.060 --> 00:25:10.880
edges to each vertex.

00:25:10.880 --> 00:25:11.610
Yeah?

00:25:11.610 --> 00:25:12.370
Linear time.

00:25:12.370 --> 00:25:17.260
Doesn't really matter
how careful we are here,

00:25:17.260 --> 00:25:22.030
but I will say-- each of
these nodes-- we spend,

00:25:22.030 --> 00:25:25.590
at most, let's
say, order v time.

00:25:25.590 --> 00:25:26.590
OK.

00:25:26.590 --> 00:25:29.249
It happened that
v went down by 1.

00:25:29.249 --> 00:25:30.540
As you can see at these levels.

00:25:30.540 --> 00:25:33.500
But certainly an upper
bound is the original v.

00:25:33.500 --> 00:25:40.010
In each of these nodes, we spend
at most the original v. When

00:25:40.010 --> 00:25:41.106
does this recursion stop?

00:25:41.106 --> 00:25:42.230
I didn't write a base case.

00:25:42.230 --> 00:25:43.000
Help me out.

00:25:43.000 --> 00:25:45.650
What's a good base case
for this algorithm?

00:25:49.770 --> 00:25:50.270
Yeah.

00:25:50.270 --> 00:25:52.603
AUDIENCE: When k equals 0,
check if there are any edges.

00:25:55.507 --> 00:25:57.840
ERIK DEMAINE: When k equals
0, check if there are any edges.

00:25:57.840 --> 00:26:01.220
When k equals 0, I can't put
anything into my vertex cover.

00:26:01.220 --> 00:26:03.790
So if there are any edges, they're
not going to be covered.

00:26:03.790 --> 00:26:05.110
That's bad news.

00:26:05.110 --> 00:26:05.610
OK.

00:26:05.610 --> 00:26:17.010
So over here we have base case,
k equals 0, check-- or let's

00:26:17.010 --> 00:26:28.380
say, return whether
size of E is not 0.

00:26:28.380 --> 00:26:30.660
If it's not 0--
sorry-- whether it

00:26:30.660 --> 00:26:34.360
equals 0-- get it
right-- if it equals 0,

00:26:34.360 --> 00:26:35.340
then the answer is yes.

00:26:35.340 --> 00:26:36.298
There's a vertex cover.

00:26:36.298 --> 00:26:39.320
I can cover all of those
0 edges using 0 vertices.

00:26:39.320 --> 00:26:40.350
That's good.

00:26:40.350 --> 00:26:41.850
But when E does not
equal 0, there's

00:26:41.850 --> 00:26:44.880
no way I can cover that
non-zero number of edges using

00:26:44.880 --> 00:26:47.990
0 vertices in my vertex cover.

00:26:47.990 --> 00:26:48.490
OK.

00:26:48.490 --> 00:26:50.280
So that's the base
case, which means

00:26:50.280 --> 00:26:54.470
this recursion keeps going
until we get down to k equals 0.

00:26:54.470 --> 00:26:56.220
We start at k.

00:26:56.220 --> 00:26:57.700
We end up with 0.

00:26:57.700 --> 00:27:00.300
So the number of
levels here is k.

00:27:00.300 --> 00:27:00.800
OK.

00:27:00.800 --> 00:27:06.780
The height of this tree--
this recursion tree is k.

00:27:06.780 --> 00:27:10.890
So how many nodes are
there in this tree?

00:27:10.890 --> 00:27:11.650
2 to the k.

00:27:17.900 --> 00:27:21.970
So total running time
is v times 2 to the k.

00:27:21.970 --> 00:27:24.370
I guess I should write
2 to the k times v. Hey,

00:27:24.370 --> 00:27:25.910
that is exactly what I wanted.

00:27:25.910 --> 00:27:28.380
I got a function of
k-- namely 2 to the k.

00:27:28.380 --> 00:27:30.050
Exponential-- that makes sense.

00:27:30.050 --> 00:27:33.140
And I got a polynomial in n.

00:27:33.140 --> 00:27:35.980
Here it's n.

00:27:35.980 --> 00:27:37.930
The exponent is 1.

00:27:37.930 --> 00:27:42.620
v is at most n. n
is v plus E. Wow.

00:27:42.620 --> 00:27:44.170
Big improvement.

00:27:44.170 --> 00:27:48.930
This seems equally simple
of an algorithm as this one,

00:27:48.930 --> 00:27:51.721
but actually it
runs a lot faster.
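
NOTE
A minimal Python sketch of the bounded-search-tree idea described above.
This is an illustration, not the lecturer's code: the name vertex_cover_bst
and the edge-list representation are assumptions. Filtering the edge list
costs O(E) per node here rather than the O(V) bound quoted in the lecture.
```python
def vertex_cover_bst(edges, k):
    """Return True if the graph given by `edges` has a vertex cover of size at most k."""
    # Base case: no edges left, so nothing remains to be covered.
    if not edges:
        return True
    # Base case: edges remain but the budget is used up.
    if k == 0:
        return False
    # Pick any edge (u, v); at least one endpoint must be in the cover.
    u, v = edges[0]
    # Branch 1: put u in the cover and delete u with its incident edges.
    without_u = [e for e in edges if u not in e]
    # Branch 2: put v in the cover and delete v with its incident edges.
    without_v = [e for e in edges if v not in e]
    # Two recursive calls per node and depth at most k: at most 2^k leaves.
    return vertex_cover_bst(without_u, k - 1) or vertex_cover_bst(without_v, k - 1)
```
Usage: a triangle [(1, 2), (2, 3), (1, 3)] has no cover of size 1 but does
have one of size 2, so the first call returns False and the second True.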

00:27:51.721 --> 00:27:52.220
OK.

00:27:52.220 --> 00:27:53.636
Let me give you a
feeling-- I mean

00:27:53.636 --> 00:27:56.520
this is what we would call a
linear time algorithm for fixed

00:27:56.520 --> 00:27:57.130
k.

00:27:57.130 --> 00:27:59.600
The exponent here
doesn't depend on k.

00:27:59.600 --> 00:28:01.540
If k is 10, it's a
linear time algorithm.

00:28:01.540 --> 00:28:04.180
If k is 100, it's a
linear time algorithm.

00:28:04.180 --> 00:28:06.010
If k is 100, that
might be a little bit

00:28:06.010 --> 00:28:07.510
beyond what we can run.

00:28:07.510 --> 00:28:11.380
But you know, k
equals 32, 40 maybe,

00:28:11.380 --> 00:28:13.480
that would probably be
reasonable running time,

00:28:13.480 --> 00:28:14.220
in practice.

00:28:14.220 --> 00:28:14.720
OK.

00:28:14.720 --> 00:28:17.940
That's a lot better than before
where like k equals 2 or 3.

00:28:17.940 --> 00:28:19.710
This is probably unreasonable.

00:28:19.710 --> 00:28:23.070
v is like a billion,
say, big graph.

00:28:23.070 --> 00:28:25.120
Also from a theoretical
perspective,

00:28:25.120 --> 00:28:28.600
this works even up
to k equals log n.

00:28:28.600 --> 00:28:31.440
If k equals log n,
this'll be n squared.

00:28:31.440 --> 00:28:32.630
That's nice.

00:28:32.630 --> 00:28:34.220
k equals 2 log n, it's n cubed.

00:28:34.220 --> 00:28:34.720
OK.

00:28:34.720 --> 00:28:35.920
So it grows.

00:28:35.920 --> 00:28:38.974
But we can handle k
equals order log n.

00:28:38.974 --> 00:28:40.390
And this will still
be polynomial.

00:28:40.390 --> 00:28:42.223
In general, with fixed
parameter algorithms,

00:28:42.223 --> 00:28:44.120
it's not always going
to be up to log n,

00:28:44.120 --> 00:28:47.160
it's going to be up to whatever
the inverse of this f of k is.

00:28:47.160 --> 00:28:51.250
That's where we can
still be polynomial.

00:28:51.250 --> 00:28:52.680
So that's nice.

00:28:52.680 --> 00:28:55.960
I consider this a
good running time.

00:28:55.960 --> 00:29:00.040
Good in the sense that it
follows that definition

00:29:00.040 --> 00:29:01.750
of fixed parameter tractable.

00:29:01.750 --> 00:29:03.620
So bounded-search-tree
algorithm is good.

00:29:03.620 --> 00:29:05.650
Brute force algorithm is bad.

00:29:05.650 --> 00:29:07.049
In this case.

00:29:07.049 --> 00:29:08.840
Bounded-search-tree is
a general technique.

00:29:08.840 --> 00:29:11.340
You can use it for
lots of problems.

00:29:11.340 --> 00:29:13.086
We're going to see
another technique

00:29:13.086 --> 00:29:14.210
today called kernelization.

00:29:17.018 --> 00:29:20.960
But-- Let's see--
before I get there,

00:29:20.960 --> 00:29:24.356
I want to question
this definition.

00:29:24.356 --> 00:29:25.480
So this definition is nice.

00:29:25.480 --> 00:29:28.400
It's natural in the
sense that it gives you--

00:29:28.400 --> 00:29:31.743
it distinguishes between the
exponent of n depending on k

00:29:31.743 --> 00:29:35.560
and not depending on k, which
is a natural thing to do.

00:29:35.560 --> 00:29:40.129
But there's another natural
definition of fixed parameter

00:29:40.129 --> 00:29:40.670
tractability.

00:29:43.190 --> 00:29:50.476
So let's-- vertex cover-- I
think you remember the problem

00:29:50.476 --> 00:29:50.975
by now.

00:30:02.010 --> 00:30:04.420
So let's see-- we have
this definition, which

00:30:04.420 --> 00:30:08.706
is f of k times polynomial n.

00:30:08.706 --> 00:30:11.080
But I would say that the first
time I saw fixed parameter

00:30:11.080 --> 00:30:13.910
tractability, I thought, well,
why do you define it that way?

00:30:13.910 --> 00:30:18.280
I mean, maybe it would be better
to do f of k plus polynomial n.

00:30:22.454 --> 00:30:23.620
That would be better, right?

00:30:23.620 --> 00:30:26.480
That would be
faster, seems like.

00:30:26.480 --> 00:30:30.630
So I mean, this is nice in
that we achieved this bound,

00:30:30.630 --> 00:30:33.630
but could we hope for
this even better bound?

00:30:33.630 --> 00:30:34.130
OK?

00:30:34.130 --> 00:30:38.080
It turns out these
notions are identical.

00:30:38.080 --> 00:30:39.067
This is weird.

00:30:39.067 --> 00:30:40.150
The first time you see it.

00:30:40.150 --> 00:30:50.760
So theorem-- you can solve a
problem in this kind of time,

00:30:50.760 --> 00:30:56.680
if and only if you can solve the
problem in this kind of time.

00:30:56.680 --> 00:30:59.400
So of course, f is
going to change.

00:30:59.400 --> 00:31:03.050
And why don't I label
these constants.

00:31:03.050 --> 00:31:07.210
So we have c up here and
some c prime up here.

00:31:07.210 --> 00:31:09.600
But you can solve a problem
in this multiplicative time,

00:31:09.600 --> 00:31:11.920
if and only if you can
solve it in an additive time

00:31:11.920 --> 00:31:15.230
with a different function
and a different constant.

00:31:15.230 --> 00:31:17.262
This is actually
really easy to prove.

00:31:17.262 --> 00:31:21.540
The longer you think about it,
the more obvious it will be.

00:31:21.540 --> 00:31:26.310
If you have an instance of
size n with parameter k,

00:31:26.310 --> 00:31:29.370
there are two cases.

00:31:29.370 --> 00:31:34.340
Either n is less than
or equal to f of k,

00:31:34.340 --> 00:31:37.721
or n is greater than
or equal to f of k.

00:31:37.721 --> 00:31:38.220
Right?

00:31:38.220 --> 00:31:41.420
It's got to be one
of those, maybe both.

00:31:41.420 --> 00:31:44.180
If n is less than or
equal to f of k that

00:31:44.180 --> 00:31:51.112
means that this running time,
f of k times n to the c--

00:31:51.112 --> 00:31:53.030
let's see-- n is at most f of k.

00:31:53.030 --> 00:31:58.380
So this is at most f of
k to the c plus 1 power.

00:31:58.380 --> 00:31:58.880
Right?

00:31:58.880 --> 00:32:03.550
I multiply f of k to
the c times f of k.

00:32:03.550 --> 00:32:06.280
When n is greater
than f of k, then

00:32:06.280 --> 00:32:09.665
I know that this running time,
f of k times n to the c, well,

00:32:09.665 --> 00:32:11.370
now I know an upper
bound of f of k.

00:32:11.370 --> 00:32:14.010
I know this thing is at most n.

00:32:14.010 --> 00:32:17.540
And so this is at most
n to the c plus 1.

00:32:17.540 --> 00:32:18.040
OK.

00:32:18.040 --> 00:32:19.880
So really I have two
scenarios, either I'm

00:32:19.880 --> 00:32:22.550
bounded by some
pure function of k,

00:32:22.550 --> 00:32:25.830
or I'm bounded by some
pure polynomial in n.

00:32:25.830 --> 00:32:31.600
Which means, in both cases,
the running time f of k times

00:32:31.600 --> 00:32:35.650
n to the c is bounded
above by the max

00:32:35.650 --> 00:32:42.580
of those two things, max of
f of k to the c plus 1, n

00:32:42.580 --> 00:32:44.723
to the c plus 1.

00:32:44.723 --> 00:32:45.530
OK.

00:32:45.530 --> 00:32:47.370
And the max is always,
at most, the sum.

00:32:47.370 --> 00:32:50.510
I'm assuming everything
here is non-negative.

00:32:50.510 --> 00:32:56.030
So I take f of k to the c plus
1, plus n to the c plus 1.

00:32:56.030 --> 00:32:57.090
Boom.

00:32:57.090 --> 00:33:02.890
That is an additive function
of k plus polynomial in n.
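
NOTE
The case analysis above, written out as a chain of inequalities. This is a
sketch that assumes f(k) >= 1 and n >= 1; the symbols f, c, n, k are the
ones used in the lecture.
```latex
\begin{align*}
n \le f(k) &\implies f(k)\,n^{c} \le f(k)\cdot f(k)^{c} = f(k)^{c+1},\\
n \ge f(k) &\implies f(k)\,n^{c} \le n\cdot n^{c} = n^{c+1},\\
\text{so in both cases}\quad f(k)\,n^{c} &\le \max\bigl(f(k)^{c+1},\, n^{c+1}\bigr) \le f(k)^{c+1} + n^{c+1}.
\end{align*}
```
With f(k) = 2^k and c = 1 this reads n * 2^k <= 4^k + n^2.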

00:33:02.890 --> 00:33:03.390
OK.

00:33:03.390 --> 00:33:05.250
Rather trivial.

00:33:05.250 --> 00:33:07.130
This is a funny area
where you think,

00:33:07.130 --> 00:33:08.327
ah, this is a deep question.

00:33:08.327 --> 00:33:09.410
Are these things the same?

00:33:09.410 --> 00:33:13.830
And it ends up, yeah, they're
the same for obvious reasons.

00:33:13.830 --> 00:33:18.470
So for example, we have this
linear, basically n times 2

00:33:18.470 --> 00:33:20.390
to the k algorithm.

00:33:20.390 --> 00:33:23.990
If you apply this
argument, you get

00:33:23.990 --> 00:33:29.780
this is, at most, so this
n times 2 to the k bound,

00:33:29.780 --> 00:33:35.430
is, at most, n squared
plus 4 to the k.

00:33:35.430 --> 00:33:35.930
OK.

00:33:35.930 --> 00:33:39.810
I'm basically just
squaring both of the terms.

00:33:39.810 --> 00:33:40.310
OK.

00:33:40.310 --> 00:33:42.830
Probably you prefer
this time bound,

00:33:42.830 --> 00:33:45.860
but if you really like
an additive time bound,

00:33:45.860 --> 00:33:49.282
the exact same algorithm
satisfies this.

00:33:49.282 --> 00:33:51.130
OK.

00:33:51.130 --> 00:33:53.390
So not that exciting.

00:33:53.390 --> 00:33:55.695
And in practice, n squared--
it looks like a bad thing,

00:33:55.695 --> 00:33:57.820
so you'd probably prefer
this kind of running time.

00:33:57.820 --> 00:34:00.510
But there is a sense-- there's
a quadratic thing going

00:34:00.510 --> 00:34:01.650
on here in that we
have an n and then

00:34:01.650 --> 00:34:03.066
we have a function
of k multiplied

00:34:03.066 --> 00:34:05.420
together-- OK-- whatever.

00:34:05.420 --> 00:34:05.920
All right.

00:34:05.920 --> 00:34:07.740
So this justifies
the definition.

00:34:07.740 --> 00:34:11.030
This is kind of robust to
whether I put a dot here

00:34:11.030 --> 00:34:13.642
or plus, so clearly this
is the right definition.

00:34:13.642 --> 00:34:14.600
We're going to use dot.

00:34:14.600 --> 00:34:19.280
You could also use
plus, but-- all right.

00:34:19.280 --> 00:34:22.460
But there's another thing
called kernelization,

00:34:22.460 --> 00:34:27.770
which, in an intuitive sense,
matches this idea of plus.

00:34:27.770 --> 00:34:31.699
And it also matches
an idea that's

00:34:31.699 --> 00:34:34.194
common in practice
called pre-processing.

00:34:34.194 --> 00:34:38.820
If I have a giant graph,
and I'm given some number k,

00:34:38.820 --> 00:34:40.600
and I want to find
a vertex cover,

00:34:40.600 --> 00:34:44.118
well, maybe the first thing I
could do is simplify my graph.

00:34:44.118 --> 00:34:46.409
Maybe there's some parts that
are really easy to solve.

00:34:46.409 --> 00:34:49.590
I should throw those away first.

00:34:49.590 --> 00:34:51.314
And that will make
my problem smaller.

00:34:51.314 --> 00:34:53.480
So if I'm going to have an
exponential running time,

00:34:53.480 --> 00:34:56.239
presumably, I want to first make
the problem as small as I can.

00:34:56.239 --> 00:34:59.570
Then deal with one
of these algorithms.

00:34:59.570 --> 00:35:00.070
OK.

00:35:00.070 --> 00:35:01.270
So we're going to do that.

00:35:09.310 --> 00:35:11.620
First, I'm going to tell
you about it generically.

00:35:16.017 --> 00:35:17.600
And then we'll do
it for vertex cover.

00:35:24.570 --> 00:35:28.160
So first, let me
give you a definition

00:35:28.160 --> 00:35:30.622
of what we'd like out of this
pre-processing procedure.

00:35:30.622 --> 00:35:32.705
It's going to be called a
kernelization procedure.

00:35:36.720 --> 00:35:40.565
Kernelization algorithm is
a polynomial time algorithm.

00:35:43.460 --> 00:35:45.590
Head back to
polynomial time land.

00:35:48.470 --> 00:35:51.580
You can think of
it as a reduction,

00:35:51.580 --> 00:35:54.620
but with NP-hardness, we
reduced from one problem A

00:35:54.620 --> 00:35:55.799
to another problem B.

00:35:55.799 --> 00:35:57.590
Here we're going to
reduce from the problem

00:35:57.590 --> 00:35:59.370
A to the same problem A.

00:35:59.370 --> 00:36:01.560
It's a self reduction,
if you will.

00:36:01.560 --> 00:36:05.600
But the input to the problem
is going to get smaller.

00:36:05.600 --> 00:36:09.281
So we're going to
convert an input.

00:36:09.281 --> 00:36:10.905
So this is for a
parameterized problem.

00:36:10.905 --> 00:36:12.960
So an input consists
of some regular input

00:36:12.960 --> 00:36:16.440
x and a parameter k.

00:36:16.440 --> 00:36:31.020
And we want to convert it into
an equivalent small input x

00:36:31.020 --> 00:36:37.181
prime k prime to
the same problem.

00:36:37.181 --> 00:36:37.680
OK.

00:36:37.680 --> 00:36:39.690
The problem is fixed,
say vertex cover.

00:36:39.690 --> 00:36:41.340
So we're given an
arbitrary input.

00:36:41.340 --> 00:36:43.770
This would be a
graph and a number k.

00:36:43.770 --> 00:36:48.890
And we want to convert it into
an equivalent small input,

00:36:48.890 --> 00:36:52.790
which is another graph G prime,
and another parameter k prime.

00:36:52.790 --> 00:36:56.470
So equivalent means that the
answer is going to be the same.

00:36:56.470 --> 00:36:56.970
OK.

00:36:56.970 --> 00:37:01.110
And I want the answer
to the problem--

00:37:01.110 --> 00:37:07.570
let's say, answer of x comma
k to be equal to the answer

00:37:07.570 --> 00:37:10.770
to x prime and k prime.

00:37:10.770 --> 00:37:15.285
Again, same problem,
but different input.

00:37:15.285 --> 00:37:16.910
I'm trying to be a
little generic here.

00:37:16.910 --> 00:37:18.986
It could be-- we're
going to think here

00:37:18.986 --> 00:37:20.860
about decision problems,
but this makes sense

00:37:20.860 --> 00:37:22.160
even for non-decision problems.

00:37:22.160 --> 00:37:24.243
Whatever the answer is
here, it should be the same

00:37:24.243 --> 00:37:26.851
as the answer is here, because
I want an exact solution.

00:37:26.851 --> 00:37:28.350
I want to solve
exactly the problem.

00:37:28.350 --> 00:37:31.520
I want to compute this
answer exactly correctly.

00:37:31.520 --> 00:37:33.820
So if I can reduce
it to some x prime k

00:37:33.820 --> 00:37:36.070
prime with the same
answer, well, now I

00:37:36.070 --> 00:37:38.150
can just solve x prime k prime.

00:37:38.150 --> 00:37:39.090
So that's good.

00:37:39.090 --> 00:37:41.086
Now what does small mean?

00:37:41.086 --> 00:37:43.130
We need to define both of these.

00:37:43.130 --> 00:37:47.790
Small means that
the size of x prime,

00:37:47.790 --> 00:37:51.650
which you might call
n prime, should be,

00:37:51.650 --> 00:37:53.950
at most, some function of k.

00:38:00.420 --> 00:38:01.030
Cool.

00:38:01.030 --> 00:38:03.840
So this is interesting.

00:38:03.840 --> 00:38:09.280
So we started with probably
a giant problem, x of size n,

00:38:09.280 --> 00:38:13.850
and we have a parameter k, which
we presume is relatively small.

00:38:13.850 --> 00:38:16.260
And we convert it
into a new input x

00:38:16.260 --> 00:38:18.210
prime that's very small.

00:38:18.210 --> 00:38:21.220
Its size is a function of k.

00:38:21.220 --> 00:38:24.561
No more dependence on n. n has
disappeared from the problem.

00:38:24.561 --> 00:38:25.060
OK.

00:38:25.060 --> 00:38:26.530
We start with
something of size n.

00:38:26.530 --> 00:38:28.984
We produced something whose
size is a function of k.

00:38:28.984 --> 00:38:30.900
And then-- OK-- there's
some other parameter k

00:38:30.900 --> 00:38:33.210
prime-- doesn't matter
much what it is.

00:38:33.210 --> 00:38:35.955
It's going to also
be a function of k.

00:38:35.955 --> 00:38:37.580
So we started with
something of size n.

00:38:37.580 --> 00:38:39.090
We produced something of size k.

00:38:39.090 --> 00:38:42.150
In polynomial time.

00:38:42.150 --> 00:38:42.650
Wow.

00:38:42.650 --> 00:38:45.340
This would be big
if we could do it.

00:38:45.340 --> 00:38:49.070
Because we start with
giant problem small k,

00:38:49.070 --> 00:38:51.320
and we kernelize
it down-- so here's

00:38:51.320 --> 00:38:53.730
the picture with this big thing.

00:38:53.730 --> 00:38:57.010
And the intuition is that
the hardness of the problem

00:38:57.010 --> 00:39:00.040
is just from this
thing of size k.

00:39:00.040 --> 00:39:01.480
But there's no thing of size k.

00:39:01.480 --> 00:39:03.290
Or at least we haven't
found it yet, right?

00:39:03.290 --> 00:39:06.190
The k is the size of the
vertex cover we're looking for.

00:39:06.190 --> 00:39:07.600
But we don't know where that is.

00:39:07.600 --> 00:39:09.700
It's hiding somewhere
in this instance,

00:39:09.700 --> 00:39:11.520
in this amorphous blob.

00:39:11.520 --> 00:39:12.520
So it's k.

00:39:12.520 --> 00:39:17.640
But somehow, magically,
this kernelization procedure

00:39:17.640 --> 00:39:21.160
produces a new problem that's
only a little bit bigger

00:39:21.160 --> 00:39:22.410
than k.

00:39:22.410 --> 00:39:22.910
OK.

00:39:22.910 --> 00:39:25.047
Some function of k.

00:39:25.047 --> 00:39:26.630
So we take the big
problem, we make it

00:39:26.630 --> 00:39:28.270
down to this small thing.

00:39:28.270 --> 00:39:29.330
What do you do now?

00:39:29.330 --> 00:39:31.300
You can run any
algorithm you want,

00:39:31.300 --> 00:39:36.230
any finite algorithm
applied to this instance

00:39:36.230 --> 00:39:39.090
will run in some
function of k time.

00:39:39.090 --> 00:39:41.740
Doesn't matter as long as
this is a correct algorithm.

00:39:41.740 --> 00:39:45.990
If your problem is in NP,
there is an exponential time

00:39:45.990 --> 00:39:46.490
algorithm.

00:39:46.490 --> 00:39:48.380
You just try all the guesses.

00:39:48.380 --> 00:39:50.260
So we could use-- we
have two of them here.

00:39:50.260 --> 00:39:54.290
We could run either of these
after we've kernelized.

00:39:54.290 --> 00:39:56.640
And we would get an FPT time.

00:39:56.640 --> 00:40:01.500
And, indeed, the FPT would
mimic this kind of running time.

00:40:01.500 --> 00:40:04.470
We do a polynomial
amount of pre-processing.

00:40:04.470 --> 00:40:06.500
That's the
kernelization procedure.

00:40:06.500 --> 00:40:08.220
That's the only dependence on n.

00:40:08.220 --> 00:40:10.500
After we've done
that, the new problem

00:40:10.500 --> 00:40:12.102
is entirely a function of k.

00:40:12.102 --> 00:40:13.810
And then you apply
any algorithm to that,

00:40:13.810 --> 00:40:15.790
you'll get f of k running time.

00:40:15.790 --> 00:40:17.470
Now if we want a good
f of k, we should

00:40:17.470 --> 00:40:19.580
use the best algorithm we have.

00:40:19.580 --> 00:40:23.852
But, in general, you
could use anything.

00:40:23.852 --> 00:40:25.170
All right.

00:40:25.170 --> 00:40:26.080
So far, so good.

00:40:29.320 --> 00:40:34.340
So we had this theorem that
product is the same as plus.

00:40:34.340 --> 00:40:38.550
In fact, kernelization
is the same thing.

00:40:38.550 --> 00:40:42.430
So these are all the same thing.

00:40:42.430 --> 00:40:45.790
I guess this one is
equivalent to being FPT.

00:40:45.790 --> 00:40:47.860
That's the definition of FPT.

00:40:47.860 --> 00:40:50.180
So I'll just write it here.

00:40:55.080 --> 00:40:59.820
The problem is FPT, if and
only if it has a kernelization.

00:41:05.590 --> 00:41:07.340
This is crazy.

00:41:07.340 --> 00:41:10.059
I keep introducing stronger
and stronger notions of good.

00:41:10.059 --> 00:41:11.600
And they all turn
out to be the same.

00:41:11.600 --> 00:41:13.540
That, again, gives you
a sense of robustness

00:41:13.540 --> 00:41:14.940
of this definition.

00:41:14.940 --> 00:41:17.260
And why this is a
natural thing to study.

00:41:17.260 --> 00:41:19.440
So this sounds crazy.

00:41:19.440 --> 00:41:21.690
How could I put all
of the easy work

00:41:21.690 --> 00:41:24.230
at the beginning in this
polynomial time algorithm

00:41:24.230 --> 00:41:26.210
and then in the end
produce something

00:41:26.210 --> 00:41:27.700
that is a reasonable size?

00:41:27.700 --> 00:41:30.621
Again, this proof is
going to be trivial.

00:41:30.621 --> 00:41:32.370
I think everything in
this field is either

00:41:32.370 --> 00:41:34.670
really hard or trivial.

00:41:34.670 --> 00:41:37.290
I guess that makes sense.

00:41:37.290 --> 00:41:41.160
So let's first-- so I'm just
looking at this inequality--

00:41:41.160 --> 00:41:44.930
this implication-- we
already did the other one.

00:41:44.930 --> 00:41:48.360
The easy direction, of
course, is this way.

00:41:48.360 --> 00:41:50.030
I didn't even do
it in this case.

00:41:50.030 --> 00:41:51.530
If I have an additive
running time,

00:41:51.530 --> 00:41:52.988
it's certainly, at
most, the product,

00:41:52.988 --> 00:41:56.490
assuming both those
numbers are at least 1.

00:41:56.490 --> 00:41:57.050
OK.

00:41:57.050 --> 00:42:00.250
And again, if I have a
kernelization, as I said,

00:42:00.250 --> 00:42:03.260
I could run the kernelization
algorithm in polynomial time,

00:42:03.260 --> 00:42:06.740
and then run any finite
algorithm to solve the problem,

00:42:06.740 --> 00:42:09.010
and I would get some
f of k running time.

00:42:09.010 --> 00:42:10.780
So this is easy.

00:42:15.500 --> 00:42:25.650
Kernelize and then run any
algorithm on the kernel.

00:42:25.650 --> 00:42:30.920
Kernel is the produced
output x prime k prime.

00:42:30.920 --> 00:42:31.420
OK.

00:42:31.420 --> 00:42:33.874
Let's do the other direction.

00:42:33.874 --> 00:42:35.040
That's the interesting part.

00:42:35.040 --> 00:42:37.331
So suppose I have an
algorithm that runs, let's say,

00:42:37.331 --> 00:42:38.950
in this running time.

00:42:38.950 --> 00:42:43.010
I claim that I can
turn it into a kernel.

00:42:43.010 --> 00:42:45.711
And the proof is going
to look just like before.

00:42:45.711 --> 00:42:46.210
OK.

00:42:46.210 --> 00:42:48.020
So there are two cases.

00:42:48.020 --> 00:42:51.650
One is that f of k, let's say,
is less than or equal to n.

00:42:54.390 --> 00:42:56.242
Actually I want to do
the other case first.

00:42:56.242 --> 00:42:57.700
I think it's a
little more natural.

00:42:57.700 --> 00:43:00.230
Well, it doesn't really matter.

00:43:00.230 --> 00:43:03.380
They're both easy but
for different reasons.

00:43:03.380 --> 00:43:06.280
The then parts are
going to look different.

00:43:06.280 --> 00:43:08.470
So the first case,
if n is at most

00:43:08.470 --> 00:43:11.100
f of k, what do I do in that
situation, in other words,

00:43:11.100 --> 00:43:13.510
kernelize of this
thing of size n,

00:43:13.510 --> 00:43:17.280
I want to kernelize into
something of size f of k?

00:43:17.280 --> 00:43:17.940
Nothing.

00:43:17.940 --> 00:43:19.200
I'm done already.

00:43:19.200 --> 00:43:21.495
So this is the already
kernelized case.

00:43:27.059 --> 00:43:27.600
That's great.

00:43:31.210 --> 00:43:33.780
So the other case is,
n is big.

00:43:33.780 --> 00:43:37.080
That's the more interesting
case, of course.

00:43:37.080 --> 00:43:39.820
n is greater than
or equal to f of k.

00:43:43.520 --> 00:43:44.580
What happens here?

00:43:44.580 --> 00:43:49.090
Well, just like last time, that
means that this running time, f

00:43:49.090 --> 00:43:51.430
of k is now at most n, that
means this running time

00:43:51.430 --> 00:43:54.560
is at most n to the c plus one.

00:43:54.560 --> 00:43:55.060
Right?

00:43:55.060 --> 00:44:00.585
So that means the FPT
algorithm that I'm given,

00:44:00.585 --> 00:44:01.960
because we're
assuming here we're

00:44:01.960 --> 00:44:03.720
given an FPT algorithm,
we want to produce

00:44:03.720 --> 00:44:08.400
a kernel-- in that
case, that algorithm

00:44:08.400 --> 00:44:15.180
runs in n to the c plus 1 time.

00:44:15.180 --> 00:44:17.280
Which means I can
actually run it.

00:44:17.280 --> 00:44:19.621
That's polynomial time.

00:44:19.621 --> 00:44:20.120
OK.

00:44:20.120 --> 00:44:23.720
So over here I needed a
polynomial time kernelization

00:44:23.720 --> 00:44:24.990
algorithm.

00:44:24.990 --> 00:44:27.600
If it happens that
f of k is at most n,

00:44:27.600 --> 00:44:30.747
then I can actually
run the FPT algorithm,

00:44:30.747 --> 00:44:32.830
and that would be a valid
kernelization procedure.

00:44:32.830 --> 00:44:35.890
Now the FPT algorithm
actually solves the problem.

00:44:35.890 --> 00:44:39.750
Let's say, it says
yes or no, which is the

00:44:39.750 --> 00:44:42.120
answer to my original question.

00:44:42.120 --> 00:44:45.610
The kernelization procedure
has to output an input

00:44:45.610 --> 00:44:46.500
to the problem.

00:44:46.500 --> 00:44:50.610
So I need to
add one thing here

00:44:50.610 --> 00:45:02.471
which is just-- output a
canonical yes or no input

00:45:02.471 --> 00:45:02.970
accordingly.

00:45:05.700 --> 00:45:06.200
OK.

00:45:06.200 --> 00:45:08.640
If the FPT algorithm-- here
I'm thinking about decision

00:45:08.640 --> 00:45:10.980
problems-- if the FPT
algorithm says yes,

00:45:10.980 --> 00:45:13.239
I'm going to output
one instance,

00:45:13.239 --> 00:45:15.280
one input of the problem
where the output is yes.

00:45:15.280 --> 00:45:18.400
I know that one exists, because
this algorithm said yes.

00:45:18.400 --> 00:45:21.620
So in a constant
amount of space,

00:45:21.620 --> 00:45:24.120
I'm able to write a yes input.

00:45:24.120 --> 00:45:26.240
Or in a constant amount of
space, I write a no input.

00:45:26.240 --> 00:45:28.050
So the new kernel will
have constant size,

00:45:28.050 --> 00:45:31.900
which is smaller than f of k.

00:45:31.900 --> 00:45:32.990
OK.

00:45:32.990 --> 00:45:33.580
That's it.

00:45:33.580 --> 00:45:37.150
So either I output the same
input that I was given.

00:45:37.150 --> 00:45:39.670
Or I output something
of constant size

00:45:39.670 --> 00:45:42.480
that encodes a yes or no.
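
NOTE
A sketch of the construction just described, assuming a decision problem.
Every name here (kernelize_via_fpt, size, f, fpt_solve, yes_instance,
no_instance) is hypothetical. For vertex cover, one could take the empty
graph with k = 0 as the canonical yes input and a single edge with k = 0
as the canonical no input.
```python
def kernelize_via_fpt(x, k, size, f, fpt_solve, yes_instance, no_instance):
    """Turn an f(k) * n**c time decision algorithm into a kernelization (sketch)."""
    # Case 1: n <= f(k). The input is already small, so output it unchanged.
    if size(x) <= f(k):
        return (x, k)
    # Case 2: n > f(k). Then f(k) * n**c <= n**(c + 1), so the FPT
    # algorithm runs in polynomial time and we can afford to solve outright.
    answer = fpt_solve(x, k)
    # Output a canonical constant-size input that has the same answer.
    return yes_instance if answer else no_instance
```
Either way the output has size at most f(k), which is why the resulting
kernel is typically exponential in k.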

00:45:42.480 --> 00:45:44.750
Kind of trivial again.

00:45:44.750 --> 00:45:51.150
The catch is that the size of the
kernel, the size of the output,

00:45:51.150 --> 00:45:54.160
in general, here is going
to be exponential in k,

00:45:54.160 --> 00:45:58.390
because f of k presumably
is exponential in k.

00:45:58.390 --> 00:45:59.440
That's annoying.

00:45:59.440 --> 00:46:03.130
This is what you might call
an exponential size kernel.

00:46:03.130 --> 00:46:04.310
So an interesting question.

00:46:04.310 --> 00:46:07.940
And exponential size kernels
are equivalent to FPT.

00:46:07.940 --> 00:46:09.750
Something that may
not be equivalent

00:46:09.750 --> 00:46:11.629
is a polynomial size kernel.

00:46:11.629 --> 00:46:13.670
It would be nice if I
start with something that's

00:46:13.670 --> 00:46:16.580
polynomial in n, and I
reduce it to something

00:46:16.580 --> 00:46:18.530
that's polynomial in k.

00:46:18.530 --> 00:46:21.490
And then I run something
that's exponential on that.

00:46:21.490 --> 00:46:25.400
But it will only be singly
exponential in k, hopefully.

00:46:25.400 --> 00:46:28.880
Whereas, if I use this
kernelization procedure,

00:46:28.880 --> 00:46:33.030
if I apply to vertex cover
with one of these algorithms,

00:46:33.030 --> 00:46:34.570
I'm going to get
a new thing whose

00:46:34.570 --> 00:46:36.480
size is exponential in k.

00:46:36.480 --> 00:46:39.277
If I run one of these
brute force algorithms,

00:46:39.277 --> 00:46:40.860
I'm going to get
something that's like

00:46:40.860 --> 00:46:42.250
doubly exponential in k.

00:46:42.250 --> 00:46:44.810
That's not so hot,
because I know

00:46:44.810 --> 00:46:47.794
how to do exponential in k.

00:46:47.794 --> 00:46:48.690
All right.

00:46:48.690 --> 00:46:52.570
But this is the general idea of
kernelization, and why it's not

00:46:52.570 --> 00:46:54.750
surprising that you can do it.

00:46:54.750 --> 00:46:57.660
I have one catch here which
is-- this is an algorithm.

00:46:57.660 --> 00:46:59.840
It compares n to f of k.

00:46:59.840 --> 00:47:04.250
In order to do this, you
have to know what k is.

00:47:04.250 --> 00:47:06.120
Minor technicality.

00:47:06.120 --> 00:47:10.510
If you don't know what
k is, you can basically

00:47:10.510 --> 00:47:14.040
run this algorithm with
a timer, a stopwatch,

00:47:14.040 --> 00:47:17.430
and if its running time
exceeds n to the c plus 1,

00:47:17.430 --> 00:47:19.610
then you know you're
not in this case.

00:47:19.610 --> 00:47:21.970
If it finishes within
that time bound, great.

00:47:21.970 --> 00:47:23.080
You found the answer.

00:47:23.080 --> 00:47:25.720
If it doesn't finish, then you
know you must be in this case,

00:47:25.720 --> 00:47:27.971
and then you just quit, and
output your original input

00:47:27.971 --> 00:47:28.970
and say, I'm kernelized.

00:47:28.970 --> 00:47:29.750
Done.

00:47:29.750 --> 00:47:30.894
Easy.

00:47:30.894 --> 00:47:31.770
OK.

00:47:31.770 --> 00:47:32.686
That's a technicality.

00:47:37.270 --> 00:47:39.560
All right.

00:47:39.560 --> 00:47:41.134
So much for general theory.

00:47:41.134 --> 00:47:42.300
Let's go back to algorithms.

00:47:47.380 --> 00:47:50.450
Yeah, all this work I want
to write down over here.

00:47:50.450 --> 00:47:54.125
We have a v times 2
to the k algorithm.

00:47:56.900 --> 00:47:57.750
On the one side.

00:47:57.750 --> 00:48:03.565
And here, we have an E
times v to the k algorithm.

00:48:06.150 --> 00:48:07.857
Just want to keep
a running-- we're

00:48:07.857 --> 00:48:09.940
going to get a faster
algorithm than both of those

00:48:09.940 --> 00:48:10.856
through kernelization.

00:48:13.230 --> 00:48:18.221
So I claim that we can
find a polynomial kernel--

00:48:18.221 --> 00:48:19.970
polynomial-sized
kernel-- it's going to be

00:48:19.970 --> 00:48:22.840
quadratic for vertex cover.

00:48:44.270 --> 00:48:45.899
These are hard to find.

00:48:45.899 --> 00:48:47.440
And there's a whole
research industry

00:48:47.440 --> 00:48:50.650
for finding polynomial kernels.

00:48:50.650 --> 00:48:56.850
So I'm going to give
you some methods.

00:48:56.850 --> 00:48:59.060
But they're specific
to vertex cover.

00:49:02.700 --> 00:49:03.970
So here's the first thing.

00:49:03.970 --> 00:49:05.740
This is hard to draw.

00:49:05.740 --> 00:49:08.360
So here I have a
vertex u, and suppose

00:49:08.360 --> 00:49:11.065
I have an edge
connected from u to u.

00:49:11.065 --> 00:49:13.190
This is called a loop.

00:49:13.190 --> 00:49:17.810
What can I do from a
vertex cover perspective?

00:49:17.810 --> 00:49:21.919
What can I conclude
about this picture?

00:49:21.919 --> 00:49:22.418
Yeah.

00:49:22.418 --> 00:49:24.487
AUDIENCE: [INAUDIBLE].

00:49:24.487 --> 00:49:25.320
ERIK DEMAINE: Right.

00:49:25.320 --> 00:49:28.900
u must be in the vertex cover,
because this edge really only

00:49:28.900 --> 00:49:29.771
has one endpoint.

00:49:29.771 --> 00:49:30.270
OK.

00:49:30.270 --> 00:49:32.020
So far, so easy.

00:49:32.020 --> 00:49:35.450
So what I can do in
this case is say,

00:49:35.450 --> 00:49:44.430
OK, u is in the vertex cover,
and then delete u and its

00:49:44.430 --> 00:49:45.065
incident edges.

00:49:51.136 --> 00:49:53.800
So-- and then we
have to decrement k.

00:49:57.710 --> 00:49:58.210
OK.

00:49:58.210 --> 00:49:59.100
Cool.

00:49:59.100 --> 00:50:01.310
Seems-- feels familiar.

00:50:01.310 --> 00:50:03.500
But in this case,
there's no guessing.

00:50:03.500 --> 00:50:05.860
We just know u
must be in the cover.

00:50:05.860 --> 00:50:06.640
All right.

00:50:06.640 --> 00:50:07.530
Here's another case.

00:50:07.530 --> 00:50:12.439
Suppose I have u and v and there
are many edges connecting them.

00:50:12.439 --> 00:50:13.605
This is called a multi-edge.

00:50:16.170 --> 00:50:18.390
So maybe you just assume
your graph is simple.

00:50:18.390 --> 00:50:20.610
But if you don't assume
your graph is simple,

00:50:20.610 --> 00:50:25.200
we might have these
kinds of situations.

00:50:25.200 --> 00:50:27.250
What can I do in this case?

00:50:27.250 --> 00:50:28.665
What can I guarantee?

00:50:28.665 --> 00:50:29.190
Yeah.

00:50:29.190 --> 00:50:30.940
AUDIENCE: You can
remove all but one edge.

00:50:30.940 --> 00:50:33.189
ERIK DEMAINE: You can remove
all but one of the edges.

00:50:33.189 --> 00:50:35.400
Let's just delete all but one.

00:50:35.400 --> 00:50:38.120
If I cover one of
them, I cover them all.

00:50:38.120 --> 00:50:38.775
Easy peasy.

00:50:41.580 --> 00:50:45.290
See if you get the other rules.

00:50:45.290 --> 00:50:46.795
So delete all but one.

00:50:51.325 --> 00:50:52.950
In general, we're
going to have a bunch

00:50:52.950 --> 00:50:55.820
of these kinds of simplification
rules that are guaranteed correct.

00:50:55.820 --> 00:50:58.476
They don't change the output.

00:50:58.476 --> 00:51:00.600
But magically, we're going
to end up with something

00:51:00.600 --> 00:51:01.970
whose size is a function of k.

00:51:01.970 --> 00:51:03.250
We haven't done much yet.

00:51:03.250 --> 00:51:05.750
But now we know that the
graph is simple, meaning it

00:51:05.750 --> 00:51:08.170
has no loops and it
has no multi-edges.
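
NOTE A minimal Python sketch of the loop and multi-edge rules just described.
The function name simplify and the edge-list representation (pairs of
comparable vertex labels) are assumptions for illustration, not from the lecture.
If the returned k is negative, the instance has no cover within the budget.
```python
def simplify(edges, k):
    """Apply the loop and multi-edge rules; return (edges, k, forced)."""
    # A loop at u has only one endpoint, so u is forced into the cover.
    forced = {u for u, v in edges if u == v}
    # Each forced vertex uses up one unit of the budget k.
    k -= len(forced)
    # Drop edges already covered by a forced vertex; a set of frozensets
    # keeps a single copy of each parallel edge.
    simple = {frozenset((u, v)) for u, v in edges
              if u not in forced and v not in forced}
    return sorted(tuple(sorted(e)) for e in simple), k, forced
```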

00:51:08.170 --> 00:51:09.940
Cool.

00:51:09.940 --> 00:51:10.440
All right.

00:51:10.440 --> 00:51:25.180
Next thing I want to think
about is a vertex of degree

00:51:25.180 --> 00:51:27.610
greater than k. k is
the current value of k.

00:51:27.610 --> 00:51:33.830
So suppose I have a high
degree vertex, more than k,

00:51:33.830 --> 00:51:35.740
edges going out from it.

00:51:40.417 --> 00:51:41.250
What can I say then?

00:51:45.530 --> 00:51:46.030
Yeah.

00:51:46.030 --> 00:51:47.950
AUDIENCE: You must pick it.

00:51:47.950 --> 00:51:51.480
ERIK DEMAINE: You must
put it in the cover.

00:51:51.480 --> 00:51:53.130
Why?

00:51:53.130 --> 00:51:54.963
AUDIENCE: Because then
you need to cover all

00:51:54.963 --> 00:51:56.171
the remaining vertices.

00:51:58.877 --> 00:51:59.710
ERIK DEMAINE: Right.

00:51:59.710 --> 00:52:00.668
Proof by contradiction.

00:52:00.668 --> 00:52:03.510
If I don't put it in the cover,
that means all of these guys

00:52:03.510 --> 00:52:04.420
are in the cover.

00:52:04.420 --> 00:52:06.170
There's more than k of them.

00:52:06.170 --> 00:52:10.800
And the whole goal was to find a
vertex cover of size at most k.

00:52:13.370 --> 00:52:15.965
So you better not put all of
these in your vertex cover,

00:52:15.965 --> 00:52:17.090
because that's more than k.

00:52:17.090 --> 00:52:20.140
So, therefore, this
one has to be in there.

00:52:20.140 --> 00:52:21.720
This is a cool argument.

00:52:21.720 --> 00:52:24.440
Simple, but cool.

00:52:24.440 --> 00:52:24.940
OK.

00:52:24.940 --> 00:52:27.620
So any vertex of
degree greater than k

00:52:27.620 --> 00:52:33.560
must be in the vertex cover,
which I was calling S. OK.

00:52:33.560 --> 00:52:39.240
So delete that vertex and its
incident edges, decrement k,

00:52:39.240 --> 00:52:42.290
because we just used something.

00:52:42.290 --> 00:52:42.790
OK.

00:52:42.790 --> 00:52:44.170
So just keep doing this.

00:52:44.170 --> 00:52:46.680
Every time I see a vertex
of degree more than k,

00:52:46.680 --> 00:52:49.100
delete it, decrement k.

00:52:49.100 --> 00:52:51.320
Now I have a new graph
and a new value of k.

00:52:51.320 --> 00:52:54.660
Look for any vertices whose
degree is more than k.

00:52:54.660 --> 00:52:57.200
If I find one, delete
it, repeat, repeat.

00:52:57.200 --> 00:52:58.980
Keep repeating until
you can't anymore.

00:52:58.980 --> 00:53:00.680
How much time is
this going to take?

00:53:00.680 --> 00:53:01.320
I don't know.

00:53:01.320 --> 00:53:02.460
At most quadratic, right?

00:53:02.460 --> 00:53:04.730
I look at all the vertices.

00:53:04.730 --> 00:53:06.760
Look at their degrees.

00:53:06.760 --> 00:53:09.630
I could look over all the
edges, increment the degrees.

00:53:09.630 --> 00:53:11.440
In linear time, I can
find whether there's

00:53:11.440 --> 00:53:14.000
any vertex of
degree more than k.

00:53:14.000 --> 00:53:15.910
Then delete it in linear time.

00:53:15.910 --> 00:53:17.270
Then try again.

00:53:17.270 --> 00:53:19.610
And this will happen at
most linearly many times

00:53:19.610 --> 00:53:21.380
because I can only
delete a vertex once.

00:53:21.380 --> 00:53:23.690
So I delete at most v vertices.

00:53:23.690 --> 00:53:29.095
So overall running time here
is like at most v times E,

00:53:29.095 --> 00:53:29.595
polynomial.
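
NOTE A minimal Python sketch of the high-degree rule and its simple O(VE) loop,
assuming a simple graph given as a list of vertex pairs. The name
remove_high_degree is hypothetical. Each round recounts degrees in O(E) time
rather than using the cleverer data structure mentioned next.
```python
from collections import Counter
def remove_high_degree(edges, k):
    """Repeatedly force any vertex of degree > k into the cover."""
    forced = set()
    while k >= 0:
        # O(E) pass: count the degree of every vertex that still has an edge.
        degree = Counter(v for e in edges for v in e)
        high = next((v for v, d in degree.items() if d > k), None)
        if high is None:
            break  # every remaining vertex has degree at most k
        forced.add(high)
        # O(E) pass: delete the vertex together with its incident edges.
        edges = [e for e in edges if high not in e]
        k -= 1  # one unit of the budget has been used
    # A negative k means no vertex cover fits within the original budget.
    return edges, k, forced
```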

00:53:38.220 --> 00:53:40.530
Probably if you're clever,
use a data structure

00:53:40.530 --> 00:53:42.410
to update degrees.

00:53:42.410 --> 00:53:44.510
You could do this
in order v time.

00:53:44.510 --> 00:53:48.810
But let's not be clever yet.

00:53:48.810 --> 00:53:49.900
All right.

00:53:49.900 --> 00:53:55.180
So now, after I've done
all of these reductions,

00:53:55.180 --> 00:54:00.360
I have a graph where every
vertex has degree at most k.

00:54:00.360 --> 00:54:01.905
So it's like a
bounded degree graph.

00:54:43.580 --> 00:54:45.950
Why do I care about a
bounded degree graph?

00:54:45.950 --> 00:54:49.480
Remember, I drew this example,
which was a star graph,

00:54:49.480 --> 00:54:54.130
where n was large, but k was
very small. n was v,

00:54:54.130 --> 00:54:57.870
and k was 1.

00:54:57.870 --> 00:54:59.490
Now a star graph is
special because it

00:54:59.490 --> 00:55:01.790
has a very high degree vertex.

00:55:01.790 --> 00:55:06.220
In general, if I have a
vertex of some degree, say k,

00:55:06.220 --> 00:55:12.662
and I put it in the vertex
cover, it covers k edges.

00:55:12.662 --> 00:55:13.650
OK.

00:55:13.650 --> 00:55:24.654
So each vertex in S
covers at most k edges,

00:55:24.654 --> 00:55:27.560
whatever the degree is.

00:55:27.560 --> 00:55:30.127
So that means you don't get
much bang for your buck anymore.

00:55:30.127 --> 00:55:32.710
We've already taken care of all
the high degree vertices where

00:55:32.710 --> 00:55:34.650
you get a lot of
reward for putting

00:55:34.650 --> 00:55:36.350
one vertex in the cover.

00:55:36.350 --> 00:55:39.150
Now this is the new value of k.

00:55:39.150 --> 00:55:42.040
It may have decremented
from before.

00:55:42.040 --> 00:55:45.400
Every vertex that we could
possibly put in the set

00:55:45.400 --> 00:55:47.180
will only cover k edges.

00:55:47.180 --> 00:55:50.280
Now we know that
we only get to put

00:55:50.280 --> 00:55:53.120
k more vertices into the set.

00:55:53.120 --> 00:55:59.510
We know-- we're supposing
that the size of S is at most k.

00:55:59.510 --> 00:56:06.850
So that means that the
number of edges, size of E,

00:56:06.850 --> 00:56:09.680
must be at most k squared.

00:56:09.680 --> 00:56:10.500
All right.

00:56:10.500 --> 00:56:13.230
Because every one I put into
S covers at most k edges.

00:56:13.230 --> 00:56:15.800
All of them have to be covered.

00:56:15.800 --> 00:56:19.834
And so k times k is k squared.

00:56:19.834 --> 00:56:22.160
Hah, interesting.

00:56:22.160 --> 00:56:23.850
That means my graph is small.

00:56:23.850 --> 00:56:25.105
Now slight catch.

00:56:25.105 --> 00:56:26.980
There might be a whole
bunch of vertices that

00:56:26.980 --> 00:56:29.170
have no edges incident to them.

00:56:29.170 --> 00:56:33.971
So I need to delete
isolated vertices.

00:56:33.971 --> 00:56:35.880
Let's say, degree 0 vertices.

00:56:38.660 --> 00:56:39.160
OK.

00:56:39.160 --> 00:56:41.800
Degree 0 vertices-- you
really don't want to put those

00:56:41.800 --> 00:56:42.770
into your vertex cover.

00:56:42.770 --> 00:56:43.390
No point.

00:56:43.390 --> 00:56:45.060
They don't cover any edges.

00:56:45.060 --> 00:56:46.510
So delete those.

00:56:46.510 --> 00:56:49.240
And now, I still may not
have a connected graph,

00:56:49.240 --> 00:56:53.800
but, in the worst case,
I have a matching.

00:56:53.800 --> 00:56:56.670
I know the total number of
edges is at most k squared.

00:56:56.670 --> 00:57:00.190
That means the total
number of vertices

00:57:00.190 --> 00:57:03.610
is at most twice
that, 2k squared.

00:57:03.610 --> 00:57:05.560
So after all of
these operations,

00:57:05.560 --> 00:57:10.660
I assumed that S
was size at most k.

00:57:10.660 --> 00:57:14.200
And then if I do all
of these operations,

00:57:14.200 --> 00:57:17.940
I get a graph with at most 2k
squared vertices, and at most

00:57:17.940 --> 00:57:19.510
k squared edges.

00:57:19.510 --> 00:57:21.060
So the total size
of the graph, which

00:57:21.060 --> 00:57:27.090
I'm calling n, which is
size of v plus size of E

00:57:27.090 --> 00:57:30.100
is order k squared.

00:57:30.100 --> 00:57:31.400
3k squared.

00:57:31.400 --> 00:57:35.820
And I assumed that there was
a vertex cover of size at most k

00:57:35.820 --> 00:57:37.270
throughout this.

00:57:37.270 --> 00:57:41.050
So what I do is I run this
kernelization algorithm.

00:57:41.050 --> 00:57:44.100
And I see-- is the graph
that I produced size

00:57:44.100 --> 00:57:46.150
at most 3k squared?

00:57:46.150 --> 00:57:47.610
If it is, output it.

00:57:47.610 --> 00:57:51.220
That's a kernelized
thing because it's small.

00:57:51.220 --> 00:57:53.400
If it isn't, if the
graph I've produced

00:57:53.400 --> 00:57:56.960
is still too big, that must mean
that this assumption was wrong,

00:57:56.960 --> 00:57:59.440
which means the answer to the
vertex cover problem is no,

00:57:59.440 --> 00:58:01.980
there is no vertex
cover of size at most k.

00:58:01.980 --> 00:58:03.850
And so then I just
output a canonical

00:58:03.850 --> 00:58:09.010
no instance, like-- so I mean,
this is sort of outside--

00:58:09.010 --> 00:58:13.305
but if the newly
produced graph-- I'll

00:58:13.305 --> 00:58:25.550
call its size v prime plus E prime-- is
greater than 3 times k squared,

00:58:25.550 --> 00:58:27.872
then output-- and
here I can actually

00:58:27.872 --> 00:58:29.330
give you one--
let's say, I'm going

00:58:29.330 --> 00:58:33.130
to output the graph which is a
single edge with two vertices

00:58:33.130 --> 00:58:36.100
and k equals 0.

00:58:36.100 --> 00:58:38.530
The answer to vertex cover
in this instance is no.

00:58:38.530 --> 00:58:40.800
So this is an example
of a constant size,

00:58:40.800 --> 00:58:43.170
no-instance representative.

00:58:43.170 --> 00:58:44.970
So either I get
something that's small

00:58:44.970 --> 00:58:47.540
and I output that, or
it's big, in which case

00:58:47.540 --> 00:58:51.970
I output this thing, which is
to say, nope, can't be done.

00:58:51.970 --> 00:58:53.450
That's kernelization.

00:58:53.450 --> 00:58:59.060
So I've produced a quadratic
size graph, quadratic in k

00:58:59.060 --> 00:59:00.450
in polynomial time.
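
NOTE A minimal Python sketch of the final size check, assuming the loop,
multi-edge, and high-degree rules have already been applied, so the graph is
simple with maximum degree at most k. The name finish_kernel is hypothetical.
Vertices are read off the edge list, so isolated vertices drop out implicitly.
```python
def finish_kernel(edges, k):
    """Return a kernel of size at most 3k^2, or the canonical no-instance."""
    # Only vertices that still touch an edge are kept.
    vertices = {v for e in edges for v in e}
    # With a cover of size <= k: |E| <= k^2 and |V| <= 2k^2, so |V|+|E| <= 3k^2.
    if k < 0 or len(vertices) + len(edges) > 3 * k * k:
        return [(0, 1)], 0  # a single edge with k = 0 has no valid cover
    return edges, k
```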

00:59:03.000 --> 00:59:04.384
Question?

00:59:04.384 --> 00:59:05.830
No.

00:59:05.830 --> 00:59:06.800
Wow.

00:59:06.800 --> 00:59:09.220
So this is kernelization
at its finest.

00:59:09.220 --> 00:59:10.730
A polynomial kernel.

00:59:10.730 --> 00:59:11.850
Polynomial time.

00:59:11.850 --> 00:59:15.872
We get that down to something
whose size is polynomial in k.

00:59:15.872 --> 00:59:18.330
This is how you should-- if
you want to solve vertex cover,

00:59:18.330 --> 00:59:20.740
you might as well do
these reductions first,

00:59:20.740 --> 00:59:22.764
because they will
simplify your thing.

00:59:22.764 --> 00:59:25.180
And now, if you happen to know
your vertex cover is small,

00:59:25.180 --> 00:59:27.070
then the graph will be small.

00:59:27.070 --> 00:59:29.650
So now we could run either
of these algorithms.

00:59:29.650 --> 00:59:30.530
OK.

00:59:30.530 --> 00:59:32.410
Presumably, we should
run the better one.

00:59:32.410 --> 00:59:36.310
But for fun, let's
analyze both of them.

00:59:36.310 --> 00:59:36.810
OK.

00:59:36.810 --> 00:59:39.570
So I'm going to leave
the running times here.

00:59:44.480 --> 00:59:46.510
We're going to get a
faster vertex cover

00:59:46.510 --> 00:59:51.190
algorithm from a fixed parameter
tractability perspective.

00:59:51.190 --> 01:00:02.420
So here's a new FPT algorithm.

01:00:02.420 --> 01:00:03.119
Two of them.

01:00:03.119 --> 01:00:03.910
First we kernelize.

01:00:07.080 --> 01:00:07.580
OK.

01:00:07.580 --> 01:00:09.690
We spent-- I guess--
order vE time.

01:00:09.690 --> 01:00:12.880
Again, I think you can
get that down to order v

01:00:12.880 --> 01:00:15.440
without too much effort.

01:00:15.440 --> 01:00:16.829
Obvious.

01:00:16.829 --> 01:00:17.870
It's not totally obvious.

01:00:17.870 --> 01:00:19.360
It's a good exercise.

01:00:19.360 --> 01:00:22.114
Be a good problem set problem.

01:00:22.114 --> 01:00:23.280
It's not on the problem set.

01:00:23.280 --> 01:00:24.490
Don't worry.

01:00:24.490 --> 01:00:28.050
It could be a good
final exam problem.

01:00:28.050 --> 01:00:29.770
Probably a little long.

01:00:29.770 --> 01:00:30.300
All right.

01:00:30.300 --> 01:00:35.680
So now we could-- let's
say, option one-- let's

01:00:35.680 --> 01:00:38.330
use the brute force
algorithm after that.

01:00:41.940 --> 01:00:46.240
The running time of that
is E times v to the k.

01:00:46.240 --> 01:00:49.990
But now E is k squared.

01:00:49.990 --> 01:00:53.180
And v is also order k squared.

01:00:53.180 --> 01:00:55.170
Let's not worry
about-- actually I

01:00:55.170 --> 01:00:57.560
do have to worry about
constants here, because it's

01:00:57.560 --> 01:00:59.280
in the base of an exponent.

01:00:59.280 --> 01:01:01.540
So I do.

01:01:01.540 --> 01:01:05.680
So we're going to get k
squared for the E term.

01:01:05.680 --> 01:01:12.180
And then v term is going
to be 2 times k squared.

01:01:12.180 --> 01:01:14.840
And that's going to be
raised to the k-th power.

01:01:14.840 --> 01:01:15.340
OK.

01:01:15.340 --> 01:01:18.140
So I'll simplify a little bit.

01:01:18.140 --> 01:01:23.280
This is like 2 to the k
times-- I guess-- k to the 2k.

01:01:23.280 --> 01:01:23.780
OK.

01:01:23.780 --> 01:01:24.780
It's k to the 2k plus 2.

01:01:28.600 --> 01:01:29.330
Not bad.

01:01:29.330 --> 01:01:32.900
Overall running time
is vE plus this.

01:01:36.056 --> 01:01:36.930
It's a function of k.

01:01:36.930 --> 01:01:37.638
It's exponential.

01:01:39.960 --> 01:01:40.460
Good.

01:01:40.460 --> 01:01:41.600
We have a better algorithm.

01:01:41.600 --> 01:01:43.670
We have this v times 2
to the k running time,

01:01:43.670 --> 01:01:45.510
so we might as
well use that one.

01:01:45.510 --> 01:01:47.400
But the point is--
once you kernelize,

01:01:47.400 --> 01:01:49.370
you can use pretty
stupid algorithms,

01:01:49.370 --> 01:01:51.550
and you still get really
good running times.

01:01:51.550 --> 01:01:51.680
OK.

01:01:51.680 --> 01:01:53.100
We'll get a slightly
better running time

01:01:53.100 --> 01:01:54.400
using the bounded-tree-search.

01:02:02.885 --> 01:02:04.260
So if we use
bounded-tree-search,

01:02:04.260 --> 01:02:08.301
we have v. v is 2k squared.

01:02:08.301 --> 01:02:09.800
So here the constant
doesn't matter,

01:02:09.800 --> 01:02:11.780
because there's no exponent.

01:02:11.780 --> 01:02:13.770
And then we have
times 2 to the k.

01:02:13.770 --> 01:02:21.120
So we're going to get a k squared
times 2 to the k algorithm.

01:02:21.120 --> 01:02:22.330
Kind of funny symmetry here.

01:02:22.330 --> 01:02:24.740
2 and k are switching roles.

01:02:24.740 --> 01:02:27.210
Of course, the 2 to
the k is the big term.

01:02:27.210 --> 01:02:29.740
But now it's only
singly exponential in k.

01:02:29.740 --> 01:02:32.710
This thing is like
2 to the k log k.

01:02:32.710 --> 01:02:33.980
This thing is only 2 to the k.

01:02:33.980 --> 01:02:35.440
So it's better.

01:02:35.440 --> 01:02:37.290
And this is like k factorial.

01:02:37.290 --> 01:02:38.940
And this is just 2 to the k.

01:02:38.940 --> 01:02:40.500
So it's a big improvement.

01:02:40.500 --> 01:02:42.930
This will be a much more
practical algorithm.

01:02:42.930 --> 01:02:44.720
So we run the
kernelization, then

01:02:44.720 --> 01:02:48.150
we run the
bounded-tree-search algorithm.

01:02:48.150 --> 01:03:01.200
And so the total running time
is vE plus k squared 2 to the k.
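
NOTE A worked version of the two running-time bounds after kernelization,
assuming the kernel sizes from the lecture, |E| at most k^2 and |V| at most 2k^2.
The O(VE) kernelization cost is added on top of each bound.
```latex
\begin{align*}
\text{brute force: } & E \cdot V^k \le k^2 (2k^2)^k = 2^k\, k^{2k+2} = 2^{\,k + (2k+2)\log_2 k} = 2^{O(k \log k)} \\
\text{bounded search tree: } & V \cdot 2^k \le 2k^2 \cdot 2^k = O(k^2\, 2^k)
\end{align*}
```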

01:03:01.200 --> 01:03:02.480
The story doesn't end here.

01:03:02.480 --> 01:03:05.040
There are dozens
of papers about how

01:03:05.040 --> 01:03:07.490
to solve vertex cover from
fixed parameter tractability

01:03:07.490 --> 01:03:08.580
perspective.

01:03:08.580 --> 01:03:15.446
The best one so far--
I'm not going to cover--

01:03:15.446 --> 01:03:16.820
but it is based
on kernelization.

01:03:16.820 --> 01:03:19.520
Just more rules.

01:03:19.520 --> 01:03:29.290
And you get k v
plus 1.274 to the k.

01:03:29.290 --> 01:03:32.310
So somewhat better
than 2, but very similar.

01:03:35.050 --> 01:03:35.970
That's vertex cover.

01:03:35.970 --> 01:03:37.567
If you have a vertex
cover instance,

01:03:37.567 --> 01:03:40.150
and you know that it's going to
have a relatively small vertex

01:03:40.150 --> 01:03:42.570
cover, these are the
algorithms you should use.

01:03:46.440 --> 01:03:49.310
Any questions?

01:03:49.310 --> 01:03:51.860
This ends our
vertex cover story.

01:03:51.860 --> 01:03:55.560
But the last thing I want to do
is connect up these two areas.

01:03:55.560 --> 01:03:58.140
Last class we talked about
approximation algorithms.

01:03:58.140 --> 01:04:00.390
This class we talked about
fixed parameter algorithms.

01:04:00.390 --> 01:04:03.420
They're actually
closely related.

01:04:03.420 --> 01:04:06.450
And so, for example, we will
get a fixed parameter algorithm

01:04:06.450 --> 01:04:13.500
to subset sum, using what
we already had last lecture.

01:04:19.806 --> 01:04:21.680
So, so far today, I've
basically been talking

01:04:21.680 --> 01:04:23.580
about decision problems.

01:04:23.580 --> 01:04:26.250
But let's think a little bit
about optimization problems.

01:04:46.800 --> 01:04:51.370
So take your favorite
optimization problem.

01:04:51.370 --> 01:04:53.220
Like any of the ones
from last lecture.

01:04:59.380 --> 01:05:03.880
And let's assume that
the optimal solution

01:05:03.880 --> 01:05:06.210
value-- the thing we're
trying to optimize, minimize,

01:05:06.210 --> 01:05:08.660
or maximize-- is an integer.

01:05:08.660 --> 01:05:11.674
Assume that OPT is an integer.

01:05:11.674 --> 01:05:12.620
OK.

01:05:12.620 --> 01:05:18.380
Now let's look at
the decision problem.

01:05:18.380 --> 01:05:20.130
Whenever you have an
optimization problem,

01:05:20.130 --> 01:05:24.460
you can convert it into
a decision problem.

01:05:24.460 --> 01:05:27.820
You can convert it into a few.

01:05:27.820 --> 01:05:31.050
For example, OPT less than or
equal to k, or OPT greater than

01:05:31.050 --> 01:05:32.200
or equal to k.

01:05:32.200 --> 01:05:36.264
They're all going to turn
out to work the same.

01:05:36.264 --> 01:05:40.494
OPT equals k would also work.

01:05:40.494 --> 01:05:42.410
Now that's a decision
problem, but what I want

01:05:42.410 --> 01:05:45.050
is a parameterized
decision problem.

01:05:45.050 --> 01:05:47.200
What should my parameter be?

01:05:47.200 --> 01:05:48.486
k.

01:05:48.486 --> 01:05:50.912
All right.

01:05:50.912 --> 01:05:52.120
That's the obvious parameter.

01:05:56.320 --> 01:05:58.490
In some sense, we're
parameterizing by OPT,

01:05:58.490 --> 01:06:00.170
but we're adding a
layer of indirection.

01:06:00.170 --> 01:06:02.520
We're saying, well,
OPT, but we want

01:06:02.520 --> 01:06:06.400
to decide whether OPT is
less than or equal to k.

01:06:06.400 --> 01:06:08.260
And let's parameterize by k.

01:06:08.260 --> 01:06:11.740
That's a similar flavor to what
we had with vertex cover.

01:06:11.740 --> 01:06:16.580
If we started with minimum
vertex cover, and converted it.

01:06:16.580 --> 01:06:17.154
Cool.

01:06:17.154 --> 01:06:17.945
Here's the theorem.

01:06:26.032 --> 01:06:28.240
This is not going to be as
strong as the other things

01:06:28.240 --> 01:06:28.740
we've seen.

01:06:28.740 --> 01:06:30.460
No equivalence here.

01:06:56.590 --> 01:06:59.130
So it's a one way implication.

01:06:59.130 --> 01:07:02.210
And I haven't defined this term
yet, but it's similar to one

01:07:02.210 --> 01:07:03.810
we saw last class.

01:07:03.810 --> 01:07:06.510
If the optimization problem
that we started with

01:07:06.510 --> 01:07:10.320
has an efficient PTAS, an
efficient Polynomial Time

01:07:10.320 --> 01:07:14.010
Approximation Scheme, then
the decision problem--

01:07:14.010 --> 01:07:16.350
you get from here-- is
fixed parameter tractable

01:07:16.350 --> 01:07:18.470
with respect to k.

01:07:18.470 --> 01:07:18.970
OK.

01:07:18.970 --> 01:07:20.150
So what does EPTAS mean?

01:07:20.150 --> 01:07:23.640
It's going to look familiar.

01:07:23.640 --> 01:07:26.430
We're going to take an
arbitrary function of 1

01:07:26.430 --> 01:07:32.580
over epsilon times a
fixed polynomial in n.

01:07:32.580 --> 01:07:35.480
So last time we
talked about PTAS

01:07:35.480 --> 01:07:39.890
we could have-- you could have
something like n to the f of 1

01:07:39.890 --> 01:07:40.600
over epsilon.

01:07:40.600 --> 01:07:43.260
I'm going to consider that
bad, as you might imagine,

01:07:43.260 --> 01:07:45.670
from a fixed parameter
tractability perspective.

01:07:45.670 --> 01:07:49.490
Better would be some function,
possibly exponential,

01:07:49.490 --> 01:07:52.390
of 1 over epsilon
times polynomial in n.

01:07:52.390 --> 01:07:54.900
This is going to be good from
a fixed parameter perspective.

01:07:54.900 --> 01:07:56.774
Although it's about
approximation algorithms,

01:07:56.774 --> 01:07:58.900
not about exact algorithms.

01:07:58.900 --> 01:08:01.010
Of course, even
better is the FPTASs

01:08:01.010 --> 01:08:04.390
we saw last time, which is
polynomial in 1 over epsilon times

01:08:04.390 --> 01:08:05.850
polynomial in n.

01:08:05.850 --> 01:08:06.470
That's ideal.

01:08:06.470 --> 01:08:09.820
If you have an FPTAS,
it is also an EPTAS.

01:08:09.820 --> 01:08:12.360
You just remove-- or you
just add one more stroke

01:08:12.360 --> 01:08:13.952
to the first letter.

01:08:13.952 --> 01:08:16.370
And you got an EPTAS.

01:08:16.370 --> 01:08:18.930
And last class we
actually saw an EPTAS.

01:08:18.930 --> 01:08:23.818
For subset sum, we saw 2 to
the 1 over epsilon times n.

01:08:23.818 --> 01:08:25.859
Now, in fact, for that
problem, there's an FPTAS.

01:08:25.859 --> 01:08:27.010
Even better.

01:08:27.010 --> 01:08:28.609
But you can see
from last lecture

01:08:28.609 --> 01:08:31.520
why it's nice to have an
exponential dependence on 1

01:08:31.520 --> 01:08:32.350
over epsilon.

01:08:32.350 --> 01:08:34.170
And what this is saying
is you can do that

01:08:34.170 --> 01:08:36.200
as long as that
exponential dependence is

01:08:36.200 --> 01:08:38.660
separated from the n part.

01:08:38.660 --> 01:08:41.880
If it's multiplicatively
or additively separated,

01:08:41.880 --> 01:08:45.040
as you might imagine, it's
the same thing, from n,

01:08:45.040 --> 01:08:47.170
then we call this
an efficient PTAS.

01:08:47.170 --> 01:08:51.410
Not fully, not quite as
good as an FPTAS, but close.

01:08:51.410 --> 01:08:53.170
And as long as you
have such a thing,

01:08:53.170 --> 01:08:56.399
you can convert it into an
FPT algorithm for the decision

01:08:56.399 --> 01:08:57.870
problem.

01:08:57.870 --> 01:09:00.220
The way this is typically
used-- so this tells us we

01:09:00.220 --> 01:09:03.040
get an FPT algorithm
for subset sum.

01:09:03.040 --> 01:09:05.040
In fact, because
there's an FPTAS,

01:09:05.040 --> 01:09:08.439
we get a pseudo
polynomial time algorithm,

01:09:08.439 --> 01:09:10.330
which is in some sense better.

01:09:10.330 --> 01:09:11.550
Anyway.

01:09:11.550 --> 01:09:14.220
The way this theorem is usually
used is in the contrapositive.

01:09:14.220 --> 01:09:17.850
What this tells us is that if we
can find a problem that is not

01:09:17.850 --> 01:09:21.120
FPT-- and there's a whole
theory like NP completeness

01:09:21.120 --> 01:09:23.350
for showing that problems
are almost certainly not

01:09:23.350 --> 01:09:25.180
fixed parameter
tractable-- then we

01:09:25.180 --> 01:09:28.627
know that there is not an EPTAS.

01:09:28.627 --> 01:09:30.460
And this is the state
of the art for proving

01:09:30.460 --> 01:09:32.990
that these kinds of
algorithms do not exist.

01:09:32.990 --> 01:09:35.960
Typically, you look at it from
a fixed parameter perspective,

01:09:35.960 --> 01:09:37.689
and show that an FPT algorithm
probably doesn't exist.

01:09:37.689 --> 01:09:39.890
Then you get that an EPTAS
probably doesn't exist.

01:09:39.890 --> 01:09:40.390
OK.

01:09:40.390 --> 01:09:41.735
Let's prove this theorem.

01:09:41.735 --> 01:09:43.220
It's, again, really easy.

01:09:45.802 --> 01:09:47.760
But a nice connection
between these two worlds.

01:09:59.922 --> 01:10:00.422
All right.

01:10:04.340 --> 01:10:06.610
So there are two cases--
the optimization problem

01:10:06.610 --> 01:10:09.580
we're thinking about could be
a minimization or maximization

01:10:09.580 --> 01:10:10.300
problem.

01:10:10.300 --> 01:10:14.675
Let's say it's maximization,
just to be concrete.

01:10:14.675 --> 01:10:16.480
It won't make too
much difference,

01:10:16.480 --> 01:10:19.160
but it will make
a tiny difference

01:10:19.160 --> 01:10:21.285
in order of-- or the
inequality directions.

01:10:25.591 --> 01:10:26.090
OK.

01:10:26.090 --> 01:10:28.580
So what we're going to do.

01:10:28.580 --> 01:10:32.950
So we're given an EPTAS,
and we want to solve FPT.

01:10:32.950 --> 01:10:35.760
We want an FPT algorithm.

01:10:35.760 --> 01:10:37.150
So what do we do?

01:10:37.150 --> 01:10:39.460
Well, an algorithm is going
to be to run that EPTAS.

01:10:39.460 --> 01:10:41.126
That's sort of the
only thing we can do.

01:10:41.126 --> 01:10:43.580
Now the EPTAS-- this is
an approximation scheme.

01:10:43.580 --> 01:10:47.120
It has an extra input
which is epsilon.

01:10:47.120 --> 01:10:49.280
We need to choose epsilon,
because we're not--

01:10:49.280 --> 01:10:51.176
we're trying to
solve it exactly.

01:10:51.176 --> 01:10:52.800
But there's no epsilon
in that problem,

01:10:52.800 --> 01:10:54.580
so we got to make one up.

01:10:54.580 --> 01:11:01.250
Let's run the EPTAS with--
anyone have good intuition?

01:11:01.250 --> 01:11:02.486
What should epsilon be?

01:11:08.438 --> 01:11:09.389
Yeah.

01:11:09.389 --> 01:11:10.180
Remind you of this.

01:11:19.730 --> 01:11:20.230
Tricky.

01:11:23.650 --> 01:11:25.090
We want epsilon to be small.

01:11:25.090 --> 01:11:25.590
Yeah.

01:11:25.590 --> 01:11:26.923
AUDIENCE: It should be 1 over k.

01:11:26.923 --> 01:11:29.820
ERIK DEMAINE: 1 over
k is almost right.

01:11:29.820 --> 01:11:32.370
Anything less than
that would work.

01:11:32.370 --> 01:11:38.990
So I'll use 1 over 2k, but 1 over
the quantity k plus 1 would also work.

01:11:38.990 --> 01:11:42.604
Or anything a little
bit-- 1 over k--

01:11:42.604 --> 01:11:49.230
yeah-- 1 over the quantity
k plus .00001, something.

01:11:49.230 --> 01:11:53.040
Anything a little bit less than
1 over k will turn out to work.

01:11:53.040 --> 01:11:55.230
So, why?

01:11:55.230 --> 01:11:59.770
So first of all, how
much time does this take?

01:11:59.770 --> 01:12:02.820
Well, we were given
this running time.

01:12:02.820 --> 01:12:04.570
1 over this is 2k.

01:12:04.570 --> 01:12:11.110
So this is going to take f of
2k time times polynomial in n.

01:12:14.020 --> 01:12:14.520
OK.

01:12:14.520 --> 01:12:16.810
We need to connect E
and k, because we're

01:12:16.810 --> 01:12:19.710
given-- sorry-- epsilon
and k, because we're

01:12:19.710 --> 01:12:23.204
given something whose running
time depends on epsilon not k.

01:12:23.204 --> 01:12:24.870
Now we're setting
epsilon in terms of k,

01:12:24.870 --> 01:12:28.980
so now the running time is a
function of k, not epsilon.

01:12:28.980 --> 01:12:31.334
And then times n.

01:12:31.334 --> 01:12:32.000
So this is good.

01:12:32.000 --> 01:12:34.910
This looks like an
FPT running time.

01:12:34.910 --> 01:12:37.600
I claim we found
the answer.

01:12:40.460 --> 01:12:40.960
OK.

01:12:40.960 --> 01:12:42.376
This is maybe the
surprising part.

01:12:42.376 --> 01:12:45.110
You had good intuition here.

01:12:45.110 --> 01:12:47.580
And the intuition is
just that-- if you're

01:12:47.580 --> 01:12:51.690
this close to optimal, and
optimal is actually an integer,

01:12:51.690 --> 01:12:55.980
and you found an integer, then
you're going to be less than 1

01:12:55.980 --> 01:12:58.260
away, which means you're
actually the same thing.

01:12:58.260 --> 01:12:58.760
OK.

01:12:58.760 --> 01:13:00.190
But let's do it more formally.

01:13:11.350 --> 01:13:13.950
So we're within a 1
plus epsilon factor.

01:13:13.950 --> 01:13:16.690
I'm going to call the
epsilon part relative error.

01:13:16.690 --> 01:13:17.190
All right.

01:13:17.190 --> 01:13:21.330
That's how much it gets
multiplied by OPT in order

01:13:21.330 --> 01:13:24.520
to compute the error bound.

01:13:24.520 --> 01:13:27.830
So the relative
error is epsilon.

01:13:27.830 --> 01:13:32.010
Epsilon is-- I guess is--
at most epsilon-- epsilon is

01:13:32.010 --> 01:13:35.860
1 over 2k, which all
I'm going to need

01:13:35.860 --> 01:13:37.800
is that this is strictly
less than 1 over k.

01:13:40.720 --> 01:13:47.577
So this means if I look
at absolute error--

01:13:47.577 --> 01:13:49.660
so in case you're not
familiar-- absolute error is

01:13:49.660 --> 01:13:56.012
I take my approximate
solution-- I subtract off,

01:13:56.012 --> 01:13:58.220
let's say the optimal
solution, did I get this right?

01:13:58.220 --> 01:14:00.440
This is a maximization problem.

01:14:00.440 --> 01:14:02.430
So yeah, my solution's
presumably-- no,

01:14:02.430 --> 01:14:04.120
it's going to be the
other way around.

01:14:04.120 --> 01:14:05.620
For maximization
problem, it's going

01:14:05.620 --> 01:14:07.460
to be-- the optimal
could be bigger

01:14:07.460 --> 01:14:09.820
than me, so I take
that difference-- this

01:14:09.820 --> 01:14:11.700
is called absolute error.

01:14:11.700 --> 01:14:12.200
OK.

01:14:12.200 --> 01:14:15.640
And relative error is when
I just divide that by OPT.

01:14:15.640 --> 01:14:16.690
That's relative error.

01:14:16.690 --> 01:14:19.486
So I have this one part already.

01:14:19.486 --> 01:14:21.610
So usually you state it in
terms of 1 plus epsilon.

01:14:21.610 --> 01:14:23.500
If you state it in
terms of relative error,

01:14:23.500 --> 01:14:24.390
the 1 disappears.

01:14:24.390 --> 01:14:26.010
You just get epsilon.

01:14:26.010 --> 01:14:28.820
The absolute error
which is OPT minus APX

01:14:28.820 --> 01:14:34.013
is I take the relative error
and I multiply it by OPT.

01:14:34.013 --> 01:14:35.430
OK.

01:14:35.430 --> 01:14:38.990
So relative error is
going to be less than 1

01:14:38.990 --> 01:14:48.362
if OPT is-- I guess--
greater than or equal to k?

01:14:48.362 --> 01:14:49.570
I have less than in my notes.

01:14:49.570 --> 01:14:53.880
But if OPT is greater
than or equal to k,

01:14:53.880 --> 01:14:56.870
then absolute error
is less than 1.

01:14:56.870 --> 01:14:58.850
Right?

01:14:58.850 --> 01:15:00.190
I hope.

01:15:00.190 --> 01:15:01.690
Let's check.

01:15:01.690 --> 01:15:06.330
The relative error is
actually OPT divided by k.

01:15:06.330 --> 01:15:06.830
Oops.

01:15:06.830 --> 01:15:08.371
No, I've got it the
wrong way around.

01:15:08.371 --> 01:15:10.700
It's correct in my notes.

01:15:10.700 --> 01:15:11.280
OK.

01:15:11.280 --> 01:15:13.220
Relative error is this thing.

01:15:13.220 --> 01:15:16.800
It's going to be strictly
less than OPT divided by k.

01:15:16.800 --> 01:15:18.160
This thing times OPT.

01:15:18.160 --> 01:15:18.660
OK.

01:15:18.660 --> 01:15:21.320
So as long as OPT is
less than or equal to k,

01:15:21.320 --> 01:15:23.130
this thing will be
strictly less than 1.
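
Written out, the bound just described looks like the following. This is a sketch for a maximization problem, assuming OPT > 0 and writing APX for the value of the approximate solution, with relative error defined as (OPT - APX)/OPT as in the lecture:

```latex
\begin{align*}
\text{relative error} &= \frac{\mathrm{OPT}-\mathrm{APX}}{\mathrm{OPT}}
  \;\le\; \varepsilon \;=\; \frac{1}{2k} \;<\; \frac{1}{k}, \\
\text{absolute error} &= \mathrm{OPT}-\mathrm{APX}
  \;=\; (\text{relative error})\cdot\mathrm{OPT}
  \;<\; \frac{\mathrm{OPT}}{k} \;\le\; 1
  \quad\text{whenever } \mathrm{OPT}\le k.
\end{align*}
```

Since OPT and APX are both integers and 0 <= OPT - APX < 1, the two values must coincide.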

01:15:25.840 --> 01:15:26.340
That's good.

01:15:26.340 --> 01:15:29.070
Absolute error less than
1 for an integer

01:15:29.070 --> 01:15:32.638
means that we actually
have the same value.

01:15:32.638 --> 01:15:35.060
I have that written
down more formally.

01:15:35.060 --> 01:15:37.010
So let's go here.

01:15:40.870 --> 01:15:56.220
So if we find an integral
solution-- of value-- value is

01:15:56.220 --> 01:15:58.690
the objective function
we're trying to maximize.

01:15:58.690 --> 01:16:06.350
Let's say we achieve some
value less than or equal to k.

01:16:10.720 --> 01:16:14.090
Which it had better be, if
OPT is less than or equal to k.

01:16:14.090 --> 01:16:37.310
Then OPT-- OK-- this is
basically doing the computation

01:16:37.310 --> 01:16:38.960
again in another way.

01:16:38.960 --> 01:16:40.520
We had 1 plus epsilon.

01:16:40.520 --> 01:16:43.460
Epsilon's chosen
to be 1 over 2k.

01:16:43.460 --> 01:16:48.130
And then k was the solution
value that we found.

01:16:48.130 --> 01:16:51.510
And so we have this relation
between OPT and the thing.

01:16:51.510 --> 01:16:56.970
And therefore-- and this works
out to exactly k plus 1/2.

01:16:56.970 --> 01:17:01.450
So this is, again, strictly
less than k plus 1.

01:17:01.450 --> 01:17:05.095
And so if we found-- we
assumed that OPT was less than

01:17:05.095 --> 01:17:06.340
or equal to k.

01:17:06.340 --> 01:17:08.590
And so now it must
actually be equal to k,

01:17:08.590 --> 01:17:12.570
because there are no integers
between k and k plus 1/2.
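
The argument above can be sketched as a small decision procedure. This is a minimal illustration, not code from the lecture: `eptas` is a hypothetical oracle that, given epsilon, returns an integer value APX with APX <= OPT <= (1 + epsilon) * APX, and `worst_case_oracle` is a mock invented here purely so the sketch can be run.

```python
from fractions import Fraction
from math import ceil
from typing import Callable


def decide_opt_at_most_k(eptas: Callable[[Fraction], int], k: int) -> bool:
    """Answer "is OPT <= k?" for an integer-valued maximization problem.

    `eptas(eps)` is a hypothetical oracle (an assumption of this sketch)
    returning an integer APX with APX <= OPT <= (1 + eps) * APX.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    eps = Fraction(1, 2 * k)  # the lecture's choice: epsilon = 1/(2k)
    apx = eptas(eps)
    # If APX <= k, then OPT <= (1 + 1/(2k)) * k = k + 1/2 < k + 1,
    # and OPT is an integer, so OPT <= k.
    # If APX > k, then OPT >= APX > k.
    return apx <= k


def worst_case_oracle(opt: int) -> Callable[[Fraction], int]:
    """Mock oracle for illustration: the smallest integer the guarantee allows."""
    return lambda eps: ceil(Fraction(opt) / (1 + eps))


if __name__ == "__main__":
    for opt, k in [(3, 5), (5, 5), (6, 5), (20, 5)]:
        print(opt, k, decide_opt_at_most_k(worst_case_oracle(opt), k))
```

Exact fractions are used so that the comparison is not disturbed by floating-point rounding. Assuming the EPTAS runs in f(1/epsilon) times a polynomial in n, plugging in epsilon = 1/(2k) gives f(2k) times a polynomial in n, which is the fixed-parameter running time the lecture is after.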

01:17:12.570 --> 01:17:13.070
OK.

01:17:13.070 --> 01:17:14.736
I probably could have
done that shorter.

01:17:17.270 --> 01:17:21.380
So when we have an EPTAS, we
exactly get an FPT algorithm.

01:17:21.380 --> 01:17:23.800
And that's-- the
reverse does not hold.

01:17:23.800 --> 01:17:26.330
There are some problems
that have FPT algorithms

01:17:26.330 --> 01:17:28.930
but do not have EPTASes.

01:17:28.930 --> 01:17:32.620
But, it's something, and it
connects these two fields.

01:17:32.620 --> 01:17:35.170
And that's all we'll say about
fixed parameter algorithms.

01:17:35.170 --> 01:17:38.060
Any final questions?

01:17:38.060 --> 01:17:38.560
Cool.

01:17:38.560 --> 01:17:40.950
See you next week.