WEBVTT

00:00:00.070 --> 00:00:02.430
The following content is
provided under a Creative

00:00:02.430 --> 00:00:03.820
Commons license.

00:00:03.820 --> 00:00:06.050
Your support will help
MIT OpenCourseWare

00:00:06.050 --> 00:00:10.150
continue to offer high-quality
educational resources for free.

00:00:10.150 --> 00:00:12.700
To make a donation or to
view additional materials

00:00:12.700 --> 00:00:16.600
from hundreds of MIT courses,
visit MIT OpenCourseWare

00:00:16.600 --> 00:00:17.263
at ocw.mit.edu.

00:00:25.846 --> 00:00:26.720
PROFESSOR: All right.

00:00:26.720 --> 00:00:31.300
Today we continue our theme
of approximation lower bounds:

00:00:31.300 --> 00:00:33.020
inapproximability.

00:00:33.020 --> 00:00:35.470
Quick recap of last time.

00:00:35.470 --> 00:00:39.530
We talked about lots of
different reductions.

00:00:39.530 --> 00:00:44.900
We, I guess, in particular
talked about PTAS-, AP-, and L-reductions.

00:00:44.900 --> 00:00:48.720
And in particular we'll be using
L-reductions almost exclusively

00:00:48.720 --> 00:00:51.880
today, except the occasional
strict reduction, which

00:00:51.880 --> 00:00:55.160
is even stronger, in a sense.

00:00:55.160 --> 00:00:57.117
So what's an L-reduction?

00:00:57.117 --> 00:00:59.450
We're trying to go from one
problem A to another problem

00:00:59.450 --> 00:01:05.530
B. We're given an instance x of
A. We convert it via function f

00:01:05.530 --> 00:01:10.790
to an instance x prime of B.
Then we imagine that somehow we

00:01:10.790 --> 00:01:12.290
obtain a solution.

00:01:12.290 --> 00:01:16.030
We don't know anything about
it. y prime to x prime.

00:01:16.030 --> 00:01:17.540
That's in B space.

00:01:17.540 --> 00:01:19.210
And then, in the
reduction, we're

00:01:19.210 --> 00:01:21.160
supposed to be able to
map any such solution

00:01:21.160 --> 00:01:28.040
y prime to x prime via g into
solution y of x in A problem--

00:01:28.040 --> 00:01:31.560
so that's given by the function
g-- such that two things hold.

00:01:31.560 --> 00:01:36.070
The first one is that
for f, optimal solution

00:01:36.070 --> 00:01:39.520
of x prime should be at
most some constant times

00:01:39.520 --> 00:01:41.930
the optimal solution to x.

00:01:41.930 --> 00:01:44.190
So we don't blow
up OPTs too much.

00:01:44.190 --> 00:01:47.660
And secondly the
absolute difference

00:01:47.660 --> 00:01:51.600
between the cost of y
versus the optimal solution

00:01:51.600 --> 00:01:56.150
for x should be within a
constant factor of this kind

00:01:56.150 --> 00:01:59.650
of gap-- additive gap
between the cost of y

00:01:59.650 --> 00:02:04.210
prime versus the optimal
solution to x prime,

00:02:04.210 --> 00:02:06.430
meaning that if we were
given a y prime that's

00:02:06.430 --> 00:02:08.430
very close to optimal
for x prime, then the y

00:02:08.430 --> 00:02:10.949
we produce is very
close to optimal for x.

00:02:10.949 --> 00:02:14.446
And we want that in
an additive sense

00:02:14.446 --> 00:02:17.070
that will imply that things are
good in a multiplicative sense.

00:02:17.070 --> 00:02:20.180
Last time we proved
that for the min case,

00:02:20.180 --> 00:02:21.490
for minimization problems.

00:02:21.490 --> 00:02:24.110
If you're curious, I
worked out the details

00:02:24.110 --> 00:02:26.840
for maximization problems.

00:02:26.840 --> 00:02:29.670
It's a little bit uglier
in terms of the arithmetic.

00:02:29.670 --> 00:02:32.920
But you again get that if
you had a constant factor

00:02:32.920 --> 00:02:35.970
approximation over here, you
preserve a constant factor

00:02:35.970 --> 00:02:39.630
approximation over
here, and you only-- you

00:02:39.630 --> 00:02:41.500
lose a reasonable factor.

00:02:44.530 --> 00:02:47.119
We also have that if you
can get a PTAS over here,

00:02:47.119 --> 00:02:49.160
so you can get an arbitrarily
good approximation,

00:02:49.160 --> 00:02:50.550
you also get a PTAS over here.

00:02:50.550 --> 00:02:52.380
That was the PTAS reduction.

00:02:52.380 --> 00:02:55.430
And it turns out the constant
in the end is roughly

00:02:55.430 --> 00:02:59.710
epsilon over alpha beta,
where alpha was this constant,

00:02:59.710 --> 00:03:02.270
and beta was this constant.

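NOTE
For reference, the two L-reduction conditions recapped above can be written as follows.
This is a sketch in notation assumed for this note: OPT_A, OPT_B, cost_A, and cost_B are
labels chosen here, while f, g, alpha, and beta are the functions and constants from the lecture.
```latex
\begin{align*}
\mathrm{OPT}_B(x') &\le \alpha \cdot \mathrm{OPT}_A(x), & x' &= f(x), \\
\left| \mathrm{cost}_A(y) - \mathrm{OPT}_A(x) \right| &\le \beta \cdot \left| \mathrm{cost}_B(y') - \mathrm{OPT}_B(x') \right|, & y &= g(y').
\end{align*}
```
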
00:03:02.270 --> 00:03:03.564
That's what we had before.

00:03:03.564 --> 00:03:04.730
It's a little bit different.

00:03:04.730 --> 00:03:06.380
For small epsilon,
it's about the same.

00:03:06.380 --> 00:03:09.310
But for large epsilon, it
does make a difference.

00:03:09.310 --> 00:03:13.470
And this is why, in
case you were confused,

00:03:13.470 --> 00:03:17.230
an L-reduction does not
imply in the maximization

00:03:17.230 --> 00:03:21.030
case an AP-reduction, because
you have this non-linear term.

00:03:21.030 --> 00:03:23.680
Here, everything was
linear in epsilon.

00:03:23.680 --> 00:03:26.320
With minimization, that's true.

00:03:26.320 --> 00:03:27.780
The L implies AP.

00:03:27.780 --> 00:03:30.350
But for maximization
it's not quite true.

00:03:30.350 --> 00:03:32.560
It's close.

00:03:32.560 --> 00:03:34.270
So there's something funny.

00:03:34.270 --> 00:03:36.270
What I said didn't quite
match this picture.

00:03:36.270 --> 00:03:38.920
That's an explanation.

00:03:38.920 --> 00:03:42.280
And then we did
a few reductions.

00:03:42.280 --> 00:03:47.680
I claimed that Max
E3SAT-E5, this was exactly

00:03:47.680 --> 00:03:49.650
three distinct
literals per clause,

00:03:49.650 --> 00:03:54.790
exactly five occurrences
of each variable in five

00:03:54.790 --> 00:03:57.150
different clauses.

00:03:57.150 --> 00:03:58.490
I claimed that was APX-complete.

00:03:58.490 --> 00:04:00.130
We didn't prove it.

00:04:00.130 --> 00:04:02.810
What we did prove is
that assuming Max 3SAT

00:04:02.810 --> 00:04:06.940
is APX-complete, we
reduce that to Max 3SAT-3,

00:04:06.940 --> 00:04:10.240
which is at most three
occurrences, each thing,

00:04:10.240 --> 00:04:12.200
first by using
expanders, and then

00:04:12.200 --> 00:04:14.620
splitting the constant
size-- constant occurrence

00:04:14.620 --> 00:04:18.190
variables-- with the cycle
of implications trick.

00:04:18.190 --> 00:04:21.019
And then we reduced from
that to bounded degree.

00:04:21.019 --> 00:04:23.800
I think we did
like max degree 4.

00:04:23.800 --> 00:04:27.320
But all of these can be
done in max degree 3.

00:04:27.320 --> 00:04:30.951
Independent set, vertex
cover, and dominating set.

00:04:30.951 --> 00:04:32.200
Vertex cover we've seen a lot.

00:04:32.200 --> 00:04:33.940
You want to cover all the
edges by choosing vertices.

00:04:33.940 --> 00:04:35.650
Dominating set, you want
to cover all the vertices

00:04:35.650 --> 00:04:36.630
by choosing vertices.

00:04:36.630 --> 00:04:38.610
Each vertex covers
its neighbor set.

00:04:38.610 --> 00:04:43.556
And independent set, for general
graphs this is super hard.

00:04:43.556 --> 00:04:45.055
But for bounded
degree graphs, there

00:04:45.055 --> 00:04:46.950
is a constant factor
approximation.

00:04:46.950 --> 00:04:50.750
This was choosing vertices
that induced no edges.

00:04:50.750 --> 00:04:55.850
So with that in mind, let's
do some more APX-reductions,

00:04:55.850 --> 00:04:58.030
APX-hardness,
using L-reductions.

00:05:00.600 --> 00:05:06.450
So the next problem we're
going to do is Max 2SAT.

00:05:12.110 --> 00:05:15.470
So because we're in the world
of optimization, in some sense

00:05:15.470 --> 00:05:19.420
the distinction between 2SAT
and 3SAT is not so important.

00:05:19.420 --> 00:05:22.340
It turns out Max 2SAT
will be APX-complete

00:05:22.340 --> 00:05:24.230
just like Max 3SAT was.

00:05:24.230 --> 00:05:25.877
So when we didn't
have Max, of course

00:05:25.877 --> 00:05:27.460
the complexities
were quite different.

00:05:27.460 --> 00:05:29.847
3SAT was hard, 2SAT was easy.

00:05:29.847 --> 00:05:31.430
With maximization,
they're going to be

00:05:31.430 --> 00:05:34.460
equivalent in this perspective.

00:05:34.460 --> 00:05:43.150
So I'm going to
do an L-reduction

00:05:43.150 --> 00:05:52.250
from independent set of,
let's say a degree 3.

00:05:55.275 --> 00:05:56.900
So it'll work with
any constant degree,

00:05:56.900 --> 00:05:59.260
but we'll get a different
number of occurrences.

00:05:59.260 --> 00:06:01.710
And the reduction
is the following.

00:06:01.710 --> 00:06:05.240
There are two types of
gadgets for every vertex.

00:06:05.240 --> 00:06:07.280
So I'm given an
independent set instance.

00:06:07.280 --> 00:06:11.240
For every vertex v, we're
going to convert that

00:06:11.240 --> 00:06:19.484
into a clause-- namely v. I
want v to be true, if possible.

00:06:19.484 --> 00:06:21.150
It's a funny way of
thinking when you're

00:06:21.150 --> 00:06:23.608
maximizing the number of clauses, because a lot of the clauses
because a lot of the clauses

00:06:23.608 --> 00:06:24.490
won't be satisfied.

00:06:24.490 --> 00:06:27.240
But you're going to try to
put v in the independent set

00:06:27.240 --> 00:06:28.040
if you can.

00:06:28.040 --> 00:06:30.060
That's the meaning
of that clause.

00:06:30.060 --> 00:06:34.420
Then for every edge--
let's say connecting v

00:06:34.420 --> 00:06:38.070
to w-- we're going to convert
that into a clause which

00:06:38.070 --> 00:06:43.220
is not v or not w.

00:06:43.220 --> 00:06:46.200
We don't want them both to
be in the independent set.

00:06:46.200 --> 00:06:48.794
That's the meaning of-- yeah.

00:06:48.794 --> 00:06:50.460
I'm trying to simulate
independent sets.

00:06:50.460 --> 00:06:52.200
So I don't want
these both to be in.

00:06:52.200 --> 00:06:55.220
This is a 2SAT clause.

00:06:55.220 --> 00:06:56.950
So what's the claim here?

00:06:56.950 --> 00:07:00.800
Suppose you have some
assignment to the variables.

00:07:00.800 --> 00:07:03.360
So there's one variable
per vertex over here.

00:07:03.360 --> 00:07:07.900
The idea is that variable should
indicate whether the vertex is

00:07:07.900 --> 00:07:10.450
in the independent set.

00:07:10.450 --> 00:07:13.290
And the claim is
that we will never

00:07:13.290 --> 00:07:18.315
violate an edge constraint, or
it's never useful to violate.

00:07:18.315 --> 00:07:21.050
The claim is that there
exists an OPT-- optimal

00:07:21.050 --> 00:07:29.075
solution-- satisfying all
of these edge constraints.

00:07:32.021 --> 00:07:33.020
So we're doing Max 2SAT.

00:07:33.020 --> 00:07:35.610
So we get a point for
every one of these things

00:07:35.610 --> 00:07:37.670
that we satisfy.

00:07:37.670 --> 00:07:40.790
And so in particular,
if you didn't

00:07:40.790 --> 00:07:45.530
get this point-- not v or
not w-- the negation of this

00:07:45.530 --> 00:07:48.740
is that they are both in.

00:07:48.740 --> 00:07:50.570
Then the idea is
that you instead

00:07:50.570 --> 00:07:54.510
take one of those vertices
out of the independent set,

00:07:54.510 --> 00:07:57.484
and that will be better for you.

00:07:57.484 --> 00:07:59.900
In general, when you put a
vertex in the independent set,

00:07:59.900 --> 00:08:02.462
it only helps you
for one clause.

00:08:02.462 --> 00:08:04.170
There's only one
occurrence of positive v

00:08:04.170 --> 00:08:05.500
in all of these things.

00:08:05.500 --> 00:08:08.710
You might have many edges
coming into a vertex,

00:08:08.710 --> 00:08:12.350
and they all prefer the
case that v is false.

00:08:12.350 --> 00:08:15.220
So things are going to be
easier if you set v to false.

00:08:15.220 --> 00:08:18.420
So if you discover a clause like
this, which is currently false,

00:08:18.420 --> 00:08:20.880
meaning both v and
w are true, you're

00:08:20.880 --> 00:08:24.170
going to gain a point
by setting v to false.

00:08:24.170 --> 00:08:26.564
You'll also lose a point, but
you'll only lose one point.

00:08:26.564 --> 00:08:27.980
Potentially, you
gain many points,

00:08:27.980 --> 00:08:31.620
but you gain at least one point
and lose at most one point

00:08:31.620 --> 00:08:36.240
by switching from both v
and w true into just one

00:08:36.240 --> 00:08:37.490
of them true.

00:08:37.490 --> 00:08:41.840
So you can always convert
without losing anything in OPT

00:08:41.840 --> 00:08:44.900
into a solution that satisfies
all edge constraints.

00:08:44.900 --> 00:08:48.000
And then we know we
have an independent set.

00:08:48.000 --> 00:08:50.070
That's what the edge
constraints say.

00:08:50.070 --> 00:08:53.006
And therefore the
remaining problem

00:08:53.006 --> 00:08:54.630
is to maximize the
number of vertices that

00:08:54.630 --> 00:08:55.853
are in the independent set.

00:09:04.350 --> 00:09:08.820
So that means if we're given
any solution y prime to this Max

00:09:08.820 --> 00:09:11.915
2SAT instance, we can convert
it back to an independent set.

00:09:11.915 --> 00:09:14.970
Now it's not quite
of the same value.

00:09:14.970 --> 00:09:20.700
In general, the optimal solution
here for the 2SAT instance

00:09:20.700 --> 00:09:23.080
is going to be the
optimal solution

00:09:23.080 --> 00:09:28.197
for the independent set instance
plus the total number of edges,

00:09:28.197 --> 00:09:30.030
because we're going to
satisfy all of these.

00:09:30.030 --> 00:09:32.030
That's what we just showed.

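NOTE
A minimal Python sketch of the independent set to Max 2SAT reduction just described,
assuming the graph is given as a vertex list and an edge list. The function names and the
(variable, is_positive) literal encoding are illustrative choices, not from the lecture.
The brute-force helpers are only there to check OPT(2SAT) = OPT(IS) + |E| on tiny graphs.
```python
from itertools import product
def independent_set_to_max2sat(vertices, edges):
    """f: one unit clause (v) per vertex, one clause (not v or not w) per edge."""
    clauses = [((v, True),) for v in vertices]
    clauses += [((v, False), (w, False)) for v, w in edges]
    return clauses
def count_satisfied(clauses, assignment):
    """Number of clauses with at least one true literal."""
    return sum(any(assignment[var] == pos for var, pos in c) for c in clauses)
def repair_to_independent_set(edges, assignment):
    """g: drop one endpoint of each violated edge; each drop loses at most one
    vertex clause and gains at least one edge clause."""
    chosen = {v for v, is_in in assignment.items() if is_in}
    for v, w in edges:
        if v in chosen and w in chosen:
            chosen.discard(v)
    return chosen
def brute_force_max2sat(variables, clauses):
    """Exhaustive optimum; exponential, for small sanity checks only."""
    return max(count_satisfied(clauses, dict(zip(variables, bits)))
               for bits in product([False, True], repeat=len(variables)))
def brute_force_independent_set(vertices, edges):
    """Exhaustive maximum independent set size; exponential."""
    best = 0
    for bits in product([False, True], repeat=len(vertices)):
        chosen = {v for v, b in zip(vertices, bits) if b}
        if not any(v in chosen and w in chosen for v, w in edges):
            best = max(best, len(chosen))
    return best
```
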
00:09:32.030 --> 00:09:36.410
So this is where we get a
kind of additive behavior,

00:09:36.410 --> 00:09:39.060
like in this L-reduction.

00:09:39.060 --> 00:09:40.800
The gap is an additive thing.

00:09:40.800 --> 00:09:42.310
But here it's a
nice fixed thing.

00:09:42.310 --> 00:09:46.080
And so these are
pretty much the same.

00:09:46.080 --> 00:09:48.160
There's just this
additive offset.

00:09:48.160 --> 00:09:51.110
So that's going to be fine in
terms of the second property.

00:09:51.110 --> 00:09:54.310
The additive difference between
one of these solutions and OPT

00:09:54.310 --> 00:09:55.460
will be exactly the same.

00:09:55.460 --> 00:09:58.017
The beta here, this
constant, will be 1.

00:09:58.017 --> 00:10:00.100
But we do have to worry
about the first condition.

00:10:00.100 --> 00:10:02.641
We need to make sure OPT doesn't
blow up too much, because we

00:10:02.641 --> 00:10:04.490
did make it bigger.

00:10:04.490 --> 00:10:08.790
So for that, all
we need is this is

00:10:08.790 --> 00:10:12.200
Omega of the number of vertices.

00:10:12.200 --> 00:10:16.840
And that's because we assumed
our graph had bounded degree,

00:10:16.840 --> 00:10:19.610
and so we can always find
an independent set of size

00:10:19.610 --> 00:10:23.190
something like n over constant.

00:10:23.190 --> 00:10:24.610
So because that's
already linear,

00:10:24.610 --> 00:10:26.630
we only added
another linear thing.

00:10:26.630 --> 00:10:32.120
Again, this is also on the
order of the number of vertices.

00:10:32.120 --> 00:10:35.200
So we're not adding too
much relative to this,

00:10:35.200 --> 00:10:37.110
because of bounded degree.

00:10:37.110 --> 00:10:37.900
Cool?

00:10:37.900 --> 00:10:40.460
So that's Max
2SAT, APX-hardness.

00:10:48.110 --> 00:10:53.190
Fun fact which I won't prove.

00:10:53.190 --> 00:11:03.050
Max E2SAT-E3 is
also APX-complete.

00:11:03.050 --> 00:11:07.030
So here we got some bounded
number of occurrences.

00:11:07.030 --> 00:11:10.680
I guess each variable is
going to appear in one

00:11:10.680 --> 00:11:13.060
plus three, four clauses.

00:11:13.060 --> 00:11:16.090
You can get that down to
three clauses per variable.

00:11:21.150 --> 00:11:21.650
OK.

00:11:27.810 --> 00:11:29.630
Now that we have
Max 2SAT, we can

00:11:29.630 --> 00:11:36.105
do another one, which is
Max not all equal 3SAT.

00:11:40.580 --> 00:11:48.060
So from SAT-land, we have
3SAT, not all equal 3SAT,

00:11:48.060 --> 00:11:49.710
and 1-in-3SAT.

00:11:49.710 --> 00:11:51.230
We're going to get all of those.

00:11:51.230 --> 00:11:53.320
Actually, we can
even get 1-in-2SAT.

00:11:53.320 --> 00:11:55.160
Little bit stronger.

00:11:55.160 --> 00:11:58.260
But let's do not all equal 3SAT.

00:11:58.260 --> 00:12:02.870
So here we are going
to do, I believe,

00:12:02.870 --> 00:12:15.920
a strict reduction from Max
2SAT, which we just proved

00:12:15.920 --> 00:12:18.510
APX-complete.

00:12:18.510 --> 00:12:20.230
Yeah.

00:12:20.230 --> 00:12:22.655
It's again in APX,
because you can, say, take

00:12:22.655 --> 00:12:24.330
your random
assignment, and you'll

00:12:24.330 --> 00:12:28.100
satisfy some constant
fraction of the clauses.

00:12:28.100 --> 00:12:30.710
And OK.

00:12:30.710 --> 00:12:32.630
So here's the reduction.

00:12:32.630 --> 00:12:34.370
Again, very easy.

00:12:34.370 --> 00:12:36.294
Suppose we're starting
from Max 2SAT,

00:12:36.294 --> 00:12:37.710
so all our clauses
look like this.

00:12:37.710 --> 00:12:40.380
These may be negated or not.

00:12:40.380 --> 00:12:50.700
And we're going to convert it
into not all equal of x, y,

00:12:50.700 --> 00:12:52.056
and a.

00:12:52.056 --> 00:12:57.250
a is a new variable, and it
appears in every single clause.

00:12:57.250 --> 00:12:57.750
OK?

00:12:57.750 --> 00:12:58.791
So this is kind of funny.

00:13:01.980 --> 00:13:04.880
So a appears everywhere.

00:13:04.880 --> 00:13:06.990
And not all equal has
this nice symmetry, right?

00:13:06.990 --> 00:13:08.240
There wasn't really
a zero or one.

00:13:08.240 --> 00:13:09.990
You can think of
them as red and blue.

00:13:09.990 --> 00:13:12.910
Doesn't matter whether red
is true or blue is true.

00:13:12.910 --> 00:13:15.130
So in particular, we
can use that symmetry

00:13:15.130 --> 00:13:18.530
to consider a as false.

00:13:18.530 --> 00:13:21.210
So by possibly
flipping everything,

00:13:21.210 --> 00:13:25.080
we can imagine
that a equals zero.

00:13:25.080 --> 00:13:29.242
If not, flip all the bits, and
you'll still be not all equal.

00:13:29.242 --> 00:13:31.700
Or all the things that were
not all equal before will still

00:13:31.700 --> 00:13:32.408
be not all equal.

00:13:32.408 --> 00:13:34.350
You'll preserve OPT.

00:13:34.350 --> 00:13:38.540
Now once you think of a as
false, then not all equal

00:13:38.540 --> 00:13:41.790
is saying that these
are not both 0, which is

00:13:41.790 --> 00:13:44.010
the same thing as saying 2SAT.

00:13:44.010 --> 00:13:45.890
Duh.

00:13:45.890 --> 00:13:46.510
OK.

00:13:46.510 --> 00:13:49.370
Again, I mean this is
saying OPT is preserved.

00:13:49.370 --> 00:13:51.560
But if you take any
solution to this problem,

00:13:51.560 --> 00:13:54.100
you first possibly flip
it so that a is zero,

00:13:54.100 --> 00:13:57.910
and then the x's and y's are just
exactly the x's and y's over here,

00:13:57.910 --> 00:14:00.450
and you'll preserve the
size of the solution.

00:14:00.450 --> 00:14:02.420
You won't get any scale
here, and you also

00:14:02.420 --> 00:14:04.840
preserved OPT exactly.

00:14:04.840 --> 00:14:06.750
So it's in particular
an L-reduction,

00:14:06.750 --> 00:14:08.750
but it's even a
strict reduction.

00:14:08.750 --> 00:14:10.820
Didn't lose anything.

00:14:10.820 --> 00:14:14.110
No additive slop or whatever.

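NOTE
A small Python sketch of the Max 2SAT to Max not-all-equal 3SAT step just described, under the
assumption that clauses are tuples of (variable, is_positive) literals. The names, including the
fresh variable name "a" as a dictionary key, are illustrative and not from the lecture.
```python
def max2sat_to_nae3sat(clauses, fresh="a"):
    """f: append the same fresh variable, as a positive literal, to every clause."""
    return [tuple(c) + ((fresh, True),) for c in clauses]
def or_satisfied(clause, assignment):
    """Ordinary SAT clause: at least one literal is true."""
    return any(assignment[var] == pos for var, pos in clause)
def nae_satisfied(clause, assignment):
    """Not-all-equal clause: both a true and a false literal are present."""
    values = {assignment[var] == pos for var, pos in clause}
    return len(values) == 2
def recover_2sat_assignment(nae_assignment, fresh="a"):
    """g: if a is true, flip every bit (NAE is symmetric under flipping), then drop a."""
    flip = nae_assignment[fresh]
    return {var: (not val if flip else val)
            for var, val in nae_assignment.items() if var != fresh}
```
With a set to false, a clause NAE(x, y, a) holds exactly when (x or y) holds, so the number of
satisfied clauses is the same on both sides.
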
00:14:14.110 --> 00:14:14.750
OK.

00:14:14.750 --> 00:14:16.430
That's nice.

00:14:16.430 --> 00:14:21.625
Next is usually called Max-Cut.

00:14:24.690 --> 00:14:25.760
You're given a graph.

00:14:25.760 --> 00:14:27.550
You want to split
it into two parts

00:14:27.550 --> 00:14:31.660
to maximize the number of
edges between the two parts.

00:14:31.660 --> 00:14:43.100
But this is the same thing
as max positive 1-in-2SAT,

00:14:43.100 --> 00:14:46.360
which is simpler
than 1-in-3SAT.

00:14:46.360 --> 00:14:51.860
You have, I mean, in a cut,
again, you have two sides.

00:14:51.860 --> 00:14:54.320
Call them true or false, or
red and blue, or whatever.

00:14:54.320 --> 00:14:57.800
You would like to assign
exactly one of these to be true.

00:14:57.800 --> 00:15:00.010
Then that edge
will be in the cut.

00:15:00.010 --> 00:15:01.630
So it's the same problem.

00:15:01.630 --> 00:15:10.440
And you can also think of
it as max positive XOR-SAT.

00:15:10.440 --> 00:15:12.320
Maybe actually call it 2XOR-SAT.

00:15:15.320 --> 00:15:15.890
Same thing.

00:15:15.890 --> 00:15:19.100
It's just every constraint is
of the form this XOR this.

00:15:19.100 --> 00:15:21.480
You want to maximize the
number of those constraints.

00:15:21.480 --> 00:15:23.510
So a lot of these problems
have different formulations

00:15:23.510 --> 00:15:25.551
depending on whether you're
thinking about logic,

00:15:25.551 --> 00:15:27.970
or thinking about
a graph problem.

00:15:27.970 --> 00:15:31.890
So we're going to get all of
these four with one reduction.

00:15:31.890 --> 00:15:35.010
And it's going to be
from probably this one.

00:15:35.010 --> 00:15:36.430
Yes.

00:15:36.430 --> 00:15:38.260
The great chain of
reductions here.

00:15:47.320 --> 00:15:51.450
So we're going to reduce
from Max not all equal 3SAT.

00:15:54.060 --> 00:15:58.130
I should mention, all of the
reductions we've been seeing,

00:15:58.130 --> 00:16:01.260
including this initial batch
where we started from 3SAT,

00:16:01.260 --> 00:16:02.902
converted it into 3SAT-3,
converted that

00:16:02.902 --> 00:16:04.610
into an independent
set, to vertex cover,

00:16:04.610 --> 00:16:09.460
to dominating set to Max
2SAT, to Max not all equal 3SAT

00:16:09.460 --> 00:16:12.870
to Max-Cut, are all in this
seminal paper by Papadimitriou

00:16:12.870 --> 00:16:15.350
and Yannakakis, 1991.

00:16:15.350 --> 00:16:18.826
This is before APX
was really a thing.

00:16:18.826 --> 00:16:20.450
It had a different
name at that point--

00:16:20.450 --> 00:16:24.530
Max SNP-- which was later proved
to be essentially equal to APX,

00:16:24.530 --> 00:16:26.710
or the completeness
version is the same.

00:16:26.710 --> 00:16:29.019
You don't need to
know about that.

00:16:29.019 --> 00:16:31.310
It comes from a different
world, but all the reductions

00:16:31.310 --> 00:16:32.570
apply here.

00:16:32.570 --> 00:16:36.230
So here is the
reduction for Max-Cut.

00:16:36.230 --> 00:16:39.040
So again we're trying
to simulate Max

00:16:39.040 --> 00:16:40.720
not all equal 3SAT.

00:16:40.720 --> 00:16:44.850
Now we actually saw in the
planar lecture, planar 3SAT,

00:16:44.850 --> 00:16:49.650
that you can reduce planar
not all equal 3SAT to planar

00:16:49.650 --> 00:16:53.470
Max-Cut, and that we used that to
get a polynomial time algorithm

00:16:53.470 --> 00:16:55.340
for planar not all equal 3SAT.

00:16:55.340 --> 00:16:57.190
We're just going
to do the reverse.

00:16:57.190 --> 00:17:00.930
And if you recall, this was
the heart of that reduction.

00:17:00.930 --> 00:17:03.810
The point is that
you can represent

00:17:03.810 --> 00:17:08.800
a not all equal clause as
a cut, as a Max-Cut problem

00:17:08.800 --> 00:17:09.599
on a triangle.

00:17:09.599 --> 00:17:11.864
Because in a triangle,
either they're all equal,

00:17:11.864 --> 00:17:14.849
and then there's no cut edges,
or they're not all equal,

00:17:14.849 --> 00:17:17.520
and then there's
exactly two cut edges.

00:17:17.520 --> 00:17:19.079
So that's for a clause of size 3.

00:17:19.079 --> 00:17:22.050
We also need to handle the
case of a clause of size 2.

00:17:22.050 --> 00:17:24.905
But that's a two-gon, I
guess, instead of a triangle.

00:17:24.905 --> 00:17:26.030
It works the same way here.

00:17:26.030 --> 00:17:29.950
You get 1 if they're not all
equal, and zero otherwise.

00:17:29.950 --> 00:17:32.120
This is shown as the zero case.

00:17:32.120 --> 00:17:32.620
OK.

00:17:32.620 --> 00:17:35.910
Now the one thing we need,
because not all equal 3SAT

00:17:35.910 --> 00:17:39.300
here, we need negation.

00:17:39.300 --> 00:17:45.480
So we're going to build each
variable and its negation

00:17:45.480 --> 00:17:46.350
with this gadget.

00:17:46.350 --> 00:17:48.220
This is a new gadget,
variable gadget.

00:17:48.220 --> 00:17:52.790
It's just a whole bunch of
edges connecting xi and xi bar.

00:17:52.790 --> 00:17:53.900
And you can make this.

00:17:53.900 --> 00:17:56.570
You can avoid the
multigraph aspect here.

00:17:56.570 --> 00:17:59.530
But let's not worry
about it here.

00:17:59.530 --> 00:18:03.950
So in general, if there are k
occurrences of this variable,

00:18:03.950 --> 00:18:07.370
then we're going to
have 2k parallel edges,

00:18:07.370 --> 00:18:11.480
because the cost over here, the
potential benefit here is 2.

00:18:11.480 --> 00:18:14.730
Again, we want to argue that
if we take an optimal solution,

00:18:14.730 --> 00:18:18.550
we can make it another optimal
solution where xi and xi

00:18:18.550 --> 00:18:21.706
bar are on opposite
sides of the cut.

00:18:21.706 --> 00:18:23.830
And the reason is, if
they're both on the same side

00:18:23.830 --> 00:18:27.690
of the cut, you're not
getting this benefit.

00:18:27.690 --> 00:18:29.930
If you flip one
of the sides, you

00:18:29.930 --> 00:18:32.150
get this huge
benefit, which is 2k.

00:18:32.150 --> 00:18:33.960
And you say, well,
how much do I lose

00:18:33.960 --> 00:18:37.470
if I flip this from one side
of the cut to the other.

00:18:37.470 --> 00:18:41.590
Well, it appears in at most k
different clauses, each of them

00:18:41.590 --> 00:18:43.480
gives me at most two points.

00:18:43.480 --> 00:18:45.970
So I'm losing, at
most, 2k points

00:18:45.970 --> 00:18:47.330
by making these opposite.

00:18:47.330 --> 00:18:48.410
But I gain 2k points.

00:18:48.410 --> 00:18:50.990
So it never hurts me
to do that switch.

00:18:50.990 --> 00:18:53.560
So I can assume these two
guys are on opposite sides,

00:18:53.560 --> 00:18:56.540
and therefore I can assume
it's sort of validly doing

00:18:56.540 --> 00:18:57.740
the negation part.

00:18:57.740 --> 00:19:01.810
And then it just reduces
to not all equal 3SAT.

00:19:01.810 --> 00:19:04.770
There's a difference between
this one, where we only

00:19:04.770 --> 00:19:07.250
get one point, and this
one we only get two points.

00:19:07.250 --> 00:19:09.212
AUDIENCE: You get two points.

00:19:09.212 --> 00:19:10.670
PROFESSOR: You get
two points here?

00:19:10.670 --> 00:19:10.910
Oh yeah.

00:19:10.910 --> 00:19:11.701
You get two points.

00:19:11.701 --> 00:19:14.820
That's why we doubled the edge.

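NOTE
A hedged Python sketch of the not-all-equal 3SAT to Max-Cut gadgets described above: a triangle per
3-literal clause, a doubled edge per 2-literal clause, and 2k parallel edges between each variable
and its negation. It assumes each clause uses distinct variables and that nodes are
(variable, is_positive) pairs; all names are illustrative, not from the lecture.
```python
from collections import Counter
from itertools import combinations
def nae3sat_to_maxcut(clauses):
    """Return a multigraph as a list of edges between literal nodes."""
    edges = []
    occurrences = Counter(var for clause in clauses for var, _ in clause)
    for clause in clauses:
        if len(clause) == 3:
            edges += list(combinations(clause, 2))  # triangle: 2 cut edges iff not all equal
        else:
            edges += [tuple(clause)] * 2  # doubled edge: 2 cut edges iff the two literals differ
    for var, k in occurrences.items():
        edges += [((var, True), (var, False))] * (2 * k)  # 2k parallel edges for k occurrences
    return edges
def cut_value(edges, side):
    """Number of edges whose endpoints lie on different sides."""
    return sum(side[u] != side[v] for u, v in edges)
def consistent_sides(assignment):
    """Place each literal node on the side given by its truth value."""
    return {(var, pos): (val == pos)
            for var, val in assignment.items() for pos in (True, False)}
```
For a consistent placement, the cut value is the total number of variable edges plus twice the
number of satisfied not-all-equal clauses.
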
00:19:14.820 --> 00:19:16.747
So that's cool.

00:19:16.747 --> 00:19:17.830
I think you would be fine.

00:19:17.830 --> 00:19:20.121
It'd still be an L-reduction
even if you have one edge.

00:19:20.121 --> 00:19:21.690
But this is nicer.

00:19:21.690 --> 00:19:23.410
And yeah.

00:19:23.410 --> 00:19:24.750
That's it.

00:19:24.750 --> 00:19:25.250
Cool.

00:19:25.250 --> 00:19:28.050
This is Max-Cut.

00:19:28.050 --> 00:19:32.020
It will be
bounded degree, based

00:19:32.020 --> 00:19:34.470
on the number of occurrences
we got, which was like four.

00:19:34.470 --> 00:19:37.600
I mean, we can use three,
and then we'll multiply.

00:19:37.600 --> 00:19:42.270
In general you can prove
Max-Cut remains APX-complete

00:19:42.270 --> 00:19:45.440
for degree three graphs.

00:19:45.440 --> 00:19:47.630
So we're not going
to prove it here.

00:19:47.630 --> 00:19:51.600
So another kind of reduction
trick to reduce degrees, just

00:19:51.600 --> 00:19:55.680
say degree 3 is possible.

00:19:55.680 --> 00:20:03.690
So Max-Cut in degree-3
graphs is APX-complete.

00:20:03.690 --> 00:20:10.100
So you could call that max
positive 1-in-2SAT-3.

00:20:10.100 --> 00:20:10.740
Maybe even E3.

00:20:13.580 --> 00:20:14.175
All right.

00:20:16.690 --> 00:20:18.587
So this gives you a flavor.

00:20:18.587 --> 00:20:20.420
This is a fun series
of reductions, each one

00:20:20.420 --> 00:20:22.150
building on the previous one.

00:20:22.150 --> 00:20:24.730
But it gives you kind
of a starting point.

00:20:24.730 --> 00:20:27.310
A lot of the problems
we're familiar with in NP

00:20:27.310 --> 00:20:29.480
completeness land, if you
just add "Max" in front,

00:20:29.480 --> 00:20:32.930
they become hard.

00:20:32.930 --> 00:20:35.960
I mean I guess Max-Cut
always had a Max in front.

00:20:35.960 --> 00:20:38.850
Max 2SAT for NP completeness,
we also had a Max in front.

00:20:38.850 --> 00:20:41.341
So those are familiar,
and they're APX-complete.

00:20:41.341 --> 00:20:42.840
All of the problems
I've described,

00:20:42.840 --> 00:20:44.298
at least for bounded
degree graphs,

00:20:44.298 --> 00:20:46.340
have constant factor
approximations.

00:20:46.340 --> 00:20:47.730
So this is the right level.

00:20:47.730 --> 00:20:49.350
They are APX-complete.

00:20:49.350 --> 00:20:51.650
And that determines
their approximability.

00:20:51.650 --> 00:20:52.710
Constant factor, no PTAS.

00:20:55.890 --> 00:21:03.060
Now it would be nice to know
which problems are hard.

00:21:03.060 --> 00:21:06.790
With NP-completeness,
and in the SAT universe,

00:21:06.790 --> 00:21:09.170
we had Schaefer's
dichotomy theorem that

00:21:09.170 --> 00:21:12.380
said-- let me cheat and
look at my notes from,

00:21:12.380 --> 00:21:17.390
I think, lecture four--
that SAT is polynomial if

00:21:17.390 --> 00:21:18.980
and only if the
clauses that you're

00:21:18.980 --> 00:21:21.270
allowed to do-- the
operations you're allowed

00:21:21.270 --> 00:21:25.491
to do with variables--
either have

00:21:25.491 --> 00:21:27.740
the property that when you
set all the variables true,

00:21:27.740 --> 00:21:28.810
everything's satisfied.

00:21:28.810 --> 00:21:31.730
Or you set all the variables
false, everything's satisfied.

00:21:31.730 --> 00:21:37.080
Or every single clause is a
conjunction of Horn clauses.

00:21:37.080 --> 00:21:43.200
Horn clauses were a few
variables, and at most one

00:21:43.200 --> 00:21:45.200
of them is positive.

00:21:45.200 --> 00:21:48.520
Or all the clauses you have
are conjunctions of Dual-Horn,

00:21:48.520 --> 00:21:54.300
which was, in every clause at
most one of them is negated,

00:21:54.300 --> 00:21:58.900
or all of the clauses
are conjunctions of 2CNF,

00:21:58.900 --> 00:22:00.660
only like 2SAT.

00:22:00.660 --> 00:22:05.010
Or what I didn't give
a name at the time,

00:22:05.010 --> 00:22:10.140
but is essentially a slight
generalization of XOR-SAT.

00:22:10.140 --> 00:22:11.580
Let me give it a name here.

00:22:11.580 --> 00:22:13.040
I'm going to call it X(N)OR-SAT.

00:22:19.350 --> 00:22:23.190
You can also phrase them as
linear equations over Z2.

00:22:32.390 --> 00:22:34.290
So this is zero and one.

00:22:34.290 --> 00:22:38.120
And it's either XOR, meaning
you take the XOR of all

00:22:38.120 --> 00:22:40.340
the things-- that's like
the summation of all things,

00:22:40.340 --> 00:22:42.370
or it's X(N)OR, meaning
when you take that sum,

00:22:42.370 --> 00:22:44.420
it should equal zero.

00:22:44.420 --> 00:22:46.300
And such systems
of linear equations

00:22:46.300 --> 00:22:52.250
can be solved in polynomial
time using Gaussian elimination

00:22:52.250 --> 00:22:53.920
over Z2.

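NOTE
One way to sketch the Gaussian elimination over Z2 mentioned above, in Python. It assumes each
constraint is given as a set of variable indices plus a right-hand side bit (1 for an XOR
constraint, 0 for an XNOR-style constraint); the function name and input format are
assumptions made for this note, not from the lecture.
```python
def solve_gf2(rows, rhs, num_vars):
    """Solve XOR equations over Z_2 by Gaussian elimination.
    rows[i] is a set of variable indices whose XOR must equal rhs[i] (0 or 1).
    Returns one satisfying assignment as a list of 0/1, or None if inconsistent."""
    mask = (1 << num_vars) - 1
    # Pack each equation into an int: bit j is the coefficient of x_j, bit num_vars is the right-hand side.
    equations = [sum(1 << j for j in row) | (b << num_vars) for row, b in zip(rows, rhs)]
    pivots = {}  # pivot column mapped to its fully reduced equation
    for eq in equations:
        for col, pivot_eq in pivots.items():
            if (eq >> col) & 1:
                eq ^= pivot_eq  # clear an existing pivot column from the new equation
        low = eq & mask
        if low == 0:
            if eq >> num_vars:
                return None  # the equation reduced to 0 = 1
            continue  # redundant equation
        col = (low & -low).bit_length() - 1  # lowest variable still present becomes the pivot
        for c in pivots:
            if (pivots[c] >> col) & 1:
                pivots[c] ^= eq  # keep earlier pivot rows free of the new pivot column
        pivots[col] = eq
    solution = [0] * num_vars  # free variables default to 0
    for col, pivot_eq in pivots.items():
        solution[col] = pivot_eq >> num_vars
    return solution
```
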
00:22:53.920 --> 00:22:56.060
And all of the things
I just mentioned

00:22:56.060 --> 00:22:59.420
are all the situations
where SAT is polynomial.

00:22:59.420 --> 00:23:03.810
Every other type of clause,
SAT is NP-complete--

00:23:03.810 --> 00:23:05.607
or set of classes.

00:23:05.607 --> 00:23:06.690
Now why do I mention this?

00:23:06.690 --> 00:23:11.520
Because there is an
analogous theorem for it's

00:23:11.520 --> 00:23:15.690
not quite SAT, because we
need something like this Max.

00:23:15.690 --> 00:23:17.690
We need to turn it into
an optimization problem.

00:23:17.690 --> 00:23:21.050
SAT is not normally an
optimization problem by itself.

00:23:21.050 --> 00:23:25.270
And characterizing how
approximable those problems are.

00:23:25.270 --> 00:23:32.750
Now it is a complicated
theorem-- so complicated,

00:23:32.750 --> 00:23:35.200
that I don't want to
write it on the board,

00:23:35.200 --> 00:23:36.670
because there's a lot of cases.

00:23:36.670 --> 00:23:39.140
But the point is,
it's exhaustive.

00:23:39.140 --> 00:23:41.166
It will tell you if
you have anything

00:23:41.166 --> 00:23:42.540
of the type we
had with Schaefer,

00:23:42.540 --> 00:23:44.515
which was you define a
kind of clause function.

00:23:44.515 --> 00:23:46.190
It's either satisfied or not.

00:23:46.190 --> 00:23:48.120
It applies to some
number of variables.

00:23:48.120 --> 00:23:51.150
And then, once you've
defined that clause type,

00:23:51.150 --> 00:23:52.830
you can apply it
to any combination

00:23:52.830 --> 00:23:54.450
of variables you want.

00:23:54.450 --> 00:23:57.400
That family of problems
with no other restrictions

00:23:57.400 --> 00:23:58.505
is what we get.

00:23:58.505 --> 00:24:03.590
And I will just tell you
what the problems are.

00:24:03.590 --> 00:24:04.547
There's four of them.

00:24:04.547 --> 00:24:06.380
This is part of what
makes the theorem long,

00:24:06.380 --> 00:24:08.750
but also extremely powerful.

00:24:08.750 --> 00:24:12.340
The first dichotomy
is max versus min.

00:24:12.340 --> 00:24:15.580
And then the second
dichotomy is they

00:24:15.580 --> 00:24:18.162
call it CSP for constraint
satisfaction problem.

00:24:18.162 --> 00:24:19.620
So you have a bunch
of constraints.

00:24:19.620 --> 00:24:21.970
You want to satisfy
as many as possible.

00:24:21.970 --> 00:24:26.750
So this would be the number
of satisfied constraints

00:24:26.750 --> 00:24:29.940
is your objective, or
your cost function.

00:24:33.240 --> 00:24:37.870
Or the other version is what's
called the ones problem, or max

00:24:37.870 --> 00:24:39.530
ones, or min ones.

00:24:39.530 --> 00:24:42.560
This is the number
of true variables.

00:24:48.010 --> 00:24:52.060
So again, we have a
Schaefer-like SAT style

00:24:52.060 --> 00:24:53.132
of set of clauses.

00:24:53.132 --> 00:24:55.590
Either we want to maximize the
number of satisfied clauses,

00:24:55.590 --> 00:24:58.170
or we want to minimize the
number of satisfied clauses,

00:24:58.170 --> 00:25:02.360
or we want to maximize the
number of true variables

00:25:02.360 --> 00:25:03.980
and satisfy everything.

00:25:03.980 --> 00:25:06.360
Or we want to minimize the
number of true variables

00:25:06.360 --> 00:25:09.040
and satisfy everything.
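
NOTE
To make the two kinds of objective concrete, here is a brute-force sketch for tiny instances. It assumes a constraint is a (predicate, scope) pair, where scope is a tuple of variable indices. The names count_satisfied, count_ones and brute_force are hypothetical, and the exhaustive search is exponential, so this only illustrates the definitions. It covers the constraint-counting objective in its Max form and both Ones objectives.
```python
from itertools import product
def count_satisfied(constraints, assignment):
    """CSP objective: number of constraints the assignment satisfies."""
    return sum(1 for pred, scope in constraints if pred(*(assignment[v] for v in scope)))
def count_ones(assignment):
    """Ones objective: number of variables set to true."""
    return sum(assignment)
def brute_force(constraints, num_vars):
    """Exhaustively compute the Max CSP, Max Ones and Min Ones optima."""
    all_assignments = list(product((0, 1), repeat=num_vars))
    # The Ones problems only range over assignments that satisfy everything.
    feasible = [a for a in all_assignments if count_satisfied(constraints, a) == len(constraints)]
    return {
        "max_csp": max(count_satisfied(constraints, a) for a in all_assignments),
        "max_ones": max(map(count_ones, feasible)) if feasible else None,
        "min_ones": min(map(count_ones, feasible)) if feasible else None,
    }
```
With three pairwise XOR constraints on three variables (a triangle), no assignment satisfies everything, so both Ones problems are infeasible, while Max CSP still has a well-defined optimum.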

00:25:09.040 --> 00:25:09.540
OK.

00:25:09.540 --> 00:25:11.930
Now obviously, if the
SAT problem is hard,

00:25:11.930 --> 00:25:13.840
it's going to be
hard to do this.

00:25:13.840 --> 00:25:15.710
But it's still interesting.

00:25:15.710 --> 00:25:17.000
You can still think about it.

00:25:17.000 --> 00:25:23.260
And even when the SAT problem
is easy, Max ones can be hard.

00:25:23.260 --> 00:25:25.650
So I am going to--
I wrote it all down,

00:25:25.650 --> 00:25:27.280
and then I realized
how long it was.

00:25:27.280 --> 00:25:29.060
And so I will just show you.

00:25:29.060 --> 00:25:32.460
Imagine I just hand-wrote this.

00:25:32.460 --> 00:25:35.310
So this is the easy case.

00:25:35.310 --> 00:25:36.401
Max CSP.

00:25:36.401 --> 00:25:38.400
So we want to maximize
the number of constraints

00:25:38.400 --> 00:25:40.990
that we satisfy.

00:25:40.990 --> 00:25:45.430
And I'm going to characterize
when it is polynomial.

00:25:45.430 --> 00:25:47.710
Now here, PO I haven't
defined, but that's

00:25:47.710 --> 00:25:49.737
the analog of P for
optimization problems.

00:25:49.737 --> 00:25:51.570
So it's the set of all
optimization problems

00:25:51.570 --> 00:25:55.340
that are in P that have a
polynomial-time algorithm

00:25:55.340 --> 00:25:57.110
to solve them exactly.

00:25:57.110 --> 00:25:58.520
So it turns out
in this situation

00:25:58.520 --> 00:26:01.330
you are either polynomial
or APX-complete.

00:26:01.330 --> 00:26:04.940
So it's only about constant
factor versus perfect.

00:26:04.940 --> 00:26:08.310
There's never a PTAS, unless
there's a polynomial time

00:26:08.310 --> 00:26:09.012
algorithm.

00:26:09.012 --> 00:26:10.470
And the cases should
look familiar.

00:26:10.470 --> 00:26:13.170
It's either when you set
all the variables true

00:26:13.170 --> 00:26:15.860
or all the variables false,
that satisfies everything.

00:26:15.860 --> 00:26:17.690
In that case, Max CSP
is, of course, easy.

00:26:17.690 --> 00:26:19.790
You can satisfy everything.
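
NOTE
A small sketch of this easy case, using the same assumed (predicate, scope) encoding of constraints. It tests one concrete instance, whereas the theorem's condition is on the constraint types themselves. The name trivially_satisfiable is hypothetical.
```python
def trivially_satisfiable(constraints, num_vars):
    """Return an all-false or all-true assignment satisfying every constraint, else None."""
    for value in (0, 1):
        # Feed the same value to every argument of every constraint.
        if all(pred(*([value] * len(scope))) for pred, scope in constraints):
            return [value] * num_vars
    return None
```
When this returns an assignment, every constraint is satisfied, so that assignment is an optimal Max CSP solution.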

00:26:19.790 --> 00:26:23.150
Another case is if
you write the clauses

00:26:23.150 --> 00:26:26.510
in disjunctive normal
form-- this is a new type

00:26:26.510 --> 00:26:29.360
that we hadn't seen before,
all your clauses are--

00:26:29.360 --> 00:26:32.450
when you write them in DNF,
they have exactly two terms.

00:26:32.450 --> 00:26:36.265
So it's the OR of two things
that are anded together.

00:26:36.265 --> 00:26:36.765
Sorry.

00:26:36.765 --> 00:26:38.050
There's an "or" in the middle.

00:26:38.050 --> 00:26:40.340
And you have a bunch of
things anded together

00:26:40.340 --> 00:26:41.630
in each of my hands.

00:26:41.630 --> 00:26:44.760
And all the ones in here are
positive, and all the ones

00:26:44.760 --> 00:26:46.110
in here are negative.

00:26:46.110 --> 00:26:49.090
If every clause looks
like that, then you

00:26:49.090 --> 00:26:51.670
can solve this in
polynomial time.

00:26:51.670 --> 00:26:56.180
And in all other cases, this
problem is APX-complete.

00:26:56.180 --> 00:26:59.582
So that's a nice, very
clean characterization.

00:26:59.582 --> 00:27:01.998
AUDIENCE: Wait. [INAUDIBLE]
that we learned about earlier.

00:27:01.998 --> 00:27:03.390
Is this the [INAUDIBLE]?

00:27:03.390 --> 00:27:04.015
PROFESSOR: Yes.

00:27:04.015 --> 00:27:05.530
This is disjunctive normal form.

00:27:05.530 --> 00:27:09.390
So it's the or of ands.

00:27:09.390 --> 00:27:12.590
Usually we deal
with CNF, ands of ors.

00:27:12.590 --> 00:27:17.530
But for this
characterization, every clause

00:27:17.530 --> 00:27:19.810
can be uniquely
converted into a DNF,

00:27:19.810 --> 00:27:21.150
and uniquely converted into CNF.

00:27:21.150 --> 00:27:23.990
So that's a well-defined
thing to say.

00:27:26.405 --> 00:27:28.530
With Schaefer, we just had
to look at the CNF form.

00:27:28.530 --> 00:27:31.990
But here we get a
new set of things.

00:27:31.990 --> 00:27:33.130
All right.

00:27:33.130 --> 00:27:35.350
That was one out of four.

00:27:35.350 --> 00:27:37.240
Max or Min, CSP or Ones.

00:27:37.240 --> 00:27:40.486
Next one is Max Ones.

00:27:40.486 --> 00:27:41.860
This is not the
most complicated.

00:27:44.540 --> 00:27:46.390
But let's go through them.

00:27:46.390 --> 00:27:49.862
So again, we want to maximize
the number of true variables.

00:27:49.862 --> 00:27:51.945
So of course, if we set
all the variables to true,

00:27:51.945 --> 00:27:55.570
and everything is satisfied,
yay, polynomial, OK?

00:27:55.570 --> 00:27:58.180
But curiously, if you set all
the variables to false,

00:27:58.180 --> 00:28:02.910
and that satisfies everything,
that's going to be here.

00:28:02.910 --> 00:28:05.230
That's Poly-APX-complete.

00:28:05.230 --> 00:28:08.050
Poly-APX-complete, you can
translate to something like n

00:28:08.050 --> 00:28:10.160
to the 1 minus
epsilon, approximable,

00:28:10.160 --> 00:28:12.850
and that's the best you can do.

00:28:12.850 --> 00:28:15.620
Or there's a lower bound of
n to the 1 minus epsilon.

00:28:15.620 --> 00:28:18.231
Upper bound might
be n or something.

00:28:18.231 --> 00:28:18.730
OK.

00:28:18.730 --> 00:28:23.180
That's because, when maximizing ones,
setting things all to false

00:28:23.180 --> 00:28:24.450
does not necessarily help you.

00:28:24.450 --> 00:28:26.800
There are some more
positive cases.

00:28:26.800 --> 00:28:28.730
If you have a Dual-Horn set up.

00:28:28.730 --> 00:28:31.270
So this is another one of
the Schaefer situations.

00:28:31.270 --> 00:28:34.675
If every clause when you write
it in CNF every subclause

00:28:34.675 --> 00:28:37.780
is Dual-Horn, at most
one negated thing,

00:28:37.780 --> 00:28:40.070
that is a good situation
for maximizing ones,

00:28:40.070 --> 00:28:44.170
because only one of
them has to be negative.

00:28:44.170 --> 00:28:48.646
But with Horn, for example,
you get Poly-APX-complete,

00:28:48.646 --> 00:28:51.020
because we have an asymmetry
here between ones and zeros.

00:28:51.020 --> 00:28:51.968
Question?

00:28:51.968 --> 00:28:53.160
AUDIENCE: In this list,
do we just read down it

00:28:53.160 --> 00:28:54.210
until we hit the thing?

00:28:54.210 --> 00:28:55.010
PROFESSOR: Yes.

00:28:55.010 --> 00:28:55.860
Good question.

00:28:55.860 --> 00:29:01.290
This is a sequential algorithm
for determining what you have.

00:29:01.290 --> 00:29:03.430
If any of these says,
oh, you're in PO,

00:29:03.430 --> 00:29:05.760
then you should stop reading
the rest of the theorem.

00:29:05.760 --> 00:29:09.640
The way they write the theorem
is probably clearer.

00:29:09.640 --> 00:29:11.386
They write an else
if for each one,

00:29:11.386 --> 00:29:13.260
but I wrote it backwards,
so it's hard for me

00:29:13.260 --> 00:29:14.730
to write else if.

00:29:14.730 --> 00:29:15.410
Yeah.

00:29:15.410 --> 00:29:18.530
Occasionally I'll mention
that the previous things

00:29:18.530 --> 00:29:19.030
don't apply.

00:29:19.030 --> 00:29:20.860
But you should read
this sequentially.

00:29:24.100 --> 00:29:24.600
OK.

00:29:24.600 --> 00:29:25.870
So it was Dual-Horn.

00:29:25.870 --> 00:29:31.300
Another polynomial case is
what I call 2-X(N)OR-SAT,

00:29:31.300 --> 00:29:32.590
where the N is in parentheses.

00:29:32.590 --> 00:29:35.110
So in other words, you
have linear equations.

00:29:35.110 --> 00:29:39.300
Each equation only has two
terms, sort of like 2SAT.

00:29:39.300 --> 00:29:41.330
And you have equations
that say equal zero

00:29:41.330 --> 00:29:44.120
or equal one on those two terms.

00:29:44.120 --> 00:29:45.870
That is also
polynomially solvable.

00:29:45.870 --> 00:29:47.490
This is a special case.

00:29:47.490 --> 00:29:49.650
We didn't need the
2 for Schaefer.

00:29:49.650 --> 00:29:54.490
Here we need the 2, because if
you have X(N)OR-SAT in general.

00:29:54.490 --> 00:29:57.760
And when I say this, I
mean that all constraints

00:29:57.760 --> 00:29:58.940
fall into this category.

00:29:58.940 --> 00:30:00.990
If all constraints
are of this form,

00:30:00.990 --> 00:30:03.080
all clauses are of this
form, then you're good.

00:30:03.080 --> 00:30:06.420
If all clauses are of
the form X(N)OR-SAT,

00:30:06.420 --> 00:30:10.450
but they're not in this class,
they're not all of length 2,

00:30:10.450 --> 00:30:12.800
then the problem
becomes APX-complete,

00:30:12.800 --> 00:30:16.630
by contrast to
Schaefer, where, I mean,

00:30:16.630 --> 00:30:19.370
deciding whether you can satisfy
all those things is easy--

00:30:19.370 --> 00:30:22.670
maximizing the number of ones
when you do it is APX-complete.

00:30:22.670 --> 00:30:25.950
So that's particularly
interesting.

00:30:25.950 --> 00:30:27.700
AUDIENCE: Not all equal
3SAT fall in that?

00:30:27.700 --> 00:30:28.610
Is that?

00:30:32.620 --> 00:30:35.330
PROFESSOR: Not all equal 3SAT.

00:30:35.330 --> 00:30:37.527
AUDIENCE: Those are
X(N)OR clauses, right?

00:30:37.527 --> 00:30:38.110
PROFESSOR: No.

00:30:38.110 --> 00:30:39.526
They should not
be X(N)OR clauses,

00:30:39.526 --> 00:30:40.930
because it's NP-complete.

00:30:40.930 --> 00:30:42.800
And when you have
X(N)OR clauses,

00:30:42.800 --> 00:30:45.650
it's always polynomial to
decide whether you can satisfy

00:30:45.650 --> 00:30:47.040
everything.

00:30:47.040 --> 00:30:49.645
So it's in the other case.

00:30:52.570 --> 00:30:54.070
But good question,
because we should

00:30:54.070 --> 00:30:56.920
be getting APX-completeness.

00:30:56.920 --> 00:30:58.837
Yeah, but Max not all
equal 3SAT is different.

00:30:58.837 --> 00:31:01.128
Here we're trying to maximize
the number of clauses that

00:31:01.128 --> 00:31:01.810
were satisfied.

00:31:01.810 --> 00:31:04.309
So if you have not
all equal 3SAT,

00:31:04.309 --> 00:31:06.350
and you want to maximize
the number of ones, that

00:31:06.350 --> 00:31:08.724
means first you have to satisfy
not all equal 3SAT, which

00:31:08.724 --> 00:31:09.610
is hard.

00:31:09.610 --> 00:31:11.760
So that's going
to fall into this.

00:31:11.760 --> 00:31:13.800
The bottom one is feasibility.

00:31:13.800 --> 00:31:15.930
Just finding a feasible
solution is NP hard.

00:31:18.590 --> 00:31:24.630
The X(N)OR-SAT is this thing--
linear equations over Z2.

00:31:24.630 --> 00:31:27.139
And it could be equal
to 0, or equal to 1.

00:31:27.139 --> 00:31:28.930
This is what you might
call an XOR clause,

00:31:28.930 --> 00:31:32.940
or this is an XOR clause,
this is an X(N)OR clause.

00:31:32.940 --> 00:31:36.890
So if they don't all have size
two, then you're APX-complete.

00:31:36.890 --> 00:31:41.400
But you can find a solution
by Schaefer's theorem.

00:31:41.400 --> 00:31:42.280
OK.

00:31:42.280 --> 00:31:45.390
So as I mentioned, Horn
clauses and 2SAT clauses

00:31:45.390 --> 00:31:46.570
are actually really hard.

00:31:46.570 --> 00:31:49.320
They're Poly-APX-complete,
n to the 1 minus epsilon.

00:31:49.320 --> 00:31:51.350
Also these are all
situations where

00:31:51.350 --> 00:31:54.724
you can find feasible solutions
easily by Schaefer, like when

00:31:54.724 --> 00:31:57.140
you can set them all false,
and that satisfies everything.

00:31:57.140 --> 00:31:58.020
It doesn't help you
when you're trying

00:31:58.020 --> 00:31:59.311
to maximize the number of ones.

00:31:59.311 --> 00:32:01.916
It just gets you to zero.

00:32:01.916 --> 00:32:03.040
Then you want to do better.

00:32:03.040 --> 00:32:06.680
And it's really hard to
get any better factor.

00:32:06.680 --> 00:32:08.630
One more situation.

00:32:08.630 --> 00:32:09.130
Sorry.

00:32:11.934 --> 00:32:13.350
There's a slight
distinction here.

00:32:13.350 --> 00:32:15.800
So suppose you have
the feature that you

00:32:15.800 --> 00:32:20.290
can set one variable
true, and the rest false.

00:32:20.290 --> 00:32:22.650
If that satisfies all your
constraints, then great,

00:32:22.650 --> 00:32:24.467
you found the value 1.

00:32:24.467 --> 00:32:26.300
And there's a big
difference between 0 and 1

00:32:26.300 --> 00:32:28.216
when you're looking at
relative approximation,

00:32:28.216 --> 00:32:30.950
because anything
divided by 0 is huge.

00:32:30.950 --> 00:32:32.880
So it's really hard
to get a good factor.

00:32:32.880 --> 00:32:33.760
That's the situation.

00:32:33.760 --> 00:32:35.260
Distinguishing
between 0 and greater

00:32:35.260 --> 00:32:39.150
than 0, which is an infinite
ratio, it could be NP-hard.

00:32:39.150 --> 00:32:41.470
That's when you,
in this situation,

00:32:41.470 --> 00:32:42.980
we set all the variables false.

00:32:42.980 --> 00:32:43.680
You get zero.

00:32:43.680 --> 00:32:46.690
But finding any other solution
is going to be NP-hard.

00:32:46.690 --> 00:32:48.280
Here, if you can
at least get 1, you

00:32:48.280 --> 00:32:50.930
can get an N approximation,
whereas here you

00:32:50.930 --> 00:32:52.320
can't get an N approximation.

00:32:52.320 --> 00:32:55.290
Here you can get
Poly approximation.

00:32:55.290 --> 00:32:57.700
And finally, if you have none
of the above situations,

00:32:57.700 --> 00:33:01.950
then testing feasibility is
NP-hard by Schaefer's theorem.

00:33:01.950 --> 00:33:04.310
So it's like Schaefer's
theorem, but some of the cases

00:33:04.310 --> 00:33:08.200
split up into parts.

00:33:08.200 --> 00:33:09.660
Now, that was maximization.

00:33:09.660 --> 00:33:10.510
Question?

00:33:10.510 --> 00:33:12.510
AUDIENCE: So, what's
special about 1 here?

00:33:12.510 --> 00:33:15.977
It seems to me if you
replace that 1 by K

00:33:15.977 --> 00:33:17.310
it should still be in that case.

00:33:17.310 --> 00:33:18.390
PROFESSOR: This case.

00:33:18.390 --> 00:33:19.330
AUDIENCE: Yeah.

00:33:19.330 --> 00:33:22.620
If I just replace that one
with a fixed K. Like 2.

00:33:22.620 --> 00:33:23.880
PROFESSOR: Yes.

00:33:23.880 --> 00:33:27.290
So that problem will
still be-- so if you

00:33:27.290 --> 00:33:30.000
can set all but
K of them false, I

00:33:30.000 --> 00:33:32.000
think you can also set
all but one of them false,

00:33:32.000 --> 00:33:33.430
and still satisfy.

00:33:33.430 --> 00:33:34.190
Yeah.

00:33:34.190 --> 00:33:35.310
So here's the thing.

00:33:35.310 --> 00:33:36.680
This is all variables, right?

00:33:36.680 --> 00:33:39.440
So the idea is you
have tons of variables,

00:33:39.440 --> 00:33:41.857
and let's say two of
them are set to true.

00:33:41.857 --> 00:33:43.440
So if you look at a
clause, the clause

00:33:43.440 --> 00:33:46.685
might just apply to these
guys-- all the false guys--

00:33:46.685 --> 00:33:49.060
or it might apply to false
guys and one of the true guys,

00:33:49.060 --> 00:33:52.595
or it might apply to false
guys and two of the true guys.

00:33:52.595 --> 00:33:54.220
All of those would
have to be satisfied

00:33:54.220 --> 00:33:56.050
in your hypothetical situation.

00:33:56.050 --> 00:33:58.810
If that's true, that implies
that all the clauses are

00:33:58.810 --> 00:34:00.950
satisfied when only one
of them is set true,

00:34:00.950 --> 00:34:02.400
and the rest are false.

00:34:02.400 --> 00:34:04.980
So your case would fall
into this case as well,

00:34:04.980 --> 00:34:07.260
and you'd get
Poly-APX-completeness again.

00:34:07.260 --> 00:34:10.040
So it's not totally obvious
when these things apply.

00:34:10.040 --> 00:34:14.256
But this is the complete
list of different cases.

00:34:14.256 --> 00:34:14.839
Any questions?

00:34:17.480 --> 00:34:19.530
OK.

00:34:19.530 --> 00:34:21.440
Two out of four.

00:34:21.440 --> 00:34:25.460
Next one, this is the
longest one, is Min CSP.

00:34:25.460 --> 00:34:28.639
Now here we don't get as
nice a characterization,

00:34:28.639 --> 00:34:31.159
because there are some
open problems left.

00:34:31.159 --> 00:34:33.420
I haven't checked whether
all of these open problems

00:34:33.420 --> 00:34:36.610
remain open, but as of
2001 they were open,

00:34:36.610 --> 00:34:38.639
which was a while ago.

00:34:38.639 --> 00:34:41.800
And we can check whether
there's more explicit status.

00:34:41.800 --> 00:34:45.310
But I have the status
as of this paper here.

00:34:45.310 --> 00:34:47.150
So Min CSP.

00:34:47.150 --> 00:34:51.130
This is, you want to minimize
the number of constraints

00:34:51.130 --> 00:34:54.122
that are satisfied,
whereas before we

00:34:54.122 --> 00:34:55.080
looked at maximization.

00:34:55.080 --> 00:34:58.740
There are only three cases
which were something like this.

00:34:58.740 --> 00:35:02.270
Again, if setting all the
variables false or true

00:35:02.270 --> 00:35:08.810
satisfies all the clauses,
this is good, apparently.

00:35:08.810 --> 00:35:10.830
That's less obvious
in this case.

00:35:10.830 --> 00:35:12.240
In general,
minimization problems

00:35:12.240 --> 00:35:14.365
behave quite differently
from maximization problems

00:35:14.365 --> 00:35:16.110
in terms of approximability.

00:35:16.110 --> 00:35:17.970
Maximization is
generally easier to

00:35:17.970 --> 00:35:22.130
approximate, because your
solutions tend to be big,

00:35:22.130 --> 00:35:24.370
and it's easier to
approximate big things.

00:35:24.370 --> 00:35:27.830
Minimization-- small-- is hard.

00:35:27.830 --> 00:35:31.380
Also we had the
situation from Max CSP,

00:35:31.380 --> 00:35:33.540
if, when you write it
in DNF, there are exactly

00:35:33.540 --> 00:35:35.107
two terms for every clause.

00:35:35.107 --> 00:35:36.690
One of them is all
positive variables,

00:35:36.690 --> 00:35:38.356
and the other is all
negative variables.

00:35:38.356 --> 00:35:40.470
That's also easy.

00:35:40.470 --> 00:35:46.270
And here's a new case
of APX-completeness.

00:35:46.270 --> 00:35:48.610
So if the problem
you're trying to solve

00:35:48.610 --> 00:35:51.290
is exactly this
problem, they call this,

00:35:51.290 --> 00:35:54.190
I think, implication
hitting set.

00:35:54.190 --> 00:35:57.910
So you have a clause which
lets you say x1 implies

00:35:57.910 --> 00:36:01.620
x2 for any two variables.

00:36:01.620 --> 00:36:06.010
And you have some set of
clauses like this, where you

00:36:06.010 --> 00:36:08.720
can say here's five variables.

00:36:08.720 --> 00:36:10.680
The OR of them is true.

00:36:10.680 --> 00:36:13.479
No negation here.

00:36:13.479 --> 00:36:15.520
So this is called hitting
set, meaning I give you

00:36:15.520 --> 00:36:19.370
a set of vertices in a graph, and I want at least one of them
and I want at least one of them

00:36:19.370 --> 00:36:22.320
to be hit, to be
included, to be true.

00:36:22.320 --> 00:36:24.700
And we're trying to minimize
the number of such things

00:36:24.700 --> 00:36:26.533
that we satisfy.

00:36:26.533 --> 00:36:31.490
So this turns out to be hard,
but only mildly: there's no PTAS,

00:36:31.490 --> 00:36:35.600
but there's a constant
factor approximation.

00:36:35.600 --> 00:36:38.360
And then we have
these four cases

00:36:38.360 --> 00:36:41.770
which show that they are
equivalent to known studied

00:36:41.770 --> 00:36:42.860
problems.

00:36:42.860 --> 00:36:44.720
So there are these
special cases.

00:36:44.720 --> 00:36:48.414
Other than these getting
any approximation

00:36:48.414 --> 00:36:49.830
factor of less
than infinity would

00:36:49.830 --> 00:36:52.430
require you to distinguish
between OPT is zero,

00:36:52.430 --> 00:36:55.400
and OPT is greater than
zero, and it's NP-complete,

00:36:55.400 --> 00:36:57.980
unless you have these.

00:36:57.980 --> 00:37:00.970
So there are some special
cases like Min Uncut.

00:37:00.970 --> 00:37:03.150
This is the reverse of Max Cut.

00:37:03.150 --> 00:37:05.880
You want to minimize the
number of uncut edges.

00:37:05.880 --> 00:37:10.320
So that plus Max Cut should be
equal to the number of edges.
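
NOTE
A worked version of that identity, in notation assumed here rather than taken from the lecture: for a graph G = (V, E) and a vertex subset S, write cut(S) for the number of edges with exactly one endpoint in S, and uncut(S) for the number of remaining edges.
```latex
\[
\mathrm{cut}(S) + \mathrm{uncut}(S) = |E|
\quad\Longrightarrow\quad
\min_{S} \mathrm{uncut}(S) = |E| - \max_{S} \mathrm{cut}(S).
\]
```
The optimal sets coincide, but approximation ratios do not transfer. If, say, |E| = 100 and the maximum cut has 99 edges, then a cut of 90 edges is within 10% of optimal for Max Cut, yet it leaves 10 uncut edges against an optimum of 1.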

00:37:10.320 --> 00:37:12.920
But the approximability of the
two sides is quite different.

00:37:12.920 --> 00:37:16.480
And here the best
results are APX-hardness,

00:37:16.480 --> 00:37:19.900
and a log n upper bound
for approximation.

00:37:19.900 --> 00:37:21.870
So that's a little
bit harder maybe.

00:37:21.870 --> 00:37:25.110
It's at least as hard as this.

00:37:25.110 --> 00:37:30.480
And that happens when you are
in the 2-X(N)OR-SAT situation,

00:37:30.480 --> 00:37:33.320
something we saw
from the last slide.

00:37:33.320 --> 00:37:35.820
So here it reduces to
this other problem.

00:37:35.820 --> 00:37:39.025
Basically the same, but the
X(N)ORs don't buy you anything

00:37:39.025 --> 00:37:39.525
new.

00:37:42.580 --> 00:37:44.860
In the case of 2SAT,
you get a problem

00:37:44.860 --> 00:37:47.950
known as Min 2CNF deletion.

00:37:47.950 --> 00:37:51.780
And it's similar-- APX-hard,
and best approximation

00:37:51.780 --> 00:37:54.680
is log times log log.

00:37:54.680 --> 00:37:57.880
If in the case where you
have X(N)OR-SAT in general,

00:37:57.880 --> 00:38:01.330
but it's not all of the linear
equations have only two terms--

00:38:01.330 --> 00:38:05.110
so we have some larger ones--
then it turns out to be

00:38:05.110 --> 00:38:07.000
equivalent to Nearest Codeword.

00:38:07.000 --> 00:38:10.120
So it turns out you can write
all such equations using

00:38:10.120 --> 00:38:13.260
either equations of length,
by using equations of length 3

00:38:13.260 --> 00:38:13.760
always.

00:38:13.760 --> 00:38:15.750
So this is linear equation.

00:38:15.750 --> 00:38:20.820
This should equal 1, or
this says equals zero.

00:38:20.820 --> 00:38:23.276
And from that, you can
construct all such things.

00:38:23.276 --> 00:38:24.525
This is a really hard problem.

00:38:27.610 --> 00:38:29.800
Poly-APX-hardness is not known.

00:38:29.800 --> 00:38:31.680
Current best
lower bound is this 2

00:38:31.680 --> 00:38:33.460
to the log n to the 1
minus epsilon, which

00:38:33.460 --> 00:38:37.440
we saw in the table of various
inapproximability results

00:38:37.440 --> 00:38:37.940
last time.

00:38:37.940 --> 00:38:42.620
So this is a little bit
smaller than n to the epsilon,

00:38:42.620 --> 00:38:43.890
but it's kind of close-ish.

00:38:47.150 --> 00:38:50.300
And finally, in the--
I didn't write it.

00:38:50.300 --> 00:38:52.810
If you're in CNF form,
and all of the subclauses

00:38:52.810 --> 00:38:55.960
are either Horn, or all of
the subclauses are Dual-Horn,

00:38:55.960 --> 00:39:00.350
then you get something
called Min Horn Deletion.

00:39:00.350 --> 00:39:02.170
And this has the same
inapproximability.

00:39:04.730 --> 00:39:06.070
Here it's known.

00:39:06.070 --> 00:39:07.580
So up here, the
best approximation

00:39:07.580 --> 00:39:11.770
is n-- nothing, basically.

00:39:11.770 --> 00:39:13.110
Put them all in.

00:39:13.110 --> 00:39:16.990
And here there's a slightly
better approximation known,

00:39:16.990 --> 00:39:18.990
I think, n to the 1 minus
epsilon, or something.

00:39:18.990 --> 00:39:20.804
But these are all super hard.

00:39:20.804 --> 00:39:22.470
The main point of
this is so that you're

00:39:22.470 --> 00:39:23.820
aware of these problems.

00:39:23.820 --> 00:39:26.640
If you ever encounter a problem
that looks anything like this,

00:39:26.640 --> 00:39:29.740
or it looks like some
kind of CSP problem,

00:39:29.740 --> 00:39:31.900
you should go to this
list and check it out.

00:39:31.900 --> 00:39:35.430
So don't memorize these,
but look at the notes.

00:39:35.430 --> 00:39:36.772
Definitely memorize these guys.

00:39:36.772 --> 00:39:37.730
These are good to know.

00:39:37.730 --> 00:39:42.140
But there's a few
obscure problems here.

00:39:42.140 --> 00:39:42.640
OK.

00:39:42.640 --> 00:39:47.560
Last one is minimizing
the number of ones.

00:39:47.560 --> 00:39:49.990
So this is like the
hardest of two worlds.

00:39:49.990 --> 00:39:51.760
Minimization is kind of harder.

00:39:51.760 --> 00:39:54.460
And here you have to satisfy
everything, but minimize

00:39:54.460 --> 00:39:56.390
the number of true variables.

00:39:59.530 --> 00:40:03.250
So this is easy if you
can set them all false.

00:40:03.250 --> 00:40:04.820
And then you win.

00:40:04.820 --> 00:40:07.120
This is easy in the Horn case.

00:40:07.120 --> 00:40:09.170
The Horn case is when
at most one is positive,

00:40:09.170 --> 00:40:11.900
so most of them
can be set to zero.

00:40:11.900 --> 00:40:15.990
This is easy in
the 2X(N)OR case.

00:40:15.990 --> 00:40:19.060
So if you have linear equations,
two terms each, equal to 0

00:40:19.060 --> 00:40:21.320
or equals 1, that's also easy.

00:40:21.320 --> 00:40:24.100
And you want to minimize the
number of true variables.

00:40:24.100 --> 00:40:25.410
That's good.

00:40:25.410 --> 00:40:28.060
If you're in 2CNF form,
there's a constant factor

00:40:28.060 --> 00:40:28.780
approximation.

00:40:28.780 --> 00:40:30.240
That's the best you can do.

00:40:30.240 --> 00:40:30.781
APX-complete.

00:40:33.090 --> 00:40:36.300
This is a case from
the last slide.

00:40:36.300 --> 00:40:39.290
If you have the hitting set
constraints on constant number

00:40:39.290 --> 00:40:41.830
of constant size
vertex sets, and you

00:40:41.830 --> 00:40:44.230
have implication constraints,
then your problem

00:40:44.230 --> 00:40:45.535
is APX-complete again.

00:40:48.380 --> 00:40:50.300
And then we have these
guys appearing, again

00:40:50.300 --> 00:40:51.070
nearest Codeword.

00:40:51.070 --> 00:40:52.980
And Min Horn Deletion.

00:40:52.980 --> 00:40:55.020
This one we get in
the Dual-Horn case.

00:40:55.020 --> 00:40:56.490
The Horn case is good.

00:40:56.490 --> 00:40:59.880
Dual-Horn, we get this
thing, which was like log N

00:40:59.880 --> 00:41:00.380
approximable.

00:41:00.380 --> 00:41:01.490
Or no.

00:41:01.490 --> 00:41:05.880
This was the 2 to the log
N to the 1 minus epsilon.

00:41:05.880 --> 00:41:10.380
And this is X(N)OR-SAT when
they're not all binary.

00:41:10.380 --> 00:41:12.870
Then we get nearest
Codeword-complete.

00:41:12.870 --> 00:41:16.590
And finally, oh, two more.

00:41:16.590 --> 00:41:19.450
The dual to this, if all
the variables being set true

00:41:19.450 --> 00:41:22.590
satisfies your constraint,
that gives you a solution,

00:41:22.590 --> 00:41:27.780
but it's like the worst solution
possible, because you get N.

00:41:27.780 --> 00:41:32.320
And so in that case, you can get
probably a poly approximation.

00:41:32.320 --> 00:41:34.740
Not very impressive.

00:41:34.740 --> 00:41:37.380
And that's actually the
best you can do, at some N

00:41:37.380 --> 00:41:39.180
to the 1 minus epsilon.

00:41:39.180 --> 00:41:42.250
And in all other cases,
by Schaefer's theorem,

00:41:42.250 --> 00:41:45.250
even finding
a feasible solution is NP-hard.

00:41:45.250 --> 00:41:47.960
So, good luck approximating.

00:41:47.960 --> 00:41:49.360
Cool?

00:41:49.360 --> 00:41:54.275
This is the Khanna, Sudan,
Trevisan, Williamson

00:41:54.275 --> 00:41:55.150
multichotomy theorem.

00:41:59.100 --> 00:41:59.600
All right.

00:42:03.860 --> 00:42:11.280
So let's do some
more reductions.

00:42:38.260 --> 00:42:42.740
My goal on this page is
to get to our good friend

00:42:42.740 --> 00:42:46.180
from one of the first lectures,
edge-matching puzzles.

00:42:46.180 --> 00:42:50.480
You have little square
tiles, colors on the edges.

00:42:50.480 --> 00:42:52.910
Normally we want to satisfy
all of the edge constraints.

00:42:52.910 --> 00:42:57.480
Only equal colors match,
are adjacent to each other.

00:42:57.480 --> 00:43:00.040
Now the problem is going
to be maximize the number

00:43:00.040 --> 00:43:03.335
of satisfied edge constraints.

00:43:03.335 --> 00:43:05.160
But before I show
you that reduction,

00:43:05.160 --> 00:43:08.020
I need another problem,
which is APX-complete.

00:43:08.020 --> 00:43:10.330
So that problem is APX-complete.

00:43:10.330 --> 00:43:14.540
So I need two more problems.

00:43:14.540 --> 00:43:28.996
One is Max independent set
in 3-regular 3-edge colorable

00:43:28.996 --> 00:43:29.495
graphs.

00:43:32.790 --> 00:43:33.290
OK.

00:43:33.290 --> 00:43:35.415
I'm not going to prove this
one, because we already

00:43:35.415 --> 00:43:37.030
did a version of
independent set,

00:43:37.030 --> 00:43:39.340
and it's just tedious
to make it-- first,

00:43:39.340 --> 00:43:42.210
to make it exactly
degree three everywhere,

00:43:42.210 --> 00:43:45.260
and secondly make
it 3-edge colorable.

00:43:45.260 --> 00:43:48.630
3-regular 3-edge-colorable
is a nice kind of graph,

00:43:48.630 --> 00:43:55.370
because at every vertex, you've
got one edge of each class.

00:43:55.370 --> 00:43:56.930
So that's kind of cool.

00:43:56.930 --> 00:43:57.990
And we can use this.

00:43:57.990 --> 00:44:00.310
This problem is
basically equivalent

00:44:00.310 --> 00:44:03.720
to the actual
problem I want, which

00:44:03.720 --> 00:44:07.610
is a variation of
three-dimensional matching.

00:44:07.610 --> 00:44:09.980
So remember
three-dimensional matching,

00:44:09.980 --> 00:44:16.310
you have three sets--
A, B, and C. You

00:44:16.310 --> 00:44:19.080
look at the triples
on A, B, and C.

00:44:19.080 --> 00:44:23.140
And you're given some set
of interesting triples

00:44:23.140 --> 00:44:24.960
among those.

00:44:24.960 --> 00:44:32.350
And with 3DM, what we wanted was
to choose a set of such triples

00:44:32.350 --> 00:44:36.080
that covers all the vertices,
and no two of them intersect.

00:44:36.080 --> 00:44:38.500
That's the matching aspect.

00:44:38.500 --> 00:44:40.740
In this problem, we want
to choose as many triples

00:44:40.740 --> 00:44:43.700
as we can that don't
intersect each other.

00:44:43.700 --> 00:44:55.530
So the problem is choose
max subset S prime of S

00:44:55.530 --> 00:44:59.750
with no duplicate
coordinates, I'll say.

00:45:03.720 --> 00:45:05.900
So let's assume A, B,
and C are disjoint.

00:45:05.900 --> 00:45:09.020
Then I don't want any
element in A union B union C

00:45:09.020 --> 00:45:13.800
to appear twice in this
chosen set S prime.

00:45:13.800 --> 00:45:15.710
So that's the problem.

00:45:15.710 --> 00:45:19.521
Now I'm going to prove
that that's hard.

00:45:19.521 --> 00:45:24.990
It is basically the same
as Max independent set,

00:45:24.990 --> 00:45:29.830
in 3-regular
3-edge-colored graphs,

00:45:29.830 --> 00:45:33.760
because what I do is
I take such a graph,

00:45:33.760 --> 00:45:43.490
and for each edge color class--
there are three of them--

00:45:43.490 --> 00:45:46.040
those are going
to be A, B, and C.

00:45:46.040 --> 00:45:47.800
So if I have red,
green, and blue,

00:45:47.800 --> 00:45:49.910
all the red edges are
going to be elements of A,

00:45:49.910 --> 00:45:52.220
all the green edges are
going to be the elements

00:45:52.220 --> 00:45:54.720
of B-- B for green.

00:45:54.720 --> 00:45:58.090
And then all the blue
edges are elements of C.

00:45:58.090 --> 00:45:58.710
OK.

00:45:58.710 --> 00:46:06.380
Then a vertex, as I said, has
exactly one of each class.

00:46:06.380 --> 00:46:07.790
So that's going to be my triple.

00:46:11.410 --> 00:46:13.540
And that's it.

00:46:13.540 --> 00:46:16.150
So now, if I want to solve
three-dimensional matching

00:46:16.150 --> 00:46:17.930
among those triples,
that's going

00:46:17.930 --> 00:46:22.735
to correspond to choosing a
set of vertices in here, no two

00:46:22.735 --> 00:46:25.760
of which share a color.

00:46:25.760 --> 00:46:30.105
No two of which share the
same item of A. Let's say A

00:46:30.105 --> 00:46:32.360
is this color of edge.

00:46:32.360 --> 00:46:35.720
So that means that
the vertices over here

00:46:35.720 --> 00:46:37.890
are not connected by an edge.

00:46:37.890 --> 00:46:40.920
So the cool thing here is that
each element of A, B, and C

00:46:40.920 --> 00:46:49.000
only appears in two
different triples.

00:46:49.000 --> 00:46:51.800
Corresponding to the
two ends of the edge.

00:46:51.800 --> 00:46:54.540
So now we have max
three-dimensional matching

00:46:54.540 --> 00:46:58.670
where every element in ABC
appears in exactly two triples.

00:46:58.670 --> 00:47:03.188
So I guess I can even
write E2 if I want to.

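NOTE
Editorial sketch, not part of the lecture: a minimal Python illustration of the reduction just described, assuming the graph is given as a list of (u, v, color) edges with a proper 3-edge-coloring using the labels 'A', 'B', 'C'. All function names and the K33 example are hypothetical, and the brute-force solvers are only there to check the correspondence on a tiny instance.
```python
from itertools import combinations
def triples_from_colored_cubic_graph(colored_edges):
    """Map each vertex to the triple of its incident edges, one per color class."""
    incident = {}
    for index, (u, v, color) in enumerate(colored_edges):
        for vertex in (u, v):
            # Each edge becomes one element of A, B, or C, identified by (color, index).
            incident.setdefault(vertex, {})[color] = (color, index)
    # Assumes 3-regular and properly 3-edge-colored: exactly one edge per color at each vertex.
    return {v: (c['A'], c['B'], c['C']) for v, c in incident.items()}
def max_disjoint_triples(triples):
    """Brute force: the most triples that can be chosen with no repeated coordinate."""
    keys = list(triples)
    for size in range(len(keys), 0, -1):
        for chosen in combinations(keys, size):
            used = [x for k in chosen for x in triples[k]]
            if len(used) == len(set(used)):
                return size
    return 0
def max_independent_set(colored_edges):
    """Brute force: the size of a largest independent set in the underlying graph."""
    vertices = sorted({w for u, v, _ in colored_edges for w in (u, v)})
    edge_set = {frozenset((u, v)) for u, v, _ in colored_edges}
    for size in range(len(vertices), 0, -1):
        for chosen in combinations(vertices, size):
            if all(frozenset(p) not in edge_set for p in combinations(chosen, 2)):
                return size
    return 0
# K_{3,3} is 3-regular; coloring edge (i, 3 + j) with color (i + j) mod 3 is a proper 3-edge-coloring.
K33 = [(i, 3 + j, 'ABC'[(i + j) % 3]) for i in range(3) for j in range(3)]
```
Two triples share a coordinate exactly when their vertices share an edge, so on K33 both brute-force functions should report the same optimum.
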
00:47:03.188 --> 00:47:05.060
OK.

00:47:05.060 --> 00:47:08.130
That was our sort of homework.

00:47:08.130 --> 00:47:13.370
Now we have max edge
matching puzzles.

00:47:13.370 --> 00:47:17.287
Again, we're given square tiles.

00:47:17.287 --> 00:47:18.870
There's different
colors on the tiles.

00:47:18.870 --> 00:47:20.780
Any number of colors.

00:47:20.780 --> 00:47:23.950
And we would like
to lay things out.

00:47:23.950 --> 00:47:26.880
And I'll tell you the instance
here is going to be 2 by N.

00:47:26.880 --> 00:47:29.760
So it's fairly narrow,
unlike the construction

00:47:29.760 --> 00:47:32.240
we saw in class.

00:47:32.240 --> 00:47:36.330
And we're reducing
from Max 3DM-E2.

00:47:36.330 --> 00:47:38.156
That's why I introduced it.

00:47:38.156 --> 00:47:43.090
And this is a result
from four years ago.

00:47:43.090 --> 00:47:47.640
So the idea is the triple is
represented by these three

00:47:47.640 --> 00:47:49.210
tiles, and some more.

00:47:49.210 --> 00:47:52.090
But for starters,
these three tiles.

00:47:52.090 --> 00:47:54.870
The u glue is unique--
globally unique.

00:47:54.870 --> 00:47:57.090
So it wants to be
on the boundary.

00:47:57.090 --> 00:47:58.890
And here tiles are
not allowed to rotate,

00:47:58.890 --> 00:48:01.490
so it wants to be on
the bottom boundary.

00:48:01.490 --> 00:48:08.676
So these ab glues only
appear as a single pair.

00:48:08.676 --> 00:48:10.300
I guess they'll also
appear over there.

00:48:10.300 --> 00:48:11.383
But not very many of them.

00:48:11.383 --> 00:48:13.800
So basically a, b, and
c have to glue together

00:48:13.800 --> 00:48:14.720
in sequence like that.

00:48:14.720 --> 00:48:15.980
And the percent
signs are going to be

00:48:15.980 --> 00:48:17.140
the same on the bottom row.

00:48:17.140 --> 00:48:19.130
So nothing else.

00:48:19.130 --> 00:48:20.832
This is basically
forced to do this.

00:48:20.832 --> 00:48:22.540
We'll actually have
to do it a few times,

00:48:22.540 --> 00:48:24.920
but you have to build
this bottom structure.

00:48:24.920 --> 00:48:28.210
And then the question is
what do you build on top.

00:48:28.210 --> 00:48:32.900
And the idea is there are
exactly one each of these three

00:48:32.900 --> 00:48:37.110
tiles which just communicate
dollar sign left to right,

00:48:37.110 --> 00:48:39.550
and have a, b, c on the bottom.

00:48:39.550 --> 00:48:40.432
So those are cool.

00:48:40.432 --> 00:48:42.890
And if you want to put a triple
into your three-dimensional

00:48:42.890 --> 00:48:46.950
matching, then you
put those in sequence.

00:48:46.950 --> 00:48:48.020
No mismatches.

00:48:48.020 --> 00:48:48.680
This is great.

00:48:48.680 --> 00:48:49.820
You can take a whole
bunch of these,

00:48:49.820 --> 00:48:52.028
stick them next to each
other, everything will match.

00:48:52.028 --> 00:48:53.000
No errors.

00:48:53.000 --> 00:48:54.990
So you're getting
some constant number

00:48:54.990 --> 00:48:58.230
of points for each of these.

00:48:58.230 --> 00:49:03.240
But you will have to build
more-- at least two copies

00:49:03.240 --> 00:49:04.930
of this bottom structure.

00:49:04.930 --> 00:49:07.540
And there's only one
copy of this top thing.

00:49:07.540 --> 00:49:09.110
So that's the annoying part.

00:49:09.110 --> 00:49:11.820
But there are some variations
of these tiles which

00:49:11.820 --> 00:49:13.570
look like something
like this-- I'll

00:49:13.570 --> 00:49:16.930
show you all of them in a
moment-- which have exactly one

00:49:16.930 --> 00:49:18.450
mismatch.

00:49:18.450 --> 00:49:20.842
So you don't get
quite as many points.

00:49:20.842 --> 00:49:22.800
You get, I don't know,
15 instead of 16 points,

00:49:22.800 --> 00:49:24.740
or whatever.

00:49:24.740 --> 00:49:26.750
Bottom structure looks the same.

00:49:26.750 --> 00:49:31.571
And the point of this
is we know a appears

00:49:31.571 --> 00:49:32.570
in two different places.

00:49:32.570 --> 00:49:35.870
So we need two
versions of the a tile.

00:49:35.870 --> 00:49:39.015
But we only want one of them
to be happy and give you

00:49:39.015 --> 00:49:40.640
all the points,
because you should only

00:49:40.640 --> 00:49:44.400
be able to choose
the a thing once.

00:49:44.400 --> 00:49:46.520
So yet this triple
will still exist.

00:49:46.520 --> 00:49:48.400
a, d, c will still be
floating around there.

00:49:48.400 --> 00:49:52.410
You want it to still be buildable,
but at a cost of negative 1.

00:49:52.410 --> 00:49:54.880
So this part's still built.

00:49:54.880 --> 00:49:57.030
Then you have these
sort of filler tiles.

00:49:57.030 --> 00:49:59.000
Your goal is then just
get rid of all the stuff

00:49:59.000 --> 00:50:00.770
and pay a penalty.

00:50:00.770 --> 00:50:03.410
But you want to minimize the
number of times you do this,

00:50:03.410 --> 00:50:05.850
or maximize the number
of times you do this,

00:50:05.850 --> 00:50:09.200
and then it will be
simulating Max 3DM.

00:50:09.200 --> 00:50:12.640
There'll be some
additive constant cost,

00:50:12.640 --> 00:50:16.690
which is the cost of all
the unpicked triples.

00:50:16.690 --> 00:50:20.525
And then this will
be an L-reduction.

00:50:20.525 --> 00:50:21.650
So I have some more slides.

00:50:21.650 --> 00:50:24.070
It's a bit complicated
to do all of the details,

00:50:24.070 --> 00:50:28.020
but this is a fully worked-out
example with two triples.

00:50:28.020 --> 00:50:30.680
We have a, b, c and a, d, c.

00:50:30.680 --> 00:50:32.264
And because they
share a, we don't

00:50:32.264 --> 00:50:33.430
want them both to be picked.

00:50:33.430 --> 00:50:36.380
So the same as what I showed
you just in the previous slide.

00:50:36.380 --> 00:50:38.500
But then there are
all these other tiles

00:50:38.500 --> 00:50:41.420
that are floating
around in order to make

00:50:41.420 --> 00:50:43.320
all the combinations possible.

00:50:43.320 --> 00:50:45.730
And there's all these
tiles to basically allow

00:50:45.730 --> 00:50:47.390
them to get thrown away.

00:50:47.390 --> 00:50:50.710
And so that's not so clear.

00:50:50.710 --> 00:50:54.104
This is the overall
construction.

00:50:54.104 --> 00:50:56.520
For every triple, you're going
to have exactly these three

00:50:56.520 --> 00:50:59.310
tiles that we saw.

00:50:59.310 --> 00:51:01.310
It got rotated relative
to the previous picture.

00:51:01.310 --> 00:51:03.560
Maybe rotations are allowed.

00:51:03.560 --> 00:51:05.890
And then for every
variable, here

00:51:05.890 --> 00:51:08.150
they're called x, y,
z instead of a, b, c.

00:51:08.150 --> 00:51:09.290
But the same thing.

00:51:09.290 --> 00:51:13.100
For every a thing we'll have
some constant set of tiles that

00:51:13.100 --> 00:51:15.250
includes the really good one.

00:51:15.250 --> 00:51:15.750
Sorry.

00:51:15.750 --> 00:51:17.270
The good one has
two dollar signs.

00:51:17.270 --> 00:51:19.465
This is the one you really like.

00:51:19.465 --> 00:51:21.090
And then there's all
this stuff to make

00:51:21.090 --> 00:51:23.350
sure things can get consumed.

00:51:23.350 --> 00:51:24.880
And you can get
rid of the triples

00:51:24.880 --> 00:51:27.950
and pay exactly one
per unpicked triple.

00:51:27.950 --> 00:51:29.700
So I don't want to go
through the details,

00:51:29.700 --> 00:51:34.711
but once you have that, you get
an L-reduction from Max 3DM-E2.

00:51:34.711 --> 00:51:35.210
Questions?

00:51:38.508 --> 00:51:39.494
All right.

00:51:44.960 --> 00:51:50.590
So I want to go
up the hierarchy.

00:51:50.590 --> 00:51:55.130
We've been focusing on
constant-factor approximable problems

00:51:55.130 --> 00:51:56.270
that have no PTASes.

00:51:59.080 --> 00:52:00.870
I will mention
before we go on

00:52:00.870 --> 00:52:04.050
that there are some
constant factor approximable

00:52:04.050 --> 00:52:08.020
problems that are not,
that have no PTAS,

00:52:08.020 --> 00:52:10.600
and yet are not APX-complete.

00:52:10.600 --> 00:52:17.520
So APX-complete is not
all of APX minus PTAS.

00:52:17.520 --> 00:52:22.460
So there are APX
minus PTAS problems

00:52:22.460 --> 00:52:23.620
that are not APX-complete.

00:52:26.380 --> 00:52:29.140
So these are still useful
from a reduction standpoint.

00:52:29.140 --> 00:52:33.910
You can use them to show that
your problem has no PTAS.

00:52:33.910 --> 00:52:36.450
But you have to state
them differently.

00:52:40.690 --> 00:52:43.190
And they're somewhat
familiar problems.

00:52:43.190 --> 00:52:46.230
One of them is bin packing.

00:52:46.230 --> 00:52:48.950
This is you're moving
out of your house.

00:52:48.950 --> 00:52:50.950
You have a bunch of objects.

00:52:50.950 --> 00:52:52.700
You live in a
one-dimensional universe.

00:52:52.700 --> 00:52:55.620
So each box is
exactly the same size.

00:52:55.620 --> 00:52:57.240
It's one-dimensional in size.

00:52:57.240 --> 00:52:58.920
And you have a bunch of items
which are one-dimensional.

00:52:58.920 --> 00:53:01.211
And you want to pack as many
as you can into each box--

00:53:01.211 --> 00:53:03.190
but overall use the
minimum number of boxes.

00:53:03.190 --> 00:53:05.690
It's a minimization problem.

00:53:05.690 --> 00:53:08.770
This has a constant-factor
approximation, but no PTAS.

00:53:08.770 --> 00:53:14.740
But you can find what's called
an asymptotic PTAS, where

00:53:14.740 --> 00:53:17.950
you can get a PTAS-style
result-- 1 plus epsilon

00:53:17.950 --> 00:53:21.822
times OPT plus 1.

00:53:21.822 --> 00:53:24.881
So an additive error.

00:53:24.881 --> 00:53:26.380
And so in particular,
distinguishing

00:53:26.380 --> 00:53:29.930
between two bins and three
bins is weakly NP-complete.

00:53:29.930 --> 00:53:36.325
That's like partition,
right, between two bins

00:53:36.325 --> 00:53:37.570
and three bins.

00:53:37.570 --> 00:53:39.280
So you need this
sort of additive one.

00:53:39.280 --> 00:53:42.060
You can't get a PTAS
without the additive one.

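NOTE
Editorial sketch, not part of the lecture: a small Python illustration of why two bins versus three bins is as hard as Partition. The function names are hypothetical and the exhaustive search only works for tiny inputs. A set of numbers splits into two equal halves exactly when it fits into 2 bins of capacity sum/2, so any algorithm with a multiplicative ratio below 3/2 would have to answer 2 on such instances and would therefore decide Partition.
```python
from itertools import product
def min_bins(sizes, capacity):
    """Exhaustive search for the fewest bins of the given capacity, or None if some item cannot fit."""
    n = len(sizes)
    for bins in range(1, n + 1):
        # Try every assignment of the n items to the available bins.
        for assignment in product(range(bins), repeat=n):
            loads = [0] * bins
            for size, b in zip(sizes, assignment):
                loads[b] += size
            if max(loads) <= capacity:
                return bins
    return None if n else 0
def has_partition(numbers):
    """Partition as bin packing: do the numbers fit into 2 bins of capacity sum/2?"""
    total = sum(numbers)
    if total % 2:
        return False
    bins = min_bins(numbers, total // 2)
    return bins is not None and bins <= 2
```
For example, has_partition([3, 1, 1, 2, 2, 1]) holds because 3 + 2 and 1 + 1 + 2 + 1 both sum to 5, while [2, 2, 2] needs three bins of capacity 3.
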
00:53:42.060 --> 00:53:45.360
So it's not as hard as all
constant-factor approximable

00:53:45.360 --> 00:53:49.300
problems, but
somewhere in between.

00:53:49.300 --> 00:53:52.440
APX-intermediate is
the technical term.

00:53:52.440 --> 00:53:56.501
Some other ones are minimum.

00:53:56.501 --> 00:53:58.432
AUDIENCE: [INAUDIBLE].

00:53:58.432 --> 00:54:00.765
PROFESSOR: Oh, this is all
assuming P does not equal NP.

00:54:00.765 --> 00:54:01.120
Yes.

00:54:01.120 --> 00:54:03.530
If P equals NP, then I think
all these things are equal.

00:54:03.530 --> 00:54:05.300
So, thank you.

00:54:08.400 --> 00:54:10.670
Another problem I've
seen in some situations

00:54:10.670 --> 00:54:15.260
is you want to find the
spanning tree in a graph that

00:54:15.260 --> 00:54:16.840
minimizes the maximum degree.

00:54:16.840 --> 00:54:19.070
This is also APX-intermediate.

00:54:19.070 --> 00:54:21.220
There's a constant
factor approximation.

00:54:21.220 --> 00:54:26.130
No PTAS, but not as
hard as all of APX.

00:54:26.130 --> 00:54:28.440
And another one is
min edge coloring,

00:54:28.440 --> 00:54:33.120
which is quite a bit easier
than vertex coloring.

00:54:33.120 --> 00:54:34.864
So these are problems
to watch out for.

00:54:34.864 --> 00:54:37.280
They're the only ones I know
of that are APX-intermediate.

00:54:37.280 --> 00:54:38.280
There may be more known.

00:54:41.330 --> 00:54:42.190
OK.

00:54:42.190 --> 00:54:44.900
So unless there are
questions, I want to go up

00:54:44.900 --> 00:54:46.795
to log factor approximation.

00:54:54.180 --> 00:54:56.050
Surprisingly, in
the CSP universe,

00:54:56.050 --> 00:54:59.970
we didn't get any
log approximation

00:54:59.970 --> 00:55:00.970
as the right answer.

00:55:00.970 --> 00:55:03.350
But there are problems where
log is the right answer.

00:55:07.774 --> 00:55:09.690
Again, there's probably
intermediate problems.

00:55:09.690 --> 00:55:11.720
But here are some
problems that are actually

00:55:11.720 --> 00:55:14.880
complete over all log
approximable problems.

00:55:14.880 --> 00:55:16.930
So there's a log
lower-bound and upper-bound

00:55:16.930 --> 00:55:19.390
on their approximability.

00:55:19.390 --> 00:55:25.090
I've mentioned two of them--
set cover and dominating set.

00:55:29.859 --> 00:55:32.150
First thing I'd like to show
is that these two problems

00:55:32.150 --> 00:55:33.390
are the same.

00:55:33.390 --> 00:55:35.810
I'm not going to try to
prove lower bounds on them--

00:55:35.810 --> 00:55:37.240
at least for now.

00:55:37.240 --> 00:55:40.720
But let me show that you could
L-reduce one to the other.

00:55:40.720 --> 00:55:44.080
So the easy direction
is L-reducing dominating

00:55:44.080 --> 00:55:47.260
set to set cover,
because dominating set

00:55:47.260 --> 00:55:49.100
says, well, if I
choose this vertex,

00:55:49.100 --> 00:55:52.550
then I cover these vertices.

00:55:52.550 --> 00:55:53.050
OK.

00:55:53.050 --> 00:55:57.920
So let's call this vertex v,
and its neighbors maybe a, b, c, d.

00:55:57.920 --> 00:56:04.450
I can represent that by a
set-- namely v, a, b, c, d.

00:56:04.450 --> 00:56:06.502
If I choose that set, it
covers those elements,

00:56:06.502 --> 00:56:07.960
just like when I
choose this vertex

00:56:07.960 --> 00:56:09.450
it covers those vertices.

00:56:09.450 --> 00:56:09.950
OK.

00:56:09.950 --> 00:56:12.490
So that's a strict
reduction from dominating

00:56:12.490 --> 00:56:14.865
set to set cover.

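NOTE
Editorial sketch, not part of the lecture: a minimal Python version of the dominating set to set cover reduction just described, assuming the graph is an adjacency dict mapping each vertex to its set of neighbors. Function names are hypothetical, and the brute-force solver is only for checking tiny examples.
```python
from itertools import combinations
def dominating_set_to_set_cover(adjacency):
    """One set per vertex: the vertex together with its neighbors. The universe is all vertices."""
    universe = set(adjacency)
    sets = {v: {v} | set(neighbors) for v, neighbors in adjacency.items()}
    return universe, sets
def min_set_cover(universe, sets):
    """Brute force: a smallest family of set names whose union contains the universe."""
    names = list(sets)
    for size in range(len(names) + 1):
        for chosen in combinations(names, size):
            if set().union(*(sets[name] for name in chosen)) >= universe:
                return set(chosen)
    return None
# The star from the lecture: choosing v covers v, a, b, c, d.
star = {'v': {'a', 'b', 'c', 'd'}, 'a': {'v'}, 'b': {'v'}, 'c': {'v'}, 'd': {'v'}}
```
A chosen set name is a chosen vertex, so a cover of size k is a dominating set of size k and vice versa, which is why no approximation factor is lost.
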
00:56:14.865 --> 00:56:18.320
In some sense, the bipartite
version gives you more control.

00:56:18.320 --> 00:56:18.820
OK.

00:56:18.820 --> 00:56:22.500
This is the non-bipartite
version of set cover.

00:56:22.500 --> 00:56:24.110
So what about the
other reduction--

00:56:24.110 --> 00:56:27.500
reducing set cover
to dominating set?

00:56:30.090 --> 00:56:33.170
So this is a little more fun.

00:56:33.170 --> 00:56:35.710
We need to build
a graph for dominating

00:56:35.710 --> 00:56:39.000
set that somehow has two very
different types of vertices.

00:56:39.000 --> 00:56:42.810
We want to represent sets, and
we want to represent elements.

00:56:42.810 --> 00:56:44.400
So here's what
we're going to do.

00:56:44.400 --> 00:56:49.040
We build a clique
representing the sets.

00:56:49.040 --> 00:56:53.560
So there are nodes in this
clique-- one for every set.

00:56:53.560 --> 00:56:57.240
And then we're going to have an
independent set over here that

00:56:57.240 --> 00:56:59.710
will represent the elements.

00:56:59.710 --> 00:57:01.910
And then whenever
a set over here

00:57:01.910 --> 00:57:05.940
contains an element over
there, we will add an edge.

00:57:05.940 --> 00:57:08.970
So in general, an element
may appear in several sets,

00:57:08.970 --> 00:57:12.200
and the set is going to
consist of many elements.

00:57:12.200 --> 00:57:14.440
But over here, there's
not going to be any edges

00:57:14.440 --> 00:57:15.410
between these elements.

00:57:15.410 --> 00:57:18.390
These are independent.

00:57:18.390 --> 00:57:22.370
And over here, all
of the edges exist.

00:57:22.370 --> 00:57:25.540
So the intent is you choose
a set of these vertices

00:57:25.540 --> 00:57:29.620
corresponding to sets in
order to cover those vertices.

00:57:29.620 --> 00:57:31.870
And that's going to work,
because these vertices

00:57:31.870 --> 00:57:33.820
are super easy to cover
in the dominating set.

00:57:33.820 --> 00:57:36.880
You choose any of them,
you cover all of them.

00:57:36.880 --> 00:57:40.800
These guys, you never want to
put them in a dominating set.

00:57:40.800 --> 00:57:42.800
Why would you put this
in a dominating set, when

00:57:42.800 --> 00:57:44.466
you could just follow
one of these edges

00:57:44.466 --> 00:57:45.780
and put this in instead?

00:57:45.780 --> 00:57:49.960
That vertex will cover this one,
and it will cover all of these.

00:57:49.960 --> 00:57:52.680
And the only edges from
here are to over here.

00:57:52.680 --> 00:57:56.451
So if you choose a set, you'll
cover all the sets and that one

00:57:56.451 --> 00:57:56.950
element.

00:57:56.950 --> 00:57:58.324
If you choose the
element, you'll

00:57:58.324 --> 00:58:01.390
cover the element
and some of the sets.

00:58:01.390 --> 00:58:04.100
So in any optimal solution,
if this ever appears,

00:58:04.100 --> 00:58:06.530
you can keep it optimal
and move over here.

00:58:06.530 --> 00:58:09.170
That is the sort of argument
we've been doing over and over.

00:58:09.170 --> 00:58:11.240
So there is an optimal
solution where you only

00:58:11.240 --> 00:58:16.810
choose vertices on the left,
and then that is a set cover.

00:58:16.810 --> 00:58:19.570
Again, it's a strict reduction.

00:58:19.570 --> 00:58:21.170
No loss.

00:58:21.170 --> 00:58:21.670
Cool?

00:58:21.670 --> 00:58:24.425
So that is why these two
problems are equivalent.

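NOTE
Editorial sketch, not part of the lecture: a minimal Python version of the set cover to dominating set construction (a clique on the sets, an independent set of elements, and an edge for each containment). It assumes a nonempty universe in which every element belongs to at least one set. The vertex labels ('S', name) and ('e', x), the function names, and the example instance are all hypothetical.
```python
from itertools import combinations
def set_cover_to_dominating_set(universe, sets):
    """Build the adjacency dict of the graph used in the reduction."""
    set_vertices = [('S', name) for name in sets]
    adjacency = {v: set() for v in set_vertices}
    adjacency.update({('e', x): set() for x in universe})
    for u, v in combinations(set_vertices, 2):  # clique on the set vertices
        adjacency[u].add(v)
        adjacency[v].add(u)
    for name, members in sets.items():  # edge whenever a set contains an element
        for x in members:
            adjacency[('S', name)].add(('e', x))
            adjacency[('e', x)].add(('S', name))
    return adjacency
def min_dominating_set_size(adjacency):
    """Brute force: fewest vertices whose closed neighborhoods include every vertex."""
    vertices = list(adjacency)
    for size in range(len(vertices) + 1):
        for chosen in combinations(vertices, size):
            dominated = set(chosen).union(*(adjacency[v] for v in chosen))
            if len(dominated) == len(vertices):
                return size
def min_set_cover_size(universe, sets):
    """Brute force: fewest sets whose union contains the universe."""
    names = list(sets)
    for size in range(len(names) + 1):
        for chosen in combinations(names, size):
            if set().union(*(sets[n] for n in chosen)) >= set(universe):
                return size
example_universe = {1, 2, 3, 4, 5}
example_sets = {'A': {1, 2, 3}, 'B': {2, 4}, 'C': {3, 4}, 'D': {4, 5}}
```
On the example, sets A and D cover everything, and the corresponding two clique vertices dominate the whole graph, so both optima should agree.
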
00:58:24.425 --> 00:58:26.300
Now we're just going to
take on faith for now

00:58:26.300 --> 00:58:29.290
that they are log
inapproximable.

00:58:29.290 --> 00:58:32.044
And you've probably seen that
this one is log approximable.

00:58:32.044 --> 00:58:33.960
So now you know that
this is log approximable.

00:58:39.540 --> 00:58:45.170
I would say most
of the literature

00:58:45.170 --> 00:58:50.040
I see for inapproximability
is either APX hardness,

00:58:50.040 --> 00:58:52.465
or what people usually
call set cover hardness.

00:58:55.140 --> 00:58:57.440
I mean, the fact that set
cover is Log-APX-complete,

00:58:57.440 --> 00:58:58.814
that is complete
for that class--

00:58:58.814 --> 00:59:01.230
not just a log lower-bound--
is fairly recent.

00:59:01.230 --> 00:59:03.760
So people usually have
called it set cover hardness.

00:59:03.760 --> 00:59:07.000
Now you can call it
log APX-hardness.

00:59:07.000 --> 00:59:10.120
So let me show you one example.

00:59:10.120 --> 00:59:11.880
There are a lot
of both out there,

00:59:11.880 --> 00:59:15.852
and I'm actually just showing
you sort of a small sampling,

00:59:15.852 --> 00:59:17.900
because there's so much.

00:59:17.900 --> 00:59:20.120
So here's a fun problem.

00:59:20.120 --> 00:59:23.167
It's called token
reconfiguration.

00:59:23.167 --> 00:59:24.750
And the idea is
you're doing some kind

00:59:24.750 --> 00:59:27.410
of motion planning in a graph.

00:59:27.410 --> 00:59:29.380
So something like
pushing blocks,

00:59:29.380 --> 00:59:33.300
except you have a
bunch of robots,

00:59:33.300 --> 00:59:37.100
which here are represented--
well, you have a graph.

00:59:37.100 --> 00:59:40.760
And each vertex can either
have a robot or not.

00:59:40.760 --> 00:59:43.580
In sum, you're given
an initial configuration

00:59:43.580 --> 00:59:45.320
of how the robots are
placed, and you're

00:59:45.320 --> 00:59:46.903
given a final
configuration of how you

00:59:46.903 --> 00:59:48.160
want the robots to be placed.

00:59:48.160 --> 00:59:49.826
And they have the
same number of robots,

00:59:49.826 --> 00:59:53.220
because you can't eat
robots, or create them yet.

00:59:53.220 --> 00:59:55.470
So when robots
can create robots,

00:59:55.470 --> 00:59:57.990
that will be another problem.

00:59:57.990 --> 00:59:59.490
So here you have
robot conservation.

01:00:03.200 --> 01:00:05.370
So in a configuration,
there are three types

01:00:05.370 --> 01:00:08.350
of vertices in that situation.

01:00:08.350 --> 01:00:10.760
It could be you have a
vertex that currently

01:00:10.760 --> 01:00:12.580
has a robot-- here
they're called tokens,

01:00:12.580 --> 01:00:16.210
to be a little more generic.

01:00:16.210 --> 01:00:19.480
It could have a robot,
but not be a place

01:00:19.480 --> 01:00:20.690
that should have a robot.

01:00:20.690 --> 01:00:22.690
So in the initial
configuration, it has a robot,

01:00:22.690 --> 01:00:24.910
but in the final
configuration it does not.

01:00:24.910 --> 01:00:28.750
It could be you have some
robots that are basically

01:00:28.750 --> 01:00:29.940
where they want to be.

01:00:29.940 --> 01:00:33.240
There's a robot, and also in
the target configuration,

01:00:33.240 --> 01:00:34.780
there's a robot there.

01:00:34.780 --> 01:00:36.870
Or I guess there's four
cases, but in this case

01:00:36.870 --> 01:00:38.040
we'll only have three.

01:00:38.040 --> 01:00:40.260
Or it could be that you
want to have a robot there,

01:00:40.260 --> 01:00:42.240
but currently you do not.

01:00:42.240 --> 01:00:46.817
So this is an instance
that simulates set cover.

01:00:46.817 --> 01:00:48.650
And this is a situation
where robots are all

01:00:48.650 --> 01:00:49.520
treated identically.

01:00:49.520 --> 01:00:52.400
So you don't care
which robot goes where.

01:00:52.400 --> 01:00:54.030
So you've got these
robots over here,

01:00:54.030 --> 01:00:55.350
which don't want to be here.

01:00:55.350 --> 01:00:56.860
They want to be over there.

01:00:56.860 --> 01:00:58.450
I mean, if you
measure this length,

01:00:58.450 --> 01:01:01.900
it's the same as this length.

01:01:01.900 --> 01:01:03.540
And these robots
don't want to move,

01:01:03.540 --> 01:01:05.930
but they're going to have to,
because they're in the way.

01:01:05.930 --> 01:01:08.590
In this tripartite graph,
they're in the way from here

01:01:08.590 --> 01:01:09.950
to there.

01:01:09.950 --> 01:01:12.840
I didn't tell you: a
move in this scenario

01:01:12.840 --> 01:01:18.220
is that you can take a robot
and follow any empty path. OK.

01:01:18.220 --> 01:01:21.410
So you can make a sequence of
moves all at a cost of one,

01:01:21.410 --> 01:01:23.610
as long as it doesn't
hit any other robots.

01:01:23.610 --> 01:01:25.570
So, a collision-free path.

01:01:25.570 --> 01:01:27.850
You follow it, then you
can pick up another robot,

01:01:27.850 --> 01:01:29.349
move it along a
collision-free path,

01:01:29.349 --> 01:01:32.720
pick up another
robot, and so on.

01:01:32.720 --> 01:01:34.884
So if you want to move
all these guys over here,

01:01:34.884 --> 01:01:37.300
you're going to have to move
some of these out of the way.

01:01:37.300 --> 01:01:38.300
How many?

01:01:38.300 --> 01:01:39.570
Set cover many.

01:01:39.570 --> 01:01:42.330
Here's the set cover instance
in this bipartite graph.

01:01:42.330 --> 01:01:45.710
So what you can do is take this
robot, move it out of the way,

01:01:45.710 --> 01:01:47.240
move it to one of
these elements,

01:01:47.240 --> 01:01:49.200
and then for the remainder
of this set, which

01:01:49.200 --> 01:01:51.597
are these two nodes,
you can take this guy

01:01:51.597 --> 01:01:53.430
and move it there in
one step, take this guy

01:01:53.430 --> 01:01:54.800
and move it there in one step.

01:01:54.800 --> 01:01:56.200
The length of this doesn't
matter, because you

01:01:56.200 --> 01:01:57.480
can follow a long path.

01:01:57.480 --> 01:02:01.900
And you just drain out
this thing one at a time--

01:02:01.900 --> 01:02:05.490
except for this guy, who
you moved out of the way.

01:02:05.490 --> 01:02:08.260
You move one of these
to fill his spot.

01:02:08.260 --> 01:02:10.690
And if you can cover all
the elements over here

01:02:10.690 --> 01:02:13.640
with only k of
these guys moving,

01:02:13.640 --> 01:02:20.215
then the number of moves
will be k plus A. So

01:02:20.215 --> 01:02:21.340
that's what's written here.

01:02:21.340 --> 01:02:26.940
OPT is this fixed additive
cost plus the set cover.

01:02:26.940 --> 01:02:30.600
And this is going to be
an L-reduction, provided

01:02:30.600 --> 01:02:36.990
this is linear in A, which
is easy enough to arrange.

01:02:36.990 --> 01:02:38.590
So that's the unlabeled case.

01:02:38.590 --> 01:02:40.860
You can also solve
the labeled case.

01:02:40.860 --> 01:02:44.170
Maybe you want robot one
to go to position one,

01:02:44.170 --> 01:02:47.190
and you want robot two
to go to position two.

01:02:47.190 --> 01:02:48.901
Same thing, but
here these robots

01:02:48.901 --> 01:02:50.900
are going to have to go
back where they started.

01:02:50.900 --> 01:02:53.525
So you just add a little vertex
so they can get out of the way.

01:02:53.525 --> 01:02:55.590
Everything can move
where they want to.

01:02:55.590 --> 01:02:58.710
Again, choose a set
cover, move those over,

01:02:58.710 --> 01:02:59.970
and then move them back.

01:02:59.970 --> 01:03:02.130
So you end up paying
two times the set cover.

01:03:02.130 --> 01:03:03.840
But just a constant factor loss.

01:03:03.840 --> 01:03:05.960
Still an L-reduction.

01:03:05.960 --> 01:03:07.960
And this problem
is motivated, it's

01:03:07.960 --> 01:03:10.290
sort of a generalization
of the 15 puzzle.

01:03:10.290 --> 01:03:12.750
You have a little 4 by 4 grid.

01:03:12.750 --> 01:03:13.990
You've got movable tiles.

01:03:13.990 --> 01:03:16.300
You can only move one
at a time in that case,

01:03:16.300 --> 01:03:18.420
because there's
only a single gap.

01:03:18.420 --> 01:03:20.890
This is sort of a
generalized form of that,

01:03:20.890 --> 01:03:22.770
where you have various tiles.

01:03:22.770 --> 01:03:25.230
You want to get them
into the right spots,

01:03:25.230 --> 01:03:28.300
but you can't have collisions
during that motion.

01:03:28.300 --> 01:03:31.470
So that's where this
problem came from.

01:03:31.470 --> 01:03:34.320
15 puzzle, by the way, in
the generalized n by n form

01:03:34.320 --> 01:03:37.077
is NP-hard and in APX,
but I think it's open

01:03:37.077 --> 01:03:38.160
whether it's APX-complete.

01:03:40.700 --> 01:03:44.820
I would show the proof, but it's
very complicated, so, I won't.

01:03:48.450 --> 01:03:50.140
Cool.

01:03:50.140 --> 01:03:53.170
Well, in the last little
bit, I wanted to tell you

01:03:53.170 --> 01:03:56.230
about the super high end.

01:03:56.230 --> 01:03:57.835
So we went to log approximation.

01:04:00.640 --> 01:04:03.720
There are other
things known, but not

01:04:03.720 --> 01:04:05.110
a lot of completeness results.

01:04:05.110 --> 01:04:06.610
So we're going to
get to other kinds

01:04:06.610 --> 01:04:09.370
of inapproximability
next class.

01:04:09.370 --> 01:04:13.430
For now, I want to stick
to something APX-complete.

01:04:13.430 --> 01:04:15.790
And the most studied
class above log

01:04:15.790 --> 01:04:19.740
is poly, which is like n
to the 1 minus epsilon.

01:04:34.860 --> 01:04:38.360
And my main goal here is to
tell you about some problems

01:04:38.360 --> 01:04:40.880
that you should, if you
think your problem is

01:04:40.880 --> 01:04:44.730
like Poly-APX-hard, these
are the standard problems

01:04:44.730 --> 01:04:46.390
to start from.

01:04:46.390 --> 01:04:47.629
There are two of them.

01:04:47.629 --> 01:04:49.920
And I've mentioned them, but
not quite in this context.

01:04:57.920 --> 01:05:03.194
They are clique and
independent set.

01:05:03.194 --> 01:05:04.610
These are really
the same problem.

01:05:04.610 --> 01:05:08.670
One is the complement
graph of the other.

01:05:08.670 --> 01:05:09.975
Both maximization problems.

01:05:12.930 --> 01:05:14.480
And those are the standard ones.

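NOTE
Editorial sketch, not part of the lecture: a short Python illustration of why clique and independent set are the same problem under graph complement. Function names are hypothetical and the search is brute force, so it is only suitable for tiny graphs.
```python
from itertools import combinations
def complement(vertices, edges):
    """Edges of the complement graph: exactly the vertex pairs that are not edges."""
    present = {frozenset(e) for e in edges}
    return [pair for pair in combinations(vertices, 2) if frozenset(pair) not in present]
def max_clique_size(vertices, edges):
    """Brute force: the largest vertex subset in which every pair is an edge."""
    present = {frozenset(e) for e in edges}
    for size in range(len(vertices), 0, -1):
        for chosen in combinations(vertices, size):
            if all(frozenset(p) in present for p in combinations(chosen, 2)):
                return size
    return 0
def max_independent_set_size(vertices, edges):
    """An independent set in G is a clique in the complement of G."""
    return max_clique_size(vertices, complement(vertices, edges))
star_vertices = ['v', 'a', 'b', 'c', 'd']
star_edges = [('v', 'a'), ('v', 'b'), ('v', 'c'), ('v', 'd')]
```
On the star, the four leaves form an independent set, and in the complement graph those same four leaves form a clique.
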
01:05:14.480 --> 01:05:16.690
I'll leave it at that.

01:05:16.690 --> 01:05:18.490
I'm going to keep going up.

01:05:18.490 --> 01:05:22.173
The next level most studied
is Exp-APX-complete.

01:05:25.136 --> 01:05:27.010
So for clique and independent set, the best approximation
the best approximation

01:05:27.010 --> 01:05:29.960
is n divided by log squared n.

01:05:29.960 --> 01:05:32.234
And there's a lower bound
of n to the 1 minus epsilon.

01:05:32.234 --> 01:05:34.400
So there is a gap in terms
of their approximability.

01:05:34.400 --> 01:05:35.775
But what we know
is that they are

01:05:35.775 --> 01:05:39.930
the hardest problems that have
any n to the c approximation.

01:05:39.930 --> 01:05:44.380
They're all reducible to each
other via PTAS reductions.

01:05:44.380 --> 01:05:45.705
So, fairly preserving.

01:05:48.680 --> 01:05:52.010
So our next class
up is Exp-APX-complete:

01:05:52.010 --> 01:05:59.980
problems approximable with
exponential-in-n approximation

01:05:59.980 --> 01:06:00.480
factors.

01:06:00.480 --> 01:06:02.850
How would that happen?

01:06:02.850 --> 01:06:04.420
This is kind of funny.

01:06:04.420 --> 01:06:09.350
And the canonical problem here--
the basic reason is numbers.

01:06:12.190 --> 01:06:14.410
We take the traveling
salesman problem.

01:06:14.410 --> 01:06:16.950
And every edge
can have a weight.

01:06:16.950 --> 01:06:18.730
Let's say it's integer weights.

01:06:18.730 --> 01:06:21.960
But any integer weight that
is expressible in n bits

01:06:21.960 --> 01:06:25.740
is fair game, which means
the actual value of that edge

01:06:25.740 --> 01:06:28.500
is going to be exponential in n.

01:06:28.500 --> 01:06:31.210
And from that, you can get
a very easy lower bound.

01:06:31.210 --> 01:06:33.480
And in fact, all
problems that are

01:06:33.480 --> 01:06:38.430
in Exp-APX
can be reduced to general TSP,

01:06:38.430 --> 01:06:40.267
where you're just given
a bunch of distances

01:06:40.267 --> 01:06:41.350
between pairs of vertices.

01:06:41.350 --> 01:06:43.040
It doesn't satisfy
triangle inequality.

01:06:43.040 --> 01:06:44.930
That's the non-metric aspect.

01:06:44.930 --> 01:06:48.100
The triangle inequality TSP,
which is what normally happens,

01:06:48.100 --> 01:06:49.260
there is a constant factor.

01:06:49.260 --> 01:06:51.430
It's APX-complete.

01:06:51.430 --> 01:06:57.030
But for general weights
between pairs of vertices,

01:06:57.030 --> 01:06:59.370
non-metric, it's
Exp-APX-complete,

01:06:59.370 --> 01:07:03.220
because you can
basically make a graph

01:07:03.220 --> 01:07:05.240
and solve
Hamiltonicity by saying

01:07:05.240 --> 01:07:09.050
all the edges in the graph
have weight one or zero,

01:07:09.050 --> 01:07:12.790
and all of the edges-- I guess
one would be a little bit more

01:07:12.790 --> 01:07:14.240
legitimate.

01:07:14.240 --> 01:07:16.350
And all the non-edges
in the graph

01:07:16.350 --> 01:07:17.890
are going to give
weight infinity.

01:07:17.890 --> 01:07:19.920
Infinity is the largest
expressible number which

01:07:19.920 --> 01:07:22.360
is 1, 1, 1, 1, n bits long.

01:07:22.360 --> 01:07:24.610
And so either you use one
of those edges or you don't.

01:07:24.610 --> 01:07:27.540
And there's an exponential
gap between them.

01:07:27.540 --> 01:07:29.590
So even if we disallow
zeros being an output,

01:07:29.590 --> 01:07:33.446
then we get
exponential separation.

01:07:33.446 --> 01:07:35.070
That doesn't prove
completeness, but it

01:07:35.070 --> 01:07:38.070
proves that you can't hope
for better than exponential

01:07:38.070 --> 01:07:40.910
approximation there.
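
NOTE
Editorial sketch, not from the lecture: a minimal Python illustration of the gap argument above, assuming a graph on n >= 3 vertices labeled 0..n-1 and given as an edge list. The helper names tsp_weights and best_tour_cost are hypothetical, and the brute-force search is only meant for tiny inputs.
```python
from itertools import permutations
def tsp_weights(n, edges):
    # Weight 1 on graph edges; non-edges get the largest n-bit number ("infinity").
    big = 2**n - 1
    edge_set = {frozenset(e) for e in edges}
    w = [[big] * n for _ in range(n)]
    for u in range(n):
        w[u][u] = 0
        for v in range(n):
            if frozenset((u, v)) in edge_set:
                w[u][v] = 1
    return w
def best_tour_cost(w):
    # Brute force over all tours starting at vertex 0 (exponential time).
    n = len(w)
    tours = ((0,) + p for p in permutations(range(1, n)))
    return min(sum(w[t[i]][t[(i + 1) % n]] for i in range(n)) for t in tours)
```
If the graph has a Hamiltonian cycle, the optimal tour costs exactly n; otherwise every tour uses a non-edge and costs at least 2**n - 1, which is the exponential gap.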

01:07:40.910 --> 01:07:42.240
OK.

01:07:42.240 --> 01:07:46.620
Two more even crazier classes.

01:07:46.620 --> 01:07:48.420
Now we did see these
classes come up

01:07:48.420 --> 01:07:52.580
with the
characterization theorem.

01:07:52.580 --> 01:07:54.990
But these are probably how
these results were proved.

01:08:17.750 --> 01:08:20.778
So you might think, well,
doubly exponential.

01:08:20.778 --> 01:08:21.319
I don't know.

01:08:21.319 --> 01:08:22.376
What's next?

01:08:22.376 --> 01:08:24.189
Next, you could define that.

01:08:24.189 --> 01:08:27.550
But what seems to
appear most often

01:08:27.550 --> 01:08:33.040
is this is the ultimate class
among all NP optimization

01:08:33.040 --> 01:08:34.810
problems, you could
imagine being complete

01:08:34.810 --> 01:08:36.060
against all of them.

01:08:36.060 --> 01:08:40.270
And this is with respect
to AP-reductions,

01:08:40.270 --> 01:08:41.279
one of the ones we saw.

01:08:44.090 --> 01:08:47.490
And I'm going to define a very
closely related class, which

01:08:47.490 --> 01:08:51.560
is NPO PB, NPO
polynomially bounded.

01:08:57.700 --> 01:08:58.992
OK.

01:08:58.992 --> 01:09:02.220
So these are the hardest
problems to approximate.

01:09:02.220 --> 01:09:04.740
This is basically the problems
that have numbers in them,

01:09:04.740 --> 01:09:06.810
and these are the problems
that have no numbers,

01:09:06.810 --> 01:09:10.180
or if they have numbers they
are polynomially bounded,

01:09:10.180 --> 01:09:12.660
like the polynomial situation.

01:09:12.660 --> 01:09:16.160
So non-metric TSP, well, it's
not as hard as NPO-complete,

01:09:16.160 --> 01:09:18.021
but it's more in this category.

01:09:18.021 --> 01:09:20.645
AUDIENCE: Is there a notion
of strongness, weakness

01:09:20.645 --> 01:09:22.450
in these kind of things?

01:09:22.450 --> 01:09:23.620
PROFESSOR: That's funny.

01:09:23.620 --> 01:09:25.090
This is a stronger result.

01:09:25.090 --> 01:09:26.560
So there's not quite an analog.

01:09:26.560 --> 01:09:29.569
But you can do
exponential tricks

01:09:29.569 --> 01:09:33.140
and give yourself a
hard time over here.

01:09:33.140 --> 01:09:36.080
And here you're just
not allowed to use.

01:09:36.080 --> 01:09:37.760
Everything's polynomial.

01:09:37.760 --> 01:09:41.870
So 3-partition is sort
of more in this universe.

01:09:41.870 --> 01:09:45.490
But in this situation, if you
sort of have 3-partition,

01:09:45.490 --> 01:09:50.410
but with exponential numbers,
then you get this harder class.

01:09:50.410 --> 01:09:53.040
So this is not the
analog of weak.

01:09:53.040 --> 01:09:57.724
You could maybe imagine--
well, in some sense,

01:09:57.724 --> 01:09:59.390
weak is a modifier
in the problem, where

01:09:59.390 --> 01:10:01.139
you say I want to
restrict all the numbers

01:10:01.139 --> 01:10:02.700
to a polynomial size.

01:10:02.700 --> 01:10:05.900
So when you do something
like 3-partition,

01:10:05.900 --> 01:10:10.160
it's sort of a weak
problem, or it's

01:10:10.160 --> 01:10:12.270
a polynomially bounded problem.

01:10:12.270 --> 01:10:15.850
Strong NP hardness means
that that is NP-complete.

01:10:15.850 --> 01:10:19.012
Anyway vague analog,
but not quite.

01:10:19.012 --> 01:10:21.470
It's possible some of these,
you could add a weak modifier,

01:10:21.470 --> 01:10:24.590
and it would mean
something, but I don't know.

01:10:24.590 --> 01:10:25.090
All right.

01:10:25.090 --> 01:10:27.230
So I just want to give
you some sample problems

01:10:27.230 --> 01:10:29.290
on both of these sides.

01:10:29.290 --> 01:10:31.930
Maybe let's start
with this side, which

01:10:31.930 --> 01:10:35.117
is a little more
interesting, because you

01:10:35.117 --> 01:10:36.575
get some kind of
familiar problems,

01:10:36.575 --> 01:10:37.533
and they're super hard.

01:10:40.520 --> 01:10:46.150
Minimum independent
dominating set.

01:10:46.150 --> 01:10:47.300
We've seen independent set.

01:10:47.300 --> 01:10:48.383
We've seen dominating set.

01:10:48.383 --> 01:10:51.390
Independent set is already
hard to approximate.

01:10:51.390 --> 01:10:56.360
But this problem is
worse, because even

01:10:56.360 --> 01:10:58.080
finding an independent
dominating set

01:10:58.080 --> 01:11:02.020
is NP-complete, whereas
finding an independent set,

01:11:02.020 --> 01:11:04.210
I can choose nothing.

01:11:04.210 --> 01:11:06.990
But if I want to simultaneously
be dominating and independent,

01:11:06.990 --> 01:11:07.870
that's NP-hard

01:11:07.870 --> 01:11:09.570
to find any solution.

01:11:09.570 --> 01:11:17.680
In general in NPO PB problems,
NPO PB-complete problems,

01:11:17.680 --> 01:11:20.920
it's always NP-complete to
find a feasible solution.

01:11:20.920 --> 01:11:22.543
But it's worse than that.

01:11:22.543 --> 01:11:25.210
So the first level would be
to find a feasible solution.

01:11:25.210 --> 01:11:26.910
And this is saying
on top of that you

01:11:26.910 --> 01:11:28.490
want to minimize the size.

01:11:28.490 --> 01:11:30.276
I think Max would also be hard.

01:11:30.276 --> 01:11:32.080
But I think there's
a general theorem,

01:11:32.080 --> 01:11:33.640
that if you're hard
in the min case,

01:11:33.640 --> 01:11:35.570
you're also hard
in the max case.

01:11:35.570 --> 01:11:38.960
But it depends on
the exact set-up.

01:11:38.960 --> 01:11:41.540
So this is sort of an
optimization version

01:11:41.540 --> 01:11:44.110
that makes it even
harder than NP-complete.

01:11:44.110 --> 01:11:49.330
So I think this is NP-complete,
and this is kind of even worse.

01:11:49.330 --> 01:11:52.910
It's sort of stating the
stronger thing about when

01:11:52.910 --> 01:11:55.380
you're trying to optimize
over a space of solutions,

01:11:55.380 --> 01:11:57.130
that it's NP-complete to decide.

01:11:57.130 --> 01:11:59.320
Notice that's still
an NPO problem.

01:11:59.320 --> 01:12:01.490
We define that
solutions need to be

01:12:01.490 --> 01:12:03.244
recognizable in polynomial time.

01:12:03.244 --> 01:12:05.035
But we didn't say that
you can generate one

01:12:05.035 --> 01:12:06.200
in polynomial time.

01:12:06.200 --> 01:12:09.030
So it could be NP-complete
to find a single solution,

01:12:09.030 --> 01:12:09.744
like here.

01:12:09.744 --> 01:12:11.660
All of these problems
will have that property.
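
NOTE
Editorial sketch, not from the lecture: the NPO requirement that solutions be recognizable in polynomial time, illustrated on independent dominating set. This assumes the graph is a dict mapping each vertex to the set of its neighbours; the function name is hypothetical.
```python
def is_independent_dominating_set(adj, chosen):
    # adj: dict vertex -> set of neighbours; chosen: candidate vertex set.
    chosen = set(chosen)
    # Independent: no chosen vertex has a chosen neighbour.
    independent = all(adj[u].isdisjoint(chosen) for u in chosen)
    # Dominating: every vertex is chosen or has a chosen neighbour.
    dominating = all(v in chosen or bool(adj[v] & chosen) for v in adj)
    return independent and dominating
```
The check runs in time polynomial in the graph size, which is all the definition asks for; it says nothing about how hard it is to produce a candidate.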

01:12:15.990 --> 01:12:21.250
Another fun problem is
shortest computation.

01:12:21.250 --> 01:12:23.189
This is sort of the
most intuitive one

01:12:23.189 --> 01:12:23.980
at a certain level.

01:12:23.980 --> 01:12:25.480
If you know Turing
machines, and you

01:12:25.480 --> 01:12:27.396
have a non-deterministic
Turing machine, which

01:12:27.396 --> 01:12:29.020
could take
non-deterministic branches,

01:12:29.020 --> 01:12:31.630
you want to find the computation
in such a machine that

01:12:31.630 --> 01:12:34.720
terminates the earliest
using the fewest steps.

01:12:34.720 --> 01:12:39.080
So you might think of that
as the canonical NPO PB problem.

01:12:39.080 --> 01:12:41.690
There's no numbers in it,
but as you can imagine,

01:12:41.690 --> 01:12:44.440
that's super hard to do.

01:12:44.440 --> 01:12:46.840
Here's some more
graph theoretic ones.

01:12:46.840 --> 01:12:50.920
Quite natural problems,
but super hard.

01:12:50.920 --> 01:12:52.510
Longest induced path.

01:12:52.510 --> 01:12:55.030
Induced means, there
are no other edges

01:12:55.030 --> 01:12:57.320
between the chosen vertices.

01:12:57.320 --> 01:13:00.810
So this is sort of
longest path is one thing.

01:13:00.810 --> 01:13:03.030
That's quite hard to
approximate-- like, I think,

01:13:03.030 --> 01:13:05.070
n to the 1 minus epsilon.

01:13:05.070 --> 01:13:07.070
That's sort of the
analog of Hamiltonicity.

01:13:07.070 --> 01:13:09.740
Longest induced
path is worse.

01:13:09.740 --> 01:13:12.190
Even finding an induced
path of length k,

01:13:12.190 --> 01:13:16.550
finding a feasible solution,
finding an induced path

01:13:16.550 --> 01:13:17.150
is hard.

01:13:24.310 --> 01:13:33.749
Another fun one is longest
path with forbidden pairs.

01:13:33.749 --> 01:13:35.540
So there are pairs of
edges that you're not

01:13:35.540 --> 01:13:38.160
allowed to choose together, and
subject to those constraints

01:13:38.160 --> 01:13:40.000
you want to find
the longest path.

01:13:40.000 --> 01:13:42.600
So these are all
NPO PB complete.

01:13:42.600 --> 01:13:44.437
No numbers in any of them.

01:13:44.437 --> 01:13:46.145
Now let me give you
some number problems.

01:13:58.930 --> 01:14:03.230
So Ones was you want to maximize
the number of true variables.

01:14:03.230 --> 01:14:05.710
Now we're going to add weights.

01:14:05.710 --> 01:14:09.330
So we want to maximize
the sum of the weights

01:14:09.330 --> 01:14:12.370
of the true
variables-- and while

01:14:12.370 --> 01:14:15.830
satisfying a Boolean formula.

01:14:15.830 --> 01:14:17.970
So again, finding a
feasible solution is hard.

01:14:17.970 --> 01:14:19.800
That's not surprising.

01:14:19.800 --> 01:14:22.440
Here, the weights can
be exponential in value,

01:14:22.440 --> 01:14:24.500
because we allow n
bits for the weights.

01:14:24.500 --> 01:14:28.450
And that pushes you
into NPO completeness.

01:14:28.450 --> 01:14:31.210
If you say the weights have
to be polynomially bounded,

01:14:31.210 --> 01:14:33.276
then this problem
is NPO PB complete.
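
NOTE
Editorial sketch, not from the lecture: the objective of the weighted ones problem as described, assuming a CNF formula encoded as lists of nonzero integers (positive for a variable, negative for its negation). That encoding and the function name are assumptions for illustration.
```python
def weighted_ones_value(clauses, weights, assignment):
    # clauses: list of lists of nonzero ints; weights and assignment are dicts keyed by variable.
    # A clause is satisfied if at least one of its literals agrees with the assignment.
    satisfied = all(any(assignment[abs(lit)] == (lit > 0) for lit in c) for c in clauses)
    if not satisfied:
        return None  # infeasible: the assignment does not satisfy the formula
    # Objective: total weight of the variables set to true.
    return sum(weights[v] for v, val in assignment.items() if val)
```
With n-bit weights the returned value can be exponential in n; restricting the weights to polynomial size is the polynomially bounded variant.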

01:14:33.276 --> 01:14:34.900
And that's sort of
the starting problem

01:14:34.900 --> 01:14:36.820
that they used to prove
all of these are hard.

01:14:36.820 --> 01:14:39.420
So there are reductions from
this with polynomial weights

01:14:39.420 --> 01:14:40.100
to these guys.

01:14:44.438 --> 01:14:47.330
AUDIENCE: [INAUDIBLE]?

01:14:47.330 --> 01:14:49.220
PROFESSOR: 3SAT.

01:14:49.220 --> 01:14:54.080
I don't know. Whether you could
go down to 2SAT is interesting.

01:14:54.080 --> 01:14:57.960
Here they say, I think,
probably 3SAT or CNFSAT.

01:14:57.960 --> 01:15:00.040
Those reductions
definitely still work.

01:15:00.040 --> 01:15:03.050
Whether you could put the
2SAT into the Max aspect,

01:15:03.050 --> 01:15:03.630
I don't know.

01:15:03.630 --> 01:15:06.550
But this could be
fun to look at.

01:15:06.550 --> 01:15:09.000
There aren't a ton of papers
about these two classes,

01:15:09.000 --> 01:15:11.050
but there are a few
before they nailed down

01:15:11.050 --> 01:15:12.860
any interesting problems.

01:15:12.860 --> 01:15:14.740
Here's another
interesting problem.

01:15:20.600 --> 01:15:24.830
Suppose you want to do
integer linear programming.

01:15:24.830 --> 01:15:28.710
To keep it simple, we'll
assume that the variables are

01:15:28.710 --> 01:15:33.452
zero or one, and then
that is equally hard.

01:15:33.452 --> 01:15:34.910
Here it's a little,
unless you know

01:15:34.910 --> 01:15:37.034
a lot about linear programming,
it's not so obvious

01:15:37.034 --> 01:15:39.290
that finding a feasible
solution here is hard.

01:15:39.290 --> 01:15:41.589
But in general, linear
programming-- at least

01:15:41.589 --> 01:15:43.880
in the non-integer case--
you could reduce optimization

01:15:43.880 --> 01:15:45.660
to feasibility.

01:15:45.660 --> 01:15:47.992
So I think the same
thing applies here.

01:15:47.992 --> 01:15:49.950
If you're not familiar
with linear programming,

01:15:49.950 --> 01:15:53.450
it's basically a bunch of
inequality constraints,

01:15:53.450 --> 01:15:55.260
linear inequality constraints.

01:15:55.260 --> 01:15:58.330
And now this is a
bunch of integers.

01:15:58.330 --> 01:16:01.840
These are both given integer
matrices and vectors.

01:16:01.840 --> 01:16:05.400
And they can have
exponential value.

01:16:05.400 --> 01:16:06.310
Question?

01:16:06.310 --> 01:16:08.750
AUDIENCE: For the
max/min weighted ones,

01:16:08.750 --> 01:16:12.320
for polynomial bounded,
is it still hard

01:16:12.320 --> 01:16:15.460
if you just do ones
and minus ones?

01:16:15.460 --> 01:16:19.560
PROFESSOR: I think min or
max ones without weights

01:16:19.560 --> 01:16:21.600
is NPO PB-complete.

01:16:21.600 --> 01:16:23.600
I should double-check.

01:16:23.600 --> 01:16:27.100
I didn't actually mention, but
this characterization theorem

01:16:27.100 --> 01:16:30.230
works for weighted
problems also.

01:16:30.230 --> 01:16:33.540
For every single case, they show
that weighted and unweighted

01:16:33.540 --> 01:16:38.640
are the same complexity,
except for this one.

01:16:38.640 --> 01:16:42.740
In the min ones case, if setting all
the variables true satisfies it,

01:16:42.740 --> 01:16:45.330
you get Poly-APX-completeness
if you're unweighted.

01:16:45.330 --> 01:16:50.300
If you're weighted, then you
can't find any approximation.

01:16:50.300 --> 01:16:55.390
It's NP-hard to find any factor,
which I think, this is, I

01:16:55.390 --> 01:16:58.037
think, before the
introduction or popularization

01:16:58.037 --> 01:16:58.745
of these classes.

01:16:58.745 --> 01:17:03.549
So that may be distinguishing
between Poly-APX-complete,

01:17:03.549 --> 01:17:05.590
which is definitely smaller
than NPO PB-complete.

01:17:05.590 --> 01:17:08.430
This might be NPO
PB-completeness.

01:17:08.430 --> 01:17:08.930
Unclear.

01:17:08.930 --> 01:17:12.120
But it's definitely
worse than Poly-APX.

01:17:12.120 --> 01:17:13.150
Yeah?

01:17:13.150 --> 01:17:15.150
AUDIENCE: How is that
distinguished from Exp-APX?

01:17:15.150 --> 01:17:17.550
Because I'm just confused how
you would ever get anything

01:17:17.550 --> 01:17:19.950
worse than this, because,
that's like the biggest

01:17:19.950 --> 01:17:22.370
that you [INAUDIBLE].

01:17:22.370 --> 01:17:25.470
PROFESSOR: So this problem
is exponential APX-hard

01:17:25.470 --> 01:17:26.795
if you forbid zero.

01:17:26.795 --> 01:17:30.030
If you allow zero, then you
can't get any approximation.

01:17:30.030 --> 01:17:32.500
Here, I think even
when you allow zero,

01:17:32.500 --> 01:17:34.820
or even when you
forbid zero, you still

01:17:34.820 --> 01:17:36.000
can't get an approximation.

01:17:36.000 --> 01:17:39.370
I think that's the idea here.

01:17:39.370 --> 01:17:42.020
Here, these problems
generally you

01:17:42.020 --> 01:17:44.812
can get, depending
on your set-up,

01:17:44.812 --> 01:17:47.395
these problems you can all get
like a factor-n approximation.

01:17:49.900 --> 01:17:52.360
Well, maybe not in
polynomial time.

01:17:52.360 --> 01:17:54.090
This is hard to find.

01:17:54.090 --> 01:17:55.320
Some of these you can.

01:17:55.320 --> 01:17:58.750
Longest induced path, just
have a path of length 1.

01:17:58.750 --> 01:18:00.220
That will be induced.

01:18:00.220 --> 01:18:02.040
So that gives you a
factor n approximation.
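
NOTE
Editorial sketch, not from the lecture: a checker for the induced-path condition, assuming the graph is a dict mapping each vertex to the set of its neighbours. The function name is hypothetical.
```python
def is_induced_path(adj, path):
    # A path is induced if two of its vertices are adjacent exactly when they are consecutive.
    if len(set(path)) != len(path):
        return False  # repeated vertex: not a simple path
    pos = {v: i for i, v in enumerate(path)}
    return all(
        (v in adj[u]) == (abs(pos[u] - pos[v]) == 1)
        for u in path for v in path if u != v
    )
```
Any single edge passes this check, and no path has more than n vertices, which is where the trivial factor-n guarantee comes from.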

01:18:02.040 --> 01:18:05.239
There is a lower bound
on this situation,

01:18:05.239 --> 01:18:07.030
n to the 1 minus epsilon
inapproximability.

01:18:09.640 --> 01:18:12.810
I think morally it
should be a factor n,

01:18:12.810 --> 01:18:15.190
but this is the
best result I found.

01:18:15.190 --> 01:18:17.290
So it's funny.

01:18:17.290 --> 01:18:19.800
This is only for
number problems.

01:18:19.800 --> 01:18:21.520
So I presented this
as in between.

01:18:21.520 --> 01:18:23.832
But this is actually
in some sense lower

01:18:23.832 --> 01:18:24.915
than Exp-APX-completeness.

01:18:27.716 --> 01:18:29.465
It's sort of a harder
version of Poly-APX.

01:18:32.130 --> 01:18:34.740
This is a slightly harder
version of Exp-APX.

01:18:37.320 --> 01:18:39.920
I think it's a small
difference, but it's

01:18:39.920 --> 01:18:43.410
good to know there
is this difference.

01:18:43.410 --> 01:18:46.160
Other questions?

01:18:46.160 --> 01:18:46.660
All right.

01:18:46.660 --> 01:18:54.460
So this ends what I plan to say
about L-reduction-style proofs,

01:18:54.460 --> 01:18:57.562
which are all about
preserving approximability.

01:18:57.562 --> 01:18:59.020
The next class,
we're going to look

01:18:59.020 --> 01:19:01.980
at a different take on
inapproximability, which

01:19:01.980 --> 01:19:06.180
is called gaps, and gap
preserving reductions,

01:19:06.180 --> 01:19:08.320
where you can set up
a problem that either

01:19:08.320 --> 01:19:10.980
it has a great solution,
or the next solution

01:19:10.980 --> 01:19:12.110
below that is way lower.

01:19:12.110 --> 01:19:15.105
And there's a gap between the
best and the next to best.

01:19:15.105 --> 01:19:16.480
And whenever you
have such a gap,

01:19:16.480 --> 01:19:18.249
you also have an
inapproximability gap,

01:19:18.249 --> 01:19:20.290
because you know there's
this solution out there,

01:19:20.290 --> 01:19:24.510
but finding it, if it's
NP-complete to find this,

01:19:24.510 --> 01:19:27.240
to solve it exactly, and
so the next level down you

01:19:27.240 --> 01:19:28.110
lose some factor.

01:19:28.110 --> 01:19:30.600
And whatever that gap is is
your inapproximability bound.

01:19:30.600 --> 01:19:33.280
It doesn't give you
completeness results like this

01:19:33.280 --> 01:19:35.030
in general-- not always.

01:19:35.030 --> 01:19:37.732
But it tends to give you really
good inapproximability bounds.

01:19:37.732 --> 01:19:40.190
Here I've completely ignored
what the constant factors are.

01:19:40.190 --> 01:19:42.860
Most of them are not so great.

01:19:42.860 --> 01:19:44.650
Like when you
prove APX-hardness,

01:19:44.650 --> 01:19:48.770
usually you get a 1 plus 1
over 1,000 kind of lower bound

01:19:48.770 --> 01:19:50.540
on the approximability factor.

01:19:50.540 --> 01:19:53.750
But the best upper
bound is like 2, or 1.5.

01:19:53.750 --> 01:19:55.290
And what we'll talk
about next time,

01:19:55.290 --> 01:19:58.290
you can get much closer--
sometimes exact bounds

01:19:58.290 --> 01:20:00.380
between upper and lower.

01:20:00.380 --> 01:20:03.130
But that will be next week.