WEBVTT

00:00:00.060 --> 00:00:02.500
The following content is
provided under a Creative

00:00:02.500 --> 00:00:04.019
Commons license.

00:00:04.019 --> 00:00:06.360
Your support will help
MIT OpenCourseWare

00:00:06.360 --> 00:00:10.730
continue to offer high quality
educational resources for free.

00:00:10.730 --> 00:00:13.330
To make a donation or
view additional materials

00:00:13.330 --> 00:00:17.236
from hundreds of MIT courses,
visit MIT OpenCourseWare

00:00:17.236 --> 00:00:17.861
at ocw.mit.edu.

00:00:21.720 --> 00:00:24.930
SRINIVAS DEVADAS:
So welcome to 6046.

00:00:24.930 --> 00:00:27.010
My name is Srinivas Devadas.

00:00:27.010 --> 00:00:30.240
I'm a professor of
computer science.

00:00:30.240 --> 00:00:33.650
This is my 27th year at MIT.

00:00:33.650 --> 00:00:37.740
I'm teaching this class
with great course staff,

00:00:37.740 --> 00:00:42.010
with co-lecturers,
Erik Demaine over here

00:00:42.010 --> 00:00:45.750
and Nancy Lynch,
who's over there,

00:00:45.750 --> 00:00:50.250
and a whole bunch of TAs, who
you will meet through the term.

00:00:50.250 --> 00:00:55.620
We just signed up our last TA
yesterday, so at this point,

00:00:55.620 --> 00:00:58.230
even I don't know their names.

00:00:58.230 --> 00:01:01.840
But we hope to have
a great semester.

00:01:01.840 --> 00:01:05.470
I'm very excited to be teaching
this class with Erik and Nancy.

00:01:05.470 --> 00:01:09.830
I recognize some of you folks
from 006 from a year ago,

00:01:09.830 --> 00:01:13.450
so hello again, and
from other classes.

00:01:13.450 --> 00:01:16.600
And so let's get started.

00:01:16.600 --> 00:01:18.310
I mentioned 006.

00:01:18.310 --> 00:01:21.730
006 is a prerequisite
for this class,

00:01:21.730 --> 00:01:26.030
so if by chance you've
skipped a class--

00:01:26.030 --> 00:01:29.490
MIT or EECS has allowed
you to skip that--

00:01:29.490 --> 00:01:34.440
make sure you check in with
us to see that you are ready

00:01:34.440 --> 00:01:39.633
for 6046 because we will
assume that you know the 6006

00:01:39.633 --> 00:01:40.340
material.

00:01:40.340 --> 00:01:42.950
And by that, I
mean basic material

00:01:42.950 --> 00:01:46.290
on data structures,
classical algorithms

00:01:46.290 --> 00:01:52.930
like sorting, algorithms
for dynamic programming,

00:01:52.930 --> 00:01:56.750
or algorithms that use dynamic
programming I should say,

00:01:56.750 --> 00:02:00.580
algorithms for shortest
paths, et cetera.

00:02:00.580 --> 00:02:04.550
6046 itself, we're
going to run this course

00:02:04.550 --> 00:02:08.039
pretty much off the Stellar
website in the sense

00:02:08.039 --> 00:02:12.550
that that'll be our one-stop
shop for getting everything

00:02:12.550 --> 00:02:16.490
including lecture handouts,
problem sets-- turning

00:02:16.490 --> 00:02:19.590
in your problem sets, et cetera.

00:02:19.590 --> 00:02:22.800
And I should mention
that this course is

00:02:22.800 --> 00:02:27.290
being taped for
OpenCourseWare, and while it'll

00:02:27.290 --> 00:02:32.750
take a little bit of time for
the videos to be put online,

00:02:32.750 --> 00:02:39.370
we hope to do that perhaps
in clumps before the quizzes

00:02:39.370 --> 00:02:46.020
that you will have as we
have to have in our class.

00:02:46.020 --> 00:02:49.940
So let me just say a couple
more things about logistics,

00:02:49.940 --> 00:02:53.580
and then we'll get started
with technical content.

00:02:53.580 --> 00:02:55.610
As I mentioned, we're
going to be running

00:02:55.610 --> 00:02:57.610
this course off Stellar.

00:02:57.610 --> 00:03:00.610
Please sign up for
a recitation section

00:03:00.610 --> 00:03:04.390
by going to the Stellar website
and choosing a section that

00:03:04.390 --> 00:03:06.270
works for your schedule.

00:03:06.270 --> 00:03:11.580
Sections go from 10:00 AM
all the way to 3:00 I think,

00:03:11.580 --> 00:03:17.390
and we've placed a limit on the
number of students per section.

00:03:17.390 --> 00:03:20.140
We wanted the sections
to be manageable in size,

00:03:20.140 --> 00:03:22.600
but there's plenty of
room for everybody,

00:03:22.600 --> 00:03:24.950
and the schedule
flexibility should

00:03:24.950 --> 00:03:29.820
allow you to choose a
section pretty easily.

00:03:29.820 --> 00:03:32.310
We have a course information
document and an objectives

00:03:32.310 --> 00:03:34.270
document on the website.

00:03:34.270 --> 00:03:37.460
That has a lot of details
on the grading policy,

00:03:37.460 --> 00:03:40.260
the collaboration
policy, et cetera.

00:03:40.260 --> 00:03:44.700
Please read it very
carefully from the first page

00:03:44.700 --> 00:03:46.230
all the way to the end.

00:03:46.230 --> 00:03:48.530
And I will mention
one thing that you

00:03:48.530 --> 00:03:55.650
should be careful about, which
is that while problem sets are

00:03:55.650 --> 00:03:59.380
only 30% of the
grade, we do require

00:03:59.380 --> 00:04:02.180
you to attempt the problems.

00:04:02.180 --> 00:04:04.120
And there's actually
a penalty associated

00:04:04.120 --> 00:04:07.940
with not attempting problems and
not turning problem sets in that

00:04:07.940 --> 00:04:11.390
is way more than 30%,
so keep that in mind,

00:04:11.390 --> 00:04:14.470
and please read the
collaboration policy

00:04:14.470 --> 00:04:17.130
as well as the grading
policy, carefully.

00:04:17.130 --> 00:04:19.740
And feel free to
ask us questions.

00:04:19.740 --> 00:04:23.600
You can ask us questions
anonymously through Piazza,

00:04:23.600 --> 00:04:25.520
or you can certainly
send us email.

00:04:25.520 --> 00:04:27.930
All the information
is on Stellar.

00:04:27.930 --> 00:04:32.960
So that's all I really had to
say about course logistics.

00:04:32.960 --> 00:04:35.830
Let me tell you a
little bit about how

00:04:35.830 --> 00:04:37.800
the content of this
course is structured.

00:04:40.410 --> 00:04:44.900
We have several modules,
and Erik, Nancy,

00:04:44.900 --> 00:04:49.810
and I will be in charge of
each of these different modules

00:04:49.810 --> 00:04:51.670
as the term goes.

00:04:51.670 --> 00:04:58.070
Our very first module is going
to start really next time.

00:04:58.070 --> 00:05:00.400
Today is really an
overview lecture.

00:05:00.400 --> 00:05:02.890
But it's a module on
divide and conquer,

00:05:02.890 --> 00:05:06.970
and you learned about this
divide and conquer paradigm

00:05:06.970 --> 00:05:09.670
in 006 or equivalent classes.

00:05:09.670 --> 00:05:12.350
It's breaking of a problem
into smaller problems

00:05:12.350 --> 00:05:14.690
and getting efficiency that way.

00:05:14.690 --> 00:05:16.780
Merge sort is a
classic algorithm

00:05:16.780 --> 00:05:19.600
that follows the divide
and conquer paradigm.

00:05:19.600 --> 00:05:22.180
We're going to
take it to a new level.

00:05:22.180 --> 00:05:25.190
And I guess that's sort
of the theme of 046.

00:05:25.190 --> 00:05:28.920
Take the material in 006 and
raise the stakes a little bit--

00:05:28.920 --> 00:05:31.030
raise the level of
sophistication--

00:05:31.030 --> 00:05:34.880
and you'll see things like
fast Fourier transform.

00:05:34.880 --> 00:05:37.310
Finding an algorithm
for a convex hull,

00:05:37.310 --> 00:05:38.770
we'll do that next time.

00:05:38.770 --> 00:05:42.010
That uses the divide
and conquer paradigm.

00:05:42.010 --> 00:05:44.670
We're going to do a
ton of optimization.

00:05:44.670 --> 00:05:48.790
Divide and conquer can
obviously be used for search

00:05:48.790 --> 00:05:51.060
and also for optimization.

00:05:51.060 --> 00:05:56.580
In particular, we'll look
at strategies corresponding

00:05:56.580 --> 00:06:02.050
to greedy algorithms, Dijkstra,
which hopefully you remember

00:06:02.050 --> 00:06:07.480
the shortest path algorithm from
006 is an example of a greedy

00:06:07.480 --> 00:06:08.370
algorithm.

00:06:08.370 --> 00:06:11.360
We'll see a bunch
of other examples,

00:06:11.360 --> 00:06:13.260
and we'll look at one today.

00:06:13.260 --> 00:06:19.770
And dynamic programming, it's
a wonderful algorithmic hammer

00:06:19.770 --> 00:06:22.730
that you can apply to a
wide variety of problems,

00:06:22.730 --> 00:06:25.250
certainly to shortest
paths as well.

00:06:25.250 --> 00:06:27.700
We'll look at it in
many different contexts.

00:06:27.700 --> 00:06:33.580
And then really
quickly network flow,

00:06:33.580 --> 00:06:36.490
which is a problem that's
associated with-- here's

00:06:36.490 --> 00:06:37.480
a network.

00:06:37.480 --> 00:06:40.620
There are capacities associated
with the network.

00:06:40.620 --> 00:06:44.010
The capacities could
correspond to the width

00:06:44.010 --> 00:06:48.880
of the roads in a highway
system or the number

00:06:48.880 --> 00:06:52.170
of lanes, the amount of
traffic that can go through.

00:06:52.170 --> 00:06:56.230
How do I maximize the
set of commodities,

00:06:56.230 --> 00:06:58.140
or the amount of
commodities that I

00:06:58.140 --> 00:06:59.730
can push through the network?

00:06:59.730 --> 00:07:06.320
That it turns out is,
again, a problem that

00:07:06.320 --> 00:07:08.410
has many different
applications, so it's really

00:07:08.410 --> 00:07:10.606
a collection of problems.

00:07:10.606 --> 00:07:12.480
We're going to spend
some time, a little bit

00:07:12.480 --> 00:07:17.170
today, but a little
more than in 6006,

00:07:17.170 --> 00:07:19.670
talking about intractability.

00:07:19.670 --> 00:07:22.115
So a lot of algorithms that
we're going to talk about

00:07:22.115 --> 00:07:25.080
are efficient in the
sense that they're

00:07:25.080 --> 00:07:27.160
polynomial time solvable.

00:07:27.160 --> 00:07:31.560
And first, polynomial
time solvable

00:07:31.560 --> 00:07:35.801
doesn't imply efficiency
in the practical sense,

00:07:35.801 --> 00:07:37.550
so if you have an n
raised to 8 algorithm,

00:07:37.550 --> 00:07:38.930
it's polynomial time.

00:07:38.930 --> 00:07:42.250
But really, it's not something
that you can use on real world

00:07:42.250 --> 00:07:45.220
problems where n is
relatively large,

00:07:45.220 --> 00:07:49.700
but generally in a theoretical
computer science class,

00:07:49.700 --> 00:07:52.100
we'll think about
tractable problems

00:07:52.100 --> 00:07:57.320
as being those that have
polynomial time algorithms that

00:07:57.320 --> 00:08:00.270
can solve them
exactly or optimally.

00:08:00.270 --> 00:08:04.690
But intractability then
corresponds to problems

00:08:04.690 --> 00:08:08.850
that, at the moment, we don't
know of a polynomial time

00:08:08.850 --> 00:08:11.510
algorithm to solve them,
and the best algorithms

00:08:11.510 --> 00:08:14.340
we have take worst
case exponential time.

00:08:14.340 --> 00:08:17.740
And so the question is, what
happens with those problems?

00:08:17.740 --> 00:08:21.360
And we'll look at things
like approximation algorithms

00:08:21.360 --> 00:08:28.740
that can get us, in the case
of optimization problems,

00:08:28.740 --> 00:08:32.929
get us to within a certain
fraction of optimal,

00:08:32.929 --> 00:08:36.260
guaranteed, and run
in polynomial time.

00:08:36.260 --> 00:08:38.750
So you can't get
the absolute best,

00:08:38.750 --> 00:08:42.470
but you can get within 10% or
we can get within a factor of 2.

00:08:42.470 --> 00:08:46.450
That may be enough for
a particular instance

00:08:46.450 --> 00:08:49.660
of a problem or a set of
instances of a problem.

00:08:49.660 --> 00:08:53.000
And we'll do a bunch
of advanced topics.

00:08:53.000 --> 00:08:56.050
I think we have distributed
algorithms planned.

00:08:56.050 --> 00:09:01.450
Nancy works in that
area, and we'll also

00:09:01.450 --> 00:09:03.380
talk about cryptography.

00:09:03.380 --> 00:09:06.820
There's a deep connection
between number theory

00:09:06.820 --> 00:09:11.190
algorithms and cryptography
that, towards the end of the lecture,

00:09:11.190 --> 00:09:13.340
or, I should say, towards
the end of the course,

00:09:13.340 --> 00:09:18.110
I will look at a
little more closely.

00:09:18.110 --> 00:09:22.530
So much for overview, let's get
started with today's lecture

00:09:22.530 --> 00:09:24.490
for real.

00:09:24.490 --> 00:09:27.360
And here's the theme
of today's lecture.

00:09:30.050 --> 00:09:31.640
I talked a bit
about tractability

00:09:31.640 --> 00:09:33.470
and intractability.

00:09:33.470 --> 00:09:36.110
And what is fascinating
about algorithms

00:09:36.110 --> 00:09:41.020
is that you might
see a problem that

00:09:41.020 --> 00:09:46.440
has a fairly obvious polynomial
time solution or a linear time

00:09:46.440 --> 00:09:50.360
solution, then you change
it ever so slightly,

00:09:50.360 --> 00:09:53.360
and the linear time
algorithm doesn't work.

00:09:53.360 --> 00:09:56.370
Maybe you can find
a cubic algorithm.

00:09:56.370 --> 00:09:59.890
And then you change
it a little more,

00:09:59.890 --> 00:10:03.800
and you end up with
something that you

00:10:03.800 --> 00:10:06.130
can't find a polynomial
time algorithm for.

00:10:06.130 --> 00:10:10.510
You can't prove that the
polynomial time algorithm

00:10:10.510 --> 00:10:12.980
or polynomial monomial
time algorithm

00:10:12.980 --> 00:10:15.690
gives you the optimal
solution in all cases.

00:10:15.690 --> 00:10:18.700
And then you go off
into complexity theory.

00:10:18.700 --> 00:10:23.630
You maybe discover that, or
show that this problem is

00:10:23.630 --> 00:10:27.560
NP-complete, and now you're
in the intractability domain.

00:10:27.560 --> 00:10:32.540
So very small changes
in problem statements

00:10:32.540 --> 00:10:39.150
can end up with very
different situations

00:10:39.150 --> 00:10:41.960
from a standpoint of
algorithm complexity.

00:10:41.960 --> 00:10:46.480
And so that's really
what I want to point out

00:10:46.480 --> 00:10:50.585
to you in some detail
with a concrete example.

00:10:59.130 --> 00:11:02.630
So I want to get a
little bit pedantic here

00:11:02.630 --> 00:11:06.820
with respect to intractability
and tractability.

00:11:06.820 --> 00:11:12.580
You've seen, I think, these
terms before in the one lecture

00:11:12.580 --> 00:11:19.580
in 006, but we'll go over this
in some detail today and more

00:11:19.580 --> 00:11:21.310
later on in the semester.

00:11:21.310 --> 00:11:27.090
But for now, let's recall some
basic terminology associated

00:11:27.090 --> 00:11:29.990
with tractability and
intractability or complexity

00:11:29.990 --> 00:11:32.530
theory, broadly speaking.

00:11:32.530 --> 00:11:39.700
Capital P is a class of problems
solvable in polynomial time.

00:11:45.560 --> 00:11:48.600
And think of that
as big O, n raised

00:11:48.600 --> 00:11:54.686
to k for some constant k.

00:11:54.686 --> 00:11:57.210
Now you can have log
factors in there,

00:11:57.210 --> 00:12:00.550
but once you put a big
O in there, you're good.

00:12:00.550 --> 00:12:04.000
You can always
say, order n, even

00:12:04.000 --> 00:12:08.140
if it's a logarithmic
problem, and big O

00:12:08.140 --> 00:12:10.790
lets you be sloppy like that.

00:12:10.790 --> 00:12:15.410
And there are many examples
of polynomial time algorithms,

00:12:15.410 --> 00:12:18.760
of course, for interesting
problems like shortest paths.

00:12:18.760 --> 00:12:22.600
So the shortest path
problem is order V square,

00:12:22.600 --> 00:12:25.070
where V is the number of
vertices in the graph.

00:12:25.070 --> 00:12:26.810
There's algorithms for that.

00:12:26.810 --> 00:12:32.380
You can do a little bit
better if you use fancier data

00:12:32.380 --> 00:12:35.660
structure, but
that's an example.

00:12:35.660 --> 00:12:42.050
NP is another class of problems
that's very interesting.

00:12:42.050 --> 00:12:48.030
This is the class of
problems whose solution

00:12:48.030 --> 00:12:51.810
is verifiable in
polynomial time.

00:12:55.640 --> 00:13:06.060
So an example of a problem in
NP that is not known to be in P

00:13:06.060 --> 00:13:10.710
is the Hamiltonian
cycle problem.

00:13:10.710 --> 00:13:13.180
And the Hamiltonian
cycle problem

00:13:13.180 --> 00:13:31.260
corresponds to given a directed
graph, find a simple cycle.

00:13:31.260 --> 00:13:37.900
So you can't repeat vertices,
but you need the simple cycle

00:13:37.900 --> 00:13:48.250
to contain each vertex in V.

00:13:48.250 --> 00:13:55.500
And determining whether a given
cycle is a Hamiltonian cycle

00:13:55.500 --> 00:13:57.740
or not is simple.

00:13:57.740 --> 00:14:00.400
You just traverse the cycle.

00:14:00.400 --> 00:14:03.780
Make sure that you've touched
all the vertices exactly once,

00:14:03.780 --> 00:14:04.690
and you're done.

00:14:04.690 --> 00:14:07.430
Clearly doable in
polynomial time.
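
NOTE
Editor's sketch, not part of the lecture: the verification step just described, written in Python.
The names is_hamiltonian_cycle, vertices, edges, and cycle are illustrative assumptions.
The candidate cycle is a list of vertices; the edge from the last vertex back to the first is implied.
```python
def is_hamiltonian_cycle(vertices, edges, cycle):
    """Return True if `cycle` visits every vertex exactly once along directed edges."""
    vertex_set = set(vertices)
    edge_set = set(edges)
    # Every vertex must appear exactly once: same length and same set of vertices.
    if len(cycle) != len(vertex_set) or set(cycle) != vertex_set:
        return False
    # Consecutive vertices, wrapping around at the end, must be joined by a directed edge.
    n = len(cycle)
    return all((cycle[k], cycle[(k + 1) % n]) in edge_set for k in range(n))
```
For a directed triangle with edges (1,2), (2,3), (3,1), the list [1, 2, 3] passes and [1, 3, 2] fails.
The work is one pass over the cycle plus set lookups, which is why checking a given cycle is polynomial even though finding one is believed to be hard.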

00:14:07.430 --> 00:14:10.390
So therefore, Hamiltonian
cycle is in NP,

00:14:10.390 --> 00:14:16.620
but determining whether a
graph has a Hamiltonian cycle

00:14:16.620 --> 00:14:19.850
or not is a hard problem.

00:14:19.850 --> 00:14:31.000
And in particular, the
notion of NP completeness

00:14:31.000 --> 00:14:39.940
is something that defines the
level of intractability for NP.

00:14:39.940 --> 00:14:45.450
The NP complete problems are
the hardest problems in NP,

00:14:45.450 --> 00:14:47.820
and Hamiltonian
cycle is one of them.

00:14:47.820 --> 00:14:55.710
If you can solve any NP complete
problem in polynomial time,

00:14:55.710 --> 00:14:59.710
you can solve all problems
in NP in polynomial time.

00:14:59.710 --> 00:15:03.670
So that's what I meant by saying
that NP complete problems are,

00:15:03.670 --> 00:15:06.050
in some sense, the
hardest problems in NP

00:15:06.050 --> 00:15:10.000
because solving one of
them gives you everything.

00:15:10.000 --> 00:15:12.900
So the definition
of NP completeness

00:15:12.900 --> 00:15:19.070
is that the problem
is in NP and is

00:15:19.070 --> 00:15:23.370
as hard-- an
informal definition--

00:15:23.370 --> 00:15:26.280
as any problem in NP.

00:15:33.050 --> 00:15:37.110
And so Hamiltonian cycle
is an NP complete problem.

00:15:37.110 --> 00:15:39.410
Satisfiability is an
NP complete problem,

00:15:39.410 --> 00:15:41.250
and there's a whole
bunch of them.

00:15:41.250 --> 00:15:46.130
So going back to our theme
here, what I want to show you

00:15:46.130 --> 00:15:50.200
is how for an interval
scheduling problem, that I'll

00:15:50.200 --> 00:15:57.960
define in a couple of minutes,
how we move from linear time,

00:15:57.960 --> 00:16:01.930
therefore P, to something
that's still in P.

00:16:01.930 --> 00:16:03.500
But it's a little
more complicated

00:16:03.500 --> 00:16:06.360
if I change the constraints
of a problem a little bit.

00:16:06.360 --> 00:16:10.450
And finally, if I add more
constraints to the problem,

00:16:10.450 --> 00:16:12.100
generalize it-- and
you can think of it

00:16:12.100 --> 00:16:14.180
as adding constraints
or generalizing

00:16:14.180 --> 00:16:19.930
the problem-- you get
small changes to something

00:16:19.930 --> 00:16:22.250
that becomes NP complete.

00:16:22.250 --> 00:16:25.890
So this is something
that algorithm designers

00:16:25.890 --> 00:16:30.390
have to keep in mind because
before you go off and try

00:16:30.390 --> 00:16:32.620
to design an algorithm
for a problem

00:16:32.620 --> 00:16:37.760
you like to know where in the
spectrum your problem resides.

00:16:37.760 --> 00:16:41.330
And in order to
do that, you need

00:16:41.330 --> 00:16:44.660
to understand algorithm
paradigms obviously and be

00:16:44.660 --> 00:16:48.420
able to apply them, but you also
have to understand reductions

00:16:48.420 --> 00:16:51.430
where you can try and translate
one problem to another.

00:16:51.430 --> 00:16:55.550
And if you can do that,
and the first problem

00:16:55.550 --> 00:16:58.660
is known to be hard, then
you can make arguments

00:16:58.660 --> 00:17:02.060
about the hardness
of your problem.

00:17:02.060 --> 00:17:06.020
So these are the kinds of things
that we'll touch upon today,

00:17:06.020 --> 00:17:12.440
the analysis of an algorithm,
the design of an algorithm,

00:17:12.440 --> 00:17:16.579
and also the complexity analysis
of an algorithm, which may not

00:17:16.579 --> 00:17:18.900
just be an
asymptotic-- well, this

00:17:18.900 --> 00:17:21.339
is order n cubed
or order n square

00:17:21.339 --> 00:17:25.510
but more in the realm of
NP completeness as well.

00:17:28.280 --> 00:17:32.000
So so much for
context, let's dive

00:17:32.000 --> 00:17:39.520
into our interval scheduling
problem, which is something

00:17:39.520 --> 00:17:44.670
that you can imagine
doing for classes,

00:17:44.670 --> 00:17:50.110
tasks, a particular schedule
during a day, life in general.

00:17:50.110 --> 00:17:59.140
And in the general setting, we
have resources and requests,

00:17:59.140 --> 00:18:03.600
and we're going to have a single
resource for our first version

00:18:03.600 --> 00:18:05.190
of the problem.

00:18:05.190 --> 00:18:10.640
And our requests are
going to be 1 through n,

00:18:10.640 --> 00:18:12.550
and we can think
of these requests

00:18:12.550 --> 00:18:17.100
as requiring time
corresponding to the resource.

00:18:17.100 --> 00:18:19.300
So the request is
for the resource,

00:18:19.300 --> 00:18:20.790
and you want time
on the resource.

00:18:20.790 --> 00:18:22.280
Maybe it's computation time.

00:18:22.280 --> 00:18:24.420
Maybe it's your time.

00:18:24.420 --> 00:18:26.550
It could be anything.

00:18:26.550 --> 00:18:32.230
Each of these requests corresponds
to an interval of time,

00:18:32.230 --> 00:18:34.460
and that's where
the name comes from.

00:18:34.460 --> 00:18:40.030
si is the start time.

00:18:40.030 --> 00:18:46.650
fi is the finish
time, and we're going

00:18:46.650 --> 00:18:50.660
to say si is strictly
less than fi.

00:18:50.660 --> 00:18:52.970
So I didn't put less
than or equal to there

00:18:52.970 --> 00:18:57.790
because I want these requests
to be non-null, non-zero,

00:18:57.790 --> 00:19:00.810
so otherwise they're
uninteresting.

00:19:00.810 --> 00:19:02.680
And we're going to
have a start time,

00:19:02.680 --> 00:19:05.860
and we're going to have an end
time, and they're not equal.

00:19:05.860 --> 00:19:11.270
So that's the first part
of the specification

00:19:11.270 --> 00:19:15.000
of the problem and
then the second part,

00:19:15.000 --> 00:19:21.370
which is intuitive is that
two requests-- we have

00:19:21.370 --> 00:19:24.460
a single resource
here remember-- i

00:19:24.460 --> 00:19:30.390
and j are considered
to be compatible,

00:19:30.390 --> 00:19:33.810
which means you can satisfy
both of these requests.

00:19:33.810 --> 00:19:34.760
They're compatible.

00:19:34.760 --> 00:19:37.330
Incompatible requests,
you can't satisfy

00:19:37.330 --> 00:19:41.600
with your single
resource simultaneously--

00:19:41.600 --> 00:19:45.345
Provided they don't overlap.

00:19:51.850 --> 00:19:58.450
And a non-overlapping condition
might be that fi is less than

00:19:58.450 --> 00:20:08.430
or equal to sj, or fj
less than or equal to si.

00:20:08.430 --> 00:20:11.110
So again, I put a less
than or equal to here,

00:20:11.110 --> 00:20:16.160
which is important
to spend a minute on.

00:20:16.160 --> 00:20:22.470
What I'm saying here in this
context is that I really have

00:20:22.470 --> 00:20:26.970
open-ended intervals on the
right-hand side corresponding

00:20:26.970 --> 00:20:29.420
to the fi's.

00:20:29.420 --> 00:20:35.190
So pictorially, you could
look at it this way.

00:20:35.190 --> 00:20:40.830
Let's say I have
intervals like this.

00:20:40.830 --> 00:20:42.830
So this is interval number 1.

00:20:42.830 --> 00:20:44.640
That's interval number 2.

00:20:44.640 --> 00:20:53.360
Right here I have s of 1, f of
1 out here, s of 2 out here,

00:20:53.360 --> 00:20:55.150
and f of 2 out here.

00:20:55.150 --> 00:21:01.910
So this is f of 1 for
that and s of 2 for this.

00:21:01.910 --> 00:21:07.920
I'm allowing s of 2 and f
of 1 to be exactly equal,

00:21:07.920 --> 00:21:14.960
and I still agree that these
two are compatible requests.

00:21:14.960 --> 00:21:17.860
So this is-- I guess
it's terminology.

00:21:17.860 --> 00:21:22.940
It's our definition
of compatibility.
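
NOTE
Editor's sketch, not part of the lecture: the compatibility definition as a Python predicate.
It assumes each request is a (start, finish) pair with start strictly less than finish; the name compatible is illustrative.
```python
def compatible(a, b):
    """Two requests are compatible when one finishes no later than the other starts."""
    (s_a, f_a), (s_b, f_b) = a, b
    # The "less than or equal" makes intervals that merely touch count as compatible.
    return f_a <= s_b or f_b <= s_a
```
With this definition (1, 3) and (3, 5) are compatible because the first finishes exactly when the second starts, while (1, 4) and (3, 5) are not.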

00:21:22.940 --> 00:21:29.000
So you can imagine now
an optimization problem

00:21:29.000 --> 00:21:31.880
that is associated with
interval scheduling

00:21:31.880 --> 00:21:35.980
where, in a different
example, I have

00:21:35.980 --> 00:21:40.780
this interval
corresponding to s1 and f1.

00:21:40.780 --> 00:21:45.200
I might have a different
interval here corresponding

00:21:45.200 --> 00:21:48.750
to 2, then corresponding to 3.

00:21:48.750 --> 00:21:56.990
And then maybe I've
got 4 here, 5, and 6.

00:21:56.990 --> 00:22:03.090
So those are my six intervals
corresponding to my input.

00:22:03.090 --> 00:22:04.760
I have a single resource.

00:22:04.760 --> 00:22:08.110
It's just drawn out in
a two-dimensional form.

00:22:08.110 --> 00:22:11.220
There's six different
requests that I have,

00:22:11.220 --> 00:22:12.700
the six different intervals.

00:22:12.700 --> 00:22:16.260
Intervals and
requests are synonyms.

00:22:16.260 --> 00:22:20.780
And my goal here-- and it's kind
of obvious in this example--

00:22:20.780 --> 00:22:42.540
is to select a compatible subset
of requests, or intervals,

00:22:42.540 --> 00:22:43.785
that is of maximum size.

00:22:49.850 --> 00:22:53.540
And I'd like to do
this efficiently.

00:22:53.540 --> 00:22:56.480
So we'll always consider
efficiency here,

00:22:56.480 --> 00:22:59.860
but in terms of the
specification of the problem as

00:22:59.860 --> 00:23:05.450
opposed to a requirement on the
complexity of the algorithm,

00:23:05.450 --> 00:23:09.940
I want maximum size
for this subset.

00:23:09.940 --> 00:23:13.570
So as I showed you, or
I mentioned earlier,

00:23:13.570 --> 00:23:18.150
in this case, it is
clear from the drawing

00:23:18.150 --> 00:23:21.200
that I put up there that the
maximum size for that six

00:23:21.200 --> 00:23:25.820
requests example
that I have is three.
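
NOTE
Editor's sketch, not part of the lecture: a brute-force baseline for the optimization problem just stated.
The coordinates below are hypothetical, chosen only to resemble a six-request picture in which request 1 overlaps all the others; the board drawing's actual values are not in the transcript.
The search tries every subset, so it takes exponential time and is only meant to pin down what "maximum size" means.
```python
from itertools import combinations
def compatible(a, b):
    # (start, finish) pairs; touching intervals count as compatible.
    return a[1] <= b[0] or b[1] <= a[0]
def max_compatible_size(requests):
    """Try subset sizes from largest to smallest and return the first size that works."""
    for size in range(len(requests), 0, -1):
        for subset in combinations(requests, size):
            if all(compatible(a, b) for a, b in combinations(subset, 2)):
                return size
    return 0
# Hypothetical instance: request 1 spans everything, 2 and 3 form one row, 4 to 6 another.
requests = [(0, 12), (0, 6), (7, 12), (1, 4), (5, 8), (9, 12)]
print(max_compatible_size(requests))  # prints 3
```
On this assumed instance the answer is 3, achieved by the three short requests.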

00:23:25.820 --> 00:23:27.900
So that's the set up.

00:23:27.900 --> 00:23:33.490
Now we're going to spend
the next few minutes

00:23:33.490 --> 00:23:37.840
talking about a greedy
strategy for solving

00:23:37.840 --> 00:23:39.990
this particular problem.

00:23:39.990 --> 00:23:43.090
We don't know if
the greedy strategy

00:23:43.090 --> 00:23:49.540
is going to always produce
the maximum size or not.

00:23:49.540 --> 00:23:54.750
In fact, it depends on the
particular greedy heuristic,

00:23:54.750 --> 00:23:58.710
the selection heuristic that
a greedy algorithm uses.

00:23:58.710 --> 00:24:01.410
So that's going to be important,
and we'll take a look--

00:24:01.410 --> 00:24:03.060
and hopefully you
can suggest some--

00:24:03.060 --> 00:24:06.970
at a few different
greedy heuristics.

00:24:06.970 --> 00:24:10.297
But my claim, overall
claim, that I'm

00:24:10.297 --> 00:24:11.880
going to have to
spend a bunch of time

00:24:11.880 --> 00:24:15.300
here justifying and
eventually proving

00:24:15.300 --> 00:24:27.210
is that we can solve
this problem using

00:24:27.210 --> 00:24:28.160
a greedy algorithm.

00:24:30.730 --> 00:24:32.730
Now what is a greedy algorithm?

00:24:32.730 --> 00:24:36.190
You've seen some examples.

00:24:36.190 --> 00:24:42.240
As the name implies, it's
something that's myopic.

00:24:42.240 --> 00:24:43.750
It doesn't look ahead.

00:24:43.750 --> 00:24:48.700
It looks to maximize
the very first thing

00:24:48.700 --> 00:24:51.660
that you could maximize.

00:24:51.660 --> 00:24:57.050
It says-- traffic is a
good example-- don't let

00:24:57.050 --> 00:24:58.840
anybody cut in front of you.

00:24:58.840 --> 00:25:00.210
You've got some room up there.

00:25:00.210 --> 00:25:02.780
Get up there.

00:25:02.780 --> 00:25:07.040
Generally, people
are greedy when

00:25:07.040 --> 00:25:10.120
it comes to getting
to work, trying

00:25:10.120 --> 00:25:13.330
to minimize the time
and, in this case,

00:25:13.330 --> 00:25:15.480
the time that they
spend on the road.

00:25:15.480 --> 00:25:18.230
But we've had other examples.

00:25:18.230 --> 00:25:22.240
For example, when you look
at interval scheduling,

00:25:22.240 --> 00:25:29.410
you might say, I'm going to
pick the smallest request.

00:25:29.410 --> 00:25:32.510
And I'm going to pick the
smallest request first,

00:25:32.510 --> 00:25:34.490
and I'm going to try
and collect together

00:25:34.490 --> 00:25:36.270
as many requests as possible.

00:25:36.270 --> 00:25:38.020
And if the requests
are small in the sense

00:25:38.020 --> 00:25:41.620
that si and fi, for
the two requests,

00:25:41.620 --> 00:25:46.440
are close to each other, then
maybe that's the best strategy.

00:25:46.440 --> 00:25:50.050
So that's an example
of a greedy strategy

00:25:50.050 --> 00:25:52.130
for our particular example.

00:25:52.130 --> 00:25:59.050
But just to give you a slightly
better definition of greedy

00:25:59.050 --> 00:26:04.780
than what I've said so
far, a greedy algorithm

00:26:04.780 --> 00:26:18.270
is a myopic algorithm
that does two things.

00:26:18.270 --> 00:26:35.121
It processes the input one piece
at a time with no apparent look

00:26:35.121 --> 00:26:35.620
ahead.

00:26:39.820 --> 00:26:42.910
So what happens is that greedy
algorithms are typically

00:26:42.910 --> 00:26:45.550
quite efficient.

00:26:45.550 --> 00:26:51.340
What you end up doing is looking
at a small part of the problem

00:26:51.340 --> 00:26:55.040
instance and
deciding what to do.

00:26:55.040 --> 00:26:57.790
Once you've done
that, then you're

00:26:57.790 --> 00:27:01.080
in a situation where the problem
has gotten a little bit simpler

00:27:01.080 --> 00:27:03.780
because you've already
solved part of it,

00:27:03.780 --> 00:27:05.680
and then you move on.

00:27:05.680 --> 00:27:09.180
So what would a template
for a greedy algorithm

00:27:09.180 --> 00:27:13.420
look like for our interval
scheduling problem?

00:27:13.420 --> 00:27:17.590
Here's a template that
probably puts it all together

00:27:17.590 --> 00:27:24.310
and gives you a good sense of
what I mean by greedy, at least

00:27:24.310 --> 00:27:24.980
in this context.

00:27:29.150 --> 00:27:34.400
So before we even get into
particulars of selection

00:27:34.400 --> 00:27:38.170
strategies, let me
give you a template

00:27:38.170 --> 00:27:41.905
for greedy interval scheduling.

00:27:46.350 --> 00:27:54.910
So step 1, use a simple
rule to select a request.

00:28:00.630 --> 00:28:08.180
And once you do that, if you
selected a particular request--

00:28:08.180 --> 00:28:11.730
let's say you selected 1.

00:28:11.730 --> 00:28:16.630
What happens now once
you've selected 1?

00:28:16.630 --> 00:28:17.850
Well, you're done.

00:28:17.850 --> 00:28:18.730
You can't select 2.

00:28:18.730 --> 00:28:19.650
You can't select 3.

00:28:19.650 --> 00:28:20.599
You can't select 4.

00:28:20.599 --> 00:28:21.390
You can't select 5.

00:28:21.390 --> 00:28:23.560
You can't select 6.

00:28:23.560 --> 00:28:27.880
So if you have selected 1
in this case, you're done,

00:28:27.880 --> 00:28:32.260
but we have to codify
that in a step here.

00:28:32.260 --> 00:28:34.500
And what that means
is that we have

00:28:34.500 --> 00:28:44.400
to reject all requests that
are incompatible with i.

00:28:46.930 --> 00:28:51.190
And at this point, because we've
rejected a bunch of requests,

00:28:51.190 --> 00:28:54.480
our problem got smaller.

00:28:54.480 --> 00:28:59.180
And so you now have
a smaller problem,

00:28:59.180 --> 00:29:06.400
and you just repeat-- go back to
step 1-- until all requests are

00:29:06.400 --> 00:29:06.900
processed.

00:29:09.620 --> 00:29:12.480
All right, so that's
a classical template

00:29:12.480 --> 00:29:15.000
for a greedy algorithm.
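
NOTE
Editor's sketch, not part of the lecture: the three-step template as Python, with the selection rule left as a parameter.
The names greedy_schedule and rule are illustrative assumptions; requests are assumed to be (start, finish) pairs with start strictly less than finish.
```python
def greedy_schedule(requests, rule):
    """Template: `rule` takes the remaining requests and returns one of them."""
    remaining = list(requests)
    selected = []
    while remaining:
        # Step 1: use a simple rule to select a request i.
        i = rule(remaining)
        selected.append(i)
        # Step 2: reject every request incompatible with i (this also drops i itself).
        remaining = [r for r in remaining if r[1] <= i[0] or i[1] <= r[0]]
        # Step 3: repeat on the smaller problem until nothing is left.
    return selected
```
Plugging in the smallest-request heuristic mentioned earlier would look like greedy_schedule(requests, lambda rs: min(rs, key=lambda r: r[1] - r[0])).
Whether the result has maximum size depends entirely on the rule supplied; the template itself makes no such promise.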

00:29:15.000 --> 00:29:19.350
You just go through these
really simple steps.

00:29:19.350 --> 00:29:22.410
And the reason
this is a template

00:29:22.410 --> 00:29:26.430
is because I haven't
specified a particular rule,

00:29:26.430 --> 00:29:30.200
and so it's not quite an
algorithm that you can code yet

00:29:30.200 --> 00:29:32.390
because we need a rule.

00:29:32.390 --> 00:29:38.230
So with all of that
context, let me ask you.

00:29:38.230 --> 00:29:44.880
What is a rule that you
think would work well

00:29:44.880 --> 00:29:47.090
for an interval
scheduling problem?

00:29:47.090 --> 00:29:48.101
Yeah, go ahead.

00:29:48.101 --> 00:29:50.360
AUDIENCE: Select one with
the earliest finish time.

00:29:50.360 --> 00:29:52.430
SRINIVAS DEVADAS: Select one
with the earliest finish time.

00:29:52.430 --> 00:29:54.440
All right, well, I did
not want that answer.

00:29:54.440 --> 00:29:56.640
[LAUGHTER]

00:29:56.640 --> 00:29:59.360
But now that you've
given me the answer,

00:29:59.360 --> 00:30:01.640
I have to do
something about this.

00:30:01.640 --> 00:30:06.040
So I want a different answer, so
we'll go to a different person.

00:30:06.040 --> 00:30:13.480
But before I do that, let me
reward you for that answer

00:30:13.480 --> 00:30:22.472
I did not want with a limited
edition 6046 Frisbee, OK?

00:30:22.472 --> 00:30:25.200
[APPLAUSE]

00:30:25.200 --> 00:30:26.995
You need to stand up
because I don't want

00:30:26.995 --> 00:30:28.120
to take people's heads off.

00:30:28.120 --> 00:30:28.240
[LAUGHTER]

00:30:28.240 --> 00:30:28.770
Yeah, sorry.

00:30:28.770 --> 00:30:31.522
All right, so here you go.

00:30:31.522 --> 00:30:32.468
All right?

00:30:32.468 --> 00:30:32.968
Good.

00:30:32.968 --> 00:30:34.420
[APPLAUSE]

00:30:34.420 --> 00:30:38.390
So people do cookies and candy.

00:30:38.390 --> 00:30:40.340
I think Eric, Nancy
and I are cooler.

00:30:40.340 --> 00:30:41.790
[LAUGHTER]

00:30:41.790 --> 00:30:45.800
So we do Frisbees.

00:30:45.800 --> 00:30:48.930
All right, good, so
the fact of the matter

00:30:48.930 --> 00:30:54.570
was that this class was
scheduled for 9:30 to 11:00

00:30:54.570 --> 00:30:56.200
on Tuesdays and Thursdays.

00:30:56.200 --> 00:30:59.370
That's when we decided
to do Frisbees.

00:30:59.370 --> 00:31:01.530
And then it got shifted
over to 11:00 to 12:30,

00:31:01.530 --> 00:31:05.270
but then we bought all these
Frisbees, so we said, whatever.

00:31:05.270 --> 00:31:07.424
It's not like we
could use all of them.

00:31:07.424 --> 00:31:08.090
All right, good.

00:31:08.090 --> 00:31:12.887
So I don't like that answer,
and I want a different one.

00:31:12.887 --> 00:31:13.720
Give me another one.

00:31:13.720 --> 00:31:14.485
Yeah, go ahead.

00:31:14.485 --> 00:31:16.610
AUDIENCE: Just carry it
through in numerical order.

00:31:16.610 --> 00:31:17.120
SRINIVAS DEVADAS: I'm sorry?

00:31:17.120 --> 00:31:19.340
AUDIENCE: Just carry it
through in numerical order.

00:31:19.340 --> 00:31:21.590
SRINIVAS DEVADAS: Carry it
through in numerical order.

00:31:21.590 --> 00:31:22.760
Is that going to work?

00:31:22.760 --> 00:31:25.770
And what's an example
where it didn't work?

00:31:25.770 --> 00:31:27.830
The one right there, right?

00:31:27.830 --> 00:31:30.190
Should I get her a Frisbee?

00:31:30.190 --> 00:31:31.391
We should.

00:31:31.391 --> 00:31:33.140
I'm going to be generous
at the beginning.

00:31:33.140 --> 00:31:33.723
You can just--

00:31:36.280 --> 00:31:38.080
But that's an answer I liked.

00:31:38.080 --> 00:31:42.280
Yeah, there you go.

00:31:42.280 --> 00:31:46.370
So carrying it through in numeric
order isn't going to work.

00:31:46.370 --> 00:31:50.486
This is a great
example right there.

00:31:50.486 --> 00:31:52.900
Give me another one.

00:31:52.900 --> 00:31:55.430
[LAUGHTER]

00:31:55.430 --> 00:31:56.820
There are no
Frisbees right here.

00:31:56.820 --> 00:31:58.320
Over there, yeah?

00:31:58.320 --> 00:32:01.136
AUDIENCE: Try the one
with the shortest time.

00:32:01.136 --> 00:32:03.510
SRINIVAS DEVADAS: Ah, try the
one with the shortest time.

00:32:03.510 --> 00:32:09.001
OK, so the shortest time in
this case might be this one.

00:32:09.001 --> 00:32:10.500
The shortest time
might be this one,

00:32:10.500 --> 00:32:12.083
and, hey, that might
work in this case

00:32:12.083 --> 00:32:14.600
because you pick this one,
which is the shortest,

00:32:14.600 --> 00:32:16.860
or maybe it's 5,
which is the shortest.

00:32:16.860 --> 00:32:20.870
Either way, you could get 2, 5,
and 6, looking at this picture,

00:32:20.870 --> 00:32:22.420
seems to work.

00:32:22.420 --> 00:32:27.050
Maybe 4, 5, and 6 if you pick
5 first, et cetera, right?

00:32:27.050 --> 00:32:31.010
I'll give you a Frisbee if you
can take that same algorithm

00:32:31.010 --> 00:32:32.260
and give me a counter example.

00:32:39.120 --> 00:32:43.762
AUDIENCE: Let's say you have two
requests which don't overlap,

00:32:43.762 --> 00:32:44.512
and then there's--

00:32:44.512 --> 00:32:46.678
SRINIVAS DEVADAS: --there's
one right in the middle,

00:32:46.678 --> 00:32:47.680
exactly right.

00:32:47.680 --> 00:32:50.780
Yep, so let's see.

00:32:50.780 --> 00:32:51.830
What do I do?

00:32:51.830 --> 00:32:52.420
Oh, here.

00:32:56.320 --> 00:33:00.020
So pictorially, really,
you can look at this,

00:33:00.020 --> 00:33:04.960
and you can actually figure out
whether your heuristic works

00:33:04.960 --> 00:33:05.460
or not.

00:33:05.460 --> 00:33:07.460
But this, I think, is what
you were thinking about.

00:33:10.720 --> 00:33:12.950
There you go, right?

00:33:12.950 --> 00:33:14.450
So you get one.

00:33:17.322 --> 00:33:18.530
So that clearly doesn't work.

00:33:18.530 --> 00:33:24.730
So this one was
smallest, doesn't work.
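
NOTE
A small sketch of the counterexample just described, assuming concrete
coordinates that are not in the lecture: two non-overlapping requests with
a shorter one straddling the gap. For a fixed key such as duration, sorting
once and scanning gives the same result as repeatedly selecting and
rejecting. The name greedy_by_key is illustrative.
```python
def compatible(a, b):
    # Requests are (start, finish) pairs, treated as half-open intervals.
    return a[1] <= b[0] or b[1] <= a[0]
def greedy_by_key(requests, key):
    # Visit requests in increasing key order; keep each one that fits.
    selected = []
    for r in sorted(requests, key=key):
        if all(compatible(r, s) for s in selected):
            selected.append(r)
    return selected
requests = [(0, 5), (4, 7), (6, 11)]
shortest_first = greedy_by_key(requests, key=lambda r: r[1] - r[0])
print(shortest_first)  # [(4, 7)]
```
The shortest request knocks out both of its neighbors, so the rule returns
one request where two were possible.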

00:33:24.730 --> 00:33:27.090
The suggestion
here was numeric order.

00:33:30.350 --> 00:33:31.030
It doesn't work.

00:33:34.900 --> 00:33:40.030
Here's one that
might actually work.

00:33:40.030 --> 00:33:46.550
For each request,
find the number

00:33:46.550 --> 00:33:48.020
of incompatible requests.

00:33:51.880 --> 00:33:52.880
So you've got a request.

00:33:52.880 --> 00:33:56.230
You can always intersect
the other requests with it

00:33:56.230 --> 00:34:00.800
and decide whether the second
request is compatible or not,

00:34:00.800 --> 00:34:03.670
and you do this for
every other request.

00:34:03.670 --> 00:34:07.450
And you can collect
together numbers associated

00:34:07.450 --> 00:34:12.240
with how many incompatible
requests a particular request

00:34:12.240 --> 00:34:17.760
has, and you say, well, let
me use that as a heuristic.

00:34:17.760 --> 00:34:21.750
So for each request, find the number
of incompatible requests

00:34:21.750 --> 00:34:30.401
and select the one
with the minimum number

00:34:30.401 --> 00:34:31.109
of incompatibles.

00:34:36.110 --> 00:34:40.090
So just to be
clear, in this case,

00:34:40.090 --> 00:34:43.020
you would not select
1 because clearly 1 is

00:34:43.020 --> 00:34:45.659
incompatible with
every other request,

00:34:45.659 --> 00:34:48.150
so that clearly is
not numeric order.

00:34:48.150 --> 00:34:50.500
In this case, you would
not select this one

00:34:50.500 --> 00:34:53.580
because it's incompatible
with this one and that one.

00:34:53.580 --> 00:34:56.020
So you'd select that one
which has the minimum number

00:34:56.020 --> 00:34:57.650
of incompatibles.

00:34:57.650 --> 00:34:59.630
So you think this
is going to produce

00:34:59.630 --> 00:35:06.333
the correct answer, the maximum
answer, in every possible case?

00:35:06.333 --> 00:35:07.260
AUDIENCE: No.

00:35:07.260 --> 00:35:10.460
SRINIVAS DEVADAS:
No, who said, no?

00:35:10.460 --> 00:35:12.290
Well, anybody who said,
no, should give me

00:35:12.290 --> 00:35:13.420
a counter example.

00:35:13.420 --> 00:35:14.450
Yeah, go for it.

00:35:14.450 --> 00:35:16.845
AUDIENCE: If the
one that it selects

00:35:16.845 --> 00:35:21.156
has mutually incompatible
collection of intervals

00:35:21.156 --> 00:35:23.560
with which it's compatible.

00:35:23.560 --> 00:35:27.740
SRINIVAS DEVADAS: Right,
so that's a good thought.

00:35:27.740 --> 00:35:29.390
We'll have to [INAUDIBLE] that.

00:35:29.390 --> 00:35:31.910
And I think this
particular example,

00:35:31.910 --> 00:35:34.940
that's exactly what
you said, which

00:35:34.940 --> 00:35:43.290
just instantiates your notion
of mutual incompatibility.

00:35:43.290 --> 00:35:46.360
So here's an example
where I have something.

00:35:46.360 --> 00:35:48.010
It's a little more complicated.

00:35:48.010 --> 00:35:50.960
As you can see, this is
a pretty good heuristic.

00:35:50.960 --> 00:35:55.570
It's not perfect as you can
see from this example, where

00:35:55.570 --> 00:36:01.805
I have something like this.

00:36:13.910 --> 00:36:22.600
So if you look at this,
what I have here is

00:36:22.600 --> 00:36:25.800
I have just a bunch
of requests which

00:36:25.800 --> 00:36:29.760
have-- this is incompatible
with this and that

00:36:29.760 --> 00:36:33.870
and these two, so clearly a lot
of incompatibilities for these,

00:36:33.870 --> 00:36:36.390
a lot of incompatibilities
for these.

00:36:36.390 --> 00:36:38.580
Which is the minimum?

00:36:38.580 --> 00:36:43.460
The one in here, but what
happens if you select that?

00:36:43.460 --> 00:36:47.720
Well, clearly you don't get
this solution, which is optimal.

00:36:47.720 --> 00:36:51.800
The one on top, so this
is a bad selection.

00:36:56.273 --> 00:36:59.880
And so this doesn't
work either, OK?
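
NOTE
The board figure is not in the transcript, so the instance below is a
hypothetical reconstruction in the same spirit: a top row of four compatible
requests, one middle request with only two conflicts, and stacked requests
on each side. The function names are illustrative, and conflicts are
recounted after every rejection, which is one possible reading of the rule.
```python
def compatible(a, b):
    # Requests are (start, finish) pairs, treated as half-open intervals.
    return a[1] <= b[0] or b[1] <= a[0]
def count_conflicts(index, pool):
    # Number of other requests in pool that overlap pool[index].
    return sum(1 for j, other in enumerate(pool) if j != index and not compatible(pool[index], other))
def fewest_conflicts_schedule(requests):
    remaining = list(requests)
    selected = []
    while remaining:
        best = min(range(len(remaining)), key=lambda i: count_conflicts(i, remaining))
        chosen = remaining[best]
        selected.append(chosen)
        remaining = [r for j, r in enumerate(remaining) if j != best and compatible(r, chosen)]
    return selected
top_row = [(0, 4), (4, 8), (8, 12), (12, 16)]
requests = top_row + [(7, 9)] + [(3, 5)] * 3 + [(11, 13)] * 3
result = fewest_conflicts_schedule(requests)
print(len(result), len(top_row))  # 3 4
```
The middle request (7, 9) has the fewest conflicts, but choosing it removes
two of the four top-row requests, so the rule ends with three.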

00:36:59.880 --> 00:37:00.900
There you go.

00:37:04.120 --> 00:37:07.510
So as it turns out,
the reason I didn't

00:37:07.510 --> 00:37:10.830
like that first answer
was it was correct.

00:37:10.830 --> 00:37:12.440
[LAUGHTER]

00:37:12.440 --> 00:37:15.070
It's actually a
beautiful heuristic.

00:37:15.070 --> 00:37:21.194
Earliest finish time is a
heuristic that is-- well,

00:37:21.194 --> 00:37:22.860
it's not really a
heuristic in the sense

00:37:22.860 --> 00:37:25.940
that if you use
that selection rule,

00:37:25.940 --> 00:37:29.700
then it works in every case.

00:37:29.700 --> 00:37:35.150
In every case, it's going to get
you the maximum number, OK?

00:37:35.150 --> 00:37:41.110
Earliest finish time,
so what does that mean?

00:37:41.110 --> 00:37:46.690
Well, it just means that
I'm going to just scan

00:37:46.690 --> 00:37:52.670
the f of i's associated with the
list of requests that I have,

00:37:52.670 --> 00:37:56.680
and I'm going to pick
the one that is minimum.

00:37:56.680 --> 00:37:59.860
Minimum f of i means
earliest finish time.

00:37:59.860 --> 00:38:01.730
Now you can just step
back, and I'm not

00:38:01.730 --> 00:38:06.650
going to do this for every
diagram that I have up here,

00:38:06.650 --> 00:38:09.510
but look at every
example that I've put up.

00:38:09.510 --> 00:38:14.250
Apply the selection rule
associated with earliest finish

00:38:14.250 --> 00:38:17.840
time, and you'll see
that it works and gets

00:38:17.840 --> 00:38:19.370
you the maximum number.

00:38:19.370 --> 00:38:27.630
For example, over here, this
has the earliest finish time.

00:38:27.630 --> 00:38:29.760
Not this, not this,
it's over here.

00:38:29.760 --> 00:38:35.170
So you pick that, and then you
use the greedy algorithm step 2

00:38:35.170 --> 00:38:38.680
to eliminate all of
the intervals that are

00:38:38.680 --> 00:38:41.500
incompatible, so these go away.

00:38:41.500 --> 00:38:45.050
Once this goes away, this
one has the earliest finish

00:38:45.050 --> 00:38:48.480
time and so on and so forth.
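
NOTE
A minimal sketch of the earliest-finish-time rule, assuming (start, finish)
pairs treated as half-open intervals. Sorting by f(i) once and keeping each
request whose start is at or after the last selected finish is equivalent
to repeatedly picking the minimum f(i) and rejecting incompatible requests.
The function name is illustrative.
```python
def earliest_finish_time(requests):
    selected = []
    last_finish = float("-inf")
    # Scan requests in order of increasing finish time f(i).
    for start, finish in sorted(requests, key=lambda r: r[1]):
        # A request is compatible with everything selected so far
        # exactly when it starts at or after the last selected finish.
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected
print(earliest_finish_time([(0, 5), (4, 7), (6, 11)]))  # [(0, 5), (6, 11)]
```
The sort dominates the running time, so this version takes O(n log n) for
n requests.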

00:38:48.480 --> 00:38:56.000
So this is something that you
can check through examples.

00:38:56.000 --> 00:38:58.230
That's not really
a good notion of proof, when

00:38:58.230 --> 00:39:01.760
you just convince
yourself using examples.

00:39:01.760 --> 00:39:08.880
And this is where, I guess,
the essence of 6046, and

00:39:08.880 --> 00:39:12.780
to some extent 006,
comes into play.

00:39:12.780 --> 00:39:18.310
We will have to prove beyond
a shadow of a doubt using

00:39:18.310 --> 00:39:23.820
mathematical rigor that the
earliest finish time selection

00:39:23.820 --> 00:39:30.665
rule always gives us the
maximum number of requests,

00:39:30.665 --> 00:39:31.790
and we're going to do that.
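
NOTE
Checking examples is not a proof, but a brute-force comparison can at least
hunt for counterexamples on small inputs. The sketch below assumes random
(start, finish) pairs with positive length; the helper names, the seed, and
the size limits are arbitrary choices, not part of the lecture.
```python
import random
from itertools import combinations
def compatible(a, b):
    return a[1] <= b[0] or b[1] <= a[0]
def optimum_size(requests):
    # Exhaustive search: the largest subset whose members are pairwise compatible.
    for size in range(len(requests), 0, -1):
        for subset in combinations(requests, size):
            if all(compatible(a, b) for a, b in combinations(subset, 2)):
                return size
    return 0
def eft_size(requests):
    # Size of the schedule produced by the earliest-finish-time rule.
    count, last_finish = 0, float("-inf")
    for start, finish in sorted(requests, key=lambda r: r[1]):
        if start >= last_finish:
            count, last_finish = count + 1, finish
    return count
def random_instance(rng, max_requests=7):
    instance = []
    for _ in range(rng.randint(1, max_requests)):
        start = rng.randint(0, 20)
        instance.append((start, start + rng.randint(1, 6)))
    return instance
rng = random.Random(0)
instances = [random_instance(rng) for _ in range(200)]
mismatches = [inst for inst in instances if eft_size(inst) != optimum_size(inst)]
print(len(mismatches))  # 0
```
An empty mismatch list is only evidence on these instances; the induction
proof is what covers every input.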

00:39:31.790 --> 00:39:33.750
It's going to take us
a little bit of time,

00:39:33.750 --> 00:39:38.330
but that's the kind of thing
you will be expected to do

00:39:38.330 --> 00:39:41.530
and you'll see a lot of in 046.

00:39:41.530 --> 00:39:43.590
OK?

00:39:43.590 --> 00:39:47.260
So everyone buy
earliest finish time?

00:39:47.260 --> 00:39:48.462
Yep, go ahead.

00:39:48.462 --> 00:39:50.922
AUDIENCE: So what if we
consider the simple path

00:39:50.922 --> 00:39:54.080
example of there's one
request for the whole block,

00:39:54.080 --> 00:39:57.577
and there's one small request
that it mentioned earlier.

00:39:57.577 --> 00:39:59.160
SRINIVAS DEVADAS:
Well, you'll get one

00:39:59.160 --> 00:40:01.960
for-- if there's
any two requests,

00:40:01.960 --> 00:40:03.810
your maximum number is 1.

00:40:03.810 --> 00:40:05.770
So you pick-- it
doesn't matter--

00:40:05.770 --> 00:40:08.840
it's not like you want
efficiency of your resource.

00:40:08.840 --> 00:40:11.390
In this particular case,
we will look at cases

00:40:11.390 --> 00:40:15.450
where you might have an extra
consideration associated

00:40:15.450 --> 00:40:18.770
with your problem which
changes the problem that says,

00:40:18.770 --> 00:40:22.210
I want my resource to
be maximally utilized.

00:40:22.210 --> 00:40:24.770
If you do that, then
this doesn't work.

00:40:24.770 --> 00:40:27.470
And that's exactly-- it's
a great question you asked.

00:40:27.470 --> 00:40:32.380
But I did say that we were going
to look at the theme here, which

00:40:32.380 --> 00:40:38.150
I don't have anymore, but of
how problems change algorithms.

00:40:38.150 --> 00:40:40.830
And so that's a problem change.

00:40:40.830 --> 00:40:41.956
You've got a question.

00:40:41.956 --> 00:40:43.414
AUDIENCE: I have
a counter example.

00:40:43.414 --> 00:40:46.110
You have three
intervals that don't

00:40:46.110 --> 00:40:48.085
conflict with one another.

00:40:48.085 --> 00:40:52.376
You have one interval that
conflicts with the first two

00:40:52.376 --> 00:40:55.912
and ends earlier
than the first one.

00:40:55.912 --> 00:40:57.620
SRINIVAS DEVADAS: OK,
so are you claiming

00:40:57.620 --> 00:40:59.870
that there's going to be a
counter example to earliest

00:40:59.870 --> 00:41:00.430
finish time?

00:41:00.430 --> 00:41:00.770
AUDIENCE: Yes.

00:41:00.770 --> 00:41:02.853
SRINIVAS DEVADAS: All
right, I would write it down

00:41:02.853 --> 00:41:04.680
on a sheet of paper.

00:41:04.680 --> 00:41:07.930
And get me a concrete example,
and you can just slide it by.

00:41:07.930 --> 00:41:13.470
And if you get that before I
finish my proof, you win, OK?

00:41:13.470 --> 00:41:15.870
[LAUGHTER]

00:41:15.870 --> 00:41:18.000
So I would write it down.

00:41:18.000 --> 00:41:20.750
Just write it down, so good.

00:41:20.750 --> 00:41:22.600
All right, so this
is a contest now.

00:41:22.600 --> 00:41:25.559
[LAUGHTER]

00:41:25.559 --> 00:41:27.600
All right, so we are going
to try and prove this.

00:41:36.450 --> 00:41:39.780
So there's many ways
you could prove things,

00:41:39.780 --> 00:41:42.320
and I mean prove
things properly.

00:41:42.320 --> 00:41:46.290
And I don't know if you've
read the old 6042 proof

00:41:46.290 --> 00:41:48.040
techniques that
are invalid, which

00:41:48.040 --> 00:41:53.230
is things like proof
by intimidation, proof

00:41:53.230 --> 00:41:56.890
because the lecturer said so,
you know, things like that.

00:41:56.890 --> 00:41:59.030
This is going to be a
classical proof technique.

00:41:59.030 --> 00:42:01.150
It's going to be a
proof by induction.

00:42:01.150 --> 00:42:04.360
We're going to go into
it in some detail.

00:42:04.360 --> 00:42:06.360
Later on in the
term we are going

00:42:06.360 --> 00:42:08.890
to put out sketches of proofs.

00:42:08.890 --> 00:42:12.300
We are going to be skipping
steps in lecture that

00:42:12.300 --> 00:42:16.810
are obvious or maybe
not so obvious,

00:42:16.810 --> 00:42:19.570
but if you paid
attention, then you

00:42:19.570 --> 00:42:23.770
can infer the middle
step, for example.

00:42:23.770 --> 00:42:26.710
And so we'll be doing
proof sketches,

00:42:26.710 --> 00:42:29.390
and proof sketches are
not sketchy proofs.

00:42:29.390 --> 00:42:30.870
[LAUGHTER]

00:42:30.870 --> 00:42:32.470
So keep that in mind.

00:42:32.470 --> 00:42:34.700
But this particular proof
that we're going to do,

00:42:34.700 --> 00:42:36.170
I'm going to put
in all the steps

00:42:36.170 --> 00:42:39.210
because it's our first one.

00:42:39.210 --> 00:42:46.580
And so what we're going to
do here is prove a claim,

00:42:46.580 --> 00:42:57.010
and the claim is
simply that-- whoops,

00:42:57.010 --> 00:42:59.980
this is not writing very well.

00:43:03.390 --> 00:43:04.360
What is going on here?

00:43:11.200 --> 00:43:13.534
OK.

00:43:13.534 --> 00:43:16.450
[LAUGHTER]

00:43:16.450 --> 00:43:17.725
Back to the white.

00:43:22.480 --> 00:43:37.050
Given a list of intervals
L, our greedy algorithm

00:43:37.050 --> 00:43:53.960
with earliest finish time
produces k star intervals

00:43:53.960 --> 00:43:55.340
where k star is minimal.

00:44:01.250 --> 00:44:03.430
So that's what we like to prove.

00:44:03.430 --> 00:44:05.313
AUDIENCE: [INAUDIBLE].

00:44:05.313 --> 00:44:06.938
SRINIVAS DEVADAS:
Sorry, what happened?

00:44:06.938 --> 00:44:08.180
AUDIENCE: [INAUDIBLE]

00:44:08.180 --> 00:44:10.900
SRINIVAS DEVADAS: Oh, right.

00:44:10.900 --> 00:44:13.510
Good point.

00:44:13.510 --> 00:44:14.010
Maximum.

00:44:21.370 --> 00:44:24.720
What we're going to do is
prove this by induction,

00:44:24.720 --> 00:44:27.010
and it's going to be
induction on k star.

00:44:32.070 --> 00:44:40.990
And so the base case is almost
always with induction proofs

00:44:40.990 --> 00:44:45.110
trivial, and it's
similar here as well.

00:44:45.110 --> 00:44:49.960
And in the base
case, if you have

00:44:49.960 --> 00:44:55.360
a single interval
in your list, then

00:44:55.360 --> 00:44:57.950
obviously that's
a trivial example.

00:44:57.950 --> 00:45:01.420
But what I'm saying here for
the base is slightly different.

00:45:01.420 --> 00:45:05.510
It says that the optimal
solution has a single interval,

00:45:05.510 --> 00:45:06.130
right?

00:45:06.130 --> 00:45:11.020
And so now if your problem has
one interval or two intervals

00:45:11.020 --> 00:45:14.380
or three intervals, you
can always pick one,

00:45:14.380 --> 00:45:18.205
and it's clearly going to be
a valid schedule because you

00:45:18.205 --> 00:45:19.860
don't have to check
compatibility.

00:45:19.860 --> 00:45:22.490
And so the base
case is trivial even

00:45:22.490 --> 00:45:27.990
in the case where
you're not talking just

00:45:27.990 --> 00:45:32.560
of intervals that
have cardinality 1,

00:45:32.560 --> 00:45:36.230
but the optimal schedule
has cardinality 1.

00:45:36.230 --> 00:45:38.770
So that's a trivial case.

00:45:38.770 --> 00:45:46.750
So the hard work, of course,
in the induction proofs is

00:45:46.750 --> 00:45:51.930
assuming the hypothesis
and proving the n-plus-1,

00:45:51.930 --> 00:45:56.510
or in this case, the
k-star-plus-1 case.

00:45:56.510 --> 00:45:59.330
And that's what we'll
have to work on.

00:45:59.330 --> 00:46:13.910
So let's say that the
claim holds for k star,

00:46:13.910 --> 00:46:28.260
and we are given a
list of intervals

00:46:28.260 --> 00:46:33.860
whose optimal schedule
is k star plus 1.

00:46:37.110 --> 00:46:45.660
It has k-star-plus-1 intervals
in the optimal schedule,

00:46:45.660 --> 00:46:48.300
so L may be some large
number, capital L,

00:46:48.300 --> 00:46:49.650
maybe in the hundreds.

00:46:49.650 --> 00:46:53.160
And k star, that may
be 10 or what have you.

00:46:53.160 --> 00:46:53.910
They're different.

00:46:53.910 --> 00:46:56.080
I want to point that out.

00:46:56.080 --> 00:47:05.010
So our optimal schedule, we're
going to write out as this,

00:47:05.010 --> 00:47:05.750
s star.

00:47:08.340 --> 00:47:13.550
So usually if you use star for
optimal in 046 and it's got

00:47:13.550 --> 00:47:22.510
k-star-plus-1 entries, and those
entries look like sf pairs--

00:47:22.510 --> 00:47:28.080
so I'm going to be using the
subscript j1 through j k star

00:47:28.080 --> 00:47:35.440
plus 1 to denote
these intervals.

00:47:35.440 --> 00:47:39.170
So the first one is sj1, fj1.

00:47:39.170 --> 00:47:41.670
That's an interval
that's been selected

00:47:41.670 --> 00:47:44.510
and is part of our
optimal solution.

00:47:44.510 --> 00:47:51.590
And then you keep going
and we have sj k star

00:47:51.590 --> 00:47:59.470
plus 1 comma fj k star plus 1.

00:47:59.470 --> 00:48:06.240
So no getting away from
subscripts here in 046. So

00:48:06.240 --> 00:48:15.010
that's what we have in terms
of this is what the optimal

00:48:15.010 --> 00:48:15.710
schedule is.

00:48:15.710 --> 00:48:17.740
It's got size k star plus 1.

00:48:17.740 --> 00:48:19.990
Of course, what
we have to show is

00:48:19.990 --> 00:48:26.154
that the greedy algorithm
with the earliest finish time

00:48:26.154 --> 00:48:27.570
is going to produce
something that

00:48:27.570 --> 00:48:30.900
is k star plus one in size.

00:48:30.900 --> 00:48:33.300
And so that's the hard part.

00:48:33.300 --> 00:48:36.330
We can assume the
inductive hypothesis,

00:48:36.330 --> 00:48:37.950
and we'll have to do that.

00:48:37.950 --> 00:48:41.700
But there's a couple
of steps in between.

00:48:41.700 --> 00:48:51.490
So let's say that what
we have is s1 through k

00:48:51.490 --> 00:48:55.020
is what the greedy algorithm
produces with the earliest

00:48:55.020 --> 00:48:56.126
finish time.

00:48:56.126 --> 00:49:10.350
So I'm going to write
that down sik fik,

00:49:10.350 --> 00:49:17.800
so notice I have k here, and
k and k star, at this point,

00:49:17.800 --> 00:49:18.640
are not comparable.

00:49:21.780 --> 00:49:29.860
I'm just making a statement that
I took this particular problem

00:49:29.860 --> 00:49:35.800
that has k star plus 1 in terms
of its optimal solution size,

00:49:35.800 --> 00:49:39.684
and for that problem,
I have k intervals

00:49:39.684 --> 00:49:41.350
that are produced by
the earliest finish

00:49:41.350 --> 00:49:43.760
time greedy heuristic.

00:49:43.760 --> 00:49:47.630
And so that's why the
subscripts here are different.

00:49:47.630 --> 00:49:51.570
I have i1 here and ik,
and then over here I

00:49:51.570 --> 00:49:54.320
have the j's, and so these
intervals are different.

00:49:56.940 --> 00:50:07.160
If I look at f
of i1, and if I look at f of j1,

00:50:07.160 --> 00:50:10.200
what can I say about
f of i1 and f of j1?

00:50:15.720 --> 00:50:17.950
Is there a relationship
between f of i1 and f of j1?

00:50:21.700 --> 00:50:23.160
They're equal?

00:50:23.160 --> 00:50:26.380
Do they have to be equal?

00:50:26.380 --> 00:50:26.880
Yeah?

00:50:26.880 --> 00:50:27.990
AUDIENCE: Less or equal to.

00:50:27.990 --> 00:50:29.365
SRINIVAS DEVADAS:
Less than or equal

00:50:29.365 --> 00:50:33.740
to, exactly right, so
they're less than or equal to.

00:50:33.740 --> 00:50:37.330
It's possible that
you might end up

00:50:37.330 --> 00:50:43.130
with a different optimal
solution that doesn't

00:50:43.130 --> 00:50:44.520
use the earliest finish time.

00:50:44.520 --> 00:50:46.940
We think earliest finish time
is optimal at this point.

00:50:46.940 --> 00:50:49.680
We haven't proven it yet,
but it's quite possible

00:50:49.680 --> 00:50:53.880
that you may have
other solutions that

00:50:53.880 --> 00:50:56.650
are optimal that aren't
necessarily the ones

00:50:56.650 --> 00:50:58.980
that earliest finish
time gives you.

00:50:58.980 --> 00:51:01.260
So that's really why the
less than or equal to

00:51:01.260 --> 00:51:04.490
is important here.

00:51:04.490 --> 00:51:07.180
Now what I'm going to
do is create a schedule,

00:51:07.180 --> 00:51:13.270
s star star, that essentially
is going to be taking s star

00:51:13.270 --> 00:51:18.270
and pulling out the first
interval from s star

00:51:18.270 --> 00:51:21.000
and substituting it
with the first interval

00:51:21.000 --> 00:51:24.020
from my greedy
algorithm's schedule.

00:51:24.020 --> 00:51:25.820
So I'm just going
to replace that,

00:51:25.820 --> 00:51:32.660
and so s star star is si1 fj1.

00:51:36.520 --> 00:51:42.500
And then I'm going to
be going back to sj2 fj2

00:51:42.500 --> 00:51:46.690
because I'm going back to s
star and all the other ones

00:51:46.690 --> 00:51:48.930
are coming from s star.

00:51:48.930 --> 00:52:01.780
So they're going to be sj k star
plus 1 comma fj k star plus 1.

00:52:04.350 --> 00:52:08.130
So I just did a little
substitution there associated

00:52:08.130 --> 00:52:13.950
with the optimal
solution, and I stuck

00:52:13.950 --> 00:52:16.890
in part of the greedy
algorithm solution,

00:52:16.890 --> 00:52:18.625
in fact, the very
first interval.

00:52:18.625 --> 00:52:22.542
AUDIENCE: So the 1 should be i1.

00:52:22.542 --> 00:52:25.580
SRINIVAS DEVADAS: Oh,
this should be-- i1,

00:52:25.580 --> 00:52:26.580
AUDIENCE: Right?

00:52:26.580 --> 00:52:28.640
SRINIVAS DEVADAS: i1, thank you.

00:52:28.640 --> 00:52:33.650
Yep, good.

00:52:33.650 --> 00:52:39.260
So we've got a couple of things
to do, a couple of observations

00:52:39.260 --> 00:52:46.150
to make, and we're going
to be able to prove

00:52:46.150 --> 00:52:48.680
some relationship
between k and k star

00:52:48.680 --> 00:52:51.970
that is going to give us
the proof for our claim.

00:52:56.570 --> 00:53:02.670
So clearly, s double star
is also optimal.

00:53:02.670 --> 00:53:05.226
All I've done is
taken one interval out

00:53:05.226 --> 00:53:06.600
and replaced it
with another one.

00:53:06.600 --> 00:53:08.610
It hasn't changed the size.

00:53:08.610 --> 00:53:13.230
It goes up to k star plus 1, so
s double star is also optimal.

00:53:13.230 --> 00:53:17.470
s star is optimal. s
double star is optimal.

00:53:17.470 --> 00:53:29.210
Now I'm going to define L
prime as the set of intervals

00:53:29.210 --> 00:53:35.720
with s of i greater than
or equal to f of i1.

00:53:35.720 --> 00:53:37.330
So what is L prime?

00:53:37.330 --> 00:53:41.030
Well, L prime is what
happens in the second step

00:53:41.030 --> 00:53:45.500
of the greedy algorithm,
where in the second step

00:53:45.500 --> 00:53:48.720
of the greedy algorithm,
once I've selected

00:53:48.720 --> 00:53:52.000
this particular interval
and I've pulled it in,

00:53:52.000 --> 00:53:54.720
I have to reject all of
the other intervals that

00:53:54.720 --> 00:53:57.840
are incompatible with this one.

00:53:57.840 --> 00:54:03.880
So I'm going to have to take
only those intervals for which

00:54:03.880 --> 00:54:08.260
s of i is greater than
or equal to f of i1

00:54:08.260 --> 00:54:14.140
because those are the
ones that are compatible.

00:54:14.140 --> 00:54:16.190
So that's what L prime is.
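
NOTE
A short sketch of how L prime could be computed, assuming (start, finish)
pairs and an arbitrary example list; residual_problem is an illustrative
name. It picks i1 by minimum finish time and keeps only the requests with
s(i) greater than or equal to f(i1).
```python
def residual_problem(requests):
    # i1 is the greedy choice: the request with the earliest finish time.
    first = min(requests, key=lambda r: r[1])
    # L prime keeps only the requests that start at or after f(i1).
    return first, [r for r in requests if r[0] >= first[1]]
requests = [(0, 4), (3, 5), (4, 8), (7, 9), (8, 12)]
first, l_prime = residual_problem(requests)
print(first, l_prime)  # (0, 4) [(4, 8), (7, 9), (8, 12)]
```
Every request left in L prime is compatible with i1, which is what lets the
rest of the argument treat L prime as a smaller instance of the same problem.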

00:54:16.190 --> 00:54:22.500
And I'm going to be able
to say that since s double

00:54:22.500 --> 00:54:38.300
star is optimal for L, s
double star 2 to k star plus 1

00:54:38.300 --> 00:54:40.615
is optimal for L prime.

00:54:44.760 --> 00:54:51.670
So I'm making a statement
about this optimal solution.

00:54:51.670 --> 00:54:53.660
I know that's
optimal, and basically

00:54:53.660 --> 00:54:58.730
what I'm saying is subsets of
the optimal solution are going

00:54:58.730 --> 00:55:01.360
to have to be optimal because
if that's not the case,

00:55:01.360 --> 00:55:06.850
I could always substitute
something better and increase

00:55:06.850 --> 00:55:12.110
the size of the k star plus
1 optimal solution, which

00:55:12.110 --> 00:55:14.790
obviously would be
a contradiction.

00:55:14.790 --> 00:55:20.730
So s double star
is optimal for L,

00:55:20.730 --> 00:55:23.816
and therefore s double
star 2 through k star

00:55:23.816 --> 00:55:26.810
plus 1 is optimal for L prime.

00:55:26.810 --> 00:55:28.120
Everybody buy that?

00:55:28.120 --> 00:55:29.020
Yep?

00:55:29.020 --> 00:55:31.720
Good.

00:55:31.720 --> 00:55:33.440
And so what this
means, of course,

00:55:33.440 --> 00:55:49.100
is that the optimal schedule
for L prime has k star size.

00:55:49.100 --> 00:55:50.290
And I'm starting with 2.

00:55:50.290 --> 00:55:51.650
I've taken away 1.

00:55:51.650 --> 00:55:55.030
So now I have L prime,
which is a smaller problem.

00:55:55.030 --> 00:55:58.575
Now you see where the proof is
headed, if you didn't already.

00:55:58.575 --> 00:56:01.250
I have a smaller problem,
which is L prime.

00:56:01.250 --> 00:56:03.680
Clearly, it's got
fewer requests,

00:56:03.680 --> 00:56:08.930
and I have constructed
an optimal schedule

00:56:08.930 --> 00:56:11.570
for that problem
by pulling it out

00:56:11.570 --> 00:56:15.910
of the original optimal
schedule I was given.

00:56:15.910 --> 00:56:21.950
And that size of that
optimal schedule is k star.

00:56:21.950 --> 00:56:26.140
And now I get to invoke
my inductive hypothesis

00:56:26.140 --> 00:56:29.150
because my inductive
hypothesis says

00:56:29.150 --> 00:56:31.960
that this claim that
I have up there holds

00:56:31.960 --> 00:56:36.160
for any set of
problems that have

00:56:36.160 --> 00:56:39.430
an optimal schedule
of size k star.

00:56:39.430 --> 00:56:42.610
That's what the inductive
hypothesis gives me.

00:56:42.610 --> 00:56:56.340
And so by the
inductive hypothesis,

00:56:56.340 --> 00:57:09.520
when I run the greedy
algorithm on L prime,

00:57:09.520 --> 00:57:19.340
I'm going to get a
schedule of size k star.

00:57:28.920 --> 00:57:33.070
Now can you tell me, based
on what you see on the board,

00:57:33.070 --> 00:57:37.230
by construction, when I
run the greedy algorithm,

00:57:37.230 --> 00:57:41.830
what am I getting on L star?

00:57:41.830 --> 00:57:46.980
By construction, when I run the
greedy algorithm on L prime--

00:57:46.980 --> 00:57:49.840
there's too many
superscripts here--

00:57:49.840 --> 00:57:52.980
when I run the greedy algorithm
on L prime, what do I get?

00:57:55.840 --> 00:57:56.851
Someone?

00:57:56.851 --> 00:57:57.350
Yeah?

00:57:57.350 --> 00:58:01.244
AUDIENCE: We get s of i sub
2, f of i sub 2 interval.

00:58:01.244 --> 00:58:02.660
SRINIVAS DEVADAS:
Exactly right, I

00:58:02.660 --> 00:58:06.500
get everything from
the second thing

00:58:06.500 --> 00:58:09.910
here all the way to the
end because that's exactly

00:58:09.910 --> 00:58:11.410
what the greedy algorithm does.

00:58:11.410 --> 00:58:15.020
Remember, the greedy
algorithm picked si1 fi1,

00:58:15.020 --> 00:58:19.140
and then rejected all requests
that are incompatible and then

00:58:19.140 --> 00:58:20.090
moved on.

00:58:20.090 --> 00:58:23.180
When you rejected all requests
that are incompatible here,

00:58:23.180 --> 00:58:25.830
you got exactly L prime.

00:58:25.830 --> 00:58:29.130
And by construction,
the greedy algorithm

00:58:29.130 --> 00:58:35.800
should have given me all
the way from si2 to sik.

00:58:35.800 --> 00:58:37.620
Thank you.

00:58:37.620 --> 00:58:54.620
So by construction,
the greedy on L prime

00:58:54.620 --> 00:59:00.760
gives s2 to k, right?

00:59:00.760 --> 00:59:02.985
And what is the size of this?

00:59:02.985 --> 00:59:07.990
2 to k gives me a
size of k minus 1.

00:59:07.990 --> 00:59:08.970
This is k minus 1.

00:59:15.910 --> 00:59:21.620
So if I put these
two things together,

00:59:21.620 --> 00:59:23.125
what is the next step?

00:59:23.125 --> 00:59:26.330
I have the inductive
hypothesis giving me a fact.

00:59:26.330 --> 00:59:29.380
I have the construction
giving me something.

00:59:29.380 --> 00:59:32.600
Now I can relate k and k star.

00:59:32.600 --> 00:59:33.600
What's the relationship?

00:59:38.440 --> 00:59:41.720
k star is equal to
k minus 1, right?

00:59:41.720 --> 00:59:44.710
Do people see that?

00:59:44.710 --> 00:59:51.100
So size k star is
just k minus 1.

00:59:51.100 --> 00:59:57.680
So what that means is
given that s2k is of size k

00:59:57.680 --> 01:00:05.280
star, it means that s1k
is of size k star plus 1,

01:00:05.280 --> 01:00:07.940
which is exactly what I want.

01:00:07.940 --> 01:00:11.860
That's optimal because
I said in the beginning

01:00:11.860 --> 01:00:16.550
that we had k star plus 1 in our
inductive hypothesis this case

01:00:16.550 --> 01:00:18.400
as being the optimal solution.

01:00:18.400 --> 01:00:21.980
So this last step
here is all you

01:00:21.980 --> 01:00:30.220
need to argue now that s
of 1k, going back up here,

01:00:30.220 --> 01:00:42.140
this is optimal because
k equals k star plus 1.
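
NOTE
Editor's sketch, not part of the lecture: a minimal Python version of the
earliest-finish-time greedy rule whose optimality was just argued. It assumes
requests are (start, finish) pairs and that a request starting exactly when
another finishes is compatible; the function name is illustrative only.
```python
from typing import List, Tuple
Interval = Tuple[float, float]  # (start, finish)
def greedy_earliest_finish(requests: List[Interval]) -> List[Interval]:
    """Pick the earliest finish, drop incompatible requests, repeat."""
    chosen: List[Interval] = []
    last_finish = float("-inf")
    # Scanning in finish-time order is equivalent to repeatedly picking the
    # earliest-finishing request and rejecting everything that overlaps it.
    for start, finish in sorted(requests, key=lambda r: r[1]):
        if start >= last_finish:  # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```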

01:00:42.140 --> 01:00:48.440
There you go, so that's the
kind of argument that you have

01:00:48.440 --> 01:00:53.510
to make in order to prove
something like this in 046.

01:00:53.510 --> 01:00:57.010
And what you'll see
in your problem sets,

01:00:57.010 --> 01:00:59.380
including the one that's
going to come out on Thursday,

01:00:59.380 --> 01:01:02.910
is a different
problem that you

01:01:02.910 --> 01:01:05.907
have to prove a
greedy algorithm for.

01:01:05.907 --> 01:01:07.490
I forget exactly
what technique you'll

01:01:07.490 --> 01:01:09.460
have used there,
perhaps induction,

01:01:09.460 --> 01:01:10.900
perhaps contradiction.

01:01:10.900 --> 01:01:12.650
And these are the
kinds of things

01:01:12.650 --> 01:01:17.110
that get you to the
point where you've

01:01:17.110 --> 01:01:19.830
analyzed the correctness
of algorithms,

01:01:19.830 --> 01:01:22.720
not just the fact that you're
getting a valid schedule,

01:01:22.720 --> 01:01:26.330
but you're getting a
valid maximum schedule

01:01:26.330 --> 01:01:29.520
in terms of the maximum
number of requests.

01:01:29.520 --> 01:01:32.920
Any questions about this?

01:01:32.920 --> 01:01:35.110
Do people buy the proof?

01:01:35.110 --> 01:01:36.308
Yep.

01:01:36.308 --> 01:01:37.730
Good.

01:01:37.730 --> 01:01:42.080
So that was greedy for
a particular problem.

01:01:42.080 --> 01:01:43.890
I told you that the
theme of our lecture

01:01:43.890 --> 01:01:51.200
here was changing the
problem and getting

01:01:51.200 --> 01:01:58.800
different algorithms that
had different complexities.

01:01:58.800 --> 01:02:00.050
So let's go ahead and do that.

01:02:00.050 --> 01:02:03.130
So the rest of
this lecture, we'll

01:02:03.130 --> 01:02:05.290
just take a look at
different kinds of problems

01:02:05.290 --> 01:02:09.720
and talk a little more
superficially about what

01:02:09.720 --> 01:02:12.160
the problem complexities are.

01:02:12.160 --> 01:02:14.480
And so one thing that
might come to mind

01:02:14.480 --> 01:02:18.890
is that you'd like to do
weighted interval scheduling.

01:02:23.600 --> 01:02:38.350
And what happens here is
each request has weight wi,

01:02:38.350 --> 01:02:48.420
and what we want to do is
schedule a subset of requests

01:02:48.420 --> 01:02:50.180
with maximum weight.

01:02:50.180 --> 01:02:52.600
So previously, it was
just all weights were 1,

01:02:52.600 --> 01:02:57.150
so maximum cardinality
was what we wanted.

01:02:57.150 --> 01:02:59.850
But now we want to schedule
a subset of requests

01:02:59.850 --> 01:03:03.060
with maximum weight.

01:03:03.060 --> 01:03:10.470
Someone give me an argument as
to whether the greedy algorithm

01:03:10.470 --> 01:03:15.300
earliest finish time first is
optimal for this weighted case,

01:03:15.300 --> 01:03:18.320
or give me a counter example.

01:03:18.320 --> 01:03:19.960
Yep, go ahead.

01:03:19.960 --> 01:03:22.285
AUDIENCE: Oh, well, you
know like your first example

01:03:22.285 --> 01:03:25.378
you have your first weight
of the first interval,

01:03:25.378 --> 01:03:26.836
it took the whole
time, [INAUDIBLE]

01:03:26.836 --> 01:03:28.324
would have three smaller ones?

01:03:28.324 --> 01:03:31.862
Well, if the weight of the
first one was 20 and then--

01:03:31.862 --> 01:03:33.570
SRINIVAS DEVADAS:
Exactly, exactly right.

01:03:33.570 --> 01:03:34.830
All right, I owe you one too.

01:03:34.830 --> 01:03:38.010
So here you go.

01:03:38.010 --> 01:03:41.220
So it's a fairly
trivial example.

01:03:41.220 --> 01:03:51.330
All you do is w equals 1,
w equals 1, w equals 3,

01:03:51.330 --> 01:03:53.210
so there you go.

01:03:53.210 --> 01:03:54.800
So clearly, the
earliest finish time

01:03:54.800 --> 01:03:57.980
would pick this one and then
this one, which is fine.

01:03:57.980 --> 01:04:00.570
You get two of these,
but this was important.
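
NOTE
Editor's sketch, not part of the lecture: the counterexample checked in
Python. Only the weights 1, 1, 3 come from the lecture; the endpoints are
hypothetical, chosen so the weight-3 request overlaps both weight-1 requests.
```python
# (start, finish, weight); endpoints are assumed for illustration
requests = [(0, 2, 1), (3, 5, 1), (1, 4, 3)]
def greedy_weight(reqs):
    """Total weight collected by the earliest-finish-time rule."""
    total, last_finish = 0, float("-inf")
    for start, finish, weight in sorted(reqs, key=lambda r: r[1]):
        if start >= last_finish:
            total += weight
            last_finish = finish
    return total
print(greedy_weight(requests))  # 2: greedy keeps the two weight-1 requests
print(max(w for _, _, w in requests))  # 3: the heavy request on its own
```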

01:04:00.570 --> 01:04:04.750
This is, I don't know,
sleep party, 6046.

01:04:04.750 --> 01:04:06.640
[LAUGHTER]

01:04:06.640 --> 01:04:08.580
So there you go.

01:04:08.580 --> 01:04:11.155
So the weight it is, we
should make that infinity.

01:04:15.050 --> 01:04:17.720
Most important thing
in the world at least

01:04:17.720 --> 01:04:19.240
for the next six months.

01:04:23.130 --> 01:04:24.790
So how does this work now?

01:04:28.640 --> 01:04:33.410
So it turns out that
the greedy strategy,

01:04:33.410 --> 01:04:37.950
the template that I had, fails.

01:04:37.950 --> 01:04:41.660
There's nothing that
exists on this planet

01:04:41.660 --> 01:04:47.520
that, at least I know of, where
you can have a simple rule

01:04:47.520 --> 01:04:52.900
and use that template to get the
optimum solution, in this case,

01:04:52.900 --> 01:04:57.989
maximum weight solution,
for every problem instance,

01:04:57.989 --> 01:04:59.155
so that template just fails.

01:05:03.730 --> 01:05:05.426
What other programming
paradigm do you

01:05:05.426 --> 01:05:06.550
think would be useful here?

01:05:09.790 --> 01:05:10.909
Yeah, go ahead.

01:05:10.909 --> 01:05:11.450
AUDIENCE: DP.

01:05:11.450 --> 01:05:12.820
SRINIVAS DEVADAS: DP, right.

01:05:12.820 --> 01:05:18.293
So do you want to take a stab
at a potential DP solution here?

01:05:18.293 --> 01:05:21.524
AUDIENCE: Yeah, so either
include it in your [INAUDIBLE]

01:05:21.524 --> 01:05:25.480
or discard it and then continue
with set of other intervals.

01:05:25.480 --> 01:05:28.190
SRINIVAS DEVADAS: Yeah, that's
a perfect divide and conquer.

01:05:28.190 --> 01:05:30.820
And then when you include
it, what do you have to do?

01:05:30.820 --> 01:05:32.737
AUDIENCE: Eliminate all
conflicting intervals.

01:05:32.737 --> 01:05:34.611
SRINIVAS DEVADAS: Right,
how many subproblems

01:05:34.611 --> 01:05:36.220
do you think there are?

01:05:36.220 --> 01:05:38.588
I want to make you earn
your Frisbee, right?

01:05:38.588 --> 01:05:42.720
[LAUGHTER]

01:05:42.720 --> 01:05:50.070
AUDIENCE: 2 to the power
of the number of intervals

01:05:50.070 --> 01:05:51.690
you have because--

01:05:51.690 --> 01:05:54.165
SRINIVAS DEVADAS: Well,
that's the number of subsets

01:05:54.165 --> 01:05:55.686
that you have.

01:05:55.686 --> 01:05:57.060
So you have n
intervals, then you

01:05:57.060 --> 01:05:58.600
have two [INAUDIBLE] subsets.

01:05:58.600 --> 01:05:59.590
AUDIENCE: Yeah.

01:05:59.590 --> 01:06:01.048
SRINIVAS DEVADAS:
But remember, you

01:06:01.048 --> 01:06:04.230
want to go-- you want to be
smarter than that, right?

01:06:04.230 --> 01:06:07.940
You want to be a little
bit smarter than that.

01:06:07.940 --> 01:06:10.560
So here, you get
a Frisbee anyway.

01:06:10.560 --> 01:06:11.570
[LAUGHTER]

01:06:11.570 --> 01:06:14.150
No, not anyway, here you go.

01:06:14.150 --> 01:06:16.150
Right.

01:06:16.150 --> 01:06:18.760
So anybody else?

01:06:18.760 --> 01:06:21.470
So what I want to use
is dynamic programming.

01:06:21.470 --> 01:06:23.110
We've established that.

01:06:23.110 --> 01:06:24.642
I want to use
dynamic programming.

01:06:24.642 --> 01:06:27.100
And the dynamic programming--
you have some experience with

01:06:27.100 --> 01:06:32.210
that in 006-- the name of the
game is to figure out what

01:06:32.210 --> 01:06:35.320
the subproblems are.

01:06:35.320 --> 01:06:36.960
The subproblems
are kind of going

01:06:36.960 --> 01:06:39.880
to look like a
collection of requests.

01:06:39.880 --> 01:06:42.180
I mean, there's no
two things about it.

01:06:42.180 --> 01:06:45.260
They're going to be a
collection of requests,

01:06:45.260 --> 01:06:50.430
and so the challenge here is
not to go to the 2 raised to n,

01:06:50.430 --> 01:06:55.960
because 2 raised to n is
bad if you want efficiency.

01:06:55.960 --> 01:07:00.500
So we have to have a polynomial
number of subproblems.

01:07:00.500 --> 01:07:03.160
So someone who hasn't
answered yet, go ahead.

01:07:03.160 --> 01:07:10.920
AUDIENCE: [INAUDIBLE] so
[INAUDIBLE] subset [INAUDIBLE]

01:07:10.920 --> 01:07:16.122
So from interval i to
interval j [INAUDIBLE].

01:07:16.122 --> 01:07:17.580
SRINIVAS DEVADAS:
So you're looking

01:07:17.580 --> 01:07:22.520
at every pair of i's and j's,
and, well, not all of them

01:07:22.520 --> 01:07:23.460
are going to be valid.

01:07:23.460 --> 01:07:26.070
There won't be intervals
associated with that,

01:07:26.070 --> 01:07:29.280
but that's a reasonable start.

01:07:29.280 --> 01:07:32.860
Someone else, someone
who hasn't answered?

01:07:32.860 --> 01:07:33.963
Yeah, back there.

01:07:33.963 --> 01:07:35.895
AUDIENCE: You could
do the best from

01:07:35.895 --> 01:07:39.609
the start to some given point,
and so there'd be n of those.

01:07:39.609 --> 01:07:42.150
SRINIVAS DEVADAS: Ah, best from
the start to any given point.

01:07:42.150 --> 01:07:45.665
All right, well, you
got close, Michael.

01:07:45.665 --> 01:07:47.180
There you go.

01:07:47.180 --> 01:07:50.360
You need to stand up.

01:07:50.360 --> 01:07:52.660
Ew, bad throw.

01:07:52.660 --> 01:07:54.100
That's a bad throw.

01:07:54.100 --> 01:07:56.570
I've got to practice.

01:07:56.570 --> 01:08:00.800
OK, so as you can see
with dynamic programming,

01:08:00.800 --> 01:08:03.550
the challenge is to figure
out what the subproblems are.

01:08:03.550 --> 01:08:05.710
The fact of the
matter is that there's

01:08:05.710 --> 01:08:10.895
going to be many different
possible algorithms that

01:08:10.895 --> 01:08:13.630
are all DP for this
weighted problem.

01:08:13.630 --> 01:08:15.520
There's at least two
interesting ones.

01:08:15.520 --> 01:08:18.350
We're going to do a simple one,
which is based on the answer

01:08:18.350 --> 01:08:21.689
that the gentleman
here just gave.

01:08:21.689 --> 01:08:24.550
But it turns out you can be
a little smarter than that,

01:08:24.550 --> 01:08:29.451
and most likely you'll hear
the smarter way in the section,

01:08:29.451 --> 01:08:31.200
but let's do the simple
one because that's

01:08:31.200 --> 01:08:33.210
all I have time here for.

01:08:33.210 --> 01:08:36.569
And the key is to
define the subproblems,

01:08:36.569 --> 01:08:40.210
and then once you do that,
the actual recursion ends up

01:08:40.210 --> 01:08:45.380
being a fairly straightforward
and intuitive step.

01:08:45.380 --> 01:08:56.770
So let's look at dynamic
programming, one particular way

01:08:56.770 --> 01:09:01.020
of solving this problem,
using the DP paradigm.

01:09:01.020 --> 01:09:07.149
So what I'm going to do is
define subproblems R star,

01:09:07.149 --> 01:09:10.899
so R is the total number
of requests that we have,

01:09:10.899 --> 01:09:13.370
and the subproblems
are going to correspond

01:09:13.370 --> 01:09:19.830
to-- I'm going to request
j belonging to R such

01:09:19.830 --> 01:09:22.010
that-- oh, I'm sorry.

01:09:22.010 --> 01:09:31.130
This is R of x-- such that sj
is greater than or equal to x.

01:09:31.130 --> 01:09:37.960
So what I'm doing here
is, given a particular x,

01:09:37.960 --> 01:09:40.680
I can always shrink
the number of requests

01:09:40.680 --> 01:09:45.340
that I have based on this rule.

01:09:45.340 --> 01:09:48.279
And then you might
ask, what is x?

01:09:48.279 --> 01:09:59.410
And now you can apply the
same subsetting property

01:09:59.410 --> 01:10:07.030
by choosing the x's to be
the finishing times of all

01:10:07.030 --> 01:10:08.535
of the other requests.

01:10:08.535 --> 01:10:12.210
All right, so x equals f of i.

01:10:12.210 --> 01:10:17.680
So what this means is-- then
I put f of i over here--

01:10:17.680 --> 01:10:20.360
it means all of
the requests that

01:10:20.360 --> 01:10:26.940
come after the i-th request
finishes are part of R of fi.

01:10:26.940 --> 01:10:43.350
So R of fi would simply be
requests later than f of i.

01:10:43.350 --> 01:10:46.430
And there's something subtle
here that I want to point out,

01:10:46.430 --> 01:10:51.790
which is R of fi is
not the set of requests

01:10:51.790 --> 01:10:55.910
that are compatible
with the i-th request.

01:10:55.910 --> 01:10:57.810
It's not exactly that.

01:10:57.810 --> 01:11:02.324
It's the set of requests
that are later than f of i.

01:11:02.324 --> 01:11:04.800
So keep that in mind
because what happens here

01:11:04.800 --> 01:11:09.620
is we're going to solve
this problem step by step.

01:11:09.620 --> 01:11:13.990
We're going to construct the
dynamic programming solution

01:11:13.990 --> 01:11:19.410
essentially by
picking a request,

01:11:19.410 --> 01:11:21.570
just like in the greedy
case, and then taking

01:11:21.570 --> 01:11:23.310
the request that
comes after that.

01:11:23.310 --> 01:11:26.690
So we're going to
pick an early request,

01:11:26.690 --> 01:11:28.570
and then we're going
to subset the solution,

01:11:28.570 --> 01:11:31.630
pick the next one just like
we did with the greedy.

01:11:31.630 --> 01:11:35.110
And so the subproblems
that we will actually

01:11:35.110 --> 01:11:38.090
solve potentially bottom up
if we are doing recursion

01:11:38.090 --> 01:11:43.290
are going to correspond to a
set of requests that come later

01:11:43.290 --> 01:11:48.620
than the particular subset
that we're looking at,

01:11:48.620 --> 01:11:51.770
which is defined by a
particular interval.

01:11:51.770 --> 01:11:55.036
So requests that are later
than f of i, not necessarily

01:11:55.036 --> 01:11:56.660
all of the requests
that are compatible

01:11:56.660 --> 01:11:59.043
with the i-th request.

01:11:59.043 --> 01:12:04.230
And so if you do that, then
the number of subproblems

01:12:04.230 --> 01:12:10.680
here is small, n, where n
is the number of requests.

01:12:10.680 --> 01:12:16.640
So if n is the
number of requests

01:12:16.640 --> 01:12:27.520
in the original problem,
the number of subproblems

01:12:27.520 --> 01:12:32.070
equals n because all I do
is plug in an appropriate i,

01:12:32.070 --> 01:12:35.110
find f of i for it, and
generate the R of f of i

01:12:35.110 --> 01:12:36.290
for each of those.

01:12:36.290 --> 01:12:39.210
So there's going to be
n of those subproblems.
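
NOTE
Editor's sketch, not part of the lecture: the subproblem sets written out in
Python, assuming requests are (start, finish, weight) triples. The helper
names are hypothetical. R of x keeps the requests with start at least x, and
plugging in each finish time f of i gives one subproblem per request.
```python
from typing import List, Tuple
Request = Tuple[float, float, float]  # (start, finish, weight)
def later_requests(requests: List[Request], x: float) -> List[Request]:
    """R of x: the requests whose start time is at least x."""
    return [r for r in requests if r[0] >= x]
def subproblems(requests: List[Request]) -> List[List[Request]]:
    """One set R of f_i per request i, so n subproblems for n requests."""
    return [later_requests(requests, finish) for _, finish, _ in requests]
```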

01:12:39.210 --> 01:12:51.406
And we're going to solve
each subproblem once and then

01:12:51.406 --> 01:12:51.905
memoize.

01:12:56.140 --> 01:12:59.180
And so the work
that we have to do

01:12:59.180 --> 01:13:04.140
is the basic rule corresponding
to the complexity of a DP,

01:13:04.140 --> 01:13:16.800
which is number of
subproblems times the time

01:13:16.800 --> 01:13:24.860
to solve each subproblem,
or a single subproblem,

01:13:24.860 --> 01:13:33.540
and this assumes
order 1 for lookups.

01:13:33.540 --> 01:13:36.240
So you can think of
the recursive calls

01:13:36.240 --> 01:13:44.260
as being order 1
because you're assuming

01:13:44.260 --> 01:13:46.900
you're doing memoization.

01:13:46.900 --> 01:13:50.450
So I haven't really told
you anything here that you

01:13:50.450 --> 01:13:56.150
haven't seen in 006 and likely
applied a bunch of times.

01:13:56.150 --> 01:14:00.410
Over here, we've just defined
what our subproblems are

01:14:00.410 --> 01:14:04.680
for our particular
DP, and we argued

01:14:04.680 --> 01:14:06.450
that the number of
subproblems that

01:14:06.450 --> 01:14:09.110
are associated with
this particular choice

01:14:09.110 --> 01:14:11.560
of subproblems
corresponds to n if you

01:14:11.560 --> 01:14:14.750
have n requests in the
original problem instance

01:14:14.750 --> 01:14:16.460
that you've given.

01:14:16.460 --> 01:14:21.310
So the last thing that we have
to do here to solve our DP

01:14:21.310 --> 01:14:23.830
is, of course, to
write our recursion

01:14:23.830 --> 01:14:25.830
and to convince ourselves
that this actually all

01:14:25.830 --> 01:14:28.950
works out, and let's do that.

01:14:35.290 --> 01:14:41.700
And so what we have
here is our DP guessing.

01:14:45.520 --> 01:14:51.530
And we're going to
try each request

01:14:51.530 --> 01:15:03.060
i as a plausible first request,
and so that's where this works.

01:15:03.060 --> 01:15:06.000
You might be thinking,
boy, I mean, this R of fi

01:15:06.000 --> 01:15:07.450
looks a little strange.

01:15:07.450 --> 01:15:10.230
Why doesn't it include
all of the requests that

01:15:10.230 --> 01:15:15.050
are compatible with
the i-th request?

01:15:15.050 --> 01:15:19.590
I mean, I'm somehow shrinking
my subsequent problem size

01:15:19.590 --> 01:15:22.165
if I'm ignoring
some requests that

01:15:22.165 --> 01:15:25.380
are earlier that really
should be part of--

01:15:25.380 --> 01:15:27.610
or are part of the
compatible set,

01:15:27.610 --> 01:15:30.380
but they're not part
of the R of fi set.

01:15:30.380 --> 01:15:32.230
And so some of you
may be thinking that,

01:15:32.230 --> 01:15:35.700
well, the reason this
is going to work out

01:15:35.700 --> 01:15:38.160
is because we are
going to construct

01:15:38.160 --> 01:15:42.900
our solution, as I said before,
from the beginning to the end.

01:15:42.900 --> 01:15:45.330
So we're going to
try each request

01:15:45.330 --> 01:15:48.370
as a plausible first request.

01:15:48.370 --> 01:15:51.840
So even though this request
might be in our chart

01:15:51.840 --> 01:15:56.290
all the way to the right,
it might have a huge weight,

01:15:56.290 --> 01:16:01.360
and so I'm going to have to try
that out as my first selection.

01:16:01.360 --> 01:16:03.780
And when I try that out
as my first selection,

01:16:03.780 --> 01:16:05.760
then the definition
of my subproblem

01:16:05.760 --> 01:16:07.246
says that this will work.

01:16:07.246 --> 01:16:08.870
I only have to look
at the request that

01:16:08.870 --> 01:16:11.900
comes later than that because
the ones that came earlier,

01:16:11.900 --> 01:16:14.220
I've tried them out too.

01:16:14.220 --> 01:16:17.750
So that's something that you
need to keep in mind in order

01:16:17.750 --> 01:16:21.390
to argue correctness
of this recursion

01:16:21.390 --> 01:16:23.760
that I'm going to write out now.

01:16:23.760 --> 01:16:31.270
And so the recursion, and I have
opt R, what is the first thing

01:16:31.270 --> 01:16:33.520
that I'm going to have
on the right-hand side

01:16:33.520 --> 01:16:36.950
of this recursive formulation?

01:16:36.950 --> 01:16:41.320
What mathematical construct
am I going to have to do here?

01:16:41.320 --> 01:16:43.800
And you see something like
guessing and seeing something

01:16:43.800 --> 01:16:46.810
like try each request as
a possible first, what

01:16:46.810 --> 01:16:50.010
mathematical construct am I
going to have to put up here?

01:16:50.010 --> 01:16:50.930
AUDIENCE: Max.

01:16:50.930 --> 01:16:54.470
SRINIVAS DEVADAS:
Max, who said max?

01:16:54.470 --> 01:16:57.910
No one wants to
take credit for max?

01:16:57.910 --> 01:16:59.450
It's max, right?

01:16:59.450 --> 01:17:06.480
So I'm going to have max 1
less than or equal to i less than

01:17:06.480 --> 01:17:08.740
or equal to n.

01:17:08.740 --> 01:17:12.770
And I'm going to-- does
someone want to tell me what

01:17:12.770 --> 01:17:14.210
the rest of this looks like?

01:17:17.780 --> 01:17:19.706
Someone else?

01:17:19.706 --> 01:17:21.460
A couple Frisbees left, guys.

01:17:21.460 --> 01:17:22.565
[LAUGHTER]

01:17:22.565 --> 01:17:24.106
What does the rest
of this look like?

01:17:28.170 --> 01:17:28.670
Yep?

01:17:28.670 --> 01:17:31.323
AUDIENCE: 1 plus
the optimal R f of--

01:17:31.323 --> 01:17:34.630
SRINIVAS DEVADAS:
Not 1, just what kind

01:17:34.630 --> 01:17:36.670
of problem do we have here?

01:17:36.670 --> 01:17:37.925
It's not 1 anymore.

01:17:37.925 --> 01:17:38.509
AUDIENCE: Oh--

01:17:38.509 --> 01:17:39.716
SRINIVAS DEVADAS: The weight.

01:17:39.716 --> 01:17:40.520
AUDIENCE: Right.

01:17:40.520 --> 01:17:44.170
SRINIVAS DEVADAS: The
weight, yep, so Wi

01:17:44.170 --> 01:17:50.620
plus the optimal R fi.

01:17:53.700 --> 01:18:00.350
OK, so we got Wi plus
optimum of R of fi.

01:18:00.350 --> 01:18:03.560
And you said "1," close enough.

01:18:03.560 --> 01:18:05.580
If it was 1, you'd use greedy.

01:18:05.580 --> 01:18:07.840
And so that's why we
were in that Wi mode,

01:18:07.840 --> 01:18:09.520
and we end up getting this here.

01:18:09.520 --> 01:18:10.350
So that's it.

01:18:10.350 --> 01:18:13.114
You try every request
as a possible first.

01:18:13.114 --> 01:18:14.780
Obviously, you pick
that request so it's

01:18:14.780 --> 01:18:20.580
part of your weight in terms of
the weight for your solution.

01:18:20.580 --> 01:18:23.370
When you do that, because
it was the first request,

01:18:23.370 --> 01:18:25.640
you get to prune
the set of requests

01:18:25.640 --> 01:18:31.290
that come later corresponding
to R of fi that you see here.

01:18:31.290 --> 01:18:36.060
And then you go
ahead and simply find

01:18:36.060 --> 01:18:38.920
the optimum for a
smaller problem,

01:18:38.920 --> 01:18:40.610
clearly has fewer requests.

01:18:40.610 --> 01:18:45.920
And as long as you maximize
over the set of guesses

01:18:45.920 --> 01:18:49.980
that you've taken, and there's
n guesses up at the top level.

01:18:49.980 --> 01:18:51.670
Obviously in the
lower levels, you're

01:18:51.670 --> 01:18:55.050
going to have fewer requests
in your R of fi's, and you'll

01:18:55.050 --> 01:19:03.270
have fewer iterations of the max,
but it's n at the top level.

01:19:03.270 --> 01:19:08.900
So one last question,
what is the complexity

01:19:08.900 --> 01:19:12.184
of what we see here?

01:19:12.184 --> 01:19:13.510
AUDIENCE: n square.

01:19:13.510 --> 01:19:16.600
SRINIVAS DEVADAS: n square, and
the reason it's n square is you

01:19:16.600 --> 01:19:19.610
simply use-- you can be
really mechanical about this--

01:19:19.610 --> 01:19:25.440
you say, if this was order 1,
I'm doing a max over n items.

01:19:25.440 --> 01:19:29.570
And therefore, that's order n
time to solve one subproblem.

01:19:29.570 --> 01:19:36.650
And since I have n subproblems,
I get n times order n,

01:19:36.650 --> 01:19:40.390
which is order n squared.
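
NOTE
Editor's sketch, not part of the lecture: the recursion
OPT(R) = max over i of (w_i + OPT(R of f_i)) as memoized Python. It assumes
(start, finish, weight) triples with finish strictly greater than start, and
it indexes each subproblem by its threshold x instead of storing the set.
There are at most n + 1 distinct thresholds and each call scans n requests,
which matches the order n squared bound.
```python
from functools import lru_cache
from typing import List, Tuple
Request = Tuple[float, float, float]  # (start, finish, weight), finish > start
def max_weight_schedule(requests: List[Request]) -> float:
    """Maximum total weight of a set of pairwise compatible requests."""
    @lru_cache(maxsize=None)
    def opt(x: float) -> float:
        # Best weight using only requests that start at or after x.
        best = 0.0
        for start, finish, weight in requests:  # guess the first request
            if start >= x:
                best = max(best, weight + opt(finish))
        return best
    return opt(float("-inf"))  # the full problem: no lower bound on start
```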

01:19:40.390 --> 01:19:45.310
So the last thing I'll do-- and
I just have one more minute--

01:19:45.310 --> 01:19:52.740
is give you a sense of a small
change to interval scheduling

01:19:52.740 --> 01:19:56.940
that puts us in that
NP complete domain.

01:19:56.940 --> 01:19:59.360
So so far, we've just
done two problems.

01:19:59.360 --> 01:20:00.275
There's many others.

01:20:00.275 --> 01:20:01.400
We did interval scheduling.

01:20:01.400 --> 01:20:03.630
There was greedy linear time.

01:20:03.630 --> 01:20:05.880
Weighted interval
scheduling is order n

01:20:05.880 --> 01:20:08.880
squared according to this
particular DP formulation.

01:20:08.880 --> 01:20:13.620
It turns out there's a
smarter DP formulation that

01:20:13.620 --> 01:20:16.900
runs in order n log
n time that you'll

01:20:16.900 --> 01:20:22.500
hear about in section on Friday,
but it's still polynomial time.
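
NOTE
Editor's sketch, not part of the lecture: one well-known order n log n
formulation, which may or may not be the one presented in section. It sorts
by finish time and uses binary search to find how many earlier requests are
compatible with the current one. Requests are assumed to be
(start, finish, weight) triples; the function name is illustrative.
```python
from bisect import bisect_right
from typing import List, Tuple
Request = Tuple[float, float, float]  # (start, finish, weight)
def max_weight_schedule_fast(requests: List[Request]) -> float:
    ordered = sorted(requests, key=lambda r: r[1])  # by finish time
    finishes = [finish for _, finish, _ in ordered]
    # best[j] = best total weight using only the first j requests in order
    best = [0.0] * (len(ordered) + 1)
    for j, (start, _, weight) in enumerate(ordered, start=1):
        # k = how many of the first j - 1 requests finish by this start time
        k = bisect_right(finishes, start, 0, j - 1)
        best[j] = max(best[j - 1], weight + best[k])  # skip j, or take j
    return best[-1]
```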

01:20:22.500 --> 01:20:26.880
Let's make one reasonable
change to this,

01:20:26.880 --> 01:20:31.520
which is to say that we may
have multiple resources,

01:20:31.520 --> 01:20:34.120
and they may be non-identical.

01:20:34.120 --> 01:20:38.140
So it turns out everything that
we've done kind of extrapolates

01:20:38.140 --> 01:20:43.580
very well to identical
machines, even though there's

01:20:43.580 --> 01:20:45.710
many identical machines.

01:20:45.710 --> 01:20:48.180
But if you have
non-identical machines, what

01:20:48.180 --> 01:20:53.350
that means is you have
resources or machines

01:20:53.350 --> 01:20:55.510
that have different types.

01:20:55.510 --> 01:21:02.190
So maybe your
machines are T1 to Tm.

01:21:02.190 --> 01:21:06.670
And it's essentially
a situation where

01:21:06.670 --> 01:21:09.660
you say, this
particular task can only

01:21:09.660 --> 01:21:12.370
be run on this machine
or these other machines,

01:21:12.370 --> 01:21:14.650
some subset of machines.

01:21:14.650 --> 01:21:22.670
So you can still have a
weight of 1 for all requests,

01:21:22.670 --> 01:21:30.670
but you have something like
Q of i, a subset of T, which

01:21:30.670 --> 01:21:36.730
is a set of machines
that i runs on.

01:21:40.280 --> 01:21:42.020
OK, that's it.

01:21:42.020 --> 01:21:43.970
That's the change we make.

01:21:43.970 --> 01:21:49.350
Q of i is going to be
specified for each of the i's.

01:21:49.350 --> 01:21:51.470
So you could even
have two machines.

01:21:51.470 --> 01:21:53.470
And you could say, here's
a set of requests that

01:21:53.470 --> 01:21:55.210
could run on both machines.

01:21:55.210 --> 01:21:57.600
Here's a set that only
runs on the first machine,

01:21:57.600 --> 01:22:00.630
and here's another set that
runs on the second machine.

01:22:00.630 --> 01:22:04.790
That's a simple example
of this generalization.

01:22:04.790 --> 01:22:13.280
If you do this, this problem has
been shown to be NP complete.

01:22:13.280 --> 01:22:17.690
And by that I mean, NP complete
problems are decision problems.

01:22:17.690 --> 01:22:24.100
And so you say, can some
specific number k less

01:22:24.100 --> 01:22:26.875
than n requests be scheduled?

01:22:31.250 --> 01:22:35.669
This decision problem
is NP complete.
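
NOTE
Editor's sketch, not part of the lecture: an exhaustive search for the
decision problem, included only to make the problem statement concrete. It
takes exponential time in the worst case. Requests are assumed to be
(start, finish, Q) triples where Q is the set of allowed machine indices
0 to m - 1; all names here are hypothetical.
```python
from typing import List, Set, Tuple
Request = Tuple[float, float, Set[int]]  # (start, finish, allowed machines)
def can_schedule(requests: List[Request], num_machines: int, k: int) -> bool:
    """Can at least k requests be placed on machines they are allowed on?"""
    ordered = sorted(requests, key=lambda r: r[0])  # by start time
    def search(idx: int, free_at: Tuple[float, ...], placed: int) -> bool:
        if placed >= k:
            return True
        if placed + (len(ordered) - idx) < k:  # too few requests remain
            return False
        start, finish, allowed = ordered[idx]
        for m in allowed:
            # machine m is usable if its last placed request has finished
            if 0 <= m < num_machines and free_at[m] <= start:
                nxt = free_at[:m] + (finish,) + free_at[m + 1:]
                if search(idx + 1, nxt, placed + 1):
                    return True
        return search(idx + 1, free_at, placed)  # leave this request out
    return search(0, (float("-inf"),) * num_machines, 0)
```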

01:22:35.669 --> 01:22:37.960
And so what happens when you
have NP complete problems?

01:22:37.960 --> 01:22:40.418
Well, we're going to have a
little module in the class that

01:22:40.418 --> 01:22:41.940
deals with intractability.

01:22:41.940 --> 01:22:44.290
We're going to
look at cases where

01:22:44.290 --> 01:22:46.430
we could apply
approximation algorithms,

01:22:46.430 --> 01:22:50.610
and maybe in the case of
the optimization problem,

01:22:50.610 --> 01:22:54.270
if the optimum for
this is k star,

01:22:54.270 --> 01:22:57.800
I will say that we can
get within 10% of k star.

01:22:57.800 --> 01:23:00.520
The other way is to just
deal with intractability

01:23:00.520 --> 01:23:03.410
by hoping that your
exponential time

01:23:03.410 --> 01:23:06.640
algorithm runs in a
reasonable amount of time

01:23:06.640 --> 01:23:08.790
for common cases.

01:23:08.790 --> 01:23:13.110
So in the worst case, you might
end up taking a long time.

01:23:13.110 --> 01:23:15.400
But you just sort of
back off after an hour

01:23:15.400 --> 01:23:19.070
and take what you get from
the operative algorithm.

01:23:19.070 --> 01:23:23.840
But in many cases, the algorithm
might actually complete,

01:23:23.840 --> 01:23:26.760
and give you
the optimum solution.

01:23:26.760 --> 01:23:28.270
So done here.

01:23:28.270 --> 01:23:31.490
Make sure to sign up for
a recitation section.

01:23:31.490 --> 01:23:33.616
And see you guys next time.