WEBVTT

00:00:00.050 --> 00:00:01.770
The following
content is provided

00:00:01.770 --> 00:00:04.010
under a Creative
Commons license.

00:00:04.010 --> 00:00:06.860
Your support will help MIT
OpenCourseWare continue

00:00:06.860 --> 00:00:10.720
to offer high quality
educational resources for free.

00:00:10.720 --> 00:00:13.340
To make a donation or
view additional materials

00:00:13.340 --> 00:00:17.207
from hundreds of MIT courses,
visit MIT OpenCourseWare

00:00:17.207 --> 00:00:17.832
at ocw.mit.edu.

00:00:21.155 --> 00:00:22.030
PROFESSOR: All right.

00:00:22.030 --> 00:00:23.220
So I brought a few problems.

00:00:23.220 --> 00:00:26.950
They're obviously not the quiz
problems, though some of them

00:00:26.950 --> 00:00:29.280
are supposed to be similar.

00:00:29.280 --> 00:00:30.880
What I have here
might not be what

00:00:30.880 --> 00:00:33.260
you have on the quiz because
we might drop quiz problems

00:00:33.260 --> 00:00:35.280
or because some of them
are just meant to make

00:00:35.280 --> 00:00:39.310
you think and not to give away
the solutions to the quiz.

00:00:39.310 --> 00:00:41.490
Now, before we get
started on this,

00:00:41.490 --> 00:00:43.814
do you guys have any burning
questions or any concepts

00:00:43.814 --> 00:00:44.730
that you want covered?

00:00:44.730 --> 00:00:46.800
Based on that, I'll select
which problems we do.

00:00:46.800 --> 00:00:47.727
Yes?

00:00:47.727 --> 00:00:50.288
AUDIENCE: This actually relates
not too much to the Pset.

00:00:53.040 --> 00:00:55.725
If you're looking at
the time complexity

00:00:55.725 --> 00:01:02.360
to maybe transfer something
from one table to another,

00:01:02.360 --> 00:01:04.164
it takes a lot
more time, I would

00:01:04.164 --> 00:01:06.980
assume, to move the actual
item to the new table

00:01:06.980 --> 00:01:09.250
than it does just to look
at your point and be like,

00:01:09.250 --> 00:01:10.680
oh, there's nothing there.

00:01:10.680 --> 00:01:12.187
So if you were
just going to look

00:01:12.187 --> 00:01:14.330
through an empty
table of size m,

00:01:14.330 --> 00:01:17.325
the time to look through that
empty table, I'm assuming,

00:01:17.325 --> 00:01:21.210
is much less than the time
to actually move an item.

00:01:21.210 --> 00:01:28.650
PROFESSOR: So you're saying
we have m things here.

00:01:28.650 --> 00:01:29.350
AUDIENCE: Yes.

00:01:29.350 --> 00:01:33.050
PROFESSOR: Some might
be nil, and some

00:01:33.050 --> 00:01:36.260
might have stuff
in them, and you're

00:01:36.260 --> 00:01:41.166
going to resize that to
presumably 2 times m,

00:01:41.166 --> 00:01:42.540
and the way you
do that is you're

00:01:42.540 --> 00:01:45.730
going to move the elements,
presumably by rehashing them,

00:01:45.730 --> 00:01:46.411
right?

00:01:46.411 --> 00:01:46.994
AUDIENCE: Yes.

00:01:50.170 --> 00:01:53.560
PROFESSOR: So these elements,
at least when we use Python,

00:01:53.560 --> 00:01:57.060
we don't really store
big elements anywhere.

00:01:57.060 --> 00:01:59.000
If you have a big
object, we always

00:01:59.000 --> 00:02:01.270
work with references
to that object.

00:02:01.270 --> 00:02:03.280
So you remember
the address where

00:02:03.280 --> 00:02:06.510
that object lies in memory,
and since the memory is

00:02:06.510 --> 00:02:10.110
finite and small, addresses are
all from 0 to a small number,

00:02:10.110 --> 00:02:12.650
so they're constant.

00:02:12.650 --> 00:02:15.990
So what you have here
is not a big object.

00:02:15.990 --> 00:02:17.800
It's the address
of the big object,

00:02:17.800 --> 00:02:21.470
so moving is always
constant time.

00:02:21.470 --> 00:02:23.750
AUDIENCE: What I'm
saying is let's

00:02:23.750 --> 00:02:26.230
say that the table
is completely full

00:02:26.230 --> 00:02:28.280
versus completely empty table.

00:02:28.280 --> 00:02:32.040
It would take more time
to move everything out

00:02:32.040 --> 00:02:34.510
of the full table
than it does just

00:02:34.510 --> 00:02:36.515
to look the empty table, right?

00:02:36.515 --> 00:02:37.390
PROFESSOR: Let's see.

00:02:37.390 --> 00:02:42.300
So writing something here
is order 1 time, right?

00:02:42.300 --> 00:02:44.350
So moving is order 1 time.

00:02:44.350 --> 00:02:47.130
Moving one element
is order 1 time.

00:02:47.130 --> 00:02:50.242
What's accessing an element
in a table in a list?

00:02:50.242 --> 00:02:51.200
You have a Python list.

00:02:51.200 --> 00:02:53.282
What's the cost of
doing an index access?

00:02:53.282 --> 00:02:54.740
AUDIENCE: It's also
order 1, right?

00:02:54.740 --> 00:02:55.790
PROFESSOR: OK.

00:02:55.790 --> 00:02:59.890
So order 1, index.

00:02:59.890 --> 00:03:02.727
Order 1, move.

00:03:02.727 --> 00:03:04.060
Suppose you have an empty table.

00:03:04.060 --> 00:03:05.680
How many indices do you do?

00:03:05.680 --> 00:03:07.915
How many times the index?

00:03:07.915 --> 00:03:09.540
AUDIENCE: You look
at each one, so it's

00:03:09.540 --> 00:03:12.425
order 1 times the
length of the table.

00:03:12.425 --> 00:03:14.890
PROFESSOR: m.

00:03:14.890 --> 00:03:17.620
So if the table is empty,
you have order m indices

00:03:17.620 --> 00:03:19.740
and 0 moves.

00:03:19.740 --> 00:03:22.970
Total running time, order m.

00:03:22.970 --> 00:03:26.070
If you have a full
table, how many times

00:03:26.070 --> 00:03:27.412
do you index in the table?

00:03:27.412 --> 00:03:28.437
AUDIENCE: Still order m.

00:03:28.437 --> 00:03:29.020
PROFESSOR: OK.

00:03:29.020 --> 00:03:31.330
How many times do
you move stuff?

00:03:31.330 --> 00:03:34.365
AUDIENCE: Order m.

00:03:34.365 --> 00:03:35.615
PROFESSOR: Total running time?

00:03:35.615 --> 00:03:38.835
AUDIENCE: It's order
2m, which is order m.

00:03:38.835 --> 00:03:40.960
PROFESSOR: So it doesn't
matter whether the table's

00:03:40.960 --> 00:03:42.198
full or empty.

00:03:42.198 --> 00:03:42.739
AUDIENCE: OK.

00:03:42.739 --> 00:03:44.160
Just wanted to confirm that.

00:03:44.160 --> 00:03:45.784
PROFESSOR: And this
is how you do that.

00:03:47.940 --> 00:03:48.460
Cool.

00:03:48.460 --> 00:03:49.022
Thanks.

00:03:49.022 --> 00:03:49.855
Any other questions?

00:04:00.410 --> 00:04:04.480
Then we will go over problems in
the order in which I like them,

00:04:04.480 --> 00:04:06.110
which is easiest to
hardest so that I

00:04:06.110 --> 00:04:07.690
don't have to explain
the hard ones.

00:04:14.710 --> 00:04:16.570
Warm up problem one.

00:04:24.800 --> 00:04:27.720
So you have this recursion
and you have to solve it,

00:04:27.720 --> 00:04:40.350
and you get a hint that n
to the power of 1 over log n

00:04:40.350 --> 00:04:45.540
is 2, which is theta 1.

00:04:45.540 --> 00:04:47.130
So based on the
hint, you can see

00:04:47.130 --> 00:04:50.826
that it's going to
involve some math.

00:04:50.826 --> 00:04:53.700
It's going to get a bit ugly.

00:04:53.700 --> 00:04:55.260
So how do we solve recursions?

00:04:55.260 --> 00:04:56.110
Two methods.

00:04:56.110 --> 00:04:57.514
What are they?

00:04:57.514 --> 00:04:59.927
AUDIENCE: Expand.

00:04:59.927 --> 00:05:00.510
PROFESSOR: OK.

00:05:00.510 --> 00:05:02.951
Substitution formally,
but basically, we

00:05:02.951 --> 00:05:04.450
expand this guy
over and over again.

00:05:04.450 --> 00:05:06.084
And?

00:05:06.084 --> 00:05:07.340
AUDIENCE: Trees.

00:05:07.340 --> 00:05:09.970
PROFESSOR: Recursion trees.

00:05:09.970 --> 00:05:11.610
Which one do guys
want to do first?

00:05:15.080 --> 00:05:17.780
If you only have
one t here, anything

00:05:17.780 --> 00:05:22.340
works because you can keep
expanding it and that works,

00:05:22.340 --> 00:05:24.317
so we can do either method.

00:05:24.317 --> 00:05:25.900
Which one do you
guys want to go over?

00:05:28.580 --> 00:05:29.740
Trees.

00:05:29.740 --> 00:05:31.100
OK.

00:05:31.100 --> 00:05:36.020
So we start with the first node.

00:05:36.020 --> 00:05:39.292
The size of the problem is n.

00:05:39.292 --> 00:05:41.732
What's the cost inside here?

00:05:45.148 --> 00:05:46.130
AUDIENCE: 1.

00:05:46.130 --> 00:05:47.550
PROFESSOR: OK.

00:05:47.550 --> 00:05:50.620
So this creates one sub-problem.

00:05:50.620 --> 00:05:51.483
What's the size?

00:05:54.198 --> 00:05:56.170
AUDIENCE: n to the 1/2.

00:05:56.170 --> 00:05:58.130
PROFESSOR: OK.

00:05:58.130 --> 00:06:00.530
Square root of n
equals n to the 1/2.

00:06:00.530 --> 00:06:01.774
You solved it already.

00:06:01.774 --> 00:06:02.440
What's the cost?

00:06:05.279 --> 00:06:05.779
AUDIENCE: 1.

00:06:08.641 --> 00:06:10.660
PROFESSOR: Do people
remember this?

00:06:10.660 --> 00:06:14.980
Is anyone confused about
what's going on here?

00:06:14.980 --> 00:06:15.760
OK.

00:06:15.760 --> 00:06:19.710
So two terms, something
involving t and something

00:06:19.710 --> 00:06:21.250
not involving t.

00:06:21.250 --> 00:06:24.110
The thing involving t is
what we want to get rid of.

00:06:24.110 --> 00:06:29.870
When we do our recursion
tree, whatever is in here

00:06:29.870 --> 00:06:34.030
goes inside here,
and this tells me

00:06:34.030 --> 00:06:37.920
how this number
relates to this number.

00:06:37.920 --> 00:06:41.250
So when I go from one
level to the next level,

00:06:41.250 --> 00:06:42.550
this is the transformation.

00:06:42.550 --> 00:06:48.406
n and becomes square root of
n, so the transformation here

00:06:48.406 --> 00:06:50.030
is the same as the
transformation here.

00:06:57.880 --> 00:07:00.133
What's next?

00:07:00.133 --> 00:07:02.152
AUDIENCE: n to the 1/4.

00:07:02.152 --> 00:07:02.735
PROFESSOR: OK.

00:07:05.500 --> 00:07:07.828
Cost?

00:07:07.828 --> 00:07:09.320
AUDIENCE: 1.

00:07:09.320 --> 00:07:10.000
PROFESSOR: OK.

00:07:10.000 --> 00:07:12.350
Do we need to do one more or
do people see the pattern?

00:07:16.090 --> 00:07:17.340
Silence means one more.

00:07:17.340 --> 00:07:21.240
If you guys don't speak,
we're going to go slow.

00:07:21.240 --> 00:07:21.790
What's here?

00:07:25.213 --> 00:07:29.125
AUDIENCE: n to the 1/8.

00:07:29.125 --> 00:07:31.559
PROFESSOR: What's here?

00:07:31.559 --> 00:07:32.059
AUDIENCE: 1.

00:07:35.200 --> 00:07:37.170
PROFESSOR: Let's hope
everyone saw the pattern,

00:07:37.170 --> 00:07:40.590
and suppose we've done
this for l levels,

00:07:40.590 --> 00:07:43.330
so we're at the bottom.

00:07:43.330 --> 00:07:44.950
What should the cost
be at the bottom?

00:07:47.436 --> 00:07:48.352
AUDIENCE: [INAUDIBLE].

00:07:51.412 --> 00:07:52.120
PROFESSOR: Sorry.

00:07:52.120 --> 00:07:53.230
We don't start with the cost.

00:07:53.230 --> 00:07:55.438
What should the size of the
problem be at the bottom?

00:08:00.978 --> 00:08:03.448
AUDIENCE: n to the
1 over 2 to the i.

00:08:11.015 --> 00:08:12.390
PROFESSOR: Let's
say that this is

00:08:12.390 --> 00:08:15.080
level h, where h is
the height of the tree.

00:08:17.871 --> 00:08:20.204
AUDIENCE: Don't you want to
do something like n to the 1

00:08:20.204 --> 00:08:20.936
over n?

00:08:25.120 --> 00:08:26.420
PROFESSOR: Yeah.

00:08:26.420 --> 00:08:30.876
OK, so we want something
that looks like that?

00:08:30.876 --> 00:08:33.300
AUDIENCE: If the recursion
tree is height i,

00:08:33.300 --> 00:08:35.299
it is n to the 1 over 2
to the i, but 2 to the i

00:08:35.299 --> 00:08:37.267
should equal n, or
approximately n.

00:08:40.835 --> 00:08:41.459
PROFESSOR: Why?

00:08:46.960 --> 00:08:48.520
I like that, but why?

00:08:48.520 --> 00:08:54.550
AUDIENCE: Because
you need to go down

00:08:54.550 --> 00:08:57.690
until you're only
looking at one element,

00:08:57.690 --> 00:09:02.067
and that would be one
nth of the problem.

00:09:02.067 --> 00:09:02.650
PROFESSOR: OK.

00:09:02.650 --> 00:09:05.968
So we want this guy
to look like what?

00:09:15.450 --> 00:09:17.990
In fact, it doesn't exactly
have to look like 1, but what's

00:09:17.990 --> 00:09:22.417
the advantage if we manage to
get this guy to look like 1?

00:09:22.417 --> 00:09:23.250
We have a recursion.

00:09:23.250 --> 00:09:25.940
We don't have a base
case here, right?

00:09:25.940 --> 00:09:34.780
A reasonable base case
is T of 1 is theta 1.

00:09:34.780 --> 00:09:37.800
Whatever function that is,
if you evaluate it at 1,

00:09:37.800 --> 00:09:41.640
you're going to get a
constant, so you can say that.

00:09:41.640 --> 00:09:45.376
Now, at the same time I can
say that for any constant,

00:09:45.376 --> 00:09:50.660
c, T of c is theta 1.

00:09:53.180 --> 00:09:59.410
So if I take this constant
here, which happens to be 2,

00:09:59.410 --> 00:10:01.100
but that's not to
worry about that.

00:10:01.100 --> 00:10:04.320
If I take this guy here,
I can put it in here.

00:10:08.350 --> 00:10:12.560
And I know that this guy
here equals this guy here.

00:10:12.560 --> 00:10:17.210
So if I can make this guy
here look like this guy here,

00:10:17.210 --> 00:10:20.500
then I'm done.

00:10:20.500 --> 00:10:21.015
Make sense?

00:10:23.780 --> 00:10:25.740
If it makes sense,
everyone should nod so

00:10:25.740 --> 00:10:30.720
that I know and I can go
forward, or smile or something.

00:10:30.720 --> 00:10:32.250
So this should look like 1.

00:10:32.250 --> 00:10:35.240
This is order 1 if not 1.

00:10:35.240 --> 00:10:39.250
Let's make it order 1,
because it's 2 in this case.

00:10:39.250 --> 00:10:42.495
What's the cost here?

00:10:42.495 --> 00:10:43.410
AUDIENCE: 1.

00:10:43.410 --> 00:10:44.230
PROFESSOR: 1.

00:10:44.230 --> 00:10:47.130
Everything inside the
bubbles is order of already,

00:10:47.130 --> 00:10:48.913
so I don't need to
write an order of.

00:10:52.380 --> 00:10:55.120
What do we do next?

00:10:55.120 --> 00:10:58.030
AUDIENCE: Solve the
[INAUDIBLE] equation.

00:10:58.030 --> 00:10:59.485
2 to the [INAUDIBLE].

00:11:04.330 --> 00:11:06.825
PROFESSOR: You're
skipping one step.

00:11:06.825 --> 00:11:09.200
That's exactly what you do
when you have the substitution

00:11:09.200 --> 00:11:09.580
method.

00:11:09.580 --> 00:11:10.710
You're going to get
to something and you

00:11:10.710 --> 00:11:11.834
need to solve the equation.

00:11:11.834 --> 00:11:14.160
But for the tree,
there are two steps.

00:11:14.160 --> 00:11:15.990
So we need to add up
all the costs here

00:11:15.990 --> 00:11:18.220
and that's the total cost here.

00:11:18.220 --> 00:11:24.681
In order to do that, first,
we sum up over each level.

00:11:24.681 --> 00:11:26.180
And in this case,
it's really simple

00:11:26.180 --> 00:11:27.846
because there's only
one node per level,

00:11:27.846 --> 00:11:29.690
but if you have multiple
nodes per level,

00:11:29.690 --> 00:11:31.600
you have to sum
up for each level,

00:11:31.600 --> 00:11:33.570
and then you have
to do a big sum.

00:11:33.570 --> 00:11:37.441
What's the sum for this level?

00:11:37.441 --> 00:11:37.940
1.

00:11:40.920 --> 00:11:41.610
Come on, guys.

00:11:41.610 --> 00:11:43.834
You're scaring me.

00:11:43.834 --> 00:11:45.325
AUDIENCE: 1.

00:11:45.325 --> 00:11:46.319
AUDIENCE: 1.

00:11:46.319 --> 00:11:47.320
AUDIENCE: It's all 1.

00:11:47.320 --> 00:11:48.564
PROFESSOR: Excellent.

00:11:48.564 --> 00:11:49.980
So the only thing
that I'm missing

00:11:49.980 --> 00:11:53.070
is to know how many levels I
have, because the sum is going

00:11:53.070 --> 00:11:58.050
to be order h, whatever h is.

00:11:58.050 --> 00:11:59.480
How do we do that?

00:11:59.480 --> 00:12:02.990
n to the 1 over 2
to the power of h

00:12:02.990 --> 00:12:07.030
has to equal this guy, right?

00:12:07.030 --> 00:12:08.852
AUDIENCE: Why would
it equal that guy?

00:12:08.852 --> 00:12:11.122
We know it's less
than that guy, but we

00:12:11.122 --> 00:12:13.150
don't know it's
equal to that guy.

00:12:13.150 --> 00:12:15.600
PROFESSOR: We have to make
it equal because we can only

00:12:15.600 --> 00:12:18.300
stop when we get
to the base case.

00:12:18.300 --> 00:12:20.210
So we have to expand
the recursion tree

00:12:20.210 --> 00:12:22.570
until we get to a base
case, and then we stop,

00:12:22.570 --> 00:12:25.270
and this is our base
case because this

00:12:25.270 --> 00:12:27.670
is what the problem says
should be our base case.

00:12:27.670 --> 00:12:29.774
AUDIENCE: Right, but n
to the 1 over 2 to the h

00:12:29.774 --> 00:12:39.172
is not equal to n
to the 1 over log n.

00:12:39.172 --> 00:12:41.380
PROFESSOR: Well, we can set
h to be whatever we want.

00:12:41.380 --> 00:12:44.540
h is the height of the tree,
so we don't know what it is.

00:12:44.540 --> 00:12:46.166
We have to find out what it is.

00:12:46.166 --> 00:12:47.540
AUDIENCE: So let's
say 2 to the h

00:12:47.540 --> 00:12:52.992
is equal to log n if you want
to make it look like that.

00:12:52.992 --> 00:12:54.700
PROFESSOR: Let me
write down the equation

00:12:54.700 --> 00:12:57.030
to make sure you're right.

00:12:57.030 --> 00:13:00.010
You're probably right because
you're thinking faster than me,

00:13:00.010 --> 00:13:04.136
but let me not embarrass myself
and do this the right way.

00:13:09.970 --> 00:13:14.890
So you said 2 to the
h is log n, right?

00:13:14.890 --> 00:13:15.640
Looks about right.

00:13:15.640 --> 00:13:19.215
So what's h?

00:13:19.215 --> 00:13:21.500
AUDIENCE: Log base 2.

00:13:21.500 --> 00:13:23.770
PROFESSOR: All right.

00:13:23.770 --> 00:13:25.410
Log log n.

00:13:28.110 --> 00:13:30.600
So T of n is order h.

00:13:30.600 --> 00:13:33.950
We got this from here.

00:13:33.950 --> 00:13:43.510
T of n is order h is
order of log log n.

00:13:43.510 --> 00:13:44.820
Math people drowning, right?

00:13:48.090 --> 00:13:49.205
Any questions about this?

00:13:52.260 --> 00:13:53.493
Yes?

00:13:53.493 --> 00:13:55.710
AUDIENCE: The first
line on the right--

00:13:55.710 --> 00:13:56.565
PROFESSOR: This?

00:13:56.565 --> 00:13:57.190
AUDIENCE: Yeah.

00:13:57.190 --> 00:13:58.148
Is that your base case?

00:13:58.148 --> 00:13:59.214
What is that?

00:13:59.214 --> 00:14:01.380
PROFESSOR: We got a hint
with the problem that said,

00:14:01.380 --> 00:14:06.660
n to the power of 1 over log
n is 2, which is order 1.

00:14:09.640 --> 00:14:12.507
So for the base case, we always
want them to look like this.

00:14:12.507 --> 00:14:14.840
If we don't get a base case,
we write our own base case,

00:14:14.840 --> 00:14:16.700
which is if you
plug in a constant,

00:14:16.700 --> 00:14:18.950
you're going to get a constant.

00:14:18.950 --> 00:14:21.580
And since we're told that
this guy is a constant,

00:14:21.580 --> 00:14:24.405
that's a pretty good hint
that we want to get to it.

00:14:29.994 --> 00:14:31.410
Let's see how we're
doing on time.

00:14:31.410 --> 00:14:34.080
Good.

00:14:34.080 --> 00:14:37.740
Ready to move on to
the next problem?

00:14:37.740 --> 00:14:38.832
Let's do a fun one.

00:14:38.832 --> 00:14:41.040
Some people might remember
it from elementary school,

00:14:41.040 --> 00:14:44.450
but this time, we're going to
look at it with our 6.006 eyes.

00:14:44.450 --> 00:14:50.850
So suppose you have
m coins, gold coins.

00:14:50.850 --> 00:14:51.840
One of them is face.

00:14:51.840 --> 00:14:54.340
The fake one is super light
because it's not real gold.

00:14:54.340 --> 00:14:57.380
It's something that
looks like gold.

00:14:57.380 --> 00:15:03.210
And we have a scale, and
the scale is super accurate.

00:15:03.210 --> 00:15:05.950
It can weigh any
coins on either side

00:15:05.950 --> 00:15:08.640
and tell us which
side is heavier.

00:15:08.640 --> 00:15:10.830
Perfect accuracy, no need
to worry about errors.

00:15:13.690 --> 00:15:16.550
I want to find out which
coin is the bad coin.

00:15:16.550 --> 00:15:19.410
What is the minimum number
of experiments I have to do?

00:15:22.440 --> 00:15:24.880
So there is a strategy, and
we can worry about that later,

00:15:24.880 --> 00:15:28.790
but using 6.006, what is the
minimum number of experiments

00:15:28.790 --> 00:15:30.284
I have to do?

00:15:30.284 --> 00:15:31.280
AUDIENCE: Log N times.

00:15:35.770 --> 00:15:36.710
PROFESSOR: Not quite.

00:15:36.710 --> 00:15:38.280
So this is what you
think is, and you

00:15:38.280 --> 00:15:41.620
can do log n with
binary search, right?

00:15:41.620 --> 00:15:45.270
The problem with binary search
is if I put half of my coins

00:15:45.270 --> 00:15:47.550
on the left, half of
my coins on the right,

00:15:47.550 --> 00:15:49.960
one side is going to
be heavier, right?

00:15:49.960 --> 00:15:53.430
So the answers are going
to be this or this,

00:15:53.430 --> 00:15:55.610
but I never get this.

00:15:55.610 --> 00:15:58.750
I only get one
bit of information

00:15:58.750 --> 00:16:00.760
instead of getting one trit.

00:16:00.760 --> 00:16:03.250
A trit is a base three digit.

00:16:03.250 --> 00:16:07.952
How many bits of
information in a trit?

00:16:07.952 --> 00:16:10.362
AUDIENCE: One and a half.

00:16:10.362 --> 00:16:12.290
PROFESSOR: Roughly.

00:16:12.290 --> 00:16:13.712
AUDIENCE: Log 3.

00:16:13.712 --> 00:16:14.420
PROFESSOR: Log 3.

00:16:14.420 --> 00:16:16.211
And we know that it's
base 2 because that's

00:16:16.211 --> 00:16:17.720
what we use in CS.

00:16:17.720 --> 00:16:21.790
So we're discarding a
fractional bit of information

00:16:21.790 --> 00:16:25.046
if we're not allowing
for this to happen.

00:16:31.550 --> 00:16:33.254
Anyone want to try
something else?

00:16:33.254 --> 00:16:34.670
We have to prove
this, by the way.

00:16:34.670 --> 00:16:36.753
We have to prove the minimum
that we come up with.

00:16:40.176 --> 00:16:42.064
AUDIENCE: You could
just do it coin by coin,

00:16:42.064 --> 00:16:43.490
but that would take forever.

00:16:43.490 --> 00:16:45.010
PROFESSOR: That's
N. That's worse.

00:16:45.010 --> 00:16:50.620
AUDIENCE: How about log base
4 N or something like that?

00:16:50.620 --> 00:16:53.020
AUDIENCE: Can you explain
to me why we can't just

00:16:53.020 --> 00:16:54.190
do binary search?

00:16:54.190 --> 00:16:55.140
PROFESSOR: We can.

00:16:55.140 --> 00:16:58.030
It's definitely going to
give us the correct answer,

00:16:58.030 --> 00:17:00.190
but it's not the minimum
number of weighings

00:17:00.190 --> 00:17:03.570
because we're discarding
a possible answer.

00:17:03.570 --> 00:17:05.319
So if you do binary
search, you will never

00:17:05.319 --> 00:17:07.470
get that the two
sides are equal.

00:17:07.470 --> 00:17:08.700
AUDIENCE: Log base 3.

00:17:08.700 --> 00:17:10.560
PROFESSOR: Log base
3 would be better

00:17:10.560 --> 00:17:12.829
because we have three
choices all the time.

00:17:12.829 --> 00:17:13.980
Let's prove that.

00:17:13.980 --> 00:17:17.190
So the right answer happens to
be log base 3 of N. Let's see

00:17:17.190 --> 00:17:19.880
how we would get it
aside from guessing.

00:17:19.880 --> 00:17:22.850
AUDIENCE: So you
divide it into thirds

00:17:22.850 --> 00:17:26.810
and compare one third and one
third, and if they're equal,

00:17:26.810 --> 00:17:29.285
then the light one is
in the other third.

00:17:29.285 --> 00:17:31.265
And if they're not,
[INAUDIBLE] light one.

00:17:31.265 --> 00:17:34.937
Then you just keep
dividing by 3.

00:17:34.937 --> 00:17:35.520
PROFESSOR: OK.

00:17:35.520 --> 00:17:36.760
So that's the strategy.

00:17:36.760 --> 00:17:38.250
What if I don't
know the strategy?

00:17:38.250 --> 00:17:40.166
How do I do this without
knowing the strategy?

00:17:42.668 --> 00:17:46.014
AUDIENCE: What if the number
of coins isn't divisible by 3?

00:17:49.360 --> 00:17:52.186
PROFESSOR: Math people.

00:17:52.186 --> 00:17:56.520
AUDIENCE: Yeah, but then
how do you-- OK, never mind.

00:17:56.520 --> 00:17:59.134
AUDIENCE: Just take the two
extra coins and toss them out.

00:18:02.220 --> 00:18:03.810
PROFESSOR: If it's
not divisible by 3,

00:18:03.810 --> 00:18:07.540
you add fake coins
that are good.

00:18:07.540 --> 00:18:11.050
I mean, you use good coins.

00:18:11.050 --> 00:18:13.500
But we're not worried
about the strategy.

00:18:13.500 --> 00:18:15.240
I want us to think
of a lower bound.

00:18:15.240 --> 00:18:17.470
This is a lower bound
for an algorithm, right?

00:18:17.470 --> 00:18:23.280
You cannot do better
than log 3 N experiments.

00:18:23.280 --> 00:18:27.247
Does the word "lower
bound" ring any bells?

00:18:27.247 --> 00:18:29.580
Is there any lecture where
we talked about lower bounds?

00:18:37.070 --> 00:18:39.830
So if you sort and you're
using a comparison model,

00:18:39.830 --> 00:18:40.970
what's the best you can do?

00:18:44.260 --> 00:18:46.140
AUDIENCE: N log N.

00:18:46.140 --> 00:18:48.700
PROFESSOR: N log N. Good.

00:18:48.700 --> 00:18:58.010
So sorting using CMP, the
comparison model, is N log N.

00:18:58.010 --> 00:18:59.110
How did we prove that?

00:19:06.360 --> 00:19:07.300
One word.

00:19:07.300 --> 00:19:08.080
Well, two words.

00:19:12.160 --> 00:19:13.280
Decision trees.

00:19:13.280 --> 00:19:16.140
Does anyone remember
what decision trees are?

00:19:16.140 --> 00:19:17.086
One person.

00:19:17.086 --> 00:19:19.107
AUDIENCE: It's just a
comparison thing, right?

00:19:19.107 --> 00:19:20.940
You're like, is it
greater, is it less than,

00:19:20.940 --> 00:19:22.315
or is there some
sort of question

00:19:22.315 --> 00:19:24.384
you're asking about each key.

00:19:24.384 --> 00:19:25.050
PROFESSOR: Cool.

00:19:25.050 --> 00:19:27.590
Let's go over that a little bit.

00:19:27.590 --> 00:19:31.790
No matter what
your algorithm is,

00:19:31.790 --> 00:19:34.000
it's going to weigh
some coins and it's

00:19:34.000 --> 00:19:36.310
going to get an
answer from the scale.

00:19:36.310 --> 00:19:39.350
And then based on that, it's
going to weigh some other coins

00:19:39.350 --> 00:19:42.120
and get some answer
from the scale.

00:19:42.120 --> 00:19:44.510
And it will do some
experiments and then

00:19:44.510 --> 00:19:46.540
it will give you an answer.

00:19:46.540 --> 00:19:50.300
So if you draw a decision
tree, it would look like this.

00:19:50.300 --> 00:19:52.670
First, we start
with 0 information.

00:19:52.670 --> 00:19:54.590
We weigh some coins.

00:19:54.590 --> 00:19:59.420
Based on that, we have
three possible answers--

00:19:59.420 --> 00:20:03.910
smaller, equal, greater.

00:20:03.910 --> 00:20:07.260
Now, if we're here, we're
going to do another experiment.

00:20:07.260 --> 00:20:10.910
Three possible answers.

00:20:10.910 --> 00:20:13.350
If we're here, another
experiment, three possible

00:20:13.350 --> 00:20:13.850
answers.

00:20:13.850 --> 00:20:16.770
If we're here, another
experiment, three possible

00:20:16.770 --> 00:20:19.420
answers.

00:20:19.420 --> 00:20:21.830
Say we do a third experiment.

00:20:21.830 --> 00:20:24.641
One, two, three, one, two,
three, one, two, three, one,

00:20:24.641 --> 00:20:27.863
two, three, one, two, three,
one, two, three, one, two,

00:20:27.863 --> 00:20:30.760
three, one, two,
three, one, two, three.

00:20:30.760 --> 00:20:33.160
And then suppose we stop.

00:20:33.160 --> 00:20:35.620
If we stop, we have
to give an answer.

00:20:35.620 --> 00:20:37.530
So this is an answer,
this is an answer,

00:20:37.530 --> 00:20:44.880
this is an answer, answer,
answer, answer, answer, answer.

00:20:44.880 --> 00:20:47.100
So how many answers do
I have at the bottom

00:20:47.100 --> 00:20:50.684
if I have three levels?

00:20:50.684 --> 00:20:52.600
Here I have three
experiments, so three levels

00:20:52.600 --> 00:20:53.474
in the decision tree.

00:20:53.474 --> 00:20:56.160
How many answers?

00:20:56.160 --> 00:20:58.080
AUDIENCE: [INAUDIBLE].

00:20:58.080 --> 00:21:00.260
PROFESSOR: 3 to the third
because I start three

00:21:00.260 --> 00:21:03.580
at the first level, nine at
the second, 27 at the third.

00:21:03.580 --> 00:21:05.050
Each time, I multiply by 3.

00:21:08.490 --> 00:21:15.030
So if I do three weighings, I
can give at most 27 answers.

00:21:15.030 --> 00:21:18.790
If I have more than 27
coins, I can't possibly

00:21:18.790 --> 00:21:24.410
decide which one is bad
because say if I have 30 coins,

00:21:24.410 --> 00:21:27.400
then I need to be able
to give out 30 answers.

00:21:27.400 --> 00:21:29.810
My algorithm has to
have a place where

00:21:29.810 --> 00:21:33.020
it says the bad coin is coin
one, coin two, coin three, all

00:21:33.020 --> 00:21:34.460
the way to coin 30.

00:21:34.460 --> 00:21:36.960
Here I only have 27
possible answers,

00:21:36.960 --> 00:21:39.297
so this isn't going to
cut it for 30 coins.

00:21:39.297 --> 00:21:41.880
I need to do one more comparison
so that I have a deeper tree.

00:21:44.560 --> 00:21:49.230
So suppose I have h
comparisons instead.

00:21:49.230 --> 00:21:51.500
How many leaves?

00:21:51.500 --> 00:21:53.464
How many possible answers?

00:21:53.464 --> 00:21:55.900
AUDIENCE: h to the third.

00:21:55.900 --> 00:21:56.663
PROFESSOR: Almost.

00:21:56.663 --> 00:21:57.538
AUDIENCE: 3 to the h.

00:21:57.538 --> 00:22:00.380
PROFESSOR: 3 to the h.

00:22:00.380 --> 00:22:07.370
So 3 multiplied by 3 multiplied
by 3 multiplied by 3 h times,

00:22:07.370 --> 00:22:09.210
so 3 to the h.

00:22:09.210 --> 00:22:12.210
It's no longer equal to 27.

00:22:12.210 --> 00:22:14.425
3 to the h is the number
of possible answers.

00:22:14.425 --> 00:22:17.700
This has to be bigger or
equal to N. Otherwise,

00:22:17.700 --> 00:22:18.885
the algorithm is incorrect.

00:22:21.830 --> 00:22:23.210
So what can we say about h?

00:22:28.261 --> 00:22:29.177
AUDIENCE: [INAUDIBLE].

00:22:43.370 --> 00:22:45.740
PROFESSOR: We did all
this without even thinking

00:22:45.740 --> 00:22:47.940
of what an algorithm
would look like.

00:22:47.940 --> 00:22:49.330
This works for any algorithm.

00:22:49.330 --> 00:22:52.820
No matter how smart you are, no
matter how much math you know,

00:22:52.820 --> 00:22:55.940
your algorithm is going
to be bound by this.

00:22:55.940 --> 00:22:59.071
So the fact that the
answer looks like this

00:22:59.071 --> 00:23:01.320
gives you some intuition for
how to solve the problem.

00:23:01.320 --> 00:23:03.445
If you want to solve the
problem now and figure out

00:23:03.445 --> 00:23:06.600
the strategy, you know
that you have a 3 here.

00:23:06.600 --> 00:23:08.430
So if you divide
into 2 every time,

00:23:08.430 --> 00:23:11.820
you're not going to
get to the right limit.

00:23:11.820 --> 00:23:14.660
So first you do this,
you get a lower bound,

00:23:14.660 --> 00:23:18.190
and then you use your
intuition to figure out

00:23:18.190 --> 00:23:20.110
what the lower bound means.

00:23:20.110 --> 00:23:24.170
In this case, it would mean the
strategy that we heard earlier.

00:23:24.170 --> 00:23:25.750
You have to divide
into 3 every time

00:23:25.750 --> 00:23:28.810
and then figure out what you
do based on the comparison.

00:23:28.810 --> 00:23:32.790
So your answer works
perfectly once we have this.

00:23:32.790 --> 00:23:34.389
And also, once we
have this, you know

00:23:34.389 --> 00:23:36.430
that your answer is correct
because it's optimal.

00:23:36.430 --> 00:23:37.680
You can't do better than that.

00:23:40.007 --> 00:23:41.340
Any questions on decision trees?

00:23:44.590 --> 00:23:47.420
So lower bounds are a
boring topic in general.

00:23:47.420 --> 00:23:48.770
They tell you what you can't do.

00:23:48.770 --> 00:23:51.600
They don't tell you anything
useful about what you can do.

00:23:51.600 --> 00:23:53.900
In some cases, being able to
reason about a lower bound

00:23:53.900 --> 00:23:55.274
gives you a hint
of the solution.

00:24:01.730 --> 00:24:02.230
New problem.

00:24:23.340 --> 00:24:26.380
Suppose we have a 2D map.

00:24:26.380 --> 00:24:29.080
There's a hill, and you take
a satellite picture of it

00:24:29.080 --> 00:24:33.930
at night, and you get a picture
with bright pixels and not

00:24:33.930 --> 00:24:35.070
bright pixels.

00:24:35.070 --> 00:24:39.540
There are numbers showing
how bright your pixels are.

00:24:39.540 --> 00:24:46.530
1, 2, 1, 2, 3, 0, 0.

00:24:46.530 --> 00:24:51.645
I'm going to draw out an example
so we can use our intuition.

00:24:51.645 --> 00:24:55.338
0, 0, 1.

00:25:13.090 --> 00:25:14.680
So suppose this is our map.

00:25:14.680 --> 00:25:21.660
It's W times H-- W
of what these are,

00:25:21.660 --> 00:25:25.860
I think they're columns,
and H of the other ones.

00:25:25.860 --> 00:25:28.080
And you want to find a
certain picture inside it.

00:25:28.080 --> 00:25:31.210
You want to see how many times
does a certain pattern show up.

00:25:31.210 --> 00:25:35.470
Say the pattern is
small w times small h,

00:25:35.470 --> 00:25:37.740
and it looks like this.

00:25:40.980 --> 00:25:43.374
But this will be the
input to your problem,

00:25:43.374 --> 00:25:44.790
so the pattern
might be different.

00:25:44.790 --> 00:25:46.660
You can't hard code this in.

00:25:46.660 --> 00:25:48.290
And this is useful.

00:25:48.290 --> 00:25:50.250
This problem is called
a bunker hill problem.

00:25:50.250 --> 00:25:53.429
This is a hill, and
this is a bunker.

00:25:53.429 --> 00:25:54.720
You take a picture of the hill.

00:25:54.720 --> 00:25:55.960
You want to know
where the bunkers are

00:25:55.960 --> 00:25:58.670
so you can bomb them at night so
then you can attack the place.

00:25:58.670 --> 00:25:59.660
AUDIENCE: That's awful.

00:25:59.660 --> 00:26:00.645
PROFESSOR: Thank you.

00:26:00.645 --> 00:26:05.670
I'll take that as a compliment
So a nice way of solving this?

00:26:09.963 --> 00:26:13.090
AUDIENCE: You could just
go through each row,

00:26:13.090 --> 00:26:15.950
and then look for a match
for the first row, and then--

00:26:15.950 --> 00:26:17.212
PROFESSOR: Yep.

00:26:17.212 --> 00:26:18.559
Is this a match?

00:26:18.559 --> 00:26:19.642
That's what you're saying.

00:26:19.642 --> 00:26:20.128
AUDIENCE: Yeah.

00:26:20.128 --> 00:26:21.025
We can see that's a match.

00:26:21.025 --> 00:26:21.320
PROFESSOR: Is this a match?

00:26:21.320 --> 00:26:22.360
Is this a match?

00:26:22.360 --> 00:26:24.920
By the way, this is a match.

00:26:24.920 --> 00:26:29.960
This is not a match, this is not
a match, this is not a match,

00:26:29.960 --> 00:26:30.970
this is not a match.

00:26:30.970 --> 00:26:32.200
Now we go down here.

00:26:32.200 --> 00:26:34.560
This is not a match,
this is not a match,

00:26:34.560 --> 00:26:37.242
this is not a match,
so on and so forth.

00:26:37.242 --> 00:26:39.075
AUDIENCE: That wasn't
what I was suggesting,

00:26:39.075 --> 00:26:41.422
but that's a good idea.

00:26:41.422 --> 00:26:43.630
PROFESSOR: Maybe you're
suggesting something smarter,

00:26:43.630 --> 00:26:45.296
and I don't want to
let you do something

00:26:45.296 --> 00:26:47.932
smarter so that we look at the
brute force approach first.

00:26:47.932 --> 00:26:49.390
AUDIENCE: I mean,
I was just saying

00:26:49.390 --> 00:26:51.200
take the first row
of your bunker,

00:26:51.200 --> 00:26:53.150
and then compare
it to other rows,

00:26:53.150 --> 00:26:56.787
and once you hit that, then
check and see if the rest of--

00:26:56.787 --> 00:26:58.370
PROFESSOR: Yeah,
that's a bit smarter,

00:26:58.370 --> 00:27:00.300
so that's harder
to reason about.

00:27:00.300 --> 00:27:04.623
Let's take this one and figure
out the running time of it.

00:27:04.623 --> 00:27:07.654
AUDIENCE: Does that mean even
if you know it's not a match,

00:27:07.654 --> 00:27:10.114
you keep checking
all nine of them?

00:27:10.114 --> 00:27:10.780
PROFESSOR: Yeah.

00:27:10.780 --> 00:27:12.500
Say at the worst case,
you only find out

00:27:12.500 --> 00:27:14.232
it's not a match all
the way at the end.

00:27:16.895 --> 00:27:19.436
AUDIENCE: Are you trying to look
for all matches or just one?

00:27:19.436 --> 00:27:22.560
PROFESSOR: All matches.

00:27:22.560 --> 00:27:25.155
AUDIENCE: You're limited to n
squared time almost no matter

00:27:25.155 --> 00:27:26.755
what, right?

00:27:26.755 --> 00:27:29.270
If you have a small
bunker in a large field,

00:27:29.270 --> 00:27:32.210
you have to hit the
small bunker every time.

00:27:32.210 --> 00:27:34.720
PROFESSOR: Are you going to
solve the problem for me?

00:27:34.720 --> 00:27:37.337
AUDIENCE: Are we trying
to find the [INAUDIBLE]?

00:27:37.337 --> 00:27:37.920
PROFESSOR: No.

00:27:37.920 --> 00:27:39.697
We're trying to
find out the running

00:27:39.697 --> 00:27:42.420
time for the dumb
algorithm first.

00:27:42.420 --> 00:27:44.150
Humor me and let's
solve this first,

00:27:44.150 --> 00:27:46.870
and then let's get to the
efficient algorithm, OK?

00:27:46.870 --> 00:27:50.580
AUDIENCE: Big W minus
small w plus 1 times big H

00:27:50.580 --> 00:27:52.080
minus small h plus 1.

00:27:55.080 --> 00:27:57.080
AUDIENCE: Where'd
you get plus 1 from?

00:27:59.582 --> 00:28:01.064
AUDIENCE: So it's WH?

00:28:01.064 --> 00:28:02.101
AUDIENCE: Yeah.

00:28:02.101 --> 00:28:04.100
PROFESSOR: Well, there's
something missing here.

00:28:04.100 --> 00:28:08.250
This is how many positions I
have that I have to look at.

00:28:08.250 --> 00:28:11.014
How much time does it take
to compare the small images?

00:28:11.014 --> 00:28:11.930
AUDIENCE: [INAUDIBLE].

00:28:15.860 --> 00:28:18.800
PROFESSOR: This is smaller than
wh, which is the input size,

00:28:18.800 --> 00:28:21.120
so it's scary if you have an
algorithm that runs faster

00:28:21.120 --> 00:28:24.010
than the input size
because it means

00:28:24.010 --> 00:28:25.926
you're not looking
at all the input.

00:28:34.641 --> 00:28:36.640
So this is definitely
bigger than the input size

00:28:36.640 --> 00:28:38.530
once we add the w times h here.

00:28:38.530 --> 00:28:39.555
Don't forget this guy.

00:28:42.590 --> 00:28:48.700
This is the naive algorithm, and
if we discard the small order

00:28:48.700 --> 00:28:51.012
factors, we get that
this is order of WHwh.

00:28:53.950 --> 00:28:56.420
How can you do better?

00:28:56.420 --> 00:28:59.240
You have the answer, right?

00:28:59.240 --> 00:29:01.850
Let's let everyone else
think for a minute,

00:29:01.850 --> 00:29:04.300
and then you can give me
the answer if you want,

00:29:04.300 --> 00:29:06.620
or someone else can
give me the answer.

00:29:08.955 --> 00:29:11.079
I guess you should because
you thought of it first.

00:29:17.070 --> 00:29:17.780
Any ideas?

00:29:23.620 --> 00:29:28.010
So you're thinking about
the input size, right?

00:29:28.010 --> 00:29:32.160
Someone was thinking
about the input size.

00:29:32.160 --> 00:29:34.680
So the input size
is W times H, right?

00:29:34.680 --> 00:29:36.519
So if I have an algorithm
that's W times H,

00:29:36.519 --> 00:29:38.810
that's optimal because it
has to look at all the input.

00:29:42.016 --> 00:29:43.890
Well, we're going to
have an algorithm that's

00:29:43.890 --> 00:29:47.990
W times H, so with
that out of the way,

00:29:47.990 --> 00:29:51.000
does that inspire anyone
as to what the solution is?

00:29:58.485 --> 00:30:00.510
AUDIENCE: Do that thing
that I was saying,

00:30:00.510 --> 00:30:03.704
just take the first row, but
then you still have a W term.

00:30:03.704 --> 00:30:04.370
PROFESSOR: Yeah.

00:30:04.370 --> 00:30:07.470
So let's make it better.

00:30:07.470 --> 00:30:09.250
It is the correct intuition.

00:30:09.250 --> 00:30:11.920
Now try to use a trick we
learned in lecture to make that

00:30:11.920 --> 00:30:12.420
faster.

00:30:17.233 --> 00:30:18.816
AUDIENCE: Just use
the top left corner

00:30:18.816 --> 00:30:21.527
instead of the whole row.

00:30:21.527 --> 00:30:22.110
PROFESSOR: OK.

00:30:22.110 --> 00:30:24.110
So we could use the
top left corner,

00:30:24.110 --> 00:30:26.590
and if the top left
corner doesn't match,

00:30:26.590 --> 00:30:30.630
then we don't have
to check for matches.

00:30:30.630 --> 00:30:33.300
So this works for
reasonably random data.

00:30:33.300 --> 00:30:35.630
As long as we don't have
a lot of false positives,

00:30:35.630 --> 00:30:38.040
we're going to run fast.

00:30:38.040 --> 00:30:43.450
Now, the top corner of
this one, if the map

00:30:43.450 --> 00:30:47.820
has a lot of 1's and then some
2's sprinkled all over it,

00:30:47.820 --> 00:30:51.370
most of the time, we'll have
to go through the whole image

00:30:51.370 --> 00:30:54.190
so we're going to have a
lot of false positives.

00:30:54.190 --> 00:30:59.180
How do we make our false
positive rate go down?

00:30:59.180 --> 00:31:01.680
AUDIENCE: Looks kind of
like a rolling hash problem.

00:31:01.680 --> 00:31:04.580
PROFESSOR: Looks like a
rolling hash problem, exactly.

00:31:04.580 --> 00:31:07.147
Let's see if we can
use rolling hashes.

00:31:07.147 --> 00:31:09.230
AUDIENCE: But then you
still have that lowercase w

00:31:09.230 --> 00:31:11.022
term, though.

00:31:11.022 --> 00:31:12.480
PROFESSOR: How do
we get rid of it?

00:31:15.204 --> 00:31:16.182
AUDIENCE: [INAUDIBLE].

00:31:16.182 --> 00:31:16.890
PROFESSOR: Sorry?

00:31:21.633 --> 00:31:22.758
AUDIENCE: Wouldn't it be w?

00:31:22.758 --> 00:31:25.936
I mean the running time if we
were just going through one row

00:31:25.936 --> 00:31:29.115
would be big W minus
little w, times--

00:31:31.534 --> 00:31:33.200
PROFESSOR: So where's
your rolling hash?

00:31:37.768 --> 00:31:40.143
AUDIENCE: I guess you can use
the entire thing as a hash,

00:31:40.143 --> 00:31:40.643
too.

00:31:40.643 --> 00:31:42.970
That would kind of work.

00:31:42.970 --> 00:31:45.330
PROFESSOR: So we want a
hash for the whole thing.

00:31:45.330 --> 00:31:50.928
Instead of using this as the
hash, we want a smarter hash.

00:31:50.928 --> 00:31:53.880
AUDIENCE: It's the
entire thing, and then

00:31:53.880 --> 00:31:58.308
as you move to the right, you
can add those and subtract,

00:31:58.308 --> 00:32:01.487
and compare that with the hash.

00:32:01.487 --> 00:32:02.070
PROFESSOR: OK.

00:32:02.070 --> 00:32:05.510
So we'd have a rolling hash
that has everything in here,

00:32:05.510 --> 00:32:08.640
and then as I move to the
right, I add these guys

00:32:08.640 --> 00:32:09.710
and I remove these guys.

00:32:12.730 --> 00:32:19.400
This is big W times big
H, roughly, times small h

00:32:19.400 --> 00:32:21.470
because every time
I move to the right,

00:32:21.470 --> 00:32:25.090
I have to do order h work.

00:32:25.090 --> 00:32:32.260
So I'm down from this
thing to order of WHh,

00:32:32.260 --> 00:32:33.590
So it's better.

00:32:33.590 --> 00:32:34.590
It's one step forward.

00:32:34.590 --> 00:32:35.990
Now, let's make
this even faster.

00:32:47.420 --> 00:32:53.810
What if I could do this in
order 1 instead of order h?

00:32:57.790 --> 00:33:00.114
How would I do this in order 1?

00:33:00.114 --> 00:33:03.450
AUDIENCE: You'd have to
compress all the rows,

00:33:03.450 --> 00:33:05.690
and then take the
hash of each column.

00:33:05.690 --> 00:33:08.662
PROFESSOR: How would
we compress them?

00:33:08.662 --> 00:33:12.050
AUDIENCE: Take the
hash of the column.

00:33:12.050 --> 00:33:13.830
PROFESSOR: You want
to compress the rows?

00:33:13.830 --> 00:33:14.413
AUDIENCE: Yes.

00:33:14.413 --> 00:33:16.705
You divide it--

00:33:16.705 --> 00:33:18.330
PROFESSOR: Let's not
compress the rows.

00:33:23.970 --> 00:33:26.050
AUDIENCE: You could
take just your bunker,

00:33:26.050 --> 00:33:29.980
and then figure out the
hashes of the three columns,

00:33:29.980 --> 00:33:33.290
and just run through like that.

00:33:33.290 --> 00:33:36.146
You'd still have to access
each of those items.

00:33:36.146 --> 00:33:39.800
I don't really see
how it's faster.

00:33:39.800 --> 00:33:41.380
I guess it's less, though.

00:33:41.380 --> 00:33:42.340
It's less.

00:33:42.340 --> 00:33:43.300
Maybe it's only 1.

00:33:47.630 --> 00:33:53.970
AUDIENCE: So do you want to hash
each little column [INAUDIBLE]?

00:34:01.200 --> 00:34:03.950
PROFESSOR: So we're going
to hash all these guys,

00:34:03.950 --> 00:34:06.340
and then we're going to
have hashes for them,

00:34:06.340 --> 00:34:10.590
and we're going to do
the regular Rabin-Karp

00:34:10.590 --> 00:34:13.420
for the hashes.

00:34:13.420 --> 00:34:16.478
Now, what happens
when I go down?

00:34:16.478 --> 00:34:18.840
PROFESSOR: You have to
recompute everything.

00:34:18.840 --> 00:34:20.024
PROFESSOR: Let's do better
than recompute everything.

00:34:20.024 --> 00:34:21.241
AUDIENCE: Do you want
to [INAUDIBLE] downward

00:34:21.241 --> 00:34:21.540
on each column?

00:34:21.540 --> 00:34:22.165
PROFESSOR: Yep.

00:34:22.165 --> 00:34:23.489
Rolling hash.

00:34:23.489 --> 00:34:28.199
I want to make this faster,
so I have big W hashes.

00:34:28.199 --> 00:34:30.610
They're all little h inside.

00:34:30.610 --> 00:34:32.360
Here, I have to compute
them brute force.

00:34:32.360 --> 00:34:33.485
I can't do anything better.

00:34:33.485 --> 00:34:36.250
But when I go from
here to here, there's

00:34:36.250 --> 00:34:39.394
only one element going out
and one element going in.

00:34:42.070 --> 00:34:43.310
Same for all these guys.

00:34:43.310 --> 00:34:46.080
Let's not make the picture
uglier than it needs to be.

00:34:46.080 --> 00:34:48.960
So I have big W rolling hashes.

00:34:48.960 --> 00:34:51.340
They're vertical rolling hashes.

00:34:51.340 --> 00:34:59.310
And then the rolling hashes
hash columns, so my the sliding

00:34:59.310 --> 00:35:03.230
window that I have is
little w rolling hashes.

00:35:03.230 --> 00:35:10.320
Each rolling hash is little h in
size, so it's a hash of hashes.

00:35:10.320 --> 00:35:12.000
It's nested hashes.

00:35:12.000 --> 00:35:14.410
And then when I go
down, I only have

00:35:14.410 --> 00:35:17.950
to roll down each of
the rolling hashes by 1,

00:35:17.950 --> 00:35:19.300
so that's constant time.

00:35:19.300 --> 00:35:24.630
So to go from here to here, to
the slide the window one down,

00:35:24.630 --> 00:35:27.330
I have to roll this hash
down, roll this one,

00:35:27.330 --> 00:35:29.640
this one, this one, this one,
this one, and all of them

00:35:29.640 --> 00:35:31.640
roll down in constant time.

00:35:31.640 --> 00:35:36.640
So when I'm adding a column
to the hash, say I'm here

00:35:36.640 --> 00:35:40.380
and I want to go here,
I roll down this hash

00:35:40.380 --> 00:35:42.560
and I have the answer.

00:35:42.560 --> 00:35:43.210
It's order 1.

00:35:43.210 --> 00:35:45.360
I'm adding it in order 1.

00:35:47.726 --> 00:35:48.600
Does this make sense?

00:35:55.087 --> 00:35:58.572
AUDIENCE: It's tricky.

00:35:58.572 --> 00:36:00.196
PROFESSOR: But it's
not too bad, right?

00:36:00.196 --> 00:36:02.770
AUDIENCE: You just need to do
the vertical roll first, right?

00:36:02.770 --> 00:36:03.395
PROFESSOR: Yep.

00:36:05.880 --> 00:36:07.890
To have the simplest
possible code,

00:36:07.890 --> 00:36:12.520
you start with big W rolling
hashes, do 1D Rabin-Karp,

00:36:12.520 --> 00:36:17.267
you roll everything
down, 1D Rabin-Karp,

00:36:17.267 --> 00:36:18.100
and keep doing that.

00:36:20.810 --> 00:36:23.372
OK What's the running
time for this?

00:36:23.372 --> 00:36:24.410
AUDIENCE: WH.

00:36:24.410 --> 00:36:25.012
PROFESSOR: WH.

00:36:25.012 --> 00:36:27.595
AUDIENCE: Does this one have a
space complexity about W, then,

00:36:27.595 --> 00:36:28.780
because [INAUDIBLE]?

00:36:28.780 --> 00:36:29.840
PROFESSOR: Yeah.

00:36:29.840 --> 00:36:32.500
So my memory requirement
went up to 4W.

00:36:39.850 --> 00:36:41.670
Is everyone happy with this?

00:36:41.670 --> 00:36:46.340
It's one of the few cases where
an approach for solving a 1D

00:36:46.340 --> 00:36:48.670
problem generalizes to 2D.

00:36:48.670 --> 00:36:51.630
In most problems, you have to
rethink the whole situation.

00:37:02.980 --> 00:37:04.497
Let's do a hard problem.

00:37:04.497 --> 00:37:05.580
Enough with the easy ones.

00:37:10.580 --> 00:37:19.240
Two lists, roughly size N,
and they're both sorted.

00:37:19.240 --> 00:37:21.090
Let me fill them out
with random numbers.

00:37:25.010 --> 00:37:50.710
5, 13, 22, 43, 56, 62, 81,
86, 87, 2, 3, 7, 9, 15, 19,

00:37:50.710 --> 00:37:53.965
24, 28, 32.

00:37:56.780 --> 00:37:58.700
So I have these lists.

00:37:58.700 --> 00:38:01.810
Let's be generous and say that
all the numbers are different.

00:38:01.810 --> 00:38:03.110
They're sorted.

00:38:03.110 --> 00:38:09.110
I want to find the nth number,
so the number with rank n,

00:38:09.110 --> 00:38:19.274
out of both lists
as fast as possible.

00:38:19.274 --> 00:38:22.680
AUDIENCE: Wouldn't
it just be index?

00:38:22.680 --> 00:38:26.470
PROFESSOR: So the thing is
if, for example, n is 1,

00:38:26.470 --> 00:38:27.990
then it's this.

00:38:27.990 --> 00:38:29.140
This is the second number.

00:38:29.140 --> 00:38:31.884
This is the third.

00:38:31.884 --> 00:38:33.800
So if you take the lists
and you combine them,

00:38:33.800 --> 00:38:36.720
then I want the
result out of that.

00:38:36.720 --> 00:38:39.120
AUDIENCE: Let's do merge
sort and the combined

00:38:39.120 --> 00:38:41.950
[INAUDIBLE] index, right?

00:38:41.950 --> 00:38:44.980
PROFESSOR: Full
merge sort, N log N?

00:38:44.980 --> 00:38:45.480
No.

00:38:45.480 --> 00:38:46.870
AUDIENCE: It's already
sorted, though.

00:38:46.870 --> 00:38:47.160
PROFESSOR: OK.

00:38:47.160 --> 00:38:47.610
So what do we do?

00:38:47.610 --> 00:38:48.260
AUDIENCE: We just do merge.

00:38:48.260 --> 00:38:49.560
That's just order N.

00:38:49.560 --> 00:38:51.182
PROFESSOR: So merge.

00:38:51.182 --> 00:38:53.050
AUDIENCE: Merge [INAUDIBLE].

00:38:53.050 --> 00:38:54.760
PROFESSOR: OK.

00:38:54.760 --> 00:38:58.230
So merge and then index
is the first approach,

00:38:58.230 --> 00:39:01.840
which is order N. Then you
said run the merge algorithm,

00:39:01.840 --> 00:39:07.052
but stop when you get to
the little nth element,

00:39:07.052 --> 00:39:09.010
so that's a little bit
better and we don't have

00:39:09.010 --> 00:39:12.490
to produce an array so the space
complexity is down to order 1,

00:39:12.490 --> 00:39:14.600
right?

00:39:14.600 --> 00:39:15.615
Now let's do--

00:39:15.615 --> 00:39:17.170
AUDIENCE: Logarighmic times n.

00:39:17.170 --> 00:39:18.575
PROFESSOR: Yeah, exactly.

00:39:18.575 --> 00:39:19.200
This is linear.

00:39:19.200 --> 00:39:20.810
We have to get to logarithms.

00:39:20.810 --> 00:39:23.228
How do we get to logarithms?

00:39:23.228 --> 00:39:24.860
AUDIENCE: Do a
modified binary search.

00:39:32.070 --> 00:39:38.190
AUDIENCE: What if you
first looked at-- actually,

00:39:38.190 --> 00:39:39.830
I don't know what
I'm going to say.

00:39:42.132 --> 00:39:43.090
PROFESSOR: Anyone else?

00:39:46.037 --> 00:39:47.120
So modified binary search.

00:39:47.120 --> 00:39:49.380
Do you know the full
answer, or do you

00:39:49.380 --> 00:39:51.350
want to start looking
at the solution?

00:39:51.350 --> 00:39:53.665
AUDIENCE: I have an
idea [INAUDIBLE].

00:39:53.665 --> 00:39:55.290
PROFESSOR: Let's see
how it would work.

00:39:58.530 --> 00:40:11.530
AUDIENCE: So if we take
the n over second element

00:40:11.530 --> 00:40:19.628
on each row, the
one that is lower,

00:40:19.628 --> 00:40:22.600
that's at least the n
over second element,

00:40:22.600 --> 00:40:23.725
and the one that's higher--

00:40:35.230 --> 00:40:38.415
PROFESSOR: So let's say
this is our N1 and N2,

00:40:38.415 --> 00:40:41.330
and they're both
order N initially.

00:40:41.330 --> 00:40:45.010
So this is N2 over 2
if this is smaller.

00:40:48.328 --> 00:40:49.786
AUDIENCE: The lower
one is at least

00:40:49.786 --> 00:40:51.790
the N over second
element since everything

00:40:51.790 --> 00:41:03.844
before it is less than
N. Does that make sense?

00:41:03.844 --> 00:41:04.510
PROFESSOR: Yeah.

00:41:04.510 --> 00:41:07.140
So this is N over 2
are greater, right?

00:41:07.140 --> 00:41:07.840
This guy.

00:41:16.030 --> 00:41:28.859
AUDIENCE: We also know
that the element above it

00:41:28.859 --> 00:41:35.316
is at most the nth element
because it's greater than--

00:41:35.316 --> 00:41:37.620
PROFESSOR: So it's
at most N1 plus N2

00:41:37.620 --> 00:41:39.370
because that's how
many you have in total,

00:41:39.370 --> 00:41:44.150
and the one on the top, you
know that these ones are

00:41:44.150 --> 00:41:45.520
bigger than h, right?

00:41:45.520 --> 00:41:47.850
But you don't know
anything about these ones,

00:41:47.850 --> 00:41:52.360
so it's minus N1 over 2.

00:41:52.360 --> 00:41:56.700
So it's at most N1 over 2
plus N2, the top element.

00:41:59.590 --> 00:42:06.860
AUDIENCE: So then we can take
the element three quarters

00:42:06.860 --> 00:42:09.992
of the way through N2 and
one quarter of the way

00:42:09.992 --> 00:42:13.934
through N1 to do
more [INAUDIBLE].

00:42:13.934 --> 00:42:14.850
AUDIENCE: [INAUDIBLE]?

00:42:17.722 --> 00:42:18.305
AUDIENCE: Yes.

00:42:22.962 --> 00:42:24.920
PROFESSOR: Let's see what
happens in each case.

00:42:24.920 --> 00:42:34.270
So if little n is
here, then you divide.

00:42:34.270 --> 00:42:41.490
So if n is smaller than this,
then you chop them up here

00:42:41.490 --> 00:42:43.860
and you've divided
the problem into half.

00:42:43.860 --> 00:42:45.190
You're good.

00:42:45.190 --> 00:42:49.680
If it's bigger than
this other number here,

00:42:49.680 --> 00:42:52.610
you've chopped the problem
up and you're here.

00:42:52.610 --> 00:42:53.500
You're good.

00:42:53.500 --> 00:42:56.100
Now, the hard case seems
to be when it's in between.

00:42:56.100 --> 00:42:57.690
So what do we do then?

00:42:57.690 --> 00:42:59.854
AUDIENCE: Aren't those
two numbers the same?

00:42:59.854 --> 00:43:01.770
AUDIENCE: If it's between,
take the upper half

00:43:01.770 --> 00:43:07.220
of the bottom one and the lower
half of the upper one, right?

00:43:07.220 --> 00:43:09.695
If it's between 15 and 43,
then you take everything

00:43:09.695 --> 00:43:15.375
in the upper half of N2 and
you take the lower half of N1.

00:43:15.375 --> 00:43:16.750
AUDIENCE: Yeah,
you should always

00:43:16.750 --> 00:43:18.662
be taking the upper half
of one and the lower

00:43:18.662 --> 00:43:21.240
half of the other in this case.

00:43:21.240 --> 00:43:21.990
PROFESSOR: Really?

00:43:26.270 --> 00:43:35.230
AUDIENCE: [INAUDIBLE]
N2, 15 is at least

00:43:35.230 --> 00:43:37.330
the nth over 2 element.

00:43:37.330 --> 00:43:39.664
I think we're using three
n's at the same time.

00:43:39.664 --> 00:43:42.110
AUDIENCE: Are you using
the little n or the big N?

00:43:42.110 --> 00:43:42.776
AUDIENCE: Sorry.

00:43:45.100 --> 00:43:47.224
[INAUDIBLE] is element
and the lists are called N

00:43:47.224 --> 00:43:48.140
and this is confusing.

00:43:48.140 --> 00:43:50.376
Can we rename the nth
element to something

00:43:50.376 --> 00:43:53.130
like m or some
other useful number?

00:43:53.130 --> 00:43:54.200
The kth element, OK.

00:43:54.200 --> 00:43:57.680
So the 15 at least the
kth over 2 element,

00:43:57.680 --> 00:44:00.227
so it can't be anything
on the left half of the--

00:44:00.227 --> 00:44:01.060
PROFESSOR: k over 2?

00:44:01.060 --> 00:44:02.350
Why k over 2?

00:44:02.350 --> 00:44:03.570
This list is size N.

00:44:03.570 --> 00:44:05.220
AUDIENCE: Sorry.

00:44:05.220 --> 00:44:07.470
I didn't pick the
elements at N over 2.

00:44:07.470 --> 00:44:11.369
I picked the
elements at k over 2.

00:44:11.369 --> 00:44:12.660
PROFESSOR: Why would I do that?

00:44:18.790 --> 00:44:24.030
If N1 is greater than k, then
I chop off the end of the list,

00:44:24.030 --> 00:44:24.966
right?

00:44:24.966 --> 00:44:29.325
If N2 is bigger than k, then
I chop off the end of the list

00:44:29.325 --> 00:44:29.825
completely.

00:44:33.375 --> 00:44:37.100
If this list is sorted and
I want the third element,

00:44:37.100 --> 00:44:39.200
I know that these
are not the answer.

00:44:39.200 --> 00:44:42.140
No matter what's down here,
these are not the answer.

00:44:42.140 --> 00:44:49.350
So I know for sure that k is
going to be bigger than N1, N2.

00:44:49.350 --> 00:44:55.640
So instead of going there,
let's go at k over 2.

00:44:55.640 --> 00:45:01.290
And here, let's go for k over 2.

00:45:01.290 --> 00:45:03.255
Now this one looks
a bit nastier.

00:45:03.255 --> 00:45:07.720
N1 plus N2 stays
what it was before.

00:45:14.970 --> 00:45:18.560
AUDIENCE: On the list where you
got the element that was lower,

00:45:18.560 --> 00:45:24.690
you know that everything to
the left of it is less than k

00:45:24.690 --> 00:45:25.240
over 2.

00:45:31.470 --> 00:45:34.330
The element number is
lower than k over 2,

00:45:34.330 --> 00:45:36.650
so we're not using anything
to the left of the 15.

00:45:36.650 --> 00:45:38.540
You can kill that
section for us.

00:45:38.540 --> 00:45:42.007
PROFESSOR: OK, so we can kill
it, but then what's the rank?

00:45:42.007 --> 00:45:43.590
When I recurse, how
am I going to know

00:45:43.590 --> 00:45:45.520
the rank that I'm looking for?

00:45:48.817 --> 00:45:50.701
AUDIENCE: k minus
what you killed.

00:45:55.700 --> 00:45:57.520
AUDIENCE: You can
save the branches.

00:45:57.520 --> 00:45:59.560
PROFESSOR: So you want
to kill this guy, right?

00:45:59.560 --> 00:46:01.970
So you want to
kill these numbers.

00:46:01.970 --> 00:46:04.330
But here I have
k over 2 numbers,

00:46:04.330 --> 00:46:06.130
and here I have
k over 2 numbers.

00:46:06.130 --> 00:46:09.401
How do I know that it's
not somewhere here?

00:46:09.401 --> 00:46:13.562
AUDIENCE: How do you know that
what's not somewhere there?

00:46:13.562 --> 00:46:16.220
AUDIENCE: You compare
15 and 43, right?

00:46:16.220 --> 00:46:19.660
And then you see
that 43 is bigger,

00:46:19.660 --> 00:46:26.300
and you see 15 is smaller, so
then you would go to, I guess,

00:46:26.300 --> 00:46:36.195
k over 4 index in N1,
and 3k over 4 in N2.

00:46:36.195 --> 00:46:37.695
AUDIENCE: When you
recurse, you said

00:46:37.695 --> 00:46:41.430
that you've killed
k over 2 elements.

00:46:41.430 --> 00:46:46.947
AUDIENCE: But you can also kill
everything to the right of 43.

00:46:46.947 --> 00:46:47.530
AUDIENCE: Yes.

00:46:47.530 --> 00:46:50.015
You can kill everything
to the right of 43

00:46:50.015 --> 00:46:51.754
since it can't be any
of those elements,

00:46:51.754 --> 00:46:53.670
and you can kill everything
to the left of 15.

00:46:53.670 --> 00:46:57.236
And then you repeat the
algorithm again with the lists

00:46:57.236 --> 00:47:00.210
you didn't kill, except you
also put in a term of we've

00:47:00.210 --> 00:47:03.950
already covered k
over 2 elements.

00:47:03.950 --> 00:47:07.570
PROFESSOR: So we want the
element with the rank k over 4

00:47:07.570 --> 00:47:08.715
over these lists.

00:47:13.646 --> 00:47:16.330
So I know for sure that what
I have is either k over 2

00:47:16.330 --> 00:47:21.240
or less than k over 2, right?

00:47:21.240 --> 00:47:23.710
So this is less than
k over 2, and then I'm

00:47:23.710 --> 00:47:25.532
looking for a rank of k over 4.

00:47:28.184 --> 00:47:30.250
That seems to work.

00:47:30.250 --> 00:47:33.540
How does the running time look?

00:47:33.540 --> 00:47:38.030
AUDIENCE: It should
be O of log k.

00:47:38.030 --> 00:47:40.530
AUDIENCE: I think
it's log [INAUDIBLE].

00:47:45.030 --> 00:47:52.609
PROFESSOR: So log
k, log N1 plus N2.

00:47:57.440 --> 00:47:58.620
Are these different?

00:47:58.620 --> 00:48:00.830
AUDIENCE: Yes.

00:48:00.830 --> 00:48:06.406
AUDIENCE: If k is
1, the algorithm

00:48:06.406 --> 00:48:10.084
should only recurse once,
even if N is 20 million.

00:48:16.220 --> 00:48:18.364
PROFESSOR: OK.

00:48:18.364 --> 00:48:21.902
AUDIENCE: But if k is 20
million and the list lengths are

00:48:21.902 --> 00:48:24.784
two million long, it'll take
approximately those lengths

00:48:24.784 --> 00:48:26.740
to run.

00:48:26.740 --> 00:48:28.690
PROFESSOR: OK.

00:48:28.690 --> 00:48:36.420
So what gets reduced, aside from
the list size, k gets reduced.

00:48:36.420 --> 00:48:39.870
k seems to define the input
size for the next iteration

00:48:39.870 --> 00:48:42.700
because I'll have at least
k over 2 elements in one

00:48:42.700 --> 00:48:45.640
of these buckets.

00:48:45.640 --> 00:48:48.250
So it sounds like
it should be log k.

00:48:56.120 --> 00:49:00.390
k is bigger than N1, N2, but
it should hopefully be smaller

00:49:00.390 --> 00:49:03.740
than the sum because otherwise,
why am I doing the problem?

00:49:03.740 --> 00:49:08.360
So this is definitely
order of N1 plus N2.

00:49:08.360 --> 00:49:09.390
So this is a bound.

00:49:09.390 --> 00:49:11.820
This is a slightly
tighter bound.

00:49:11.820 --> 00:49:13.665
We have a different
solution to the problem.

00:49:16.945 --> 00:49:18.820
All the possible solutions
are hard to argue.

00:49:18.820 --> 00:49:21.540
They all come down to
something like this.

00:49:21.540 --> 00:49:26.070
The one that we have
requires you to use-- you

00:49:26.070 --> 00:49:29.873
have two indices, and you know
that the sum of the indices

00:49:29.873 --> 00:49:33.580
is k, and you do binary
search on the top

00:49:33.580 --> 00:49:36.090
and adjust the
index on the bottom

00:49:36.090 --> 00:49:40.040
to keep the constraint that the
sum of the two indices in N.

00:49:40.040 --> 00:49:44.620
And you can look at that in the
notes that we're going to post.

00:49:44.620 --> 00:49:45.780
Are you guys tired?

00:49:45.780 --> 00:49:49.509
Do you want to look at one
more thing, or are we done?

00:49:49.509 --> 00:49:50.425
AUDIENCE: [INAUDIBLE].

00:49:57.370 --> 00:50:01.510
PROFESSOR: Let's look at
something reasonably easy.

00:50:01.510 --> 00:50:03.350
You guys can read
this on your own.

00:50:03.350 --> 00:50:04.910
I'm not going to
bore you with that.

00:50:08.650 --> 00:50:12.030
So suppose we have
some functions,

00:50:12.030 --> 00:50:13.990
and we want to
order them according

00:50:13.990 --> 00:50:16.070
to their asymptotic growth rate.

00:50:16.070 --> 00:50:19.480
Do people remember how
to do this from Pset one?

00:50:19.480 --> 00:50:22.400
So the idea is that you take
each function, you simplify it,

00:50:22.400 --> 00:50:23.970
and then you sort them.

00:50:23.970 --> 00:50:28.234
So let's have a couple of simple
ones and then some hard ones,

00:50:28.234 --> 00:50:29.900
and we're going to
stop in five minutes.

00:50:39.110 --> 00:50:39.650
What's this?

00:50:42.752 --> 00:50:44.153
AUDIENCE: n to the fourth.

00:50:46.652 --> 00:50:47.235
PROFESSOR: OK.

00:50:49.810 --> 00:50:51.090
Let's see.

00:50:51.090 --> 00:50:51.850
n choose 3.

00:50:51.850 --> 00:50:54.444
What is this?

00:50:54.444 --> 00:50:55.539
AUDIENCE: [INAUDIBLE].

00:50:55.539 --> 00:50:56.580
PROFESSOR: OK, very good.

00:50:56.580 --> 00:50:57.796
Why?

00:50:57.796 --> 00:51:01.177
AUDIENCE: Something about choose
is n times n minus 1 times

00:51:01.177 --> 00:51:03.592
n minus 2.

00:51:03.592 --> 00:51:07.456
AUDIENCE: It's n factorial
over n minus 2 factorial.

00:51:11.220 --> 00:51:13.325
PROFESSOR: This comes out
to be roughly n cubed.

00:51:16.265 --> 00:51:16.765
Cool.

00:51:19.570 --> 00:51:22.808
How about n plus log
to the fourth of n?

00:51:25.616 --> 00:51:27.020
AUDIENCE: N.

00:51:27.020 --> 00:51:28.960
PROFESSOR: Yep.

00:51:28.960 --> 00:51:32.290
So even if I have a
polynomial in a logarithm,

00:51:32.290 --> 00:51:35.780
it's still dominated by pure n.

00:51:35.780 --> 00:51:39.000
Now, suppose we want to
order these guys together

00:51:39.000 --> 00:51:43.730
with-- which one doesn't
look boring at all?

00:51:43.730 --> 00:51:46.940
n to the log n and 2 to the n.

00:51:49.790 --> 00:51:50.739
Let's sort them.

00:51:50.739 --> 00:51:51.780
Which one's the smallest?

00:51:51.780 --> 00:51:53.394
Which one's the biggest?

00:51:59.250 --> 00:52:02.504
AUDIENCE: 2 to the n is bigger.

00:52:02.504 --> 00:52:04.420
PROFESSOR: Let's start
with the smallest ones,

00:52:04.420 --> 00:52:07.340
because I think
that will be easy.

00:52:07.340 --> 00:52:10.480
So which one's the absolute
smallest out of all these guys?

00:52:10.480 --> 00:52:11.340
AUDIENCE: n.

00:52:11.340 --> 00:52:12.590
PROFESSOR: OK.

00:52:12.590 --> 00:52:14.120
Then?

00:52:14.120 --> 00:52:15.960
AUDIENCE: n to the third.

00:52:15.960 --> 00:52:17.276
PROFESSOR: Cubed and fourth.

00:52:17.276 --> 00:52:18.650
So we have to
compare these guys.

00:52:18.650 --> 00:52:19.608
How do we compare them?

00:52:23.010 --> 00:52:26.900
n to the power of log n
and 2 to the n something.

00:52:29.984 --> 00:52:30.900
AUDIENCE: [INAUDIBLE].

00:52:34.819 --> 00:52:35.860
PROFESSOR: Take the logs.

00:52:35.860 --> 00:52:38.550
So when we have something
confusing with exponentials,

00:52:38.550 --> 00:52:40.500
take the logs and
see what we get.

00:52:40.500 --> 00:52:42.640
Logs are monotonic, so
if you take the logs,

00:52:42.640 --> 00:52:45.450
you'll have the same
relationship afterwards.

00:52:45.450 --> 00:52:53.090
So log of this is log of
n to the power of log n,

00:52:53.090 --> 00:53:05.020
so it's log n times
log n, so it's log 2 n.

00:53:05.020 --> 00:53:08.940
Log of 2 to the n is n.

00:53:08.940 --> 00:53:11.404
Which one's bigger?

00:53:11.404 --> 00:53:12.205
AUDIENCE: n.

00:53:12.205 --> 00:53:13.080
PROFESSOR: All right.

00:53:21.040 --> 00:53:22.470
And we're not going
to solve this,

00:53:22.470 --> 00:53:25.150
but how would you go
about solving this guy?

00:53:28.840 --> 00:53:30.060
What do you do to it?

00:53:30.060 --> 00:53:30.860
AUDIENCE: Sterling.

00:53:30.860 --> 00:53:33.170
PROFESSOR: Sterling, yep.

00:53:33.170 --> 00:53:35.270
You do Sterling, you
go through the numbers,

00:53:35.270 --> 00:53:37.210
and you figure out the answer.

00:53:37.210 --> 00:53:39.135
And then if you have
to do logarithms,

00:53:39.135 --> 00:53:41.390
you use logarithms
to figure out where

00:53:41.390 --> 00:53:43.510
it belongs among these guys.

00:53:43.510 --> 00:53:44.800
AUDIENCE: What's Sterling?

00:53:44.800 --> 00:53:45.540
AUDIENCE: Sterling's formula.

00:53:45.540 --> 00:53:47.506
It's that gross thing
that was on the board

00:53:47.506 --> 00:53:48.464
before we came in here.

00:53:48.464 --> 00:53:54.390
PROFESSOR: So Sterling says
that n factorial is ugly.

00:54:03.030 --> 00:54:09.440
2 pi n here times n over
e to the power of n.

00:54:13.670 --> 00:54:14.827
So what's this binomial?

00:54:14.827 --> 00:54:15.910
What's the formula for it?

00:54:21.670 --> 00:54:23.210
OK, formula for n choose k.

00:54:23.210 --> 00:54:24.512
Anyone?

00:54:24.512 --> 00:54:28.363
AUDIENCE: It's n times 1
over 2 factorial times n

00:54:28.363 --> 00:54:29.611
minus k factorial.

00:54:32.750 --> 00:54:35.410
PROFESSOR: In this
case, it's n factorial

00:54:35.410 --> 00:54:40.576
over n over 2 factorial raised
to the power of 2, right?

00:54:40.576 --> 00:54:42.950
And then we chug through the
math and get to some answer.

00:54:46.390 --> 00:54:47.956
All right?