WEBVTT

00:00:00.040 --> 00:00:02.480
The following content is
provided under a Creative

00:00:02.480 --> 00:00:04.010
Commons license.

00:00:04.010 --> 00:00:06.340
Your support will help
MIT OpenCourseWare

00:00:06.340 --> 00:00:10.700
continue to offer high quality
educational resources for free.

00:00:10.700 --> 00:00:13.320
To make a donation or
view additional materials

00:00:13.320 --> 00:00:17.035
from hundreds of MIT courses,
visit MIT OpenCourseWare

00:00:17.035 --> 00:00:17.660
at ocw.mit.edu.

00:00:20.410 --> 00:00:21.410
ERIK DEMAINE: All right.

00:00:21.410 --> 00:00:23.870
Welcome back to 6.046.

00:00:23.870 --> 00:00:28.030
Today we continue our
theme of data structures

00:00:28.030 --> 00:00:30.700
but this time, instead of doing
a fancy cool data structure,

00:00:30.700 --> 00:00:33.370
we're going to look
at fancy cool analysis

00:00:33.370 --> 00:00:34.850
techniques for data structures.

00:00:34.850 --> 00:00:37.880
And these are useful for tons
of different data structures,

00:00:37.880 --> 00:00:41.800
especially in the context when
you're using a data structure

00:00:41.800 --> 00:00:44.380
to implement an algorithm.

00:00:44.380 --> 00:00:48.950
For example, in Dijkstra, when
you learned Dijkstra's algorithm,

00:00:48.950 --> 00:00:50.660
you had lots of
different heap structures

00:00:50.660 --> 00:00:54.850
you could use for the
priority queue in Dijkstra,

00:00:54.850 --> 00:00:57.930
and they gave different
running times with Dijkstra.

00:00:57.930 --> 00:01:00.230
But the key thing
in that context

00:01:00.230 --> 00:01:02.230
is that you cared about
the total running time

00:01:02.230 --> 00:01:05.390
of the algorithm,
less than you cared

00:01:05.390 --> 00:01:08.370
about the individual running
time of each operation.

00:01:08.370 --> 00:01:10.910
That's what
amortization is about.

00:01:10.910 --> 00:01:13.910
It's, I guess, a technique
from financial analysis,

00:01:13.910 --> 00:01:16.910
but we've appropriated
it in computer science

00:01:16.910 --> 00:01:19.280
as an analysis technique
to say, well, let's not

00:01:19.280 --> 00:01:22.360
worry about every single
operation's worst-case cost,

00:01:22.360 --> 00:01:25.630
let's just worry about the
total operation, the sum

00:01:25.630 --> 00:01:27.640
of all the operations' costs.

00:01:27.640 --> 00:01:29.540
That's the whole
idea of amortization,

00:01:29.540 --> 00:01:31.535
but there's a lot of
different ways to do it.

00:01:31.535 --> 00:01:33.910
We're going to cover four
different methods for doing it,

00:01:33.910 --> 00:01:38.390
and three-ish examples
of doing it today.

00:01:38.390 --> 00:01:42.180
You've seen some essentially
in recitation last time,

00:01:42.180 --> 00:01:44.460
and you've seen a
little bit in 6.006,

00:01:44.460 --> 00:01:50.040
so let me first remind you of
the example of table doubling

00:01:50.040 --> 00:01:56.330
from 6.006.

00:01:56.330 --> 00:01:59.270
This came up in the
context of hash tables.

00:01:59.270 --> 00:02:04.910
As you may recall,
if you store n items

00:02:04.910 --> 00:02:12.280
in a hash table of size m--
there are m slots in the table,

00:02:12.280 --> 00:02:16.290
let's say, using chaining--
hashing with chaining-- then we

00:02:16.290 --> 00:02:27.270
got an expected
cost constant plus

00:02:27.270 --> 00:02:29.520
the load factor, number
of items divided

00:02:29.520 --> 00:02:31.680
by the size of the table.

00:02:31.680 --> 00:02:34.030
So we wanted to get
constant expected,

00:02:34.030 --> 00:02:38.220
and so we wanted this to
always be, at most, a constant.

00:02:38.220 --> 00:02:40.250
I guess we could handle
a larger table size,

00:02:40.250 --> 00:02:42.600
although then we are
unhappy about our space,

00:02:42.600 --> 00:02:46.350
but we definitely want m to be
at least around n so that this

00:02:46.350 --> 00:02:48.770
works out to order one.

00:02:48.770 --> 00:02:51.230
And the solution for doing
that was table doubling.

00:02:51.230 --> 00:02:53.275
Whenever the table is
too big, double it--

00:02:53.275 --> 00:02:55.025
or sorry-- whenever
the table is too small

00:02:55.025 --> 00:02:58.080
and we have too many items,
double the size of the table.

00:03:01.260 --> 00:03:08.522
If n-- n is the thing
that we can't control.

00:03:08.522 --> 00:03:09.980
That's the number
of items somebody

00:03:09.980 --> 00:03:11.790
is inserting into the table.

00:03:11.790 --> 00:03:18.650
If n grows to the value
to match m, then double m.

00:03:22.400 --> 00:03:29.935
So m prime equals 2m, and
to double the table size,

00:03:29.935 --> 00:03:32.060
you have to allocate a new
array of double the size

00:03:32.060 --> 00:03:35.750
and copy over all the items,
and that involves hashing.

00:03:35.750 --> 00:03:42.320
But overall this will take
order, size of the table, work.

00:03:42.320 --> 00:03:45.469
Doesn't matter whether I'm
using m or m prime here,

00:03:45.469 --> 00:03:47.760
because they're within a
constant factor of each other,

00:03:47.760 --> 00:03:48.770
and that's bad.

00:03:48.770 --> 00:03:51.110
Linear time to do an
insertion is clearly bad.

00:03:51.110 --> 00:03:53.070
This is all during one
insertion operation

00:03:53.070 --> 00:03:55.400
that this would happen,
but overall it's

00:03:55.400 --> 00:03:58.550
not going to be bad, because
you only double log n times.

00:03:58.550 --> 00:04:06.130
And if you look at
the total cost--

00:04:06.130 --> 00:04:09.130
so maybe you think, oh,
is it log n per operation,

00:04:09.130 --> 00:04:14.070
but it's not so bad because
total cost for n insertions

00:04:14.070 --> 00:04:21.352
starting from an empty structure
is something like 2 to the 0--

00:04:21.352 --> 00:04:24.890
this is a big theta outside--
2 to the 1, 2 to the 2.

00:04:24.890 --> 00:04:28.920
If we're only doing
insertions, this is great.

00:04:28.920 --> 00:04:30.950
2 to the log n.

00:04:30.950 --> 00:04:36.067
This is a geometric series
and so this is order n.

00:04:38.810 --> 00:04:42.180
Theta n, I guess.

00:04:42.180 --> 00:04:46.090
So to do n insertions,
cost theta n,

00:04:46.090 --> 00:04:56.520
so we'd like to say the
amortized cost per operation

00:04:56.520 --> 00:05:09.770
is constant, because
we did n operations.

00:05:09.770 --> 00:05:12.440
Total cost was n, so sort
of on average per operation,

00:05:12.440 --> 00:05:14.120
that was only constant.
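
NOTE
Editor's sketch of the aggregate argument above; it is not part of the lecture.
It assumes a simplified cost model: 1 unit per plain insertion, plus m units to
copy a full table of size m when it doubles. The function name and the cost
model are illustrative assumptions.
```python
def total_insert_cost(n):
    """Total cost of n insertions into a size-1 table that doubles when full."""
    m = 1        # current table size (number of slots)
    items = 0    # number of items stored so far
    total = 0    # accumulated actual cost
    for _ in range(n):
        if items == m:      # table is full: allocate 2m slots and copy m items
            total += m
            m *= 2
        total += 1          # the insertion itself
        items += 1
    return total
for n in (1, 10, 1000, 10**5):
    cost = total_insert_cost(n)
    print(n, cost, cost / n)   # the ratio cost / n stays below 3
```
Under this model the copy costs form the geometric series 1 + 2 + 4 + ...,
which sums to less than 2n, so the total is less than 3n.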

00:05:14.120 --> 00:05:17.100
So this is the sense in which
hash tables are constant,

00:05:17.100 --> 00:05:19.920
expected, amortized.

00:05:19.920 --> 00:05:23.000
And we'll get back to
hashing in a future lecture,

00:05:23.000 --> 00:05:26.210
probably I think lecture
8, but for now we're

00:05:26.210 --> 00:05:28.430
just going to think about
this as a general thing

00:05:28.430 --> 00:05:33.580
where you need table doubling,
then this gives you a fast way

00:05:33.580 --> 00:05:34.666
to insert into a table.

00:05:34.666 --> 00:05:36.540
Later we'll think about
deleting from a table

00:05:36.540 --> 00:05:38.880
and keeping the
space not too big,

00:05:38.880 --> 00:05:41.330
but that's a starting point.

00:05:41.330 --> 00:05:43.460
This is an example of a
general technique called

00:05:43.460 --> 00:05:48.600
the aggregate method, which
is probably the weakest

00:05:48.600 --> 00:05:53.812
method for doing
amortization but

00:05:53.812 --> 00:05:55.020
maybe the most intuitive one.

00:06:01.160 --> 00:06:06.050
So the aggregate
method says, well, we

00:06:06.050 --> 00:06:07.620
do some sequence of operations.

00:06:07.620 --> 00:06:09.820
Let's say, in general,
there are k operations.

00:06:09.820 --> 00:06:18.100
Measure the total cost
of those operations,

00:06:18.100 --> 00:06:22.115
divide by k, that's the
amortized cost per operation.

00:06:35.190 --> 00:06:36.810
You can think of
this as a definition,

00:06:36.810 --> 00:06:38.450
but it's not actually
going to be our definition

00:06:38.450 --> 00:06:39.280
of amortized cost.

00:06:39.280 --> 00:06:41.600
We're going to use a
more flexible definition,

00:06:41.600 --> 00:06:45.250
but for simple examples like
this, it's a fine definition,

00:06:45.250 --> 00:06:46.960
and it gives you what you want.

00:06:46.960 --> 00:06:50.809
When your sequence of
operations is very clear,

00:06:50.809 --> 00:06:52.350
like here, there's
only one thing you

00:06:52.350 --> 00:06:54.500
can do at each step,
which is insert-- that's

00:06:54.500 --> 00:06:56.560
my definition of the
problem-- then great,

00:06:56.560 --> 00:06:58.100
we get a very simple sum.

00:06:58.100 --> 00:06:59.740
As soon as you mix
inserts and deletes,

00:06:59.740 --> 00:07:01.670
the sum is not so clear.

00:07:01.670 --> 00:07:05.502
But in some situations,
the sum is really clean,

00:07:05.502 --> 00:07:06.960
so you just compute
the sum, divide

00:07:06.960 --> 00:07:09.370
by a number of operations,
you get a cost,

00:07:09.370 --> 00:07:11.820
and that could be
the amortized cost.

00:07:11.820 --> 00:07:15.840
And that's the aggregate method,
works great for simple sums.

00:07:15.840 --> 00:07:19.480
Here's another example
where it-- no, sorry.

00:07:19.480 --> 00:07:22.410
Let me now give you
the general definition

00:07:22.410 --> 00:07:31.560
of amortized bounds,
which becomes important

00:07:31.560 --> 00:07:33.930
once you're dealing with
different types of operations.

00:07:33.930 --> 00:07:37.070
I want to say an insert
costs one bound amortized

00:07:37.070 --> 00:07:39.750
and maybe a delete
costs some other bound.

00:07:39.750 --> 00:07:48.250
So what you get to do is assign
a cost for each operation.

00:07:48.250 --> 00:08:04.490
I should call it
an amortized cost,

00:08:04.490 --> 00:08:13.810
such that you preserve
the sum of those costs.

00:08:13.810 --> 00:08:19.910
So what I mean is that if I look
at the sum over all operations

00:08:19.910 --> 00:08:25.570
of the amortized cost
of that operation,

00:08:25.570 --> 00:08:33.120
and I compare that with the
sum of all the actual costs

00:08:33.120 --> 00:08:38.240
of the operations, the
amortized sum should always

00:08:38.240 --> 00:08:41.299
be bigger, because I
always want an upper bound

00:08:41.299 --> 00:08:43.059
on my actual cost.
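
NOTE
Editor's restatement of the requirement just described, as a formula. The
symbols are assumptions for illustration and do not appear in the lecture:
k is the number of operations, c_j is the actual cost of operation j, and
\hat{c}_j is the amortized cost assigned to it.
```latex
\sum_{j=1}^{k} \hat{c}_j \;\ge\; \sum_{j=1}^{k} c_j
```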

00:08:43.059 --> 00:08:45.830
So if I can prove that the
amortized costs are, at most,

00:08:45.830 --> 00:08:48.310
say, constant per
operation, then I

00:08:48.310 --> 00:08:50.100
get that the sum
of the actual cost

00:08:50.100 --> 00:08:51.600
is, at most, constant
per operation.

00:08:51.600 --> 00:08:54.140
I don't learn anything
about the individual costs,

00:08:54.140 --> 00:08:55.690
but I learn about
the total cost.

00:08:55.690 --> 00:08:57.773
And in the context of an
algorithm like Dijkstra's

00:08:57.773 --> 00:08:59.920
algorithm, you only care
about the total cost,

00:08:59.920 --> 00:09:02.810
because you don't care about
the shortest paths at time t,

00:09:02.810 --> 00:09:05.310
you only care about the shortest
paths when the algorithm is

00:09:05.310 --> 00:09:06.790
completely finished.

00:09:06.790 --> 00:09:11.281
So in a lot of situations,
maybe not a real-time system,

00:09:11.281 --> 00:09:12.780
but almost everything
else, you just

00:09:12.780 --> 00:09:14.113
care about the sum of the costs.

00:09:14.113 --> 00:09:15.850
As long as that's
small, you can afford

00:09:15.850 --> 00:09:19.930
the occasional
expensive operation.

00:09:19.930 --> 00:09:22.410
So this is a more
flexible definition.

00:09:22.410 --> 00:09:25.140
One option would be to
assign the average cost

00:09:25.140 --> 00:09:28.120
to each operation, but we have
a whole bunch more operations.

00:09:28.120 --> 00:09:30.570
We could say inserts cost
more than deletes or things

00:09:30.570 --> 00:09:31.370
like that.

00:09:31.370 --> 00:09:35.240
In fact, let me do
such an example.

00:09:35.240 --> 00:09:38.132
A couple weeks ago, you
learned about 2-3 trees.

00:09:38.132 --> 00:09:39.840
This would work for
any structure though.

00:09:49.690 --> 00:09:54.250
So I claim I'm going to look at
three operations on 2-3 trees.

00:09:54.250 --> 00:09:57.679
One is create an
empty tree, so I

00:09:57.679 --> 00:09:59.220
need to think about
how we're getting

00:09:59.220 --> 00:10:01.230
started in amortization.

00:10:06.410 --> 00:10:09.040
Let's say you always
start with an empty tree.

00:10:09.040 --> 00:10:11.519
It takes constant
time to make one.

00:10:11.519 --> 00:10:17.250
I pay log n time-- I'm going
to tweak that a little bit--

00:10:17.250 --> 00:10:33.490
for an insertion, and
I pay 0 time per delete

00:10:33.490 --> 00:10:34.635
in an amortized sense.

00:10:41.090 --> 00:10:43.030
You can write big
O of 0 if you like.

00:10:43.030 --> 00:10:45.040
Same thing.

00:10:45.040 --> 00:10:49.361
So deletion you can think
of as a free operation.

00:10:49.361 --> 00:10:49.860
Why?

00:10:56.577 --> 00:10:59.160
This is a bit counter-intuitive
because, of course, in reality

00:10:59.160 --> 00:11:02.151
the actual cost of a deletion
is going to be log n.

00:11:02.151 --> 00:11:02.650
Yeah.

00:11:02.650 --> 00:11:04.200
AUDIENCE: You can never
delete more elements

00:11:04.200 --> 00:11:05.440
than you've already inserted.

00:11:05.440 --> 00:11:07.140
ERIK DEMAINE: You can
never delete more elements

00:11:07.140 --> 00:11:08.210
than you've already inserted.

00:11:08.210 --> 00:11:08.709
Good.

00:11:08.709 --> 00:11:12.030
AUDIENCE: Can you cap
the cost of [INAUDIBLE]

00:11:12.030 --> 00:11:13.100
ERIK DEMAINE: Yeah.

00:11:13.100 --> 00:11:16.520
So I can bound the deletion
cost by the insertion cost,

00:11:16.520 --> 00:11:19.230
and in the context of
just the aggregate method,

00:11:19.230 --> 00:11:23.020
you could look at the total
cost of all the operations.

00:11:23.020 --> 00:11:25.360
I guess we're not
exactly dividing here,

00:11:25.360 --> 00:11:31.940
but if we look at
the total cost,

00:11:31.940 --> 00:11:37.300
let's say that we do c
creations, i insertions

00:11:37.300 --> 00:11:44.330
and d deletions, then the total
cost becomes c plus i times

00:11:44.330 --> 00:11:48.558
log n plus d times the log n.

00:11:51.540 --> 00:11:56.572
And the point is d is
less than or equal to i,

00:11:56.572 --> 00:11:58.280
because you can never
delete an item that

00:11:58.280 --> 00:12:00.030
wasn't already inserted
if you're starting

00:12:00.030 --> 00:12:01.760
from an empty structure.

00:12:01.760 --> 00:12:07.370
And so this is i plus d
times log n, but that's just,

00:12:07.370 --> 00:12:13.660
at most, twice i times log
n, so we get c plus i log n.

00:12:16.950 --> 00:12:22.220
And so we can think of that
as having a d times 0, 0

00:12:22.220 --> 00:12:23.880
cost per deletion.

00:12:23.880 --> 00:12:26.570
So this is the sum of the
actual costs over here.

00:12:26.570 --> 00:12:29.220
This is the sum of the
amortized costs, where

00:12:29.220 --> 00:12:31.200
we say 0 for the
deletion, and we just

00:12:31.200 --> 00:12:36.980
showed that this is an upper
bound on that, so we're happy.

00:12:36.980 --> 00:12:38.750
Now, there's a
slight catch here,

00:12:38.750 --> 00:12:41.890
and that's why I wrote
star on every n, which

00:12:41.890 --> 00:12:44.190
is not every operation
has the same cost, right?

00:12:44.190 --> 00:12:45.815
When you start from
an empty structure,

00:12:45.815 --> 00:12:50.480
insertion costs constant
time, because n is 0.

00:12:50.480 --> 00:12:53.640
When n is a constant,
insertion is constant time.

00:12:53.640 --> 00:12:58.310
When n grows to n,
it costs log n time.

00:12:58.310 --> 00:13:00.700
At different times, n
is a different value,

00:13:00.700 --> 00:13:03.450
and n I'm going to use
to mean the current size

00:13:03.450 --> 00:13:05.680
of the structure.

00:13:05.680 --> 00:13:07.820
For this argument to
work at the moment,

00:13:07.820 --> 00:13:11.249
I need that n is not the
current value, because this

00:13:11.249 --> 00:13:12.290
is kind of charging work.

00:13:12.290 --> 00:13:14.200
Some insertions are
for large structures,

00:13:14.200 --> 00:13:16.700
some are for small structures,
some deletions are for small,

00:13:16.700 --> 00:13:17.820
some are for large.

00:13:17.820 --> 00:13:19.240
Gets confusing to think about.

00:13:19.240 --> 00:13:21.660
We will fix that in a
moment but, for now, I'm

00:13:21.660 --> 00:13:23.930
just going to
define n star to be

00:13:23.930 --> 00:13:26.795
the maximum size over all time.

00:13:29.370 --> 00:13:33.570
OK, if we just define it
that way, then this is true.
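
NOTE
Editor's sketch of the free-deletion bound written out, using n^* (the maximum
size over all time) as just defined and c, i, d as the numbers of creations,
insertions and deletions. Constant factors are suppressed, as in the lecture.
```latex
\begin{aligned}
c + i \log n^* + d \log n^* &= c + (i + d) \log n^* \\
&\le c + 2i \log n^* && (d \le i) \\
&= O(c + i \log n^*) \\
&= c \cdot O(1) + i \cdot O(\log n^*) + d \cdot 0
\end{aligned}
```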

00:13:33.570 --> 00:13:38.000
That will let me pay
for any deletion,

00:13:38.000 --> 00:13:40.030
but we'll remove that
star later on once

00:13:40.030 --> 00:13:44.260
we get better analysis
methods, but so far so good.

00:13:44.260 --> 00:13:47.440
Two very simple
examples-- table doubling,

00:13:47.440 --> 00:13:50.167
2-3 trees with free deletion.

00:13:50.167 --> 00:13:52.000
Of course, that would
work for any structure

00:13:52.000 --> 00:13:54.470
with logarithmic
insertion and deletion,

00:13:54.470 --> 00:13:57.760
but we're going to be using
2-3 trees in a more-- analyzing

00:13:57.760 --> 00:14:02.870
them in a more
interesting way later on.

00:14:02.870 --> 00:14:06.090
So let's go to the next method,
which is the accounting method.

00:14:12.220 --> 00:14:16.816
It's like the bank teller's
analysis, if you will.

00:14:19.700 --> 00:14:23.740
These are all just different
ways to compute these sums

00:14:23.740 --> 00:14:26.660
or to think about the sums,
and usually one method

00:14:26.660 --> 00:14:31.570
is a lot easier, either for you
personally or for each problem,

00:14:31.570 --> 00:14:32.250
more typically.

00:14:32.250 --> 00:14:34.460
Each problem usually one
or more of these methods

00:14:34.460 --> 00:14:36.790
is going to be more
intuitive than the others.

00:14:36.790 --> 00:14:38.550
They're all kind of
equivalent, but it's

00:14:38.550 --> 00:14:40.270
good to have them
all in your mind

00:14:40.270 --> 00:14:43.080
so you can just think about
the problem in different ways.

00:14:43.080 --> 00:14:46.230
So with the accounting
method, what we're going to do

00:14:46.230 --> 00:14:57.070
is define a bank account and
an operation can store credit

00:14:57.070 --> 00:14:57.960
in that bank account.

00:15:11.546 --> 00:15:13.670
Credit's maybe not the best
word, because you're not

00:15:13.670 --> 00:15:16.650
allowed for the bank
account to go negative.

00:15:16.650 --> 00:15:22.570
The bank account must always
be non-negative balance,

00:15:22.570 --> 00:15:25.650
because otherwise your
summations won't work out.

00:15:25.650 --> 00:15:29.050
So when you store credit in the
bank account, you pay for it.

00:15:29.050 --> 00:15:32.060
It's as if you're
consuming time now in order

00:15:32.060 --> 00:15:34.090
to pay for it in the future.

00:15:34.090 --> 00:15:36.750
And think of operations
costing money,

00:15:36.750 --> 00:15:43.530
so whenever I do a deletion, I
spend actual time, log n time,

00:15:43.530 --> 00:15:46.580
but if I had log n
dollars in the bank,

00:15:46.580 --> 00:15:48.270
and I could pull
those out of the bank,

00:15:48.270 --> 00:15:51.230
I can use those dollars
to pay for the work,

00:15:51.230 --> 00:15:53.280
and then the deletion
itself becomes

00:15:53.280 --> 00:15:55.460
free in an amortized sense.

00:15:55.460 --> 00:16:00.030
So this is, on the
one hand, operation--

00:16:00.030 --> 00:16:03.390
and when I do an insertion,
I'm going to physically take

00:16:03.390 --> 00:16:05.960
some coins out of myself.

00:16:05.960 --> 00:16:09.410
That will cost
something in the sense

00:16:09.410 --> 00:16:11.030
that the amortized
cost of insertion

00:16:11.030 --> 00:16:14.040
goes up in order to put
those coins in the bank,

00:16:14.040 --> 00:16:16.330
but then I'll be able to
use them for deletion.

00:16:16.330 --> 00:16:18.320
So this is what
insertion is going to do.

00:16:18.320 --> 00:16:20.300
I can store credit
in the bank, and then

00:16:20.300 --> 00:16:29.540
separately we allow an operation
to take coins out of the bank,

00:16:29.540 --> 00:16:37.070
and you can pay for time
using the credit that's

00:16:37.070 --> 00:16:38.090
been stored in the bank.

00:16:43.410 --> 00:16:48.090
As long as the bank
balance remains

00:16:48.090 --> 00:16:52.680
non-negative at all
times, this will be good.

00:16:52.680 --> 00:16:55.980
The bank balance is a
sort of unused time.

00:16:55.980 --> 00:16:57.930
We're paying for it to
store things in there.

00:16:57.930 --> 00:16:59.305
If we don't use
it, well, we just

00:16:59.305 --> 00:17:00.700
have an upper bound on time.

00:17:00.700 --> 00:17:05.190
As long as we stay
non-negative, then

00:17:05.190 --> 00:17:08.150
the summation will always
be in the right direction.

00:17:08.150 --> 00:17:10.170
This inequality will hold.

00:17:10.170 --> 00:17:11.900
Let's do an example.

00:17:27.839 --> 00:17:29.910
Well, maybe this
is a first example.

00:17:29.910 --> 00:17:35.050
So when I do an
insertion, I can put,

00:17:35.050 --> 00:17:41.170
let's say, one coin of value
log n star into the bank, and so

00:17:41.170 --> 00:17:42.810
the total cost of
that insertion,

00:17:42.810 --> 00:17:46.310
I pay log n star real cost
in order to do the insertion,

00:17:46.310 --> 00:17:49.610
then I also pay log n
star for those coins

00:17:49.610 --> 00:17:50.630
to put them in the bank.

00:17:50.630 --> 00:17:53.520
When I do a deletion, the
real cost is log n star,

00:17:53.520 --> 00:17:56.950
but I'm going to extract
out of it log n star coins,

00:17:56.950 --> 00:17:58.730
and so the total
cost is actually

00:17:58.730 --> 00:18:07.320
free-- the total
amortized cost is free--

00:18:07.320 --> 00:18:09.890
and the reason that works, the
reason the balance is always

00:18:09.890 --> 00:18:11.940
non-negative, is because
for every deletion

00:18:11.940 --> 00:18:13.700
there was an
insertion before it.

00:18:13.700 --> 00:18:16.220
So that's maybe a
less intuitive way

00:18:16.220 --> 00:18:17.720
to think about this
problem, but you

00:18:17.720 --> 00:18:19.085
could think about it that way.

00:18:25.080 --> 00:18:29.800
More generally-- so
what we'd like to say

00:18:29.800 --> 00:18:32.970
is that we only put
log n without the star,

00:18:32.970 --> 00:18:38.230
the current value
of n per insert

00:18:38.230 --> 00:18:46.815
and a 0 per delete amortized.

00:18:53.570 --> 00:18:59.390
So we'd like to
say, OK, let me put

00:18:59.390 --> 00:19:20.710
one coin worth log n for each
insertion, and when I delete,

00:19:20.710 --> 00:19:21.900
I consume the coin.

00:19:27.600 --> 00:19:29.050
And, in general,
the formula here

00:19:29.050 --> 00:19:38.660
is that the amortized
cost of an operation

00:19:38.660 --> 00:19:55.960
is the actual cost plus the
deposits minus the withdrawals.

00:20:03.341 --> 00:20:03.840
OK.

00:20:03.840 --> 00:20:06.160
So insertion, we
just double the cost,

00:20:06.160 --> 00:20:08.010
because we pay log
n to the real thing,

00:20:08.010 --> 00:20:11.270
we pay log n to store the coin.

00:20:11.270 --> 00:20:14.170
That's the plus deposit
part, so insertion remains

00:20:14.170 --> 00:20:18.360
log n, and then deletion, we
pay log n to do the deletion,

00:20:18.360 --> 00:20:23.470
but then we subtract off
the coin of value log n,

00:20:23.470 --> 00:20:25.770
so that hopefully
works out to 0.

00:20:25.770 --> 00:20:28.970
But, again, we have this
issue that coins actually

00:20:28.970 --> 00:20:30.840
have different amounts,
depending on what

00:20:30.840 --> 00:20:34.670
the current value of n was.

00:20:34.670 --> 00:20:38.030
You can actually get
this to work if you say,

00:20:38.030 --> 00:20:42.690
well, there are coins
of varying values here,

00:20:42.690 --> 00:20:44.490
and I think the
invariant is if you

00:20:44.490 --> 00:20:47.060
have a current
structure of size n,

00:20:47.060 --> 00:20:50.470
you will have one coin of size
log 1, log 2, log 3, log 4,

00:20:50.470 --> 00:20:52.590
up to log n.

00:20:52.590 --> 00:20:56.870
Each coin corresponds to the
item that made n that value.

00:20:56.870 --> 00:21:00.000
And so when you delete
an item at size n,

00:21:00.000 --> 00:21:04.450
you'll be removing the log nth
coin, the coin of value log n.

00:21:04.450 --> 00:21:07.075
So you can actually get this
to work if you're careful.

00:21:10.020 --> 00:21:24.430
I guess the invariant is one
coin of value log i for i

00:21:24.430 --> 00:21:31.140
equals 1 to n, and you can
check that invariant holds.

00:21:31.140 --> 00:21:33.480
When I do a new insertion,
I increase n by 1

00:21:33.480 --> 00:21:35.940
and I make a new coin
of log that value.

00:21:35.940 --> 00:21:38.460
When I do a deletion,
I'm going to remove

00:21:38.460 --> 00:21:41.450
that last coin of log n.

00:21:41.450 --> 00:21:43.470
So this does work out.

00:21:43.470 --> 00:21:46.870
So we got rid of the n star.
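
NOTE
Editor's sketch of the coin invariant above; it is not part of the lecture.
It assumes a stand-in integer cost, cost(n) = n.bit_length(), which equals
floor(log2 n) + 1, for one insert or delete at size n. The names cost,
simulate and random_ops are hypothetical. Each insert deposits one coin worth
cost(n); each delete withdraws the most recent coin.
```python
import random
def cost(n):
    """Assumed actual cost of one insert or delete at size n (about log2 n)."""
    return n.bit_length()
def simulate(ops):
    """Return (total actual cost, total amortized cost) for 'insert'/'delete' ops."""
    coins = []                     # coins[i - 1] was left by the insert that made the size i
    actual_total = 0
    amortized_total = 0
    for op in ops:
        if op == "insert":
            n = len(coins) + 1     # size after this insert
            actual = cost(n)
            coins.append(cost(n))  # deposit one coin worth cost(n)
            amortized = actual + cost(n)
        else:
            n = len(coins)         # size before this delete
            assert n > 0, "cannot delete from an empty structure"
            actual = cost(n)
            withdrawal = coins.pop()        # the coin of value cost(n)
            amortized = actual - withdrawal
            assert amortized == 0           # deletion is free in the amortized sense
        actual_total += actual
        amortized_total += amortized
    return actual_total, amortized_total
def random_ops(length, seed=0):
    """Random valid sequence that never deletes from an empty structure."""
    rng = random.Random(seed)
    ops, size = [], 0
    for _ in range(length):
        if size > 0 and rng.random() < 0.5:
            ops.append("delete")
            size -= 1
        else:
            ops.append("insert")
            size += 1
    return ops
actual, amortized = simulate(random_ops(10_000))
print(actual <= amortized)   # True: the gap is the value of the coins still in the bank
```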

00:21:46.870 --> 00:21:52.058
OK, let's use this same method
to analyze table doubling.

00:21:52.058 --> 00:21:53.766
We already know why
table doubling works,

00:21:53.766 --> 00:21:57.180
but good to think of it
from different perspectives.

00:22:06.530 --> 00:22:08.610
And it's particularly
fun to think of the coins

00:22:08.610 --> 00:22:11.100
as being physical objects
in the data structure.

00:22:11.100 --> 00:22:13.855
I always thought it would be fun
to put this in a programming

00:22:13.855 --> 00:22:15.230
language, but I
don't think there

00:22:15.230 --> 00:22:20.290
is a programming language that
has coins in it in this sense

00:22:20.290 --> 00:22:21.460
yet.

00:22:21.460 --> 00:22:22.790
Maybe you can fix that.

00:22:22.790 --> 00:22:25.030
So let's go back
to table doubling.

00:22:45.630 --> 00:22:50.060
Let's say when we insert
an item into a table,

00:22:50.060 --> 00:22:53.420
and here I'm just
going to do insertions.

00:22:53.420 --> 00:22:55.650
We'll worry about
deletions in a moment.

00:22:58.630 --> 00:23:00.300
Whenever I do an
insertion, I'm going

00:23:00.300 --> 00:23:07.440
to put a coin on that item,
and the value of the coin

00:23:07.440 --> 00:23:08.990
is going to be a constant.

00:23:08.990 --> 00:23:12.100
I'm going to give the constant a
name so we can be a little more

00:23:12.100 --> 00:23:15.910
precise in a moment-- c.

00:23:15.910 --> 00:23:23.240
So here's kind of the typical--
well, here's an array.

00:23:23.240 --> 00:23:25.420
We start with an
array of size 1,

00:23:25.420 --> 00:23:30.692
and we insert a single item
here, and we put a coin on it.

00:23:30.692 --> 00:23:36.196
Maybe I'll draw the coin in
a color, which I've lost.

00:23:36.196 --> 00:23:36.695
Here.

00:23:39.280 --> 00:23:45.540
So I insert some item x, and
I put a coin on that item.

00:23:48.942 --> 00:23:50.400
When I do the next
insertion, let's

00:23:50.400 --> 00:23:53.100
say I have to double
the table to size 2.

00:23:53.100 --> 00:23:55.240
I'm going to use
up that coin, so

00:23:55.240 --> 00:24:00.120
erase it, put a new coin on
the item that I just put down.

00:24:00.120 --> 00:24:02.160
Call it y.

00:24:02.160 --> 00:24:07.520
In general-- so the next time
I double, which is immediately,

00:24:07.520 --> 00:24:09.562
I'm going to go to size 4.

00:24:09.562 --> 00:24:15.550
I erase this coin,
then I put a coin here.

00:24:15.550 --> 00:24:19.660
When I insert an item-- of
course, the letter after z is w.

00:24:19.660 --> 00:24:23.670
Then I put another coin
when I have to double again,

00:24:23.670 --> 00:24:29.490
so here I'm going to use
these coins to charge

00:24:29.490 --> 00:24:33.060
for the doubling, and
then in the next round,

00:24:33.060 --> 00:24:34.900
I'm going to be
inserting here, here,

00:24:34.900 --> 00:24:38.720
and here, and I'll be
putting a coin here, here,

00:24:38.720 --> 00:24:40.110
here, and here.

00:24:40.110 --> 00:24:42.090
In general, you start
to see the pattern--

00:24:42.090 --> 00:24:45.640
so I used up these
guys-- that by the time

00:24:45.640 --> 00:24:49.580
I have to double again, half
of the items have coins,

00:24:49.580 --> 00:24:52.082
the other half don't,
because I already used them.

00:24:52.082 --> 00:24:54.540
You have to be careful not to
use a coin twice, because you

00:24:54.540 --> 00:24:56.840
only get to use it once.

00:24:56.840 --> 00:24:59.690
You can't divide money
into double money

00:24:59.690 --> 00:25:01.190
unless you're doing
stocks, I guess.

00:25:04.180 --> 00:25:07.060
As soon as I get to a place
where the array is completely

00:25:07.060 --> 00:25:09.840
full when n equals m, the
last half of the items

00:25:09.840 --> 00:25:10.570
will have coins.

00:25:10.570 --> 00:25:14.100
I'm going to use them in
order to pay for the doubling,

00:25:14.100 --> 00:25:19.930
so the number of coins
here will be n over 2.

00:25:19.930 --> 00:25:21.865
So this is why I wanted
to make this constant

00:25:21.865 --> 00:25:26.140
a little explicit, because
it has to be bigger than 2

00:25:26.140 --> 00:25:26.800
in some sense.

00:25:26.800 --> 00:25:31.300
However much work-- let's say
it takes a times n work in order

00:25:31.300 --> 00:25:33.420
to do doubling,
then this constant

00:25:33.420 --> 00:25:35.830
should be something
like two times a,

00:25:35.830 --> 00:25:38.200
because I need to do
the work to double,

00:25:38.200 --> 00:25:41.740
but I only have n over
2 coins to pay for it.

00:25:41.740 --> 00:25:44.640
I don't get coins over here.

00:25:44.640 --> 00:26:01.710
So when we double, the last
n over 2 items have coins,

00:26:01.710 --> 00:26:15.770
and so the amortized cost
of the doubling operation

00:26:15.770 --> 00:26:18.510
is going to be the
real cost, which

00:26:18.510 --> 00:26:23.130
is some theta n minus the
number of coins I can remove

00:26:23.130 --> 00:26:24.620
and their value.

00:26:24.620 --> 00:26:29.070
So it's going to be
minus c times n over 2

00:26:29.070 --> 00:26:35.110
and, the point is, this
is 0 if we set c large.

00:26:35.110 --> 00:26:36.890
It only has to be a constant.

00:26:36.890 --> 00:26:41.150
It needs to be bigger than
2 times that constant.

00:26:41.150 --> 00:26:43.217
And usually when you're
working with coins,

00:26:43.217 --> 00:26:45.050
you want to make the
constants explicit just

00:26:45.050 --> 00:26:47.640
to make sure there's no circular
dependence on constants,

00:26:47.640 --> 00:26:53.000
make sure there is a valid
choice of c that annihilates

00:26:53.000 --> 00:26:56.030
whatever cost you
want to get rid of.

00:26:56.030 --> 00:27:02.710
So this is the accounting
method view of table doubling.

00:27:02.710 --> 00:27:03.790
Any questions so far?

00:27:06.500 --> 00:27:07.870
So far so good.

00:27:07.870 --> 00:27:10.770
Pretty simple example.

00:27:10.770 --> 00:27:13.780
Let's get to more
interesting examples.

00:27:13.780 --> 00:27:16.130
You can also think about the
amortized cost of an insert.

00:27:16.130 --> 00:27:17.760
It costs constant real time.

00:27:17.760 --> 00:27:19.570
Actual cost is constant.

00:27:19.570 --> 00:27:21.359
You have to also
deposit one coin, which

00:27:21.359 --> 00:27:23.650
costs constant time so the
amortized cost of the insert

00:27:23.650 --> 00:27:25.850
is still constant.

00:27:25.850 --> 00:27:29.170
So that's good.

00:27:29.170 --> 00:27:32.170
Still we don't know how
to deal with deletions,

00:27:32.170 --> 00:27:36.160
but let me give you a kind
of reverse perspective

00:27:36.160 --> 00:27:39.751
on the accounting method.

00:27:43.600 --> 00:27:45.620
It's, again, equivalent
in a certain sense,

00:27:45.620 --> 00:27:48.560
but in another sense
may be more intuitive

00:27:48.560 --> 00:27:51.110
some of the time
for some people.

00:27:51.110 --> 00:27:53.110
It's actually not
in the textbook,

00:27:53.110 --> 00:27:55.950
but it's the one I
use the most so I

00:27:55.950 --> 00:27:58.350
figure it's worth teaching.

00:27:58.350 --> 00:27:59.718
It's called the charging method.

00:28:02.590 --> 00:28:06.510
It's also a little bit more
time travel-y, if you will,

00:28:06.510 --> 00:28:11.710
so if you like time travel,
this method is for you,

00:28:11.710 --> 00:28:14.680
or maybe a more
pessimistic view is blaming

00:28:14.680 --> 00:28:17.560
the past for your mistakes.

00:28:17.560 --> 00:28:23.060
So what we're going
to do is allow--

00:28:23.060 --> 00:28:26.660
there's no bank balance anymore,
although it's essentially

00:28:26.660 --> 00:28:27.540
there.

00:28:27.540 --> 00:28:35.090
We're going to allow operations
to charge some of their cost

00:28:35.090 --> 00:28:48.616
retroactively to the
past, not the future.

00:28:48.616 --> 00:28:50.500
I actually have a
data structures paper

00:28:50.500 --> 00:28:52.880
which proves that while
time travel to the past

00:28:52.880 --> 00:28:55.670
is plausible, time travel to the
future is not computationally.

00:28:55.670 --> 00:28:58.340
So you're not allowed to
time travel to the future,

00:28:58.340 --> 00:29:02.628
only allowed to go to the
past, and say, hey, give me $5.

00:29:05.710 --> 00:29:08.710
But you've got to be a little
bit conservative in how

00:29:08.710 --> 00:29:09.210
you do it.

00:29:09.210 --> 00:29:12.770
You can't just keep charging the
same operation a million times,

00:29:12.770 --> 00:29:15.450
because then the cost of
that operation is going up.

00:29:15.450 --> 00:29:17.110
At the end of the
day, every operation

00:29:17.110 --> 00:29:20.320
had to have paid for
its total charge.

00:29:20.320 --> 00:29:22.740
So there's the actual
cost, which it starts with,

00:29:22.740 --> 00:29:25.225
and then there's whatever it's
being charged by the future.

00:29:25.225 --> 00:29:26.850
So from an analysis
perspective, you're

00:29:26.850 --> 00:29:27.933
thinking about the future.

00:29:27.933 --> 00:29:31.650
What could
potentially charge me?

00:29:31.650 --> 00:29:38.830
Again, you can define the
amortized cost of an operation

00:29:38.830 --> 00:29:48.330
is going to be the actual
cost minus the total charge

00:29:48.330 --> 00:29:48.890
to the past.

00:29:52.600 --> 00:29:54.320
So when we charge
to the past, we

00:29:54.320 --> 00:29:58.450
get free dollars in
the present, but we

00:29:58.450 --> 00:30:00.624
have to pay for whatever
the future is going to do.

00:30:04.500 --> 00:30:06.730
So we have to imagine
how many times could I

00:30:06.730 --> 00:30:08.110
get charged in the future?

00:30:08.110 --> 00:30:10.680
I'm going to have to pay for
that now in a consistent time

00:30:10.680 --> 00:30:12.150
line.

00:30:12.150 --> 00:30:15.050
You will have to have paid for
things that come in the future.

00:30:17.690 --> 00:30:19.560
So let's do an example.

00:30:19.560 --> 00:30:21.685
Actually it sounds
crazy and weird,

00:30:21.685 --> 00:30:23.560
but I actually find this
a lot more intuitive

00:30:23.560 --> 00:30:26.405
to think about even
these very examples.

00:30:29.560 --> 00:30:31.370
Let's start with table doubling.

00:30:58.610 --> 00:31:01.780
So we have this kind
of picture already.

00:31:01.780 --> 00:31:05.200
It's going to be
pretty much the same.

00:31:05.200 --> 00:31:10.520
After I've doubled the
table, my array is half full

00:31:10.520 --> 00:31:14.015
and, again, insertion only,
although we'll do insertion

00:31:14.015 --> 00:31:16.080
and deletion in a moment.

00:31:16.080 --> 00:31:19.140
In order to get from half
full to completely full,

00:31:19.140 --> 00:31:23.074
I have to do n
over 2 insertions.

00:31:23.074 --> 00:31:26.290
It's looking very similar,
but what I'm going to say

00:31:26.290 --> 00:31:31.440
is that when I double
the array next time,

00:31:31.440 --> 00:31:35.716
I'm going to charge that
doubling to those operations.

00:31:39.340 --> 00:31:43.890
In general, you can actually say
this quite concisely-- whenever

00:31:43.890 --> 00:31:45.920
I do a doubling
operation, I'm going

00:31:45.920 --> 00:31:53.740
to charge it to
all the insertions

00:31:53.740 --> 00:31:54.948
since the last doubling.

00:32:01.560 --> 00:32:04.080
That's a very
clear set of items.

00:32:04.080 --> 00:32:06.910
Doublings happen, and then they
don't happen for a while, just

00:32:06.910 --> 00:32:08.326
all those insertions
that happened

00:32:08.326 --> 00:32:11.440
since the last doubling
charged to them.

00:32:11.440 --> 00:32:13.010
And how many are there?

00:32:13.010 --> 00:32:18.010
Well, as we've argued,
there are n over 2 of them,

00:32:18.010 --> 00:32:22.460
and the cost of-- in order
to make this doubling free,

00:32:22.460 --> 00:32:25.150
I need to charge theta n.

00:32:25.150 --> 00:32:29.660
So this doubling cost
theta n, but there's

00:32:29.660 --> 00:32:31.200
n over 2 things to charge to.

00:32:31.200 --> 00:32:33.580
I'm going to uniformly
distribute my charge

00:32:33.580 --> 00:32:37.510
to them, which
means I'm charging

00:32:37.510 --> 00:32:41.310
a constant amount to each.

00:32:41.310 --> 00:32:49.420
And the key fact here is that
I only charge an insert once.

00:32:49.420 --> 00:32:52.390
Because of this
"since" clause, I never

00:32:52.390 --> 00:32:54.620
will charge an
item twice as long

00:32:54.620 --> 00:32:56.180
as I'm only inserting for now.

00:33:02.140 --> 00:33:04.290
If you look over all
time, you will only

00:33:04.290 --> 00:33:05.534
charge an insert once.

00:33:08.200 --> 00:33:10.300
That's good, because
the inserts have

00:33:10.300 --> 00:33:12.440
to pay for their total
charge in the future.

00:33:12.440 --> 00:33:15.110
There's only one charge, and
it's only a constant amount,

00:33:15.110 --> 00:33:17.040
then amortized cost
of insert is still

00:33:17.040 --> 00:33:20.040
constant, amortized
cost of doubling is 0,

00:33:20.040 --> 00:33:23.970
because we charged the
entire cost to the past.

00:33:23.970 --> 00:33:27.000
So same example, but slightly
different perspective.
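
NOTE
Editor's sketch, not from the lecture: a small simulation of the charging
argument for insert-only table doubling. The function name, the starting
capacity of 1, and the unit cost of one per copied item are assumptions
made for illustration. Each doubling spreads its copy cost evenly over
the inserts since the previous doubling.
```python
def charge_doublings(num_inserts):
    """Return (charges, times_charged) for an insert-only doubling table."""
    capacity, size = 1, 0          # assumed initial table: one empty slot
    charges = []                   # charges[i]: total charged to insert i
    times_charged = []             # times_charged[i]: how often insert i was charged
    window_start = 0               # first insert since the last doubling
    for i in range(num_inserts):
        if size == capacity:
            cost = size            # doubling copies every stored item
            share = cost / (i - window_start)
            for j in range(window_start, i):
                charges[j] += share
                times_charged[j] += 1
            capacity *= 2
            window_start = i       # the "since the last doubling" window restarts
        charges.append(0.0)
        times_charged.append(0)
        size += 1
    return charges, times_charged
charges, times_charged = charge_doublings(1000)
print(max(times_charged), max(charges))
```
Under these assumptions no insert is charged more than once, and the
charge per insert stays bounded by a small constant.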

00:33:27.000 --> 00:33:35.770
Let's do a more interesting
example-- inserts and deletes

00:33:35.770 --> 00:33:36.435
in a table.

00:34:09.638 --> 00:34:14.070
Let's say I want to maintain
that the size of the table

00:34:14.070 --> 00:34:17.690
is always within a constant
factor of the number of items

00:34:17.690 --> 00:34:20.219
currently in the table.

00:34:20.219 --> 00:34:22.780
If I just want an upper bound,
then I only need to double,

00:34:22.780 --> 00:34:24.320
but if I want also
a lower bound--

00:34:24.320 --> 00:34:27.580
if I don't want the
table to be too empty,

00:34:27.580 --> 00:34:31.420
then I need to
add table halving.

00:34:31.420 --> 00:34:35.150
So what I'm going to do is
when the table is 100% full,

00:34:35.150 --> 00:34:42.540
I double its size, when
the table is 50% full,

00:34:42.540 --> 00:34:44.029
should I halve it in size?

00:34:44.029 --> 00:34:44.900
Would that work?

00:34:47.580 --> 00:34:49.610
No, because--

00:34:49.610 --> 00:34:53.610
AUDIENCE: [INAUDIBLE]
have to have it inserted

00:34:53.610 --> 00:34:56.400
in place of linear [INAUDIBLE].

00:34:56.400 --> 00:34:58.204
ERIK DEMAINE: Right.

00:34:58.204 --> 00:35:00.620
I can basically do insert,
delete, insert, delete, insert,

00:35:00.620 --> 00:35:03.000
delete, and every
single operation costs

00:35:03.000 --> 00:35:05.500
linear time, because maybe
I'm a little bit less

00:35:05.500 --> 00:35:10.530
than half full-- sorry, yeah, if
I'm a little bit less than half

00:35:10.530 --> 00:35:13.420
full, then I'm going to
shrink the array in half.

00:35:13.420 --> 00:35:18.190
Get rid of this part, then
if I immediately insert,

00:35:18.190 --> 00:35:19.780
it becomes 100% full again.

00:35:19.780 --> 00:35:21.900
I have to double in size,
and then if I delete,

00:35:21.900 --> 00:35:24.970
it becomes less than half full,
and I have to halve in size.

00:35:24.970 --> 00:35:27.220
Every operation would
cost linear time,

00:35:27.220 --> 00:35:29.050
so amortized cost
is linear time.

00:35:29.050 --> 00:35:30.700
That's not good.
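
NOTE
Editor's sketch, not from the lecture: counting copied items to compare a
table that halves at 50% full with one that halves at 25% full. The names
total_copy_cost and shrink_den, the "+"/"-" encoding of operations, and
the cost model (one unit per copied item) are hypothetical choices.
```python
def total_copy_cost(ops, shrink_den):
    """Items copied by resizes; halve once size <= capacity / shrink_den."""
    capacity, size, copies = 1, 0, 0
    for op in ops:
        if op == "+":
            size += 1
            if size == capacity:       # 100% full: double
                copies += size
                capacity *= 2
        else:
            size -= 1
            if capacity > 1 and shrink_den * size <= capacity:
                copies += size         # shrink: copy the survivors
                capacity //= 2
    return copies
m = 1024
fill = "+" * m        # leaves the table exactly half full
thrash = "-+" * m     # alternate delete and insert around the half-full mark
print(total_copy_cost(fill + thrash, shrink_den=2))   # halve at 50%
print(total_copy_cost(fill + thrash, shrink_den=4))   # halve at 25%
```
With the 50% rule every operation in the alternating phase triggers a
resize of about m items; with the 25% rule that phase triggers none.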

00:35:30.700 --> 00:35:37.020
So what I'll do is just separate
those constants a little bit.

00:35:37.020 --> 00:35:39.090
When I'm 100% full,
I will double.

00:35:39.090 --> 00:35:41.110
That seems pretty
clear, but let's say

00:35:41.110 --> 00:35:44.910
when I'm a quarter
full, then I will halve.

00:35:44.910 --> 00:35:52.736
Any value less than 50% would
work here, but-- just halve,

00:35:52.736 --> 00:35:55.290
like that.

00:35:55.290 --> 00:35:56.290
This will actually work.

00:35:56.290 --> 00:35:58.780
This will be constant
amortized per operation,

00:35:58.780 --> 00:36:01.999
but it's-- especially the
initial analysis we did

00:36:01.999 --> 00:36:03.790
of table doubling isn't
going to work here,

00:36:03.790 --> 00:36:05.200
because it's complicated.

00:36:05.200 --> 00:36:09.590
The thing's going to
shrink and grow over time.

00:36:09.590 --> 00:36:11.080
Just summing that is not easy.

00:36:11.080 --> 00:36:13.150
It depends on the
sequence of operations,

00:36:13.150 --> 00:36:16.250
but with charging
and also with coins,

00:36:16.250 --> 00:36:18.310
we could do it in
a pretty clean way.

00:36:18.310 --> 00:36:19.685
I'm going to do
it with charging.

00:36:23.740 --> 00:36:27.030
So this particular
choice of constants

00:36:27.030 --> 00:36:33.172
is nice, because when I double
a full array, it's half full,

00:36:33.172 --> 00:36:36.790
and also when I
have an array that's

00:36:36.790 --> 00:36:44.080
a quarter full, like this,
and then I divide it--

00:36:44.080 --> 00:36:48.710
and then I shrink it--
I get rid of this part,

00:36:48.710 --> 00:36:50.410
it's also half full.

00:36:50.410 --> 00:36:53.070
So whenever I do a
double or a halve,

00:36:53.070 --> 00:36:57.240
the new array is half full, 50%.

00:36:57.240 --> 00:36:57.980
That's nice.

00:37:04.150 --> 00:37:08.750
That's nice, because 50% is far
away from both 25% and 100%.

00:37:15.510 --> 00:37:20.330
So our nice state is right
after a doubling or a halve,

00:37:20.330 --> 00:37:22.740
then we know that
our structure is 50%.

00:37:22.740 --> 00:37:25.150
In order to get to an
under-flowing state

00:37:25.150 --> 00:37:28.490
where we have to halve, I have
to delete at least a quarter

00:37:28.490 --> 00:37:33.280
of the items, a quarter of m.

00:37:33.280 --> 00:37:37.430
In order to get to overflowing
where I have to double,

00:37:37.430 --> 00:37:40.630
I have to insert at
least m over 2 items.

00:37:40.630 --> 00:37:43.330
Either way, a constant
fraction times m, that's

00:37:43.330 --> 00:37:44.870
what I'm going to charge to.

00:37:44.870 --> 00:37:48.650
Now, to be clear,
when I'm 50% full,

00:37:48.650 --> 00:37:51.360
I might insert, delete, insert,
delete, many different inserts

00:37:51.360 --> 00:37:52.440
and deletes.

00:37:52.440 --> 00:37:54.180
At some point, one
of these two things

00:37:54.180 --> 00:37:56.050
is going to happen though.

00:37:56.050 --> 00:38:00.000
In order to get here, I have to
do at least m over 4 deletions.

00:38:00.000 --> 00:38:02.000
I might also do more
insertions and deletions,

00:38:02.000 --> 00:38:03.550
but I have to do
at least that many,

00:38:03.550 --> 00:38:05.790
and those are the ones
I'm going to charge to.

00:38:05.790 --> 00:38:16.740
So I'm going to charge
a halving operation

00:38:16.740 --> 00:38:24.660
to the at least m
over 4 deletions

00:38:24.660 --> 00:38:32.870
since the last resize of either
type, doubling or halving.

00:38:32.870 --> 00:38:37.930
And I'm going to
charge the doubling

00:38:37.930 --> 00:38:44.960
to the at least m
over 2 insertions

00:38:44.960 --> 00:38:46.136
since the last resize.

00:38:53.950 --> 00:38:56.670
OK, and that's it.

00:38:56.670 --> 00:38:59.680
Because the halving costs
theta m time, doubling costs

00:38:59.680 --> 00:39:03.930
theta m time, I have theta
m operations to charge to,

00:39:03.930 --> 00:39:07.730
so I'm only charging constant
for each of the operations.

00:39:07.730 --> 00:39:09.840
And because of this
"since last resize" clause,

00:39:09.840 --> 00:39:13.950
it's clear that you're never
charging an operation more

00:39:13.950 --> 00:39:18.720
than once, because
you can divide time

00:39:18.720 --> 00:39:21.540
by when the resizes happen,
grows or shrinks, halves

00:39:21.540 --> 00:39:22.720
or doubles.

00:39:22.720 --> 00:39:28.030
And each resize is only charging
to the past a window of time.

00:39:28.030 --> 00:39:30.510
So it's like you have epochs
of time, you separate them,

00:39:30.510 --> 00:39:33.570
you only charge
within your epoch.

00:39:33.570 --> 00:39:36.866
OK, so that's cool.

00:39:36.866 --> 00:39:38.240
So you only get
a constant number

00:39:38.240 --> 00:39:40.974
of charges per item
of a constant amount,

00:39:40.974 --> 00:39:42.390
therefore insertions
and deletions

00:39:42.390 --> 00:39:43.760
are constant amortized.

00:39:43.760 --> 00:39:47.660
Halving and doubling
is free amortized.
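
NOTE
Editor's sketch, not from the lecture: the charging scheme for a table
that doubles at 100% full and halves at 25% full, run on a random mix of
inserts and deletes. The function name, the seed, the 50/50 operation mix,
and the unit cost per copied item are all assumptions. A doubling is
charged to the inserts since the last resize, a halving to the deletes
since the last resize.
```python
import random
def simulate(num_ops, seed=0):
    """Return the amount charged to each operation by later resizes."""
    rng = random.Random(seed)
    capacity, size = 1, 0
    window = {"insert": [], "delete": []}   # op indices since the last resize
    charge = [0.0] * num_ops
    for i in range(num_ops):
        kind = "insert" if size == 0 or rng.random() < 0.5 else "delete"
        size += 1 if kind == "insert" else -1
        window[kind].append(i)
        grow = kind == "insert" and size == capacity
        shrink = kind == "delete" and capacity > 1 and 4 * size <= capacity
        if grow or shrink:
            # The resize copies `size` items; split that cost over the
            # same-kind operations in the current window.
            for j in window[kind]:
                charge[j] += size / len(window[kind])
            capacity = capacity * 2 if grow else capacity // 2
            window = {"insert": [], "delete": []}   # start a new window
    return charge
charge = simulate(100_000)
print(max(charge))
```
Because the windows are cleared at every resize, each operation can be
charged at most once, and in this model the charge never exceeds 2.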

00:39:47.660 --> 00:39:48.998
Clear?

00:39:48.998 --> 00:39:51.206
This is where amortization
starts to get interesting.

00:39:55.710 --> 00:40:00.070
You can also think of this
example in terms of coins,

00:40:00.070 --> 00:40:02.820
but with putting
coins on the items,

00:40:02.820 --> 00:40:05.950
but then you have to think
about the invariants of where

00:40:05.950 --> 00:40:08.070
the coins are, which I
find to be more work.

00:40:08.070 --> 00:40:09.630
We actually had
to do it up here.

00:40:09.630 --> 00:40:12.110
I was claiming the last
half of the items had coins.

00:40:12.110 --> 00:40:14.220
You have to prove that really.

00:40:14.220 --> 00:40:15.980
With this method, you don't.

00:40:15.980 --> 00:40:19.680
I mean, what you have to prove
is that there are enough things

00:40:19.680 --> 00:40:20.222
to charge to.

00:40:20.222 --> 00:40:22.346
We had to prove here that
there were n over 2 items

00:40:22.346 --> 00:40:22.910
to charge to.

00:40:22.910 --> 00:40:25.260
Kind of the same thing,
but it was very clear

00:40:25.260 --> 00:40:28.490
that you weren't charging to
the same thing more than once.

00:40:28.490 --> 00:40:30.840
You were never trying to
use a coin that wasn't there

00:40:30.840 --> 00:40:35.100
because of the "since" clause.

00:40:35.100 --> 00:40:36.000
To each their own.

00:40:36.000 --> 00:40:37.510
I think either way would work.

00:40:45.790 --> 00:40:47.430
I think I will
skip this example,

00:40:47.430 --> 00:40:49.980
but I'll just mention it.

00:40:49.980 --> 00:40:54.270
So for 2-3 trees, we
said deletions were free,

00:40:54.270 --> 00:40:56.610
and we did that with
the coin invariant,

00:40:56.610 --> 00:40:59.880
that there was one coin
of size log i for each i.

00:40:59.880 --> 00:41:02.410
You could instead say,
when I delete an item,

00:41:02.410 --> 00:41:04.810
I'm going to charge
it to the insert that

00:41:04.810 --> 00:41:10.900
made n this current value,
because that insert paid

00:41:10.900 --> 00:41:12.850
log n the actual
cost, so it can afford

00:41:12.850 --> 00:41:15.170
to pay another log n
to pay for the deletion

00:41:15.170 --> 00:41:19.320
of some other item, the one
we're currently deleting.

00:41:19.320 --> 00:41:22.410
And that works, that you don't
double charge to an insert,

00:41:22.410 --> 00:41:25.950
because you're
decreasing n right now.

00:41:25.950 --> 00:41:27.690
So for n to get up
to that value again,

00:41:27.690 --> 00:41:29.400
you would have had
to do another insert.

00:41:29.400 --> 00:41:33.570
So same thing, slightly
different perspective.

00:41:33.570 --> 00:41:35.950
Let's go to something
even more interesting

00:41:35.950 --> 00:41:39.850
and in some sense more powerful,
the last method on the list,

00:41:39.850 --> 00:41:41.251
which is the potential method.

00:42:05.310 --> 00:42:08.144
This is a good exercise in how
many ways can you skin a cat?

00:42:11.940 --> 00:42:18.030
So potential method, I like
to call it defining karma

00:42:18.030 --> 00:42:24.080
in a formal way, is more
like the accounting strategy.

00:42:24.080 --> 00:42:26.220
We're going to think
about there being a bank

00:42:26.220 --> 00:42:28.280
account with some
balance, but we're

00:42:28.280 --> 00:42:31.560
going to define that balance
as a function of the data

00:42:31.560 --> 00:42:33.670
structure state.

00:42:33.670 --> 00:42:46.330
So that's called the
potential function,

00:42:46.330 --> 00:42:49.570
but you can think of
it as a bank balance.

00:42:49.570 --> 00:42:52.243
You can think of it as
kinetic potential, I guess.

00:42:55.560 --> 00:42:56.407
Potential energy.

00:43:17.030 --> 00:43:21.790
Just like the bank account, we
want this function to always

00:43:21.790 --> 00:43:22.760
be non-negative.

00:43:22.760 --> 00:43:24.185
We'll also make it an integer.

00:43:26.774 --> 00:43:27.815
That would be convenient.

00:43:33.002 --> 00:43:34.460
The potential
function is basically

00:43:34.460 --> 00:43:39.240
trying to measure how bad is
the data structure right now?

00:43:39.240 --> 00:43:41.480
It's, again, like saving
up for a rainy day.

00:43:41.480 --> 00:43:44.120
We want that whenever we have
to do an expensive operation,

00:43:44.120 --> 00:43:48.970
like a double or halve, that
this potential has grown

00:43:48.970 --> 00:43:52.520
large enough that we
can charge that cost

00:43:52.520 --> 00:43:54.340
to a decrease in the potential.

00:43:54.340 --> 00:43:56.440
So it's like this is
storing up energy,

00:43:56.440 --> 00:43:59.420
and whenever we
have some free time,

00:43:59.420 --> 00:44:02.490
we'll give some of that time
to the potential function.

00:44:02.490 --> 00:44:05.720
It's just like the accounting
method, in a certain sense,

00:44:05.720 --> 00:44:07.400
but we're defining
things differently.

00:44:07.400 --> 00:44:10.370
Over here, we explicitly
said, hey look, I'm

00:44:10.370 --> 00:44:12.330
going to store some
credit right now.

00:44:12.330 --> 00:44:14.980
So we were basically
specifying the delta,

00:44:14.980 --> 00:44:17.480
and here we're saying I'm going
to consume some credit right

00:44:17.480 --> 00:44:18.180
now.

00:44:18.180 --> 00:44:21.590
Over here, we're going to
define this magical function

00:44:21.590 --> 00:44:22.560
of the current state.

00:44:22.560 --> 00:44:25.552
From that you can compute the
deltas, but also from here

00:44:25.552 --> 00:44:27.760
you can integrate and compute
the potential function.

00:44:27.760 --> 00:44:29.593
So they're interchangeable,
but usually it's

00:44:29.593 --> 00:44:32.360
easier to think about one
perspective or the other.

00:44:32.360 --> 00:44:34.352
Really often, you can
just look at what's

00:44:34.352 --> 00:44:36.060
going on with the data
structure and say,

00:44:36.060 --> 00:44:40.590
hey, you know, this aspect
of the data structure

00:44:40.590 --> 00:44:43.802
makes it bad, makes
costly operations,

00:44:43.802 --> 00:44:45.760
and you can just define
the potential function,

00:44:45.760 --> 00:44:48.560
then just check that it works.

00:44:48.560 --> 00:44:50.380
But it's a little
bit of black magic

00:44:50.380 --> 00:44:52.584
to come up with these
functions, so it depends how

00:44:52.584 --> 00:44:53.875
you like to think about things.

00:44:56.770 --> 00:45:01.040
So, as before, we can
define an amortized cost.

00:45:05.000 --> 00:45:10.377
It's going to be the
actual cost plus the change

00:45:10.377 --> 00:45:11.085
in the potential.

00:45:16.330 --> 00:45:21.540
So change of potential
is just the potential

00:45:21.540 --> 00:45:29.160
after the operation minus the
potential before the operation.

00:45:41.870 --> 00:45:48.100
I highlight that, and it's
kind of obvious from the way

00:45:48.100 --> 00:45:52.410
we set things up, but
what I care about is

00:45:52.410 --> 00:45:54.734
the sum of the amortized costs.

00:45:54.734 --> 00:45:56.400
I care about that,
because it's supposed

00:45:56.400 --> 00:46:00.630
to be an upper bound on the
sum of the actual costs.

00:46:00.630 --> 00:46:02.755
And if you just look
at what that sum is,

00:46:02.755 --> 00:46:04.380
on the right-hand
side I have actual

00:46:04.380 --> 00:46:08.640
cost plus the phi after
the operation minus the phi

00:46:08.640 --> 00:46:10.300
before the operation.

00:46:10.300 --> 00:46:15.100
If I add all those up,
this part telescopes

00:46:15.100 --> 00:46:17.440
or you get cancellation
from each term

00:46:17.440 --> 00:46:19.170
with the previous term.

00:46:19.170 --> 00:46:28.080
The sum of the
amortized costs is

00:46:28.080 --> 00:46:41.970
equal to the sum of the
actual costs plus phi

00:46:41.970 --> 00:46:47.640
at the end minus phi
at the beginning.
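
NOTE
Editor's sketch of the telescoping sum just described. The symbols are
assumed notation, not taken from the board: c_i is the actual cost of
operation i, \hat{c}_i its amortized cost, D_i the data structure state
after operation i, and k the number of operations.
```latex
\[
\sum_{i=1}^{k} \hat{c}_i
  = \sum_{i=1}^{k} \bigl( c_i + \Phi(D_i) - \Phi(D_{i-1}) \bigr)
  = \sum_{i=1}^{k} c_i + \Phi(D_k) - \Phi(D_0)
\]
```
Rearranged, the total actual cost equals the total amortized cost plus
\Phi(D_0) minus \Phi(D_k), which is why a nonnegative \Phi with
\Phi(D_0) = 0 makes the amortized total an upper bound.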

00:46:52.220 --> 00:46:54.400
So a slight catch with
the potential method.

00:46:54.400 --> 00:46:58.260
When you define things
this way, you also

00:46:58.260 --> 00:47:01.930
have to pay for phi
at the beginning,

00:47:01.930 --> 00:47:06.797
because we want the actual cost
to be, at most, amortized cost.

00:47:06.797 --> 00:47:08.880
So we need to take this
part and put it over here

00:47:08.880 --> 00:47:11.320
so it's, at most,
some of amortized cost

00:47:11.320 --> 00:47:12.650
plus phi of the beginning.

00:47:15.550 --> 00:47:18.645
This part becomes negative,
so we usually just ignore it.

00:47:18.645 --> 00:47:22.250
It can only help us.

00:47:22.250 --> 00:47:23.960
So when you define a
potential function,

00:47:23.960 --> 00:47:26.100
you'd really like it to
be 0 at the beginning.

00:47:34.150 --> 00:47:37.250
It's funny, but you pay
phi of the beginning state

00:47:37.250 --> 00:47:41.460
at the beginning of time, and
when you've done 0 operations,

00:47:41.460 --> 00:47:44.250
you really like
the cost to be 0,

00:47:44.250 --> 00:47:46.770
and you don't want to
have to have stored stuff

00:47:46.770 --> 00:47:52.560
in the bank, so this should
be a-- constant would probably

00:47:52.560 --> 00:47:54.940
be OK, or whatever the cost
of your first operation

00:47:54.940 --> 00:47:58.650
is but should be constant or 0.

00:47:58.650 --> 00:48:00.330
Usually we do this
by saying, look,

00:48:00.330 --> 00:48:02.980
let's start with an empty
structure and work from there.

00:48:02.980 --> 00:48:05.550
Usually phi of an
empty structure is 0,

00:48:05.550 --> 00:48:06.650
and all is well.

00:48:06.650 --> 00:48:09.640
So when you're defining things
with potential function,

00:48:09.640 --> 00:48:12.040
you have to be careful
about your initial state.

00:48:12.040 --> 00:48:13.665
You have to make sure
it's non-negative

00:48:13.665 --> 00:48:15.870
just like you did
over here, but you

00:48:15.870 --> 00:48:19.460
didn't have to worry about
this part over there.

00:48:19.460 --> 00:48:21.640
All this infrastructure,
what's it good for?

00:48:21.640 --> 00:48:24.700
Let's do some examples.

00:48:24.700 --> 00:48:27.307
These are going to be the
most interesting examples.

00:48:53.370 --> 00:48:55.780
A kind of classic
example of amortization

00:48:55.780 --> 00:48:58.780
is incrementing
a binary counter.

00:48:58.780 --> 00:49:04.650
So when you have some
binary value like this one

00:49:04.650 --> 00:49:14.181
and you increment
it, many bits change,

00:49:14.181 --> 00:49:15.680
but only a constant
number are going

00:49:15.680 --> 00:49:17.080
to change in an amortized sense.

00:49:17.080 --> 00:49:20.710
If I start with a 0
vector, 0 bit vector,

00:49:20.710 --> 00:49:23.640
and I increment-- well, the
very first increment costs 1,

00:49:23.640 --> 00:49:27.200
the next increment costs 2,
the next increment costs 1,

00:49:27.200 --> 00:49:32.150
next increment costs 3, then
1, then 2, then 1, then 4,

00:49:32.150 --> 00:49:34.160
then it's a fractal.

00:49:34.160 --> 00:49:37.280
But instead of thinking
about that fractal

00:49:37.280 --> 00:49:39.330
and working hard to
prove that summation

00:49:39.330 --> 00:49:45.650
is linear for n operations,
let's use the potential method.

00:49:45.650 --> 00:49:48.530
And the intuition here
is actually pretty easy,

00:49:48.530 --> 00:49:56.170
because an increment
has a very clear cost.

00:49:56.170 --> 00:50:00.560
It's just the number
of trailing 1s plus 1.

00:50:11.390 --> 00:50:15.070
That's what it is
in actual cost.

00:50:15.070 --> 00:50:18.030
We'd like that to be
constant so, intuitively,

00:50:18.030 --> 00:50:20.720
what is making an increment bad?

00:50:20.720 --> 00:50:22.090
If you had to name one thing?

00:50:27.970 --> 00:50:31.760
If I just look at a
configuration, is this bad?

00:50:31.760 --> 00:50:33.480
Is this bad?

00:50:33.480 --> 00:50:34.766
How bad is the configuration?

00:50:34.766 --> 00:50:35.266
Yeah.

00:50:35.266 --> 00:50:37.258
AUDIENCE: The more
trailing ones you have,

00:50:37.258 --> 00:50:38.260
the worse the state is?

00:50:38.260 --> 00:50:40.801
ERIK DEMAINE: The more trailing
ones, the worse the state is.

00:50:40.801 --> 00:50:42.460
So that's one
natural definition.

00:50:42.460 --> 00:50:44.137
Turns out, it won't work.

00:50:44.137 --> 00:50:44.720
Let's see why.

00:50:47.320 --> 00:50:49.110
I think here's an example.

00:50:51.790 --> 00:50:53.880
So near the end of
our increment stage,

00:50:53.880 --> 00:50:56.450
we have a whole bunch of
1s but no trailing 1s,

00:50:56.450 --> 00:50:58.540
number of trailing 1s is 0.

00:50:58.540 --> 00:51:03.670
If I do a single increment, now
the number of trailing 1s is n,

00:51:03.670 --> 00:51:06.640
so if you look at
the amortized cost,

00:51:06.640 --> 00:51:09.700
it's the actual cost
plus the change in phi

00:51:09.700 --> 00:51:11.889
and so I actually pay
n for that operation

00:51:11.889 --> 00:51:13.680
in the amortized sense,
and that's no good.

00:51:13.680 --> 00:51:17.430
We only want to pay constant,
but it's on the right track.

00:51:17.430 --> 00:51:21.140
So number of trailing 1s, it
is the natural thing to try,

00:51:21.140 --> 00:51:26.250
but it doesn't quite work
for our definition of phi.
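
NOTE
Editor's sketch, not from the lecture: checking the counterexample with
phi defined as the number of trailing 1s. The helper names and the choice
n = 16 are hypothetical. The counter holds n - 1 ones followed by a
single 0, so one increment flips one bit but raises phi from 0 to n.
```python
def trailing_ones(x):
    """Number of consecutive 1 bits at the low end of x."""
    count = 0
    while x & 1:
        x >>= 1
        count += 1
    return count
def amortized_increment(x, phi):
    """Actual cost (bits flipped) plus the change in phi for x -> x + 1."""
    actual = trailing_ones(x) + 1
    return actual + phi(x + 1) - phi(x)
n = 16
x = (1 << n) - 2          # binary: n - 1 ones, then a 0
print(amortized_increment(x, trailing_ones))
```
The printed amortized cost is n + 1 rather than a constant, which is the
failure described above.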

00:51:26.250 --> 00:51:28.629
Other ideas?

00:51:28.629 --> 00:51:29.128
Yeah.

00:51:29.128 --> 00:51:31.819
AUDIENCE: The total
number of [INAUDIBLE]

00:51:31.819 --> 00:51:33.360
ERIK DEMAINE: The
total number of 1s.

00:51:33.360 --> 00:51:34.270
Yeah.

00:51:34.270 --> 00:51:40.130
Let's define phi, could
be the number of 1 bits.

00:51:40.130 --> 00:51:45.500
That will work, but
you both get a Frisbee.

00:51:49.137 --> 00:51:50.220
AUDIENCE: Oh, [INAUDIBLE].

00:51:50.220 --> 00:51:51.325
ERIK DEMAINE: Sorry.

00:51:51.325 --> 00:51:52.158
Good thing I missed.

00:51:55.130 --> 00:51:56.260
Number of 1 bits.

00:51:56.260 --> 00:52:02.150
Intuitively, 1s are bad, and
this is a good definition,

00:52:02.150 --> 00:52:05.930
because when I increment
I only create one 1,

00:52:05.930 --> 00:52:09.560
so I'm not going to have this
issue that delta phi goes up

00:52:09.560 --> 00:52:11.720
by a lot-- sorry, that
phi goes up by a lot,

00:52:11.720 --> 00:52:14.484
that delta phi is really large.

00:52:14.484 --> 00:52:16.400
Because even in this
scenario, if I increment,

00:52:16.400 --> 00:52:17.780
I only add one 1.

00:52:17.780 --> 00:52:23.370
In this scenario, I destroy
three 1s and add one.

00:52:23.370 --> 00:52:31.070
In general, if there are,
let's say, t trailing 1 bits,

00:52:31.070 --> 00:52:46.660
then an increment destroys t 1
bits, and it creates one 1 bit.

00:52:51.330 --> 00:52:52.860
That's always what happens.

00:52:52.860 --> 00:52:56.430
T could be 0, and then I
have a net positive of 1,

00:52:56.430 --> 00:53:00.180
but most of the time actually
I destroy 1 bits-- well,

00:53:00.180 --> 00:53:02.420
more than half the
time I destroy 1 bits,

00:53:02.420 --> 00:53:04.730
and I just create a
single 1 bit, in terms

00:53:04.730 --> 00:53:06.560
of the total number of 1s.

00:53:06.560 --> 00:53:18.280
So the amortized cost
is the actual cost,

00:53:18.280 --> 00:53:20.867
which is this 1 plus t.

00:53:23.920 --> 00:53:28.480
I'm actually going to
remove the-- well, yeah.

00:53:28.480 --> 00:53:31.786
I'd like to remove
the big O if I could.

00:53:31.786 --> 00:53:34.870
I'm going to
count-- I want to be

00:53:34.870 --> 00:53:36.716
a little bit precise
about my counting,

00:53:36.716 --> 00:53:38.340
because I have to do
a minus sign here.

00:53:38.340 --> 00:53:41.060
If I just wrote minus
t, that doesn't quite

00:53:41.060 --> 00:53:42.960
work out, because
there's a constant here

00:53:42.960 --> 00:53:44.790
that I have to annihilate.

00:53:44.790 --> 00:53:47.150
If I count the number
of bits that change,

00:53:47.150 --> 00:53:51.090
then that's exactly 1
plus t in an increment.

00:53:51.090 --> 00:53:54.960
And now the change of potential
is that I decrease by t,

00:53:54.960 --> 00:54:01.440
I increase by 1, I get 0.

00:54:01.440 --> 00:54:05.952
That seems a little bit too
small, 0 time per operation.

00:54:05.952 --> 00:54:07.410
AUDIENCE: You're
adding a 1, you're

00:54:07.410 --> 00:54:08.660
not subtracting [INAUDIBLE].

00:54:08.660 --> 00:54:10.555
Sorry, you're not
subtracting [INAUDIBLE].

00:54:10.555 --> 00:54:12.099
Just subtracting something else.

00:54:12.099 --> 00:54:13.390
ERIK DEMAINE: Oh, right, sorry.

00:54:13.390 --> 00:54:14.400
That's 2.

00:54:14.400 --> 00:54:15.609
Thank you.

00:54:15.609 --> 00:54:16.900
I just can't do the arithmetic.

00:54:16.900 --> 00:54:21.960
I wrote everything correct, but
this is a plus 1 and a plus 1.

00:54:21.960 --> 00:54:24.080
T minus t is the key
part that cancels.

00:54:24.080 --> 00:54:27.970
Now, if you were measuring
running time instead

00:54:27.970 --> 00:54:29.470
of the number of
changed bits, you'd

00:54:29.470 --> 00:54:32.240
have to have a big O
here, and in that case

00:54:32.240 --> 00:54:35.760
you'd have to define phi to be
some constant times the number

00:54:35.760 --> 00:54:36.540
of 1 bits.

00:54:36.540 --> 00:54:38.290
So you could still set
that constant large

00:54:38.290 --> 00:54:41.856
enough so that this part,
which is multiplied by c,

00:54:41.856 --> 00:54:43.230
would annihilate
this part, which

00:54:43.230 --> 00:54:47.230
would have a big O. I guess
I'll write it in just for kicks

00:54:47.230 --> 00:54:48.630
so you've seen both versions.

00:54:48.630 --> 00:54:52.482
This would be minus c
times t plus 1 times c.

00:54:52.482 --> 00:54:53.690
So that would still work out.

00:54:53.690 --> 00:54:56.080
If you set c to the right
value, you will still get 2.
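
NOTE
Editor's sketch, not from the lecture: a numeric check of the binary
counter analysis with phi equal to the number of 1 bits. The function
names are hypothetical, the counter is modeled as an unbounded Python
integer, and cost is measured as the number of bits that change.
```python
def increment_cost(x):
    """Bits flipped by x -> x + 1: the trailing 1s plus the 0 that becomes 1."""
    return bin(x ^ (x + 1)).count("1")
def phi(x):
    """Potential: the number of 1 bits in x."""
    return bin(x).count("1")
costs = [increment_cost(x) for x in range(16)]
amortized = [increment_cost(x) + phi(x + 1) - phi(x) for x in range(1 << 12)]
print(costs[:8])          # the actual-cost pattern of the first increments
print(set(amortized))     # every amortized cost observed in the range
```
In this model the actual costs follow the 1, 2, 1, 3, 1, 2, 1, 4 pattern,
while actual cost plus the change in phi is the same constant for every
increment checked.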

00:54:58.760 --> 00:55:01.370
So binary counters,
constant amortized per operation.

00:55:01.370 --> 00:55:03.520
So I think this is
very clean, much easier

00:55:03.520 --> 00:55:05.697
than analyzing the
fractal of the costs.

00:55:05.697 --> 00:55:07.780
Now, binary counter with
increments and decrements,

00:55:07.780 --> 00:55:09.090
that doesn't work.

00:55:09.090 --> 00:55:11.100
There are other data
structures to do it,

00:55:11.100 --> 00:55:13.617
but that's for another class.

00:55:16.530 --> 00:55:19.370
Let's go back to 2-3 trees,
because I have more interesting

00:55:19.370 --> 00:55:20.590
things to say about them.

00:55:23.528 --> 00:55:25.027
Any questions about
binary counters?

00:55:28.440 --> 00:55:30.620
As you saw, it wasn't
totally easy to define

00:55:30.620 --> 00:55:32.840
a potential function,
but we're going

00:55:32.840 --> 00:55:35.450
to see-- if you see
enough examples, you

00:55:35.450 --> 00:55:38.930
get some intuition for
them, but it is probably

00:55:38.930 --> 00:55:42.400
the hardest method to use but
also kind of the most powerful.

00:55:42.400 --> 00:55:44.740
I would say all
hard amortizations

00:55:44.740 --> 00:55:46.640
use a potential function.

00:55:46.640 --> 00:55:48.730
That's just life.

00:55:48.730 --> 00:55:51.170
Finding them is tough.

00:55:51.170 --> 00:55:52.050
That's reality.

00:56:00.790 --> 00:56:04.160
I want to analyze insertions
only in 2-3 trees,

00:56:04.160 --> 00:56:06.430
then we'll do insertions
and deletions,

00:56:06.430 --> 00:56:15.710
and I want to count how
many splits in a 2-3 tree

00:56:15.710 --> 00:56:16.690
when I do an insertion.

00:56:22.560 --> 00:56:26.830
So remember, when you
insert into a 2-3 tree,

00:56:26.830 --> 00:56:29.310
you start at a leaf,
you insert a key there.

00:56:29.310 --> 00:56:31.160
If it's too big,
you split that node

00:56:31.160 --> 00:56:33.730
into two parts, which
causes an insert of a key

00:56:33.730 --> 00:56:35.470
into the parent.

00:56:35.470 --> 00:56:37.890
Then that might be too big,
and you split, and so on.

00:56:37.890 --> 00:56:39.535
So total number of
splits per insert?

00:56:42.530 --> 00:56:43.243
Upper bounds?

00:56:43.243 --> 00:56:44.097
AUDIENCE: Log n.

00:56:44.097 --> 00:56:44.930
ERIK DEMAINE: Log n.

00:56:44.930 --> 00:56:45.430
OK.

00:56:45.430 --> 00:56:47.230
Definitely log n
in the worst case.

00:56:47.230 --> 00:56:54.830
That's sort of the actual cost
but, as you may be guessing,

00:56:54.830 --> 00:57:04.110
I claim the amortized number
of splits is only constant,

00:57:04.110 --> 00:57:06.970
and first we'll prove
this with insertion only.

00:57:06.970 --> 00:57:08.950
With insertion and
deletion in a 2-3 tree,

00:57:08.950 --> 00:57:12.740
it's actually not true, but for
insertion only this is true.

00:57:12.740 --> 00:57:15.292
So let's prove it.

00:57:15.292 --> 00:57:23.090
A 2-3 tree, we have two types
of nodes, 2 nodes and 3 nodes.

00:57:23.090 --> 00:57:24.960
I'm counting the
number of children,

00:57:24.960 --> 00:57:26.720
not the number of
keys, which is one smaller

00:57:26.720 --> 00:57:28.770
than the number of children.

00:57:28.770 --> 00:57:33.140
Sorry, no vertical line there.

00:57:33.140 --> 00:57:35.880
This is just some key
x, some keys x and y.

00:57:40.700 --> 00:57:46.850
So when I insert
a key into a node,

00:57:46.850 --> 00:57:52.590
it momentarily becomes
a 4 node, you might say,

00:57:52.590 --> 00:57:55.860
which has three
keys, x, y, and z.

00:57:55.860 --> 00:58:01.680
So 4 node, it has four
children, hence the 4,

00:58:01.680 --> 00:58:10.790
and we split it into x and z.

00:58:10.790 --> 00:58:13.277
There's the four
children, same number,

00:58:13.277 --> 00:58:15.110
but now they're distributed
between x and z.

00:58:15.110 --> 00:58:20.120
And then y gets promoted
to the next level up,

00:58:20.120 --> 00:58:22.120
which allows us to have
two pointers to x and z.

00:58:22.120 --> 00:58:23.760
And that's how 2-3 trees work.

00:58:23.760 --> 00:58:26.110
That's how split works.

00:58:26.110 --> 00:58:30.030
Now, I want to say
that splitting-- I

00:58:30.030 --> 00:58:33.035
want to charge the splitting
to something, intuitively.

00:58:38.070 --> 00:58:40.680
Let's say y was the
key that was inserted,

00:58:40.680 --> 00:58:46.650
so we started with x
z, which was a 3 node.

00:58:46.650 --> 00:58:51.380
When we did an insert,
it became a 4 node,

00:58:51.380 --> 00:58:55.530
and then we did a
split, which left us

00:58:55.530 --> 00:59:00.010
with two 2 nodes and something.

00:59:03.510 --> 00:59:05.917
So what can you say
overall about this process?

00:59:08.440 --> 00:59:09.940
What's making this example bad?

00:59:09.940 --> 00:59:12.510
What's making the split
happen, in some sense?

00:59:12.510 --> 00:59:15.519
I mean, the insert is
one thing, but there's

00:59:15.519 --> 00:59:16.810
another thing we can charge to.

00:59:19.081 --> 00:59:21.580
Insert's not enough, because
we're going to do log n splits,

00:59:21.580 --> 00:59:23.140
and we can only
charge to the insert

00:59:23.140 --> 00:59:25.141
once if we want a constant
amortized bound.

00:59:36.150 --> 00:59:36.884
Yeah?

00:59:36.884 --> 00:59:38.050
AUDIENCE: Number of 3 nodes?

00:59:38.050 --> 00:59:40.040
ERIK DEMAINE: Number
of 3 nodes, exactly.

00:59:40.040 --> 00:59:46.760
That's a good
potential function,

00:59:46.760 --> 00:59:50.122
because on the left side of
this picture, we had one 3 node.

00:59:50.122 --> 00:59:52.330
On the right side of the
picture, we had two 2 nodes.

00:59:52.330 --> 00:59:53.840
Now, what's happening
to the parent?

00:59:53.840 --> 00:59:55.730
We'll have to worry
about that in a moment,

00:59:55.730 --> 00:59:57.250
but you've got the intuition.

00:59:59.970 --> 01:00:01.890
Number of 3 nodes.

01:00:17.870 --> 01:00:19.680
I looked at just a
single operation here,

01:00:19.680 --> 01:00:23.100
but if you look more generally
at an expensive insert,

01:00:23.100 --> 01:00:27.970
in that it does many splits,
the only way that can happen

01:00:27.970 --> 01:00:36.950
is if you had a chain of 3 nodes
all connected to each other

01:00:36.950 --> 01:00:38.490
and you do an insert down here.

01:00:38.490 --> 01:00:41.830
This one splits, then this one
splits, then this one splits.

01:00:41.830 --> 01:00:45.440
So there are all these 3
nodes just hanging around,

01:00:45.440 --> 01:00:50.170
and after you do the split,
the parent of the very

01:00:50.170 --> 01:00:53.770
last node that splits,
that might become a 3 node.

01:00:53.770 --> 01:00:55.540
So that will be
up here somewhere.

01:00:55.540 --> 01:00:57.660
You might have made
one new 3 node,

01:00:57.660 --> 01:01:01.010
but then this one is
a couple of 2 nodes,

01:01:01.010 --> 01:01:03.130
this becomes a
couple of 2 nodes,

01:01:03.130 --> 01:01:05.420
and this becomes a
couple of 2 nodes.

01:01:05.420 --> 01:01:10.790
So if you had k 3 nodes before,
afterwards you have one.

01:01:14.680 --> 01:01:16.540
Sound familiar?

01:01:16.540 --> 01:01:18.240
This is actually
exactly what's going on

01:01:18.240 --> 01:01:21.820
with the binary counter, so this
may seem like a toy example,

01:01:21.820 --> 01:01:26.730
but over here we created,
at most, one 1 bit.

01:01:26.730 --> 01:01:32.890
Down here we create,
at most, one 3 node,

01:01:32.890 --> 01:01:34.300
which is when the split stops.

01:01:34.300 --> 01:01:36.050
When the split stops,
that's the only time

01:01:36.050 --> 01:01:39.910
we actually insert a key into
a node and it doesn't split,

01:01:39.910 --> 01:01:41.460
because otherwise you split.

01:01:41.460 --> 01:01:44.140
When you split, you're
always making two nodes,

01:01:44.140 --> 01:01:45.544
and that's good.

01:01:45.544 --> 01:01:47.210
At the very end when
you stop splitting,

01:01:47.210 --> 01:01:49.320
you might have made one 3 node.

01:01:49.320 --> 01:02:02.240
So in an insert, let's say
the number of splits equals k,

01:02:02.240 --> 01:02:05.930
then the change of
potential for that operation

01:02:05.930 --> 01:02:15.600
is minus k plus 1,
because for every split

01:02:15.600 --> 01:02:18.900
there was a 3 node to charge
to-- or for every split

01:02:18.900 --> 01:02:23.624
there was a 3 node that
became two nodes, two 2 nodes.

01:02:23.624 --> 01:02:25.040
So the potential
went down by one,

01:02:25.040 --> 01:02:27.416
because you used to have
one 3 node, then you had 0.

01:02:27.416 --> 01:02:29.290
At the very end, you
might create one 3 node.

01:02:29.290 --> 01:02:31.840
That's the plus 1.

01:02:31.840 --> 01:02:35.390
So the amortized cost is just
the sum of these two things,

01:02:35.390 --> 01:02:36.390
and we get 1.

01:02:41.920 --> 01:02:46.163
That's k minus k
plus 1 which is 1.
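
NOTE
A minimal sketch that checks this bound empirically, assuming an insert-only
2-3 tree built up from an empty tree with bottom-up splits. The names Node,
insert, count_3_nodes and run are hypothetical and not from the lecture. A leaf
holding two keys is counted as a 3 node here.
```python
import random
MAX_KEYS = 2  # a 3 node holds two keys; three keys is a temporary 4 node
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # an empty list marks a leaf
def insert(root, key):
    """Insert key bottom-up; return (new_root, number_of_splits)."""
    path, node = [], root
    while node.children:  # walk down, remembering which child we took
        i = sum(k < key for k in node.keys)
        path.append((node, i))
        node = node.children[i]
    node.keys.append(key)
    node.keys.sort()
    splits = 0
    while len(node.keys) > MAX_KEYS:  # overfull: split and promote the middle key
        mid = len(node.keys) // 2
        promoted = node.keys[mid]
        right = Node(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
        splits += 1
        if path:
            parent, i = path.pop()
            parent.keys.insert(i, promoted)
            parent.children.insert(i + 1, right)
            node = parent
        else:  # the root itself split, so the tree grows by one level
            root = node = Node([promoted], [node, right])
    return root, splits
def count_3_nodes(node):
    """Potential function: number of nodes holding exactly two keys."""
    return (len(node.keys) == 2) + sum(count_3_nodes(c) for c in node.children)
def run(n, seed=0):
    """Insert n distinct random keys; return (total_splits, worst_amortized)."""
    rng = random.Random(seed)
    root, total, worst, phi = Node([]), 0, 0, 0
    for key in rng.sample(range(10 * n), n):
        root, k = insert(root, key)
        new_phi = count_3_nodes(root)
        worst = max(worst, k + new_phi - phi)  # actual splits + change in potential
        total, phi = total + k, new_phi
    return total, worst
```
Calling run(1000) returns the total number of splits and the largest value of
splits plus change in potential seen on any single insert, which the argument
above says should never exceed 1.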

01:02:48.930 --> 01:02:50.260
Cool, huh?

01:02:50.260 --> 01:02:54.070
This is where the potential method
becomes powerful, I would say.

01:02:54.070 --> 01:02:58.190
You can view this as a
kind of charging argument,

01:02:58.190 --> 01:03:00.070
but it gets very confusing.

01:03:00.070 --> 01:03:02.310
Maybe with coins is
the most plausible use.

01:03:02.310 --> 01:03:04.010
Essentially, the
invariant you'd want

01:03:04.010 --> 01:03:06.320
is that you have a
coin on every 3 node.

01:03:06.320 --> 01:03:08.906
Same thing, of course,
but it's I think easier

01:03:08.906 --> 01:03:10.030
to think about it this way.

01:03:10.030 --> 01:03:12.390
Say, well, 3 nodes seem
to be the bad thing.

01:03:12.390 --> 01:03:15.259
Let's just count them,
let's just see what happens.

01:03:15.259 --> 01:03:16.800
It's more like you
say I want to have

01:03:16.800 --> 01:03:18.925
this invariant that there's
a coin on every 3 node.

01:03:18.925 --> 01:03:21.480
How can I achieve that?

01:03:21.480 --> 01:03:26.061
And it just works magically,
because A, it helps it was true

01:03:26.061 --> 01:03:28.560
and, B, we had to come up with
the right potential function.

01:03:28.560 --> 01:03:30.851
And those are tricky and, in
general with amortization,

01:03:30.851 --> 01:03:34.642
unless you're told on a p
set to prove order T amortized,

01:03:34.642 --> 01:03:36.850
you don't always know what
the right running time is,

01:03:36.850 --> 01:03:38.183
and you just have to experiment.

01:03:42.190 --> 01:03:47.130
Our final example,
most impressive.

01:03:47.130 --> 01:03:48.501
Let's go over here.

01:03:52.409 --> 01:03:53.450
It's a surprise, I guess.

01:03:53.450 --> 01:03:54.721
It's not even on the list.

01:03:57.610 --> 01:04:08.320
I want to do-- this
is great for inserts,

01:04:08.320 --> 01:04:09.450
but what about deletes?

01:04:09.450 --> 01:04:10.946
I want to do
inserts and deletes.

01:04:21.810 --> 01:04:27.860
I'd like to do 2-3 trees,
but 2-3 trees don't work.

01:04:27.860 --> 01:04:30.320
If I want to get a constant
amortized bound for inserts

01:04:30.320 --> 01:04:32.690
and deletes-- I've got
constant amortized here

01:04:32.690 --> 01:04:35.590
for inserts-- I should be clear.

01:04:35.590 --> 01:04:37.240
I'm ignoring the
cost of searching.

01:04:37.240 --> 01:04:39.340
Let's just say searching
is cheap for some reason.

01:04:39.340 --> 01:04:41.300
Maybe you already know
where your key is,

01:04:41.300 --> 01:04:43.460
and you just want
to insert there.

01:04:43.460 --> 01:04:47.420
Then insert only costs constant
amortized in a 2-3 tree.

01:04:47.420 --> 01:04:50.960
Insert and delete
is not that good.

01:04:50.960 --> 01:04:53.470
It can be log n
for every operation

01:04:53.470 --> 01:04:57.160
if I do inserts and deletes,
essentially for the same reason

01:04:57.160 --> 01:05:01.940
that a binary counter can
be n for every operation

01:05:01.940 --> 01:05:04.880
if I do increments
and decrements.

01:05:04.880 --> 01:05:08.120
I could be here,
increment a couple times,

01:05:08.120 --> 01:05:09.790
and then I change a
huge number of bits.

01:05:09.790 --> 01:05:12.740
If I immediately decrement,
then all the bits go back.

01:05:12.740 --> 01:05:14.310
Increment, all
the bits go back.

01:05:14.310 --> 01:05:15.643
Decrement, all the bits go back.

01:05:15.643 --> 01:05:17.840
So I'm changing n
bits in every operation.
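
NOTE
A small sketch of this bad sequence, assuming the counter starts at all 1s and
that the cost of an operation is the number of bits that change. The function
names bit_flips and alternating_cost are hypothetical.
```python
def bit_flips(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")
def alternating_cost(n_bits, rounds):
    """Total flips for `rounds` increment/decrement pairs starting at all 1s."""
    x = (1 << n_bits) - 1  # n_bits ones: the worst starting point
    total = 0
    for _ in range(rounds):
        total += bit_flips(x, x + 1)  # increment: every 1 clears, a new top bit sets
        x += 1
        total += bit_flips(x, x - 1)  # decrement: undoes exactly those flips
        x -= 1
    return total
```
Every operation in this sequence flips n_bits + 1 bits, so the average cost
per operation grows with the size of the counter instead of staying constant.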

01:05:17.840 --> 01:05:22.630
In the same way, if you just
think of one path of your tree,

01:05:22.630 --> 01:05:25.740
and you think of the
0 bits as 2 nodes

01:05:25.740 --> 01:05:29.710
and the 1 bits as 3
nodes, when I increment

01:05:29.710 --> 01:05:31.880
by inserting at the
bottom, all those 3s

01:05:31.880 --> 01:05:33.980
turn to 2s, except
at the top I make a 3.

01:05:33.980 --> 01:05:35.400
That's just like
a binary counter.

01:05:35.400 --> 01:05:40.150
It went from all
1s to 1 0 0 0 0 0,

01:05:40.150 --> 01:05:42.140
and then if I
decrement, if I delete

01:05:42.140 --> 01:05:44.410
from that very
same leaf, then I'm

01:05:44.410 --> 01:05:46.640
going to have to do
merges all the way back up

01:05:46.640 --> 01:05:48.720
and turn those all back
into 3 nodes again.

01:05:48.720 --> 01:05:50.860
And so every operation
is going to pay log n.

01:05:50.860 --> 01:05:54.810
Log n's, not so bad, but
I really want constant.

01:05:54.810 --> 01:06:03.440
So I'm going to introduce
something new called 2-5 trees,

01:06:03.440 --> 01:06:06.900
and it's going to be exactly
like B-trees that you learned,

01:06:06.900 --> 01:06:12.140
except now the number of
children of every node

01:06:12.140 --> 01:06:14.620
should be between 2 and 5.

01:06:17.950 --> 01:06:20.730
All the operations
are defined the same.

01:06:20.730 --> 01:06:22.720
We've already
talked about insert.

01:06:22.720 --> 01:06:25.440
Now insert-- when you
have six children,

01:06:25.440 --> 01:06:27.200
then you're
overflowing, and then

01:06:27.200 --> 01:06:28.920
you're going to split
in half and so on.

01:06:28.920 --> 01:06:30.544
So actually I should
draw that picture,

01:06:30.544 --> 01:06:32.410
because we're going to need it.

01:06:32.410 --> 01:06:39.340
So if I started with a 5 node,
which means it has four keys,

01:06:39.340 --> 01:06:42.268
and then I insert into
it, I get a 6 node.

01:06:47.460 --> 01:06:48.596
That's too many.

01:06:51.490 --> 01:06:52.572
Six children.

01:06:55.110 --> 01:06:57.070
OK, that's too
much, so I'm going

01:06:57.070 --> 01:06:59.380
to split it in
half, which is going

01:06:59.380 --> 01:07:06.520
to leave a 3 node
and a single item,

01:07:06.520 --> 01:07:14.860
which gets promoted to the
parent, and another 3 node.

01:07:14.860 --> 01:07:19.770
OK, so we started with a 5 node,
and the result was two 3 nodes.

01:07:19.770 --> 01:07:23.730
OK, that split, and
we also contaminate

01:07:23.730 --> 01:07:25.380
the parent a little
bit, but that

01:07:25.380 --> 01:07:28.030
may lead to another split,
which will look like this again.

01:07:28.030 --> 01:07:29.010
So if we're just
doing insertions,

01:07:29.010 --> 01:07:31.400
fine, we just count the number
of 5 nodes, no different,

01:07:31.400 --> 01:07:32.180
right?

01:07:32.180 --> 01:07:36.250
But I want to do simultaneously
insert and delete.

01:07:36.250 --> 01:07:42.470
So let's remember what
happens with a delete.

01:07:42.470 --> 01:07:45.530
So if you just delete
a key in a leaf,

01:07:45.530 --> 01:07:49.160
the issue is it may
become too empty.

01:07:49.160 --> 01:07:53.020
So what's too empty?

01:07:53.020 --> 01:07:55.530
Well, the minimum
number of children

01:07:55.530 --> 01:07:57.380
we're allowed to have
is two, so too empty

01:07:57.380 --> 01:08:00.300
would be that I have one child.

01:08:00.300 --> 01:08:03.660
So maybe initially
I have two children,

01:08:03.660 --> 01:08:06.310
and I have a single key
x, then maybe I delete

01:08:06.310 --> 01:08:11.090
x, and so now I have 0 keys.

01:08:11.090 --> 01:08:12.720
This is a 1 node.

01:08:12.720 --> 01:08:15.180
It has a single child.

01:08:15.180 --> 01:08:16.520
OK.

01:08:16.520 --> 01:08:18.680
Weird.

01:08:18.680 --> 01:08:23.750
In that case, there are
sort of two situations.

01:08:23.750 --> 01:08:26.880
Maybe your sibling
has enough keys

01:08:26.880 --> 01:08:31.210
that you can just steal one,
then that was really cheap.

01:08:31.210 --> 01:08:37.510
But the other case
is that you-- yeah.

01:08:37.510 --> 01:08:40.460
I'm also going to have
to involve my parent,

01:08:40.460 --> 01:08:44.439
so maybe I'm going
to take a key from x

01:08:44.439 --> 01:08:47.760
and merge all these
things together.

01:08:47.760 --> 01:08:53.479
So that's y, then
what I get is an x y.

01:08:53.479 --> 01:08:58.461
I had two children here,
three children here.

01:08:58.461 --> 01:08:58.960
OK.

01:08:58.960 --> 01:09:00.760
Also messed up my
parent a little bit,

01:09:00.760 --> 01:09:03.550
but that's going to
be the recursive case.

01:09:03.550 --> 01:09:05.029
This is a sort of
merge operation.

01:09:05.029 --> 01:09:07.330
In general, I merge
with my sibling

01:09:07.330 --> 01:09:09.202
and then potentially
split again,

01:09:09.202 --> 01:09:11.410
or you can think of it as
stealing from your sibling,

01:09:11.410 --> 01:09:13.257
as you may be
experienced with doing.

01:09:13.257 --> 01:09:15.340
I don't have siblings, so
I didn't get to do that,

01:09:15.340 --> 01:09:17.649
but I stole from my
parents, so whatever.

01:09:17.649 --> 01:09:19.490
However you want
to think about it,

01:09:19.490 --> 01:09:22.689
that is merging in a B-tree.

01:09:22.689 --> 01:09:24.990
We started with a 2 node here.

01:09:24.990 --> 01:09:28.260
We ended up with a 3 node.

01:09:28.260 --> 01:09:30.560
Hmm, that's good.

01:09:30.560 --> 01:09:32.200
It's different at least.

01:09:32.200 --> 01:09:36.310
So the bad case
here is a 5 node,

01:09:36.310 --> 01:09:38.120
bad case here is a 2 node.

01:09:38.120 --> 01:09:39.936
What should I use as a
potential function?

01:10:02.160 --> 01:10:03.098
Yeah.

01:10:03.098 --> 01:10:04.974
AUDIENCE: Number of
nodes with two children

01:10:04.974 --> 01:10:06.827
and number of nodes
with five children?

01:10:06.827 --> 01:10:08.410
ERIK DEMAINE: Number
of nodes with two

01:10:08.410 --> 01:10:10.100
or five children, yeah.

01:10:10.100 --> 01:10:11.420
So that's it.

01:10:11.420 --> 01:10:13.290
Just combine with the sum.

01:10:13.290 --> 01:10:24.870
That's going to be the number
of nodes with two children

01:10:24.870 --> 01:10:31.870
plus the number of nodes
with five children.

01:10:35.690 --> 01:10:37.020
This is measuring karma.

01:10:37.020 --> 01:10:40.520
This is how bad is
my tree going to be,

01:10:40.520 --> 01:10:43.370
because if I have 2 nodes, I'm
really close to underflowing

01:10:43.370 --> 01:10:44.640
and that's potentially bad.

01:10:44.640 --> 01:10:49.140
If I happen to delete there,
bad things are going to happen.

01:10:49.140 --> 01:10:52.110
If I have a bunch of 5 nodes,
splits could happen there,

01:10:52.110 --> 01:10:53.540
and I don't know whether
it's going to be an insert

01:10:53.540 --> 01:10:54.520
or delete next,
so I'm just going

01:10:54.520 --> 01:10:56.110
to keep track of both of them.

01:10:56.110 --> 01:10:59.690
And luckily neither of
them output 5s or 2s.

01:10:59.690 --> 01:11:01.940
If they did, like
if we did 2-3 trees,

01:11:01.940 --> 01:11:03.574
this is a total
nightmare, because you

01:11:03.574 --> 01:11:05.990
can't count the number of 2
nodes plus the number of 3 nodes.

01:11:05.990 --> 01:11:07.760
That's all the nodes.

01:11:07.760 --> 01:11:10.510
Potential only changes
by 1 in each step.

01:11:10.510 --> 01:11:12.090
That would never help you.

01:11:12.090 --> 01:11:12.590
OK?

01:11:12.590 --> 01:11:15.880
But here we have enough
of a gap between the lower

01:11:15.880 --> 01:11:19.070
bound and the upper bound and,
in general, any constants here

01:11:19.070 --> 01:11:20.430
will work.

01:11:20.430 --> 01:11:22.560
These are usually
called (a,b)-trees,

01:11:22.560 --> 01:11:24.080
a generalization of
B-trees, where you

01:11:24.080 --> 01:11:26.880
get to specify the lower
bound and the upper bound, as

01:11:26.880 --> 01:11:31.210
long as a-- what's
the way-- as long as a

01:11:31.210 --> 01:11:35.620
is strictly less than b over 2,
then this argument will work.

01:11:35.620 --> 01:11:41.090
As long as there's at least
one gap between a and b over 2,

01:11:41.090 --> 01:11:46.750
then this argument will work,
because in the small case,

01:11:46.750 --> 01:11:49.840
you start with the
minimum number of children

01:11:49.840 --> 01:11:50.850
you can have.

01:11:50.850 --> 01:11:55.830
You'll get one more in the end,
and in the other situation,

01:11:55.830 --> 01:11:59.100
you have too many
things, you divide by 2,

01:11:59.100 --> 01:12:00.960
and you don't want
dividing by 2 to end up

01:12:00.960 --> 01:12:02.580
with the bad case over here.

01:12:02.580 --> 01:12:04.850
That's what happened
even with 2-4 trees--

01:12:04.850 --> 01:12:09.300
2-3-4 trees-- but with 2-5
trees, there's enough of a gap

01:12:09.300 --> 01:12:13.550
that when we split 5 in
half, we get 3s only, no 2s,

01:12:13.550 --> 01:12:19.450
and when we merge
2s, we get 3s, no 5s.
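
NOTE
A sketch of the gap condition as a quick check. It assumes, as a modeling
choice not spelled out in the lecture, that an overflowing node has b + 1
children and is cut in half, and that a merge joins an underfull node with
a - 1 children to a sibling with exactly a children. The function names are
hypothetical.
```python
def split_sizes(b):
    """Child counts of the two halves after a node overflows to b + 1 children."""
    return (b + 1) // 2, (b + 2) // 2
def merge_size(a):
    """Child count after a node with a - 1 children merges with a sibling with a children."""
    return 2 * a - 1
def has_gap(a, b):
    """True if neither a split nor a merge produces a node with a or b children."""
    return all(a < size < b for size in (*split_sizes(b), merge_size(a)))
```
Under these assumptions has_gap(2, 5) holds, while has_gap(2, 3) and
has_gap(2, 4) do not, matching the remark that 2-3 and 2-3-4 trees fail where
2-5 trees succeed.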

01:12:19.450 --> 01:12:24.550
So in either case,
if we do the split--

01:12:24.550 --> 01:12:36.710
if we do an insert with k
splits, the change in potential

01:12:36.710 --> 01:12:39.610
is minus k plus 1.

01:12:39.610 --> 01:12:44.930
Again, we might make a single
five-child node at the top when

01:12:44.930 --> 01:12:50.140
we stop splitting, but
every time we split,

01:12:50.140 --> 01:12:52.850
we've taken a 5 node
and destroyed it,

01:12:52.850 --> 01:12:55.850
left it with two 3 nodes,
so that decreases by k,

01:12:55.850 --> 01:12:59.040
and so this k cost gets
cancelled out by this negative

01:12:59.040 --> 01:13:02.200
k in the change in potential, so the
amortized cost is 1 just like

01:13:02.200 --> 01:13:03.640
before.

01:13:03.640 --> 01:13:12.860
But now, also with delete,
with k merge operations

01:13:12.860 --> 01:13:16.820
where I'm treating all
of this as one operation,

01:13:16.820 --> 01:13:22.320
again, the change of
potential is minus k plus 1.

01:13:22.320 --> 01:13:24.890
Potentially when
we stop merging,

01:13:24.890 --> 01:13:28.400
because we stole one
key from our parent,

01:13:28.400 --> 01:13:32.037
it may now be a 2 node,
whereas before it wasn't.

01:13:32.037 --> 01:13:34.495
If it was already a 2 node,
then it would be another merge,

01:13:34.495 --> 01:13:36.135
and that's actually
a good case for us,

01:13:36.135 --> 01:13:38.830
but when the merges
stop, they stop

01:13:38.830 --> 01:13:41.292
because we hit a node
that's at least a 3 node,

01:13:41.292 --> 01:13:43.750
then we delete a key from it,
so potentially it's a 2 node.

01:13:43.750 --> 01:13:46.530
So potentially the
potential goes up by 1.

01:13:46.530 --> 01:13:50.500
We make one new bad
node, but every time

01:13:50.500 --> 01:13:52.590
we do a merge, we
destroy bad nodes,

01:13:52.590 --> 01:13:54.140
because we started
with a 2 node,

01:13:54.140 --> 01:13:56.370
we turned it into a 3 node.

01:13:56.370 --> 01:13:59.230
So, again, the amortized
cost is the actual cost,

01:13:59.230 --> 01:14:01.850
which is k, plus the
change in potential,

01:14:01.850 --> 01:14:06.370
which is minus k plus 1, and so
the amortized cost is just 1.

01:14:06.370 --> 01:14:10.620
Constant number of splits or
merges per insert or delete.
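
NOTE
The two cases can be written out together, as a sketch. Here k is the number
of splits for an insert or the number of merges for a delete, and the
inequality covers the case where no new 2 node or 5 node appears when the
chain of splits or merges stops.
```latex
\begin{align*}
\Phi &= \#\{\text{nodes with 2 children}\} + \#\{\text{nodes with 5 children}\} \\
\Delta\Phi &\le -k + 1 \\
\hat{c} &= k + \Delta\Phi \le k - k + 1 = 1
\end{align*}
```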

01:14:25.740 --> 01:14:28.970
So this is actually
really nice if you're

01:14:28.970 --> 01:14:31.750
in a model where changing
your data structure

01:14:31.750 --> 01:14:34.930
is more expensive than
searching your data structure.

01:14:34.930 --> 01:14:38.651
For example, you have a lot of
threads in parallel accessing

01:14:38.651 --> 01:14:39.150
your thing.

01:14:39.150 --> 01:14:42.796
You're on a multi-core
machine or something.

01:14:42.796 --> 01:14:44.170
You have a shared
data structure,

01:14:44.170 --> 01:14:47.920
you really don't want to be
changing things very often,

01:14:47.920 --> 01:14:50.380
because you have to take a
lock and then that slows down

01:14:50.380 --> 01:14:51.940
all the other threads.

01:14:51.940 --> 01:14:53.890
If searches are
really fast but splits

01:14:53.890 --> 01:14:55.900
and merges are
expensive, then this

01:14:55.900 --> 01:14:59.506
is a reason why you should use
2-5 trees instead of 2-3 trees,

01:14:59.506 --> 01:15:01.130
because 2-3 trees,
they'll be splitting

01:15:01.130 --> 01:15:03.645
and merging all the time, log n.

01:15:03.645 --> 01:15:05.020
It's not a huge
difference, log n

01:15:05.020 --> 01:15:06.728
versus constant, but
with data structures

01:15:06.728 --> 01:15:08.187
that's usually the gap.

01:15:08.187 --> 01:15:10.020
Last class we were super
excited, because we

01:15:10.020 --> 01:15:11.600
went from log to log log.

01:15:11.600 --> 01:15:13.980
Here we're excited we
go from log to constant.

01:15:13.980 --> 01:15:17.210
It's a little better, but
they're all small numbers,

01:15:17.210 --> 01:15:21.570
but still we like to go
fast, as fast as possible.

01:15:21.570 --> 01:15:24.400
In a real system, actually
it's even more important,

01:15:24.400 --> 01:15:26.680
because splitting the root
is probably the worst,

01:15:26.680 --> 01:15:29.100
because everyone is
always touching the root.

01:15:29.100 --> 01:15:30.840
In a 2-5 tree, you
almost never touch

01:15:30.840 --> 01:15:33.410
the root, almost always
splitting and merging

01:15:33.410 --> 01:15:35.580
at the leaves,
whereas in a 2-3 tree,

01:15:35.580 --> 01:15:39.120
you could be going all the way
to the root every single time.

01:15:39.120 --> 01:15:42.800
So that's my examples.

01:15:42.800 --> 01:15:44.176
Any questions?

01:15:44.176 --> 01:15:47.370
AUDIENCE: [INAUDIBLE]

01:15:47.370 --> 01:15:49.641
ERIK DEMAINE: For free minutes.

01:15:49.641 --> 01:15:50.140
Cool.

01:15:50.140 --> 01:15:51.930
That's amortization.