WEBVTT
00:00:01.030 --> 00:00:05.939
Here's another approach to improving the latency
of our adder, this time focusing just on the
00:00:05.939 --> 00:00:07.190
carry logic.
00:00:07.190 --> 00:00:12.089
Early on in the course, we learned that by
going from a chain of logic gates to a tree
00:00:12.089 --> 00:00:17.500
of logic gates, we could go from a linear
latency to a logarithmic latency.
00:00:17.500 --> 00:00:20.090
Let's try to do that here.
00:00:20.090 --> 00:00:24.680
We'll start by rewriting the equations for
the carry-out from the full adder module.
00:00:24.680 --> 00:00:28.360
The final form of the rewritten equation has
two terms.
00:00:28.360 --> 00:00:33.370
The G, or generate, term is true when the
inputs will cause the module to generate a
00:00:33.370 --> 00:00:38.410
carry-out right away, without having to wait
for the carry-in to arrive.
00:00:38.410 --> 00:00:43.070
The P, or propagate, term is true if the module
will generate a carry-out only if there's
00:00:43.070 --> 00:00:45.569
a carry-in.
00:00:45.569 --> 00:00:50.350
So there are only two ways to get a carry-out
from the module: it's either generated by
00:00:50.350 --> 00:00:55.329
the current module or the carry-in is propagated
from the previous module.
00:00:55.329 --> 00:01:01.909
Actually, it's usual to change the logic for
the P term from "A OR B" to "A XOR B".
00:01:01.909 --> 00:01:05.950
This doesn't change the truth table for the
carry-out but will allow us to express the
00:01:05.950 --> 00:01:09.789
sum output as "P XOR carry-in".
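As a minimal sketch of the reorganized full adder just described, here is the logic in Python, assuming each input is a single bit (0 or 1). The function name `full_adder_gp` is illustrative and does not come from the lecture.

```python
def full_adder_gp(a, b, c_in):
    """One-bit full adder expressed with generate/propagate terms."""
    g = a & b              # generate: carry-out regardless of carry-in
    p = a ^ b              # propagate (XOR form): carry-out only if carry-in
    s = p ^ c_in           # sum output is P XOR carry-in
    c_out = g | (p & c_in) # carry-out is generated or propagated
    return s, c_out, g, p


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, full_adder_gp(a, b, c))
```

Using "A OR B" for P gives the same carry-out for every input combination; the XOR form is chosen only so the sum can reuse P.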
00:01:09.789 --> 00:01:13.450
Here's the schematic for the reorganized full
adder module.
00:01:13.450 --> 00:01:18.201
The little sum-of-products circuit for the
carry-out can be implemented using three 2-input
00:01:18.201 --> 00:01:23.130
NAND gates, which is a bit more compact than
the implementation for the three product terms
00:01:23.130 --> 00:01:25.420
we suggested in Lab 2.
00:01:25.420 --> 00:01:28.610
Time to update your full adder circuit!
00:01:28.610 --> 00:01:33.030
Now consider two adjacent adder modules in
a larger adder circuit:
00:01:33.030 --> 00:01:39.000
we'll use the label H to refer to the high-order
module and the label L to refer to the low-order
00:01:39.000 --> 00:01:40.070
module.
00:01:40.070 --> 00:01:44.649
We can use the generate and propagate information
from each of the modules to develop equations
00:01:44.649 --> 00:01:49.009
for the carry-out from the pair of modules
treated as a single block.
00:01:49.009 --> 00:01:54.950
We'll generate a carry-out from the block
when a carry-out is generated by the H module,
00:01:54.950 --> 00:02:01.460
or when a carry-out is generated by the L
module and propagated by the H module.
00:02:01.460 --> 00:02:06.399
And we'll propagate the carry-in through the
block only if the L module propagates its
00:02:06.399 --> 00:02:12.970
carry-in to the intermediate carry-out and
the H module propagates that to the final carry-out.
00:02:12.970 --> 00:02:19.870
So we have two simple equations requiring
only a couple of logic gates to implement.
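The two block equations can be sketched in Python as follows. This assumes single-bit values and the XOR form of P; the name `gp_combine` is hypothetical, chosen here for illustration.

```python
def gp_combine(g_h, p_h, g_l, p_l):
    """Combine generate/propagate signals of a high (H) and low (L) module."""
    # The block generates a carry if H generates one,
    # or L generates one and H propagates it.
    g = g_h | (p_h & g_l)
    # The block propagates its carry-in only if both L and H propagate.
    p = p_h & p_l
    return g, p


if __name__ == "__main__":
    # L generates, H propagates: the pair generates a carry-out.
    print(gp_combine(0, 1, 1, 0))
```

The block's carry-out is then G OR (P AND carry-in), the same form as for a single full adder.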
00:02:19.870 --> 00:02:26.450
Let's use these equations to build a generate-propagate
(GP) module and hook it to the H and L modules
00:02:26.450 --> 00:02:27.980
as shown.
00:02:27.980 --> 00:02:32.790
The G and P outputs of the GP module tell
us under what conditions we'll get a carry-out
00:02:32.790 --> 00:02:38.610
from the two individual modules treated as
a single, larger block.
00:02:38.610 --> 00:02:43.120
We can use additional layers of GP modules
to build a tree of logic that computes the
00:02:43.120 --> 00:02:48.099
generate and propagate logic for adders with
any number of inputs.
00:02:48.099 --> 00:02:53.780
For an adder with N inputs, the tree will
contain a total of N-1 GP modules and have
00:02:53.780 --> 00:02:56.909
a latency that's order log(N).
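To make the module count and depth concrete, here is a rough Python sketch that reduces a list of per-bit (G, P) pairs level by level. The function `gp_tree` and its return convention are assumptions for illustration, not the lecture's circuit.

```python
def gp_tree(leaves):
    """Reduce (g, p) pairs, low-order first, with a tree of GP modules.

    Returns ((G, P) for the whole block, number of GP modules, tree depth).
    """
    level = list(leaves)
    modules = 0
    depth = 0
    while len(level) > 1:
        nxt = []
        # Pair up adjacent blocks: even index is the low half, odd is the high half.
        for i in range(0, len(level) - 1, 2):
            g_l, p_l = level[i]
            g_h, p_h = level[i + 1]
            nxt.append((g_h | (p_h & g_l), p_h & p_l))
            modules += 1
        if len(level) % 2:
            nxt.append(level[-1])  # odd block out passes up unchanged
        level = nxt
        depth += 1
    return level[0], modules, depth


if __name__ == "__main__":
    # Eight bits that all propagate: expect 7 modules in 3 levels.
    print(gp_tree([(0, 1)] * 8))
```

Each level halves the number of blocks, which is where the N-1 module count and the logarithmic depth come from.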
00:02:56.909 --> 00:03:00.659
In the next step, we'll see how to use the
generate and propagate information to quickly
00:03:00.659 --> 00:03:06.569
compute the carry-in for each of the original
full adder modules.
00:03:06.569 --> 00:03:11.379
Once we're given the carry-in C_0 for the
low-order bit, we can hierarchically compute
00:03:11.379 --> 00:03:14.819
the carry-in for each full adder module.
00:03:14.819 --> 00:03:19.410
Given the carry-in to a block of adders, we
simply pass it along as the carry-in to the
00:03:19.410 --> 00:03:21.200
low half of the block.
00:03:21.200 --> 00:03:25.239
The carry-in for the high half of the block
is computed using the generate and propagate
00:03:25.239 --> 00:03:28.280
information from the low half of the block.
00:03:28.280 --> 00:03:33.900
We can use these equations to build a C module
and arrange the C modules in a tree as shown
00:03:33.900 --> 00:03:40.140
to use the C_0 carry-in to hierarchically
compute the carry-in to each layer of successively
00:03:40.140 --> 00:03:44.980
smaller blocks, until we finally reach the
full adder modules.
00:03:44.980 --> 00:03:51.180
For example, these equations show how C_4 is
computed from C_0, and C_6 is computed from
00:03:51.180 --> 00:03:53.280
C_4.
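The equations referred to here are on the slide and not in the transcript. A plausible reconstruction, assuming an 8-bit adder and writing G_{j:i} and P_{j:i} (notation introduced here, not the lecture's) for the generate and propagate signals of the block covering bits i through j, is:

```latex
\begin{align*}
C_L &= C_{in} \\
C_H &= G_L + P_L \cdot C_{in} \\
C_4 &= G_{3:0} + P_{3:0} \cdot C_0 \\
C_6 &= G_{5:4} + P_{5:4} \cdot C_4
\end{align*}
```

The first two lines are the general C module; the last two apply it to the 8-bit block and then to its high half.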
00:03:53.280 --> 00:03:58.269
Again, the total propagation delay from the
arrival of the C_0 input to the carry-ins
00:03:58.269 --> 00:04:02.379
for each full adder is order log(N).
00:04:02.379 --> 00:04:09.209
Notice that the G_L and P_L inputs to a particular
C module are the same as two of the inputs
00:04:09.209 --> 00:04:13.110
to the GP module in the same position in the
GP tree.
00:04:13.110 --> 00:04:18.668
We can combine the GP module and C module
to form a single carry-lookahead module that
00:04:18.668 --> 00:04:23.940
passes generate and propagate information
up the tree and carry-in information down
00:04:23.940 --> 00:04:25.130
the tree.
00:04:25.130 --> 00:04:30.430
The schematic at the top shows how to wire
up the tree of carry-lookahead modules.
00:04:30.430 --> 00:04:33.970
And now we get to the payoff for all this
hard work!
00:04:33.970 --> 00:04:39.060
The combined propagation delay to hierarchically
compute the generate and propagate information
00:04:39.060 --> 00:04:45.370
on the way up and the carry-in information
on the way down is order log(N),
00:04:45.370 --> 00:04:50.690
which is then the latency for the entire adder
since computing the sum outputs only takes
00:04:50.690 --> 00:04:53.680
one additional XOR delay.
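Putting the pieces together, the following Python sketch models the whole scheme in software: G and P information passed up the tree, carry-ins passed down, and one final XOR per bit. It is an illustrative model under stated assumptions (bit lists are low-order first; `cla_add`, `to_bits`, and `from_bits` are hypothetical names), not a description of the actual hardware netlist.

```python
def to_bits(value, n):
    """Return the n low-order bits of value, low-order bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def cla_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists using a carry-lookahead tree."""
    n = len(a_bits)
    g = [a & b for a, b in zip(a_bits, b_bits)]  # per-bit generate
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # per-bit propagate (XOR form)
    block = {}                                   # (lo, hi) -> (G, P) for bits lo..hi-1

    def up(lo, hi):
        """Pass generate/propagate information up the tree."""
        if hi - lo == 1:
            block[(lo, hi)] = (g[lo], p[lo])
        else:
            mid = (lo + hi) // 2
            g_l, p_l = up(lo, mid)
            g_h, p_h = up(mid, hi)
            block[(lo, hi)] = (g_h | (p_h & g_l), p_h & p_l)
        return block[(lo, hi)]

    carry = [0] * n

    def down(lo, hi, c_in):
        """Pass carry-in information down the tree."""
        if hi - lo == 1:
            carry[lo] = c_in
            return
        mid = (lo + hi) // 2
        g_l, p_l = block[(lo, mid)]
        down(lo, mid, c_in)                # low half gets the block's carry-in
        down(mid, hi, g_l | (p_l & c_in))  # high half uses G and P of the low half

    g_all, p_all = up(0, n)
    down(0, n, c0)
    sums = [p[i] ^ carry[i] for i in range(n)]  # one extra XOR per bit
    return sums, g_all | (p_all & c0)


if __name__ == "__main__":
    s, c_out = cla_add(to_bits(11, 4), to_bits(6, 4))
    print(from_bits(s), c_out)
```

The recursion here runs sequentially, so it shows the structure rather than the speed; in hardware, all modules at the same tree level operate in parallel, which is what gives the order log(N) latency.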
00:04:53.680 --> 00:04:59.020
This is a considerable improvement over the
order N latency of the ripple-carry adder.
00:04:59.020 --> 00:05:04.639
A final design note: we no longer need the
carry-out circuitry in the full adder module,
00:05:04.639 --> 00:05:07.169
so it can be removed.
00:05:07.169 --> 00:05:12.740
Variations on this generate-propagate strategy
form the basis for the fastest-known adder
00:05:12.740 --> 00:05:13.870
circuits.
00:05:13.870 --> 00:05:18.000
If you'd like to learn more, look up "Kogge-Stone
adders" on Wikipedia.