1
00:00:00,659 --> 00:00:04,839
In this chapter our goal is to introduce some
metrics for measuring the performance of a

2
00:00:04,839 --> 00:00:08,950
circuit and then investigate ways to improve
that performance.

3
00:00:08,950 --> 00:00:13,110
We'll start by putting aside circuits for
a moment and look at an everyday example that

4
00:00:13,110 --> 00:00:17,810
will help us understand the proposed performance
metrics.

5
00:00:17,810 --> 00:00:21,720
Laundry is a processing task we all have to
face at some point!

6
00:00:21,720 --> 00:00:26,510
The input to our laundry "system" is some
number of loads of dirty laundry and the output

7
00:00:26,510 --> 00:00:30,519
is the same loads, but washed, dried, and
folded.

8
00:00:30,519 --> 00:00:35,760
There are two system components: a washer that
washes a load of laundry in 30 minutes, and

9
00:00:35,760 --> 00:00:38,780
a dryer that dries a load in 60 minutes.

10
00:00:38,780 --> 00:00:43,030
You may be used to laundry system components
with different propagation delays, but let's

11
00:00:43,030 --> 00:00:46,219
go with these delays for our example.

12
00:00:46,219 --> 00:00:49,200
Our laundry follows a simple path through
the system:

13
00:00:49,200 --> 00:00:55,120
each load is first washed in the washer and
afterwards moved to the dryer for drying.

14
00:00:55,120 --> 00:00:59,590
There can, of course, be delays between the
steps of loading the washer, or moving wet,

15
00:00:59,590 --> 00:01:04,580
washed loads to the dryer, or in taking dried
loads out of the dryer.

16
00:01:04,580 --> 00:01:08,840
Let's assume we move the laundry through the
system as fast as possible, moving loads to

17
00:01:08,840 --> 00:01:12,760
the next processing step as soon as we can.

18
00:01:12,760 --> 00:01:17,090
Most of us wait to do laundry until we've
accumulated several loads.

19
00:01:17,090 --> 00:01:19,039
That turns out to be a good strategy!

20
00:01:19,039 --> 00:01:20,770
Let's see why…

21
00:01:20,770 --> 00:01:25,520
To process a single load of laundry, we first
run it through the washer, which takes 30

22
00:01:25,520 --> 00:01:26,520
minutes.

23
00:01:26,520 --> 00:01:29,400
Then we run it through the dryer, which takes
60 minutes.

24
00:01:29,400 --> 00:01:34,560
So the total amount of time from system input
to system output is 90 minutes.

25
00:01:34,560 --> 00:01:38,850
If this were a combinational logic circuit,
we'd say the circuit's propagation delay is

26
00:01:38,850 --> 00:01:42,780
90 minutes from valid inputs to valid outputs.

27
00:01:42,780 --> 00:01:46,700
Okay, that's the performance analysis for
a single load of laundry.

28
00:01:46,700 --> 00:01:50,750
Now let's think about doing N loads of laundry.

29
00:01:50,750 --> 00:01:55,100
Here at MIT we like to make gentle fun of
our colleagues at the prestigious institution

30
00:01:55,100 --> 00:01:57,150
just up the river from us.

31
00:01:57,150 --> 00:02:01,439
So here's how we imagine they do N loads of
laundry at Harvard.

32
00:02:01,439 --> 00:02:06,140
They follow the combinational recipe of supplying
new system inputs after the system generates

33
00:02:06,140 --> 00:02:09,590
the correct output from the previous set of
inputs.

34
00:02:09,590 --> 00:02:15,000
So in step 1 the first load is washed and
in step 2, the first load is dried, taking

35
00:02:15,000 --> 00:02:17,280
a total of 90 minutes.

36
00:02:17,280 --> 00:02:21,750
Once those steps complete, Harvard students
move on to step 3, starting the processing

37
00:02:21,750 --> 00:02:23,140
of the second load of laundry.

38
00:02:23,140 --> 00:02:24,790
And so on…

39
00:02:24,790 --> 00:02:29,420
The total time for the system to process N
laundry loads is just N times the time it

40
00:02:29,420 --> 00:02:31,890
takes to process a single load.

41
00:02:31,890 --> 00:02:34,840
So the total time is N*90 minutes.

42
00:02:34,840 --> 00:02:37,750
Of course, we're being silly here!

43
00:02:37,750 --> 00:02:40,420
Harvard students don't actually do laundry.

44
00:02:40,420 --> 00:02:44,940
Mummy sends the family butler over on Wednesday
mornings to collect the dirty loads and return

45
00:02:44,940 --> 00:02:49,760
them starched and pressed in time for afternoon
tea.

46
00:02:49,760 --> 00:02:53,480
But I hope you're seeing the analogy we're
making between the Harvard approach to laundry

47
00:02:53,480 --> 00:02:55,540
and combinational circuits.

48
00:02:55,540 --> 00:03:00,610
We can all see that the washer is sitting
idle while the dryer is running and that inefficiency

49
00:03:00,610 --> 00:03:06,410
has a cost in terms of the rate at which N
loads of laundry can move through the system.

50
00:03:06,410 --> 00:03:11,780
As engineering students here in 6.004, we
see that it makes sense to overlap washing

51
00:03:11,780 --> 00:03:13,100
and drying.

52
00:03:13,100 --> 00:03:15,700
So in step 1 we wash the first load.

53
00:03:15,700 --> 00:03:21,150
And in step 2, we dry the first load as before,
but, in addition, we start washing the second

54
00:03:21,150 --> 00:03:22,520
load of laundry.

55
00:03:22,520 --> 00:03:28,079
We have to allocate 60 minutes for step 2
in order to give the dryer time to finish.

56
00:03:28,079 --> 00:03:32,460
There's a slight inefficiency in that the
washer finishes its work early, but with only

57
00:03:32,460 --> 00:03:38,819
one dryer, it's the dryer that determines
how quickly laundry moves through the system.

58
00:03:38,819 --> 00:03:43,680
Systems that overlap the processing of a sequence
of inputs are called pipelined systems and

59
00:03:43,680 --> 00:03:48,290
each of the processing steps is called a stage
of the pipeline.

60
00:03:48,290 --> 00:03:52,700
The rate at which inputs move through the
pipeline is determined by the slowest pipeline

61
00:03:52,700 --> 00:03:53,709
stage.

62
00:03:53,709 --> 00:04:00,120
Our laundry system is a 2-stage pipeline with
a 60-minute processing time for each stage.

63
00:04:00,120 --> 00:04:06,110
We repeat the overlapped wash/dry step until
all N loads of laundry have been processed.

64
00:04:06,110 --> 00:04:10,370
We're starting a new washer load every 60
minutes and getting a new load of dried laundry

65
00:04:10,370 --> 00:04:12,599
from the dryer every 60 minutes.

66
00:04:12,599 --> 00:04:17,529
In other words, the effective processing rate
of our overlapped laundry system is one load

67
00:04:17,529 --> 00:04:19,519
every 60 minutes.

68
00:04:19,519 --> 00:04:25,010
So once the process is underway, N loads of
laundry take N*60 minutes.

69
00:04:25,010 --> 00:04:31,850
And a particular load of laundry, which requires
two stages of processing time, takes 120 minutes.

70
00:04:31,850 --> 00:04:35,720
The timing for the first load of laundry is
a little different since the timing of Step

71
00:04:35,720 --> 00:04:38,800
1 can be shorter with no dryer to wait for.

72
00:04:38,800 --> 00:04:43,470
But in the performance analysis of pipelined
systems, we're interested in the steady state

73
00:04:43,470 --> 00:04:47,310
where we're assuming that we have an infinite
supply of inputs.

74
00:04:47,310 --> 00:04:50,880
We see that there are two interesting performance
metrics.

75
00:04:50,880 --> 00:04:55,120
The first is the latency of the system, the
time it takes for the system to process a

76
00:04:55,120 --> 00:04:56,470
particular input.

77
00:04:56,470 --> 00:05:01,590
In the Harvard laundry system, it takes 90
minutes to wash and dry a load.

78
00:05:01,590 --> 00:05:07,000
In the 6.004 laundry, it takes 120 minutes
to wash and dry a load, assuming that it's

79
00:05:07,000 --> 00:05:08,880
not the first load.

80
00:05:08,880 --> 00:05:14,320
The second performance measure is throughput,
the rate at which the system produces outputs.

81
00:05:14,320 --> 00:05:19,050
In many systems, we get one set of outputs
for each set of inputs, and in such systems,

82
00:05:19,050 --> 00:05:23,490
the throughput also tells us the rate at which
inputs are consumed.

83
00:05:23,490 --> 00:05:28,680
In the Harvard laundry system, the throughput
is 1 load of laundry every 90 minutes.

84
00:05:28,680 --> 00:05:35,090
In the 6.004 laundry, the throughput is 1
load of laundry every 60 minutes.

85
00:05:35,090 --> 00:05:40,760
The Harvard laundry has lower latency; the
6.004 laundry has better throughput.

86
00:05:40,760 --> 00:05:43,180
Which is the better system?

87
00:05:43,180 --> 00:05:45,210
That depends on your goals!

88
00:05:45,210 --> 00:05:50,389
If you need to wash 100 loads of laundry,
you'd prefer to use the system with higher

89
00:05:50,389 --> 00:05:51,389
throughput.

90
00:05:51,389 --> 00:05:56,300
If, on the other hand, you want clean underwear
for your date in 90 minutes, you're much more

91
00:05:56,300 --> 00:05:59,420
concerned about the latency.

92
00:05:59,420 --> 00:06:04,640
The laundry example also illustrates a common
tradeoff between latency and throughput.

93
00:06:04,640 --> 00:06:09,590
If we increase throughput by using pipelined
processing, the latency usually increases

94
00:06:09,590 --> 00:06:14,400
since all pipeline stages must operate in
lock-step and the rate of processing is thus

95
00:06:14,400 --> 00:06:16,180
determined by the slowest stage.