1
00:00:00,390 --> 00:00:02,330
We're on the home stretch now.

2
00:00:02,330 --> 00:00:06,660
For all the instructions up until now, the
next instruction has come from the location

3
00:00:06,660 --> 00:00:11,600
following the current instruction - hence
the "PC+4" logic.

4
00:00:11,600 --> 00:00:16,660
Branches and JMPs change that by altering
the value in the PC.

5
00:00:16,660 --> 00:00:21,490
The JMP instruction simply takes the value
in the RA register and makes it the next PC

6
00:00:21,490 --> 00:00:23,039
value.

7
00:00:23,039 --> 00:00:28,179
The PCSEL MUX in the upper left-hand corner
lets the control logic select the source of

8
00:00:28,179 --> 00:00:30,279
the next PC value.

9
00:00:30,279 --> 00:00:35,039
When PCSEL is 0, the incremented PC value
is chosen.

10
00:00:35,039 --> 00:00:39,829
When PCSEL is 2, the value of the RA register
is chosen.

11
00:00:39,829 --> 00:00:44,469
We'll see how the other inputs to the PCSEL
MUX are used in just a moment.

12
00:00:44,469 --> 00:00:48,840
The JMP and branch instructions also cause
the address of the following instruction,

13
00:00:48,840 --> 00:00:54,180
i.e., the PC+4 value, to be written to the
RC register.

14
00:00:54,180 --> 00:01:01,570
When WDSEL is 0, the "0" input of the WDSEL
MUX is used to select the PC+4 value as the

15
00:01:01,570 --> 00:01:04,349
write-back data.

16
00:01:04,349 --> 00:01:06,710
Here's how the data flow works.

17
00:01:06,710 --> 00:01:13,289
The output of the PC+4 adder is is routed
to the register file and WERF is set to 1

18
00:01:13,289 --> 00:01:16,910
to enable that value to be written at the
end of the cycle.

19
00:01:16,910 --> 00:01:23,229
Meanwhile, the value of RA register coming
out of the register file is connected to the

20
00:01:23,229 --> 00:01:25,330
"2" input of the PCSEL MUX.

21
00:01:25,330 --> 00:01:30,860
So setting PCSEL to 2 will select the value
in the RA register as the next value for the

22
00:01:30,860 --> 00:01:31,960
PC.

23
00:01:31,960 --> 00:01:36,210
The rest of the control signals are "don't
cares", except, of course for the memory write

24
00:01:36,210 --> 00:01:41,070
enable (MWR), which can never be "don't care"
lest we cause an accidental write to some

25
00:01:41,070 --> 00:01:43,310
memory location.

26
00:01:43,310 --> 00:01:48,130
The branch instruction requires an additional
adder to compute the target address by adding

27
00:01:48,130 --> 00:01:54,329
the scaled offset from the instruction's literal
field to the current PC+4 value.

28
00:01:54,329 --> 00:01:59,829
Remember that we scale the offset by a factor
of 4 to convert it from the word offset stored

29
00:01:59,829 --> 00:02:04,430
in the literal to the byte offset required
for the PC.

30
00:02:04,430 --> 00:02:09,619
The output of the offset adder becomes the
"1" input to the PCSEL MUX, where, if the

31
00:02:09,619 --> 00:02:14,420
branch is taken, it will become the next value
of the PC.

32
00:02:14,420 --> 00:02:20,709
Note that multiplying by 4 is easily accomplished
by shifting the literal two bits to the left,

33
00:02:20,709 --> 00:02:24,150
which inserts two 0-bits at the low-order
end of the value.

34
00:02:24,150 --> 00:02:31,040
And, like before, the sign-extension just
requires replicating bit ID[15], in this case

35
00:02:31,040 --> 00:02:33,129
fourteen times.

36
00:02:33,129 --> 00:02:38,670
So implementing this complicated-looking expression
requires care in wiring up the input to the

37
00:02:38,670 --> 00:02:43,090
offset adder, but no additional logic!

38
00:02:43,090 --> 00:02:46,910
We do need some logic to determine if we should
branch or not.

39
00:02:46,910 --> 00:02:52,920
The 32-bit NOR gate connected to the first
read port of the register file tests the value

40
00:02:52,920 --> 00:02:53,920
of the RA register.

41
00:02:53,920 --> 00:03:01,799
The NOR's output Z will be 1 if all the bits
of the RA register value are 0, and 0 otherwise.

42
00:03:01,799 --> 00:03:10,110
The Z value can be used by the control logic
to determine the correct value for PCSEL.

43
00:03:10,110 --> 00:03:16,769
If Z indicates the branch is taken, PCSEL
will be 1 and the output of the offset adder

44
00:03:16,769 --> 00:03:20,220
becomes the next value of the PC.

45
00:03:20,220 --> 00:03:26,010
If the branch is not taken, PCSEL will be
0 and execution will continue with the next

46
00:03:26,010 --> 00:03:30,510
instruction at PC+4.

47
00:03:30,510 --> 00:03:35,930
As in the JMP instruction, the PC+4 value
is routed to the register file to be written

48
00:03:35,930 --> 00:03:39,260
into the RC register at end of the cycle.

49
00:03:39,260 --> 00:03:45,659
Meanwhile, the value of Z is computed from
the value of the RA register while the branch

50
00:03:45,659 --> 00:03:49,980
offset adder computes the address of the branch
target.

51
00:03:49,980 --> 00:03:55,250
The output of the offset adder is routed to
the PCSEL MUX where the value of the 3-bit

52
00:03:55,250 --> 00:04:01,830
PCSEL control signal, computed by the control
logic based on Z, determines whether the next

53
00:04:01,830 --> 00:04:06,850
PC value is the branch target or the PC+4
value.

54
00:04:06,850 --> 00:04:13,140
The remaining control signals are unused and
set to their default "don't care" values.

55
00:04:13,140 --> 00:04:18,950
We have one last instruction to introduce:
the LDR or load-relative instruction.

56
00:04:18,950 --> 00:04:24,140
LDR behaves like a normal LD instruction except
that the memory address is taken from the

57
00:04:24,140 --> 00:04:26,630
branch offset adder.

58
00:04:26,630 --> 00:04:32,260
Why would it be useful to load a value from
a location near the LDR instruction?

59
00:04:32,260 --> 00:04:36,480
Normally such addresses would refer to the
neighboring instructions, so why would we

60
00:04:36,480 --> 00:04:43,040
want to load the binary encoding of an instruction
into a register to be used as data?

61
00:04:43,040 --> 00:04:48,790
The use case for LDR is accessing large constants
that have to be stored in main memory because

62
00:04:48,790 --> 00:04:54,260
they are too large to fit into the 16-bit
literal field of an instruction.

63
00:04:54,260 --> 00:05:01,170
In the example shown here, the compiled code
needs to load the constant 123456.

64
00:05:01,170 --> 00:05:08,540
So it uses an LDR instruction that refers
to a nearby location C1: that has been initialized

65
00:05:08,540 --> 00:05:11,520
with the required value.

66
00:05:11,520 --> 00:05:16,410
Since this read-only constant is part of the
program, it makes sense to store it with the

67
00:05:16,410 --> 00:05:22,110
instructions for the program, usually just
after the code for a procedure.

68
00:05:22,110 --> 00:05:26,780
Note that we have to be careful to place the
storage location so that it won't be executed

69
00:05:26,780 --> 00:05:29,330
as an instruction!

70
00:05:29,330 --> 00:05:34,730
To route the output of the offset adder to
the main memory address port, we'll add ASEL

71
00:05:34,730 --> 00:05:41,580
MUX so we can select either the RA register
value (when ASEL=0) or the output of the offset

72
00:05:41,580 --> 00:05:47,190
adder (when ASEL=1) as the first ALU operand.

73
00:05:47,190 --> 00:05:54,090
For LDR, ASEL will be 1, and we'll then ask
the ALU compute the Boolean operation "A",

74
00:05:54,090 --> 00:06:00,380
i.e., the boolean function whose output is
just the value of the first operand.

75
00:06:00,380 --> 00:06:04,810
This value then appears on the ALU output,
which is connected to the main memory address

76
00:06:04,810 --> 00:06:10,650
port and the remainder of the execution proceeds
just like it did for LD.

77
00:06:10,650 --> 00:06:12,970
This seems a bit complicated!

78
00:06:12,970 --> 00:06:17,840
Mr. Blue has a good question:
why not just put the ASEL MUX on the wire

79
00:06:17,840 --> 00:06:24,230
leading to the main memory address port and
bypass the ALU altogether?

80
00:06:24,230 --> 00:06:29,440
The answer has to do with the amount of time
needed to compute the memory address.

81
00:06:29,440 --> 00:06:36,310
If we moved the ASEL MUX here, the data flow
for LD and ST addresses would then pass through

82
00:06:36,310 --> 00:06:43,090
two MUXes, the BSEL MUX and the ASEL MUX,
slowing down the arrival of the address by

83
00:06:43,090 --> 00:06:45,340
a small amount.

84
00:06:45,340 --> 00:06:50,140
This may not seem like a big deal, but the
additional time would have to be added the

85
00:06:50,140 --> 00:06:55,630
clock period, thus slowing down every instruction
by a little bit.

86
00:06:55,630 --> 00:07:00,830
When executing billions of instructions, a
little extra time on each instruction really

87
00:07:00,830 --> 00:07:05,750
impacts the overall performance of the processor.

88
00:07:05,750 --> 00:07:11,130
By placing the ASEL MUX where we did, its
propagation delay overlaps that of the BSEL

89
00:07:11,130 --> 00:07:18,360
MUX, so the increased functionality it provides
comes with no cost in performance.

90
00:07:18,360 --> 00:07:21,250
Here's the data flow for the LDR instruction.

91
00:07:21,250 --> 00:07:26,650
The output of the offset adder is routed through
the ASEL MUX to the ALU.

92
00:07:26,650 --> 00:07:32,400
The ALU performs the Boolean computation "A"
and the result becomes the address for main

93
00:07:32,400 --> 00:07:33,780
memory.

94
00:07:33,780 --> 00:07:39,810
The returning data is routed through the WDSEL
MUX so it can be written into the RC register

95
00:07:39,810 --> 00:07:42,310
at the end of the cycle.

96
00:07:42,310 --> 00:07:46,060
The remaining control values are given their
usual default values.