1
00:00:01,319 --> 00:00:06,019
In the previous lecture we developed the instruction
set architecture for the Beta, the computer

2
00:00:06,019 --> 00:00:09,540
system we'll be building throughout this part
of the course.

3
00:00:09,540 --> 00:00:13,280
The Beta incorporates two types of storage
or memory.

4
00:00:13,280 --> 00:00:19,620
In the CPU datapath there are 32 general-purpose
registers, which can be read to supply source

5
00:00:19,620 --> 00:00:24,279
operands for the ALU or written with the ALU
result.

6
00:00:24,279 --> 00:00:28,890
In the CPU's control logic there is a special-purpose
register called the program counter, which

7
00:00:28,890 --> 00:00:34,740
contains the address of the memory location
holding the next instruction to be executed.

8
00:00:34,740 --> 00:00:40,280
The datapath and control logic are connected
to a large main memory with a maximum capacity

9
00:00:40,280 --> 00:00:46,440
of 2^32 bytes, organized as 2^30 32-bit words.

10
00:00:46,440 --> 00:00:49,670
This memory holds both data and instructions.

11
00:00:49,670 --> 00:00:54,149
Beta instructions are 32-bit values comprised
of various fields.

12
00:00:54,149 --> 00:00:58,690
The 6-bit OPCODE field specifies the operation
to be performed.

13
00:00:58,690 --> 00:01:07,000
The 5-bit Ra, Rb, and Rc fields contain register
numbers, specifying one of the 32 general-purpose

14
00:01:07,000 --> 00:01:08,240
registers.

15
00:01:08,240 --> 00:01:14,439
There are two instruction formats: one specifying
an opcode and three registers, the other specifying

16
00:01:14,439 --> 00:01:19,770
an opcode, two registers, and a 16-bit signed
constant.

17
00:01:19,770 --> 00:01:21,850
There three classes of instructions.

18
00:01:21,850 --> 00:01:27,110
The ALU instructions perform an arithmetic
or logic operation on two operands, producing

19
00:01:27,110 --> 00:01:31,189
a result that is stored in the destination
register.

20
00:01:31,189 --> 00:01:36,070
The operands are either two values from the
general-purpose registers, or one register

21
00:01:36,070 --> 00:01:38,490
value and a constant.

22
00:01:38,490 --> 00:01:44,600
The yellow highlighting indicates instructions
that use the second instruction format.

23
00:01:44,600 --> 00:01:49,450
The Load/Store instructions access main memory,
either loading a value from main memory into

24
00:01:49,450 --> 00:01:53,130
a register, or storing a register value to
main memory.

25
00:01:53,130 --> 00:01:58,250
And, finally, there are branches and jumps
whose execution may change the program counter

26
00:01:58,250 --> 00:02:02,840
and hence the address of the next instruction
to be executed.

27
00:02:02,840 --> 00:02:07,830
To program the Beta we'll need to load main
memory with binary-encoded instructions.

28
00:02:07,830 --> 00:02:12,300
Figuring out each encoding is clearly the
job for a computer, so we'll create a simple

29
00:02:12,300 --> 00:02:17,579
programming language that will let us specify
the opcode and operands for each instruction.

30
00:02:17,579 --> 00:02:22,140
So instead of writing the binary at the top
of slide, we'll write assembly language statements

31
00:02:22,140 --> 00:02:24,910
to specify instructions in symbolic form.

32
00:02:24,910 --> 00:02:30,079
Of course we still have think about which
registers to use for which values and write

33
00:02:30,079 --> 00:02:34,690
sequences of instructions for more complex
operations.

34
00:02:34,690 --> 00:02:39,410
By using a high-level language we can move
up one more level abstraction and describe

35
00:02:39,410 --> 00:02:44,690
the computation we want in terms of variables
and mathematical operations rather than registers

36
00:02:44,690 --> 00:02:47,060
and ALU functions.

37
00:02:47,060 --> 00:02:51,490
In this lecture we'll describe the assembly
language we'll use for programming the Beta.

38
00:02:51,490 --> 00:02:56,950
And in the next lecture we'll figure out how
to translate high-level languages, such as

39
00:02:56,950 --> 00:02:59,240
C, into assembly language.

40
00:02:59,240 --> 00:03:02,590
The layer cake of abstractions gets taller
yet:

41
00:03:02,590 --> 00:03:07,340
we could write an interpreter for say, Python,
in C and then write our application programs

42
00:03:07,340 --> 00:03:08,820
in Python.

43
00:03:08,820 --> 00:03:13,870
Nowadays, programmers often choose the programming
language that's most suitable for expressing

44
00:03:13,870 --> 00:03:19,650
their computations, then, after perhaps many
layers of translation, come up with a sequence

45
00:03:19,650 --> 00:03:23,079
of instructions that the Beta can actually
execute.

46
00:03:23,079 --> 00:03:28,079
Okay, back to assembly language, which we'll
use to shield ourselves from the bit-level

47
00:03:28,079 --> 00:03:33,020
representations of instructions and from having
to know the exact location of variables and

48
00:03:33,020 --> 00:03:34,950
instructions in memory.

49
00:03:34,950 --> 00:03:40,240
A program called the "assembler" reads a text
file containing the assembly language program

50
00:03:40,240 --> 00:03:45,450
and produces an array of 32-bit words that
can be used to initialize main memory.

51
00:03:45,450 --> 00:03:50,540
We'll learn the UASM assembly language, which
is built into BSim, our simulator for the

52
00:03:50,540 --> 00:03:52,320
Beta ISA.

53
00:03:52,320 --> 00:03:56,760
UASM is really just a fancy calculator!

54
00:03:56,760 --> 00:04:02,040
It reads arithmetic expressions and evaluates
them to produce 8-bit values, which it then

55
00:04:02,040 --> 00:04:07,440
adds sequentially to the array of bytes which
will eventually be loaded into the Beta's

56
00:04:07,440 --> 00:04:08,440
memory.

57
00:04:08,440 --> 00:04:14,260
UASM supports several useful language features
that make it easier to write assembly language

58
00:04:14,260 --> 00:04:15,430
programs.

59
00:04:15,430 --> 00:04:19,730
Symbols and labels let us give names to particular
values and addresses.

60
00:04:19,730 --> 00:04:25,400
And macros let us create shorthand notations
for sequences of expressions that, when evaluated,

61
00:04:25,400 --> 00:04:30,130
will generate the binary representations for
instructions and data.

62
00:04:30,130 --> 00:04:32,600
Here's an example UASM source file.

63
00:04:32,600 --> 00:04:38,870
Typically we write one UASM statement on each
line and can use spaces, tabs, and newlines

64
00:04:38,870 --> 00:04:41,010
to make the source as readable as possible.

65
00:04:41,010 --> 00:04:43,770
We've added some color coding to help in our
explanation.

66
00:04:43,770 --> 00:04:49,680
Comments (shown in green) allow us to add
text annotations to the program.

67
00:04:49,680 --> 00:04:53,220
Good comments will help remind you how your
program works.

68
00:04:53,220 --> 00:04:57,290
You really don't want to have figure out from
scratch what a section of code does each time

69
00:04:57,290 --> 00:05:00,210
you need to modify or debug it!

70
00:05:00,210 --> 00:05:02,610
There are two ways to add comments to the
code.

71
00:05:02,610 --> 00:05:08,320
"//" starts a comment, which then occupies
the rest of the source line.

72
00:05:08,320 --> 00:05:13,480
Any characters after "//" are ignored by the
assembler, which will start processing statements

73
00:05:13,480 --> 00:05:17,620
again at the start of the next line in the
source file.

74
00:05:17,620 --> 00:05:24,380
You can also enclose comment text using the
delimiters "/*" and "*/" and the assembler

75
00:05:24,380 --> 00:05:26,850
will ignore everything in-between.

76
00:05:26,850 --> 00:05:32,050
Using this second type of comment, you can
"comment-out" many lines of code by placing

77
00:05:32,050 --> 00:05:39,110
"/*" at the start and, many lines later, end
the comment section with a "*/".

78
00:05:39,110 --> 00:05:43,300
Symbols (shown in red) are symbolic names
for constant values.

79
00:05:43,300 --> 00:05:49,230
Symbols make the code easier to understand,
e.g., we can use N as the name for an initial

80
00:05:49,230 --> 00:05:52,860
value for some computation, in this case the
value 12.

81
00:05:52,860 --> 00:05:57,250
Subsequent statements can refer to this value
using the symbol N instead of entering the

82
00:05:57,250 --> 00:05:59,750
value 12 directly.

83
00:05:59,750 --> 00:06:04,230
When reading the program, we'll know that
N means this particular initial value.

84
00:06:04,230 --> 00:06:09,220
So if later we want to change the initial
value, we only have to change the definition

85
00:06:09,220 --> 00:06:14,600
of the symbol N rather than find all the 12's
in our program and change them.

86
00:06:14,600 --> 00:06:19,800
In fact some of the other appearances of 12
might not refer to this initial value and

87
00:06:19,800 --> 00:06:22,560
so to be sure we only changed the ones that
did,

88
00:06:22,560 --> 00:06:26,450
we'd have to read and understand the whole
program to make sure we only edited the right

89
00:06:26,450 --> 00:06:27,630
12's.

90
00:06:27,630 --> 00:06:30,510
You can imagine how error-prone that might
be!

91
00:06:30,510 --> 00:06:34,900
So using symbols is a practice you want to
follow!

92
00:06:34,900 --> 00:06:38,210
Note that all the register names are shown
in red.

93
00:06:38,210 --> 00:06:43,880
We'll define the symbols R0 through R31 to
have the values 0 through 31.

94
00:06:43,880 --> 00:06:48,230
Then we'll use those symbols to help us understand
which instruction operands are intended to

95
00:06:48,230 --> 00:06:55,170
be registers, e.g., by writing R1, and which
operands are numeric values, e.g., by writing

96
00:06:55,170 --> 00:06:56,970
the number 1.

97
00:06:56,970 --> 00:07:00,730
We could just use numbers everywhere, but
the code would be much harder to read and

98
00:07:00,730 --> 00:07:02,360
understand.

99
00:07:02,360 --> 00:07:07,090
Labels (shown in yellow) are symbols whose
value are the address of a particular location

100
00:07:07,090 --> 00:07:08,410
in the program.

101
00:07:08,410 --> 00:07:14,150
Here, the label "loop" will be our name for
the location of the MUL instruction in memory.

102
00:07:14,150 --> 00:07:18,710
In the BNE at the end of the code, we use
the label "loop" to specify the MUL instruction

103
00:07:18,710 --> 00:07:20,430
as the branch target.

104
00:07:20,430 --> 00:07:26,169
So if R1 is non-zero, we want to branch back
to the MUL instruction and start another iteration.

105
00:07:26,169 --> 00:07:32,220
We'll use indentation for most UASM statements
to make it easy to spot the labels defined

106
00:07:32,220 --> 00:07:34,520
by the program.

107
00:07:34,520 --> 00:07:38,840
Indentation isn't required, it's just another
habit assembly language programmers use to

108
00:07:38,840 --> 00:07:42,210
keep their programs readable.

109
00:07:42,210 --> 00:07:47,940
We use macro invocations (shown in blue) when
we want to write Beta instructions.

110
00:07:47,940 --> 00:07:53,120
When the assembler encounters a macro, it
"expands" the macro, replacing it with a string

111
00:07:53,120 --> 00:07:56,260
of text provided by in the macro's definition.

112
00:07:56,260 --> 00:08:00,889
During expansion, the provided arguments are
textually inserted into the expanded text

113
00:08:00,889 --> 00:08:03,600
at locations specified in the macro definition.

114
00:08:03,600 --> 00:08:09,639
Think of a macro as shorthand for a longer
text string we could have typed in.

115
00:08:09,639 --> 00:08:12,270
We'll show how all this works in the next
video segment.