It's not unusual to find that an application is organized as multiple communicating processes. What's the advantage of using multiple processes instead of just a single process?

Many applications exhibit concurrency, i.e., some of the required computations can be performed in parallel. For example, video compression algorithms represent each video frame as an array of 8-pixel by 8-pixel macroblocks. Each macroblock is individually compressed by converting the 64 intensity and color values from the spatial domain to the frequency domain and then quantizing and Huffman-encoding the frequency coefficients. If you're using a multi-core processor to do the compression, you can perform the macroblock compressions concurrently.

Applications like video games are naturally divided into the "front-end" user interface and "back-end" simulation and rendering engines. Inputs from the user arrive asynchronously with respect to the simulation, and it's easiest to organize the processing of user events separately from the back-end processing. Processes are an effective way to encapsulate the state and computation for what are logically independent components of an application, which communicate with one another when they need to share information. These sorts of applications are often data- or event-driven, i.e., the processing required is determined by the data to be processed or the arrival of external events.

How should the processes communicate with each other? If the processes are running out of the same physical memory, it's easy to arrange for them to share data by mapping the same physical page into both processes' contexts. Any data written to that page by one process can then be read by the other process. To make it easier to coordinate processes communicating via shared memory, we'll see that it's convenient to provide synchronization primitives; some ISAs include instructions that make the required synchronization easy to implement. Another approach is to add OS supervisor calls that pass messages from one process to another. Message passing involves more overhead than shared memory, but it makes the application programming independent of whether the communicating processes are running on the same physical processor.

In this lecture, we'll use the classic producer-consumer problem as our example of concurrent processes that need to communicate and synchronize. There are two processes: a producer and a consumer.
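As a brief aside before diving into that example, here is a minimal sketch of how the shared-memory arrangement described above might be set up. It assumes the POSIX shared-memory interface (shm_open and mmap), which the lecture itself doesn't name; the object name "/demo_page" is purely illustrative. If the producer and consumer each run this code with the same name, they end up with the same physical page mapped into both of their address spaces:

    /* Minimal sketch: map one shared page into this process's address space.
       Assumes POSIX shm_open/mmap; on older Linux systems, link with -lrt. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        int fd = shm_open("/demo_page", O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("shm_open"); return 1; }

        if (ftruncate(fd, 4096) < 0) {            /* one page of shared data */
            perror("ftruncate"); return 1;
        }

        char *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
        if (shared == MAP_FAILED) { perror("mmap"); return 1; }

        shared[0] = 'c';   /* a write here is visible to the other process */
        return 0;
    }

In what follows, the particular OS mechanism is taken for granted; what matters is simply that both processes can read and write the same memory locations.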
The producer runs in a loop which performs some computation to generate information, in this case a single character, C. The consumer also runs in a loop, which waits for the next character to arrive from the producer, then performs some computation on that character.

The information passing between the producer and consumer could obviously be much more complicated than a single character. For example, a compiler might produce a sequence of assembly-language statements that are passed to the assembler to be converted into the appropriate binary representation. The user-interface front-end for a video game might pass a sequence of player actions to the simulation and rendering back-end. In fact, the notion of hooking multiple processes together in a processing pipeline is so useful that the Unix and Linux operating systems provide a pipe primitive that connects the output channel of the upstream process to the input channel of the downstream process.

Let's look at a timing diagram for the actions of our simple producer/consumer example. We'll use arrows to indicate when one action happens before another. Inside a single process, e.g., the producer, the order of execution implies a particular ordering in time: the first execution of the producer's computation is followed by the sending of the first character, then there's the second execution of the computation, followed by the sending of the second character, and so on. In later examples, we'll omit the timing arrows between successive statements in the same program. We see a similar order of execution in the consumer: the first character is received, then the consumer's computation is performed for the first time, etc. Inside each process, the program counter determines the order in which the computations are performed.

So far, so good - each process is running as expected. However, for the producer/consumer system to function correctly as a whole, we'll need to introduce some additional constraints on the order of execution. These are called "precedence constraints" and we'll use this stylized less-than sign to indicate that computation A must precede, i.e., come before, computation B. In the producer/consumer system we can't consume data before it's been produced, a constraint we can formalize as requiring that the i-th send operation has to precede the i-th receive operation.
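To make the structure of the two loops concrete, here is a skeleton in C. A single shared character slot stands in for the communication channel; the names (slot, producer, consumer) and the choice of producing the letters 'a' through 'z' are mine, not the lecture's. Note that, exactly as the precedence constraint warns, nothing in this skeleton yet guarantees that the i-th send happens before the i-th receive:

    #include <stdio.h>

    /* Skeleton of the two loops (illustrative names, not the lecture's code).
       In a real system, producer() and consumer() would run in separate
       processes that share the variable slot. */
    char slot;                               /* shared communication channel  */

    void producer(void) {
        for (char c = 'a'; c <= 'z'; c++) {  /* "compute" the next character  */
            slot = c;                        /* send the i-th character       */
        }
    }

    void consumer(void) {
        for (int i = 0; i < 26; i++) {
            char c = slot;                   /* receive the i-th character    */
            putchar(c);                      /* "process" the character       */
        }
    }

The timing diagram is about exactly when those accesses to the shared slot are allowed to happen relative to one another.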
In the timing diagram, this send-before-receive constraint is shown as a solid red arrow.

Assuming we're using, say, a shared memory location to hold the character being transmitted from the producer to the consumer, we also need to ensure that the producer doesn't overwrite the previous character before it's been read by the consumer. In other words, we require the i-th receive to precede the (i+1)-st send. These timing constraints are shown as the dotted red arrows in the timing diagram.

Together these precedence constraints mean that the producer and consumer are tightly coupled, in the sense that a character has to be read by the consumer before the next character can be sent by the producer. That might be less than optimal if the producer's and consumer's computations take a variable amount of time. So let's see how we can relax the constraints to allow for more independence between the producer and consumer.

We can relax the execution constraints on the producer and consumer by having them communicate via an N-character first-in, first-out (FIFO) buffer. As the producer produces characters, it inserts them into the buffer. The consumer reads characters from the buffer in the same order as they were produced. The buffer can hold between 0 and N characters: if it holds 0 characters, it's empty; if it holds N characters, it's full. The producer should wait if the buffer is full; the consumer should wait if the buffer is empty.

Using the N-character FIFO buffer relaxes our second, overwrite constraint to the requirement that the i-th receive must happen before the (i+N)-th send. In other words, the producer can get up to N characters ahead of the consumer.

The FIFO buffer is implemented as an N-element character array with two indices: the read index indicates the next character to be read, and the write index indicates the next character to be written. We'll also need a counter to keep track of the number of characters held by the buffer, but that's been omitted from this diagram. The indices are incremented modulo N, i.e., the next element to be accessed after the (N-1)-st element is the 0-th element, hence the name "circular buffer".

Here's how it works. The producer runs, using the write index to add the first character to the buffer. The producer can produce additional characters, but must wait once the buffer is full.
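Pausing the walk-through for a moment, here is one way the circular buffer state just described might be declared in C. This is only a sketch with names of my choosing (fifo, in, out, count), and it includes the character counter mentioned above that was omitted from the diagram:

    #define N 8                       /* buffer capacity; 8 is just an example */

    /* Sketch of the circular buffer state (illustrative names). */
    struct fifo {
        char buf[N];                  /* the N-element character array         */
        int  in;                      /* write index: next slot to fill        */
        int  out;                     /* read index: next slot to drain        */
        int  count;                   /* number of characters in the buffer    */
    };

    /* Indices advance modulo N, so the slot after N-1 wraps back to slot 0 --
       hence the name "circular buffer". */
    static int next_index(int i) {
        return (i + 1) % N;
    }

The buffer is empty when count is 0 and full when count is N.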
The consumer can receive a character anytime the buffer is not empty, using the read index to keep track of the next character to be read. Execution of the producer and consumer can proceed in any order so long as the producer doesn't write into a full buffer and the consumer doesn't read from an empty buffer.

Here's what the code for the producer and consumer might look like (a rough sketch in C appears below). The array and indices for the circular buffer live in shared memory, where they can be accessed by both processes. The SEND routine in the producer uses the write index IN to keep track of where to write the next character. Similarly, the RCV routine in the consumer uses the read index OUT to keep track of the next character to be read. After each use, each index is incremented modulo N.

The problem with this code is that, as currently written, neither of the two precedence constraints is enforced. The consumer can read from an empty buffer, and the producer can overwrite entries when the buffer is full. We'll need to modify this code to enforce the constraints, and for that we'll introduce a new programming construct that we'll use to provide the appropriate inter-process synchronization.
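For reference, here is roughly what those unsynchronized SEND and RCV routines might look like in C. This is a sketch of the description above, not the lecture's exact code, and it has precisely the flaw just noted: nothing stops rcv from reading an empty buffer or send from overwriting a full one.

    #define N 8                    /* buffer capacity; 8 is just an example   */

    /* Shared between the producer and consumer processes. */
    char buf[N];                   /* the circular buffer                     */
    int  in  = 0;                  /* write index IN                          */
    int  out = 0;                  /* read index OUT                          */

    void send(char c) {            /* called by the producer                  */
        buf[in] = c;               /* store the next character ...            */
        in = (in + 1) % N;         /* ... and advance IN modulo N             */
    }

    char rcv(void) {               /* called by the consumer                  */
        char c = buf[out];         /* fetch the next character ...            */
        out = (out + 1) % N;       /* ... and advance OUT modulo N            */
        return c;
    }

Adding the "wait while full" and "wait while empty" checks safely is the job of the synchronization construct introduced next.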