1
00:00:00,459 --> 00:00:05,140
Now let's think a bit about what happens if
there's an error and one or more of the bits

2
00:00:05,140 --> 00:00:07,710
in our encoded data gets corrupted.

3
00:00:07,710 --> 00:00:12,760
We'll focus on single-bit errors, but much
of what we discuss can be generalized to multi-bit

4
00:00:12,760 --> 00:00:13,760
errors.

5
00:00:13,760 --> 00:00:18,929
For example, consider encoding the results
of some unpredictable event, e.g., flipping

6
00:00:18,929 --> 00:00:19,929
a fair coin.

7
00:00:19,929 --> 00:00:26,329
There are two outcomes: "heads", encoded as,
say, 0, and "tails" encoded as 1.

8
00:00:26,329 --> 00:00:31,109
Now suppose some error occurs during processing
- for example, the data is corrupted while

9
00:00:31,109 --> 00:00:36,650
being transmitted from Bob to Alice:
Bob intended to send the message "heads",

10
00:00:36,650 --> 00:00:41,760
but the 0 was corrupted and become a 1 during
transmission, so Alice receives 1, which she

11
00:00:41,760 --> 00:00:43,750
interprets as "tails".

12
00:00:43,750 --> 00:00:48,359
So this simple encoding doesn't work very
well if there's the possibility of single-bit

13
00:00:48,359 --> 00:00:50,160
errors.

14
00:00:50,160 --> 00:00:53,920
To help with our discussion, we'll introduce
the notion of "Hamming distance", defined

15
00:00:53,920 --> 00:00:59,480
as the number of positions in which the corresponding
digits differ in two encodings of the same

16
00:00:59,480 --> 00:01:00,480
length.

17
00:01:00,480 --> 00:01:06,280
For example, here are two 7-bit encodings,
which differ in their third and fifth positions,

18
00:01:06,280 --> 00:01:09,930
so the Hamming distance between the encodings
is 2.

19
00:01:09,930 --> 00:01:14,479
If someone tells us the Hamming distance of
two encodings is 0, then the two encodings

20
00:01:14,479 --> 00:01:16,250
are identical.

21
00:01:16,250 --> 00:01:21,400
Hamming distance is a handy tool for measuring
how encodings differ.

22
00:01:21,400 --> 00:01:24,070
How does this help us think about single-bit
errors?

23
00:01:24,070 --> 00:01:28,930
A single-bit error changes exactly one of
the bits of an encoding, so the Hamming distance

24
00:01:28,930 --> 00:01:34,740
between a valid binary code word and the same
code word with a single-bit error is 1.

25
00:01:34,740 --> 00:01:39,970
The difficulty with our simple encoding is
that the two valid code words ("0" and "1")

26
00:01:39,970 --> 00:01:42,270
also have a Hamming distance of 1.

27
00:01:42,270 --> 00:01:47,340
So a single-bit error changes one valid code
word into another valid code word.

28
00:01:47,340 --> 00:01:51,908
We'll show this graphically, using an arrow
to indicate that two encodings differ by a

29
00:01:51,908 --> 00:01:56,899
single bit, i.e., that the Hamming distance
between the encodings is 1.

30
00:01:56,899 --> 00:02:01,750
The real issue here is that when Alice receives
a 1, she can't distinguish between an uncorrupted

31
00:02:01,750 --> 00:02:06,799
encoding of tails and a corrupted encoding
of heads - she can't detect that an error

32
00:02:06,799 --> 00:02:08,030
occurred.

33
00:02:08,030 --> 00:02:10,940
Let's figure how to solve her problem!

34
00:02:10,940 --> 00:02:15,390
The insight is to come up with a set of valid
code words such that a single-bit error does

35
00:02:15,390 --> 00:02:18,440
NOT produce another valid code word.

36
00:02:18,440 --> 00:02:23,360
What we need are code words that differ by
at least two bits, i.e., we want the minimum

37
00:02:23,360 --> 00:02:28,310
Hamming distance between any two code words
to be at least 2.

38
00:02:28,310 --> 00:02:32,840
If we have a set of code words where the minimum
Hamming distance is 1, we can generate the

39
00:02:32,840 --> 00:02:38,020
set we want by adding a parity bit to each
of the original code words.

40
00:02:38,020 --> 00:02:40,620
There's "even parity" and "odd parity."

41
00:02:40,620 --> 00:02:45,481
Using even parity, the additional parity bit
is chosen so that the total number of 1 bits

42
00:02:45,481 --> 00:02:48,350
in the new code word are even.

43
00:02:48,350 --> 00:02:54,060
For example, our original encoding for "heads"
was 0, adding an even parity bit gives us

44
00:02:54,060 --> 00:02:55,610
00.

45
00:02:55,610 --> 00:03:00,400
Adding an even parity bit to our original
encoding for "tails" gives us 11.

46
00:03:00,400 --> 00:03:05,670
The minimum Hamming distance between code
words has increased from 1 to 2.

47
00:03:05,670 --> 00:03:08,060
How does this help?

48
00:03:08,060 --> 00:03:14,030
Consider what happens when there's a single-bit
error: 00 would be corrupted to 01 or 10,

49
00:03:14,030 --> 00:03:16,510
neither of which is a valid code word.

50
00:03:16,510 --> 00:03:17,510
Aha!

51
00:03:17,510 --> 00:03:20,820
We can detect that a single-bit error has
occurred.

52
00:03:20,820 --> 00:03:25,820
Similarly, single-bit errors for 11 would
also be detected.

53
00:03:25,820 --> 00:03:31,900
Note that the valid code words 00 and 11 both
have an even number of 1-bits, but that the

54
00:03:31,900 --> 00:03:36,540
corrupted code words 01 or 10 have an odd
number of 1-bits.

55
00:03:36,540 --> 00:03:40,510
We say that corrupted code words have a "parity
error".

56
00:03:40,510 --> 00:03:45,150
It's easy to perform a parity check: simply
count the number of 1s in the code word.

57
00:03:45,150 --> 00:03:51,260
If it's even, a single-bit error has NOT occurred;
if it's odd, a single-bit error HAS occurred.

58
00:03:51,260 --> 00:03:55,570
We'll see in a couple of chapters that the
Boolean function exclusive-or can be used

59
00:03:55,570 --> 00:03:58,150
to perform parity checks.

60
00:03:58,150 --> 00:04:02,450
Note that parity won't help us if there's
an even number of bit errors, where a corrupted

61
00:04:02,450 --> 00:04:07,630
code word would have an even number of 1-bits
and hence appear to be okay.

62
00:04:07,630 --> 00:04:12,060
Parity is useful for detecting single-bit
errors; we'll need a more sophisticated encoding

63
00:04:12,060 --> 00:04:14,750
to detect more errors.

64
00:04:14,750 --> 00:04:20,478
In general, to detect some number E of errors,
we need a minimum Hamming distance of E+1

65
00:04:20,478 --> 00:04:22,169
between code words.

66
00:04:22,169 --> 00:04:26,659
We can see this graphically below which shows
how errors can corrupt the valid code words

67
00:04:26,659 --> 00:04:31,620
000 and 111, which have a Hamming distance
of 3.

68
00:04:31,620 --> 00:04:37,000
In theory this means we should be able to
detect up to 2-bit errors.

69
00:04:37,000 --> 00:04:41,290
Each arrow represents a single-bit error and
we can see from the diagram that following

70
00:04:41,290 --> 00:04:47,750
any path of length 2 from either 000 or 111
doesn't get us to the other valid code word.

71
00:04:47,750 --> 00:04:54,189
In other words, assuming we start with either
000 or 111, we can detect the occurrence of

72
00:04:54,189 --> 00:04:56,520
up to 2 errors.

73
00:04:56,520 --> 00:05:00,979
Basically our error detection scheme relies
on choosing code words far enough apart, as

74
00:05:00,979 --> 00:05:05,930
measured by Hamming distance, so that E errors
can't corrupt one valid code word so that

75
00:05:05,930 --> 00:05:07,819
it looks like another valid code word.