1
00:00:00,000 --> 00:00:02,490
The following content is
provided under a Creative

2
00:00:02,490 --> 00:00:04,059
Commons license.

3
00:00:04,059 --> 00:00:06,360
Your support will help
MIT OpenCourseWare

4
00:00:06,360 --> 00:00:10,720
continue to offer high-quality
educational resources for free.

5
00:00:10,720 --> 00:00:13,350
To make a donation or
view additional materials

6
00:00:13,350 --> 00:00:17,290
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:17,290 --> 00:00:18,294
at ocw.mit.edu.

8
00:00:27,030 --> 00:00:30,190
GEORGE VERGHESE:
OK, let's continue.

9
00:00:30,190 --> 00:00:34,110
So we're going to
continue with linear codes

10
00:00:34,110 --> 00:00:38,350
and talk today about
error correction.

11
00:00:38,350 --> 00:00:45,810
So let me just remind you,
we're thinking of linear codes

12
00:00:45,810 --> 00:00:47,520
very concretely
as being generated

13
00:00:47,520 --> 00:00:50,250
through a process like this.

14
00:00:50,250 --> 00:00:53,100
We put this up on the
board several times.

15
00:00:53,100 --> 00:00:56,190
You've got the
data bits and then

16
00:00:56,190 --> 00:01:08,180
the parity bits being generated
by the data bits multiplying

17
00:01:08,180 --> 00:01:12,857
into a so-called
generator matrix.

18
00:01:12,857 --> 00:01:14,940
You've seen this in lecture
and recitation as well.

19
00:01:24,890 --> 00:01:28,670
And we've considered different
ways to think of this matrix.

20
00:01:28,670 --> 00:01:34,430
One way is to think of it as
made up of a bunch of rows,

21
00:01:34,430 --> 00:01:39,830
and what you're doing is
taking linear combinations

22
00:01:39,830 --> 00:01:43,220
of these rows to
generate a code word.

23
00:01:43,220 --> 00:01:47,952
So the dimensions here--
this is going to be n.

24
00:01:47,952 --> 00:01:49,910
So when you take a linear
combination of these,

25
00:01:49,910 --> 00:01:52,430
you're generating a
word that's n bits long.

26
00:01:52,430 --> 00:01:54,980
But the underlying
degrees of freedom only

27
00:01:54,980 --> 00:01:56,720
correspond to k bits,
because you're just

28
00:01:56,720 --> 00:01:59,540
doing a weighted
combination of k of these.

29
00:01:59,540 --> 00:02:01,200
OK?

30
00:02:01,200 --> 00:02:06,080
Now, we talk of these as
though they're vectors,

31
00:02:06,080 --> 00:02:09,780
you're combining them, take the
linear combinations, and so on.

32
00:02:09,780 --> 00:02:15,080
And I just wanted to
say a word about in what

33
00:02:15,080 --> 00:02:16,410
sense this is a vector.

34
00:02:16,410 --> 00:02:19,265
So this is an array of n bits.

35
00:02:23,270 --> 00:02:25,160
So we're talking
about something--

36
00:02:25,160 --> 00:02:26,630
I'll call it v, let's say--

37
00:02:26,630 --> 00:02:33,380
which is an array of n bits.

38
00:02:33,380 --> 00:02:36,560
And the question is, in
what sense is that a vector?

39
00:02:36,560 --> 00:02:40,420
In what sense does it
live in a vector space?

40
00:02:40,420 --> 00:02:42,170
So when we say vector
space, we're usually

41
00:02:42,170 --> 00:02:46,610
thinking of arrays of n elements
with real numbers in them,

42
00:02:46,610 --> 00:02:50,300
and the kind that you
use in physics, where

43
00:02:50,300 --> 00:02:52,760
you take linear combinations
of them with real numbers

44
00:02:52,760 --> 00:02:54,658
and you get new vectors.

45
00:02:54,658 --> 00:02:56,450
This is the same kind
of thing, except it's

46
00:02:56,450 --> 00:03:01,550
working over not the real
field, but as we've seen, GF(2).

47
00:03:01,550 --> 00:03:09,620
So this is a vector
space over GF(2).

48
00:03:09,620 --> 00:03:11,250
It's a funny vector
space, again,

49
00:03:11,250 --> 00:03:13,070
because it has a finite
number of elements.

50
00:03:13,070 --> 00:03:16,310
The vectors that we're
used to thinking of--

51
00:03:16,310 --> 00:03:20,000
Euclidean vector space has
an infinity of elements,

52
00:03:20,000 --> 00:03:22,100
because you can have an
array of n components,

53
00:03:22,100 --> 00:03:24,090
but each component could
be any real number.

54
00:03:24,090 --> 00:03:29,780
So any point in 3D space
would be a vector in R3.

55
00:03:29,780 --> 00:03:32,720
So this is a vector
space over GF(2),

56
00:03:32,720 --> 00:03:37,400
and it only has a finite
number of components--

57
00:03:37,400 --> 00:03:40,410
only has 2 to the
n possible vectors.

58
00:03:40,410 --> 00:03:45,350
It's a finite set of vectors,
so it's strange that way too.

59
00:03:45,350 --> 00:03:48,180
So in what sense is
that a vector space?

60
00:03:48,180 --> 00:03:52,670
Well, it turns out that
they're pretty abstract things

61
00:03:52,670 --> 00:03:54,500
that you can refer to
as vectors, provided

62
00:03:54,500 --> 00:03:56,960
they satisfy certain axioms.

63
00:03:56,960 --> 00:03:58,340
So what you want
to be able to do

64
00:03:58,340 --> 00:04:02,210
is define a sum
of these objects.

65
00:04:02,210 --> 00:04:04,100
You need to have
a set of scalars

66
00:04:04,100 --> 00:04:08,270
and define a scalar times
vector multiplication.

67
00:04:12,490 --> 00:04:14,810
And then you need a
0 vector, a vector

68
00:04:14,810 --> 00:04:16,790
that, when you add
to another vector,

69
00:04:16,790 --> 00:04:18,800
gives you the same
vector back again.

70
00:04:18,800 --> 00:04:24,080
You need certain
distributivity properties.

71
00:04:24,080 --> 00:04:28,310
So if you take a scalar
times the sum of two vectors,

72
00:04:28,310 --> 00:04:32,227
you get things like that.
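A minimal sketch of these properties for bit vectors, assuming addition is componentwise XOR and the only scalars are 0 and 1. The helper names `add` and `scale` are illustrative and not from the lecture.

```python
from itertools import product

def add(u, v):
    """Vector addition over GF(2): componentwise XOR."""
    return tuple(a ^ b for a, b in zip(u, v))

def scale(alpha, v):
    """Scalar multiplication over GF(2): the only scalars are 0 and 1."""
    return tuple(alpha & a for a in v)

n = 3
space = list(product((0, 1), repeat=n))   # all 2**n bit vectors
zero = (0,) * n

# Check a few of the axioms exhaustively on this small space.
for u in space:
    assert add(u, zero) == u              # adding the 0 vector changes nothing
    assert add(u, u) == zero              # every vector is its own negative
    for v in space:
        assert add(u, v) in space         # closure under addition
        for alpha in (0, 1):
            # distributivity: alpha (u + v) = alpha u + alpha v
            assert scale(alpha, add(u, v)) == add(scale(alpha, u), scale(alpha, v))

print(len(space), "vectors in total")
```

The exhaustive loop is only practical because the space is finite, which is the point being made about GF(2).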

73
00:04:32,227 --> 00:04:34,060
So you can list a bunch
of these properties.

74
00:04:34,060 --> 00:04:36,132
I'm not trying to teach
you what the axioms are

75
00:04:36,132 --> 00:04:37,340
that define the vector space.

76
00:04:37,340 --> 00:04:39,140
But there's a set of
axioms, and you'll

77
00:04:39,140 --> 00:04:41,810
recognize very quickly that
Euclidean space satisfies

78
00:04:41,810 --> 00:04:42,800
those axioms.

79
00:04:42,800 --> 00:04:44,675
But the point is there
are other objects that

80
00:04:44,675 --> 00:04:48,170
satisfy the same axioms, and you
can work with them as vectors--

81
00:04:48,170 --> 00:04:53,330
so notions of independence
of vectors, a basis in terms

82
00:04:53,330 --> 00:04:54,930
of which you write
other vectors--

83
00:04:54,930 --> 00:04:55,650
all of these.

84
00:04:55,650 --> 00:04:58,220
Now, I'm not assuming you've
done a linear algebra course.

85
00:04:58,220 --> 00:04:59,637
I'm assuming you've
picked up some

86
00:04:59,637 --> 00:05:02,630
of this in the course of
doing physics, and so on.

87
00:05:02,630 --> 00:05:05,090
I'm just trying to
talk intuitively here.

88
00:05:05,090 --> 00:05:10,970
One thing we don't have here is
a notion of an inner product,

89
00:05:10,970 --> 00:05:13,020
or a dot product,
or a scalar product.

90
00:05:13,020 --> 00:05:23,128
So if you had two n component
vectors in Euclidean space--

91
00:05:23,128 --> 00:05:24,920
you're probably used
to this from physics--

92
00:05:24,920 --> 00:05:27,560
you'll take inner products
defined in this fashion.

93
00:05:27,560 --> 00:05:29,780
Well, we can certainly do
this kind of computation

94
00:05:29,780 --> 00:05:32,690
with the elements
of a vector here,

95
00:05:32,690 --> 00:05:35,030
but the resulting object
doesn't have the properties

96
00:05:35,030 --> 00:05:36,660
of an inner product.

97
00:05:36,660 --> 00:05:38,780
For instance, you can take the--

98
00:05:38,780 --> 00:05:42,470
if you take the inner product
of two non-zero vectors in real

99
00:05:42,470 --> 00:05:45,630
vector space, you'll never get--

100
00:05:45,630 --> 00:05:50,360
well, you can get
the inner product

101
00:05:50,360 --> 00:05:52,040
to be 0 under very
special conditions.

102
00:05:52,040 --> 00:05:53,623
There's a notion
of orthogonality.

103
00:05:53,623 --> 00:05:55,040
It turns out that
doesn't actually

104
00:05:55,040 --> 00:05:58,590
work quite the same way
here over this space.

105
00:05:58,590 --> 00:06:02,660
So what we do have is--
we set aside orthogonality.

106
00:06:02,660 --> 00:06:07,100
We'll talk about linear
combinations of vectors.

107
00:06:12,410 --> 00:06:17,940
We'll talk about a
set of basis vectors.

108
00:06:17,940 --> 00:06:20,600
So a set of basis vectors
would be a set of vectors

109
00:06:20,600 --> 00:06:23,120
that you can take linear
combinations of to get

110
00:06:23,120 --> 00:06:25,690
other vectors in the space--

111
00:06:25,690 --> 00:06:28,730
and a minimal such set.

112
00:06:28,730 --> 00:06:31,220
So we'll be using a bit of
the language of vector spaces.

113
00:06:31,220 --> 00:06:32,678
You might have some
notions of that, which

114
00:06:32,678 --> 00:06:35,120
might come from what
you've done with physics.

115
00:06:35,120 --> 00:06:38,720
And that's all really
that we want to depend on.

116
00:06:44,290 --> 00:06:47,720
All right, so back to
this-- what we have

117
00:06:47,720 --> 00:06:50,300
is these arrays of n bits.

118
00:06:50,300 --> 00:06:53,980
We think of them as
vectors in some space.

119
00:06:53,980 --> 00:06:57,410
The dimension of the space
is the number of vectors

120
00:06:57,410 --> 00:07:00,380
that you need in order
to generate other vectors

121
00:07:00,380 --> 00:07:01,710
by linear combination.

122
00:07:01,710 --> 00:07:08,240
So the question is, can
I generate some vector

123
00:07:08,240 --> 00:07:16,010
by taking alpha 1 v1 plus
alpha 2 v2 plus alpha 3 v3?

124
00:07:16,010 --> 00:07:19,100
So I'd like to be able to
generate a vector in the space

125
00:07:19,100 --> 00:07:21,870
by taking a linear
combination of other vectors.

126
00:07:21,870 --> 00:07:25,280
So if you ask for what's the
minimum number of such vectors

127
00:07:25,280 --> 00:07:28,130
you need here in order
to be able to generate

128
00:07:28,130 --> 00:07:31,340
any vector by taking a
linear combination, that's

129
00:07:31,340 --> 00:07:33,030
the dimension of the space.

130
00:07:33,030 --> 00:07:36,440
So in that sense, it turns
out that these n-arrays

131
00:07:36,440 --> 00:07:40,250
live in an n-dimensional space.

132
00:07:40,250 --> 00:07:43,520
But they don't span all
of n-dimensional space,

133
00:07:43,520 --> 00:07:44,450
because you're just--

134
00:07:44,450 --> 00:07:47,030
you've just got k of them here.

135
00:07:47,030 --> 00:07:49,940
It turns out that
what you get by taking

136
00:07:49,940 --> 00:07:54,860
linear combinations of these
is a k-dimensional subspace

137
00:07:54,860 --> 00:07:56,875
of an n-dimensional space.

138
00:07:56,875 --> 00:07:58,250
So in some sense,
when you define

139
00:07:58,250 --> 00:08:01,670
a code, what you're
doing is you're saying,

140
00:08:01,670 --> 00:08:05,840
I have this n dimensional space
that my words can live in,

141
00:08:05,840 --> 00:08:08,960
but I'm going to restrict
myself to words that

142
00:08:08,960 --> 00:08:13,580
live in a k-dimensional subspace
so that, if a vector pops out

143
00:08:13,580 --> 00:08:16,550
of that subspace, I recognize
it as being an error.

144
00:08:16,550 --> 00:08:18,830
So that's the general idea.
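A small sketch of that idea, assuming a made-up generator matrix with k = 2 and n = 4 (the matrix `G` and the helper `combine` are hypothetical, chosen only for illustration). It enumerates the k-dimensional subspace and flags a word that has popped out of it.

```python
from itertools import product

# Hypothetical k = 2, n = 4 generator matrix; its rows are the basis vectors.
G = [(1, 0, 1, 1),
     (0, 1, 0, 1)]
k, n = len(G), len(G[0])

def combine(coeffs, rows):
    """Linear combination of the rows over GF(2)."""
    word = [0] * len(rows[0])
    for c, row in zip(coeffs, rows):
        if c:
            word = [w ^ r for w, r in zip(word, row)]
    return tuple(word)

# The code is the set of all 2**k linear combinations of the rows.
codewords = {combine(d, G) for d in product((0, 1), repeat=k)}

received = (1, 0, 1, 0)   # code word (1, 0, 1, 1) with its last bit flipped
is_codeword = received in codewords

print(len(codewords), "code words out of", 2 ** n, "possible words")
print("received word is a code word:", is_codeword)
```

Only 4 of the 16 possible 4-bit words are code words here, so a corrupted word that lands outside the subspace is recognized as an error.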

145
00:08:18,830 --> 00:08:21,193
All of this can be
done more carefully

146
00:08:21,193 --> 00:08:22,610
using the notion
of vector spaces.

147
00:08:22,610 --> 00:08:24,667
I just wanted to give
you a rough idea of that.

148
00:08:27,955 --> 00:08:29,330
This is one way
of thinking of it.

149
00:08:29,330 --> 00:08:32,680
Here was another way of thinking
of it, which was column-wise.

150
00:08:42,396 --> 00:08:46,100
We think of the generator
matrix as being made up

151
00:08:46,100 --> 00:08:47,150
of a bunch of columns.

152
00:08:54,180 --> 00:08:56,460
And that's useful
when you want to think

153
00:08:56,460 --> 00:08:59,970
about how a parity
bit is defined

154
00:08:59,970 --> 00:09:02,100
in terms of the data bits.

155
00:09:02,100 --> 00:09:06,421
So here's what you see when
you think of this column-wise.

156
00:09:13,020 --> 00:09:15,800
So let's take p1 here.

157
00:09:18,750 --> 00:09:21,630
Actually, let me
specialize this further.

158
00:09:21,630 --> 00:09:25,320
We've already said that, because
of the form in which we set up

159
00:09:25,320 --> 00:09:30,450
our code words, this is in
what's called systematic form.

160
00:09:36,137 --> 00:09:37,720
We've got the data
bits sitting there,

161
00:09:37,720 --> 00:09:39,510
and then we add in
the parity bits.

162
00:09:39,510 --> 00:09:43,620
Because of that, we've said that
there's an identity matrix here

163
00:09:43,620 --> 00:09:45,600
with 1's all the way down.

164
00:09:45,600 --> 00:09:47,350
And then we've got
some other matrix here,

165
00:09:47,350 --> 00:09:57,840
which is something
we'll denote by A.

166
00:09:57,840 --> 00:10:04,800
OK, so when we do this
multiplication in the first k

167
00:10:04,800 --> 00:10:07,500
positions, we just
pick up d1 to dk.

168
00:10:07,500 --> 00:10:11,160
In the next position over,
I get the expression for p1.

169
00:10:11,160 --> 00:10:14,940
So p1 is going to be d1
times the first entry there

170
00:10:14,940 --> 00:10:18,280
plus d2 times the
second entry, and so on.

171
00:10:18,280 --> 00:10:19,290
So here's what I get.

172
00:10:23,040 --> 00:10:29,310
Let me call this A11
in that first row.

173
00:10:29,310 --> 00:10:34,890
Here's A21, all
the way up to Ak1.

174
00:10:34,890 --> 00:10:37,650
So these are just numbering the
entries down that first column.

175
00:10:56,020 --> 00:10:57,680
So I'm taking a
combination like that,

176
00:10:57,680 --> 00:11:00,320
a linear combination
of the data bits

177
00:11:00,320 --> 00:11:01,775
to get that first parity bit.

178
00:11:14,210 --> 00:11:17,000
So the j-th parity bit
is found by going over

179
00:11:17,000 --> 00:11:18,590
to the j-th column.

180
00:11:18,590 --> 00:11:26,090
The entries here are A1j
all the way up to Akj.

181
00:11:26,090 --> 00:11:27,830
So the way matrix
multiplication works--

182
00:11:27,830 --> 00:11:30,580
if I'm looking for
the j-th entry--

183
00:11:30,580 --> 00:11:35,390
the j-th parity bit here, I
take this and do the dot product

184
00:11:35,390 --> 00:11:39,230
kind of expression with the j-th
column, so this is what I get.

185
00:11:39,230 --> 00:11:40,895
So this is a typical
parity relation.

186
00:11:47,000 --> 00:11:48,475
And it goes the other way too.

187
00:11:52,400 --> 00:11:54,980
If you had this expression,
and not the matrix,

188
00:11:54,980 --> 00:11:59,020
you can just take those numbers
and translate them back in.

189
00:11:59,020 --> 00:12:02,930
And these numbers are
just 0, 1 in our--

190
00:12:02,930 --> 00:12:06,990
in the case of a binary field
that we're working over.

191
00:12:06,990 --> 00:12:08,450
That is just 0, 1.

192
00:12:08,450 --> 00:12:12,380
So you either have the data
bit there or you don't.

193
00:12:12,380 --> 00:12:14,810
All additions are modulo
2 additions, of course.

194
00:12:23,640 --> 00:12:25,265
Actually, let me call
this the parity--

195
00:12:28,070 --> 00:12:29,450
it's the parity definition.

196
00:12:29,450 --> 00:12:32,240
I may have had another
term for it in my slides.

197
00:12:32,240 --> 00:12:37,550
Let's see here--
this parity equation.

198
00:12:37,550 --> 00:12:40,202
It just defines
the parity bit.

199
00:12:40,202 --> 00:12:42,035
Here's what I think of
as a parity relation.

200
00:12:50,450 --> 00:12:51,250
What is this sum?

201
00:12:55,130 --> 00:12:56,885
What does that sum
work out to be?

202
00:12:56,885 --> 00:12:57,760
AUDIENCE: [INAUDIBLE]

203
00:12:57,760 --> 00:12:58,300
GEORGE VERGHESE: Sorry?

204
00:12:58,300 --> 00:12:59,550
AUDIENCE: [INAUDIBLE]

205
00:12:59,550 --> 00:13:03,590
GEORGE VERGHESE: 0--
because I'm adding

206
00:13:03,590 --> 00:13:05,900
pj to itself in GF(2).

207
00:13:05,900 --> 00:13:08,508
That gives me 0.

208
00:13:08,508 --> 00:13:10,550
And this is what I think
of as a parity relation.

209
00:13:17,390 --> 00:13:20,390
So a parity equation
defines my parity bits,

210
00:13:20,390 --> 00:13:24,710
but I get immediately from that
a parity relation that relates

211
00:13:24,710 --> 00:13:26,880
my parity bit to my data bits.
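The two statements can be checked side by side in a short sketch. The matrix `A` and data vector `D` below are made-up values, and the function names are illustrative: the parity equation computes each pj from column j of A, and the parity relation says pj plus that same sum is 0 in GF(2).

```python
# Hypothetical k = 3 data bits and 2 parity bits; A[i][j] says whether
# data bit i takes part in parity bit j.
A = [[1, 0],
     [1, 1],
     [0, 1]]
D = [1, 1, 0]

def parity_bits(D, A):
    """Parity equation: pj = d1*A1j + d2*A2j + ... + dk*Akj (mod 2)."""
    k, m = len(A), len(A[0])
    return [sum(D[i] * A[i][j] for i in range(k)) % 2 for j in range(m)]

def parity_relations(D, P, A):
    """Parity relation: pj + d1*A1j + ... + dk*Akj, which should be 0 (mod 2)."""
    k, m = len(A), len(A[0])
    return [(P[j] + sum(D[i] * A[i][j] for i in range(k))) % 2 for j in range(m)]

P = parity_bits(D, A)
relations = parity_relations(D, P, A)
print("parity bits:", P)
print("parity relations:", relations)
```

Every entry of `relations` comes out 0 for an uncorrupted word, because each relation adds pj to itself.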

212
00:13:29,840 --> 00:13:32,000
Turns out this is
important for the way

213
00:13:32,000 --> 00:13:35,520
we're going to talk
about error correction.

214
00:13:35,520 --> 00:13:41,070
OK, so let me step
off the dime here.

215
00:13:41,070 --> 00:13:44,150
And this is a
particular example.

216
00:13:44,150 --> 00:13:46,275
I just wanted to set
up the general notation

217
00:13:46,275 --> 00:13:47,400
before we got back to this.

218
00:13:47,400 --> 00:13:49,830
We've looked at this before--

219
00:13:49,830 --> 00:13:52,970
9, 4, 4 rectangular code.

220
00:13:52,970 --> 00:13:57,230
So what might that be?

221
00:13:57,230 --> 00:13:58,160
How many data bits?

222
00:14:01,010 --> 00:14:04,370
9, 4, 4--

223
00:14:04,370 --> 00:14:06,560
4 data bits, right?

224
00:14:06,560 --> 00:14:08,210
So how would I get in 9, 4, 4?

225
00:14:08,210 --> 00:14:13,010
I'd have D1, D2, D3, D4.

226
00:14:13,010 --> 00:14:19,520
And then, depending on how I
number this, P1, P2, P3, P4,

227
00:14:19,520 --> 00:14:21,955
P5 would be one
way to number it.

228
00:14:21,955 --> 00:14:24,080
I don't know if that
corresponds to what's up here.

229
00:14:24,080 --> 00:14:24,710
Can we check?

230
00:14:28,100 --> 00:14:29,140
So what's P1?

231
00:14:29,140 --> 00:14:35,850
P1 is going to be D1 times 1
plus D2 times 1, and that's it.

232
00:14:35,850 --> 00:14:37,760
So it's D1 plus D2.

233
00:14:37,760 --> 00:14:42,750
So P1 is indeed that
element there on the board.

234
00:14:42,750 --> 00:14:45,960
And so you can check
each of these entries.

235
00:14:45,960 --> 00:14:46,460
Let's see.

236
00:14:46,460 --> 00:14:50,510
P5-- that's the
last entry up there.

237
00:14:50,510 --> 00:14:54,740
That's going to be this
row inner producted or dot

238
00:14:54,740 --> 00:14:58,040
producted with all
the sequence of 1's

239
00:14:58,040 --> 00:15:02,000
there, so it's going to be D1
plus D2 plus D3 plus D4, which

240
00:15:02,000 --> 00:15:07,460
is indeed how that overall
parity bit is defined.
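A sketch of this example in code. The lecture states P1 = D1 + D2 and P5 = D1 + D2 + D3 + D4; the choices P2 = D3 + D4 (the other row), P3 = D1 + D3 and P4 = D2 + D4 (the two columns) are assumptions about how the board numbers the remaining parity bits.

```python
# Rows of A correspond to D1..D4, columns to P1..P5.
# Assumed numbering: P1, P2 are row parities, P3, P4 are column parities,
# P5 is the overall parity bit.
A = [[1, 0, 1, 0, 1],   # D1 enters P1, P3, P5
     [1, 0, 0, 1, 1],   # D2 enters P1, P4, P5
     [0, 1, 1, 0, 1],   # D3 enters P2, P3, P5
     [0, 1, 0, 1, 1]]   # D4 enters P2, P4, P5
k, m = len(A), len(A[0])

# Systematic generator matrix G = [I | A], of size k x n with n = k + m.
G = [[int(i == j) for j in range(k)] + A[i] for i in range(k)]

def encode(D, G):
    """Code word C = D G over GF(2)."""
    n = len(G[0])
    return [sum(D[i] * G[i][col] for i in range(len(G))) % 2 for col in range(n)]

C = encode([1, 0, 1, 1], G)
print(C)
```

The first four entries of `C` repeat the data bits, which is what systematic form means, and the last five are the parity bits.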

241
00:15:10,720 --> 00:15:12,470
So again, you see in
the generator matrix,

242
00:15:12,470 --> 00:15:13,940
you have the identity
matrix there,

243
00:15:13,940 --> 00:15:19,018
and then you have this matrix
that we're referring to as A.

244
00:15:19,018 --> 00:15:20,810
Now, the notation is
a little bit different

245
00:15:20,810 --> 00:15:22,580
reading chapters 5
and 6, by the way.

246
00:15:22,580 --> 00:15:25,280
I've tried to stick with the
notation I had in lecture

247
00:15:25,280 --> 00:15:27,170
last time in the chapter
5 notation, which

248
00:15:27,170 --> 00:15:32,390
uses capital D for the
data bit vector and capital

249
00:15:32,390 --> 00:15:33,960
C for the code word.

250
00:15:33,960 --> 00:15:37,280
So you'll see slightly
different notation in chapter 6,

251
00:15:37,280 --> 00:15:40,640
but I think you'll
navigate fine.

252
00:15:40,640 --> 00:15:43,580
One other term here--

253
00:15:43,580 --> 00:15:49,280
we say that these code words
live in the row space of G. So

254
00:15:49,280 --> 00:15:51,570
the space that we generate,
the space of vectors,

255
00:15:51,570 --> 00:15:54,020
the subspace of the big
space that we generate

256
00:15:54,020 --> 00:15:55,910
by taking linear
combinations of these rows

257
00:15:55,910 --> 00:15:58,100
is referred to as
the row space of G.

258
00:15:58,100 --> 00:16:01,190
So we define the
code by defining

259
00:16:01,190 --> 00:16:04,910
a G. If the code's going
to be in systematic form,

260
00:16:04,910 --> 00:16:07,130
we have the identity here,
and then some matrix.

261
00:16:07,130 --> 00:16:09,200
This is all for linear codes.

262
00:16:09,200 --> 00:16:12,197
And then the code words live in
the row space of this matrix.

263
00:16:12,197 --> 00:16:14,780
In other words, they're obtained
by taking linear combinations

264
00:16:14,780 --> 00:16:15,290
of the rows.

265
00:16:22,550 --> 00:16:24,500
Here's what I already
have on the board.

266
00:16:27,220 --> 00:16:30,770
And it's just to say that the
matrix A that's sitting out

267
00:16:30,770 --> 00:16:31,610
here--

268
00:16:31,610 --> 00:16:35,540
this piece-- is
obtained directly

269
00:16:35,540 --> 00:16:39,030
from the parity relations.

270
00:16:39,030 --> 00:16:40,410
OK, so let's think about this.

271
00:16:44,120 --> 00:16:46,540
Can two columns of the
matrix A be the same?

272
00:16:50,500 --> 00:16:52,910
What happens if two
columns of A are the same?

273
00:16:59,447 --> 00:17:01,780
Let's say these two columns
are identical to each other.

274
00:17:05,440 --> 00:17:09,740
What is that telling us
about the code that we have?

275
00:17:11,940 --> 00:17:12,440
Yeah?

276
00:17:12,440 --> 00:17:15,863
AUDIENCE: [INAUDIBLE]

277
00:17:15,863 --> 00:17:16,780
GEORGE VERGHESE: Yeah.

278
00:17:16,780 --> 00:17:18,650
And so basically, one
of those parity bits

279
00:17:18,650 --> 00:17:20,530
is not buying you
anything, right?

280
00:17:20,530 --> 00:17:21,700
Yeah.

281
00:17:21,700 --> 00:17:28,280
OK, so if you did discover
two columns were

282
00:17:28,280 --> 00:17:31,010
identical, then one
of those parity bits

283
00:17:31,010 --> 00:17:33,140
is not checking a different
linear combination--

284
00:17:33,140 --> 00:17:35,350
checking the same
linear combination,

285
00:17:35,350 --> 00:17:37,340
and so it's not
buying you anything.

286
00:17:37,340 --> 00:17:38,495
What about two rows?

287
00:17:42,780 --> 00:17:47,114
Can two rows of the
matrix be identical?

288
00:17:51,560 --> 00:17:57,460
So let's actually think of them.

289
00:17:57,460 --> 00:18:03,000
Erase that-- can I have two
identical rows here in A?

290
00:18:14,580 --> 00:18:18,420
And if I did, what
would it mean?

291
00:18:18,420 --> 00:18:20,100
OK, let's say I have
two identical rows.

292
00:18:20,100 --> 00:18:20,850
What does it mean?

293
00:18:20,850 --> 00:18:22,650
What does it signify?

294
00:18:22,650 --> 00:18:23,150
Someone?

295
00:18:26,690 --> 00:18:27,190
Yeah?

296
00:18:27,190 --> 00:18:28,950
AUDIENCE: [INAUDIBLE]

297
00:18:28,950 --> 00:18:29,550
GEORGE VERGHESE:
Could you speak up?

298
00:18:29,550 --> 00:18:29,760
Sorry.

299
00:18:29,760 --> 00:18:30,677
My hearing's not good.

300
00:18:30,677 --> 00:18:36,543
AUDIENCE: [INAUDIBLE]

301
00:18:36,543 --> 00:18:38,210
GEORGE VERGHESE: So
there'll be two data

302
00:18:38,210 --> 00:18:42,050
bits that are entering the same
way in every parity relation.

303
00:18:42,050 --> 00:18:44,490
If two rows are the
same here, then there

304
00:18:44,490 --> 00:18:46,850
are two data bits
here that are entering

305
00:18:46,850 --> 00:18:50,800
every parity relationship in
exactly the same combination.

306
00:18:50,800 --> 00:18:52,910
And so you're not going
to be able to distinguish

307
00:18:52,910 --> 00:18:55,160
between an error that happens
in one of them and the--

308
00:18:55,160 --> 00:18:58,100
and an error that
happens in another one.

309
00:18:58,100 --> 00:18:59,990
So this is a problem.
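A small sketch of why identical rows are a problem, using a made-up matrix `A_bad` whose first two rows are the same. The helper `failed_checks` is hypothetical: it lists which parity relations stop summing to 0 after an error.

```python
# Hypothetical A with k = 3 data bits and 3 parity bits.
A_bad = [[1, 1, 0],
         [1, 1, 0],   # identical to the row above
         [0, 1, 1]]

def failed_checks(data, parity, A):
    """Indices j where pj + sum_i di*Aij is no longer 0 (mod 2)."""
    k, m = len(A), len(A[0])
    return [j for j in range(m)
            if (parity[j] + sum(data[i] * A[i][j] for i in range(k))) % 2]

data = [1, 0, 1]
parity = [sum(data[i] * A_bad[i][j] for i in range(3)) % 2 for j in range(3)]

flip_first = [data[0] ^ 1, data[1], data[2]]    # error in the first data bit
flip_second = [data[0], data[1] ^ 1, data[2]]   # error in the second data bit

print(failed_checks(flip_first, parity, A_bad))
print(failed_checks(flip_second, parity, A_bad))
```

Both single-bit errors break the same set of parity relations, so the receiver has no way to tell which of the two data bits was hit.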

310
00:18:59,990 --> 00:19:04,670
All right, so there are
certain conditions that the A

311
00:19:04,670 --> 00:19:05,420
he has to satisfy.

312
00:19:07,940 --> 00:19:11,660
All right, here's
another important matrix.

313
00:19:11,660 --> 00:19:14,030
You may have already
seen it in reading.

314
00:19:14,030 --> 00:19:16,240
You may not have seen
it yet in recitation.

315
00:19:16,240 --> 00:19:19,130
And it's a matrix
that we call H.

316
00:19:19,130 --> 00:19:25,250
And let's just think
about what it's doing.

317
00:19:25,250 --> 00:19:27,560
What I'm trying
to do is basically

318
00:19:27,560 --> 00:19:29,870
summarize this set of
equations, the parity

319
00:19:29,870 --> 00:19:33,650
relations in matrix form.

320
00:19:33,650 --> 00:19:38,060
So let's take a parity
relation that we

321
00:19:38,060 --> 00:19:39,680
had in this particular case.

322
00:19:39,680 --> 00:19:42,320
In fact, let's go back
to a specific one.

323
00:19:45,680 --> 00:19:47,362
Let's take the first
parity relationship

324
00:19:47,362 --> 00:19:48,320
that we had over there.

325
00:19:48,320 --> 00:19:55,700
We said that P1 for this
code was D1 plus D2, right?

326
00:19:55,700 --> 00:19:57,790
That's the equation
for the parity.

327
00:19:57,790 --> 00:20:01,310
The parity relationship is this.

328
00:20:04,070 --> 00:20:06,680
How would I express
that in matrix form,

329
00:20:06,680 --> 00:20:08,390
as part of a matrix equation?

330
00:20:08,390 --> 00:20:12,240
Well, that's what we're
starting to assemble here.

331
00:20:12,240 --> 00:20:14,690
So let me show you what that is.

332
00:20:14,690 --> 00:20:16,800
Let's look at the
top row out here.

333
00:20:16,800 --> 00:20:18,860
What's the top row telling us?

334
00:20:18,860 --> 00:20:27,020
It says 1 times D1 plus 1 times
D2 plus 1 times P1 equals 0.

335
00:20:27,020 --> 00:20:28,540
That's all that enters.

336
00:20:28,540 --> 00:20:32,550
So that first row is capturing
the first parity relationship.

337
00:20:32,550 --> 00:20:34,190
And you can go down here.

338
00:20:34,190 --> 00:20:35,540
Go to the second row.

339
00:20:35,540 --> 00:20:42,410
This is saying D3 plus
D4 plus P2 is equal to 0.

340
00:20:42,410 --> 00:20:44,450
That's indeed the
parity relationship

341
00:20:44,450 --> 00:20:48,350
associated with the second
row in the rectangular code.

342
00:20:48,350 --> 00:20:50,510
So all that this is doing
is listing all the parity

343
00:20:50,510 --> 00:20:52,620
relationships.

344
00:20:52,620 --> 00:20:54,300
So how many of
these do you have?

345
00:20:54,300 --> 00:20:56,450
Well, you have as
many relationships

346
00:20:56,450 --> 00:20:58,560
as you have parity bits.

347
00:20:58,560 --> 00:21:01,580
So this is n minus
k times n matrix,

348
00:21:01,580 --> 00:21:07,280
and it's just listing
the parity relations.

349
00:21:07,280 --> 00:21:10,970
We call this matrix
H, and interestingly--

350
00:21:10,970 --> 00:21:14,610
let's see-- there's an
identity matrix sitting here,

351
00:21:14,610 --> 00:21:17,360
and the reason is that
each of these equations

352
00:21:17,360 --> 00:21:19,390
involves only one parity bit.

353
00:21:19,390 --> 00:21:21,740
There's only a single PI
that's involved in each parity

354
00:21:21,740 --> 00:21:25,040
relation, and so there
should be only one

355
00:21:25,040 --> 00:21:28,370
of these columns
picked as a 1 when

356
00:21:28,370 --> 00:21:32,000
you get to this segment that
multiplies the parity bits.

357
00:21:32,000 --> 00:21:34,460
So there's an identity
matrix sitting there,

358
00:21:34,460 --> 00:21:36,930
and then there's
the rest of it here.

359
00:21:36,930 --> 00:21:41,150
So here's the identity matrix
and here's the rest of it.

360
00:21:41,150 --> 00:21:44,310
And not surprisingly,
the rest of it--

361
00:21:44,310 --> 00:21:47,750
well, it comes from the
same set of coefficients,

362
00:21:47,750 --> 00:21:49,412
and so it relates
to the A matrix.

363
00:21:49,412 --> 00:21:50,870
And if you look at
it carefully, it

364
00:21:50,870 --> 00:21:54,440
turns out, well, it is the A
matrix, but turned on its side.

365
00:21:57,970 --> 00:22:01,370
A superscript T
means A transposed--

366
00:22:06,640 --> 00:22:08,960
i.e., rows become columns.

367
00:22:13,965 --> 00:22:15,840
We're taking the A
matrix, and in some sense,

368
00:22:15,840 --> 00:22:16,810
turning it on its side.

369
00:22:19,890 --> 00:22:24,540
So that makes sense, because
what defined the first parity

370
00:22:24,540 --> 00:22:26,160
relationship?

371
00:22:26,160 --> 00:22:30,775
Well, we found it in the
columns of A here before.

372
00:22:30,775 --> 00:22:32,650
That's what defined the
parity relationships.

373
00:22:32,650 --> 00:22:35,165
And now we're writing
it out in this row form,

374
00:22:35,165 --> 00:22:36,600
so you've got to
transpose things.

375
00:22:36,600 --> 00:22:40,120
You've got to take what was a
column in A and make it a row.

376
00:22:40,120 --> 00:22:42,990
So what you're seeing
out here at the top--

377
00:22:42,990 --> 00:22:45,180
that was the first
column of A. So now it's

378
00:22:45,180 --> 00:22:47,400
the first row of A transpose.

379
00:22:47,400 --> 00:22:49,860
This was the second
column of A. Now

380
00:22:49,860 --> 00:22:53,160
it becomes the second row
of A transpose, and so on.

381
00:22:53,160 --> 00:22:54,690
So that's what that
relationship is.
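A sketch of how H can be assembled from A for the rectangular code, as A transpose followed by an identity matrix. As before, the columns of `A` for P3 and P4 are an assumed numbering; the rows for P1, P2 and P5 follow what is stated in the lecture.

```python
# Rows of A correspond to D1..D4, columns to P1..P5 (P3, P4 assumed).
A = [[1, 0, 1, 0, 1],
     [1, 0, 0, 1, 1],
     [0, 1, 1, 0, 1],
     [0, 1, 0, 1, 1]]
k, m = len(A), len(A[0])   # m = n - k parity bits

# A transpose: column j of A becomes row j.
A_T = [[A[i][j] for i in range(k)] for j in range(m)]

# H = [A^T | I], one row per parity relation, n = k + m columns.
H = [A_T[j] + [int(j == col) for col in range(m)] for j in range(m)]

for row in H:
    print(row)
```

The first row reads D1 + D2 + P1 = 0 and the second reads D3 + D4 + P2 = 0, matching the two relations read off the slide.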

382
00:23:04,110 --> 00:23:13,920
So I can write the parity
relationships in the form H

383
00:23:13,920 --> 00:23:19,140
times C equals a
whole bunch of 0's.

384
00:23:19,140 --> 00:23:22,230
This is n minus k parity relationships.

385
00:23:22,230 --> 00:23:23,550
This is n minus k 0's here.

386
00:23:28,880 --> 00:23:32,120
But I could rewrite the
same thing turned on its side,

387
00:23:32,120 --> 00:23:34,320
and that's what you're
seeing over here.

388
00:23:34,320 --> 00:23:40,240
So if H is the matrix
there, what is H transpose?

389
00:23:43,120 --> 00:23:43,620
Let's see.

390
00:23:43,620 --> 00:23:44,960
I'm going to change--

391
00:23:44,960 --> 00:23:46,940
interchange rows and columns.

392
00:23:46,940 --> 00:23:50,750
So what's going to happen
is that the first row of H

393
00:23:50,750 --> 00:23:54,410
is going to become the
first column of H transpose.

394
00:23:54,410 --> 00:23:57,200
And so if you imagine how
this gets turned on its side,

395
00:23:57,200 --> 00:23:58,790
here's what H
transpose looks like.

396
00:24:02,930 --> 00:24:04,620
H transpose looks like that.

397
00:24:04,620 --> 00:24:07,790
So when you transpose a matrix
whose entries are blocks,

398
00:24:07,790 --> 00:24:10,790
you end up flipping
the matrix around,

399
00:24:10,790 --> 00:24:13,110
but also transposing
each of the blocks.

400
00:24:13,110 --> 00:24:14,150
So that's A transpose.

401
00:24:22,830 --> 00:24:23,770
Oh, actually, sorry.

402
00:24:23,770 --> 00:24:26,950
I wrote this one
wrong, didn't I?

403
00:24:26,950 --> 00:24:29,472
What is our vector C?

404
00:24:29,472 --> 00:24:30,550
C is this vector.

405
00:24:30,550 --> 00:24:31,400
It's a code word.

406
00:24:34,360 --> 00:24:38,807
When I set this up in matrix
form, I got a column vector,

407
00:24:38,807 --> 00:24:40,390
so I've got to
transpose this as well.

408
00:24:40,390 --> 00:24:42,755
That's what I was missing.

409
00:24:42,755 --> 00:24:44,380
That's C transpose
that I'm looking at.

410
00:24:46,930 --> 00:24:49,600
So when I write down the
parity relationships,

411
00:24:49,600 --> 00:24:53,260
I've got this, but I could
also write it in the form--

412
00:24:56,357 --> 00:24:58,690
when you transpose a product
of things, what do you get?

413
00:24:58,690 --> 00:25:04,150
Well, it turns out, if I
transpose the product of two

414
00:25:04,150 --> 00:25:13,008
things, I get the product
of the transposes,

415
00:25:13,008 --> 00:25:14,050
but in the reverse order.

416
00:25:16,660 --> 00:25:19,190
So these are just different
ways of arranging the equations.

417
00:25:19,190 --> 00:25:24,367
So I could have
written it in this form

418
00:25:24,367 --> 00:25:26,200
or I could take the
transpose of both sides,

419
00:25:26,200 --> 00:25:28,283
and I get something that
looks slightly different.

420
00:25:28,283 --> 00:25:37,900
So here, this would
be a row of 0's.
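Both arrangements can be checked numerically. This sketch reuses the rectangular-code matrix with the same assumed numbering of P3 and P4, and the helper names are illustrative. It computes H times C transpose (a column of 0's) and C times H transpose (a row of 0's) for every code word.

```python
from itertools import product

A = [[1, 0, 1, 0, 1],
     [1, 0, 0, 1, 1],
     [0, 1, 1, 0, 1],
     [0, 1, 0, 1, 1]]
k, m = len(A), len(A[0])

def identity(size):
    return [[int(i == j) for j in range(size)] for i in range(size)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product over GF(2)."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*Y)]
            for row in X]

G = [identity(k)[i] + A[i] for i in range(k)]              # k x n
H = [transpose(A)[j] + identity(m)[j] for j in range(m)]   # (n - k) x n

all_zero = True
for D in product((0, 1), repeat=k):
    C = matmul([list(D)], G)                 # code word as a 1 x n row
    col_form = matmul(H, transpose(C))       # H C^T, an (n - k) x 1 column
    row_form = matmul(C, transpose(H))       # C H^T, a 1 x (n - k) row
    assert transpose(col_form) == row_form   # (H C^T)^T = C H^T
    if any(row_form[0]):
        all_zero = False

print("every code word satisfies all parity relations:", all_zero)
```

The assertion inside the loop is the transpose-of-a-product rule: transposing H C^T gives C H^T, with the factors in reverse order.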

421
00:25:37,900 --> 00:25:40,150
So this is just to get you
comfortable with the matrix

422
00:25:40,150 --> 00:25:40,720
operations.

423
00:25:40,720 --> 00:25:42,792
You'll see lots of
this in chapter 6.

424
00:25:42,792 --> 00:25:44,750
You want to get a little
comfortable with that.

425
00:25:54,660 --> 00:25:56,730
Here's a question for you.

426
00:25:56,730 --> 00:25:57,885
Here's my H matrix.

427
00:26:01,080 --> 00:26:02,085
Here are my code words.

428
00:26:06,170 --> 00:26:09,400
If I have a code word
of minimum weight,

429
00:26:09,400 --> 00:26:12,160
how many 1's would
it have in it?

430
00:26:15,280 --> 00:26:19,780
In this particular case,
this is a 9, 4, 4 code.

431
00:26:19,780 --> 00:26:23,155
If I had a code word
of minimum weight,

432
00:26:23,155 --> 00:26:24,530
how many 1's would
it have in it?

433
00:26:33,300 --> 00:26:35,190
I heard something,
but I didn't hear

434
00:26:35,190 --> 00:26:37,430
where it came from, or
didn't hear it very clearly.

435
00:26:37,430 --> 00:26:37,930
Yeah?

436
00:26:37,930 --> 00:26:38,450
AUDIENCE: 4--

437
00:26:38,450 --> 00:26:38,750
GEORGE VERGHESE: 4?

438
00:26:38,750 --> 00:26:39,250
Yeah, OK.

439
00:26:39,250 --> 00:26:40,290
[INAUDIBLE] heard there.

440
00:26:40,290 --> 00:26:40,830
OK.

441
00:26:40,830 --> 00:26:43,650
So the code word
of minimum weight

442
00:26:43,650 --> 00:26:49,200
would have weight 4, because
this is a code of distance 4.

443
00:26:49,200 --> 00:26:51,420
It's a linear code.

444
00:26:51,420 --> 00:26:53,470
The minimum Hamming
distance is 4.

445
00:26:53,470 --> 00:26:56,730
It's a linear code, so we
know, for a linear code,

446
00:26:56,730 --> 00:27:00,420
the minimum weight you'll find
among all code words is 4.

447
00:27:00,420 --> 00:27:03,540
OK, so we've got a vector
here that has four 1's in it,

448
00:27:03,540 --> 00:27:05,940
and everything else is 0.

449
00:27:05,940 --> 00:27:09,480
So when I take this
computation, what

450
00:27:09,480 --> 00:27:10,800
is it that I'm actually doing?

451
00:27:10,800 --> 00:27:14,580
If I take the matrix
H and I multiply it

452
00:27:14,580 --> 00:27:16,320
by a vector that
has four 1's in it,

453
00:27:16,320 --> 00:27:18,570
and everything else is 0,
what is it that I'm actually

454
00:27:18,570 --> 00:27:20,912
doing to that matrix?

455
00:27:20,912 --> 00:27:22,400
Yeah?

456
00:27:22,400 --> 00:27:24,772
AUDIENCE: [INAUDIBLE]

457
00:27:24,772 --> 00:27:25,980
GEORGE VERGHESE: Of the rows?

458
00:27:25,980 --> 00:27:28,562
Am I taking combination
of the rows?

459
00:27:28,562 --> 00:27:30,728
AUDIENCE: [INAUDIBLE]

460
00:27:30,728 --> 00:27:32,520
GEORGE VERGHESE: When
I do a multiplication

461
00:27:32,520 --> 00:27:35,340
in this fashion with the
vector on the right-hand side,

462
00:27:35,340 --> 00:27:38,370
I end up doing the opposite
of what I was doing here.

463
00:27:38,370 --> 00:27:41,020
When I have the vector on
the left and I multiply,

464
00:27:41,020 --> 00:27:43,050
I'm taking the
combinations of the rows.

465
00:27:43,050 --> 00:27:44,610
When I have matrix
times vector, I'm

466
00:27:44,610 --> 00:27:47,310
taking combinations
of the columns.

467
00:27:47,310 --> 00:27:50,040
So if I had a vector
here with four 1's in it,

468
00:27:50,040 --> 00:27:51,060
and everything's 0--

469
00:27:51,060 --> 00:27:54,780
everything else
0, I'd be taking--

470
00:27:54,780 --> 00:27:58,170
I'd be picking out four
columns of this to add,

471
00:27:58,170 --> 00:27:59,700
and the result
would be a 0 vector.

472
00:28:02,920 --> 00:28:03,445
Yeah?

473
00:28:03,445 --> 00:28:07,350
AUDIENCE: [INAUDIBLE]
all 0 [INAUDIBLE]

474
00:28:07,350 --> 00:28:08,630
GEORGE VERGHESE: All 0's here?

475
00:28:08,630 --> 00:28:10,005
AUDIENCE: All 0's
for your code--

476
00:28:12,635 --> 00:28:14,010
GEORGE VERGHESE:
Were you asking,

477
00:28:14,010 --> 00:28:15,128
why all 0's for the code?

478
00:28:15,128 --> 00:28:16,920
AUDIENCE: Well, why
you can't have all 0's?

479
00:28:16,920 --> 00:28:18,420
GEORGE VERGHESE:
You can have all 0.

480
00:28:18,420 --> 00:28:19,532
AUDIENCE: [INAUDIBLE]

481
00:28:19,532 --> 00:28:20,490
GEORGE VERGHESE: Sorry.

482
00:28:20,490 --> 00:28:22,470
The minimum weight non-zero
vector is the Hamming distance.

483
00:28:22,470 --> 00:28:23,100
Sorry.

484
00:28:23,100 --> 00:28:24,760
I may have dropped that word.

485
00:28:24,760 --> 00:28:27,260
For a linear code, we know that
the minimum Hamming distance

486
00:28:27,260 --> 00:28:29,525
is the minimum weight
non-zero vector.

487
00:28:29,525 --> 00:28:30,650
That's what I meant to say.

488
00:28:30,650 --> 00:28:32,130
I may have neglected
to say that.

489
00:28:32,130 --> 00:28:33,473
Thanks for catching that.

490
00:28:33,473 --> 00:28:34,890
So we have a
non-zero vector here.

491
00:28:34,890 --> 00:28:36,420
It's got four 1's in it.

492
00:28:36,420 --> 00:28:39,060
What that tells us is that there
are four columns here that we

493
00:28:39,060 --> 00:28:42,023
can add together and
get the 0 vector.

494
00:28:42,023 --> 00:28:44,190
Do you think it's going to
be possible to find three

495
00:28:44,190 --> 00:28:47,130
columns here that we could add
together and get the 0 vector?

496
00:28:55,800 --> 00:28:57,745
I'm not asking
you to actually do

497
00:28:57,745 --> 00:28:59,370
the computation in
your head, but based

498
00:28:59,370 --> 00:29:03,150
on the reasoning we just had,
if you found three columns that

499
00:29:03,150 --> 00:29:05,880
could be added together to
give you the 0 vector, that

500
00:29:05,880 --> 00:29:10,110
would mean you'd have a vector
here, a code word with--

501
00:29:10,110 --> 00:29:12,360
or any vector here
with three ones in it,

502
00:29:12,360 --> 00:29:15,875
everything else 0, such
that this product was 0.

503
00:29:15,875 --> 00:29:17,250
So it would be a
valid code word.

504
00:29:20,220 --> 00:29:25,410
Is it possible to take a
vector here with just three 1's

505
00:29:25,410 --> 00:29:27,750
in it, everything else 0,
and have a valid code word?

506
00:29:30,940 --> 00:29:31,440
No, right?

507
00:29:31,440 --> 00:29:34,290
We said the minimum
Hamming distance is 4,

508
00:29:34,290 --> 00:29:37,722
the minimum weight
code word here is--

509
00:29:37,722 --> 00:29:38,430
has got weight 4.

510
00:29:38,430 --> 00:29:42,450
You'll not find a valid
code word of weight 3.

511
00:29:42,450 --> 00:29:44,663
So you can actually
look at this H matrix

512
00:29:44,663 --> 00:29:46,830
and figure out what the
minimum Hamming distance is.

513
00:29:46,830 --> 00:29:49,410
It's basically the
minimum number of columns

514
00:29:49,410 --> 00:29:53,610
that you can add to
get the 0 vector.

515
00:29:53,610 --> 00:29:57,390
Now, does that tell
you, by the way, why you

516
00:29:57,390 --> 00:30:00,067
can't, in this instance, have--

517
00:30:00,067 --> 00:30:01,650
well, if you discovered
that you could

518
00:30:01,650 --> 00:30:04,570
add two columns together
and get the 0 vector,

519
00:30:04,570 --> 00:30:05,730
what would that tell you?

520
00:30:05,730 --> 00:30:10,350
If two columns of A transpose
were identical or two rows of A

521
00:30:10,350 --> 00:30:12,840
were identical,
that would tell you

522
00:30:12,840 --> 00:30:16,440
that you can add two columns
and get the 0 vector.

523
00:30:16,440 --> 00:30:19,780
That would mean that the Hamming
distance is 2, not 4, right?

524
00:30:19,780 --> 00:30:25,020
A Hamming distance 2
code is not worth much.

525
00:30:25,020 --> 00:30:29,190
It's no good for
error correction.

526
00:30:29,190 --> 00:30:35,810
That comes back to what I said
earlier in these questions.

527
00:30:38,390 --> 00:30:43,760
If A had two rows the
same, well, there's

528
00:30:43,760 --> 00:30:45,560
two data bits that
are always entering

529
00:30:45,560 --> 00:30:47,143
in the same combination,
so you're not

530
00:30:47,143 --> 00:30:50,268
protecting against
individual errors there.

531
00:30:50,268 --> 00:30:52,310
And so it's exactly that
issue that we're seeing.

532
00:30:52,310 --> 00:30:53,727
The Hamming distance
ends up being

533
00:30:53,727 --> 00:30:58,020
2, if you have two rows
of A that are identical.

534
00:30:58,020 --> 00:30:59,750
OK, so there's a lot
that can be gleaned

535
00:30:59,750 --> 00:31:07,640
from the generator matrix
and the parity check matrix.

536
00:31:07,640 --> 00:31:09,440
So this is what I
just went through--

537
00:31:09,440 --> 00:31:12,020
that, if you have
the H matrix, you

538
00:31:12,020 --> 00:31:15,672
can get the minimum
distance D by looking

539
00:31:15,672 --> 00:31:17,630
to see what's the minimum
number of columns you

540
00:31:17,630 --> 00:31:19,010
can add to get the 0 vector.
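
As a rough sketch of that column test in Python: the parity layout in `A` below is an assumption (a 2-by-2 block of data bits followed by two row parities, two column parities, and an overall parity), since the slide itself is not reproduced here, and `parity_check_matrix` and `min_distance` are illustrative names. The search is brute force, so it only makes sense for small codes.

```python
from itertools import combinations

# Assumed parity layout for the (9, 4, 4) rectangular code: data bits d1..d4
# in a 2x2 grid, then two row parities, two column parities, overall parity.
A = [
    [1, 0, 1, 0, 1],  # d1
    [1, 0, 0, 1, 1],  # d2
    [0, 1, 1, 0, 1],  # d3
    [0, 1, 0, 1, 1],  # d4
]

def parity_check_matrix(A):
    """Build H = [A^T | I], with n - k rows and n columns."""
    k, r = len(A), len(A[0])
    return [[A[i][j] for i in range(k)] + [int(c == j) for c in range(r)]
            for j in range(r)]

def min_distance(H):
    """Smallest number of columns of H that add (mod 2) to the zero vector."""
    n = len(H[0])
    columns = [tuple(row[j] for row in H) for j in range(n)]
    for weight in range(1, n + 1):
        for subset in combinations(columns, weight):
            # zip(*subset) walks the chosen columns one row position at a time.
            if all(sum(bits) % 2 == 0 for bits in zip(*subset)):
                return weight
    return None  # only reached if no set of columns sums to zero

H = parity_check_matrix(A)
print(min_distance(H))
```

If two rows of `A` are made identical, the same function reports 2, which is the degenerate case discussed a moment ago.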

541
00:31:21,830 --> 00:31:24,020
All right, now, how
does decoding work?

542
00:31:24,020 --> 00:31:37,530
We've gone through this effort
to generate a code word,

543
00:31:37,530 --> 00:31:39,780
and then, at the receiving
end, we get some word.

544
00:31:42,780 --> 00:31:45,720
This is some received
word, and it's

545
00:31:45,720 --> 00:31:48,440
going to be a code word
plus possibly an error.

546
00:31:54,780 --> 00:31:56,580
And now we want
to figure out, is

547
00:31:56,580 --> 00:31:59,040
the thing we received
already a code word?

548
00:31:59,040 --> 00:32:02,635
Or if it's not, what code
word can I correct it to?

549
00:32:02,635 --> 00:32:05,010
And we're going to assume we
have just single-bit errors.

550
00:32:08,980 --> 00:32:12,300
So one way to do it is just
an exhaustive search, which

551
00:32:12,300 --> 00:32:15,660
is you've got this
received word,

552
00:32:15,660 --> 00:32:19,800
you know that it's going
to be one of 2 to the k--

553
00:32:19,800 --> 00:32:23,670
so it's going to be no more than
Hamming distance 1 from one of the 2

554
00:32:23,670 --> 00:32:28,820
to the k code words that
you have in your code set.

555
00:32:31,470 --> 00:32:35,370
So you can compare against
those 2 to the k code words,

556
00:32:35,370 --> 00:32:38,580
and whichever one it's
Hamming distance 1 away from,

557
00:32:38,580 --> 00:32:42,270
that's the one that
you're going to announce.

558
00:32:42,270 --> 00:32:44,620
So that'd be one way to do it.
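
A minimal sketch of that exhaustive search, assuming the same (9, 4, 4) rectangular code with an illustrative parity layout in `A` (row parities, column parities, overall parity); the function names are hypothetical. It compares the received word against all 2^k code words and announces the closest one.

```python
from itertools import product

# Assumed parity layout: each row says which parity bits a data bit feeds.
A = [
    [1, 0, 1, 0, 1],  # d1
    [1, 0, 0, 1, 1],  # d2
    [0, 1, 1, 0, 1],  # d3
    [0, 1, 0, 1, 1],  # d4
]

def encode(data):
    """Systematic encoding: data bits followed by parity bits (mod 2)."""
    parity = [sum(d * A[i][j] for i, d in enumerate(data)) % 2
              for j in range(len(A[0]))]
    return list(data) + parity

def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def decode_exhaustive(received):
    """Compare against all 2^k code words and return the nearest one."""
    codewords = [encode(d) for d in product([0, 1], repeat=len(A))]
    return min(codewords, key=lambda c: hamming_distance(c, received))

received = [1, 0, 1, 1, 0, 0, 0, 0, 0]  # all-ones data, second bit flipped
print(decode_exhaustive(received))
```

The cost grows as 2^k comparisons, which is the motivation for looking for something that uses the linear structure instead.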

559
00:32:44,620 --> 00:32:48,660
The thing is that that's
not exploiting anything

560
00:32:48,660 --> 00:32:50,540
nice about the structure
of a linear code,

561
00:32:50,540 --> 00:32:52,230
so what I want to
talk to you about now

562
00:32:52,230 --> 00:32:55,980
is a way to actually
capture this error

563
00:32:55,980 --> 00:33:00,330
in the case of a linear code.

564
00:33:00,330 --> 00:33:01,950
So this builds on
the relationships

565
00:33:01,950 --> 00:33:06,552
that we've been developing
some intuition for here.

566
00:33:06,552 --> 00:33:07,510
So here's what happens.

567
00:33:07,510 --> 00:33:12,970
You get a received
vector, n bits,

568
00:33:12,970 --> 00:33:16,950
which is a valid code word
plus a vector E that

569
00:33:16,950 --> 00:33:19,800
has a single 1 in it,
and everything else 0,

570
00:33:19,800 --> 00:33:21,270
or is completely 0.

571
00:33:21,270 --> 00:33:24,000
So if you were receiving
the code word correctly,

572
00:33:24,000 --> 00:33:26,898
you've had no errors,
and this is what you get.

573
00:33:26,898 --> 00:33:28,440
But if you've had
a single bit error,

574
00:33:28,440 --> 00:33:31,530
then E is a vector
with a single 1 in it.

575
00:33:31,530 --> 00:33:34,770
It's n bits long, has
a single one in it,

576
00:33:34,770 --> 00:33:36,270
and that gets added
to the code word

577
00:33:36,270 --> 00:33:39,245
to give you what you receive.

578
00:33:39,245 --> 00:33:40,620
So here's what
we're going to do.

579
00:33:40,620 --> 00:33:44,220
We're going to exploit
the relationships

580
00:33:44,220 --> 00:33:46,110
that we had up here.

581
00:33:46,110 --> 00:33:52,350
We know that H times a
valid code word equals 0.

582
00:33:52,350 --> 00:33:55,640
Or I can write it the other
way around if I want to do row

583
00:33:55,640 --> 00:33:58,590
multiplications-- sorry--

584
00:33:58,590 --> 00:34:00,210
C times H equals 0.

585
00:34:03,360 --> 00:34:05,010
So I'm going to take
the received vector

586
00:34:05,010 --> 00:34:06,700
and do that
multiplication with it.

587
00:34:06,700 --> 00:34:08,550
And in this case, I've
chosen to write it

588
00:34:08,550 --> 00:34:12,929
as H times R transpose.

589
00:34:12,929 --> 00:34:15,810
So if the received vector
was a valid code word,

590
00:34:15,810 --> 00:34:18,480
I'm going to get 0.

591
00:34:18,480 --> 00:34:21,030
If the received vector
was not a valid code word,

592
00:34:21,030 --> 00:34:22,580
I'll get something else.

593
00:34:22,580 --> 00:34:25,770
And that's what we refer
to as a syndrome vector.

594
00:34:25,770 --> 00:34:27,000
OK, so let's expand this out.

595
00:34:27,000 --> 00:34:32,130
We've got R is C plus E--
so C plus E transpose.

596
00:34:32,130 --> 00:34:35,610
That's the multiplication
we're going to do.

597
00:34:35,610 --> 00:34:38,042
This matrix
multiplication will be

598
00:34:38,042 --> 00:34:40,084
the sum of the individual
matrix multiplications,

599
00:34:40,084 --> 00:34:42,960
so it's going to be
H times C plus H--

600
00:34:42,960 --> 00:34:46,969
sorry-- H times C transpose
plus H times E transpose.

601
00:34:46,969 --> 00:34:47,969
So let's write that out.

602
00:35:03,240 --> 00:35:07,850
So we've got H
times R transpose is

603
00:35:07,850 --> 00:35:13,160
going to be H times C transpose
plus H times E transpose.

604
00:35:13,160 --> 00:35:15,420
We know this is 0.

605
00:35:15,420 --> 00:35:16,310
It's a 0 vector.

606
00:35:21,460 --> 00:35:24,420
n minus k here, the number
of parity relations--

607
00:35:24,420 --> 00:35:26,910
that's how long
that 0 vector is.

608
00:35:26,910 --> 00:35:28,908
Yeah.

609
00:35:28,908 --> 00:35:30,450
And then we've got
some other vector,

610
00:35:30,450 --> 00:35:32,367
which we're referring
to as a syndrome vector.
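
Written out as one chain of equalities, the steps just described look roughly like this. The symbol S for the syndrome is a label introduced here for convenience, and all arithmetic is mod 2.

```latex
\begin{aligned}
R &= C + E \\
H R^{T} &= H (C + E)^{T} = H C^{T} + H E^{T} = 0 + H E^{T} = S
\end{aligned}
```

So the syndrome depends only on the error vector E and not on which code word was sent.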

611
00:35:36,390 --> 00:35:36,890
OK?

612
00:35:39,770 --> 00:35:40,550
So let's see.

613
00:35:40,550 --> 00:35:43,000
What does HE
transpose look like.

614
00:35:43,000 --> 00:35:46,700
E transpose is going to
have just a single 1 in it

615
00:35:46,700 --> 00:35:47,616
somewhere.

616
00:35:52,750 --> 00:36:04,070
And here's H. Let's see.

617
00:36:04,070 --> 00:36:07,960
A transpose stacked up next
to the identity-- that's

618
00:36:07,960 --> 00:36:10,030
what H looked like.

619
00:36:10,030 --> 00:36:12,487
So when we compute--

620
00:36:12,487 --> 00:36:13,570
let me write this better--

621
00:36:13,570 --> 00:36:15,010
I didn't write that well--

622
00:36:15,010 --> 00:36:18,400
this is H.

623
00:36:18,400 --> 00:36:25,120
And what I'm writing
here is HE transpose.

624
00:36:25,120 --> 00:36:26,860
So what is HE transpose doing?

625
00:36:32,610 --> 00:36:34,710
When I multiply a
matrix like this

626
00:36:34,710 --> 00:36:37,560
by a vector that has
just a single 1 in it,

627
00:36:37,560 --> 00:36:38,402
what am I doing?

628
00:36:38,402 --> 00:36:39,360
What do I end up doing?

629
00:36:42,640 --> 00:36:43,140
Sorry.

630
00:36:43,140 --> 00:36:44,790
I heard a voice
from somewhere here.

631
00:36:44,790 --> 00:36:45,723
AUDIENCE: [INAUDIBLE]

632
00:36:45,723 --> 00:36:47,640
GEORGE VERGHESE: Picking
up one column, right?

633
00:36:47,640 --> 00:36:50,880
I'm just selecting
out a column of H.

634
00:36:50,880 --> 00:36:54,810
So each error that you
can get will give you

635
00:36:54,810 --> 00:37:00,735
a syndrome that corresponds
to picking out one column of H.

636
00:37:00,735 --> 00:37:04,350
And each column of H is
associated with data bits

637
00:37:04,350 --> 00:37:06,600
here or parity bits here.

638
00:37:11,160 --> 00:37:12,660
So really, this is
all that you have

639
00:37:12,660 --> 00:37:14,550
to do to do your
error correction.

640
00:37:14,550 --> 00:37:21,300
You can pre-compute, or store
basically the columns of H

641
00:37:21,300 --> 00:37:23,310
in your database--

642
00:37:23,310 --> 00:37:30,420
compute H times the received
vector to get the syndrome.

643
00:37:30,420 --> 00:37:32,940
That's really H times
the error, which

644
00:37:32,940 --> 00:37:34,320
is giving you the syndrome.

645
00:37:34,320 --> 00:37:37,050
That's just a single column
of H. So the syndromes

646
00:37:37,050 --> 00:37:40,890
that you can possibly get
are individual columns of H.

647
00:37:40,890 --> 00:37:41,820
So you know what H is.

648
00:37:41,820 --> 00:37:43,320
You've got it stored.

649
00:37:43,320 --> 00:37:45,720
Compute the syndrome,
and see which column of H

650
00:37:45,720 --> 00:37:47,610
it corresponds to.

651
00:37:47,610 --> 00:37:50,698
That's the bit
that has the error.

652
00:37:50,698 --> 00:37:52,740
And actually, the only
cases you're interested in

653
00:37:52,740 --> 00:37:55,480
are where you are going
to correct the data bit,

654
00:37:55,480 --> 00:37:58,440
so this is really all
the part that you really

655
00:37:58,440 --> 00:37:59,580
have to focus on.

656
00:37:59,580 --> 00:38:01,760
So you compute the syndromes.

657
00:38:01,760 --> 00:38:04,980
You compare against the columns
of H, which are your syndrome

658
00:38:04,980 --> 00:38:08,070
vectors, and then you're done.

659
00:38:08,070 --> 00:38:10,890
I think I see the
same thing over here.

660
00:38:10,890 --> 00:38:15,300
Let's just look
at it concretely--

661
00:38:15,300 --> 00:38:17,460
again, for the same code,
the rectangular code

662
00:38:17,460 --> 00:38:20,410
with all the parity bits there.

663
00:38:20,410 --> 00:38:24,120
So this is how you
generated a code word.

664
00:38:24,120 --> 00:38:25,640
Sorry.

665
00:38:25,640 --> 00:38:28,380
OK, let's take the data bit--

666
00:38:28,380 --> 00:38:30,870
the data vector being all 1's.

667
00:38:30,870 --> 00:38:33,090
This is the code word
that goes with it.

668
00:38:33,090 --> 00:38:34,950
It happens with
this particular code

669
00:38:34,950 --> 00:38:36,635
that all the parity
bits then are 0.

670
00:38:39,180 --> 00:38:42,810
What you receive ends up being
this, because one of the data

671
00:38:42,810 --> 00:38:45,750
bits ends up getting corrupted.

672
00:38:45,750 --> 00:38:51,610
When you take that received
vector and pre-multiply it by H,

673
00:38:51,610 --> 00:38:59,350
here's the resulting
syndrome vector that you get.

674
00:39:01,870 --> 00:39:05,060
And what error does
that correspond to?

675
00:39:05,060 --> 00:39:08,000
Well, actually, if you
look in the columns of H,

676
00:39:08,000 --> 00:39:10,100
you'll see that what
you've pulled out

677
00:39:10,100 --> 00:39:11,740
is the second
column, so that means

678
00:39:11,740 --> 00:39:13,800
the second data bit is in error.

679
00:39:13,800 --> 00:39:16,113
And that's the
change that you make.

680
00:39:16,113 --> 00:39:17,280
So it really is that simple.

681
00:39:17,280 --> 00:39:21,770
You take the received
word, pre-multiply it

682
00:39:21,770 --> 00:39:27,200
by the parity check matrix H,
look at the syndrome vector,

683
00:39:27,200 --> 00:39:29,960
and see which of the columns
of H that corresponds to.

684
00:39:29,960 --> 00:39:32,930
That's the bit
you're going to flip.
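
Here is a small sketch of that recipe in Python. It assumes a particular parity layout for the (9, 4, 4) rectangular code (two row parities, two column parities, one overall parity), which may not match the slide bit for bit, and the function names are made up for illustration. With this layout the all-ones data vector does come out with all parity bits 0, as in the example.

```python
# Assumed parity layout: row i of A lists the parity bits that data bit i feeds.
A = [
    [1, 0, 1, 0, 1],  # d1
    [1, 0, 0, 1, 1],  # d2
    [0, 1, 1, 0, 1],  # d3
    [0, 1, 0, 1, 1],  # d4
]
K, R = len(A), len(A[0])

# H = [A^T | I]: one row per parity relation, one column per transmitted bit.
H = [[A[i][j] for i in range(K)] + [int(c == j) for c in range(R)]
     for j in range(R)]

def encode(data):
    """Systematic code word: data bits followed by parity bits (mod 2)."""
    return list(data) + [sum(d * A[i][j] for i, d in enumerate(data)) % 2
                         for j in range(R)]

def syndrome(received):
    """Compute H times the received word (as a column), mod 2."""
    return [sum(h * r for h, r in zip(row, received)) % 2 for row in H]

def correct_single_error(received):
    """Flip the bit whose column of H equals the syndrome, if there is one."""
    s = syndrome(received)
    fixed = list(received)
    if not any(s):
        return fixed  # zero syndrome: already a valid code word
    columns = [[row[j] for row in H] for j in range(len(H[0]))]
    if s in columns:
        fixed[columns.index(s)] ^= 1
    # A syndrome matching no column means more than one error; left unchanged.
    return fixed

sent = encode([1, 1, 1, 1])
received = list(sent)
received[1] ^= 1  # corrupt the second data bit
print(syndrome(received), correct_single_error(received) == sent)
```

The comparison is against at most n columns, rather than 2^k code words.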

685
00:39:32,930 --> 00:39:33,430
All right?

686
00:39:36,260 --> 00:39:44,460
So now you're actually only
dealing with this many vectors.

687
00:39:44,460 --> 00:39:49,062
It's a number of
vectors equal to--

688
00:39:49,062 --> 00:39:49,770
how many is that?

689
00:39:54,405 --> 00:39:55,930
We've got to do
the multiplication,

690
00:39:55,930 --> 00:39:59,010
but you just have to compare
with the number of vectors

691
00:39:59,010 --> 00:40:02,220
in those columns.

692
00:40:02,220 --> 00:40:05,204
So it's a much simpler
task, computationally.

693
00:40:09,000 --> 00:40:13,140
OK, I think we've said all this.

694
00:40:13,140 --> 00:40:16,770
And so let me just
wind up on linear block

695
00:40:16,770 --> 00:40:18,520
codes with a quick
summary, and then we'll

696
00:40:18,520 --> 00:40:22,870
go on to talk about
some extensions here.

697
00:40:22,870 --> 00:40:24,010
We've seen all this.

698
00:40:24,010 --> 00:40:27,190
We know what the rate of a
linear code is-- k over n--

699
00:40:27,190 --> 00:40:30,640
how many errors we can correct.

700
00:40:30,640 --> 00:40:33,670
And we've seen all this--
what a parity bit does,

701
00:40:33,670 --> 00:40:36,400
what a repetition code does--
it's called replication

702
00:40:36,400 --> 00:40:39,280
code in the notes, but the
more familiar term-- actually,

703
00:40:39,280 --> 00:40:41,800
the more commonly used
term is repetition code.

704
00:40:41,800 --> 00:40:44,050
We've looked at Hamming codes
and the rectangular code

705
00:40:44,050 --> 00:40:44,973
as well.

706
00:40:44,973 --> 00:40:46,390
And so these are
the ones that you

707
00:40:46,390 --> 00:40:49,540
want to have in mind
as particular examples

708
00:40:49,540 --> 00:40:55,420
to work with when you're trying
to come up with examples that

709
00:40:55,420 --> 00:40:57,320
will either prove or disprove--

710
00:40:57,320 --> 00:40:58,750
that will illustrate
a conjecture

711
00:40:58,750 --> 00:41:00,850
or disprove a
conjecture, for instance.

712
00:41:00,850 --> 00:41:02,903
And you'll see many
problems on past quizzes

713
00:41:02,903 --> 00:41:03,820
that are of that type.

714
00:41:08,410 --> 00:41:13,110
And then what we did today was
looking at syndrome decoding.

715
00:41:13,110 --> 00:41:15,580
All right, so this was all
focused on single error

716
00:41:15,580 --> 00:41:19,010
correction in linear codes.

717
00:41:19,010 --> 00:41:21,820
But the point is that
that may or may not

718
00:41:21,820 --> 00:41:26,740
be the situation that
you're dealing with.

719
00:41:26,740 --> 00:41:31,383
We actually said that, to
get better error protection

720
00:41:31,383 --> 00:41:32,800
while maintaining
high data rates,

721
00:41:32,800 --> 00:41:34,720
you probably want to work
with longer and longer strings

722
00:41:34,720 --> 00:41:35,323
of data.

723
00:41:35,323 --> 00:41:37,240
Well, if you work with
longer strings of data,

724
00:41:37,240 --> 00:41:39,640
you're going to get
more bits in error.

725
00:41:39,640 --> 00:41:42,160
So you may not be
able to limit yourself

726
00:41:42,160 --> 00:41:46,900
to thinking about
single-bit error correction.

727
00:41:46,900 --> 00:41:48,340
We have talked a bit--

728
00:41:48,340 --> 00:41:52,120
and you've done this in
recitation too, I imagine--

729
00:41:52,120 --> 00:41:55,510
well, you've probably done more
in recitation than in lecture--

730
00:41:55,510 --> 00:41:59,590
with independent corruption
of multiple bits.

731
00:41:59,590 --> 00:42:01,140
So let me say a few
words about that.

732
00:42:11,250 --> 00:42:12,740
Let's think of a
systematic code,

733
00:42:12,740 --> 00:42:20,390
for instance, still k bits
here, and then parity bits here.

734
00:42:26,300 --> 00:42:29,900
But what if you could have
up to t errors, not just

735
00:42:29,900 --> 00:42:31,170
a single error--

736
00:42:31,170 --> 00:42:39,675
so if you wanted to
protect against t errors?

737
00:42:42,370 --> 00:42:45,860
So in some sense, you
want your n minus k bits

738
00:42:45,860 --> 00:42:51,580
here to signal all
those possibilities,

739
00:42:51,580 --> 00:42:53,540
so you need the number
of possibilities

740
00:42:53,540 --> 00:42:57,530
that can be signaled by
n minus k parity bits

741
00:42:57,530 --> 00:42:59,480
to be greater than or
equal to the number

742
00:42:59,480 --> 00:43:01,250
of possible conditions
that correspond

743
00:43:01,250 --> 00:43:04,510
to having up to t errors.

744
00:43:04,510 --> 00:43:06,170
And we've said a
little bit about this,

745
00:43:06,170 --> 00:43:10,100
but you can have now either
no error at all, which

746
00:43:10,100 --> 00:43:15,020
is one condition; or an error
in one of these bits, which

747
00:43:15,020 --> 00:43:19,010
is n separate conditions;
or you could have two bits

748
00:43:19,010 --> 00:43:20,448
out of here being in error.

749
00:43:20,448 --> 00:43:21,740
So how many conditions is that?

750
00:43:26,660 --> 00:43:28,670
n choose 2 and so on--

751
00:43:28,670 --> 00:43:31,498
I'll leave you to figure out
where you end up on that.
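
One way to check where that counting ends up, as a sketch: the n minus k parity bits can signal 2^(n-k) conditions, and that has to be at least 1 + (n choose 1) + ... + (n choose t). The function name below is hypothetical. Note this is only a necessary condition; passing it does not guarantee that a code with those parameters exists.

```python
from math import comb

def parity_bits_can_signal(n, k, t):
    """True if 2^(n-k) is at least the number of patterns with up to t errors."""
    conditions = sum(comb(n, i) for i in range(t + 1))
    return 2 ** (n - k) >= conditions

print(parity_bits_can_signal(7, 4, 1))  # 8 conditions against 2^3 = 8
print(parity_bits_can_signal(7, 4, 2))  # 29 conditions against 2^3 = 8
```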

752
00:43:31,498 --> 00:43:32,540
So I just wanted to say--

753
00:43:35,073 --> 00:43:37,490
you've seen this in recitation,
but I haven't mentioned it

754
00:43:37,490 --> 00:43:37,970
in lecture.

755
00:43:37,970 --> 00:43:39,512
I don't want you to say
later that, oh, I

756
00:43:39,512 --> 00:43:42,080
didn't know it was something
we had to know for a quiz.

757
00:43:42,080 --> 00:43:46,250
We do expect that
you know what n choose m means.

758
00:43:46,250 --> 00:43:50,030
So n choose m is
the number of ways

759
00:43:50,030 --> 00:43:53,030
of picking m things
from n things,

760
00:43:53,030 --> 00:43:56,090
and we assume that you know what that--

761
00:43:56,090 --> 00:43:57,420
how that's done.

762
00:43:57,420 --> 00:43:59,930
So you've got n objects.

763
00:43:59,930 --> 00:44:02,880
You want to pick m
things from there.

764
00:44:02,880 --> 00:44:06,070
So you can pick the first one
in n ways, the second one in n

765
00:44:06,070 --> 00:44:10,370
minus 1 ways, and keep
on going until you

766
00:44:10,370 --> 00:44:19,910
get to n minus m plus 1, which
is also n factorial over n

767
00:44:19,910 --> 00:44:23,000
minus m factorial.

768
00:44:23,000 --> 00:44:25,160
But when you did
that picking, you

769
00:44:25,160 --> 00:44:27,758
were paying attention to the
order in which you collected

770
00:44:27,758 --> 00:44:30,050
the things, but if the ordering
doesn't matter to you--

771
00:44:30,050 --> 00:44:32,120
if all these objects
are interchangeable--

772
00:44:32,120 --> 00:44:33,770
then you've actually
overcounted.

773
00:44:33,770 --> 00:44:35,853
So you've got m things,
but the order in which you

774
00:44:35,853 --> 00:44:37,728
pick them doesn't matter,
because they're all

775
00:44:37,728 --> 00:44:38,880
interchangeable.

776
00:44:38,880 --> 00:44:41,090
And so you've overcounted.

777
00:44:41,090 --> 00:44:43,760
You've got to divide by the
number of ways of rearranging m

778
00:44:43,760 --> 00:44:49,550
things, and so that's how
you get that expression.

779
00:44:49,550 --> 00:44:53,180
So you start off with thinking
about how you pick m things,

780
00:44:53,180 --> 00:44:54,620
and then make a
little correction,

781
00:44:54,620 --> 00:44:57,590
and so this is
what n choose m is.

782
00:44:57,590 --> 00:44:59,498
Another thing that I
just threw in for fun,

783
00:44:59,498 --> 00:45:01,040
because it's something
you might want

784
00:45:01,040 --> 00:45:02,570
to carry around in your head--

785
00:45:02,570 --> 00:45:07,610
if you don't have a feel for
how n factorial grows with n,

786
00:45:07,610 --> 00:45:10,490
well, it actually
grows pretty fast.

787
00:45:10,490 --> 00:45:14,060
It's actually growing
like n to the n.

788
00:45:14,060 --> 00:45:16,280
This is a very famous
approximation, referred to

789
00:45:16,280 --> 00:45:17,950
as Stirling's approximation.

790
00:45:17,950 --> 00:45:20,690
So when you get out to
large n, the right-hand side

791
00:45:20,690 --> 00:45:23,480
here is a very good
approximation to n factorial.

792
00:45:23,480 --> 00:45:27,140
And you see that it's sort
of like n to the n, which

793
00:45:27,140 --> 00:45:30,290
makes sense, because you
seem to be multiplying n

794
00:45:30,290 --> 00:45:31,232
by itself n times.

795
00:45:31,232 --> 00:45:33,440
Except you're multiplying
by a little bit less than n

796
00:45:33,440 --> 00:45:38,080
each time, so the e over there
ends up compensating for it,

797
00:45:38,080 --> 00:45:39,170
it turns out.

798
00:45:39,170 --> 00:45:42,530
And then there's an extra
n to the 1/2 out there.
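
To make both counting arguments concrete, here is a short sketch. `choose` follows the argument as stated (ordered picks, then divide by the m! rearrangements), and `stirling` uses the standard form of the approximation, sqrt(2 pi n) (n/e)^n; the constant under the square root is taken from the usual statement of the formula, not from the slide.

```python
import math

def choose(n, m):
    """n choose m: count ordered picks, then divide out the m! orderings."""
    ordered = 1
    for i in range(m):
        ordered *= n - i  # n, n - 1, ..., n - m + 1
    return ordered // math.factorial(m)

def stirling(n):
    """Stirling's approximation to n factorial."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

print(choose(9, 2))
print(stirling(20) / math.factorial(20))  # ratio approaches 1 as n grows
```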

799
00:45:42,530 --> 00:45:46,900
OK, so we'll assume that you know
how to do the combinatorics.

800
00:45:46,900 --> 00:45:48,530
And now, what this
is saying is, what's

801
00:45:48,530 --> 00:45:51,410
the probability
of getting m bits

802
00:45:51,410 --> 00:45:54,320
in error in an n-bit word?

803
00:45:54,320 --> 00:45:56,100
Well, if you've got
m bits in error,

804
00:45:56,100 --> 00:45:59,480
that's because those m
bits flipped independently,

805
00:45:59,480 --> 00:46:01,610
each with probability p.

806
00:46:01,610 --> 00:46:03,830
The remaining n
minus m did not flip,

807
00:46:03,830 --> 00:46:07,130
so that's the probability of
getting one such configuration.

808
00:46:07,130 --> 00:46:09,330
And then you count all the
possible configurations.

809
00:46:09,330 --> 00:46:12,770
So what that top expression is
is the probability of getting

810
00:46:12,770 --> 00:46:14,900
m bits in error out
of n, and that's

811
00:46:14,900 --> 00:46:18,160
something that we want you
to be comfortable with.
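
As a quick sketch of that expression, assuming each bit flips independently with probability p (the function name is illustrative):

```python
from math import comb

def prob_m_errors(n, m, p):
    """Probability of exactly m bit errors in an n-bit word."""
    # p**m * (1 - p)**(n - m) is one particular configuration of m flips;
    # comb(n, m) counts how many such configurations there are.
    return comb(n, m) * p**m * (1 - p)**(n - m)

# Chance of 0, 1, or 2 errors in a 9-bit word when p = 0.01.
for m in range(3):
    print(m, prob_m_errors(9, m, 0.01))
```

Summing over m from 0 to n gives 1, which is a handy sanity check.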

812
00:46:18,160 --> 00:46:25,940
OK, now, just to
wind up here, I want

813
00:46:25,940 --> 00:46:28,340
to go back to this
last bullet, which

814
00:46:28,340 --> 00:46:31,820
is that, in many
situations, the errors don't

815
00:46:31,820 --> 00:46:33,800
occur independently
in the different bits.

816
00:46:33,800 --> 00:46:37,070
If you think of a CD with
a scratch or a thumb print

817
00:46:37,070 --> 00:46:41,090
or something on
it, that's local,

818
00:46:41,090 --> 00:46:43,670
and so if you get
one bit corrupted,

819
00:46:43,670 --> 00:46:45,620
that increases the
chances that the next bit

820
00:46:45,620 --> 00:46:47,025
is corrupted as well.

821
00:46:47,025 --> 00:46:48,275
So errors can occur in bursts.

822
00:46:51,320 --> 00:46:53,690
If you're trying to make
a phone call from a car,

823
00:46:53,690 --> 00:46:58,470
and you're suddenly
shielded from an antenna--

824
00:46:58,470 --> 00:47:00,260
a nearby antenna,
then you're going

825
00:47:00,260 --> 00:47:02,330
to lose a whole bunch
of bits in sequence.

826
00:47:02,330 --> 00:47:06,240
So bits can be in
error in clusters,

827
00:47:06,240 --> 00:47:11,040
and what we've talked about so
far doesn't quite manage that.

828
00:47:11,040 --> 00:47:16,010
So here's an idea for
how to do that, which

829
00:47:16,010 --> 00:47:19,940
is referred to as interleaving.

830
00:47:19,940 --> 00:47:24,388
So if we had B
different code words

831
00:47:24,388 --> 00:47:26,930
that we were going to transmit,
and we did it the normal way,

832
00:47:26,930 --> 00:47:30,230
we would send out the first
one, second one, and so on.

833
00:47:30,230 --> 00:47:31,730
This little shading
here is supposed

834
00:47:31,730 --> 00:47:36,200
to indicate the parity bits
going along with the data bits.

835
00:47:36,200 --> 00:47:40,910
If we had a burst of errors,
we could lose two entire words

836
00:47:40,910 --> 00:47:42,500
over there--

837
00:47:42,500 --> 00:47:43,910
nothing to be done.

838
00:47:43,910 --> 00:47:47,450
They're entirely corrupted,
and we wouldn't get them back.

839
00:47:47,450 --> 00:47:51,080
The idea of interleaving
is to stack up B words

840
00:47:51,080 --> 00:47:53,750
that you want to
transmit, but transmit

841
00:47:53,750 --> 00:47:58,400
the bits out one at a time
from each of the B words.

842
00:47:58,400 --> 00:48:00,620
So you transmit the first
bit from the first word,

843
00:48:00,620 --> 00:48:03,260
first bit from the
second word, and so on.

844
00:48:03,260 --> 00:48:04,760
And so this is the
sequence in which

845
00:48:04,760 --> 00:48:06,818
you're doing the transmission.

846
00:48:06,818 --> 00:48:08,360
Now, if you've got
a burst of errors,

847
00:48:08,360 --> 00:48:12,440
you're corrupting a
more localized set

848
00:48:12,440 --> 00:48:15,170
of bits in each of the
words, and there's some hope

849
00:48:15,170 --> 00:48:18,390
that your error correction
then can recover.
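
A toy sketch of the idea, not the particular scheme on the slides: stack B words, send them out column by column, and undo that at the receiver. The words and the burst position below are made up for illustration.

```python
def interleave(words):
    """Send bit 0 of every word, then bit 1 of every word, and so on."""
    return [word[i] for i in range(len(words[0])) for word in words]

def deinterleave(stream, num_words):
    """Undo interleave: word j gets every num_words-th bit, starting at j."""
    return [stream[j::num_words] for j in range(num_words)]

# B = 3 illustrative words of 4 bits each.
words = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 0, 1]]
stream = interleave(words)

# A burst that corrupts three consecutive transmitted bits.
burst = list(stream)
for pos in (4, 5, 6):
    burst[pos] ^= 1

received = deinterleave(burst, len(words))
errors_per_word = [sum(a != b for a, b in zip(w, r))
                   for w, r in zip(words, received)]
print(errors_per_word)
```

A burst of length B or less lands as at most one error per word, which a single-error-correcting code can then handle.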

850
00:48:18,390 --> 00:48:20,270
So this is very often done.

851
00:48:24,080 --> 00:48:27,332
I don't think I want
to actually walk you

852
00:48:27,332 --> 00:48:28,790
through a particular
scheme for it,

853
00:48:28,790 --> 00:48:33,030
but we'll have it on the
slides for you to look through.

854
00:48:33,030 --> 00:48:38,090
But basically, it actually
turns out to work very well.

855
00:48:38,090 --> 00:48:42,750
And that may be all I
want to do for today.

856
00:48:42,750 --> 00:48:43,790
We'll see you next time.

857
00:48:43,790 --> 00:48:45,832
We're going to talk about
linear codes next time,

858
00:48:45,832 --> 00:48:47,810
but a much more
elaborate kind of code

859
00:48:47,810 --> 00:48:50,360
called a convolutional code.

860
00:48:50,360 --> 00:48:52,120
Thank you.