1 00:00:00,000 --> 00:00:01,196 PROFESSOR: All right, I thought he would be. 2 00:00:01,196 --> 00:00:04,200 It's always a hazard putting up somebody who you know is 3 00:00:04,200 --> 00:00:07,200 going to lecture better than you do. 4 00:00:07,200 --> 00:00:10,080 But I hope that was a good change of pace. 5 00:00:10,080 --> 00:00:13,070 And, of course, he knows more about the subject of 6 00:00:13,070 --> 00:00:15,660 Reed-Solomon decoding than I or more than practically 7 00:00:15,660 --> 00:00:18,860 anyone else does, so -- 8 00:00:18,860 --> 00:00:21,130 did he have much time to talk about decoding or 9 00:00:21,130 --> 00:00:22,730 was the most -- yeah. 10 00:00:22,730 --> 00:00:25,190 Talked about the decoding algorithms, Sudan-type 11 00:00:25,190 --> 00:00:26,762 decoding algorithms. 12 00:00:26,762 --> 00:00:28,180 Yeah? 13 00:00:28,180 --> 00:00:29,070 OK. 14 00:00:29,070 --> 00:00:29,850 Good. 15 00:00:29,850 --> 00:00:32,650 Well, in fact, he covered so much that I don't have much 16 00:00:32,650 --> 00:00:34,000 left to do today. 17 00:00:37,110 --> 00:00:41,280 I remind you, Ashish put out an email. 18 00:00:41,280 --> 00:00:44,960 There's a review session tomorrow for the midterm from 19 00:00:44,960 --> 00:00:49,760 5 to 7 in a different room, 36-112. 20 00:00:49,760 --> 00:00:52,460 The midterm is Wednesday in this room. 21 00:00:52,460 --> 00:00:56,180 It starts at 9:05 to give you a little bit more 22 00:00:56,180 --> 00:00:58,400 time to work on it. 23 00:00:58,400 --> 00:01:02,880 It's closed book except that you're allowed to make up 24 00:01:02,880 --> 00:01:07,280 three normal size sheets of notes. 25 00:01:07,280 --> 00:01:10,830 You can write as small as you like. 26 00:01:10,830 --> 00:01:14,920 Experience has shown this is the very best way of 27 00:01:14,920 --> 00:01:18,380 getting you to consolidate what you've learned up to this 28 00:01:18,380 --> 00:01:21,920 point, is writing your own three-page precis of the 29 00:01:21,920 --> 00:01:23,090 course to date. 30 00:01:23,090 --> 00:01:24,115 So, yes? 31 00:01:24,115 --> 00:01:26,352 AUDIENCE: [INAUDIBLE]? 32 00:01:26,352 --> 00:01:27,680 PROFESSOR: What is the rule, both -- 33 00:01:27,680 --> 00:01:28,596 what? 34 00:01:28,596 --> 00:01:29,970 AUDIENCE: Both sides. 35 00:01:29,970 --> 00:01:30,950 PROFESSOR: Both sides. 36 00:01:30,950 --> 00:01:32,070 All right. 37 00:01:32,070 --> 00:01:33,746 Six sheets. 38 00:01:33,746 --> 00:01:36,640 If you can't write down the whole course in six sheets, 39 00:01:36,640 --> 00:01:40,520 then, well, you haven't been paying attention. 40 00:01:44,200 --> 00:01:46,030 OK. 41 00:01:46,030 --> 00:01:48,880 And you can bring calculators, but we don't 42 00:01:48,880 --> 00:01:51,110 expect they'll be useful. 43 00:01:51,110 --> 00:01:53,940 Please erase any erasable memories if you do bring 44 00:01:53,940 --> 00:01:55,320 calculators. 45 00:01:55,320 --> 00:01:57,190 These are things we tell undergraduates. 46 00:01:57,190 --> 00:01:59,390 I guess they hardly seem necessary 47 00:01:59,390 --> 00:02:02,070 in a graduate course. 48 00:02:02,070 --> 00:02:05,500 Any other questions about the logistics of the midterm? 49 00:02:05,500 --> 00:02:10,738 AUDIENCE: What is the material going to be for the midterm? 50 00:02:10,738 --> 00:02:14,820 PROFESSOR: Everything that's been covered up to today. 51 00:02:14,820 --> 00:02:16,440 I won't hold you responsible for anything I've 52 00:02:16,440 --> 00:02:17,760 talked about today.
53 00:02:17,760 --> 00:02:19,750 So it's basically chapters one through eight. 54 00:02:24,650 --> 00:02:27,510 I really will only hold you responsible for what we 55 00:02:27,510 --> 00:02:30,790 covered in class -- and chapters one, two, and three 56 00:02:30,790 --> 00:02:35,760 were so brief, you can't say we really covered them. 57 00:02:35,760 --> 00:02:37,703 So really, chapters four through eight. 58 00:02:42,633 --> 00:02:45,560 Any other questions? 59 00:02:45,560 --> 00:02:46,040 All right. 60 00:02:46,040 --> 00:02:48,160 You'll have more chance as we go along. 61 00:02:52,170 --> 00:02:59,210 So as I say, Ralf really covered the main things about 62 00:02:59,210 --> 00:03:05,240 Reed-Solomon codes, which are the crowning achievement of 63 00:03:05,240 --> 00:03:07,480 algebraic coding theory. 64 00:03:07,480 --> 00:03:12,180 They have a lot of favorable attributes. 65 00:03:12,180 --> 00:03:20,280 They are optimum, from an n, k, d point of view, in the 66 00:03:20,280 --> 00:03:26,500 sense that they're maximum distance separable. 67 00:03:26,500 --> 00:03:29,520 The parameters are as good as they possibly could be by this 68 00:03:29,520 --> 00:03:33,360 very basic Singleton bound. 69 00:03:33,360 --> 00:03:37,320 They have efficient algebraic decoding algorithms. 70 00:03:42,130 --> 00:03:48,170 And traditionally, this has meant hard decision bounded 71 00:03:48,170 --> 00:03:52,030 distance decoding algorithms like the Welch-Berlekamp 72 00:03:52,030 --> 00:03:53,750 algorithm or the Berlekamp-Massey 73 00:03:53,750 --> 00:03:59,390 algorithm, which take hard decisions in, put hard 74 00:03:59,390 --> 00:04:03,360 decisions on symbols out. 75 00:04:03,360 --> 00:04:07,210 But as Ralf suggested to you, now there are algorithms that 76 00:04:07,210 --> 00:04:11,800 can use soft decisions in, produce a list, at least, 77 00:04:11,800 --> 00:04:15,880 coming out, some indication of the reliability of each 78 00:04:15,880 --> 00:04:17,980 element of the list. 79 00:04:17,980 --> 00:04:20,410 So they've been softened up. 80 00:04:23,080 --> 00:04:29,480 Nonetheless, the main negative is, basically, that 81 00:04:29,480 --> 00:04:36,760 they work on bytes, or symbols, not bits. 82 00:04:40,130 --> 00:04:44,080 So if we have a basically binary channel, as we very 83 00:04:44,080 --> 00:04:50,970 frequently do, then they need something else to adapt them 84 00:04:50,970 --> 00:04:54,920 to the binary channel, or even to the additive white 85 00:04:54,920 --> 00:04:56,060 Gaussian noise channel. 86 00:04:56,060 --> 00:04:58,740 I don't think we've gone through the calculation. 87 00:04:58,740 --> 00:05:07,100 You could think of sending q-ary symbols over the additive 88 00:05:07,100 --> 00:05:09,240 white Gaussian noise channel, but what would you need? 89 00:05:09,240 --> 00:05:12,490 You would need a q-ary signal set. 90 00:05:12,490 --> 00:05:17,430 Now, if you pick that to be a q-ary simplex signal set or an 91 00:05:17,430 --> 00:05:20,040 orthogonal or biorthogonal signal set, then that might 92 00:05:20,040 --> 00:05:23,530 work because the distance structure of such a signal set 93 00:05:23,530 --> 00:05:26,900 is like the Hamming distance structure in GF of q. 94 00:05:29,650 --> 00:05:31,960 In other words, it's approximately equally likely 95 00:05:31,960 --> 00:05:36,330 to make an error to any other symbol. 96 00:05:36,330 --> 00:05:39,910 The symbols are more or less equally spread apart in 97 00:05:39,910 --> 00:05:40,890 Euclidean space.
98 00:05:40,890 --> 00:05:46,130 Just as when we count distance in Hamming space, we give the 99 00:05:46,130 --> 00:05:49,880 same weight to each possible kind of error or each possible 100 00:05:49,880 --> 00:05:51,610 kind of symbol difference. 101 00:05:51,610 --> 00:05:56,820 So if you did that, then Reed-Solomon 102 00:05:56,820 --> 00:05:58,240 codes might work well. 103 00:05:58,240 --> 00:06:01,670 That's one form of a concatenated scheme, which I 104 00:06:01,670 --> 00:06:04,290 will talk about in just a second. 105 00:06:04,290 --> 00:06:08,350 But if you tried to apply Reed-Solomon codes to a binary 106 00:06:08,350 --> 00:06:11,266 input additive white Gaussian noise channel, just translate 107 00:06:11,266 --> 00:06:16,150 the, say, 8-bit bytes into bits and send them one bit at 108 00:06:16,150 --> 00:06:20,680 a time, then the distance structures don't 109 00:06:20,680 --> 00:06:21,840 correspond at all well. 110 00:06:21,840 --> 00:06:25,580 It's much more likely to make a single bit error than an 111 00:06:25,580 --> 00:06:28,270 error that involves all eight bits or something. 112 00:06:28,270 --> 00:06:33,700 And for that fundamental reason, you'll find that 113 00:06:33,700 --> 00:06:36,500 performance is not so great. 114 00:06:36,500 --> 00:06:41,330 And certainly, this is not the way to get to capacity all by 115 00:06:41,330 --> 00:06:45,370 itself unless it's married with something else. 116 00:06:45,370 --> 00:06:48,590 So that's something we always have to deal with in other 117 00:06:48,590 --> 00:06:51,300 applications, as I will talk about in a minute. 118 00:06:51,300 --> 00:06:54,470 We naturally do want to correct bytes, 119 00:06:54,470 --> 00:06:57,210 or symbols, or blocks, 120 00:06:57,210 --> 00:06:59,950 which are many bits. 121 00:06:59,950 --> 00:07:03,770 And then Reed-Solomon codes are just exactly what we want. 122 00:07:03,770 --> 00:07:07,570 They're optimal in any such situation. 123 00:07:07,570 --> 00:07:09,280 And so what did I just say? 124 00:07:09,280 --> 00:07:16,520 They're not the way to get to capacity, although they are 125 00:07:16,520 --> 00:07:21,930 sometimes part of schemes for getting to capacity. 126 00:07:21,930 --> 00:07:24,930 And since getting to capacity in the additive white Gaussian 127 00:07:24,930 --> 00:07:29,780 noise channel is the theme of this course, then they play 128 00:07:29,780 --> 00:07:33,440 only a peripheral role in that theme. 129 00:07:33,440 --> 00:07:35,750 Nonetheless, they're great codes. 130 00:07:35,750 --> 00:07:37,580 They're in wide use. 131 00:07:37,580 --> 00:07:39,960 They're part of every semiconductor manufacturer's 132 00:07:39,960 --> 00:07:41,400 core library. 133 00:07:41,400 --> 00:07:51,220 You can buy a (255, 223) distance-33, 16-error-correcting 134 00:07:51,220 --> 00:07:56,250 Reed-Solomon decoder as part of any chip package and it'll 135 00:07:56,250 --> 00:08:00,260 go at gigabits nowadays. 136 00:08:00,260 --> 00:08:04,640 So it's kind of the all-purpose error correcting 137 00:08:04,640 --> 00:08:09,540 code, with the caveat that you need to somehow get what you 138 00:08:09,540 --> 00:08:15,320 correct to be bytes rather than bits, OK? 139 00:08:15,320 --> 00:08:23,200 So today I thought I'd cover the last two topics in chapter 140 00:08:23,200 --> 00:08:32,400 eight and maybe, to the extent that time permits, go back to 141 00:08:32,400 --> 00:08:35,106 chapter seven. 142 00:08:35,106 --> 00:08:39,530 But we'll see how we feel about that.
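The mismatch just described is easy to quantify. Here is a small numerical sketch, assuming a hard-decision binary symmetric channel with an illustrative bit error probability p = 0.01 (a made-up value, not from the lecture) and 8-bit Reed-Solomon symbols sent one bit at a time; it shows that nearly every symbol error is caused by a single bit error, so the byte-level Hamming metric treats a one-bit slip and an eight-bit wipeout as the same event even though they are wildly different for the underlying channel.

    # Illustrative computation: 8-bit RS symbols sent bit-by-bit over a BSC.
    # p is an assumed bit error probability chosen only for illustration.
    from math import comb

    p = 0.01          # assumed bit error probability (illustrative)
    m = 8             # bits per Reed-Solomon symbol

    p_symbol_error = 1 - (1 - p) ** m
    p_single_bit = comb(m, 1) * p * (1 - p) ** (m - 1)

    print(f"P(symbol error)             = {p_symbol_error:.4e}")
    print(f"P(exactly one bit in error) = {p_single_bit:.4e}")
    print(f"fraction of symbol errors that are single-bit: "
          f"{p_single_bit / p_symbol_error:.3f}")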
143 00:08:39,530 --> 00:08:41,719 Applications of Reed-Solomon codes. 144 00:08:45,470 --> 00:08:51,100 OK, basically, you can apply them wherever you want to 145 00:08:51,100 --> 00:08:58,690 correct blocks, rather than bits or packets or n-tuples or 146 00:08:58,690 --> 00:09:00,990 what have you. 147 00:09:00,990 --> 00:09:06,860 So one obvious application is, when you send packets through 148 00:09:06,860 --> 00:09:22,510 the internet, if you use Reed-Solomon codes across the 149 00:09:22,510 --> 00:09:30,820 packet, then you can correct errors that occur in 150 00:09:30,820 --> 00:09:33,020 individual packets. 151 00:09:33,020 --> 00:09:35,950 This is not the way the internet usually works. 152 00:09:35,950 --> 00:09:41,330 The internet usually works by having an error detection code 153 00:09:41,330 --> 00:09:45,930 within each packet, a cyclic redundancy check (CRC). 154 00:09:45,930 --> 00:09:49,020 You hash bits, whatever. 155 00:09:49,020 --> 00:09:54,170 You provide some redundant bits within each packet, and 156 00:09:54,170 --> 00:09:57,200 when you receive the packet, you check to see whether the 157 00:09:57,200 --> 00:09:59,130 parity check checks. 158 00:09:59,130 --> 00:10:05,400 If there are r parity check bits, then a very good rule of 159 00:10:05,400 --> 00:10:10,210 thumb is that the probability of not detecting an error for 160 00:10:10,210 --> 00:10:12,820 almost any type of error pattern is going to be of the 161 00:10:12,820 --> 00:10:14,750 order of 2 to the minus r. 162 00:10:14,750 --> 00:10:16,970 That's exactly what it would be if you had a completely 163 00:10:16,970 --> 00:10:18,940 random stream of bits. 164 00:10:18,940 --> 00:10:22,940 With a random stream of bits and r redundant bits in a block, the 165 00:10:22,940 --> 00:10:25,990 check bits are random, and the 166 00:10:25,990 --> 00:10:28,440 probability that they're all equal to 0 is 2 167 00:10:28,440 --> 00:10:29,940 to the minus r. 168 00:10:29,940 --> 00:10:30,340 All right? 169 00:10:30,340 --> 00:10:34,670 But it turns out that's a very good approximation for just 170 00:10:34,670 --> 00:10:35,990 about any error mechanism. 171 00:10:35,990 --> 00:10:38,870 So you pick r large enough. 172 00:10:38,870 --> 00:10:43,110 It's often been picked to be something like r equals 32. 173 00:10:43,110 --> 00:10:47,320 What kind of failure to detect does that give? 174 00:10:47,320 --> 00:10:51,220 2 to the minus 32, which is what, about 175 00:10:51,220 --> 00:10:52,320 -- 176 00:10:52,320 --> 00:10:56,190 10 to the minus 9th, 10 to the minus 10th. 177 00:10:56,190 --> 00:10:58,010 Is that low enough? 178 00:10:58,010 --> 00:10:59,100 It was certainly -- 179 00:10:59,100 --> 00:11:01,390 when I started in this business, that was considered 180 00:11:01,390 --> 00:11:04,930 just incredibly low, but now that we're sending mega 181 00:11:04,930 --> 00:11:08,220 packets per second, it doesn't seem quite so low anymore. 182 00:11:08,220 --> 00:11:12,640 And so 32 is a little bit on the sloppy side. 183 00:11:12,640 --> 00:11:14,870 There are going to be undetected errors that get 184 00:11:14,870 --> 00:11:17,100 through if you're sending 10 to the 9th, 185 00:11:17,100 --> 00:11:19,710 10 to the 10th packets. 186 00:11:19,710 --> 00:11:26,900 But 32 or, nowadays, 64 would be a better number. 187 00:11:26,900 --> 00:11:30,390 Just like cryptography, the number of bits has to go up as 188 00:11:30,390 --> 00:11:32,020 technology goes up.
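A quick check of that 2 to the minus r rule of thumb. The sketch below does not model any particular CRC polynomial or packet format; it simply treats the r check bits of a corrupted packet as uniformly random, which is the assumption behind the rule, and compares the simulated undetected-error fraction with 2 to the minus r for a few illustrative values of r.

    # Monte Carlo sanity check of the P(undetected) ~ 2^(-r) rule of thumb,
    # under the assumption that an error pattern randomizes the r check bits.
    import random

    def undetected_fraction(r, trials=1_000_000):
        """Fraction of 'corrupted' packets whose r random check bits happen
        to give the all-zero syndrome, i.e. the error goes undetected."""
        hits = sum(1 for _ in range(trials) if random.getrandbits(r) == 0)
        return hits / trials

    for r in (8, 12, 16):
        print(f"r = {r:2d}: simulated {undetected_fraction(r):.2e}, "
              f"rule of thumb {2 ** -r:.2e}")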
189 00:11:32,020 --> 00:11:34,400 But 32 is often used. 190 00:11:34,400 --> 00:11:37,700 OK, but here's an alternative scheme. 191 00:11:37,700 --> 00:11:41,220 Suppose you coded across packets. 192 00:11:41,220 --> 00:11:48,720 So here is a, say, a 1,000-bit packet. 193 00:11:48,720 --> 00:11:55,270 And here's packet one, here's packet two, here's packet 194 00:11:55,270 --> 00:12:04,810 three, and so forth up to packet k, where we're going to 195 00:12:04,810 --> 00:12:16,050 use an (n, k, d = n - k + 1) RS code over -- it 196 00:12:16,050 --> 00:12:21,720 doesn't matter -- GF of, let's say, 256, just for 197 00:12:21,720 --> 00:12:22,740 definiteness. 198 00:12:22,740 --> 00:12:25,270 Here I'm using older notation. 199 00:12:25,270 --> 00:12:28,140 This is the same as F-256. 200 00:12:28,140 --> 00:12:30,310 Or we sometimes write that as F 2 to the 8th. 201 00:12:32,960 --> 00:12:36,640 OK, how are we going to do packet error correction? 202 00:12:36,640 --> 00:12:41,420 We're going to code crosswise in columns, all right? 203 00:12:41,420 --> 00:12:46,860 So these would be k. 204 00:12:46,860 --> 00:12:48,540 Let's make this eight-wide. 205 00:12:48,540 --> 00:12:50,630 Just pick off a byte at a time here. 206 00:12:50,630 --> 00:12:52,840 So each of these can be considered to be a 207 00:12:52,840 --> 00:12:56,500 symbol in GF of 256. 208 00:12:56,500 --> 00:13:01,060 And then we'll code n minus k parity check 209 00:13:01,060 --> 00:13:04,370 packets down here. 210 00:13:04,370 --> 00:13:10,200 So these are redundant check packets. 211 00:13:10,200 --> 00:13:16,030 These are encoded bytes at this point. 212 00:13:16,030 --> 00:13:17,220 They're not packets yet. 213 00:13:17,220 --> 00:13:26,610 But if we do that column-wise across here, we can fill in n 214 00:13:26,610 --> 00:13:42,160 minus k check packets and you can see the block length is 215 00:13:42,160 --> 00:13:44,310 quite huge here. 216 00:13:44,310 --> 00:13:48,840 So you might have a delay or latency problem in executing 217 00:13:48,840 --> 00:13:49,340 this scheme. 218 00:13:49,340 --> 00:13:54,530 Basically, you would have to send this whole thing before 219 00:13:54,530 --> 00:13:57,130 you could do error correction and then error correct the 220 00:13:57,130 --> 00:13:58,040 whole thing. 221 00:13:58,040 --> 00:14:00,450 But how does error correction work? 222 00:14:00,450 --> 00:14:01,230 You send this. 223 00:14:01,230 --> 00:14:05,740 Perhaps some of the packets are erased or even received 224 00:14:05,740 --> 00:14:07,230 incorrectly. 225 00:14:07,230 --> 00:14:10,760 You would then use some error correction or erasure and 226 00:14:10,760 --> 00:14:14,880 error correction scheme, which Ralf talked about, for 227 00:14:14,880 --> 00:14:18,610 correcting the Reed-Solomon code. 228 00:14:18,610 --> 00:14:27,470 And you could correct up to d over 2 errors, or up to d 229 00:14:27,470 --> 00:14:31,640 minus 1 erasures, OK? 230 00:14:31,640 --> 00:14:35,560 So this would be a very powerful packet error 231 00:14:35,560 --> 00:14:36,470 correction scheme. 232 00:14:36,470 --> 00:14:37,500 What are its costs? 233 00:14:37,500 --> 00:14:40,590 Its costs are quite a bit of redundancy. 234 00:14:40,590 --> 00:14:45,270 It doesn't fit the internet architecture very well. 235 00:14:45,270 --> 00:14:48,420 This is basically a point-to-point scheme. 236 00:14:48,420 --> 00:14:51,780 So when you're sending from a to b, you'd have to send b an 237 00:14:51,780 --> 00:14:55,390 additional n minus k streams.
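Here is a toy, single-parity analogue of the column-wise construction above, with made-up packet contents and sizes. A real system would run an (n, k) Reed-Solomon code over GF(256) down each column, giving n minus k check packets and correspondingly more erasure-correcting power; the single XOR check packet below can recover only one erased packet, but it shows the byte-wise, column-wise structure.

    # Toy column-wise "coding across packets": one XOR check packet instead
    # of the n - k Reed-Solomon check packets a real scheme would use.
    def make_check_packet(packets):
        """XOR the k data packets byte-wise to form one redundant check packet."""
        check = bytearray(len(packets[0]))
        for pkt in packets:
            for i, b in enumerate(pkt):
                check[i] ^= b
        return bytes(check)

    def recover_erased(received, check):
        """Recover the single erased packet (marked None) from the survivors."""
        missing = received.index(None)
        rebuilt = bytearray(check)
        for j, pkt in enumerate(received):
            if j != missing:
                for i, b in enumerate(pkt):
                    rebuilt[i] ^= b
        return missing, bytes(rebuilt)

    data = [bytes([j] * 8) for j in range(4)]      # k = 4 toy 8-byte "packets"
    check = make_check_packet(data)

    received = list(data)
    received[2] = None                             # packet 2 is erased in transit
    idx, rebuilt = recover_erased(received, check)
    assert rebuilt == data[idx]
    print(f"recovered packet {idx}: {rebuilt.hex()}")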
238 00:14:55,390 --> 00:14:58,660 And if you're sending the same webpage, as you often do, to 239 00:14:58,660 --> 00:15:03,290 multiple people, you'd have to send this to everybody. 240 00:15:03,290 --> 00:15:07,380 And it also doesn't have any feedback in it. 241 00:15:07,380 --> 00:15:10,960 The great thing about ARQ is that you only have to send the 242 00:15:10,960 --> 00:15:12,720 information that's necessary. 243 00:15:12,720 --> 00:15:15,650 You only have to restore the packets that 244 00:15:15,650 --> 00:15:17,185 are actually lost. 245 00:15:17,185 --> 00:15:20,370 You send them again. 246 00:15:20,370 --> 00:15:23,520 Whereas here, you're just kind of, a priori, deciding that 247 00:15:23,520 --> 00:15:26,450 you're going to send a great deal of redundant information 248 00:15:26,450 --> 00:15:30,380 down here to cover some kind of worst case, whatever you 249 00:15:30,380 --> 00:15:33,320 think the worst-case number of packets would be. 250 00:15:33,320 --> 00:15:35,430 If there are more than that worst-case number of packets, 251 00:15:35,430 --> 00:15:37,880 then you could retransmit again, but you're 252 00:15:37,880 --> 00:15:40,730 retransmitting this huge block, all right? 253 00:15:40,730 --> 00:15:45,230 So for that reason, this is not actually used, as far as 254 00:15:45,230 --> 00:15:46,840 I'm aware, in the internet. 255 00:15:46,840 --> 00:15:49,400 ARQ, wherever you have a feedback path, 256 00:15:49,400 --> 00:15:52,460 you should use it. 257 00:15:52,460 --> 00:15:55,280 That's a general moral. 258 00:15:55,280 --> 00:15:59,540 So ARQ is a wonderful scheme that, on an erasure 259 00:15:59,540 --> 00:16:03,810 channel, is a capacity-approaching scheme. 260 00:16:03,810 --> 00:16:06,490 Just on a bitwise basis, if you have a binary erasure 261 00:16:06,490 --> 00:16:11,410 channel with erasure probability p, then the 262 00:16:11,410 --> 00:16:15,140 capacity of the channel is 1 minus p, and you can reach 263 00:16:15,140 --> 00:16:18,010 that capacity if you simply retransmit each 264 00:16:18,010 --> 00:16:18,970 bit that was erased. 265 00:16:18,970 --> 00:16:20,530 As it's erased -- 266 00:16:20,530 --> 00:16:22,300 if it's erased, you send it again. 267 00:16:22,300 --> 00:16:25,100 And a quick calculation shows you that the average number of 268 00:16:25,100 --> 00:16:26,960 retransmissions is p. 269 00:16:26,960 --> 00:16:31,660 And so the average reduction from 270 00:16:31,660 --> 00:16:35,260 1 bit per transmission is p bits per transmission, so you can 271 00:16:35,260 --> 00:16:38,240 achieve 1 minus p with no coding at all, just by 272 00:16:38,240 --> 00:16:39,890 retransmission. 273 00:16:39,890 --> 00:16:41,990 So it's perfectly matched to the erasure channel. 274 00:16:41,990 --> 00:16:47,250 And the internet, especially with parity checks -- a 275 00:16:47,250 --> 00:16:52,980 long enough CRC within blocks -- is basically a 276 00:16:52,980 --> 00:16:56,250 packet erasure channel. 277 00:16:56,250 --> 00:16:59,630 You really, in practice, hardly ever make 278 00:16:59,630 --> 00:17:01,500 an undetected error -- 279 00:17:01,500 --> 00:17:05,920 let a packet with errors come through. 280 00:17:05,920 --> 00:17:07,530 OK? 281 00:17:07,530 --> 00:17:08,750 Any questions about this? 282 00:17:08,750 --> 00:17:12,519 AUDIENCE: [INAUDIBLE].
283 00:17:12,519 --> 00:17:18,530 PROFESSOR: The TCP/IP protocol just has a 32-bit CRC, I 284 00:17:18,530 --> 00:17:22,310 think, and if it fails to check, you ask for that packet 285 00:17:22,310 --> 00:17:23,604 again by packet number. 286 00:17:28,930 --> 00:17:30,090 OK? 287 00:17:30,090 --> 00:17:34,040 So Reed-Solomon codes, of course, are optimal if that's 288 00:17:34,040 --> 00:17:35,040 what you want to do. 289 00:17:35,040 --> 00:17:38,000 You couldn't possibly have a better code for 290 00:17:38,000 --> 00:17:39,890 coding across packets. 291 00:17:39,890 --> 00:17:42,550 But it's cumbersome in the packet environment. 292 00:17:45,860 --> 00:17:47,110 OK. 293 00:17:49,880 --> 00:17:54,060 Let's talk about concatenated codes. 294 00:17:54,060 --> 00:17:55,310 I guess I'll do it at the same place. 295 00:18:12,370 --> 00:18:15,800 This is something -- 296 00:18:15,800 --> 00:18:24,530 I did in my thesis and last week I was at a tribute to 297 00:18:24,530 --> 00:18:27,230 Andy Viterbi, who invented the Viterbi 298 00:18:27,230 --> 00:18:29,310 algorithm, among other things. 299 00:18:29,310 --> 00:18:32,850 Of course, everybody knows that the Viterbi algorithm, 300 00:18:32,850 --> 00:18:36,300 when Andy invented it, he had no idea it was actually going 301 00:18:36,300 --> 00:18:37,960 to be a practical algorithm. 302 00:18:37,960 --> 00:18:40,060 It's become one of the super practical algorithms. 303 00:18:40,060 --> 00:18:43,200 He was just trying to prove a certain result, exponential 304 00:18:43,200 --> 00:18:45,600 error bounds for convolutional codes. 305 00:18:45,600 --> 00:18:48,300 And he thought the proof was easier to understand if you 306 00:18:48,300 --> 00:18:54,240 introduce this algorithm, which he describes perfectly. 307 00:18:54,240 --> 00:18:55,590 It is the Viterbi algorithm. 308 00:18:55,590 --> 00:19:00,150 But really, just a proof technique. 309 00:19:00,150 --> 00:19:05,140 Didn't even realize it was an optimum decoder. 310 00:19:05,140 --> 00:19:09,200 And then somebody, not him, not me, but in his company, 311 00:19:09,200 --> 00:19:11,720 fortunately, realized that gee, this 312 00:19:11,720 --> 00:19:13,330 could be very practical. 313 00:19:13,330 --> 00:19:16,620 And that became a workhorse decoding algorithm. 314 00:19:16,620 --> 00:19:21,700 Very similarly, concatenated codes are something that I did 315 00:19:21,700 --> 00:19:24,450 in my thesis in 1965. 316 00:19:24,450 --> 00:19:28,880 And I was looking, basically, for a theoretical result. 317 00:19:28,880 --> 00:19:35,060 How could you approach capacity with an exponentially 318 00:19:35,060 --> 00:19:39,380 decreasing error probability near capacity but with only 319 00:19:39,380 --> 00:19:42,010 polynomial increasing complexity? 320 00:19:42,010 --> 00:19:48,610 And I looked into this very simple idea of basically 321 00:19:48,610 --> 00:19:51,670 taking two codes and using one to correct the 322 00:19:51,670 --> 00:19:52,630 errors of the other. 323 00:19:52,630 --> 00:19:54,590 And I found that you could do that. 324 00:19:54,590 --> 00:19:59,310 And I'd also never realized that this 325 00:19:59,310 --> 00:20:00,580 might actually be practical. 326 00:20:00,580 --> 00:20:04,280 I remember going around giving talks when I was interviewing 327 00:20:04,280 --> 00:20:07,580 afterwards that I was looking for an industrial position. 
328 00:20:07,580 --> 00:20:11,730 I went down to places like Bell Labs, IBM labs, and did 329 00:20:11,730 --> 00:20:16,520 seminars talking about codes that were thousands of symbols 330 00:20:16,520 --> 00:20:19,290 long and everybody rolled their eyes and said, boy, this 331 00:20:19,290 --> 00:20:22,320 guy is not very practical. 332 00:20:22,320 --> 00:20:26,530 And to my surprise and everyone else's surprise, this 333 00:20:26,530 --> 00:20:32,990 became and still is a very practical way of getting 334 00:20:32,990 --> 00:20:34,460 high-performance codes. 335 00:20:34,460 --> 00:20:36,610 This, in conjunction -- 336 00:20:36,610 --> 00:20:41,450 well, I'll get into it, but this was, again, a workhorse 337 00:20:41,450 --> 00:20:47,730 coding technique and still is in wide use. 338 00:20:47,730 --> 00:20:53,060 In fact, as you will see, this contains some of the ideas 339 00:20:53,060 --> 00:20:58,650 that go into modern capacity-approaching codes. 340 00:20:58,650 --> 00:21:01,100 But the idea is very simple. 341 00:21:01,100 --> 00:21:10,730 You can do it two ways, or probably many more than two. 342 00:21:10,730 --> 00:21:14,035 Suppose we have a basically binary channel. 343 00:21:17,910 --> 00:21:21,340 I'll just say a core physical channel there. 344 00:21:21,340 --> 00:21:25,200 The channel is what you can't affect as an engineer. 345 00:21:25,200 --> 00:21:29,030 So let's say we have a binary symmetric channel or a binary 346 00:21:29,030 --> 00:21:32,530 input additive white Gaussian noise channel or 347 00:21:32,530 --> 00:21:33,310 whatever you like. 348 00:21:33,310 --> 00:21:48,140 And then we could put on it an encoder and decoder for an n, 349 00:21:48,140 --> 00:21:52,506 k, d binary linear block code. 350 00:21:56,080 --> 00:21:58,740 OK? 351 00:21:58,740 --> 00:22:03,490 Now, from the input and the output, what is this channel 352 00:22:03,490 --> 00:22:04,740 going to look like? 353 00:22:07,720 --> 00:22:09,220 At the input, we're putting -- 354 00:22:09,220 --> 00:22:12,780 basically, per block, we put k bits in. 355 00:22:12,780 --> 00:22:15,390 They're encoded into n bits. 356 00:22:15,390 --> 00:22:21,860 They're sent over the channel, received some n-tuple of noisy 357 00:22:21,860 --> 00:22:23,250 received symbols. 358 00:22:23,250 --> 00:22:26,480 The decoder tries to find the best code word. 359 00:22:26,480 --> 00:22:28,755 The code word is identified by k bits. 360 00:22:32,170 --> 00:22:36,370 Most of the time there'll be no error and these -- 361 00:22:36,370 --> 00:22:41,520 let's call this u in and u hat out. 362 00:22:41,520 --> 00:22:45,240 u hat is going to equal u. 363 00:22:45,240 --> 00:22:48,150 But, of course, let's imagine this is a 364 00:22:48,150 --> 00:22:49,790 relatively short code. 365 00:22:49,790 --> 00:22:53,380 Let's imagine this is a maximum likelihood decoder or 366 00:22:53,380 --> 00:22:57,800 minimum distance decoder in the appropriate space. 367 00:22:57,800 --> 00:23:01,070 Then the complexity is going to be of the 368 00:23:01,070 --> 00:23:03,890 order of 2 to the k. 369 00:23:03,890 --> 00:23:07,080 And so you can't let k get very big. 370 00:23:07,080 --> 00:23:09,180 We'll find somewhat better ways to do 371 00:23:09,180 --> 00:23:10,900 maximum likelihood decoding. 372 00:23:10,900 --> 00:23:13,050 But if you're really trying to do optimum decoding, the 373 00:23:13,050 --> 00:23:16,080 complexity is going to go up exponentially at this stage 374 00:23:16,080 --> 00:23:18,350 with the number of information bits k. 
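As a sketch of the exponential cost just described, here is a brute-force minimum-distance decoder for a small binary block code, with a (7, 4, 3) Hamming code standing in for the inner (n, k, d) code and hard channel decisions standing in for the received n-tuple; none of these specific choices come from the lecture. The codebook it builds has 2 to the k entries, which is exactly the term that blows up if k grows.

    # Brute-force ML (minimum Hamming distance) decoding of a short binary
    # block code: cost is proportional to 2^k, the size of the codebook.
    from itertools import product

    G = [  # generator matrix of a systematic (7, 4, 3) Hamming code
        (1, 0, 0, 0, 1, 1, 0),
        (0, 1, 0, 0, 1, 0, 1),
        (0, 0, 1, 0, 0, 1, 1),
        (0, 0, 0, 1, 1, 1, 1),
    ]

    def encode(u):
        """Multiply the information 4-tuple u by G over GF(2)."""
        return tuple(sum(ui * gij for ui, gij in zip(u, col)) % 2
                     for col in zip(*G))

    # Precompute all 2^k codewords -- this table is the exponential cost.
    codebook = {u: encode(u) for u in product((0, 1), repeat=4)}

    def ml_decode(r):
        """Return the information word whose codeword is closest to r."""
        return min(codebook,
                   key=lambda u: sum(a != b for a, b in zip(codebook[u], r)))

    r = (1, 0, 1, 1, 0, 1, 1)   # a received 7-tuple, one bit away from a codeword
    print("decoded information bits:", ml_decode(r))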
375 00:23:18,350 --> 00:23:21,440 So this is not the scheme that we're looking for that has 376 00:23:21,440 --> 00:23:23,940 only polynomial complexity. 377 00:23:23,940 --> 00:23:27,140 We know that we can choose codes like this that with 378 00:23:27,140 --> 00:23:29,700 optimum decoding get to capacity. 379 00:23:29,700 --> 00:23:32,350 That's Shannon's theorem. 380 00:23:32,350 --> 00:23:36,040 But we don't know how to decode them with reasonable 381 00:23:36,040 --> 00:23:37,950 complexity. 382 00:23:37,950 --> 00:23:39,320 But, all right, here's a start. 383 00:23:39,320 --> 00:23:43,910 We could use a relatively modest length code that we can 384 00:23:43,910 --> 00:23:45,480 feasibly decode. 385 00:23:45,480 --> 00:23:48,470 And this is what we get. 386 00:23:48,470 --> 00:23:49,740 OK. 387 00:23:49,740 --> 00:23:52,240 So we have blocks going in. 388 00:23:52,240 --> 00:23:55,920 Block from an alphabet of size 2 to the k. 389 00:23:55,920 --> 00:23:58,550 Blocks coming out, alphabet size 2 to the k. 390 00:23:58,550 --> 00:24:01,710 Occasionally, we make errors. 391 00:24:01,710 --> 00:24:03,370 What's a good idea to then do? 392 00:24:06,150 --> 00:24:07,770 This is not very hard. 393 00:24:07,770 --> 00:24:09,870 This is why you should have been in this field in the '60s 394 00:24:09,870 --> 00:24:12,870 rather than the year 2005. 395 00:24:12,870 --> 00:24:16,300 There were lots of good ideas just lying around waiting to 396 00:24:16,300 --> 00:24:18,530 be picked up. 397 00:24:18,530 --> 00:24:19,020 Class? 398 00:24:19,020 --> 00:24:21,310 Anyone have a suggestion for how to make further 399 00:24:21,310 --> 00:24:22,480 improvement in this system? 400 00:24:22,480 --> 00:24:24,168 AUDIENCE: [INAUDIBLE]. 401 00:24:24,168 --> 00:24:25,590 PROFESSOR: Add another code. 402 00:24:25,590 --> 00:24:28,690 What should it be? 403 00:24:28,690 --> 00:24:34,830 Let's put an encoder here for what? 404 00:24:38,734 --> 00:24:41,662 AUDIENCE: [INAUDIBLE]. 405 00:24:41,662 --> 00:24:45,190 PROFESSOR: Everyone is extremely wide awake. 406 00:24:45,190 --> 00:24:50,500 An n, k, d Reed-Solomon code over what field? 407 00:24:55,670 --> 00:25:00,965 In principle, it should be over F2 to the k. 408 00:25:00,965 --> 00:25:04,450 Now, of course, if k is 64 or something here, you might not 409 00:25:04,450 --> 00:25:07,900 even go all the way to the field of 2 to the 64 elements. 410 00:25:07,900 --> 00:25:10,880 You might divide it up again as we did in the packet error 411 00:25:10,880 --> 00:25:11,380 correction. 412 00:25:11,380 --> 00:25:16,690 But basically, the way this goes is you try to keep this 413 00:25:16,690 --> 00:25:19,770 as small as possible so that this can still work. 414 00:25:19,770 --> 00:25:27,570 There's a tradeoff between the complexity of the two codes 415 00:25:27,570 --> 00:25:29,672 and therefore, the rates. 416 00:25:29,672 --> 00:25:30,922 The decoder -- 417 00:25:33,390 --> 00:25:44,160 here we have an algebraic Reed-Solomon decoder, which, 418 00:25:44,160 --> 00:25:48,660 in the simplest case, could just do error correction. 419 00:25:48,660 --> 00:25:53,190 Just take these k bits as a hard decision on u hat and try 420 00:25:53,190 --> 00:25:57,310 to correct any errors that occurred in this inner loop 421 00:25:57,310 --> 00:26:00,700 here, which we might think of as being a 422 00:26:00,700 --> 00:26:02,110 2-to-the-k-ary channel. 423 00:26:02,110 --> 00:26:05,302 So this whole thing together is a 424 00:26:05,302 --> 00:26:06,890 2-to-the-k-ary super channel.
425 00:26:09,500 --> 00:26:15,020 Sorry if you can't read that, but you heard me say it. 426 00:26:15,020 --> 00:26:19,740 So here's a channel that takes in 2-to-the-k-ary symbols, puts 427 00:26:19,740 --> 00:26:25,900 out 2-to-the-k-ary symbols and sometimes makes errors 428 00:26:25,900 --> 00:26:27,770 according to whatever the error probability you can 429 00:26:27,770 --> 00:26:29,020 achieve here is. 430 00:26:31,560 --> 00:26:35,930 Or some of the discussion back when we were talking about 431 00:26:35,930 --> 00:26:39,170 hard decisions with binary codes suggested that it would be 432 00:26:39,170 --> 00:26:43,980 better to pass along more reliability information here 433 00:26:43,980 --> 00:26:45,660 along with just the hard decision. 434 00:26:49,980 --> 00:26:54,600 And the very simplest kind of reliability information is if 435 00:26:54,600 --> 00:26:59,950 this guy really isn't sure, if either the noise is too large 436 00:26:59,950 --> 00:27:04,000 so he can't make any decision or if the noise leaves him 437 00:27:04,000 --> 00:27:06,980 exactly halfway between two code words so it's a 438 00:27:06,980 --> 00:27:11,290 completely ambiguous decision, or whatever criteria you have, 439 00:27:11,290 --> 00:27:13,600 this might make erasures. 440 00:27:13,600 --> 00:27:16,730 And I'm sure Ralf told you how to do error and erasure 441 00:27:16,730 --> 00:27:20,870 correction with Reed-Solomon codes. 442 00:27:20,870 --> 00:27:25,320 Or this could be more like likelihood information -- 443 00:27:25,320 --> 00:27:28,490 how likely was the code word you actually saw? 444 00:27:28,490 --> 00:27:32,080 And there are various ways of devising an approximate 445 00:27:32,080 --> 00:27:33,900 likelihood metric. 446 00:27:33,900 --> 00:27:37,030 You just do what makes sense that indicates that some 447 00:27:37,030 --> 00:27:40,130 words, you're absolutely sure of in this decoding. 448 00:27:40,130 --> 00:27:44,890 And other words, you don't know -- 449 00:27:44,890 --> 00:27:48,050 if you have one or more good candidates or you have no good 450 00:27:48,050 --> 00:27:50,620 candidates. 451 00:27:50,620 --> 00:27:54,880 So based on this reliability information, you can actually 452 00:27:54,880 --> 00:27:55,990 incorporate that. 453 00:27:55,990 --> 00:27:58,580 Did Ralf talk about reliability-based decoding? 454 00:27:58,580 --> 00:28:02,030 Soft decoding algorithms for Reed-Solomon codes? 455 00:28:02,030 --> 00:28:02,710 No? 456 00:28:02,710 --> 00:28:05,950 OK, well, that's a shame because he and Alex Vardy are 457 00:28:05,950 --> 00:28:10,250 the inventors of a very good class of such algorithms that's 458 00:28:10,250 --> 00:28:11,760 recently been improved. 459 00:28:11,760 --> 00:28:20,060 So one of the knocks on Reed-Solomon codes is that, at 460 00:28:20,060 --> 00:28:25,670 one point, you might have said -- a minus -- that they can only work with 461 00:28:25,670 --> 00:28:26,920 hard decisions. 462 00:28:29,000 --> 00:28:31,410 And then, fairly early, it was figured out how 463 00:28:31,410 --> 00:28:33,970 to correct erasures. 464 00:28:33,970 --> 00:28:39,320 But just in the last 10 years, we now have -- 465 00:28:39,320 --> 00:28:40,420 this is no longer true. 466 00:28:40,420 --> 00:28:45,735 We can have soft, meaning reliability-labeled inputs. 467 00:28:55,740 --> 00:28:58,890 And this is a plus. 468 00:28:58,890 --> 00:29:01,260 That means we can do soft -- 469 00:29:01,260 --> 00:29:05,140 and we have corresponding algorithms that can do soft 470 00:29:05,140 --> 00:29:07,080 Reed-Solomon decoding.
471 00:29:07,080 --> 00:29:09,770 They are, of course, more complicated than just 472 00:29:09,770 --> 00:29:11,020 straightforward error 473 00:29:11,020 --> 00:29:13,720 correction with hard decisions. 474 00:29:13,720 --> 00:29:17,660 But there's still polynomial complexity. 475 00:29:17,660 --> 00:29:25,660 They're maybe of slightly higher degree in n, 476 00:29:25,660 --> 00:29:29,200 but still polynomial in the block length or in k or d or 477 00:29:29,200 --> 00:29:31,780 whatever you're taking. 478 00:29:31,780 --> 00:29:35,160 And so we now have some such algorithms that do make 479 00:29:35,160 --> 00:29:37,240 substantial gains. 480 00:29:37,240 --> 00:29:41,350 And, of course, if he didn't tell you about them, I'm not 481 00:29:41,350 --> 00:29:43,720 going to tell you about them. 482 00:29:43,720 --> 00:29:45,890 It's a rather specialized subject. 483 00:29:45,890 --> 00:29:47,080 But it's something that's really 484 00:29:47,080 --> 00:29:50,650 happened in the last 10 years. 485 00:29:50,650 --> 00:29:52,500 And they've constantly been improved. 486 00:29:52,500 --> 00:29:55,260 The first ones, like one I came up with called 487 00:29:55,260 --> 00:29:59,560 generalized minimum distance decoding, made a modest 488 00:29:59,560 --> 00:30:00,950 improvement. 489 00:30:00,950 --> 00:30:05,810 In dB terms, 1 dB, where there's 3 dB to be got. 490 00:30:05,810 --> 00:30:10,360 Now they're up to 2, 2 1/2 out of the 3 dB with these new 491 00:30:10,360 --> 00:30:14,160 soft Reed-Solomon decoding algorithms. 492 00:30:14,160 --> 00:30:17,560 So anyway, if you pass some reliability information here, 493 00:30:17,560 --> 00:30:20,775 then you can improve the performance of this decoder. 494 00:30:20,775 --> 00:30:27,640 This decoder knows which symbols are reliable, and it 495 00:30:27,640 --> 00:30:31,610 won't change those in the decoding, and which are unreliable or 496 00:30:31,610 --> 00:30:32,550 completely erased, 497 00:30:32,550 --> 00:30:34,470 where there's no information about them. That actually 498 00:30:34,470 --> 00:30:36,940 helps you in the decoding. 499 00:30:36,940 --> 00:30:43,670 If you knew that you had s erasures rather than s errors, 500 00:30:43,670 --> 00:30:45,620 this helps you quite a lot. 501 00:30:45,620 --> 00:30:47,840 We can decode twice as many erasures as 502 00:30:47,840 --> 00:30:49,680 we can decode errors. 503 00:30:49,680 --> 00:30:53,060 And similarly, soft information in between can 504 00:30:53,060 --> 00:30:56,310 improve the capabilities of your decoder. 505 00:30:56,310 --> 00:30:56,730 All right. 506 00:30:56,730 --> 00:31:01,620 So that's the setup. 507 00:31:01,620 --> 00:31:04,275 You can see why it makes a certain amount of sense. 508 00:31:07,190 --> 00:31:10,360 From a complexity point of view, what it does -- 509 00:31:10,360 --> 00:31:12,680 here, we're going to do maximum likelihood decoding, 510 00:31:12,680 --> 00:31:14,760 complexity order of 2 to the k. 511 00:31:14,760 --> 00:31:16,710 But we're not going to let k get very 512 00:31:16,710 --> 00:31:18,690 large, so this is feasible. 513 00:31:18,690 --> 00:31:21,330 Then we're going to let the code get very long, as we know 514 00:31:21,330 --> 00:31:23,480 we have to do from Shannon's Theorem, 515 00:31:23,480 --> 00:31:26,020 using this outer code. 516 00:31:26,020 --> 00:31:29,840 But now we know we have polynomial complexity 517 00:31:29,840 --> 00:31:33,340 Reed-Solomon decoding algorithms, even for the soft 518 00:31:33,340 --> 00:31:34,600 decision case.
519 00:31:34,600 --> 00:31:42,330 So if we balance the complexity here, basically let 520 00:31:42,330 --> 00:31:46,140 this length be exponential and this length, as it is -- it's 521 00:31:46,140 --> 00:31:50,190 of the order of 2 to the k itself -- 522 00:31:50,190 --> 00:31:55,100 then we can get polynomial complexity overall, all right? 523 00:31:55,100 --> 00:31:57,980 The overall complexity is going to be of the order of a 524 00:31:57,980 --> 00:32:01,250 small power of 2 to the k. 525 00:32:01,250 --> 00:32:04,140 The overall block length is going to be of the order -- 526 00:32:04,140 --> 00:32:07,430 is going to be linear in 2 to the k. 527 00:32:07,430 --> 00:32:10,300 And therefore, you're going to get polynomial decoding 528 00:32:10,300 --> 00:32:11,620 complexity. 529 00:32:11,620 --> 00:32:14,510 The other question is performance. 530 00:32:14,510 --> 00:32:19,770 And you can show that if you -- 531 00:32:19,770 --> 00:32:23,730 well, even if you don't use reliability information, you 532 00:32:23,730 --> 00:32:27,360 can get all the way to the capacity of this underlying 533 00:32:27,360 --> 00:32:31,860 channel here with exponentially-decreasing error 534 00:32:31,860 --> 00:32:35,080 rate for any rate below that capacity. 535 00:32:35,080 --> 00:32:40,530 Not as good an exponent as if you just used random codes per 536 00:32:40,530 --> 00:32:44,740 Shannon and coded for the channel, but nonetheless, you 537 00:32:44,740 --> 00:32:46,030 basically get what you want. 538 00:32:46,030 --> 00:32:53,310 So using two smaller, simpler codes to create one long code 539 00:32:53,310 --> 00:32:55,790 turned out to be a good idea. 540 00:32:55,790 --> 00:32:58,520 First of all, for proving this theoretical result, you could 541 00:32:58,520 --> 00:33:01,740 get exponentially-decreasing error rate for polynomial 542 00:33:01,740 --> 00:33:04,190 increasing decoding complexity. 543 00:33:04,190 --> 00:33:06,830 And within about 10 years, this is what 544 00:33:06,830 --> 00:33:08,080 people actually did. 545 00:33:10,510 --> 00:33:14,370 And, in fact, what people actually did -- 546 00:33:14,370 --> 00:33:19,390 I've got a block code here, but what people actually did 547 00:33:19,390 --> 00:33:22,220 was they used a short constraint length 548 00:33:22,220 --> 00:33:25,380 convolutional code here, which we'll be talking 549 00:33:25,380 --> 00:33:28,450 about after the break. 550 00:33:28,450 --> 00:33:31,260 Convolutional codes tend to have a better performance 551 00:33:31,260 --> 00:33:34,860 complexity tradeoff, particularly in soft decision 552 00:33:34,860 --> 00:33:37,060 situations than block codes do. 553 00:33:37,060 --> 00:33:42,270 So in a practical system, it's almost always better to use a 554 00:33:42,270 --> 00:33:44,110 convolutional code than a block code. 555 00:33:44,110 --> 00:33:46,710 We'll discuss the reasons why. 556 00:33:46,710 --> 00:33:50,270 So this convolutional code here, there is a maximum 557 00:33:50,270 --> 00:33:53,550 likelihood sequence detector for the convolutional code 558 00:33:53,550 --> 00:33:55,710 down here, which is the Viterbi algorithm, which 559 00:33:55,710 --> 00:33:58,550 you've probably all heard about. 560 00:33:58,550 --> 00:34:02,390 And this is very practical as long as the constraint length, 561 00:34:02,390 --> 00:34:05,710 k, now becomes the constraint length, nu, of the code 562 00:34:05,710 --> 00:34:07,600 doesn't get too large. 
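In symbols, the complexity balance described above works out roughly as follows, writing the outer code as an (N, K, D) Reed-Solomon code over F 2-to-the-k and taking N on the order of 2 to the k; this notation is mine, not the lecture's.

\[
\begin{aligned}
\text{overall block length (bits)} &\;\approx\; n\,N \;\approx\; n\,2^{k},\\
\text{inner ML decoding cost} &\;\approx\; N \cdot 2^{k} \;\approx\; N^{2},\\
\text{outer RS decoding cost} &\;=\; O(N^{c}) \quad \text{for some small fixed } c,
\end{aligned}
\]

so the total decoding cost grows only as a small fixed power of the overall block length, while the error probability can still be driven down exponentially for rates below capacity.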
563 00:34:07,600 --> 00:34:12,820 So for about 20 years, the standard for space 564 00:34:12,820 --> 00:34:20,330 communication was this was a (255, 223, 33) Reed-Solomon 565 00:34:20,330 --> 00:34:27,270 code over F-256. 566 00:34:27,270 --> 00:34:35,560 This is a constraint length 6, rate 1/2 convolutional code. 567 00:34:39,100 --> 00:34:45,270 This is a 64-state Viterbi algorithm. 568 00:34:45,270 --> 00:34:48,380 So what's the -- 569 00:34:48,380 --> 00:34:52,320 again, if we have a look at the error channel, here is a 570 00:34:52,320 --> 00:34:54,790 stream of bits coming along. 571 00:34:54,790 --> 00:34:58,960 After encoding and decoding, what we typically see is 572 00:34:58,960 --> 00:35:02,260 bursts of errors, all right? 573 00:35:02,260 --> 00:35:04,640 So you get a burst that starts -- 574 00:35:04,640 --> 00:35:08,040 in convolutional codes, it can start at a random time, end at 575 00:35:08,040 --> 00:35:09,355 a random time. 576 00:35:09,355 --> 00:35:14,730 You might get another discrete error event over here. 577 00:35:14,730 --> 00:35:16,520 In other words, most of the time, the 578 00:35:16,520 --> 00:35:19,430 decoder is decoding properly. 579 00:35:19,430 --> 00:35:22,220 What you're receiving is what you transmitted in a 580 00:35:22,220 --> 00:35:24,150 streamwise fashion. 581 00:35:24,150 --> 00:35:27,510 The bits just are transmitted forever. 582 00:35:27,510 --> 00:35:30,250 But every so often, the decoder gets off on the wrong 583 00:35:30,250 --> 00:35:33,000 track, puts out errors for a while, then gets 584 00:35:33,000 --> 00:35:37,680 resynchronized, and decodes correctly again. 585 00:35:37,680 --> 00:35:40,140 Is my picture clear? 586 00:35:40,140 --> 00:35:44,970 I'm using words, mainly, to tell you what it represents. 587 00:35:44,970 --> 00:35:47,290 OK. 588 00:35:47,290 --> 00:35:52,230 So now, basically, what you want to do is, just as we did 589 00:35:52,230 --> 00:35:56,940 in the packet communication case, you want to send 590 00:35:56,940 --> 00:36:05,940 multiple parallel streams, up to 256 of them, such that in 591 00:36:05,940 --> 00:36:09,870 this stream, you're going to get independent error blocks. 592 00:36:09,870 --> 00:36:12,710 And this one, you're going to get errors 593 00:36:12,710 --> 00:36:13,900 occurring somewhere else. 594 00:36:13,900 --> 00:36:16,550 They're going to have a typical length. 595 00:36:16,550 --> 00:36:21,290 You can very well estimate what the typical length of 596 00:36:21,290 --> 00:36:24,390 error events is, but they occur randomly. 597 00:36:24,390 --> 00:36:27,010 And now you use the Reed-Solomon code, coding 598 00:36:27,010 --> 00:36:35,430 across this, and that will basically give you this 599 00:36:35,430 --> 00:36:36,680 picture over here. 600 00:36:39,100 --> 00:36:42,670 How would I get this? 601 00:36:42,670 --> 00:36:54,570 One way is by using a block interleaver, which is 602 00:36:54,570 --> 00:36:56,285 described in the notes. 603 00:36:56,285 --> 00:37:01,040 A block interleaver, you basically take a long block 604 00:37:01,040 --> 00:37:05,470 from the convolutional code and you just encode and 605 00:37:05,470 --> 00:37:08,350 transmit along here and ultimately decode. 606 00:37:08,350 --> 00:37:11,350 And after a very long time, you come back and start 607 00:37:11,350 --> 00:37:13,110 transmitting the second row just as 608 00:37:13,110 --> 00:37:15,140 part of the same stream. 609 00:37:15,140 --> 00:37:19,760 And then the third row and so forth. 
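Here is a minimal sketch of the block interleaver just described, with toy array dimensions chosen only for illustration. Codewords are written into rows, the array is transmitted column by column, and the receiver inverts that; a burst of consecutive channel symbols then lands on different rows, so each Reed-Solomon codeword sees only isolated symbol errors.

    # Toy block interleaver: write row-wise, transmit column-wise.
    ROWS, COLS = 4, 8        # 4 interleaved codewords of length 8 (toy sizes)

    def interleave(codewords):
        """Write row-wise, read column-wise."""
        return [codewords[r][c] for c in range(COLS) for r in range(ROWS)]

    def deinterleave(stream):
        """Invert interleave(): rebuild the rows from the column-order stream."""
        rows = [[None] * COLS for _ in range(ROWS)]
        for i, sym in enumerate(stream):
            rows[i % ROWS][i // ROWS] = sym
        return rows

    codewords = [[f"c{r}{c}" for c in range(COLS)] for r in range(ROWS)]
    tx = interleave(codewords)

    # A burst wipes out 4 consecutive channel symbols ...
    for i in range(8, 12):
        tx[i] = "?"

    # ... but after deinterleaving, each codeword has lost at most one symbol.
    for row in deinterleave(tx):
        print(row)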
610 00:37:19,760 --> 00:37:26,630 And if the block length n, here, is long enough, the 611 00:37:26,630 --> 00:37:28,490 errors in different rows will be effectively 612 00:37:28,490 --> 00:37:31,560 independent, all right? 613 00:37:31,560 --> 00:37:34,850 So again, you wind up with very long blocks in this 614 00:37:34,850 --> 00:37:39,365 scheme, but analytically, it works. 615 00:37:39,365 --> 00:37:45,106 And in practice, this is roughly what's done, except 616 00:37:45,106 --> 00:37:47,740 there is such a thing as a convolutional interleaver, 617 00:37:47,740 --> 00:37:52,790 which again, operates streamwise and actually 618 00:37:52,790 --> 00:37:55,571 performs better in practice. 619 00:37:55,571 --> 00:37:58,300 But I won't take the time to go into this because we're not 620 00:37:58,300 --> 00:38:01,470 going to really be getting into things like interleavers. 621 00:38:01,470 --> 00:38:03,780 Do you see how this works? 622 00:38:03,780 --> 00:38:05,060 OK. 623 00:38:05,060 --> 00:38:08,300 So it works well theoretically. 624 00:38:08,300 --> 00:38:10,380 It gives us the theoretical result we want. 625 00:38:10,380 --> 00:38:12,345 Works well in practice. 626 00:38:15,570 --> 00:38:17,420 As I say, this was the standard for space 627 00:38:17,420 --> 00:38:23,710 communications during the '70s and the '80s. 628 00:38:23,710 --> 00:38:27,600 Around the end of the '80s, they upgraded this. 629 00:38:27,600 --> 00:38:35,860 First of all, they replaced this by a nu equals 14, rate 630 00:38:35,860 --> 00:38:41,780 1/6, lower rate convolutional code, so this now becomes a 2 631 00:38:41,780 --> 00:38:46,075 to the 14th state Viterbi decoder. 632 00:38:46,075 --> 00:38:53,110 They built this huge rack at JPL that had 8,192 parallel 633 00:38:53,110 --> 00:38:55,770 decoding units, add-compare-select units in 634 00:38:55,770 --> 00:38:58,560 the Viterbi algorithm, to decode this thing. 635 00:38:58,560 --> 00:39:01,150 But, of course, they only need to do that in one place, in 636 00:39:01,150 --> 00:39:02,130 the space program. 637 00:39:02,130 --> 00:39:04,560 Everything comes back to JPL. 638 00:39:04,560 --> 00:39:07,650 So they built this BVD, Big Viterbi decoder. 639 00:39:10,210 --> 00:39:15,130 And first, they used it with this code and then they souped 640 00:39:15,130 --> 00:39:17,110 up the Reed-Solomon code. 641 00:39:17,110 --> 00:39:23,810 And by about 1990, 1991, they got this thing operating up 642 00:39:23,810 --> 00:39:27,980 within about 2 dB of the Shannon limit, OK? 643 00:39:27,980 --> 00:39:32,200 Which was considered a heroic feat at the time, or even now. 644 00:39:32,200 --> 00:39:36,090 So that kind of tells you, in the terms we've been using in 645 00:39:36,090 --> 00:39:39,400 this course, just what you can get out of 646 00:39:39,400 --> 00:39:40,650 this kind of approach. 647 00:39:44,310 --> 00:39:48,520 At that point, it was considered that coding was dead. 648 00:39:48,520 --> 00:39:49,770 There's really nothing more to do. 649 00:39:49,770 --> 00:39:53,000 We're within about 2 dB of the Shannon limit and it's 650 00:39:53,000 --> 00:39:54,820 inconceivable that we could actually get 651 00:39:54,820 --> 00:39:56,850 any closer than that. 652 00:39:56,850 --> 00:39:59,600 Second half of the course will, of course, tell us how 653 00:39:59,600 --> 00:40:03,290 we can get within 0.0045 dB of the Shannon limit. 654 00:40:03,290 --> 00:40:06,250 But that was the state of the art as of about 655 00:40:06,250 --> 00:40:11,460 1990, 15 years ago.
656 00:40:11,460 --> 00:40:12,310 Questions? 657 00:40:12,310 --> 00:40:14,670 Comments? 658 00:40:14,670 --> 00:40:15,920 OK. 659 00:40:18,850 --> 00:40:19,220 All right. 660 00:40:19,220 --> 00:40:25,260 I'll just mention that a third application is burst error 661 00:40:25,260 --> 00:40:28,990 correction, which really, we've already 662 00:40:28,990 --> 00:40:30,240 been talking about. 663 00:40:35,480 --> 00:40:37,970 Suppose -- 664 00:40:37,970 --> 00:40:40,270 let's let this be a naked channel now. 665 00:40:40,270 --> 00:40:42,870 Suppose you're transmitting with an 666 00:40:42,870 --> 00:40:44,210 interleave scheme like this. 667 00:40:44,210 --> 00:40:46,880 And now, in the same way -- 668 00:40:46,880 --> 00:40:49,250 it's a radio channel or something and every so often, 669 00:40:49,250 --> 00:40:53,960 you have an outage just due to fading, or a thunderstorm, or 670 00:40:53,960 --> 00:40:58,010 somebody turns on his vacuum cleaner, or whatever. 671 00:40:58,010 --> 00:41:00,200 Every so often, as you transmit, you're 672 00:41:00,200 --> 00:41:01,000 going to get bursts. 673 00:41:01,000 --> 00:41:02,330 You don't have independent, 674 00:41:02,330 --> 00:41:04,350 identically-distributed errors. 675 00:41:04,350 --> 00:41:07,090 The character of the channel is such that you see bursts of 676 00:41:07,090 --> 00:41:10,210 errors, which is precisely what you see in this super 677 00:41:10,210 --> 00:41:11,670 channel here. 678 00:41:11,670 --> 00:41:16,380 Every so often, you see a bunch of errors if you have 679 00:41:16,380 --> 00:41:20,470 the original transmitted scheme to compare it to. 680 00:41:20,470 --> 00:41:21,120 All right. 681 00:41:21,120 --> 00:41:24,290 In exactly the same way, you can code -- 682 00:41:24,290 --> 00:41:27,470 with something like this block interleaver, you can code 683 00:41:27,470 --> 00:41:31,950 across channels and correct the bursts with 684 00:41:31,950 --> 00:41:33,850 Reed-Solomon codes. 685 00:41:33,850 --> 00:41:37,100 And here, it's quite easy to show that, because 686 00:41:37,100 --> 00:41:40,630 Reed-Solomon codes have optimal parameters n, k, d, 687 00:41:40,630 --> 00:41:44,140 that Reed-Solomon codes do the best possible job of 688 00:41:44,140 --> 00:41:48,180 correcting burst errors that you possibly could do, given 689 00:41:48,180 --> 00:41:52,580 some specification for how long the bursts could possibly 690 00:41:52,580 --> 00:42:00,820 be and the minimum guard space between bursts. 691 00:42:00,820 --> 00:42:03,410 This is classical stuff. 692 00:42:03,410 --> 00:42:06,820 Reed-Solomon codes are optimum burst correctors if you 693 00:42:06,820 --> 00:42:11,750 describe the burst channel by a max burst length and a minimum 694 00:42:11,750 --> 00:42:14,029 guard space. 695 00:42:14,029 --> 00:42:15,980 Is that too brief? 696 00:42:15,980 --> 00:42:17,040 Probably is. 697 00:42:17,040 --> 00:42:19,700 But it gives you some impression of what we're talking about. 698 00:42:19,700 --> 00:42:22,920 So Reed-Solomon codes are optimum for burst error 699 00:42:22,920 --> 00:42:29,750 correction, and most real channels are bursty. 700 00:42:29,750 --> 00:42:33,500 The only non-bursty channel that we really have is the 701 00:42:33,500 --> 00:42:36,300 additive white Gaussian noise channel, ultimately. 702 00:42:36,300 --> 00:42:39,100 That is the channel in space communications, which is why 703 00:42:39,100 --> 00:42:45,280 you can apply a nice, textbook-type code like this 704 00:42:45,280 --> 00:42:46,810 to the space channel.
705 00:42:46,810 --> 00:42:49,250 For any other channel, typically, you have to use 706 00:42:49,250 --> 00:42:52,460 some kind of interleaving to break up the bursts so 707 00:42:52,460 --> 00:42:54,340 that you can -- 708 00:42:54,340 --> 00:42:58,120 there simply is no information over a long time. 709 00:42:58,120 --> 00:43:01,310 If you tried to encode crossways on this, you'd have 710 00:43:01,310 --> 00:43:05,700 to have a code that's long enough to see the average 711 00:43:05,700 --> 00:43:07,890 burst behavior. 712 00:43:07,890 --> 00:43:13,370 You'd need the bursts to have an ergodic model. 713 00:43:13,370 --> 00:43:16,160 The way to do that in practice is to code across it so you 714 00:43:16,160 --> 00:43:21,180 get independent snippets of widely separated bursts which 715 00:43:21,180 --> 00:43:23,156 can be assumed to be independent. 716 00:43:23,156 --> 00:43:23,650 Yeah. 717 00:43:23,650 --> 00:43:24,900 AUDIENCE: [INAUDIBLE]? 718 00:43:28,890 --> 00:43:30,080 PROFESSOR: No different, really. 719 00:43:30,080 --> 00:43:33,377 AUDIENCE: [INAUDIBLE]. 720 00:43:33,377 --> 00:43:36,010 PROFESSOR: Yeah, in the space channel -- all right. 721 00:43:36,010 --> 00:43:37,830 Here, I'm talking about the power limit. 722 00:43:37,830 --> 00:43:40,300 That's a good question. 723 00:43:40,300 --> 00:43:44,540 But if you think about the band-limited regime, 724 00:43:44,540 --> 00:43:45,640 let's suppose -- 725 00:43:45,640 --> 00:43:48,320 so this would be some M-ary channel. 726 00:43:48,320 --> 00:43:53,570 You would use some multilevel signaling scheme like QAM with 727 00:43:53,570 --> 00:43:58,530 a large signal set or QPSK or something like that to get a 728 00:43:58,530 --> 00:44:02,710 basic transmission with a discrete signal set that 729 00:44:02,710 --> 00:44:05,460 doesn't lose too much over sending 730 00:44:05,460 --> 00:44:08,730 Gaussian over the channel. 731 00:44:08,730 --> 00:44:11,440 Again, this will be later in the course. 732 00:44:11,440 --> 00:44:13,480 We'll talk about this in detail. 733 00:44:13,480 --> 00:44:15,770 Other than that -- 734 00:44:15,770 --> 00:44:18,940 well, all right, so now this is an M-ary channel. 735 00:44:18,940 --> 00:44:22,660 If M is large enough, then you could apply Reed-Solomon codes 736 00:44:22,660 --> 00:44:23,410 right here. 737 00:44:23,410 --> 00:44:24,950 But typically, it's not. 738 00:44:24,950 --> 00:44:28,590 M is like 16 or 64. 739 00:44:28,590 --> 00:44:32,940 And so Reed-Solomon codes would still be too short. 740 00:44:32,940 --> 00:44:34,370 You would not be able to get long enough 741 00:44:34,370 --> 00:44:36,280 codes over such a channel. 742 00:44:36,280 --> 00:44:40,210 So a similar kind of thing can be done. 743 00:44:40,210 --> 00:44:43,250 And the history of bandwidth-limited coding 744 00:44:43,250 --> 00:44:47,530 pretty much parallels that of power-limited coding. 745 00:44:47,530 --> 00:44:52,740 What corresponds to block codes here is lattice codes 746 00:44:52,740 --> 00:44:53,820 for this channel. 747 00:44:53,820 --> 00:44:56,180 What corresponds to convolutional codes is 748 00:44:56,180 --> 00:44:59,510 trellis-coded modulation.
749 00:44:59,510 --> 00:45:05,180 And again, once you handle the basic physical channel with 750 00:45:05,180 --> 00:45:08,160 these codes that are well matched to that channel but 751 00:45:08,160 --> 00:45:11,420 whose complexity increases too rapidly if you make them long 752 00:45:11,420 --> 00:45:15,510 because of exponential decoding, then you can always 753 00:45:15,510 --> 00:45:18,430 clean up whatever errors you make. 754 00:45:18,430 --> 00:45:22,030 You can think of this as a cleanup function out here. 755 00:45:22,030 --> 00:45:24,290 Typically, the rates out here are very high. 756 00:45:24,290 --> 00:45:26,120 You don't need much redundancy. 757 00:45:26,120 --> 00:45:30,550 It doesn't cost too much in overall rate. 758 00:45:30,550 --> 00:45:36,340 But even with very high rates, like 223 over 255, you can do 759 00:45:36,340 --> 00:45:38,225 a rather large amount of error correction. 760 00:45:38,225 --> 00:45:41,430 You can correct up to 16 errors or 32 761 00:45:41,430 --> 00:45:42,980 erasures with this code. 762 00:45:42,980 --> 00:45:45,720 So this is a cleanup function. 763 00:45:45,720 --> 00:45:49,310 You basically work very hard in here, get the best error 764 00:45:49,310 --> 00:45:51,430 probability you can, which -- 765 00:45:51,430 --> 00:45:52,740 it doesn't have to be too low. 766 00:45:52,740 --> 00:45:55,340 10 to the minus 2, 10 to the minus 3. 767 00:45:55,340 --> 00:45:57,900 And then you use this to clean it up and that drives the 768 00:45:57,900 --> 00:46:00,428 error probability down to 10 to the minus 12 or 10 to the 769 00:46:00,428 --> 00:46:02,360 minus 15 or whatever you really want. 770 00:46:05,910 --> 00:46:09,187 But the principles are the same in power 771 00:46:09,187 --> 00:46:10,660 limit and band limit. 772 00:46:10,660 --> 00:46:13,730 But nice question. 773 00:46:13,730 --> 00:46:14,980 Any other questions? 774 00:46:18,700 --> 00:46:19,250 OK. 775 00:46:19,250 --> 00:46:26,290 So as a result, Reed-Solomon codes are the general purpose 776 00:46:26,290 --> 00:46:32,310 code that has emerged from algebraic coding theory. 777 00:46:32,310 --> 00:46:37,080 A good example of burst error correction is, you know on 778 00:46:37,080 --> 00:46:41,260 your CDs, there is an error correction scheme incorporated 779 00:46:41,260 --> 00:46:45,610 such that if you scratch the CD, it still works. 780 00:46:45,610 --> 00:46:46,240 How does that work? 781 00:46:46,240 --> 00:46:48,380 It could only be because there's some error 782 00:46:48,380 --> 00:46:49,150 correction in there. 783 00:46:49,150 --> 00:46:53,810 What, actually, is the error correction scheme? 784 00:46:53,810 --> 00:46:58,000 I'm not sure I know what the latest and greatest is, but 785 00:46:58,000 --> 00:47:01,310 generally, the error correction schemes for 786 00:47:01,310 --> 00:47:05,700 magnetic memory recording like that have used 787 00:47:05,700 --> 00:47:08,160 Reed-Solomon codes -- 788 00:47:08,160 --> 00:47:11,040 have used two Reed-Solomon codes in interleaved 789 00:47:11,040 --> 00:47:11,960 fashion like this. 790 00:47:11,960 --> 00:47:14,290 So the across code is Reed-Solomon. 791 00:47:14,290 --> 00:47:16,220 The down code is Reed-Solomon. 792 00:47:16,220 --> 00:47:20,820 As a result, if you make a scratch through the disc, you 793 00:47:20,820 --> 00:47:28,980 will cut all of these codes, but you will cut each one in a 794 00:47:28,980 --> 00:47:32,300 small enough place so that you can correct it.
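A geometric toy version of that two-dimensional idea follows, with the array size, code distances, and scratch path all invented for illustration (the real CD scheme, a cross-interleaved pair of shortened Reed-Solomon codes, differs in its details). The point is only that a scratch -- a long burst along one path across the surface -- touches each row codeword and each column codeword in just a few places, so each stays within its own erasure-correcting budget of d minus 1 symbols.

    # Toy geometry: a "scratch" across a 2-D array of row and column codewords.
    ROWS, COLS = 32, 28
    D_ROW, D_COL = 5, 5              # assumed minimum distances of the two codes

    # A roughly diagonal track of erased symbols across the array.
    scratch = {(r, (2 * r) % COLS) for r in range(ROWS)}

    row_hits = [sum(1 for (r, c) in scratch if r == i) for i in range(ROWS)]
    col_hits = [sum(1 for (r, c) in scratch if c == j) for j in range(COLS)]

    print("worst row :", max(row_hits), "erasures (budget", D_ROW - 1, ")")
    print("worst col :", max(col_hits), "erasures (budget", D_COL - 1, ")")
    assert max(row_hits) <= D_ROW - 1 and max(col_hits) <= D_COL - 1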
795 00:47:32,300 --> 00:47:34,620 So you can think of this as a actually a physical laying 796 00:47:34,620 --> 00:47:38,820 down of Reed-Solomon codes on the two-dimensional surface of 797 00:47:38,820 --> 00:47:41,710 the disc in such a way that -- 798 00:47:41,710 --> 00:47:45,350 well, even if your scratch came right across one whole 799 00:47:45,350 --> 00:47:48,780 channel here, then, of course, this decoder would be dead. 800 00:47:48,780 --> 00:47:52,590 But you still have somebody decoding in this direction to 801 00:47:52,590 --> 00:47:54,740 pick up the errors. 802 00:47:54,740 --> 00:47:58,000 And so that's why you can completely abuse these discs 803 00:47:58,000 --> 00:48:01,160 and they still work. 804 00:48:01,160 --> 00:48:02,750 And when you think about what's actually going on, 805 00:48:02,750 --> 00:48:10,120 think of the transfer rates from a CD to your computer. 806 00:48:10,120 --> 00:48:15,160 It's up in the hundreds of millions of bits per second. 807 00:48:15,160 --> 00:48:16,950 Maybe its gigabits by now. 808 00:48:16,950 --> 00:48:18,380 Perhaps it is. 809 00:48:18,380 --> 00:48:20,130 And you've got this Reed-Solomon -- 810 00:48:20,130 --> 00:48:22,700 these are non-trivial Reed-Solomon decoders. 811 00:48:22,700 --> 00:48:27,020 These are like this 16 error correcting decoder out here. 812 00:48:27,020 --> 00:48:31,420 And so these are working away at that data rate, capable of 813 00:48:31,420 --> 00:48:31,945 doing decoding. 814 00:48:31,945 --> 00:48:35,620 And they're cheap enough so that they're just buried in 815 00:48:35,620 --> 00:48:38,990 part of one of the integrated circuits in 816 00:48:38,990 --> 00:48:41,520 your disc player circuitry. 817 00:48:41,520 --> 00:48:47,150 So that'd give you a physical sense for just how much error 818 00:48:47,150 --> 00:48:52,090 correction power and speed you can get out of this class of 819 00:48:52,090 --> 00:48:57,050 code decoder, OK? 820 00:48:57,050 --> 00:49:01,110 So wherever you want an extremely powerful code and 821 00:49:01,110 --> 00:49:03,470 you're a little bit less concerned with getting to 822 00:49:03,470 --> 00:49:07,045 capacity, Reed-Solomon code is the way to go. 823 00:49:07,045 --> 00:49:12,610 It's absolutely part of your toolbox, OK? 824 00:49:17,520 --> 00:49:30,350 The last topic in chapter eight is BCH codes, which are 825 00:49:30,350 --> 00:49:33,310 binary codes. 826 00:49:33,310 --> 00:49:39,760 So they're binary linear block codes of the class we've been 827 00:49:39,760 --> 00:49:44,640 discussing, comparable with Reed-Muller codes. 828 00:49:44,640 --> 00:49:48,960 Reed-Muller codes were discovered in 1954 in two 829 00:49:48,960 --> 00:49:53,670 separate papers by Reed and by Muller in two 830 00:49:53,670 --> 00:49:54,790 separate points of view. 831 00:49:54,790 --> 00:49:57,760 And there are a million points of view you can have for 832 00:49:57,760 --> 00:49:58,470 Reed-Muller codes. 833 00:49:58,470 --> 00:50:00,050 We've used yet a third one. 834 00:50:02,590 --> 00:50:09,880 BCH stands for Bose and Chaudhury and Hocquenghem. 835 00:50:09,880 --> 00:50:13,290 Bose and Chaudhury were both professors in the US. 836 00:50:13,290 --> 00:50:18,310 They wrote a paper about 1960 introducing Reed-Solomon 837 00:50:18,310 --> 00:50:23,040 codes, which, by the way, and introducing BC codes -- 838 00:50:23,040 --> 00:50:25,240 Bose-Chaudhury codes -- 839 00:50:25,240 --> 00:50:28,040 Reed-Solomon's paper was in 1960 also. 
840 00:50:28,040 --> 00:50:31,260 These were completely independent of each other. 841 00:50:31,260 --> 00:50:35,660 And then this fellow named Hocquenghem in France, whose 842 00:50:35,660 --> 00:50:41,230 name I can't and few people can pronounce, turned out he 843 00:50:41,230 --> 00:50:45,770 had actually invented these codes in 1959 in a paper that 844 00:50:45,770 --> 00:50:49,750 unfortunately was in French, so few people read it. 845 00:50:49,750 --> 00:50:52,120 But anyway, in honor of all three of these people, these 846 00:50:52,120 --> 00:50:54,340 are called BCH codes. 847 00:50:54,340 --> 00:50:58,070 These fit right into the main theme of algebraic coding 848 00:50:58,070 --> 00:50:59,080 theory at that time. 849 00:50:59,080 --> 00:51:02,190 Maybe stimulated it because these turned out to be cyclic 850 00:51:02,190 --> 00:51:05,920 codes, or at least as they were originally introduced. 851 00:51:05,920 --> 00:51:15,130 And so when I was in school, which was in the early '60s, 852 00:51:15,130 --> 00:51:18,590 these were considered the greatest thing going. 853 00:51:18,590 --> 00:51:23,500 People were interested in binary codes, algebraic codes 854 00:51:23,500 --> 00:51:29,570 because they are slightly better than Reed-Muller codes. 855 00:51:29,570 --> 00:51:31,980 Now, I would say, there's sort of been a swing back. 856 00:51:31,980 --> 00:51:35,780 Shu Lin gave a paper about 10 years ago that said 857 00:51:35,780 --> 00:51:38,140 Reed-Muller codes are not so bad. 858 00:51:38,140 --> 00:51:41,140 In fact, in terms of performance versus complexity 859 00:51:41,140 --> 00:51:43,690 for certain kinds of decoding schemes, trellis-based 860 00:51:43,690 --> 00:51:48,000 decoding, Reed-Muller codes are better than BCH codes. 861 00:51:48,000 --> 00:51:50,280 So from a more modern prospective, we're more 862 00:51:50,280 --> 00:51:53,520 interested in Reed-Muller than we are in BCH. 863 00:51:53,520 --> 00:51:57,790 Nonetheless, you'll hear about these a lot if you read the 864 00:51:57,790 --> 00:52:01,780 literature and so you ought to know what they are. 865 00:52:01,780 --> 00:52:02,560 OK. 866 00:52:02,560 --> 00:52:05,890 Conceptually, what are they? 867 00:52:05,890 --> 00:52:11,440 Since we already know about Reed-Solomon codes, the way 868 00:52:11,440 --> 00:52:15,760 that I'm going to characterize BCH codes is as subfield 869 00:52:15,760 --> 00:52:19,480 subcodes of the Reed-Solomon codes. 870 00:52:19,480 --> 00:52:25,440 Now, if we have a Reed-Solomon code over F2 to the 8 or 871 00:52:25,440 --> 00:52:32,800 something, the field with 256 elements, or any power of 2, 872 00:52:32,800 --> 00:52:37,240 contains a subfield which is just 0 and 1, all right? 873 00:52:37,240 --> 00:52:40,760 All fields contains 0 and 1. 874 00:52:40,760 --> 00:52:43,500 If the field has characteristic 2, then 0 and 1 875 00:52:43,500 --> 00:52:47,730 all by themselves are the prime subfield. 876 00:52:47,730 --> 00:52:51,490 So it's possible that there are certain words in the 877 00:52:51,490 --> 00:52:55,660 Reed-Muller code -- sorry, in the Reed-Solomon code that are 878 00:52:55,660 --> 00:52:58,760 completely made up of 0's and 1's. 879 00:52:58,760 --> 00:53:00,740 For instance, the all-0 word is certainly in the 880 00:53:00,740 --> 00:53:04,260 Reed-Solomon code, so we have, at least, that one that's made 881 00:53:04,260 --> 00:53:05,890 up of all zeros. 882 00:53:05,890 --> 00:53:10,440 Almost always, we find that the all-1 word is a code word. 
883 00:53:10,440 --> 00:53:14,350 So we have a code with all 0's and all 1's. 884 00:53:14,350 --> 00:53:16,140 And there might be more of them. 885 00:53:16,140 --> 00:53:19,920 And that's actually very straightforward, to find out 886 00:53:19,920 --> 00:53:25,290 how many completely binary words there are in a 887 00:53:25,290 --> 00:53:26,540 Reed-Solomon code. 888 00:53:30,080 --> 00:53:37,080 So it's the binary subfield. 889 00:53:41,950 --> 00:53:55,080 We have F2 is a subfield of F2 to the M for all M. OK, so a 890 00:53:55,080 --> 00:54:07,650 subfield subcode, if we have an n, k, d equals n minus k 891 00:54:07,650 --> 00:54:24,250 plus 1 Reed-Solomon code, then the set of all BCH code is the 892 00:54:24,250 --> 00:54:32,780 set of all binary code words, code words that just use the 0 893 00:54:32,780 --> 00:54:34,155 and 1 of the code alphabet. 894 00:54:39,160 --> 00:54:44,320 Call this c in C. So we call this C prime. 895 00:54:48,780 --> 00:54:52,350 OK, so what are its parameters going to be? 896 00:54:52,350 --> 00:54:56,600 Its length is still going to be n, right? 897 00:54:56,600 --> 00:55:00,360 All the code words here are n-tuples, so these have got to 898 00:55:00,360 --> 00:55:02,940 be n-tuples. 899 00:55:02,940 --> 00:55:04,230 We don't know its dimension. 900 00:55:04,230 --> 00:55:06,110 Let's just call it k prime. 901 00:55:06,110 --> 00:55:08,820 That's how many such code words are there? 902 00:55:08,820 --> 00:55:14,190 We're going to have to do some combinatorics 903 00:55:14,190 --> 00:55:17,230 to find that out. 904 00:55:17,230 --> 00:55:20,600 And what's the minimum distance going to be? 905 00:55:20,600 --> 00:55:21,850 Anybody? 906 00:55:24,791 --> 00:55:27,629 AUDIENCE: [INAUDIBLE]. 907 00:55:27,629 --> 00:55:32,360 PROFESSOR: In general, well, it's certainly not going to be 908 00:55:32,360 --> 00:55:38,090 less than because, by definition of the minimum 909 00:55:38,090 --> 00:55:41,270 distance, the minimum Hamming distance between any two code 910 00:55:41,270 --> 00:55:45,730 words in this code is d, so we're just looking at a subset 911 00:55:45,730 --> 00:55:47,990 of the code words in this code. 912 00:55:47,990 --> 00:55:51,990 And the minimum squared distance has to be at least d. 913 00:55:54,780 --> 00:55:56,730 There's an example given in the notes. 914 00:55:56,730 --> 00:56:02,420 If we let c be the 4, 3 -- 915 00:56:02,420 --> 00:56:11,110 I think it's the 4, 2, 3, over F4, then C prime turns out to 916 00:56:11,110 --> 00:56:15,390 be the 4, 1, 4 code over F2. 917 00:56:19,390 --> 00:56:22,420 In other words, the only two code words in this code that 918 00:56:22,420 --> 00:56:26,900 are all binary are the all-0 and all-1 sequences. 919 00:56:26,900 --> 00:56:27,310 All right? 920 00:56:27,310 --> 00:56:30,720 So we actually get a minimum distance of 4. 921 00:56:30,720 --> 00:56:33,470 It has to be at least 3 in this particular case, so 922 00:56:33,470 --> 00:56:38,170 here's an example where it could be greater than 3. 923 00:56:38,170 --> 00:56:41,110 Or if there are cases where the subcode just consists of 924 00:56:41,110 --> 00:56:45,080 the all-0 word, then that, by convention, has an infinite 925 00:56:45,080 --> 00:56:47,330 minimum distance. 926 00:56:47,330 --> 00:56:51,270 So distance could be greater, but it never can be less than 927 00:56:51,270 --> 00:56:52,980 what you started with. 928 00:56:52,980 --> 00:56:56,030 On the other hand, the dimension is 929 00:56:56,030 --> 00:56:58,100 clearly going to go down. 
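[Here is a small brute-force check, in Python, of the example just mentioned. It builds one particular (4, 2, 3) MDS code over F4 -- an evaluation-style construction chosen for illustration; the course notes may use a different but equivalent one -- and pulls out the codewords whose symbols all lie in the binary subfield, which turn out to be exactly the (4, 1, 4) repetition code.]

    # GF(4) = {0, 1, a, a^2}; elements coded as integers 0, 1, 2, 3.
    from itertools import product

    ADD = [[0,1,2,3],[1,0,3,2],[2,3,0,1],[3,2,1,0]]   # addition table of F4
    MUL = [[0,0,0,0],[0,1,2,3],[0,2,3,1],[0,3,1,2]]   # multiplication table

    def f(a, b, x):                    # evaluate f(x) = a + b*x over GF(4)
        return ADD[a][MUL[b][x]]

    # Codewords: (f(0), f(1), f(a), f(a^2)) for all polynomials of degree < 2.
    code = {tuple(f(a, b, x) for x in range(4))
            for a, b in product(range(4), repeat=2)}

    dmin = min(sum(u != v for u, v in zip(c1, c2))
               for c1 in code for c2 in code if c1 != c2)
    print("parent code (n, k, d):", (4, 2, dmin))      # (4, 2, 3)

    binary = sorted(c for c in code if set(c) <= {0, 1})
    print("binary subfield subcode:", binary)          # [(0,0,0,0), (1,1,1,1)]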
930 00:56:58,100 --> 00:57:00,980 If it were still equal to what it was, then we'd have a 931 00:57:00,980 --> 00:57:06,860 binary MDS code and we know we don't get binary MDS codes. 932 00:57:06,860 --> 00:57:11,230 So in general, k prime decreases quite a bit. 933 00:57:11,230 --> 00:57:14,640 In this case, it just decreases by 1. 934 00:57:14,640 --> 00:57:17,035 And we have to find out how it decreases. 935 00:57:22,510 --> 00:57:22,920 OK. 936 00:57:22,920 --> 00:57:33,060 I'm a little uncertain as to how much of the derivation -- 937 00:57:33,060 --> 00:57:34,535 so we need to find -- 938 00:57:37,676 --> 00:57:38,878 AUDIENCE: Is that [INAUDIBLE] at all points of 939 00:57:38,878 --> 00:57:40,128 the field, is it? 940 00:57:48,627 --> 00:57:51,320 PROFESSOR: No. 941 00:57:51,320 --> 00:57:54,830 One of the characterizations you have of Reed-Solomon 942 00:57:54,830 --> 00:58:00,390 codes, I assume, was that they're the set of all 943 00:58:00,390 --> 00:58:06,040 polynomials of length n that evaluate to 0 at some 944 00:58:06,040 --> 00:58:12,310 consecutive set of n minus k powers of alpha. 945 00:58:12,310 --> 00:58:13,690 Did you see anything like that? 946 00:58:13,690 --> 00:58:14,881 AUDIENCE: Yes. 947 00:58:14,881 --> 00:58:16,030 PROFESSOR: All right. 948 00:58:16,030 --> 00:58:18,950 In other words, they're -- 949 00:58:18,950 --> 00:58:33,450 let me write that down because RS code is the set of all 950 00:58:33,450 --> 00:58:55,686 multiples f of x, g of x, of degree less than n where of g 951 00:58:55,686 --> 00:59:04,560 of x equals the product of x minus alpha to the i for i 952 00:59:04,560 --> 00:59:10,140 equals 0 to n minus k minus one or something like that. 953 00:59:10,140 --> 00:59:13,870 This is called a generator polynomial and -- 954 00:59:13,870 --> 00:59:14,940 so what does this mean? 955 00:59:14,940 --> 00:59:18,480 In other language, all polynomials of degree less 956 00:59:18,480 --> 00:59:23,220 than n that have roots at 1 alpha, alpha 957 00:59:23,220 --> 00:59:26,030 squared, or so forth. 958 00:59:26,030 --> 00:59:30,330 It's better to make this i equals 1 to n minus k. 959 00:59:30,330 --> 00:59:33,170 You can also make it minus i. 960 00:59:33,170 --> 00:59:36,090 There are lots of details in here. 961 00:59:36,090 --> 00:59:38,950 But I'll just ask you broadly to recall that 962 00:59:38,950 --> 00:59:40,570 characterization. 963 00:59:40,570 --> 00:59:41,270 This is a -- 964 00:59:41,270 --> 00:59:44,024 AUDIENCE: I have another question. 965 00:59:44,024 --> 00:59:46,319 [INAUDIBLE] are not linear? 966 00:59:46,319 --> 00:59:48,280 PROFESSOR: OK, let's ask that. 967 00:59:50,880 --> 00:59:58,340 So we've got a set of binary n-tuples, OK, 968 00:59:58,340 --> 01:00:01,160 defined as the subset. 969 01:00:01,160 --> 01:00:02,720 So what do we need to do to see whether 970 01:00:02,720 --> 01:00:03,560 it's linear or not? 971 01:00:03,560 --> 01:00:06,422 AUDIENCE: [INAUDIBLE]. 972 01:00:06,422 --> 01:00:08,320 PROFESSOR: We just check the group property. 973 01:00:08,320 --> 01:00:11,710 Is it closed under addition? 974 01:00:11,710 --> 01:00:14,150 Suppose we take two of these code words we add them 975 01:00:14,150 --> 01:00:17,590 together, component-wise, vector form. 
976 01:00:20,360 --> 01:00:26,070 Seems to me we get A, that gives us another code word in 977 01:00:26,070 --> 01:00:29,500 the Reed-Solomon code because it's closed under addition, 978 01:00:29,500 --> 01:00:33,690 and B, by the addition rules, it's going to be another 979 01:00:33,690 --> 01:00:37,950 binary code word, all right? 980 01:00:37,950 --> 01:00:40,450 So it's another binary code word in the RS code, 981 01:00:40,450 --> 01:00:42,230 therefore, it's in C prime. 982 01:00:42,230 --> 01:00:45,670 So that's proof that, in fact, this code is linear. 983 01:00:45,670 --> 01:00:49,250 AUDIENCE: What about [INAUDIBLE]? 984 01:00:49,250 --> 01:00:50,620 PROFESSOR: That's trivial. 985 01:00:50,620 --> 01:00:53,190 For a binary code, we only have to consider two 986 01:00:53,190 --> 01:00:54,610 scalars, 0 and 1. 987 01:00:54,610 --> 01:00:55,860 AUDIENCE: [INAUDIBLE]? 988 01:00:58,186 --> 01:01:00,160 PROFESSOR: But now I'm asking whether this 989 01:01:00,160 --> 01:01:01,410 binary code is linear. 990 01:01:04,060 --> 01:01:05,310 There's the code over F2. 991 01:01:08,418 --> 01:01:08,901 Yes? 992 01:01:08,901 --> 01:01:09,867 You had a -- 993 01:01:09,867 --> 01:01:11,450 AUDIENCE: Same question. 994 01:01:11,450 --> 01:01:16,260 The base field is F2 to the x, so if you add 1 and 1 there, 995 01:01:16,260 --> 01:01:17,740 it wouldn't be just 0. 996 01:01:17,740 --> 01:01:22,138 But I guess if you restrict the field to F2, you have to 997 01:01:22,138 --> 01:01:23,745 check everything again, don't you? 998 01:01:23,745 --> 01:01:29,710 PROFESSOR: No, because we said F2 is the subfield of F2 to 999 01:01:29,710 --> 01:01:34,230 the M. And what we actually showed was that if you just 1000 01:01:34,230 --> 01:01:39,520 stay in the prime field, the integers of the field, then 1001 01:01:39,520 --> 01:01:41,660 you get exactly the same addition and multiplication 1002 01:01:41,660 --> 01:01:46,400 rules as you do in the prime field. 1003 01:01:46,400 --> 01:01:50,570 So in fact, in a field of characteristic 2, we always 1004 01:01:50,570 --> 01:01:57,710 have 0 plus 1 is 1, 1 plus 0 is 1, 1 plus 1 is 0. 1005 01:01:57,710 --> 01:02:01,600 That last one is the only one you really have to check. 1006 01:02:01,600 --> 01:02:05,840 So that accounts for the definition of characteristic, 1007 01:02:05,840 --> 01:02:09,750 and that's the reason we can always use just plus signs, 1008 01:02:09,750 --> 01:02:11,890 because subtraction is the same as addition. 1009 01:02:11,890 --> 01:02:13,140 AUDIENCE: [INAUDIBLE]? 1010 01:02:16,128 --> 01:02:17,810 PROFESSOR: Yeah. 1011 01:02:17,810 --> 01:02:19,060 We showed -- 1012 01:02:21,480 --> 01:02:24,255 well, very simple proof. 1013 01:02:28,290 --> 01:02:30,940 The order of a subgroup has to divide the order of 1014 01:02:30,940 --> 01:02:32,420 the group it's in. 1015 01:02:32,420 --> 01:02:35,570 The additive group of the field has order 2 to the 1016 01:02:35,570 --> 01:02:38,480 M, so it has to be some divisor of 2 to the M, which 1017 01:02:38,480 --> 01:02:39,730 is a prime power. 1018 01:02:43,740 --> 01:02:45,930 I think we proved that another way, but that's 1019 01:02:45,930 --> 01:02:49,330 a very simple proof. 1020 01:02:49,330 --> 01:02:54,350 OK, great questions, because it shows you're thinking about 1021 01:02:54,350 --> 01:02:57,210 how this relates to what you learned back in chapter seven.
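[To tie the generator-polynomial characterization written down a moment ago to something computable, here is a tiny sketch, again over GF(4), using the convention that the roots are alpha through alpha to the n minus k -- one of the conventions the lecture mentions, not the only possible one. With n = q - 1 = 3 and k = 1 it reproduces g(x) = x^2 + x + 1.]

    # g(x) = (x + alpha)(x + alpha^2)...(x + alpha^(n-k)) over GF(4).
    # Elements coded 0, 1, 2 (= alpha), 3 (= alpha^2); minus is plus in char 2.
    ADD = [[0,1,2,3],[1,0,3,2],[2,3,0,1],[3,2,1,0]]
    MUL = [[0,0,0,0],[0,1,2,3],[0,2,3,1],[0,3,1,2]]

    def poly_mul(p, q):                 # coefficient lists, lowest degree first
        out = [0] * (len(p) + len(q) - 1)
        for i, pi in enumerate(p):
            for j, qj in enumerate(q):
                out[i + j] = ADD[out[i + j]][MUL[pi][qj]]
        return out

    n, k, alpha = 3, 1, 2
    g, root = [1], alpha
    for _ in range(n - k):              # fold in roots alpha, alpha^2, ...
        g = poly_mul(g, [root, 1])      # multiply by (x + root)
        root = MUL[root][alpha]
    print(g)                            # [1, 1, 1]  ->  g(x) = x^2 + x + 1

[The code is then all multiples of g(x) of degree less than 3, which for these parameters is just the scalar multiples of (1, 1, 1).]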
1022 01:02:57,210 --> 01:03:00,420 But at this point you've kind of got a little vague about 1023 01:03:00,420 --> 01:03:02,810 chapter seven, what's really in there. 1024 01:03:02,810 --> 01:03:04,820 By Wednesday, you're really going to know 1025 01:03:04,820 --> 01:03:05,900 chapter seven well. 1026 01:03:05,900 --> 01:03:07,150 Let me tell you. 1027 01:03:11,240 --> 01:03:12,640 OK. 1028 01:03:12,640 --> 01:03:14,370 Good. 1029 01:03:14,370 --> 01:03:16,290 Thank you for the help. 1030 01:03:16,290 --> 01:03:18,680 Because these are all things that I should have just said 1031 01:03:18,680 --> 01:03:21,290 up here, but they come out much better, actually, if they 1032 01:03:21,290 --> 01:03:22,700 come out in response to questions. 1033 01:03:26,190 --> 01:03:27,440 OK. 1034 01:03:29,650 --> 01:03:36,990 So here's one characterization of a Reed-Solomon code which 1035 01:03:36,990 --> 01:03:38,290 apparently you did see. 1036 01:03:38,290 --> 01:03:41,430 It's the set of all multiples of some generator polynomial. 1037 01:03:41,430 --> 01:03:45,200 The generator polynomial is described as this polynomial 1038 01:03:45,200 --> 01:03:53,290 of degree n minus k whose roots are precisely alpha up 1039 01:03:53,290 --> 01:03:55,300 through alpha to the n minus k. 1040 01:03:55,300 --> 01:03:56,890 You did say that? 1041 01:03:56,890 --> 01:03:58,140 OK. 1042 01:04:00,960 --> 01:04:10,460 So similarly, the BCH code -- 1043 01:04:10,460 --> 01:04:14,030 I'm just going to assert this just to show you how it goes. 1044 01:04:14,030 --> 01:04:27,550 It's the set of all multiples of g prime of x, which is the 1045 01:04:27,550 --> 01:04:47,590 minimal degree binary polynomial 1046 01:04:47,590 --> 01:04:50,610 with the same roots. 1047 01:04:55,810 --> 01:04:57,616 Now, how can a binary polynomial -- 1048 01:05:01,940 --> 01:05:06,158 alpha to the n minus k. 1049 01:05:06,158 --> 01:05:09,280 How can a binary polynomial have non-binary roots? 1050 01:05:12,720 --> 01:05:15,290 Well, it clearly can. 1051 01:05:15,290 --> 01:05:18,615 Let's consider g of prime of x. 1052 01:05:21,260 --> 01:05:29,030 It's an element of F2 to the x, the set of all polynomials 1053 01:05:29,030 --> 01:05:37,550 over F2, which is a subset of the set of all polynomials 1054 01:05:37,550 --> 01:05:39,390 over 2 to the M for the same reason. 1055 01:05:39,390 --> 01:05:43,600 F2 is the subfield of F2 to the M, so some of the 1056 01:05:43,600 --> 01:05:47,250 polynomials over F2 to the M have purely binary 1057 01:05:47,250 --> 01:05:49,040 coefficients, all right? 1058 01:05:49,040 --> 01:05:51,950 So that's the sense -- 1059 01:05:51,950 --> 01:05:54,615 that's how we can get between these two things. 1060 01:06:05,260 --> 01:06:12,660 So some binary polynomials have roots in GF of 2 the M 1061 01:06:12,660 --> 01:06:14,020 let me give you an example. 1062 01:06:16,890 --> 01:06:29,860 Let's consider x squared plus x plus 1 as a polynomial in F4 1063 01:06:29,860 --> 01:06:37,320 of x, which is certainly is. 1064 01:06:37,320 --> 01:06:42,740 It's a polynomial of degree 2 and F4 of x. 1065 01:06:42,740 --> 01:06:45,740 What are its roots in F4 of x? 1066 01:06:45,740 --> 01:06:48,890 Does it have any roots in F4 of x? 1067 01:06:48,890 --> 01:06:51,890 Does it have any roots in F2 -- 1068 01:06:51,890 --> 01:06:56,270 sorry, I'm asking whether it has any roots in F4. 1069 01:06:56,270 --> 01:07:00,230 Now I step back and ask if it has any roots in F2. 1070 01:07:00,230 --> 01:07:01,320 So how do we do that? 
1071 01:07:01,320 --> 01:07:08,150 We just ask if F of x is 0 for x equals 0. 1072 01:07:08,150 --> 01:07:09,080 We get 1. 1073 01:07:09,080 --> 01:07:12,680 So 0 is not a root. 1074 01:07:12,680 --> 01:07:15,610 Substitute 1, we get 1 plus 1 plus 1. 1075 01:07:15,610 --> 01:07:19,340 1 is not a root of F of x. 1076 01:07:19,340 --> 01:07:20,000 All right. 1077 01:07:20,000 --> 01:07:21,620 Well, we have two more candidates. 1078 01:07:21,620 --> 01:07:23,416 How about alpha? 1079 01:07:23,416 --> 01:07:29,490 For alpha, we get F of alpha equals alpha squared plus 1080 01:07:29,490 --> 01:07:32,700 alpha plus 1. 1081 01:07:32,700 --> 01:07:37,580 And if you remember the addition table of F4, however 1082 01:07:37,580 --> 01:07:42,220 it's constructed, this equals 0, OK? 1083 01:07:50,850 --> 01:07:52,925 The elements are 0, 1, alpha, alpha squared. 1084 01:07:52,925 --> 01:07:55,370 Label the rows 0, 1, alpha, alpha squared, 1085 01:07:55,370 --> 01:07:59,040 and the columns 0, 1, alpha, alpha squared. 1086 01:07:59,040 --> 01:08:00,706 This is plus. 1087 01:08:00,706 --> 01:08:03,190 The row for 1 reads 1, 0, alpha squared, alpha: 1088 01:08:03,190 --> 01:08:06,850 1 plus alpha is alpha squared, 1089 01:08:06,850 --> 01:08:09,090 1 plus alpha squared is alpha. 1090 01:08:09,090 --> 01:08:12,130 Any of these is telling you the same thing. 1091 01:08:12,130 --> 01:08:15,810 The row for alpha reads alpha, alpha squared, 0, 1. 1092 01:08:15,810 --> 01:08:19,819 So we conclude from that that 1 plus alpha, which is 1093 01:08:19,819 --> 01:08:22,190 alpha squared, plus alpha squared is 0. 1094 01:08:27,609 --> 01:08:32,020 So alpha is, in fact, a root of this. 1095 01:08:32,020 --> 01:08:36,859 And what about F of alpha squared? 1096 01:08:36,859 --> 01:08:41,100 Alpha squared squared is alpha fourth, but we can 1097 01:08:41,100 --> 01:08:44,229 reduce that mod 3 -- 1098 01:08:44,229 --> 01:08:45,520 the exponent mod 3. 1099 01:08:45,520 --> 01:08:50,350 So this is alpha plus alpha squared plus 1 and that is 1100 01:08:50,350 --> 01:08:51,050 equal to 0. 1101 01:08:51,050 --> 01:08:56,580 So alpha and alpha squared are roots of this polynomial. 1102 01:09:05,910 --> 01:09:28,859 There's a lovely fact proved in the notes, which is that 1103 01:09:28,859 --> 01:09:37,800 every element of -- 1104 01:09:37,800 --> 01:09:38,479 facts. 1105 01:09:38,479 --> 01:09:52,660 Every element of F2 to the M is a root of x to the 2 1106 01:09:52,660 --> 01:09:54,790 to the M plus x. 1107 01:10:01,330 --> 01:10:03,490 OK, did you see something like this last time? 1108 01:10:03,490 --> 01:10:05,025 I see some shaking of heads. 1109 01:10:10,010 --> 01:10:11,260 OK. 1110 01:10:13,210 --> 01:10:18,100 And how many roots could there possibly be of x to the 2 to the M 1111 01:10:18,100 --> 01:10:20,195 plus x by the fundamental theorem of algebra? 1112 01:10:20,195 --> 01:10:21,264 AUDIENCE: 2 to the M. 1113 01:10:21,264 --> 01:10:28,520 PROFESSOR: 2 to the M. So that means that x to the 2 to 1114 01:10:28,520 --> 01:10:31,010 the M plus x -- 1115 01:10:31,010 --> 01:10:39,570 and for every root, it's got to have a factor of the form x 1116 01:10:39,570 --> 01:10:43,770 minus, or in this case, plus beta. 1117 01:10:43,770 --> 01:10:46,590 So this must have a complete factorization of 1118 01:10:46,590 --> 01:10:47,840 the following form. 1119 01:10:53,750 --> 01:10:58,625 x to the 2 to the M plus x is simply the product over all beta of x minus beta. 1120 01:11:02,370 --> 01:11:03,620 All right. 1121 01:11:06,680 --> 01:11:11,190 In fact, you can see that 0 is a root, obviously.
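[Those two hand calculations, and the fact just stated, can be checked by brute force. The sketch below evaluates x^2 + x + 1 and x^4 + x at every element of GF(4), using the same integer coding 0, 1, 2 (= alpha), 3 (= alpha^2) as in the earlier snippets.]

    ADD = [[0,1,2,3],[1,0,3,2],[2,3,0,1],[3,2,1,0]]   # GF(4) addition
    MUL = [[0,0,0,0],[0,1,2,3],[0,2,3,1],[0,3,1,2]]   # GF(4) multiplication

    def power(x, e):
        y = 1
        for _ in range(e):
            y = MUL[y][x]
        return y

    names = ["0", "1", "alpha", "alpha^2"]
    for x in range(4):
        f = ADD[ADD[power(x, 2)][x]][1]    # x^2 + x + 1
        g = ADD[power(x, 4)][x]            # x^4 + x  (= x^(2^M) + x for M = 2)
        print("%-7s  x^2+x+1 = %-7s  x^4+x = %s" % (names[x], names[f], names[g]))
    # Only alpha and alpha^2 zero out x^2 + x + 1;
    # all four elements zero out x^4 + x, as the fact above says.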
1122 01:11:11,190 --> 01:11:15,150 1 is a root, obviously. 1123 01:11:15,150 --> 01:11:20,730 And we've checked up here. 1124 01:11:20,730 --> 01:11:22,350 Let's see, what are we interested in? 1125 01:11:22,350 --> 01:11:24,070 Here's another example. 1126 01:11:24,070 --> 01:11:28,770 x to the fourth plus 1, f of -- 1127 01:11:28,770 --> 01:11:32,800 if we have g of x equals x to the four -- sorry -- plus x, 1128 01:11:32,800 --> 01:11:38,620 then g of alpha is alpha to the fourth plus alpha. 1129 01:11:38,620 --> 01:11:43,220 But alpha to the fourth is equal to alpha, so that's 0. 1130 01:11:43,220 --> 01:11:47,450 g of alpha squared equals alpha to the 1131 01:11:47,450 --> 01:11:50,400 8th plus alpha squared. 1132 01:11:50,400 --> 01:11:53,570 Alpha to the 8th is alpha to the fifth is alpha squared, so 1133 01:11:53,570 --> 01:11:55,140 that equals 0. 1134 01:11:55,140 --> 01:11:59,095 So, in fact, that's true for that case. 1135 01:12:04,720 --> 01:12:38,450 Every irreducible polynomial g of x in F2 of x whose degree 1136 01:12:38,450 --> 01:12:45,555 -- deg g of x -- divides 2 to the M -- 1137 01:12:54,440 --> 01:12:56,350 is that right? 1138 01:12:56,350 --> 01:13:01,225 This may be 2 to the M minus -- 1139 01:13:05,230 --> 01:13:06,990 I think it's whose degree divides M -- 1140 01:13:11,290 --> 01:13:23,605 is a factor of x to the 2 to the M plus x in F2 of x. 1141 01:13:29,910 --> 01:13:32,580 OK. 1142 01:13:32,580 --> 01:13:37,930 So let's give you an example again. 1143 01:13:37,930 --> 01:13:41,220 Example M equals 2. 1144 01:13:41,220 --> 01:13:44,170 This means, what are the irreducible polynomials whose 1145 01:13:44,170 --> 01:13:45,420 degrees divide 2? 1146 01:13:48,300 --> 01:13:59,320 The prime polynomials of degree 1, whose degree certainly divides 2, are 1147 01:13:59,320 --> 01:14:03,890 x and x plus 1. 1148 01:14:03,890 --> 01:14:08,240 Do they divide x to the fourth plus x? 1149 01:14:08,240 --> 01:14:09,490 Yeah, they do. 1150 01:14:13,390 --> 01:14:17,910 How many prime polynomials are there of degree 2? 1151 01:14:17,910 --> 01:14:23,080 There's only one, x squared plus x plus 1. 1152 01:14:23,080 --> 01:14:29,720 And if you divide it out into x to the fourth plus x, it 1153 01:14:29,720 --> 01:14:30,970 goes evenly. 1154 01:14:33,380 --> 01:14:42,610 It turns out that what this implies is that this is 1155 01:14:42,610 --> 01:14:45,410 actually an if and only if. 1156 01:14:45,410 --> 01:14:50,990 A polynomial g of x divides x to the 2 to the M plus x if and 1157 01:14:50,990 --> 01:14:56,940 only if g of x has a degree which divides M and g of x is 1158 01:14:56,940 --> 01:14:59,020 irreducible. 1159 01:14:59,020 --> 01:15:12,120 So this implies that x to the 2 to the M plus x factors in F2 of x 1160 01:15:12,120 --> 01:15:25,010 into the product of all irreducible polynomials whose 1161 01:15:25,010 --> 01:15:32,780 degrees divide M. And I think you did a homework problem 1162 01:15:32,780 --> 01:15:36,710 that used this, even though it was never proved in class. 1163 01:15:36,710 --> 01:15:38,180 So you at least read about it. 1164 01:15:41,400 --> 01:15:45,430 Again, for this example, that should mean that x to the 1165 01:15:45,430 --> 01:15:51,880 fourth plus x equals x times x plus 1 times x 1166 01:15:51,880 --> 01:15:54,450 squared plus x plus 1. 1167 01:15:54,450 --> 01:15:55,700 Is that true? 1168 01:15:58,510 --> 01:16:00,320 Pretty easy to see that it is.
1169 01:16:00,320 --> 01:16:03,130 x plus 1 times x squared plus x plus 1 is just 1170 01:16:03,130 --> 01:16:05,440 x cubed plus 1. 1171 01:16:05,440 --> 01:16:06,470 Multiply by x -- 1172 01:16:06,470 --> 01:16:08,420 I'm sorry, this x fourth plus x. 1173 01:16:08,420 --> 01:16:10,970 How many times have I done this? 1174 01:16:10,970 --> 01:16:14,642 Should be x to the fourth plus x. 1175 01:16:14,642 --> 01:16:17,390 Yes. 1176 01:16:17,390 --> 01:16:19,610 OK, it's kind of miracle when you first see it. 1177 01:16:19,610 --> 01:16:26,950 This works over any field for any PNM again, lest you get 1178 01:16:26,950 --> 01:16:28,200 this kind of factorization. 1179 01:16:30,740 --> 01:16:36,210 OK, but now, comparing this to this, what does that mean? 1180 01:16:36,210 --> 01:16:42,290 That means this is one factorization. 1181 01:16:42,290 --> 01:16:45,520 This is in -- 1182 01:16:45,520 --> 01:16:48,050 we can do a complete factorization in 1183 01:16:48,050 --> 01:16:50,400 F2 to the M of x. 1184 01:16:50,400 --> 01:16:55,470 We can only do complete in the sense that we reduce it to a 1185 01:16:55,470 --> 01:16:58,350 product of degree 1 polynomials. 1186 01:16:58,350 --> 01:17:02,790 In F2 of x, we can't do a complete factorization. 1187 01:17:02,790 --> 01:17:06,600 This is similar to, over the complex field you can always 1188 01:17:06,600 --> 01:17:12,250 completely factor a polynomial into one-dimensional 1189 01:17:12,250 --> 01:17:15,460 polynomials of this form, where beta is the set of roots 1190 01:17:15,460 --> 01:17:16,900 of the polynomial. 1191 01:17:16,900 --> 01:17:21,910 Whereas over the real field, you might have to have some 1192 01:17:21,910 --> 01:17:24,370 degree 2 polynomials in there. 1193 01:17:24,370 --> 01:17:26,340 This is exactly analogous. 1194 01:17:26,340 --> 01:17:28,660 You can't get complete 1195 01:17:28,660 --> 01:17:33,450 factorization over this subfield. 1196 01:17:33,450 --> 01:17:34,700 But what does it mean? 1197 01:17:34,700 --> 01:17:37,890 Here we have a polynomial of degree 4, has 4 roots. 1198 01:17:37,890 --> 01:17:39,640 We factor like this. 1199 01:17:39,640 --> 01:17:43,380 So each of these factors must contain a number of roots 1200 01:17:43,380 --> 01:17:45,670 equal to its degree. 1201 01:17:45,670 --> 01:17:47,280 So what are the roots? 1202 01:17:47,280 --> 01:17:51,170 This is the polynomial that has root 0. 1203 01:17:51,170 --> 01:17:53,710 This is the polynomial that has root 1. 1204 01:17:53,710 --> 01:17:56,710 As we just showed, this is the polynomial that has the roots 1205 01:17:56,710 --> 01:17:59,074 alpha and alpha squared, OK? 1206 01:18:01,650 --> 01:18:05,290 So it's always going to turn out that way. 1207 01:18:05,290 --> 01:18:14,430 So you can group the field elements into subsets where 1208 01:18:14,430 --> 01:18:19,090 each subset is the set of roots of some irreducible 1209 01:18:19,090 --> 01:18:25,480 binary polynomial of some degree that divides M, OK? 1210 01:18:25,480 --> 01:18:29,390 These are called cyclotomic subsets. 1211 01:18:29,390 --> 01:18:31,570 Whence the name of Elwyn Berlekamp's company, 1212 01:18:31,570 --> 01:18:32,820 Cyclotomics. 1213 01:18:35,630 --> 01:18:36,090 All right. 1214 01:18:36,090 --> 01:18:41,100 So this is where, I think, the theory gets just absolutely 1215 01:18:41,100 --> 01:18:44,350 lovely and unexpected, at least when you start in. 
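[The factorization fact is also easy to verify by machine for small M. The sketch below is a minimal check, not from the course notes: it treats binary polynomials as Python integer bitmasks (bit i is the coefficient of x^i), multiplies together every irreducible polynomial whose degree divides M, and confirms the product is x^(2^M) + x. For M = 2 it reproduces x times (x + 1) times (x^2 + x + 1) = x^4 + x; for M = 4 it lands on x^16 + x.]

    # Check: x^(2^M) + x = product of all irreducible binary polynomials
    # whose degree divides M (polynomials as bitmasks, bit i <-> x^i).

    def gf2_mul(a, b):                    # carry-less polynomial multiplication
        out = 0
        while b:
            if b & 1:
                out ^= a
            a <<= 1
            b >>= 1
        return out

    def gf2_mod(a, m):                    # remainder of a modulo m over GF(2)
        dm = m.bit_length() - 1
        while a and a.bit_length() - 1 >= dm:
            a ^= m << (a.bit_length() - 1 - dm)
        return a

    def irreducible(p):                   # no divisor of degree 1 .. deg(p)-1
        d = p.bit_length() - 1
        return all(gf2_mod(p, q) != 0 for q in range(2, 1 << d))

    def check(M):
        target = (1 << (1 << M)) ^ 2      # x^(2^M) + x
        prod, factors = 1, []
        for p in range(2, 1 << (M + 1)):  # all polynomials of degree 1 .. M
            d = p.bit_length() - 1
            if M % d == 0 and irreducible(p):
                factors.append(bin(p))
                prod = gf2_mul(prod, p)
        return prod == target, factors

    print(check(2))   # (True, ['0b10', '0b11', '0b111'])  i.e. x, x+1, x^2+x+1
    print(check(4))   # (True, [two of degree 1, one of degree 2, three of degree 4])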
1216 01:18:44,350 --> 01:18:46,950 And people who have a taste for these things, I encourage 1217 01:18:46,950 --> 01:18:50,120 you to read a real book on the subject. 1218 01:18:50,120 --> 01:18:53,340 Berlekamp's is one of the most elegant. 1219 01:18:53,340 --> 01:18:54,280 But there are lots -- 1220 01:18:54,280 --> 01:18:56,660 because it's an elegant subject, there are lots of 1221 01:18:56,660 --> 01:18:59,560 elegant treatments of it. 1222 01:18:59,560 --> 01:19:01,590 All right. 1223 01:19:01,590 --> 01:19:03,520 So remember what we were trying to do 1224 01:19:03,520 --> 01:19:04,480 in the first place. 1225 01:19:04,480 --> 01:19:13,510 We were trying to find the dimension of BCH codes. 1226 01:19:13,510 --> 01:19:15,120 So where are we now? 1227 01:19:15,120 --> 01:19:25,960 We've defined a BCH code as the set of all binary 1228 01:19:25,960 --> 01:19:39,270 polynomials with roots alpha up to alpha to the n minus k 1229 01:19:39,270 --> 01:19:43,760 where n and k are the parameters of the parent 1230 01:19:43,760 --> 01:19:44,760 Reed-Solomon code. 1231 01:19:44,760 --> 01:19:47,618 And alpha is in the parent field. 1232 01:19:47,618 --> 01:19:50,250 So alpha in 1233 01:19:50,250 --> 01:19:55,920 F2 to the M. OK. 1234 01:19:55,920 --> 01:19:58,835 One final fact, I guess, we need to know here. 1235 01:20:01,415 --> 01:20:05,850 All right, if alpha is a root -- now I'm talking 1236 01:20:05,850 --> 01:20:08,540 about any field -- 1237 01:20:08,540 --> 01:20:13,160 then alpha squared is a root of a binary polynomial. 1238 01:20:18,480 --> 01:20:22,160 And again, this is due to a very lovely fact, which is 1239 01:20:22,160 --> 01:20:29,090 that squaring is linear in fields of characteristic 2. 1240 01:20:29,090 --> 01:20:31,045 Why is squaring linear. 1241 01:20:31,045 --> 01:20:33,530 A little side calculation. 1242 01:20:33,530 --> 01:20:38,410 a plus b squared equals a squared plus 2ab plus b 1243 01:20:38,410 --> 01:20:41,340 squared but characteristic 2. 1244 01:20:41,340 --> 01:20:43,450 We can wipe that out. 1245 01:20:43,450 --> 01:20:54,240 So in general, f of x quantity squared is f of x squared. 1246 01:20:58,120 --> 01:21:03,520 And that's why, if f of x evaluates to 0, then f of x 1247 01:21:03,520 --> 01:21:09,740 squared must evaluate to 0 since 0 squared is always 1248 01:21:09,740 --> 01:21:11,910 equal to 0. 1249 01:21:11,910 --> 01:21:12,360 All right. 1250 01:21:12,360 --> 01:21:17,330 So alpha is a root if and only if alpha squared is a root. 1251 01:21:17,330 --> 01:21:18,380 Is that true here? 1252 01:21:18,380 --> 01:21:18,900 Yes. 1253 01:21:18,900 --> 01:21:20,460 Alpha alpha squared. 1254 01:21:20,460 --> 01:21:23,080 Well, is it also true if we squared alpha squared? 1255 01:21:23,080 --> 01:21:25,430 What is the square of alpha squared? 1256 01:21:25,430 --> 01:21:27,800 It's back to alpha, all right? 1257 01:21:27,800 --> 01:21:36,720 So these sets now turn out to be cyclotomic cosets, they're 1258 01:21:36,720 --> 01:21:39,130 called, which is the set of all 1259 01:21:39,130 --> 01:21:41,740 elements and their squares. 1260 01:21:41,740 --> 01:21:46,360 We get alpha, alpha squared, alpha fourth, alpha eighth. 1261 01:21:46,360 --> 01:21:48,170 And then, at some point, it cycles. 1262 01:21:48,170 --> 01:21:49,420 Why? 1263 01:21:49,420 --> 01:21:51,620 Because what does alpha to the 2M equal? 1264 01:21:51,620 --> 01:21:52,552 AUDIENCE: [INAUDIBLE]. 1265 01:21:52,552 --> 01:21:55,820 PROFESSOR: No. 1266 01:21:55,820 --> 01:21:56,710 Alpha. 
1267 01:21:56,710 --> 01:21:57,120 Right. 1268 01:21:57,120 --> 01:22:01,350 Alpha to the 2M minus 1 is 1, so alpha to the 2M is alpha. 1269 01:22:01,350 --> 01:22:05,680 So it comes back to alpha again because of that. 1270 01:22:05,680 --> 01:22:08,670 So we get a finite number of roots which, of course, are 1271 01:22:08,670 --> 01:22:10,810 equal to the degree here. 1272 01:22:10,810 --> 01:22:13,306 In some cases, it's fewer. 1273 01:22:13,306 --> 01:22:14,830 It's whatever the degree is. 1274 01:22:14,830 --> 01:22:23,980 It's d, which divides M. 1275 01:22:23,980 --> 01:22:27,360 So from this, we simply start -- 1276 01:22:27,360 --> 01:22:30,930 suppose we want alpha to be a root. 1277 01:22:30,930 --> 01:22:36,180 Let's start with high rate codes. 1278 01:22:36,180 --> 01:22:44,140 I think here in the example, in the text, I used F16. 1279 01:22:44,140 --> 01:22:47,860 F16 is going to have -- 1280 01:22:47,860 --> 01:22:54,110 the elements of F16 are going to be the roots of all the 1281 01:22:54,110 --> 01:22:59,830 degree 1, 2, and 4 polynomials because M is 4 and these are 1282 01:22:59,830 --> 01:23:01,320 the degrees that divide 4. 1283 01:23:01,320 --> 01:23:11,300 So again, we get x and x plus 1, which have roots 0 and 1. 1284 01:23:11,300 --> 01:23:15,720 We get x squared plus x plus 1. 1285 01:23:15,720 --> 01:23:19,870 What are the roots of x squared plus x plus 1 and F16? 1286 01:23:19,870 --> 01:23:22,520 There are only going to be two of them and they've got to 1287 01:23:22,520 --> 01:23:28,770 have the property that if we start with the first one, 1288 01:23:28,770 --> 01:23:32,300 let's call it beta and beta squared. 1289 01:23:32,300 --> 01:23:34,640 Beta fourth has got to equal beta because we 1290 01:23:34,640 --> 01:23:36,960 only have two of them. 1291 01:23:36,960 --> 01:23:43,040 That more or less forces beta to be alpha to the fifth or 1292 01:23:43,040 --> 01:23:45,410 alpha to the tenth. 1293 01:23:45,410 --> 01:23:48,300 Let's see if that works. 1294 01:23:48,300 --> 01:23:51,680 We get alpha to the fifth, alpha to the tenth. 1295 01:23:51,680 --> 01:23:54,400 Square that, we get alpha to the 20th. 1296 01:23:54,400 --> 01:23:57,770 But we now take the exponents mod 15 and we get 1297 01:23:57,770 --> 01:24:00,440 5 again, all right? 1298 01:24:00,440 --> 01:24:04,350 So those are the two roots of x squared 1299 01:24:04,350 --> 01:24:05,990 plus x plus 1 in F16. 1300 01:24:09,220 --> 01:24:10,920 Obviously, I'm not -- 1301 01:24:10,920 --> 01:24:13,340 I guess that was a legitimate proof, but it 1302 01:24:13,340 --> 01:24:15,810 was certainly quick. 1303 01:24:15,810 --> 01:24:18,690 OK, there are three polynomials of degree 4. 1304 01:24:18,690 --> 01:24:22,850 By the way, I think you can imagine how, given this 1305 01:24:22,850 --> 01:24:27,210 result, we can now enumerate the 1306 01:24:27,210 --> 01:24:29,450 polynomials of each degree. 1307 01:24:29,450 --> 01:24:31,300 We now have an expression. 1308 01:24:35,350 --> 01:24:40,360 There are two irreducible polynomials of degree 1. 1309 01:24:40,360 --> 01:24:43,290 So how many more of degree 2 is it going to take to make 1310 01:24:43,290 --> 01:24:44,770 the overall degree 4? 1311 01:24:44,770 --> 01:24:45,560 One of them. 1312 01:24:45,560 --> 01:24:47,160 So we're looking for a single -- 1313 01:24:47,160 --> 01:24:48,870 there must be a single irreducible 1314 01:24:48,870 --> 01:24:50,620 polynomial degree 4. 
1315 01:24:50,620 --> 01:24:54,150 Similarly over here, all right, we've accounted for 1316 01:24:54,150 --> 01:24:56,640 four of the elements of F16. 1317 01:24:56,640 --> 01:24:59,810 The only other possible degree we could have is 4. 1318 01:24:59,810 --> 01:25:03,320 So how many irreducible polynomials are there of 1319 01:25:03,320 --> 01:25:07,460 degree 4 over F2 of x? 1320 01:25:07,460 --> 01:25:08,320 Must be three of them. 1321 01:25:08,320 --> 01:25:11,240 And it always works out. 1322 01:25:11,240 --> 01:25:14,330 So, from this you can get a closed-form combinatorial 1323 01:25:14,330 --> 01:25:18,720 formula again in terms of the Mobius inversion formula. 1324 01:25:18,720 --> 01:25:22,740 By doing a sieve or by some other method, you can find out 1325 01:25:22,740 --> 01:25:31,840 that the three irreducible polynomials of degree 4 are 1326 01:25:31,840 --> 01:25:36,850 those three and the corresponding roots 1327 01:25:36,850 --> 01:25:38,580 for this one -- 1328 01:25:38,580 --> 01:25:41,790 if this is the irreducible polynomial you use to define 1329 01:25:41,790 --> 01:25:45,100 the field, then alpha is a root of it and you're going to 1330 01:25:45,100 --> 01:25:49,060 get alpha, alpha squared, alpha fourth, 1331 01:25:49,060 --> 01:25:50,560 alpha to the eighth. 1332 01:25:50,560 --> 01:25:53,110 Alpha to the eighth squared is alpha to the 16th, but that 1333 01:25:53,110 --> 01:25:55,240 equals alpha, OK? 1334 01:25:55,240 --> 01:25:58,670 So these are the roots of that equation. 1335 01:25:58,670 --> 01:25:59,920 Here you get -- 1336 01:26:02,740 --> 01:26:06,940 sorry, x fourth plus x cubed plus 1, since that's basically 1337 01:26:06,940 --> 01:26:12,060 the reverse of this, turns out you get alpha to the minus 1, 1338 01:26:12,060 --> 01:26:16,070 which is alpha to the 14th. 1339 01:26:16,070 --> 01:26:19,860 Square that and you get alpha to the minus 2, which is alpha 1340 01:26:19,860 --> 01:26:22,950 to the 13th, or equivalently, alpha to the 28th -- 1341 01:26:22,950 --> 01:26:25,320 the exponent reduces to 13. 1342 01:26:25,320 --> 01:26:28,980 This becomes alpha to the 11th, alpha to the seventh. 1343 01:26:28,980 --> 01:26:30,730 And then alpha to the seventh gives you alpha 1344 01:26:30,730 --> 01:26:31,930 to the 14th again. 1345 01:26:31,930 --> 01:26:33,898 AUDIENCE: [INAUDIBLE] 1346 01:26:33,898 --> 01:26:39,310 same thing about taking the square [INAUDIBLE]? 1347 01:26:39,310 --> 01:26:40,294 PROFESSOR: Correct. 1348 01:26:40,294 --> 01:26:41,770 AUDIENCE: --from the [INAUDIBLE]? 1349 01:26:41,770 --> 01:26:45,390 PROFESSOR: Because you go around in a circle. 1350 01:26:45,390 --> 01:26:47,090 So it's a finite set. 1351 01:26:47,090 --> 01:26:51,160 So you can obviously divide by 2. 1352 01:26:51,160 --> 01:26:54,840 So there's one cyclotomic coset. 1353 01:26:54,840 --> 01:26:56,540 Here's the second one. 1354 01:26:56,540 --> 01:26:58,960 And then the third one must be whatever's left over. 1355 01:26:58,960 --> 01:26:59,710 What's left over? 1356 01:26:59,710 --> 01:27:02,040 We don't have alpha to the third yet. 1357 01:27:02,040 --> 01:27:07,680 Alpha to the third, alpha to the sixth, alpha to the 12th, 1358 01:27:07,680 --> 01:27:11,740 and then alpha to the 24th, which is alpha to the ninth. 1359 01:27:11,740 --> 01:27:15,558 And alpha to the 18th is equal to alpha to the third again. 1360 01:27:15,558 --> 01:27:18,190 OK, what's the moral of this?
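[The coset bookkeeping just walked through for F16 is easy to automate. A small sketch: starting from each exponent i, keep doubling modulo 2^M - 1 (squaring alpha^i gives alpha^(2i)) until the orbit closes.]

    # Cyclotomic cosets modulo 2^M - 1 (here M = 4, so modulo 15).
    def cyclotomic_cosets(M):
        n = (1 << M) - 1
        seen, cosets = set(), []
        for i in range(n):
            if i in seen:
                continue
            coset, j = [], i
            while j not in coset:
                coset.append(j)
                j = (2 * j) % n           # squaring alpha^j -> alpha^(2j)
            seen.update(coset)
            cosets.append(coset)
        return cosets

    for c in cyclotomic_cosets(4):
        print(c)
    # [0]
    # [1, 2, 4, 8]       roots of x^4 + x + 1 (with alpha defined by it)
    # [3, 6, 12, 9]      roots of the remaining degree-4 irreducible, x^4+x^3+x^2+x+1
    # [5, 10]            roots of x^2 + x + 1 in F16
    # [7, 14, 13, 11]    roots of x^4 + x^3 + 1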
1361 01:27:21,120 --> 01:27:23,660 We want to start off with -- 1362 01:27:23,660 --> 01:27:25,850 let's just take a look at the Reed-Solomon, 1363 01:27:25,850 --> 01:27:27,380 the very first one. 1364 01:27:27,380 --> 01:27:31,680 We just want n minus k to be 1 and we just want to have root 1365 01:27:31,680 --> 01:27:34,695 alpha in here. 1366 01:27:34,695 --> 01:27:38,660 Well, if we get root alpha, we're going to get root alpha 1367 01:27:38,660 --> 01:27:45,070 squared, alpha fourth and alpha eightth along with it. 1368 01:27:45,070 --> 01:27:47,940 If we want a binary polynomial with root alpha, 1369 01:27:47,940 --> 01:27:49,065 it has to this one. 1370 01:27:49,065 --> 01:27:52,250 And it's going to have four roots. 1371 01:27:52,250 --> 01:27:52,700 All right. 1372 01:27:52,700 --> 01:28:05,370 So for root alpha, we're going to get x fourth plus x plus 1 1373 01:28:05,370 --> 01:28:07,840 as our generator polynomial. 1374 01:28:07,840 --> 01:28:09,260 And it's actually going to have roots 1375 01:28:09,260 --> 01:28:11,190 alpha and alpha squared. 1376 01:28:11,190 --> 01:28:13,540 So it's going to be the same one as you would get if you 1377 01:28:13,540 --> 01:28:15,860 wanted alpha and alpha squared. 1378 01:28:15,860 --> 01:28:23,700 That means n minus k is already up to 2, which means 1379 01:28:23,700 --> 01:28:26,900 your distance is already up to 3. 1380 01:28:26,900 --> 01:28:29,760 Distance is n minus k plus 1. 1381 01:28:29,760 --> 01:28:36,630 So this leads to a code with n equals 16 -- 1382 01:28:36,630 --> 01:28:37,850 is that right? 1383 01:28:37,850 --> 01:28:39,100 Yes. 1384 01:28:40,920 --> 01:28:45,420 BCH codes are always defined to be one shorter. 1385 01:28:45,420 --> 01:28:58,660 We're going to have four redundant bits, so k equals 12 1386 01:28:58,660 --> 01:29:01,530 and distance equals 3. 1387 01:29:04,530 --> 01:29:06,113 I think that's not quite right. 1388 01:29:11,430 --> 01:29:16,490 When we're doing BCH codes, they're always defined just on 1389 01:29:16,490 --> 01:29:20,630 the non-zero elements of the field. 1390 01:29:20,630 --> 01:29:28,090 So we start with the shorter Read-Solomon code of 1391 01:29:28,090 --> 01:29:30,960 length q minus 1. 1392 01:29:30,960 --> 01:29:32,940 15 in this case rather than 16. 1393 01:29:35,600 --> 01:29:41,530 And 0 never appears here as one of the possible roots. 1394 01:29:41,530 --> 01:29:42,900 So let me just do that. 1395 01:29:42,900 --> 01:29:46,170 I apologize I haven't carried through all 1396 01:29:46,170 --> 01:29:48,110 the possible steps. 1397 01:29:48,110 --> 01:29:52,000 So we're getting all of the polynomial multiples of x 1398 01:29:52,000 --> 01:29:57,230 fourth plus x plus 1 of degree less or equal to 1399 01:29:57,230 --> 01:29:59,560 14 rather than 15. 1400 01:29:59,560 --> 01:30:01,610 So it's n equals 15. 1401 01:30:01,610 --> 01:30:05,610 n minus k is going to be 4. 1402 01:30:05,610 --> 01:30:07,220 So this is 11. 1403 01:30:07,220 --> 01:30:11,280 And the distance is going to be greater than equal to 3. 1404 01:30:11,280 --> 01:30:13,160 In fact, it is 3. 1405 01:30:13,160 --> 01:30:18,160 So it's a 15, 11, 3 code, which is, in 1406 01:30:18,160 --> 01:30:20,040 fact, the Hamming code. 1407 01:30:20,040 --> 01:30:20,450 Yes? 1408 01:30:20,450 --> 01:30:21,350 AUDIENCE: [INAUDIBLE] 1409 01:30:21,350 --> 01:30:24,110 PROFESSOR: Oh, I'm sorry. 1410 01:30:24,110 --> 01:30:27,080 Wasn't this fun? 1411 01:30:27,080 --> 01:30:28,592 You're supposed to do that, Ashish. 
1412 01:30:28,592 --> 01:30:29,476 AUDIENCE: [INAUDIBLE]. 1413 01:30:29,476 --> 01:30:34,420 PROFESSOR: OK, well, so you can see how it goes. 1414 01:30:34,420 --> 01:30:37,770 So this gets us both roots alpha and alpha squared. 1415 01:30:37,770 --> 01:30:42,540 If we want alpha third and alpha fourth, we already have 1416 01:30:42,540 --> 01:30:45,000 alpha fourth from this polynomial. 1417 01:30:45,000 --> 01:30:49,800 So all we need is this one, which gets us another. 1418 01:30:49,800 --> 01:30:55,166 And that gives us an n equals 15, k equals 7, d greater than 1419 01:30:55,166 --> 01:31:01,170 or equal to 5, which, in fact, is a 15, 7, 5 code. 1420 01:31:01,170 --> 01:31:03,620 So we get additional codes. 1421 01:31:03,620 --> 01:31:06,130 There's a table in there that shows you that most the time 1422 01:31:06,130 --> 01:31:09,790 for short block lengths for distances which are equal to a 1423 01:31:09,790 --> 01:31:12,060 prime power, we get the same parameters 1424 01:31:12,060 --> 01:31:13,320 as Reed-Muller codes. 1425 01:31:13,320 --> 01:31:16,070 For lengths 64 and higher, sometimes we do 1426 01:31:16,070 --> 01:31:17,300 a little bit better. 1427 01:31:17,300 --> 01:31:21,560 In addition, we get a lot more codes for every, basically, 1428 01:31:21,560 --> 01:31:24,740 odd distance when you take the length odd or even distance if 1429 01:31:24,740 --> 01:31:26,300 you take the length even. 1430 01:31:30,010 --> 01:31:32,670 And that's definitely an asset, that you have a larger 1431 01:31:32,670 --> 01:31:34,620 number of codes to choose from. 1432 01:31:34,620 --> 01:31:38,210 So there's no doubt that in terms n, k, d, BCH is the 1433 01:31:38,210 --> 01:31:42,480 larger and, for longer block length, slightly better class 1434 01:31:42,480 --> 01:31:43,750 than Reed-Muller codes. 1435 01:31:43,750 --> 01:31:46,110 But as I said before, in terms of performance versus 1436 01:31:46,110 --> 01:31:48,450 complexity, Reed-Muller might still be the 1437 01:31:48,450 --> 01:31:49,730 ones you want to use. 1438 01:31:49,730 --> 01:31:50,220 OK. 1439 01:31:50,220 --> 01:31:52,640 Good luck in the review and good luck in the exam. 1440 01:31:52,640 --> 01:31:53,890 I will see you Wednesday.