PROFESSOR: So the handouts will be just the problem set 8 solutions, of which you already have the first two. Let me remind you that problem set 9 is due on Friday, but we will accept it on Monday if that's when you want to hand it in to Ashish. And problem set 10 I will hand out next week, but you won't be responsible for it. You could try it if you're so moved.

OK. We're in the middle of chapter 13. We've been talking about capacity-approaching codes. We've talked about a number of classes of them -- low-density parity-check, turbo, repeat-accumulate -- and I've given you a general idea of how the sum-product decoding algorithm is applied to decode these codes. These are all defined on graphs with cycles, in the middle of which is a large pseudo-random interleaver. The sum-product algorithm is therefore done iteratively. In general, the initial observed information comes in on one side, the left side or the right side, and the iterative schedule amounts to doing first the left side, then the right side, then the left side, then the right side, until you converge, you hope. That was the original turbo idea, and it continues to be the right way to do it.
OK. Today we're actually going to try to do some analysis. To do the analysis, we're going to focus on low-density parity-check codes, which are certainly far easier than turbo codes to analyze, because they have such simple elements. I guess the repeat-accumulate codes are equally easy to analyze, but maybe not as good in performance. Maybe they're as good, I don't know. No one has driven that as far as low-density parity-check codes.

Also, we're going to take a very simple channel. It's actually the channel for which most of the analysis has been done, which is the binary erasure channel, where everything reduces to a one-dimensional problem, and therefore we can do things quite precisely. But this will allow me to introduce density evolution, which is the generalization of this for more general channels like the binary-input additive white Gaussian noise channel, if I manage to go fast enough. I apologize -- today I do feel in a hurry. Nonetheless, please ask questions whenever you want to slow me down or just get some more understanding.
So the binary erasure channel is one of the elementary channels that you look at if you've ever taken information theory. It has two inputs and three outputs. The two inputs are 0 and 1; the outputs are 0, 1, or an erasure, an ambiguous output. If you send a 0, you can either get the 0 correctly, or you could get an erasure. It might be a deletion -- you just don't get anything. Similarly, if you send a 1, you either get a 1 or an erasure. There's no possibility of getting something incorrectly. That's the key thing about this channel. The probability of an erasure is p, regardless of whether you send 0 or 1. So there's a single parameter that governs this channel.

Now admittedly, this is not a very realistic channel. It's a toy channel in the binary case. However, some of the impetus for this development actually came from people who were considering packet transmission on the internet.
And in the case of packet transmission on the internet, of course, you have a long packet, a very non-binary symbol if you like. But if you consider these to be packets, then on the internet you either receive the packet correctly or you fail to receive it. You don't receive it at all, and you know it, because there is an internal parity check in each packet. So the q-ary erasure channel is in fact a realistic model, and in fact there's a company, Digital Fountain, that has been founded and is still going strong as far as I know, which is specifically devoted to solutions for this q-ary erasure channel for particular kinds of scenarios on the internet where you want to do forward error correction rather than repeat transmission. And a lot of the early work on this analysis came from these guys -- Luby, Shokrollahi, other people. They were some of the people who focused on low-density parity-check codes immediately, following the work of Spielman and Sipser here at MIT, and said, OK, suppose we try this on our q-ary erasure channel. And they were able to get very close to the capacity of the q-ary erasure channel, which is also 1 minus p.
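[As a concrete picture of the channel model just described -- not anything from the lecture itself -- here is a minimal simulation of a BEC(p) in Python, representing an erasure as `None`; the function name and parameters are purely illustrative.]

```python
import random

def bec(bits, p, rng=None):
    """Binary erasure channel: each bit is erased (None) with probability p,
    and otherwise delivered unchanged.  Bits are never flipped."""
    rng = rng or random.Random(0)
    return [None if rng.random() < p else b for b in bits]

sent = [0, 1] * 50_000
received = bec(sent, p=0.4)
# By the law of large numbers, about 1 - p = 0.6 of the bits survive.
frac_good = sum(b is not None for b in received) / len(received)
```

[Note the key property of the model: every unerased output equals the corresponding input, so the only uncertainty is which positions were erased.]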
This is the information-theoretic capacity of the binary channel. It's kind of obvious that it should be 1 minus p, because on the average you get 1 minus p good bits out for every bit that you send in. So the maximum rate you could expect to send over this channel is 1 minus p.

OK. Let's first think about maximum likelihood decoding on this channel. Suppose we take a word from a binary code and send it through this channel, and we get some erasure pattern at the output. So we have a subset of the bits that are good, and a subset that are erased at the output. Now what does maximum likelihood decoding amount to on this channel?

Well, the code word that we sent is going to match up with all the good bits received, right? So we know that there's going to be at least one word in the code that agrees with the received sequence in the good places. If that's the only word in the code that agrees with the received word in those places, then we can declare it the winner, right? And maximum likelihood decoding succeeds.
We know what the channel is, so we know that all the good bits have to match up with the code word. But suppose there are two words in the code that match up in all the good places? There's no way to decide between them, right? So basically, that's what maximum likelihood decoding amounts to. You simply check how many code words match the received good bits. If there's only one, you decode. If there's more than one, you could flip a coin, but we'll consider that to be a decoding failure. You just don't know, so you throw up your hands -- you have a detected decoding failure.

So in the case of a linear code, what are we doing here? In the case of a linear code, consider the parity check equations. We basically have n minus k parity check equations, and we're trying to find how many code sequences solve those parity check equations. So we have n minus k equations, n unknowns, and we're basically just trying to solve linear equations. So that would be one decoding method for maximum likelihood decoding. Solve the equations. If you get a unique solution, you're finished.
If you get a space of solutions of dimension one or more, you lose. OK? We know lots of ways of solving linear equations, like Gaussian elimination with back-substitution. That's actually what we will be doing with low-density parity-check codes. And so, decoding for the binary erasure channel you can think of as just trying to solve linear equations. If you get a unique solution, you win; otherwise, you fail.

Another way of looking at it for a linear code: what do the good bits have to form? The erased bits have to be a function of the good bits, all right? In a linear code, that's just a function of where the good bits are. We've run into this concept before. We called it an information set. An information set is a subset of the coordinates that basically determines the rest of the coordinates. If you know the bits in that subset, then you know the code word. You can fill out the rest of the code word through some linear equation.
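[The "count the code words that match the good bits" recipe can be written down directly for a small code. A minimal sketch -- the function name and the little length-3 even-weight code are just illustrative:]

```python
def ml_erasure_decode_bruteforce(codewords, y):
    """ML decoding on the BEC by direct search: keep every code word that
    agrees with the received word y in all unerased positions (None marks
    an erasure).  Exactly one survivor -> decode it; more than one -> a
    detected decoding failure, reported as None."""
    matches = [c for c in codewords
               if all(r is None or r == b for r, b in zip(y, c))]
    return matches[0] if len(matches) == 1 else None

# Length-3 even-weight code, minimum distance 2.
C = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
ml_erasure_decode_bruteforce(C, (0, None, 1))     # unique match: decode (0, 1, 1)
ml_erasure_decode_bruteforce(C, (0, None, None))  # two matches: failure, None
```

[Exhaustive search is exponential in k, of course; for a linear code the same answer comes from solving the parity check equations, the view taken in the lecture.]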
So basically, we're going to succeed if the good bits cover an information set, and we're going to fail otherwise. So how many bits do we need to cover an information set? We're certainly going to need at least k.

Now today, we're going to be considering very long codes. So suppose I have a long (n, k) code, and I transmit it over this channel. About how many bits are going to be erased? About pn bits are going to be erased, leaving about (1 minus p) times n unerased. We're going to get approximately -- by the law of large numbers -- (1 minus p) times n unerased bits, and this clearly has to be greater than k. OK, so with very high probability, if we get more than that, we'll be able to solve the equations and find a unique code word. If we get fewer than that, there's no possible way we could solve the equations. We don't have enough left.

So what does this say? This says that k over n, which is the code rate, has to be less than 1 minus p in order for maximum likelihood decoding to work with a linear code over the binary erasure channel, and that is consistent with the capacity.
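[For a linear code, the same decoder can be phrased as solving the parity checks for the erased positions over GF(2); it finds a unique solution exactly when the columns of H at the erased positions are linearly independent -- equivalently, when the good bits contain an information set. A hedged sketch; the (7,4) Hamming parity check matrix below is illustrative, not a code from the lecture:]

```python
def ml_erasure_decode(H, y):
    """ML decoding of a linear code on the BEC: solve the parity checks
    H x = 0 for the erased positions (marked None in y) over GF(2).
    Unique solution -> the decoded word; a free variable -> more than one
    matching code word, a detected decoding failure (None)."""
    n = len(y)
    E = [j for j in range(n) if y[j] is None]
    # Build [H_E | s], where s is the syndrome contribution of the known bits.
    rows = []
    for row in H:
        rhs = sum(row[j] * y[j] for j in range(n) if y[j] is not None) % 2
        rows.append([row[j] for j in E] + [rhs])
    # Gauss-Jordan elimination over GF(2) on the erased columns.
    r = 0
    for c in range(len(E)):
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            return None          # dependent column: solution space has dim >= 1
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        r += 1
    x = list(y)
    for i, j in enumerate(E):    # the pivot for column i ended up in row i
        x[j] = rows[i][-1]
    return x

H = [[1, 1, 1, 0, 1, 0, 0],      # illustrative (7,4) Hamming parity checks
     [1, 1, 0, 1, 0, 1, 0],
     [1, 0, 1, 1, 0, 0, 1]]
ml_erasure_decode(H, [1, None, 1, None, 0, 0, 1])  # recovers [1, 0, 1, 1, 0, 0, 1]
```

[With n minus k checks, more than n minus k erasures always leaves a free variable, which is the counting argument above.]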
If the rate is less than 1 minus p, then with very high probability you're going to be successful. If it's greater than 1 minus p, no chance, as n becomes large. OK? You with me?

AUDIENCE: [UNINTELLIGIBLE]?

PROFESSOR: Well, in general, they're not, and the first exercise on the homework says take the (8, 4, 4) code. There are certain places where if you erase 4 bits you lose, and there are other places where if you erase 4 bits you win. And that exercise also points out that the low-density parity-check decoding that we're going to do, the graphical decoding, may fail in a case where maximum likelihood decoding might work. But maximum likelihood decoding is certainly the best we can do, so it's clear. You can't signal at a rate greater than 1 minus p. You just don't get more than 1 minus p bits of information per transmitted bit -- n times (1 minus p) bits of information in a code word of length n -- so you can't possibly communicate more than n times (1 minus p) bits in a block.

OK. So what are we going to try to do to signal over this channel? We're going to try using a low-density parity-check code.
Actually, I guess I did want this first. Let me talk about both of these back and forth. Sorry, Mr. TV guy.

So we're going to use a low-density parity-check code, and initially we're going to assume a regular code with, say, left degree 3 over here and right degree 6. And we're going to try to decode by using the iterative sum-product algorithm with a left-right schedule.

OK. I can work either here or up here. What are the rules for the sum-product update on a binary erasure channel? Let's just start out and walk through it a little bit, and then step back and develop some more general rules. What is the message coming in here that we receive from the channel? We're going to convert it into an APP vector. What could the APP vector be? It's either, say, (0, 1) or (1, 0) if the bit is unerased. So if we get this, we know the a posteriori probability of a 0 is 1, and of a 1 is 0. No question, we have certainty. Similarly down here, it's a 1. And in here, it's (1/2, 1/2). Complete uncertainty. No information.
So those are our three possibilities off the channel: (0, 1), (1, 0), and (1/2, 1/2). Now, if we get a certain bit coming in, what are the messages going out on each of these lines here? We actually only need to know this one. Initially, everything inside here is complete ignorance -- (1/2, 1/2) everywhere. You can consider everything to be erased.

All right. Well, if we got a known bit coming in, a 0 or a 1, the repetition node simply says propagate that through all of these lines here. So if you worked out the sum-product update rule here, it would basically say: for this message, if any of these lines is known, then this line is known, and we have a certain bit going out. All right? So if (0, 1) comes in, we'll get (0, 1) out. It's certainly a 1. Only in the case where all of these other incoming messages are erasures do we not know anything, and then the output has to be an erasure. All right? So that's the sum-product update rule at an equals node. All right?
If any of these d minus 1 incoming messages is known, then the output is known. If they're all unknown, then the output is unknown.

You're going to find, in general, these are the only kinds of messages we're ever going to have to deal with. Basically, we're going to take known bits and propagate them through the graph. So initially everything is erased, and after a while we start learning things. More and more things become known, and we succeed if everything becomes known inside the graph. All right? So it's just the propagation of unerased variables through this graph.

AUDIENCE: [UNINTELLIGIBLE]

PROFESSOR: No. They're not only known, but they're correct. And like everything else, you can prove that by induction. The bits that we receive from the channel certainly have to be consistent with the correct code word. All these internal constraints are the constraints of the code, so we can never generate an incorrect message. That's basically the hand-waving proof of that.

OK. So we're going to propagate either known bits or erasures in the first iteration.
And what's the fraction of these lines that's going to be erased in a very long code?

AUDIENCE: [UNINTELLIGIBLE]

PROFESSOR: It's going to be p. All right? So initially, we have a fraction p that are erased and a fraction 1 minus p that are good. OK. And then we'll take this to be a perfectly random interleaver. So perfectly randomly, this comes out there. OK?

All right, so now we have various messages coming in over here. Some are erased, some are known and correct, and those are the only things they can be. All right, what can we do on the right side now? On the right side, we have to execute the sum-product algorithm for a zero-sum node of this type. What is the rule here? Clearly, if we get good data on all these input bits, we know what the output bit is. So if we get five good ones over here, we can tell what the sixth one has to be. However, if any of these is erased, then what's the probability that this is a 0 or a 1? It's 1/2, 1/2. So any erasure here means we get no information out of this node.
We get an erasure coming out.

All right, so we come in here. The rate of this code is 1/2, so I'm going to do a simulation for a p small enough that this code could succeed -- say 0.4, so the capacity is 0.6 bits per bit. But if this is 0.4, what's the probability that any 5 of these are all going to be unerased? It's pretty small. So you won't be surprised to learn that the probability of an erasure coming back -- call that q -- is 0.9 or more. But it's not 1. So for some small fraction of these over here, we're going to get some additional information that we didn't have before. And this is going to propagate randomly back, and it may allow us to now know some of these bits that were initially erased on the channel.

So that's the idea. So to understand the performance of this, we simply track -- let me call this, in general, the erasure probability going from left to right, and this, in general, we'll call the erasure probability going from right to left.
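[On the BEC, the two update rules just described -- an equals node output is known as soon as any input is known; a zero-sum node output is known only when all its other inputs are known -- boil down to a "peeling" decoder: keep finding a check with exactly one erased participant and solve it. A minimal sketch; the tiny (7,4) Hamming matrix is purely illustrative, standing in for a long LDPC parity check matrix:]

```python
def peel_decode(H, y):
    """Iterative erasure decoding: repeatedly find a parity check with
    exactly one erased participant and solve that check for it.  On the
    BEC this matches the sum-product schedule, since every message is
    either 'known' or 'erased'."""
    x = list(y)                      # None marks an erasure
    progress = True
    while progress and any(b is None for b in x):
        progress = False
        for row in H:
            unknown = [j for j, h in enumerate(row) if h and x[j] is None]
            if len(unknown) == 1:    # d - 1 inputs known: solve for the last
                j = unknown[0]
                x[j] = sum(x[k] for k, h in enumerate(row) if h and k != j) % 2
                progress = True
    return x                         # any remaining None is a decoding failure

H = [[1, 1, 1, 0, 1, 0, 0],          # illustrative (7,4) Hamming parity checks
     [1, 1, 0, 1, 0, 1, 0],
     [1, 0, 1, 1, 0, 0, 1]]
peel_decode(H, [1, None, 1, None, 0, 0, 1])  # recovers [1, 0, 1, 1, 0, 0, 1]
```

[As noted above, the decoder can only ever fill in correct values; it fails by getting stuck with erasures remaining, never by producing a wrong bit.]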
And we can actually compute what these probabilities are for each iteration, under the assumption that the code is very long and random, so that every time we make a computation we're dealing with completely fresh and independent information. And that's what we're going to do. Yes?

AUDIENCE: [UNINTELLIGIBLE]

PROFESSOR: When they come from the right side, they're either erased or they're consistent. I argued before, waving my hands, that these messages could never be incorrect. So if you get two known messages, they can't conflict with each other. Is that your concern?

AUDIENCE: Yeah. Because you're randomly connecting [UNINTELLIGIBLE], so it might be that one of the plus signs gave you an [UNINTELLIGIBLE], whereas another plus sign gave you a proper message. And they both run back to the same equation.

PROFESSOR: Well, OK. So this is pseudo-random, but it's chosen once and for all. It determines the code.
I don't re-choose it every time, but when I analyze it, I'll assume that it's random enough that the bits that enter into any one calculation are bits that I've never seen before, and therefore can be taken to be entirely random. But of course, in actual practice, you've got a fixed interleaver here, and you have to, in order to decode the code. But the other concern here is if we actually had the possibility of errors. The pure binary erasure channel never allows errors. If this actually allowed a 0 to go to a 1 or a 1 to go to a 0, then we'd have an altogether different situation over here, and we'd have to honestly compute the sum-product algorithm and what the APP is if we have some probability of error. And then the messages could conflict, and we'd have to weigh the evidence, and take the dominating evidence, or mix it all up into the single parameter that we call the APP.

All right. So let me now do a little analysis. Actually, I've done this in a couple of places. Suppose the probability of erasure here -- this is the q right-to-left parameter.
Suppose the probability q right-to-left is 0.9, or whatever, and this is the original received message from the channel, which had an erasure probability of p. What's the q left-to-right? What's the erasure probability for the outgoing message? Well, the outgoing message is erased only if all of these incoming messages are erased. All right, so this is simply p times q right-to-left to the (d minus 1). OK?

AUDIENCE: [UNINTELLIGIBLE]

PROFESSOR: Assuming it's a long random code, so everything here is independent. I'll say something else about this in just a second. But let's naively make that assumption right now, and then see how best we can justify it. What's the rule over here? Here we're on the right side, if we want to compute the right-to-left message. If these are all erased with probability q left-to-right, what is the probability that this one going out is erased? Well, it's easier to compute the probability of not being erased. This one is not erased only if all of these are not erased. So we get q right-to-left.
429 00:24:38,800 --> 00:24:43,780 One minus q right to left is equal to 1 minus q left to 430 00:24:43,780 --> 00:24:48,070 right, to the d minus 1. 431 00:24:48,070 --> 00:24:51,630 And let's see, this is d right, and this is d left. 432 00:24:54,830 --> 00:24:58,770 I'm doing it for the specific context. 433 00:24:58,770 --> 00:25:05,320 OK, so under the independence assumption, we can compute 434 00:25:05,320 --> 00:25:09,900 exactly what these evolving erasure probabilities are as 435 00:25:09,900 --> 00:25:12,520 we go through this left right iteration. 436 00:25:12,520 --> 00:25:16,660 This is what's so neat about this whole thing. 437 00:25:16,660 --> 00:25:22,430 Now, here's the best argument for why these are all 438 00:25:22,430 --> 00:25:24,440 independent. 439 00:25:24,440 --> 00:25:30,690 Let's look at the messages that enter into, say, a 440 00:25:30,690 --> 00:25:31,950 particular-- 441 00:25:31,950 --> 00:25:35,610 this is computing q left to right down here. 442 00:25:35,610 --> 00:25:40,030 All right, we've got something coming in, one bit here. 443 00:25:40,030 --> 00:25:45,860 We've got more bits coming in up here, and here, which 444 00:25:45,860 --> 00:25:48,800 originally came from bits coming in up here. 445 00:25:48,800 --> 00:25:50,870 We have a tree of computation. 446 00:25:50,870 --> 00:25:54,120 If we went back through this pseudo random but fixed 447 00:25:54,120 --> 00:25:58,900 interleaver, we could actually draw this tree for every 448 00:25:58,900 --> 00:26:04,790 instance of every computation, and this would be q left to 449 00:26:04,790 --> 00:26:07,610 right at the nth iteration, this is-- 450 00:26:07,610 --> 00:26:08,882 I'm sorry. 451 00:26:08,882 --> 00:26:12,470 Yeah, this is q left to right at the nth iteration, this is 452 00:26:12,470 --> 00:26:17,630 q right to left at the n minus first iteration, this is q 453 00:26:17,630 --> 00:26:22,300 left to right at the n minus first iteration, and so forth. 
454 00:26:26,500 --> 00:26:31,190 Now, the argument is that if I go back-- 455 00:26:31,190 --> 00:26:36,050 let's fix the number of iterations I go back here-- 456 00:26:36,050 --> 00:26:40,690 m, let's say, and I want to do an analysis of the first m 457 00:26:40,690 --> 00:26:43,400 iterations. 458 00:26:43,400 --> 00:26:48,380 I claim that as this code becomes long, n goes to 459 00:26:48,380 --> 00:26:54,340 infinity with fixed d lambda, d rho, that the probability 460 00:26:54,340 --> 00:26:58,220 you're ever going to run into a repeated bit or message up 461 00:26:58,220 --> 00:27:00,780 here goes to 0. 462 00:27:00,780 --> 00:27:02,230 All right? 463 00:27:02,230 --> 00:27:04,510 So I fix the number of iterations I'm 464 00:27:04,510 --> 00:27:05,220 going to look at. 465 00:27:05,220 --> 00:27:07,500 I let the length of the code go to infinity. 466 00:27:07,500 --> 00:27:11,340 I let everything be chosen pseudo randomly over here. 467 00:27:11,340 --> 00:27:17,570 Then the probability of seeing the same message or bit twice 468 00:27:17,570 --> 00:27:19,680 in this tree goes to 0. 469 00:27:19,680 --> 00:27:22,740 And therefore, in that limit, the independence assumption 470 00:27:22,740 --> 00:27:23,590 becomes valid. 471 00:27:23,590 --> 00:27:27,390 That is basically the argument, all right? 472 00:27:27,390 --> 00:27:29,840 So I can analyze any fixed number of 473 00:27:29,840 --> 00:27:31,190 iterations in this way. 474 00:27:39,142 --> 00:27:40,392 AUDIENCE: [UNINTELLIGIBLE] 475 00:27:42,640 --> 00:27:43,770 PROFESSOR: OK, yes. 476 00:27:43,770 --> 00:27:45,140 Good. 477 00:27:45,140 --> 00:27:51,010 So this is saying the girth is probabilistically-- 478 00:27:51,010 --> 00:27:59,860 so limit in probability going to infinity, or it's also 479 00:27:59,860 --> 00:28:02,110 referred to as the locally tree-like assumption.
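This locally tree-like claim can be probed with a small experiment. The sketch below is my own illustration, not from the notes: it samples random (3, 6)-regular graphs by shuffling edge sockets (a simple stand-in for the pseudo random interleaver) and measures how often even the depth-1 neighborhood of one variable node, its checks plus their other variables, fails to be a tree. The failure rate falls off as n grows, which is exactly what the argument needs.

```python
import random

def depth1_repeat_rate(n_vars, dl=3, dr=6, trials=100, seed=1):
    """Fraction of sampled random (dl, dr)-regular bipartite graphs in
    which the depth-1 neighborhood of variable node 0 already contains
    a repeated node, i.e. fails to be locally tree-like.
    Assumes n_vars * dl is divisible by dr."""
    rng = random.Random(seed)
    n_checks = n_vars * dl // dr
    hits = 0
    for _ in range(trials):
        # random interleaver: shuffle the check-side edge sockets;
        # variable v then owns sockets[v*dl : (v+1)*dl]
        sockets = [c for c in range(n_checks) for _ in range(dr)]
        rng.shuffle(sockets)
        chk_vars = [[] for _ in range(n_checks)]
        for pos, c in enumerate(sockets):
            chk_vars[c].append(pos // dl)  # variable owning this socket
        my_checks = sockets[:dl]           # checks touching variable 0
        neighbors = [v for c in my_checks for v in chk_vars[c] if v != 0]
        # a repeated check or a repeated neighbor variable means a cycle
        if len(set(my_checks)) < dl or len(set(neighbors)) < len(neighbors):
            hits += 1
    return hits / trials
```

With a short block length the neighborhood almost always has a repeat; with a long one it almost never does, which is the n-goes-to-infinity limit being invoked.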
480 00:28:05,330 --> 00:28:09,260 OK, the graph in the neighborhood of any node-- 481 00:28:09,260 --> 00:28:12,380 this is kind of a map of the neighborhood back for a 482 00:28:12,380 --> 00:28:13,630 distance of m-- 483 00:28:16,730 --> 00:28:18,640 we're not ever going to run into any cycles. 484 00:28:23,530 --> 00:28:26,420 Good, thank you. 485 00:28:26,420 --> 00:28:29,270 OK, so under that assumption, now we 486 00:28:29,270 --> 00:28:30,760 can do an exact analysis. 487 00:28:30,760 --> 00:28:32,010 This is what's amazing. 488 00:28:35,760 --> 00:28:36,760 And how do we do it? 489 00:28:36,760 --> 00:28:39,180 Here's a good way of doing it. 490 00:28:39,180 --> 00:28:42,120 We just draw the curves of these 2 equations, and we go 491 00:28:42,120 --> 00:28:43,480 back and forth between them. 492 00:28:46,180 --> 00:28:49,980 And this was actually a technique invented earlier for 493 00:28:49,980 --> 00:28:53,130 turbo codes, but it works very nicely for low density parity 494 00:28:53,130 --> 00:28:54,890 check code analysis. 495 00:28:54,890 --> 00:28:57,280 It's called the exit chart. 496 00:28:57,280 --> 00:29:01,570 I've drawn it in a somewhat peculiar way, but it's so that 497 00:29:01,570 --> 00:29:04,360 it will look like the exit charts you might see in the 498 00:29:04,360 --> 00:29:06,318 literature. 499 00:29:06,318 --> 00:29:10,100 So I'm just drawing q right to left on this axis, and q left 500 00:29:10,100 --> 00:29:11,870 to right on this axis. 501 00:29:11,870 --> 00:29:15,250 I want to sort of start in the lower left and work my way up 502 00:29:15,250 --> 00:29:18,030 to the upper right, which is the way exit 503 00:29:18,030 --> 00:29:19,160 charts always work. 504 00:29:19,160 --> 00:29:23,610 So to do that, I basically invert the axis and take it 505 00:29:23,610 --> 00:29:25,410 from 1 down to 0.
506 00:29:25,410 --> 00:29:27,640 Initially, both of these-- 507 00:29:27,640 --> 00:29:30,350 the probability is one that everything is erased 508 00:29:30,350 --> 00:29:34,830 internally on every edge, and if things work out, we'll get 509 00:29:34,830 --> 00:29:38,390 up to the point where nothing is erased with high 510 00:29:38,390 --> 00:29:39,640 probability. 511 00:29:41,500 --> 00:29:46,290 OK, these are our 2 equations just copied from over there 512 00:29:46,290 --> 00:29:49,810 for the specific case of left degree equals 3 and right 513 00:29:49,810 --> 00:29:52,900 degree equals 6. 514 00:29:52,900 --> 00:29:56,650 And so I just plot the curves of these 2 equations. 515 00:29:56,650 --> 00:30:02,090 This is done in the notes, and the important thing is that 516 00:30:02,090 --> 00:30:08,660 the curves don't cross, for a value of p equal to 0.4. 517 00:30:08,660 --> 00:30:13,030 One of these curves depends on p, the other one doesn't. 518 00:30:13,030 --> 00:30:16,390 So this is just a simple little quadratic curve here, 519 00:30:16,390 --> 00:30:19,150 and this is a fifth order curve, and they look 520 00:30:19,150 --> 00:30:22,000 something like this. 521 00:30:22,000 --> 00:30:22,870 What does this mean? 522 00:30:22,870 --> 00:30:25,400 Initially, the q right to left is 1. 523 00:30:25,400 --> 00:30:30,800 If I go through one iteration, using the fact that I get this 524 00:30:30,800 --> 00:30:34,010 external information-- 525 00:30:34,010 --> 00:30:35,610 extrinsic information-- 526 00:30:35,610 --> 00:30:39,570 then q left to right becomes 0.4, so we go over to the other curve. 527 00:30:39,570 --> 00:30:45,130 Now, I have q left to right propagating to the right side, 528 00:30:45,130 --> 00:30:50,290 and at this point, I get something like 0.922, I think 529 00:30:50,290 --> 00:30:52,700 is the first one. 530 00:30:52,700 --> 00:30:55,280 So the q right to left has gone from 531 00:30:55,280 --> 00:30:58,210 1 down to 0.9 something.
532 00:30:58,210 --> 00:31:00,260 OK, but that's better. 533 00:31:00,260 --> 00:31:05,050 Now, with that value of q, of course I get a much more 534 00:31:05,050 --> 00:31:07,510 favorable situation on the left. 535 00:31:07,510 --> 00:31:10,730 I go over to the left side, and now I get 536 00:31:10,730 --> 00:31:14,240 some p equal to-- 537 00:31:14,240 --> 00:31:17,670 this is all done in the notes-- 538 00:31:17,670 --> 00:31:20,880 0.34. 539 00:31:20,880 --> 00:31:23,790 So I've reduced my erasure probability going from left to 540 00:31:23,790 --> 00:31:33,000 right, which in turn, helps me out as I go over here, 0.875, 541 00:31:33,000 --> 00:31:35,490 and so forth. 542 00:31:35,490 --> 00:31:36,050 Are you with me? 543 00:31:36,050 --> 00:31:38,730 Does everyone see what I'm doing? 544 00:31:38,730 --> 00:31:41,050 Any questions? 545 00:31:41,050 --> 00:31:44,250 Again, I'm claiming this is an exact calculation-- 546 00:31:44,250 --> 00:31:46,050 or I would call it a simulation-- 547 00:31:46,050 --> 00:31:48,830 of what the algorithm does in each iteration. 548 00:31:48,830 --> 00:31:52,810 First iteration, first full, left, right, right left, you 549 00:31:52,810 --> 00:31:53,490 get to here. 550 00:31:53,490 --> 00:31:56,840 Second one, you get to here, and so forth. 551 00:31:56,840 --> 00:32:00,640 And I claim as n goes to infinity, and everything is 552 00:32:00,640 --> 00:32:05,530 random, this is the way the erasure 553 00:32:05,530 --> 00:32:06,800 probabilities will evolve. 554 00:32:09,440 --> 00:32:17,320 And it's clear visually that if the curves don't cross, we 555 00:32:17,320 --> 00:32:21,060 get to the upper right corner, which means decoding succeeds. 556 00:32:21,060 --> 00:32:26,640 There are no erasures anywhere at the end of the day. 
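Under the independence assumption, the whole back-and-forth is a two-line recursion. Here is a small sketch (the function name and iteration count are my own choices) that reproduces the numbers just traced for the (3, 6) code at p = 0.4:

```python
def trace_de(p, dl=3, dr=6, iters=4):
    """One EXIT-chart step: variable-node update, then check-node update.
    Returns the (q_left_to_right, q_right_to_left) pair per iteration."""
    q_rl, history = 1.0, []
    for _ in range(iters):
        q_lr = p * q_rl ** (dl - 1)        # erased only if all d-1 inputs erased
        q_rl = 1 - (1 - q_lr) ** (dr - 1)  # unerased only if all d-1 inputs unerased
        history.append((round(q_lr, 3), round(q_rl, 3)))
    return history

# trace_de(0.4) begins (0.4, 0.922), (0.34, 0.875), ... and heads to (0, 0)
```

The first two entries match the 0.4, 0.922, 0.34, 0.875 trajectory on the chart, and the sequence keeps shrinking toward the corner where nothing is erased.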
557 00:32:26,640 --> 00:32:29,790 And furthermore, you go and you take a very long code, 558 00:32:29,790 --> 00:32:33,050 like 10 to the seventh bits, and you simulate it on this 559 00:32:33,050 --> 00:32:36,120 channel, and it will behave exactly like this. 560 00:32:36,120 --> 00:32:40,100 OK, so this is really a good piece of analysis. 561 00:32:40,100 --> 00:32:43,810 So this reduces it to very simple terms. 562 00:32:43,810 --> 00:32:48,760 We have 2 equations, and of course they meet here at the 563 00:32:48,760 --> 00:32:50,700 (0,0) point. 564 00:32:50,700 --> 00:32:52,970 Substitute 0 in here, you get 0 there. 565 00:32:52,970 --> 00:32:55,690 Substitute 0 here, you get 0 there. 566 00:32:55,690 --> 00:32:59,490 But if they don't meet anywhere else, if there's no 567 00:32:59,490 --> 00:33:05,860 fixed point to this iterative convergence, then decoding is 568 00:33:05,860 --> 00:33:07,680 going to succeed. 569 00:33:07,680 --> 00:33:10,920 So this is the whole question: can we design 2 curves that 570 00:33:10,920 --> 00:33:12,170 don't cross? 571 00:33:22,020 --> 00:33:23,270 OK. 572 00:33:24,980 --> 00:33:29,330 So what do we expect now to happen? 573 00:33:29,330 --> 00:33:32,670 Suppose we increase p. 574 00:33:32,670 --> 00:33:37,540 Suppose we increase p to 0.45, which is another case that's 575 00:33:37,540 --> 00:33:41,580 considered in the notes, what's going to happen? 576 00:33:41,580 --> 00:33:43,990 This curve is just a simple quadratic, it's going to be 577 00:33:43,990 --> 00:33:45,720 dragged down a little bit. 578 00:33:45,720 --> 00:33:52,570 We're going to get some different curve, which is just 579 00:33:52,570 --> 00:33:57,220 this curve scaled by 0.45 over 0.4. 580 00:33:57,220 --> 00:33:59,790 It's going to start here, and it's going to 581 00:33:59,790 --> 00:34:00,980 be this scaled curve. 582 00:34:00,980 --> 00:34:03,895 And unfortunately, those 2 curves cross.
583 00:34:07,550 --> 00:34:12,510 So that's the way it's going to look, and now, again, we 584 00:34:12,510 --> 00:34:18,429 can simulate iterative decoding for this case. 585 00:34:18,429 --> 00:34:20,290 Again, initially, we'll start out. 586 00:34:20,290 --> 00:34:24,760 We'll go from 1, 0.45 will be our right going erasure 587 00:34:24,760 --> 00:34:25,350 probability. 588 00:34:25,350 --> 00:34:28,980 We'll go over here, make some progress, but 589 00:34:28,980 --> 00:34:30,690 what's going to happen? 590 00:34:30,690 --> 00:34:32,395 We're going to get stuck right there. 591 00:34:36,260 --> 00:34:37,639 So we find the fixed point. 592 00:34:37,639 --> 00:34:40,480 In fact, this simulation is a very efficient way of 593 00:34:40,480 --> 00:34:45,170 calculating what the fixed points of these 2 curves are. 594 00:34:45,170 --> 00:34:47,770 Probably some of you are analytical whizzes and can do 595 00:34:47,770 --> 00:34:50,350 it analytically, but it's not that easy 596 00:34:50,350 --> 00:34:51,600 for a quintic equation. 597 00:34:55,699 --> 00:35:00,430 In any case, as far as decoding is concerned-- 598 00:35:00,430 --> 00:35:03,390 all right, this code doesn't work on an erasure channel 599 00:35:03,390 --> 00:35:05,940 which has an erasure probability of 0.45. 600 00:35:05,940 --> 00:35:10,120 It does work on one that has an erasure probability of 0.4. 601 00:35:10,120 --> 00:35:16,770 That should suggest to you-- yeah? 602 00:35:16,770 --> 00:35:18,020 AUDIENCE: [UNINTELLIGIBLE] 603 00:35:20,520 --> 00:35:23,520 PROFESSOR: Yes, so this code doesn't get to capacity. 604 00:35:23,520 --> 00:35:24,770 Too bad. 605 00:35:27,690 --> 00:35:33,540 So I'm not claiming that a regular d left equals 3, d 606 00:35:33,540 --> 00:35:37,010 right equals 6 LDPC code can achieve capacity. 607 00:35:40,030 --> 00:35:44,030 There's some threshold for p, below which it'll work, and 608 00:35:44,030 --> 00:35:45,830 above which it won't work.
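That threshold can be located numerically: wrap the same recursion in a bisection on p, asking at each step whether density evolution drives the erasure probability to zero. A sketch (the iteration cap and tolerance here are arbitrary choices of mine):

```python
def converges(p, dl=3, dr=6, iters=5000, tol=1e-9):
    """Does BEC density evolution for a regular (dl, dr) code drive the
    left-to-right erasure probability to zero at channel erasure rate p?"""
    q = p  # first variable-node output: p * 1**(dl-1)
    for _ in range(iters):
        q = p * (1 - (1 - q) ** (dr - 1)) ** (dl - 1)
        if q < tol:
            return True
    return False

lo, hi = 0.0, 1.0
for _ in range(30):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if converges(mid) else (lo, mid)
# lo now sits around 0.429 for the (3, 6) code
```

Near the threshold the recursion needs many iterations to squeeze through the tunnel between the curves, which is why the iteration cap has to be generous.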
609 00:35:45,830 --> 00:35:50,460 That threshold is somewhere between 0.4 and 0.45. 610 00:35:50,460 --> 00:35:53,510 In fact, it's 0.429 something or other. 611 00:35:53,510 --> 00:36:00,470 So this design approach will get us near capacity, 612 00:36:00,470 --> 00:36:02,790 but I certainly don't claim this is a 613 00:36:02,790 --> 00:36:04,350 capacity approaching code. 614 00:36:13,360 --> 00:36:17,920 I might mention now something called the area theorem, 615 00:36:17,920 --> 00:36:20,310 because it's easy to do now and it will be 616 00:36:20,310 --> 00:36:22,730 harder to do later. 617 00:36:22,730 --> 00:36:24,140 What is this area here? 618 00:36:28,340 --> 00:36:31,240 I'm saying the area above this curve here. 619 00:36:35,344 --> 00:36:38,740 Well, you can do that simply by integrating this. 620 00:36:38,740 --> 00:36:46,870 It's integral of p times q-squared dq from 0 to 1, and 621 00:36:46,870 --> 00:36:49,840 it turns out to be p over 3. 622 00:36:49,840 --> 00:36:51,090 Believe me? 623 00:36:54,360 --> 00:37:00,130 Which happens to be p over the left degree. 624 00:37:00,130 --> 00:37:06,230 Not fortuitously, because this is the left degree minus 1. 625 00:37:06,230 --> 00:37:08,800 So you're always going to get p over the left degree. 626 00:37:11,780 --> 00:37:14,900 And what's the area under here? 627 00:37:14,900 --> 00:37:19,070 Well, I can compute-- 628 00:37:19,070 --> 00:37:22,340 basically change variables to 1 minus q, q prime, and 1 629 00:37:22,340 --> 00:37:26,590 minus q is q prime over here, and so I'll get the same kind 630 00:37:26,590 --> 00:37:31,590 of calculation, 0 to 1, this time q prime to the fifth dq prime, 631 00:37:31,590 --> 00:37:36,020 which is 1/6, which not 632 00:37:36,020 --> 00:37:39,370 fortuitously is 1 over d right.
633 00:37:39,370 --> 00:37:46,920 So the area here is p over 3, and the area here is-- 634 00:37:46,920 --> 00:37:49,460 under this side of the curve is-- 635 00:37:49,460 --> 00:37:51,340 that must be 5/6. 636 00:37:51,340 --> 00:37:55,090 Sorry, so the area under this side is 1/6 so 637 00:37:55,090 --> 00:37:56,400 it's 1 minus this. 638 00:38:07,490 --> 00:38:10,560 It's clearly the big part, so this is 5/6. 639 00:38:15,560 --> 00:38:15,980 All right. 640 00:38:15,980 --> 00:38:19,230 I've claimed my criterion for successful decoding is that 641 00:38:19,230 --> 00:38:21,980 these curves not cross. 642 00:38:21,980 --> 00:38:31,010 All right, so for successful decoding, clearly the sum of 643 00:38:31,010 --> 00:38:35,940 these 2 areas has to be less than 1, right? 644 00:38:35,940 --> 00:38:50,210 So successful decoding: a necessary condition is that p 645 00:38:50,210 --> 00:38:52,500 over d_lambda -- 646 00:38:52,500 --> 00:38:56,510 let me just extend this to any regular code-- 647 00:38:56,510 --> 00:39:03,050 plus 1 minus 1 over d_rho has to be less than 1. 648 00:39:09,490 --> 00:39:13,320 OK, what does this sum out to? 649 00:39:13,320 --> 00:39:27,530 This says that p has to be less than d_lambda over d_rho, 650 00:39:27,530 --> 00:39:30,290 which happens to be 1 minus r, right? 651 00:39:32,960 --> 00:39:38,835 Or equivalently, r less than 1 minus p, which is capacity. 652 00:39:42,200 --> 00:39:44,150 So what did I just prove very quickly? 653 00:39:44,150 --> 00:39:48,800 I proved that for a regular low density parity check code, 654 00:39:48,800 --> 00:39:54,370 just considering the areas under these 2 curves and the 655 00:39:54,370 --> 00:39:59,490 requirement that the 2 curves must not cross, I find that 656 00:39:59,490 --> 00:40:04,200 regular codes can't possibly work for a rate any greater 657 00:40:04,200 --> 00:40:06,520 than 1 minus p, which is capacity.
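The two areas are easy to sanity-check numerically. This little sketch (a midpoint Riemann sum, my own check, not from the notes) redoes the integrals for the (3, 6) chart at p = 0.4 and confirms that the no-crossing condition p/3 + 5/6 < 1 is just p < 1/2, that is, rate below capacity:

```python
# midpoint Riemann sums for the two EXIT-chart areas, (3, 6) code, p = 0.4
p, n = 0.4, 100000
mids = [(i + 0.5) / n for i in range(n)]
area_var = sum(p * q ** 2 for q in mids) / n  # the p/3 area from the lecture
area_chk = 1 - sum(q ** 5 for q in mids) / n  # the 1 - 1/6 = 5/6 area
assert abs(area_var - p / 3) < 1e-6
assert abs(area_chk - 5 / 6) < 1e-6
# necessary condition for the curves not to cross:
# p/3 + 5/6 < 1, i.e. p < 1/2 = 1 - rate
assert area_var + area_chk < 1
```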
658 00:40:06,520 --> 00:40:10,160 In fact, the rate has to be less than 1 minus p, strictly 659 00:40:10,160 --> 00:40:13,020 less, in order for there to-- 660 00:40:13,020 --> 00:40:16,870 unless we were lucky enough just to get 2 curves that were 661 00:40:16,870 --> 00:40:18,620 right on top of each other. 662 00:40:18,620 --> 00:40:19,840 I don't know whether that would work or not. 663 00:40:19,840 --> 00:40:21,330 I guess it doesn't work. 664 00:40:21,330 --> 00:40:23,735 But we'd need them to be just a scooch apart. 665 00:40:27,010 --> 00:40:30,170 OK, so I can make an inequality sign here. 666 00:40:30,170 --> 00:40:33,920 OK, well that's rather gratifying. 667 00:40:38,160 --> 00:40:43,510 What do we do to improve the situation? 668 00:40:43,510 --> 00:40:45,560 OK, one-- 669 00:40:45,560 --> 00:40:47,570 it's probably the first thing you would think of 670 00:40:47,570 --> 00:40:50,830 investigating maybe at this point, why don't we look at an 671 00:40:50,830 --> 00:40:53,640 irregular LDPC code? 672 00:41:07,360 --> 00:41:11,320 And I'm going to characterize such a code by-- 673 00:41:11,320 --> 00:41:18,650 there's going to be some distribution on the left side, 674 00:41:18,650 --> 00:41:22,580 which I might write by lambda_d. 675 00:41:22,580 --> 00:41:26,920 This is going to be the fraction of left 676 00:41:26,920 --> 00:41:33,910 nodes of degree d. 677 00:41:33,910 --> 00:41:36,290 All right, I'll simply let that be some distribution. 678 00:41:36,290 --> 00:41:39,250 Some might have degree 2, some might have degree 3. 679 00:41:39,250 --> 00:41:44,270 Some might have degree 500. 680 00:41:44,270 --> 00:41:51,000 And similarly, rho_d is the fraction of 681 00:41:51,000 --> 00:41:54,170 right nodes, et cetera. 682 00:42:01,500 --> 00:42:06,430 And there's some average degree here, and some average 683 00:42:06,430 --> 00:42:09,250 degree here. 684 00:42:09,250 --> 00:42:12,950 So this is the average degree, or the typical degree. 
685 00:42:16,500 --> 00:42:20,500 This is average left degree, this is average right degree. 686 00:42:23,990 --> 00:42:26,470 If I do that, then the calculations 687 00:42:26,470 --> 00:42:29,640 are done in the notes. 688 00:42:29,640 --> 00:42:32,070 I won't take the time to do them here, but basically you 689 00:42:32,070 --> 00:42:38,190 find the rate of the code is 1 minus the average left degree 690 00:42:38,190 --> 00:42:39,760 over the average right degree. 691 00:42:42,860 --> 00:42:45,180 OK, so it reduces to the previous case 692 00:42:45,180 --> 00:42:47,820 and the regular case. 693 00:42:47,820 --> 00:42:51,440 Regular case, this is 1 for one particular degree and 0 694 00:42:51,440 --> 00:42:52,690 for everything else. 695 00:42:56,050 --> 00:42:59,160 It works out. 696 00:42:59,160 --> 00:43:02,910 If I do that and go through exactly the same analysis with 697 00:43:02,910 --> 00:43:06,690 my computation tree, now I simply have a distribution of 698 00:43:06,690 --> 00:43:11,710 degrees at each level of the computation tree, and you will 699 00:43:11,710 --> 00:43:18,040 not be surprised to hear what I get out as my left to right 700 00:43:18,040 --> 00:43:22,315 equations, is I get out some average of this. 701 00:43:25,350 --> 00:43:39,090 In fact, what I get out now is that q left to right is the 702 00:43:39,090 --> 00:43:43,520 sum over d of-- 703 00:43:43,520 --> 00:43:55,590 this is going to be lambda_d times p times q right to left 704 00:43:55,590 --> 00:43:58,810 to the d minus 1. 705 00:43:58,810 --> 00:44:02,140 Which again reduces to the previous thing, if only one of 706 00:44:02,140 --> 00:44:06,250 these is 1 and the rest are 0. 707 00:44:06,250 --> 00:44:07,500 So I just get the-- 708 00:44:09,920 --> 00:44:11,570 this is just an expectation. 709 00:44:11,570 --> 00:44:13,830 This is the fraction of erasures. 
710 00:44:13,830 --> 00:44:17,860 I just count the number of times I go through a node of 711 00:44:17,860 --> 00:44:20,990 degree d, and for that fraction of time, I'm going to 712 00:44:20,990 --> 00:44:25,340 get this relationship, and so I just average over them. 713 00:44:25,340 --> 00:44:26,450 That's very quick. 714 00:44:26,450 --> 00:44:29,733 Look at the notes for a detailed derivation, but I 715 00:44:29,733 --> 00:44:32,840 hope it's intuitively plausible. 716 00:44:32,840 --> 00:44:39,812 And similarly, 1 minus q right to left is the sum over d 717 00:44:39,812 --> 00:44:48,365 of rho_d, 1 minus q left to right to the d minus 1. 718 00:44:52,510 --> 00:44:55,970 OK, this is elegantly done if we 719 00:44:55,970 --> 00:44:59,710 define generating functions. 720 00:44:59,710 --> 00:45:03,490 We do that over here. 721 00:45:03,490 --> 00:45:05,600 I've lost it now so I'll do it over here. 722 00:45:08,500 --> 00:45:11,230 So what you'll see in the literature is generating 723 00:45:11,230 --> 00:45:15,920 functions defined as lambda of x equals sum over d of lambda_d 724 00:45:15,920 --> 00:45:18,740 x to the d minus 1. 725 00:45:18,740 --> 00:45:25,650 And rho of x equals sum over d, rho_d, x to the d minus 1. 726 00:45:25,650 --> 00:45:28,360 And then these equations are simply written as-- 727 00:45:28,360 --> 00:45:35,410 this is p times lambda of q right to left, and this is 728 00:45:35,410 --> 00:45:42,710 equal to rho of 1 minus q left to right. 729 00:45:46,870 --> 00:45:49,630 OK, so we get nice, elegant generating function 730 00:45:49,630 --> 00:45:50,880 representations. 731 00:45:52,840 --> 00:45:56,350 But from the point of view of the curves, we're basically 732 00:45:56,350 --> 00:45:58,110 just going to average these curves. 733 00:45:58,110 --> 00:46:02,520 So we now replace these equations up here by the 734 00:46:02,520 --> 00:46:03,770 average equations.
735 00:46:11,940 --> 00:46:18,100 This becomes p times lambda of q right to left, and this 736 00:46:18,100 --> 00:46:25,950 becomes rho of 1 minus q left to right. 737 00:46:25,950 --> 00:46:30,760 OK, but again, I'm going to reduce all of this to 2 curves, 738 00:46:30,760 --> 00:46:34,770 which again I can use for a simulation. 739 00:46:34,770 --> 00:46:38,900 And now I have lots of degrees of freedom. 740 00:46:38,900 --> 00:46:41,630 I could change all these lambdas and all these rhos, 741 00:46:41,630 --> 00:46:45,180 and I can explore the space, and that's what Sae-Young 742 00:46:45,180 --> 00:46:48,530 Chung did in his thesis, not so much for this channel. 743 00:46:48,530 --> 00:46:53,260 He did do it for this channel, but also for additive white 744 00:46:53,260 --> 00:46:54,610 Gaussian noise channels. 745 00:46:54,610 --> 00:47:00,980 And so the idea is you try to make these 2 curves just as 746 00:47:00,980 --> 00:47:02,910 close together as you can. 747 00:47:08,880 --> 00:47:09,820 Something like that. 748 00:47:09,820 --> 00:47:11,730 Or, of course, you can do other tricks. 749 00:47:11,730 --> 00:47:15,690 You can have some of these-- 750 00:47:15,690 --> 00:47:17,300 you can have some bits over here that go 751 00:47:17,300 --> 00:47:18,250 to the outside world. 752 00:47:18,250 --> 00:47:20,520 You can suppress some of these bits here. 753 00:47:20,520 --> 00:47:23,800 You can play around with the graph. 754 00:47:23,800 --> 00:47:25,580 No limit on invention. 755 00:47:25,580 --> 00:47:27,490 But you don't really have to do any of that. 756 00:47:30,430 --> 00:47:37,790 So it becomes a curve fitting exercise, and you can imagine 757 00:47:37,790 --> 00:47:39,630 doing this in your thesis, except you were 758 00:47:39,630 --> 00:47:41,095 not born soon enough.
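With the generating functions in hand, each side of the iteration is one polynomial evaluation. A sketch follows; the degree distribution in it is a made-up example of mine, not an optimized design, and, following the usual convention in the literature (where lambda_d and rho_d are fractions of edges attached to degree-d nodes, the convention under which these update equations hold exactly), the coefficients are edge fractions:

```python
def de_irregular(p, lam, rho, iters=2000):
    """Iterate q_lr = p * lambda(q_rl) and 1 - q_rl = rho(1 - q_lr).
    lam[i] / rho[i] is the coefficient of x**i in lambda(x) / rho(x)."""
    poly = lambda coeffs, x: sum(c * x ** i for i, c in enumerate(coeffs))
    q_rl = 1.0
    for _ in range(iters):
        q_lr = p * poly(lam, q_rl)
        q_rl = 1 - poly(rho, 1 - q_lr)
    return q_lr  # residual erasure probability after `iters` rounds

lam = [0, 0.5, 0.5]       # lambda(x) = 0.5 x + 0.5 x^2  (degrees 2 and 3)
rho = [0, 0, 0, 0, 0, 1]  # rho(x)    = x^5              (all checks degree 6)
```

This particular mixture converges at p = 0.3 but sticks at a nonzero fixed point at p = 0.42, as it must: its rate works out to 0.6, which exceeds the capacity 1 - 0.42 = 0.58 of that channel.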
759 00:47:44,950 --> 00:47:48,090 The interesting point here is that this now becomes-- 760 00:47:48,090 --> 00:47:51,470 the area becomes p over d_lambda-bar, again, 761 00:47:51,470 --> 00:47:54,320 proof in the notes. 762 00:47:54,320 --> 00:48:00,790 This becomes 1 minus 1 over d_rho-bar. 763 00:48:04,820 --> 00:48:06,720 And so again, the area theorem-- 764 00:48:12,540 --> 00:48:15,880 in order for these curves not to cross, we've got to have p 765 00:48:15,880 --> 00:48:25,860 over d_lambda-bar plus 1 minus 1 over d_rho-bar, less than 766 00:48:25,860 --> 00:48:30,230 the area of the whole exit chart, which is 1. 767 00:48:30,230 --> 00:48:33,870 We again find that-- 768 00:48:33,870 --> 00:48:39,870 let me put it this way, 1 minus d_lambda-bar over 769 00:48:39,870 --> 00:48:46,450 d_rho-bar is less than 1 minus p, which is equivalent to the 770 00:48:46,450 --> 00:48:50,660 rate must be less than the capacity of the channel. 771 00:48:50,660 --> 00:48:52,855 So this is a very nice, elegant result. 772 00:48:52,855 --> 00:48:56,530 The area theorem says that no matter how you play with these 773 00:48:56,530 --> 00:48:59,430 degree distributions in an irregular low-density parity 774 00:48:59,430 --> 00:49:04,790 check code, you of course can never get above capacity. 775 00:49:04,790 --> 00:49:10,010 But, it certainly suggests that you might be able to play 776 00:49:10,010 --> 00:49:13,060 around with these curves such that they get as close as you 777 00:49:13,060 --> 00:49:14,170 might like. 778 00:49:14,170 --> 00:49:17,270 And the converse of this is that if you can make these 779 00:49:17,270 --> 00:49:20,890 arbitrarily close to each other, then you can achieve 780 00:49:20,890 --> 00:49:22,830 rates arbitrarily close to capacity. 781 00:49:26,630 --> 00:49:29,850 And that, in fact, is true. 
782 00:49:29,850 --> 00:49:33,060 So simply by going to irregular low-density parity 783 00:49:33,060 --> 00:49:36,810 check codes, we can get as close as we like, arbitrarily 784 00:49:36,810 --> 00:49:41,610 close, to the capacity of the binary erasure channel with 785 00:49:41,610 --> 00:49:44,276 this kind of iterative decoding. 786 00:49:44,276 --> 00:49:46,320 And you can see the kind of trade you're 787 00:49:46,320 --> 00:49:47,010 going to have to make. 788 00:49:47,010 --> 00:49:51,160 Obviously, you're going to have more iterations as these 789 00:49:51,160 --> 00:49:52,670 get very close. 790 00:49:52,670 --> 00:49:55,450 What is the decoding process going to look like? 791 00:49:55,450 --> 00:49:59,960 It's going to look like very fine grained steps here, lots 792 00:49:59,960 --> 00:50:02,650 of iterations, but-- 793 00:50:02,650 --> 00:50:03,120 all right. 794 00:50:03,120 --> 00:50:04,360 So it's 100 iterations. 795 00:50:04,360 --> 00:50:07,880 So it's 200 iterations. 796 00:50:07,880 --> 00:50:10,100 These are not crazy numbers. 797 00:50:10,100 --> 00:50:12,570 These are quite feasible numbers. 798 00:50:12,570 --> 00:50:16,450 And so if you're willing to do a lot of computation-- 799 00:50:16,450 --> 00:50:18,160 which is what you expect, as you get close 800 00:50:18,160 --> 00:50:19,480 to capacity, right-- 801 00:50:19,480 --> 00:50:23,520 you can get as close to capacity as you like, at least 802 00:50:23,520 --> 00:50:26,270 on this channel. 803 00:50:26,270 --> 00:50:30,010 OK, isn't that great? 804 00:50:30,010 --> 00:50:35,010 It's an easy channel, I grant you, but everything here is 805 00:50:35,010 --> 00:50:38,000 pretty simple. 806 00:50:38,000 --> 00:50:40,890 All these sum product updates-- 807 00:50:40,890 --> 00:50:45,750 for here, it's just a matter of basically 808 00:50:45,750 --> 00:50:46,900 propagating erasures. 809 00:50:46,900 --> 00:50:49,870 You just take the known variables.
810 00:50:49,870 --> 00:50:53,130 You keep computing as many as you can of them. 811 00:50:53,130 --> 00:50:57,110 Basically, every time an edge becomes known, you only have 812 00:50:57,110 --> 00:50:59,910 to visit each edge once, actually. 813 00:50:59,910 --> 00:51:02,220 The first time it becomes known is the only time you 814 00:51:02,220 --> 00:51:02,850 have to visit it. 815 00:51:02,850 --> 00:51:05,490 After that, you can just leave it fixed. 816 00:51:05,490 --> 00:51:13,130 All right, so if this has a linear number of edges, as it 817 00:51:13,130 --> 00:51:15,940 does, by construction, for either the regular or 818 00:51:15,940 --> 00:51:18,580 irregular case, the complexity is now going 819 00:51:18,580 --> 00:51:20,690 to be linear, right? 820 00:51:20,690 --> 00:51:22,190 We only have to visit each edge once. 821 00:51:22,190 --> 00:51:27,640 There are only a number of edges proportional to n. 822 00:51:27,640 --> 00:51:29,800 So the complexity of this whole decoding algorithm-- all 823 00:51:29,800 --> 00:51:33,490 you do is, you fix as many edges as you can, then you go 824 00:51:33,490 --> 00:51:36,670 over here and you try to fix as many more edges as you can. 825 00:51:36,670 --> 00:51:39,520 You come back here, try to fix as many more as you can. 826 00:51:39,520 --> 00:51:42,700 It will behave exactly as this simulation shows it will 827 00:51:42,700 --> 00:51:49,220 behave, and after going back and forth maybe 100 times-- 828 00:51:49,220 --> 00:51:53,470 in more reasonable cases, it's only 10 or 20 times, it's a 829 00:51:53,470 --> 00:51:56,970 very finite number of times-- 830 00:51:56,970 --> 00:51:59,700 you'll be done. 
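That fill-in-what-you-can procedure is the peeling decoder. Here is a toy sketch; the little Hamming-style code is my own example, and for clarity it uses a pass-until-no-progress loop rather than the per-check unknown counts and work queue a genuinely linear-time implementation would keep:

```python
def peel(checks, known):
    """BEC peeling: whenever a parity check has exactly one erased
    variable, solve for it; repeat until no check makes progress."""
    progress = True
    while progress:
        progress = False
        for members in checks:
            unknown = [v for v in members if v not in known]
            if len(unknown) == 1:  # this check determines one more bit
                known[unknown[0]] = sum(known[v] for v in members
                                        if v != unknown[0]) % 2
                progress = True
    return known

# (7,4) Hamming-style checks; codeword [1,0,1,1,0,0,1] with bits 1, 3 erased
checks = [[0, 1, 2, 4], [0, 1, 3, 5], [0, 2, 3, 6]]
received = {0: 1, 2: 1, 4: 0, 5: 0, 6: 1}
# peel(checks, received) recovers bit 1 = 0 and bit 3 = 1
```

Each erased bit gets filled in the first time some check sees it as its only unknown, which is exactly the visit-each-edge-once behavior described above.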
831 00:51:59,700 --> 00:52:05,390 Another qualitative aspect of this that you already see in 832 00:52:05,390 --> 00:52:08,150 the regular code case-- 833 00:52:08,150 --> 00:52:10,760 in fact, you see it very nicely there-- is that 834 00:52:10,760 --> 00:52:16,450 typically, very typically, you have an initial period here 835 00:52:16,450 --> 00:52:18,680 where you make rapid progress because the curves 836 00:52:18,680 --> 00:52:22,740 are pretty far apart, then you have some narrow little tunnel 837 00:52:22,740 --> 00:52:25,540 that you have to get through, and then the 838 00:52:25,540 --> 00:52:27,010 curves widen up again. 839 00:52:27,010 --> 00:52:28,480 I've exaggerated it here. 840 00:52:32,860 --> 00:52:35,730 So OK, you're making great progress, you're filling in, 841 00:52:35,730 --> 00:52:39,400 lots of edges become known, and then for a while it seems 842 00:52:39,400 --> 00:52:43,020 like you're making no progress at all, making very tiny 843 00:52:43,020 --> 00:52:46,630 progress on each iteration. 844 00:52:46,630 --> 00:52:50,730 But then, you get through this tunnel, and boom! 845 00:52:50,730 --> 00:52:53,330 Things go very fast. 846 00:52:53,330 --> 00:52:55,920 And for this code, it has a zero-- 847 00:52:55,920 --> 00:52:59,670 the regular code has a zero slope at this point, whereas 848 00:52:59,670 --> 00:53:03,540 this has a non-zero slope. 849 00:53:03,540 --> 00:53:05,600 So these things will go boom, boom, boom, boom, boom as you 850 00:53:05,600 --> 00:53:08,200 go in there. 851 00:53:08,200 --> 00:53:11,660 So these guys at Digital Fountain, they called their 852 00:53:11,660 --> 00:53:13,990 second class of codes, [UNINTELLIGIBLE], tornado 853 00:53:13,990 --> 00:53:15,930 codes, because they had this effect.
854 00:53:15,930 --> 00:53:18,370 You have to struggle and struggle, but then when you 855 00:53:18,370 --> 00:53:22,440 finally get it, there's a tornado, a blizzard, of known 856 00:53:22,440 --> 00:53:24,661 edges, and all of a sudden, all the edges become known. 857 00:53:27,960 --> 00:53:31,760 Oh by the way, this could be done for packets. 858 00:53:31,760 --> 00:53:32,870 There's nothing-- 859 00:53:32,870 --> 00:53:36,380 you know, this is a repetition for a packet, and this is a 860 00:53:36,380 --> 00:53:38,800 bit-wise parity check for a packet. 861 00:53:38,800 --> 00:53:42,200 So the same diagram works perfectly well for packet 862 00:53:42,200 --> 00:53:43,400 transmission. 863 00:53:43,400 --> 00:53:44,390 That's the way they use it. 864 00:53:44,390 --> 00:53:44,858 Yeah? 865 00:53:44,858 --> 00:53:46,108 AUDIENCE: [UNINTELLIGIBLE] 866 00:53:49,070 --> 00:53:49,760 PROFESSOR: Yeah. 867 00:53:49,760 --> 00:53:50,060 Right. 868 00:53:50,060 --> 00:53:53,670 So this chart makes it very clear. 869 00:53:53,670 --> 00:53:55,740 If you're going to get this tornado effect, it's because 870 00:53:55,740 --> 00:53:57,490 you have some gap in here. 871 00:53:57,490 --> 00:53:59,480 The bigger the gap, the further away you are from 872 00:53:59,480 --> 00:54:01,630 capacity, quite quantitatively. 873 00:54:06,886 --> 00:54:08,782 So I just-- 874 00:54:08,782 --> 00:54:11,380 this is the first year I've been able to get this far in 875 00:54:11,380 --> 00:54:13,150 the course, and I think this is very much 876 00:54:13,150 --> 00:54:17,530 worth presenting because-- 877 00:54:17,530 --> 00:54:18,650 look at what's happened here. 
878 00:54:18,650 --> 00:54:23,230 At least for one channel, after 50 years of work in 879 00:54:23,230 --> 00:54:29,040 trying to get to Shannon's channel capacity, around 1995 880 00:54:29,040 --> 00:54:32,960 or so, people finally figured out a way of constructing a 881 00:54:32,960 --> 00:54:35,990 code and a decoding algorithm that in fact has linear 882 00:54:35,990 --> 00:54:40,630 complexity, and can get as close to channel capacity as 883 00:54:40,630 --> 00:54:42,890 you like in a very feasible way, at 884 00:54:42,890 --> 00:54:46,300 least for this channel. 885 00:54:46,300 --> 00:54:50,160 So that's really where we want to end the story in this 886 00:54:50,160 --> 00:54:52,480 class, because the whole class has been about getting to 887 00:54:52,480 --> 00:54:53,290 channel capacity. 888 00:54:53,290 --> 00:54:56,350 Well, what about other channels? 889 00:54:56,350 --> 00:55:00,230 What about channels with errors here? 890 00:55:00,230 --> 00:55:13,850 So let's go to the symmetric input binary 891 00:55:13,850 --> 00:55:20,700 channel, which I-- 892 00:55:23,490 --> 00:55:27,760 symmetric, sorry-- symmetric binary input channel. 893 00:55:27,760 --> 00:55:30,160 This is not standardized. 894 00:55:30,160 --> 00:55:35,810 The problem is, what you really want to say is the 895 00:55:35,810 --> 00:55:38,620 binary symmetric channel, except that term is already 896 00:55:38,620 --> 00:55:41,940 taken, so you've got to say something else. 897 00:55:41,940 --> 00:55:44,510 I say symmetric binary input channel. 898 00:55:44,510 --> 00:55:45,775 You'll see other things in the literature. 899 00:55:48,610 --> 00:55:54,840 This channel has 2 inputs: 0 and 1, and it has as many 900 00:55:54,840 --> 00:55:55,950 outputs as you like. 901 00:55:55,950 --> 00:55:59,290 It might have an erasure output. 
902 00:55:59,290 --> 00:56:02,120 And the key thing about the erasure output is that the 903 00:56:02,120 --> 00:56:04,730 probability of getting there from either 0 or 1 is the 904 00:56:04,730 --> 00:56:07,790 same, call it p again. 905 00:56:07,790 --> 00:56:11,610 And so for the a posteriori probabilities, let's write the 906 00:56:11,610 --> 00:56:14,830 APPs by each of these. 907 00:56:14,830 --> 00:56:16,950 The erasure output is always going to be a state of 908 00:56:16,950 --> 00:56:19,510 complete ignorance, you don't know. 909 00:56:19,510 --> 00:56:22,010 So there might be one output like that, and then there will 910 00:56:22,010 --> 00:56:28,150 be other outputs here that occur in pairs. 911 00:56:28,150 --> 00:56:31,650 And the pairs are always going to have the character that 912 00:56:31,650 --> 00:56:35,730 their APP is going to be 1 minus-- 913 00:56:35,730 --> 00:56:37,280 I've used p excessively here. 914 00:56:37,280 --> 00:56:41,080 Let me take it off of here and use it here-- 915 00:56:41,080 --> 00:56:44,330 for a typical other pair, you're going to have (1 minus 916 00:56:44,330 --> 00:56:48,570 p, p), or (p, 1 minus p). 917 00:56:48,570 --> 00:56:50,680 In other words, just looking at these 2 outputs, it's a 918 00:56:50,680 --> 00:56:53,380 binary symmetric channel. 919 00:56:53,380 --> 00:56:56,080 There's probability p of crossover and 1 920 00:56:56,080 --> 00:56:59,890 minus p of being correct. 921 00:56:59,890 --> 00:57:03,140 And we may have pairs that are pretty unreliable where p is 922 00:57:03,140 --> 00:57:05,580 close to 1/2, and we may have pairs that 923 00:57:05,580 --> 00:57:07,460 are extremely reliable. 924 00:57:07,460 --> 00:57:14,110 So this is (1 minus p prime, p prime), where p prime might be 925 00:57:14,110 --> 00:57:17,550 very close to 0. 926 00:57:17,550 --> 00:57:20,820 But the point is, the outputs always occur in these pairs.
927 00:57:20,820 --> 00:57:25,750 The output space can be partitioned into pairs such 928 00:57:25,750 --> 00:57:28,780 that, for each pair, you have a binary symmetric channel, or 929 00:57:28,780 --> 00:57:32,580 you might have this singleton, which is an erasure. 930 00:57:32,580 --> 00:57:37,030 And this is, of course, what we have for the binary input 931 00:57:37,030 --> 00:57:39,170 additive white Gaussian noise channel. 932 00:57:39,170 --> 00:57:45,440 We have 2 inputs, and now we have an output which is the 933 00:57:45,440 --> 00:57:49,760 complete real line, which has a distribution like this. 934 00:57:49,760 --> 00:57:53,555 But in this case, 0 is the erasure. 935 00:57:53,555 --> 00:57:57,080 If we get a 0, then the APP message 936 00:57:57,080 --> 00:57:59,140 is (1/2, 1/2). 937 00:57:59,140 --> 00:58:02,620 And the pairs are plus or minus y. 938 00:58:02,620 --> 00:58:11,090 If we get to see y, then the probability of y given 0, or 939 00:58:11,090 --> 00:58:14,920 given 1, that's the same pair as the probability of 940 00:58:14,920 --> 00:58:16,910 minus y given-- 941 00:58:16,910 --> 00:58:20,740 this is, of course, minus 1, plus 1 for my 2 possible 942 00:58:20,740 --> 00:58:22,950 transmissions here. 943 00:58:22,950 --> 00:58:25,800 The point is, the binary input additive white Gaussian noise channel 944 00:58:25,800 --> 00:58:27,120 is in this class. 945 00:58:27,120 --> 00:58:29,517 It has a continuous output rather than a discrete output. 946 00:58:32,200 --> 00:58:34,630 But there's a key symmetry property here. 947 00:58:34,630 --> 00:58:39,720 Basically, if you exchange 0 for 1, nothing changes. 948 00:58:39,720 --> 00:58:42,070 All right, so there's symmetry between 0 and 1. 949 00:58:42,070 --> 00:58:44,830 That's why it's called a symmetric channel.
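As a concrete illustration of this pairing, here is a short sketch under the usual parameterization (inputs plus or minus 1, noise variance sigma squared; this notation is my assumption, not fixed by the lecture):

```python
# For the binary-input AWGN channel with inputs +1/-1 and noise variance
# sigma^2, each output y pairs with -y to form a little binary symmetric
# channel, and y = 0 is the erasure (APP vector (1/2, 1/2)).

import math

def app_pair(y, sigma):
    """APP vector (P[x = +1 | y], P[x = -1 | y]) under a (1/2, 1/2) prior."""
    llr = 2.0 * y / sigma**2          # log-likelihood ratio log(P[y|+1]/P[y|-1])
    p_plus = 1.0 / (1.0 + math.exp(-llr))
    return p_plus, 1.0 - p_plus

erasure = app_pair(0.0, 1.0)   # y = 0 gives (1/2, 1/2): complete ignorance
a = app_pair(0.8, 1.0)
b = app_pair(-0.8, 1.0)
```

The outputs y and minus y carry mirror-image APP vectors, so each such pair behaves like a binary symmetric channel, exactly the partition described above.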
950 00:58:44,830 --> 00:58:49,390 That means you can easily prove that the capacity 951 00:58:49,390 --> 00:58:53,630 achieving input distribution is always (1/2, 1/2), for any 952 00:58:53,630 --> 00:58:54,450 such channel. 953 00:58:54,450 --> 00:58:56,880 If you've taken information theory, you've seen this 954 00:58:56,880 --> 00:58:59,050 demonstrated. 955 00:58:59,050 --> 00:59:04,370 And this has the important implication that you can use 956 00:59:04,370 --> 00:59:08,320 linear codes on any symmetric binary input channel without 957 00:59:08,320 --> 00:59:09,865 loss of channel capacity. 958 00:59:13,200 --> 00:59:15,020 Linear codes achieve capacity. 959 00:59:19,450 --> 00:59:22,640 OK, whereas, of course, if this weren't (1/2, 1/2), then 960 00:59:22,640 --> 00:59:24,480 linear codes couldn't possibly achieve capacity. 961 00:59:32,190 --> 00:59:33,960 Suppose you have such a channel. 962 00:59:33,960 --> 00:59:38,450 What are the sum product updates? 963 00:59:38,450 --> 00:59:43,860 The sum product updates become more complicated. 964 00:59:43,860 --> 00:59:46,400 They're really not hard for the equality node. 965 00:59:46,400 --> 00:59:50,560 You remember for a repetition node, the sum product update 966 00:59:50,560 --> 00:59:54,610 is just the product of basically the APPs coming in 967 00:59:54,610 --> 00:59:56,670 or the APPs going out. 968 00:59:56,670 --> 00:59:58,920 So all we've got to do is take the product. 969 00:59:58,920 --> 01:00:02,410 It'll turn out the messages in this case are always of the 970 01:00:02,410 --> 01:00:06,500 form (p, 1 minus p)-- 971 01:00:06,500 --> 01:00:09,420 of course, because they're binary, so it 972 01:00:09,420 --> 01:00:11,040 has to be like this-- 973 01:00:11,040 --> 01:00:13,830 so we really just need a single parameter p. 974 01:00:13,830 --> 01:00:18,120 We multiply all the p's, normalize correctly, and 975 01:00:18,120 --> 01:00:19,660 that'll be the output.
976 01:00:22,410 --> 01:00:26,270 For the update here, I'm sorry I don't have time to talk 977 01:00:26,270 --> 01:00:32,990 about it in class, but there's a clever little procedure 978 01:00:32,990 --> 01:00:35,790 which basically says take the Hadamard Transform 979 01:00:35,790 --> 01:00:37,330 of (p, 1 minus p). 980 01:00:37,330 --> 01:00:40,750 The Hadamard Transform in general says, convert this to 981 01:00:40,750 --> 01:00:44,160 the pair (a plus b, a minus b). 982 01:00:44,160 --> 01:00:50,110 So in this case, we convert it: a plus b is always 1, and a 983 01:00:50,110 --> 01:00:56,990 minus b is, in this case, 2p minus 1. 984 01:00:56,990 --> 01:00:59,440 It works out better; it turns out this is actually 985 01:00:59,440 --> 01:01:02,400 a likelihood ratio. 986 01:01:02,400 --> 01:01:04,970 Take the Hadamard Transform, then you can use the same 987 01:01:04,970 --> 01:01:07,250 product update rule as you used up here. 988 01:01:10,220 --> 01:01:19,860 So do the repetition node updates, which is easy-- 989 01:01:19,860 --> 01:01:23,980 so it says just multiply all the inputs component-wise in 990 01:01:23,980 --> 01:01:27,880 this vector, and then take the Hadamard Transform again to 991 01:01:27,880 --> 01:01:33,400 get back to your time domain or primal domain, 992 01:01:33,400 --> 01:01:34,820 rather than dual domain. 993 01:01:34,820 --> 01:01:36,920 So you work in the dual domain, rather 994 01:01:36,920 --> 01:01:38,560 than the primal domain. 995 01:01:38,560 --> 01:01:40,630 Again, I'm sorry. 996 01:01:40,630 --> 01:01:42,390 You've got a homework problem on it; after you've done the 997 01:01:42,390 --> 01:01:45,590 homework problem, you'll understand this. 998 01:01:45,590 --> 01:01:51,410 And this turns out to involve hyperbolic 999 01:01:51,410 --> 01:01:53,860 tangents to do these.
1000 01:01:53,860 --> 01:01:57,770 These Hadamard Transforms turn out to be taking hyperbolic 1001 01:01:57,770 --> 01:02:01,010 tangents, and this is called the hyperbolic tangent rule, 1002 01:02:01,010 --> 01:02:02,630 the tanh rule. 1003 01:02:02,630 --> 01:02:05,790 So there's a simple way to do updates in general for any of 1004 01:02:05,790 --> 01:02:07,040 these channels. 1005 01:02:09,740 --> 01:02:13,400 Now, you can do the same kind of 1006 01:02:13,400 --> 01:02:17,970 analysis, but what's different? 1007 01:02:17,970 --> 01:02:22,280 For the erasure channel, we only had 2 types of messages, 1008 01:02:22,280 --> 01:02:25,620 known or erased, and all we really had to do is keep track 1009 01:02:25,620 --> 01:02:28,980 of what's the probability of the erasure type of message, 1010 01:02:28,980 --> 01:02:32,690 or 1 minus this probability, it doesn't matter. 1011 01:02:32,690 --> 01:02:34,670 So that's why I said it was one-dimensional. 1012 01:02:37,430 --> 01:02:45,310 For the symmetric binary input channel, in general, you can 1013 01:02:45,310 --> 01:02:47,880 have any APP vector here. 1014 01:02:47,880 --> 01:02:49,840 This is a single parameter vector. 1015 01:02:49,840 --> 01:02:54,780 It's parameterized by p, or by the likelihood ratio, or by 1016 01:02:54,780 --> 01:02:56,690 the log likelihood ratio. 1017 01:02:56,690 --> 01:02:58,340 There are various ways to parameterize it. 1018 01:02:58,340 --> 01:03:01,750 But in any case, a single number tells you what the APP 1019 01:03:01,750 --> 01:03:04,220 message is. 1020 01:03:04,220 --> 01:03:08,676 And so at this point-- or I guess, better, looking at it in 1021 01:03:08,676 --> 01:03:10,310 the computation tree-- 1022 01:03:10,310 --> 01:03:12,790 at each point, instead of having a single number, we 1023 01:03:12,790 --> 01:03:16,600 have a probability distribution on p.
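The two updates just described, the component-wise product at a repetition node and the dual-domain product behind the tanh rule, can be sketched like this (my own code; the convention that p is the probability of a 1, and that the LLR is log of P[0]/P[1], is an assumption, not the lecture's notation):

```python
import math

def check_node_update(ps):
    """Check-node update in the probability domain.
    ps = incoming messages, each p = Prob[bit = 1].
    Transform each (1-p, p) to its dual component 1 - 2p,
    multiply, and transform back (Gallager's formula)."""
    d = 1.0
    for p in ps:
        d *= (1.0 - 2.0 * p)
    return (1.0 - d) / 2.0            # Prob[output bit = 1]

def check_node_update_llr(llrs):
    """The same update in the LLR domain, the tanh rule:
    tanh(L_out / 2) = product of tanh(L_in / 2)."""
    t = 1.0
    for L in llrs:
        t *= math.tanh(L / 2.0)
    return 2.0 * math.atanh(t)

def repetition_node_update_llr(channel_llr, llrs):
    """Repetition (equality) node: the product of APPs
    is just a sum in the LLR domain."""
    return channel_llr + sum(llrs)
```

Sanity checks: with two certain zeros coming in, the check output is a certain zero; with one certain 1 and one certain 0, it is a certain 1; and the probability-domain and LLR-domain forms agree numerically.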
1024 01:03:16,600 --> 01:03:22,840 So we get some probability distribution on p, pp of p, 1025 01:03:22,840 --> 01:03:27,640 that characterizes where you are at this time. 1026 01:03:27,640 --> 01:03:31,490 Coming off the channel, initially, the probability 1027 01:03:31,490 --> 01:03:35,690 distribution on p might be equal to y, I think it is, 1028 01:03:35,690 --> 01:03:39,370 actually, or p to the minus y, and you get some probability 1029 01:03:39,370 --> 01:03:41,700 distribution on what p is. 1030 01:03:45,070 --> 01:03:47,720 By the way, again because of symmetry, you can always 1031 01:03:47,720 --> 01:03:51,760 assume that the all-zero vector was sent in your code. 1032 01:03:51,760 --> 01:03:55,770 It doesn't matter which of your code words is sent, since 1033 01:03:55,770 --> 01:03:57,430 everything is symmetrical. 1034 01:03:57,430 --> 01:04:00,120 So you can do all your analysis assuming the all-zero 1035 01:04:00,120 --> 01:04:01,620 code word was sent. 1036 01:04:01,620 --> 01:04:04,060 This simplifies things a lot, too. 1037 01:04:04,060 --> 01:04:07,840 p then becomes the probability which-- 1038 01:04:07,840 --> 01:04:10,270 well, I guess I've got it backwards. 1039 01:04:10,270 --> 01:04:15,870 Should be 1 minus pp, because p then becomes the 1040 01:04:15,870 --> 01:04:18,398 probability. 1041 01:04:18,398 --> 01:04:22,140 If the assumed probability of the input is a 1, in other 1042 01:04:22,140 --> 01:04:24,640 words, the probability that your current 1043 01:04:24,640 --> 01:04:27,470 guess would be wrong-- 1044 01:04:27,470 --> 01:04:29,000 I'm not saying that well. 1045 01:04:29,000 --> 01:04:31,640 Anyway, you get some distribution of p. 1046 01:04:31,640 --> 01:04:34,540 Let me just draw it like that. 1047 01:04:34,540 --> 01:04:37,120 So here's pp of p. 1048 01:04:37,120 --> 01:04:40,210 There's probability distribution. 1049 01:04:40,210 --> 01:04:42,560 And again, we'll draw it going from 1 to 0. 
1050 01:04:45,860 --> 01:04:47,180 So that doesn't go out here. 1051 01:04:50,550 --> 01:04:56,700 OK, with more effort, you can again see what the effect of 1052 01:04:56,700 --> 01:04:58,480 the update rule is going to be. 1053 01:04:58,480 --> 01:05:02,400 For each iteration, you have a certain input distribution on 1054 01:05:02,400 --> 01:05:03,490 all these lines. 1055 01:05:03,490 --> 01:05:06,190 Again, under the independence assumption, you get 1056 01:05:06,190 --> 01:05:08,580 independently-- 1057 01:05:08,580 --> 01:05:12,640 you get a distribution for the APP parameter p on each of 1058 01:05:12,640 --> 01:05:13,980 these lines. 1059 01:05:13,980 --> 01:05:14,780 That leads-- 1060 01:05:14,780 --> 01:05:17,380 you can then calculate what the distribution-- or simulate 1061 01:05:17,380 --> 01:05:21,380 what it is on the output line, just by seeing what's the 1062 01:05:21,380 --> 01:05:24,480 effect of applying the sum product rule. 1063 01:05:24,480 --> 01:05:27,300 It's a much more elaborate calculation, but you can do 1064 01:05:27,300 --> 01:05:30,870 it, or you can do it up to some degree of precision. 1065 01:05:30,870 --> 01:05:35,950 This you can't do exactly, but you can do it to fourteen bits 1066 01:05:35,950 --> 01:05:38,800 of precision if you like. 1067 01:05:38,800 --> 01:05:44,290 And so again, you can work through something that amounts 1068 01:05:44,290 --> 01:05:51,580 to plotting the progress of the iteration through here, up 1069 01:05:51,580 --> 01:05:54,860 to any degree of precision you want. 1070 01:05:54,860 --> 01:05:59,980 So again, you can determine whether it succeeds or fails, 1071 01:05:59,980 --> 01:06:02,900 again, for regular or irregular low-density parity 1072 01:06:02,900 --> 01:06:05,590 check codes. 1073 01:06:05,590 --> 01:06:07,340 In general, it's better to make it irregular. 1074 01:06:07,340 --> 01:06:11,580 You could make it as irregular as you like. 
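The "simulate it up to some degree of precision" step can be sketched as Monte Carlo density evolution, a sampling stand-in for the exact distribution computation (the (3,6) degrees, sample size, and iteration count below are my assumptions for illustration):

```python
import math, random

def mc_density_evolution(sigma, dv=3, dc=6, n_msgs=10000, n_iter=25, seed=1):
    """Track the empirical distribution of LLR messages for a regular
    (dv, dc) LDPC code on the binary-input AWGN channel, assuming the
    all-zeros codeword (+1 transmitted), which the symmetry justifies."""
    rng = random.Random(seed)
    # Channel LLRs for transmitted +1: L = 2y / sigma^2, y ~ N(+1, sigma^2).
    chan = [2.0 * rng.gauss(1.0, sigma) / sigma**2 for _ in range(n_msgs)]
    msgs = chan[:]                      # variable-to-check messages
    for _ in range(n_iter):
        # Check node: tanh rule over dc - 1 sampled incoming messages.
        c2v = []
        for _ in range(n_msgs):
            t = 1.0
            for _ in range(dc - 1):
                t *= math.tanh(rng.choice(msgs) / 2.0)
            t = max(min(t, 1.0 - 1e-12), -1.0 + 1e-12)   # keep atanh finite
            c2v.append(2.0 * math.atanh(t))
        # Variable node: channel LLR plus dv - 1 sampled check messages.
        msgs = [rng.choice(chan) + sum(rng.choice(c2v) for _ in range(dv - 1))
                for _ in range(n_msgs)]
    # Error probability estimate: fraction of messages with the wrong sign.
    return sum(1 for L in msgs if L < 0) / n_msgs
```

Below the (3,6) threshold (sigma around 0.88) the error fraction collapses toward zero over the iterations; above it, it gets stuck at a nonzero level, the same tunnel picture as on the erasure channel, now read off from a whole distribution rather than a single number.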
1075 01:06:11,580 --> 01:06:14,380 And so you can see that this could involve a lot of 1076 01:06:14,380 --> 01:06:21,000 computer time to optimize everything, but at the end of 1077 01:06:21,000 --> 01:06:27,080 the day, it's basically a similar kind of hill climbing, 1078 01:06:27,080 --> 01:06:31,950 curve fitting exercise, where ultimately on any of these 1079 01:06:31,950 --> 01:06:39,180 binary input symmetric channels, you can get as close 1080 01:06:39,180 --> 01:06:41,350 as you want to capacity. 1081 01:06:41,350 --> 01:06:44,090 In the very first lecture, I showed you what Sae-Young 1082 01:06:44,090 --> 01:06:46,460 Chung achieved in his thesis. 1083 01:06:46,460 --> 01:06:49,740 He took the binary input additive white 1084 01:06:49,740 --> 01:06:51,420 Gaussian noise channel. 1085 01:06:51,420 --> 01:06:54,980 Under the assumption of 1086 01:06:54,980 --> 01:06:56,990 asymptotically long random codes, 1087 01:06:56,990 --> 01:07:02,290 he got within 0.0045 dB of channel capacity. 1088 01:07:02,290 --> 01:07:05,990 And then for a more reasonable number, like a block length of 1089 01:07:05,990 --> 01:07:10,970 10 to the seventh, he got within 0.040 1090 01:07:10,970 --> 01:07:13,500 dB of channel capacity. 1091 01:07:13,500 --> 01:07:15,200 Now, that's still a longer code than 1092 01:07:15,200 --> 01:07:16,110 anybody's going to use. 1093 01:07:16,110 --> 01:07:20,730 It's a little bit of a stunt, but I think his work convinced 1094 01:07:20,730 --> 01:07:25,070 everybody that we finally had gotten to channel capacity. 1095 01:07:25,070 --> 01:07:27,300 OK, the Eta Kappa Nu person is here. 1096 01:07:27,300 --> 01:07:30,910 Please help her out, and we'll see you Monday.