1 00:00:00,000 --> 00:00:04,990 [SQUEAKING] [RUSTLING] [CLICKING] 2 00:00:24,980 --> 00:00:27,500 PROFESSOR: OK, hello, everybody. 3 00:00:27,500 --> 00:00:28,315 We'll get started. 4 00:00:31,780 --> 00:00:46,586 So just recapping what we did in the last lecture on Tuesday, 5 00:00:46,586 --> 00:00:50,510 it was the second part of a two-lecture sequence 6 00:00:50,510 --> 00:00:57,260 on the hierarchy theorems for time and space 7 00:00:57,260 --> 00:01:00,320 and using the hierarchy theorems to show 8 00:01:00,320 --> 00:01:04,730 that there is a problem, which is intractable, 9 00:01:04,730 --> 00:01:08,600 that's provably outside of polynomial time. 10 00:01:08,600 --> 00:01:11,960 And that was this equivalence problem for regular expressions 11 00:01:11,960 --> 00:01:13,820 with exponentiation. 12 00:01:13,820 --> 00:01:18,200 And then we had a short discussion about oracles 13 00:01:18,200 --> 00:01:22,730 and the possibility that similar methods might be used to show 14 00:01:22,730 --> 00:01:25,790 that satisfiability is outside of P, 15 00:01:25,790 --> 00:01:28,730 which would then of course solve the P versus NP problem 16 00:01:28,730 --> 00:01:34,070 and argued that it seems unlikely-- this is kind of a meta theorem-- 17 00:01:34,070 --> 00:01:38,330 not a well-defined notion-- 18 00:01:38,330 --> 00:01:40,610 but it seems unlikely that the methods that 19 00:01:40,610 --> 00:01:50,240 were used for proving the intractability of equivalence 20 00:01:50,240 --> 00:01:52,010 of regular expressions with exponentiation 21 00:01:52,010 --> 00:01:55,820 could be used to solve P versus NP, at least 22 00:01:55,820 --> 00:02:00,380 the diagonalization method in a pure form, whatever that means, 23 00:02:00,380 --> 00:02:04,770 that's not going to be enough. 
24 00:02:04,770 --> 00:02:08,780 So today, we're going to shift gears again, begin 25 00:02:08,780 --> 00:02:11,930 a somewhat different topic, which is really 26 00:02:11,930 --> 00:02:17,060 going to be our, again, a few lectures 27 00:02:17,060 --> 00:02:20,720 on probabilistic computation, which is going to round out 28 00:02:20,720 --> 00:02:23,342 the semester for us. 29 00:02:23,342 --> 00:02:24,800 And we're going to start by talking 30 00:02:24,800 --> 00:02:27,800 about a different model of computation which 31 00:02:27,800 --> 00:02:32,332 allows for probabilism-- measuring 32 00:02:32,332 --> 00:02:34,040 the amount of probabilism, measuring 33 00:02:34,040 --> 00:02:37,430 the probabilities, allowing for probabilism in the computation-- 34 00:02:37,430 --> 00:02:39,650 define an associated complexity class-- 35 00:02:39,650 --> 00:02:45,710 this class BPP-- and then start the discussion of an example 36 00:02:45,710 --> 00:02:48,410 about something called branching programs. 37 00:02:48,410 --> 00:02:52,640 OK, so with that in mind-- 38 00:03:01,800 --> 00:03:03,870 so we're going to start off by defining 39 00:03:03,870 --> 00:03:09,644 the notion of a Probabilistic Turing Machine or PTM. 40 00:03:09,644 --> 00:03:11,880 The Probabilistic Turing Machine is 41 00:03:11,880 --> 00:03:14,070 a lot like the way we have thought 42 00:03:14,070 --> 00:03:17,580 about non-deterministic Turing machines in that it's 43 00:03:17,580 --> 00:03:21,180 a kind of a machine that can have multiple choices, 44 00:03:21,180 --> 00:03:24,400 multiple ways to go in its computation. 45 00:03:24,400 --> 00:03:28,608 So there's not just going to be a fixed deterministic path 46 00:03:28,608 --> 00:03:30,150 of its computation, but there's going 47 00:03:30,150 --> 00:03:32,620 to be a tree of possibilities. 
48 00:03:32,620 --> 00:03:34,710 And for our purposes, we're going 49 00:03:34,710 --> 00:03:37,140 to limit the branching in that tree 50 00:03:37,140 --> 00:03:40,910 to be either a step of the computation 51 00:03:40,910 --> 00:03:42,680 where there's no branching, where 52 00:03:42,680 --> 00:03:46,590 it's a deterministic step, as shown over here. 53 00:03:46,590 --> 00:03:50,510 So every step of the way leads uniquely to the next step. 54 00:03:50,510 --> 00:03:53,090 Or there might be some steps which have a choice. 55 00:03:53,090 --> 00:03:56,510 And we're only going to allow for these purposes 56 00:03:56,510 --> 00:04:00,470 to keep life simple, having only a choice 57 00:04:00,470 --> 00:04:03,590 among two possibilities. 58 00:04:03,590 --> 00:04:10,610 And we'll associate to that the notion 59 00:04:10,610 --> 00:04:17,060 of a probability that each choice will have a 50-50 chance 60 00:04:17,060 --> 00:04:18,470 of getting taken. 61 00:04:18,470 --> 00:04:22,160 And this kind of corresponds with the way some of us 62 00:04:22,160 --> 00:04:24,560 or some of you think about non-determinism, which 63 00:04:24,560 --> 00:04:27,170 is not exactly right up until this point, 64 00:04:27,170 --> 00:04:30,085 is that the machine is kind of taking a random branch. 65 00:04:30,085 --> 00:04:32,210 Really, we don't think about it randomly until now. 66 00:04:32,210 --> 00:04:33,960 Now we're going to think about the machine 67 00:04:33,960 --> 00:04:37,880 as actually picking a random choice among all 68 00:04:37,880 --> 00:04:40,380 the different branches that it could make 69 00:04:40,380 --> 00:04:43,550 and picking that choice uniformly by flipping 70 00:04:43,550 --> 00:04:47,790 a coin every time it has an option of which way to go. 
71 00:04:47,790 --> 00:04:49,400 Now you could define-- 72 00:04:49,400 --> 00:04:52,190 I'm getting a question here-- 73 00:04:52,190 --> 00:04:54,740 a machine that has several different ways to-- 74 00:04:54,740 --> 00:04:56,178 more than two ways to go. 75 00:04:56,178 --> 00:04:58,220 And then you would need to have a three-way coin, 76 00:04:58,220 --> 00:04:59,760 a four-way coin, and so on. 77 00:04:59,760 --> 00:05:02,580 And you could define it all that way as well. 78 00:05:02,580 --> 00:05:05,720 But it doesn't end up giving you anything different or anything 79 00:05:05,720 --> 00:05:08,960 interesting or new for the kinds of things 80 00:05:08,960 --> 00:05:10,400 we're going to be discussing. 81 00:05:10,400 --> 00:05:11,775 And it's just going to be simpler 82 00:05:11,775 --> 00:05:13,790 to keep the discussion limited to the case 83 00:05:13,790 --> 00:05:17,090 where the machine can only have two possibilities if it's 84 00:05:17,090 --> 00:05:20,570 going to be having a choice at all or just one possibility 85 00:05:20,570 --> 00:05:23,220 when there's no choice. 86 00:05:23,220 --> 00:05:26,040 OK, so now, we're going to have to talk 87 00:05:26,040 --> 00:05:28,980 about the probability of the machine 88 00:05:28,980 --> 00:05:31,590 taking some branch of its computation. 89 00:05:31,590 --> 00:05:34,740 So you imagine here, here is the same computation tree 90 00:05:34,740 --> 00:05:36,660 that we've seen before in the case 91 00:05:36,660 --> 00:05:39,150 of ordinary non-deterministic machines, 92 00:05:39,150 --> 00:05:41,190 where you have M on w. 93 00:05:41,190 --> 00:05:43,200 There could be several different ways to go. 94 00:05:43,200 --> 00:05:45,240 And there might be some particular branch. 95 00:05:45,240 --> 00:05:47,520 But now, we want to talk about the probability 96 00:05:47,520 --> 00:05:52,510 that the machine actually ends up picking that branch. 
97 00:05:52,510 --> 00:05:59,530 And it's going to be when we talk about the machine having 98 00:05:59,530 --> 00:06:01,510 a choice of ways to go, we're going 99 00:06:01,510 --> 00:06:03,140 to associate that with a coin flip. 100 00:06:03,140 --> 00:06:05,140 So we're going to call that a coin flip step 101 00:06:05,140 --> 00:06:07,720 when the machine has a possibility of ways to go. 102 00:06:07,720 --> 00:06:11,500 And so on a particular branch, the probability 103 00:06:11,500 --> 00:06:15,490 of that branch occurring is going to be 1 over 2 104 00:06:15,490 --> 00:06:22,000 to the number of coin flip steps on that branch. 105 00:06:22,000 --> 00:06:23,240 And the reason for-- 106 00:06:23,240 --> 00:06:29,210 I mean, this is kind of the definition that makes sense 107 00:06:29,210 --> 00:06:33,050 in that if you imagine looking at the computation tree here-- 108 00:06:33,050 --> 00:06:37,880 and here is the branch that we're focusing on of interest-- 109 00:06:37,880 --> 00:06:41,150 every time there's a coin flip on that branch, 110 00:06:41,150 --> 00:06:44,540 there's a 50-50 chance of taking a different branch 111 00:06:44,540 --> 00:06:46,770 or staying on that branch. 112 00:06:46,770 --> 00:06:50,510 So the more coin flips there are on some particular branch, 113 00:06:50,510 --> 00:06:52,070 the less likely that branch would 114 00:06:52,070 --> 00:06:55,940 be the one that the machine actually ends up taking. 115 00:06:55,940 --> 00:06:59,480 And so it's going to be 1 over 2 to the number of coin flips. 116 00:06:59,480 --> 00:07:02,530 And that's the way we're defining it. 
117 00:07:02,530 --> 00:07:05,860 Now once we have that notion, we can also 118 00:07:05,860 --> 00:07:08,710 talk about the probability that the machine ends up 119 00:07:08,710 --> 00:07:12,280 accepting because as before, each of these branches 120 00:07:12,280 --> 00:07:16,120 is going to end up at an accept state or reject state, 121 00:07:16,120 --> 00:07:19,030 thinking about this only in terms of deciders. 122 00:07:19,030 --> 00:07:25,130 And the probability of the machine accepting here 123 00:07:25,130 --> 00:07:28,370 is just going to be the sum over all probabilities 124 00:07:28,370 --> 00:07:32,210 of the branches that end up accepting. 125 00:07:32,210 --> 00:07:34,640 So just add up all of those probabilities of a branch 126 00:07:34,640 --> 00:07:37,880 leading to an accept, and we'll call that the probability 127 00:07:37,880 --> 00:07:39,350 that the machine accepts its input. 128 00:07:42,820 --> 00:07:46,180 And the probability that the machine rejects 129 00:07:46,180 --> 00:07:47,980 is going to be 1 minus the probability 130 00:07:47,980 --> 00:07:52,900 that it accepts because the machine, on every branch, 131 00:07:52,900 --> 00:07:59,230 is either going to do one or the other, OK? 132 00:07:59,230 --> 00:08:02,680 Now if you're thinking about a particular language 133 00:08:02,680 --> 00:08:04,990 that the machine is trying to decide, 134 00:08:04,990 --> 00:08:12,238 this probabilistic machine now is trying to decide, 135 00:08:12,238 --> 00:08:16,210 on each input, some of the branches of the machine 136 00:08:16,210 --> 00:08:17,710 may give the correct answer. 137 00:08:17,710 --> 00:08:20,502 They're going to accept when the input is in the language. 138 00:08:20,502 --> 00:08:22,210 Other branches may give the wrong answer. 139 00:08:22,210 --> 00:08:24,880 They may reject when the input is in the language and vice 140 00:08:24,880 --> 00:08:26,590 versa. 
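The two definitions just given (a branch has probability 1 over 2 to the number of coin flip steps on it, and the acceptance probability is the sum over accepting branches) can be sketched in a few lines of Python. This is only an illustration: the nested-tuple encoding of the computation tree and the names `branches`, `accept_probability`, and `toy_tree` are assumptions made for the example, not anything from the lecture.

```python
from fractions import Fraction

# Hypothetical encoding of a computation tree (an assumption for this sketch):
#   "accept" / "reject"      -> a leaf where the branch halts
#   ("step", child)          -> a deterministic step, no branching
#   ("flip", left, right)    -> a coin flip step with two equally likely choices


def branches(tree, flips=0):
    """Yield (outcome, number_of_coin_flip_steps) for every branch of the tree."""
    if isinstance(tree, str):
        yield tree, flips
    elif tree[0] == "step":
        yield from branches(tree[1], flips)
    else:  # "flip"
        yield from branches(tree[1], flips + 1)
        yield from branches(tree[2], flips + 1)


def accept_probability(tree):
    """Sum 1 / 2**k over the accepting branches, k = coin flips on that branch."""
    return sum(
        (Fraction(1, 2**k) for outcome, k in branches(tree) if outcome == "accept"),
        Fraction(0),
    )


# A toy tree: one coin flip, then either a deterministic accept
# or a second coin flip between reject and accept.
toy_tree = ("flip", ("step", "accept"), ("flip", "reject", "accept"))

if __name__ == "__main__":
    p_accept = accept_probability(toy_tree)
    print("Pr[accept] =", p_accept, " Pr[reject] =", 1 - p_accept)
```

Exact fractions are used instead of floats so that the branch probabilities visibly add up to 1, which is why the rejection probability can be taken as 1 minus the acceptance probability.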
141 00:08:26,590 --> 00:08:28,840 So there's going to be a possibility of error 142 00:08:28,840 --> 00:08:31,020 now in the machine in any particular branch. 143 00:08:31,020 --> 00:08:33,309 It might actually give the wrong answer. 144 00:08:33,309 --> 00:08:35,230 And what we're going to do is bound 145 00:08:35,230 --> 00:08:40,730 that error over all possible inputs, 146 00:08:40,730 --> 00:08:44,510 and we'll say that the machine for any epsilon 147 00:08:44,510 --> 00:08:47,180 greater than or equal to 0, we will 148 00:08:47,180 --> 00:08:50,750 say that the machine decides the language with error probability 149 00:08:50,750 --> 00:08:54,940 epsilon if that's the worst that can possibly 150 00:08:54,940 --> 00:08:59,230 happen if, for every input, the machine gives 151 00:08:59,230 --> 00:09:03,150 the wrong answer with probability at most epsilon. 152 00:09:05,910 --> 00:09:09,600 Equivalently, if you like to spell it out a little bit more, 153 00:09:09,600 --> 00:09:16,280 a little differently, for strings 154 00:09:16,280 --> 00:09:18,500 that are in the language, the probability 155 00:09:18,500 --> 00:09:20,720 that the machine rejects that input 156 00:09:20,720 --> 00:09:22,130 is going to be at most epsilon. 157 00:09:22,130 --> 00:09:24,690 And for strings 158 00:09:24,690 --> 00:09:27,620 not in the language, the probability that it accepts 159 00:09:27,620 --> 00:09:29,720 is at most epsilon. 160 00:09:29,720 --> 00:09:34,280 So again, this is the machine doing the thing that it's not 161 00:09:34,280 --> 00:09:35,660 supposed to be doing. 162 00:09:35,660 --> 00:09:37,610 For things in the language, it should 163 00:09:37,610 --> 00:09:39,230 be rejecting very rarely. 164 00:09:39,230 --> 00:09:40,880 For things not in the language, it 165 00:09:40,880 --> 00:09:42,410 should be accepting very rarely. 166 00:09:42,410 --> 00:09:48,550 And that's what this bound is doing for you, OK? 
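Written out symbolically, the definitions so far might look like the following. The notation is an assumption chosen for this note: $b$ ranges over the branches of $M$ on $w$, $k_b$ is the number of coin flip steps on branch $b$, and $A$ is the language being decided.

```latex
\Pr[\,M \text{ accepts } w\,] \;=\; \sum_{b \text{ accepting}} 2^{-k_b},
\qquad
\Pr[\,M \text{ rejects } w\,] \;=\; 1 - \Pr[\,M \text{ accepts } w\,].

% M decides A with error probability \epsilon if, for every input w:
w \in A \;\Longrightarrow\; \Pr[\,M \text{ rejects } w\,] \le \epsilon,
\qquad
w \notin A \;\Longrightarrow\; \Pr[\,M \text{ accepts } w\,] \le \epsilon.
```

The bound has to hold for every input separately, which is why it is stated as a worst case over all $w$ rather than as an average.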
167 00:09:48,550 --> 00:09:51,260 So let's just see. 168 00:09:51,260 --> 00:09:54,020 So we'll talk about-- so I'm getting some questions here 169 00:09:54,020 --> 00:09:54,520 about-- 170 00:09:59,810 --> 00:10:01,120 so let me just look at these. 171 00:10:01,120 --> 00:10:02,080 One second here. 172 00:10:11,510 --> 00:10:14,570 Yeah, so probability 0-- 173 00:10:14,570 --> 00:10:16,700 so there's a possibility that the machine 174 00:10:16,700 --> 00:10:20,210 might have a probability 0, say, of accepting. 175 00:10:20,210 --> 00:10:23,000 That means there are no branches that end up accepting 176 00:10:23,000 --> 00:10:24,410 or probability 0 of rejecting. 177 00:10:24,410 --> 00:10:27,050 There were no rejecting branches. 178 00:10:27,050 --> 00:10:28,730 I think we're going to talk in a minute 179 00:10:28,730 --> 00:10:30,170 about the connection between this 180 00:10:30,170 --> 00:10:33,170 and the standard notion of NP. 181 00:10:33,170 --> 00:10:35,840 So just hold off on that for a second. 182 00:10:38,470 --> 00:10:43,090 Also, what about the possibility that the machine 183 00:10:43,090 --> 00:10:49,010 is being a decider or running in a certain amount of time? 184 00:10:49,010 --> 00:10:53,710 So we will look at time-bounded machines in a second 185 00:10:53,710 --> 00:10:56,980 on the next slide or two, talking about machines 186 00:10:56,980 --> 00:10:58,550 that run in polynomial time. 187 00:10:58,550 --> 00:11:00,520 So that means all branches have to halt 188 00:11:00,520 --> 00:11:03,860 within some polynomial number of steps. 189 00:11:03,860 --> 00:11:05,780 So that's where we're going. 190 00:11:05,780 --> 00:11:07,750 But for the time being, we're only 191 00:11:07,750 --> 00:11:09,790 looking at deciders, where the machine has 192 00:11:09,790 --> 00:11:12,280 to halt on every branch. 193 00:11:12,280 --> 00:11:14,780 But some branches might run for a long time. 
194 00:11:14,780 --> 00:11:18,970 But for now, we're not going to be thinking about machines that 195 00:11:18,970 --> 00:11:21,850 have branches that run forever. 196 00:11:21,850 --> 00:11:23,410 All of our machines are deciders, 197 00:11:23,410 --> 00:11:26,480 so they halt on every branch. 198 00:11:26,480 --> 00:11:27,890 See, is there anything else here? 199 00:11:27,890 --> 00:11:28,430 No. 200 00:11:28,430 --> 00:11:30,000 So why don't I move on? 201 00:11:30,000 --> 00:11:34,100 So let's define now the class BPP using 202 00:11:34,100 --> 00:11:36,770 this notion of a probabilistic Turing 203 00:11:36,770 --> 00:11:40,140 machine, which is now going to be running in polynomial time. 204 00:11:40,140 --> 00:11:42,380 So BPP is going to be another one of these complexity 205 00:11:42,380 --> 00:11:44,810 classes, a collection of languages, like P, 206 00:11:44,810 --> 00:11:48,140 and NP, and PSPACE, and so on. 207 00:11:48,140 --> 00:11:49,790 But it's going to be now associated 208 00:11:49,790 --> 00:11:52,430 with the capabilities of these probabilistic machines, 209 00:11:52,430 --> 00:11:55,880 the kinds of languages that they can do. 210 00:11:55,880 --> 00:12:02,460 So we'll say, the class BPP is the set of languages, A, 211 00:12:02,460 --> 00:12:06,590 such that there's some probabilistic polynomial time Turing 212 00:12:06,590 --> 00:12:08,750 machine, so all branches have to halt within-- 213 00:12:08,750 --> 00:12:11,480 n to the k for some k. 214 00:12:11,480 --> 00:12:15,980 So some polynomial time probabilistic Turing machine 215 00:12:15,980 --> 00:12:19,550 decides A with error probability, at most, 1/3. 216 00:12:22,530 --> 00:12:26,400 So in other words, 217 00:12:26,400 --> 00:12:28,860 for strings in the language, the machine 218 00:12:28,860 --> 00:12:31,170 has to reject with probability, at most, 1/3. 219 00:12:31,170 --> 00:12:32,622 Or saying it equivalently: 
220 00:12:32,622 --> 00:12:34,080 For strings in the language, it has 221 00:12:34,080 --> 00:12:36,960 to accept with 2/3 probability. 222 00:12:36,960 --> 00:12:38,830 And for strings not in the language, 223 00:12:38,830 --> 00:12:40,530 it has to reject with 2/3 probability, 224 00:12:40,530 --> 00:12:42,465 at least, in both cases. 225 00:12:45,990 --> 00:12:48,460 OK. 226 00:12:48,460 --> 00:12:50,725 Somehow, I ended up with-- 227 00:12:50,725 --> 00:12:52,100 I didn't check my animation here. 228 00:12:52,100 --> 00:12:52,933 But OK, that's fine. 229 00:12:52,933 --> 00:12:53,930 So there is a-- 230 00:12:57,080 --> 00:13:02,870 now if you look at the 1/3 here in the definition, 231 00:13:02,870 --> 00:13:05,690 it seems strange to define a complexity 232 00:13:05,690 --> 00:13:08,780 class in terms of some arbitrary constant like 1/3. 233 00:13:08,780 --> 00:13:17,000 Why didn't we use 1/4 or 1/10 in the definition of BPP 234 00:13:17,000 --> 00:13:20,180 and say the machine has to have an error of at most 235 00:13:20,180 --> 00:13:22,760 1/10 or 1/100? 236 00:13:22,760 --> 00:13:24,260 Well, it doesn't matter. 237 00:13:24,260 --> 00:13:28,220 And that's the point of this next statement 238 00:13:28,220 --> 00:13:31,520 called the amplification lemma, which 239 00:13:31,520 --> 00:13:34,860 says that you can always-- 240 00:13:34,860 --> 00:13:37,070 if you have a machine that's running 241 00:13:37,070 --> 00:13:40,790 in a certain polynomial time that's 242 00:13:40,790 --> 00:13:45,140 running with a certain error, which is, at most, 1/2. 243 00:13:45,140 --> 00:13:47,390 If you have an error 1/2, it's not interesting 244 00:13:47,390 --> 00:13:51,920 because the machine could just flip a coin for every input, 245 00:13:51,920 --> 00:13:55,670 and it could get the right answer with probability of 1/2. 246 00:13:55,670 --> 00:13:58,010 So probability 1/2 is not interesting. 
247 00:13:58,010 --> 00:14:01,070 You have to have probability strictly less than 1/2 248 00:14:01,070 --> 00:14:04,070 for the machine to actually be doing something that's 249 00:14:04,070 --> 00:14:06,630 meaningful about that language. 250 00:14:06,630 --> 00:14:13,480 So if you have a probabilistic Turing machine that has error, 251 00:14:13,480 --> 00:14:15,880 let's say, epsilon 1, which is, at most, 1/2-- 252 00:14:15,880 --> 00:14:20,790 which is less than 1/2, then you can 253 00:14:20,790 --> 00:14:23,850 convert that to any error probability 254 00:14:23,850 --> 00:14:28,020 you want for some other polynomial time 255 00:14:28,020 --> 00:14:29,980 probabilistic Turing machine. 256 00:14:29,980 --> 00:14:33,270 So you can make that error, which maybe starts out as 1/3, 257 00:14:33,270 --> 00:14:39,023 and you can drive that error down to 1 over a googol. 258 00:14:39,023 --> 00:14:40,440 And seriously, you can really make 259 00:14:40,440 --> 00:14:46,540 the error extremely, extremely small 260 00:14:46,540 --> 00:14:48,910 using a very simple procedure. 261 00:14:48,910 --> 00:14:51,450 And that's simply this-- 262 00:14:51,450 --> 00:14:53,340 so if you're starting out with a machine that 263 00:14:53,340 --> 00:14:55,530 has an error probability of 1/3, say, 264 00:14:55,530 --> 00:14:58,050 so that means 2/3 of the time it's 265 00:14:58,050 --> 00:15:00,882 going to get the right answer, and, at most, 1/3 of the time-- 266 00:15:00,882 --> 00:15:03,090 at least 2/3 of the time, the right answer, at most, 267 00:15:03,090 --> 00:15:05,070 1/3 of the time, the incorrect answer, 268 00:15:05,070 --> 00:15:07,933 whether that's accepting or rejecting. 269 00:15:07,933 --> 00:15:09,600 And now you want to get that answer down 270 00:15:09,600 --> 00:15:11,970 to be something much-- that error down to something 271 00:15:11,970 --> 00:15:14,040 much smaller. 
272 00:15:14,040 --> 00:15:19,500 The idea is, you're going to take that machine, 273 00:15:19,500 --> 00:15:24,570 and you're going to run it multiple times 274 00:15:24,570 --> 00:15:27,870 with independent runs, if you want to think about it more 275 00:15:27,870 --> 00:15:28,650 formally speaking. 276 00:15:28,650 --> 00:15:29,850 But it's intuitive. 277 00:15:29,850 --> 00:15:33,240 You're just going to run the machine, tossing your coins. 278 00:15:35,725 --> 00:15:37,350 Instead of just running it once, you're 279 00:15:37,350 --> 00:15:40,320 going to run the machine 100 times or a million times. 280 00:15:40,320 --> 00:15:43,060 But you can do it at just, say, it's a constant factor. 281 00:15:43,060 --> 00:15:44,940 And even 1,000 times is going to be 282 00:15:44,940 --> 00:15:47,610 enough to increase your confidence in the result 283 00:15:47,610 --> 00:15:53,690 tremendously because if you run the machine 1,000 times, 284 00:15:53,690 --> 00:15:58,280 and 600 of those times, the machine accepts, and 400 285 00:15:58,280 --> 00:16:02,630 of the times, the machine rejects, 286 00:16:02,630 --> 00:16:05,990 it's very powerful evidence that this machine 287 00:16:05,990 --> 00:16:08,000 is biased toward accepting, that it's 288 00:16:08,000 --> 00:16:10,290 accepting most of the time. 289 00:16:10,290 --> 00:16:16,710 So if it had an error probability, at most, 290 00:16:16,710 --> 00:16:23,850 1/3, the probability that you're seeing 291 00:16:23,850 --> 00:16:28,350 it accept 600 times when, really, 2/3 of the time, 292 00:16:28,350 --> 00:16:32,640 it's rejecting overall, is extremely unlikely. 
293 00:16:32,640 --> 00:16:34,507 And you can calculate that-- 294 00:16:34,507 --> 00:16:36,090 which we're not going to bother to do, 295 00:16:36,090 --> 00:16:38,670 but it's a routine probability calculation-- 296 00:16:38,670 --> 00:16:41,370 to show that the probability that if you 297 00:16:41,370 --> 00:16:44,970 run it a whole bunch of times, and you see the majority coming 298 00:16:44,970 --> 00:16:53,310 up, which is not the right answer, the probability of that 299 00:16:53,310 --> 00:16:56,550 is extremely small. 300 00:16:56,550 --> 00:16:58,050 So I'm not saying that very clearly. 301 00:16:58,050 --> 00:17:04,020 But the method here is you're going 302 00:17:04,020 --> 00:17:10,829 to take your original machine, which has error probability 1/3 303 00:17:10,829 --> 00:17:12,900 or whatever it is-- 304 00:17:12,900 --> 00:17:17,310 maybe has error probability 49%. 305 00:17:17,310 --> 00:17:20,880 And you run it for a large number of times. 306 00:17:20,880 --> 00:17:23,510 And then you take the majority vote. 307 00:17:23,510 --> 00:17:28,460 And you're sampling the outcomes of this machine. 308 00:17:28,460 --> 00:17:32,390 And if you take enough samples, it's overwhelmingly likely-- 309 00:17:32,390 --> 00:17:34,760 since you're just doing them uniformly-- 310 00:17:34,760 --> 00:17:36,867 you're taking those samples uniformly, 311 00:17:36,867 --> 00:17:38,450 it's overwhelmingly likely that you're 312 00:17:38,450 --> 00:17:41,960 going to be seeing the predominant one come up 313 00:17:41,960 --> 00:17:43,850 more often. 314 00:17:43,850 --> 00:17:46,220 And exactly what that right value is, 315 00:17:46,220 --> 00:17:48,020 we're not going to bother to calculate. 
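The calculation that the lecture skips can be sketched numerically. The snippet below computes the exact probability that a majority vote over independent runs comes out wrong. It assumes, for illustration only, that each run errs with probability exactly `eps` (the worst case allowed by an "at most `eps`" bound) and that the number of runs is odd so there are no ties; the function name `majority_error` is made up for this note.

```python
from fractions import Fraction
from math import comb


def majority_error(eps, runs):
    """Probability that more than half of `runs` independent runs are wrong.

    Assumes each run is wrong with probability exactly `eps`, independently,
    and that `runs` is odd so a strict majority always exists.
    """
    if runs % 2 == 0:
        raise ValueError("use an odd number of runs so there are no ties")
    eps = Fraction(eps)
    threshold = runs // 2 + 1  # number of wrong runs needed to win the vote
    # Binomial tail: sum over j wrong runs of C(runs, j) * eps^j * (1-eps)^(runs-j)
    return sum(
        comb(runs, j) * eps**j * (1 - eps) ** (runs - j)
        for j in range(threshold, runs + 1)
    )


if __name__ == "__main__":
    for runs in (1, 3, 11, 101, 1001):
        print(runs, float(majority_error(Fraction(1, 3), runs)))
```

The error of the majority vote shrinks exponentially in the number of runs, so a polynomial number of repetitions is enough to push it below any target that is exponentially small in the input length, while the total running time stays polynomial.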
316 00:17:48,020 --> 00:17:52,430 But that's something that, I will refer you to the textbook, 317 00:17:52,430 --> 00:17:54,110 or this is the kind of thing that 318 00:17:54,110 --> 00:17:56,225 comes up in any elementary probability book 319 00:17:56,225 --> 00:17:57,350 and sort of very intuitive. 320 00:17:57,350 --> 00:17:58,725 So I don't want to spend the time 321 00:17:58,725 --> 00:18:01,541 and do that calculation, which is not all that interesting. 322 00:18:05,310 --> 00:18:09,270 OK, so just one quick question here that I'm getting-- 323 00:18:09,270 --> 00:18:11,640 what happens if you bound-- if the error is 324 00:18:11,640 --> 00:18:12,840 greater than a half? 325 00:18:12,840 --> 00:18:15,970 I don't think that because we're bounding the error, 326 00:18:15,970 --> 00:18:19,110 so we're not saying the error actually is one, like, 327 00:18:19,110 --> 00:18:22,320 60% on everything because then if you knew the error was 328 00:18:22,320 --> 00:18:27,300 60% guaranteed, you can always just flip your answer around 329 00:18:27,300 --> 00:18:30,150 and get your error to be 40%. 330 00:18:30,150 --> 00:18:35,390 But I'm saying the error's at most whatever epsilon is. 331 00:18:35,390 --> 00:18:38,717 And so if you're saying the error is, at most, 60%, 332 00:18:38,717 --> 00:18:39,925 it doesn't tell you anything. 333 00:18:43,135 --> 00:18:43,635 OK. 334 00:18:51,970 --> 00:18:54,820 Another question is, does the amplification lemma also 335 00:18:54,820 --> 00:18:58,810 justify that the choice of model with binary branching choices 336 00:18:58,810 --> 00:19:00,820 is equivalent to any other? 337 00:19:00,820 --> 00:19:05,320 Perhaps, you could say that because you can change those. 338 00:19:05,320 --> 00:19:08,950 If you had three-way branching, you 339 00:19:08,950 --> 00:19:12,370 can simulate that with two-way branching to any accuracy 340 00:19:12,370 --> 00:19:14,036 that you want. 
341 00:19:14,036 --> 00:19:16,420 You're not going to get it down to zero, 342 00:19:16,420 --> 00:19:18,700 but you're going to get it very close. 343 00:19:18,700 --> 00:19:21,400 So maybe it's the amplification lemma. 344 00:19:21,400 --> 00:19:26,090 Maybe it's, yeah, sort of all related. 345 00:19:26,090 --> 00:19:28,830 OK, let's move on. 346 00:19:28,830 --> 00:19:35,990 So the way that it's helpful to think about this class, 347 00:19:35,990 --> 00:19:39,873 let's contrast it with the other model 348 00:19:39,873 --> 00:19:41,540 of non-deterministic computation that we 349 00:19:41,540 --> 00:19:44,720 have is non-determinism, is NP. 350 00:19:44,720 --> 00:19:46,910 So non-deterministic, the model of 351 00:19:46,910 --> 00:19:49,040 non-deterministic polynomial time computation 352 00:19:49,040 --> 00:19:52,250 was NP, the other class. 353 00:19:52,250 --> 00:19:57,390 And so the way-- 354 00:19:57,390 --> 00:19:59,220 I think one way to look at or think 355 00:19:59,220 --> 00:20:01,860 about non-determinism in the case of NP 356 00:20:01,860 --> 00:20:08,310 is, for strings in the language for your NP Turing machine, 357 00:20:08,310 --> 00:20:10,540 there's at least one accepting branch. 358 00:20:10,540 --> 00:20:13,020 So I'm indicating the accepting ones in green 359 00:20:13,020 --> 00:20:17,050 and the non-accepting one, the rejecting branches in red. 360 00:20:17,050 --> 00:20:22,050 So you could have almost all of the branches be rejecting 361 00:20:22,050 --> 00:20:25,470 branches for strings in the language 362 00:20:25,470 --> 00:20:27,600 as long as there is at least one accepting branch. 363 00:20:27,600 --> 00:20:29,700 That's just the way non-determinism works. 364 00:20:29,700 --> 00:20:33,000 The accepting branch overrules all of the others. 365 00:20:33,000 --> 00:20:34,800 It's only when you're not in the language 366 00:20:34,800 --> 00:20:38,280 that all of the branches turn out to have to be rejecting. 
367 00:20:38,280 --> 00:20:42,450 That's when the rejecting has no accepting branch 368 00:20:42,450 --> 00:20:44,370 to overrule it. 369 00:20:44,370 --> 00:20:46,750 But the situation for BPP is a little different-- 370 00:20:46,750 --> 00:20:48,780 is different. 371 00:20:48,780 --> 00:20:53,280 There, it's kind of the majority rules. 372 00:20:53,280 --> 00:20:58,370 So in the case for strings in the language, 373 00:20:58,370 --> 00:21:00,620 you need to have a large-- 374 00:21:00,620 --> 00:21:05,900 or the overwhelming majority of the branches 375 00:21:05,900 --> 00:21:07,040 have to be accepting. 376 00:21:07,040 --> 00:21:08,660 And for strings not in the language, 377 00:21:08,660 --> 00:21:11,970 the overwhelming majority have to be rejecting. 378 00:21:11,970 --> 00:21:15,600 What you're not going to allow in the case of BPP 379 00:21:15,600 --> 00:21:23,490 is kind of an in-between state where it's sort of 50-50 380 00:21:23,490 --> 00:21:25,830 or very, very close to 50-50. 381 00:21:25,830 --> 00:21:29,670 Those kinds of machines don't qualify 382 00:21:29,670 --> 00:21:35,390 as deciding a language in BPP. 383 00:21:35,390 --> 00:21:38,120 They always have to lean one way or lean 384 00:21:38,120 --> 00:21:41,352 the other way for every input. 385 00:21:41,352 --> 00:21:43,560 Otherwise, you won't be able to do the amplification. 386 00:21:43,560 --> 00:21:47,090 So you need to have some bias away from half 387 00:21:47,090 --> 00:21:51,890 in accepting or rejecting. 388 00:21:51,890 --> 00:21:55,897 So let me-- so I was going to ask a check-in, I think, 389 00:21:55,897 --> 00:21:56,480 at this point. 390 00:21:56,480 --> 00:21:57,770 Yes, let's. 391 00:21:57,770 --> 00:22:00,910 So just thinking about BPP, I hope I was clear. 392 00:22:00,910 --> 00:22:02,750 So if there's questions about that, 393 00:22:02,750 --> 00:22:04,430 I think I somehow didn't-- 394 00:22:07,320 --> 00:22:10,240 I'm not sure I described it totally well here. 
395 00:22:10,240 --> 00:22:12,390 So I'm going to ask a few questions about BPP. 396 00:22:12,390 --> 00:22:15,600 But if you have any questions for me first, go ahead. 397 00:22:20,355 --> 00:22:21,980 OK, why don't we just run the check-in? 398 00:22:31,685 --> 00:22:33,810 Let me launch this, and then I can answer questions 399 00:22:33,810 --> 00:22:36,171 as you're asking. 400 00:22:36,171 --> 00:22:37,560 Did I start that? 401 00:22:37,560 --> 00:22:39,240 Yeah. 402 00:22:39,240 --> 00:22:44,520 OK, so you have to check all of these that you think are true. 403 00:22:48,072 --> 00:22:49,780 Can you think of non-deterministic Turing 404 00:22:49,780 --> 00:22:54,310 machines as try all branches at once and get the right answer? 405 00:22:59,370 --> 00:23:03,582 And BPP gets only one branch? 406 00:23:03,582 --> 00:23:04,082 No. 407 00:23:06,680 --> 00:23:08,250 I would say a little differently. 408 00:23:08,250 --> 00:23:10,790 I would say, I would think of non-determinism 409 00:23:10,790 --> 00:23:12,710 as you can still just try one branch, 410 00:23:12,710 --> 00:23:16,320 but you always guess the right one. 411 00:23:16,320 --> 00:23:20,122 So there's some sort of magical power 412 00:23:20,122 --> 00:23:22,580 that allows you always to guess the right answer if there's 413 00:23:22,580 --> 00:23:25,350 a right guess if you're in the language. 414 00:23:25,350 --> 00:23:29,450 In the case of BPP, you're going to be picking 415 00:23:29,450 --> 00:23:32,010 a random branch no matter what. 416 00:23:32,010 --> 00:23:33,870 And you know that the random branch 417 00:23:33,870 --> 00:23:37,860 is likely to give the right answer but not guaranteed. 418 00:23:37,860 --> 00:23:40,800 And the amplification lemma tells you 419 00:23:40,800 --> 00:23:42,660 you can arrange things so that it's 420 00:23:42,660 --> 00:23:46,650 extremely likely that the random branch is going 421 00:23:46,650 --> 00:23:49,220 to give you the right answer. 
422 00:23:49,220 --> 00:23:49,880 OK, let's see. 423 00:23:49,880 --> 00:23:52,100 How we doing on our check-in here? 424 00:23:54,890 --> 00:23:57,860 Got a lot of support for all candidates. 425 00:24:01,140 --> 00:24:06,065 And I'll give you another-- 426 00:24:08,970 --> 00:24:10,800 a little bit of time here because there's 427 00:24:10,800 --> 00:24:11,675 a bunch of questions. 428 00:24:11,675 --> 00:24:14,850 It's almost like four check-ins at once. 429 00:24:14,850 --> 00:24:17,805 But we have two more real check-ins coming later. 430 00:24:20,490 --> 00:24:25,162 OK, so why don't we come and let's give it 431 00:24:25,162 --> 00:24:27,120 another 10 seconds, and then I'm going to stop. 432 00:24:31,190 --> 00:24:35,390 Closing down-- 1, 2, 3, close. 433 00:24:38,510 --> 00:24:42,140 OK, so we've got a lot of support here. 434 00:24:42,140 --> 00:24:47,270 And in fact that's good because all of them are true. 435 00:24:47,270 --> 00:24:50,160 Some of them are easier to see than others. 436 00:24:50,160 --> 00:24:54,320 So first of all, c is very easy to see because that's 437 00:24:54,320 --> 00:24:57,860 going to be a machine that has the correct answer all 438 00:24:57,860 --> 00:24:58,920 of the time. 439 00:24:58,920 --> 00:25:02,600 So that's error probability zero on both 440 00:25:02,600 --> 00:25:03,830 accepting and rejecting. 441 00:25:06,680 --> 00:25:07,700 This is a little harder. 442 00:25:07,700 --> 00:25:10,790 D is a little bit harder to see that it's in PSPACE. 443 00:25:10,790 --> 00:25:15,320 But you could calculate for every branch 444 00:25:15,320 --> 00:25:16,787 what its probability is. 445 00:25:16,787 --> 00:25:18,620 And you can just go through all the branches 446 00:25:18,620 --> 00:25:21,080 and add up all those probabilities in a PSPACE 447 00:25:21,080 --> 00:25:21,890 machine. 448 00:25:21,890 --> 00:25:24,680 So you have to think about it a little bit. 449 00:25:24,680 --> 00:25:28,250 But d is not too hard to see either. 
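The argument for item d (go through all the branches and add up their probabilities) can be sketched as a brute-force loop. The modeling here is an assumption for illustration: the machine is a Python function `machine(w, coins)` that returns `True` to accept, and every branch is padded to use exactly `num_flips` coin flips, so each coin string has the same probability. The names `acceptance_probability` and `toy_machine` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def acceptance_probability(machine, w, num_flips):
    """Try every coin-flip sequence of length `num_flips`, one at a time.

    Only the current coin sequence and a counter are kept between iterations,
    which mirrors why the simulation needs little space even though it takes
    time exponential in `num_flips`.
    """
    accepting = 0
    for coins in product((0, 1), repeat=num_flips):
        if machine(w, coins):
            accepting += 1
    return Fraction(accepting, 2**num_flips)


def toy_machine(w, coins):
    """Hypothetical machine: flips two coins and rejects only on (0, 0)."""
    return coins != (0, 0)


if __name__ == "__main__":
    p = acceptance_probability(toy_machine, "abc", 2)
    # Accept the input exactly when the machine leans toward accepting.
    print("Pr[accept] =", p, "so decide:", "accept" if p > Fraction(1, 2) else "reject")
```

Since a polynomial time machine makes at most polynomially many coin flips, each coin sequence fits in polynomial space, and the space for one simulated run can be reused for the next.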
450 00:25:28,250 --> 00:25:33,380 Closure under complement-- if you just take your BPP machine 451 00:25:33,380 --> 00:25:38,570 and you flip the answer on every branch, that typically 452 00:25:38,570 --> 00:25:41,150 doesn't work in ordinary non-determinism, 453 00:25:41,150 --> 00:25:43,100 but it does work here because it's 454 00:25:43,100 --> 00:25:45,170 going to change a bias toward accepting 455 00:25:45,170 --> 00:25:48,000 into a bias toward rejecting and vice versa. 456 00:25:48,000 --> 00:25:50,840 So BPP is closed under complement. 457 00:25:50,840 --> 00:25:52,410 And closure under union-- 458 00:25:52,410 --> 00:25:54,410 it kind of follows from the amplification lemma. 459 00:25:54,410 --> 00:25:58,110 As long as you can make the error probability extremely small, 460 00:25:58,110 --> 00:26:00,830 then you can just run the two different machines. 461 00:26:00,830 --> 00:26:03,140 And even though they each may 462 00:26:03,140 --> 00:26:06,170 make a mistake, cumulatively, the total, 463 00:26:06,170 --> 00:26:09,320 the probability that each one of them-- that either of them 464 00:26:09,320 --> 00:26:10,880 will make a mistake is still small. 465 00:26:10,880 --> 00:26:12,770 And so you can just run the two machines 466 00:26:12,770 --> 00:26:14,907 and take the OR of the responses that they get, 467 00:26:14,907 --> 00:26:16,490 and it's still very likely to give you 468 00:26:16,490 --> 00:26:18,920 the right answer for the union. 469 00:26:18,920 --> 00:26:24,480 OK, let's continue. 470 00:26:24,480 --> 00:26:28,958 So what I'm going to do now for the rest of the lecture is-- 471 00:26:28,958 --> 00:26:30,500 and it's actually going to spill over 472 00:26:30,500 --> 00:26:32,743 into the lecture after Thanksgiving 473 00:26:32,743 --> 00:26:34,160 because this is going to introduce 474 00:26:34,160 --> 00:26:36,560 an important method for us-- 475 00:26:36,560 --> 00:26:41,960 is to look at an example of a problem that's in BPP.
476 00:26:41,960 --> 00:26:45,080 I love to teach things by using examples, 477 00:26:45,080 --> 00:26:46,850 and so this is a very good example 478 00:26:46,850 --> 00:26:49,990 because it has a lot of meat to it. 479 00:26:49,990 --> 00:26:52,450 And it's a very interesting example. 480 00:26:52,450 --> 00:26:56,680 In general, proving things in BPP, which are not trivially 481 00:26:56,680 --> 00:26:58,390 there because they're already in P, 482 00:26:58,390 --> 00:27:00,850 they tend to be somewhat more involved 483 00:27:00,850 --> 00:27:04,540 than some of the other algorithms we've seen. 484 00:27:04,540 --> 00:27:09,370 So there are no simple examples of problems in BPP 485 00:27:09,370 --> 00:27:12,550 which are not already in P. 486 00:27:12,550 --> 00:27:16,840 So this is one example that we're 487 00:27:16,840 --> 00:27:19,330 going to go through of a problem in BPP that's 488 00:27:19,330 --> 00:27:23,330 not known to be in P. Of course, things could collapse down, 489 00:27:23,330 --> 00:27:30,033 but as far as we know, this language is not in P. 490 00:27:30,033 --> 00:27:31,450 So let's see what the language is. 491 00:27:31,450 --> 00:27:34,360 It has to do with these things called branching programs. 492 00:27:34,360 --> 00:27:37,970 The branching program is a structure that looks like this. 493 00:27:37,970 --> 00:27:40,390 So let's understand what the pieces are. 494 00:27:40,390 --> 00:27:42,595 First of all, it's a directed graph. 495 00:27:46,300 --> 00:27:49,360 And we're not going to allow-- 496 00:27:49,360 --> 00:27:51,880 there are no cycles allowed in this graph. 497 00:27:51,880 --> 00:27:56,650 It's a directed acyclic graph-- 498 00:27:56,650 --> 00:27:58,590 so no loops allowed. 499 00:27:58,590 --> 00:28:02,790 And the nodes are in two categories. 500 00:28:02,790 --> 00:28:08,280 There are query nodes, which are labeled with a variable letter, 501 00:28:08,280 --> 00:28:13,920 and output nodes, which are labeled either 0 or 1. 
502 00:28:13,920 --> 00:28:17,160 And lastly, one of the query nodes is going to be-- 503 00:28:17,160 --> 00:28:23,440 or one of the nodes is going to be designated as a start, OK? 504 00:28:23,440 --> 00:28:29,390 And so what you do is, the way-- 505 00:28:29,390 --> 00:28:31,820 this is a model of computation. 506 00:28:31,820 --> 00:28:39,470 And the way we actually use a branching program is we 507 00:28:39,470 --> 00:28:41,930 have some assignment to the variables. 508 00:28:41,930 --> 00:28:43,560 That's going to be the input. 509 00:28:43,560 --> 00:28:45,450 So you take all of the variables. 510 00:28:45,450 --> 00:28:47,420 There are three variables in this case-- 511 00:28:47,420 --> 00:28:49,520 x1, x2, and x3. 512 00:28:49,520 --> 00:28:51,800 You give them some truth assignment. 513 00:28:51,800 --> 00:28:54,110 So let's say, 0's and 1's. 514 00:28:54,110 --> 00:28:56,120 And so x1 is 0. 515 00:28:56,120 --> 00:28:57,480 x2 is 1, or whatever. 516 00:28:57,480 --> 00:29:00,710 And once you have the truth assignment, 517 00:29:00,710 --> 00:29:08,270 you start at the start node, and you look at its label. 518 00:29:10,810 --> 00:29:15,800 And you see what value the input has assigned to that variable. 519 00:29:15,800 --> 00:29:18,823 So if x1 is assigned a 1, you're going 520 00:29:18,823 --> 00:29:19,990 to follow down the 1 branch. 521 00:29:19,990 --> 00:29:22,930 If it's assigned a 0, you go down the 0 branch. 522 00:29:22,930 --> 00:29:27,680 And then when you get down to the next node, 523 00:29:27,680 --> 00:29:29,540 that's another variable that you're 524 00:29:29,540 --> 00:29:32,660 going to have to query depending upon what the input 525 00:29:32,660 --> 00:29:33,890 assignment is. 526 00:29:33,890 --> 00:29:37,790 And you're just going to continue that process.
527 00:29:37,790 --> 00:29:39,770 Because there are no cycles, you're 528 00:29:39,770 --> 00:29:42,170 going to end up at one of the output nodes 529 00:29:42,170 --> 00:29:44,030 because all of the variable nodes-- 530 00:29:44,030 --> 00:29:48,050 all the query nodes have two outgoing edges, 531 00:29:48,050 --> 00:29:50,900 one labeled 0 and one labeled 1. 532 00:29:50,900 --> 00:29:53,210 So you're going to eventually end up at an output node, 533 00:29:53,210 --> 00:29:54,585 and that's going to be the output 534 00:29:54,585 --> 00:29:56,390 of the branching program. 535 00:29:56,390 --> 00:30:00,180 So let's do a quick example. 536 00:30:00,180 --> 00:30:04,580 So if x1 is 1, x2 is 0, and x3 is 1-- 537 00:30:04,580 --> 00:30:07,550 so again, we start at the start variable-- 538 00:30:07,550 --> 00:30:11,240 the start node, which is indicated with the arrow 539 00:30:11,240 --> 00:30:12,350 coming in from nowhere. 540 00:30:12,350 --> 00:30:19,470 So you're going to start at the node labeled x1. 541 00:30:19,470 --> 00:30:21,840 So you have to look and see, what is x1 in the input? 542 00:30:21,840 --> 00:30:22,530 It's a 1. 543 00:30:22,530 --> 00:30:24,690 So you're going to follow down the 1 branch. 544 00:30:24,690 --> 00:30:26,040 Now you see the next node. 545 00:30:26,040 --> 00:30:27,705 Oh, that's an x3. 546 00:30:27,705 --> 00:30:31,320 We see, what's x3 in the input? x3 is a 1. 547 00:30:31,320 --> 00:30:33,480 So go down the 1 branch again. 548 00:30:33,480 --> 00:30:35,700 Now we have an x2 node. 549 00:30:35,700 --> 00:30:37,720 Take a look at the input. x2 is a 0. 550 00:30:37,720 --> 00:30:39,330 We follow the 0 branch. 551 00:30:39,330 --> 00:30:41,860 Now, you're at an output branch, an output node. 552 00:30:41,860 --> 00:30:46,420 So that's a 0, and that's the output of this computation.
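[Editor's sketch] The walk-through above can be written as a few lines of Python. This is a minimal sketch, not the lecture's own notation: the dictionary layout, the node names, and the function name evaluate are assumptions made for illustration, and the slide's full graph is not in the transcript. Only the path x1=1, then x3=1, then x2=0, then output 0 is taken from the worked example; every other edge is made up.

```python
# A query node is a tuple (variable, child_if_0, child_if_1).
# An output node is just the integer 0 or 1.
# Hypothetical graph: only the path a -> c -> e -> out0 mirrors the lecture's example.
EXAMPLE_BP = {
    "start": "a",
    "nodes": {
        "a": ("x1", "b", "c"),
        "b": ("x2", "out0", "out1"),  # assumed edges
        "c": ("x3", "d", "e"),        # the 1-edge to the x2 node "e" is from the example
        "d": ("x1", "out0", "out1"),  # assumed edges
        "e": ("x2", "out0", "out1"),  # the 0-edge to output 0 is from the example
        "out0": 0,
        "out1": 1,
    },
}


def evaluate(bp, assignment):
    """Follow the single path selected by the assignment; return the output label."""
    node = bp["nodes"][bp["start"]]
    while isinstance(node, tuple):  # still at a query node
        variable, if_zero, if_one = node
        next_name = if_one if assignment[variable] == 1 else if_zero
        node = bp["nodes"][next_name]
    return node  # 0 or 1


result = evaluate(EXAMPLE_BP, {"x1": 1, "x2": 0, "x3": 1})
print(result)
```

Because the graph is acyclic and every query node has both a 0-edge and a 1-edge, the loop always ends at an output node.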
553 00:30:46,420 --> 00:30:49,710 So writing it this way and thinking about it 554 00:30:49,710 --> 00:30:53,700 as a Boolean function which maps strings of zeros and ones, 555 00:30:53,700 --> 00:30:58,440 we have f of 101 representing that assignment. 556 00:30:58,440 --> 00:30:59,460 That equals 0. 557 00:30:59,460 --> 00:31:00,570 That was the output. 558 00:31:00,570 --> 00:31:06,795 And that's the output of this computation, OK? 559 00:31:06,795 --> 00:31:07,920 So important to understand. 560 00:31:07,920 --> 00:31:10,410 We're going to spend a lot of time 561 00:31:10,410 --> 00:31:13,050 talking about branching programs-- so critical 562 00:31:13,050 --> 00:31:14,475 to understand this model. 563 00:31:14,475 --> 00:31:15,600 I think it's fairly simple. 564 00:31:15,600 --> 00:31:17,325 But if you didn't get it, please ask. 565 00:31:20,400 --> 00:31:22,950 We can easily clear up any misunderstanding at this point. 566 00:31:28,920 --> 00:31:31,830 It's not exactly the same as a DFA. 567 00:31:31,830 --> 00:31:36,220 DFAs, for one thing, can take inputs of any length. 568 00:31:36,220 --> 00:31:40,470 This has inputs of some particular length, 569 00:31:40,470 --> 00:31:45,198 where the branching program has some fixed number of variables. 570 00:31:45,198 --> 00:31:45,990 This one has three. 571 00:31:45,990 --> 00:31:49,500 So this only takes inputs of length 3. 572 00:31:49,500 --> 00:31:52,870 So there's maybe some connection to thinking of these as states 573 00:31:52,870 --> 00:31:56,040 and so on, but it's a different model. 574 00:31:56,040 --> 00:31:59,580 So now, we'll say that two branching programs-- 575 00:32:02,027 --> 00:32:03,610 OK, let me just ask one more question. 576 00:32:03,610 --> 00:32:05,920 Not all nodes need to be used, right? 577 00:32:09,710 --> 00:32:11,600 Yeah, I mean, there's no requirement 578 00:32:11,600 --> 00:32:13,250 that all nodes need to be used.
579 00:32:13,250 --> 00:32:15,140 And there even could be inaccessible nodes. 580 00:32:15,140 --> 00:32:17,450 I'm not preventing that. 581 00:32:17,450 --> 00:32:19,163 That could be OK. 582 00:32:19,163 --> 00:32:21,080 So on a particular branch, certainly, you're 583 00:32:21,080 --> 00:32:23,030 not going to-- when you're executing 584 00:32:23,030 --> 00:32:25,783 this branching program on an input, 585 00:32:25,783 --> 00:32:27,200 obviously, certainly, you're going 586 00:32:27,200 --> 00:32:29,840 to have a path that's going to only use 587 00:32:29,840 --> 00:32:35,390 some part of the tree, a part of the graph. 588 00:32:35,390 --> 00:32:41,410 But there might be some paths that can never occur. 589 00:32:41,410 --> 00:32:46,390 So if you went down x1 equal to 1 here, and then x3 was 0, now, 590 00:32:46,390 --> 00:32:52,830 you're re-reading x1, so you wouldn't go down this branch 591 00:32:52,830 --> 00:32:53,587 unless you-- 592 00:32:53,587 --> 00:32:55,920 I think all of the branches in this particular branching 593 00:32:55,920 --> 00:32:57,490 program could get used. 594 00:32:57,490 --> 00:33:00,720 But I didn't check that, so maybe I'm wrong. 595 00:33:00,720 --> 00:33:02,340 OK, so let's continue. 596 00:33:02,340 --> 00:33:05,370 Two branching programs may or may not 597 00:33:05,370 --> 00:33:07,813 compute the same function. 598 00:33:07,813 --> 00:33:09,480 We'll say they're equivalent if they do. 599 00:33:13,550 --> 00:33:18,515 Now two branching programs can be equivalent 600 00:33:18,515 --> 00:33:20,390 even though they superficially look different 601 00:33:20,390 --> 00:33:22,480 from one another. 602 00:33:22,480 --> 00:33:25,570 And we're interested in the computational problem of, 603 00:33:25,570 --> 00:33:27,730 given two of these branching programs, 604 00:33:27,730 --> 00:33:29,800 do they compute the same function?
605 00:33:29,800 --> 00:33:32,980 In other words, do they always give the same answer 606 00:33:32,980 --> 00:33:36,140 on every setting of the input? 607 00:33:36,140 --> 00:33:38,860 So we'll define the associated language. 608 00:33:38,860 --> 00:33:40,900 Equivalence problem for branching programs 609 00:33:40,900 --> 00:33:44,020 says that you're given two of these branching programs, 610 00:33:44,020 --> 00:33:46,330 and they have to be equivalent to be in the language. 611 00:33:46,330 --> 00:33:48,550 We're going to sometimes write equivalence 612 00:33:48,550 --> 00:33:53,380 using the mathematical notation of the three-lined equals sign, 613 00:33:53,380 --> 00:33:56,043 equivalence sign. 614 00:33:56,043 --> 00:33:57,960 OK, that means they compute the same-- they always 615 00:33:57,960 --> 00:34:00,220 give the same answer. 616 00:34:00,220 --> 00:34:05,960 Now that problem turns out to be coNP-complete, 617 00:34:05,960 --> 00:34:10,440 as I've asked you to show on your homework, I believe. 618 00:34:10,440 --> 00:34:13,530 This is not a super hard reduction. 619 00:34:13,530 --> 00:34:15,870 And coNP-complete, by the way, is the complement 620 00:34:15,870 --> 00:34:17,370 of an NP-complete problem. 621 00:34:17,370 --> 00:34:21,389 Or equivalently, it's a problem to which all coNP problems are 622 00:34:21,389 --> 00:34:23,570 polynomial time-reducible, and it's in coNP. 623 00:34:26,780 --> 00:34:38,810 So this is coNP-complete, and that's for you to show.
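[Editor's sketch] The definition of equivalence above, two programs giving the same answer on every setting of the input, can be checked directly by trying all 2^n assignments. This is a minimal illustration of the definition only; it takes exponential time and is not the algorithm the lecture is building toward. The dictionary layout and the names evaluate, equivalent_brute_force, AND_A, AND_B, and OR_C are all assumptions made for this example.

```python
from itertools import product


def evaluate(bp, assignment):
    """Follow the path selected by the assignment; return the output label (0 or 1)."""
    node = bp["nodes"][bp["start"]]
    while isinstance(node, tuple):  # query node: (variable, child_if_0, child_if_1)
        variable, if_zero, if_one = node
        node = bp["nodes"][if_one if assignment[variable] == 1 else if_zero]
    return node


def equivalent_brute_force(bp1, bp2, variables):
    """Return True iff bp1 and bp2 agree on all 2^n Boolean assignments."""
    for bits in product([0, 1], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if evaluate(bp1, assignment) != evaluate(bp2, assignment):
            return False
    return True


# Hypothetical programs: AND_A and AND_B look different (they query the
# variables in opposite orders) but both compute x1 AND x2. OR_C computes x1 OR x2.
AND_A = {"start": "a", "nodes": {"a": ("x1", "zero", "b"),
                                 "b": ("x2", "zero", "one"),
                                 "zero": 0, "one": 1}}
AND_B = {"start": "a", "nodes": {"a": ("x2", "zero", "b"),
                                 "b": ("x1", "zero", "one"),
                                 "zero": 0, "one": 1}}
OR_C = {"start": "a", "nodes": {"a": ("x1", "b", "one"),
                                "b": ("x2", "zero", "one"),
                                "zero": 0, "one": 1}}

print(equivalent_brute_force(AND_A, AND_B, ["x1", "x2"]))
print(equivalent_brute_force(AND_A, OR_C, ["x1", "x2"]))
```

AND_A and AND_B illustrate the remark that equivalent programs can look superficially different from one another.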
624 00:34:38,810 --> 00:34:41,540 But that has an important significance for us 625 00:34:41,540 --> 00:34:45,830 right now because if-- 626 00:34:45,830 --> 00:34:50,630 looking at the question of whether this problem is in BPP, 627 00:34:50,630 --> 00:34:54,469 the fact that it's coNP-complete suggests 628 00:34:54,469 --> 00:35:02,210 that the answer is no because if a coNP-complete or NP-complete 629 00:35:02,210 --> 00:35:06,050 problem were in BPP, because everything else in NP or coNP 630 00:35:06,050 --> 00:35:10,280 is reducible to that problem, then all of those NP or coNP 631 00:35:10,280 --> 00:35:13,580 problems would be in BPP for exactly the same reason 632 00:35:13,580 --> 00:35:15,720 that we've seen before. 633 00:35:15,720 --> 00:35:17,360 And that's not known to be the case 634 00:35:17,360 --> 00:35:20,040 and not believed to be the case. 635 00:35:20,040 --> 00:35:25,960 So we don't expect that a coNP-complete problem 636 00:35:25,960 --> 00:35:27,670 is going to be in BPP. 637 00:35:27,670 --> 00:35:32,930 That would be an amazing and surprising result. 638 00:35:32,930 --> 00:35:40,680 So because I hope I made it clear in my previous discussion 639 00:35:40,680 --> 00:35:44,400 that BPP, from a practical standpoint, 640 00:35:44,400 --> 00:35:48,660 is very close to being like P because you can make the error 641 00:35:48,660 --> 00:35:53,970 probability of the machine so incredibly low that it's 642 00:35:53,970 --> 00:35:55,020 a comparable-- 643 00:35:57,560 --> 00:36:00,470 if you run the machine and the error probability 644 00:36:00,470 --> 00:36:04,910 is like 1 over googol, then it's sort of even smaller 645 00:36:04,910 --> 00:36:07,250 than the probability that some alpha particle 646 00:36:07,250 --> 00:36:12,590 came in and flipped the value of some internal memory 647 00:36:12,590 --> 00:36:14,947 cell in your computation.
648 00:36:14,947 --> 00:36:17,030 So if you have an extremely low error probability, 649 00:36:17,030 --> 00:36:21,590 it's pretty good from a practical standpoint. 650 00:36:21,590 --> 00:36:26,650 So it would be amazing if NP problems were solvable in BPP. 651 00:36:26,650 --> 00:36:28,780 So this is not the language we're 652 00:36:28,780 --> 00:36:30,020 going to use as our example. 653 00:36:30,020 --> 00:36:32,590 We're going to look at a related, restricted version 654 00:36:32,590 --> 00:36:34,450 of this problem about equivalence 655 00:36:34,450 --> 00:36:35,800 for branching programs. 656 00:36:35,800 --> 00:36:38,210 And that, I'm going to introduce right now. 657 00:36:38,210 --> 00:36:40,476 OK, any questions here? 658 00:36:40,476 --> 00:36:42,080 I don't see any questions. 659 00:36:44,702 --> 00:36:47,750 I'm fading out. 660 00:36:47,750 --> 00:36:51,870 OK, so let's move on. 661 00:36:54,610 --> 00:36:57,180 So we're going to talk about branching programs 662 00:36:57,180 --> 00:36:59,940 that are what are called read-once, 663 00:36:59,940 --> 00:37:02,260 read-once branching programs. 664 00:37:02,260 --> 00:37:05,130 And those are simply branching programs 665 00:37:05,130 --> 00:37:10,350 that are not allowed to reread an input that they've 666 00:37:10,350 --> 00:37:13,180 previously read. 667 00:37:13,180 --> 00:37:16,950 So for example, is this branching program 668 00:37:16,950 --> 00:37:19,490 a read-once branching program? 669 00:37:19,490 --> 00:37:20,530 No. 670 00:37:20,530 --> 00:37:23,030 This branching program is not a read-once branching program 671 00:37:23,030 --> 00:37:29,760 because you can find a path that's 672 00:37:29,760 --> 00:37:33,340 going to cause you to read the same variable more than once. 673 00:37:33,340 --> 00:37:35,910 So it's not going to be a read-once. 
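[Editor's sketch] The read-once condition described above, that no path reads the same variable more than once, can be tested mechanically: for each query node reachable from the start, look at everything reachable below it and check that no other query node carries the same variable. This is a sketch under an assumed representation; the dictionary layout, the function name is_read_once, and both example graphs are hypothetical and are not the programs on the lecture slide.

```python
def is_read_once(bp):
    """Return True iff no directed path from the start queries a variable twice."""
    nodes = bp["nodes"]

    def reachable_from(name):
        # Plain depth-first search over the 0- and 1-edges.
        seen, stack = set(), [name]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if isinstance(nodes[current], tuple):  # query node
                _, if_zero, if_one = nodes[current]
                stack.extend([if_zero, if_one])
        return seen

    for name in reachable_from(bp["start"]):
        node = nodes[name]
        if not isinstance(node, tuple):
            continue  # output nodes read nothing
        variable, if_zero, if_one = node
        below = reachable_from(if_zero) | reachable_from(if_one)
        for other in below:
            other_node = nodes[other]
            if isinstance(other_node, tuple) and other_node[0] == variable:
                return False  # some path reads `variable` twice
    return True


# Hypothetical example: after x1 = 1 and x3 = 0 this program queries x1 again.
REREADS_X1 = {"start": "a", "nodes": {"a": ("x1", "b", "c"),
                                      "b": ("x2", "zero", "one"),
                                      "c": ("x3", "d", "one"),
                                      "d": ("x1", "zero", "one"),
                                      "zero": 0, "one": 1}}

# Hypothetical example: x2 appears on two nodes, but never twice on one path.
READ_ONCE = {"start": "a", "nodes": {"a": ("x1", "b", "c"),
                                     "b": ("x2", "zero", "one"),
                                     "c": ("x2", "one", "zero"),
                                     "zero": 0, "one": 1}}

print(is_read_once(REREADS_X1))
print(is_read_once(READ_ONCE))
```

The second example shows that a variable may label several nodes in a read-once program, as long as those nodes never lie on a common path.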
674 00:37:35,910 --> 00:37:40,200 So over here, it's not read-once because there's two occurrences 675 00:37:40,200 --> 00:37:43,320 of an x1 on the same branch. 676 00:37:43,320 --> 00:37:44,820 Now you might ask, why would anybody 677 00:37:44,820 --> 00:37:47,320 want to do that because you've already read the value of x1? 678 00:37:47,320 --> 00:37:51,563 Well, in the case of this particular branching program, 679 00:37:51,563 --> 00:37:52,980 there might be a value because you 680 00:37:52,980 --> 00:37:57,420 could have got to this x1 branch by going this way or that way. 681 00:38:00,120 --> 00:38:02,420 But that's a separate question. 682 00:38:02,420 --> 00:38:06,680 If we restrict our attention to read-once branching programs, 683 00:38:06,680 --> 00:38:08,780 then the problem of testing equivalence 684 00:38:08,780 --> 00:38:13,620 becomes very different in character. 685 00:38:13,620 --> 00:38:19,440 And in fact, we're going to give a probabilistic algorithm-- 686 00:38:19,440 --> 00:38:22,300 a BPP algorithm to solve that problem. 687 00:38:22,300 --> 00:38:26,220 So the equivalence problem for read-once branching programs, 688 00:38:26,220 --> 00:38:28,050 which are not allowed to reread variables 689 00:38:28,050 --> 00:38:31,720 on any path, that's interestingly 690 00:38:31,720 --> 00:38:36,070 going to be solvable with a probabilistic polynomial time 691 00:38:36,070 --> 00:38:41,132 algorithm with a small error probability. 692 00:38:41,132 --> 00:38:42,590 So I'm going to run a check-in now. 693 00:38:42,590 --> 00:38:44,548 But let's make sure we're all together on this. 694 00:38:44,548 --> 00:38:46,190 So I've got a good question here. 695 00:38:46,190 --> 00:38:49,310 Can every Boolean function be described 696 00:38:49,310 --> 00:38:50,360 by a branching program? 697 00:38:50,360 --> 00:38:51,290 Yes. 698 00:38:51,290 --> 00:38:52,700 That's an easy exercise. 
699 00:38:52,700 --> 00:38:54,800 But you can make-- 700 00:38:54,800 --> 00:38:59,210 branching programs are-- they may be large, 701 00:38:59,210 --> 00:39:01,940 but you can describe any Boolean function 702 00:39:01,940 --> 00:39:04,130 with some branching program. 703 00:39:04,130 --> 00:39:07,270 That's not hard to show. 704 00:39:07,270 --> 00:39:08,770 Other question-- are we all together 705 00:39:08,770 --> 00:39:10,878 on understanding what read-once means, 706 00:39:10,878 --> 00:39:12,670 and branching programs, and all that stuff? 707 00:39:12,670 --> 00:39:14,378 This is a good time to ask if you're not. 708 00:39:17,230 --> 00:39:18,470 OK, so let's do the check-in. 709 00:39:21,260 --> 00:39:25,322 So as I pointed out, we will show 710 00:39:25,322 --> 00:39:27,530 that the equivalence for read-once branching programs 711 00:39:27,530 --> 00:39:29,750 is solvable in BPP. 712 00:39:29,750 --> 00:39:33,470 Can we use that to solve the general case for branching 713 00:39:33,470 --> 00:39:36,920 programs by converting general branching programs 714 00:39:36,920 --> 00:39:39,080 to read-once branching programs and then 715 00:39:39,080 --> 00:39:41,720 running the read-once test? 716 00:39:41,720 --> 00:39:46,300 So what do you think? 717 00:39:46,300 --> 00:39:51,100 OK, I'm seeing a lot of correct answers here. 718 00:39:51,100 --> 00:39:53,095 So let's wrap this one up quickly. 719 00:39:58,120 --> 00:39:59,530 So another 10 seconds, please. 720 00:40:06,860 --> 00:40:08,060 OK? 721 00:40:08,060 --> 00:40:09,800 Are we all ready? 722 00:40:09,800 --> 00:40:11,105 1, 2, 3, closing. 723 00:40:17,410 --> 00:40:19,330 All right, yes. 724 00:40:19,330 --> 00:40:24,610 Most of you have answered correctly.
725 00:40:24,610 --> 00:40:27,580 Well, answer A is not a very good answer because we already 726 00:40:27,580 --> 00:40:30,310 commented on the previous slide that we don't know 727 00:40:30,310 --> 00:40:33,550 how to do the general case in BPP, 728 00:40:33,550 --> 00:40:36,040 so it would be kind of surprising if, right here, I'm 729 00:40:36,040 --> 00:40:42,010 saying, yes, we could do it by using the restricted case. 730 00:40:42,010 --> 00:40:48,920 So I think a better answer would be 731 00:40:48,920 --> 00:40:52,520 to one of the no's, but as I did comment, 732 00:40:52,520 --> 00:40:53,810 you can always convert-- 733 00:40:53,810 --> 00:40:59,588 you can always do any Boolean function with a-- 734 00:40:59,588 --> 00:41:02,130 well, maybe I didn't say it for read-once branching programs. 735 00:41:02,130 --> 00:41:03,890 But even read-once branching programs 736 00:41:03,890 --> 00:41:06,720 can compute any Boolean function. 737 00:41:06,720 --> 00:41:09,840 So the conversion is possible but, in general, 738 00:41:09,840 --> 00:41:11,270 will not be polynomial time. 739 00:41:11,270 --> 00:41:14,570 And if you imagine even trying to do the conversion over here, 740 00:41:14,570 --> 00:41:18,080 you could convert this branching program to read-once, 741 00:41:18,080 --> 00:41:20,945 but you'd have to basically separate the two. 742 00:41:23,450 --> 00:41:25,340 You know, instead of rereading the x1, 743 00:41:25,340 --> 00:41:27,960 you could remember that x1 value. 744 00:41:27,960 --> 00:41:30,900 But then you would not be-- you couldn't converge over here. 745 00:41:30,900 --> 00:41:34,460 You'd have to keep those two threads of the-- those two 746 00:41:34,460 --> 00:41:36,830 branches of the computation apart-- 747 00:41:36,830 --> 00:41:38,660 those two paths apart from one another. 748 00:41:38,660 --> 00:41:40,400 And already, the branching program 749 00:41:40,400 --> 00:41:43,770 would start to increase in size by doing that. 
750 00:41:43,770 --> 00:41:49,550 And so, in general, converting is possible, 751 00:41:49,550 --> 00:41:53,660 but it requires a big increase in the size. 752 00:41:53,660 --> 00:41:56,960 And then it will not allow a polynomial time algorithm 753 00:41:56,960 --> 00:42:02,690 anymore, even in the probabilistic case. 754 00:42:02,690 --> 00:42:07,400 OK, so now, let's start to look at the possibility 755 00:42:07,400 --> 00:42:12,410 of showing that this equivalence problem is solvable in BPP. 756 00:42:12,410 --> 00:42:15,860 And it's going to take us in a kind of a strange direction, 757 00:42:15,860 --> 00:42:18,590 but let's try to get our intuition going first 758 00:42:18,590 --> 00:42:24,030 by doing something which seems like the most obvious approach. 759 00:42:24,030 --> 00:42:32,290 So here, so we're going to give an algorithm now, which 760 00:42:32,290 --> 00:42:34,270 is going to be an attempt. 761 00:42:34,270 --> 00:42:36,400 This is not going to work, but nevertheless, it's 762 00:42:36,400 --> 00:42:40,530 going to have the germ of the right idea or not the germ 763 00:42:40,530 --> 00:42:43,930 but the beginning of the right way to think about it. 764 00:42:43,930 --> 00:42:46,680 So here are the two read-once branching programs, 765 00:42:46,680 --> 00:42:49,160 B1 and B2. 766 00:42:49,160 --> 00:42:51,680 And I want to see, do they compute 767 00:42:51,680 --> 00:42:54,780 the same function or not? 768 00:42:54,780 --> 00:42:57,170 So one thing you might try is just 769 00:42:57,170 --> 00:43:02,300 running them on a bunch of randomly selected assignments 770 00:43:02,300 --> 00:43:04,640 or inputs. 771 00:43:04,640 --> 00:43:09,140 So you can just take a random input assignment. 772 00:43:09,140 --> 00:43:14,300 Just take x1, flip a coin to say it's 1 or 0. 773 00:43:14,300 --> 00:43:17,600 Do the same for x2 and so on. 774 00:43:17,600 --> 00:43:19,880 Then you get some input assignment.
775 00:43:19,880 --> 00:43:22,820 You run the two branching programs on that assignment. 776 00:43:22,820 --> 00:43:24,260 And maybe that doesn't give-- even 777 00:43:24,260 --> 00:43:26,510 if they agree, it doesn't give you a lot of confidence 778 00:43:26,510 --> 00:43:30,070 that you get the right answer, that they're really equivalent. 779 00:43:30,070 --> 00:43:32,150 So you do it 100 times, whatever-- 780 00:43:32,150 --> 00:43:34,580 some number of times, call it k. 781 00:43:34,580 --> 00:43:42,423 And of course, if they ever disagree 782 00:43:42,423 --> 00:43:43,965 on one of those assignments, then you 783 00:43:43,965 --> 00:43:48,630 know they're not equivalent, and you can immediately reject. 784 00:43:48,630 --> 00:43:52,500 But what I'd like to say is if they 785 00:43:52,500 --> 00:44:02,300 agree on those 100 tries or those 100 assignments there, 786 00:44:02,300 --> 00:44:09,990 then they are, at least I haven't found a place 787 00:44:09,990 --> 00:44:12,180 where they disagree, so I'm going 788 00:44:12,180 --> 00:44:15,540 to say that they're equivalent. 789 00:44:15,540 --> 00:44:18,460 Is that a reasonable thing to do? 790 00:44:18,460 --> 00:44:20,190 Well, it might be. 791 00:44:20,190 --> 00:44:22,830 It depends on k. 792 00:44:22,830 --> 00:44:26,070 So the critical thing is, what value of k 793 00:44:26,070 --> 00:44:29,100 should you pick which is going to be big enough 794 00:44:29,100 --> 00:44:32,250 to allow us to draw the conclusion that if you run it 795 00:44:32,250 --> 00:44:34,680 for k times, and you never see a difference, 796 00:44:34,680 --> 00:44:39,630 then you can conclude, with good confidence, 797 00:44:39,630 --> 00:44:43,230 that the two branching programs are equivalent because you 798 00:44:43,230 --> 00:44:46,800 tried to look for a difference, and you never found one. 799 00:44:46,800 --> 00:44:52,740 Well, the thing is that k is going to have to be pretty big.
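[Editor's sketch] The attempt described above, sample k random Boolean assignments, reject on any disagreement, and accept otherwise, looks roughly like this in Python. This is the attempt the lecture says does not work for any polynomial k, shown only to make the procedure concrete. The dictionary layout, the function names, and the tiny example programs are assumptions made for illustration.

```python
import random


def evaluate(bp, assignment):
    """Follow the path selected by the assignment; return the output label (0 or 1)."""
    node = bp["nodes"][bp["start"]]
    while isinstance(node, tuple):  # query node: (variable, child_if_0, child_if_1)
        variable, if_zero, if_one = node
        node = bp["nodes"][if_one if assignment[variable] == 1 else if_zero]
    return node


def naive_equivalence_test(bp1, bp2, variables, k, rng=random):
    """Accept (True) unless one of k random assignments exposes a disagreement."""
    for _ in range(k):
        # Flip a fair coin for each variable.
        assignment = {v: rng.randint(0, 1) for v in variables}
        if evaluate(bp1, assignment) != evaluate(bp2, assignment):
            return False  # a witness of inequivalence: reject immediately
    return True  # never saw a difference, so accept (possibly wrongly)


# Hypothetical programs: IDENTITY outputs x1, NEGATION outputs NOT x1.
IDENTITY = {"start": "a", "nodes": {"a": ("x1", "zero", "one"), "zero": 0, "one": 1}}
NEGATION = {"start": "a", "nodes": {"a": ("x1", "one", "zero"), "zero": 0, "one": 1}}

rng = random.Random(0)
print(naive_equivalence_test(IDENTITY, IDENTITY, ["x1"], k=100, rng=rng))
print(naive_equivalence_test(IDENTITY, NEGATION, ["x1"], k=1, rng=rng))
```

Equivalent programs are always accepted, so the error is one-sided. The examples here disagree on every input, which is the easy case; the hard case is two programs that disagree on very few inputs.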
800 00:44:55,690 --> 00:44:59,460 So looking at it this way, if the two branching programs were 801 00:44:59,460 --> 00:45:03,640 equivalent, then, certainly, they're 802 00:45:03,640 --> 00:45:05,380 always going to give the same value. 803 00:45:07,900 --> 00:45:09,930 So the probability that the machine accepts 804 00:45:09,930 --> 00:45:10,860 is going to be one. 805 00:45:10,860 --> 00:45:14,070 And that's good because we want for-- 806 00:45:14,070 --> 00:45:15,820 this is a case when we're in the language. 807 00:45:15,820 --> 00:45:18,100 We want the probability of acceptance to be high. 808 00:45:18,100 --> 00:45:21,760 And here, the probability of acceptance is actually 1. 809 00:45:21,760 --> 00:45:25,540 So it's always going to accept when the two branching programs 810 00:45:25,540 --> 00:45:26,613 were equivalent. 811 00:45:26,613 --> 00:45:28,780 But what happens when the branching programs are not 812 00:45:28,780 --> 00:45:29,870 equivalent? 813 00:45:29,870 --> 00:45:32,290 Now, we want the probability of rejection to be high. 814 00:45:32,290 --> 00:45:37,540 The probability of acceptance should be very low, right? 815 00:45:37,540 --> 00:45:40,660 So if they're not equivalent, what we want-- the probability 816 00:45:40,660 --> 00:45:42,910 that the machine rejects is going 817 00:45:42,910 --> 00:45:46,720 to be high if they're not equivalent because that's 818 00:45:46,720 --> 00:45:48,520 what the correct answer is. 819 00:45:48,520 --> 00:45:52,130 Well, the only way the machine is going to reject-- 820 00:45:52,130 --> 00:45:56,090 is if it finds a place where the two 821 00:45:56,090 --> 00:45:59,540 branching programs disagree. 822 00:45:59,540 --> 00:46:02,540 But those two branching programs, 823 00:46:02,540 --> 00:46:08,340 even though not equivalent, might disagree rarely. 824 00:46:08,340 --> 00:46:11,850 They might only disagree on one input assignment out of the 2 825 00:46:11,850 --> 00:46:15,020 to the n possibilities.
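[Editor's note] To put a number on the worst case just described, suppose the two programs disagree on exactly one of the 2^n assignments and the test draws k independent uniform samples. The symbols n and k follow the lecture; the 1/3 error target is an assumption taken from the usual BPP convention. A short calculation using Bernoulli's inequality:

```latex
\begin{align*}
\Pr[\text{one sample hits the disagreement}] &= 2^{-n}, \\
\Pr[\text{all } k \text{ samples miss it}] &= \left(1 - 2^{-n}\right)^{k} \;\ge\; 1 - k \cdot 2^{-n}.
\end{align*}
% The test wrongly accepts exactly when every sample misses, so forcing
% the wrong-accept probability down to at most 1/3 requires
\[
1 - k \cdot 2^{-n} \le \tfrac{1}{3}
\quad\Longrightarrow\quad
k \ge \tfrac{2}{3} \cdot 2^{n}.
\]
```

So k would have to grow exponentially in n, which is more samples than a polynomial time machine can draw.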
826 00:46:15,020 --> 00:46:17,290 So these two inequivalent branching programs 827 00:46:17,290 --> 00:46:20,350 might agree almost everywhere, just except at one place. 828 00:46:20,350 --> 00:46:22,850 And then that's enough for them not to be equivalent. 829 00:46:22,850 --> 00:46:24,700 But the problem is that if you're just 830 00:46:24,700 --> 00:46:29,230 going to do random sampling, the likelihood 831 00:46:29,230 --> 00:46:33,040 of finding that one exceptional place where the two disagree 832 00:46:33,040 --> 00:46:33,737 is very low. 833 00:46:33,737 --> 00:46:36,070 You're going to have to do an enormous number of samples 834 00:46:36,070 --> 00:46:43,100 before you're likely to find that point of difference. 835 00:46:43,100 --> 00:46:49,360 And so in order to be confident that you're 836 00:46:49,360 --> 00:46:51,670 going to find that difference, if there is one, 837 00:46:51,670 --> 00:46:54,250 you're going to have to do exponentially many samples. 838 00:46:54,250 --> 00:46:56,800 And you don't have time to do that with a polynomial time 839 00:46:56,800 --> 00:46:57,798 algorithm. 840 00:47:02,318 --> 00:47:04,360 You're just going to have to flip too many coins. 841 00:47:04,360 --> 00:47:08,980 You have to run too many different samples, 842 00:47:08,980 --> 00:47:12,190 different assignments through these two machines. 843 00:47:12,190 --> 00:47:18,470 And because they're different, but they're almost the same. 844 00:47:18,470 --> 00:47:21,130 So we're going to need to find a different method. 845 00:47:21,130 --> 00:47:26,300 And the idea is we're going to run 846 00:47:26,300 --> 00:47:32,140 these two branching programs in some crazy way. 847 00:47:32,140 --> 00:47:34,570 Instead of running them on 0's and 1's as we've 848 00:47:34,570 --> 00:47:36,130 been doing so far, we're going 849 00:47:36,130 --> 00:47:39,640 to feed in values for the variables which 850 00:47:39,640 --> 00:47:41,320 are non-Boolean.
851 00:47:41,320 --> 00:47:42,760 They are going to be-- 852 00:47:42,760 --> 00:47:50,950 we're going to set x1 to 2, x3 to 7, x4 to 15. 853 00:47:50,950 --> 00:47:53,960 Of course, that doesn't seem to make any sense. 854 00:47:53,960 --> 00:47:56,620 But it's nevertheless going to turn out to be a useful thing 855 00:47:56,620 --> 00:47:58,495 to do, and it's going to give us some insight 856 00:47:58,495 --> 00:48:00,178 into the equivalence or inequivalence 857 00:48:00,178 --> 00:48:01,345 of these branching programs. 858 00:48:04,230 --> 00:48:06,660 OK, so let's just-- 859 00:48:06,660 --> 00:48:07,980 I think-- are we at the end of? 860 00:48:07,980 --> 00:48:09,430 Yeah, we're at the break here. 861 00:48:09,430 --> 00:48:12,970 So I'm getting some questions coming in, which is great. 862 00:48:12,970 --> 00:48:14,220 I will answer those questions. 863 00:48:14,220 --> 00:48:18,600 But why don't I start off our break and then-- 864 00:48:23,295 --> 00:48:23,795 OK. 865 00:48:29,880 --> 00:48:35,370 OK, so there's a question about whether this machine runs 866 00:48:35,370 --> 00:48:36,720 deterministically or not. 867 00:48:36,720 --> 00:48:39,220 So which machine are we talking about? 868 00:48:39,220 --> 00:48:41,760 So the branching programs themselves, 869 00:48:41,760 --> 00:48:43,050 they run deterministically. 870 00:48:43,050 --> 00:48:47,370 You give them an assignment to the input variables 871 00:48:47,370 --> 00:48:51,600 that's going to determine a path through each branching program, 872 00:48:51,600 --> 00:48:54,420 which is eventually going to output a 0 or a 1. 873 00:48:54,420 --> 00:48:57,210 And you want to know, do those two branching programs always 874 00:48:57,210 --> 00:49:00,240 give the same value no matter what the input was? 875 00:49:00,240 --> 00:49:04,950 But the branching programs themselves were deterministic.
876 00:49:04,950 --> 00:49:08,210 Now the machine that's trying to make 877 00:49:08,210 --> 00:49:11,120 the determination of whether those two branching programs 878 00:49:11,120 --> 00:49:14,360 are equivalent, that machine that we're going to be arguing 879 00:49:14,360 --> 00:49:16,760 is going to be a probabilistic machine. 880 00:49:16,760 --> 00:49:18,890 So it's a kind of non-deterministic machine 881 00:49:18,890 --> 00:49:21,950 that's going to have different possible ways to go depending 882 00:49:21,950 --> 00:49:25,540 upon the outcome of its coin tosses. 883 00:49:25,540 --> 00:49:28,710 So you can think of it as non-determinism 884 00:49:28,710 --> 00:49:33,920 in the ordinary sense, that it has a tree of possibilities. 885 00:49:33,920 --> 00:49:39,290 But now the way we're thinking about acceptance is different. 886 00:49:39,290 --> 00:49:43,610 Instead of accepting if there's just one accept branch, 887 00:49:43,610 --> 00:49:46,190 the machine, for it to accept, has 888 00:49:46,190 --> 00:49:50,240 to have a majority of the branches be accepting. 889 00:49:50,240 --> 00:49:56,600 And so there's some similarities but some differences 890 00:49:56,600 --> 00:49:59,640 with the usual way we think of non-determinism. 891 00:49:59,640 --> 00:50:01,790 So what's the motivation behind introducing 892 00:50:01,790 --> 00:50:03,530 this type of Turing machine? 893 00:50:03,530 --> 00:50:09,050 Well, I mean, I guess there are two motivations. 894 00:50:12,230 --> 00:50:15,020 Probabilistic algorithms, sometimes called Monte Carlo 895 00:50:15,020 --> 00:50:18,650 algorithms, turn out to be useful in practice 896 00:50:18,650 --> 00:50:20,310 for a variety of things. 897 00:50:20,310 --> 00:50:26,540 And so that led people to think about them 898 00:50:26,540 --> 00:50:29,120 in the context of complexity.
899 00:50:29,120 --> 00:50:31,470 They're related in some ways to quantum computers, 900 00:50:31,470 --> 00:50:35,060 which are also probabilistic in a somewhat different way. 901 00:50:35,060 --> 00:50:44,640 But they also have a very nice formulation 902 00:50:44,640 --> 00:50:46,270 in complexity theory. 903 00:50:46,270 --> 00:50:48,600 So complexity theorists like to think 904 00:50:48,600 --> 00:50:51,270 about probabilistic computation because, I mean, 905 00:50:51,270 --> 00:50:53,880 you can do interesting things with probabilistic machines. 906 00:50:53,880 --> 00:50:55,740 And the complexity classes associated 907 00:50:55,740 --> 00:50:56,890 are also interesting. 908 00:50:56,890 --> 00:51:00,330 So as you'll see, it leads us in an interesting direction 909 00:51:00,330 --> 00:51:03,130 to consider how to solve this problem, 910 00:51:03,130 --> 00:51:06,810 this read-once branching program problem equivalence 911 00:51:06,810 --> 00:51:08,040 with a probabilistic machine. 912 00:51:08,040 --> 00:51:10,565 It's just an interesting algorithm 913 00:51:10,565 --> 00:51:11,940 that we're going to come up with. 914 00:51:19,200 --> 00:51:21,390 So in our proof attempt, where did we 915 00:51:21,390 --> 00:51:24,120 use the probabilistic nature for BPP? 916 00:51:24,120 --> 00:51:26,250 Because we're running the two branching programs 917 00:51:26,250 --> 00:51:27,390 on a random input. 918 00:51:32,720 --> 00:51:35,660 I mean, so you have your two branching programs. 919 00:51:35,660 --> 00:51:39,440 You pick a random input to run those two branching programs, 920 00:51:39,440 --> 00:51:41,420 and you see what they do. 921 00:51:41,420 --> 00:51:43,220 That's why it's probabilistic. 922 00:51:43,220 --> 00:51:47,606 When you think about random behavior of the machine, 923 00:51:47,606 --> 00:51:49,680 that's a probabilistic machine. 
924 00:51:49,680 --> 00:51:53,900 So each branch of the machine is going 925 00:51:53,900 --> 00:51:56,750 to be like the way we normally think about non-determinism. 926 00:51:56,750 --> 00:51:59,392 Somebody is asking whether we think 927 00:51:59,392 --> 00:52:01,100 of the complexity of the machine in terms 928 00:52:01,100 --> 00:52:05,780 of all of the branches of the machine or each branch 929 00:52:05,780 --> 00:52:06,470 separately. 930 00:52:06,470 --> 00:52:09,140 We always think about, for non-deterministic machines, 931 00:52:09,140 --> 00:52:11,665 each branch separately. 932 00:52:11,665 --> 00:52:12,540 I'm not totally sure 933 00:52:12,540 --> 00:52:13,873 I understand the question there. 934 00:52:13,873 --> 00:52:16,280 So are all the inputs built-in, and we randomly 935 00:52:16,280 --> 00:52:19,548 choose one through coin flips? 936 00:52:19,548 --> 00:52:21,340 Not sure I understand that question either. 937 00:52:21,340 --> 00:52:25,070 We're given as input the two branching programs. 938 00:52:25,070 --> 00:52:30,670 And then we flip coins using our non-determinism-- 939 00:52:30,670 --> 00:52:32,420 you can think about it equivalently in terms 940 00:52:32,420 --> 00:52:33,440 of coin flips-- 941 00:52:33,440 --> 00:52:36,060 to choose the values of the variables. 942 00:52:36,060 --> 00:52:38,480 So now, we have an assignment 943 00:52:38,480 --> 00:52:41,420 of values to the variables. 944 00:52:41,420 --> 00:52:47,540 And we use that as input to the branching programs 945 00:52:47,540 --> 00:52:49,740 to see whether they-- 946 00:52:49,740 --> 00:52:51,860 to see what answers they give, and, in particular, 947 00:52:51,860 --> 00:52:53,300 whether they give the same answer 948 00:52:53,300 --> 00:52:57,030 on that randomly-chosen input. 949 00:52:57,030 --> 00:52:57,720 Let's move on.
950 00:53:01,480 --> 00:53:09,760 All right, so now moving us toward the actual BPP algorithm 951 00:53:09,760 --> 00:53:17,590 for read-once branching program equivalence testing, 952 00:53:17,590 --> 00:53:21,315 we have to think about a different way to-- 953 00:53:24,610 --> 00:53:28,450 we need an alternate way of thinking about the computation 954 00:53:28,450 --> 00:53:30,940 of a branching program. 955 00:53:30,940 --> 00:53:33,160 It's going to look very similar, but it's 956 00:53:33,160 --> 00:53:34,810 going to lead us in a direction that's 957 00:53:34,810 --> 00:53:39,760 going to allow us to talk about these non-Boolean inputs 958 00:53:39,760 --> 00:53:41,500 that I referred to. 959 00:53:41,500 --> 00:53:43,180 Just kind of where we're going-- we're 960 00:53:43,180 --> 00:53:45,040 going to be simulating branching programs 961 00:53:45,040 --> 00:53:50,080 with polynomials, if that helps you as an overarching plan. 962 00:53:50,080 --> 00:53:52,900 But we'll get there a little slowly. 963 00:53:52,900 --> 00:53:58,150 So OK, here's a read-once branching program. 964 00:53:58,150 --> 00:54:01,840 We're not going to use the read-once feature just yet, 965 00:54:01,840 --> 00:54:03,530 but that'll come later. 966 00:54:03,530 --> 00:54:05,155 But anyway, here's a branching program. 967 00:54:12,710 --> 00:54:14,760 Oh, here's my branch and my-- 968 00:54:14,760 --> 00:54:16,350 I crashed here. 969 00:54:16,350 --> 00:54:17,660 I'll start that again. 970 00:54:22,720 --> 00:54:27,990 OK, so we take an input, whatever it is, 971 00:54:27,990 --> 00:54:31,002 and thinking about the computation of the branching-- 972 00:54:31,002 --> 00:54:33,210 so we're not thinking about the algorithm, right now. 973 00:54:33,210 --> 00:54:35,043 We're just thinking about branching programs 974 00:54:35,043 --> 00:54:35,880 for the minute. 975 00:54:35,880 --> 00:54:39,860 We're going to get back to the algorithm later. 
976 00:54:39,860 --> 00:54:43,870 So the branching program follows a path, as I indicated, 977 00:54:43,870 --> 00:54:45,340 when you have a particular input. 978 00:54:45,340 --> 00:54:51,310 If x1 is 0, x2 is 1, x3 is 1, so the output 979 00:54:51,310 --> 00:54:53,270 is going to be 1 in this case. 980 00:54:53,270 --> 00:54:53,770 OK? 981 00:54:56,580 --> 00:55:02,040 So the way I want to think about this a little differently is I 982 00:55:02,040 --> 00:55:07,170 want to label all of the nodes and all of the edges 983 00:55:07,170 --> 00:55:11,130 with a value that tells me whether or not 984 00:55:11,130 --> 00:55:15,590 this yellow path went through that node or edge. 985 00:55:15,590 --> 00:55:17,540 It's going to be just doing the same, 986 00:55:17,540 --> 00:55:21,370 but you may think this is no difference at all. 987 00:55:21,370 --> 00:55:25,200 But I want to label all of the things on the yellow path, 988 00:55:25,200 --> 00:55:27,300 I'm going to label them with a 1. 989 00:55:27,300 --> 00:55:30,120 And all of the things that are not on the yellow path, 990 00:55:30,120 --> 00:55:33,340 I'm going to label with a 0. 991 00:55:33,340 --> 00:55:35,860 So I'm trying to keep those labels apart 992 00:55:35,860 --> 00:55:38,525 from the original branching program, which 993 00:55:38,525 --> 00:55:39,400 are written in white. 994 00:55:39,400 --> 00:55:41,990 These labels are written in yellow. 995 00:55:41,990 --> 00:55:44,120 But these labels have to do with the execution 996 00:55:44,120 --> 00:55:48,050 of the branching program on an input. 997 00:55:48,050 --> 00:55:52,100 So once I have an input, that's going to determine a 1 or a 0 998 00:55:52,100 --> 00:55:55,070 label for every node and edge. 
999 00:56:00,130 --> 00:56:03,007 Now if we want to look at the output 1000 00:56:03,007 --> 00:56:05,340 from this branching program after we have that labeling, 1001 00:56:05,340 --> 00:56:09,000 we only have to look at the label of the one output node 1002 00:56:09,000 --> 00:56:11,790 because if that one has a 1 on it, 1003 00:56:11,790 --> 00:56:14,310 that means that the path went through that 1. 1004 00:56:14,310 --> 00:56:18,600 And so therefore, the output is 1. 1005 00:56:23,310 --> 00:56:26,820 So I'm going to give you another way of assigning that. 1006 00:56:26,820 --> 00:56:30,060 Instead of just finding the path first 1007 00:56:30,060 --> 00:56:32,267 and then coming up with the labeling afterward, 1008 00:56:32,267 --> 00:56:34,350 I'm going to give you a different way of coming up 1009 00:56:34,350 --> 00:56:37,140 with that labeling, kind of building it up inductively, 1010 00:56:37,140 --> 00:56:41,550 starting at the start node and building up that labeling. 1011 00:56:41,550 --> 00:56:45,190 You'll see what I mean by my example. 1012 00:56:45,190 --> 00:56:49,040 So if I have a label on this node, 1013 00:56:49,040 --> 00:56:52,480 so I already know whether or not the path went 1014 00:56:52,480 --> 00:56:56,740 through that node, label 1 means the path went through it. 1015 00:56:56,740 --> 00:56:59,830 Label 0 means the path does not go through it. 1016 00:57:03,720 --> 00:57:08,060 That's going to tell me how to label the two outgoing edges. 1017 00:57:08,060 --> 00:57:09,990 So if I've already labeled this with a, 1018 00:57:09,990 --> 00:57:16,070 where a is a 0 or a 1, then what expression should I use? 1019 00:57:19,640 --> 00:57:22,250 Under what circumstances will I label-- 1020 00:57:22,250 --> 00:57:26,940 what's the right label for this one outgoing edge here? 1021 00:57:26,940 --> 00:57:31,780 Well, if a is 0, that means we know the path did not 1022 00:57:31,780 --> 00:57:33,410 go through this node.
1023 00:57:33,410 --> 00:57:36,190 So there's no way it could go through that edge. 1024 00:57:36,190 --> 00:57:42,400 Similarly, if xi is a 0, that means, 1025 00:57:42,400 --> 00:57:44,320 even if we did go through that node, 1026 00:57:44,320 --> 00:57:48,700 the path would go through the other outgoing edge and not 1027 00:57:48,700 --> 00:57:49,640 through this one. 1028 00:57:49,640 --> 00:57:53,860 So that tells us that the Boolean expression which 1029 00:57:53,860 --> 00:57:59,080 describes the label of this edge in the execution 1030 00:57:59,080 --> 00:58:03,610 is going to be the and of the value on the node 1031 00:58:03,610 --> 00:58:08,080 and the query variable of that node. 1032 00:58:08,080 --> 00:58:10,720 Now think about, what's the right way 1033 00:58:10,720 --> 00:58:14,380 to label the other edge, the execution 1034 00:58:14,380 --> 00:58:17,110 value of the other edge? 1035 00:58:17,110 --> 00:58:18,400 Again, you have to have-- 1036 00:58:18,400 --> 00:58:20,890 go through this node, so a has to be 1. 1037 00:58:20,890 --> 00:58:24,790 But now, you want xi to be 0 in order to go through that edge. 1038 00:58:24,790 --> 00:58:28,000 So that means it's going to be a and the complement of xi. 1039 00:58:30,390 --> 00:58:30,890 OK? 1040 00:58:30,890 --> 00:58:32,790 So this is going to just tell me this. 1041 00:58:32,790 --> 00:58:38,530 I'm writing a formula for how we're labeling these edges based 1042 00:58:38,530 --> 00:58:41,500 on the label of the parent node. 1043 00:58:41,500 --> 00:58:45,640 Similarly, if I have a bunch of edges where I already 1044 00:58:45,640 --> 00:58:50,740 know the values, the labels, the execution labels there, 1045 00:58:50,740 --> 00:58:53,800 let's say, so I have a1, a2, and a3, what is the right label 1046 00:58:53,800 --> 00:58:55,030 to put on this node? 1047 00:58:58,550 --> 00:59:00,710 Well, if any one of those is a 1, 1048 00:59:00,710 --> 00:59:03,750 that means the path went through that edge.
1049 00:59:03,750 --> 00:59:06,360 And so therefore, it's going to go through that node. 1050 00:59:06,360 --> 00:59:09,470 So that tells us that the label to put on that node 1051 00:59:09,470 --> 00:59:14,900 is the or of the labels on the incoming edges. 1052 00:59:14,900 --> 00:59:17,060 OK? 1053 00:59:17,060 --> 00:59:18,050 Questions on this? 1054 00:59:21,270 --> 00:59:27,040 So now this is setting the stage for starting 1055 00:59:27,040 --> 00:59:31,810 to think about this more toward polynomials 1056 00:59:31,810 --> 00:59:33,835 instead of using a Boolean algebra. 1057 00:59:36,360 --> 00:59:41,150 So I'm getting a question-- how do we know what the execution 1058 00:59:41,150 --> 00:59:43,490 path is, which nodes to label? 1059 00:59:43,490 --> 00:59:46,640 We're going to be labeling all of the nodes. 1060 00:59:46,640 --> 00:59:50,690 So we start off with labeling the-- 1061 00:59:50,690 --> 00:59:51,830 did I say that here? 1062 00:59:51,830 --> 00:59:54,380 We start off-- I didn't say it, but I should have. 1063 00:59:54,380 --> 00:59:58,130 We label the start node with 1 because the path always 1064 00:59:58,130 --> 00:59:59,460 goes through the start node. 1065 00:59:59,460 --> 01:00:01,550 So without even talking about a path, 1066 01:00:01,550 --> 01:00:04,400 we just label the start node 1. 1067 01:00:04,400 --> 01:00:07,340 Well, maybe we'll do an example of this also. 1068 01:00:07,340 --> 01:00:11,660 But now, once we label this start-- this node 1, 1069 01:00:11,660 --> 01:00:14,810 we have an expression that tells us 1070 01:00:14,810 --> 01:00:21,860 how to label the two outgoing edges, this edge and that edge. 1071 01:00:21,860 --> 01:00:26,600 And I'm doing it without knowing the values of the variables. 1072 01:00:26,600 --> 01:00:28,337 I'm just making an expression, which 1073 01:00:28,337 --> 01:00:30,170 is going to describe what those labels would 1074 01:00:30,170 --> 01:00:33,380 be once you tell me what the input assignment is.
1075 01:00:35,890 --> 01:00:37,750 OK so I'm just sort of-- 1076 01:00:37,750 --> 01:00:40,420 it's almost like a symbolic execution here. 1077 01:00:40,420 --> 01:00:42,430 I'm just writing down the different expressions 1078 01:00:42,430 --> 01:00:45,280 for how to calculate what these things should be. 1079 01:00:47,980 --> 01:00:55,570 Let me-- maybe this will become clearer as we continue. 1080 01:00:55,570 --> 01:01:00,930 So now this is the big idea of this proof. 1081 01:01:00,930 --> 01:01:04,530 We're going to use something called arithmetization. 1082 01:01:04,530 --> 01:01:07,290 We're going to convert from thinking about things 1083 01:01:07,290 --> 01:01:09,660 in the Boolean world to thinking about things 1084 01:01:09,660 --> 01:01:13,050 in the arithmetical world, where we have arithmetic 1085 01:01:13,050 --> 01:01:16,050 over integers, let's say, for now. 1086 01:01:16,050 --> 01:01:17,483 So instead of ands and ors, we're 1087 01:01:17,483 --> 01:01:19,275 going to be talking about pluses and times. 1088 01:01:22,580 --> 01:01:27,730 And the way we're going to make the bridge 1089 01:01:27,730 --> 01:01:33,880 is by showing how to simulate the and and or operations 1090 01:01:33,880 --> 01:01:37,180 with the plus and times operations. 1091 01:01:37,180 --> 01:01:45,490 So assuming 1 means true and 0 means false, 1092 01:01:45,490 --> 01:01:52,630 if you have the expression a and b as a Boolean expression, 1093 01:01:52,630 --> 01:01:57,890 we can represent that as a times b using arithmetic 1094 01:01:57,890 --> 01:02:04,600 because it computes exactly the same value when 1095 01:02:04,600 --> 01:02:09,310 we have the Boolean representation 1096 01:02:09,310 --> 01:02:12,430 of true and false being 1 and 0. 1097 01:02:12,430 --> 01:02:16,180 So 1 and 1 is 1. 1098 01:02:16,180 --> 01:02:18,640 And 1 times 1 is 1. 
1099 01:02:18,640 --> 01:02:20,230 And anything else-- 1100 01:02:20,230 --> 01:02:24,670 1 and 0, 0 and 1, 0 and 0-- 1101 01:02:24,670 --> 01:02:26,170 if you apply the times operator, 1102 01:02:26,170 --> 01:02:27,628 you're going to get the same value. 1103 01:02:27,628 --> 01:02:32,060 So times is very much like and in this sense. 1104 01:02:32,060 --> 01:02:33,800 OK, we're going to write it as just ab, 1105 01:02:33,800 --> 01:02:37,460 usually, without the times symbol. 1106 01:02:37,460 --> 01:02:40,720 So if we have a complement, how would we 1107 01:02:40,720 --> 01:02:44,050 simulate that with arithmetic? 1108 01:02:44,050 --> 01:02:50,200 Well, again, here we're just flipping 1 and 0 1109 01:02:50,200 --> 01:02:52,990 in using the complement operation. That's 1110 01:02:52,990 --> 01:02:55,180 going to be the same as subtracting 1111 01:02:55,180 --> 01:02:59,740 the value from 1, which also flips it between 1 and 0. 1112 01:03:02,390 --> 01:03:05,270 How about or-- if you have a or b? 1113 01:03:05,270 --> 01:03:09,320 Well, it's slightly more complicated 1114 01:03:09,320 --> 01:03:14,840 because you use a plus b, but you 1115 01:03:14,840 --> 01:03:18,890 have to subtract off the product because what 1116 01:03:18,890 --> 01:03:24,140 you want is this simulation should give you 1117 01:03:24,140 --> 01:03:25,710 exactly the same value. 1118 01:03:25,710 --> 01:03:31,850 So if you have 1 or 1, you want that to be a 1 answer. 1119 01:03:31,850 --> 01:03:33,378 You don't want it to be a 2. 1120 01:03:33,378 --> 01:03:35,045 So you have to subtract off the product. 1121 01:03:39,340 --> 01:03:43,480 And the goal is to have a faithful simulation of the and 1122 01:03:43,480 --> 01:03:44,960 and or by using plus and times. 1123 01:03:44,960 --> 01:03:47,620 So you get exactly the same answers out 1124 01:03:47,620 --> 01:03:52,020 when you put in Boolean values here.
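The three simulation rules above (and becomes ab, complement becomes 1 minus a, or becomes a plus b minus ab) can be checked against the Boolean truth tables with a short sketch. The function names are made up for this illustration.

```python
from itertools import product


def arith_and(a, b):
    return a * b            # a and b  ->  ab


def arith_not(a):
    return 1 - a            # complement of a  ->  1 - a


def arith_or(a, b):
    return a + b - a * b    # a or b  ->  a + b - ab


# On 0/1 inputs the arithmetic versions agree with the Boolean operations.
for a, b in product((0, 1), repeat=2):
    assert arith_and(a, b) == int(bool(a) and bool(b))
    assert arith_or(a, b) == int(bool(a) or bool(b))
for a in (0, 1):
    assert arith_not(a) == int(not a)

# Unlike the Boolean operations, these are also defined on other integers.
print(arith_and(2, 3))      # 6
```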
1125 01:03:52,020 --> 01:03:58,215 OK, so just to say where we're going, what this is going to-- 1126 01:04:01,782 --> 01:04:03,740 superficially, we haven't really done anything. 1127 01:04:06,890 --> 01:04:10,850 But what this is going to enable us to do 1128 01:04:10,850 --> 01:04:16,630 is plug in values which are not Boolean because it doesn't 1129 01:04:16,630 --> 01:04:17,980 make sense to talk about-- 1130 01:04:17,980 --> 01:04:20,410 it makes sense to talk about 1 and 0, 1131 01:04:20,410 --> 01:04:24,370 but it doesn't make sense to talk about 2 and 3. 1132 01:04:24,370 --> 01:04:28,590 But it does make sense to talk about 2 times 3. 1133 01:04:28,590 --> 01:04:33,490 And that's going to be useful. 1134 01:04:33,490 --> 01:04:34,730 OK, so let's just see. 1135 01:04:39,520 --> 01:04:46,110 Remember that inductive labeling procedure 1136 01:04:46,110 --> 01:04:51,390 that I described before, where I gave the execution 1137 01:04:51,390 --> 01:04:55,440 labels on the edges depending upon the label of the parent 1138 01:04:55,440 --> 01:04:59,340 node and which node, which variable is being queried. 1139 01:04:59,340 --> 01:05:03,570 So if I know that this value is an a, but now the-- 1140 01:05:06,350 --> 01:05:08,210 OK, so I'm just going to write this 1141 01:05:08,210 --> 01:05:14,520 down using arithmetic instead of using Boolean operations. 1142 01:05:14,520 --> 01:05:18,560 So before, we have-- this was a and xi, if you remember 1143 01:05:18,560 --> 01:05:20,307 from the previous slide. 1144 01:05:20,307 --> 01:05:22,640 Now what are we going to use instead because we're going 1145 01:05:22,640 --> 01:05:24,350 to use this conversion here? 1146 01:05:24,350 --> 01:05:28,050 Instead of and, we're going to use multiplication. 1147 01:05:28,050 --> 01:05:30,560 That's just a times xi. 1148 01:05:30,560 --> 01:05:31,940 What about on this side? 1149 01:05:31,940 --> 01:05:36,670 Here it was a and the complement of xi.
1150 01:05:36,670 --> 01:05:42,340 Now the complement of xi is 1 minus xi arithmetically. 1151 01:05:42,340 --> 01:05:49,330 So this becomes a times 1 minus xi, OK? 1152 01:05:49,330 --> 01:05:56,240 Similarly, here, we did the or to get the label on the node 1153 01:05:56,240 --> 01:05:59,600 from the labels of its incoming edges. 1154 01:05:59,600 --> 01:06:02,810 Now, we're going to do something a little strange because we 1155 01:06:02,810 --> 01:06:04,550 have a formula here for or. 1156 01:06:04,550 --> 01:06:07,650 But for technical reasons that will come up later, 1157 01:06:07,650 --> 01:06:11,090 this is not a convenient representation for us. 1158 01:06:11,090 --> 01:06:13,610 What I'm going to use instead of this one-- 1159 01:06:13,610 --> 01:06:18,310 I'm just going to simply say, let's take the sum. 1160 01:06:18,310 --> 01:06:19,375 Why is that good enough? 1161 01:06:22,100 --> 01:06:24,380 In this case, this is still going 1162 01:06:24,380 --> 01:06:27,920 to be a faithful representation and give the right answer all 1163 01:06:27,920 --> 01:06:29,190 of the time. 1164 01:06:29,190 --> 01:06:35,360 And that's because for our branching programs, 1165 01:06:35,360 --> 01:06:40,940 read-once or otherwise, read-once is not coming in yet, 1166 01:06:40,940 --> 01:06:45,410 for our branching programs, they're acyclic. 1167 01:06:45,410 --> 01:06:50,440 So they can never enter a node on two different paths. 1168 01:06:50,440 --> 01:06:54,170 There's, at most, one way to come into a node on a path 1169 01:06:54,170 --> 01:06:56,240 through the-- on an execution path 1170 01:06:56,240 --> 01:06:57,560 through the branching program.
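To see on a small case why a plain sum can stand in for or, here is a sketch that labels a program with the arithmetic rules and checks, for every Boolean input, that no node ever receives a value other than 0 or 1. The XOR program, its encoding, and the node names are assumptions chosen for this illustration.

```python
from itertools import product

# Hypothetical encoding: query node -> (variable, 0-child, 1-child).
XOR_BP = {
    "start": ("x1", "left", "right"),
    "left": ("x2", "out0", "out1"),
    "right": ("x2", "out1", "out0"),
}
ORDER = ["start", "left", "right"]   # parents listed before their children


def node_sums(bp, order, start, assignment):
    """Node labels where each node is the SUM of its incoming edge labels."""
    total = {start: 1}
    for name in order:
        var, child0, child1 = bp[name]
        a, x = total[name], assignment[var]
        total[child1] = total.get(child1, 0) + a * x         # 1-edge: a * x
        total[child0] = total.get(child0, 0) + a * (1 - x)   # 0-edge: a * (1 - x)
    return total


for x1, x2 in product((0, 1), repeat=2):
    sums = node_sums(XOR_BP, ORDER, "start", {"x1": x1, "x2": x2})
    assert set(sums.values()) <= {0, 1}    # the sum never exceeds 1
    assert sums["out1"] == (x1 ^ x2)       # and the output is still XOR
```

This checks one small program exhaustively; it is not a proof of the general claim, which rests on the single-execution-path argument given in the lecture.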
1171 01:06:57,560 --> 01:07:02,390 If it comes in through this edge, 1172 01:07:02,390 --> 01:07:05,510 there's no way for this edge to also have a path because that 1173 01:07:05,510 --> 01:07:11,870 means you have to go out and come back and have 1174 01:07:11,870 --> 01:07:15,440 a cycle in the branching program, which is disallowed. 1175 01:07:15,440 --> 01:07:18,440 So at most, one of these edges can 1176 01:07:18,440 --> 01:07:20,310 have the path go through it. 1177 01:07:20,310 --> 01:07:23,490 So at most, one of these a's can be a 1. 1178 01:07:23,490 --> 01:07:25,010 The others are going to be 0. 1179 01:07:25,010 --> 01:07:27,140 And therefore, just taking the sum 1180 01:07:27,140 --> 01:07:29,960 is going to give us a value of either 0 or 1, 1181 01:07:29,960 --> 01:07:32,000 but it's never going to give a value higher. 1182 01:07:32,000 --> 01:07:34,130 And so you don't have to subtract off 1183 01:07:34,130 --> 01:07:36,130 these product terms. 1184 01:07:36,130 --> 01:07:37,380 A little bit complicated here. 1185 01:07:37,380 --> 01:07:41,070 If you didn't totally get that, don't worry for now. 1186 01:07:41,070 --> 01:07:45,810 We're more concerned that you get the big picture of what's 1187 01:07:45,810 --> 01:07:46,410 going on. 1188 01:07:49,850 --> 01:07:55,174 OK, so I think we're almost-- 1189 01:07:55,174 --> 01:07:58,600 let me just see how far we are. 1190 01:07:58,600 --> 01:07:59,290 Yeah. 1191 01:07:59,290 --> 01:08:01,630 So I'm just going to work through an example. 1192 01:08:01,630 --> 01:08:04,500 And I think that'll bring us-- 1193 01:08:04,500 --> 01:08:08,070 let's just see, Do we have any questions here? 1194 01:08:08,070 --> 01:08:09,320 Not seeing any. 1195 01:08:09,320 --> 01:08:13,530 That means you're either all totally understanding, 1196 01:08:13,530 --> 01:08:15,270 or you're totally lost. 1197 01:08:15,270 --> 01:08:17,460 I never can tell. 
1198 01:08:17,460 --> 01:08:21,990 So feel free to ask a question, even if you're confused. 1199 01:08:21,990 --> 01:08:24,750 I'll do my best. 1200 01:08:24,750 --> 01:08:26,939 OK, maybe this example might help. 1201 01:08:29,880 --> 01:08:31,609 So now, what we're going to do is, 1202 01:08:31,609 --> 01:08:39,380 using this arithmetical view of the way a branching program's 1203 01:08:39,380 --> 01:08:46,550 computation is executed when we're running 1204 01:08:46,550 --> 01:08:49,729 an input through it, this is going 1205 01:08:49,729 --> 01:08:52,250 to allow us now to give a meaning 1206 01:08:52,250 --> 01:08:55,380 to running the branching program on non-Boolean inputs. 1207 01:08:55,380 --> 01:08:57,410 So maybe this example will illustrate that. 1208 01:09:00,479 --> 01:09:03,140 So let's just take this particular branching program 1209 01:09:03,140 --> 01:09:06,870 here, OK? 1210 01:09:06,870 --> 01:09:09,960 This branching program it's just on two variables, x1 and x2, 1211 01:09:09,960 --> 01:09:12,430 and it actually computes a familiar function. 1212 01:09:12,430 --> 01:09:14,130 This is the exclusive or function 1213 01:09:14,130 --> 01:09:15,580 if you look at it for a minute. 1214 01:09:15,580 --> 01:09:20,670 You'll see that this is going to give you x1 exclusive or x2. 1215 01:09:20,670 --> 01:09:25,439 So it's going to be 1 if either of the x1 or x2 are 1, 1216 01:09:25,439 --> 01:09:27,660 but it's going to be 0 if they're both 1. 1217 01:09:27,660 --> 01:09:30,720 That's what this branching program computes. 1218 01:09:30,720 --> 01:09:35,899 Now, but let's take a look at running this branching program 1219 01:09:35,899 --> 01:09:39,109 instead of on the usual Boolean values, 1220 01:09:39,109 --> 01:09:45,189 let's run it on x1 equal to 2 and x2 equal to 3. 
1221 01:09:45,189 --> 01:09:52,370 Now a common confusion might be that you're looking-- 1222 01:09:52,370 --> 01:09:55,370 when you do the x1 query, you're looking 1223 01:09:55,370 --> 01:10:00,140 for another outgoing edge, which is labeled 2. 1224 01:10:00,140 --> 01:10:02,000 No, that's not what I'm doing. 1225 01:10:02,000 --> 01:10:05,150 What I'm doing here is, I'm somehow, 1226 01:10:05,150 --> 01:10:08,990 through this execution, by assigning these other values, 1227 01:10:08,990 --> 01:10:11,900 I'm blending together the computation 1228 01:10:11,900 --> 01:10:18,098 of x1 equal to 0 and x1 equal to 1 together. 1229 01:10:18,098 --> 01:10:19,640 I don't know if that makes any sense. 1230 01:10:19,640 --> 01:10:22,890 But let's look through the example. 1231 01:10:22,890 --> 01:10:25,400 So first of all, these are the labeling rules 1232 01:10:25,400 --> 01:10:29,990 that I had from the previous slide 1233 01:10:29,990 --> 01:10:35,108 when I used plus and times instead of and and or, OK? 1234 01:10:35,108 --> 01:10:36,650 Now, I'm going to show you how to use 1235 01:10:36,650 --> 01:10:41,060 that to label the nodes and edges of this graph 1236 01:10:41,060 --> 01:10:44,190 based on this input. 1237 01:10:44,190 --> 01:10:46,580 And that will determine an output, which would 1238 01:10:46,580 --> 01:10:50,590 be the value on the 1 node. 1239 01:10:50,590 --> 01:10:54,960 OK, so we always start out by labeling the start with 1. 1240 01:10:54,960 --> 01:10:55,830 That's just a rule. 1241 01:10:59,410 --> 01:11:02,680 And OK, sorry. 1242 01:11:02,680 --> 01:11:06,640 Let's think about it together before I blurt out the answer. 1243 01:11:06,640 --> 01:11:11,640 What's going to be the label on this edge? 1244 01:11:11,640 --> 01:11:15,440 So this is one of the outgoing edges from a node that 1245 01:11:15,440 --> 01:11:16,710 already has a label. 1246 01:11:16,710 --> 01:11:19,900 So that's going to be this case, here.
1247 01:11:19,900 --> 01:11:24,190 And what we do is we take the label of that node, 1248 01:11:24,190 --> 01:11:27,370 and since it's a 1 edge that's outgoing, 1249 01:11:27,370 --> 01:11:32,590 we multiply that label by the value 1250 01:11:32,590 --> 01:11:42,290 of that variable of the assignment to that variable. 1251 01:11:42,290 --> 01:11:44,520 So x1 is 2. 1252 01:11:44,520 --> 01:11:48,690 So we take-- the a here is 1. 1253 01:11:48,690 --> 01:11:50,730 x1 is assigned to 2. 1254 01:11:50,730 --> 01:11:56,550 So it's going to be 1 times 2 is going to be the execution 1255 01:11:56,550 --> 01:11:58,543 value we put on this edge. 1256 01:11:58,543 --> 01:11:59,460 So it's going to be 2. 1257 01:12:02,570 --> 01:12:06,110 What's going to be the value we put on the other edge, the 0 1258 01:12:06,110 --> 01:12:08,952 outgoing edge from x1? 1259 01:12:08,952 --> 01:12:10,910 So I want you to think about that for a second. 1260 01:12:15,930 --> 01:12:18,290 So now, we're going to use this expression. 1261 01:12:18,290 --> 01:12:20,390 It's a times 1 minus xi. 1262 01:12:23,050 --> 01:12:25,630 And so xi, again, is 2. 1263 01:12:25,630 --> 01:12:28,330 So 1 minus xi is 1 minus 2. 1264 01:12:31,210 --> 01:12:33,850 That's how complement-- OK, well, let's save that. 1265 01:12:33,850 --> 01:12:35,740 So it's 1 minus 2. 1266 01:12:35,740 --> 01:12:39,740 So that's minus 1 times the label 1 here. 1267 01:12:39,740 --> 01:12:41,920 So you get minus 1 is the label on this edge. 1268 01:12:45,460 --> 01:12:51,370 Now keep in mind that if I had plugged in-- 1269 01:12:51,370 --> 01:12:54,700 and this is very important-- if I had plugged in Boolean values 1270 01:12:54,700 --> 01:12:57,860 here, I would be getting out the same Boolean 1271 01:12:57,860 --> 01:13:01,950 values that you would get just by following through the path. 1272 01:13:01,950 --> 01:13:04,080 The things on the path would be 1. 1273 01:13:04,080 --> 01:13:05,550 The things off the path would be 0.
1274 01:13:10,750 --> 01:13:16,390 But what's happening here is that there's still 1275 01:13:16,390 --> 01:13:19,882 a meaning when the inputs are not Boolean. 1276 01:13:19,882 --> 01:13:20,840 So let's continue here. 1277 01:13:20,840 --> 01:13:23,090 How about-- what's going to be the value on this node? 1278 01:13:25,832 --> 01:13:26,540 So think with me. 1279 01:13:26,540 --> 01:13:27,900 I think it'll help you. 1280 01:13:27,900 --> 01:13:30,170 So now, we're using this rule here. 1281 01:13:30,170 --> 01:13:33,170 We add up all the values on the incoming edges. 1282 01:13:33,170 --> 01:13:36,050 There's only one incoming edge, which is value 2. 1283 01:13:36,050 --> 01:13:38,050 So that means this guy is going to get a 2. 1284 01:13:38,050 --> 01:13:39,050 And similar on this one. 1285 01:13:39,050 --> 01:13:41,810 This guy is going to get a minus 1. 1286 01:13:41,810 --> 01:13:44,180 Now, let's take a look at this edge. 1287 01:13:44,180 --> 01:13:48,280 So this is the 0 outgoing edge from a node labeled 2 1288 01:13:48,280 --> 01:13:50,880 with label x2. 1289 01:13:50,880 --> 01:13:54,160 So this is the 0 outgoing edge. 1290 01:13:54,160 --> 01:13:55,950 The label is 2, so it's going to be 1291 01:13:55,950 --> 01:13:59,790 2 times 1 minus the x2 value. 1292 01:13:59,790 --> 01:14:00,930 x2 is 3. 1293 01:14:00,930 --> 01:14:02,730 So 1 minus 3 is minus 2. 1294 01:14:02,730 --> 01:14:07,680 It's going to be 2 times minus 2, which is minus 4. 1295 01:14:07,680 --> 01:14:10,800 So similarly, you can get the value here, 1296 01:14:10,800 --> 01:14:14,490 the value on the one outgoing edge 1297 01:14:14,490 --> 01:14:20,250 is going to be 2 times the x2 value, which is 3, 1298 01:14:20,250 --> 01:14:23,120 so that's going to be 6. 1299 01:14:23,120 --> 01:14:24,280 And these two here-- 1300 01:14:29,270 --> 01:14:32,110 so now, we have a minus 1. 1301 01:14:32,110 --> 01:14:37,180 And the outgoing is a 0 edge, so it's 1 minus 3. 
1302 01:14:37,180 --> 01:14:41,620 And here, it's going to be 1 times minus 3. 1303 01:14:41,620 --> 01:14:42,640 No. 1304 01:14:42,640 --> 01:14:43,430 Minus 1 times 3. 1305 01:14:43,430 --> 01:14:43,930 I'm sorry. 1306 01:14:43,930 --> 01:14:44,770 Minus 1 times 3. 1307 01:14:47,760 --> 01:14:49,030 The answer's minus 3. 1308 01:14:49,030 --> 01:14:53,630 So now what's the label on the 0 output node? 1309 01:14:53,630 --> 01:14:57,080 So you have to add up the two incoming edges here. 1310 01:14:57,080 --> 01:15:04,100 So we have this edge here was a 2. 1311 01:15:04,100 --> 01:15:08,870 This edge coming in here is a 6, so it's going to be 2 plus 6. 1312 01:15:08,870 --> 01:15:10,400 It's 8. 1313 01:15:10,400 --> 01:15:11,360 What about this edge-- 1314 01:15:11,360 --> 01:15:12,140 this node here? 1315 01:15:12,140 --> 01:15:13,970 This is an important node because this 1316 01:15:13,970 --> 01:15:15,870 is going to be the output. 1317 01:15:15,870 --> 01:15:26,040 So it has minus 3 coming in and a minus 4 coming in, 1318 01:15:26,040 --> 01:15:28,220 so you add those together. 1319 01:15:28,220 --> 01:15:30,440 You get a minus 7. 1320 01:15:30,440 --> 01:15:32,336 I mean, you may wonder, what-- 1321 01:15:32,336 --> 01:15:36,971 [LAUGHS] what in the world is going on here? 1322 01:15:36,971 --> 01:15:39,620 Is this a lot of mumbo jumbo? 1323 01:15:39,620 --> 01:15:42,740 But we're going to make sense of all this-- not today. 1324 01:15:42,740 --> 01:15:45,590 We're going to have to argue why this is-- 1325 01:15:45,590 --> 01:15:47,540 what the meaning that we're going to get out 1326 01:15:47,540 --> 01:15:49,250 of this is going to be. 1327 01:15:49,250 --> 01:15:53,450 But the point is that this is going to lead 1328 01:15:53,450 --> 01:15:56,880 to a new algorithm for testing. 1329 01:15:56,880 --> 01:16:00,390 This is, again, getting back to what we were doing.
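[Editor's sketch] The labeling walk-through above can be reproduced in a few lines of Python. This is only an illustration under stated assumptions: the node names "a", "b", "c", "out0", "out1" and the dictionary encoding are hypothetical, chosen to match the XOR branching program described in the lecture, and the two rules are the ones the professor states (an edge gets a times xi on a 1 edge and a times 1 minus xi on a 0 edge; a node gets the sum of its incoming edges).

```python
# Hypothetical encoding of the XOR branching program from the example.
# Each query node maps to (variable, 0-edge target, 1-edge target).
NODES = {
    "a": ("x1", "b", "c"),        # start node, reads x1
    "b": ("x2", "out0", "out1"),  # reached on the 0 edge out of x1
    "c": ("x2", "out1", "out0"),  # reached on the 1 edge out of x1
}
ORDER = ["a", "b", "c"]  # a topological order of the query nodes


def evaluate(assignment):
    """Return the label of every node under a (possibly non-Boolean) assignment."""
    label = {name: 0 for name in list(NODES) + ["out0", "out1"]}
    label["a"] = 1  # the start node is labeled 1
    for name in ORDER:
        var, zero_target, one_target = NODES[name]
        a = label[name]
        x = assignment[var]
        label[zero_target] += a * (1 - x)  # 0 edge carries a * (1 - xi)
        label[one_target] += a * x         # 1 edge carries a * xi
    return label


labels = evaluate({"x1": 2, "x2": 3})
print(labels["out0"], labels["out1"])
```

With x1 = 2 and x2 = 3 this gives 8 on the 0 output node and minus 7 on the 1 output node, matching the numbers worked out in the lecture. On Boolean inputs the same function returns the ordinary XOR value on the 1 output node.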
1330 01:16:00,390 --> 01:16:02,850 This is the equivalence problem for 1331 01:16:02,850 --> 01:16:04,275 read-once branching programs. 1332 01:16:04,275 --> 01:16:06,150 So now, what the new algorithm is going to do 1333 01:16:06,150 --> 01:16:10,270 is going to pick a random non-Boolean assignment. 1334 01:16:10,270 --> 01:16:12,760 So it's going to randomly assign values 1335 01:16:12,760 --> 01:16:16,797 to the x's and to some non-Boolean values. 1336 01:16:16,797 --> 01:16:18,380 Instead of zeros and ones, we're going 1337 01:16:18,380 --> 01:16:21,680 to plug in random integer values. 1338 01:16:21,680 --> 01:16:25,430 We'll make that clear next time what the domain is going to be. 1339 01:16:28,180 --> 01:16:31,260 And then once we have that non-Boolean assignment, 1340 01:16:31,260 --> 01:16:34,950 we're going to evaluate B1 and B2. 1341 01:16:34,950 --> 01:16:39,793 And if they disagree out there in that extended domain, 1342 01:16:39,793 --> 01:16:41,835 then we have to show that they're not equivalent, 1343 01:16:41,835 --> 01:16:43,910 and then we'll reject. 1344 01:16:43,910 --> 01:16:48,380 And we'll also show that if they were equivalent, then even 1345 01:16:48,380 --> 01:16:55,190 when we evaluate them, we have to show that if they're not 1346 01:16:55,190 --> 01:16:58,340 equivalent, that they're very likely to have 1347 01:16:58,340 --> 01:17:01,380 a difference in the non-Boolean domain. 1348 01:17:01,380 --> 01:17:07,190 And so if they agree, it gives you evidence 1349 01:17:07,190 --> 01:17:10,820 that the two are really equivalent. 1350 01:17:10,820 --> 01:17:13,480 So the completeness proof will come after Thanksgiving. 1351 01:17:13,480 --> 01:17:19,360 So with that, I'm going to wish you all a nice break. 1352 01:17:19,360 --> 01:17:20,560 Oh, we have a check-in here. 1353 01:17:20,560 --> 01:17:21,490 Sorry. 1354 01:17:21,490 --> 01:17:23,420 Oh, yeah, this is a good one.
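[Editor's sketch] The algorithm outlined above (pick a random non-Boolean assignment, evaluate B1 and B2, reject on any disagreement) might look roughly like the following. Several things here are assumptions, not part of the lecture: the prime modulus P (the lecture defers the choice of domain to next time), the number of trials, the list-of-tuples encoding of a program, and the three example programs. The correctness argument is also deferred in the lecture, so treat this as a sketch of the procedure rather than a proven implementation.

```python
import random

P = 2**31 - 1  # assumed prime modulus; the lecture leaves the domain for later

# Hypothetical encoding: a program is a list of
# (node, variable, 0-edge target, 1-edge target) in topological order.
# The first entry is the start node and "out1" is the 1 output node.
XOR_A = [("a", "x1", "b", "c"),
         ("b", "x2", "out0", "out1"),
         ("c", "x2", "out1", "out0")]
XOR_B = [("s", "x2", "t", "u"),       # same function, reads x2 first
         ("t", "x1", "out0", "out1"),
         ("u", "x1", "out1", "out0")]
OR_PROG = [("a", "x1", "b", "out1"),  # computes OR, not XOR
           ("b", "x2", "out0", "out1")]


def output_value(program, assignment):
    """Label of the 1 output node, computed with arithmetic modulo P."""
    label = {program[0][0]: 1}  # the start node is labeled 1
    for node, var, zero_target, one_target in program:
        a = label.get(node, 0)
        x = assignment[var]
        label[zero_target] = (label.get(zero_target, 0) + a * (1 - x)) % P
        label[one_target] = (label.get(one_target, 0) + a * x) % P
    return label.get("out1", 0)


def probably_equivalent(b1, b2, variables, trials=20, rng=None):
    """Return False on any disagreement, True if all random trials agree."""
    rng = rng or random.Random()
    for _ in range(trials):
        assignment = {v: rng.randrange(P) for v in variables}
        if output_value(b1, assignment) != output_value(b2, assignment):
            return False  # a disagreement in the extended domain: reject
    return True  # agreement on every trial is evidence of equivalence


print(probably_equivalent(XOR_A, XOR_B, ["x1", "x2"]))
print(probably_equivalent(XOR_A, OR_PROG, ["x1", "x2"]))
```

Repeating the trial several times is a design choice made here for illustration; how many repetitions are needed, and over which domain, depends on the error bound that the lecture proves later.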
1355 01:17:23,420 --> 01:17:31,490 I don't know if you're following me, but if I plug in 1 for x1 1356 01:17:31,490 --> 01:17:35,060 and y for x2-- 1357 01:17:35,060 --> 01:17:37,610 do the inputs in the assignment need to be distinct? 1358 01:17:37,610 --> 01:17:38,860 No. 1359 01:17:38,860 --> 01:17:40,690 It could be the same value. 1360 01:17:40,690 --> 01:17:43,450 It could be 2 and 2 here. 1361 01:17:43,450 --> 01:17:45,360 That's perfectly valid. 1362 01:17:45,360 --> 01:17:47,480 But here, I'm going to plug in 1 for x1, 1363 01:17:47,480 --> 01:17:50,640 and I'm going to plug in a variable for x2-- 1364 01:17:50,640 --> 01:17:52,100 y. 1365 01:17:52,100 --> 01:17:54,680 And I'm going to do the whole calculation that I just did. 1366 01:17:54,680 --> 01:17:56,305 And now, what's going to be the output? 1367 01:17:59,140 --> 01:18:01,870 And I mean, this looks like a pain to figure out. 1368 01:18:01,870 --> 01:18:02,705 You could do it. 1369 01:18:02,705 --> 01:18:03,580 It looks like a pain. 1370 01:18:03,580 --> 01:18:04,990 But let me give you a big hint. 1371 01:18:08,390 --> 01:18:13,220 Remember that this thing is supposed to be calculating. 1372 01:18:13,220 --> 01:18:15,575 The original branching program calculates the exclusive 1373 01:18:15,575 --> 01:18:17,920 or function. 1374 01:18:17,920 --> 01:18:23,170 And that means when I plug in a Boolean value, 1375 01:18:23,170 --> 01:18:27,400 I should get the exclusive or value coming out. 1376 01:18:27,400 --> 01:18:31,990 So if I already know that x1 is 1, which of these 1377 01:18:31,990 --> 01:18:36,850 is consistent with getting a value 1378 01:18:36,850 --> 01:18:40,615 that the exclusive or function would compute? 1379 01:18:40,615 --> 01:18:45,030 So let me launch a poll on that. 1380 01:18:45,030 --> 01:18:45,960 So we're out of time. 1381 01:18:50,730 --> 01:18:55,570 So let's just let this run for another 10 seconds. 1382 01:18:55,570 --> 01:18:58,475 OK, I'm going to close this.
1383 01:18:58,475 --> 01:18:58,975 Ready? 1384 01:19:04,010 --> 01:19:06,370 Yes, indeed, a is the right answer 1385 01:19:06,370 --> 01:19:09,400 because that's one-- if I know that one variable is 1, then 1386 01:19:09,400 --> 01:19:12,160 the exclusive or is going to be the complement 1387 01:19:12,160 --> 01:19:15,050 of the other variable, which is 1 minus y. 1388 01:19:15,050 --> 01:19:19,590 So that's what you would get if you calculated this. 1389 01:19:19,590 --> 01:19:21,810 OK, so this is what we did today. 1390 01:19:21,810 --> 01:19:26,010 And feel free to ask questions. 1391 01:19:26,010 --> 01:19:27,870 So we're going to spend a good chunk-- 1392 01:19:27,870 --> 01:19:30,630 I'll review this, what we've done so far, 1393 01:19:30,630 --> 01:19:33,270 but then we're going to carry it forward and spend 1394 01:19:33,270 --> 01:19:35,520 a good chunk of Tuesday's lecture 1395 01:19:35,520 --> 01:19:38,850 after the Thanksgiving break proving 1396 01:19:38,850 --> 01:19:42,660 that this procedure that I just described worked and works. 1397 01:19:42,660 --> 01:19:46,860 And it's an interesting but somewhat-- 1398 01:19:46,860 --> 01:19:49,298 it's not such an easy proof. 1399 01:19:49,298 --> 01:19:50,340 So we're going to spend-- 1400 01:19:50,340 --> 01:19:53,340 try to do it slowly and clearly. 1401 01:19:53,340 --> 01:19:55,590 But this notion of arithmetization 1402 01:19:55,590 --> 01:19:57,990 is going to be-- 1403 01:19:57,990 --> 01:20:02,430 it's an important notion in complexity. 1404 01:20:02,430 --> 01:20:08,550 And so we'll see it again coming up in another proof afterward 1405 01:20:08,550 --> 01:20:10,710 about interactive proof systems. 1406 01:20:10,710 --> 01:20:14,560 OK, so please ask questions. 1407 01:20:14,560 --> 01:20:19,710 So the output is the value of the output 1 state, yes. 1408 01:20:19,710 --> 01:20:22,540 There was a question I got. 1409 01:20:22,540 --> 01:20:24,600 Other questions? 
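[Editor's sketch] The check-in answer can also be obtained by running the labeling rules symbolically. The derivation below assumes the XOR branching program from the example (start node reads x1, both second-level nodes read x2) and uses the symbol a for a node label as in the lecture; the closed form in the last line is implied by the two edge rules rather than stated in the lecture.

```latex
\begin{align*}
\text{start node:}\quad & a = 1 \\
\text{1 edge out of } x_1:\quad & 1 \cdot x_1 = 1 \cdot 1 = 1 \\
\text{0 edge out of } x_1:\quad & 1 \cdot (1 - x_1) = 1 \cdot (1 - 1) = 0 \\
\text{node reached on the 1 edge } (a = 1):\quad
  & \text{0 edge} = 1 \cdot (1 - y), \qquad \text{1 edge} = 1 \cdot y \\
\text{node reached on the 0 edge } (a = 0):\quad
  & \text{0 edge} = 0, \qquad \text{1 edge} = 0 \\
\text{1 output node:}\quad & (1 - y) + 0 = 1 - y \\[4pt]
\text{in general:}\quad & x_1 (1 - x_2) + (1 - x_1)\, x_2,
  \quad \text{e.g. } 2 \cdot (-2) + (-1) \cdot 3 = -7 .
\end{align*}
```

Setting y to 0 or 1 gives 1 or 0, the complement of y, which is what exclusive or returns when the other input is 1.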
1410 01:20:24,600 --> 01:20:28,240 Somebody saying, minus 7 is not the XOR of 2 and 3. 1411 01:20:28,240 --> 01:20:33,590 [LAUGHS] What is the XOR of 2 and 3? 1412 01:20:33,590 --> 01:20:36,290 So by the way, I should say, we kind of ran a little short 1413 01:20:36,290 --> 01:20:38,510 on time, I'm not saying that we discovered 1414 01:20:38,510 --> 01:20:42,950 some fundamental new truth about XOR here 1415 01:20:42,950 --> 01:20:45,140 because that would be bizarre. 1416 01:20:45,140 --> 01:20:47,480 It really depends on the arbitrary decision 1417 01:20:47,480 --> 01:20:52,322 that we made to say true is 1 and false is 0. 1418 01:20:52,322 --> 01:20:54,530 We could have come up with a different representation 1419 01:20:54,530 --> 01:20:55,680 for true and false. 1420 01:20:55,680 --> 01:20:57,620 And then you would get a different value 1421 01:20:57,620 --> 01:21:00,650 for XOR coming out of that-- from the arithmetization 1422 01:21:00,650 --> 01:21:03,110 that I just described. 1423 01:21:03,110 --> 01:21:07,250 But for this particular way of representing true and false, 1424 01:21:07,250 --> 01:21:09,890 that's how XOR in this particular branching program, 1425 01:21:09,890 --> 01:21:14,060 that's how XOR evaluates. 1426 01:21:14,060 --> 01:21:15,720 The remainder of the proof-- 1427 01:21:15,720 --> 01:21:17,990 so somebody is asking, which is true, 1428 01:21:17,990 --> 01:21:19,710 the fundamental theorem of algebra, 1429 01:21:19,710 --> 01:21:22,250 which talks about polynomials and the number of roots 1430 01:21:22,250 --> 01:21:23,438 that you can have-- 1431 01:21:23,438 --> 01:21:24,605 that's going to be critical. 1432 01:21:27,360 --> 01:21:29,578 So that is the fundamental theorem of algebra. 1433 01:21:29,578 --> 01:21:30,620 That's where we're going. 1434 01:21:30,620 --> 01:21:31,280 Good question.
1435 01:21:34,090 --> 01:21:36,010 Somebody is complaining that we're not 1436 01:21:36,010 --> 01:21:41,020 taking the digit binary representation of 2 and 3 1437 01:21:41,020 --> 01:21:47,530 and taking the bit by bit XOR. 1438 01:21:47,530 --> 01:21:50,890 Binary representation is not a part of this. 1439 01:21:50,890 --> 01:21:55,900 We're thinking of these as two elements of a finite field, 1440 01:21:55,900 --> 01:21:59,050 which we'll talk about later. 1441 01:21:59,050 --> 01:22:03,972 The binary representation is not entering into this discussion. 1442 01:22:06,690 --> 01:22:10,235 So I'll talk about why just doing the sum is enough. 1443 01:22:16,320 --> 01:22:18,200 I think that was-- 1444 01:22:18,200 --> 01:22:18,830 so why is it-- 1445 01:22:18,830 --> 01:22:19,730 I mean, here it is. 1446 01:22:19,730 --> 01:22:23,480 Why is just doing the sum when I'm 1447 01:22:23,480 --> 01:22:29,000 looking at how to describe the value of this node 1448 01:22:29,000 --> 01:22:32,060 based upon the values of all the incoming nodes. 1449 01:22:32,060 --> 01:22:35,120 And remember, the starting point of this 1450 01:22:35,120 --> 01:22:38,450 is that we have to faithfully represent the Boolean 1451 01:22:38,450 --> 01:22:42,070 logic with the arithmetic. 1452 01:22:42,070 --> 01:22:44,020 And then we're going to use that and extend it 1453 01:22:44,020 --> 01:22:45,437 to non-Boolean values. 1454 01:22:45,437 --> 01:22:47,770 But as a starting point, we have to faithfully represent 1455 01:22:47,770 --> 01:22:49,210 the Boolean values. 1456 01:22:49,210 --> 01:22:52,960 Now the Boolean values, on the incoming edges, 1457 01:22:52,960 --> 01:22:57,970 at most, one of them can be a 1 because the 1's 1458 01:22:57,970 --> 01:23:04,240 correspond to the edges of the execution path. 
1459 01:23:04,240 --> 01:23:07,450 And you can't make an execution path that's 1460 01:23:07,450 --> 01:23:10,365 going to have two branches-- 1461 01:23:10,365 --> 01:23:11,740 that's going to go through a node 1462 01:23:11,740 --> 01:23:14,170 twice because then you have a loop. 1463 01:23:14,170 --> 01:23:17,520 And we don't have-- there's no cycles allowed. 1464 01:23:17,520 --> 01:23:22,230 OK, so I think we're-- it's at 4:00. 1465 01:23:22,230 --> 01:23:23,850 I want to say farewell to you all. 1466 01:23:23,850 --> 01:23:25,800 Have a great week. 1467 01:23:25,800 --> 01:23:29,120 And I'll see you when you get back.