1 00:00:00,000 --> 00:00:01,984 [SQUEAKING] 2 00:00:01,984 --> 00:00:03,968 [RUSTLING] 3 00:00:03,968 --> 00:00:05,456 [CLICKING] 4 00:00:25,300 --> 00:00:28,030 MICHAEL SIPSER: Hi, everybody. 5 00:00:28,030 --> 00:00:31,690 Glad to have you all back for our next 6 00:00:31,690 --> 00:00:37,780 to last installment of theory of computation. 7 00:00:37,780 --> 00:00:45,730 Today, we are going to embark on the very last big topic 8 00:00:45,730 --> 00:00:51,150 for the semester, and that is, in some ways, 9 00:00:51,150 --> 00:00:57,210 going to be following on what we started a couple of lectures 10 00:00:57,210 --> 00:01:00,450 back when we looked at probabilistic Turing 11 00:01:00,450 --> 00:01:05,459 machines and probabilistic computation and its associated 12 00:01:05,459 --> 00:01:07,900 class BPP. 13 00:01:07,900 --> 00:01:12,670 Now what we're going to discuss is, in some sense, 14 00:01:12,670 --> 00:01:17,350 a probabilistic version of NP. 15 00:01:17,350 --> 00:01:18,820 And that's going to be a complexity 16 00:01:18,820 --> 00:01:22,810 class called IP, which stands for Interactive Proof systems. 17 00:01:26,080 --> 00:01:28,320 And so we're going to present that model 18 00:01:28,320 --> 00:01:32,890 and look at a couple of examples. 19 00:01:32,890 --> 00:01:35,010 I would just like to say at the beginning 20 00:01:35,010 --> 00:01:38,710 that this model is a very important one. 21 00:01:38,710 --> 00:01:41,550 It really has been the starting point 22 00:01:41,550 --> 00:01:47,140 for a great deal of research in complexity theory. 23 00:01:47,140 --> 00:01:49,420 So we're just really going to be touching on it, 24 00:01:49,420 --> 00:01:52,050 but there's a lot more that people 25 00:01:52,050 --> 00:01:54,900 have pursued with this model. 26 00:01:54,900 --> 00:01:59,790 And it's also a connection into the cryptography field, 27 00:01:59,790 --> 00:02:04,020 which also makes use of the interactive proof system model. 
28 00:02:04,020 --> 00:02:09,270 In fact, some of the genesis of that model 29 00:02:09,270 --> 00:02:12,630 comes out of cryptography where you're 30 00:02:12,630 --> 00:02:19,170 having multiple parties either communicating or, in some ways, 31 00:02:19,170 --> 00:02:26,490 interacting to achieve certain goals of communication 32 00:02:26,490 --> 00:02:31,120 or signing or passwords or what have you. 33 00:02:31,120 --> 00:02:35,490 So this is both an applied area and also 34 00:02:35,490 --> 00:02:39,570 one that has a lot of very interesting theory associated 35 00:02:39,570 --> 00:02:40,433 to it. 36 00:02:40,433 --> 00:02:41,850 So with that, why don't we-- we're 37 00:02:41,850 --> 00:02:50,200 going to jump in and start out by making myself smaller 38 00:02:50,200 --> 00:02:53,920 and just do an introduction. 39 00:02:53,920 --> 00:02:57,880 I'm going to introduce the model or the concept 40 00:02:57,880 --> 00:03:02,080 of an interactive proof with an example, 41 00:03:02,080 --> 00:03:07,600 and that example involves the graph isomorphism problem. 42 00:03:07,600 --> 00:03:10,750 That's the problem of testing whether two graphs are 43 00:03:10,750 --> 00:03:12,010 isomorphic. 44 00:03:12,010 --> 00:03:14,500 What do we mean by two graphs being isomorphic 45 00:03:14,500 --> 00:03:20,050 is that they're really just the same graph with one of them 46 00:03:20,050 --> 00:03:27,480 perhaps being relabeled or permuted so that they may look 47 00:03:27,480 --> 00:03:29,970 superficially different, they may appear 48 00:03:29,970 --> 00:03:31,950 with a different sequence of labels, 49 00:03:31,950 --> 00:03:34,810 or the nodes are appearing in a different order, 50 00:03:34,810 --> 00:03:40,270 but except for that, it's really just the same graph. 51 00:03:40,270 --> 00:03:43,440 So I'm kind of illustrating that here if you can see those two 52 00:03:43,440 --> 00:03:46,300 graphs here, which look different from each other. 
53 00:03:46,300 --> 00:03:47,700 Both on 8 nodes. 54 00:03:47,700 --> 00:03:49,890 They are, in fact, the same graph 55 00:03:49,890 --> 00:03:52,980 as I can illustrate by a little animation which will 56 00:03:52,980 --> 00:03:54,810 convert this one into that one. 57 00:04:04,510 --> 00:04:12,350 So the two graphs, these graphs being the same, 58 00:04:12,350 --> 00:04:14,510 we call that isomorphic. 59 00:04:14,510 --> 00:04:18,329 So these are graphs G and H, and they're really the same graph. 60 00:04:18,329 --> 00:04:21,470 So we call them isomorphic graphs, 61 00:04:21,470 --> 00:04:26,180 and we have an associated computational problem called 62 00:04:26,180 --> 00:04:31,190 ISO, which is given a pair of graphs, we'd like to know, 63 00:04:31,190 --> 00:04:32,330 are they isomorphic or not? 64 00:04:32,330 --> 00:04:33,920 So ISO is the collection of pairs 65 00:04:33,920 --> 00:04:36,380 of graphs which are isomorphic. 66 00:04:36,380 --> 00:04:43,490 And it's easy to see that this problem is an NP 67 00:04:43,490 --> 00:04:49,220 problem because all you need to do in order to see or to give 68 00:04:49,220 --> 00:04:51,230 a certificate that the two graphs are 69 00:04:51,230 --> 00:04:53,900 isomorphic to each other is tell you-- 70 00:04:53,900 --> 00:04:56,750 it's just to say which nodes in the one graph 71 00:04:56,750 --> 00:05:00,390 correspond to which other nodes in the other graph. 72 00:05:00,390 --> 00:05:01,950 And then all you'll need to check 73 00:05:01,950 --> 00:05:04,650 is that the edge relationships are 74 00:05:04,650 --> 00:05:08,070 consistent with that mapping or that isomorphism, 75 00:05:08,070 --> 00:05:08,760 as it's called. 76 00:05:11,330 --> 00:05:15,310 So it's easy to see that the ISO problem is in NP. 
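The certificate check described here can be sketched in a few lines of Python. This is only an illustration under assumptions that are not from the lecture: graphs are given as lists of undirected edges, and the function name `is_isomorphism` and the dictionary `mapping` (node of G to node of H) are hypothetical.

```python
def is_isomorphism(g_edges, h_edges, mapping):
    """Return True if `mapping` (node of G -> node of H) is one-to-one
    and carries the edge set of G exactly onto the edge set of H."""
    # Treat edges as unordered pairs so (u, v) and (v, u) are the same edge.
    g = {frozenset(e) for e in g_edges}
    h = {frozenset(e) for e in h_edges}
    # The mapping must not send two different nodes to the same node.
    if len(set(mapping.values())) != len(mapping):
        return False
    # Relabel every edge of G and compare with the edges of H.
    mapped = {frozenset(mapping[u] for u in e) for e in g}
    return mapped == h


# G is the path 0-1-2; H is the path 2-0-1 (same shape, different labels).
G = [(0, 1), (1, 2)]
H = [(2, 0), (0, 1)]
good = {0: 2, 1: 0, 2: 1}   # a correct correspondence
bad = {0: 0, 1: 1, 2: 2}    # the identity does not work here
```

Checking takes time proportional to the number of edges, which is the point: the verifier's work is polynomial once somebody hands over the mapping.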
77 00:05:15,310 --> 00:05:16,810 And if you're not getting that, make 78 00:05:16,810 --> 00:05:21,670 sure you understand because the whole first part of the lecture 79 00:05:21,670 --> 00:05:25,510 will be lost if you don't understand this ISO problem. 80 00:05:25,510 --> 00:05:29,200 Now, the question of whether you can 81 00:05:29,200 --> 00:05:36,590 test two graphs being isomorphic in polynomial time 82 00:05:36,590 --> 00:05:37,580 is not clear. 83 00:05:37,580 --> 00:05:42,530 And in fact, that's an unsolved problem to this day. 84 00:05:42,530 --> 00:05:44,270 And it's a problem that has generated 85 00:05:44,270 --> 00:05:48,230 an enormous literature. 86 00:05:48,230 --> 00:05:52,220 There are hundreds of papers on the graph isomorphism problem, 87 00:05:52,220 --> 00:05:58,050 as it's called, to try to resolve-- 88 00:05:58,050 --> 00:06:01,140 to try to see if one can find a polynomial time algorithm. 89 00:06:01,140 --> 00:06:04,700 And in fact, it was a very big result just in the last 10 90 00:06:04,700 --> 00:06:08,060 years where there was a subexponential algorithm given, 91 00:06:08,060 --> 00:06:12,950 so that was faster than the brute force search approach 92 00:06:12,950 --> 00:06:15,680 but didn't get it all the way down to polynomial. 93 00:06:15,680 --> 00:06:17,690 Now, why is there so much attention just 94 00:06:17,690 --> 00:06:19,820 to this one particular NP problem? 95 00:06:19,820 --> 00:06:24,020 It's because it's not known whether the graph isomorphism 96 00:06:24,020 --> 00:06:25,370 problem is NP complete. 97 00:06:25,370 --> 00:06:27,920 ISO is not known to be an NP complete problem, 98 00:06:27,920 --> 00:06:30,890 and that puts it into a very, very small class 99 00:06:30,890 --> 00:06:34,160 of problems in NP which are not known 100 00:06:34,160 --> 00:06:37,580 to be either in P or NP complete. 
101 00:06:37,580 --> 00:06:41,000 It's kind of a curiosity that for NP problems, 102 00:06:41,000 --> 00:06:45,800 almost all of them have ended up being in one side or the other. 103 00:06:45,800 --> 00:06:49,075 And in fact, it's a-- 104 00:06:54,030 --> 00:07:00,910 in fact, I think it's the only problem that 105 00:07:00,910 --> 00:07:03,820 just involves graphs that's not known 106 00:07:03,820 --> 00:07:07,690 to be either in P or NP complete. 107 00:07:07,690 --> 00:07:09,730 So I got a question here. 108 00:07:09,730 --> 00:07:12,370 What would be in between exponential and polynomial? 109 00:07:12,370 --> 00:07:15,400 For example, I don't remember what the bound is, 110 00:07:15,400 --> 00:07:22,490 but it's something in the range of n to the log n, 111 00:07:22,490 --> 00:07:24,860 a time complexity for the graph isomorphism. 112 00:07:24,860 --> 00:07:26,900 I may be getting that wrong. 113 00:07:26,900 --> 00:07:29,000 I don't remember exactly what the bound is. 114 00:07:29,000 --> 00:07:33,290 But that's significantly better than 2 115 00:07:33,290 --> 00:07:37,220 to the n or some exponential amount of time, 116 00:07:37,220 --> 00:07:40,260 but it's more than n to any constant, 117 00:07:40,260 --> 00:07:41,930 so it's more than any polynomial time. 118 00:07:45,030 --> 00:07:50,030 So another question of the same sort 119 00:07:50,030 --> 00:07:52,880 is whether the complementary problem is in NP 120 00:07:52,880 --> 00:07:56,450 or whether ISO is in coNP, or let's 121 00:07:56,450 --> 00:07:58,220 talk about it in terms of the complement, 122 00:07:58,220 --> 00:08:01,490 whether the complement of ISO, which I'll 123 00:08:01,490 --> 00:08:04,910 refer to as the non-ISO problem, whether that's 124 00:08:04,910 --> 00:08:06,320 known to be in NP. 125 00:08:06,320 --> 00:08:07,670 So that's also not known. 
126 00:08:10,850 --> 00:08:13,490 In other words, if I give you two graphs 127 00:08:13,490 --> 00:08:16,677 and I ask you to show that they're not isomorphic, 128 00:08:16,677 --> 00:08:18,260 suppose they aren't isomorphic and you 129 00:08:18,260 --> 00:08:20,522 go through the effort of determining that 130 00:08:20,522 --> 00:08:21,980 by a brute force search and now you 131 00:08:21,980 --> 00:08:26,030 want to prove that they're not isomorphic, 132 00:08:26,030 --> 00:08:29,280 well, it's not known to be in NP either. 133 00:08:29,280 --> 00:08:34,799 So there's no known short certificate of two graphs 134 00:08:34,799 --> 00:08:36,600 not being isomorphic. 135 00:08:36,600 --> 00:08:38,309 We don't know how to do that either. 136 00:08:38,309 --> 00:08:42,539 But there's something that's very interesting, nevertheless, 137 00:08:42,539 --> 00:08:47,970 and it has to do with the ability for one party 138 00:08:47,970 --> 00:08:51,780 to prove to another that graphs are either isomorphic or not 139 00:08:51,780 --> 00:08:52,960 isomorphic. 140 00:08:52,960 --> 00:08:55,440 So if you're just having it like a prover, 141 00:08:55,440 --> 00:08:57,270 we haven't really been necessarily 142 00:08:57,270 --> 00:08:59,760 formulating that this way so much in this class, 143 00:08:59,760 --> 00:09:01,770 but this is a completely equivalent way 144 00:09:01,770 --> 00:09:04,290 of formulating the notion of NP where you 145 00:09:04,290 --> 00:09:07,290 have a polynomial time verifier and a prover who 146 00:09:07,290 --> 00:09:08,730 can produce certificates. 147 00:09:08,730 --> 00:09:10,830 Say it's a powerful prover. 148 00:09:10,830 --> 00:09:14,910 So if you have a problem that's in NP, 149 00:09:14,910 --> 00:09:18,360 a prover can convince a polynomial time verifier 150 00:09:18,360 --> 00:09:22,380 that strings are in the language if in fact they are. 
151 00:09:22,380 --> 00:09:24,870 So in the case of the ISO problem, 152 00:09:24,870 --> 00:09:27,510 a prover can convince a polynomial time verifier 153 00:09:27,510 --> 00:09:32,950 that graphs are isomorphic just by exhibiting the isomorphism. 154 00:09:32,950 --> 00:09:38,040 Now, for the non-isomorphism case, 155 00:09:38,040 --> 00:09:40,260 we don't know that that problem is in NP, 156 00:09:40,260 --> 00:09:43,740 but it's still possible for a prover 157 00:09:43,740 --> 00:09:49,960 to convince a verifier that graphs are not isomorphic 158 00:09:49,960 --> 00:09:54,010 if you change the rules of the game slightly. 159 00:09:54,010 --> 00:09:57,910 So even though the non-ISO problem is not 160 00:09:57,910 --> 00:10:00,400 known to be in NP, a prover can still 161 00:10:00,400 --> 00:10:02,530 convince a polynomial time verifier 162 00:10:02,530 --> 00:10:04,765 that graphs are not isomorphic, assuming 163 00:10:04,765 --> 00:10:07,510 they are, in fact, not isomorphic, 164 00:10:07,510 --> 00:10:13,310 provided the prover and the verifier 165 00:10:13,310 --> 00:10:15,210 can interact with one another. 166 00:10:15,210 --> 00:10:18,260 So the verifier can ask questions of the prover, 167 00:10:18,260 --> 00:10:21,990 and the verifier gets to be probabilistic. 168 00:10:21,990 --> 00:10:24,420 So that's in this-- 169 00:10:24,420 --> 00:10:26,370 that's the sense in which I mean 170 00:10:26,370 --> 00:10:30,930 that this notion is a kind of a probabilistic version of NP. 171 00:10:33,950 --> 00:10:34,450 OK. 172 00:10:34,450 --> 00:10:40,950 So let me show you how that's done. 173 00:10:40,950 --> 00:10:46,040 So before we jump in to the method for a prover 174 00:10:46,040 --> 00:10:52,295 to show a verifier that graphs are not isomorphic, 175 00:10:52,295 --> 00:10:54,330 let's try to get a little clearer on the model. 
176 00:10:54,330 --> 00:10:56,420 So I'm going to first show it to you informally, 177 00:10:56,420 --> 00:10:57,878 and then we'll look at it formally. 178 00:10:59,960 --> 00:11:00,460 OK. 179 00:11:00,460 --> 00:11:06,310 So in interactive proofs, there are two parties. 180 00:11:06,310 --> 00:11:09,310 And I'm going to think about them as one of them 181 00:11:09,310 --> 00:11:12,620 is going to be the professor. 182 00:11:12,620 --> 00:11:16,700 So the professor is going to play the role of the verifier, 183 00:11:16,700 --> 00:11:21,200 in a sense, but it's like the one who checks. 184 00:11:21,200 --> 00:11:26,780 And the professor, being kind of old and tired and teaching too 185 00:11:26,780 --> 00:11:34,870 long maybe, can only operate in probabilistic polynomial time. 186 00:11:34,870 --> 00:11:37,980 So the professor, if he wants to tell whether two graphs are 187 00:11:37,980 --> 00:11:41,220 isomorphic or not, probabilistic polynomial time 188 00:11:41,220 --> 00:11:44,940 doesn't seem to be enough to tell whether two graphs are 189 00:11:44,940 --> 00:11:47,430 isomorphic or not because it seems to be 190 00:11:47,430 --> 00:11:50,480 a more than polynomial problem. 191 00:11:50,480 --> 00:11:58,440 However, the professor has help. 192 00:11:58,440 --> 00:12:05,370 It has an army of graduate students, 193 00:12:05,370 --> 00:12:07,620 and the graduate students, they're 194 00:12:07,620 --> 00:12:10,710 not limited in the same way the professor is. 195 00:12:10,710 --> 00:12:12,570 The graduate students are young. 196 00:12:12,570 --> 00:12:16,050 They are energetic. 197 00:12:16,050 --> 00:12:17,460 They can stay up all night. 198 00:12:17,460 --> 00:12:19,420 They know how to code. 199 00:12:19,420 --> 00:12:23,670 So the graduate students have unlimited computational 200 00:12:23,670 --> 00:12:25,838 ability. 
201 00:12:25,838 --> 00:12:28,380 So then we're going to think of the graduate students playing 202 00:12:28,380 --> 00:12:31,860 the role of the prover because they're not 203 00:12:31,860 --> 00:12:34,230 limited in their capabilities, we'll assume. 204 00:12:34,230 --> 00:12:36,580 The professor, on the other hand, is limited. 205 00:12:36,580 --> 00:12:39,120 So the professor wants to know if the two 206 00:12:39,120 --> 00:12:45,780 graphs are isomorphic, let's say, whatever they are. 207 00:12:45,780 --> 00:12:48,450 Can't do it by himself, so he's going 208 00:12:48,450 --> 00:12:54,690 to ask his students to figure out the answer and report back. 209 00:12:54,690 --> 00:12:56,490 Now, there's only one problem. 210 00:12:56,490 --> 00:13:00,237 The professor knows that students-- 211 00:13:00,237 --> 00:13:02,070 well, in the old days, they'd like to party. 212 00:13:02,070 --> 00:13:06,840 I guess these days, they like to play computer games a lot. 213 00:13:06,840 --> 00:13:10,050 And so they're not really that eager to spend all their time 214 00:13:10,050 --> 00:13:12,850 figuring out whether graphs are isomorphic. 215 00:13:12,850 --> 00:13:15,790 So he's worried that the students will just 216 00:13:15,790 --> 00:13:18,070 come up with some answer and figure that he won't 217 00:13:18,070 --> 00:13:21,620 be able to tell the difference. 218 00:13:21,620 --> 00:13:25,220 So the professor does not trust the students. 219 00:13:25,220 --> 00:13:27,820 It's not enough for the professor 220 00:13:27,820 --> 00:13:29,320 to give the problem to the students 221 00:13:29,320 --> 00:13:31,487 and just take any answer that they're going to give. 222 00:13:31,487 --> 00:13:34,870 The professor wants to be convinced. 
223 00:13:38,080 --> 00:13:47,590 So now, how could the students convince the professor 224 00:13:47,590 --> 00:13:49,590 of the answer, that they've really done the work 225 00:13:49,590 --> 00:13:52,380 and figured out whether the graphs are isomorphic or not? 226 00:13:52,380 --> 00:13:54,600 Well, if the graphs are isomorphic, 227 00:13:54,600 --> 00:13:56,610 if it turns out that the graphs were isomorphic 228 00:13:56,610 --> 00:13:59,230 and the students figure that out, 229 00:13:59,230 --> 00:14:02,400 then life is good because what are they 230 00:14:02,400 --> 00:14:05,730 going to do to convince the professor? 231 00:14:05,730 --> 00:14:07,500 They're going to hand over the isomorphism 232 00:14:07,500 --> 00:14:11,690 and show, yeah, I mean, they are. 233 00:14:11,690 --> 00:14:13,640 Those graphs really are isomorphic, 234 00:14:13,640 --> 00:14:15,760 and here's how the correspondence works. 235 00:14:15,760 --> 00:14:18,800 Professor can check, oh, yeah. 236 00:14:18,800 --> 00:14:20,630 Now I'm convinced. 237 00:14:20,630 --> 00:14:24,972 But suppose the graphs were not isomorphic. 238 00:14:24,972 --> 00:14:26,180 What are we going to do then? 239 00:14:29,165 --> 00:14:31,780 The students have figured out graphs are not isom-- 240 00:14:31,780 --> 00:14:33,880 the professor wants to be convinced. 241 00:14:33,880 --> 00:14:34,420 Oh, no. 242 00:14:34,420 --> 00:14:35,720 What are we going to do? 243 00:14:35,720 --> 00:14:38,283 Well, in fact, we're going to engage-- 244 00:14:38,283 --> 00:14:39,700 the professor and the students are 245 00:14:39,700 --> 00:14:43,130 going to engage in the following protocol. 246 00:14:43,130 --> 00:14:45,530 Dialogue. 
247 00:14:45,530 --> 00:14:47,000 What's going to happen is-- 248 00:14:47,000 --> 00:14:48,960 now, you have to make sure your-- 249 00:14:48,960 --> 00:14:53,018 this is critical to understand this little part of the story 250 00:14:53,018 --> 00:14:54,560 here because it's really going to set 251 00:14:54,560 --> 00:14:57,080 the pattern for everything in today's and tomorr-- 252 00:14:57,080 --> 00:15:00,900 in today's lecture and the next lecture. 253 00:15:00,900 --> 00:15:05,940 So we're going to engage in the following interaction 254 00:15:05,940 --> 00:15:08,640 between the students and the professor, 255 00:15:08,640 --> 00:15:12,330 which is going to enable the students to convince 256 00:15:12,330 --> 00:15:16,290 the professor that the two graphs really are not 257 00:15:16,290 --> 00:15:18,230 isomorphic. 258 00:15:18,230 --> 00:15:21,140 So how is that going to work? 259 00:15:21,140 --> 00:15:24,780 This is a beautiful little thing, by the way. 260 00:15:24,780 --> 00:15:28,260 So the professor is going to take the two graphs 261 00:15:28,260 --> 00:15:31,960 and pick one of them at random. 262 00:15:31,960 --> 00:15:35,710 Has these two graphs, G and H. Let's 263 00:15:35,710 --> 00:15:38,352 say they really are not isomorphic. 264 00:15:38,352 --> 00:15:40,060 The professor doesn't know that for sure. 265 00:15:40,060 --> 00:15:41,352 That's what the students claim. 266 00:15:41,352 --> 00:15:43,900 The professor really wants to be convinced 267 00:15:43,900 --> 00:15:45,940 that the students are right. 268 00:15:45,940 --> 00:15:48,534 So the professor's going to pick one of the two at random. 269 00:15:48,534 --> 00:15:53,380 Randomly permute that choice, the one 270 00:15:53,380 --> 00:15:56,740 that he picked, and hand it over to the students. 271 00:15:56,740 --> 00:16:03,590 Say, OK, here is one of those two graphs randomly scrambled. 272 00:16:03,590 --> 00:16:06,845 Then I'm going to ask the students, which one did I pick? 
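The professor's move, as just described, can be sketched like this. It is a minimal illustration, not anything shown in the lecture: the function name `verifier_challenge`, the edge-list representation, and the node labels 0..n-1 are all assumptions made for the example.

```python
import random


def verifier_challenge(g_edges, h_edges, n, rng=random):
    """Secretly pick G or H with a fair coin, relabel its n nodes with a
    random permutation, and return (secret_choice, scrambled_edges)."""
    secret = rng.choice(["G", "H"])            # the private coin flip
    edges = g_edges if secret == "G" else h_edges
    perm = list(range(n))
    rng.shuffle(perm)                          # random relabeling of the nodes
    # Sort endpoints and edges so the original edge order leaks nothing.
    scrambled = sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges)
    return secret, scrambled


G = [(0, 1), (1, 2), (2, 3)]   # a path on 4 nodes
H = [(0, 1), (0, 2), (0, 3)]   # a star on 4 nodes
secret, K = verifier_challenge(G, H, 4, random.Random(0))
```

Only `K` would be sent to the students; `secret` stays with the professor and is compared with their answer afterward.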
273 00:16:13,350 --> 00:16:18,880 Now, if the graphs were really not isomorphic, 274 00:16:18,880 --> 00:16:23,020 the students can check whether that randomly 275 00:16:23,020 --> 00:16:26,140 scrambled graph is isomorphic to either G 276 00:16:26,140 --> 00:16:29,973 or to H. It's going to be isomorphic to one or the other. 277 00:16:29,973 --> 00:16:31,640 And then the students can figure it out, 278 00:16:31,640 --> 00:16:34,100 and they say, oh you picked G. Or no, you 279 00:16:34,100 --> 00:16:36,630 picked H, as the case may be. 280 00:16:36,630 --> 00:16:38,640 The students can figure that out. 281 00:16:38,640 --> 00:16:42,770 But if the graphs were isomorphic, 282 00:16:42,770 --> 00:16:46,730 then that scrambled version of G or H 283 00:16:46,730 --> 00:16:49,430 could equally well have come from either of them. 284 00:16:51,863 --> 00:16:53,280 And the students would have no way 285 00:16:53,280 --> 00:16:59,150 of knowing which one the professor picked. 286 00:16:59,150 --> 00:17:01,700 So there's nothing they could do which 287 00:17:01,700 --> 00:17:05,040 would be better than guessing. 288 00:17:05,040 --> 00:17:07,339 So if we do that a bunch of times, the professor 289 00:17:07,339 --> 00:17:12,470 picks at random, sometimes go secretly of course, 290 00:17:12,470 --> 00:17:18,520 picks either G or picks H and the students get it 291 00:17:18,520 --> 00:17:22,440 right every time, either the students 292 00:17:22,440 --> 00:17:25,349 are really doing the work and the graphs are really 293 00:17:25,349 --> 00:17:30,570 not isomorphic or the students are just incredibly lucky. 294 00:17:30,570 --> 00:17:33,465 They're managing to guess right, let's say, 100 times. 295 00:17:38,480 --> 00:17:40,180 So how would the stu-- 296 00:17:40,180 --> 00:17:44,320 the professor randomly and secretly picks G or H, 297 00:17:44,320 --> 00:17:46,240 uses its probabilism. 298 00:17:46,240 --> 00:17:47,890 Flips a coin. 
299 00:17:47,890 --> 00:17:49,660 Just a two-sided coin. 300 00:17:49,660 --> 00:17:51,610 Says, OK, sometimes I'm going to do G, 301 00:17:51,610 --> 00:17:55,520 sometimes I'm going to do H. Just completely at random picks 302 00:17:55,520 --> 00:17:57,120 one or the other. 303 00:17:57,120 --> 00:18:00,380 Then with some more randomness, finds 304 00:18:00,380 --> 00:18:03,890 a random permutation of the one that he picked and then 305 00:18:03,890 --> 00:18:06,200 sends that over to the students and say, 306 00:18:06,200 --> 00:18:07,640 which one did it come from? 307 00:18:15,680 --> 00:18:16,720 So I'm not sure-- 308 00:18:16,720 --> 00:18:19,005 OK, so let's pause here. 309 00:18:19,005 --> 00:18:21,130 Let's make sure we all understand this because this 310 00:18:21,130 --> 00:18:23,420 is really important. 311 00:18:23,420 --> 00:18:25,000 So I'm getting a question here. 312 00:18:25,000 --> 00:18:28,060 How do we-- I'm not sure what your question is. 313 00:18:28,060 --> 00:18:30,768 OK, so let me just say the professor is going to play 314 00:18:30,768 --> 00:18:31,810 the role of the verifier. 315 00:18:31,810 --> 00:18:33,050 The graduate students play the role 316 00:18:33,050 --> 00:18:34,510 of the prover that's coming, but I really 317 00:18:34,510 --> 00:18:36,040 want to understand this protocol here. 318 00:18:36,040 --> 00:18:36,540 OK. 319 00:18:36,540 --> 00:18:40,660 So how is the professor picking the graphs again? 320 00:18:40,660 --> 00:18:42,670 OK, I don't-- picking the graphs at random. 321 00:18:42,670 --> 00:18:43,780 You have just two graphs. 322 00:18:43,780 --> 00:18:45,630 They're part of the input. 323 00:18:45,630 --> 00:18:49,537 Both the students and the professor can see the graphs, 324 00:18:49,537 --> 00:18:51,370 and the professor's just picking one of them 325 00:18:51,370 --> 00:18:52,340 at random using a coin. 326 00:18:52,340 --> 00:18:54,340 So I'm not sure I understand the question there. 
327 00:18:54,340 --> 00:18:56,050 Could P and V engage in a protocol 328 00:18:56,050 --> 00:18:58,990 where the secret here is on the prover side instead? 329 00:18:58,990 --> 00:19:02,050 The question of revealing the isomorphism-- there is no iso-- 330 00:19:02,050 --> 00:19:04,750 I'm not sure I understand this question either. 331 00:19:04,750 --> 00:19:06,040 Maybe we'll make this clear-- 332 00:19:08,750 --> 00:19:14,140 for this little illustration, the professor doesn't know. 333 00:19:14,140 --> 00:19:15,790 The graphs could be isomorphic or they 334 00:19:15,790 --> 00:19:19,000 could be not isomorphic. 335 00:19:19,000 --> 00:19:22,540 And so the professor wants to be convinced either way, whatever 336 00:19:22,540 --> 00:19:25,090 the students-- whatever answer the students come up with. 337 00:19:25,090 --> 00:19:28,240 We're going to shift this into a problem 338 00:19:28,240 --> 00:19:31,307 about deciding a language next. 339 00:19:31,307 --> 00:19:32,890 But right now, I'm just trying to give 340 00:19:32,890 --> 00:19:35,230 a sense of how the model works. 341 00:19:35,230 --> 00:19:37,150 I want to move from this informal model, 342 00:19:37,150 --> 00:19:40,630 and now I'm going to formalize that in terms of model 343 00:19:40,630 --> 00:19:43,960 which will be deciding a language. 344 00:19:43,960 --> 00:19:46,030 OK? 345 00:19:46,030 --> 00:19:50,530 So the interactive proof system model, 346 00:19:50,530 --> 00:19:54,040 we have two interacting parties, a verifier, which 347 00:19:54,040 --> 00:19:56,230 is probabilistic polynomial time, 348 00:19:56,230 --> 00:19:58,780 played by the professor in the previous slide, 349 00:19:58,780 --> 00:20:01,990 and the prover, which is unlimited computational power, 350 00:20:01,990 --> 00:20:04,810 played by the students in the previous slide. 
351 00:20:08,210 --> 00:20:11,030 Both of them get to see the input, which 352 00:20:11,030 --> 00:20:13,980 in the previous case, well, it could be, 353 00:20:13,980 --> 00:20:17,360 for example, the pair of graphs. 354 00:20:17,360 --> 00:20:20,660 They exchange a polynomial number of polynomial-size messages. 355 00:20:20,660 --> 00:20:27,210 So the whole exchange, including the verifier's own computation, 356 00:20:27,210 --> 00:20:28,920 is going to be polynomial. 357 00:20:28,920 --> 00:20:31,020 The only thing that's not included 358 00:20:31,020 --> 00:20:34,800 within the computational cost is the prover's work, 359 00:20:34,800 --> 00:20:35,820 which is unlimited. 360 00:20:40,170 --> 00:20:42,510 After that, the verifier-- after the interaction, 361 00:20:42,510 --> 00:20:45,430 the verifier will accept or reject. 362 00:20:45,430 --> 00:20:47,970 And we're going to define the probability 363 00:20:47,970 --> 00:20:51,120 that the verifier, together with a particular prover, 364 00:20:51,120 --> 00:20:57,570 ends up accepting as you look over the different possible 365 00:20:57,570 --> 00:21:02,210 coin tosses of the verifier, which could lead 366 00:21:02,210 --> 00:21:06,325 to different behavior on the part of the verifier 367 00:21:06,325 --> 00:21:07,700 and therefore, different behavior 368 00:21:07,700 --> 00:21:10,530 on the part of the prover. 369 00:21:10,530 --> 00:21:13,470 So over all the different possibilities 370 00:21:13,470 --> 00:21:15,840 for the verifier's computation, we're 371 00:21:15,840 --> 00:21:18,120 going to look at the probability that the verifier 372 00:21:18,120 --> 00:21:20,752 with this particular prover ends up accepting. 373 00:21:20,752 --> 00:21:21,960 And I've written it this way. 374 00:21:21,960 --> 00:21:24,930 It says the probability of the verifier interacting 375 00:21:24,930 --> 00:21:27,840 with the prover accepts the input. 376 00:21:27,840 --> 00:21:28,830 It's just simply that. 
377 00:21:32,580 --> 00:21:35,170 And so we're going to work through an example. 378 00:21:35,170 --> 00:21:39,150 We're going to work through the previous example more 379 00:21:39,150 --> 00:21:40,560 precisely in a second. 380 00:21:43,110 --> 00:21:45,810 The class IP for Interactive Proofs 381 00:21:45,810 --> 00:21:49,170 stands for-- it's a class of languages 382 00:21:49,170 --> 00:21:54,460 such that for some verifier and a prover, 383 00:21:54,460 --> 00:21:58,030 for strings in the language, the prover 384 00:21:58,030 --> 00:22:01,870 makes the verifier accept with high probability. 385 00:22:01,870 --> 00:22:03,580 And here is the interesting part. 386 00:22:03,580 --> 00:22:06,520 For strings not in the language, the prover 387 00:22:06,520 --> 00:22:08,560 makes it accept with low probability, 388 00:22:08,560 --> 00:22:10,930 but there's no prover which can make it 389 00:22:10,930 --> 00:22:12,590 accept with high probability. 390 00:22:12,590 --> 00:22:16,160 So there's no way to cheat. 391 00:22:16,160 --> 00:22:18,380 If you think about it in the case of the graph 392 00:22:18,380 --> 00:22:27,130 non-isomorphism, if the graphs were really isomorphic 393 00:22:27,130 --> 00:22:30,890 and the students were trying to, in a devious way, 394 00:22:30,890 --> 00:22:35,210 prove through that protocol that they're not isomorphic, 395 00:22:35,210 --> 00:22:38,660 they would fail because there's nothing they can do. 396 00:22:38,660 --> 00:22:42,740 If the graphs were isomorphic, then when 397 00:22:42,740 --> 00:22:46,490 the verifier, or the professor, picks one or the other 398 00:22:46,490 --> 00:22:50,500 at random and scrambles it, the students 399 00:22:50,500 --> 00:22:53,290 would have no way of telling which one the professor did. 400 00:22:53,290 --> 00:22:57,220 So no matter what kind of scheme they try to come up with, 401 00:22:57,220 --> 00:22:59,220 they're going to be out of luck. 
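The "no way to cheat" claim can be tried out on tiny graphs with a small experiment. This is a sketch under assumptions that are not from the lecture: the helper names (`canon`, `isomorphic`, `one_round`, `success_rate`) are hypothetical, the prover is a brute-force search over all relabelings, and the prover shown is just one particular strategy (answer "G" whenever the scrambled graph matches G).

```python
import itertools
import random


def canon(edges):
    """Edge list -> set of unordered pairs, so edge order is ignored."""
    return frozenset(frozenset(e) for e in edges)


def isomorphic(a_edges, b_edges, n):
    """Brute force: try every relabeling of n nodes (tiny graphs only)."""
    target = canon(b_edges)
    for perm in itertools.permutations(range(n)):
        if canon((perm[u], perm[v]) for u, v in a_edges) == target:
            return True
    return False


def one_round(g, h, n, rng):
    """Run one challenge; return True if the prover names the secret pick."""
    secret = rng.choice(["G", "H"])          # verifier's private coin
    src = g if secret == "G" else h
    perm = list(range(n))
    rng.shuffle(perm)
    k = [(perm[u], perm[v]) for u, v in src]  # scrambled graph sent over
    # Prover's strategy: say "G" if K matches G, otherwise say "H".
    answer = "G" if isomorphic(k, g, n) else "H"
    return answer == secret


def success_rate(g, h, n, trials=400, seed=0):
    """Fraction of rounds in which the prover answers correctly."""
    rng = random.Random(seed)
    return sum(one_round(g, h, n, rng) for _ in range(trials)) / trials


path = [(0, 1), (1, 2), (2, 3)]
star = [(0, 1), (0, 2), (0, 3)]      # not isomorphic to the path
path2 = [(1, 0), (0, 2), (2, 3)]     # the same path, relabeled
```

With `path` and `star` the prover is right in every round. With `path` and `path2` the scrambled graph matches both inputs, so this prover is right only when the coin happened to land on G, which is about half the time.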
402 00:22:59,220 --> 00:23:03,230 So it's no ma-- for any strategy, 403 00:23:03,230 --> 00:23:07,610 for strings that are not in the language, for any prover-- 404 00:23:07,610 --> 00:23:09,830 calling that P with a tilde to stand 405 00:23:09,830 --> 00:23:13,520 for a devious or crooked prover. 406 00:23:13,520 --> 00:23:17,210 For any possibly crooked prover, even that 407 00:23:17,210 --> 00:23:19,040 would be working with the verifier is still 408 00:23:19,040 --> 00:23:23,600 going to end up accepting with low probability. 409 00:23:23,600 --> 00:23:25,520 So strings in the language, there's 410 00:23:25,520 --> 00:23:27,410 going to be an honest prover who just follows 411 00:23:27,410 --> 00:23:30,920 the protocol in the correct way, which makes the verifier accept 412 00:23:30,920 --> 00:23:32,210 with high probability. 413 00:23:32,210 --> 00:23:35,600 For strings not in the language, every prover 414 00:23:35,600 --> 00:23:39,373 is going to fail to make it accept with high probability. 415 00:23:42,190 --> 00:23:42,690 OK. 416 00:23:42,690 --> 00:23:44,107 So I mean, the way I like to think 417 00:23:44,107 --> 00:23:49,278 about it is that P tilde is a possibly crooked prover which 418 00:23:49,278 --> 00:23:51,570 is trying to make the verifier accept when it shouldn't 419 00:23:51,570 --> 00:23:55,620 because the string is not in the language. 420 00:23:55,620 --> 00:23:59,910 It's like you can think of this in the case of satisfiability. 421 00:24:04,780 --> 00:24:08,080 A crooked prover might try to convince the verifier 422 00:24:08,080 --> 00:24:11,980 that the formula's satisfiable when it isn't by somehow trying 423 00:24:11,980 --> 00:24:13,510 to produce a satisfying assignment, 424 00:24:13,510 --> 00:24:14,990 but that's going to be impossible. 
425 00:24:14,990 --> 00:24:17,230 There's nothing any strategy can possibly 426 00:24:17,230 --> 00:24:19,663 work when the formula is not satisfiable 427 00:24:19,663 --> 00:24:21,580 if that's what the verifier is going to check. 428 00:24:21,580 --> 00:24:24,670 It's going to be looking for that satisfying assignment. 429 00:24:24,670 --> 00:24:25,510 OK? 430 00:24:25,510 --> 00:24:27,970 And by the way, we're not going to prove this, 431 00:24:27,970 --> 00:24:30,310 but it's really going to be proved in the same way. 432 00:24:30,310 --> 00:24:32,770 You can make that one third error 433 00:24:32,770 --> 00:24:35,860 that occurs here, something very tiny, 434 00:24:35,860 --> 00:24:40,170 by the same kind of repetition argument. 435 00:24:40,170 --> 00:24:40,890 OK? 436 00:24:40,890 --> 00:24:42,105 So let's see. 437 00:24:44,980 --> 00:24:49,420 So why can't the prover in the first case be crooked? 438 00:24:49,420 --> 00:24:51,760 The prover in the first case could be crooked, 439 00:24:51,760 --> 00:24:54,055 but that's not going to serve the purposes. 440 00:24:56,930 --> 00:24:59,750 What we want to show-- 441 00:24:59,750 --> 00:25:03,760 think about it like we think about NP. 442 00:25:03,760 --> 00:25:06,310 For strings in the language, there exists a certificate. 443 00:25:06,310 --> 00:25:09,530 There is a proof that you're in the language. 444 00:25:09,530 --> 00:25:17,240 So if somebody is going to not produce the proof, 445 00:25:17,240 --> 00:25:18,620 that's irrelevant. 446 00:25:18,620 --> 00:25:20,780 The question is, if you look at the best 447 00:25:20,780 --> 00:25:26,720 possible case, the best possible prover who's going to be able-- 448 00:25:26,720 --> 00:25:28,700 we're asking, does there exist a way 449 00:25:28,700 --> 00:25:35,940 to convince the verifier that the string is in the language? 
450 00:25:35,940 --> 00:25:40,650 So it doesn't matter that there might be some other silly way 451 00:25:40,650 --> 00:25:41,400 that doesn't work. 452 00:25:41,400 --> 00:25:43,583 We're just looking at the best possible way. 453 00:25:43,583 --> 00:25:45,750 So the best possible way when you're in the language 454 00:25:45,750 --> 00:25:47,542 is going to end up with the verifier accepting with 455 00:25:47,542 --> 00:25:48,900 high probability. 456 00:25:48,900 --> 00:25:51,150 When you're not in the language, the best possible way 457 00:25:51,150 --> 00:25:53,430 is still going to end up with low probability. 458 00:25:53,430 --> 00:25:55,830 When I talk about best possible, I'm 459 00:25:55,830 --> 00:25:58,582 trying to maximize the probability that the verifier 460 00:25:58,582 --> 00:25:59,790 is going to end up accepting. 461 00:25:59,790 --> 00:26:01,453 Let's continue. 462 00:26:01,453 --> 00:26:03,120 Not sure I was as clear as I would like, 463 00:26:03,120 --> 00:26:08,520 but maybe again we're going to stick with that example 464 00:26:08,520 --> 00:26:13,320 because this is a very helpful example 465 00:26:13,320 --> 00:26:17,920 to try to understand the setup. 466 00:26:17,920 --> 00:26:20,160 And so we're going to-- 467 00:26:20,160 --> 00:26:22,740 I'm going to revisit that previous example 468 00:26:22,740 --> 00:26:26,970 about non-isomorphism but now in the context of thinking 469 00:26:26,970 --> 00:26:28,590 about it as a language. 470 00:26:28,590 --> 00:26:31,465 So we're going to take this non-isomorphism-- 471 00:26:37,650 --> 00:26:38,940 yeah. 472 00:26:38,940 --> 00:26:41,790 We're going to take the non-isomorphism problem 473 00:26:41,790 --> 00:26:43,230 and show that it's in IP.
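As a compact restatement of the definition just discussed, here is a sketch in LaTeX. The thresholds 2/3 and 1/3 are the bounds the lecture uses; the notation $\Pr[(V \leftrightarrow P)\text{ accepts } w]$ for "the verifier accepts after interacting with the prover" is an assumption made here for illustration.

```latex
A \in \mathrm{IP} \iff
\text{there exist a probabilistic polynomial-time verifier } V
\text{ and a prover } P \text{ such that}
\begin{aligned}
w \in A    &\;\Longrightarrow\; \Pr[(V \leftrightarrow P)\text{ accepts } w] \ge \tfrac{2}{3},\\
w \notin A &\;\Longrightarrow\; \Pr[(V \leftrightarrow \tilde{P})\text{ accepts } w] \le \tfrac{1}{3}
\quad\text{for every prover } \tilde{P}.
\end{aligned}
```

The first line concerns only the best (honest) prover; the second quantifies over every possibly crooked prover $\tilde{P}$.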
474 00:26:43,230 --> 00:26:45,360 So there's going to be a verifier together 475 00:26:45,360 --> 00:26:48,150 with a prover, which are going to make the verifier accept 476 00:26:48,150 --> 00:26:51,990 with high probability for strings in the language, namely 477 00:26:51,990 --> 00:26:55,923 graphs not being isomorphic, and there's 478 00:26:55,923 --> 00:26:57,840 going to be no way to make the verifier accept 479 00:26:57,840 --> 00:27:00,840 with high probability for strings out of the language. 480 00:27:00,840 --> 00:27:04,220 That's when the graphs are isomorphic. 481 00:27:04,220 --> 00:27:06,080 OK. 482 00:27:06,080 --> 00:27:07,970 So the protocol is just we're going to repeat 483 00:27:07,970 --> 00:27:09,440 the following thing twice. 484 00:27:09,440 --> 00:27:12,650 You know, I said in the previous case do it 100 times just 485 00:27:12,650 --> 00:27:15,350 to help us to think about it, but actually, twice 486 00:27:15,350 --> 00:27:17,750 is going to be enough to get the bound we need. 487 00:27:17,750 --> 00:27:19,790 So the verifier is going to operate 488 00:27:19,790 --> 00:27:22,970 like this, in terms of this is the verifier 489 00:27:22,970 --> 00:27:27,290 first communicating, sending messages to the prover. 490 00:27:27,290 --> 00:27:30,890 It's going to randomly choose G or H, just 491 00:27:30,890 --> 00:27:33,380 like what the professor did last time, 492 00:27:33,380 --> 00:27:37,190 randomly permute the result to get a new graph, K, 493 00:27:37,190 --> 00:27:39,140 which was going to be-- 494 00:27:39,140 --> 00:27:42,650 which is isomorphic either to G or H depending upon the choice 495 00:27:42,650 --> 00:27:48,270 the verifier made, and then send that graph K. 496 00:27:48,270 --> 00:27:51,960 Now, it's the prover's turn to respond-- 497 00:27:51,960 --> 00:27:55,800 the prover's going to compare K with both 498 00:27:55,800 --> 00:27:56,760 of the original graphs.
499 00:27:56,760 --> 00:27:59,440 It's got to be isomorphic to one or the other. 500 00:27:59,440 --> 00:28:02,980 And it's going to report back which one. 501 00:28:02,980 --> 00:28:05,050 Just going to say, well, you picked G. No. 502 00:28:05,050 --> 00:28:06,843 Or you picked H. 503 00:28:06,843 --> 00:28:09,010 Because the prover, with its unlimited capabilities, 504 00:28:09,010 --> 00:28:10,240 can determine that. 505 00:28:13,470 --> 00:28:16,235 And then V accepts if the prover was right both times. 506 00:28:19,440 --> 00:28:21,920 And if the prover was ever not right, 507 00:28:21,920 --> 00:28:24,050 the verifier says, oh, something's fishy here. 508 00:28:24,050 --> 00:28:27,710 Because we know that the prover has unlimited capability, 509 00:28:27,710 --> 00:28:33,620 so it could get it right if this was an honest prover. 510 00:28:33,620 --> 00:28:38,450 And so if it's not getting it right, 511 00:28:38,450 --> 00:28:40,160 then the verifier is going to reject. 512 00:28:43,350 --> 00:28:45,440 So if the graphs are not isomorphic, 513 00:28:45,440 --> 00:28:47,990 the prover can tell which one the verifier picked randomly. 514 00:28:47,990 --> 00:28:52,610 So therefore, if the graphs are not isomorphic, 515 00:28:52,610 --> 00:28:55,280 the verifier with that honest prover 516 00:28:55,280 --> 00:29:00,090 will accept with probability 1 because that honest prover 517 00:29:00,090 --> 00:29:03,520 is always going to get the right answer, which is at least 2/3, 518 00:29:03,520 --> 00:29:04,600 the bound we need. 519 00:29:07,560 --> 00:29:12,490 We don't care about the space used, in answer to a question. 520 00:29:12,490 --> 00:29:18,300 If we were not in the language, so G and H are isomorphic, 521 00:29:18,300 --> 00:29:21,030 then there's nothing any crooked prover could possibly 522 00:29:21,030 --> 00:29:23,820 do because it gets a graph. 523 00:29:23,820 --> 00:29:24,405 Can't tell.
524 00:29:24,405 --> 00:29:26,280 There's no way to tell whether it came from G 525 00:29:26,280 --> 00:29:32,700 or came from H. So that crooked prover would have o-- 526 00:29:32,700 --> 00:29:34,500 the best thing it could do is guess. 527 00:29:34,500 --> 00:29:37,920 So a 50% chance of answering correctly each time and only 528 00:29:37,920 --> 00:29:41,990 a 25% chance for doing it twice. 529 00:29:41,990 --> 00:29:43,850 And that's why I did it twice, in order 530 00:29:43,850 --> 00:29:49,190 to get that error to be small. 531 00:29:49,190 --> 00:29:52,300 So it's only a 25% chance of the prover getting lucky, 532 00:29:52,300 --> 00:29:56,030 so that would be an error case if the prover, just by chance, 533 00:29:56,030 --> 00:29:59,990 picked the right answer twice, even though the graphs were 534 00:29:59,990 --> 00:30:01,260 isomorphic. 535 00:30:01,260 --> 00:30:04,670 So therefore, for the isomorphic case, 536 00:30:04,670 --> 00:30:07,700 the verifier interacting with any prover 537 00:30:07,700 --> 00:30:10,190 is going to accept that input with, at most, 538 00:30:10,190 --> 00:30:13,620 one quarter, 25% of the time, which is less than a third. 539 00:30:13,620 --> 00:30:16,910 So that's just to achieve that bound. 540 00:30:16,910 --> 00:30:17,660 OK? 541 00:30:17,660 --> 00:30:22,040 So let's answer some questions first, and then I'll try to-- 542 00:30:28,130 --> 00:30:30,817 I'll ask you. 543 00:30:30,817 --> 00:30:31,650 You understand this? 544 00:30:31,650 --> 00:30:34,720 So I think it's worth trying to understand 545 00:30:34,720 --> 00:30:39,910 this model of this interactive proof system. 
546 00:30:39,910 --> 00:30:41,890 It's a little slippery, I realize, 547 00:30:41,890 --> 00:30:47,410 but if you just hold onto your intuition of the prover trying 548 00:30:47,410 --> 00:30:50,800 to convince-- 549 00:30:50,800 --> 00:30:52,510 a powerful prover trying to convince 550 00:30:52,510 --> 00:31:00,385 a limited verifier of some string being in a language. 551 00:31:03,430 --> 00:31:06,340 You want the prover to be able to succeed when the string is 552 00:31:06,340 --> 00:31:08,410 in the language but fail when the string is not 553 00:31:08,410 --> 00:31:09,077 in the language. 554 00:31:11,620 --> 00:31:12,120 Yes. 555 00:31:12,120 --> 00:31:13,650 We are going to-- somebody's asking 556 00:31:13,650 --> 00:31:16,890 if the prover is identifying G or H by brute force. 557 00:31:16,890 --> 00:31:17,610 Yes. 558 00:31:17,610 --> 00:31:20,580 The prover is going to use its unlimited capabilities 559 00:31:20,580 --> 00:31:30,310 to determine, given K, whether it came from G or H. 560 00:31:30,310 --> 00:31:34,370 The computational cost of the prover is irrelevant for this. 561 00:31:34,370 --> 00:31:37,610 It's just like when we think about a certificate 562 00:31:37,610 --> 00:31:39,075 for satisfiability. 563 00:31:39,075 --> 00:31:40,700 We don't talk about the cost of finding 564 00:31:40,700 --> 00:31:43,430 that certificate for NP. 565 00:31:43,430 --> 00:31:46,940 For IP, again, we don't talk about the cost of the prover 566 00:31:46,940 --> 00:31:47,870 running. 567 00:31:47,870 --> 00:31:50,510 So somebody is asking, does the crooked prover 568 00:31:50,510 --> 00:31:54,530 answer just randomly, or can the crooked prover have a strategy? 569 00:31:54,530 --> 00:31:58,200 The crooked prover can have a strategy. 570 00:31:58,200 --> 00:32:01,150 We're assuming the crooked prover is devious. 571 00:32:01,150 --> 00:32:03,030 But it's still going to fail. 572 00:32:03,030 --> 00:32:04,033 OK. 
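The two-round non-isomorphism protocol described above can be sketched as a small simulation. This is a minimal illustration, not the lecture's own code: the graph representation (a set of two-element edges on nodes 0..n-1, no self-loops) and all function names are assumptions, and the prover's brute-force isomorphism test stands in for its "unlimited capabilities".

```python
import itertools
import random

def normalize(edges):
    """Store an undirected graph as a frozenset of 2-element frozensets."""
    return frozenset(frozenset(e) for e in edges)

def permute(graph, perm):
    """Relabel every node u of the graph as perm[u]."""
    return frozenset(frozenset((perm[u], perm[v])) for u, v in graph)

def isomorphic(a, b, n):
    """Brute force over all n! relabelings; the prover's cost is ignored."""
    return any(permute(a, p) == b for p in itertools.permutations(range(n)))

def honest_prover(g, h, k, n):
    """Report which original graph K is isomorphic to."""
    return "G" if isomorphic(g, k, n) else "H"

def verifier_round(g, h, n, prover, rng):
    coin = rng.choice("GH")            # the verifier's secret random choice
    perm = list(range(n))
    rng.shuffle(perm)                  # random relabeling of the chosen graph
    k = permute(g if coin == "G" else h, perm)
    return prover(g, h, k, n) == coin  # was the prover right this round?

def run_protocol(g, h, n, prover, rng, rounds=2):
    """V accepts only if the prover is right in every round."""
    return all(verifier_round(g, h, n, prover, rng) for _ in range(rounds))
```

With non-isomorphic inputs (for example a 4-node path and a 4-node star) the honest prover is right every time. With isomorphic inputs, K carries no information about the coin, so any prover, including this one, is right in a round only half the time and passes both rounds about a quarter of the time.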
573 00:32:04,033 --> 00:32:04,950 Let's do the check-in. 574 00:32:07,550 --> 00:32:10,400 Suppose we change the model so that the prover can 575 00:32:10,400 --> 00:32:15,410 watch the verifier picking its random choices. 576 00:32:15,410 --> 00:32:18,300 So the verifier cannot act in secret anymore, 577 00:32:18,300 --> 00:32:22,580 but the prover can watch the verifier. 578 00:32:22,580 --> 00:32:24,920 Now, let's suppose we had the same protocol 579 00:32:24,920 --> 00:32:26,330 that I just described. 580 00:32:26,330 --> 00:32:28,370 What language do we end up with? 581 00:32:28,370 --> 00:32:30,350 Is it the same language, different language, 582 00:32:30,350 --> 00:32:33,170 and what is that language? 583 00:32:33,170 --> 00:32:35,370 So going to hopefully-- 584 00:32:39,190 --> 00:32:42,250 it'll give me some sense of how well you're following me 585 00:32:42,250 --> 00:32:44,740 by how well this goes. 586 00:32:44,740 --> 00:32:47,450 Yeah, someone's asking about how this connects up, for example, 587 00:32:47,450 --> 00:32:48,280 with NP. 588 00:32:48,280 --> 00:32:50,890 So we're going to look at that also in a second. 589 00:32:57,670 --> 00:32:59,890 OK, so this is reassuring that most of you, I think, 590 00:32:59,890 --> 00:33:02,340 are on the right track, at least for this check-in. 591 00:33:02,340 --> 00:33:05,910 Do we assume P uses this access to guess right? 592 00:33:05,910 --> 00:33:06,840 What access? 593 00:33:06,840 --> 00:33:08,160 P is not really guessing. 594 00:33:08,160 --> 00:33:09,977 The P is actually-- 595 00:33:09,977 --> 00:33:12,060 I don't think a P is non-deterministic or anything 596 00:33:12,060 --> 00:33:12,560 like that. 597 00:33:12,560 --> 00:33:14,740 P is actually trying to get the right answer 598 00:33:14,740 --> 00:33:18,180 and using its computational ability 599 00:33:18,180 --> 00:33:20,700 to do that if it's possible. 600 00:33:20,700 --> 00:33:21,720 It may not be possible. 
601 00:33:21,720 --> 00:33:23,820 Then there's nothing you can do. 602 00:33:23,820 --> 00:33:25,560 OK, so let's end this. 603 00:33:25,560 --> 00:33:28,340 Are you all in? 604 00:33:28,340 --> 00:33:30,300 Two seconds left. 605 00:33:30,300 --> 00:33:32,330 Please vote. 606 00:33:32,330 --> 00:33:33,890 Vote now or never. 607 00:33:33,890 --> 00:33:37,000 OK, ending. 608 00:33:37,000 --> 00:33:41,540 Yeah, so C is the correct answer here. 609 00:33:41,540 --> 00:33:44,510 If the prover can watch what the verifier is doing, 610 00:33:44,510 --> 00:33:48,350 the prover can see what graph the verifier picked right 611 00:33:48,350 --> 00:33:49,950 from the beginning. 612 00:33:49,950 --> 00:33:52,640 And so the prover, without having to do any work, 613 00:33:52,640 --> 00:33:53,810 can say-- 614 00:33:53,810 --> 00:33:56,120 prover looks over the verifier's shoulder and says, 615 00:33:56,120 --> 00:33:59,810 oh, you pick G. And now you're randomly permuting it, 616 00:33:59,810 --> 00:34:01,055 but I don't care about that. 617 00:34:01,055 --> 00:34:08,520 I know you pick G, so the prover is going to respond back a G. 618 00:34:08,520 --> 00:34:10,380 Even if the graphs were isomorphic, 619 00:34:10,380 --> 00:34:13,050 the prover is going to be able to get the right answer. 620 00:34:13,050 --> 00:34:18,610 Kind of interestingly, you can make 621 00:34:18,610 --> 00:34:22,030 a-- you can change the protocol somewhat 622 00:34:22,030 --> 00:34:26,560 to make it that even if the prover has access 623 00:34:26,560 --> 00:34:29,360 to the verifier's randomness, you can still achieve this, 624 00:34:29,360 --> 00:34:32,439 but not with the same protocol. 625 00:34:32,439 --> 00:34:36,360 So that's a separate question. 626 00:34:36,360 --> 00:34:38,639 OK, so let's move on here. 627 00:34:38,639 --> 00:34:40,817 Don't want to get too bogged down. 628 00:34:40,817 --> 00:34:41,984 OK, here's another check-in. 
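The check-in's point, that the same protocol stops working once the prover can watch the verifier's coin, can be sketched in a few lines. This is a hypothetical illustration: the function names are made up, and the graphs are deliberately left out because a prover that has seen the coin never needs to look at K.

```python
import random

def verifier_round_with_leak(prover, rng):
    """One round in which the prover also sees the verifier's coin."""
    coin = rng.choice("GH")       # no longer secret
    # The verifier would still permute the chosen graph and send K,
    # but the answer below does not depend on K at all.
    return prover(coin) == coin

def peeking_prover(seen_coin):
    """Echo back whatever the verifier was seen to pick."""
    return seen_coin

def accepts(prover, rng, rounds=2):
    return all(verifier_round_with_leak(prover, rng) for _ in range(rounds))
```

Since the peeking prover is always right, the verifier accepts every input, isomorphic or not, so this protocol no longer separates the two cases.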
629 00:34:46,630 --> 00:34:52,440 OK, so you have to tell me, which 630 00:34:52,440 --> 00:34:54,555 of the following statements are true? 631 00:34:57,130 --> 00:34:57,997 As far as you know. 632 00:35:01,820 --> 00:35:08,330 You'll have to think a little bit how these relate to-- 633 00:35:08,330 --> 00:35:13,340 how NP and IP or BPP and IP relate to one another. 634 00:35:13,340 --> 00:35:15,850 OK, how are we doing on this? 635 00:35:24,660 --> 00:35:28,310 So we're going to have to close this pretty soon too. 636 00:35:28,310 --> 00:35:29,570 Do the best you can. 637 00:35:29,570 --> 00:35:30,500 Interesting. 638 00:35:30,500 --> 00:35:31,310 OK. 639 00:35:31,310 --> 00:35:32,120 Closing up shop. 640 00:35:35,270 --> 00:35:37,770 Last vote. 641 00:35:37,770 --> 00:35:39,835 OK, 1, 2, 3. 642 00:35:39,835 --> 00:35:41,210 There's one more person out there 643 00:35:41,210 --> 00:35:43,320 who hasn't voted who voted last time. 644 00:35:43,320 --> 00:35:43,820 Oh well. 645 00:35:47,160 --> 00:35:48,717 All right. 646 00:35:48,717 --> 00:35:49,800 In fact, they're all true. 647 00:35:53,480 --> 00:35:53,980 Let's see. 648 00:35:53,980 --> 00:35:58,072 Why is NP contained with IP, contained in IP? 649 00:35:58,072 --> 00:35:59,780 Well, many of you have seen this already, 650 00:35:59,780 --> 00:36:01,510 so let's just quickly go through it. 651 00:36:05,010 --> 00:36:16,790 If we just had a deterministic V, maybe it's just-- 652 00:36:16,790 --> 00:36:19,960 is that going to be enough if deterministic V-- 653 00:36:19,960 --> 00:36:21,870 I think it's just going to be equivalent, 654 00:36:21,870 --> 00:36:23,930 but actually, just to be doubly sure, 655 00:36:23,930 --> 00:36:26,690 the deterministic V and the prover 656 00:36:26,690 --> 00:36:29,880 just sends a message to the verifier and then checks it. 657 00:36:29,880 --> 00:36:33,107 That's the way we normally think about a certificate for NP. 
658 00:36:33,107 --> 00:36:34,940 I don't think it's going to change anything, 659 00:36:34,940 --> 00:36:37,232 but should double-check that, if the verifier can still 660 00:36:37,232 --> 00:36:38,160 ask questions. 661 00:36:38,160 --> 00:36:40,670 But I think as long as the verifier is deterministic, 662 00:36:40,670 --> 00:36:43,370 you're going to get exactly NP here. 663 00:36:46,680 --> 00:36:50,480 And now, how about BPP? 664 00:36:50,480 --> 00:36:52,880 Well, there you don't even need a prover 665 00:36:52,880 --> 00:36:55,400 because the verifier is already probabilistic. 666 00:36:55,400 --> 00:36:59,410 So the verifier can ignore the prover. 667 00:36:59,410 --> 00:37:03,327 And this one is a little tricky, IP contained in PSPACE, 668 00:37:03,327 --> 00:37:04,660 because we haven't covered that. 669 00:37:04,660 --> 00:37:07,077 So there's no way for you to know that unless you happened 670 00:37:07,077 --> 00:37:08,440 to read ahead in the book. 671 00:37:08,440 --> 00:37:10,810 But it's, in fact, true. 672 00:37:10,810 --> 00:37:14,080 In some ways, it's a little bit like the proof 673 00:37:14,080 --> 00:37:20,380 that NP is contained in PSPACE. 674 00:37:20,380 --> 00:37:23,510 IP is sort of an enhanced version of NP. 675 00:37:23,510 --> 00:37:27,400 And there's just basically a PSPACE brute force 676 00:37:27,400 --> 00:37:31,570 algorithm that goes through the entire tree of possibilities 677 00:37:31,570 --> 00:37:36,460 of the verifier and the verifier's exchanges with the prover 678 00:37:36,460 --> 00:37:41,680 and can determine whether the verifier is 679 00:37:41,680 --> 00:37:44,230 going to accept for some prover or is going to end up 680 00:37:44,230 --> 00:37:46,625 rejecting for every prover. 681 00:37:46,625 --> 00:37:49,000 So we're not going to prove this statement, but it's something 682 00:37:49,000 --> 00:37:52,980 good for you to know anyway, just the fact.
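The three containments from this check-in can be written compactly in LaTeX. This only restates what was just said; nothing here is proved.

```latex
\mathrm{NP} \subseteq \mathrm{IP}, \qquad
\mathrm{BPP} \subseteq \mathrm{IP}, \qquad
\mathrm{IP} \subseteq \mathrm{PSPACE}.
```

The first holds because the prover can simply send a certificate to a verifier that checks it, the second because the verifier can ignore the prover, and the third is the brute-force search over the tree of interactions.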
683 00:37:52,980 --> 00:37:54,270 But we're going to do-- 684 00:37:54,270 --> 00:37:58,530 the surprising thing, in reference to part C, 685 00:37:58,530 --> 00:38:01,270 is that the containment also goes the other way. 686 00:38:01,270 --> 00:38:02,145 This is the amazing-- 687 00:38:07,190 --> 00:38:13,540 is an amazing result, that everything in PSPACE 688 00:38:13,540 --> 00:38:16,030 you can do within IP. 689 00:38:16,030 --> 00:38:19,530 So this is-- IP actually turns out to be incredibly powerful. 690 00:38:19,530 --> 00:38:21,390 Gives you everything in PSPACE. 691 00:38:21,390 --> 00:38:23,940 You get IP equals PSPACE. 692 00:38:23,940 --> 00:38:26,790 So that says that any problem that you can solve in PSPACE, 693 00:38:26,790 --> 00:38:30,430 like any of the-- a game, for example. 694 00:38:30,430 --> 00:38:35,280 If you can imagine formulating checkers or chess 695 00:38:35,280 --> 00:38:37,830 as a PSPACE problem, which, depending 696 00:38:37,830 --> 00:38:39,750 upon some details of the rules, you 697 00:38:39,750 --> 00:38:42,480 can do, though you have to generalize it to an n 698 00:38:42,480 --> 00:38:45,750 by n board, but OK. 699 00:38:45,750 --> 00:38:48,750 Let's not quibble. 700 00:38:48,750 --> 00:38:59,960 Then we don't know which side has a forced win in chess, 701 00:38:59,960 --> 00:39:02,540 and even if somebody goes to the effort of going 702 00:39:02,540 --> 00:39:05,600 through the game tree and determines 703 00:39:05,600 --> 00:39:08,520 that, let's say, white has a forced win, 704 00:39:08,520 --> 00:39:10,490 there's no way for them to-- 705 00:39:10,490 --> 00:39:11,780 there's no short certificate. 706 00:39:11,780 --> 00:39:14,330 We don't know that that problem is in NP.
707 00:39:14,330 --> 00:39:18,860 But by going through an interactive proof, 708 00:39:18,860 --> 00:39:22,760 an all-powerful prover could still convince somebody that 709 00:39:22,760 --> 00:39:24,815 white had a forced-- 710 00:39:24,815 --> 00:39:26,750 convince somebody in polynomial time 711 00:39:26,750 --> 00:39:33,140 that a white has a forced win, let's say, in chess. 712 00:39:33,140 --> 00:39:37,640 Again, a little stretching things because this is-- 713 00:39:37,640 --> 00:39:40,700 you really need to talk about this as an n by n, not an 8 714 00:39:40,700 --> 00:39:45,950 by 8, but I think the spirit is fair. 715 00:39:45,950 --> 00:39:50,110 So OK. 716 00:39:50,110 --> 00:39:51,820 So let's continue. 717 00:39:51,820 --> 00:39:54,550 So we're not going to quite prove 718 00:39:54,550 --> 00:39:56,678 that PSPACE is contained in IP. 719 00:39:56,678 --> 00:39:58,720 We're going to prove a somewhat weaker statement, 720 00:39:58,720 --> 00:40:04,270 but very similar and historically came first, 721 00:40:04,270 --> 00:40:07,510 that coNP is contained in IP. 722 00:40:07,510 --> 00:40:09,670 So not only is NP contained in IP, 723 00:40:09,670 --> 00:40:12,250 but we're going to prove that coNP is contained in IP. 724 00:40:12,250 --> 00:40:17,230 And this actually has most of the idea for the PSPACE being 725 00:40:17,230 --> 00:40:18,400 contained in IP. 726 00:40:18,400 --> 00:40:20,748 And itself, it's just an amazing proof. 727 00:40:20,748 --> 00:40:21,415 A little easier. 728 00:40:27,510 --> 00:40:28,620 OK. 729 00:40:28,620 --> 00:40:31,770 This was done, if I'm remembering-- somebody's 730 00:40:31,770 --> 00:40:33,360 asking me, how old is this? 731 00:40:33,360 --> 00:40:36,990 It's something in the, I think, late '90s, but I'm not-- 732 00:40:36,990 --> 00:40:37,770 I don't remember. 733 00:40:37,770 --> 00:40:39,180 Maybe early '90s. 
734 00:40:39,180 --> 00:40:42,810 I think it's late '90s when this was shown, 735 00:40:42,810 --> 00:40:43,920 so it's been a while now. 736 00:40:46,830 --> 00:40:47,790 OK. 737 00:40:47,790 --> 00:40:52,430 So yeah. 738 00:40:52,430 --> 00:40:55,070 So in terms of the relationship with cryptography, 739 00:40:55,070 --> 00:41:00,650 there were two parallel threads that both independently 740 00:41:00,650 --> 00:41:05,090 came up with the notion of an interactive proof system. 741 00:41:05,090 --> 00:41:08,510 I was a little bit personally involved with this 742 00:41:08,510 --> 00:41:12,650 in a way as well, but mainly, there 743 00:41:12,650 --> 00:41:16,400 was one group in cryptography working on this, 744 00:41:16,400 --> 00:41:18,380 and there was another group who was actually 745 00:41:18,380 --> 00:41:21,350 coming out of the graph isomorphism world, 746 00:41:21,350 --> 00:41:22,040 working on it. 747 00:41:22,040 --> 00:41:26,570 And they came up with two separate models, 748 00:41:26,570 --> 00:41:29,780 one involving the private randomness and one involving 749 00:41:29,780 --> 00:41:31,580 the public randomness. 750 00:41:31,580 --> 00:41:36,590 And it turned out that they were actually equivalent. 751 00:41:36,590 --> 00:41:41,750 And it's an interesting story, but unfortunately, we 752 00:41:41,750 --> 00:41:43,590 don't have time for it. 753 00:41:43,590 --> 00:41:45,290 So why don't we move on. 754 00:41:45,290 --> 00:41:47,720 And I'm going to start showing you 755 00:41:47,720 --> 00:41:55,730 how the proof that coNP is contained in IP goes. 756 00:41:55,730 --> 00:41:57,350 And what we're going to do is work 757 00:41:57,350 --> 00:42:03,230 with a problem that's almost like coNP complete, but going 758 00:42:03,230 --> 00:42:03,990 to be-- 759 00:42:03,990 --> 00:42:05,990 well, it's going to be this #SAT problem. 760 00:42:05,990 --> 00:42:08,120 We'll see the connection with coNP in a second.
761 00:42:11,010 --> 00:42:15,930 So #SAT, so it's supposed to be exactly k satisfying 762 00:42:15,930 --> 00:42:17,190 assignments. 763 00:42:17,190 --> 00:42:21,450 It's the set of pairs phi comma k where 764 00:42:21,450 --> 00:42:24,223 the formula phi has exactly k satisfying assignments. 765 00:42:24,223 --> 00:42:25,890 So really, this is a problem of counting 766 00:42:25,890 --> 00:42:28,140 how many satisfying assignments you have in a formula. 767 00:42:30,702 --> 00:42:35,120 So for NP, you have at least one. 768 00:42:35,120 --> 00:42:39,210 But I want to know exactly how many. 769 00:42:39,210 --> 00:42:43,125 So the #SAT problem is the pairs of formula and count. 770 00:42:48,210 --> 00:42:53,490 And so if we define the count, #phi 771 00:42:53,490 --> 00:42:57,720 is the number of satisfying assignments of phi. 772 00:42:57,720 --> 00:43:01,650 Then another way of writing this #SAT problem 773 00:43:01,650 --> 00:43:08,030 is the pairs phi, k where k is the number of satisfying 774 00:43:08,030 --> 00:43:09,470 assignments of phi. 775 00:43:09,470 --> 00:43:12,800 So we're going to be using this notation #phi a lot, 776 00:43:12,800 --> 00:43:15,050 so just make sure you got that notation. 777 00:43:15,050 --> 00:43:18,605 This is the number of satisfying assignments of that formula. 778 00:43:22,140 --> 00:43:24,013 OK? 779 00:43:24,013 --> 00:43:25,430 And here's a definition I probably 780 00:43:25,430 --> 00:43:27,180 should have given you earlier in the term, 781 00:43:27,180 --> 00:43:29,850 but better late than never. 782 00:43:29,850 --> 00:43:34,780 So the notion that a language is NP hard, 783 00:43:34,780 --> 00:43:39,370 it's like NP complete except without being necess-- 784 00:43:39,370 --> 00:43:42,830 without necessarily being in NP. 785 00:43:42,830 --> 00:43:44,660 So this is just the reduction part.
786 00:43:44,660 --> 00:43:48,260 A language is NP hard or coNP hard or PSPACE hard 787 00:43:48,260 --> 00:43:50,300 or any of those other classes that we've 788 00:43:50,300 --> 00:43:53,570 looked at if every problem in the class 789 00:43:53,570 --> 00:43:56,720 is reducible to that language. 790 00:43:56,720 --> 00:44:00,960 But you don't know whether that language is in the class. 791 00:44:00,960 --> 00:44:05,200 So we just call it NP hard instead of NP complete. 792 00:44:05,200 --> 00:44:09,010 So you could say the language is NP complete if it's hard 793 00:44:09,010 --> 00:44:09,970 and it's in NP. 794 00:44:13,110 --> 00:44:17,100 OK, and so we're going to show that this #SAT problem is 795 00:44:17,100 --> 00:44:19,180 coNP hard. 796 00:44:19,180 --> 00:44:22,770 So everything in coNP is polynomial time reducible 797 00:44:22,770 --> 00:44:23,770 to #SAT. 798 00:44:23,770 --> 00:44:27,370 That's easy because what we're going to do 799 00:44:27,370 --> 00:44:32,090 is take a coNP complete problem, which 800 00:44:32,090 --> 00:44:36,260 is the unsatisfiability problem, the complement 801 00:44:36,260 --> 00:44:40,850 of satisfiability, and show that reduces to the #SAT problem. 802 00:44:40,850 --> 00:44:44,600 And that's easy because a formula is unsatisfiable 803 00:44:44,600 --> 00:44:49,440 exactly when it has zero satisfying assignments. 804 00:44:49,440 --> 00:44:52,370 So if you can tell how many satisfying assignments 805 00:44:52,370 --> 00:44:57,190 something has exactly, or you can answer the question, 806 00:44:57,190 --> 00:45:03,340 does a formula have exactly 1,000 satisfying assignments, 807 00:45:03,340 --> 00:45:07,600 if you can do that in general, then you can solve coNP. 808 00:45:07,600 --> 00:45:09,730 You can solve the unsatisfiability problem 809 00:45:09,730 --> 00:45:12,490 by asking if it's zero satisfying assignments, 810 00:45:12,490 --> 00:45:16,170 and that allows you to solve anything in coNP. 
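The reduction from unsatisfiability to #SAT described above can be sketched as follows. This is an illustration under stated assumptions: formulas are modeled as Python functions of m 0/1 arguments, the names are made up for this sketch, and the exponential brute-force counter only stands in for membership in #SAT so the reduction can be checked on tiny examples.

```python
import itertools

def count_sat(phi, m):
    """#phi: count satisfying assignments by trying all 2**m of them."""
    return sum(1 for a in itertools.product((0, 1), repeat=m) if phi(*a))

def in_sharp_sat(phi, k, m):
    """Is the pair (phi, k) in #SAT, i.e. does phi have exactly k solutions?"""
    return count_sat(phi, m) == k

def unsat_to_sharp_sat(phi):
    """The reduction itself: phi is unsatisfiable iff (phi, 0) is in #SAT."""
    return (phi, 0)
```

The reduction does no real work: it just attaches the count 0 to the formula, which is why it runs in polynomial time.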
811 00:45:16,170 --> 00:45:17,130 OK. 812 00:45:17,130 --> 00:45:18,840 So we're going to just work with this one 813 00:45:18,840 --> 00:45:24,020 problem, the #SAT problem, and show that that problem's in IP. 814 00:45:24,020 --> 00:45:25,538 OK? 815 00:45:25,538 --> 00:45:26,580 Let's take a quick break. 816 00:45:33,110 --> 00:45:33,610 OK. 817 00:45:33,610 --> 00:45:35,200 Feel free to send me-- let me see 818 00:45:35,200 --> 00:45:37,242 if I can catch up with some of the questions that 819 00:45:37,242 --> 00:45:38,870 have been cropping up here. 820 00:45:38,870 --> 00:45:42,610 So if the prover knows the random choices of the verifier, 821 00:45:42,610 --> 00:45:44,905 can flip the answer to make the verifier reject? 822 00:45:47,297 --> 00:45:49,880 Not sure what that-- you mean in the context just of the graph 823 00:45:49,880 --> 00:45:53,420 isomorphism problem or something in general? 824 00:45:53,420 --> 00:45:56,800 I'm not sure I-- 825 00:45:56,800 --> 00:45:58,240 you'll have to explain. 826 00:45:58,240 --> 00:46:00,640 So I will respond with a question mark. 827 00:46:00,640 --> 00:46:02,230 What else can I answer for you guys? 828 00:46:02,230 --> 00:46:05,230 So I've got a question. 829 00:46:05,230 --> 00:46:10,120 If IP equals PSPACE, does that mean that ISO or non-ISO might 830 00:46:10,120 --> 00:46:13,960 be N, might be PSPACE complete? 831 00:46:13,960 --> 00:46:15,880 But no. 832 00:46:15,880 --> 00:46:17,840 That's not known. 833 00:46:17,840 --> 00:46:19,025 So we're about out of time. 834 00:46:33,370 --> 00:46:34,810 OK. 835 00:46:34,810 --> 00:46:35,950 Let's continue here. 836 00:46:41,020 --> 00:46:44,190 OK, so this is where we're kind of 837 00:46:44,190 --> 00:46:46,150 going to start to get into the meat of things. 
838 00:46:49,610 --> 00:46:51,610 And if you didn't quite understand everything up 839 00:46:51,610 --> 00:46:54,610 till now, maybe just try to keep your intuition 840 00:46:54,610 --> 00:47:01,000 about how does a powerful party convince 841 00:47:01,000 --> 00:47:07,350 a probabilistic polynomial time party of the number 842 00:47:07,350 --> 00:47:09,750 of satisfying assignments? 843 00:47:09,750 --> 00:47:12,030 An exact number. 844 00:47:12,030 --> 00:47:14,430 Not at least, but you want to know exactly 845 00:47:14,430 --> 00:47:17,940 the number of satisfying assignments. 846 00:47:17,940 --> 00:47:19,950 So it could be zero, for example. 847 00:47:19,950 --> 00:47:21,120 How do you convince a-- 848 00:47:21,120 --> 00:47:26,100 how do you convince someone that there were zero assignments? 849 00:47:26,100 --> 00:47:29,070 And you can have an interaction which does that, 850 00:47:29,070 --> 00:47:32,820 and that's not obvious at all how you're going to do that. 851 00:47:37,200 --> 00:47:38,600 All right. 852 00:47:38,600 --> 00:47:42,870 So OK. 853 00:47:42,870 --> 00:47:46,035 So we're going to have to introduce some notation, which 854 00:47:46,035 --> 00:47:50,810 I hope that it doesn't cause heartburn here. 855 00:47:50,810 --> 00:47:57,490 So let's say, again, here is the language 856 00:47:57,490 --> 00:47:59,380 we're working with, #SAT. 857 00:47:59,380 --> 00:48:06,980 And we have a phi that has m variables, x1 to xm. 858 00:48:06,980 --> 00:48:08,990 Now, here's the notation. 859 00:48:08,990 --> 00:48:12,590 I'm going to-- if I write phi with a-- 860 00:48:12,590 --> 00:48:16,190 phi of 0, that just means the formula 861 00:48:16,190 --> 00:48:21,080 that I get by plugging in 0 for x1 862 00:48:21,080 --> 00:48:26,400 and leaving all the rest of the variables alone. 863 00:48:26,400 --> 00:48:30,360 OK, so I substitute 0 for x1 where 0 means false and 1 means 864 00:48:30,360 --> 00:48:32,980 true as usual. 
865 00:48:32,980 --> 00:48:35,470 And but it's still going to be some other formula 866 00:48:35,470 --> 00:48:38,480 but just with that substitution. 867 00:48:38,480 --> 00:48:43,060 If I write phi 01, that means I've preset the first two 868 00:48:43,060 --> 00:48:46,990 variables to 0 and 1. 869 00:48:46,990 --> 00:48:52,070 If I write phi with a bunch of preset values, 870 00:48:52,070 --> 00:48:55,660 I'm just setting the first i variables, x1 to xi, 871 00:48:55,660 --> 00:49:03,340 to some values and leaving the other variables as unset. 872 00:49:03,340 --> 00:49:05,710 So I'm calling the ones that I'm nailing 873 00:49:05,710 --> 00:49:10,080 in there, as I'm already saying, these are the presets. 874 00:49:10,080 --> 00:49:12,900 So this is just converting some formulas 875 00:49:12,900 --> 00:49:17,160 into other formulas that have somewhat fewer variables. 876 00:49:17,160 --> 00:49:18,630 All right? 877 00:49:18,630 --> 00:49:22,740 Now, let's recall that number notation and number sign 878 00:49:22,740 --> 00:49:26,460 notation, #phi is the number of satisfying assignments. 879 00:49:26,460 --> 00:49:30,000 Now, if I say #phi of 0, that's the number 880 00:49:30,000 --> 00:49:33,920 of satisfying assignments when I've preset x1 to 0. 881 00:49:37,364 --> 00:49:43,370 Similarly, if I preset the first i variables to some values and 882 00:49:43,370 --> 00:49:44,030 then I take-- 883 00:49:44,030 --> 00:49:47,150 I want to take how many satisfying 884 00:49:47,150 --> 00:49:51,900 assignments subject to those presets, I write it this way. 885 00:49:51,900 --> 00:49:54,008 So I'm going to use this notation a lot. 886 00:49:54,008 --> 00:49:55,550 You have to understand this notation. 887 00:49:55,550 --> 00:49:58,140 Ask if you don't un-- if you don't get it. 
888 00:49:58,140 --> 00:49:59,390 So another way of writing it-- 889 00:49:59,390 --> 00:50:01,390 I don't know if this is helpful, but another way 890 00:50:01,390 --> 00:50:05,360 of writing #phi of a1 to ai, remember, we 891 00:50:05,360 --> 00:50:09,410 have m variables altogether, that means 892 00:50:09,410 --> 00:50:17,100 I take the variables which I have not yet preset, 893 00:50:17,100 --> 00:50:20,340 and I allow them to range over all possible 0s and 1s, 894 00:50:20,340 --> 00:50:26,500 and I add up the formula's values for all of those. 895 00:50:26,500 --> 00:50:30,480 So there's a 1 every time I satisfy and a 0 every time I 896 00:50:30,480 --> 00:50:31,560 don't satisfy. 897 00:50:31,560 --> 00:50:34,800 So I'm adding up all the satisfying assignments subject 898 00:50:34,800 --> 00:50:36,225 to these i presets. 899 00:50:39,810 --> 00:50:40,770 OK? 900 00:50:40,770 --> 00:50:46,050 So here are two critical facts about this number sign 901 00:50:46,050 --> 00:50:47,640 notation. 902 00:50:47,640 --> 00:50:53,660 First of all, if I preset the first i values to something, 903 00:50:53,660 --> 00:50:58,310 now I can, in addition, set the next variable 904 00:50:58,310 --> 00:51:04,760 either to 0 or to 1, and I get this relationship, 905 00:51:04,760 --> 00:51:07,370 which is just simply a generalization of the fact 906 00:51:07,370 --> 00:51:10,670 that the total number of satisfying assignments 907 00:51:10,670 --> 00:51:13,550 of the formula is equal to the number of satisfying 908 00:51:13,550 --> 00:51:17,780 assignments when x1 is 0 plus the number of satisfying 909 00:51:17,780 --> 00:51:21,450 assignments when x1 is 1. 910 00:51:21,450 --> 00:51:23,450 They together have to add up to the total number 911 00:51:23,450 --> 00:51:26,680 because x1 is going to be either 0 or 1. 912 00:51:26,680 --> 00:51:28,550 So that's fact number one.
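Written out in LaTeX, the sum just described and fact number one look like this. The symbols follow the spoken notation (presets $a_1, \dots, a_i$ on the first $i$ of $m$ variables); the exact typesetting on the slide is not in the transcript, so treat this as a reconstruction.

```latex
\#\phi(a_1 \cdots a_i) \;=\; \sum_{a_{i+1}, \ldots, a_m \in \{0,1\}} \phi(a_1 \cdots a_m)

\#\phi(a_1 \cdots a_i) \;=\; \#\phi(a_1 \cdots a_i\, 0) \;+\; \#\phi(a_1 \cdots a_i\, 1)
```

With no presets ($i = 0$) the second line is the special case $\#\phi = \#\phi(0) + \#\phi(1)$.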
913 00:51:28,550 --> 00:51:33,152 Fact number two is that if I preset everything, 914 00:51:33,152 --> 00:51:35,110 all of the variables, so there are no variables 915 00:51:35,110 --> 00:51:39,160 left, then the number of satisfying assignments 916 00:51:39,160 --> 00:51:42,430 subject to that preset of everything 917 00:51:42,430 --> 00:51:44,530 is just whether or not I've satisfied 918 00:51:44,530 --> 00:51:46,750 the formula, which is the value of the formula 919 00:51:46,750 --> 00:51:49,310 on those presets. 920 00:51:49,310 --> 00:51:50,570 OK? 921 00:51:50,570 --> 00:51:52,520 Both two simple facts, but it's going 922 00:51:52,520 --> 00:51:54,995 to be critical in the protocol I'm about to describe. 923 00:51:58,440 --> 00:52:01,420 Questions on this? 924 00:52:01,420 --> 00:52:05,530 I think I actually do have a question for you. 925 00:52:05,530 --> 00:52:06,280 So let's just see. 926 00:52:09,460 --> 00:52:12,000 What do you think? 927 00:52:12,000 --> 00:52:15,480 It's just to check your understanding. 928 00:52:15,480 --> 00:52:17,460 OK. 929 00:52:17,460 --> 00:52:22,890 Got about 80% getting this. 930 00:52:22,890 --> 00:52:24,040 I'm not sure that's good. 931 00:52:24,040 --> 00:52:28,710 But all right. 932 00:52:28,710 --> 00:52:29,280 Almost done? 933 00:52:31,645 --> 00:52:32,145 Closing. 934 00:52:35,205 --> 00:52:35,705 OK. 935 00:52:38,390 --> 00:52:40,160 OK, so yes. 936 00:52:40,160 --> 00:52:42,455 A is the correct answer. 937 00:52:42,455 --> 00:52:45,890 If there are 9 satisfying assignments all together 938 00:52:45,890 --> 00:52:48,260 and there are 6 satisfying assignments 939 00:52:48,260 --> 00:52:51,860 with the first variable is set to 0, 940 00:52:51,860 --> 00:52:53,660 then there's only 3 satisfying assignments 941 00:52:53,660 --> 00:52:56,510 with the first variable set to 1 because 9 has 942 00:52:56,510 --> 00:52:58,550 got to be equal to 6 plus 3. 943 00:52:58,550 --> 00:53:02,310 That's actually this fact number one. 
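The two facts can be checked numerically on a small case. The following is a minimal sketch, assuming a hypothetical three-variable formula `phi` and an illustrative helper `count_sat` (neither comes from the lecture) that counts satisfying assignments subject to presets by brute force.

```python
from itertools import product

M = 3  # number of variables in the hypothetical example formula


def phi(x):
    """Hypothetical example formula: (x1 OR x2) AND (x2 OR x3)."""
    x1, x2, x3 = x
    return int(bool((x1 or x2) and (x2 or x3)))


def count_sat(formula, m, presets=()):
    """#phi(presets): satisfying assignments that extend the given presets."""
    free = m - len(presets)
    return sum(formula(tuple(presets) + rest)
               for rest in product((0, 1), repeat=free))


# Fact one: #phi(a) = #phi(a, 0) + #phi(a, 1) for every partial preset a.
for i in range(M):
    for a in product((0, 1), repeat=i):
        assert count_sat(phi, M, a) == (count_sat(phi, M, a + (0,))
                                        + count_sat(phi, M, a + (1,)))

# Fact two: with everything preset, the count is just the formula's value.
for a in product((0, 1), repeat=M):
    assert count_sat(phi, M, a) == phi(a)

print(count_sat(phi, M), count_sat(phi, M, (0,)), count_sat(phi, M, (1,)))
# prints: 5 2 3
```

For this example formula the total count 5 splits as 2 plus 3, in the same way that 9 splits as 6 plus 3 in the check-in question.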
944 00:53:02,310 --> 00:53:04,960 It's not going to be 15. 945 00:53:04,960 --> 00:53:06,500 This is not true either. 946 00:53:06,500 --> 00:53:11,780 So it's just A. Good. 947 00:53:11,780 --> 00:53:13,290 OK. 948 00:53:13,290 --> 00:53:15,375 OK, so let's try to-- 949 00:53:15,375 --> 00:53:17,610 with that knowledge, let's try to see 950 00:53:17,610 --> 00:53:20,680 how we can put #SAT in IP. 951 00:53:20,680 --> 00:53:22,950 So this is not going to quite work, 952 00:53:22,950 --> 00:53:25,050 but it's really going to set us up 953 00:53:25,050 --> 00:53:29,290 to do this-- to finish this next time. 954 00:53:29,290 --> 00:53:34,070 So you might immediately see where this is going wrong, 955 00:53:34,070 --> 00:53:40,640 but you'll have to put up with it because the setup is 956 00:53:40,640 --> 00:53:43,850 what's important. 957 00:53:43,850 --> 00:53:44,810 OK. 958 00:53:44,810 --> 00:53:48,060 So understand, now, here's the setup. 959 00:53:48,060 --> 00:53:55,660 We have the input is a formula and a number 960 00:53:55,660 --> 00:53:58,510 where that number is supposed to be the number of satisfying 961 00:53:58,510 --> 00:54:00,960 assignments. 962 00:54:00,960 --> 00:54:03,400 It could be wrong, in which case, 963 00:54:03,400 --> 00:54:06,500 we're not in the language. 964 00:54:06,500 --> 00:54:08,680 But if it's right, you're in the language. 965 00:54:08,680 --> 00:54:13,480 So the prover is supposed to convince the verifier that it's 966 00:54:13,480 --> 00:54:15,843 correct if it is correct. 967 00:54:15,843 --> 00:54:17,260 And it's not going to-- it's going 968 00:54:17,260 --> 00:54:21,685 to fail no matter what it tries to do if it's not correct. 969 00:54:24,200 --> 00:54:26,590 So this says the prover is going to send, first of all-- 970 00:54:30,210 --> 00:54:34,520 so the prover is going to send a claim about the number 971 00:54:34,520 --> 00:54:37,200 of satisfying assignments. 
972 00:54:37,200 --> 00:54:40,680 Going to send-- when I say this value here, 973 00:54:40,680 --> 00:54:43,260 this is what the prover-- 974 00:54:43,260 --> 00:54:45,510 if it's honest, it's going to send the right value. 975 00:54:45,510 --> 00:54:50,620 Of course, the verifier does not know if the prover is honest, 976 00:54:50,620 --> 00:54:52,480 but I'm describing how the honest prover is 977 00:54:52,480 --> 00:54:53,110 going to operate. 978 00:54:53,110 --> 00:54:54,818 And we'll have to understand what happens 979 00:54:54,818 --> 00:54:57,822 if the prover tries to cheat. 980 00:54:57,822 --> 00:55:00,280 So the prover is going to send-- the honest prover is going 981 00:55:00,280 --> 00:55:02,950 to send the number of satisfying assignments altogether, 982 00:55:02,950 --> 00:55:05,890 and the verifier just makes sure that that matches up 983 00:55:05,890 --> 00:55:07,377 with the input. 984 00:55:07,377 --> 00:55:08,960 If it doesn't match up with the input, 985 00:55:08,960 --> 00:55:11,726 the verifier is just going to-- 986 00:55:11,726 --> 00:55:15,590 the verifier is going to not be convinced that the input is 987 00:55:15,590 --> 00:55:17,190 in the language. 988 00:55:17,190 --> 00:55:19,700 So it's going to just reject at that point. 989 00:55:23,800 --> 00:55:25,750 OK. 990 00:55:25,750 --> 00:55:27,340 Then now the verifier says, OK. 991 00:55:27,340 --> 00:55:29,110 That was very good that you sent me this. 992 00:55:29,110 --> 00:55:31,670 How do I know that's right? 993 00:55:31,670 --> 00:55:33,680 So what the prover is going to do 994 00:55:33,680 --> 00:55:36,350 to try to convince the verifier that this value was 995 00:55:36,350 --> 00:55:43,140 correct is unravel that by one level by saying, 996 00:55:43,140 --> 00:55:48,380 well, there were 9 satisfying assignments altogether. 997 00:55:48,380 --> 00:55:54,020 6 of them were when x1 is 0, and 3 of them were when x1 is 1.
998 00:55:56,710 --> 00:55:58,690 What does the verifier have to check? 999 00:55:58,690 --> 00:56:02,380 That these add up correctly. 1000 00:56:02,380 --> 00:56:05,980 When I preset x1 to 0 and to 1, it 1001 00:56:05,980 --> 00:56:08,140 had better add up to the total number 1002 00:56:08,140 --> 00:56:10,340 of satisfying assignments. 1003 00:56:10,340 --> 00:56:12,110 If that works out, the verifier's happy. 1004 00:56:12,110 --> 00:56:15,560 It's still being-- it's still consistent with being 1005 00:56:15,560 --> 00:56:18,095 convinced that this k was the right value. 1006 00:56:22,630 --> 00:56:27,400 So the next step is, well, the verifier says, well, 1007 00:56:27,400 --> 00:56:30,370 how do I know those two values are correct? 1008 00:56:30,370 --> 00:56:31,600 The prover says, OK. 1009 00:56:31,600 --> 00:56:38,550 Well, I want to unravel them one level further: the number 1010 00:56:38,550 --> 00:56:41,250 of satisfying assignments when the next variable is 1011 00:56:41,250 --> 00:56:43,470 set to both possibilities for each 1012 00:56:43,470 --> 00:56:47,510 of the possibilities of the first variable. 1013 00:56:47,510 --> 00:56:51,670 Now, if you're understanding me about what the prover is 1014 00:56:51,670 --> 00:56:55,660 sending, you should start to be getting a little nervous 1015 00:56:55,660 --> 00:56:58,300 because something is-- 1016 00:56:58,300 --> 00:57:00,382 I mean, this is going to be correct, 1017 00:57:00,382 --> 00:57:02,590 but it's going to start-- it looks like it's starting 1018 00:57:02,590 --> 00:57:06,760 to blow up in terms of the amount of work that's involved, 1019 00:57:06,760 --> 00:57:08,740 and that's actually a problem. 1020 00:57:08,740 --> 00:57:10,630 But let's bear with that for the moment. 1021 00:57:10,630 --> 00:57:13,150 Let's just worry about correctness, not about 1022 00:57:13,150 --> 00:57:17,310 complexity for the moment.
1023 00:57:17,310 --> 00:57:20,000 So the prover's going to now send 1024 00:57:20,000 --> 00:57:22,430 the number of satisfying assignments for each 1025 00:57:22,430 --> 00:57:25,070 of those four possible ways of presetting the first two 1026 00:57:25,070 --> 00:57:27,830 variables, and the verifier is going 1027 00:57:27,830 --> 00:57:30,290 to check that that was consistent with the information 1028 00:57:30,290 --> 00:57:35,280 the prover sent in the previous round 1029 00:57:35,280 --> 00:57:37,150 by, again, checking this identity here. 1030 00:57:37,150 --> 00:57:39,060 So then the prover's going to continue 1031 00:57:39,060 --> 00:57:44,250 doing that until it's done that through m rounds, where 1032 00:57:44,250 --> 00:57:46,530 m is the number of variables. 1033 00:57:46,530 --> 00:57:48,800 So at this point, the prover's going 1034 00:57:48,800 --> 00:57:54,680 to send all possible ways of presetting 1035 00:57:54,680 --> 00:57:56,070 all of the variables. 1036 00:57:56,070 --> 00:58:00,250 So now there's 2 to the m possibilities here. 1037 00:58:00,250 --> 00:58:06,270 Again, this is hopelessly not allowed, but OK, ignoring that. 1038 00:58:06,270 --> 00:58:08,460 The verifier's got to use this at the mth round 1039 00:58:08,460 --> 00:58:11,250 to check what happens at the previous round, 1040 00:58:11,250 --> 00:58:15,830 so that's when there were m minus 1 values sent 1041 00:58:15,830 --> 00:58:20,010 because each one has one more-- 1042 00:58:20,010 --> 00:58:21,900 you're extending the presets by 1. 1043 00:58:21,900 --> 00:58:26,010 So we're using this to check that the previous round 1044 00:58:26,010 --> 00:58:27,430 values were correct. 1045 00:58:27,430 --> 00:58:30,660 So it's looking for-- 1046 00:58:30,660 --> 00:58:37,170 the m minus 1 presets have to add up correctly 1047 00:58:37,170 --> 00:58:41,700 in terms of the presets of m values 1048 00:58:41,700 --> 00:58:46,260 for each of those ways of doing those m minus 1 presets.
1049 00:58:46,260 --> 00:58:48,840 And so now, the prover has sent all of those 2 1050 00:58:48,840 --> 00:58:53,670 to the m counts, which are, by the way, 1051 00:58:53,670 --> 00:58:57,780 1s and 0s because at this point, we have preset all 1052 00:58:57,780 --> 00:59:00,570 of the values of the variables. 1053 00:59:00,570 --> 00:59:02,610 And so there's only one possible assignment 1054 00:59:02,610 --> 00:59:05,220 at most that there can be. 1055 00:59:07,930 --> 00:59:11,830 And now the prover is done. 1056 00:59:11,830 --> 00:59:15,860 The verifier is going to check by itself 1057 00:59:15,860 --> 00:59:21,120 that these values make sense, that these values are correct. 1058 00:59:21,120 --> 00:59:24,930 So it's going to do that by looking back at the formula. 1059 00:59:24,930 --> 00:59:26,870 So far, up until this point, the verifier 1060 00:59:26,870 --> 00:59:28,910 has not been looking at the formula. 1061 00:59:28,910 --> 00:59:31,550 It's just been checking the internal consistency 1062 00:59:31,550 --> 00:59:34,110 of the prover's messages with each other. 1063 00:59:34,110 --> 00:59:36,350 But now at the end, the verifier is 1064 00:59:36,350 --> 00:59:40,100 going to take these values that the prover sent 1065 00:59:40,100 --> 00:59:42,200 for each of the 2 to the m presets 1066 00:59:42,200 --> 00:59:46,020 and see if it matches up with what the formula would do. 1067 00:59:46,020 --> 00:59:47,570 Remember, that was the other-- 1068 00:59:47,570 --> 00:59:51,500 sort of the base case of the fact number 1069 00:59:51,500 --> 00:59:55,835 two from a slide or two ago. 1070 00:59:55,835 --> 00:59:56,960 Make sure that these agree. 1071 00:59:59,940 --> 01:00:03,030 OK, and now the verifier says, well, OK.
1072 01:00:03,030 --> 01:00:06,400 If everything has checked out and all of these 1073 01:00:06,400 --> 01:00:09,360 are in agreement, then the verifier 1074 01:00:09,360 --> 01:00:16,410 is going to be convinced that phi 1075 01:00:16,410 --> 01:00:19,750 had k satisfying assignments. 1076 01:00:19,750 --> 01:00:23,400 But if anywhere along the way one of these checks fails, 1077 01:00:23,400 --> 01:00:26,010 the prover is not-- the verifier is not going to be convinced, 1078 01:00:26,010 --> 01:00:27,052 and it's going to reject. 1079 01:00:31,760 --> 01:00:33,500 So in a sense, this is kind of dopey. 1080 01:00:33,500 --> 01:00:38,780 I mean, I'm just kind of giving you a complicated way of just 1081 01:00:38,780 --> 01:00:43,040 counting up, one by one, each of the satisfying assignments 1082 01:00:43,040 --> 01:00:45,690 of the formula and seeing if that matches k. 1083 01:00:51,840 --> 01:00:54,780 But nevertheless, this way of looking at it 1084 01:00:54,780 --> 01:01:03,680 is going to help us to understand the way to fix this. 1085 01:01:03,680 --> 01:01:06,530 So bear with me for another minute on this one. 1086 01:01:06,530 --> 01:01:09,180 So another way of looking at this, 1087 01:01:09,180 --> 01:01:12,140 which I think is particularly useful, 1088 01:01:12,140 --> 01:01:16,040 is to think of what happens-- 1089 01:01:18,600 --> 01:01:19,660 well, OK. 1090 01:01:19,660 --> 01:01:21,300 We'll get there in a second. 1091 01:01:21,300 --> 01:01:23,670 I want to look at what happens if k was wrong, 1092 01:01:23,670 --> 01:01:25,920 but before I do that, let's look at the-- 1093 01:01:25,920 --> 01:01:30,540 I'm going to give a kind of a graphical view 1094 01:01:30,540 --> 01:01:35,970 of the information that the prover sends and the verifier's 1095 01:01:35,970 --> 01:01:38,050 actions in this protocol. 1096 01:01:38,050 --> 01:01:40,920 So the values that the prover's sending 1097 01:01:40,920 --> 01:01:43,020 are going to be in yellow. 
1098 01:01:43,020 --> 01:01:47,070 So and the information that the verifier has or checks 1099 01:01:47,070 --> 01:01:48,830 is going to be in white. 1100 01:01:48,830 --> 01:01:54,300 So the verifier has the k, the input value, 1101 01:01:54,300 --> 01:01:56,300 which is supposed to be the number of satisfying 1102 01:01:56,300 --> 01:02:00,620 assignments, and the prover sends some value, 1103 01:02:00,620 --> 01:02:04,868 and the verifier checks that this value, which 1104 01:02:04,868 --> 01:02:07,160 is supposed to be the number of satisfying assignments, 1105 01:02:07,160 --> 01:02:08,810 corresponds with k. 1106 01:02:08,810 --> 01:02:11,010 So that's one of the checks it does. 1107 01:02:11,010 --> 01:02:12,740 Then the prover is going to send-- 1108 01:02:12,740 --> 01:02:15,230 going to take, to justify this value, 1109 01:02:15,230 --> 01:02:20,690 it sends the number of satisfying assignments when you 1110 01:02:20,690 --> 01:02:24,830 have x1 set to 0 or set to 1. 1111 01:02:24,830 --> 01:02:28,250 The verifier adds those up to give you-- 1112 01:02:28,250 --> 01:02:30,290 and it's supposed to equal the total number 1113 01:02:30,290 --> 01:02:31,490 of satisfying assignments. 1114 01:02:31,490 --> 01:02:33,950 And so this is-- if you understood this protocol, 1115 01:02:33,950 --> 01:02:34,610 this is just-- 1116 01:02:34,610 --> 01:02:40,200 I'm writing it out in a sort of a simplified way perhaps. 1117 01:02:40,200 --> 01:02:41,220 OK. 1118 01:02:41,220 --> 01:02:45,000 And so keeps checking that these things add up 1119 01:02:45,000 --> 01:02:50,910 correctly until you get down to setting all m values in all 2 1120 01:02:50,910 --> 01:02:53,460 to the m possible ways, and now the verifier 1121 01:02:53,460 --> 01:02:58,020 is going to then check to make sure that that equals 1122 01:02:58,020 --> 01:02:59,520 what the formula would say. 1123 01:03:03,560 --> 01:03:04,060 OK. 
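The exponential protocol just described can be sketched in a few lines. This is an illustration under stated assumptions, not the lecture's own code: the formula `phi`, the helpers `count_sat`, `honest_prover`, and `verifier`, and the message layout (one dictionary of counts per round) are all hypothetical choices.

```python
from itertools import product

M = 3  # number of variables in the hypothetical example formula


def phi(x):
    """Hypothetical example formula: (x1 OR x2) AND (x2 OR x3)."""
    x1, x2, x3 = x
    return int(bool((x1 or x2) and (x2 or x3)))


def count_sat(formula, m, presets=()):
    """#phi(presets), computed by brute force (the prover is unbounded)."""
    free = m - len(presets)
    return sum(formula(presets + rest) for rest in product((0, 1), repeat=free))


def honest_prover(formula, m):
    """Round i's message maps every i-bit preset to its true count."""
    return [{a: count_sat(formula, m, a) for a in product((0, 1), repeat=i)}
            for i in range(m + 1)]


def verifier(formula, m, k, messages):
    # Round 0: the claimed total must match the input k.
    if messages[0][()] != k:
        return False
    # Rounds 1..m: each count must equal the sum of its two extensions.
    for i in range(m):
        for a in messages[i]:
            if messages[i][a] != messages[i + 1][a + (0,)] + messages[i + 1][a + (1,)]:
                return False
    # Final check: fully preset counts must equal the formula's own value.
    return all(messages[m][a] == formula(a) for a in messages[m])


msgs = honest_prover(phi, M)
print(verifier(phi, M, 5, msgs))            # prints: True
print(sum(len(level) for level in msgs))    # prints: 15
```

The second printed value is the total number of counts sent, 1 + 2 + 4 + 8, which is the blowup the lecture warns about: it grows as 2 to the m.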
1124 01:03:04,060 --> 01:03:11,365 So now, what happens if k was the wrong value? 1125 01:03:14,240 --> 01:03:17,685 It did not agree with the number of satisfying assignments. 1126 01:03:21,960 --> 01:03:25,905 And what happens now? 1127 01:03:30,840 --> 01:03:34,560 Could the prover-- what happens if the prover tries to make 1128 01:03:34,560 --> 01:03:35,985 the verifier accept anyway? 1129 01:03:40,830 --> 01:03:44,340 So the only thing the prover can do at the very first step 1130 01:03:44,340 --> 01:03:48,690 would be to lie about-- 1131 01:03:48,690 --> 01:03:51,540 if the prover sends the-- if k is wrong 1132 01:03:51,540 --> 01:03:57,510 and the prover sends the correct value for the total count, 1133 01:03:57,510 --> 01:03:58,840 the verifier's going to reject. 1134 01:03:58,840 --> 01:04:03,050 So I'm trying to see, could the prover 1135 01:04:03,050 --> 01:04:05,090 try to make the verifier accept? 1136 01:04:05,090 --> 01:04:06,690 What happens? 1137 01:04:06,690 --> 01:04:09,080 So the prover has to lie here, and I'm 1138 01:04:09,080 --> 01:04:12,080 going to indicate that by saying the prover is 1139 01:04:12,080 --> 01:04:21,160 sending in the wrong value for the total count. 1140 01:04:21,160 --> 01:04:25,710 Well, if the prover's going to lie here, 1141 01:04:25,710 --> 01:04:34,880 then just like if you have a child who tells a lie, 1142 01:04:34,880 --> 01:04:36,380 and then you start-- as the parent, 1143 01:04:36,380 --> 01:04:37,880 you start asking questions to try 1144 01:04:37,880 --> 01:04:40,610 to see if the story is consistent, 1145 01:04:40,610 --> 01:04:44,460 one lie is going to lead to another lie. 1146 01:04:44,460 --> 01:04:45,750 And that's what happens here. 
1147 01:04:48,720 --> 01:04:52,390 In order to justify this lie, the prover 1148 01:04:52,390 --> 01:04:55,540 is going to have to lie in one, or perhaps both, 1149 01:04:55,540 --> 01:04:57,670 but at least one of these two values 1150 01:04:57,670 --> 01:05:00,820 because you can't have the two correct values adding up 1151 01:05:00,820 --> 01:05:01,990 to the incorrect value. 1152 01:05:05,080 --> 01:05:07,190 So you have to think about what's going on here. 1153 01:05:07,190 --> 01:05:09,640 So this is a lie that's going to force 1154 01:05:09,640 --> 01:05:13,550 a lie at one side or the other one level down, 1155 01:05:13,550 --> 01:05:17,000 which is then going to force a lie to propagate down. 1156 01:05:17,000 --> 01:05:20,260 And so there's-- a lie at every stage is going to force a lie 1157 01:05:20,260 --> 01:05:24,460 at least in one place or another to propagate all the way down 1158 01:05:24,460 --> 01:05:26,290 to the bottom. 1159 01:05:26,290 --> 01:05:28,690 And then at the bottom, the verifier 1160 01:05:28,690 --> 01:05:33,860 will see that the check doesn't work 1161 01:05:33,860 --> 01:05:37,220 as when it tries to connect it up with the formula itself, 1162 01:05:37,220 --> 01:05:39,185 and the verifier will reject. 1163 01:05:42,060 --> 01:05:45,690 So it's just a way of looking at this. 1164 01:05:45,690 --> 01:05:47,955 If the for-- if the value-- 1165 01:05:47,955 --> 01:05:49,665 if the input was not in the language. 1166 01:05:52,870 --> 01:05:58,787 So but the problem is that, as I said, this is exponential. 1167 01:05:58,787 --> 01:06:00,120 So how are we going to fix that? 1168 01:06:00,120 --> 01:06:02,945 So just looking ahead to what we're going to do on Tuesday-- 1169 01:06:05,670 --> 01:06:08,690 OK, let's see if there's any questions here first of all. 1170 01:06:15,420 --> 01:06:16,170 OK. 1171 01:06:16,170 --> 01:06:16,950 I got a question. 1172 01:06:16,950 --> 01:06:26,450 Should this be-- should this be a minus? 
1173 01:06:26,450 --> 01:06:31,550 I purposely made this bracket not include the very last 0. 1174 01:06:31,550 --> 01:06:33,800 Yeah, there's a total of m 0s here altogether, 1175 01:06:33,800 --> 01:06:36,110 but I left out the last 0. 1176 01:06:36,110 --> 01:06:37,610 That's why I said m minus 1. 1177 01:06:37,610 --> 01:06:39,396 Maybe it would have been better to say m. 1178 01:06:46,840 --> 01:06:49,010 OK, so I've got another interesting question here. 1179 01:06:49,010 --> 01:06:51,850 Why can't we reject right away if k is wrong? 1180 01:06:56,130 --> 01:06:59,600 Well, the verifier is probabilistic polynomial time. 1181 01:06:59,600 --> 01:07:01,820 How does the verifier know if k is wrong? 1182 01:07:06,480 --> 01:07:09,210 So I mean-- or right. 1183 01:07:09,210 --> 01:07:13,470 So what we're trying to do is something like NP 1184 01:07:13,470 --> 01:07:15,390 where we have a certificate, but now we 1185 01:07:15,390 --> 01:07:17,190 have this kind of interactive certificate 1186 01:07:17,190 --> 01:07:18,450 in the form of this prover. 1187 01:07:18,450 --> 01:07:20,075 Maybe that's another way to look at it. 1188 01:07:22,770 --> 01:07:24,420 Where if you're in the language, there 1189 01:07:24,420 --> 01:07:27,863 should be some way for the prover to make you accept. 1190 01:07:27,863 --> 01:07:29,280 But if you're not in the language, 1191 01:07:29,280 --> 01:07:32,205 there should be no way for the prover to make you accept. 1192 01:07:36,720 --> 01:07:38,998 So the verifier just can't reject right away 1193 01:07:38,998 --> 01:07:40,290 because there's no way to tell. 1194 01:07:40,290 --> 01:07:41,310 How does the verifier know? 1195 01:07:41,310 --> 01:07:42,450 It's going to start rejecting things 1196 01:07:42,450 --> 01:07:44,617 when it shouldn't if it's just going to be rejecting 1197 01:07:44,617 --> 01:07:47,110 willy-nilly here. 1198 01:07:47,110 --> 01:07:47,610 OK. 
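The lie-propagation argument can also be traced on a small case. The sketch below assumes a hypothetical formula `phi` with 5 satisfying assignments and a hypothetical cheating prover that inflates every count along the all-zeros path by `delta`, so that each addition check still passes; `first_failed_check` is an illustrative name for the verifier's checks, reporting where the protocol breaks.

```python
from itertools import product

M = 3  # number of variables in the hypothetical example formula


def phi(x):
    """Hypothetical example formula: (x1 OR x2) AND (x2 OR x3)."""
    x1, x2, x3 = x
    return int(bool((x1 or x2) and (x2 or x3)))


def true_count(presets):
    free = M - len(presets)
    return sum(phi(presets + rest) for rest in product((0, 1), repeat=free))


def cheating_messages(delta):
    """Honest counts everywhere, except delta is added along the all-zeros path."""
    msgs = []
    for i in range(M + 1):
        level = {a: true_count(a) for a in product((0, 1), repeat=i)}
        level[(0,) * i] += delta  # one lie per level keeps the sums consistent
        msgs.append(level)
    return msgs


def first_failed_check(k, msgs):
    """Return which verifier check fails first, or None if all checks pass."""
    if msgs[0][()] != k:
        return "top"
    for i in range(M):
        for a in msgs[i]:
            if msgs[i][a] != msgs[i + 1][a + (0,)] + msgs[i + 1][a + (1,)]:
                return f"sum at level {i}"
    for a, claimed in msgs[M].items():
        if claimed != phi(a):
            return "formula"
    return None


print(first_failed_check(5, cheating_messages(0)))  # prints: None
print(first_failed_check(6, cheating_messages(0)))  # prints: top
print(first_failed_check(6, cheating_messages(1)))  # prints: formula
```

With the wrong input k = 6, an honest total is caught at the top, and a consistent chain of lies survives every addition check but is caught when the bottom level is compared with the formula itself.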
1199 01:07:47,610 --> 01:07:49,560 How does the verifier need to determine 1200 01:07:49,560 --> 01:07:52,463 if the prover is internally consistent instead 1201 01:07:52,463 --> 01:07:53,130 of just asking-- 1202 01:07:57,750 --> 01:07:59,760 so why does the verifier need to determine 1203 01:07:59,760 --> 01:08:01,770 if the prover is internally consistent 1204 01:08:01,770 --> 01:08:04,710 instead of just asking the questions in step n plus 1? 1205 01:08:07,270 --> 01:08:09,100 Yeah, so maybe that's-- 1206 01:08:09,100 --> 01:08:11,350 because it looks like all of the work 1207 01:08:11,350 --> 01:08:14,380 is happening at the very end. 1208 01:08:14,380 --> 01:08:17,350 But I'm really presenting this to you 1209 01:08:17,350 --> 01:08:23,710 as a preparation for what we're going to do on Tuesday. 1210 01:08:23,710 --> 01:08:27,430 So it's important to think about the connection from each step 1211 01:08:27,430 --> 01:08:28,359 to the next. 1212 01:08:28,359 --> 01:08:31,090 Each step is going to be justified by what 1213 01:08:31,090 --> 01:08:35,330 happens at the next step until we get to the very end. 1214 01:08:35,330 --> 01:08:38,300 So you'd have to just understand it for what it is. 1215 01:08:38,300 --> 01:08:39,800 Don't try to make it more efficient. 1216 01:08:39,800 --> 01:08:43,510 I realize this is kind of dumb. 1217 01:08:43,510 --> 01:08:44,290 Good point. 1218 01:08:44,290 --> 01:08:47,660 We're not using the probabilism here. 1219 01:08:47,660 --> 01:08:51,000 And moreover, we're not really even using the interaction 1220 01:08:51,000 --> 01:08:51,500 here. 1221 01:08:51,500 --> 01:08:53,060 The prover is doing all the sending. 1222 01:08:53,060 --> 01:08:56,149 The verifier is just accepting at the end. 1223 01:08:56,149 --> 01:08:57,518 Yeah. 1224 01:08:57,518 --> 01:09:00,060 We're not using the power, and we're getting a weaker result. 1225 01:09:00,060 --> 01:09:02,819 So let's move on before we run out of time here.
1226 01:09:02,819 --> 01:09:05,680 So how are we going to fix this? 1227 01:09:05,680 --> 01:09:08,439 So the problem is this blowing up. 1228 01:09:08,439 --> 01:09:12,729 To justify each stage, each value 1229 01:09:12,729 --> 01:09:18,550 we're needing to present two values which add up to it. 1230 01:09:21,370 --> 01:09:25,810 And that's leading to a blowup. 1231 01:09:25,810 --> 01:09:27,819 Now, it would be nice if we can do something 1232 01:09:27,819 --> 01:09:33,310 where each value was supported by just a single value 1233 01:09:33,310 --> 01:09:34,359 at the next level. 1234 01:09:37,210 --> 01:09:39,090 So here's an idea. 1235 01:09:39,090 --> 01:09:44,250 In order to understand to see that this total count is 1236 01:09:44,250 --> 01:09:47,550 correct, why don't we just pick at random either 0 or 1 1237 01:09:47,550 --> 01:09:50,540 and only follow that one down? 1238 01:09:50,540 --> 01:09:52,250 Well, the problem with doing that 1239 01:09:52,250 --> 01:09:59,180 is because the sequence of lies could be just a single path 1240 01:09:59,180 --> 01:10:01,370 through this tree. 1241 01:10:01,370 --> 01:10:04,310 And the chances you're going to find that path down 1242 01:10:04,310 --> 01:10:06,613 to a contradiction at the bottom is very low 1243 01:10:06,613 --> 01:10:08,030 if you're just doing it at random. 1244 01:10:12,000 --> 01:10:15,150 So just randomly picking 0s and 1s 1245 01:10:15,150 --> 01:10:17,510 as the one you're going to justify, 1246 01:10:17,510 --> 01:10:19,550 used to justify the previous value, 1247 01:10:19,550 --> 01:10:22,400 is not going to be good enough. 1248 01:10:22,400 --> 01:10:24,360 But this is what we're going to do. 1249 01:10:24,360 --> 01:10:26,900 However, the values that we're going 1250 01:10:26,900 --> 01:10:33,080 to pick for these random inputs are not 1251 01:10:33,080 --> 01:10:35,440 going to be Boolean values. 
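The objection to following a random Boolean path can be made concrete. This is a rough sketch under assumptions: the formula `phi`, the single-path cheating prover, and the path-following verifier `path_verifier_accepts` are hypothetical, with the lies placed along the all-zeros path. It enumerates every path the verifier could pick and counts how many expose the lie.

```python
from fractions import Fraction
from itertools import product

M = 3  # number of variables in the hypothetical example formula


def phi(x):
    """Hypothetical example formula: (x1 OR x2) AND (x2 OR x3)."""
    x1, x2, x3 = x
    return int(bool((x1 or x2) and (x2 or x3)))


def true_count(presets):
    free = M - len(presets)
    return sum(phi(presets + rest) for rest in product((0, 1), repeat=free))


def cheating_messages(delta):
    """Honest counts, except delta is added along the all-zeros path."""
    msgs = []
    for i in range(M + 1):
        level = {a: true_count(a) for a in product((0, 1), repeat=i)}
        level[(0,) * i] += delta
        msgs.append(level)
    return msgs


def path_verifier_accepts(k, msgs, path):
    """Check only the nodes along one root-to-leaf path of Boolean choices."""
    if msgs[0][()] != k:
        return False
    a = ()
    for bit in path:
        i = len(a)
        if msgs[i][a] != msgs[i + 1][a + (0,)] + msgs[i + 1][a + (1,)]:
            return False
        a += (bit,)
    return msgs[M][a] == phi(a)


msgs = cheating_messages(1)  # prover falsely claims 6 instead of 5
caught = [p for p in product((0, 1), repeat=M)
          if not path_verifier_accepts(6, msgs, p)]
print(caught)                          # prints: [(0, 0, 0)]
print(Fraction(len(caught), 2 ** M))   # prints: 1/8
```

Only one of the 2 to the m equally likely paths reaches the contradiction, so a uniformly random Boolean path catches this prover with probability 2 to the minus m.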
1252 01:10:35,440 --> 01:10:40,810 We're going to pick non-Boolean assignments to the variables. 1253 01:10:40,810 --> 01:10:44,320 Which again, just as with the branching program case, 1254 01:10:44,320 --> 01:10:47,350 didn't make any sense on the surface of it. 1255 01:10:47,350 --> 01:10:50,730 We're going to have to make it make sense. 1256 01:10:50,730 --> 01:10:54,065 And we'll have to see how to do that in Tuesday's lecture. 1257 01:10:57,670 --> 01:10:59,050 So that's kind of the setup. 1258 01:11:01,775 --> 01:11:02,275 OK. 1259 01:11:09,080 --> 01:11:10,400 Yeah, so in a similar question. 1260 01:11:10,400 --> 01:11:14,440 Why is this any different from just non-deterministically 1261 01:11:14,440 --> 01:11:15,580 guessing the assignments? 1262 01:11:15,580 --> 01:11:16,690 It's because of this. 1263 01:11:16,690 --> 01:11:19,820 We're really setting the stage. 1264 01:11:19,820 --> 01:11:20,320 OK. 1265 01:11:20,320 --> 01:11:24,070 So what we did today was we introduced the model 1266 01:11:24,070 --> 01:11:26,800 and defined the complexity class. 1267 01:11:26,800 --> 01:11:30,550 We did show this one in its full glory. 1268 01:11:30,550 --> 01:11:33,490 We showed that non-ISO is in IP. 1269 01:11:33,490 --> 01:11:38,680 Really worth understanding this protocol here, making sure 1270 01:11:38,680 --> 01:11:44,350 you're comfortable with that and also the model itself. 1271 01:11:44,350 --> 01:11:48,270 And so for Tuesday's lecture, we're going to finish this up. 1272 01:11:48,270 --> 01:11:49,770 Well, we started showing that #SAT 1273 01:11:49,770 --> 01:11:55,020 is in IP which is what we need to do to prove coNP is in IP. 1274 01:11:55,020 --> 01:12:00,790 And we'll finish that next time, which will be our last time. 1275 01:12:00,790 --> 01:12:01,290 OK. 1276 01:12:01,290 --> 01:12:03,630 So that's it for today. 1277 01:12:03,630 --> 01:12:06,670 I'll stick around for questions. 1278 01:12:06,670 --> 01:12:08,310 So a good question here.
1279 01:12:08,310 --> 01:12:14,820 Why can't V just reject if some of the checks are incorrect? 1280 01:12:14,820 --> 01:12:15,810 Yes. 1281 01:12:15,810 --> 01:12:18,580 As soon as there's a check that fails, 1282 01:12:18,580 --> 01:12:21,150 V can just reject at that stage. 1283 01:12:21,150 --> 01:12:24,480 I'm just trying to argue that at some point along the way, 1284 01:12:24,480 --> 01:12:27,270 if the input is not in the language, 1285 01:12:27,270 --> 01:12:30,188 there's going to be a check that fails. 1286 01:12:30,188 --> 01:12:31,980 I mean, I said reject at the end, but yeah. 1287 01:12:31,980 --> 01:12:36,810 I mean, you could have rejected at any point along the way. 1288 01:12:42,845 --> 01:12:43,345 OK. 1289 01:12:50,100 --> 01:12:53,220 Someone's asking for what role did I play? 1290 01:12:53,220 --> 01:12:58,720 So I did-- my own personal role in this was twofold. 1291 01:12:58,720 --> 01:13:02,758 First of all, I came up with the idea of-- 1292 01:13:02,758 --> 01:13:03,300 not the idea. 1293 01:13:03,300 --> 01:13:06,990 I came up with the name interactive proof. 1294 01:13:06,990 --> 01:13:09,660 I remember when Silvio Micali was explaining this to me 1295 01:13:09,660 --> 01:13:12,420 in my apartment many, many years ago. 1296 01:13:12,420 --> 01:13:18,033 He had kind of a little bit complicated-- 1297 01:13:18,033 --> 01:13:20,200 and I don't even remember what the protocol was for. 1298 01:13:20,200 --> 01:13:21,533 It was not for something simple. 1299 01:13:21,533 --> 01:13:23,980 It was something involving prime numbers. 1300 01:13:23,980 --> 01:13:25,330 And I said, oh. 1301 01:13:25,330 --> 01:13:27,740 That's a kind of an interactive proof. 1302 01:13:27,740 --> 01:13:29,900 And it stuck from that point on. 1303 01:13:29,900 --> 01:13:32,920 So that was one thing. 
1304 01:13:32,920 --> 01:13:36,370 But the other thing, in terms of more mathematically, 1305 01:13:36,370 --> 01:13:39,880 my role was-- so Shafi Goldwasser and I 1306 01:13:39,880 --> 01:13:43,360 proved the equivalence of the two models, the public coin 1307 01:13:43,360 --> 01:13:46,330 and the private coin version. 1308 01:13:46,330 --> 01:13:52,210 So that was my role in this back when 1309 01:13:52,210 --> 01:13:55,670 this was all first coming out. 1310 01:13:55,670 --> 01:13:59,540 Proved it on an airplane on the way to a conference somewhere. 1311 01:13:59,540 --> 01:14:01,630 Anyway, so I think we're going to-- 1312 01:14:01,630 --> 01:14:03,760 unless there's any other questions, 1313 01:14:03,760 --> 01:14:05,080 I think we'll head out. 1314 01:14:05,080 --> 01:14:06,490 Take care, everybody. 1315 01:14:06,490 --> 01:14:09,850 See you on-- see you on Tuesday. 1316 01:14:09,850 --> 01:14:11,260 Bye-bye.