1 00:00:00,000 --> 00:00:01,972 [SQUEAKING] 2 00:00:01,972 --> 00:00:03,944 [RUSTLING] 3 00:00:03,944 --> 00:00:05,423 [CLICKING] 4 00:00:25,617 --> 00:00:26,700 MICHAEL SIPSER: Hi, folks. 5 00:00:26,700 --> 00:00:27,290 Welcome back. 6 00:00:31,580 --> 00:00:36,320 So we will continue our discussion that we had-- 7 00:00:36,320 --> 00:00:39,180 that we've been doing for the past few lectures. 8 00:00:39,180 --> 00:00:41,750 We first talked about time complexity. 9 00:00:41,750 --> 00:00:46,430 And then we shifted gears to talk about space complexity. 10 00:00:46,430 --> 00:00:52,460 So we had a couple of lectures on PSPACE, 11 00:00:52,460 --> 00:00:56,510 kind of culminating in proving that there were languages which 12 00:00:56,510 --> 00:00:59,330 are PSPACE complete, namely this TQBF language, 13 00:00:59,330 --> 00:01:00,870 which is where we started. 14 00:01:00,870 --> 00:01:05,840 And then we also proved that there are problems involving 15 00:01:05,840 --> 00:01:09,710 games, such as the generalized geography game, where 16 00:01:09,710 --> 00:01:11,840 determining which side has a winning strategy 17 00:01:11,840 --> 00:01:13,610 is PSPACE complete. 18 00:01:13,610 --> 00:01:15,770 At the end of the lecture last time, 19 00:01:15,770 --> 00:01:19,670 we moved to a different regime of space, 20 00:01:19,670 --> 00:01:22,790 namely from polynomial space down to log space. 21 00:01:22,790 --> 00:01:25,910 And we introduced the classes L and NL. 22 00:01:25,910 --> 00:01:28,820 And so I'm going to begin today's lecture 23 00:01:28,820 --> 00:01:32,330 by reviewing some of the material on L and NL, 24 00:01:32,330 --> 00:01:34,760 which I think came a little too quickly last time. 25 00:01:34,760 --> 00:01:36,620 And then we have two important theorems 26 00:01:36,620 --> 00:01:38,370 we're going to cover today. 
27 00:01:38,370 --> 00:01:45,860 One is about proving that there are complete languages for NL 28 00:01:45,860 --> 00:01:49,190 that has a bearing on the L versus NL question-- 29 00:01:49,190 --> 00:01:51,500 log space, deterministic log space, 30 00:01:51,500 --> 00:01:53,990 versus nondeterministic log space. 31 00:01:53,990 --> 00:01:57,120 That's yet another unsolved problem in our field. 32 00:01:57,120 --> 00:02:00,950 And so there is a notion of a complete problem for NL. 33 00:02:00,950 --> 00:02:06,080 And then we're going to prove a theorem that 34 00:02:06,080 --> 00:02:08,610 was, in its day, very surprising to people. 35 00:02:08,610 --> 00:02:11,570 I remember when it came out in the 1980s, 36 00:02:11,570 --> 00:02:15,770 that NL, in fact, is closed under complementation, 37 00:02:15,770 --> 00:02:21,680 that the NL class and coNL class both collapse to 1, 38 00:02:21,680 --> 00:02:24,230 that they're equal, which is not the way we believe 39 00:02:24,230 --> 00:02:26,660 the situation to be for NP. 40 00:02:26,660 --> 00:02:30,530 But until someone has a proof that they're different, 41 00:02:30,530 --> 00:02:33,010 strange things can happen. 42 00:02:33,010 --> 00:02:34,450 OK. 43 00:02:34,450 --> 00:02:40,340 So with that, I will move myself into the corner. 44 00:02:40,340 --> 00:02:44,230 And let's do our review. 45 00:02:44,230 --> 00:02:53,260 So we are-- in order to talk about space complexity classes 46 00:02:53,260 --> 00:02:58,000 that are smaller than n, we had to introduce a new model, 47 00:02:58,000 --> 00:03:03,610 which was the two-tape model, where there was a read-only 48 00:03:03,610 --> 00:03:07,090 tape that had the input on it, which you normally would think 49 00:03:07,090 --> 00:03:08,920 of something very large-- 50 00:03:08,920 --> 00:03:14,890 the whole internet, or something so big 51 00:03:14,890 --> 00:03:18,460 that you can't read it into your local memory. 
52 00:03:18,460 --> 00:03:25,130 And then you have a work tape, which is your local memory. 53 00:03:25,130 --> 00:03:29,540 And the way we're going to think about it in the context of log 54 00:03:29,540 --> 00:03:32,900 space is that that work tape is logarithmic in the size 55 00:03:32,900 --> 00:03:33,890 of the input. 56 00:03:33,890 --> 00:03:38,780 And that is enough to have small counters or pointers 57 00:03:38,780 --> 00:03:43,100 into the input, because a reference location of the input 58 00:03:43,100 --> 00:03:43,710 is just-- 59 00:03:43,710 --> 00:03:48,450 you only need log n bits to do that. 60 00:03:48,450 --> 00:03:57,630 So we gave a couple of examples of the L and NL-- 61 00:03:57,630 --> 00:04:02,580 of L and NL languages, so this language of ww reverse, 62 00:04:02,580 --> 00:04:04,590 as you may remember. 63 00:04:04,590 --> 00:04:08,200 So here is an input in ww reverse. 64 00:04:08,200 --> 00:04:11,340 It's a string follow-- it's a palindromic string, which 65 00:04:11,340 --> 00:04:13,350 is of even length. 66 00:04:13,350 --> 00:04:16,440 And so you can make a machine in log space here 67 00:04:16,440 --> 00:04:19,230 that can test whether its input is of that form. 68 00:04:19,230 --> 00:04:22,050 And the work tape is only-- 69 00:04:22,050 --> 00:04:26,850 all it needs is a pointer into the-- 70 00:04:26,850 --> 00:04:29,250 or a couple of pointers that refer 71 00:04:29,250 --> 00:04:31,860 to the corresponding places of the input 72 00:04:31,860 --> 00:04:33,630 that you're looking at at the moment. 73 00:04:33,630 --> 00:04:35,940 So you maybe start out looking at the two 74 00:04:35,940 --> 00:04:41,190 outside a's and then the b symbols that are next to that. 75 00:04:41,190 --> 00:04:45,270 And you can write down on your tape 76 00:04:45,270 --> 00:04:47,950 where you're looking currently. 
77 00:04:47,950 --> 00:04:49,920 And so that's going to be enough for you 78 00:04:49,920 --> 00:04:53,250 to-- you may have to zigzag, of course, back and forth a lot 79 00:04:53,250 --> 00:04:55,500 in order to do that test. 80 00:04:55,500 --> 00:04:59,710 But that's completely fine, using the model. 81 00:04:59,710 --> 00:05:01,260 We're not going to be measuring time. 82 00:05:01,260 --> 00:05:04,920 We're only going to be focusing on how much space we're using. 83 00:05:04,920 --> 00:05:08,400 Another example that we gave is the PATH language, 84 00:05:08,400 --> 00:05:10,830 where you're given a graph, and a start node, 85 00:05:10,830 --> 00:05:11,580 and a target node. 86 00:05:11,580 --> 00:05:16,710 And you want to know, is there a path in this directed graph 87 00:05:16,710 --> 00:05:18,600 that goes from s to t? 88 00:05:18,600 --> 00:05:21,840 And that's the language that's also-- 89 00:05:21,840 --> 00:05:24,030 that language is in NL-- 90 00:05:24,030 --> 00:05:27,420 in fact, not known to be in L. 91 00:05:27,420 --> 00:05:30,900 So the way that would look, shifting 92 00:05:30,900 --> 00:05:35,280 to an input in the PATH language, 93 00:05:35,280 --> 00:05:37,410 you would have a graph represented, 94 00:05:37,410 --> 00:05:41,550 say, by a sequence of edges, and a start, and a target. 95 00:05:41,550 --> 00:05:47,348 And the work tape would keep track of the current node. 96 00:05:47,348 --> 00:05:48,890 So the nondeterministic machine would 97 00:05:48,890 --> 00:05:53,180 guess a path that takes you from s to t, node by node. 98 00:05:53,180 --> 00:05:56,870 And the work tape would keep track of the current node. 99 00:05:56,870 --> 00:05:59,180 OK, so I hope you have this-- 100 00:05:59,180 --> 00:06:02,120 develop a little bit of an intuition for these classes, 101 00:06:02,120 --> 00:06:03,020 L and NL. 102 00:06:03,020 --> 00:06:05,810 We're going to be spending the entire lecture today 103 00:06:05,810 --> 00:06:09,050 talking about that. 
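
The ww reverse test described above can be sketched in Python. This is only an illustration under stated assumptions, not the machine from the lecture: the function name `is_even_palindrome` is made up here, the string `w` plays the role of the read-only input tape, and the two integer indices play the role of the pointers kept on the work tape.

```python
def is_even_palindrome(w: str) -> bool:
    """Decide whether w has the form u followed by u reversed.

    w is only read, never copied. The only working storage is two
    indices into w, and each index needs about log2(len(w)) bits.
    """
    n = len(w)
    if n % 2 != 0:
        return False  # strings of the form u + reversed(u) have even length
    left, right = 0, n - 1  # start at the two outside symbols
    while left < right:
        if w[left] != w[right]:
            return False
        left += 1   # move both pointers one step inward
        right -= 1
    return True
```

On a Turing machine the head would have to zigzag between the two marked positions, which costs time but no extra space. Python's random access hides that zigzagging.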
104 00:06:09,050 --> 00:06:10,700 OK. 105 00:06:10,700 --> 00:06:15,800 So as I mentioned, the L and NL problem is an unsolved problem. 106 00:06:15,800 --> 00:06:19,730 And it's very much analogous to the P versus NP problem, 107 00:06:19,730 --> 00:06:21,770 except, as I mentioned, as we'll show, 108 00:06:21,770 --> 00:06:24,680 that NL and its complement end up 109 00:06:24,680 --> 00:06:26,390 being the same, which is not something 110 00:06:26,390 --> 00:06:29,960 that seems to be the case for NP, though we don't know. 111 00:06:37,760 --> 00:06:40,382 Can we think of this as a multi-head Turing machine? 112 00:06:40,382 --> 00:06:42,590 I'm getting a question about that, which is, I think, 113 00:06:42,590 --> 00:06:43,760 you can. 114 00:06:43,760 --> 00:06:47,450 In fact, that's an alternative way that people look at it. 115 00:06:47,450 --> 00:06:50,210 You can think of it as having multiple-- you know, 116 00:06:50,210 --> 00:06:56,980 a head basically needs log space to store the location-- 117 00:06:56,980 --> 00:07:03,350 to store the location of where that head would be. 118 00:07:03,350 --> 00:07:06,230 So if you imagine having several different heads on the input 119 00:07:06,230 --> 00:07:08,720 tape, you can think of a log space machine 120 00:07:08,720 --> 00:07:10,730 as being sort of a Turing machine 121 00:07:10,730 --> 00:07:13,070 that has multiple heads on the input tape. 122 00:07:13,070 --> 00:07:15,590 It's equivalent. 123 00:07:15,590 --> 00:07:17,840 Good question. 124 00:07:17,840 --> 00:07:21,870 So let's move on, then. 125 00:07:21,870 --> 00:07:23,450 OK. 126 00:07:23,450 --> 00:07:26,510 So one of the things we proved last time was that anything 127 00:07:26,510 --> 00:07:31,460 that you can do in L, you can also do in polynomial time. 128 00:07:31,460 --> 00:07:35,190 And I'll answer some of these chat questions in a minute. 129 00:07:35,190 --> 00:07:42,160 But the-- so the class L is a subset of P. 
130 00:07:42,160 --> 00:07:44,110 This is easy to prove. 131 00:07:44,110 --> 00:07:47,800 But I think it's nevertheless important to see why it's true, 132 00:07:47,800 --> 00:07:51,080 because it sort of sets some definitions of things 133 00:07:51,080 --> 00:07:52,910 that we're going to use later. 134 00:07:52,910 --> 00:07:59,710 So in particular, we really need to know 135 00:07:59,710 --> 00:08:05,380 the notion of a configuration of a log space machine 136 00:08:05,380 --> 00:08:07,330 on an input. 137 00:08:07,330 --> 00:08:12,080 So because the input does not change, 138 00:08:12,080 --> 00:08:13,780 we don't really consider the input 139 00:08:13,780 --> 00:08:16,490 to be part of the configuration. 140 00:08:16,490 --> 00:08:19,390 The only thing that's-- 141 00:08:19,390 --> 00:08:21,490 the thing that's relevant in the configuration 142 00:08:21,490 --> 00:08:24,400 is the dynamic part of the machine-- the state, the head 143 00:08:24,400 --> 00:08:28,150 locations, and the work tape contents. 144 00:08:28,150 --> 00:08:30,550 So we're defining that the configuration 145 00:08:30,550 --> 00:08:32,650 for the machine on a particular input, w, 146 00:08:32,650 --> 00:08:35,289 is those four things-- 147 00:08:35,289 --> 00:08:38,200 state, the two head locations, and the tape contents. 148 00:08:38,200 --> 00:08:41,110 And the important thing to keep in mind for this theorem 149 00:08:41,110 --> 00:08:44,380 is that we have only a polynomial number 150 00:08:44,380 --> 00:08:48,640 of different configurations if you just do the calculation. 151 00:08:48,640 --> 00:08:51,730 The main part is the number of different tape contents 152 00:08:51,730 --> 00:08:54,460 as you can have, which is exponential in the log n. 153 00:08:54,460 --> 00:08:57,230 And that's a polynomial. 
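
The count of configurations mentioned above can be written out as a short calculation. The symbols are assumptions chosen for this sketch: q is the number of states, n is the input length, the work tape has c log n cells for some constant c, and d is the size of the tape alphabet.

```latex
\[
\underbrace{q}_{\text{states}}
\cdot \underbrace{n}_{\text{input head}}
\cdot \underbrace{c \log_2 n}_{\text{work head}}
\cdot \underbrace{d^{\,c \log_2 n}}_{\text{work tape contents}}
\;=\; q \cdot c \cdot n \log_2 n \cdot n^{\,c \log_2 d}
\;\le\; q \cdot c \cdot n^{\,c \log_2 d + 2}
\]
```

The step from d to the power c log n to n to the power c log d is the identity d^{log_2 n} = n^{log_2 d}, which is why an exponential in log n is only a polynomial in n.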
154 00:08:57,230 --> 00:09:01,340 And so therefore, any machine that runs in log space, 155 00:09:01,340 --> 00:09:03,440 provided it always halts, and we always 156 00:09:03,440 --> 00:09:06,590 assume our machines always halt, they 157 00:09:06,590 --> 00:09:09,860 can only run for a polynomial number of steps, 158 00:09:09,860 --> 00:09:13,460 because that's as many different configurations as they have. 159 00:09:13,460 --> 00:09:16,940 If they ran for, say, an exponential number of steps, 160 00:09:16,940 --> 00:09:20,330 they would have to be repeating configurations. 161 00:09:20,330 --> 00:09:23,160 And then they would be looping. 162 00:09:23,160 --> 00:09:27,860 OK, so there we go. 163 00:09:30,380 --> 00:09:33,840 OK, so let me just get back-- so somebody asked me a question. 164 00:09:33,840 --> 00:09:34,880 Which is harder? 165 00:09:34,880 --> 00:09:38,790 P versus NP or L versus NL? 166 00:09:38,790 --> 00:09:43,160 Completely no idea. 167 00:09:43,160 --> 00:09:50,770 It's a-- I guess there was a common line of thinking that 168 00:09:50,770 --> 00:09:52,570 if you're going to-- 169 00:09:52,570 --> 00:09:56,038 that it's good to try to think about-- 170 00:09:56,038 --> 00:09:57,580 if you're trying to separate classes, 171 00:09:57,580 --> 00:10:00,100 you might as well take classes that are as far 172 00:10:00,100 --> 00:10:02,740 apart as one another. 173 00:10:02,740 --> 00:10:05,020 Like, if you're trying to prove-- 174 00:10:05,020 --> 00:10:08,290 if you're comparing P different from NP and P different 175 00:10:08,290 --> 00:10:11,440 from PSPACE, maybe P different from PSPACE 176 00:10:11,440 --> 00:10:16,510 might be easier, because P and PSPACE seem to be even further 177 00:10:16,510 --> 00:10:18,850 apart than P and NP. 178 00:10:18,850 --> 00:10:19,730 Nobody knows. 179 00:10:19,730 --> 00:10:22,420 And I suspect that there's something 180 00:10:22,420 --> 00:10:26,480 fundamental about computation that we just don't understand. 
181 00:10:26,480 --> 00:10:28,510 And then once somebody makes a breakthrough 182 00:10:28,510 --> 00:10:32,350 and solves one of those problems, a lot of them 183 00:10:32,350 --> 00:10:34,010 are going to get solved in short order. 184 00:10:34,010 --> 00:10:39,770 But again, it's purely speculation. 185 00:10:39,770 --> 00:10:40,700 OK, d. 186 00:10:40,700 --> 00:10:46,190 What is d here? d would be the size of the tape alphabet. 187 00:10:46,190 --> 00:10:50,660 OK, so this is the number of different tape contents 188 00:10:50,660 --> 00:10:51,230 we have. 189 00:10:54,570 --> 00:10:55,350 Good. 190 00:10:55,350 --> 00:10:57,150 All right. 191 00:10:57,150 --> 00:11:00,560 So let's continue on. 192 00:11:00,560 --> 00:11:03,050 Another thing we mentioned kind of quickly in passing, 193 00:11:03,050 --> 00:11:07,910 but still an important fact, is that Savitch's theorem 194 00:11:07,910 --> 00:11:12,100 works down to the level of log space-- 195 00:11:12,100 --> 00:11:13,660 same exact proof. 196 00:11:13,660 --> 00:11:17,920 So that means that nondeterministic log space 197 00:11:17,920 --> 00:11:22,215 is contained in deterministic log squared space, 198 00:11:22,215 --> 00:11:24,340 because that's what Savitch's theorem does for you. 199 00:11:24,340 --> 00:11:27,730 It converts nondeterministic machines 200 00:11:27,730 --> 00:11:29,500 to deterministic machines at the cost 201 00:11:29,500 --> 00:11:32,840 of a squaring in the amount of space you need. 202 00:11:32,840 --> 00:11:35,710 And so I'm not going to go through this in detail. 
203 00:11:35,710 --> 00:11:40,510 But the same picture that I copied off an earlier slide 204 00:11:40,510 --> 00:11:44,890 with a simple modification is that instead of-- 205 00:11:44,890 --> 00:11:46,210 that I write down-- 206 00:11:46,210 --> 00:11:50,565 the size of the configuration is going to be now log n, 207 00:11:50,565 --> 00:11:52,440 because that's how big the configurations are 208 00:11:52,440 --> 00:11:55,800 when you have a nondeterministic log space machine. 209 00:11:55,800 --> 00:11:57,990 And so simulating that-- 210 00:11:57,990 --> 00:12:02,130 so this would be what the tableau would 211 00:12:02,130 --> 00:12:05,970 look like for an NL machine. 212 00:12:05,970 --> 00:12:08,850 And then you can simulate that in the same way 213 00:12:08,850 --> 00:12:12,840 by trying all possible intermediates 214 00:12:12,840 --> 00:12:16,733 and then splitting it, doing the top half and then 215 00:12:16,733 --> 00:12:17,400 the bottom half. 216 00:12:17,400 --> 00:12:22,363 We're using the space, of course, recursively. 217 00:12:22,363 --> 00:12:24,030 The amount of space you're going to need 218 00:12:24,030 --> 00:12:29,040 is going to be enough to store, for one level of the recursion, 219 00:12:29,040 --> 00:12:30,630 one configuration. 220 00:12:30,630 --> 00:12:32,460 And that's order log space. 221 00:12:32,460 --> 00:12:34,680 And then the number of levels of recursion 222 00:12:34,680 --> 00:12:36,510 is going to be another factor of log n, 223 00:12:36,510 --> 00:12:38,970 because that's log of the running time, which 224 00:12:38,970 --> 00:12:43,470 is going to be exponential in log n, which is polynomial. 225 00:12:43,470 --> 00:12:45,870 So the total amount of space that you would need 226 00:12:45,870 --> 00:12:47,048 would be log squared space. 227 00:12:47,048 --> 00:12:49,590 Again, this is sort of saying the proof of Savitch's theorem, 228 00:12:49,590 --> 00:12:51,520 just over again. 
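
The recursive splitting described above can be sketched in Python. This is a rough illustration rather than the lecture's construction: `can_yield`, `nodes`, and `step` are names invented here, `nodes` stands for the list of all configurations, and `step(a, b)` is assumed to report whether b follows from a in one move.

```python
from typing import Callable, Hashable, Sequence

def can_yield(a: Hashable, b: Hashable, t: int,
              nodes: Sequence[Hashable],
              step: Callable[[Hashable, Hashable], bool]) -> bool:
    """Return True if b can be reached from a in at most t steps."""
    if t <= 0:
        return a == b
    if t == 1:
        return a == b or step(a, b)
    # Try every possible midpoint m, then check the first half and the
    # second half of the path recursively, reusing the same space.
    first, second = (t + 1) // 2, t // 2
    return any(
        can_yield(a, m, first, nodes, step) and can_yield(m, b, second, nodes, step)
        for m in nodes
    )
```

Each level of the recursion remembers only a, b, m, and t, which is order log n bits when the nodes are configurations of a log space machine, and the depth is about log of t. With t polynomial in n, that gives log n levels of log n space each, hence log squared space. The running time of this sketch is large, which is fine because only space is being measured.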
229 00:12:51,520 --> 00:12:54,100 So if it's coming too fast for you, 230 00:12:54,100 --> 00:12:56,310 just review the proof of Savitch's theorem 231 00:12:56,310 --> 00:12:59,880 and observe that it works, even if the amount of space 232 00:12:59,880 --> 00:13:00,810 that the machine-- 233 00:13:00,810 --> 00:13:03,000 that the nondeterministic machine starts off with 234 00:13:03,000 --> 00:13:04,050 is log space. 235 00:13:07,240 --> 00:13:09,610 All right. 236 00:13:09,610 --> 00:13:12,820 And last thing I was going to-- last thing in the category 237 00:13:12,820 --> 00:13:22,330 of a review is our theorem that not only is all of L within P-- 238 00:13:22,330 --> 00:13:24,760 and that's kind of trivial, kind of immediate. 239 00:13:24,760 --> 00:13:26,710 You don't even have to change the machine. 240 00:13:26,710 --> 00:13:29,870 If you have a log space machine for some language, 241 00:13:29,870 --> 00:13:32,057 the very same machine is a polynomial time machine 242 00:13:32,057 --> 00:13:33,640 for that language, because it can only 243 00:13:33,640 --> 00:13:35,570 be running for a polynomial amount of time. 244 00:13:35,570 --> 00:13:40,290 But now, we have a nondeterministic machine 245 00:13:40,290 --> 00:13:41,850 for some language. 246 00:13:41,850 --> 00:13:44,130 We're going to have to change it to become 247 00:13:44,130 --> 00:13:47,320 a deterministic machine that runs in polynomial time. 248 00:13:47,320 --> 00:13:50,850 And so we're going to give a deterministic polynomial time 249 00:13:50,850 --> 00:13:56,660 simulation of a nondeterministic log space machine. 250 00:13:56,660 --> 00:14:00,510 And we kind of did this last time, but a little quickly. 
251 00:14:00,510 --> 00:14:06,490 So now, if we have some nondeterministic log space 252 00:14:06,490 --> 00:14:15,010 machine, so an M, which decides the language A in log space, 253 00:14:15,010 --> 00:14:16,960 we're going to show how to simulate 254 00:14:16,960 --> 00:14:21,130 that machine with a deterministic polynomial time 255 00:14:21,130 --> 00:14:22,450 machine. 256 00:14:22,450 --> 00:14:28,210 And the key idea, which is going to come up 257 00:14:28,210 --> 00:14:34,830 in a later theorem, so good to understand it not only here, 258 00:14:34,830 --> 00:14:39,870 but to understand it for the next theorem that's coming, 259 00:14:39,870 --> 00:14:42,540 is the notion of a configuration graph. 260 00:14:42,540 --> 00:14:45,400 I was sort of thinking about calling it a computation graph. 261 00:14:45,400 --> 00:14:46,980 But now, on further reflection, I 262 00:14:46,980 --> 00:14:50,940 think configuration graph maybe is the more suggestive term. 263 00:14:50,940 --> 00:14:53,200 So let's stick with that. 264 00:14:53,200 --> 00:14:59,430 So a configuration graph for a machine on an input 265 00:14:59,430 --> 00:15:02,130 is just a set of all configurations 266 00:15:02,130 --> 00:15:04,350 that the machine has, all the possible different 267 00:15:04,350 --> 00:15:07,950 configurations the machine can have, 268 00:15:07,950 --> 00:15:11,790 with edges connecting configurations that correspond 269 00:15:11,790 --> 00:15:14,140 to legal moves of the machine. 270 00:15:14,140 --> 00:15:15,630 So here is some configuration. 271 00:15:15,630 --> 00:15:18,540 This is a snapshot of the machine at a moment in time. 272 00:15:18,540 --> 00:15:20,710 Here is some other configuration, 273 00:15:20,710 --> 00:15:24,840 another snapshot of the machine at a moment in time. 274 00:15:24,840 --> 00:15:28,080 And you're going to draw an edge between ci and cj 275 00:15:28,080 --> 00:15:31,440 if cj could follow in one step from ci. 
276 00:15:31,440 --> 00:15:34,080 And you could tell by just looking at the configurations 277 00:15:34,080 --> 00:15:36,000 whether that could be possible. 278 00:15:36,000 --> 00:15:40,320 Obviously, the head has to be one place over in cj 279 00:15:40,320 --> 00:15:41,610 from where it was in ci. 280 00:15:41,610 --> 00:15:43,350 And it has to be updated according 281 00:15:43,350 --> 00:15:44,920 to the rules of the machine. 282 00:15:44,920 --> 00:15:48,930 So you can tell whether-- 283 00:15:48,930 --> 00:15:51,150 so you could fill out this graph. 284 00:15:51,150 --> 00:15:53,970 You could write down all the possible configurations. 285 00:15:53,970 --> 00:15:56,010 And you can put the edges down. 286 00:15:56,010 --> 00:15:59,970 Now, the point is that when we have a log space machine, 287 00:15:59,970 --> 00:16:03,120 we don't have too many possible configurations. 288 00:16:03,120 --> 00:16:05,490 There's only a polynomial number. 289 00:16:05,490 --> 00:16:09,390 So the size of this whole graph is polynomial. 290 00:16:12,950 --> 00:16:15,710 So our polynomial time simulation 291 00:16:15,710 --> 00:16:19,730 is going to write down that entire configuration 292 00:16:19,730 --> 00:16:23,420 graph of the log space machine on its input. 293 00:16:23,420 --> 00:16:25,920 There [INAUDIBLE] many configurations. 294 00:16:25,920 --> 00:16:28,170 There can be only polynomially many. 295 00:16:28,170 --> 00:16:31,260 So you can write down all those configurations as the nodes 296 00:16:31,260 --> 00:16:33,240 and then go look at each pair of nodes, 297 00:16:33,240 --> 00:16:36,990 whether this configuration could lead to that configuration. 298 00:16:36,990 --> 00:16:39,400 According to-- it's a nondeterministic machine. 299 00:16:39,400 --> 00:16:42,250 So a configuration could go to several different locations-- 300 00:16:42,250 --> 00:16:44,790 there could be several different ways to go. 
301 00:16:44,790 --> 00:16:47,130 But those are just several different outgoing edges 302 00:16:47,130 --> 00:16:49,890 from a particular node in this graph representing 303 00:16:49,890 --> 00:16:51,600 the configuration, which might have 304 00:16:51,600 --> 00:16:53,550 several different legal successors 305 00:16:53,550 --> 00:16:57,910 in the nondeterministic computation. 306 00:16:57,910 --> 00:16:58,480 OK. 307 00:16:58,480 --> 00:17:05,950 Now, the important thing here is that M accepts its input, w, 308 00:17:05,950 --> 00:17:09,500 exactly when there is a path [INAUDIBLE] 309 00:17:09,500 --> 00:17:11,569 configuration graph that takes you 310 00:17:11,569 --> 00:17:15,050 from the start configuration to the accept configuration. 311 00:17:15,050 --> 00:17:16,790 And as I mentioned, let's assume, 312 00:17:16,790 --> 00:17:19,383 as we've been doing, that the machine-- 313 00:17:19,383 --> 00:17:21,050 I should have put it here, but I didn't. 314 00:17:21,050 --> 00:17:25,790 But the machine, when it is about to accept, 315 00:17:25,790 --> 00:17:30,080 it erases its work tape and moves both of its heads 316 00:17:30,080 --> 00:17:32,240 to the home position at the left end of the tape. 317 00:17:32,240 --> 00:17:34,840 So there's just one accepting configuration 318 00:17:34,840 --> 00:17:35,840 you have to worry about. 319 00:17:35,840 --> 00:17:36,965 It just makes life simpler. 320 00:17:39,450 --> 00:17:41,430 So there's going to be a start configuration, 321 00:17:41,430 --> 00:17:45,780 a single accept configuration in this configuration graph. 
322 00:17:45,780 --> 00:17:51,180 And now there's going to be a path, indicated here, that 323 00:17:51,180 --> 00:17:54,870 connects the start configuration to the accept configuration 324 00:17:54,870 --> 00:18:00,510 if and only if M accepts w, because that path 325 00:18:00,510 --> 00:18:02,760 is the sequence of configurations 326 00:18:02,760 --> 00:18:05,310 that the machine would go through 327 00:18:05,310 --> 00:18:07,560 if you launched it on w. 328 00:18:07,560 --> 00:18:09,120 It would start at the start. 329 00:18:09,120 --> 00:18:11,700 And there might be several different ways to go. 330 00:18:11,700 --> 00:18:14,760 But if there is one of them that leads you to an accept, 331 00:18:14,760 --> 00:18:17,400 that's going to correspond to a branch of the computation 332 00:18:17,400 --> 00:18:19,010 that it's accepting. 333 00:18:19,010 --> 00:18:24,140 OK, so that tells us what the polynomial time algorithm is. 334 00:18:24,140 --> 00:18:29,870 On input w, you construct that configuration graph for M 335 00:18:29,870 --> 00:18:33,540 on w, G sub Mw. 336 00:18:33,540 --> 00:18:37,440 And you test whether there's a path from c start to c 337 00:18:37,440 --> 00:18:44,370 accept using any polynomial time depth-first search 338 00:18:44,370 --> 00:18:48,420 or breadth-first search algorithm for testing 339 00:18:48,420 --> 00:18:51,990 whether there's a connection path in a graph. 340 00:18:51,990 --> 00:18:54,300 And if there is such a path, you accept, 341 00:18:54,300 --> 00:18:57,060 because that means the machine M accepted. 342 00:18:57,060 --> 00:19:01,110 And if there was no path, you reject, because then M 343 00:19:01,110 --> 00:19:03,540 must have not accepted. 344 00:19:03,540 --> 00:19:07,500 And therefore, you have a polynomial time simulation 345 00:19:07,500 --> 00:19:12,190 of your nondeterministic log space machine M, OK? 346 00:19:12,190 --> 00:19:13,910 How are we doing? 
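
The polynomial time algorithm just described can be sketched in Python. This is a minimal sketch under stated assumptions: the names `build_configuration_graph`, `has_path`, and `simulate` are invented here, `configs` is assumed to be an explicit list of all configurations of M on w, and `yields(c, d)` is assumed to say whether d is a legal next configuration after c. How configurations are encoded is left abstract.

```python
from collections import deque
from typing import Callable, Dict, Hashable, List, Sequence

def build_configuration_graph(
    configs: Sequence[Hashable],
    yields: Callable[[Hashable, Hashable], bool],
) -> Dict[Hashable, List[Hashable]]:
    """Write down every configuration and check every pair for an edge."""
    return {c: [d for d in configs if yields(c, d)] for c in configs}

def has_path(graph: Dict[Hashable, List[Hashable]],
             start: Hashable, accept: Hashable) -> bool:
    """Breadth-first search from start, looking for accept."""
    seen = {start}
    queue = deque([start])
    while queue:
        c = queue.popleft()
        if c == accept:
            return True
        for d in graph[c]:  # one outgoing edge per nondeterministic choice
            if d not in seen:
                seen.add(d)
                queue.append(d)
    return False

def simulate(configs, yields, start, accept) -> bool:
    """Accept exactly when the graph has a path from start to accept."""
    return has_path(build_configuration_graph(configs, yields), start, accept)
```

With polynomially many configurations, checking all pairs and running the search both take polynomial time. The graph itself is polynomial size, so this sketch uses far more than log space, which matches the point that the simulation is a polynomial time algorithm and not a log space one.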
347 00:19:13,910 --> 00:19:19,610 OK, so that tells us that NL is contained within P. 348 00:19:19,610 --> 00:19:23,010 And also, L is contained within NL, as before. 349 00:19:23,010 --> 00:19:26,495 So we have kind of this hierarchy of classes. 350 00:19:29,710 --> 00:19:31,890 Now, you can even talk about, not only is L 351 00:19:31,890 --> 00:19:35,100 different from NL, even is L different from P? 352 00:19:35,100 --> 00:19:37,020 Is it possible that anything that you 353 00:19:37,020 --> 00:19:39,750 can do in polynomial time, you can do [INAUDIBLE] space? 354 00:19:39,750 --> 00:19:42,252 Don't know. 355 00:19:42,252 --> 00:19:42,835 Open question. 356 00:19:49,800 --> 00:19:53,520 OK, getting a very good question here just now. 357 00:19:53,520 --> 00:19:59,220 Why is this construction taking log space? 358 00:19:59,220 --> 00:20:00,610 It doesn't. 359 00:20:00,610 --> 00:20:03,220 This construction takes polynomial time. 360 00:20:03,220 --> 00:20:10,990 This algorithm here is not a polynomial-- 361 00:20:10,990 --> 00:20:13,450 this is not a log space algorithm that I'm giving you. 362 00:20:13,450 --> 00:20:15,340 I'm giving you a polynomial time algorithm 363 00:20:15,340 --> 00:20:18,250 for simulating a nondeterministic log space 364 00:20:18,250 --> 00:20:19,640 machine. 365 00:20:19,640 --> 00:20:22,790 Now, later on-- I don't want to confuse the issue right now. 366 00:20:22,790 --> 00:20:26,450 For this particular slide, all I need to do 367 00:20:26,450 --> 00:20:30,890 is construct that graph in polynomial time. 368 00:20:30,890 --> 00:20:32,330 It's a polynomial size graph. 369 00:20:32,330 --> 00:20:38,270 I can't store that whole graph in a log space memory. 370 00:20:38,270 --> 00:20:39,380 OK, so question here-- 371 00:20:42,220 --> 00:20:45,550 we can see that listing out the nodes and edges 372 00:20:45,550 --> 00:20:47,680 would be polynomial time. 
373 00:20:47,680 --> 00:20:49,502 But how do we actually provide structure 374 00:20:49,502 --> 00:20:50,710 to this graph representation? 375 00:20:50,710 --> 00:20:52,130 I don't even know what that means. 376 00:20:52,130 --> 00:20:53,913 So if you can clarify that for me, 377 00:20:53,913 --> 00:20:55,330 then maybe I can try to answer it. 378 00:20:55,330 --> 00:21:00,640 But a graph is just a list of nodes and a list of edges. 379 00:21:00,640 --> 00:21:03,430 After that, we know what the graph is. 380 00:21:03,430 --> 00:21:06,160 I mean, you may like a picture. 381 00:21:06,160 --> 00:21:09,280 But the machine doesn't need a picture. 382 00:21:09,280 --> 00:21:13,330 Our definition of-- we just represent these things 383 00:21:13,330 --> 00:21:18,390 as strings, in the end. 384 00:21:18,390 --> 00:21:20,750 So please clarify if you want me to-- 385 00:21:20,750 --> 00:21:24,560 I'll answer it at the next pause. 386 00:21:24,560 --> 00:21:26,960 I'm grateful for all the questions, because I'm sure, 387 00:21:26,960 --> 00:21:30,560 any question that any of you have, another 20 of you 388 00:21:30,560 --> 00:21:31,220 also have. 389 00:21:31,220 --> 00:21:34,800 So questions are good. 390 00:21:34,800 --> 00:21:37,460 Don't be bashful. 391 00:21:37,460 --> 00:21:41,840 And also ask the TAs if I become overloaded. 392 00:21:41,840 --> 00:21:46,130 OK, now we're going to shift gears into some new material. 393 00:21:46,130 --> 00:21:47,900 All right. 394 00:21:47,900 --> 00:21:53,900 We're going to talk about the notion that-- 395 00:21:53,900 --> 00:21:57,410 sort of thinking about, analogous to the P versus NP 396 00:21:57,410 --> 00:22:00,920 problem, where there were these NP-complete problems, 397 00:22:00,920 --> 00:22:03,440 now we have the L versus NL problem. 
398 00:22:03,440 --> 00:22:08,120 There are going to be NL-complete problems 399 00:22:08,120 --> 00:22:12,500 that kind of capture the essence of NL the way 400 00:22:12,500 --> 00:22:15,080 NP-complete problems capture the essence of NP, 401 00:22:15,080 --> 00:22:18,230 in a sense, in that all of the problems in NP 402 00:22:18,230 --> 00:22:19,650 are reducible to them. 403 00:22:19,650 --> 00:22:21,770 So they're kind of like the hardest NP problems. 404 00:22:21,770 --> 00:22:24,200 Here, we're going to have exactly analogous situations 405 00:22:24,200 --> 00:22:29,270 for NL, where we're going to show problems 406 00:22:29,270 --> 00:22:33,120 where all other NL problems are reducible to them. 407 00:22:33,120 --> 00:22:35,390 So if you can kind of solve one of them, 408 00:22:35,390 --> 00:22:39,080 like solve one of these NL-complete problems in log 409 00:22:39,080 --> 00:22:42,380 space deterministically, then you solve all of NL problems 410 00:22:42,380 --> 00:22:46,460 in log space deterministically. 411 00:22:46,460 --> 00:22:49,940 So it's a very similar-looking definition 412 00:22:49,940 --> 00:22:51,900 to what we had before. 413 00:22:51,900 --> 00:22:54,230 It's NL complete if it's in NL. 414 00:22:54,230 --> 00:22:59,450 And then all other languages in NL should be reducible. 415 00:22:59,450 --> 00:23:05,510 But now, we have a new notion here, with an L instead of a P. 416 00:23:05,510 --> 00:23:08,690 Now, before, when we talked about NP completeness, 417 00:23:08,690 --> 00:23:11,880 we had polynomial time reducibility. 418 00:23:11,880 --> 00:23:14,880 That's not going to work anymore, 419 00:23:14,880 --> 00:23:20,970 because if you remember, NL is a subset of P. 420 00:23:20,970 --> 00:23:24,870 So all NL languages are polynomial-- 421 00:23:24,870 --> 00:23:28,260 are languages in P. 
And if we're talking about-- 422 00:23:28,260 --> 00:23:30,840 if we use polynomial time reducibility, 423 00:23:30,840 --> 00:23:34,100 all languages in P are reducible to each other. 424 00:23:34,100 --> 00:23:36,670 We need to have a notion of reducibility which is 425 00:23:36,670 --> 00:23:40,190 kind of weaker than the class. 426 00:23:40,190 --> 00:23:43,490 And so polynomial time reducibility just 427 00:23:43,490 --> 00:23:45,980 would not work here, because everything 428 00:23:45,980 --> 00:23:48,170 would become NL complete, because everything 429 00:23:48,170 --> 00:23:49,442 is reducible to each other. 430 00:23:49,442 --> 00:23:50,900 So we need to have a weaker notion. 431 00:23:50,900 --> 00:23:53,080 We're going to use log space reducibility, 432 00:23:53,080 --> 00:23:54,080 which we have to define. 433 00:23:57,220 --> 00:23:59,050 So here, for that, we're going to have 434 00:23:59,050 --> 00:24:01,390 to talk about the notion of a function 435 00:24:01,390 --> 00:24:06,170 that you can compute in log space. 436 00:24:06,170 --> 00:24:08,200 And it's a little tricky here. 437 00:24:08,200 --> 00:24:10,570 Just like when we talked about language recognition 438 00:24:10,570 --> 00:24:12,880 in log space, where we had the work tape 439 00:24:12,880 --> 00:24:15,550 had to be smaller than the input tape, 440 00:24:15,550 --> 00:24:17,140 because the inputs can be large. 441 00:24:17,140 --> 00:24:19,000 The work area is small. 442 00:24:19,000 --> 00:24:22,300 Now the output also could be large 443 00:24:22,300 --> 00:24:24,800 relative to the work area. 444 00:24:24,800 --> 00:24:30,130 So we're going to have a three-tape model, where there's 445 00:24:30,130 --> 00:24:33,400 the input is going to be a read-only, 446 00:24:33,400 --> 00:24:37,000 the output is a write-only-- it's like a printer. 
447 00:24:37,000 --> 00:24:38,740 It's something you can only write on, 448 00:24:38,740 --> 00:24:40,698 but you can't read back, because otherwise, you 449 00:24:40,698 --> 00:24:44,030 could cheat by using the output as a kind of storage. 450 00:24:44,030 --> 00:24:45,880 And then you have your storage area, which 451 00:24:45,880 --> 00:24:48,100 is your read-write work tape. 452 00:24:50,890 --> 00:24:52,570 OK, so this-- we'll call this-- 453 00:24:52,570 --> 00:24:57,070 the traditional name of this is a log space transducer. 454 00:24:57,070 --> 00:24:59,770 So it converts inputs to outputs, 455 00:24:59,770 --> 00:25:03,630 but uses only log space for its working memory. 456 00:25:03,630 --> 00:25:08,740 OK, so the input tape stores n bit-- n symbols. 457 00:25:08,740 --> 00:25:12,360 The work tape stores log n symbols, order log n symbols. 458 00:25:12,360 --> 00:25:14,670 And then we have the output tape. 459 00:25:14,670 --> 00:25:16,380 You may want to think about how big-- 460 00:25:16,380 --> 00:25:21,240 there's going to be a check-in coming to kind of ask you, 461 00:25:21,240 --> 00:25:23,760 how big could the output be? 462 00:25:23,760 --> 00:25:26,790 But we'll save that for the end if you-- 463 00:25:26,790 --> 00:25:31,110 you can mull that over if you want to think ahead. 464 00:25:31,110 --> 00:25:33,320 OK. 465 00:25:33,320 --> 00:25:36,860 So we think of a log space transducer 466 00:25:36,860 --> 00:25:39,680 as computing a function, which is just 467 00:25:39,680 --> 00:25:41,480 a mapping from the input to the output 468 00:25:41,480 --> 00:25:44,486 that the transducer provides for you. 469 00:25:44,486 --> 00:25:50,720 A transducer is a deterministic machine, by the way. 470 00:25:50,720 --> 00:25:55,320 So you take the transducer. 471 00:25:55,320 --> 00:25:57,240 You give it w. 472 00:25:57,240 --> 00:25:59,190 And you turn it on. 473 00:25:59,190 --> 00:26:02,950 And then it halts with f of w on its output tape. 
474 00:26:02,950 --> 00:26:07,545 That's what it means to be computing the function f, OK? 475 00:26:07,545 --> 00:26:13,730 And we'll say that A is log space reducible to B, using 476 00:26:13,730 --> 00:26:17,720 the l subscript symbol on the less than or equal to sign, 477 00:26:17,720 --> 00:26:20,810 if it's mapping reducible to B, but by a reduction 478 00:26:20,810 --> 00:26:22,670 function that's computable in log 479 00:26:22,670 --> 00:26:24,440 space, just the same way we defined 480 00:26:24,440 --> 00:26:26,060 polynomial time reducibility. 481 00:26:26,060 --> 00:26:29,360 But there, we insisted that the reduction function was 482 00:26:29,360 --> 00:26:30,680 computable in polynomial time. 483 00:26:34,640 --> 00:26:35,140 OK. 484 00:26:35,140 --> 00:26:39,900 Just quickly, I got a question again. 485 00:26:39,900 --> 00:26:41,500 Why log space here? 486 00:26:41,500 --> 00:26:46,870 Because polynomial time would be too powerful for doing 487 00:26:46,870 --> 00:26:56,240 reductions internal to P. Every language in P 488 00:26:56,240 --> 00:26:58,580 is reducible to every other language in P. 489 00:26:58,580 --> 00:27:02,030 And so everything in NL would be reducible to everything else 490 00:27:02,030 --> 00:27:05,060 in NL with a polynomial time reducibility. 491 00:27:05,060 --> 00:27:07,430 And so that would not be an interesting notion. 492 00:27:07,430 --> 00:27:12,110 We have to use a weaker notion than that, a weaker 493 00:27:12,110 --> 00:27:17,570 kind of reduction, using a weaker model, 494 00:27:17,570 --> 00:27:20,060 so that you don't get-- 495 00:27:20,060 --> 00:27:24,710 otherwise, the reduction function 496 00:27:24,710 --> 00:27:28,970 would be able to answer whether-- 497 00:27:28,970 --> 00:27:31,617 would be able to solve the problem A itself 498 00:27:31,617 --> 00:27:33,200 if we had a polynomial time reduction, 499 00:27:33,200 --> 00:27:38,900 and we're mapping things from NL to other problems in NL.
500 00:27:38,900 --> 00:27:40,630 The reduction would solve the problem. 501 00:27:40,630 --> 00:27:42,420 And that's not what you want. 502 00:27:42,420 --> 00:27:44,570 The reduction should be constrained only 503 00:27:44,570 --> 00:27:50,430 to be able to do simple transformations on the problem, 504 00:27:50,430 --> 00:27:52,350 not to solve the problem. 505 00:27:52,350 --> 00:27:53,910 Anyway, you have to look at that. 506 00:27:53,910 --> 00:27:58,140 This is an issue that's come up before when we talked about, 507 00:27:58,140 --> 00:27:59,700 what's the right notion of reduction 508 00:27:59,700 --> 00:28:01,390 to use for PSPACE completeness? 509 00:28:01,390 --> 00:28:04,760 Same exact discussion. 510 00:28:04,760 --> 00:28:05,990 OK. 511 00:28:05,990 --> 00:28:08,060 Now, there is an issue here, though, 512 00:28:08,060 --> 00:28:09,470 that we have to be careful of. 513 00:28:09,470 --> 00:28:16,700 When we have A being log space reducible to B, and B in L, 514 00:28:16,700 --> 00:28:17,615 then what you want-- 515 00:28:22,400 --> 00:28:25,640 if A is log space reducible to B and B in L, 516 00:28:25,640 --> 00:28:27,920 then you want A to be in L. That's 517 00:28:27,920 --> 00:28:30,830 the same pattern we've always had for reductions. 518 00:28:30,830 --> 00:28:33,480 If A is reducible to B, and B is easy, then A is easy. 519 00:28:33,480 --> 00:28:37,730 So here, the notion of easy is being an L. Now, 520 00:28:37,730 --> 00:28:41,130 if you remember the proof of that we had from before, 521 00:28:41,130 --> 00:28:43,700 which I'll just put out for you, is 522 00:28:43,700 --> 00:28:57,630 that to show a log space solver for A, you take an input, w. 523 00:28:57,630 --> 00:29:02,280 And now, if A is reducible to B, you compute the reduction. 524 00:29:02,280 --> 00:29:05,640 And then you run the decider for B. 
525 00:29:05,640 --> 00:29:08,430 So if we're assuming A is reducible to B, and B is in L, 526 00:29:08,430 --> 00:29:12,490 so B has a log space decider, you 527 00:29:12,490 --> 00:29:15,650 take your w, which you want to know, is it in A? 528 00:29:15,650 --> 00:29:21,110 You map it over to a B problem using the reduction function. 529 00:29:21,110 --> 00:29:25,480 And then you solve it using the decider for B. 530 00:29:25,480 --> 00:29:27,840 And you give the same answer. 531 00:29:27,840 --> 00:29:30,620 Now, this actually doesn't work anymore, 532 00:29:30,620 --> 00:29:37,780 or it doesn't work in an obvious way, 533 00:29:37,780 --> 00:29:42,590 because, if you're following me-- 534 00:29:42,590 --> 00:29:43,770 I hope most of you are-- 535 00:29:47,502 --> 00:29:50,650 there is a problem here, which-- 536 00:29:50,650 --> 00:30:01,700 because we're trying to give a log space algorithm for A. 537 00:30:01,700 --> 00:30:05,510 And that algorithm is going to be computing this reduction 538 00:30:05,510 --> 00:30:09,660 function, mapping w to f of w. 539 00:30:09,660 --> 00:30:13,020 f of w might itself be very large, 540 00:30:13,020 --> 00:30:15,150 as this picture suggests here. 541 00:30:15,150 --> 00:30:18,780 You may not be able to store f of w 542 00:30:18,780 --> 00:30:30,020 in the log space memory for the machine that's deciding A. 543 00:30:30,020 --> 00:30:37,260 So this is an obstacle that we need 544 00:30:37,260 --> 00:30:41,300 to solve in order to prove this theorem, which we need, 545 00:30:41,300 --> 00:30:43,300 because that's the whole justification for doing 546 00:30:43,300 --> 00:30:46,380 these reducibilities-- 547 00:30:46,380 --> 00:30:48,830 should be a familiar-looking kind of line 548 00:30:48,830 --> 00:30:51,750 to what we've seen before. 549 00:30:51,750 --> 00:30:53,570 So we don't have space to store f of w. 550 00:30:53,570 --> 00:30:54,830 What do we do? 
551 00:30:54,830 --> 00:30:56,675 And I'll also mention that this is 552 00:30:56,675 --> 00:30:58,925 going to be relevant to one of your homework problems. 553 00:31:02,210 --> 00:31:04,120 So what do we do? 554 00:31:04,120 --> 00:31:07,210 We don't have space to store the intermediate result 555 00:31:07,210 --> 00:31:09,130 that we need in order to solve the problem. 556 00:31:09,130 --> 00:31:10,720 We started with w. 557 00:31:10,720 --> 00:31:14,360 Now we'd need to test if f of w is in B. 558 00:31:14,360 --> 00:31:16,040 That can run in log space. 559 00:31:16,040 --> 00:31:18,440 But just simply getting your hands on f of w-- 560 00:31:18,440 --> 00:31:21,840 what do you do about that? 561 00:31:21,840 --> 00:31:30,100 So what we're going to do is the following, is the decider-- 562 00:31:30,100 --> 00:31:38,280 the decider for B, which needs f of w, 563 00:31:38,280 --> 00:31:41,820 because it's deciding if f of w is in B-- 564 00:31:41,820 --> 00:31:45,510 it doesn't need all of f of w sitting there in front of it 565 00:31:45,510 --> 00:31:46,890 all at once. 566 00:31:46,890 --> 00:31:50,620 If you think about how the Turing machine operates 567 00:31:50,620 --> 00:31:54,970 on its input, it only looks at one symbol at a time. 568 00:31:54,970 --> 00:31:57,693 It starts out reading the leftmost symbol of f of w, 569 00:31:57,693 --> 00:31:59,110 then maybe it moves its head right 570 00:31:59,110 --> 00:32:01,240 and moves to the second symbol of f of w, 571 00:32:01,240 --> 00:32:02,590 then the third symbol of f of w. 572 00:32:02,590 --> 00:32:06,245 Maybe it gets up to the 10th symbol of f of w. 573 00:32:06,245 --> 00:32:08,620 Maybe it moves his head back and goes to the ninth symbol 574 00:32:08,620 --> 00:32:09,700 and the eighth symbol. 575 00:32:09,700 --> 00:32:14,230 But the Turing machine's head, which is deciding B, 576 00:32:14,230 --> 00:32:18,490 only looks at one symbol of f of w at a time. 
577 00:32:21,310 --> 00:32:25,760 So instead of writing down all of f of w, 578 00:32:25,760 --> 00:32:31,280 the idea is that we are going to compute 579 00:32:31,280 --> 00:32:36,020 the individual symbols of f of w that we need only 580 00:32:36,020 --> 00:32:39,510 at the moment we need them. 581 00:32:39,510 --> 00:32:45,180 So if the decider for B is reading 582 00:32:45,180 --> 00:32:59,240 the 10th symbol of f of w, we fire up the transducer on w. 583 00:32:59,240 --> 00:33:01,550 And as it's writing out its output, which 584 00:33:01,550 --> 00:33:03,140 we don't have space to store anymore, 585 00:33:03,140 --> 00:33:06,410 we throw away all of the output values 586 00:33:06,410 --> 00:33:08,870 until we get to the 10th one. 587 00:33:08,870 --> 00:33:12,230 And then we say, ah, the 10th one is whatever, 588 00:33:12,230 --> 00:33:14,900 is a c, whatever the value is. 589 00:33:14,900 --> 00:33:19,310 Now we feed that into the decider for B. 590 00:33:19,310 --> 00:33:22,617 We can now simulate that decider for one more step. 591 00:33:22,617 --> 00:33:24,200 Now the decider says, all right, now I 592 00:33:24,200 --> 00:33:27,440 need the 11th symbol of f of w. 593 00:33:27,440 --> 00:33:32,000 OK, now we can run that machine for one more place. 594 00:33:32,000 --> 00:33:33,140 But if it needs-- 595 00:33:33,140 --> 00:33:35,570 but we don't even have to do it. 596 00:33:35,570 --> 00:33:38,720 I think the better way to think about it is, every time 597 00:33:38,720 --> 00:33:41,870 that decider for B needs another symbol, 598 00:33:41,870 --> 00:33:47,510 we start the transducer over again and just keep-- 599 00:33:47,510 --> 00:33:50,270 throw away everything except for that one symbol output 600 00:33:50,270 --> 00:33:53,600 that we need. 
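The recompute-on-demand idea just described can be sketched in a few lines of Python. This is only an illustration under assumptions that are not from the lecture: the reduction f doubles every symbol of w, A is the set of even-length strings, B is the set of strings whose length is divisible by 4, and the names `transducer`, `symbol_at`, `decide_B`, and `decide_A` are made up for the example. The point is that `decide_A` never stores f of w. It keeps only a few counters and reruns the transducer from the start whenever the B decider asks for a symbol.

```python
from typing import Callable, Iterator, Optional


def transducer(w: str) -> Iterator[str]:
    """Hypothetical reduction f: emit every symbol of w twice, one output symbol at a time."""
    for ch in w:
        yield ch
        yield ch


def symbol_at(w: str, i: int) -> Optional[str]:
    """Rerun the transducer from the beginning and keep only output symbol i.

    Everything before position i is thrown away, so the only storage is a counter.
    Returns None when i is past the end of f(w).
    """
    for pos, sym in enumerate(transducer(w)):
        if pos == i:
            return sym
    return None


def decide_B(read: Callable[[int], Optional[str]]) -> bool:
    """Toy decider for B = {x : len(x) is divisible by 4}.

    It touches its input only through read(head), one symbol per step,
    the way a Turing machine head would.
    """
    head = 0
    count_mod_4 = 0
    while read(head) is not None:
        count_mod_4 = (count_mod_4 + 1) % 4
        head += 1
    return count_mod_4 == 0


def decide_A(w: str) -> bool:
    """Decide A = {w : len(w) is even} by running decide_B on a virtual f(w)."""
    return decide_B(lambda i: symbol_at(w, i))
```

Calling `decide_A("ab")` simulates the B decider on the virtual string "aabb" without ever writing that string down. The price is time: each read restarts the transducer, so the running time is roughly squared, which is the slowdown the lecture says we do not care about.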
601 00:33:53,600 --> 00:33:59,500 So every time we do another step of simulating B, 602 00:33:59,500 --> 00:34:03,100 we're going to have to rerun the transducer from the beginning, 603 00:34:03,100 --> 00:34:07,900 just to recompute that, or compute maybe 604 00:34:07,900 --> 00:34:10,840 for the first time, or recompute it if we need it subsequently. 605 00:34:10,840 --> 00:34:12,610 This is going to be slow, but we don't 606 00:34:12,610 --> 00:34:20,380 care, to recompute that symbol that the simulator-- 607 00:34:20,380 --> 00:34:24,580 that the decider for B requires, OK? 608 00:34:24,580 --> 00:34:25,900 So I'm saying that over here. 609 00:34:25,900 --> 00:34:29,730 Recompute the symbols of f of w as needed. 610 00:34:29,730 --> 00:34:31,920 OK, so let me-- let's take a couple of questions. 611 00:34:31,920 --> 00:34:33,712 And then we're going to move to a check-in. 612 00:34:42,790 --> 00:34:45,820 So somebody's asking, why did we have to introduce a transducer 613 00:34:45,820 --> 00:34:47,770 for log space reducibility when we 614 00:34:47,770 --> 00:34:49,870 didn't do it for polynomial time reducibility? 615 00:34:49,870 --> 00:34:52,300 We could have for polynomial time reducibility. 616 00:34:52,300 --> 00:34:55,659 But we didn't need to, because we could just do it all 617 00:34:55,659 --> 00:34:57,190 on the same tape. 618 00:34:57,190 --> 00:35:00,310 The problem is, for log space, the tape is-- 619 00:35:00,310 --> 00:35:05,750 the work tape is too small to hold the input and the output. 620 00:35:05,750 --> 00:35:09,100 So we can't-- since we're only working-- 621 00:35:09,100 --> 00:35:12,040 we have a log n bound that we have to work within. 622 00:35:12,040 --> 00:35:15,460 We need to separate those functions 623 00:35:15,460 --> 00:35:17,950 from the work functions, the input function and the output 624 00:35:17,950 --> 00:35:18,940 function.
625 00:35:18,940 --> 00:35:23,440 So if we have more than the amount of resource we have, 626 00:35:23,440 --> 00:35:25,272 either time or space was at least n, 627 00:35:25,272 --> 00:35:26,980 then we could just lump them all together 628 00:35:26,980 --> 00:35:33,242 and have that one tape do multiple functions. 629 00:35:33,242 --> 00:35:34,700 And somebody's asked me here, yeah, 630 00:35:34,700 --> 00:35:36,980 this is mapping reducibility, this m. 631 00:35:36,980 --> 00:35:42,170 This is from the notion we saw before. 632 00:35:42,170 --> 00:35:44,030 OK. 633 00:35:44,030 --> 00:35:47,420 Does f of w lie on the input tape of B? 634 00:35:47,420 --> 00:35:49,340 Well, yes. 635 00:35:49,340 --> 00:35:53,400 So we are-- good question. 636 00:35:53,400 --> 00:35:55,040 So f of w-- 637 00:35:55,040 --> 00:35:57,620 you know, because what are we doing? 638 00:35:57,620 --> 00:35:59,650 We're trying to find a decider for A 639 00:35:59,650 --> 00:36:04,670 here, using the decider for B and the mapping from A 640 00:36:04,670 --> 00:36:09,534 to B, the reduction from A to B. 641 00:36:09,534 --> 00:36:14,170 So the decider for B expects to find its input 642 00:36:14,170 --> 00:36:16,990 on an input tape. 643 00:36:16,990 --> 00:36:18,760 That input is going to be f of w. 644 00:36:22,440 --> 00:36:26,010 But we have to get the effect of that 645 00:36:26,010 --> 00:36:28,380 without actually writing down that input tape, 646 00:36:28,380 --> 00:36:29,760 because we don't have enough room 647 00:36:29,760 --> 00:36:32,580 to write down the input tape for the decider-- 648 00:36:32,580 --> 00:36:35,730 for the B decider, because that could be very large. 649 00:36:35,730 --> 00:36:38,190 And we only have-- 650 00:36:38,190 --> 00:36:40,850 we have no place to put the f of w. 651 00:36:40,850 --> 00:36:42,350 So think about what's going on here. 
652 00:36:42,350 --> 00:36:44,690 We're making a log space machine whose input 653 00:36:44,690 --> 00:36:49,850 is w, has to compute f of w as an intermediate value, 654 00:36:49,850 --> 00:36:52,930 to feed it into the B decider. 655 00:36:52,930 --> 00:36:58,640 That is not going to be possible to hold onto that whole f of w 656 00:36:58,640 --> 00:36:59,810 at one-- 657 00:36:59,810 --> 00:37:01,400 altogether, because it's too big. 658 00:37:01,400 --> 00:37:02,060 But that doesn't matter. 659 00:37:02,060 --> 00:37:02,768 We don't need it. 660 00:37:02,768 --> 00:37:05,424 We only needed one symbol at a time, which we can recompute. 661 00:37:08,150 --> 00:37:09,650 OK, so let's see. 662 00:37:15,570 --> 00:37:17,310 So somebody says, can we just ensure 663 00:37:17,310 --> 00:37:18,900 that the output tape is order n so we 664 00:37:18,900 --> 00:37:20,778 don't need to use more tape than the input? 665 00:37:20,778 --> 00:37:22,320 Order n is still going to be too big. 666 00:37:22,320 --> 00:37:23,945 Where are you going to put that output? 667 00:37:23,945 --> 00:37:25,770 Even if it's just order n-- first of all, 668 00:37:25,770 --> 00:37:28,080 the answer is no, we can't, because there 669 00:37:28,080 --> 00:37:30,340 are going to be reductions which are bigger than that. 670 00:37:30,340 --> 00:37:33,960 But the other question is, can we just 671 00:37:33,960 --> 00:37:35,910 ensure that the output is order n? 672 00:37:35,910 --> 00:37:38,520 You can't put the output on the input tape. 673 00:37:38,520 --> 00:37:39,990 The input tape is read-only. 674 00:37:39,990 --> 00:37:42,450 The output tape is write-only. 675 00:37:42,450 --> 00:37:45,012 So there's no place to-- 676 00:37:45,012 --> 00:37:46,970 even if the output is just as big as the input, 677 00:37:46,970 --> 00:37:48,650 it doesn't help you. 678 00:37:48,650 --> 00:37:51,170 If the output is only log n, OK, then we could do it. 
679 00:37:51,170 --> 00:37:54,650 But that's not going to be interesting for us. 680 00:37:54,650 --> 00:37:57,110 You're going to need, for these log space 681 00:37:57,110 --> 00:38:01,160 reductions, big outputs, as we'll see in a minute. 682 00:38:04,510 --> 00:38:07,330 What's the running time for this log space reduction? 683 00:38:07,330 --> 00:38:08,680 It's all going to be polynomial. 684 00:38:08,680 --> 00:38:11,000 It's all going to be a log space algorithm. 685 00:38:11,000 --> 00:38:12,777 So it's all going to be polynomial. 686 00:38:15,760 --> 00:38:18,290 Is there any NP completeness reduction 687 00:38:18,290 --> 00:38:21,430 which can be done in log space? 688 00:38:21,430 --> 00:38:23,350 All of the NP-- 689 00:38:23,350 --> 00:38:26,620 all typical NP completeness reductions, 690 00:38:26,620 --> 00:38:28,600 those polynomial time reductions, they all 691 00:38:28,600 --> 00:38:33,650 can be done in log space, because they are-- 692 00:38:33,650 --> 00:38:37,910 reductions tend to be very simple transformations. 693 00:38:37,910 --> 00:38:40,310 And log space is going to be enough to do all of them. 694 00:38:47,920 --> 00:38:49,870 OK. 695 00:38:49,870 --> 00:38:52,010 I can't answer the second part of that. 696 00:38:52,010 --> 00:38:52,970 That's too complicated. 697 00:38:52,970 --> 00:38:56,020 And I think we should move on. 698 00:38:56,020 --> 00:38:57,925 So let's look at the first check-in here. 699 00:39:00,590 --> 00:39:02,680 So if we have a log space transducer that 700 00:39:02,680 --> 00:39:06,940 computes f, and if you feed it inputs of length n, 701 00:39:06,940 --> 00:39:10,180 how big can the outputs be, actually? 702 00:39:14,617 --> 00:39:16,950 So why don't you think about that and give me an answer? 703 00:39:16,950 --> 00:39:21,700 I'll give you a minute to answer this question. 704 00:39:21,700 --> 00:39:24,610 Oh, this is a tough one.
705 00:39:24,610 --> 00:39:28,990 Let me just say up front, there are-- 706 00:39:28,990 --> 00:39:32,680 I struggle with this lecture, because some-- especially the 707 00:39:32,680 --> 00:39:35,620 stuff in the second half, it's kind of hard. 708 00:39:35,620 --> 00:39:38,500 I wouldn't say it's technical. 709 00:39:38,500 --> 00:39:40,900 But conceptually, I think some of the material 710 00:39:40,900 --> 00:39:44,320 is a little harder, maybe in part 711 00:39:44,320 --> 00:39:46,360 because people are not used to thinking 712 00:39:46,360 --> 00:39:49,990 about memory complexity or space complexity, 713 00:39:49,990 --> 00:39:51,420 even though I don't see why-- 714 00:39:51,420 --> 00:39:53,170 I mean, I think it's an important resource 715 00:39:53,170 --> 00:39:55,310 to be considering. 716 00:39:55,310 --> 00:39:56,740 But I think it's less common. 717 00:39:56,740 --> 00:39:59,350 And I think there's some discomfort with that. 718 00:39:59,350 --> 00:40:02,800 OK, so we're just about done here. 719 00:40:02,800 --> 00:40:04,120 Five more seconds, please. 720 00:40:07,260 --> 00:40:10,540 All right, about to wrap. 721 00:40:10,540 --> 00:40:11,950 Wrap the check-in. 722 00:40:11,950 --> 00:40:14,540 1, 2, 3. 723 00:40:14,540 --> 00:40:15,680 All right. 724 00:40:15,680 --> 00:40:18,590 So yes, the correct answer is c. 725 00:40:21,830 --> 00:40:25,850 As I mentioned, we're going to want 726 00:40:25,850 --> 00:40:31,360 to have outputs that are larger than log n. 727 00:40:31,360 --> 00:40:33,580 And there's no reason why they wouldn't 728 00:40:33,580 --> 00:40:36,550 be able to be larger than log n, according to the definition 729 00:40:36,550 --> 00:40:37,660 that I gave you. 730 00:40:37,660 --> 00:40:40,090 There's no bound on the output. 731 00:40:40,090 --> 00:40:44,830 We're only measuring the running space of this algorithm 732 00:40:44,830 --> 00:40:47,170 in terms of its work tape. 
733 00:40:47,170 --> 00:40:51,200 The input and output tapes don't count. 734 00:40:51,200 --> 00:40:52,680 So they can be more than log n. 735 00:40:52,680 --> 00:40:53,680 They can be more than n. 736 00:40:53,680 --> 00:40:55,190 Polynomial is the right answer. 737 00:40:55,190 --> 00:40:55,900 Why? 738 00:40:55,900 --> 00:41:02,260 Because a log space transducer, if you just ignore the output, 739 00:41:02,260 --> 00:41:04,180 is just an ordinary log space machine. 740 00:41:04,180 --> 00:41:07,930 And it can only run for a polynomial number of steps 741 00:41:07,930 --> 00:41:10,090 without ending up going into a loop. 742 00:41:10,090 --> 00:41:11,860 The same argument that we gave for that 743 00:41:11,860 --> 00:41:15,170 before applies here as well. 744 00:41:15,170 --> 00:41:17,770 So if it's going to exceed a polynomial number of steps, 745 00:41:17,770 --> 00:41:19,730 it's never going to halt. 746 00:41:19,730 --> 00:41:21,680 And so that's going to be-- 747 00:41:21,680 --> 00:41:24,750 not allowed-- it's got to halt with the output on the output 748 00:41:24,750 --> 00:41:25,250 tape. 749 00:41:25,250 --> 00:41:33,320 And so it'll be disqualified as a log space transducer 750 00:41:33,320 --> 00:41:37,610 if it doesn't halt. So it can't be anything longer 751 00:41:37,610 --> 00:41:39,290 than polynomial. 752 00:41:39,290 --> 00:41:43,340 It's a good thing to think about, to understand. 753 00:41:43,340 --> 00:41:45,940 OK, so let's continue. 754 00:41:45,940 --> 00:41:48,960 So we're going to show that the PATH problem is NL complete. 755 00:41:48,960 --> 00:41:51,420 Now, we defined NL completeness. 756 00:41:51,420 --> 00:41:56,070 And we've seen the PATH problem before. 757 00:41:56,070 --> 00:41:58,980 And we're now going to show that PATH occupies a very 758 00:41:58,980 --> 00:42:02,100 special position for NL, namely that it's 759 00:42:02,100 --> 00:42:03,555 an NL-complete problem.
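Going back to the check-in answer for a moment, the polynomial bound on the output can be written out as a count of configurations. This is a sketch of the standard counting argument, with symbols that are assumptions of this note and not the lecture's notation: Q is the state set, Γ the work tape alphabet, and c log n the work tape length.

```latex
\[
\#\text{configurations}
  \;\le\; \underbrace{|Q|}_{\text{state}}
  \cdot \underbrace{n}_{\text{input head}}
  \cdot \underbrace{c\log n}_{\text{work head}}
  \cdot \underbrace{|\Gamma|^{\,c\log n}}_{\text{work tape contents}}
  \;=\; |Q| \cdot n \cdot c\log n \cdot n^{\,c\log|\Gamma|}
  \;=\; n^{O(1)} .
\]
```

A halting deterministic machine never repeats a configuration, and each step writes at most one output symbol, so the length of f of w is at most the number of configurations, which is polynomial in n.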
760 00:42:06,880 --> 00:42:11,880 So if you can solve the PATH problem deterministically 761 00:42:11,880 --> 00:42:14,775 in log space, you have gotten a big result. 762 00:42:14,775 --> 00:42:16,350 No one knows how to do that. 763 00:42:16,350 --> 00:42:20,430 And it would collapse all of NL down 764 00:42:20,430 --> 00:42:23,940 to log space if you could do PATH in log space 765 00:42:23,940 --> 00:42:27,202 deterministically. 766 00:42:27,202 --> 00:42:28,410 OK, so let's see why that is. 767 00:42:28,410 --> 00:42:30,660 So first of all, the two components of being complete 768 00:42:30,660 --> 00:42:33,850 are being in the language and the reduction part. 769 00:42:33,850 --> 00:42:36,480 So in the language, we've shown already. 770 00:42:36,480 --> 00:42:40,710 Now, we want to show that for any other language in NL, 771 00:42:40,710 --> 00:42:45,210 it's going to be log space reducible to PATH. 772 00:42:45,210 --> 00:42:51,710 In a certain sense, this may not feel so surprising, 773 00:42:51,710 --> 00:43:00,360 thinking back to our proof that NL is a subset of P, 774 00:43:00,360 --> 00:43:05,950 because we managed to convert any NL 775 00:43:05,950 --> 00:43:12,880 machine, the running of any NL machine, to a PATH problem 776 00:43:12,880 --> 00:43:17,150 that the polynomial time machine then solved. 777 00:43:17,150 --> 00:43:23,220 And so it's really the same idea that says that PATH really 778 00:43:23,220 --> 00:43:31,800 captures any NL machine. 779 00:43:31,800 --> 00:43:34,170 The computation of any NL machine 780 00:43:34,170 --> 00:43:37,350 really can be seen as a PATH problem, where 781 00:43:37,350 --> 00:43:41,200 the nodes are the configurations of the machine. 782 00:43:41,200 --> 00:43:42,360 So let's just see how-- 783 00:43:42,360 --> 00:43:44,370 let me just try to go through that if that 784 00:43:44,370 --> 00:43:48,480 wasn't super clear, which I'm not sure it was. 
785 00:43:48,480 --> 00:43:51,450 So suppose we have a machine decided by-- 786 00:43:51,450 --> 00:43:55,290 a language decided by a nondeterministic-- an NL 787 00:43:55,290 --> 00:44:00,048 machine, a nondeterministic machine in log space. 788 00:44:00,048 --> 00:44:01,590 Again, I should have put this before. 789 00:44:01,590 --> 00:44:04,842 But we're going to modify M to erase its work tape 790 00:44:04,842 --> 00:44:06,800 and move its head to the left end on accepting. 791 00:44:06,800 --> 00:44:11,600 So it has a unique accepting configuration. 792 00:44:14,830 --> 00:44:19,510 Now I'm going to give you the log space reduction that 793 00:44:19,510 --> 00:44:26,980 maps our language A, which is in NL, to the PATH language. 794 00:44:26,980 --> 00:44:28,570 So thinking about what that means, 795 00:44:28,570 --> 00:44:33,250 I'm going to take an input, w, which may or may not be in A, 796 00:44:33,250 --> 00:44:39,320 and produce for you a graph with a start and target node, start 797 00:44:39,320 --> 00:44:46,130 and target nodes, where w is going to be in the language 798 00:44:46,130 --> 00:44:50,770 if and only if G has a path from s to t. 799 00:44:50,770 --> 00:44:52,770 And what do you think that graph is going to be? 800 00:44:55,570 --> 00:44:57,430 That's going to be the configuration 801 00:44:57,430 --> 00:45:06,300 graph for the machine that decides A, OK? 802 00:45:06,300 --> 00:45:12,840 So that is how it's going to look. 803 00:45:12,840 --> 00:45:14,090 So maybe here's a picture. 804 00:45:14,090 --> 00:45:15,080 Right. 805 00:45:15,080 --> 00:45:22,170 So f of w, where w, again, is your problem about membership 806 00:45:22,170 --> 00:45:24,570 in A, is going to become a problem 807 00:45:24,570 --> 00:45:25,900 about membership in PATH. 808 00:45:25,900 --> 00:45:29,250 And it's just going to be the configuration graph for M on w.
809 00:45:31,950 --> 00:45:36,940 Now, what's left is to show that we can do this conversion 810 00:45:36,940 --> 00:45:38,650 with a log space transducer. 811 00:45:38,650 --> 00:45:42,650 So it's a log space computable reduction. 812 00:45:45,200 --> 00:45:49,210 So let's just try to go through that quickly-- 813 00:45:49,210 --> 00:45:52,960 conceptually, not super hard. 814 00:45:52,960 --> 00:45:55,338 So here's our transducer. 815 00:45:55,338 --> 00:45:57,130 Let's just think about what it needs to do. 816 00:45:57,130 --> 00:46:00,670 It needs to take an input, w, and convert that f 817 00:46:00,670 --> 00:46:05,580 of w to this thing here-- 818 00:46:05,580 --> 00:46:07,740 computation graph of M on w-- 819 00:46:07,740 --> 00:46:10,620 the configuration graph M on w, the start 820 00:46:10,620 --> 00:46:11,950 and accept configuration. 821 00:46:11,950 --> 00:46:14,320 So that's going to look like this down here. 822 00:46:14,320 --> 00:46:17,700 That's what we want to eventually appear 823 00:46:17,700 --> 00:46:18,690 on the output tape. 824 00:46:26,350 --> 00:46:29,170 So the way we're going to achieve that-- we only 825 00:46:29,170 --> 00:46:33,610 have a small log space, order log space work tape. 826 00:46:33,610 --> 00:46:36,520 And the way we're going to be able to produce this output 827 00:46:36,520 --> 00:46:37,120 is-- 828 00:46:37,120 --> 00:46:43,900 the configuration graph is just a series of edges, which are-- 829 00:46:43,900 --> 00:46:45,790 say, you can go from this configuration 830 00:46:45,790 --> 00:46:47,690 to that configuration in one step. 
831 00:46:47,690 --> 00:46:50,590 So what we're going to do is, on our work tape, 832 00:46:50,590 --> 00:46:53,770 we're going to go through all possible pairs 833 00:46:53,770 --> 00:46:58,570 of configurations, again, just in some like odometer order, 834 00:46:58,570 --> 00:47:01,930 just by looking at all possible strings, really, of length 835 00:47:01,930 --> 00:47:03,910 order log n that are big enough to represent 836 00:47:03,910 --> 00:47:04,815 two configurations. 837 00:47:04,815 --> 00:47:06,190 Every once in a while, it's going 838 00:47:06,190 --> 00:47:08,530 to be actually a pair of configurations. 839 00:47:08,530 --> 00:47:11,920 At that point, we look at those two configurations, look at M, 840 00:47:11,920 --> 00:47:15,040 and see, can this configuration go to that configuration? 841 00:47:15,040 --> 00:47:17,440 If yes, you print it out on the output tape. 842 00:47:17,440 --> 00:47:22,058 If no, you just move on to the next pair of configurations. 843 00:47:22,058 --> 00:47:24,100 And then, at the end, you write down the start 844 00:47:24,100 --> 00:47:25,390 and accept configurations. 845 00:47:25,390 --> 00:47:27,620 So I've indicated that here. 846 00:47:27,620 --> 00:47:29,050 Here is the transducer. 847 00:47:29,050 --> 00:47:33,310 It says, on input w, for all pairs of configurations, that-- 848 00:47:33,310 --> 00:47:35,890 now, this is getting written down on the work tape-- 849 00:47:35,890 --> 00:47:38,650 you output those pairs which are legal moves for M. 850 00:47:38,650 --> 00:47:40,900 And then finally, you output the start and the accept. 851 00:47:40,900 --> 00:47:43,030 That's it. 852 00:47:43,030 --> 00:47:44,590 So let's just see. 853 00:47:44,590 --> 00:47:45,925 Let me take any questions here. 854 00:47:49,140 --> 00:47:52,200 Why do we need a special accept state for M? 855 00:47:52,200 --> 00:47:53,520 Well, we want to have-- 856 00:47:53,520 --> 00:47:55,680 I think you mean accepting configuration.
857 00:47:55,680 --> 00:47:56,820 I just want to have a-- 858 00:47:56,820 --> 00:47:58,470 I don't want to have a multiplicity 859 00:47:58,470 --> 00:48:00,600 of different possible accepting configurations, 860 00:48:00,600 --> 00:48:02,658 because then it's not really a PATH problem. 861 00:48:02,658 --> 00:48:04,950 Then it becomes a question of, can I get from the start 862 00:48:04,950 --> 00:48:08,520 to one of those nodes representing 863 00:48:08,520 --> 00:48:10,290 accepting configurations? 864 00:48:10,290 --> 00:48:11,610 That's a little messy. 865 00:48:11,610 --> 00:48:12,540 I could fix it. 866 00:48:12,540 --> 00:48:14,550 But the simplest fix is just to make 867 00:48:14,550 --> 00:48:19,760 there be a single accepting configuration. 868 00:48:19,760 --> 00:48:21,900 Well, why do I output start and accept 869 00:48:21,900 --> 00:48:23,150 at the end of the output tape? 870 00:48:23,150 --> 00:48:26,480 That's the way I write down my PATH problem. 871 00:48:26,480 --> 00:48:30,210 It's a graph, followed by a start node and a target node. 872 00:48:30,210 --> 00:48:34,550 So I have to follow that form. 873 00:48:34,550 --> 00:48:36,080 I'm not sure what you're asking. 874 00:48:36,080 --> 00:48:38,180 You want me to put that first? 875 00:48:38,180 --> 00:48:39,200 I'm not sure what the-- 876 00:48:39,200 --> 00:48:41,270 or why at all? 877 00:48:41,270 --> 00:48:43,988 Because it has to be a-- 878 00:48:43,988 --> 00:48:44,820 here it is. 879 00:48:44,820 --> 00:48:46,237 Here's the output I'm looking for. 880 00:48:49,110 --> 00:48:49,610 OK. 881 00:48:53,360 --> 00:48:57,560 Do the three-- do the read-write work 882 00:48:57,560 --> 00:49:00,230 tape here store pointers to configuration 883 00:49:00,230 --> 00:49:01,670 or some sort of counter? 884 00:49:01,670 --> 00:49:04,670 No, they store the actual configuration. 885 00:49:04,670 --> 00:49:08,883 The configuration for M is-- 886 00:49:08,883 --> 00:49:10,050 just think about what it is. 
887 00:49:10,050 --> 00:49:13,250 It's a log space size object. 888 00:49:13,250 --> 00:49:15,830 It's a tape for M. It's a location 889 00:49:15,830 --> 00:49:19,780 of its heads and its state. 890 00:49:19,780 --> 00:49:22,270 So you could kind of write down that stuff right over here, 891 00:49:22,270 --> 00:49:24,160 on the left side of this-- 892 00:49:24,160 --> 00:49:25,162 this left slot. 893 00:49:25,162 --> 00:49:26,620 And on the right slot, you're going 894 00:49:26,620 --> 00:49:29,280 to write another configuration for M on w. 895 00:49:32,250 --> 00:49:36,106 And you're going to just put the edges in accordingly. 896 00:49:39,840 --> 00:49:41,700 OK, so somebody-- did that help? 897 00:49:41,700 --> 00:49:44,400 Somebody, again, is asking, why is the configuration only log 898 00:49:44,400 --> 00:49:45,670 space? 899 00:49:45,670 --> 00:49:46,590 It's just a tape. 900 00:49:46,590 --> 00:49:49,260 It's a log space tape. 901 00:49:49,260 --> 00:49:53,230 That's the main thing in the configuration of the tape. 902 00:49:53,230 --> 00:49:56,365 On the read-write work tape, do we only 903 00:49:56,365 --> 00:49:57,740 write two configurations at once? 904 00:49:57,740 --> 00:49:58,240 Yeah. 905 00:49:58,240 --> 00:50:01,100 We're just writing down a candidate edge 906 00:50:01,100 --> 00:50:03,100 that we're going to output onto the output tape. 907 00:50:03,100 --> 00:50:05,018 So that's why we have two configurations. 908 00:50:05,018 --> 00:50:07,060 I want to know, can I get from this configuration 909 00:50:07,060 --> 00:50:08,110 to that configuration? 910 00:50:08,110 --> 00:50:10,930 If yes, I print it out, print out that pair. 911 00:50:10,930 --> 00:50:14,650 That's an edge in my configuration 912 00:50:14,650 --> 00:50:18,440 graph, which is what I'm supposed to be outputting here. 913 00:50:18,440 --> 00:50:22,300 Can there be multiple-- 914 00:50:22,300 --> 00:50:25,990 OK, why don't we move on? 
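A minimal sketch of the transducer described above, under simplifying assumptions. The names `candidate_strings`, `is_config`, and `yields` are hypothetical stand-ins: a real transducer would check candidate pairs against the transition function of M, while the toy machine here is just a 2-bit counter. The generator plays the role of the write-only output tape, and only one candidate pair (a, b) is held at any moment, mirroring the two slots on the work tape.

```python
from itertools import product

def candidate_strings(length, alphabet="01"):
    """Enumerate all strings of the given length in odometer order."""
    for chars in product(alphabet, repeat=length):
        yield "".join(chars)

def config_graph_transducer(length, is_config, yields, start, accept):
    """Stream out <G, s, t>: every legal edge, then the start and target nodes."""
    for a in candidate_strings(length):          # left slot on the work tape
        for b in candidate_strings(length):      # right slot on the work tape
            if is_config(a) and is_config(b) and yields(a, b):
                yield ("edge", a, b)             # print the pair on the output tape
    yield ("start", start)
    yield ("target", accept)

# Toy stand-in for M (an assumption, not from the lecture): configurations are
# 2-bit strings other than "11", and one step increments the binary value.
def toy_is_config(s):
    return s != "11"

def toy_yields(a, b):
    return int(b, 2) == int(a, 2) + 1

output = list(config_graph_transducer(2, toy_is_config, toy_yields, "00", "10"))
```

Most candidate pairs are discarded; only the pairs that are legal moves reach the output, followed by the start and accept configurations.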
915 00:50:25,990 --> 00:50:27,790 Again, direct questions to our TAs, 916 00:50:27,790 --> 00:50:32,110 who would be more than happy to help you. 917 00:50:32,110 --> 00:50:34,480 And we will-- let me just quickly 918 00:50:34,480 --> 00:50:38,170 give-- we're running a little tight here time-wise. 919 00:50:38,170 --> 00:50:40,210 But let's just see. 920 00:50:40,210 --> 00:50:43,090 Here's an example of showing some other problem is 921 00:50:43,090 --> 00:50:44,200 NL complete. 922 00:50:44,200 --> 00:50:45,700 You have a homework problem on that. 923 00:50:45,700 --> 00:50:47,575 So I thought I wanted to give you an example. 924 00:50:47,575 --> 00:50:50,410 Maybe we can just defer this to the recitation. 925 00:50:50,410 --> 00:50:54,320 So maybe we'll try to do this a little quickly to save us 926 00:50:54,320 --> 00:50:54,820 on time. 927 00:50:54,820 --> 00:50:59,650 But the 2SAT problem, which is just like the 3SAT problem, 928 00:50:59,650 --> 00:51:02,530 except with two literals per clause-- 929 00:51:02,530 --> 00:51:05,540 curiously, the complement of that problem, 930 00:51:05,540 --> 00:51:10,750 so the unsatisfiable formulas, that 931 00:51:10,750 --> 00:51:14,620 form an NL-complete language. 932 00:51:14,620 --> 00:51:17,110 And so first of all, you have to show it's in NL. 933 00:51:17,110 --> 00:51:18,250 We're not going to do that. 934 00:51:18,250 --> 00:51:19,125 It's a nice exercise. 935 00:51:19,125 --> 00:51:22,990 It's not totally trivial to do. 936 00:51:22,990 --> 00:51:28,820 But you might want to try that. 937 00:51:28,820 --> 00:51:32,860 We're going to show that PATH is reducible to the complement 938 00:51:32,860 --> 00:51:35,170 of 2SAT. 939 00:51:35,170 --> 00:51:38,920 We've got to give a reduction that converts graphs 940 00:51:38,920 --> 00:51:43,060 to formulas, where there is a PATH, now, when the formula is 941 00:51:43,060 --> 00:51:44,740 unsatisfied. 
942 00:51:44,740 --> 00:51:51,160 And what's going to happen is the PATH 943 00:51:51,160 --> 00:51:56,740 is going to correspond to a sequence of implications 944 00:51:56,740 --> 00:52:00,670 in the formula, which yields a contradiction 945 00:52:00,670 --> 00:52:04,790 and forces it to be unsatisfied. 946 00:52:04,790 --> 00:52:08,090 Again, this is going to come a little fast. 947 00:52:08,090 --> 00:52:11,420 And then maybe we can discuss it over the break, which is next. 948 00:52:11,420 --> 00:52:16,520 So every node in G is going to have associated 949 00:52:16,520 --> 00:52:19,200 variable in the formula. 950 00:52:19,200 --> 00:52:22,250 So there's a variable for every one of the nodes. 951 00:52:22,250 --> 00:52:24,050 For every edge, there's going to be 952 00:52:24,050 --> 00:52:28,130 a clause of implication connecting those two 953 00:52:28,130 --> 00:52:29,190 associated nodes. 954 00:52:29,190 --> 00:52:31,550 So if there's an edge from u to v, 955 00:52:31,550 --> 00:52:35,000 then there's going to be an implication in the formula that 956 00:52:35,000 --> 00:52:40,190 says, if xu is true, then xv is true. 957 00:52:40,190 --> 00:52:43,880 And note that that's equivalent to the more conventional way 958 00:52:43,880 --> 00:52:48,750 [INAUDIBLE] xu complement or xv. 959 00:52:48,750 --> 00:52:50,140 These are logically equivalent. 960 00:52:50,140 --> 00:52:55,170 So I'm not cheating you here in terms of being a 2SAT problem. 961 00:52:55,170 --> 00:52:56,920 They really just look like this. 962 00:52:56,920 --> 00:53:02,050 And lastly, I'm going to put two additional clauses. 963 00:53:02,050 --> 00:53:06,180 It's [INAUDIBLE] x for the start variable-- 964 00:53:06,180 --> 00:53:08,640 from the start node, s-- 965 00:53:08,640 --> 00:53:10,140 here, s. 966 00:53:10,140 --> 00:53:13,180 I want to force that one to be true. 
967 00:53:13,180 --> 00:53:16,320 So it's x-- since I want to have exactly two per clause, 968 00:53:16,320 --> 00:53:17,970 that's xs or xs. 969 00:53:17,970 --> 00:53:20,790 So that forces x-- 970 00:53:20,790 --> 00:53:23,760 that variable true. 971 00:53:23,760 --> 00:53:31,620 And lastly, if t is true, that's going to force the-- 972 00:53:31,620 --> 00:53:35,880 if xt is true, that's going to force xs to be false. 973 00:53:40,120 --> 00:53:44,310 So now, if there's actually a path in the graph that 974 00:53:44,310 --> 00:53:47,150 goes from s to t, there's going to be 975 00:53:47,150 --> 00:53:50,690 a sequence of implications, starting now with s being true, 976 00:53:50,690 --> 00:53:53,750 forcing other things being true, including 977 00:53:53,750 --> 00:53:57,740 forcing t to be true, which then forces s to be false. 978 00:53:57,740 --> 00:54:00,920 And that's our contradiction, which shows that the formula 979 00:54:00,920 --> 00:54:02,000 cannot be satisfied. 980 00:54:05,020 --> 00:54:08,860 So now, you have to prove that this works. 981 00:54:08,860 --> 00:54:12,300 As I said, for the forward direction, if there is a path, 982 00:54:12,300 --> 00:54:14,880 you follow the implications to get a contradiction. 983 00:54:14,880 --> 00:54:19,260 For the reverse-- let me not spend time here. 984 00:54:19,260 --> 00:54:22,470 I'll leave this to you to think about offline. 985 00:54:22,470 --> 00:54:24,390 But if there is no path, there is 986 00:54:24,390 --> 00:54:29,010 a way of assigning the variables to true and false 987 00:54:29,010 --> 00:54:33,280 to make a satisfying assignment to the formula. 988 00:54:33,280 --> 00:54:35,810 So that gives the other direction, OK? 989 00:54:38,660 --> 00:54:41,250 And you can show it's computable in log space. 990 00:54:41,250 --> 00:54:44,840 That's very simple, because a very simple transformation 991 00:54:44,840 --> 00:54:46,410 there, OK? 
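The reduction from PATH to the complement of 2SAT can be sketched as follows. This is an illustrative version, not a log-space implementation: the literal encoding `(variable, polarity)` and the brute-force `satisfiable` checker are assumptions added only so the construction can be tried on small graphs.

```python
from itertools import product

def path_to_2cnf(edges, s, t):
    """Build the 2CNF clauses for graph edges, start s, and target t."""
    # Edge u -> v becomes the implication x_u -> x_v, i.e. (not x_u or x_v).
    clauses = [((u, False), (v, True)) for (u, v) in edges]
    clauses.append(((s, True), (s, True)))     # (x_s or x_s) forces x_s true
    clauses.append(((t, False), (s, False)))   # x_t -> not x_s
    return clauses

def satisfiable(nodes, clauses):
    """Brute-force check, exponential in the number of nodes (tiny inputs only)."""
    for values in product([False, True], repeat=len(nodes)):
        assignment = dict(zip(nodes, values))
        if all(any(assignment[var] == pol for var, pol in clause)
               for clause in clauses):
            return True
    return False

nodes = ["s", "a", "t"]
with_path = path_to_2cnf([("s", "a"), ("a", "t")], "s", "t")
without_path = path_to_2cnf([("s", "a"), ("t", "a")], "s", "t")
```

With a path from s to t, the chain of implications forces x_t true and then x_s false, so `satisfiable(nodes, with_path)` is False. Without a path, setting exactly the nodes reachable from s to true satisfies every clause, so `satisfiable(nodes, without_path)` is True.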
992 00:54:46,410 --> 00:54:50,015 So I think we're going to move on to the break. 993 00:54:53,390 --> 00:54:59,175 And I'm happy to take questions at this point about this. 994 00:55:02,020 --> 00:55:05,740 Does the configuration, going back, include the input? 995 00:55:05,740 --> 00:55:06,400 No. 996 00:55:06,400 --> 00:55:08,110 The configuration does not-- 997 00:55:08,110 --> 00:55:11,320 as I said, the configuration for M on w 998 00:55:11,320 --> 00:55:17,700 is the state, the head positions, and the work tape 999 00:55:17,700 --> 00:55:20,370 contents, not the input tape, because then you would be-- 1000 00:55:23,630 --> 00:55:25,250 it's not there for a reason. 1001 00:55:25,250 --> 00:55:26,000 The input is huge. 1002 00:55:28,520 --> 00:55:30,047 But you don't need the input there, 1003 00:55:30,047 --> 00:55:32,380 because the input is going to be constant for everybody. 1004 00:55:32,380 --> 00:55:35,230 Everybody can look at that input, which is a fixed, sort 1005 00:55:35,230 --> 00:55:36,310 of external thing. 1006 00:55:39,810 --> 00:55:45,940 Somebody's asking me, are there NP-complete problems in-- 1007 00:55:45,940 --> 00:55:49,990 there are definitely NP-complete [INAUDIBLE].. 1008 00:55:49,990 --> 00:55:57,740 I don't know-- there are some problems in number theory where 1009 00:55:57,740 --> 00:55:58,670 it's-- 1010 00:55:58,670 --> 00:56:00,950 like factoring, where we don't know the status, 1011 00:56:00,950 --> 00:56:05,300 somewhere between P and NP, formulated 1012 00:56:05,300 --> 00:56:06,530 as a language, of course. 1013 00:56:06,530 --> 00:56:10,580 But there are problems in solving 1014 00:56:10,580 --> 00:56:17,670 certain kinds of equations, low-degree equations, 1015 00:56:17,670 --> 00:56:22,050 that I don't remember now if [INAUDIBLE] actually 1016 00:56:22,050 --> 00:56:25,050 known to be NP complete. 1017 00:56:25,050 --> 00:56:27,895 Now, you asked about NL complete [INAUDIBLE].. 
1018 00:56:27,895 --> 00:56:30,020 I don't know if there are NL-complete number theory 1019 00:56:30,020 --> 00:56:31,700 problems. 1020 00:56:31,700 --> 00:56:32,480 Oh, good question. 1021 00:56:32,480 --> 00:56:34,820 Somebody's asking me, does NL also 1022 00:56:34,820 --> 00:56:37,280 have an alternative definition using 1023 00:56:37,280 --> 00:56:39,380 certificates or witnesses? 1024 00:56:39,380 --> 00:56:41,940 Yeah. 1025 00:56:41,940 --> 00:56:44,300 Yes, sort of. 1026 00:56:51,530 --> 00:56:54,650 For NL, you can make a certificate, which is, again, 1027 00:56:54,650 --> 00:56:57,170 polynomial size certificate. 1028 00:56:57,170 --> 00:57:00,020 But it has to be-- you're only allowed 1029 00:57:00,020 --> 00:57:04,950 to read it with a one-way head. 1030 00:57:04,950 --> 00:57:10,590 So it's like a one-way certificate. 1031 00:57:10,590 --> 00:57:13,380 So it has to be-- you can only process it in a certain way. 1032 00:57:13,380 --> 00:57:16,100 That's a nice exercise, actually, itself. 1033 00:57:16,100 --> 00:57:18,780 But anyway, let us-- 1034 00:57:18,780 --> 00:57:24,230 we are now done. 1035 00:57:24,230 --> 00:57:25,730 And we're going to move back. 1036 00:57:25,730 --> 00:57:27,080 We're going to continue. 1037 00:57:27,080 --> 00:57:28,355 So everybody return. 1038 00:57:31,340 --> 00:57:37,770 This is what's next on the agenda, 1039 00:57:37,770 --> 00:57:40,200 proving that NL equals coNL. 1040 00:57:40,200 --> 00:57:41,310 This is a hard proof. 1041 00:57:44,540 --> 00:57:47,090 I'm going to try to break it down as much as I can. 1042 00:57:47,090 --> 00:57:51,350 And let's hope you get-- 1043 00:57:53,852 --> 00:57:55,400 I hope you get it. 1044 00:57:55,400 --> 00:57:58,460 I'll try to be as helpful as I can. 1045 00:57:58,460 --> 00:57:59,690 OK. 1046 00:57:59,690 --> 00:58:04,640 But if you're finding it tough, you won't be alone. 
1047 00:58:04,640 --> 00:58:11,235 So first of all, we're going to show-- 1048 00:58:11,235 --> 00:58:12,610 the way we're going to solve this 1049 00:58:12,610 --> 00:58:15,580 is by showing that the complement of PATH 1050 00:58:15,580 --> 00:58:21,190 is solvable in NL, because the complement of PATH is-- 1051 00:58:21,190 --> 00:58:24,490 just as PATH is complete for NL, the complement 1052 00:58:24,490 --> 00:58:27,610 is complete for coNL. 1053 00:58:27,610 --> 00:58:31,690 And so by doing that problem in NL, 1054 00:58:31,690 --> 00:58:34,390 we're going to reduce all of-- all of coNL 1055 00:58:34,390 --> 00:58:37,390 will be reducible to problems in NL. 1056 00:58:37,390 --> 00:58:39,640 And so therefore, we'll be in NL. coNL 1057 00:58:39,640 --> 00:58:41,140 will be then inside NL. 1058 00:58:41,140 --> 00:58:44,980 And then NL is going to be equal to coNL. 1059 00:58:44,980 --> 00:58:50,950 If that sequence of logical connections is not clear, 1060 00:58:50,950 --> 00:58:51,580 don't worry. 1061 00:58:54,270 --> 00:58:58,280 The point is that we want [INAUDIBLE] go back and figure 1062 00:58:58,280 --> 00:58:59,690 out why that's enough later. 1063 00:58:59,690 --> 00:59:02,420 But what this means is we want to give 1064 00:59:02,420 --> 00:59:05,870 a nondeterministic machine, which 1065 00:59:05,870 --> 00:59:12,830 will accept when there is no path from s to t. 1066 00:59:12,830 --> 00:59:13,330 OK? 1067 00:59:15,910 --> 00:59:18,010 And please don't say, why don't we 1068 00:59:18,010 --> 00:59:21,520 just take the machine for PATH and flip the answer? 1069 00:59:21,520 --> 00:59:24,640 You can't do that with a nondeterministic machine. 1070 00:59:24,640 --> 00:59:27,580 So you better-- if you're thinking that that's allowed, 1071 00:59:27,580 --> 00:59:30,950 go back and review nondeterminism.
1072 00:59:30,950 --> 00:59:33,490 So you want to make a nondeterministic machine, which 1073 00:59:33,490 --> 00:59:35,200 is going to accept when there's no path. 1074 00:59:35,200 --> 00:59:38,110 So some branch is going to make a sequence of guesses. 1075 00:59:38,110 --> 00:59:40,720 And it has to be sure that there's no path. 1076 00:59:40,720 --> 00:59:42,520 And then it's going to be-- 1077 00:59:42,520 --> 00:59:44,980 and then it can accept when there's no path. 1078 00:59:44,980 --> 00:59:47,440 Now, if you can find a way of like making a separator, 1079 00:59:47,440 --> 00:59:51,190 something that cuts the graph in half and separates s from t, 1080 00:59:51,190 --> 00:59:53,180 then you would be good. 1081 00:59:53,180 --> 00:59:59,290 The only problem is there's no obvious way of doing that, 1082 00:59:59,290 --> 01:00:01,270 because those kind of separators, 1083 01:00:01,270 --> 01:00:03,730 even if they were [INAUDIBLE] probably too 1084 01:00:03,730 --> 01:00:07,720 big to write down in log space. 1085 01:00:07,720 --> 01:00:10,420 So I'm going to give you a completely different way 1086 01:00:10,420 --> 01:00:11,930 of doing it. 1087 01:00:11,930 --> 01:00:18,470 And I'm going to make-- this is a little different presentation 1088 01:00:18,470 --> 01:00:19,550 than what's in the book. 1089 01:00:19,550 --> 01:00:22,790 I think hopefully, this is a little longer, and therefore, 1090 01:00:22,790 --> 01:00:25,780 a little clearer. 1091 01:00:25,780 --> 01:00:26,920 We'll see. 1092 01:00:26,920 --> 01:00:28,570 So first of all, I'm going to define 1093 01:00:28,570 --> 01:00:32,480 a notion of a nondeterministic machine computing a function. 1094 01:00:32,480 --> 01:00:35,110 And that's a simple idea. 1095 01:00:35,110 --> 01:00:38,240 What you want is, on the different branches-- 1096 01:00:38,240 --> 01:00:43,000 so you have some function, f, which has, for every w, 1097 01:00:43,000 --> 01:00:44,710 there's an output, f of w. 
1098 01:00:47,290 --> 01:00:49,590 And the nondeterministic machine can 1099 01:00:49,590 --> 01:00:54,660 operate that on all of its branches, 1100 01:00:54,660 --> 01:01:01,080 it's allowed to either give f of w or say reject, meaning punt, 1101 01:01:01,080 --> 01:01:03,150 or say, I don't know. 1102 01:01:03,150 --> 01:01:07,770 So every branch has to give the right answer. 1103 01:01:07,770 --> 01:01:10,170 So all the branches that give an answer have to agree, 1104 01:01:10,170 --> 01:01:11,940 because there's only one right answer. 1105 01:01:11,940 --> 01:01:13,800 All the branches that give an answer 1106 01:01:13,800 --> 01:01:20,300 have to give the right answer, or they can say, I don't know. 1107 01:01:20,300 --> 01:01:23,120 The only thing is you have to also say that at least one 1108 01:01:23,120 --> 01:01:26,880 of the branches actually gives an answer. 1109 01:01:26,880 --> 01:01:29,270 So somebody cannot reject. 1110 01:01:29,270 --> 01:01:30,710 Somebody cannot say, I don't know. 1111 01:01:33,820 --> 01:01:37,030 So at least one of the branches gives an answer and-- 1112 01:01:37,030 --> 01:01:38,920 gives the answer. 1113 01:01:38,920 --> 01:01:43,040 And all the other branches can either give the answer, 1114 01:01:43,040 --> 01:01:44,890 or they can say-- 1115 01:01:44,890 --> 01:01:48,010 they can just reject. 1116 01:01:48,010 --> 01:01:49,480 But there's no notion of accepting. 1117 01:01:49,480 --> 01:01:52,870 There's just a notion of this nondeterministic machine, 1118 01:01:52,870 --> 01:01:55,330 on some branches, giving the output value, 1119 01:01:55,330 --> 01:01:59,528 and other branches just punting and saying reject. 1120 01:01:59,528 --> 01:02:00,820 Maybe reject is the wrong word. 1121 01:02:00,820 --> 01:02:01,750 I could just say punt. 1122 01:02:05,320 --> 01:02:06,580 All right. 
1123 01:02:06,580 --> 01:02:08,800 So we're going to be talking about functions that you 1124 01:02:08,800 --> 01:02:13,960 can compute with nondeterministic machines, 1125 01:02:13,960 --> 01:02:19,100 with NL machines in particular. 1126 01:02:19,100 --> 01:02:19,690 All right? 1127 01:02:19,690 --> 01:02:24,670 So we're going to look at this path function now. 1128 01:02:24,670 --> 01:02:27,190 Now, this is not exactly the same as the PATH language. 1129 01:02:27,190 --> 01:02:30,550 This is a function here, written with lowercase. 1130 01:02:30,550 --> 01:02:33,700 So given a graph, s and t, I'm going 1131 01:02:33,700 --> 01:02:37,515 to say yes if there is a path and no if there's no path. 1132 01:02:37,515 --> 01:02:38,890 And this is a function now, which 1133 01:02:38,890 --> 01:02:44,338 is going to output yes or no, not a language. 1134 01:02:44,338 --> 01:02:45,130 This is a function. 1135 01:02:45,130 --> 01:02:46,230 It's very closely related. 1136 01:02:46,230 --> 01:02:47,730 I understand. 1137 01:02:47,730 --> 01:02:49,500 So if you can solve the function, 1138 01:02:49,500 --> 01:02:51,900 you can do the language. 1139 01:02:51,900 --> 01:02:55,980 But what we're going to give is a NL machine, 1140 01:02:55,980 --> 01:02:57,480 a nondeterministic machine, which is 1141 01:02:57,480 --> 01:02:58,772 going to compute this function. 1142 01:03:01,280 --> 01:03:04,435 And therefore, you can use that to do the PATH language. 1143 01:03:07,390 --> 01:03:11,740 Two important things for us is, if G is some graph, well, 1144 01:03:11,740 --> 01:03:14,410 here's the starting node, s. 1145 01:03:14,410 --> 01:03:18,070 R is all of the nodes that you can reach from s. 1146 01:03:18,070 --> 01:03:21,700 This is some collection of nodes. 1147 01:03:21,700 --> 01:03:24,700 And c, which stands for count, is 1148 01:03:24,700 --> 01:03:27,030 the number of reachable nodes. 
1149 01:03:27,030 --> 01:03:28,890 So I've written that down here more-- 1150 01:03:28,890 --> 01:03:31,260 if you like it more formally. 1151 01:03:31,260 --> 01:03:32,880 R is the number-- 1152 01:03:32,880 --> 01:03:36,180 is the collection of nodes for which there's 1153 01:03:36,180 --> 01:03:38,860 a path from s to the node. 1154 01:03:38,860 --> 01:03:44,058 And c is the size of R. So you have to understand these two, 1155 01:03:44,058 --> 01:03:45,850 because we're going to be playing with this 1156 01:03:45,850 --> 01:03:46,933 for the next three slides. 1157 01:03:51,510 --> 01:03:52,560 OK. 1158 01:03:52,560 --> 01:03:55,600 Now, first of all, this is kind of a little bit of an exercise 1159 01:03:55,600 --> 01:03:56,100 theorem. 1160 01:03:56,100 --> 01:03:58,225 But it's still going to be a useful fact that we're 1161 01:03:58,225 --> 01:04:00,270 going to end up needing later. 1162 01:04:00,270 --> 01:04:02,610 But it's also a little bit of just 1163 01:04:02,610 --> 01:04:05,750 to test your understanding. 1164 01:04:05,750 --> 01:04:09,380 Suppose there's some NL machine which 1165 01:04:09,380 --> 01:04:11,746 computes this path function. 1166 01:04:11,746 --> 01:04:17,160 So on the different branches of the nondeterminism, 1167 01:04:17,160 --> 01:04:21,510 given a graph, G, s, and t, there 1168 01:04:21,510 --> 01:04:24,120 are going to be some branches which may output yes, 1169 01:04:24,120 --> 01:04:25,620 or some branches that may output no. 1170 01:04:25,620 --> 01:04:27,412 And other branches might say, I don't know. 1171 01:04:27,412 --> 01:04:30,528 But the machine always has to give the right answer if it's 1172 01:04:30,528 --> 01:04:31,570 going to give any answer. 
1173 01:04:31,570 --> 01:04:35,740 So all branches either have to say yes, or all branches-- 1174 01:04:35,740 --> 01:04:39,150 all branches have to say yes or punt, 1175 01:04:39,150 --> 01:04:42,810 or all branches have to say no or punt, 1176 01:04:42,810 --> 01:04:47,710 because one of those answers is going to be the right answer. 1177 01:04:47,710 --> 01:04:51,420 So suppose I have a way of computing path by an NL 1178 01:04:51,420 --> 01:04:54,160 machine. 1179 01:04:54,160 --> 01:04:56,440 Then can I also compute the-- 1180 01:04:56,440 --> 01:04:58,360 can I make some other NL machine which 1181 01:04:58,360 --> 01:05:03,680 computes the count, the number of nodes reachable? 1182 01:05:03,680 --> 01:05:05,800 So if I can test if a node is reachable, 1183 01:05:05,800 --> 01:05:09,890 can I figure out how many nodes are reachable? 1184 01:05:09,890 --> 01:05:12,120 This is supposed to be easy. 1185 01:05:12,120 --> 01:05:15,570 This is kind of a little bit of a practice. 1186 01:05:15,570 --> 01:05:17,630 So if I can figure out if nodes are reachable, 1187 01:05:17,630 --> 01:05:19,850 yes or no, then I can say, figure out 1188 01:05:19,850 --> 01:05:21,440 how many nodes are reachable. 1189 01:05:21,440 --> 01:05:23,750 You just go through them one by one, 1190 01:05:23,750 --> 01:05:28,080 testing if they're reachable, and count the ones that are. 1191 01:05:28,080 --> 01:05:29,590 That's all I have in mind. 1192 01:05:29,590 --> 01:05:33,810 So start with a counter that's set to 0 initially. 1193 01:05:33,810 --> 01:05:38,600 And go through each of the nodes of G one by one. 1194 01:05:38,600 --> 01:05:42,620 And I use my NL machine that computes path. 1195 01:05:42,620 --> 01:05:45,350 That's what I mean by this part. 1196 01:05:45,350 --> 01:05:46,580 So I test it. 1197 01:05:46,580 --> 01:05:51,470 If the NL machine says yes, there is a path, 1198 01:05:51,470 --> 01:05:53,300 then I increase the counter. 
1199 01:05:53,300 --> 01:05:57,020 And if it says there's no path, then I just 1200 01:05:57,020 --> 01:06:00,460 continue without increasing the counter. 1201 01:06:00,460 --> 01:06:02,620 Now, when I'm running my NL machine 1202 01:06:02,620 --> 01:06:06,820 to compute this function, that NL machine might punt, 1203 01:06:06,820 --> 01:06:09,520 might reject sometimes on some branches. 1204 01:06:09,520 --> 01:06:11,540 That's OK. 1205 01:06:11,540 --> 01:06:12,700 I'm also allowed. 1206 01:06:12,700 --> 01:06:13,780 I'm also an NL machine. 1207 01:06:13,780 --> 01:06:16,940 I'm computing a value. 1208 01:06:16,940 --> 01:06:20,470 And I also might punt on some branches. 1209 01:06:23,200 --> 01:06:27,110 So at the end, I'm going to output that count, OK? 1210 01:06:27,110 --> 01:06:31,700 So what I'm going to prove next is the converse of this. 1211 01:06:31,700 --> 01:06:34,040 And that's-- and that's the magical hard part, 1212 01:06:34,040 --> 01:06:39,680 that if I can compute the count, then I can do the test 1213 01:06:39,680 --> 01:06:46,490 of whether individual nodes are connected, have a path from s. 1214 01:06:51,000 --> 01:06:52,140 OK, so let's just see. 1215 01:06:56,230 --> 01:07:01,180 Somebody is asking if nondeterministic machines-- 1216 01:07:01,180 --> 01:07:03,010 so like M is not allowed to loop? 1217 01:07:03,010 --> 01:07:04,000 No. 1218 01:07:04,000 --> 01:07:06,520 If a machine, if any one of these machines, 1219 01:07:06,520 --> 01:07:09,940 like an NL machine, loops, it's going to be going forever. 1220 01:07:09,940 --> 01:07:11,120 That's not allowed. 1221 01:07:11,120 --> 01:07:14,160 So no looping. 1222 01:07:14,160 --> 01:07:19,468 I'm not sure why that's relevant, but no looping. 1223 01:07:19,468 --> 01:07:21,010 But what I'm more worried is that you 1224 01:07:21,010 --> 01:07:24,593 understand this theorem here. 1225 01:07:24,593 --> 01:07:26,010 I think we have a check-in coming. 1226 01:07:26,010 --> 01:07:26,510 Let's see. 
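The counting procedure just described can be sketched as one branch of the computation. In this sketch, `path_branch` is a hypothetical stand-in for one branch of the assumed NL machine for the path function: it returns "yes", "no", or `PUNT`. The breadth-first search used for the demonstration is an assumption for testing only; it uses far more than log space and never punts.

```python
from collections import deque

PUNT = None  # a branch that gives no answer

def count_reachable_branch(nodes, s, path_branch):
    """One branch: return c = number of nodes reachable from s, or PUNT."""
    count = 0
    for u in nodes:                       # go through the nodes one by one
        answer = path_branch(s, u)
        if answer is PUNT:                # the subroutine punted, so this branch punts too
            return PUNT
        if answer == "yes":
            count += 1                    # u is reachable: increase the counter
    return count

def make_bfs_path(edges):
    """Deterministic stand-in for the path function (not log space)."""
    def path_branch(s, u):
        seen, queue = {s}, deque([s])
        while queue:
            x = queue.popleft()
            for a, b in edges:
                if a == x and b not in seen:
                    seen.add(b)
                    queue.append(b)
        return "yes" if u in seen else "no"
    return path_branch

nodes = ["s", "a", "b", "c"]
edges = [("s", "a"), ("a", "b"), ("c", "s")]
c = count_reachable_branch(nodes, "s", make_bfs_path(edges))
```

Here s, a, and b are reachable from s and c is not, so the branch returns 3. Only the counter and the current node need to be remembered between calls, which is what keeps the real procedure within log space.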
1227 01:07:29,630 --> 01:07:30,170 OK. 1228 01:07:30,170 --> 01:07:31,087 This might be helpful. 1229 01:07:34,390 --> 01:07:38,010 So consider the statement that PATH complement is in NL. 1230 01:07:38,010 --> 01:07:40,930 That's what we're trying to prove, 1231 01:07:40,930 --> 01:07:44,845 and also that some NL machine can compute the path function. 1232 01:07:50,770 --> 01:07:55,580 These are going to be related facts. 1233 01:07:55,580 --> 01:07:59,047 Which one can we prove from the other easily? 1234 01:07:59,047 --> 01:08:00,630 I mean, they're both going to be true. 1235 01:08:00,630 --> 01:08:02,210 So in some sense, it's trivial. 1236 01:08:02,210 --> 01:08:04,460 But I want to know, which one can we 1237 01:08:04,460 --> 01:08:07,790 prove kind of immediately without doing much work? 1238 01:08:07,790 --> 01:08:11,510 That I can solve this PATH problem in NL, 1239 01:08:11,510 --> 01:08:13,460 the complement of the PATH problem in NL, 1240 01:08:13,460 --> 01:08:16,640 or that I can compute the path function in NL? 1241 01:08:16,640 --> 01:08:19,939 So what do you think? 1242 01:08:19,939 --> 01:08:24,590 OK, almost done here? 1243 01:08:24,590 --> 01:08:25,855 Yeah. 1244 01:08:25,855 --> 01:08:26,355 Ending. 1245 01:08:32,000 --> 01:08:34,700 You guys didn't do well. 1246 01:08:34,700 --> 01:08:36,200 That's OK. 1247 01:08:36,200 --> 01:08:39,890 Actually, the right answer is c. 1248 01:08:39,890 --> 01:08:44,330 Most of you got that if I can solve the path function, so 1249 01:08:44,330 --> 01:08:48,740 the yes-no value, I can use that now to solve both PATH and PATH 1250 01:08:48,740 --> 01:08:50,069 complement. 1251 01:08:50,069 --> 01:08:54,500 That seems more clear cut. 1252 01:08:54,500 --> 01:08:58,819 But suppose I can solve the PATH complement problem in NL. 1253 01:08:58,819 --> 01:09:02,270 And I also know I can solve the PATH problem in NL. 1254 01:09:02,270 --> 01:09:05,880 That, we've already shown.
1255 01:09:05,880 --> 01:09:12,210 So knowing both of those, if I'm given a G, s, and t, what 1256 01:09:12,210 --> 01:09:15,600 I can do is nondeterministically pick which 1257 01:09:15,600 --> 01:09:17,830 of those two directions. 1258 01:09:17,830 --> 01:09:18,970 You know, I pick-- 1259 01:09:18,970 --> 01:09:22,300 I'm going to guess, well, it's in PATH, 1260 01:09:22,300 --> 01:09:23,840 or it's in the complement of PATH. 1261 01:09:23,840 --> 01:09:27,520 So there are two different nondeterministic ways to go. 1262 01:09:27,520 --> 01:09:30,917 One of those is going to always end up rejecting. 1263 01:09:30,917 --> 01:09:32,500 And so that's going to end up punting. 1264 01:09:32,500 --> 01:09:34,569 The other direction is going to sometimes end up 1265 01:09:34,569 --> 01:09:36,579 accepting and sometimes punting. 1266 01:09:36,579 --> 01:09:39,826 And based upon whether which side ends up-- 1267 01:09:39,826 --> 01:09:41,784 one or the other is going to have some accept-- 1268 01:09:44,529 --> 01:09:46,029 is going to be accepting. 1269 01:09:46,029 --> 01:09:48,609 And so the one that's accepting is 1270 01:09:48,609 --> 01:09:51,069 going to tell me whether to answer yes or no. 1271 01:09:51,069 --> 01:09:55,750 So actually, both directions, both implications 1272 01:09:55,750 --> 01:09:57,100 follow pretty easily. 1273 01:10:00,510 --> 01:10:01,100 OK. 1274 01:10:01,100 --> 01:10:06,200 Anyway, let's try to show-- 1275 01:10:06,200 --> 01:10:09,208 this is the hard part. 1276 01:10:09,208 --> 01:10:10,250 And we have five minutes. 1277 01:10:10,250 --> 01:10:13,380 Let's see how far we can get. 1278 01:10:13,380 --> 01:10:18,050 So this theorem works by magic. 1279 01:10:18,050 --> 01:10:23,510 So it kind of blew everybody's mind when it first came out. 1280 01:10:23,510 --> 01:10:24,260 So let's just see. 1281 01:10:24,260 --> 01:10:25,790 It's really not that hard. 
1282 01:10:25,790 --> 01:10:28,180 But it's sort of-- 1283 01:10:28,180 --> 01:10:29,055 it's kind of twisted. 1284 01:10:32,310 --> 01:10:35,080 So suppose some machine can compute 1285 01:10:35,080 --> 01:10:40,790 c, the count, the number reachable from s. 1286 01:10:40,790 --> 01:10:45,990 I'm going to use that to solve path, the path function, 1287 01:10:45,990 --> 01:10:51,930 to test, yes, I can output yes if there is a path 1288 01:10:51,930 --> 01:10:55,920 or no, there is no path, for each node t. 1289 01:10:55,920 --> 01:10:58,780 So if I know how many nodes are reachable, 1290 01:10:58,780 --> 01:11:05,400 then I can solve now for individual nodes, which is 1291 01:11:05,400 --> 01:11:09,630 strange that you can do that. 1292 01:11:09,630 --> 01:11:11,700 Now, I'm not telling you how to compute c. 1293 01:11:11,700 --> 01:11:15,460 That's for later, which I probably won't get to. 1294 01:11:15,460 --> 01:11:18,820 But just pretend we can somehow figure out 1295 01:11:18,820 --> 01:11:24,570 what the count is of the number of reachable nodes, OK? 1296 01:11:24,570 --> 01:11:33,350 So here is my nondeterministic algorithm for computing path. 1297 01:11:33,350 --> 01:11:37,720 First, I'm going to compute c, or let's say c is given. 1298 01:11:37,720 --> 01:11:42,730 And now, maybe the best thing to do 1299 01:11:42,730 --> 01:11:46,912 is to try to give you the idea up front. 1300 01:11:46,912 --> 01:11:49,370 What we're going to do, since we're a little short on time, 1301 01:11:49,370 --> 01:11:55,860 what we're going to do is, suppose I tell you, 1302 01:11:55,860 --> 01:12:01,020 there are, in this graph, 100 nodes reachable from s. 1303 01:12:01,020 --> 01:12:02,650 So c is 100. 1304 01:12:02,650 --> 01:12:04,850 There's 100 reachable nodes. 1305 01:12:04,850 --> 01:12:06,530 Now I want to know-- 1306 01:12:06,530 --> 01:12:08,900 I say, well, I don't really-- 1307 01:12:08,900 --> 01:12:09,950 that's all very nice. 
1308 01:12:09,950 --> 01:12:12,860 But I'd like to know this particular node, t. 1309 01:12:12,860 --> 01:12:16,270 Is that reachable from s? 1310 01:12:16,270 --> 01:12:19,000 Now, I'm a nondeterministic machine. 1311 01:12:19,000 --> 01:12:23,190 Now, if t was reachable, then I'd 1312 01:12:23,190 --> 01:12:24,990 be fine, because nondeterministically, I 1313 01:12:24,990 --> 01:12:26,550 don't even care about the 100. 1314 01:12:26,550 --> 01:12:31,800 I take, nondeterministically, on some branch, starting from s, 1315 01:12:31,800 --> 01:12:33,420 I'm going to hit t. 1316 01:12:33,420 --> 01:12:36,000 And that branch is going to say yes. 1317 01:12:36,000 --> 01:12:38,850 The other branches, maybe they'll punt. 1318 01:12:38,850 --> 01:12:41,340 But some branch is going to get the right answer. 1319 01:12:41,340 --> 01:12:46,250 The problem is, suppose t is not reachable. 1320 01:12:46,250 --> 01:12:48,800 Then you want some branch to say no. 1321 01:12:48,800 --> 01:12:51,170 And how could that branch ever say no, 1322 01:12:51,170 --> 01:12:54,080 unless it's sure that t is not reachable? 1323 01:12:54,080 --> 01:12:57,770 And how can one branch be sure? 1324 01:12:57,770 --> 01:12:59,270 The idea is this. 1325 01:12:59,270 --> 01:13:03,020 Suppose I know that there are 100 reachable nodes. 1326 01:13:03,020 --> 01:13:06,770 What I'm going to do nondeterministically is I'm 1327 01:13:06,770 --> 01:13:10,362 going to guess those 100 nodes, one by one. 1328 01:13:10,362 --> 01:13:12,320 You can't store them all, because it could be-- 1329 01:13:12,320 --> 01:13:14,510 100 could be a big number. 1330 01:13:14,510 --> 01:13:16,250 I'm going to guess them one by one. 1331 01:13:16,250 --> 01:13:18,190 I'm going to guess them. 1332 01:13:18,190 --> 01:13:20,340 And every time I guess a node, I'm 1333 01:13:20,340 --> 01:13:22,680 going to prove it's reachable by guessing the path that 1334 01:13:22,680 --> 01:13:24,760 shows it's reachable. 
1335 01:13:24,760 --> 01:13:29,520 So I'm going to guess 100 nodes, prove that they're reachable, 1336 01:13:29,520 --> 01:13:32,280 and then see, was t one of those reachable nodes? 1337 01:13:35,930 --> 01:13:38,210 If it was, well, then I would have found it, 1338 01:13:38,210 --> 01:13:40,360 and I would know to say yes. 1339 01:13:40,360 --> 01:13:43,120 But if t was not one of the 100 reachable 1340 01:13:43,120 --> 01:13:45,370 nodes, and I know there's only 100-- 1341 01:13:45,370 --> 01:13:48,040 so if t is not one of those nodes-- 1342 01:13:48,040 --> 01:13:53,010 in other words, if I found them all, and t wasn't one of them, 1343 01:13:53,010 --> 01:13:56,020 then I know it's not reachable. 1344 01:13:56,020 --> 01:13:59,400 And that's how, using the count, I 1345 01:13:59,400 --> 01:14:02,820 can be sure that certain nodes are not reachable, 1346 01:14:02,820 --> 01:14:06,210 because I just find all the ones that are, prove that they are, 1347 01:14:06,210 --> 01:14:09,180 check that the count agrees with what I was given, 1348 01:14:09,180 --> 01:14:11,910 and then say no, t is not reachable, 1349 01:14:11,910 --> 01:14:14,460 if it's not one of those nodes that I've 1350 01:14:14,460 --> 01:14:20,573 found to be reachable, which adds up to my given count. 1351 01:14:20,573 --> 01:14:21,490 That's the whole idea. 1352 01:14:24,040 --> 01:14:26,410 Of course, how do you get the count? 1353 01:14:26,410 --> 01:14:30,340 Oddly enough, it's kind of the same idea 1354 01:14:30,340 --> 01:14:32,110 repeated over and over again. 1355 01:14:32,110 --> 01:14:34,490 But I guess we'll have to do that next time. 1356 01:14:34,490 --> 01:14:35,860 So let's just write this down. 1357 01:14:35,860 --> 01:14:38,290 And we'll kind of use it as the beginning 1358 01:14:38,290 --> 01:14:41,140 of Thursday's lecture. 1359 01:14:41,140 --> 01:14:43,690 So we're going to go through each node u, one by one.
1360 01:14:46,420 --> 01:14:49,330 Now we're going to guess, for each node, 1361 01:14:49,330 --> 01:14:52,040 whether there's a path to it or not. 1362 01:14:52,040 --> 01:14:55,900 So I'm going to call it either p or n. 1363 01:14:55,900 --> 01:14:58,390 Again, this is now-- think about my 100 nodes. 1364 01:14:58,390 --> 01:15:00,795 I'm going to be guessing all 100 nodes. 1365 01:15:00,795 --> 01:15:02,170 I'm going to nondeterministically 1366 01:15:02,170 --> 01:15:06,890 pick a path to that node that I guess is reachable. 1367 01:15:06,890 --> 01:15:10,510 So if I guess a node, there is a path. 1368 01:15:10,510 --> 01:15:13,650 I'm going to confirm there's a path by nondeterministically 1369 01:15:13,650 --> 01:15:15,340 picking it. 1370 01:15:15,340 --> 01:15:17,140 If I don't find that path, I just 1371 01:15:17,140 --> 01:15:21,520 reject-- punt on that branch. 1372 01:15:21,520 --> 01:15:24,960 If that path that I found actually 1373 01:15:24,960 --> 01:15:29,380 led me to t, so u, that node that I'm working on, 1374 01:15:29,380 --> 01:15:32,400 is currently t, then I know to accept. 1375 01:15:32,400 --> 01:15:36,180 But otherwise, I'm just going to count the number of nodes 1376 01:15:36,180 --> 01:15:40,210 that I find are reachable. 1377 01:15:40,210 --> 01:15:42,070 If I've guessed that u is not reachable, 1378 01:15:42,070 --> 01:15:45,540 I'm just going to skip it. 1379 01:15:45,540 --> 01:15:47,640 At the end, I see whether the number 1380 01:15:47,640 --> 01:15:49,530 of nodes that I have determined are 1381 01:15:49,530 --> 01:15:53,790 reachable agrees with my original count, c. 1382 01:15:53,790 --> 01:15:56,470 So does k equal c or not? 1383 01:15:56,470 --> 01:15:58,440 If it doesn't equal, they're not equal, 1384 01:15:58,440 --> 01:16:01,470 then I didn't find all the reachable nodes. 1385 01:16:01,470 --> 01:16:03,000 I didn't guess right. 1386 01:16:03,000 --> 01:16:05,130 And so I punt.
1387 01:16:05,130 --> 01:16:08,010 I say, well, bad branch of the nondeterminism. 1388 01:16:08,010 --> 01:16:10,130 I just give up. 1389 01:16:10,130 --> 01:16:12,590 But some branch of the nondeterminism 1390 01:16:12,590 --> 01:16:15,740 is going to guess all of the correct nodes which 1391 01:16:15,740 --> 01:16:18,010 are reachable. 1392 01:16:18,010 --> 01:16:22,810 And then, if t hadn't been found already to be one of them, 1393 01:16:22,810 --> 01:16:25,250 at this point, I know t is not reachable. 1394 01:16:25,250 --> 01:16:28,150 And so I can output no. 1395 01:16:28,150 --> 01:16:30,370 OK? 1396 01:16:30,370 --> 01:16:34,600 So that's the whole thing. 1397 01:16:34,600 --> 01:16:37,733 What is m? m is the-- yeah, good question. m is the number 1398 01:16:37,733 --> 01:16:38,650 of nodes of the graph. 1399 01:16:38,650 --> 01:16:39,650 I should have said that. 1400 01:16:39,650 --> 01:16:43,200 So you don't want to go-- you don't want to get into a loop. 1401 01:16:43,200 --> 01:16:47,710 So you better cut off your picking of a path 1402 01:16:47,710 --> 01:16:49,520 to some cutoff value. 1403 01:16:49,520 --> 01:16:51,790 So you're going to cut it off at m, 1404 01:16:51,790 --> 01:16:53,320 which is the number of nodes, which 1405 01:16:53,320 --> 01:16:54,970 is going to be long enough. 1406 01:16:54,970 --> 01:16:58,750 Actually, we're going to play with that in a bit later, but-- 1407 01:17:03,108 --> 01:17:03,900 OK, let's just see. 1408 01:17:03,900 --> 01:17:06,330 How do I know I did not visit the same node twice 1409 01:17:06,330 --> 01:17:07,423 when counting? 1410 01:17:07,423 --> 01:17:09,090 Because I'm just going to go through all 1411 01:17:09,090 --> 01:17:12,730 of the nodes in some order. 1412 01:17:12,730 --> 01:17:13,810 Pick any order. 1413 01:17:13,810 --> 01:17:18,430 The nodes appear in some order in the representation 1414 01:17:18,430 --> 01:17:21,780 of the graph on the input. 
1415 01:17:21,780 --> 01:17:22,620 So any old order-- 1416 01:17:22,620 --> 01:17:24,620 I'm just going to go through the nodes in order. 1417 01:17:24,620 --> 01:17:31,180 Therefore, I'm never going to see the same node twice. 1418 01:17:31,180 --> 01:17:32,710 What does step 4 mean? 1419 01:17:41,010 --> 01:17:46,150 So step 4 is-- 1420 01:17:46,150 --> 01:17:48,730 step 4 means, for each node, I'm guessing 1421 01:17:48,730 --> 01:17:52,120 that that node either has a path to it from s 1422 01:17:52,120 --> 01:17:54,040 or does not have a path to it from s. 1423 01:17:57,200 --> 01:18:00,700 So kind of thinking about it, the original-- we're 1424 01:18:00,700 --> 01:18:01,200 out of time. 1425 01:18:01,200 --> 01:18:02,820 So why don't I-- 1426 01:18:02,820 --> 01:18:06,260 I'm happy to discuss this in the office hours. 1427 01:18:06,260 --> 01:18:08,360 I'm just going to skip over the rest of the slides 1428 01:18:08,360 --> 01:18:12,770 here and review. 1429 01:18:12,770 --> 01:18:13,970 We have a missing check-in. 1430 01:18:18,980 --> 01:18:22,220 Let's just-- I want to make sure everybody's 1431 01:18:22,220 --> 01:18:25,620 got all their check-ins here. 1432 01:18:25,620 --> 01:18:27,500 So why don't we just-- 1433 01:18:27,500 --> 01:18:34,460 if we know NL is equal to coNL, we also-- we 1434 01:18:34,460 --> 01:18:37,730 showed 2SAT complement is NL complete. 1435 01:18:37,730 --> 01:18:40,850 It also then follows that 2SAT itself is NL complete, 1436 01:18:40,850 --> 01:18:42,418 because NL equals coNL. 1437 01:18:42,418 --> 01:18:44,210 So I'm going to give you the answer to this, 1438 01:18:44,210 --> 01:18:50,373 just because I want you all to finish this poll. 1439 01:18:50,373 --> 01:18:52,040 Still, some of you are getting it wrong. 1440 01:18:56,130 --> 01:18:58,800 OK. 1441 01:18:58,800 --> 01:19:00,600 So please answer it quick. 1442 01:19:00,600 --> 01:19:02,870 And then we're going to end. 1443 01:19:02,870 --> 01:19:04,460 Are we all done?
1444 01:19:04,460 --> 01:19:07,980 Get your participation points here. 1445 01:19:07,980 --> 01:19:08,595 Three seconds. 1446 01:19:11,770 --> 01:19:14,480 OK, ending. 1447 01:19:14,480 --> 01:19:17,520 OK, doesn't matter. 1448 01:19:17,520 --> 01:19:19,640 So here, we ran over. 1449 01:19:19,640 --> 01:19:21,110 Sorry about that. 1450 01:19:21,110 --> 01:19:22,460 Quick review. 1451 01:19:22,460 --> 01:19:23,960 This is what we didn't quite finish. 1452 01:19:23,960 --> 01:19:24,830 This is part 5. 1453 01:19:24,830 --> 01:19:26,670 But we'll finish that next time. 1454 01:19:26,670 --> 01:19:29,570 OK, when showing PATH is NL complete, 1455 01:19:29,570 --> 01:19:32,380 we also need to list the nodes for constructing the graph. 1456 01:19:32,380 --> 01:19:33,430 The slides only mention-- 1457 01:19:33,430 --> 01:19:34,870 yeah, I kind of skipped that. 1458 01:19:34,870 --> 01:19:37,773 But yeah, you can just write down all the nodes. 1459 01:19:37,773 --> 01:19:39,940 Again, but that's also going to just take log space, 1460 01:19:39,940 --> 01:19:41,073 as you observed. 1461 01:19:41,073 --> 01:19:43,240 Yeah, technically, when you're writing down a graph, 1462 01:19:43,240 --> 01:19:44,440 you write down a list of the nodes, 1463 01:19:44,440 --> 01:19:46,065 and you write down a list of the edges. 1464 01:19:46,065 --> 01:19:48,160 I kind of skipped writing down the nodes. 1465 01:19:48,160 --> 01:19:50,050 But yeah, it's the same-- 1466 01:19:50,050 --> 01:19:51,335 doesn't matter. 1467 01:19:51,335 --> 01:19:52,960 So I'm going to say goodbye to you all. 1468 01:19:52,960 --> 01:19:54,750 Thank you.
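The nondeterministic procedure described in this lecture for answering PATH once the count c is given can be sketched in Python by checking one branch of the nondeterminism at a time. This is only a minimal sketch under stated assumptions, not the lecture's slide: the names run_branch, is_valid_path, and honest_guesses are hypothetical, a branch's guesses are passed in as a dictionary rather than guessed step by step, c is assumed to count s itself among the reachable nodes, and the code makes no attempt to stay within log space.

```python
from collections import deque


def is_valid_path(edges, path, s, u, m):
    """Check a guessed path: starts at s, ends at u, at most m nodes, real edges."""
    if not path or len(path) > m or path[0] != s or path[-1] != u:
        return False
    return all((a, b) in edges for a, b in zip(path, path[1:]))


def run_branch(edges, nodes, s, t, c, guesses):
    """Follow ONE branch; return "yes", "no", or "punt".

    guesses maps a node u to a guessed path from s to u (the guess "p").
    A node missing from guesses stands for the guess "n" (no path).
    """
    m = len(nodes)          # cutoff on path length, as in the lecture
    k = 0                   # number of nodes proved reachable on this branch
    for u in nodes:         # fixed order, so no node is counted twice
        path = guesses.get(u)
        if path is None:
            continue        # guessed "n": skip u
        if not is_valid_path(edges, path, s, u, m):
            return "punt"   # guessed "p" but failed to prove it
        if u == t:
            return "yes"    # t itself was proved reachable
        k += 1
    # All c reachable nodes were found and t was not among them.
    return "no" if k == c else "punt"


def honest_guesses(edges, nodes, s):
    """Hypothetical helper: build the one 'good' branch with breadth-first search.

    Only for demonstration; a real NL machine guesses instead of searching.
    """
    parent = {s: None}
    queue = deque([s])
    while queue:
        a = queue.popleft()
        for x, y in edges:
            if x == a and y not in parent:
                parent[y] = a
                queue.append(y)
    guesses = {}
    for u in parent:
        path = [u]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        guesses[u] = path[::-1]
    return guesses


# Small example: s -> a -> b, and c -> b, so the node "c" is not reachable from s.
example_nodes = ["s", "a", "b", "c"]
example_edges = {("s", "a"), ("a", "b"), ("c", "b")}
good = honest_guesses(example_edges, example_nodes, "s")
count = len(good)  # stands in for the given count c
print(run_branch(example_edges, example_nodes, "s", "c", count, good))
print(run_branch(example_edges, example_nodes, "s", "b", count, good))
```

With the correct count, a branch that misses a reachable node ends with k smaller than c and punts, and a branch that claims an unreachable node cannot produce a valid path and also punts, so no branch of this sketch outputs a wrong answer.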