1 00:00:08,200 --> 00:00:11,560 SRINI DEVADAS: All right good morning, everyone. 2 00:00:11,560 --> 00:00:12,856 Welcome back. 3 00:00:12,856 --> 00:00:15,330 I hope you had a good long weekend. 4 00:00:15,330 --> 00:00:19,840 So today's puzzle is, I guess, a classic puzzle. 5 00:00:19,840 --> 00:00:21,870 It's Sudoku. 6 00:00:21,870 --> 00:00:23,970 I've never actually successfully managed 7 00:00:23,970 --> 00:00:28,530 to complete a Sudoku puzzle by myself, 8 00:00:28,530 --> 00:00:31,620 because they've fallen into two categories for me. 9 00:00:31,620 --> 00:00:35,610 Either they're easy, and I get bored and I stop. 10 00:00:35,610 --> 00:00:39,100 Or they're too hard, and I get lazy and I stop. 11 00:00:39,100 --> 00:00:42,930 But what I have done is write a computer program 12 00:00:42,930 --> 00:00:47,100 that essentially solves any Sudoku puzzle that is 13 00:00:47,100 --> 00:00:50,730 put in front of it in seconds. 14 00:00:50,730 --> 00:00:54,510 Maybe there exist puzzles for which it would take minutes, 15 00:00:54,510 --> 00:00:57,660 but I haven't discovered such puzzles. 16 00:00:57,660 --> 00:01:02,550 And what we're going to do today is talk about Sudoku, 17 00:01:02,550 --> 00:01:07,830 compare and contrast the human way of solving Sudoku puzzles 18 00:01:07,830 --> 00:01:11,880 against a brute force way, and then try and integrate 19 00:01:11,880 --> 00:01:13,050 the two together. 20 00:01:13,050 --> 00:01:15,300 You know perhaps this is the closest 21 00:01:15,300 --> 00:01:17,760 we're going to get to AI in this class, 22 00:01:17,760 --> 00:01:23,280 where we're going to try and marry an exhaustive search 23 00:01:23,280 --> 00:01:25,950 method with some smarts. 24 00:01:25,950 --> 00:01:30,020 And back-- I think when we were doing the N-queens puzzle-- 25 00:01:30,020 --> 00:01:32,400 one of you asked a question about what 26 00:01:32,400 --> 00:01:34,470 the number of possibilities were. 
27 00:01:34,470 --> 00:01:37,680 For an eight queens puzzle, was it eight raised to eight? 28 00:01:37,680 --> 00:01:39,360 And I said, well no. 29 00:01:39,360 --> 00:01:43,260 You can prune the search by figuring out 30 00:01:43,260 --> 00:01:47,940 that particular partial configurations that correspond 31 00:01:47,940 --> 00:01:51,180 to perhaps two queens being placed on the eight 32 00:01:51,180 --> 00:01:57,600 by eight board already does not correspond to a solution, 33 00:01:57,600 --> 00:01:59,970 because the two queens conflict with each other. 34 00:01:59,970 --> 00:02:01,980 And you can then shrink this eight 35 00:02:01,980 --> 00:02:04,330 raised to eight substantially. 36 00:02:04,330 --> 00:02:08,009 So that's exactly the methodology 37 00:02:08,009 --> 00:02:09,840 that we're going to follow here in trying 38 00:02:09,840 --> 00:02:13,800 to take our brute force solver, which 39 00:02:13,800 --> 00:02:18,660 will work, given enough time, on arbitrary Sudoku puzzles. 40 00:02:18,660 --> 00:02:20,580 But we may not want to wait that long. 41 00:02:20,580 --> 00:02:23,190 And we're going to take this strategy of pruning the search 42 00:02:23,190 --> 00:02:25,170 to try and improve the solver. 43 00:02:25,170 --> 00:02:29,250 And one last thing before I get started on the rules of Sudoku, 44 00:02:29,250 --> 00:02:32,790 we're going to have to have a way of measuring performance. 
45 00:02:32,790 --> 00:02:36,900 Just like we can measure eight raised to eight or four raised 46 00:02:36,900 --> 00:02:39,000 to four or what have you, we want 47 00:02:39,000 --> 00:02:41,520 to have a way of measuring-- 48 00:02:41,520 --> 00:02:43,482 outside of the particular machine 49 00:02:43,482 --> 00:02:44,940 that's being used, we can obviously 50 00:02:44,940 --> 00:02:46,900 measure real time in terms of seconds, 51 00:02:46,900 --> 00:02:49,980 but that's not as precise-- 52 00:02:49,980 --> 00:02:53,460 you want to measure more precisely what 53 00:02:53,460 --> 00:02:55,680 the number of combinations are. 54 00:02:55,680 --> 00:02:58,410 And so we could certainly instrument our code 55 00:02:58,410 --> 00:03:02,040 with appropriate counters that will allow 56 00:03:02,040 --> 00:03:03,810 us to measure this performance. 57 00:03:03,810 --> 00:03:07,500 And so then it won't matter if our code runs on a fast machine 58 00:03:07,500 --> 00:03:08,637 or a slow machine. 59 00:03:08,637 --> 00:03:10,470 We can compare it with another piece of code 60 00:03:10,470 --> 00:03:14,000 or another variant of the code and say, oh this new variant 61 00:03:14,000 --> 00:03:15,990 is slower or the new variant is faster 62 00:03:15,990 --> 00:03:17,280 according to this metric. 63 00:03:17,280 --> 00:03:18,270 All right? 64 00:03:18,270 --> 00:03:22,300 So without further ado, let's dive into Sudoku. 65 00:03:22,300 --> 00:03:27,420 How many of you have never seen Sudoku, never played Sudoku? 66 00:03:27,420 --> 00:03:30,300 All right, so that's fine. 67 00:03:30,300 --> 00:03:33,240 It's only going to take me about 30 seconds 68 00:03:33,240 --> 00:03:35,820 to explain what the rules of Sudoku are. 69 00:03:35,820 --> 00:03:39,900 And then we can dive into trying to, at least partially, solve 70 00:03:39,900 --> 00:03:40,950 this puzzle. 
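The counter idea described above can be sketched on the N-queens search from the earlier lecture. This is a minimal sketch, not the lecture's code: the names `count_nodes`, `conflicts`, and `place` are hypothetical, and the counter simply counts every queen placement the search tries, so the number is the same on a fast machine or a slow one.

```python
def count_nodes(n):
    """Return (solutions, nodes) for an n-queens search with pruning.

    nodes counts every placement the search tries. All names here are
    illustrative assumptions, not taken from the lecture's code.
    """
    nodes = 0
    solutions = 0
    cols = []  # cols[k] is the row of the queen already placed in column k

    def conflicts(row):
        # The new queen goes in column len(cols); check rows and diagonals.
        c = len(cols)
        return any(r == row or abs(r - row) == c - k
                   for k, r in enumerate(cols))

    def place():
        nonlocal nodes, solutions
        if len(cols) == n:
            solutions += 1
            return
        for row in range(n):
            nodes += 1  # machine-independent measure of work done
            if not conflicts(row):
                cols.append(row)
                place()
                cols.pop()  # backtrack and try the next row

    place()
    return solutions, nodes


if __name__ == "__main__":
    solutions, nodes = count_nodes(8)
    # Compare the pruned count against the unpruned bound of 8 raised to 8.
    print(solutions, nodes, 8 ** 8)
```

Comparing `nodes` between two variants of a solver gives the kind of metric the lecture asks for: a variant that prunes more tries fewer placements, regardless of wall-clock time.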
71 00:03:40,950 --> 00:03:43,410 I do not want to completely solve the puzzle 72 00:03:43,410 --> 00:03:49,260 because, as I said, it's either too simple or it's too hard. 73 00:03:49,260 --> 00:03:51,450 And I'd rather write computer programs. 74 00:03:51,450 --> 00:03:56,040 And so the rules of Sudoku are simple. 75 00:03:56,040 --> 00:04:00,900 So this is classic Sudoku, and it's a nine by nine. 76 00:04:00,900 --> 00:04:03,180 There's many variants of Sudoku. 77 00:04:03,180 --> 00:04:05,970 In fact, a couple of the exercises 78 00:04:05,970 --> 00:04:10,350 talk about two variants, diagonal Sudoku and even 79 00:04:10,350 --> 00:04:12,340 Sudoku, I think-- 80 00:04:12,340 --> 00:04:14,730 there's probably odd Sudokus as well-- 81 00:04:14,730 --> 00:04:17,820 that add even more constraints to the basic constraints 82 00:04:17,820 --> 00:04:21,510 of Sudoku that I'm going to write up here. 83 00:04:21,510 --> 00:04:26,070 And this is nine by nine Sudoku. 84 00:04:26,070 --> 00:04:28,860 And the numbers are one through nine. 85 00:04:36,980 --> 00:04:40,234 And the rules are simple. 86 00:04:40,234 --> 00:04:48,270 Each row has all the numbers, which 87 00:04:48,270 --> 00:04:51,600 means that no numbers could be repeated, because there's 88 00:04:51,600 --> 00:04:54,670 nine columns and nine rows. 89 00:04:54,670 --> 00:04:58,730 So there's nine numbers on each row. 90 00:04:58,730 --> 00:05:06,250 Each column has all the numbers-- 91 00:05:06,250 --> 00:05:06,750 same thing. 92 00:05:09,480 --> 00:05:11,610 So there's nine rows and nine columns. 93 00:05:11,610 --> 00:05:18,540 And then each sector, which is a three by three grid-- 94 00:05:18,540 --> 00:05:22,230 and so that's why I have these overhangs here corresponding 95 00:05:22,230 --> 00:05:26,700 to pointing out what the nine sectors are in Sudoku. 96 00:05:26,700 --> 00:05:28,200 So this is a sector. 97 00:05:28,200 --> 00:05:29,000 That's a sector. 
98 00:05:29,000 --> 00:05:30,749 This middle one, which is completely blank 99 00:05:30,749 --> 00:05:33,910 right now is a sector, et cetera. 100 00:05:33,910 --> 00:05:37,065 So each sector has all the numbers. 101 00:05:40,670 --> 00:05:43,100 You can grow the size of the puzzle. 102 00:05:43,100 --> 00:05:46,850 It gets more difficult. You could add more constraints. 103 00:05:46,850 --> 00:05:49,640 As I mentioned, diagonal Sudoku might say something 104 00:05:49,640 --> 00:05:55,340 like both of the diagonals, the large diagonals, the full size 105 00:05:55,340 --> 00:05:59,060 diagonals, have all nine numbers on them, et cetera. 106 00:05:59,060 --> 00:06:01,830 So that makes the puzzle different. 107 00:06:01,830 --> 00:06:04,490 You may have a solution to the nine 108 00:06:04,490 --> 00:06:07,040 by nine original Sudoku puzzle, but it may not 109 00:06:07,040 --> 00:06:10,040 be a solution to the diagonal puzzle. 110 00:06:10,040 --> 00:06:12,950 Obviously the other way around works 111 00:06:12,950 --> 00:06:17,120 because the diagonal puzzle only has more constraints. 112 00:06:17,120 --> 00:06:22,700 And so what we do here is try and use implications. 113 00:06:22,700 --> 00:06:24,670 Right, so we have these rules. 114 00:06:24,670 --> 00:06:28,100 And we'll first forget about computer programs 115 00:06:28,100 --> 00:06:33,860 and try and solve this the way people do when they just 116 00:06:33,860 --> 00:06:37,014 have a paper and pencil and they have 117 00:06:37,014 --> 00:06:38,180 the puzzle in front of them. 118 00:06:38,180 --> 00:06:44,090 And they try and use these rules to discover empty positions. 119 00:06:44,090 --> 00:06:49,407 And it's kind of hard to do anything with this sector here. 120 00:06:49,407 --> 00:06:51,740 You could use the row and column constraints, obviously, 121 00:06:51,740 --> 00:06:52,730 even for this sector. 
122 00:06:52,730 --> 00:06:58,730 Because you have constraints on these three based on the fact 123 00:06:58,730 --> 00:07:01,920 that you have nine and seven and one and six on this row, 124 00:07:01,920 --> 00:07:02,690 et cetera. 125 00:07:02,690 --> 00:07:05,900 But usually you go with sectors that 126 00:07:05,900 --> 00:07:07,940 have a few numbers in them. 127 00:07:07,940 --> 00:07:10,100 You go with rows that have a few numbers in them. 128 00:07:10,100 --> 00:07:12,260 And you go with columns that have a few numbers in them. 129 00:07:12,260 --> 00:07:14,390 And then you can try and shrink the possibilities. 130 00:07:14,390 --> 00:07:15,470 All right? 131 00:07:15,470 --> 00:07:19,550 So just because I don't want to go overboard 132 00:07:19,550 --> 00:07:22,460 with respect to looking all over the puzzle, 133 00:07:22,460 --> 00:07:25,790 let's focus in on eight-- 134 00:07:25,790 --> 00:07:27,260 the number eight. 135 00:07:27,260 --> 00:07:33,800 And one of you tell me if I can imply something 136 00:07:33,800 --> 00:07:39,410 based on the locations of eight on the top third 137 00:07:39,410 --> 00:07:40,160 of this puzzle. 138 00:07:43,480 --> 00:07:45,274 Yeah, go ahead. 139 00:07:45,274 --> 00:07:46,560 AUDIENCE: Top middle square. 140 00:07:46,560 --> 00:07:47,870 SRINI DEVADAS: Top middle square, what's your name? 141 00:07:47,870 --> 00:07:48,710 Kye? 142 00:07:48,710 --> 00:07:53,570 So Kye says top middle square should be an eight, right here. 143 00:07:53,570 --> 00:07:55,790 Right, and-- oh, top OK. 144 00:07:55,790 --> 00:07:57,290 Yeah that clearly can't be an eight, 145 00:07:57,290 --> 00:07:58,960 because this is an eight here. 146 00:07:58,960 --> 00:08:03,760 But good, so the claim is this is an eight. 147 00:08:03,760 --> 00:08:05,820 And Kye, how did you figure that out? 148 00:08:05,820 --> 00:08:08,180 AUDIENCE: You can eliminate the first two rows because. 
149 00:08:08,180 --> 00:08:10,013 SRINI DEVADAS: Right you can eliminate this, 150 00:08:10,013 --> 00:08:12,390 because eight can't be here because of this eight. 151 00:08:12,390 --> 00:08:14,800 Eight can't be here because of this eight. 152 00:08:14,800 --> 00:08:18,640 You need to have an 8 in here somewhere, because eight 153 00:08:18,640 --> 00:08:20,320 doesn't exist in the sector. 154 00:08:20,320 --> 00:08:22,870 So that would imply that I need to put an eight up here. 155 00:08:22,870 --> 00:08:23,710 OK? 156 00:08:23,710 --> 00:08:27,490 So this is what's called a horizontal scan. 157 00:08:27,490 --> 00:08:31,630 The only thing that Kye did here was scan horizontally. 158 00:08:31,630 --> 00:08:34,750 And you can imagine that-- 159 00:08:34,750 --> 00:08:38,080 so Kye did not use, in order to imply the eight-- 160 00:08:38,080 --> 00:08:40,840 and so this word implication, imply, is something 161 00:08:40,840 --> 00:08:43,570 that we're going to use in a more technical sense 162 00:08:43,570 --> 00:08:47,560 as well when we write our code, but an implication essentially 163 00:08:47,560 --> 00:08:51,640 says these rules imply the location of the eight. 164 00:08:51,640 --> 00:08:52,360 Right? 165 00:08:52,360 --> 00:08:55,750 And we didn't do a vertical scan. 166 00:08:55,750 --> 00:08:58,010 We did not use the fact that-- 167 00:08:58,010 --> 00:08:59,860 in this particular implication, we did not 168 00:08:59,860 --> 00:09:04,390 use the fact that a column needs to have all numbers on it, 169 00:09:04,390 --> 00:09:08,800 and therefore all of the numbers on a column have to be unique. 170 00:09:08,800 --> 00:09:10,225 Take a look at-- 171 00:09:13,080 --> 00:09:15,000 take a look at this part here. 172 00:09:15,000 --> 00:09:20,280 And let's look at one. 
173 00:09:20,280 --> 00:09:24,240 And try and use a more sophisticated form 174 00:09:24,240 --> 00:09:27,750 of implication corresponding to both horizontal 175 00:09:27,750 --> 00:09:34,080 and vertical scans to imply the position of a one somewhere 176 00:09:34,080 --> 00:09:36,290 on the puzzle. 177 00:09:36,290 --> 00:09:38,350 Can someone do that? 178 00:09:38,350 --> 00:09:39,527 Yeah, back there. 179 00:09:39,527 --> 00:09:41,969 AUDIENCE: In the top box-- in the top right box, 180 00:09:41,969 --> 00:09:43,010 it's to the right of six. 181 00:09:43,010 --> 00:09:45,740 SRINI DEVADAS: OK, so the one can't be here. 182 00:09:45,740 --> 00:09:47,380 The one can't be here. 183 00:09:47,380 --> 00:09:47,880 Right? 184 00:09:47,880 --> 00:09:49,660 And the one can't be here. 185 00:09:49,660 --> 00:09:51,072 So it has to be over here. 186 00:09:51,072 --> 00:09:51,780 What's your name? 187 00:09:51,780 --> 00:09:52,500 AUDIENCE: George. 188 00:09:52,500 --> 00:09:53,875 SRINI DEVADAS: George-- so George 189 00:09:53,875 --> 00:09:55,170 says the one has to be here. 190 00:09:55,170 --> 00:09:59,070 And he used both vertical scanning as well as 191 00:09:59,070 --> 00:10:02,310 horizontal scanning in order to imply the one. 192 00:10:02,310 --> 00:10:04,410 So it's a little more sophisticated. 193 00:10:04,410 --> 00:10:08,640 OK, on top of that, obviously, sectors 194 00:10:08,640 --> 00:10:12,960 are going to give you some implications as well. 195 00:10:12,960 --> 00:10:14,640 And there's no end to this, honestly. 196 00:10:14,640 --> 00:10:18,290 There's combinations, there's also a little bit of look 197 00:10:18,290 --> 00:10:23,640 ahead, where the hardest puzzles are the ones where you run out 198 00:10:23,640 --> 00:10:26,730 of the eights and the ones in terms of the examples 199 00:10:26,730 --> 00:10:30,090 that we have here where we've just sort of implied-- 200 00:10:30,090 --> 00:10:34,140 without guessing, we've implied the location of a number. 
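The horizontal and vertical scans that Kye and George used can be sketched in code. This is only an illustration under stated assumptions: the function name `implied_cells` is hypothetical, empty squares are assumed to hold 0, and the small grid below is made up for the example rather than copied from the board in the lecture.

```python
def implied_cells(grid, digit):
    """Return the cells where `digit` is forced by row/column scanning.

    For each 3x3 sector that lacks `digit`, collect the empty cells whose
    row and column do not already contain it. If exactly one cell is left,
    the digit is implied there. (Hypothetical helper, not lecture code.)
    """
    found = []
    for si in range(0, 9, 3):
        for sj in range(0, 9, 3):
            cells = [(i, j) for i in range(si, si + 3)
                     for j in range(sj, sj + 3)]
            if any(grid[i][j] == digit for i, j in cells):
                continue  # this sector already has the digit
            candidates = [
                (i, j) for i, j in cells
                if grid[i][j] == 0
                and digit not in grid[i]                          # horizontal scan
                and all(grid[r][j] != digit for r in range(9))    # vertical scan
            ]
            if len(candidates) == 1:
                found.append(candidates[0])
    return found


if __name__ == "__main__":
    # Made-up partial grid: eights in rows 0 and 1 rule out the first two
    # rows of the top middle sector, and two filled squares leave one spot.
    grid = [[0] * 9 for _ in range(9)]
    grid[0][0] = 8
    grid[1][6] = 8
    grid[2][3] = 1
    grid[2][5] = 2
    print(implied_cells(grid, 8))
```

In this made-up grid the only forced eight is in the bottom row of the top middle sector, which mirrors the reasoning in the first audience answer.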
201 00:10:34,140 --> 00:10:37,380 And then because of that, our puzzle got smaller in the sense 202 00:10:37,380 --> 00:10:41,850 that there's fewer blank locations, blank squares. 203 00:10:41,850 --> 00:10:44,740 And then that helps us move forward. 204 00:10:44,740 --> 00:10:47,280 So the easy puzzles are the ones where 205 00:10:47,280 --> 00:10:49,650 fairly straightforward implications 206 00:10:49,650 --> 00:10:55,740 like the ones we did here always exist, are easy to find. 207 00:10:55,740 --> 00:10:59,040 Sometimes you have to search a little bit, look at the top, 208 00:10:59,040 --> 00:11:01,300 look at the bottom, look at the middle. 209 00:11:01,300 --> 00:11:03,150 And then you fill things in. 210 00:11:03,150 --> 00:11:04,890 And because you filled things in, 211 00:11:04,890 --> 00:11:07,950 something else now is in play, right? 212 00:11:07,950 --> 00:11:10,500 It becomes viable in terms of an implication. 213 00:11:10,500 --> 00:11:12,660 The fact that I put an eight in there 214 00:11:12,660 --> 00:11:14,470 implies that the eight is now taken-- 215 00:11:14,470 --> 00:11:15,660 its location. 216 00:11:15,660 --> 00:11:18,210 And so now obviously there's only four left here. 217 00:11:18,210 --> 00:11:20,250 And the fact that there's an eight here 218 00:11:20,250 --> 00:11:24,390 implies that all of these can't have an eight in them. 219 00:11:24,390 --> 00:11:27,480 These seven locations underneath can't 220 00:11:27,480 --> 00:11:30,730 have an eight in them, right, because of the constraints. 221 00:11:30,730 --> 00:11:33,780 So this shrinking of possibilities 222 00:11:33,780 --> 00:11:37,150 is something that a human does. 223 00:11:37,150 --> 00:11:39,550 And you can kind of go through this process. 224 00:11:39,550 --> 00:11:44,580 It's an iterative process that you go through. 225 00:11:44,580 --> 00:11:50,700 And if you get stuck then you can't do an implication 226 00:11:50,700 --> 00:11:52,200 that gives you a number. 
227 00:11:52,200 --> 00:11:56,250 And some of the harder puzzles you have to-- you 228 00:11:56,250 --> 00:12:00,150 have a couple of choices, and only one of them 229 00:12:00,150 --> 00:12:03,510 is going to be a correct one going forward, 230 00:12:03,510 --> 00:12:06,930 but you don't know that at that moment. 231 00:12:06,930 --> 00:12:08,520 So you now have to guess. 232 00:12:08,520 --> 00:12:10,590 And perhaps you put an eight over here 233 00:12:10,590 --> 00:12:11,820 or an eight over there. 234 00:12:11,820 --> 00:12:13,920 And then you say I'm going to go with an eight over here. 235 00:12:13,920 --> 00:12:15,310 And then you go a little bit further, 236 00:12:15,310 --> 00:12:16,810 and then you realize, wait a minute, 237 00:12:16,810 --> 00:12:18,570 there's no way I can solve this puzzle. 238 00:12:18,570 --> 00:12:20,842 Because I need to put two sevens into this sector. 239 00:12:20,842 --> 00:12:22,800 And then you go back and there's actually a bit 240 00:12:22,800 --> 00:12:24,960 of backtracking that happens-- 241 00:12:24,960 --> 00:12:28,080 a wrong guess, and you need to go backwards. 242 00:12:28,080 --> 00:12:29,760 And it's very hard to do for us. 243 00:12:29,760 --> 00:12:32,910 It's very hard to do for us with pencil and paper, 244 00:12:32,910 --> 00:12:37,350 keeping things in our head, or you know writing down notes 245 00:12:37,350 --> 00:12:38,700 on the side of the paper. 246 00:12:38,700 --> 00:12:42,210 Whereas it's very easy to do that for a computer program. 247 00:12:42,210 --> 00:12:44,402 And we kind of did that already in the eight queens. 248 00:12:44,402 --> 00:12:46,110 But we're going to do that in a much more 249 00:12:46,110 --> 00:12:48,240 systematic way over here. 250 00:12:48,240 --> 00:12:49,980 So you kind of weigh these two ways 251 00:12:49,980 --> 00:12:51,700 of approaching this problem. 
252 00:12:51,700 --> 00:12:53,490 One of which is I'm just going to blast 253 00:12:53,490 --> 00:12:55,500 through the different combinations, 254 00:12:55,500 --> 00:13:01,660 having a giant tree structure in my head of 255 00:13:01,660 --> 00:13:05,960 where, you know this might imply that particular grid location 256 00:13:05,960 --> 00:13:09,000 grid IJ equals eight. 257 00:13:09,000 --> 00:13:13,170 This might imply that grid IJ equals seven. 258 00:13:13,170 --> 00:13:14,850 And there's obviously a huge number 259 00:13:14,850 --> 00:13:18,930 of combinations corresponding to which of these squares 260 00:13:18,930 --> 00:13:22,710 that I pick and what value I assign to those squares. 261 00:13:22,710 --> 00:13:24,720 And then once I do that there's another set. 262 00:13:24,720 --> 00:13:27,100 And this explodes on you very quickly. 263 00:13:27,100 --> 00:13:31,530 And so if you just did this in a completely brutish way, 264 00:13:31,530 --> 00:13:33,450 there's no way your program would ever 265 00:13:33,450 --> 00:13:35,460 end, even on a simple puzzle. 266 00:13:35,460 --> 00:13:40,350 But thanks to these rules, it turns out a fairly 267 00:13:40,350 --> 00:13:43,202 straightforward program that's 20 lines of code is going 268 00:13:43,202 --> 00:13:45,660 to solve most problems-- at least the ones that I've looked 269 00:13:45,660 --> 00:13:47,190 at-- 270 00:13:47,190 --> 00:13:49,110 in a reasonable amount of time. 271 00:13:49,110 --> 00:13:53,500 And then it's just interesting from an algorithmic standpoint 272 00:13:53,500 --> 00:13:56,730 and an efficiency standpoint to look and see 273 00:13:56,730 --> 00:14:01,090 how we can take that fairly naive approach which 274 00:14:01,090 --> 00:14:04,950 does have some pruning but it's exhaustive and improve that. 275 00:14:04,950 --> 00:14:07,740 All right, so that's kind of where we are headed. 276 00:14:07,740 --> 00:14:08,360 Make sense? 
277 00:14:08,360 --> 00:14:12,550 Any questions about what we have so far? 278 00:14:12,550 --> 00:14:13,480 All right. 279 00:14:13,480 --> 00:14:18,780 So what we're going to do is go ahead and look 280 00:14:18,780 --> 00:14:21,780 at code for Sudoku that-- 281 00:14:21,780 --> 00:14:27,040 so all you need to think about now for the next few minutes, 282 00:14:27,040 --> 00:14:30,220 because we're going to move into exhaustive search mode and code 283 00:14:30,220 --> 00:14:33,166 things up in a computer, is just those rules. 284 00:14:33,166 --> 00:14:34,540 So you don't have to really think 285 00:14:34,540 --> 00:14:38,200 about horizontal scans and implications or vertical scans. 286 00:14:38,200 --> 00:14:40,510 That's going to come a little bit later. 287 00:14:40,510 --> 00:14:44,530 And we're going to vary that up as well. 288 00:14:44,530 --> 00:14:48,760 But we're going to do kind of what we did for the N-queens 289 00:14:48,760 --> 00:14:52,690 problem, which is set up a recursive search that 290 00:14:52,690 --> 00:14:56,410 is going to explore all of these different possibilities. 291 00:14:56,410 --> 00:15:01,880 And the equivalent of no conflicts for the eight queens 292 00:15:01,880 --> 00:15:07,300 or the N-queens problem, which said there's no conflicts here 293 00:15:07,300 --> 00:15:09,790 because none of the queens attack each other, we're going 294 00:15:09,790 --> 00:15:12,880 to have something that essentially says this is valid 295 00:15:12,880 --> 00:15:15,484 so far, none of the rules of Sudoku 296 00:15:15,484 --> 00:15:16,900 corresponding to these three rules 297 00:15:16,900 --> 00:15:19,630 that I have up on the board are going to be violated. 298 00:15:19,630 --> 00:15:22,360 All right so let's take a look. 299 00:15:22,360 --> 00:15:31,330 And so this structure hopefully, given that we've done N-queens, 300 00:15:31,330 --> 00:15:36,970 should be a little bit easier to understand. 
301 00:15:36,970 --> 00:15:42,240 So when we did eight queens or N-queens, 302 00:15:42,240 --> 00:15:44,454 we decided to start column by column. 303 00:15:44,454 --> 00:15:45,870 And we could have done row by row, 304 00:15:45,870 --> 00:15:48,000 but we decided to start column by column. 305 00:15:48,000 --> 00:15:49,920 And it was a fairly straightforward puzzle. 306 00:15:49,920 --> 00:15:53,490 There's obviously no real change in an eight queens puzzle. 307 00:15:53,490 --> 00:15:55,530 I mean, you're solving the same puzzle as I am. 308 00:15:55,530 --> 00:15:57,850 But if I give you a Sudoku puzzle, 309 00:15:57,850 --> 00:16:00,330 there's more variety to it in the sense that 310 00:16:00,330 --> 00:16:02,730 depending on what I fill up-- 311 00:16:02,730 --> 00:16:04,620 the hard puzzles are the ones that 312 00:16:04,620 --> 00:16:11,430 are kind of intermediate in the sense of they're not obviously 313 00:16:11,430 --> 00:16:15,430 fully filled and they're not empty. 314 00:16:15,430 --> 00:16:17,310 Right if it's completely empty then 315 00:16:17,310 --> 00:16:19,890 it's trivial to solve a Sudoku puzzle. 316 00:16:19,890 --> 00:16:21,870 You can take any solution to a Sudoku puzzle 317 00:16:21,870 --> 00:16:24,630 and present it as a solution to the empty puzzle. 318 00:16:24,630 --> 00:16:27,560 And then if everything is full except for two things, 319 00:16:27,560 --> 00:16:29,310 I mean it's kind of obvious what those two 320 00:16:29,310 --> 00:16:32,700 things are assuming that the puzzle had a valid solution. 321 00:16:32,700 --> 00:16:35,370 So really it's puzzles like this where 322 00:16:35,370 --> 00:16:39,820 maybe a third are full that are more difficult. 323 00:16:39,820 --> 00:16:43,620 And it's kind of a separate school, a little community that 324 00:16:43,620 --> 00:16:46,530 designs puzzles and tries to create hard puzzles. 
325 00:16:46,530 --> 00:16:50,160 And they try and make the human's problem 326 00:16:50,160 --> 00:16:54,960 harder by making this requirement of look ahead, 327 00:16:54,960 --> 00:16:56,130 like I mentioned. 328 00:16:56,130 --> 00:16:59,010 So good. 329 00:16:59,010 --> 00:17:01,990 So let's take a look at the code here. 330 00:17:01,990 --> 00:17:06,660 So what I want to do here-- and the first part here is-- 331 00:17:06,660 --> 00:17:10,230 as I said, we went column by column in the case of N-queens. 332 00:17:10,230 --> 00:17:11,980 And the question is, where do I start. 333 00:17:11,980 --> 00:17:14,579 I want to do something in a fairly naive way. 334 00:17:14,579 --> 00:17:17,640 And what I'm going to do is I'm going to do some sort of scan. 335 00:17:17,640 --> 00:17:19,230 I'm going to scan like that. 336 00:17:19,230 --> 00:17:20,440 And I'm going to find-- 337 00:17:20,440 --> 00:17:24,780 as I do the scan, I'm going to find the next empty grid 338 00:17:24,780 --> 00:17:25,920 location. 339 00:17:25,920 --> 00:17:28,380 And I'm going to say that is going to be something that I'm 340 00:17:28,380 --> 00:17:29,520 going to try and fill in. 341 00:17:29,520 --> 00:17:34,820 OK so it's not going to be I discovered this eight in the-- 342 00:17:34,820 --> 00:17:38,490 it was, if you count in terms of the empty locations, 343 00:17:38,490 --> 00:17:41,190 if I went this way, it was the fourth empty location 344 00:17:41,190 --> 00:17:43,000 and decided to fill that up. 345 00:17:43,000 --> 00:17:45,780 But here I'm just-- in this code I'm 346 00:17:45,780 --> 00:17:48,936 going to try one, two, three, four, five, six, seven, eight, 347 00:17:48,936 --> 00:17:50,700 nine here. 348 00:17:50,700 --> 00:17:53,100 And the first part of the code here that 349 00:17:53,100 --> 00:17:55,680 says what is the name of this procedure. 350 00:17:55,680 --> 00:17:58,390 It says find next cell to fill. 
351 00:17:58,390 --> 00:18:00,270 Which means what its name is-- 352 00:18:00,270 --> 00:18:01,210 find the next cell. 353 00:18:01,210 --> 00:18:03,400 Find the next grid location to fill. 354 00:18:03,400 --> 00:18:06,420 And it simply goes for X in range zero 355 00:18:06,420 --> 00:18:09,720 through nine, for Y in range zero through nine. 356 00:18:09,720 --> 00:18:13,380 I'm assuming that since zero is not a valid entry here, 357 00:18:13,380 --> 00:18:15,830 I could use zero to signify empty. 358 00:18:15,830 --> 00:18:16,380 OK? 359 00:18:16,380 --> 00:18:19,410 It's only one through nine, so zero can signify empty. 360 00:18:19,410 --> 00:18:22,770 And this just returns X and Y corresponding 361 00:18:22,770 --> 00:18:26,740 to the first empty location. 362 00:18:26,740 --> 00:18:28,950 So in this case it would just return (0,0). 363 00:18:28,950 --> 00:18:29,790 OK? 364 00:18:29,790 --> 00:18:33,540 And then if this were full then it would go-- 365 00:18:33,540 --> 00:18:36,960 obviously the X changes. 366 00:18:36,960 --> 00:18:42,060 And when X changes, you're going over to the right. 367 00:18:42,060 --> 00:18:46,470 And so you would get a (1,0) back. 368 00:18:46,470 --> 00:18:49,230 If this were filled, the next time around I'd 369 00:18:49,230 --> 00:18:51,900 get this grid location, et cetera. 370 00:18:51,900 --> 00:18:52,740 It doesn't matter. 371 00:18:52,740 --> 00:18:54,900 Just like I could go column by column or row 372 00:18:54,900 --> 00:18:58,440 by row, as long as I have a deterministic way 373 00:18:58,440 --> 00:19:03,600 of discovering the empty location-- 374 00:19:03,600 --> 00:19:06,930 and usually you want to have the same way of discovering 375 00:19:06,930 --> 00:19:08,200 the empty location. 
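The scan just described can be sketched as follows. This is a reconstruction from the spoken description, not the code on the slide: the snake_case name `find_next_cell_to_fill` and the `(-1, -1)` return value for a full grid are assumptions, and 0 stands for an empty square as stated in the lecture.

```python
def find_next_cell_to_fill(grid):
    """Return the indices (x, y) of the first empty cell, scanning in a
    fixed order. A cell holding 0 is treated as empty.

    Returning (-1, -1) when nothing is empty is an assumption made for
    this sketch; the lecture does not say what happens in that case.
    """
    for x in range(9):
        for y in range(9):
            if grid[x][y] == 0:
                return x, y
    return -1, -1


if __name__ == "__main__":
    empty_grid = [[0] * 9 for _ in range(9)]
    print(find_next_cell_to_fill(empty_grid))
```

Any fixed scan order works here, which matches the point in the lecture that the rest of the code only needs some empty location returned to it.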
376 00:19:08,200 --> 00:19:09,840 But even that is not a requirement 377 00:19:09,840 --> 00:19:11,700 as long as there's an empty location 378 00:19:11,700 --> 00:19:15,420 and your find next cell to fill finds that empty location 379 00:19:15,420 --> 00:19:18,840 and returns it to you, you're good, and the rest of our code 380 00:19:18,840 --> 00:19:20,340 is going to work. 381 00:19:20,340 --> 00:19:23,190 But no reason to get more complicated 382 00:19:23,190 --> 00:19:25,470 than what I have up there. 383 00:19:25,470 --> 00:19:28,310 So find next cell to fill makes sense? 384 00:19:28,310 --> 00:19:29,160 We good with that? 385 00:19:29,160 --> 00:19:30,240 All right. 386 00:19:30,240 --> 00:19:35,130 And generally with exhaustive search 387 00:19:35,130 --> 00:19:42,501 the key procedure is always do you have a valid solution 388 00:19:42,501 --> 00:19:43,000 or not? 389 00:19:43,000 --> 00:19:45,130 And you may not have a complete solution. 390 00:19:45,130 --> 00:19:47,250 Think of it as a partial configuration. 391 00:19:47,250 --> 00:19:52,290 So this is a partial configuration that is valid. 392 00:19:52,290 --> 00:19:54,751 It's not a complete solution to the Sudoku puzzle. 393 00:19:54,751 --> 00:19:57,000 It's a partial configuration that's valid in the sense 394 00:19:57,000 --> 00:19:59,520 that it satisfies all of the constraints. 395 00:19:59,520 --> 00:20:01,200 You know, if I put another eight in here 396 00:20:01,200 --> 00:20:04,410 it would not be a valid partial configuration. 397 00:20:04,410 --> 00:20:05,970 It would be partial, but not valid. 398 00:20:05,970 --> 00:20:07,800 Right? 399 00:20:07,800 --> 00:20:09,180 I need to grow this. 400 00:20:09,180 --> 00:20:11,070 I need to grow this into a solution. 401 00:20:11,070 --> 00:20:13,350 And when I say solution I mean all the constraints 402 00:20:13,350 --> 00:20:15,770 have to be satisfied. 403 00:20:15,770 --> 00:20:18,800 A configuration could be invalid or valid. 
404 00:20:18,800 --> 00:20:20,740 A solution is always valid. 405 00:20:20,740 --> 00:20:21,240 All right? 406 00:20:21,240 --> 00:20:23,260 That's just terminology. 407 00:20:23,260 --> 00:20:29,190 And so I want to be able to look high up 408 00:20:29,190 --> 00:20:33,100 and be able to truncate the search and say, 409 00:20:33,100 --> 00:20:35,610 you know what, grid IJ equaling eight, 410 00:20:35,610 --> 00:20:39,600 because I put an eight in that sector which already 411 00:20:39,600 --> 00:20:45,220 had an eight in it, is something that should not be explored. 412 00:20:45,220 --> 00:20:47,580 And I don't have to worry about any 413 00:20:47,580 --> 00:20:50,400 of the branches that come here. 414 00:20:50,400 --> 00:20:53,670 Because immediately I've violated the constraint. 415 00:20:53,670 --> 00:20:55,920 So in general, I can always check 416 00:20:55,920 --> 00:20:59,910 whether partial configurations violate the three 417 00:20:59,910 --> 00:21:01,680 constraints I have or not. 418 00:21:01,680 --> 00:21:04,750 And that is what this piece of code does. 419 00:21:04,750 --> 00:21:07,030 And it's also straightforward. 420 00:21:07,030 --> 00:21:13,110 It's perhaps even more straightforward 421 00:21:13,110 --> 00:21:17,070 than diagonal checking in the case of eight queens. 422 00:21:17,070 --> 00:21:20,490 But all this does is use the construct that 423 00:21:20,490 --> 00:21:21,960 says I'm going to look at-- 424 00:21:25,144 --> 00:21:26,685 essentially this is something that is 425 00:21:26,685 --> 00:21:29,090 list comprehensions in Python. 426 00:21:29,090 --> 00:21:32,220 The for comes after this predicate here. 427 00:21:32,220 --> 00:21:35,480 But effectively what you're saying is for X in range nine, 428 00:21:35,480 --> 00:21:39,750 check that grid IX is not equal to E. OK, 429 00:21:39,750 --> 00:21:41,370 and you're just looking at E. 
430 00:21:41,370 --> 00:21:47,610 So is valid, grid IJE takes the grid which looks like this one, 431 00:21:47,610 --> 00:21:51,660 let's say, and so it's got zeros in all of the empty places. 432 00:21:51,660 --> 00:21:55,200 And it's got a bunch of non-zero entries in all of the places 433 00:21:55,200 --> 00:21:56,340 that you see here. 434 00:21:56,340 --> 00:22:00,660 And in addition, you have perhaps zero, zero, 435 00:22:00,660 --> 00:22:03,550 and let's call it one. 436 00:22:03,550 --> 00:22:09,070 And so this is I, and this is J, and that is E. OK. 437 00:22:09,070 --> 00:22:13,620 And so let me write that out here, I, J, and E, 438 00:22:13,620 --> 00:22:17,620 where I is one-- 439 00:22:17,620 --> 00:22:22,440 zero, I'm sorry, I is zero, J is zero, and E is 1. 440 00:22:22,440 --> 00:22:25,670 So that would mean putting a one up here 441 00:22:25,670 --> 00:22:29,370 and that obviously is going to violate one of our constraints. 442 00:22:29,370 --> 00:22:30,480 But that's fine. 443 00:22:30,480 --> 00:22:31,770 We're going to check that. 444 00:22:31,770 --> 00:22:36,390 And it's essentially doing incremental checking 445 00:22:36,390 --> 00:22:37,720 just like we did. 446 00:22:37,720 --> 00:22:44,640 So it's not checking to see that all of the existing grid 447 00:22:44,640 --> 00:22:49,640 IJ values are conflicting or not. 448 00:22:49,640 --> 00:22:52,250 It's just saying I have an-- 449 00:22:52,250 --> 00:22:55,190 I'm going to be writing something into this grid, 450 00:22:55,190 --> 00:22:56,690 into an empty location. 451 00:22:56,690 --> 00:22:59,690 It happens to be zero, zero having the value one. 452 00:22:59,690 --> 00:23:02,180 And I'm going to check whether the introduction 453 00:23:02,180 --> 00:23:06,650 of a one into this square is going to cause problems or not. 454 00:23:06,650 --> 00:23:08,600 That's all that it's doing-- 455 00:23:08,600 --> 00:23:12,380 incremental, just like we had with eight queens. 
456 00:23:12,380 --> 00:23:15,620 And that check is relatively easy to do, 457 00:23:15,620 --> 00:23:20,900 because I just need to go and I look at the row corresponding 458 00:23:20,900 --> 00:23:25,274 to I, which in this case is the top row. 459 00:23:25,274 --> 00:23:26,690 I look at the column corresponding 460 00:23:26,690 --> 00:23:29,360 to J, which is the leftmost column, 461 00:23:29,360 --> 00:23:31,910 and then I look at the sector corresponding 462 00:23:31,910 --> 00:23:35,450 to zero, zero, which is this top left sector. 463 00:23:35,450 --> 00:23:38,210 And I check to see for each of those three things, 464 00:23:38,210 --> 00:23:39,810 whether there's a problem or not. 465 00:23:39,810 --> 00:23:41,050 And the first two-- 466 00:23:41,050 --> 00:23:44,180 well actually, I have a problem with the row. 467 00:23:44,180 --> 00:23:47,180 And I would also have a problem with the sector. 468 00:23:47,180 --> 00:23:49,620 I wouldn't have a problem with the column. 469 00:23:49,620 --> 00:23:52,580 But one of them is bad enough. 470 00:23:52,580 --> 00:23:54,140 And so I'm going to get a false. 471 00:23:54,140 --> 00:23:59,280 So row OK is going to be false. 472 00:23:59,280 --> 00:24:02,440 And so I'm going to return false out here. 473 00:24:02,440 --> 00:24:04,271 All right, that make sense? 474 00:24:04,271 --> 00:24:05,520 So those are the three things. 475 00:24:05,520 --> 00:24:07,680 And there's really not that much here 476 00:24:07,680 --> 00:24:12,800 beyond taking those constraints and codifying them. 477 00:24:12,800 --> 00:24:13,610 Right. 478 00:24:13,610 --> 00:24:15,091 Any questions? 479 00:24:15,091 --> 00:24:15,590 Yeah. 480 00:24:15,590 --> 00:24:16,090 Fadi. 
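The incremental check just described can be sketched in Python roughly as follows. This is a minimal sketch, not necessarily the exact code on the slide: the name `is_valid`, the 9x9 list-of-lists `grid` with 0 marking an empty cell, and the small example grid are assumptions made for illustration.

```python
def is_valid(grid, i, j, e):
    """Return True if writing e at row i, column j violates no constraint."""
    # Row check: e must not already appear anywhere in row i.
    row_ok = all(e != grid[i][x] for x in range(9))
    # Column check: e must not already appear anywhere in column j.
    col_ok = all(e != grid[x][j] for x in range(9))
    if not (row_ok and col_ok):
        return False
    # Sector check: find the top-left corner of the 3x3 sector holding (i, j).
    sec_top, sec_left = 3 * (i // 3), 3 * (j // 3)
    for x in range(sec_top, sec_top + 3):
        for y in range(sec_left, sec_left + 3):
            if grid[x][y] == e:
                return False
    return True


# Hypothetical example grid: 0 marks an empty cell.
grid = [[0] * 9 for _ in range(9)]
grid[0][5] = 1  # a 1 already in the top row
grid[1][1] = 1  # a 1 already in the top-left sector

print(is_valid(grid, 0, 0, 1))  # the row and the sector both conflict
print(is_valid(grid, 0, 0, 2))  # nothing in the grid conflicts with a 2
```

Only the row, column, and sector of the one cell being written are examined, which is what makes the check incremental.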
481 00:24:16,090 --> 00:24:18,542 AUDIENCE: What is the all thing-- 482 00:24:18,542 --> 00:24:20,250 SRINI DEVADAS: Ah, the all is essentially 483 00:24:20,250 --> 00:24:23,010 a Python built in function that is 484 00:24:23,010 --> 00:24:27,120 going to essentially say that-- 485 00:24:27,120 --> 00:24:31,170 it's going to-- it's a conjunction that 486 00:24:31,170 --> 00:24:35,850 says I'm getting a bunch of Booleans that correspond 487 00:24:35,850 --> 00:24:39,870 to the generation of this list comprehension 488 00:24:39,870 --> 00:24:45,790 where E not equal to grid IX is going to give me true or false. 489 00:24:45,790 --> 00:24:49,500 And I need all of those things to be true. 490 00:24:49,500 --> 00:24:51,111 All right? 491 00:24:51,111 --> 00:24:53,710 AUDIENCE: Okay, it's always going to be a Boolean there 492 00:24:53,710 --> 00:24:57,140 and that depends on whether all of the elements 493 00:24:57,140 --> 00:24:58,140 of the list itself are-- 494 00:24:58,140 --> 00:24:59,550 SRINI DEVADAS: It's a conjunction. 495 00:24:59,550 --> 00:24:59,970 Yeah, that's right. 496 00:24:59,970 --> 00:25:01,440 So and, think of it as an and. 497 00:25:01,440 --> 00:25:03,749 Even if one of them is false, the and is false. 498 00:25:03,749 --> 00:25:05,790 In order for the and to be true, then all of them 499 00:25:05,790 --> 00:25:06,750 need to be true. 500 00:25:06,750 --> 00:25:07,960 All right? 501 00:25:07,960 --> 00:25:10,380 So it's just a convenient construct which 502 00:25:10,380 --> 00:25:14,870 is applicable in sort of-- 503 00:25:14,870 --> 00:25:17,500 the perfect application is what you see here. 504 00:25:17,500 --> 00:25:20,880 It's not the most sophisticated of applications, 505 00:25:20,880 --> 00:25:23,520 but it works very well in this case. 506 00:25:23,520 --> 00:25:26,280 Now for the sector, I can't actually do that. 
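To make the conjunction concrete, here is a small sketch of what `all` does with the Booleans produced for a single row. The row contents and the helper name `row_ok_loop` are hypothetical, chosen only for illustration.

```python
row = [0, 0, 0, 0, 0, 1, 0, 0, 0]  # hypothetical top row that already holds a 1
e = 1

# One Boolean per cell: True where e differs from the value already there.
checks = [e != value for value in row]
print(checks)
print(all(checks))  # a single False entry makes the whole conjunction False


def row_ok_loop(row, e):
    """Explicit-loop equivalent of all(e != value for value in row)."""
    for value in row:
        if e == value:
            return False
    return True
```

The explicit loop and the `all` expression give the same answer; `all` is just the more compact way to write it.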
507 00:25:26,280 --> 00:25:28,914 And so there's a little bit more work, because I can't-- 508 00:25:28,914 --> 00:25:30,330 this only works when-- and I could 509 00:25:30,330 --> 00:25:34,770 put a list comprehension in here like this 510 00:25:34,770 --> 00:25:36,062 and generate all the Booleans. 511 00:25:36,062 --> 00:25:38,520 For the sector I end up having to do something a little bit 512 00:25:38,520 --> 00:25:39,180 different. 513 00:25:39,180 --> 00:25:42,930 I mean you could do things more convoluted and use all in here 514 00:25:42,930 --> 00:25:46,640 as well, but it's not worth it. 515 00:25:46,640 --> 00:25:47,990 OK, that make sense? 516 00:25:47,990 --> 00:25:48,830 Good. 517 00:25:48,830 --> 00:25:54,290 So here's the core routine that corresponds to the search. 518 00:25:54,290 --> 00:25:56,610 And ignore this global variable here. 519 00:25:56,610 --> 00:25:58,450 I'll explain that in a minute. 520 00:25:58,450 --> 00:26:00,080 That's going to be our metric. 521 00:26:00,080 --> 00:26:02,240 Backtracks is going to be our metric for computing 522 00:26:02,240 --> 00:26:02,805 performance. 523 00:26:02,805 --> 00:26:04,430 And it's going to be quite interesting. 524 00:26:04,430 --> 00:26:07,840 It's going to produce some interesting results for us 525 00:26:07,840 --> 00:26:12,170 when we run this on various different examples. 526 00:26:12,170 --> 00:26:15,980 But this core procedure looks a lot 527 00:26:15,980 --> 00:26:18,410 like the n-queens search in the sense 528 00:26:18,410 --> 00:26:22,430 that you have a for loop and a recursive call. 529 00:26:22,430 --> 00:26:29,450 And in this case the for loop is going 530 00:26:29,450 --> 00:26:33,590 to be something that ranges through the different values, 531 00:26:33,590 --> 00:26:39,230 that you find a location that you want to put something into, 532 00:26:39,230 --> 00:26:44,090 which is the next empty location in your current configuration. 
533 00:26:44,090 --> 00:26:47,270 And then you need to go put in one through nine in there. 534 00:26:47,270 --> 00:26:49,030 And it's brutish. 535 00:26:49,030 --> 00:26:50,900 You're going to put in one and you're 536 00:26:50,900 --> 00:26:52,100 going to check conflicts. 537 00:26:52,100 --> 00:26:53,810 And then you'll put in two and you're 538 00:26:53,810 --> 00:26:55,020 going to check conflicts. 539 00:26:55,020 --> 00:26:57,300 If you put in a one and you don't get a conflict, 540 00:26:57,300 --> 00:26:58,610 then you get to recur. 541 00:26:58,610 --> 00:27:00,680 And you now move into something that 542 00:27:00,680 --> 00:27:05,900 is another partial configuration, potentially, 543 00:27:05,900 --> 00:27:08,480 but obviously has one location filled 544 00:27:08,480 --> 00:27:11,990 from the caller configuration. 545 00:27:11,990 --> 00:27:14,010 And then you go and look for the next cell. 546 00:27:14,010 --> 00:27:18,140 So it's certainly possible that I'd go-- 547 00:27:18,140 --> 00:27:21,150 when I put in a one here that fails, but if I put in a two 548 00:27:21,150 --> 00:27:24,075 here, it's not going to fail. 549 00:27:24,075 --> 00:27:27,530 A two is not going to fail here because, if I just look 550 00:27:27,530 --> 00:27:29,870 at those constraints, a two is OK. 551 00:27:29,870 --> 00:27:31,970 All right, so I'm going to put in a two here. 552 00:27:31,970 --> 00:27:33,350 And then I'm going to recur. 553 00:27:33,350 --> 00:27:36,440 And I'm going to go out here, and I'll try and put in a one 554 00:27:36,440 --> 00:27:37,250 here. 555 00:27:37,250 --> 00:27:39,710 And a one is going to fail because of this and that. 556 00:27:39,710 --> 00:27:41,800 A two is going to fail because of that. 557 00:27:41,800 --> 00:27:45,920 A three-- is a three going to fail? 558 00:27:45,920 --> 00:27:47,190 No, not immediately. 559 00:27:47,190 --> 00:27:48,690 So I could put in a three here.
560 00:27:48,690 --> 00:27:50,830 And then I recur and go to the next one, 561 00:27:50,830 --> 00:27:52,390 and so on and so forth, right? 562 00:27:52,390 --> 00:27:57,790 And for each of these things obviously 563 00:27:57,790 --> 00:28:02,930 I have to do a bunch of search underneath. 564 00:28:02,930 --> 00:28:07,770 And you know thank goodness for fast computers, right? 565 00:28:07,770 --> 00:28:09,672 Because otherwise, I mean God, I mean 566 00:28:09,672 --> 00:28:11,130 can you imagine the amount of paper 567 00:28:11,130 --> 00:28:14,580 we'd generate if you were doing this and putting two and three 568 00:28:14,580 --> 00:28:16,830 and I want a new sheet of paper for the four, 569 00:28:16,830 --> 00:28:17,760 et cetera, et cetera. 570 00:28:17,760 --> 00:28:19,676 I mean, we can count the number of backtracks. 571 00:28:19,676 --> 00:28:21,521 That's how many sheets of paper you'll need. 572 00:28:21,521 --> 00:28:22,020 OK. 573 00:28:24,600 --> 00:28:28,785 So what you see here, again ignore the backtracks, 574 00:28:28,785 --> 00:28:30,660 I'll get to that in just a second-- it's just 575 00:28:30,660 --> 00:28:33,750 a way of counting the number of calls. 576 00:28:33,750 --> 00:28:39,570 And this thing here essentially says 577 00:28:39,570 --> 00:28:42,410 I'm going to be returning-- 578 00:28:42,410 --> 00:28:45,550 as long as I get through and find a solution 579 00:28:45,550 --> 00:28:47,610 I want to return true. 580 00:28:47,610 --> 00:28:50,040 So if solve Sudoku grid IJ is true, 581 00:28:50,040 --> 00:28:51,490 then I'm going to return true.
582 00:28:51,490 --> 00:28:54,710 And then I'm going to pop up all the way to the top, 583 00:28:54,710 --> 00:28:56,670 assuming I got-- 584 00:28:56,670 --> 00:28:59,610 I go all the way down to the bottom and I get to the point 585 00:28:59,610 --> 00:29:02,610 where I have a solution that returns true, 586 00:29:02,610 --> 00:29:04,890 which is a completely full configuration that 587 00:29:04,890 --> 00:29:05,820 returns true. 588 00:29:05,820 --> 00:29:06,520 Right? 589 00:29:06,520 --> 00:29:10,590 But if not, then I need to go try the other combinations 590 00:29:10,590 --> 00:29:13,640 and I'm only going to make that recursive 591 00:29:13,640 --> 00:29:18,830 call if, obviously IJE, corresponding to this, 592 00:29:18,830 --> 00:29:20,730 is valid. 593 00:29:20,730 --> 00:29:23,340 And that checks the constraints. 594 00:29:23,340 --> 00:29:25,320 And the only other thing I have to worry about 595 00:29:25,320 --> 00:29:32,940 is essentially something that says reset your grid location 596 00:29:32,940 --> 00:29:38,970 and make sure that you're setting it back 597 00:29:38,970 --> 00:29:41,019 to zero after you're done. 598 00:29:41,019 --> 00:29:42,810 Right, and so I've just made a choice here, 599 00:29:42,810 --> 00:29:48,330 grid IJ equals E. If I look at this line of code here, 600 00:29:48,330 --> 00:29:52,110 this is resetting the grid IJ equals E and saying it's empty. 601 00:29:52,110 --> 00:29:54,960 Because if I've failed in all of these 602 00:29:54,960 --> 00:29:57,840 and I haven't returned true in all of these, then obviously I 603 00:29:57,840 --> 00:29:59,130 want to change this. 604 00:29:59,130 --> 00:30:01,320 And you could argue that the next time around 605 00:30:01,320 --> 00:30:04,050 if I and J are exactly the same-- 606 00:30:04,050 --> 00:30:06,760 because I and J are set up here-- 607 00:30:06,760 --> 00:30:10,740 then I'm going to overwrite the E from a one to a two, 608 00:30:10,740 --> 00:30:12,060 et cetera, et cetera.
609 00:30:12,060 --> 00:30:15,460 And so that is, in fact, correct. 610 00:30:15,460 --> 00:30:23,150 But I do need to reset this outside of the loop, 611 00:30:23,150 --> 00:30:24,280 if not inside of the loop. 612 00:30:24,280 --> 00:30:27,450 So it's not like I can get away with this line of code. 613 00:30:27,450 --> 00:30:30,420 In general, if you ever backtrack, 614 00:30:30,420 --> 00:30:32,700 you have to go back and undo your decision. 615 00:30:32,700 --> 00:30:35,190 And you have to erase the tree. 616 00:30:35,190 --> 00:30:38,130 And that's essentially what that grid IJ equaling zero is doing. 617 00:30:38,130 --> 00:30:41,430 You just need to undo that decision. 618 00:30:41,430 --> 00:30:45,690 And you can do this a few different ways. 619 00:30:45,690 --> 00:30:47,280 But the biggest thing to remember 620 00:30:47,280 --> 00:30:51,180 when you do recursive search is to get your-- 621 00:30:51,180 --> 00:30:52,830 the undoing of your decision, which 622 00:30:52,830 --> 00:30:55,590 is what we call backtracking, to be correct. 623 00:30:55,590 --> 00:31:00,930 And if you ever leave a mess, then you'd have a problem. 624 00:31:05,000 --> 00:31:08,190 That's also true in the case of the N-queens problem. 625 00:31:08,190 --> 00:31:09,590 So I'm going to go ahead and-- 626 00:31:09,590 --> 00:31:11,430 and this is just a print routine. 627 00:31:11,430 --> 00:31:14,926 So this is not exactly the Sudoku that I have up there, 628 00:31:14,926 --> 00:31:16,550 the Sudoku puzzle that I have up there, 629 00:31:16,550 --> 00:31:19,950 but it's kind of roughly similar in complexity. 630 00:31:19,950 --> 00:31:24,600 And I could go ahead and run the Sudoku program. 631 00:31:24,600 --> 00:31:30,600 And for each of those different Sudoku problems, 632 00:31:30,600 --> 00:31:33,330 it's producing solved puzzles. 633 00:31:33,330 --> 00:31:35,370 So this is a solved puzzle. 
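The core search routine and its undo step can be sketched as below. This is a hedged reconstruction under stated assumptions rather than the code shown in class: the names `solve_sudoku`, `find_next_cell_to_fill`, and `is_valid`, the `(-1, -1)` sentinel for a full grid, and the sample puzzle are all illustrative choices.

```python
backtracks = 0  # counts how many guesses had to be undone


def find_next_cell_to_fill(grid):
    """Return the first empty cell in row-major order, or (-1, -1) if full."""
    for i in range(9):
        for j in range(9):
            if grid[i][j] == 0:
                return i, j
    return -1, -1


def is_valid(grid, i, j, e):
    """Incremental check: would writing e at (i, j) break any constraint?"""
    if not all(e != grid[i][x] for x in range(9)):
        return False
    if not all(e != grid[x][j] for x in range(9)):
        return False
    top, left = 3 * (i // 3), 3 * (j // 3)
    for x in range(top, top + 3):
        for y in range(left, left + 3):
            if grid[x][y] == e:
                return False
    return True


def solve_sudoku(grid):
    """Fill grid in place; return True if a complete solution was found."""
    global backtracks
    i, j = find_next_cell_to_fill(grid)
    if i == -1:
        return True  # no empty cell left: the configuration is a solution
    for e in range(1, 10):
        if is_valid(grid, i, j, e):
            grid[i][j] = e           # make the guess
            if solve_sudoku(grid):
                return True          # pop all the way back up
            backtracks += 1          # the guess was wrong
            grid[i][j] = 0           # undo the decision before moving on
    return False


# Sample puzzle for illustration (0 marks an empty cell).
puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
grid = [row[:] for row in puzzle]  # work on a copy so the givens survive
solved = solve_sudoku(grid)
print(solved, backtracks)
```

Resetting `grid[i][j]` to 0 inside the loop, right after a failed recursive call, guarantees the cell is empty again when the function finally returns False to its caller.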
634 00:31:35,370 --> 00:31:37,950 You can check this puzzle just real quick 635 00:31:37,950 --> 00:31:42,330 and you'll find that all of the constraints are satisfied. 636 00:31:42,330 --> 00:31:45,070 And I'm going to explain backtracks in a second. 637 00:31:45,070 --> 00:31:46,870 So true says that there's a solution. 638 00:31:46,870 --> 00:31:49,170 The number of backtracks was 579. 639 00:31:49,170 --> 00:31:51,780 For the second puzzle, which was a little bit harder, 640 00:31:51,780 --> 00:31:53,940 the number of backtracks was 6363. 641 00:31:53,940 --> 00:31:56,190 I'm sorry, this is just scrolling. 642 00:31:56,190 --> 00:32:00,622 And for the fourth one, it was 335,000-- 643 00:32:00,622 --> 00:32:01,830 I'm sorry, for the third one. 644 00:32:01,830 --> 00:32:06,470 And for the fourth one, was 9949. 645 00:32:06,470 --> 00:32:11,180 These last two puzzles, hard and diff, 646 00:32:11,180 --> 00:32:14,820 there was a Finnish guy called-- 647 00:32:14,820 --> 00:32:17,660 there is a Finnish guy called Arto Inkala, 648 00:32:17,660 --> 00:32:20,120 who designs puzzles. 649 00:32:20,120 --> 00:32:24,140 And he claimed that this hard puzzle in 2006 650 00:32:24,140 --> 00:32:27,500 was the hardest puzzle ever designed in Sudoku. 651 00:32:27,500 --> 00:32:30,650 And then in 2010 he came up with this more difficult puzzle, 652 00:32:30,650 --> 00:32:32,840 according to him, that required a lot of look 653 00:32:32,840 --> 00:32:36,320 ahead from a standpoint of the human being. 654 00:32:36,320 --> 00:32:39,530 Like if we went back to what I said you can't quite 655 00:32:39,530 --> 00:32:40,580 do this implication. 656 00:32:40,580 --> 00:32:42,050 You have to kind of make a guess. 657 00:32:42,050 --> 00:32:44,330 And then you have to go further and further down. 
658 00:32:44,330 --> 00:32:46,820 And I think the claim was that the hard puzzle required 659 00:32:46,820 --> 00:32:50,600 like five levels of look ahead, and then the difficult puzzle 660 00:32:50,600 --> 00:32:52,670 required six levels of look ahead. 661 00:32:52,670 --> 00:32:55,250 And obviously, given that look ahead, 662 00:32:55,250 --> 00:32:57,770 this puzzle has to have an initial configuration that's 663 00:32:57,770 --> 00:32:58,640 solvable. 664 00:32:58,640 --> 00:33:02,990 So it's not a trivial thing to create puzzles. 665 00:33:02,990 --> 00:33:04,880 But now people are using computer programs 666 00:33:04,880 --> 00:33:07,050 and doing things like we're doing here 667 00:33:07,050 --> 00:33:09,620 to find difficult puzzles. 668 00:33:09,620 --> 00:33:11,990 And interestingly enough, the 2006 puzzle, 669 00:33:11,990 --> 00:33:15,120 at least for this naive computer program, 670 00:33:15,120 --> 00:33:17,720 takes 335,000 backtracks-- 671 00:33:17,720 --> 00:33:22,400 the one that was supposedly made more difficult 672 00:33:22,400 --> 00:33:27,620 in 2010, which now takes about 10,000 backtracks. 673 00:33:27,620 --> 00:33:30,830 So obviously there's a difference between the way 674 00:33:30,830 --> 00:33:34,709 this program behaves and how you or I would behave, 675 00:33:34,709 --> 00:33:37,250 or rather you would behave if you tried to solve this puzzle. 676 00:33:40,300 --> 00:33:42,830 So let me just explain backtracks, 677 00:33:42,830 --> 00:33:44,330 and then I'll stop to see if there's 678 00:33:44,330 --> 00:33:46,600 any questions about the code. 
679 00:33:46,600 --> 00:33:50,460 So when you make recursive calls and you 680 00:33:50,460 --> 00:33:53,370 want to count the number of recursive procedure calls-- 681 00:33:53,370 --> 00:33:55,170 you want to do something inside each 682 00:33:55,170 --> 00:33:56,940 of the recursive procedures and you 683 00:33:56,940 --> 00:34:00,390 want to sort of cumulatively or collectively keep 684 00:34:00,390 --> 00:34:03,150 some information, one way of certainly doing 685 00:34:03,150 --> 00:34:04,950 it is to pass arguments. 686 00:34:04,950 --> 00:34:06,966 And then you have to return the argument, 687 00:34:06,966 --> 00:34:08,340 because when you pass an argument 688 00:34:08,340 --> 00:34:13,659 and you modify it it's not like that 689 00:34:13,659 --> 00:34:16,219 is going to be-- that modification, 690 00:34:16,219 --> 00:34:23,100 if it's just an integer, if it's not a mutable variable, 691 00:34:23,100 --> 00:34:25,889 it's not going to be seen by the caller procedure. 692 00:34:25,889 --> 00:34:30,060 And so when you do recursion and you want to do some counting, 693 00:34:30,060 --> 00:34:32,730 the notion of global variables is a convenient construct 694 00:34:32,730 --> 00:34:34,020 to have. 695 00:34:34,020 --> 00:34:35,610 And global variables essentially say 696 00:34:35,610 --> 00:34:38,850 that there's exactly one memory location associated 697 00:34:38,850 --> 00:34:40,080 with this variable. 698 00:34:40,080 --> 00:34:42,600 And we're going to go ahead and, anytime 699 00:34:42,600 --> 00:34:45,420 we are mutating this variable and you're modifying it, 700 00:34:45,420 --> 00:34:47,820 you're going to see the effect of that in that memory 701 00:34:47,820 --> 00:34:49,230 location. 702 00:34:49,230 --> 00:34:54,580 So what you have up here is, I set backtracks to be zero. 703 00:34:54,580 --> 00:34:56,661 OK and that's my global variable. 
704 00:34:56,661 --> 00:34:58,410 The fact that I put backtracks equals zero 705 00:34:58,410 --> 00:35:01,260 here doesn't make this a global variable just yet. 706 00:35:01,260 --> 00:35:04,590 The fact that I have global backtracks inside of solve 707 00:35:04,590 --> 00:35:08,220 Sudoku now says that there's a single copy of backtracks, 708 00:35:08,220 --> 00:35:10,440 and it doesn't matter whether I'm 709 00:35:10,440 --> 00:35:13,470 at the top level of recursion or the bottom level of recursion. 710 00:35:13,470 --> 00:35:17,040 It's just that memory location corresponding to backtracks-- 711 00:35:17,040 --> 00:35:20,100 the name backtracks, that is getting incremented. 712 00:35:20,100 --> 00:35:22,510 And this could be 10 levels deep. 713 00:35:22,510 --> 00:35:26,010 It could be 40 levels deep, given that I've called things 714 00:35:26,010 --> 00:35:27,060 40 levels in. 715 00:35:27,060 --> 00:35:29,340 But it's just the one backtracks. 716 00:35:29,340 --> 00:35:31,110 So as you can see, what backtracks does 717 00:35:31,110 --> 00:35:34,620 is anytime you have a valid location 718 00:35:34,620 --> 00:35:36,300 and you've gone ahead and-- 719 00:35:40,620 --> 00:35:43,680 essentially you've failed. 720 00:35:43,680 --> 00:35:45,240 The reason it's out here is solve 721 00:35:45,240 --> 00:35:47,100 Sudoku did not return true. 722 00:35:47,100 --> 00:35:49,670 When solve Sudoku actually returns false, 723 00:35:49,670 --> 00:35:56,530 that's when you come out and you increment backtracks. 724 00:35:56,530 --> 00:35:59,920 So it meant that you had to do some undoing. 725 00:35:59,920 --> 00:36:02,460 When you set grid IJ to be zero, that's when you're 726 00:36:02,460 --> 00:36:05,280 undoing your guess, right? 727 00:36:05,280 --> 00:36:08,670 So backtracks makes sense from a standpoint of I 728 00:36:08,670 --> 00:36:13,200 need to backtrack and go in a different fork in the road.
729 00:36:13,200 --> 00:36:15,900 And so that's why I have backtracks plus equals one 730 00:36:15,900 --> 00:36:18,906 when I'm undoing my decision that I made. 731 00:36:18,906 --> 00:36:20,280 So this kind of gives you a sense 732 00:36:20,280 --> 00:36:24,339 for how many wrong guesses that this program did. 733 00:36:24,339 --> 00:36:26,130 And as you can imagine, the more the number 734 00:36:26,130 --> 00:36:29,340 of wrong guesses, the more the computation and the longer 735 00:36:29,340 --> 00:36:30,090 it takes. 736 00:36:30,090 --> 00:36:32,810 So it is definitely a proxy for performance. 737 00:36:32,810 --> 00:36:35,780 But it's a platform independent proxy 738 00:36:35,780 --> 00:36:38,010 that's more algorithm related as opposed 739 00:36:38,010 --> 00:36:39,390 to the speed of the computer. 740 00:36:39,390 --> 00:36:41,550 Because if this computer were twice as fast, 741 00:36:41,550 --> 00:36:44,700 I mean I'd just see things running faster 742 00:36:44,700 --> 00:36:46,860 even though the algorithm isn't any better. 743 00:36:46,860 --> 00:36:47,660 Right? 744 00:36:47,660 --> 00:36:48,920 That make sense? 745 00:36:48,920 --> 00:36:51,607 So it's a very simple use of global. 746 00:36:51,607 --> 00:36:53,190 You don't want to use global variables 747 00:36:53,190 --> 00:36:55,290 except in certain constrained settings. 748 00:36:55,290 --> 00:36:58,660 This is a fine use of global variables. 749 00:36:58,660 --> 00:37:00,396 Cool, good. 750 00:37:00,396 --> 00:37:01,770 So any questions about this code? 751 00:37:04,300 --> 00:37:06,760 So what I've done here is I just have the naive code. 752 00:37:06,760 --> 00:37:09,250 And I happen to have different numbers of backtracks 753 00:37:09,250 --> 00:37:10,620 because I have different inputs. 
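The difference between a global counter and an integer passed as an argument can be seen in a small sketch. The function names `count_down_global` and `count_down_argument` are hypothetical; they only illustrate the point about where the increment is visible.

```python
calls = 0  # module-level name; becomes shared once a function declares it global


def count_down_global(n):
    """Recurse n levels deep, incrementing the single shared counter each time."""
    global calls
    calls += 1
    if n > 0:
        count_down_global(n - 1)


def count_down_argument(n, counter):
    """Same recursion, but the counter is an int argument."""
    counter += 1  # rebinds the local name only; the caller's int is unchanged
    if n > 0:
        count_down_argument(n - 1, counter)


count_down_global(3)
total = 0
count_down_argument(3, total)
print(calls)  # every level of the recursion updated the same variable
print(total)  # the caller never sees the increments made inside the calls
```

To get the same effect without `global`, each call would have to return its count and the caller would have to add it up, which is the extra bookkeeping the lecture mentions.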
754 00:37:10,620 --> 00:37:13,450 Unlike the N-queens problem, which 755 00:37:13,450 --> 00:37:16,580 is kind of boring in some sense, because once you've solved it 756 00:37:16,580 --> 00:37:19,540 there's nothing left, in the case of Sudoku, 757 00:37:19,540 --> 00:37:22,090 I could change my input, my starting point, 758 00:37:22,090 --> 00:37:25,300 and give you different problems. 759 00:37:25,300 --> 00:37:28,930 And so the reason we had many different kinds of backtracks 760 00:37:28,930 --> 00:37:30,790 was simply because-- 761 00:37:30,790 --> 00:37:32,290 numbers of backtracks was because we 762 00:37:32,290 --> 00:37:34,850 had four different inputs to the Sudoku puzzle. 763 00:37:34,850 --> 00:37:37,280 All right, so are we good here? 764 00:37:37,280 --> 00:37:38,790 People understand this code? 765 00:37:38,790 --> 00:37:41,210 You're going to have to modify it, right? 766 00:37:41,210 --> 00:37:43,760 Not necessarily this code, depending on the exercise 767 00:37:43,760 --> 00:37:46,460 you do, but this is certainly something 768 00:37:46,460 --> 00:37:49,820 that hopefully you feel comfortable with potentially 769 00:37:49,820 --> 00:37:51,530 modifying. 770 00:37:51,530 --> 00:37:56,480 All right so what I'm going to do now is first 771 00:37:56,480 --> 00:38:00,740 I'm going to go ahead and show you 772 00:38:00,740 --> 00:38:04,820 some code that corresponds to something that 773 00:38:04,820 --> 00:38:07,210 is the original code, except that I'm 774 00:38:07,210 --> 00:38:09,800 going to add some smarts to it. 775 00:38:09,800 --> 00:38:13,050 What I'm going to do is, at any given point of time, 776 00:38:13,050 --> 00:38:17,120 I'm going to try to do some implications without actually 777 00:38:17,120 --> 00:38:19,040 doing any guessing. 
778 00:38:19,040 --> 00:38:22,430 So the way I'm going to integrate the human approach 779 00:38:22,430 --> 00:38:25,820 into this exhaustive search approach at top level, 780 00:38:25,820 --> 00:38:28,010 is I'm going to take my configuration, 781 00:38:28,010 --> 00:38:30,200 and before I do an arbitrary guess, 782 00:38:30,200 --> 00:38:33,080 before I call find next cell, or maybe I 783 00:38:33,080 --> 00:38:37,400 have a particular location here that I'm eventually 784 00:38:37,400 --> 00:38:38,060 going to guess. 785 00:38:38,060 --> 00:38:39,420 So I do know that. 786 00:38:39,420 --> 00:38:41,150 But before that, I'm going to try and see 787 00:38:41,150 --> 00:38:46,640 whether the current grid values imply anything or not by using 788 00:38:46,640 --> 00:38:50,210 the rules in exactly the same way or roughly, I should say, 789 00:38:50,210 --> 00:38:53,420 the same way that we did right when we began the lecture. 790 00:38:53,420 --> 00:38:54,050 All right? 791 00:38:54,050 --> 00:38:56,120 So we're going to try and use some implications 792 00:38:56,120 --> 00:38:59,060 and maybe imply the eight or imply something different 793 00:38:59,060 --> 00:39:01,790 associated with some other location. 794 00:39:01,790 --> 00:39:05,660 So this is not a backtrack, in the sense 795 00:39:05,660 --> 00:39:07,310 that this is going to be-- 796 00:39:07,310 --> 00:39:09,830 I can take this to the bank assuming 797 00:39:09,830 --> 00:39:12,180 I haven't done any guessing up until this point, 798 00:39:12,180 --> 00:39:14,180 and assuming that the initial configuration that 799 00:39:14,180 --> 00:39:17,840 was given to me corresponds to a valid solution. 800 00:39:17,840 --> 00:39:21,260 But I'm actually going to do this 801 00:39:21,260 --> 00:39:23,220 at different points in the search. 802 00:39:23,220 --> 00:39:25,790 So it might be that I'm just going to arbitrarily choose 803 00:39:25,790 --> 00:39:27,800 a two here. 
804 00:39:27,800 --> 00:39:32,330 And so I go through and I'm going to take this, 805 00:39:32,330 --> 00:39:34,550 for argument's sake, and I'm going to put a two down. 806 00:39:34,550 --> 00:39:38,480 And then I have not the initial puzzle that was given to me, 807 00:39:38,480 --> 00:39:40,850 but something that I've kind of hacked in the sense 808 00:39:40,850 --> 00:39:42,450 that I've stuck a two in there. 809 00:39:42,450 --> 00:39:44,660 And that may not correspond to the solution, 810 00:39:44,660 --> 00:39:46,940 because I just sort of put the two down there. 811 00:39:46,940 --> 00:39:49,220 But now given the two, I'm going to try and do 812 00:39:49,220 --> 00:39:51,530 some implications. 813 00:39:51,530 --> 00:39:53,660 And I'm going to try and see whether there's 814 00:39:53,660 --> 00:39:56,480 things that are valid or not. 815 00:39:56,480 --> 00:40:00,170 The important thing is that, because I put a two down 816 00:40:00,170 --> 00:40:03,110 in an arbitrary way without using implications, 817 00:40:03,110 --> 00:40:04,610 the two could have been incorrect. 818 00:40:04,610 --> 00:40:07,220 I mean that's exactly why we have all of these backtracks, 819 00:40:07,220 --> 00:40:07,880 correct? 820 00:40:07,880 --> 00:40:10,760 Because I've put down incorrect guesses and then 821 00:40:10,760 --> 00:40:12,080 I've had to backtrack. 822 00:40:12,080 --> 00:40:14,450 So once I put a two down and then 823 00:40:14,450 --> 00:40:17,790 I fill in a bunch of things with implications. 824 00:40:17,790 --> 00:40:20,940 You know, I may even put an eight up there. 825 00:40:20,940 --> 00:40:24,600 I may put a six out here, et cetera, et cetera. 826 00:40:24,600 --> 00:40:27,240 And I go deep in and then I realize, ooh, 827 00:40:27,240 --> 00:40:30,030 you know that two was a mistake. 828 00:40:30,030 --> 00:40:32,780 The two really shouldn't have been in there. 829 00:40:32,780 --> 00:40:35,420 Now I have to clean up everything. 
830 00:40:35,420 --> 00:40:38,600 I have to clean up all of the guesses that came after two 831 00:40:38,600 --> 00:40:41,360 and all of the implications that came after two. 832 00:40:41,360 --> 00:40:41,900 All right? 833 00:40:41,900 --> 00:40:45,290 That's the biggest thing that I want you to take away 834 00:40:45,290 --> 00:40:48,380 from this integration of implications 835 00:40:48,380 --> 00:40:49,490 with exhaustive search. 836 00:40:49,490 --> 00:40:54,680 It's clean up your mess, clean up your bad guesses. 837 00:40:54,680 --> 00:40:58,370 The fact that-- you say, oh but the implication was something 838 00:40:58,370 --> 00:40:59,840 that was deterministic. 839 00:40:59,840 --> 00:41:03,770 It was exactly following these rules. 840 00:41:03,770 --> 00:41:04,790 No, no, no, no, no. 841 00:41:04,790 --> 00:41:06,110 It was deterministic. 842 00:41:06,110 --> 00:41:07,170 All of that is true. 843 00:41:07,170 --> 00:41:08,390 But you made a wrong guess. 844 00:41:08,390 --> 00:41:11,540 And therefore everything that you did from then on out 845 00:41:11,540 --> 00:41:13,560 is in question. 846 00:41:13,560 --> 00:41:15,650 And if you, in fact, find a contradiction, 847 00:41:15,650 --> 00:41:18,590 you've got to go all the way back and clean up everything. 848 00:41:18,590 --> 00:41:21,390 And then go back and erase everything that you had. 849 00:41:21,390 --> 00:41:24,300 And then go take this two and maybe turn it into a three 850 00:41:24,300 --> 00:41:25,850 or what have you. 851 00:41:25,850 --> 00:41:26,540 All right? 852 00:41:26,540 --> 00:41:30,574 So before I show you the code that does the implications-- 853 00:41:30,574 --> 00:41:32,240 and you can kind of imagine that there's 854 00:41:32,240 --> 00:41:33,906 many ways that we could do implications, 855 00:41:33,906 --> 00:41:35,430 we did that manually. 
856 00:41:35,430 --> 00:41:38,450 I want to show you this part looks exactly the same as 857 00:41:38,450 --> 00:41:39,500 before, no change. 858 00:41:39,500 --> 00:41:41,990 Find next cell to fill is exactly the same. 859 00:41:41,990 --> 00:41:46,580 Is valid is exactly the same, right? 860 00:41:46,580 --> 00:41:48,860 There's a large make implications procedure 861 00:41:48,860 --> 00:41:51,570 and an undo implications that I'll get to in a second. 862 00:41:51,570 --> 00:41:55,340 But this part here looks almost exactly the same, 863 00:41:55,340 --> 00:41:59,210 except that I've replaced grid IJ equals 864 00:41:59,210 --> 00:42:01,280 E with make implications. 865 00:42:03,890 --> 00:42:07,510 And this is something that not only is-- 866 00:42:07,510 --> 00:42:11,380 what make implications is going to do is it's going to set-- 867 00:42:11,380 --> 00:42:14,140 whatever I had up here, it's going to set two up here. 868 00:42:14,140 --> 00:42:18,010 And on top of that it's going to go use these things to go 869 00:42:18,010 --> 00:42:21,340 fill in a bunch of different values in here. 870 00:42:21,340 --> 00:42:22,660 So it's one extra step. 871 00:42:22,660 --> 00:42:24,860 This is the integration that I talked about. 872 00:42:24,860 --> 00:42:27,550 So the idea is that-- 873 00:42:27,550 --> 00:42:31,670 now you can do this for the original as well. 874 00:42:31,670 --> 00:42:34,960 But the point is, once you've made a guess, 875 00:42:34,960 --> 00:42:38,200 you always want to check to see whether that guess does 876 00:42:38,200 --> 00:42:39,560 certain implications or not. 877 00:42:39,560 --> 00:42:40,060 Right? 878 00:42:40,060 --> 00:42:42,101 I mean that's the whole purpose of this exercise. 879 00:42:42,101 --> 00:42:44,186 Even humans do this in the very difficult puzzles. 880 00:42:44,186 --> 00:42:46,310 They make a guess and then they see whether there's 881 00:42:46,310 --> 00:42:47,830 some implication or not.
882 00:42:47,830 --> 00:42:49,300 And maybe there's a contradiction 883 00:42:49,300 --> 00:42:52,120 and they have to go back and undo all of that damage they 884 00:42:52,120 --> 00:42:54,580 caused and change the guess. 885 00:42:54,580 --> 00:42:57,720 But in general, when you have a configuration 886 00:42:57,720 --> 00:43:00,220 and you add to it, it's possible suddenly 887 00:43:00,220 --> 00:43:03,340 that there will be other things that are implied by the one 888 00:43:03,340 --> 00:43:05,230 change that you made to it. 889 00:43:05,230 --> 00:43:08,530 So grid IJ equals E in the original code 890 00:43:08,530 --> 00:43:11,535 got replaced with this procedure that we'll talk about, 891 00:43:11,535 --> 00:43:13,660 which I don't want to spend a whole lot of time on, 892 00:43:13,660 --> 00:43:17,580 but it's essentially something in terms of details. 893 00:43:17,580 --> 00:43:19,060 But it's essentially something that 894 00:43:19,060 --> 00:43:22,040 puts in different values in the different locations. 895 00:43:22,040 --> 00:43:26,080 And grid IJ equal zero is replaced by undo implications, 896 00:43:26,080 --> 00:43:30,040 which is cleaning up all of the incorrect guesses 897 00:43:30,040 --> 00:43:32,392 and incorrect implications. 898 00:43:32,392 --> 00:43:33,850 And the reason the implications are 899 00:43:33,850 --> 00:43:38,890 incorrect-- because it came from an incorrect guess. 900 00:43:38,890 --> 00:43:40,660 And so that's it. 901 00:43:40,660 --> 00:43:43,180 Undo implications is trivial. 902 00:43:43,180 --> 00:43:45,310 It just sets all of the implications, 903 00:43:45,310 --> 00:43:47,890 and I'll tell you what the data structure is in a second, 904 00:43:47,890 --> 00:43:51,520 but think of it as making everything zero, going back 905 00:43:51,520 --> 00:43:53,320 to a clean slate. 
906 00:43:53,320 --> 00:43:57,940 I mean clean slate in the sense that all 907 00:43:57,940 --> 00:44:02,020 of the incorrect implications and guesses are cleaned up. 908 00:44:02,020 --> 00:44:04,070 So that's all there is over here. 909 00:44:04,070 --> 00:44:08,510 Make implications is-- you can do anything you want. 910 00:44:08,510 --> 00:44:10,060 You can do vertical scans. 911 00:44:10,060 --> 00:44:11,800 You can do horizontal scans. 912 00:44:11,800 --> 00:44:14,890 You can-- if you go look at Sudoku literature 913 00:44:14,890 --> 00:44:17,860 and you look at ways of playing Sudoku, 914 00:44:17,860 --> 00:44:20,170 there's books written on how you can become a better 915 00:44:20,170 --> 00:44:21,970 Sudoku puzzle solver. 916 00:44:21,970 --> 00:44:24,910 And you could take that, and you could code that in. 917 00:44:24,910 --> 00:44:26,770 And you could replace make implications 918 00:44:26,770 --> 00:44:29,590 with those fancy techniques that are up there, right? 919 00:44:29,590 --> 00:44:31,030 But we've established I'm lazy. 920 00:44:31,030 --> 00:44:33,820 And so I only write a certain amount of code, 921 00:44:33,820 --> 00:44:35,740 and then I get tired. 922 00:44:35,740 --> 00:44:38,350 And so I wrote about 20 lines of code corresponding 923 00:44:38,350 --> 00:44:41,470 to a fairly straightforward implication just 924 00:44:41,470 --> 00:44:44,800 to give you a sense of how this would work. 925 00:44:44,800 --> 00:44:46,300 But the most important thing in here 926 00:44:46,300 --> 00:44:48,700 is not the details of make implications. 927 00:44:48,700 --> 00:44:51,280 And I'll give you some sense of that before we're done. 928 00:44:51,280 --> 00:44:55,300 But it's really the structure that is the most important. 
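A minimal sketch of the structure being described, assuming Python and a 9x9 grid stored as a list of lists with 0 marking an empty cell. The snake_case names are illustrative rather than the lecture's exact code, and make_implications is deliberately a stub that records only the guess. Any deduction rule could be slotted in there, provided everything it writes is recorded so that undo_implications can erase it.

```python
backtracks = 0  # number of guesses that had to be undone


def find_next_cell_to_fill(grid):
    """Return the first empty cell as (row, col), or (-1, -1) if the grid is full."""
    for i in range(9):
        for j in range(9):
            if grid[i][j] == 0:
                return i, j
    return -1, -1


def is_valid(grid, i, j, e):
    """True if e does not already appear in row i, column j, or the 3x3 sector."""
    if any(grid[i][c] == e for c in range(9)):
        return False
    if any(grid[r][j] == e for r in range(9)):
        return False
    top, left = 3 * (i // 3), 3 * (j // 3)
    return all(grid[r][c] != e
               for r in range(top, top + 3)
               for c in range(left, left + 3))


def make_implications(grid, i, j, e):
    """Stub: place the guess and return the list of (row, col, value) writes."""
    grid[i][j] = e
    return [(i, j, e)]


def undo_implications(grid, impl):
    """Erase the guess and everything that was implied by it."""
    for i, j, _ in impl:
        grid[i][j] = 0


def solve_sudoku(grid):
    global backtracks
    i, j = find_next_cell_to_fill(grid)
    if i == -1:
        return True  # no empty cells left: solved
    for e in range(1, 10):
        if is_valid(grid, i, j, e):
            impl = make_implications(grid, i, j, e)  # was: grid[i][j] = e
            if solve_sudoku(grid):
                return True
            backtracks += 1
            undo_implications(grid, impl)            # was: grid[i][j] = 0
    return False
```

The two commented lines are the only places where this differs from the plain exhaustive search, which is the point: every make is paired with an undo on the failure path.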
929 00:44:55,300 --> 00:44:58,270 The fact that I've done make implications here and undo 930 00:44:58,270 --> 00:45:03,190 implications here is the correctness requirement 931 00:45:03,190 --> 00:45:07,230 that is important to exhaustive search. 932 00:45:07,230 --> 00:45:10,290 So if I do this and I do kind of the implications 933 00:45:10,290 --> 00:45:13,100 that we had right at the beginning of lecture 934 00:45:13,100 --> 00:45:16,680 and I go ahead and run it, just take a look. 935 00:45:16,680 --> 00:45:19,080 I won't write this out, but remember 936 00:45:19,080 --> 00:45:22,590 what the backtracks are for these things, roughly speaking, 937 00:45:22,590 --> 00:45:24,040 for the original Sudoku. 938 00:45:24,040 --> 00:45:27,750 Oh, I'm sorry, I need to go to the shell. 939 00:45:27,750 --> 00:45:30,900 And it was 335,000-- 940 00:45:30,900 --> 00:45:37,140 what is it-- 579, 6363, 335,000, and 9949. 941 00:45:37,140 --> 00:45:41,760 So if I go off and I run Sudoku optimized, 942 00:45:41,760 --> 00:45:45,390 which is doing these implications like I describe, 943 00:45:45,390 --> 00:45:47,850 and I go ahead and run that. 944 00:45:47,850 --> 00:45:51,620 The first one goes from 579 to 33 backtracks. 945 00:45:51,620 --> 00:45:53,671 OK so that's pretty good. 946 00:45:53,671 --> 00:45:55,420 Because it's done a bunch of implications. 947 00:45:55,420 --> 00:45:57,720 It's still-- it's not super smart. 948 00:45:57,720 --> 00:46:00,360 I mean that is a simple enough puzzle that a human being would 949 00:46:00,360 --> 00:46:01,710 not backtrack. 950 00:46:01,710 --> 00:46:04,590 I mean a human being would not backtrack in that first puzzle, 951 00:46:04,590 --> 00:46:05,280 right? 952 00:46:05,280 --> 00:46:07,530 And you should check that. 953 00:46:07,530 --> 00:46:10,020 And-- oh, this thing finished in the middle. 954 00:46:10,020 --> 00:46:11,155 So it went to 33. 955 00:46:15,394 --> 00:46:19,220 Oh, only had three of them? 
956 00:46:19,220 --> 00:46:21,940 What do I have here in Sudoku Opt? 957 00:46:27,510 --> 00:46:28,010 Oh I see. 958 00:46:28,010 --> 00:46:31,780 I only ran-- oh wow. 959 00:46:31,780 --> 00:46:35,960 OK so I ran inp2, hard, and difficult. 960 00:46:35,960 --> 00:46:42,100 So it really went from 6363 to 33. 961 00:46:42,100 --> 00:46:45,880 It went from 335,000 to 24,000. 962 00:46:45,880 --> 00:46:47,880 And then it went to-- 963 00:46:47,880 --> 00:46:50,590 7-- went from 9949 to 726. 964 00:46:50,590 --> 00:46:54,310 The details aren't-- the numbers aren't super important. 965 00:46:54,310 --> 00:46:55,760 Don't hang your hat on them. 966 00:46:55,760 --> 00:46:58,280 Obviously if I change the code those numbers change. 967 00:46:58,280 --> 00:47:00,280 But you can see that there are substantial gains 968 00:47:00,280 --> 00:47:03,010 to be had in terms of implications 969 00:47:03,010 --> 00:47:08,090 not making these dumb guesses that clearly are incorrect. 970 00:47:08,090 --> 00:47:11,140 And you can fill in-- if you take away 971 00:47:11,140 --> 00:47:12,850 some of these empty squares, then 972 00:47:12,850 --> 00:47:15,790 the depth of the recursion that you have to go through 973 00:47:15,790 --> 00:47:17,170 becomes substantially smaller. 974 00:47:17,170 --> 00:47:21,430 And that's why your backtracking is simpler. 975 00:47:21,430 --> 00:47:24,520 So I want to leave you with a couple of things. 976 00:47:24,520 --> 00:47:28,750 I want to give you some sense for what particular implication 977 00:47:28,750 --> 00:47:31,420 that-- 978 00:47:31,420 --> 00:47:33,070 a strategy that we used. 979 00:47:33,070 --> 00:47:35,380 And so I'll just put up make implications and give you 980 00:47:35,380 --> 00:47:38,600 some sense for how this works. 981 00:47:38,600 --> 00:47:41,500 So the basic idea is that what I'm doing here 982 00:47:41,500 --> 00:47:44,380 is I'm looking at a particular sector. 
983 00:47:44,380 --> 00:47:50,080 And I've created a data structure that says the missing 984 00:47:50,080 --> 00:47:52,100 elements here-- if I put a two in here-- 985 00:47:52,100 --> 00:47:54,100 let's just say I go ahead and put a two in here. 986 00:47:54,100 --> 00:47:56,520 The missing elements here are-- 987 00:47:56,520 --> 00:48:04,610 the set is three, four, five, six, seven, eight, and nine. 988 00:48:04,610 --> 00:48:07,990 So this could be three, four, five, six, seven, eight, nine. 989 00:48:07,990 --> 00:48:10,480 This could be three, four, five, six, seven, eight, nine. 990 00:48:10,480 --> 00:48:12,172 This is quite dumb right now. 991 00:48:12,172 --> 00:48:13,630 But each of these different squares 992 00:48:13,630 --> 00:48:16,100 could be three, four, five, six, seven, eight, nine. 993 00:48:16,100 --> 00:48:16,600 OK? 994 00:48:16,600 --> 00:48:18,220 Possibly, all right. 995 00:48:18,220 --> 00:48:21,430 And then I say-- so that's the first part of the code. 996 00:48:21,430 --> 00:48:24,670 And then I say I'm going to attach, essentially, 997 00:48:24,670 --> 00:48:27,490 a copy of the set to each of the missing squares. 998 00:48:27,490 --> 00:48:34,060 And then I'm going to go through and find the missing elements. 999 00:48:34,060 --> 00:48:38,320 So this thing here can't be a nine because I see a nine here. 1000 00:48:38,320 --> 00:48:40,660 It can't be a three, right? 1001 00:48:40,660 --> 00:48:42,490 And so I can take this thing here. 1002 00:48:42,490 --> 00:48:44,650 And I take away the nine. 1003 00:48:44,650 --> 00:48:46,730 And I take away the three. 1004 00:48:46,730 --> 00:48:49,270 And I can do the same thing with that. 1005 00:48:49,270 --> 00:48:51,059 Obviously I can also take away the-- 1006 00:48:51,059 --> 00:48:53,350 the eight isn't there, but I could take away the seven, 1007 00:48:53,350 --> 00:48:55,737 and I could take away the three, the six, and the one.
1008 00:48:55,737 --> 00:48:57,320 So I go ahead and I take away the six. 1009 00:48:57,320 --> 00:49:00,160 And the three was already taken out. 1010 00:49:00,160 --> 00:49:01,300 And I keep doing this. 1011 00:49:01,300 --> 00:49:03,850 And I try and shrink the possibilities corresponding 1012 00:49:03,850 --> 00:49:06,790 to this particular square that has the set 1013 00:49:06,790 --> 00:49:08,470 of different possibilities. 1014 00:49:08,470 --> 00:49:09,730 And if I ever-- 1015 00:49:09,730 --> 00:49:12,310 so when can I make an implication? 1016 00:49:12,310 --> 00:49:14,440 What is the condition that is going 1017 00:49:14,440 --> 00:49:19,300 to let me make an implication when I take this set of numbers 1018 00:49:19,300 --> 00:49:22,610 and I start shrinking them down using these rules that I have 1019 00:49:22,610 --> 00:49:25,670 over on the right hand side there? 1020 00:49:25,670 --> 00:49:27,140 What is an implication? 1021 00:49:27,140 --> 00:49:30,010 What does that correspond to in relation to the size-- 1022 00:49:30,010 --> 00:49:31,580 in relation to the set? 1023 00:49:31,580 --> 00:49:32,830 Right, yeah, behind you, Ryan. 1024 00:49:32,830 --> 00:49:35,150 AUDIENCE: So if you only have one element. 1025 00:49:35,150 --> 00:49:36,650 SRINI DEVADAS: That's exactly right. 1026 00:49:36,650 --> 00:49:38,860 If you have one element in the set, 1027 00:49:38,860 --> 00:49:40,264 then that's an implication. 1028 00:49:40,264 --> 00:49:41,680 If I have two elements in the set, 1029 00:49:41,680 --> 00:49:43,360 it's not an implication, because I don't quite 1030 00:49:43,360 --> 00:49:44,470 know what to do there. 1031 00:49:44,470 --> 00:49:46,990 But if I had one element in the set, that's an implication. 1032 00:49:46,990 --> 00:49:47,800 And that's it. 1033 00:49:47,800 --> 00:49:49,840 That's-- you know this code is not complicated. 
1034 00:49:49,840 --> 00:49:51,490 Check if the vset is a singleton, which 1035 00:49:51,490 --> 00:49:52,690 is a single element. 1036 00:49:52,690 --> 00:49:54,370 And I'm going to go ahead and append 1037 00:49:54,370 --> 00:49:57,100 to this implication, which is a very straightforward data 1038 00:49:57,100 --> 00:50:00,160 structure that says this is the grid location I, 1039 00:50:00,160 --> 00:50:04,150 grid location J, and this is the value that was implied by that. 1040 00:50:04,150 --> 00:50:09,940 So not only do I have IJE, which is the original guess 1041 00:50:09,940 --> 00:50:14,410 that I have, I also have kind of a bunch of other tuples 1042 00:50:14,410 --> 00:50:17,000 corresponding to different coordinates, 1043 00:50:17,000 --> 00:50:21,520 you know, KL coordinates and the value, call it V, 1044 00:50:21,520 --> 00:50:22,460 associated with that. 1045 00:50:22,460 --> 00:50:24,293 And these are all the different implications 1046 00:50:24,293 --> 00:50:26,380 that I can collect together in this list. 1047 00:50:26,380 --> 00:50:29,350 And I can just add those things into make implications. 1048 00:50:29,350 --> 00:50:30,700 And then I keep going. 1049 00:50:30,700 --> 00:50:33,940 And then if I ever realize I've made a bad guess, 1050 00:50:33,940 --> 00:50:36,730 I have to undo everything by zeroing them all out, which 1051 00:50:36,730 --> 00:50:40,150 is making them all empty. 1052 00:50:40,150 --> 00:50:41,890 So one thing that this code does, 1053 00:50:41,890 --> 00:50:43,330 and you can take a look at it. 1054 00:50:43,330 --> 00:50:48,070 And I would encourage you to do the first exercise, which 1055 00:50:48,070 --> 00:50:51,940 is taking these implications and making them a little more 1056 00:50:51,940 --> 00:50:55,960 powerful by adding three or four lines of code to this code. 
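The sector scan and singleton test described here might look roughly like the following. This is a sketch under assumptions, not the lecture's 20 lines: the grid is a 9x9 list of lists with 0 for empty, sectors are taken to be the nine 3x3 boxes, and the names make_implications and undo_implications simply mirror the spoken ones. The implication list holds (row, col, value) tuples, with the original guess first.

```python
def make_implications(grid, i, j, e):
    """Place the guess e at (i, j), then make one pass over the nine sectors.

    Returns a list of (row, col, value) tuples: the guess followed by
    every value that was implied during the pass.
    """
    grid[i][j] = e
    impl = [(i, j, e)]
    for top in range(0, 9, 3):
        for left in range(0, 9, 3):
            cells = [(r, c)
                     for r in range(top, top + 3)
                     for c in range(left, left + 3)]
            # Values not yet present anywhere in this sector.
            missing = set(range(1, 10)) - {grid[r][c] for r, c in cells}
            for r, c in cells:
                if grid[r][c] != 0:
                    continue
                # Strike out whatever already appears in this row or column.
                seen = set(grid[r]) | {grid[x][c] for x in range(9)}
                candidates = missing - seen
                if len(candidates) == 1:  # singleton: the value is forced
                    v = candidates.pop()
                    grid[r][c] = v
                    impl.append((r, c, v))
                    missing.discard(v)  # keep the sector's set up to date
    return impl


def undo_implications(grid, impl):
    """Zero out the guess and every implied value recorded in impl."""
    for r, c, _ in impl:
        grid[r][c] = 0
```

Because each candidate set is recomputed from the live grid, a value is only placed when it is absent from the cell's row, column, and sector at that moment. An empty candidate set is simply skipped here; the surrounding search is left to discover that contradiction and call undo_implications.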
1057 00:50:55,960 --> 00:50:58,872 And exactly what you have to do in this exercise, 1058 00:50:58,872 --> 00:51:00,580 and I'll show you what the results should 1059 00:51:00,580 --> 00:51:02,810 be in just a minute. 1060 00:51:02,810 --> 00:51:05,620 But let me just spend 30 seconds explaining to you 1061 00:51:05,620 --> 00:51:07,180 how you could do a little bit better 1062 00:51:07,180 --> 00:51:09,830 than what this code does. 1063 00:51:09,830 --> 00:51:14,500 So what I've described to you really is get this set, 1064 00:51:14,500 --> 00:51:17,200 imply, get a singleton, et cetera. 1065 00:51:17,200 --> 00:51:18,700 And then you can do this, obviously, 1066 00:51:18,700 --> 00:51:20,857 for each of these sectors. 1067 00:51:20,857 --> 00:51:21,940 And that's what this does. 1068 00:51:21,940 --> 00:51:23,606 You had a for loop up there that does it 1069 00:51:23,606 --> 00:51:25,120 for each of the sectors. 1070 00:51:25,120 --> 00:51:26,920 Grab a sector and go ahead and do 1071 00:51:26,920 --> 00:51:28,990 an implication for that sector. 1072 00:51:28,990 --> 00:51:32,190 Now this code just runs through the sectors, 1073 00:51:32,190 --> 00:51:34,870 you know, One, two, three, four, five, six, seven, eight, nine 1074 00:51:34,870 --> 00:51:37,120 and then discovers the implications 1075 00:51:37,120 --> 00:51:39,880 if they exist, adds them to the imply list, 1076 00:51:39,880 --> 00:51:42,430 and then throws up its hands and says I'm tired, I'm done, 1077 00:51:42,430 --> 00:51:44,710 I don't want to do any more. 1078 00:51:44,710 --> 00:51:47,170 What could you do that's an improvement, given 1079 00:51:47,170 --> 00:51:49,870 what we have described and what I've told you so far. 1080 00:51:49,870 --> 00:51:52,750 What is an incremental improvement 1081 00:51:52,750 --> 00:51:55,570 over going over these sectors once 1082 00:51:55,570 --> 00:51:59,102 and doing these implications and storing them and moving on? 
1083 00:51:59,102 --> 00:52:00,560 What is an incremental improvement? 1084 00:52:00,560 --> 00:52:02,400 Ganatra? 1085 00:52:02,400 --> 00:52:06,160 AUDIENCE: Look, once we get all the singletons, 1086 00:52:06,160 --> 00:52:09,360 we can set those as-- since those are determined, like, 1087 00:52:09,360 --> 00:52:11,568 deterministic, I think that we could set those 1088 00:52:11,568 --> 00:52:14,042 into the original grid and say that's our new base grid 1089 00:52:14,042 --> 00:52:15,084 and run through it again. 1090 00:52:15,084 --> 00:52:16,958 SRINI DEVADAS: Run through it again, exactly. 1091 00:52:16,958 --> 00:52:18,280 You don't have to stop. 1092 00:52:18,280 --> 00:52:20,710 There's no reason to stop if you're implying. 1093 00:52:20,710 --> 00:52:22,120 Once you've put something in here 1094 00:52:22,120 --> 00:52:24,495 and you've gone through one, two, three, four, five, six, 1095 00:52:24,495 --> 00:52:26,216 seven, eight, nine, got the implications, 1096 00:52:26,216 --> 00:52:28,590 you can put them into the grid and then start over again. 1097 00:52:28,590 --> 00:52:30,380 One, two, three, four, that's what humans do. 1098 00:52:30,380 --> 00:52:30,770 Right? 1099 00:52:30,770 --> 00:52:32,900 When humans put something in, then they don't stop. 1100 00:52:32,900 --> 00:52:36,160 They just keep going until they get to the end. 1101 00:52:36,160 --> 00:52:39,400 Now of course all of these implications 1102 00:52:39,400 --> 00:52:41,920 could be incorrect if that first guess was incorrect. 1103 00:52:41,920 --> 00:52:44,060 There's no change there. 1104 00:52:44,060 --> 00:52:45,970 But there's nothing that's stopping you 1105 00:52:45,970 --> 00:52:50,050 from turning this little thing-- there's a loop here that simply 1106 00:52:50,050 --> 00:52:52,630 corresponds to making a pass over the sectors, 1107 00:52:52,630 --> 00:52:55,210 but you can put this whole thing into a loop. 
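One way the "run through it again" idea could be sketched, assuming the same 9x9 list-of-lists grid with 0 for empty. The helper implication_pass and its split from make_implications are hypothetical choices for illustration, not the lecture's four-line change: the single pass over the sectors is wrapped in an outer loop that stops once a pass places nothing new.

```python
def implication_pass(grid):
    """One pass over the nine 3x3 sectors; returns the newly forced (r, c, v)."""
    found = []
    for top in range(0, 9, 3):
        for left in range(0, 9, 3):
            cells = [(r, c)
                     for r in range(top, top + 3)
                     for c in range(left, left + 3)]
            missing = set(range(1, 10)) - {grid[r][c] for r, c in cells}
            for r, c in cells:
                if grid[r][c] != 0:
                    continue
                seen = set(grid[r]) | {grid[x][c] for x in range(9)}
                candidates = missing - seen
                if len(candidates) == 1:
                    v = candidates.pop()
                    grid[r][c] = v
                    found.append((r, c, v))
                    missing.discard(v)
    return found


def make_implications(grid, i, j, e):
    """Place the guess, then repeat passes until the grid stops changing."""
    grid[i][j] = e
    impl = [(i, j, e)]
    while True:
        new = implication_pass(grid)
        if not new:  # nothing changed in this pass: stop
            break
        impl.extend(new)
    return impl
```

All values found in later passes go into the same list as the guess, so a single undo still clears everything if the guess turns out to be wrong.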
1108 00:52:55,210 --> 00:52:56,830 And you keep going through the loop 1109 00:52:56,830 --> 00:53:00,010 until you basically have no change that 1110 00:53:00,010 --> 00:53:02,080 happens in your grid. 1111 00:53:02,080 --> 00:53:03,702 OK so that's four lines of code. 1112 00:53:03,702 --> 00:53:06,160 And I'm not going to show you what those four lines of code 1113 00:53:06,160 --> 00:53:07,951 look like, so close your eyes in case you-- 1114 00:53:11,730 --> 00:53:14,010 And this is the solution to that code. 1115 00:53:14,010 --> 00:53:15,720 And I'm going to go ahead and run it. 1116 00:53:15,720 --> 00:53:19,260 And you saw what those numbers were with respect 1117 00:53:19,260 --> 00:53:22,290 to the backtracks. 1118 00:53:22,290 --> 00:53:25,530 But if you do those extra implications, 1119 00:53:25,530 --> 00:53:29,110 the 33 went down to two for that example. 1120 00:53:29,110 --> 00:53:31,500 So this is not optimal, because I wanted one. 1121 00:53:31,500 --> 00:53:33,660 So if I wanted to be a human being that 1122 00:53:33,660 --> 00:53:35,640 took this easy puzzle and just sort of went 1123 00:53:35,640 --> 00:53:38,220 all the way without making any incorrect guesses, 1124 00:53:38,220 --> 00:53:39,701 I would be doing implications. 1125 00:53:39,701 --> 00:53:40,950 And that would go all the way. 1126 00:53:40,950 --> 00:53:44,130 And I got close with two. 1127 00:53:44,130 --> 00:53:47,850 And I didn't print out the intermediate ones, 1128 00:53:47,850 --> 00:53:50,082 but the 24,000 went down to 11,000. 1129 00:53:50,082 --> 00:53:51,540 And I forget what the last one was. 1130 00:53:51,540 --> 00:53:53,130 It went down. 1131 00:53:53,130 --> 00:53:56,970 So with four lines of code and with the optimized code 1132 00:53:56,970 --> 00:53:59,900 that I'll put up you should be able to get those numbers 1133 00:53:59,900 --> 00:54:01,860 in your first exercise. 1134 00:54:01,860 --> 00:54:04,785 Or you could solve diagonal Sudoku or even Sudoku. 
1135 00:54:04,785 --> 00:54:06,660 Or you could spend the rest of the day coding 1136 00:54:06,660 --> 00:54:08,640 whatever you want, whatever. 1137 00:54:08,640 --> 00:54:10,920 All right, see you next time.