1 00:00:00,000 --> 00:00:01,968 [SQUEAKING] 2 00:00:01,968 --> 00:00:04,428 [RUSTLING] 3 00:00:04,428 --> 00:00:12,300 [CLICKING] 4 00:00:12,300 --> 00:00:13,390 JASON KU: Hey, everybody. 5 00:00:13,390 --> 00:00:14,050 Welcome back. 6 00:00:14,050 --> 00:00:19,368 This is our last quiz review for the term, quiz 3, 7 00:00:19,368 --> 00:00:20,910 we'll be talking about, which will be 8 00:00:20,910 --> 00:00:23,190 the last quiz until the final. 9 00:00:23,190 --> 00:00:24,720 It's on dynamic programming, which 10 00:00:24,720 --> 00:00:27,960 you guys have been studying in lectures, and recitations, 11 00:00:27,960 --> 00:00:35,200 and on your problem sets 7 and 8, and lectures 15 through 18, 12 00:00:35,200 --> 00:00:37,110 so four lectures. 13 00:00:37,110 --> 00:00:41,790 Quiz 1 and quiz 2 material on, essentially, data structures 14 00:00:41,790 --> 00:00:45,120 and graph algorithms aren't going to be explicitly test-- 15 00:00:45,120 --> 00:00:47,470 or we're not trying to explicitly test it, 16 00:00:47,470 --> 00:00:49,950 that material, on quiz 3. 17 00:00:49,950 --> 00:00:52,620 But it is fair game. 18 00:00:52,620 --> 00:00:53,910 The material's cumulative. 19 00:00:53,910 --> 00:00:57,330 And so if you have to store some stuff in the data structure, 20 00:00:57,330 --> 00:00:58,260 that's fair game. 21 00:00:58,260 --> 00:00:59,790 But we're not specifically trying 22 00:00:59,790 --> 00:01:01,530 to test you on that material. 23 00:01:01,530 --> 00:01:04,290 OK, and really, we haven't learned 24 00:01:04,290 --> 00:01:08,070 all that much new material in these last four lectures 25 00:01:08,070 --> 00:01:11,360 in this unit. 26 00:01:11,360 --> 00:01:12,690 So this is scope. 27 00:01:12,690 --> 00:01:17,700 We've got, we're mostly handling dynamic programming 28 00:01:17,700 --> 00:01:20,760 on these four lectures and recitations and these two 29 00:01:20,760 --> 00:01:22,470 problem sets. 
30 00:01:22,470 --> 00:01:23,970 But really, the focus is going to be 31 00:01:23,970 --> 00:01:28,050 on this recursive framework of solving problems 32 00:01:28,050 --> 00:01:31,380 with a focus on dynamic programming, specifically. 33 00:01:31,380 --> 00:01:34,560 Now, the recursive framework we have, 34 00:01:34,560 --> 00:01:42,180 I think, in previous slides, we used this SRTBOT notation. 35 00:01:42,180 --> 00:01:44,490 And I think there might be a space there 36 00:01:44,490 --> 00:01:46,050 in previous versions. 37 00:01:46,050 --> 00:01:48,450 I'm concatenating them together here. 38 00:01:48,450 --> 00:01:51,900 But really, it's a framework for breaking down 39 00:01:51,900 --> 00:01:55,710 your problem into a set of subproblems that can then 40 00:01:55,710 --> 00:01:58,290 be related recursively. 41 00:01:58,290 --> 00:02:03,360 And if that relationship depends on problems in a decreasing 42 00:02:03,360 --> 00:02:07,650 or in a smaller sense, there's a directionality 43 00:02:07,650 --> 00:02:11,400 to which subproblems I'm reducing to each time I 44 00:02:11,400 --> 00:02:12,930 make a recursive call. 45 00:02:12,930 --> 00:02:16,620 And that dependency graph is acyclic. 46 00:02:16,620 --> 00:02:22,590 Then we can solve via dynamic programming 47 00:02:22,590 --> 00:02:27,270 by memoizing from bottom up or by calling things 48 00:02:27,270 --> 00:02:29,610 and remembering the calls that we've 49 00:02:29,610 --> 00:02:32,950 called before via memoization. 50 00:02:32,950 --> 00:02:36,450 And the basic idea here is that the recursive framework SRT 51 00:02:36,450 --> 00:02:43,210 BOT that we established is good for any recursive algorithm. 
52 00:02:43,210 --> 00:02:47,010 But in the special case where subproblems 53 00:02:47,010 --> 00:02:49,740 may be used more than once, may be 54 00:02:49,740 --> 00:02:53,460 used when computing other subproblems, 55 00:02:53,460 --> 00:02:57,450 then we get this really nice speed 56 00:02:57,450 --> 00:03:00,180 up by recognizing that we don't have 57 00:03:00,180 --> 00:03:02,490 to do that work more than once. 58 00:03:02,490 --> 00:03:04,860 And essentially, instead of looking at it 59 00:03:04,860 --> 00:03:07,020 as a tree of recursive calls that 60 00:03:07,020 --> 00:03:09,780 may call the same problems more than once, 61 00:03:09,780 --> 00:03:14,190 we look at it by collapsing those nodes of the same value 62 00:03:14,190 --> 00:03:14,970 down into one. 63 00:03:14,970 --> 00:03:16,320 We get a DAG. 64 00:03:16,320 --> 00:03:21,540 And dynamic programming is when those subproblems overlap. 65 00:03:21,540 --> 00:03:26,700 OK, so let's take a look at our recursive framework 66 00:03:26,700 --> 00:03:28,500 here, SRT BOT. 67 00:03:28,500 --> 00:03:33,570 We have the S-R-T B-O-T. I remembered the space this time. 68 00:03:33,570 --> 00:03:37,440 Subproblem, you're going to define some subproblems. 69 00:03:37,440 --> 00:03:39,450 You're going to relate them recursively. 70 00:03:39,450 --> 00:03:41,760 You're going to specify a topological order 71 00:03:41,760 --> 00:03:47,550 of those subproblems that the relation satisfies. 72 00:03:47,550 --> 00:03:50,820 You're going to list some base cases, basically wherever 73 00:03:50,820 --> 00:03:53,760 you could solve this problem in constant time 74 00:03:53,760 --> 00:03:56,780 without doing any recursive work. 
75 00:03:56,780 --> 00:04:00,200 Stating how you solve the original problem, which 76 00:04:00,200 --> 00:04:04,010 might involve combining many subproblems, but frequently, 77 00:04:04,010 --> 00:04:09,350 is just finding one subproblem, and possibly 78 00:04:09,350 --> 00:04:11,210 remembering-- storing parent pointers 79 00:04:11,210 --> 00:04:13,770 to return an optimal sequence, or something like that. 80 00:04:13,770 --> 00:04:15,140 And then, analyze the time. 81 00:04:15,140 --> 00:04:19,100 Now, the last one isn't really important for solving a problem 82 00:04:19,100 --> 00:04:19,700 recursively. 83 00:04:19,700 --> 00:04:21,408 But in this class, it's really important, 84 00:04:21,408 --> 00:04:25,010 because we want to tell whether the algorithms that we make 85 00:04:25,010 --> 00:04:25,610 are efficient. 86 00:04:25,610 --> 00:04:30,750 So let's dive a little deeper into each one of these things. 87 00:04:30,750 --> 00:04:34,240 So when we approach a subproblem, 88 00:04:34,240 --> 00:04:38,590 really, what I'm asking you for is to describe-- 89 00:04:38,590 --> 00:04:40,330 basically, set up a set of problems. 90 00:04:40,330 --> 00:04:43,780 Basically, I like to use the variable x. 91 00:04:43,780 --> 00:04:46,580 But you can use whatever variable you want. 92 00:04:46,580 --> 00:04:49,390 But basically, you're telling us what's in your memo 93 00:04:49,390 --> 00:04:51,370 and how big your memo is. 94 00:04:51,370 --> 00:04:57,810 So we usually have x as a function of some variables. 95 00:04:57,810 --> 00:05:00,090 And you're wanting to describe to me 96 00:05:00,090 --> 00:05:01,800 what the meaning of that subproblem 97 00:05:01,800 --> 00:05:03,630 is in terms of the parameters. 98 00:05:03,630 --> 00:05:06,360 Now, if you have parameters in your subproblem 99 00:05:06,360 --> 00:05:09,600 that don't appear in your subproblem definition, 100 00:05:09,600 --> 00:05:10,580 you're doing it wrong. 
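The speed-up from collapsing repeated recursive calls into a DAG can be sketched in a few lines of Python. Fibonacci is not part of this review; it is used here only as an assumed stand-in for any recurrence whose subproblems overlap, and the names `fib_naive`, `fib_memo`, and `calls` are hypothetical. The comments label the SRT BOT steps.

```python
from functools import lru_cache

# Count how many times each version is actually evaluated.
calls = {"naive": 0, "memo": 0}

def fib_naive(n):
    """Plain recursion: the call tree revisits the same subproblem many times."""
    calls["naive"] += 1
    if n < 2:            # Base cases: x(0) = 0, x(1) = 1
        return n
    # Relation: x(n) = x(n-1) + x(n-2); Topological order: n decreases
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Same relation, but each subproblem is computed once and remembered."""
    calls["memo"] += 1
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

if __name__ == "__main__":
    print(fib_naive(20), calls["naive"])  # many repeated evaluations
    print(fib_memo(20), calls["memo"])    # one evaluation per subproblem 0..20
```

The memoized version evaluates one node per distinct subproblem, which is the DAG view of the recursion rather than the tree view.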
101 00:05:10,580 --> 00:05:13,080 And you're probably not going to get points for the problem. 102 00:05:13,080 --> 00:05:16,020 Because I don't know what your problem means now. 103 00:05:16,020 --> 00:05:18,510 Even if it's a correct problem and you do the rest of it 104 00:05:18,510 --> 00:05:22,880 right, part of this class is about communication. 105 00:05:22,880 --> 00:05:26,360 And if you're not communicating to us what this thing is doing, 106 00:05:26,360 --> 00:05:29,720 it's really difficult for us-- 107 00:05:29,720 --> 00:05:33,810 for you to convince us that your algorithm is correct. 108 00:05:33,810 --> 00:05:36,680 So you really want to, in words, describe 109 00:05:36,680 --> 00:05:39,170 what the output of your subproblem is. 110 00:05:39,170 --> 00:05:42,760 What will the memo return to me? 111 00:05:42,760 --> 00:05:45,340 And how those return values depend 112 00:05:45,340 --> 00:05:48,720 on the inputs, the parameters of your subproblem. 113 00:05:48,720 --> 00:05:51,250 So that's what, in words, describe 114 00:05:51,250 --> 00:05:52,300 what a subproblem means. 115 00:05:52,300 --> 00:05:54,760 So that's going to be a really important thing for you 116 00:05:54,760 --> 00:05:57,580 not to forget on a quiz. 117 00:05:57,580 --> 00:06:00,790 Then, when making subproblems, often, what we're doing is 118 00:06:00,790 --> 00:06:06,860 we're recursing on different values of indices in a sequence 119 00:06:06,860 --> 00:06:08,550 or numbers in your problem. 120 00:06:08,550 --> 00:06:10,850 That's kind of what we got to in the last-- 121 00:06:10,850 --> 00:06:14,390 in lecture 18, I guess, when we were 122 00:06:14,390 --> 00:06:17,030 talking about expanding subproblems based 123 00:06:17,030 --> 00:06:18,320 on an integer in a problem. 124 00:06:18,320 --> 00:06:20,660 Now actually, an integer in our problem 125 00:06:20,660 --> 00:06:23,450 is the number of things in a sequence. 
126 00:06:23,450 --> 00:06:26,480 And so, really, those indices are integers in our problem 127 00:06:26,480 --> 00:06:28,220 that we're looping over. 128 00:06:28,220 --> 00:06:34,160 Except those integers happen to be the size of our subproblem. 129 00:06:34,160 --> 00:06:36,290 Whereas, other integers might be larger, 130 00:06:36,290 --> 00:06:39,210 which is why you might get a pseudopolynomial time bound. 131 00:06:39,210 --> 00:06:41,210 But in general, when I have a sequence of things 132 00:06:41,210 --> 00:06:44,120 that I might want to dynamic program over, 133 00:06:44,120 --> 00:06:50,480 common choices for subproblems are prefixes or suffixes 134 00:06:50,480 --> 00:06:53,360 if I can kind of locally figure out 135 00:06:53,360 --> 00:06:58,100 what to do with one item and then recurse. 136 00:06:58,100 --> 00:07:02,752 Or if I can't kind of localize it by one choice on one side, 137 00:07:02,752 --> 00:07:04,460 if I have to make a choice in the middle, 138 00:07:04,460 --> 00:07:06,890 or I have to make a choice on both ends, 139 00:07:06,890 --> 00:07:09,080 then you might want to use sub-- 140 00:07:09,080 --> 00:07:13,310 basically contiguous subsequences of your sequence. 141 00:07:13,310 --> 00:07:15,680 Because you might need that flexibility 142 00:07:15,680 --> 00:07:17,335 when recursing downward, if you need 143 00:07:17,335 --> 00:07:19,460 to take something from both the front and the back, 144 00:07:19,460 --> 00:07:22,165 for example. 145 00:07:22,165 --> 00:07:23,540 And really, what's the difference 146 00:07:23,540 --> 00:07:25,400 between prefixes and suffixes? 147 00:07:25,400 --> 00:07:26,700 Not much. 148 00:07:26,700 --> 00:07:29,527 OK, we've been concentrating on suffixes in this class. 149 00:07:29,527 --> 00:07:31,610 Because in some sense, it's easier to think about. 150 00:07:31,610 --> 00:07:34,640 What am I doing with the first thing in my sequence, 151 00:07:34,640 --> 00:07:36,950 or my suffix? 
152 00:07:36,950 --> 00:07:39,890 And then I can recurse on what happens later. 153 00:07:39,890 --> 00:07:46,280 Now, in actuality, when you're doing this, say, bottom up, 154 00:07:46,280 --> 00:07:49,340 the actual computation that is evaluated first 155 00:07:49,340 --> 00:07:50,450 is where in that sequence? 156 00:07:53,710 --> 00:07:56,800 I may be calling, at the top level, what 157 00:07:56,800 --> 00:07:58,630 happens to my first element. 158 00:07:58,630 --> 00:08:02,620 But I'll actually deal with that first element last. 159 00:08:02,620 --> 00:08:05,920 Because I will recursively solve everything below me, 160 00:08:05,920 --> 00:08:10,630 in front of me, before I figure out what to do with this thing. 161 00:08:10,630 --> 00:08:15,610 So in actuality, when I'm solving my recursion, 162 00:08:15,610 --> 00:08:18,520 I will start at the end, bottom up, 163 00:08:18,520 --> 00:08:20,590 because that's my base case. 164 00:08:20,590 --> 00:08:23,650 And then I'll work my way back to the front. 165 00:08:23,650 --> 00:08:26,780 Whereas, with prefixes, you look at it the other way. 166 00:08:26,780 --> 00:08:29,210 What am I doing with my last element? 167 00:08:29,210 --> 00:08:31,210 If I look at what I'm doing at the last element, 168 00:08:31,210 --> 00:08:34,159 I recurse on a prefix, on the stuff that's before me. 169 00:08:34,159 --> 00:08:37,299 And then when I do bottom up, I start from the front 170 00:08:37,299 --> 00:08:39,400 and work my way up. 171 00:08:39,400 --> 00:08:41,809 it's two different sides of the same coin. 172 00:08:41,809 --> 00:08:43,780 And usually, these are interchangeable. 173 00:08:43,780 --> 00:08:48,880 We've been doing it suffix-wise, because when starting to learn 174 00:08:48,880 --> 00:08:51,430 dynamic programming, it's a lot-- 175 00:08:51,430 --> 00:08:54,130 we read things from left to right and things like that. 
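The suffix and prefix views, and the order in which bottom-up evaluation actually touches the sequence, can be sketched on max subarray sum. This is a minimal sketch under assumptions: the input `A` is a nonempty list of numbers, the subarray must be nonempty, and the function names are hypothetical rather than taken from the course.

```python
def max_subarray_suffix(A):
    """x[i] = best sum of a nonempty subarray that starts exactly at index i."""
    n = len(A)
    x = [0] * (n + 1)              # x[n] = 0 is a sentinel past the end
    for i in reversed(range(n)):   # bottom up: start at the END, work to the front
        x[i] = A[i] + max(0, x[i + 1])
    return max(x[:n])              # original problem: max over all subproblems

def max_subarray_prefix(A):
    """y[j] = best sum of a nonempty subarray that ends exactly at index j - 1."""
    n = len(A)
    y = [0] * (n + 1)              # y[0] = 0 is a sentinel before the start
    for j in range(1, n + 1):      # bottom up: start at the FRONT, work to the end
        y[j] = A[j - 1] + max(0, y[j - 1])
    return max(y[1:])

if __name__ == "__main__":
    A = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
    print(max_subarray_suffix(A), max_subarray_prefix(A))
```

Both functions solve the same problem; only the direction of the recursion, and therefore of the bottom-up loop, is flipped.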
176 00:08:54,130 --> 00:08:55,810 It's a lot easier to figure out what's 177 00:08:55,810 --> 00:08:58,630 happening with the first thing and move forward, conceptually. 178 00:08:58,630 --> 00:09:00,220 It's actually exactly the same thing. 179 00:09:00,220 --> 00:09:04,630 I could just flip my sequence, do the exact same thing 180 00:09:04,630 --> 00:09:05,800 with prefixes. 181 00:09:05,800 --> 00:09:08,680 It would be the exact same dynamic program. 182 00:09:08,680 --> 00:09:11,920 So these things are interchangeable. 183 00:09:11,920 --> 00:09:15,730 It's really useful, when learning to dynamic program, 184 00:09:15,730 --> 00:09:19,300 to be able to switch back and forth between these things. 185 00:09:19,300 --> 00:09:21,880 We'll be working on suffixes today 186 00:09:21,880 --> 00:09:24,310 on the problems that we do. 187 00:09:24,310 --> 00:09:26,020 But these are interchangeable. 188 00:09:26,020 --> 00:09:28,780 And sometimes it's useful to be able to conceptually think 189 00:09:28,780 --> 00:09:33,400 about it in both directions. 190 00:09:33,400 --> 00:09:38,950 So aside from dealing with subsequences of sequences, 191 00:09:38,950 --> 00:09:41,980 in particular, contiguous ones, we also 192 00:09:41,980 --> 00:09:45,640 often multiply our subproblems across multiple inputs. 193 00:09:45,640 --> 00:09:48,490 Like if we have multiple sequences, 194 00:09:48,490 --> 00:09:51,370 we might take indices in each one of them 195 00:09:51,370 --> 00:09:54,460 to represent prefixes or suffixes. 196 00:09:54,460 --> 00:09:57,010 And then we might have to remember additional information 197 00:09:57,010 --> 00:09:59,500 by maintaining some auxiliary information, 198 00:09:59,500 --> 00:10:05,350 like am I trying to maximize or minimize my sum in a-- 199 00:10:05,350 --> 00:10:09,040 or evaluated expression in an arithmetic parenthesization. 200 00:10:09,040 --> 00:10:13,420 Or is it player 1's turn or player 2's turn? 
201 00:10:13,420 --> 00:10:18,460 Or which finger-- where was my finger when I was playing 202 00:10:18,460 --> 00:10:19,678 piano or something like that? 203 00:10:19,678 --> 00:10:21,220 Those are the kinds of things that we 204 00:10:21,220 --> 00:10:24,110 might expand our state on. 205 00:10:24,110 --> 00:10:26,170 And in particular, we might expand our state 206 00:10:26,170 --> 00:10:28,480 based on the numbers in our problem 207 00:10:28,480 --> 00:10:31,630 if we're trying to, for example, keep track of how much 208 00:10:31,630 --> 00:10:35,620 space is left in a knapsack or something like that. 209 00:10:35,620 --> 00:10:40,420 But in general, if I'm trying to, say, pack a set of things, 210 00:10:40,420 --> 00:10:46,210 it's useful to know how much space I have left to pack. 211 00:10:46,210 --> 00:10:48,170 So that's subproblems. 212 00:10:48,170 --> 00:10:51,550 This is really the key part about dynamic programming 213 00:10:51,550 --> 00:10:52,690 is the recursive part. 214 00:10:52,690 --> 00:10:55,420 This is what makes it hard is choosing a set of subproblems. 215 00:10:55,420 --> 00:10:59,830 And it's often you build subproblems 216 00:10:59,830 --> 00:11:04,660 to fit well with relations. 217 00:11:04,660 --> 00:11:08,860 So usually, building what these subproblems are 218 00:11:08,860 --> 00:11:12,350 is usually closely coupled with the next step, 219 00:11:12,350 --> 00:11:14,890 which is relating the subproblems recursively. 220 00:11:14,890 --> 00:11:17,110 And relate recursively, I-- 221 00:11:17,110 --> 00:11:20,110 usually what I want is an expression, 222 00:11:20,110 --> 00:11:23,230 a mathematical expression, relating 223 00:11:23,230 --> 00:11:28,570 the definition of a subproblem you 224 00:11:28,570 --> 00:11:33,730 had in the previous section, relating those, in math terms, 225 00:11:33,730 --> 00:11:34,690 to the other things. 
226 00:11:34,690 --> 00:11:38,140 This is-- it's really important that you write this in math, 227 00:11:38,140 --> 00:11:42,280 because it needs to be precise, to communicate this thing well. 228 00:11:42,280 --> 00:11:44,110 Now, you can write it in words. 229 00:11:44,110 --> 00:11:47,620 But I would suggest you write it as a mathematical expression, 230 00:11:47,620 --> 00:11:50,170 because it's a lot more concise for us 231 00:11:50,170 --> 00:11:53,320 to see what's happening in your recursion. 232 00:11:53,320 --> 00:11:55,120 So relate them recursively. 233 00:11:55,120 --> 00:11:56,830 Basically, I'm going to write, say, 234 00:11:56,830 --> 00:12:02,410 that x of some set of parameters equals some function, usually 235 00:12:02,410 --> 00:12:05,590 a maximization, or a minimization, or a summation, 236 00:12:05,590 --> 00:12:10,330 or an or, or an and, or some other combinator 237 00:12:10,330 --> 00:12:14,740 of a bunch of choices that you might make, so-- 238 00:12:14,740 --> 00:12:18,100 or a bunch of subproblems that you might recurse on. 239 00:12:18,100 --> 00:12:19,840 Basically, you're going to depend 240 00:12:19,840 --> 00:12:24,140 on some other subproblems that are smaller in some sense. 241 00:12:24,140 --> 00:12:27,930 Now, actually embedded in this, the idea 242 00:12:27,930 --> 00:12:33,100 of a smaller subproblem isn't really well defined yet. 243 00:12:33,100 --> 00:12:37,750 We haven't told you an ordering of these subproblems 244 00:12:37,750 --> 00:12:39,550 to be smaller. 245 00:12:39,550 --> 00:12:43,530 But that's what's going to come in the third step. 246 00:12:43,530 --> 00:12:47,250 So kind of a strategy for figuring out 247 00:12:47,250 --> 00:12:49,440 what these recursive relations might be 248 00:12:49,440 --> 00:12:54,520 is to identify some question about the subproblem solution. 249 00:12:54,520 --> 00:12:58,750 What do I do with the first character in this string? 
250 00:12:58,750 --> 00:13:07,750 Or which cage do I put this tiger in? 251 00:13:07,750 --> 00:13:12,490 To figure out what subproblem should I recurse on later? 252 00:13:12,490 --> 00:13:15,290 I don't know the answer to that question. 253 00:13:15,290 --> 00:13:17,230 But if I knew the answer to that question, 254 00:13:17,230 --> 00:13:21,200 then I could recurse on a smaller subproblem, 255 00:13:21,200 --> 00:13:26,030 because I figured out what to do with that tiger. 256 00:13:26,030 --> 00:13:29,390 And so it will let me reduce to smaller subproblems. 257 00:13:29,390 --> 00:13:31,370 And then, what dynamic programming does 258 00:13:31,370 --> 00:13:33,740 is because I only have a polynomial number 259 00:13:33,740 --> 00:13:36,020 of subproblems, and I assumed I've already 260 00:13:36,020 --> 00:13:39,740 computed what those are, I've already 261 00:13:39,740 --> 00:13:43,640 memoized what the solutions to those problems are, 262 00:13:43,640 --> 00:13:45,590 then I can just locally brute force 263 00:13:45,590 --> 00:13:47,930 over all the possible answers to that question. 264 00:13:47,930 --> 00:13:51,340 And that's one way to look at dynamic programming. 265 00:13:51,340 --> 00:13:56,010 OK, so then as we were talking about topological order, 266 00:13:56,010 --> 00:13:58,680 arguing that relation is acyclic. 267 00:13:58,680 --> 00:14:00,690 Essentially, just defining what smaller 268 00:14:00,690 --> 00:14:02,700 means when we say we're recursing 269 00:14:02,700 --> 00:14:04,020 on smaller subproblems. 270 00:14:04,020 --> 00:14:05,280 What does smaller mean? 271 00:14:05,280 --> 00:14:09,420 Usually, you're saying that some index 272 00:14:09,420 --> 00:14:11,970 or some parameter of my subproblem 273 00:14:11,970 --> 00:14:13,890 always decreases or increases. 274 00:14:13,890 --> 00:14:15,910 Sometimes, that's not always the case. 
275 00:14:15,910 --> 00:14:18,300 Sometimes, you have to, maybe, add a couple 276 00:14:18,300 --> 00:14:21,210 indices and see that that always increases, 277 00:14:21,210 --> 00:14:24,240 because one may stay the same while the other increases 278 00:14:24,240 --> 00:14:26,140 or something like that. 279 00:14:26,140 --> 00:14:28,410 But in general, as long as you argue 280 00:14:28,410 --> 00:14:31,770 that the relations are acyclic, then 281 00:14:31,770 --> 00:14:33,240 the subproblem graph is a DAG. 282 00:14:33,240 --> 00:14:36,180 And you can compute in a bottom-up manner. 283 00:14:36,180 --> 00:14:39,970 And you don't get infinite loops in your recursion. 284 00:14:39,970 --> 00:14:42,780 OK, the last thing, the last couple of things 285 00:14:42,780 --> 00:14:44,220 are kind of bookkeeping. 286 00:14:44,220 --> 00:14:46,140 But if you don't write these on your exam, 287 00:14:46,140 --> 00:14:48,090 we can't give you points for them. 288 00:14:48,090 --> 00:14:50,470 So write these down. 289 00:14:50,470 --> 00:14:54,190 Base cases, if you don't tell us base cases, 290 00:14:54,190 --> 00:14:57,160 then your algorithm cannot be polynomial time. 291 00:14:57,160 --> 00:14:59,890 It can't even be finite time, because your algorithm never 292 00:14:59,890 --> 00:15:01,210 stops. 293 00:15:01,210 --> 00:15:04,760 It just continues to recurse forever and ever and ever. 294 00:15:04,760 --> 00:15:06,610 And so it's hard to give us points-- 295 00:15:06,610 --> 00:15:08,290 I mean, we will give you some points 296 00:15:08,290 --> 00:15:11,170 if your subproblems and relation are correct. 297 00:15:11,170 --> 00:15:14,840 But really, if you write code without a base case, 298 00:15:14,840 --> 00:15:16,540 it's going to be wrong. 299 00:15:16,540 --> 00:15:19,600 So base cases are really important. 
300 00:15:19,600 --> 00:15:24,610 Basically, for anything at the bounds of your computation, 301 00:15:24,610 --> 00:15:28,540 wherever your recursive relation would essentially 302 00:15:28,540 --> 00:15:31,180 go outside the bounds of your memo, 303 00:15:31,180 --> 00:15:33,790 let's say I'm dealing with a subsequence. 304 00:15:33,790 --> 00:15:38,740 And at some point, I'm trying to point to a state 305 00:15:38,740 --> 00:15:41,200 where I have zero or negative elements 306 00:15:41,200 --> 00:15:44,150 in my sequence, that's probably a bad thing. 307 00:15:44,150 --> 00:15:46,570 And so I want to define how to compute 308 00:15:46,570 --> 00:15:48,890 those things in constant time. 309 00:15:48,890 --> 00:15:51,920 So that my algorithm can terminate when it gets 310 00:15:51,920 --> 00:15:55,050 to one of those base cases. 311 00:15:55,050 --> 00:15:57,350 So it's really important that you 312 00:15:57,350 --> 00:16:01,682 cover all of those possible leaf locations 313 00:16:01,682 --> 00:16:03,890 where you want to be able to return in constant time. 314 00:16:03,890 --> 00:16:06,470 And we'll do some of that today. 315 00:16:06,470 --> 00:16:08,660 State solutions for all reachable, 316 00:16:08,660 --> 00:16:12,170 independent subproblems where the relation breaks down. 317 00:16:12,170 --> 00:16:15,650 Essentially, I would be going outside the bounds of my thing. 318 00:16:15,650 --> 00:16:20,810 Or anything where, maybe if you've got one item left, 319 00:16:20,810 --> 00:16:23,002 you might say, well, I have no choice 320 00:16:23,002 --> 00:16:24,210 on what to do with that item. 321 00:16:24,210 --> 00:16:26,360 I have to pick it or something like that. 322 00:16:26,360 --> 00:16:31,440 OK, then for your original problem, 323 00:16:31,440 --> 00:16:35,190 you show how to compute solution to the original problem 324 00:16:35,190 --> 00:16:39,540 from the solutions of your subproblems. 
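The points above about topological order over two indices and about base cases at the edge of the memo can be sketched with a two-sequence example. Longest common subsequence is not discussed in this review; it is an assumed illustration, and the function name `lcs_length` is hypothetical.

```python
def lcs_length(A, B):
    """Length of the longest common subsequence of sequences A and B.

    Subproblems:  x[i][j] = LCS length of suffixes A[i:] and B[j:].
    Relation:     x[i][j] = 1 + x[i+1][j+1]            if A[i] == B[j]
                            max(x[i+1][j], x[i][j+1])  otherwise
    Topo order:   every dependency has a larger i + j, even though one
                  index may stay the same while the other increases.
    Base cases:   x[len(A)][j] = x[i][len(B)] = 0 (one suffix is empty).
    Original:     x[0][0].
    Time:         (len(A)+1) * (len(B)+1) subproblems, O(1) work each.
    """
    n, m = len(A), len(B)
    # Row n and column m hold the base cases, so the relation never
    # reads outside the bounds of the memo.
    x = [[0] * (m + 1) for _ in range(n + 1)]
    for i in reversed(range(n)):
        for j in reversed(range(m)):
            if A[i] == B[j]:
                x[i][j] = 1 + x[i + 1][j + 1]
            else:
                x[i][j] = max(x[i + 1][j], x[i][j + 1])
    return x[0][0]

if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))
```

Without the extra base-case row and column, the relation at the last index would reach past the end of the table, which is the situation the base-case step exists to rule out.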
325 00:16:39,540 --> 00:16:43,350 So usually, this is just here's a subproblem. 326 00:16:43,350 --> 00:16:46,410 It's the one that used all of the things, 327 00:16:46,410 --> 00:16:48,780 and that's going to be my answer. 328 00:16:48,780 --> 00:16:50,170 But that's not always the case. 329 00:16:50,170 --> 00:16:52,770 Sometimes, like in a longest increasing subsequence, 330 00:16:52,770 --> 00:16:57,090 we had to take a max over all of our problems that we computed, 331 00:16:57,090 --> 00:16:59,280 or max subarray sum, we also had to do that. 332 00:17:03,960 --> 00:17:07,560 But in general, the output to our subproblems 333 00:17:07,560 --> 00:17:10,800 wants to be some scalar value that we're trying to optimize, 334 00:17:10,800 --> 00:17:13,290 or a Boolean, or something like that. 335 00:17:13,290 --> 00:17:18,119 It's how we maximize or minimize what we're doing. 336 00:17:18,119 --> 00:17:20,940 We're not storing the entire sequence of how we got there. 337 00:17:20,940 --> 00:17:22,829 Because there could be an exponential number 338 00:17:22,829 --> 00:17:24,932 of possible subsequences that got there. 339 00:17:24,932 --> 00:17:26,849 That's the whole point of dynamic programming. 340 00:17:26,849 --> 00:17:30,180 We're kind of isolating the complexity of one subproblem 341 00:17:30,180 --> 00:17:31,350 down to a single number. 342 00:17:34,010 --> 00:17:38,150 But in a lot of problems, we might want to reconstruct, 343 00:17:38,150 --> 00:17:46,820 say, the placement of tigers into cages and not just how-- 344 00:17:46,820 --> 00:17:49,670 what's the minimum discomfort over all tigers or something 345 00:17:49,670 --> 00:17:52,490 like that, like you had in your problem set. 346 00:17:52,490 --> 00:17:55,550 So I actually want to know where to put tigers into cages. 
347 00:17:55,550 --> 00:17:59,690 And to do that, every time I maximized a subproblem, 348 00:17:59,690 --> 00:18:04,120 I can remember which subproblem or subproblems I depended on, 349 00:18:04,120 --> 00:18:08,680 just like storing parent pointers in shortest paths. 350 00:18:08,680 --> 00:18:11,170 And then, using those parent pointers, 351 00:18:11,170 --> 00:18:14,170 I can just walk back in my subproblem graph 352 00:18:14,170 --> 00:18:17,830 and figure out which path to a base case 353 00:18:17,830 --> 00:18:22,480 led me to an optimal solution. 354 00:18:22,480 --> 00:18:26,075 And then the last thing is analyzing running time. 355 00:18:26,075 --> 00:18:27,700 Generally, you're just summing the work 356 00:18:27,700 --> 00:18:29,350 done by each subproblem. 357 00:18:29,350 --> 00:18:31,210 Because the assumption is you're calculating 358 00:18:31,210 --> 00:18:35,340 all of the subproblems you described to me. 359 00:18:35,340 --> 00:18:39,840 But if the work per subproblem is bounded by the same value, 360 00:18:39,840 --> 00:18:41,730 you can just multiply it out. 361 00:18:41,730 --> 00:18:44,940 So that's generally a weaker bound, 362 00:18:44,940 --> 00:18:49,230 but usually asymptotically equivalent to the stronger 363 00:18:49,230 --> 00:18:50,550 notion on the left. 364 00:18:53,270 --> 00:18:55,990 And that's basically how you do running time. 365 00:18:55,990 --> 00:18:58,540 Usually, it's enough to-- 366 00:18:58,540 --> 00:19:03,370 how do I determine how many subproblems I have? 367 00:19:03,370 --> 00:19:05,080 Well, I look at the possible values 368 00:19:05,080 --> 00:19:07,840 of each of my parameters. 369 00:19:07,840 --> 00:19:10,150 And then I multiply those numbers together. 370 00:19:10,150 --> 00:19:13,430 A lot of people will maybe say, oh, I add them together. 371 00:19:13,430 --> 00:19:16,850 No, because I'm able to choose each of these independently. 372 00:19:16,850 --> 00:19:18,790 And so I multiply those things together. 
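Parent pointers and the running-time count can be sketched together on longest increasing subsequence, which the review mentions as a case where the original problem is a max over all subproblems. This is a sketch under assumptions: strictly increasing subsequences, ties broken arbitrarily, and hypothetical names throughout.

```python
def longest_increasing_subsequence(A):
    """Return one longest strictly increasing subsequence of A.

    Subproblems: x[i] = length of the longest increasing subsequence
                 of the suffix A[i:] that starts with A[i].
    Relation:    x[i] = 1 + max(x[j] for j > i with A[j] > A[i]), or 1 if none.
    Time:        n subproblems times O(n) work each, so O(n^2).
    """
    n = len(A)
    if n == 0:
        return []
    x = [1] * n              # base value: A[i] alone
    parent = [None] * n      # which subproblem the maximum came from
    for i in reversed(range(n)):          # suffixes: evaluate from the end
        for j in range(i + 1, n):
            if A[j] > A[i] and 1 + x[j] > x[i]:
                x[i] = 1 + x[j]
                parent[i] = j             # remember the choice that won
    # Original problem: max over all subproblems, not just x[0].
    i = max(range(n), key=lambda k: x[k])
    # Walk the parent pointers back to a base case to rebuild the sequence.
    result = []
    while i is not None:
        result.append(A[i])
        i = parent[i]
    return result

if __name__ == "__main__":
    print(longest_increasing_subsequence([10, 9, 2, 5, 3, 7, 101, 18]))
```

Each memo entry stays a single number; the sequence itself is recovered afterwards from the pointers, so nothing exponential is ever stored.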
373 00:19:18,790 --> 00:19:22,270 And then the work done by each subproblem 374 00:19:22,270 --> 00:19:25,910 is usually the size of the thing I'm 375 00:19:25,910 --> 00:19:28,180 maximizing over, or minimizing over, 376 00:19:28,180 --> 00:19:31,060 or summing in my relation. 377 00:19:31,060 --> 00:19:33,520 It's going to be the size of that, the branching 378 00:19:33,520 --> 00:19:35,920 that I have, the number of subproblems I depend on. 379 00:19:35,920 --> 00:19:38,650 And so the number of subproblems, 380 00:19:38,650 --> 00:19:41,020 you probably look at your subproblem statement 381 00:19:41,020 --> 00:19:42,280 definition. 382 00:19:42,280 --> 00:19:44,320 To find the work done by each subproblem, 383 00:19:44,320 --> 00:19:46,280 you look at your recursive relation. 384 00:19:46,280 --> 00:19:51,470 OK, so with that, we've got this really nice framework. 385 00:19:51,470 --> 00:19:53,690 And we're going to use it to solve some practice 386 00:19:53,690 --> 00:19:56,120 problems, happy days. 387 00:19:56,120 --> 00:20:02,900 And these are a little bit longer in terms of description 388 00:20:02,900 --> 00:20:05,450 than our previous quiz 2 review. 389 00:20:05,450 --> 00:20:09,600 So I'm going to go ahead and read them out for you. 390 00:20:09,600 --> 00:20:11,660 This one's a little shorter. 391 00:20:11,660 --> 00:20:15,350 Tiffany Bannen stumbles upon a lottery chart dropped by a time 392 00:20:15,350 --> 00:20:18,410 traveler from the future, which lists winning lottery 393 00:20:18,410 --> 00:20:21,050 numbers and positive integer cash payouts for the next n 394 00:20:21,050 --> 00:20:21,590 days. 395 00:20:21,590 --> 00:20:27,260 Anyone get the reference here, Tiffany Bannen? 396 00:20:27,260 --> 00:20:32,810 Biff Tannen from some Back to the Future thing. 
397 00:20:32,810 --> 00:20:33,865 So this was actually-- 398 00:20:33,865 --> 00:20:35,990 I think it's the second Back to the Future movie 399 00:20:35,990 --> 00:20:38,630 where this happens. 400 00:20:38,630 --> 00:20:41,390 Anyway, Tiffany wants to use this information to make money, 401 00:20:41,390 --> 00:20:44,240 because she knows the future about the lottery. 402 00:20:44,240 --> 00:20:47,000 But is worried that if she plays winning numbers every day, 403 00:20:47,000 --> 00:20:50,690 lottery organizers will get suspicious and shut her down. 404 00:20:50,690 --> 00:20:54,620 So the idea here is maybe it's still suspicious, but decides 405 00:20:54,620 --> 00:20:57,020 to play the lottery infrequently, at most, 406 00:20:57,020 --> 00:20:59,150 twice in any seven day period. 407 00:21:01,810 --> 00:21:06,160 She'll win, but it's infrequent enough 408 00:21:06,160 --> 00:21:09,370 that maybe that's by chance, maybe not. 409 00:21:09,370 --> 00:21:12,490 Describe a linear-time algorithm to determine the maximum amount 410 00:21:12,490 --> 00:21:15,430 of lottery winnings Tiff, Tiffany, 411 00:21:15,430 --> 00:21:17,918 can win in the next n days by playing the lottery 412 00:21:17,918 --> 00:21:18,460 infrequently. 413 00:21:18,460 --> 00:21:22,990 Now, this was a particularly difficult type of p-set, 414 00:21:22,990 --> 00:21:25,900 or first p-set on dynamic programming problem. 415 00:21:25,900 --> 00:21:28,670 But let's try to do it together. 416 00:21:28,670 --> 00:21:30,850 So this is problem 1. 417 00:21:30,850 --> 00:21:33,730 I'm going to just call it Lotto. 418 00:21:33,730 --> 00:21:39,870 OK, so how can we deal with subproblems here? 419 00:21:39,870 --> 00:21:42,810 Well, I might want to think about what 420 00:21:42,810 --> 00:21:45,360 do I do on the first day. 421 00:21:45,360 --> 00:21:47,130 Am I going to play the lottery or not? 422 00:21:47,130 --> 00:21:49,320 And then recurse on the rest. 
423 00:21:49,320 --> 00:21:50,910 That sounds good, right? 424 00:21:50,910 --> 00:21:53,640 I might have something like-- 425 00:21:53,640 --> 00:22:05,750 well, let's say that L of i is the winnings on day i. 426 00:22:05,750 --> 00:22:08,270 This is kind of just like-- 427 00:22:08,270 --> 00:22:11,780 this doesn't define notation for what the cash payouts are. 428 00:22:11,780 --> 00:22:14,580 And so I'm making a variable to do that. 429 00:22:14,580 --> 00:22:18,440 And for the sake of what's written down on my sheet, 430 00:22:18,440 --> 00:22:23,930 I'm going to assume that this is 1-indexed, I don't know why. 431 00:22:26,550 --> 00:22:28,080 So I have days 1 to n. 432 00:22:28,080 --> 00:22:30,420 I know what their lottery payouts are. 433 00:22:30,420 --> 00:22:33,360 So I might, when I'm doing my SRT BOT stuff, 434 00:22:33,360 --> 00:22:36,780 I have subproblems. 435 00:22:36,780 --> 00:22:41,130 What I might want to do is see what happens on day one 436 00:22:41,130 --> 00:22:42,840 and recurse on what's later. 437 00:22:42,840 --> 00:22:46,410 So I might have something like x of i 438 00:22:46,410 --> 00:23:05,780 is max winnings for, I guess, possible for days i to n. 439 00:23:05,780 --> 00:23:09,950 Anyone have a problem with this type of subproblem? 440 00:23:09,950 --> 00:23:15,290 Let's kind of see what this type of subproblem would lead me to. 441 00:23:18,020 --> 00:23:23,100 I can either, in my relate step, what are my choices? 442 00:23:23,100 --> 00:23:27,650 I can either play on day i or I cannot play on day i. 443 00:23:27,650 --> 00:23:36,250 If I'm trying to maximize this thing, maximize. 444 00:23:36,250 --> 00:23:40,510 Either if I play on day i I get Li. 445 00:23:40,510 --> 00:23:44,050 And then I can recurse on the remainder. 446 00:23:44,050 --> 00:23:47,488 Or I don't play on day i and I recurse on the remainder. 447 00:23:50,120 --> 00:23:52,770 Anyone like this recurrence?
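The first-attempt relation just described can be sketched in Python. This is only an illustration under assumptions: the function name `naive_x` is hypothetical, and the list is 0-indexed (the lecture uses 1-indexed days). With positive payouts the "play" branch always wins, so the result is simply the sum of every payout, which shows that nothing in this relation looks at the twice-in-seven-days rule.

```python
def naive_x(payouts):
    """First-attempt relation: x(i) = max(L(i) + x(i+1), x(i+1)).

    payouts[i] is the positive payout for day i + 1 (0-indexed list,
    an assumption made here for convenience).
    """
    n = len(payouts)
    x = [0] * (n + 1)  # x[n] = 0: no days left, no winnings
    for i in range(n - 1, -1, -1):
        play = payouts[i] + x[i + 1]  # play on this day, then recurse
        skip = x[i + 1]               # skip this day, then recurse
        x[i] = max(play, skip)
    return x[0]


if __name__ == "__main__":
    days = [3, 1, 4, 1, 5]
    # Plays every single day, ignoring the infrequency rule.
    print(naive_x(days), sum(days))
```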
448 00:23:52,770 --> 00:23:56,000 Why don't we like this recurrence? 449 00:23:56,000 --> 00:23:58,325 I'm just always going to pick this thing. 450 00:23:58,325 --> 00:23:59,700 These things are always positive. 451 00:23:59,700 --> 00:24:02,580 I think it's always positive integer payouts, yeah. 452 00:24:02,580 --> 00:24:04,740 And so I'm always going to pick Li 453 00:24:04,740 --> 00:24:06,450 and the problem here is this is not 454 00:24:06,450 --> 00:24:10,030 obeying or dealing with this condition that I have, 455 00:24:10,030 --> 00:24:14,130 which is I'm only allowed to play twice a week, or not quite 456 00:24:14,130 --> 00:24:15,010 twice a week. 457 00:24:15,010 --> 00:24:17,490 It's not a fixed week-long period. 458 00:24:17,490 --> 00:24:21,580 It's within any consecutive seven-day period, 459 00:24:21,580 --> 00:24:23,710 which is a little confusing. 460 00:24:23,710 --> 00:24:28,240 How can I remember what days I'm allowed to pick later on? 461 00:24:28,240 --> 00:24:30,235 It seems a little daunting. 462 00:24:33,570 --> 00:24:36,006 In a sense, for me to know-- 463 00:24:36,006 --> 00:24:39,690 this is max winning possible for days i to n. 464 00:24:39,690 --> 00:24:44,900 But in some sense, it depends on which days I picked before. 465 00:24:44,900 --> 00:24:51,900 Because if I picked i minus 1, I can't pick another day 466 00:24:51,900 --> 00:24:53,460 for another six days. 467 00:24:53,460 --> 00:25:00,990 If I have-- right, so I have-- let's do this precisely. 468 00:25:00,990 --> 00:25:04,560 This is i minus 1. 469 00:25:04,560 --> 00:25:11,550 I have seven day period, 1, 2, 3, 4, 5, 6, 7. 470 00:25:11,550 --> 00:25:16,613 If I played the lottery here and I played the lottery here, 471 00:25:16,613 --> 00:25:18,530 then I'm not allowed to play the lottery here, 472 00:25:18,530 --> 00:25:20,822 I'm not allowed to play here, not allowed to play here, 473 00:25:20,822 --> 00:25:23,150 not allowed to play here, not allowed to play here. 
474 00:25:23,150 --> 00:25:24,470 I am allowed to play here. 475 00:25:27,130 --> 00:25:30,880 So this is i plus 1, plus 2, plus 3, plus 4, plus 5, 476 00:25:30,880 --> 00:25:32,470 plus 6, i plus 6. 477 00:25:35,340 --> 00:25:40,060 So depending on what happened before me, 478 00:25:40,060 --> 00:25:44,220 I might not be able to play until day i plus 6. 479 00:25:44,220 --> 00:25:46,500 But if I haven't played until-- 480 00:25:46,500 --> 00:25:52,020 since way back here, I could potentially play the next guy. 481 00:25:52,020 --> 00:25:56,180 I don't actually know which of these I can play on next. 482 00:25:56,180 --> 00:25:59,820 In some sense, I need to remember which days I'm 483 00:25:59,820 --> 00:26:01,600 allowed to play on next. 484 00:26:01,600 --> 00:26:02,790 And I want to be able to-- 485 00:26:02,790 --> 00:26:04,920 at the beginning, I have no restrictions. 486 00:26:04,920 --> 00:26:11,420 I can just play on this guy next, even if I played there. 487 00:26:11,420 --> 00:26:14,260 But in general, I will be restricted 488 00:26:14,260 --> 00:26:17,800 in some way between being able to play on this guy 489 00:26:17,800 --> 00:26:19,490 and being able to play on this guy. 490 00:26:19,490 --> 00:26:20,907 And so what I'm going to do is I'm 491 00:26:20,907 --> 00:26:23,560 going to generalize my problems by storing 492 00:26:23,560 --> 00:26:28,330 additional information so that I can rely on that information 493 00:26:28,330 --> 00:26:31,570 when I look into the future and recurse. 494 00:26:31,570 --> 00:26:34,210 So instead of-- 495 00:26:34,210 --> 00:26:38,305 I'm going to, in some sense, need to remember two days. 496 00:26:41,400 --> 00:26:44,550 Where was the last two places I played? 
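The restriction being drawn on the board can be checked mechanically. The sketch below is an illustration, not part of the lecture: `is_infrequent` is a hypothetical helper that counts plays in every window of seven consecutive days. It confirms the board picture: after playing on days i - 1 and i, day i + 5 is still blocked but day i + 6 is allowed.

```python
def is_infrequent(play_days):
    """Return True if no 7 consecutive days contain more than 2 plays.

    play_days is any iterable of integer day numbers (hypothetical input
    format chosen for this sketch).
    """
    days = set(play_days)
    if not days:
        return True
    for start in range(min(days), max(days) + 1):
        # Count plays in the window start, start + 1, ..., start + 6.
        plays_in_window = sum(1 for d in range(start, start + 7) if d in days)
        if plays_in_window > 2:
            return False
    return True


if __name__ == "__main__":
    i = 10
    print(is_infrequent([i - 1, i, i + 5]))  # third play too early
    print(is_infrequent([i - 1, i, i + 6]))  # third play on day i + 6
    print(is_infrequent([i - 8, i, i + 1]))  # no recent play before i
```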
497 00:26:44,550 --> 00:26:46,680 I'm going to simplify that a little bit by saying 498 00:26:46,680 --> 00:26:50,490 that this subproblem, max winnings possible for days 499 00:26:50,490 --> 00:26:56,890 i to n, I'm going to say that I have to play on day i, 500 00:26:56,890 --> 00:27:00,730 similar restriction as longest increasing subsequence, 501 00:27:00,730 --> 00:27:03,340 I'm definitely including this in my subsequence. 502 00:27:03,340 --> 00:27:05,990 That's just going to make it easier for me to be like, 503 00:27:05,990 --> 00:27:09,442 oh, I definitely know I played on this day. 504 00:27:09,442 --> 00:27:11,900 It's going to be-- it would make it easier for my thinking. 505 00:27:11,900 --> 00:27:22,360 So assuming play on day i, and actually, I 506 00:27:22,360 --> 00:27:25,930 need to remember what's the next day I can play. 507 00:27:28,630 --> 00:27:32,920 So I'm going to expand this subproblem by another-- 508 00:27:32,920 --> 00:27:35,630 I guess this is a j. 509 00:27:35,630 --> 00:27:45,100 OK, assuming I play on day i, and I'm allowed to play, 510 00:27:45,100 --> 00:28:05,730 I guess, and next allowable play on day i plus j for, 511 00:28:05,730 --> 00:28:10,030 what are my possible range of days for j? 512 00:28:10,030 --> 00:28:15,240 I can either play the next day, but I'm never 513 00:28:15,240 --> 00:28:18,150 restricted past day i plus 6. 514 00:28:18,150 --> 00:28:21,670 Because the things before me [AUDIO OUT] 515 00:28:21,670 --> 00:28:24,430 further to the right, because I haven't dealt with them yet. 516 00:28:24,430 --> 00:28:29,610 So I only have to deal with this from 1 to 6. 517 00:28:29,610 --> 00:28:32,190 Well, this is nice, because it expanded my subproblems 518 00:28:32,190 --> 00:28:34,500 by a constant number. 519 00:28:34,500 --> 00:28:37,400 So I actually didn't lose anything asymptotically here 520 00:28:37,400 --> 00:28:40,002 by remembering this information. 
521 00:28:40,002 --> 00:28:40,710 But I'm kind of-- 522 00:28:40,710 --> 00:28:43,950 I'm able to remember all the things that 523 00:28:43,950 --> 00:28:45,480 could have happened before me. 524 00:28:45,480 --> 00:28:48,330 I compress it into this one number. 525 00:28:48,330 --> 00:28:53,860 OK so now let's rewrite our relation. 526 00:28:53,860 --> 00:28:56,740 I'm actually going to go ahead and use some more board 527 00:28:56,740 --> 00:29:01,020 space, because I think that's easier than erasing. 528 00:29:05,730 --> 00:29:09,090 All right, so we are looking at my relation. 529 00:29:09,090 --> 00:29:10,830 This is a pretty complicated relation. 530 00:29:10,830 --> 00:29:12,660 But what's happening now? 531 00:29:12,660 --> 00:29:16,270 Now, I'm assuming I play on day i. 532 00:29:16,270 --> 00:29:18,240 That actually simplifies things a little bit. 533 00:29:18,240 --> 00:29:23,480 Because no matter what, I get the winnings on day i. 534 00:29:23,480 --> 00:29:26,000 Now, when I call this subproblem, 535 00:29:26,000 --> 00:29:28,970 I better make sure that it's OK that I played on day i. 536 00:29:28,970 --> 00:29:31,730 But that's in my caller. 537 00:29:31,730 --> 00:29:33,560 I am locally allowed to play on day i. 538 00:29:33,560 --> 00:29:35,120 I am playing on day i. 539 00:29:35,120 --> 00:29:38,480 That's the definition of my subproblem. 540 00:29:38,480 --> 00:29:46,040 x of i is I'm going to maximize over some choice. 541 00:29:46,040 --> 00:29:53,190 Max of-- I guess I have Li no matter what. 542 00:29:53,190 --> 00:29:56,028 So this could actually come out of my max. 543 00:29:56,028 --> 00:29:58,320 And then I'm going to choose-- what am I choosing here? 544 00:29:58,320 --> 00:30:00,870 I'm not choosing whether I'm playing on day i. 545 00:30:00,870 --> 00:30:04,290 I'm choosing what my next day i play is, so that then I 546 00:30:04,290 --> 00:30:07,880 can recurse on that subproblem. 
547 00:30:07,880 --> 00:30:10,580 So what day can I play on? 548 00:30:10,580 --> 00:30:16,000 Well, I'm kind of restricted by this j parameter 549 00:30:16,000 --> 00:30:20,410 that I didn't add into my subproblem on what 550 00:30:20,410 --> 00:30:23,290 possible days I can play next. 551 00:30:23,290 --> 00:30:26,500 I'm going to split that into-- 552 00:30:26,500 --> 00:30:29,900 kind of compress that into one thing. 553 00:30:29,900 --> 00:30:35,830 So x, I can play on-- 554 00:30:35,830 --> 00:30:38,680 I have a choice of the next day I play, 555 00:30:38,680 --> 00:30:42,790 somewhere between j, which is my next allowable play, 556 00:30:42,790 --> 00:30:45,640 and sometime in the future. 557 00:30:45,640 --> 00:30:51,190 So i plus k, this is going to be my loop that I'm looping over 558 00:30:51,190 --> 00:30:54,340 in terms of my max. 559 00:30:54,340 --> 00:30:59,630 And then, what am I restricted on in my play? 560 00:30:59,630 --> 00:31:04,290 It depends on how far I am from i. 561 00:31:04,290 --> 00:31:09,920 So if I'm here, this is i. 562 00:31:09,920 --> 00:31:13,890 If I choose k to be the next day, 563 00:31:13,890 --> 00:31:18,010 I can't play for many, many times. 564 00:31:18,010 --> 00:31:24,590 So the subproblem I'm going to recurse on is-- 565 00:31:24,590 --> 00:31:28,040 this is i plus k. 566 00:31:28,040 --> 00:31:30,680 I'm going to recurse on i plus k. 567 00:31:30,680 --> 00:31:34,280 But I'm not able to-- 568 00:31:34,280 --> 00:31:37,340 j needs to be the max it can be. 569 00:31:37,340 --> 00:31:41,810 Because I can't play until i plus 6. 570 00:31:41,810 --> 00:31:45,600 So this i plus k plus 6 is going to be my thing. 571 00:31:45,600 --> 00:31:50,475 So let's see. 572 00:31:50,475 --> 00:31:51,350 So I'm going to put-- 573 00:31:54,230 --> 00:31:57,620 let's see if I can unpack what I wrote down here. 574 00:31:57,620 --> 00:32:01,875 Max of 1, 7 minus k. 575 00:32:01,875 --> 00:32:02,375 Yuck. 
576 00:32:05,660 --> 00:32:10,670 Oh, I'll put it here. 577 00:32:10,670 --> 00:32:13,460 1, 7 minus k. 578 00:32:13,460 --> 00:32:22,930 OK, if I pick k, I'm not as aggressive toward these boards 579 00:32:22,930 --> 00:32:23,660 as Justin is. 580 00:32:26,690 --> 00:32:34,800 So if I pick a k way down here, I'm not restricted at all. 581 00:32:34,800 --> 00:32:39,330 And so the most permissive option I have here is 1. 582 00:32:39,330 --> 00:32:42,690 So I definitely can't be worse than 1. 583 00:32:42,690 --> 00:32:43,500 But if I pick-- 584 00:32:46,950 --> 00:32:53,310 and then this needs to be some number between 6 and 1. 585 00:32:53,310 --> 00:32:55,710 And so I can just check the other bound. 586 00:32:55,710 --> 00:33:03,190 If this is the most restrictive, then this should be 6. 587 00:33:03,190 --> 00:33:05,850 So when k equals 1, this should be 6. 588 00:33:05,850 --> 00:33:09,100 And it decreases every time further back-- 589 00:33:09,100 --> 00:33:12,030 or further forward I choose this k. 590 00:33:12,030 --> 00:33:15,450 So that's what my subproblem is going to be. 591 00:33:15,450 --> 00:33:21,030 And I'm choosing over k in-- 592 00:33:21,030 --> 00:33:30,850 from j, sorry i plus j, sorry, j, thank you, until what? 593 00:33:33,720 --> 00:33:35,970 That's the question, until what? 594 00:33:35,970 --> 00:33:37,613 Do I have to loop over n? 595 00:33:37,613 --> 00:33:39,030 If I loop over n, I'm going to get 596 00:33:39,030 --> 00:33:41,430 a quadratic running time, which is worse 597 00:33:41,430 --> 00:33:43,530 than what I'm allowed to do. 598 00:33:43,530 --> 00:33:46,650 The assumption is that I only have to check 599 00:33:46,650 --> 00:33:48,510 the constant number of these. 600 00:33:48,510 --> 00:33:49,590 And why might that be? 601 00:33:52,250 --> 00:33:53,740 Any ideas? 602 00:33:53,740 --> 00:33:57,420 Let's say I am-- 603 00:33:57,420 --> 00:33:59,850 got my subproblem, I'm recursing. 604 00:33:59,850 --> 00:34:01,650 I've got i.
605 00:34:01,650 --> 00:34:05,310 Yeah, I don't know where j is. j is somewhere over here, one, 606 00:34:05,310 --> 00:34:07,710 two, three, maybe it's four, or something like that. 607 00:34:11,230 --> 00:34:13,150 Let's say I pick some k down here. 608 00:34:13,150 --> 00:34:13,929 What is this? 609 00:34:13,929 --> 00:34:15,699 This is i plus-- 610 00:34:15,699 --> 00:34:22,690 so this is j is 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 611 00:34:22,690 --> 00:34:26,540 way over here, two weeks later. 612 00:34:26,540 --> 00:34:30,040 Is it ever optimal for me to do that? 613 00:34:30,040 --> 00:34:31,370 Why not? 614 00:34:31,370 --> 00:34:33,370 AUDIENCE: You could play the lotto in the middle 615 00:34:33,370 --> 00:34:34,320 and it wouldn't affect it. 616 00:34:34,320 --> 00:34:36,487 JASON KU: Yeah, I could play the lotto in the middle 617 00:34:36,487 --> 00:34:39,030 here, right? 618 00:34:39,030 --> 00:34:41,610 Within-- from here to there, that's 619 00:34:41,610 --> 00:34:43,980 a seven day period where I only played once. 620 00:34:43,980 --> 00:34:46,260 And from here to here, that's a seven day period 621 00:34:46,260 --> 00:34:48,060 where I only played once. 622 00:34:48,060 --> 00:34:49,917 So it's going to be-- 623 00:34:49,917 --> 00:34:51,000 these are positive values. 624 00:34:51,000 --> 00:34:53,458 So it's going to be more optimal for me to choose something 625 00:34:53,458 --> 00:34:55,949 in here to play. 626 00:34:55,949 --> 00:34:57,000 So how far-- 627 00:34:57,000 --> 00:35:00,870 I mean, I could just use 15 and that would satisfy. 628 00:35:00,870 --> 00:35:03,840 Because I've already argued to you that it's never optimal to. 629 00:35:03,840 --> 00:35:05,070 I can check this. 630 00:35:05,070 --> 00:35:06,450 It's not going to be optimal. 631 00:35:06,450 --> 00:35:10,500 It's going to be more optimal to play some time over here. 632 00:35:10,500 --> 00:35:14,250 But how far do I have to check?
633 00:35:14,250 --> 00:35:17,435 Well, maybe I have to check up to 7. 634 00:35:17,435 --> 00:35:18,060 Does that work? 635 00:35:21,660 --> 00:35:23,040 Not quite. 636 00:35:23,040 --> 00:35:28,640 So let's say I played here, and I played here, 637 00:35:28,640 --> 00:35:34,720 and I played here, and I played here, 638 00:35:34,720 --> 00:35:41,600 I actually can't play here, here, here, here, here. 639 00:35:41,600 --> 00:35:42,850 I'm not allowed to play those. 640 00:35:42,850 --> 00:35:44,770 I guess these should be O's. 641 00:35:44,770 --> 00:35:48,010 I played there. 642 00:35:48,010 --> 00:35:51,760 And I'm not allowed to play here, 1, 2, 3, 4, 5. 643 00:35:54,460 --> 00:35:56,320 But I am allowed to play anywhere in here. 644 00:35:56,320 --> 00:35:58,510 So I basically want to shrink this 645 00:35:58,510 --> 00:36:02,920 until these X's collide with each other. 646 00:36:02,920 --> 00:36:06,490 Because then it's possible that an optimal solution 647 00:36:06,490 --> 00:36:08,710 would require me to pick these two 648 00:36:08,710 --> 00:36:12,260 and then require me to pick these two way over there. 649 00:36:12,260 --> 00:36:14,650 So this is 10 things in the middle. 650 00:36:14,650 --> 00:36:16,460 I only have to go up to, at most, 11. 651 00:36:19,380 --> 00:36:20,340 It's 11. 652 00:36:20,340 --> 00:36:23,370 Now, you can move any constant above 11 653 00:36:23,370 --> 00:36:24,870 and get the same running time bound. 654 00:36:24,870 --> 00:36:28,180 But that's my analysis. 655 00:36:28,180 --> 00:36:33,660 OK, so we have our recursive relation. 656 00:36:33,660 --> 00:36:35,230 And so what am I doing? 657 00:36:35,230 --> 00:36:38,430 I'm just looping over my choices of next day to play. 658 00:36:38,430 --> 00:36:40,710 I'm recursing on this thing where I actually 659 00:36:40,710 --> 00:36:43,770 do play on that day.
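Written out, the relation described on the board is roughly the following. This is a reconstruction from the spoken description, so the exact layout is an assumption, and it ignores what happens when i + k runs past day n. Here x(i, j) is the max winnings for days i to n, assuming a play on day i with the next allowable play on day i + j, and L(i) is the payout on day i.

```latex
x(i, j) \;=\; L(i) \;+\; \max\bigl\{\, x\bigl(i + k,\ \max(1,\ 7 - k)\bigr) \;\bigm|\; k \in \{j, j + 1, \dots, 11\} \,\bigr\},
\qquad 1 \le j \le 6 .
```

As a check on the second argument: k = 1 gives max(1, 6) = 6, the most restrictive case, and any k of 6 or more gives 1, the most permissive case.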
660 00:36:43,770 --> 00:36:45,600 But I'm remembering the information 661 00:36:45,600 --> 00:36:49,300 about what I'm allowed to play next by limiting, 662 00:36:49,300 --> 00:36:52,930 based on what my previous value was. 663 00:36:52,930 --> 00:36:54,463 So that's the kind of key thing. 664 00:36:54,463 --> 00:36:56,380 I'm remembering something further in advance-- 665 00:36:56,380 --> 00:37:00,130 or I'm remembering what happened in the past 666 00:37:00,130 --> 00:37:02,620 by describing it as a restriction of something 667 00:37:02,620 --> 00:37:04,990 in the future. 668 00:37:04,990 --> 00:37:07,810 So this was a pretty difficult problem. 669 00:37:07,810 --> 00:37:11,080 I think it was one of our first dynamic programming problems 670 00:37:11,080 --> 00:37:11,860 that term. 671 00:37:11,860 --> 00:37:14,218 It was probably a little ambitious. 672 00:37:14,218 --> 00:37:16,510 AUDIENCE: Are you saying this recurrence goes up to 11? 673 00:37:19,180 --> 00:37:21,960 JASON KU: Yes, the recurrence goes up to 11, not 10, 11. 674 00:37:24,790 --> 00:37:28,540 So we have our topological sort. 675 00:37:28,540 --> 00:37:31,120 What's a topological order for these subproblems? 676 00:37:31,120 --> 00:37:31,999 Anybody? 677 00:37:37,050 --> 00:37:39,690 k is always going to be a positive number. 678 00:37:39,690 --> 00:37:42,980 Because j goes from 1 to 6. 679 00:37:42,980 --> 00:37:45,020 So I'm always going to be increasing 680 00:37:45,020 --> 00:37:49,040 in this first quantity. 681 00:37:49,040 --> 00:37:58,470 So for x of i, j, i always-- 682 00:37:58,470 --> 00:38:11,970 sorry, depends on strictly larger i. 683 00:38:11,970 --> 00:38:14,910 So this i, when I call subproblems, 684 00:38:14,910 --> 00:38:17,400 it always calls subproblems with a larger i. 685 00:38:17,400 --> 00:38:19,470 Now, this is a little weird, because I 686 00:38:19,470 --> 00:38:24,410 wanted me to depend on smaller subproblems, smaller.
687 00:38:24,410 --> 00:38:28,130 Now, it is smaller, because I'm taking a smaller suffix. 688 00:38:28,130 --> 00:38:31,250 But it's corresponding to using a larger number. 689 00:38:31,250 --> 00:38:34,770 To me, that's a little confusing, but that's OK. 690 00:38:34,770 --> 00:38:37,650 Because we're kind of using things 691 00:38:37,650 --> 00:38:41,880 that are always monotonically going in some direction. 692 00:38:41,880 --> 00:38:46,070 So this is corresponding to a smaller subproblem 693 00:38:46,070 --> 00:38:47,960 in some measure. 694 00:38:47,960 --> 00:38:51,500 It's the number of elements that we're actually recursing on. 695 00:38:51,500 --> 00:38:54,710 In some sense, if we wrote this as a prefix, 696 00:38:54,710 --> 00:38:59,510 we would have us depending on strictly smaller i. 697 00:38:59,510 --> 00:39:03,020 And that would be more natural in terms of recursing 698 00:39:03,020 --> 00:39:04,550 on smaller subproblems. 699 00:39:04,550 --> 00:39:07,010 But I digress. 700 00:39:07,010 --> 00:39:12,290 All right, then we have our original subproblem. 701 00:39:12,290 --> 00:39:13,220 I'm going to-- 702 00:39:13,220 --> 00:39:15,840 I can't move this board. 703 00:39:15,840 --> 00:39:19,890 I'll just keep going, because we have lots of boards. 704 00:39:19,890 --> 00:39:26,380 OK, our original subproblem, now, what could I do? 705 00:39:26,380 --> 00:39:29,590 I have to start somewhere. 706 00:39:29,590 --> 00:39:33,220 Here, my subproblem's assuming that I'm starting at i. 707 00:39:33,220 --> 00:39:35,570 But I don't know where I start. 708 00:39:35,570 --> 00:39:39,890 I could start by taking the first element, but I might not. 709 00:39:39,890 --> 00:39:49,480 So I could just take the max over all i, 710 00:39:49,480 --> 00:39:51,820 over all i of x what? 711 00:39:56,410 --> 00:39:58,450 The first one, I'm not restricted 712 00:39:58,450 --> 00:39:59,920 on what I choose next.
713 00:39:59,920 --> 00:40:05,050 So what's the most permissive version of j? 714 00:40:05,050 --> 00:40:10,690 One, I'm allowed to take the next j. 715 00:40:10,690 --> 00:40:14,680 So if I just take the max over all of these subproblems, 716 00:40:14,680 --> 00:40:15,640 I'll get the solution. 717 00:40:15,640 --> 00:40:18,057 Now, actually, this is a little bit more work than I need. 718 00:40:18,057 --> 00:40:19,500 This is looping over all n. 719 00:40:22,440 --> 00:40:25,650 It's definitely correct, because I have to start somewhere. 720 00:40:25,650 --> 00:40:30,560 But will I ever start after the first, I don't know, 7? 721 00:40:30,560 --> 00:40:33,080 No, so I could just take this max 722 00:40:33,080 --> 00:40:34,972 over the first some constant number. 723 00:40:34,972 --> 00:40:35,930 And that would be fine. 724 00:40:35,930 --> 00:40:36,740 But that's OK. 725 00:40:36,740 --> 00:40:38,930 This is still smaller than the number of subproblems 726 00:40:38,930 --> 00:40:39,590 that we have. 727 00:40:39,590 --> 00:40:41,580 AUDIENCE: If I were being lazy during my exam 728 00:40:41,580 --> 00:40:44,720 and I looped over j, would that be correct? 729 00:40:44,720 --> 00:40:48,268 JASON KU: If I looped over j for? 730 00:40:48,268 --> 00:40:51,340 AUDIENCE: [INAUDIBLE] over every possible thing. 731 00:40:51,340 --> 00:40:54,820 JASON KU: Took the loop over this and j? 732 00:40:54,820 --> 00:40:56,740 Yeah, that would still be fine. 733 00:40:56,740 --> 00:40:57,250 Why not? 734 00:40:57,250 --> 00:41:02,560 It's just less-- it's more restrictive of subproblems. 735 00:41:02,560 --> 00:41:05,797 It will never be better to do that. 736 00:41:05,797 --> 00:41:08,380 But you could do that, because it wouldn't change your running 737 00:41:08,380 --> 00:41:09,552 time. 738 00:41:09,552 --> 00:41:11,260 AUDIENCE: Could it change my running time 739 00:41:11,260 --> 00:41:15,770 if my j accidentally looped too far? 
740 00:41:15,770 --> 00:41:18,680 JASON KU: Well, j is restricted to be 1 to 6. 741 00:41:18,680 --> 00:41:19,940 So I'm not-- 742 00:41:19,940 --> 00:41:20,900 I don't think so. 743 00:41:20,900 --> 00:41:24,440 But in a different problem, in a different context, it could. 744 00:41:24,440 --> 00:41:26,900 OK, so that's the original. 745 00:41:26,900 --> 00:41:31,490 And in time here, what do we got? 746 00:41:31,490 --> 00:41:34,610 We have a linear number of subproblems, 747 00:41:34,610 --> 00:41:36,350 number of subproblems. 748 00:41:36,350 --> 00:41:41,690 We've got-- I actually like, usually, saying exactly how 749 00:41:41,690 --> 00:41:42,950 many subproblems I have. 750 00:41:42,950 --> 00:41:44,510 Oh, we didn't do base case. 751 00:41:44,510 --> 00:41:52,860 I missed BOT, I missed my B. We'll do the original first. 752 00:41:52,860 --> 00:41:54,300 And then the base case. 753 00:41:54,300 --> 00:42:01,090 OK, base case, what do we have as a base case here? 754 00:42:01,090 --> 00:42:04,360 Well, when I don't have anything to do. 755 00:42:04,360 --> 00:42:14,960 If I'm-- and actually, if I have this situation for i equals, 756 00:42:14,960 --> 00:42:20,555 say, n, I got my last thing. 757 00:42:23,140 --> 00:42:25,240 I could potentially start looping over 758 00:42:25,240 --> 00:42:29,245 subproblems that are negative in terms of my index. 759 00:42:29,245 --> 00:42:30,620 I'm not going to want to do that. 760 00:42:30,620 --> 00:42:32,537 There's a couple of ways I can deal with that. 761 00:42:32,537 --> 00:42:37,820 I could set a value for all of my problems for negative i. 762 00:42:37,820 --> 00:42:39,470 That's one thing I could do. 763 00:42:39,470 --> 00:42:41,800 But then I have to kind of remember, 764 00:42:41,800 --> 00:42:45,860 or I have to figure out how far I go into the negative. 765 00:42:45,860 --> 00:42:47,390 That's one thing I could do. 766 00:42:47,390 --> 00:42:49,440 And I give a base case for each of those cases.
767 00:42:49,440 --> 00:42:50,610 I don't have anything. 768 00:42:50,610 --> 00:42:52,265 So I get a-- 769 00:42:52,265 --> 00:42:58,330 I don't know, zero value for playing in the future, 770 00:42:58,330 --> 00:42:59,650 because I have negative things. 771 00:42:59,650 --> 00:43:01,190 I can't do anything with that. 772 00:43:01,190 --> 00:43:03,190 Another way of handling that, which 773 00:43:03,190 --> 00:43:06,100 I think I did in my solutions, was 774 00:43:06,100 --> 00:43:11,910 restrict that k to only be-- 775 00:43:11,910 --> 00:43:19,310 I guess and restrict that i plus k is less than or equal to n. 776 00:43:19,310 --> 00:43:21,560 And then, I'll never go to negative problems. 777 00:43:21,560 --> 00:43:23,210 I'll never recurse on these things. 778 00:43:23,210 --> 00:43:26,330 But that means that when I call this 779 00:43:26,330 --> 00:43:33,980 on n, When I only have one lottery day left, 780 00:43:33,980 --> 00:43:38,300 this set will be empty. 781 00:43:38,300 --> 00:43:40,850 So what's the max over that thing? 782 00:43:40,850 --> 00:43:42,260 Max over an empty set? 783 00:43:42,260 --> 00:43:43,730 I don't know. 784 00:43:43,730 --> 00:43:47,120 I mean, I could add on 0 here, that's one way I could do it. 785 00:43:47,120 --> 00:43:54,430 Or I could just say when I'm at n, and that thing is empty, 786 00:43:54,430 --> 00:44:02,090 or whenever it's empty, we can say the base case 787 00:44:02,090 --> 00:44:08,570 x i, j, I guess we could put this at n equals 0, or sorry, 788 00:44:08,570 --> 00:44:16,110 equals L of i, L of n, thank you. 789 00:44:16,110 --> 00:44:19,155 Because at the last guy, I have to use Ln. 790 00:44:22,720 --> 00:44:24,070 So there it is. 
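The pieces discussed so far (subproblem, relation, topological order, original problem, base case) can be put together in a bottom-up Python sketch. This is an illustration under assumptions, not the official solution: the function name and table layout are hypothetical, and the base case is handled by starting each max at 0, which is the "add on 0" option mentioned above.

```python
def max_lotto_winnings(payouts):
    """Max winnings when playing at most twice in any 7 consecutive days.

    payouts[0] is the positive payout for day 1 (input format assumed here).
    x[i][j] = max winnings on days i..n, assuming a play on day i and that
    the next allowable play is day i + j, for 1 <= j <= 6.
    """
    n = len(payouts)
    if n == 0:
        return 0
    L = [0] + list(payouts)              # shift to 1-indexed days
    x = [[0] * 7 for _ in range(n + 2)]  # columns 1..6 are used
    for i in range(n, 0, -1):            # larger i first: topological order
        for j in range(1, 7):
            best = 0                     # 0 covers "no later play" (base case)
            for k in range(j, 12):       # gaps larger than 11 are never needed
                if i + k > n:            # restrict to i + k <= n
                    break
                best = max(best, x[i + k][max(1, 7 - k)])
            x[i][j] = L[i] + best
    # Original problem: the first play is unrestricted, so use j = 1.
    return max(x[i][1] for i in range(1, n + 1))


if __name__ == "__main__":
    print(max_lotto_winnings([1, 2, 3]))  # can play only two of these days
```

There are 6n table entries and at most 11 choices of k for each, so the loops do O(n) work in total.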
791 00:44:24,070 --> 00:44:29,090 Now, in actuality, if you write this correctly, 792 00:44:29,090 --> 00:44:34,280 I put the Li outside, and I union this with the 0, 793 00:44:34,280 --> 00:44:36,680 I can actually get away with just having 794 00:44:36,680 --> 00:44:38,630 the relation and no base case. 795 00:44:38,630 --> 00:44:42,350 Because my relation actually reduces to a base case, 796 00:44:42,350 --> 00:44:45,560 because of the way that I wrote my relation. 797 00:44:45,560 --> 00:44:48,260 But in general, you'll want to write some kind of base case 798 00:44:48,260 --> 00:44:52,640 here to either acknowledge that your relation handles it 799 00:44:52,640 --> 00:44:56,240 or be specific about what happens 800 00:44:56,240 --> 00:45:00,480 when I can't do any more work. 801 00:45:00,480 --> 00:45:08,000 And the last thing, time, we've got n subproblems exactly 802 00:45:08,000 --> 00:45:10,520 times constant work per subproblem. 803 00:45:10,520 --> 00:45:13,910 Because I'm looping over 11 possible values, 804 00:45:13,910 --> 00:45:17,940 actually, it's up to 11, because j could be 6. 805 00:45:21,910 --> 00:45:25,570 So this is order n work total. 806 00:45:25,570 --> 00:45:29,380 So this is a pretty daunting first problem. 807 00:45:29,380 --> 00:45:34,840 But in terms of what Erik, Professor Demaine, 808 00:45:34,840 --> 00:45:40,060 was talking about last lecture, in terms of categorization 809 00:45:40,060 --> 00:45:44,540 of subproblem, or categorization of dynamic programs, what do 810 00:45:44,540 --> 00:45:45,040 we got? 811 00:45:45,040 --> 00:45:47,320 We've got a suffix subproblem where 812 00:45:47,320 --> 00:45:50,650 we expanded by some local information, 813 00:45:50,650 --> 00:45:56,410 remembering when the next time I can play. 814 00:45:56,410 --> 00:45:57,970 So that's kind of a categorization 815 00:45:57,970 --> 00:45:59,110 of these subproblems. 
816 00:45:59,110 --> 00:46:03,640 The recurrence relation has constant branching, 817 00:46:03,640 --> 00:46:07,525 but more than two branching. 818 00:46:07,525 --> 00:46:13,340 And I'm combining a bunch of subproblems 819 00:46:13,340 --> 00:46:17,780 in my original evaluation. 820 00:46:17,780 --> 00:46:23,030 And if I wanted to figure out what days Tiff should play 821 00:46:23,030 --> 00:46:26,940 on the lottery, you can store parent pointers 822 00:46:26,940 --> 00:46:30,660 when I'm evaluating this max. 823 00:46:30,660 --> 00:46:34,920 I figure out which subproblem x I recurse-- 824 00:46:34,920 --> 00:46:36,570 that gave me the max. 825 00:46:36,570 --> 00:46:38,760 And I can walk back to see which choices 826 00:46:38,760 --> 00:46:41,785 I made to figure out which days I played the lottery. 827 00:46:41,785 --> 00:46:42,660 Does that make sense? 828 00:46:47,000 --> 00:46:50,560 So any questions on problem 1? 829 00:46:50,560 --> 00:46:52,000 That's the most-- 830 00:46:52,000 --> 00:46:55,630 I wanted to have the most complicated one first. 831 00:46:55,630 --> 00:46:58,450 So that we could have an easier way to go. 832 00:46:58,450 --> 00:47:01,720 In a sense, this is the most complicated version 833 00:47:01,720 --> 00:47:05,740 of this kind of actually pretty simple dynamic programming 834 00:47:05,740 --> 00:47:06,460 setup. 835 00:47:06,460 --> 00:47:09,010 Why do I say simple dynamic programming setup? 836 00:47:09,010 --> 00:47:10,780 It's just suffixes. 837 00:47:10,780 --> 00:47:14,380 And I'm just doing a constant amount of work local to me. 838 00:47:14,380 --> 00:47:18,780 It's just a very complicated local setup. 839 00:47:18,780 --> 00:47:22,030 But that's what I mean by simple. 
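The parent-pointer idea can be sketched as follows. This is a hypothetical illustration: the function name, the `parent` table, and the return format are assumptions made for the example. It recomputes the same table of subproblems, records which (i + k, j') choice achieved each max, and then walks those pointers from the best starting day to list the days played.

```python
def lotto_schedule(payouts):
    """Return (max winnings, list of 1-indexed days to play).

    payouts[0] is the positive payout for day 1 (input format assumed here).
    """
    n = len(payouts)
    if n == 0:
        return 0, []
    L = [0] + list(payouts)
    x = [[0] * 7 for _ in range(n + 2)]
    parent = [[None] * 7 for _ in range(n + 2)]  # argmax choice per subproblem
    for i in range(n, 0, -1):
        for j in range(1, 7):
            best, choice = 0, None
            for k in range(j, 12):
                if i + k > n:
                    break
                nxt = (i + k, max(1, 7 - k))
                if x[nxt[0]][nxt[1]] > best:
                    best, choice = x[nxt[0]][nxt[1]], nxt
            x[i][j] = L[i] + best
            parent[i][j] = choice
    # The first play is unrestricted, so start from the best x(i, 1).
    start = max(range(1, n + 1), key=lambda i: x[i][1])
    total = x[start][1]
    days, state = [], (start, 1)
    while state is not None:  # follow parent pointers forward in time
        days.append(state[0])
        state = parent[state[0]][state[1]]
    return total, days


if __name__ == "__main__":
    print(lotto_schedule([1, 2, 3]))
```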
840 00:47:22,030 --> 00:47:23,680 When we're designing subproblems, 841 00:47:23,680 --> 00:47:25,000 this is one that we could-- 842 00:47:25,000 --> 00:47:27,100 I mean, when we're designing problems 843 00:47:27,100 --> 00:47:29,740 for this dynamic programming setup, 844 00:47:29,740 --> 00:47:34,650 it's one of the hardest from a-- 845 00:47:34,650 --> 00:47:37,110 it's one of the easiest from a conceptual standpoint, 846 00:47:37,110 --> 00:47:39,590 but one of the hardest to actually implement. 847 00:47:39,590 --> 00:47:45,890 OK, so problem 2, this one's a long one. 848 00:47:45,890 --> 00:47:48,530 A wealthy family, Alice, Bob, and their young son Charlie 849 00:47:48,530 --> 00:47:50,120 are sailing around the world. 850 00:47:50,120 --> 00:47:51,830 When they encounter a massive storm, 851 00:47:51,830 --> 00:47:56,000 Charlie is thrown overboard, presumed drowned. 852 00:47:56,000 --> 00:47:59,420 This is very colorful language for these problem set writers. 853 00:47:59,420 --> 00:48:01,730 20 years later, a man comes to Alice and Bob, 854 00:48:01,730 --> 00:48:04,910 claiming to be Charlie, having maybe been marooned 855 00:48:04,910 --> 00:48:07,550 on an island for that long. 856 00:48:07,550 --> 00:48:09,830 Alice and Bob are excited but skeptical. 857 00:48:09,830 --> 00:48:12,770 And they order a matching test from the genetic testing 858 00:48:12,770 --> 00:48:16,550 company 46 and Thee. 859 00:48:16,550 --> 00:48:18,740 Given Alice and Bob-- 860 00:48:18,740 --> 00:48:23,660 sorry, given three length n DNA sequences, basically 861 00:48:23,660 --> 00:48:27,410 strings of CGTA, or something like that, 862 00:48:27,410 --> 00:48:30,050 from each of Alice, Bob, and Charlie, 863 00:48:30,050 --> 00:48:33,830 the testing center will determine three-- 864 00:48:33,830 --> 00:48:36,510 their ancestry as follows.
865 00:48:36,510 --> 00:48:40,490 If Charlie's DNA can be partitioned into two, 866 00:48:40,490 --> 00:48:44,120 not necessarily contiguous, subsequences of equal length, 867 00:48:44,120 --> 00:48:46,160 so basically I can take-- 868 00:48:46,160 --> 00:48:52,280 if I have n is length 5 or length 6, it'd better be even. 869 00:48:52,280 --> 00:48:55,040 I need to find three characters in order. 870 00:48:55,040 --> 00:48:59,140 And then the other three characters 871 00:48:59,140 --> 00:49:03,560 must match to make some substrings-- 872 00:49:03,560 --> 00:49:06,890 some subsequences in Alice and Bob's DNA. 873 00:49:06,890 --> 00:49:09,720 So that's a little hard to parse. 874 00:49:09,720 --> 00:49:11,660 So let's look at an example here. 875 00:49:11,660 --> 00:49:12,470 For example, 876 00:49:12,470 --> 00:49:15,170 Alice's DNA is AATT. 877 00:49:15,170 --> 00:49:18,290 Bob's DNA is CCGG. 878 00:49:18,290 --> 00:49:22,460 If Charlie's were CATG, they'd be 879 00:49:22,460 --> 00:49:28,480 matched, because CG is a subsequence of Charlie's DNA, 880 00:49:28,480 --> 00:49:34,640 and is a subsequence of Bob's DNA. 881 00:49:34,640 --> 00:49:37,710 And AT is a subsequence of Charlie's DNA. 882 00:49:37,710 --> 00:49:41,901 And it is also a subsequence of Alice's DNA. 883 00:49:41,901 --> 00:49:45,620 And so we've partitioned them into two equal length 884 00:49:45,620 --> 00:49:46,360 subsequences. 885 00:49:46,360 --> 00:49:49,660 These are not necessarily consecutive subsequences, 886 00:49:49,660 --> 00:49:58,020 but just any subsequences, such that they 887 00:49:58,020 --> 00:50:00,540 appear in Alice and Bob. 888 00:50:00,540 --> 00:50:05,250 But Charlie would be found to be an impostor if his sequence 889 00:50:05,250 --> 00:50:10,860 were AGTC, essentially, it's easy to realize that, 890 00:50:10,860 --> 00:50:17,960 because G and C are swapped in terms of their ordering.
891 00:50:17,960 --> 00:50:24,030 And GC, the letters GC only appear in Bob's DNA 892 00:50:24,030 --> 00:50:26,680 and don't appear in that order. 893 00:50:26,680 --> 00:50:29,320 So it's easy to see that he's an impostor with these strings. 894 00:50:29,320 --> 00:50:31,028 But you can imagine, with longer strings, 895 00:50:31,028 --> 00:50:33,660 this could be difficult to solve. 896 00:50:33,660 --> 00:50:36,150 So we want an order n to the 4th time algorithm 897 00:50:36,150 --> 00:50:38,490 to determine whether Charlie is a fraud. 898 00:50:38,490 --> 00:50:43,920 OK, so I actually shortened this last night. 899 00:50:43,920 --> 00:50:46,480 This was twice as long on the problem set. 900 00:50:46,480 --> 00:50:52,016 So yeah, anyway, so how do we approach this problem? 901 00:50:52,016 --> 00:50:53,382 AUDIENCE: [INAUDIBLE]. 902 00:50:53,382 --> 00:50:55,340 JASON KU: No, they don't have to be contiguous. 903 00:50:55,340 --> 00:50:57,590 Like in the example, it would be matched 904 00:50:57,590 --> 00:51:04,570 if C and G is a subsequence, not contiguous, of CATG. 905 00:51:04,570 --> 00:51:06,760 Yeah, so that's an important part of this problem. 906 00:51:06,760 --> 00:51:09,760 I'm just-- I'm not trying to figure out if there's-- 907 00:51:09,760 --> 00:51:12,850 basically, there are only two contiguous subsequences 908 00:51:12,850 --> 00:51:15,460 of length n over 2 that this thing can be partitioned into. 909 00:51:15,460 --> 00:51:16,780 I just look in the middle. 910 00:51:16,780 --> 00:51:22,170 No, we're looking for subsequences, not substrings. 911 00:51:22,170 --> 00:51:25,532 So they kind of interleave like this in some way. 912 00:51:25,532 --> 00:51:27,490 And there's actually a number of different ways 913 00:51:27,490 --> 00:51:28,530 I can partition that. 914 00:51:28,530 --> 00:51:31,000 There's actually an exponential number of ways. 915 00:51:31,000 --> 00:51:33,110 So that's a problem, potentially. 916 00:51:33,110 --> 00:51:33,610 Yes.
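To make the matching rule concrete, here is a minimal brute-force sketch in Python that tries every way of splitting Charlie's string into two equal-length subsequences, which is exactly the exponential enumeration mentioned above. The names `is_subsequence` and `matches_brute_force` are made up for illustration and are not from the problem set; this is only reasonable for tiny strings like the ones in the example.

```python
from itertools import combinations


def is_subsequence(small: str, big: str) -> bool:
    """True if `small` appears in `big` in order, not necessarily contiguously."""
    remaining = iter(big)
    # `ch in remaining` advances the iterator, so order is respected.
    return all(ch in remaining for ch in small)


def matches_brute_force(A: str, B: str, C: str) -> bool:
    """Try every split of C into two equal-length subsequences (exponential)."""
    n = len(C)
    if n % 2:
        return False  # an odd-length string cannot be split evenly
    for chosen in combinations(range(n), n // 2):
        chosen_set = set(chosen)
        from_a = "".join(C[i] for i in chosen)  # half assigned to Alice
        from_b = "".join(C[i] for i in range(n) if i not in chosen_set)  # rest to Bob
        if is_subsequence(from_a, A) and is_subsequence(from_b, B):
            return True
    return False
```

With the strings from the example, `matches_brute_force("AATT", "CCGG", "CATG")` succeeds and `matches_brute_force("AATT", "CCGG", "AGTC")` fails.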
917 00:51:33,610 --> 00:51:35,193 AUDIENCE: Is there a biological basis? 918 00:51:35,193 --> 00:51:37,590 JASON KU: No, there's no biological basis to this thing 919 00:51:37,590 --> 00:51:38,700 that I know of. 920 00:51:38,700 --> 00:51:44,020 OK, all right, so how do we solve this problem? 921 00:51:44,020 --> 00:51:45,630 What problem does this look like? 922 00:51:48,790 --> 00:51:51,460 I mean, it seems like string matching. 923 00:51:51,460 --> 00:51:53,770 So I might want to think it's something like longest 924 00:51:53,770 --> 00:51:55,750 common subsequence. 925 00:51:55,750 --> 00:52:00,310 But here, I have three sequences instead of two sequences. 926 00:52:00,310 --> 00:52:04,180 And we've got this other weird condition where we kind of need 927 00:52:04,180 --> 00:52:07,810 an exact partition of Charlie. 928 00:52:07,810 --> 00:52:10,180 I need to use all of the letters in Charlie, 929 00:52:10,180 --> 00:52:14,570 but I don't have to use all of the letters in Alice and Bob. 930 00:52:14,570 --> 00:52:25,325 So let's get some notation here A, B, and C are n length 931 00:52:25,325 --> 00:52:25,825 strings. 932 00:52:32,040 --> 00:52:34,620 So what could I do? 933 00:52:34,620 --> 00:52:37,650 Let's define some subproblems. 934 00:52:37,650 --> 00:52:45,050 If I were to go via longest common subsequence, 935 00:52:45,050 --> 00:52:49,820 I might keep track of an index of a suffix or prefix 936 00:52:49,820 --> 00:52:52,460 of each one of these strings. 937 00:52:52,460 --> 00:52:54,200 That kind of makes sense. 938 00:52:54,200 --> 00:53:01,060 Something like i, j, k, where we're 939 00:53:01,060 --> 00:53:06,000 talking about the suffixes-- 940 00:53:06,000 --> 00:53:18,150 sorry, that's prefixes, i, B, j, and C, k. 941 00:53:18,150 --> 00:53:20,090 That seems reasonable, at least. 942 00:53:20,090 --> 00:53:25,940 It's what we would do for longest common subsequence. 943 00:53:25,940 --> 00:53:28,940 What's the problem here? 
944 00:53:28,940 --> 00:53:32,890 I mean, I could match this guy with one of these guys, 945 00:53:32,890 --> 00:53:35,530 or decide to skip it, and match one of these guys, 946 00:53:35,530 --> 00:53:37,300 and decide to skip it. 947 00:53:37,300 --> 00:53:42,690 But if I do that, I might get a subsequence. 948 00:53:46,540 --> 00:53:48,580 But actually, I always need to match 949 00:53:48,580 --> 00:53:55,490 all of C. Does that make sense? 950 00:53:55,490 --> 00:54:01,610 I always need to match all of C. 951 00:54:01,610 --> 00:54:02,235 So in a sense-- 952 00:54:11,400 --> 00:54:13,020 Let's see, how can I do this? 953 00:54:15,940 --> 00:54:18,130 I need to match all of C. But I also 954 00:54:18,130 --> 00:54:22,120 need to make sure I'm using exactly n over 2 characters 955 00:54:22,120 --> 00:54:30,370 from C in B. And exactly n over 2 characters from C in A. 956 00:54:30,370 --> 00:54:31,990 Does that make sense? 957 00:54:31,990 --> 00:54:35,630 So how can I satisfy that condition? 958 00:54:35,630 --> 00:54:39,560 Now I understand why I used prefixes before. 959 00:54:39,560 --> 00:54:45,430 And I swapped it to suffixes here. 960 00:54:45,430 --> 00:54:46,420 But we'll make it work. 961 00:54:52,080 --> 00:54:56,210 How can I remember how many characters I 962 00:54:56,210 --> 00:55:00,260 assigned from Alice versus Bob? 963 00:55:00,260 --> 00:55:05,510 As I'm matching characters in Alice and Bob, 964 00:55:05,510 --> 00:55:08,660 I need to kind of remember where they point to, 965 00:55:08,660 --> 00:55:12,980 or how many I've already used in Charlie. 966 00:55:12,980 --> 00:55:18,470 So that I can divvy up the remainder in here. 967 00:55:18,470 --> 00:55:23,590 Oh, actually, this works in a different sense. 968 00:55:23,590 --> 00:55:26,470 There's 18 different ways we could do this.
969 00:55:26,470 --> 00:55:33,070 So, OK, so I need to remember how many I've already used up, 970 00:55:33,070 --> 00:55:34,720 so that I can be sure to allocate 971 00:55:34,720 --> 00:55:36,700 exactly that many characters in the future 972 00:55:36,700 --> 00:55:39,920 to either Alice or Bob. 973 00:55:39,920 --> 00:55:41,240 So how can I remember that? 974 00:55:44,120 --> 00:55:45,350 I can just remember. 975 00:55:45,350 --> 00:55:47,135 How many do I-- 976 00:55:56,430 --> 00:56:00,920 I'll do it the way that I did it before, 977 00:56:00,920 --> 00:56:02,620 which is I can remember-- 978 00:56:02,620 --> 00:56:04,670 I can remember two different things here. 979 00:56:04,670 --> 00:56:10,930 I can remember how many things I have left to match Alice, in C, 980 00:56:10,930 --> 00:56:13,600 or I can remember how many things I've already 981 00:56:13,600 --> 00:56:17,770 matched in C to Alice. 982 00:56:17,770 --> 00:56:22,480 If I talk about how many things I've already matched, 983 00:56:22,480 --> 00:56:26,530 then I can index this thing by the sum of those things. 984 00:56:26,530 --> 00:56:30,430 If I talk about how many things I have yet to match, 985 00:56:30,430 --> 00:56:35,080 I have to do n minus the things. 986 00:56:35,080 --> 00:56:38,120 So those are the different parameters that we can do. 987 00:56:38,120 --> 00:56:39,492 We'll do what's in my notes. 988 00:56:39,492 --> 00:56:40,450 And I'll try to fix it. 989 00:56:46,080 --> 00:56:50,510 So what we're going to do is remember-- or figure out 990 00:56:50,510 --> 00:56:53,330 how many things I'm still needing 991 00:56:53,330 --> 00:56:56,600 to match in C to Alice and Bob. 992 00:56:56,600 --> 00:56:59,790 So I'm going to call this k-- 993 00:56:59,790 --> 00:57:02,490 sorry, ki. 994 00:57:02,490 --> 00:57:07,980 i is associated with A. And B is associated with j. 995 00:57:07,980 --> 00:57:16,250 And kj, this is going to be the-- 996 00:57:20,280 --> 00:57:22,680 I have to write this down. 
997 00:57:22,680 --> 00:57:25,780 So this is going to be what kind of output 998 00:57:25,780 --> 00:57:27,330 do I want from my subproblem? 999 00:57:27,330 --> 00:57:29,940 I just want to know if these things are-- if he's a fraud 1000 00:57:29,940 --> 00:57:30,640 or not. 1001 00:57:30,640 --> 00:57:32,490 So this is going to be a Boolean. 1002 00:57:32,490 --> 00:58:00,800 So true if can match ki length subsequence of suffix A, 1003 00:58:00,800 --> 00:58:14,160 suffix is this guy, and length kj, 1004 00:58:14,160 --> 00:58:32,170 I guess, kj length subsequence of suffix B, ji, or Bj suffix, 1005 00:58:32,170 --> 00:58:38,530 to all characters in. 1006 00:58:38,530 --> 00:58:39,970 And now what is this in? 1007 00:58:39,970 --> 00:58:42,790 This is the hard part. 1008 00:58:42,790 --> 00:58:53,220 Do I need a separate index for C to know where I am in C? 1009 00:58:53,220 --> 00:58:56,800 In a sense, yes, I need to know where I am 1010 00:58:56,800 --> 00:58:59,800 in C, how much I have to match. 1011 00:58:59,800 --> 00:59:06,800 But if I need to match ki and kj to all of them, 1012 00:59:06,800 --> 00:59:12,820 then there better be ki plus kj things left in C. 1013 00:59:12,820 --> 00:59:15,370 So in a sense, I don't need to remember that information 1014 00:59:15,370 --> 00:59:16,300 again. 1015 00:59:16,300 --> 00:59:18,460 It's not independent of my other parameters. 1016 00:59:18,460 --> 00:59:19,720 I can compute it. 1017 00:59:19,720 --> 00:59:22,330 I could throw it in, but I can determine it 1018 00:59:22,330 --> 00:59:23,830 from the other parameters. 1019 00:59:23,830 --> 00:59:28,060 And so I want to match it with the suffix 1020 00:59:28,060 --> 00:59:33,700 of C of length ki plus kj. 1021 00:59:33,700 --> 00:59:36,340 So I think this is the only part that 1022 00:59:36,340 --> 00:59:40,000 is going to be annoying to me.
1023 00:59:40,000 --> 00:59:53,590 So this should be suffix of all of the things 1024 00:59:53,590 --> 00:59:57,670 minus ki minus kj minus 1. 1025 01:00:00,890 --> 01:00:02,740 It's just this. 1026 01:00:02,740 --> 01:00:04,360 And why is that? 1027 01:00:04,360 --> 01:00:10,400 If I have matched everything, ki and kj are both 0. 1028 01:00:10,400 --> 01:00:16,951 And I should have nothing in C, which should be n colon. 1029 01:00:16,951 --> 01:00:19,570 We're 0-indexed, yes we are. 1030 01:00:19,570 --> 01:00:24,750 Whenever I use Python notation, I'd better be 0-indexed. 1031 01:00:24,750 --> 01:00:26,380 Does this make sense as a subproblem? 1032 01:00:26,380 --> 01:00:28,530 I mean, it's confusing. 1033 01:00:28,530 --> 01:00:29,783 But hopefully, it makes sense. 1034 01:00:29,783 --> 01:00:31,200 What I'm going to try to do is I'm 1035 01:00:31,200 --> 01:00:33,120 going to match some number of characters 1036 01:00:33,120 --> 01:00:38,217 in this suffix, which is hopefully longer than ki. 1037 01:00:38,217 --> 01:00:40,800 Otherwise, it's-- I'm going to be in a base case where this is 1038 01:00:40,800 --> 01:00:44,090 impossible. 1039 01:00:44,090 --> 01:00:49,830 And some subsequence of this matched completely into this. 1040 01:00:49,830 --> 01:00:54,970 So that's-- those are my subproblems. 1041 01:00:54,970 --> 01:00:58,120 I'm going to try to relate them now. 1042 01:00:58,120 --> 01:01:08,590 We have x, i, j, ki, kj, what is this going to equal? 1043 01:01:08,590 --> 01:01:09,850 Well, we've got Booleans. 1044 01:01:09,850 --> 01:01:18,070 So this is and false otherwise. 1045 01:01:18,070 --> 01:01:20,130 That's a Boolean. 1046 01:01:20,130 --> 01:01:26,170 So I just need some subproblem I recurse on to be true. 1047 01:01:26,170 --> 01:01:29,130 So what's the combinator for some 1048 01:01:29,130 --> 01:01:32,970 of a bunch of choices, Boolean choices, any one of which 1049 01:01:32,970 --> 01:01:33,740 may be true?
1050 01:01:36,733 --> 01:01:38,150 I want to combine a bunch of them. 1051 01:01:38,150 --> 01:01:40,420 I just want to see if any of them are true. 1052 01:01:40,420 --> 01:01:44,110 I'm going to OR over them. 1053 01:01:44,110 --> 01:01:46,855 I'm going to OR over four choices. 1054 01:01:50,030 --> 01:01:52,100 What are my choices? 1055 01:01:52,100 --> 01:01:57,900 Either the first thing in A matches with C, 1056 01:01:57,900 --> 01:02:02,170 the first thing in B matches with C, 1057 01:02:02,170 --> 01:02:05,530 or I don't match with either. 1058 01:02:05,530 --> 01:02:07,810 So those are my four choices. 1059 01:02:07,810 --> 01:02:12,790 So if I match with A, i plus 1, I 1060 01:02:12,790 --> 01:02:20,830 recurse on a smaller suffix of A and a-- 1061 01:02:20,830 --> 01:02:25,446 by adding-- oh, this all just works, great. 1062 01:02:25,446 --> 01:02:43,600 Kj, this is i, if Ai equals Ci and ki is greater than 0. 1063 01:02:43,600 --> 01:02:49,540 So if ki is greater than 0, I need to match an i. 1064 01:02:49,540 --> 01:02:53,710 So this conditional doesn't even make sense 1065 01:02:53,710 --> 01:02:57,730 unless I've evaluated this ki to be bigger than 0. 1066 01:02:57,730 --> 01:03:01,540 Otherwise, I'm trying to access i of n. 1067 01:03:01,540 --> 01:03:03,910 So I'm just putting this conditional on there. 1068 01:03:03,910 --> 01:03:15,050 Same with matching B, i, j plus 1, ki, kj, plus 1 if-- 1069 01:03:15,050 --> 01:03:17,540 sorry, this should be minus 1. 1070 01:03:17,540 --> 01:03:21,240 I have fewer characters that I have to recurse on. 1071 01:03:21,240 --> 01:03:22,790 So that's a typo in my notes. 1072 01:03:26,420 --> 01:03:32,916 Bj equals C. Oh, this is not Ci. 1073 01:03:32,916 --> 01:03:34,040 What is this? 1074 01:03:34,040 --> 01:03:35,840 It's whatever that thing is. 1075 01:03:35,840 --> 01:03:41,280 So I'm going to just say question mark 1076 01:03:41,280 --> 01:03:45,780 and kj is greater than 0.
1077 01:03:45,780 --> 01:03:47,400 So I'll fill that in in the notes. 1078 01:03:47,400 --> 01:03:49,500 It's going to be some complicated expression that 1079 01:03:49,500 --> 01:03:52,190 looks like that. 1080 01:03:52,190 --> 01:03:54,170 It's exactly that expression. 1081 01:03:54,170 --> 01:03:55,710 Yes, that it is. 1082 01:03:55,710 --> 01:03:57,020 So it's that thing. 1083 01:04:04,910 --> 01:04:10,850 OK, then we have two more choices. 1084 01:04:10,850 --> 01:04:19,950 Either I-- if I didn't match Ai, I may match Bj in the future. 1085 01:04:19,950 --> 01:04:22,100 So I only want to reduce Ai. 1086 01:04:22,100 --> 01:04:26,270 So xi plus 1, I leave everything else the same. 1087 01:04:31,350 --> 01:04:38,170 Assuming if i less than n, I don't 1088 01:04:38,170 --> 01:04:45,410 want to move off the end of this thing, or x, i, j plus 1, 1089 01:04:45,410 --> 01:04:52,840 ki, kj if j is less than n. 1090 01:04:52,840 --> 01:04:54,940 So those are my four choices. 1091 01:04:54,940 --> 01:04:56,870 If I match the letter n, great. 1092 01:04:56,870 --> 01:04:59,860 Otherwise, I decrease the size of my subproblem and I recurse. 1093 01:05:04,510 --> 01:05:09,350 So fun recursion, topological sort, 1094 01:05:09,350 --> 01:05:17,570 these subproblems only depend on what? 1095 01:05:17,570 --> 01:05:19,220 Larger i? 1096 01:05:19,220 --> 01:05:20,820 Not quite. 1097 01:05:20,820 --> 01:05:22,290 Larger j? 1098 01:05:22,290 --> 01:05:24,820 Not quite. 1099 01:05:24,820 --> 01:05:30,010 Changing k or-- these don't even change here. 1100 01:05:30,010 --> 01:05:37,240 So we're going to use depend on larger, I guess, strictly-- 1101 01:05:37,240 --> 01:05:40,990 that's kind of an important thing, i plus j. 1102 01:05:40,990 --> 01:05:43,870 Because at least one of these two things is increasing. 1103 01:05:43,870 --> 01:05:46,780 And then the nice thing about that is it 1104 01:05:46,780 --> 01:05:49,960 kind of tells us when we should stop. 
1105 01:05:49,960 --> 01:05:54,700 We should stop when either i or j get to n. 1106 01:05:54,700 --> 01:05:57,580 We should know enough, at that point, 1107 01:05:57,580 --> 01:06:01,525 to be able to determine if we succeeded or not, possibly. 1108 01:06:05,140 --> 01:06:06,360 So we have our base case. 1109 01:06:10,260 --> 01:06:12,180 What's the easy base case? 1110 01:06:12,180 --> 01:06:14,610 When we succeeded. 1111 01:06:14,610 --> 01:06:15,997 When have we succeeded? 1112 01:06:18,740 --> 01:06:22,490 If we have nothing left in A and B 1113 01:06:22,490 --> 01:06:26,740 and we have nothing left in C. I have nothing left to match. 1114 01:06:26,740 --> 01:06:29,996 So I have n, n. 1115 01:06:29,996 --> 01:06:32,472 And I don't need to match anything else. 1116 01:06:32,472 --> 01:06:33,680 That's just going to be true. 1117 01:06:36,900 --> 01:06:40,050 All roads point to this subproblem 1118 01:06:40,050 --> 01:06:41,785 to get to a true solution. 1119 01:06:44,720 --> 01:06:46,970 Otherwise, we have some false base cases. 1120 01:06:46,970 --> 01:06:49,430 If you set something up like this 1121 01:06:49,430 --> 01:06:52,340 and you only give us a base case that's true, 1122 01:06:52,340 --> 01:06:54,410 and you're ORing over the things, 1123 01:06:54,410 --> 01:06:56,610 your answer will always be true. 1124 01:06:56,610 --> 01:06:59,480 So you're not having any discriminatory power at all. 1125 01:06:59,480 --> 01:07:02,000 If you give us a true base case, you better 1126 01:07:02,000 --> 01:07:05,940 be giving us some false base cases, or one, at least. 1127 01:07:05,940 --> 01:07:11,510 So in the case where the first one is n 1128 01:07:11,510 --> 01:07:20,190 and we have some ki, kj, this is going to be false if what? 1129 01:07:23,350 --> 01:07:32,270 If we have nothing left in Ai or A, but this guy is positive, 1130 01:07:32,270 --> 01:07:35,420 we got problems. 1131 01:07:35,420 --> 01:07:37,820 Otherwise, this thing is 0.
1132 01:07:37,820 --> 01:07:41,510 And we'll just try to match everything up here. 1133 01:07:41,510 --> 01:07:44,480 And eventually, we'll get down to this base case or something 1134 01:07:44,480 --> 01:07:51,035 where this thing goes to 0 and we've got a problem. 1135 01:07:51,035 --> 01:07:54,080 And the same goes for the other side as well. 1136 01:07:54,080 --> 01:08:01,320 If we run out of things in B, when the number of things 1137 01:08:01,320 --> 01:08:04,140 we need to match in B is greater than 0. 1138 01:08:04,140 --> 01:08:07,250 So those are our base cases. 1139 01:08:07,250 --> 01:08:08,930 The original problem is what? 1140 01:08:13,920 --> 01:08:16,380 Yeah, it's just going to be one of our subproblems, 1141 01:08:16,380 --> 01:08:20,729 n, n, and then n over 2 and n over 2, 1142 01:08:20,729 --> 01:08:24,210 trying to match half of the things in C 1143 01:08:24,210 --> 01:08:26,490 with half of the things. 1144 01:08:26,490 --> 01:08:28,979 AUDIENCE: The first two arguments should be 0. 1145 01:08:28,979 --> 01:08:35,130 JASON KU: 0, thank you, because we're 0-indexed, yes. 1146 01:08:35,130 --> 01:08:39,300 Again, that's-- switching from prefix to suffix in the middle 1147 01:08:39,300 --> 01:08:40,950 is fun. 1148 01:08:40,950 --> 01:08:44,130 And it better be the case that n is 2, 1149 01:08:44,130 --> 01:08:47,220 or else it's obviously false-- or is even, 1150 01:08:47,220 --> 01:08:50,859 or else this is obviously false. 1151 01:08:50,859 --> 01:08:54,550 And then, the last thing, which I'm not going to write down, 1152 01:08:54,550 --> 01:08:58,990 is we have constant work here. 1153 01:08:58,990 --> 01:09:02,830 Because I'm just checking the value of four 1154 01:09:02,830 --> 01:09:06,850 subproblems and a conditional for each. 1155 01:09:06,850 --> 01:09:09,550 And I have how many subproblems? 1156 01:09:09,550 --> 01:09:15,160 i loops over n, j loops over n, ki and kj loop over n over 2.
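The subproblems, relation, and base cases for the Charlie problem can be put together as a short memoized Python sketch. This is one reading of the board work, not the official solution: the function name `is_match` is made up, and the index `n - ki - kj` for the next unmatched character of C is taken from the observation that the remaining suffix of C has length ki plus kj.

```python
from functools import lru_cache


def is_match(A: str, B: str, C: str) -> bool:
    """Can C be split into n/2 characters matched into A and n/2 matched into B?"""
    n = len(C)
    assert len(A) == n and len(B) == n, "all three strings are assumed to have length n"
    if n % 2:
        return False  # n must be even

    @lru_cache(maxsize=None)
    def x(i: int, j: int, ki: int, kj: int) -> bool:
        # False base cases: ran out of A (or B) while still owing matches to it.
        if i == n and ki > 0:
            return False
        if j == n and kj > 0:
            return False
        # True base case: nothing left in A or B, so ki == kj == 0 and C is used up.
        if i == n and j == n:
            return True
        c = n - ki - kj  # index of the next unmatched character of C
        # Match A[i] with C[c], owing one fewer character to A.
        if ki > 0 and A[i] == C[c] and x(i + 1, j, ki - 1, kj):
            return True
        # Match B[j] with C[c], owing one fewer character to B.
        if kj > 0 and B[j] == C[c] and x(i, j + 1, ki, kj - 1):
            return True
        # Skip A[i] or skip B[j] without matching anything.
        if i < n and x(i + 1, j, ki, kj):
            return True
        if j < n and x(i, j + 1, ki, kj):
            return True
        return False

    # Original problem: full suffixes of A and B, half of C owed to each.
    return x(0, 0, n // 2, n // 2)
```

Every call moves to a strictly larger i plus j, so the recursion depth is at most 2n, and there are on the order of n to the 4th distinct argument tuples with constant work each.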
1157 01:09:15,160 --> 01:09:18,729 So I get a quartic number of subproblems, 1158 01:09:18,729 --> 01:09:21,560 quartic running time as designed. 1159 01:09:21,560 --> 01:09:25,359 So I'm not going to write that down, because I'm quite a bit 1160 01:09:25,359 --> 01:09:27,520 late. 1161 01:09:27,520 --> 01:09:29,260 I'm probably going to do-- just do one 1162 01:09:29,260 --> 01:09:31,930 more problem, which is sad, because the last one is 1163 01:09:31,930 --> 01:09:35,890 about Gokemon Po, which is fun, fun problem. 1164 01:09:35,890 --> 01:09:39,069 Gokemon Po basically relies on I'm 1165 01:09:39,069 --> 01:09:42,890 trying to catch a bunch of pocket monsters, just monsters, 1166 01:09:42,890 --> 01:09:44,399 I think is in this. 1167 01:09:44,399 --> 01:09:51,580 And you can either go to a location 1168 01:09:51,580 --> 01:09:56,540 and catch that monster for free, but that costs money, 1169 01:09:56,540 --> 01:09:59,720 because I have to ride share there. 1170 01:09:59,720 --> 01:10:03,190 Or I don't have to go to that location, 1171 01:10:03,190 --> 01:10:07,460 and I buy it on my in-app purchase, 1172 01:10:07,460 --> 01:10:10,730 but that costs me a different amount of money. 1173 01:10:10,730 --> 01:10:14,190 But buying it from the in-app purchase 1174 01:10:14,190 --> 01:10:17,880 kept me at the location I was previously, wherever I was. 1175 01:10:17,880 --> 01:10:20,010 And so the point of that problem is 1176 01:10:20,010 --> 01:10:22,990 I need to remember where I was last. 1177 01:10:22,990 --> 01:10:25,810 So that I know how far I need to travel 1178 01:10:25,810 --> 01:10:27,580 to get to my next monster. 1179 01:10:27,580 --> 01:10:30,250 So that's going to be the last one that I'm not 1180 01:10:30,250 --> 01:10:32,290 going to be able to get to. 1181 01:10:32,290 --> 01:10:40,200 Number 3 is a problem about tapas. 1182 01:10:40,200 --> 01:10:44,070 So these all come from spring '18. 
1183 01:10:44,070 --> 01:10:47,340 The first two came from a problem set. 1184 01:10:47,340 --> 01:10:51,090 These next two come from the final exam that year. 1185 01:10:51,090 --> 01:10:56,960 Obert Ratkins is on a diet. 1186 01:10:56,960 --> 01:10:59,990 But he has a dinner at an upscale tapas 1187 01:10:59,990 --> 01:11:01,940 bar, where he got many-- he's going 1188 01:11:01,940 --> 01:11:04,220 to order many small plates. 1189 01:11:04,220 --> 01:11:09,050 There are n plates of food on the menu, where each plate has 1190 01:11:09,050 --> 01:11:12,320 a certain information, it has a volume, the number 1191 01:11:12,320 --> 01:11:14,210 of calories in that dish. 1192 01:11:14,210 --> 01:11:16,997 And a sweetness label, basically 0 or 1, 1193 01:11:16,997 --> 01:11:18,080 whether it's sweet or not. 1194 01:11:21,240 --> 01:11:22,140 But he's on a diet. 1195 01:11:22,140 --> 01:11:25,860 And he wants to eat no more than k calories during his meal, 1196 01:11:25,860 --> 01:11:28,410 but wants to fill his stomach as much as possible, 1197 01:11:28,410 --> 01:11:30,330 because he wants to feel full. 1198 01:11:30,330 --> 01:11:33,480 So he wants to maximize the volume that he fills, 1199 01:11:33,480 --> 01:11:36,180 even though he wants to reduce the number of calories-- 1200 01:11:36,180 --> 01:11:38,790 restrict the number of calories. 1201 01:11:38,790 --> 01:11:42,360 He also wants to order exactly s sweet plates. 1202 01:11:45,180 --> 01:11:46,830 So we've got this other condition 1203 01:11:46,830 --> 01:11:50,280 where I need to make sure I'm eating a certain number 1204 01:11:50,280 --> 01:11:51,353 of sweet plates. 1205 01:11:51,353 --> 01:11:53,520 It might be useful for me to remember how many sweet 1206 01:11:53,520 --> 01:11:54,720 plates I've already eaten. 1207 01:11:54,720 --> 01:11:57,990 So I make sure that I eat that number, 1208 01:11:57,990 --> 01:12:00,120 without purchasing the same dish twice. 
1209 01:12:00,120 --> 01:12:04,710 So here's a condition that's similar to the knapsack 0-1 1210 01:12:04,710 --> 01:12:07,650 problem, versus a knapsack kind of general problem. 1211 01:12:07,650 --> 01:12:11,200 Am I allowed to take more than one of these things or not? 1212 01:12:11,200 --> 01:12:13,710 Here, it's a restriction that I'm not allowed 1213 01:12:13,710 --> 01:12:16,710 to take a plate more than once. 1214 01:12:16,710 --> 01:12:20,850 And I'm going to try to describe an order nks time 1215 01:12:20,850 --> 01:12:23,760 algorithm to find the maximum volume of food Obert 1216 01:12:23,760 --> 01:12:27,300 can eat given his diet. 1217 01:12:27,300 --> 01:12:30,600 So first thing I'm going to note here is one of the things 1218 01:12:30,600 --> 01:12:34,800 that we talked about at the last dynamic programming lecture was, 1219 01:12:34,800 --> 01:12:40,940 is this a polynomial running time that it's asking me for? 1220 01:12:40,940 --> 01:12:46,257 Actually, on your problem set 8, you were asked on each problem 1221 01:12:46,257 --> 01:12:48,590 to categorize whether the running time of your algorithm 1222 01:12:48,590 --> 01:12:50,730 was polynomial or not. 1223 01:12:50,730 --> 01:12:53,430 And actually, you don't have to solve the problem in order 1224 01:12:53,430 --> 01:12:57,310 to answer that question if we give you the running time. 1225 01:12:57,310 --> 01:12:58,810 If we give you the running time, you 1226 01:12:58,810 --> 01:13:01,227 can just take a look at that running time and be like, oh, 1227 01:13:01,227 --> 01:13:05,050 is that polynomial in the size of my input? 1228 01:13:05,050 --> 01:13:07,440 And here, is it? 1229 01:13:07,440 --> 01:13:10,170 All the ones previously were. 1230 01:13:10,170 --> 01:13:11,340 This one was order n.
1231 01:13:11,340 --> 01:13:13,170 That was order n squared, because the n was 1232 01:13:13,170 --> 01:13:16,020 the number of things in my input, the number of words 1233 01:13:16,020 --> 01:13:18,210 it took to give you that input. 1234 01:13:18,210 --> 01:13:20,080 Here, what do I have? 1235 01:13:20,080 --> 01:13:24,240 I have a triple of numbers for each plate. 1236 01:13:24,240 --> 01:13:26,100 There are n of them. 1237 01:13:26,100 --> 01:13:28,010 So n is polynomial. 1238 01:13:28,010 --> 01:13:31,850 s is polynomial because s is smaller than n 1239 01:13:31,850 --> 01:13:34,140 and it's a positive number. 1240 01:13:34,140 --> 01:13:40,830 But k, k is just some number in my input. 1241 01:13:40,830 --> 01:13:43,800 It's representable in, potentially, one word, that's 1242 01:13:43,800 --> 01:13:44,940 the assumption. 1243 01:13:44,940 --> 01:13:47,190 But it could have exponential size 1244 01:13:47,190 --> 01:13:50,400 depending on the word size of my machine. 1245 01:13:50,400 --> 01:13:53,160 I don't know how big k is relative to n. 1246 01:13:53,160 --> 01:13:57,630 And so this is a pseudopolynomial running time. 1247 01:13:57,630 --> 01:14:02,980 Because k is just a number in my problem, similar to subset sum, 1248 01:14:02,980 --> 01:14:04,890 similar to knapsack, which you guys 1249 01:14:04,890 --> 01:14:06,330 did in lecture and recitation. 1250 01:14:06,330 --> 01:14:09,990 And so if we ask you on an exam, which we probably will, 1251 01:14:09,990 --> 01:14:13,260 whether certain running times are polynomial or not, 1252 01:14:13,260 --> 01:14:14,860 that's the logic that you go through. 1253 01:14:14,860 --> 01:14:18,370 How big is my input? 1254 01:14:18,370 --> 01:14:21,220 What is my running time that I'm trying to evaluate? 1255 01:14:21,220 --> 01:14:23,230 And can I bound each of those terms 1256 01:14:23,230 --> 01:14:25,060 in terms of the size of my input? 1257 01:14:25,060 --> 01:14:29,800 If not, then you say it's pseudopolynomial.
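The pseudopolynomial argument can be illustrated with a little arithmetic in Python. This is only a rough sketch: the helper names `input_words` and `table_entries` are invented here, the input is assumed to be three numbers per plate plus k and s, and the table size (n + 1)(k + 1)(s + 1) is one plausible count of subproblems for an order nks algorithm.

```python
def input_words(n: int) -> int:
    """Machine words needed to write the input down: 3 numbers per plate, plus k and s."""
    return 3 * n + 2


def table_entries(n: int, k: int, s: int) -> int:
    """Assumed number of subproblems for an order n*k*s dynamic program."""
    return (n + 1) * (k + 1) * (s + 1)


# k always fits in one w-bit word, so the input size does not change with w,
# but the table grows exponentially in w.
growth = [(w, input_words(10), table_entries(10, 2**w - 1, 3)) for w in (8, 16, 32)]
```

Each tuple in `growth` pairs a word size with the same input size of 32 words and a rapidly growing table size, which is why the bound is not polynomial in the size of the input.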
1258 01:14:29,800 --> 01:14:33,040 All right, so let's try to tackle this problem. 1259 01:14:33,040 --> 01:14:36,140 Already, because we've got pseudopolynomial, 1260 01:14:36,140 --> 01:14:38,110 you're thinking maybe this is going 1261 01:14:38,110 --> 01:14:41,930 to be knapsack-like or subset sum-like. 1262 01:14:41,930 --> 01:14:43,300 What do I need to-- 1263 01:14:43,300 --> 01:14:46,790 I'm just going to go straight for subproblems here. 1264 01:14:46,790 --> 01:14:52,850 Actually, I should probably say what my things are. 1265 01:14:52,850 --> 01:14:54,080 Meh, this is fine. 1266 01:14:54,080 --> 01:14:56,960 I gave notation up there, didn't I? 1267 01:14:56,960 --> 01:14:59,100 So we're going to have subproblems. 1268 01:14:59,100 --> 01:15:00,650 I'm going to-- I want to maximize 1269 01:15:00,650 --> 01:15:02,270 the number, the volume of food. 1270 01:15:02,270 --> 01:15:04,670 So that should probably be the output 1271 01:15:04,670 --> 01:15:11,100 of my subproblem, the max volume on some subset of dishes. 1272 01:15:11,100 --> 01:15:14,570 I'm going to choose suffixes here. 1273 01:15:14,570 --> 01:15:24,680 i and some other stuff is going to be max volume of food 1274 01:15:24,680 --> 01:15:37,930 possible for plates Pi to Pn. 1275 01:15:37,930 --> 01:15:39,930 Going to assume one index here, because why not? 1276 01:15:43,190 --> 01:15:46,770 But do I need to remember information along the way? 1277 01:15:46,770 --> 01:15:51,480 Yeah, just like with subset sum or knapsack, I need-- 1278 01:15:51,480 --> 01:15:54,745 I have this calorie limit. 1279 01:15:54,745 --> 01:15:56,120 So it's going to be really useful 1280 01:15:56,120 --> 01:15:58,700 for me to know how many calories I've already eaten, 1281 01:15:58,700 --> 01:16:02,460 or how many calories I have left in my budget. 1282 01:16:02,460 --> 01:16:16,100 So let's say j, using at most j calories 1283 01:16:16,100 --> 01:16:18,920 from the remaining dishes. 
1284 01:16:18,920 --> 01:16:24,260 And I need to make sure that I'm eating exactly some number 1285 01:16:24,260 --> 01:16:26,630 of sweet plates in the future. 1286 01:16:26,630 --> 01:16:30,020 And I need to remember, as I eat a sweet plate, 1287 01:16:30,020 --> 01:16:32,815 the number of sweet plates I need to eat decreases. 1288 01:16:32,815 --> 01:16:34,190 And so I want to generalize that. 1289 01:16:34,190 --> 01:16:44,070 I'm going to put an s prime here to denote eating exactly 1290 01:16:44,070 --> 01:16:48,440 s prime sweet plates. 1291 01:16:51,080 --> 01:16:53,227 OK, so that's my subproblem. 1292 01:16:56,682 --> 01:16:57,890 I've got tons of board space. 1293 01:16:57,890 --> 01:17:00,140 I'm going to go ahead and use it. 1294 01:17:00,140 --> 01:17:09,160 Relation, we've got x, i, j, s prime equals-- 1295 01:17:09,160 --> 01:17:12,100 OK, I'm trying to maximize volume. 1296 01:17:12,100 --> 01:17:15,160 Probably, I want to be maximizing over something. 1297 01:17:15,160 --> 01:17:20,590 This combinator is kind of what I like to call it. 1298 01:17:20,590 --> 01:17:22,810 Usually, what you're doing in dynamic programming 1299 01:17:22,810 --> 01:17:25,480 is making some kind of choice or combin-- 1300 01:17:25,480 --> 01:17:29,750 combining, combinating, combining 1301 01:17:29,750 --> 01:17:33,740 some number of subproblems and choosing which one's the best. 1302 01:17:33,740 --> 01:17:36,960 If you just list a bunch of options here 1303 01:17:36,960 --> 01:17:39,192 and don't tell us how to combine them, 1304 01:17:39,192 --> 01:17:40,400 that's going to be a problem. 1305 01:17:40,400 --> 01:17:42,358 Because we don't know what your dynamic program 1306 01:17:42,358 --> 01:17:43,440 is doing at all. 1307 01:17:43,440 --> 01:17:46,400 So it's really useful for you to be able to tell us 1308 01:17:46,400 --> 01:17:47,990 how you're combining your subproblems.
1309 01:17:47,990 --> 01:17:51,290 Here, we're doing a maximization over the different volumes 1310 01:17:51,290 --> 01:17:52,160 possible. 1311 01:17:52,160 --> 01:18:01,470 If we decide to eat the plate i, then we get Vi in volume, 1312 01:18:01,470 --> 01:18:04,440 we fill our tummies with Vi in volume. 1313 01:18:04,440 --> 01:18:07,920 But then we have to recurse on using one fewer plate, 1314 01:18:07,920 --> 01:18:11,020 because we can't use that plate again. 1315 01:18:11,020 --> 01:18:15,660 And we've decreased the amount of calories in our budget. 1316 01:18:15,660 --> 01:18:19,860 And I'm going to say s prime minus 1317 01:18:19,860 --> 01:18:28,560 si, because si is 1 if it's sweet and 0 if it's not. 1318 01:18:28,560 --> 01:18:30,630 So it's kind of nice that they kind of 1319 01:18:30,630 --> 01:18:31,900 gave us this notation here. 1320 01:18:31,900 --> 01:18:34,500 I can just subtract it off if it's there. 1321 01:18:34,500 --> 01:18:38,070 I don't have to do this conditional or something. 1322 01:18:38,070 --> 01:18:41,610 And I don't ever want to go below these budgets. 1323 01:18:41,610 --> 01:18:48,900 So I'm just going to say if Ci is 1324 01:18:48,900 --> 01:18:52,770 less than or equal to j and si is less than 1325 01:18:52,770 --> 01:18:55,080 or equal to s prime. 1326 01:18:55,080 --> 01:18:57,240 So that's going to make sure that I never 1327 01:18:57,240 --> 01:19:00,240 have these guys go negative. 1328 01:19:00,240 --> 01:19:01,650 Otherwise, I don't eat the plate. 1329 01:19:01,650 --> 01:19:03,810 And that's kind of the easy case, 1330 01:19:03,810 --> 01:19:08,250 because I just go i plus 1, j, s prime. 1331 01:19:08,250 --> 01:19:09,480 These things didn't change. 1332 01:19:09,480 --> 01:19:12,540 I just have one fewer thing left. 1333 01:19:12,540 --> 01:19:14,190 So I'm maximizing over these things. 1334 01:19:14,190 --> 01:19:16,470 This one's an always. 1335 01:19:16,470 --> 01:19:17,180 It's not an if. 
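The relation described above can be written out as a single formula. This is a sketch of the board notation as spoken, assuming v_i, c_i, and s_i denote the volume, calories, and sweetness indicator (1 if sweet, 0 if not) of plate i, and x(i, j, s') is the max volume from plates i through n using at most j calories and exactly s' sweet plates:

```latex
x(i, j, s') = \max \left\{
\begin{array}{ll}
v_i + x(i+1,\; j - c_i,\; s' - s_i) & \text{if } c_i \le j \text{ and } s_i \le s', \\
x(i+1,\; j,\; s') & \text{always}
\end{array}
\right\}
```

The first option eats plate i and the second skips it. The two guards on the first option are what keep the calorie and sweet-plate budgets from going negative.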
1336 01:19:20,380 --> 01:19:22,150 So I just have two choices. 1337 01:19:22,150 --> 01:19:26,440 And maximizing over them, topological sort order, 1338 01:19:26,440 --> 01:19:30,430 here, I'm always recursing on a thing with larger i. 1339 01:19:33,110 --> 01:19:40,340 Depend on larger i, so acyclic, happy. 1340 01:19:43,430 --> 01:19:52,900 Base cases, what's the good case? 1341 01:19:52,900 --> 01:19:56,860 I get to the end, I've reached the end of my menu, 1342 01:19:56,860 --> 01:19:58,870 I can't look at any more plates, I'm stuffed. 1343 01:20:01,570 --> 01:20:04,390 And I've already forbidden myself 1344 01:20:04,390 --> 01:20:05,950 from going negative on the calories. 1345 01:20:05,950 --> 01:20:09,420 So that should be all good. 1346 01:20:09,420 --> 01:20:12,720 But what do I want on the third parameter? 1347 01:20:12,720 --> 01:20:19,050 0, I better have eaten exactly s plates. 1348 01:20:19,050 --> 01:20:24,420 So I want to get down to x n plus 1, 1349 01:20:24,420 --> 01:20:28,980 because I'm 1-indexed, j, for any j, 0. 1350 01:20:28,980 --> 01:20:32,550 That's going to be 0. 1351 01:20:32,550 --> 01:20:34,110 I got no calories there. 1352 01:20:34,110 --> 01:20:35,170 But it's a good thing. 1353 01:20:35,170 --> 01:20:36,390 It's a good place. 1354 01:20:36,390 --> 01:20:37,860 It's fine. 1355 01:20:37,860 --> 01:20:39,540 0 is good. 1356 01:20:39,540 --> 01:20:45,060 This is done, I don't know. 1357 01:20:45,060 --> 01:20:47,160 OK, there's another base case. 1358 01:20:47,160 --> 01:20:48,870 What's the bad base case? 1359 01:20:48,870 --> 01:20:50,130 I get to the end. 1360 01:20:50,130 --> 01:20:51,660 I'm always increasing i. 1361 01:20:51,660 --> 01:20:53,910 And so I better be doing something on n plus 1. 1362 01:20:53,910 --> 01:20:57,110 I got to the end. 1363 01:20:57,110 --> 01:21:00,260 j, again, is going to be non-negative. 1364 01:21:00,260 --> 01:21:05,000 Because we're always going to be in our calorie budget.
1365 01:21:05,000 --> 01:21:12,340 But if this is anything other than s prime greater than-- 1366 01:21:12,340 --> 01:21:15,690 if it's anything but 0, what is that going to be? 1367 01:21:15,690 --> 01:21:17,415 AUDIENCE: It'd be minus infinity. 1368 01:21:17,415 --> 01:21:18,480 JASON KU: Minus infinity. 1369 01:21:18,480 --> 01:21:20,950 I never want to be in this situation. 1370 01:21:20,950 --> 01:21:24,840 If I do my dynamic program and I get a minus infinity up 1371 01:21:24,840 --> 01:21:29,130 at the top, that means there is no path to this subproblem 1372 01:21:29,130 --> 01:21:31,580 here, where I'm happy. 1373 01:21:31,580 --> 01:21:33,360 I'm always sad. 1374 01:21:33,360 --> 01:21:38,180 And so I return that the maximum volume of food Obert 1375 01:21:38,180 --> 01:21:43,520 can eat and maintain his diet is not possible. 1376 01:21:43,520 --> 01:21:46,840 Essentially, there aren't s dishes, sweet dishes 1377 01:21:46,840 --> 01:21:52,622 in the thing whose calorie budget are below my limit. 1378 01:21:52,622 --> 01:21:54,080 And that's probably an easier thing 1379 01:21:54,080 --> 01:21:57,200 to check than in this band. 1380 01:21:57,200 --> 01:22:01,470 So we have our original subproblems now. 1381 01:22:01,470 --> 01:22:03,230 Solution is given by what? 1382 01:22:07,960 --> 01:22:09,730 Just one of our subproblems, it's 1383 01:22:09,730 --> 01:22:11,930 just seeing what's the maximum volume. 1384 01:22:11,930 --> 01:22:15,910 I don't have to retrace my steps to figure out my thing. 1385 01:22:15,910 --> 01:22:18,430 I just-- I say one of the subproblems, 1386 01:22:18,430 --> 01:22:22,300 it's using all of the things on my menu, 1387 01:22:22,300 --> 01:22:28,480 using my entire budget k, and trying to get exactly s things. 1388 01:22:28,480 --> 01:22:32,570 That's going to be my output to my algorithm. 1389 01:22:32,570 --> 01:22:36,450 And this takes what time? 1390 01:22:36,450 --> 01:22:38,020 How many subproblems do I have? 
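The subproblems, relation, base cases, and original problem described so far can be put together as a short memoized recursion. This is only a sketch under stated assumptions: the function name `max_volume`, the argument names, and the choice to return `None` when the diet is infeasible are hypothetical, and the code is 0-indexed, so the board's x(n + 1, j, s') corresponds to `i == n` here.

```python
from functools import lru_cache
import math


def max_volume(v, c, sweet, k, s):
    """Max total volume using at most k calories and exactly s sweet plates.

    v[i], c[i], sweet[i] are the volume, calories, and sweetness flag (0 or 1)
    of plate i. Returns None if no valid selection exists (hypothetical choice).
    """
    n = len(v)

    @lru_cache(maxsize=None)
    def x(i, j, sp):
        # Base cases: past the last plate, we are happy only if no sweet
        # plates remain to be eaten; otherwise the state is infeasible.
        if i == n:
            return 0 if sp == 0 else -math.inf
        # Option that is always available: skip plate i.
        best = x(i + 1, j, sp)
        # Option available only if neither budget would go negative: eat plate i.
        if c[i] <= j and sweet[i] <= sp:
            best = max(best, v[i] + x(i + 1, j - c[i], sp - sweet[i]))
        return best

    result = x(0, k, s)  # original problem: all plates, full budgets
    return None if result == -math.inf else result
```

For example, `max_volume([3, 4, 5], [2, 3, 4], [1, 0, 1], 6, 1)` picks the first two plates. Because the recursion goes one level deep per plate, a very long menu would need an iterative version instead.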
1391 01:22:38,020 --> 01:22:44,346 I have n plus 1 subproblems for this parameter. 1392 01:22:44,346 --> 01:22:48,150 I have k plus 1 possible things for this parameter. 1393 01:22:48,150 --> 01:22:51,660 And I have s plus 1 possible things for this parameter. 1394 01:22:51,660 --> 01:23:01,610 So I get order nks subproblems. 1395 01:23:01,610 --> 01:23:04,670 How much work per subproblem? Just a max 1396 01:23:04,670 --> 01:23:11,030 of two things, so constant work per subproblem 1397 01:23:11,030 --> 01:23:17,570 yields order nks time total. 1398 01:23:17,570 --> 01:23:22,100 So those are three nice practice problems for you, 1399 01:23:22,100 --> 01:23:24,620 two that are polynomial, one that's pseudopolynomial. 1400 01:23:24,620 --> 01:23:26,600 You have one more example in there, 1401 01:23:26,600 --> 01:23:30,000 which is the Gokemon Po problem, which is a fun problem. 1402 01:23:30,000 --> 01:23:32,720 It involves remembering additional information that's 1403 01:23:32,720 --> 01:23:34,046 not-- 1404 01:23:34,046 --> 01:23:37,670 not really a pseudopolynomial number in your problem. 1405 01:23:37,670 --> 01:23:44,000 But it's the location of where I was last, or where I was going. 1406 01:23:44,000 --> 01:23:46,340 So take a look at that problem. 1407 01:23:46,340 --> 01:23:50,580 It's another kind of non-trivial way of expanding subproblems. 1408 01:23:50,580 --> 01:23:55,600 OK, and with that, good luck on your quiz 3.
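The order nks count for the plate problem above is easiest to see in a bottom-up version, where the table has (n + 1)(k + 1)(s + 1) entries and each entry takes a max of at most two values. This is a sketch under assumptions rather than the course's reference solution: the name `max_volume_bottom_up`, the argument names, the 0-indexing, and returning `None` for an infeasible diet are all hypothetical choices.

```python
import math


def max_volume_bottom_up(v, c, sweet, k, s):
    """Bottom-up DP: at most k calories, exactly s sweet plates, max volume."""
    n = len(v)
    NEG = -math.inf
    # x[i][j][sp]: best volume from plates i..n-1 with at most j calories
    # and exactly sp sweet plates. The table has (n+1)(k+1)(s+1) entries.
    x = [[[NEG] * (s + 1) for _ in range(k + 1)] for _ in range(n + 1)]
    # Good base case: no plates left and no sweet plates still owed.
    for j in range(k + 1):
        x[n][j][0] = 0
    # Fill in decreasing i, since each entry depends only on row i + 1.
    for i in range(n - 1, -1, -1):
        for j in range(k + 1):
            for sp in range(s + 1):
                best = x[i + 1][j][sp]  # skip plate i (always allowed)
                if c[i] <= j and sweet[i] <= sp:  # eat plate i if budgets allow
                    best = max(best, v[i] + x[i + 1][j - c[i]][sp - sweet[i]])
                x[i][j][sp] = best
    result = x[0][k][s]
    return None if result == NEG else result
```

The three nested loops do constant work per entry, which matches the order nks total. The running time depends on the numeric value of k, which is why the bound is pseudopolynomial.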