1 00:00:00,000 --> 00:00:04,383 [SQUEAKING] [RUSTLING] [CLICKING] 2 00:00:12,970 --> 00:00:15,190 ERIK DEMAINE: All right, welcome back to 006. 3 00:00:15,190 --> 00:00:18,790 Today we start a totally new section of the class. 4 00:00:18,790 --> 00:00:21,340 Up till now, we've mostly been showing 5 00:00:21,340 --> 00:00:25,990 you really cool and powerful algorithms, sorting algorithms, 6 00:00:25,990 --> 00:00:28,750 graph algorithms, data structures, trees, lots 7 00:00:28,750 --> 00:00:30,970 of good stuff that you can apply to solve tons 8 00:00:30,970 --> 00:00:34,540 of algorithmic problems, either by reducing to the data 9 00:00:34,540 --> 00:00:38,320 structures that we showed you, or reducing to graph problems 10 00:00:38,320 --> 00:00:40,600 that we showed you, or by modifying 11 00:00:40,600 --> 00:00:41,950 those algorithms a bit. 12 00:00:41,950 --> 00:00:45,850 Today we're going to start a new section on algorithmic design-- 13 00:00:45,850 --> 00:00:49,120 how to, from scratch, come up with a polynomial time 14 00:00:49,120 --> 00:00:50,350 algorithm to solve a problem. 15 00:00:52,697 --> 00:00:54,280 And in particular, we're going to talk 16 00:00:54,280 --> 00:00:57,340 about an algorithmic design paradigm called 17 00:00:57,340 --> 00:01:00,190 dynamic programming, which is extremely powerful. 18 00:01:00,190 --> 00:01:02,860 It's probably the most powerful algorithmic design paradigm. 19 00:01:02,860 --> 00:01:03,730 Very general. 20 00:01:03,730 --> 00:01:06,340 Can solve lots of problems. 21 00:01:06,340 --> 00:01:10,900 It's a particular type of recursive algorithm design. 22 00:01:10,900 --> 00:01:14,200 And in general, this class-- 23 00:01:14,200 --> 00:01:17,290 all of algorithms-- is about recursive algorithm design 24 00:01:17,290 --> 00:01:19,900 at some level, because we want to write 25 00:01:19,900 --> 00:01:22,360 constant-sized pieces of code that solve 26 00:01:22,360 --> 00:01:23,890 problems of arbitrary size. 
27 00:01:23,890 --> 00:01:25,690 We have some problem size n and we're 28 00:01:25,690 --> 00:01:27,928 trying to write 100 lines of code or whatever, 29 00:01:27,928 --> 00:01:30,220 some constant amount that doesn't depend on the problem 30 00:01:30,220 --> 00:01:30,720 size. 31 00:01:30,720 --> 00:01:32,260 We have one algorithm that solves 32 00:01:32,260 --> 00:01:34,970 all instances of the problem. 33 00:01:34,970 --> 00:01:39,400 And so we have to write code that is recursive or uses loops 34 00:01:39,400 --> 00:01:41,110 or somehow reuses the instructions 35 00:01:41,110 --> 00:01:43,360 that we give the computer. 36 00:01:43,360 --> 00:01:46,180 And you may know you can convert any algorithm based 37 00:01:46,180 --> 00:01:48,760 on loops into an algorithm using recursion. 38 00:01:48,760 --> 00:01:51,310 And we're going to take the recursive view today, 39 00:01:51,310 --> 00:01:53,020 in particular because it fits very 40 00:01:53,020 --> 00:01:56,080 well with our proof-by-induction technique, which we've 41 00:01:56,080 --> 00:02:00,430 used throughout this class, but also because it gives us 42 00:02:00,430 --> 00:02:04,270 some structure on how different subproblems relate in something 43 00:02:04,270 --> 00:02:08,690 called a subproblem graph, that we'll be talking about today. 44 00:02:08,690 --> 00:02:11,470 And so we're going to start out with, in general, how do we 45 00:02:11,470 --> 00:02:13,450 design recursive algorithms? 46 00:02:13,450 --> 00:02:16,270 That's sort of the overall, encompassing everything. 47 00:02:16,270 --> 00:02:18,970 We have thought very hard to come up 48 00:02:18,970 --> 00:02:21,910 with a cool acronym for this paradigm which we invented 49 00:02:21,910 --> 00:02:23,440 called SRTBOT-- 50 00:02:23,440 --> 00:02:25,330 thanks, Jason. 51 00:02:25,330 --> 00:02:27,940 And so we'll talk-- it's not actually for sorting. 
52 00:02:27,940 --> 00:02:33,580 It's just an acronym for sub-problems, relations, 53 00:02:33,580 --> 00:02:39,160 topological order, base case, original problem, and time. 54 00:02:39,160 --> 00:02:42,975 But it's an acronym that will help you remember 55 00:02:42,975 --> 00:02:44,725 all the steps you need in order to specify 56 00:02:44,725 --> 00:02:46,090 a recursive algorithm. 57 00:02:46,090 --> 00:02:47,950 And dynamic programming is going to build 58 00:02:47,950 --> 00:02:52,330 on this template by adding one new idea called 59 00:02:52,330 --> 00:02:55,120 memoization, which is just the idea of reusing work 60 00:02:55,120 --> 00:02:57,110 that you've done before. 61 00:02:57,110 --> 00:03:00,250 And that's going to let us solve tons of problems. 62 00:03:00,250 --> 00:03:02,410 And let's see. 63 00:03:02,410 --> 00:03:05,540 I don't-- let's get into it. 64 00:03:05,540 --> 00:03:08,770 So we'll start out today with SRTBOT. 65 00:03:08,770 --> 00:03:13,330 So here is SRTBOT down the column here. 66 00:03:13,330 --> 00:03:16,577 This is a recursive algorithm design paradigm. 67 00:03:16,577 --> 00:03:18,160 And in general, what we're going to do 68 00:03:18,160 --> 00:03:20,560 is take the problem that we actually want to solve 69 00:03:20,560 --> 00:03:24,280 and split it up into lots of possible sub problems. 70 00:03:24,280 --> 00:03:26,530 And so the first part is to define what 71 00:03:26,530 --> 00:03:27,760 the heck are the subproblems. 72 00:03:27,760 --> 00:03:30,580 In general, we'll want some polynomial number of them. 73 00:03:30,580 --> 00:03:34,150 But it's pretty open-ended what these look like. 74 00:03:34,150 --> 00:03:37,588 And the hardest part, usually, in defining 75 00:03:37,588 --> 00:03:39,130 a recursive algorithm is figuring out 76 00:03:39,130 --> 00:03:40,600 what the sub problems should be. 77 00:03:40,600 --> 00:03:42,975 Usually they're related to the problem you want to solve. 
78 00:03:42,975 --> 00:03:45,100 Often the problem you want to solve-- 79 00:03:45,100 --> 00:03:46,908 this is actually near the last step-- 80 00:03:46,908 --> 00:03:48,700 the original problem you're trying to solve 81 00:03:48,700 --> 00:03:51,430 is often one of these sub problems. 82 00:03:51,430 --> 00:03:53,200 And then you use the smaller sub problems 83 00:03:53,200 --> 00:03:56,590 in order to build up the final, original problem. 84 00:03:56,590 --> 00:03:58,180 But sometimes at the end, you need 85 00:03:58,180 --> 00:04:00,520 to take a bunch of subproblems and combine them 86 00:04:00,520 --> 00:04:01,900 into your original problem. 87 00:04:01,900 --> 00:04:04,330 You can think-- one analogy you can think of here 88 00:04:04,330 --> 00:04:06,160 is divide and conquer algorithms, 89 00:04:06,160 --> 00:04:09,010 which also had this kind of style. 90 00:04:09,010 --> 00:04:12,220 But more generally, we're going to relate different sub problem 91 00:04:12,220 --> 00:04:17,440 solutions with some recursive structure-- some recurrence 92 00:04:17,440 --> 00:04:19,209 relation. 93 00:04:19,209 --> 00:04:21,459 This is just a recursive algorithm 94 00:04:21,459 --> 00:04:24,640 that defines how to solve one problem in terms 95 00:04:24,640 --> 00:04:28,480 of smaller sub-problems for some notion of smaller. 96 00:04:28,480 --> 00:04:31,280 And this is given by the topological order. 97 00:04:31,280 --> 00:04:34,150 So if we think of the subproblems as a graph 98 00:04:34,150 --> 00:04:37,500 and we draw an edge between-- 99 00:04:37,500 --> 00:04:40,360 so the vertices of the graph are sub problems. 100 00:04:40,360 --> 00:04:42,790 The edges are the dependencies between those subproblems. 101 00:04:42,790 --> 00:04:45,580 Then what we'd like is the topological ordering, 102 00:04:45,580 --> 00:04:48,250 the topological sort problem we talked about in the context 103 00:04:48,250 --> 00:04:51,880 of DFS or DAG shortest paths. 
104 00:04:51,880 --> 00:04:56,260 What we would like is that the subproblems and the calls-- 105 00:04:56,260 --> 00:04:59,110 the recursive calls between them in this recursive relation-- 106 00:04:59,110 --> 00:05:00,220 forms a DAG. 107 00:05:00,220 --> 00:05:02,560 We want it to be acyclic, otherwise 108 00:05:02,560 --> 00:05:05,170 you have an infinite loop in your recursive calls. 109 00:05:05,170 --> 00:05:09,220 If you have a cycle, you'll never terminate. 110 00:05:09,220 --> 00:05:12,900 And so to make sure that these dependencies 111 00:05:12,900 --> 00:05:15,540 between subproblems given by this recurrence relation 112 00:05:15,540 --> 00:05:18,060 is acyclic, one way to do that is 113 00:05:18,060 --> 00:05:20,798 to specify a topological order. 114 00:05:20,798 --> 00:05:22,340 Or you could prove it some other way. 115 00:05:22,340 --> 00:05:25,790 But often it's just a for loop to say, just do it 116 00:05:25,790 --> 00:05:27,290 in this order. 117 00:05:27,290 --> 00:05:31,020 Then of course any recursive structure needs base cases. 118 00:05:31,020 --> 00:05:33,752 So that's a useful step not to forget. 119 00:05:33,752 --> 00:05:35,960 We want to solve the original problem using these sub 120 00:05:35,960 --> 00:05:36,460 problems. 121 00:05:36,460 --> 00:05:38,810 And then we analyze a running time at the end. 122 00:05:38,810 --> 00:05:42,000 So six easy steps. 123 00:05:42,000 --> 00:05:43,980 Actually, the hardest ones are these two, 124 00:05:43,980 --> 00:05:45,450 which are interrelated. 125 00:05:45,450 --> 00:05:49,073 And what we're going to see over the next four lectures-- 126 00:05:49,073 --> 00:05:50,490 this is the first of four lectures 127 00:05:50,490 --> 00:05:53,370 on dynamic programming-- is lots of examples of applying 128 00:05:53,370 --> 00:05:56,820 this paradigm over and over together with the memoization 129 00:05:56,820 --> 00:05:58,990 idea, which we'll get to soon. 
130 00:05:58,990 --> 00:06:03,010 Let's see an example first of an algorithm we've already seen, 131 00:06:03,010 --> 00:06:05,700 which is merge sort, so a divide and conquer algorithm, 132 00:06:05,700 --> 00:06:09,120 phrased with this structure of SRTBOT. 133 00:06:09,120 --> 00:06:11,460 So for the sub problems-- 134 00:06:11,460 --> 00:06:14,670 so our original problem is to sort the elements of A. 135 00:06:14,670 --> 00:06:18,120 And some sub-problems that we solve along the way are sorting 136 00:06:18,120 --> 00:06:21,390 different sub-arrays of A. So for every-- 137 00:06:21,390 --> 00:06:24,360 well, not for every i and j, but for some i and js, 138 00:06:24,360 --> 00:06:28,660 we sort the items from i up to j minus 1. 139 00:06:28,660 --> 00:06:32,170 So I'm going to define that subproblem to be s of ij. 140 00:06:32,170 --> 00:06:34,650 So this is something that I might want to solve. 141 00:06:34,650 --> 00:06:37,080 The original problem that I want to solve 142 00:06:37,080 --> 00:06:40,480 is s of 0 comma n, where n is the length of the array. 143 00:06:40,480 --> 00:06:42,960 So that's what I actually care about in the end. 144 00:06:42,960 --> 00:06:45,750 But we're going to solve that by writing it recursively 145 00:06:45,750 --> 00:06:49,780 in terms of sorting different sub-arrays as follows. 146 00:06:49,780 --> 00:06:51,825 This is the recurrence relation. 147 00:06:51,825 --> 00:06:53,200 I've written it very simply here. 148 00:06:53,200 --> 00:06:55,180 Of course, there's a merge algorithm, 149 00:06:55,180 --> 00:06:57,150 which is somewhat complicated. 150 00:06:57,150 --> 00:07:00,690 But as we saw the two finger linear time merge algorithm, 151 00:07:00,690 --> 00:07:04,170 given two sorted arrays-- 152 00:07:04,170 --> 00:07:06,600 so this is supposed to be the sorted array 153 00:07:06,600 --> 00:07:09,210 version of the items i through m. 
m 154 00:07:09,210 --> 00:07:11,610 is the middle element between i and j 155 00:07:11,610 --> 00:07:15,810 and the sorted array of the items from m up to j. 156 00:07:15,810 --> 00:07:18,960 If we merge those, that gives us the sorted array 157 00:07:18,960 --> 00:07:21,700 from i up to j. 158 00:07:21,700 --> 00:07:24,250 And that's exactly what merge sort does. 159 00:07:24,250 --> 00:07:28,210 So in general, this relation is just 160 00:07:28,210 --> 00:07:33,460 some algorithm for if you're given the solutions 161 00:07:33,460 --> 00:07:38,560 to some smaller subproblems, how do I solve the subproblem 162 00:07:38,560 --> 00:07:41,820 that I want to solve? 163 00:07:41,820 --> 00:07:46,550 And so we need to make sure that this problem is bigger 164 00:07:46,550 --> 00:07:48,590 than the ones that we recursively call on 165 00:07:48,590 --> 00:07:51,050 and that we don't get an infinite cyclic loop 166 00:07:51,050 --> 00:07:51,980 of recursions. 167 00:07:51,980 --> 00:07:53,870 And here our valid topological order 168 00:07:53,870 --> 00:07:57,470 is to say, solve these problems in order 169 00:07:57,470 --> 00:08:02,090 where j minus i-- the length of the sub-array-- is increasing. 170 00:08:02,090 --> 00:08:05,600 And then you can check because m is strictly between i and j. 171 00:08:05,600 --> 00:08:11,660 As long as we're not in a base case, then we know we can-- 172 00:08:11,660 --> 00:08:14,168 these subarrays will be smaller than this one. 173 00:08:14,168 --> 00:08:15,710 And so this increasing order gives us 174 00:08:15,710 --> 00:08:19,820 a valid topological order on all of the problems, all 175 00:08:19,820 --> 00:08:20,870 the subproblems. 176 00:08:20,870 --> 00:08:22,700 We have a base case, which is if we 177 00:08:22,700 --> 00:08:25,880 don't want to sort anything, that's the empty array, 178 00:08:25,880 --> 00:08:27,380 or at least in the original problem. 
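The SRTBOT description of merge sort above can be sketched in code roughly as follows. This is a minimal sketch, not the lecture's own code: the names `S`, `merge`, and `merge_sort` are illustrative, and the base case here is assumed to cover subarrays of length 0 or 1 so that m is always strictly between i and j when we recurse.

```python
def merge(left, right):
    """Two-finger merge of two sorted lists in linear time."""
    out = []
    a = b = 0
    while a < len(left) and b < len(right):
        if left[a] <= right[b]:
            out.append(left[a])
            a += 1
        else:
            out.append(right[b])
            b += 1
    out.extend(left[a:])   # at most one of these two tails is non-empty
    out.extend(right[b:])
    return out


def S(A, i, j):
    """Subproblem S(i, j): the sorted list of the items A[i:j]."""
    if j - i <= 1:         # base case (assumed): empty or single-item subarray
        return A[i:j]
    m = (i + j) // 2       # strictly between i and j because j - i >= 2
    # Relation: both recursive calls are on strictly shorter subarrays,
    # so increasing j - i is a valid topological order.
    return merge(S(A, i, m), S(A, m, j))


def merge_sort(A):
    """Original problem: S(0, n) where n = len(A)."""
    return S(A, 0, len(A))
```

Calling `merge_sort([5, 2, 4, 1])` walks the subproblems S(0, 4), S(0, 2), S(2, 4), and so on down to single items.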
179 00:08:27,380 --> 00:08:29,420 And then running time is-- 180 00:08:29,420 --> 00:08:32,780 I mean, there's no better way to solve it than the recurrence 181 00:08:32,780 --> 00:08:34,451 that we already saw how to solve. 182 00:08:34,451 --> 00:08:36,409 So this is just another way to think of n log n 183 00:08:36,409 --> 00:08:41,179 merge sort in this labeled framework of SRTBOT. 184 00:08:41,179 --> 00:08:44,330 Let's get to another problem that 185 00:08:44,330 --> 00:08:48,380 does not fit recursion so well. 186 00:08:48,380 --> 00:08:51,480 But we can make it better. 187 00:08:51,480 --> 00:08:53,000 So this is-- we're going to start 188 00:08:53,000 --> 00:08:54,620 with a very simple problem, which 189 00:08:54,620 --> 00:08:59,060 is computing Fibonacci numbers. 190 00:08:59,060 --> 00:09:02,900 It's really just a toy problem to illustrate a very powerful 191 00:09:02,900 --> 00:09:06,290 idea, which is memoization. 192 00:09:06,290 --> 00:09:08,450 So the problem I'm interested in is I'm 193 00:09:08,450 --> 00:09:10,910 given a particular number, n. 194 00:09:10,910 --> 00:09:14,730 And I want to compute the nth Fibonacci number. 195 00:09:14,730 --> 00:09:17,390 And in case you forgot, the nth Fibonacci number 196 00:09:17,390 --> 00:09:22,190 is given by this recurrence. fn is fn minus 1 plus fn minus 2 197 00:09:22,190 --> 00:09:26,570 with base case, let's say, f1 equals f2 equals 1. 198 00:09:29,270 --> 00:09:30,950 And so we'd like to compute this. 199 00:09:30,950 --> 00:09:33,320 This seems-- this is a recurrence. 200 00:09:33,320 --> 00:09:36,230 So it seems very natural to write it 201 00:09:36,230 --> 00:09:37,480 as a recursive algorithm. 202 00:09:37,480 --> 00:09:38,570 So let's try to do it. 203 00:09:38,570 --> 00:09:41,570 We start with what are the sub problems. 204 00:09:41,570 --> 00:09:45,380 The obvious sub problems are just 205 00:09:45,380 --> 00:09:55,628 the various Fibonacci numbers, f i for i between 1 and n. 
206 00:09:55,628 --> 00:09:57,170 So there are n of these sub problems. 207 00:10:00,560 --> 00:10:01,530 Cool. 208 00:10:01,530 --> 00:10:02,030 Let's see. 209 00:10:02,030 --> 00:10:04,250 We want a relation between them. 210 00:10:07,680 --> 00:10:10,850 Well, maybe just to distinguish the problems from the Fibonacci 211 00:10:10,850 --> 00:10:13,670 numbers, let me write f of i. 212 00:10:13,670 --> 00:10:16,520 This is a function, an algorithm we're going to define. 213 00:10:16,520 --> 00:10:18,980 And it's defined to be-- 214 00:10:18,980 --> 00:10:22,250 the goal we're trying to get is the ith Fibonacci number 215 00:10:22,250 --> 00:10:23,990 given i. 216 00:10:23,990 --> 00:10:26,540 And then we can write the recurrence relation 217 00:10:26,540 --> 00:10:31,940 on these guys, just f of i equals f of i minus 1 218 00:10:31,940 --> 00:10:34,100 plus f of i minus 2. 219 00:10:34,100 --> 00:10:37,940 So in other words, recursively compute those Fibonacci numbers 220 00:10:37,940 --> 00:10:39,000 then add them together. 221 00:10:39,000 --> 00:10:41,130 That's an algorithm. 222 00:10:41,130 --> 00:10:44,300 Next is T for topological order. 223 00:10:48,530 --> 00:10:51,290 Here, of course, we just want to compute 224 00:10:51,290 --> 00:10:57,920 these in order of increasing i from the base cases up. 225 00:10:57,920 --> 00:11:01,880 Another way I like to write this is as a for loop for i 226 00:11:01,880 --> 00:11:04,910 equals 1 to n. 227 00:11:04,910 --> 00:11:06,740 We will see why. 228 00:11:06,740 --> 00:11:10,805 But this gives an explicit order to compute these sub problems. 229 00:11:13,460 --> 00:11:21,460 And the base case is just the same as the Fibonacci numbers, 230 00:11:21,460 --> 00:11:23,210 but I guess I should write in parentheses. 231 00:11:25,960 --> 00:11:30,460 The original problem we want to solve is f of n. 
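Written as code, the SRTBOT description so far (subproblems, relation, base case, original problem) might look like the following. This is just a direct transcription of the recurrence under the stated base case f(1) = f(2) = 1, with no memoization yet; the name `F` follows the notation on the board.

```python
def F(i):
    """Return the i-th Fibonacci number for i >= 1, by plain recursion."""
    if i <= 2:                     # base cases: F(1) = F(2) = 1
        return 1
    return F(i - 1) + F(i - 2)     # relation: one addition per call


def fib(n):
    """Original problem: F(n)."""
    return F(n)
```

This is correct, but it is only practical for small n, for the reason analyzed next.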
232 00:11:30,460 --> 00:11:32,890 And the time-- all right, here's where 233 00:11:32,890 --> 00:11:35,870 things get interesting or bad. 234 00:11:35,870 --> 00:11:39,580 So what is the running time of this recursive algorithm? 235 00:11:39,580 --> 00:11:42,520 As I've stated it so far, the running time 236 00:11:42,520 --> 00:11:46,270 is given by a recurrence. 237 00:11:46,270 --> 00:11:47,650 Let's write the recurrence. 238 00:11:47,650 --> 00:11:52,450 So in order to compute f of n, I recursively 239 00:11:52,450 --> 00:11:58,580 compute f of i minus 1 or f of n minus 1 here. 240 00:11:58,580 --> 00:12:03,760 And I recursively compute f of n minus 2. 241 00:12:03,760 --> 00:12:05,500 So that will take t of n minus 2. 242 00:12:05,500 --> 00:12:07,930 This first step will take t of n minus 1. 243 00:12:07,930 --> 00:12:11,510 And now I need to solve this recurrence. 244 00:12:11,510 --> 00:12:14,380 This is not a recurrence that falls to the master method. 245 00:12:14,380 --> 00:12:17,710 It doesn't have a divided by. 246 00:12:17,710 --> 00:12:19,540 So we have to think about it a little bit. 247 00:12:19,540 --> 00:12:20,998 But we don't have to think about it 248 00:12:20,998 --> 00:12:23,110 too hard, because this recurrence is 249 00:12:23,110 --> 00:12:25,150 the same as this recurrence, which 250 00:12:25,150 --> 00:12:26,470 is the same as this recurrence. 251 00:12:26,470 --> 00:12:27,880 I've written it three times now. 252 00:12:27,880 --> 00:12:32,400 And so the solution to this is the nth Fibonacci number. 253 00:12:32,400 --> 00:12:33,270 Oh, sorry. 254 00:12:33,270 --> 00:12:35,790 It's a little bit worse because in addition 255 00:12:35,790 --> 00:12:38,610 to those recursions, I also spend constant time 256 00:12:38,610 --> 00:12:40,920 to do the addition, maybe more than constant time. 257 00:12:40,920 --> 00:12:44,850 But if we just count the number of additions we do, 258 00:12:44,850 --> 00:12:51,360 it will be plus 1 additions. 
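One way to get a feel for the recurrence T(n) = T(n - 1) + T(n - 2) + 1 is to evaluate it directly. The sketch below counts the number of calls the plain recursive algorithm makes, assuming each base case counts as one call; the function name `count_calls` is made up for this illustration.

```python
def count_calls(n):
    """Number of calls made by the plain recursive Fibonacci on input n >= 1."""
    if n <= 2:
        return 1                   # a base case is a single call
    # One call for this node, plus all calls in the two recursive subtrees.
    return count_calls(n - 1) + count_calls(n - 2) + 1


if __name__ == "__main__":
    for n in (5, 10, 20, 25):
        print(n, count_calls(n))
```

The counts grow by roughly a constant factor each time n goes up by one, which is the exponential behavior discussed next.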
259 00:12:53,620 --> 00:12:54,120 OK. 260 00:12:54,120 --> 00:12:58,330 But this is bigger than the nth Fibonacci number. 261 00:12:58,330 --> 00:13:01,320 And if you know anything about Fibonacci numbers, 262 00:13:01,320 --> 00:13:03,850 they grow exponentially. 263 00:13:03,850 --> 00:13:06,210 They're about golden ratio to the n. 264 00:13:06,210 --> 00:13:09,570 I'm wearing golden ratio, in case you forgot the number. 265 00:13:09,570 --> 00:13:13,410 So that's bad, because golden ratio is bigger than 1. 266 00:13:13,410 --> 00:13:16,320 So this is exponential growth, as we know, especially 267 00:13:16,320 --> 00:13:18,210 in this time, exponential growth is bad. 268 00:13:18,210 --> 00:13:19,740 In algorithms, exponential growth 269 00:13:19,740 --> 00:13:22,590 is bad, because we can only solve very small problems 270 00:13:22,590 --> 00:13:23,760 with exponential growth. 271 00:13:23,760 --> 00:13:25,020 Very small n. 272 00:13:25,020 --> 00:13:28,530 So this is a terrible way to compute the nth Fibonacci 273 00:13:28,530 --> 00:13:29,460 number-- 274 00:13:29,460 --> 00:13:40,260 exponential bad. 275 00:13:40,260 --> 00:13:43,440 OK, so don't do this. 276 00:13:43,440 --> 00:13:47,190 But there's a very tiny tweak to this algorithm 277 00:13:47,190 --> 00:13:53,280 that makes it really good, which is memoization. 278 00:13:53,280 --> 00:13:55,330 And this is a big idea. 279 00:13:55,330 --> 00:13:59,884 It is the big idea of dynamic programming. 280 00:14:02,670 --> 00:14:07,710 It's a funny word, probably made up by computer scientists. 281 00:14:07,710 --> 00:14:11,940 Instead of memorization, it's memoization, 282 00:14:11,940 --> 00:14:14,775 because we're going to write things down in a memo pad. 283 00:14:14,775 --> 00:14:16,530 It's the idea. 284 00:14:16,530 --> 00:14:18,000 And it's a very simple idea, which 285 00:14:18,000 --> 00:14:24,286 is just remember and reuse solutions to sub-problems. 
286 00:14:35,710 --> 00:14:42,150 So let's draw the recursion tree for this recursive algorithm 287 00:14:42,150 --> 00:14:43,720 as we've done it so far. 288 00:14:43,720 --> 00:14:48,960 So at the top, we-- let me make a little bit of space. 289 00:14:48,960 --> 00:14:53,610 At the top we are calling f of n. 290 00:14:53,610 --> 00:14:59,100 And then that calls f of n minus 1 and f of n minus 2. 291 00:14:59,100 --> 00:15:00,930 And it does an addition up here. 292 00:15:00,930 --> 00:15:04,590 And then this calls f of n minus 2. 293 00:15:04,590 --> 00:15:08,280 And this calls f of n minus 3. 294 00:15:08,280 --> 00:15:11,370 This calls f of n minus 3. 295 00:15:11,370 --> 00:15:15,600 And this calls f of n minus 4. 296 00:15:15,600 --> 00:15:16,770 OK. 297 00:15:16,770 --> 00:15:23,710 And we notice that this sub problem is 298 00:15:23,710 --> 00:15:25,940 the same as this sub problem. 299 00:15:25,940 --> 00:15:28,600 So to compute f of n minus 1, I need f of n minus 3. 300 00:15:28,600 --> 00:15:32,020 And also to compute f of n minus 2 I need f of n minus 3. 301 00:15:32,020 --> 00:15:33,700 So why are we computing it twice? 302 00:15:33,700 --> 00:15:36,100 Let's just do it once. 303 00:15:36,100 --> 00:15:39,460 When we solve it, let's write it in a table somewhere. 304 00:15:39,460 --> 00:15:42,250 And then when we need it again, we'll just reuse that value. 305 00:15:42,250 --> 00:15:42,953 Question? 306 00:15:42,953 --> 00:15:44,620 AUDIENCE: What about the f of n minus 2? 307 00:15:44,620 --> 00:15:46,510 ERIK DEMAINE: f of n minus 2 is also shared. 308 00:15:46,510 --> 00:15:49,450 So let me use a different symbol. 309 00:15:49,450 --> 00:15:53,050 f of n minus 2 is already here. 310 00:15:53,050 --> 00:15:54,640 So this was at the same level. 311 00:15:54,640 --> 00:15:57,440 But we also get shared reuse between different levels. 
312 00:15:57,440 --> 00:15:59,740 In fact, I wouldn't even call f of n minus 3 313 00:15:59,740 --> 00:16:02,290 because this whole part doesn't need 314 00:16:02,290 --> 00:16:03,580 to be computed a second time. 315 00:16:03,580 --> 00:16:05,458 If I already computed it here, it 316 00:16:05,458 --> 00:16:07,000 doesn't matter which one comes first. 317 00:16:07,000 --> 00:16:08,440 Let's say this one comes first. 318 00:16:08,440 --> 00:16:11,830 Once this is done, I can write it down and reuse it over here. 319 00:16:14,430 --> 00:16:18,000 And then in here, we're going to call f of n minus 3. 320 00:16:18,000 --> 00:16:21,420 So there's still another computation of f of n minus 3. 321 00:16:21,420 --> 00:16:25,770 When that one's done, I won't need to do this recursively. 322 00:16:25,770 --> 00:16:28,350 OK, so magically this is going to make 323 00:16:28,350 --> 00:16:32,070 this algorithm efficient with this very simple tweak. 324 00:16:32,070 --> 00:16:35,130 Let me write down the tweak more explicitly. 325 00:16:35,130 --> 00:16:36,510 I won't write code here. 326 00:16:36,510 --> 00:16:41,920 But just describe it as a data structure. 327 00:16:41,920 --> 00:16:47,920 So we're going to maintain our good friend, the dictionary, 328 00:16:47,920 --> 00:16:52,320 which is an abstract data type or interface. 329 00:16:52,320 --> 00:16:54,870 We could use different data structures to do it. 330 00:16:54,870 --> 00:16:57,210 But we're going to map some problems 331 00:16:57,210 --> 00:17:01,830 to their solutions, at least the ones that we've solved already. 332 00:17:04,500 --> 00:17:07,470 And usually we can do this with just a direct access 333 00:17:07,470 --> 00:17:09,480 array, though you could use a hash table. 334 00:17:09,480 --> 00:17:12,119 Just get expected bounds. 
335 00:17:12,119 --> 00:17:23,440 So when we write the code for our recursive function-- 336 00:17:23,440 --> 00:17:26,579 so in general, once we have a SRTBOT description, 337 00:17:26,579 --> 00:17:28,109 we can turn this into code. 338 00:17:28,109 --> 00:17:30,630 We define f of i. 339 00:17:30,630 --> 00:17:32,910 And it says am I in a base case? 340 00:17:32,910 --> 00:17:34,270 If so, return this. 341 00:17:34,270 --> 00:17:36,780 Otherwise, do this recursive call. 342 00:17:36,780 --> 00:17:38,280 That's our recursive algorithm. 343 00:17:38,280 --> 00:17:39,947 But we're going to do a little more now. 344 00:17:39,947 --> 00:17:44,940 And first we're going to check whether this sub 345 00:17:44,940 --> 00:17:49,570 problem that we're trying to solve has already been solved. 346 00:17:49,570 --> 00:17:55,740 And if so, we return that stored solution. 347 00:17:55,740 --> 00:17:59,205 That's the easy case, but it might not exist. 348 00:18:10,960 --> 00:18:14,290 And then we'll compute it in the usual way. 349 00:18:14,290 --> 00:18:19,170 So what the code then would look like to define f of i 350 00:18:19,170 --> 00:18:22,960 is first we check is i in our data structure. 351 00:18:22,960 --> 00:18:25,500 This is usually called the memo. 352 00:18:28,570 --> 00:18:33,350 So we say, is this sub-problem-- is i in my memo data structure? 353 00:18:33,350 --> 00:18:34,930 If so just return memo of i. 354 00:18:34,930 --> 00:18:35,540 Done. 355 00:18:35,540 --> 00:18:36,950 No recursion necessary. 356 00:18:36,950 --> 00:18:39,220 Otherwise, check if I'm a base case. 357 00:18:39,220 --> 00:18:40,600 If so, done. 358 00:18:40,600 --> 00:18:42,640 Otherwise, recurse. 359 00:18:42,640 --> 00:18:46,120 So recursively call f of i minus 1 and f of i minus 2. 
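The steps just described (check the memo, then the base case, then recurse and record the answer) can be sketched as below. This is an illustrative sketch rather than code from the lecture: a Python `dict` stands in for the memo dictionary, and `F_bottom_up` is an added variant, assumed here, that visits the same subproblems in the explicit topological order i = 1 to n.

```python
memo = {}  # maps subproblem i -> F(i), for the subproblems solved so far


def F(i):
    """Memoized i-th Fibonacci number for i >= 1."""
    if i in memo:                      # already solved: no recursion necessary
        return memo[i]
    if i <= 2:                         # base cases: F(1) = F(2) = 1
        result = 1
    else:                              # relation: a single addition
        result = F(i - 1) + F(i - 2)
    memo[i] = result                   # remember the solution for reuse
    return result


def F_bottom_up(n):
    """Same subproblems, solved in increasing order of i with a for loop (n >= 1)."""
    table = {}
    for i in range(1, n + 1):
        table[i] = 1 if i <= 2 else table[i - 1] + table[i - 2]
    return table[n]
```

One practical caveat: the recursive version goes about n levels deep, so in Python a very large n can exceed the interpreter's recursion limit; the loop version avoids that.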
360 00:18:46,120 --> 00:18:48,070 And in this recursion, we can see 361 00:18:48,070 --> 00:18:50,750 that after we call f of i minus 1, in fact, 362 00:18:50,750 --> 00:18:52,750 it will have already computed f of i minus 2. 363 00:18:52,750 --> 00:18:55,030 So while this call is recursive, this one 364 00:18:55,030 --> 00:18:57,640 will immediately terminate because i minus 2 365 00:18:57,640 --> 00:18:59,690 will already be in the memo table. 366 00:18:59,690 --> 00:19:03,020 And so if you think about what happens, in fact, 367 00:19:03,020 --> 00:19:07,643 we'll just have recursion down the left branch of this thing. 368 00:19:07,643 --> 00:19:09,310 And all the right branches will be free. 369 00:19:09,310 --> 00:19:12,080 We can just look things up in the memo table. 370 00:19:12,080 --> 00:19:14,380 So what is the overall running time? 371 00:19:14,380 --> 00:19:25,460 For Fibonacci, this should be order n. 372 00:19:25,460 --> 00:19:27,560 Why is it order n? 373 00:19:27,560 --> 00:19:28,880 This is number of additions. 374 00:19:31,850 --> 00:19:33,110 Come back to that in a second. 375 00:19:36,230 --> 00:19:39,320 In general, the way to analyze an algorithm 376 00:19:39,320 --> 00:19:41,150 like this that uses memoization is we just 377 00:19:41,150 --> 00:19:43,707 count how many different sub-problems are there? 378 00:19:43,707 --> 00:19:45,290 Because once we solve the sub-problem, 379 00:19:45,290 --> 00:19:46,610 we will never solve it again. 380 00:19:46,610 --> 00:19:48,300 That's the whole idea of a memo table. 381 00:19:48,300 --> 00:19:51,570 So we will solve each sub-problem at most once. 382 00:19:51,570 --> 00:19:54,080 And so we just need to count, how much time does it take 383 00:19:54,080 --> 00:19:55,950 to solve every sub-problem? 384 00:19:55,950 --> 00:19:59,000 And here you can see it's constant. 
385 00:19:59,000 --> 00:20:01,520 Either it's a base case and it takes constant time 386 00:20:01,520 --> 00:20:04,988 or we recursively call these things. 387 00:20:04,988 --> 00:20:06,530 But those are different sub-problems. 388 00:20:06,530 --> 00:20:08,350 So we're going to count those later. 389 00:20:08,350 --> 00:20:09,725 And then the work that's actually 390 00:20:09,725 --> 00:20:12,140 done by this recurrence is a single addition. 391 00:20:12,140 --> 00:20:14,450 So in fact, it's n additions. 392 00:20:14,450 --> 00:20:20,180 To compute fn would be exactly n additions. 393 00:20:20,180 --> 00:20:24,570 So it turns out to be very nice closed form in this case. 394 00:20:24,570 --> 00:20:29,840 It should be exactly n sub problems to compute f of n 395 00:20:29,840 --> 00:20:32,900 because we start at 1. 396 00:20:32,900 --> 00:20:35,940 And each one has one addition-- 397 00:20:35,940 --> 00:20:37,110 I guess not the base case. 398 00:20:37,110 --> 00:20:39,910 Maybe n minus 2. 399 00:20:39,910 --> 00:20:40,410 OK. 400 00:20:40,410 --> 00:20:43,050 Definitely order n. 401 00:20:43,050 --> 00:20:46,145 Now, there's this one subtlety which-- 402 00:20:46,145 --> 00:20:48,270 let's forget about dynamic programming for a moment 403 00:20:48,270 --> 00:20:51,090 and go back to good old lecture one and two, 404 00:20:51,090 --> 00:20:55,020 talking about the word RAM model of computation. 405 00:20:55,020 --> 00:20:58,800 A question here that usually doesn't matter in this class. 406 00:20:58,800 --> 00:21:01,540 Usually we assume additions take constant time. 407 00:21:01,540 --> 00:21:04,470 And we usually do that because it's usually true. 408 00:21:04,470 --> 00:21:09,750 And in general, our model is that w-bit additions-- 409 00:21:09,750 --> 00:21:12,690 where w is our machine word size-- 410 00:21:12,690 --> 00:21:13,860 take constant time. 
411 00:21:19,680 --> 00:21:22,050 But for this problem and this problem only, 412 00:21:22,050 --> 00:21:24,030 pretty much, for Fibonacci numbers, 413 00:21:24,030 --> 00:21:25,620 I happen to know that the Fibonacci 414 00:21:25,620 --> 00:21:27,040 numbers grow exponentially. 415 00:21:27,040 --> 00:21:32,730 So to write them down actually requires theta n bits 416 00:21:32,730 --> 00:21:35,580 because they are some constant to the n power. 417 00:21:35,580 --> 00:21:38,610 And so they're actually really big . 418 00:21:38,610 --> 00:21:40,650 n is probably bigger than w. 419 00:21:40,650 --> 00:21:44,570 Usually you think of problems that are much bigger than 64 420 00:21:44,570 --> 00:21:46,620 or whatever your word size happens to be. 421 00:21:46,620 --> 00:21:48,900 We do assume that w is at least log n. 422 00:21:48,900 --> 00:21:51,010 But n is probably bigger than w. 423 00:21:51,010 --> 00:21:52,260 It might be bigger or smaller. 424 00:21:52,260 --> 00:21:53,670 We don't know. 425 00:21:53,670 --> 00:21:57,570 And in general, to do an n bit addition-- 426 00:21:57,570 --> 00:22:01,590 these are n bit additions-- 427 00:22:01,590 --> 00:22:07,080 is going to take ceiling of n over w time. 428 00:22:07,080 --> 00:22:11,040 So in the end, we will spend this times n, 429 00:22:11,040 --> 00:22:12,780 because we have to do that, many of them, 430 00:22:12,780 --> 00:22:18,575 which is n plus n squared over w time. 431 00:22:18,575 --> 00:22:19,950 So a bit of a weird running time. 432 00:22:19,950 --> 00:22:23,970 But it's polynomial, whereas this original recursive 433 00:22:23,970 --> 00:22:26,880 algorithm was exponential here. 434 00:22:26,880 --> 00:22:28,950 Using this one simple idea of just remembering 435 00:22:28,950 --> 00:22:31,492 the work we've done, suddenly this exponential time algorithm 436 00:22:31,492 --> 00:22:32,400 becomes polynomial. 437 00:22:32,400 --> 00:22:33,030 Why? 
438 00:22:33,030 --> 00:22:35,730 Because we have few sub-problems. 439 00:22:35,730 --> 00:22:39,810 We had n sub-problems. 440 00:22:39,810 --> 00:22:42,660 And for each sub-problem, we could write a recurrence 441 00:22:42,660 --> 00:22:46,320 relation that if we already knew the solutions to smaller sub- 442 00:22:46,320 --> 00:22:48,900 problems, we could compute this bigger 443 00:22:48,900 --> 00:22:50,640 problem very efficiently. 444 00:22:50,640 --> 00:22:56,250 This happened to be constant time or constant additions. 445 00:22:56,250 --> 00:22:57,900 n over w time. 446 00:22:57,900 --> 00:22:59,610 But as long as this is polynomial 447 00:22:59,610 --> 00:23:02,370 and this is polynomial, we're happy, 448 00:23:02,370 --> 00:23:07,500 because we have this nice formula that the time it takes 449 00:23:07,500 --> 00:23:15,040 is, at most, the sum over all sub-problems of the relation 450 00:23:15,040 --> 00:23:15,540 time. 451 00:23:18,970 --> 00:23:23,490 So I'm referring to sub-problems, like a number of them 452 00:23:23,490 --> 00:23:26,640 and the time it takes to evaluate this, 453 00:23:26,640 --> 00:23:28,272 ignoring the recursive calls. 454 00:23:28,272 --> 00:23:28,980 That's important. 455 00:23:28,980 --> 00:23:33,570 This is the non-recursive part. 456 00:23:38,140 --> 00:23:40,870 In the notes, I call this non-recursive work. 457 00:23:45,700 --> 00:23:49,120 So this formula gives us a way to bound 458 00:23:49,120 --> 00:23:52,240 the running time of one of these algorithms 459 00:23:52,240 --> 00:23:54,100 if we use memoization. 460 00:23:54,100 --> 00:23:55,750 Without memoization, this is not true: 461 00:23:55,750 --> 00:23:57,730 Fibonacci took exponential time. 462 00:23:57,730 --> 00:23:59,980 But if we add memoization, we know that we only 463 00:23:59,980 --> 00:24:01,640 solve each sub-problem once.
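As a rough illustration of the counting argument just given, here is a minimal Python sketch of Fibonacci with a memo table that also counts additions. The function name fib_memo, the counter, and the base cases F(1) = F(2) = 1 are assumptions made for this sketch, not something written on the board.

```python
def fib_memo(n):
    """Return (F(n), number of additions), assuming F(1) = F(2) = 1."""
    memo = {}          # sub-problem i -> F(i)
    additions = 0      # non-recursive work, counted across all sub-problems

    def f(i):
        nonlocal additions
        if i in memo:              # already solved: reuse the stored answer
            return memo[i]
        if i <= 2:                 # base case, no addition needed
            result = 1
        else:                      # relation: one addition per sub-problem
            result = f(i - 1) + f(i - 2)
            additions += 1
        memo[i] = result
        return result

    return f(n), additions


# Each non-base sub-problem is solved once and costs one addition.
value, adds = fib_memo(10)   # value == 55, adds == 8, that is n - 2
```

The addition count grows linearly in n, which is the "sum over sub-problems of non-recursive work" bound in action. Python integers are arbitrary precision, so each addition on large values really does cost more than constant time, in line with the n over w remark.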
464 00:24:01,640 --> 00:24:03,670 And so we just need to see, for each one, 465 00:24:03,670 --> 00:24:05,500 how much did it cost me to compute it, 466 00:24:05,500 --> 00:24:07,360 assuming all the recursion work is free, 467 00:24:07,360 --> 00:24:11,178 because that's already taken into account by the summation. 468 00:24:11,178 --> 00:24:12,970 So in particular, this summation is at most 469 00:24:12,970 --> 00:24:15,730 the number of sub-problems times the time per sub-problem, 470 00:24:15,730 --> 00:24:17,500 which in this case was order n. 471 00:24:17,500 --> 00:24:20,260 We could try to apply that analysis to merge sort, 472 00:24:20,260 --> 00:24:23,830 because after all, this is also a recursive algorithm. 473 00:24:23,830 --> 00:24:25,990 It happens to not benefit from memoization. 474 00:24:25,990 --> 00:24:27,520 But we could throw in memoization. 475 00:24:27,520 --> 00:24:29,230 It wouldn't hurt us. 476 00:24:29,230 --> 00:24:31,940 But if you think about the call graph here, 477 00:24:31,940 --> 00:24:38,500 which is like S of 0, n, which calls S of-- 478 00:24:38,500 --> 00:24:45,610 0, n over 2 and S of n over 2, n and so on. 479 00:24:45,610 --> 00:24:47,890 It has the same picture, but there's actually 480 00:24:47,890 --> 00:24:49,660 no common substructure here. 481 00:24:49,660 --> 00:24:51,520 You'll never see a repeated sub-problem, 482 00:24:51,520 --> 00:24:55,563 because this range is completely disjoint from this range. 483 00:24:55,563 --> 00:24:56,980 But you could throw in memoization 484 00:24:56,980 --> 00:24:58,397 and try to analyze in the same way 485 00:24:58,397 --> 00:25:01,420 and say, well, how many sub-problems are there? 486 00:25:01,420 --> 00:25:07,570 It looks like there's n choices for i and not quite n choices for j, 487 00:25:07,570 --> 00:25:11,600 but it's at most n squared different choices.
488 00:25:11,600 --> 00:25:16,755 In fact, it's the triangular number, sum of i equals 1 to n 489 00:25:16,755 --> 00:25:20,290 of i, different possible choices for i and j. 490 00:25:20,290 --> 00:25:24,780 But this is theta n squared sub-problems, 491 00:25:24,780 --> 00:25:27,530 which seems not so good. 492 00:25:27,530 --> 00:25:29,960 And then how much time are we spending per sub-problem? 493 00:25:29,960 --> 00:25:33,850 Well, to solve S of i, j, we have to merge 494 00:25:33,850 --> 00:25:35,140 about that many elements. 495 00:25:35,140 --> 00:25:36,860 We know merge takes linear time. 496 00:25:36,860 --> 00:25:43,420 And so this takes theta of j minus i time to evaluate. 497 00:25:43,420 --> 00:25:46,510 And so what we'd like to do is sum over all the sub- 498 00:25:46,510 --> 00:25:48,160 problems of j minus i. 499 00:25:48,160 --> 00:25:51,130 This is not the triangular number 500 00:25:51,130 --> 00:25:53,770 but the tetrahedral number, I guess. 501 00:25:53,770 --> 00:25:56,080 And so we end up that the running time 502 00:25:56,080 --> 00:25:58,045 is, at most, n cubed. 503 00:26:00,730 --> 00:26:02,020 Great. 504 00:26:02,020 --> 00:26:05,950 So it's true that n log n is less than or equal to n cubed, 505 00:26:05,950 --> 00:26:07,620 but obviously not terribly useful. 506 00:26:07,620 --> 00:26:11,200 This algorithm, by the way we already know how to analyze it, 507 00:26:11,200 --> 00:26:12,490 is, indeed, n log n. 508 00:26:12,490 --> 00:26:17,060 And the running time turns out to be theta n log n. 509 00:26:17,060 --> 00:26:20,960 So sometimes this equation is not what you want to use. 510 00:26:20,960 --> 00:26:22,277 But often it's good enough. 511 00:26:22,277 --> 00:26:23,860 And especially if you just want to get 512 00:26:23,860 --> 00:26:25,360 a polynomial upper bound, then you 513 00:26:25,360 --> 00:26:27,190 can try to optimize it later.
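As a quick sanity check on the counting above, the following Python sketch enumerates every candidate merge sort sub-problem S(i, j) and adds up j minus i. It assumes sub-problems are index pairs with 0 <= i < j <= n; the function name is hypothetical and this is only a brute-force count, not part of merge sort itself.

```python
def count_merge_sort_subproblems(n):
    """Count pairs (i, j) with 0 <= i < j <= n and total their lengths j - i."""
    count = 0        # number of candidate sub-problems S(i, j)
    total_work = 0   # sum of the non-recursive (merge) work, taken as j - i
    for i in range(n):
        for j in range(i + 1, n + 1):
            count += 1
            total_work += j - i
    return count, total_work


# For n = 4: 10 sub-problems (triangular number) and total work 20 (tetrahedral number).
count, work = count_merge_sort_subproblems(4)
```

Under these assumptions the count matches n(n+1)/2 and the total matches n(n+1)(n+2)/6, which is theta of n cubed: a valid but loose upper bound compared with the true theta of n log n.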
514 00:26:27,190 --> 00:26:28,750 This will give you a polynomial upper 515 00:26:28,750 --> 00:26:31,125 bound as long as the number of sub-problems is polynomial 516 00:26:31,125 --> 00:26:34,000 and the time per sub-problem is polynomial. 517 00:26:34,000 --> 00:26:35,810 And indeed, n cubed is polynomial. 518 00:26:35,810 --> 00:26:39,550 It's not a great polynomial, but this is an alternate way 519 00:26:39,550 --> 00:26:40,970 to analyze merge sort. 520 00:26:40,970 --> 00:26:43,000 Obviously don't do this for merge sort. 521 00:26:43,000 --> 00:26:47,586 But it illustrates the technique. 522 00:26:47,586 --> 00:26:49,220 Good so far? 523 00:26:49,220 --> 00:26:51,230 Any questions? 524 00:26:51,230 --> 00:26:52,760 All right. 525 00:26:52,760 --> 00:26:57,680 Let me remember where we are. 526 00:26:57,680 --> 00:26:58,190 Cool. 527 00:26:58,190 --> 00:27:02,420 So the next thing I'd like to do is show you one more algorithm 528 00:27:02,420 --> 00:27:04,640 that we've already seen in this class that fits very 529 00:27:04,640 --> 00:27:07,430 nicely into this structure-- 530 00:27:07,430 --> 00:27:09,620 arguably is a dynamic program-- 531 00:27:09,620 --> 00:27:13,050 and that is DAG shortest paths. 532 00:27:13,050 --> 00:27:17,210 So just to close the loop here, when I say dynamic programming, 533 00:27:17,210 --> 00:27:21,980 I mean recursion with memoization. 534 00:27:21,980 --> 00:27:24,110 I mean, we take-- 535 00:27:24,110 --> 00:27:28,250 we write a recursive piece of code, 536 00:27:28,250 --> 00:27:32,510 which is like def f of some args, 537 00:27:32,510 --> 00:27:38,150 some sub-problem specification. 538 00:27:38,150 --> 00:27:44,870 We check is the problem in the memo table? 539 00:27:44,870 --> 00:27:52,379 If so, return memo of sub-problem. 540 00:27:56,510 --> 00:28:01,460 And otherwise check if it's a base case 541 00:28:01,460 --> 00:28:03,020 and solve it if it's a base case. 
542 00:28:03,020 --> 00:28:09,660 And otherwise, recurse 543 00:28:09,660 --> 00:28:11,240 via the relation. 544 00:28:14,720 --> 00:28:20,600 And set the memo table of the sub-problem 545 00:28:20,600 --> 00:28:23,270 to be one of those things. 546 00:28:23,270 --> 00:28:26,300 OK, so this is the generic dynamic program. 547 00:28:26,300 --> 00:28:32,090 And implicitly, I'm writing Fibonacci in that way. 548 00:28:32,090 --> 00:28:35,360 And all of the dynamic programs have this implicit structure 549 00:28:35,360 --> 00:28:41,060 where I start with a memo table which is empty 550 00:28:41,060 --> 00:28:44,390 and I always just check if I'm in the memo table. 551 00:28:44,390 --> 00:28:45,470 If I am, I return it. 552 00:28:45,470 --> 00:28:50,570 Otherwise I compute according to this recursive relation 553 00:28:50,570 --> 00:28:54,520 by recursively calling f. 554 00:28:54,520 --> 00:28:55,210 And that's it. 555 00:28:55,210 --> 00:29:00,450 So every DP algorithm is going to have that structure. 556 00:29:00,450 --> 00:29:04,860 And it's just using recursion and memoization together. 557 00:29:04,860 --> 00:29:06,930 OK, so now let's apply that technique 558 00:29:06,930 --> 00:29:10,020 to think about the DAG shortest paths problem. 559 00:29:10,020 --> 00:29:12,780 The problem was, I give you a DAG. 560 00:29:12,780 --> 00:29:15,030 I give you a source vertex, s-- 561 00:29:15,030 --> 00:29:16,470 single source shortest paths. 562 00:29:16,470 --> 00:29:20,490 Compute the shortest path weight from s to every vertex. 563 00:29:20,490 --> 00:29:22,350 That's the goal of the problem. 564 00:29:22,350 --> 00:29:24,780 And we saw a way to solve that, which is DAG relaxation. 565 00:29:24,780 --> 00:29:27,030 I'm going to show you a different way, which turns out 566 00:29:27,030 --> 00:29:31,350 to be basically the same, but upside down, or flipped left 567 00:29:31,350 --> 00:29:35,650 right, depending which way you direct your edges.
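The generic dynamic program described a moment ago (check the memo table, handle the base case, otherwise recurse via the relation and store the answer) could be sketched in Python roughly as follows. The helper name make_dp and its three callback parameters are assumptions introduced for this sketch; the lecture only describes the structure in words.

```python
def make_dp(is_base, solve_base, relate):
    """Build a memoized recursive solver f from the pieces of a dynamic program.

    is_base(sub)    -> True if sub is a base case
    solve_base(sub) -> answer for a base case
    relate(sub, f)  -> answer for sub, computed from recursive calls to f
    """
    memo = {}  # starts empty; maps sub-problem -> solution

    def f(sub):
        if sub in memo:            # already solved: return the stored answer
            return memo[sub]
        if is_base(sub):           # base case
            result = solve_base(sub)
        else:                      # recurse via the relation
            result = relate(sub, f)
        memo[sub] = result         # remember the answer for next time
        return result

    return f


# Fibonacci written in this form, assuming base cases F(1) = F(2) = 1.
fib = make_dp(lambda i: i <= 2,
              lambda i: 1,
              lambda i, f: f(i - 1) + f(i - 2))
```

Sub-problems here must be hashable so they can serve as dictionary keys, which is one reason to describe them with small tuples or integers.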
568 00:29:35,650 --> 00:29:37,885 So what are our sub-problems? 569 00:29:37,885 --> 00:29:40,260 Well, here, actually, they're kind of spelled out for us. 570 00:29:40,260 --> 00:29:42,630 We want to compute delta of s, v for all v's. 571 00:29:42,630 --> 00:29:47,220 So that is size-of-V sub-problems. 572 00:29:47,220 --> 00:29:51,872 That turns out to be enough for this overall problem. 573 00:29:51,872 --> 00:29:53,580 And the original problem we want to solve 574 00:29:53,580 --> 00:29:55,200 is all of the sub-problems. 575 00:29:55,200 --> 00:29:57,000 We solve all the sub-problems, we're done. 576 00:29:57,000 --> 00:29:59,001 And then we have-- 577 00:29:59,001 --> 00:30:02,100 I think we wrote this at some point during the DAG 578 00:30:02,100 --> 00:30:03,730 shortest paths lecture-- 579 00:30:03,730 --> 00:30:05,610 we have a recursive relation saying 580 00:30:05,610 --> 00:30:09,150 that the shortest way to get from s to v 581 00:30:09,150 --> 00:30:12,420 is the minimum of the shortest path 582 00:30:12,420 --> 00:30:16,230 to get to some vertex u plus the weight of the edge from u to v. 583 00:30:16,230 --> 00:30:16,920 Why? 584 00:30:16,920 --> 00:30:21,180 Because if we look at a vertex v, unless we started there, 585 00:30:21,180 --> 00:30:23,250 we came from somewhere. 586 00:30:23,250 --> 00:30:27,720 And so we can consider all of the possible choices 587 00:30:27,720 --> 00:30:30,030 for the previous vertex u. 588 00:30:30,030 --> 00:30:32,940 And if you start at s and get to v, 589 00:30:32,940 --> 00:30:34,990 you must go through one of them. 590 00:30:34,990 --> 00:30:39,317 And so this is finding the best way among all the choices of u. 591 00:30:39,317 --> 00:30:40,650 What's the best way to get to u? 592 00:30:40,650 --> 00:30:44,400 And then take the edge from u to v, for all edges u, v. 593 00:30:44,400 --> 00:30:46,680 And this is adjacency minus. 594 00:30:46,680 --> 00:30:48,420 We don't usually think of that.
595 00:30:48,420 --> 00:30:51,040 Usually we look at adjacency plus, the outgoing edges. 596 00:30:51,040 --> 00:30:52,890 This is the incoming edges. 597 00:30:52,890 --> 00:30:55,650 And so u is an incoming-- 598 00:30:55,650 --> 00:30:59,940 u, v is an incoming edge into v. OK, if we take that minimum-- 599 00:30:59,940 --> 00:31:03,240 and of course, possibly there is no way to get to v. 600 00:31:03,240 --> 00:31:06,000 And so I'll also throw infinity into the set. 601 00:31:06,000 --> 00:31:07,200 Take the min of that set. 602 00:31:07,200 --> 00:31:09,490 That will give me the shortest-path weight 603 00:31:09,490 --> 00:31:13,740 in an acyclic graph from s to v. And great, this is recursive. 604 00:31:13,740 --> 00:31:14,970 This was a sub-problem. 605 00:31:14,970 --> 00:31:19,613 These are sub-problems which are smaller, I guess. 606 00:31:19,613 --> 00:31:21,030 There's no clear notion of smaller 607 00:31:21,030 --> 00:31:25,410 here, except we already know the clear notion of smaller 608 00:31:25,410 --> 00:31:30,608 is the topological order of our DAG. 609 00:31:30,608 --> 00:31:32,150 Because our graph is acyclic, we know 610 00:31:32,150 --> 00:31:33,275 it has a topological order. 611 00:31:33,275 --> 00:31:36,380 We know how to compute it with DFS. 612 00:31:36,380 --> 00:31:39,680 And so that guarantees there's a topological order 613 00:31:39,680 --> 00:31:41,850 to compute these problems. 614 00:31:41,850 --> 00:31:46,880 And in fact, the relationship between problems 615 00:31:46,880 --> 00:31:50,810 is exactly the given graph, G. In order 616 00:31:50,810 --> 00:31:54,290 to compute the shortest-path weight from s to v, 617 00:31:54,290 --> 00:31:56,180 I need to know the shortest-path weight from s 618 00:31:56,180 --> 00:31:58,610 to all of the incoming vertices to v.
619 00:31:58,610 --> 00:32:02,210 And so this is I guess in the call graph, 620 00:32:02,210 --> 00:32:07,670 this vertex calls this vertex, but direct the edge this way 621 00:32:07,670 --> 00:32:11,900 to say that this vertex requires-- 622 00:32:11,900 --> 00:32:14,840 this vertex needs to be computed before this one. 623 00:32:14,840 --> 00:32:18,780 And so then I can compute them in a topological order. 624 00:32:18,780 --> 00:32:22,770 OK, we have a base case, which is delta of s, s equals 0. 625 00:32:22,770 --> 00:32:26,910 And the running time is, again, we can use this formula 626 00:32:26,910 --> 00:32:29,910 and say, let's just sum over all the sub-problems of the non- 627 00:32:29,910 --> 00:32:32,340 recursive work in our recurrence relation 628 00:32:32,340 --> 00:32:34,710 and so it's computing this min. 629 00:32:34,710 --> 00:32:38,640 If I gave you these deltas for free 630 00:32:38,640 --> 00:32:41,640 and I gave you these weights, which we know from our weight 631 00:32:41,640 --> 00:32:44,430 data structure, how long does it take to compute this min? 632 00:32:44,430 --> 00:32:46,320 Well, however many things there are, however 633 00:32:46,320 --> 00:32:48,480 many numbers we're minning, which 634 00:32:48,480 --> 00:32:52,500 is the size of the incoming adjacency list plus 1 635 00:32:52,500 --> 00:32:54,430 for that infinity. 636 00:32:54,430 --> 00:32:57,450 And so if you compute this sum, sum of incoming edges 637 00:32:57,450 --> 00:33:00,150 to every vertex, that's all the edges. 638 00:33:00,150 --> 00:33:03,050 So this is V plus E. 639 00:33:03,050 --> 00:33:09,332 So in fact, this algorithm is morally the same algorithm 640 00:33:09,332 --> 00:33:11,290 as the one that we saw in the DAG shortest paths 641 00:33:11,290 --> 00:33:18,070 lecture, which was compute a topological order and process 642 00:33:18,070 --> 00:33:23,840 vertices in that order and relax edges going out from vertices.
643 00:33:23,840 --> 00:33:26,860 So here-- so in that algorithm, we 644 00:33:26,860 --> 00:33:30,310 would have tried to relax this edge if there was a better 645 00:33:30,310 --> 00:33:32,170 path to v. And the first one certainly 646 00:33:32,170 --> 00:33:33,560 is better than infinity. 647 00:33:33,560 --> 00:33:36,850 So the first one we relax indeed. 648 00:33:36,850 --> 00:33:39,910 The next edge, if this gave a better path from s to v, 649 00:33:39,910 --> 00:33:42,400 then we would relax that edge and update the weight here 650 00:33:42,400 --> 00:33:43,660 and do the same here. 651 00:33:43,660 --> 00:33:46,870 In the end, we're just computing this min in the relaxation 652 00:33:46,870 --> 00:33:48,490 algorithm but doing it step by step. 653 00:33:48,490 --> 00:33:51,550 In the relaxation algorithm, DAG relaxation, 654 00:33:51,550 --> 00:33:58,920 for each incoming edge to v, we update d of v if it's better. 655 00:33:58,920 --> 00:34:02,000 And so if you repeatedly update if you're better, 656 00:34:02,000 --> 00:34:04,390 that ends up computing a min. 657 00:34:04,390 --> 00:34:06,430 OK, so this is the same algorithm 658 00:34:06,430 --> 00:34:08,980 just kind of flipped backwards. 659 00:34:08,980 --> 00:34:11,199 A funny thing: although we wrote down that 660 00:34:11,199 --> 00:34:15,219 the topological order of the sub-problem graph 661 00:34:15,219 --> 00:34:16,870 here is the topological order of G, 662 00:34:16,870 --> 00:34:22,030 because the sub-problem graph is G, 663 00:34:22,030 --> 00:34:24,639 the algorithm doesn't actually have to compute one. 664 00:34:24,639 --> 00:34:27,010 It's doing it automatically for free. 665 00:34:27,010 --> 00:34:31,000 If you think about this algorithm, 666 00:34:31,000 --> 00:34:34,639 generic DP algorithm, which is check 667 00:34:34,639 --> 00:34:35,889 whether we're in a memo table. 668 00:34:35,889 --> 00:34:36,969 If so, return. 669 00:34:36,969 --> 00:34:40,810 Otherwise, recurse, or base case.
670 00:34:40,810 --> 00:34:43,210 This actually is a depth-first search 671 00:34:43,210 --> 00:34:46,929 through the sub-problem graph-- technically through the reverse 672 00:34:46,929 --> 00:34:48,370 of the sub-problem graph. 673 00:34:48,370 --> 00:34:51,429 If I draw an edge-- 674 00:34:51,429 --> 00:34:58,350 so from small to big-- 675 00:34:58,350 --> 00:35:01,950 so I'm just saying, I orient the edges from my smaller 676 00:35:01,950 --> 00:35:04,320 sub-problems to the ones that need it-- 677 00:35:04,320 --> 00:35:06,840 then I'm actually depth-first searching backwards 678 00:35:06,840 --> 00:35:10,470 in this graph because the bigger problem calls the smaller 679 00:35:10,470 --> 00:35:12,240 problem. 680 00:35:12,240 --> 00:35:16,110 And the memo table is serving as the "have I visited this vertex 681 00:35:16,110 --> 00:35:18,270 already" check in DFS. 682 00:35:18,270 --> 00:35:20,460 So this is actually a DFS algorithm. 683 00:35:20,460 --> 00:35:24,840 Plus we're doing some computation to actually solve 684 00:35:24,840 --> 00:35:27,000 the sub-problems we care about. 685 00:35:27,000 --> 00:35:29,820 So implicit in this algorithm, we are doing a DFS, 686 00:35:29,820 --> 00:35:32,430 and at the same time, we're doing this shortest path 687 00:35:32,430 --> 00:35:37,720 computation in the finishing order of that DFS traversal 688 00:35:37,720 --> 00:35:39,220 because all the edges are backwards. 689 00:35:39,220 --> 00:35:41,410 This is the same as the reverse finishing order 690 00:35:41,410 --> 00:35:42,560 if the graph is forwards. 691 00:35:42,560 --> 00:35:46,320 So in the end, we're computing a topological order 692 00:35:46,320 --> 00:35:49,590 because dynamic programming includes 693 00:35:49,590 --> 00:35:52,440 in it depth first search. 694 00:35:52,440 --> 00:35:54,540 A lot of words. 
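To make the DAG shortest paths discussion concrete, here is a minimal Python sketch of the recurrence with memoization. It assumes the graph is given as a dictionary of incoming adjacency lists plus a dictionary of edge weights; those input shapes and the name dag_shortest_paths are choices made for this sketch, and the graph must be acyclic for the recursion to terminate.

```python
import math


def dag_shortest_paths(incoming, weight, s):
    """Shortest-path weights from s in a DAG via recursion plus memoization.

    incoming[v] is the list of vertices u with an edge (u, v);
    weight[(u, v)] is the weight of that edge.
    """
    memo = {}  # doubles as the "have I visited this vertex already" check

    def delta(v):
        if v in memo:
            return memo[v]
        if v == s:                       # base case: delta(s, s) = 0
            result = 0
        else:                            # min over incoming edges, plus infinity
            candidates = [delta(u) + weight[(u, v)] for u in incoming[v]]
            result = min([math.inf] + candidates)
        memo[v] = result
        return result

    # The original problem is all of the sub-problems.
    return {v: delta(v) for v in incoming}


# Small example DAG: s->a (1), s->b (4), a->b (2), b->c (-1), d->c (3).
incoming = {"s": [], "a": ["s"], "b": ["s", "a"], "c": ["b", "d"], "d": []}
weight = {("s", "a"): 1, ("s", "b"): 4, ("a", "b"): 2, ("b", "c"): -1, ("d", "c"): 3}
dist = dag_shortest_paths(incoming, weight, "s")
# dist == {"s": 0, "a": 1, "b": 3, "c": 2, "d": math.inf}
```

No topological order is computed explicitly: the recursive calls walk the incoming edges depth first, and each delta value is finalized when its call finishes, which is the implicit DFS described above.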
695 00:35:54,540 --> 00:35:57,900 But it's kind of cool that this framework just 696 00:35:57,900 --> 00:36:02,070 solves DAG shortest paths without much work. 697 00:36:02,070 --> 00:36:04,050 I mean, we did a lot of work in shortest paths 698 00:36:04,050 --> 00:36:05,950 to prove that this relation is true. 699 00:36:05,950 --> 00:36:08,340 Once you know it's true, the algorithm part 700 00:36:08,340 --> 00:36:11,280 is pretty much free. 701 00:36:11,280 --> 00:36:15,900 You just write down SRTBOT and you're done. 702 00:36:15,900 --> 00:36:18,420 OK. 703 00:36:18,420 --> 00:36:24,100 This brings us to in general-- 704 00:36:24,100 --> 00:36:26,110 at this point we have seen two examples 705 00:36:26,110 --> 00:36:27,190 of dynamic programming. 706 00:36:27,190 --> 00:36:28,630 I guess technically merge sort you 707 00:36:28,630 --> 00:36:30,130 could think of as a dynamic program, 708 00:36:30,130 --> 00:36:31,580 but it doesn't actually reuse anything. 709 00:36:31,580 --> 00:36:32,590 So it's not interesting. 710 00:36:32,590 --> 00:36:34,640 And indeed, that gave us a really bad bound. 711 00:36:34,640 --> 00:36:37,210 We've definitely seen DAG shortest paths and Fibonacci 712 00:36:37,210 --> 00:36:39,460 numbers as two interesting examples. 713 00:36:39,460 --> 00:36:43,090 And what the next remainder of this lecture and the next three 714 00:36:43,090 --> 00:36:46,330 lectures are going to be about is more and more examples 715 00:36:46,330 --> 00:36:48,580 of dynamic programming and how you 716 00:36:48,580 --> 00:36:51,280 can use it to solve increasingly general problems. 717 00:36:51,280 --> 00:36:53,740 So far, we've just solved an easy problem and a problem 718 00:36:53,740 --> 00:36:55,220 we already knew how to solve. 719 00:36:55,220 --> 00:37:02,680 Let's go to a new problem, which is bowling. 720 00:37:02,680 --> 00:37:04,290 Bowling is popular in Boston. 
721 00:37:06,940 --> 00:37:10,390 Boston likes to play candlepin bowling, which 722 00:37:10,390 --> 00:37:12,010 is a bit unusual. 723 00:37:12,010 --> 00:37:15,070 Today we're going to play an even more unusual bowling 724 00:37:15,070 --> 00:37:19,120 game, one that I made up based on a bowling game 725 00:37:19,120 --> 00:37:22,720 that Henry [INAUDIBLE] made up in 1908. 726 00:37:22,720 --> 00:37:26,110 So ancient bowling, I'll call it, 727 00:37:26,110 --> 00:37:29,020 or I think linear bowling is what I might call it. 728 00:37:29,020 --> 00:37:31,930 I'll just call it bowling here. 729 00:37:31,930 --> 00:37:34,560 And now I'm going to attempt to draw a bowling pin. 730 00:37:34,560 --> 00:37:36,930 Not bad. 731 00:37:36,930 --> 00:37:39,450 They might get progressively worse. 732 00:37:39,450 --> 00:37:42,270 So imagine n identical bowling pins. 733 00:37:42,270 --> 00:37:44,070 Please pretend these are identical. 734 00:37:44,070 --> 00:37:49,020 And I have a ball which is approximately the same size 735 00:37:49,020 --> 00:37:50,070 as a bowling pin. 736 00:37:50,070 --> 00:37:52,320 These bowling pins are pretty close together. 737 00:37:52,320 --> 00:37:54,720 I should have left a little gap here. 738 00:37:54,720 --> 00:37:57,300 And you are a really good bowler. 739 00:37:57,300 --> 00:38:00,810 Now, unfortunately, these bowling pins are on a line. 740 00:38:00,810 --> 00:38:03,910 And you're bowling from way down at infinity. 741 00:38:03,910 --> 00:38:08,850 So when you bowl, you can only hit one pin 742 00:38:08,850 --> 00:38:10,530 or two pins or zero pins. 743 00:38:10,530 --> 00:38:12,960 But probably you want to hit some pins. 744 00:38:12,960 --> 00:38:14,970 So if you bowl straight at a pin, 745 00:38:14,970 --> 00:38:16,980 you will just hit that one pin.
746 00:38:16,980 --> 00:38:20,700 And if you bowl in the middle between two pins, 747 00:38:20,700 --> 00:38:22,410 you will knock down-- 748 00:38:22,410 --> 00:38:23,730 that's a ball, sorry-- 749 00:38:23,730 --> 00:38:26,010 you will knock down two pins. 750 00:38:26,010 --> 00:38:30,870 And this is your model of bowling, model of computation. 751 00:38:30,870 --> 00:38:36,060 Now, what makes this interesting is that the pins have values. 752 00:38:36,060 --> 00:38:39,900 Pin i has value-- 753 00:38:39,900 --> 00:38:44,340 this is obviously a toy problem, though this problem-- 754 00:38:44,340 --> 00:38:47,580 this type of bowling does go back to 1908, 755 00:38:47,580 --> 00:38:50,530 it was also a toy problem in that setting. 756 00:38:50,530 --> 00:38:55,590 So each of these bowling pins has some number on it, 757 00:38:55,590 --> 00:38:57,540 let's say 1, 9, 9-- 758 00:39:01,407 --> 00:39:03,240 I'll do a slightly more interesting example, 759 00:39:03,240 --> 00:39:11,130 maybe another one here and a 2 and a 5 and a 5, something 760 00:39:11,130 --> 00:39:13,420 like this. 761 00:39:13,420 --> 00:39:15,268 OK. 762 00:39:15,268 --> 00:39:17,060 Or maybe make it a little more interesting. 763 00:39:17,060 --> 00:39:19,280 Let's put some negative numbers on here. 764 00:39:19,280 --> 00:39:19,780 OK. 765 00:39:19,780 --> 00:39:24,160 And the model-- so you're at the carnival bowling. 766 00:39:24,160 --> 00:39:27,490 Each pin has different-- potentially different values. 767 00:39:27,490 --> 00:39:38,350 And the model is if you hit one pin, i, then you get vi points. 768 00:39:38,350 --> 00:39:40,330 So that's straight forward. 769 00:39:40,330 --> 00:39:43,120 To make it interesting, when you hit two pins, 770 00:39:43,120 --> 00:39:45,345 you get the product. 771 00:39:45,345 --> 00:39:46,720 So if I hit two pins, it's always 772 00:39:46,720 --> 00:39:53,095 i and i plus 1 for some I. You get vi times vi plus 1 points. 
773 00:39:56,110 --> 00:39:58,060 This is the game you're playing. 774 00:39:58,060 --> 00:40:00,774 And it doesn't really matter that this is a product. 775 00:40:00,774 --> 00:40:03,460 Product is just some weird function 776 00:40:03,460 --> 00:40:05,320 that's hard to imagine. 777 00:40:05,320 --> 00:40:07,510 If you stare at this long enough, 778 00:40:07,510 --> 00:40:10,030 you should convince yourself that the optimal solution 779 00:40:10,030 --> 00:40:11,747 is probably to-- 780 00:40:11,747 --> 00:40:13,330 so, for each of these numbers, I could 781 00:40:13,330 --> 00:40:15,622 leave it singleton or pair it with its left neighbor 782 00:40:15,622 --> 00:40:17,080 or pair it with its right neighbor. 783 00:40:17,080 --> 00:40:19,630 But the pairings can't overlap because once I hit a pin, 784 00:40:19,630 --> 00:40:20,200 it's gone. 785 00:40:20,200 --> 00:40:21,700 It's knocked over. 786 00:40:21,700 --> 00:40:22,820 It disappears. 787 00:40:22,820 --> 00:40:26,410 So because of these nine, which are a very high value, what 788 00:40:26,410 --> 00:40:28,790 I'd probably like to do is hit both of them together, 789 00:40:28,790 --> 00:40:32,268 so pair them up, because 9 times 9 is 81. 790 00:40:32,268 --> 00:40:34,810 That's really big, much better than hitting them individually 791 00:40:34,810 --> 00:40:37,810 or hitting 9 times 1 or 9 times 2. 792 00:40:37,810 --> 00:40:40,390 1 and 1 is kind of funny, because it's actually better 793 00:40:40,390 --> 00:40:41,442 to hit them individually. 794 00:40:41,442 --> 00:40:43,900 That will give you two points, whereas if I'd pair them up, 795 00:40:43,900 --> 00:40:46,102 I only get one point. 796 00:40:46,102 --> 00:40:47,890 2 and minus 5, that seems bad. 797 00:40:47,890 --> 00:40:48,940 Negative 10 points. 798 00:40:48,940 --> 00:40:50,980 My goal is to maximize score. 799 00:40:56,980 --> 00:41:00,280 Do you have to hit all the pins? 
800 00:41:00,280 --> 00:41:02,470 Let's say no, you don't have to hit all the pins. 801 00:41:02,470 --> 00:41:04,840 So I could skip the minus fives. 802 00:41:04,840 --> 00:41:07,780 But in fact, here, because they're adjacent, 803 00:41:07,780 --> 00:41:09,400 minus 5 times minus 5 is good. 804 00:41:09,400 --> 00:41:11,270 That's 25 points. 805 00:41:11,270 --> 00:41:14,740 So the optimal solution for this particular instance 806 00:41:14,740 --> 00:41:18,100 is to hit all the pins, these positive, 807 00:41:18,100 --> 00:41:19,990 these together, these together. 808 00:41:19,990 --> 00:41:23,410 If I added, for example, another pin of minus 3 here, 809 00:41:23,410 --> 00:41:25,600 I would choose not to hit that pin. 810 00:41:25,600 --> 00:41:26,300 Good question. 811 00:41:26,300 --> 00:41:29,410 So you just play until you are tired. 812 00:41:29,410 --> 00:41:32,950 When you decide to stop playing, how can I maximize your score? 813 00:41:32,950 --> 00:41:34,660 There are many variations of this game. 814 00:41:34,660 --> 00:41:37,840 All of them-- basically any variation-- 815 00:41:37,840 --> 00:41:40,040 not literally every variation, but many, 816 00:41:40,040 --> 00:41:43,000 many variations of this problem can all be solved quickly 817 00:41:43,000 --> 00:41:44,440 with dynamic programming. 818 00:41:44,440 --> 00:41:47,950 But let's solve this particular one. 819 00:41:47,950 --> 00:41:48,450 OK. 820 00:41:54,980 --> 00:41:58,460 So now we're really in algorithmic design mode. 821 00:41:58,460 --> 00:42:01,600 We need to think about SRTBOT. 822 00:42:01,600 --> 00:42:04,100 And in particular, we need to think about what would the sub- 823 00:42:04,100 --> 00:42:06,000 problems be here? 824 00:42:06,000 --> 00:42:08,550 And at this point, we don't have a lot of help. 825 00:42:08,550 --> 00:42:11,390 So I should probably give you some tools.
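To pin down the scoring rules of this bowling game before designing an algorithm, here is a small Python sketch that scores a given sequence of throws. The function name, the tuple encoding of throws, and the pin values 1, 1, 9, 9, 2, -5, -5 are assumptions for illustration; the values are one plausible reading of the board example, which the transcript does not state unambiguously. This only evaluates a chosen play and does not search for the best one.

```python
def score_throws(v, throws):
    """Score a play: each throw is (i,) for pin i alone or (i, i + 1) for two adjacent pins."""
    standing = [True] * len(v)
    total = 0
    for throw in throws:
        if len(throw) == 2 and throw[1] != throw[0] + 1:
            raise ValueError("a two-pin throw must hit pins i and i + 1")
        for i in throw:
            if not standing[i]:
                raise ValueError("pin %d was already knocked down" % i)
            standing[i] = False
        if len(throw) == 1:
            total += v[throw[0]]                 # single pin: v_i points
        else:
            total += v[throw[0]] * v[throw[1]]   # two pins: v_i * v_(i+1) points
    return total


pins = [1, 1, 9, 9, 2, -5, -5]   # assumed example values
# Hit the 1s individually, the 9s together, the 2 alone, the -5s together.
best_guess = score_throws(pins, [(0,), (1,), (2, 3), (4,), (5, 6)])   # 110
# Pairing the two 1s instead gives one point fewer.
paired_ones = score_throws(pins, [(0, 1), (2, 3), (4,), (5, 6)])      # 109
```

Pins that are never thrown at simply contribute nothing, which models the rule that you do not have to hit every pin.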
826 00:42:11,390 --> 00:42:13,280 If I want to solve a problem like this, 827 00:42:13,280 --> 00:42:18,050 the input is a sequence of numbers. 828 00:42:18,050 --> 00:42:19,710 It's a sequenced data structure. 829 00:42:19,710 --> 00:42:23,840 Maybe it's an array of numbers, which is this v array. 830 00:42:26,600 --> 00:42:28,910 And let's see. 831 00:42:31,820 --> 00:42:39,670 A general tool for sub-problem design 832 00:42:39,670 --> 00:42:42,150 which will cover most of the problems-- maybe all 833 00:42:42,150 --> 00:42:45,070 of the problems that we see in this class 834 00:42:45,070 --> 00:42:47,050 for dynamic programming. 835 00:42:47,050 --> 00:42:49,720 Here's a trick. 836 00:42:49,720 --> 00:43:03,735 If your input is a sequence, here are some good sub-problems 837 00:43:03,735 --> 00:43:04,235 to consider. 838 00:43:13,070 --> 00:43:15,680 We could do all prefixes. 839 00:43:18,210 --> 00:43:22,250 So let's call the sequence x. 840 00:43:22,250 --> 00:43:27,710 So we could do x prefix means up to a given i for all i. 841 00:43:27,710 --> 00:43:34,880 We could do all the suffixes, x from i onward for all i. 842 00:43:34,880 --> 00:43:41,300 Or we could do substrings, which are the consecutive items 843 00:43:41,300 --> 00:43:43,130 from i to j. 844 00:43:43,130 --> 00:43:44,480 I don't write subsequence here. 845 00:43:44,480 --> 00:43:46,955 Subsequence means you can omit items in the middle. 846 00:43:46,955 --> 00:43:49,200 So substring you have to start in some position 847 00:43:49,200 --> 00:43:51,940 and do all the things up to j. 848 00:43:51,940 --> 00:43:55,240 So these are nice, easy to express in Python notation. 849 00:43:55,240 --> 00:43:57,910 And these are great, because they're polynomial. 850 00:43:57,910 --> 00:43:59,350 If I have n things-- 851 00:43:59,350 --> 00:44:02,170 if the length of my sequence, x, is n, 852 00:44:02,170 --> 00:44:05,980 then there are n prefixes-- technically n plus 1. 
853 00:44:05,980 --> 00:44:08,350 So let's do theta n prefixes. 854 00:44:08,350 --> 00:44:10,300 There are theta n suffixes. 855 00:44:10,300 --> 00:44:14,290 And there are theta n squared substrings 856 00:44:14,290 --> 00:44:19,120 because there's n-- roughly n choices for i and j separately. 857 00:44:19,120 --> 00:44:21,070 Sorry? 858 00:44:21,070 --> 00:44:21,960 Sub-sequences. 859 00:44:21,960 --> 00:44:22,460 Good. 860 00:44:22,460 --> 00:44:22,690 Right. 861 00:44:22,690 --> 00:44:24,857 I didn't write sub-sequences, because in fact, there 862 00:44:24,857 --> 00:44:26,740 are exponentially many sub-sequences. 863 00:44:26,740 --> 00:44:27,730 It's 2 to the n. 864 00:44:27,730 --> 00:44:29,950 For every item, I could choose it or not. 865 00:44:29,950 --> 00:44:31,900 So I don't want to parameterize-- 866 00:44:31,900 --> 00:44:35,110 I don't want my sub-problems to be sub-sequences because that's 867 00:44:35,110 --> 00:44:37,420 guaranteed-- well, then you're guaranteed 868 00:44:37,420 --> 00:44:40,270 to get an exponential number of sub-problems, which is bad. 869 00:44:40,270 --> 00:44:43,540 We'd like to bound the number of sub-problems by a polynomial. 870 00:44:43,540 --> 00:44:47,800 So these are three natural ways to get polynomial bounds. 871 00:44:47,800 --> 00:44:50,110 Now, prefixes and suffixes are obviously better 872 00:44:50,110 --> 00:44:53,260 because there's fewer of them, linear instead of quadratic. 873 00:44:53,260 --> 00:44:56,650 And usually, for almost every problem you encounter, 874 00:44:56,650 --> 00:44:58,660 prefixes and suffixes are equally good. 875 00:44:58,660 --> 00:45:00,800 It doesn't really matter which one you choose. 876 00:45:00,800 --> 00:45:03,270 So maybe you'd like to think of-- 877 00:45:03,270 --> 00:45:04,150 well, we'll get to-- 878 00:45:06,970 --> 00:45:11,180 just choose whichever is more comfortable for you. 879 00:45:11,180 --> 00:45:12,430 But sometimes it's not enough.
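The three sub-problem families for a sequence can be written directly in Python slice notation. The following sketch (with hypothetical helper names, and counting empty slices as valid sub-problems) enumerates them so the linear and quadratic counts can be checked on a small input.

```python
def prefixes(x):
    """All prefixes x[:i] for 0 <= i <= len(x)."""
    return [x[:i] for i in range(len(x) + 1)]


def suffixes(x):
    """All suffixes x[i:] for 0 <= i <= len(x)."""
    return [x[i:] for i in range(len(x) + 1)]


def substrings(x):
    """All contiguous slices x[i:j] for 0 <= i <= j <= len(x)."""
    n = len(x)
    return [x[i:j] for i in range(n + 1) for j in range(i, n + 1)]


x = "abcd"
# prefixes(x)  == ['', 'a', 'ab', 'abc', 'abcd']     (n + 1 of them)
# suffixes(x)  == ['abcd', 'bcd', 'cd', 'd', '']     (n + 1 of them)
# len(substrings(x)) == 15                           ((n + 1)(n + 2) / 2 index pairs)
# By contrast there are 2 ** len(x) == 16 subsequences, which doubles with each extra item.
```

Because each family is indexed by one or two integers, a sub-problem can be named by i or by the pair (i, j) rather than by the slice itself, which keeps the memo table keys small.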
880 00:45:12,430 --> 00:45:13,888 And we'll have to go to substrings. 881 00:45:13,888 --> 00:45:16,300 That won't be for another lecture or two. 882 00:45:16,300 --> 00:45:19,810 Today I claim that prefixes or suffixes 883 00:45:19,810 --> 00:45:24,830 are enough to solve the bowling problem. 884 00:45:24,830 --> 00:45:26,920 So what we're going to do is think about-- 885 00:45:26,920 --> 00:45:29,230 I prefer suffixes usually, because I 886 00:45:29,230 --> 00:45:32,810 like to work from left to right, from the beginning to the end. 887 00:45:32,810 --> 00:45:37,270 So we're going to think of a suffix of the bowling pins. 888 00:45:37,270 --> 00:45:39,770 And so what is the sub-problem on a suffix? 889 00:45:39,770 --> 00:45:43,510 Well, a natural version is just to solve the original problem, 890 00:45:43,510 --> 00:45:44,050 bowling. 891 00:45:44,050 --> 00:45:46,600 How do I maximize my score if all I were given 892 00:45:46,600 --> 00:45:47,710 were these pins? 893 00:45:47,710 --> 00:45:50,890 Suppose the pins to the left of i didn't exist. 894 00:45:50,890 --> 00:45:53,350 How would I maximize my score on the remaining pins? 895 00:45:53,350 --> 00:45:56,472 Or for this suffix, given these four pins, what would I do? 896 00:45:56,472 --> 00:45:58,180 And there's some weird sub-problems here. 897 00:45:58,180 --> 00:46:00,460 If I just gave you the last pin, what would you do? 898 00:46:00,460 --> 00:46:01,060 Nothing. 899 00:46:01,060 --> 00:46:04,330 That's clearly different from what I would do globally here. 900 00:46:04,330 --> 00:46:06,430 But I claim if I can solve all suffixes 901 00:46:06,430 --> 00:46:09,670 I can solve my original problem, because one of the suffixes 902 00:46:09,670 --> 00:46:11,455 is the whole sequence. 903 00:46:14,140 --> 00:46:15,090 So let's do it. 904 00:46:18,430 --> 00:46:22,770 SRTBOT for bowling. 905 00:46:26,330 --> 00:46:29,160 So here is our dynamic program.
906 00:46:29,160 --> 00:46:34,020 The sub-problems are suffixes. 907 00:46:34,020 --> 00:46:42,960 So I'll write b of i is the maximum score we could get 908 00:46:42,960 --> 00:46:50,430 possible with our starting-- 909 00:46:50,430 --> 00:47:00,540 if we started a game with pins i, i plus 1, up to n minus 1, 910 00:47:00,540 --> 00:47:02,405 which is a suffix of the pins. 911 00:47:02,405 --> 00:47:04,530 Very important whenever you write a dynamic program 912 00:47:04,530 --> 00:47:06,540 to define what your sub-problems are. 913 00:47:06,540 --> 00:47:08,100 Don't just say how to compute them, 914 00:47:08,100 --> 00:47:10,680 but first say what is the goal of the sub problem. 915 00:47:10,680 --> 00:47:13,230 This is a common mistake to forget to state 916 00:47:13,230 --> 00:47:15,450 what you're trying to do. 917 00:47:15,450 --> 00:47:18,860 So now I have defined b of i. 918 00:47:18,860 --> 00:47:23,015 Now, what is the original thing I'm trying to solve? 919 00:47:23,015 --> 00:47:24,530 You also put in SRTBOT-- 920 00:47:24,530 --> 00:47:28,010 you could put the O earlier, then it actually spells sort. 921 00:47:28,010 --> 00:47:30,020 So why don't I do that for fun. 922 00:47:30,020 --> 00:47:35,420 The original problem we're trying to solve is b of 0, 923 00:47:35,420 --> 00:47:36,980 because that is all of the pins. 924 00:47:36,980 --> 00:47:39,930 The suffix starting at 0 is everything. 925 00:47:39,930 --> 00:47:42,720 So if we can solve that, we're done. 926 00:47:42,720 --> 00:47:46,100 Next is r for relate. 927 00:47:46,100 --> 00:47:49,340 This is the test of, did I get the sub-problems right, 928 00:47:49,340 --> 00:47:53,090 is whether I can write a recurrence relation. 929 00:47:53,090 --> 00:47:54,410 So let's try to do it. 930 00:47:54,410 --> 00:47:58,320 We want to compute b of i. 931 00:47:58,320 --> 00:48:04,220 So we have pin i here and then the remaining pins. 
932 00:48:07,640 --> 00:48:10,790 And the big idea here is to just think about-- 933 00:48:10,790 --> 00:48:13,550 the nice thing about suffixes is if I take off 934 00:48:13,550 --> 00:48:15,973 something from the beginning, I still have a suffix. 935 00:48:15,973 --> 00:48:18,140 Remember, my goal is to take this sub-problem, which 936 00:48:18,140 --> 00:48:20,240 is suffix starting at i, and reduce it 937 00:48:20,240 --> 00:48:23,780 to a smaller sub-problem, which means a smaller suffix. 938 00:48:23,780 --> 00:48:28,370 So I'd like to clip off one or two items here. 939 00:48:28,370 --> 00:48:33,770 And then the remaining problem will be one of my sub-problems. 940 00:48:33,770 --> 00:48:37,970 I'll be able to recursively call b of something smaller than i-- 941 00:48:37,970 --> 00:48:39,650 or sorry, b of something larger than i 942 00:48:39,650 --> 00:48:43,730 will be a smaller suffix because we're starting later. 943 00:48:43,730 --> 00:48:44,910 OK, so what could I do? 944 00:48:44,910 --> 00:48:47,270 Well, the idea is to just look at pin i 945 00:48:47,270 --> 00:48:50,000 and think, well, what could I do to pin i? 946 00:48:50,000 --> 00:48:52,310 I could not hit it ever with a ball. 947 00:48:52,310 --> 00:48:53,450 I could skip it. 948 00:48:53,450 --> 00:48:54,950 That's one option. 949 00:48:54,950 --> 00:48:57,720 What would be my score then? 950 00:48:57,720 --> 00:49:02,820 Well, if I skip pin i, that leaves the remaining pins, 951 00:49:02,820 --> 00:49:04,450 which is just a smaller suffix. 952 00:49:04,450 --> 00:49:08,830 So that is b of i plus 1. 953 00:49:08,830 --> 00:49:10,260 I'm going to write a max out here 954 00:49:10,260 --> 00:49:12,040 because I'd like to maximize my score. 955 00:49:12,040 --> 00:49:14,190 And one of the options is, forget about pin i. 956 00:49:14,190 --> 00:49:16,140 Just solve the rest. 957 00:49:16,140 --> 00:49:18,160 Another option is I throw a ball.
958 00:49:18,160 --> 00:49:21,540 And I exactly hit pin i. 959 00:49:21,540 --> 00:49:22,710 That's one thing I could do. 960 00:49:22,710 --> 00:49:26,220 And it would leave exactly the same remainder. 961 00:49:26,220 --> 00:49:34,620 So another option is b of i plus 1 plus vi. 962 00:49:34,620 --> 00:49:36,240 Why would I prefer this over this? 963 00:49:36,240 --> 00:49:39,880 Well, if vi is negative, I'd prefer this. 964 00:49:39,880 --> 00:49:42,970 But if vi is positive, I'd actually prefer this over that. 965 00:49:42,970 --> 00:49:47,723 So you can figure out which is better, just locally. 966 00:49:47,723 --> 00:49:49,390 But then there's another thing I can do, 967 00:49:49,390 --> 00:49:54,327 which is maybe I hit this pin in a pair with some other pin. 968 00:49:54,327 --> 00:49:56,160 Now, there's no pin to the left of this one. 969 00:49:56,160 --> 00:49:58,320 We're assuming we only have the suffix. 970 00:49:58,320 --> 00:50:01,470 And so the only other thing I can do is throw a ball 971 00:50:01,470 --> 00:50:03,750 and hit i together with i plus 1. 972 00:50:03,750 --> 00:50:05,580 And then I get the product. 973 00:50:05,580 --> 00:50:07,110 Now, what pins remain? 974 00:50:07,110 --> 00:50:08,340 i plus 2 on. 975 00:50:08,340 --> 00:50:09,900 Still a suffix. 976 00:50:09,900 --> 00:50:12,120 So if I remove one or two items, of course, 977 00:50:12,120 --> 00:50:13,350 I still get a suffix-- 978 00:50:13,350 --> 00:50:15,960 in this case, b of i plus 2-- 979 00:50:15,960 --> 00:50:18,300 and then the number of points that I add on 980 00:50:18,300 --> 00:50:21,930 are vi times vi plus 1. 981 00:50:21,930 --> 00:50:26,120 So this is a max of three things. 982 00:50:26,120 --> 00:50:28,240 So how long does it take me to compute it? 983 00:50:28,240 --> 00:50:30,340 I claim constant time. 
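Written out as a formula, the three options described so far (skip pin i, hit pin i alone, hit pins i and i plus 1 together) give the relation below. This is only a transcription of what is said aloud: the symbols B and v_i follow the spoken "b of i" and "vi", and the side condition on the third option is spelled out here because that option needs pin i plus 1 to exist.

```latex
B(i) = \max \bigl\{\, B(i+1),\;\; v_i + B(i+1),\;\; v_i \cdot v_{i+1} + B(i+2) \ \ \text{(only if } i < n-1\text{)} \,\bigr\}
```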
984 00:50:30,340 --> 00:50:31,830 If I don't count the time it takes 985 00:50:31,830 --> 00:50:33,580 to compute these other sub-problems, which 986 00:50:33,580 --> 00:50:37,150 are smaller because they are smaller suffixes further 987 00:50:37,150 --> 00:50:40,030 to the right, then I'm doing a couple 988 00:50:40,030 --> 00:50:42,370 of additions-- product, max. 989 00:50:42,370 --> 00:50:44,250 These are all nice numbers and I'll 990 00:50:44,250 --> 00:50:48,028 assume that they live in the w-bit word, 991 00:50:48,028 --> 00:50:50,070 because we're only doing constant sized products. 992 00:50:50,070 --> 00:50:51,390 That's good. 993 00:50:51,390 --> 00:50:56,620 So this takes constant, constant non-recursive work. 994 00:50:56,620 --> 00:50:57,900 How many sub-problems are there? 995 00:50:57,900 --> 00:51:01,230 Well, it's suffixes, so it's a linear number of sub-problems. 996 00:51:01,230 --> 00:51:03,360 And so the time I'm going to end up 997 00:51:03,360 --> 00:51:07,950 needing is number of sub-problems, n, 998 00:51:07,950 --> 00:51:11,460 times the non-recursive work I do per sub-problem, which 999 00:51:11,460 --> 00:51:12,460 is constant. 1000 00:51:12,460 --> 00:51:16,170 And so this is linear time. 1001 00:51:16,170 --> 00:51:16,830 Great. 1002 00:51:16,830 --> 00:51:19,470 And I didn't finish SRTBOT, so there's 1003 00:51:19,470 --> 00:51:21,600 another t, which is to make sure that there 1004 00:51:21,600 --> 00:51:27,450 is a topological order and that is in decreasing i order. 1005 00:51:30,690 --> 00:51:33,870 Or I might write that as a for loop-- 1006 00:51:33,870 --> 00:51:37,500 for i equals n, n minus 1, down to 0. 1007 00:51:37,500 --> 00:51:41,610 This is the order that I would compute my problems 1008 00:51:41,610 --> 00:51:44,825 because the suffix starting at n is the empty suffix. 1009 00:51:44,825 --> 00:51:46,950 The suffix starting at 0, that's the one I actually 1010 00:51:46,950 --> 00:51:47,617 want to compute.
1011 00:51:47,617 --> 00:51:50,370 That's the final suffix I should be computing. 1012 00:51:50,370 --> 00:51:53,830 And then we have a b for base case, 1013 00:51:53,830 --> 00:52:00,960 which is that first case, b of n equals 0, 1014 00:52:00,960 --> 00:52:02,010 because there's no pins. 1015 00:52:02,010 --> 00:52:04,310 So I don't get any points. 1016 00:52:04,310 --> 00:52:04,810 Sad. 1017 00:52:07,920 --> 00:52:09,150 OK, so this is it. 1018 00:52:09,150 --> 00:52:12,210 We just take these components, plug them 1019 00:52:12,210 --> 00:52:14,640 into this recursive, memoized algorithm, 1020 00:52:14,640 --> 00:52:17,042 and we have a linear time algorithm. 1021 00:52:17,042 --> 00:52:18,750 I want to briefly mention a different way 1022 00:52:18,750 --> 00:52:21,270 you could plug together those pieces, which is called bottom 1023 00:52:21,270 --> 00:52:27,600 up dp, which is-- 1024 00:52:27,600 --> 00:52:29,440 let's do it for this example. 1025 00:52:29,440 --> 00:52:32,760 So if I have-- 1026 00:52:32,760 --> 00:52:34,240 let's see. 1027 00:52:34,240 --> 00:52:39,700 Let me start with the base case, b of n equals 0. 1028 00:52:39,700 --> 00:52:41,470 But now it's an assignment. 1029 00:52:41,470 --> 00:52:44,140 And I'm going to do for loop from the topological order 1030 00:52:44,140 --> 00:52:49,150 for i equals n, n minus 1 to 0. 1031 00:52:49,150 --> 00:52:53,230 Now I'm going to do the relation, b of i equals 1032 00:52:53,230 --> 00:53:02,680 max of b of i plus 1 and b of i plus 1 plus vi 1033 00:53:02,680 --> 00:53:08,530 and b of i plus 2 plus vi vi plus 1. 1034 00:53:08,530 --> 00:53:11,800 Technically this only works if i is strictly less than n 1035 00:53:11,800 --> 00:53:12,820 minus 1. 1036 00:53:12,820 --> 00:53:17,742 So I should have an if i is less than n minus 1 for that last part 1037 00:53:17,742 --> 00:53:18,700 because I can only do-- 1038 00:53:18,700 --> 00:53:22,540 I can only hit two pins if there's at least two pins left.
1039 00:53:22,540 --> 00:53:27,640 And then return b of 0. 1040 00:53:27,640 --> 00:53:29,440 So what I just did is a transformation 1041 00:53:29,440 --> 00:53:35,950 from this SRTBOT template into a non-recursive algorithm, a 1042 00:53:35,950 --> 00:53:39,970 for loop algorithm, where I wrote my base case first. 1043 00:53:39,970 --> 00:53:43,060 Then I did my topological order. 1044 00:53:43,060 --> 00:53:46,570 Then I did my relation. 1045 00:53:46,570 --> 00:53:48,910 Then at the end, I did my-- 1046 00:53:48,910 --> 00:53:49,750 not base case. 1047 00:53:49,750 --> 00:53:50,650 The original problem. 1048 00:53:53,650 --> 00:53:56,380 And provided you can write your topological order 1049 00:53:56,380 --> 00:53:57,730 as some for loops, 1050 00:53:57,730 --> 00:54:00,367 this is actually a great way to write down a dp as code. 1051 00:54:00,367 --> 00:54:02,200 If I were going to implement this algorithm, 1052 00:54:02,200 --> 00:54:04,450 I would write it this way, because this is super fast. 1053 00:54:04,450 --> 00:54:05,560 No recursive calls. 1054 00:54:05,560 --> 00:54:06,700 Just one for loop. 1055 00:54:06,700 --> 00:54:08,620 In fact, this is almost a trivial algorithm. 1056 00:54:08,620 --> 00:54:12,670 It's amazing that this solves the bowling problem. 1057 00:54:12,670 --> 00:54:16,810 It's in some sense considering every possible strategy I could 1058 00:54:16,810 --> 00:54:19,120 use for bowling these pins. 1059 00:54:19,120 --> 00:54:22,840 What we're using is what we like to call local brute force, 1060 00:54:22,840 --> 00:54:25,210 where when we think about pin i, we 1061 00:54:25,210 --> 00:54:28,120 look at all of the possible things I could do to pin i. 1062 00:54:28,120 --> 00:54:31,240 Here there's really only three options of what I could do.
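The bottom-up version just described (base case, then the loop in topological order, then the relation, then the original problem) can be sketched in Python roughly as follows. The function name and the sample pin values are assumptions made for illustration, not something written on the board. One small liberty: the loop starts at n minus 1 rather than n, since b of n is already set by the base case.

```python
def bowl_bottom_up(v):
    """Maximum bowling score for pin values v, computed over suffixes.

    B[i] is the best score using only pins i, i+1, ..., n-1.
    """
    n = len(v)
    B = [0] * (n + 1)                 # base case: B[n] = 0 (no pins left)
    for i in range(n - 1, -1, -1):    # topological order: decreasing i
        best = max(B[i + 1],          # skip pin i
                   B[i + 1] + v[i])   # hit pin i alone
        if i < n - 1:                 # a pair needs at least two pins left
            best = max(best, B[i + 2] + v[i] * v[i + 1])
        B[i] = best
    return B[0]                       # original problem: the whole sequence


pins = [1, 1, 9, 9, 2, -5, -5]        # hypothetical pin values
print(bowl_bottom_up(pins))           # prints 110
```

On this sample the best play is to hit the first two pins singly, the two 9s as a pair, the 2 alone, and the two -5s as a pair: 1 + 1 + 81 + 2 + 25 = 110.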
1063 00:54:31,240 --> 00:54:34,720 Now, normally, if I tried all the options for pin i 1064 00:54:34,720 --> 00:54:37,877 and then all the options for i plus 1 and i plus 2 and so on, 1065 00:54:37,877 --> 00:54:38,960 that would be exponential. 1066 00:54:38,960 --> 00:54:40,690 It'd be 3 times 3 times 3. 1067 00:54:40,690 --> 00:54:44,990 That's bad, but because I can reuse these sub-problems, 1068 00:54:44,990 --> 00:54:47,530 it turns out to only be linear time. 1069 00:54:47,530 --> 00:54:49,090 It's almost like magic. 1070 00:54:49,090 --> 00:54:56,455 dp-- dp is essentially an idea of using local brute force. 1071 00:55:02,200 --> 00:55:06,440 And by defining a small number of sub-problems up front-- 1072 00:55:06,440 --> 00:55:09,370 and as long as I stay within those sub-problems, 1073 00:55:09,370 --> 00:55:12,820 as long as I'm always recursing into this polynomial space, 1074 00:55:12,820 --> 00:55:15,425 I end up only doing polynomial work, 1075 00:55:15,425 --> 00:55:17,050 even though I'm in some sense exploring 1076 00:55:17,050 --> 00:55:19,570 exponentially many options. 1077 00:55:19,570 --> 00:55:22,300 And it is because what I do to this pin 1078 00:55:22,300 --> 00:55:26,410 doesn't depend too much on what I do to a pin much later. 1079 00:55:26,410 --> 00:55:30,520 There's a lot of intuition going on here for what-- 1080 00:55:30,520 --> 00:55:31,767 when DP works. 1081 00:55:31,767 --> 00:55:33,850 But we're going to see a lot more examples of that 1082 00:55:33,850 --> 00:55:36,370 coming up. 1083 00:55:36,370 --> 00:55:38,740 And I just want to mention the intuition for how 1084 00:55:38,740 --> 00:55:40,210 to write a recurrence like this is 1085 00:55:40,210 --> 00:55:42,628 to think about-- in the case of suffixes, 1086 00:55:42,628 --> 00:55:44,920 you always want to think about the first item, or maybe 1087 00:55:44,920 --> 00:55:46,210 the first couple of items.
1088 00:55:46,210 --> 00:55:49,150 The case of prefixes, you always think about the last item. 1089 00:55:49,150 --> 00:55:52,845 And for substrings, it could be any item-- maybe in the middle. 1090 00:55:52,845 --> 00:55:54,970 If I remove an item from the middle of a substring, 1091 00:55:54,970 --> 00:55:57,760 I get two substrings, so I can recurse. 1092 00:55:57,760 --> 00:56:00,040 Here or in general, what we want to do 1093 00:56:00,040 --> 00:56:03,460 is identify some feature of the solution 1094 00:56:03,460 --> 00:56:06,820 that if we knew that feature we would be done. 1095 00:56:06,820 --> 00:56:09,440 We would reduce to a smaller sub problem. 1096 00:56:09,440 --> 00:56:11,320 In this case, we just say, well, what 1097 00:56:11,320 --> 00:56:15,070 are the possible things I could do to the first pin? 1098 00:56:15,070 --> 00:56:16,720 There are three options. 1099 00:56:16,720 --> 00:56:18,970 If I knew which option it was, I would be done. 1100 00:56:18,970 --> 00:56:21,550 I could recurse and do my addition. 1101 00:56:21,550 --> 00:56:23,620 Now, I don't know which thing I want to do. 1102 00:56:23,620 --> 00:56:26,458 So I just try them all and take the max. 1103 00:56:26,458 --> 00:56:28,250 And if you're maximizing, you take the max. 1104 00:56:28,250 --> 00:56:29,875 If you're minimizing, you take the min. 1105 00:56:29,875 --> 00:56:32,380 Sometimes you take an or or an and. 1106 00:56:32,380 --> 00:56:34,360 There might be some combination function. 1107 00:56:34,360 --> 00:56:36,040 For optimization problems where you're 1108 00:56:36,040 --> 00:56:37,570 trying to maximize or minimize something, 1109 00:56:37,570 --> 00:56:39,487 like shortest paths you're trying to minimize, 1110 00:56:39,487 --> 00:56:40,670 we put them in here. 1111 00:56:40,670 --> 00:56:42,760 So usually it's min or max. 1112 00:56:42,760 --> 00:56:46,120 And this is extremely powerful. 
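The local brute force idea above, try every option for the first pin and take the max, can be sketched as a recursion in Python. Two variants are shown so the effect of reusing sub-problems is visible: a plain recursion that counts its own calls, and the same recursion memoized with the standard library's functools.lru_cache. The function names and the sample pins are hypothetical, and for long inputs the recursive form could hit Python's recursion limit, which is one practical reason to prefer the for-loop form.

```python
from functools import lru_cache


def bowl_naive(v):
    """Plain recursion over suffixes. Returns (best score, number of calls)."""
    n = len(v)
    calls = 0

    def B(i):
        nonlocal calls
        calls += 1
        if i >= n:                         # base case: empty suffix
            return 0
        options = [B(i + 1),               # skip pin i
                   B(i + 1) + v[i]]        # hit pin i alone
        if i + 1 < n:                      # hit pins i and i+1 together
            options.append(B(i + 2) + v[i] * v[i + 1])
        return max(options)

    return B(0), calls


def bowl_memo(v):
    """Same recursion, but each suffix is solved once and then reused."""
    n = len(v)

    @lru_cache(maxsize=None)
    def B(i):
        if i >= n:
            return 0
        options = [B(i + 1), B(i + 1) + v[i]]
        if i + 1 < n:
            options.append(B(i + 2) + v[i] * v[i + 1])
        return max(options)

    return B(0)


pins = [1, 1, 9, 9, 2, -5, -5]             # hypothetical pin values
naive_score, naive_calls = bowl_naive(pins)
memo_score = bowl_memo(pins)
print(naive_score, memo_score, naive_calls)
```

Both variants return the same score. The plain recursion makes a number of calls that grows exponentially with the number of pins, while the memoized one only ever evaluates the n plus 1 distinct suffixes.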
1113 00:56:46,120 --> 00:56:47,590 All you need to do-- 1114 00:56:47,590 --> 00:56:50,680 the hard part is this inspired design part 1115 00:56:50,680 --> 00:56:54,640 where you say, what do I need to know that would 1116 00:56:54,640 --> 00:56:56,180 let me solve my problem? 1117 00:56:56,180 --> 00:56:58,900 And if you can identify that and the number of choices 1118 00:56:58,900 --> 00:57:01,900 for what you need to know is polynomial, 1119 00:57:01,900 --> 00:57:05,710 then you will be able to get a polynomial dynamic program. 1120 00:57:05,710 --> 00:57:06,730 That's the intuition. 1121 00:57:06,730 --> 00:57:10,620 You'll see a lot more examples in the next three lectures.