1 00:00:00,000 --> 00:00:08,520 CHRISTINE BREINER: Welcome back to recitation. 2 00:00:08,520 --> 00:00:12,670 In this video, what I'd like you to do is use least squares 3 00:00:12,670 --> 00:00:15,200 to fit a line to the following data, which includes three 4 00:00:15,200 --> 00:00:18,520 points: the point 0, 1, the point 2, 1, and 5 00:00:18,520 --> 00:00:19,910 the point 3, 4. 6 00:00:19,910 --> 00:00:23,272 And I've drawn a rough picture where these points are on a 7 00:00:23,272 --> 00:00:26,690 graph, and I'll be talking a little bit about that after 8 00:00:26,690 --> 00:00:28,880 you try this problem. 9 00:00:28,880 --> 00:00:32,010 So why don't you try to solve the problem based on the 10 00:00:32,010 --> 00:00:34,240 technique you learned in class, and then bring the 11 00:00:34,240 --> 00:00:35,460 video back up. 12 00:00:35,460 --> 00:00:37,990 I'll show you again what you're actually doing, and 13 00:00:37,990 --> 00:00:41,220 then I will also solve the problem and you can see, you 14 00:00:41,220 --> 00:00:42,490 can compare your answer to mine. 15 00:00:42,490 --> 00:00:51,240 16 00:00:51,240 --> 00:00:52,460 OK, welcome back. 17 00:00:52,460 --> 00:00:54,630 So the first thing I want to do when we're talking about 18 00:00:54,630 --> 00:00:57,630 least squares to fit a line to these data, what I'd like to 19 00:00:57,630 --> 00:00:59,770 do is I'm going to draw a line on here, and we're going to 20 00:00:59,770 --> 00:01:01,360 talk about what the least squares actually 21 00:01:01,360 --> 00:01:02,890 is trying to do. 22 00:01:02,890 --> 00:01:06,330 And so I'm going to draw this line, say, 23 00:01:06,330 --> 00:01:08,390 something like this. 24 00:01:08,390 --> 00:01:14,500 That's my first guess on what might be the actual least 25 00:01:14,500 --> 00:01:16,730 squares line for these data. 26 00:01:16,730 --> 00:01:19,730 And if you recall, what you're actually trying to do is 27 00:01:19,730 --> 00:01:22,300 you're trying to minimize a certain quantity, and the 28 00:01:22,300 --> 00:01:25,230 quantity you're trying to minimize is the difference 29 00:01:25,230 --> 00:01:29,200 between the actual value you get and the expected value you 30 00:01:29,200 --> 00:01:30,920 get, the square of that. 31 00:01:30,920 --> 00:01:35,120 So this is one thing we're going to square. 32 00:01:35,120 --> 00:01:40,830 So even pictorially, we could say we have something that 33 00:01:40,830 --> 00:01:44,540 contributes some area, OK, and then this is another thing 34 00:01:44,540 --> 00:01:47,260 that's the difference between the expected value and y and 35 00:01:47,260 --> 00:01:50,370 the actual value you get. 36 00:01:50,370 --> 00:01:52,440 So let me try and make that squared, because we square 37 00:01:52,440 --> 00:01:54,100 that value. 38 00:01:54,100 --> 00:01:55,800 And then this last one is pretty small. 39 00:01:55,800 --> 00:01:58,610 The difference between the expected value and the actual 40 00:01:58,610 --> 00:02:01,170 value is very small over in this case from 41 00:02:01,170 --> 00:02:02,970 the picture I drew. 42 00:02:02,970 --> 00:02:05,240 And so what you're trying to do is you're trying to find 43 00:02:05,240 --> 00:02:08,800 the line that minimizes the area of the sum of these 44 00:02:08,800 --> 00:02:11,020 three, in this case, three squares. 45 00:02:11,020 --> 00:02:14,750 So if I, you know, if I tip the line down further here, 46 00:02:14,750 --> 00:02:17,920 but I kept it fixed there, the area of this square would get 47 00:02:17,920 --> 00:02:21,550 bigger, because this distance would get bigger, but, of 48 00:02:21,550 --> 00:02:23,800 course, the area of this square would get smaller. 49 00:02:23,800 --> 00:02:27,130 And so you're trying to figure out a way to optimize, in this 50 00:02:27,130 --> 00:02:30,330 case, minimize, the area of these-- 51 00:02:30,330 --> 00:02:34,230 sorry, the sum of the areas of these squares by figuring out 52 00:02:34,230 --> 00:02:36,310 where to put the line. 53 00:02:36,310 --> 00:02:39,730 Now, I mentioned if I could fix this here and move it up 54 00:02:39,730 --> 00:02:42,120 and down, then that would change the 55 00:02:42,120 --> 00:02:43,950 slope but fix the intercept. 56 00:02:43,950 --> 00:02:47,490 But I could also move the intercept up and down, and I 57 00:02:47,490 --> 00:02:50,520 could slide the line up and down, and that would fix the 58 00:02:50,520 --> 00:02:54,130 slope but vary the intercept, or I could do both. 59 00:02:54,130 --> 00:02:55,730 And ultimately, that's what you're doing. 60 00:02:55,730 --> 00:02:59,720 You're trying to find the best y-intercept and the best slope 61 00:02:59,720 --> 00:03:02,790 at the same time, and that's why you learned this process 62 00:03:02,790 --> 00:03:06,710 in multivariable calculus, because you're trying to 63 00:03:06,710 --> 00:03:09,520 optimize something that has two quantities 64 00:03:09,520 --> 00:03:10,580 that you can change. 65 00:03:10,580 --> 00:03:13,920 You can change slope and you can change the y-intercept, so 66 00:03:13,920 --> 00:03:16,400 that's why you learned it now instead of in 67 00:03:16,400 --> 00:03:18,290 single-variable calculus. 68 00:03:18,290 --> 00:03:22,030 Now, if we come over here, I wrote down ahead of time what 69 00:03:22,030 --> 00:03:26,850 the equations are you get when you optimize, when you're 70 00:03:26,850 --> 00:03:29,430 trying to optimize the least squares line. 71 00:03:29,430 --> 00:03:32,360 So if you write down, you know, the function as a 72 00:03:32,360 --> 00:03:34,230 function of a and b, and you take the derivative with 73 00:03:34,230 --> 00:03:36,770 respect to a and you set it equal to zero, you get the 74 00:03:36,770 --> 00:03:37,365 first equation. 75 00:03:37,365 --> 00:03:39,820 When you take the derivative with respect to b and you set 76 00:03:39,820 --> 00:03:41,980 it equal to zero, you get the second equation. 77 00:03:41,980 --> 00:03:45,780 And so I'm just going to fill in the details and then tell 78 00:03:45,780 --> 00:03:48,510 you what the solution is, and then I'm going to mention a 79 00:03:48,510 --> 00:03:51,090 little more sophisticated thing we can do, or I guess 80 00:03:51,090 --> 00:03:54,850 maybe the next level of least squares we can do, that's not 81 00:03:54,850 --> 00:03:56,240 a linear problem. 82 00:03:56,240 --> 00:03:57,870 So I'll do that last. 83 00:03:57,870 --> 00:04:00,480 So what I want to do is I put all the values I need over 84 00:04:00,480 --> 00:04:01,420 here on the right. 85 00:04:01,420 --> 00:04:02,840 So all the values we need are on the right. 86 00:04:02,840 --> 00:04:05,780 We're going to need the sum of all the xi's and that's 5. 87 00:04:05,780 --> 00:04:07,565 The sum of all the yi's is 6. 88 00:04:07,565 --> 00:04:10,880 The sum of the xi squareds is 13, and the sum of the xi yi 89 00:04:10,880 --> 00:04:13,040 product is 14. 90 00:04:13,040 --> 00:04:14,110 Now, where are these coming from? 91 00:04:14,110 --> 00:04:15,780 Remember, what is xi and what is yi? 92 00:04:15,780 --> 00:04:21,890 If we come back over here, this is x1, y1, 93 00:04:21,890 --> 00:04:25,330 x2, y2, x3, y3, OK? 94 00:04:25,330 --> 00:04:28,830 So that's sort of how we get-- if we switch the whole pairs 95 00:04:28,830 --> 00:04:30,120 around, it doesn't matter. 96 00:04:30,120 --> 00:04:33,310 Obviously, if we switched x- and y-values together, it does 97 00:04:33,310 --> 00:04:35,900 matter, and that you see actually from the equation 98 00:04:35,900 --> 00:04:36,900 does matter. 99 00:04:36,900 --> 00:04:38,850 Right here you have a product of them. 100 00:04:38,850 --> 00:04:40,450 So let me write this out. 101 00:04:40,450 --> 00:04:43,750 And again, we're solving for a and b, so in this case, a is 102 00:04:43,750 --> 00:04:46,890 the slope and b is the intercept, right? 103 00:04:46,890 --> 00:04:48,080 So let me write everything in. 104 00:04:48,080 --> 00:04:51,570 So we actually get that we want to solve the following 105 00:04:51,570 --> 00:05:00,800 system: 13a plus 5b is equal to 14. 106 00:05:00,800 --> 00:05:03,380 Make sure I put in everything correctly. 107 00:05:03,380 --> 00:05:08,780 And then the second one is 5a plus-- 108 00:05:08,780 --> 00:05:12,240 n in this case is 3, because there were three points-- 109 00:05:12,240 --> 00:05:16,130 plus 3b is equal to 6. 110 00:05:16,130 --> 00:05:18,930 OK, this is a system of equations. 111 00:05:18,930 --> 00:05:22,420 It has two equations and two unknowns, and 112 00:05:22,420 --> 00:05:23,330 if you solve this-- 113 00:05:23,330 --> 00:05:25,820 I should look at my notes, because I didn't memorize it 114 00:05:25,820 --> 00:05:27,620 and I'm not going to do all the algebra. 115 00:05:27,620 --> 00:05:31,050 If you go to solve this, what you actually get is that a is 116 00:05:31,050 --> 00:05:34,780 equal to 6/7 and b is equal to 4/7. 117 00:05:34,780 --> 00:05:36,290 So you get that the line-- 118 00:05:36,290 --> 00:05:38,470 let me write it here. 119 00:05:38,470 --> 00:05:45,590 You get a is equal to 6/7 and b is equal to 4/7, OK? 120 00:05:45,590 --> 00:05:48,720 And so what that tells you is if you come back over here, 121 00:05:48,720 --> 00:05:50,100 the line we actually want-- 122 00:05:50,100 --> 00:05:52,090 I'll write the solution right here-- the line we actually 123 00:05:52,090 --> 00:05:59,850 want is the line y equals 6/7x plus 4/7. 124 00:05:59,850 --> 00:06:01,717 That is our least squares line that is our solution. 125 00:06:01,717 --> 00:06:03,930 OK? 126 00:06:03,930 --> 00:06:05,150 So that is actually-- 127 00:06:05,150 --> 00:06:09,490 solving for a line to fit these data, you get the 128 00:06:09,490 --> 00:06:11,600 following solution. 129 00:06:11,600 --> 00:06:14,120 But I want to point out, and what we're going to do, after 130 00:06:14,120 --> 00:06:16,920 this video there will be a challenge problem to have you 131 00:06:16,920 --> 00:06:18,560 actually finish this. 132 00:06:18,560 --> 00:06:22,190 I want to point out that you can actually also fit a 133 00:06:22,190 --> 00:06:25,040 higher-degree polynomial to these data. 134 00:06:25,040 --> 00:06:27,600 For instance, you could try and use the technique of least 135 00:06:27,600 --> 00:06:30,720 squares to fit a parabola to these data. 136 00:06:30,720 --> 00:06:32,230 And let me point out what the function would look 137 00:06:32,230 --> 00:06:33,820 like in this case. 138 00:06:33,820 --> 00:06:38,470 So this is for the challenge problem. 139 00:06:38,470 --> 00:06:41,740 For the challenge problem, it now will be a function of 140 00:06:41,740 --> 00:06:44,820 three variables, so it will look something like this. 141 00:06:44,820 --> 00:06:49,590 We'll say a comma b comma c, and the point that we'll be 142 00:06:49,590 --> 00:06:52,090 wanting to minimize, or the function we'll be wanting to 143 00:06:52,090 --> 00:06:55,460 minimize, we're summing over i, again from 1 to 3-- 144 00:06:55,460 --> 00:06:57,610 I didn't write that down above, but it was fairly 145 00:06:57,610 --> 00:07:00,350 obvious that that was what we were summing over-- 146 00:07:00,350 --> 00:07:02,000 the following thing. 147 00:07:02,000 --> 00:07:12,490 We have yi minus this big quantity: axi squared plus bxi 148 00:07:12,490 --> 00:07:15,820 plus c and we square the whole thing. 149 00:07:15,820 --> 00:07:18,300 And so what this is actually giving you is, just as in the 150 00:07:18,300 --> 00:07:21,370 picture before, this is giving you the difference between the 151 00:07:21,370 --> 00:07:23,720 expected value and the actual value. 152 00:07:23,720 --> 00:07:27,390 So this is your actual y-value, and based on the 153 00:07:27,390 --> 00:07:30,070 parabola you're looking for, what's in parentheses is your 154 00:07:30,070 --> 00:07:32,830 expected y-value, right? 155 00:07:32,830 --> 00:07:36,330 So this would actually be my parabola. 156 00:07:36,330 --> 00:07:38,720 When I evaluate it at xi, I get a number. 157 00:07:38,720 --> 00:07:42,330 I get a value, and that value will be on the parabola. 158 00:07:42,330 --> 00:07:45,270 This will be the y-value associated to the 159 00:07:45,270 --> 00:07:46,550 x-value from the data. 160 00:07:46,550 --> 00:07:48,720 So this is the actual value. 161 00:07:48,720 --> 00:07:50,390 This is the expected value. 162 00:07:50,390 --> 00:07:52,280 We take the difference and square it. 163 00:07:52,280 --> 00:07:55,400 And because we want to minimize how we're going to 164 00:07:55,400 --> 00:07:58,760 solve this problem and this is what you'll have to do is, 165 00:07:58,760 --> 00:08:02,230 just like with the line situation, you're going to 166 00:08:02,230 --> 00:08:04,780 take the derivative of this function with respect to each 167 00:08:04,780 --> 00:08:06,800 of these three variables separately. 168 00:08:06,800 --> 00:08:09,840 So you'll take f sub a, and then you'll set 169 00:08:09,840 --> 00:08:11,020 that equal to zero. 170 00:08:11,020 --> 00:08:14,440 And then you'll take f sub b, you'll set that equal to zero. 171 00:08:14,440 --> 00:08:16,330 And then you'll take f sub c, and you'll set 172 00:08:16,330 --> 00:08:16,870 that equal to zero. 173 00:08:16,870 --> 00:08:19,020 Again, we set them all equal to zero, because this is 174 00:08:19,020 --> 00:08:20,650 really an optimization problem. 175 00:08:20,650 --> 00:08:22,040 We want to minimize something. 176 00:08:22,040 --> 00:08:24,530 We want to minimize this quantity. 177 00:08:24,530 --> 00:08:27,730 So you take each of those three derivatives, partial 178 00:08:27,730 --> 00:08:30,360 derivatives, set them equal to zero, and you have a system of 179 00:08:30,360 --> 00:08:32,440 three equations with three variables. 180 00:08:32,440 --> 00:08:35,960 And you know how to solve such a system, by using matrices, 181 00:08:35,960 --> 00:08:36,680 for instance. 182 00:08:36,680 --> 00:08:38,840 That would be one way to solve such a system. 183 00:08:38,840 --> 00:08:44,850 And so then you can actually come up with a formula for the 184 00:08:44,850 --> 00:08:48,320 sort of parabola of best fit, you could think of it as. 185 00:08:48,320 --> 00:08:50,500 And so I'm going to stop there, because I think from 186 00:08:50,500 --> 00:08:52,250 here, you guys will do it. 187 00:08:52,250 --> 00:08:54,340 But I just want to point out this is where you're going to 188 00:08:54,340 --> 00:08:58,340 be starting this next sort of challenge problem to find the 189 00:08:58,340 --> 00:09:01,930 quadratic of best fit for these three data points. 190 00:09:01,930 --> 00:09:03,740 So that's where I'll end it. 191 00:09:03,740 --> 00:09:04,010