WEBVTT
00:00:00.000 --> 00:00:08.520
CHRISTINE BREINER: Welcome
back to recitation.
00:00:08.520 --> 00:00:11.060
In this video, what
I'd like you to do
00:00:11.060 --> 00:00:14.040
is use least squares to
fit a line to the following
00:00:14.040 --> 00:00:17.210
data, which includes three
points: the point (0, 1),
00:00:17.210 --> 00:00:19.910
the point (2, 1),
and the point (3, 4).
00:00:19.910 --> 00:00:23.050
And I've drawn a rough
picture where these points are
00:00:23.050 --> 00:00:26.100
on a graph, and I'll be
talking a little bit about that
00:00:26.100 --> 00:00:28.880
after you try this problem.
00:00:28.880 --> 00:00:31.657
So why don't you try to
solve the problem based
00:00:31.657 --> 00:00:33.240
on the technique you
learned in class,
00:00:33.240 --> 00:00:35.460
and then bring
the video back up.
00:00:35.460 --> 00:00:37.740
I'll show you again what
you're actually doing,
00:00:37.740 --> 00:00:40.990
and then I will also solve
the problem and you can see,
00:00:40.990 --> 00:00:42.490
you can compare
your answer to mine.
00:00:51.240 --> 00:00:52.317
OK, welcome back.
00:00:52.317 --> 00:00:54.900
So the first thing I want to do
when we're talking about least
00:00:54.900 --> 00:00:57.730
squares to fit a line to these
data, what I'd like to do
00:00:57.730 --> 00:00:59.375
is I'm going to
draw a line on here,
00:00:59.375 --> 00:01:01.916
and we're going to talk about
what the least squares actually
00:01:01.916 --> 00:01:02.890
is trying to do.
00:01:02.890 --> 00:01:08.390
And so I'm going to draw this
line, say, something like this.
00:01:08.390 --> 00:01:10.700
That's my first
guess on what might
00:01:10.700 --> 00:01:16.730
be the actual least squares
line for these data.
00:01:16.730 --> 00:01:19.290
And if you recall, what
you're actually trying to do
00:01:19.290 --> 00:01:21.530
is you're trying to
minimize a certain quantity,
00:01:21.530 --> 00:01:23.770
and the quantity you're
trying to minimize
00:01:23.770 --> 00:01:27.440
is the difference between
the actual value you get
00:01:27.440 --> 00:01:30.920
and the expected value you
get, the square of that.
00:01:30.920 --> 00:01:35.120
So this is one thing
we're going to square.
00:01:35.120 --> 00:01:37.920
So even pictorially,
we could say
00:01:37.920 --> 00:01:43.109
we have something that
contributes some area, OK,
00:01:43.109 --> 00:01:44.650
and then this is
another thing that's
00:01:44.650 --> 00:01:47.120
the difference between
the expected value in y
00:01:47.120 --> 00:01:50.171
and the actual value you get.
00:01:50.171 --> 00:01:51.670
So let me try and
make that squared,
00:01:51.670 --> 00:01:54.100
because we square that value.
00:01:54.100 --> 00:01:55.800
And then this last
one is pretty small.
00:01:55.800 --> 00:01:57.680
The difference between
the expected value
00:01:57.680 --> 00:02:00.950
and the actual value is
very small over in this case
00:02:00.950 --> 00:02:02.814
from the picture I drew.
00:02:02.814 --> 00:02:04.230
And so what you're
trying to do is
00:02:04.230 --> 00:02:06.350
you're trying to
find the line that
00:02:06.350 --> 00:02:09.220
minimizes the area of
the sum of these three,
00:02:09.220 --> 00:02:11.020
in this case, three squares.
00:02:11.020 --> 00:02:14.750
So if I, you know, if I tip
the line down further here,
00:02:14.750 --> 00:02:17.600
but I kept it fixed there,
the area of this square
00:02:17.600 --> 00:02:21.160
would get bigger, because this
distance would get bigger.
00:02:21.160 --> 00:02:23.624
But, of course, the area of
this square would get smaller.
00:02:23.624 --> 00:02:25.040
And so you're
trying to figure out
00:02:25.040 --> 00:02:28.080
a way to optimize, in
this case, minimize,
00:02:28.080 --> 00:02:31.180
the area of these--
sorry, the sum
00:02:31.180 --> 00:02:34.230
of the areas of these
squares by figuring out
00:02:34.230 --> 00:02:36.310
where to put the line.
00:02:36.310 --> 00:02:39.600
Now, I mentioned if I could
fix this here and move it
00:02:39.600 --> 00:02:42.550
up and down, then that
would change the slope
00:02:42.550 --> 00:02:43.942
but fix the intercept.
00:02:43.942 --> 00:02:45.942
But I could also move the
intercept up and down,
00:02:45.942 --> 00:02:48.830
and I could slide
the line up and down,
00:02:48.830 --> 00:02:52.980
and that would fix the slope
but vary the intercept,
00:02:52.980 --> 00:02:54.022
or I could do both.
00:02:54.022 --> 00:02:55.730
And ultimately, that's
what you're doing.
00:02:55.730 --> 00:02:59.720
You're trying to find the best
y-intercept and the best slope
00:02:59.720 --> 00:03:02.790
at the same time, and that's
why you learned this process
00:03:02.790 --> 00:03:06.150
in multivariable
calculus, because you're
00:03:06.150 --> 00:03:08.920
trying to optimize
something that has two
00:03:08.920 --> 00:03:10.580
quantities that you can change.
00:03:10.580 --> 00:03:13.780
You can change slope and you
can change the y-intercept,
00:03:13.780 --> 00:03:15.510
so that's why you
learned it now instead
00:03:15.510 --> 00:03:18.290
of in single-variable calculus.
00:03:18.290 --> 00:03:20.180
Now, if we come over
here, I wrote down ahead
00:03:20.180 --> 00:03:26.590
of time what the equations
are you get when you optimize,
00:03:26.590 --> 00:03:29.430
when you're trying to optimize
the least squares line.
00:03:29.430 --> 00:03:32.150
So if you write down,
you know, the function
00:03:32.150 --> 00:03:33.610
as a function of
a and b, and you
00:03:33.610 --> 00:03:35.350
take the derivative
with respect to a
00:03:35.350 --> 00:03:37.365
and you set it equal to zero,
you get the first equation.
00:03:37.365 --> 00:03:38.930
When you take the
derivative with respect
00:03:38.930 --> 00:03:40.420
to b and you set
it equal to zero,
00:03:40.420 --> 00:03:41.980
you get the second equation.
00:03:41.980 --> 00:03:45.020
And so I'm just going
to fill in the details
00:03:45.020 --> 00:03:46.790
and then tell you
what the solution is,
00:03:46.790 --> 00:03:49.820
and then I'm going to mention a
little more sophisticated thing
00:03:49.820 --> 00:03:52.670
we can do, or I guess
maybe the next level
00:03:52.670 --> 00:03:56.240
of least squares we can do,
that's not a linear problem.
00:03:56.240 --> 00:03:57.870
So I'll do that last.
00:03:57.870 --> 00:04:00.650
So what I want to do is I put
all the values I need over here
00:04:00.650 --> 00:04:01.420
on the right.
00:04:01.420 --> 00:04:03.211
So all the values we
need are on the right.
00:04:03.211 --> 00:04:05.780
We're going to need the sum
of all the x_i's and that's 5.
00:04:05.780 --> 00:04:07.565
The sum of all the y_i's is 6.
00:04:07.565 --> 00:04:09.345
The sum of the x_i
squareds is 13,
00:04:09.345 --> 00:04:12.736
and the sum of the
x_i*y_i product is 14.
00:04:12.736 --> 00:04:14.110
Now, where are
these coming from?
00:04:14.110 --> 00:04:15.780
Remember, what is
x_i and what is y_i?
00:04:15.780 --> 00:04:23.580
If we come back over here,
this is (x_1, y_1), (x_2, y_2),
00:04:23.580 --> 00:04:25.330
(x_3, y_3), OK?
00:04:25.330 --> 00:04:28.830
So that's sort of how we get--
if we switch the whole pairs
00:04:28.830 --> 00:04:30.120
around, it doesn't matter.
00:04:30.120 --> 00:04:32.960
Obviously, if we switched
x- and y-values together,
00:04:32.960 --> 00:04:35.900
it does matter, and that you
see actually from the equation
00:04:35.900 --> 00:04:36.900
does matter.
00:04:36.900 --> 00:04:38.850
Right here you have
a product of them.
00:04:38.850 --> 00:04:40.450
So let me write this out.
00:04:40.450 --> 00:04:43.270
And again, we're solving for
a and b, so in this case,
00:04:43.270 --> 00:04:46.830
a is the slope and b is
the intercept, right?
00:04:46.830 --> 00:04:48.080
So let me write everything in.
00:04:48.080 --> 00:04:52.640
So we actually get that we want
to solve the following system:
00:04:52.640 --> 00:05:00.800
13a plus 5b is equal to 14.
00:05:00.800 --> 00:05:03.380
Make sure I put in
everything correctly.
00:05:03.380 --> 00:05:09.570
And then the second one is
5a plus-- n in this case
00:05:09.570 --> 00:05:16.130
is 3, because there were three
points-- plus 3b is equal to 6.
00:05:16.130 --> 00:05:18.930
OK, this is a
system of equations.
00:05:18.930 --> 00:05:22.180
It has two equations
and two unknowns,
00:05:22.180 --> 00:05:24.480
and if you solve this-- I
should look at my notes,
00:05:24.480 --> 00:05:26.240
because I didn't
memorize it and I'm not
00:05:26.240 --> 00:05:27.620
going to do all the algebra.
00:05:27.620 --> 00:05:30.230
If you go to solve this,
what you actually get
00:05:30.230 --> 00:05:34.780
is that a is equal to 6/7
and b is equal to 4/7.
00:05:34.780 --> 00:05:38.470
So you get that the line--
let me write it here.
00:05:38.470 --> 00:05:45.590
You get a is equal to 6/7
and b is equal to 4/7, OK?
00:05:45.590 --> 00:05:48.720
And so what that tells you is
if you come back over here,
00:05:48.720 --> 00:05:51.410
the line we actually want-- I'll
write the solution right here--
00:05:51.410 --> 00:05:59.551
the line we actually want is the
line y equals 6/7 x plus 4/7.
00:05:59.551 --> 00:06:01.717
That is our least squares
line that is our solution.
00:06:01.717 --> 00:06:03.930
OK?
00:06:03.930 --> 00:06:08.990
So that is actually-- solving
for a line to fit these data,
00:06:08.990 --> 00:06:11.514
you get the following solution.
00:06:11.514 --> 00:06:13.680
But I want to point out,
and what we're going to do,
00:06:13.680 --> 00:06:16.780
after this video there will be
a challenge problem, to have
00:06:16.780 --> 00:06:18.502
you actually finish this.
00:06:18.502 --> 00:06:20.210
I want to point out
that you can actually
00:06:20.210 --> 00:06:25.040
also fit a higher-degree
polynomial to these data.
00:06:25.040 --> 00:06:26.840
For instance, you
could try and use
00:06:26.840 --> 00:06:29.360
the technique of least
squares to fit a parabola
00:06:29.360 --> 00:06:30.267
to these data.
00:06:30.267 --> 00:06:32.100
And let me point out
what the function would
00:06:32.100 --> 00:06:33.820
look like in this case.
00:06:33.820 --> 00:06:38.470
So this is for the
challenge problem.
00:06:38.470 --> 00:06:40.380
For the challenge
problem, it now
00:06:40.380 --> 00:06:43.320
will be a function
of three variables,
00:06:43.320 --> 00:06:44.820
so it will look
something like this.
00:06:44.820 --> 00:06:49.470
We'll say a comma b comma
c, and the point that we'll
00:06:49.470 --> 00:06:51.450
be wanting to minimize,
or the function
00:06:51.450 --> 00:06:54.720
we'll be wanting to minimize,
we're summing over i, again
00:06:54.720 --> 00:06:56.860
from 1 to 3-- I didn't
write that down above,
00:06:56.860 --> 00:06:58.340
but it was fairly
obvious that that
00:06:58.340 --> 00:07:02.000
was what we were summing
over-- the following thing.
00:07:02.000 --> 00:07:08.110
We have y_i minus
this big quantity:
00:07:08.110 --> 00:07:15.820
a x_i squared plus b*x_i plus c
and we square the whole thing.
00:07:15.820 --> 00:07:17.430
And so what this
is actually giving
00:07:17.430 --> 00:07:19.090
you is, just as in
the picture before,
00:07:19.090 --> 00:07:20.465
this is giving
you the difference
00:07:20.465 --> 00:07:23.720
between the expected value
and the actual value.
00:07:23.720 --> 00:07:28.040
So this is your actual y-value,
and based on the parabola
00:07:28.040 --> 00:07:29.940
you're looking for,
what's in parentheses is
00:07:29.940 --> 00:07:32.830
your expected y-value, right?
00:07:32.830 --> 00:07:36.330
So this would actually
be my parabola.
00:07:36.330 --> 00:07:38.720
When I evaluate it at
x_i, I get a number.
00:07:38.720 --> 00:07:42.330
I get a value, and that value
will be on the parabola.
00:07:42.330 --> 00:07:44.870
This will be the
y-value associated
00:07:44.870 --> 00:07:46.550
to the x-value from the data.
00:07:46.550 --> 00:07:48.720
So this is the actual value.
00:07:48.720 --> 00:07:50.390
This is the expected value.
00:07:50.390 --> 00:07:52.280
We take the difference
and square it.
00:07:52.280 --> 00:07:54.960
And because we want
to minimize how
00:07:54.960 --> 00:07:56.360
we're going to
solve this problem
00:07:56.360 --> 00:07:58.470
and this is what
you'll have to do
00:07:58.470 --> 00:08:01.660
is, just like with
the line situation,
00:08:01.660 --> 00:08:03.880
you're going to take the
derivative of this function
00:08:03.880 --> 00:08:06.800
with respect to each of these
three variables separately.
00:08:06.800 --> 00:08:11.020
So you'll take f sub a, and then
you'll set that equal to zero.
00:08:11.020 --> 00:08:14.224
And then you'll take f sub b,
you'll set that equal to zero.
00:08:14.224 --> 00:08:15.890
And then you'll take
f sub c, and you'll
00:08:15.890 --> 00:08:16.870
set that equal to zero.
00:08:16.870 --> 00:08:18.460
Again, we set them
all equal to zero,
00:08:18.460 --> 00:08:20.650
because this is really
an optimization problem.
00:08:20.650 --> 00:08:22.040
We want to minimize something.
00:08:22.040 --> 00:08:24.530
We want to minimize
this quantity.
00:08:24.530 --> 00:08:27.217
So you take each of
those three derivatives,
00:08:27.217 --> 00:08:29.050
partial derivatives,
set them equal to zero,
00:08:29.050 --> 00:08:32.440
and you have a system of three
equations with three variables.
00:08:32.440 --> 00:08:35.960
And you know how to solve such
a system, by using matrices,
00:08:35.960 --> 00:08:36.680
for instance.
00:08:36.680 --> 00:08:38.840
That would be one way
to solve such a system.
00:08:38.840 --> 00:08:40.410
And so then you can
actually come up
00:08:40.410 --> 00:08:46.860
with a formula for the sort
of parabola of best fit,
00:08:46.860 --> 00:08:48.320
you could think of it as.
00:08:48.320 --> 00:08:50.920
And so I'm going to stop there,
because I think from here,
00:08:50.920 --> 00:08:52.250
you guys will do it.
00:08:52.250 --> 00:08:53.680
But I just want
to point out this
00:08:53.680 --> 00:08:55.442
is where you're
going to be starting
00:08:55.442 --> 00:08:56.900
this next sort of
challenge problem
00:08:56.900 --> 00:09:00.900
to find the quadratic of
best fit for these three data
00:09:00.900 --> 00:09:01.930
points.
00:09:01.930 --> 00:09:03.511
So that's where I'll end it.