WEBVTT
00:00:00.820 --> 00:00:04.100
We will now develop the
solution to the linear least
00:00:04.100 --> 00:00:06.910
mean squares estimation problem.
00:00:06.910 --> 00:00:09.630
We want to find
coefficients a and b
00:00:09.630 --> 00:00:12.970
that make this expression,
this mean squared error,
00:00:12.970 --> 00:00:14.640
as small as possible.
00:00:14.640 --> 00:00:18.200
We will approach this problem
by proceeding in two stages.
00:00:18.200 --> 00:00:22.290
In the first stage, we
assume that the choice of a
00:00:22.290 --> 00:00:25.080
has already been
found and concentrate
00:00:25.080 --> 00:00:27.470
on the question of choosing b.
00:00:27.470 --> 00:00:30.310
Now if a has been
found and has been
00:00:30.310 --> 00:00:33.520
fixed to a specific
number, then this quantity
00:00:33.520 --> 00:00:36.400
here is a specific
random variable.
00:00:36.400 --> 00:00:38.090
And what do we have?
00:00:38.090 --> 00:00:40.930
We have a random variable
minus a constant.
00:00:40.930 --> 00:00:42.760
And we want to
choose that constant,
00:00:42.760 --> 00:00:47.100
so that this difference
squared is as small as possible
00:00:47.100 --> 00:00:49.180
in the expected value sense.
00:00:49.180 --> 00:00:52.750
So essentially, we're trying
to choose a constant b that
00:00:52.750 --> 00:00:56.740
estimates this random variable
in the best possible way.
00:00:56.740 --> 00:00:59.040
But this is a problem
that's familiar to us,
00:00:59.040 --> 00:01:01.830
and we know that
the best choice of b
00:01:01.830 --> 00:01:06.020
is equal to the expected
value of that random variable.
00:01:06.020 --> 00:01:09.630
And using also linearity,
this expected value
00:01:09.630 --> 00:01:13.070
can be written in this form.
00:01:13.070 --> 00:01:18.020
So if we know a, this is
how b should be chosen.
00:01:18.020 --> 00:01:22.840
Let us now move on
to the choice of a.
00:01:22.840 --> 00:01:26.930
Since we know what b
should be equal to,
00:01:26.930 --> 00:01:30.250
we can rewrite this
expression that we're
00:01:30.250 --> 00:01:35.130
trying to minimize by
substituting our choice of b,
00:01:35.130 --> 00:01:38.393
which is the expected
value of Theta minus aX.
00:01:43.229 --> 00:01:45.590
And this is the quantity
we want to minimize.
00:01:45.590 --> 00:01:46.660
What is it?
00:01:46.660 --> 00:01:49.430
We have a random variable
minus the expected value
00:01:49.430 --> 00:01:51.259
of that random variable squared.
00:01:51.259 --> 00:01:53.110
And then we take expected value.
00:01:53.110 --> 00:02:00.620
This is just the variance of the
random variable Theta minus aX.
00:02:00.620 --> 00:02:02.680
This is the variance
of the difference
00:02:02.680 --> 00:02:04.660
of two random variables.
00:02:04.660 --> 00:02:07.550
Because the two random
variables are dependent,
00:02:07.550 --> 00:02:11.640
this is not just the sum of
the individual variances.
00:02:11.640 --> 00:02:15.630
But we do have a formula,
even for the general case.
00:02:15.630 --> 00:02:18.720
And the formula tells
us that the variance
00:02:18.720 --> 00:02:20.780
of the difference of
two random variables
00:02:20.780 --> 00:02:24.010
is the variance of the
first random variable
00:02:24.010 --> 00:02:26.420
plus the variance of the second.
00:02:26.420 --> 00:02:29.170
And when we pull a
outside the variance,
00:02:29.170 --> 00:02:33.000
that gives us a
contribution of a squared.
00:02:33.000 --> 00:02:35.400
And then we have a cross-term.
00:02:35.400 --> 00:02:37.680
Because of the minus
sign here, the cross-term
00:02:37.680 --> 00:02:39.329
will have a minus sign.
00:02:39.329 --> 00:02:41.600
We have a factor of 2.
00:02:41.600 --> 00:02:46.120
And then we want the
covariance of Theta with aX.
00:02:46.120 --> 00:02:49.390
And because we have seen
that covariances behave
00:02:49.390 --> 00:02:53.270
in a linear manner, we can
pull a outside the covariance.
00:02:53.270 --> 00:02:55.940
And this is what
we are left with.
00:02:55.940 --> 00:02:59.710
This is the quantity we want
to optimize with respect to a.
00:02:59.710 --> 00:03:02.570
And to do this
optimization, we just
00:03:02.570 --> 00:03:07.320
set the derivative
with respect to a to 0.
00:03:07.320 --> 00:03:13.180
And that's going to give us
2a times the variance of X
00:03:13.180 --> 00:03:19.850
minus twice the covariance
of Theta with X, equal to 0.
00:03:19.850 --> 00:03:22.460
From which it
follows that a should
00:03:22.460 --> 00:03:25.310
be equal to the
covariance of Theta
00:03:25.310 --> 00:03:31.670
with X divided by
the variance of X.
00:03:31.670 --> 00:03:34.140
So we have found
what a should be.
00:03:34.140 --> 00:03:38.110
Once we know what a is, we
know what b should be equal to.
00:03:38.110 --> 00:03:40.079
So we have solved the problem.
00:03:40.079 --> 00:03:43.030
And here's the form
of the solution.
00:03:43.030 --> 00:03:48.050
This coefficient here
is the coefficient a.
00:03:48.050 --> 00:03:50.990
So we have here the term aX.
00:03:50.990 --> 00:03:55.940
And then this term together
with a times the expected value
00:03:55.940 --> 00:04:01.290
of X, this corresponds
to the coefficient b.
00:04:01.290 --> 00:04:04.070
It's also instructive
to rewrite this solution
00:04:04.070 --> 00:04:07.460
in a slightly different form
involving the correlation
00:04:07.460 --> 00:04:08.870
coefficient.
00:04:08.870 --> 00:04:11.930
Recall that the correlation
coefficient between two
00:04:11.930 --> 00:04:16.730
random variables is defined as
the covariance between the two
00:04:16.730 --> 00:04:20.350
random variables
divided by the product
00:04:20.350 --> 00:04:23.560
of their standard deviations.
00:04:23.560 --> 00:04:27.340
Using this relation, we can
now write the coefficient a
00:04:27.340 --> 00:04:34.060
as the covariance which is
who times sigma Theta sigma
00:04:34.060 --> 00:04:41.430
X divided by the variance of
X, which is sigma X squared.
00:04:41.430 --> 00:04:44.920
And after we cancel a factor
of sigma X from the numerator
00:04:44.920 --> 00:04:48.640
and the denominator,
we see that a is also
00:04:48.640 --> 00:04:52.640
equal to rho times sigma
Theta divided by sigma X.
00:04:52.640 --> 00:04:55.190
And this gives us
this alternative form
00:04:55.190 --> 00:04:57.470
for the solution.
00:04:57.470 --> 00:05:01.360
What we will be doing next will
be to interpret this solution
00:05:01.360 --> 00:05:04.410
and also to give a
number of examples.