WEBVTT
00:00:02.550 --> 00:00:06.070
In the next variation we
consider, all random variables
00:00:06.070 --> 00:00:07.920
are continuous.
00:00:07.920 --> 00:00:11.070
For this case, we do have
a Bayes rule, once more.
00:00:11.070 --> 00:00:13.800
And we have worked out
quite a few examples.
00:00:13.800 --> 00:00:16.120
So there's no point,
again, in going
00:00:16.120 --> 00:00:17.400
through a detailed example.
00:00:17.400 --> 00:00:20.890
Let us just discuss
some of the issues.
00:00:20.890 --> 00:00:24.150
One question is when
do these models arise?
00:00:24.150 --> 00:00:27.850
One particular class of models
that is very useful and very
00:00:27.850 --> 00:00:31.980
commonly used are so-called
linear normal models.
00:00:31.980 --> 00:00:34.305
In these models, we,
basically, combine
00:00:34.305 --> 00:00:37.500
various random variables
in a linear function.
00:00:37.500 --> 00:00:41.220
And all the random variables of
interest are taken to be normal.
00:00:41.220 --> 00:00:45.140
For instance, we might have
a transmitted signal,
00:00:45.140 --> 00:00:50.930
call it Theta, which is now
a continuous valued signal.
00:00:50.930 --> 00:00:53.280
We receive that
signal, but corrupted
00:00:53.280 --> 00:00:56.290
by some noise, which is
independent from what was sent.
00:00:56.290 --> 00:01:00.110
And, on the
basis of the observation X,
00:01:00.110 --> 00:01:03.180
we wish to recover
the value of Theta.
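
NOTE
A minimal sketch of the scalar case, assuming the additive form X = Theta + W
with Theta and W independent normals. The additive form, the function name
posterior_params, and the numbers used are illustrative assumptions; the
lecture only says that the signal is corrupted by independent noise.
```python
def posterior_params(x, prior_mean, prior_var, noise_var):
    """Posterior mean and variance of Theta given X = x, assuming
    X = Theta + W, Theta ~ N(prior_mean, prior_var), and
    W ~ N(0, noise_var) independent of Theta."""
    # Weight placed on the observation; it tends to 1 as the noise vanishes.
    gain = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + gain * (x - prior_mean)
    post_var = prior_var * noise_var / (prior_var + noise_var)
    return post_mean, post_var
# Hypothetical numbers: standard normal prior, unit-variance noise, x = 2.
mean, var = posterior_params(2.0, prior_mean=0.0, prior_var=1.0, noise_var=1.0)
print(mean, var)
```
Under these assumptions the posterior is again normal, so its peak and its
mean are the same number.
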
00:01:03.180 --> 00:01:05.580
And then there are versions
of this problem that
00:01:05.580 --> 00:01:09.840
involve vectors Theta
instead of single values.
00:01:09.840 --> 00:01:13.530
So that Theta consists
of multiple components,
00:01:13.530 --> 00:01:17.200
and where we obtain many
measurements X. We will,
00:01:17.200 --> 00:01:19.750
actually, see in the
next lecture sequence,
00:01:19.750 --> 00:01:23.990
a quite detailed discussion
of models of this type.
00:01:23.990 --> 00:01:27.380
And this will be one
of our main examples
00:01:27.380 --> 00:01:30.500
within our study of inference.
00:01:30.500 --> 00:01:34.100
There will be another example
that we will see a few times,
00:01:34.100 --> 00:01:36.430
and this involves
estimating the parameter
00:01:36.430 --> 00:01:38.850
of a uniform distribution.
00:01:38.850 --> 00:01:43.120
So X is a random variable that's
uniform over a certain range.
00:01:43.120 --> 00:01:47.241
But the range itself, determined
by Theta, is random and unknown.
00:01:47.241 --> 00:01:50.080
And on the basis
of observations X,
00:01:50.080 --> 00:01:54.840
we would like to form
an estimate of what
00:01:54.840 --> 00:01:57.670
the true value of Theta is.
00:01:57.670 --> 00:01:59.860
This is an example
that you will see
00:01:59.860 --> 00:02:03.870
in our collection of solved
problems for this class.
00:02:03.870 --> 00:02:05.940
So what are the questions
in this setting?
00:02:05.940 --> 00:02:10.350
We wish to come up with
ways of estimating Theta.
00:02:10.350 --> 00:02:13.840
We form an estimator,
and the main candidates
00:02:13.840 --> 00:02:18.440
for estimators at this
point are, once more,
00:02:18.440 --> 00:02:21.760
the maximum a posteriori
probability estimator,
00:02:21.760 --> 00:02:24.770
which looks at this
conditional density
00:02:24.770 --> 00:02:28.460
and picks a value of theta that
makes this conditional density
00:02:28.460 --> 00:02:30.200
as large as possible.
00:02:30.200 --> 00:02:33.560
And then the alternative one
is the least mean squares
00:02:33.560 --> 00:02:36.430
estimator, which just
computes the expected value
00:02:36.430 --> 00:02:38.840
of Theta given X.
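
NOTE
The two estimators can be compared on the uniform example with a rough sketch.
It assumes, purely for illustration, that Theta is uniform on [0, 1] and that
X given Theta = t is uniform on [0, t]; the lecture does not state a prior or
a range. The helper names and the grid approach are also hypothetical.
```python
import math
def posterior_on_grid(x, n=100_000):
    """Grid approximation of the posterior density of Theta given X = x,
    assuming Theta ~ Uniform(0, 1) and X | Theta = t ~ Uniform(0, t)."""
    # Midpoints of n equal cells that tile the interval [x, 1].
    step = (1.0 - x) / n
    thetas = [x + (i + 0.5) * step for i in range(n)]
    # Prior density is 1 and the likelihood is 1/t for t >= x (0 below x).
    weights = [1.0 / t for t in thetas]
    total = sum(weights) * step
    density = [w / total for w in weights]
    return thetas, density, step
def map_estimate(thetas, density):
    # MAP: the grid point where the posterior density is largest.
    return max(zip(density, thetas))[1]
def lms_estimate(thetas, density, step):
    # LMS: the posterior mean, approximated by a Riemann sum.
    return sum(t * d for t, d in zip(thetas, density)) * step
x = 0.5  # hypothetical observation
thetas, density, step = posterior_on_grid(x)
theta_map = map_estimate(thetas, density)
theta_lms = lms_estimate(thetas, density, step)
exact_lms = (1.0 - x) / math.log(1.0 / x)  # closed form under these assumptions
print(theta_map, theta_lms, exact_lms)
```
Here the posterior density is proportional to 1/t on [x, 1], so the MAP
estimate sits at the left edge, essentially at x, while the LMS estimate is
pulled to the right. The two estimators need not agree.
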
00:02:38.840 --> 00:02:41.300
For any given
estimator, we then want
00:02:41.300 --> 00:02:43.840
to characterize its performance.
00:02:43.840 --> 00:02:46.800
In this case, a natural
notion of performance
00:02:46.800 --> 00:02:50.800
is the distance of
our estimate, or estimator,
00:02:50.800 --> 00:02:52.890
from the true value of Theta.
00:02:52.890 --> 00:02:56.650
And commonly we use
the squared distance
00:02:56.650 --> 00:03:00.700
and then take the average
of that squared distance.
00:03:00.700 --> 00:03:03.310
So in a conditional universe
where we have already
00:03:03.310 --> 00:03:06.330
observed some data,
we might be interested
00:03:06.330 --> 00:03:09.110
in this particular
expectation, which
00:03:09.110 --> 00:03:16.090
is the mean squared error of
this particular estimator,
00:03:16.090 --> 00:03:18.970
given that we obtain
some particular data.
00:03:18.970 --> 00:03:23.220
Or we can average over
all possible data points
00:03:23.220 --> 00:03:26.230
that we might obtain
so that we look
00:03:26.230 --> 00:03:30.210
at the unconditional
mean squared error, which
00:03:30.210 --> 00:03:34.620
is a measure of the overall
performance of our estimator.
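
NOTE
The unconditional mean squared error can be approximated by simulation. This
sketch assumes an illustrative model, X = Theta + W with Theta and W
independent standard normals; the function name simulate_mse, the trial count,
and the seed are hypothetical choices rather than anything from the lecture.
```python
import random
def simulate_mse(estimator, n_trials=200_000, seed=0):
    """Monte Carlo estimate of the unconditional mean squared error
    E[(Theta - estimator(X))^2], assuming X = Theta + W with Theta and W
    independent standard normals (an illustrative model)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        theta = rng.gauss(0.0, 1.0)
        x = theta + rng.gauss(0.0, 1.0)
        total += (theta - estimator(x)) ** 2
    return total / n_trials
# Under this model E[Theta | X = x] = x / 2, so this is the LMS estimator.
mse_lms = simulate_mse(lambda x: x / 2.0)
# A naive alternative that takes the observation at face value.
mse_naive = simulate_mse(lambda x: x)
print(mse_lms, mse_naive)
```
Under these assumptions the exact values are 1/2 for the LMS estimator and 1
for the naive one. The conditional mean squared error of the LMS estimator is
also 1/2 for every observed x, because the posterior variance in this model
does not depend on x.
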
00:03:34.620 --> 00:03:37.480
We will be talking
about these criteria
00:03:37.480 --> 00:03:41.430
and the mean squared error
in a fair amount of detail
00:03:41.430 --> 00:03:44.660
in a subsequent
lecture sequence.