WEBVTT
00:00:03.280 --> 00:00:07.790
We will now use the Bayes rule
in an important application
00:00:07.790 --> 00:00:10.650
that involves a discrete unknown
random variable and a
00:00:10.650 --> 00:00:12.410
continuous measurement.
00:00:12.410 --> 00:00:16.260
Our discrete unknown random
variable, K, will be one that
00:00:16.260 --> 00:00:21.390
takes the values plus or minus
1 with equal probability.
00:00:23.900 --> 00:00:30.230
And the measurement will be
another random variable, Y,
00:00:30.230 --> 00:00:35.670
which is equal to the discrete
random variable, but corrupted
00:00:35.670 --> 00:00:40.780
by additive noise that we denote
by W. So what we get to
00:00:40.780 --> 00:00:45.320
observe is the sum of K and W.
00:00:45.320 --> 00:00:47.990
This is a common situation in
digital communications.
00:00:47.990 --> 00:00:51.490
We're trying to send one bit
of information, whether K is
00:00:51.490 --> 00:00:56.050
plus 1 or minus 1, but the
observation that we're making
00:00:56.050 --> 00:00:59.620
is corrupted by a communication
channel, by some
00:00:59.620 --> 00:01:03.500
noise that is present in the
channel, and on the basis of
00:01:03.500 --> 00:01:06.810
the value of Y that we will
observe, we will try to guess
00:01:06.810 --> 00:01:09.080
what was sent.
00:01:09.080 --> 00:01:11.600
The assumption that we will make
about the noise is that
00:01:11.600 --> 00:01:15.800
it is a standard normal
random variable.
00:01:15.800 --> 00:01:19.870
So suppose that we observed a
specific value for the random
00:01:19.870 --> 00:01:23.840
variable Y. We want to make
a guess about the random
00:01:23.840 --> 00:01:27.360
variable capital K. Of course,
there's no way to guess with
00:01:27.360 --> 00:01:28.289
complete certainty.
00:01:28.289 --> 00:01:33.000
The only thing that we can do
is to determine how likely it
00:01:33.000 --> 00:01:36.500
is that a 1 was sent as opposed
to how likely it is
00:01:36.500 --> 00:01:38.780
that a minus 1 was sent.
00:01:38.780 --> 00:01:40.960
How do we approach
such a problem?
00:01:40.960 --> 00:01:43.550
Well, we use the version of the
Bayes rule that we have
00:01:43.550 --> 00:01:46.880
already developed, which is this
formula that gives us the
00:01:46.880 --> 00:01:49.300
conditional probabilities
that we want.
00:01:49.300 --> 00:01:52.530
And in particular, here, we're
asking a question about the
00:01:52.530 --> 00:01:58.789
conditional probability that K
takes the value of 1 given
00:01:58.789 --> 00:02:03.070
that a value of y has
been observed.
00:02:03.070 --> 00:02:05.520
This is what we want
to calculate.
00:02:05.520 --> 00:02:08.889
So let us look at the various
terms involved here and see
00:02:08.889 --> 00:02:11.700
what each term is.
00:02:11.700 --> 00:02:13.590
First, we need the
prior probability
00:02:13.590 --> 00:02:15.420
of K. This is simple.
00:02:15.420 --> 00:02:21.520
The prior probabilities are 1/2
for k equal to minus 1 or
00:02:21.520 --> 00:02:24.460
plus 1, because we said that
the two possibilities are
00:02:24.460 --> 00:02:25.980
equally likely.
00:02:25.980 --> 00:02:30.880
Then we need the conditional
density of Y given K. So what
00:02:30.880 --> 00:02:33.440
does this assumption mean?
00:02:33.440 --> 00:02:37.770
It means that Y is obtained from a standard
normal random variable to
00:02:37.770 --> 00:02:43.820
which we add the value of K. So
if K is equal to 1, we're
00:02:43.820 --> 00:02:49.020
taking a standard normal, and
we add a value of plus 1.
00:02:49.020 --> 00:02:55.200
So Y, given that K is equal to
plus 1, is going to be a
00:02:55.200 --> 00:02:57.420
standard normal plus 1.
00:02:57.420 --> 00:02:58.900
What does that do?
00:02:58.900 --> 00:03:01.560
If we take a standard normal and
add a constant to it, that
00:03:01.560 --> 00:03:06.660
changes the mean and makes the
mean equal to 1, and does not
00:03:06.660 --> 00:03:08.980
change the variance.
00:03:08.980 --> 00:03:12.510
On the other hand, if K happens
to be equal to minus
00:03:12.510 --> 00:03:17.570
1, then the observation that
we see is going to be a
00:03:17.570 --> 00:03:22.060
standard normal plus a value of
minus 1, and that changes
00:03:22.060 --> 00:03:26.640
the mean to become minus 1,
but with a variance of 1.
00:03:26.640 --> 00:03:33.690
So if we are to plot the density
of Y, that density, of
00:03:33.690 --> 00:03:38.350
course, will depend on what
the value of K was.
00:03:38.350 --> 00:03:44.870
And if K is equal to 1, then we
will obtain a normal that
00:03:44.870 --> 00:03:49.120
has a mean of 1, so it's
centered here.
00:03:49.120 --> 00:03:56.490
But if K is equal to minus 1,
then our observation will be a
00:03:56.490 --> 00:04:02.730
normal with unit variance,
but centered at minus 1.
00:04:06.120 --> 00:04:11.060
So if we are to write this
in terms of symbols, the
00:04:11.060 --> 00:04:16.950
distribution of Y, given K, is normal
with variance equal to 1.
00:04:16.950 --> 00:04:26.530
So the PDF is given by this
form, e to the minus 1/2 times y
00:04:26.530 --> 00:04:31.390
minus the mean of Y, squared. But given
the value of K, the mean of Y
00:04:31.390 --> 00:04:38.409
is equal to k, plus or minus
1, depending on what k is.
00:04:38.409 --> 00:04:44.130
So this is the PDF of a normal
with unit variance and mean
00:04:44.130 --> 00:04:45.800
equal to k.
00:04:45.800 --> 00:04:49.340
And it corresponds, when you
set k equal to 1, it
00:04:49.340 --> 00:04:50.990
corresponds to this graph.
00:04:50.990 --> 00:04:53.340
When you set k equal
to minus 1, it
00:04:53.340 --> 00:04:56.940
corresponds to that graph.
00:04:56.940 --> 00:05:00.590
Let us continue with the next
term in this expression.
00:05:00.590 --> 00:05:04.400
We need the term in the
denominator, which is obtained
00:05:04.400 --> 00:05:08.280
by taking a sum over the
different choices of k.
00:05:08.280 --> 00:05:10.930
There are 2 choices, and
each choice has a
00:05:10.930 --> 00:05:13.090
probability of 1/2.
00:05:13.090 --> 00:05:16.430
From the first choice, we have
1/2 times the density of Y
00:05:16.430 --> 00:05:19.325
when k is equal to minus 1.
00:05:24.450 --> 00:05:30.350
And when k is equal to minus 1,
we obtain this expression.
00:05:30.350 --> 00:05:34.000
And we have another term that
corresponds to the case where
00:05:34.000 --> 00:05:41.120
k is equal to plus 1, in
which case we have this
00:05:41.120 --> 00:05:44.190
expression here.
00:05:44.190 --> 00:05:48.159
Once more, this expression
here corresponds to this
00:05:48.159 --> 00:05:50.640
normal with a mean of minus 1.
00:05:50.640 --> 00:05:53.640
This expression here corresponds
to a normal with a
00:05:53.640 --> 00:05:56.830
mean of plus 1, which
is this graph here.
00:05:56.830 --> 00:05:59.830
So at this point, we have in
our hands expressions for
00:05:59.830 --> 00:06:03.750
everything that is involved
here, and we can just apply
00:06:03.750 --> 00:06:10.270
the formula and carry out a
fair amount of algebra.
00:06:10.270 --> 00:06:13.470
There are some very nice
simplifications that happen
00:06:13.470 --> 00:06:17.590
along the way, and we end up
with an answer that has the
00:06:17.590 --> 00:06:18.990
following form.
00:06:18.990 --> 00:06:28.320
It's 1 divided by 1 plus
e to the minus 2 y.
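
NOTE
The algebra that is skipped here can be sketched as follows. This is a reconstruction from the terms described in the lecture, not a copy of the board work. The symbols p_K (prior), f_{Y|K} (conditional density), and p_{K|Y} (posterior) are notation assumed for this note.
```latex
\begin{align*}
f_{Y|K}(y \mid k) &= \frac{1}{\sqrt{2\pi}}\, e^{-(y-k)^2/2}, \qquad k \in \{-1, +1\}, \\
p_{K|Y}(1 \mid y)
  &= \frac{p_K(1)\, f_{Y|K}(y \mid 1)}
          {p_K(-1)\, f_{Y|K}(y \mid -1) + p_K(1)\, f_{Y|K}(y \mid 1)} \\
  &= \frac{\tfrac{1}{2}\, e^{-(y-1)^2/2}}
          {\tfrac{1}{2}\, e^{-(y+1)^2/2} + \tfrac{1}{2}\, e^{-(y-1)^2/2}} \\
  &= \frac{1}{1 + e^{\left[(y-1)^2 - (y+1)^2\right]/2}} \\
  &= \frac{1}{1 + e^{-2y}},
\end{align*}
% The last step uses (y-1)^2 - (y+1)^2 = -4y.
% The factors 1/2 and 1/sqrt(2 pi) cancel between numerator and denominator.
```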
00:06:28.320 --> 00:06:32.040
And this gives us the
probability that a 1 was sent.
00:06:32.040 --> 00:06:35.490
Let us try to make sense
of this expression.
00:06:35.490 --> 00:06:38.860
Let's see what it looks
like by plotting it as
00:06:38.860 --> 00:06:40.110
a function of y.
00:06:43.040 --> 00:06:47.370
So what we're plotting here
is this expression.
00:06:47.370 --> 00:06:52.730
OK, if y is very large, as y
goes to plus infinity, this
00:06:52.730 --> 00:06:55.620
term disappears, and
we obtain a 1.
00:06:59.080 --> 00:07:02.280
If, on the other hand, y is
very, very negative--
00:07:02.280 --> 00:07:04.890
so y goes to minus infinity--
00:07:04.890 --> 00:07:08.410
here we get e to the
infinity, which is a very
00:07:08.410 --> 00:07:09.620
large number.
00:07:09.620 --> 00:07:13.990
So this ratio is going
to converge to 0.
00:07:13.990 --> 00:07:18.680
So we have a graph
that starts at 0.
00:07:18.680 --> 00:07:22.440
It actually rises monotonically,
and in the
00:07:22.440 --> 00:07:25.560
limit, converges to 1.
00:07:25.560 --> 00:07:30.460
If y is equal to 0, then
this term is 1,
00:07:30.460 --> 00:07:32.130
and we obtain a 1/2.
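
NOTE
The three points on the plot (0 on the far left, 1/2 at y equal to 0, and 1 on the far right) can be checked numerically. The following is a minimal sketch that evaluates the Bayes rule term by term and compares it with the simplified formula. The function names normal_pdf, posterior_k_is_1, and closed_form are hypothetical names chosen for this note.
```python
import math
# Density of a normal random variable with the given mean and unit variance.
def normal_pdf(y: float, mean: float) -> float:
    return math.exp(-0.5 * (y - mean) ** 2) / math.sqrt(2.0 * math.pi)
# P(K = 1 | Y = y), computed directly from the terms of the Bayes rule.
def posterior_k_is_1(y: float) -> float:
    prior = 0.5  # K is +1 or -1 with equal probability
    numerator = prior * normal_pdf(y, 1.0)
    denominator = prior * normal_pdf(y, -1.0) + prior * normal_pdf(y, 1.0)
    return numerator / denominator
# The simplified answer from the lecture: 1 / (1 + e^(-2y)).
def closed_form(y: float) -> float:
    return 1.0 / (1.0 + math.exp(-2.0 * y))
if __name__ == "__main__":
    # Print both versions side by side for a few observed values of y.
    for y in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(y, posterior_k_is_1(y), closed_form(y))
```
The two columns that are printed should agree up to floating-point rounding.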
00:07:34.930 --> 00:07:37.730
Let us interpret this plot.
00:07:37.730 --> 00:07:44.860
If y is very large, it is much
more likely that y is coming
00:07:44.860 --> 00:07:50.170
out of this distribution so
that K is equal to 1.
00:07:50.170 --> 00:07:53.530
So the probability that K is
equal to 1, if we obtain this
00:07:53.530 --> 00:07:55.940
observation, is almost 1.
00:07:55.940 --> 00:07:57.800
We have almost certainty.
00:07:57.800 --> 00:08:01.040
If, on the other hand, y is
very, very negative, then it
00:08:01.040 --> 00:08:04.390
is much more likely that what
we're seeing is coming from
00:08:04.390 --> 00:08:08.480
this distribution so that
K is equal to minus 1.
00:08:08.480 --> 00:08:12.370
And in that case, the
probability that K was 1 is
00:08:12.370 --> 00:08:14.370
going to be approximately 0.
00:08:14.370 --> 00:08:19.790
Finally, if y is 0, then we're
just in the middle of the two
00:08:19.790 --> 00:08:23.670
possibilities, and by symmetry,
either choice of K
00:08:23.670 --> 00:08:25.400
is equally likely.
00:08:25.400 --> 00:08:28.920
Therefore, the posterior
probability that K is equal to
00:08:28.920 --> 00:08:32.880
1, given that Y was
equal to 0--
00:08:32.880 --> 00:08:34.820
that probability is 1/2.
00:08:34.820 --> 00:08:38.650
When Y is equal to 0, it's
equally likely that either
00:08:38.650 --> 00:08:41.750
signal was sent.
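
NOTE
One way to sanity-check this interpretation is by simulation: generate many pairs (K, Y), keep only those where Y falls close to a chosen value, and see what fraction of them had K equal to 1. This is a rough sketch, not part of the lecture. The function simulate_posterior, the window half_width, the sample count n, and the seed are all assumptions made for illustration, and the finite window introduces a small approximation error.
```python
import math
import random
# Estimate P(K = 1 | Y near y0) by simulating the model Y = K + W.
def simulate_posterior(y0: float, half_width: float = 0.1,
                       n: int = 200_000, seed: int = 0) -> float:
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    hits = 0   # samples with Y near y0 and K equal to +1
    total = 0  # all samples with Y near y0
    for _ in range(n):
        k = rng.choice((-1, 1))        # equally likely signal values
        y = k + rng.gauss(0.0, 1.0)    # add standard normal noise W
        if abs(y - y0) <= half_width:
            total += 1
            if k == 1:
                hits += 1
    return hits / total
if __name__ == "__main__":
    y0 = 0.5
    # Compare the simulated fraction with the formula 1 / (1 + e^(-2y)).
    print(simulate_posterior(y0), 1.0 / (1.0 + math.exp(-2.0 * y0)))
```
The simulated fraction should land close to the value given by the formula, with the gap shrinking as n grows and half_width shrinks.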
00:08:41.750 --> 00:08:45.590
This example is a prototype of
the kind of calculations that
00:08:45.590 --> 00:08:48.940
are done in the analysis of
communication systems.
00:08:48.940 --> 00:08:52.570
This is the simplest model of
communication of a single bit
00:08:52.570 --> 00:08:56.230
in the presence of additive
noise, but of course, there
00:08:56.230 --> 00:08:59.510
can also be more complicated
models in which we have more
00:08:59.510 --> 00:09:02.750
complicated signals that are
sent, and more complicated
00:09:02.750 --> 00:09:04.300
models of the noise.
00:09:04.300 --> 00:09:07.260
But the general principles
of the analysis are
00:09:07.260 --> 00:09:08.700
always of this kind.
00:09:08.700 --> 00:09:11.710
We're using the Bayes rule, and
we need to write down the
00:09:11.710 --> 00:09:13.170
different terms that
are involved.