WEBVTT

00:00:15.370 --> 00:00:17.450
PROFESSOR: And what we saw was
the performance was quite

00:00:17.450 --> 00:00:20.910
different in the two regimes.

00:00:20.910 --> 00:00:23.900
In power-limited regime,
typically our SNR is much

00:00:23.900 --> 00:00:28.930
smaller than one, whereas in the
bandwidth-limited regime,

00:00:28.930 --> 00:00:33.360
the SNR is large.

00:00:33.360 --> 00:00:36.370
And because of this behavior
of SNR, what we saw is that

00:00:36.370 --> 00:00:38.650
the Shannon spectral
efficiency in the

00:00:38.650 --> 00:00:42.010
power-limited regime,
it doubles for every --

00:00:42.010 --> 00:00:43.810
if we double our SNR.

00:00:43.810 --> 00:00:47.670
For every 3 dB increase in SNR,
the spectral efficiency

00:00:47.670 --> 00:00:50.790
increases by a factor of 2.

00:00:50.790 --> 00:00:53.470
In the bandwidth limited regime,
if we have a 3 dB

00:00:53.470 --> 00:00:57.070
increase in SNR, the spectral
efficiency increases by one

00:00:57.070 --> 00:00:59.080
bit per two dimension.

00:00:59.080 --> 00:01:02.470
On the other hand, if we double
our bandwidth, then the

00:01:02.470 --> 00:01:05.110
capacity in bits per second
is not affected in the

00:01:05.110 --> 00:01:06.830
power-limited regime.

00:01:06.830 --> 00:01:09.540
But if we double our bandwidth,
then in the

00:01:09.540 --> 00:01:12.970
bandwidth-limited regime,
the capacity increases

00:01:12.970 --> 00:01:15.200
approximately by
a factor of 2.

00:01:15.200 --> 00:01:17.010
And that's really what
motivated the name

00:01:17.010 --> 00:01:19.680
bandwidth-limited and
power-limited regime.
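
NOTE
Editor's sketch, not part of the lecture: the regime behavior described above can be checked
numerically from the Shannon formula, rho < log2(1 + SNR) with SNR = P / (N0 W). The
function names and the sample SNR and bandwidth values are assumptions chosen for illustration.
```python
import math
def capacity_bps(P, N0, W):
    """Shannon capacity in bits per second of an AWGN channel of bandwidth W."""
    return W * math.log2(1.0 + P / (N0 * W))
def spectral_efficiency(snr):
    """Shannon limit on rho in bits per two dimensions."""
    return math.log2(1.0 + snr)
# Doubling SNR (a 3 dB increase) in each regime.
low_ratio = spectral_efficiency(0.02) / spectral_efficiency(0.01)      # power-limited
high_gain = spectral_efficiency(2000.0) - spectral_efficiency(1000.0)  # bandwidth-limited
# Doubling bandwidth at fixed P and N0 in each regime.
N0 = 1.0
pl_ratio = capacity_bps(1.0, N0, 200.0) / capacity_bps(1.0, N0, 100.0)  # SNR near 0.01
bl_ratio = capacity_bps(1e6, N0, 2.0) / capacity_bps(1e6, N0, 1.0)      # SNR near 1e6
print(f"power-limited: rho ratio {low_ratio:.3f}, capacity ratio {pl_ratio:.3f}")
print(f"bandwidth-limited: rho gain {high_gain:.3f} bits, capacity ratio {bl_ratio:.3f}")
```
At low SNR the efficiency ratio is close to 2 and extra bandwidth barely changes capacity; at
high SNR a 3 dB increase adds about one bit and doubling bandwidth nearly doubles capacity.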

00:01:19.680 --> 00:01:23.260
If we want a more operational
definition, then we say that

00:01:23.260 --> 00:01:26.390
if rho is less than two bits
per two dimensions, we will

00:01:26.390 --> 00:01:29.330
have this power-limited
regime.

00:01:29.330 --> 00:01:32.590
And in this case, rho is greater
than two bits per two

00:01:32.590 --> 00:01:34.430
dimensions.

00:01:34.430 --> 00:01:37.430
The number two bits per two
dimensions was chosen because

00:01:37.430 --> 00:01:40.360
it is like the largest spectral
efficiency we can get

00:01:40.360 --> 00:01:42.210
through binary transmission.

00:01:42.210 --> 00:01:45.100
If it's an uncoded 2-PAM over
a channel, then we get

00:01:45.100 --> 00:01:46.990
two bits per two dimensions.

00:01:46.990 --> 00:01:49.220
If we are coding, the only
thing we can do is reduce

00:01:49.220 --> 00:01:51.270
spectral efficiency.

00:01:51.270 --> 00:01:53.770
So typically, if we are going
to operate in power-limited

00:01:53.770 --> 00:01:56.940
regime, the operational meaning
is we can get away

00:01:56.940 --> 00:01:58.380
with binary transmission.

00:01:58.380 --> 00:02:00.900
In the bandwidth-limited regime,
we have to resort to

00:02:00.900 --> 00:02:03.160
non-binary transmission.

00:02:03.160 --> 00:02:09.180
So in other words, binary
modulation is done in

00:02:09.180 --> 00:02:17.460
power-limited regime, whereas we
need multi-level modulation

00:02:17.460 --> 00:02:19.240
in the bandwidth-limited
regime.

00:02:28.440 --> 00:02:32.290
The baseline system
here is 2-PAM.

00:02:32.290 --> 00:02:35.190
The uncoded baseline performance
was that of 2-PAM.

00:02:35.190 --> 00:02:38.920
In the bandwidth limited
regime, it is M-PAM.

00:02:38.920 --> 00:02:40.680
And the way we measure
performance in the

00:02:40.680 --> 00:02:44.950
power-limited regime is the
probability of bit error as a

00:02:44.950 --> 00:02:46.200
function of EbN0.

00:02:50.406 --> 00:02:51.360
OK?

00:02:51.360 --> 00:02:54.260
In the bandwidth-limited
regime, the performance

00:02:54.260 --> 00:02:59.080
measure is done by probability
of error per two dimensions as

00:02:59.080 --> 00:03:00.340
a function of SNR norm.

00:03:07.600 --> 00:03:11.560
And in this case, we saw that
the gap to capacity --

00:03:11.560 --> 00:03:18.670
or rather, to put it in other
words, the ultimate limit on

00:03:18.670 --> 00:03:31.090
EbN0 is minus 1.59 dB.

00:03:31.090 --> 00:03:47.810
And here, the ultimate limit
on SNR norm is 0 dB.

00:03:59.440 --> 00:04:00.520
OK?

00:04:00.520 --> 00:04:04.100
Any questions on this?

00:04:04.100 --> 00:04:04.570
Yes.

00:04:04.570 --> 00:04:09.770
AUDIENCE: Why do we use Eb
over N_0 SNR norm for

00:04:09.770 --> 00:04:10.460
[INAUDIBLE]

00:04:10.460 --> 00:04:11.226
bandwidth limited?

00:04:11.226 --> 00:04:13.050
PROFESSOR: That's
a good question.

00:04:13.050 --> 00:04:16.839
AUDIENCE: Why do we use Eb over
N_0 for both regimes?

00:04:16.839 --> 00:04:18.890
PROFESSOR: Or why don't we use
SNR norm for both regimes?

00:04:18.890 --> 00:04:19.545
AUDIENCE: Yeah.

00:04:19.545 --> 00:04:21.480
PROFESSOR: Now if we think about
the bandwidth limited

00:04:21.480 --> 00:04:23.070
regime, what we really
care about is

00:04:23.070 --> 00:04:25.930
spectral efficiency, right?

00:04:25.930 --> 00:04:29.080
What SNR norm does, if you
remember the definition, is

00:04:29.080 --> 00:04:32.150
that it compares the amount of
SNR we require for a practical

00:04:32.150 --> 00:04:35.400
system to that of the best
possible system.

00:04:35.400 --> 00:04:37.310
So in other words, if you
do care about spectral

00:04:37.310 --> 00:04:42.890
efficiency, SNR norm is the
right measure to look for.

00:04:42.890 --> 00:04:46.230
OK, now what happens in the
power-limited regime?

00:04:46.230 --> 00:04:49.310
It turns out, probably more for
historic reasons, people

00:04:49.310 --> 00:04:52.120
started with EbN0 in the
power-limited regime.

00:04:52.120 --> 00:05:00.985
And if you look at the
definition of EbN0, it is SNR

00:05:00.985 --> 00:05:02.330
over rho, right?

00:05:02.330 --> 00:05:07.070
So in the power limited regime,
our rho is small, the

00:05:07.070 --> 00:05:10.490
SNR is going to be small, but
if you look at -- because we

00:05:10.490 --> 00:05:13.840
are in the power limited regime
so we have lots of

00:05:13.840 --> 00:05:17.730
bandwidth, so our SNR is going
to be small and the spectral

00:05:17.730 --> 00:05:19.450
efficiency is going
to be small.

00:05:19.450 --> 00:05:23.240
But if you look at the ratio
between the two, it's going to

00:05:23.240 --> 00:05:29.480
be greater than minus 1.59 dB.

00:05:29.480 --> 00:05:29.800
OK.

00:05:29.800 --> 00:05:33.610
So it turns out that in the kind of
limit we do take, our EbN0

00:05:33.610 --> 00:05:37.810
remains constant as minus 1.59
dB, and that's probably one of

00:05:37.810 --> 00:05:40.590
the reasons that motivated
the use of EbN0 in the

00:05:40.590 --> 00:05:44.180
power limited regime.
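
NOTE
Editor's sketch, not part of the lecture: combining the Shannon limit rho < log2(1 + SNR)
with Eb/N0 = SNR / rho gives Eb/N0 > (2^rho - 1) / rho, which tends to ln 2 (about
minus 1.59 dB) as rho goes to zero. The function name min_ebn0_db is an assumption.
```python
import math
def min_ebn0_db(rho):
    """Smallest Eb/N0 in dB that the Shannon limit allows at spectral efficiency rho."""
    return 10.0 * math.log10((2.0 ** rho - 1.0) / rho)
limit_db = 10.0 * math.log10(math.log(2.0))  # ultimate limit, ln 2 expressed in dB
for rho in (2.0, 1.0, 0.1, 0.001):
    print(f"rho = {rho:6.3f}  min Eb/N0 = {min_ebn0_db(rho):6.2f} dB")
print(f"limit as rho goes to 0: {limit_db:.2f} dB")
```
The bound decreases monotonically toward the limit as rho shrinks, which is why Eb/N0 is a
natural yardstick when bandwidth is plentiful.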

00:05:44.180 --> 00:05:46.590
On the other hand, one could
also argue that what really

00:05:46.590 --> 00:05:48.820
happens in the power limited
regime is that our bandwidth

00:05:48.820 --> 00:05:50.370
becomes really large.

00:05:50.370 --> 00:05:51.330
So if this [UNINTELLIGIBLE]

00:05:51.330 --> 00:05:54.520
stick with 2-PAM system, then
we do get a spectral

00:05:54.520 --> 00:05:57.720
efficiency of two bits per two
dimensions, but that's just

00:05:57.720 --> 00:06:00.270
because we are using a
particular modulation scheme.

00:06:00.270 --> 00:06:03.100
If our bandwidth is really
large, we are not really going

00:06:03.100 --> 00:06:05.690
to care about what spectral
efficiency we use.

00:06:05.690 --> 00:06:08.790
What really matters is this
energy per bit, and that's why

00:06:08.790 --> 00:06:10.150
this is a reasonable
assumption.

00:06:12.870 --> 00:06:15.710
Does that answer
your question?

00:06:15.710 --> 00:06:20.010
Right, it's not completely clear
as to why this EbN0 is

00:06:20.010 --> 00:06:23.370
the best here, and SNR norm is
here if you don't take the

00:06:23.370 --> 00:06:26.410
limit rho going to zero here,
but again, you can think of it

00:06:26.410 --> 00:06:28.970
more as a convention.

00:06:28.970 --> 00:06:29.360
OK.

00:06:29.360 --> 00:06:30.610
AUDIENCE: [INAUDIBLE]

00:06:33.170 --> 00:06:35.790
PROFESSOR: So if you look in the
power-limited regime, you

00:06:35.790 --> 00:06:38.530
are saying rho is less than two
bits per two dimensions.

00:06:38.530 --> 00:06:41.070
If you use an uncoded 2-PAM,
what's your spectral

00:06:41.070 --> 00:06:42.800
efficiency?

00:06:42.800 --> 00:06:44.740
It's two bits per two
dimension, right?

00:06:44.740 --> 00:06:47.900
Now the idea is, suppose we want
to design a system with

00:06:47.900 --> 00:06:50.050
spectral efficiency greater
than two bits per two

00:06:50.050 --> 00:06:51.150
dimensions?

00:06:51.150 --> 00:06:54.020
We cannot really use
a 2-PAM system.

00:06:54.020 --> 00:06:56.685
Because if you put coding on top
of it, all we are going to

00:06:56.685 --> 00:06:58.860
do is simply reduce the spectral
efficiency below two

00:06:58.860 --> 00:07:00.730
bits per two dimension.

00:07:00.730 --> 00:07:05.150
So we have to start with a
non-binary modulation, right?

00:07:05.150 --> 00:07:07.180
So that's how we distinguish
between power-limited and

00:07:07.180 --> 00:07:09.426
bandwidth-limited regimes.

00:07:09.426 --> 00:07:12.157
AUDIENCE: That's just because
you're using the 2-PAM as your

00:07:12.157 --> 00:07:14.728
baseline [INAUDIBLE]?

00:07:14.728 --> 00:07:17.620
PROFESSOR: Right.

00:07:17.620 --> 00:07:19.080
OK?

00:07:19.080 --> 00:07:21.610
All right.

00:07:21.610 --> 00:07:24.655
So let us do an example to
finish off this analysis.

00:07:39.920 --> 00:07:41.730
Now suppose --

00:07:41.730 --> 00:07:44.430
say you are on a summer project,
and you're assigned

00:07:44.430 --> 00:07:46.410
to design some system.

00:07:46.410 --> 00:07:49.410
Your boss gives you some
specifications, like

00:07:49.410 --> 00:07:51.740
continuous time specifications.

00:07:51.740 --> 00:07:54.665
In particular, you have
a baseband system.

00:08:03.320 --> 00:08:06.575
The baseband system has a
bandwidth of one Megahertz.

00:08:10.510 --> 00:08:18.590
You have a power, P, which is
one unit, so that's another

00:08:18.590 --> 00:08:20.090
resource you have.

00:08:20.090 --> 00:08:22.280
And if you measure your channel,
it can be reasonably

00:08:22.280 --> 00:08:26.070
approximated as an AWGN channel,
so there is no ISI or

00:08:26.070 --> 00:08:29.450
any filtering going on, just
Additive White Gaussian Noise.

00:08:29.450 --> 00:08:34.159
And your noise has a single-sided
spectral density of ten to the

00:08:34.159 --> 00:08:39.566
minus six units per Hertz
of the bandwidth.

00:08:42.400 --> 00:08:43.650
And what is your goal?

00:08:49.070 --> 00:08:50.650
So you have the following
goal.

00:08:54.070 --> 00:09:05.240
Design a 2-PAM system,
with a specified

00:09:05.240 --> 00:09:06.490
probability of bit error.

00:09:11.380 --> 00:09:24.470
And what you want to do is
compare this to the ultimate

00:09:24.470 --> 00:09:25.720
Shannon limit.

00:09:33.080 --> 00:09:36.770
So that is your objective.

00:09:36.770 --> 00:09:40.540
So since we have to compare it
with Shannon limit, and we

00:09:40.540 --> 00:09:42.740
have already a formula for the
Shannon limit, let's just

00:09:42.740 --> 00:09:43.990
start with that.

00:09:50.320 --> 00:09:54.230
So for this problem, we have
the Shannon limit.

00:09:57.060 --> 00:10:02.870
You have rho is less than log
base 2 of 1 plus SNR.

00:10:02.870 --> 00:10:05.900
For SNR, I can talk in terms
of this continuous time

00:10:05.900 --> 00:10:13.240
parameters, it's P over N_0 W.
My P is one unit, and N_0

00:10:13.240 --> 00:10:16.730
is ten to the minus six, and
my bandwidth, W, is one

00:10:16.730 --> 00:10:19.570
Megahertz here, which
is ten to the six.

00:10:19.570 --> 00:10:22.250
So my P over N_0 W
is basically one.

00:10:22.250 --> 00:10:24.700
So I have 1 plus
1, which is 2.

00:10:24.700 --> 00:10:27.100
This is log base
2 of 2, or it's

00:10:27.100 --> 00:10:29.820
one bit per two dimension.

00:10:29.820 --> 00:10:37.100
So my capacity in bits per
second is rho W, and rho is

00:10:37.100 --> 00:10:40.610
one bit per two dimension, the
bandwidth is one Megahertz.

00:10:40.610 --> 00:10:44.090
So I get ten to the six
bits per second.

00:10:44.090 --> 00:10:46.230
So this is my Shannon
capacity for this

00:10:46.230 --> 00:10:47.870
particular AWGN system.
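
NOTE
Editor's sketch, not part of the lecture: the Shannon limit for the example, using the
specifications given above (P = 1 unit, N0 = 1e-6 units per hertz, W = 1 MHz). The variable
names are assumptions.
```python
import math
P = 1.0        # signal power, in the lecture's arbitrary units
N0 = 1e-6      # single-sided noise spectral density, units per hertz
W = 1e6        # baseband bandwidth in hertz
snr = P / (N0 * W)
rho = math.log2(1.0 + snr)     # bits per two dimensions
capacity = rho * W             # bits per second
print(snr, rho, capacity)
```
With SNR equal to one, rho is one bit per two dimensions and the capacity is about one
megabit per second, matching the figures worked out on the board.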

00:10:51.490 --> 00:10:52.740
OK.

00:10:54.320 --> 00:10:58.030
The next thing we want to do
is compare this with a

00:10:58.030 --> 00:11:00.700
practical system, and
see how close we get

00:11:00.700 --> 00:11:02.550
to the Shannon limit.

00:11:02.550 --> 00:11:06.080
And since you only have to work
with 2-PAM, the generic

00:11:06.080 --> 00:11:10.080
architecture is something
we saw last time.

00:11:10.080 --> 00:11:15.610
You have input bits coming in,
let's call them X sub k, where

00:11:15.610 --> 00:11:17.230
k is the kth bit.

00:11:17.230 --> 00:11:20.550
And they belong to a certain
constellation,

00:11:20.550 --> 00:11:21.280
let's call the --

00:11:21.280 --> 00:11:24.920
the constellation points are
just -- it's a 2-PAM system,

00:11:24.920 --> 00:11:27.690
so we have minus alpha
and alpha.

00:11:27.690 --> 00:11:32.080
And this goes through a PAM
modulator, and one parameter

00:11:32.080 --> 00:11:33.920
to specify for the
PAM modulator

00:11:33.920 --> 00:11:36.891
is the symbol interval.

00:11:36.891 --> 00:11:40.780
Right, the time between sending
consecutive signals

00:11:40.780 --> 00:11:42.490
over the channel.

00:11:42.490 --> 00:11:47.260
What you get out is X
of t, this is the

00:11:47.260 --> 00:11:48.950
channel model, N of t.

00:11:48.950 --> 00:11:51.460
There's already noise over
the channel, and what you

00:11:51.460 --> 00:11:54.880
get out is Y of t.

00:11:54.880 --> 00:11:57.470
So this is the generic
architecture.

00:11:57.470 --> 00:12:05.230
And now, your goal for the
design problem is to select

00:12:05.230 --> 00:12:12.630
alpha and T in the right way,
so that they satisfy these

00:12:12.630 --> 00:12:15.930
continuous time constraints, and
at the same time, you have

00:12:15.930 --> 00:12:18.050
your probability of error of
ten to the minus five.

00:12:23.970 --> 00:12:26.520
So what would be an obvious
choice for T?

00:12:30.312 --> 00:12:31.562
AUDIENCE: [INAUDIBLE]

00:12:34.950 --> 00:12:35.580
PROFESSOR: Right.

00:12:35.580 --> 00:12:39.790
So the first idea is you are
given a certain amount of

00:12:39.790 --> 00:12:43.890
bandwidth, and you clearly want
to send your signals as fast

00:12:43.890 --> 00:12:47.550
as possible in order to get
excellent data rate.

00:12:47.550 --> 00:12:50.100
Now because you have a certain
amount of bandwidth, what

00:12:50.100 --> 00:12:52.510
Nyquist's criterion tells
you is that you want

00:12:52.510 --> 00:12:54.730
to have zero ISI.

00:12:54.730 --> 00:12:59.450
And if you want to have zero
ISI, what you do know is that

00:12:59.450 --> 00:13:02.710
the symbol interval should
be greater than or

00:13:02.710 --> 00:13:04.670
equal to 1 over 2W.

00:13:04.670 --> 00:13:08.640
You cannot signal at a rate
faster than one over T, and so

00:13:08.640 --> 00:13:14.590
if we look at this, it's 1
over 2 times 10 to the 6.

00:13:14.590 --> 00:13:18.280
Now if I do use this particular
value of T, then

00:13:18.280 --> 00:13:21.710
what's my alpha going to be?

00:13:21.710 --> 00:13:25.320
Well, alpha squared is simply the
energy per symbol.

00:13:25.320 --> 00:13:30.370
So I know alpha squared is the
power that I have times the

00:13:30.370 --> 00:13:34.030
symbol interval, T. It's
just a definition.

00:13:34.030 --> 00:13:36.490
This comes from orthonormality
of the PAM

00:13:36.490 --> 00:13:38.100
system that we have.

00:13:38.100 --> 00:13:43.590
Now P is one, because that's
what I specified as a system

00:13:43.590 --> 00:13:48.225
specification, so this is just
T, which is one over two times ten

00:13:48.225 --> 00:13:49.970
to the six.

00:13:49.970 --> 00:13:54.022
So I can select this value of
alpha and this value of T.

00:13:54.022 --> 00:13:55.272
AUDIENCE: [INAUDIBLE]

00:13:57.830 --> 00:14:02.110
PROFESSOR: Well, alpha squared
is energy per symbol.

00:14:02.110 --> 00:14:04.300
So what will that be?

00:14:04.300 --> 00:14:06.910
What's the energy per symbol,
if you're sending every T

00:14:06.910 --> 00:14:09.740
seconds, and if you
have a power of P?

00:14:09.740 --> 00:14:12.720
Es, that I mentioned last
time, or energy per two

00:14:12.720 --> 00:14:14.230
dimensions.

00:14:14.230 --> 00:14:16.950
So in PAM, it will be
energy per two symbols.

00:14:16.950 --> 00:14:18.200
In that case, it will be 2PT.

00:14:23.120 --> 00:14:23.540
OK.

00:14:23.540 --> 00:14:26.400
But now if I select these values
of alpha and T, will my

00:14:26.400 --> 00:14:27.950
system work?

00:14:27.950 --> 00:14:30.900
Is this a reasonable
design, or is there

00:14:30.900 --> 00:14:32.150
something wrong here?

00:14:37.690 --> 00:14:39.790
I'm clearly satisfying my --

00:14:39.790 --> 00:14:41.690
AUDIENCE: [INAUDIBLE]

00:14:41.690 --> 00:14:44.190
PROFESSOR: The probability
of error, right, exactly.

00:14:44.190 --> 00:14:47.870
So in fact, I do know how
to calculate it, right?

00:14:47.870 --> 00:14:50.460
What's the probability
of bit error?

00:14:50.460 --> 00:14:56.560
Well, we saw that last time it
was Q of root 2 Eb over N_0.

00:14:59.850 --> 00:15:02.480
Eb is same as alpha squared,
because we

00:15:02.480 --> 00:15:04.440
have one bit per symbol.

00:15:04.440 --> 00:15:08.420
So alpha squared is this
quantity here, 1 over 2 times

00:15:08.420 --> 00:15:10.020
10 to the 6.

00:15:10.020 --> 00:15:13.580
So this is Q of square
root of --

00:15:13.580 --> 00:15:19.810
so 2 alpha squared is
1 over 10 to the 6.

00:15:19.810 --> 00:15:23.410
N_0, I know, is ten to the
minus six, so this is

00:15:23.410 --> 00:15:25.960
actually Q of 1.

00:15:25.960 --> 00:15:34.430
And if I do calculate that, it's
like 16 percent, which is

00:15:34.430 --> 00:15:36.170
nowhere close to ten
to the minus five.

00:15:44.030 --> 00:15:46.140
So any suggestions on how
I can improve my system?

00:15:48.680 --> 00:15:49.580
AUDIENCE: Increase
T [INAUDIBLE]

00:15:49.580 --> 00:15:51.570
PROFESSOR: Increase T, right?

00:15:51.570 --> 00:15:53.290
What's happening
right now is --

00:15:53.290 --> 00:15:56.960
the reason we selected this
value of T in the first place

00:15:56.960 --> 00:15:59.940
is because we wanted to send
our signals as fast as

00:15:59.940 --> 00:16:03.590
possible while avoiding ISI, but that's
just one of the criteria in my

00:16:03.590 --> 00:16:04.860
system, right?

00:16:04.860 --> 00:16:08.810
I have to also satisfy this
probability of error criteria,

00:16:08.810 --> 00:16:11.650
so I want to make sure my
probability of error is going

00:16:11.650 --> 00:16:12.930
to be small.

00:16:12.930 --> 00:16:15.960
If I look at the expression for
probability of error, it

00:16:15.960 --> 00:16:19.170
doesn't really look at T. All
it looks at is this ratio of

00:16:19.170 --> 00:16:20.360
Eb/N0, right?

00:16:20.360 --> 00:16:23.735
So if I want to reduce my
probability of error, I have

00:16:23.735 --> 00:16:25.880
to increase my energy per bit.

00:16:25.880 --> 00:16:29.460
Now my energy per bit is P times
T, so the only hope of

00:16:29.460 --> 00:16:32.780
increasing my energy per bit
will be to increase T, which

00:16:32.780 --> 00:16:36.798
means I have to signal
at a slower rate.

00:16:36.798 --> 00:16:38.190
OK?

00:16:38.190 --> 00:16:39.750
So we have probability of --

00:16:39.750 --> 00:16:41.440
let's write the calculation
down.

00:16:48.200 --> 00:16:50.120
It's ten to the minus five.

00:16:50.120 --> 00:16:53.150
Last time, we saw that the best
way to solve this is to

00:16:53.150 --> 00:16:59.510
look at the waterfall curve,
and EbN_0 in this case is

00:16:59.510 --> 00:17:03.410
approximately 9.6 dB.

00:17:03.410 --> 00:17:05.550
I will say that that's
approximately ten on the

00:17:05.550 --> 00:17:06.800
linear scale.

00:17:12.440 --> 00:17:17.690
So this implies that energy
per bit is ten

00:17:17.690 --> 00:17:19.890
to the minus five.

00:17:19.890 --> 00:17:24.660
So energy per bit is P times T,
in this case, it's ten to

00:17:24.660 --> 00:17:29.890
the minus five. P is one, so
this implies that T is ten to

00:17:29.890 --> 00:17:31.420
the minus five.

00:17:31.420 --> 00:17:35.320
So I can send one bit every ten
to the minus five seconds.

00:17:35.320 --> 00:17:38.100
So my rate that I achieve --

00:17:38.100 --> 00:17:39.350
just write it here --

00:17:43.130 --> 00:17:49.187
which is ten to the five
bits per second.

00:17:49.187 --> 00:17:50.437
OK?

00:17:54.520 --> 00:17:59.050
If you compare this to the
Shannon limit, the Shannon

00:17:59.050 --> 00:18:00.970
limit is right here,
you have ten to the

00:18:00.970 --> 00:18:02.770
six bits per second.

00:18:02.770 --> 00:18:05.980
So you lose by a factor of ten
in your data rate if you're

00:18:05.980 --> 00:18:08.730
going to use an uncoded
2-PAM system.

00:18:08.730 --> 00:18:12.610
So what this example tells you
is that if you're going to do

00:18:12.610 --> 00:18:15.830
more sophisticated coding, you
can gain up to a factor of

00:18:15.830 --> 00:18:18.920
ten in your data rate.
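
NOTE
Editor's sketch, not part of the lecture: instead of reading the waterfall curve, the required
Eb/N0 for Pb = 1e-5 can be found by bisection on Pb = Q(sqrt(2 Eb/N0)), then converted into a
symbol interval and a data rate. The helper names and the bisection bounds are assumptions.
```python
import math
def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))
def required_ebn0(target_pb):
    """Linear Eb/N0 at which uncoded 2-PAM reaches target_pb, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if q_function(math.sqrt(2.0 * mid)) > target_pb:
            lo = mid   # error rate still too high, so more energy per bit is needed
        else:
            hi = mid
    return hi
P, N0, W = 1.0, 1e-6, 1e6
ebn0 = required_ebn0(1e-5)
ebn0_db = 10.0 * math.log10(ebn0)
T = ebn0 * N0 / P            # from Eb = P * T
rate = 1.0 / T               # one bit per symbol
capacity = W * math.log2(1.0 + P / (N0 * W))
loss = capacity / rate
print(f"Eb/N0 = {ebn0_db:.2f} dB, rate = {rate:.3g} b/s, capacity / rate = {loss:.2f}")
```
The unrounded Eb/N0 is a little under ten on the linear scale, so the rate comes out slightly
above 1e5 bits per second and the loss slightly below ten, consistent with the rounded
figures used in the lecture.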

00:18:18.920 --> 00:18:22.340
So if the 10 dB did not really
impress you last time,

00:18:22.340 --> 00:18:25.110
hopefully this example
throws more light on

00:18:25.110 --> 00:18:26.360
the value of coding.

00:18:28.650 --> 00:18:30.015
Are there any questions
on this example?

00:18:35.675 --> 00:18:37.615
AUDIENCE: Since the --

00:18:37.615 --> 00:18:41.510
since we are signaling at a
slower rate now, instead of

00:18:41.510 --> 00:18:43.610
using sinc pulses, we can
use something better.

00:18:43.610 --> 00:18:45.220
PROFESSOR: That's a very
good point, yes.

00:18:45.220 --> 00:18:51.480
Well, if you look at the nominal
bandwidth here, it's 1

00:18:51.480 --> 00:18:53.240
over 2T, right?

00:18:53.240 --> 00:18:58.920
T is 10 to the minus 5 seconds,
so this says 1 over 2

00:18:58.920 --> 00:19:01.830
times 10 to the minus 5.

00:19:01.830 --> 00:19:06.065
So it's going to be 5 times
10 to the 4, or 50 KHz.

00:19:08.800 --> 00:19:09.270
OK?

00:19:09.270 --> 00:19:13.060
The available bandwidth you
have, the system bandwidth, if

00:19:13.060 --> 00:19:15.340
you will, is 1 Megahertz.

00:19:15.340 --> 00:19:20.440
But if you're going to do
Nyquist's ideal sinc pulses,

00:19:20.440 --> 00:19:23.745
then you only need 50
KHz of bandwidth in

00:19:23.745 --> 00:19:25.240
your system, right?

00:19:25.240 --> 00:19:29.900
So one advantage of this system,
if you will, is that

00:19:29.900 --> 00:19:32.610
you're not required to do the
complicated sinc pulses.

00:19:32.610 --> 00:19:34.750
Do not need to send
those pulses.

00:19:34.750 --> 00:19:38.480
You could simply send, for
example, square pulses and,

00:19:38.480 --> 00:19:40.950
because your bandwidth is so
low, you have a very low

00:19:40.950 --> 00:19:43.060
complexity system.

00:19:43.060 --> 00:19:47.720
Of course, the price you pay is
you reduce the data rate by

00:19:47.720 --> 00:19:48.970
a factor of ten.

00:19:51.368 --> 00:19:52.920
OK, it's a good point.

00:19:52.920 --> 00:19:54.990
In fact, there are many points
that will come up in this

00:19:54.990 --> 00:19:59.080
example if you think about it
later on, so feel free to ask

00:19:59.080 --> 00:20:03.050
me questions if you think about
some issues later on.

00:20:07.590 --> 00:20:08.850
OK.

00:20:08.850 --> 00:20:12.930
So I think we have motivated
the need for coding enough

00:20:12.930 --> 00:20:15.435
now, so let's look at
our encoder design.

00:20:27.760 --> 00:20:38.360
So a typical encoder design
takes bits in --

00:20:38.360 --> 00:20:41.110
we saw this last time
in the context of

00:20:41.110 --> 00:20:42.440
spectral efficiency --

00:20:42.440 --> 00:20:43.885
and produces symbols out.

00:20:47.840 --> 00:20:52.880
So I can represent my bits by,
say a vector b, and I can

00:20:52.880 --> 00:20:56.760
represent my symbols
by a vector x.

00:20:56.760 --> 00:20:59.880
So every sequence of b
bits gets mapped to a

00:20:59.880 --> 00:21:02.290
sequence of N symbols.

00:21:02.290 --> 00:21:05.860
Now this output sequence of
symbols is not any arbitrary

00:21:05.860 --> 00:21:11.160
sequence, but it lies in a set
of all possible sequences,

00:21:11.160 --> 00:21:16.860
which I denote by C. And this
set is essentially a set of

00:21:16.860 --> 00:21:21.230
permissible output symbol
sequences, which I will write

00:21:21.230 --> 00:21:27.590
by C sub j, which is a vector
in R^N, because there are N

00:21:27.590 --> 00:21:29.870
symbols being produced.

00:21:29.870 --> 00:21:34.550
And the index j of this set
goes from 1 to M. So we can

00:21:34.550 --> 00:21:38.390
have up to M symbols.

00:21:38.390 --> 00:21:43.990
And note that here, M has to
be equal to 2 to the b, in

00:21:43.990 --> 00:21:46.220
order to be able to
map every sequence

00:21:46.220 --> 00:21:48.900
of b bits to M symbols.

00:21:48.900 --> 00:22:01.500
So this C is known as a
codebook, and each C sub j is

00:22:01.500 --> 00:22:02.750
called a codeword.
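
NOTE
Editor's sketch, not part of the lecture: an encoder in the sense just defined is a lookup
from b input bits into a codebook of M = 2^b codewords in R^N. The codebook values and the
rule of indexing by the bits read as a binary number are hypothetical, chosen for illustration.
```python
def encode(bits, codebook):
    """Map a tuple of b bits to the codeword whose index is the bits read as a binary number."""
    index = int("".join(str(bit) for bit in bits), 2)
    return codebook[index]
# Hypothetical codebook with b = 2, M = 4, N = 3.
codebook = [
    (-1.0, -1.0, -1.0),
    (-1.0, +1.0, +1.0),
    (+1.0, -1.0, +1.0),
    (+1.0, +1.0, -1.0),
]
b = 2
assert len(codebook) == 2 ** b   # M must equal 2 to the b
print(encode((1, 0), codebook))
```
Any one-to-one assignment of bit patterns to codewords would serve equally well here; the
binary-index rule is simply the shortest to write down.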

00:22:07.280 --> 00:22:07.437
OK?

00:22:07.437 --> 00:22:10.870
The standard definition
of an encoder.

00:22:10.870 --> 00:22:15.480
Now in today's lecture and half
of next week's lecture,

00:22:15.480 --> 00:22:19.160
we will be looking at a very
specific case when N

00:22:19.160 --> 00:22:22.130
equals 1 and 2.

00:22:22.130 --> 00:22:26.010
In that case, instead of using
the letter C, we will be using

00:22:26.010 --> 00:22:30.970
a different letter, A. So C,
in that case, we'll call it

00:22:30.970 --> 00:22:32.220
actually a constellation.

00:22:34.650 --> 00:22:38.080
So in particular if N is one,
it's a PAM constellation.

00:22:38.080 --> 00:22:40.500
If N is 2, it's a QAM
constellation.

00:22:40.500 --> 00:22:44.180
And we'll be denoting it by
a letter A instead of C.

00:22:44.180 --> 00:22:50.090
So A is, again, a set of
symbols a_j, which belong to

00:22:50.090 --> 00:22:58.410
R^N, where j goes from one
up to M. OK, in this

00:22:58.410 --> 00:22:59.660
case A is a constellation.

00:23:08.040 --> 00:23:13.810
a_j's are known as symbols, or
sometimes they're also known

00:23:13.810 --> 00:23:15.576
as signal points in
the constellation.

00:23:22.840 --> 00:23:24.610
There are a number of
definitions that

00:23:24.610 --> 00:23:26.140
follow from this --

00:23:26.140 --> 00:23:28.170
number of properties of
the constellation,

00:23:28.170 --> 00:23:29.620
rather, that follow.

00:23:29.620 --> 00:23:34.260
So in particular N is known
as the dimension of your

00:23:34.260 --> 00:23:36.900
constellation.

00:23:36.900 --> 00:23:40.160
The number N is this, and
here, it's the number of

00:23:40.160 --> 00:23:43.870
symbol sequences you output
for a sequence of b

00:23:43.870 --> 00:23:47.090
bits that are in.

00:23:47.090 --> 00:23:49.135
M is the size of your
constellation.

00:23:59.130 --> 00:24:06.430
The average energy of the constellation is
given by 1 over M times the

00:24:06.430 --> 00:24:14.210
summation of the norm of a_j
squared, where j goes from one

00:24:14.210 --> 00:24:24.460
to M. The minimum distance of
your constellation is simply

00:24:24.460 --> 00:24:27.710
the Euclidean minimum distance
between two points in the

00:24:27.710 --> 00:24:30.380
constellation.

00:24:30.380 --> 00:24:33.290
So if you take the norm of a_i
minus a_j, and minimize it

00:24:33.290 --> 00:24:35.270
over all possible
values i and j.

00:24:38.330 --> 00:24:42.960
The number of nearest neighbors,
or the average

00:24:42.960 --> 00:24:44.460
number of nearest --

00:24:44.460 --> 00:24:59.360
K_min of A is the average number
of nearest neighbors in

00:24:59.360 --> 00:25:04.480
A.

00:25:04.480 --> 00:25:06.220
In addition to this, there
are some normalized

00:25:06.220 --> 00:25:08.700
parameters that you
saw last time.

00:25:17.460 --> 00:25:22.017
The spectral efficiency, which
is in units of bits per two

00:25:22.017 --> 00:25:29.630
dimensions is 2b over N. And if
you want to eliminate b, we

00:25:29.630 --> 00:25:33.700
use the relation that b is
log M to the base 2 here.

00:25:33.700 --> 00:25:41.540
And so we have 2 log M
to the base 2 over N.

00:25:41.540 --> 00:25:47.170
The energy per two dimensions,
denoted by Es, is simply 2

00:25:47.170 --> 00:25:49.920
over N E(A).

00:25:49.920 --> 00:25:52.920
So E(A) is the average energy
of your constellation.

00:25:52.920 --> 00:25:55.900
If you divide it by the number
of dimensions you have, you

00:25:55.900 --> 00:26:00.330
get energy per dimension, and
you multiply it by 2.

00:26:00.330 --> 00:26:07.080
And finally, the energy per bit
is Es over rho, or it can

00:26:07.080 --> 00:26:12.060
also be expressed as E(A), which
is the energy per symbol

00:26:12.060 --> 00:26:15.430
over the number of bits per
symbol, which is log

00:26:15.430 --> 00:26:16.943
M to the base 2.
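
NOTE
Editor's sketch, not part of the lecture: the definitions above (N, M, E(A), d_min, K_min,
rho, Es, Eb) computed directly for a small constellation. The function name and the 4-PAM
test constellation are assumptions chosen for illustration; math.dist needs Python 3.8 or later.
```python
import itertools
import math
def constellation_parameters(points):
    """Return the parameters defined in the lecture for a list of points in R^N."""
    M = len(points)
    N = len(points[0])
    energy = sum(sum(x * x for x in p) for p in points) / M      # E(A)
    dists = {
        (i, j): math.dist(points[i], points[j])
        for i, j in itertools.permutations(range(M), 2)
    }
    d_min = min(dists.values())
    # Average, over all points, of the number of neighbors at distance d_min.
    k_min = sum(1 for d in dists.values() if math.isclose(d, d_min)) / M
    rho = 2.0 * math.log2(M) / N          # bits per two dimensions
    Es = 2.0 * energy / N                 # energy per two dimensions
    Eb = energy / math.log2(M)            # energy per bit, equal to Es / rho
    return {"N": N, "M": M, "E": energy, "d_min": d_min, "K_min": k_min, "rho": rho, "Es": Es, "Eb": Eb}
four_pam = [(-3.0,), (-1.0,), (1.0,), (3.0,)]
params = constellation_parameters(four_pam)
print(params)
```
For this 4-PAM set the two outer points have one nearest neighbor and the two inner points
have two, so K_min averages to 1.5, and Es / rho agrees with E(A) / log2(M).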

00:26:26.120 --> 00:26:29.120
It might seem like a lot of
definitions, but you will see

00:26:29.120 --> 00:26:33.110
very soon that they have a very
tight interplay among one

00:26:33.110 --> 00:26:38.030
another, so it's not nearly as
overwhelming as it might seem

00:26:38.030 --> 00:26:38.980
at the first point.

00:26:38.980 --> 00:26:40.450
AUDIENCE: [INAUDIBLE]

00:26:40.450 --> 00:26:41.430
PROFESSOR: Yes.

00:26:41.430 --> 00:26:42.680
AUDIENCE: Why is
[UNINTELLIGIBLE]?

00:26:46.780 --> 00:26:49.010
PROFESSOR: Because you have two
[UNINTELLIGIBLE] b bits

00:26:49.010 --> 00:26:54.522
coming in, right, which
you map to each --

00:26:54.522 --> 00:26:57.486
AUDIENCE: [INAUDIBLE]

00:26:57.486 --> 00:26:59.860
There are two [INAUDIBLE]
possible sequences, but all of

00:26:59.860 --> 00:27:02.670
them need not be used, right?

00:27:02.670 --> 00:27:06.340
PROFESSOR: Well, we assume that
there is no coding going

00:27:06.340 --> 00:27:09.050
on before the encoder.

00:27:09.050 --> 00:27:11.690
So you have the source code, for
which there's a sequence

00:27:11.690 --> 00:27:16.021
of IID bits, and then
they are mapped to

00:27:16.021 --> 00:27:19.180
a sequence of symbols.

00:27:19.180 --> 00:27:22.700
So we'll see all of our possible
input bits coming in

00:27:22.700 --> 00:27:25.190
here, because it's produced
by a source code,

00:27:25.190 --> 00:27:27.620
like a Huffman code.

00:27:27.620 --> 00:27:27.745
Right?

00:27:27.745 --> 00:27:30.920
And the idea here is, perhaps
what you're asking is --

00:27:30.920 --> 00:27:33.250
this did not span the
entire space of Rn.

00:27:35.780 --> 00:27:38.400
We want to select these
sequences carefully here.

00:27:46.590 --> 00:27:49.230
Maybe we'll come back to that
later on in the course.

00:27:53.200 --> 00:27:54.470
OK, so let's do an example.

00:28:03.290 --> 00:28:16.040
So the example is, say we have
A, which is a 2-PAM system,

00:28:16.040 --> 00:28:21.340
and you want to look at this
constellation, B, which is

00:28:21.340 --> 00:28:26.720
denoted by A raised to K. The
definition of A raised to K is

00:28:26.720 --> 00:28:36.800
it's a sequence of K symbols
where each x_i belongs to A.

00:28:36.800 --> 00:28:54.049
So this is also known as the
K-fold Cartesian product of A.

00:28:54.049 --> 00:28:55.023
AUDIENCE: Another question.

00:28:55.023 --> 00:28:58.380
So it has been pre-decided that
b bits will be encoded

00:28:58.380 --> 00:28:58.810
[UNINTELLIGIBLE]?

00:28:58.810 --> 00:29:00.240
PROFESSOR: Right.

00:29:00.240 --> 00:29:02.450
So this is a specific structure
we are imposing on

00:29:02.450 --> 00:29:03.700
the encoder.

00:29:07.830 --> 00:29:10.590
So this is the constellation,
and you'll want to study the

00:29:10.590 --> 00:29:12.740
properties for this
constellation.

00:29:12.740 --> 00:29:18.230
For this constellation,
what's N going to be?

00:29:18.230 --> 00:29:20.648
What's the dimension
going to be?

00:29:20.648 --> 00:29:22.130
AUDIENCE: [INAUDIBLE]

00:29:22.130 --> 00:29:28.700
PROFESSOR: For B, not A. It's
going to be K. Well, the

00:29:28.700 --> 00:29:32.050
number of points in this
constellation, how

00:29:32.050 --> 00:29:33.910
many points are there?

00:29:33.910 --> 00:29:35.500
There are K coordinates.

00:29:35.500 --> 00:29:38.520
Each coordinate can be
plus or minus alpha.

00:29:38.520 --> 00:29:44.670
So we have 2 to the K possible
points in this constellation.

00:29:44.670 --> 00:29:45.920
OK?

00:29:47.560 --> 00:29:49.840
What's E of A going to be?

00:30:00.554 --> 00:30:02.990
AUDIENCE: [INAUDIBLE]

00:30:02.990 --> 00:30:05.890
PROFESSOR: K alpha
squared, right?

00:30:05.890 --> 00:30:07.570
Basically, we have
K coordinates.

00:30:07.570 --> 00:30:11.170
The energy for each coordinate
will simply add up.

00:30:11.170 --> 00:30:14.060
The energy across each
coordinate is always going to

00:30:14.060 --> 00:30:15.570
be alpha squared.

00:30:15.570 --> 00:30:18.480
So each point in this
constellation has an energy of

00:30:18.480 --> 00:30:19.960
K alpha squared.

00:30:19.960 --> 00:30:22.290
So regardless of how many points
we have, the average

00:30:22.290 --> 00:30:24.930
energy is always going to
be K alpha squared.

00:30:28.260 --> 00:30:29.510
Does everybody see this?

00:30:29.510 --> 00:30:30.418
AUDIENCE: [INAUDIBLE]

00:30:30.418 --> 00:30:32.690
PROFESSOR: You're right.

00:30:32.690 --> 00:30:33.940
Maybe that was the confusion.

00:30:36.260 --> 00:30:38.090
It's a good thing.

00:30:38.090 --> 00:30:42.560
Just getting too used to
writing E of A. OK.

00:30:42.560 --> 00:30:44.390
What's d_min of B going to be?

00:30:55.324 --> 00:30:56.318
AUDIENCE: [INAUDIBLE]

00:30:56.318 --> 00:30:57.312
2 alpha.

00:30:57.312 --> 00:30:58.562
PROFESSOR: 2 alpha.

00:31:01.090 --> 00:31:03.140
I think everybody had
the right idea.

00:31:03.140 --> 00:31:07.990
So the minimum distance here is
two alpha for A. If we look

00:31:07.990 --> 00:31:12.780
at this point B, we can fix K
minus 1 coordinates for two

00:31:12.780 --> 00:31:15.890
points to be the same, so they will
only differ in one coordinate.

00:31:15.890 --> 00:31:18.930
And so the minimum distance
is across that

00:31:18.930 --> 00:31:21.430
coordinate, which is 2 alpha.

00:31:21.430 --> 00:31:23.690
What is K_min of
B going to be?

00:31:29.910 --> 00:31:34.920
It's going to be K. For each
point -- let's say the point

00:31:34.920 --> 00:31:39.890
which has all alphas, we can fix
K minus 1 coordinates and

00:31:39.890 --> 00:31:42.710
find another point which is
different in only one of the

00:31:42.710 --> 00:31:44.580
coordinates, say the
first coordinate.

00:31:44.580 --> 00:31:47.580
We can do it for all K different
coordinates, so

00:31:47.580 --> 00:31:53.830
K_min is going to be K for
each point, and hence the

00:31:53.830 --> 00:31:56.830
average number of nearest
neighbors is also K.

00:31:56.830 --> 00:32:00.870
OK, so now in this case, let's
first start with the

00:32:00.870 --> 00:32:02.360
normalized parameters.

00:32:02.360 --> 00:32:05.940
That's always good to start
with spectral efficiency.

00:32:05.940 --> 00:32:14.710
That's 2 log M over N. Well, log
of M is going to be K, and

00:32:14.710 --> 00:32:18.440
N is going to be K. So this is
going to be two bits per two

00:32:18.440 --> 00:32:24.300
dimensions, and this is the same
as that of the original

00:32:24.300 --> 00:32:30.370
constellation, A. Your energy
per two dimensions is going to

00:32:30.370 --> 00:32:35.200
be 2 over N E(B).

00:32:35.200 --> 00:32:39.080
E(B) is K alpha squared, N
equals K, so this is 2 alpha

00:32:39.080 --> 00:32:41.980
squared, and that is the same as
the original constellation,

00:32:41.980 --> 00:32:45.120
A.

00:32:45.120 --> 00:32:46.910
Finally,

00:32:46.910 --> 00:32:51.350
energy per bit is Es over
rho, so it's 2 alpha

00:32:51.350 --> 00:32:53.190
squared over 2.

00:32:53.190 --> 00:32:55.880
So that's alpha squared, and
that's the same as the 2-PAM

00:32:55.880 --> 00:32:58.000
constellation.

00:32:58.000 --> 00:33:00.470
So why did I go through all
of these calculations?

00:33:03.120 --> 00:33:06.560
What we see is that the
normalized parameters, rho,

00:33:06.560 --> 00:33:10.920
Es, and Eb, are the same for the
Cartesian product as for

00:33:10.920 --> 00:33:12.900
the original constellation.
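
NOTE
A minimal sketch of the calculation above, assuming alpha = 1 for the demo.
The helper name product_parameters is hypothetical, not from the lecture.
It enumerates B = A^K by brute force and computes N, M, E(B), d_min, K_min,
rho, Es and Eb, so the K = 1 and K = 3 rows can be compared directly.
```python
import itertools
import math
def product_parameters(alpha, k):
    """Parameters of B = A^K, where A = {+alpha, -alpha} is 2-PAM."""
    points = list(itertools.product((alpha, -alpha), repeat=k))
    m, n = len(points), k  # M = 2^K points in N = K dimensions
    energy = sum(sum(x * x for x in p) for p in points) / m  # E(B)
    d_min = min(math.dist(p, q) for p, q in itertools.combinations(points, 2))
    # Average number of neighbors at distance d_min, i.e. K_min(B).
    k_min = sum(
        math.isclose(math.dist(p, q), d_min)
        for p in points for q in points if p != q
    ) / m
    rho = 2 * math.log2(m) / n  # bits per two dimensions
    es = 2 * energy / n         # energy per two dimensions
    eb = es / rho               # energy per bit
    return {"N": n, "M": m, "E": energy, "d_min": d_min,
            "K_min": k_min, "rho": rho, "Es": es, "Eb": eb}
for k in (1, 3):
    print(k, product_parameters(1.0, k))
```
For any K the values of rho, Es and Eb should match the 2-PAM row, while E(B) and K_min grow with K.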

00:33:12.900 --> 00:33:16.080
And at some level, that should
not be too surprising, right?

00:33:16.080 --> 00:33:19.340
Because in this Cartesian
product, we are

00:33:19.340 --> 00:33:21.510
not really doing any
coding, right?

00:33:21.510 --> 00:33:24.270
In this original constellation,
we had one bit

00:33:24.270 --> 00:33:27.080
coming in, and we are mapping
it to one symbol.

00:33:27.080 --> 00:33:29.640
All we are doing in the
Cartesian product is we are

00:33:29.640 --> 00:33:32.520
taking K bits in and mapping
them to K symbols.

00:33:32.520 --> 00:33:34.880
So we still have one
bit per symbol.

00:33:34.880 --> 00:33:39.770
The noise is IID, so it's
optimal to make decisions for

00:33:39.770 --> 00:33:42.280
each of the coordinates
independently, and decide

00:33:42.280 --> 00:33:44.460
whether that coordinate
corresponds to plus alpha or

00:33:44.460 --> 00:33:45.730
minus alpha.

00:33:45.730 --> 00:33:47.750
So in other words, there's
nothing gained by doing this

00:33:47.750 --> 00:33:49.650
Cartesian product.

00:33:49.650 --> 00:33:51.280
And we will see, the probability
of error

00:33:51.280 --> 00:33:54.750
expression depends on these
normalized parameters, if we

00:33:54.750 --> 00:33:58.140
want to look at Pb of E, and so
we do not gain anything in

00:33:58.140 --> 00:34:01.980
terms of the probability of
error versus Eb/N_0 trade-off

00:34:01.980 --> 00:34:03.730
through Cartesian product.

00:34:03.730 --> 00:34:07.390
So I'm making the note here
because that's the

00:34:07.390 --> 00:34:09.520
only space I have.

00:34:09.520 --> 00:34:14.020
So the note is if I look at
probability of bit error

00:34:14.020 --> 00:34:21.060
versus Eb/N_0, the curve we saw
last time, it is the same for

00:34:21.060 --> 00:34:28.560
B and A. You should be able to
convince yourself about this,

00:34:28.560 --> 00:34:30.630
and so there is really no
coding going on here.

00:34:35.040 --> 00:34:36.460
Are there any questions
on this?

00:34:39.969 --> 00:34:43.510
Let's look at this problem
a bit more carefully now.

00:35:05.875 --> 00:35:06.869
AUDIENCE: [INAUDIBLE]

00:35:06.869 --> 00:35:07.863
PROFESSOR: Yeah.

00:35:07.863 --> 00:35:09.113
AUDIENCE: Why [INAUDIBLE]

00:35:12.750 --> 00:35:14.020
PROFESSOR: Right.

00:35:14.020 --> 00:35:15.770
AUDIENCE: What does he use?

00:35:15.770 --> 00:35:19.350
PROFESSOR: Energy per
two dimensions.

00:35:19.350 --> 00:35:21.960
Es will always be energy
per two dimensions.

00:35:21.960 --> 00:35:23.800
Throughout the course, we'll
be using these notations.

00:35:23.800 --> 00:35:26.040
Eb is the energy per bit,
Es is the energy per two

00:35:26.040 --> 00:35:27.560
dimensions.

00:35:27.560 --> 00:35:30.600
And if you want to say energy
per symbol, we'll be using

00:35:30.600 --> 00:35:34.068
this notation E sub
the constellation.

00:35:34.068 --> 00:35:34.512
AUDIENCE: Oh.

00:35:34.512 --> 00:35:36.290
It's not energy per bit.

00:35:36.290 --> 00:35:37.220
PROFESSOR: No, this
is energy --

00:35:37.220 --> 00:35:39.060
B is my constellation.

00:35:39.060 --> 00:35:42.210
So that's why it's energy of
that constellation, average

00:35:42.210 --> 00:35:43.990
energy per symbol in
that constellation.

00:36:14.320 --> 00:36:19.930
OK, so let us consider the
special case when K equals 3.

00:36:19.930 --> 00:36:24.040
So in that case, B is A^3.

00:36:24.040 --> 00:36:27.420
So if I look at all possible
points in B, they are going to

00:36:27.420 --> 00:36:30.950
lie on the vertices of a
three-dimensional cube.

00:36:30.950 --> 00:36:33.365
That's a Cartesian product
in three dimensions.

00:36:43.760 --> 00:36:46.580
And all my constellation points
are basically on the

00:36:46.580 --> 00:36:47.830
vertices of this cube.

00:36:54.070 --> 00:36:57.680
The distance here is going
to be 2 alpha.

00:36:57.680 --> 00:37:01.960
That's the length of each edge
in my cube, and that's what B

00:37:01.960 --> 00:37:03.025
is going to be.

00:37:03.025 --> 00:37:06.930
Clearly, the minimum distance
is 2 alpha, as we saw before.

00:37:06.930 --> 00:37:12.260
Now let me define a different
constellation, B prime, and I'm

00:37:12.260 --> 00:37:15.120
only going to take four vertices
from these possible

00:37:15.120 --> 00:37:16.660
eight vertices.

00:37:16.660 --> 00:37:21.560
I'm going to take this vertex
here, I'm going to take this

00:37:21.560 --> 00:37:27.120
vertex here, this one,
and this one.

00:37:27.120 --> 00:37:29.140
I'm only taking four vertices.

00:37:29.140 --> 00:37:32.540
If I want to tell you explicitly
what the points

00:37:32.540 --> 00:37:36.310
are, I need to draw an axis, so
I'm simply drawing the x,

00:37:36.310 --> 00:37:38.680
y, and z axis here.

00:37:38.680 --> 00:37:42.740
This is x-axis, this is
y-axis, and z-axis.

00:37:42.740 --> 00:37:46.180
And B prime is a subset
of the points in this

00:37:46.180 --> 00:37:49.460
three-dimensional Cartesian
product.

00:37:49.460 --> 00:37:55.280
They will be alpha, alpha,
alpha; minus alpha, minus

00:37:55.280 --> 00:38:03.950
alpha, alpha; alpha, minus
alpha, minus alpha; and let's

00:38:03.950 --> 00:38:09.050
see, minus alpha, alpha,
minus alpha.

00:38:09.050 --> 00:38:12.300
So two of the coordinates will
be minus alpha here, in these

00:38:12.300 --> 00:38:16.260
three points, and we have one
point with all alphas.

00:38:16.260 --> 00:38:18.000
So this is my B prime.

00:38:18.000 --> 00:38:20.490
What is the minimum distance
going to be for B prime?

00:38:26.226 --> 00:38:28.150
AUDIENCE: [INAUDIBLE]

00:38:28.150 --> 00:38:29.540
PROFESSOR: 2 root
2 alpha, right?

00:38:29.540 --> 00:38:32.200
It's basically the length
of this edge here.

00:38:32.200 --> 00:38:34.740
This is 2 alpha, this
is 2 alpha.

00:38:34.740 --> 00:38:35.990
So it's 2 root 2 alpha.

00:38:45.300 --> 00:38:48.760
So in other words, by simply
selecting a subset of points,

00:38:48.760 --> 00:38:51.660
I have been able to increase
my minimum distance.

00:38:51.660 --> 00:38:54.220
Because my minimum distance
is larger, I hope that the

00:38:54.220 --> 00:38:56.810
probability of error will be
smaller as opposed to the

00:38:56.810 --> 00:38:58.420
original constellation.

00:38:58.420 --> 00:39:00.330
But this comes at a
price, right?

00:39:00.330 --> 00:39:01.580
And what's the price?

00:39:05.400 --> 00:39:07.540
The spectral efficiency
is smaller, right?

00:39:07.540 --> 00:39:10.450
What if I look at my spectral
efficiency?

00:39:10.450 --> 00:39:13.070
Well, I have only
four points, so two

00:39:13.070 --> 00:39:15.090
bits per point.

00:39:15.090 --> 00:39:18.040
So two bits, each point takes
three dimensions, so my

00:39:18.040 --> 00:39:21.450
spectral efficiency is 2 times
2 over 3 bits per two

00:39:21.450 --> 00:39:26.480
dimensions, or it's 4 over 3
bits per two dimensions.

00:39:26.480 --> 00:39:29.110
And this is in contrast to the
two bits per two dimensions we

00:39:29.110 --> 00:39:30.610
had for B.
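
NOTE
A minimal sketch of the comparison above, assuming alpha = 1.
The helper name d_min_and_rho is hypothetical, not from the lecture.
B prime is built as the four cube vertices with an even number of
minus-alpha coordinates, which matches the four points listed on the board.
```python
import itertools
import math
def d_min_and_rho(points):
    """Minimum distance and spectral efficiency (bits per two dimensions)."""
    n = len(points[0])
    d_min = min(math.dist(p, q) for p, q in itertools.combinations(points, 2))
    rho = 2 * math.log2(len(points)) / n
    return d_min, rho
alpha = 1.0
cube = list(itertools.product((alpha, -alpha), repeat=3))  # B = A^3
# B prime: the four cube vertices with an even number of -alpha coordinates.
subset = [p for p in cube if sum(x < 0 for x in p) % 2 == 0]
print(d_min_and_rho(cube))    # d_min = 2 alpha, rho = 2
print(d_min_and_rho(subset))  # d_min = 2 sqrt(2) alpha, rho = 4/3
```
Dropping half the vertices raises d_min by a factor of root 2 and lowers rho from 2 to 4/3 bits per two dimensions.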

00:39:30.610 --> 00:39:33.690
So in other words, there is
a trade-off between your

00:39:33.690 --> 00:39:37.180
spectral efficiency and
the minimum distance.

00:39:37.180 --> 00:39:40.200
We started with a K-dimensional
Cartesian

00:39:40.200 --> 00:39:42.720
product of A, which
has all points.

00:39:42.720 --> 00:39:46.690
We took a subset of points, and
if we chose them smartly,

00:39:46.690 --> 00:39:49.490
we were able to increase the
minimum distance, but the

00:39:49.490 --> 00:39:52.490
price we had to pay was
to reduce the spectral

00:39:52.490 --> 00:39:53.740
efficiency.

00:39:57.610 --> 00:40:00.040
AUDIENCE: Where did this
2/3 come from?

00:40:00.040 --> 00:40:00.840
PROFESSOR: This two here?

00:40:00.840 --> 00:40:02.090
AUDIENCE: 2/3, yes.

00:40:02.090 --> 00:40:05.920
PROFESSOR: 2/3, I am sending
two bits per symbol, right?

00:40:05.920 --> 00:40:07.275
Each symbol has three
dimensions.

00:40:10.370 --> 00:40:13.780
So it's 2/3 bits per dimension,
or 4/3 bits per two dimensions.

00:40:16.490 --> 00:40:25.590
OK, so the point was it seems
like there is a trade-off

00:40:25.590 --> 00:40:33.236
between minimum distance and
spectral efficiency.

00:40:37.840 --> 00:40:40.300
And indeed, this might seem like
a reasonable trade-off,

00:40:40.300 --> 00:40:42.710
and a lot of the coding that we
will be seeing in the early

00:40:42.710 --> 00:40:46.850
part of the course is indeed
motivated by this trade-off.

00:40:46.850 --> 00:40:49.250
You want to reduce your spectral
efficiency in order

00:40:49.250 --> 00:40:50.520
to reduce your probability
of error.

00:40:53.250 --> 00:40:55.240
And this has in fact been
quite a dominant design

00:40:55.240 --> 00:40:58.570
principle for a large number of
codes that have come up in

00:40:58.570 --> 00:40:59.870
coding theory.

00:40:59.870 --> 00:41:04.910
However, if you look at what
Shannon says, Shannon says

00:41:04.910 --> 00:41:07.320
something quite different.

00:41:07.320 --> 00:41:10.530
In Shannon's theorem, all
it says is, you have --

00:41:10.530 --> 00:41:16.390
if your spectral efficiency is
below a certain amount, then

00:41:16.390 --> 00:41:24.180
your probability of bit error
can be made arbitrarily small.

00:41:24.180 --> 00:41:27.590
OK, so what Shannon is saying,
it's something much stronger

00:41:27.590 --> 00:41:28.490
than this trade-off.

00:41:28.490 --> 00:41:31.190
It's saying if you reduce your
spectral efficiency below a

00:41:31.190 --> 00:41:34.520
certain quantity which is
finite, then the probability

00:41:34.520 --> 00:41:37.030
of error can be made
arbitrarily small.

00:41:37.030 --> 00:41:40.150
There is no statement of minimum
distance in this

00:41:40.150 --> 00:41:41.710
theorem here.

00:41:41.710 --> 00:41:44.280
And indeed, if you look at the
most modern codes which are

00:41:44.280 --> 00:41:48.640
capacity approaching, they are
not designed to maximize the

00:41:48.640 --> 00:41:49.900
minimum distance.

00:41:49.900 --> 00:41:52.770
They are designed to work well
with some practical decoding

00:41:52.770 --> 00:41:55.120
algorithms, like the belief
propagation

00:41:55.120 --> 00:41:56.640
algorithms and so on.

00:41:56.640 --> 00:41:58.890
So they are designed on a
somewhat different principle

00:41:58.890 --> 00:42:00.620
than minimum distance.

00:42:00.620 --> 00:42:04.140
But nevertheless, this is quite
a powerful tool that we

00:42:04.140 --> 00:42:06.910
will be using in the early
part of this course.

00:42:06.910 --> 00:42:09.800
We start with a K-dimensional
Cartesian product, select a

00:42:09.800 --> 00:42:12.580
subset of points, and we want
to increase the minimum

00:42:12.580 --> 00:42:14.395
distance at the cost of
spectral efficiency.

00:42:18.460 --> 00:42:21.160
OK.

00:42:21.160 --> 00:42:23.247
Now are there any questions?

00:42:23.247 --> 00:42:24.497
AUDIENCE: [INAUDIBLE]

00:42:29.620 --> 00:42:31.390
PROFESSOR: That's
a good question.

00:42:31.390 --> 00:42:34.430
Suppose I have a 2-PAM
constellation, then I can

00:42:34.430 --> 00:42:38.060
easily write the probability of
bit error as a function of

00:42:38.060 --> 00:42:39.190
the Q function.

00:42:39.190 --> 00:42:42.310
If it is a more complicated
constellation, I have to

00:42:42.310 --> 00:42:45.990
integrate over the decision
regions, which we'll be seeing

00:42:45.990 --> 00:42:47.850
later on in this lecture.

00:42:47.850 --> 00:42:50.150
And it's not usually possible to
get an exact probability of

00:42:50.150 --> 00:42:51.200
error expression.

00:42:51.200 --> 00:42:54.480
We usually use a union
bound, to bound it by a pair

00:42:54.480 --> 00:42:55.580
of [UNINTELLIGIBLE]
error probability.

00:42:55.580 --> 00:42:57.300
We'll be doing all
that just now.

00:43:05.430 --> 00:43:06.070
OK.

00:43:06.070 --> 00:43:08.090
So now let us --

00:43:08.090 --> 00:43:09.740
I have talked enough now
about the encoder, and we'll be

00:43:09.740 --> 00:43:12.840
visiting it very soon, but let
us switch gears and talk about

00:43:12.840 --> 00:43:15.380
the decoder now.

00:43:15.380 --> 00:43:17.820
OK, what does a decoder do?

00:43:17.820 --> 00:43:19.850
So the goal of a decoder
is the following.

00:43:23.360 --> 00:43:29.270
You get your received vector Y,
which is X plus N, and from

00:43:29.270 --> 00:43:36.400
Y, you want to estimate X-hat
as a point in your signal

00:43:36.400 --> 00:43:37.620
constellation.

00:43:37.620 --> 00:43:40.530
So you receive a noisy version
of X, and you want to estimate

00:43:40.530 --> 00:43:45.560
X-hat at the decoder.

00:43:45.560 --> 00:43:48.000
So this is the architecture
of your decoder.

00:43:48.000 --> 00:43:55.830
And the goal here is you
want to minimize the

00:43:55.830 --> 00:43:58.890
probability of error.

00:43:58.890 --> 00:44:01.490
And what's the probability
of error?

00:44:01.490 --> 00:44:07.420
It's basically probability that
X is not equal to X-hat.

00:44:07.420 --> 00:44:11.780
So that is your general criteria
at the decoder.

00:44:11.780 --> 00:44:14.290
Now what we'll be doing next is
basically going through this

00:44:14.290 --> 00:44:18.210
exercise to show that this
minimum probability of error

00:44:18.210 --> 00:44:22.590
criteria is equivalent to a
bunch of other criteria.

00:44:22.590 --> 00:44:26.340
So the first criteria is the
MAP criteria: Maximum

00:44:26.340 --> 00:44:27.590
A-Posteriori Rule.

00:45:28.930 --> 00:45:35.110
So our probability of error
is basically --

00:45:35.110 --> 00:45:39.890
I can write it as an integral of
probability of error given

00:45:39.890 --> 00:45:49.030
Y times the density function of
Y. So if I want to minimize

00:45:49.030 --> 00:45:51.970
my probability of error, I want
to minimize each term in

00:45:51.970 --> 00:45:53.190
this integral.

00:45:53.190 --> 00:46:00.070
So this implies I want to
minimize probability of error

00:46:00.070 --> 00:46:02.860
given Y for each possible
value of Y. OK?

00:46:05.370 --> 00:46:06.910
Now what's that going to be?

00:46:06.910 --> 00:46:10.240
Well, in order to look
at what this term is,

00:46:10.240 --> 00:46:12.340
suppose I make a decision.

00:46:12.340 --> 00:46:15.520
I receive Y, and I decide
a symbol a_j is sent.

00:46:36.610 --> 00:46:38.880
Then what's the probability
of error going to be?

00:46:41.480 --> 00:46:50.070
My probability of error given
Y is going to be 1 minus the

00:46:50.070 --> 00:46:51.890
probability that
I was correct.

00:46:51.890 --> 00:46:55.570
Probability that I was correct
is probability X equals a_j,

00:46:55.570 --> 00:47:01.580
given Y. This follows
from the definition.

00:47:01.580 --> 00:47:04.720
So if I want to minimize my
probability of error given Y,

00:47:04.720 --> 00:47:07.300
I want to actually choose an
a_j that maximizes the

00:47:07.300 --> 00:47:14.890
probability of a_j given Y. So
this implies, choose a_j.

00:47:25.860 --> 00:47:28.706
And this is known
as the MAP rule.

00:47:32.110 --> 00:47:35.800
So the idea behind the MAP rule
is to choose the symbol

00:47:35.800 --> 00:47:38.690
in the constellation that
maximizes the posterior

00:47:38.690 --> 00:47:42.720
probability, given the
received symbol.

00:47:42.720 --> 00:47:45.410
Now, this MAP rule is equivalent
to the maximum

00:47:45.410 --> 00:47:49.180
likelihood rule, under the
assumption that all the signal

00:47:49.180 --> 00:47:51.460
points a_j are equally likely.

00:47:51.460 --> 00:47:53.510
The proof is not hard,
you just use

00:47:53.510 --> 00:47:55.520
Bayes Theorem for that.

00:47:55.520 --> 00:48:05.990
So suppose all a_j's
are equally likely.

00:48:08.940 --> 00:48:15.600
Then probability of a_j given
Y, which by Bayes Theorem is

00:48:15.600 --> 00:48:20.400
the density of Y given a_j,
times the probability of a_j.

00:48:20.400 --> 00:48:22.940
But since all a_j's are equally
likely, I will just

00:48:22.940 --> 00:48:31.850
write it as 1 over M, over the
density of Y. Now because Y is

00:48:31.850 --> 00:48:35.330
fixed, the density of Y is
fixed, so this quantity is

00:48:35.330 --> 00:48:37.840
just proportional to --

00:48:37.840 --> 00:48:40.260
the proportionality symbol --

00:48:40.260 --> 00:48:42.110
to the density of Y given a_j.

00:48:46.430 --> 00:48:48.560
I won't be writing all
the vectors, I

00:48:48.560 --> 00:48:49.320
might be missing some.

00:48:49.320 --> 00:48:50.990
But please bear with me.

00:48:50.990 --> 00:49:04.590
So this implies we want to
choose a_j that maximizes the

00:49:04.590 --> 00:49:10.450
density of Y given a_j, and this
is known as the maximum

00:49:10.450 --> 00:49:12.800
likelihood rule.

00:49:12.800 --> 00:49:14.820
And there is one final rule.

00:49:14.820 --> 00:49:18.580
Basically if the noise is
additive Gaussian, then the

00:49:18.580 --> 00:49:24.510
density of Y given a_j is
simply proportional to e

00:49:24.510 --> 00:49:29.450
raised to minus the norm
of Y minus a_j squared.

00:49:29.450 --> 00:49:33.400
So if we want to maximize this
quantity, we want to minimize

00:49:33.400 --> 00:49:37.170
Y minus a_j squared.

00:49:37.170 --> 00:49:48.170
So we want to choose a_j that
minimizes the Euclidean

00:49:48.170 --> 00:49:51.330
distance between Y and a_j.

00:49:51.330 --> 00:49:54.770
And this is known as the minimum
distance decision

00:49:54.770 --> 00:49:56.310
rule, MDD rule.

00:49:56.310 --> 00:49:56.970
Yes?

00:49:56.970 --> 00:49:59.430
AUDIENCE: We are ignoring
[UNINTELLIGIBLE]?

00:49:59.430 --> 00:50:00.840
PROFESSOR: We are ignoring
P of Y.

00:50:00.840 --> 00:50:03.660
Because for a given Y, Py is
going to be fixed for all

00:50:03.660 --> 00:50:05.790
possible choices of a_j.

00:50:05.790 --> 00:50:08.440
The goal is I'm given Y, and I
want to decide which signal

00:50:08.440 --> 00:50:12.730
point was sent, because that's
the probability of error given

00:50:12.730 --> 00:50:16.360
Y. This is my criteria now.

00:50:16.360 --> 00:50:19.746
So Y is fixed, so the density
of Y is fixed.

00:50:19.746 --> 00:50:22.040
AUDIENCE: [INAUDIBLE]

00:50:22.040 --> 00:50:24.340
PROFESSOR: It's basically
given by --

00:50:24.340 --> 00:50:26.720
in order to find this density,
we'll just condition it on

00:50:26.720 --> 00:50:31.416
a_j, and sum up over all
possible values of a_j.

00:50:31.416 --> 00:50:33.610
Just take the marginal
of Y, right?

00:50:33.610 --> 00:50:36.040
I mean, to write this
explicitly.

00:50:36.040 --> 00:50:37.850
I'm writing it here.

00:50:37.850 --> 00:50:42.750
It's going to be the sum of P of
Y given a_j times the

00:50:42.750 --> 00:50:44.400
probability of a_j.

00:50:44.400 --> 00:50:47.306
And this you can find
by the Gaussian.

00:50:47.306 --> 00:50:51.500
AUDIENCE: But then you
have [INAUDIBLE].

00:50:51.500 --> 00:50:53.470
PROFESSOR: But this is
a summation over

00:50:53.470 --> 00:50:54.416
all possible a_j's.

00:50:54.416 --> 00:50:57.500
I should write, sorry
-- a_k's.

00:50:57.500 --> 00:50:59.630
This is just a summation
over all k's, right?

00:50:59.630 --> 00:51:02.240
So basically, Py is going to
be a mixture of several

00:51:02.240 --> 00:51:04.180
Gaussians, OK?

00:51:04.180 --> 00:51:05.160
And it's fixed.

00:51:05.160 --> 00:51:07.120
AUDIENCE: [INAUDIBLE]

00:51:07.120 --> 00:51:08.370
PROFESSOR: Right.

00:51:10.510 --> 00:51:11.360
OK.

00:51:11.360 --> 00:51:14.700
So we want to choose the Minimum
Distance Decision

00:51:14.700 --> 00:51:23.170
rule, and I should have the
variance of noise here.

00:51:23.170 --> 00:51:26.310
OK, so what we have so far is
we started with the Minimum

00:51:26.310 --> 00:51:28.600
Probability of Error rule,
and that's the

00:51:28.600 --> 00:51:30.335
criteria of your decoder.

00:51:30.335 --> 00:51:35.510
We showed this is equivalent to
the MAP rule, and that basically

00:51:35.510 --> 00:51:38.950
comes just from the
definition.

00:51:38.950 --> 00:51:42.560
This integral here, we want to
minimize each term in the

00:51:42.560 --> 00:51:45.210
integral, and that basically
implies that the Maximum

00:51:45.210 --> 00:51:47.280
A-Posteriori rule is the best.

00:51:47.280 --> 00:51:52.280
This implied Maximum Likelihood
rule, and Maximum

00:51:52.280 --> 00:51:56.030
Likelihood rule comes from the
fact that all the signal

00:51:56.030 --> 00:51:58.250
points are equally likely.

00:51:58.250 --> 00:52:02.510
And this implies, then, the
Minimum Distance Decision

00:52:02.510 --> 00:52:05.200
rule, and that comes from the
fact that your noise is

00:52:05.200 --> 00:52:06.280
Additive Gaussian.

00:52:06.280 --> 00:52:09.270
So you have an exponential
in the --

00:52:09.270 --> 00:52:12.290
you have the Euclidean distance
as an exponent, and

00:52:12.290 --> 00:52:14.850
you want to minimize the
Euclidean distance.
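
NOTE
A compact restatement of the chain just summarized, written as a sketch.
The symbols sigma^2 (noise variance per dimension) and N (number of
dimensions) are assumed notation here. The lecture only says that the
variance of the noise belongs in the exponent.
```latex
\begin{align*}
\hat{x}_{\mathrm{MAP}}(y)
  &= \arg\max_{a_j \in A} P(a_j \mid y)
   = \arg\max_{a_j \in A} \frac{p(y \mid a_j)\,P(a_j)}{p(y)} \\
  &= \arg\max_{a_j \in A} p(y \mid a_j)
  && \text{(ML: } P(a_j) = 1/M,\ p(y) \text{ fixed)} \\
  &= \arg\max_{a_j \in A} \frac{1}{(2\pi\sigma^2)^{N/2}}
     \exp\!\left(-\frac{\|y - a_j\|^2}{2\sigma^2}\right)
  && \text{(additive Gaussian noise)} \\
  &= \arg\min_{a_j \in A} \|y - a_j\|^2
  && \text{(MDD)}
\end{align*}
```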

00:52:14.850 --> 00:52:17.470
So this is the story
we have so far.

00:52:17.470 --> 00:52:20.745
And it turns out that the
Minimum Distance Decision rule

00:52:20.745 --> 00:52:23.130
is actually quite nice, because
it gives you a lot of

00:52:23.130 --> 00:52:25.950
geometrical insights.

00:52:25.950 --> 00:52:28.160
So say I have three points.

00:52:28.160 --> 00:52:30.410
I won't even draw the
coordinates.

00:52:30.410 --> 00:52:37.180
My constellation, A, has three
points, and let me write them

00:52:37.180 --> 00:52:38.430
as a1, a2, a3.

00:52:40.620 --> 00:52:43.070
This is my constellation,
for example.

00:52:43.070 --> 00:52:48.040
And say I receive a symbol Y.
Then the job of the decoder is

00:52:48.040 --> 00:52:51.310
to figure out whether
I sent a1, a2 or a3.

00:52:51.310 --> 00:52:52.820
How will the decoder do that?

00:52:52.820 --> 00:52:55.680
Well, it will measure the
distance from all the three

00:52:55.680 --> 00:52:58.230
constellation points and
select the one with the

00:52:58.230 --> 00:53:00.660
smallest Euclidean distance.

00:53:00.660 --> 00:53:03.130
More generally, what we want
to do is we want to look at

00:53:03.130 --> 00:53:06.930
the space of all received Y
symbols, and partition it into

00:53:06.930 --> 00:53:08.240
decision regions.

00:53:08.240 --> 00:53:11.190
So that if the point falls in a
certain decision region, we

00:53:11.190 --> 00:53:13.490
say that the constellation point
corresponding to that

00:53:13.490 --> 00:53:15.760
decision region was sent.

00:53:15.760 --> 00:53:18.115
And how do I find the
decision region?

00:53:18.115 --> 00:53:20.680
Well, I start drawing
hyperplanes between every

00:53:20.680 --> 00:53:22.660
pair of constellation points.

00:53:22.660 --> 00:53:25.500
Say I want to find the decision
region of point a1.

00:53:25.500 --> 00:53:28.060
I draw a hyperplane between
a1 and a3, which is

00:53:28.060 --> 00:53:29.720
given by this line.

00:53:29.720 --> 00:53:32.580
I draw a hyperplane between
a1 and a2 which is

00:53:32.580 --> 00:53:34.130
given by this line.

00:53:34.130 --> 00:53:37.860
And so the region -- the set of
points which are closer to

00:53:37.860 --> 00:53:43.740
a1 than a2 and a3 is basically
bounded by this region here.

00:53:43.740 --> 00:53:45.450
So this is my R1.

00:53:45.450 --> 00:53:48.230
Similarly, if I want to find a
decision region for a2 and a3,

00:53:48.230 --> 00:53:50.640
I will draw this line here.

00:53:50.640 --> 00:53:53.550
This will be R2 and
this will be R3.

00:53:53.550 --> 00:53:55.110
So these are my decision
regions.

00:53:59.210 --> 00:54:03.160
So if I want to write
that formally, Rj

00:54:03.160 --> 00:54:04.710
is my decision region.

00:54:04.710 --> 00:54:10.110
And it is the set of points Y
belong to Rn, such that the

00:54:10.110 --> 00:54:15.540
norm of Y minus a_j squared is
going to be less than or equal

00:54:15.540 --> 00:54:17.800
to -- it doesn't matter if you
have less than or equal to,

00:54:17.800 --> 00:54:20.550
because the point
[UNINTELLIGIBLE] bound

00:54:20.550 --> 00:54:21.800
[UNINTELLIGIBLE] probability
zero --

00:54:24.022 --> 00:54:26.070
the norm of Y minus a_j prime squared.

00:54:26.070 --> 00:54:28.950
That's just the definition
of Rj.

00:54:28.950 --> 00:54:32.360
Now the way to construct Rj was
to look at all the half

00:54:32.360 --> 00:54:35.480
planes which are closer to
this point than any other

00:54:35.480 --> 00:54:37.020
point, and take the
intersection of

00:54:37.020 --> 00:54:38.780
all the half planes.

00:54:38.780 --> 00:54:41.970
So in other words, I can
also write Rj to be the

00:54:41.970 --> 00:54:44.230
intersection of these
half planes --

00:54:44.230 --> 00:54:47.620
the intersections over all
points, j prime not equal to j

00:54:47.620 --> 00:54:49.240
-- of Rj, j prime.

00:54:49.240 --> 00:54:52.508
So Rj, j prime is your
half plane where --

00:55:06.460 --> 00:55:10.100
so norm of Y minus a_j prime
squared is greater than or

00:55:10.100 --> 00:55:14.650
equal to norm of Y minus
a_j squared.

00:55:14.650 --> 00:55:18.960
Note that this is a_j prime,
and this is a_j here.

00:55:18.960 --> 00:55:21.180
OK.

00:55:21.180 --> 00:55:23.750
So it turns out that this
decision region has a somewhat

00:55:23.750 --> 00:55:26.200
nice structure, because they're
an intersection of a

00:55:26.200 --> 00:55:29.245
bunch of half planes, their
shape is a convex polytope.

00:55:43.290 --> 00:55:46.495
And they're also known by the
name Voronoi regions.

00:55:53.680 --> 00:55:57.820
OK, so these regions are known
as Voronoi regions here.

00:55:57.820 --> 00:56:02.160
Now the set of points whose
hyperplanes are active in a

00:56:02.160 --> 00:56:05.550
certain decision region has a
special name, too, and it's

00:56:05.550 --> 00:56:07.550
called the relevant subset.

00:56:07.550 --> 00:56:10.760
So in this case, the relevant
subset of a1 is a2 and a3,

00:56:10.760 --> 00:56:13.970
because both of them have
hyperplanes that are active in

00:56:13.970 --> 00:56:16.590
the decision region of a1.

00:56:16.590 --> 00:56:18.300
So let me write that down.

00:56:53.160 --> 00:56:56.470
So the relevant subset is the
set of points, a_j prime,

00:56:56.470 --> 00:57:01.410
whose hyperplanes are active
in this decision region Rj.

00:57:01.410 --> 00:57:04.450
There's a theorem which says
that the nearest neighbors are

00:57:04.450 --> 00:57:06.740
always included in the
relevant subset.

00:57:06.740 --> 00:57:07.990
It's asserted in your notes.

00:57:10.560 --> 00:57:11.810
OK?

00:57:21.490 --> 00:57:24.040
So now that we have this Minimum
Distance Decision

00:57:24.040 --> 00:57:27.010
rule, let us see if we can
get a handle on the

00:57:27.010 --> 00:57:28.260
probability of error.

00:58:23.550 --> 00:58:27.340
Let me look at the probability of
error, given that I sent a

00:58:27.340 --> 00:58:29.340
symbol, a_j.

00:58:29.340 --> 00:58:32.090
I want to evaluate that
probability of error.

00:58:32.090 --> 00:58:38.970
That's simply the probability
that Y does not belong to Rj,

00:58:38.970 --> 00:58:42.380
given that I sent
the symbol a_j.

00:58:42.380 --> 00:58:45.100
That's when an error happens.

00:58:45.100 --> 00:58:48.720
That's the same as the probability
that the noise vector --

00:58:48.720 --> 00:58:51.360
because Y is a_j plus N now --

00:58:51.360 --> 00:58:56.130
does not belong to the Rj minus
a_j, and the noise is

00:58:56.130 --> 00:59:00.570
independent of a_j so I remove
the conditioning.

00:59:00.570 --> 00:59:04.490
And that is 1 minus the
probability that the noise

00:59:04.490 --> 00:59:08.180
does belong to Rj minus a_j.

00:59:08.180 --> 00:59:11.300
If I want to find this
integral, find this

00:59:11.300 --> 00:59:18.320
expression, I will integrate
over the region Rj minus a_j,

00:59:18.320 --> 00:59:24.760
of the density of
the noise, dN.
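
NOTE
The chain of equalities just described, written out as a worked equation.
The symbol p_N for the noise density and the set notation for the shifted
region R_j - a_j are notational assumptions, not taken from the board.
```latex
\begin{align*}
\Pr(E \mid a_j) &= \Pr\{Y \notin R_j \mid a_j\} \\
                &= \Pr\{N \notin R_j - a_j\}, \qquad R_j - a_j = \{\, y - a_j : y \in R_j \,\} \\
                &= 1 - \Pr\{N \in R_j - a_j\} \\
                &= 1 - \int_{R_j - a_j} p_N(n)\, dn .
\end{align*}
```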

00:59:24.760 --> 00:59:27.690
Now note that the noise has
a spherical symmetry, but

00:59:27.690 --> 00:59:30.650
unfortunately despite that,
the integral is not a

00:59:30.650 --> 00:59:33.740
straightforward integral,
because this region here has

00:59:33.740 --> 00:59:35.055
sharp edges.

00:59:35.055 --> 00:59:37.750
The decision region is a convex
polytope, so it's

00:59:37.750 --> 00:59:40.540
typically something like this,
and if this was your point,

00:59:40.540 --> 00:59:44.450
a_j, your noise does have a
spherical symmetry about these

00:59:44.450 --> 00:59:48.060
spheres, but when it intersects
the decision

00:59:48.060 --> 00:59:50.010
boundary, things get ugly.

00:59:50.010 --> 00:59:52.800
And so this decision
-- this is not a

00:59:52.800 --> 00:59:55.740
nice integral in general.

00:59:59.420 --> 01:00:02.510
And so unfortunately, there is
not much progress we can make

01:00:02.510 --> 01:00:05.690
beyond this point for the exact
probability of error

01:00:05.690 --> 01:00:09.430
expression, but we can say
some nice geometrical

01:00:09.430 --> 01:00:12.400
properties about the probability
of error.

01:00:12.400 --> 01:00:18.400
The first property is that
probability of error is

01:00:18.400 --> 01:00:23.170
invariant to translations.

01:00:28.890 --> 01:00:30.900
And this should be
quite obvious.

01:00:30.900 --> 01:00:34.750
You have, say, a constellation
with two points here, and say

01:00:34.750 --> 01:00:36.480
I subtract off the mean.

01:00:36.480 --> 01:00:39.520
So I get a different
constellation whose

01:00:39.520 --> 01:00:41.240
points are like this.

01:00:41.240 --> 01:00:43.040
This is my constellation,
A, and this is my

01:00:43.040 --> 01:00:44.840
constellation, A prime.

01:00:44.840 --> 01:00:47.000
The probability of error will
be the same for the two

01:00:47.000 --> 01:00:50.610
constellations because the
decision regions will have the

01:00:50.610 --> 01:00:53.940
same distance from
both the points.

01:00:53.940 --> 01:00:55.760
This should be quite obvious.

01:00:55.760 --> 01:00:59.120
And basically, what this really
says is if I have any

01:00:59.120 --> 01:01:02.150
constellation, I can always
subtract off the mean, and get

01:01:02.150 --> 01:01:04.740
another constellation with the
same probability of error, but

01:01:04.740 --> 01:01:06.940
with smaller average energy.

01:01:06.940 --> 01:01:21.400
And so this implies that any
optimal constellation will

01:01:21.400 --> 01:01:23.455
have zero mean.
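
NOTE
A minimal sketch of the translation argument, assuming a hypothetical
four-point constellation chosen only for illustration. Subtracting the mean
leaves every pairwise distance unchanged (so the decision regions, and hence
the error probability, only shift) while the average energy drops by the
squared norm of the mean. The helper names are not from the lecture.
```python
# Hypothetical 2-D constellation with a nonzero mean (illustrative values).
points = [(1.0, 0.0), (3.0, 0.0), (1.0, 2.0), (3.0, 2.0)]
def mean(pts):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(pts)
    return tuple(sum(p[i] for p in pts) / n for i in range(len(pts[0])))
def avg_energy(pts):
    """Average squared norm, assuming equiprobable points."""
    return sum(sum(c * c for c in p) for p in pts) / len(pts)
def pairwise_sq_dists(pts):
    """Squared distances between all unordered pairs, in a fixed order."""
    return [sum((x - y) ** 2 for x, y in zip(p, q))
            for i, p in enumerate(pts) for q in pts[i + 1:]]
m = mean(points)
centered = [tuple(c - mc for c, mc in zip(p, m)) for p in points]
# Same geometry, smaller energy: E(A) = E(A') + ||m||^2.
print(avg_energy(points), avg_energy(centered), sum(c * c for c in m))
```
For this example the centered constellation is the square with corners at
plus or minus one in each coordinate.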

01:01:29.710 --> 01:01:34.850
The second point is, the
probability of error is

01:01:34.850 --> 01:01:44.570
invariant to orthonormal
rotations.

01:01:49.100 --> 01:01:53.540
So if I had, say, one point, one
constellation, with these

01:01:53.540 --> 01:01:58.530
four points, and I rotate it by
45 degrees, what I get is

01:01:58.530 --> 01:02:01.180
another constellation with
these four points.

01:02:01.180 --> 01:02:04.035
And both these constellations
are simply rotations of one

01:02:04.035 --> 01:02:07.060
another, and they have the same
probability of error.

01:02:07.060 --> 01:02:09.110
And the easiest way to see that
is the decision regions

01:02:09.110 --> 01:02:12.290
here are simply the
four quadrants.

01:02:12.290 --> 01:02:14.770
And if I want to integrate my
probability of error, I will

01:02:14.770 --> 01:02:17.380
be integrating it over
this region.

01:02:17.380 --> 01:02:21.490
Here, my decision regions will
be these 45 degree lines.

01:02:21.490 --> 01:02:24.730
And if I want to integrate the
probability of error for this

01:02:24.730 --> 01:02:30.250
point here, it will be given by
noise, which is symmetric

01:02:30.250 --> 01:02:31.880
about these circles.

01:02:31.880 --> 01:02:34.590
Basically, since the noise is
invariant to orthonormal

01:02:34.590 --> 01:02:37.360
rotations, it should be quite
obvious that the probability

01:02:37.360 --> 01:02:39.264
of error is invariant
to rotations.
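
NOTE
A small numerical check of the rotation property, assuming the four-point
square constellation from the example and a 45 degree rotation. It only
verifies that distances and energies are preserved, which is the geometric
fact the argument rests on; it does not integrate the noise density.
```python
import math
def rotate(p, theta):
    """Rotate a 2-D point counterclockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])
def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])
square = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
rotated = [rotate(p, math.pi / 4) for p in square]
# Every pairwise distance survives the rotation (up to rounding).
same_distances = all(
    math.isclose(dist(square[i], square[k]), dist(rotated[i], rotated[k]))
    for i in range(4) for k in range(i + 1, 4)
)
print(same_distances)
```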

01:02:39.264 --> 01:02:41.120
AUDIENCE: [INAUDIBLE]

01:02:41.120 --> 01:02:42.260
PROFESSOR: Any rotation,
right.

01:02:42.260 --> 01:02:45.230
AUDIENCE: So what do you mean
by [UNINTELLIGIBLE]?

01:02:45.230 --> 01:02:47.882
PROFESSOR: Basically, you're
preserving the distance.

01:02:47.882 --> 01:02:50.440
So you're just rotating the
point, not scaling it.

01:02:50.440 --> 01:02:51.380
AUDIENCE: [INAUDIBLE]

01:02:51.380 --> 01:02:53.680
PROFESSOR: Unitarily.

01:02:53.680 --> 01:02:54.360
OK?

01:02:54.360 --> 01:02:57.260
I mean you will be proving these
properties in the next

01:02:57.260 --> 01:02:59.434
homework, which is
just handed out.

01:02:59.434 --> 01:03:01.210
AUDIENCE: So those
[UNINTELLIGIBLE]

01:03:01.210 --> 01:03:06.228
hold, because Minimum Distance
rule is orthonormal, right?

01:03:06.228 --> 01:03:09.005
And that's because you
assume Gaussian --

01:03:09.005 --> 01:03:10.982
PROFESSOR: Because you assume
Gaussian noise.

01:03:10.982 --> 01:03:12.430
AUDIENCE: So that's the only
assumption we make?

01:03:12.430 --> 01:03:13.260
PROFESSOR: Right.

01:03:13.260 --> 01:03:15.525
AUDIENCE: -- for those
[INAUDIBLE], right?

01:03:15.525 --> 01:03:16.775
PROFESSOR: I think so.

01:03:37.470 --> 01:03:39.430
AUDIENCE: Why [INAUDIBLE]
constellation

01:03:39.430 --> 01:03:41.910
must have zero mean?

01:03:41.910 --> 01:03:45.082
PROFESSOR: Because if I have any
constellation, there's a

01:03:45.082 --> 01:03:46.810
certain probability
of error, right?

01:03:46.810 --> 01:03:50.240
I can always subtract out the mean
from the constellation, and I

01:03:50.240 --> 01:03:53.470
get a new constellation with a
smaller average energy with

01:03:53.470 --> 01:03:55.808
the same probability of error.

01:03:55.808 --> 01:03:58.253
AUDIENCE: Oh, so in terms
of [INAUDIBLE]

01:03:58.253 --> 01:03:59.231
PROFESSOR: Right.

01:03:59.231 --> 01:04:01.431
If you're looking at a trade-off
of probability of

01:04:01.431 --> 01:04:04.121
error versus energy, usually
what we look at.

01:04:07.070 --> 01:04:08.360
So maybe that's a good point.

01:04:08.360 --> 01:04:09.774
I should just mention it.

01:04:13.646 --> 01:04:16.066
For probability of
error versus --

01:04:18.970 --> 01:04:20.570
we're looking at this
trade-off here.

01:04:57.970 --> 01:04:58.110
OK.

01:04:58.110 --> 01:05:01.820
The next idea is to basically
bound the probability of error

01:05:01.820 --> 01:05:05.330
by a union bound, because we
cannot compute an exact

01:05:05.330 --> 01:05:07.760
expression for the probability
of error, so we might as well

01:05:07.760 --> 01:05:10.540
compute a bound which
is tractable.

01:05:10.540 --> 01:05:16.410
So we'll look at what is known
as the pairwise error

01:05:16.410 --> 01:05:17.660
probability.

01:05:25.450 --> 01:05:30.370
So the idea behind pairwise
error probability is suppose I

01:05:30.370 --> 01:05:35.380
send a point, a_j, what is the
probability that instead of

01:05:35.380 --> 01:05:40.910
a_j at the receiver, I decide
that a_j prime was sent.

01:05:40.910 --> 01:05:44.230
This is the pairwise
error probability.

01:05:44.230 --> 01:05:50.020
So geometrically, say a_j and
a_j prime are two points here.

01:05:50.020 --> 01:05:52.620
Let me draw some coordinate
axis here.

01:05:52.620 --> 01:05:56.910
And say I sent point a_j, and
there is noise on the channel,

01:05:56.910 --> 01:05:58.100
that takes me to this point.

01:05:58.100 --> 01:06:02.670
So this is my Y, and this
is the noise vector.

01:06:02.670 --> 01:06:04.310
OK?

01:06:04.310 --> 01:06:09.050
And now what I want to know is
under what conditions will I

01:06:09.050 --> 01:06:11.450
decide a_j prime over a_j.

01:06:11.450 --> 01:06:15.440
What is the probability of
deciding a_j prime over a_j?

01:06:15.440 --> 01:06:18.840
So let's draw a line joining
a_j prime and a_j.

01:06:18.840 --> 01:06:21.370
So how would I decide --

01:06:21.370 --> 01:06:24.360
suppose I receive this point,
Y, and I wanted to decide

01:06:24.360 --> 01:06:26.550
between a_j and a_j prime.

01:06:26.550 --> 01:06:29.718
What would be my
decision rule?

01:06:29.718 --> 01:06:31.610
AUDIENCE: [INAUDIBLE]

01:06:31.610 --> 01:06:31.865
PROFESSOR: Uh-huh.

01:06:31.865 --> 01:06:33.115
AUDIENCE: [INAUDIBLE]

01:06:37.980 --> 01:06:38.235
PROFESSOR: You select --

01:06:38.235 --> 01:06:39.440
AUDIENCE: [INAUDIBLE]
a_j prime.

01:06:39.440 --> 01:06:40.370
PROFESSOR: Exactly.

01:06:40.370 --> 01:06:43.710
An equivalent way of saying it
is to project Y onto this

01:06:43.710 --> 01:06:46.790
line, a_j prime minus a_j.

01:06:46.790 --> 01:06:49.390
We take two projections, one
orthogonal to the line, one

01:06:49.390 --> 01:06:51.830
along the line, and see whether
this projection --

01:06:56.100 --> 01:06:58.290
This is a straight
line like this.

01:06:58.290 --> 01:06:59.890
Let's call it n tilde.

01:06:59.890 --> 01:07:03.098
I should change my chalk, it's
getting too blunt now.

01:07:06.220 --> 01:07:10.270
This projection here is closer
to a_j prime or a_j.

01:07:10.270 --> 01:07:14.670
So in other words, this
probability of error is same

01:07:14.670 --> 01:07:18.980
as the probability that this n
tilde, which is the projection

01:07:18.980 --> 01:07:26.380
of Y onto a_j prime minus a_j,
is greater than or equal to

01:07:26.380 --> 01:07:31.010
the norm of a_j prime
minus a_j over 2.

01:07:33.900 --> 01:07:37.130
OK, now why did I use the
notation n tilde here?

01:07:37.130 --> 01:07:40.650
Because the projection of Y onto
a_j prime minus a_j, which is this

01:07:40.650 --> 01:07:43.920
n tilde, is the same as the
projection of the noise onto the

01:07:43.920 --> 01:07:47.300
line joining a_j prime
and a_j.

01:07:47.300 --> 01:07:56.140
So n tilde, I can write it as
projection of N onto a_j prime

01:07:56.140 --> 01:07:59.510
minus a_j over the norm.

01:08:03.920 --> 01:08:05.390
AUDIENCE: [INAUDIBLE]

01:08:05.390 --> 01:08:05.880
PROFESSOR: Sorry?

01:08:05.880 --> 01:08:07.092
AUDIENCE: Why, exactly?

01:08:07.092 --> 01:08:09.170
PROFESSOR: You can just see
geometrically, right?

01:08:09.170 --> 01:08:12.110
This is a 90 degree here.

01:08:12.110 --> 01:08:13.590
This is the noise.

01:08:13.590 --> 01:08:16.660
If I project the noise, it will
be this component here.

01:08:23.660 --> 01:08:26.160
AUDIENCE: [INAUDIBLE]

01:08:26.160 --> 01:08:26.430
PROFESSOR: All right.

01:08:26.430 --> 01:08:27.750
This should be 90, I'm sorry.

01:08:27.750 --> 01:08:28.670
This is 90.

01:08:28.670 --> 01:08:29.870
I'm messing things up.

01:08:29.870 --> 01:08:31.410
OK, this is the noise here.

01:08:31.410 --> 01:08:34.830
This noise, if I project it onto
a_j prime minus a_j, it's

01:08:34.830 --> 01:08:36.460
going to be this component.

01:08:36.460 --> 01:08:40.800
This is Y, if I project it,
it's the same component.

01:08:40.800 --> 01:08:45.740
Now, if the noise is IID with
variance sigma squared in each

01:08:45.740 --> 01:08:49.200
coordinate, we are simply
projecting the noise onto one

01:08:49.200 --> 01:08:51.670
orthonormal vector.

01:08:51.670 --> 01:08:57.979
So n tilde is also Gaussian,
with zero mean

01:08:57.979 --> 01:09:00.220
and variance sigma squared.

01:09:00.220 --> 01:09:05.330
So we can use that to find this
probability of error.

01:09:11.590 --> 01:09:14.979
So in that case, the probability
of error --

01:09:18.241 --> 01:09:21.040
I should write this --

01:09:21.040 --> 01:09:26.890
probability of a_j going
to a_j prime is simply the probability

01:09:26.890 --> 01:09:30.460
that this Gaussian is greater
than some distance, and that's

01:09:30.460 --> 01:09:39.520
Q of norm of a_j prime minus
a_j over two sigma.
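
NOTE
A minimal sketch of the pairwise error probability just stated, assuming
Gaussian noise with standard deviation sigma in each coordinate. The
function names q_function and pairwise_error_prob are hypothetical; Q is
computed from the complementary error function in the standard library.
```python
import math
def q_function(x):
    """Gaussian tail probability Q(x) = Pr{N(0, 1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))
def pairwise_error_prob(a_j, a_j_prime, sigma):
    """Pr{a_j goes to a_j_prime} = Q(||a_j_prime - a_j|| / (2 sigma))."""
    d = math.dist(a_j, a_j_prime)  # Euclidean distance, Python 3.8+
    return q_function(d / (2.0 * sigma))
# Two points at distance 2 with sigma = 1 give Q(1).
print(pairwise_error_prob((0.0, 0.0), (2.0, 0.0), 1.0))
```
Only the distance between the two points enters, which is consistent with
the translation and rotation invariance discussed earlier.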

01:09:39.520 --> 01:09:39.990
Yes?

01:09:39.990 --> 01:09:42.560
AUDIENCE: What is sigma?

01:09:42.560 --> 01:09:47.100
PROFESSOR: So the noise vector
is IID, in each of the

01:09:47.100 --> 01:09:50.510
components, and has a variance
of sigma squared.

01:09:50.510 --> 01:09:55.100
Sigma squared is basically N_0 over 2,
if your noise is flat with --

01:09:55.100 --> 01:09:58.120
so let me just write
that down, sigma

01:09:58.120 --> 01:09:59.190
squared is N_0 over 2.

01:09:59.190 --> 01:10:01.700
If you have an AWGN channel,
and you project it on each

01:10:01.700 --> 01:10:04.920
orthonormal signal, that's
what you get.

01:10:04.920 --> 01:10:07.435
AUDIENCE: What if you project
the noise vector on

01:10:07.435 --> 01:10:09.910
[INAUDIBLE]

01:10:09.910 --> 01:10:13.380
why is [INAUDIBLE]

01:10:13.380 --> 01:10:14.260
you don't have --

01:10:14.260 --> 01:10:15.150
PROFESSOR: So you have
a noise vector.

01:10:15.150 --> 01:10:18.230
If you have a Gaussian vector,
and you project it onto an

01:10:18.230 --> 01:10:19.250
orthonormal basis --

01:10:19.250 --> 01:10:19.600
AUDIENCE: Yes.

01:10:19.600 --> 01:10:22.010
But [INAUDIBLE] normal?

01:10:22.010 --> 01:10:22.180
PROFESSOR: Right.

01:10:22.180 --> 01:10:24.650
[INAUDIBLE]

01:10:24.650 --> 01:10:27.465
We are only projecting onto
one vector, which is unit norm.

01:10:30.754 --> 01:10:32.710
AUDIENCE: So then
you're saying --

01:10:32.710 --> 01:10:33.688
OK, yeah.

01:10:33.688 --> 01:10:37.111
The assumption is the noise is
symmetric in all dimensions?

01:10:37.111 --> 01:10:38.361
PROFESSOR: Right.

01:10:42.500 --> 01:10:45.420
Let's do this algebraically, so
you're convinced that there

01:10:45.420 --> 01:10:50.910
is no magic I'm doing here.

01:10:50.910 --> 01:10:54.860
So we can write this as you
said, as the probability that

01:10:54.860 --> 01:11:00.370
the norm of Y minus a_j squared
is greater than norm

01:11:00.370 --> 01:11:06.260
of Y minus a_j prime squared,
given that Y [UNINTELLIGIBLE]

01:11:06.260 --> 01:11:10.280
a_j, so Y is a_j plus
the noise vector.

01:11:10.280 --> 01:11:17.180
So I sub in for Y. What I get
is probability that Y is a_j

01:11:17.180 --> 01:11:22.050
plus N. So here I have norm of
N squared is greater than or

01:11:22.050 --> 01:11:27.790
equal to norm of a_j plus N
minus a_j prime squared.

01:11:30.850 --> 01:11:34.110
And since the only random
variable here is this noise,

01:11:34.110 --> 01:11:37.860
N, I can remove the conditioning
[UNINTELLIGIBLE]

01:11:37.860 --> 01:11:39.250
down there.

01:11:39.250 --> 01:11:42.940
Let me expand this second
norm term there.

01:11:42.940 --> 01:11:49.080
That's basically the norm of
a_j minus a_j prime squared

01:11:49.080 --> 01:11:54.340
plus the norm of N squared
minus two times the

01:11:54.340 --> 01:11:57.430
projection of N --

01:11:57.430 --> 01:11:59.660
or rather, the inner
product of N --

01:11:59.660 --> 01:12:01.410
and a_j prime minus a_j.

01:12:04.440 --> 01:12:08.560
So this is the probability that
the inner product of N

01:12:08.560 --> 01:12:16.980
and a_j prime minus a_j is
greater than or equal to norm

01:12:16.980 --> 01:12:22.860
of a_j prime minus a_j
squared over 2.

01:12:22.860 --> 01:12:26.730
If you divide by the norm of a_j
prime minus a_j, you get

01:12:26.730 --> 01:12:30.886
the same expression as we had.

01:12:38.770 --> 01:12:39.690
So it's the same thing.

01:12:39.690 --> 01:12:42.910
This was done geometrically,
this is done algebraically.

01:12:42.910 --> 01:12:45.690
So this is the expression of the
probability of error, and

01:12:45.690 --> 01:12:51.006
this is Q of the norm of a_j
prime minus a_j over 2 sigma.
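
NOTE
The algebraic steps above collected as one worked derivation. The
angle-bracket inner product notation and the explicit definition of n tilde
as the projection of N onto the unit vector along a_j prime minus a_j are
notational assumptions; the steps themselves follow the spoken argument.
```latex
\begin{align*}
\Pr\{a_j \to a_{j'}\}
  &= \Pr\{\, \|Y - a_j\|^2 \ge \|Y - a_{j'}\|^2 \mid Y = a_j + N \,\} \\
  &= \Pr\{\, \|N\|^2 \ge \|a_j + N - a_{j'}\|^2 \,\} \\
  &= \Pr\{\, \|N\|^2 \ge \|a_j - a_{j'}\|^2 + \|N\|^2 - 2\langle N,\, a_{j'} - a_j \rangle \,\} \\
  &= \Pr\{\, \langle N,\, a_{j'} - a_j \rangle \ge \tfrac{1}{2}\|a_{j'} - a_j\|^2 \,\} \\
  &= \Pr\{\, \tilde{n} \ge \tfrac{1}{2}\|a_{j'} - a_j\| \,\},
     \qquad \tilde{n} = \Big\langle N,\, \frac{a_{j'} - a_j}{\|a_{j'} - a_j\|} \Big\rangle \sim \mathcal{N}(0, \sigma^2) \\
  &= Q\!\left( \frac{\|a_{j'} - a_j\|}{2\sigma} \right).
\end{align*}
```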

01:13:53.860 --> 01:13:57.160
So now that we have the pairwise
error probability, we

01:13:57.160 --> 01:13:59.200
can use it to bound
the probability

01:13:59.200 --> 01:14:01.520
of error given a_j.

01:14:01.520 --> 01:14:05.090
Well, by definition, the probability
of error given a_j is

01:14:05.090 --> 01:14:09.240
simply the probability of the
union of all the possible

01:14:09.240 --> 01:14:15.910
error events that a_j goes to
a_j prime over all possible

01:14:15.910 --> 01:14:19.120
j prime, not equal to j.

01:14:19.120 --> 01:14:22.630
This by the union bound is
less than or equal to the

01:14:22.630 --> 01:14:28.570
summation of the probabilities
that a_j goes to a_j prime.

01:14:28.570 --> 01:14:30.230
That's just using union bound.

01:14:30.230 --> 01:14:33.150
And now I can sub that
expression over from there.

01:14:40.010 --> 01:14:41.800
This is the same as -- so the
summation is over j prime not

01:14:41.800 --> 01:14:48.710
equal to j of Q of the
norm of a_j prime

01:14:48.710 --> 01:14:54.380
minus a_j over 2 sigma.

01:14:54.380 --> 01:14:57.170
Now let me write the summation
in a different way.

01:14:57.170 --> 01:14:59.530
I'm going to write the summation
over all possible

01:14:59.530 --> 01:15:05.820
distances D which belong to the
set of distances, of K_D of

01:15:05.820 --> 01:15:09.550
a_j, times Q of D
over 2 sigma.

01:15:12.620 --> 01:15:23.590
Where the set D is the
set of all possible

01:15:23.590 --> 01:15:30.160
distances from a_j.

01:15:30.160 --> 01:15:31.410
OK?

01:15:33.170 --> 01:15:45.270
And K_D of a_j is the
number of points at

01:15:45.270 --> 01:15:49.040
distance D from a_j.
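
NOTE
A sketch showing that the two ways of writing the union bound agree, assuming
a hypothetical 3-by-3 grid constellation and sigma = 1. The first function
sums Q over every other point; the second groups points by their distance D
from a_j and weights each Q term by the count K_D(a_j). Distances are
rounded before grouping, which is an implementation assumption to merge
values that differ only by floating-point noise.
```python
import math
from collections import Counter
def q_function(x):
    """Gaussian tail probability Q(x) = Pr{N(0, 1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))
def union_bound_direct(points, j, sigma):
    """Sum over j' != j of Q(||a_j' - a_j|| / (2 sigma))."""
    return sum(q_function(math.dist(points[j], p) / (2.0 * sigma))
               for k, p in enumerate(points) if k != j)
def distance_profile(points, j):
    """Map each distance D from a_j to K_D(a_j), the number of points at D."""
    return Counter(round(math.dist(points[j], p), 9)
                   for k, p in enumerate(points) if k != j)
def union_bound_by_distance(points, j, sigma):
    """Sum over distances D of K_D(a_j) * Q(D / (2 sigma))."""
    return sum(k_d * q_function(d / (2.0 * sigma))
               for d, k_d in distance_profile(points, j).items())
grid = [(float(x), float(y)) for x in (-2, 0, 2) for y in (-2, 0, 2)]
center = 4  # index of the point (0.0, 0.0)
print(union_bound_direct(grid, center, 1.0), union_bound_by_distance(grid, center, 1.0))
```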

01:15:51.780 --> 01:15:53.625
That's just a straightforward
change of variables.

01:15:56.210 --> 01:15:59.930
Now if you look at this
expression, then Q of D over 2

01:15:59.930 --> 01:16:03.070
sigma basically behaves like an
exponential, for large

01:16:03.070 --> 01:16:04.950
values of the argument.

01:16:04.950 --> 01:16:12.470
So recall that Q of X is like
half e to the minus X squared

01:16:12.470 --> 01:16:16.770
over 2, for X much
larger than 1.
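
NOTE
A quick numerical comparison of Q(x) with the half-exponential, as a sketch
rather than a proof. The two share the same exponent, which is the property
used in the argument; the half-exponential is also a standard upper bound on
Q(x) for nonnegative x, so the two values are close in exponent but not
equal. Function names are hypothetical.
```python
import math
def q_function(x):
    """Gaussian tail probability Q(x) = Pr{N(0, 1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))
def half_exp(x):
    """The approximation (1/2) * exp(-x^2 / 2)."""
    return 0.5 * math.exp(-x * x / 2.0)
for x in (1.0, 2.0, 4.0, 6.0):
    # Compare the exponents: -ln Q(x) versus -ln of the approximation.
    print(x, -math.log(q_function(x)), -math.log(half_exp(x)))
```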

01:16:16.770 --> 01:16:21.260
So what you are really seeing
here is that you have a sum of

01:16:21.260 --> 01:16:24.830
a bunch of exponentials,
each weighted by this

01:16:24.830 --> 01:16:26.560
term, K_D of a_j.

01:16:26.560 --> 01:16:30.370
Now if you think about the
argument being large, then

01:16:30.370 --> 01:16:33.740
when you have a sum of
exponentials, the term with

01:16:33.740 --> 01:16:37.070
the smallest exponent will
dominate, because they are all

01:16:37.070 --> 01:16:39.440
decreasing exponentials.

01:16:39.440 --> 01:16:48.260
So this term can be written as
approximately K_min of a_j

01:16:48.260 --> 01:16:53.310
times Q of d_min over 2 sigma.

01:16:56.470 --> 01:16:58.770
So what I am doing is I'm
only picking up one

01:16:58.770 --> 01:17:02.630
term from this summation.

01:17:02.630 --> 01:17:05.825
So far, we have a strict upper
bound here, so this summation

01:17:05.825 --> 01:17:08.520
is a strict upper bound on the
probability of error given

01:17:08.520 --> 01:17:11.910
a_j. But now what I am doing is
I'm only going to keep one

01:17:11.910 --> 01:17:15.770
term in the summation, the term
which has the smallest

01:17:15.770 --> 01:17:17.260
exponent here.

01:17:17.260 --> 01:17:24.860
So I'm looking at the smallest
value of D in this set of

01:17:24.860 --> 01:17:27.780
possible distances from a_j.

01:17:27.780 --> 01:17:28.073
AUDIENCE: So you're
just looking

01:17:28.073 --> 01:17:29.100
at the nearest neighbor.

01:17:29.100 --> 01:17:30.490
PROFESSOR: You're looking at
essentially the nearest

01:17:30.490 --> 01:17:32.510
neighbor, geometrically
speaking.

01:17:32.510 --> 01:17:36.140
And this approximation
actually works

01:17:36.140 --> 01:17:38.150
quite well in practice.

01:17:38.150 --> 01:17:40.530
It's not a bound on the
probability of error given

01:17:40.530 --> 01:17:44.120
a_j, but it's an
approximation.

01:17:44.120 --> 01:17:45.520
And why did I do this?

01:17:45.520 --> 01:17:49.812
Well, if I want to look at the
overall probability of error,

01:17:49.812 --> 01:17:52.340
what's that going to be?

01:17:52.340 --> 01:17:58.060
It's going to be the average
over all possible a_j's of

01:17:58.060 --> 01:18:00.910
probability of error
given a_j.

01:18:00.910 --> 01:18:04.880
Now, so I want to take an
average of this quantity.

01:18:04.880 --> 01:18:06.270
So this is a constant.

01:18:06.270 --> 01:18:09.130
So I will just take the average
over this, and that's

01:18:09.130 --> 01:18:15.720
going to be K_min of the
constellation, which is the

01:18:15.720 --> 01:18:19.890
average number of nearest
neighbors, times Q of D_min

01:18:19.890 --> 01:18:21.140
over 2 sigma.

01:18:24.680 --> 01:18:27.240
This is approximate here.

01:18:27.240 --> 01:18:31.270
So this is an approximation that
will be used, and it's a

01:18:31.270 --> 01:18:33.930
very useful approximation,
and it is known as

01:18:33.930 --> 01:18:35.180
the Union Bound Estimate.
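
NOTE
A minimal sketch of the Union Bound Estimate, assuming equiprobable points
and noise standard deviation sigma per coordinate. It finds d_min over the
whole constellation, counts for each point the neighbors at that distance,
averages the counts to get K_min, and returns K_min * Q(d_min / (2 sigma)).
The function name, the tolerance parameter and the example constellation are
hypothetical.
```python
import math
def q_function(x):
    """Gaussian tail probability Q(x) = Pr{N(0, 1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))
def union_bound_estimate(points, sigma, tol=1e-9):
    """Return (estimate, d_min, k_min) for a list of coordinate tuples."""
    n = len(points)
    dists = [[math.dist(points[i], points[k]) for k in range(n) if k != i]
             for i in range(n)]
    d_min = min(min(row) for row in dists)
    # Average over points of the number of neighbors at distance d_min.
    k_min = sum(sum(1 for d in row if abs(d - d_min) <= tol) for row in dists) / n
    return k_min * q_function(d_min / (2.0 * sigma)), d_min, k_min
square = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
estimate, d_min, k_min = union_bound_estimate(square, 0.5)
print(estimate, d_min, k_min)
```
For the four-point square each point has two nearest neighbors at distance 2,
so the estimate reduces to 2 * Q(1 / sigma).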

01:18:49.140 --> 01:18:51.790
It's no longer a bound
on the probability of

01:18:51.790 --> 01:18:53.910
error, it's an estimate.

01:18:53.910 --> 01:18:56.405
And in fact, there is a homework
problem where you

01:18:56.405 --> 01:18:59.140
will be showing that the Union
Bound Estimate is in fact

01:18:59.140 --> 01:19:00.800
exact for an M-PAM
constellation.

01:19:03.500 --> 01:19:05.910
And I will let you think
why that is the case.

01:19:05.910 --> 01:19:08.170
I was going to do it, but then
I realized it's a homework

01:19:08.170 --> 01:19:12.610
problem, so you might as well
spend some time on it.

01:19:12.610 --> 01:19:15.030
So the last thing that I wanted
to do today is find a

01:19:15.030 --> 01:19:16.630
lower bound on the probability
of error.

01:19:29.460 --> 01:19:32.260
So if I look at probability of
error, it's a union of a bunch

01:19:32.260 --> 01:19:33.210
of the events.

01:19:33.210 --> 01:19:34.190
AUDIENCE: [INAUDIBLE]

01:19:34.190 --> 01:19:34.860
PROFESSOR: Yes.

01:19:34.860 --> 01:19:36.530
AUDIENCE: [INAUDIBLE]

01:19:36.530 --> 01:19:41.400
That union should be
with a_j prime.

01:19:41.400 --> 01:19:44.180
PROFESSOR: The union should
be with a_j --

01:19:44.180 --> 01:19:45.510
yeah.

01:19:45.510 --> 01:19:48.430
It's not what I have?

01:19:48.430 --> 01:19:52.550
So I'm taking a union over all
possible events where a_j is

01:19:52.550 --> 01:19:54.128
confused with a_j prime.

01:20:03.590 --> 01:20:04.290
AUDIENCE: [INAUDIBLE]

01:20:04.290 --> 01:20:08.638
a_j going to union j prime not
equal to j, a_j prime.

01:20:13.430 --> 01:20:14.350
PROFESSOR: Oh, I see.

01:20:14.350 --> 01:20:16.750
AUDIENCE: I think you need
parentheses around the --

01:20:16.750 --> 01:20:18.392
AUDIENCE: Brackets
around the --

01:20:18.392 --> 01:20:20.210
AUDIENCE: [INAUDIBLE]

01:20:20.210 --> 01:20:23.232
another set of parentheses
behind the event a_j

01:20:23.232 --> 01:20:24.500
going to a_j prime.

01:20:24.500 --> 01:20:27.184
Because that's the event you
were talking about there.

01:20:27.184 --> 01:20:28.642
At least that's [INAUDIBLE]

01:20:34.000 --> 01:20:35.720
PROFESSOR: So you are
saying that --

01:20:35.720 --> 01:20:37.754
AUDIENCE: Put parentheses
after the u.

01:20:37.754 --> 01:20:38.620
PROFESSOR: After the u.

01:20:38.620 --> 01:20:39.780
Like this?

01:20:39.780 --> 01:20:41.235
AUDIENCE: Yeah, right.

01:20:41.235 --> 01:20:43.175
That's the event.

01:20:43.175 --> 01:20:43.660
PROFESSOR: Right.

01:20:43.660 --> 01:20:45.600
That's what I meant.

01:20:45.600 --> 01:20:46.360
OK, fine.

01:20:46.360 --> 01:20:47.610
Fair enough.

01:20:53.120 --> 01:20:55.130
OK, so basically, the
lower bound is

01:20:55.130 --> 01:20:56.280
actually quite simple.

01:20:56.280 --> 01:20:58.470
All I'm going to do
is only take one

01:20:58.470 --> 01:21:00.180
event from that union.

01:21:00.180 --> 01:21:02.360
I'm only going to take one
point, which is at the minimum

01:21:02.360 --> 01:21:04.810
distance from a_j.

01:21:04.810 --> 01:21:11.580
So probability of error given
a_j is greater than or equal

01:21:11.580 --> 01:21:16.890
to probability that a_j goes
to a_j prime, where now a_j

01:21:16.890 --> 01:21:19.390
prime is the nearest
neighbor of a_j.

01:21:19.390 --> 01:21:25.230
And this we know from PAM
analysis is simply Q of d_min

01:21:25.230 --> 01:21:27.052
over 2 sigma.

01:21:27.052 --> 01:21:29.350
So this is a strict lower bound
on the probability of

01:21:29.350 --> 01:21:31.960
error, and it has the
same exponent as

01:21:31.960 --> 01:21:33.210
the Union Bound Estimate.
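
NOTE
A sketch comparing the lower bound, the Union Bound Estimate and the full
union bound for the four-point square constellation with side d, assuming
independent Gaussian noise with standard deviation sigma per coordinate.
The exact value used for comparison, 1 - (1 - q)^2, relies on the decision
regions being the four quadrants; it is a standard result for this
constellation and is not derived in this lecture. Names are hypothetical.
```python
import math
def q_function(x):
    """Gaussian tail probability Q(x) = Pr{N(0, 1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))
def four_point_bounds(d, sigma):
    """Return (lower, exact, estimate, union) for the square with side d."""
    q = q_function(d / (2.0 * sigma))
    lower = q                         # keep only the nearest-neighbor event
    exact = 1.0 - (1.0 - q) ** 2      # both coordinates must stay in the quadrant
    estimate = 2.0 * q                # K_min = 2 nearest neighbors at d_min = d
    union = 2.0 * q + q_function(math.sqrt(2.0) * d / (2.0 * sigma))  # adds the diagonal point
    return lower, exact, estimate, union
for sigma in (0.3, 0.5, 1.0):
    print(sigma, four_point_bounds(2.0, sigma))
```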

01:21:47.510 --> 01:21:49.865
Of course, if I want to find
the overall probability of

01:21:49.865 --> 01:21:52.240
error, I can just take
an average of this.

01:21:52.240 --> 01:21:55.165
Since this is fixed, it's going
to be the same quantity.

01:21:57.670 --> 01:22:02.100
So far what we have is a strict
upper bound on the

01:22:02.100 --> 01:22:06.260
probability of error, which is
this quantity here, a union

01:22:06.260 --> 01:22:09.600
bound estimate, and we have
a lower bound on the

01:22:09.600 --> 01:22:11.110
probability of error.

01:22:11.110 --> 01:22:14.100
In the next lecture, we will be
looking at how to use these

01:22:14.100 --> 01:22:17.190
bounds to compute a probability
of error for small

01:22:17.190 --> 01:22:20.530
signal constellations, and
quantify the performance

01:22:20.530 --> 01:22:22.280
trade-off of the probability
of error versus the

01:22:22.280 --> 01:22:24.820
E_b/N_0 and so on.

01:22:24.820 --> 01:22:26.450
I think this is a natural
point to stop.

01:22:26.450 --> 01:22:27.700
It's almost time now.