WEBVTT

00:00:00.500 --> 00:00:02.540
--solutions for problem set 8.

00:00:02.540 --> 00:00:06.240
I found some problems
as I went through them.

00:00:06.240 --> 00:00:10.620
These are all new because we've
never been in this territory

00:00:10.620 --> 00:00:14.470
before this early where we
could actually offer problems.

00:00:14.470 --> 00:00:18.120
So it's a little
bit of a shakedown.

00:00:18.120 --> 00:00:21.970
8.2, simply, I don't think
you have enough background

00:00:21.970 --> 00:00:23.550
to actually do it.

00:00:23.550 --> 00:00:27.290
If you want to try
it, be my guest.

00:00:27.290 --> 00:00:31.940
I'll give you a sketch
of a solution, anyway.

00:00:31.940 --> 00:00:33.910
But really, for the
same reason that I

00:00:33.910 --> 00:00:38.090
didn't do exercise
3 in Chapter 11,

00:00:38.090 --> 00:00:42.590
we really shouldn't have
done exercise 2, either.

00:00:42.590 --> 00:00:45.440
For 8.3, which is
a decoding exercise

00:00:45.440 --> 00:00:47.660
using the BCJR algorithm.

00:00:47.660 --> 00:00:50.320
I don't know if we'll get
to the BCJR algorithm today.

00:00:50.320 --> 00:00:52.300
If we do, then please do it.

00:00:52.300 --> 00:00:55.060
If we don't, then
we can defer it.

00:00:55.060 --> 00:00:58.810
And you also need to know
that the noise variance is 1.

00:00:58.810 --> 00:01:01.510
Or, pick something,
but I suggest

00:01:01.510 --> 00:01:03.890
you just pick sigma
squared equals 1,

00:01:03.890 --> 00:01:06.980
even though that's
a large value.

00:01:06.980 --> 00:01:09.000
OK, any questions?

00:01:09.000 --> 00:01:12.480
Has anyone tried to
do the homework yet?

00:01:12.480 --> 00:01:13.097
Not yet.

00:01:13.097 --> 00:01:13.680
Good strategy.

00:01:16.530 --> 00:01:19.276
OK, Chapter 11.

00:01:22.660 --> 00:01:27.920
We have one topic to go in
Chapter 11, which is just

00:01:27.920 --> 00:01:31.680
the last little bit on Hadamard
transform realizations, which

00:01:31.680 --> 00:01:33.500
I do want to cover.

00:01:33.500 --> 00:01:39.006
We've really been talking
about styles of realizations

00:01:39.006 --> 00:01:43.540
on codes on graphs,
graphical realizations.

00:01:43.540 --> 00:01:45.810
We had the generator.

00:01:45.810 --> 00:01:50.665
And we had the parity check
and we had the trellis-style.

00:01:50.665 --> 00:01:54.050
And we even had a
tail-biting trellis.

00:01:54.050 --> 00:02:04.620
And we also developed
the cut-set bound

00:02:04.620 --> 00:02:11.830
which doesn't exactly
prove, but strongly suggests

00:02:11.830 --> 00:02:14.310
that we're going
to need cycles in

00:02:14.310 --> 00:02:15.620
our graphical representations.

00:02:20.230 --> 00:02:26.000
And one reason I want to
talk about the Hadamard

00:02:26.000 --> 00:02:29.490
transform realizations,
the Reed-Muller codes,

00:02:29.490 --> 00:02:34.440
is that they really demonstrate
this point I think very nicely.

00:02:34.440 --> 00:02:38.010
If we allow cycles, we get
very simple representations.

00:02:38.010 --> 00:02:42.840
If we demand
cycle-free, then we get

00:02:42.840 --> 00:02:46.300
still pretty nice realizations,
but much more complicated.

00:02:46.300 --> 00:02:50.050
They meet the cut-set
bound everywhere.

00:02:50.050 --> 00:02:52.300
Reed-Muller codes in general
are pretty good anyway

00:02:52.300 --> 00:02:56.030
for trellis realizations,
but they're certainly

00:02:56.030 --> 00:02:59.700
much more complicated than
the realizations with cycles.

00:02:59.700 --> 00:03:04.140
So that's what we're
going to do today

00:03:04.140 --> 00:03:15.545
is-- at least the first part is
Hadamard transform realizations

00:03:15.545 --> 00:03:18.510
of Reed-Muller codes.

00:03:23.850 --> 00:03:27.510
OK, what's the Hadamard
transform over the binary field

00:03:27.510 --> 00:03:28.050
F2?

00:03:28.050 --> 00:03:30.170
It's related to but
a little different

00:03:30.170 --> 00:03:32.810
than the Hadamard transform
over the real field.

00:03:32.810 --> 00:03:36.820
So people seeing the latter,
you'll see similarities,

00:03:36.820 --> 00:03:41.450
but you may see
differences as well.

00:03:41.450 --> 00:03:43.675
So the Hadamard transform.

00:03:47.900 --> 00:03:52.890
I think you remember when we
defined Reed-Muller codes, one

00:03:52.890 --> 00:03:59.110
of the ways we defined them was
in terms of a universal matrix

00:03:59.110 --> 00:04:06.600
Um, which I defined as
the m-fold tensor product,

00:04:06.600 --> 00:04:14.380
very briefly written like
that, of a 2 by 2 matrix, u1.

00:04:14.380 --> 00:04:16.079
What does u1 look like?

00:04:16.079 --> 00:04:22.570
u1 looks like-- over the binary
field, it just looks like this.

00:04:22.570 --> 00:04:25.730
It's the set of generators for
the Reed-Muller codes of length

00:04:25.730 --> 00:04:27.020
2.

00:04:27.020 --> 00:04:32.880
What you would expect, it's
a lower triangular matrix.

00:04:32.880 --> 00:04:36.593
And if we want to
take-- I don't know

00:04:36.593 --> 00:04:38.575
if you've run into
tensor products before.

00:04:38.575 --> 00:04:42.170
If we want to take the 2 by
2 tensor product like that,

00:04:42.170 --> 00:04:47.420
we simply replace each one by
the matrix itself and each 0

00:04:47.420 --> 00:04:49.580
by a matrix of 0's.

00:04:49.580 --> 00:04:55.330
So we get a 4 by 4 matrix which
looks like this 1, 0, 1, 1, 1,

00:04:55.330 --> 00:04:59.655
0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0.

00:04:59.655 --> 00:05:00.530
And what do you know?

00:05:00.530 --> 00:05:05.630
That's what we called our
universal Reed-Muller generator

00:05:05.630 --> 00:05:09.080
matrix for m equals 2.

00:05:09.080 --> 00:05:11.670
In other words, for
Reed-Muller codes of length 4.

00:05:11.670 --> 00:05:14.030
And you see how all
the Reed-Muller codes

00:05:14.030 --> 00:05:15.090
are generated.

00:05:15.090 --> 00:05:17.690
We can find generators for
all the Reed-Muller codes

00:05:17.690 --> 00:05:21.520
from this set.

00:05:21.520 --> 00:05:28.300
And similarly, u3
was this matrix

00:05:28.300 --> 00:05:33.176
that we've now seen many times.

00:05:36.180 --> 00:05:39.772
So all these matrices
have common properties.

00:05:39.772 --> 00:05:40.980
They're all lower triangular.

00:05:44.656 --> 00:05:47.830
They have many properties.

00:05:47.830 --> 00:05:51.030
One of them is that all
of them are self-inverse.

00:05:53.630 --> 00:05:59.352
And that's very easy
to prove from the fact

00:05:59.352 --> 00:06:02.900
that u1 is clearly self-inverse.

00:06:02.900 --> 00:06:06.330
u1 times u1 is the
identity matrix.

00:06:06.330 --> 00:06:10.870
And because these are
all tensor products,

00:06:10.870 --> 00:06:15.540
that makes all of these
their own inverses.

00:06:15.540 --> 00:06:22.420
So what's the 2 to the m by 2
to the m Hadamard transform is

00:06:22.420 --> 00:06:28.896
if we have a vector
u, then u times u m,

00:06:28.896 --> 00:06:36.410
a vector y is called the
Hadamard transform of u.

00:06:36.410 --> 00:06:41.110
What's the inverse Hadamard
transform if you like?

00:06:41.110 --> 00:06:43.360
Well, since u m is
its own inverse,

00:06:43.360 --> 00:06:48.860
if we just multiply both sides
by u, we get u equals y u m.

00:06:48.860 --> 00:06:58.825
So to get u back, we just
transform y again by u m.

00:06:58.825 --> 00:07:00.720
So the inverse
Hadamard transform

00:07:00.720 --> 00:07:03.750
is the Hadamard transform
again in this case.

00:07:03.750 --> 00:07:07.100
So we get the typical kind
of transform relationship.

00:07:07.100 --> 00:07:11.490
You can think of this time
domain, frequency domain.

00:07:11.490 --> 00:07:13.210
Anyway, this is the
set of variables

00:07:13.210 --> 00:07:16.600
from which we can get
this set of variables.

00:07:16.600 --> 00:07:22.640
They're somehow in dual domains.

00:07:22.640 --> 00:07:24.320
OK.

00:07:24.320 --> 00:07:29.330
Well, now recall one definition
of the Reed-Muller codes.

00:07:29.330 --> 00:07:41.930
The Reed-Muller codes
are-- Rm of mr is-- sorry,

00:07:41.930 --> 00:07:46.240
it's always rm, which I've
always found curious notation.

00:07:46.240 --> 00:07:47.760
Since m is more
important than r,

00:07:47.760 --> 00:07:50.610
it would seem that
you would say m first,

00:07:50.610 --> 00:07:53.810
but maybe it was for
this, I don't know.

00:07:53.810 --> 00:07:58.600
This is generated by--
so let's say that.

00:07:58.600 --> 00:08:09.011
The rows of u m of
weight 2 to the m minus r

00:08:09.011 --> 00:08:09.780
[? for ?] greater.

00:08:14.610 --> 00:08:22.480
So for instance, the 0 order
Reed-Muller code of length 8

00:08:22.480 --> 00:08:25.130
is simply the 8, 1, 8
repetition code and it's

00:08:25.130 --> 00:08:26.610
generated by this last row.

00:08:26.610 --> 00:08:28.200
It's the only one of weight 8.

00:08:28.200 --> 00:08:30.740
If we want the
Reed-Muller code of length

00:08:30.740 --> 00:08:37.919
8 and minimum distance 4,
we take this, this, this,

00:08:37.919 --> 00:08:40.710
and this as our
four generator rows

00:08:40.710 --> 00:08:42.950
and we get the 8,
4, 4 code that we've

00:08:42.950 --> 00:08:45.550
been mostly using for examples.

00:08:45.550 --> 00:08:47.590
If we want the
single parity check

00:08:47.590 --> 00:08:51.670
code of length 8,
the 8, 7, 2 code,

00:08:51.670 --> 00:08:53.780
we take all these 7
generators, everyone

00:08:53.780 --> 00:08:57.940
except for the unique
weight 1 generator up here.

00:08:57.940 --> 00:09:02.500
And as a hint for
doing the homework,

00:09:02.500 --> 00:09:06.180
it's helpful to notice
that if we write

00:09:06.180 --> 00:09:15.460
the indices of these rows
as 0, 1, 2, 3, 4, 5, 6, 7,

00:09:15.460 --> 00:09:17.120
then notice there's
a relationship

00:09:17.120 --> 00:09:20.440
between the Hamming
weight of the m bit binary

00:09:20.440 --> 00:09:25.070
expansion of these indices
and the weights of these rows.

00:09:25.070 --> 00:09:27.090
Is that clear instantly?

00:09:27.090 --> 00:09:29.975
I won't prove it, but if
the Hamming weight is 3,

00:09:29.975 --> 00:09:31.620
then the row has weight 8.

00:09:31.620 --> 00:09:35.430
If the Hamming weight is 2,
then the row has weight 4.

00:09:35.430 --> 00:09:38.990
If the hamming weight is 1,
then the row has weight 2.

00:09:38.990 --> 00:09:43.170
And if the hamming weight is
0, then the row has weight 1.

00:09:43.170 --> 00:09:50.150
And that's the easiest way
I've found to do problem 8.1.

00:09:50.150 --> 00:09:54.540
So these-- there are infinity
of relations, nice relations,

00:09:54.540 --> 00:09:56.035
you could prove
for these things.

00:09:56.035 --> 00:10:00.360
They're very important
combinatorially and so forth.

00:10:00.360 --> 00:10:08.170
OK, so from this
observation, we can therefore

00:10:08.170 --> 00:10:18.030
define a Reed-Muller code as
the set of all u times u m

00:10:18.030 --> 00:10:32.350
where we simply let some of
the ui free for some rows.

00:10:32.350 --> 00:10:35.160
In other words, it's
just a binary variable.

00:10:35.160 --> 00:10:44.410
And 0 for other rows as I just
described in words over here.

00:10:48.800 --> 00:10:51.700
So basically, where
we're going is

00:10:51.700 --> 00:10:56.778
we're going to define some kind
of graph which realizes u m.

00:10:56.778 --> 00:11:03.350
We're going to put the
8-- do it either way.

00:11:03.350 --> 00:11:07.130
I'm going to put
y0 y7 over here.

00:11:07.130 --> 00:11:11.560
If we're doing a code of
length 8, I'll put u3 in here.

00:11:11.560 --> 00:11:16.210
We're going to have 8 other
things sticking out over here.

00:11:16.210 --> 00:11:18.200
I shouldn't draw these
because they're really

00:11:18.200 --> 00:11:20.920
going to be internal,
implicit variables.

00:11:20.920 --> 00:11:24.770
So this will be u0
through u7 in some order.

00:11:24.770 --> 00:11:27.630
It turns out to be necessary
to play with the order

00:11:27.630 --> 00:11:29.880
to get a nice-looking graph.

00:11:29.880 --> 00:11:32.950
And we're going to fix
some of these to 0.

00:11:32.950 --> 00:11:40.590
For instance, for
the 8, 4, 4 code,

00:11:40.590 --> 00:11:42.870
we're going to fix those to 0.

00:11:42.870 --> 00:11:47.780
And we're going to allow these
coefficients corresponding

00:11:47.780 --> 00:11:52.720
to rows of weight 4 or
more be free arbitrary

00:11:52.720 --> 00:11:53.520
binary variables.

00:11:57.120 --> 00:11:58.890
So the set of all
code words will

00:11:58.890 --> 00:12:01.900
be generated as the
set of all Hadamard

00:12:01.900 --> 00:12:07.610
transforms of 8 tuples, which
are fixed to 0 in these 4

00:12:07.610 --> 00:12:11.080
positions, and which are
free in these 4 positions.

00:12:11.080 --> 00:12:12.360
Is that clearly stated?

00:12:12.360 --> 00:12:13.920
I think I'm reasonably
proud of that.

00:12:16.470 --> 00:12:19.810
So that's what I mean
by this notation here.

00:12:19.810 --> 00:12:23.860
And it's clear that the
dimension of the code

00:12:23.860 --> 00:12:29.090
is the number of free
coefficients here.

00:12:29.090 --> 00:12:31.790
Basically, because the u
m matrix is invertible,

00:12:31.790 --> 00:12:36.320
so we're going to
get an isomorphism

00:12:36.320 --> 00:12:38.720
between 4 tuples and code words.

00:12:41.920 --> 00:12:46.560
So that's a way of
constructing Reed-Muller codes.

00:12:46.560 --> 00:12:50.890
Now, how do I get a nice
graphical representation

00:12:50.890 --> 00:12:54.390
of u m?

00:12:54.390 --> 00:12:56.295
Let's work on that for
the next little bit.

00:12:59.970 --> 00:13:02.220
Well, as always with
Reed-Muller codes,

00:13:02.220 --> 00:13:05.090
we should do things recursively.

00:13:05.090 --> 00:13:06.140
We should start with u1.

00:13:08.760 --> 00:13:10.600
That's a pretty
simple relationship.

00:13:10.600 --> 00:13:17.800
How do we draw a graph for the
relationship y equals u1 u?

00:13:17.800 --> 00:13:22.820
OK, that means y0 equals u0.

00:13:22.820 --> 00:13:23.370
Let's see.

00:13:23.370 --> 00:13:26.200
Is that right?

00:13:26.200 --> 00:13:28.240
I've been putting
u on the left here

00:13:28.240 --> 00:13:30.780
to imitate what we do
with generator matrices.

00:13:30.780 --> 00:13:43.381
So that equation is
y0, y1 equals u0, u1.

00:13:46.890 --> 00:13:50.025
Sometimes I have problems
with basic things like this.

00:13:50.025 --> 00:13:52.935
AUDIENCE: [INAUDIBLE].

00:13:52.935 --> 00:13:55.480
PROFESSOR: Yeah, I
want row vectors.

00:13:55.480 --> 00:13:55.980
Sorry.

00:14:02.700 --> 00:14:10.290
OK, so that's the
equation, which

00:14:10.290 --> 00:14:17.485
we can see that says y0 equals
u0 and y1 equals u0 plus u1.

00:14:17.485 --> 00:14:18.600
Do I agree with that?

00:14:21.600 --> 00:14:27.410
y1 equals-- no, still not right.

00:14:27.410 --> 00:14:29.270
AUDIENCE: y0 is [INAUDIBLE].

00:14:29.270 --> 00:14:30.410
PROFESSOR: Plus u1.

00:14:30.410 --> 00:14:33.580
And y2 is-- y1 is u1.

00:14:33.580 --> 00:14:34.080
All right.

00:14:36.730 --> 00:14:38.045
Sorry, now we're in business.

00:14:40.780 --> 00:14:46.520
OK, so let's draw a
little graph of that.

00:14:46.520 --> 00:14:52.920
We have y1 equals u1.

00:14:52.920 --> 00:14:55.690
And if I make an
equals down here,

00:14:55.690 --> 00:14:57.860
I'll get a little u1 in here.

00:14:57.860 --> 00:15:05.590
So I can put u0 in here and
get out y0 equals u0 plus u1.

00:15:05.590 --> 00:15:08.750
OK, so that's a
graphical realization

00:15:08.750 --> 00:15:11.140
of this Hadamard transform.

00:15:11.140 --> 00:15:14.720
If I put 2 bits in
here for u, then

00:15:14.720 --> 00:15:18.000
I'll get the Hadamard
transform out here for y.

00:15:18.000 --> 00:15:20.900
But by the way, notice there
aren't any arrows here.

00:15:20.900 --> 00:15:24.350
This is a behavioral
realization.

00:15:24.350 --> 00:15:28.340
And I could equally
well put a 2 tuple here

00:15:28.340 --> 00:15:32.445
on the y's and I'd get out the
Hadamard transform at the u's.

00:15:32.445 --> 00:15:33.550
It goes either way.

00:15:38.140 --> 00:15:44.120
And this I find
somewhat interesting

00:15:44.120 --> 00:15:47.038
is called a controlled NOT gate.

00:15:49.790 --> 00:15:54.760
The idea here is that this
bottom variable is the control

00:15:54.760 --> 00:15:57.840
and it determines whether
this variable is inverted

00:15:57.840 --> 00:16:01.600
or not as it goes across here.

00:16:01.600 --> 00:16:05.770
And this is a basic gate
in quantum mechanics.

00:16:05.770 --> 00:16:09.789
Maybe the basic gate in quantum
mechanical realizations.

00:16:09.789 --> 00:16:11.455
[INAUDIBLE] at least
[INAUDIBLE] cubits.

00:16:14.330 --> 00:16:19.470
But anyway, that's actually
neither here nor there.

00:16:19.470 --> 00:16:22.880
All right, so that's
how we realize u1.

00:16:22.880 --> 00:16:24.640
All right, now
recursively there must

00:16:24.640 --> 00:16:32.070
be some way of putting together
little blocks like this

00:16:32.070 --> 00:16:35.705
to make bigger
blocks, say, for u2.

00:16:35.705 --> 00:16:37.360
To realize u2 or u3.

00:16:39.980 --> 00:16:51.050
OK, so for u2, let me just
try to steam ahead and hope

00:16:51.050 --> 00:16:51.880
this will work.

00:16:55.500 --> 00:16:59.580
We build another gate to
do a set of transforms

00:16:59.580 --> 00:17:03.530
to an intermediate
variable here.

00:17:03.530 --> 00:17:08.534
Intermediate variables, and
then we sort of do a butterfly.

00:17:15.390 --> 00:17:19.339
We're going to build this out
of 4 little blocks like this.

00:17:19.339 --> 00:17:21.950
We're going to write like this.

00:17:21.950 --> 00:17:23.839
And how are we going
to tie them together?

00:17:23.839 --> 00:17:28.430
The outputs here are going
to go to the equals signs.

00:17:28.430 --> 00:17:33.660
The outputs here are going
to go to the plus signs.

00:17:33.660 --> 00:17:36.410
And we get something that
looks like that for the 2

00:17:36.410 --> 00:17:41.000
by 2 Hadamard transform.

00:17:41.000 --> 00:17:48.420
And y0, y1, y2, y3.

00:17:48.420 --> 00:17:51.290
Now, probably this
should be u2 and this

00:17:51.290 --> 00:17:55.050
should be u1 just from
the way these works.

00:17:55.050 --> 00:17:58.170
And if we do that, we get what?

00:17:58.170 --> 00:18:01.500
This is u3.

00:18:01.500 --> 00:18:05.130
This is u1 plus u3.

00:18:05.130 --> 00:18:09.190
This is u2.

00:18:09.190 --> 00:18:11.473
This u0 plus u2.

00:18:14.670 --> 00:18:17.960
So we get that--
doing it backwards,

00:18:17.960 --> 00:18:28.010
y3 equals u3, y2 equals u2
plus u3, y1 equals u1 plus u3,

00:18:28.010 --> 00:18:35.100
and y0 equals u0 plus
u2 plus u1 plus u3.

00:18:35.100 --> 00:18:41.660
And that's probably correct,
but it's something like that.

00:18:41.660 --> 00:18:44.620
Done in the notes.

00:18:44.620 --> 00:18:48.660
And those of you who have ever
looked at the fast Fourier

00:18:48.660 --> 00:18:51.020
transform will see
some resemblance here

00:18:51.020 --> 00:18:55.470
to the butterflies that occur
in a fast Fourier transform.

00:18:55.470 --> 00:18:58.330
And it's not surprising
because these are all

00:18:58.330 --> 00:19:05.390
based on groups of-- in
this case, z2 squared.

00:19:05.390 --> 00:19:07.500
And so they have the
same algebraic structure

00:19:07.500 --> 00:19:09.265
underlying them.

00:19:09.265 --> 00:19:11.090
But that's, again,
a side comment

00:19:11.090 --> 00:19:12.595
that I won't take
time to justify.

00:19:16.020 --> 00:19:20.970
And similarly for
8, if we want to do

00:19:20.970 --> 00:19:24.260
the 8-- there's a
picture in the notes--

00:19:24.260 --> 00:19:26.490
we build 8 of these gates.

00:19:26.490 --> 00:19:29.050
Maybe I actually have
to do it because I

00:19:29.050 --> 00:19:32.150
want to reduce it also.

00:19:32.150 --> 00:19:38.820
So 1, 2, 3, 4.

00:19:38.820 --> 00:19:40.465
Sorry, this takes a little time.

00:19:51.200 --> 00:19:55.352
This is where making up slides
ahead of time is a good idea.

00:19:59.240 --> 00:20:02.360
All right, so we have
three levels of 2 by 2

00:20:02.360 --> 00:20:05.355
Hadamard transforms
as you might expect.

00:20:09.420 --> 00:20:15.210
At each level, we do four
2 by 2 Hadamard transforms.

00:20:15.210 --> 00:20:20.110
And we connect them together.

00:20:20.110 --> 00:20:24.170
At the first level, we
stay within the half.

00:20:24.170 --> 00:20:26.205
We connect them together
in the same way.

00:20:29.930 --> 00:20:33.020
And at the next level,
we have to spread out

00:20:33.020 --> 00:20:37.000
over the whole
thing, so we get--

00:20:37.000 --> 00:20:39.250
this goes up to this equals.

00:20:39.250 --> 00:20:40.570
This goes up to this equals.

00:20:40.570 --> 00:20:42.800
This goes up to that equals.

00:20:42.800 --> 00:20:45.610
This comes down here.

00:20:45.610 --> 00:20:49.510
This comes down here.

00:20:49.510 --> 00:20:53.380
This comes down here
and this goes across.

00:20:53.380 --> 00:20:59.420
That gives y0, y1,
y2, and so forth.

00:20:59.420 --> 00:21:05.080
y3, y4, y5, y6, y7.

00:21:05.080 --> 00:21:09.800
And again, we have to
jigger with the order here.

00:21:09.800 --> 00:21:12.160
Probably something like this.

00:21:20.460 --> 00:21:25.160
OK, in any case, it's done
correctly in the notes.

00:21:25.160 --> 00:21:30.740
All right, so this executes
an 8 by 8 Hadamard transform.

00:21:30.740 --> 00:21:35.160
You put y's on here and you
get the Hadamard transform

00:21:35.160 --> 00:21:38.370
out on the u's, or vice versa.

00:21:38.370 --> 00:21:40.580
You with me?

00:21:40.580 --> 00:21:41.960
OK.

00:21:41.960 --> 00:21:44.340
Notice that the components
here are all very simple.

00:21:44.340 --> 00:21:47.740
I've got a graphical realization
of the Hadamard transform

00:21:47.740 --> 00:21:54.330
relationship that involves only
binary variables, first of all.

00:21:54.330 --> 00:21:56.440
All the internal variables
here are just bits.

00:21:56.440 --> 00:21:57.880
They're all binary.

00:21:57.880 --> 00:22:03.070
And all of the constraint nodes
are these simple little degree

00:22:03.070 --> 00:22:04.600
3 constraint nodes.

00:22:04.600 --> 00:22:09.950
The only two nontrivial ones
you will probably ever see.

00:22:09.950 --> 00:22:11.970
Namely, the zero
sum type of node

00:22:11.970 --> 00:22:14.180
and the equality type of node.

00:22:14.180 --> 00:22:19.250
3, 2, 2 and 3, 1, 1 if we
consider them as codes.

00:22:19.250 --> 00:22:22.125
So it's made up of
extremely simple elements.

00:22:22.125 --> 00:22:27.290
And in that sense, we can say
it's a simple randomization.

00:22:27.290 --> 00:22:31.120
But of course, this
graph is full of cycles,

00:22:31.120 --> 00:22:32.670
isn't that right?

00:22:32.670 --> 00:22:33.940
Yes.

00:22:33.940 --> 00:22:36.460
These little things
cause cycles.

00:22:36.460 --> 00:22:39.920
So we get cycles as we go
around like that, for instance.

00:22:43.960 --> 00:22:47.950
Nonetheless, it's a perfectly--
you put something over here

00:22:47.950 --> 00:22:50.570
and you get something
deterministic out over here.

00:22:53.090 --> 00:22:54.340
All right.

00:22:54.340 --> 00:23:02.070
So to realize the 8, 4, 4 code--
are you all with me on this?

00:23:02.070 --> 00:23:03.870
I should check again.

00:23:03.870 --> 00:23:07.671
Is there anybody who
doesn't understand this?

00:23:07.671 --> 00:23:09.420
That's always the way
to put the question.

00:23:09.420 --> 00:23:09.950
Yes?

00:23:09.950 --> 00:23:13.674
AUDIENCE: You count the output
when you're counting degree?

00:23:13.674 --> 00:23:16.090
PROFESSOR: Do I count the
output when I'm counting degree?

00:23:16.090 --> 00:23:19.280
In this case, yes.

00:23:19.280 --> 00:23:21.440
I reserve my options on that.

00:23:21.440 --> 00:23:23.730
Sometimes I do,
sometimes I don't.

00:23:23.730 --> 00:23:25.060
It's whatever is convenient.

00:23:25.060 --> 00:23:27.831
My choice.

00:23:27.831 --> 00:23:29.705
Sorry, I think I just
whacked the microphone.

00:23:32.390 --> 00:23:34.630
OK.

00:23:34.630 --> 00:23:38.440
So now, let me go to
my Reed-Muller code

00:23:38.440 --> 00:23:40.850
realization, the 8, 4, 4 code.

00:23:40.850 --> 00:23:42.330
How are we going
to realize that?

00:23:42.330 --> 00:23:46.810
We've got these internal
variables here, some of which

00:23:46.810 --> 00:23:48.795
we're going to hold to 0.

00:23:51.930 --> 00:23:53.525
These four we're
going to hold to 0.

00:23:58.310 --> 00:24:01.120
And the rest we're
going to let go free.

00:24:01.120 --> 00:24:02.850
And so I claim
that already now I

00:24:02.850 --> 00:24:06.830
have a realization of the 8,
4, 4 code, if I regard these

00:24:06.830 --> 00:24:11.975
as internal variables, these
as my external variables

00:24:11.975 --> 00:24:12.720
over here.

00:24:15.670 --> 00:24:20.755
OK, does everyone find
that plausible at least?

00:24:20.755 --> 00:24:23.550
OK.

00:24:23.550 --> 00:24:26.407
But having done that--
now, watch this part

00:24:26.407 --> 00:24:27.990
closely because this
part you're going

00:24:27.990 --> 00:24:30.500
to have to imitate
on the homework.

00:24:30.500 --> 00:24:33.980
We can do some reductions.

00:24:33.980 --> 00:24:36.140
All right.

00:24:36.140 --> 00:24:40.250
If I have a 2 by 2
Hadamard transform of 0, 0,

00:24:40.250 --> 00:24:41.990
what's the result?

00:24:41.990 --> 00:24:43.850
0, 0.

00:24:43.850 --> 00:24:44.420
OK.

00:24:44.420 --> 00:24:48.500
I can do that in several steps,
but that's a good way to do it.

00:24:48.500 --> 00:24:51.050
So I don't really
need these gates here.

00:24:51.050 --> 00:24:52.650
I can just put 0's over here.

00:24:56.020 --> 00:25:00.390
Similarly, down here if I have
a 2 by 2 Hadamard transform of 2

00:25:00.390 --> 00:25:05.390
free variables, internal
variables, what do I get?

00:25:05.390 --> 00:25:08.560
I get two free variables.

00:25:08.560 --> 00:25:13.460
One of them happens to be
u3 plus u7 and the other one

00:25:13.460 --> 00:25:15.130
is just u7.

00:25:15.130 --> 00:25:18.710
I could call this
u3 prime, let's say.

00:25:18.710 --> 00:25:20.580
And since it's an
internal variable

00:25:20.580 --> 00:25:24.640
that I'm just using
to generate the code,

00:25:24.640 --> 00:25:25.920
I can delete that, too.

00:25:34.670 --> 00:25:37.180
So what can I do
about these here?

00:25:37.180 --> 00:25:39.970
When I have one free
variable and one zero,

00:25:39.970 --> 00:25:41.400
what's going to happen?

00:25:41.400 --> 00:25:44.120
This free variable is going
to come through here and here.

00:25:46.690 --> 00:25:49.400
The 0 doesn't count
here, so a plus just

00:25:49.400 --> 00:25:53.180
becomes a repetition in effect.

00:25:53.180 --> 00:25:55.620
It's got to have the same
parity as this input,

00:25:55.620 --> 00:25:56.520
so it is that input.

00:26:00.430 --> 00:26:06.037
So by another major
sweep, I'll just--

00:26:06.037 --> 00:26:07.245
this is an internal variable.

00:26:07.245 --> 00:26:10.540
I'll just call this u6.

00:26:10.540 --> 00:26:14.395
OK, I don't need to actually
draw a gate to generate it.

00:26:14.395 --> 00:26:16.087
It has to be the
same here and here,

00:26:16.087 --> 00:26:17.295
so I have to tie it together.

00:26:19.856 --> 00:26:21.730
Similarly, down here I
can just call this u5.

00:26:34.510 --> 00:26:37.199
And now, continuing
in here-- I hope this

00:26:37.199 --> 00:26:38.490
is going to come out all right.

00:26:38.490 --> 00:26:40.600
It doesn't seem
quite right so far,

00:26:40.600 --> 00:26:42.140
but let's see how it comes out.

00:26:45.099 --> 00:26:46.390
What do we have coming in here?

00:26:46.390 --> 00:26:49.310
We have two different
free variables.

00:26:49.310 --> 00:26:58.950
So again, let me just
call this u3 prime again.

00:26:58.950 --> 00:27:02.190
And this is maybe u5 prime.

00:27:02.190 --> 00:27:05.885
But in any case, I
can just draw this.

00:27:09.210 --> 00:27:13.830
In this case, I'm
coming around this way.

00:27:13.830 --> 00:27:16.820
And u3 prime was
dangling anyway,

00:27:16.820 --> 00:27:21.720
so I can leave that over here.

00:27:21.720 --> 00:27:24.720
And similarly down here,
I get two free ones,

00:27:24.720 --> 00:27:29.100
so let me just call
them like this.

00:27:31.870 --> 00:27:33.753
And this is equal to y7.

00:27:39.841 --> 00:27:43.340
Well, it's a little strange.

00:27:43.340 --> 00:27:47.050
I did something wrong there.

00:27:47.050 --> 00:27:50.430
I did something wrong.

00:27:50.430 --> 00:27:51.660
Well, maybe not.

00:27:51.660 --> 00:27:52.385
Maybe not.

00:27:54.920 --> 00:27:57.200
This is not going the way
it's gone in the past,

00:27:57.200 --> 00:28:00.530
so maybe I drew the graph
differently in the first place.

00:28:00.530 --> 00:28:02.470
This is u6 again.

00:28:02.470 --> 00:28:07.190
This is still u6 up here.

00:28:07.190 --> 00:28:08.910
But this is a 0.

00:28:08.910 --> 00:28:12.260
So this is u6 also
coming out here.

00:28:12.260 --> 00:28:17.870
But I need this
little-- I do need

00:28:17.870 --> 00:28:20.790
to replicate u6 three times.

00:28:20.790 --> 00:28:22.970
And maybe that's what
I needed down here too,

00:28:22.970 --> 00:28:26.610
was to replicate u5 three times.

00:28:26.610 --> 00:28:27.526
AUDIENCE: [INAUDIBLE].

00:28:31.916 --> 00:28:32.790
PROFESSOR: Excuse me?

00:28:32.790 --> 00:28:33.873
AUDIENCE: The bottom line.

00:28:36.597 --> 00:28:38.430
PROFESSOR: Say again,
I just didn't hear it.

00:28:38.430 --> 00:28:39.873
AUDIENCE: The bottom line.

00:28:39.873 --> 00:28:42.760
You remove that
[INAUDIBLE] shortcut.

00:28:42.760 --> 00:28:45.927
PROFESSOR: I removed--
this equaled a shortcut.

00:28:45.927 --> 00:28:46.552
AUDIENCE: Yeah.

00:28:46.552 --> 00:28:47.799
But the [INAUDIBLE].

00:28:47.799 --> 00:28:49.840
PROFESSOR: There was an
[INAUDIBLE] on top of it,

00:28:49.840 --> 00:28:51.350
right.

00:28:51.350 --> 00:28:52.260
And--

00:28:52.260 --> 00:28:53.700
AUDIENCE: That
cannot be replaced.

00:28:53.700 --> 00:28:56.938
PROFESSOR: That cannot be
replaced by a shortcut.

00:28:56.938 --> 00:28:57.890
OK.

00:28:57.890 --> 00:28:58.390
Thank you.

00:29:02.270 --> 00:29:03.470
So it should look like that?

00:29:06.400 --> 00:29:07.550
I don't know.

00:29:07.550 --> 00:29:09.960
I think I've botched
it, so you're

00:29:09.960 --> 00:29:12.160
going to do it right
on the homework.

00:29:12.160 --> 00:29:19.870
Let me just-- yeah, good.

00:29:19.870 --> 00:29:22.220
This is what it
should look like.

00:29:22.220 --> 00:29:24.090
You do need the equals there.

00:29:27.840 --> 00:29:30.970
It's because I drew the graph
backwards in the first place.

00:29:34.810 --> 00:29:37.910
Anyway, presto change-o.

00:29:37.910 --> 00:29:41.145
Going through that
kind of manipulation,

00:29:41.145 --> 00:29:45.940
what we eventually
come up with is

00:29:45.940 --> 00:29:48.360
something that looks like this.

00:30:04.090 --> 00:30:06.520
And I apologize for botching it.

00:30:19.330 --> 00:30:23.970
And coming around,
this is tied to this

00:30:23.970 --> 00:30:25.290
and this is tied to that.

00:30:31.900 --> 00:30:32.510
All right.

00:30:32.510 --> 00:30:37.090
And I asked you to imagine
that if I-- what I did

00:30:37.090 --> 00:30:39.470
was since I can draw
the Hadamard transform--

00:30:39.470 --> 00:30:42.980
either way, I drew
it the wrong way.

00:30:42.980 --> 00:30:47.860
And therefore, I wasn't going
to get what I intended to get,

00:30:47.860 --> 00:30:49.150
which is this.

00:30:49.150 --> 00:30:51.940
But it's done
correctly in Figure 10.

00:30:51.940 --> 00:30:55.950
And the types of reductions
that I explained to you

00:30:55.950 --> 00:30:58.320
are the types of reduction
that's necessarily

00:30:58.320 --> 00:31:00.370
to get from Figure 10A to 10B.

00:31:00.370 --> 00:31:02.280
Please do try this at home.

00:31:06.090 --> 00:31:11.965
Now, so I claim that this
is a realization of the 8,

00:31:11.965 --> 00:31:14.470
4, 4 code.

00:31:14.470 --> 00:31:20.370
And you can verify that
either by putting--

00:31:20.370 --> 00:31:23.450
you'll find there are four
free internal variables here

00:31:23.450 --> 00:31:29.500
and you can either set them
to 1 or you can try this trick

00:31:29.500 --> 00:31:32.000
that I discussed
before, I think.

00:31:32.000 --> 00:31:35.290
That we can regard these
as an information set.

00:31:35.290 --> 00:31:40.280
And by fixing y0,
y1, y2, and y4,

00:31:40.280 --> 00:31:42.870
you can trace your
way through the graph

00:31:42.870 --> 00:31:50.640
and find out that y3, y5,
y6, and y7 are determined.

00:31:50.640 --> 00:31:52.300
Again, do try this at home.

00:31:55.270 --> 00:31:57.490
So you can convince
yourself that this

00:31:57.490 --> 00:32:02.660
is a graph for 6-- that has
16 possible solutions for all

00:32:02.660 --> 00:32:05.200
the internal and
external variables.

00:32:05.200 --> 00:32:08.030
16 possible trajectories
that are valid that

00:32:08.030 --> 00:32:10.670
satisfy all the
internal constraints.

00:32:10.670 --> 00:32:13.510
And that the external
8 tuples that

00:32:13.510 --> 00:32:18.870
are parts of these trajectories
are the 8 tuples of the 8,

00:32:18.870 --> 00:32:20.210
4, 4 code.

00:32:20.210 --> 00:32:22.250
I just assert that now.

00:32:22.250 --> 00:32:23.100
Please verify it.

00:32:25.820 --> 00:32:28.800
All right, so now
let's step back

00:32:28.800 --> 00:32:34.070
and take a look at
this realization.

00:32:34.070 --> 00:32:37.620
I believe it's the simplest
representation of the 8, 4,

00:32:37.620 --> 00:32:43.470
4 codes if you account the
complexity of the constraints

00:32:43.470 --> 00:32:46.020
and of the variables.

00:32:46.020 --> 00:32:48.810
Notice that all the internal
variables, the states

00:32:48.810 --> 00:32:50.770
if you like, they're all binary.

00:32:50.770 --> 00:32:52.840
So it's a two-state
representation

00:32:52.840 --> 00:32:56.980
if we want to use that language,
that all the constraints are

00:32:56.980 --> 00:33:01.700
simple, either 3, 2, 2
or 3, 1, 3 constraints.

00:33:01.700 --> 00:33:05.240
We count that as the branch
complexity in effect.

00:33:05.240 --> 00:33:06.840
This isn't a
trellis, but this is

00:33:06.840 --> 00:33:08.840
what's analogous to
the branch complexity

00:33:08.840 --> 00:33:12.240
as we'll see when we get to
the sum product algorithm.

00:33:12.240 --> 00:33:14.730
The complexity is
proportional to the number

00:33:14.730 --> 00:33:17.750
of code words in each
of these constraints.

00:33:17.750 --> 00:33:19.890
So there are either
2 or 4 code words.

00:33:19.890 --> 00:33:21.850
It's a very simple realization.

00:33:21.850 --> 00:33:26.600
And there are only 12
of these constraints.

00:33:26.600 --> 00:33:32.680
On the other hand, it does have
cycles, like this cycle here.

00:33:35.610 --> 00:33:37.832
When we get to the
sum product algorithm,

00:33:37.832 --> 00:33:39.790
we'll see that there's
a fundamental difference

00:33:39.790 --> 00:33:43.910
between decoding on a graph
with cycles and a graph

00:33:43.910 --> 00:33:45.060
without cycles.

00:33:45.060 --> 00:33:48.160
Without cycles, we get an
exact decoding algorithm.

00:33:48.160 --> 00:33:50.340
It does maximum
likelihood decoding,

00:33:50.340 --> 00:33:53.210
like the Viterbi algorithm.

00:33:53.210 --> 00:33:57.700
Or actually, a posteriori
probability decoding.

00:33:57.700 --> 00:34:03.710
And in any case, it gives us
the optimum, in some sense.

00:34:03.710 --> 00:34:05.440
When we decode on a
graph with cycles,

00:34:05.440 --> 00:34:10.170
we get some degradation
from the optimum.

00:34:10.170 --> 00:34:13.370
Sae-Young Chung tried
decoding with this.

00:34:13.370 --> 00:34:17.590
And as I remember, it was a few
tenths of [? db ?] sub-optimum.

00:34:17.590 --> 00:34:19.010
It wasn't bad.

00:34:19.010 --> 00:34:21.199
But it wasn't optimum either.

00:34:21.199 --> 00:34:23.030
So that's the trade-off we make.

00:34:23.030 --> 00:34:30.179
Suppose we want to make this
into a cycle-free graph.

00:34:30.179 --> 00:34:32.860
We can probably do
that by agglomeration.

00:34:37.139 --> 00:34:41.460
Let me suggest the
following agglomeration.

00:34:41.460 --> 00:34:44.139
Let's make all of
this-- let's suppress

00:34:44.139 --> 00:34:47.320
that cycle into one big block.

00:34:47.320 --> 00:34:51.940
Let's suppress this
cycle into one big block.

00:34:51.940 --> 00:34:57.090
So we're going to consider
this one overall constraint.

00:34:57.090 --> 00:34:58.480
What is that constraint?

00:34:58.480 --> 00:35:01.410
We have two outputs.

00:35:01.410 --> 00:35:04.240
Or I'm sorry, two
variables over here.

00:35:04.240 --> 00:35:06.350
We have four
variables over here.

00:35:06.350 --> 00:35:09.520
So it's going to be
some code of length 6.

00:35:09.520 --> 00:35:12.780
And from the fact that you have
three inputs coming in here,

00:35:12.780 --> 00:35:16.570
you might guess it's going to
be a 6, 3 code, which it is.

00:35:21.950 --> 00:35:27.620
So if we do that,
we simply get 6, 3.

00:35:27.620 --> 00:35:32.950
And similarly, a 6, 3
constraint down here.

00:35:32.950 --> 00:35:37.790
There are 8 possible
valid behaviors

00:35:37.790 --> 00:35:39.190
just for this constraint.

00:35:39.190 --> 00:35:48.490
And again, since everything
is self-dual here,

00:35:48.490 --> 00:35:51.050
it's going to be 6, 3.

00:35:51.050 --> 00:35:52.750
So let me, again, assert that.

00:35:52.750 --> 00:35:55.010
Does anybody recognize
what this realization is?

00:36:03.204 --> 00:36:05.750
Would it help if I
drew it like this?

00:36:16.970 --> 00:36:19.765
This is four variables,
four variables.

00:36:27.580 --> 00:36:28.340
Anybody?

00:36:28.340 --> 00:36:30.730
Come on, you've
seen this before.

00:36:30.730 --> 00:36:32.027
AUDIENCE: [INAUDIBLE].

00:36:32.027 --> 00:36:32.610
PROFESSOR: No.

00:36:36.060 --> 00:36:38.180
This is a realization
of a different style.

00:36:38.180 --> 00:36:41.150
This is now a
cycle-free realization,

00:36:41.150 --> 00:36:45.520
as you can easily see either
from this or from-- well,

00:36:45.520 --> 00:36:46.790
it's cycle-free now.

00:36:46.790 --> 00:36:49.060
Sorry, I should
do one more thing.

00:36:49.060 --> 00:36:55.840
I should make this into
a quaternary variable.

00:36:55.840 --> 00:36:58.530
So this will be 2.

00:36:58.530 --> 00:37:01.455
It's basically u5,
u6 in this notation.

00:37:03.960 --> 00:37:05.830
Without doing that,
there's still a cycle.

00:37:05.830 --> 00:37:09.085
This little cycle going back and
forth there as we said before.

00:37:12.530 --> 00:37:16.260
So somebody I heard say it.

00:37:16.260 --> 00:37:17.090
Say it again.

00:37:17.090 --> 00:37:17.840
AUDIENCE: Trellis.

00:37:17.840 --> 00:37:18.775
PROFESSOR: Trellis.

00:37:18.775 --> 00:37:21.260
It's our two-section
trellis realization

00:37:21.260 --> 00:37:24.200
of the 8, 4, 4 code
where we divide a code

00:37:24.200 --> 00:37:27.500
word into two
sections of length 4.

00:37:27.500 --> 00:37:32.990
And simply, this is the
trellis at the first half.

00:37:32.990 --> 00:37:38.310
Eight possible lines,
each one containing

00:37:38.310 --> 00:37:40.590
two parallel transitions.

00:37:40.590 --> 00:37:42.090
Sorry.

00:37:42.090 --> 00:37:44.900
Four lines each containing
two parallel transitions

00:37:44.900 --> 00:37:46.865
going to one of eight states.

00:37:50.630 --> 00:37:53.355
And that's what the
actual trellis looks like.

00:37:56.570 --> 00:38:01.214
So this is, indeed, cycle-free.

00:38:01.214 --> 00:38:03.130
What's the difference
between this realization

00:38:03.130 --> 00:38:04.130
and the other one?

00:38:04.130 --> 00:38:07.560
Well, first of all,
we've got four states

00:38:07.560 --> 00:38:12.100
in this cycle-free realization
and we've got-- certainly,

00:38:12.100 --> 00:38:14.760
a much larger
constraint code here,

00:38:14.760 --> 00:38:17.590
which will turn into complexity,
computational complexity,

00:38:17.590 --> 00:38:19.370
when we go to decoding.

00:38:19.370 --> 00:38:22.320
Now, we've got a
branch complexity of 8.

00:38:22.320 --> 00:38:25.450
Whereas, we never had
anything larger than 4

00:38:25.450 --> 00:38:30.000
in the zero sum constraints
of the previous diagram.

00:38:30.000 --> 00:38:33.050
So in that sense, this
is more complicated.

00:38:33.050 --> 00:38:35.310
Its state complexity,
its branch complexity

00:38:35.310 --> 00:38:38.770
is higher than the
cycle-free one that we did.

00:38:38.770 --> 00:38:43.360
Than the one with cycles,
but it's cycle-free.

00:38:43.360 --> 00:38:46.520
You remember also
we did this one.

00:38:46.520 --> 00:38:51.020
We have both of
these going down here

00:38:51.020 --> 00:38:55.340
and we consider them each
to be a binary variable,

00:38:55.340 --> 00:38:57.700
then we get our
tail-biting realization,

00:38:57.700 --> 00:38:59.986
which is only two states.

00:38:59.986 --> 00:39:01.652
But it now has a cycle.

00:39:01.652 --> 00:39:03.110
And if we're going
to go to cycles,

00:39:03.110 --> 00:39:05.043
I would go all the
way to that other one.

00:39:05.043 --> 00:39:06.600
But maybe not.

00:39:06.600 --> 00:39:08.960
If you decode this
tail-biting realization,

00:39:08.960 --> 00:39:11.840
maybe its performance
is a little bit better

00:39:11.840 --> 00:39:15.050
because it more
faithfully represents

00:39:15.050 --> 00:39:21.540
what's going on inside the
circuit than the other one.

00:39:21.540 --> 00:39:23.030
How many different
ways have we had

00:39:23.030 --> 00:39:25.750
of realizing the
8, 4, 4 code now?

00:39:25.750 --> 00:39:29.650
We've had at least half a
dozen, maybe more than that.

00:39:29.650 --> 00:39:34.040
And this is all to
illustrate the various styles

00:39:34.040 --> 00:39:34.760
that we have.

00:39:34.760 --> 00:39:37.950
We've now had at least
one of each style.

00:39:37.950 --> 00:39:42.110
And now we have one
of Hadamard transform.

00:39:42.110 --> 00:39:44.970
Or, as I'm going to call it,
we have a reduced Hadamard

00:39:44.970 --> 00:39:52.530
transform, the one with only
the 12 simple constraints in it.

00:39:52.530 --> 00:39:56.660
Or, we also have the
agglomerated Hadamard

00:39:56.660 --> 00:40:02.744
transform, which gets
us to cycle-free.

00:40:02.744 --> 00:40:06.200
This has cycles.

00:40:06.200 --> 00:40:13.440
But this is simple and this
is, at least, more complex.

00:40:13.440 --> 00:40:14.240
So there's a trade.

00:40:14.240 --> 00:40:14.948
What do you want?

00:40:22.270 --> 00:40:24.740
OK.

00:40:24.740 --> 00:40:28.220
Just one more side
comment on the subject.

00:40:32.160 --> 00:40:38.290
For exercise 2, which
I didn't assign,

00:40:38.290 --> 00:40:45.870
shows that in general
an agglomerated Hadamard

00:40:45.870 --> 00:40:51.290
transform realization
of a Reed-Muller code,

00:40:51.290 --> 00:40:57.950
you can always get one
that looks like this.

00:40:57.950 --> 00:40:58.500
Let's see.

00:41:08.230 --> 00:41:14.020
Which is sort of a four-section
realization where each of these

00:41:14.020 --> 00:41:17.435
has length 2 to the m minus 2.

00:41:23.430 --> 00:41:26.380
This is like the
four-section realization

00:41:26.380 --> 00:41:27.610
of the Reed-Muller code.

00:41:30.240 --> 00:41:34.350
Each of these
state variables has

00:41:34.350 --> 00:41:37.550
dimension which meets
the cut-set bound.

00:41:37.550 --> 00:41:40.580
And the cut-set
bound for a cut that

00:41:40.580 --> 00:41:50.780
divides the trellis into one
quarter and three quarters

00:41:50.780 --> 00:41:53.550
or one half and one half.

00:41:53.550 --> 00:41:57.240
You remember all
of these dimensions

00:41:57.240 --> 00:42:00.890
were the same as
simply the state

00:42:00.890 --> 00:42:04.910
dimension of a
four-section trellis.

00:42:04.910 --> 00:42:08.430
And I hope you remember that
the state dimensions were

00:42:08.430 --> 00:42:10.610
the same at all four boundaries.

00:42:10.610 --> 00:42:12.980
So we always get
the same dimension

00:42:12.980 --> 00:42:17.610
for each of these, the same
as in a four-section trellis.

00:42:17.610 --> 00:42:20.600
And we get the minimal
trellis complexity here.

00:42:20.600 --> 00:42:24.040
So this is going to
turn out to be length

00:42:24.040 --> 00:42:33.740
equals 3s and k equals
t, where t is the branch

00:42:33.740 --> 00:42:35.980
complexity, the minimal
branch complexity

00:42:35.980 --> 00:42:38.200
parameter for a
trellis that we had

00:42:38.200 --> 00:42:42.925
in a table back in Chapter 6.

00:42:46.500 --> 00:42:50.940
So this actually turns out
to be a nice-- the nicest

00:42:50.940 --> 00:42:55.990
I know-- cycle-free realization
of a Reed-Muller code.

00:42:55.990 --> 00:43:00.740
For instance, for a 32,
16, 8 code these dimensions

00:43:00.740 --> 00:43:05.010
are 18, 9, 18, 9.

00:43:05.010 --> 00:43:06.250
These are all 6.

00:43:06.250 --> 00:43:13.400
It's a 64-state
realization with 8 symbols

00:43:13.400 --> 00:43:16.130
at each of these points.

00:43:16.130 --> 00:43:18.930
And each of these
is a 14, 7 code.

00:43:23.300 --> 00:43:26.860
Is this fundamentally simpler
than a four-section trellis

00:43:26.860 --> 00:43:30.030
realization of the
32, 16, 8 code?

00:43:30.030 --> 00:43:30.740
Not really.

00:43:30.740 --> 00:43:33.110
The branch complexity
is the same

00:43:33.110 --> 00:43:36.560
as it is in the central
sections of this code.

00:43:36.560 --> 00:43:40.140
This is very close
to being a-- it

00:43:40.140 --> 00:43:42.990
is a four-section
realization if we

00:43:42.990 --> 00:43:44.990
draw the four
sections like this.

00:43:50.240 --> 00:43:54.420
Except we have to draw a little
additional thing up here,

00:43:54.420 --> 00:43:56.535
which is slightly helpful.

00:44:01.370 --> 00:44:06.495
So it's sort of a expanded
trellis that looks like this.

00:44:19.590 --> 00:44:30.090
OK, so maybe think of that as a
quote, "explanation" for this.

00:44:32.870 --> 00:44:37.430
And I give explicit expressions
for s and particularly

00:44:37.430 --> 00:44:40.965
for t, which perhaps
you can derive

00:44:40.965 --> 00:44:43.740
or perhaps you don't care.

00:44:43.740 --> 00:44:45.901
You won't be held
responsible for it, anyway.

00:44:49.270 --> 00:44:54.590
You might even remember that
the Golay code, the 24, 12,

00:44:54.590 --> 00:45:02.030
8 realization, which
looked very much like this

00:45:02.030 --> 00:45:07.090
but contained only three
sections, each of length 8

00:45:07.090 --> 00:45:12.190
and a single 18, 9
constraint in the middle.

00:45:12.190 --> 00:45:14.450
These were all 6.

00:45:14.450 --> 00:45:16.805
Maybe you remember
that, maybe you don't.

00:45:16.805 --> 00:45:19.480
But that's a realization
of the Golay code.

00:45:19.480 --> 00:45:22.751
And it shows that these two
codes are very close cousins.

00:45:22.751 --> 00:45:24.542
There are many ways of
showing that they're

00:45:24.542 --> 00:45:25.640
very close cousins.

00:45:25.640 --> 00:45:27.740
They are.

00:45:27.740 --> 00:45:32.420
OK, so now I'm just going
off into side comments.

00:45:32.420 --> 00:45:34.330
You obviously won't
be held responsible

00:45:34.330 --> 00:45:36.510
for any of the side comments.

00:45:36.510 --> 00:45:39.360
But I hope they give you
something of a flavor

00:45:39.360 --> 00:45:42.565
for how the subject
can be developed.

00:45:47.720 --> 00:45:49.670
OK, that's the
end of Chapter 11.

00:45:49.670 --> 00:45:51.740
There is an appendix
in Chapter 11

00:45:51.740 --> 00:45:55.530
that talks about other
flavors of graphs.

00:45:55.530 --> 00:46:02.040
Factor graphs is a more
embracing philosophically,

00:46:02.040 --> 00:46:06.580
conceptually notion where each
of the constraints, instead

00:46:06.580 --> 00:46:12.740
of representing codes represents
the factors, local factors,

00:46:12.740 --> 00:46:16.094
some global function
factors into a product

00:46:16.094 --> 00:46:16.885
of local functions.

00:46:16.885 --> 00:46:20.030
The local functions are
represented by constraints.

00:46:20.030 --> 00:46:21.680
It's a nice way of
looking at things.

00:46:21.680 --> 00:46:24.837
It's not strictly
necessary for this course,

00:46:24.837 --> 00:46:25.920
so I haven't gone into it.

00:46:25.920 --> 00:46:29.220
But most of the literature now
is in terms of factor graphs.

00:46:32.050 --> 00:46:34.650
In other fields, we
have related topics,

00:46:34.650 --> 00:46:39.640
like Markov graphs or
Markov random fields.

00:46:39.640 --> 00:46:43.260
And slight
differences in choices

00:46:43.260 --> 00:46:45.920
have been made in these areas.

00:46:45.920 --> 00:46:48.470
Here, you see we're
putting all the variables

00:46:48.470 --> 00:46:52.800
on the edges and the
constraints as nodes in a graph.

00:46:52.800 --> 00:46:54.490
And that's the
normal graph style.

00:46:54.490 --> 00:46:56.720
In Markov graphs,
they make exactly

00:46:56.720 --> 00:46:58.300
the opposite convention.

00:46:58.300 --> 00:47:02.840
They make the variables into
nodes, which is probably

00:47:02.840 --> 00:47:05.120
what most people would
do if they sat down

00:47:05.120 --> 00:47:08.280
to start doing a graphical
model of something.

00:47:08.280 --> 00:47:12.302
And they make the
constraints into edges.

00:47:12.302 --> 00:47:14.010
But there are several
problems with that.

00:47:14.010 --> 00:47:16.750
One is that if you have a
constraint of larger degree

00:47:16.750 --> 00:47:19.380
than 2, then you really
need a hyper-edge,

00:47:19.380 --> 00:47:22.180
which is represented
by a clique.

00:47:22.180 --> 00:47:26.350
And cliques sometimes
introduce artifacts

00:47:26.350 --> 00:47:28.290
that you don't really want.

00:47:28.290 --> 00:47:31.040
So I think it's actually
an inferior style,

00:47:31.040 --> 00:47:34.200
but it's a very popular
style in some fields.

00:47:34.200 --> 00:47:38.260
Physics, particularly, for
indicating dependencies.

00:47:38.260 --> 00:47:40.670
All of these
graphical models are

00:47:40.670 --> 00:47:42.830
methods of indicating
dependencies

00:47:42.830 --> 00:47:46.480
and what's related to what.

00:47:46.480 --> 00:47:48.160
There's another style.

00:47:48.160 --> 00:47:50.680
It's very popular in
statistical inference,

00:47:50.680 --> 00:47:54.250
which is called
Bayesian networks.

00:47:54.250 --> 00:47:59.060
And this style is
similar to ours

00:47:59.060 --> 00:48:01.050
except the graphs are directed.

00:48:01.050 --> 00:48:04.440
Everything is a cause
and effect relationship.

00:48:04.440 --> 00:48:08.110
So you have one variable
that causes another variable.

00:48:08.110 --> 00:48:11.810
We've generally stayed away
from cause and effect here.

00:48:11.810 --> 00:48:16.440
We've used the behavioral
style, undirected graph.

00:48:16.440 --> 00:48:21.210
But Bayesian networks
have been hugely popular

00:48:21.210 --> 00:48:22.640
in statistical inference.

00:48:22.640 --> 00:48:27.400
And so the appendix is just
to explain the relationship

00:48:27.400 --> 00:48:29.830
between these various styles.

00:48:29.830 --> 00:48:31.800
And obviously, you can
work with any of them.

00:48:31.800 --> 00:48:35.040
It's a matter of preference.

00:48:35.040 --> 00:48:40.200
OK, any other questions
about Chapter 11?

00:48:40.200 --> 00:48:43.800
This was the basic introduction
to codes on graphs.

00:48:43.800 --> 00:48:46.650
Chapter 11 is, what
are codes on graphs?

00:48:46.650 --> 00:48:50.250
Chapter 12 is, how do we
decode codes on graphs?

00:48:50.250 --> 00:48:54.080
Which is the sum
product algorithm.

00:48:54.080 --> 00:48:55.530
So we'll move into that now.

00:48:55.530 --> 00:48:58.710
And then Chapter
13, we'll actually

00:48:58.710 --> 00:49:02.610
talk about classes of
capacity-approaching codes.

00:49:02.610 --> 00:49:12.340
So these three
chapters certainly

00:49:12.340 --> 00:49:19.600
form a closely related set that
should all be read together.

00:49:19.600 --> 00:49:24.545
OK, what are the basic facts
about sum product algorithm?

00:49:27.270 --> 00:49:39.520
I'm going to describe it as a
method of doing A Posteriori

00:49:39.520 --> 00:49:45.980
Probability decoding,
APP decoding, initially

00:49:45.980 --> 00:49:47.270
on cycle-free graphs.

00:49:59.690 --> 00:50:03.370
I'll define-- what do I mean
by a posteriori probability

00:50:03.370 --> 00:50:04.750
decoding?

00:50:04.750 --> 00:50:08.460
Then, we'll develop the
algorithm for cycle-free graphs

00:50:08.460 --> 00:50:12.215
where the theory is
kind of complete.

00:50:20.600 --> 00:50:22.760
There are various theorems
that you can easily

00:50:22.760 --> 00:50:26.030
prove about the performance
of the algorithm here.

00:50:26.030 --> 00:50:29.420
The algorithm completes in
a finite amount of time.

00:50:29.420 --> 00:50:31.620
Basically, equal
the amount of time

00:50:31.620 --> 00:50:33.450
it takes to get from
one side of the graph

00:50:33.450 --> 00:50:35.380
to the other, which in
graph theory language

00:50:35.380 --> 00:50:38.150
is called the
diameter of the graph.

00:50:38.150 --> 00:50:43.870
And it does exact a posteriori
probability decoding.

00:50:43.870 --> 00:50:47.150
So if that's actually
what you want to do,

00:50:47.150 --> 00:50:49.450
this is a good way of doing it.

00:50:49.450 --> 00:50:54.740
If you do this on a trellis
for instance, it's the, by now,

00:50:54.740 --> 00:51:00.910
well-known BCJR algorithm,
for Bahl, Cocke, Jelinek,

00:51:00.910 --> 00:51:07.380
and Raviv, who published
this algorithm back in 1973.

00:51:07.380 --> 00:51:10.430
Probably known in other
fields before that.

00:51:10.430 --> 00:51:14.491
Actually, you can sort of
find it in Gallagher's thesis.

00:51:14.491 --> 00:51:18.910
You can certainly find the
sum product algorithm there.

00:51:18.910 --> 00:51:23.830
Anyway, the BCJR
algorithm is now

00:51:23.830 --> 00:51:27.970
widely used as a component
of these decoding algorithms

00:51:27.970 --> 00:51:31.230
for turbo codes, in particular,
and capacity-approaching codes

00:51:31.230 --> 00:51:37.235
in general because it's an
exact way of decoding a trellis.

00:51:37.235 --> 00:51:39.660
And probably fast.

00:51:39.660 --> 00:51:43.450
Although, we'll find it's
more complex than the Viterbi

00:51:43.450 --> 00:51:44.010
algorithm.

00:51:44.010 --> 00:51:47.410
So if you are just interested
in decoding a trellis,

00:51:47.410 --> 00:51:50.190
you would probably use
the Viterbi algorithm.

00:51:50.190 --> 00:51:52.850
But you use the BCJR
algorithm when you actually

00:51:52.850 --> 00:51:56.630
want to compute the a
posteriori probabilities.

00:51:56.630 --> 00:51:58.695
And that's what
you want when it's

00:51:58.695 --> 00:51:59.820
part of a larger algorithm.

00:52:03.360 --> 00:52:07.570
So we'll do all the development
on cycle-free graph.

00:52:07.570 --> 00:52:09.350
Where we really,
eventually, want to use

00:52:09.350 --> 00:52:11.020
it is on graphs with cycles.

00:52:15.510 --> 00:52:20.000
And there are
hardly any theorems.

00:52:20.000 --> 00:52:22.530
In this case, you find yourself
going around the cycles.

00:52:25.500 --> 00:52:27.280
So there are all kinds
of new questions.

00:52:27.280 --> 00:52:28.050
How do you start?

00:52:28.050 --> 00:52:28.940
When do you stop?

00:52:28.940 --> 00:52:31.940
What are its
convergence properties?

00:52:31.940 --> 00:52:36.325
It becomes an iterative
and approximate algorithm.

00:52:41.220 --> 00:52:45.680
For instance, when I said that
decoding of that simple, little

00:52:45.680 --> 00:52:48.730
reduced Hadamard transform
realization for the 8, 4,

00:52:48.730 --> 00:52:50.260
4 code, which had
cycles in it, I

00:52:50.260 --> 00:52:54.120
said it was a couple of tenths
of [? db ?] sub-optimum.

00:52:54.120 --> 00:52:56.960
That was obtained by
running this algorithm

00:52:56.960 --> 00:53:00.140
on that little graph with
cycles until it seemed

00:53:00.140 --> 00:53:02.770
to have converged to something
and plotting its performance.

00:53:02.770 --> 00:53:05.970
And it's not quite as
good as the performance

00:53:05.970 --> 00:53:12.250
would be on a cycle-free
graph like this one.

00:53:12.250 --> 00:53:16.660
All right, so you give up
something in performance.

00:53:16.660 --> 00:53:23.080
And you give up-- it
becomes an algorithm

00:53:23.080 --> 00:53:25.040
that runs indefinitely.

00:53:25.040 --> 00:53:28.210
And you'd have to figure
out some stopping criterion

00:53:28.210 --> 00:53:28.850
and so forth.

00:53:28.850 --> 00:53:32.770
All together, it's much
more, what are we doing here?

00:53:32.770 --> 00:53:34.910
We can't really say
with any precision what

00:53:34.910 --> 00:53:37.100
we're doing here in most cases.

00:53:37.100 --> 00:53:40.520
Although, in Chapter 13, I'll
give some cases where we can

00:53:40.520 --> 00:53:43.990
say quite precisely
what it's doing.

00:53:43.990 --> 00:53:46.710
But it becomes much
harder to analyze.

00:53:46.710 --> 00:53:50.520
But in coding, this is
actually what you want to do.

00:53:50.520 --> 00:53:52.810
This is what works.

00:53:52.810 --> 00:53:55.615
Just try it on a graph with
cycles and hope for the best.

00:53:58.290 --> 00:54:02.900
And because we design
our own graphs in coding,

00:54:02.900 --> 00:54:05.990
we can design them so that
the algorithm works very well.

00:54:05.990 --> 00:54:07.440
Basically, we want
to design them

00:54:07.440 --> 00:54:12.790
to have cycles, but extremely
large cycles, large girth.

00:54:12.790 --> 00:54:15.740
And then the algorithm
tends not to get confused

00:54:15.740 --> 00:54:18.650
and to be near optimal.

00:54:18.650 --> 00:54:21.060
Because it takes a long
time for information

00:54:21.060 --> 00:54:22.690
to propagate around
the cycle and it's

00:54:22.690 --> 00:54:25.110
pretty attenuated
when it comes back in.

00:54:25.110 --> 00:54:28.460
Whereas, with the short
cycles that you saw here,

00:54:28.460 --> 00:54:30.120
that's not good.

00:54:30.120 --> 00:54:31.910
And in many of
these other fields,

00:54:31.910 --> 00:54:35.330
like physics for instance,
image processing,

00:54:35.330 --> 00:54:38.830
you're dealing with graphs
that look like grids.

00:54:41.720 --> 00:54:44.060
Something like that.

00:54:44.060 --> 00:54:47.300
You have lots and
lots of short cycles.

00:54:47.300 --> 00:54:49.949
And then just applying
the sum product algorithm

00:54:49.949 --> 00:54:51.490
to a graph that
looks like that won't

00:54:51.490 --> 00:54:53.010
work well at all
because you have

00:54:53.010 --> 00:54:54.820
these little, short cycles.

00:54:57.390 --> 00:54:58.890
So if you can design
your own graph,

00:54:58.890 --> 00:55:00.620
this is a good thing to do.

00:55:04.050 --> 00:55:05.770
OK, so let's get into it.

00:55:09.160 --> 00:55:16.450
First of all, let
me describe what

00:55:16.450 --> 00:55:20.940
I mean by a posteriori
probability decoding.

00:55:20.940 --> 00:55:23.930
This means simply that
for every variable,

00:55:23.930 --> 00:55:29.180
both internal variables
and external variables,

00:55:29.180 --> 00:55:33.210
we want to compute the
probability of that variable.

00:55:33.210 --> 00:55:38.960
Let's call it probability that
some external variable y, k

00:55:38.960 --> 00:55:42.810
takes on some
particular value y, k

00:55:42.810 --> 00:55:51.210
given the entire
received sequence.

00:55:51.210 --> 00:55:55.190
OK, so you can think in
terms of the 8, 4, 4 code.

00:55:55.190 --> 00:55:56.519
We're going to take this code.

00:55:56.519 --> 00:55:59.060
We're going to transmit it over
some kind of a noisy channel.

00:55:59.060 --> 00:56:03.310
We're going to get
a received sequence.

00:56:03.310 --> 00:56:06.110
And from that, we want to
figure out the probability

00:56:06.110 --> 00:56:10.210
that any particular input
bit is equal to a 0 or a 1.

00:56:10.210 --> 00:56:13.760
That might be more precisely
what we're going to do.

00:56:16.420 --> 00:56:28.080
So we really want
to develop a vector

00:56:28.080 --> 00:56:33.610
consisting of the values of
this a posteriori probability

00:56:33.610 --> 00:56:36.830
for all possible
values of the variable.

00:56:36.830 --> 00:56:39.360
This is sometimes
called a message.

00:56:44.390 --> 00:56:46.920
But this is what we're
going to try to compute.

00:56:49.780 --> 00:56:52.110
If we're actually
decoding a code

00:56:52.110 --> 00:56:54.490
and we get the a
posteriori probabilities

00:56:54.490 --> 00:57:00.100
for all these variables,
then at the end of the day,

00:57:00.100 --> 00:57:01.660
what do we want to do?

00:57:01.660 --> 00:57:06.310
We want to make hard
decisions on each of these

00:57:06.310 --> 00:57:07.830
and decide what was sent.

00:57:11.560 --> 00:57:13.400
It's quite easy to show.

00:57:13.400 --> 00:57:16.380
Actually, I'm not sure
I do this in the notes.

00:57:16.380 --> 00:57:18.740
That a posteriori
probability decoding

00:57:18.740 --> 00:57:21.530
is what you want
to do to minimize

00:57:21.530 --> 00:57:24.180
the probability of bit error.

00:57:24.180 --> 00:57:25.680
What's the probability
of bit error?

00:57:25.680 --> 00:57:30.010
It's the probability that y is
not what you guessed it was.

00:57:30.010 --> 00:57:33.790
In other words, it's 1 minus
the max of all of these.

00:57:33.790 --> 00:57:35.470
You would choose the max.

00:57:35.470 --> 00:57:37.430
The probability of it
actually being the max

00:57:37.430 --> 00:57:40.590
is given by that a
posteriori probability.

00:57:40.590 --> 00:57:42.560
And 1 minus that
is the probability

00:57:42.560 --> 00:57:43.560
that it's anything else.

00:57:43.560 --> 00:57:46.300
And that's the bit
error probability.

00:57:46.300 --> 00:57:52.240
So doing maximum a posteriori
probability decoding

00:57:52.240 --> 00:57:57.020
on a bit-wise basis is the way
of minimizing the bit error

00:57:57.020 --> 00:57:59.232
probability.

00:57:59.232 --> 00:58:00.690
Now, that isn't
actually what we've

00:58:00.690 --> 00:58:02.720
been doing in the
Viterbi algorithm,

00:58:02.720 --> 00:58:04.790
for instance, or in
general in our decoding.

00:58:04.790 --> 00:58:08.830
We've said, we want to minimize
the probability of decoding

00:58:08.830 --> 00:58:11.400
any code word on
a sequence basis.

00:58:11.400 --> 00:58:15.350
We want to minimize the
probability of error

00:58:15.350 --> 00:58:17.020
in decoding the whole sequence.

00:58:17.020 --> 00:58:19.630
So these are not
exactly the same thing.

00:58:19.630 --> 00:58:24.600
In fact, if you do
max APP decoding,

00:58:24.600 --> 00:58:27.230
you may not even
get a code word.

00:58:27.230 --> 00:58:30.290
You'll make individual
decisions on each of these bits,

00:58:30.290 --> 00:58:32.670
and that may not
actually be a code word.

00:58:32.670 --> 00:58:36.000
There's nothing that
requires it to be.

00:58:36.000 --> 00:58:37.489
Usually, it is.

00:58:37.489 --> 00:58:38.780
Usually, there's no difference.

00:58:38.780 --> 00:58:43.090
These algorithms tend to--
at least in relatively good

00:58:43.090 --> 00:58:45.920
signal noise, reach
ratio situations.

00:58:45.920 --> 00:58:49.410
They tend to both
give the same answer.

00:58:49.410 --> 00:58:52.705
And the probability of
bit error or word error

00:58:52.705 --> 00:58:55.890
is not markedly
different whichever one

00:58:55.890 --> 00:58:58.400
you use as a practical matter.

00:58:58.400 --> 00:59:01.210
But it's certainly
possible in principle

00:59:01.210 --> 00:59:07.290
that since maximum likelihood
sequence decoding gives you

00:59:07.290 --> 00:59:09.550
the lowest word
error probability,

00:59:09.550 --> 00:59:13.520
maximum APP bit-wise
decoding has

00:59:13.520 --> 00:59:17.680
got to give you a higher
word error probability.

00:59:17.680 --> 00:59:21.566
So in any particular
situation, you've

00:59:21.566 --> 00:59:23.440
got to decide which is
more important to you,

00:59:23.440 --> 00:59:25.070
minimizing the bit
error probability

00:59:25.070 --> 00:59:29.500
or minimizing the probability
of any error in the whole block.

00:59:29.500 --> 00:59:31.655
Actually, generally, the
latter is what you prefer.

00:59:34.550 --> 00:59:37.940
That governs your minimum
time between decoding

00:59:37.940 --> 00:59:39.310
errors and things like that.

00:59:45.670 --> 00:59:49.120
So that's a general discussion
of what APP decoding is.

00:59:49.120 --> 00:59:51.030
Why do we want it here?

00:59:51.030 --> 00:59:54.380
Because when we get to
capacity-approaching codes,

00:59:54.380 --> 00:59:57.190
we're going to talk
about big codes that

00:59:57.190 --> 01:00:00.420
are made up of smaller codes.

01:00:00.420 --> 01:00:03.880
The great thing about this
graphical realization-- this

01:00:03.880 --> 01:00:08.340
is realization of
linear systems.

01:00:08.340 --> 01:00:11.040
You realize a big system
by putting together blocks

01:00:11.040 --> 01:00:13.090
representing little systems.

01:00:13.090 --> 01:00:14.820
That's a good way
of designing things

01:00:14.820 --> 01:00:16.260
for all kinds of reasons.

01:00:16.260 --> 01:00:17.676
It's going to be
the way we design

01:00:17.676 --> 01:00:19.480
capacity-approaching codes.

01:00:19.480 --> 01:00:21.950
And we're going
to want a decoding

01:00:21.950 --> 01:00:25.210
algorithm for a
part of a system.

01:00:25.210 --> 01:00:28.470
Say, this 6, 3 part,
or say something else.

01:00:28.470 --> 01:00:32.220
Say this whole thing is
a part of the system.

01:00:32.220 --> 01:00:34.840
This whole thing is just 8, 4.

01:00:37.620 --> 01:00:43.480
That yields soft probabilistic
information in the output

01:00:43.480 --> 01:00:46.030
that we can then feed
into decoding algorithms

01:00:46.030 --> 01:00:48.030
for other parts of the system.

01:00:48.030 --> 01:00:49.760
So we won't want to
make hard decisions.

01:00:49.760 --> 01:00:53.115
We'll want to feed the whole
message, the whole vector

01:00:53.115 --> 01:00:57.550
of a posteriori
probabilities from one--

01:00:57.550 --> 01:00:59.930
a partially decoded
part of the system.

01:00:59.930 --> 01:01:03.930
We'll feed that into some other
part and then go decode that.

01:01:03.930 --> 01:01:05.590
For instance, in
turbo codes we'll

01:01:05.590 --> 01:01:10.000
decode-- we'll make it up out
of several convolutional codes.

01:01:10.000 --> 01:01:11.720
There will be several trellises.

01:01:11.720 --> 01:01:15.160
We'll do BCJR decoding
of one trellis.

01:01:15.160 --> 01:01:18.330
That will give us a bunch of
a posteriori probabilities

01:01:18.330 --> 01:01:21.620
about the bits that are
involved in that trellis,

01:01:21.620 --> 01:01:25.410
then we'll pass them back
to some other trellis

01:01:25.410 --> 01:01:27.995
decoder that's
decoding-- apparently,

01:01:27.995 --> 01:01:30.500
a different trellis,
but involving

01:01:30.500 --> 01:01:33.430
some of the same bits.

01:01:33.430 --> 01:01:39.140
So maybe I shouldn't go on too
long with these generalities,

01:01:39.140 --> 01:01:43.530
but that's why APP
decoding is going

01:01:43.530 --> 01:01:45.510
to be a good thing for us to do.

01:01:45.510 --> 01:01:48.230
And it's not because
ultimately we

01:01:48.230 --> 01:01:50.780
want to do APP decoding
of a single code.

01:01:50.780 --> 01:01:53.130
It's because we want
the soft decisions

01:01:53.130 --> 01:01:55.660
to feed into other decoding.

01:02:01.400 --> 01:02:01.910
All right.

01:02:06.490 --> 01:02:08.620
How do we compute this?

01:02:08.620 --> 01:02:12.180
Probability that
yk equals yk given

01:02:12.180 --> 01:02:19.635
r is going to be a basic
probability theory.

01:02:19.635 --> 01:02:27.640
It's just the sum over
all, let's say, code words

01:02:27.640 --> 01:02:35.600
in which yk equals yk of the
probability of those code

01:02:35.600 --> 01:02:37.240
words.

01:02:37.240 --> 01:02:40.790
So sum over all y.

01:02:40.790 --> 01:02:44.110
I'm just making
up notation here.

01:02:44.110 --> 01:02:46.890
In our code, we're
going to have some code

01:02:46.890 --> 01:02:50.330
words in the code
that have yk equals 0

01:02:50.330 --> 01:02:52.530
and some code words
that have yk equals 1.

01:02:52.530 --> 01:02:55.190
So we'll divide the code
words into two sets,

01:02:55.190 --> 01:02:58.200
the ones that are
compatible with yk equals 0

01:02:58.200 --> 01:03:01.110
and the ones that are
compatible with yk equals 1.

01:03:08.940 --> 01:03:12.030
The a posteriori
probability of yk

01:03:12.030 --> 01:03:16.450
equals yk is simply the sum of
the a posteriori probabilities

01:03:16.450 --> 01:03:18.960
that we get of any
of those code words.

01:03:24.000 --> 01:03:29.190
I don't have to make
proportional sign there.

01:03:29.190 --> 01:03:43.420
But by Bayes' law, each of
these-- Bayes' p of y given r

01:03:43.420 --> 01:03:46.150
is proportional to.

01:03:46.150 --> 01:03:48.370
Here's a sign that
a lot of people use.

01:03:48.370 --> 01:03:50.530
I'm not completely
comfortable with it yet,

01:03:50.530 --> 01:03:53.850
but that means proportional to.

01:03:53.850 --> 01:03:55.640
Prop 2 in [INAUDIBLE].

01:03:58.570 --> 01:04:02.425
Probability of r given y.

01:04:02.425 --> 01:04:06.000
In other words, the a posteriori
probability of a code word

01:04:06.000 --> 01:04:09.331
is proportional to the
probability of r given y.

01:04:09.331 --> 01:04:09.830
Why?

01:04:09.830 --> 01:04:14.130
Because by Bayes', it's actually
equal to that times probability

01:04:14.130 --> 01:04:16.010
of y over probability of r.

01:04:16.010 --> 01:04:18.180
But this is the same
for all of them.

01:04:18.180 --> 01:04:20.162
If the code words
are all equiprobable,

01:04:20.162 --> 01:04:21.620
these are the same
for all of them.

01:04:21.620 --> 01:04:24.390
So it's just proportional
to the likelihood

01:04:24.390 --> 01:04:29.250
of getting r given
that you transmitted y.

01:04:29.250 --> 01:04:32.820
I'm sure you've seen
this many times before,

01:04:32.820 --> 01:04:33.990
so I won't do in detail.

01:04:33.990 --> 01:04:41.650
So this is proportional
to the same sum.

01:04:41.650 --> 01:04:45.210
y in this code word,
where y-- a portion

01:04:45.210 --> 01:04:49.330
of the code where yk equals y.

01:04:49.330 --> 01:04:54.029
p of r given y.

01:04:54.029 --> 01:04:54.695
But that's easy.

01:04:58.960 --> 01:05:02.350
I'm assuming that I have
a memoryless channel here.

01:05:02.350 --> 01:05:04.330
That's been implicit
throughout this course,

01:05:04.330 --> 01:05:07.540
like the Additive White Gaussian
Noise channel or the Binary

01:05:07.540 --> 01:05:08.500
Symmetric channel.

01:05:08.500 --> 01:05:13.160
So this just breaks up into
a component-wise product.

01:05:13.160 --> 01:05:28.740
This is just the product
of p of-- over the k.

01:05:28.740 --> 01:05:32.805
p of rk given yk.

01:05:42.720 --> 01:05:51.780
And because yk has the same--
I should say k prime here.

01:05:51.780 --> 01:05:54.280
Sorry, that's very confusing.

01:05:54.280 --> 01:05:57.280
So this is over all k prime.

01:05:57.280 --> 01:06:00.130
But I can break this up.

01:06:00.130 --> 01:06:02.040
One of these terms
is going to be k.

01:06:02.040 --> 01:06:10.110
For that, yk is fixed at little
yk, at this particular value.

01:06:10.110 --> 01:06:16.420
So I can write this in turn
as this for k prime not equal

01:06:16.420 --> 01:06:17.690
to k.

01:06:17.690 --> 01:06:25.050
And then I get a common term
which is p of rk given yk.

01:06:29.450 --> 01:06:35.580
OK, so this is the expression
that I want to compute.

01:06:35.580 --> 01:06:37.130
Two comments.

01:06:37.130 --> 01:06:39.270
When I say "sum
product," it's really

01:06:39.270 --> 01:06:43.730
because this is just a
sum of products here.

01:06:43.730 --> 01:06:45.730
We see we want to
compute this actually

01:06:45.730 --> 01:06:47.850
rather nasty sum of products.

01:06:47.850 --> 01:06:49.750
We're looking for
an efficient way

01:06:49.750 --> 01:06:51.457
of computing the
sum of products.

01:06:51.457 --> 01:06:53.040
And we're going to
show how this could

01:06:53.040 --> 01:06:55.460
be computed by tracing
through the graph.

01:06:59.000 --> 01:06:59.920
That's one comment.

01:06:59.920 --> 01:07:06.740
The second comment is that in
the literature, particularly

01:07:06.740 --> 01:07:11.370
in the turbo code
literature, this has a name.

01:07:15.730 --> 01:07:19.380
This part, the direct
likelihood of the symbol given

01:07:19.380 --> 01:07:22.450
what you received
corresponding to that symbol,

01:07:22.450 --> 01:07:30.150
is called the intrinsic
information about y, k.

01:07:30.150 --> 01:07:34.270
And this part, which has
to do with, basically,

01:07:34.270 --> 01:07:38.650
the likelihood of yk given
all the other yk prime,

01:07:38.650 --> 01:07:41.120
for k prime not
equal to k, is called

01:07:41.120 --> 01:07:42.585
the extrinsic information.

01:07:48.480 --> 01:07:51.410
We basically take
these two parts,

01:07:51.410 --> 01:07:58.040
which we can compute separately,
and we multiply them together.

01:07:58.040 --> 01:08:00.870
And we get the
overall probability.

01:08:00.870 --> 01:08:05.120
So the probability consists
of a part due to rk

01:08:05.120 --> 01:08:07.940
and a part due to all
those other rk prime,

01:08:07.940 --> 01:08:10.400
which we're going to compute
by going through the graph.

01:08:13.960 --> 01:08:15.589
And I just introduced
this language

01:08:15.589 --> 01:08:17.005
because if you
read papers, you're

01:08:17.005 --> 01:08:18.838
going to see this
throughout the literature.

01:08:22.485 --> 01:08:22.984
OK.

01:08:27.960 --> 01:08:31.207
Let me start moving back.

01:08:31.207 --> 01:08:32.674
I never used this.

01:08:38.550 --> 01:08:41.729
Well, let me introduce
one more thing.

01:08:41.729 --> 01:08:44.210
So we want to compute
all these messages.

01:08:44.210 --> 01:08:48.189
In particular, we want to
compute the probability of yk

01:08:48.189 --> 01:08:50.600
equals yk given r.

01:08:50.600 --> 01:08:56.069
This part here is really the
r on k prime not equal to k.

01:08:56.069 --> 01:08:58.609
Terrible notation.

01:08:58.609 --> 01:09:01.720
The other r's.

01:09:01.720 --> 01:09:04.779
So that'll be a message.

01:09:04.779 --> 01:09:07.470
We also want to
compute similar things

01:09:07.470 --> 01:09:11.120
for all the internal variables,
which we have sometimes

01:09:11.120 --> 01:09:12.209
called state variables.

01:09:15.810 --> 01:09:17.500
And it breaks up
in the same way.

01:09:17.500 --> 01:09:21.500
And I won't take the trouble
to write out the expression.

01:09:21.500 --> 01:09:27.830
But again, now we
really want to go over

01:09:27.830 --> 01:09:31.712
all configurations which
have a particular state

01:09:31.712 --> 01:09:32.420
variable in them.

01:09:32.420 --> 01:09:36.160
So we're really going
over all behaviors.

01:09:36.160 --> 01:09:39.690
And we can compute
probabilities this way.

01:09:39.690 --> 01:09:41.430
I won't belabor that right now.

01:09:41.430 --> 01:09:45.729
So we want messages, really,
indicating the likelihood

01:09:45.729 --> 01:09:51.130
of each of the
external variables

01:09:51.130 --> 01:09:52.850
and each of the
internal variables.

01:09:52.850 --> 01:09:55.650
The whole set of
possible likelihoods,

01:09:55.650 --> 01:10:00.760
the APP vector, the message,
for both the symbol variables

01:10:00.760 --> 01:10:02.530
and the state variables.

01:10:02.530 --> 01:10:04.930
And that's what the
sum product algorithm

01:10:04.930 --> 01:10:06.409
is going to do for us.

01:10:09.691 --> 01:10:10.190
OK.

01:10:16.000 --> 01:10:21.930
So the sum product
algorithm is based

01:10:21.930 --> 01:10:27.290
on-- I discuss it three parts.

01:10:27.290 --> 01:10:36.820
One is the past future
decomposition of the code.

01:10:36.820 --> 01:10:44.050
Another is the actual
sum product update rule,

01:10:44.050 --> 01:10:47.186
which is based on
another decomposition.

01:10:50.870 --> 01:10:53.500
And finally, we need
an overall schedule.

01:11:00.520 --> 01:11:03.245
OK, so let me
discuss this first.

01:11:08.830 --> 01:11:11.870
Let's take a particular
state variable.

01:11:18.600 --> 01:11:22.170
I only mean internal variable.

01:11:22.170 --> 01:11:23.530
And here it is.

01:11:23.530 --> 01:11:30.190
Somewhere in our big
graph, it's an edge.

01:11:30.190 --> 01:11:31.530
Our big graph is cycle-free.

01:11:34.420 --> 01:11:37.645
So what's the special property
of edges in cycle-free graphs?

01:11:40.280 --> 01:11:41.400
Every edge is a cut-set.

01:11:45.920 --> 01:11:52.370
So if I take this edge, it
divides the whole graph--

01:11:52.370 --> 01:11:54.930
we've done this before--
into two parts, which

01:11:54.930 --> 01:11:59.750
we can call past and future.

01:11:59.750 --> 01:12:02.060
And if we were to
take out this edge,

01:12:02.060 --> 01:12:04.290
it would disconnect the
graph from these two parts.

01:12:04.290 --> 01:12:08.550
So there's no other connections
between the past and the future

01:12:08.550 --> 01:12:11.830
because the graph is cycle-free.

01:12:11.830 --> 01:12:20.500
And there's some set of
past observed variables, rp,

01:12:20.500 --> 01:12:25.020
and there's some set of
future observe variables, rf.

01:12:28.540 --> 01:12:30.990
So before, I called
this yp and yf.

01:12:30.990 --> 01:12:36.570
These are the received
observations corresponding

01:12:36.570 --> 01:12:39.100
to all the past variables,
the received observations

01:12:39.100 --> 01:12:40.855
corresponding to all
the future variables.

01:12:46.740 --> 01:12:50.920
And the past future
decomposition

01:12:50.920 --> 01:12:53.472
is basically like this here.

01:12:57.700 --> 01:12:58.710
Let me write it out.

01:13:07.980 --> 01:13:12.910
It basically says that the
probability of-- that state

01:13:12.910 --> 01:13:14.560
variable has a
particular value given

01:13:14.560 --> 01:13:20.290
r is proportional
to the probability

01:13:20.290 --> 01:13:25.900
that the state variable has
a particular value given r--

01:13:25.900 --> 01:13:30.230
that part-- times the
probability that it has

01:13:30.230 --> 01:13:35.180
a particular value based
on the future part.

01:13:35.180 --> 01:13:38.270
And that we can simply--
when we take the vector,

01:13:38.270 --> 01:13:45.800
we can simply multiply
the two components

01:13:45.800 --> 01:13:48.830
that both have a
common value sk and we

01:13:48.830 --> 01:13:57.170
get the overall a posteriori
probability that this state

01:13:57.170 --> 01:13:59.940
variable has a particular value.

01:13:59.940 --> 01:14:19.740
Now, I'm sure there is a-- OK.

01:14:19.740 --> 01:14:23.880
In the notes, I go through
a little development

01:14:23.880 --> 01:14:26.420
to show this.

01:14:26.420 --> 01:14:30.500
The development is, first
of all, based on the fact

01:14:30.500 --> 01:14:40.100
that if I ask-- now in the
corresponding expression here,

01:14:40.100 --> 01:14:44.440
I want to have
the sum over all y

01:14:44.440 --> 01:14:50.250
and s such that the behaviot--
I'm going to just lose myself

01:14:50.250 --> 01:14:51.830
in notation here.

01:14:51.830 --> 01:14:53.890
We only have a few minutes left.

01:15:01.200 --> 01:15:07.100
This is basically going
to be over a code that's

01:15:07.100 --> 01:15:11.580
consistent with the state
having this particular variable.

01:15:17.460 --> 01:15:24.850
But when I cut this
code up like this,

01:15:24.850 --> 01:15:28.570
the overall code of
all configurations

01:15:28.570 --> 01:15:32.970
that are consistent with a
certain state variable sk

01:15:32.970 --> 01:15:36.970
is going to divide up
into a Cartesian product

01:15:36.970 --> 01:15:39.950
of the past part
of the code that's

01:15:39.950 --> 01:15:44.480
consistent with sk across the
future part of the code that's

01:15:44.480 --> 01:15:46.410
consistent with sk.

01:15:46.410 --> 01:15:48.220
And this was
precisely the argument

01:15:48.220 --> 01:15:53.510
we went through when we
developed the cut-set bound.

01:15:53.510 --> 01:15:57.320
That given sk, we
can take any past

01:15:57.320 --> 01:16:00.670
that's consistent with
sk and any future that's

01:16:00.670 --> 01:16:03.990
consistent with sk
and map them together.

01:16:03.990 --> 01:16:08.890
And in effect, fixing the state
does disconnect the graph.

01:16:08.890 --> 01:16:11.060
And this is the
basic Markov property

01:16:11.060 --> 01:16:14.850
that we used to develop
the cut-set bound.

01:16:14.850 --> 01:16:16.740
Really, just depending
on that again.

01:16:19.350 --> 01:16:30.400
Plus the basic Cartesian
product, lemma,

01:16:30.400 --> 01:16:42.470
which is that if I have a
sum over a Cartesian product,

01:16:42.470 --> 01:16:52.756
x cross of f of x g of y, what's
a fast way of computing that?

01:16:52.756 --> 01:16:53.672
AUDIENCE: [INAUDIBLE].

01:16:57.648 --> 01:16:59.640
PROFESSOR: Right.

01:16:59.640 --> 01:17:02.470
So this is trivial.

01:17:02.470 --> 01:17:06.750
But in fact, that's what we
do in many fast algorithms.

01:17:10.740 --> 01:17:14.020
It's just the sum over
the x of the f part,

01:17:14.020 --> 01:17:16.900
the sum over y of the g part.

01:17:16.900 --> 01:17:20.480
And you multiply
those two together.

01:17:20.480 --> 01:17:27.530
Just think of x and y being
on some rectangular array.

01:17:27.530 --> 01:17:29.950
Here, we compute
all the products

01:17:29.950 --> 01:17:32.290
corresponding to every
combination of x and y

01:17:32.290 --> 01:17:33.550
individually.

01:17:33.550 --> 01:17:36.210
Here, we first take
the sum of all those,

01:17:36.210 --> 01:17:38.200
the sum of all those,
the sum all those.

01:17:38.200 --> 01:17:41.690
We multiply them times
the sum of-- in this way.

01:17:41.690 --> 01:17:46.560
And we will get exactly the same
terms if we multiply them out.

01:17:46.560 --> 01:17:51.040
And so this is just a fast
way of computing this.

01:17:53.600 --> 01:17:57.790
And that's really what we're
doing when we divide this up

01:17:57.790 --> 01:18:00.280
in this way.

01:18:00.280 --> 01:18:02.460
I think I'm incapable
of explaining that

01:18:02.460 --> 01:18:05.350
any more clearly right now And
we're at the end of our time,

01:18:05.350 --> 01:18:07.310
anyway.

01:18:07.310 --> 01:18:12.580
So we do want to
not do problem 8.3.

01:18:12.580 --> 01:18:15.215
You have only one problem due
on Wednesday, which is 8.1.

01:18:15.215 --> 01:18:18.000
It will probably take
you a while, anyway.

01:18:18.000 --> 01:18:20.790
We didn't get to
the BCJR algorithm.

01:18:20.790 --> 01:18:24.120
We'll take this up again
and complete the sum product

01:18:24.120 --> 01:18:26.690
algorithm next time.

01:18:26.690 --> 01:18:28.700
See you Wednesday.