WEBVTT

00:00:00.500 --> 00:00:02.830
The following content is
provided under a Creative

00:00:02.830 --> 00:00:04.370
Commons license.

00:00:04.370 --> 00:00:06.670
Your support will help
MIT OpenCourseWare

00:00:06.670 --> 00:00:11.030
continue to offer high-quality
educational resources for free.

00:00:11.030 --> 00:00:13.660
To make a donation or
view additional materials

00:00:13.660 --> 00:00:17.610
from hundreds of MIT courses,
visit MIT OpenCourseWare

00:00:17.610 --> 00:00:18.520
at ocw.mit.edu.

00:00:25.693 --> 00:00:27.110
ELIZABETH NOLAN:
Last time we were

00:00:27.110 --> 00:00:33.500
working on this PKS assembly
line that makes a macrolide.

00:00:33.500 --> 00:00:37.580
And just as a review
of that, we left off

00:00:37.580 --> 00:00:42.230
having gone over the domains
and module architecture

00:00:42.230 --> 00:00:44.310
for this assembly line.

00:00:44.310 --> 00:00:48.890
So recall, each module
activates a given monomer.

00:00:48.890 --> 00:00:51.740
And we can use
depictions like this

00:00:51.740 --> 00:00:57.610
to show how the PKS builds
a growing polyketide chain.

00:00:57.610 --> 00:01:00.740
OK, and as you saw in
recitation last week,

00:01:00.740 --> 00:01:02.900
the actual structure of
one of these synthases

00:01:02.900 --> 00:01:05.180
is very different
than what's depicted

00:01:05.180 --> 00:01:08.510
by this left-to-right kind of
assembly-line depiction there.

00:01:08.510 --> 00:01:10.910
So you saw some amazing
conformational changes

00:01:10.910 --> 00:01:13.040
of the fatty acid synthase.

00:01:13.040 --> 00:01:15.410
And they're all different,
but just keep that in mind

00:01:15.410 --> 00:01:16.550
when thinking about these.

00:01:16.550 --> 00:01:19.760
So this sort of notation
is very helpful for us

00:01:19.760 --> 00:01:22.100
in terms of thinking about
how the biosynthesis goes,

00:01:22.100 --> 00:01:26.480
but it's not an accurate
representation of structure.

00:01:26.480 --> 00:01:29.210
OK, so where we left
off was with looking

00:01:29.210 --> 00:01:34.370
at how these optional
domains can do chemistry

00:01:34.370 --> 00:01:38.300
on the upstream monomer.

00:01:38.300 --> 00:01:42.620
And the last thing we're going
to do related to this assembly

00:01:42.620 --> 00:01:46.850
line is, one, ask how is
the polyketide released

00:01:46.850 --> 00:01:49.880
from the assembly line when
the biosynthesis is over.

00:01:49.880 --> 00:01:52.850
And then we'll just
do one exercise

00:01:52.850 --> 00:01:55.290
looking at the macrolide
and working backwards.

00:01:55.290 --> 00:01:58.790
So last time, we were looking
at the domain organization

00:01:58.790 --> 00:02:01.160
to determine what
sort of chemistry

00:02:01.160 --> 00:02:02.900
happens to a given monomer.

00:02:02.900 --> 00:02:04.610
We can do, effectively,
the opposite,

00:02:04.610 --> 00:02:07.550
looking at a natural
product and identify

00:02:07.550 --> 00:02:10.370
what those monomers and
properties of the assembly line

00:02:10.370 --> 00:02:11.450
are.

00:02:11.450 --> 00:02:16.610
OK, so in terms
of chain release,

00:02:16.610 --> 00:02:18.335
there are thioesterase domains.

00:02:28.090 --> 00:02:32.480
And these domains are
involved in chain release

00:02:32.480 --> 00:02:33.820
from the assembly line.

00:02:41.730 --> 00:02:44.660
So if you take a look
in the final module

00:02:44.660 --> 00:02:49.790
here, what we see at the end
is a TE for the thioesterase.

00:02:49.790 --> 00:02:56.210
OK, and so what happens
in the case of DEBS is,

00:02:56.210 --> 00:03:00.830
ultimately, the chain gets
transferred to a serine residue

00:03:00.830 --> 00:03:01.970
on the TE domain.

00:03:06.640 --> 00:03:11.520
OK, and I'm just going to
draw the polyketide like that.

00:03:11.520 --> 00:03:15.690
And then in this case
here, we remember

00:03:15.690 --> 00:03:23.790
we have the propionyl-CoA
from the loading module, so

00:03:23.790 --> 00:03:25.500
the starter unit.

00:03:25.500 --> 00:03:29.310
In this case, what happens is
there is a macrocyclization.

00:03:29.310 --> 00:03:31.770
So we can imagine
deprotonation--

00:03:31.770 --> 00:03:35.790
oh, excuse me, I forgot
the linkage here.

00:03:35.790 --> 00:03:41.130
So for this TE, we no longer
have the growing chain

00:03:41.130 --> 00:03:43.560
tethered by a Ppant arm.

00:03:43.560 --> 00:03:47.670
With the TE domain, it's
tethered to the serine residue.

00:03:47.670 --> 00:03:51.900
So it's transferred from the
thioester to this serine.

00:03:51.900 --> 00:03:54.405
And this here is just the
polyketide in between.

00:03:57.800 --> 00:04:00.150
I'm just abbreviating it.

00:04:00.150 --> 00:04:01.560
So we can have that.

00:04:01.560 --> 00:04:04.520
We can have attack and loss.

00:04:04.520 --> 00:04:09.570
OK, so in this case
what we end up with

00:04:09.570 --> 00:04:16.410
is the TE domain
plus a macrocycle.

00:04:21.459 --> 00:04:27.190
And so that's how we end up with
the structure as shown here.

00:04:27.190 --> 00:04:32.620
So some TE domains will result
in formation of a macrocycle.

00:04:32.620 --> 00:04:35.890
Some TE domains will catalyze
a hydrolytic release.

00:04:35.890 --> 00:04:37.450
And you get the linear chain.

00:04:37.450 --> 00:04:39.790
So you need to look at the
natural product structure.

00:04:39.790 --> 00:04:42.760
And based on that structure,
you can make an assessment

00:04:42.760 --> 00:04:46.150
as to how the TE works.

00:04:46.150 --> 00:04:49.450
And so that's also shown
in this depiction and one

00:04:49.450 --> 00:04:50.750
other depiction in the notes.

00:04:50.750 --> 00:04:53.980
So here, the entire
chain is drawn.

00:04:53.980 --> 00:04:56.020
And we're seeing
deprotonation here

00:04:56.020 --> 00:04:58.690
and then attack here
to give the macrocycle.

00:05:04.360 --> 00:05:13.270
So here is the
product of this DEBS.

00:05:13.270 --> 00:05:17.680
And what we're going to do as
a last exercise with the PKS

00:05:17.680 --> 00:05:21.190
is just look at this structure
and work through identifying

00:05:21.190 --> 00:05:23.740
the monomer units and
what optional domains

00:05:23.740 --> 00:05:27.430
acted on each monomer.

00:05:27.430 --> 00:05:31.960
And basically,
where can we start?

00:05:31.960 --> 00:05:37.210
So if the thioesterase
catalyzes a macrocyclization,

00:05:37.210 --> 00:05:42.730
that's an easy starting
point, because basically,

00:05:42.730 --> 00:05:47.530
the final monomer needs
to be involved there.

00:05:47.530 --> 00:05:50.200
And we know that
the only place we

00:05:50.200 --> 00:05:52.390
can get a structure like
this is from the starter,

00:05:52.390 --> 00:05:54.640
from that propionyl-CoA.

00:05:54.640 --> 00:05:59.320
So here, if we just
look, we have the monomer

00:05:59.320 --> 00:06:02.440
from module 0, the loading.

00:06:02.440 --> 00:06:04.870
And then as we
learned last time,

00:06:04.870 --> 00:06:08.770
each additional unit that
gets attached to the growing

00:06:08.770 --> 00:06:12.100
polyketide gives two
carbons, so two carbon units

00:06:12.100 --> 00:06:13.450
to the growing chain.

00:06:13.450 --> 00:06:18.300
So we can work our way
around by two carbons, 1, 2.

00:06:18.300 --> 00:06:25.060
OK, here we have module
1, another two carbons,

00:06:25.060 --> 00:06:51.060
module 2 here, module 3,
module 4, module 5, and here,

00:06:51.060 --> 00:06:51.790
module 6.

00:06:54.860 --> 00:06:57.110
So looking at a
structure, you can

00:06:57.110 --> 00:06:59.900
begin to dissect what
the assembly line will

00:06:59.900 --> 00:07:02.810
look like in terms of
the number of modules

00:07:02.810 --> 00:07:06.290
by counting C2 units to
the growing chain here.

00:07:06.290 --> 00:07:08.450
And then the other
thing we can do

00:07:08.450 --> 00:07:11.450
is look at the
functional group status

00:07:11.450 --> 00:07:14.030
and ask what types
of optional domains

00:07:14.030 --> 00:07:18.590
needed to be there in order to
give a given functional group.

00:07:18.590 --> 00:07:22.430
So for instance, in this
case, in module 1 here,

00:07:22.430 --> 00:07:23.910
we're seeing an OH group.

00:07:23.910 --> 00:07:27.770
So we know there had to be
the action of a keto reductase

00:07:27.770 --> 00:07:30.200
to reduce the ketone.

00:07:30.200 --> 00:07:33.410
Here we see we
have this carbonyl,

00:07:33.410 --> 00:07:35.960
so there was no optional domain.

00:07:35.960 --> 00:07:37.640
In this case, what happens?

00:07:37.640 --> 00:07:38.780
We have a methylene.

00:07:38.780 --> 00:07:42.480
So that ketone we started
with was fully reduced.

00:07:42.480 --> 00:07:45.270
So in this case, we
have the keto reductase,

00:07:45.270 --> 00:07:51.050
the dehydratase, and
the enoyl reductase.

00:07:51.050 --> 00:07:52.760
Again, here we can
look at this unit.

00:07:52.760 --> 00:07:56.140
We see an OH, which
tells us that there

00:07:56.140 --> 00:07:58.370
was action of a keto reductase.

00:07:58.370 --> 00:08:04.760
And here we have another OH,
so we have a keto reductase.

00:08:04.760 --> 00:08:11.850
And in this case, we have
none, this final one.

00:08:11.850 --> 00:08:14.540
And here, I didn't
write it, but none

00:08:14.540 --> 00:08:17.840
in terms of optional domains.
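
NOTE
Editor's sketch, not from the lecture: the retro-biosynthetic reading above (an OH implies KR; a carbonyl implies no optional domain; a methylene implies KR, DH, and ER) written as a lookup table. The names OPTIONAL_DOMAINS and infer_optional_domains are hypothetical, the "alkene" entry (KR plus DH) is added for completeness, and the example input is illustrative only.
```python
# Hypothetical lookup: functional group left at the beta carbon, mapped to
# the optional PKS domains that must have acted in that module.
OPTIONAL_DOMAINS = {
    "ketone": [],                     # no optional domain acted
    "hydroxyl": ["KR"],               # ketoreductase only
    "alkene": ["KR", "DH"],           # ketoreductase, then dehydratase
    "methylene": ["KR", "DH", "ER"],  # fully reduced
}
def infer_optional_domains(groups):
    """Return the optional domains implied by each functional group, in order."""
    return [OPTIONAL_DOMAINS[group] for group in groups]
# Illustrative input only, not a transcription of the structure on the slide.
example = ["hydroxyl", "ketone", "methylene"]
for group, domains in zip(example, infer_optional_domains(example)):
    print(group + ":", "+".join(domains) or "none")
```
An unrecognized functional group raises KeyError, which is a reasonable prompt to look for optional domains beyond these three.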

00:08:17.840 --> 00:08:19.730
So this can be pretty fun.

00:08:19.730 --> 00:08:21.320
This is a pretty
simple structure,

00:08:21.320 --> 00:08:23.040
but as structures
get more complex,

00:08:23.040 --> 00:08:26.030
you can map out what are
the optional domains there.

00:08:26.030 --> 00:08:28.970
And maybe you'll see, some other
unusual structural features

00:08:28.970 --> 00:08:33.020
will indicate there is
other optional domains

00:08:33.020 --> 00:08:34.471
beyond these three.

00:08:34.471 --> 00:08:35.929
And we're going to
see some of that

00:08:35.929 --> 00:08:37.935
as we move into the
non-ribosomal peptides.

00:08:40.710 --> 00:08:45.070
So that's given in the notes if
you want to practice on that.

00:08:45.070 --> 00:08:53.510
So with this, we're going
to transition into an NRPS

00:08:53.510 --> 00:08:55.700
and look at the
assembly line logic

00:08:55.700 --> 00:08:58.470
for non-ribosomal peptides.

00:08:58.470 --> 00:09:03.650
And so this is a slide from
last time, where we considered

00:09:03.650 --> 00:09:08.330
the starter units and
extender units for fatty acids

00:09:08.330 --> 00:09:10.520
and for polyketides.

00:09:10.520 --> 00:09:13.100
And so in
non-ribosomal peptides,

00:09:13.100 --> 00:09:17.510
we also have starter
units and extender units,

00:09:17.510 --> 00:09:20.150
but in the case of the
non-ribosomal peptide,

00:09:20.150 --> 00:09:21.640
as the name
indicates, we're going

00:09:21.640 --> 00:09:25.220
to be thinking about
amino acid monomers.

00:09:25.220 --> 00:09:28.250
And we're also going to be
considering examples where

00:09:28.250 --> 00:09:30.380
there is aryl acid monomers.

00:09:30.380 --> 00:09:34.310
So these NRPS
assembly lines will

00:09:34.310 --> 00:09:38.420
form polymers that incorporate
amino acid and aryl acid

00:09:38.420 --> 00:09:41.090
monomers.

00:09:41.090 --> 00:09:43.910
And this is another
slide from last time

00:09:43.910 --> 00:09:48.110
that is just summarizing
the core domains and then

00:09:48.110 --> 00:09:53.300
examples of optional domains
for the PKS and NRPS.

00:09:53.300 --> 00:09:58.520
So we learned last time
that for PKS, every module

00:09:58.520 --> 00:10:03.170
will have a KS and a T
domain, with the exception

00:10:03.170 --> 00:10:04.820
of the loading or
starter module.

00:10:04.820 --> 00:10:06.590
That has no keto
synthase, because there

00:10:06.590 --> 00:10:09.400
is no upstream group here.

00:10:09.400 --> 00:10:14.660
For NRPS, the core of
a module is a CAT trio.

00:10:14.660 --> 00:10:17.450
So condensation
domain, or C domain,

00:10:17.450 --> 00:10:20.210
this is the domain that's
going to catalyze peptide bond

00:10:20.210 --> 00:10:23.180
formation between
two of the monomers.

00:10:23.180 --> 00:10:25.130
We have an adenylation domain.

00:10:25.130 --> 00:10:26.570
And we'll see this
does chemistry

00:10:26.570 --> 00:10:30.110
similar to the aminoacyl
tRNA synthetases.

00:10:30.110 --> 00:10:33.140
And then we have
the T domains that

00:10:33.140 --> 00:10:37.320
are carrier proteins for the
monomers and growing chain.

00:10:37.320 --> 00:10:40.430
OK, and then within
a given NRPS module,

00:10:40.430 --> 00:10:42.590
there can also be
optional domains.

00:10:42.590 --> 00:10:44.720
And just two examples
are shown here.

00:10:44.720 --> 00:10:48.110
So maybe there is an
epimerization of an amino acid.

00:10:48.110 --> 00:10:50.240
Maybe there is a
methyl group and there

00:10:50.240 --> 00:10:52.660
needs to be methyltransferase
to put that on.

00:10:52.660 --> 00:10:55.130
There is a lot of
diversity that comes

00:10:55.130 --> 00:10:57.260
into these structures
on the basis

00:10:57.260 --> 00:10:59.540
of these optional domains.

00:10:59.540 --> 00:11:01.640
And just to highlight
that, I've presented

00:11:01.640 --> 00:11:05.840
here a list of possible
optional domains

00:11:05.840 --> 00:11:08.495
you can find in NRPS,
or for that matter,

00:11:08.495 --> 00:11:12.380
a PKS here, so all
sorts of things.

00:11:12.380 --> 00:11:16.670
Look at halogenase,
cyclase, reductase.

00:11:16.670 --> 00:11:21.080
There is tremendous structural
diversity that can occur.

00:11:21.080 --> 00:11:29.030
OK, so if we consider
the NRPS assembly line

00:11:29.030 --> 00:11:31.130
structure and notation
similar to what

00:11:31.130 --> 00:11:36.170
we did with the polyketide
synthases, what do we see?

00:11:49.840 --> 00:11:52.420
So I'll just draw
one with two modules

00:11:52.420 --> 00:11:54.850
here, although n
can indicate more.

00:11:54.850 --> 00:11:59.050
So initially what we have
here is a starting or loading

00:11:59.050 --> 00:12:08.280
module, OK, so for
instance, module 0.

00:12:11.520 --> 00:12:20.010
OK, here we have module
1, 2 for extenders.

00:12:24.130 --> 00:12:27.775
And here we have a
thioesterase for chain release.

00:12:31.390 --> 00:12:34.000
So we'll find that
in the final module

00:12:34.000 --> 00:12:39.190
like what we saw with the
polyketide synthase for DEB.

00:12:39.190 --> 00:12:45.190
So this whole thing can
be called an NRPS here.

00:12:47.890 --> 00:12:51.100
And what happens in
terms of the action

00:12:51.100 --> 00:12:53.920
of these different
core domains--

00:12:53.920 --> 00:12:57.410
so A, we have adenylation.

00:13:06.520 --> 00:13:11.110
OK, and what these
domains do are select

00:13:11.110 --> 00:13:22.160
and activate the amino
acid or aryl acid monomers.

00:13:28.920 --> 00:13:31.430
OK and after these
monomers are activated,

00:13:31.430 --> 00:13:34.580
the A domain also transfers
them to the T domain.

00:13:37.330 --> 00:13:39.660
And we'll go over the
chemistry in a minute.

00:13:45.960 --> 00:13:50.230
OK, this T domain is like
what we saw with the PKS.

00:13:50.230 --> 00:14:04.550
We can call it a thiolation
domain or a peptidyl carrier

00:14:04.550 --> 00:14:05.050
protein.

00:14:09.390 --> 00:14:15.840
So these T domains are going to
be modified with the Ppant arm,

00:14:15.840 --> 00:14:17.845
like what we saw for PKS.

00:14:21.420 --> 00:14:25.860
We have the C
domain, condensation.

00:14:35.030 --> 00:14:38.750
And so this domain catalyzes
peptide bond formation.

00:14:48.890 --> 00:14:53.070
And I'll just point out here
that, in contrast to the keto

00:14:53.070 --> 00:14:55.170
synthase we saw in PKS--

00:14:55.170 --> 00:14:58.130
so we saw the keto synthase
doing covalent catalysis

00:14:58.130 --> 00:15:00.350
via its cysteine residue--

00:15:00.350 --> 00:15:03.560
the condensation domains
of NRPS are involved

00:15:03.560 --> 00:15:05.230
in non-covalent catalysis.

00:15:08.720 --> 00:15:10.830
So that's just an
important distinction.

00:15:14.170 --> 00:15:20.500
The growing chain does not get
attached to the C domain here.

00:15:20.500 --> 00:15:23.640
And then we have the
TE, so thioesterase,

00:15:23.640 --> 00:15:25.735
as we saw, for chain release.

00:15:29.130 --> 00:15:35.542
And this can be hydrolytic
or macrocyclization.

00:15:45.540 --> 00:15:50.090
OK, so let's consider
just the example

00:15:50.090 --> 00:15:55.760
of an NRPS that is responsible
for synthesizing a tripeptide.

00:15:55.760 --> 00:15:57.145
So what is the net reaction?

00:16:13.670 --> 00:16:18.680
So imagine that we have
three amino acid monomers.

00:16:22.910 --> 00:16:26.090
And I'll just point out
here too that beyond knowing

00:16:26.090 --> 00:16:30.590
an epimerization domain
epimerizes an amino acid,

00:16:30.590 --> 00:16:33.920
you're not responsible
for stereochemistry

00:16:33.920 --> 00:16:36.320
in terms of the various
structures we'll

00:16:36.320 --> 00:16:37.640
look at going through here.

00:16:37.640 --> 00:16:40.430
So I'm just not drawing
stereochemistry here.

00:16:51.000 --> 00:16:52.890
So we have three
amino acid monomers.

00:17:00.690 --> 00:17:04.200
There is going to
be some NRPS that's

00:17:04.200 --> 00:17:06.869
responsible for formation
of the tripeptide.

00:17:06.869 --> 00:17:12.569
And what we'll see is
that making a trimer

00:17:12.569 --> 00:17:17.130
requires three ATPs, so
one ATP per amino acid

00:17:17.130 --> 00:17:48.070
or aryl acid monomer, giving us
three AMP plus three PPi here

00:17:48.070 --> 00:17:55.960
to give us our tripeptide plus
three water molecules here.

00:17:55.960 --> 00:17:59.420
OK, so how does this happen?

00:17:59.420 --> 00:18:02.080
How does the NRPS
take these monomers

00:18:02.080 --> 00:18:04.960
and build, say, a tripeptide?

00:18:04.960 --> 00:18:12.640
We're going to look at the ACV
synthetase as a model for this.

00:18:12.640 --> 00:18:19.900
And so the ACV
tripeptide is important.

00:18:19.900 --> 00:18:24.340
It forms the backbone of
antibiotics of the penicillin

00:18:24.340 --> 00:18:26.140
and cephalosporin classes.

00:18:26.140 --> 00:18:29.680
So many of these
are used clinically.

00:18:29.680 --> 00:18:32.920
So here are the
structures of penicillin N

00:18:32.920 --> 00:18:34.810
and the cephalosporin.

00:18:34.810 --> 00:18:36.670
So at first inspection,
you might not

00:18:36.670 --> 00:18:40.240
guess that these are effectively
built from a tripeptide,

00:18:40.240 --> 00:18:42.820
but what happens is that
a non-ribosomal peptide

00:18:42.820 --> 00:18:47.590
synthetase, the ACV synthetase,
is responsible for forming

00:18:47.590 --> 00:18:50.710
two amide bonds between the
three starters-- or the three

00:18:50.710 --> 00:18:51.730
monomers.

00:18:51.730 --> 00:18:53.680
And then there is
additional enzymes

00:18:53.680 --> 00:18:58.990
that are responsible for
modifying that peptide scaffold

00:18:58.990 --> 00:19:01.510
to give, say, this
four-five fused ring

00:19:01.510 --> 00:19:06.190
system or this four-six
fused ring system.

00:19:06.190 --> 00:19:10.180
OK, so what is the
overall reaction of this?

00:19:10.180 --> 00:19:13.120
So similar to having
these three amino acid

00:19:13.120 --> 00:19:17.770
monomers here, what we
have are aminoadipate.

00:19:17.770 --> 00:19:21.240
We have L-cysteine and L-valine.

00:19:21.240 --> 00:19:24.400
And the synthetase takes
these three monomers and makes

00:19:24.400 --> 00:19:26.900
this molecule here,
which is called ACV.

00:19:29.430 --> 00:19:33.190
And so if we look at the
synthetase in cartoon form,

00:19:33.190 --> 00:19:34.810
this is the cartoon.

00:19:34.810 --> 00:19:38.080
So we see a loading
module, so just AT.

00:19:38.080 --> 00:19:41.140
Similar to the PKS, there
is no catalytic domain

00:19:41.140 --> 00:19:43.750
to make a new bond
in the loading module

00:19:43.750 --> 00:19:45.640
because there is
nothing upstream.

00:19:45.640 --> 00:19:48.610
We see a module here, CAT.

00:19:48.610 --> 00:19:51.340
We have another CAT trio here.

00:19:51.340 --> 00:19:53.050
And then what's this?

00:19:53.050 --> 00:19:57.170
This is our first example of an
optional domain within an NRPS.

00:19:57.170 --> 00:20:00.160
OK so this E is
for epimerization.

00:20:00.160 --> 00:20:03.940
I mean what we'll see is that
the synthetase epimerizes

00:20:03.940 --> 00:20:07.900
L-valine to D-valine during
the synthesis, and then

00:20:07.900 --> 00:20:09.670
the thioesterase.

00:20:09.670 --> 00:20:13.270
So similar to what we did
with the PKS, for the NRPS,

00:20:13.270 --> 00:20:18.040
you can count T domains as a
way to identify the modules

00:20:18.040 --> 00:20:20.815
and to figure out how many
monomers are involved.
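
NOTE
Editor's sketch, not from the lecture: one way to apply the "count the T domains" rule in code. The domain string follows the ACV synthetase cartoon as described above (A-T, C-A-T, C-A-T-E, TE). The function name split_modules and the rule that the first domain and every C domain open a new module are assumptions made for illustration.
```python
def split_modules(domains):
    """Group an ordered list of NRPS domains into modules."""
    modules = []
    for domain in domains:
        # Assumption: the first domain and every C domain open a new module.
        if domain == "C" or not modules:
            modules.append([])
        modules[-1].append(domain)
    return modules
acvs_domains = "A-T-C-A-T-C-A-T-E-TE".split("-")
acvs_modules = split_modules(acvs_domains)
# "TE" is its own token, so it is not counted as a T domain.
t_count = sum(module.count("T") for module in acvs_modules)
print(acvs_modules)
print("T domains:", t_count)
```
For this input the sketch reports three modules and three T domains, one per monomer of the tripeptide.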

00:20:25.310 --> 00:20:29.390
I'll also just point out-- and
this builds upon Colin's comment

00:20:29.390 --> 00:20:30.910
from last time--

00:20:30.910 --> 00:20:34.280
is that this assembly
line is responsible

00:20:34.280 --> 00:20:36.610
only for the synthesis
of a tripeptide,

00:20:36.610 --> 00:20:38.300
but look at its size.

00:20:38.300 --> 00:20:41.270
It's greater than
450 kilodaltons.

00:20:41.270 --> 00:20:46.520
That's quite big-- so a large
enzyme, 10 different domains,

00:20:46.520 --> 00:20:50.000
all just for synthesis of
this one tripeptide here.

00:20:53.210 --> 00:20:54.350
So what happens?

00:21:03.580 --> 00:21:09.240
We're going to go over the
action of the A domains

00:21:09.240 --> 00:21:13.170
and the T domains first.

00:21:13.170 --> 00:21:15.600
And then we'll look at
a cartoon in the slides.

00:21:22.370 --> 00:21:25.330
So the first points
to make are that we

00:21:25.330 --> 00:21:30.800
need to have loading
of the assembly line.

00:21:30.800 --> 00:21:34.480
So amino acids need to be
selected and activated.

00:21:37.380 --> 00:21:40.420
And that's where these
A domains come in.

00:21:40.420 --> 00:21:41.710
So what's happening?

00:21:45.570 --> 00:21:50.390
So we have some
amino acid monomers--

00:21:50.390 --> 00:21:54.650
so maybe it's the L-cysteine,
for instance, or the L-valine--

00:21:54.650 --> 00:21:57.710
plus ATP.

00:21:57.710 --> 00:22:00.650
The A domain does
chemistry similar to what

00:22:00.650 --> 00:22:05.270
we saw with the aminoacyl
tRNA synthetases

00:22:05.270 --> 00:22:08.108
to form an activated
intermediate.

00:22:14.090 --> 00:22:17.150
So we get an aminoacyl
adenylate here.

00:22:17.150 --> 00:22:18.140
And then what happens?

00:22:18.140 --> 00:22:19.085
So the T domain--

00:22:24.120 --> 00:22:34.500
OK, and this T domain must
be modified by a PPTase,

00:22:34.500 --> 00:22:45.630
like what we saw for the
PKS, to have the Ppant arm.

00:22:49.550 --> 00:22:52.370
After we have activation
of the amino acid

00:22:52.370 --> 00:22:54.800
or aryl acid
monomer, the A domain

00:22:54.800 --> 00:22:58.220
is going to assist with transfer
of this monomer to the Ppant

00:22:58.220 --> 00:22:59.830
arm of the T domain here.

00:23:24.500 --> 00:23:33.040
OK, so we got an aminoacyl-S-T
covalently tethered

00:23:33.040 --> 00:23:37.240
via a thioester linkage.

00:23:37.240 --> 00:23:42.880
So one ATP is consumed
per monomer loaded.
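
NOTE
Editor's sketch, not from the lecture: the "one ATP per monomer loaded" bookkeeping as a small tally. It assumes a linear peptide with one A-domain activation per monomer; the function name nrps_loading_cost is hypothetical, and water is deliberately left out of the tally.
```python
def nrps_loading_cost(n_monomers):
    """Tally ATP use and peptide bonds for a linear peptide of n monomers."""
    if n_monomers < 1:
        raise ValueError("need at least one monomer")
    return {
        "ATP": n_monomers,                # one ATP per monomer activated
        "PPi": n_monomers,                # released when the adenylate forms
        "AMP": n_monomers,                # released on transfer to the Ppant thiol
        "peptide_bonds": n_monomers - 1,  # one bond per condensation step
    }
print(nrps_loading_cost(3))
```
For the ACV tripeptide (three monomers) this gives three ATP consumed and two peptide bonds formed.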

00:23:42.880 --> 00:23:47.020
And the ATP-PPi exchange
assay we discussed back

00:23:47.020 --> 00:23:50.450
in the translation module for
studying the aminoacyl tRNA

00:23:50.450 --> 00:23:56.560
synthetases is used all the
time to study new A domains

00:23:56.560 --> 00:23:58.930
and ask what amino
acid or aryl acid

00:23:58.930 --> 00:24:02.470
monomers do they activate here.

00:24:02.470 --> 00:24:05.390
So that assay comes up
in this type of work.

00:24:08.290 --> 00:24:17.700
So what happens then in terms
of formation of a peptide bond,

00:24:17.700 --> 00:24:26.315
we're going to consider
condensation by the C domains.

00:24:33.040 --> 00:24:35.330
And so let's just imagine--

00:24:35.330 --> 00:24:37.970
we're just going to
draw two modules.

00:24:37.970 --> 00:24:47.290
So we have a loading module and
then a first extender module.

00:24:47.290 --> 00:24:55.160
And the T domains have been
post-translationally modified

00:24:55.160 --> 00:24:56.750
with the Ppant arm.

00:24:56.750 --> 00:24:59.962
And the action of the A domains
has loaded the amino acids

00:24:59.962 --> 00:25:00.545
at this stage.

00:25:06.310 --> 00:25:10.180
OK, so we have some
amino acid loaded here.

00:25:10.180 --> 00:25:19.320
And then we have some
amino acid loaded here.

00:25:21.920 --> 00:25:23.980
OK, and what happens?

00:25:23.980 --> 00:25:26.280
We're going to have
nucleophilic attack

00:25:26.280 --> 00:25:30.900
from the alpha amino group
onto the upstream monomer

00:25:30.900 --> 00:25:32.345
and then transfer
of this monomer.

00:25:35.740 --> 00:25:38.455
And this occurs via the
action of the C domain.

00:25:58.560 --> 00:26:01.200
We have R2.

00:26:01.200 --> 00:26:09.280
And now we have formation
of our new peptide bond.

00:26:09.280 --> 00:26:16.780
Sorry, this is R2 here.

00:26:16.780 --> 00:26:20.290
And as I noted above, there
is no covalent catalysis

00:26:20.290 --> 00:26:22.240
with the C domain.

00:26:22.240 --> 00:26:25.230
Somehow it's helping to
bring these chains together

00:26:25.230 --> 00:26:28.300
and to allow this
nucleophilic attack to occur

00:26:28.300 --> 00:26:30.550
and to allow the monomer
to be transferred,

00:26:30.550 --> 00:26:32.680
but this unit is
never transferred

00:26:32.680 --> 00:26:33.870
to the C domain itself.

00:26:33.870 --> 00:26:34.370
Yeah?

00:26:34.370 --> 00:26:36.880
AUDIENCE: So is the C domain
responsible for deprotonating

00:26:36.880 --> 00:26:37.570
the NH2?

00:26:37.570 --> 00:26:39.790
Or is that just always--

00:26:39.790 --> 00:26:41.230
ELIZABETH NOLAN: Yeah, I don't--

00:26:41.230 --> 00:26:43.610
how this gets
deprotonated, I don't know.

00:26:43.610 --> 00:26:45.340
But this is back to
similar, like what

00:26:45.340 --> 00:26:47.230
we saw in the ribosome.

00:26:47.230 --> 00:26:50.650
And somehow, this alpha amino
group needs to be deprotonated.

00:26:50.650 --> 00:26:52.840
And there is something
in the environment

00:26:52.840 --> 00:26:55.660
of this machine that's
allowing that to happen,

00:26:55.660 --> 00:26:58.610
but whether it's the C
domain or something else,

00:26:58.610 --> 00:27:00.580
yeah, I don't know
the answer to that.

00:27:03.350 --> 00:27:08.890
So let's look at a cartoon of
this with this ACV synthetase.

00:27:08.890 --> 00:27:12.970
So here we have on
top the synthetase

00:27:12.970 --> 00:27:17.290
loaded with the
amino acid monomers.

00:27:17.290 --> 00:27:22.480
OK, so we see loading module
and then two extender modules.

00:27:22.480 --> 00:27:24.850
We have the aminoadipate.

00:27:24.850 --> 00:27:28.630
So it's not a canonical amino
acid, but it's amino-acid-like.

00:27:28.630 --> 00:27:32.260
We have the cysteine
and the valine.

00:27:32.260 --> 00:27:35.860
What happens as these
condensation reactions occur,

00:27:35.860 --> 00:27:37.600
we get chain elongation.

00:27:37.600 --> 00:27:40.660
So this is depicted
here in a similar manner

00:27:40.660 --> 00:27:43.210
to how that PKS assembly
line was depicted.

00:27:46.630 --> 00:27:52.540
So formation of two peptide
bonds, and then what happens?

00:27:52.540 --> 00:27:55.390
Ultimately, we
have chain transfer

00:27:55.390 --> 00:27:59.320
to a serine residue on
the thioesterase domain.

00:27:59.320 --> 00:28:02.420
And this is a case where the
thioesterase domain catalyzes

00:28:02.420 --> 00:28:04.510
this hydrolytic release.

00:28:04.510 --> 00:28:06.400
So as opposed to
macrocyclization,

00:28:06.400 --> 00:28:09.040
we're seeing activation
of a water molecule

00:28:09.040 --> 00:28:13.800
and attack, which releases
this ACV tripeptide.

00:28:13.800 --> 00:28:16.570
OK, and I've drawn the
ACV tripeptide here

00:28:16.570 --> 00:28:22.330
to indicate effectively
getting to this structure.

00:28:22.330 --> 00:28:26.020
So what happens after this
tripeptide is released

00:28:26.020 --> 00:28:28.810
from the assembly line, is that
there is additional enzymes

00:28:28.810 --> 00:28:30.850
that play a tailoring role.

00:28:30.850 --> 00:28:33.370
So like, for proteins we
talk about post-translational

00:28:33.370 --> 00:28:36.550
modification, for these
types of natural products,

00:28:36.550 --> 00:28:39.760
we talk about
post-assembly-line tailoring.

00:28:39.760 --> 00:28:44.760
And so in this case, there
is some enzymes such as IPNS,

00:28:44.760 --> 00:28:47.920
a non-heme iron enzyme
that's responsible

00:28:47.920 --> 00:28:53.170
for oxidative cyclization
to give the fused ring system

00:28:53.170 --> 00:28:59.250
characteristic of these
beta-lactams like isopenicillin

00:28:59.250 --> 00:29:00.760
N.

00:29:00.760 --> 00:29:04.360
We can look at this in
another cartoon form.

00:29:04.360 --> 00:29:07.750
So here is the holo form.

00:29:07.750 --> 00:29:10.720
Recall, we called
the T domains apo

00:29:10.720 --> 00:29:13.930
when the serine is not
post-translationally modified

00:29:13.930 --> 00:29:15.550
with the Ppant arm.

00:29:15.550 --> 00:29:18.730
And the T domains are
holo when the Ppant

00:29:18.730 --> 00:29:22.630
arm has been attached, as
indicated by this squiggle.

00:29:22.630 --> 00:29:26.620
We then have loading of
the amino acid monomers

00:29:26.620 --> 00:29:29.470
via the action of the A domains.

00:29:29.470 --> 00:29:33.410
So formation of
that aminoacyl AMP

00:29:33.410 --> 00:29:39.460
or aminoacyl adenylate intermediate,
so one monomer per module.

00:29:39.460 --> 00:29:43.240
We have chain elongation events
catalyzed by the condensation

00:29:43.240 --> 00:29:44.950
domain.

00:29:44.950 --> 00:29:50.200
We have chain transfer to
the TE domain as shown here,

00:29:50.200 --> 00:29:56.490
chain transfer, and
then chain release here,

00:29:56.490 --> 00:29:58.440
and then post-assembly-line
tailoring.

00:30:01.350 --> 00:30:04.380
So with that in mind,
what we're going to do now

00:30:04.380 --> 00:30:09.120
is look at another non-ribosomal
peptide synthetase.

00:30:09.120 --> 00:30:12.810
This one synthesizes
the backbone

00:30:12.810 --> 00:30:16.170
of the antibiotic vancomycin.

00:30:16.170 --> 00:30:22.110
And the structure of
vancomycin is shown here.

00:30:22.110 --> 00:30:24.690
This is an antibiotic
that's basically

00:30:24.690 --> 00:30:28.530
considered one of last resort
for bacterial infections.

00:30:28.530 --> 00:30:31.860
And there is a huge problem
of vancomycin resistance

00:30:31.860 --> 00:30:34.320
in the clinic these days.

00:30:34.320 --> 00:30:37.320
So at first glance,
this molecule

00:30:37.320 --> 00:30:40.170
might not look like
it's based on a peptide.

00:30:40.170 --> 00:30:41.670
But then if you
look more carefully,

00:30:41.670 --> 00:30:44.000
you see there is a
lot of amide bonds.

00:30:44.000 --> 00:30:47.880
And there is also some
other things going on

00:30:47.880 --> 00:30:50.250
to get this final structure.

00:30:50.250 --> 00:30:54.390
So effectively, the
backbone of vancomycin

00:30:54.390 --> 00:30:59.220
is a polypeptide
that's a 7-mer.

00:30:59.220 --> 00:31:02.640
So within this
heptapeptide scaffold,

00:31:02.640 --> 00:31:05.370
there are two
proteinogenic amino acids

00:31:05.370 --> 00:31:10.170
and five non-proteinogenic
amino acids here.

00:31:10.170 --> 00:31:15.750
And because we have seven
amino-acid-type monomers,

00:31:15.750 --> 00:31:18.810
we need an assembly line
that has seven modules, one

00:31:18.810 --> 00:31:23.010
module per amino acid monomer.

00:31:23.010 --> 00:31:27.780
And what we'll see is that these
seven modules are distributed

00:31:27.780 --> 00:31:32.200
over three proteins.

00:31:32.200 --> 00:31:36.960
We have a case of a thioesterase
catalyzing hydrolytic release.

00:31:36.960 --> 00:31:38.760
And then we're going
to need to think

00:31:38.760 --> 00:31:41.520
about what are the other
tailoring enzymes involved

00:31:41.520 --> 00:31:44.200
in giving vancomycin
this structure.

00:31:44.200 --> 00:31:46.230
So for instance, look here.

00:31:46.230 --> 00:31:49.800
We see there is this
aryl-aryl C-C bond.

00:31:49.800 --> 00:31:52.710
We see these
aryl-ether connections.

00:31:52.710 --> 00:31:55.590
And we also have
these sugars attached.

00:31:55.590 --> 00:31:57.480
And look, there is
also an N-methylation

00:31:57.480 --> 00:32:02.220
here of leucine 1,
so a lot happening.

00:32:02.220 --> 00:32:06.510
And the consequence of this
post-assembly-line tailoring

00:32:06.510 --> 00:32:09.990
is that what's a linear
7-mer polypeptide ends up

00:32:09.990 --> 00:32:13.860
having an architecture that's
described as a dome, so

00:32:13.860 --> 00:32:15.750
a dome-shaped architecture.

00:32:15.750 --> 00:32:20.340
And what vancomycin does is
that it blocks biosynthesis

00:32:20.340 --> 00:32:24.390
of the bacterial cell wall
by binding to a certain lipid

00:32:24.390 --> 00:32:26.760
precursor in that.

00:32:26.760 --> 00:32:30.360
So let's look at
the assembly line.

00:32:30.360 --> 00:32:33.210
And this is just an
overview of the tailoring

00:32:33.210 --> 00:32:35.040
I just told you about.

00:32:35.040 --> 00:32:37.440
And this is the
amino acid sequence

00:32:37.440 --> 00:32:39.510
in order of the
different monomers there

00:32:39.510 --> 00:32:44.290
and the identities of the
non-proteinogenic amino acids.

00:32:44.290 --> 00:32:48.040
So here is the assembly line.

00:32:48.040 --> 00:32:53.190
And if we take a look, we
have the loading module, AT.

00:32:53.190 --> 00:32:59.340
We can count the T domains to
give us the modules involved

00:32:59.340 --> 00:33:00.210
in extension.

00:33:00.210 --> 00:33:02.280
So there are seven T domains.

00:33:02.280 --> 00:33:05.460
And look, CAT, CAT, CAT--

00:33:05.460 --> 00:33:09.180
we have a number of optional
epimerization domains.

00:33:09.180 --> 00:33:12.750
And at the end, we
see this TE domain.

00:33:12.750 --> 00:33:17.130
And so you can walk through
and look at each monomer being

00:33:17.130 --> 00:33:19.890
attached to the growing chain.

00:33:19.890 --> 00:33:21.750
And then what do we see?

00:33:21.750 --> 00:33:24.510
What we see happening
down here is

00:33:24.510 --> 00:33:29.820
that when we have the
linear polypeptide attached

00:33:29.820 --> 00:33:32.610
to this module
here, what happens

00:33:32.610 --> 00:33:35.730
is that there is some
tailoring happening

00:33:35.730 --> 00:33:39.870
while the polypeptide is still
attached to the assembly line.

00:33:39.870 --> 00:33:43.510
So enzymes that are not
parts of the assembly line

00:33:43.510 --> 00:33:46.720
but are involved in the
biosynthesis can come in.

00:33:46.720 --> 00:33:49.890
And sometimes they'll
modify the chain

00:33:49.890 --> 00:33:53.460
when it's still attached
to the NRPS or PKS.

00:33:53.460 --> 00:33:55.230
Or sometimes they
do the chemistry

00:33:55.230 --> 00:33:57.840
after the chain is released.

00:33:57.840 --> 00:34:00.810
And often, this is a
question that people need

00:34:00.810 --> 00:34:02.390
to sort out experimentally.

00:34:02.390 --> 00:34:04.785
So in this case here,
we see that there

00:34:04.785 --> 00:34:08.730
is some oxidative cross-linking
that occurs while the chain is

00:34:08.730 --> 00:34:10.409
still attached to the T domain.

00:34:10.409 --> 00:34:12.780
So there is formation
of the aryl-ether bond

00:34:12.780 --> 00:34:15.460
and this aryl-aryl bond here.

00:34:15.460 --> 00:34:17.190
And then after the
chain is released

00:34:17.190 --> 00:34:20.190
in a hydrolytic
manner, what happens

00:34:20.190 --> 00:34:24.059
is the sugars get attached
post-assembly-line here.

00:34:24.059 --> 00:34:25.139
Do you have a question?

00:34:25.139 --> 00:34:27.030
AUDIENCE: Yeah, are
the enzymes ever

00:34:27.030 --> 00:34:29.690
actually in the assembly line,
like the optional domains

00:34:29.690 --> 00:34:30.190
of PKS?

00:34:30.190 --> 00:34:32.370
Or in this case,
is it always such

00:34:32.370 --> 00:34:34.260
that the enzymes are separate?

00:34:34.260 --> 00:34:36.570
ELIZABETH NOLAN: It will
depend on the assembly line.

00:34:36.570 --> 00:34:37.987
Yeah, so that's
something you need

00:34:37.987 --> 00:34:42.719
to look for in the assembly
line from the bioinformatics.

00:34:42.719 --> 00:34:45.690
So in this case, we're only
seeing epimerization domains

00:34:45.690 --> 00:34:48.270
in the assembly line,
but there can easily

00:34:48.270 --> 00:34:50.550
be methyltransferases,
or reductases,

00:34:50.550 --> 00:34:54.150
or cyclases-- any
number of possibilities

00:34:54.150 --> 00:34:57.810
within the assembly
line itself there.

00:34:57.810 --> 00:35:03.080
And these optional domains will
work on the upstream monomer.

00:35:03.080 --> 00:35:07.130
This is just an example of
the tailoring enzymes involved

00:35:07.130 --> 00:35:10.790
for cross-linking of
this vancomycin scaffold.

00:35:10.790 --> 00:35:15.560
In this case, there are
three cytochrome P450 enzymes

00:35:15.560 --> 00:35:23.360
that are needed in order
to make these cross-links.

00:35:23.360 --> 00:35:26.210
And that chemistry
is shown here to get

00:35:26.210 --> 00:35:29.330
to what's called the
vancomycin aglycone, which

00:35:29.330 --> 00:35:34.040
means that there are
no sugars attached.

00:35:34.040 --> 00:35:36.260
And I won't draw this
one on the board,

00:35:36.260 --> 00:35:38.960
but you can do a
similar exercise

00:35:38.960 --> 00:35:41.150
with this molecule or
any others in terms

00:35:41.150 --> 00:35:44.060
of identifying the monomer
units from the structure

00:35:44.060 --> 00:35:45.740
for yourself.

00:35:45.740 --> 00:35:48.980
So if we're looking
here, we have

00:35:48.980 --> 00:35:52.730
effectively the
N-terminus, so the starter,

00:35:52.730 --> 00:35:56.270
and then effectively
look at the peptide bonds

00:35:56.270 --> 00:35:59.900
and work your way through to
find the different monomers

00:35:59.900 --> 00:36:00.590
here.

00:36:00.590 --> 00:36:02.930
So by doing that, if you're
given a natural product,

00:36:02.930 --> 00:36:05.150
you can figure out
how many modules are

00:36:05.150 --> 00:36:07.160
needed in the assembly line.

00:36:07.160 --> 00:36:08.960
And you can also
make an assessment

00:36:08.960 --> 00:36:12.680
as to what other types of
chemistry might have to happen.

00:36:12.680 --> 00:36:15.440
And just keep in mind,
for something like this--

00:36:15.440 --> 00:36:18.410
let's just take this for an
example with this halogen.

00:36:18.410 --> 00:36:21.410
You might ask, well, is
that part of the monomer?

00:36:21.410 --> 00:36:24.800
Or is that atom incorporated
sometime down the road?

00:36:24.800 --> 00:36:26.450
OK, those are types
of questions people

00:36:26.450 --> 00:36:31.140
who explore biosynthesis of
these molecules think about.

00:36:31.140 --> 00:36:36.920
OK, so with that in mind, let's
take a look at some examples.

00:36:36.920 --> 00:36:41.720
And the questions are, what
kind of assembly line is this?

00:36:41.720 --> 00:36:43.240
How many monomers?

00:36:43.240 --> 00:36:47.880
And maybe there will be some
extra questions as we go.

00:36:47.880 --> 00:36:50.000
So here we have an
assembly line that's

00:36:50.000 --> 00:36:53.490
required to make an
antibiotic called daptomycin.

00:36:53.490 --> 00:36:55.580
And a company down the
street in Lexington

00:36:55.580 --> 00:36:59.570
called Cubist has done a lot of
work on this natural product.

00:36:59.570 --> 00:37:01.130
So how many monomers are here?

00:37:09.324 --> 00:37:13.270
Yeah, 13, right-- so
count these T domains

00:37:13.270 --> 00:37:15.280
based on what's seen here.

00:37:15.280 --> 00:37:16.675
How many optional domains?

00:37:19.900 --> 00:37:21.763
AUDIENCE: Three.

00:37:21.763 --> 00:37:23.680
ELIZABETH NOLAN: And
then what else do we see?

00:37:23.680 --> 00:37:26.650
So we see that
this assembly line

00:37:26.650 --> 00:37:32.980
is divided over three
proteins, effectively, here.

00:37:32.980 --> 00:37:35.140
And similar to what
we saw with DEBS,

00:37:35.140 --> 00:37:37.900
when we have a break
in the cartoon,

00:37:37.900 --> 00:37:41.260
that indicates a new
polypeptide chain.

00:37:41.260 --> 00:37:41.980
What's missing?

00:37:48.693 --> 00:37:49.735
AUDIENCE: Loading module.

00:37:49.735 --> 00:37:52.550
ELIZABETH NOLAN: Yeah, there
is no loading module here,

00:37:52.550 --> 00:37:55.710
right, no AT at the beginning.

00:37:55.710 --> 00:37:56.850
So what's going on?

00:38:02.600 --> 00:38:05.370
So in this case, I haven't
shown you a structure.

00:38:05.370 --> 00:38:08.880
It highlights there are always
exceptions to the rule.

00:38:08.880 --> 00:38:12.150
What happens here is that
the loading module actually

00:38:12.150 --> 00:38:18.390
loads a fatty acid, so not
a standard monomer for NRPS.

00:38:18.390 --> 00:38:21.100
So that fatty acid has
to come from somewhere.

00:38:21.100 --> 00:38:22.740
And you can think
about discussions

00:38:22.740 --> 00:38:26.280
here as to where that
may have come from.

00:38:26.280 --> 00:38:31.580
Look at how big this is,
624 kilodaltons, 783, 256--

00:38:31.580 --> 00:38:34.150
we're on the order
of 1.5 megadaltons.

00:38:34.150 --> 00:38:37.455
This is huge for a
13-mer natural product.

00:38:40.710 --> 00:38:43.410
What about this one?

00:38:43.410 --> 00:38:44.490
What do we see here?

00:38:49.630 --> 00:38:51.070
So this is a natural product--

00:38:51.070 --> 00:38:55.150
this makes the natural product
produced by Streptomyces

00:38:55.150 --> 00:38:57.850
that has insecticidal activity.

00:38:57.850 --> 00:38:59.320
And it kills parasitic worms.

00:38:59.320 --> 00:39:03.610
But anyhow, what kind
of natural product

00:39:03.610 --> 00:39:05.380
is produced by
this assembly line?

00:39:08.380 --> 00:39:10.480
We have a polyketide, right?

00:39:10.480 --> 00:39:11.335
How many modules?

00:39:28.328 --> 00:39:30.065
[INAUDIBLE] the T domains.

00:39:30.065 --> 00:39:31.190
AUDIENCE: [INAUDIBLE]

00:39:31.190 --> 00:39:32.900
ELIZABETH NOLAN: Yeah,
13 again, right--

00:39:32.900 --> 00:39:36.950
four proteins, 13
modules, so how many

00:39:36.950 --> 00:39:38.225
unmodified beta ketones?

00:39:41.715 --> 00:39:46.329
What would you want to look
for for an unmodified beta ketone?

00:39:46.329 --> 00:39:47.810
AUDIENCE: [INAUDIBLE]

00:39:47.810 --> 00:39:50.520
ELIZABETH NOLAN: Exactly, no
optional domains-- so how many

00:39:50.520 --> 00:39:52.350
of those?

00:39:52.350 --> 00:39:56.040
Yeah, right, so two
modules, we have one

00:39:56.040 --> 00:40:01.635
here and then one over here
with no optional domains.

00:40:05.250 --> 00:40:06.150
What about this one?

00:40:09.980 --> 00:40:11.730
This is for a molecule
called bleomycin.

00:40:11.730 --> 00:40:17.600
JoAnne is an expert on the
mechanism of this molecule.

00:40:17.600 --> 00:40:18.300
What's going on?

00:40:35.415 --> 00:40:36.810
OK, there is a lot going on.

00:40:36.810 --> 00:40:38.160
This one is very complicated.

00:40:38.160 --> 00:40:40.530
But in terms of
making an assessment

00:40:40.530 --> 00:40:44.988
about the type of biosynthetic
logic, what do we see here?

00:40:44.988 --> 00:40:47.360
AUDIENCE: [INAUDIBLE]

00:40:47.360 --> 00:40:49.020
ELIZABETH NOLAN:
Right, so what we

00:40:49.020 --> 00:40:52.290
see is that there is both
non-ribosomal peptide

00:40:52.290 --> 00:40:54.830
synthesis happening
and polyketide

00:40:54.830 --> 00:40:58.230
biosynthesis happening
in this assembly line.

00:40:58.230 --> 00:41:01.020
And that tells us that
the product metabolite

00:41:01.020 --> 00:41:03.750
is a PKS-NRPS hybrid.

00:41:03.750 --> 00:41:05.410
OK, so what do we see?

00:41:05.410 --> 00:41:07.740
We see all of these
CAT trios which

00:41:07.740 --> 00:41:10.740
are indicative of non-ribosomal
peptide biosynthesis.

00:41:10.740 --> 00:41:12.630
And then what's happening here?

00:41:12.630 --> 00:41:16.470
We have a module that's
using polyketide machinery.

00:41:16.470 --> 00:41:20.070
And then we go back to
non-ribosomal-peptide-based

00:41:20.070 --> 00:41:20.870
logic here.

00:41:23.440 --> 00:41:24.960
We have many proteins, right?

00:41:24.960 --> 00:41:28.800
So this assembly line is
divided over many proteins.

00:41:28.800 --> 00:41:32.100
And look, we see that even some
of the modules are divided up.

00:41:32.100 --> 00:41:35.100
So for instance,
this CAT trio is

00:41:35.100 --> 00:41:38.080
divided between two proteins.

00:41:38.080 --> 00:41:43.209
So you may not have all domains
of a module on a given protein.

00:41:43.209 --> 00:41:45.630
AUDIENCE: What happens if you
have two C domains in a row?

00:41:45.630 --> 00:41:48.130
ELIZABETH NOLAN: So where do
you see two C domains in a row?

00:41:48.130 --> 00:41:51.548
AUDIENCE: Between BlmV and BlmX.

00:41:51.548 --> 00:41:53.024
ELIZABETH NOLAN: Five and--

00:41:53.024 --> 00:41:55.405
AUDIENCE: Is that
actually in a row?

00:41:55.405 --> 00:41:57.530
ELIZABETH NOLAN: Yeah, so
then that's the question.

00:41:57.530 --> 00:42:00.390
Are they actually in a row?

00:42:00.390 --> 00:42:04.072
AUDIENCE: Further down, four Cy
cyclases without any C domain.

00:42:04.072 --> 00:42:05.780
ELIZABETH NOLAN: Yeah,
so that's actually

00:42:05.780 --> 00:42:06.738
where I was going next.

00:42:06.738 --> 00:42:12.800
So what's going on with
the Cy without a C domain?

00:42:12.800 --> 00:42:15.950
So what's happening-- and we'll
probably, if there is time,

00:42:15.950 --> 00:42:19.580
go over an example
of this on Friday--

00:42:19.580 --> 00:42:24.350
is that Cy, so these
cyclization domains

00:42:24.350 --> 00:42:27.480
are a variant on a
condensation domain.

00:42:27.480 --> 00:42:30.050
And what they do is, they
both catalyze formation

00:42:30.050 --> 00:42:33.440
of the peptide bond and then
they catalyze-- after that,

00:42:33.440 --> 00:42:36.350
they catalyze formation
of a heterocycle.

00:42:36.350 --> 00:42:38.210
So if you recall,
I believe we looked

00:42:38.210 --> 00:42:40.430
at the structure
of yersiniabactin

00:42:40.430 --> 00:42:44.240
during the first
lecture on these.

00:42:44.240 --> 00:42:45.890
It has a number of heterocycles.

00:42:45.890 --> 00:42:49.490
And those are formed by
this Cy domain.

00:42:49.490 --> 00:42:51.730
And we can see that
here in the structure.

00:42:51.730 --> 00:42:53.870
So what I've done
on this slide is

00:42:53.870 --> 00:42:57.040
just present to
you the structures,

00:42:57.040 --> 00:42:58.820
so the natural
products that result

00:42:58.820 --> 00:43:00.890
from these different
assembly lines.

00:43:00.890 --> 00:43:05.510
And if we take a look at the
bleomycin, what do we see here?

00:43:05.510 --> 00:43:09.480
We have these two heterocycles
that are fused together.

00:43:09.480 --> 00:43:15.290
And those are formed via the
action of these two cyclization

00:43:15.290 --> 00:43:16.980
domains down here.

00:43:16.980 --> 00:43:21.020
So effectively, these
originate from cysteine.

00:43:21.020 --> 00:43:24.110
So cysteines, and
serines, and threonines

00:43:24.110 --> 00:43:29.180
can end up forming structures
like these if there is

00:43:29.180 --> 00:43:31.250
the appropriate type of domain.

00:43:31.250 --> 00:43:34.490
This molecule is extremely
complicated here.

00:43:34.490 --> 00:43:36.650
And so it's a good
puzzle to look at it

00:43:36.650 --> 00:43:41.210
and try to sort out what are
the monomers in it in here.

00:43:41.210 --> 00:43:46.316
Does anyone know what
this does, bleomycin?

00:43:46.316 --> 00:43:48.272
AUDIENCE: [INAUDIBLE]

00:43:51.700 --> 00:43:56.040
ELIZABETH NOLAN: Well, so it's
an anticancer antibiotic here.

00:43:56.040 --> 00:43:58.080
It can intercalate into DNA.

00:43:58.080 --> 00:44:00.900
And these heterocycles
are important for that.

00:44:00.900 --> 00:44:03.233
And then it causes
strand breaks.

00:44:03.233 --> 00:44:04.650
And I've actually
learned recently

00:44:04.650 --> 00:44:06.450
it's also used for,
like, treating warts.

00:44:06.450 --> 00:44:09.680
So it will kill HPV
that causes warts.

00:44:09.680 --> 00:44:14.470
Anyhow, all of these compounds
have interesting activities,

00:44:14.470 --> 00:44:18.600
which is one reason why
they can be of interest.

00:44:18.600 --> 00:44:24.450
So with the logic in place,
where we're going to close

00:44:24.450 --> 00:44:29.670
this module is thinking about
how folks study these in lab.

00:44:29.670 --> 00:44:33.720
So say you want to figure out
the biosynthesis of a molecule

00:44:33.720 --> 00:44:36.810
like daptomycin
or bleomycin, what

00:44:36.810 --> 00:44:39.660
is it that one needs to do?

00:44:39.660 --> 00:44:42.030
And something just to keep
in mind with this right

00:44:42.030 --> 00:44:44.610
off the bat, is
that these are huge.

00:44:44.610 --> 00:44:46.560
So some of these
examples here, if you

00:44:46.560 --> 00:44:50.850
take a look at the sizes,
they're, like, comparable

00:44:50.850 --> 00:44:53.830
to the prokaryotic ribosome.

00:44:53.830 --> 00:44:56.550
That's a huge protein assembly.

00:44:56.550 --> 00:44:59.640
And that presents a
limitation from the standpoint

00:44:59.640 --> 00:45:04.020
of doing experimental
work, because trying

00:45:04.020 --> 00:45:08.670
to overexpress or produce these
assembly lines in something

00:45:08.670 --> 00:45:11.850
like E. coli is typically
just unreasonable.

00:45:11.850 --> 00:45:14.610
And in terms of a native
producer organism,

00:45:14.610 --> 00:45:16.740
say, something
like Streptomyces,

00:45:16.740 --> 00:45:18.630
we may or may not
know conditions

00:45:18.630 --> 00:45:22.080
that cause the organism to
make the natural product, so

00:45:22.080 --> 00:45:24.920
conditions that cause it
to express this machinery,

00:45:24.920 --> 00:45:26.950
and then even if it's made at a--

00:45:26.950 --> 00:45:29.220
in an amount that's useful.

00:45:29.220 --> 00:45:33.000
So what happens?

00:45:33.000 --> 00:45:35.190
What are we going to
do as experimentalists?

00:45:35.190 --> 00:45:39.870
So as I said, we need to keep
in mind that these machines are

00:45:39.870 --> 00:45:41.350
enormous.

00:45:41.350 --> 00:45:43.230
And so we need to
take this into account

00:45:43.230 --> 00:45:46.980
during experimental design.

00:45:46.980 --> 00:45:52.260
And these days, bioinformatics
drives a lot of the studies.

00:45:52.260 --> 00:45:54.697
So rather than first
finding a natural product

00:45:54.697 --> 00:45:56.280
and determining its
structure and then

00:45:56.280 --> 00:45:59.700
hunting down the protein
machinery, a wealth of genomes

00:45:59.700 --> 00:46:01.020
are becoming available.

00:46:01.020 --> 00:46:04.140
And so you can use
bioinformatics to search

00:46:04.140 --> 00:46:08.430
for PKS or NRPS gene clusters.

00:46:08.430 --> 00:46:11.880
And then you can
make some assessment

00:46:11.880 --> 00:46:15.120
as to what type of molecule
these gene clusters might

00:46:15.120 --> 00:46:16.860
be responsible for making.

00:46:16.860 --> 00:46:20.370
So bioinformatics
plays a huge role.

00:46:20.370 --> 00:46:24.450
And it allows us to
predict the domains,

00:46:24.450 --> 00:46:28.980
to predict their locations, and
predict their boundaries here.

00:46:28.980 --> 00:46:32.370
So as I just said,
overexpression

00:46:32.370 --> 00:46:35.700
of a complete assembly line
is generally not feasible.

00:46:35.700 --> 00:46:37.440
So what do people do?

00:46:37.440 --> 00:46:43.770
People will typically express
individual domains or maybe

00:46:43.770 --> 00:46:48.930
di-domains and study
those in the test tube.

00:46:48.930 --> 00:46:54.150
So you can imagine PCR
amplifying an A domain or a T

00:46:54.150 --> 00:46:58.800
domain, or maybe the A
and T domain together,

00:46:58.800 --> 00:47:01.170
and then creating some
plasmid that allows

00:47:01.170 --> 00:47:04.980
you to express that in E. coli.

00:47:04.980 --> 00:47:07.230
So there is a lot
of overexpression.

00:47:07.230 --> 00:47:09.060
The proteins need
to be purified,

00:47:09.060 --> 00:47:11.820
so maybe something like affinity
chromatography that we've

00:47:11.820 --> 00:47:13.710
spoken about before.

00:47:13.710 --> 00:47:16.170
And then a key point
is that, in order

00:47:16.170 --> 00:47:18.570
to have any of this
chemistry work,

00:47:18.570 --> 00:47:22.590
these T domains need to be
post-translationally modified

00:47:22.590 --> 00:47:24.660
with the Ppant arm.

00:47:24.660 --> 00:47:27.710
And if you're overexpressing
a T domain from Streptomyces

00:47:27.710 --> 00:47:30.750
or some organism in E.
coli, you can pretty much

00:47:30.750 --> 00:47:34.330
assume there is no
PPTase in E. coli

00:47:34.330 --> 00:47:36.990
that's going to do this for you.

00:47:36.990 --> 00:47:39.810
So you need to do
that after the fact.

00:47:39.810 --> 00:47:45.720
And so there needs
to be a PPTase.

00:47:45.720 --> 00:47:49.470
And what we'll see
is that there is

00:47:49.470 --> 00:47:54.320
a PPTase from B. subtilis called
Sfp that's very promiscuous.

00:47:54.320 --> 00:47:57.330
It will basically
modify any T domain.

00:47:57.330 --> 00:48:00.540
And so experimentally,
this is what people use,

00:48:00.540 --> 00:48:06.450
because often, one has no clue
what the endogenous PPTase is

00:48:06.450 --> 00:48:10.530
here, so Sfp to the rescue.

00:48:10.530 --> 00:48:12.870
In terms of activity
assay, so once

00:48:12.870 --> 00:48:16.440
you have your domains
or di-domains purified,

00:48:16.440 --> 00:48:18.210
what happens?

00:48:18.210 --> 00:48:20.640
This is the typical flow.

00:48:20.640 --> 00:48:24.780
So the first is to characterize
the A domains and to ask,

00:48:24.780 --> 00:48:29.260
what amino acid or aryl acid
is activated by the A domain

00:48:29.260 --> 00:48:31.140
and what is the selectivity?

00:48:31.140 --> 00:48:32.880
And by getting that
information, you

00:48:32.880 --> 00:48:35.970
have a good clue as to
what monomer a given

00:48:35.970 --> 00:48:38.580
module is responsible for.

00:48:38.580 --> 00:48:41.970
And the ATP-PPi exchange
assay we discussed

00:48:41.970 --> 00:48:45.090
in the context of the
aminoacyl tRNA synthetases

00:48:45.090 --> 00:48:47.100
is commonly employed.

00:48:47.100 --> 00:48:50.280
So this is where we use
the radiolabeled ATP

00:48:50.280 --> 00:48:53.020
and took into account
reversibility there.

00:48:53.020 --> 00:48:56.680
So go back and review
that assay as needed.

00:48:56.680 --> 00:48:59.890
There will be some examples
of this in the problem set.

00:48:59.890 --> 00:49:04.180
So once the A domain
activity is known

00:49:04.180 --> 00:49:07.850
in terms of preferred
monomer, the next question is,

00:49:07.850 --> 00:49:12.820
will that A domain transfer
the amino acid monomer

00:49:12.820 --> 00:49:14.650
to a given T domain?

00:49:14.650 --> 00:49:18.820
So you design assays to look
for transfer of the activated

00:49:18.820 --> 00:49:22.330
monomer to the
post-translationally-modified T

00:49:22.330 --> 00:49:23.950
domain here.

00:49:23.950 --> 00:49:25.960
So in these assays,
there is a lot

00:49:25.960 --> 00:49:31.630
of work with radiolabels,
with HPLC, and mass spec.

00:49:31.630 --> 00:49:35.560
So once these T
domains are loaded,

00:49:35.560 --> 00:49:38.500
you can look for
peptide bond formation.

00:49:38.500 --> 00:49:43.360
So imagine you have an isolated
T domain from a loading module

00:49:43.360 --> 00:49:47.380
that you've stuck the amino acid
on and then you have this guy,

00:49:47.380 --> 00:49:49.690
the next question
is, does the C domain

00:49:49.690 --> 00:49:53.320
catalyze the bond
formation reaction?

00:49:53.320 --> 00:49:56.640
And again, we'll see there is
a lot of use of radiolabels,

00:49:56.640 --> 00:50:01.990
HPLC, SDS-PAGE here.

00:50:01.990 --> 00:50:05.980
And then you know, there is
the question of the TE domain

00:50:05.980 --> 00:50:09.250
and the TE domain
catalyzing chain release.

00:50:09.250 --> 00:50:12.520
So it's quite systematic in
terms of how you work through

00:50:12.520 --> 00:50:19.360
from identifying an assembly
line to then teasing apart

00:50:19.360 --> 00:50:21.970
the various activities
of the different domains

00:50:21.970 --> 00:50:23.530
and different modules.

00:50:23.530 --> 00:50:27.430
And so where we'll close
this module on Friday

00:50:27.430 --> 00:50:32.140
is with looking at
the experiments that

00:50:32.140 --> 00:50:35.980
were done for the biosynthesis
of an iron chelator produced

00:50:35.980 --> 00:50:42.550
by E. coli and working through
basically you know, how was it

00:50:42.550 --> 00:50:47.500
that this NRPS was found?

00:50:47.500 --> 00:50:50.050
What were the experiments
done to identify

00:50:50.050 --> 00:50:52.330
the different activities
of the different domains?

00:50:52.330 --> 00:50:55.900
And it's really
that work that has

00:50:55.900 --> 00:50:58.210
served as a foundation
and a paradigm

00:50:58.210 --> 00:51:03.700
for many, many further
studies of these systems here.

00:51:03.700 --> 00:51:05.530
And so with that,
we'll close for today.

00:51:05.530 --> 00:51:09.570
And there is no class Wednesday,
so I'll see you on Friday.