WEBVTT

00:00:06.370 --> 00:00:12.480
ERIC LANDER: And so the issue
became how does DNA

00:00:12.480 --> 00:00:13.875
replication work.

00:00:17.370 --> 00:00:18.650
And so I'm about
to go into it.

00:00:18.650 --> 00:00:25.130
Now, I'm going to note we're
going to be starting this DNA

00:00:25.130 --> 00:00:35.790
goes to RNA, goes to protein,
and DNA goes to itself.

00:00:35.790 --> 00:00:38.100
DNA is replicated.

00:00:38.100 --> 00:00:39.450
It makes RNA.

00:00:39.450 --> 00:00:41.710
The RNA is used to
make protein.

00:00:41.710 --> 00:00:42.780
This will be what
we'll be talking

00:00:42.780 --> 00:00:44.700
about today and tomorrow.

00:00:44.700 --> 00:00:47.640
So the first step of that
is, how does DNA give

00:00:47.640 --> 00:00:50.370
rise to more DNA?

00:00:50.370 --> 00:00:53.330
Well, how do you
find an enzyme?

00:00:53.330 --> 00:00:54.580
How do you do biochemistry?

00:00:56.950 --> 00:00:57.540
What do you do?

00:00:57.540 --> 00:00:58.890
AUDIENCE: Assays.

00:00:58.890 --> 00:00:59.350
ERIC LANDER: Assay.

00:00:59.350 --> 00:01:00.730
So you've got to grind
up the cell.

00:01:00.730 --> 00:01:02.730
I got to choose a cell in which
I'm likely to find an

00:01:02.730 --> 00:01:05.830
enzyme, grind it up, break it
up into different fractions,

00:01:05.830 --> 00:01:07.200
and test each fraction.

00:01:07.200 --> 00:01:08.930
That's all biochemists
do, right?

00:01:08.930 --> 00:01:13.950
So what cell might have the
enzyme we're looking for?

00:01:13.950 --> 00:01:15.675
What cells might be
able to copy DNA?

00:01:18.642 --> 00:01:20.400
How about all cells?

00:01:20.400 --> 00:01:21.570
So let's use a simple cell.

00:01:21.570 --> 00:01:24.180
What's a simple cell?

00:01:24.180 --> 00:01:25.330
Let's use bacteria.

00:01:25.330 --> 00:01:27.000
So we'll take some bacteria,
we'll grow it up,

00:01:27.000 --> 00:01:27.920
we'll grind it up.

00:01:27.920 --> 00:01:30.130
We'll fractionate it into
different fractions, and we'll

00:01:30.130 --> 00:01:33.810
see if one of those fractions
has the ability to copy DNA.

00:01:33.810 --> 00:01:35.150
If we're going to run
an assay, we have

00:01:35.150 --> 00:01:36.880
to give it a substrate.

00:01:36.880 --> 00:01:40.610
What substrate would you
like to give it?

00:01:40.610 --> 00:01:41.760
What do you think it needs?

00:01:41.760 --> 00:01:43.010
AUDIENCE: [INAUDIBLE].

00:01:45.450 --> 00:01:47.380
ERIC LANDER: It better have
some free nucleotides;

00:01:47.380 --> 00:01:48.980
otherwise, how are we
going to make DNA?

00:01:48.980 --> 00:01:51.010
What else?

00:01:51.010 --> 00:01:53.490
Are you going to ask it to
make DNA all by itself?

00:01:53.490 --> 00:01:56.110
We want something that can copy
one of the strands of a

00:01:56.110 --> 00:01:57.310
double helix.

00:01:57.310 --> 00:01:58.140
So what should we give it?

00:01:58.140 --> 00:02:00.240
AUDIENCE: [INAUDIBLE].

00:02:00.240 --> 00:02:01.150
ERIC LANDER: Sorry?

00:02:01.150 --> 00:02:02.020
AUDIENCE: Half a helix.

00:02:02.020 --> 00:02:02.850
ERIC LANDER: Half a helix.

00:02:02.850 --> 00:02:06.480
A strand of DNA, the strand
to be used as a template.

00:02:06.480 --> 00:02:07.965
So let's give it a
template strand.

00:02:11.880 --> 00:02:14.860
So we'll take a template
strand of DNA.

00:02:14.860 --> 00:02:16.390
There's my template of DNA.

00:02:19.600 --> 00:02:22.810
Let's actually give it a little

00:02:22.810 --> 00:02:24.980
sequence actually, here.

00:02:24.980 --> 00:02:34.330
Let's say A phosphate, T
phosphate, G phosphate, C

00:02:34.330 --> 00:02:41.480
phosphate, A phosphate, T
phosphate, T phosphate, A

00:02:41.480 --> 00:02:46.460
phosphate, G phosphate,
G phosphate.

00:02:46.460 --> 00:02:48.810
I'm going to not write the
phosphates too much longer,

00:02:48.810 --> 00:02:51.760
guys, but anyway C phosphate,
C phosphate, T

00:02:51.760 --> 00:02:55.060
phosphate, like that.

00:02:55.060 --> 00:02:57.410
Pretty soon, in fact, almost
immediately, I'm going to

00:02:57.410 --> 00:02:59.820
start dropping the phosphates
in here.

00:02:59.820 --> 00:03:01.930
But that's the way it goes.

00:03:01.930 --> 00:03:04.610
All right.

00:03:04.610 --> 00:03:05.860
That's a template.

00:03:08.640 --> 00:03:14.845
We need floating around in the
solution some nucleotide triphosphates.

00:03:25.960 --> 00:03:33.230
We have some nucleotides
floating around.

00:03:36.160 --> 00:03:39.330
And now will this enzyme work?

00:03:39.330 --> 00:03:41.540
We would try different fractions
and see if it's able

00:03:41.540 --> 00:03:45.940
to just install the right
letters in the right place.

00:03:45.940 --> 00:03:50.620
Now, it turned out it needed one
more thing, and the person

00:03:50.620 --> 00:03:53.370
who discovered this, Arthur
Kornberg, thought of it.

00:03:53.370 --> 00:03:54.760
It needed a head start.

00:03:54.760 --> 00:03:56.010
It needed a primer.

00:03:59.450 --> 00:04:05.690
So the primer goes let's say,
phosphate T, phosphate A,

00:04:05.690 --> 00:04:16.670
phosphate C, phosphate G,
phosphate T, phosphate A,

00:04:16.670 --> 00:04:18.399
let's say like that.

00:04:18.399 --> 00:04:23.950
So this is the five
prime end of DNA.

00:04:23.950 --> 00:04:26.490
Remember the phosphate is
hanging off the five prime

00:04:26.490 --> 00:04:27.940
carbon, right?

00:04:27.940 --> 00:04:29.400
Let's look at the other end.

00:04:29.400 --> 00:04:34.640
The other end ends in the
hydroxyl on the three prime

00:04:34.640 --> 00:04:37.790
end of the ribose.

00:04:37.790 --> 00:04:42.010
Since this is anti-parallel,
this strand is going five

00:04:42.010 --> 00:04:48.090
prime phosphate to three
prime hydroxyl.

00:04:48.090 --> 00:04:50.540
You're going to need to know
five prime and three prime.

00:04:50.540 --> 00:04:52.470
So I'm doing this so you
get used to five

00:04:52.470 --> 00:04:53.880
prime and three prime.

00:04:53.880 --> 00:04:54.580
There you go.

00:04:54.580 --> 00:04:57.460
If you're handed a primer to get
a head start, and you're

00:04:57.460 --> 00:05:00.460
handed a template, and you hand
it some nucleotides, you

00:05:00.460 --> 00:05:03.800
then assay different fractions
exactly as you suggested and

00:05:03.800 --> 00:05:09.300
we see whether one of them is capable
of extending this strand by

00:05:09.300 --> 00:05:12.640
putting in an A, putting in a T,
putting in a C, putting in

00:05:12.640 --> 00:05:17.320
a C, putting in a G, putting
in a G. That's the assay.
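
NOTE
The assay described here amounts to reading off the complement of each template base that the primer has not yet covered. A minimal sketch in Python, assuming the template and primer are aligned position by position in the order they were dictated; the name extend_primer is hypothetical and not from the lecture.
```python
# Watson-Crick pairing rules for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
def extend_primer(template, primer):
    """Return the bases a polymerase would add after the primer."""
    # The primer must already pair with the start of the template.
    for t, p in zip(template, primer):
        if PAIR[t] != p:
            raise ValueError("primer does not pair with template")
    # Each remaining template base dictates its complement.
    return "".join(PAIR[t] for t in template[len(primer):])
template = "ATGCATTAGGCCT"  # sequence dictated in the lecture
primer = "TACGTA"           # primer dictated in the lecture
print(extend_primer(template, primer))  # ATCCGGA
```
The check inside the loop mirrors the requirement that the primer be base-paired to the template before any extension can happen.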

00:05:17.320 --> 00:05:29.450
And Arthur Kornberg discovered
an enzyme that could do this.

00:05:29.450 --> 00:05:31.000
And the biochemists went nuts.

00:05:31.000 --> 00:05:31.940
They thought, wow.

00:05:31.940 --> 00:05:33.520
This is so cool.

00:05:33.520 --> 00:05:36.770
Kornberg is able to discover
an enzyme that

00:05:36.770 --> 00:05:39.220
can accomplish this.

00:05:39.220 --> 00:05:43.470
The enzyme polymerizes DNA.

00:05:43.470 --> 00:05:46.754
Coincidentally, what is
the enzyme called?

00:05:46.754 --> 00:05:47.660
AUDIENCE: DNA polymerase.

00:05:47.660 --> 00:05:49.320
ERIC LANDER: DNA polymerase.

00:05:49.320 --> 00:05:51.130
Accidentally, has a nice name.

00:05:51.130 --> 00:05:52.310
Good.

00:05:52.310 --> 00:05:54.050
DNA polymerase.

00:05:58.590 --> 00:06:01.190
Excellent.

00:06:01.190 --> 00:06:03.640
Now, notice what it does.

00:06:03.640 --> 00:06:08.130
It takes this triphosphate, puts
it in here, and it joins

00:06:08.130 --> 00:06:10.140
it into a sugar phosphate
chain.

00:06:10.140 --> 00:06:12.205
Where does it get the energy
for that synthesis?

00:06:14.780 --> 00:06:17.600
Hydrolysis of the triphosphates,
right?

00:06:17.600 --> 00:06:19.500
It's the hydrolysis of
the triphosphate.

00:06:19.500 --> 00:06:21.800
That's the energy.

00:06:21.800 --> 00:06:26.030
What direction is the synthesis
proceeding?

00:06:26.030 --> 00:06:31.120
Starts here at the five prime
end, and it moves adding on to

00:06:31.120 --> 00:06:33.400
the three prime end.

00:06:33.400 --> 00:06:40.880
So it's five prime to three
prime direction.

00:06:40.880 --> 00:06:42.540
That's the direction it moves.

00:06:42.540 --> 00:06:44.720
It adds to the three
prime end.

00:06:44.720 --> 00:06:46.370
It adds the free
nucleotides to

00:06:46.370 --> 00:06:47.620
the three prime end.

00:06:50.350 --> 00:06:52.025
Why not do it the other way?

00:06:52.025 --> 00:06:55.140
AUDIENCE: [INAUDIBLE].

00:06:55.140 --> 00:06:56.170
ERIC LANDER: Sorry?

00:06:56.170 --> 00:06:57.370
AUDIENCE: Phosphates.

00:06:57.370 --> 00:06:58.286
ERIC LANDER: Can't hear you.

00:06:58.286 --> 00:06:58.700
Shout loud.

00:06:58.700 --> 00:06:59.586
AUDIENCE: Phosphates.

00:06:59.586 --> 00:07:02.000
ERIC LANDER: Phosphates, yes.

00:07:02.000 --> 00:07:05.075
You see, suppose we were
going the other way.

00:07:08.850 --> 00:07:11.540
Suppose the primer
was this way.

00:07:11.540 --> 00:07:15.600
Then, as we added each
base, the triphosphate would

00:07:15.600 --> 00:07:18.590
be on the strand, right?

00:07:18.590 --> 00:07:24.150
And we'd be adding to the
three prime end here.

00:07:24.150 --> 00:07:28.320
That means the energy supplied
by the triphosphate would be

00:07:28.320 --> 00:07:33.240
on the growing strand rather
than in the free nucleotides.

00:07:33.240 --> 00:07:37.160
Why would it be a terrible idea
to put your energy source

00:07:37.160 --> 00:07:39.729
on the growing strand?

00:07:39.729 --> 00:07:42.094
MIKE: [INAUDIBLE].

00:07:42.094 --> 00:07:44.280
ERIC LANDER: Well Mike, you
know, those triphosphate bonds

00:07:44.280 --> 00:07:45.040
are pretty unstable.

00:07:45.040 --> 00:07:47.320
They hydrolyze by themselves
at some frequency.

00:07:47.320 --> 00:07:49.550
If you're a free nucleotide
and the triphosphate

00:07:49.550 --> 00:07:52.010
hydrolyzes, big deal.

00:07:52.010 --> 00:07:55.900
That free nucleotide floating
around loses its triphosphate.

00:07:55.900 --> 00:07:57.960
But what if I'm the growing
strand, and I lose my

00:07:57.960 --> 00:07:59.115
triphosphate?

00:07:59.115 --> 00:07:59.720
AUDIENCE: [LAUGHS]

00:07:59.720 --> 00:08:00.480
ERIC LANDER: Exactly.

00:08:00.480 --> 00:08:01.540
AUDIENCE: There goes
your chain.

00:08:01.540 --> 00:08:03.470
ERIC LANDER: There
goes my chain.

00:08:03.470 --> 00:08:05.530
So you know, life's
not stupid.

00:08:05.530 --> 00:08:06.930
It doesn't do it that way.

00:08:06.930 --> 00:08:07.870
It does it this way.

00:08:07.870 --> 00:08:10.050
No one has ever found a
polymerase that goes this way.

00:08:10.050 --> 00:08:13.230
They find them all going that
way for just that reason.

00:08:13.230 --> 00:08:14.660
Exactly.

00:08:14.660 --> 00:08:16.100
Bingo.

00:08:16.100 --> 00:08:19.210
That was why life evolved it
that way, because you want

00:08:19.210 --> 00:08:24.190
your triphosphates, those
hydrolyzable triphosphates to

00:08:24.190 --> 00:08:27.320
be floating around freely
rather than investing.

00:08:27.320 --> 00:08:28.270
Now just think about that.

00:08:28.270 --> 00:08:29.140
It's a kind of cool thing.

00:08:29.140 --> 00:08:29.670
It doesn't matter.

00:08:29.670 --> 00:08:30.910
Your book doesn't
talk about it.

00:08:30.910 --> 00:08:33.080
But to me, it helps me remember
which way it's going

00:08:33.080 --> 00:08:35.370
and how it is, and it's
kind of interesting.

00:08:35.370 --> 00:08:36.110
Anyway.

00:08:36.110 --> 00:08:36.710
All right.

00:08:36.710 --> 00:08:42.440
So Kornberg wins the Nobel
Prize for this.

00:08:42.440 --> 00:08:42.870
Good stuff.

00:08:42.870 --> 00:08:47.920
It's very deserved, but you
know, there's some questions.

00:08:47.920 --> 00:08:52.340
Where does the primer
come from in life?

00:08:52.340 --> 00:08:57.270
See, Kornberg gave this
test tube a primer.

00:08:57.270 --> 00:09:00.520
But suppose I'm replicating
some DNA.

00:09:00.520 --> 00:09:06.770
So let's suppose I have a double
strand of DNA, and I'm

00:09:06.770 --> 00:09:13.460
just going to open it up here,
five prime to three prime,

00:09:13.460 --> 00:09:18.280
five prime to three prime.

00:09:18.280 --> 00:09:20.769
I need to get like
a primer here.

00:09:25.080 --> 00:09:28.830
Then the primer can be extended
by polymerase.

00:09:28.830 --> 00:09:32.560
Well, where's the primer
come from?

00:09:32.560 --> 00:09:35.850
It turns out there is an enzyme
specially devoted to

00:09:35.850 --> 00:09:37.440
making those primers.

00:09:37.440 --> 00:09:41.080
Kornberg didn't know it,
but there's an enzyme.

00:09:41.080 --> 00:09:46.420
And by coincidence, it
is called primase.

00:09:46.420 --> 00:09:48.240
Exactly.

00:09:48.240 --> 00:09:50.290
Primase makes the primer.

00:09:50.290 --> 00:09:56.025
So you need a primer here, and
the primer is made by primase.

00:10:00.960 --> 00:10:07.810
Once primase makes a primer,
polymerase can chug along and

00:10:07.810 --> 00:10:10.340
do it just fine.

00:10:10.340 --> 00:10:11.860
Let's check out the
other strand.

00:10:14.370 --> 00:10:18.390
Primer here, polymerase
chugs along.

00:10:21.790 --> 00:10:25.210
But now as this double
helix opens up, what

00:10:25.210 --> 00:10:26.460
happens over here?

00:10:31.380 --> 00:10:34.150
The synthesis is going this way.

00:10:34.150 --> 00:10:35.670
So what do I have to do here?

00:10:35.670 --> 00:10:37.318
AUDIENCE: [INAUDIBLE].

00:10:37.318 --> 00:10:39.232
ERIC LANDER: Another primer.

00:10:39.232 --> 00:10:40.482
Need another primer.

00:10:42.810 --> 00:10:44.510
Then as it opens up more,
what do I need?

00:10:44.510 --> 00:10:45.530
AUDIENCE: Another primer.

00:10:45.530 --> 00:10:46.780
ERIC LANDER: Another primer.

00:10:49.410 --> 00:10:53.100
So the two strands are
experiencing very different

00:10:53.100 --> 00:10:54.450
kinds of replication.

00:10:54.450 --> 00:10:57.420
In one place, one primer in the
five prime to three prime

00:10:57.420 --> 00:10:59.840
direction is enough
to keep going.

00:10:59.840 --> 00:11:02.830
In the other strand, as it keeps
opening up, you gotta

00:11:02.830 --> 00:11:04.510
keep making primers.

00:11:04.510 --> 00:11:06.550
You have all these little
fragments there.
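
NOTE
The contrast between the two strands can be sketched as a simple count. This assumes, purely for illustration, that the strand copied in pieces is exposed in equal chunks with one primer each; primers_needed and fragment_length are hypothetical names, and the numbers are not from the lecture.
```python
def primers_needed(total_bases, fragment_length):
    """Count primers for each strand while copying total_bases of template."""
    # One primer at the start suffices for the continuously copied strand.
    continuous = 1
    # The other strand needs a fresh primer per fragment (ceiling division).
    discontinuous = -(-total_bases // fragment_length)
    return continuous, discontinuous
print(primers_needed(10_000, 1_000))  # (1, 10)
```
However long the template gets, the first count stays at one while the second grows with it, which is why the little fragments show up on only one side.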

00:11:09.910 --> 00:11:16.520
Now, those little fragments were
discovered by Okazaki,

00:11:16.520 --> 00:11:20.960
and they are called
Okazaki fragments.

00:11:20.960 --> 00:11:23.260
Again, I just mention
these things.

00:11:23.260 --> 00:11:25.570
They are known to molecular
biologists.

00:11:25.570 --> 00:11:28.900
But these little guys are
Okazaki fragments, and they

00:11:28.900 --> 00:11:31.480
tell you that you're on
the right track here.

00:11:31.480 --> 00:11:33.280
This is indeed how
it's working.

00:11:33.280 --> 00:11:36.010
You can see those little
fragments there.

00:11:36.010 --> 00:11:40.260
But now, what's the problem with
the Okazaki fragments?

00:11:40.260 --> 00:11:41.970
They're not connected, right?

00:11:41.970 --> 00:11:44.400
The primase makes a primer.

00:11:44.400 --> 00:11:50.010
The polymerase copies the DNA,
it bumps into the next primer,

00:11:50.010 --> 00:11:52.690
but you've got to
connect them.

00:11:52.690 --> 00:11:54.420
So that's a problem.

00:11:54.420 --> 00:11:57.250
That's a real problem.

00:11:57.250 --> 00:11:58.500
I'll redraw that here.

00:12:04.140 --> 00:12:05.160
Here was my primer.

00:12:05.160 --> 00:12:06.710
I got a new primer over here.

00:12:09.450 --> 00:12:12.680
I got a new primer over here.

00:12:12.680 --> 00:12:13.380
Right there.

00:12:13.380 --> 00:12:14.120
Right there.

00:12:14.120 --> 00:12:17.340
They're not contiguously
connected.

00:12:17.340 --> 00:12:20.820
The word we use for connecting
two pieces of DNA, which is a

00:12:20.820 --> 00:12:23.700
standard English word not used
that often, is to ligate two

00:12:23.700 --> 00:12:25.060
things together.

00:12:25.060 --> 00:12:27.960
Ligature, for example,
in music.

00:12:27.960 --> 00:12:31.460
You ligate things together.

00:12:31.460 --> 00:12:33.650
How do you think the cell deals
with ligating these

00:12:33.650 --> 00:12:36.230
things together?

00:12:36.230 --> 00:12:37.315
An enzyme called--

00:12:37.315 --> 00:12:38.110
AUDIENCE: Ligase.

00:12:38.110 --> 00:12:40.390
ERIC LANDER: Exactly.

00:12:40.390 --> 00:12:44.340
So ligase does the ligation.

00:12:44.340 --> 00:12:48.070
Ligase ligates.

00:12:48.070 --> 00:12:51.950
It is so lucky that these
words turn out to have

00:12:51.950 --> 00:12:53.780
accidentally made sense.

00:12:53.780 --> 00:12:56.400
It's really cool.

00:12:56.400 --> 00:12:59.950
So ligase ligates.

00:12:59.950 --> 00:13:05.000
Now, I'll tell you a factoid,
but don't worry

00:13:05.000 --> 00:13:05.820
about it too much.

00:13:05.820 --> 00:13:10.360
Primase actually doesn't
make DNA.

00:13:10.360 --> 00:13:11.540
We haven't gotten there
yet, but it turns out

00:13:11.540 --> 00:13:14.350
primase makes RNA.

00:13:14.350 --> 00:13:16.450
Turns out to be easier
to start an RNA

00:13:16.450 --> 00:13:18.740
than a DNA from scratch.

00:13:18.740 --> 00:13:20.710
Cell doesn't like to start
DNA from scratch.

00:13:20.710 --> 00:13:22.650
It likes to start RNA from
scratch, as we'll get to in a

00:13:22.650 --> 00:13:23.870
moment with transcription.

00:13:23.870 --> 00:13:26.010
So as a factoid, I'll mention
to you that those little

00:13:26.010 --> 00:13:29.340
primers are actually RNA
primers, and what happens is

00:13:29.340 --> 00:13:32.800
they get extended into DNA, and
they bump into and kind of

00:13:32.800 --> 00:13:36.070
displace the previous RNA, so
it's slightly more complicated

00:13:36.070 --> 00:13:36.890
than I told you.

00:13:36.890 --> 00:13:38.270
You're welcome to forget that.

00:13:38.270 --> 00:13:40.290
If you would like to believe
that primase is actually

00:13:40.290 --> 00:13:43.550
making little segments of
DNA, it'll be just fine.

00:13:43.550 --> 00:13:45.460
But in fact, it doesn't
actually.

00:13:45.460 --> 00:13:47.990
It's making little segments of
RNA so there's a whole other

00:13:47.990 --> 00:13:50.070
machinery that has to
deal with that.

00:13:50.070 --> 00:13:52.810
But the basic concept five prime
to three prime, little

00:13:52.810 --> 00:13:56.765
primers, getting extended,
getting ligated, that's how

00:13:56.765 --> 00:13:57.620
you make your DNA.

00:13:57.620 --> 00:13:58.980
And you can check it
out, and it works.

00:13:58.980 --> 00:14:00.230
All right.

00:14:09.500 --> 00:14:17.770
Well, it turns out to even be
a little more complicated.

00:14:17.770 --> 00:14:20.920
That was how we got the
synthesis going, but we also

00:14:20.920 --> 00:14:23.030
have a little bit of a
topological problem.

00:14:31.090 --> 00:14:35.120
This again, says a lot about
how people do science.

00:14:35.120 --> 00:14:37.300
You gotta just like not worry
about certain things.

00:14:37.300 --> 00:14:40.390
If Kornberg had said,
oh my goodness.

00:14:40.390 --> 00:14:43.860
I can't give my test tube a
primer, because I don't know

00:14:43.860 --> 00:14:46.550
how the cell would make a
primer, he wouldn't have made

00:14:46.550 --> 00:14:47.120
any progress.

00:14:47.120 --> 00:14:49.023
So he throws in the primer
and says, the cell

00:14:49.023 --> 00:14:50.070
will figure it out.

00:14:50.070 --> 00:14:54.400
I'm just giving it a primer,
and I'll see what happens.

00:14:54.400 --> 00:14:56.820
Now, there's another problem,
this topological problem that

00:14:56.820 --> 00:14:58.300
also can make your head hurt.

00:14:58.300 --> 00:15:01.050
Let me try to explain what the
topological problem is.

00:15:01.050 --> 00:15:08.735
Suppose I have DNA like that.

00:15:08.735 --> 00:15:10.580
Make that a little prettier.

00:15:10.580 --> 00:15:12.110
So I have some DNA like that.

00:15:21.570 --> 00:15:24.360
And maybe it goes around for
a very long distance like a

00:15:24.360 --> 00:15:25.800
circle or something like that.

00:15:25.800 --> 00:15:30.070
I now want to copy that DNA.

00:15:30.070 --> 00:15:33.390
So I have one strand,
and I'm copying it.

00:15:36.730 --> 00:15:43.650
I have this other strand,
and I'm copying it.

00:15:43.650 --> 00:15:47.760
And remember, these two strands
are wrapped around,

00:15:47.760 --> 00:15:51.560
and around, and around,
and around each other.

00:15:51.560 --> 00:15:53.170
One is going like this.

00:15:53.170 --> 00:15:55.050
One is going like that, and
there's some wrapped around.

00:15:55.050 --> 00:15:59.430
And as I tug them apart to
make a new strand, to

00:15:59.430 --> 00:16:05.560
synthesize a new strand, those
two new double helices are so

00:16:05.560 --> 00:16:10.360
totally intertwined
with each other.

00:16:10.360 --> 00:16:14.720
Every turn that there was in
the double helix is now a

00:16:14.720 --> 00:16:18.780
twist and turn connecting the
two, sort of entangling the

00:16:18.780 --> 00:16:20.380
two helices.

00:16:20.380 --> 00:16:23.440
So I have the two new
double helices

00:16:23.440 --> 00:16:26.130
entangled with each other.

00:16:26.130 --> 00:16:27.380
Why is that going
to be a problem?

00:16:31.160 --> 00:16:34.070
I'm going to send these
to two daughter cells.

00:16:34.070 --> 00:16:37.060
These are the two genomes for
the two daughter cells.

00:16:37.060 --> 00:16:39.990
In fact in particular, if this
thing was a circle, the two

00:16:39.990 --> 00:16:44.360
new circles will be totally
wrapped around each other with

00:16:44.360 --> 00:16:46.650
a gazillion wraps.

00:16:46.650 --> 00:16:50.130
No way they're going to
two daughter cells.

00:16:50.130 --> 00:16:51.560
Now, here is where
mathematicians are very

00:16:51.560 --> 00:16:55.120
useful, because it is a theorem
that if I take two

00:16:55.120 --> 00:16:59.020
circles wrapped around each
other like that, there is no

00:16:59.020 --> 00:17:01.750
topological deformation
possible that

00:17:01.750 --> 00:17:02.590
can separate them.

00:17:02.590 --> 00:17:04.359
It's like these puzzles, you
get some strings wrapped

00:17:04.359 --> 00:17:06.240
around each other and try to
separate them.

00:17:06.240 --> 00:17:08.960
It's a theorem that two circles
wrapped around each

00:17:08.960 --> 00:17:13.839
other like that cannot
be separated unless,

00:17:13.839 --> 00:17:14.520
of course, you cheat.

00:17:14.520 --> 00:17:15.352
What's cheating?

00:17:15.352 --> 00:17:16.500
AUDIENCE: You cut it.

00:17:16.500 --> 00:17:17.319
ERIC LANDER: You cut
it, obviously.

00:17:17.319 --> 00:17:18.980
If you cut it, then you
can separate it.

00:17:18.980 --> 00:17:20.670
But otherwise, it's
mathematically impossible to

00:17:20.670 --> 00:17:22.430
separate them.

00:17:22.430 --> 00:17:25.349
So this could concern people.

00:17:25.349 --> 00:17:27.450
How could a cell do this?

00:17:27.450 --> 00:17:28.867
So what does the cell do?

00:17:28.867 --> 00:17:29.781
AUDIENCE: It cuts it.

00:17:29.781 --> 00:17:30.820
ERIC LANDER: It cuts it.

00:17:30.820 --> 00:17:32.020
It's got no choice, right?

00:17:32.020 --> 00:17:32.990
It's a theorem, right?

00:17:32.990 --> 00:17:35.050
Even cells can't violate
theorems.

00:17:35.050 --> 00:17:38.310
So it cuts it.

00:17:38.310 --> 00:17:40.440
The only way to get these things
apart is to cut it.

00:17:40.440 --> 00:17:43.570
Now, what it does, is it takes
those double helices.

00:17:43.570 --> 00:17:46.030
I'll represent the double
helix as a thicker

00:17:46.030 --> 00:17:46.620
kind of thing now.

00:17:46.620 --> 00:17:49.480
That was my double helix,
this other double helix

00:17:49.480 --> 00:17:52.200
wrapped around it.

00:17:52.200 --> 00:17:55.150
It's got to cut it.

00:17:55.150 --> 00:18:00.840
Now, when I take two DNAs that
are wrapped around each other

00:18:00.840 --> 00:18:02.720
or two DNAs that are separate,
have I done any

00:18:02.720 --> 00:18:04.920
chemistry on them?

00:18:04.920 --> 00:18:05.160
I'm sorry.

00:18:05.160 --> 00:18:07.040
Are they chemically different?

00:18:07.040 --> 00:18:09.440
They're chemically the
same molecules.

00:18:09.440 --> 00:18:13.110
But they're topologically
different.

00:18:13.110 --> 00:18:15.150
Topologically means
wrapped around.

00:18:15.150 --> 00:18:17.160
In one case, they were
topologically entangled.

00:18:17.160 --> 00:18:19.510
In the other case, they're
topologically separated from

00:18:19.510 --> 00:18:20.280
each other.

00:18:20.280 --> 00:18:23.030
So they're still the same
chemical bonds, the same

00:18:23.030 --> 00:18:29.260
molecules, but when I separate
these two double helices now,

00:18:29.260 --> 00:18:32.220
the difference between these
is that they are what are

00:18:32.220 --> 00:18:35.860
called topoisomers.

00:18:35.860 --> 00:18:38.620
They are isomers because they have
exactly the same

00:18:38.620 --> 00:18:39.920
chemical formula.

00:18:39.920 --> 00:18:41.940
But they're topoisomers
because they

00:18:41.940 --> 00:18:43.070
have different topology.

00:18:43.070 --> 00:18:46.060
They're not wrapped around
each other anymore.

00:18:46.060 --> 00:18:48.900
So it turns out there is an
enzyme that just gets in there

00:18:48.900 --> 00:18:51.040
and makes a double stranded
cut in one of the double

00:18:51.040 --> 00:18:54.920
helices, grabs the two ends,
passes it around the other

00:18:54.920 --> 00:18:59.380
side, and ligates them back
together, and keeps doing that

00:18:59.380 --> 00:19:01.410
until they're disentangled.

00:19:01.410 --> 00:19:02.960
Pretty clever.

00:19:02.960 --> 00:19:06.445
Cut, paste, cut, paste till it
can separate those two double

00:19:06.445 --> 00:19:08.520
helices from each other.

00:19:08.520 --> 00:19:13.200
Remarkably, this enzyme is
called topoisomerase.

00:19:18.980 --> 00:19:26.750
This job is done by
topoisomerase, actually, by

00:19:26.750 --> 00:19:28.750
topoisomerase II.

00:19:28.750 --> 00:19:30.820
There's a couple of different
topoisomerases, and it's

00:19:30.820 --> 00:19:34.740
topoisomerase II that does this
particular job, cuts and

00:19:34.740 --> 00:19:37.680
seals up that double-stranded
break.

00:19:37.680 --> 00:19:40.080
All right.

00:19:40.080 --> 00:19:42.250
It is amazing how this works.

00:19:42.250 --> 00:19:47.080
Let's take another problem in
how we do DNA replication.

00:19:47.080 --> 00:19:53.900
So let's deal with fidelity.

00:19:53.900 --> 00:19:57.305
The fidelity, accuracy
of replication.

00:20:05.100 --> 00:20:07.810
I have my strand.

00:20:07.810 --> 00:20:09.730
Which direction do we go?

00:20:09.730 --> 00:20:13.900
We go, for this template, five
prime to three prime.

00:20:13.900 --> 00:20:16.550
This way goes five prime to
three prime, the opposite

00:20:16.550 --> 00:20:17.890
direction there.

00:20:17.890 --> 00:20:21.040
I now add on.

00:20:21.040 --> 00:20:22.790
If this is a T, what
do I add in?

00:20:22.790 --> 00:20:23.996
AUDIENCE: [INAUDIBLE].

00:20:23.996 --> 00:20:31.000
ERIC LANDER: If it's a
GCGTAAT, et cetera.

00:20:31.000 --> 00:20:32.740
Why does the right base go in?

00:20:37.540 --> 00:20:38.864
Why does the right base go in?

00:20:38.864 --> 00:20:39.318
Yeah?

00:20:39.318 --> 00:20:40.680
AUDIENCE: Hydrogen bonding.

00:20:40.680 --> 00:20:41.030
ERIC LANDER: Hydrogen bonding.

00:20:41.030 --> 00:20:42.880
It's got these
hydrogen bonds.

00:20:42.880 --> 00:20:44.440
AT makes two hydrogen bonds.

00:20:44.440 --> 00:20:46.610
GC makes three hydrogen bonds.

00:20:46.610 --> 00:20:50.500
The wrong base could
never go in.

00:20:50.500 --> 00:20:50.800
Sorry.

00:20:50.800 --> 00:20:54.160
In biochemistry, do you
ever say never?

00:20:54.160 --> 00:20:56.400
No, we say K equilibrium.

00:20:56.400 --> 00:20:59.940
We say how much more unfavored
is it for the

00:20:59.940 --> 00:21:02.350
wrong base to go in?

00:21:02.350 --> 00:21:05.500
It's not impossible, it's just
disfavored, because it's

00:21:05.500 --> 00:21:08.260
energetically less good.

00:21:08.260 --> 00:21:10.940
How much energetically
less good is it?

00:21:10.940 --> 00:21:14.660
What is the delta G for putting
in the wrong base?

00:21:14.660 --> 00:21:15.910
It's not infinity.

00:21:18.470 --> 00:21:24.660
It turns out that there is an
equilibrium constant for

00:21:24.660 --> 00:21:33.260
putting in the wrong base, and
that is K equilibrium is about

00:21:33.260 --> 00:21:37.280
10 to the third for the right
base, 10 to the minus third

00:21:37.280 --> 00:21:39.590
for the wrong base.

00:21:39.590 --> 00:21:40.290
Thank goodness.

00:21:40.290 --> 00:21:43.930
So only one time in 1,000 does
it put in the wrong base.

00:21:43.930 --> 00:21:45.130
That's what that has
to mean, right?

00:21:45.130 --> 00:21:48.870
If it's 1,000 times less favored
energetically, it

00:21:48.870 --> 00:21:55.410
means you only make a mistake
one letter in 1,000.

00:21:55.410 --> 00:21:57.700
How do you feel about that
for your own genomes?

00:21:57.700 --> 00:21:59.440
Is that a level of quality
control you

00:21:59.440 --> 00:22:00.500
are satisfied with?

00:22:00.500 --> 00:22:01.226
AUDIENCE: No.

00:22:01.226 --> 00:22:02.180
ERIC LANDER: No.

00:22:02.180 --> 00:22:04.980
How big is a typical gene?

00:22:04.980 --> 00:22:08.220
Typical gene is, in terms
of its protein coding

00:22:08.220 --> 00:22:10.708
information, you guys already
know about DNA goes to RNA

00:22:10.708 --> 00:22:11.090
goes to protein.

00:22:11.090 --> 00:22:13.870
It's about 2,000 bases of
protein coding information.

00:22:13.870 --> 00:22:18.870
That guarantees two mistakes
per cell division.

00:22:18.870 --> 00:22:20.310
Not good.

00:22:20.310 --> 00:22:22.320
Two mistakes per
cell division.

00:22:22.320 --> 00:22:23.110
That's not OK.

00:22:23.110 --> 00:22:26.340
That's two mistakes
per cell division.

00:22:26.340 --> 00:22:31.030
That would be two errors per
cell division, and you have a

00:22:31.030 --> 00:22:35.440
lot of cell divisions, you're
in a lot of trouble.

00:22:35.440 --> 00:22:38.200
So it turns out something
more is needed.

00:22:38.200 --> 00:22:41.030
Quality control is needed.

00:22:41.030 --> 00:22:49.770
So later, it was discovered
that the enzyme DNA

00:22:49.770 --> 00:23:01.350
polymerase, which has a five
prime to three prime

00:23:01.350 --> 00:23:08.200
polymerization activity also
does a second thing.

00:23:12.680 --> 00:23:16.590
That same enzyme, DNA
polymerase, is also a three

00:23:16.590 --> 00:23:19.590
prime to five prime
exonuclease.

00:23:22.360 --> 00:23:25.120
What do you think an
exonuclease is?

00:23:25.120 --> 00:23:26.000
AUDIENCE: [INAUDIBLE].

00:23:26.000 --> 00:23:28.320
ERIC LANDER: Take stuff out.

00:23:28.320 --> 00:23:31.740
So it adds bases in the forward
direction, but it also

00:23:31.740 --> 00:23:34.910
goes backwards and
takes bases out.

00:23:34.910 --> 00:23:36.710
Isn't that dumb?

00:23:36.710 --> 00:23:38.220
I thought we were trying to
synthesize, but we're also

00:23:38.220 --> 00:23:39.470
unsynthesizing.

00:23:41.410 --> 00:23:44.598
With some probability, it goes
backwards and takes out bases.

00:23:48.590 --> 00:23:53.290
Turns out that the probability
of taking out a base backwards

00:23:53.290 --> 00:23:54.860
is higher if it's
the wrong base.

00:23:58.840 --> 00:24:02.540
It's proofreading as it goes
as I hope you are.

00:24:02.540 --> 00:24:05.440
It's proofreading.

00:24:05.440 --> 00:24:09.160
It goes backwards and takes
bases out more often.

00:24:09.160 --> 00:24:12.400
Sometimes it takes out the
right bases, but it is

00:24:12.400 --> 00:24:14.170
proofreading its work.

00:24:20.230 --> 00:24:23.080
And more often when it's the
wrong base, it goes backwards,

00:24:23.080 --> 00:24:25.920
and so you get the benefit of
a K equilibrium from the

00:24:25.920 --> 00:24:26.940
original base.

00:24:26.940 --> 00:24:29.420
And then there's a separate
K equilibrium for the

00:24:29.420 --> 00:24:30.985
proofreading, and
that helps you.

00:24:30.985 --> 00:24:35.260
And when you combine the
proofreading with the original

00:24:35.260 --> 00:24:42.520
accuracy, now, we're down to
something like 10 to the minus

00:24:42.520 --> 00:24:48.070
five or 10 to the minus
six errors per

00:24:48.070 --> 00:24:49.760
base, per cell division.

00:24:53.510 --> 00:24:55.935
It's only making on the order
of one error per million.
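
NOTE
The proofreading argument above can be checked with a little arithmetic. This is a minimal sketch, not the lecture's own calculation: the 10 to the minus third insertion error rate is the lecture's figure, while the two removal probabilities and the function name are hypothetical, chosen only for illustration. A removed base is assumed to be re-inserted and proofread again under the same odds.
```python
INSERT_ERROR = 1e-3      # lecture's figure: wrong base about 1 time in 1,000
P_REMOVE_WRONG = 0.999   # hypothetical: chance the exonuclease removes a wrong base
P_REMOVE_RIGHT = 0.01    # hypothetical: chance it removes a correct base anyway
def error_rate_after_proofreading(insert_error, p_remove_wrong, p_remove_right):
    """Fraction of bases that are wrong among those that survive proofreading."""
    kept_wrong = insert_error * (1 - p_remove_wrong)
    kept_right = (1 - insert_error) * (1 - p_remove_right)
    return kept_wrong / (kept_wrong + kept_right)
rate = error_rate_after_proofreading(INSERT_ERROR, P_REMOVE_WRONG, P_REMOVE_RIGHT)
print(f"error rate with proofreading: {rate:.2e}")
# If removal ignored correctness, proofreading would not improve accuracy at all.
no_bias = error_rate_after_proofreading(INSERT_ERROR, 0.5, 0.5)
print(f"error rate if removal ignores correctness: {no_bias:.2e}")
```
With these made-up probabilities the rate lands near one in a million, and it stays at one in a thousand when right and wrong bases are removed equally often, so the gain comes entirely from the bias toward removing wrong bases.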

00:25:00.000 --> 00:25:03.080
Now are we satisfied?

00:25:03.080 --> 00:25:03.730
No.

00:25:03.730 --> 00:25:05.370
You guys are pretty hard-nosed.

00:25:05.370 --> 00:25:09.760
Not good enough, because you
have 50 cell divisions to make

00:25:09.760 --> 00:25:11.340
more and some cells go
through many, many,

00:25:11.340 --> 00:25:12.880
many more cell divisions.

00:25:12.880 --> 00:25:14.400
Not acceptable.

00:25:14.400 --> 00:25:15.650
But it's a start.

00:25:20.430 --> 00:25:22.940
So proofreading helps.

00:25:22.940 --> 00:25:26.850
So we have the fidelity
of replication.

00:25:26.850 --> 00:25:31.120
Replication makes an error
at a rate of 10

00:25:31.120 --> 00:25:32.650
to the minus third.

00:25:32.650 --> 00:25:39.100
Proofreading brings you down
to 10 to the minus six, and

00:25:39.100 --> 00:25:41.130
there's another process.

00:25:41.130 --> 00:25:44.400
There are a set of enzymes that
go around and feel the

00:25:44.400 --> 00:25:48.340
DNA double helix after it's
finished, and if you put in

00:25:48.340 --> 00:25:53.620
the wrong base, the width of
the helix is not right.

00:25:53.620 --> 00:25:55.550
The shape is wrong.

00:25:55.550 --> 00:25:57.900
It feels for mismatches.

00:25:57.900 --> 00:26:00.060
So there is a mismatch
repair system.

00:26:04.980 --> 00:26:10.900
Mismatch repair comes along,
and if there was an error

00:26:10.900 --> 00:26:15.120
right here, the helix bulges
out too much let's say.

00:26:15.120 --> 00:26:22.400
Mismatch repair cuts, removes
some DNA, and gives the cell

00:26:22.400 --> 00:26:25.510
another chance to do it again.

00:26:25.510 --> 00:26:30.350
Mismatch repair gets you down
to something in the

00:26:30.350 --> 00:26:33.020
neighborhood of 10 to
the minus eighth, 10

00:26:33.020 --> 00:26:35.840
to the minus ninth.

00:26:35.840 --> 00:26:37.200
Let's say for the sake
of argument, 10

00:26:37.200 --> 00:26:38.390
to the minus ninth.

00:26:38.390 --> 00:26:41.610
Your genome is about three
times 10 to the ninth.

00:26:41.610 --> 00:26:45.100
Now that's one or
two errors per genome,

00:26:45.100 --> 00:26:46.350
that's not so bad.
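
NOTE
The three fidelity levels and the genome size quoted above can be multiplied out directly. This is a rough sketch using only the lecture's round numbers; the constant and function names are made up for illustration. The product for the last stage comes out at about three, the same order of magnitude as the "one or two" quoted in the lecture.
```python
# Per-base error rates quoted in the lecture (per cell division).
POLYMERASE_ALONE = 1e-3        # base selection only
AFTER_PROOFREADING = 1e-6      # plus exonuclease proofreading
AFTER_MISMATCH_REPAIR = 1e-9   # plus mismatch repair
GENOME_SIZE = 3e9              # bases, the lecture's figure for your genome
def expected_errors(rate_per_base, genome_size=GENOME_SIZE):
    """Expected number of errors in one copy of the genome."""
    return rate_per_base * genome_size
stages = [
    ("polymerase alone", POLYMERASE_ALONE),
    ("with proofreading", AFTER_PROOFREADING),
    ("with mismatch repair", AFTER_MISMATCH_REPAIR),
]
for label, rate in stages:
    print(f"{label}: about {expected_errors(rate):,.0f} errors per genome copy")
```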

00:26:49.540 --> 00:26:51.240
Why do we care?

00:26:51.240 --> 00:26:52.890
Why am I bothering
you with this?

00:26:52.890 --> 00:26:55.810
Who cares between 10 to the minus
sixth and 10 to the minus ninth?

00:26:55.810 --> 00:26:57.060
Big deal.

00:26:59.630 --> 00:27:07.450
Well, a few percent of you in
this class are heterozygous

00:27:07.450 --> 00:27:11.590
for a mutation in the mismatch
repair enzymes.

00:27:11.590 --> 00:27:12.730
Don't worry.

00:27:12.730 --> 00:27:16.000
Your cells have the other
copy that's good.

00:27:16.000 --> 00:27:20.930
But suppose one of your cells
were to lose, by mutation, the

00:27:20.930 --> 00:27:23.060
good copy of the mismatch
repair enzyme?

00:27:23.060 --> 00:27:26.250
And now that cell in your body
had no copies of mismatch

00:27:26.250 --> 00:27:27.500
repair enzyme.

00:27:31.780 --> 00:27:33.250
What do you think is going
to happen to your DNA

00:27:33.250 --> 00:27:35.930
replication?

00:27:35.930 --> 00:27:37.660
Instead of being one in a
billion, it would be one in a

00:27:37.660 --> 00:27:40.220
million accuracy.
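
NOTE
To get a feel for what losing mismatch repair costs, the lecture's numbers can be combined: a genome of about three times 10 to the ninth bases, 50 cell divisions, and error rates of one in a billion versus one in a million. This is only a back-of-the-envelope sketch; it assumes errors accumulate independently and linearly along one line of descent, and the names are hypothetical.
```python
GENOME_SIZE = 3e9        # bases (lecture's figure)
DIVISIONS = 50           # cell divisions (lecture's figure)
WITH_REPAIR = 1e-9       # errors per base per division with mismatch repair
WITHOUT_REPAIR = 1e-6    # errors per base per division without it
def errors_along_lineage(rate_per_base, divisions=DIVISIONS, genome_size=GENOME_SIZE):
    """Expected errors accumulated along one line of descent."""
    return rate_per_base * genome_size * divisions
normal = errors_along_lineage(WITH_REPAIR)
defective = errors_along_lineage(WITHOUT_REPAIR)
print(f"with mismatch repair:    about {normal:,.0f} errors after {DIVISIONS} divisions")
print(f"without mismatch repair: about {defective:,.0f} errors after {DIVISIONS} divisions")
print(f"fold increase: about {defective / normal:,.0f}x")
```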

00:27:40.220 --> 00:27:41.460
Turns out you have
an extremely high

00:27:41.460 --> 00:27:44.780
risk of colon cancer.

00:27:44.780 --> 00:27:47.660
There are hereditary colon
cancer syndromes that are due

00:27:47.660 --> 00:27:50.800
to inherited defects in the
mismatch repair system.

00:27:50.800 --> 00:27:53.380
It is not at all trivial.

00:27:53.380 --> 00:27:55.900
Hereditary nonpolyposis
colon cancer is due to a

00:27:55.900 --> 00:27:57.340
defect in this enzyme.

00:28:02.770 --> 00:28:03.720
It matters.

00:28:03.720 --> 00:28:06.010
You've got to get it down to
that level because otherwise,

00:28:06.010 --> 00:28:09.780
you're getting mutations that
cause cancer, that is, when

00:28:09.780 --> 00:28:12.430
you lose both copies, if
you lost both copies.

00:28:12.430 --> 00:28:14.890
Most of your cells would be
fine, but if you lose the

00:28:14.890 --> 00:28:18.100
other good copy, by chance,
that cell can

00:28:18.100 --> 00:28:21.490
go on to cause cancer.

00:28:21.490 --> 00:28:23.460
So this stuff actually
matters.

00:28:23.460 --> 00:28:29.120
Finally, finally, speed.

00:28:29.120 --> 00:28:31.900
Kind of fun to talk
about speed.

00:28:31.900 --> 00:28:35.291
How fast does polymerase work?

00:28:35.291 --> 00:28:43.720
It turns out that polymerase
is able to polymerize 2,000

00:28:43.720 --> 00:28:50.180
nucleotides per second.

00:28:50.180 --> 00:28:52.070
That's very impressive to me.

00:28:52.070 --> 00:28:54.910
It zips along at 2,000
nucleotides per second,

00:28:54.910 --> 00:28:59.050
installing the right base,
getting it right only 99.9% of

00:28:59.050 --> 00:29:03.260
the time, proofreading as it
goes, and gets the whole thing

00:29:03.260 --> 00:29:06.580
done 2,000 letters
in a second.

00:29:06.580 --> 00:29:08.580
That is impressive
engineering.

00:29:08.580 --> 00:29:12.110
That is really impressive
engineering.
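
NOTE
The speed figure invites a quick calculation. This sketch takes the lecture's 2,000 nucleotides per second and its genome size of three times 10 to the ninth, and assumes, purely for illustration, a single polymerase copying from one end to the other without stopping. The 8-hour target in the second part is a hypothetical number, not from the lecture; cells are not limited to a single polymerase.
```python
import math
RATE = 2_000             # nucleotides per second (lecture's figure)
GENOME_SIZE = 3e9        # bases (lecture's figure)
def copy_time_seconds(genome_size=GENOME_SIZE, rate=RATE):
    """Time for one polymerase to copy the whole genome without pausing."""
    return genome_size / rate
def polymerases_needed(target_hours, genome_size=GENOME_SIZE, rate=RATE):
    """How many polymerases working in parallel would finish within target_hours."""
    return math.ceil(copy_time_seconds(genome_size, rate) / (target_hours * 3600))
seconds = copy_time_seconds()
print(f"one polymerase: {seconds:,.0f} seconds, about {seconds / 86_400:.1f} days")
print(f"polymerases needed to finish in 8 hours: {polymerases_needed(8)}")
```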

00:29:12.110 --> 00:29:17.170
So that's kind of how DNA
replication works well, except

00:29:17.170 --> 00:29:18.420
for one thing.

00:29:29.540 --> 00:29:30.790
Kornberg was a biochemist.

00:29:33.290 --> 00:29:35.300
Biochemists purify things
in test tubes.

00:29:41.180 --> 00:29:49.650
He discovered an enzyme,
Kornberg's polymerase.

00:29:54.880 --> 00:29:59.420
How do we know it's the enzyme
the cell actually

00:29:59.420 --> 00:30:03.070
uses to copy its DNA?

00:30:03.070 --> 00:30:04.600
See, I'm a geneticist.

00:30:04.600 --> 00:30:07.300
I look at Kornberg and
I say, nice job.

00:30:07.300 --> 00:30:12.030
You showed me an enzyme that in
a test tube is capable of

00:30:12.030 --> 00:30:14.250
polymerizing DNA.

00:30:14.250 --> 00:30:17.340
How do I know that's the enzyme
that's actually doing

00:30:17.340 --> 00:30:19.110
it when the cell copies
its whole genome?

00:30:22.210 --> 00:30:25.858
What does a geneticist
want to see?

00:30:25.858 --> 00:30:27.350
AUDIENCE: A mutant.

00:30:27.350 --> 00:30:28.230
ERIC LANDER: A mutant.

00:30:28.230 --> 00:30:32.000
Show me a mutant then
I'll believe.

00:30:32.000 --> 00:30:39.330
So someone went along and took
E. colis one at a time because

00:30:39.330 --> 00:30:40.400
what else could they do.

00:30:40.400 --> 00:30:43.670
And for every single E. coli
they grew up from a plate,

00:30:43.670 --> 00:30:45.722
they purified Kornberg's
enzyme.

00:30:45.722 --> 00:30:48.680
And you know what they found?

00:30:48.680 --> 00:30:55.710
They found a mutant E. coli that
lacked Kornberg's enzyme,

00:30:55.710 --> 00:30:57.900
and it could replicate
its DNA just fine.

00:31:01.950 --> 00:31:03.200
What does that tell us?

00:31:07.440 --> 00:31:11.370
Kornberg actually had
the wrong enzyme.

00:31:11.370 --> 00:31:13.270
He still deserves a Nobel Prize
for it because he got an

00:31:13.270 --> 00:31:15.340
enzyme that could copy DNA.

00:31:15.340 --> 00:31:18.790
It's actually not the main
enzyme that does the job.

00:31:18.790 --> 00:31:22.270
Because we can make a mutant
that lacks that enzyme and it

00:31:22.270 --> 00:31:27.220
can still copy the DNA, it
can't be the main enzyme.

00:31:27.220 --> 00:31:31.320
Turns out what Kornberg found
was a minor polymerase that

00:31:31.320 --> 00:31:34.320
was used in those mismatch
repair situations that would

00:31:34.320 --> 00:31:37.190
come along and do the tidying
and clean up.

00:31:37.190 --> 00:31:39.810
The main enzyme turned out to
be another enzyme, a more

00:31:39.810 --> 00:31:41.390
complicated enzyme.

00:31:41.390 --> 00:31:45.450
So my point about biochemistry
and genetics both having to

00:31:45.450 --> 00:31:48.900
talk to each other, you only
really know something when you

00:31:48.900 --> 00:31:51.940
have it from a biochemical point
of view and the genetic

00:31:51.940 --> 00:31:53.000
point of view.

00:31:53.000 --> 00:31:54.500
The two have to go together.

00:31:54.500 --> 00:31:56.110
Kornberg's enzyme is
a great enzyme,

00:31:56.110 --> 00:31:57.470
it's a fantastic enzyme.

00:31:57.470 --> 00:32:00.410
It just happens not to be the
main enzyme, and you can only

00:32:00.410 --> 00:32:02.560
know that by genetics.

00:32:02.560 --> 00:32:05.360
Of course, you can only purify
it by biochemistry.

00:32:05.360 --> 00:32:06.050
All right.

00:32:06.050 --> 00:32:09.910
So that's DNA replication.

00:32:09.910 --> 00:32:12.640
Any questions about DNA
replication before I go on?

00:32:12.640 --> 00:32:13.060
Yes?

00:32:13.060 --> 00:32:15.210
AUDIENCE: [INAUDIBLE].

00:32:15.210 --> 00:32:17.450
ERIC LANDER: Polymerase III or
polymerase II, depending on

00:32:17.450 --> 00:32:18.200
the organism.

00:32:18.200 --> 00:32:19.330
They're all called
polymerases.

00:32:19.330 --> 00:32:20.650
They're all DNA polymerases.

00:32:20.650 --> 00:32:22.470
They just get different
names and numbers.

00:32:22.470 --> 00:32:25.150
Turns out most cells have
multiple polymerases and

00:32:25.150 --> 00:32:27.780
Kornberg found the kind
of simpler polymerase.

00:32:27.780 --> 00:32:29.980
The main replication polymerase,
also called

00:32:29.980 --> 00:32:32.190
polymerase but with a different
number, is a

00:32:32.190 --> 00:32:33.410
different more complicated
enzyme.

00:32:33.410 --> 00:32:33.800
Yes?

00:32:33.800 --> 00:32:39.620
AUDIENCE: How does the enzyme
know which one is the right..?

00:32:39.620 --> 00:32:41.520
ERIC LANDER: How does it know
which one is right?

00:32:41.520 --> 00:32:44.747
AUDIENCE: [INAUDIBLE].

00:32:44.747 --> 00:32:47.080
ERIC LANDER: Because 50% of
the time you get it wrong.

00:32:47.080 --> 00:32:48.530
Do you know what bacteria do?

00:32:48.530 --> 00:32:49.420
What a great question.

00:32:49.420 --> 00:32:50.670
How would it know which
one to get right?

00:32:53.500 --> 00:32:56.400
Know what bacteria do?

00:32:56.400 --> 00:32:57.780
They're very tricky.

00:32:57.780 --> 00:33:00.210
They mark their DNA, don't
worry about this.

00:33:00.210 --> 00:33:02.680
They mark their DNA with
methyl groups.

00:33:02.680 --> 00:33:04.870
There is an enzyme that comes
along and puts methyl groups at

00:33:04.870 --> 00:33:11.640
certain positions, but that
enzyme is kind of slow.

00:33:11.640 --> 00:33:15.100
So I have a methyl-marked
DNA double helix.

00:33:15.100 --> 00:33:18.580
When I replicate it, the new
strand is made, and what does

00:33:18.580 --> 00:33:19.610
the new strand lack?

00:33:19.610 --> 00:33:21.000
AUDIENCE: Little
methyl groups.

00:33:21.000 --> 00:33:22.910
ERIC LANDER: Little
methyl groups.

00:33:22.910 --> 00:33:25.110
It'll get them eventually
because that slow enzyme will

00:33:25.110 --> 00:33:29.050
come along and put them on, but
mismatch repair is fast.

00:33:29.050 --> 00:33:31.940
So what is mismatch repair
looking for?

00:33:31.940 --> 00:33:34.480
The little methyl groups that
are kind of breadcrumbs that

00:33:34.480 --> 00:33:38.330
say, this was the old strand,
and this guy is the new strand.

00:33:38.330 --> 00:33:39.370
It's thought of everything.

00:33:39.370 --> 00:33:41.020
It's really smart.

00:33:41.020 --> 00:33:42.270
Very, very smart.
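
NOTE
The methyl-group trick can be pictured as a small decision rule: trust the methylated strand and correct the unmethylated one. The sketch below is a toy model under loud assumptions: the Strand class and function name are hypothetical, the two strands are written position by position so that index i pairs with index i, and the whole new strand is rewritten rather than a short stretch being cut out and resynthesized.
```python
from dataclasses import dataclass
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
@dataclass
class Strand:
    bases: str
    methylated: bool   # True marks the old (template) strand
def repair_new_strand(a, b):
    """Return the unmethylated strand corrected to pair with the methylated one."""
    if a.methylated == b.methylated:
        # With no difference in marking, the old strand cannot be identified.
        raise ValueError("cannot tell the old strand from the new one")
    old, new = (a, b) if a.methylated else (b, a)
    fixed = "".join(COMPLEMENT[base] for base in old.bases)
    return Strand(fixed, new.methylated)
old_strand = Strand("ATGC", methylated=True)
new_strand = Strand("TACC", methylated=False)   # last base is a mismatch
print(repair_new_strand(old_strand, new_strand).bases)
```
If both strands carried the same marking, the function refuses to guess, which mirrors the point in the lecture that without the methyl groups the repair would get it wrong half the time.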