Topics covered: Recombinant DNA II
Instructor: Prof. Graham Walker
What we've talked about in recombinant DNA so far is how to get a piece of DNA from somewhere and make a whole lot of copies of it. So, instead of working with DNA extracted from my cells, which has 3 billion base pairs of sequence, we can work with a little stretch of DNA. But simply being able to clone it, being able to get a lot of a fragment of DNA, was a long way from being able to figure out what its sequence is.
That sequence of that gene is actually from the xeroderma gene, the gene that's broken in the xeroderma pigmentosum variant patient, who is missing one of these translesion DNA polymerases that can copy over a thymine-thymine pyrimidine dimer induced by UV light. So the reason they have that problem with their skin after sunlight is that they are missing a polymerase that can copy accurately over this very common kind of DNA damage.
But how do you get from having a piece of DNA to having the sequence? So, there's a first thing that people learned to do, and you still do this all the time in any molecular biology lab. We're sort of switching now into engineering, as you'll see. In the next things that I'm going to say, you're going to see proteins that I talked to you about because of their biological roles.
DNA polymerases, ligases: we learned what they do. And now you're going to see them used in manipulative ways. Restriction enzymes: they have a biological purpose, too. They weren't put on Earth for me to cut up DNA into fragments and clone. They were there to give the bacteria some kind of primitive immune system. But here's the first thing that you often have to do. You have, let's say, a plasmid into which we've inserted a fragment.
And let's say it was the kind of cloning we described the other day, where I had cut the vector at an EcoRI site, and the other DNA at EcoRI sites. So the junctions between the inserted fragment and the cut vector have re-created two EcoRI sites. And if I cut with EcoRI, I'll just undo what I did in the cloning, and we should get the vector DNA and the insert DNA back again.
So, to give us an orientation on this vector, I'm going to imagine that it has one more restriction site. This one is called SalI. It just recognizes a different sequence. So, if I take the plasmid DNA and cut with EcoRI, I'm just reversing the cloning. So I should get the vector DNA and the insert that I generated in the first place.
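The EcoRI digest described here can be sketched in a few lines of Python. This is only an illustration: the plasmid size, the insert size, and the site coordinates are hypothetical numbers chosen for the example, and `digest_circular` is a helper name invented here, not something from the lecture.

```python
def digest_circular(plasmid_length, cut_positions):
    """Return fragment sizes (bp), largest first, after cutting a circular DNA."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [plasmid_length]  # an uncut circle is still one molecule
    fragments = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]           # next cut, wrapping around the circle
        size = (end - start) % plasmid_length     # distance from this cut to the next
        fragments.append(size or plasmid_length)  # a single cut just linearizes the circle
    return sorted(fragments, reverse=True)


# Hypothetical sizes: a 4000 bp vector carrying a 1000 bp insert.
PLASMID_LENGTH = 5000
ECORI_SITES = [0, 1000]  # the two re-created EcoRI sites flank the insert

ecori_fragments = digest_circular(PLASMID_LENGTH, ECORI_SITES)
print(ecori_fragments)  # larger fragment = vector, smaller fragment = insert
```

With these assumed coordinates the digest gives back one vector-sized piece and one insert-sized piece, which is the "reversing the cloning" result.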
But I have to detect them somehow. Unfortunately, they don't just look like that in the test tube. So, what people do is they use a very simple principle. It's called gel electrophoresis. And the idea is you just make a gel of something. In this particular case, it's just made of agarose, which is purified from agar. These are polysaccharide products, often derived from seaweed or something like that. They have the property that if you warm them up, they're liquid, and that if you let them cool down, they're a gel.
You've run into Jell-O, which has that property. That's actually made of a protein rather than a carbohydrate. But it's that kind of principle. So it's very easy to pour something and then let it solidify. Now you've got a slab. And it's just a network of things that interact. And the principle of the thing is that you have to get the molecules to move.
Well that's pretty easy with nucleic acids because they're charged. They have all those phosphates. They've got a lot of negative charge. So if you apply an electric field, they'll move. And the principle of the thing is that if you're big, it's harder to wiggle through this network than if you are small. Or you can think of a big, fat person trying to go through a forest with a lot of trees, and a little skinny one. And if we let them have a race, eventually the skinny one will emerge from the forest first.
And so, if we had a set of markers down on the side, where this is big and this is small, and we take this piece of DNA, we're going to get two fragments. The bigger one would be the vector and the smaller one would be the insert. So from that, if I didn't know what I started with, I could ask: what must be in that plasmid? It's the vector. And I can run the vector all by itself and see it's exactly the same size.
And I've got an insert of this particular size. And now, can I learn anything more about that just using restriction enzymes? Let's say I now take the same plasmid. Actually, maybe I'll do it over here. So let's cut, this time, with EcoRI plus another restriction enzyme. They all have these weird names: this one is BamHI. And let's see what happens. Well, suppose I do that and I get something like this.
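To make the idea of reading sizes against markers concrete, here is a minimal sketch. It assumes the common approximation that, over a limited range, migration distance is roughly linear in the logarithm of fragment length; the ladder values and the name `estimate_size` are hypothetical, made up for this example rather than measured data.

```python
import math

# Hypothetical marker lane: (fragment size in bp, distance migrated in mm).
LADDER = [(10000, 10.0), (5000, 18.0), (2000, 29.0), (1000, 37.0), (500, 45.0)]


def estimate_size(distance_mm, ladder=LADDER):
    """Estimate fragment size by interpolating log10(size) between two markers."""
    points = sorted(ladder, key=lambda marker: marker[1])  # order by distance
    for (size_a, dist_a), (size_b, dist_b) in zip(points, points[1:]):
        if dist_a <= distance_mm <= dist_b:
            fraction = (distance_mm - dist_a) / (dist_b - dist_a)
            log_size = math.log10(size_a) + fraction * (
                math.log10(size_b) - math.log10(size_a)
            )
            return 10 ** log_size
    raise ValueError("distance is outside the range covered by the markers")


# A band that ran halfway between the 1000 bp and 500 bp markers.
print(round(estimate_size(41.0)))
```

Small fragments run farther, so a band's position between two markers brackets its size; the interpolation only refines that bracket under the stated assumption.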
Well, it looks like the vector wasn't cut at all. That still seems to be the same, but it looks as though the insert got cut into two pieces. Since it was linear and it gave two pieces, it must have one BamHI site in it. And so, this molecule that I cloned could be one of two ways. It could be like this. Let's say this is the insert. Here's the EcoRI. I'll use this SalI to orient us.
So, the Bam site could either be close over here, or it could be over on the other side. Does that make sense? The logic is pretty simple. How can I tell which of those is correct, just doing the kind of stuff I'm doing? Beautiful, beautiful. So we cut with SalI plus BamHI. In one case, you would get a fragment like that. In the other case, you'd get a fragment like that. That should feel familiar to you.
It should feel just like what we were doing when we did that phage cross, and we had some genes that were lined up. And we were trying to figure out, was the orientation this way? Or was the orientation that way? That was exactly the same principle. And so, in the lab you'd usually call this restriction mapping, or making a restriction map. And it enabled people to manipulate fragments of DNA and make inferences about their orientation and other features before we could actually even sequence DNA.
And that's just part of the routine sort of stuff you do in a lab. The equipment is disarmingly simple. It looks something like that. Usually you put in some colored dye so you can see that things are moving down the gel. And the way you visualize the DNA is you add a molecule. The name of it doesn't particularly matter. It's called ethidium bromide. But its property is that it doesn't fluoresce when it's just in solution.
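The restriction-mapping logic can be written out as a small sketch. All coordinates here are hypothetical (a 5000 bp plasmid, EcoRI sites at 0 and 1000, SalI at 1500, and two candidate BamHI positions), and the function names are invented for the illustration. The point is only that the EcoRI plus BamHI digest cannot tell the two maps apart, while the SalI plus BamHI digest can.

```python
def digest_circular(plasmid_length, cut_positions):
    """Return fragment sizes (bp), largest first, after cutting a circular DNA."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [plasmid_length]
    fragments = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]
        fragments.append((end - start) % plasmid_length or plasmid_length)
    return sorted(fragments, reverse=True)


# Hypothetical map coordinates in bp.
PLASMID_LENGTH = 5000
ECORI_SITES = [0, 1000]  # flank the 1000 bp insert
SALI_SITE = 1500         # in the vector, 500 bp past the right-hand EcoRI site
CANDIDATE_BAMHI_SITES = {
    "BamHI near the left EcoRI site": 200,
    "BamHI near the right EcoRI site": 800,
}


def predicted_digests(bamhi_site):
    """Fragment sizes expected for each double digest, given a BamHI position."""
    return {
        "EcoRI+BamHI": digest_circular(PLASMID_LENGTH, ECORI_SITES + [bamhi_site]),
        "SalI+BamHI": digest_circular(PLASMID_LENGTH, [SALI_SITE, bamhi_site]),
    }


def consistent_maps(observed_sali_bamhi):
    """Return the candidate maps whose SalI+BamHI prediction matches the gel."""
    observed = sorted(observed_sali_bamhi, reverse=True)
    return [
        label
        for label, site in CANDIDATE_BAMHI_SITES.items()
        if predicted_digests(site)["SalI+BamHI"] == observed
    ]


for label, site in CANDIDATE_BAMHI_SITES.items():
    print(label, predicted_digests(site))

print(consistent_maps([700, 4300]))
```

Both candidates predict the same EcoRI plus BamHI pattern, so only the digest that uses the outside reference site (SalI) distinguishes the two orientations.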
But it's a flat molecule, and it can intercalate in between the base pairs in DNA. DNA has all those stacked base pairs going down the helix. This molecule's flat, and it likes to slip inside. And now it's in a much more hydrophobic environment. It's hidden from the water, and it becomes fluorescent. And so, DNA that's soaked up this dye will fluoresce when you put a UV light on it.
So if I take the gel out of there after I've run it, and soak it in this dye and then shine a little handheld UV light on it, it would look something like that if I photographed it. And so, you would end up with those patterns that look exactly like that. Oops, I guess I took the other one out. But you can, of course, depending on how complicated it is, you could have a lot of different fragments.
OK, so the next big thing that had to happen in order for us to really move to where we are in today's molecular biology was that, somehow, DNA had to be sequenced. And as I say, when I was an undergrad, or even when I was just starting out, when I was a postdoc anyway, it just seemed like, how would you ever do it? Because every nucleotide was joined by a phosphodiester bond.
The only difference was the base that was there. It seemed very, very difficult. It was hard to imagine you would ever be able to sort out the sequence of a billion base pairs. Of course, you could clone. Now you've got maybe a fragment of DNA that's a couple hundred base pairs long, and at least the problem becomes smaller. Maybe you could work it out. Now, there were a couple of different ways of doing it.
One was by Wally Gilbert, who's up at Harvard, who got half the Nobel Prize for doing this. The other one, the one that's proved to be most generally useful, is from Fred Sanger in England. And he and Wally shared the Nobel Prize for discovering sequencing. And the principle was disarmingly simple. I think it's one of these great ideas that you look back at afterward and think, I could have thought of that.
You guys already know everything you need to invent how to sequence DNA. I've told you all the stuff already. But nobody's come down to tell me that you've got it. And I didn't think of it. So here is the principle. It's what we've talked about: we take a DNA polymerase plus the four deoxynucleoside triphosphates. Remember, we talked about the deoxyribonucleotides, deoxyadenosine triphosphate, and so on.
There are four different ones. And we take a primer. And there's a three prime hydroxyl right there. And the other strand is going in the opposite direction. If we add that, I think you all know what's going to happen. We're going to get an extension to the other end. And what happens every time we add a nucleotide is that the three prime hydroxyl attacks the phosphate of the triphosphate. We lose two of the phosphates.
This is called pyrophosphate, and we've created a new five prime to three prime linkage. That gives us a new three prime hydroxyl, and we repeat the process, right? That's what we talked about. So, what would happen if we spike in a little, let me do it, a little dideoxy TTP? So this is dideoxy. But what would we mean by that? Well, remember where the deoxy came from? The deoxyribose has a hydrogen instead of a hydroxyl at the two prime position, and at the three prime position it has a hydroxyl.
If we made a dideoxy, we'd make that: it has a hydrogen at the three prime position as well. What could that nucleotide do? Well, as long as the polymerase thought it was useful, it would use this end. It would have its triphosphate up here. So, somebody else's three prime OH could come down and form a bond to here, and we'd lose this. So it could get incorporated. But it has no three prime hydroxyl of its own, so that chain is finished. It can't be elongated anymore. So, let's think what would happen if we had, let me stretch this out a little bit here, and let's imagine we had a few A's in the sequence.
So, we are just going to spike it a bit. So, most of the chains will not see a dideoxy. So, from this primer, we'll try elongating this. When we get to this point, many of them will put in an ordinary T, but a few will put in a dideoxy. And those will finish. At that point, they can't go any farther. The rest of them keep going, and they add various nucleotides. When we get to the next A, most of them will put in a good T, but the ones that put in a dideoxy will stop, and they will generate a fragment that looks like that.
You get the idea. Out of this reaction, we are going to get a set of fragments. And each one terminates where there was an A up there. Now, in the newer version of this thing, the trick is to attach a dye to this dideoxy nucleotide, in this case a T, so it has a particular color. So, suppose we had something that was yellow.
Then this particular set of fragments would be yellow. And maybe you can begin to see what would happen now. If we did the same game three more times, each time using a different dideoxy, next time maybe we'll use dideoxy A. And we'll put a different colored dye on it. Then every time, in this case we come to a T in the template, it would stop, and we'd get a little fragment that's stopped because it incorporated a dideoxy A, and those would be, let's say, green.
So, by the end of this, we would have all possible fragments if we mixed them all together, and the last nucleotide on each fragment would say who it was by its color. So, suppose you were then to take this whole mixture of DNA fragments and run them down a gel. In this case it's a different gel; it's a polyacrylamide gel, because you're trying to separate smaller fragments.
You could sort of see what would happen. The big ones would be at the top. The small ones would be at the bottom. And you'd see each band would have a different color depending on the dideoxy that terminated its chain. So, if you had a little scanner that just goes along, it can read this, and it will print out something. And these are always slightly idealized.
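Here is a minimal sketch of the dideoxy idea in Python, under simplifying assumptions: the template is a short made-up sequence written 3' to 5', the primer itself is ignored, and every possible termination product is generated rather than simulated at random. The yellow (T) and green (A) colors follow the lecture's example; the other two colors and all function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Dye on each dideoxy nucleotide. Yellow for T and green for A follow the
# lecture's example; the colors for C and G are hypothetical.
DYE = {"T": "yellow", "A": "green", "C": "blue", "G": "red"}


def new_strand(template_3_to_5):
    """Strand the polymerase would synthesize (5'->3') along the template."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def chain_terminated_fragments(template_3_to_5, dideoxy_base):
    """All products that end where the given dideoxy base was incorporated."""
    product = new_strand(template_3_to_5)
    return [
        product[: i + 1] for i, base in enumerate(product) if base == dideoxy_base
    ]


def read_gel(template_3_to_5):
    """Pool the four reactions, sort by size, and read the terminal colors."""
    fragments = []
    for dideoxy_base in "ACGT":
        fragments.extend(chain_terminated_fragments(template_3_to_5, dideoxy_base))
    fragments.sort(key=len)  # smallest fragments run farthest, so read from the bottom
    sequence = "".join(fragment[-1] for fragment in fragments)
    colors = [DYE[fragment[-1]] for fragment in fragments]
    return sequence, colors


TEMPLATE = "TAACGGTA"  # hypothetical template, written 3'->5'

print(chain_terminated_fragments(TEMPLATE, "T"))  # stops opposite each template A
sequence, colors = read_gel(TEMPLATE)
print(sequence)
print(colors)
```

Reading the terminal color of each fragment from shortest to longest spells out the newly synthesized strand, which is the complement of the template.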
This is a real one. But this is the sort of stuff you get back. If you send a piece of DNA over to a sequencing center, they'd send this back as a file or something. And you'd sit there. And it's very good these days. The technology didn't used to be as good, but they can almost always now get the sequence. Occasionally, you'll get something like a run of G's that gets a little hard, but what they'll do is what they call sequencing both strands.
You can see that this way we're really only looking at the information from one strand. So if we took the other strand, and we did the same thing, we should get the complementary piece of information. So, what this DNA sequencing allows you to do, then, is determine the exact sequence of nucleotides in some kind of piece. And much of the art in the rest of it then comes in how you assemble all of those pieces together. In the case of a bacterium or something, it wasn't so bad because its DNA was small enough.
You could cut it into a bunch of sort of big fragments, and then take each one of those, and then the sorting problem was relatively simple. In the case of something like humans, it was really complicated because there was so much more DNA. And the other thing is that higher organisms such as yourselves have a lot of repeated DNA. It's just the same sequence, and sometimes there's quite a bit of it, a bunch of repeats.
And so, if you see that at the end of your thing, you don't really quite know where you are in the genome. So a lot of other tricks had to be brought into play, including knowledge of the human genetic map. And so you could get yourself anchored at various places because you knew that this particular piece of DNA, because it was associated with some gene, had to be here on the chromosome. And therefore, things at least beside it were there on the chromosome.
And there were a whole lot of tricks to putting it together. But the very basic principle of how we sequence DNA has at its heart the same process that I was talking to you about when we were doing DNA replication, except in this case it's just used in a very clever way. And that was an amazing idea. It got a Nobel Prize, and you've been sitting here for the last month with all the knowledge to do it.
I keep emphasizing that you've got to have that three prime hydroxyl. But with some of the great ideas, when you look back, you can often see that the hurdle was kind of small. And they didn't even have to do this with dyes at the beginning. In fact, that was a later innovation. The key thing was just the dideoxies stopping in each place. I was lucky enough to live through some of this, the development of this technology.
OK, so I've got one more really big thing to tell you, which again was extraordinarily clever, but extraordinarily simple once you heard about it. And it was one more technological advance. It wasn't a big insight into biology in and of itself, but it was a technology that opened up just incredible experimental possibilities. And it's something known as the Polymerase Chain Reaction.
And this allows, in principle, someone like me to go and grab a single cell from you, take its DNA, and get a copy of any gene I want from your genome. And I can look and see whether you have any mutations in that gene, or whether there are different polymorphic alleles in the population, and which one you got from your mom, or which one you got from your dad. So, starting from a single DNA molecule, I can make as much as I want.
And this is just like DNA sequencing. You guys already know everything you need to know to invent this technique as well. It has very much that same property. It's another one of these very brilliant insights where you just had to put things in the right place. So let me explain the principle. So, suppose there's a gene for which I know there's a family history of something, and I would like to know, did I happen to get the allele that carries that? Or did I get the one that didn't? So, in principle what I would like to do is to get a hold of the piece of DNA for that gene from my own cells.
But all I've started with is my entire DNA. Well, I could clone it. I could make a recombinant library. I could do everything else. But there's this other simple way. And what it involves is this: since I know the sequence of the genome now, I know that almost everything is going to be the same. There will be little differences between individuals.
I'll make a little primer that corresponds to the sequence at one end of the gene, and another primer that corresponds to the DNA at the other end of the gene, or whatever fragment I want to use. And that's all I have to do in terms of getting anything made. Now the rest, we are just going to play games with DNA, with DNA polymerase, and nucleoside triphosphates, just all the stuff I dragged you through talking about DNA replication.
So here's the idea. So here's my DNA, let's say, or part of it. If I could actually see the sequence, I would know, let's say, the gene I'm interested in is in here. So, what I would do is make a little primer. It just has to be long enough to confer specificity in something like the human genome. If I make something probably 30 nucleotides long, that's enough.
It'll only bind one place in the DNA. And I make one, let's say, for the opposite strand over here. So remember, this is five prime to three prime, five prime to three prime. The principle is we'll heat to 95°C, and we'll denature the DNA. And we'll add an excess of the two primers. And let's say we'll cool to 55°C, or something. We'll cool it down enough so that we can get the primers on. But we are not going to go all the way and let all the strands find their way back.
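The earlier claim that a primer of about 30 nucleotides binds only one place can be checked with rough arithmetic. This sketch assumes a genome of random sequence with equal base frequencies, which real genomes are not (they contain repeats), so treat the numbers as an order-of-magnitude illustration. The function name is hypothetical; the 3 billion base pair genome size is the figure used in the lecture.

```python
GENOME_SIZE_BP = 3_000_000_000  # about 3 billion base pairs, as in the lecture


def expected_random_matches(primer_length, genome_size=GENOME_SIZE_BP):
    """Expected number of exact matches to a primer in a random genome.

    Assumes each base is equally likely and independent, and counts both
    strands (hence the factor of 2).
    """
    return 2 * genome_size / 4 ** primer_length


for length in (10, 16, 20, 30):
    print(length, expected_random_matches(length))
```

Under this assumption a 10-mer is expected to occur thousands of times by chance, while a 30-mer is expected far less than once, which is why a primer of that length is normally specific.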
And we'll add a DNA polymerase plus four deoxy nucleoside triphosphates. Well, what will happen? Well, here's one of the strands. And we'll prime it here, let's say. So, it will copy down here and go as far as it can go. And the other one starts here. And it's going to go down all that way. Let's just repeat the whole process now, OK? What'll happen? Now when we pull them apart, we ought to have four strands.
We'll have the original ones here, and when I repeat the process, the same thing's going to happen again. This one will go here, and it will copy out. This one will go here. It will copy out. But what about this guy? So, this one becomes this one here. So the primer will bind here and copy it, and it can't go any further, because the template strand ends there. I just generated a piece that's exactly what I wanted.
And the same deal here, as long as I don't get lost. What did I do? So, we've got this guy here. So, it starts there. So this one becomes this one, and we'll prime it here. It'll go along and it will stop. So there's the complementary strand to the one here. And I think this is sort of like doing a math problem. You can't just look at it and say, well, maybe you will get it.
But there's nothing like sitting down with a pencil and paper and taking yourself through several cycles. What you won't believe is how quickly you get to having nothing but, or almost nothing but, the sequence that you are trying to amplify. And so, this again has an astonishing effect. This is why you hear about DNA testing all the time in forensics, because you can take a tiny bit of DNA from saliva, or semen, or blood, or whatever they might find at a crime scene, and then they can amplify little pieces and compare them.
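The pencil-and-paper exercise can also be sketched as a count of strand types per cycle. This assumes an idealized reaction in which every strand is copied exactly once in each cycle, which real PCR does not quite achieve. The category names are invented here: "original" strands are the starting genomic DNA, "long" strands start at a primer but run past the far end of the target, and "target" strands are bounded by a primer at both ends.

```python
def pcr_strand_counts(cycles, starting_duplexes=1):
    """Track strand types through idealized PCR cycles.

    Returns a list of (cycle, original, long, target) tuples.
    """
    original = 2 * starting_duplexes  # the two strands of each starting duplex
    long_strands = 0
    target = 0
    history = []
    for cycle in range(1, cycles + 1):
        new_long = original                 # copying an original gives a long strand
        new_target = long_strands + target  # copying a long or target strand stops
                                            # at the end defined by the other primer
        long_strands += new_long
        target += new_target
        history.append((cycle, original, long_strands, target))
    return history


history = pcr_strand_counts(30)
for cycle, original, long_strands, target in history[:5]:
    print(cycle, original, long_strands, target)

_, original, long_strands, target = history[-1]
target_fraction = target / (original + long_strands + target)
print(target_fraction)
```

The long strands grow only linearly (two more per cycle) while the exact-length target strands roughly double each cycle, so after a few tens of cycles nearly everything in the tube is the target fragment.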
And there's a trick they use in forensics, and that is that there are sequences within the human genome where there are little variable repeats like GT, GT, GT, GT, GT, GT, and I might have 14 of them on one of my chromosomes. The one I got from my mom might have 40. You might have 24 and something else, and so on. If you were to do PCR around a little region that was known to be variable, if you had 14 repeats you'd get a shorter fragment.
And if you had 40 repeats, you'd get a longer fragment. So, I'll come back to that in a sec. So, if you were to, for example, take something with a long repeat and a short repeat into this kind of thing, we'd get two fragments, say, one from the paternal chromosome and one from the maternal. And if you do this with several such sites around the genome, pretty soon you run into situations where the odds of a particular combination, a long one at this site, a short one at that site, and so on, make it statistically improbable that it's anyone other than yourself.
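A short sketch of the two ideas here: the PCR product gets longer with more repeats, and the chance of a coincidental match shrinks when several sites are combined. The flanking length and the per-site frequencies below are hypothetical numbers chosen only for illustration, the function names are invented, and multiplying frequencies assumes the sites are inherited independently.

```python
from math import prod

REPEAT_UNIT = "GT"
FLANK_BP = 100  # hypothetical: total flanking sequence between the primers, incl. primers


def amplicon_length(repeat_count, repeat_unit=REPEAT_UNIT, flank_bp=FLANK_BP):
    """Length of the PCR product spanning a repeat region."""
    return flank_bp + repeat_count * len(repeat_unit)


def combined_profile_frequency(per_site_frequencies):
    """Chance that a random person matches at every site (independence assumed)."""
    return prod(per_site_frequencies)


# The lecture's example alleles: 14 repeats on one chromosome, 40 on the other.
print(amplicon_length(14), amplicon_length(40))

# Hypothetical fraction of the population sharing the pattern at each of three sites.
SITE_FREQUENCIES = [0.10, 0.05, 0.02]
print(combined_profile_frequency(SITE_FREQUENCIES))
```

Each added site multiplies the match probability by another small number, which is why a handful of variable sites is enough to make a coincidental match very unlikely.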
So at a crime scene, if they did this, they might, for example, have three individuals that they were thinking were possible suspects. And they'd generate patterns like this, say, using three different loci like this, and then have the forensic sample. And it would be pretty evident who didn't do it, and who at least remains a suspect. This probably wouldn't prove it. The very last thing, just to close us off, is that when people developed the PCR technique, you had to sit there with your pipette, because every time you raised it to 90° to denature the DNA, you killed your enzyme.
So, you cooled it down to 55° and squirted in a new DNA polymerase. And then someone finally said, another brilliant idea, what if I had a thermoresistant polymerase? Where would I find those? Well, Penny was talking to you about those vents where it's really, really hot, those black smokers and everything, so maybe you get a bacterium from there. It would have a temperature-resistant polymerase.
So here you are, from the New England Biolabs catalog: Vent exo-minus DNA polymerase, Deep Vent DNA polymerase. People went and grabbed those bacteria from there and grabbed the DNA polymerase gene. And now, the DNA polymerase just sits there. It just laughs when you bring it up to 90°. And when you cool it back down and give it a substrate again, it will do its thing.
And so, this whole thing can be done automatically, and you don't have to sit there and pipette something in at the end of every cycle: another cute little engineering trick that combined ecology together with biology. OK, see you on Friday.