1 00:00:09 --> 00:00:13 Good morning. So, I see we have a lot of parents here. 2 00:00:13 --> 00:00:17 How many parents have we got here? 3 00:00:17 --> 00:00:20 Welcome to the parents. How many of the parents have done 4 00:00:20 --> 00:00:23 the reading for today? [LAUGHTER] Good, because we'll call 5 00:00:23 --> 00:00:27 on the parents too, right? We'll see what happens. 6 00:00:27 --> 00:00:31 All right, so, where are we? We've talked about this diagram that I 7 00:00:31 --> 00:00:36 keep coming back to. If you want to 8 00:00:36 --> 00:00:40 study biological function, the two traditional ways to do that were to 9 00:00:40 --> 00:00:44 look at genetics or to look at biochemistry: genetics, the 10 00:00:44 --> 00:00:48 study of an organism with one broken component, those components being 11 00:00:48 --> 00:00:53 genes; biochemistry, the study of the purification of 12 00:00:53 --> 00:00:55 individual components from an organism away from the organism, 13 00:00:55 --> 00:00:58 particularly the most important such components being proteins. 14 00:00:58 --> 00:01:01 What do they have to do with each other? 15 00:01:01 --> 00:01:05 In the unification in molecular biology that occurred in the middle 16 00:01:05 --> 00:01:09 of the century, from the 1950s into the '60s and really 17 00:01:09 --> 00:01:13 up to 1970 or so, we came to a conceptual understanding 18 00:01:13 --> 00:01:16 that genes encode proteins, 19 00:01:16 --> 00:01:19 and therefore these two different ways of looking at the organism, 20 00:01:19 --> 00:01:23 organism minus a component and components 21 00:01:23 --> 00:01:26 minus an organism, were complementary points of view, 22 00:01:26 --> 00:01:29 and in theory, you could go from a gene sequence to a protein 23 00:01:29 --> 00:01:32 sequence, a protein sequence back to a gene sequence, 24 00:01:32 --> 00:01:35 go from a gene sequence to its function, its function 25 00:01:35 --> 00:01:40 to a protein, except for one tiny detail.
This was all just 26 00:01:40 --> 00:01:46 conceptual. Conceptually we understood by about 1970 27 00:01:46 --> 00:01:51 that the DNA made the RNA made the protein. The protein carried 28 00:01:51 --> 00:01:56 out the function but as of then, 29 00:01:56 --> 00:02:00 you couldn't individually work with or purify the DNA corresponding to 30 00:02:00 --> 00:02:04 any particular gene. All of the inferences had been 31 00:02:04 --> 00:02:07 indirect inferences: indirect inferences from bacterial genetics, 32 00:02:07 --> 00:02:11 bacterial regulation or Meselson-Stahl experiments, 33 00:02:11 --> 00:02:15 and all sorts of interesting indirect ways 34 00:02:15 --> 00:02:19 working out the genetic code, but it didn't let you read anything. 35 00:02:19 --> 00:02:23 This was a problem. Some people in the 36 00:02:23 --> 00:02:26 late 1960s said, great, molecular biology is over. 37 00:02:26 --> 00:02:29 We understand in principle how life works. Now let's go understand 38 00:02:29 --> 00:02:31 how the brain works. And there was an exodus of some 39 00:02:31 --> 00:02:34 people from molecular biology into neurobiology to now go 40 00:02:34 --> 00:02:36 nail the brain, figured that would be worth another 41 00:02:36 --> 00:02:39 ten years or so. But in fact, remarkably, 42 00:02:39 --> 00:02:42 people began to focus on how you could get to 43 00:02:42 --> 00:02:46 work with individual specific genes. Now, what's so hard about that? I 44 00:02:46 --> 00:02:49 mean, it's not very hard to crack open a 45 00:02:49 --> 00:02:53 red blood cell and purify different proteins. You can purify hemoglobin. 46 00:02:53 --> 00:02:56 You can purify different enzymes. Biochemistry allows you to purify 47 00:02:56 --> 00:02:59 different components from each other. 
I want to purify an enzyme: let's 48 00:02:59 --> 00:03:02 crack open a yeast cell, separate the 49 00:03:02 --> 00:03:04 proteins over some column that separates them based on their size 50 00:03:04 --> 00:03:06 or their charge, and I'll get purer and purer 51 00:03:06 --> 00:03:09 fractions. I'll assay each fraction to see 52 00:03:09 --> 00:03:11 which one has the enzymatic activity. But basically I use the 53 00:03:11 --> 00:03:14 physical chemical properties of the proteins to 54 00:03:14 --> 00:03:17 separate them into different buckets. Why not do that with, 55 00:03:17 --> 00:03:21 say, the human DNA and purify out the gene for beta 56 00:03:21 --> 00:03:25 globin, which encodes the beta component of hemoglobin? 57 00:03:25 --> 00:03:30 What would be the problem with just using physical chemical purification 58 00:03:30 --> 00:03:36 to purify one human gene from another? Well, I mean, it's one 59 00:03:36 --> 00:03:39 very big molecule. Well, I could shear it up. 60 00:03:39 --> 00:03:43 Maybe I'll just break it up. Now, let's purify the beta 61 00:03:43 --> 00:03:47 globin containing part. It all looks the same. It's just 62 00:03:47 --> 00:03:51 DNA. It's one chemical polymer with pretty boring properties, 63 00:03:51 --> 00:03:54 and they're not very different. 64 00:03:54 --> 00:03:56 Any particular DNA sequence and any other DNA sequence have basically about 65 00:03:56 --> 00:04:00 the same molecular weight, same charges; 66 00:04:00 --> 00:04:04 there's nothing to separate them by. How are you going to purify beta 67 00:04:04 --> 00:04:08 globin? That was the problem. That's where 68 00:04:08 --> 00:04:11 recombinant DNA came in. Recombinant DNA was a remarkable 69 00:04:11 --> 00:04:15 and totally different way of purifying individual 70 00:04:15 --> 00:04:19 components. And the basis of it was this notion of cloning. 71 00:04:19 --> 00:04:23 If I want to purify out from the human 72 00:04:23 --> 00:04:27 genome, how big is the human genome?
The human genome is about three 73 00:04:27 --> 00:04:31 billion bases long. If I want to purify a particular 74 00:04:31 --> 00:04:37 gene, let's say beta globin or some other gene, 75 00:04:37 --> 00:04:46 a typical gene might be on the order of 30,000 letters long. 76 00:04:46 --> 00:04:49 This is a one part in 10^5 purification I've got to achieve. 77 00:04:49 --> 00:04:53 Any given gene is only one part in 78 00:04:53 --> 00:04:56 10^5 of the human genome. And then, what about a typical 79 00:04:56 --> 00:04:59 mutation? Maybe the mutation that causes 80 00:04:59 --> 00:05:05 sickle cell anemia by changing a single nucleotide in beta globin, 81 00:05:05 --> 00:05:10 well, that's one base pair. So that means I'm trying to identify 82 00:05:10 --> 00:05:14 something that's on the order of one part in 10^9, actually 83 00:05:14 --> 00:05:17 a little less than one part in 10^9 of the whole genome. 84 00:05:17 --> 00:05:21 Carrying out purifications like that is really kind of hard 85 00:05:21 --> 00:05:25 to imagine. But the way it was done was by the invention of cloning. 86 00:05:25 --> 00:05:29 Let me give a brief overview of the idea 87 00:05:29 --> 00:05:37 of cloning, and then we'll dive into the details. The idea of cloning 88 00:05:37 --> 00:05:42 was, the way to purify individual molecules would just be 89 00:05:42 --> 00:05:46 to take the molecules and just dilute them so that there 90 00:05:46 --> 00:05:49 was only one of each molecule. That's very pure, 91 00:05:49 --> 00:05:52 isn't it? The problem is it's not very much, so you need 92 00:05:52 --> 00:05:56 a way to take a single copy of a molecule, 93 00:05:56 --> 00:06:00 and then make many copies of it. So purification's not hard.
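The purification ratios quoted in the lecture can be checked with a few lines of arithmetic. This is only a sketch: the genome and gene sizes are the round numbers from the lecture, and the names `GENOME_BASES`, `GENE_BASES`, and `fraction_of_genome` are made up for illustration.

```python
# Round figures taken from the lecture; treated here as assumptions.
GENOME_BASES = 3_000_000_000   # human genome, about three billion bases
GENE_BASES = 30_000            # a "typical" gene, on the order of 30,000 letters


def fraction_of_genome(target_bases: int, genome_bases: int = GENOME_BASES) -> float:
    """Return the fraction of the genome taken up by a target of the given size."""
    return target_bases / genome_bases


gene_fraction = fraction_of_genome(GENE_BASES)  # one gene
base_fraction = fraction_of_genome(1)           # one base pair, e.g. a point mutation

print(f"one gene is 1 part in {1 / gene_fraction:,.0f}")
print(f"one base pair is 1 part in {1 / base_fraction:,.0f}")
```

With these inputs a gene comes out at one part in about 10^5, and a single base pair at one part in about 3 x 10^9, which is the "a little less than one part in 10^9" mentioned in the lecture.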
You 94 00:06:00 --> 00:06:03 just dilute it down so you work with 95 00:06:03 --> 00:06:05 single molecules but then you need to copy it back again 96 00:06:05 --> 00:06:08 and again and again, and no biochemical technique 97 00:06:08 --> 00:06:10 involves, say, fractionating a cell and replicating 98 00:06:10 --> 00:06:13 some enzyme, you know, copying some enzyme. 99 00:06:13 --> 00:06:16 You can't copy enzymes, but you can copy DNA, and that was 100 00:06:16 --> 00:06:19 the basis of it. So here's the way it goes. The 101 00:06:19 --> 00:06:22 basic overview we'll look at is take your DNA and cut your DNA 102 00:06:22 --> 00:06:28 of interest, maybe the human genome, 103 00:06:28 --> 00:06:36 into pieces at defined sites Then, paste your DNA, which is more 104 00:06:36 --> 00:06:44 technically ligate, the word we use. 105 00:06:44 --> 00:06:53 Paste your DNA to some other DNA called a vector. So, 106 00:06:53 --> 00:06:57 cut your DNA and paste your DNA. Each piece of your, say, human DNA 107 00:06:57 --> 00:07:06 gets stuck to some piece of vector. 108 00:07:06 --> 00:07:18 Insert this DNA into vectors that can replicate in bacteria. 109 00:07:18 --> 00:07:25 So, I'm going to actually take my piece of human DNA 110 00:07:25 --> 00:07:28 and not just ligate it to any piece of DNA. 111 00:07:28 --> 00:07:32 I'm going to take my human DNA, and I'm going to ligate it to a 112 00:07:32 --> 00:07:35 vector that has all of the machinery, 113 00:07:35 --> 00:07:39 all of the ability to be copied in a bacteria. Then what 114 00:07:39 --> 00:07:45 I'm going to do is I'm going to transform my DNA into a 115 00:07:45 --> 00:07:53 host cell, a host bacterial cell. Transform means introduce. When we 116 00:07:53 --> 00:07:59 talk about transforming DNA, 117 00:07:59 --> 00:08:02 we're not talking about changing it. It's the word that's used for 118 00:08:02 --> 00:08:06 taking my DNA, stuck into a vector, 119 00:08:06 --> 00:08:10 and introducing it into bacterial cells. 
Ideally, 120 00:08:10 --> 00:08:14 each bacterial cell would carry one such 121 00:08:14 --> 00:08:22 DNA molecule, and then what I want to do is I want to plate my cells, 122 00:08:22 --> 00:08:31 and select those that carry human DNA, my DNA; DNA I've put in 123 00:08:31 --> 00:08:38 it. So, I'm going to put them on a Petri plate and I want only the 124 00:08:38 --> 00:08:44 bacteria that happen to have picked up an individual piece of human DNA 125 00:08:44 --> 00:08:47 to grow. So, that's the trick. It's a very simple trick. Take 126 00:08:47 --> 00:08:50 total human DNA, cut it up into pieces, glue 127 00:08:50 --> 00:08:53 it to a vector that's able to be copied so that it's able to be 128 00:08:53 --> 00:08:56 replicated in bacteria, put the vectors into bacterial 129 00:08:56 --> 00:08:59 cells; every bacterial cell picks up no more than one vector. 130 00:08:59 --> 00:09:03 You plate it out, and you simply arrange so that the 131 00:09:03 --> 00:09:06 only cells that grow are those that picked up a piece of human DNA. 132 00:09:06 --> 00:09:09 And then, every one of these colonies 133 00:09:09 --> 00:09:12 is the descendant of a single bacterial cell that picked up a 134 00:09:12 --> 00:09:15 single human molecule, but is obligingly 135 00:09:15 --> 00:09:18 copying that molecule for you again and again and again and again. 136 00:09:18 --> 00:09:21 And thus, you have what we refer 137 00:09:21 --> 00:09:26 to; this whole collection here is called a library of clones. 138 00:09:26 --> 00:09:31 This is called a recombinant library because 139 00:09:31 --> 00:09:35 every piece of the human genome is somewhere in here. 140 00:09:35 --> 00:09:39 You know, this one here probably is actin, and maybe 141 00:09:39 --> 00:09:43 this one here is collagen-11 and that one there might, 142 00:09:43 --> 00:09:47 ah, there's beta globin.
OK, actually 143 00:09:47 --> 00:09:49 when you look at the plate there's no way to tell, but in principle 144 00:09:49 --> 00:09:52 they're all there. So, there will be this question of, 145 00:09:52 --> 00:09:54 how do we look at a library and pull out what the right one 146 00:09:54 --> 00:09:57 is? But somewhere in there should be a bacterial 147 00:09:57 --> 00:10:01 colony that has the pure beta globin gene, the DNA for beta globin. 148 00:10:01 --> 00:10:03 The next lecture will be about how you actually find it. 149 00:10:03 --> 00:10:05 But today let's just build this library. So our 150 00:10:05 --> 00:10:09 goal is to be able to build a library like this. 151 00:10:09 --> 00:10:13 So, we have to figure out how to cut DNA, 152 00:10:13 --> 00:10:15 paste DNA, vectors, etc., etc. So that's what our 153 00:10:15 --> 00:10:18 subject will be today. Let's dive in. First, 154 00:10:18 --> 00:10:26 cutting DNA, how do you cut DNA? Restriction enzymes, 155 00:10:26 --> 00:10:40 etc. It turns out that the way you can cut 156 00:10:40 --> 00:10:43 DNA at particular places is as follows. Let me take a piece of DNA. 157 00:10:43 --> 00:10:47 Here's a double-stranded piece of DNA. We'll go A, 158 00:10:47 --> 00:10:52 G, C, T, A, G, A, A, T, T, C, T, T, A, C, C, 159 00:10:52 --> 00:10:57 hydroxyl there, three prime. Let's go back on the 160 00:10:57 --> 00:11:03 other strand. What do we have? G, 161 00:11:03 --> 00:11:10 G, T, A, A, G, A, A, T, T, C, T, A, G, C, T, 162 00:11:10 --> 00:11:14 hydroxyl there, three prime. There's my double 163 00:11:14 --> 00:11:19 stranded piece of DNA. It turns out that there exists an 164 00:11:19 --> 00:11:29 enzyme that recognizes that exact sequence: G, A, 165 00:11:29 --> 00:11:39 A, T, T, C. The enzyme goes by the name Eco R1. 166 00:11:39 --> 00:11:42 This protein, this enzyme, scans along the DNA, and it 167 00:11:42 --> 00:11:45 finds this sequence: G, A, A, T, T, C. 168 00:11:45 --> 00:11:48 Actually it's on this strand.
What does it say on the other 169 00:11:48 --> 00:11:52 strand? Same thing. It's a reverse palindrome. 170 00:11:52 --> 00:11:56 It's symmetric. That's very good. 171 00:11:56 --> 00:12:00 And it turns out most restriction enzymes do that. OK, 172 00:12:00 --> 00:12:04 so what it does when it finds that, 173 00:12:04 --> 00:12:08 with the benefit of colored chalk that has just shown up here, 174 00:12:08 --> 00:12:14 is it cleaves the DNA fragment like that. 175 00:12:14 --> 00:12:23 And what it gives you then is a broken double strand with 176 00:12:23 --> 00:12:29 an overhang, T, T, A, A, five prime, 177 00:12:29 --> 00:12:35 three prime, three prime, five prime. This has a hydroxyl 178 00:12:35 --> 00:12:38 here. This has a phosphate there. And then this other fragment 179 00:12:38 --> 00:12:47 here is A, A, T, T, C, T, T, A, C, C, G, G, T, 180 00:12:47 --> 00:12:57 A. So, what happens is, and this has a hydroxyl, five 181 00:12:57 --> 00:13:04 prime, three prime, three prime, five prime, I get 182 00:13:04 --> 00:13:10 two fragments of DNA that have been 183 00:13:10 --> 00:13:16 broken there and have an overhang. The overhang is complementary. 184 00:13:16 --> 00:13:23 Those two sequences match each other. There's what's called a five prime 185 00:13:23 --> 00:13:30 overhang and they're complementary. So, we have complementary, 186 00:13:30 --> 00:13:36 that is matching, five prime overhangs. This is called Eco R1 because it's 187 00:13:36 --> 00:13:43 purified, this particular enzyme, 188 00:13:43 --> 00:13:49 from E. coli strain R and it's the number one such enzyme that 189 00:13:49 --> 00:13:55 was purified from it. So, it is very simple nomenclature 190 00:13:55 --> 00:14:01 here. Now, here's a question. Why do bacteria have an enzyme like 191 00:14:01 --> 00:14:05 this?
There are some people who feel that the reason 192 00:14:05 --> 00:14:09 is that this enzyme is here precisely to allow molecular 193 00:14:09 --> 00:14:14 biologists to cut and paste DNA, and this represents 194 00:14:14 --> 00:14:20 impressions likely, me among them. 195 00:14:20 --> 00:14:24 How did anybody find this stuff? Well, shaggy dog story, I have to 196 00:14:24 --> 00:14:28 tell you the following shaggy dog story. 197 00:14:28 --> 00:14:31 So, this is a fun shaggy dog story, and it's an MIT shaggy dog story 198 00:14:31 --> 00:14:34 because it comes from the work of Salvador 199 00:14:34 --> 00:14:37 Luria, who is a very famous biologist who worked 200 00:14:37 --> 00:14:41 here at MIT. So, Salvador Luria was studying 201 00:14:41 --> 00:14:46 bacteriophage. Remember, bacteriophage are the 202 00:14:46 --> 00:14:52 viruses that infect bacteria. So, he was studying bacteriophage, 203 00:14:52 --> 00:15:00 and he took his bacteriophage and used it to infect a strain of 204 00:15:00 --> 00:15:07 bacteria, strain A, and he also used it to infect a strain of 205 00:15:07 --> 00:15:13 bacteria, strain B. So when he did that, 206 00:15:13 --> 00:15:17 what you do is you plate a lawn of bacterial cells. 207 00:15:17 --> 00:15:21 You kind of have a slush of bacterial cells that you 208 00:15:21 --> 00:15:26 plate here with virus mixed in, and wherever there's a 209 00:15:26 --> 00:15:32 virus, the virus grows, replicates, and either kills or 210 00:15:32 --> 00:15:35 slows down the growth of the cells so that bacterial cells grow 211 00:15:35 --> 00:15:39 everywhere else, but where a viral particle landed 212 00:15:39 --> 00:15:41 there's an absence of bacterial cells and that hole in the lawn, 213 00:15:41 --> 00:15:44 this whole thing is called a lawn of 214 00:15:44 --> 00:15:47 bacteria, and the holes in the lawn are called plaques. 
215 00:15:47 --> 00:15:51 So, when he did this, he found that when he did it on 216 00:15:51 --> 00:15:56 strain A he got a bunch of plaques and when he did it 217 00:15:56 --> 00:16:02 on strain B, he didn't, no plaques. So what's the 218 00:16:02 --> 00:16:06 simplest explanation for this? 219 00:16:06 --> 00:16:10 Strain B is different somehow. It's resistant to the virus. I 220 00:16:10 --> 00:16:14 don't know, the virus has to come in and do 221 00:16:14 --> 00:16:18 various things, and strain B isn't compatible with 222 00:16:18 --> 00:16:23 the virus or something like that. No big deal. So it's a resistant 223 00:16:23 --> 00:16:28 strain. But, occasionally you'd get a plaque. 224 00:16:28 --> 00:16:33 Very occasionally, you'd have an occasional plaque. 225 00:16:33 --> 00:16:38 So now, how would this be? I said the strain was 226 00:16:38 --> 00:16:44 resistant. How could there be an occasional plaque? 227 00:16:44 --> 00:16:50 Mutation in, could it be a mutation in the bacteria? 228 00:16:50 --> 00:16:56 Sorry. Well, if it was a mutation in the bacteria 229 00:16:56 --> 00:16:59 there would be one bacterium that had the mutation. It would now be 230 00:16:59 --> 00:17:03 susceptible, and it would die. But, the lawn would 231 00:17:03 --> 00:17:06 kind of grow because the cells around it wouldn't have a mutation. 232 00:17:06 --> 00:17:10 So it's probably not a mutation in the bacteria 233 00:17:10 --> 00:17:13 but what could it be? Maybe a mutation of the virus: what 234 00:17:13 --> 00:17:17 if it was a mutation in the virus that was able to overcome 235 00:17:17 --> 00:17:21 the resistance? Ah, so that's OK. 236 00:17:21 --> 00:17:25 So, what this must be is the existence of a resistant 237 00:17:25 --> 00:17:30 virus, that is, a virus that can overcome the resistance of 238 00:17:30 --> 00:17:36 the bacteria. So far: perfectly normal, no problem. Now, let's 239 00:17:36 --> 00:17:43 do the following experiment.
Let's take this resistant virus, 240 00:17:43 --> 00:17:50 and grow it, again, on strain A and grow it on 241 00:17:50 --> 00:17:56 strain b. What do you think is going to happen when I grow it on 242 00:17:56 --> 00:18:02 strain A? It'll grow lots of plaques. It still grows 243 00:18:02 --> 00:18:08 on strain A, and now what's going to happen when I grow 244 00:18:08 --> 00:18:13 it on strain B? If this was really a mutation that 245 00:18:13 --> 00:18:17 made it able to grow on strain b then it gets lots of plaques because 246 00:18:17 --> 00:18:21 it's now gained the ability to grow on strain B, and sure 247 00:18:21 --> 00:18:27 enough, that's what happens. So, 248 00:18:27 --> 00:18:35 there's nothing funky yet. But now, suppose I take one of 249 00:18:35 --> 00:18:41 these resistant viruses that I isolated here on strain B, 250 00:18:41 --> 00:18:44 I grow it again here on strain A. It grows. I grow it on strain B. 251 00:18:44 --> 00:18:47 It grows. If I take it again from strain B and 252 00:18:47 --> 00:18:50 I repeat this, it'll still grow on strain A and 253 00:18:50 --> 00:18:54 still grow on strain B. Let's take one, 254 00:18:54 --> 00:19:00 though, from strain A. It's the resistant one which we have just now 255 00:19:00 --> 00:19:08 happened to have grown on strain A. And now, 256 00:19:08 --> 00:19:19 let's grow it again on strain A versus on strain B. And sure 257 00:19:19 --> 00:19:24 enough, it continues to grow on strain A, no problem. 258 00:19:24 --> 00:19:29 And we grow it now on strain B. And, what shall we 259 00:19:29 --> 00:19:32 get? Well, it should grow on strain B, right, because it was a mutant 260 00:19:32 --> 00:19:36 virus, and it gained the ability to grow on either. 261 00:19:36 --> 00:19:42 We passage it through B, it grows. We passage it through A. 262 00:19:42 --> 00:19:48 But the answer was nothing, no growth. How can that be? 263 00:19:48 --> 00:19:53 We had a virus. 
We agreed that was a mutant virus that 264 00:19:53 --> 00:19:55 had picked up the ability to grow on strain B, and we demonstrated 265 00:19:55 --> 00:19:58 it has now on either A or B. 266 00:19:58 --> 00:20:02 We then reached in, and grabbed a copy of it here from 267 00:20:02 --> 00:20:06 strain A, having grown on strain A, and we try it again and it now 268 00:20:06 --> 00:20:10 won't grow on strain B. If this was a mutation, 269 00:20:10 --> 00:20:15 I mean, maybe the mutation reverted, right? 270 00:20:15 --> 00:20:20 It was a reversion of the mutation. It mutated back. Is that plausible? 271 00:20:20 --> 00:20:24 No, come on. The chance that all of the copies there would mutate 272 00:20:24 --> 00:20:26 back, come on. I mean, you could repeat this 273 00:20:26 --> 00:20:30 several times and this is always what happens. 274 00:20:30 --> 00:20:34 What does that tell you about this mutation in the virus? 275 00:20:34 --> 00:20:39 It can't be a mutation of the virus because 276 00:20:39 --> 00:20:42 if it was a mutation, it would be transmitted through. 277 00:20:42 --> 00:20:46 But, passing through strain A makes it lose 278 00:20:46 --> 00:20:50 its ability to grow on strain B. But as long as you keep passing it 279 00:20:50 --> 00:20:55 through strain B, it can grow on strain B. This is 280 00:20:55 --> 00:20:59 not your typical genetics. So, Salvador Luria loved this. 281 00:20:59 --> 00:21:04 And, he really worked out what was going on. And somehow, 282 00:21:04 --> 00:21:12 well, so anyway, they referred to this as strain B 283 00:21:12 --> 00:21:20 having the ability to restrict the growth of the 284 00:21:20 --> 00:21:25 virus. Strain B can restrict the growth of the virus. 285 00:21:25 --> 00:21:30 That's where this word restriction enzyme 286 00:21:30 --> 00:21:34 comes from. What's really, truly going on here underneath the 287 00:21:34 --> 00:21:38 shaggy dog story? 
It took a long time before 288 00:21:38 --> 00:21:42 the shaggy dog story that Salvador Luria was the one to really 289 00:21:42 --> 00:21:47 demonstrate was fully worked out. But, what turns out to 290 00:21:47 --> 00:21:54 be the case is that strain B has a restriction enzyme. 291 00:21:54 --> 00:22:02 That's how it restricts the growth. It has one of 292 00:22:02 --> 00:22:10 these enzymes that can cut DNA at a specific place. 293 00:22:10 --> 00:22:17 When the virus comes into strain B, it injects its DNA, 294 00:22:17 --> 00:22:25 and the enzyme comes along and cuts the virus's DNA, protecting the 295 00:22:25 --> 00:22:31 bacteria. It's got its own little defense mechanism: pretty cool, 296 00:22:31 --> 00:22:38 pretty cool. So, any DNA that's introduced, 297 00:22:38 --> 00:22:43 if it has the sequence here, say G, A, A, T, T, C, the 298 00:22:43 --> 00:22:49 bacteria cuts it. Wait a second, 299 00:22:49 --> 00:22:55 the bacteria has its own DNA. Why doesn't it chop up its own 300 00:22:55 --> 00:23:01 chromosome? Well, I mean, so one simple possibility would be 301 00:23:01 --> 00:23:05 that if this thing is looking for the sequence, G, A, A, T, T, C 302 00:23:05 --> 00:23:10 in the genome, maybe it's the case that the 303 00:23:10 --> 00:23:15 bacteria has arranged that its own DNA 304 00:23:15 --> 00:23:19 never has a G, A, A, T, T, C. That would be a 305 00:23:19 --> 00:23:24 simple solution, right? But is it a plausible 306 00:23:24 --> 00:23:29 solution? Why not? But just statistically, 307 00:23:29 --> 00:23:33 how often do I expect to encounter a G, A, A, T, T, C? What's 308 00:23:33 --> 00:23:39 the frequency of any given six letter word in a four letter 309 00:23:39 --> 00:23:44 alphabet? It's about one in 4^6. So, about one in 4^6 positions will 310 00:23:44 --> 00:23:48 be a G, A, A, T, T, C, and that's about one in 4,000 311 00:23:48 --> 00:23:52 letters. So, every 4,000 letters, I expect to encounter a 312 00:23:52 --> 00:23:56 G, A, A, T, T, C.
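The "six letter word in a four letter alphabet" estimate can be written out as a short calculation. This is a back-of-the-envelope sketch that assumes all four bases are equally frequent and independent, which real genomes only approximate; the function names and the 4,000,000-letter genome length used as an example input are illustrative assumptions.

```python
def expected_spacing(word_length: int, alphabet_size: int = 4) -> int:
    """Average number of positions per occurrence of one specific word,
    assuming every letter is equally likely and independent."""
    return alphabet_size ** word_length


def expected_sites(genome_length: int, word_length: int) -> float:
    """Rough expected count of one specific word in a genome of the given length."""
    return genome_length / expected_spacing(word_length)


six_cutter_spacing = expected_spacing(6)          # 4**6 positions per site
sites_in_4mb = expected_sites(4_000_000, 6)       # example: a 4-million-letter genome

print(six_cutter_spacing)
print(round(sites_in_4mb))
```

Under these assumptions a specific six-letter site turns up about once every 4,096 letters, so a genome of a few million letters would be expected to carry on the order of a thousand such sites.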
How big is the E coli genome? 313 00:23:56 --> 00:23:59 4 million letters. So, how many G, A, A, 314 00:23:59 --> 00:24:02 T, T, Cs will there be? About 1,000 315 00:24:02 --> 00:24:06 of them. It's just not plausible to imagine that it doesn't have the 316 00:24:06 --> 00:24:10 sites. So, your idea is that if it has these sites, 317 00:24:10 --> 00:24:12 it's got to arrange to protect its own sites. So, how is 318 00:24:12 --> 00:24:18 it going to protect its own sites? 319 00:24:18 --> 00:24:26 Covers it or something. You could imagine something covers 320 00:24:26 --> 00:24:32 it or something, but you want to alter your own, 321 00:24:32 --> 00:24:35 so it turns out you're exactly right. What happens is there is 322 00:24:35 --> 00:24:40 an enzyme that comes along, and at this position, 323 00:24:40 --> 00:24:47 attaches a methyl group. It modifies the DNA 324 00:24:47 --> 00:24:53 by attaching a methyl group. It turns out that that methyl group 325 00:24:53 --> 00:24:59 is enough to prevent the restriction enzyme from binding. 326 00:24:59 --> 00:25:07 So, this blocks the restriction enzyme. So, that way the bacteria 327 00:25:07 --> 00:25:14 is able to distinguish between its own DNA, 328 00:25:14 --> 00:25:21 which is methylated, and the viral DNA. So, wait a second, how does 329 00:25:21 --> 00:25:27 that explain my virus that manage to grow? How did my virus manage to 330 00:25:27 --> 00:25:33 grow? It would need to have gotten itself modified also to be protected. 331 00:25:33 --> 00:25:40 Could that happen by chance? What if the methylation enzyme, 332 00:25:40 --> 00:25:48 the methylase, which is floating around in the cell, 333 00:25:48 --> 00:25:56 “accidentally” methylated the virus's DNA? What would 334 00:25:56 --> 00:26:02 happen then? The virus would become immune. 335 00:26:02 --> 00:26:08 So, suppose the bacteria was pretty clever, and had a 336 00:26:08 --> 00:26:11 lot more restriction enzyme, and only a little bit of methylase? 
337 00:26:11 --> 00:26:14 Well, you'd imagine that most of the 338 00:26:14 --> 00:26:17 time the restriction enzyme would cut up the viral DNA first. 339 00:26:17 --> 00:26:20 But every once in a while, the methylase 340 00:26:20 --> 00:26:24 would get there first and protect the virus's DNA. 341 00:26:24 --> 00:26:28 That becomes an immune virus because it can't 342 00:26:28 --> 00:26:32 be cut by the enzyme anymore. And, if I take that, and I grow it 343 00:26:32 --> 00:26:36 again on strain B, it'll now produce lots of plaques because 344 00:26:36 --> 00:26:42 it was methylated. And, if I grow it 345 00:26:42 --> 00:26:44 again on strain B, it remains methylated because once 346 00:26:44 --> 00:26:46 it's methylated and comes into the cell, it's not 347 00:26:46 --> 00:26:50 cut. And so, its descendants will get methylated. 348 00:26:50 --> 00:26:54 But, what happens if I ever grow that methylated virus 349 00:26:54 --> 00:26:59 on strain A? Strain A doesn't have the restriction enzyme, 350 00:26:59 --> 00:27:04 and it doesn't have the methylase. So, the 351 00:27:04 --> 00:27:08 progeny phage that grew up on strain A aren't methylated. 352 00:27:08 --> 00:27:13 They're no longer protected. The protection that the 353 00:27:13 --> 00:27:16 virus has is the protection that comes from this methylation enzyme. 354 00:27:16 --> 00:27:20 It's not the sequence of the DNA. It's 355 00:27:20 --> 00:27:23 the attachment of these methyl groups. And so, 356 00:27:23 --> 00:27:26 it turns out that if you ever pass this virus through strain A, 357 00:27:26 --> 00:27:39 passage through strain A, the resulting DNA is 358 00:27:39 --> 00:27:48 unmethylated. And now, it can be cut. 359 00:27:48 --> 00:27:52 Well, this explained the weird 360 00:27:52 --> 00:27:56 results of Luria, that somehow bacteria had a complex 361 00:27:56 --> 00:27:59 defense mechanism of a restriction enzyme 362 00:27:59 --> 00:28:02 and a cognate methylase.
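The logic of the shaggy dog story can be captured in a very small model, which may help in following why passage through strain A resets the phage. This is a deliberately simplified sketch, not a simulation of real kinetics: the classes, the two outcome labels ("many" and "rare"), and the rule that progeny methylation depends only on the last host are illustrative assumptions drawn from the lecture's explanation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    name: str
    has_restriction_enzyme: bool
    has_methylase: bool


@dataclass(frozen=True)
class Phage:
    methylated: bool


# Strain A has neither enzyme; strain B has both (as described in the lecture).
STRAIN_A = Strain("A", has_restriction_enzyme=False, has_methylase=False)
STRAIN_B = Strain("B", has_restriction_enzyme=True, has_methylase=True)


def plaques(phage: Phage, host: Strain) -> str:
    """Unmethylated phage DNA is usually cut in a restricting host, so plaques
    are rare (only the occasional escapee that got methylated first)."""
    if host.has_restriction_enzyme and not phage.methylated:
        return "rare"
    return "many"


def progeny(host: Strain) -> Phage:
    """Progeny DNA is made inside the host, so its methylation reflects
    only whether that host has the methylase, not the parent phage."""
    return Phage(methylated=host.has_methylase)


original = Phage(methylated=False)
escapee = progeny(STRAIN_B)     # the rare phage that grew on B
after_a = progeny(STRAIN_A)     # the same lineage after one passage through A

for label, phage in [("original", original), ("from B", escapee), ("from A", after_a)]:
    print(label, "on A:", plaques(phage, STRAIN_A), "| on B:", plaques(phage, STRAIN_B))
```

The point of the model is that the "resistance" is a property carried on the DNA by the last host rather than a change in sequence, which is why it is lost after a single round of growth on strain A.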
The restriction enzyme would cut 363 00:28:02 --> 00:28:06 the sequence. The chromosome would be protected by 364 00:28:06 --> 00:28:09 methylating that site, and usually it would work fine. 365 00:28:09 --> 00:28:12 Occasionally the bacterial virus would get methylated. 366 00:28:12 --> 00:28:15 It would be protected as long as it continues to go through strains that 367 00:28:15 --> 00:28:18 have this restriction-methylation system. That was it. Now, this 368 00:28:18 --> 00:28:21 shaggy dog story took a couple of decades to work out, 369 00:28:21 --> 00:28:25 and eventually led to Nobel prizes for the discovery 370 00:28:25 --> 00:28:28 of restriction enzymes. They're extremely important because 371 00:28:28 --> 00:28:32 although bacteria do this to protect themselves, they have also 372 00:28:32 --> 00:28:35 given us the perfect tool to now cut DNA where we want to 373 00:28:35 --> 00:28:38 cut DNA. Now, what if you wanted to cut at a G, 374 00:28:38 --> 00:28:42 A, A, T, T, C? You've got Eco R1. 375 00:28:42 --> 00:28:47 But what if you wanted to cut it at another sequence? 376 00:28:47 --> 00:28:57 Well, it turns out that if you want to cut it at G, G, A, T, C, C 377 00:28:57 --> 00:29:09 there's an enzyme called Bam H1. If you want to cut it at 378 00:29:09 --> 00:29:14 A, A, G, C, T, T, and on the other strand A, A, G, C, 379 00:29:14 --> 00:29:20 T, T, there's an enzyme called Hind 3. If you 380 00:29:20 --> 00:29:26 want to cut it at just G, A, T, C like this, C, T, A, 381 00:29:26 --> 00:29:34 G, there's an enzyme called Mbo 1. And, there are enzymes that 382 00:29:34 --> 00:29:41 cut it this way, enzymes that cut it this way, 383 00:29:41 --> 00:29:45 enzymes that cut it this way, enzymes that recognize four bases, 384 00:29:45 --> 00:29:49 six bases. There are even enzymes that recognize eight bases. 385 00:29:49 --> 00:29:53 It turns out that bacteria have elaborated zillions of different 386 00:29:53 --> 00:29:57 restriction enzymes that recognize different sequences.
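The recognition sites named in the lecture can be checked for the "reverse palindrome" property, and the Eco R1 cut of the example sequence from the board can be reproduced, with a short sketch. The table below covers only the four enzymes mentioned (written with their standard names EcoRI, BamHI, HindIII, MboI); the cut offsets, the helper names, and the simplification of tracking only the top strand of a linear molecule are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the other strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# enzyme name -> (recognition site, cut offset within the site on the top strand)
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "MboI": ("GATC", 0),       # ^GATC
}


def is_palindromic(site: str) -> bool:
    """A site is a 'reverse palindrome' if both strands read the same 5' to 3'."""
    return site == reverse_complement(site)


def digest(seq: str, enzyme: str) -> list[str]:
    """Cut the top strand of a linear sequence at every site for one enzyme."""
    site, offset = ENZYMES[enzyme]
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


board_sequence = "AGCTAGAATTCTTACC"   # top strand written on the board in the lecture

print(reverse_complement(board_sequence))
print({name: is_palindromic(site) for name, (site, _) in ENZYMES.items()})
print(digest(board_sequence, "EcoRI"))
```

Because each site is its own reverse complement, the enzyme makes the same offset cut on the bottom strand, which is what leaves the single-stranded overhangs described in the lecture.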
This 387 00:29:57 --> 00:30:00 is perfect for molecular biologists. Bacteria, 388 00:30:00 --> 00:30:02 of course, are much smarter than we are, having been at this much 389 00:30:02 --> 00:30:06 longer, and have developed all of these tools for engineering. 390 00:30:06 --> 00:30:10 All we have to do is borrow them. So how do you get Eco R1? 391 00:30:10 --> 00:30:13 You grow up that strain of E. coli; you purify Eco R1. 392 00:30:13 --> 00:30:17 And how do you get Hind 3? You grow up 393 00:30:17 --> 00:30:20 a strain of Haemophilus influenzae. You purify the enzyme. At least, 394 00:30:20 --> 00:30:24 that's how primitive molecular biologists did it. If you 395 00:30:24 --> 00:30:27 wanted to work with a restriction enzyme, you'd grow up the bacteria. 396 00:30:27 --> 00:30:30 You'd purify the enzyme yourself, and you would just use 397 00:30:30 --> 00:30:34 it in your laboratory. Of course today what does a modern 398 00:30:34 --> 00:30:39 molecular biologist do if he or she should want Hind 3? 399 00:30:39 --> 00:30:47 It's in the catalog. So the catalog has 200 restriction 400 00:30:47 --> 00:30:55 enzymes. Yup, PSI-1 is new, on sale, 500 units for $400. 401 00:30:55 --> 00:31:01 Let's see what Eco R1 is going for. 402 00:31:01 --> 00:31:04 Eco R1: look at this, 50,000 units for $200. That's a good 403 00:31:04 --> 00:31:07 price for Eco R1 because it's a very famous 404 00:31:07 --> 00:31:10 enzyme here. So all you have to do is you give them your credit card 405 00:31:10 --> 00:31:15 number and you have it tomorrow by FedEx. So that's how restriction 406 00:31:15 --> 00:31:21 enzymes are obtained today. So, next up, we can cut DNA any 407 00:31:21 --> 00:31:26 place we want to. We now need to glue DNA together. 408 00:31:26 --> 00:31:32 Suppose I cut DNA, human DNA, and I'm 409 00:31:32 --> 00:31:34 going to cut it. I'll just take human DNA, 410 00:31:34 --> 00:31:37 your DNA, which I've purified, and I'm going to cut 411 00:31:37 --> 00:31:46 it at all its Eco R1 sites.
I can take any other DNA I want. 412 00:31:46 --> 00:31:50 I don't know, I could take zebra DNA. I could take anything and I could 413 00:31:50 --> 00:31:56 also cut it at Eco R1 sites. I could mix them together, 414 00:31:56 --> 00:32:02 and after mixing them together the fragments will float around and, 415 00:32:02 --> 00:32:08 remember, this down here has T, T, 416 00:32:08 --> 00:32:16 A, A. This fragment over here from some other piece has 417 00:32:16 --> 00:32:23 T, T, A, A. This could be human DNA. This could be zebra DNA if you want 418 00:32:23 --> 00:32:29 to. It doesn't matter. It could be bacterial DNA. 419 00:32:29 --> 00:32:33 These fragments overlap. They'll hydrogen bond a little bit, 420 00:32:33 --> 00:32:37 but that of course won't introduce a covalent bond here. 421 00:32:37 --> 00:32:42 I'd really like to make a covalent bond. I would like to attach the 422 00:32:42 --> 00:32:44 piece of DNA from one source to the piece of DNA from the 423 00:32:44 --> 00:32:47 other source by doing the opposite of the 424 00:32:47 --> 00:32:49 restriction enzyme. The restriction enzyme cut at these 425 00:32:49 --> 00:32:52 locations. I would now like to catalyze 426 00:32:52 --> 00:32:55 the rejoining of the sugar phosphate backbone here. 427 00:32:55 --> 00:32:59 So I would like to rejoin the sugar phosphate backbone. I 428 00:32:59 --> 00:33:03 have a hydroxyl here. I have a phosphate here, 429 00:33:03 --> 00:33:08 and I would like to ligate them together. So 430 00:33:08 --> 00:33:11 how do I manage to ligate? What kind of fancy chemistry do I 431 00:33:11 --> 00:33:15 do to ligate these pieces of DNA together? I don't do any fancy 432 00:33:15 --> 00:33:20 chemistry. I again sit at the feet of bacteria who have solved all 433 00:33:20 --> 00:33:22 these problems before. And I ask bacteria, how do you do 434 00:33:22 --> 00:33:24 this? And they say, well, we have an enzyme called 435 00:33:24 --> 00:33:26 ligase.
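The cut-and-rejoin idea can be sketched as a toy string model in Python. EcoRI cuts after the G in G^AATTC, so every cut leaves the same single-stranded AATT overhang (the T, T, A, A on the bottom strand), which is why a fragment from one source can pair with a fragment from any other. The sequences and function names below are hypothetical, only the top strand is tracked, and `ligate` simply concatenates strings; it stands in for the enzyme rather than modeling any chemistry.

```python
ECORI_SITE = "GAATTC"
CUT_OFFSET = 1  # EcoRI cuts the top strand after the G: G^AATTC


def digest(dna: str) -> list[str]:
    """Cut a linear top strand at every EcoRI site and return the pieces."""
    fragments = []
    previous_cut = 0
    start = dna.find(ECORI_SITE)
    while start != -1:
        cut = start + CUT_OFFSET
        fragments.append(dna[previous_cut:cut])
        previous_cut = cut
        start = dna.find(ECORI_SITE, start + 1)
    fragments.append(dna[previous_cut:])
    return fragments


def ends_compatible(left: str, right: str) -> bool:
    """Toy check: `left` ends at an EcoRI cut and `right` starts with the overhang."""
    return left.endswith("G") and right.startswith("AATTC")


def ligate(left: str, right: str) -> str:
    """Join two fragments whose ends match, restoring the backbone."""
    if not ends_compatible(left, right):
        raise ValueError("fragment ends are not compatible EcoRI ends")
    return left + right


# Hypothetical sequences standing in for DNA from two different sources.
human = "CCCGAATTCAAA"
zebra = "TTTGAATTCGGG"

human_left, human_right = digest(human)
zebra_left, zebra_right = digest(zebra)

# The human left piece joins the zebra right piece, recreating a GAATTC site.
chimera = ligate(human_left, zebra_right)
print(chimera)
```

Once cut, the fragments carry no record of their source, so the human piece and the zebra piece join as readily as two human pieces would.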
So, you purify ligase from bacteria, 436 00:33:26 --> 00:33:29 you add that, and ligase ligates the fragments together. Why 437 00:33:29 --> 00:33:33 do bacteria have an enzyme ligase? 438 00:33:33 --> 00:33:38 For repair of their own DNA. Things go wrong; this is part of the 439 00:33:38 --> 00:33:42 DNA maintenance scheme of bacteria. They have an enzyme 440 00:33:42 --> 00:33:46 ligase to repair their own breaks in DNA and, obligingly, you 441 00:33:46 --> 00:33:50 can purify DNA ligase. So you add ligase. Today, 442 00:33:50 --> 00:33:54 of course, if you need a ligase, 443 00:33:54 --> 00:33:56 how do you get it? It's in the catalog, 444 00:33:56 --> 00:33:59 absolutely. So, you can glue together any of those things you 445 00:33:59 --> 00:34:04 want. All right, next up, what DNA do I want to stick 446 00:34:04 --> 00:34:10 together? I mean, here I made a silly example. 447 00:34:10 --> 00:34:15 I'm going to stick some human DNA to some zebra 448 00:34:15 --> 00:34:19 DNA. Why do that? I mean, just to show you that I can 449 00:34:19 --> 00:34:22 do it, right? I'm just demonstrating that I could stick any 450 00:34:22 --> 00:34:25 DNA to any DNA. Remember, once I've 451 00:34:25 --> 00:34:27 got a piece of DNA it doesn't know whether it came from a human or a 452 00:34:27 --> 00:34:31 zebra. It's just the molecule. You can stick the molecules together, 453 00:34:31 --> 00:34:36 right? But what do I really want to attach my human DNA to? 454 00:34:36 --> 00:34:45 I want to attach it to some other DNA that 455 00:34:45 --> 00:34:59 has the ability to grow on its own within bacteria. Vectors: I need 456 00:34:59 --> 00:35:05 to make, here's what I would really like. I would like to have 457 00:35:05 --> 00:35:13 a piece of DNA that has some sequences that contain the 458 00:35:13 --> 00:35:23 recognition sites for replication. I'd like to have some replication 459 00:35:23 --> 00:35:31 initiation sites here.
So, a piece of DNA that, 460 00:35:31 --> 00:35:36 remember, because the bacterial chromosome itself, 461 00:35:36 --> 00:35:42 here's my bacteria, the bacterial chromosome replicates 462 00:35:42 --> 00:35:48 itself, and it has the ability to start DNA replication at multiple 463 00:35:48 --> 00:35:55 sites called origins of replication. But, what I 464 00:35:55 --> 00:35:59 would really like is to be able to construct in the laboratory a 465 00:35:59 --> 00:36:07 synthetic piece of DNA that also would function as an 466 00:36:07 --> 00:36:19 origin of replication because then what I could do is in vitro take 467 00:36:19 --> 00:36:24 my piece of DNA, attach it to this vector, 468 00:36:24 --> 00:36:29 and it would now have the ability to grow in 469 00:36:29 --> 00:36:33 the bacteria. How am I going to make a piece of DNA? 470 00:36:33 --> 00:36:37 What kind of engineering tricks can we do to create 471 00:36:37 --> 00:36:41 a small piece of DNA that has all the machinery needed to 472 00:36:41 --> 00:36:45 be able to be copied and replicated just like bacterial 473 00:36:45 --> 00:36:50 chromosomes? That's a pretty fancy feat of engineering. 474 00:36:50 --> 00:36:55 How are you going to do that? Sorry? OK, 475 00:36:55 --> 00:36:59 so who are you going to ask? If you wanted to do this, you're 476 00:36:59 --> 00:37:04 going to ask the experts. Who are the experts? Viruses or 477 00:37:04 --> 00:37:07 bacteria, or basically, if you want to do anything, 478 00:37:07 --> 00:37:10 the place to ask is the folks who have the most 479 00:37:10 --> 00:37:12 experience. And, the folks who have the most 480 00:37:12 --> 00:37:15 experience are almost always prokaryotic organisms because they 481 00:37:15 --> 00:37:18 are by far the most evolved things on 482 00:37:18 --> 00:37:21 this planet.
Anything that can replicate itself and grow every 20 483 00:37:21 --> 00:37:24 minutes or something like that has had a lot more 484 00:37:24 --> 00:37:26 generations of evolution than you have. And therefore, 485 00:37:26 --> 00:37:29 they are much more optimized than we are. And so you go ask and say, 486 00:37:29 --> 00:37:33 has any bacteria worked out how to do this? Turns out bacteria have 487 00:37:33 --> 00:37:37 worked out how to do this just fine. In 488 00:37:37 --> 00:37:43 fact, most bacteria, at least many bacteria, 489 00:37:43 --> 00:37:49 contain within them, in addition to their own chromosome, 490 00:37:49 --> 00:37:57 small circles of DNA. These are called episomes. 491 00:37:57 --> 00:38:06 This is the chromosome. Epi means on top of 492 00:38:06 --> 00:38:10 or in addition to. So in addition to the chromosome, 493 00:38:10 --> 00:38:14 there's an episome. The episome is in fact an autonomously replicating 494 00:38:14 --> 00:38:22 piece of DNA that has an origin. And it replicates. Why do bacteria 495 00:38:22 --> 00:38:30 have episomes? It turns out episomes 496 00:38:30 --> 00:38:34 often contain genes. One of the genes they contain, 497 00:38:34 --> 00:38:38 or some of the types of genes they contain, are resistance 498 00:38:38 --> 00:38:44 genes. There might be, for example, a penicillin resistance 499 00:38:44 --> 00:38:50 gene contained on an episome, or a streptomycin resistance gene. 500 00:38:50 --> 00:38:56 It turns out the bacteria have these 501 00:38:56 --> 00:39:02 episomes containing resistance genes, and they're not in the chromosome. 502 00:39:02 --> 00:39:08 They're separate. Now, why would they do that? 503 00:39:08 --> 00:39:15 It turns out when a bacterium dies and a cell cracks open, the 504 00:39:15 --> 00:39:21 DNA spills out. The next-door neighboring bacterium has 505 00:39:21 --> 00:39:25 mechanisms to suck up DNA from the environment. You never 506 00:39:25 --> 00:39:29 know.
It might find something interesting out there. 507 00:39:29 --> 00:39:33 So, it turns out that bacteria are rather promiscuously exchanging 508 00:39:33 --> 00:39:37 pieces of DNA all the time. And so, 509 00:39:37 --> 00:39:43 a bacteria that has an episome that has a penicillin resistance gene 510 00:39:43 --> 00:39:47 can spread it to other bacteria, and it's very nice. It's compact. 511 00:39:47 --> 00:39:51 It's on its own little episome, autonomously replicating 512 00:39:51 --> 00:39:56 piece of DNA. This is great for bacteria wanting to spread drug 513 00:39:56 --> 00:40:00 resistance. It's not good for human populations, 514 00:40:00 --> 00:40:03 for example, because this is how drug resistance spread through 515 00:40:03 --> 00:40:06 populations. This is why we have spreads of 516 00:40:06 --> 00:40:10 penicillin resistance. Now, of course, wait a second, 517 00:40:10 --> 00:40:15 this whole mechanism of spreading drug resistance, we've only had 518 00:40:15 --> 00:40:20 antibiotics since the 1940s. How did bacteria devise this so 519 00:40:20 --> 00:40:25 quickly? Sorry? Many generations since 1945? 520 00:40:25 --> 00:40:30 That would be very impressive. 521 00:40:30 --> 00:40:34 Yeah, but, I mean, why do they have this episome mechanism, the 522 00:40:34 --> 00:40:39 ability to spread DNA and all that? That's an awful lot 523 00:40:39 --> 00:40:45 to evolve in 50 years? Yeah? Something natural like 524 00:40:45 --> 00:40:51 penicillin. It turns out, we didn't 525 00:40:51 --> 00:40:54 think of penicillin. Who thought of penicillin? 526 00:40:54 --> 00:40:58 Fungi. Right, again, we learn from the lower organisms. Penicillin 527 00:40:58 --> 00:41:01 comes from fungi. Bacteria have been fighting 528 00:41:01 --> 00:41:04 off penicillin for millions and tens of millions of years. So, 529 00:41:04 --> 00:41:07 we may be very proud of our penicillin and all that. 530 00:41:07 --> 00:41:09 But, they've been at this for a very long time. 
531 00:41:09 --> 00:41:11 This is about war between bacteria and fungi. 532 00:41:11 --> 00:41:14 That's what this is, OK? So, that's why these things are 533 00:41:14 --> 00:41:17 here. They're here so that bacteria can have these 534 00:41:17 --> 00:41:20 resistance genes against fungi and things like that that make 535 00:41:20 --> 00:41:24 antibiotics. Antibiotics are natural. We've made a few 536 00:41:24 --> 00:41:27 new ones, but most of the antibiotics have been made by nature. 537 00:41:27 --> 00:41:30 And so, if I wanted to replicate DNA, 538 00:41:30 --> 00:41:35 if I wanted to attach my human DNA to a piece of DNA that's 539 00:41:35 --> 00:41:41 capable of autonomous replication, autonomously replicating circles of 540 00:41:41 --> 00:41:47 DNA, these autonomously replicating circles of DNA are also called 541 00:41:47 --> 00:41:50 plasmids. And that's the word we'll mostly use for them, 542 00:41:50 --> 00:41:54 plasmids. All I need to do is purify a plasmid from 543 00:41:54 --> 00:41:57 a bacteria. So, I find a bacteria that has plasmids. 544 00:41:57 --> 00:42:01 I purify the plasmid, and then I can cut open the plasmid 545 00:42:01 --> 00:42:07 at the Eco R1 site, OK? So, this plasmid will have an 546 00:42:07 --> 00:42:13 ORI, an origin of replication. I'll cut 547 00:42:13 --> 00:42:19 it open at the Eco R1 site. I'll take human DNA fragments that 548 00:42:19 --> 00:42:24 I've cut with Eco R1. I'll mix them with plasmid DNA that has 549 00:42:24 --> 00:42:30 been opened up, has an origin. Ligase will come 550 00:42:30 --> 00:42:34 along, join this up, and now I have a circle of DNA that 551 00:42:34 --> 00:42:38 has all the machinery to autonomously replicate, 552 00:42:38 --> 00:42:41 plus my human DNA. Now, if I wanted to get a vector, or an 553 00:42:41 --> 00:42:45 honest to goodness plasmid, I can go to a bacteria, 554 00:42:45 --> 00:42:50 grow it up, purify the plasmid, and cut it.
Or alternatively, 555 00:42:50 --> 00:42:55 if I needed the plasmid, say, tomorrow, 556 00:42:55 --> 00:42:57 it's in the catalog. The next section of the catalog has 557 00:42:57 --> 00:42:59 a long list of plasmids here. There's 558 00:42:59 --> 00:43:04 a plasmid there, right? It's a nice plasmid. 559 00:43:04 --> 00:43:09 Oh yes, let's see, pUC is a very good plasmid. 560 00:43:09 --> 00:43:12 pBR322 is a good plasmid. The whole section, all this purple 561 00:43:12 --> 00:43:15 stuff are the plasmids. So, you can get the plasmids too. 562 00:43:15 --> 00:43:17 You place one order, you get the restriction enzymes, 563 00:43:17 --> 00:43:20 you get the ligases, you get the plasmids, no 564 00:43:20 --> 00:43:26 problem. So, I can then take total human DNA, cut up, cut 565 00:43:26 --> 00:43:32 up, cut up, cut up, add in plasmid, 566 00:43:32 --> 00:43:38 and I'm going to ligate together. And then, having ligated my human 567 00:43:38 --> 00:43:45 DNA to my plasmids, I'm going to mix with 568 00:43:45 --> 00:43:55 bacteria. I take some bacterial cells. I add my mixture of these 569 00:43:55 --> 00:43:58 plasmids containing human DNA. And now all I have to do is 570 00:43:58 --> 00:44:02 persuade the bacteria to suck up my plasmids containing human DNA. 571 00:44:02 --> 00:44:08 How do I teach bacteria to suck up DNA? They do that for 572 00:44:08 --> 00:44:11 a living. That's what they do. They're always spreading material. 573 00:44:11 --> 00:44:14 They have that ability. All we're doing is we're using 574 00:44:14 --> 00:44:16 their ability. So you get the sense that the kind 575 00:44:16 --> 00:44:19 of engineering that really works in biology is engineering that 576 00:44:19 --> 00:44:22 exploits what nature has been doing for a very long time. 577 00:44:22 --> 00:44:24 Rather than butting your head up against the problem, usually 578 00:44:24 --> 00:44:27 somebody has solved it, and it's almost always 579 00:44:27 --> 00:44:29 bacteria.
So, you've transformed the bacteria. 580 00:44:29 --> 00:44:32 Now, there are a few tricks you can 581 00:44:32 --> 00:44:35 use to make them a little more transformable. 582 00:44:35 --> 00:44:39 You can add calcium phosphate, and blah, blah, blah, 583 00:44:39 --> 00:44:43 but you can sort of persuade them to take up the DNA. And then 584 00:44:43 --> 00:44:48 all you have to do is plate them out on a plate. 585 00:44:48 --> 00:44:55 Plate them out fairly dilutely so there are a lot of single bacterial 586 00:44:55 --> 00:45:03 cells that land on the plate, and wait for them to grow up. Each 587 00:45:03 --> 00:45:10 one of these had a single plasmid, a different plasmid than 588 00:45:10 --> 00:45:16 the next guy over. Wait a second, each one? 589 00:45:16 --> 00:45:22 How do I guarantee that every bacteria in my test 590 00:45:22 --> 00:45:28 tube took up a plasmid? Is that plausible? I mean, 591 00:45:28 --> 00:45:34 I can't guarantee that every bacteria is going to take up a 592 00:45:34 --> 00:45:38 plasmid. Maybe I'll add so much plasmid that every bacteria will 593 00:45:38 --> 00:45:43 take one up. Oh, but that's a bad idea 594 00:45:43 --> 00:45:45 because why? Because then a lot of them will take up more than one. 595 00:45:45 --> 00:45:47 You don't want to do that. You really only 596 00:45:47 --> 00:45:50 want to have at most one. So, if you were going to arrange so 597 00:45:50 --> 00:45:53 that at random you only have about one, you've got to 598 00:45:53 --> 00:45:56 have a lot that are zero. So, this is a problem. I mean, 599 00:45:56 --> 00:45:59 it's a real waste. My library is going to have large numbers of 600 00:45:59 --> 00:46:02 bacteria that don't have any plasmid. In fact, this transformation 601 00:46:02 --> 00:46:06 process is not so efficient. It's not so efficient. So, we 602 00:46:06 --> 00:46:09 have a little bit of a problem here, which is 603 00:46:09 --> 00:46:13 that some of these guys will have human DNA.
604 00:46:13 --> 00:46:19 But, most of them won't. So, what can I do to arrange that 605 00:46:19 --> 00:46:26 any bacteria that did not pick up a plasmid 606 00:46:26 --> 00:46:34 was incapable of growing? Add a resistance gene to the 607 00:46:34 --> 00:46:42 plasmid. Suppose I were so clever as 608 00:46:42 --> 00:46:46 to add to that plasmid, penicillin resistance. So, 609 00:46:46 --> 00:46:51 not just an origin of replication, but suppose I also had a 610 00:46:51 --> 00:46:57 resistance gene here, say, for penicillin resistance or 611 00:46:57 --> 00:47:03 streptomycin resistance, or ampicillin tends to be a very big 612 00:47:03 --> 00:47:08 favorite, ampicillin resistance. Then, 613 00:47:08 --> 00:47:12 my plasmid would have an ampicillin resistance gene encoded on it, 614 00:47:12 --> 00:47:16 an enzyme that can, say, break down ampicillin, 615 00:47:16 --> 00:47:22 and so on. And now, to my Petri plate, I just add ampicillin. 616 00:47:22 --> 00:47:26 And now, even though most of the bacteria have not 617 00:47:26 --> 00:47:29 picked up a plasmid, only those bacteria that have picked 618 00:47:29 --> 00:47:32 up a plasmid have the ampicillin 619 00:47:32 --> 00:47:36 resistance gene and can grow on an ampicillin containing plate. 620 00:47:36 --> 00:47:39 Now, how do I get a plasmid with an ampicillin 621 00:47:39 --> 00:47:43 resistance gene? It's in the catalog. It's 622 00:47:43 --> 00:47:46 all there, right? In fact, these occur naturally. 623 00:47:46 --> 00:47:49 You can, with restriction enzymes, move the ampicillin resistance 624 00:47:49 --> 00:47:51 gene to your favorite plasmid. If you don't like that, you can put 625 00:47:51 --> 00:47:53 in kanamycin resistance, etc., etc., 626 00:47:53 --> 00:47:59 etc. So, that's how you do it. So, we've got the big picture here. 627 00:47:59 --> 00:48:06 We have now gotten a library, the Library of Human 628 00:48:06 --> 00:48:19 Fragments contained in E. coli.
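The "at most one plasmid, so most cells get zero" argument and the ampicillin fix can be put into numbers with a short Python sketch. It assumes plasmid uptake per cell is random and independent, so the number of plasmids per cell follows a Poisson distribution. That model, the mean of 0.1 plasmids per cell, and every function name here are assumptions made for illustration, not figures from the lecture.

```python
import math
import random


def poisson_probabilities(mean_plasmids_per_cell: float) -> dict[str, float]:
    """Chance that a cell holds zero, exactly one, or several plasmids."""
    p_none = math.exp(-mean_plasmids_per_cell)
    p_one = mean_plasmids_per_cell * p_none
    return {
        "none": p_none,
        "exactly_one": p_one,
        "two_or_more": 1.0 - p_none - p_one,
    }


def sample_poisson(rng: random.Random, mean: float) -> int:
    """Draw one Poisson-distributed count (Knuth's multiplication method)."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def grow_on_plate(plasmids_per_cell: list[int], ampicillin: bool) -> list[int]:
    """Return the cells that form colonies.

    With ampicillin in the plate, only cells carrying at least one plasmid
    (and therefore the resistance gene) survive.
    """
    if not ampicillin:
        return list(plasmids_per_cell)
    return [count for count in plasmids_per_cell if count >= 1]


# Hypothetical transformation: on average 0.1 plasmids taken up per cell.
MEAN = 0.1
print(poisson_probabilities(MEAN))

rng = random.Random(0)  # fixed seed so the run is repeatable
cells = [sample_poisson(rng, MEAN) for _ in range(10_000)]
colonies = grow_on_plate(cells, ampicillin=True)
print(len(cells), "cells plated,", len(colonies), "colonies on ampicillin")
```

Under this assumed model, keeping the mean low makes double uptake rare at the cost of leaving roughly nine cells in ten empty, and the antibiotic in the plate is what removes those empty cells from the library.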
The library is a big Petri plate or 629 00:48:19 --> 00:48:28 many Petri plates, each one of which is a colony. 630 00:48:28 --> 00:48:34 Each colony has a single vector with an origin, 631 00:48:34 --> 00:48:40 a resistance marker, and a distinct piece of 632 00:48:40 --> 00:48:47 human DNA. In this library lives somewhere the gene for Huntington's 633 00:48:47 --> 00:48:53 disease. Over here is a gene for cystic fibrosis, 634 00:48:53 --> 00:48:57 over here a gene for Duchenne muscular dystrophy, 635 00:48:57 --> 00:49:02 over here a gene for diastrophic dysplasia, 636 00:49:02 --> 00:49:08 over here a gene for etc. etc. The only detail, now, 637 00:49:08 --> 00:49:15 you've got a library. You've managed to purify each 638 00:49:15 --> 00:49:19 piece of human DNA away from every other piece of human DNA. 639 00:49:19 --> 00:49:21 The only question now is how do you use the library? 640 00:49:21 --> 00:49:24 How do you go to the library and withdraw the correct 641 00:49:24 --> 00:49:28 volume from the shelf? How do you find the one 642 00:49:28 --> 00:49:32 you're looking for? So, we have converted the problem 643 00:49:32 --> 00:49:35 of purification, which in every other form of biochemistry starts by 644 00:49:35 --> 00:49:38 saying, I'm going to purify something based on its distinctive 645 00:49:38 --> 00:49:42 properties, to I'm going to randomly purify everything. 646 00:49:42 --> 00:49:46 Everything would be purified in its own bacteria, and I've now converted 647 00:49:46 --> 00:49:51 to the problem of finding the one that I want in my 648 00:49:51 --> 00:49:55 library. Next time, we'll talk about how you go to the 649 00:49:55 --> 00:49:59 library and find what you want. 650 00:49:59 --> 00:50:04 See you then.