1 00:00:00 --> 00:00:05 So in the last lecture I spent quite a while trying to convey a sense of 2 00:00:05 --> 00:00:10 how the structure of DNA was discovered. The crystallographic 3 00:00:10 --> 00:00:16 data that led to it, as I said, was collected by Rosalind 4 00:00:16 --> 00:00:21 Franklin. And I saw there was some confusion about this picture that I 5 00:00:21 --> 00:00:27 showed you next. This is not a photograph of a 6 00:00:27 --> 00:00:32 double helix. This is what happened when she 7 00:00:32 --> 00:00:36 bounced the x-ray off the crystal of DNA. This is the diffraction 8 00:00:36 --> 00:00:40 pattern that she saw. And then one works backwards from 9 00:00:40 --> 00:00:44 that trying to figure out what kind of structure it was that would have 10 00:00:44 --> 00:00:49 caused that diffraction pattern. And you have to be a pretty good 11 00:00:49 --> 00:00:53 x-ray crystallographer to draw any kind of inferences from that. 12 00:00:53 --> 00:00:57 And there were people, including Francis Crick, who saw the implications 13 00:00:57 --> 00:01:02 of it right away. But the point was she collected the 14 00:01:02 --> 00:01:06 data and then two people that I told you about then whose names you know 15 00:01:06 --> 00:01:10 so well, Jim Watson and Francis Crick, were the two individuals that 16 00:01:10 --> 00:01:14 came up with the model that explained the diffraction pattern. 17 00:01:14 --> 00:01:19 And therefore we learned the structure of DNA as a 18 00:01:19 --> 00:01:23 double-stranded helix. I also tried to make the case that 19 00:01:23 --> 00:01:27 it wasn't two geniuses who sat down in the room, took a look at this and 20 00:01:27 --> 00:01:32 popped up with the model. It was a story of real people with 21 00:01:32 --> 00:01:36 misadventures and mistakes and recovery from mistakes and so on 22 00:01:36 --> 00:01:41 getting it. It was also a very small group. 
And I'm going to take 23 00:01:41 --> 00:01:45 just a very small minute at the beginning of the class because I 24 00:01:45 --> 00:01:49 have a colleague, Vernon Ingram, who's sitting down 25 00:01:49 --> 00:01:54 here in the front, who was a member of this very small 26 00:01:54 --> 00:01:58 group with Jim Watson and Francis Crick. So he was there where all 27 00:01:58 --> 00:02:02 this happened. And almost nobody in the world has 28 00:02:02 --> 00:02:06 had a chance in your generation to hear directly from somebody who was 29 00:02:06 --> 00:02:10 there when it happened. So I asked Vernon if he would come 30 00:02:10 --> 00:02:14 and just talk to you for a little bit about just what it was 31 00:02:14 --> 00:02:54 like to be there. 32 00:02:54 --> 00:03:00 Well, thanks, Graham. You seem to be at a very exciting 33 00:03:00 --> 00:03:06 stage in 7.014. This structure of the secret of life, 34 00:03:06 --> 00:03:14 no less. And it's interesting that immediately when Watson and Crick 35 00:03:14 --> 00:03:22 put together a model of the DNA molecule that fit the x-ray data, 36 00:03:22 --> 00:03:30 that was the point, how do you know a model is correct? 37 00:03:30 --> 00:03:36 Because there are certain distances in the model, and those have to 38 00:03:36 --> 00:03:43 correlate exactly with the distances of the x-ray spots in the 39 00:03:43 --> 00:03:49 diffraction pattern that you saw. That's how you know that a model 40 00:03:49 --> 00:03:56 that you've built to certain specifications corresponds to what 41 00:03:56 --> 00:04:02 the molecule itself in the crystal that you're examining 42 00:04:02 --> 00:04:09 actually is composed of. It was by sheer accident that I 43 00:04:09 --> 00:04:15 happened to be working as a biochemist in the MRC, 44 00:04:15 --> 00:04:22 the Medical Research Council lab at the Cavendish Laboratory where 45 00:04:22 --> 00:04:28 Watson and Crick were working. Sheer accident. 
It was a very 46 00:04:28 --> 00:04:35 crowded lab, as Graham said. And that's something that you should 47 00:04:35 --> 00:04:41 remember. When you're choosing a lab to work in, 48 00:04:41 --> 00:04:47 always go to a lab that's overcrowded. Never go to a lab 49 00:04:47 --> 00:04:53 where there's lots of space because a really successful lab attracts so 50 00:04:53 --> 00:04:59 many coworkers, visitors that it rapidly gets 51 00:04:59 --> 00:05:05 overcrowded. And that was the case in this 52 00:05:05 --> 00:05:12 laboratory. The director was Max Perutz. Co-director John Kendrew 53 00:05:12 --> 00:05:20 doing x-ray crystallography of proteins for almost the first time, 54 00:05:20 --> 00:05:27 and solving the protein structure. Francis Crick was a graduate student 55 00:05:27 --> 00:05:33 of Max Perutz's doing his PhD work. And the first thing I remember about 56 00:05:33 --> 00:05:39 Francis was when I went there as a biochemist to work with Max Perutz, 57 00:05:39 --> 00:05:44 when I went there, there was this tall gangling guy constantly 58 00:05:44 --> 00:05:50 circulating between the top floor of the building, his office in the 59 00:05:50 --> 00:05:55 middle and the x-ray machines at the bottom. He was constantly going up 60 00:05:55 --> 00:06:01 and down. And in those days the buildings didn't have elevators, 61 00:06:01 --> 00:06:07 or lifts as the English called them. 62 00:06:07 --> 00:06:13 So he was in excellent physical shape. Very crowded, 63 00:06:13 --> 00:06:19 a very modest lab. And what's usually forgotten is a key member of 64 00:06:19 --> 00:06:25 that group, an engineer, Tony Broad, key person because he 65 00:06:25 --> 00:06:31 invented what was then the world's best and most efficient x-ray 66 00:06:31 --> 00:06:37 machine, a rotating anode x-ray machine. 
67 00:06:37 --> 00:06:43 And because to the x-ray crystallographers in that group this 68 00:06:43 --> 00:06:49 machine was available, because of that they were the 69 00:06:49 --> 00:06:56 preeminent x-ray structure group in the world. My job was as a 70 00:06:56 --> 00:07:02 biochemist, protein biochemistry, putting a heavy atom, 71 00:07:02 --> 00:07:08 mercury, very heavy atom into Max Perutz's hemoglobin crystals 72 00:07:08 --> 00:07:15 in specific places. That has a predictable effect on the 73 00:07:15 --> 00:07:21 x-ray pattern and that enables the Fourier diagram to be constructed 74 00:07:21 --> 00:07:28 with real phase values for the x-ray diffractions, for the physicists 75 00:07:28 --> 00:07:36 among you here. Are there any physicists here? 76 00:07:36 --> 00:07:44 Yeah, I thought so. That was a big step forward and that was also a big 77 00:07:44 --> 00:07:52 step in figuring out the structure of the DNA samples, semi-crystals, 78 00:07:52 --> 00:08:00 that Professor Walker just referred to. 79 00:08:00 --> 00:08:05 All dependent on the engineer Tony Broad who is never mentioned in any 80 00:08:05 --> 00:08:11 of these histories, but without him this would not have 81 00:08:11 --> 00:08:17 happened. So it was an exciting place to work in, 82 00:08:17 --> 00:08:22 very exciting. We were all young in those days. And living the lives of 83 00:08:22 --> 00:08:28 young men and young women with all the complications that arise when 84 00:08:28 --> 00:08:34 you put a whole bunch of very energetic young men, 85 00:08:34 --> 00:08:39 very energetic young women together. And by that I mean the interpersonal 86 00:08:39 --> 00:08:45 relationships which when you're in a crowded, very active situation can 87 00:08:45 --> 00:08:50 sometimes interfere. And always very entertaining, 88 00:08:50 --> 00:08:55 I can tell you that. I could give you chapter and verse. 
89 00:08:55 --> 00:09:01 But it isn't really so very different from people 90 00:09:01 --> 00:09:07 your age now, right? I mean I'm not saying it interferes 91 00:09:07 --> 00:09:13 with you, sometimes it might. But it was an exciting lab, an 92 00:09:13 --> 00:09:20 exciting time to be there because we were not the only group trying to 93 00:09:20 --> 00:09:26 figure out the structure of DNA. A huge competitor was Linus Pauling 94 00:09:26 --> 00:09:33 at Caltech who had beaten that same group once before, 95 00:09:33 --> 00:09:39 quite recently, over the alpha helix, the crucial component of 96 00:09:39 --> 00:09:45 protein structure. He got the right answer first, 97 00:09:45 --> 00:09:51 1.5 angstrom reflection, the alpha helix. And our group, 98 00:09:51 --> 00:09:57 Max Perutz and our group had been wrong. So the group was smarting 99 00:09:57 --> 00:10:03 under that kind of defeat, if you like. 100 00:10:03 --> 00:10:10 And competition is a wonderful spur, as long as you don't let it get out 101 00:10:10 --> 00:10:17 of hand. Well, needless to say we didn't, 102 00:10:17 --> 00:10:24 but the competition with the Pauling lab was certainly so severe that we 103 00:10:24 --> 00:10:30 awaited the next letter. You see, in those days new 104 00:10:30 --> 00:10:35 scientific information arrived not by publications, 105 00:10:35 --> 00:10:40 that took much too long, but by personal letter. And, 106 00:10:40 --> 00:10:45 in fact, the NIH has put together all these various letters in the 107 00:10:45 --> 00:10:50 Francis Crick collection. And when you have time you should 108 00:10:50 --> 00:10:55 look at those. They're quite interesting because 109 00:10:55 --> 00:11:01 they tell you, in a way a scientific paper does not tell you, 110 00:11:01 --> 00:11:06 what I feel about my experiment results. What she feels about her 111 00:11:06 --> 00:11:11 experiment results. What it means to me as a person, 112 00:11:11 --> 00:11:16 to her as a person, to him as a person. 
So we were constantly 113 00:11:16 --> 00:11:22 watching the mail and discussing the news as it came in, 114 00:11:22 --> 00:11:27 mostly over a beer at the pub next door. It was very conveniently 115 00:11:27 --> 00:11:33 located. But being a small group crowded 116 00:11:33 --> 00:11:39 together made communication within our group very easy indeed. 117 00:11:39 --> 00:11:45 And we had fights. I don't mean physical fights. 118 00:11:45 --> 00:11:51 We had scientific fights. And as a biochemist I was able to 119 00:11:51 --> 00:11:57 settle a crucial fight among the crystallographers Crick and Watson 120 00:11:57 --> 00:12:02 who were building the model. Because, quite frankly, 121 00:12:02 --> 00:12:07 they didn't know much chemistry. And were trying to build a model 122 00:12:07 --> 00:12:13 with the wrong conformation of the peptide bond. They didn't realize 123 00:12:13 --> 00:12:18 that the peptide bond has two possible conformations. 124 00:12:18 --> 00:12:23 And they had at one point a terrible time trying to fit 125 00:12:23 --> 00:12:29 everything together because they were using the wrong conformation. 126 00:12:29 --> 00:12:34 I'm talking about lactam-lactim for those of you who are organic 127 00:12:34 --> 00:12:39 chemists and it means something, a conformation. And once they got 128 00:12:39 --> 00:12:44 the first conformation then the model clicked into place. 129 00:12:44 --> 00:12:50 So we all helped, that's what I'm trying to say. 130 00:12:50 --> 00:12:55 We all helped with one great aim in mind. It was clear. 131 00:12:55 --> 00:13:00 And you know from what Professor Walker said, that the DNA structure, 132 00:13:00 --> 00:13:06 in its structure held the clue to crucial physiological 133 00:13:06 --> 00:13:11 behavior of DNA. 
And Crick and Watson said this in 134 00:13:11 --> 00:13:17 their first paper, the structure itself because of its 135 00:13:17 --> 00:13:22 complementarity gives you an immediate clue as to how it 136 00:13:22 --> 00:13:27 replicates. And replication of DNA structure from generation to 137 00:13:27 --> 00:13:33 generation is, of course, the crucial thing about 138 00:13:33 --> 00:13:38 DNA. The copying, the precise copying 139 00:13:38 --> 00:13:43 from generation to generation. And that fell out of the x-ray 140 00:13:43 --> 00:13:49 structure. That's why the x-ray structure was so very important, 141 00:13:49 --> 00:13:54 because it gave you an immediate understanding of the role of DNA in 142 00:13:54 --> 00:14:00 modern biology. So that's what we did. 143 00:14:00 --> 00:14:06 And eventually the people in the group, the group got so overcrowded 144 00:14:06 --> 00:14:13 they built a huge lab that was beautiful, like any new lab is. 145 00:14:13 --> 00:14:19 But the thing I remember most of all was the atmosphere in that place. 146 00:14:19 --> 00:14:26 So remember, when you go and choose a lab, choose one that's overcrowded. 147 00:14:26 --> 00:14:45 It will pay off. [APPLAUSE] 148 00:14:45 --> 00:14:47 Thank you so much. That was really wonderful. 149 00:14:47 --> 00:14:59 Thank you. 150 00:14:59 --> 00:15:02 I don't know if some of you realized quite how rare that was, 151 00:15:02 --> 00:15:05 this discovery of the structure of DNA. As I said, 152 00:15:05 --> 00:15:09 probably one of the big discoveries of mankind. Because, 153 00:15:09 --> 00:15:12 as Vernon said, you could see so many of the secrets of life as soon 154 00:15:12 --> 00:15:16 as you saw that structure. Very few people have ever heard 155 00:15:16 --> 00:15:19 from someone who was there at the time. 
Maybe you'll forget a bunch 156 00:15:19 --> 00:15:23 of stuff down the line, but I hope you'll remember you heard 157 00:15:23 --> 00:15:26 somebody who was there when Watson and Crick were there and maybe his 158 00:15:26 --> 00:15:30 extra piece of advice about choosing a lab. 159 00:15:30 --> 00:15:33 To say one thing quickly, some of you I think understood what 160 00:15:33 --> 00:15:37 I've been trying to do. I spent quite a bit of time talking 161 00:15:37 --> 00:15:40 about science being done by real people doing real experiments. 162 00:15:40 --> 00:15:44 Thanks for your comments. A few of you have gone out of your way to say 163 00:15:44 --> 00:15:48 that this was a total waste of time and you didn't understand why I 164 00:15:48 --> 00:15:51 didn't teach you something instead of doing something on the test. 165 00:15:51 --> 00:15:55 Well, I'm making up the test. And if you don't think there'll be 166 00:15:55 --> 00:15:59 something on scientific process on the second exam you'll 167 00:15:59 --> 00:16:02 be surprised. So I'm spending a lot of time on 168 00:16:02 --> 00:16:06 this, and the reason is because you are MIT students. 169 00:16:06 --> 00:16:09 You know, you can go many places in the country to many high school 170 00:16:09 --> 00:16:12 biology courses and you can memorize, someone will tell you to memorize 171 00:16:12 --> 00:16:16 everything that's in the book, and you'll get tested whether you 172 00:16:16 --> 00:16:19 can memorize it. You guys are at MIT because you 173 00:16:19 --> 00:16:23 have the potential to be leaders in whatever you do. 174 00:16:23 --> 00:16:26 I've made the transition from being an undergrad sort of trying to 175 00:16:26 --> 00:16:30 memorize stuff in a textbook to working on the cutting edge. 176 00:16:30 --> 00:16:32 I've made some reasonably significant discoveries in science, 177 00:16:32 --> 00:16:35 as have my other colleagues in the department, some of them making 178 00:16:35 --> 00:16:38 greater than I. 
But nevertheless if you're on the 179 00:16:38 --> 00:16:41 cutting-edge then you're dealing with all the stuff I'm trying to 180 00:16:41 --> 00:16:44 tell you about in this thing. You're working as a part of a group. 181 00:16:44 --> 00:16:47 There's competition. There are interpersonal 182 00:16:47 --> 00:16:50 relationships. You make mistakes. 183 00:16:50 --> 00:16:53 You recover from them. You're making inferences. 184 00:16:53 --> 00:16:56 You're testing models. This is a very complex, very real, 185 00:16:56 --> 00:16:59 very dynamic, very human interaction. I hope you got a little bit of 186 00:16:59 --> 00:17:03 whiff of that from Vernon. And I wouldn't be, 187 00:17:03 --> 00:17:07 I'm quite capable of reproducing diagrams from the textbook without 188 00:17:07 --> 00:17:11 trying to give you a deeper understanding, 189 00:17:11 --> 00:17:15 and that's what I'm trying to do here. And I hope if it hasn't made 190 00:17:15 --> 00:17:19 sense to you by the end that at least a few more of you will get it. 191 00:17:19 --> 00:17:23 And those of you who I think saw what I was doing I appreciate your 192 00:17:23 --> 00:17:27 telling me that in the things. These are anonymous so I don't know, 193 00:17:27 --> 00:17:31 but a couple of you are certainly trying to make it clear that you 194 00:17:31 --> 00:17:35 didn't think it was worth your time coming to lecture. 195 00:17:35 --> 00:17:38 I'm trying to tell you why I'm trying to do it. 196 00:17:38 --> 00:17:41 I'm trying to teach you in a deeper way. And this is a required course. 197 00:17:41 --> 00:17:44 It's important for your life. I hope some of you will see that or if 198 00:17:44 --> 00:17:47 you don't see it now you'll see it later in your career. 199 00:17:47 --> 00:17:50 OK. Now, we're going to talk about DNA replication. 200 00:17:50 --> 00:17:53 I'm going to start to drive into some of the details that maybe are 201 00:17:53 --> 00:17:56 more the kind of things you're expecting. 
I just want to make one 202 00:17:56 --> 00:17:59 quick point here. I've talked about cell division and 203 00:17:59 --> 00:18:03 we saw this, how cells come from other cells going to make more cells. 204 00:18:03 --> 00:18:07 I showed you this little movie you've seen a few times of a yeast 205 00:18:07 --> 00:18:10 cell dividing, but all cells divide. 206 00:18:10 --> 00:18:14 Here's a cancer cell dividing. If you get a cancer it's a cell 207 00:18:14 --> 00:18:17 that's forgotten how to stop dividing and is growing to make a 208 00:18:17 --> 00:18:21 tumor. There's this cancer cell dividing. It looks not unlike a 209 00:18:21 --> 00:18:25 yeast on a molecular level, very, very similar. But there's 210 00:18:25 --> 00:18:29 another point. I told you how the structure of DNA 211 00:18:29 --> 00:18:33 with the complementary strands with G pairing with C and A pairing with 212 00:18:33 --> 00:18:37 T immediately gave rise to an insight as to how the genetic 213 00:18:37 --> 00:18:41 material could be replicated. And you guys know that it's held 214 00:18:41 --> 00:18:45 together by hydrogen bonds between base pairs which are about 215 00:18:45 --> 00:18:49 one-twentieth the strength of the covalent bonds. 216 00:18:49 --> 00:18:53 So you're able to peel the strands apart without breaking the covalent 217 00:18:53 --> 00:18:57 bonds. And then by pairing A with T and G with C and doing that on both 218 00:18:57 --> 00:19:02 strands then you can end up with two identical copies. 219 00:19:02 --> 00:19:05 And so if you do two identical copies and you do it again you get 220 00:19:05 --> 00:19:09 four. One of the things we've realized over the last two or three 221 00:19:09 --> 00:19:13 years in looking through the exams is somehow, at least some of the 222 00:19:13 --> 00:19:17 class, didn't connect the business about cells coming from other cells 223 00:19:17 --> 00:19:21 and DNA duplicating to give daughter DNA. 
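The copying rule described here (peel the strands apart, then pair A with T and G with C on each one) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: strands are plain strings written 5' to 3', the sequence "GATTACA" is arbitrary, and the names `complement_strand` and `replicate` are made up for this sketch rather than taken from the lecture.

```python
# Watson-Crick pairing rules from the lecture: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(strand_5to3: str) -> str:
    """Return the partner strand, also written 5'->3'.

    The two strands run in opposite directions, so the partner is the
    base-by-base complement read in reverse.
    """
    return "".join(PAIR[base] for base in reversed(strand_5to3))


def replicate(duplex):
    """Peel the two strands apart and build a new partner on each one."""
    top, bottom = duplex
    return [
        (top, complement_strand(top)),        # old top strand + new partner
        (complement_strand(bottom), bottom),  # new partner + old bottom strand
    ]


top = "GATTACA"  # arbitrary example sequence
parent = (top, complement_strand(top))

daughters = replicate(parent)
second_round = [copy for duplex in daughters for copy in replicate(duplex)]

print(daughters)          # two duplexes, each identical to the parent
print(len(second_round))  # each round doubles the count: 1 -> 2 -> 4
```

Each daughter duplex keeps one old strand and gains one new one, which is why every cell division has to be preceded by a full round of this copying.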
And I'm just trying to hammer 224 00:19:21 --> 00:19:25 home the point that these are related. Every time a cell divides 225 00:19:25 --> 00:19:29 it has to duplicate its genetic information. 226 00:19:29 --> 00:19:33 That's why I'm going to be telling you about DNA replication. 227 00:19:33 --> 00:19:37 Here's a picture of that same cancer cell, but watch over here. 228 00:19:37 --> 00:19:41 This is the DNA. And you see it's doubled. And see how the DNA, 229 00:19:41 --> 00:19:46 which is the chromosomes, has pulled apart so that at the end you now 230 00:19:46 --> 00:19:50 have two cells and you've got identical copies of DNA. 231 00:19:50 --> 00:19:54 So if you're studying cancer, for example, this sort of thing is 232 00:19:54 --> 00:19:59 relevant to you. OK. So the issue of how -- 233 00:19:59 --> 00:20:03 Well, before I do that, I'm sorry. Just a couple of things 234 00:20:03 --> 00:20:08 about DNA replication before I dive into this. So we all started out as 235 00:20:08 --> 00:20:13 a single cell. I've got a lot more obviously 236 00:20:13 --> 00:20:18 because I'm made up of a lot of cells. If I took all the DNA in my 237 00:20:18 --> 00:20:22 body and I lined up all the molecules in it, do you guys have any idea how 238 00:20:22 --> 00:20:27 long that would be? Who thinks it would reach let's say 239 00:20:27 --> 00:20:33 across the room? OK. Across campus? 240 00:20:33 --> 00:20:39 Across Cambridge? Around the world? To the moon? Anybody left? 241 00:20:39 --> 00:20:46 To the sun? I've got ten to the fourteenth cells. 242 00:20:46 --> 00:20:52 There's about a meter or two in each cell. 10 to 20 billion miles 243 00:20:52 --> 00:20:59 of DNA in each of our bodies, human DNA. 244 00:20:59 --> 00:21:03 They would go back and forth to the sun multiple times. 245 00:21:03 --> 00:21:07 So that much DNA had to get replicated in order for the 246 00:21:07 --> 00:21:12 fertilized egg we all started out as to become you. 
247 00:21:12 --> 00:21:16 Another thing, the accuracy of replication is about 248 00:21:16 --> 00:21:21 ten to the minus tenth. Most people, including myself, 249 00:21:21 --> 00:21:25 don't have a very good feel for exponents. So that's one 250 00:21:25 --> 00:21:30 mistake in 10 billion. You know, it could be one mistake in 251 00:21:30 --> 00:21:34 10 to the ninety-ninth. Well, what does one mistake in 10 252 00:21:34 --> 00:21:39 billion mean? So let's relate it to something we know. 253 00:21:39 --> 00:21:43 If I was typing let's say an eight letter word, 60 words a minute, 254 00:21:43 --> 00:21:48 24 hours a day, 7 days a week, and I was as good as DNA replication, 255 00:21:48 --> 00:21:52 how often would I make a mistake? So you can each think of how long 256 00:21:52 --> 00:21:57 you think that is. But if I was good on average, 257 00:21:57 --> 00:22:02 I would make a mistake once every 38 years. 258 00:22:02 --> 00:22:06 So I'm about to tell you about a process that's absolutely 259 00:22:06 --> 00:22:11 astonishing in terms of how fast and how much you can do and with an 260 00:22:11 --> 00:22:15 accuracy that goes beyond what we're used to in our ordinary life. 261 00:22:15 --> 00:22:20 So how does it do this? It has to be more than just pulling the 262 00:22:20 --> 00:22:24 strands apart. And there's been some confusion as 263 00:22:24 --> 00:22:29 to why I'm emphasizing 5 prime and 3 prime. 264 00:22:29 --> 00:22:34 Well, each of these subunits, each nucleotide, this is a 3 prime 265 00:22:34 --> 00:22:39 hydroxyl and this is the 5 prime position. If we were joining 266 00:22:39 --> 00:22:44 together subunits that had a hook and an eye it would make a 267 00:22:44 --> 00:22:49 difference because it's not the same on both ends. 
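The typing analogy from a moment ago is easy to check with back-of-the-envelope arithmetic. The sketch below takes the numbers as stated (one error per ten billion letters, eight-letter words, 60 words a minute, typing around the clock) and assumes a 365.25-day year; the variable names are just for illustration.

```python
# Numbers taken from the typing analogy in the lecture.
ERROR_RATE = 1e-10        # one mistake per 10 billion letters copied
LETTERS_PER_WORD = 8
WORDS_PER_MINUTE = 60

# Assumption for this sketch: typing nonstop, with 365.25 days per year.
MINUTES_PER_YEAR = 60 * 24 * 365.25

letters_per_minute = LETTERS_PER_WORD * WORDS_PER_MINUTE   # 480 letters/min
letters_per_mistake = 1 / ERROR_RATE                       # about 10 billion
minutes_per_mistake = letters_per_mistake / letters_per_minute
years_per_mistake = minutes_per_mistake / MINUTES_PER_YEAR

print(f"about one mistake every {years_per_mistake:.0f} years")
```

Under these assumptions the answer comes out at roughly four decades of nonstop typing per mistake, the same ballpark as the figure quoted in the lecture.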
If we're going to 268 00:22:49 --> 00:22:54 start hooking together it's exactly the same thing when we get to a 269 00:22:54 --> 00:22:59 biochemical level, the 5 prime end is not the same as 270 00:22:59 --> 00:23:04 the hydroxyl at the 3 prime end because the whole thing is asymmetric. 271 00:23:04 --> 00:23:12 So the enzymes that copy DNA are known as DNA polymerases. 272 00:23:12 --> 00:23:20 And it was a very difficult challenge to figure out how they 273 00:23:20 --> 00:23:28 operated, but Arthur Kornberg was the first person to solve 274 00:23:28 --> 00:23:36 this problem. He was an extraordinarily gifted 275 00:23:36 --> 00:23:43 biochemist. He's still at Stanford. And what he found was if we have a 276 00:23:43 --> 00:23:49 5 prime end this would then be the 3 prime end, and there's a 3 prime 277 00:23:49 --> 00:23:56 hydroxyl which is this one right here. And this was paired, 278 00:23:56 --> 00:24:02 say, with a C and A paired with a T. And let's say a G paired with a C 279 00:24:02 --> 00:24:07 here. And let's say the next template base was, 280 00:24:07 --> 00:24:12 let's make it a T. What Arthur Kornberg was able to find was an 281 00:24:12 --> 00:24:18 enzyme activity that catalyzed a template-dependent replication of 282 00:24:18 --> 00:24:23 DNA. That was critical because he had to find, if you broke the cells 283 00:24:23 --> 00:24:28 open, somewhere in that gemisch of enzymes and things from 284 00:24:28 --> 00:24:33 inside a cell, there had to be something that was 285 00:24:33 --> 00:24:37 able to copy DNA. So in order to do that he had to 286 00:24:37 --> 00:24:41 work out an assay. And he also had to have some kind 287 00:24:41 --> 00:24:45 of guess as to what the cell would be using in order to carry out the 288 00:24:45 --> 00:24:49 synthesis. But one thing that was sort of obvious was a DNA template 289 00:24:49 --> 00:24:53 because that was being copied. 
But the other part was you had to 290 00:24:53 --> 00:24:58 have energy to form a covalent bond. 291 00:24:58 --> 00:25:02 So somehow there had to be something that was sort of activated with the 292 00:25:02 --> 00:25:07 energy built into the molecule so that thermodynamically the whole 293 00:25:07 --> 00:25:12 thing would slide downhill when you made a bond. And he knew that the 294 00:25:12 --> 00:25:17 cell had triphosphates, just the same type that we talked 295 00:25:17 --> 00:25:36 about when we talked about ATP. 296 00:25:36 --> 00:25:43 So this would be a deoxyribonucleotide triphosphate. 297 00:25:43 --> 00:25:51 And he was able to make a guess, because he had to try things until 298 00:25:51 --> 00:25:59 he found something that would work, that this was what's used in DNA 299 00:25:59 --> 00:26:06 synthesis. So this hydroxyl ultimately attacks 300 00:26:06 --> 00:26:13 this phosphate here. And these two other phosphates then 301 00:26:13 --> 00:26:21 come off as a leaving group. So if we thought of it as a P 302 00:26:21 --> 00:26:28 like this with two more Ps here, these two come off and you get a new 303 00:26:28 --> 00:26:36 bond formed to the phosphate. And so what Kornberg then was able 304 00:26:36 --> 00:26:45 to find by using a DNA template that had this sort of structure and 305 00:26:45 --> 00:26:54 [TTATA?] like this, that he was now able to get an A 306 00:26:54 --> 00:27:04 added here. This hydroxyl here became the new hydroxyl. 307 00:27:04 --> 00:27:10 And so the direction of synthesis, this strand is the other way, so the 308 00:27:10 --> 00:27:16 direction of synthesis of a DNA polymerase, it's polymerizing in the 309 00:27:16 --> 00:27:22 5 prime to 3 prime direction. This was again an amazing discovery 310 00:27:22 --> 00:27:28 because it was the first time that anyone had found an enzyme 311 00:27:28 --> 00:27:34 that could copy DNA. Arthur Kornberg got a Nobel Prize 312 00:27:34 --> 00:27:38 for it. 
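The board example (a primer ending in C, A, G paired to the template, with T as the next template base, so an A gets added) can be mimicked with a toy model. This is a hedged sketch, not a description of the enzyme: strands are plain strings, the template is written 3' to 5' so that position i of the template sits opposite position i of the growing strand, and the function name `extend_primer` and its `steps` parameter are invented here for illustration.

```python
# Watson-Crick pairing: the template base dictates which nucleotide is added.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5: str, primer_5to3: str, steps=None) -> str:
    """Toy model of template-dependent synthesis in the 5'->3' direction.

    template_3to5 -- template strand, written 3'->5'
    primer_5to3   -- existing strand whose last base carries the free 3' hydroxyl
    steps         -- how many nucleotides to add (None = copy to the template's end)
    """
    if not primer_5to3:
        raise ValueError("no primer: there is no 3' hydroxyl to extend from")
    if len(primer_5to3) > len(template_3to5):
        raise ValueError("primer is longer than the template")
    for t, p in zip(template_3to5, primer_5to3):
        if PAIR[t] != p:
            raise ValueError("primer is not base-paired to the template")

    to_copy = template_3to5[len(primer_5to3):]
    if steps is not None:
        to_copy = to_copy[:steps]

    strand = list(primer_5to3)
    for template_base in to_copy:
        # One incoming triphosphate per step: the 3' hydroxyl attacks the
        # innermost phosphate, the other two phosphates leave, and the new
        # nucleotide's 3' hydroxyl becomes the end of the chain.
        strand.append(PAIR[template_base])
    return "".join(strand)


# Primer C-A-G paired to template G-T-C; the next template base is T.
print(extend_primer("GTCT", "CAG", steps=1))  # CAGA
```

Because nucleotides are only ever appended to the 3' end, the new strand can grow in just one direction, and with an empty primer the function refuses to start at all.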
But at this point actually genetics came in because there was a 313 00:27:38 --> 00:27:43 scientist John Cairns who was at that point down at Cold Spring 314 00:27:43 --> 00:27:47 Harbor, as I told you the other day. And John, in spite of the fact that 315 00:27:47 --> 00:27:52 Arthur had found a DNA polymerase that had all the properties that you 316 00:27:52 --> 00:27:56 would expect for copying DNA, didn't think that was the one that 317 00:27:56 --> 00:28:01 actually copied the DNA necessary for cellular replication. 318 00:28:01 --> 00:28:05 So he reasoned if he was right he'd be able to find a mutation that 319 00:28:05 --> 00:28:09 would eliminate the activity of that enzyme and the cell would still live. 320 00:28:09 --> 00:28:13 And so they did a screening, and it was a lot of work, but they 321 00:28:13 --> 00:28:18 eventually found a mutant of E. coli that lacked this DNA polymerase 322 00:28:18 --> 00:28:22 that Arthur Kornberg had discovered. And the cell was still alive and 323 00:28:22 --> 00:28:26 was still replicating its DNA. So it told both John and then 324 00:28:26 --> 00:28:31 Arthur there must be another enzyme in the cell. 325 00:28:31 --> 00:28:34 And so Arthur went back. And now working in a mutant that 326 00:28:34 --> 00:28:37 was missing this first polymerase he had discovered, he found the one that 327 00:28:37 --> 00:28:40 really replicates the DNA. The first one is important, 328 00:28:40 --> 00:28:44 too. It's needed for DNA repair. I'm going to talk to you about that 329 00:28:44 --> 00:28:47 in the next lecture, but it's not absolutely crucial for 330 00:28:47 --> 00:28:50 life. And there's an interplay of genetics and biochemistry. 331 00:28:50 --> 00:28:53 And you'll see I'm just sort of foreshadowing what we're going to 332 00:28:53 --> 00:28:57 get to when we talk about the genetics of this. 
333 00:28:57 --> 00:29:00 And I know a couple of you clearly were frustrated about me showing you 334 00:29:00 --> 00:29:03 pictures of the people who did this, but nevertheless since this was such 335 00:29:03 --> 00:29:07 a historic event a couple of years ago at Cold Spring Harbor. 336 00:29:07 --> 00:29:10 This you see the helix model down there. There was Jim Watson opening 337 00:29:10 --> 00:29:14 the symposium. When I got up to talk I said, 338 00:29:14 --> 00:29:17 well, I told my students that I'd let them know what it was like when 339 00:29:17 --> 00:29:21 I was there, so I took out a camera and I took a picture of the audience. 340 00:29:21 --> 00:29:24 And so there are a bunch of Nobel Laureates and types here who were 341 00:29:24 --> 00:29:28 sitting there smiling for you guys in the class. And there was Arthur 342 00:29:28 --> 00:29:32 Kornberg giving his talk. Now, these DNA polymerases are 343 00:29:32 --> 00:29:36 incredible protein machines. The crystal structures of DNA 344 00:29:36 --> 00:29:40 polymerases operating on their template have been solved. 345 00:29:40 --> 00:29:44 And you can solve, depending on how many diffractions 346 00:29:44 --> 00:29:48 you can get, you can get a model that's more and more detailed. 347 00:29:48 --> 00:29:52 And there have been very high resolution models of DNA polymerases. 348 00:29:52 --> 00:29:56 This blue and white stuff is the surface of the protein, 349 00:29:56 --> 00:30:00 and this is sort of a template and the various parts. 350 00:30:00 --> 00:30:04 Just to give you an idea here are these tracings of the shapes of the 351 00:30:04 --> 00:30:08 electron density. You can see how the 352 00:30:08 --> 00:30:12 crystallographers have fit the nucleotides right in the crystal 353 00:30:12 --> 00:30:17 into these electron densities. And here putting it together a bit 354 00:30:17 --> 00:30:21 in the blue is the secondary structure of the protein and the 355 00:30:21 --> 00:30:25 templates and whatnot. 
And I don't expect you to see very 356 00:30:25 --> 00:30:29 much in that, but the point is I wanted to sort of just set you up to 357 00:30:29 --> 00:30:34 show you this little movie. Because DNA polymerases are 358 00:30:34 --> 00:30:38 incredible machines. They copy at about a thousand 359 00:30:38 --> 00:30:43 nucleotides a second and their accuracy is really amazing. 360 00:30:43 --> 00:30:47 And I'll tell you all the tricks to the accuracy in the next lecture, 361 00:30:47 --> 00:30:52 but I want to show you this little movie because this is sort of a 362 00:30:52 --> 00:30:56 simulation of what must happen every time a nucleotide is added. 363 00:30:56 --> 00:31:01 Now, we'll see this over and over again so I'll take it in pieces. 364 00:31:01 --> 00:31:04 The yellow and the orange are the secondary structures. 365 00:31:04 --> 00:31:08 That's an alpha helix. And certainly one thing you can see 366 00:31:08 --> 00:31:12 is happening, as we're looking at this, is the parts of the protein 367 00:31:12 --> 00:31:16 are moving during this. So you can see this alpha helix 368 00:31:16 --> 00:31:20 that's sort of swinging up and swinging back down. 369 00:31:20 --> 00:31:24 Now, what's over here is the template base. 370 00:31:24 --> 00:31:28 That's the base that corresponds to the T that I was just 371 00:31:28 --> 00:31:31 showing you here. This is the incoming nucleotide. 372 00:31:31 --> 00:31:35 There is the triphosphate coming down here. And, 373 00:31:35 --> 00:31:39 in fact, you just see those two phosphates going. 374 00:31:39 --> 00:31:43 So what's happening here, this is going to be the end of the 375 00:31:43 --> 00:31:46 growing chain. It's going to attack right there, 376 00:31:46 --> 00:31:50 join the phosphate and the pyrophosphate will leave. 377 00:31:50 --> 00:31:54 And if you'll take a look, when you see this movement of this 378 00:31:54 --> 00:31:58 helix from the beginning state to up to here, you'll see what happens. 
379 00:31:58 --> 00:32:02 It's squeezing the template base and the incoming nucleotide together. 380 00:32:02 --> 00:32:07 What it's really doing is testing for the correct shape. 381 00:32:07 --> 00:32:11 Remember the shape of an A-T base pair and a G-C base pair is the same. 382 00:32:11 --> 00:32:16 And for those of you who are confused about guanine and the 383 00:32:16 --> 00:32:21 keto-enol thing, try to draw hydrogen bonds with the 384 00:32:21 --> 00:32:25 enol form of guanine and see how you do. I think you'll begin 385 00:32:25 --> 00:32:30 to understand a bit. So at the heart of life is something 386 00:32:30 --> 00:32:36 that can copy DNA. And there are these exquisitely 387 00:32:36 --> 00:32:42 beautiful machines. The replicative machine in E. 388 00:32:42 --> 00:32:48 coli has 18 proteins and the ones in our bodies are even more 389 00:32:48 --> 00:32:54 sophisticated with even more parts. OK. But to replicate a DNA 390 00:32:54 --> 00:33:00 molecule there's another problem that comes up. 391 00:33:00 --> 00:33:18 Because DNA polymerases copy -- 392 00:33:18 --> 00:33:23 -- and grow chains in a 5 prime to 3 prime direction. 393 00:33:23 --> 00:33:36 And they need a 3 prime hydroxyl 394 00:33:36 --> 00:33:43 terminus. So they won't work if you just gave them a single strand of DNA. 395 00:33:43 --> 00:33:51 No DNA polymerase can handle that. It has to have something like this 396 00:33:51 --> 00:34:03 where there's a template strand -- 397 00:34:03 --> 00:34:07 -- and there's what's known as the primer strand. 398 00:34:07 --> 00:34:11 So there has to be something that has the 3 prime hydroxyl and there 399 00:34:11 --> 00:34:16 has to be something that's going to provide the template that's going to 400 00:34:16 --> 00:34:20 be copied. So if we pull strands apart like this with 5 prime to 3 401 00:34:20 --> 00:34:24 prime then there'll be 5 prime to 3 prime running in the opposite 402 00:34:24 --> 00:34:31 direction.
If we have a template like this, 403 00:34:31 --> 00:34:39 this is OK because the strand here can be copied 5 prime to 3 prime. 404 00:34:39 --> 00:34:48 This is the new strand being synthesized by the DNA polymerase. 405 00:34:48 --> 00:34:56 But what about the other strand? The replication fork is moving in 406 00:34:56 --> 00:35:03 this direction, but if the -- So here is the 3 prime to 5 prime 407 00:35:03 --> 00:35:08 direction here. So if the DNA polymerase is going 408 00:35:08 --> 00:35:13 to be copying this strand it's going to be moving backwards to the 409 00:35:13 --> 00:35:17 direction of the replication fork. Now, I guess evolution and nature 410 00:35:17 --> 00:35:22 could have selected for two types of DNA polymerases, 411 00:35:22 --> 00:35:27 one that copies 5 prime to 3 prime and one that copies in the opposite 412 00:35:27 --> 00:35:32 direction. But it didn't. And there are a number of 413 00:35:32 --> 00:35:36 theoretical reasons that we could discuss in a more advanced course 414 00:35:36 --> 00:35:40 perhaps for why that is true. But, in fact, what it does is it 415 00:35:40 --> 00:35:44 uses the same polymerase. So as these things peel apart the 416 00:35:44 --> 00:35:48 polymerase works in the other direction, but there's another 417 00:35:48 --> 00:35:52 problem. If I just peel it apart like this there's no 3 prime 418 00:35:52 --> 00:35:56 hydroxyl. So it took people quite a few years to figure out the strategy 419 00:35:56 --> 00:36:01 that's used in nature. Nature has a special enzyme that 420 00:36:01 --> 00:36:08 makes a little piece of RNA. It's called an RNA primer. And 421 00:36:08 --> 00:36:15 what it does is it provides a 3 prime hydroxyl. 422 00:36:15 --> 00:36:22 And once you have the 3 prime hydroxyl at the end of the little 423 00:36:22 --> 00:36:29 RNA chain then the DNA polymerase -- 424 00:36:29 --> 00:36:39 -- can make the DNA 5 prime to 3 prime.
425 00:36:39 --> 00:36:47 So as you peel open the replication fork then little pieces of RNA are 426 00:36:47 --> 00:36:55 used to make a new strand of DNA and it goes this way. 427 00:36:55 --> 00:37:03 Now that obviously doesn't give you a new intact DNA strand. 428 00:37:03 --> 00:37:08 And part of the clue to working out what was going on in DNA 429 00:37:08 --> 00:37:13 replication was the recognition that newly synthesized DNA was made as 430 00:37:13 --> 00:37:18 little pieces. And then later it got joined into 431 00:37:18 --> 00:37:23 longer pieces. And the person who discovered this 432 00:37:23 --> 00:37:28 was Okazaki. So these fragments of DNA that are synthesized initially 433 00:37:28 --> 00:37:34 are called Okazaki fragments, after the person who discovered this. 434 00:37:34 --> 00:37:38 It was rather puzzling because when you tried to look at the synthesis 435 00:37:38 --> 00:37:42 of DNA you're looking at a long molecule, and you found some of the 436 00:37:42 --> 00:37:47 newly synthesized material was in short pieces. And as you watched 437 00:37:47 --> 00:37:51 over time it got longer. So the cell, I think you can sort 438 00:37:51 --> 00:37:55 of see from first principles what has to happen here then. 439 00:37:55 --> 00:38:06 That in order to come and make -- 440 00:38:06 --> 00:38:10 This strand is pretty easy to do, but what the cells have to do now is 441 00:38:10 --> 00:38:14 they've got these little RNA primers. 442 00:38:14 --> 00:38:25 And then they remove the RNA by an 443 00:38:25 --> 00:38:32 enzyme that's capable of degrading the RNA or clipping it 444 00:38:32 --> 00:38:38 at the junction. And that then leaves the cell in 445 00:38:38 --> 00:38:43 this sort of situation where there are little tiny gaps in between 446 00:38:43 --> 00:38:48 these pieces of DNA. But at the end of each one of these 447 00:38:48 --> 00:38:53 is a 3 prime hydroxyl.
So another polymerase or one or 448 00:38:53 --> 00:38:58 another polymerase in the cell can fill in those little gaps 449 00:38:58 --> 00:39:04 in the DNA. And then there's one little nick 450 00:39:04 --> 00:39:10 that needs to be sealed. And so what you finally end up with 451 00:39:10 --> 00:39:16 is a 3 prime hydroxyl here, a 5 prime phosphate that's at the 452 00:39:16 --> 00:39:22 other end, and then these are joined together. This is one nucleotide 453 00:39:22 --> 00:39:28 here and the other here. These are then joined together to 454 00:39:28 --> 00:39:34 give the ordinary phosphodiester bond that links -- 455 00:39:34 --> 00:39:41 -- the two nucleotides together like 456 00:39:41 --> 00:39:46 that. And the enzyme that does that is an enzyme called DNA ligase. 457 00:39:46 --> 00:39:51 You can almost think about it as DNA Scotch tape that will take a 458 00:39:51 --> 00:39:56 little nick in DNA, if we've got a phosphate and 459 00:39:56 --> 00:40:01 hydroxyl, and it will join them together. 460 00:40:01 --> 00:40:06 So this process of replication, which can go at about a thousand 461 00:40:06 --> 00:40:12 nucleotides a second with this amazing degree of accuracy, 462 00:40:12 --> 00:40:17 uses two different DNA polymerases, both of which biochemically can only 463 00:40:17 --> 00:40:23 go in one direction. But you can see they have to be 464 00:40:23 --> 00:40:28 somehow oriented so that one of them is able to move in this direction 465 00:40:28 --> 00:40:34 and the other one is able to move in that direction. 466 00:40:34 --> 00:40:38 The key part in this sort of course is to try and understand this 467 00:40:38 --> 00:40:42 5 to 3 prime and to get this basic idea that nature had to do something.
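The 5 prime to 3 prime idea above can be sketched in a few lines of Python. This is only an illustration and not something shown in the lecture: the function name `new_strand` and the six-letter sequence are made up for the example. Because the two strands run in opposite directions, the strand that pairs with a template is its complement read backwards.

```python
# Watson-Crick pairing rules for the four DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def new_strand(template_5_to_3: str) -> str:
    """Return the strand that pairs with the template, written 5' to 3'.

    Hypothetical helper for illustration only.
    """
    # The polymerase reads the template from its 3' end toward its 5' end,
    # so walk the template backwards and pair each base as we go.
    return "".join(PAIR[base] for base in reversed(template_5_to_3))


top = "ATGGCA"             # made-up sequence, written 5' to 3'
bottom = new_strand(top)   # its partner, also written 5' to 3'
print(top, bottom)

# Copying the copy gives back the original strand.
print(new_strand(bottom) == top)
```

Writing both strands 5 prime to 3 prime is what makes the reversal necessary; it is the same bookkeeping that forces the lagging strand to be made in pieces.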
468 00:40:42 --> 00:40:47 It was fairly easy to copy one strand because 469 00:40:47 --> 00:40:51 the direction of the polymerase movement was the same as the replication fork 470 00:40:51 --> 00:40:55 movement, but the other strand had to have been much 471 00:40:55 --> 00:40:59 more of a problem. And so when you get down to a 472 00:40:59 --> 00:41:03 biochemical level, though, it's very conceptually easy 473 00:41:03 --> 00:41:07 to say, oh, you've got complementary strands so we just take it apart, 474 00:41:07 --> 00:41:11 we take the photograph and the negative and we make the opposite 475 00:41:11 --> 00:41:14 one and now we've got two copies. When you get down to the 476 00:41:14 --> 00:41:18 biochemical details there is this major biochemical issue of whether 477 00:41:18 --> 00:41:22 the polymerase can go in the 3 prime or the 5 prime direction. 478 00:41:22 --> 00:41:26 And nature has chosen to do it all or has been selected to do it all 479 00:41:26 --> 00:41:30 somehow with a polymerase going in one direction. 480 00:41:30 --> 00:41:35 There are many other aspects to DNA replication. And one of the tricks 481 00:41:35 --> 00:41:40 that I find most fascinating is that these polymerases, 482 00:41:40 --> 00:41:45 once they get on DNA they stay on. And that's part of the secret 483 00:41:45 --> 00:41:50 because it takes about a millisecond to add a nucleotide, 484 00:41:50 --> 00:41:55 but if it comes off the DNA it has to get back on. Then it 485 00:41:55 --> 00:42:00 takes about a minute. So the whole trick to being a very, 486 00:42:00 --> 00:42:04 very fast DNA polymerase is to somehow hang onto the DNA. 487 00:42:04 --> 00:42:08 So what biochemists did was they purified the actual enzymatic 488 00:42:08 --> 00:42:12 activity that could carry out this process, and then they started to 489 00:42:12 --> 00:42:16 look for other protein factors that would help the process to work 490 00:42:16 --> 00:42:20 better.
And they discovered something called a processivity 491 00:42:20 --> 00:42:24 factor which made the polymerase stay on the DNA. 492 00:42:24 --> 00:42:28 And people wondered for a lot of years how that worked and why did 493 00:42:28 --> 00:42:33 this system work so well. And finally the crystal structure of 494 00:42:33 --> 00:42:38 the processivity factor was discovered. And if I go back to 495 00:42:38 --> 00:42:43 this sort of diagram where this is the piece of DNA that's copied, 496 00:42:43 --> 00:42:48 what it turned out was that the processivity factor is basically a 497 00:42:48 --> 00:42:53 doughnut that kind of gets clamped around the DNA like that. 498 00:42:53 --> 00:42:58 So it's sort of like taking a washer with a place where you can 499 00:42:58 --> 00:43:03 pry it apart, opening it up, putting it around the DNA like this. 500 00:43:03 --> 00:43:07 And then the polymerase, more or less since this is 501 00:43:07 --> 00:43:11 topologically linked to the DNA, is like a washer sliding on a wire. 502 00:43:11 --> 00:43:16 This DNA polymerase hangs onto that and it doesn't come off. 503 00:43:16 --> 00:43:20 And I think there's a little picture of it. 504 00:43:20 --> 00:43:25 Here's a little movie. There's the DNA going through and 505 00:43:25 --> 00:43:29 this is one of these clamps. It's virtually the same structure 506 00:43:29 --> 00:43:34 in a bacterium and inside of us. But, in fact, the amino acids are 507 00:43:34 --> 00:43:38 almost all different. But the underlying structure of the 508 00:43:38 --> 00:43:43 protein is almost identical. And there are special machines 509 00:43:43 --> 00:43:48 called clamp loaders that pry open these clamps, clamp them around DNA, 510 00:43:48 --> 00:43:52 and that's part of the secret to how these polymerases are able to 511 00:43:52 --> 00:43:57 polymerize DNA so fast. There are a lot of other pieces of 512 00:43:57 --> 00:44:02 this machinery. If you go on you'll hear more about 513 00:44:02 --> 00:44:06 them.
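A rough worked example of why hanging on matters, using the two figures quoted above (about a millisecond to add a nucleotide, about a minute to get back on). Everything else here is an assumption for illustration: the function name `copy_time`, the one-million-nucleotide stretch, and the idea that an unclamped enzyme falls off after every 10 nucleotides.

```python
SECONDS_PER_NUCLEOTIDE = 0.001  # "about a millisecond to add a nucleotide"
SECONDS_TO_REBIND = 60.0        # "about a minute" to get back on the DNA


def copy_time(total_nucleotides: int, run_length: int) -> float:
    """Seconds needed to copy a stretch of DNA.

    run_length is a hypothetical number of nucleotides added before the
    polymerase falls off and has to rebind.
    """
    # Number of separate runs needed (ceiling division).
    runs = -(-total_nucleotides // run_length)
    # One rebinding between consecutive runs, none after the last run.
    rebinds = runs - 1
    return (total_nucleotides * SECONDS_PER_NUCLEOTIDE
            + rebinds * SECONDS_TO_REBIND)


stretch = 1_000_000                       # assumed length, for illustration
clamped = copy_time(stretch, stretch)     # never falls off
unclamped = copy_time(stretch, 10)        # assumed: falls off every 10
print(f"clamped:   {clamped / 60:.1f} minutes")
print(f"unclamped: {unclamped / 86400:.1f} days")
```

With these made-up numbers the unclamped enzyme spends almost all of its time getting back on rather than copying, which is the point of the clamp.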
I just want to give you one of the most recent insights. 514 00:44:06 --> 00:44:10 I mean this, as you might guess, since DNA replication is at the 515 00:44:10 --> 00:44:14 heart of life it's been studied very, very hard, ever since the discovery 516 00:44:14 --> 00:44:19 of the DNA helix. My colleague, Alan Grossman, made quite a 517 00:44:19 --> 00:44:23 discovery just probably three or four years ago. 518 00:44:23 --> 00:44:27 He took that green fluorescent protein that we've seen a few times, 519 00:44:27 --> 00:44:31 and he actually joined the gene encoding green fluorescent protein 520 00:44:31 --> 00:44:36 to the back end of a piece of the DNA polymerase. 521 00:44:36 --> 00:44:39 So wherever the DNA polymerase went now there was a little fluorescent 522 00:44:39 --> 00:44:43 molecule. And he looked to see where it was in the cell. 523 00:44:43 --> 00:44:47 And I, like many other people, had for years taught, and this is 524 00:44:47 --> 00:44:50 why, you know, I have respect for the fact that I'm 525 00:44:50 --> 00:44:54 just teaching you the current model. For much of my career I taught, so 526 00:44:54 --> 00:44:58 DNA polymerase is sort of like a train going down the tracks a 527 00:44:58 --> 00:45:03 thousand nucleotides per second. And we're doing all this stuff with 528 00:45:03 --> 00:45:09 the leading and with the two strands. So let me just put those words up 529 00:45:09 --> 00:45:15 while I'm up there. This one is called, 530 00:45:15 --> 00:45:22 this strand that's easy to replicate is called the leading strand. 531 00:45:22 --> 00:45:28 And this one where you have to do the primer and whatnot is called the 532 00:45:28 --> 00:45:33 lagging strand. In any case, what I had taught was 533 00:45:33 --> 00:45:37 that polymerase was like a train running on tracks. 534 00:45:37 --> 00:45:41 You could calculate how fast it would move.
What Alan, 535 00:45:41 --> 00:45:45 to his amazement I imagine, found was when he looked to see 536 00:45:45 --> 00:45:48 where the DNA polymerase was, it wasn't spread out all over the 537 00:45:48 --> 00:45:52 cell as you would expect if it was a thing running on tracks. 538 00:45:52 --> 00:45:56 In fact, it was in the center of the cell. And then late in cell 539 00:45:56 --> 00:46:00 division it split into two spots that went to the midpoints of what 540 00:46:00 --> 00:46:04 would be the daughter cells. And so what he ended up realizing 541 00:46:04 --> 00:46:09 from that was that instead it was more as if the polymerase was a 542 00:46:09 --> 00:46:14 factory and it pulled the DNA through it rather than it traveling 543 00:46:14 --> 00:46:19 down the tracks of the DNA. And that was a very surprising 544 00:46:19 --> 00:46:24 discovery that went against all the dogma and all the pictures in the 545 00:46:24 --> 00:46:30 textbook. And it was a discovery at MIT. 546 00:46:30 --> 00:46:34 That was published in, I think it was 2001, something like 547 00:46:34 --> 00:46:39 that, a very recent discovery. Things keep changing. That's again 548 00:46:39 --> 00:46:43 why I keep emphasizing I cannot teach you a fact in biology. 549 00:46:43 --> 00:46:48 I can teach you the best understanding we have that explains 550 00:46:48 --> 00:46:52 the experiments to date. But somebody may make a discovery 551 00:46:52 --> 00:46:57 tomorrow. That means we'll have to change our understanding. 552 00:46:57 --> 00:47:00 OK? So good luck on the exam. I'll see you on Monday, OK?