1 00:00:00 --> 00:00:06 By the time that Watson and Crick figured out the structure of DNA, 2 00:00:06 --> 00:00:12 you know, it was sort of obvious that since the two strands were 3 00:00:12 --> 00:00:18 complementary you could see how it replicated. And they also could see 4 00:00:18 --> 00:00:24 that somehow the information must be encoded in the sequence of letters 5 00:00:24 --> 00:00:30 down the strands of the DNA. But it wasn't obvious what the code 6 00:00:30 --> 00:00:36 was and how it was arranged, how it worked. And in principle it 7 00:00:36 --> 00:00:42 was anything you could do with four letters. And so I pointed out 8 00:00:42 --> 00:00:47 the other day this was sort of a four-letter alphabet. 9 00:00:47 --> 00:00:53 And I think it's useful to think of it this way with A, 10 00:00:53 --> 00:00:59 G, C and T, and RNA as also being a four-letter alphabet. 11 00:00:59 --> 00:01:05 But proteins are actually a 20-letter alphabet because there are 12 00:01:05 --> 00:01:12 20 different amino acids. And so somehow, since one of the 13 00:01:12 --> 00:01:19 key things that the DNA had to do, it somehow had to encode the 14 00:01:19 --> 00:01:26 information for making the proteins. And there was a lot of work on 15 00:01:26 --> 00:01:32 protein biosynthesis at the time. And it looked pretty complicated. 16 00:01:32 --> 00:01:36 People had found that RNA seemed to be important. Cells that were 17 00:01:36 --> 00:01:41 making lots of protein had lots of RNA in them. And another thing they 18 00:01:41 --> 00:01:45 noticed was that if you looked in eukaryotic cells the DNA stayed in 19 00:01:45 --> 00:01:49 the nucleus. The proteins, most of them, were out in the 20 00:01:49 --> 00:01:54 cytoplasm. And the evidence was that they were made out in the 21 00:01:54 --> 00:01:58 cytoplasm. So somehow the information had to get out of the 22 00:01:58 --> 00:02:03 nucleus where the DNA was and into the cytoplasm. 
23 00:02:03 --> 00:02:06 And biochemists were breaking cells open and trying to make cellular 24 00:02:06 --> 00:02:10 extracts that would synthesize proteins. And I think it's fair to 25 00:02:10 --> 00:02:14 say at the time that it looked extremely complicated. 26 00:02:14 --> 00:02:18 And so thinking about how DNA encoded information and got 27 00:02:18 --> 00:02:22 translated into proteins was a very complex issue. 28 00:02:22 --> 00:02:26 But then actually there was a very interesting development that had a 29 00:02:26 --> 00:02:30 strong influence on Watson and Crick and led to them, 30 00:02:30 --> 00:02:34 Crick in particular, getting a key insight into the 31 00:02:34 --> 00:02:38 nature of this coding problem. There's a physicist, 32 00:02:38 --> 00:02:43 George Gamow, who some of you know. He proposed the “Big Bang Theory”. 33 00:02:43 --> 00:02:48 A very strong theoretical physicist. And he wrote a letter to Watson and 34 00:02:48 --> 00:02:53 Crick. He thought he'd figured out the basis of the genetic code. 35 00:02:53 --> 00:02:58 And his idea was you had these sequences of A, G, C and Ts. 36 00:02:58 --> 00:03:01 And so everywhere the two bases came together there was sort of like a 37 00:03:01 --> 00:03:05 little different shaped hole. So his idea was the amino acids 38 00:03:05 --> 00:03:09 would stick into these little holes. And he had a theory showing that 39 00:03:09 --> 00:03:13 you could encode the sequence of proteins by having the side chains 40 00:03:13 --> 00:03:17 in the amino acids stick into these little holes along the DNA. 41 00:03:17 --> 00:03:21 Now, there turned out to be a number of problems with that. 42 00:03:21 --> 00:03:25 It didn't take into account the involvement of RNA, 43 00:03:25 --> 00:03:29 which there sort of was quite a bit of evidence for. 
44 00:03:29 --> 00:03:32 And more importantly it didn't take into account the structure of the 45 00:03:32 --> 00:03:36 side chains of the amino acids, which you guys have been exposed to. 46 00:03:36 --> 00:03:39 But it had a very profound influence on Watson and Crick. 47 00:03:39 --> 00:03:43 They read this letter. They immediately realized the idea was 48 00:03:43 --> 00:03:47 wrong and went out and had a lunch at a pub, decided again how they 49 00:03:47 --> 00:03:50 actually thought there were 25 amino acids, but they realized some of 50 00:03:50 --> 00:03:54 them were just sort of special ones that were modified only in 51 00:03:54 --> 00:03:58 particular proteins and there were really 20 amino acids that were 52 00:03:58 --> 00:04:02 found universally in nature in proteins. 53 00:04:02 --> 00:04:05 And what they, Crick in particular, 54 00:04:05 --> 00:04:09 realized was that maybe instead of having to think about protein 55 00:04:09 --> 00:04:12 synthesis through this very complex set of extracts and mixtures a 56 00:04:12 --> 00:04:16 biochemist would work on, that he could think about it at a 57 00:04:16 --> 00:04:20 purely theoretical level, which basically is up at this kind 58 00:04:20 --> 00:04:23 of level. But if you have a molecule that has four letters and 59 00:04:23 --> 00:04:27 it's going to be encoding proteins how does it do it? 60 00:04:27 --> 00:04:31 Can I work out sort of the basis or a possible theory for how that could 61 00:04:31 --> 00:04:35 happen without actually knowing all of the biochemical details? 62 00:04:35 --> 00:04:40 So Crick made a couple of simplifying assumptions. 63 00:04:40 --> 00:04:45 One was that the DNA only determined -- 64 00:04:45 --> 00:04:56 -- the linear sequence of amino 65 00:04:56 --> 00:05:05 acids in a protein. That all this information about the 66 00:05:05 --> 00:05:09 3-dimensional stuff came from the properties of the linear sequence 67 00:05:09 --> 00:05:14 once it was made. 
And I think you hopefully have 68 00:05:14 --> 00:05:18 enough understanding of hydrophobic and other sorts of interactions that 69 00:05:18 --> 00:05:23 would cause a linear sequence of amino acids to take a particular 70 00:05:23 --> 00:05:27 conformation. And the other assumption he made was that 71 00:05:27 --> 00:05:32 it must be universal. And it would be hard to see how life 72 00:05:32 --> 00:05:36 could have started if there wasn't some kind of code that was universal 73 00:05:36 --> 00:05:41 between organisms. And if you start from those kinds 74 00:05:41 --> 00:05:46 of considerations then what you can see is you cannot just have a 75 00:05:46 --> 00:05:50 one-to-one correspondence between a letter in the nucleic acid alphabet 76 00:05:50 --> 00:05:55 and a letter down here. If A stood for valine that would be 77 00:05:55 --> 00:06:00 fine, but you could only code for four amino acids that way. 78 00:06:00 --> 00:06:07 So if you had one-letter words in DNA there are four possibilities. 79 00:06:07 --> 00:06:14 And so it could only make four. If you had two-letter words then 80 00:06:14 --> 00:06:21 you'd have 16 possibilities, still not enough for all the amino 81 00:06:21 --> 00:06:29 acids. If you had a three-letter word -- 82 00:06:29 --> 00:06:36 -- then you could do 64, 83 00:06:36 --> 00:06:40 and in principle that would be all you'd need. It doesn't rule out 84 00:06:40 --> 00:06:44 there couldn't be five or six or seven-letter words. 85 00:06:44 --> 00:06:47 Or if you think about this as they were thinking about it at the time, 86 00:06:47 --> 00:06:51 even if it were let's say a three-letter word, 87 00:06:51 --> 00:06:55 is it a code where you have one word, then the next word, 88 00:06:55 --> 00:06:59 then the next word? Or could it be an overlapping word? And 89 00:06:59 --> 00:07:03 what about punctuation? 
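
The counting argument above (4, then 16, then 64 possible words) can be checked in a few lines. This is only a minimal sketch in Python; the names `words_possible` and `shortest_sufficient_length` are made up for illustration and do not come from the lecture.

```python
# Count how many distinct words of a given length a four-letter alphabet
# allows, and find the shortest word length that can cover 20 amino acids.
ALPHABET_SIZE = 4   # A, G, C, T
AMINO_ACIDS = 20    # amino acids that have to be encoded


def words_possible(word_length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct words of the given length."""
    return alphabet_size ** word_length


def shortest_sufficient_length(needed: int = AMINO_ACIDS) -> int:
    """Smallest word length whose word count is at least `needed`."""
    length = 1
    while words_possible(length) < needed:
        length += 1
    return length


for n in (1, 2, 3):
    print(n, words_possible(n))       # 1 4, then 2 16, then 3 64
print(shortest_sufficient_length())   # 3
```

The sketch only shows that three letters is the minimum that suffices; as the lecture says, counting alone does not rule out longer words, overlapping words, or punctuation.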
And maybe another thing, 90 00:07:03 --> 00:07:07 you can see if it's AG, CT, etc., there's a frame of reference 91 00:07:07 --> 00:07:11 problem, because if I'm going to read them in groups of three, 92 00:07:11 --> 00:07:15 if I start here I'll get one word, but if I start one letter over the 93 00:07:15 --> 00:07:19 next group of three won't be the same. So somehow there would have 94 00:07:19 --> 00:07:23 to be a starting point. And so these are the sort of 95 00:07:23 --> 00:07:27 considerations that they had to take into account. And, in fact, 96 00:07:27 --> 00:07:32 Watson, excuse me. Francis Crick and another scientist 97 00:07:32 --> 00:07:37 Sydney Brenner and some other scientists worked out a very elegant 98 00:07:37 --> 00:07:42 genetic experiment that demonstrated that it was a three-letter code. 99 00:07:42 --> 00:07:47 And I don't have the time to go into it in this course. 100 00:07:47 --> 00:07:53 If you take a genetics course it's a very beautiful experiment. 101 00:07:53 --> 00:07:58 The principle of the thing, which I could show you rather easily, 102 00:07:58 --> 00:08:03 is if you're writing a thing where you're reading in three-letter words, 103 00:08:03 --> 00:08:08 something like this. The cat ran out and, 104 00:08:08 --> 00:08:12 I don't know, ate the rat or something like that. 105 00:08:12 --> 00:08:16 And these were all just continuously run together, 106 00:08:16 --> 00:08:20 not separated out, but I've put them out here. As you can see they're 107 00:08:20 --> 00:08:24 three-letter words. If you lost one letter then it 108 00:08:24 --> 00:08:28 would change to sort of gibberish. You'd get stuff that looked like 109 00:08:28 --> 00:08:33 this. 
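
The frame-of-reference problem and the effect of losing one letter can be sketched in Python. This is an illustration under assumptions: the sentence is the lecture's example written without spaces, and the helper names `read_words` and `delete_letter` are hypothetical.

```python
# Read a run-together message in three-letter words, then see what happens
# when the starting point shifts or a single letter is lost.
MESSAGE = "THECATRANOUTANDATETHERAT"  # "the cat ran out and ate the rat"


def read_words(text: str, start: int = 0, size: int = 3) -> list[str]:
    """Split text into consecutive words of `size` letters from `start`.

    An incomplete word at the end is dropped.
    """
    return [text[i:i + size] for i in range(start, len(text) - size + 1, size)]


def delete_letter(text: str, index: int) -> str:
    """Return text with the letter at `index` removed."""
    return text[:index] + text[index + 1:]


print(read_words(MESSAGE))
# ['THE', 'CAT', 'RAN', 'OUT', 'AND', 'ATE', 'THE', 'RAT']

print(read_words(MESSAGE, start=1))
# ['HEC', 'ATR', 'ANO', 'UTA', 'NDA', 'TET', 'HER']

print(read_words(delete_letter(MESSAGE, 4)))
# ['THE', 'CTR', 'ANO', 'UTA', 'NDA', 'TET', 'HER']
```

Starting one letter over and deleting one letter early in the message garble everything downstream in the same way, which is why a defined starting point matters.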
And if you put one in you'd have the 110 00:08:33 --> 00:08:39 same problem, but if you were to either take out three letters or put 111 00:08:39 --> 00:08:45 in three letters then, even though there'd be a little mess 112 00:08:45 --> 00:08:51 in here somewhere, say I took out two more of these, 113 00:08:51 --> 00:08:57 what we would now have from then is the rest of it would now 114 00:08:57 --> 00:09:01 make sense again. And they did this sort of experiment 115 00:09:01 --> 00:09:04 genetically. They managed to figure out there were two kinds of 116 00:09:04 --> 00:09:07 mutations they could get in a particular way. 117 00:09:07 --> 00:09:10 Some were putting in a letter. Some were taking out a letter. And 118 00:09:10 --> 00:09:13 they didn't know at the time whether they were adding or deleting, 119 00:09:13 --> 00:09:16 but they could tell they were in the opposite directions. 120 00:09:16 --> 00:09:19 And then they found if they took three of one class, 121 00:09:19 --> 00:09:22 like three that would delete a letter and put them all together 122 00:09:22 --> 00:09:25 then things would more or less work. Or if they put three that stuck in 123 00:09:25 --> 00:09:28 an extra letter then everything would more or less work. 124 00:09:28 --> 00:09:34 So there was a genetic proof of the three-letter part of the code before 125 00:09:34 --> 00:09:41 it was figured out exactly how the code itself worked. 126 00:09:41 --> 00:09:48 And so going from this sort of theoretical insight into the code to 127 00:09:48 --> 00:09:55 actually figuring out how proteins were made there was still quite a 128 00:09:55 --> 00:10:02 lot of stuff that had to happen. And one was the concept of 129 00:10:02 --> 00:10:08 messenger RNA. As I said, there'd been quite a lot 130 00:10:08 --> 00:10:13 of evidence that RNA was somehow involved in protein synthesis 131 00:10:13 --> 00:10:17 because cells that made a lot of protein made a lot of RNA. 
132 00:10:17 --> 00:10:22 And it seemed to be in the right sort of place in the cell for the 133 00:10:22 --> 00:10:27 proteins to be made. So the idea emerged that RNA was 134 00:10:27 --> 00:10:32 somehow a carrier of information from the DNA to the cytoplasm. 135 00:10:32 --> 00:10:39 So it could serve as a template for making proteins. 136 00:10:39 --> 00:10:47 So the idea that the cell copied the sequence of a portion -- 137 00:10:47 --> 00:10:56 -- of the DNA. 138 00:10:56 --> 00:11:02 And we'd probably think of this as a gene right now. 139 00:11:02 --> 00:11:07 Into RNA. And the RNA would go into the cytoplasm. 140 00:11:07 --> 00:11:13 That's the part outside the nucleus. And then it would serve 141 00:11:13 --> 00:11:25 as a template -- 142 00:11:25 --> 00:11:30 -- for protein synthesis. Because of this thought that if you 143 00:11:30 --> 00:11:35 had a cell like this with a nucleus and the DNA in here, 144 00:11:35 --> 00:11:40 that if a piece of RNA were to go out into the cytoplasm and have 145 00:11:40 --> 00:11:46 those properties it would be functioning more or less as a 146 00:11:46 --> 00:11:51 messenger. It would be carrying the genetic information from inside the 147 00:11:51 --> 00:11:56 nucleus out into the cytoplasm. And so the term began to be used of 148 00:11:56 --> 00:12:02 a messenger RNA. And so over here I'll put an mRNA to 149 00:12:02 --> 00:12:08 indicate that. Now, one thing you can also see is 150 00:12:08 --> 00:12:15 we've talked about the structure of DNA and RNA. And it's essentially 151 00:12:15 --> 00:12:22 the same with one difference. This is the nucleotide, 152 00:12:22 --> 00:12:29 which is the fundamental building block of DNA. 153 00:12:29 --> 00:12:34 And if you recall, in DNA there's a hydroxyl, 154 00:12:34 --> 00:12:40 excuse me, a hydrogen there, but in RNA there is this extra 155 00:12:40 --> 00:12:45 hydroxyl. This is 1 prime, 2 prime, 3 prime, 4 prime, excuse me. 
156 00:12:45 --> 00:12:51 Let's just leave it like that for the moment, 1, 2, 3, 157 00:12:51 --> 00:12:57 4, 5. And so the DNA, as you heard, was deoxyribonucleic acid 158 00:12:57 --> 00:13:03 because it's missing this. But other than that the backbones 159 00:13:03 --> 00:13:11 are similar and the letters are almost the same. 160 00:13:11 --> 00:13:19 The A, the G and the C are exactly the same bases in DNA and RNA. 161 00:13:19 --> 00:13:27 The only difference is with the T and the uracil. 162 00:13:27 --> 00:13:35 So this is thymine which is found in DNA. 163 00:13:35 --> 00:13:45 And this is uracil -- 164 00:13:45 --> 00:13:55 -- which is found in -- 165 00:13:55 --> 00:14:00 -- RNA. So the base pairing is over on this part of the molecule. 166 00:14:00 --> 00:14:06 So whether or not you have a methyl group doesn't really change the base 167 00:14:06 --> 00:14:12 pairing. And so this process of copying information in DNA to 168 00:14:12 --> 00:14:18 information that's in RNA was seen as essentially the same kind of 169 00:14:18 --> 00:14:24 language, but it's just sort of like taking somebody's word processor 170 00:14:24 --> 00:14:28 file and writing out longhand. You'd be transcribing the 171 00:14:28 --> 00:14:32 information but it would be essentially the same kind of 172 00:14:32 --> 00:14:36 information in essentially the same form. So this is known 173 00:14:36 --> 00:14:44 as transcription. 174 00:14:44 --> 00:14:48 I'll take just one very brief thing. Some of you may wonder why did 175 00:14:48 --> 00:14:53 nature do it this way? Why didn't it just use uracil in 176 00:14:53 --> 00:14:58 DNA? So as a very brief aside, I think we understand pretty much 177 00:14:58 --> 00:15:04 why it does it. And that is cytosine has this 178 00:15:04 --> 00:15:11 structure. So this is C which is found in DNA but it undergoes, 179 00:15:11 --> 00:15:18 all of your DNA is a chemical and it's able to undergo spontaneous 180 00:15:18 --> 00:15:25 kinds of damage. 
In fact, in every one of our human 181 00:15:25 --> 00:15:32 cells every day, 10,000 times in any given cell a 182 00:15:32 --> 00:15:39 base falls off totally just leaving the deoxyribose sitting there. 183 00:15:39 --> 00:15:44 And the cells have to fix it up. And we have DNA repair systems that 184 00:15:44 --> 00:15:50 do that. But another very common kind of thing that happens is that 185 00:15:50 --> 00:15:56 this NH2 group deaminates. And if you do that, if a C happens 186 00:15:56 --> 00:16:02 to deaminate in DNA it gives you a uracil. 187 00:16:02 --> 00:16:07 And if that ever happens, the cell is actually able to tell 188 00:16:07 --> 00:16:12 that something went wrong because uracil is not supposed to be in DNA 189 00:16:12 --> 00:16:18 and there are repair systems that constantly scan the DNA and take out 190 00:16:18 --> 00:16:23 any uracils that are in there. And the reason, if instead of using 191 00:16:23 --> 00:16:29 thymine it used uracil then the cell wouldn't know whether the 192 00:16:29 --> 00:16:34 uracil got there because it was supposed to be there as part of the 193 00:16:34 --> 00:16:40 sequence or whether it had arisen by deamination of a cytosine. 194 00:16:40 --> 00:16:45 It's a minor point but I think we do have an understanding as to why 195 00:16:45 --> 00:16:51 there's thymine in DNA and uracil in RNA. This isn't such a worry in 196 00:16:51 --> 00:16:56 RNA. OK. But anyway. So there's still a really big 197 00:16:56 --> 00:17:02 problem here, though, that Watson and Crick and others 198 00:17:02 --> 00:17:07 were grappling with. And it has to do, 199 00:17:07 --> 00:17:11 as I say, with this fact that the information up here is the first in 200 00:17:11 --> 00:17:16 DNA and RNA. 
It's written as a sequence of letters, 201 00:17:16 --> 00:17:20 if you will, chemical letters, but there are only four letters in 202 00:17:20 --> 00:17:24 the DNA alphabet and essentially the same four letters in 203 00:17:24 --> 00:17:29 the RNA alphabet. However, the protein language has 204 00:17:29 --> 00:17:34 got a totally different alphabet so it's somehow like sort of 205 00:17:34 --> 00:17:39 translating now from English to Japanese or something like that. 206 00:17:39 --> 00:17:44 Some really fundamental change had to happen because there was a real 207 00:17:44 --> 00:17:49 conversion from one kind of language to another. And so this process is 208 00:17:49 --> 00:17:54 known as translation, as going from information that's 209 00:17:54 --> 00:17:59 written using a four-letter nucleic acid alphabet to information that's 210 00:17:59 --> 00:18:05 written using a 20-letter amino acid alphabet. 211 00:18:05 --> 00:18:09 And Crick on purely theoretical grounds figured, 212 00:18:09 --> 00:18:14 well, if you're going from one language to another what do you need? 213 00:18:14 --> 00:18:19 You need a translator. And what's a translator? 214 00:18:19 --> 00:18:24 A translator is someone who speaks both languages. 215 00:18:24 --> 00:18:29 So his idea was that if there was -- I'm going to just separate out, 216 00:18:29 --> 00:18:34 let's say this is the messenger RNA. And I, just for clarity here, have 217 00:18:34 --> 00:18:39 spaced out the three-letter words so we can see them. 218 00:18:39 --> 00:18:44 These would be three like G-A-C or something like that in the RNA. 219 00:18:44 --> 00:18:49 That there would be some kind of translator. And his idea was that 220 00:18:49 --> 00:18:54 it would be something that had a particular amino acid at one end and 221 00:18:54 --> 00:19:00 it had the complementary nucleotides at the other end. 
222 00:19:00 --> 00:19:05 So it could, if you will, read the genetic code that was 223 00:19:05 --> 00:19:11 written in the RNA using the nucleic acid alphabet, 224 00:19:11 --> 00:19:16 but it would also be speaking the amino acid language. 225 00:19:16 --> 00:19:22 Got the idea? So the idea was that this would be, 226 00:19:22 --> 00:19:28 they used the words adaptor or a translator. So that was on 227 00:19:28 --> 00:19:32 basically theoretical grounds. If you had to go from a four-letter 228 00:19:32 --> 00:19:36 language to a 20-letter language you needed some kind of translator or 229 00:19:36 --> 00:19:40 adaptor. Now, at that same time that these 230 00:19:40 --> 00:19:44 considerations were going on, biochemists began to find a class of 231 00:19:44 --> 00:19:54 small RNAs -- 232 00:19:54 --> 00:20:02 -- that had an amino acid -- 233 00:20:02 --> 00:20:08 -- attached. And so there were entities that had just the sort of 234 00:20:08 --> 00:20:14 properties that Crick had envisioned you'd need from theoretical 235 00:20:14 --> 00:20:20 considerations. These were given the name transfer 236 00:20:20 --> 00:20:26 RNAs or tRNAs as they're usually referred to now. 237 00:20:26 --> 00:20:31 And I've told you that RNA has, since it's got nucleic acid bases, 238 00:20:31 --> 00:20:37 if you have a single strand of either an RNA or a DNA and you don't 239 00:20:37 --> 00:20:42 have a complementary double-strand, then if there are complementary 240 00:20:42 --> 00:20:48 sequences they can come together and pair just the same way that 241 00:20:48 --> 00:20:53 complementary sequences can come together in DNA. 242 00:20:53 --> 00:20:59 And in the case of tRNAs, once the sequence of these was 243 00:20:59 --> 00:21:05 determined, oops. There we go. They folded up into a 244 00:21:05 --> 00:21:11 clover leaf shape. 
And the amino acid is attached up 245 00:21:11 --> 00:21:17 at the 3 prime end of the chain up here in what's known as the acceptor 246 00:21:17 --> 00:21:23 part of the molecule. And so that corresponds to this 247 00:21:23 --> 00:21:30 part up here. And here is what's known as the anticodon. 248 00:21:30 --> 00:21:43 Each of these three-letter words -- 249 00:21:43 --> 00:21:50 -- in nucleic acid language is called a codon. And so something that 250 00:21:50 --> 00:21:57 had a complementary sequence to a codon was called an anticodon. 251 00:21:57 --> 00:22:05 So if G-G-G is the codon then C-C-C would be the anticodon. 252 00:22:05 --> 00:22:09 Now, this is just a schematic, as you can see. It shows where the 253 00:22:09 --> 00:22:14 hydrogen bonds are that form this stuff. When the crystal structures 254 00:22:14 --> 00:22:19 were done, the first crystal structure of tRNA was actually done 255 00:22:19 --> 00:22:24 by Alex Rich. He's in the Biology Department at MIT. 256 00:22:24 --> 00:22:29 And he was in this picture I showed you talking to Matt Meselson. 257 00:22:29 --> 00:22:33 And although we cannot see this terribly well, 258 00:22:33 --> 00:22:37 maybe you could hit the lights here, the crystal structure showed that 259 00:22:37 --> 00:22:41 the molecule didn't look like a clover leaf as in there. 260 00:22:41 --> 00:22:45 It had more this shape. And I'll show you this more clearly in this 261 00:22:45 --> 00:22:49 picture. I showed you this little part of the thing when I was showing 262 00:22:49 --> 00:22:53 you how an RNA could form. For example, if you copy the gene 263 00:22:53 --> 00:22:57 encoding a tRNA and, for example, the sequence here in 264 00:22:57 --> 00:23:01 green is complementary to the sequence here, 265 00:23:01 --> 00:23:05 or the sequence here in sort of blue or purple was complementary 266 00:23:05 --> 00:23:08 to the sequence here. 
That what can happen then, 267 00:23:08 --> 00:23:12 if you allow a single-stranded RNA like this to fold up, 268 00:23:12 --> 00:23:16 thermodynamically it will then go to the lower energy state which 269 00:23:16 --> 00:23:20 involves being able to make these hydrogen bonds. 270 00:23:20 --> 00:23:24 And I think you can sort of see the clover leaf. Here's one of the 271 00:23:24 --> 00:23:28 leaves. The other is down here and the others. It's a little 272 00:23:28 --> 00:23:32 bit distorted here. And the reason is, 273 00:23:32 --> 00:23:36 because I'm going to continue now to show you how this structure, 274 00:23:36 --> 00:23:40 once you get to the clover leaf, then it folds up to make other kinds 275 00:23:40 --> 00:23:44 of interactions and it takes that shape with the amino acid going on at this 276 00:23:44 --> 00:23:48 end and the anticodon being down here. And what's happening now is 277 00:23:48 --> 00:23:52 they've morphed on the van der Waals surfaces so you can see what this 278 00:23:52 --> 00:23:56 would look like, 3-dimensional shape. 279 00:23:56 --> 00:24:00 The amino acid would be attached at that end and there is the anticodon 280 00:24:00 --> 00:24:04 that would be able to recognize the codon in the RNA. 281 00:24:04 --> 00:24:11 I mean the physical reality is pretty close to this simple little 282 00:24:11 --> 00:24:18 depiction here. OK. So once this basic paradigm 283 00:24:18 --> 00:24:25 had been straightened out that gave rise to this idea then, 284 00:24:25 --> 00:24:32 putting it all together, that the information in DNA, 285 00:24:32 --> 00:24:39 that a portion of it would be copied into RNA and that would go out into 286 00:24:39 --> 00:24:45 the cytoplasm. And then in the cytoplasm these 287 00:24:45 --> 00:24:51 translators, the tRNAs would be able to decode, read the nucleic acid 288 00:24:51 --> 00:24:57 information and use that to determine the linear order of amino 289 00:24:57 --> 00:25:02 acids in a protein. 
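
The flow just described, copying DNA into RNA and then reading the RNA with adaptors, can be sketched in Python. This is a toy illustration, not a model of the biochemistry: the function names and the example sequence are assumptions, and the anticodon is listed base by base in pairing order, ignoring strand direction, to match the lecture's G-G-G and C-C-C example.

```python
# Sketch of the information flow: copy a stretch of DNA into mRNA, split the
# message into codons, and work out the anticodon each adaptor would need.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def transcribe(dna_strand: str) -> str:
    """mRNA with the same sequence as the given DNA strand, U replacing T."""
    return dna_strand.upper().replace("T", "U")


def codons(mrna: str) -> list[str]:
    """Non-overlapping three-letter words, read from the first base."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]


def anticodon(codon: str) -> str:
    """Bases that pair with the codon, position by position.

    Strand direction is ignored in this sketch, so GGG gives CCC.
    """
    return "".join(RNA_PAIR[base] for base in codon)


mrna = transcribe("GACTTTGGG")   # hypothetical example sequence
print(mrna)                      # GACUUUGGG
for word in codons(mrna):
    print(word, anticodon(word))  # GAC CUG, then UUU AAA, then GGG CCC
```

What the sketch leaves out is the other half of the adaptor: which amino acid is attached to the tRNA carrying each anticodon. That mapping is the code itself.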
Crick, when he came up with this, 290 00:25:02 --> 00:25:07 gave this the term “the central dogma”. And people still use this 291 00:25:07 --> 00:25:12 term to apply this idea of information flow going from DNA to 292 00:25:12 --> 00:25:17 RNA to protein. And it's still used to this day. 293 00:25:17 --> 00:25:23 There's actually sort of a little twist to that, 294 00:25:23 --> 00:25:28 because at the time that Crick proposed the term he actually 295 00:25:28 --> 00:25:33 thought that the word dogma meant “an idea for which there is not 296 00:25:33 --> 00:25:38 reasonable evidence”. But he was sort of amused years 297 00:25:38 --> 00:25:43 later to realize that a more reasonable definition of dogma is it 298 00:25:43 --> 00:25:47 is something that a true believer cannot doubt. So he kind of 299 00:25:47 --> 00:25:52 accidentally made an assertion that he was right, but fortunately he was 300 00:25:52 --> 00:26:03 right. Now -- 301 00:26:03 --> 00:26:06 -- the next big job, though, in working this out was to 302 00:26:06 --> 00:26:14 crack the code. 303 00:26:14 --> 00:26:19 And it's fine to know that it's a 3-letter code and it's fine to know 304 00:26:19 --> 00:26:25 it goes into RNA and then the tRNAs translate it, but if you cannot 305 00:26:25 --> 00:26:31 crack the code then you have no idea what any of the information means. 306 00:26:31 --> 00:26:34 It was sort of like before the Rosetta Stone they could look at the 307 00:26:34 --> 00:26:38 hieroglyphics in the Egyptian tombs and they could see that it was a lot 308 00:26:38 --> 00:26:42 of information and there were symbols and so on, 309 00:26:42 --> 00:26:46 but they didn't know what it meant until finally they got something 310 00:26:46 --> 00:26:50 that allowed them to relate it to a language they did know and they were 311 00:26:50 --> 00:26:54 able to work out the principles. So somehow scientists had then to 312 00:26:54 --> 00:26:58 crack the code. 
And there were two scientists who 313 00:26:58 --> 00:27:02 played a really big role. One was Marshall Nirenberg who was 314 00:27:02 --> 00:27:08 at NIH and is, in fact, still at NIH. 315 00:27:08 --> 00:27:14 And the other was a scientist who's on the same floor as me at MIT, 316 00:27:14 --> 00:27:20 Gobind Khorana. And they used two different approaches, 317 00:27:20 --> 00:27:26 but between these two approaches the genetic code was cracked. 318 00:27:26 --> 00:27:32 And what Nirenberg did was to take a protein synthesizing -- 319 00:27:32 --> 00:27:42 -- extract that he knew needed RNA 320 00:27:42 --> 00:27:47 in order to work. So that wasn't a surprise at this 321 00:27:47 --> 00:27:52 point because people were thinking the RNA would be the message. 322 00:27:52 --> 00:27:57 And at that point the ability to make synthesized nucleic acids was 323 00:27:57 --> 00:28:03 quite limited compared to what we do now. 324 00:28:03 --> 00:28:08 And so there were different ways of making them. Sometimes you could do 325 00:28:08 --> 00:28:13 it enzymatically. But what Nirenberg, 326 00:28:13 --> 00:28:18 for example, was able to make was poly-U. So this was an RNA that was 327 00:28:18 --> 00:28:23 just UUUUUUU. And then what he did was he set up 20 reactions, 328 00:28:23 --> 00:28:28 and in every reaction he put some of this extract, he put poly-U and he 329 00:28:28 --> 00:28:34 put 19 of the amino acids that were unlabeled. 330 00:28:34 --> 00:28:39 And then only one amino acid that had radiolabel in it. 331 00:28:39 --> 00:28:44 So he ran these 20 reactions and waited to see in any of these did he 332 00:28:44 --> 00:28:49 get protein made that would have been coded by the poly-U. 333 00:28:49 --> 00:28:55 And what he ended up with was polyphenylalanine. 334 00:28:55 --> 00:29:07 Which you may recall when we were 335 00:29:07 --> 00:29:15 talking about structures of amino acids, there's the basic backbone. 
336 00:29:15 --> 00:29:23 And phenylalanine is the one that has, if you will, 337 00:29:23 --> 00:29:31 a benzene ring hanging off the end. And so what that meant was that UUU 338 00:29:31 --> 00:29:39 must code for Phe, or phenylalanine. 339 00:29:39 --> 00:29:45 And if it's UUU in the RNA that must mean that the DNA that encodes this 340 00:29:45 --> 00:29:51 must have that sequence AAA and TTT. And you can see that one of the two 341 00:29:51 --> 00:29:57 strands of the DNA, since T base pairs the same as 342 00:29:57 --> 00:30:03 uracil, one of the strands in the DNA is going to have the same 343 00:30:03 --> 00:30:10 sequence as the RNA. 344 00:30:10 --> 00:30:14 Now, I'll just tell you one brief little anecdote. 345 00:30:14 --> 00:30:18 I heard Marshall Nirenberg at this meeting they had to celebrate the 346 00:30:18 --> 00:30:22 50th anniversary of the discovery of DNA. And he posed something that 347 00:30:22 --> 00:30:26 I'd never thought about in my years of teaching this but might occur to 348 00:30:26 --> 00:30:30 you guys if we put it on a problem set. 349 00:30:30 --> 00:30:34 You all know something that benzene is nothing but sort of these, 350 00:30:34 --> 00:30:38 this as I call it, we even referred to it as a benzene ring, 351 00:30:38 --> 00:30:42 which is a very organic kind of solvent. So if we put a problem set, 352 00:30:42 --> 00:30:46 if you've made polyphenylalanine would you expect this to be soluble 353 00:30:46 --> 00:30:50 in water? Well, this is very, very hydrophobic, 354 00:30:50 --> 00:30:54 very, very water-hating. And your answer would be correct 355 00:30:54 --> 00:30:58 if you said no, I wouldn't expect polyphenylalanine to be 356 00:30:58 --> 00:31:02 soluble in water. In fact, if it were in a protein 357 00:31:02 --> 00:31:06 you'd expect it to probably be in the core where all the hydrophobic 358 00:31:06 --> 00:31:10 interactions, the water-hating parts would go. 
So Marshall Nirenberg 359 00:31:10 --> 00:31:14 said in his talk, well, he had shown that he had 360 00:31:14 --> 00:31:19 radioactive phenylalanine, and he still had to prove chemically 361 00:31:19 --> 00:31:23 that he had polyphenylalanine. But he wasn't much of a biochemist 362 00:31:23 --> 00:31:27 so he walked down to the lab just below NIH and walked in the door and 363 00:31:27 --> 00:31:31 saw the first person he saw and said how do you solubilize 364 00:31:31 --> 00:31:35 polyphenylalanine? Just to make sure I got this right. 365 00:31:35 --> 00:31:39 And the guy said, oh, you just take 33% hydrobromic acid and glacial 366 00:31:39 --> 00:31:42 acetic acid and it works. So he went back upstairs and 367 00:31:42 --> 00:31:45 dissolved it. It turned out it dissolved in that. 368 00:31:45 --> 00:31:49 And he went on and characterized it. And he said it didn't occur to him 369 00:31:49 --> 00:31:52 or he didn't learn until about 15 or 20 years later that he just walked 370 00:31:52 --> 00:31:56 up to the only person in the world who knew how to solubilize 371 00:31:56 --> 00:32:00 polyphenylalanine. By total coincidence this guy he 372 00:32:00 --> 00:32:04 had talked to had been working away trying to figure out a way and had 373 00:32:04 --> 00:32:08 come up with this odd mix of hydrobromic acid and glacial acetic acid. 374 00:32:08 --> 00:32:12 And he just said of all the places in the world, he walked up to the 375 00:32:12 --> 00:32:17 one person who knew and got the answer. So the other part of the 376 00:32:17 --> 00:32:21 story then involves Gobind Khorana who I mentioned when I was telling 377 00:32:21 --> 00:32:25 you initially about the Nobel Laureates at MIT. 378 00:32:25 --> 00:32:30 And Gobind is a brilliant organic chemist. He synthesized DNA. 379 00:32:30 --> 00:32:35 You know, it was a point where a whole issue of a journal came out 380 00:32:35 --> 00:32:40 and there was nothing but his lab's work on synthesizing DNA. 
381 00:32:40 --> 00:32:45 Well, he was good at nucleic acids. And one of the strategies that they 382 00:32:45 --> 00:32:50 could use chemically was they would make something like a dinucleotide 383 00:32:50 --> 00:32:55 like CA. And then they were able to polymerize that to make a piece of 384 00:32:55 --> 00:33:00 RNA. So they could make an RNA that had the sequence CA, CA, 385 00:33:00 --> 00:33:05 CA, CA and so on. And what you can see from that is 386 00:33:05 --> 00:33:11 that there are two different codons in that. One is CAC and the other 387 00:33:11 --> 00:33:17 is ACA. And the reason he made it this way was he was synthesizing it by 388 00:33:17 --> 00:33:22 polymerizing dinucleotides. So in these same kinds of 389 00:33:22 --> 00:33:28 experiments I was describing before, what they found this synthesized was 390 00:33:28 --> 00:33:34 alternating histidine and threonine. 391 00:33:34 --> 00:33:39 And you cannot tell from that experiment alone which is which. 392 00:33:39 --> 00:33:44 One of those codons must be histidine and one of them must be threonine, 393 00:33:44 --> 00:33:49 but you cannot tell from that experiment so more experiments were 394 00:33:49 --> 00:33:54 needed. And what was learned from those experiments in that case was 395 00:33:54 --> 00:34:00 that CAC corresponded to histidine and ACA corresponded to threonine. 396 00:34:00 --> 00:34:04 So these kinds of experiments were then put together to give what's 397 00:34:04 --> 00:34:09 known as the genetic code which is the three-letter words encoded in 398 00:34:09 --> 00:34:13 DNA that encode the sequence of amino acids in proteins. 399 00:34:13 --> 00:34:18 And it's usually displayed as a table and you read it in this way. 400 00:34:18 --> 00:34:22 This thing over here is the first base in the codon, 401 00:34:22 --> 00:34:27 across the top is the second base in the codon, and down over here is the 402 00:34:27 --> 00:34:31 third base.
So if we go to C as the first, say the one for histidine we 403 00:34:31 --> 00:34:36 were just showing you. C is the first letter. 404 00:34:36 --> 00:34:40 A is the second letter, so this is the box that we're going 405 00:34:40 --> 00:34:45 to be looking at. And if C is the third letter we can 406 00:34:45 --> 00:34:50 see it encodes histidine. Or for ACA, come back to A as the first letter, then C, then A, and that is certainly 407 00:34:50 --> 00:34:54 threonine. But you can also see something else here. 408 00:34:54 --> 00:34:59 And that is because there are 64 possibilities with this three-letter 409 00:34:59 --> 00:35:04 word the code is what's known as degenerate. 410 00:35:04 --> 00:35:09 That is there are more words in the genetic code than are needed to 411 00:35:09 --> 00:35:14 specify the number of amino acids that have to be coded. 412 00:35:14 --> 00:35:20 So I just want to make a couple of points about this. So 413 00:35:20 --> 00:35:30 the genetic code -- 414 00:35:30 --> 00:35:36 It's degenerate. There are 61 codons that correspond 415 00:35:36 --> 00:35:42 to an amino acid. And that means that for some amino acids, 416 00:35:42 --> 00:35:48 and I think threonine is a good example, there's more than one word 417 00:35:48 --> 00:35:54 in the genetic code that means threonine. There were three codons 418 00:35:54 --> 00:36:00 for which there was no corresponding amino acid. And those mean stop. 419 00:36:00 --> 00:36:05 And that would make sense because if you're reading down a 420 00:36:05 --> 00:36:11 piece of RNA, at some point you'd have to end the protein. 421 00:36:11 --> 00:36:16 And so there are actually three that are used for that purpose. 422 00:36:16 --> 00:36:22 And although there's some small variation on this in nature there's 423 00:36:22 --> 00:36:28 usually one amino acid that's used for starting a protein, 424 00:36:28 --> 00:36:33 and that's methionine. And it's AUG right there.
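The counting behind the code can be checked with a short Python sketch. It is only an illustration, not part of the lecture: the helper name `codons_in_frame` is made up here, and the stop codons (UAA, UAG, UGA) and the threonine codons are taken from the standard genetic code table rather than from anything said above. It enumerates the 64 three-letter words, removes the three stops to get the 61 that mean an amino acid, and confirms that a repeating CA polymer only ever contains the two codons CAC and ACA, whichever reading frame is used.

```python
from itertools import product

BASES = "UCAG"
# Assumption: the three stop codons of the standard genetic code.
STOP_CODONS = {"UAA", "UAG", "UGA"}
# Assumption: threonine's four codons in the standard table (degeneracy).
THREONINE_CODONS = ["ACU", "ACC", "ACA", "ACG"]

# Every possible three-letter word over a four-letter alphabet: 4**3 = 64.
all_codons = ["".join(p) for p in product(BASES, repeat=3)]
sense_codons = [c for c in all_codons if c not in STOP_CODONS]


def codons_in_frame(rna, frame):
    """Split rna into complete triplets, starting at offset `frame`."""
    return [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]


# A poly(CA) message like the one made from the CA dinucleotide.
poly_ca = "CA" * 6
frames = {f: codons_in_frame(poly_ca, f) for f in range(3)}
distinct_codons = sorted({c for cs in frames.values() for c in cs})

if __name__ == "__main__":
    print(len(all_codons), "codons in total")
    print(len(sense_codons), "codons that mean an amino acid")
    print("poly(CA) codons in any frame:", distinct_codons)
```

Because the triplets alternate CAC, ACA, CAC, ACA in every frame, the polymer alone cannot say which codon is histidine and which is threonine, which is why more experiments were needed.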
425 00:36:33 --> 00:36:37 Now, some of this stuff probably sounds like it's been around forever, 426 00:36:37 --> 00:36:42 and that's certainly true of some of the stuff you hear in your chemistry, 427 00:36:42 --> 00:36:46 math and physics courses. I just want to drive this home. 428 00:36:46 --> 00:36:51 When I was an undergrad Watson's first book, called Molecular 429 00:36:51 --> 00:36:55 Biology of the Gene, had come out, so when I was your age, and I 430 00:36:55 --> 00:37:00 realize that I look ancient but, you know, at least I'm still here. 431 00:37:00 --> 00:37:03 When I was an undergrad I had Watson's book. 432 00:37:03 --> 00:37:07 This was the genetic code that was in the book, the genetic code as of 433 00:37:07 --> 00:37:11 May 1965. And you'll notice there are gaps in here. 434 00:37:11 --> 00:37:15 And all the things that are underlined were things for which 435 00:37:15 --> 00:37:19 there was a tentative assignment. So although you may take this and 436 00:37:19 --> 00:37:23 think that it's knowledge that's been around forever, 437 00:37:23 --> 00:37:27 it wasn't even complete in the textbook when I was 438 00:37:27 --> 00:37:32 an undergrad. OK. So one of the things then that's 439 00:37:32 --> 00:37:39 important to think about with the nucleic acid stuff: this is the basis of how 440 00:37:39 --> 00:37:46 proteins are encoded in the DNA. But everything else has to be there, 441 00:37:46 --> 00:37:53 too. And the genetic code, that's what we've been talking about, 442 00:37:53 --> 00:38:01 is universal. But there are other languages -- 443 00:38:01 --> 00:38:09 -- written in the DNA that are not 444 00:38:09 --> 00:38:15 universal. And one of them was that little example I gave you with an 445 00:38:15 --> 00:38:21 origin of replication. E. coli only starts DNA replication 446 00:38:21 --> 00:38:27 at one very particular point in its chromosome, so it is a particular 447 00:38:27 --> 00:38:33 sequence of DNA.
It's actually about 250 nucleotides 448 00:38:33 --> 00:38:39 long. So you could think of that as a language. It's like a start 449 00:38:39 --> 00:38:45 chromosome replication language. It's only got one word in it, and 450 00:38:45 --> 00:38:51 the word is 250 nucleotides long. Another place that's very important 451 00:38:51 --> 00:38:57 is if you're going to make an RNA copy, if you're going to do 452 00:38:57 --> 00:39:03 transcription of a piece of DNA -- And I'll call this the coding 453 00:39:03 --> 00:39:09 sequence. This would be the sequence of three-letter words that 454 00:39:09 --> 00:39:15 would specify the amino acids of the protein. If you were going to make 455 00:39:15 --> 00:39:21 an RNA copy of that, you would have to somewhere have 456 00:39:21 --> 00:39:27 a sequence up here that means start 457 00:39:27 --> 00:39:33 transcription. And one at the end, 458 00:39:33 --> 00:39:39 some other sequence of letters in the nucleic acid that would mean 459 00:39:39 --> 00:39:45 stop transcription. The start one is given the technical term 460 00:39:45 --> 00:39:51 promoter. The stop one is referred to as a 461 00:39:51 --> 00:39:58 terminator. And we'll say more about these. 462 00:39:58 --> 00:40:02 Because one of the beauties of having this system of making an RNA copy is it 463 00:40:02 --> 00:40:07 provides a beautiful point of regulation. Because the cell can 464 00:40:07 --> 00:40:12 determine whether or not it's going to make a particular protein by 465 00:40:12 --> 00:40:17 whether or not it chooses to make the RNA. 466 00:40:17 --> 00:40:22 And so having this RNA intermediate and being able to control 467 00:40:22 --> 00:40:27 transcription is a really important part of the whole regulation that 468 00:40:27 --> 00:40:33 makes life possible. The transcription is carried out by 469 00:40:33 --> 00:40:40 an enzyme that's known as RNA polymerase.
And let me make one 470 00:40:40 --> 00:40:47 more point. These promoters and terminators are not universal. 471 00:40:47 --> 00:40:54 So when we talk about recombinant DNA a little bit in the course, 472 00:40:54 --> 00:41:01 if I take a mouse gene and I put it in E. coli, 473 00:41:01 --> 00:41:05 even though the genetic code is the same, and we might have all the same 474 00:41:05 --> 00:41:09 sequence of amino acids specified, you won't get the RNA made because 475 00:41:09 --> 00:41:13 the sequences that say start transcription and stop transcription 476 00:41:13 --> 00:41:17 are different between a mouse and a bacterium even though the genetic 477 00:41:17 --> 00:41:21 code is the same. So you can kind of see from first 478 00:41:21 --> 00:41:25 principles. If you're doing recombinant DNA and you wanted to 479 00:41:25 --> 00:41:29 express the mouse protein in E. coli, you would have to fiddle 480 00:41:29 --> 00:41:33 around with the sequences up here and the sequences down there, 481 00:41:33 --> 00:41:39 the parts that are not universal. You guys with me? 482 00:41:39 --> 00:41:47 OK. So what does an RNA polymerase do? It recognizes this sequence, 483 00:41:47 --> 00:41:55 and then it teases the strands apart to make a little bubble like this. 484 00:41:55 --> 00:42:03 So let's say ATAGCAT. So the other strand then would be TATCGTA. 485 00:42:03 --> 00:42:08 And then RNA polymerase, unlike a DNA polymerase, can begin a 486 00:42:08 --> 00:42:13 chain de novo. Remember an important thing about 487 00:42:13 --> 00:42:18 DNA polymerases was they had to have a primer terminus to get started. 488 00:42:18 --> 00:42:23 That was why they had to use the Okazaki fragments. 489 00:42:23 --> 00:42:29 So this is DNA. This would be 5 prime, 3 prime, 490 00:42:29 --> 00:42:34 3 prime and 5 prime. And what an RNA polymerase can do, 491 00:42:34 --> 00:42:39 it uses dATP, dGTP, dCTP and dUTP. It uses triphosphates, 492 00:42:39 --> 00:42:43 excuse me. Get rid of these. Excuse me.
My mistake. No deoxies 493 00:42:43 --> 00:42:47 here. Of course this is RNA. It uses ATP, GTP, CTP and UTP as 494 00:42:47 --> 00:42:51 the substrates. So it uses triphosphates just the 495 00:42:51 --> 00:42:55 same way DNA polymerases do. And then it's able to start a chain 496 00:42:55 --> 00:43:01 de novo. And it synthesizes the RNA in a 5 497 00:43:01 --> 00:43:08 prime to 3 prime direction, the same direction that a strand of 498 00:43:08 --> 00:43:15 DNA is made by DNA polymerase. So it would copy here. And so it 499 00:43:15 --> 00:43:21 would put in an A opposite a T. And then because it's RNA it will 500 00:43:21 --> 00:43:28 put in a U opposite an A, and then an AGCAU and so on. 501 00:43:28 --> 00:43:35 So this right here is the beginning of the RNA that's being synthesized 502 00:43:35 --> 00:43:41 by the RNA polymerase. This strand is known as the 503 00:43:41 --> 00:43:46 transcribed strand. And by default then that one is the 504 00:43:46 --> 00:43:51 non-transcribed strand. And what you can see by doing this, 505 00:43:51 --> 00:43:56 it's making an RNA with the same sequence as up here, 506 00:43:56 --> 00:44:01 except that everywhere there's a T in the DNA there's now a U. 507 00:44:01 --> 00:44:07 So the final thing then is how this information gets all put together to 508 00:44:07 --> 00:44:14 make proteins. And protein synthesis is done by an 509 00:44:14 --> 00:44:21 amazing machine known as the ribosome. It's made up of some 510 00:44:21 --> 00:44:30 special large RNAs -- 511 00:44:30 --> 00:44:36 -- called rRNAs, and some proteins as well. 512 00:44:36 --> 00:44:43 These make up the ribosome. And then it needs an mRNA and then 513 00:44:43 --> 00:44:49 it needs the various tRNAs, each of which carries an amino acid 514 00:44:49 --> 00:44:56 that's appropriate to its anticodon.
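Looking back at the transcription step for a moment, the base-by-base copying can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not anything shown in the lecture: the function names are invented here, and the transcribed strand TATCGTA is assumed to be written 3 prime to 5 prime, so reading it left to right gives the RNA 5 prime to 3 prime.

```python
# Pairing rules used when RNA is copied from a DNA template:
# A in the DNA pairs with U in the RNA, T with A, G with C, C with G.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Ordinary DNA-DNA pairing, used to rebuild the non-transcribed strand.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5):
    """Return the RNA (5'->3') copied from a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


def partner_strand(template_3to5):
    """Return the non-transcribed DNA strand (5'->3') for the same template."""
    return "".join(DNA_TO_DNA[base] for base in template_3to5)


template = "TATCGTA"            # transcribed strand from the board example
rna = transcribe(template)      # the RNA made by the polymerase
non_transcribed = partner_strand(template)

if __name__ == "__main__":
    print("RNA:            ", rna)
    print("non-transcribed:", non_transcribed)
    # The RNA matches the non-transcribed strand with U in place of T.
    print(rna == non_transcribed.replace("T", "U"))
```

The last comparison is the point made in the lecture: the RNA reads the same as the non-transcribed strand, except that every T has become a U.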
And in a very brief sort of way 515 00:44:56 --> 00:45:02 this is -- And you can see this in your 516 00:45:02 --> 00:45:08 textbook, what the ribosome does is it takes, let's consider this is the 517 00:45:08 --> 00:45:14 mRNA. I'm just going to take three codons here. And this mRNA threads 518 00:45:14 --> 00:45:19 into the ribosome. And I'll sort of show it's able to 519 00:45:19 --> 00:45:25 recognize the first codon and the second codon. Remember, 520 00:45:25 --> 00:45:31 of course, there's no spacing like this in the RNA. 521 00:45:31 --> 00:45:36 And then in the context of this large factory it's able to find the 522 00:45:36 --> 00:45:41 tRNA that has amino acid one and the anticodon that would correspond to 523 00:45:41 --> 00:45:46 this, and the tRNA that has the next amino acid attached and its 524 00:45:46 --> 00:45:51 anticodon. So you can see what's happened. It's been able to take 525 00:45:51 --> 00:45:57 the first amino acid encoded by that codon and put it physically right 526 00:45:57 --> 00:46:02 next to the next amino acid that's coded here. And then 527 00:46:02 --> 00:46:16 it catalyzes -- 528 00:46:16 --> 00:46:20 -- the formation of a peptide bond. And what happens when it does that is, 529 00:46:20 --> 00:46:25 the way this amino acid is joined to the tRNA, there's energy 530 00:46:25 --> 00:46:31 stored in that bond. And so thermodynamically that allows 531 00:46:31 --> 00:46:37 this bond formation to go. And now you end up essentially with 532 00:46:37 --> 00:46:43 this. And what happens now is everything clicks over one. 533 00:46:43 --> 00:46:49 So you could think of it as this whole RNA shifts over one codon so the one 534 00:46:49 --> 00:46:55 that used to be here is now sticking outside. Here's part 535 00:46:55 --> 00:47:01 of the ribosome. Here's the next codon. 536 00:47:01 --> 00:47:05 What we have here is the tRNA that's got amino acid two joined to 537 00:47:05 --> 00:47:10 amino acid one.
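The click-over-one-codon stepping can be mimicked with a toy Python loop. This is a deliberately simplified sketch and makes several assumptions: the function `translate` and the example message are invented for illustration, the table holds only a handful of codons (those named in this lecture plus the three standard stop codons), and reading simply starts at the first AUG, with none of the real machinery of tRNAs or ribosome subunits.

```python
# A tiny slice of the genetic code: only codons mentioned in the lecture,
# plus the three stop codons of the standard table (an assumption here).
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "CAC": "His",
    "ACA": "Thr",
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}


def translate(mrna):
    """Read codons three bases at a time from the first AUG until a stop.

    Raises KeyError for any codon missing from the toy table above.
    """
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, nothing to make in this toy model
    chain = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "STOP":
            break
        # Joining the new amino acid to the growing chain stands in for
        # peptide bond formation; the loop step is the shift by one codon.
        chain.append(residue)
    return chain


if __name__ == "__main__":
    message = "AUGUUUCACACAUAA"  # hypothetical mRNA, not from the lecture
    print("-".join(translate(message)))
```

Each pass through the loop corresponds to one cycle on the ribosome: read the next codon, add the matching amino acid, shift along by three bases.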
The next codon specifies the next 538 00:47:10 --> 00:47:15 amino acid which is three. And the process is then able to go 539 00:47:15 --> 00:47:19 on like that. Now, the structure of the ribosome, 540 00:47:19 --> 00:47:24 the crystal structure of the ribosome was just finished. 541 00:47:24 --> 00:47:29 And I guess we've got as many lights out as we can do right now. 542 00:47:29 --> 00:47:33 It's absolutely remarkable. It's mostly RNA. The gray stuff 543 00:47:33 --> 00:47:37 and the blue stuff are two huge RNAs that are all folded up in 544 00:47:37 --> 00:47:41 3-dimensional space. And these things that are sort of 545 00:47:41 --> 00:47:45 stuck on the outside, these purple things here or the dark 546 00:47:45 --> 00:47:49 blue things here that sort of look like cherries stuck on the outside 547 00:47:49 --> 00:47:53 of a cake, those are proteins. So most of this is RNA, big balls 548 00:47:53 --> 00:47:58 of RNA with proteins kind of decorating the outside. 549 00:47:58 --> 00:48:02 The mRNA is a green thing that snakes through. 550 00:48:02 --> 00:48:06 There's the mRNA. See it snaking through? 551 00:48:06 --> 00:48:10 And maybe you can recognize in the middle these tRNAs. 552 00:48:10 --> 00:48:14 There's an orange one and a yellow one. Those correspond to the two 553 00:48:14 --> 00:48:18 tRNAs I depicted here. And I'm just going to see if I can 554 00:48:18 --> 00:48:22 stop this. There's a viewpoint I'd like you to see when it comes around 555 00:48:22 --> 00:48:26 again here in just a second. I'll see if I can catch it there. 556 00:48:26 --> 00:48:30 Right there. Here's one of the tRNAs in yellow. 557 00:48:30 --> 00:48:34 And its end is right there. And there's the other tRNA. 558 00:48:34 --> 00:48:38 And its end is right there. So this corresponds to the point at 559 00:48:38 --> 00:48:42 which there's going to be a peptide bond formed. And something is going 560 00:48:42 --> 00:48:47 to catalyze the formation of that bond.
Well, the next picture sort 561 00:48:47 --> 00:48:51 of shows what happens if you pull that apart. And what you'll see is 562 00:48:51 --> 00:48:55 that here's the end of one tRNA, there's the other end, 563 00:48:55 --> 00:49:00 and there's nothing near it except for RNA. 564 00:49:00 --> 00:49:05 So RNA is actually catalyzing the formation of the peptide bond. 565 00:49:05 --> 00:49:11 Another way to say that would be that the ribosome, 566 00:49:11 --> 00:49:16 which is the protein synthesizing factory, is a ribozyme. 567 00:49:16 --> 00:49:22 Remember I said most of the chemical reactions that need 568 00:49:22 --> 00:49:27 catalysts are carried out by proteins but there are a few that 569 00:49:27 --> 00:49:33 are carried out by RNA where RNA is the catalyst? 570 00:49:33 --> 00:49:38 And remarkably the formation of the peptide bond, which is at the heart of 571 00:49:38 --> 00:49:43 proteins which are so important for all life, is catalyzed by RNA. 572 00:49:43 --> 00:49:48 If you look at what makes proteins, what do you see? You see huge balls 573 00:49:48 --> 00:49:53 of RNA, an mRNA threading through, two tRNAs, and the enzyme activity or 574 00:49:53 --> 00:49:59 the catalytic activity is provided by the RNA as well. 575 00:49:59 --> 00:50:03 As I said, people think possibly there was an RNA world that preceded 576 00:50:03 --> 00:50:08 our present-day world with DNA, RNA and protein. And who knows? 577 00:50:08 --> 00:50:12 But this sort of look at a ribosome could at least make you see that 578 00:50:12 --> 00:50:17 that's a plausible explanation, that RNA might have been running the show 579 00:50:17 --> 00:50:21 for a while before anything else got involved. Anyway, we'll 580 00:50:21 --> 00:50:24 see you on Friday then.