1 00:00:00 --> 00:00:05 There were some other questions sort of running along this general idea 2 00:00:05 --> 00:00:10 of the fact that the information in DNA doesn't go, 3 00:00:10 --> 00:00:15 even though it encodes the information for proteins goes via 4 00:00:15 --> 00:00:21 this mRNA intermediate. Someone asked what was the M. 5 00:00:21 --> 00:00:25 The M is for messenger. The idea being that since the DNA, 6 00:00:25 --> 00:00:29 at least in eukaryotes the DNA was in the nucleus and proteins were 7 00:00:29 --> 00:00:33 made out in the cytoplasm, somehow that information had to be 8 00:00:33 --> 00:00:36 carried from the nucleus where the DNA was out to the cytoplasm. 9 00:00:36 --> 00:00:40 And that's where the term messenger was because the RNA was seen as 10 00:00:40 --> 00:00:43 something that would carry the information out. 11 00:00:43 --> 00:00:47 Now, a point here, it's really critical because we're 12 00:00:47 --> 00:00:51 going to continue to talk about gene regulation. 13 00:00:51 --> 00:00:56 And that is when a cell is making one of these mRNAs, 14 00:00:56 --> 00:01:01 it doesn't make one single copy of all of the genes that are in the 15 00:01:01 --> 00:01:06 genome on one RNA. Instead it does it either one gene 16 00:01:06 --> 00:01:11 at a time, which is the usual case, or occasionally as we see in the lac 17 00:01:11 --> 00:01:16 operon a little tiny cluster of genes that have related functions. 18 00:01:16 --> 00:01:22 And the beauty of that is that it then enables the cell to dial in how 19 00:01:22 --> 00:01:27 much protein is being made, in part at least, by determining how 20 00:01:27 --> 00:01:31 much RNA is being made. So you're going to make no RNA and 21 00:01:31 --> 00:01:35 not make the protein at all or make a little RNA and get a little 22 00:01:35 --> 00:01:39 protein, or if it's really a thing you need very large quantities of 23 00:01:39 --> 00:01:43 you can really crank out a lot of RNA and make a lot of protein. 
24 00:01:43 --> 00:01:46 So the potential is there for regulation. And in bacteria, 25 00:01:46 --> 00:01:50 as I said, for almost all bacterial genes it's pretty straightforward. 26 00:01:50 --> 00:01:54 You can look in the DNA, and at least if you know where to start, 27 00:01:54 --> 00:01:58 see the start of a protein, you can just use that table of the genetic 28 00:01:58 --> 00:02:02 code and read off the sequence. But eukaryotes in particular, 29 00:02:02 --> 00:02:08 higher eukaryotes have this odd business, what seemed odd and 30 00:02:08 --> 00:02:13 surprising, that when you look at their genes, many of them you cannot 31 00:02:13 --> 00:02:19 do that because it's as if there are extra bits of DNA stuck in the 32 00:02:19 --> 00:02:24 middle. And in some cases you heard it could be really huge amounts of 33 00:02:24 --> 00:02:30 DNA so that there is this extra thing where there's a pre-mRNA. 34 00:02:30 --> 00:02:35 And this RNA splicing we talked about has to take place to generate 35 00:02:35 --> 00:02:41 the mRNA. And once you have the mRNA then the ribosome and the 36 00:02:41 --> 00:02:46 charged tRNAs can be used to make the proteins. And someone asked what 37 00:02:46 --> 00:02:52 all that extra DNA is for. I mean we still don't fully 38 00:02:52 --> 00:02:57 understand that. There are some regulatory sequences 39 00:02:57 --> 00:03:03 and regulatory actors that are buried in that non-coding DNA. 40 00:03:03 --> 00:03:07 But another thing may be that this is just the way it's worked in 41 00:03:07 --> 00:03:11 evolution. And as long as it works there's no driving force necessarily 42 00:03:11 --> 00:03:15 to get rid of it. If you look at microorganisms, 43 00:03:15 --> 00:03:20 for example, yeast is a eukaryote that, like E. coli, 44 00:03:20 --> 00:03:24 has to replicate pretty fast in order to compete with other 45 00:03:24 --> 00:03:29 microorganisms for the food and whatnot in its environment. 
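Editor's sketch, not part of the lecture: the contrast between reading a bacterial gene straight off the genetic code table and having to splice a eukaryotic pre-mRNA first can be illustrated in a few lines of Python. The two gene sequences, the intron, and the exon coordinates below are invented for illustration, and the code uses the DNA alphabet (T rather than U) throughout for simplicity. The codon table itself is the standard genetic code.

```python
# Illustrative sketch only: sequences and exon coordinates are made up.
BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_dna):
    """Read codons from the first base until a stop codon or the end."""
    protein = []
    for pos in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def splice(pre_mrna, exons):
    """Join exon segments given as (start, end) half-open index pairs."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical bacterial gene: the coding sequence is uninterrupted.
bacterial_gene = "ATGGCCAAATAA"

# Hypothetical eukaryotic gene: same coding sequence with an intron in the middle.
eukaryotic_gene = "ATGGCC" + "GTAAGTTTTCAG" + "AAATAA"
exons = [(0, 6), (18, 24)]

print(translate(bacterial_gene))                  # read straight off the table
print(translate(eukaryotic_gene))                 # unspliced: wrong protein
print(translate(splice(eukaryotic_gene, exons)))  # spliced: same as bacterial
```

Reading the unspliced sequence gives a different protein, which is the sense in which you "cannot do that" for many eukaryotic genes until the intervening sequence has been removed.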
46 00:03:29 --> 00:03:33 And it has relatively little of these extra intervening sequences 47 00:03:33 --> 00:03:38 compared to what we find in our DNA. I gave you the example of Factor 8. 48 00:03:38 --> 00:03:42 It didn't really particularly matter what it was in the sense that 49 00:03:42 --> 00:03:47 it was just an example of something that has a lot of intervening 50 00:03:47 --> 00:03:52 sequence. What it is, though, it's one of a set of 51 00:03:52 --> 00:03:57 proteins that are involved in clotting of your blood. 52 00:03:57 --> 00:04:02 When you cut yourself, we have this system that prevents us 53 00:04:02 --> 00:04:07 from bleeding to death, unless you have hemophilia or 54 00:04:07 --> 00:04:12 something like that where there's a problem with the clotting system, 55 00:04:12 --> 00:04:17 then a very complex set of things happen. And Factor 8 is one of the 56 00:04:17 --> 00:04:22 several proteins that play critical roles in that. 57 00:04:22 --> 00:04:27 Somebody asked, I talked about a few things that 58 00:04:27 --> 00:04:32 were, this was sort of the dogma. This was how information was thought 59 00:04:32 --> 00:04:37 to go. And then how Dave Baltimore found there was reverse 60 00:04:37 --> 00:04:42 transcriptase that could take an RNA and make a DNA copy. 61 00:04:42 --> 00:04:47 And the question was they didn't understand how the dogma could 62 00:04:47 --> 00:04:52 change. I mean that's sort of what I'm trying to emphasize a lot in 63 00:04:52 --> 00:04:57 this course is what I'm teaching you is what human experimentation and 64 00:04:57 --> 00:05:02 thought has brought up until spring of 2005 in terms of biology. 
65 00:05:02 --> 00:05:05 Some of the sort of basic discoveries were made in physics so 66 00:05:05 --> 00:05:09 long ago that it's very unlikely that you'll come in and discover 67 00:05:09 --> 00:05:13 that the Newtonian mechanics you learned as a freshman is not 68 00:05:13 --> 00:05:17 operative anymore when you're a senior, but you can still have these 69 00:05:17 --> 00:05:21 massive revolutions in biology where just suddenly whole things, 70 00:05:21 --> 00:05:25 like RNA splicing, emerge from the woodwork almost overnight. 71 00:05:25 --> 00:05:29 And that's, in fact, what happened with that. That's what happened 72 00:05:29 --> 00:05:32 with reverse transcriptase. So, in a sense, 73 00:05:32 --> 00:05:36 it's almost a joke that Crick called it dogma. He didn't know what he 74 00:05:36 --> 00:05:39 was doing. But it sort of took that property on. And I'm trying to 75 00:05:39 --> 00:05:43 caution you that even though some of you would like me to stick to just 76 00:05:43 --> 00:05:46 facts that we're continually learning and there are discoveries 77 00:05:46 --> 00:05:50 being made even as we're going on with this course. 78 00:05:50 --> 00:05:53 Now, I told you that there were certain viruses, 79 00:05:53 --> 00:05:57 HIV being the one I really emphasized, that have 80 00:05:57 --> 00:06:01 this property. Their genetic material is not DNA. 81 00:06:01 --> 00:06:05 It's RNA. And the reverse transcriptase makes a double-strand 82 00:06:05 --> 00:06:09 copy that it inserts into the organism's DNA and becomes a 83 00:06:09 --> 00:06:13 permanent part. And I'm trying to caution you 84 00:06:13 --> 00:06:18 that's why safe sex is such a big deal. Because if you get infected 85 00:06:18 --> 00:06:22 with HIV you'll have it for the rest of your life. I mentioned there 86 00:06:22 --> 00:06:26 were some cancer viruses, and someone said, well, if you do 87 00:06:26 --> 00:06:31 that how can you cure cancer? 
Well, in fact, 88 00:06:31 --> 00:06:35 we're lucky in the sense that we don't have to contend at this point 89 00:06:35 --> 00:06:40 in a major way with human cancer viruses, that would be more of a 90 00:06:40 --> 00:06:44 problem, but your cats have to. You may have heard of the feline 91 00:06:44 --> 00:06:48 leukemia virus. This is a retrovirus of this same 92 00:06:48 --> 00:06:53 class, and cats infected with it have it in their saliva. 93 00:06:53 --> 00:06:57 And although it can be transmitted amongst pets in the same household, 94 00:06:57 --> 00:07:02 the big problem is usually cat fights. 95 00:07:02 --> 00:07:05 And then you get a scratch and then it gets in. So if you have a cat 96 00:07:05 --> 00:07:09 you probably had to take it to the vet and get vaccinated. 97 00:07:09 --> 00:07:13 And one of the reasons is you're trying to get it vaccinated against 98 00:07:13 --> 00:07:17 the feline leukemia virus. And it's just sort of like that 99 00:07:17 --> 00:07:21 story I told you with the streptococcus, 100 00:07:21 --> 00:07:25 that if your immune system has seen the thing beforehand then if you 101 00:07:25 --> 00:07:29 actually get an infection it has a very quick response and your cat 102 00:07:29 --> 00:07:33 doesn't get infected by the virus. And therefore doesn't become a 103 00:07:33 --> 00:07:37 candidate for getting leukemia in later life. OK. 104 00:07:37 --> 00:07:41 So at least there's an effort to try and respond to at least a few 105 00:07:41 --> 00:07:45 of the things. There were some very interesting 106 00:07:45 --> 00:07:50 and thoughtful questions. So I want to now go back to talk a 107 00:07:50 --> 00:07:54 little bit more about this issue of regulation because that is one of 108 00:07:54 --> 00:07:58 the real secrets to life. And it's the ability of organisms 109 00:07:58 --> 00:08:02 to turn on and off certain functions and to have rheostats where they can 110 00:08:02 --> 00:08:07 control the levels of expression. 
But all of this has to be in the DNA. 111 00:08:07 --> 00:08:11 And so before people knew how this worked, it was a little mysterious. 112 00:08:11 --> 00:08:16 And maybe one way you might think about it, well, 113 00:08:16 --> 00:08:20 if I have a gene that encodes something for lactose metabolism, 114 00:08:20 --> 00:08:25 and I tell you there's a sequence upstream of this and somehow this 115 00:08:25 --> 00:08:29 gene is regulated depending on whether I put lactose in the medium 116 00:08:29 --> 00:08:34 or not, how is it going to work? Does it have to fit into little 117 00:08:34 --> 00:08:38 holes in between the letters of the genetic code sort of the way Gamow 118 00:08:38 --> 00:08:43 originally thought about it, or is there some other mechanism? 119 00:08:43 --> 00:08:47 And what you'll see here is one of the general strategies that 120 00:08:47 --> 00:08:52 evolution has chosen is although there's regulatory information in 121 00:08:52 --> 00:08:56 the DNA, DNA isn't a particularly good molecule for recognizing things, 122 00:08:56 --> 00:09:01 but what it is good at is encoding proteins. 123 00:09:01 --> 00:09:06 So the trick is to have some proteins made whose role in life is 124 00:09:06 --> 00:09:11 to be regulators. So that this is what is underlying 125 00:09:11 --> 00:09:17 this system that I started to tell you about on Friday. 126 00:09:17 --> 00:09:28 So remember -- 127 00:09:28 --> 00:09:33 -- beta-galactosidase is the enzyme that takes lactose which is galactose 128 00:09:33 --> 00:09:39 beta-1,4 glucose. And somebody said it's easy to get 129 00:09:39 --> 00:09:44 them mixed up. I apologize, but those are the 130 00:09:44 --> 00:09:50 names. And cleaves it, just breaks the bond to give 131 00:09:50 --> 00:09:56 galactose plus glucose. Both of those can be metabolized by 132 00:09:56 --> 00:10:02 ordinary elements you'll find in most cells. 
133 00:10:02 --> 00:10:06 But taking lactose, which you also know as milk sugar 134 00:10:06 --> 00:10:10 because it's in milk, needs this extra function. 135 00:10:10 --> 00:10:14 If you're lactose intolerant, as a fraction of you will be because 136 00:10:14 --> 00:10:18 that's quite common in the human population, then, 137 00:10:18 --> 00:10:22 although you had the enzyme when you were a baby, it's been shut off in 138 00:10:22 --> 00:10:26 your body since then and it causes problems. Because if you eat 139 00:10:26 --> 00:10:30 lactose, drink milk or something, the lactose goes right through your 140 00:10:30 --> 00:10:34 stomach and ends up in your intestine. 141 00:10:34 --> 00:10:38 And there are bacteria in there that are able to break it open. 142 00:10:38 --> 00:10:43 And when they do that it leads to gas and some other sort of 143 00:10:43 --> 00:10:48 discomforts that are associated with lactose intolerance. 144 00:10:48 --> 00:10:52 So, as I'd said, the major finding was that this enzyme 145 00:10:52 --> 00:10:57 beta-galactosidase or beta-gal was regulated. That if you grow E. 146 00:10:57 --> 00:11:05 coli on glucose -- 147 00:11:05 --> 00:11:10 -- there was no beta-gal. And if you grow them on lactose as 148 00:11:10 --> 00:11:15 the carbon source then there were high levels of beta-gal. 149 00:11:15 --> 00:11:20 And that then led Jacques Monod and Francois Jacob, 150 00:11:20 --> 00:11:26 the two French scientists I mentioned, to begin studying 151 00:11:26 --> 00:11:31 this problem. And, like so many things in biology, 152 00:11:31 --> 00:11:36 they were working on a huge problem, how are genes regulated? 153 00:11:36 --> 00:11:41 But it wasn't so evident at the beginning. What they were doing was 154 00:11:41 --> 00:11:46 this very modest thing, why is beta-galactosidase there in 155 00:11:46 --> 00:11:51 one condition and not another? Just a little problem in bacterial 156 00:11:51 --> 00:11:56 metabolism. 
And ultimately it gave us the roots to the answer of how 157 00:11:56 --> 00:12:01 genes are regulated. And I just have to leave out all the 158 00:12:01 --> 00:12:05 beautiful stuff that led to what they found. Let me just recapitulate 159 00:12:05 --> 00:12:10 what I put on the board the other day. It turned out that the lacZ 160 00:12:10 --> 00:12:14 gene, this is the gene that encodes beta-galactosidase, 161 00:12:14 --> 00:12:19 so that's the sequence, that has the sequence of codons, 162 00:12:19 --> 00:12:23 that if you could start at the beginning and go along and just put 163 00:12:23 --> 00:12:28 all the amino acids in you would end up with beta-galactosidase. 164 00:12:28 --> 00:12:39 That unit of genetic information, which is called the lacZ gene, let's 165 00:12:39 --> 00:12:50 put it here. Then there were two other genes just downstream of this 166 00:12:50 --> 00:13:01 in the DNA. And this unit is made as a single mRNA. 167 00:13:01 --> 00:13:05 So this is a little bit different than what I told you. 168 00:13:05 --> 00:13:09 It's not just one gene. It's actually two or three because 169 00:13:09 --> 00:13:13 these bacteria try to do everything very efficiently because they're 170 00:13:13 --> 00:13:17 growing quickly. This means it can turn on three 171 00:13:17 --> 00:13:21 genes of related function very efficiently. When you have several 172 00:13:21 --> 00:13:25 genes that are expressed using one mRNA, as I said, 173 00:13:25 --> 00:13:30 the genes are said to be organized in an operon. 174 00:13:30 --> 00:13:35 But the key point that we talked about before is if you're going to have 175 00:13:35 --> 00:13:40 these genes expressed, everything has to be written in the DNA. 176 00:13:40 --> 00:13:45 And it's not using the genetic code. It's using other words that are 177 00:13:45 --> 00:13:50 written there. And you've seen this word now in 178 00:13:50 --> 00:13:55 several lectures. That's a promoter. 
179 00:13:55 --> 00:14:00 And that means to start transcription. 180 00:14:00 --> 00:14:04 And to stop the mRNA there has to be something at the other end. 181 00:14:04 --> 00:14:09 And I'll show you the sequence of at least one of these promoters in a 182 00:14:09 --> 00:14:13 minute. This means to stop transcription, 183 00:14:13 --> 00:14:18 to stop making the RNA copy. If you didn't have sequences like 184 00:14:18 --> 00:14:22 that the cell wouldn't know where should I begin the RNA and where 185 00:14:22 --> 00:14:27 does it end. And since there are many, many genes, 186 00:14:27 --> 00:14:32 there have to be many promoters and many terminators. 187 00:14:32 --> 00:14:35 And the other point I tried to hammer home the other day is 188 00:14:35 --> 00:14:39 although the genetic code is universal, you can take that little 189 00:14:39 --> 00:14:43 table and read the sequence of human proteins or E. 190 00:14:43 --> 00:14:46 coli proteins, these other languages that are written using this 191 00:14:46 --> 00:14:50 four-letter nucleic acid alphabet are not universal. 192 00:14:50 --> 00:14:54 So the sequences that E. coli uses for a promoter, a start 193 00:14:54 --> 00:14:58 transcription are very different than what our bodies use as a start 194 00:14:58 --> 00:15:03 transcription thing. And when we get to the recombinant 195 00:15:03 --> 00:15:10 DNA stuff that will be an issue. So this mRNA is then used to make 196 00:15:10 --> 00:15:17 proteins. This would be beta-galactosidase or the lac-Z gene 197 00:15:17 --> 00:15:24 product, and these other genes make, so these are proteins. Here you see 198 00:15:24 --> 00:15:31 this flow of information from the DNA through the mRNA down to being 199 00:15:31 --> 00:15:37 made into proteins. In the case of bacteria there's no 200 00:15:37 --> 00:15:42 nucleus. 
Everything is in one big pot so that mRNA doesn't have to go 201 00:15:42 --> 00:15:47 anywhere, but it's all there and it gets translated to give copies of 202 00:15:47 --> 00:15:52 the protein. So, as I said, somehow if the cell is 203 00:15:52 --> 00:15:57 going to now regulate whether these genes are expressed or not, 204 00:15:57 --> 00:16:03 depending on whether lactose is present. 205 00:16:03 --> 00:16:06 And the way they do it makes perfect sense. Don't bother to make the 206 00:16:06 --> 00:16:10 enzyme if there's no lactose in the neighborhood, and only make it if 207 00:16:10 --> 00:16:14 it's present. So how are they going to do that? Is the lactose going to 208 00:16:14 --> 00:16:17 come along and stick into some little hole here or something? 209 00:16:17 --> 00:16:21 That's not a general strategy and it wouldn't work if you look at the 210 00:16:21 --> 00:16:25 structure of DNA anyway. It wouldn't have access to the 211 00:16:25 --> 00:16:31 sequence of bases. So what was discovered was that 212 00:16:31 --> 00:16:41 there was another gene very close called lacI. And the protein that 213 00:16:41 --> 00:16:50 it encodes is called the lac repressor. And since it's a gene it 214 00:16:50 --> 00:17:00 has to have a promoter, a start transcription. 215 00:17:00 --> 00:17:07 But this one, be careful now, don't get yourself mixed up, this is 216 00:17:07 --> 00:17:14 for the lacI gene. This is a different promoter over 217 00:17:14 --> 00:17:21 here for this thing. And then there is also then a 218 00:17:21 --> 00:17:28 terminator or a stop transcription. And, again, this is for the lacI 219 00:17:28 --> 00:17:35 gene. So this gets made into an mRNA as well. 220 00:17:35 --> 00:17:44 And this gets translated into a protein that's known as lac 221 00:17:44 --> 00:17:53 repressor. And what that lac repressor has is the ability to bind 222 00:17:53 --> 00:18:03 to a particular sequence in DNA that's located right here. 
223 00:18:03 --> 00:18:15 This is a binding site right here 224 00:18:15 --> 00:18:23 for lac repressor. So let me just try to blow that up 225 00:18:23 --> 00:18:31 just a little bit because this could be a little bit confusing. 226 00:18:31 --> 00:18:49 So here's the promoter -- 227 00:18:49 --> 00:18:53 -- for this lacZYA operon. Here's the beginning of the lacZ 228 00:18:53 --> 00:18:57 gene. And so this is where the RNA polymerase, the machine that's going 229 00:18:57 --> 00:19:01 to make the RNA copy has to bind. And here's the binding site. 230 00:19:01 --> 00:19:13 -- for lac repressor or the lacI 231 00:19:13 --> 00:19:19 protein. So how does this circuit work? And I sort of pose that as an 232 00:19:19 --> 00:19:25 issue for those of you who had followed it at least to that point. 233 00:19:25 --> 00:19:32 So let's consider the two situations. 234 00:19:32 --> 00:19:36 If there's no lactose present what we know from just scientists knew 235 00:19:36 --> 00:19:41 from experimentation was there was no beta-galactosidase activity 236 00:19:41 --> 00:19:45 inside the cell. You could crack them open and you 237 00:19:45 --> 00:19:50 wouldn't find this enzyme there. And when you added lactose they 238 00:19:50 --> 00:19:55 knew, from that experiment described on Friday, it was synthesized de 239 00:19:55 --> 00:20:00 novo. So if there's no lactose present -- 240 00:20:00 --> 00:20:09 And what happens is we have the lacI gene, mRNA is being made, 241 00:20:09 --> 00:20:18 this lac repressor is being made, here's the promoter for lacZYA, and 242 00:20:18 --> 00:20:27 here's the sequence. And this repressor goes up and 243 00:20:27 --> 00:20:34 binds to that. And by binding to this particular 244 00:20:34 --> 00:20:38 sequence what it does is it covers up the promoter. 245 00:20:38 --> 00:20:43 It covers up the start signal, the signal that says "start 246 00:20:43 --> 00:20:48 transcription here". Are you guys with me? 
247 00:20:48 --> 00:20:52 It's a relatively simple strategy. It's just by lac repressor having 248 00:20:52 --> 00:20:57 that ability to bind to a particular sequence it's able to prevent the 249 00:20:57 --> 00:21:02 RNA polymerase from seeing the promoter. 250 00:21:02 --> 00:21:08 And therefore it's able to prevent the RNA from being made. 251 00:21:08 --> 00:21:15 So there's no mRNA. And if there's no mRNA over here then there's no 252 00:21:15 --> 00:21:21 beta-galactosidase being made. This is an exercise in futility. 253 00:21:21 --> 00:21:28 Why has the cell gone and made this useless protein that isn't doing 254 00:21:28 --> 00:21:35 anything in terms of helping it metabolize lactose? 255 00:21:35 --> 00:21:39 But now take a look at the system compared to what I described when we 256 00:21:39 --> 00:21:43 were first doing it. If we had to do all the regulation 257 00:21:43 --> 00:21:48 directly with the DNA we have this problem that lactose would somehow 258 00:21:48 --> 00:21:52 have to be able to see a sequence in DNA and somehow determine what 259 00:21:52 --> 00:21:57 happened. But what this cell has done now is it's set this system up 260 00:21:57 --> 00:22:01 so that the ability to make beta-galactosidase 261 00:22:01 --> 00:22:06 or not make it is conditional on this protein called the lac repressor. 262 00:22:06 --> 00:22:11 If lac repressor is bound, as it's shown here, it's basically 263 00:22:11 --> 00:22:16 covering up the promoter, the cell cannot make RNA and cannot 264 00:22:16 --> 00:22:21 make beta-galactosidase. If it was absent, if we just got 265 00:22:21 --> 00:22:27 rid of lac repressor then the promoter would be exposed and the 266 00:22:27 --> 00:22:32 cell could make beta-galactosidase. And so it's now lac repressor that 267 00:22:32 --> 00:22:37 has the conditionality. It's, in essence, a sensor. 268 00:22:37 --> 00:22:42 It can at least be considered a sensor for whether lactose is 269 00:22:42 --> 00:22:47 present. 
And indeed that is the property that lac repressor has. 270 00:22:47 --> 00:22:52 It's able to bind lactose. So if we think lactose is present 271 00:22:52 --> 00:23:00 what happens? 272 00:23:00 --> 00:23:04 I mean this lacI gene is pretty uninteresting, 273 00:23:04 --> 00:23:08 uninteresting from the standpoint of regulation in the sense that it's 274 00:23:08 --> 00:23:12 made all the time. The cell just continually cranks 275 00:23:12 --> 00:23:16 out a bit of lac repressor. It doesn't need very much. It just 276 00:23:16 --> 00:23:20 needs to make enough so that the one binding site for lac repressor has 277 00:23:20 --> 00:23:24 somebody bound to it. So it can get away with pretty low 278 00:23:24 --> 00:23:30 levels. But over here then we have this 279 00:23:30 --> 00:23:36 promoter, and we have the lacZ gene and so on here, 280 00:23:36 --> 00:23:43 and there is the binding sequence right there. But this lac repressor 281 00:23:43 --> 00:23:50 has the ability to bind lactose, which I'm going to draw as a little 282 00:23:50 --> 00:23:56 triangle here, even though you know it's a 283 00:23:56 --> 00:24:02 disaccharide. It has a different property. 284 00:24:02 --> 00:24:07 But the fundamental characteristic of the lac repressor binding lactose 285 00:24:07 --> 00:24:12 is it undergoes a change in conformation. So if it's got a 286 00:24:12 --> 00:24:16 binding pocket, and lactose fits into that binding 287 00:24:16 --> 00:24:21 pocket, those alpha helices and beta sheets and so on move around a 288 00:24:21 --> 00:24:26 little bit. And what happens then is it perturbs the part of the 289 00:24:26 --> 00:24:31 protein that would normally be able to recognize this DNA sequence. 290 00:24:31 --> 00:24:41 And this cannot -- 291 00:24:41 --> 00:24:47 -- bind to the DNA sequence up here. And I'll tell you the special name 292 00:24:47 --> 00:24:53 for that binding site. It's just one of these terms that 293 00:24:53 --> 00:25:00 you'll see in biology. 
Everything has to be given a name. 294 00:25:00 --> 00:25:06 It's called an operator, for historical reasons. But, 295 00:25:06 --> 00:25:12 in any case, the lac repressor, once it's hanging onto a lactose 296 00:25:12 --> 00:25:18 it's unable to bind this sequence. That means that the start site for 297 00:25:18 --> 00:25:24 transcription is exposed. And so you get the mRNA made and 298 00:25:24 --> 00:25:30 then you get beta-galactosidase present. 299 00:25:30 --> 00:25:34 A little bit complicated, but sort of underlying it is this 300 00:25:34 --> 00:25:38 idea that now the cell is using a protein rather than DNA to tell 301 00:25:38 --> 00:25:43 whether lactose is present. And hopefully you can see this is 302 00:25:43 --> 00:25:47 really general now because you can design a regulatory protein. 303 00:25:47 --> 00:25:52 And basically it's got to have two things. It's got to have a part 304 00:25:52 --> 00:25:56 that talks to the DNA and recognizes some sequence, 305 00:25:56 --> 00:26:01 and it's got to have another part that senses whatever it is. 306 00:26:01 --> 00:26:05 Histidine, temperature, you name it. But once you 307 00:26:05 --> 00:26:10 understand that design principle then you can begin to see how it is 308 00:26:10 --> 00:26:15 that the cell is able to turn genes on and off just by encoding 309 00:26:15 --> 00:26:20 information in the DNA. And, as I say, one of the big 310 00:26:20 --> 00:26:25 tricks there is to let the protein do the sensing for you. 311 00:26:25 --> 00:26:30 Now, I just want to give you a little bit of a blowup of what this 312 00:26:30 --> 00:26:36 thing looks like. Because sort of all I've done is 313 00:26:36 --> 00:26:42 kind of put it here as a sequence. I've called it a promoter. What a 314 00:26:42 --> 00:26:49 promoter then is, again it means the start for 315 00:26:49 --> 00:26:55 transcription, the process of making RNA. 316 00:26:55 --> 00:27:02 And I've tried to stress that these promoters are not universal. 
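Editor's sketch, not part of the lecture: the lac switch described above can be written as a tiny piece of Boolean logic, which may help make the design principle concrete. The function names are invented for illustration, and the model is deliberately all-or-nothing. It ignores repressor concentration, leaky expression, and every other layer of control.

```python
# Illustrative Boolean model of negative regulation at the lac operon.
# All names are hypothetical; real expression is graded, not on/off.

def repressor_can_bind_operator(lactose_present):
    """The repressor has two parts: one senses lactose, one recognizes DNA.

    Binding lactose changes the protein's conformation so the DNA-recognizing
    part no longer fits the operator.
    """
    return not lactose_present


def operon_transcribed(lactose_present):
    """RNA polymerase can use the promoter only when the operator is free."""
    operator_occupied = repressor_can_bind_operator(lactose_present)
    return not operator_occupied


def beta_gal_present(lactose_present):
    """No mRNA means no beta-galactosidase."""
    return operon_transcribed(lactose_present)


for lactose in (False, True):
    state = "made" if beta_gal_present(lactose) else "not made"
    print(f"lactose present: {lactose!s:5}  beta-galactosidase: {state}")
```

The point of writing it this way is that the DNA never has to sense lactose. The only conditional step is in the protein, which is the strategy the lecture calls letting the protein do the sensing.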
317 00:27:02 --> 00:27:09 So when I tell you this for E. 318 00:27:09 --> 00:27:15 coli, this is what a promoter for E. coli looks like, but it doesn't look 319 00:27:15 --> 00:27:20 at all like a promoter in our bodies. And so it's basically a word that's 320 00:27:20 --> 00:27:26 written using this nucleic acid alphabet. And it looks 321 00:27:26 --> 00:27:32 something like this. There's TTGACA and then there are 322 00:27:32 --> 00:27:38 about 17 base pairs that can be just about anything. 323 00:27:38 --> 00:27:44 And then there's TATAAT. And then there's another little bit 324 00:27:44 --> 00:27:50 here that's about ten base pairs long. And then this is the start of 325 00:27:50 --> 00:27:56 the mRNA which is usually given the convention of being called 326 00:27:56 --> 00:28:02 the plus one position. So you'll notice this word I've 327 00:28:02 --> 00:28:07 called it, written, that says start transcription has 328 00:28:07 --> 00:28:12 even got two parts to it. And this is usually referred to as the 329 00:28:12 --> 00:28:17 minus ten region of the promoter, and this is the minus 35 region 330 00:28:17 --> 00:28:22 because that's the distance from the start of transcription. 331 00:28:22 --> 00:28:27 It may seem sort of weird to you to see what I'm telling you is sort 332 00:28:27 --> 00:28:32 of a word written in the nucleic acid language. 333 00:28:32 --> 00:28:35 It's got some bits in the middle that don't matter. 334 00:28:35 --> 00:28:39 But remember the DNA is a helix and things going around like this. 335 00:28:39 --> 00:28:43 So if you were just to take a DNA helix and then lay something down on 336 00:28:43 --> 00:28:47 one side of it, it would contact it here, 337 00:28:47 --> 00:28:51 it wouldn't contact it there, and then when it came back up again 338 00:28:51 --> 00:28:55 it would contact it here. 
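Editor's sketch, not part of the lecture: the two-part promoter "word" (TTGACA, a spacer of about 17 base pairs, then TATAAT) can be searched for with a regular expression. This is a simplification under stated assumptions: it demands exact matches to both consensus boxes, and the allowed spacer range of 15 to 19 is my own reading of "about 17". Real E. coli promoters usually differ from the consensus at some positions, so this pattern would miss most of them. The test sequence is invented.

```python
import re

# Exact consensus boxes with an assumed spacer of 15-19 arbitrary bases.
PROMOTER_PATTERN = re.compile(r"TTGACA([ACGT]{15,19})TATAAT")


def find_consensus_promoters(dna):
    """Return the position of each exact-consensus promoter match (0-based)."""
    hits = []
    for match in PROMOTER_PATTERN.finditer(dna):
        hits.append({
            "minus35_start": match.start(),    # first base of TTGACA
            "spacer_length": len(match.group(1)),
            "minus10_start": match.end(1),     # first base of TATAAT
        })
    return hits


# Hypothetical sequence: 3 bases, the -35 box, a 17-base spacer, the -10 box,
# then a short stretch leading toward the transcription start.
example = "GGC" + "TTGACA" + "ACGTACGTACGTACGTA" + "TATAAT" + "GCATGCAATG"

for hit in find_consensus_promoters(example):
    print(hit)
```

The bases in the spacer are matched by a wildcard class, which mirrors the remark that the middle of the word "doesn't matter" while the two boxes on one face of the helix do.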
And so as these things hang on 339 00:28:55 --> 00:28:59 along the sides of DNA it's not at all uncommon to find this sort of 340 00:28:59 --> 00:29:03 broken where you can have something that matters, something that doesn't 341 00:29:03 --> 00:29:07 matter and something that matters again. 342 00:29:07 --> 00:29:15 The RNA polymerase in E. coli, it's a machine. It's got four 343 00:29:15 --> 00:29:24 proteins that are the core. That's the part that actually 344 00:29:24 --> 00:29:33 synthesizes the RNA plus one protein, which is known as the 345 00:29:33 --> 00:29:41 sigma subunit. And it has the special job of 346 00:29:41 --> 00:29:48 recognizing the promoter. And so when we start asking where 347 00:29:48 --> 00:29:56 does this regulatory sequence that the lac repressor binds 348 00:29:56 --> 00:30:04 it in all of this. It turns out that the sequence for 349 00:30:04 --> 00:30:18 binding lac repressor -- 350 00:30:18 --> 00:30:23 -- overlaps with this minus ten region. So when the lac repressor 351 00:30:23 --> 00:30:28 is sitting down it's covering up a very important part of transcription. 352 00:30:28 --> 00:30:35 You guys with me? OK. So this is an interesting kind of 353 00:30:35 --> 00:30:44 regulation. It's given the general term -- 354 00:30:44 --> 00:30:53 -- negative regulation. 355 00:30:53 --> 00:30:57 And the reason that term is applied, it means that the regulatory 356 00:30:57 --> 00:31:09 protein -- 357 00:31:09 --> 00:31:19 -- interferes with transcription. 358 00:31:19 --> 00:31:24 And let's take a brief foray into we're going to talk about genetics 359 00:31:24 --> 00:31:37 as our next subject. And I think maybe we can let you 360 00:31:37 --> 00:31:57 sort of already get a sense of how some of this was figured out. 361 00:31:57 --> 00:32:17 So there's a substance called X-gal, which is a galactose with some 362 00:32:17 --> 00:32:37 chemical entity hanging off that's colorless. 
But yet it's a substrate 363 00:32:37 --> 00:32:57 whose bond can be cleaved by beta-galactosidase. 364 00:32:57 --> 00:32:52 And it gives galactose plus the free X entity. And this is colored. 365 00:32:52 --> 00:32:48 And this is a very useful thing for bacterial geneticists. 366 00:32:48 --> 00:32:43 Someone said they thought this was too much lab stuff, 367 00:32:43 --> 00:32:39 but I think if you don't have some sense of how this is done 368 00:32:39 --> 00:32:34 experimentally I'm not doing too good a job of conveying to you how 369 00:32:34 --> 00:32:30 we learn all this kind of thing. It just doesn't come out of a 370 00:32:30 --> 00:32:32 textbook. And so if we grow E. 371 00:32:32 --> 00:32:41 coli on plates that have glucose plus X-gal, the colonies would be 372 00:32:41 --> 00:32:49 colorless. And if we were to grow them on plates that had lactose plus 373 00:32:49 --> 00:32:58 X-gal then all of the colonies would be colored because they're making 374 00:32:58 --> 00:33:06 beta-galactosidase. And part of the way that this stuff 375 00:33:06 --> 00:33:12 that I've been telling you was figured out was by bacterial 376 00:33:12 --> 00:33:18 geneticists looking for something. What they looked for was back here 377 00:33:18 --> 00:33:24 on this plate that had X-gal. Almost all the colonies were 378 00:33:24 --> 00:33:30 colorless. These were colored because they could make 379 00:33:30 --> 00:33:36 beta-galactosidase. So if I gave you some plates of this, 380 00:33:36 --> 00:33:44 you looked in the lab and then you found a colored colony, 381 00:33:44 --> 00:33:52 it's a mutant. I'll define these terms for you very shortly. 382 00:33:52 --> 00:34:00 But it's got an alteration in the DNA that affects the regulation -- 383 00:34:00 --> 00:34:08 -- of beta-gal or the product of the 384 00:34:08 --> 00:34:14 lacZ gene. 
And on the basis of what I've told you about this model, 385 00:34:14 --> 00:34:20 can you guys come up with two types of things, two places or kinds of 386 00:34:20 --> 00:34:27 mutations that could break this system that would lead to 387 00:34:27 --> 00:34:33 beta-galactosidase being on even though there's no lactose 388 00:34:33 --> 00:34:39 in the medium? Anybody see one of them? 389 00:34:39 --> 00:34:45 Why is it off? Because of lac repressor? In a wild type strain 390 00:34:45 --> 00:34:51 it's because lac repressor is bound to that sequence and it's shutting 391 00:34:51 --> 00:34:57 off transcription. So we had a variant that could now 392 00:34:57 --> 00:35:10 transcribe. Yeah. 393 00:35:10 --> 00:35:15 OK, so that's a good idea. So if we could somehow mutate that 394 00:35:15 --> 00:35:21 little binding sequence in a way that didn't screw up everything else 395 00:35:21 --> 00:35:26 then, even though there was lac repressor being made, 396 00:35:26 --> 00:35:32 if it couldn't bind here because the sequence had been changed 397 00:35:32 --> 00:35:37 then you'd get it made. That's exactly right. 398 00:35:37 --> 00:35:42 That's one of them. Yeah. OK, it was a problem making lacI. 399 00:35:42 --> 00:35:47 What would happen? Well, if we couldn't make this and it couldn't 400 00:35:47 --> 00:35:51 bind there we'd be on. And that's the other class. 401 00:35:51 --> 00:35:56 Can you think of a kind of mutation that we learned about, 402 00:35:56 --> 00:36:01 think back to the genetic code, that would prevent lac repressor 403 00:36:01 --> 00:36:10 from being made? 404 00:36:10 --> 00:36:14 I'm trying to give you a clue. There were 61 codons encoded for 405 00:36:14 --> 00:36:18 amino acids. Yeah. Oh, that would work. 406 00:36:18 --> 00:36:23 Yup, if we messed up the promoter. That's a sophisticated answer. 407 00:36:23 --> 00:36:27 Yeah, if we messed up the promoter for making lacI that would 408 00:36:27 --> 00:36:33 certainly give that. 
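[Editor's sketch] The two classes of mutation just discussed (a changed repressor-binding sequence, or no lac repressor being made) can be written as a tiny piece of Boolean logic. This is only an illustration of the negative-regulation model as described in the lecture; the function name `beta_gal_on` and its parameters are made up for this sketch, and it deliberately ignores the positive regulation that comes later.

```python
def beta_gal_on(lactose_present, repressor_made=True, binding_site_intact=True):
    """Negative regulation only: return True if lacZ gets transcribed.

    Hypothetical, simplified model: the repressor blocks transcription only if
    it is made, its binding sequence is intact, and lactose has not released it.
    """
    repressor_bound = repressor_made and binding_site_intact and not lactose_present
    return not repressor_bound


# Wild type: off without lactose, on with lactose.
print("wild type, no lactose:       ", beta_gal_on(False))
print("wild type, lactose:          ", beta_gal_on(True))
# The two mutant classes: both are on even with no lactose in the medium.
print("binding site mutated, no lac:", beta_gal_on(False, binding_site_intact=False))
print("no repressor made, no lac:   ", beta_gal_on(False, repressor_made=False))
```

On a glucose plus X-gal plate, the last two cases are the ones that would show up as colored colonies.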
Can you think of another type of 409 00:36:33 --> 00:36:39 thing that would affect the lacI gene, would prevent lacI from being 410 00:36:39 --> 00:36:49 made? Somebody? Yeah. 411 00:36:49 --> 00:36:52 OK. If the sequence was wrong so that it couldn't bind, 412 00:36:52 --> 00:36:56 that would be good. The one I'm trying to tease out of you, 413 00:36:56 --> 00:37:00 but I won't take longer now, is remember those three stop codons 414 00:37:00 --> 00:37:03 that didn't encode for anything? There should be one of those at the 415 00:37:03 --> 00:37:07 end of the protein. But if you changed one of the amino 416 00:37:07 --> 00:37:11 acid codons into a stop codon that would also prevent you from making 417 00:37:11 --> 00:37:15 it. So there are at least a couple of kinds of mutations. 418 00:37:15 --> 00:37:18 And what I've sort of done here is I've skipped all the evidence and 419 00:37:18 --> 00:37:22 given you the model. And I cannot give you all the 420 00:37:22 --> 00:37:26 evidence that led to this, which is a pretty well-established 421 00:37:26 --> 00:37:30 model. I don't think this is likely to change. 422 00:37:30 --> 00:37:35 We've been studying it for so long. But this is the kind of evidence on 423 00:37:35 --> 00:37:40 which it was based. It was by people finding things and 424 00:37:40 --> 00:37:45 figuring out that parts of the machinery were broken and then 425 00:37:45 --> 00:37:51 working on them. So there's another kind of regulation known as positive 426 00:37:51 --> 00:37:56 regulation. And for a long time people thought maybe everything was 427 00:37:56 --> 00:38:01 negative regulation. But it turns out positive regulation 428 00:38:01 --> 00:38:06 is far more common. In this case, the regulatory 429 00:38:06 --> 00:38:11 protein instead of inhibiting transcription assists with the 430 00:38:11 --> 00:38:16 transcription.
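[Editor's sketch] Going back to the stop-codon clue from a moment ago: changing an amino acid codon into one of the three stop codons cuts the protein short. The short sequence below is invented for illustration and is not the real lacI sequence; the helper name `codons_before_stop` is also just an assumption for this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the three codons that encode no amino acid


def codons_before_stop(coding_dna):
    """Read the sequence three letters at a time and stop at the first stop codon."""
    kept = []
    for i in range(0, len(coding_dna) - 2, 3):
        codon = coding_dna[i:i + 3]
        if codon in STOP_CODONS:
            break
        kept.append(codon)
    return kept


normal = "ATGGAATGGCAATAA"  # made-up gene: ATG GAA TGG CAA TAA
mutant = "ATGGAATGACAATAA"  # one letter changed: TGG (an amino acid) -> TGA (stop)

print(codons_before_stop(normal))  # full-length reading frame
print(codons_before_stop(mutant))  # cut short by the new stop codon
```

The mutant differs from the normal sequence by a single letter, yet the reading stops two codons early, so no functional protein would be expected.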
And it turned out, 431 00:38:16 --> 00:38:21 after people had been studying the beta-galactosidase system for a 432 00:38:21 --> 00:38:26 number of years, that it had a positive control 433 00:38:26 --> 00:38:32 system superimposed or together with the negative regulatory system. 434 00:38:32 --> 00:38:37 The same thing engineers do all the time, pile up regulatory circuits 435 00:38:37 --> 00:38:42 and get all kinds of additional conditionalities. 436 00:38:42 --> 00:38:47 And the thing I've told you so far was we asked whether beta-gal is 437 00:38:47 --> 00:38:55 present. If the carbon source -- 438 00:38:55 --> 00:39:02 -- is glucose, this is low or not there. 439 00:39:02 --> 00:39:08 If it's lactose it's high, but if cells were grown on both, 440 00:39:08 --> 00:39:13 glucose plus lactose, then beta-galactosidase is low again. 441 00:39:13 --> 00:39:18 And this makes some physiological sense because E. 442 00:39:18 --> 00:39:24 coli likes to use glucose. It's its favorite food source. 443 00:39:24 --> 00:39:29 And so if it's got its favorite food source around then it doesn't 444 00:39:29 --> 00:39:35 want to make proteins that are used to eat its sort of less 445 00:39:35 --> 00:39:40 favorite food sources. So there's a nice conditionality 446 00:39:40 --> 00:39:44 here. The circuitry is set up so that it 447 00:39:44 --> 00:39:48 only makes the enzyme for metabolizing lactose when the cell 448 00:39:48 --> 00:39:53 realizes its favorite food source isn't there and then it senses 449 00:39:53 --> 00:39:57 there's lactose. So only under those conditions does 450 00:39:57 --> 00:40:02 it make the enzymes for metabolizing lactose. 451 00:40:02 --> 00:40:06 And, again, the way this circuitry works, this positive regulation 452 00:40:06 --> 00:40:11 needs two things. Once again, it needs a protein. 453 00:40:11 --> 00:40:16 This one has been given the name CRP.
And, again, it's something that's 454 00:40:16 --> 00:40:21 able to bind to a sequence in DNA. And then it's also got a 455 00:40:21 --> 00:40:25 conditionality. It's able to recognize something 456 00:40:25 --> 00:40:30 else. And what this one recognizes is this small molecule that's known 457 00:40:30 --> 00:40:36 as cyclic AMP. It's just the familiar 458 00:40:36 --> 00:40:42 adenosine monophosphate that you've seen before but it's 459 00:40:42 --> 00:40:48 looped around and formed an ester bond here. And that's why it's 460 00:40:48 --> 00:40:54 called cyclic AMP. But the important thing about this 461 00:40:54 --> 00:41:01 is that the levels of cyclic AMP are dependent on glucose. 462 00:41:01 --> 00:41:12 So if you have high glucose you have low cyclic AMP and if you have low 463 00:41:12 --> 00:41:23 glucose you have high cyclic AMP. And here again is what's going to 464 00:41:23 --> 00:41:35 happen. So this is the promoter for lacZYA and here's the start of the 465 00:41:35 --> 00:41:46 lacZ gene. And I told you this is where the repressor would be binding. 466 00:41:46 --> 00:41:58 This CRP protein is able to bind to -- 467 00:41:58 --> 00:42:08 -- a site that's even a little bit 468 00:42:08 --> 00:42:16 farther upstream of the lacZ gene than is the promoter. 469 00:42:16 --> 00:42:24 And the idea of this is that if this CRP is -- 470 00:42:24 --> 00:42:33 -- just by itself, 471 00:42:33 --> 00:42:40 it doesn't bind to the DNA. But it has a little binding pocket 472 00:42:40 --> 00:42:48 that senses the levels of cyclic AMP. So if the cell is starving for 473 00:42:48 --> 00:42:55 glucose, there are high levels of cyclic AMP, then the CRP bound to cyclic 474 00:42:55 --> 00:43:03 AMP, this is capable of binding to this sequence. 475 00:43:03 --> 00:43:12 So that sounds weird, but what we've, again, 476 00:43:12 --> 00:43:21 got now is we've got a protein whose binding to DNA is conditional on 477 00:43:21 --> 00:43:30 something inside the cell.
And rather than getting in the way, 478 00:43:30 --> 00:43:39 what this does, if you have a situation where you have CRP with 479 00:43:39 --> 00:43:48 cyclic AMP bound to it and it's next door to this promoter, 480 00:43:48 --> 00:43:57 what it's able to do is help RNA polymerase recognize the promoter. 481 00:43:57 --> 00:44:06 So this helps RNA polymerase. Whereas, when we were talking about 482 00:44:06 --> 00:44:11 the lac repressor what it was doing, if you recall, was getting in the 483 00:44:11 --> 00:44:16 way of RNA polymerase. And this actually has a relatively 484 00:44:16 --> 00:44:21 simple sort of molecular explanation for what's going on. The 485 00:44:21 --> 00:44:34 RNA polymerase -- 486 00:44:34 --> 00:44:39 The best sequence for the minus ten region of the promoter, 487 00:44:39 --> 00:44:45 that I showed you over on the other board, is something with the 488 00:44:45 --> 00:44:50 sequence TATAAT. So the RNA polymerase machinery, 489 00:44:50 --> 00:44:56 it can recognize more than one sequence, but if it sees a promoter 490 00:44:56 --> 00:45:01 that has a minus ten region that is TATAAT, it really binds well and 491 00:45:01 --> 00:45:07 that would be a very strong promoter and you get lots of mRNA. 492 00:45:07 --> 00:45:15 Now, the lacZ promoter actually has two nucleotides that are different. 493 00:45:15 --> 00:45:26 It's TATGTT. 494 00:45:26 --> 00:45:38 So this is a very weak promoter without help. 495 00:45:38 --> 00:45:43 So in this lac system, if we got rid of lac repressor 496 00:45:43 --> 00:45:48 entirely so that the promoter was just exposed all the time, 497 00:45:48 --> 00:45:53 as I showed you up here, I've left out a detail in this first part in 498 00:45:53 --> 00:45:59 that this promoter is not very strong. 499 00:45:59 --> 00:46:03 And we'd only get a little bit of RNA and a little bit of protein.
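[Editor's sketch] The "two nucleotides that are different" can be checked directly by comparing the two minus ten sequences quoted above, position by position. The function name `mismatches` is just an assumption for this sketch; counting mismatches is only a rough stand-in for promoter strength, not a quantitative model.

```python
BEST_MINUS_10 = "TATAAT"  # the best minus ten sequence quoted in the lecture
LAC_MINUS_10 = "TATGTT"   # the lac promoter's minus ten region as quoted


def mismatches(reference, other):
    """Return (position, reference base, other base) wherever the two differ."""
    if len(reference) != len(other):
        raise ValueError("sequences must be the same length")
    return [
        (position, ref_base, other_base)
        for position, (ref_base, other_base) in enumerate(zip(reference, other), start=1)
        if ref_base != other_base
    ]


differences = mismatches(BEST_MINUS_10, LAC_MINUS_10)
print(len(differences), "differences:", differences)
```

The two differences fall at the fourth and fifth positions of the six-base region.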
500 00:46:03 --> 00:46:08 And so what the cell does is, if it 501 00:46:08 --> 00:46:13 knows there's no glucose around and cyclic AMP levels are high, then it 502 00:46:13 --> 00:46:18 uses the binding of the CRP to here to assist the RNA polymerase and get it turned 503 00:46:18 --> 00:46:23 on. I understand this is a bit complicated. Some of you probably 504 00:46:23 --> 00:46:28 have it. Some of you will be lost and you'll have to sit and look at 505 00:46:28 --> 00:46:32 your textbook for a little while. But it comes down to a couple of 506 00:46:32 --> 00:46:36 really simple principles. One is to detect what's going on 507 00:46:36 --> 00:46:40 the cell makes a regulatory protein, the regulatory protein binds DNA and 508 00:46:40 --> 00:46:44 it also senses something. And these things can work in two 509 00:46:44 --> 00:46:48 ways. They can either bind DNA and get in the way, 510 00:46:48 --> 00:46:52 they can be a negative regulatory element, or they can bind to DNA and 511 00:46:52 --> 00:46:56 they can help something happen and be a positive regulatory element. 512 00:46:56 --> 00:47:00 All the rest are just the details of lac. 513 00:47:00 --> 00:47:04 And we could spend the entire course on regulation and barely scratch the 514 00:47:04 --> 00:47:09 surface, but it's one of the huge secrets of life that cells are able 515 00:47:09 --> 00:47:13 to individually turn on different genes in different ways at different 516 00:47:13 --> 00:47:18 times, have rheostats for levels, coordinate great sets of genes in 517 00:47:18 --> 00:47:21 response to various stimuli. OK? See you on Wednesday.
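[Editor's sketch] The two principles summarized above, one regulatory protein that gets in the way and one that helps, can be combined into a small truth table. This is a simplified reading of the model as told in the lecture, not a quantitative description: the function name `lac_expression` and the three output labels are assumptions made for this sketch, and the case with neither sugar present is inferred from the model rather than stated in the lecture.

```python
def lac_expression(glucose, lactose):
    """Return a rough beta-galactosidase level: 'off', 'low', or 'high'."""
    cyclic_amp_high = not glucose     # high glucose -> low cyclic AMP, and vice versa
    crp_bound = cyclic_amp_high       # CRP binds DNA only with cyclic AMP bound to it
    repressor_bound = not lactose     # lac repressor sits on its site unless lactose is there

    if repressor_bound:
        return "off"                  # negative regulation: repressor is in the way
    if crp_bound:
        return "high"                 # positive regulation: CRP helps RNA polymerase
    return "low"                      # weak promoter without help


for glucose in (True, False):
    for lactose in (True, False):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> {lac_expression(glucose, lactose)}")
```

The three cases from the table in the lecture come out as expected: glucose alone gives essentially none, lactose alone gives high, and glucose plus lactose gives low.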