1 00:00:01,580 --> 00:00:04,080 NARRATOR: The following content is provided under a Creative 2 00:00:04,080 --> 00:00:05,620 Commons license. 3 00:00:05,620 --> 00:00:07,920 Your support will help MIT OpenCourseWare 4 00:00:07,920 --> 00:00:12,310 continue to offer high quality educational resources for free. 5 00:00:12,310 --> 00:00:14,910 To make a donation, or view additional materials 6 00:00:14,910 --> 00:00:18,870 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:18,870 --> 00:00:22,244 at ocw.mit.edu. 8 00:00:22,244 --> 00:00:23,910 LEYLA ISIK: So I'm going to just go over 9 00:00:23,910 --> 00:00:27,870 some very basic neuroscience, mostly terminology, just 10 00:00:27,870 --> 00:00:31,470 for people who have very little to no neuroscience background. 11 00:00:31,470 --> 00:00:34,380 When you hear the rest of the talks, you would think like, 12 00:00:34,380 --> 00:00:37,440 what does it mean that they're talking about spiking activity? 13 00:00:37,440 --> 00:00:38,910 Or what is fMRI measuring? 14 00:00:38,910 --> 00:00:40,870 So that's like the level at which this is. 15 00:00:40,870 --> 00:00:45,450 So my disclaimers are one, like I said, that it's very basic. 16 00:00:45,450 --> 00:00:48,389 And two, that it will be CBMM and vision centric, 17 00:00:48,389 --> 00:00:50,430 because the goal is to get you ready for the rest 18 00:00:50,430 --> 00:00:52,060 of this course. 19 00:00:52,060 --> 00:00:54,270 So please don't think that this is an exhaustive, 20 00:00:54,270 --> 00:00:55,740 or what I think is an exhaustive, 21 00:00:55,740 --> 00:00:58,300 summary of basic neuroscience. 22 00:01:00,950 --> 00:01:02,450 So just to give you a brief outline, 23 00:01:02,450 --> 00:01:04,325 first we'll talk about the basics of neurons, 24 00:01:04,325 --> 00:01:05,170 and their firing. 25 00:01:05,170 --> 00:01:06,790 Basic brain anatomy. 
26 00:01:06,790 --> 00:01:09,250 How people measure neural activity in the brain, 27 00:01:09,250 --> 00:01:11,380 both invasively and noninvasively. 28 00:01:11,380 --> 00:01:14,470 And then a brief rundown of the visual system. 29 00:01:14,470 --> 00:01:15,350 This is a neuron. 30 00:01:18,910 --> 00:01:23,920 And it has dendrites and axons, and the signal 31 00:01:23,920 --> 00:01:25,930 is propagated along the axon, and the axon 32 00:01:25,930 --> 00:01:27,920 terminates on another cell. 33 00:01:27,920 --> 00:01:33,310 And when one neuron terminates on another neuron, 34 00:01:33,310 --> 00:01:36,070 they form what's called a synapse. 35 00:01:36,070 --> 00:01:37,330 So here are some pictures. 36 00:01:37,330 --> 00:01:40,570 Sorry, it's hard to see on the projector, of neurons synapsing 37 00:01:40,570 --> 00:01:41,530 on other neurons. 38 00:01:41,530 --> 00:01:43,480 And that is how neurons communicate. 39 00:01:43,480 --> 00:01:47,120 They send electrical activity down their axon, 40 00:01:47,120 --> 00:01:49,420 and it reaches the next cell. 41 00:01:49,420 --> 00:01:52,090 And the synapse is both an electrical and chemical 42 00:01:52,090 --> 00:01:52,750 phenomenon. 43 00:01:52,750 --> 00:01:54,750 We're not going to get into the details of that, 44 00:01:54,750 --> 00:01:58,780 but if you're interested, I encourage you to Wikipedia it. 45 00:01:58,780 --> 00:02:03,120 Neurons have an ion gradient across them. 46 00:02:03,120 --> 00:02:04,900 So there is a different concentration 47 00:02:04,900 --> 00:02:08,050 of certain types of ions inside and outside of the cell. 48 00:02:08,050 --> 00:02:09,949 And there are ion channels along the cell. 49 00:02:09,949 --> 00:02:12,320 And these ion channels are voltage-gated. 50 00:02:12,320 --> 00:02:18,310 So what happens is when these ion channels open, 51 00:02:18,310 --> 00:02:20,080 the voltage inside the cell changes, 52 00:02:20,080 --> 00:02:22,990 and that eventually leads a neuron to fire. 
53 00:02:22,990 --> 00:02:26,380 And they fire what is known as an action potential. 54 00:02:26,380 --> 00:02:29,830 So it's possible for neurons' voltage to change a little bit, 55 00:02:29,830 --> 00:02:31,420 and that is known as potentiation. 56 00:02:31,420 --> 00:02:34,659 So they can get either excitatory or inhibitory 57 00:02:34,659 --> 00:02:35,200 potentiation. 58 00:02:35,200 --> 00:02:40,826 So that means either higher or lower activity, as shown here. 59 00:02:40,826 --> 00:02:42,700 And then once it reaches a certain threshold, 60 00:02:42,700 --> 00:02:45,130 they fire what's known as an action potential. 61 00:02:45,130 --> 00:02:47,230 So action potentials are all-or-none firing, 62 00:02:47,230 --> 00:02:49,480 and that's what is referred to as neural firing, 63 00:02:49,480 --> 00:02:50,860 or neural spiking. 64 00:02:50,860 --> 00:02:54,186 It's this actual spike in the voltage. 65 00:02:54,186 --> 00:02:55,810 That is all you need to know, like when 66 00:02:55,810 --> 00:02:57,550 people are talking about neural spiking, 67 00:02:57,550 --> 00:03:00,310 they're talking about the actual action potential. 68 00:03:00,310 --> 00:03:03,430 But oftentimes, we're not measuring things 69 00:03:03,430 --> 00:03:05,260 at the level of single spikes. 70 00:03:05,260 --> 00:03:06,790 So I'll get into it in a little bit, 71 00:03:06,790 --> 00:03:08,680 about what people are actually measuring, 72 00:03:08,680 --> 00:03:10,402 and what they're talking about when 73 00:03:10,402 --> 00:03:12,610 they're talking about different recording techniques. 74 00:03:16,910 --> 00:03:19,640 So some basic brain anatomy. 75 00:03:19,640 --> 00:03:26,362 This is a slice of the cortex, and just to orient you, 76 00:03:26,362 --> 00:03:27,820 I'm going to put these online, just 77 00:03:27,820 --> 00:03:29,200 so you know the terminology. 78 00:03:29,200 --> 00:03:30,250 But there are different lobes. 
79 00:03:30,250 --> 00:03:31,990 The occipital lobe is in the back, that's 80 00:03:31,990 --> 00:03:34,030 where early visual cortex is. 81 00:03:34,030 --> 00:03:36,910 Temporal lobe, parietal lobe, and frontal lobe. 82 00:03:36,910 --> 00:03:39,880 And if people are talking about the inferior part of the brain, 83 00:03:39,880 --> 00:03:43,510 they mean the bottom, superior top, et cetera. 84 00:03:43,510 --> 00:03:46,720 And this is a rough layout of where different sensory-- 85 00:03:46,720 --> 00:03:48,730 can people see that? 86 00:03:48,730 --> 00:03:50,950 Kind of, different sensory and motor 87 00:03:50,950 --> 00:03:53,440 cortexes, where they land on the cortex. 88 00:03:53,440 --> 00:03:55,840 So Nancy is going to give a really nice introduction 89 00:03:55,840 --> 00:03:58,480 to the functional specialization of the brain. 90 00:03:58,480 --> 00:04:01,090 This is just some basic anatomical terms 91 00:04:01,090 --> 00:04:02,544 to familiarize you all. 92 00:04:06,720 --> 00:04:09,810 Right, so neural recordings. 93 00:04:09,810 --> 00:04:13,770 So when we're talking about invasive neural recordings, 94 00:04:13,770 --> 00:04:16,350 the first type that we'll talk about is electrophysiology. 95 00:04:16,350 --> 00:04:18,600 So single and multi-unit recordings. 96 00:04:18,600 --> 00:04:20,970 And what that means is that somebody actually sticks 97 00:04:20,970 --> 00:04:23,850 an electrode into the brain of an animal 98 00:04:23,850 --> 00:04:25,932 and records their neural activity. 99 00:04:25,932 --> 00:04:27,390 So this can either be a single unit 100 00:04:27,390 --> 00:04:31,080 recording, which means you are recording from a single neuron. 101 00:04:31,080 --> 00:04:33,720 And either by sticking the electrode inside or on 102 00:04:33,720 --> 00:04:35,679 top of the neuron, or very close to the neuron. 
103 00:04:35,679 --> 00:04:38,011 And that means that you're close enough that you're only 104 00:04:38,011 --> 00:04:40,350 picking up the changes in electrical activity 105 00:04:40,350 --> 00:04:42,399 from that one neuron. 106 00:04:42,399 --> 00:04:43,940 But what's more commonly measured now 107 00:04:43,940 --> 00:04:46,140 is multi-unit activity. 108 00:04:46,140 --> 00:04:48,360 That means that you stick an electrode in the brain, 109 00:04:48,360 --> 00:04:50,070 and it's picking up activity from a bunch 110 00:04:50,070 --> 00:04:51,576 of neurons around it. 111 00:04:51,576 --> 00:04:52,950 So you can either take that data, 112 00:04:52,950 --> 00:04:55,510 and get what's known as the local field potentials. 113 00:04:55,510 --> 00:04:57,540 So that is the changes in potential, 114 00:04:57,540 --> 00:05:00,180 in general, in that whole group of neurons. 115 00:05:00,180 --> 00:05:02,460 And people often analyze that data. 116 00:05:02,460 --> 00:05:04,680 Or, you can do some sort of preprocessing 117 00:05:04,680 --> 00:05:07,710 to figure out how many neural spikes you're getting. 118 00:05:07,710 --> 00:05:10,639 So that's typically trying to look at the neural firing. 119 00:05:10,639 --> 00:05:12,180 So from that activity, you can either 120 00:05:12,180 --> 00:05:14,490 get the spiking pattern, or what people refer to 121 00:05:14,490 --> 00:05:15,870 as the local field potential. 122 00:05:19,139 --> 00:05:20,680 And then you probably heard, you will 123 00:05:20,680 --> 00:05:24,700 hear a lot about ECoG data, from Gabriel and others this time. 124 00:05:24,700 --> 00:05:26,060 So this is really exciting. 125 00:05:26,060 --> 00:05:30,820 It's the opportunity to record from inside the human brain. 126 00:05:30,820 --> 00:05:34,060 From patients who have pharmacologically intractable 127 00:05:34,060 --> 00:05:35,270 epilepsy. 128 00:05:35,270 --> 00:05:37,670 So sorry this is kind of gross. 
129 00:05:37,670 --> 00:05:40,780 But when people are having seizures, 130 00:05:40,780 --> 00:05:42,670 if surgeons want to resect that area, 131 00:05:42,670 --> 00:05:44,350 they first have to map very carefully 132 00:05:44,350 --> 00:05:46,067 where the seizures are coming from, 133 00:05:46,067 --> 00:05:48,400 and what else is around there, to make sure that they're 134 00:05:48,400 --> 00:05:49,930 helping the patient. 135 00:05:49,930 --> 00:05:53,140 So to do that, they place a grid of electrodes on the surface 136 00:05:53,140 --> 00:05:55,900 of the subject's cortex. 137 00:05:55,900 --> 00:05:59,410 And then leave that there often for a week, 138 00:05:59,410 --> 00:06:02,890 for several days, while they do different types of mapping 139 00:06:02,890 --> 00:06:04,040 in that area. 140 00:06:04,040 --> 00:06:06,610 So this provides the opportunity for scientists like Gabriel 141 00:06:06,610 --> 00:06:12,260 to then go and test the neural activity in those humans. 142 00:06:12,260 --> 00:06:15,160 Which is a very rare opportunity to be able to record invasively 143 00:06:15,160 --> 00:06:16,870 from humans. 144 00:06:16,870 --> 00:06:19,280 And again, since we're on the surface of the brain, 145 00:06:19,280 --> 00:06:21,010 this is not single unit activity. 146 00:06:21,010 --> 00:06:25,260 So you get something that is more similar to the LFP type 147 00:06:25,260 --> 00:06:26,590 signal. 148 00:06:26,590 --> 00:06:29,200 And then what I and many other people in the center do 149 00:06:29,200 --> 00:06:30,700 is also neuroimaging. 150 00:06:30,700 --> 00:06:33,160 So this is noninvasive, often in humans. 151 00:06:33,160 --> 00:06:35,846 Although people also do it in animals as well. 152 00:06:35,846 --> 00:06:37,720 And the main types you'll probably hear about 153 00:06:37,720 --> 00:06:40,780 at this course are MEG and EEG, which 154 00:06:40,780 --> 00:06:44,560 are very similar, and functional MRI. 
155 00:06:44,560 --> 00:06:47,500 So when many neurons fire synchronously, 156 00:06:47,500 --> 00:06:50,500 so the neurons in your cortex have the nice property 157 00:06:50,500 --> 00:06:53,230 that they're all aligned in the same orientation. 158 00:06:53,230 --> 00:06:55,360 So when they fire at the same time, 159 00:06:55,360 --> 00:06:58,234 you actually get a weak electrical current. 160 00:06:58,234 --> 00:06:59,650 And that electrical current causes 161 00:06:59,650 --> 00:07:02,940 a change in both the electric and magnetic fields around it. 162 00:07:02,940 --> 00:07:06,730 And EEG and MEG measure the changes in electric (E) 163 00:07:06,730 --> 00:07:12,010 and magnetic (M) fields from those neural firings. 164 00:07:12,010 --> 00:07:13,930 But it's usually on the order of like tens 165 00:07:13,930 --> 00:07:15,950 of millions of neurons that need to be firing. 166 00:07:15,950 --> 00:07:17,560 So we're now at a much larger scale 167 00:07:17,560 --> 00:07:21,340 than we were with the invasive recordings. 168 00:07:21,340 --> 00:07:24,160 And because the neurons all have to be firing at the same time, 169 00:07:24,160 --> 00:07:26,982 usually they're not all firing an action potential. 170 00:07:26,982 --> 00:07:29,440 Because if you remember, it was just this very brief spike. 171 00:07:29,440 --> 00:07:31,148 You're just measuring kind of the changes 172 00:07:31,148 --> 00:07:33,730 in the potentiation of that whole group 173 00:07:33,730 --> 00:07:36,190 of cortical neurons. 174 00:07:36,190 --> 00:07:38,110 So this is a very coarse measure, 175 00:07:38,110 --> 00:07:40,070 but it's a direct measure of neural firing. 176 00:07:40,070 --> 00:07:42,470 So it has very good temporal resolution. 177 00:07:42,470 --> 00:07:43,930 So the question was about, I don't 178 00:07:43,930 --> 00:07:46,450 know if everyone heard, the temporal scale of MEG. 179 00:07:46,450 --> 00:07:48,970 So it's a millisecond temporal resolution. 
180 00:07:48,970 --> 00:07:51,804 I think you can maybe even get higher. 181 00:07:51,804 --> 00:07:54,220 fMRI, on the other hand, usually has a temporal resolution 182 00:07:54,220 --> 00:07:56,620 of seconds, a couple of seconds. 183 00:07:56,620 --> 00:07:58,800 But the spatial resolution of fMRI 184 00:07:58,800 --> 00:08:00,730 is on the order of millimeters, whereas it's 185 00:08:00,730 --> 00:08:03,070 more like centimeters in MEG. 186 00:08:03,070 --> 00:08:06,820 And actually, so the problem in MEG and EEG 187 00:08:06,820 --> 00:08:08,770 is you're recording from-- 188 00:08:08,770 --> 00:08:12,860 here's a picture of the MEG scanner the subject sits in, 189 00:08:12,860 --> 00:08:15,340 and there's this helmet that goes around their head. 190 00:08:15,340 --> 00:08:17,800 And that helmet has 306 sensors. 191 00:08:17,800 --> 00:08:20,080 If it was an EEG, they would be wearing a cap. 192 00:08:20,080 --> 00:08:22,390 You've probably seen an EEG cap before, 193 00:08:22,390 --> 00:08:24,640 and the electrodes would be directly contacting 194 00:08:24,640 --> 00:08:26,290 their scalp. 195 00:08:26,290 --> 00:08:31,090 So you're measuring activity from 100 to 300 sensors, 196 00:08:31,090 --> 00:08:33,549 and often you're trying to estimate the activity 197 00:08:33,549 --> 00:08:35,409 in the cortex underneath. 198 00:08:35,409 --> 00:08:38,860 And so that is on the order of like 10,000 sources. 199 00:08:38,860 --> 00:08:40,640 And so it's a very ill-posed problem, 200 00:08:40,640 --> 00:08:42,669 meaning that there is not a unique solution 201 00:08:42,669 --> 00:08:45,010 to go from sensors to cortex. 202 00:08:45,010 --> 00:08:47,170 And so because of that, we don't actually-- that's 203 00:08:47,170 --> 00:08:49,480 why they say that the spatial scale is so poor. 204 00:08:49,480 --> 00:08:51,820 But actually it's not a well-defined problem. 
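To make the non-uniqueness concrete, here is a minimal Python sketch. It assumes a made-up toy forward model with 2 sensors and 3 sources, not the roughly 300 sensors and 10,000 sources mentioned in the talk. The names `LEADFIELD` and `sensor_readings` and all of the numbers are illustrative assumptions, not part of any real MEG pipeline.

```python
# Toy forward model: each row is one sensor, each column one cortical source.
# The weights are invented for illustration; a real forward model would come
# from head geometry and physics.
LEADFIELD = [
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
]


def sensor_readings(sources):
    """Return what each sensor measures for a given pattern of source activity."""
    return [sum(w * s for w, s in zip(row, sources)) for row in LEADFIELD]


# Two different patterns of source activity...
pattern_a = [1.0, 0.0, 1.0]
pattern_b = [0.0, 1.0, 0.0]

# ...give identical sensor readings, so the sensors alone cannot tell them apart.
print(sensor_readings(pattern_a))  # [1.0, 1.0]
print(sensor_readings(pattern_b))  # [1.0, 1.0]
```

With more unknown sources than sensors, many source patterns fit the same measurements, which is why extra constraints are needed to pick one.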
205 00:08:51,820 --> 00:08:56,620 So it's hard to even know where the activity is originating 206 00:08:56,620 --> 00:08:57,310 from. 207 00:08:57,310 --> 00:08:59,260 But that's a very active area of research 208 00:08:59,260 --> 00:09:03,190 for how you can constrain that problem with anatomy, 209 00:09:03,190 --> 00:09:06,970 and other measurements, to get better resolution. 210 00:09:06,970 --> 00:09:08,440 But still, I think people typically 211 00:09:08,440 --> 00:09:12,017 think of it as being on the order of centimeters. 212 00:09:12,017 --> 00:09:14,350 So the other main type of noninvasive neuroimaging we'll 213 00:09:14,350 --> 00:09:15,910 talk about is functional MRI. 214 00:09:15,910 --> 00:09:18,740 So here's a picture of an fMRI scanner. 215 00:09:18,740 --> 00:09:20,134 The subject is lying there, and often 216 00:09:20,134 --> 00:09:21,550 if we're doing a visual task, they 217 00:09:21,550 --> 00:09:26,570 look at stimuli on a mirror that reflects from a screen where 218 00:09:26,570 --> 00:09:28,820 we're presenting the stimuli. 219 00:09:28,820 --> 00:09:31,660 So fMRI measures the changes in blood flow 220 00:09:31,660 --> 00:09:34,420 that happen when neurons fire. 221 00:09:34,420 --> 00:09:37,610 And so as a result, this is not a direct measure. 222 00:09:37,610 --> 00:09:40,810 So this is not a direct measure of the actual neural firing. 223 00:09:40,810 --> 00:09:44,590 So it has a longer latency for the blood flow effects 224 00:09:44,590 --> 00:09:46,150 to occur. 225 00:09:46,150 --> 00:09:48,520 And so that's why it has the temporal scale that's 226 00:09:48,520 --> 00:09:50,560 more like a couple of seconds. 227 00:09:50,560 --> 00:09:53,450 But it has quite good spatial resolution. 
228 00:09:53,450 --> 00:09:55,600 There's structural MRI, which if any of you 229 00:09:55,600 --> 00:09:59,140 have ever been injured, you may have had an MRI, 230 00:09:59,140 --> 00:10:04,774 and that measures the actual-- 231 00:10:04,774 --> 00:10:06,190 it doesn't measure the blood flow, 232 00:10:06,190 --> 00:10:10,260 it measures the actual structures underneath. 233 00:10:10,260 --> 00:10:13,420 I mean, often people will do an MRI and a functional MRI, 234 00:10:13,420 --> 00:10:15,280 and co-register the two, so you have 235 00:10:15,280 --> 00:10:18,390 a very precise anatomical image that you can then 236 00:10:18,390 --> 00:10:20,610 put the brain activity on. 237 00:10:27,570 --> 00:10:28,070 OK. 238 00:10:28,070 --> 00:10:29,270 So I got into this a bit. 239 00:10:29,270 --> 00:10:35,049 So invasive electrophysiology is the highest resolution data, 240 00:10:35,049 --> 00:10:36,590 both spatially and temporally, I think, 241 00:10:36,590 --> 00:10:39,620 that most scientists collect. 242 00:10:39,620 --> 00:10:40,830 But it has some disadvantages. 243 00:10:40,830 --> 00:10:41,829 One, that it's invasive. 244 00:10:41,829 --> 00:10:44,810 So it's hard to test questions in humans. 245 00:10:44,810 --> 00:10:47,950 And just more difficult in general. 246 00:10:47,950 --> 00:10:50,600 And two, you're limited by brain coverage. 247 00:10:50,600 --> 00:10:53,000 So you can only stick a grid or an electrode 248 00:10:53,000 --> 00:10:55,470 in a couple of brain regions at once. 249 00:10:55,470 --> 00:10:57,710 So you really can't get information 250 00:10:57,710 --> 00:11:00,140 from across the whole brain, at this resolution 251 00:11:00,140 --> 00:11:02,950 with the technologies we currently have. 252 00:11:02,950 --> 00:11:05,870 fMRI, on the other hand, has broad coverage and good spatial 253 00:11:05,870 --> 00:11:08,330 resolution, but lower temporal resolution. 
254 00:11:08,330 --> 00:11:10,515 And EEG and MEG have high temporal resolution, 255 00:11:10,515 --> 00:11:13,587 broad brain coverage, but low spatial information. 256 00:11:18,260 --> 00:11:20,440 All right. 257 00:11:20,440 --> 00:11:23,150 So a bit about visual processing in the brain. 258 00:11:23,150 --> 00:11:25,690 So this is a diagram. 259 00:11:25,690 --> 00:11:27,390 Can you see the colors? 260 00:11:27,390 --> 00:11:28,510 OK. 261 00:11:28,510 --> 00:11:30,240 A little, sorry. 262 00:11:30,240 --> 00:11:33,440 Of roughly what people think of as visual cortex. 263 00:11:33,440 --> 00:11:37,150 So the blue in the back is primary visual cortex, or V1, 264 00:11:37,150 --> 00:11:39,190 that's the earliest cortical stage where 265 00:11:39,190 --> 00:11:42,100 visual signals originate. 266 00:11:42,100 --> 00:11:44,980 And then there is what's known as the ventral stream, which 267 00:11:44,980 --> 00:11:46,720 is often called the "what" pathway, 268 00:11:46,720 --> 00:11:49,679 or where people roughly believe object recognition occurs. 269 00:11:49,679 --> 00:11:51,220 And the dorsal stream, which is often 270 00:11:51,220 --> 00:11:52,960 known as the "where" pathway, which is 271 00:11:52,960 --> 00:11:58,040 thought to be more implicated in spatial information. 272 00:11:58,040 --> 00:12:00,650 However, this is an extreme oversimplification. 273 00:12:00,650 --> 00:12:03,490 I think Tommy put up this wiring diagram the other day. 274 00:12:03,490 --> 00:12:07,210 This is still a simplification, but a more realistic box 275 00:12:07,210 --> 00:12:09,310 diagram of all the different-- each box 276 00:12:09,310 --> 00:12:10,900 represents a different visual region. 277 00:12:10,900 --> 00:12:13,720 You can see that there's connections 278 00:12:13,720 --> 00:12:16,780 between all of them, between the ventral and dorsal stream. 
279 00:12:16,780 --> 00:12:19,420 And while we roughly think of it as feedforward, 280 00:12:19,420 --> 00:12:22,540 which means that the input from, the output from one layer 281 00:12:22,540 --> 00:12:24,670 serves as input to the next, often there's 282 00:12:24,670 --> 00:12:25,810 feedback connections. 283 00:12:25,810 --> 00:12:29,620 Meaning that information can flow between areas. 284 00:12:29,620 --> 00:12:32,620 So that's why it's been so challenging to probe 285 00:12:32,620 --> 00:12:33,520 with physiology. 286 00:12:36,592 --> 00:12:38,300 OK, so like I said, there are many layers 287 00:12:38,300 --> 00:12:40,133 and they are thought to be roughly organized 288 00:12:40,133 --> 00:12:44,880 hierarchically into the first level primary visual cortex. 289 00:12:44,880 --> 00:12:47,460 In that area, you have cells that respond 290 00:12:47,460 --> 00:12:50,070 to oriented lines and edges. 291 00:12:50,070 --> 00:12:51,810 So a cell will-- 292 00:12:51,810 --> 00:12:53,520 I'll show an example of this, but fire 293 00:12:53,520 --> 00:12:56,490 for stimuli that it sees, that are in a certain orientation, 294 00:12:56,490 --> 00:12:58,410 in a certain place. 295 00:12:58,410 --> 00:13:02,810 And that is known as the cell's receptive field. 296 00:13:02,810 --> 00:13:05,460 And so it's often thought of as an edge detector. 297 00:13:05,460 --> 00:13:07,530 It's very analogous to a lot of edge detection 298 00:13:07,530 --> 00:13:11,044 algorithms in computer vision, for example. 299 00:13:11,044 --> 00:13:12,960 But then at what's thought to be the top layer 300 00:13:12,960 --> 00:13:16,380 of the ventral stream, inferior temporal cortex, 301 00:13:16,380 --> 00:13:19,530 cells fire in response to whole objects. 302 00:13:19,530 --> 00:13:22,121 And it's not just a specific orientation that they like. 
303 00:13:22,121 --> 00:13:24,120 They will see this-- they will fire whether they 304 00:13:24,120 --> 00:13:26,310 see this object at different positions, 305 00:13:26,310 --> 00:13:29,310 and also have some tolerance to viewpoint and scale as well. 306 00:13:34,620 --> 00:13:36,690 So a lot of what we know about the visual system 307 00:13:36,690 --> 00:13:39,920 stems from Hubel and Wiesel's seminal work in the 1960s, 308 00:13:39,920 --> 00:13:44,760 looking at cells in cat V1. 309 00:13:44,760 --> 00:13:47,870 This is the stimulus that they're showing to the cat. 310 00:13:47,870 --> 00:13:49,880 It's an anesthetized cat, and they're recording. 311 00:13:49,880 --> 00:13:52,440 So you'll hear a popping, and those pops 312 00:13:52,440 --> 00:13:55,514 are the neural activity that they're recording. 313 00:13:55,514 --> 00:13:58,470 [POPPING NOISES] 314 00:13:58,470 --> 00:14:01,020 So they're recording from a single cell right now. 315 00:14:01,020 --> 00:14:02,700 So you see, you can hear anytime they 316 00:14:02,700 --> 00:14:05,430 present that light bar, in that specific position, 317 00:14:05,430 --> 00:14:06,120 the cell fires. 318 00:14:11,260 --> 00:14:13,330 And then as soon as they move it out of the bar, 319 00:14:13,330 --> 00:14:14,450 the cell stops firing. 320 00:14:14,450 --> 00:14:16,030 So that specific cell really likes 321 00:14:16,030 --> 00:14:18,160 this bar in this orientation. 322 00:14:18,160 --> 00:14:19,750 And they called this a simple cell. 323 00:14:22,900 --> 00:14:27,270 We can fast forward a little bit. 324 00:14:27,270 --> 00:14:30,312 They also show, OK. 325 00:14:30,312 --> 00:14:32,895 And then they found that there are these other types of cells. 326 00:14:35,585 --> 00:14:38,507 [CAR HONKING] 327 00:14:42,390 --> 00:14:42,890 Sorry. 328 00:14:47,280 --> 00:14:49,764 They showed that if you rotate it, it doesn't fire at all. 
329 00:14:49,764 --> 00:14:51,180 And then they show that there were 330 00:14:51,180 --> 00:14:53,470 these other types of cells. 331 00:14:53,470 --> 00:14:55,080 This is maybe not the movie we want. 332 00:14:55,080 --> 00:14:57,210 There are other cells that fire not only 333 00:14:57,210 --> 00:15:00,810 to that specific position, but to slight shifts 334 00:15:00,810 --> 00:15:02,320 in that position as well. 335 00:15:02,320 --> 00:15:04,620 And so it seems like those cells formed an aggregate 336 00:15:04,620 --> 00:15:08,160 over the simple cells, and they called those cells 337 00:15:08,160 --> 00:15:09,000 complex cells. 338 00:15:12,650 --> 00:15:15,800 And then people did similar things in mostly macaque IT. 339 00:15:15,800 --> 00:15:17,780 And so they found that in contrast 340 00:15:17,780 --> 00:15:20,300 to simple lines and edges, cells here 341 00:15:20,300 --> 00:15:21,940 fired in response to hands. 342 00:15:21,940 --> 00:15:26,260 So this is showing the cells' response here. 343 00:15:26,260 --> 00:15:29,450 So this is the number of spikes over time. 344 00:15:29,450 --> 00:15:32,870 So it fires a lot to hands. 345 00:15:32,870 --> 00:15:34,070 And it fires to that hand. 346 00:15:34,070 --> 00:15:35,600 This cell likes that hand, no matter 347 00:15:35,600 --> 00:15:37,910 what position you show it in. 348 00:15:37,910 --> 00:15:40,430 But it doesn't like these kind of other more simple objects, 349 00:15:40,430 --> 00:15:42,950 and this one is not selective for faces. 350 00:15:42,950 --> 00:15:46,130 So in IT, there are cells that are selective for very high, 351 00:15:46,130 --> 00:15:48,620 you would think of as high level objects. 352 00:15:48,620 --> 00:15:50,870 And they're tolerant to changes in those objects. 353 00:15:53,790 --> 00:15:56,910 So people have done many more sophisticated studies. 
354 00:15:56,910 --> 00:16:01,010 This is an example from Gabriel and Jim DiCarlo, 355 00:16:01,010 --> 00:16:05,760 and Chou Hung, where they showed neural decoding. 356 00:16:05,760 --> 00:16:07,550 So applying a machine learning algorithm 357 00:16:07,550 --> 00:16:11,300 to the output of many cells, that these cells were again 358 00:16:11,300 --> 00:16:12,890 very specific for certain objects, 359 00:16:12,890 --> 00:16:15,330 but invariant to different transformations. 360 00:16:15,330 --> 00:16:17,480 So in particular here, they showed this monkey face 361 00:16:17,480 --> 00:16:19,040 at different sizes. 362 00:16:19,040 --> 00:16:26,030 And they showed that the cell fired. 363 00:16:26,030 --> 00:16:28,640 There was information present in the population of neurons 364 00:16:28,640 --> 00:16:30,890 for this specific monkey face, regardless 365 00:16:30,890 --> 00:16:32,690 of what size we showed it at. 366 00:16:32,690 --> 00:16:35,006 So these cells are often thought to be-- 367 00:16:35,006 --> 00:16:36,380 so it's often thought that as you 368 00:16:36,380 --> 00:16:39,960 move along the visual hierarchy, cells become more selective. 369 00:16:39,960 --> 00:16:43,340 So meaning, they like more specific objects. 370 00:16:43,340 --> 00:16:45,860 And more invariant, so more tolerant 371 00:16:45,860 --> 00:16:48,620 to changes in different transformations. 372 00:16:52,176 --> 00:16:54,050 And so the other thing I wanted to talk about 373 00:16:54,050 --> 00:16:58,220 was hierarchical feedforward. 374 00:16:58,220 --> 00:17:00,260 So computational models of the visual system, 375 00:17:00,260 --> 00:17:02,360 because Tommy mentioned this briefly, 376 00:17:02,360 --> 00:17:06,530 and I think it will tie into a lot of the computer vision 377 00:17:06,530 --> 00:17:07,910 work you'll hear about. 378 00:17:07,910 --> 00:17:10,339 So these are inspired by Hubel and Wiesel's findings 379 00:17:10,339 --> 00:17:12,300 in visual cortex. 
380 00:17:12,300 --> 00:17:16,430 So meaning-- and I'm going to talk both about the HMAX 381 00:17:16,430 --> 00:17:18,470 model, which is the model developed by Tommy 382 00:17:18,470 --> 00:17:22,400 and others in his lab which is a simpler, more 383 00:17:22,400 --> 00:17:23,839 biologically faithful model. 384 00:17:23,839 --> 00:17:25,550 But this sort of architecture is also 385 00:17:25,550 --> 00:17:27,650 true of deep learning systems that you heard a lot 386 00:17:27,650 --> 00:17:30,500 about recently, and that have had a lot of success 387 00:17:30,500 --> 00:17:33,210 in computer vision challenges. 388 00:17:33,210 --> 00:17:35,750 So if you have an input image, you 389 00:17:35,750 --> 00:17:38,150 can then have a set of simple cells. 390 00:17:38,150 --> 00:17:41,000 Again, these are inspired by Hubel and Wiesel's findings, 391 00:17:41,000 --> 00:17:43,730 so they are oriented lines and edges. 392 00:17:43,730 --> 00:17:46,520 So this cell will fire, if you have 393 00:17:46,520 --> 00:17:49,052 an edge that's oriented like this, 394 00:17:49,052 --> 00:17:50,135 at that part of the image. 395 00:17:53,650 --> 00:17:57,780 And so again, it's just a basic edge detector. 396 00:17:57,780 --> 00:18:00,320 And so these perform template matching 397 00:18:00,320 --> 00:18:04,580 between their template, which is in this case an oriented bar, 398 00:18:04,580 --> 00:18:07,775 and the input image to build up selectivity. 399 00:18:07,775 --> 00:18:09,150 And then there are complex cells. 400 00:18:09,150 --> 00:18:14,300 And these complex cells pool, or take a local aggregate measure, 401 00:18:14,300 --> 00:18:16,010 to build up invariance. 402 00:18:16,010 --> 00:18:21,140 And so what that means is if you have, say this red cell here, 403 00:18:21,140 --> 00:18:23,760 this complex cell would look at these four simple cells. 
404 00:18:23,760 --> 00:18:26,360 So you are now selective to that oriented line, 405 00:18:26,360 --> 00:18:29,486 not just at this position, but at all of these positions. 406 00:18:29,486 --> 00:18:30,860 And that gives you some tolerance 407 00:18:30,860 --> 00:18:32,090 to changes in position. 408 00:18:32,090 --> 00:18:34,140 So you'd be able to recognize the same object, 409 00:18:34,140 --> 00:18:35,690 whether it had this feature. 410 00:18:35,690 --> 00:18:40,435 Whether it was presented at this corner or in a local area. 411 00:18:40,435 --> 00:18:41,810 And so the way you do that is you 412 00:18:41,810 --> 00:18:45,500 take a max over the response of all those input cells. 413 00:18:45,500 --> 00:18:47,390 And then you can repeat this for many layers 414 00:18:47,390 --> 00:18:50,090 and, it's essentially the same thing 415 00:18:50,090 --> 00:18:54,240 as a multilayer convolutional neural network. 416 00:18:54,240 --> 00:18:57,020 And at the end, in this HMAX model, 417 00:18:57,020 --> 00:18:59,700 you take a global max over all scales and positions. 418 00:18:59,700 --> 00:19:03,320 So, in theory, you have all these more complex features 419 00:19:03,320 --> 00:19:05,840 that you can now respond to, regardless of where 420 00:19:05,840 --> 00:19:09,040 in the image and how large they're presented.
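The simple-cell and complex-cell stages described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the actual HMAX implementation: the 3x3 `VERTICAL_BAR` template, the 6x6 binary images, the plain dot-product matching, and the non-overlapping 2x2 pooling are all simplifications chosen for readability, and every function name here is hypothetical.

```python
# Toy simple-cell -> complex-cell sketch: template matching, then max pooling.
# Template, image size, and pooling size are invented for illustration.
VERTICAL_BAR = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]


def simple_cell_layer(image, template):
    """Slide the template over the image; each output is one simple cell's response."""
    th, tw = len(template), len(template[0])
    rows = len(image) - th + 1
    cols = len(image[0]) - tw + 1
    return [
        [
            sum(image[r + i][c + j] * template[i][j]
                for i in range(th) for j in range(tw))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def complex_cell_layer(responses, pool=2):
    """Each complex cell takes the max over a pool x pool block of simple cells."""
    n_rows, n_cols = len(responses), len(responses[0])
    return [
        [
            max(responses[r + i][c + j]
                for i in range(pool) for j in range(pool)
                if r + i < n_rows and c + j < n_cols)
            for c in range(0, n_cols, pool)
        ]
        for r in range(0, n_rows, pool)
    ]


def global_max(responses):
    """Final stage of the sketch: max over all positions."""
    return max(max(row) for row in responses)


def image_with_bar(column, size=6):
    """A size x size image containing a vertical bar of ones in one column."""
    return [[1 if c == column else 0 for c in range(size)] for _ in range(size)]


# The same bar, shifted by one pixel.
s1_left = simple_cell_layer(image_with_bar(1), VERTICAL_BAR)
s1_shifted = simple_cell_layer(image_with_bar(2), VERTICAL_BAR)
c1_left = complex_cell_layer(s1_left)
c1_shifted = complex_cell_layer(s1_shifted)

print(s1_left == s1_shifted)  # False: simple cells are position-specific
print(c1_left == c1_shifted)  # True: pooling tolerates the small shift
print(global_max(c1_left))    # 3: the bar fully matches the template somewhere
```

In this sketch the one-pixel shift stays inside a single pooling block, which is why the complex-cell output is unchanged. A larger shift would move the response to a neighboring complex cell, and only the global max at the end would remain the same.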