1 00:00:00,000 --> 00:00:01,964 [SQUEAKING] 2 00:00:01,964 --> 00:00:04,419 [RUSTLING] 3 00:00:04,419 --> 00:00:07,856 [CLICKING] 4 00:00:10,983 --> 00:00:12,900 NANCY KANWISHER: All right, let's get started. 5 00:00:12,900 --> 00:00:16,340 So today, we're going to talk at some length about what 6 00:00:16,340 --> 00:00:20,060 I mean by this idea of Marr's computational theory 7 00:00:20,060 --> 00:00:21,200 level of analysis. 8 00:00:21,200 --> 00:00:23,773 It's a way of asking questions about mind and brain, 9 00:00:23,773 --> 00:00:25,190 and we're going to talk about that 10 00:00:25,190 --> 00:00:26,995 in the case of color vision. 11 00:00:26,995 --> 00:00:28,370 And that's going to take a while. 12 00:00:28,370 --> 00:00:29,720 We'll go down and do the demo. 13 00:00:29,720 --> 00:00:31,670 We'll come back and talk about color vision 14 00:00:31,670 --> 00:00:34,820 and how we think about it at the level of computational theory 15 00:00:34,820 --> 00:00:37,057 and why that matters for mind and brain. 16 00:00:37,057 --> 00:00:39,140 And then we're going to start, in the second half, 17 00:00:39,140 --> 00:00:42,770 a whole session, which is going to roll into next class, 18 00:00:42,770 --> 00:00:45,860 on the methods we can use in cognitive neuroscience 19 00:00:45,860 --> 00:00:49,190 to understand the human brain, and we'll illustrate those 20 00:00:49,190 --> 00:00:50,510 with a case of face perception. 21 00:00:50,510 --> 00:00:52,910 And we'll talk about computational theory, 22 00:00:52,910 --> 00:00:55,678 lightly, very briefly, of face perception, 23 00:00:55,678 --> 00:00:57,470 what you can learn from behavioral studies, 24 00:00:57,470 --> 00:00:59,262 and what you can learn from functional MRI. 25 00:00:59,262 --> 00:01:02,000 And then we'll go on and do other methods next time. 26 00:01:02,000 --> 00:01:04,129 Everybody with the program? 27 00:01:04,129 --> 00:01:05,510 All right. 28 00:01:05,510 --> 00:01:08,787 So to back up a little, the biggest theme addressed 29 00:01:08,787 --> 00:01:10,370 in this course, the big question we're 30 00:01:10,370 --> 00:01:12,470 trying to understand in this field is, 31 00:01:12,470 --> 00:01:14,822 how does the brain give rise to the mind? 32 00:01:14,822 --> 00:01:16,280 That's really what we're in it for. 33 00:01:16,280 --> 00:01:17,930 That's why there's lots of cognitive science. 34 00:01:17,930 --> 00:01:19,940 We're trying to understand how the mind emerges 35 00:01:19,940 --> 00:01:21,470 from this physical object. 36 00:01:21,470 --> 00:01:23,420 And so for the last few lectures, 37 00:01:23,420 --> 00:01:26,780 you've been learning some stuff about the physical basis 38 00:01:26,780 --> 00:01:28,705 of the brain, what it actually looks like. 39 00:01:28,705 --> 00:01:30,080 Some of you guys got to touch it. 40 00:01:30,080 --> 00:01:33,630 I hope you thought that was half as awesome as I did. 41 00:01:33,630 --> 00:01:36,980 And we got a sense of the basic physicality of the brain 42 00:01:36,980 --> 00:01:38,960 and some of its major parts. 43 00:01:38,960 --> 00:01:40,580 But now the agenda is, how are we 44 00:01:40,580 --> 00:01:43,250 going to explain how this physical object gives rise 45 00:01:43,250 --> 00:01:45,110 to something like the mind? 46 00:01:45,110 --> 00:01:46,730 And the first problem you encounter 47 00:01:46,730 --> 00:01:48,380 is, what is the mind anyway?
48 00:01:48,380 --> 00:01:50,457 I drew it as a weird, big, amorphous cloud 49 00:01:50,457 --> 00:01:52,790 because it's just not obvious how you think about minds, 50 00:01:52,790 --> 00:01:52,970 right? 51 00:01:52,970 --> 00:01:54,080 It feels like one of those things 52 00:01:54,080 --> 00:01:56,240 like, we could even have a science of the mind. 53 00:01:56,240 --> 00:01:57,590 What is mind? 54 00:01:57,590 --> 00:02:00,110 All kind of nervous making, right? 55 00:02:00,110 --> 00:02:02,960 And so our field of cognitive science, 56 00:02:02,960 --> 00:02:04,640 over the last few decades, has come up 57 00:02:04,640 --> 00:02:07,220 with this framework for how we can think about minds, 58 00:02:07,220 --> 00:02:08,539 and this isn't even a theory. 59 00:02:08,539 --> 00:02:10,370 It's more meta than that. 60 00:02:10,370 --> 00:02:13,070 It's a framework for thinking about what a mind is, 61 00:02:13,070 --> 00:02:15,770 and the framework is the idea that the mind 62 00:02:15,770 --> 00:02:20,150 is a set of computations that extract representations. 63 00:02:20,150 --> 00:02:22,280 OK, now that's pretty abstract. 64 00:02:22,280 --> 00:02:23,780 You can think of a representation 65 00:02:23,780 --> 00:02:26,480 in your mind as anything from a percept, 66 00:02:26,480 --> 00:02:30,330 like I see motion right now, or I see color. 67 00:02:30,330 --> 00:02:32,390 And as you learned before, you might see motion 68 00:02:32,390 --> 00:02:35,670 even if there isn't actually motion in the stimulus. 69 00:02:35,670 --> 00:02:37,940 But that representation of motion in your head, 70 00:02:37,940 --> 00:02:41,090 that percept, that's a kind of mental representation. 71 00:02:41,090 --> 00:02:45,170 Or if you're thinking, why is Nancy going through this really 72 00:02:45,170 --> 00:02:45,680 basic stuff? 73 00:02:45,680 --> 00:02:46,940 She's insulting our intelligence. 74 00:02:46,940 --> 00:02:48,920 If something like that is going on in the background 75 00:02:48,920 --> 00:02:50,390 as I'm lecturing, that's a thought. 76 00:02:50,390 --> 00:02:52,640 That's a mental representation of a sort. 77 00:02:52,640 --> 00:02:55,577 Or if you're thinking, oh my god, it's after 11:00, 78 00:02:55,577 --> 00:02:57,410 and I'm not going to get to eat until 12:30. 79 00:02:57,410 --> 00:02:58,733 I'm going to starve. 80 00:02:58,733 --> 00:03:00,650 Whatever thoughts are going through your head, 81 00:03:00,650 --> 00:03:03,800 those are mental representations, too, right? 82 00:03:03,800 --> 00:03:07,470 And so the question is, how do we think about those? 83 00:03:07,470 --> 00:03:14,060 And so this idea that mental processes are computations 84 00:03:14,060 --> 00:03:16,940 and mental contents are representations 85 00:03:16,940 --> 00:03:20,510 implies that ideally, in the long run, if we really 86 00:03:20,510 --> 00:03:23,120 understood minds, we'd be able to write the code 87 00:03:23,120 --> 00:03:25,250 to do everything that minds do, right? 88 00:03:25,250 --> 00:03:28,890 And that code would work, in some sense, in the same way. 89 00:03:28,890 --> 00:03:31,070 Now, that's a tall order. 90 00:03:31,070 --> 00:03:34,160 Mostly, we can't do that yet, like not even close, 91 00:03:34,160 --> 00:03:36,380 a few little cases in perception, 92 00:03:36,380 --> 00:03:38,930 kind of sort of maybe, but mostly, we can't do that yet. 93 00:03:38,930 --> 00:03:40,290 But that's the goal. 94 00:03:40,290 --> 00:03:42,567 That's the aspiration. 
95 00:03:42,567 --> 00:03:44,150 And so the question is, how do we even 96 00:03:44,150 --> 00:03:46,400 get off the ground trying to launch 97 00:03:46,400 --> 00:03:47,870 this enterprise of coming up with 98 00:03:47,870 --> 00:03:52,910 an actual precise computational theory of what minds do? 99 00:03:52,910 --> 00:03:56,600 And the first step to that is by thinking about what is computed 100 00:03:56,600 --> 00:04:01,860 and why, and so that is the crux of David Marr's big idea, 101 00:04:01,860 --> 00:04:06,205 the brief reading assignment that I gave you guys from Marr. 102 00:04:06,205 --> 00:04:07,580 And he's talking about, how do we 103 00:04:07,580 --> 00:04:08,900 think about minds and brains? 104 00:04:08,900 --> 00:04:11,630 Step number one, what is computed and why? 105 00:04:11,630 --> 00:04:14,630 So we're going to focus on that for a bit here. 106 00:04:14,630 --> 00:04:16,740 And let's take vision, for example. 107 00:04:16,740 --> 00:04:18,980 You start with a world out there that 108 00:04:18,980 --> 00:04:20,959 sends light into your eyes. 109 00:04:20,959 --> 00:04:23,870 That's my icon of a retina, that blue thing in the back, 110 00:04:23,870 --> 00:04:24,830 the back of your eyes-- 111 00:04:24,830 --> 00:04:28,910 sends an image onto your eye, and then some magic happens. 112 00:04:28,910 --> 00:04:31,310 And then you know what you're looking at, OK? 113 00:04:31,310 --> 00:04:33,200 So that's what we're trying to understand. 114 00:04:33,200 --> 00:04:34,220 What goes on in there? 115 00:04:34,220 --> 00:04:35,660 In a sense, what is the code that 116 00:04:35,660 --> 00:04:38,120 goes on in here that takes this as an input 117 00:04:38,120 --> 00:04:41,570 and delivers that as an output, OK? 118 00:04:41,570 --> 00:04:44,210 More specifically, we can ask, as we 119 00:04:44,210 --> 00:04:46,070 did in the last couple of lectures-- 120 00:04:46,070 --> 00:04:47,960 let's take the case of visual motion. 121 00:04:47,960 --> 00:04:50,060 So suppose you're seeing a display like this, 122 00:04:50,060 --> 00:04:51,352 like something in front of you. 123 00:04:51,352 --> 00:04:53,870 Somebody jumps on a beach like that, 124 00:04:53,870 --> 00:04:55,760 and there's visual motion information. 125 00:04:55,760 --> 00:04:57,315 What are the kinds of things-- 126 00:04:57,315 --> 00:04:58,190 so that's your input. 127 00:04:58,190 --> 00:05:01,110 What are the kinds of outputs you might get from that? 128 00:05:01,110 --> 00:05:04,370 Well, to understand that, we need to know what is computed 129 00:05:04,370 --> 00:05:05,600 and why. 130 00:05:05,600 --> 00:05:06,915 So what is computed? 131 00:05:06,915 --> 00:05:07,790 Well, lots of things. 132 00:05:07,790 --> 00:05:10,570 You might see the presence of motion. 133 00:05:10,570 --> 00:05:13,180 You might see the presence of a person. 134 00:05:13,180 --> 00:05:15,130 Actually, you can detect people just 135 00:05:15,130 --> 00:05:17,050 from their pattern of motion. 136 00:05:17,050 --> 00:05:18,730 We should have done this at the demo. 137 00:05:18,730 --> 00:05:20,830 Write me a note to think about that next time. 138 00:05:20,830 --> 00:05:25,312 If we stuck a little tiny LEDs on each of my joints 139 00:05:25,312 --> 00:05:27,520 and we're in a totally black room and I jumped around 140 00:05:27,520 --> 00:05:29,452 and all you could see was those dots moving, 141 00:05:29,452 --> 00:05:30,910 you would see that it was a person. 142 00:05:30,910 --> 00:05:32,980 It would be trivially obvious. 
143 00:05:32,980 --> 00:05:35,385 So motion can give you lots of information 144 00:05:35,385 --> 00:05:37,510 aside from "something's moving" and "what direction 145 00:05:37,510 --> 00:05:40,390 is it moving?" 146 00:05:40,390 --> 00:05:42,280 You can see someone's jumping. 147 00:05:42,280 --> 00:05:45,130 That also comes from the information about motion. 148 00:05:45,130 --> 00:05:47,230 You can infer something about the health 149 00:05:47,230 --> 00:05:49,630 of this person or even their mood, 150 00:05:49,630 --> 00:05:53,110 so there's a huge range of kinds of information 151 00:05:53,110 --> 00:05:56,920 we glean from even a pretty simple stimulus attribute 152 00:05:56,920 --> 00:05:57,682 like motion. 153 00:05:57,682 --> 00:05:59,140 And so if we're going to understand 154 00:05:59,140 --> 00:06:00,610 how do we perceive motion, we first 155 00:06:00,610 --> 00:06:03,010 need to get organized about, what's the input, 156 00:06:03,010 --> 00:06:05,350 and which of those outputs are we talking about? 157 00:06:05,350 --> 00:06:09,430 And probably, the code that goes on in between in your head 158 00:06:09,430 --> 00:06:11,680 or in a computer program, if you ever figured out 159 00:06:11,680 --> 00:06:13,360 how to do that, will be quite different 160 00:06:13,360 --> 00:06:14,860 for each of those things, but that's 161 00:06:14,860 --> 00:06:18,610 the way you need to be thinking about minds. 162 00:06:18,610 --> 00:06:19,700 OK, what are the inputs? 163 00:06:19,700 --> 00:06:20,590 What are the outputs? 164 00:06:20,590 --> 00:06:23,410 And then as soon as you pose that challenge-- 165 00:06:23,410 --> 00:06:24,910 OK, let's say it's just moving dots, 166 00:06:24,910 --> 00:06:26,920 and you're trying to tell if that's a person. 167 00:06:26,920 --> 00:06:28,990 Think about, what is the code you'd write? 168 00:06:28,990 --> 00:06:30,820 Just these moving dots. 169 00:06:30,820 --> 00:06:32,830 How the hell are you going to go from that 170 00:06:32,830 --> 00:06:36,190 to detecting if those dots are on the joints of a person who's 171 00:06:36,190 --> 00:06:38,260 moving around versus on something else? 172 00:06:38,260 --> 00:06:41,080 That's how you think, what are the computational challenges 173 00:06:41,080 --> 00:06:42,473 involved, OK? 174 00:06:42,473 --> 00:06:44,890 And I'm not going to ever ask you guys to write that code. 175 00:06:44,890 --> 00:06:49,120 We're just going to consider it as a thought enterprise 176 00:06:49,120 --> 00:06:52,570 to kind of see what the problem is that the brain is facing, 177 00:06:52,570 --> 00:06:54,580 that it's solving. 178 00:06:54,580 --> 00:06:57,220 OK, and so Marr's big idea is this whole business 179 00:06:57,220 --> 00:06:59,112 of thinking about what is computed and why, 180 00:06:59,112 --> 00:07:00,820 what the inputs and outputs are, and what 181 00:07:00,820 --> 00:07:03,970 the computational challenges are getting from those inputs 182 00:07:03,970 --> 00:07:06,190 to those outputs, that all of that 183 00:07:06,190 --> 00:07:10,360 is a prerequisite for thinking about minds or brains, OK? 184 00:07:10,360 --> 00:07:13,330 So we can't understand what brains are doing until we first 185 00:07:13,330 --> 00:07:14,200 think about this. 186 00:07:14,200 --> 00:07:16,657 That's why I'm carrying on about this at some length. 
187 00:07:16,657 --> 00:07:18,490 And Marr writes so beautifully that I'm just 188 00:07:18,490 --> 00:07:21,130 going to read some of my favorite paragraphs 189 00:07:21,130 --> 00:07:24,550 because paraphrasing beautiful prose is a sin. 190 00:07:24,550 --> 00:07:26,980 So Marr says, "Trying to understand perception 191 00:07:26,980 --> 00:07:28,960 by studying only neurons is like trying 192 00:07:28,960 --> 00:07:32,020 to understand bird flight by studying only feathers. 193 00:07:32,020 --> 00:07:33,790 It just can't be done. 194 00:07:33,790 --> 00:07:35,320 To understand bird flight, you need 195 00:07:35,320 --> 00:07:37,330 to understand aerodynamics. 196 00:07:37,330 --> 00:07:40,450 Only then can one make sense of the structure of feathers 197 00:07:40,450 --> 00:07:42,190 and the shape of wings. 198 00:07:42,190 --> 00:07:44,440 Similarly, you can't reach an understanding 199 00:07:44,440 --> 00:07:46,030 of why neurons in the visual system 200 00:07:46,030 --> 00:07:48,970 behave the way they do just by studying their anatomy 201 00:07:48,970 --> 00:07:50,740 and physiology," OK? 202 00:07:50,740 --> 00:07:53,200 You have to understand the problem that's being solved, 203 00:07:53,200 --> 00:07:54,220 OK? 204 00:07:54,220 --> 00:07:57,370 Further, he says, "The nature of the computations 205 00:07:57,370 --> 00:07:59,710 that underlie perception depends more 206 00:07:59,710 --> 00:08:01,150 on the computational problems that 207 00:08:01,150 --> 00:08:04,630 have to be solved than on the particular hardware in which 208 00:08:04,630 --> 00:08:06,705 their solutions are implemented." 209 00:08:06,705 --> 00:08:08,080 So he's basically saying we could 210 00:08:08,080 --> 00:08:11,020 have a theory of any aspect of perception 211 00:08:11,020 --> 00:08:13,422 that would be essentially the same theory whether you 212 00:08:13,422 --> 00:08:15,130 write it in code and put it in a computer 213 00:08:15,130 --> 00:08:17,240 or whether it's being implemented in a brain. 214 00:08:17,240 --> 00:08:17,740 Yeah. 215 00:08:17,740 --> 00:08:20,507 AUDIENCE: Was Marr an engineer? 216 00:08:20,507 --> 00:08:22,090 NANCY KANWISHER: Marr was many things. 217 00:08:22,090 --> 00:08:24,100 He was a visionary, a visionary who 218 00:08:24,100 --> 00:08:27,190 studied vision, a truly brilliant guy 219 00:08:27,190 --> 00:08:29,230 with a very strong engineering background. 220 00:08:29,230 --> 00:08:32,260 And this is now pervading the whole field 221 00:08:32,260 --> 00:08:34,630 of cognitive science, that people take an engineering 222 00:08:34,630 --> 00:08:36,940 approach to understanding minds and brains, 223 00:08:36,940 --> 00:08:40,070 to try to really understand how they work. 224 00:08:40,070 --> 00:08:42,250 OK, so to better understand this, 225 00:08:42,250 --> 00:08:45,070 we're going to now consider the case of color vision. 226 00:08:45,070 --> 00:08:47,470 And so in this case, we start with color 227 00:08:47,470 --> 00:08:51,110 in the world that sends images onto the back of your retina, 228 00:08:51,110 --> 00:08:52,960 so magic happens. 229 00:08:52,960 --> 00:08:55,295 And we get a bunch of information out. 230 00:08:55,295 --> 00:08:56,920 So the question we're going to consider 231 00:08:56,920 --> 00:09:00,220 is, what do we use color for, OK? 232 00:09:00,220 --> 00:09:01,960 And we're going to use the same strategy 233 00:09:01,960 --> 00:09:03,565 we used in the Edgerton Center. 
234 00:09:03,565 --> 00:09:05,440 We're trying to understand some of the things 235 00:09:05,440 --> 00:09:09,730 that we use color for by experiencing perception 236 00:09:09,730 --> 00:09:11,320 without color, OK? 237 00:09:11,320 --> 00:09:12,550 What are the outputs? 238 00:09:12,550 --> 00:09:15,310 OK, so to do that, we're going to head over right now 239 00:09:15,310 --> 00:09:17,050 to the imaging center, and we're going 240 00:09:17,050 --> 00:09:19,600 to have a cool demo by Rosa Lafer-Sousa. 241 00:09:19,600 --> 00:09:23,328 So if it's going to be faster to leave your stuff here-- 242 00:09:23,328 --> 00:09:23,870 I don't know. 243 00:09:23,870 --> 00:09:24,460 Maybe we should-- yeah? 244 00:09:24,460 --> 00:09:24,940 AUDIENCE: [INAUDIBLE] 245 00:09:24,940 --> 00:09:26,970 NANCY KANWISHER: Yeah, we'll lock the room, OK? 246 00:09:26,970 --> 00:09:27,470 Yeah? 247 00:09:27,470 --> 00:09:29,560 AUDIENCE: How long are we going to be there? 248 00:09:29,560 --> 00:09:30,950 NANCY KANWISHER: 10 minutes, something like that, 249 00:09:30,950 --> 00:09:32,860 and I need everyone to boogie because there's a lot of stuff 250 00:09:32,860 --> 00:09:34,027 I want to get through today. 251 00:09:34,027 --> 00:09:37,080 So let's go. 252 00:09:37,080 --> 00:09:42,510 All right, so what do we use color for when we have it? 253 00:09:42,510 --> 00:09:43,690 It's not a trick question. 254 00:09:43,690 --> 00:09:47,500 It's supposed to be really obvious now. 255 00:09:47,500 --> 00:09:49,170 Yeah, what's your name? 256 00:09:49,170 --> 00:09:49,930 AUDIENCE: Chardon. 257 00:09:49,930 --> 00:09:51,138 NANCY KANWISHER: Chardon, hi. 258 00:09:51,138 --> 00:09:53,170 AUDIENCE: Like choosing which food to eat. 259 00:09:53,170 --> 00:09:54,700 NANCY KANWISHER: Yeah, yeah. 260 00:09:54,700 --> 00:09:56,590 Choosing which. 261 00:09:56,590 --> 00:09:59,830 What else related to that but different? 262 00:09:59,830 --> 00:10:00,655 Yeah. 263 00:10:00,655 --> 00:10:01,955 AUDIENCE: Check-in procedure. 264 00:10:01,955 --> 00:10:03,580 NANCY KANWISHER: Yeah, yeah, like what? 265 00:10:03,580 --> 00:10:07,135 What did you notice that you could identify better? 266 00:10:07,135 --> 00:10:09,360 AUDIENCE: Different types of things. 267 00:10:09,360 --> 00:10:12,210 NANCY KANWISHER: Mm-hmm, but besides identifying 268 00:10:12,210 --> 00:10:13,510 and choosing, what else? 269 00:10:13,510 --> 00:10:15,218 AUDIENCE: More generally, bringing things 270 00:10:15,218 --> 00:10:17,175 into our awareness with the reds in particular 271 00:10:17,175 --> 00:10:18,480 with the strawberries. 272 00:10:18,480 --> 00:10:20,780 NANCY KANWISHER: Yeah, do you find them easier to find? 273 00:10:20,780 --> 00:10:22,183 AUDIENCE: No, much harder. 274 00:10:22,183 --> 00:10:23,850 NANCY KANWISHER: Oh, yeah, right, harder 275 00:10:23,850 --> 00:10:25,920 without the light, right. 276 00:10:25,920 --> 00:10:26,580 Exactly. 277 00:10:26,580 --> 00:10:27,240 What else. 278 00:10:29,750 --> 00:10:30,260 Yeah. 279 00:10:30,260 --> 00:10:31,250 AUDIENCE: Like driving. 280 00:10:31,250 --> 00:10:33,350 You need to have color to know the traffic lights. 281 00:10:33,350 --> 00:10:34,767 NANCY KANWISHER: Totally, totally. 282 00:10:34,767 --> 00:10:37,820 That's a modern invention but a really important one. 283 00:10:37,820 --> 00:10:40,350 What else? 284 00:10:40,350 --> 00:10:41,740 AUDIENCE: Are we general? 285 00:10:41,740 --> 00:10:43,290 Are we very general or like-- 286 00:10:43,290 --> 00:10:43,950 NANCY KANWISHER: Whatever. 
287 00:10:43,950 --> 00:10:45,030 What do we use color for? 288 00:10:45,030 --> 00:10:48,240 AUDIENCE: I mean, we used to figure out 289 00:10:48,240 --> 00:10:51,270 what to eat because one of the strawberries 290 00:10:51,270 --> 00:10:53,310 wasn't actually a strawberry. 291 00:10:53,310 --> 00:10:55,180 So yeah, I used color to [INAUDIBLE].. 292 00:10:55,180 --> 00:10:56,025 NANCY KANWISHER: Uh-huh, and the bananas. 293 00:10:56,025 --> 00:10:56,932 Did anybody notice? 294 00:10:56,932 --> 00:10:58,140 Sometimes, it's hard to tell. 295 00:10:58,140 --> 00:11:00,006 Yeah, boy in the back? 296 00:11:00,006 --> 00:11:03,180 AUDIENCE: For assessing health risks. 297 00:11:03,180 --> 00:11:04,526 NANCY KANWISHER: Say more. 298 00:11:04,526 --> 00:11:07,040 AUDIENCE: If someone's face doesn't have color in it, 299 00:11:07,040 --> 00:11:09,438 you tend to assume that they're sickly. 300 00:11:09,438 --> 00:11:10,480 NANCY KANWISHER: Totally. 301 00:11:10,480 --> 00:11:13,510 Did you feel like people's faces looked a little sickly? 302 00:11:13,510 --> 00:11:15,820 Absolutely, absolutely. 303 00:11:15,820 --> 00:11:20,560 OK, so this is just to show you that a lot of computational 304 00:11:20,560 --> 00:11:23,478 theory starts with common sense, just reasoning: what 305 00:11:23,478 --> 00:11:24,520 do we use this stuff for? 306 00:11:24,520 --> 00:11:28,070 It helps to not have it to reveal what we use it for, 307 00:11:28,070 --> 00:11:30,910 but you guys have just reinvented the key insights 308 00:11:30,910 --> 00:11:33,490 in the early field of color vision. 309 00:11:33,490 --> 00:11:36,280 OK, so the standard story is to find fruit. 310 00:11:36,280 --> 00:11:38,350 If you ask yourself how many berries are here, 311 00:11:38,350 --> 00:11:39,260 take a moment. 312 00:11:39,260 --> 00:11:40,780 Get a mental tally. 313 00:11:40,780 --> 00:11:42,100 How many berries? 314 00:11:42,100 --> 00:11:43,030 OK, ready? 315 00:11:43,030 --> 00:11:44,830 Now how many berries, OK? 316 00:11:44,830 --> 00:11:46,150 You see more. 317 00:11:46,150 --> 00:11:49,060 And in fact, there's a long literature showing that 318 00:11:49,060 --> 00:11:51,815 primates who have three cone types-- 319 00:11:51,815 --> 00:11:54,190 we're not going to go through all the physiological basis 320 00:11:54,190 --> 00:11:56,830 of cones and stuff like that, but they have a richer color 321 00:11:56,830 --> 00:12:00,010 vision because of the number of different color receptors 322 00:12:00,010 --> 00:12:01,240 in their retina-- 323 00:12:01,240 --> 00:12:02,740 they're better at finding berries. 324 00:12:02,740 --> 00:12:04,750 And in fact, a paper came out a couple of years 325 00:12:04,750 --> 00:12:07,750 ago where they studied wild macaques on an island off 326 00:12:07,750 --> 00:12:10,270 of Puerto Rico called Cayo Santiago, 327 00:12:10,270 --> 00:12:14,110 and the macaques there have a natural variation genetically 328 00:12:14,110 --> 00:12:18,130 where some of them have two color photoreceptors instead 329 00:12:18,130 --> 00:12:19,480 of three, OK? 330 00:12:19,480 --> 00:12:21,650 And in fact, they followed them around, 331 00:12:21,650 --> 00:12:26,650 and the monkeys that have three photoreceptor types 332 00:12:26,650 --> 00:12:29,380 are better at finding fruit than the ones that have only two, 333 00:12:29,380 --> 00:12:29,650 OK? 334 00:12:29,650 --> 00:12:31,900 So that story that's just been a story for a long time 335 00:12:31,900 --> 00:12:34,413 turns out it's true.
336 00:12:34,413 --> 00:12:35,830 And also, as you guys have already 337 00:12:35,830 --> 00:12:40,300 said, to not just find things but identify properties-- 338 00:12:40,300 --> 00:12:41,920 you can probably tell whether you'd 339 00:12:41,920 --> 00:12:43,587 want to eat those bananas on the bottom. 340 00:12:43,587 --> 00:12:46,510 Maybe not, but it's hard to tell on the top which ones you like. 341 00:12:46,510 --> 00:12:49,270 And yet that's all you need to know, OK? 342 00:12:49,270 --> 00:12:52,840 So these are just a few of the ways that we use color 343 00:12:52,840 --> 00:12:54,820 and why it's important. 344 00:12:54,820 --> 00:12:58,670 But there is a very big problem now that we try to figure out, 345 00:12:58,670 --> 00:13:01,810 OK, what is the code that goes between the wavelength of light 346 00:13:01,810 --> 00:13:03,730 hitting your retina and trying to figure out, 347 00:13:03,730 --> 00:13:05,240 what color is that thing? 348 00:13:05,240 --> 00:13:06,920 So here's the problem. 349 00:13:06,920 --> 00:13:09,310 We want to determine a property of the object, 350 00:13:09,310 --> 00:13:11,860 of its surface properties, its color, right? 351 00:13:11,860 --> 00:13:14,230 That's a material property of that thing. 352 00:13:14,230 --> 00:13:15,940 But all we have-- 353 00:13:15,940 --> 00:13:17,050 so here's a thing. 354 00:13:17,050 --> 00:13:18,640 We'll call that reflectance. 355 00:13:18,640 --> 00:13:21,070 It's a property of wavelength, but you can think of it as, 356 00:13:21,070 --> 00:13:22,690 for now, just a single number. 357 00:13:22,690 --> 00:13:25,330 It's a property of that surface, but all we 358 00:13:25,330 --> 00:13:30,190 have is the light coming from that object to our eyes. 359 00:13:30,190 --> 00:13:31,190 That's called luminance. 360 00:13:31,190 --> 00:13:33,357 I'm not going to test you on these particular words, 361 00:13:33,357 --> 00:13:34,780 but you should get the idea, OK? 362 00:13:34,780 --> 00:13:36,280 So that's what we have. 363 00:13:36,280 --> 00:13:37,390 That's our input. 364 00:13:37,390 --> 00:13:39,520 But here's the problem. 365 00:13:39,520 --> 00:13:41,560 The light that's coming off the object 366 00:13:41,560 --> 00:13:44,020 is a function not just of the object 367 00:13:44,020 --> 00:13:46,930 but of the nature of the light that's shining on the object. 368 00:13:46,930 --> 00:13:48,880 That's called the illuminant. 369 00:13:48,880 --> 00:13:52,210 So the problem is we have this equation. 370 00:13:52,210 --> 00:13:54,820 This light coming from the object to our eyes 371 00:13:54,820 --> 00:13:58,930 is the product of the properties of the surface and the incident 372 00:13:58,930 --> 00:14:00,760 light, OK? 373 00:14:00,760 --> 00:14:04,365 And our problem is, we have to solve for L. I'm sorry. 374 00:14:04,365 --> 00:14:06,490 We have to solve for R, the property of the object. 375 00:14:06,490 --> 00:14:09,760 Given L, what is R? 376 00:14:09,760 --> 00:14:11,950 That's a problem, OK? 377 00:14:11,950 --> 00:14:15,260 That's kind of like if I said, A times B is 48. 378 00:14:15,260 --> 00:14:18,760 Please solve for A and B, OK? 379 00:14:18,760 --> 00:14:21,940 That's known in the field as an ill-posed or underdetermined 380 00:14:21,940 --> 00:14:22,610 problem. 381 00:14:22,610 --> 00:14:26,170 We don't have enough information to uniquely solve this, OK? 382 00:14:26,170 --> 00:14:29,530 That's a very, very deep problem in perception 383 00:14:29,530 --> 00:14:30,550 and a lot of cognition. 
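To make that underdetermination concrete, here is a minimal numerical sketch in Python; it is not anything from the lecture or from Marr, and it treats reflectance R, illuminant I, and luminance L as single made-up numbers rather than as functions of wavelength.

    # The light reaching the eye is the product of surface reflectance and
    # the illuminant: L = R * I. A single measured L is consistent with
    # many different (R, I) pairs, so R cannot be recovered from L alone.
    L = 0.24                                            # what the retina measures
    candidates = [(0.8, 0.3), (0.4, 0.6), (0.2, 1.2)]   # possible (R, I) pairs
    for R, I in candidates:
        assert abs(R * I - L) < 1e-9                    # every pair gives the same L
    # Like "A times B is 48, solve for A and B": with no further assumptions,
    # there is no unique answer. But if other cues in the scene let you
    # estimate the illuminant, the problem becomes solvable:
    I_estimate = 0.6
    R_estimate = L / I_estimate                         # 0.4, reflectance recovered

That last step, estimating the illuminant from other cues and then dividing it out, is exactly the move that the car demo later in this lecture plays with.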
384 00:14:30,550 --> 00:14:35,110 We are often, in fact, most of the time in this boat, OK? 385 00:14:35,110 --> 00:14:37,120 So the implications are, when we want 386 00:14:37,120 --> 00:14:40,450 to infer reflectance of the property of the object from L, 387 00:14:40,450 --> 00:14:45,400 we must bring in other information, right? 388 00:14:45,400 --> 00:14:47,200 We must have some way to make guesses 389 00:14:47,200 --> 00:14:51,190 about I, about the light shining on that object, OK? 390 00:14:51,190 --> 00:14:53,530 So the big point is many, many inferences 391 00:14:53,530 --> 00:14:57,700 in perception and cognition are ill posed in exactly this way, 392 00:14:57,700 --> 00:14:58,510 all right? 393 00:14:58,510 --> 00:15:01,450 And so here are two other examples of ill-posed problems 394 00:15:01,450 --> 00:15:02,740 in perception. 395 00:15:02,740 --> 00:15:06,040 In shape perception, you have a similar situation. 396 00:15:06,040 --> 00:15:07,660 You have stuff in the world that's 397 00:15:07,660 --> 00:15:10,990 making an image on the back of your eyes, OK? 398 00:15:10,990 --> 00:15:13,630 That's optics. 399 00:15:13,630 --> 00:15:15,610 What we're trying to do as perceivers 400 00:15:15,610 --> 00:15:17,540 is reason backwards from that image. 401 00:15:17,540 --> 00:15:20,620 What object in the world caused that image on my retina? 402 00:15:20,620 --> 00:15:22,358 That's sometimes called inverse optics 403 00:15:22,358 --> 00:15:24,400 because you're trying to reason the opposite way. 404 00:15:24,400 --> 00:15:26,770 That's basically what we're doing in vision. 405 00:15:26,770 --> 00:15:28,960 So here's a problem. 406 00:15:28,960 --> 00:15:31,550 It's a crappy diagram, but if you can see here, 407 00:15:31,550 --> 00:15:36,760 there's three very different surface shapes here 408 00:15:36,760 --> 00:15:40,420 that are all casting the same image, for example, 409 00:15:40,420 --> 00:15:41,650 on a retina. 410 00:15:41,650 --> 00:15:44,290 You could do this with cardboard and cast it with a shadow. 411 00:15:44,290 --> 00:15:46,150 So everybody get what this shows here? 412 00:15:46,150 --> 00:15:48,400 What that means is if you start with this 413 00:15:48,400 --> 00:15:51,440 and you have to reason backwards to the shape that caused it, 414 00:15:51,440 --> 00:15:53,620 that's an ill posed problem, big time. 415 00:15:53,620 --> 00:15:55,480 It could be any of those things. 416 00:15:55,480 --> 00:15:57,490 This information doesn't constrain it. 417 00:15:57,490 --> 00:15:59,320 Does everybody see that problem? 418 00:15:59,320 --> 00:16:03,730 OK, so that's another ill-posed problem. 419 00:16:03,730 --> 00:16:06,450 Here's a totally different example of an ill-posed problem 420 00:16:06,450 --> 00:16:08,070 that's big in cognition. 421 00:16:08,070 --> 00:16:10,620 When you learn the meaning of a word, especially 422 00:16:10,620 --> 00:16:14,370 as an infant trying to learn language, the classic example 423 00:16:14,370 --> 00:16:15,540 the philosophers like-- 424 00:16:15,540 --> 00:16:16,230 God knows why. 425 00:16:16,230 --> 00:16:19,620 Philosophers like weird stuff, but never mind. 426 00:16:19,620 --> 00:16:24,390 Somebody points to that and says "gavagai," and your job 427 00:16:24,390 --> 00:16:27,810 is to figure out, what does "gavagai" mean, OK? 428 00:16:27,810 --> 00:16:31,150 So "gavagai" could mean all kinds of different things. 
429 00:16:31,150 --> 00:16:33,300 It could just mean rabbit if you already 430 00:16:33,300 --> 00:16:34,890 have a concept of a rabbit. 431 00:16:34,890 --> 00:16:36,630 It could mean fur. 432 00:16:36,630 --> 00:16:38,970 It could mean ears. 433 00:16:38,970 --> 00:16:42,000 It could mean motion if the rabbit is jumping around, 434 00:16:42,000 --> 00:16:44,550 or in the example of the philosophers love, 435 00:16:44,550 --> 00:16:47,640 it could mean undetached rabbit parts. 436 00:16:47,640 --> 00:16:50,230 Weird, but anyway, philosophers like that kind of thing. 437 00:16:50,230 --> 00:16:52,050 Anyway, the point is, it's ill posed. 438 00:16:52,050 --> 00:16:54,460 We don't know from this what is the correct meaning 439 00:16:54,460 --> 00:16:54,960 of the word. 440 00:16:54,960 --> 00:16:57,930 Does everybody see how this underdetermines 441 00:16:57,930 --> 00:16:59,340 the correct meaning of the word? 442 00:16:59,340 --> 00:17:02,820 We don't have enough information to solve it, OK? 443 00:17:02,820 --> 00:17:07,530 So yeah-- blah, blah. 444 00:17:07,530 --> 00:17:11,640 So there's a whole literature on the extra assumptions 445 00:17:11,640 --> 00:17:14,573 that infants bring to bear to constrain that problem, 446 00:17:14,573 --> 00:17:15,990 so they can make a damn good guess 447 00:17:15,990 --> 00:17:18,270 about what the actual meaning of the word is, OK? 448 00:17:18,270 --> 00:17:20,763 It's a whole big literature, quite fascinating. 449 00:17:20,763 --> 00:17:22,680 OK, but for now, I just want you to understand 450 00:17:22,680 --> 00:17:24,810 what an ill-posed problem is and why 451 00:17:24,810 --> 00:17:27,900 it's central to understanding perception and cognition. 452 00:17:27,900 --> 00:17:31,620 OK, so back to the case of color. 453 00:17:31,620 --> 00:17:35,340 As I said, the big point is that lots of inferences, including 454 00:17:35,340 --> 00:17:38,170 determining the reflectance of an object, are ill posed, 455 00:17:38,170 --> 00:17:41,340 and so we have to bring in assumptions and knowledge 456 00:17:41,340 --> 00:17:44,108 from other places, from our knowledge of the statistics 457 00:17:44,108 --> 00:17:45,900 and the physics of the world, our knowledge 458 00:17:45,900 --> 00:17:47,100 of particular objects. 459 00:17:47,100 --> 00:17:51,210 All kinds of other things must be brought to bear, OK? 460 00:17:51,210 --> 00:17:54,810 So all of that, again, is considering the problem 461 00:17:54,810 --> 00:17:57,660 of color vision at the level of Marr's computational theory. 462 00:17:57,660 --> 00:17:59,880 Notice we haven't made any measurements yet. 463 00:17:59,880 --> 00:18:01,950 We've just thought about light and optics 464 00:18:01,950 --> 00:18:05,160 and what the problem is and what we use it for, OK? 465 00:18:05,160 --> 00:18:06,470 All this stuff. 466 00:18:06,470 --> 00:18:08,190 What is extracted and why? 467 00:18:08,190 --> 00:18:11,130 R, the reflectance of an object, useful for characterizing 468 00:18:11,130 --> 00:18:12,600 objects and finding them. 469 00:18:12,600 --> 00:18:14,160 What cues are available? 470 00:18:14,160 --> 00:18:19,040 Only L, and that's a problem because it's ill posed. 471 00:18:19,040 --> 00:18:23,127 OK, next question-- so obviously, we get around, 472 00:18:23,127 --> 00:18:24,960 and we can figure out what colors are which. 
473 00:18:24,960 --> 00:18:27,510 What are the other sources of information 474 00:18:27,510 --> 00:18:30,570 that we might use in principle and that humans do use 475 00:18:30,570 --> 00:18:33,990 in practice, OK? 476 00:18:33,990 --> 00:18:36,210 And so all of that kind of stuff has 477 00:18:36,210 --> 00:18:38,130 been done without making any measurements. 478 00:18:38,130 --> 00:18:41,850 We're just thinking about the problem itself, OK? 479 00:18:41,850 --> 00:18:48,060 All right, so next, Marr's other levels of analysis, algorithm 480 00:18:48,060 --> 00:18:50,310 and representation, and hardware, are more standard ones 481 00:18:50,310 --> 00:18:51,768 you will have encountered, which is 482 00:18:51,768 --> 00:18:54,060 why I'm making a big deal of computational theory. 483 00:18:54,060 --> 00:18:57,340 It's really his major novel contribution, 484 00:18:57,340 --> 00:18:59,910 but it's better understood by contrast with these. 485 00:18:59,910 --> 00:19:02,280 So at the level of algorithm and representation, 486 00:19:02,280 --> 00:19:04,405 this is like, what is the code that you would write 487 00:19:04,405 --> 00:19:05,970 to solve that problem, right? 488 00:19:05,970 --> 00:19:09,960 And so we could ask, how does the system do what it does? 489 00:19:09,960 --> 00:19:11,880 Can we write the code to do it, and what 490 00:19:11,880 --> 00:19:15,450 assumptions and computations and representations 491 00:19:15,450 --> 00:19:17,290 would be entailed? 492 00:19:17,290 --> 00:19:21,550 So how would we find out how humans do this? 493 00:19:21,550 --> 00:19:24,600 Well, one of the ways is a slightly more organized version 494 00:19:24,600 --> 00:19:28,560 of what you guys just did, and that's called psychophysics. 495 00:19:28,560 --> 00:19:31,080 Psychophysics just means showing people stuff 496 00:19:31,080 --> 00:19:33,600 and asking them what they see or playing them sounds 497 00:19:33,600 --> 00:19:35,190 and asking them what they hear. 498 00:19:35,190 --> 00:19:37,703 And you can do it in very sophisticated, formalized ways, 499 00:19:37,703 --> 00:19:39,120 or you can do it like we just did. 500 00:19:39,120 --> 00:19:41,850 Talk to us about what the world looks like, OK? 501 00:19:41,850 --> 00:19:44,340 Usually, psychophysics means a slightly more organized 502 00:19:44,340 --> 00:19:45,960 version. 503 00:19:45,960 --> 00:19:48,490 OK, so here's an example. 504 00:19:48,490 --> 00:19:51,503 In fact, it's a cool demo, also from Rosa. 505 00:19:51,503 --> 00:19:52,920 And so what I'm going to do is I'm 506 00:19:52,920 --> 00:19:55,350 going to show you a bunch of pictures of cars, 507 00:19:55,350 --> 00:19:58,620 and your task is going to be to shout out loud as fast as you 508 00:19:58,620 --> 00:20:00,457 can the color of the car, OK? 509 00:20:00,457 --> 00:20:02,040 They're going to appear on the screen. 510 00:20:02,040 --> 00:20:03,697 Everyone ready? 511 00:20:03,697 --> 00:20:05,280 As fast as you can, shout it out loud. 512 00:20:05,280 --> 00:20:05,670 Here we go. 513 00:20:05,670 --> 00:20:05,970 What color? 514 00:20:05,970 --> 00:20:06,300 AUDIENCE: Red. 515 00:20:06,300 --> 00:20:06,716 AUDIENCE: Green. 516 00:20:06,716 --> 00:20:07,299 AUDIENCE: Red. 517 00:20:07,299 --> 00:20:07,964 AUDIENCE: Red. 518 00:20:07,964 --> 00:20:08,547 AUDIENCE: Red. 519 00:20:08,547 --> 00:20:09,422 AUDIENCE: It's green. 520 00:20:09,422 --> 00:20:10,630 NANCY KANWISHER: Interesting. 521 00:20:10,630 --> 00:20:11,675 OK, here's another one.
522 00:20:11,675 --> 00:20:12,300 AUDIENCE: Blue. 523 00:20:12,300 --> 00:20:12,735 AUDIENCE: Yellow. 524 00:20:12,735 --> 00:20:13,443 AUDIENCE: Yellow. 525 00:20:13,443 --> 00:20:15,030 NANCY KANWISHER: Uh-huh, interesting. 526 00:20:15,030 --> 00:20:15,390 Ready? 527 00:20:15,390 --> 00:20:15,690 Here we go. 528 00:20:15,690 --> 00:20:16,482 Here's another one. 529 00:20:16,482 --> 00:20:17,670 AUDIENCE: Green. 530 00:20:17,670 --> 00:20:19,353 NANCY KANWISHER: OK, here's another one. 531 00:20:19,353 --> 00:20:19,978 AUDIENCE: Blue. 532 00:20:19,978 --> 00:20:21,561 NANCY KANWISHER: Ah, you guys cottoned 533 00:20:21,561 --> 00:20:22,740 on to that pretty fast. 534 00:20:22,740 --> 00:20:26,760 OK, so good job. 535 00:20:26,760 --> 00:20:28,515 Nice consensus, although I noticed 536 00:20:28,515 --> 00:20:29,890 a little bit of transition there, 537 00:20:29,890 --> 00:20:32,140 which is very interesting. 538 00:20:32,140 --> 00:20:34,680 But here's the thing. 539 00:20:34,680 --> 00:20:37,530 All of those cars are the exact same color. 540 00:20:37,530 --> 00:20:40,330 The body of the car is the exact same in all of them, 541 00:20:40,330 --> 00:20:43,560 and if you don't believe it, I'm going to occlude everything 542 00:20:43,560 --> 00:20:45,210 except for a patch, OK? 543 00:20:45,210 --> 00:20:47,970 Here we go. 544 00:20:47,970 --> 00:20:50,580 Boom, they're all gray. 545 00:20:50,580 --> 00:20:51,090 I know. 546 00:20:51,090 --> 00:20:52,260 It's awesome. 547 00:20:52,260 --> 00:20:54,120 It's Rosa that's awesome, not me. 548 00:20:54,120 --> 00:20:57,390 I just had to bum this because it's so awesome. 549 00:20:57,390 --> 00:21:01,380 OK, so Rosa spent months designing these stimuli to test 550 00:21:01,380 --> 00:21:05,380 particular ideas about vision, but the basic demo 551 00:21:05,380 --> 00:21:06,880 is simple and straightforward. 552 00:21:06,880 --> 00:21:09,160 You can get the point here, OK? 553 00:21:09,160 --> 00:21:11,810 So what's going on here? 554 00:21:11,810 --> 00:21:14,783 What's going on here is that you guys, the algorithm running 555 00:21:14,783 --> 00:21:16,450 in your head that's trying to figure out 556 00:21:16,450 --> 00:21:18,310 what is the color of that car, is 557 00:21:18,310 --> 00:21:21,100 trying to solve the ill-posed problem, 558 00:21:21,100 --> 00:21:24,820 and it's using other information than just the luminance 559 00:21:24,820 --> 00:21:26,710 of light coming from the object. 560 00:21:26,710 --> 00:21:29,830 It's using information from the rest of the object. 561 00:21:29,830 --> 00:21:33,580 It's making inferences about I, the illuminant, the light 562 00:21:33,580 --> 00:21:35,560 hitting the object, OK? 563 00:21:35,560 --> 00:21:38,950 And in particular, when you look at that picture up there, 564 00:21:38,950 --> 00:21:41,170 what is the color of light shining on that car? 565 00:21:41,170 --> 00:21:41,920 AUDIENCE: Green. 566 00:21:41,920 --> 00:21:44,740 NANCY KANWISHER: Yeah, right, officially 567 00:21:44,740 --> 00:21:46,900 known as teal in the field, but some of you 568 00:21:46,900 --> 00:21:48,567 shouted out "green" first because that's 569 00:21:48,567 --> 00:21:51,310 what you saw first, is the color of light, OK? 570 00:21:51,310 --> 00:21:54,550 What's the color of light hitting that car? 571 00:21:54,550 --> 00:21:56,230 Yeah, purple, magenta. 572 00:21:56,230 --> 00:21:57,615 Here? 573 00:21:57,615 --> 00:21:58,240 AUDIENCE: Blue. 574 00:21:58,240 --> 00:21:59,980 NANCY KANWISHER: Yeah, and over there?
575 00:21:59,980 --> 00:22:00,820 AUDIENCE: Yellow. 576 00:22:00,820 --> 00:22:02,362 NANCY KANWISHER: Yeah, yellow-orange. 577 00:22:02,362 --> 00:22:06,220 Yeah, OK, so basically what your visual system did 578 00:22:06,220 --> 00:22:11,230 is look quickly, figure out the color of the incident light, I, 579 00:22:11,230 --> 00:22:14,980 and use that to solve the otherwise ill-posed problem 580 00:22:14,980 --> 00:22:17,770 of solving for R, the color of the car. 581 00:22:17,770 --> 00:22:19,960 And in this case, this demo shows 582 00:22:19,960 --> 00:22:22,450 that if you just change the color of the illuminant light 583 00:22:22,450 --> 00:22:25,948 and hold constant the actual wavelengths coming 584 00:22:25,948 --> 00:22:28,240 from that patch, you can radically change the perceived 585 00:22:28,240 --> 00:22:29,530 color of the car. 586 00:22:29,530 --> 00:22:30,820 Everyone got that? 587 00:22:30,820 --> 00:22:31,320 OK. 588 00:22:31,320 --> 00:22:31,900 AUDIENCE: So wait, wait, wait-- 589 00:22:31,900 --> 00:22:32,270 NANCY KANWISHER: Yeah. 590 00:22:32,270 --> 00:22:34,525 AUDIENCE: If you ran this through a computer, 591 00:22:34,525 --> 00:22:38,260 asked it to get the intensity of the pixel 592 00:22:38,260 --> 00:22:41,200 on the hood of the car there, it would not correspond to yellow, 593 00:22:41,200 --> 00:22:42,445 it would correspond to green? 594 00:22:42,445 --> 00:22:44,320 NANCY KANWISHER: Well, it depends what you're 595 00:22:44,320 --> 00:22:46,030 asking the computer exactly. 596 00:22:46,030 --> 00:22:47,923 If you hold up a spectrophotometer that's 597 00:22:47,923 --> 00:22:49,840 just going to measure the wavelength of light, 598 00:22:49,840 --> 00:22:51,520 they're all gray. 599 00:22:51,520 --> 00:22:52,910 Right there on top of those cars, 600 00:22:52,910 --> 00:22:55,270 they're all the exact same neutral gray. 601 00:22:55,270 --> 00:22:59,620 That's just the raw physical light coming from that patch, 602 00:22:59,620 --> 00:23:00,610 OK? 603 00:23:00,610 --> 00:23:03,640 But if you coded up the computer to do something smart 604 00:23:03,640 --> 00:23:06,400 and you coded it up to take other cues from the image, 605 00:23:06,400 --> 00:23:09,687 try to figure out what I is, and therefore solve for R, 606 00:23:09,687 --> 00:23:11,770 you might be able to get it to do the right thing. 607 00:23:11,770 --> 00:23:14,470 AUDIENCE: Yeah, I mean just like you looked at the pixels, 608 00:23:14,470 --> 00:23:18,310 like in the matrix, I mean, the color on the car, 609 00:23:18,310 --> 00:23:21,953 would it be what yellow is, or would it be more grays? 610 00:23:21,953 --> 00:23:22,870 NANCY KANWISHER: Gray. 611 00:23:22,870 --> 00:23:23,470 They're all gray. 612 00:23:23,470 --> 00:23:24,928 So that's what I was trying to show 613 00:23:24,928 --> 00:23:28,810 you here is that, in fact, they are actually gray, right? 614 00:23:28,810 --> 00:23:31,060 The cars are underneath there, and you can see 615 00:23:31,060 --> 00:23:32,800 they're all exactly the same. 616 00:23:32,800 --> 00:23:35,650 And they're gray, and there's no color in it, OK? 617 00:23:35,650 --> 00:23:37,420 Everyone got that?
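For anyone curious what "coding up the computer to do something smart" might look like, here is a minimal sketch of one classic trick, the gray-world assumption: guess the illuminant from the average color of the image and divide it out to estimate reflectance. This is not Rosa's code or stimuli, and real color constancy draws on many more cues than this; the scene values below are invented for illustration.

    import numpy as np

    def estimate_reflectance(image):
        # image: H x W x 3 array of linear RGB values, the luminance L at the eye.
        # Gray-world assumption: the average reflectance of the scene is neutral,
        # so the mean pixel color serves as an estimate of the illuminant I.
        illuminant = image.reshape(-1, 3).mean(axis=0)
        # Von Kries-style correction: dividing each channel by the estimated I
        # approximately recovers reflectance R, up to an overall brightness scale.
        return image / illuminant

    # Invented example: a physically gray car body and a brighter background,
    # both lit by a teal-ish illuminant.
    illum = np.array([0.3, 0.8, 0.7])                      # I, the incident light
    gray_car = np.array([0.5, 0.5, 0.5])                   # R of the car body (neutral)
    background = np.array([0.9, 0.9, 0.9])                 # R of the rest of the scene
    scene = np.stack([gray_car, background])[None] * illum # L = R * I at each pixel
    print(estimate_reflectance(scene))                     # both pixels come out neutral

Run on the car images, the same kind of function would assign the identical gray patch different reflectances depending on the illuminant estimated from each surround, which is roughly the effect you just experienced.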
618 00:23:37,420 --> 00:23:42,250 All right, OK, so all of that is a little baby 619 00:23:42,250 --> 00:23:46,270 example of psychophysics, what we 620 00:23:46,270 --> 00:23:48,520 do at the level of trying to understand the algorithms 621 00:23:48,520 --> 00:23:52,360 and representations extracted by the mind to try to figure out, 622 00:23:52,360 --> 00:23:56,320 what are the strategies that we use to solve problems 623 00:23:56,320 --> 00:23:58,210 about the visual world? 624 00:23:58,210 --> 00:24:01,960 OK, and so behavior or psychophysics or seeing, 625 00:24:01,960 --> 00:24:04,763 as you just did, can reveal those assumptions 626 00:24:04,763 --> 00:24:06,430 and reveal some of the tricks that we're 627 00:24:06,430 --> 00:24:09,040 using in the human visual system to solve 628 00:24:09,040 --> 00:24:10,912 those ill-posed problems, OK? 629 00:24:10,912 --> 00:24:12,370 So in this case, it was assumptions 630 00:24:12,370 --> 00:24:15,850 about the illuminant that enabled 631 00:24:15,850 --> 00:24:20,140 us to infer the reflectance from the luminance. 632 00:24:20,140 --> 00:24:23,290 OK, the third level Marr talks about 633 00:24:23,290 --> 00:24:25,270 is the level of hardware implementation. 634 00:24:25,270 --> 00:24:28,930 In the case of brains, that's neurons and brains, 635 00:24:28,930 --> 00:24:31,663 and so we won't cover this in any detail here. 636 00:24:31,663 --> 00:24:33,580 But there's lots and lots of work on the brain 637 00:24:33,580 --> 00:24:34,600 basis of color vision. 638 00:24:34,600 --> 00:24:36,560 We'll mention it briefly next time. 639 00:24:36,560 --> 00:24:38,440 So this is some of Rosa's work showing 640 00:24:38,440 --> 00:24:41,530 those little blue patches on the side of the monkey brain 641 00:24:41,530 --> 00:24:43,030 that are involved in color vision, 642 00:24:43,030 --> 00:24:44,620 and some work that Rosa did in my lab 643 00:24:44,620 --> 00:24:46,600 showing the bottom surface of the human brain 644 00:24:46,600 --> 00:24:49,300 with a very similar organization with those little blue patches 645 00:24:49,300 --> 00:24:52,540 in there that are particularly sensitive to color. 646 00:24:52,540 --> 00:24:55,120 So you can study brain regions that do that. 647 00:24:55,120 --> 00:24:57,280 If it's a monkey, you can stick electrodes in there 648 00:24:57,280 --> 00:24:58,960 and record from individual neurons 649 00:24:58,960 --> 00:25:01,030 and see what they code for, and you can really 650 00:25:01,030 --> 00:25:05,860 tackle at multiple levels the hardware neural basis of color 651 00:25:05,860 --> 00:25:08,260 vision in brains as well. 652 00:25:08,260 --> 00:25:11,020 OK, so the big general point is we 653 00:25:11,020 --> 00:25:14,650 need lots of levels of analysis to understand a problem 654 00:25:14,650 --> 00:25:16,960 like color vision, OK? 655 00:25:16,960 --> 00:25:19,750 And so accordingly, we need lots of methods 656 00:25:19,750 --> 00:25:22,090 to understand those things, all right? 657 00:25:22,090 --> 00:25:24,370 So what I want to do next is now launch 658 00:25:24,370 --> 00:25:27,503 into this whole thing about the different methods 659 00:25:27,503 --> 00:25:29,920 that we can use in the field, and this part of the lecture 660 00:25:29,920 --> 00:25:31,190 will go on to next time. 661 00:25:31,190 --> 00:25:32,170 But let's get going. 662 00:25:32,170 --> 00:25:34,000 Everybody good with this so far? 
663 00:25:34,000 --> 00:25:38,560 All right, so we're going to use the case of face perception 664 00:25:38,560 --> 00:25:41,890 to think about the different kinds of questions 665 00:25:41,890 --> 00:25:44,650 and different levels of analysis in face perception. 666 00:25:44,650 --> 00:25:46,960 So let me start by saying, why face perception? 667 00:25:46,960 --> 00:25:48,850 Not just that I've worked on it for 20 years, 668 00:25:48,850 --> 00:25:50,710 although I'll admit that's relevant. 669 00:25:50,710 --> 00:25:52,180 There's lots of other good reasons 670 00:25:52,180 --> 00:25:56,410 beyond that, why we should care about face perception. 671 00:25:56,410 --> 00:26:00,490 So I don't have a demo that enables me to kind of put you 672 00:26:00,490 --> 00:26:03,523 in a situation where you can see everything but faces. 673 00:26:03,523 --> 00:26:04,940 That would be cool and informative 674 00:26:04,940 --> 00:26:07,450 if we could do that, but failing that, 675 00:26:07,450 --> 00:26:10,810 I can tell you about somebody who's in that situation. 676 00:26:10,810 --> 00:26:13,870 And this is a guy named Jacob Hodes. 677 00:26:13,870 --> 00:26:15,820 So this is a picture of him recently. 678 00:26:15,820 --> 00:26:17,800 I met him around a decade ago when he 679 00:26:17,800 --> 00:26:19,870 was a freshman at Swarthmore. 680 00:26:19,870 --> 00:26:23,060 And he sent me an email, and he said, 681 00:26:23,060 --> 00:26:26,920 I've just learned about face perception and the phenomenon 682 00:26:26,920 --> 00:26:29,320 of prosopagnosia, the fact that some people 683 00:26:29,320 --> 00:26:33,130 have a specific deficit in face recognition. 684 00:26:33,130 --> 00:26:37,000 And it explains everything in my life, and I want to meet you. 685 00:26:37,000 --> 00:26:39,580 And I said-- because he knew I worked on face perception. 686 00:26:39,580 --> 00:26:40,720 I said, that's awesome. 687 00:26:40,720 --> 00:26:42,340 I would love to meet you, but I got to tell you, 688 00:26:42,340 --> 00:26:43,700 I'm not going to be able to help. 689 00:26:43,700 --> 00:26:45,533 So if you're interested in chatting, please, 690 00:26:45,533 --> 00:26:48,077 please come by, but I don't want you 691 00:26:48,077 --> 00:26:50,410 to feel like I'm going to be able to do anything useful. 692 00:26:50,410 --> 00:26:51,493 He said, no, I don't care. 693 00:26:51,493 --> 00:26:53,330 I want to understand the science. 694 00:26:53,330 --> 00:26:56,380 So he comes by, and by the way, one of the things 695 00:26:56,380 --> 00:26:58,360 that people have wondered for a while is, 696 00:26:58,360 --> 00:27:00,580 are people who have particular problems with face 697 00:27:00,580 --> 00:27:02,770 recognition-- are they just socially weird? 698 00:27:02,770 --> 00:27:05,553 Are they just bizarre, maybe a little bit on the spectrum? 699 00:27:05,553 --> 00:27:06,970 They don't pay attention to faces, 700 00:27:06,970 --> 00:27:10,780 and so they don't get them very well and so forth. 701 00:27:10,780 --> 00:27:14,770 Or can they be totally normal in every other respect 702 00:27:14,770 --> 00:27:16,960 except for just face perception? 703 00:27:16,960 --> 00:27:18,250 And so I was very interested. 704 00:27:18,250 --> 00:27:21,310 I'd only emailed with this guy, and when he showed up 705 00:27:21,310 --> 00:27:24,620 in my office, within about 15 seconds, 706 00:27:24,620 --> 00:27:26,950 it's like, this is like the nicest, normalest kid you 707 00:27:26,950 --> 00:27:27,880 could ever meet. 
708 00:27:27,880 --> 00:27:32,210 Such a nice guy, so normal, socially adept, smart, 709 00:27:32,210 --> 00:27:34,967 thoughtful-- lovely, lovely person. 710 00:27:34,967 --> 00:27:36,550 So I chatted with him for a long time, 711 00:27:36,550 --> 00:27:38,260 and he told me he was then halfway 712 00:27:38,260 --> 00:27:40,610 through-- he grew up in Lynn, Massachusetts, 713 00:27:40,610 --> 00:27:43,420 and he went off to Swarthmore his freshman year. 714 00:27:43,420 --> 00:27:46,870 And he had been having a really rough time of it 715 00:27:46,870 --> 00:27:50,230 because in his hometown, he was with the same group of kids 716 00:27:50,230 --> 00:27:52,840 all the way from first grade through high school. 717 00:27:52,840 --> 00:27:56,380 And so in fact, he just can't recognize faces at all, 718 00:27:56,380 --> 00:27:57,370 never could. 719 00:27:57,370 --> 00:27:59,650 When he was a little kid, his mom used to drive him 720 00:27:59,650 --> 00:28:02,710 to the practice field, and they would sit there and come up 721 00:28:02,710 --> 00:28:05,030 with cues about, this is how you tell that's Johnny. 722 00:28:05,030 --> 00:28:06,400 He's got this weird thing about his hair, 723 00:28:06,400 --> 00:28:07,983 and this is how you tell that's Bobby. 724 00:28:07,983 --> 00:28:10,520 And they would practice and practice. 725 00:28:10,520 --> 00:28:12,850 And so he developed these clues to be 726 00:28:12,850 --> 00:28:16,360 able to figure out who was who in his small little cohort 727 00:28:16,360 --> 00:28:21,160 of kids that he knew all the way through high school. 728 00:28:21,160 --> 00:28:24,580 Then he goes off to college, and it's all these new people. 729 00:28:24,580 --> 00:28:26,100 And he's screwed. 730 00:28:26,100 --> 00:28:27,730 And he said to me that he was just 731 00:28:27,730 --> 00:28:29,722 devastated because he would go to a party, 732 00:28:29,722 --> 00:28:31,430 and he would meet someone and think, wow, 733 00:28:31,430 --> 00:28:33,362 this is a really nice person. 734 00:28:33,362 --> 00:28:35,320 I would really like to be this person's friend, 735 00:28:35,320 --> 00:28:37,028 but he would realize he would have no way 736 00:28:37,028 --> 00:28:40,180 to find that person again. 737 00:28:40,180 --> 00:28:42,912 And he's like, it's kind of oversharing 738 00:28:42,912 --> 00:28:45,370 to say when you've met somebody for 10 minutes, by the way, 739 00:28:45,370 --> 00:28:46,330 I'm not going to be able to find you. 740 00:28:46,330 --> 00:28:47,163 You have to find me. 741 00:28:47,163 --> 00:28:49,870 It's like, you just don't want to have to go there yet, right? 742 00:28:49,870 --> 00:28:53,500 So there's all kinds of things that would make it a real drag 743 00:28:53,500 --> 00:28:55,720 to not be able to recognize other faces, 744 00:28:55,720 --> 00:28:59,140 and now having said all of that, I'll say that a surprisingly 745 00:28:59,140 --> 00:29:03,880 large fraction of the population is in Jacob's situation, about 2% 746 00:29:03,880 --> 00:29:04,930 of the population. 747 00:29:04,930 --> 00:29:06,460 It would be unsurprising if there 748 00:29:06,460 --> 00:29:08,590 were one or two of you in here, and if there is, 749 00:29:08,590 --> 00:29:09,507 you can tell me later. 750 00:29:09,507 --> 00:29:11,590 I'd love to scan you. 751 00:29:11,590 --> 00:29:16,510 But about 2% of the population routinely 752 00:29:16,510 --> 00:29:19,000 fails to recognize family members, people 753 00:29:19,000 --> 00:29:22,450 they know really well, right?
754 00:29:22,450 --> 00:29:26,890 And interestingly, this is completely uncorrelated with IQ 755 00:29:26,890 --> 00:29:29,380 or with any other perceptual ability or ability 756 00:29:29,380 --> 00:29:31,270 to read or recognize scenes or anything else. 757 00:29:31,270 --> 00:29:31,770 Yeah. 758 00:29:31,770 --> 00:29:32,500 AUDIENCE: Is this the kind of thing 759 00:29:32,500 --> 00:29:34,270 where you either have it or don't have it? 760 00:29:34,270 --> 00:29:35,728 NANCY KANWISHER: Oh, good question. 761 00:29:35,728 --> 00:29:36,760 No, it's a gradation. 762 00:29:36,760 --> 00:29:39,692 So it's not that there's this 2% at the bottom who are really screwed, 763 00:29:39,692 --> 00:29:40,900 and everyone else is up here. 764 00:29:40,900 --> 00:29:44,290 It's a hugely wide distribution, and the point 765 00:29:44,290 --> 00:29:46,240 is that the bottom end of that distribution 766 00:29:46,240 --> 00:29:47,950 is really, really bad. 767 00:29:47,950 --> 00:29:49,880 They just can't do it at all. 768 00:29:49,880 --> 00:29:56,230 Similarly, the top end of that distribution is weirdly good. 769 00:29:56,230 --> 00:29:58,270 They are so good at face recognition 770 00:29:58,270 --> 00:30:00,760 that they have to hide it socially because otherwise 771 00:30:00,760 --> 00:30:02,350 people feel creeped out. 772 00:30:02,350 --> 00:30:04,060 For example, as one of those people-- 773 00:30:04,060 --> 00:30:06,160 they're called super recognizers. 774 00:30:06,160 --> 00:30:11,320 A bunch of them have been hired by investigation services 775 00:30:11,320 --> 00:30:13,210 in London recently as part of their kind 776 00:30:13,210 --> 00:30:14,530 of crime-solving unit. 777 00:30:14,530 --> 00:30:18,053 Those people are so good that one of them 778 00:30:18,053 --> 00:30:19,970 said to me-- we scanned a few of these people. 779 00:30:19,970 --> 00:30:24,040 One of them said, if I-- 780 00:30:24,040 --> 00:30:26,800 she recounted this event where she's 781 00:30:26,800 --> 00:30:29,050 standing in line waiting for movie tickets, 782 00:30:29,050 --> 00:30:31,780 and she realizes that the person in front of her in line 783 00:30:31,780 --> 00:30:34,660 was sitting at the next table over at a cafe four 784 00:30:34,660 --> 00:30:35,980 years before. 785 00:30:35,980 --> 00:30:38,680 She says, if I share this information with that person, 786 00:30:38,680 --> 00:30:40,240 they'll be creeped out, so I've just 787 00:30:40,240 --> 00:30:41,450 learned to keep it to myself. 788 00:30:41,450 --> 00:30:43,990 But I know that was the same person, right? 789 00:30:43,990 --> 00:30:46,090 So there's a huge spread. 790 00:30:46,090 --> 00:30:47,610 You had a question a while back. 791 00:30:47,610 --> 00:30:50,110 AUDIENCE: It's not a problem with recognizing facial images, 792 00:30:50,110 --> 00:30:50,655 is it? 793 00:30:50,655 --> 00:30:53,320 So for example, Jacob, looking at a person, 794 00:30:53,320 --> 00:30:54,700 could he describe them? 795 00:30:54,700 --> 00:30:56,260 NANCY KANWISHER: Absolutely. 796 00:30:56,260 --> 00:30:57,700 He knows that it's a face. 797 00:30:57,700 --> 00:30:59,440 He can tell if they're male or female. 798 00:30:59,440 --> 00:31:02,420 He can tell if they're happy or sad. 799 00:31:02,420 --> 00:31:04,530 It looks like a face to him. 800 00:31:04,530 --> 00:31:06,940 It just doesn't look different than anyone else. 801 00:31:06,940 --> 00:31:07,950 Yeah.
802 00:31:07,950 --> 00:31:09,700 AUDIENCE: Is there any difference-- 803 00:31:09,700 --> 00:31:14,050 OK, for example, my father-- he can tell faces in person just 804 00:31:14,050 --> 00:31:19,210 fine, but when he watches videos of people, he just cannot. 805 00:31:19,210 --> 00:31:20,930 He cannot recognize faces at all, 806 00:31:20,930 --> 00:31:22,583 so is there any difference? 807 00:31:22,583 --> 00:31:24,250 NANCY KANWISHER: There are lots of cues. 808 00:31:24,250 --> 00:31:25,900 I mean, that's a very interesting exercise 809 00:31:25,900 --> 00:31:26,525 to think about. 810 00:31:26,525 --> 00:31:29,290 What are the cues that you have in person, right? 811 00:31:29,290 --> 00:31:30,770 You have all kinds of other things. 812 00:31:30,770 --> 00:31:32,350 So first of all, there's lots of constraining information. 813 00:31:32,350 --> 00:31:33,250 The person you're looking at-- there 814 00:31:33,250 --> 00:31:35,417 are all kinds of things you know about where you are 815 00:31:35,417 --> 00:31:39,040 and who that might be that help, right? 816 00:31:39,040 --> 00:31:40,630 So yeah, there's many different cues 817 00:31:40,630 --> 00:31:43,400 to face recognition that might be engaged here. 818 00:31:43,400 --> 00:31:48,400 So my point is just that face recognition matters. 819 00:31:48,400 --> 00:31:51,560 You can get by if you can't do it, but it sucks. 820 00:31:51,560 --> 00:31:53,710 It's really hard, OK? 821 00:31:53,710 --> 00:31:55,560 OK, so yes, question? 822 00:31:55,560 --> 00:31:57,310 AUDIENCE: Do you have any idea what people 823 00:31:57,310 --> 00:32:00,352 at [INAUDIBLE] university? 824 00:32:00,352 --> 00:32:02,550 Are there any sort of models that 825 00:32:02,550 --> 00:32:05,220 suggest that when there are-- 826 00:32:05,220 --> 00:32:07,060 like, when they see faces, do they just 827 00:32:07,060 --> 00:32:09,327 see a sort of conglomerate of shapes? 828 00:32:09,327 --> 00:32:11,660 NANCY KANWISHER: No, they see the structure of the face. 829 00:32:11,660 --> 00:32:13,060 They see a proper face. 830 00:32:13,060 --> 00:32:15,460 If the eye was in the wrong place, they would know. 831 00:32:15,460 --> 00:32:18,220 They absolutely know the structure of the face. 832 00:32:18,220 --> 00:32:19,690 They all look kind of the same. 833 00:32:22,430 --> 00:32:24,398 By the way, we won't have time to talk 834 00:32:24,398 --> 00:32:25,940 about this in any detail, but there's 835 00:32:25,940 --> 00:32:28,160 a well-known fact that probably many of you guys 836 00:32:28,160 --> 00:32:30,860 have experienced, which is called the other-race effect. 837 00:32:30,860 --> 00:32:33,530 And that is the fact that they all look the same. 838 00:32:33,530 --> 00:32:36,830 Whoever they are, if you have less experience looking 839 00:32:36,830 --> 00:32:39,140 at that group of people, you're less well 840 00:32:39,140 --> 00:32:41,360 able to tell them apart, OK? 841 00:32:41,360 --> 00:32:43,490 I have this problem teaching all the time. 842 00:32:43,490 --> 00:32:46,278 I grew up in a rural lily-white community. 843 00:32:46,278 --> 00:32:48,320 My face recognition is not so good to begin with, 844 00:32:48,320 --> 00:32:51,080 and it's really not good for non-Caucasian faces. 845 00:32:51,080 --> 00:32:52,340 It's embarrassing as hell. 846 00:32:52,340 --> 00:32:53,450 It feels disrespectful. 847 00:32:53,450 --> 00:32:55,310 I hate it. 
848 00:32:55,310 --> 00:32:56,750 I fault myself, but actually, it's 849 00:32:56,750 --> 00:32:58,430 just a fact of the perceptual system. 850 00:32:58,430 --> 00:33:01,970 Your perceptual system is tuned to the statistics of its input, 851 00:33:01,970 --> 00:33:06,540 and it's not so plastic later in life. 852 00:33:06,540 --> 00:33:10,850 And so a way to simulate a version that some of you 853 00:33:10,850 --> 00:33:13,670 may have experienced is whatever race of faces 854 00:33:13,670 --> 00:33:15,777 you have less experience with, if you 855 00:33:15,777 --> 00:33:17,360 find those people hard to distinguish, 856 00:33:17,360 --> 00:33:19,100 it's not that you can't tell it's a face. 857 00:33:19,100 --> 00:33:22,188 It's not that you wouldn't be able to tell if the nose was 858 00:33:22,188 --> 00:33:22,980 in the wrong place. 859 00:33:22,980 --> 00:33:25,290 It's just hard to tell one person from another, 860 00:33:25,290 --> 00:33:26,510 so it's a lot like that. 861 00:33:26,510 --> 00:33:28,430 I really need to get going, so I'll take one more question 862 00:33:28,430 --> 00:33:28,930 and go. 863 00:33:28,930 --> 00:33:31,820 AUDIENCE: Wait, could you kind of use an analogy? 864 00:33:31,820 --> 00:33:34,370 It's like being able to tell people apart 865 00:33:34,370 --> 00:33:39,830 by their hands or something to the point that you just-- 866 00:33:39,830 --> 00:33:42,680 you can't really tell people apart by their hands, 867 00:33:42,680 --> 00:33:45,600 usually, so is that kind of how people with face blindness 868 00:33:45,600 --> 00:33:46,100 feel? 869 00:33:46,100 --> 00:33:47,150 It's just looking at [INAUDIBLE].. 870 00:33:47,150 --> 00:33:48,900 NANCY KANWISHER: That's all you had, yeah. 871 00:33:48,900 --> 00:33:50,013 Yeah, probably, probably. 872 00:33:50,013 --> 00:33:52,430 Yeah, and there is, by the way, an interesting literature. 873 00:33:52,430 --> 00:33:54,500 You show people photographs of their own hand 874 00:33:54,500 --> 00:33:55,880 and a bunch of other hands. 875 00:33:55,880 --> 00:33:58,503 People can't pick out their own hand from-- 876 00:33:58,503 --> 00:33:59,420 so yeah, you're right. 877 00:33:59,420 --> 00:34:01,970 We're not so good at that. 878 00:34:01,970 --> 00:34:03,053 OK, I'm going to go ahead. 879 00:34:03,053 --> 00:34:04,512 If you guys are interested, I could 880 00:34:04,512 --> 00:34:06,680 post-- there's a whole fascinating literature here, 881 00:34:06,680 --> 00:34:08,090 but actually, I got dinged last year 882 00:34:08,090 --> 00:34:09,673 for talking about face recognition too 883 00:34:09,673 --> 00:34:10,730 much and prosopagnosia. 884 00:34:10,730 --> 00:34:13,132 We all heard about it in 900. 885 00:34:13,132 --> 00:34:15,590 So I took most of that out, and now you guys are asking me. 886 00:34:15,590 --> 00:34:16,820 So I don't know what the right thing is. 887 00:34:16,820 --> 00:34:19,130 But I'm going to go on, and I will put some optional readings 888 00:34:19,130 --> 00:34:20,880 online, especially if you send me an email 889 00:34:20,880 --> 00:34:22,159 and tell me to do that. 890 00:34:22,159 --> 00:34:25,940 OK, so point is, faces matter a lot. 891 00:34:25,940 --> 00:34:28,620 They matter for the quality of life. 892 00:34:28,620 --> 00:34:30,800 They're important because they convey 893 00:34:30,800 --> 00:34:33,620 a huge amount of information, not just the identity 894 00:34:33,620 --> 00:34:37,880 of the person but also their age, sex, mood, race, 895 00:34:37,880 --> 00:34:39,270 direction of attention. 
896 00:34:39,270 --> 00:34:41,150 So if I'm lecturing like this right now 897 00:34:41,150 --> 00:34:43,429 and I start doing that, you guys are going to wonder, 898 00:34:43,429 --> 00:34:44,929 what the hell's going on over there? 899 00:34:44,929 --> 00:34:46,190 Yeah, I saw a few heads turn. 900 00:34:46,190 --> 00:34:48,139 I'm just doing a little demo here, right? 901 00:34:48,139 --> 00:34:51,560 We're very attuned to where other people are looking, OK? 902 00:34:51,560 --> 00:34:54,110 So this is just one of many different social cues 903 00:34:54,110 --> 00:34:55,070 we get from faces. 904 00:34:55,070 --> 00:35:00,200 There's just an incredibly rich bunch of information in a face. 905 00:35:00,200 --> 00:35:02,660 We read in aspects of people's personality 906 00:35:02,660 --> 00:35:04,572 from the shape of their face even though it's 907 00:35:04,572 --> 00:35:06,530 been shown with some interesting recent studies 908 00:35:06,530 --> 00:35:08,480 there's absolutely nothing you can 909 00:35:08,480 --> 00:35:09,942 infer about a person's personality 910 00:35:09,942 --> 00:35:11,150 from the shape of their face. 911 00:35:11,150 --> 00:35:14,957 We all do it, and we do it in systematic ways. 912 00:35:14,957 --> 00:35:16,790 Another reason this is important-- and faces 913 00:35:16,790 --> 00:35:18,380 are some of the most common stimuli 914 00:35:18,380 --> 00:35:22,190 that we see in daily life, starting from infancy where, 915 00:35:22,190 --> 00:35:24,590 I think, about 40% of waking time, 916 00:35:24,590 --> 00:35:28,550 there's a face right in front of an infant's eyes. 917 00:35:28,550 --> 00:35:32,570 And probably, these abilities to extract all this information 918 00:35:32,570 --> 00:35:36,530 have been important throughout our primate ancestry. 919 00:35:36,530 --> 00:35:39,260 So that's just to say, there's a big space of face perception, 920 00:35:39,260 --> 00:35:41,093 and now we're going to focus in on just face 921 00:35:41,093 --> 00:35:44,990 recognition, telling who that person is, all right? 922 00:35:44,990 --> 00:35:48,530 So what questions do we want to answer about face recognition? 923 00:35:48,530 --> 00:35:52,410 Well, a whole bunch of them, and what methods do we want to use? 924 00:35:52,410 --> 00:35:54,380 So let's start with some basic questions 925 00:35:54,380 --> 00:35:56,030 about face recognition. 926 00:35:56,030 --> 00:35:58,425 Well, first, as usual, we want to know, 927 00:35:58,425 --> 00:36:00,800 what is the structure of the problem in face recognition? 928 00:36:00,800 --> 00:36:01,620 What are the inputs? 929 00:36:01,620 --> 00:36:02,495 What are the outputs? 930 00:36:02,495 --> 00:36:03,500 Why is it hard, right? 931 00:36:03,500 --> 00:36:06,260 Just as we've been doing for motion and color-- 932 00:36:06,260 --> 00:36:08,480 that's Marr's computational theory level. 933 00:36:08,480 --> 00:36:11,030 We want to know, how does face recognition actually 934 00:36:11,030 --> 00:36:12,230 work in humans? 935 00:36:12,230 --> 00:36:13,700 What computations go on? 936 00:36:13,700 --> 00:36:16,250 What representations are extracted? 937 00:36:16,250 --> 00:36:18,560 And is that answer different? 938 00:36:18,560 --> 00:36:20,540 Are we running different code in our heads 939 00:36:20,540 --> 00:36:23,690 when we recognize faces from when we recognize toasters 940 00:36:23,690 --> 00:36:27,298 and apples and dogs, OK? 
941 00:36:27,298 --> 00:36:29,840 Another facet of that-- do we have a totally different system 942 00:36:29,840 --> 00:36:31,850 for face recognition from the recognition 943 00:36:31,850 --> 00:36:33,170 of all those other things? 944 00:36:33,170 --> 00:36:35,720 If so, then we might want different theories 945 00:36:35,720 --> 00:36:38,000 of how face recognition works from our theories of how 946 00:36:38,000 --> 00:36:39,230 object recognition works. 947 00:36:41,750 --> 00:36:44,840 How quickly do we detect and recognize faces? 948 00:36:44,840 --> 00:36:47,120 That will help constrain what kinds of computations 949 00:36:47,120 --> 00:36:50,450 might be going on. 950 00:36:50,450 --> 00:36:52,790 And of course, how was face recognition 951 00:36:52,790 --> 00:36:54,920 actually implemented in neurons in brains? 952 00:36:54,920 --> 00:36:57,170 So this is just some of the big, wide-open questions 953 00:36:57,170 --> 00:36:58,500 we want to answer. 954 00:36:58,500 --> 00:37:01,670 So let's consider, what are our tools for considering 955 00:37:01,670 --> 00:37:02,390 these things? 956 00:37:02,390 --> 00:37:04,310 And you guys should all know what 957 00:37:04,310 --> 00:37:07,160 tools are available for thinking at the level of Marr's 958 00:37:07,160 --> 00:37:08,270 computational theory-- 959 00:37:08,270 --> 00:37:10,610 basically just thinking, right? 960 00:37:10,610 --> 00:37:12,110 You can collect some images, too, 961 00:37:12,110 --> 00:37:15,020 but basically to understand this, we just think. 962 00:37:15,020 --> 00:37:18,407 So for example, as I keep saying, 963 00:37:18,407 --> 00:37:20,240 at the level of Marr's computational theory, 964 00:37:20,240 --> 00:37:22,370 we want to know, what is the problem to be solved? 965 00:37:22,370 --> 00:37:23,120 What is the input? 966 00:37:23,120 --> 00:37:23,912 What is the output? 967 00:37:23,912 --> 00:37:27,100 How might you go from that input to that output, OK? 968 00:37:27,100 --> 00:37:29,410 So for example, here's a stimulus 969 00:37:29,410 --> 00:37:33,010 that might hit a retina, and then some magic happens. 970 00:37:33,010 --> 00:37:35,440 And then you just say, Julia, OK? 971 00:37:35,440 --> 00:37:38,500 So we want to know, what's going on in that magic, OK? 972 00:37:38,500 --> 00:37:41,597 And if a different image hits your retina, you go, oh, Brad. 973 00:37:41,597 --> 00:37:42,430 That is, I wouldn't. 974 00:37:42,430 --> 00:37:43,953 I live in a cave. 975 00:37:43,953 --> 00:37:45,370 I barely get out of the lab, but I 976 00:37:45,370 --> 00:37:47,662 understand that these are people most people recognize. 977 00:37:47,662 --> 00:37:48,850 That's why I use them. 978 00:37:48,850 --> 00:37:49,683 That's the question. 979 00:37:49,683 --> 00:37:51,970 What goes on here in the middle? 980 00:37:51,970 --> 00:37:54,920 And your first thought is, well, duh, easy. 981 00:37:54,920 --> 00:37:57,250 We could just make a template, a kind of store 982 00:37:57,250 --> 00:38:00,280 the pixels that match that image and take the incoming image 983 00:38:00,280 --> 00:38:02,440 and see if it exactly matches. 984 00:38:02,440 --> 00:38:06,020 And that's going to work great, right? 985 00:38:06,020 --> 00:38:07,780 No. 986 00:38:07,780 --> 00:38:08,752 Why not? 987 00:38:08,752 --> 00:38:10,600 AUDIENCE: Different angles [INAUDIBLE].. 988 00:38:10,600 --> 00:38:11,020 NANCY KANWISHER: Louder. 989 00:38:11,020 --> 00:38:12,450 AUDIENCE: Angles [INAUDIBLE]. 
990 00:38:12,450 --> 00:38:15,400 NANCY KANWISHER: Yeah, yeah, absolutely. 991 00:38:15,400 --> 00:38:17,630 That's not going to work at all. 992 00:38:17,630 --> 00:38:21,550 And the problem is that we don't just have one picture of Julia 993 00:38:21,550 --> 00:38:23,050 that we can match. 994 00:38:23,050 --> 00:38:25,900 There are loads of loads of totally different kinds 995 00:38:25,900 --> 00:38:27,850 of pictures of Julia, all of which 996 00:38:27,850 --> 00:38:31,600 we look at and immediately go, Julia, no problem, OK? 997 00:38:31,600 --> 00:38:36,190 And so that means, what is it that we're doing in our heads? 998 00:38:36,190 --> 00:38:39,790 If we're storing templates, we have to store a lot of them, 999 00:38:39,790 --> 00:38:40,900 OK? 1000 00:38:40,900 --> 00:38:44,810 So all those differences in the images-- 1001 00:38:44,810 --> 00:38:46,750 so we could memorize lots of templates. 1002 00:38:46,750 --> 00:38:50,110 Well, that has long been taken as like the reductio ad 1003 00:38:50,110 --> 00:38:50,770 absurdum. 1004 00:38:50,770 --> 00:38:52,150 That's the ridiculous hypothesis. 1005 00:38:52,150 --> 00:38:53,303 How could that be? 1006 00:38:53,303 --> 00:38:55,720 How could there be room in here to store lots of templates 1007 00:38:55,720 --> 00:38:58,750 of each person, and furthermore, how would that 1008 00:38:58,750 --> 00:39:01,540 work for people we don't know? 1009 00:39:01,540 --> 00:39:05,140 The other idea, which is very vague right now, 1010 00:39:05,140 --> 00:39:07,720 is that, well, maybe we extract something 1011 00:39:07,720 --> 00:39:11,260 that's common across all of those, maybe something 1012 00:39:11,260 --> 00:39:13,510 like the distance between the eyes, 1013 00:39:13,510 --> 00:39:16,030 something about the shape of the mouth, 1014 00:39:16,030 --> 00:39:17,800 other kinds of properties that might 1015 00:39:17,800 --> 00:39:20,560 be invariant across those images, that is, 1016 00:39:20,560 --> 00:39:23,290 that you could pull out that information from any 1017 00:39:23,290 --> 00:39:24,625 of those images, OK? 1018 00:39:24,625 --> 00:39:26,500 It's sounding very vague because it is vague. 1019 00:39:26,500 --> 00:39:27,970 Nobody knows what those would be, 1020 00:39:27,970 --> 00:39:31,180 but the idea is, maybe there's some image invariant 1021 00:39:31,180 --> 00:39:34,000 properties of a face you can get from here that you can then 1022 00:39:34,000 --> 00:39:38,230 store and use to recognize faces, OK? 1023 00:39:38,230 --> 00:39:44,600 So now to think about this, we can step back and say, OK, 1024 00:39:44,600 --> 00:39:46,390 how is this done in machines? 1025 00:39:46,390 --> 00:39:49,900 So machine face recognition didn't work well at all 1026 00:39:49,900 --> 00:39:52,240 until very recently, OK? 1027 00:39:52,240 --> 00:39:55,398 And then all of a sudden, a couple of years ago-- 1028 00:39:55,398 --> 00:39:57,190 here's another paper from the different one 1029 00:39:57,190 --> 00:39:58,550 that I showed you before. 1030 00:39:58,550 --> 00:40:01,960 This one is VGGFace, one of the major deep net systems 1031 00:40:01,960 --> 00:40:02,890 for face recognition. 1032 00:40:02,890 --> 00:40:04,667 It's widely used. 1033 00:40:04,667 --> 00:40:06,250 There was another one the year before. 1034 00:40:06,250 --> 00:40:10,600 All of this since 2014, 2015-- hugely 1035 00:40:10,600 --> 00:40:12,027 cited, widely influential. 1036 00:40:12,027 --> 00:40:13,360 They're on all your smartphones. 
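To make the template idea concrete, here is a minimal sketch in Python; it is not from the lecture or from any real face system. The 32x32 random textures are hypothetical stand-ins for photos of Julia and Brad, and the pixel shift stands in crudely for a change in viewpoint or lighting. It shows that once the image changes at all, a stored pixel template no longer favors the right person over a stranger.

```python
import numpy as np

rng = np.random.default_rng(0)

def pixel_similarity(a, b):
    # Naive template match: Pearson correlation between raw pixel values.
    a = (a.ravel() - a.mean()) / a.std()
    b = (b.ravel() - b.mean()) / b.std()
    return float(np.mean(a * b))

# Hypothetical stand-ins for photographs (random textures, not real faces).
julia_stored = rng.normal(size=(32, 32))                 # the stored template
julia_new_view = np.roll(julia_stored, 4, axis=(0, 1))   # same "person", image changed
brad = rng.normal(size=(32, 32))                         # a different "person"

print("identical image      :", pixel_similarity(julia_stored, julia_stored))    # 1.0
print("same person, new view:", pixel_similarity(julia_stored, julia_new_view))  # near 0
print("different person     :", pixel_similarity(julia_stored, brad))            # near 0
```

The "image-invariant properties" idea amounts to replacing pixel_similarity with similarity in some feature space that stays stable across those changes, which, loosely, is what a trained network like VGGFace provides.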
1037 00:40:13,360 --> 00:40:16,240 Boom, it all just happened like nearly overnight 1038 00:40:16,240 --> 00:40:19,840 with the availability of lots of images to train deep nets. 1039 00:40:19,840 --> 00:40:21,760 So now these things are extremely 1040 00:40:21,760 --> 00:40:26,240 effective and accurate, and so in some sense, 1041 00:40:26,240 --> 00:40:28,690 those networks are possible models 1042 00:40:28,690 --> 00:40:31,150 of what we're doing in our heads when we recognize faces. 1043 00:40:31,150 --> 00:40:33,160 It doesn't mean we do it in the same way, 1044 00:40:33,160 --> 00:40:34,490 but it's a possibility. 1045 00:40:34,490 --> 00:40:37,240 It's a hypothesis we could test, OK? 1046 00:40:37,240 --> 00:40:37,990 Yeah. 1047 00:40:37,990 --> 00:40:40,198 AUDIENCE: What is the current state of the literature 1048 00:40:40,198 --> 00:40:42,610 surrounding getting other information from people's faces 1049 00:40:42,610 --> 00:40:44,185 like moods or what they're-- 1050 00:40:44,185 --> 00:40:46,030 NANCY KANWISHER: Lots, lots. 1051 00:40:46,030 --> 00:40:50,680 There's conferences and machine vision competitions 1052 00:40:50,680 --> 00:40:53,855 on extracting personality properties, mood properties, 1053 00:40:53,855 --> 00:40:55,480 every possible thing you could imagine. 1054 00:40:55,480 --> 00:40:57,550 This is a huge-- a lot of people care about this. 1055 00:40:57,550 --> 00:40:59,740 It's a huge field in computer vision, 1056 00:40:59,740 --> 00:41:01,840 and it's also a huge field in cognitive science, 1057 00:41:01,840 --> 00:41:03,787 asking what humans pull from faces. 1058 00:41:03,787 --> 00:41:05,620 AUDIENCE: How much success that you've seen? 1059 00:41:05,620 --> 00:41:07,390 NANCY KANWISHER: Oh god, others would know that better than me. 1060 00:41:07,390 --> 00:41:09,535 I bet it's pretty damn good, a lot of it. 1061 00:41:09,535 --> 00:41:11,410 Yeah, yeah, I mean, these things are suddenly 1062 00:41:11,410 --> 00:41:12,970 extremely effective. 1063 00:41:12,970 --> 00:41:16,570 Yeah, OK, and there will be, by the way, later in the course-- 1064 00:41:16,570 --> 00:41:19,060 my postdoc, Katharina Dobs, who knows that literature much 1065 00:41:19,060 --> 00:41:21,910 better than I do, will talk about deep nets 1066 00:41:21,910 --> 00:41:24,470 and their application in human cognitive neuroscience. 1067 00:41:24,470 --> 00:41:26,620 And she knows a lot about the various networks 1068 00:41:26,620 --> 00:41:29,080 that process face information. 1069 00:41:29,080 --> 00:41:30,310 OK, so this is progress. 1070 00:41:30,310 --> 00:41:32,890 Now we have some kind of computational model. 1071 00:41:32,890 --> 00:41:36,730 Trouble is, nobody really has an intuitive understanding of what 1072 00:41:36,730 --> 00:41:38,650 VGGFace is actually doing. 1073 00:41:38,650 --> 00:41:39,910 You know how to train one up. 1074 00:41:39,910 --> 00:41:43,870 There it is, but we don't really understand what it's doing. 1075 00:41:43,870 --> 00:41:46,210 And further, we have no idea if what it's doing 1076 00:41:46,210 --> 00:41:49,930 is anything like what humans are doing, OK? 1077 00:41:49,930 --> 00:41:51,970 So it's progress that we have a model now 1078 00:41:51,970 --> 00:41:54,730 that we didn't have like five years ago, 1079 00:41:54,730 --> 00:41:57,550 but we still have all these questions open. 1080 00:41:57,550 --> 00:42:01,120 OK, so on this first question, what do we want to know? 
1081 00:42:01,120 --> 00:42:02,950 What we discover at the level of Marr's 1082 00:42:02,950 --> 00:42:06,350 computational theory is that a, if not the, central challenge 1083 00:42:06,350 --> 00:42:10,090 in face recognition is the huge variation across images, which 1084 00:42:10,090 --> 00:42:13,720 you know just by thinking about it or trying to write the code. 1085 00:42:13,720 --> 00:42:17,002 OK, so ooh, I'm just barely able. 1086 00:42:17,002 --> 00:42:18,460 I'm going to race along, and Anya's 1087 00:42:18,460 --> 00:42:20,252 going to tell me in five minutes to switch. 1088 00:42:20,252 --> 00:42:22,870 OK, so I want to talk just a little bit 1089 00:42:22,870 --> 00:42:23,900 about behavioral data. 1090 00:42:23,900 --> 00:42:25,480 I'll run out of time, and we'll roll 1091 00:42:25,480 --> 00:42:26,855 through this in less time, because I want 1092 00:42:26,855 --> 00:42:29,020 to include functional MRI because you guys need it 1093 00:42:29,020 --> 00:42:30,040 for the assignment. 1094 00:42:30,040 --> 00:42:32,740 OK, so how are we going to figure out what 1095 00:42:32,740 --> 00:42:34,960 humans represent about faces? 1096 00:42:34,960 --> 00:42:36,490 OK, so here we are. 1097 00:42:36,490 --> 00:42:39,160 We consider this possibility that one way 1098 00:42:39,160 --> 00:42:41,650 to solve this problem is by essentially memorizing lots 1099 00:42:41,650 --> 00:42:43,720 of templates for each person. 1100 00:42:43,720 --> 00:42:46,300 Another possibility is this kind of vague, inchoate idea 1101 00:42:46,300 --> 00:42:49,300 that maybe there's some abstract representation that'll be 1102 00:42:49,300 --> 00:42:51,370 the same across all of those. 1103 00:42:51,370 --> 00:42:54,110 How are we going to figure out which one humans do? 1104 00:42:54,110 --> 00:42:57,040 Well, if we're really memorizing lots of templates 1105 00:42:57,040 --> 00:42:59,350 for each person and that's how we recognize them 1106 00:42:59,350 --> 00:43:02,860 in all their different guises, that wouldn't work for people 1107 00:43:02,860 --> 00:43:03,760 we didn't know. 1108 00:43:03,760 --> 00:43:06,940 That is, you wouldn't be able to take two different photographs 1109 00:43:06,940 --> 00:43:10,600 of the same person and know if it's the same person or not, 1110 00:43:10,600 --> 00:43:11,530 right? 1111 00:43:11,530 --> 00:43:13,450 Because you could only do this by memorizing. 1112 00:43:13,450 --> 00:43:15,280 Does everybody get that idea? 1113 00:43:15,280 --> 00:43:17,290 Whereas whatever this other idea is, 1114 00:43:17,290 --> 00:43:20,352 it should work somewhat for novel individuals 1115 00:43:20,352 --> 00:43:21,310 you don't already know. 1116 00:43:21,310 --> 00:43:22,352 Here are two photographs. 1117 00:43:22,352 --> 00:43:24,410 Same person or different person? 1118 00:43:24,410 --> 00:43:27,790 So now let's ask, can humans do this? 1119 00:43:27,790 --> 00:43:30,430 Do we store lots of templates for individuals, 1120 00:43:30,430 --> 00:43:32,920 or can we do something more abstract? 1121 00:43:32,920 --> 00:43:35,680 Well, if we simply deal with this problem 1122 00:43:35,680 --> 00:43:38,320 by storing lots of templates for each individual, 1123 00:43:38,320 --> 00:43:41,830 maybe not literally pixel templates but some kind 1124 00:43:41,830 --> 00:43:45,520 of snapshot, then the key test is, 1125 00:43:45,520 --> 00:43:48,070 we shouldn't be able to do this matching task if we 1126 00:43:48,070 --> 00:43:49,480 don't know that person.
1127 00:43:49,480 --> 00:43:51,280 Everybody get the logic here? 1128 00:43:51,280 --> 00:43:52,720 OK, so let's try it. 1129 00:43:52,720 --> 00:43:54,760 So this paper a few years ago-- 1130 00:43:54,760 --> 00:43:55,780 Jenkins et al. 1131 00:43:55,780 --> 00:43:57,050 Asked that question. 1132 00:43:57,050 --> 00:43:58,460 So here's what they did. 1133 00:43:58,460 --> 00:44:00,550 They collected a whole bunch of photographs 1134 00:44:00,550 --> 00:44:03,430 of Dutch politicians with multiple images 1135 00:44:03,430 --> 00:44:05,680 of each politician, OK? 1136 00:44:05,680 --> 00:44:08,650 Then they gave them to people on cards, and they said, 1137 00:44:08,650 --> 00:44:11,140 there are multiple images of each person. 1138 00:44:11,140 --> 00:44:13,660 And I'm not going to tell you how many different politicians 1139 00:44:13,660 --> 00:44:14,530 are in this deck. 1140 00:44:14,530 --> 00:44:17,050 Just sort them in piles, so there's a different pile 1141 00:44:17,050 --> 00:44:19,148 for each person, OK? 1142 00:44:19,148 --> 00:44:21,190 I'm going to show you a low-tech version of this. 1143 00:44:21,190 --> 00:44:23,190 I'm going to show you a whole bunch of pictures, 1144 00:44:23,190 --> 00:44:24,670 all in one array, and you guys are 1145 00:44:24,670 --> 00:44:28,690 going to try to figure out how many people are there, OK? 1146 00:44:28,690 --> 00:44:29,470 Everybody ready? 1147 00:44:29,470 --> 00:44:30,890 I'm just going to leave it up for a few seconds. 1148 00:44:30,890 --> 00:44:32,432 There's going to be lots of pictures. 1149 00:44:32,432 --> 00:44:35,990 Your task is how many different individuals are depicted here. 1150 00:44:35,990 --> 00:44:36,490 Here we go. 1151 00:44:42,470 --> 00:44:44,030 OK, write down your best guess. 1152 00:44:44,030 --> 00:44:46,760 Just kind of look around. 1153 00:44:46,760 --> 00:44:49,910 OK, everybody got a guess? 1154 00:44:49,910 --> 00:44:53,290 OK, write down your guess. 1155 00:44:53,290 --> 00:44:56,410 OK, how many people think there were over 10 1156 00:44:56,410 --> 00:44:59,110 different individuals there? 1157 00:44:59,110 --> 00:44:59,620 One. 1158 00:44:59,620 --> 00:45:01,660 OK, how many people think over five? 1159 00:45:04,170 --> 00:45:07,350 Yeah, probably half of you. 1160 00:45:07,350 --> 00:45:10,800 How many people think over three? 1161 00:45:10,800 --> 00:45:12,165 Most of you. 1162 00:45:12,165 --> 00:45:12,930 There are two. 1163 00:45:16,860 --> 00:45:18,370 What does that mean? 1164 00:45:18,370 --> 00:45:19,830 That means you can't do it. 1165 00:45:19,830 --> 00:45:22,560 That means you can't match different images 1166 00:45:22,560 --> 00:45:27,090 of the same person if you don't know that person. 1167 00:45:27,090 --> 00:45:29,140 Pretty surprising, isn't it? 1168 00:45:29,140 --> 00:45:31,080 We think we're so awesome at face recognition 1169 00:45:31,080 --> 00:45:32,830 because most of the time, what we're doing 1170 00:45:32,830 --> 00:45:34,770 is recognizing people we know, people 1171 00:45:34,770 --> 00:45:37,410 we've seen in all different viewpoints and hair 1172 00:45:37,410 --> 00:45:41,010 arrangements and stuff. 1173 00:45:41,010 --> 00:45:42,780 If you don't have lots of opportunity 1174 00:45:42,780 --> 00:45:45,090 to store all those things and it's a novel face, 1175 00:45:45,090 --> 00:45:47,940 we're really bad at that, OK? 1176 00:45:47,940 --> 00:45:48,440 Yeah. 1177 00:45:48,440 --> 00:45:50,250 AUDIENCE: But there's a constraint of time? 
1178 00:45:50,250 --> 00:45:51,090 NANCY KANWISHER: Yeah, yeah, I was 1179 00:45:51,090 --> 00:45:52,470 trying to make the demo work. 1180 00:45:52,470 --> 00:45:54,447 But OK, so the way they do this task, 1181 00:45:54,447 --> 00:45:56,280 people have unlimited time, and they're just 1182 00:45:56,280 --> 00:45:57,360 kind of sorting them. 1183 00:45:57,360 --> 00:46:00,240 The mean number of piles that people made in this experiment 1184 00:46:00,240 --> 00:46:01,920 was 7 and 1/2. 1185 00:46:01,920 --> 00:46:03,973 Correct answer's two, OK? 1186 00:46:03,973 --> 00:46:05,640 OK, now you might say, well, maybe those 1187 00:46:05,640 --> 00:46:07,770 are shitty photographs, right? 1188 00:46:07,770 --> 00:46:09,660 OK, so here's the control. 1189 00:46:09,660 --> 00:46:11,310 Those are Dutch politicians. 1190 00:46:11,310 --> 00:46:14,550 They then did the same experiment on Dutch people who 1191 00:46:14,550 --> 00:46:16,740 look at that photograph and in about two seconds 1192 00:46:16,740 --> 00:46:21,010 say two, duh, OK? 1193 00:46:21,010 --> 00:46:23,510 So if you know there's nothing wrong with those photographs, 1194 00:46:23,510 --> 00:46:26,510 it's just a matter of whether you know those people or not. 1195 00:46:26,510 --> 00:46:31,370 OK, so the point of all of this is that this crazy story 1196 00:46:31,370 --> 00:46:34,245 that, in fact, what a lot of what we're doing-- 1197 00:46:34,245 --> 00:46:35,870 I'm sort of simplifying here, but a lot 1198 00:46:35,870 --> 00:46:37,160 of what we're doing in face recognition, 1199 00:46:37,160 --> 00:46:39,140 a lot of the way we deal with all this image 1200 00:46:39,140 --> 00:46:43,280 variability is not that we have some very abstract, fancy, 1201 00:46:43,280 --> 00:46:47,480 high-level representation of each individual face. 1202 00:46:47,480 --> 00:46:49,580 We just have lots of experience with faces, 1203 00:46:49,580 --> 00:46:52,190 and we use that so that if we have a novel 1204 00:46:52,190 --> 00:46:54,350 face that we don't have all that experience with, 1205 00:46:54,350 --> 00:46:55,400 we're not so good at it. 1206 00:46:55,400 --> 00:46:56,990 I'm going to run out of time, so I'll take one question 1207 00:46:56,990 --> 00:46:57,780 and go on. 1208 00:46:57,780 --> 00:46:59,405 AUDIENCE: How do they control for-- you 1209 00:46:59,405 --> 00:47:01,895 know the issue you said about if you don't have experience 1210 00:47:01,895 --> 00:47:04,100 with similar races. 1211 00:47:04,100 --> 00:47:05,840 NANCY KANWISHER: Yeah. 1212 00:47:05,840 --> 00:47:08,510 I'm sure whenever you do face recognition experiments, 1213 00:47:08,510 --> 00:47:11,360 you make sure that if your dominant subject 1214 00:47:11,360 --> 00:47:14,780 pool is Caucasian, you have Caucasian faces or whatever. 1215 00:47:14,780 --> 00:47:15,707 Yeah. 1216 00:47:15,707 --> 00:47:17,540 Unless it's something you don't understand-- 1217 00:47:17,540 --> 00:47:18,780 I'm going to hang around after class. 1218 00:47:18,780 --> 00:47:20,960 You can ask me questions there, or if you have to go, 1219 00:47:20,960 --> 00:47:21,980 you can email me because I really 1220 00:47:21,980 --> 00:47:23,563 want to get through this next bit, OK? 1221 00:47:28,190 --> 00:47:29,460 OK, so there we are with that. 
1222 00:47:29,460 --> 00:47:31,820 So what this suggests, kind of, sort of, 1223 00:47:31,820 --> 00:47:33,980 is that whatever we're doing, it's 1224 00:47:33,980 --> 00:47:35,690 something that benefits enormously 1225 00:47:35,690 --> 00:47:38,120 from lots and lots of experience with that individual. 1226 00:47:38,120 --> 00:47:40,760 Maybe it's not literal memorization 1227 00:47:40,760 --> 00:47:43,820 of actual pixel-like snapshots, but it's something more 1228 00:47:43,820 --> 00:47:45,590 like that than anybody would have guessed 1229 00:47:45,590 --> 00:47:49,670 before this experiment, OK? 1230 00:47:49,670 --> 00:47:54,990 OK, all right, I'm going to skip this awesome stuff here. 1231 00:47:54,990 --> 00:47:57,020 OK. 1232 00:47:57,020 --> 00:48:00,027 OK, so the benefits of-- actually, 1233 00:48:00,027 --> 00:48:02,360 I'm going to come back and do that slide next time, too. 1234 00:48:02,360 --> 00:48:04,890 And we're going to cut straight to functional MRI. 1235 00:48:04,890 --> 00:48:06,830 I'm sorry about this, but I just really want you guys to have 1236 00:48:06,830 --> 00:48:08,247 this background in case you don't. 1237 00:48:08,247 --> 00:48:09,980 You probably do. 1238 00:48:09,980 --> 00:48:12,500 So functional MRI-- another cool method 1239 00:48:12,500 --> 00:48:15,210 in cognitive neuroscience, and how would it be useful here? 1240 00:48:15,210 --> 00:48:17,150 OK, so first, what is it? 1241 00:48:17,150 --> 00:48:19,880 Functional MRI is the same as regular MRI 1242 00:48:19,880 --> 00:48:22,040 that's in probably tens of thousands 1243 00:48:22,040 --> 00:48:24,140 of hospitals around the world. 1244 00:48:24,140 --> 00:48:26,030 The big advances in functional MRI 1245 00:48:26,030 --> 00:48:28,310 were when some physicists in the early '90s 1246 00:48:28,310 --> 00:48:31,010 figured out how to take those images really fast 1247 00:48:31,010 --> 00:48:33,950 and how to make images that reflect not just 1248 00:48:33,950 --> 00:48:36,920 the density of tissue but the activity of neurons 1249 00:48:36,920 --> 00:48:38,720 at each point in the brain, OK? 1250 00:48:38,720 --> 00:48:40,190 That was big stuff, OK? 1251 00:48:40,190 --> 00:48:41,570 Early 1990s. 1252 00:48:41,570 --> 00:48:43,190 And so the reason it's a big deal 1253 00:48:43,190 --> 00:48:47,570 is that it is the best, highest spatial resolution 1254 00:48:47,570 --> 00:48:51,290 method for making pictures of human brain function 1255 00:48:51,290 --> 00:48:52,610 noninvasively. 1256 00:48:52,610 --> 00:48:55,070 That means without opening up the head, all right? 1257 00:48:55,070 --> 00:48:56,488 So that's an important thing. 1258 00:48:56,488 --> 00:48:58,530 That's why there's lots and lots of papers on it. 1259 00:48:58,530 --> 00:49:00,697 That's why we're going to spend a lot of time on it. 1260 00:49:00,697 --> 00:49:04,400 The bare basics are that the functional MRI signal that's 1261 00:49:04,400 --> 00:49:06,020 used is called the BOLD signal. 1262 00:49:06,020 --> 00:49:08,570 That stands for blood oxygenation level 1263 00:49:08,570 --> 00:49:10,640 dependent signal, OK? 1264 00:49:10,640 --> 00:49:13,010 And what that means is this. 1265 00:49:13,010 --> 00:49:15,680 The basic signal is blood flow.
1266 00:49:15,680 --> 00:49:18,200 And so the way it works is if a bunch of neurons 1267 00:49:18,200 --> 00:49:20,990 someplace in your brain start firing a lot, 1268 00:49:20,990 --> 00:49:25,340 it's metabolically expensive to make all those neurons fire, 1269 00:49:25,340 --> 00:49:26,870 and so you have to send more blood 1270 00:49:26,870 --> 00:49:28,320 to that part of the brain. 1271 00:49:28,320 --> 00:49:32,990 So it's just like if you go for a run, the muscles in your legs 1272 00:49:32,990 --> 00:49:35,960 need more blood delivered to them to supply them 1273 00:49:35,960 --> 00:49:38,030 metabolically for that increased activity, 1274 00:49:38,030 --> 00:49:40,820 and so the blood flow to your leg muscles 1275 00:49:40,820 --> 00:49:42,320 will increase, OK? 1276 00:49:42,320 --> 00:49:44,360 Well, similarly, the blood flow increases 1277 00:49:44,360 --> 00:49:46,350 to active parts of the brain. 1278 00:49:46,350 --> 00:49:49,580 Now, the weird part of it is that for reasons nobody 1279 00:49:49,580 --> 00:49:52,430 completely understands, the blood flow increase 1280 00:49:52,430 --> 00:49:54,920 more than compensates for the oxygen use, 1281 00:49:54,920 --> 00:49:56,990 so the signal is actually backwards. 1282 00:49:56,990 --> 00:50:02,390 Active parts of the brain have less, not more, 1283 00:50:02,390 --> 00:50:06,800 deoxygenated hemoglobin compared to oxygenated hemoglobin, 1284 00:50:06,800 --> 00:50:10,430 and the relevance of that is that oxygenated hemoglobin 1285 00:50:10,430 --> 00:50:12,920 and deoxygenated hemoglobin are magnetically 1286 00:50:12,920 --> 00:50:15,860 different in the way that the MRI signal can see. 1287 00:50:15,860 --> 00:50:17,930 So the basic signal you're looking at 1288 00:50:17,930 --> 00:50:20,430 is, how much oxygen is there in the blood 1289 00:50:20,430 --> 00:50:24,620 in that part of the brain, and hence, 1290 00:50:24,620 --> 00:50:26,360 how much blood flow went there? 1291 00:50:26,360 --> 00:50:28,810 And hence, how much neural activity was there? 1292 00:50:28,810 --> 00:50:30,110 Did that sort of make sense? 1293 00:50:30,110 --> 00:50:32,240 I'm not going to test you on which is paramagnetic 1294 00:50:32,240 --> 00:50:33,290 and which is diamagnetic. 1295 00:50:33,290 --> 00:50:33,980 I never remember. 1296 00:50:33,980 --> 00:50:35,270 I couldn't care less, but you should know what 1297 00:50:35,270 --> 00:50:36,800 the basic signal is, right? 1298 00:50:36,800 --> 00:50:38,960 It's a magnetic difference that results 1299 00:50:38,960 --> 00:50:40,820 from oxygenation differences that 1300 00:50:40,820 --> 00:50:42,530 result from blood flow differences that 1301 00:50:42,530 --> 00:50:43,880 result from neural activity. 1302 00:50:43,880 --> 00:50:44,653 Yeah. 1303 00:50:44,653 --> 00:50:46,710 AUDIENCE: So when there's more blood flow, 1304 00:50:46,710 --> 00:50:48,095 there's more oxygenated? 1305 00:50:48,095 --> 00:50:49,470 NANCY KANWISHER: More oxygenated, 1306 00:50:49,470 --> 00:50:53,730 and because it overcompensates for the metabolic use 1307 00:50:53,730 --> 00:50:58,860 of the neurons, the active parts that you see with an MRI signal 1308 00:50:58,860 --> 00:51:02,380 have more oxygenated hemoglobin, right? 1309 00:51:02,380 --> 00:51:06,100 OK, all right, so that's the basic signal. 1310 00:51:06,100 --> 00:51:08,320 And because that's the basic signal, 1311 00:51:08,320 --> 00:51:10,600 there's a bunch of things we can tell already.
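To make that chain from neural firing to blood flow to signal concrete, here is a small sketch; it is not from the lecture, and it assumes the common double-gamma hemodynamic response shape with purely illustrative parameters. It shows that one second of neural activity produces a predicted BOLD response that peaks only several seconds later and is smeared over many seconds.

```python
import numpy as np
from scipy.stats import gamma

def canonical_hrf(t):
    # Double-gamma hemodynamic response: main peak ~5 s after activity,
    # followed by a small undershoot. Parameter values are illustrative only.
    h = gamma.pdf(t, a=6) - gamma.pdf(t, a=16) / 6.0
    return h / h.max()

dt = 0.1
t = np.arange(0, 30, dt)                     # 30 s of time in 100 ms steps

# One second of neural activity, modeled as a boxcar...
neural = ((t >= 0) & (t < 1)).astype(float)

# ...convolved with the HRF gives the predicted BOLD time course.
bold = np.convolve(neural, canonical_hrf(t))[:len(t)] * dt

print("neural activity lasts 1 s; predicted BOLD peak at about %.1f s" % t[np.argmax(bold)])
```

That several-second lag and smear is the poor temporal resolution discussed in a moment, and it is one reason block designs with long blocks, like the experiment coming up, work well despite it.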
1312 00:51:10,600 --> 00:51:13,400 So first of all-- 1313 00:51:13,400 --> 00:51:15,610 I'm just going to-- am I going to do this? 1314 00:51:15,610 --> 00:51:17,290 Yeah, I'm going to skip over this. 1315 00:51:17,290 --> 00:51:19,450 It doesn't really matter. 1316 00:51:19,450 --> 00:51:21,910 Because it's all based on blood flow, one, 1317 00:51:21,910 --> 00:51:24,070 it's extremely indirect-- 1318 00:51:24,070 --> 00:51:28,180 neural activity, blood flow change, overcompensation, 1319 00:51:28,180 --> 00:51:32,710 different magnetic response, MRI image, right? 1320 00:51:32,710 --> 00:51:35,500 So you would think with all those different steps 1321 00:51:35,500 --> 00:51:39,100 that you would get a really weird, nonlinear, messy, crappy 1322 00:51:39,100 --> 00:51:41,560 signal out the other end. 1323 00:51:41,560 --> 00:51:43,780 And it is one of the major challenges 1324 00:51:43,780 --> 00:51:46,512 of my personal atheism, but actually, you 1325 00:51:46,512 --> 00:51:48,220 get a damn good signal out the other end. 1326 00:51:48,220 --> 00:51:50,303 And it's pretty linear with neural activity, which 1327 00:51:50,303 --> 00:51:51,970 seems like kind of a freaking miracle, 1328 00:51:51,970 --> 00:51:53,860 given how indirect it is, OK? 1329 00:51:53,860 --> 00:51:55,870 But that has empowered this whole huge field 1330 00:51:55,870 --> 00:51:58,930 to discover cool things about the organization of the brain. 1331 00:51:58,930 --> 00:52:01,300 OK, nonetheless, there are many caveats. 1332 00:52:01,300 --> 00:52:03,460 Because it's blood flow, the signal 1333 00:52:03,460 --> 00:52:06,370 is limited in spatial resolution down 1334 00:52:06,370 --> 00:52:09,430 to-- people fight about this, but around a millimeter. 1335 00:52:09,430 --> 00:52:11,402 There are cowboys in the field who 1336 00:52:11,402 --> 00:52:13,360 think that they can get less than a millimeter. 1337 00:52:13,360 --> 00:52:13,930 Maybe. 1338 00:52:13,930 --> 00:52:14,770 I don't know. 1339 00:52:14,770 --> 00:52:16,540 It's debated. 1340 00:52:16,540 --> 00:52:19,630 And the temporal resolution is terrible. 1341 00:52:19,630 --> 00:52:21,285 Blood flow changes take a long time. 1342 00:52:21,285 --> 00:52:21,910 Think about it. 1343 00:52:21,910 --> 00:52:22,817 You start running. 1344 00:52:22,817 --> 00:52:24,400 How long does it take before the blood 1345 00:52:24,400 --> 00:52:25,690 flow increases to your calves? 1346 00:52:25,690 --> 00:52:28,150 Well, if you're really fit, it's probably fast, but still 1347 00:52:28,150 --> 00:52:29,440 going to take a few seconds. 1348 00:52:29,440 --> 00:52:31,270 It takes about six seconds for those blood 1349 00:52:31,270 --> 00:52:34,510 flow changes in the brain after neural activity. 1350 00:52:34,510 --> 00:52:36,890 And it happens over a big sloppy chunk of time, 1351 00:52:36,890 --> 00:52:39,790 and so you don't have much temporal resolution 1352 00:52:39,790 --> 00:52:40,930 with functional MRI. 1353 00:52:40,930 --> 00:52:42,760 Does that make sense? 1354 00:52:42,760 --> 00:52:45,150 OK. 1355 00:52:45,150 --> 00:52:49,300 OK, because it's this very indirect signal, 1356 00:52:49,300 --> 00:52:52,420 that also means that when we get a change in the MRI signal, 1357 00:52:52,420 --> 00:52:54,730 we don't exactly know what's causing it. 1358 00:52:54,730 --> 00:52:56,470 Is it synaptic activity? 1359 00:52:56,470 --> 00:52:58,120 Is it actual neural firing? 1360 00:52:58,120 --> 00:52:59,950 Is it one cell inhibiting another? 
1361 00:52:59,950 --> 00:53:01,772 Is it a cell making protein? 1362 00:53:01,772 --> 00:53:03,730 I mean, it could be any of these things, right? 1363 00:53:03,730 --> 00:53:06,880 So we don't know, and that's a problem. 1364 00:53:06,880 --> 00:53:09,070 And another problem is the number you get out 1365 00:53:09,070 --> 00:53:14,770 is just the intensity of the detection of deoxyhemoglobin. 1366 00:53:14,770 --> 00:53:17,650 It doesn't translate directly into an absolute amount 1367 00:53:17,650 --> 00:53:19,210 of neural activity. 1368 00:53:19,210 --> 00:53:21,430 The consequence of that is all you can do 1369 00:53:21,430 --> 00:53:23,170 is compare two conditions. 1370 00:53:23,170 --> 00:53:26,020 You can never say, there was this exact amount 1371 00:53:26,020 --> 00:53:27,970 of metabolic activity right there. 1372 00:53:27,970 --> 00:53:29,890 You can only say it was more in this condition 1373 00:53:29,890 --> 00:53:32,590 than that condition, OK? 1374 00:53:32,590 --> 00:53:35,020 All right, so those are the major caveats. 1375 00:53:35,020 --> 00:53:38,140 Nonetheless, we can discover some cool stuff. 1376 00:53:38,140 --> 00:53:41,800 OK, so let's suppose, to get back to face recognition, 1377 00:53:41,800 --> 00:53:44,200 you wanted to know, is face recognition 1378 00:53:44,200 --> 00:53:48,290 a different problem in the brain from object recognition, right? 1379 00:53:48,290 --> 00:53:50,920 If it was, you might want to write different code 1380 00:53:50,920 --> 00:53:52,780 to try to understand it from the code you're 1381 00:53:52,780 --> 00:53:54,072 writing for object recognition. 1382 00:53:54,072 --> 00:53:56,710 It's something you'd kind of want to know, OK? 1383 00:53:56,710 --> 00:53:58,870 So here's an experiment I did-- god-- 1384 00:53:58,870 --> 00:53:59,840 20 years ago. 1385 00:53:59,840 --> 00:54:02,442 Anyway, simplest possible thing-- 1386 00:54:02,442 --> 00:54:04,900 so it's the easiest way I can explain to you the bare bones 1387 00:54:04,900 --> 00:54:06,610 of a simple MRI experiment. 1388 00:54:06,610 --> 00:54:08,860 You pop the subject in the scanner. 1389 00:54:08,860 --> 00:54:12,160 You scan their head continuously for about five minutes 1390 00:54:12,160 --> 00:54:14,680 while they look at a bunch of faces. 1391 00:54:14,680 --> 00:54:16,347 For 20 seconds, they stare at a dot. 1392 00:54:16,347 --> 00:54:17,680 They look at a bunch of objects. 1393 00:54:17,680 --> 00:54:18,930 They stare at a dot, OK? 1394 00:54:18,930 --> 00:54:20,290 It's a five-minute experiment. 1395 00:54:20,290 --> 00:54:22,370 You're scanning them that whole time. 1396 00:54:22,370 --> 00:54:25,570 And then you ask, of each three-dimensional pixel, 1397 00:54:25,570 --> 00:54:28,090 or voxel, in their brain, whether the signal 1398 00:54:28,090 --> 00:54:30,820 was higher in that voxel while the subject was 1399 00:54:30,820 --> 00:54:34,240 looking at faces than while they were looking at objects, OK? 1400 00:54:34,240 --> 00:54:36,695 And when you do that, you get a blob. 1401 00:54:36,695 --> 00:54:39,070 I've outlined it in green here, but there's a little blob 1402 00:54:39,070 --> 00:54:39,250 there. 1403 00:54:39,250 --> 00:54:41,200 This is a slice through the brain like this. 1404 00:54:41,200 --> 00:54:44,230 That blob is right in here on the bottom of the brain. 
1405 00:54:44,230 --> 00:54:46,270 And the statistics are telling us 1406 00:54:46,270 --> 00:54:49,150 that the MRI signal is higher during the face epochs 1407 00:54:49,150 --> 00:54:50,470 than the object epochs-- 1408 00:54:50,470 --> 00:54:51,860 everybody with me here-- 1409 00:54:51,860 --> 00:54:54,970 which implies very indirectly that the neural activity 1410 00:54:54,970 --> 00:54:57,970 of that region was higher when this person was looking at 1411 00:54:57,970 --> 00:55:00,610 faces than when they were looking at objects. 1412 00:55:00,610 --> 00:55:02,508 OK, now whenever you see a blob like that, 1413 00:55:02,508 --> 00:55:05,050 really, you want to see the data that went into it, so here's 1414 00:55:05,050 --> 00:55:06,070 mine. 1415 00:55:06,070 --> 00:55:08,920 This is now the raw average MRI signal 1416 00:55:08,920 --> 00:55:12,070 intensity in that bit of brain over the five 1417 00:55:12,070 --> 00:55:13,870 minutes of the scan. 1418 00:55:13,870 --> 00:55:17,380 You can see the signal's higher in that 1419 00:55:17,380 --> 00:55:20,782 region when the person is looking at faces-- 1420 00:55:20,782 --> 00:55:22,240 these bars here-- than when they're 1421 00:55:22,240 --> 00:55:23,740 looking at objects there. 1422 00:55:23,740 --> 00:55:24,628 Everyone get that? 1423 00:55:24,628 --> 00:55:26,170 That's what the stats are telling us. 1424 00:55:26,170 --> 00:55:27,920 This is just the reality check of the data 1425 00:55:27,920 --> 00:55:29,860 that produced those stats. 1426 00:55:29,860 --> 00:55:34,120 OK, so now, in fact, you can see something 1427 00:55:34,120 --> 00:55:36,005 like that in pretty much every normal person. 1428 00:55:36,005 --> 00:55:38,380 I could pop any of you in the scanner, and in 10 minutes, 1429 00:55:38,380 --> 00:55:41,680 we'd find yours, OK? 1430 00:55:41,680 --> 00:55:43,600 Now, here's the key question. 1431 00:55:43,600 --> 00:55:46,300 Does this so far-- let's suppose you find this in anyone. 1432 00:55:46,300 --> 00:55:47,830 You do all the stats you like. 1433 00:55:47,830 --> 00:55:50,230 It's as robust as you could possibly want. 1434 00:55:50,230 --> 00:55:54,550 Do these data alone tell us that that region is specifically 1435 00:55:54,550 --> 00:55:55,540 responsive to faces? 1436 00:55:58,590 --> 00:55:59,090 No. 1437 00:55:59,090 --> 00:56:00,552 Why not? 1438 00:56:00,552 --> 00:56:02,520 AUDIENCE: Because it could-- 1439 00:56:02,520 --> 00:56:08,240 just like that certain arrangement of features, 1440 00:56:08,240 --> 00:56:12,140 or it could be reacting to the variable light intensities 1441 00:56:12,140 --> 00:56:16,113 that's reflecting off people's skin-- coloration of the skin. 1442 00:56:16,113 --> 00:56:17,030 NANCY KANWISHER: Good. 1443 00:56:17,030 --> 00:56:18,270 Keep going. 1444 00:56:18,270 --> 00:56:19,160 What else? 1445 00:56:19,160 --> 00:56:19,850 Yes, you. 1446 00:56:19,850 --> 00:56:22,723 AUDIENCE: Different faces of anything, like even animals. 1447 00:56:22,723 --> 00:56:24,890 NANCY KANWISHER: Yeah, then it might still be faces, 1448 00:56:24,890 --> 00:56:25,880 but it would be different if it's 1449 00:56:25,880 --> 00:56:27,410 human faces versus any faces. 1450 00:56:27,410 --> 00:56:28,763 We kind of want to know, right? 1451 00:56:28,763 --> 00:56:29,930 The code would be different. 1452 00:56:29,930 --> 00:56:30,823 Yeah. 
1453 00:56:30,823 --> 00:56:32,618 AUDIENCE: It might be because of the face 1454 00:56:32,618 --> 00:56:35,765 is part of the bigger whole and the object's a [INAUDIBLE].. 1455 00:56:35,765 --> 00:56:38,140 NANCY KANWISHER: Uh-huh, the face is a part of something, 1456 00:56:38,140 --> 00:56:40,140 absolutely, where the object is the whole thing. 1457 00:56:40,140 --> 00:56:42,050 What else? 1458 00:56:42,050 --> 00:56:42,550 Yeah. 1459 00:56:42,550 --> 00:56:44,425 AUDIENCE: Maybe the orientation of the object 1460 00:56:44,425 --> 00:56:48,657 is so much simpler than the human face is. 1461 00:56:48,657 --> 00:56:50,740 NANCY KANWISHER: Just objects are simpler or maybe 1462 00:56:50,740 --> 00:56:52,240 just easier. 1463 00:56:52,240 --> 00:56:53,950 Maybe it's just hard to distinguish 1464 00:56:53,950 --> 00:56:57,070 one face from another, and so you need more blood flow. 1465 00:56:57,070 --> 00:56:59,560 Really, what that thing is-- that's a generic object 1466 00:56:59,560 --> 00:57:02,200 recognition system, but it has a harder time 1467 00:57:02,200 --> 00:57:03,700 distinguishing faces from each other 1468 00:57:03,700 --> 00:57:05,120 because they're so similar. 1469 00:57:05,120 --> 00:57:06,190 So there's more activity. 1470 00:57:06,190 --> 00:57:07,960 Everybody get that? 1471 00:57:07,960 --> 00:57:10,215 OK, what else? 1472 00:57:10,215 --> 00:57:11,590 I'm going to go two minutes over, 1473 00:57:11,590 --> 00:57:13,540 so if people have to leave, that's OK. 1474 00:57:13,540 --> 00:57:16,235 I'll try not to go more than two minutes over. 1475 00:57:16,235 --> 00:57:16,735 What else? 1476 00:57:20,100 --> 00:57:21,765 Yeah. 1477 00:57:21,765 --> 00:57:23,140 AUDIENCE: It could just as easily 1478 00:57:23,140 --> 00:57:28,648 be responsible for seeing maybe functional cues rather 1479 00:57:28,648 --> 00:57:30,093 than face recognition. 1480 00:57:30,093 --> 00:57:31,260 NANCY KANWISHER: Yeah, yeah. 1481 00:57:31,260 --> 00:57:32,843 As I just said, there's all this stuff 1482 00:57:32,843 --> 00:57:36,720 we get from a face, not just who is it, but are they healthy? 1483 00:57:36,720 --> 00:57:38,102 What mood are they in? 1484 00:57:38,102 --> 00:57:39,060 Where are they looking? 1485 00:57:39,060 --> 00:57:41,400 All that stuff. 1486 00:57:41,400 --> 00:57:43,830 OK, so what you guys just did-- this 1487 00:57:43,830 --> 00:57:46,320 is just basic common sense, but it's also the essence 1488 00:57:46,320 --> 00:57:47,522 of scientific reasoning. 1489 00:57:47,522 --> 00:57:49,230 And we'll do a lot of that in this class. 1490 00:57:49,230 --> 00:57:52,420 And the crux of the matter is, here's some data. 1491 00:57:52,420 --> 00:57:53,580 Here's an inference. 1492 00:57:53,580 --> 00:57:57,690 And so your job is to think, is there 1493 00:57:57,690 --> 00:58:00,900 any way that inference might not follow from those data? 1494 00:58:00,900 --> 00:58:03,660 How else might we account for those data, OK? 1495 00:58:03,660 --> 00:58:05,850 And you guys just did that beautifully, OK? 1496 00:58:05,850 --> 00:58:07,770 So the essence of good science is 1497 00:58:07,770 --> 00:58:11,820 whenever you see some data and an inference, ask yourself, 1498 00:58:11,820 --> 00:58:13,450 how might that inference be wrong? 1499 00:58:13,450 --> 00:58:15,750 How else might we account for those data? 1500 00:58:15,750 --> 00:58:17,940 OK, so that's what you guys just did. 1501 00:58:17,940 --> 00:58:21,000 I had previously made a list of other things that might mean. 
1502 00:58:21,000 --> 00:58:22,590 It could respond to anything human. 1503 00:58:22,590 --> 00:58:24,270 You had said any kind of face, but it could also 1504 00:58:24,270 --> 00:58:25,978 be just anything human, maybe a response 1505 00:58:25,978 --> 00:58:28,500 to hands, any body part, anything 1506 00:58:28,500 --> 00:58:32,100 we pay more attention to, anything that has curves in it, 1507 00:58:32,100 --> 00:58:35,100 or any of the suggestions you guys made, OK? 1508 00:58:35,100 --> 00:58:38,670 So the crux of the matter in how you do a good functional MRI 1509 00:58:38,670 --> 00:58:41,670 experiment or make a strong claim about a part of the brain 1510 00:58:41,670 --> 00:58:43,950 based on functional MRI is to take 1511 00:58:43,950 --> 00:58:46,630 all these alternative accounts seriously. 1512 00:58:46,630 --> 00:58:49,558 And so as just one example, what we did in our very first paper 1513 00:58:49,558 --> 00:58:51,600 is say, OK, there's lots of alternative accounts. 1514 00:58:51,600 --> 00:58:53,860 Let's try to tackle a bunch of them. 1515 00:58:53,860 --> 00:58:58,620 So now we scanned people looking at 3/4 views of faces and hands, 1516 00:58:58,620 --> 00:59:01,710 and we made them press a button whenever two consecutive hands 1517 00:59:01,710 --> 00:59:04,440 were the same-- that's called a one-back task-- 1518 00:59:04,440 --> 00:59:07,020 or whenever two consecutive faces are the same. 1519 00:59:07,020 --> 00:59:09,750 By design, that task is harder on the hands 1520 00:59:09,750 --> 00:59:11,880 than the faces, so we were forcing 1521 00:59:11,880 --> 00:59:14,400 our subjects to pay more attention to the hands 1522 00:59:14,400 --> 00:59:17,100 than the faces, OK? 1523 00:59:17,100 --> 00:59:20,010 And what we found is you get the same blob, still responding 1524 00:59:20,010 --> 00:59:21,970 more to faces than hands. 1525 00:59:21,970 --> 00:59:24,210 And so the idea is by showing that, 1526 00:59:24,210 --> 00:59:26,603 we've ruled out every one of those things. 1527 00:59:26,603 --> 00:59:28,020 It's not just any human body part. 1528 00:59:28,020 --> 00:59:29,037 It doesn't go to hands-- 1529 00:59:29,037 --> 00:59:30,120 oh, sorry, anything human. 1530 00:59:30,120 --> 00:59:31,692 It's not just any body part. 1531 00:59:31,692 --> 00:59:33,150 It's not anything we paid attention 1532 00:59:33,150 --> 00:59:35,640 to because we made them pay more attention to the hands. 1533 00:59:35,640 --> 00:59:38,970 It's not anything with a curvy outline, OK? 1534 00:59:38,970 --> 00:59:41,820 And so that's just a little tiny example 1535 00:59:41,820 --> 00:59:45,030 of how you can proceed in a systematic way 1536 00:59:45,030 --> 00:59:47,978 to try to analyze what is actually driving 1537 00:59:47,978 --> 00:59:49,020 this region of the brain. 1538 00:59:49,020 --> 00:59:50,812 You come up with a hypothesis, and then you 1539 00:59:50,812 --> 00:59:52,680 think of alternative accounts of the data. 1540 00:59:52,680 --> 00:59:54,340 And you come up with more hypotheses, 1541 00:59:54,340 --> 00:59:56,250 and then you think of ways to test them. 1542 00:59:56,250 --> 00:59:58,470 And we'll do a lot of that in here. 1543 00:59:58,470 --> 01:00:01,170 OK, so that's what I just said, and I'll just 1544 01:00:01,170 --> 01:00:03,240 say that there's lots of data since then. 1545 01:00:03,240 --> 01:00:06,660 That region of the brain actually very strongly 1546 01:00:06,660 --> 01:00:10,650 prefers faces, and it's present in everyone.
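Here is a toy sketch of the bare logic of that faces-versus-objects contrast, on synthetic data rather than anything from the actual study: a block design like the one described, with a t-test per voxel asking whether the signal is higher during face blocks than object blocks. A real analysis would also model the hemodynamic lag and correct for multiple comparisons.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Block design roughly like the one described: 20 s blocks of faces, fixation,
# objects, fixation, repeated, with one volume acquired every 2 s (TR = 2 s).
vols_per_block = 10                                      # 20 s / 2 s
labels = np.repeat(["face", "fix", "object", "fix"] * 4, vols_per_block)

# Synthetic data: 1000 voxels of noise, plus 20 "face-selective" voxels
# (hypothetical ground truth) whose signal goes up during face blocks.
n_vox = 1000
data = rng.normal(size=(n_vox, len(labels)))
face_voxels = np.arange(20)
data[np.ix_(face_voxels, np.flatnonzero(labels == "face"))] += 1.0

# For each voxel: is the signal higher during face volumes than object volumes?
t_vals, p_vals = stats.ttest_ind(data[:, labels == "face"],
                                 data[:, labels == "object"], axis=1)
detected = np.flatnonzero((t_vals > 0) & (p_vals < 0.001))
print("voxels responding more to faces than objects:", detected)
```

The hands control described above is the same contrast logic with a different comparison condition, chosen so that attention, body parts, and curvy outlines no longer ride along with faceness.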
1547 01:00:10,650 --> 01:00:15,420 And next time, we will talk about the fact 1548 01:00:15,420 --> 01:00:17,070 that that looks like it's suggesting 1549 01:00:17,070 --> 01:00:19,470 that we have a different system for face recognition 1550 01:00:19,470 --> 01:00:20,880 than object recognition. 1551 01:00:20,880 --> 01:00:23,190 But we haven't yet nailed the case, 1552 01:00:23,190 --> 01:00:25,830 and you guys should all think about what remains. 1553 01:00:25,830 --> 01:00:26,498 OK, thank you. 1554 01:00:26,498 --> 01:00:27,540 Sorry I was racing there. 1555 01:00:27,540 --> 01:00:30,830 I will hang out if you guys have questions.