1 00:00:01,640 --> 00:00:04,040 The following content is provided under a Creative 2 00:00:04,040 --> 00:00:05,580 Commons license. 3 00:00:05,580 --> 00:00:07,880 Your support will help MIT OpenCourseWare 4 00:00:07,880 --> 00:00:12,270 continue to offer high quality educational resources for free. 5 00:00:12,270 --> 00:00:14,870 To make a donation or view additional materials 6 00:00:14,870 --> 00:00:18,830 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:18,830 --> 00:00:20,000 at ocw.mit.edu. 8 00:00:22,292 --> 00:00:24,500 ELIZABETH SPELKE: I want to start with an observation 9 00:00:24,500 --> 00:00:25,620 about this summer school. 10 00:00:25,620 --> 00:00:29,480 There's a lot of development in this summer school. 11 00:00:29,480 --> 00:00:32,750 You've got two full mornings devoted to it-- today 12 00:00:32,750 --> 00:00:35,360 and on Thursday. 13 00:00:35,360 --> 00:00:40,580 It also came up pretty majorly in Josh Tenenbaum's class 14 00:00:40,580 --> 00:00:43,310 last Friday and I learned early this morning 15 00:00:43,310 --> 00:00:46,850 also in Shimon Ullman's class that I couldn't be here 16 00:00:46,850 --> 00:00:49,880 for yesterday afternoon. 17 00:00:49,880 --> 00:00:53,580 And the issues have come up in many other classes as well, 18 00:00:53,580 --> 00:00:57,630 including Nancy's, Winrich Freiwald's, and so forth. 19 00:00:57,630 --> 00:01:01,010 Now, what's come up is not only the general questions 20 00:01:01,010 --> 00:01:06,170 about development, but specific questions about human cognitive 21 00:01:06,170 --> 00:01:06,980 development. 22 00:01:06,980 --> 00:01:09,050 Questions that have been addressed primarily 23 00:01:09,050 --> 00:01:12,320 through behavioral experiments, not 24 00:01:12,320 --> 00:01:18,050 experiments using neural methods or computational models. 
25 00:01:18,050 --> 00:01:21,350 And the topic that I'm going to be trying to-- 26 00:01:21,350 --> 00:01:23,840 that Allie and I will try to get you to think about 27 00:01:23,840 --> 00:01:27,090 for this morning is even narrower than that. 28 00:01:27,090 --> 00:01:32,240 It's about the cognitive capacities of human infants. 29 00:01:32,240 --> 00:01:34,670 And I think a fair initial question would be, 30 00:01:34,670 --> 00:01:37,970 why so much focus on early human development? 31 00:01:37,970 --> 00:01:39,930 And that question will get sharper 32 00:01:39,930 --> 00:01:42,140 if you look at where major organizations are 33 00:01:42,140 --> 00:01:43,520 putting their research money. 34 00:01:43,520 --> 00:01:46,250 They are not putting it into the kind of work 35 00:01:46,250 --> 00:01:48,590 that I'm going to be talking about today. 36 00:01:48,590 --> 00:01:51,410 There is no-- in the Obama BRAIN Initiative, 37 00:01:51,410 --> 00:01:53,550 where they're looking for new technologies, 38 00:01:53,550 --> 00:01:55,430 there's no call for new technologies 39 00:01:55,430 --> 00:01:57,560 to figure out what human knowledge is 40 00:01:57,560 --> 00:01:59,780 like at or near the initial state 41 00:01:59,780 --> 00:02:02,630 and how it grows over the course of infancy. 42 00:02:02,630 --> 00:02:06,380 And the European Human Brain Project 43 00:02:06,380 --> 00:02:11,510 doesn't have development as a major area in it, either. 44 00:02:11,510 --> 00:02:14,669 So I think it's fair to ask, why is CBMM 45 00:02:14,669 --> 00:02:17,210 taking such a different approach and putting so much emphasis 46 00:02:17,210 --> 00:02:19,700 on trying to get you guys to think about and learn 47 00:02:19,700 --> 00:02:22,640 about human development? 48 00:02:22,640 --> 00:02:25,220 And two general reasons, I think. 49 00:02:25,220 --> 00:02:27,530 One is, it's intrinsically fascinating. 50 00:02:27,530 --> 00:02:28,680 Come on. 
51 00:02:28,680 --> 00:02:31,250 We are the most cognitively interesting creatures 52 00:02:31,250 --> 00:02:32,460 on the planet. 53 00:02:32,460 --> 00:02:35,660 And we're extremely flexible. 54 00:02:35,660 --> 00:02:39,500 At the very least, we know that a human infant can grow up 55 00:02:39,500 --> 00:02:43,160 to be a competent adult in any human culture of the world 56 00:02:43,160 --> 00:02:46,670 today and any human culture that existed in prehistory. 57 00:02:46,670 --> 00:02:48,732 And that means extremely varied-- 58 00:02:48,732 --> 00:02:50,690 they've had to learn extremely different things 59 00:02:50,690 --> 00:02:52,850 under different circumstances and have 60 00:02:52,850 --> 00:02:54,980 succeeded at doing that. 61 00:02:54,980 --> 00:02:59,300 We also know that by the time they start school, 62 00:02:59,300 --> 00:03:04,160 if they go to school at all, the really hard work 63 00:03:04,160 --> 00:03:06,650 of developing a common-sense understanding of the world 64 00:03:06,650 --> 00:03:07,880 is done. 65 00:03:07,880 --> 00:03:11,120 That is, it's not explicitly taught to children. 66 00:03:11,120 --> 00:03:13,571 Most of it isn't even very strongly 67 00:03:13,571 --> 00:03:16,070 implicitly taught to them in the form of other people trying 68 00:03:16,070 --> 00:03:17,300 to get them to learn things. 69 00:03:17,300 --> 00:03:19,425 What you're trying to do when you have a young kid, 70 00:03:19,425 --> 00:03:22,880 as those of you who have them know, or have had them know, 71 00:03:22,880 --> 00:03:30,230 is you're trying to get them not to climb off cliffs or explore 72 00:03:30,230 --> 00:03:32,247 the hot pots on the stove and so forth. 73 00:03:32,247 --> 00:03:34,580 You're really not spending very much of your time trying 74 00:03:34,580 --> 00:03:36,170 to get them to learn new stuff. 75 00:03:36,170 --> 00:03:37,890 They're doing that on their own. 
76 00:03:37,890 --> 00:03:39,950 So it's I think a really interesting question, 77 00:03:39,950 --> 00:03:40,700 how do we do that? 78 00:03:40,700 --> 00:03:42,491 Intrinsically interesting in its own right, 79 00:03:42,491 --> 00:03:45,980 even if it were of no other use to us. 80 00:03:45,980 --> 00:03:48,230 But historically it's also been recognized 81 00:03:48,230 --> 00:03:50,930 as being really important for efforts 82 00:03:50,930 --> 00:03:53,690 to understand the human mind, understand the human brain, 83 00:03:53,690 --> 00:03:56,070 and build intelligent machines. 84 00:03:56,070 --> 00:04:01,670 So Helmholtz, who came up in Eero's talk last night, 85 00:04:01,670 --> 00:04:05,420 was not only a brilliant neurophysiologist 86 00:04:05,420 --> 00:04:09,440 and a physicist, he was extremely interested 87 00:04:09,440 --> 00:04:11,810 in perception and cognition. 88 00:04:11,810 --> 00:04:14,150 And he wrote about fundamental questions 89 00:04:14,150 --> 00:04:17,149 about human perceptual knowledge and experience. 90 00:04:17,149 --> 00:04:21,320 How is it that we experience the world as three-dimensional? 91 00:04:21,320 --> 00:04:23,800 He concluded that we didn't know the answer 92 00:04:23,800 --> 00:04:25,550 and never could know the answer, unless we 93 00:04:25,550 --> 00:04:29,810 could find ways to do systematic experiments on infants 94 00:04:29,810 --> 00:04:33,290 of the sort that could already be done to reveal mechanisms 95 00:04:33,290 --> 00:04:36,110 of color vision, for example, as were described 96 00:04:36,110 --> 00:04:37,790 last night on adults-- 97 00:04:37,790 --> 00:04:40,194 systematic psychophysical experiments on infants. 98 00:04:40,194 --> 00:04:41,610 But he looked at infants and said, 99 00:04:41,610 --> 00:04:44,122 I don't see any way to do that. 100 00:04:44,122 --> 00:04:46,580 We can't train them to make psychophysical judgments and so 101 00:04:46,580 --> 00:04:47,760 forth. 
102 00:04:47,760 --> 00:04:50,090 But he was aware of their centrality. 103 00:04:50,090 --> 00:04:54,080 So was Turing, who in thinking ahead 104 00:04:54,080 --> 00:04:57,770 to how one might build intelligent machines, 105 00:04:57,770 --> 00:05:00,200 suggested that one aim to build a machine that 106 00:05:00,200 --> 00:05:03,170 could learn about the world the way children do. 107 00:05:03,170 --> 00:05:05,630 And a side of the work that's come up 108 00:05:05,630 --> 00:05:08,660 so many times in the whole Hubel-Wiesel tradition that 109 00:05:08,660 --> 00:05:10,610 started in the late '50s, I think 110 00:05:10,610 --> 00:05:14,300 one of the most exciting and important developments 111 00:05:14,300 --> 00:05:17,210 within that field, we're not just 112 00:05:17,210 --> 00:05:19,460 focusing on the response properties of neurons 113 00:05:19,460 --> 00:05:22,640 in mature visual systems, but rather 114 00:05:22,640 --> 00:05:25,490 on the development of those neurons 115 00:05:25,490 --> 00:05:27,530 and the effects of experience on them. 116 00:05:27,530 --> 00:05:30,980 When you discover that you get these gorgeous stripes 117 00:05:30,980 --> 00:05:35,490 of monocularly-driven cells in V1, 118 00:05:35,490 --> 00:05:37,790 it then immediately became really interesting to ask, 119 00:05:37,790 --> 00:05:41,030 suppose an animal were only looking at the world 120 00:05:41,030 --> 00:05:41,720 through one eye? 121 00:05:41,720 --> 00:05:44,280 Or suppose they could look at the world through the two eyes, 122 00:05:44,280 --> 00:05:46,460 but not at the same time, or not at the same things 123 00:05:46,460 --> 00:05:47,300 at the same time? 124 00:05:47,300 --> 00:05:49,460 What would happen to those cells? 125 00:05:49,460 --> 00:05:52,640 And there was gorgeous work addressing those questions 126 00:05:52,640 --> 00:05:54,270 from the beginning. 127 00:05:54,270 --> 00:05:57,830 Now, that work has somewhat receded from attention. 
128 00:05:57,830 --> 00:05:59,160 I think that's a mistake. 129 00:05:59,160 --> 00:06:00,770 I think that there's a great deal 130 00:06:00,770 --> 00:06:03,680 to be learned from those kinds of studies now. 131 00:06:03,680 --> 00:06:07,712 And if I get nothing else across over this time, 132 00:06:07,712 --> 00:06:09,170 I hope you'll at least get the idea 133 00:06:09,170 --> 00:06:11,330 that this is a field worth following, looking 134 00:06:11,330 --> 00:06:15,590 at development in humans, looking at development 135 00:06:15,590 --> 00:06:18,470 of perceptual and cognitive capacities 136 00:06:18,470 --> 00:06:24,380 in animal models of human intelligence as well. 137 00:06:24,380 --> 00:06:26,680 So more specifically, I think there 138 00:06:26,680 --> 00:06:32,090 are three questions about human cognition 139 00:06:32,090 --> 00:06:34,280 that studies of early development 140 00:06:34,280 --> 00:06:37,730 in general and in human infants in particular 141 00:06:37,730 --> 00:06:38,872 can shed light on. 142 00:06:38,872 --> 00:06:41,330 Two of them I'm not going to really be talking about today, 143 00:06:41,330 --> 00:06:43,220 except indirectly. 144 00:06:43,220 --> 00:06:46,220 One is the question, what distinguishes us 145 00:06:46,220 --> 00:06:47,340 from other animals? 146 00:06:47,340 --> 00:06:51,144 We come into the world with very similar equipment. 147 00:06:51,144 --> 00:06:52,310 But look what we do with it. 148 00:06:52,310 --> 00:06:55,490 We create these utterly different systems of knowledge 149 00:06:55,490 --> 00:06:57,560 that no other animal seems to share. 150 00:06:57,560 --> 00:07:00,590 What is it about us that sets us on a different path 151 00:07:00,590 --> 00:07:02,510 from other animals? 152 00:07:02,510 --> 00:07:03,920 That's question one. 
153 00:07:03,920 --> 00:07:06,460 And the other question I won't talk about is-- 154 00:07:06,460 --> 00:07:08,450 well, I'll talk about it a tiny bit, but not 155 00:07:08,450 --> 00:07:12,020 directly-- is, where do abstract ideas come from? 156 00:07:12,020 --> 00:07:14,840 It seems like we not only develop systems of knowledge, 157 00:07:14,840 --> 00:07:19,130 but those systems center on concepts 158 00:07:19,130 --> 00:07:22,400 that refer to things that could never in principle 159 00:07:22,400 --> 00:07:25,010 be seen or acted on. 160 00:07:25,010 --> 00:07:29,060 Like the concept "seven," or the concept "triangle," 161 00:07:29,060 --> 00:07:34,070 or the concept "belief," or ethical concepts and so forth. 162 00:07:34,070 --> 00:07:36,260 Abstract concepts organize our knowledge. 163 00:07:36,260 --> 00:07:40,520 But since they can't be seen or touched or produced 164 00:07:40,520 --> 00:07:42,830 through our actions, how do we come to know them? 165 00:07:42,830 --> 00:07:44,720 I think studies of early development 166 00:07:44,720 --> 00:07:46,314 can shed light on that as well. 167 00:07:46,314 --> 00:07:48,980 But the question I want to focus on today is the third question, 168 00:07:48,980 --> 00:07:52,880 and it's the one that Josh raised on Friday. 169 00:07:52,880 --> 00:07:57,350 How do we get so much from so little as adults? 170 00:07:57,350 --> 00:07:59,390 As adults, you look at one of the photographs 171 00:07:59,390 --> 00:08:01,040 he showed of just an ordinary scene 172 00:08:01,040 --> 00:08:03,650 and you can immediately make predictions about, 173 00:08:03,650 --> 00:08:05,450 if you were to bang it, what would happen? 174 00:08:05,450 --> 00:08:06,230 What would fall? 175 00:08:06,230 --> 00:08:07,730 What would roll? 176 00:08:07,730 --> 00:08:09,920 We seem to get this very, very rich knowledge 177 00:08:09,920 --> 00:08:15,500 from this very, very limited body of information 178 00:08:15,500 --> 00:08:17,450 at any given time. 
179 00:08:17,450 --> 00:08:20,910 And what that suggests is that we 180 00:08:20,910 --> 00:08:23,420 are able to bring to bear in interpreting 181 00:08:23,420 --> 00:08:26,180 that scene a whole body of knowledge 182 00:08:26,180 --> 00:08:30,110 that we already have about the world and how it behaves. 183 00:08:30,110 --> 00:08:33,140 But that raises the question, what is it that we know 184 00:08:33,140 --> 00:08:36,500 and how is our knowledge organized? 185 00:08:36,500 --> 00:08:43,715 What aspects of the world do we represent most fundamentally? 186 00:08:43,715 --> 00:08:46,870 Which of our concepts are most important to us 187 00:08:46,870 --> 00:08:48,980 and generate the other concepts and so forth? 188 00:08:48,980 --> 00:08:52,940 How can we carve human knowledge at its joints? 189 00:08:52,940 --> 00:08:54,800 And now this can be studied in adults 190 00:08:54,800 --> 00:08:56,720 and you've seen a number of examples of this. 191 00:08:56,720 --> 00:08:59,750 You saw it in Nancy's talk last Tuesday, right? 192 00:08:59,750 --> 00:09:03,050 Anyway, last week sometime. 193 00:09:03,050 --> 00:09:06,500 Studies using functional brain imaging 194 00:09:06,500 --> 00:09:11,930 to get at our representations of human faces. 195 00:09:11,930 --> 00:09:13,830 You saw it in Josh's talk. 196 00:09:13,830 --> 00:09:15,770 He was mostly using data from adults 197 00:09:15,770 --> 00:09:21,680 to be probing the knowledge of intuitive physics 198 00:09:21,680 --> 00:09:24,620 that he was focused on and that his computational models are 199 00:09:24,620 --> 00:09:26,210 trying to capture. 200 00:09:26,210 --> 00:09:29,010 You're going to see it on Thursday in-- 201 00:09:29,010 --> 00:09:31,430 no, tomorrow in Rebecca Saxe's talk, 202 00:09:31,430 --> 00:09:35,720 where she'll talk about human adults' attributions of beliefs 203 00:09:35,720 --> 00:09:39,260 and desires and other mental states to people. 
204 00:09:39,260 --> 00:09:41,660 It's certainly studyable in adults, 205 00:09:41,660 --> 00:09:45,166 but it's difficult to answer these questions. 206 00:09:45,166 --> 00:09:47,540 It's difficult to answer these questions in any creature, 207 00:09:47,540 --> 00:09:49,081 but I think it's especially difficult 208 00:09:49,081 --> 00:09:52,230 to answer these questions in adults for a couple of reasons. 209 00:09:52,230 --> 00:09:54,440 One is that our knowledge is simply too rich. 210 00:09:54,440 --> 00:09:57,350 By the time we get to be adults, we know so much 211 00:09:57,350 --> 00:09:59,330 and we have so many alternative ways 212 00:09:59,330 --> 00:10:01,330 of solving any particular problem, 213 00:10:01,330 --> 00:10:02,710 that it's a real challenge to try 214 00:10:02,710 --> 00:10:05,860 to sift through all our abilities 215 00:10:05,860 --> 00:10:09,130 and figure out what the really fundamental, most fundamental 216 00:10:09,130 --> 00:10:11,314 concepts that we have are. 217 00:10:11,314 --> 00:10:12,730 And the second problem with adults 218 00:10:12,730 --> 00:10:15,910 is we not only know too much, we're too flexible. 219 00:10:15,910 --> 00:10:19,030 We can essentially relate anything to anything. 220 00:10:19,030 --> 00:10:20,800 We can use information from the face 221 00:10:20,800 --> 00:10:24,370 to answer all sorts of questions about the world. 222 00:10:24,370 --> 00:10:28,600 And here, I think, infants are useful for a maybe seemingly 223 00:10:28,600 --> 00:10:30,070 paradoxical reason. 224 00:10:30,070 --> 00:10:32,020 They're much less cognitively capable. 225 00:10:32,020 --> 00:10:34,150 They know much less about the world and they're 226 00:10:34,150 --> 00:10:35,590 far less flexible-- 227 00:10:35,590 --> 00:10:37,090 I'll show you examples of this-- 228 00:10:37,090 --> 00:10:39,010 far less flexible in the kinds of things 229 00:10:39,010 --> 00:10:43,330 that they can do with the knowledge that they do have. 
230 00:10:43,330 --> 00:10:46,960 Nevertheless, they seem to come into the world equipped 231 00:10:46,960 --> 00:10:49,540 with knowledge that supports later learning. 232 00:10:49,540 --> 00:10:51,760 And because it's supporting later learning, 233 00:10:51,760 --> 00:10:55,000 it's being preserved over that learning. 234 00:10:55,000 --> 00:10:57,190 It's being incorporated in all of the later things 235 00:10:57,190 --> 00:10:58,390 that we learn. 236 00:10:58,390 --> 00:11:00,560 So it remains fundamental to us as adults. 237 00:11:00,560 --> 00:11:02,830 And I think this can help us to think 238 00:11:02,830 --> 00:11:08,470 about how our own knowledge of the world is organized. 239 00:11:08,470 --> 00:11:09,270 OK. 240 00:11:09,270 --> 00:11:11,500 So that's a general overview. 241 00:11:11,500 --> 00:11:12,580 How do we study infants? 242 00:11:12,580 --> 00:11:16,600 Now here's where the tables turn radically. 243 00:11:16,600 --> 00:11:19,420 We have way better methods for studying cognition in adults 244 00:11:19,420 --> 00:11:23,140 than we do in infants, just as Helmholtz thought. 245 00:11:23,140 --> 00:11:24,554 They can't talk to us. 246 00:11:24,554 --> 00:11:26,470 They don't understand us when we talk to them, 247 00:11:26,470 --> 00:11:27,820 so we can't give them instruction. 248 00:11:27,820 --> 00:11:33,460 And unlike willing, trained animals, 249 00:11:33,460 --> 00:11:35,830 you can't train them to do things, 250 00:11:35,830 --> 00:11:39,490 at least not in any extended sense. 251 00:11:39,490 --> 00:11:40,480 They can't do much. 252 00:11:40,480 --> 00:11:42,550 I'm most interested in infants in the first four 253 00:11:42,550 --> 00:11:45,100 months of life before they even start reaching for things, 254 00:11:45,100 --> 00:11:50,710 much less sitting up by themselves or moving around. 
255 00:11:50,710 --> 00:11:54,070 The interesting thing is, from day one, from the moment 256 00:11:54,070 --> 00:11:57,010 that they're born, they're observing the world. 257 00:11:57,010 --> 00:12:00,850 They're looking at things and they're getting information 258 00:12:00,850 --> 00:12:03,670 from what they see. 259 00:12:03,670 --> 00:12:06,580 Now, their observations-- we've learned over the last half 260 00:12:06,580 --> 00:12:10,270 century or so that their observations are systematic 261 00:12:10,270 --> 00:12:12,970 and they're reflected in very simple exploratory behaviors, 262 00:12:12,970 --> 00:12:16,215 like when a sound happens somewhere in the visual field, 263 00:12:16,215 --> 00:12:18,340 turning the head and orienting it toward the sound. 264 00:12:18,340 --> 00:12:20,830 Even newborn infants will do that. 265 00:12:20,830 --> 00:12:24,130 Or if something new or interesting is presented, 266 00:12:24,130 --> 00:12:26,975 infants will tend to look at it. 267 00:12:26,975 --> 00:12:28,600 And these behaviors I think can tell us 268 00:12:28,600 --> 00:12:31,000 something about what infants perceive and know. 269 00:12:31,000 --> 00:12:34,482 And before getting to the real substance of what 270 00:12:34,482 --> 00:12:36,190 I want to focus on today, let me give you 271 00:12:36,190 --> 00:12:37,990 a few examples of this. 272 00:12:37,990 --> 00:12:40,420 What kinds of things do infants look at? 273 00:12:40,420 --> 00:12:44,320 Well, if you present even a newborn infant-- 274 00:12:44,320 --> 00:12:45,880 infants at any age, really-- 275 00:12:45,880 --> 00:12:49,692 with two displays side by side, and vary 276 00:12:49,692 --> 00:12:52,150 properties of those displays and the relation between them, 277 00:12:52,150 --> 00:12:55,130 you'll see that they look at some things more than others. 
278 00:12:55,130 --> 00:12:57,160 So they'll look at black-and-white stripes 279 00:12:57,160 --> 00:12:59,500 more than they'll look at a homogeneous gray field. 280 00:12:59,500 --> 00:13:00,760 That's useful. 281 00:13:00,760 --> 00:13:03,070 It allowed people to get initial measures 282 00:13:03,070 --> 00:13:06,905 of the development of visual acuity which infants-- 283 00:13:06,905 --> 00:13:10,210 it actually overturned a somewhat popular view 284 00:13:10,210 --> 00:13:12,540 that at birth, infants couldn't see at all. 285 00:13:12,540 --> 00:13:14,860 We know from these simple studies that they can. 286 00:13:14,860 --> 00:13:17,110 And we also know that their acuity starts out very low 287 00:13:17,110 --> 00:13:19,150 but gets pretty good by the time they're 288 00:13:19,150 --> 00:13:21,040 four to six months of age. 289 00:13:21,040 --> 00:13:23,857 It doesn't reach full adult levels until about two years. 290 00:13:23,857 --> 00:13:25,690 We also know that they look at moving arrays 291 00:13:25,690 --> 00:13:28,750 more than stationary arrays, and they 292 00:13:28,750 --> 00:13:30,520 look at three-dimensional objects 293 00:13:30,520 --> 00:13:34,300 more than two-dimensional objects. 294 00:13:34,300 --> 00:13:36,880 In addition to having intrinsic preferences 295 00:13:36,880 --> 00:13:40,180 between different things, they also 296 00:13:40,180 --> 00:13:42,820 have a preference for looking at displays that change 297 00:13:42,820 --> 00:13:45,340 or displays that present something new. 298 00:13:45,340 --> 00:13:47,680 So jumping from the '50s when those first studies were 299 00:13:47,680 --> 00:13:50,560 done up to the '80s, there was a whole flurry 300 00:13:50,560 --> 00:13:57,790 of studies showing babies pairs of cats on a series of trials 301 00:13:57,790 --> 00:13:59,500 and then switching to a cat and a dog. 
302 00:13:59,500 --> 00:14:00,520 And the babies would look longer-- 303 00:14:00,520 --> 00:14:01,936 these are three-month-olds-- would 304 00:14:01,936 --> 00:14:06,760 look longer at the dog than at a new example of a cat. 305 00:14:06,760 --> 00:14:09,700 So they're able to orient to novelty. 306 00:14:09,700 --> 00:14:12,250 And they also look longer at a visual array 307 00:14:12,250 --> 00:14:14,980 that connects in some way to something they can hear. 308 00:14:14,980 --> 00:14:18,340 Now, one of the things I spend a lot of my time studying 309 00:14:18,340 --> 00:14:22,330 is foundations of mathematics-- numerical and spatial cognition 310 00:14:22,330 --> 00:14:22,930 in infants. 311 00:14:22,930 --> 00:14:24,850 I'm not going to talk about it at all today. 312 00:14:24,850 --> 00:14:27,280 But I kind of couldn't resist giving just one example 313 00:14:27,280 --> 00:14:31,710 of looking at what you hear that connects to infant sensitivity 314 00:14:31,710 --> 00:14:32,620 to a number. 315 00:14:32,620 --> 00:14:35,110 This is a study that was conducted in France 316 00:14:35,110 --> 00:14:37,720 by Veronique Izard and her colleagues 317 00:14:37,720 --> 00:14:40,450 with newborn infants in a maternity hospital. 318 00:14:40,450 --> 00:14:46,670 She played infants sequences of sounds, 319 00:14:46,670 --> 00:14:50,560 and each sequence involved repetitions of a syllable. 320 00:14:50,560 --> 00:14:54,220 For half the infants, each syllable appeared four times. 321 00:14:54,220 --> 00:14:56,500 For the others, it appeared 12 times. 322 00:14:56,500 --> 00:14:58,750 And for the ones for which it appeared four times, 323 00:14:58,750 --> 00:15:00,340 each syllable was three times as long. 324 00:15:00,340 --> 00:15:01,798 So the total duration of a sequence 325 00:15:01,798 --> 00:15:03,570 was the same for the two groups, but one 326 00:15:03,570 --> 00:15:06,300 involved four syllables and one involved 12. 
327 00:15:06,300 --> 00:15:10,060 And after they heard that for a minute, 328 00:15:10,060 --> 00:15:12,060 the sound continued to play and now she showed, 329 00:15:12,060 --> 00:15:14,820 side by side, an array of four objects 330 00:15:14,820 --> 00:15:16,800 versus an array of 12 objects. 331 00:15:16,800 --> 00:15:18,930 And the babies tended to look at the array that 332 00:15:18,930 --> 00:15:23,880 corresponded in number to what they were hearing. 333 00:15:23,880 --> 00:15:26,970 Now, all of this gives us something to work with, 334 00:15:26,970 --> 00:15:30,070 but it raises a nasty problem. 335 00:15:30,070 --> 00:15:31,740 And the problem is, what are babies 336 00:15:31,740 --> 00:15:34,390 perceiving or understanding? 337 00:15:34,390 --> 00:15:36,250 Today we're not going to be asking, 338 00:15:36,250 --> 00:15:37,650 how can babies classify things? 339 00:15:37,650 --> 00:15:39,090 What do they respond to similarly? 340 00:15:39,090 --> 00:15:40,591 What do they respond to differently? 341 00:15:40,591 --> 00:15:43,006 We're going to be asking, what sense do they make of them? 342 00:15:43,006 --> 00:15:44,280 What are they representing? 343 00:15:44,280 --> 00:15:49,050 What's the content of the representations 344 00:15:49,050 --> 00:15:52,050 that they're forming in each of these cases? 345 00:15:52,050 --> 00:15:57,090 And these studies as I've just described them don't tell us. 346 00:15:57,090 --> 00:16:01,470 Let's take the case of the sphere versus the disk. 347 00:16:01,470 --> 00:16:04,800 When this study was first conducted, 348 00:16:04,800 --> 00:16:07,950 the author concluded that babies have depth perception, 349 00:16:07,950 --> 00:16:10,320 that they perceive three-dimensional solid 350 00:16:10,320 --> 00:16:12,180 objects. 351 00:16:12,180 --> 00:16:13,707 Is that a justifiable conclusion? 352 00:16:13,707 --> 00:16:14,790 AUDIENCE: Not necessarily. 353 00:16:14,790 --> 00:16:16,590 ELIZABETH SPELKE: Why not? 
354 00:16:16,590 --> 00:16:18,590 AUDIENCE: Because they are not [INAUDIBLE]. 355 00:16:18,590 --> 00:16:18,990 ELIZABETH SPELKE: Yeah. 356 00:16:18,990 --> 00:16:19,490 OK. 357 00:16:19,490 --> 00:16:22,770 So when you present things that differ in depth, 358 00:16:22,770 --> 00:16:26,310 you're presenting a host of different visual features 359 00:16:26,310 --> 00:16:28,710 that for us as adults are cues to depth. 360 00:16:28,710 --> 00:16:31,114 The question is, are they cues to depth for the infant? 361 00:16:31,114 --> 00:16:33,030 And the fact that the infant is looking longer 362 00:16:33,030 --> 00:16:35,910 at something we would call a sphere than at something we 363 00:16:35,910 --> 00:16:38,460 would call a disk doesn't tell us whether they're looking 364 00:16:38,460 --> 00:16:40,920 longer because they're thinking, "sphere," 365 00:16:40,920 --> 00:16:43,837 or "3D," or "solid," or something like that, 366 00:16:43,837 --> 00:16:46,170 or whether they're looking longer because they're seeing 367 00:16:46,170 --> 00:16:47,670 a more interesting pattern of motion 368 00:16:47,670 --> 00:16:51,994 as they move their head around, or because as they converge 369 00:16:51,994 --> 00:16:54,660 on one part of the array they're getting interesting differences 370 00:16:54,660 --> 00:16:56,993 in how in-focus different parts of it are, and so forth. 371 00:16:56,993 --> 00:16:58,560 All of the different cues to depth 372 00:16:58,560 --> 00:17:00,184 could-- what we want to know is, what's 373 00:17:00,184 --> 00:17:02,262 the basis of this preference? 374 00:17:02,262 --> 00:17:03,720 And the existence of the preference 375 00:17:03,720 --> 00:17:05,040 doesn't tell us that. 376 00:17:05,040 --> 00:17:09,540 Similarly for the cats, and similarly 377 00:17:09,540 --> 00:17:11,849 for this single isolated experiment 378 00:17:11,849 --> 00:17:13,500 that I gave you on number, right? 
379 00:17:13,500 --> 00:17:16,440 Does this say anything whatsoever about number, 380 00:17:16,440 --> 00:17:18,780 or could there be some sensory variable 381 00:17:18,780 --> 00:17:21,660 where there's just more going on in a stream of 12 sounds 382 00:17:21,660 --> 00:17:25,500 and there's more going on in an array of 12 objects, 383 00:17:25,500 --> 00:17:28,980 and babies are matching more with more, independently 384 00:17:28,980 --> 00:17:29,480 of number? 385 00:17:29,480 --> 00:17:31,770 These studies in themselves don't tell us. 386 00:17:31,770 --> 00:17:34,260 In order to find out, what we need to do 387 00:17:34,260 --> 00:17:37,260 is take these methods and do systematic experiments. 388 00:17:37,260 --> 00:17:40,710 And these experiments work best under the following conditions. 389 00:17:40,710 --> 00:17:44,730 When you're studying a function that exists in adults 390 00:17:44,730 --> 00:17:48,840 and whose properties have been explored in adults in detail 391 00:17:48,840 --> 00:17:52,080 systematically, when you have a body of psychophysical data 392 00:17:52,080 --> 00:17:55,110 that you can rest on in your understanding of what's 393 00:17:55,110 --> 00:18:00,450 happening in adults, and you can then apply that to infants. 394 00:18:00,450 --> 00:18:03,140 So one example of that took as its point of-- this 395 00:18:03,140 --> 00:18:07,470 is work by Richard Held, a wonderful perception 396 00:18:07,470 --> 00:18:11,067 psychologist who worked at MIT. 397 00:18:11,067 --> 00:18:12,150 Still is active, actually. 398 00:18:12,150 --> 00:18:14,250 He's retired but still active. 399 00:18:14,250 --> 00:18:15,840 And he did these beautiful experiments 400 00:18:15,840 --> 00:18:18,009 that started with the sphere-versus-disk phenomenon. 401 00:18:18,009 --> 00:18:19,800 And first of all, he tried to take it apart 402 00:18:19,800 --> 00:18:22,550 and say, let's just focus on one cue today, OK? 
403 00:18:22,550 --> 00:18:25,500 Binocular disparity at the basis of stereo vision. 404 00:18:25,500 --> 00:18:27,810 So he put stereo goggles on babies. 405 00:18:27,810 --> 00:18:30,630 These were babies ranging in age up to about from birth 406 00:18:30,630 --> 00:18:32,640 to about four months, I think. 407 00:18:32,640 --> 00:18:35,820 He put stereo goggles on them and showed them, side 408 00:18:35,820 --> 00:18:38,590 by side, two arrays of stripes. 409 00:18:38,590 --> 00:18:42,240 In one of the arrays, the same image went to both eyes. 410 00:18:42,240 --> 00:18:44,400 In the other arrays, the edges of the stripes 411 00:18:44,400 --> 00:18:47,580 were offset in a way that leads an adult 412 00:18:47,580 --> 00:18:50,610 to see them as organized in depth-- some stripes in front 413 00:18:50,610 --> 00:18:52,560 of others. 414 00:18:52,560 --> 00:18:57,270 And he showed that infants looked longer at the array with 415 00:18:57,270 --> 00:18:59,880 the disparity-specified differences in depth than 416 00:18:59,880 --> 00:19:01,650 the array where it didn't. 417 00:19:01,650 --> 00:19:03,360 He did not conclude from that that they 418 00:19:03,360 --> 00:19:05,460 have depth perception, but it gave him 419 00:19:05,460 --> 00:19:08,760 a basis for doing a whole series of experiments that asked, 420 00:19:08,760 --> 00:19:11,130 in effect, do you see this effect 421 00:19:11,130 --> 00:19:14,340 under all and only the conditions in which adults 422 00:19:14,340 --> 00:19:16,350 have functional stereopsis? 423 00:19:16,350 --> 00:19:19,690 So he showed, for example, that if you rotate the array 424 00:19:19,690 --> 00:19:23,130 sideways 45 degrees so that you still 425 00:19:23,130 --> 00:19:25,170 have double images on the stereo side, 426 00:19:25,170 --> 00:19:27,570 but we wouldn't see depth because our eyes are 427 00:19:27,570 --> 00:19:30,820 side by side, not one above the other, the effect goes away. 
428 00:19:30,820 --> 00:19:32,430 He varied the degree of disparity 429 00:19:32,430 --> 00:19:34,560 and showed that you only get this preference 430 00:19:34,560 --> 00:19:39,450 within this narrow range where we have functional stereopsis. 431 00:19:39,450 --> 00:19:42,940 And he was able to show the striking continuity between all 432 00:19:42,940 --> 00:19:44,910 of the properties of stereo vision 433 00:19:44,910 --> 00:19:47,370 in adults and in these infants. 434 00:19:47,370 --> 00:19:51,090 So that study and a bunch of others using other methods, 435 00:19:51,090 --> 00:19:53,700 I think have resolved this question of whether depth-- 436 00:19:53,700 --> 00:19:55,680 when depth perception begins. 437 00:19:55,680 --> 00:19:57,060 It's beginning very early. 438 00:19:57,060 --> 00:20:00,755 Stereopsis comes in around two to three months of age. 439 00:20:00,755 --> 00:20:02,420 Other depth cues come in at birth. 440 00:20:02,420 --> 00:20:03,980 It's beginning very, very early. 441 00:20:03,980 --> 00:20:05,771 But it didn't come from single experiments. 442 00:20:05,771 --> 00:20:08,701 It came from systematic patterns of experiments. 443 00:20:08,701 --> 00:20:10,700 In the case of cats versus dogs, we don't really 444 00:20:10,700 --> 00:20:13,070 have a psychophysics of cat perception, 445 00:20:13,070 --> 00:20:15,950 but steps have been taken to try to get to what the basis is 446 00:20:15,950 --> 00:20:19,850 of infants' distinction between dogs 447 00:20:19,850 --> 00:20:21,260 and cats in those experiments. 448 00:20:21,260 --> 00:20:23,900 And interestingly, what's popped out are faces. 449 00:20:23,900 --> 00:20:26,780 Turns out, you can occlude the cat and the dog's whole bodies, 450 00:20:26,780 --> 00:20:29,197 and if you leave their faces, you get these effects. 
451 00:20:29,197 --> 00:20:31,280 If you occlude their faces and leave their bodies, 452 00:20:31,280 --> 00:20:34,460 you mostly do not, unless you cheat and give 453 00:20:34,460 --> 00:20:36,860 other obvious features, like all the dogs are standing 454 00:20:36,860 --> 00:20:40,170 and all the cats are sitting, or something like that. 455 00:20:40,170 --> 00:20:42,830 But in the normal case, faces are coming out 456 00:20:42,830 --> 00:20:46,696 as an important ingredient of that distinction. 457 00:20:46,696 --> 00:20:48,320 In the case of abstract number, there's 458 00:20:48,320 --> 00:20:52,100 also a lot of work in adults on our ability 459 00:20:52,100 --> 00:20:55,850 to apprehend at a glance approximate numerical value 460 00:20:55,850 --> 00:21:00,500 of sounds in a sequence or visual arrays. 461 00:21:00,500 --> 00:21:03,320 We've learned a lot about the conditions under which we can 462 00:21:03,320 --> 00:21:06,120 do that and the conditions under which we can't. 463 00:21:06,120 --> 00:21:10,070 That's not my topic for today, but Izard and her collaborators 464 00:21:10,070 --> 00:21:12,200 have been testing for all of those conditions 465 00:21:12,200 --> 00:21:13,280 in newborn infants. 466 00:21:13,280 --> 00:21:14,670 And so far, so good. 467 00:21:14,670 --> 00:21:17,540 It looks like there is a similar alignment 468 00:21:17,540 --> 00:21:19,140 between the patterns of-- 469 00:21:19,140 --> 00:21:22,310 the factors that influence infants' responses 470 00:21:22,310 --> 00:21:24,140 in those studies where they hear sounds 471 00:21:24,140 --> 00:21:26,540 and see arrays of objects and the factors 472 00:21:26,540 --> 00:21:31,720 that influence our abilities to apprehend approximate number. 473 00:21:31,720 --> 00:21:32,420 OK. 474 00:21:32,420 --> 00:21:35,656 So this gives us some good news and some bad news. 
475 00:21:35,656 --> 00:21:37,280 The good news is that I think questions 476 00:21:37,280 --> 00:21:42,410 about the content of infants' perception and understanding 477 00:21:42,410 --> 00:21:44,960 of the world can be addressed. 478 00:21:44,960 --> 00:21:48,440 The bad news is that we can't do it very fast. 479 00:21:48,440 --> 00:21:51,390 You can't do it with a single silver-bullet experiment. 480 00:21:51,390 --> 00:21:53,990 You have to do it with a long and extensive pattern 481 00:21:53,990 --> 00:21:55,070 of research. 482 00:21:55,070 --> 00:21:58,550 In the past, research on infants has gone extremely slowly. 483 00:21:58,550 --> 00:22:00,350 Basically, the methods that we have 484 00:22:00,350 --> 00:22:03,890 allow you to ask each baby who comes into the lab maybe one, 485 00:22:03,890 --> 00:22:06,260 or if you're lucky, a couple of questions, 486 00:22:06,260 --> 00:22:08,070 but not more than that. 487 00:22:08,070 --> 00:22:11,600 So it takes a long time to do a single experiment. 488 00:22:11,600 --> 00:22:14,990 I do think, though, that this work is 489 00:22:14,990 --> 00:22:20,300 poised to accelerate dramatically 490 00:22:20,300 --> 00:22:21,980 and that we're poised to-- 491 00:22:21,980 --> 00:22:24,440 this is a good time to be thinking about infant cognition 492 00:22:24,440 --> 00:22:27,000 because I think we're soon going to be in a different world, 493 00:22:27,000 --> 00:22:29,333 where we can start asking these questions at a much more 494 00:22:29,333 --> 00:22:30,230 rapid pace. 495 00:22:30,230 --> 00:22:32,990 That's for at least two reasons, both of which, by the way, 496 00:22:32,990 --> 00:22:38,490 are being fostered by the Center for Brains, Minds and Machines 497 00:22:38,490 --> 00:22:41,630 and undertaken by people who are part of that center. 498 00:22:41,630 --> 00:22:45,230 One is, there are now efforts underway to be able to test 499 00:22:45,230 --> 00:22:46,580 infants on the web. 
500 00:22:46,580 --> 00:22:48,770 These basic simple behavioral studies, you 501 00:22:48,770 --> 00:22:52,430 can assess looking time using the webcam 502 00:22:52,430 --> 00:22:58,520 in an iPad or a laptop, and you can test babies that way. 503 00:22:58,520 --> 00:23:00,120 And there's attempts to do that, which 504 00:23:00,120 --> 00:23:01,994 would make it possible to collect data doing 505 00:23:01,994 --> 00:23:03,410 the same kinds of experiments that 506 00:23:03,410 --> 00:23:06,220 have been done in the past, but much more quickly. 507 00:23:06,220 --> 00:23:09,530 Two, as Nancy already mentioned and Rebecca 508 00:23:09,530 --> 00:23:11,900 may talk about tomorrow, there are 509 00:23:11,900 --> 00:23:15,470 efforts underway to use functional brain imaging 510 00:23:15,470 --> 00:23:19,080 to get at not only what infants look at, 511 00:23:19,080 --> 00:23:22,370 but what regions of the brain are activated when they look 512 00:23:22,370 --> 00:23:23,930 at those things, which will give us 513 00:23:23,930 --> 00:23:26,570 a more specific signal of what infants are attending 514 00:23:26,570 --> 00:23:31,500 to and processing, someday, hopefully, in the near future. 515 00:23:31,500 --> 00:23:35,397 And we just had a retreat of CBMM, 516 00:23:35,397 --> 00:23:36,980 where there was a lot of brainstorming 517 00:23:36,980 --> 00:23:38,540 about new technologies to try to get 518 00:23:38,540 --> 00:23:40,831 more than just simple looking time out of young babies. 519 00:23:40,831 --> 00:23:43,316 So maybe some of that will work as well. 520 00:23:43,316 --> 00:23:44,690 But what I want to focus on today 521 00:23:44,690 --> 00:23:47,390 is that even this slow, plodding research 522 00:23:47,390 --> 00:23:49,360 has gone on for long enough at this point 523 00:23:49,360 --> 00:23:52,400 that I think we've learned something about what infants 524 00:23:52,400 --> 00:23:55,820 perceive and what they know. 
525 00:23:55,820 --> 00:23:59,310 And I tried to put what I think we learned into two slides. 526 00:23:59,310 --> 00:24:00,950 Here's the first one. 527 00:24:00,950 --> 00:24:03,710 I think that very early in development, 528 00:24:03,710 --> 00:24:06,650 maybe in the newborn period, but anyway, 529 00:24:06,650 --> 00:24:08,990 before babies are starting to reach for things 530 00:24:08,990 --> 00:24:13,280 and move around on their own, they already 531 00:24:13,280 --> 00:24:18,290 have a set of functioning cognitive systems, 532 00:24:18,290 --> 00:24:21,410 each specific to a different domain. 533 00:24:21,410 --> 00:24:25,700 One is a system for representing objects and their motions, 534 00:24:25,700 --> 00:24:28,980 collisions, and other interactions. 535 00:24:28,980 --> 00:24:31,310 Another is a system for representing people 536 00:24:31,310 --> 00:24:34,880 as agents who act on objects, and in doing so, 537 00:24:34,880 --> 00:24:39,620 pursue goals and cause changes in the world. 538 00:24:39,620 --> 00:24:42,290 A third is a system for perceiving people 539 00:24:42,290 --> 00:24:46,250 as social beings who can communicate with, engage 540 00:24:46,250 --> 00:24:50,630 with other social beings and share mental states. 541 00:24:50,630 --> 00:24:53,372 And then three other systems that I won't talk about today. 542 00:24:53,372 --> 00:24:54,830 One system of number, which I think 543 00:24:54,830 --> 00:24:58,370 is being tapped in that first Izard experiment. 544 00:24:58,370 --> 00:25:01,800 And two systems capturing aspects of geometry, 545 00:25:01,800 --> 00:25:06,120 one supporting navigation of the sort that Matt Wilson studies, 546 00:25:06,120 --> 00:25:11,630 the other supporting visual form perception of the sort that IT 547 00:25:11,630 --> 00:25:17,100 and occipital cortex represent. 548 00:25:17,100 --> 00:25:20,880 I think each of these systems operates as a whole. 
549 00:25:20,880 --> 00:25:22,980 In Josh's terms from last Friday, 550 00:25:22,980 --> 00:25:25,410 it's internally compositional. 551 00:25:25,410 --> 00:25:27,810 Infants don't just come equipped with a set 552 00:25:27,810 --> 00:25:31,050 of local facts about how objects behave, they come equipped 553 00:25:31,050 --> 00:25:34,080 with a set of more general rules or principles that allow them 554 00:25:34,080 --> 00:25:36,300 to deal with objects in novel situations 555 00:25:36,300 --> 00:25:40,360 and make productive inferences about their interactions 556 00:25:40,360 --> 00:25:43,420 and behavior. 557 00:25:43,420 --> 00:25:45,030 Each of these systems is partially 558 00:25:45,030 --> 00:25:47,250 distinct from the other systems. 559 00:25:47,250 --> 00:25:48,990 It's distinct in three ways. 560 00:25:48,990 --> 00:25:51,240 First, each of them operates on different information. 561 00:25:51,240 --> 00:25:54,010 It's elicited under different conditions. 562 00:25:54,010 --> 00:25:59,070 Second, it gives rise to different representations 563 00:25:59,070 --> 00:26:00,660 with different content. 564 00:26:00,660 --> 00:26:04,240 And third, most deeply, it answers different questions. 565 00:26:04,240 --> 00:26:06,600 So for example, we have two-- infants 566 00:26:06,600 --> 00:26:09,360 have two systems for reasoning about people, 567 00:26:09,360 --> 00:26:11,560 but each system is answering a different question. 568 00:26:11,560 --> 00:26:13,481 The agent system is answering the question, 569 00:26:13,481 --> 00:26:14,480 what is this guy's goal? 570 00:26:14,480 --> 00:26:15,813 What is he trying to accomplish? 571 00:26:15,813 --> 00:26:19,380 What changes is he effecting in the world? 572 00:26:19,380 --> 00:26:22,080 The social system is asking, who is this guy related to? 573 00:26:22,080 --> 00:26:23,400 Who is he connected to? 574 00:26:23,400 --> 00:26:27,360 Who is he communicating with? 
575 00:26:27,360 --> 00:26:31,500 Each of the systems is limited, extremely limited relative 576 00:26:31,500 --> 00:26:33,360 to what we find in adults. 577 00:26:33,360 --> 00:26:35,640 Each captures only a tiny part of what 578 00:26:35,640 --> 00:26:39,690 we as adults know about objects or agents 579 00:26:39,690 --> 00:26:41,209 or social interactions. 580 00:26:41,209 --> 00:26:42,750 Each of them, I think, interestingly, 581 00:26:42,750 --> 00:26:44,490 is shared by other animals. 582 00:26:44,490 --> 00:26:47,040 I didn't expect that to be true when 583 00:26:47,040 --> 00:26:48,870 we started doing this research. 584 00:26:48,870 --> 00:26:53,040 But as far as we can see so far, it's hard to find anything that 585 00:26:53,040 --> 00:26:56,490 a young human infant can do that a non-human animal can't. 586 00:26:56,490 --> 00:26:59,140 And I'll give you examples of that, too. 587 00:26:59,140 --> 00:27:02,610 And finally-- and I won't talk about this much, unfortunately. 588 00:27:02,610 --> 00:27:04,500 I think each of these systems continues 589 00:27:04,500 --> 00:27:07,080 to function throughout life and supports the development 590 00:27:07,080 --> 00:27:09,990 of new systems of knowledge. 591 00:27:09,990 --> 00:27:12,150 So when we think thoughts that only humans think, 592 00:27:12,150 --> 00:27:14,260 we engage these fundamental systems 593 00:27:14,260 --> 00:27:17,880 that we've had since infancy and other animals share. 594 00:27:17,880 --> 00:27:19,770 I also think this research tells us 595 00:27:19,770 --> 00:27:21,300 something about how we do that. 
596 00:27:21,300 --> 00:27:24,120 I think that in addition to having 597 00:27:24,120 --> 00:27:26,280 these basic early developing systems, 598 00:27:26,280 --> 00:27:29,520 we have a uniquely human capacity 599 00:27:29,520 --> 00:27:31,830 to productively combine information 600 00:27:31,830 --> 00:27:36,510 across these systems, and through those combinations, 601 00:27:36,510 --> 00:27:39,900 to construct new concepts. 602 00:27:39,900 --> 00:27:41,910 I think these new concepts underlie, 603 00:27:41,910 --> 00:27:44,940 or they tend to be abstract, and they 604 00:27:44,940 --> 00:27:48,090 underlie a set of very important later-developing systems 605 00:27:48,090 --> 00:27:50,380 of knowledge, including knowledge 606 00:27:50,380 --> 00:27:54,630 that allows us to form taxonomies of objects, of tools, 607 00:27:54,630 --> 00:27:59,700 of natural kinds like animals and plants, 608 00:27:59,700 --> 00:28:01,350 and to reason about their behavior, 609 00:28:01,350 --> 00:28:03,660 such that when we encounter some new thing, 610 00:28:03,660 --> 00:28:05,970 we already know a lot about the kind of thing 611 00:28:05,970 --> 00:28:09,600 that it is and can use that to infer many 612 00:28:09,600 --> 00:28:12,150 of its specific properties, and also 613 00:28:12,150 --> 00:28:14,190 to direct our learning very explicitly to fill 614 00:28:14,190 --> 00:28:16,710 in the gaps in our knowledge. 615 00:28:16,710 --> 00:28:19,320 Another is the systems of natural number 616 00:28:19,320 --> 00:28:20,910 and Euclidean geometry. 617 00:28:20,910 --> 00:28:24,300 Natural number, children seem to construct over the first three 618 00:28:24,300 --> 00:28:25,680 to five years of life. 619 00:28:25,680 --> 00:28:29,190 Euclidean geometry seems to take much longer, much, much later. 
620 00:28:29,190 --> 00:28:32,550 Molly Dillon, who's also here, has been trying to work 621 00:28:32,550 --> 00:28:35,370 on understanding-- and so has Veronique Izard-- 622 00:28:35,370 --> 00:28:37,170 how children go from six years of age, 623 00:28:37,170 --> 00:28:39,630 where they seem absolutely clueless about the simplest 624 00:28:39,630 --> 00:28:43,010 properties of Euclidean geometry, to 12-year-olds who, 625 00:28:43,010 --> 00:28:45,510 whether they're in the Amazon and have never been to school, 626 00:28:45,510 --> 00:28:51,540 or studying geometry in school, seem 627 00:28:51,540 --> 00:28:54,090 to have a basic rudimentary understanding of points 628 00:28:54,090 --> 00:28:57,270 and lines and figures on the Euclidean plane. 629 00:28:57,270 --> 00:28:59,426 A third is a system of persons and mental states. 630 00:28:59,426 --> 00:29:01,050 And I won't talk about it, but I'm only 631 00:29:01,050 --> 00:29:03,270 talking for the first half or so of this time, 632 00:29:03,270 --> 00:29:05,250 then Alia Martin's going to take over. 633 00:29:05,250 --> 00:29:06,780 And you'll touch on-- 634 00:29:06,780 --> 00:29:09,480 you'll get to some of those issues. 635 00:29:09,480 --> 00:29:14,160 Now, as Nancy said last week, I have this out-there hypothesis 636 00:29:14,160 --> 00:29:16,410 that I don't think anybody else in the world believes, 637 00:29:16,410 --> 00:29:18,060 but I still believe it. 638 00:29:18,060 --> 00:29:21,060 That this productive combinatorial capacity 639 00:29:21,060 --> 00:29:24,120 either is or is intimately tied to what's 640 00:29:24,120 --> 00:29:26,310 the most obvious cognitive difference between us 641 00:29:26,310 --> 00:29:30,390 and other animals, namely our faculty of natural language. 
642 00:29:30,390 --> 00:29:33,870 In particular, I think that there 643 00:29:33,870 --> 00:29:36,330 are two general properties of natural language 644 00:29:36,330 --> 00:29:39,930 that make it an ideal medium for forming combinations 645 00:29:39,930 --> 00:29:42,790 of new concepts. 646 00:29:42,790 --> 00:29:46,560 One is that the words and the rules of-- 647 00:29:46,560 --> 00:29:48,090 well, three properties, actually. 648 00:29:48,090 --> 00:29:52,230 One is that the syntactic and semantic rules 649 00:29:52,230 --> 00:29:54,600 of natural languages are combinatorial and 650 00:29:54,600 --> 00:29:55,300 compositional. 651 00:29:55,300 --> 00:29:57,780 That is, if you learn the meanings of words 652 00:29:57,780 --> 00:29:59,770 and you learn how to combine them, 653 00:29:59,770 --> 00:30:02,190 you get the meanings of the expressions for free. 654 00:30:02,190 --> 00:30:03,740 You don't need to go out and learn 655 00:30:03,740 --> 00:30:05,960 what a brown cow is if you know what brown is 656 00:30:05,960 --> 00:30:09,220 and you know what a cow is. 657 00:30:09,220 --> 00:30:13,490 Second, the words and the rules of natural language 658 00:30:13,490 --> 00:30:15,680 apply across all domains. 659 00:30:15,680 --> 00:30:18,500 They're not restricted to one domain or another the way 660 00:30:18,500 --> 00:30:22,010 infants' other cognitive capacities seem to be. 661 00:30:22,010 --> 00:30:26,600 So if you learn how "brown" behaves 662 00:30:26,600 --> 00:30:29,660 in the expression "brown cow," and then you hear "brown ball," 663 00:30:29,660 --> 00:30:31,910 or something that a different domain of core knowledge 664 00:30:31,910 --> 00:30:35,750 would be capturing, you can immediately 665 00:30:35,750 --> 00:30:38,070 interpret that combination as well. 
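The compositionality point above (knowing "brown" and knowing "cow" gives you "brown cow" for free, and the same rule carries over to "brown ball") can be sketched in a few lines of Python. This is only a toy illustration, not a model from the lecture: the lexicon, the dictionary-based entities, and the `combine` helper are all hypothetical names introduced here.

```python
from typing import Callable, Dict

# An entity is a bag of properties; a word meaning is a test on entities.
Entity = Dict[str, str]
Predicate = Callable[[Entity], bool]

# Hypothetical toy lexicon: each word is learned once, on its own.
LEXICON: Dict[str, Predicate] = {
    "brown": lambda e: e.get("color") == "brown",
    "cow": lambda e: e.get("kind") == "cow",
    "ball": lambda e: e.get("kind") == "ball",
}


def combine(adjective: str, noun: str) -> Predicate:
    """One combination rule: an entity satisfies 'ADJ NOUN' if it satisfies both words."""
    adj_meaning, noun_meaning = LEXICON[adjective], LEXICON[noun]
    return lambda e: adj_meaning(e) and noun_meaning(e)


# No new learning is needed for either phrase; the rule is domain-general.
brown_cow = combine("brown", "cow")
brown_ball = combine("brown", "ball")

bessie = {"kind": "cow", "color": "brown"}
toy = {"kind": "ball", "color": "brown"}

print(brown_cow(bessie), brown_cow(toy), brown_ball(toy))
```

The single `combine` rule applies to an animal and to an inanimate object alike, which is the sense in which the words and rules are said to cut across domains.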
666 00:30:38,070 --> 00:30:41,510 And then the last thing about natural language that I think 667 00:30:41,510 --> 00:30:44,960 makes it so useful for cognitive development is that it's 668 00:30:44,960 --> 00:30:46,520 learned from other people. 669 00:30:46,520 --> 00:30:48,560 And other people talk about the things 670 00:30:48,560 --> 00:30:51,200 that they find useful to think about, right? 671 00:30:51,200 --> 00:30:53,720 Word frequency is a really good proxy 672 00:30:53,720 --> 00:30:57,740 for what the useful concepts out there are. 673 00:30:57,740 --> 00:31:03,980 So a child who has a very powerful combinatorial system 674 00:31:03,980 --> 00:31:06,560 that can create a huge set of concepts 675 00:31:06,560 --> 00:31:09,590 is going to have a search problem when they try to apply 676 00:31:09,590 --> 00:31:11,150 those concepts to the world. 677 00:31:11,150 --> 00:31:12,670 Something will happen in the world. 678 00:31:12,670 --> 00:31:15,380 And if they now have a million concepts 679 00:31:15,380 --> 00:31:17,600 that they could bring to bear, which one are they 680 00:31:17,600 --> 00:31:18,300 going to use? 681 00:31:18,300 --> 00:31:20,960 Are they to test them all out? 682 00:31:20,960 --> 00:31:23,780 Having too many concepts, too many innate concepts, 683 00:31:23,780 --> 00:31:26,360 would not necessarily be a blessing. 684 00:31:26,360 --> 00:31:29,819 But if you use language to guide you to the useful concepts, 685 00:31:29,819 --> 00:31:30,860 I think you'll do better. 686 00:31:30,860 --> 00:31:32,450 The ones people are going to talk about around 687 00:31:32,450 --> 00:31:34,699 you most frequently are going to be the ones that it's 688 00:31:34,699 --> 00:31:37,350 going to be most useful for you to be learning at that point. 689 00:31:37,350 --> 00:31:41,180 So let's go back to that first set of questions, which is what 690 00:31:41,180 --> 00:31:43,980 I want to be focusing on today. 
691 00:31:43,980 --> 00:31:46,610 And as I said, I'll talk particularly 692 00:31:46,610 --> 00:31:49,090 about three domains where infants 693 00:31:49,090 --> 00:31:51,020 seem to develop knowledge quite rapidly 694 00:31:51,020 --> 00:31:52,460 over the course of infancy. 695 00:31:52,460 --> 00:31:56,300 And I'll spend most of my time on the first one, objects. 696 00:31:56,300 --> 00:32:00,350 So object cognition is really interesting 697 00:32:00,350 --> 00:32:02,810 and it seems to span this really big range. 698 00:32:02,810 --> 00:32:06,280 It seems to involve many different kinds of processes. 699 00:32:06,280 --> 00:32:08,510 If you're going to figure out what the objects are, 700 00:32:08,510 --> 00:32:11,330 what the bodies are in a scene, then you 701 00:32:11,330 --> 00:32:13,760 need segmentation abilities. 702 00:32:13,760 --> 00:32:15,770 You need to be able to take an array like this 703 00:32:15,770 --> 00:32:19,490 and break it down into units, figuring out 704 00:32:19,490 --> 00:32:22,890 what different parts of that array lie on the same object 705 00:32:22,890 --> 00:32:26,240 and what parts lie on different ones. 706 00:32:26,240 --> 00:32:29,720 So early mechanisms for doing that 707 00:32:29,720 --> 00:32:33,170 can participate in object representation. 708 00:32:33,170 --> 00:32:36,320 But also to perceive objects, arrays are cluttered 709 00:32:36,320 --> 00:32:39,200 and objects tend to be opaque. 710 00:32:39,200 --> 00:32:41,690 And when they are, it's never the case 711 00:32:41,690 --> 00:32:43,550 that all of the surfaces of one object 712 00:32:43,550 --> 00:32:44,930 are in view at the same time. 713 00:32:44,930 --> 00:32:46,610 And it's often the case that you're only 714 00:32:46,610 --> 00:32:49,430 seeing a little bit of any given object at a time. 
715 00:32:49,430 --> 00:32:50,900 Yet somehow we're able to see this 716 00:32:50,900 --> 00:32:53,600 as a continuous table that's extending behind everything 717 00:32:53,600 --> 00:32:58,520 that's sitting on it, and even sort of as a continuous plate, 718 00:32:58,520 --> 00:33:00,170 a single plate that's partly-- 719 00:33:00,170 --> 00:33:04,560 that's on the table behind the vase, and so forth. 720 00:33:04,560 --> 00:33:06,470 So to represent objects, we've got 721 00:33:06,470 --> 00:33:08,540 to be able to take these visual fragments 722 00:33:08,540 --> 00:33:11,330 and put them together in the right sorts of ways. 723 00:33:11,330 --> 00:33:14,450 Something that's harder to show in a static image, 724 00:33:14,450 --> 00:33:17,000 but that of course is radically true about the world 725 00:33:17,000 --> 00:33:19,610 is that our perceptual encounters with objects 726 00:33:19,610 --> 00:33:20,840 are intermittent. 727 00:33:20,840 --> 00:33:22,670 We can look away and then look back, 728 00:33:22,670 --> 00:33:24,270 or an object can move out of view 729 00:33:24,270 --> 00:33:25,940 and then come back into view, yet 730 00:33:25,940 --> 00:33:29,270 what we experience is a world of persisting objects that 731 00:33:29,270 --> 00:33:31,520 are existing and moving on connected paths, 732 00:33:31,520 --> 00:33:35,430 whether we're looking at them or not. 733 00:33:35,430 --> 00:33:38,750 And finally, objects interact with other objects 734 00:33:38,750 --> 00:33:40,720 and we need to work out those interactions. 735 00:33:40,720 --> 00:33:42,470 And the working out that I'm interested in 736 00:33:42,470 --> 00:33:44,710 is not what this little boy is doing, 737 00:33:44,710 --> 00:33:46,430 but what his younger sister is doing 738 00:33:46,430 --> 00:33:48,080 as she's sitting in her infant seat 739 00:33:48,080 --> 00:33:50,720 and observing him acting on these towers 740 00:33:50,720 --> 00:33:52,591 and wondering what's going to happen next. 
741 00:33:52,591 --> 00:33:53,090 OK? 742 00:33:53,090 --> 00:33:57,320 At least that's the problem on the table for today. 743 00:33:57,320 --> 00:34:02,840 OK, so a standard view for a very long time 744 00:34:02,840 --> 00:34:05,001 has been that different mechanisms solve 745 00:34:05,001 --> 00:34:07,250 these different aspects of the problem of representing 746 00:34:07,250 --> 00:34:08,080 objects. 747 00:34:08,080 --> 00:34:10,159 That segmentation depends on relatively 748 00:34:10,159 --> 00:34:11,750 low-level mechanisms. 749 00:34:11,750 --> 00:34:14,176 Completion and identity through time, 750 00:34:14,176 --> 00:34:16,550 it's going to depend on how much time we're talking about 751 00:34:16,550 --> 00:34:18,860 and how complicated the transformations are. 752 00:34:18,860 --> 00:34:20,429 They're sort of in the middle. 753 00:34:20,429 --> 00:34:24,320 And this is all about reasoning, about concepts 754 00:34:24,320 --> 00:34:26,300 that go beyond perception altogether, 755 00:34:26,300 --> 00:34:28,489 like the mass of an object, which we can't 756 00:34:28,489 --> 00:34:31,944 see directly, and so forth. 757 00:34:31,944 --> 00:34:33,860 And I kind of believed that that was true when 758 00:34:33,860 --> 00:34:36,150 we started doing this work. 759 00:34:36,150 --> 00:34:39,210 And because I did and wanted to know where the boundaries were 760 00:34:39,210 --> 00:34:40,909 of what infants could do, I started 761 00:34:40,909 --> 00:34:42,380 by working on these problems here. 762 00:34:42,380 --> 00:34:44,360 And that's what I'm going to talk about today. 763 00:34:44,360 --> 00:34:46,850 But let me flag at the outset that I no longer 764 00:34:46,850 --> 00:34:52,080 believe that the real representations of objects that 765 00:34:52,080 --> 00:34:55,210 organize infants' learning about the physical world, I 766 00:34:55,210 --> 00:34:57,380 no longer believe that they're embodied 767 00:34:57,380 --> 00:34:58,820 in a set of diverse systems. 
768 00:34:58,820 --> 00:35:02,110 I think there's a single system that's ultimately at work here. 769 00:35:02,110 --> 00:35:03,770 Of course it has multiple levels to it, 770 00:35:03,770 --> 00:35:06,090 including low-level edge detection, and so forth. 771 00:35:06,090 --> 00:35:08,640 But that there's a single system at work that both-- 772 00:35:08,640 --> 00:35:11,150 that tells us what's connected to what 773 00:35:11,150 --> 00:35:12,580 and where the boundaries of things 774 00:35:12,580 --> 00:35:15,880 are in arrays like this, how things continue 775 00:35:15,880 --> 00:35:18,520 where and when they're hidden, and how they interact 776 00:35:18,520 --> 00:35:19,330 with other things. 777 00:35:19,330 --> 00:35:21,670 That's one unitary system, and I'll 778 00:35:21,670 --> 00:35:24,790 try to show you what evidence supports 779 00:35:24,790 --> 00:35:27,670 that view, though, of course, jump 780 00:35:27,670 --> 00:35:31,330 in with questions or criticisms or alternative accounts. 781 00:35:31,330 --> 00:35:35,860 OK, so here's an intermediate case to start with. 782 00:35:35,860 --> 00:35:40,150 You present a-- it was studied a lot by Belgian psychologist 783 00:35:40,150 --> 00:35:45,550 Albert Michotte back in the 1950s, I think-- '50s or early '60s. 784 00:35:45,550 --> 00:35:49,390 Take a triangle, present it behind an occluder, 785 00:35:49,390 --> 00:35:55,450 and ask babies, in effect, what do you see in that triangle? 786 00:35:55,450 --> 00:35:57,880 Do you see a connected object or do you see 787 00:35:57,880 --> 00:36:00,252 two separate visible fragments? 788 00:36:00,252 --> 00:36:01,960 We did these studies with four-month-olds 789 00:36:01,960 --> 00:36:03,751 because they're not yet reaching for things 790 00:36:03,751 --> 00:36:05,470 and manipulating objects. 791 00:36:05,470 --> 00:36:07,120 We used the fact that they tend to like 792 00:36:07,120 --> 00:36:08,620 to look at things that are new. 
793 00:36:08,620 --> 00:36:11,590 So we presented this display repeatedly-- we, by the way, 794 00:36:11,590 --> 00:36:14,230 is Phil Kellman, now at UCLA and studying 795 00:36:14,230 --> 00:36:16,030 all this stuff in adults primarily, also 796 00:36:16,030 --> 00:36:18,130 studying mathematics now. 797 00:36:18,130 --> 00:36:22,390 Anyhow, so we presented displays like this repeatedly 798 00:36:22,390 --> 00:36:24,640 to babies until they got bored with them. 799 00:36:24,640 --> 00:36:27,670 And then we took the occluder away and in alternation, 800 00:36:27,670 --> 00:36:29,680 presented them with a complete triangle 801 00:36:29,680 --> 00:36:32,302 and with a triangle that had a gap in the center. 802 00:36:32,302 --> 00:36:34,510 And we reasoned that there were two possible outcomes 803 00:36:34,510 --> 00:36:35,650 of the study. 804 00:36:35,650 --> 00:36:41,290 Possibility one is that as empiricists and the then-very 805 00:36:41,290 --> 00:36:43,930 influential child psychologist-- 806 00:36:43,930 --> 00:36:46,030 developmental psychologist Jean Piaget 807 00:36:46,030 --> 00:36:48,970 argued, for a four-month-old infant who isn't yet 808 00:36:48,970 --> 00:36:50,710 reaching for things, the world is 809 00:36:50,710 --> 00:36:53,200 an array of visible fragments. 810 00:36:53,200 --> 00:36:55,840 So they will see this thing as ending 811 00:36:55,840 --> 00:37:00,250 at this edge where the occluder begins, and this display will 812 00:37:00,250 --> 00:37:02,180 look more similar to them than this display, 813 00:37:02,180 --> 00:37:04,444 so they'll be more interested in that one. 
814 00:37:04,444 --> 00:37:06,610 There was also the theory from Gestalt psychologists 815 00:37:06,610 --> 00:37:08,943 and others that predicted the opposite, that there would 816 00:37:08,943 --> 00:37:11,050 be automatic completion processes that 817 00:37:11,050 --> 00:37:14,080 would lead any creature, whether they were experienced or not, 818 00:37:14,080 --> 00:37:16,870 to perceive the simpler arrangement, which is this one. 819 00:37:16,870 --> 00:37:19,330 Those, it seemed to us, were the only two options. 820 00:37:19,330 --> 00:37:20,921 Baby research is really fun because it 821 00:37:20,921 --> 00:37:22,420 can surprise you even when you think 822 00:37:22,420 --> 00:37:23,800 you've covered all the bases. 823 00:37:23,800 --> 00:37:26,000 Neither of those turned out to be true. 824 00:37:26,000 --> 00:37:28,690 What happened instead was that when we took the occluder away, 825 00:37:28,690 --> 00:37:30,880 you still saw an increase in looking 826 00:37:30,880 --> 00:37:33,790 both to the connected object and to the separate object, 827 00:37:33,790 --> 00:37:36,700 and those two increases were equal. 828 00:37:36,700 --> 00:37:39,610 Now, this could have been for an extremely boring reason. 829 00:37:39,610 --> 00:37:41,410 Maybe babies were only paying attention 830 00:37:41,410 --> 00:37:44,450 to the thing that was closest to them in the array. 831 00:37:44,450 --> 00:37:47,740 So we very quickly tested for that in the following way. 832 00:37:47,740 --> 00:37:51,160 Instead of contrasting an array with a small gap to an array 833 00:37:51,160 --> 00:37:53,590 that had it filled in, we contrasted an array 834 00:37:53,590 --> 00:37:55,900 with a small gap to an array with a larger gap, 835 00:37:55,900 --> 00:37:58,000 too large to have fit behind the occluder. 836 00:37:58,000 --> 00:38:00,190 And there, babies looked longer at the array 837 00:38:00,190 --> 00:38:01,510 with the larger gap. 
838 00:38:01,510 --> 00:38:03,250 So we know it's not that they're not 839 00:38:03,250 --> 00:38:06,790 seeing this back form and its visible surfaces, 840 00:38:06,790 --> 00:38:09,730 but they seem to be uncommitted as to whether those surfaces 841 00:38:09,730 --> 00:38:11,957 are connected behind the occluder or not. 842 00:38:11,957 --> 00:38:14,290 They don't see them as ending where the occluder begins, 843 00:38:14,290 --> 00:38:16,569 but they don't clearly see them as connected, either. 844 00:38:16,569 --> 00:38:18,610 And we showed that this was quite generally true, 845 00:38:18,610 --> 00:38:21,580 both for simpler arrays and for more complicated-- well, 846 00:38:21,580 --> 00:38:23,905 for richer ones, like a sphere. 847 00:38:23,905 --> 00:38:25,780 We did this with a bunch of different arrays. 848 00:38:25,780 --> 00:38:27,520 And under these conditions, where 849 00:38:27,520 --> 00:38:30,136 the arrays are stationary, that's what we found. 850 00:38:30,136 --> 00:38:31,510 But there was one condition where 851 00:38:31,510 --> 00:38:33,093 we got a different finding, and that's 852 00:38:33,093 --> 00:38:35,200 when we took one of these arrays and moved it 853 00:38:35,200 --> 00:38:37,780 behind the occluder, never moving it enough to bring 854 00:38:37,780 --> 00:38:40,330 the center into view, but moving it enough such 855 00:38:40,330 --> 00:38:42,696 that the top and bottom were moving together. 856 00:38:42,696 --> 00:38:44,320 And when we did that, now babies looked 857 00:38:44,320 --> 00:38:46,000 longer at the display that had the gap. 858 00:38:49,160 --> 00:38:52,900 That raised the question, why is motion having this effect? 
859 00:38:52,900 --> 00:38:54,830 And the immediate possibility, we thought, 860 00:38:54,830 --> 00:38:56,830 is motion is calling their attention to the rod, 861 00:38:56,830 --> 00:38:59,020 so they're attending to it more than they otherwise 862 00:38:59,020 --> 00:39:01,810 would, and it's leading them to see its other properties, 863 00:39:01,810 --> 00:39:04,940 like the alignment of its edges. 864 00:39:04,940 --> 00:39:09,340 So to test that, we gave them misaligned objects differing 865 00:39:09,340 --> 00:39:10,810 in color, differing in texture. 866 00:39:10,810 --> 00:39:12,490 All of the edges-- none of the edges 867 00:39:12,490 --> 00:39:14,277 were aligned with each other. 868 00:39:14,277 --> 00:39:16,360 If motion was just calling attention to alignment, 869 00:39:16,360 --> 00:39:17,980 it shouldn't do that in this case. 870 00:39:17,980 --> 00:39:22,210 But in fact, we found that after getting bored with that, 871 00:39:22,210 --> 00:39:24,970 infants expected something like this, not something like that. 872 00:39:24,970 --> 00:39:27,215 They looked longer at the display with the gap. 873 00:39:27,215 --> 00:39:28,840 So it looks like the motion is actually 874 00:39:28,840 --> 00:39:32,530 providing the information for the connectedness, 875 00:39:32,530 --> 00:39:36,476 and the alignment is not playing much of a role at all. 876 00:39:36,476 --> 00:39:37,850 Now, what could be going on here? 877 00:39:37,850 --> 00:39:39,040 This is the kind of thing I think 878 00:39:39,040 --> 00:39:41,510 that Josh likes to call a suspicious coincidence, right? 879 00:39:41,510 --> 00:39:44,080 That an infant is looking at this array, 880 00:39:44,080 --> 00:39:46,120 and isn't it odd that we're seeing this-- 881 00:39:46,120 --> 00:39:47,740 I'm seeing the same pattern of motion 882 00:39:47,740 --> 00:39:50,930 below the occluder as I'm seeing above it?
883 00:39:50,930 --> 00:39:53,170 Now that could be two separate objects that 884 00:39:53,170 --> 00:39:55,630 just happen to be moving together, 885 00:39:55,630 --> 00:39:57,490 but that would be rather unlikely. 886 00:39:57,490 --> 00:39:59,680 You're much more likely to see a pattern like that 887 00:39:59,680 --> 00:40:02,840 if in fact there's a connection between them and it's just one object that's 888 00:40:02,840 --> 00:40:03,369 in motion. 889 00:40:03,369 --> 00:40:04,910 I think that's probably the right way 890 00:40:04,910 --> 00:40:08,880 to think about what's going on in these experiments. 891 00:40:08,880 --> 00:40:12,260 But if it is, notice that not all coincidences 892 00:40:12,260 --> 00:40:15,440 that are suspicious for us are suspicious for infants. 893 00:40:15,440 --> 00:40:17,330 For us, it's a suspicious coincidence 894 00:40:17,330 --> 00:40:19,550 that this edge is aligned with that edge. 895 00:40:19,550 --> 00:40:21,092 For infants, it's not. 896 00:40:21,092 --> 00:40:22,550 I think this is a case where we can 897 00:40:22,550 --> 00:40:25,520 see infants can be useful for thinking 898 00:40:25,520 --> 00:40:27,710 about our own cognitive abilities 899 00:40:27,710 --> 00:40:32,520 because they seem to share some of our picture of the world, 900 00:40:32,520 --> 00:40:34,440 but not all of our picture of the world. 901 00:40:34,440 --> 00:40:37,550 And that can be a hint as to how that picture gets put together 902 00:40:37,550 --> 00:40:39,180 and how it's organized. 903 00:40:39,180 --> 00:40:40,537 So what kind of motion? 904 00:40:40,537 --> 00:40:42,120 We've tried a bunch of different ones. 905 00:40:42,120 --> 00:40:43,700 One of them is vertical motion. 906 00:40:43,700 --> 00:40:47,330 That's interesting because it's also a rigid displacement 907 00:40:47,330 --> 00:40:48,500 or motion in depth. 908 00:40:48,500 --> 00:40:51,830 They're both rigid displacements in three-dimensional space.
909 00:40:51,830 --> 00:40:54,110 Actually, all of these three are. 910 00:40:54,110 --> 00:40:56,480 But in this case, you don't get any side-to-side changes 911 00:40:56,480 --> 00:40:57,830 in the visual field. 912 00:40:57,830 --> 00:40:59,010 I think I animated this. 913 00:40:59,010 --> 00:40:59,789 Yeah. 914 00:40:59,789 --> 00:41:01,580 So this is kind of what the baby is seeing. 915 00:41:01,580 --> 00:41:04,370 By the way, all of these studies were done with real 3D objects 916 00:41:04,370 --> 00:41:06,290 and they had textures on them, and so forth. 917 00:41:06,290 --> 00:41:08,090 They've also all since been replicated 918 00:41:08,090 --> 00:41:10,880 in other labs using computer animated displays, which 919 00:41:10,880 --> 00:41:12,180 we didn't have-- 920 00:41:12,180 --> 00:41:14,734 which weren't available back in the day. 921 00:41:14,734 --> 00:41:16,400 And you get the same result. So I'm just 922 00:41:16,400 --> 00:41:18,020 doing cartoon versions of them here, 923 00:41:18,020 --> 00:41:20,600 but actually babies showed these effects 924 00:41:20,600 --> 00:41:22,176 across a range of different displays. 925 00:41:22,176 --> 00:41:23,300 So there's vertical motion. 926 00:41:23,300 --> 00:41:25,287 Here is motion in depth. 927 00:41:25,287 --> 00:41:27,620 Oh, and by the way, we're not restraining babies' heads, 928 00:41:27,620 --> 00:41:29,578 so it's not going to be anything near as, like, 929 00:41:29,578 --> 00:41:32,205 simple uniform as what's at their eye, 930 00:41:32,205 --> 00:41:33,570 is what I'm showing here. 931 00:41:33,570 --> 00:41:39,170 And then rotational motion, like that, around the midpoint. 932 00:41:39,170 --> 00:41:42,830 And what we found is that babies used both vertical motion 933 00:41:42,830 --> 00:41:46,070 and motion in depth about as well as they 934 00:41:46,070 --> 00:41:50,000 used horizontal motion to perceive the connectedness 935 00:41:50,000 --> 00:41:50,870 of the object. 
936 00:41:50,870 --> 00:41:53,300 They did not use rotary motion. 937 00:41:53,300 --> 00:41:56,810 So I know there's a lot of interest and projects focused 938 00:41:56,810 --> 00:41:58,550 on perceptual invariance. 939 00:41:58,550 --> 00:42:00,560 And I think there's an interesting puzzle here, 940 00:42:00,560 --> 00:42:02,976 and it's one that Molly is very interested in, in the work 941 00:42:02,976 --> 00:42:05,030 that she's doing on geometry. 942 00:42:05,030 --> 00:42:08,510 These are all rigid motions. 943 00:42:08,510 --> 00:42:11,300 But somehow, rotation seems to be a whole lot harder 944 00:42:11,300 --> 00:42:17,210 for young intelligent beings to wrap their heads around 945 00:42:17,210 --> 00:42:20,690 than translation is-- 946 00:42:20,690 --> 00:42:23,660 including translation in depth or vertical translation. 947 00:42:23,660 --> 00:42:26,970 There's something hard about orientation changes. 948 00:42:26,970 --> 00:42:29,440 And in fact, I think they remain hard for us as adults. 949 00:42:29,440 --> 00:42:33,320 If you think of things like how the shape of a square 950 00:42:33,320 --> 00:42:36,500 seems to change if you rotate it 45 degrees so it's a diamond. 951 00:42:36,500 --> 00:42:39,221 It's no longer obvious that it's got four right angles. 952 00:42:39,221 --> 00:42:40,970 There's something about orientation that's 953 00:42:40,970 --> 00:42:42,720 harder than these other things. 954 00:42:42,720 --> 00:42:45,170 And I think we were seeing that here. 955 00:42:45,170 --> 00:42:48,020 When an object-- when a baby is sitting still and a rod 956 00:42:48,020 --> 00:42:49,610 is moving behind an occluder, it's 957 00:42:49,610 --> 00:42:52,370 moving both relative to the baby and relative 958 00:42:52,370 --> 00:42:54,260 to the surroundings, which of those things 959 00:42:54,260 --> 00:42:55,650 matters to the baby? 
960 00:42:55,650 --> 00:42:59,360 So Phil Kellman did the ambitious experiment 961 00:42:59,360 --> 00:43:01,590 of putting a baby in a movable chair 962 00:43:01,590 --> 00:43:03,980 and moving the baby back and forth. 963 00:43:03,980 --> 00:43:08,000 In one condition, the baby is looking at a stationary rod, 964 00:43:08,000 --> 00:43:11,360 but his own motion is such that if you put a camera where 965 00:43:11,360 --> 00:43:14,060 the baby's head is, you'll see the image of that rod moving 966 00:43:14,060 --> 00:43:16,640 back and forth behind the block. 967 00:43:16,640 --> 00:43:18,530 In the other condition, the motion of the rod 968 00:43:18,530 --> 00:43:20,060 is tied to the motion of the baby, 969 00:43:20,060 --> 00:43:22,880 so it's always staying in the middle of the baby's 970 00:43:22,880 --> 00:43:27,500 visual field, but it's actually moving through the array. 971 00:43:27,500 --> 00:43:29,700 And it turned out that it's-- 972 00:43:29,700 --> 00:43:32,870 OK, so whether the baby was still or moving 973 00:43:32,870 --> 00:43:34,350 didn't matter at all. 974 00:43:34,350 --> 00:43:36,980 So if the object is-- sorry, I did these wrong. 975 00:43:36,980 --> 00:43:39,020 This should be still, that should be moving. 976 00:43:39,020 --> 00:43:42,620 If the object is still, and whether the baby is still 977 00:43:42,620 --> 00:43:44,180 or moving, it doesn't work. 978 00:43:44,180 --> 00:43:45,680 If the object is moving-- 979 00:43:45,680 --> 00:43:46,850 the diagram is right. 980 00:43:46,850 --> 00:43:48,770 It was just my label that's wrong. 981 00:43:48,770 --> 00:43:50,240 If the object is moving, it doesn't 982 00:43:50,240 --> 00:43:52,365 matter whether it's being displaced in the infant's 983 00:43:52,365 --> 00:43:53,540 visual field or not. 984 00:43:53,540 --> 00:43:54,590 It's seen as moving. 985 00:43:54,590 --> 00:43:56,930 Now, this isn't magic. 
986 00:43:56,930 --> 00:43:58,806 The studies are not being done in a dark room 987 00:43:58,806 --> 00:44:01,013 with a single luminous object where the baby wouldn't 988 00:44:01,013 --> 00:44:01,700 be able to tell. 989 00:44:01,700 --> 00:44:04,100 There's lots of surround-- it's in a puppet stage 990 00:44:04,100 --> 00:44:05,520 and that is stationary. 991 00:44:05,520 --> 00:44:07,730 So there's lots of information for the object moving 992 00:44:07,730 --> 00:44:10,220 relative to its surroundings in all 993 00:44:10,220 --> 00:44:12,950 of the conditions of this study, and I'm sure that's critical. 994 00:44:12,950 --> 00:44:15,410 But from the point of view of the infant's connecting 995 00:44:15,410 --> 00:44:16,994 of the visible ends of the object, 996 00:44:16,994 --> 00:44:18,410 the question he's trying to answer 997 00:44:18,410 --> 00:44:20,150 is, is that thing moving? 998 00:44:20,150 --> 00:44:25,040 Not, am I experiencing movement in this changing scene? 999 00:44:25,040 --> 00:44:25,970 Retinal movement. 1000 00:44:25,970 --> 00:44:29,180 So if it's the case that-- 1001 00:44:29,180 --> 00:44:31,640 what those last findings suggest is 1002 00:44:31,640 --> 00:44:38,480 that the input representations to the system that's 1003 00:44:38,480 --> 00:44:42,170 forming objects out of arrays of visual surfaces 1004 00:44:42,170 --> 00:44:46,160 already capture a lot of the 3D spatial structure of the world. 1005 00:44:46,160 --> 00:44:48,270 This is a relatively late process. 1006 00:44:48,270 --> 00:44:52,370 And it allows us to ask, is it even specific to vision? 1007 00:44:52,370 --> 00:44:54,560 Would we see the same process at work 1008 00:44:54,560 --> 00:44:58,520 if we presented babies with the task of asking, am I feeling-- 1009 00:44:58,520 --> 00:45:02,270 are two things that are moving in the world connected? 1010 00:45:02,270 --> 00:45:05,480 Or are they not, in areas that I'm not perceiving?
1011 00:45:05,480 --> 00:45:08,480 We can ask that in other modalities. 1012 00:45:08,480 --> 00:45:12,110 So we did a series of studies-- this is with Arlette Streri. 1013 00:45:12,110 --> 00:45:14,450 We did a series of studies looking at perception 1014 00:45:14,450 --> 00:45:16,580 of objects by active touch. 1015 00:45:16,580 --> 00:45:21,000 By taking four-month-old babies and putting a bib over them. 1016 00:45:21,000 --> 00:45:22,950 Now I said they can't reach for objects, 1017 00:45:22,950 --> 00:45:25,640 but if you put a ring in a baby's hand, even a newborn's 1018 00:45:25,640 --> 00:45:27,080 hand, they'll grasp it. 1019 00:45:27,080 --> 00:45:29,060 So we put rings in their two hands. 1020 00:45:29,060 --> 00:45:31,750 And in one condition, the rings were rigidly attached, 1021 00:45:31,750 --> 00:45:34,250 although the array was set up so that they couldn't actually 1022 00:45:34,250 --> 00:45:37,557 feel that attachment and they couldn't see anything, 1023 00:45:37,557 --> 00:45:39,140 about the object, anyway, because they 1024 00:45:39,140 --> 00:45:42,280 had the screen blocking them. 1025 00:45:42,280 --> 00:45:45,050 But as they moved one, the other would move rigidly with it. 1026 00:45:45,050 --> 00:45:47,300 In the other condition, the two were unconnected, 1027 00:45:47,300 --> 00:45:49,250 so they would move independently. 1028 00:45:49,250 --> 00:45:55,020 And after babies explored that for-- over a series of trials, 1029 00:45:55,020 --> 00:45:56,990 and as in the other studies, we then 1030 00:45:56,990 --> 00:45:59,520 presented visual arrays in alternation 1031 00:45:59,520 --> 00:46:02,540 where the two rings were connected or not. 1032 00:46:02,540 --> 00:46:05,270 And found that in the condition where they had moved rigidly 1033 00:46:05,270 --> 00:46:07,640 together, infants extrapolated a connection 1034 00:46:07,640 --> 00:46:10,719 and looked longer at the arrays that were not connected. 
1035 00:46:10,719 --> 00:46:12,510 In the case where they moved independently, 1036 00:46:12,510 --> 00:46:14,360 they did the opposite. 1037 00:46:14,360 --> 00:46:16,610 Now, that doesn't tell us that there is 1038 00:46:16,610 --> 00:46:18,770 a single system at work here. 1039 00:46:18,770 --> 00:46:23,709 It could be that there are, as Shimon, I believe, 1040 00:46:23,709 --> 00:46:25,250 was saying yesterday afternoon, there 1041 00:46:25,250 --> 00:46:26,702 are redundancies in the system. 1042 00:46:26,702 --> 00:46:28,160 You have different systems that are 1043 00:46:28,160 --> 00:46:29,570 capturing the same property. 1044 00:46:29,570 --> 00:46:30,840 That's still true. 1045 00:46:30,840 --> 00:46:32,060 But here's a reason to-- 1046 00:46:32,060 --> 00:46:36,120 we went on to ask not only what infants can do, 1047 00:46:36,120 --> 00:46:37,340 but what they can't do. 1048 00:46:37,340 --> 00:46:39,350 And I think it gives us reason to take seriously 1049 00:46:39,350 --> 00:46:41,090 the possibility that there's actually 1050 00:46:41,090 --> 00:46:43,220 a single system at work here. 1051 00:46:43,220 --> 00:46:45,320 What we did-- I haven't pictured it here-- 1052 00:46:45,320 --> 00:46:48,620 is, instead of varying the motion of the things, we did 1053 00:46:48,620 --> 00:46:52,110 vary their motion, but we also varied their other properties. 1054 00:46:52,110 --> 00:46:53,780 So their rigidity. 1055 00:46:53,780 --> 00:46:55,940 We contrasted a ring that was made out 1056 00:46:55,940 --> 00:46:57,620 of wood with a ring that was made out 1057 00:46:57,620 --> 00:47:00,440 of some kind of spongy, foam-rubbery material-- 1058 00:47:00,440 --> 00:47:04,220 their shape, their surface texture. 1059 00:47:04,220 --> 00:47:07,160 Asking, do infants take account of those properties 1060 00:47:07,160 --> 00:47:08,464 in extrapolating a connection? 
1061 00:47:08,464 --> 00:47:10,130 Are they more likely to think two things 1062 00:47:10,130 --> 00:47:11,540 are connected to each other if they're both 1063 00:47:11,540 --> 00:47:12,920 made of foam rubber than if one of them 1064 00:47:12,920 --> 00:47:15,128 is made of foam rubber and the other is made of wood? 1065 00:47:15,128 --> 00:47:17,600 We never found any effect of those properties, 1066 00:47:17,600 --> 00:47:20,900 just as we didn't in the visual case. 1067 00:47:20,900 --> 00:47:24,050 So we see not only the same abilities, but the same limits. 1068 00:47:24,050 --> 00:47:25,850 And while that's not conclusive, I 1069 00:47:25,850 --> 00:47:29,290 think it adds weight to the idea that what 1070 00:47:29,290 --> 00:47:32,990 we could be studying here-- we started in the visual modality. 1071 00:47:32,990 --> 00:47:35,390 But what we could be studying here 1072 00:47:35,390 --> 00:47:38,660 is something that's more general and more abstract. 1073 00:47:38,660 --> 00:47:42,316 Basic notions about how objects behave that apply not only 1074 00:47:42,316 --> 00:47:44,690 when you're looking at things, but when you're actively-- 1075 00:47:44,690 --> 00:47:46,898 when you're feeling them, actively manipulating them, 1076 00:47:46,898 --> 00:47:50,800 exploring them in other modalities. 1077 00:47:50,800 --> 00:47:53,750 So I put a question mark because it's not absolutely conclusive, 1078 00:47:53,750 --> 00:47:56,810 but I think we should take seriously that possibility. 1079 00:47:56,810 --> 00:47:57,770 OK. 1080 00:47:57,770 --> 00:47:58,460 Only motion. 1081 00:47:58,460 --> 00:48:01,100 Is motion the only thing that works? 1082 00:48:01,100 --> 00:48:07,580 Or will other changes work, so if an object changes in color? 1083 00:48:07,580 --> 00:48:11,720 We created a particularly exciting color change 1084 00:48:11,720 --> 00:48:14,480 by embedding colored lights within a glass rod 1085 00:48:14,480 --> 00:48:16,190 so it's flashing on and off. 
1086 00:48:16,190 --> 00:48:19,090 Succeeded in eliciting very high interest in that array. 1087 00:48:19,090 --> 00:48:21,620 Babies looked at it for a long time, 1088 00:48:21,620 --> 00:48:23,960 but only the motion array was seen as connected 1089 00:48:23,960 --> 00:48:25,100 behind the occluder. 1090 00:48:25,100 --> 00:48:29,540 So it looks like not all changes elicit this perception. 1091 00:48:29,540 --> 00:48:32,420 It's an open question what the class of effective changes is. 1092 00:48:32,420 --> 00:48:34,490 Maybe it's broader than just motions, 1093 00:48:34,490 --> 00:48:37,220 but it doesn't seem like all changes work. 1094 00:48:37,220 --> 00:48:41,780 Finally, is motion the only variable 1095 00:48:41,780 --> 00:48:44,420 that influences infants' perception 1096 00:48:44,420 --> 00:48:47,060 of-- the only property of surfaces that influences 1097 00:48:47,060 --> 00:48:49,250 infants' perception of objects? 1098 00:48:49,250 --> 00:48:51,830 The answer to that seems to be no. 1099 00:48:51,830 --> 00:48:55,340 So we studied that in a different situation 1100 00:48:55,340 --> 00:48:58,370 for which this is just a very impoverished cartoon. 1101 00:48:58,370 --> 00:49:00,980 We took two block-like objects-- 1102 00:49:00,980 --> 00:49:03,080 of different colors and textures in some studies, 1103 00:49:03,080 --> 00:49:04,760 same color and texture in others. 1104 00:49:04,760 --> 00:49:06,230 It didn't matter. 1105 00:49:06,230 --> 00:49:08,840 And put one on top of the other and either presented 1106 00:49:08,840 --> 00:49:13,040 them moving together or moving separately. 1107 00:49:13,040 --> 00:49:16,630 And then tested whether babies represented them as connected 1108 00:49:16,630 --> 00:49:17,547 in either of two ways. 1109 00:49:17,547 --> 00:49:19,254 Some of the studies were done with babies 1110 00:49:19,254 --> 00:49:20,480 who were old enough to reach. 
1111 00:49:20,480 --> 00:49:22,100 And then we could ask, are they reaching for it 1112 00:49:22,100 --> 00:49:23,990 as if it were a single body or as if there 1113 00:49:23,990 --> 00:49:26,117 were two distinct bodies there? 1114 00:49:26,117 --> 00:49:27,950 I could give you more information about that 1115 00:49:27,950 --> 00:49:29,157 if you're interested. 1116 00:49:29,157 --> 00:49:30,740 The other was with looking time, where 1117 00:49:30,740 --> 00:49:34,550 we had a hand come out and grasp the top of the top object 1118 00:49:34,550 --> 00:49:35,154 and lift it. 1119 00:49:35,154 --> 00:49:37,070 And the question is, what should come with it? 1120 00:49:37,070 --> 00:49:38,861 Will the bottom object come with it as well 1121 00:49:38,861 --> 00:49:41,720 or will the top object on its own? 1122 00:49:41,720 --> 00:49:43,820 When the things had previously moved together, 1123 00:49:43,820 --> 00:49:45,554 they expected it all to move together. 1124 00:49:45,554 --> 00:49:46,970 When they'd moved separately, they 1125 00:49:46,970 --> 00:49:52,170 expected only the top object would move by itself. 1126 00:49:52,170 --> 00:49:54,140 And when there was no motion at all, 1127 00:49:54,140 --> 00:49:56,360 findings vary somewhat from one lab to another, 1128 00:49:56,360 --> 00:50:00,130 but mostly they tend to be ambiguous in the case where 1129 00:50:00,130 --> 00:50:01,090 there's no motion. 1130 00:50:01,090 --> 00:50:03,590 So there it looks like motion is doing all the work. 1131 00:50:03,590 --> 00:50:05,680 But if you make one simple change to this array 1132 00:50:05,680 --> 00:50:07,810 that you can't do in the occlusion studies, 1133 00:50:07,810 --> 00:50:10,420 you simply change the size of this object 1134 00:50:10,420 --> 00:50:13,356 and present it such that there's a gap between the two objects. 
1135 00:50:13,356 --> 00:50:15,730 And you can either do it with this guy floating magically 1136 00:50:15,730 --> 00:50:17,771 in midair, or you can do it with two objects side 1137 00:50:17,771 --> 00:50:20,560 by side, both stably supported by a surface. 1138 00:50:20,560 --> 00:50:22,600 If there's a visible gap between them, 1139 00:50:22,600 --> 00:50:23,920 the motion no longer matters. 1140 00:50:23,920 --> 00:50:26,440 They will be treated as two distinct objects, 1141 00:50:26,440 --> 00:50:28,510 no matter what. 1142 00:50:28,510 --> 00:50:30,490 So what I think is going on here is 1143 00:50:30,490 --> 00:50:33,220 that babies have a system that's seeking 1144 00:50:33,220 --> 00:50:36,610 to find the connected, the solid connected bodies. 1145 00:50:36,610 --> 00:50:39,550 The bodies that are internally connected and will 1146 00:50:39,550 --> 00:50:44,350 remain so over motion. 1147 00:50:44,350 --> 00:50:46,390 And that's what's leading them to see 1148 00:50:46,390 --> 00:50:52,630 these patterns of relative motion or these visible gaps 1149 00:50:52,630 --> 00:50:57,880 as indicating a place where one object ends and the next object 1150 00:50:57,880 --> 00:50:58,630 begins. 1151 00:50:58,630 --> 00:51:02,560 I did want to get on to the problem of tracking objects 1152 00:51:02,560 --> 00:51:06,800 over time, perceiving not what's connected to what over space, 1153 00:51:06,800 --> 00:51:09,149 but what's connected to what over time. 1154 00:51:09,149 --> 00:51:11,440 Under what conditions are the thing that I'm seeing now 1155 00:51:11,440 --> 00:51:13,660 the same thing that I was seeing at some place 1156 00:51:13,660 --> 00:51:17,050 or time in the past? 1157 00:51:17,050 --> 00:51:24,220 So conceptually, it feels like continuity of motion over time 1158 00:51:24,220 --> 00:51:28,210 is related to connectedness of motion over space. 1159 00:51:28,210 --> 00:51:30,820 And it's been tested for in a variety of ways. 
1160 00:51:30,820 --> 00:51:32,500 Here's one set of studies that we 1161 00:51:32,500 --> 00:51:35,190 did, where we have an object that moves behind 1162 00:51:35,190 --> 00:51:39,040 a single screen, and then either is-- and it starts here, 1163 00:51:39,040 --> 00:51:39,790 ends up here. 1164 00:51:39,790 --> 00:51:42,280 And either is seen to move between the two screens 1165 00:51:42,280 --> 00:51:44,770 or is not. 1166 00:51:44,770 --> 00:51:47,800 And we ask babies in effect, how many 1167 00:51:47,800 --> 00:51:49,810 objects do they think are in this display, 1168 00:51:49,810 --> 00:51:51,430 by boring half the babies with this, 1169 00:51:51,430 --> 00:51:53,740 half the babies with that, and then presenting them 1170 00:51:53,740 --> 00:51:55,360 in alternation with arrays of one 1171 00:51:55,360 --> 00:51:57,310 versus two objects, neither of which 1172 00:51:57,310 --> 00:52:00,070 ever passes through the center, but the arrays 1173 00:52:00,070 --> 00:52:00,926 differ in number. 1174 00:52:00,926 --> 00:52:02,800 In the one case, it's either moving over here 1175 00:52:02,800 --> 00:52:06,420 or it's moving over there on different trials. 1176 00:52:06,420 --> 00:52:09,610 And what we find is that in this case, 1177 00:52:09,610 --> 00:52:12,790 they expect to see one object and look longer at two. 1178 00:52:12,790 --> 00:52:15,610 In this case, they expect to see two objects 1179 00:52:15,610 --> 00:52:17,134 and look somewhat longer at one. 1180 00:52:17,134 --> 00:52:19,550 There's actually an overall preference for looking at two, 1181 00:52:19,550 --> 00:52:21,790 but you get that interaction and there's 1182 00:52:21,790 --> 00:52:26,950 a slight preference for looking at one in that condition. 
1183 00:52:26,950 --> 00:52:28,990 Providing evidence, I think, that babies 1184 00:52:28,990 --> 00:52:33,730 are tracking objects over time by analyzing information 1185 00:52:33,730 --> 00:52:35,730 for the continuity of-- 1186 00:52:35,730 --> 00:52:38,890 or discontinuity of their object motion. 1187 00:52:38,890 --> 00:52:40,810 Now, Lisa Feigenson has conducted 1188 00:52:40,810 --> 00:52:43,655 stronger tests of this, I think, with somewhat older babies. 1189 00:52:43,655 --> 00:52:45,280 When babies get older and they do more, 1190 00:52:45,280 --> 00:52:47,657 you can do stronger tests. 1191 00:52:47,657 --> 00:52:49,240 So these are babies who are old enough 1192 00:52:49,240 --> 00:52:52,930 to crawl, old enough to eat, and old enough 1193 00:52:52,930 --> 00:52:55,600 to like graham crackers. 1194 00:52:55,600 --> 00:53:00,010 So she puts the baby back here, and in one set of studies, 1195 00:53:00,010 --> 00:53:04,000 she takes a single graham cracker, puts it in one box, 1196 00:53:04,000 --> 00:53:06,850 and then takes two graham crackers, one at a time, 1197 00:53:06,850 --> 00:53:09,280 and puts them in the other box. 1198 00:53:09,280 --> 00:53:12,520 And then the baby, who's being restrained by a parent, 1199 00:53:12,520 --> 00:53:13,330 is let loose. 1200 00:53:13,330 --> 00:53:15,430 And the question is, which box will they go to? 1201 00:53:15,430 --> 00:53:18,550 And they go to the box with the two graham crackers. 1202 00:53:18,550 --> 00:53:22,390 My favorite study, though, in this whole series 1203 00:53:22,390 --> 00:53:27,220 was one that she and Susan Carey ran as a boring control 1204 00:53:27,220 --> 00:53:27,760 condition. 1205 00:53:27,760 --> 00:53:29,843 I think it's the most interesting of the findings, 1206 00:53:29,843 --> 00:53:30,360 though. 
1207 00:53:30,360 --> 00:53:31,870 In the boring control condition, they 1208 00:53:31,870 --> 00:53:33,911 were worried about the fact that maybe babies are 1209 00:53:33,911 --> 00:53:35,860 going to the box with two because they 1210 00:53:35,860 --> 00:53:38,800 see a hand around that box for a longer period of time, 1211 00:53:38,800 --> 00:53:40,850 doing more interesting stuff. 1212 00:53:40,850 --> 00:53:42,700 So they did the following boring control. 1213 00:53:42,700 --> 00:53:44,780 The two condition was the same as before. 1214 00:53:44,780 --> 00:53:46,990 So a hand comes out with a single graham cracker, 1215 00:53:46,990 --> 00:53:49,240 puts it in the box, comes out empty, 1216 00:53:49,240 --> 00:53:51,730 takes a second graham cracker, returns with a second graham 1217 00:53:51,730 --> 00:53:53,520 cracker, puts it in the box. 1218 00:53:53,520 --> 00:53:55,660 In the other condition, the hand comes out 1219 00:53:55,660 --> 00:53:58,330 with one graham cracker, puts it in the box, 1220 00:53:58,330 --> 00:54:01,360 comes out again with the graham cracker, 1221 00:54:01,360 --> 00:54:04,360 and then goes back into the box with that graham cracker. 1222 00:54:04,360 --> 00:54:08,470 So you've got more graham cracker sightings on the left. 1223 00:54:08,470 --> 00:54:12,490 You've got a same amount of hand activity on the two sides, 1224 00:54:12,490 --> 00:54:14,824 but the babies go to the box with two. 1225 00:54:14,824 --> 00:54:16,990 They're tracking the graham crackers, not the graham 1226 00:54:16,990 --> 00:54:19,570 cracker visual encounters. 1227 00:54:19,570 --> 00:54:24,910 They're tracking a continuous object over time. 1228 00:54:24,910 --> 00:54:28,690 Finally, objects. 1229 00:54:28,690 --> 00:54:30,760 Scenes don't usually just contain 1230 00:54:30,760 --> 00:54:36,550 a single object that's either connected, continuously visible 1231 00:54:36,550 --> 00:54:39,100 or not, or connected or not. 
1232 00:54:39,100 --> 00:54:41,560 They contain multiple objects and those objects interact 1233 00:54:41,560 --> 00:54:43,190 with each other. 1234 00:54:43,190 --> 00:54:46,180 Shimon talked yesterday afternoon about the evidence 1235 00:54:46,180 --> 00:54:48,490 that babies are sensitive to these interactions, 1236 00:54:48,490 --> 00:54:52,930 at least down to about six months of age in the conditions 1237 00:54:52,930 --> 00:54:54,190 he was talking about. 1238 00:54:54,190 --> 00:54:56,500 In slightly different conditions, 1239 00:54:56,500 --> 00:54:58,570 the sensitivity has been shown as young 1240 00:54:58,570 --> 00:55:00,859 as three months of age. 1241 00:55:00,859 --> 00:55:02,900 Basically, here's a paradigm that will show that: 1242 00:55:02,900 --> 00:55:05,780 you have a single object that's moving toward a screen. 1243 00:55:05,780 --> 00:55:08,250 Another object is stationary behind the screen. 1244 00:55:08,250 --> 00:55:11,480 But at the right time, the time at which this object, 1245 00:55:11,480 --> 00:55:14,060 if it continued moving at the same rate, at the point 1246 00:55:14,060 --> 00:55:15,890 where it would contact that object, 1247 00:55:15,890 --> 00:55:18,820 this object starts to move in the same direction. 1248 00:55:18,820 --> 00:55:21,080 And now, after seeing that repeatedly, 1249 00:55:21,080 --> 00:55:23,000 the screen is taken away and babies either 1250 00:55:23,000 --> 00:55:25,640 see the first object contacting the second and the second one 1251 00:55:25,640 --> 00:55:27,560 immediately starting to move, or they 1252 00:55:27,560 --> 00:55:29,120 see the first object stopping short 1253 00:55:29,120 --> 00:55:31,250 of the second, an appropriate gap in time, 1254 00:55:31,250 --> 00:55:33,170 and then the second object starts to move.
1255 00:55:33,170 --> 00:55:35,750 And they look longer at this display, 1256 00:55:35,750 --> 00:55:38,850 providing evidence that they inferred 1257 00:55:38,850 --> 00:55:41,720 that the first object contacted the second at the point 1258 00:55:41,720 --> 00:55:45,860 at which it started to move. 1259 00:55:45,860 --> 00:55:50,630 Interestingly, as in the case of the occluded object studies, 1260 00:55:50,630 --> 00:55:53,330 if instead of having the second object move, 1261 00:55:53,330 --> 00:55:57,080 you have it change color and make a sound, 1262 00:55:57,080 --> 00:55:59,960 so it undergoes a change in state, but no motion, 1263 00:55:59,960 --> 00:56:04,170 the babies no longer infer contact in this condition. 1264 00:56:04,170 --> 00:56:05,750 They are attentive to those events. 1265 00:56:05,750 --> 00:56:08,240 They watch them a lot, but they're 1266 00:56:08,240 --> 00:56:10,490 uncommitted as to whether that first 1267 00:56:10,490 --> 00:56:13,040 object-- this is work of Paul Muentener and Susan Carey 1268 00:56:13,040 --> 00:56:14,245 relatively recently. 1269 00:56:14,245 --> 00:56:15,620 It wasn't done with cylinders, it 1270 00:56:15,620 --> 00:56:19,790 was done with a toy car that hits a block, 1271 00:56:19,790 --> 00:56:22,760 I think, or doesn't hit the block. 1272 00:56:22,760 --> 00:56:25,550 They're uncommitted as to whether the car contacted 1273 00:56:25,550 --> 00:56:27,830 the second object or not, if the second object 1274 00:56:27,830 --> 00:56:30,200 changes state but doesn't move. 
1275 00:56:30,200 --> 00:56:32,540 Returning to the case where they succeed-- 1276 00:56:32,540 --> 00:56:35,810 namely, this thing went behind a screen, the other thing started 1277 00:56:35,810 --> 00:56:42,610 to move, infants inferred that they came into contact-- 1278 00:56:42,610 --> 00:56:46,430 that begins to suggest that maybe babies have some notion 1279 00:56:46,430 --> 00:56:48,440 that objects are solid, that two things can't 1280 00:56:48,440 --> 00:56:50,340 be in the same place at the same time, 1281 00:56:50,340 --> 00:56:53,420 that when one moving thing hits another thing, one 1282 00:56:53,420 --> 00:56:56,330 or the other of them or both, their motion has to change, 1283 00:56:56,330 --> 00:56:58,790 because they're not going to simply interpenetrate 1284 00:56:58,790 --> 00:56:59,660 each other. 1285 00:56:59,660 --> 00:57:02,900 And Josh already very briefly pointed 1286 00:57:02,900 --> 00:57:08,690 to some very old studies suggesting that babies have-- 1287 00:57:08,690 --> 00:57:12,860 make some assumption that objects are solid as early as-- 1288 00:57:12,860 --> 00:57:15,380 I think in the earliest studies done 1289 00:57:15,380 --> 00:57:18,590 with babies it's about two and a half months of age. 1290 00:57:18,590 --> 00:57:21,100 These are these studies that Renee Baillargeon did 1291 00:57:21,100 --> 00:57:25,400 that start with simply a screen, a flat screen, 1292 00:57:25,400 --> 00:57:28,790 rotating on a table, rotating 180 degrees back and forth 1293 00:57:28,790 --> 00:57:29,990 on a table. 1294 00:57:29,990 --> 00:57:34,550 Then she places an object behind this wall. 1295 00:57:34,550 --> 00:57:36,920 The screen is lying on the table with its back edge 1296 00:57:36,920 --> 00:57:38,030 right here at the middle. 
1297 00:57:38,030 --> 00:57:41,060 She places an object behind it, and then the screen 1298 00:57:41,060 --> 00:57:44,076 starts to rotate up around the back edge 1299 00:57:44,076 --> 00:57:45,950 and the question to the infants in effect is, 1300 00:57:45,950 --> 00:57:48,680 what should happen to that screen? 1301 00:57:48,680 --> 00:57:50,570 And the two options she presents to them 1302 00:57:50,570 --> 00:57:52,970 is it either gets to the point where it would contact 1303 00:57:52,970 --> 00:57:55,820 this object which is now fully out of view, 1304 00:57:55,820 --> 00:57:59,030 and stops, and then returns to its first position, which 1305 00:57:59,030 --> 00:58:01,700 is a novel motion, but consistent 1306 00:58:01,700 --> 00:58:04,580 with the existence, location, and solidity 1307 00:58:04,580 --> 00:58:06,950 of that hidden object. 1308 00:58:06,950 --> 00:58:09,950 Or it continues merrily on its way and the same pattern 1309 00:58:09,950 --> 00:58:11,474 of rotation as before. 1310 00:58:11,474 --> 00:58:12,890 When it does that, of course, it's 1311 00:58:12,890 --> 00:58:14,540 going to come back flat on the screen 1312 00:58:14,540 --> 00:58:16,989 and there's not going to be any object there. 1313 00:58:16,989 --> 00:58:18,530 If there had been an object, it would 1314 00:58:18,530 --> 00:58:19,700 have had to be compressed. 1315 00:58:19,700 --> 00:58:22,140 Or what I think actually went on in those studies, 1316 00:58:22,140 --> 00:58:24,800 it was quickly and surreptitiously knocked out 1317 00:58:24,800 --> 00:58:26,090 of the way. 1318 00:58:26,090 --> 00:58:31,290 And infants looked less at this event than at this one-- 1319 00:58:31,290 --> 00:58:34,670 this one, sorry-- providing some evidence 1320 00:58:34,670 --> 00:58:37,610 that they were representing these objects, both as 1321 00:58:37,610 --> 00:58:43,100 existing when they were out of sight, and as solid. 
1322 00:58:43,100 --> 00:58:47,000 So this is just a summary, not a claim about knowledge 1323 00:58:47,000 --> 00:58:50,450 development, about-- 1324 00:58:50,450 --> 00:58:53,090 I'm attempting to characterize here 1325 00:58:53,090 --> 00:58:57,650 with motion over just one dimension of space and time, 1326 00:58:57,650 --> 00:58:59,030 how infants seem-- 1327 00:58:59,030 --> 00:59:02,960 what infants seem to represent about the behavior of objects. 1328 00:59:02,960 --> 00:59:06,440 Namely that each object moves on a continuous path 1329 00:59:06,440 --> 00:59:08,300 through space and over time. 1330 00:59:08,300 --> 00:59:09,650 That it moves cohesively. 1331 00:59:09,650 --> 00:59:12,959 It doesn't split into pieces as it's moving. 1332 00:59:12,959 --> 00:59:14,750 So if you've seen something move like this, 1333 00:59:14,750 --> 00:59:19,190 then you find it unlikely that if this were lifted, 1334 00:59:19,190 --> 00:59:21,710 it would go on its own, and you look longer at that. 1335 00:59:21,710 --> 00:59:23,840 There is no merging, where two things that 1336 00:59:23,840 --> 00:59:26,820 previously moved independently now move together. 1337 00:59:26,820 --> 00:59:28,640 So after looking at this, it would also 1338 00:59:28,640 --> 00:59:30,920 be unlikely, if you lifted this, for the whole thing 1339 00:59:30,920 --> 00:59:33,230 to jump up at once. 1340 00:59:33,230 --> 00:59:34,430 They move without gaps. 1341 00:59:34,430 --> 00:59:36,650 They move without intersecting other objects 1342 00:59:36,650 --> 00:59:39,410 on their paths of motion, such that two things 1343 00:59:39,410 --> 00:59:42,020 are in the same place at the same time. 1344 00:59:42,020 --> 00:59:45,440 And they move on contact with other objects 1345 00:59:45,440 --> 00:59:48,140 and not at a distance from them.
1346 00:59:48,140 --> 00:59:51,350 So that's just a summary of what I 1347 00:59:51,350 --> 00:59:55,490 think these studies show about four-month-old infants, not 1348 00:59:55,490 --> 00:59:56,467 newborns. 1349 00:59:56,467 --> 00:59:58,550 They also show that infants' perception of objects 1350 00:59:58,550 --> 01:00:00,340 is really limited. 1351 01:00:00,340 --> 01:00:05,010 There's all these situations under which we see unitary, 1352 01:00:05,010 --> 01:00:07,620 connected, bounded objects when they don't. 1353 01:00:07,620 --> 01:00:12,360 And interestingly, research by Fei Xu and Susan Carey 1354 01:00:12,360 --> 01:00:17,220 shows that even when you present really quite surprisingly 1355 01:00:17,220 --> 01:00:21,660 old infants, 10-month-olds, with objects that should be really 1356 01:00:21,660 --> 01:00:25,200 familiar to them, like toy ducks and trucks, 1357 01:00:25,200 --> 01:00:28,620 they don't assume that these two objects will be distinct 1358 01:00:28,620 --> 01:00:31,380 if they undergo no common motion. 1359 01:00:31,380 --> 01:00:33,360 If they're simply presented stationary, 1360 01:00:33,360 --> 01:00:37,020 the babies seem uncommitted as to whether there's 1361 01:00:37,020 --> 01:00:38,910 a boundary between them or not. 1362 01:00:38,910 --> 01:00:40,980 So they're using very limited information 1363 01:00:40,980 --> 01:00:44,130 to be making these basic-- 1364 01:00:44,130 --> 01:00:45,870 building these basic representations 1365 01:00:45,870 --> 01:00:48,390 of what's connected to what, where one thing ends 1366 01:00:48,390 --> 01:00:51,240 and the next begins. 1367 01:00:51,240 --> 01:00:55,080 Now, this changes very abruptly between about 10 1368 01:00:55,080 --> 01:00:56,160 and 12 months of age. 1369 01:00:56,160 --> 01:00:58,860 They start treating those as two separate objects, 1370 01:00:58,860 --> 01:01:03,630 whether they're moving together or stationary or not. 
1371 01:01:03,630 --> 01:01:05,160 Now, infants' tracking of objects 1372 01:01:05,160 --> 01:01:07,260 shows very similar limits. 1373 01:01:07,260 --> 01:01:09,870 So I told you they succeed in perceiving-- 1374 01:01:09,870 --> 01:01:11,430 representing two distinct objects 1375 01:01:11,430 --> 01:01:13,020 in a situation like this. 1376 01:01:13,020 --> 01:01:15,930 But up until and including 10 months of age, 1377 01:01:15,930 --> 01:01:18,780 they fail in this situation. 1378 01:01:18,780 --> 01:01:21,980 If a truck comes out on one side of a single large screen, 1379 01:01:21,980 --> 01:01:23,490 so you're not getting information 1380 01:01:23,490 --> 01:01:25,890 for the motion behind that screen, 1381 01:01:25,890 --> 01:01:27,840 and a duck comes out on the other side, 1382 01:01:27,840 --> 01:01:30,990 and you ask babies, in effect, how many things are there? 1383 01:01:30,990 --> 01:01:32,460 One or two? 1384 01:01:32,460 --> 01:01:34,620 By removing the screen and alternately presenting 1385 01:01:34,620 --> 01:01:36,840 those two possibilities, they are uncommitted 1386 01:01:36,840 --> 01:01:39,780 between those two alternatives. 1387 01:01:39,780 --> 01:01:42,240 In this situation as in the previous one, 1388 01:01:42,240 --> 01:01:44,850 there's this very abrupt change between about 10 1389 01:01:44,850 --> 01:01:46,420 and 12 months of age. 1390 01:01:46,420 --> 01:01:49,790 And I can't resist saying, even though I'm way over time, 1391 01:01:49,790 --> 01:01:54,180 that Fei Xu has shown that that change is interestingly related 1392 01:01:54,180 --> 01:01:57,870 to the child's developing mastery of expressions 1393 01:01:57,870 --> 01:01:59,762 that name kinds of objects. 
1394 01:01:59,762 --> 01:02:02,220 So she's been able to show, for example, that if you simply 1395 01:02:02,220 --> 01:02:06,420 ask for individual infants, when did they start succeeding here, 1396 01:02:06,420 --> 01:02:09,300 their success is predicted by their vocabulary 1397 01:02:09,300 --> 01:02:11,100 as reported by parents. 1398 01:02:11,100 --> 01:02:14,580 She's also shown that if you take a younger 1399 01:02:14,580 --> 01:02:15,890 infant who would be slated-- 1400 01:02:15,890 --> 01:02:19,620 destined to fail this study, but as you bring objects 1401 01:02:19,620 --> 01:02:23,160 out on the two sides, either familiar ones or novel ones, 1402 01:02:23,160 --> 01:02:24,990 starting at about nine months of age, 1403 01:02:24,990 --> 01:02:27,720 if you name them and you give them distinct object names, 1404 01:02:27,720 --> 01:02:29,310 they now infer two objects. 1405 01:02:29,310 --> 01:02:31,170 And in fact, they'll even do it if the two 1406 01:02:31,170 --> 01:02:33,990 things you bring out from behind a single wide screen 1407 01:02:33,990 --> 01:02:35,130 look the same. 1408 01:02:35,130 --> 01:02:37,620 If you bring one thing out and say, look, a blicket, 1409 01:02:37,620 --> 01:02:40,120 and put it back in, and then bring something out and say, 1410 01:02:40,120 --> 01:02:42,410 look, a toma, even if it looks the same, 1411 01:02:42,410 --> 01:02:43,910 they'll infer two objects. 1412 01:02:43,910 --> 01:02:45,930 So there seems to be this change that's 1413 01:02:45,930 --> 01:02:48,270 occurring at the end of the first year 1414 01:02:48,270 --> 01:02:51,030 quite dramatically that's overcoming this basic limit 1415 01:02:51,030 --> 01:02:54,260 that we're seeing earlier on.