1 00:00:01,640 --> 00:00:04,040 The following content is provided under a Creative 2 00:00:04,040 --> 00:00:05,580 Commons license. 3 00:00:05,580 --> 00:00:07,880 Your support will help MIT OpenCourseWare 4 00:00:07,880 --> 00:00:12,270 continue to offer high-quality educational resources for free. 5 00:00:12,270 --> 00:00:14,870 To make a donation or view additional materials 6 00:00:14,870 --> 00:00:18,830 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:18,830 --> 00:00:20,000 at ocw.mit.edu. 8 00:00:22,212 --> 00:00:23,920 LAURA SCHULZ: So what you've heard by now 9 00:00:23,920 --> 00:00:26,570 is that the hard problem of cognitive science 10 00:00:26,570 --> 00:00:29,810 turns out to be the problem of commonsense reasoning. 11 00:00:29,810 --> 00:00:31,710 Our computers can drive. 12 00:00:31,710 --> 00:00:34,880 They can perform fabulous calculations. 13 00:00:34,880 --> 00:00:38,480 They can beat us at Jeopardy! 14 00:00:38,480 --> 00:00:41,900 But when it comes to all of the really hard problems 15 00:00:41,900 --> 00:00:45,530 of cognitive science, there's only one organism that solves 16 00:00:45,530 --> 00:00:47,267 them, and that's a human child. 17 00:00:47,267 --> 00:00:49,850 And those are problems of face recognition, scene recognition, 18 00:00:49,850 --> 00:00:53,000 motor planning, natural language acquisition, causal reasoning, 19 00:00:53,000 --> 00:00:54,770 theory of mind, moral reasoning. 20 00:00:54,770 --> 00:00:57,500 All of those are solved in largely unsupervised learning 21 00:00:57,500 --> 00:01:00,320 by children by the age of five. 22 00:01:00,320 --> 00:01:02,930 And those are the things our computers don't do well. 
23 00:01:02,930 --> 00:01:05,900 And that is largely because the problem of human intelligence, 24 00:01:05,900 --> 00:01:08,030 common sense intelligence, is a problem 25 00:01:08,030 --> 00:01:10,550 of drawing really rich inferences that are massively 26 00:01:10,550 --> 00:01:12,032 underdetermined by the data. 27 00:01:12,032 --> 00:01:13,490 So to make that point really clear, 28 00:01:13,490 --> 00:01:16,580 I'm going to give you all a pop intelligence quiz, OK? 29 00:01:16,580 --> 00:01:18,440 So here's the pop intelligence quiz. 30 00:01:18,440 --> 00:01:19,940 Can I see a show of hands, please-- 31 00:01:19,940 --> 00:01:23,210 how many of you think that I have a spleen? 32 00:01:23,210 --> 00:01:25,770 Can I see a show of hands? 33 00:01:25,770 --> 00:01:27,770 Excellent. 34 00:01:27,770 --> 00:01:30,890 How many of you would care-- 35 00:01:30,890 --> 00:01:32,770 keep your hands down if you're an M.D. 36 00:01:32,770 --> 00:01:34,700 But other than that, how many of you 37 00:01:34,700 --> 00:01:36,470 would care to come up here and diagram, 38 00:01:36,470 --> 00:01:38,900 for the class, a spleen, and explain what it is, 39 00:01:38,900 --> 00:01:42,810 where it is, and what its exact function is in the human body? 40 00:01:42,810 --> 00:01:43,970 Anyone? 41 00:01:43,970 --> 00:01:46,320 How many of you have met me before? 42 00:01:46,320 --> 00:01:49,060 A few of you. 43 00:01:49,060 --> 00:01:51,950 But most of you, without knowing anything about spleens, 44 00:01:51,950 --> 00:01:53,490 and not knowing anything about me, 45 00:01:53,490 --> 00:01:56,930 are nonetheless extremely confident that I have one. 46 00:01:56,930 --> 00:01:58,970 So you have a lot of abstract knowledge 47 00:01:58,970 --> 00:02:02,480 in the absence of really much in the way 48 00:02:02,480 --> 00:02:04,791 of specific, concrete facts. 49 00:02:04,791 --> 00:02:06,290 OK, let me give you another problem.
50 00:02:06,290 --> 00:02:08,810 Of course, it's a kind of classic one from-- 51 00:02:08,810 --> 00:02:12,424 by the way, for those of you who are curious, that's a spleen. 52 00:02:12,424 --> 00:02:15,020 All right, classic-- these problems 53 00:02:15,020 --> 00:02:17,964 crop up in every aspect of human cognition, all right? 54 00:02:17,964 --> 00:02:19,130 What's behind the rectangle? 55 00:02:19,130 --> 00:02:20,835 AUDIENCE: [INAUDIBLE] 56 00:02:21,335 --> 00:02:22,800 LAURA SCHULZ: Right. 57 00:02:22,800 --> 00:02:24,560 You all know. 58 00:02:24,560 --> 00:02:27,050 You can't articulate it, but you know. 59 00:02:27,050 --> 00:02:29,450 And the fact that there are infinitely 60 00:02:29,450 --> 00:02:32,690 many other hypotheses consistent with the data 61 00:02:32,690 --> 00:02:35,390 doesn't trouble you at all, doesn't stop you 62 00:02:35,390 --> 00:02:38,787 from converging, collectively, on a single answer, which 63 00:02:38,787 --> 00:02:41,120 means there has to be a lot of constraints on how you're 64 00:02:41,120 --> 00:02:42,589 interpreting this kind of data. 65 00:02:42,589 --> 00:02:43,380 Here's another one. 66 00:02:43,380 --> 00:02:45,106 Complete the sentence. 67 00:02:45,106 --> 00:02:46,420 AUDIENCE: Very long neck. 68 00:02:46,420 --> 00:02:47,770 LAURA SCHULZ: Very long neck-- 69 00:02:47,770 --> 00:02:50,650 could be, or temper, or flight to Kenya. 70 00:02:50,650 --> 00:02:52,540 There are many things that it could be. 71 00:02:52,540 --> 00:02:54,546 In this case, it looks like a frequency issue. 72 00:02:54,546 --> 00:02:56,170 Oh, well, "neck" is a very common word. 73 00:02:56,170 --> 00:02:56,950 Others aren't. 74 00:02:56,950 --> 00:03:00,610 But if I said, "giraffes are really common on the African--" 75 00:03:00,610 --> 00:03:03,760 you would say "savannah," not "television," OK? 
76 00:03:03,760 --> 00:03:06,850 So you're using a lot of rich information 77 00:03:06,850 --> 00:03:09,600 to take a tiny bit of data and draw rich inferences. 78 00:03:09,600 --> 00:03:11,672 And that is the problem, and the hard problem, 79 00:03:11,672 --> 00:03:13,130 of common sense intelligence, and I 80 00:03:13,130 --> 00:03:15,760 think, a real dissociation between what we do 81 00:03:15,760 --> 00:03:17,140 and what our children do. 82 00:03:17,140 --> 00:03:20,200 Human intelligence uses abstract structured representations 83 00:03:20,200 --> 00:03:22,732 to constrain the hypotheses and make 84 00:03:22,732 --> 00:03:24,190 these kind of outrageous inferences 85 00:03:24,190 --> 00:03:25,606 that we shouldn't be able to make. 86 00:03:25,606 --> 00:03:27,670 That's all well and good, but where 87 00:03:27,670 --> 00:03:30,640 do these abstract structured representations come from? 88 00:03:30,640 --> 00:03:32,140 And there's only two possibilities-- 89 00:03:32,140 --> 00:03:34,200 we're born with them, or we learn them. 90 00:03:34,200 --> 00:03:37,390 Liz Spelke has already told you a lot about the reasons 91 00:03:37,390 --> 00:03:39,440 to think that we are born with many of them, 92 00:03:39,440 --> 00:03:41,350 and so true for anything that might 93 00:03:41,350 --> 00:03:43,180 be stable over evolutionary time, that 94 00:03:43,180 --> 00:03:47,231 might emerge early in ontogeny and broadly in phylogeny. 95 00:03:47,231 --> 00:03:49,480 And that's true for many aspects of folk physics, folk 96 00:03:49,480 --> 00:03:52,220 psychology, causal reasoning, navigation, number. 97 00:03:54,840 --> 00:03:57,319 But there's a lot of other things you know. 98 00:03:57,319 --> 00:03:59,610 Shoes, ships, sealing wax-- basically everything else-- 99 00:03:59,610 --> 00:04:02,170 they are not plausibly innate. 100 00:04:02,170 --> 00:04:04,230 And so how do you learn them? 
101 00:04:04,230 --> 00:04:06,750 So Piaget, the founder of our field 102 00:04:06,750 --> 00:04:10,080 of developmental psychology, says you build these up 103 00:04:10,080 --> 00:04:11,370 from experience. 104 00:04:11,370 --> 00:04:14,760 You build them up starting with sensory motor representations 105 00:04:14,760 --> 00:04:16,140 very, very gradually. 106 00:04:16,140 --> 00:04:19,420 You progress through a lot of concrete information 107 00:04:19,420 --> 00:04:21,450 until finally, somewhere around the age of 12, 108 00:04:21,450 --> 00:04:25,450 you get to abstract representations of the world. 109 00:04:25,450 --> 00:04:27,820 But it turns out that just like you and your spleens, 110 00:04:27,820 --> 00:04:30,160 children have a lot of abstract knowledge 111 00:04:30,160 --> 00:04:34,150 before they have much of this concrete information. 112 00:04:34,150 --> 00:04:35,740 They know, actually, almost nothing-- 113 00:04:35,740 --> 00:04:37,364 even less than you know, it turns out-- 114 00:04:37,364 --> 00:04:38,650 about anatomy and biology. 115 00:04:38,650 --> 00:04:40,090 But they know all of these kinds of things. 116 00:04:40,090 --> 00:04:41,080 Animals have insides. 117 00:04:41,080 --> 00:04:43,150 Similar kinds of animals have similar insides. 118 00:04:43,150 --> 00:04:44,500 Plants and objects don't. 119 00:04:44,500 --> 00:04:47,050 Removing those insides is a bad idea usually. 120 00:04:47,050 --> 00:04:50,530 They can go on and on without really understanding anything 121 00:04:50,530 --> 00:04:52,790 about anatomy or biology. 122 00:04:52,790 --> 00:04:53,560 And so for you. 123 00:04:53,560 --> 00:04:55,730 If I push you hard on all kinds of things-- 124 00:04:55,730 --> 00:04:57,460 how does a scissors really work-- 125 00:04:57,460 --> 00:05:00,160 you know, most of you would be like, ahh, and [INAUDIBLE] 126 00:05:00,160 --> 00:05:02,290 pretty quick, OK? 127 00:05:02,290 --> 00:05:03,340 And we know this. 
128 00:05:03,340 --> 00:05:06,877 So we have these intuitive theories 129 00:05:06,877 --> 00:05:08,710 that seem to constrain our hypothesis space. 130 00:05:08,710 --> 00:05:10,570 But it's a really hard chicken and egg problem, 131 00:05:10,570 --> 00:05:12,910 because we've said we need these rich abstract theories 132 00:05:12,910 --> 00:05:14,770 to constrain the interpretation of data. 133 00:05:14,770 --> 00:05:16,270 But how do we learn them if we don't 134 00:05:16,270 --> 00:05:17,410 have concrete information? 135 00:05:17,410 --> 00:05:19,172 And nonetheless, some of them are learned. 136 00:05:19,172 --> 00:05:21,380 We're going to return to that at the end of the talk. 137 00:05:21,380 --> 00:05:23,740 And really, I'm going to do that to set up Josh and Tomer, who 138 00:05:23,740 --> 00:05:25,070 are going to talk about that. 139 00:05:25,070 --> 00:05:26,695 But first, what I'm going to talk about 140 00:05:26,695 --> 00:05:29,170 is-- you know, OK, why am I talking about this 141 00:05:29,170 --> 00:05:30,190 as an intuitive theory? 142 00:05:30,190 --> 00:05:32,200 Where is this argument coming from? 143 00:05:32,200 --> 00:05:33,540 I'm interested in learning. 144 00:05:33,540 --> 00:05:35,350 And it's a research program that emerged 145 00:05:35,350 --> 00:05:37,870 against the backdrop of two revolutions 146 00:05:37,870 --> 00:05:39,970 in our understanding of cognitive development. 147 00:05:39,970 --> 00:05:41,320 One was the infancy revolution. 148 00:05:41,320 --> 00:05:43,900 You've heard a lot about it, so I'm going to go very quickly. 149 00:05:43,900 --> 00:05:45,441 Babies, it turns out, know a lot more 150 00:05:45,441 --> 00:05:48,520 than we thought about objects, and about number, 151 00:05:48,520 --> 00:05:52,600 and about agents, and their goals, and their intentions. 152 00:05:52,600 --> 00:05:54,400 It's not just infants though. 
153 00:05:54,400 --> 00:05:59,560 It turns out that very young children, preschoolers, also 154 00:05:59,560 --> 00:06:02,440 represent knowledge that is not plausibly 155 00:06:02,440 --> 00:06:07,780 innate in ways that are abstract, that are coherent, 156 00:06:07,780 --> 00:06:11,560 that are causal, that support prediction, and intervention, 157 00:06:11,560 --> 00:06:14,890 and explanation, and counterfactual reasoning 158 00:06:14,890 --> 00:06:17,290 in ways that seem to justify referring 159 00:06:17,290 --> 00:06:21,100 to them as intuitive theories. 160 00:06:21,100 --> 00:06:22,960 And together, these two revolutions, 161 00:06:22,960 --> 00:06:25,134 the infancy revolution and the revolution 162 00:06:25,134 --> 00:06:26,800 in our understanding of early childhood, 163 00:06:26,800 --> 00:06:29,330 dismantled Piagetian stage theory. 164 00:06:29,330 --> 00:06:31,220 There was never a time-- 165 00:06:31,220 --> 00:06:32,980 there is never a time-- in development 166 00:06:32,980 --> 00:06:36,490 when babies are only sensory motor learners. 167 00:06:36,490 --> 00:06:39,040 There is never a time when there is not 168 00:06:39,040 --> 00:06:43,780 some level of abstract representation going on. 169 00:06:43,780 --> 00:06:47,380 But fundamentally, neither of these revolutions 170 00:06:47,380 --> 00:06:49,267 was about learning per se. 171 00:06:49,267 --> 00:06:51,100 And I think I can make that point most clear 172 00:06:51,100 --> 00:06:55,420 by pointing to a popular book that came out at the time. 173 00:06:55,420 --> 00:06:58,669 The subtitle here is Minds, Brains, and How Children Learn. 174 00:06:58,669 --> 00:07:00,210 But in fact, if you look in the book, 175 00:07:00,210 --> 00:07:01,418 this was a publisher's title. 176 00:07:01,418 --> 00:07:02,640 There's nothing about brains. 177 00:07:02,640 --> 00:07:04,390 And there's very little about learning. 
178 00:07:04,390 --> 00:07:07,660 The titles are what children know about objects, what 179 00:07:07,660 --> 00:07:10,080 children know about agents. 180 00:07:10,080 --> 00:07:12,850 And I can say this with great reverence and some authority, 181 00:07:12,850 --> 00:07:15,160 because the first author's my thesis advisor. 182 00:07:15,160 --> 00:07:16,810 And the book came out in 1999, which 183 00:07:16,810 --> 00:07:18,850 is the year I started graduate school. 184 00:07:18,850 --> 00:07:23,760 So literally and metaphorically, this is where I began. 185 00:07:23,760 --> 00:07:27,010 I began with this metaphor of the child as scientist. 186 00:07:27,010 --> 00:07:29,020 And it's a really problematic metaphor, 187 00:07:29,020 --> 00:07:32,440 because science is a historically and culturally 188 00:07:32,440 --> 00:07:35,290 specific practice that is practiced 189 00:07:35,290 --> 00:07:37,600 by a tiny minority of the human species 190 00:07:37,600 --> 00:07:40,000 and is difficult even for us, right? 191 00:07:40,000 --> 00:07:42,190 So it seems a really odd place to look 192 00:07:42,190 --> 00:07:46,750 for a universal metaphor for cognitive development. 193 00:07:46,750 --> 00:07:49,610 But science has this peculiar property, 194 00:07:49,610 --> 00:07:52,420 which is that it gets the world right. 195 00:07:52,420 --> 00:07:55,570 And if you really want to understand how new knowledge is 196 00:07:55,570 --> 00:07:58,480 possible, how you could get the world right, 197 00:07:58,480 --> 00:08:01,990 you might want to understand how scientists do it, 198 00:08:01,990 --> 00:08:04,420 what kinds of epistemic practices 199 00:08:04,420 --> 00:08:07,390 might support learning and discovery. 200 00:08:07,390 --> 00:08:10,771 And the answer to that is both that we do and do not know, 201 00:08:10,771 --> 00:08:12,520 which is to say we can say a lot of things 202 00:08:12,520 --> 00:08:13,720 about what scientists do. 
203 00:08:13,720 --> 00:08:15,900 Here are some of them. 204 00:08:15,900 --> 00:08:17,410 They'll all be familiar to you. 205 00:08:17,410 --> 00:08:19,960 They would be familiar to you if you were a physicist, 206 00:08:19,960 --> 00:08:22,690 if you were in aero-astro, if you were in paleontology. 207 00:08:22,690 --> 00:08:24,700 These are the kinds of scientific practices 208 00:08:24,700 --> 00:08:28,150 that cut across content domains and arguably define 209 00:08:28,150 --> 00:08:30,947 what science is, which is to say, 210 00:08:30,947 --> 00:08:33,280 if you did all of these things, you couldn't necessarily 211 00:08:33,280 --> 00:08:34,210 do science. 212 00:08:34,210 --> 00:08:36,940 Science requires bringing these inferential processes 213 00:08:36,940 --> 00:08:40,000 to bear on really rich, specific, 214 00:08:40,000 --> 00:08:44,310 conceptual representations of individual content domains. 215 00:08:44,310 --> 00:08:47,562 But arguably, if you had all of that rich, specific content 216 00:08:47,562 --> 00:08:49,270 knowledge and you didn't do these things, 217 00:08:49,270 --> 00:08:51,040 you couldn't learn anything at all. 218 00:08:51,040 --> 00:08:53,380 These are the kinds of epistemic practices 219 00:08:53,380 --> 00:08:56,220 that seem fundamental to inquiry and discovery. 220 00:08:56,220 --> 00:08:58,480 And the argument that I've made in my research program 221 00:08:58,480 --> 00:09:00,521 is they are fundamental not just in science, they 222 00:09:00,521 --> 00:09:02,980 are fundamental in cognitive development. 223 00:09:02,980 --> 00:09:07,030 These are the processes that support learning and discovery. 224 00:09:07,030 --> 00:09:10,780 And there's good evidence that each and every one of them 225 00:09:10,780 --> 00:09:15,190 emerges in some form in the first few years of life. 
226 00:09:15,190 --> 00:09:17,800 And that is because these are the only rational processes 227 00:09:17,800 --> 00:09:22,090 we know of that can solve the hard problem of learning, which 228 00:09:22,090 --> 00:09:25,150 is exactly the problem of how to draw 229 00:09:25,150 --> 00:09:28,900 rich abstract inferences rapidly and accurately 230 00:09:28,900 --> 00:09:31,720 from sparse, noisy data. 231 00:09:31,720 --> 00:09:34,570 I said we could characterize these practices both formally 232 00:09:34,570 --> 00:09:36,380 and informally. 233 00:09:36,380 --> 00:09:38,115 And indeed, as you've heard, I think, 234 00:09:38,115 --> 00:09:40,240 from some of our computational modeling colleagues, 235 00:09:40,240 --> 00:09:42,167 for each and every one of these practices, 236 00:09:42,167 --> 00:09:43,750 we can begin to characterize something 237 00:09:43,750 --> 00:09:47,290 about what it means to distinguish 238 00:09:47,290 --> 00:09:50,110 genuine causes from spurious associations 239 00:09:50,110 --> 00:09:54,160 or to optimize information gain. 240 00:09:54,160 --> 00:09:57,300 But with all due respect to my computational modeling 241 00:09:57,300 --> 00:10:01,270 colleagues, and much as we want really simple models-- 242 00:10:01,270 --> 00:10:04,930 Hebb's rule, Rescorla-Wagner, Bayes' law-- 243 00:10:04,930 --> 00:10:06,820 that would capture it, none of these 244 00:10:06,820 --> 00:10:09,100 do justice to what children can do. 245 00:10:09,100 --> 00:10:12,340 Because children can do all of these things. 246 00:10:12,340 --> 00:10:15,190 And we don't yet have a full formal theory 247 00:10:15,190 --> 00:10:18,160 of hypothesis generation, inquiry, and discovery. 248 00:10:18,160 --> 00:10:21,322 That remains a hard problem of cognitive science. 249 00:10:21,322 --> 00:10:22,780 But it's a problem to which I think 250 00:10:22,780 --> 00:10:24,580 our theories should aspire.
251 00:10:24,580 --> 00:10:26,920 Because there is good empirical data 252 00:10:26,920 --> 00:10:29,440 that this is the kind of learning that humans, including 253 00:10:29,440 --> 00:10:33,740 even very young children, engage in. 254 00:10:33,740 --> 00:10:37,650 So normally, at this point, what I would do is I'd say, 255 00:10:37,650 --> 00:10:39,650 and I'm going to show you a few examples of this 256 00:10:39,650 --> 00:10:41,355 from my research program. 257 00:10:41,355 --> 00:10:43,730 But the talk I'm giving here is a sort of funny throwback 258 00:10:43,730 --> 00:10:44,670 talk in some ways. 259 00:10:44,670 --> 00:10:45,920 What I was asked to talk about today 260 00:10:45,920 --> 00:10:47,044 was the child as scientist. 261 00:10:47,044 --> 00:10:50,450 And that was a research program from a few, few years ago. 262 00:10:50,450 --> 00:10:52,700 And you know, at that point, I was a junior professor. 263 00:10:52,700 --> 00:10:54,230 And you know, when you're a junior professor, 264 00:10:54,230 --> 00:10:56,230 like when you're a graduate student or post-doc, 265 00:10:56,230 --> 00:10:59,140 it's all idealistic science for the sake of knowledge, 266 00:10:59,140 --> 00:10:59,960 pure science. 267 00:10:59,960 --> 00:11:02,000 And then you get tenure. 268 00:11:02,000 --> 00:11:05,210 And it's all grants, and money, and administration, 269 00:11:05,210 --> 00:11:07,370 and allocation of resources. 270 00:11:07,370 --> 00:11:10,340 And in the years since, I have started 271 00:11:10,340 --> 00:11:15,140 to think not just about the pure science of learning, 272 00:11:15,140 --> 00:11:18,500 but about the cost associated with gaining information 273 00:11:18,500 --> 00:11:20,820 and the trade-offs between those costs and rewards. 
274 00:11:20,820 --> 00:11:24,680 So I'm going to take these same practices now, and situate them 275 00:11:24,680 --> 00:11:27,890 in a world, a real world, that has certain kinds of trade-offs 276 00:11:27,890 --> 00:11:30,380 in how you think about information and talk, 277 00:11:30,380 --> 00:11:32,210 along with the child as scientist, 278 00:11:32,210 --> 00:11:35,780 about what those costs do and how they are also, 279 00:11:35,780 --> 00:11:38,386 in themselves, a source of information about the world. 280 00:11:38,386 --> 00:11:40,010 So I'm going to talk about this as sort 281 00:11:40,010 --> 00:11:42,200 of inferential economics. 282 00:11:42,200 --> 00:11:43,820 Children know information is valuable. 283 00:11:43,820 --> 00:11:45,620 I'm going to show you a couple of examples 284 00:11:45,620 --> 00:11:47,630 of how they reason about it. 285 00:11:47,630 --> 00:11:50,120 And children selectively explore in ways 286 00:11:50,120 --> 00:11:51,371 that support information gain. 287 00:11:51,371 --> 00:11:52,994 So I'm going to show you some old work, 288 00:11:52,994 --> 00:11:55,100 but I'm also going to throw in a few new studies, 289 00:11:55,100 --> 00:11:57,070 because I can't resist. 290 00:11:57,070 --> 00:11:59,310 But information is also costly. 291 00:11:59,310 --> 00:12:02,120 And the costs themselves are informative in a variety 292 00:12:02,120 --> 00:12:02,830 of ways. 293 00:12:02,830 --> 00:12:05,413 I'm not necessarily going to get through all of these studies, 294 00:12:05,413 --> 00:12:07,620 although I might try to. 295 00:12:07,620 --> 00:12:10,970 But I want to give you a kind of feel for the kinds of things 296 00:12:10,970 --> 00:12:12,180 children can do. 297 00:12:12,180 --> 00:12:14,690 All right, so let's start by talking 298 00:12:14,690 --> 00:12:15,899 about a really basic problem. 
299 00:12:15,899 --> 00:12:17,523 It's a problem that's basic to science, 300 00:12:17,523 --> 00:12:20,060 but it's also a problem basic to human learning, which is 301 00:12:20,060 --> 00:12:21,330 the problem of generalization. 302 00:12:21,330 --> 00:12:23,810 How do you generalize from sparse data? 303 00:12:23,810 --> 00:12:25,732 In science, we do this all the time. 304 00:12:25,732 --> 00:12:26,690 We have a small sample. 305 00:12:26,690 --> 00:12:29,990 We want to make a claim about the population as a whole. 306 00:12:29,990 --> 00:12:33,820 And of course, we can use feature similarity and category 307 00:12:33,820 --> 00:12:36,230 membership to say things that look the same 308 00:12:36,230 --> 00:12:38,960 or belong to the same kind are likely to share properties. 309 00:12:38,960 --> 00:12:40,370 So if some of these Martian rocks 310 00:12:40,370 --> 00:12:42,860 have high concentrations of silica, maybe they all do. 311 00:12:42,860 --> 00:12:46,130 If some of these needles on Pacific silver fir trees 312 00:12:46,130 --> 00:12:49,230 grow flat on the branch, maybe they all do. 313 00:12:52,270 --> 00:12:54,520 But in science, we can also do something a little more 314 00:12:54,520 --> 00:12:56,140 fussy and suspicious. 315 00:12:56,140 --> 00:12:58,600 We can say, well, you know, it kind of 316 00:12:58,600 --> 00:13:02,155 depends on how you got that sample of evidence, right? 317 00:13:02,155 --> 00:13:04,612 If you randomly sampled from the population, yeah, sure, 318 00:13:04,612 --> 00:13:06,070 the properties, you can generalize. 319 00:13:06,070 --> 00:13:08,620 But if you cherry-picked that data in some way, 320 00:13:08,620 --> 00:13:12,310 maybe the sample isn't going to generalize quite as broadly. 321 00:13:12,310 --> 00:13:15,590 So do all Martian rocks have high concentrations of silica 322 00:13:15,590 --> 00:13:17,620 or just the dusty ones on the surface? 
323 00:13:17,620 --> 00:13:20,620 Do all Pacific silver fir needles lie flat or just 324 00:13:20,620 --> 00:13:22,870 those low on the canopy, right? 325 00:13:22,870 --> 00:13:24,950 These are the ones that are easy to sample from. 326 00:13:24,950 --> 00:13:28,677 How do I know how generalizable the property is? 327 00:13:28,677 --> 00:13:30,760 And if I think that you cherry-picked your sample, 328 00:13:30,760 --> 00:13:33,970 I might constrain my inferences only to things near the ground. 329 00:13:33,970 --> 00:13:35,470 So how far you're going to extend 330 00:13:35,470 --> 00:13:37,780 a generalization in science depends on 331 00:13:37,780 --> 00:13:39,940 whether you think that the sampling process was 332 00:13:39,940 --> 00:13:41,110 random or selective. 333 00:13:41,110 --> 00:13:43,360 And we wanted to know whether this was true for babies 334 00:13:43,360 --> 00:13:44,730 as well. 335 00:13:44,730 --> 00:13:45,820 So this is how we asked. 336 00:13:45,820 --> 00:13:47,950 We showed babies a population, in this case, 337 00:13:47,950 --> 00:13:50,350 of blue and yellow dog toys. 338 00:13:50,350 --> 00:13:51,330 They're in a box. 339 00:13:51,330 --> 00:13:53,920 The box is transparent, has a false front so it stays 340 00:13:53,920 --> 00:13:56,990 a more or less stable representation of what looks like a lot of balls. 341 00:13:56,990 --> 00:13:58,900 And we're going to reach into that box. 342 00:13:58,900 --> 00:14:00,739 And we're going to pull out-- 343 00:14:00,739 --> 00:14:03,280 there are many more blue balls in this box than yellow balls. 344 00:14:03,280 --> 00:14:05,620 We're going to pull out three blue balls one at a time 345 00:14:05,620 --> 00:14:06,880 and squeak them-- 346 00:14:06,880 --> 00:14:08,896 and squeeze them-- and they're going to squeak. 347 00:14:08,896 --> 00:14:11,020 And then we're going to hand the baby a yellow ball 348 00:14:11,020 --> 00:14:11,980 from the same box.
349 00:14:11,980 --> 00:14:14,170 And the question is, does the baby squeeze the ball 350 00:14:14,170 --> 00:14:16,335 and expect it to squeak? 351 00:14:16,335 --> 00:14:18,460 Well, there's nothing very suspicious about pulling 352 00:14:18,460 --> 00:14:22,180 three blue balls from a box of mostly blue balls. 353 00:14:22,180 --> 00:14:25,160 And this has a lot of feature properties in common. 354 00:14:25,160 --> 00:14:26,729 So it looks like the others. 355 00:14:26,729 --> 00:14:28,520 We predict that children should generalize. 356 00:14:28,520 --> 00:14:31,330 They should try squeezing this ball and should squeeze often. 357 00:14:31,330 --> 00:14:32,704 And the question is, what happens 358 00:14:32,704 --> 00:14:36,490 if you do exactly the same sample in exactly the same way 359 00:14:36,490 --> 00:14:39,054 from a different population? 360 00:14:39,054 --> 00:14:41,470 Now, it is very unlikely that you sampled three blue balls 361 00:14:41,470 --> 00:14:44,140 from a population of mostly yellow balls. 362 00:14:44,140 --> 00:14:45,610 In this case, it's much more likely 363 00:14:45,610 --> 00:14:47,800 that you were sampling selectively. 364 00:14:47,800 --> 00:14:51,350 So maybe only the blue balls had the property. 365 00:14:51,350 --> 00:14:51,850 Yeah? 366 00:14:51,850 --> 00:14:52,920 AUDIENCE: And they can see the population? 367 00:14:52,920 --> 00:14:55,419 LAURA SCHULZ: They can see the population-- transparent box, 368 00:14:55,419 --> 00:14:56,670 transparent front. 369 00:14:56,670 --> 00:15:00,160 So they can see the population. 370 00:15:00,160 --> 00:15:01,670 And if children understand that it's 371 00:15:01,670 --> 00:15:04,210 not just about the property similarity 372 00:15:04,210 --> 00:15:07,120 but something about how that evidence was generated, then, 373 00:15:07,120 --> 00:15:09,125 in this case, children should say, well look, 374 00:15:09,125 --> 00:15:11,500 you just looked like you were cherry-picking your sample. 
375 00:15:11,500 --> 00:15:13,270 Maybe it doesn't generalize. The prediction is 376 00:15:13,270 --> 00:15:15,340 that fewer children should try squeezing 377 00:15:15,340 --> 00:15:17,234 and children should squeeze less often. 378 00:15:17,234 --> 00:15:19,150 So I'm going to show you what this looks like. 379 00:15:19,150 --> 00:15:21,680 By the way, the yellow one has that funny thing at the end 380 00:15:21,680 --> 00:15:24,230 so that children could do something else with the ball, 381 00:15:24,230 --> 00:15:24,730 right? 382 00:15:24,730 --> 00:15:26,170 So they can bang it, or throw it around, 383 00:15:26,170 --> 00:15:27,290 or other things like that. 384 00:15:27,290 --> 00:15:28,961 So let me show you what it looks like. 385 00:15:28,961 --> 00:15:31,210 Kids are always going to see three squeaky blue balls. 386 00:15:31,210 --> 00:15:32,918 They're always going to get a yellow one. 387 00:15:32,918 --> 00:15:35,200 But you'll see that they do different things depending 388 00:15:35,200 --> 00:15:36,010 on whether they think-- 389 00:15:36,010 --> 00:15:36,370 [VIDEO PLAYBACK] 390 00:15:36,370 --> 00:15:37,930 LAURA SCHULZ: --the evidence was randomly sampled 391 00:15:37,930 --> 00:15:39,460 and possibly generalizable or not. 392 00:15:46,404 --> 00:15:48,740 So the child at the top is squeezing, 393 00:15:48,740 --> 00:16:04,992 and squeezing, and squeezing, and squeezing. 394 00:16:12,864 --> 00:16:14,360 [END PLAYBACK] 395 00:16:14,360 --> 00:16:16,970 So these were-- this was true both of the mean number 396 00:16:16,970 --> 00:16:19,077 of squeezes and the number of individual children 397 00:16:19,077 --> 00:16:19,910 who squeezed at all. 398 00:16:19,910 --> 00:16:21,760 I'm just going to show you the mean number of squeezes.
399 00:16:21,760 --> 00:16:23,843 But what you'll see is that children are much more 400 00:16:23,843 --> 00:16:26,000 likely to squeeze, and squeeze persistently, 401 00:16:26,000 --> 00:16:27,830 in the condition where the evidence looks 402 00:16:27,830 --> 00:16:31,202 like it was randomly sampled than selectively sampled. 403 00:16:31,202 --> 00:16:32,660 But what you could worry about here 404 00:16:32,660 --> 00:16:34,130 is children are sensitive to something 405 00:16:34,130 --> 00:16:36,230 about the relationship of the sample and the population. 406 00:16:36,230 --> 00:16:37,700 But maybe they will just generalize 407 00:16:37,700 --> 00:16:42,770 from a majority object to a minority, but not the reverse. 408 00:16:42,770 --> 00:16:44,990 Maybe they won't generalize from a minority object 409 00:16:44,990 --> 00:16:46,107 to the majority. 410 00:16:46,107 --> 00:16:48,440 So they don't really care about whether the evidence was 411 00:16:48,440 --> 00:16:50,120 randomly sampled or not, they just 412 00:16:50,120 --> 00:16:53,060 care about that aspect of the sample and population. 413 00:16:53,060 --> 00:16:55,730 So we ran a replication of the yellow ball condition. 414 00:16:55,730 --> 00:16:58,520 Again, we're going to pull an unlikely sample from that box, 415 00:16:58,520 --> 00:16:59,900 three blue balls in a row. 416 00:16:59,900 --> 00:17:02,420 And we're going to compare it with a sample that's 417 00:17:02,420 --> 00:17:03,470 not that improbable. 418 00:17:03,470 --> 00:17:06,579 You could easily randomly sample just one blue ball 419 00:17:06,579 --> 00:17:07,710 from the box. 420 00:17:07,710 --> 00:17:10,369 So in this case, children are going 421 00:17:10,369 --> 00:17:12,038 to see much less squeezing, right? 422 00:17:12,038 --> 00:17:14,079 They're only going to see one blue ball squeezed. 
423 00:17:14,079 --> 00:17:16,020 And we squeeze it both once and three times, 424 00:17:16,020 --> 00:17:18,944 but it's only one blue ball in two different conditions. 425 00:17:18,944 --> 00:17:21,319 And the prediction there is that even though the children 426 00:17:21,319 --> 00:17:23,169 themselves are seeing much less squeezing, 427 00:17:23,169 --> 00:17:25,460 they should say, well, that's not an improbable sample. 428 00:17:25,460 --> 00:17:28,940 And they, themselves, should squeeze more. 429 00:17:28,940 --> 00:17:34,040 And that's exactly what we found in both of those conditions. 430 00:17:34,040 --> 00:17:35,220 It's graded, by the way. 431 00:17:35,220 --> 00:17:36,440 If you do two balls, they're intermediate. 432 00:17:36,440 --> 00:17:37,464 And if it's a model-- 433 00:17:37,464 --> 00:17:39,380 well, not going to talk about-- but yeah, what 434 00:17:39,380 --> 00:17:42,227 happens if you just pour them upside down and drop them? 435 00:17:42,227 --> 00:17:43,810 Now this is a really improbable sample 436 00:17:43,810 --> 00:17:45,601 I said, three blue balls from a yellow box. 437 00:17:45,601 --> 00:17:47,360 But you've just given positive evidence 438 00:17:47,360 --> 00:17:49,340 that you shook the ball, and it just happened to fall out. 439 00:17:49,340 --> 00:17:50,881 And they don't know we're MIT, and we 440 00:17:50,881 --> 00:17:54,460 can do sneaky technological things like have a trap door. 441 00:17:54,460 --> 00:17:56,510 So in this case, it's an improbable sample, 442 00:17:56,510 --> 00:17:57,920 but it was randomly generated. 443 00:17:57,920 --> 00:17:59,660 And the prediction is, in this case, 444 00:17:59,660 --> 00:18:01,960 the babies themselves should squeeze more. 445 00:18:01,960 --> 00:18:04,460 Because as I say, if you can pour out any balls that squeak, 446 00:18:04,460 --> 00:18:06,080 probably everything squeaks. 447 00:18:06,080 --> 00:18:07,137 And indeed, they do. 
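[EDITOR'S NOTE] The sampling argument in the ball-box study can be made concrete with a small calculation. The sketch below is illustrative only: the talk does not give the number of balls in each box, so the counts used here (15 and 5 out of 20) are hypothetical, as are the function names, the 50/50 prior, and the simplifying assumption that a selective sampler always picks blue. It computes how probable an all-blue sample is under random draws without replacement, and how strongly such a sample should shift belief toward selective sampling.

```python
from fractions import Fraction


def prob_all_blue(n_blue, n_yellow, draws):
    """Chance that `draws` balls taken at random, without replacement, are all blue."""
    total = n_blue + n_yellow
    if draws > total:
        raise ValueError("cannot draw more balls than the box holds")
    p = Fraction(1)
    for i in range(draws):
        # Blue balls left over total balls left, after i blue balls were removed.
        p *= Fraction(n_blue - i, total - i)
    return p


def posterior_random(n_blue, n_yellow, draws, prior_random=Fraction(1, 2)):
    """Posterior that sampling was random, given an all-blue sample.

    Assumption (not from the talk): a selective sampler always picks blue,
    so the likelihood of an all-blue sample under selective sampling is 1.
    """
    like_random = prob_all_blue(n_blue, n_yellow, draws)
    like_selective = Fraction(1)
    numerator = like_random * prior_random
    return numerator / (numerator + like_selective * (1 - prior_random))


# Hypothetical boxes of 20 balls each.
mostly_blue = (15, 5)    # (blue, yellow)
mostly_yellow = (5, 15)

print(prob_all_blue(*mostly_blue, 3))      # three blue from a mostly blue box
print(prob_all_blue(*mostly_yellow, 3))    # three blue from a mostly yellow box
print(prob_all_blue(*mostly_yellow, 1))    # one blue from a mostly yellow box
print(posterior_random(*mostly_yellow, 3))
print(posterior_random(*mostly_yellow, 1))
```

Under these assumed counts, three blue balls in a row are far less probable from the mostly yellow box than from the mostly blue one, while a single blue ball from the mostly yellow box is not very surprising. That ordering matches the graded pattern described in the talk. The pour-out condition corresponds to positive evidence that the draw was random, which this sketch would capture by raising `prior_random`.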
448 00:18:07,137 --> 00:18:08,420 Indeed, they do. 449 00:18:08,420 --> 00:18:09,260 All right? 450 00:18:09,260 --> 00:18:12,950 So 15-month-old babies' generalizations 451 00:18:12,950 --> 00:18:15,230 take into account more than category membership 452 00:18:15,230 --> 00:18:17,180 and the perceptual similarity of objects. 453 00:18:17,180 --> 00:18:19,160 They make graded inferences that are sensitive 454 00:18:19,160 --> 00:18:21,410 both to the amount of evidence they observe 455 00:18:21,410 --> 00:18:24,140 and to the process by which that evidence is sampled. 456 00:18:24,140 --> 00:18:26,081 Is that clear? 457 00:18:26,081 --> 00:18:27,830 All right, let me show you another example 458 00:18:27,830 --> 00:18:29,110 of sort of child as scientist. 459 00:18:29,110 --> 00:18:31,610 It's going to start with a hard problem of confounding 460 00:18:31,610 --> 00:18:33,985 that we all have, which is that we are part of the world. 461 00:18:33,985 --> 00:18:35,760 So in one-offs, when things go wrong, 462 00:18:35,760 --> 00:18:37,580 we may not know if we were responsible 463 00:18:37,580 --> 00:18:39,110 or the world was responsible, right? 464 00:18:39,110 --> 00:18:42,280 This is a chronic problem in relationships, right-- 465 00:18:42,280 --> 00:18:43,730 you or me? 466 00:18:43,730 --> 00:18:45,710 So it's a hard problem of confounding, 467 00:18:45,710 --> 00:18:47,842 and you might need some data to disambiguate it. 468 00:18:47,842 --> 00:18:50,300 So here we're going to give babies a case where they cannot 469 00:18:50,300 --> 00:18:51,630 do something. 470 00:18:51,630 --> 00:18:53,570 And the question is, can we give them 471 00:18:53,570 --> 00:18:57,332 a little bit of data to unconfound that problem 472 00:18:57,332 --> 00:18:59,540 and convince them either that the problem is probably 473 00:18:59,540 --> 00:19:02,219 with the toy or the problem is with themselves? 
474 00:19:02,219 --> 00:19:04,510 And the argument is, if they think that it's themselves 475 00:19:04,510 --> 00:19:05,343 that's the problem-- 476 00:19:05,343 --> 00:19:07,670 it's the agent state and not the environment state-- 477 00:19:07,670 --> 00:19:11,730 they should hold the toy constant and change the agent. 478 00:19:11,730 --> 00:19:13,760 But if they think it's the toy, then they 479 00:19:13,760 --> 00:19:15,830 should just go ahead and reach for another toy. 480 00:19:15,830 --> 00:19:17,746 So in both cases, they're going to have access 481 00:19:17,746 --> 00:19:19,560 to another person, their mom. 482 00:19:19,560 --> 00:19:21,980 And they're going to have access to another toy. 483 00:19:21,980 --> 00:19:25,070 And so the question is, what do they 484 00:19:25,070 --> 00:19:29,280 do if we give them minimal statistical data 485 00:19:29,280 --> 00:19:31,880 to disambiguate these, OK? 486 00:19:31,880 --> 00:19:34,430 So this is the setup. 487 00:19:34,430 --> 00:19:37,340 We're going to show the babies two agents. 488 00:19:37,340 --> 00:19:43,220 In one condition, I am going to succeed one time at making 489 00:19:43,220 --> 00:19:44,805 that toy go and fail one time. 490 00:19:45,305 --> 00:19:48,770 And Hyowon Gweon, my collaborator on this project, 491 00:19:48,770 --> 00:19:50,490 is also going to fail and succeed once. 492 00:19:50,490 --> 00:19:53,419 So this looks like this toy has maybe faulty wiring 493 00:19:53,419 --> 00:19:53,960 or something. 494 00:19:53,960 --> 00:19:55,793 It works some of the time, not all the time. 495 00:19:55,793 --> 00:19:57,480 It's just not a great toy. 496 00:19:57,480 --> 00:19:59,240 The babies can have another toy. 497 00:19:59,240 --> 00:20:00,710 The parents are going to be there. 498 00:20:00,710 --> 00:20:03,420 If they think it's the toy, they should change the object. 
499 00:20:03,420 --> 00:20:07,410 In the other condition, Hyowon is always going to succeed, 500 00:20:07,410 --> 00:20:11,050 which is generally true in my experience. 501 00:20:11,050 --> 00:20:13,490 And as is always true of my experience in technology, 502 00:20:13,490 --> 00:20:14,690 I am always going to fail. 503 00:20:14,690 --> 00:20:17,030 And in this case, the children should 504 00:20:17,030 --> 00:20:20,000 conclude that there's something wrong with the person. 505 00:20:20,000 --> 00:20:22,850 And if that's the case, they should hold the object constant 506 00:20:22,850 --> 00:20:23,735 and change the agent. 507 00:20:23,735 --> 00:20:24,860 This is what it looks like. 508 00:20:24,860 --> 00:20:27,245 [VIDEO PLAYBACK] 509 00:20:27,245 --> 00:20:32,969 - [INAUDIBLE] 510 00:20:32,969 --> 00:20:34,260 LAURA SCHULZ: We've lost sound. 511 00:20:41,700 --> 00:20:45,210 Well there's an audio here, but [INAUDIBLE] we 512 00:20:45,210 --> 00:20:47,434 are going to want that later. 513 00:20:47,434 --> 00:20:49,469 - Cool What happened to my toys? 514 00:20:49,469 --> 00:20:51,510 LAURA SCHULZ: In any case, we're just showing them 515 00:20:51,510 --> 00:20:52,470 the data at this point. 516 00:20:52,470 --> 00:20:55,120 And the babies are going to get the toy in each condition. 517 00:20:55,120 --> 00:20:56,334 One toy is on the mat. 518 00:20:56,334 --> 00:20:58,500 By the way, there are lots of individual differences 519 00:20:58,500 --> 00:21:01,620 in any two sets of clips between the exact positioning 520 00:21:01,620 --> 00:21:03,810 of the parent, the child, the toy. 521 00:21:03,810 --> 00:21:06,750 All of these were coded blind to condition 522 00:21:06,750 --> 00:21:08,520 for all of these other variables to make 523 00:21:08,520 --> 00:21:10,870 sure those were evenly matched across conditions, 524 00:21:10,870 --> 00:21:11,820 and they were.
525 00:21:11,820 --> 00:21:16,120 So what you're going to see now though is 526 00:21:16,120 --> 00:21:18,950 that in the condition where the babies think it is probably 527 00:21:18,950 --> 00:21:23,730 the toy, they engage in a very different behavior 528 00:21:23,730 --> 00:21:30,190 than when they think it is probably the agent. 529 00:21:30,190 --> 00:21:32,615 - [CRYING] 530 00:21:34,070 --> 00:21:36,980 - [INAUDIBLE] 531 00:21:41,830 --> 00:21:42,756 [END PLAYBACK] 532 00:21:42,756 --> 00:21:44,130 And that's what we found overall. 533 00:21:44,130 --> 00:21:46,410 The distribution, overall, of children's tendency 534 00:21:46,410 --> 00:21:48,190 to perform one action versus another 535 00:21:48,190 --> 00:21:51,420 differed across conditions depending on the pattern 536 00:21:51,420 --> 00:21:53,140 of data that they observed. 537 00:21:56,900 --> 00:21:59,900 So 16-month-olds track the statistical dependence 538 00:21:59,900 --> 00:22:01,490 between agents, objects, and outcomes. 539 00:22:01,490 --> 00:22:04,440 They can use minimal data to make attributions. 540 00:22:04,440 --> 00:22:06,830 And those attributions help them choose between seeking help 541 00:22:06,830 --> 00:22:10,920 from others or exploring on their own. 542 00:22:10,920 --> 00:22:12,624 Clear? 543 00:22:12,624 --> 00:22:15,560 OK. 544 00:22:15,560 --> 00:22:17,900 I've just shown you kids' sensitivity to the data 545 00:22:17,900 --> 00:22:18,680 that they see. 546 00:22:18,680 --> 00:22:20,990 But of course, data isn't handed out all the time. 547 00:22:20,990 --> 00:22:22,950 At least disambiguating data isn't handed out. 548 00:22:22,950 --> 00:22:24,964 One of the really important, hard things 549 00:22:24,964 --> 00:22:26,630 that you have to do if you want to learn 550 00:22:26,630 --> 00:22:28,940 is sometimes actually figure out what data would 551 00:22:28,940 --> 00:22:31,320 be informative and go get it.
552 00:22:31,320 --> 00:22:33,230 And that is a really characteristic thing 553 00:22:33,230 --> 00:22:33,860 about science. 554 00:22:33,860 --> 00:22:35,760 And the question is, is it, in some sense, 555 00:22:35,760 --> 00:22:37,755 a characteristic thing about common sense. 556 00:22:37,755 --> 00:22:39,380 So that's what we're going to ask here. 557 00:22:39,380 --> 00:22:42,530 We're going to jump to much older children here. 558 00:22:42,530 --> 00:22:45,110 These are four and five-year-olds. 559 00:22:45,110 --> 00:22:47,660 And we're going to give them a problem where 560 00:22:47,660 --> 00:22:49,910 instead of showing them the disambiguating data, 561 00:22:49,910 --> 00:22:52,277 we're going to ask if the kids themselves will find it. 562 00:22:52,277 --> 00:22:54,360 So what we showed children-- when you were little, 563 00:22:54,360 --> 00:22:55,700 you possibly played with some beads 564 00:22:55,700 --> 00:22:57,430 that snapped together and pulled apart. 565 00:22:57,430 --> 00:22:58,960 These are like toddler toys. 566 00:22:58,960 --> 00:23:00,710 So we gave them these snap together beads. 567 00:23:00,710 --> 00:23:02,120 They're each uniquely colored. 568 00:23:05,530 --> 00:23:08,220 We place each bead, one at a time, on a toy. 569 00:23:08,220 --> 00:23:12,110 And in one condition, every bead makes the toy play music 570 00:23:12,110 --> 00:23:13,610 to each bead you put on. 571 00:23:13,610 --> 00:23:16,615 And the other condition, only half the beads did. 572 00:23:16,615 --> 00:23:18,740 So the only difference between these two conditions 573 00:23:18,740 --> 00:23:20,990 is basically the base rate of the candidate causes. 574 00:23:20,990 --> 00:23:23,060 One works for every bead, the other only works 575 00:23:23,060 --> 00:23:25,880 for some of the beads. 
576 00:23:25,880 --> 00:23:28,084 And then we took all these training toys away, 577 00:23:28,084 --> 00:23:29,750 and we showed the children either a pair 578 00:23:29,750 --> 00:23:31,460 that we had epoxied together-- it was stuck. 579 00:23:31,460 --> 00:23:33,290 We tried to pull it apart, we showed we couldn't. 580 00:23:33,290 --> 00:23:34,700 The children tried to pull it apart, they couldn't. 581 00:23:34,700 --> 00:23:36,230 It's a stuck pair of beads, or it's 582 00:23:36,230 --> 00:23:38,810 an ordinary, separable pair of beads. 583 00:23:38,810 --> 00:23:41,240 And then, as a pair, we placed each pair, one at a time, 584 00:23:41,240 --> 00:23:44,120 on the toy, and the toy played music. 585 00:23:44,120 --> 00:23:46,400 In principle, this evidence is always confounded. 586 00:23:46,400 --> 00:23:49,580 One bead in each pair might be the responsible party 587 00:23:49,580 --> 00:23:51,267 activating the toy. 588 00:23:51,267 --> 00:23:52,850 But as a practical matter, if you just 589 00:23:52,850 --> 00:23:55,529 learn the base rate is that every single bead activates 590 00:23:55,529 --> 00:23:58,070 this toy, there's not a lot of information to be gained here. 591 00:23:58,070 --> 00:24:00,980 You should just assume that all of these beads work. 592 00:24:00,980 --> 00:24:02,870 And in that condition, we expected 593 00:24:02,870 --> 00:24:05,060 that kids would play indiscriminately 594 00:24:05,060 --> 00:24:06,352 with the two toys. 595 00:24:06,352 --> 00:24:09,350 But in the condition where only some of the beads work, 596 00:24:09,350 --> 00:24:10,830 there's genuine uncertainty. 597 00:24:10,830 --> 00:24:11,330 Right? 598 00:24:11,330 --> 00:24:12,900 Maybe only one of these beads work. 599 00:24:12,900 --> 00:24:14,480 Maybe they both did. 
600 00:24:14,480 --> 00:24:15,980 And if that's true, and if kids are 601 00:24:15,980 --> 00:24:18,440 sensitive to the possibility of information gain, 602 00:24:18,440 --> 00:24:21,360 only one of these beads affords the possibility of finding out. 603 00:24:21,360 --> 00:24:24,330 On only one can you isolate the variables. 604 00:24:24,330 --> 00:24:25,820 And that's with the separable pair. 605 00:24:25,820 --> 00:24:28,070 So we thought on this condition, the kids should selectively 606 00:24:28,070 --> 00:24:29,278 play with the separable pair. 607 00:24:29,278 --> 00:24:32,120 And in particular, they should place each bead, one at a time, 608 00:24:32,120 --> 00:24:34,220 on the toy. 609 00:24:34,220 --> 00:24:35,657 So that's, in fact, what we find. 610 00:24:35,657 --> 00:24:37,490 In the obvious condition, the kids basically 611 00:24:37,490 --> 00:24:39,164 never separated the pair. 612 00:24:39,164 --> 00:24:41,330 And in the some beads condition, about half the kids 613 00:24:41,330 --> 00:24:46,280 did it and performed the exhaustive intervention. 614 00:24:46,280 --> 00:24:48,620 That was cool, but my graduate student at the time said, 615 00:24:48,620 --> 00:24:50,370 they're doing something really interesting 616 00:24:50,370 --> 00:24:52,610 even with the stuck pair of beads. 617 00:24:52,610 --> 00:24:54,129 We should look at the stuck pair. 618 00:24:54,129 --> 00:24:56,170 And I said, what can they do with the stuck pair? 619 00:24:56,170 --> 00:24:57,730 There's nothing to be done with the stuck pair. 620 00:24:57,730 --> 00:24:58,230 It's stuck. 621 00:24:58,230 --> 00:25:00,021 And she said, well, let's just try it again 622 00:25:00,021 --> 00:25:00,869 with the stuck pair. 623 00:25:00,869 --> 00:25:01,910 So we did the same thing. 624 00:25:01,910 --> 00:25:03,980 They got introduced to the fact that either every bead worked, 625 00:25:03,980 --> 00:25:05,000 or only some of the beads worked.
626 00:25:05,000 --> 00:25:07,190 And this time we introduced just the stuck pair, 627 00:25:07,190 --> 00:25:10,180 and we placed it on the toy, and the toy made music. 628 00:25:10,180 --> 00:25:11,930 And let me show you what the children did. 629 00:25:11,930 --> 00:25:12,200 [VIDEO PLAYBACK] 630 00:25:12,200 --> 00:25:14,156 - All right, I'm going to do this one, 631 00:25:14,156 --> 00:25:16,112 and then it'll be your turn a little later. 632 00:25:16,112 --> 00:25:19,140 But now, can you just watch and see what happens? 633 00:25:19,140 --> 00:25:19,895 All right. 634 00:25:24,178 --> 00:25:27,154 This one makes the machine go. 635 00:25:27,154 --> 00:25:30,240 How about this one? 636 00:25:30,240 --> 00:25:33,350 This one doesn't make the machine go. 637 00:25:33,350 --> 00:25:36,130 What about this one? 638 00:25:36,130 --> 00:25:39,634 This one doesn't make the machine go. 639 00:25:39,634 --> 00:25:42,898 Let's try this one. 640 00:25:42,898 --> 00:25:44,182 This one makes the machine go. 641 00:25:44,182 --> 00:25:46,140 LAURA SCHULZ: She goes over that a second time, 642 00:25:46,140 --> 00:25:49,750 and then she hands the child the toy. 643 00:25:49,750 --> 00:25:52,690 - Just a minute. 644 00:25:52,690 --> 00:25:54,557 - Look at that. 645 00:25:54,557 --> 00:25:55,140 [END PLAYBACK] 646 00:25:55,140 --> 00:25:57,639 LAURA SCHULZ: The child plays around, does just what we did. 647 00:26:04,630 --> 00:26:07,651 And then she does something we'd never done. 648 00:26:07,651 --> 00:26:09,150 She rotates the position of the bead 649 00:26:09,150 --> 00:26:13,158 so that only one makes contact with the toy at a time. 650 00:26:13,158 --> 00:26:17,898 And if you have a folk theory of contact causality, 651 00:26:17,898 --> 00:26:21,400 that is a pretty good way to isolate your variables. 652 00:26:21,400 --> 00:26:23,420 And not one that had occurred to, say, the PI 653 00:26:23,420 --> 00:26:24,657 on this investigation.
654 00:26:24,657 --> 00:26:26,240 But in fact, it occurred to about half 655 00:26:26,240 --> 00:26:28,370 the kids again, in that condition. 656 00:26:28,370 --> 00:26:31,010 In the some beads condition, where there was uncertainty, 657 00:26:31,010 --> 00:26:33,960 the kids were more likely to design their own intervention 658 00:26:33,960 --> 00:26:36,950 to try to isolate the variables than in the condition where 659 00:26:36,950 --> 00:26:38,951 all the beads worked. 660 00:26:38,951 --> 00:26:40,910 So preschoolers are using information 661 00:26:40,910 --> 00:26:42,680 about the base rate of candidate causes 662 00:26:42,680 --> 00:26:44,732 to distinguish the ambiguity of the evidence, 663 00:26:44,732 --> 00:26:46,190 and they're selecting and designing 664 00:26:46,190 --> 00:26:48,740 potentially informative interventions to isolate 665 00:26:48,740 --> 00:26:52,350 these causal variables. 666 00:26:52,350 --> 00:26:54,200 All right. 667 00:26:54,200 --> 00:26:55,750 I'm going to show you some new work 668 00:26:55,750 --> 00:26:59,010 now, kind of on the same theme. 669 00:26:59,010 --> 00:27:01,010 One way that investigations can be uninformative 670 00:27:01,010 --> 00:27:02,691 is because evidence is confounded. 671 00:27:02,691 --> 00:27:04,190 We're all familiar with that, right? 672 00:27:04,190 --> 00:27:05,960 We think we did the perfect experiment, then we're like, 673 00:27:05,960 --> 00:27:07,580 oh, well, it really could have been because of this really 674 00:27:07,580 --> 00:27:09,140 silly, boring reason. 675 00:27:09,140 --> 00:27:10,760 And that's disappointing. 676 00:27:10,760 --> 00:27:12,150 And we have to run it again. 677 00:27:12,150 --> 00:27:14,240 But another reason that investigations 678 00:27:14,240 --> 00:27:17,060 can be uninformative is because they generate outcomes that 679 00:27:17,060 --> 00:27:20,120 are super hard to distinguish. 
680 00:27:20,120 --> 00:27:22,460 So if I have a handkerchief in one pocket, 681 00:27:22,460 --> 00:27:24,704 and a candy cane in the other, then a child 682 00:27:24,704 --> 00:27:26,120 who wants that candy cane is going 683 00:27:26,120 --> 00:27:28,670 to have no trouble patting you down and finding the candy 684 00:27:28,670 --> 00:27:29,240 cane. 685 00:27:29,240 --> 00:27:31,550 But if I have a pen in one pocket 686 00:27:31,550 --> 00:27:33,770 and a candy cane in the other, that's 687 00:27:33,770 --> 00:27:35,082 going to be a harder problem. 688 00:27:35,082 --> 00:27:37,040 And this might be more salient to you if I say, 689 00:27:37,040 --> 00:27:39,860 you're going to go in and have a lab test for a fatal disease, 690 00:27:39,860 --> 00:27:41,792 or potentially a benign disease. 691 00:27:41,792 --> 00:27:44,000 And you know what the results are going to look like? 692 00:27:44,000 --> 00:27:45,860 One is going to be reddish maroon, 693 00:27:45,860 --> 00:27:48,620 and the other is going to be maroonish red. 694 00:27:48,620 --> 00:27:50,330 OK, that's not the kind of test you want. 695 00:27:50,330 --> 00:27:54,470 You want yellow, blue, right? 696 00:27:54,470 --> 00:27:55,420 So this is important. 697 00:27:55,420 --> 00:27:58,040 We care about how much uncertainty there 698 00:27:58,040 --> 00:28:01,230 is over interpreting the outcome as well. 699 00:28:01,230 --> 00:28:06,230 So if children are sensitive to how useful actions are 700 00:28:06,230 --> 00:28:07,910 for information gain, then they should 701 00:28:07,910 --> 00:28:10,610 prefer interventions that generate distinctive patterns 702 00:28:10,610 --> 00:28:12,260 of evidence. 703 00:28:12,260 --> 00:28:17,510 So Max Siegel in my lab has been running some experiments 704 00:28:17,510 --> 00:28:18,455 like this. 
705 00:28:18,455 --> 00:28:20,390 He started with a very simple one, basically 706 00:28:20,390 --> 00:28:21,950 the equivalent of the handkerchief and the candy 707 00:28:21,950 --> 00:28:22,450 cane. 708 00:28:22,450 --> 00:28:24,680 He said, OK, there's either a bean bag in this box 709 00:28:24,680 --> 00:28:26,971 that I'm going to put in here, or a pencil in this box. 710 00:28:26,971 --> 00:28:29,150 It is a shiny, cool, hologram, sparkly pencil. 711 00:28:29,150 --> 00:28:30,632 You'll want it. 712 00:28:30,632 --> 00:28:32,340 That's going to go in this box over here. 713 00:28:32,340 --> 00:28:33,890 And in this box, either the really cool, 714 00:28:33,890 --> 00:28:35,970 shiny hologram pencil, or the really boring yellow pencil 715 00:28:35,970 --> 00:28:36,690 is going to go in this box. 716 00:28:36,690 --> 00:28:37,790 And guess what I'm going to do? 717 00:28:37,790 --> 00:28:39,890 I'm going to take each box, and I'm going to shake it. 718 00:28:39,890 --> 00:28:40,580 So he does that. 719 00:28:40,580 --> 00:28:42,070 And you know what you hear both times? 720 00:28:42,070 --> 00:28:43,270 Ka-thunk, ka-thunk, ka-thunk. 721 00:28:43,270 --> 00:28:44,478 Ka-thunk, ka-thunk, ka-thunk. 722 00:28:44,478 --> 00:28:46,100 Indistinguishable sounds. 723 00:28:46,100 --> 00:28:49,100 And now the question is, which box do you want to open? 724 00:28:49,100 --> 00:28:50,710 Which box you want to open? 725 00:28:50,710 --> 00:28:51,290 Right? 726 00:28:51,290 --> 00:28:52,850 And if you're sensitive to the ambiguity, 727 00:28:52,850 --> 00:28:54,710 you should say, well, listen, if it were a beanbag, 728 00:28:54,710 --> 00:28:57,160 I would really know, so that must be the sparkly pencil, 729 00:28:57,160 --> 00:28:57,410 right? 730 00:28:57,410 --> 00:28:58,820 But I'm never going to know in this box, 731 00:28:58,820 --> 00:29:00,736 because both pencils are going to sound alike, 732 00:29:00,736 --> 00:29:03,280 so I really better choose this box. 
733 00:29:03,280 --> 00:29:07,100 And then he's going to do a harder problem. 734 00:29:07,100 --> 00:29:11,884 He's going to say, there are eight shiny, colorful marbles-- 735 00:29:11,884 --> 00:29:12,800 you really want them-- 736 00:29:12,800 --> 00:29:15,400 in this box, or two really boring white ones. 737 00:29:15,400 --> 00:29:18,800 Or there are eight colorful, shiny marbles in this box, 738 00:29:18,800 --> 00:29:22,130 or six really boring white ones. 739 00:29:22,130 --> 00:29:24,980 In each case, they each get hidden, you're 740 00:29:24,980 --> 00:29:26,627 going to hear the box. 741 00:29:26,627 --> 00:29:29,210 In each case, it's going to make exactly the same sound, which 742 00:29:29,210 --> 00:29:31,220 is actually the sound of eight marbles in a box. 743 00:29:31,220 --> 00:29:33,420 And the question is, which box do you want to open? 744 00:29:33,420 --> 00:29:34,400 So let me show you how that works. 745 00:29:34,400 --> 00:29:35,296 [VIDEO PLAYBACK] 746 00:29:35,296 --> 00:29:37,490 - [INAUDIBLE] and I also have some marbles. 747 00:29:37,490 --> 00:29:40,622 Well, you see these marbles right here, the white ones? 748 00:29:40,622 --> 00:29:42,068 These are Bunny's. 749 00:29:42,068 --> 00:29:45,442 Oh, six of my marbles, yay. 750 00:29:45,442 --> 00:29:48,345 Oh, two of my marbles, yay. 751 00:29:48,345 --> 00:29:50,190 Guess what, Taylor? 752 00:29:50,190 --> 00:29:52,810 These marbles with lots of different colors, those 753 00:29:52,810 --> 00:29:54,840 are yours for right now. 754 00:29:54,840 --> 00:29:57,244 That's pretty cool, right? 755 00:29:57,244 --> 00:29:59,232 Those are your marbles. 756 00:29:59,232 --> 00:30:00,512 That's awesome. 757 00:30:00,512 --> 00:30:03,990 And in this game, I'm going to hide either 758 00:30:03,990 --> 00:30:07,990 your marbles or Bunny's marbles inside of this box. 
759 00:30:07,990 --> 00:30:09,850 And then I'm going to hide either 760 00:30:09,850 --> 00:30:13,570 your or Bunny's marbles inside of this box. 761 00:30:13,570 --> 00:30:14,769 Does that sound like fun? 762 00:30:14,769 --> 00:30:16,852 And then we're going to look for your marbles, OK? 763 00:30:16,852 --> 00:30:19,497 If you find them, you get another sticker. 764 00:30:19,497 --> 00:30:21,580 All right, so I'm going to put Bunny's right here, 765 00:30:21,580 --> 00:30:23,090 and we're going to do the hide game. 766 00:30:23,090 --> 00:30:25,665 So first, I'm going to choose either your marbles or Bunny's 767 00:30:25,665 --> 00:30:27,705 marbles and put them in here. 768 00:30:27,705 --> 00:30:28,890 I'm going to pour them in. 769 00:30:33,174 --> 00:30:36,020 Look, somebody's marbles are in here. 770 00:30:36,020 --> 00:30:38,190 Now I'm going to do the same thing with this box. 771 00:30:38,190 --> 00:30:39,959 Either your marbles or Bunny's marbles 772 00:30:39,959 --> 00:30:41,408 are going to go in this box. 773 00:30:45,255 --> 00:30:45,755 All right. 774 00:30:45,755 --> 00:30:48,750 Are you ready to begin shaking and listening? 775 00:30:48,750 --> 00:30:52,000 So remember, over here, there's either your marbles 776 00:30:52,000 --> 00:30:54,340 or Bunny's marbles. 777 00:30:54,340 --> 00:30:55,580 OK, let's listen. 778 00:30:55,580 --> 00:30:57,404 [CLUNKING NOISES] 779 00:31:00,101 --> 00:31:00,600 All right. 780 00:31:00,600 --> 00:31:03,572 And over here, there's either your marbles 781 00:31:03,572 --> 00:31:05,378 or Bunny's marbles. 782 00:31:05,378 --> 00:31:06,346 Let's listen. 783 00:31:06,346 --> 00:31:08,282 [CLUNKING NOISES] 784 00:31:10,686 --> 00:31:11,186 Cool. 785 00:31:11,186 --> 00:31:12,510 Let's do it one more time. 786 00:31:12,510 --> 00:31:12,780 [END PLAYBACK] 787 00:31:12,780 --> 00:31:14,400 LAURA SCHULZ: We'll skip the one more time, but-- 788 00:31:14,400 --> 00:31:15,261 oops. 789 00:31:15,261 --> 00:31:15,760 Sorry. 
790 00:31:23,450 --> 00:31:26,220 You can get the general principle. 791 00:31:26,220 --> 00:31:28,700 The children are overwhelmingly good at this kind of task, 792 00:31:28,700 --> 00:31:30,590 it turns out, in both of these cases 793 00:31:30,590 --> 00:31:33,110 and in many, many other iterations they went. 794 00:31:33,110 --> 00:31:35,347 They're very confident about which box 795 00:31:35,347 --> 00:31:37,430 they should pick, which means they're representing 796 00:31:37,430 --> 00:31:39,950 to themselves something about the ambiguity of the evidence, 797 00:31:39,950 --> 00:31:41,870 and their own ability to perceive 798 00:31:41,870 --> 00:31:44,760 these kinds of distinctions. 799 00:31:44,760 --> 00:31:46,570 So with Max, we've been talking about this 800 00:31:46,570 --> 00:31:48,590 as a kind of intuitive psychophysics, 801 00:31:48,590 --> 00:31:50,690 where they can represent their own discrimination 802 00:31:50,690 --> 00:31:54,770 threshold to make these kinds of distinctions. 803 00:31:54,770 --> 00:31:56,570 And they prefer interventions that 804 00:31:56,570 --> 00:31:58,640 generate distinctive patterns of evidence 805 00:31:58,640 --> 00:32:00,790 and maximize the possibility of information gain. 806 00:32:00,790 --> 00:32:02,440 Is that clear? 807 00:32:02,440 --> 00:32:03,920 OK. 808 00:32:03,920 --> 00:32:06,740 So there's a lot of ways in which children 809 00:32:06,740 --> 00:32:09,530 seem to be using intuitive theories, some kind 810 00:32:09,530 --> 00:32:11,800 of abstract, higher order of knowledge, 811 00:32:11,800 --> 00:32:19,130 to make inferences from data. 812 00:32:19,130 --> 00:32:23,600 But information is also costly to generate. 813 00:32:23,600 --> 00:32:25,890 And the costs themselves are informative. 814 00:32:25,890 --> 00:32:28,920 So I'm going to talk a little bit about that piece now. 815 00:32:28,920 --> 00:32:30,440 It's costly both for the learner, 816 00:32:30,440 --> 00:32:32,390 who cannot learn everything.
817 00:32:32,390 --> 00:32:35,582 And in a cultural context, where you're not just 818 00:32:35,582 --> 00:32:37,040 learning and exploring on your own, 819 00:32:37,040 --> 00:32:39,500 but you're actually getting information from other people, 820 00:32:39,500 --> 00:32:47,220 it's costly also for the teacher or for the informant. 821 00:32:47,220 --> 00:32:48,989 And these kinds of costs, and how 822 00:32:48,989 --> 00:32:50,780 we negotiate these kinds of costs, I think, 823 00:32:50,780 --> 00:32:53,750 are really fundamental to a lot of hard problems 824 00:32:53,750 --> 00:32:56,920 in communication and language. 825 00:32:56,920 --> 00:32:59,540 Lots of the field of pragmatics deals with problems 826 00:32:59,540 --> 00:33:00,869 of underdetermination. 827 00:33:00,869 --> 00:33:02,910 We say these sentences, we understand each other. 828 00:33:02,910 --> 00:33:05,690 We understand each other in all kinds of ambiguous contexts. 829 00:33:05,690 --> 00:33:08,000 We use a lot of social cues and other information 830 00:33:08,000 --> 00:33:09,140 to disambiguate. 831 00:33:09,140 --> 00:33:11,960 But part of what we do is, we make inferences 832 00:33:11,960 --> 00:33:13,790 about how much this person is going 833 00:33:13,790 --> 00:33:17,150 to communicate in this context, given 834 00:33:17,150 --> 00:33:19,820 how much I need to understand. 835 00:33:19,820 --> 00:33:22,505 And we use this to resolve these kinds of ambiguities. 836 00:33:22,505 --> 00:33:26,431 I'm going to give you a few examples of that here. 837 00:33:26,431 --> 00:33:28,430 Now, again, I'm going to start with the study we 838 00:33:28,430 --> 00:33:30,929 did a while ago then show you a little bit more recent work. 
839 00:33:38,170 --> 00:33:39,920 I'm going to skip over some of this just 840 00:33:39,920 --> 00:33:42,489 to be able to cover all of this and say, 841 00:33:42,489 --> 00:33:44,780 Because there's a cost of information for both teachers 842 00:33:44,780 --> 00:33:47,150 and learners, it predicts some trade-offs 843 00:33:47,150 --> 00:33:49,260 in the kinds of inferences you should make. 844 00:33:49,260 --> 00:33:51,650 So for instance, if a knowledgeable informant 845 00:33:51,650 --> 00:33:55,530 shows you a toy and says, here, this toy has a single function. 846 00:33:55,530 --> 00:33:59,240 Then if you, the learner, think that that teacher is trying 847 00:33:59,240 --> 00:34:01,295 to generate a true sample from the hypothesis 848 00:34:01,295 --> 00:34:02,690 based on what actually is going to get 849 00:34:02,690 --> 00:34:04,314 the right idea to your head, you should 850 00:34:04,314 --> 00:34:07,850 assume that there is only one function and not two, 851 00:34:07,850 --> 00:34:08,976 or three, or four, or five. 852 00:34:08,976 --> 00:34:11,600 Because if there were more, they should have shown them to you, 853 00:34:11,600 --> 00:34:12,109 right? 854 00:34:12,109 --> 00:34:13,564 Because if they know the true hypothesis, 855 00:34:13,564 --> 00:34:15,105 and they could just demonstrate that, 856 00:34:15,105 --> 00:34:18,110 they can rule out all of that for you. 857 00:34:21,090 --> 00:34:24,860 But if you just stumble upon a single function of a toy, 858 00:34:24,860 --> 00:34:27,860 or a not knowledgeable teacher generates it accidentally, 859 00:34:27,860 --> 00:34:30,739 or if the teacher, as Liz Spelke pointed out, 860 00:34:30,739 --> 00:34:33,800 is interrupted in the middle of that demonstration, 861 00:34:33,800 --> 00:34:36,560 then you shouldn't assume that that evidence is exhaustive. 862 00:34:36,560 --> 00:34:38,620 It suspends that inference, right? 
863 00:34:38,620 --> 00:34:40,010 Now, OK, well, you showed me one, 864 00:34:40,010 --> 00:34:41,935 but maybe there are two, three, or four. 865 00:34:41,935 --> 00:34:43,310 So it's only in a condition where 866 00:34:43,310 --> 00:34:45,920 I think you are a fully informed, freely acting 867 00:34:45,920 --> 00:34:48,560 teacher that I should assume, well, look, if you are helpful, 868 00:34:48,560 --> 00:34:50,850 knowledgeable teacher, then the information 869 00:34:50,850 --> 00:34:53,969 you give me should not only be true of the hypothesis, 870 00:34:53,969 --> 00:34:56,300 it should help me distinguish that hypothesis 871 00:34:56,300 --> 00:34:59,190 from available alternatives. 872 00:34:59,190 --> 00:35:01,670 And what that means is, there's a specific trade-off 873 00:35:01,670 --> 00:35:03,860 between instruction and exploration. 874 00:35:03,860 --> 00:35:05,540 Because if I'm instructed that there's 875 00:35:05,540 --> 00:35:08,462 one function of the toy, I don't need to explore any further. 876 00:35:08,462 --> 00:35:10,670 But if I just happen to find one function of the toy, 877 00:35:10,670 --> 00:35:12,500 maybe I do. 878 00:35:12,500 --> 00:35:14,930 So let me show you what we did to test this. 879 00:35:14,930 --> 00:35:15,890 We had a novel toy. 880 00:35:15,890 --> 00:35:18,620 It actually had four interesting properties, a squeaker, 881 00:35:18,620 --> 00:35:20,930 a light, a mirror, and music. 882 00:35:20,930 --> 00:35:23,120 And we demonstrated a single function 883 00:35:23,120 --> 00:35:27,686 of the toy, the squeaker, in three conditions. 884 00:35:27,686 --> 00:35:28,810 And we also had a baseline. 885 00:35:28,810 --> 00:35:31,320 In the pedagogical condition, we said, watch this, 886 00:35:31,320 --> 00:35:33,290 I'm going to show you-- 887 00:35:33,290 --> 00:35:35,510 sorry, the alignment's off-- 888 00:35:35,510 --> 00:35:37,580 but watch this, I'm going to show you my toy. 
889 00:35:37,580 --> 00:35:40,250 They pulled the toy and then said, wow, see that. 890 00:35:40,250 --> 00:35:42,080 OK, the accidental condition was, 891 00:35:42,080 --> 00:35:44,030 look at this neat toy I found here, 892 00:35:44,030 --> 00:35:47,446 accidentally pulled this tube in the same way, wow see that. 893 00:35:47,446 --> 00:35:49,820 The baseline was, just look at this neat toy I have here, 894 00:35:49,820 --> 00:35:51,170 with no demonstration. 895 00:35:51,170 --> 00:35:52,970 And the interrupted condition was identical 896 00:35:52,970 --> 00:35:55,460 to the pedagogical condition, except the teacher was 897 00:35:55,460 --> 00:35:58,160 interrupted immediately after pulling the tube, 898 00:35:58,160 --> 00:36:00,122 and then she said, wow, see that. 899 00:36:00,122 --> 00:36:00,830 So is that clear? 900 00:36:00,830 --> 00:36:04,490 I'm sorry for the slide misalignment. 901 00:36:04,490 --> 00:36:08,030 And the prediction is that in the first condition, 902 00:36:08,030 --> 00:36:10,520 children should constrain their exploration relative to all 903 00:36:10,520 --> 00:36:11,464 the other conditions. 904 00:36:11,464 --> 00:36:13,130 So let me show you what that looks like. 905 00:36:23,210 --> 00:36:23,710 Or not. 906 00:36:23,710 --> 00:36:26,209 Which is too bad, because this is a really super cute slide. 907 00:36:26,209 --> 00:36:27,650 But it's not going to work. 908 00:36:27,650 --> 00:36:29,870 In this condition, what we found was this. 909 00:36:29,870 --> 00:36:32,180 We found a child in the children's museum with a toy 910 00:36:32,180 --> 00:36:33,888 with all of these kind of wow properties. 911 00:36:33,888 --> 00:36:35,049 We say, wow, see this. 912 00:36:35,049 --> 00:36:36,590 We show them the property of the toy, 913 00:36:36,590 --> 00:36:41,210 and the child spends 90 seconds pulling only the squeaker. 914 00:36:41,210 --> 00:36:43,481 He then says, I'm very smart for a five-year-old.
915 00:36:43,481 --> 00:36:45,980 And when she asks for all of the other functions of the toy, 916 00:36:45,980 --> 00:36:47,771 he doesn't know any of the other functions, 917 00:36:47,771 --> 00:36:48,922 because he hasn't explored. 918 00:36:48,922 --> 00:36:51,380 And what we think is, he is very smart for a five-year-old. 919 00:36:51,380 --> 00:36:53,180 Because it's a completely rational inference that, 920 00:36:53,180 --> 00:36:54,980 if there were more functions of the toy, 921 00:36:54,980 --> 00:36:57,650 then they should have been demonstrated. 922 00:36:57,650 --> 00:36:59,960 And so what we find overall is, in fact, 923 00:36:59,960 --> 00:37:02,330 that children do fewer actions, and they discover 924 00:37:02,330 --> 00:37:04,490 fewer functions of this toy. 925 00:37:04,490 --> 00:37:07,430 This isn't just true, it turns out, we now know, 926 00:37:07,430 --> 00:37:09,890 because we live in a hyperpedagogical culture. 927 00:37:09,890 --> 00:37:13,010 Because Laura Shneidman and Amanda Woodward 928 00:37:13,010 --> 00:37:15,490 just replicated this study with Yucatec Mayan 929 00:37:15,490 --> 00:37:19,394 toddlers and found the same kind of effect, constraints 930 00:37:19,394 --> 00:37:20,810 in the pedagogical condition, even 931 00:37:20,810 --> 00:37:22,268 though it's a culture that's pretty 932 00:37:22,268 --> 00:37:26,460 limited in their pedagogy. 933 00:37:26,460 --> 00:37:30,810 So information is costly, and pedagogical contexts 934 00:37:30,810 --> 00:37:33,200 strengthen the inference that the absence of evidence-- 935 00:37:33,200 --> 00:37:36,050 a teacher's failure to go on and teach you more information-- 936 00:37:36,050 --> 00:37:39,308 is, in fact, evidence of its absence. 937 00:37:39,308 --> 00:37:41,300 Is that clear? 
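[Editor's sketch] The inference described above, that a helpful, knowledgeable teacher who stops after one function is signalling there are no more, can be written as a small Bayesian update. This is only an illustrative sketch, not the model used in the study: the uniform prior over one to four functions, the `stop_early` probability, and the flat likelihood for accidental discovery are all assumptions made up for the example.

```python
from fractions import Fraction

# Hypotheses: the toy has k functions in total (assumed range 1..4).
HYPOTHESES = (1, 2, 3, 4)


def posterior(prior, likelihood):
    """Apply Bayes' rule over a finite set of hypotheses."""
    unnormalized = {h: prior[h] * likelihood(h) for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


def pedagogical_likelihood(k, stop_early=Fraction(1, 10)):
    """P(teacher demonstrates exactly one function | toy has k functions).

    Assumption: a helpful, knowledgeable teacher shows everything, so she
    stops at one function with certainty if k == 1 and only rarely otherwise.
    """
    return Fraction(1) if k == 1 else stop_early


def accidental_likelihood(k):
    """Assumption: stumbling on one function is equally likely for any k,
    so the observation says nothing about how many functions remain."""
    return Fraction(1)


uniform_prior = {k: Fraction(1, len(HYPOTHESES)) for k in HYPOTHESES}
pedagogical = posterior(uniform_prior, pedagogical_likelihood)
accidental = posterior(uniform_prior, accidental_likelihood)

print(pedagogical[1])  # 10/13: "only one function" becomes the best bet
print(accidental[1])   # 1/4: the prior is unchanged, so keep exploring
```

Under these assumed numbers, the same single demonstration moves the learner strongly toward "one function" only in the pedagogical case, which is the pattern the exploration data are taken to reflect.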
938 00:37:41,300 --> 00:37:44,030 And this is a very sensible inductive bias, 939 00:37:44,030 --> 00:37:47,355 but it predicts that instruction will, for better or worse, 940 00:37:47,355 --> 00:37:48,230 constrain exploration. 941 00:37:48,230 --> 00:37:50,220 Because that's what it's supposed to do. 942 00:37:50,220 --> 00:37:52,850 It's supposed to constrain the hypotheses you consider. 943 00:37:52,850 --> 00:37:54,399 And indeed, it works quite well. 944 00:37:54,399 --> 00:37:56,690 And that's good if you're right about the world, right? 945 00:37:56,690 --> 00:37:58,956 And it constrains it to efficient learning. 946 00:37:58,956 --> 00:38:00,830 But it's bad if you're wrong about the world. 947 00:38:00,830 --> 00:38:03,350 Because the unknown unknowns, the things you don't know 948 00:38:03,350 --> 00:38:05,020 are true that you failed to teach, 949 00:38:05,020 --> 00:38:09,440 are going to potentially mislead a learner. 950 00:38:09,440 --> 00:38:10,820 How much is enough information? 951 00:38:10,820 --> 00:38:12,320 Well, there are lots of good reasons 952 00:38:12,320 --> 00:38:14,794 why teachers ought to provide very limited information. 953 00:38:14,794 --> 00:38:17,210 First of all, as I showed you in the first set of studies, 954 00:38:17,210 --> 00:38:19,090 evidence often supports generalization. 955 00:38:19,090 --> 00:38:19,590 Right? 956 00:38:19,590 --> 00:38:21,173 One dog toy squeaks, probably they all 957 00:38:21,173 --> 00:38:25,356 squeak, barring how generalizable that sample is. 958 00:38:25,356 --> 00:38:27,230 So I don't need to show you every single toy. 959 00:38:27,230 --> 00:38:28,760 I don't need to show a child, this 960 00:38:28,760 --> 00:38:30,170 is a cup, and that's a cup, and this is a cup too, 961 00:38:30,170 --> 00:38:31,970 and that's a cup, and that's a cup.
962 00:38:31,970 --> 00:38:33,650 Once the child has a cup, I can assume 963 00:38:33,650 --> 00:38:35,870 that that child herself will be able to make 964 00:38:35,870 --> 00:38:37,282 the rational generalization. 965 00:38:37,282 --> 00:38:39,740 Or sometimes I know you're not going to be able to make it, 966 00:38:39,740 --> 00:38:41,630 but the additional information is just too costly. 967 00:38:41,630 --> 00:38:43,338 I'm working on teaching you two plus two, 968 00:38:43,338 --> 00:38:46,280 I'm not going to teach you linear algebra right now. 969 00:38:46,280 --> 00:38:47,667 It's a waste of our time. 970 00:38:47,667 --> 00:38:49,250 So that's another reason why you might 971 00:38:49,250 --> 00:38:51,270 provide limited information. 972 00:38:51,270 --> 00:38:57,890 So what are the contexts in which omitting information is 973 00:38:57,890 --> 00:39:01,010 a reasonable thing to do, and when is it misleading? 974 00:39:01,010 --> 00:39:04,020 When is this a real problem? 975 00:39:04,020 --> 00:39:06,680 And the answer turns out to be, if I'm the informant, 976 00:39:06,680 --> 00:39:09,080 and I know I'm providing information 977 00:39:09,080 --> 00:39:12,184 that is going to lead you to the wrong hypothesis, 978 00:39:12,184 --> 00:39:13,850 then we consider that a sin of omission. 979 00:39:13,850 --> 00:39:15,020 Right? 980 00:39:15,020 --> 00:39:18,390 If I'm omitting information, and I'm not doing that, 981 00:39:18,390 --> 00:39:20,014 then maybe that's not a problem. 982 00:39:20,014 --> 00:39:21,680 So one of the questions is, can children 983 00:39:21,680 --> 00:39:23,210 distinguish these contexts? 984 00:39:23,210 --> 00:39:26,810 Can they tell when the teacher is providing 985 00:39:26,810 --> 00:39:30,560 too little information, and it is going to cost the learner 986 00:39:30,560 --> 00:39:34,970 something in terms of what they can gain, and when they're not. 
987 00:39:34,970 --> 00:39:36,470 So to test this-- 988 00:39:36,470 --> 00:39:38,780 this is again, Hyowon Gweon's work-- 989 00:39:38,780 --> 00:39:40,310 we introduced a toy. 990 00:39:40,310 --> 00:39:42,462 And it had one function, this wind up mechanism. 991 00:39:42,462 --> 00:39:44,420 And the kids got to explore, and they found out 992 00:39:44,420 --> 00:39:46,000 the toy did one thing. 993 00:39:46,000 --> 00:39:48,330 In the other condition, the toy looked the same. 994 00:39:48,330 --> 00:39:50,420 But in fact, the toy had lots of functions, 995 00:39:50,420 --> 00:39:52,460 and the children knew that. 996 00:39:52,460 --> 00:39:54,420 So the children always knew the ground truth. 997 00:39:54,420 --> 00:39:57,290 The toy either had one function or four. 998 00:39:57,290 --> 00:39:59,910 And then there was a teacher who taught Elmo. 999 00:39:59,910 --> 00:40:01,520 The teacher always did the same thing. 1000 00:40:01,520 --> 00:40:03,970 The teacher always taught just one function. 1001 00:40:03,970 --> 00:40:06,830 And the first question was, the teacher's always 1002 00:40:06,830 --> 00:40:09,370 doing the same thing with an identical looking toy. 1003 00:40:09,370 --> 00:40:11,229 Do the kids penalize the teacher? 1004 00:40:11,229 --> 00:40:13,020 Do they think he's a bad teacher if he only 1005 00:40:13,020 --> 00:40:15,080 teaches one function when there are really four, 1006 00:40:15,080 --> 00:40:17,960 compared to when he teaches one function and there's only one. 1007 00:40:17,960 --> 00:40:20,840 So the first thing we did was, ask kids to rate that teacher. 1008 00:40:20,840 --> 00:40:22,640 And indeed, they think that this teacher 1009 00:40:22,640 --> 00:40:24,520 is a terrible teacher when there's four functions 1010 00:40:24,520 --> 00:40:25,520 and he only teaches one. 1011 00:40:25,520 --> 00:40:29,890 They think he's a good teacher when he teaches one of one. 
1012 00:40:29,890 --> 00:40:33,380 But the really interesting question was, what would 1013 00:40:33,380 --> 00:40:35,300 the children do to compensate if they 1014 00:40:35,300 --> 00:40:36,800 knew they had a bad teacher? 1015 00:40:36,800 --> 00:40:38,960 So we ran exactly the same set up 1016 00:40:38,960 --> 00:40:41,750 where the teacher shows Elmo the toy in the one function case 1017 00:40:41,750 --> 00:40:44,000 and the four function case. 1018 00:40:44,000 --> 00:40:46,209 And for reasons that will become clear, 1019 00:40:46,209 --> 00:40:47,750 we also ran a control condition where 1020 00:40:47,750 --> 00:40:49,670 there were four active functions, 1021 00:40:49,670 --> 00:40:53,030 and the teacher taught all four. 1022 00:40:53,030 --> 00:40:54,890 In all cases, the teacher then goes on 1023 00:40:54,890 --> 00:40:56,780 and runs that experiment I just showed you. 1024 00:40:56,780 --> 00:40:58,490 The teacher shows just the squeaker toy 1025 00:40:58,490 --> 00:41:00,832 here of this single function. 1026 00:41:00,832 --> 00:41:02,540 The question is, what should the kids do? 1027 00:41:02,540 --> 00:41:04,456 So it's a complicated set up, so I'll walk you 1028 00:41:04,456 --> 00:41:06,630 through it a little bit. 1029 00:41:06,630 --> 00:41:09,199 When you teach one of one function, 1030 00:41:09,199 --> 00:41:11,240 you should infer the toy probably does one thing, 1031 00:41:11,240 --> 00:41:12,470 it's a good teacher. 1032 00:41:12,470 --> 00:41:13,910 And so when they show you one function of this, 1033 00:41:13,910 --> 00:41:15,570 you should constrain your exploration, 1034 00:41:15,570 --> 00:41:17,987 say that was a sensible inference. 1035 00:41:17,987 --> 00:41:19,820 When they teach one of four, you should say, 1036 00:41:19,820 --> 00:41:20,720 that's a bad teacher. 1037 00:41:20,720 --> 00:41:21,920 The toy probably does more than one thing, 1038 00:41:21,920 --> 00:41:23,170 the new toy probably does too. 
1039 00:41:23,170 --> 00:41:24,954 I'm going to explore more broadly. 1040 00:41:24,954 --> 00:41:27,120 But we don't know, in that case, if they're doing it 1041 00:41:27,120 --> 00:41:29,607 because they just saw a toy with one function, 1042 00:41:29,607 --> 00:41:31,440 and so they think this toy has one function. 1043 00:41:31,440 --> 00:41:32,820 And they just saw a toy with four functions, 1044 00:41:32,820 --> 00:41:34,570 so they think this toy has four functions. 1045 00:41:34,570 --> 00:41:36,930 So we can disambiguate those with this condition. 1046 00:41:36,930 --> 00:41:38,970 Now if they're just generalizing from the toy, 1047 00:41:38,970 --> 00:41:40,230 this toy has four functions, well, 1048 00:41:40,230 --> 00:41:41,970 then they should think this toy has four functions. 1049 00:41:41,970 --> 00:41:44,100 But if they're generalizing from the teacher, 1050 00:41:44,100 --> 00:41:45,234 this is a good teacher. 1051 00:41:45,234 --> 00:41:47,400 So when the teacher now tells you about the new toy, 1052 00:41:47,400 --> 00:41:50,220 that it has one function, the kids 1053 00:41:50,220 --> 00:41:51,700 should constrain their exploration. 1054 00:41:51,700 --> 00:41:55,260 So does everyone understand the logic of the design here? 1055 00:41:55,260 --> 00:41:59,430 And in fact, that's exactly what we find. 1056 00:41:59,430 --> 00:42:02,346 The children compensated with additional exploration 1057 00:42:02,346 --> 00:42:03,720 when they thought the teacher had 1058 00:42:03,720 --> 00:42:07,227 provided insufficient information to the learner. 1059 00:42:07,227 --> 00:42:07,810 Is that clear? 1060 00:42:10,960 --> 00:42:11,520 OK. 1061 00:42:11,520 --> 00:42:14,820 So information is costly to teachers and learners. 1062 00:42:14,820 --> 00:42:16,440 If teachers minimize their own costs 1063 00:42:16,440 --> 00:42:18,330 and provide too little information, 1064 00:42:18,330 --> 00:42:20,200 children think they're poor teachers. 
1065 00:42:20,200 --> 00:42:22,200 They suspend the inference that that information 1066 00:42:22,200 --> 00:42:23,160 is representative. 1067 00:42:23,160 --> 00:42:27,900 They compensate with additional exploration. 1068 00:42:27,900 --> 00:42:32,580 I'm going to go ahead and show you just a couple 1069 00:42:32,580 --> 00:42:33,600 more examples here. 1070 00:42:37,300 --> 00:42:39,160 There's too little information. 1071 00:42:39,160 --> 00:42:40,940 But because information is costly, 1072 00:42:40,940 --> 00:42:42,820 you can also provide too much information. 1073 00:42:42,820 --> 00:42:44,890 I might be doing that right now. 1074 00:42:44,890 --> 00:42:46,810 Too much information is costly. 1075 00:42:46,810 --> 00:42:49,600 It's taking a toll on the learner to absorb all of that. 1076 00:42:49,600 --> 00:42:51,058 And you have to know, well, are you 1077 00:42:51,058 --> 00:42:53,420 providing me too much information, or just the right 1078 00:42:53,420 --> 00:42:53,920 amount? 1079 00:42:53,920 --> 00:42:55,961 And at the risk of falling into this trap myself, 1080 00:42:55,961 --> 00:42:58,600 I am going to show you quickly this study. 1081 00:42:58,600 --> 00:43:00,620 Because how much information is too much 1082 00:43:00,620 --> 00:43:02,620 information depends on a hard question which is, 1083 00:43:02,620 --> 00:43:03,940 how much do you already know? 1084 00:43:03,940 --> 00:43:04,587 Right? 1085 00:43:04,587 --> 00:43:05,920 You've all been here all summer. 1086 00:43:05,920 --> 00:43:07,090 You know a lot of things. 1087 00:43:07,090 --> 00:43:09,160 I'm not a very good estimate of what you know 1088 00:43:09,160 --> 00:43:11,240 or how much this information is going, 1089 00:43:11,240 --> 00:43:13,040 so it's a little hard for me to titrate 1090 00:43:13,040 --> 00:43:16,750 what's the right amount of information to give you.
1091 00:43:16,750 --> 00:43:18,340 And the question is, can children 1092 00:43:18,340 --> 00:43:20,200 take these kinds of theory of mind problems 1093 00:43:20,200 --> 00:43:23,110 into account in order to estimate what information they 1094 00:43:23,110 --> 00:43:24,670 should be getting? 1095 00:43:24,670 --> 00:43:27,580 To test this, we give kids a 20-button toy. 1096 00:43:27,580 --> 00:43:30,850 If I push a single button and it makes music, how many of you 1097 00:43:30,850 --> 00:43:33,839 think that all the other buttons make music? 1098 00:43:33,839 --> 00:43:35,380 Because you can generalize from data, 1099 00:43:35,380 --> 00:43:36,970 and that is a really good inductive inference there. 1100 00:43:36,970 --> 00:43:38,261 They look the same, it's a toy. 1101 00:43:38,261 --> 00:43:40,090 One makes music, they probably all do. 1102 00:43:40,090 --> 00:43:41,880 But suppose I go on now to show you-- 1103 00:43:41,880 --> 00:43:44,400 so that's your prior expectation, they all work. 1104 00:43:44,400 --> 00:43:46,750 But that one doesn't work, and that one doesn't work, 1105 00:43:46,750 --> 00:43:48,958 and that one doesn't work, and that one doesn't work, 1106 00:43:48,958 --> 00:43:51,562 and that one doesn't work, and that-- oh, that one works. 1107 00:43:51,562 --> 00:43:53,770 And that one doesn't work, and that one doesn't work, 1108 00:43:53,770 --> 00:43:55,450 and that one doesn't work, and that one doesn't work. 1109 00:43:55,450 --> 00:43:56,440 I'm doing this for a reason. 1110 00:43:56,440 --> 00:43:57,648 I know this is really boring. 1111 00:43:57,648 --> 00:44:00,790 It's partly to give you a break, but it is also-- oh, 1112 00:44:00,790 --> 00:44:03,200 that one works. 1113 00:44:03,200 --> 00:44:06,370 OK, so now what you have learned about this toy 1114 00:44:06,370 --> 00:44:10,350 is that actually, only three of these buttons work. 1115 00:44:10,350 --> 00:44:13,019 And suppose I show you this across a couple of toys. 
1116 00:44:13,019 --> 00:44:14,560 You've just changed your expectation. 1117 00:44:14,560 --> 00:44:16,854 Now if I show you that one button works, 1118 00:44:16,854 --> 00:44:18,270 you don't think all the rest work. 1119 00:44:18,270 --> 00:44:20,830 You think probably two others work, right? 1120 00:44:20,830 --> 00:44:25,410 So if I bring out a brand new toy, and I push this button, 1121 00:44:25,410 --> 00:44:28,300 you'll probably be relieved if I just go ahead 1122 00:44:28,300 --> 00:44:30,934 and push these three, right? 1123 00:44:30,934 --> 00:44:32,350 And I don't go around and show you 1124 00:44:32,350 --> 00:44:34,660 all of the inert buttons on the toy. 1125 00:44:34,660 --> 00:44:36,400 Because information is costly. 1126 00:44:36,400 --> 00:44:39,820 You have to sit there through all those demonstrations. 1127 00:44:39,820 --> 00:44:44,110 In this experiment, I'll show you Gweon's work. 1128 00:44:44,110 --> 00:44:46,210 We gave kids a common ground condition 1129 00:44:46,210 --> 00:44:49,420 where everybody shared prior knowledge, this abstract theory 1130 00:44:49,420 --> 00:44:52,810 you can use to constrain your interpretation of this data. 1131 00:44:52,810 --> 00:44:56,519 In this condition, there were two toy makers, 1132 00:44:56,519 --> 00:44:58,810 who are the informants, and there's two naive learners, 1133 00:44:58,810 --> 00:45:00,520 Ernie and Bert. 1134 00:45:00,520 --> 00:45:02,410 And in the common ground condition, 1135 00:45:02,410 --> 00:45:04,450 Ernie, and Bert, and the toy makers 1136 00:45:04,450 --> 00:45:07,330 are all there while the child explores and finds out 1137 00:45:07,330 --> 00:45:09,870 that only three buttons work on these toys. 1138 00:45:09,870 --> 00:45:10,750 OK. 1139 00:45:10,750 --> 00:45:14,980 And then one teacher shows, on a brand new toy 1140 00:45:14,980 --> 00:45:17,550 just like this, every single button, the inert ones 1141 00:45:17,550 --> 00:45:18,620 and the non-inert ones.
1142 00:45:18,620 --> 00:45:22,010 And the other teacher shows just the three working buttons. 1143 00:45:22,010 --> 00:45:22,566 Right? 1144 00:45:22,566 --> 00:45:24,190 And then we say, hey, kids, guess what? 1145 00:45:24,190 --> 00:45:25,810 We have a whole closet full of more of those toys. 1146 00:45:25,810 --> 00:45:27,601 One of these teachers can show them to you. 1147 00:45:27,601 --> 00:45:29,440 Which one do you want? 1148 00:45:29,440 --> 00:45:31,480 Which one do you want, OK? 1149 00:45:31,480 --> 00:45:35,080 The other condition is almost the same, but guess what? 1150 00:45:35,080 --> 00:45:39,100 The child explores on their own, right? 1151 00:45:39,100 --> 00:45:42,910 And so there's no common ground about what these toys do. 1152 00:45:42,910 --> 00:45:44,920 And then the teachers do the same thing. 1153 00:45:44,920 --> 00:45:47,680 One teacher shows exhaustive information, 1154 00:45:47,680 --> 00:45:51,970 and the other teacher pushes only the three working buttons. 1155 00:45:51,970 --> 00:45:55,960 So in this case, that efficient information, that less costly 1156 00:45:55,960 --> 00:45:58,330 simple demonstration could mislead the learners 1157 00:45:58,330 --> 00:45:59,890 about what the true hypothesis is. 1158 00:45:59,890 --> 00:46:02,940 They have a prior that all these buttons ought to work. 1159 00:46:02,940 --> 00:46:05,680 Which toy maker would you rather learn 1160 00:46:05,680 --> 00:46:08,800 from depends on whether the learners share 1161 00:46:08,800 --> 00:46:13,120 that prior background information or not. 1162 00:46:13,120 --> 00:46:16,690 And that was true not only when the children were 1163 00:46:16,690 --> 00:46:20,950 judging the informants, but when they were teaching themselves. 1164 00:46:20,950 --> 00:46:23,650 So again, these are four and five-year-olds. 
1165 00:46:23,650 --> 00:46:27,320 The children had a condition where 1166 00:46:27,320 --> 00:46:31,350 Elmo got to see that only three buttons worked on the toy, 1167 00:46:31,350 --> 00:46:32,920 and the condition where Elmo didn't 1168 00:46:32,920 --> 00:46:36,220 get to see how many of those buttons worked on the toy. 1169 00:46:36,220 --> 00:46:40,210 And then the children got to teach Elmo the toy. 1170 00:46:40,210 --> 00:46:43,840 And the children were much more likely to press more buttons 1171 00:46:43,840 --> 00:46:46,269 and provide exhaustive evidence in the no common ground 1172 00:46:46,269 --> 00:46:48,060 condition than the common ground condition. 1173 00:46:48,060 --> 00:46:49,780 So children themselves are adjusting 1174 00:46:49,780 --> 00:46:52,339 the cost of the information they provide 1175 00:46:52,339 --> 00:46:54,130 based on their prior expectation about what 1176 00:46:54,130 --> 00:46:56,710 they think the learner is going to learn from the data. 1177 00:46:56,710 --> 00:46:59,680 What I want to do is, lastly, tell you a little bit about how 1178 00:46:59,680 --> 00:47:05,500 the costs of information are informative 1179 00:47:05,500 --> 00:47:08,184 not just in figuring out what data to learn from 1180 00:47:08,184 --> 00:47:09,975 and how you should communicate information, 1181 00:47:09,975 --> 00:47:12,310 but in actually figuring out what people are doing, 1182 00:47:12,310 --> 00:47:14,650 and actually grounding out ordinary, everyday theory 1183 00:47:14,650 --> 00:47:16,067 of mind. 1184 00:47:16,067 --> 00:47:18,400 I'm going to start with an example you're familiar with. 1185 00:47:18,400 --> 00:47:19,850 I know you've seen this before. 1186 00:47:19,850 --> 00:47:23,710 This is an experiment by Gergely and Csibra, a rational action. 1187 00:47:23,710 --> 00:47:27,010 This little ball jumps over the wall to get to the momma ball-- 1188 00:47:27,010 --> 00:47:28,390 you've all seen this?
1189 00:47:28,390 --> 00:47:30,190 I think Josh was presenting it maybe? 1190 00:47:30,190 --> 00:47:30,690 OK. 1191 00:47:30,690 --> 00:47:32,314 And when you took the wall away, babies 1192 00:47:32,314 --> 00:47:34,150 expect that ball to take an efficient route. 1193 00:47:34,150 --> 00:47:37,210 So we think that rational agents should take the most efficient 1194 00:47:37,210 --> 00:47:43,420 route to the goal, they should maximize their overall utility. 1195 00:47:43,420 --> 00:47:46,720 But there's a lot of reasons why that ball might have jumped 1196 00:47:46,720 --> 00:47:49,690 over the wall, and they all have to do with costs 1197 00:47:49,690 --> 00:47:50,590 and rewards of action. 1198 00:47:50,590 --> 00:47:53,290 One might be, it was really hard to get over the wall, 1199 00:47:53,290 --> 00:47:55,954 but really rewarding to do so. 1200 00:47:55,954 --> 00:47:57,370 Another reason, though, is that it 1201 00:47:57,370 --> 00:47:59,828 was really easy to get over the wall, so she might as well, 1202 00:47:59,828 --> 00:48:01,960 but she didn't care that much about the reward. 1203 00:48:01,960 --> 00:48:04,300 These could have the same net utility, 1204 00:48:04,300 --> 00:48:06,806 but psychologically, you really care about the difference. 1205 00:48:06,806 --> 00:48:08,680 If we're talking about the internal structure 1206 00:48:08,680 --> 00:48:10,600 of an agent's motivations, you want 1207 00:48:10,600 --> 00:48:13,310 to decompose this simple argument 1208 00:48:13,310 --> 00:48:15,460 about rational, goal-directed action 1209 00:48:15,460 --> 00:48:18,670 into what the particular costs and rewards are. 1210 00:48:18,670 --> 00:48:21,310 So Julian Jara-Ettinger in our lab 1211 00:48:21,310 --> 00:48:23,200 has developed this account he calls 1212 00:48:23,200 --> 00:48:26,650 the naive utility calculus, which is our way of reasoning 1213 00:48:26,650 --> 00:48:28,030 about other people's actions.
1214 00:48:28,030 --> 00:48:31,500 Which is, we assume other agents are acting to maximize utility, 1215 00:48:31,500 --> 00:48:34,180 but we care about the internal structure also. 1216 00:48:34,180 --> 00:48:36,760 There are agent-invariant rewards and costs. 1217 00:48:36,760 --> 00:48:38,920 Two cookies is always more than one cookie. 1218 00:48:38,920 --> 00:48:42,175 Higher hills are always higher than lower hills for all of us. 1219 00:48:42,175 --> 00:48:44,505 But some of us are more motivated to get cookies 1220 00:48:44,505 --> 00:48:45,880 than others of us, and some of us 1221 00:48:45,880 --> 00:48:48,910 find hills more costly than others of us. 1222 00:48:48,910 --> 00:48:50,980 So in addition to these agent-invariant 1223 00:48:50,980 --> 00:48:52,900 aspects of costs and rewards, there's 1224 00:48:52,900 --> 00:48:55,420 also these internal subjective things that 1225 00:48:55,420 --> 00:48:56,680 are harder to judge, right? 1226 00:48:56,680 --> 00:49:00,040 How competent you are, what your values are, 1227 00:49:00,040 --> 00:49:01,600 and your preferences. 1228 00:49:01,600 --> 00:49:05,220 And so understanding how all of these worked together 1229 00:49:05,220 --> 00:49:08,170 lets you take a very, very simple analysis 1230 00:49:08,170 --> 00:49:11,420 and make surprisingly powerful inferences about what 1231 00:49:11,420 --> 00:49:12,791 other agents are doing. 1232 00:49:12,791 --> 00:49:14,290 I'm going to show you a few examples 1233 00:49:14,290 --> 00:49:17,380 and actually connect it back to how children 1234 00:49:17,380 --> 00:49:19,150 are scientists in this regard. 1235 00:49:19,150 --> 00:49:21,340 So here is an example experiment. 1236 00:49:21,340 --> 00:49:25,810 Here is Grover, and there is a cracker and a cookie 1237 00:49:25,810 --> 00:49:27,300 down on this low shelf. 1238 00:49:27,300 --> 00:49:31,030 And Grover goes ahead, and he chooses the cookie.
1239 00:49:31,030 --> 00:49:33,580 And now there's a cracker and a cookie and this shelf, 1240 00:49:33,580 --> 00:49:36,850 and Grover goes ahead and chooses the cracker. 1241 00:49:36,850 --> 00:49:40,360 And the question is, what does Grover like better? 1242 00:49:40,360 --> 00:49:42,526 Well, if your readout of preferences 1243 00:49:42,526 --> 00:49:43,650 is just the actions you take, 1244 00:49:43,650 --> 00:49:44,990 your goal-directed action, 1245 00:49:44,990 --> 00:49:46,130 you should be at chance-- 1246 00:49:46,130 --> 00:49:48,350 you chose a cracker once and a cookie once. 1247 00:49:48,350 --> 00:49:50,090 But if even young children understand, 1248 00:49:50,090 --> 00:49:52,120 no, it's not that simple, right? 1249 00:49:52,120 --> 00:49:55,180 You're not just acting to maximize reward, 1250 00:49:55,180 --> 00:49:56,710 you're acting to maximize utility. 1251 00:49:56,710 --> 00:49:58,480 You have to take the costs into account. 1252 00:49:58,480 --> 00:50:00,670 Then clearly his preference is what 1253 00:50:00,670 --> 00:50:03,520 he chose when the costs were matched, right? 1254 00:50:03,520 --> 00:50:05,200 Not when the costs were mismatched. 1255 00:50:05,200 --> 00:50:07,075 And the children should say, no, no, no, it's 1256 00:50:07,075 --> 00:50:08,890 what he chose on the low box. 1257 00:50:08,890 --> 00:50:10,960 Which treat does he like best? 1258 00:50:10,960 --> 00:50:12,530 Is that clear? 1259 00:50:12,530 --> 00:50:16,540 And indeed, that is what children do. 1260 00:50:16,540 --> 00:50:20,050 You can also introduce a couple of characters. 1261 00:50:20,050 --> 00:50:22,390 Cookie Monster, who has a strong preference for cookies, 1262 00:50:22,390 --> 00:50:25,170 and Grover, who is indifferent, who likes them both. 1263 00:50:25,170 --> 00:50:27,850 So for Cookie Monster, the reward value of cookies 1264 00:50:27,850 --> 00:50:28,890 is much higher. 1265 00:50:28,890 --> 00:50:30,980 For Grover, they're equivalent.
1266 00:50:30,980 --> 00:50:32,440 And now you can set up a situation 1267 00:50:32,440 --> 00:50:36,340 where there's crackers in a low box and cookies at a high box. 1268 00:50:36,340 --> 00:50:38,680 And you say, guys, go on, you can make a choice. 1269 00:50:38,680 --> 00:50:40,510 And Grover chooses the cracker and Cookie 1270 00:50:40,510 --> 00:50:43,140 Monster chooses the cracker. 1271 00:50:43,140 --> 00:50:47,120 And you say to the kids, which puppet can't climb? 1272 00:50:47,120 --> 00:50:49,000 Now, no puppet has climbed. 1273 00:50:49,000 --> 00:50:50,380 No puppet has failed to climb. 1274 00:50:50,380 --> 00:50:52,660 No puppet has even thought about climbing. 1275 00:50:52,660 --> 00:50:54,597 But the kids can know the answer, right? 1276 00:50:54,597 --> 00:50:56,180 Because if Cookie Monster could climb, 1277 00:50:56,180 --> 00:50:58,540 and he had a high reward, then he would do it. 1278 00:50:58,540 --> 00:51:00,250 So you don't really know about Grover, 1279 00:51:00,250 --> 00:51:03,220 but you can make an inference that the costs were not 1280 00:51:03,220 --> 00:51:05,736 equivalent. 1281 00:51:05,736 --> 00:51:08,110 And by the way, in case you're worried that kids are just 1282 00:51:08,110 --> 00:51:09,555 saying, well, Cookie Monster, I've 1283 00:51:09,555 --> 00:51:10,930 been listening to Michelle Obama. 1284 00:51:10,930 --> 00:51:13,210 You know, obesity and fitness-- cookies aren't good, 1285 00:51:13,210 --> 00:51:14,160 I can't climb. 1286 00:51:14,160 --> 00:51:17,170 We ran the same experiment with Grover and clover, 1287 00:51:17,170 --> 00:51:19,690 and Grover really likes clovers, and Cookie Monster 1288 00:51:19,690 --> 00:51:21,280 likes them both. 1289 00:51:21,280 --> 00:51:23,830 And you now flip the inference around. 1290 00:51:23,830 --> 00:51:31,300 OK, so they can consider how the costs affect the inferences 1291 00:51:31,300 --> 00:51:33,459 they make.
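[Editor's sketch] The "which puppet can't climb?" inference can be spelled out as utility = reward minus cost, followed by a check of which ability hypotheses are consistent with the observed choice. This is a toy illustration of the reasoning, not the naive utility calculus model itself: the reward values, the climbing costs, and the names `CLIMB_COST`, `choose`, and `consistent_abilities` are all hypothetical.

```python
# Assumed cost of reaching the high box for an agent who can / cannot climb.
CLIMB_COST = {True: 2, False: 100}

# Assumed reward values: Cookie Monster strongly prefers cookies,
# Grover likes both treats equally.
COOKIE_MONSTER = {"cookie": 10, "cracker": 1}
GROVER = {"cookie": 5, "cracker": 5}


def choose(rewards, can_climb):
    """Pick the treat with the highest utility (reward minus cost).

    Crackers sit in the low box (no cost); cookies sit in the high box.
    """
    utilities = {
        "cracker": rewards["cracker"],
        "cookie": rewards["cookie"] - CLIMB_COST[can_climb],
    }
    return max(utilities, key=utilities.get)


def consistent_abilities(rewards, observed_choice):
    """Return the can-climb hypotheses that predict the observed choice."""
    return [can for can in (True, False) if choose(rewards, can) == observed_choice]


# Both puppets were observed taking the cracker from the low box.
print(consistent_abilities(COOKIE_MONSTER, "cracker"))  # [False]
print(consistent_abilities(GROVER, "cracker"))          # [True, False]
```

With these assumed values, Cookie Monster's choice is only explained if he cannot climb, while Grover's choice is explained either way, so nothing is learned about Grover.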
1292 00:51:33,459 --> 00:51:35,500 All right, so let's bring this all back together. 1293 00:51:35,500 --> 00:51:36,640 I've thrown a lot of information-- 1294 00:51:36,640 --> 00:51:38,500 probably too much information-- at you 1295 00:51:38,500 --> 00:51:40,300 about how kids can reason from sparse data, 1296 00:51:40,300 --> 00:51:41,710 about how they use their theories to make 1297 00:51:41,710 --> 00:51:43,918 these inferences, about how they use this in teaching 1298 00:51:43,918 --> 00:51:46,520 and learning in social context. 1299 00:51:46,520 --> 00:51:48,760 But if kids are sensitive to these utilities 1300 00:51:48,760 --> 00:51:51,100 and the trade-offs, then the kinds of things 1301 00:51:51,100 --> 00:51:54,140 that I showed you them doing with beads and with machines, 1302 00:51:54,140 --> 00:51:56,930 they should also be able to do in social context. 1303 00:51:56,930 --> 00:51:59,410 We believe psychology to be something of a science, 1304 00:51:59,410 --> 00:52:01,049 as well as all of these other sciences, 1305 00:52:01,049 --> 00:52:02,590 and they should be able to apply some 1306 00:52:02,590 --> 00:52:04,840 of the same principles-- holding some things constant, 1307 00:52:04,840 --> 00:52:07,060 manipulating others-- in order to gain information. 1308 00:52:07,060 --> 00:52:10,330 So we basically ask that question of the children here. 1309 00:52:10,330 --> 00:52:13,270 Can they distinguish agents' different competencies 1310 00:52:13,270 --> 00:52:17,280 and rewards by manipulating the contexts that they see 1311 00:52:17,280 --> 00:52:19,100 and gaining information? 1312 00:52:19,100 --> 00:52:22,300 So here, we don't know if Cookie Monster can climb. 1313 00:52:22,300 --> 00:52:24,910 So let's put one treat on each box. 1314 00:52:24,910 --> 00:52:26,890 Where should we put the treats to find out 1315 00:52:26,890 --> 00:52:28,720 if Cookie Monster can climb? 
1316 00:52:28,720 --> 00:52:30,890 Now, only one of these interventions is informative. 1317 00:52:30,890 --> 00:52:32,306 If the cookie is down low, and you 1318 00:52:32,306 --> 00:52:34,490 know Cookie Monster prefers cookies, 1319 00:52:34,490 --> 00:52:36,420 then you're not going to get any information. 1320 00:52:36,420 --> 00:52:38,260 But if the cookie is up high, then you 1321 00:52:38,260 --> 00:52:40,360 are going to get information, right? 1322 00:52:40,360 --> 00:52:44,460 And in fact, the kids are overwhelmingly good at this. 1323 00:52:44,460 --> 00:52:46,840 In case you think, well, treats should be put up high-- 1324 00:52:46,840 --> 00:52:48,798 and this is also true, by the way, for clovers, 1325 00:52:48,798 --> 00:52:49,880 but it's all right. 1326 00:52:49,880 --> 00:52:51,820 In case you think they just have a heuristic like, 1327 00:52:51,820 --> 00:52:52,810 oh, well, let's put treats up high, 1328 00:52:52,810 --> 00:52:54,518 you can ask the question a different way. 1329 00:52:54,518 --> 00:52:57,430 You can say, both of our friends can climb the short box. 1330 00:52:57,430 --> 00:53:00,210 But only one of our friends can climb the tall box, 1331 00:53:00,210 --> 00:53:01,870 and we don't know which one. 1332 00:53:01,870 --> 00:53:04,492 So let's put the cookie up here and the cracker down here. 1333 00:53:04,492 --> 00:53:06,700 And if we want to figure out which one of our friends 1334 00:53:06,700 --> 00:53:10,330 can climb, which friend should we send in? 1335 00:53:10,330 --> 00:53:12,512 Well, Grover has no particular incentive to climb. 1336 00:53:12,512 --> 00:53:13,720 He could just do the cracker. 1337 00:53:13,720 --> 00:53:16,210 But Cookie Monster, he has an incentive to climb. 1338 00:53:16,210 --> 00:53:18,820 You should probably send in Cookie Monster. 1339 00:53:18,820 --> 00:53:21,370 And again, these are the kinds of inferences 1340 00:53:21,370 --> 00:53:23,723 that young children can make.
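[Editor's illustration, not part of the lecture.] The claim that only one placement of the treats is informative can be sketched as an expected-information-gain comparison. Everything numeric here is an assumption made for illustration: the 50/50 prior on "can climb," the 10% choice noise, and the helper names `entropy` and `expected_information_gain` are hypothetical and are not taken from the study.

```python
import math


def entropy(p):
    """Entropy in bits of a yes/no belief held with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def expected_information_gain(p_low_if_can, p_low_if_cannot=1.0, prior_can=0.5):
    """Expected drop in uncertainty about 'can climb' after one observed choice."""
    gain = entropy(prior_can)
    for chose_low in (True, False):
        like_can = p_low_if_can if chose_low else 1.0 - p_low_if_can
        like_cannot = p_low_if_cannot if chose_low else 1.0 - p_low_if_cannot
        p_outcome = prior_can * like_can + (1.0 - prior_can) * like_cannot
        if p_outcome > 0.0:
            posterior_can = prior_can * like_can / p_outcome
            gain -= p_outcome * entropy(posterior_can)
    return gain


noise = 0.1  # assumed chance of an off-preference choice

# Cookie up high: a Cookie Monster who can climb almost always climbs.
cookie_high = expected_information_gain(p_low_if_can=noise)
# Cookie down low: he takes the low cookie whether or not he can climb.
cookie_low = expected_information_gain(p_low_if_can=1.0 - noise)

print(f"cookie high: {cookie_high:.2f} bits, cookie low: {cookie_low:.2f} bits")
```

The same comparison covers the second version of the question: sending in Grover, who is happy with the low cracker, behaves like the `cookie_low` case, while sending in Cookie Monster behaves like the `cookie_high` case.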
1341 00:53:23,723 --> 00:53:24,569 Is that clear? 1342 00:53:24,569 --> 00:53:25,840 All right. 1343 00:53:25,840 --> 00:53:31,000 So, end of a long-winded talk. 1344 00:53:31,000 --> 00:53:32,890 I want to return to this. 1345 00:53:32,890 --> 00:53:35,560 So I framed it in terms of the way I've been increasingly 1346 00:53:35,560 --> 00:53:38,085 thinking, and the projects that we're increasingly 1347 00:53:38,085 --> 00:53:39,460 moving towards, which is thinking 1348 00:53:39,460 --> 00:53:43,080 not just about the pure pursuit of knowledge and information, 1349 00:53:43,080 --> 00:53:45,634 but how do you pursue information in a complex world 1350 00:53:45,634 --> 00:53:47,800 where it's not just your own individual exploration. 1351 00:53:47,800 --> 00:53:49,060 You get it in a social context, you 1352 00:53:49,060 --> 00:53:50,351 get it in interaction with others. 1353 00:53:50,351 --> 00:53:55,120 The information is costly both to deliver and to process. 1354 00:53:55,120 --> 00:53:57,850 But that, those costs themselves, are information, 1355 00:53:57,850 --> 00:54:01,030 and you can use them to make sense of the world. 1356 00:54:01,030 --> 00:54:04,750 And so I want to sort of bring these together and come back 1357 00:54:04,750 --> 00:54:08,620 to a problem that I posed a bit earlier, which is, I think, 1358 00:54:08,620 --> 00:54:12,130 I hope I've made the case that what kids are doing 1359 00:54:12,130 --> 00:54:15,700 is not reasoning about huge sets of data. 1360 00:54:15,700 --> 00:54:19,300 What kids are doing is, they're taking some very abstract 1361 00:54:19,300 --> 00:54:22,480 structured knowledge and using it to constrain their inferences 1362 00:54:22,480 --> 00:54:24,550 about tiny amounts of data. 1363 00:54:24,550 --> 00:54:27,100 Cookie Monster this, a couple of trials of evidence.
1364 00:54:27,100 --> 00:54:29,560 And then they make good inductive guesses, 1365 00:54:29,560 --> 00:54:34,420 which are sometimes wrong, but they are good. 1366 00:54:34,420 --> 00:54:38,380 And I said, these abstract representations, 1367 00:54:38,380 --> 00:54:40,100 not all of them are innate, right? 1368 00:54:40,100 --> 00:54:42,670 You've seen beautiful evidence of the many that are, 1369 00:54:42,670 --> 00:54:44,020 but a lot of them aren't. 1370 00:54:44,020 --> 00:54:45,130 A lot of them aren't. 1371 00:54:45,130 --> 00:54:47,588 A lot of the things that govern your common sense knowledge 1372 00:54:47,588 --> 00:54:49,210 every day, how do you get those? 1373 00:54:49,210 --> 00:54:52,610 And how do you get those from tiny amounts of data? 1374 00:54:52,610 --> 00:54:57,320 This is a problem that bugged me deeply for a very long time, 1375 00:54:57,320 --> 00:55:03,670 and I think there's been a real leap and a very exciting account 1376 00:55:03,670 --> 00:55:07,210 of how it is actually possible to use tiny amounts of data 1377 00:55:07,210 --> 00:55:09,154 to make really rich abstract inferences, which 1378 00:55:09,154 --> 00:55:10,570 then constrain your interpretation 1379 00:55:10,570 --> 00:55:11,290 of subsequent data. 1380 00:55:11,290 --> 00:55:13,000 I think that is a really important problem. 1381 00:55:13,000 --> 00:55:14,230 And with that, I'm going to turn it over 1382 00:55:14,230 --> 00:55:16,490 to Josh and Tomer, who maybe can tell you how 1383 00:55:16,490 --> 00:55:17,990 that is going to actually work. 1384 00:55:17,990 --> 00:55:20,830 So thanks to everyone here at Woods Hole.