The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

ETHAN MEYERS: What I'm talking about today is neural population decoding, which is very similar to what Rebecca was talking about, except that I'm now talking more at the single-neuron level, and I'll also talk a bit about some MEG at the end. But to tie it to what was previously discussed, Rebecca talked a lot at the end about the big caveat: you don't know whether something is absent from the fMRI signal, because things could be masked when you're averaging over a large region, as you do when you're recording those BOLD signals. And when you're doing decoding on single neurons, that is not really an issue, because you're actually going down and recording those individual neurons. And so while in general in hypothesis testing you can never really say something doesn't exist, here you can feel fairly confident that it probably doesn't, unless you-- I mean, you could do a Bayesian analysis. Anyway, all right.

So the very basic motivation behind what I do is that I'm interested in all the questions the CBMM is interested in: how can we algorithmically solve problems and perform behaviors. And so, basically, as motivation: as a theoretician, we might have some great idea about how the brain works. And so what we do is we come up with an experiment and we run it, and we record a bunch of neural data. And then at the end of it, what we're left with is just a bunch of data. It's not really an answer to our question. So for example, if you recorded spikes, you might end up with something called a raster, where you have trials and time, and you just end up with little indications of at what times a neuron spiked.
Or if you did an MEG experiment, you might end up with a bunch of waveforms that are kind of noisy. And so this is a good first step, but obviously what you need to do is take this and turn it into some sort of answer to your question. Because if you can't turn it into an answer to your question, there is no point in doing that experiment to begin with.

So basically, what I'm looking for is clear answers to questions. In particular I'm interested in two things. One is neural content, and that is what information is in a particular region of the brain, and at what time. And the other thing I'm interested in is neural coding, or what features of the neural activity contain that information. And so the idea is, basically, if we can make recordings from a number of different brain regions and tell what content was in different parts, then we could trace the information flow through the brain and try to unravel the algorithms that enable us to perform particular tasks. And then if we can do that, we can do other things that the CBMM likes to do, such as build helpful robots that will either bring us drinks or create peace.

So the outline for the talk today is: I'm going to talk about what neural population decoding is. I'm going to show you how you can use it to get at neural content, so what information is in brain regions. Then I'm going to show how you can use it to answer questions about neural coding, or how neurons contain information. And then I'm going to show you a little bit about how you can use it to analyze your own data, so, very briefly, a toolbox I created that makes it easy to do these analyses.

All right, so the basic idea behind neural decoding is that you want to take neural activity and try to predict something about the stimulus itself or about, let's say, an animal's behavior. So it's a function that goes from neural activity to a stimulus.
And decoding approaches have been used for maybe about 30 years. So Rebecca was saying MVPA goes back to 2001; well, this goes back much further. In 1986, Georgopoulos did some studies with monkeys showing that he could decode where a monkey was moving its arm based on neural activity. And there were other studies in '93 by Matt Wilson and McNaughton -- Matt gave a talk here, I think, as well -- and what he tried to do is decode where a rat is in a maze. So again, recording from the hippocampus, trying to tell where that rat is. And there's also been a large amount of computational work, such as work by Salinas and Larry Abbott, comparing different decoding methods. But despite all of this work, it's still not widely used. So Rebecca was saying that MVPA has really taken off; well, I'm still waiting for population decoding of neural activity to take off. And so part of why I'm up here today is to say you really should do this. It's really good.

And just a few other names for decoding: it's called MVPA, multivariate pattern analysis -- this is the terminology that people in the fMRI community use and what Rebecca was using -- and it's also called readout. So if you've heard those terms, they refer to the same thing.

All right, so let me show you what decoding looks like in terms of an experiment with, let's say, a monkey. So here we'd have an experiment where we're showing the monkey different images on a screen. And so for example, we could show it a picture of a kiwi. And then we'd be making some neural recordings from this monkey, so we'd get out a pattern of neural activity. And what we do in decoding is we feed that pattern of neural activity into a machine learning algorithm, which we call a pattern classifier. Again, you've all heard a lot about that.
And so what this algorithm does is it learns to make an association between this particular stimulus and this particular pattern of neural activity. And so then we repeat that process with another image, get another pattern of neural activity out, feed that into the classifier, and again it learns that association. And so we do that for every single stimulus in our stimulus set, and for multiple repetitions of each stimulus.

So once this association is learned, what we do is we use the classifier, or test the classifier. Here we show another image. We get another pattern of neural activity out. We feed that into the classifier. But this time, instead of the classifier learning the association, it makes a prediction. And here it predicted the kiwi, so we'd say it's correct. And then we can repeat that with a car, get another pattern of activity out, feed it to the classifier, get another prediction. And this time the prediction was incorrect: it predicted a face, but it was actually a car. And so what we do is we just note how often our predictions are correct. And we can plot that as a function of time and see the evolution of information as it flows through a brain region.

All right, so in reality, what we usually do is we run the full experiment, so we've actually collected all the data beforehand. And then what we do is we split it up into different splits. So here, let's say this experiment was faces and cars or something. So we have different splits that each have two repetitions of the activity of different neurons to two faces and two cars, and there are three different splits. And so what we do is we take two of the splits and train the classifier, and then take the remaining split and test it. And we do that for all permutations of leaving out a different test split. So you've all heard about cross-validation before? OK.
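To make that concrete, here is a minimal sketch of that cross-validated decoding loop in Python. It is only an illustration, not the toolbox mentioned later: the array names are hypothetical, the data are assumed to already be trial-by-neuron firing rates with one label per trial, and any classifier with a fit/predict interface (here scikit-learn's LinearSVC) could be swapped in.

```python
# Minimal sketch of the split-and-leave-one-out decoding loop (illustrative
# names).  X: firing rates, shape (n_trials, n_neurons); y: stimulus label per trial.
import numpy as np
from sklearn.svm import LinearSVC  # any classifier with fit/predict would work

def decode_accuracy(X, y, n_splits=3, seed=0):
    rng = np.random.default_rng(seed)
    # Assign each repetition of each stimulus to one of n_splits splits.
    split_id = np.empty(len(y), dtype=int)
    for label in np.unique(y):
        idx = rng.permutation(np.where(y == label)[0])
        split_id[idx] = np.arange(len(idx)) % n_splits
    accuracies = []
    for test_split in range(n_splits):
        train, test = split_id != test_split, split_id == test_split
        clf = LinearSVC().fit(X[train], y[train])       # learn the associations
        accuracies.append(np.mean(clf.predict(X[test]) == y[test]))
    return float(np.mean(accuracies))                   # average over held-out splits
```

Each trial serves as test data exactly once, and the reported number is the prediction accuracy averaged over the held-out splits.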
One thing to note about neural populations is that when you're doing decoding, you don't actually need to record all the neurons simultaneously. I think this might be one reason why a lot of people haven't jumped on the technique, because they feel like you need to do these massive recordings. But you can actually do something called pseudo-populations, where you build up a virtual population that you pretend was recorded simultaneously but really wasn't. So what you do is, if on the first day you recorded one neuron, and on the second day you recorded a second neuron, et cetera, you can just randomly select, let's say, one trial when a kiwi was shown from the first day, another trial from the second day, et cetera. You randomly pick them, and then you can just build up this virtual population. And you can do that for a few examples of kiwis, a few examples of cars. And then you just train and test your classifier like normal. This kind of broadens the applicability. And then you can ask questions about what is being lost by doing this process versus if you had actually done the simultaneous recordings. And we'll discuss that a little bit more later.
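As a rough illustration of how such a pseudo-population might be assembled, here is a small sketch; the data layout (one dictionary per separately recorded neuron, mapping each stimulus label to that neuron's single-trial firing rates) is an assumption made just for this example.

```python
# Sketch of assembling pseudo-population trials from neurons recorded in
# separate sessions.  Assumed (hypothetical) layout: `neurons` is a list with
# one dict per neuron, mapping a stimulus label to that neuron's single-trial
# firing rates for that stimulus.
import numpy as np

def build_pseudo_trials(neurons, label, n_pseudo_trials, rng):
    pseudo = np.empty((n_pseudo_trials, len(neurons)))
    for j, neuron in enumerate(neurons):
        # Each neuron needs at least n_pseudo_trials repetitions of this stimulus.
        picks = rng.choice(len(neuron[label]), size=n_pseudo_trials, replace=False)
        pseudo[:, j] = np.asarray(neuron[label])[picks]  # randomly paired trials
    return pseudo  # shape (n_pseudo_trials, n_neurons), as if recorded together
```

Calling this once per stimulus label (with, say, rng = np.random.default_rng(0)) and stacking the results gives a pseudo-population data set that can be cross-validated exactly like simultaneously recorded data.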
So I'll give you an example of one classifier. Again, I'm sure you've seen much more sophisticated and interesting methods, but I'll show you a very basic one that I have used a bit in the past. It's called the maximum correlation coefficient classifier. It's, again, very similar to what Rebecca was talking about. But all you do is-- let's say this is our training set, so we have four vectors for each image, each thing we want to classify. And all we're going to do is take the average across those four training vectors to reduce them into a single vector for each stimulus. OK, so if we did that, we'd get one kind of prototype for each of the stimuli. And then to test the classifier, all we're going to do is take a test point and compute the correlation between this test point and each of the prototype vectors. Whichever one has the highest correlation, we're going to say that's the prediction. Hopefully pretty simple.
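Here is a minimal sketch of that classifier, written purely for illustration (the class and array names are not from any particular toolbox): training reduces each stimulus's training trials to one mean prototype vector, and testing predicts whichever prototype correlates most strongly with the test pattern.

```python
# Sketch of a maximum correlation coefficient classifier (illustrative names).
import numpy as np

class MaxCorrelationClassifier:
    def fit(self, X_train, y_train):
        self.classes_ = np.unique(y_train)
        # One prototype per stimulus: the mean firing-rate vector over its training trials.
        self.prototypes_ = np.array([X_train[y_train == c].mean(axis=0)
                                     for c in self.classes_])
        return self

    def predict(self, X_test):
        predictions = []
        for x in X_test:
            # Correlate the test pattern with every prototype and pick the largest.
            r = [np.corrcoef(x, p)[0, 1] for p in self.prototypes_]
            predictions.append(self.classes_[int(np.argmax(r))])
        return np.array(predictions)
```

Because it exposes the same fit/predict interface, it could be dropped into the cross-validation loop sketched earlier in place of the SVM.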
The reason we often use fairly simple classifiers, such as the maximum correlation coefficient classifier, is because-- or at least one motivation is because it can be translated into what information is directly available to a downstream population that is reading out the information in the population you have recordings from. So you could actually view what the classifier learns as synaptic weights onto a neuron. You could view the pattern of activity you're trying to classify as the presynaptic activity. And then by doing this dot product multiplication, perhaps passed through some non-linearity, you can output a prediction about whether there is evidence for a particular stimulus being present.

All right, so let's go into talking about neural content, or what information is in a brain region, and how we can use decoding to get at that. As motivation, I'm going to be talking about a very simple experiment. Basically, this experiment involves a monkey fixating on a point through the duration of the trial. First, there's a blank screen, and then after 500 milliseconds up comes a stimulus. And for this experiment, there are seven different possible stimuli, which are shown here. And what we're going to try to decode is which of these stimuli was present on one particular trial, and we're going to do that as a function of time. And the data I'm going to use come from the inferior temporal cortex. We're going to look at 132-neuron pseudo-populations. This was data recorded by Ying Zhang in Bob Desimone's lab. It's actually part of a more complicated experiment, but I've just reduced it here to its simplest, bare-bones nature.

So what we're going to do is basically train the classifier at one time point, using the average firing rate in some bin -- I think in this case it's 100 milliseconds -- and then we're going to test at that time point. And then I'm going to slide over by a small amount and repeat that process, so each time we are repeating training and testing the classifier. Again, 100-millisecond bins, sampled, or slid over, every 10 milliseconds. And this will give us the flow of information over time. So during the baseline period we should not be able to decode what's about to be seen, unless the monkey is psychic, in which case either there is something wrong with your experiment, most likely, or you should go to Wall Street with your monkey. But you shouldn't get anything here. And then we should see some sort of increase here if there is information.

And this is what it looks like in the results. So this is zero; after here, we should see information. This is chance, or 1 over 7. And if we try this decoding experiment, what we find is that during the baseline, our monkey is not psychic. But when we put on a stimulus, we can tell what it is pretty well, like almost perfectly. Pretty simple.

All right, we can also do some statistics to tell you when the decoding results are above chance, doing some sort of permutation test where we shuffle the labels and try to do the decoding on shuffled labels, where we should get chance decoding performance. And then we can see where our real result sits relative to chance, and get p-values and things like that.
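As a sketch of how that sliding-bin decoding and label-shuffle test could look in code (again purely illustrative: it assumes the firing rates have already been binned into a trials x neurons x time-bins array, and it reuses the hypothetical decode_accuracy function from the earlier sketch):

```python
# Sketch of sliding-bin decoding plus a label-shuffle permutation test.
# rates: firing rates binned into shape (n_trials, n_neurons, n_time_bins),
# e.g. 100 ms bins stepped every 10 ms.
import numpy as np

def sliding_window_decoding(rates, labels):
    return np.array([decode_accuracy(rates[:, :, t], labels)
                     for t in range(rates.shape[2])])

def decoding_p_values(rates, labels, n_shuffles=200, seed=0):
    rng = np.random.default_rng(seed)
    real = sliding_window_decoding(rates, labels)
    # Null distribution: redo the whole analysis with shuffled labels, which
    # should sit around chance (1/7 here, since there are seven stimuli).
    null = np.array([sliding_window_decoding(rates, rng.permutation(labels))
                     for _ in range(n_shuffles)])
    # One-sided p-value per time bin: how often shuffled runs match the real result.
    return (1 + (null >= real).sum(axis=0)) / (1 + n_shuffles)
```

The p-value at each time bin is simply the fraction of shuffled runs that decode at least as well as the real labels do.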
It's pretty simple. So how does this stack up against other methods that people commonly use? So here's our decoding result, and here's another method: here I'm applying an ANOVA to each neuron individually and counting the number of neurons that are deemed to be selective. And what you see is that there are basically no neurons in the baseline period, and then we have a huge number. OK, so it looks pretty much identical. We can compute mutual information on each neuron and then average that together over a whole bunch of neurons. Again, looks pretty similar. Or we can compute a selectivity index: take the response to the best stimulus, subtract the response to the worst stimulus, and divide by the sum. Again, looks similar.

So there are two takeaway messages here. First of all, why do decoding if all the other methods work just as well? And I'll show you in a bit that they don't always. And the other takeaway message, though, is a reassurance: it is giving you the same thing, right? So you know we're not completely crazy. It's a sensible thing to do in the most basic case.

One other thing decoding can give you that these other methods can't is something called a confusion matrix. So a confusion matrix -- Rebecca talked a little bit about related concepts -- basically what you have is the true classes here, so this is what was actually shown on each trial, and this is what your classifier predicted. So the diagonal elements mean correct predictions: there actually was a car shown and you predicted a car. But you can look at the off-diagonal elements and see which mistakes were commonly made. And this can tell you, oh, these two stimuli are represented in a similar way in a brain region, where the mistakes are happening.
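A confusion matrix is easy to build once you have held-out predictions; here is a small sketch (the arrays and class list are hypothetical):

```python
# Sketch of a confusion matrix from held-out predictions (hypothetical arrays).
import numpy as np

def confusion_matrix(true_labels, predicted_labels, classes):
    index = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)))
    for t, p in zip(true_labels, predicted_labels):
        cm[index[t], index[p]] += 1            # row = what was shown, column = prediction
    return cm / cm.sum(axis=1, keepdims=True)  # normalize each row to proportions

# Diagonal entries are correct predictions; large off-diagonal entries flag
# pairs of stimuli that the brain region appears to represent similarly.
```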
So another kind of methods issue is: what is the effect of using different classifiers? If the results are highly dependent on the classifier you use, then that's not a good thing, because you're not learning anything about the data; you're really learning something about the method you used to extract it. But in general, at least for simple decoding questions, it's pretty robust to the choice of classifier. So here is the maximum correlation coefficient classifier I told you about, and here's a support vector machine. You can see almost everything looks similar. And when something is not working as well, it's generally a slight downward shift. So you get the same kind of estimate of how much information is in a brain region, flowing as a function of time, but maybe your absolute accuracy is just a little bit lower if you're not using the optimal method. But really, it seems like we're assessing what is in the data and not so much the algorithm.

So that was decoding basic information in terms of content. But I think one of the most powerful things decoding can do is decode what I call abstract or invariant information, where you can get an assessment of whether that's present. So what does that mean? Well, basically you can think of something like the word hello. It has many different pronunciations in different languages. But if you speak these different languages, you can translate that word into some sort of meaning -- that it's a greeting -- and you know how to respond appropriately. So that's a form of abstraction: it's going from very different sounds into some sort of abstract representation where I know how to respond appropriately by saying hello back in that language. Another example of this kind of abstraction or invariance is invariance to the pose of a head. So for example, here is a bunch of pictures of Hillary Clinton. You can see her head is at very different angles, but we can still tell it's Hillary Clinton. So we have some sort of representation of Hillary that's abstracted from the exact pose of her head, and also abstracted from the color of her pantsuit. It's very highly abstract, right?
So that's pretty powerful: knowing how the brain is dropping information in order to build up these representations that are useful for behavior. And I think if we were, again, going to build an intelligent robotic system, we'd want to build it to have representations that become more abstract so it can perform correctly.

So let me show you an example of how we can assess abstract representations in neural data. What I'm going to look at is position invariance. This is similar to a study that was done in 2005 by Hung and Kreiman in Science. And what I'm going to do here is train the classifier with data from an upper location. So in this experiment, the stimuli were shown at three different locations: on any given trial, one stimulus was shown at one location, and these three locations were used, so the seven objects were all shown at the upper location, or at the middle, or at the lower. And here I'm training the classifier using just the trials when the stimuli were shown at the upper location. And then what we can do is test the classifier on just those trials where the stimuli were shown at the lower location. And we can see, if we train at the upper location, does it generalize to the lower location. And if it does, it means there is a representation that's invariant to position. Does that make sense to everyone?
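Here is a minimal sketch of that train-at-one-position, test-at-another analysis (the variable names are hypothetical, and any fit/predict classifier would do):

```python
# Sketch of cross-position generalization decoding (illustrative names).
# X: (n_trials, n_neurons) firing rates; y: object identity per trial;
# position: which of the three locations the stimulus appeared at on that trial.
import numpy as np
from sklearn.svm import LinearSVC

def cross_position_accuracy(X, y, position, train_pos, test_pos):
    train, test = position == train_pos, position == test_pos
    clf = LinearSVC().fit(X[train], y[train])        # learn identity at one position
    return np.mean(clf.predict(X[test]) == y[test])  # does it transfer to the other?
```

Above-chance accuracy with train_pos different from test_pos is evidence for a position-invariant representation; for the case where the two positions are the same, you would additionally hold out separate test trials, exactly as in the cross-validation sketch earlier.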
So let's take a look at the results for training at the upper location and testing at the lower; they're down here. So here, again, I'm training at the upper location, and these are the results from testing at the lower. Here is chance, and you can see we're well above chance in the decoding. So it's generalizing from the upper location to the lower. We can also train at the upper and test at that same upper location, or at the middle location. And what we find is this pattern of results: we get the best results when we train and test at exactly the same position, but we can see it does generalize to other positions as well.

And we can do the full set of permutations of this. So here we trained at the upper; we can also train at the middle, or train at the lower location. And here, if we train at the middle, we get the best decoding performance when we decode at that same middle location, but again it generalizes to the upper and lower locations. And the same for training at the lower: we get the best performance testing at the lower, but it again generalizes. So to conclude this one mini-study: information in IT is position invariant, but not, you know, 100%.

So we can use this technique -- I'll show you a few other examples of how it can be used in slightly more powerful ways, maybe, or to answer slightly more interesting questions. So another question we might want to ask, and actually did ask in this paper that just came out, was about pose-invariant identity information -- that same question of whether a brain region can respond to Hillary Clinton regardless of where she's looking. And so this is data recorded by Winrich Freiwald and Doris Tsao. Winrich probably already talked about this experiment. But what they did was, in the face system here, they found these little patches through fMRI that respond more to faces than to other stimuli, and they went in and recorded from those patches. And in the study we're going to look at, they used stimuli that had 25 different individuals shown at eight different head orientations. So this is Doris at eight different head orientations, but there were 24 other people who were also shown. And so what I'm going to try to do is decode between the 25 different people and see whether it can generalize if I train at one orientation and test at a different one. And the three brain regions we're going to use: there's the most posterior region -- so in this case, the eyes are out here, this is like V1, and this is the ventral pathway.
For the most posterior region, we combine ML and MF, and we compare that to AL and to AM, and I'm going to see how much pose invariance there is. So again, like I said, let's start by training on the left profile, and then we can test on the left profile on different trials, or we can test on a different set of images where the individuals were looking straight ahead.

So here are the results from the most posterior region, ML/MF. What we see is that if we train on the left profile and test on the left profile here, we get results that are above chance, as indicated by the lighter blue trace. But if we train on the left profile and test on the straight-ahead images, we get results that are at chance. So this patch here is not showing very much pose invariance.

So let's take a look at the rest of the results. This is ML/MF. If we look at AL, what we see is, again, a big advantage for training and testing at that same orientation, but now we're seeing generalization to the other orientations. You're also seeing this "U" pattern, where you actually generalize better from one profile to the opposite profile, which was reported in some of their earlier papers. But here you're seeing, statistically, that it is above chance. Now it's not huge, but it's above what you'd expect by chance. And if we look at AM as well, we're seeing a higher degree of invariance -- again, a slight advantage for the exact pose, but still pretty good. Again, there's this "U" a little bit, but it generalizes even to the back of the head. And what would that tell you? The fact that it generalizes to the back of the head tells you it's probably representing something about the hair.

What I'm going to do next, rather than just training on the left profile, is take the results of training at each of the orientations and either testing at the same orientation or testing at a different one, and then I'm going to plot that as a function of time.
So here are the results of training and testing at the same pose -- the non-invariant case. This is ML/MF, and this is AL and AM, so this is going from posterior to anterior. And what you see is that there is kind of an increase in this pose-specific information. Here the increase is fairly small, but there is just generally more information as you go along. But the big increase is really in this pose-invariant information, when you train at one orientation and test at another -- that's these red traces here. And here you can see it's really accelerating a lot. It's really that these downstream areas are maybe pooling over the different poses to create a pose-invariant representation.

So to carry on with this general concept of testing invariant or abstract representations, let me give you one more example. Here is one of my earlier studies, and this study was looking at categorization. It was a study done in Earl Miller's lab; David Freedman collected the data. And what they did was they trained a monkey to group a bunch of images together and call them cats, and then to group a number of images together and call them dogs. It wasn't clear that the images were necessarily more similar to each other within a category versus outside the category. But through this training, the monkeys could group the images together quite well in a delayed match-to-sample task. And so what I wanted to know was: is there information about the animal's category that is abstracted away from the low-level visual features? In other words, through this learning process, did they build neural representations that are more similar to each other? So what I did here was I trained the classifier on two of the prototype images, and then I tested it on a left-out prototype.
And so if it's making correct predictions here, then it is generalizing to something that would only be available in the data due to the monkey's training -- modulo any low-level confounds. And so here is the decoding of this abstract or invariant information from the two areas. And what you see is that, indeed, there seems to be this kind of grouping effect, where the category is represented both in IT and in PFC in this abstract way. So the same method can be used to assess learning.

So just to summarize the neural content part: decoding offers a way to clearly see what information is there and how it flows through a brain region as a function of time. We can assess basic information, and often that yields similar results to other methods. But we can also do things like assess abstract or invariant information, which is not really possible with other methods, as far as I can see how to use those other methods.

So for neural coding, my motivation -- for this one study I did -- is the game of poker. Basically, when I moved to Boston I learned how to play Texas Hold'em. It's a card game -- a variant of poker, as I'm sure most of you know. I didn't know the rules before, but I learned the rules, and I could play the game pretty successfully, at least in terms of applying those rules correctly, not necessarily in terms of winning money. But I knew what to do. And prior to that, I knew other games like Go Fish, or War, or whatever. And my learning how to play poker did not disrupt my ability to play Go Fish -- I was still bad at that as well. So somehow the information that allowed me to play this game had to be added into my brain, if we believe brains cause behavior. And so in this study, we're getting at that question: what changed about a brain to allow it to perform a new task?

And so to do this in an experiment with monkeys, they used a paradigm that had two different phases to it.
In the first phase, what they did was they had the monkey just do a passive fixation task. So what the monkey saw was: a fixation dot came up, up came a stimulus, there was a delay, there was a second stimulus, there was a second delay, and then there was a reward. And the reward was given just for the monkey maintaining fixation; the monkey did not need to pay attention to what the stimuli were at all. And on some trials the stimuli were the same; on other trials, they were different. But the monkey did not need to care about that. So the monkey does this passive task, and they record over 750 neurons from the prefrontal cortex.

And then what they did was they trained the monkey to do a delayed match-to-sample task. And the delayed match-to-sample task ran very similarly: there was a fixation, there was a first stimulus, there was a delay, a second stimulus, a second delay -- so up to this point, the sequence of stimuli was exactly the same. But now, after the second delay, up came a choice target, a choice image, and the monkey needed to make a saccade to the green stimulus if the two stimuli were matches, and to make a saccade to the blue stimulus if they were different. And so what we wanted to know was: when the monkey is performing this task, where it needs to remember the stimuli and whether they matched or not, is there a change in the monkey's brain?

And so the way we're going to get at this is, not surprisingly, with a decoding approach. And what we do is the same thing, where we train a classifier at one point in time, test it, and move on. And what we're going to try to decode is whether the two stimuli matched or did not match.
And so at the time when the second stimulus was shown, we should have some sort of information about whether it was a match or a non-match, if any such information is present. And we can see: was that information there before, when the monkey was just passively fixating, or does it come on only after training?

So here is a schematic of the results for decoding. It's a binary task -- whether a trial was a match or a non-match -- so chance is 50% if you were guessing. This light gray shaded region is the time when the first stimulus came on; this second region is the time the second stimulus came on. And here is a part we're going to ignore: this was either the monkey making a choice or getting a juice reward. We just ignore that.

So let's make this interactive. How many people think there might be information about whether the two stimuli match or do not match prior to the monkey doing the task, so just in the passive fixation task? Two, three, four, five -- how many people think there was not? OK, I'd say it's about a 50/50 split. So let's look at the passive fixation task. And what we find is that there really wasn't any information; there's no blue bar down here. So as far as the decoding could tell, you cannot tell whether the two stimuli match or do not match in the passive fixation task.

What about in the active delayed match-to-sample task -- how many people think-- well, it would be a pretty boring talk if there wasn't. What area? We're talking about dorsolateral -- actually, both dorsolateral and ventrolateral -- prefrontal cortex.

Yeah, indeed, there was information there. In fact, we could decode nearly perfectly from that brain region -- so way up here at the time when the second stimulus was shown. So clearly performing the task, or learning how to perform the task, influenced what information was present in the prefrontal cortex. I'm pretty convinced that this information is present and real.
Now the question is -- and this is why I'm using this as an example of coding -- how did this information get added into the population? We believe it's there for real and probably contributing to behavior; it's a pretty big effect.

All right, so here are just some single-neuron results. What I've plotted here is a measure of how much of the variability of a neuron is predicted by whether a trial is a match or a non-match. Each dot is a neuron, and I've plotted each neuron at the time where it had its maximum value of being able to predict whether a trial is a match or a non-match. So this is the passive case, and it's kind of a null distribution, because we didn't see any information about match versus non-match in the passive case. When the monkey was performing the delayed match-to-sample task, what you see is that a small number of neurons become selective after the second stimulus is shown. So it seems like a few neurons are carrying a bunch of the information.

Let's see if we can quantify this a little better using decoding. So what we're going to do is take the training set and do an ANOVA to find, let's say, the eight neurons that carry the most information out of the whole population. So out of the 750 neurons, let's just find the eight that had the smallest p-value in an ANOVA. We find those neurons, we keep them, and we delete all the other neurons. And now that we've found those neurons on the training set, we also go to the test set and keep only those same eight neurons there, deleting the rest. And now we try doing the whole decoding procedure on this smaller population. And because we selected the neurons using only the training set, we're not biasing our results when we do the classification.
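Here is a rough sketch of that select-on-the-training-set-only procedure (illustrative names again; it uses a one-way ANOVA per neuron and the same kind of fit/predict classifier as before):

```python
# Sketch of selecting the most selective neurons using the training set only,
# then decoding with that reduced population (illustrative names).
import numpy as np
from scipy.stats import f_oneway
from sklearn.svm import LinearSVC

def top_k_neuron_accuracy(X_train, y_train, X_test, y_test, k=8):
    # One-way ANOVA per neuron: does its firing rate differ across conditions?
    p_values = np.array([
        f_oneway(*(X_train[y_train == c, j] for c in np.unique(y_train))).pvalue
        for j in range(X_train.shape[1])])
    best = np.argsort(p_values)[:k]                  # the k smallest p-values
    clf = LinearSVC().fit(X_train[:, best], y_train)
    # Selection used only the training trials, so the held-out test is unbiased.
    return np.mean(clf.predict(X_test[:, best]) == y_test)
```

Inverting the selection -- dropping the most selective neurons instead of keeping the best eight -- gives the redundancy analysis described next.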
770 00:33:42,050 --> 00:33:44,270 And what you can see is that the eight best neurons 771 00:33:44,270 --> 00:33:48,327 are doing almost as well as using all 750 neurons. 772 00:33:48,327 --> 00:33:50,410 Now I should say, there might be a different eight 773 00:33:50,410 --> 00:33:51,770 best at each point in time because I'm 774 00:33:51,770 --> 00:33:52,910 shifting that bin around. 775 00:33:52,910 --> 00:33:54,368 But still, at any one point in time 776 00:33:54,368 --> 00:33:57,770 there are eight neurons that are really, really good. 777 00:33:57,770 --> 00:34:00,920 So clearly there is kind of this compact or small subset 778 00:34:00,920 --> 00:34:05,361 of neurons that carry the whole information of the population. 779 00:34:05,361 --> 00:34:06,860 Once you've done that, you might not 780 00:34:06,860 --> 00:34:09,830 want to know the flip of that, how many redundant neurons are 781 00:34:09,830 --> 00:34:12,239 there that also carry that information. 782 00:34:12,239 --> 00:34:15,949 So here are the results, again, showing all 750 neurons 783 00:34:15,949 --> 00:34:16,812 as a comparison. 784 00:34:16,812 --> 00:34:18,270 And what I'm going to do now is I'm 785 00:34:18,270 --> 00:34:20,311 going to take those eight best neurons, find them 786 00:34:20,311 --> 00:34:22,560 in the training set, throw them out. 787 00:34:22,560 --> 00:34:24,080 I'm going to also throw another 120 788 00:34:24,080 --> 00:34:27,415 of the best neurons just to get rid of a lot of stuff. 789 00:34:27,415 --> 00:34:29,040 So I'm going to throw out the best 128. 790 00:34:29,040 --> 00:34:30,800 And then we'll look at the remaining neurons and see, 791 00:34:30,800 --> 00:34:33,050 is there redundant information in those neurons. 792 00:34:33,050 --> 00:34:36,949 It's still like 600 neurons or more. 793 00:34:36,949 --> 00:34:38,810 And so here are the results from that. 794 00:34:38,810 --> 00:34:41,690 What you see is that there is also redundant information 795 00:34:41,690 --> 00:34:43,130 in this kind of weaker tail. 796 00:34:43,130 --> 00:34:45,500 It's not quite as good as the eight best or not as 797 00:34:45,500 --> 00:34:47,480 high decoding accuracy, but there 798 00:34:47,480 --> 00:34:48,829 is redundant information to it. 799 00:34:51,679 --> 00:34:54,380 Just to summarize this part, what we see here 800 00:34:54,380 --> 00:34:56,270 is that there is a few neurons that 801 00:34:56,270 --> 00:34:58,775 really became highly, highly selective due to this process. 802 00:35:02,420 --> 00:35:04,240 So we see that there's a lot of information 803 00:35:04,240 --> 00:35:06,640 in this small, compact set. 804 00:35:06,640 --> 00:35:08,740 Here are the results from a related experiment. 805 00:35:08,740 --> 00:35:10,885 This was in a task where the monkey had 806 00:35:10,885 --> 00:35:12,760 to remember the spatial location of a stimuli 807 00:35:12,760 --> 00:35:16,570 rather than what an image was, like a square or circle. 808 00:35:16,570 --> 00:35:18,220 But anyway, small detail. 809 00:35:18,220 --> 00:35:20,710 Here's this big effect of this is match information, 810 00:35:20,710 --> 00:35:23,536 this is non-match information being decoded. 811 00:35:23,536 --> 00:35:24,910 So these are the decoding results 812 00:35:24,910 --> 00:35:27,420 that I showed you before. 813 00:35:27,420 --> 00:35:31,870 Here's an analysis where an ROC analysis was done on this data. 
814 00:35:31,870 --> 00:35:33,850 So for each neuron, they calculated 815 00:35:33,850 --> 00:35:36,910 how well an individual neuron separates the match 816 00:35:36,910 --> 00:35:38,920 and the non-match trials. 817 00:35:38,920 --> 00:35:41,040 And again, pre and post training. 818 00:35:41,040 --> 00:35:44,560 And what you see here is they did not see this big split 819 00:35:44,560 --> 00:35:46,990 that I saw with the decoding. 820 00:35:46,990 --> 00:35:49,460 And this was published. 821 00:35:49,460 --> 00:35:53,830 So the question is, why did they not see it. 822 00:35:53,830 --> 00:35:57,490 And the reason is because there were only a few neurons that 823 00:35:57,490 --> 00:35:59,140 were really highly selective. 824 00:35:59,140 --> 00:36:00,770 That was enough to drive the decoding 825 00:36:00,770 --> 00:36:03,370 but it wasn't enough if you averaged over all the neurons 826 00:36:03,370 --> 00:36:04,910 to see this effect. 827 00:36:04,910 --> 00:36:07,600 So essentially, there's kind of like two populations here. 828 00:36:07,600 --> 00:36:09,100 There's a huge population of neurons 829 00:36:09,100 --> 00:36:10,840 that did not pick up the match information, 830 00:36:10,840 --> 00:36:12,280 or picked it up only very weakly. 831 00:36:12,280 --> 00:36:14,020 And then there's a small set of neurons 832 00:36:14,020 --> 00:36:16,780 that are very selective. 833 00:36:16,780 --> 00:36:20,930 And so if you take an average of the nonselective population, 834 00:36:20,930 --> 00:36:22,440 it's just here. 835 00:36:22,440 --> 00:36:24,850 Let's say this is the pre-training population. 836 00:36:24,850 --> 00:36:26,710 If you take an average of post-training 837 00:36:26,710 --> 00:36:28,810 over all the neurons, the average 838 00:36:28,810 --> 00:36:30,400 would shift slightly to the right. 839 00:36:30,400 --> 00:36:32,440 But it might not be very distinguishable 840 00:36:32,440 --> 00:36:34,960 from the pre-training amount of information. 841 00:36:34,960 --> 00:36:38,020 But if you have weights on just the highly selective neurons, 842 00:36:38,020 --> 00:36:39,634 you see a huge effect. 843 00:36:39,634 --> 00:36:41,800 So it's really important that you don't average over 844 00:36:41,800 --> 00:36:45,280 all your neurons but you treat the neurons as individuals, 845 00:36:45,280 --> 00:36:49,390 or maybe classes, because they're doing different things. 846 00:36:49,390 --> 00:36:52,540 So the next coding question I wanted to ask 847 00:36:52,540 --> 00:36:54,880 was, is information contained in what I 848 00:36:54,880 --> 00:36:57,580 call a dynamic population code. 849 00:36:57,580 --> 00:37:00,980 OK, so let me explain what that means. 850 00:37:00,980 --> 00:37:05,190 If we showed a stimulus, such as a kiwi, which I like showing, 851 00:37:05,190 --> 00:37:08,800 we saw that there might be a unique pattern for that kiwi. 852 00:37:08,800 --> 00:37:10,615 And that pattern is what enables me 853 00:37:10,615 --> 00:37:12,490 to discriminate it from all the other stimuli 854 00:37:12,490 --> 00:37:14,089 and do the classification. 855 00:37:14,089 --> 00:37:15,880 But it might turn out that there's not just 856 00:37:15,880 --> 00:37:18,340 one pattern for that kiwi, but there's actually 857 00:37:18,340 --> 00:37:19,790 a sequence of patterns. 858 00:37:19,790 --> 00:37:22,270 So if we plotted the patterns in time, 859 00:37:22,270 --> 00:37:24,940 they would actually change. 860 00:37:24,940 --> 00:37:27,340 So it's a sequence of patterns that represents one thing.
861 00:37:29,890 --> 00:37:33,215 And this kind of thing has been shown a little bit. 862 00:37:33,215 --> 00:37:34,840 And actually now it's been shown a lot. 863 00:37:34,840 --> 00:37:38,350 But when I first did this in 2008, the kind of one 864 00:37:38,350 --> 00:37:40,360 study I knew of that kind of showed 865 00:37:40,360 --> 00:37:44,704 this was this paper by Ofer Mazor and Gilles Laurent 866 00:37:44,704 --> 00:37:46,370 where they did kind of the PCA analysis. 867 00:37:46,370 --> 00:37:49,030 And this is in like the locusts, I think, olfactory bulb. 868 00:37:49,030 --> 00:37:51,446 And they showed that there were these kind of trajectories 869 00:37:51,446 --> 00:37:53,830 in space where a particular odor was represented 870 00:37:53,830 --> 00:37:57,400 by maybe different neurons. 871 00:37:57,400 --> 00:38:00,250 And again, I had a paper in 2008 where I examined this. 872 00:38:00,250 --> 00:38:02,710 And there's a review paper by King and Dehaene 873 00:38:02,710 --> 00:38:03,980 in 2014 about this. 874 00:38:03,980 --> 00:38:06,830 And there's a lot of people looking at this now. 875 00:38:06,830 --> 00:38:10,172 So how can we get at this kind of thing in decoding? 876 00:38:10,172 --> 00:38:12,130 What you can do is you can train the classifier 877 00:38:12,130 --> 00:38:14,530 at one point in time, and test it at a point in time 878 00:38:14,530 --> 00:38:15,790 like we were doing before. 879 00:38:15,790 --> 00:38:19,260 But you can also test at other points in time. 880 00:38:19,260 --> 00:38:22,000 And so what happens is if you train at a point in time that 881 00:38:22,000 --> 00:38:24,760 should have the information, and things are contained 882 00:38:24,760 --> 00:38:27,880 in a static code where there's just one pattern, then if you 883 00:38:27,880 --> 00:38:30,544 test at other points in time, you should do well. 884 00:38:30,544 --> 00:38:33,210 Because you capture that pattern where there's good information, 885 00:38:33,210 --> 00:38:35,590 you should do well at other points in time. 886 00:38:35,590 --> 00:38:38,292 However, if it's a changing pattern of neural activity, 887 00:38:38,292 --> 00:38:40,000 then when you train at one point in time, 888 00:38:40,000 --> 00:38:43,305 you won't do well at other points in time. 889 00:38:43,305 --> 00:38:44,180 Does that make sense? 890 00:38:50,550 --> 00:38:54,900 So here are the results-- 891 00:38:54,900 --> 00:38:55,950 if that will go away. 892 00:38:55,950 --> 00:38:57,400 Let me just orient you here. 893 00:38:57,400 --> 00:38:59,700 So this is the same experiment, you know, 894 00:38:59,700 --> 00:39:03,390 time of the first stimulus, time of the second stimulus, chance. 895 00:39:03,390 --> 00:39:05,880 This black trace is what we saw before that I was always 896 00:39:05,880 --> 00:39:06,870 plotting in red. 897 00:39:06,870 --> 00:39:09,161 This is the standard decoding when I trained and tested 898 00:39:09,161 --> 00:39:11,250 at each point in time. 899 00:39:11,250 --> 00:39:13,590 This blue trace is where I train here 900 00:39:13,590 --> 00:39:16,920 and I tested all other points in time. 901 00:39:16,920 --> 00:39:19,530 So if it's the case that there's one pattern coding 902 00:39:19,530 --> 00:39:21,330 the information, what you're going to find 903 00:39:21,330 --> 00:39:24,000 is that as soon as that information becomes present, 904 00:39:24,000 --> 00:39:26,940 it will fill out this whole curve. 
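A minimal sketch of that train-at-one-time-bin, test-at-every-time-bin idea, assuming the data are already binned into trials-by-neurons-by-time arrays; the names and the simple nearest-centroid classifier are stand-ins for illustration, not the actual analysis code.

```python
import numpy as np
from sklearn.neighbors import NearestCentroid

def cross_temporal_decoding(X_train, y_train, X_test, y_test):
    """Train a classifier at each time bin and test it at every time bin.

    X_train, X_test: trials x neurons x time_bins arrays of binned firing rates.
    Returns a (train_time x test_time) accuracy matrix: a static code should fill
    out whole rows of the matrix, a dynamic code should stay near the diagonal.
    """
    n_bins = X_train.shape[2]
    accuracy = np.zeros((n_bins, n_bins))
    for t_train in range(n_bins):
        clf = NearestCentroid()
        clf.fit(X_train[:, :, t_train], y_train)          # learn the pattern at one time
        for t_test in range(n_bins):
            accuracy[t_train, t_test] = clf.score(X_test[:, :, t_test], y_test)
    return accuracy
```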
905 00:39:26,940 --> 00:39:29,340 Conversely, if it's changing, what you might see 906 00:39:29,340 --> 00:39:34,254 is just a localized information just at one spot. 907 00:39:34,254 --> 00:39:35,670 So let's take a look at the movie, 908 00:39:35,670 --> 00:39:36,878 if that moves out of the way. 909 00:39:36,878 --> 00:39:39,720 OK, here is the moment of truth. 910 00:39:39,720 --> 00:39:42,070 Information is rising. 911 00:39:42,070 --> 00:39:46,845 And what you see in this second delay period 912 00:39:46,845 --> 00:39:50,646 is clearly we see this little peak moving along. 913 00:39:50,646 --> 00:39:52,020 So it's not that there's just one 914 00:39:52,020 --> 00:39:55,884 pattern that contains information 915 00:39:55,884 --> 00:39:56,800 at all points in time. 916 00:39:56,800 --> 00:39:59,070 But in fact, it's a sequence of patterns 917 00:39:59,070 --> 00:40:00,864 that each contain that information. 918 00:40:07,640 --> 00:40:11,410 So here are the results just plotted in a different format. 919 00:40:11,410 --> 00:40:13,570 This is what we call a temporal cross training 920 00:40:13,570 --> 00:40:16,010 plot because I train at one point and test 921 00:40:16,010 --> 00:40:18,062 at a different point in time. 922 00:40:18,062 --> 00:40:20,020 So this is the time I'm testing the classifier. 923 00:40:20,020 --> 00:40:22,180 This is the time I'm training the classifier. 924 00:40:22,180 --> 00:40:23,980 This is the passive fixation stage, 925 00:40:23,980 --> 00:40:26,710 so there was no information in the population. 926 00:40:26,710 --> 00:40:28,322 And this is just how I often plot it. 927 00:40:28,322 --> 00:40:30,280 What you see is there's this big diagonal band. 928 00:40:30,280 --> 00:40:31,930 Here you see it's like widening a bit 929 00:40:31,930 --> 00:40:36,350 so it might be hitting some sort of stationary point there. 930 00:40:36,350 --> 00:40:38,710 But you can see that clearly there's 931 00:40:38,710 --> 00:40:40,979 these dynamics happening. 932 00:40:40,979 --> 00:40:43,145 And we can go and we can look at individual neurons. 933 00:40:43,145 --> 00:40:45,700 So these are actually the three most selective neurons. 934 00:40:45,700 --> 00:40:48,700 They're not randomly chosen. 935 00:40:48,700 --> 00:40:51,430 Red is the firing rate to the non-match trials. 936 00:40:51,430 --> 00:40:53,380 Blue is the firing rate to the match trials. 937 00:40:53,380 --> 00:40:56,910 This neuron has a pretty wide window of selectivity. 938 00:40:56,910 --> 00:41:00,160 This other neuron here has a really small window. 939 00:41:00,160 --> 00:41:02,650 There's just this little blip where it's more selective 940 00:41:02,650 --> 00:41:05,652 or has a higher firing rate to not match compared to match. 941 00:41:05,652 --> 00:41:08,110 And it's these neurons that have these little kind of blips 942 00:41:08,110 --> 00:41:10,510 that are giving rise to that dynamics. 943 00:41:10,510 --> 00:41:13,420 Here's something else we can ask about with the paradigm 944 00:41:13,420 --> 00:41:17,111 of asking coding questions. 945 00:41:17,111 --> 00:41:18,610 What we're going to do here is we're 946 00:41:18,610 --> 00:41:21,007 going to try a bunch of different classifiers. 947 00:41:21,007 --> 00:41:22,840 And here, you know, these are some questions 948 00:41:22,840 --> 00:41:23,570 that kind of came up. 949 00:41:23,570 --> 00:41:26,194 But can we tweak the classifier to understand a little bit more 950 00:41:26,194 --> 00:41:27,290 about population code. 
951 00:41:27,290 --> 00:41:29,320 So here is a fairly simple example. 952 00:41:29,320 --> 00:41:31,702 But I compared three different classifiers. 953 00:41:31,702 --> 00:41:33,160 And the question I wanted to get at 954 00:41:33,160 --> 00:41:37,330 was, is information coded in the total activity of a population. 955 00:41:37,330 --> 00:41:40,630 Or is it coded more so in the relative activity 956 00:41:40,630 --> 00:41:42,080 of different neurons. 957 00:41:42,080 --> 00:41:44,920 So you know, in particular, in the face patches, 958 00:41:44,920 --> 00:41:49,900 we see that information of all neurons increases to faces. 959 00:41:49,900 --> 00:41:51,625 But if you think about that from a-- 960 00:41:51,625 --> 00:41:53,500 or maybe not information, but the firing rate 961 00:41:53,500 --> 00:41:55,270 increases to all faces. 962 00:41:55,270 --> 00:41:57,340 But if the firing rate increases to all faces, 963 00:41:57,340 --> 00:41:59,800 you've lost dynamic range and you can't really 964 00:41:59,800 --> 00:42:02,237 tell what's happening for individual faces. 965 00:42:02,237 --> 00:42:03,820 So what I wanted to know was, how much 966 00:42:03,820 --> 00:42:06,279 information is coded by this overall shift versus the patterns. 967 00:42:06,279 --> 00:42:08,903 So what I did here was I used a Poisson Naive Bayes classifier, 968 00:42:08,903 --> 00:42:11,950 which takes into account both the overall magnitude and also 969 00:42:11,950 --> 00:42:12,940 the patterns. 970 00:42:12,940 --> 00:42:15,610 I used a classifier called minimum angle 971 00:42:15,610 --> 00:42:17,740 that took only the patterns into account. 972 00:42:17,740 --> 00:42:20,140 And I used a classifier called the total population 973 00:42:20,140 --> 00:42:23,200 activity that only took into account the average activity 974 00:42:23,200 --> 00:42:25,440 of the whole population. 975 00:42:25,440 --> 00:42:28,270 This classifier's pretty dumb, but in a certain sense, 976 00:42:28,270 --> 00:42:30,670 it's what fMRI is doing, just averaging 977 00:42:30,670 --> 00:42:33,640 all your neurons together. 978 00:42:33,640 --> 00:42:36,050 So it's a little bit of a proxy. 979 00:42:36,050 --> 00:42:38,190 There's a paper, also, by Elias Issa 980 00:42:38,190 --> 00:42:41,255 and Jim DiCarlo where they show that fMRI is actually fairly-- 981 00:42:41,255 --> 00:42:43,630 or somewhat strongly correlated with the average activity 982 00:42:43,630 --> 00:42:45,003 of a whole population. 983 00:42:47,590 --> 00:42:50,350 So let's see how these classifiers compare 984 00:42:50,350 --> 00:42:52,540 to each other to see where the information is 985 00:42:52,540 --> 00:42:54,180 being coded in the activity. 986 00:42:54,180 --> 00:42:58,570 Again, I'm going to use this study from Doris and Winrich 987 00:42:58,570 --> 00:43:01,750 where we're going to be looking at the pose-specific face 988 00:43:01,750 --> 00:43:03,440 information, just as an example. 989 00:43:03,440 --> 00:43:05,500 So this is decoding those 25 individuals 990 00:43:05,500 --> 00:43:07,160 when the classifier is trained and tested 991 00:43:07,160 --> 00:43:08,350 on that exact same head pose.
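To make the contrast concrete, here are rough sketches of the two extreme classifiers just described: a minimum-angle classifier that only looks at the relative pattern across neurons, and a total-population-activity classifier that only looks at the summed response. These are illustrative NumPy versions with made-up names, not the exact implementations used in the analysis, and the Poisson Naive Bayes classifier, which uses both sources of information, is omitted for brevity.

```python
import numpy as np

def min_angle_classify(X_train, y_train, X_test):
    """Pattern-only classifier: assign each test trial to the class whose mean
    response pattern makes the smallest angle with it (overall scale is ignored)."""
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    means = means / (np.linalg.norm(means, axis=1, keepdims=True) + 1e-12)
    X_norm = X_test / (np.linalg.norm(X_test, axis=1, keepdims=True) + 1e-12)
    return classes[np.argmax(X_norm @ means.T, axis=1)]

def total_activity_classify(X_train, y_train, X_test):
    """Total-activity classifier: ignore the pattern entirely and use only the
    summed population response, roughly what averaging over a region does."""
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    class_totals = np.array([X_train[y_train == c].sum(axis=1).mean() for c in classes])
    test_totals = X_test.sum(axis=1)
    return classes[np.argmin(np.abs(test_totals[:, None] - class_totals[None, :]), axis=1)]
```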
992 00:43:11,440 --> 00:43:15,730 And so what we see is that when we use the Poisson 993 00:43:15,730 --> 00:43:19,030 Naive Bayes classifier that took the pattern and also 994 00:43:19,030 --> 00:43:22,030 the total activity into account, and when 995 00:43:22,030 --> 00:43:24,830 we used the classifier that took just the pattern into account, 996 00:43:24,830 --> 00:43:28,610 the minimum angle, we're getting similar results. 997 00:43:28,610 --> 00:43:31,305 So the overall activity was not really adding much. 998 00:43:31,305 --> 00:43:33,430 But if you just use the overall activity by itself, 999 00:43:33,430 --> 00:43:35,380 it was pretty poor. 1000 00:43:35,380 --> 00:43:37,257 So this is, again, touching on something 1001 00:43:37,257 --> 00:43:39,340 about what Rebecca said, when you start averaging, 1002 00:43:39,340 --> 00:43:40,580 you can lose a lot. 1003 00:43:40,580 --> 00:43:43,130 And so you might be blind to a lot of what's going on 1004 00:43:43,130 --> 00:43:45,190 if you're just using voxels. 1005 00:43:49,060 --> 00:43:53,890 There are reasons to do invasive recordings. 1006 00:43:53,890 --> 00:43:58,180 All right, and I think this might be my last point in terms 1007 00:43:58,180 --> 00:43:59,470 of neural coding. 1008 00:43:59,470 --> 00:44:02,830 But this is the question of the independent neuron code. 1009 00:44:02,830 --> 00:44:05,950 So is there more information if you take into account 1010 00:44:05,950 --> 00:44:09,005 the joint activity of all neurons simultaneously, 1011 00:44:09,005 --> 00:44:11,015 so if you had simultaneous recordings 1012 00:44:11,015 --> 00:44:13,390 and took that into account, versus the pseudo-populations 1013 00:44:13,390 --> 00:44:16,150 I'm doing where you are treating each neuron as if it 1014 00:44:16,150 --> 00:44:18,880 were statistically independent. 1015 00:44:18,880 --> 00:44:21,910 And so this is a very, very simple analysis. 1016 00:44:21,910 --> 00:44:25,450 Here I just did the decoding in an experiment 1017 00:44:25,450 --> 00:44:27,130 where we had simultaneous recordings 1018 00:44:27,130 --> 00:44:29,560 and compared it to using that same data but using 1019 00:44:29,560 --> 00:44:35,050 pseudo-populations built from that data, using very simple classifiers. 1020 00:44:35,050 --> 00:44:36,460 And so here are the results. 1021 00:44:36,460 --> 00:44:38,980 What I found was that in this one case 1022 00:44:38,980 --> 00:44:40,660 there was a little bit of extra information 1023 00:44:40,660 --> 00:44:42,370 in the simultaneous recordings as 1024 00:44:42,370 --> 00:44:44,534 compared to the pseudo-populations. 1025 00:44:44,534 --> 00:44:47,200 But you know, it wouldn't really change many of your conclusions 1026 00:44:47,200 --> 00:44:48,158 about what's happening. 1027 00:44:48,158 --> 00:44:50,780 It's like, you know, maybe a 5% increase or something. 1028 00:44:50,780 --> 00:44:53,590 And this has been seen in a lot of the literature. 1029 00:44:53,590 --> 00:44:56,110 Then there's the question of temporal precision, 1030 00:44:56,110 --> 00:44:58,412 or what is sometimes called temporal coding. 1031 00:44:58,412 --> 00:45:00,370 You know, in some of the experiments 1032 00:45:00,370 --> 00:45:03,130 I was using a 100 millisecond bin, sometimes I was using 500. 1033 00:45:03,130 --> 00:45:05,141 What happens when you change the bin size?
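Before getting to that bin-size question, here is a rough sketch of how a pseudo-population can be built from neurons that were not recorded simultaneously, which makes the independence assumption explicit. The function and variable names are hypothetical, and this is not the actual analysis code.

```python
import numpy as np

def build_pseudo_population(per_neuron_rates, per_neuron_labels, conditions,
                            n_pseudo_trials, rng=None):
    """Combine independently recorded neurons into pseudo-population 'trials'.

    per_neuron_rates: list of 1-D arrays, one per neuron, of firing rates per trial.
    per_neuron_labels: matching list of per-trial condition labels.
    For every condition, each pseudo-trial takes one randomly chosen same-condition
    trial from every neuron, so trial-by-trial (noise) correlations are discarded
    and the neurons are effectively treated as statistically independent.
    """
    rng = rng or np.random.default_rng(0)
    X, y = [], []
    for cond in conditions:
        for _ in range(n_pseudo_trials):
            row = [rates[rng.choice(np.flatnonzero(np.asarray(labels) == cond))]
                   for rates, labels in zip(per_neuron_rates, per_neuron_labels)]
            X.append(row)
            y.append(cond)
    return np.array(X), np.array(y)   # pseudo_trials x neurons matrix, and labels
```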
1034 00:45:05,141 --> 00:45:06,890 What happens, this is pretty clear, again, 1035 00:45:06,890 --> 00:45:08,835 from a lot of studies that I've done, 1036 00:45:08,835 --> 00:45:11,460 when you increase the bin size, generally the decoding accuracy 1037 00:45:11,460 --> 00:45:13,350 goes up. 1038 00:45:13,350 --> 00:45:15,270 What you lose is temporal precision, 1039 00:45:15,270 --> 00:45:17,730 because now you're blurring over a much bigger area. 1040 00:45:17,730 --> 00:45:21,330 So in terms of your understanding what's going on, 1041 00:45:21,330 --> 00:45:24,840 you have to find the right point between having 1042 00:45:24,840 --> 00:45:27,226 a very clear result by having a larger bin versus you 1043 00:45:27,226 --> 00:45:28,600 caring about the time information 1044 00:45:28,600 --> 00:45:29,600 and using a smaller bin. 1045 00:45:32,700 --> 00:45:36,356 And I haven't seen that I need like one millisecond resolution 1046 00:45:36,356 --> 00:45:37,980 or a very complicated classifier that's 1047 00:45:37,980 --> 00:45:40,440 taking every single spike time into account to help me. 1048 00:45:40,440 --> 00:45:42,930 But again, I haven't explored this as fully as I could. 1049 00:45:42,930 --> 00:45:44,700 So it would be interesting for someone 1050 00:45:44,700 --> 00:45:47,280 to use a method [INAUDIBLE] that people really 1051 00:45:47,280 --> 00:45:50,940 love to claim that things are coded in patterns in time. 1052 00:45:50,940 --> 00:45:53,220 You know, if you want to, go for it. 1053 00:45:53,220 --> 00:45:54,260 Show me it. 1054 00:45:54,260 --> 00:45:55,650 I've got some data available. 1055 00:45:55,650 --> 00:45:58,650 Build a classifier that does that and we can compare it. 1056 00:45:58,650 --> 00:46:01,950 But I haven't seen it yet. 1057 00:46:01,950 --> 00:46:03,645 So a summary of the neural coding. 1058 00:46:03,645 --> 00:46:07,290 Decoding allows you to examine many questions, such as is 1059 00:46:07,290 --> 00:46:08,540 there a compact code. 1060 00:46:08,540 --> 00:46:11,040 So is there just a few neurons that has all the information. 1061 00:46:11,040 --> 00:46:12,180 Is there a dynamic code. 1062 00:46:12,180 --> 00:46:13,971 So is the pattern of activity that's coding 1063 00:46:13,971 --> 00:46:16,350 information changing in time. 1064 00:46:16,350 --> 00:46:19,670 Are neurons independent or is there more information coded 1065 00:46:19,670 --> 00:46:21,990 in their joint activity. 1066 00:46:21,990 --> 00:46:24,010 And what is the temporal precision. 1067 00:46:24,010 --> 00:46:25,756 And this is, again, not everything, 1068 00:46:25,756 --> 00:46:27,750 there are many other questions you could ask. 1069 00:46:30,072 --> 00:46:31,905 Any other questions about the neural coding? 1070 00:46:37,670 --> 00:46:40,250 Just a few other things to mention. 1071 00:46:40,250 --> 00:46:43,490 So you know, I was talking all about, basically, spiking data. 1072 00:46:43,490 --> 00:46:46,880 But you can also do decoding from MEG data. 1073 00:46:46,880 --> 00:46:49,760 So there was a great study by Leyla 1074 00:46:49,760 --> 00:46:53,570 where she tried to decode from MEG signals. 1075 00:46:53,570 --> 00:46:56,170 Here's just one example from that paper where 1076 00:46:56,170 --> 00:46:59,740 she was trying to decode which letter of the alphabet, 1077 00:46:59,740 --> 00:47:01,730 or at least 25 of the 26 letters, 1078 00:47:01,730 --> 00:47:05,660 was shown to a subject, a human subject in an MEG scanner. 
1079 00:47:05,660 --> 00:47:08,360 You know, you can see it's very nice--before the stimulus comes on, 1080 00:47:08,360 --> 00:47:10,522 decoding is at chance; people are not psychic either. 1081 00:47:10,522 --> 00:47:12,980 And then at the time slightly after the stimulus is shown, 1082 00:47:12,980 --> 00:47:14,780 you can decode quite well. 1083 00:47:14,780 --> 00:47:17,030 And things are above chance. 1084 00:47:17,030 --> 00:47:20,684 And then she went on to examine position invariance 1085 00:47:20,684 --> 00:47:22,850 in different parts of the brain, and the timing of that. 1086 00:47:22,850 --> 00:47:24,590 So you can check out that paper as well. 1087 00:47:27,290 --> 00:47:34,280 And as Rebecca mentioned, this kind of approach 1088 00:47:34,280 --> 00:47:35,984 has really taken off in fMRI. 1089 00:47:35,984 --> 00:47:37,400 Here are three different toolboxes 1090 00:47:37,400 --> 00:47:40,087 you could use if you're doing fMRI. 1091 00:47:40,087 --> 00:47:42,170 So I wrote a toolbox I will talk about in a minute 1092 00:47:42,170 --> 00:47:44,400 to do neural decoding, and I recommend it for that. 1093 00:47:44,400 --> 00:47:46,377 But if you're going to do fMRI decoding, 1094 00:47:46,377 --> 00:47:48,710 you probably are better off using one of these toolboxes 1095 00:47:48,710 --> 00:47:51,660 because they have certain things that are fMRI specific, 1096 00:47:51,660 --> 00:47:54,470 such as mapping back to voxels, that my toolbox doesn't have. 1097 00:47:54,470 --> 00:47:56,590 Although you could, in principle, 1098 00:47:56,590 --> 00:47:58,580 throw fMRI data into my toolbox as well. 1099 00:48:02,440 --> 00:48:05,100 And then all these studies I've mentioned so far 1100 00:48:05,100 --> 00:48:08,760 have had a kind of structure where every trial is exactly 1101 00:48:08,760 --> 00:48:12,630 the same length, as Tyler pointed out. 1102 00:48:12,630 --> 00:48:14,040 And if you wanted to do something 1103 00:48:14,040 --> 00:48:16,500 where the data aren't structured that way, such as decoding 1104 00:48:16,500 --> 00:48:19,050 from a rat running around a maze where it wasn't always doing 1105 00:48:19,050 --> 00:48:21,860 things in the same amount of time, 1106 00:48:21,860 --> 00:48:27,060 there's a toolbox that came out of Emery Brown's lab that 1107 00:48:27,060 --> 00:48:28,710 should hopefully enable you to do some 1108 00:48:28,710 --> 00:48:30,108 of those kinds of analyses.
1122 00:49:07,050 --> 00:49:09,560 And you could see how much the variable 1123 00:49:09,560 --> 00:49:11,976 you're interested in accounts for the total variability 1124 00:49:11,976 --> 00:49:14,090 in a population. 1125 00:49:14,090 --> 00:49:16,070 Also, I hinted at this throughout the talk, 1126 00:49:16,070 --> 00:49:19,337 just because information is present doesn't mean it's used. 1127 00:49:19,337 --> 00:49:21,920 The back of the head stuff might be an example of that or not, 1128 00:49:21,920 --> 00:49:22,790 I don't know. 1129 00:49:22,790 --> 00:49:24,710 But you just have to be careful interpreting the results 1130 00:49:24,710 --> 00:49:27,110 and not say, the information is there, 1131 00:49:27,110 --> 00:49:29,650 therefore this is the brain region doing x. 1132 00:49:29,650 --> 00:49:32,510 A lot of stuff can kind of sneak in. 1133 00:49:32,510 --> 00:49:36,810 Timing information can also be really interesting. 1134 00:49:36,810 --> 00:49:38,200 That's something I've been exploring this summer. 1135 00:49:38,200 --> 00:49:41,390 So if you can know the relative timing, when information 1136 00:49:41,390 --> 00:49:43,010 is in one brain region versus another, 1137 00:49:43,010 --> 00:49:46,440 it can tell you a lot about kind of the flow of information 1138 00:49:46,440 --> 00:49:48,920 and the computation that brain regions might be doing. 1139 00:49:48,920 --> 00:49:53,930 So I think that's another very promising area to explore. 1140 00:49:53,930 --> 00:49:57,290 Also, decoding kind of focuses on the computational level 1141 00:49:57,290 --> 00:50:00,202 or algorithmic level, or really neural representations 1142 00:50:00,202 --> 00:50:01,910 if you thought about Marr's three levels. 1143 00:50:01,910 --> 00:50:04,535 It doesn't talk about this kind of implementational, mechanistic 1144 00:50:04,535 --> 00:50:05,240 level. 1145 00:50:05,240 --> 00:50:07,670 So [INAUDIBLE] that's one thing it can't do. 1146 00:50:07,670 --> 00:50:09,789 Now if you have the flow of information going 1147 00:50:09,789 --> 00:50:12,080 through an area and you understand that well and what's 1148 00:50:12,080 --> 00:50:13,580 being represented, I think you might 1149 00:50:13,580 --> 00:50:16,580 be able to back out some of these mechanisms or processes 1150 00:50:16,580 --> 00:50:18,020 of how that can be built up. 1151 00:50:18,020 --> 00:50:22,864 But in and of itself, decoding doesn't give you that. 1152 00:50:22,864 --> 00:50:24,530 Also, decoding methods can be 1153 00:50:24,530 --> 00:50:25,820 computationally intensive. 1154 00:50:25,820 --> 00:50:27,645 An analysis can take up to an hour. 1155 00:50:27,645 --> 00:50:29,270 If you do something really complicated, 1156 00:50:29,270 --> 00:50:31,832 it can take you a week to run something very elaborate. 1157 00:50:31,832 --> 00:50:33,290 You know, sometimes it can be quick 1158 00:50:33,290 --> 00:50:35,040 and you can do it in a few minutes, 1159 00:50:35,040 --> 00:50:38,000 but it's certainly a lot slower than doing something 1160 00:50:38,000 --> 00:50:41,154 like an activity index where you're done in two seconds 1161 00:50:41,154 --> 00:50:43,070 and then you have the wrong answer right away. 1162 00:50:48,410 --> 00:50:50,780 Let me just spend like five more minutes talking 1163 00:50:50,780 --> 00:50:52,580 about this toolbox and then you can all 1164 00:50:52,580 --> 00:50:54,990 go work on your projects and do what you want to do.
1165 00:50:54,990 --> 00:50:57,260 So this is a toolbox I made called the Neural Decoding 1166 00:50:57,260 --> 00:50:57,779 Toolbox. 1167 00:50:57,779 --> 00:50:59,320 There's a paper about it in Frontiers 1168 00:50:59,320 --> 00:51:01,505 in Neuroinformatics in 2013. 1169 00:51:01,505 --> 00:51:04,130 And the whole point of it was to try to make it easy for people 1170 00:51:04,130 --> 00:51:07,400 to do these analyses because [INAUDIBLE]. 1171 00:51:07,400 --> 00:51:10,550 And so basically, here are like six lines of code 1172 00:51:10,550 --> 00:51:13,599 that, if you ran them, would do one of those analyses for you. 1173 00:51:13,599 --> 00:51:15,140 And not only is it six lines of code, 1174 00:51:15,140 --> 00:51:17,972 but it's almost literally these exact same six lines of code. 1175 00:51:17,972 --> 00:51:19,430 The only thing you'd, like, replace 1176 00:51:19,430 --> 00:51:22,850 would be your data rather than this data file. 1177 00:51:22,850 --> 00:51:31,393 And so what you can do, the whole idea behind it 1178 00:51:31,393 --> 00:51:33,660 is a kind of open science idea, you know, 1179 00:51:33,660 --> 00:51:36,299 I want more transparency so I'm sharing my code. 1180 00:51:36,299 --> 00:51:38,840 If you use my code, ultimately, if you could share your data, 1181 00:51:38,840 --> 00:51:40,970 that would be great because I think 1182 00:51:40,970 --> 00:51:42,380 I wouldn't have been able to develop any of this stuff 1183 00:51:42,380 --> 00:51:43,940 if people hadn't shared data with me. 1184 00:51:43,940 --> 00:51:46,670 I think we'll make a lot more progress in science 1185 00:51:46,670 --> 00:51:49,877 if we're open and share. 1186 00:51:49,877 --> 00:51:50,960 There you go, I'm a hippy. 1187 00:51:55,920 --> 00:52:01,010 And here's the website for the toolbox, www.readout.info. 1188 00:52:01,010 --> 00:52:03,570 Let me just talk briefly a little bit more about the toolbox. 1189 00:52:03,570 --> 00:52:08,670 The way it was designed is around four abstract classes. 1190 00:52:08,670 --> 00:52:11,759 So these are kind of major pieces or objects 1191 00:52:11,759 --> 00:52:13,300 that you can kind of swap in and out. 1192 00:52:13,300 --> 00:52:14,810 They're like components that allow 1193 00:52:14,810 --> 00:52:16,950 you to do different things. 1194 00:52:16,950 --> 00:52:20,100 So for example, one of the components is a data source. 1195 00:52:20,100 --> 00:52:23,487 This creates the training and test sets of data. 1196 00:52:23,487 --> 00:52:25,320 You can separate that out in different ways, 1197 00:52:25,320 --> 00:52:28,430 like there's just a standard one but you can swap it out 1198 00:52:28,430 --> 00:52:32,600 to do that invariance or abstract analysis. 1199 00:52:32,600 --> 00:52:34,760 Or you can do things like, I guess, change 1200 00:52:34,760 --> 00:52:38,310 the different binning schemes within that piece of code. 1201 00:52:38,310 --> 00:52:40,310 So that's one component you can swap in and out. 1202 00:52:40,310 --> 00:52:42,560 Another one is these preprocessors. 1203 00:52:42,560 --> 00:52:45,140 What they do is they apply pre-processing to your training 1204 00:52:45,140 --> 00:52:47,690 data, and then use those parameters that 1205 00:52:47,690 --> 00:52:51,410 were learned on the training set to apply the same processing 1206 00:52:51,410 --> 00:52:53,370 to the test set as well.
1207 00:52:53,370 --> 00:52:55,720 So for example, when I was selecting the best neurons, 1208 00:52:55,720 --> 00:52:58,490 I used a preprocessor that just 1209 00:52:58,490 --> 00:53:00,620 found the good neurons in the training set, 1210 00:53:00,620 --> 00:53:02,510 just used those, and then also eliminated 1211 00:53:02,510 --> 00:53:04,030 all the other neurons in the test set. 1212 00:53:04,030 --> 00:53:05,446 And so there are different, again, 1213 00:53:05,446 --> 00:53:07,400 components you can swap in and out with that. 1214 00:53:07,400 --> 00:53:10,640 An obvious component you can swap in and out: classifiers. 1215 00:53:10,640 --> 00:53:13,040 You could throw in a classifier that takes correlations 1216 00:53:13,040 --> 00:53:14,502 into account or doesn't. 1217 00:53:14,502 --> 00:53:15,710 Or do whatever you want here. 1218 00:53:15,710 --> 00:53:18,860 You know, use some highly nonlinear or somewhat nonlinear 1219 00:53:18,860 --> 00:53:23,100 thing and see whether the brain is doing it that way. 1220 00:53:23,100 --> 00:53:26,930 And there's this final piece called the cross-validator. 1221 00:53:26,930 --> 00:53:29,330 It basically runs the whole cross-validation loop. 1222 00:53:29,330 --> 00:53:31,520 It pulls data from the data source, 1223 00:53:31,520 --> 00:53:33,270 creating training and test sets. 1224 00:53:33,270 --> 00:53:35,190 It applies the feature preprocessor. 1225 00:53:35,190 --> 00:53:37,437 It trains the classifier and reports the results. 1226 00:53:37,437 --> 00:53:40,020 Generally, I've only written one of these and it's pretty long 1227 00:53:40,020 --> 00:53:41,010 and does a lot of different things, 1228 00:53:41,010 --> 00:53:42,759 like gives you different types of results. 1229 00:53:42,759 --> 00:53:44,610 So it doesn't just tell you whether there is information, 1230 00:53:44,610 --> 00:53:46,600 but gives you mutual information and all these other things. 1231 00:53:46,600 --> 00:53:48,808 But again, if you wanted to, you could expand on that 1232 00:53:48,808 --> 00:53:50,740 and do the cross-validation in different ways. 1233 00:53:54,690 --> 00:53:57,770 If you wanted to get started on your own data, 1234 00:53:57,770 --> 00:54:00,570 you just have to put your data in a fairly simple format. 1235 00:54:00,570 --> 00:54:03,770 It's a format I call raster format. 1236 00:54:03,770 --> 00:54:05,120 It's just in a raster. 1237 00:54:05,120 --> 00:54:06,770 So you just have trials going this way. 1238 00:54:06,770 --> 00:54:07,604 Time going this way. 1239 00:54:07,604 --> 00:54:09,061 And if it was spikes, it would just 1240 00:54:09,061 --> 00:54:11,690 be the ones and zeros that happen on the different trials. 1241 00:54:11,690 --> 00:54:14,390 If this was MEG data, you'd have your 1242 00:54:14,390 --> 00:54:16,350 actual continuous MEG values in there. 1243 00:54:16,350 --> 00:54:19,240 Again, trials and time. 1244 00:54:19,240 --> 00:54:20,620 Or fMRI or whatever. 1245 00:54:20,620 --> 00:54:25,010 fMRI might just be one vector if you didn't have any time. 1246 00:54:25,010 --> 00:54:26,970 And so again, this is just blown up. 1247 00:54:26,970 --> 00:54:27,890 This was trials. 1248 00:54:27,890 --> 00:54:28,700 This is time. 1249 00:54:28,700 --> 00:54:31,970 You can have the little ones where a spike occurred. 1250 00:54:31,970 --> 00:54:33,800 And then, corresponding to each trial, 1251 00:54:33,800 --> 00:54:36,500 you need to give the labels about what happened. 1252 00:54:36,500 --> 00:54:39,220 So you'd have just something called raster labels.
1253 00:54:39,220 --> 00:54:39,987 It's a structure. 1254 00:54:39,987 --> 00:54:42,320 And you'd say, OK, on the first trial I showed a flower. 1255 00:54:42,320 --> 00:54:43,528 Second trial I showed a face. 1256 00:54:43,528 --> 00:54:45,677 Third trial I showed a couch. 1257 00:54:45,677 --> 00:54:47,760 And these could be numbers or whatever you wanted. 1258 00:54:47,760 --> 00:54:49,490 But it's just indicating that different things are 1259 00:54:49,490 --> 00:54:50,810 happening on different trials. 1260 00:54:50,810 --> 00:54:53,910 And you can also have multiple ones of these. 1261 00:54:53,910 --> 00:54:56,120 So if I want to decode position, I also 1262 00:54:56,120 --> 00:54:57,470 have upper, middle, lower. 1263 00:54:57,470 --> 00:54:59,928 And so you can use the same data and decode different types 1264 00:54:59,928 --> 00:55:02,060 of things from that data set. 1265 00:55:02,060 --> 00:55:03,950 And then there's this final piece of information 1266 00:55:03,950 --> 00:55:05,160 that's kind of optional. 1267 00:55:05,160 --> 00:55:06,530 It's just raster site info. 1268 00:55:06,530 --> 00:55:09,200 So for each site you could have just meta information: 1269 00:55:09,200 --> 00:55:12,500 this is the recording I made on January 14 1270 00:55:12,500 --> 00:55:15,470 and it was recorded from IT. 1271 00:55:18,054 --> 00:55:19,970 So you just define these three things and then 1272 00:55:19,970 --> 00:55:23,040 the toolbox is plug and play. 1273 00:55:23,040 --> 00:55:26,360 So with some experience you should be able to do that. 1274 00:55:26,360 --> 00:55:27,540 So that's it. 1275 00:55:27,540 --> 00:55:30,410 I want to thank the Center for Brains, Minds and Machines 1276 00:55:30,410 --> 00:55:32,180 for funding this work. 1277 00:55:32,180 --> 00:55:35,480 And all my collaborators who collected the data or who 1278 00:55:35,480 --> 00:55:38,180 worked with me to analyze it. 1279 00:55:38,180 --> 00:55:41,840 And there is the URL for the toolbox 1280 00:55:41,840 --> 00:55:44,290 if you want to download it.
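For reference, here is a rough, Python-style illustration of the raster-format idea described above. The toolbox itself is written in MATLAB, so these exact structures and field values are just hypothetical stand-ins for the raster data, raster labels, and raster site info structures described in the talk.

```python
import numpy as np

# Hypothetical sketch of one recording site's data in "raster format".
n_trials, n_time = 60, 1000                        # 60 trials, 1 ms bins over 1 second

# Spiking data: 1 where a spike occurred, 0 elsewhere. MEG would hold continuous
# values instead, and fMRI with no time dimension could be one value per trial.
raster_data = np.zeros((n_trials, n_time), dtype=int)
raster_data[0, [120, 250, 251, 600]] = 1           # made-up spike times on trial 1

# One label per trial; several label sets can coexist, so the same data can be
# decoded in different ways (object identity, position, and so on).
raster_labels = {
    "stimulus_id":       ["flower", "face", "couch"] * 20,
    "stimulus_position": ["upper", "middle", "lower"] * 20,
}

# Optional metadata about the recording site.
raster_site_info = {"recording_date": "hypothetical", "brain_area": "IT"}
```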