1 00:00:00,080 --> 00:00:02,430 The following content is provided under a Creative 2 00:00:02,430 --> 00:00:03,810 Commons license. 3 00:00:03,810 --> 00:00:06,050 Your support will help MIT OpenCourseWare 4 00:00:06,050 --> 00:00:10,170 continue to offer high quality educational resources for free. 5 00:00:10,170 --> 00:00:12,690 To make a donation or to view additional materials 6 00:00:12,690 --> 00:00:16,600 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:16,600 --> 00:00:17,305 at ocw.mit.edu. 8 00:00:20,814 --> 00:00:22,730 PROFESSOR: Right, so now I'm going to walk you 9 00:00:22,730 --> 00:00:24,140 through some examples. 10 00:00:24,140 --> 00:00:29,490 And just so you know where this stuff is-- so you all 11 00:00:29,490 --> 00:00:31,440 have your LLGrid accounts. 12 00:00:31,440 --> 00:00:33,900 And in your LLGrid accounts-- and I'm looking over here 13 00:00:33,900 --> 00:00:35,440 because these people are going to check to make sure 14 00:00:35,440 --> 00:00:36,880 I don't say anything wrong. 15 00:00:36,880 --> 00:00:38,780 So this is, by the way, this is [INAUDIBLE] 16 00:00:38,780 --> 00:00:40,670 over here and Julie Mullin. 17 00:00:40,670 --> 00:00:45,780 They are our expert consultants, PhDs in computational science, 18 00:00:45,780 --> 00:00:47,420 that help you all. 19 00:00:47,420 --> 00:00:50,400 And we're all eternally grateful to them 20 00:00:50,400 --> 00:00:54,080 for helping us get all this technology to work. 21 00:00:54,080 --> 00:00:58,060 So if you go to your LLGrid account, 22 00:00:58,060 --> 00:01:02,332 there should be a link in there called Tools. 23 00:01:02,332 --> 00:01:04,040 This is where all the software is that we 24 00:01:04,040 --> 00:01:05,750 provide as part of LLGrid. 25 00:01:05,750 --> 00:01:08,510 And there'll be one called d4m_api in there. 26 00:01:08,510 --> 00:01:11,340 So there'll be a folder just like this. 
27 00:01:11,340 --> 00:01:17,490 And just so you know-- so all the lectures are there for you. 28 00:01:17,490 --> 00:01:20,530 And I am putting them all in for public release. 29 00:01:20,530 --> 00:01:23,160 And all the software we're going to post on the internet too. 30 00:01:23,160 --> 00:01:26,140 And so just make it easier for you 31 00:01:26,140 --> 00:01:28,530 to use, use with your government-- use 32 00:01:28,530 --> 00:01:30,492 with your sponsors, use with your projects, 33 00:01:30,492 --> 00:01:31,450 all that kind of stuff. 34 00:01:31,450 --> 00:01:34,860 We try and do that so that people aren't constantly asking 35 00:01:34,860 --> 00:01:36,100 us, well, what can I share? 36 00:01:36,100 --> 00:01:38,058 It's like, I'm going to put it on the internet. 37 00:01:38,058 --> 00:01:38,950 So there you go. 38 00:01:42,240 --> 00:01:44,360 We want to focus on the examples directory here. 39 00:01:47,100 --> 00:01:50,860 The order of the examples is numerical order. 40 00:01:50,860 --> 00:01:53,960 So we are going to go through examples 41 00:01:53,960 --> 00:01:56,640 in the folder called one and then two and then three. 42 00:01:56,640 --> 00:01:58,890 And this really corresponds to kind of the first three 43 00:01:58,890 --> 00:02:01,590 lectures, then the next three lectures, 44 00:02:01,590 --> 00:02:04,650 then the next three lectures, OK? 45 00:02:04,650 --> 00:02:08,720 And so today, we're going to go in here to this first one. 46 00:02:08,720 --> 00:02:11,465 And then here is kind of like the first three of those three. 47 00:02:11,465 --> 00:02:12,840 We're going to go and we're going 48 00:02:12,840 --> 00:02:17,530 to review this set here, OK? 49 00:02:17,530 --> 00:02:22,180 So these are the-- the examples of have this sort of-- they 50 00:02:22,180 --> 00:02:23,760 basically take the first two letters 51 00:02:23,760 --> 00:02:29,540 of the folder-- in this case, [INAUDIBLE] intro AI 1234. 
52 00:02:29,540 --> 00:02:30,810 Those are the actual examples. 53 00:02:30,810 --> 00:02:34,297 If you see other files in there, those are supporting files. 54 00:02:34,297 --> 00:02:35,505 You don't run those directly. 55 00:02:38,040 --> 00:02:40,940 So I'm going to start my-- I'm going to-- I always run 56 00:02:40,940 --> 00:02:41,960 MATLAB from the shell. 57 00:02:41,960 --> 00:02:44,150 Other people run it from the IDE. 58 00:02:44,150 --> 00:02:47,320 I'm going to create a shell here. 59 00:02:51,930 --> 00:02:53,100 [INAUDIBLE] folder. 60 00:02:53,100 --> 00:02:56,350 I'm going to start my MATLAB up. 61 00:02:56,350 --> 00:02:58,050 This will take a minute because we're 62 00:02:58,050 --> 00:03:06,238 using 2012b, which is a little bit slower on this computer. 63 00:03:06,238 --> 00:03:07,210 All right, there we go. 64 00:03:13,016 --> 00:03:14,640 And you'll be pleased to know I develop 65 00:03:14,640 --> 00:03:18,312 all the D4M software on this little computer. 66 00:03:18,312 --> 00:03:20,020 And people are going, well, why don't you 67 00:03:20,020 --> 00:03:23,410 get a big giant workstation to develop your programs on? 68 00:03:23,410 --> 00:03:25,900 It's like, well, because if it feels OK for me 69 00:03:25,900 --> 00:03:27,810 on this computer, I know that most of you 70 00:03:27,810 --> 00:03:29,270 have much better computers. 71 00:03:29,270 --> 00:03:32,830 And it will feel-- should feel very, very, very good for you. 72 00:03:32,830 --> 00:03:34,251 All right, so we're in the folder. 73 00:03:34,251 --> 00:03:36,750 And I'm just going to run the first program-- AI1_SetupTEST. 74 00:03:43,780 --> 00:03:46,080 Actually, before we even get into that, 75 00:03:46,080 --> 00:03:47,830 the first thing you're going to want to do 76 00:03:47,830 --> 00:03:51,920 is check before you run this that your D4M code has 77 00:03:51,920 --> 00:03:52,910 been properly set up. 
78 00:03:52,910 --> 00:03:55,545 The simplest way to do that is when you start MATLAB, 79 00:03:55,545 --> 00:04:00,930 if you type help D4M and you get a list 80 00:04:00,930 --> 00:04:03,990 of all the functions in D4M, then 81 00:04:03,990 --> 00:04:08,050 that means that your path is set up properly. 82 00:04:08,050 --> 00:04:11,580 And this is just a list of all the different functions. 83 00:04:11,580 --> 00:04:13,900 We break them down in different types. 84 00:04:13,900 --> 00:04:16,329 You've got a little sort of like how to set up here, 85 00:04:16,329 --> 00:04:19,100 although a lot of this stuff isn't-- is written for more 86 00:04:19,100 --> 00:04:20,772 people on the outside world. 87 00:04:20,772 --> 00:04:22,230 And we go through all the functions 88 00:04:22,230 --> 00:04:23,929 and we categorize the functions. 89 00:04:23,929 --> 00:04:25,720 You know, some are functions that we really 90 00:04:25,720 --> 00:04:28,020 expect to use all the time. 91 00:04:28,020 --> 00:04:30,590 Some are functions that you might use once 92 00:04:30,590 --> 00:04:31,745 in kind of rare instances. 93 00:04:31,745 --> 00:04:33,950 And then we also have functions like, look, 94 00:04:33,950 --> 00:04:35,200 you really-- they're there. 95 00:04:35,200 --> 00:04:36,010 They're in the library. 96 00:04:36,010 --> 00:04:37,830 These are really not meant for you to use, though. 97 00:04:37,830 --> 00:04:39,780 They're really internal supporting functions. 98 00:04:39,780 --> 00:04:42,050 But they're there in the library. 99 00:04:42,050 --> 00:04:45,840 All right, so that's the first thing you want to check 100 00:04:45,840 --> 00:04:47,910 is to make sure that's set up. 101 00:04:47,910 --> 00:04:54,680 If you have any issues, send email to grid-help@ll.mit.edu 102 00:04:54,680 --> 00:04:57,340 and the people will help you in and check it out. 103 00:04:57,340 --> 00:05:01,180 And don't be surprised if some of you do have an issue. 
104 00:05:01,180 --> 00:05:03,640 This is the first time we've really rolled it out 105 00:05:03,640 --> 00:05:05,300 to such a large audience. 106 00:05:05,300 --> 00:05:07,920 And so we absolutely expect people to have 107 00:05:07,920 --> 00:05:12,338 little things that will pop up. 108 00:05:12,338 --> 00:05:17,960 All right, so I'm going to run the first example here. 109 00:05:17,960 --> 00:05:21,010 I should say we also have tested all the stuff. 110 00:05:21,010 --> 00:05:26,920 For those of you who are utterly averse to commercial software, 111 00:05:26,920 --> 00:05:28,850 depending on your religious preferences, 112 00:05:28,850 --> 00:05:31,120 we also run with GNU Octave, which 113 00:05:31,120 --> 00:05:32,840 is the GPL version for those of you 114 00:05:32,840 --> 00:05:36,570 who refuse to run non-free software. 115 00:05:36,570 --> 00:05:38,860 And so this-- all this stuff should also 116 00:05:38,860 --> 00:05:41,520 work with that as well just for those people who 117 00:05:41,520 --> 00:05:43,710 prefer to use that type of environment. 118 00:05:43,710 --> 00:05:46,350 But generally, MATLAB-- you know, 119 00:05:46,350 --> 00:05:48,840 it's pretty available here. 120 00:05:48,840 --> 00:05:53,870 And so we certainly encourage you to use that. 121 00:05:53,870 --> 00:05:57,760 So we're going to run the first test here. 122 00:05:57,760 --> 00:06:01,730 AI1_SetupTEST-- Go. 123 00:06:01,730 --> 00:06:03,390 Yay, it worked. 124 00:06:03,390 --> 00:06:07,400 It's embarrassing when you're recording and these don't work. 125 00:06:07,400 --> 00:06:10,740 So I'm going to walk you through some really rudimentary stuff 126 00:06:10,740 --> 00:06:13,780 here in this example. 127 00:06:13,780 --> 00:06:16,850 So one of the things in D4M that you're dealing with a lot 128 00:06:16,850 --> 00:06:22,880 is lists of strings-- long list of strings, 129 00:06:22,880 --> 00:06:26,470 millions of distinct strings. 
130 00:06:26,470 --> 00:06:28,980 Now, MATLAB has data structures that 131 00:06:28,980 --> 00:06:33,185 do support lists of strings, cell arrays being one of them. 132 00:06:33,185 --> 00:06:35,560 There are other data structures that you can use to do that. 133 00:06:35,560 --> 00:06:36,920 They naturally support these. 134 00:06:36,920 --> 00:06:42,400 But they tend to be very memory intensive and very slow. 135 00:06:42,400 --> 00:06:45,700 And so given our whole thing here 136 00:06:45,700 --> 00:06:47,970 is we have a real focus on performance, 137 00:06:47,970 --> 00:06:49,480 me giving you a great tool that's 138 00:06:49,480 --> 00:06:54,520 1,000 times slower than other techniques is not very helpful. 139 00:06:54,520 --> 00:06:57,470 So we are going to be dealing with lists of strings 140 00:06:57,470 --> 00:06:58,090 all the time. 141 00:06:58,090 --> 00:07:03,140 And so in D4M, a list of strings is a row vector of characters, 142 00:07:03,140 --> 00:07:03,640 OK? 143 00:07:03,640 --> 00:07:05,140 I've highlighted here. 144 00:07:05,140 --> 00:07:09,540 And the last character in the row vector is the delimiter. 145 00:07:09,540 --> 00:07:11,290 In this case, it's the comma. 146 00:07:11,290 --> 00:07:12,620 You're at the end. 147 00:07:12,620 --> 00:07:14,210 That is the delimiter. 148 00:07:14,210 --> 00:07:17,910 It can be whatever you want it to be. 149 00:07:17,910 --> 00:07:19,140 It can be semicolon. 150 00:07:19,140 --> 00:07:20,655 It can be a dash. 151 00:07:20,655 --> 00:07:22,500 It can be a space. 152 00:07:22,500 --> 00:07:27,080 I tend to recommend new line, which is ASCII character 10-- 153 00:07:27,080 --> 00:07:29,130 very safe delimiter. 154 00:07:29,130 --> 00:07:33,540 But whatever that last character is, that's the delimiter. 155 00:07:33,540 --> 00:07:36,530 And you could have different lists, different strings 156 00:07:36,530 --> 00:07:37,980 with different delimiters. 
157 00:07:37,980 --> 00:07:41,730 It should handle those situations just fine. 158 00:07:44,820 --> 00:07:47,800 But within a list of strings, it needs 159 00:07:47,800 --> 00:07:51,060 to be that-- that last character will be the delimiter. 160 00:07:51,060 --> 00:07:55,270 So I'm actually creating here a list. 161 00:07:55,270 --> 00:07:57,270 I'm going to create a set of triples here-- r, 162 00:07:57,270 --> 00:07:59,380 which is the rows, c, which is the columns, v, 163 00:07:59,380 --> 00:08:01,000 which is the values. 164 00:08:01,000 --> 00:08:03,870 And so I have an r, which is this. 165 00:08:03,870 --> 00:08:06,650 I have a set of c here, which is this. 166 00:08:06,650 --> 00:08:08,960 Vector here and then a list of values-- 167 00:08:08,960 --> 00:08:11,910 and the values in this case are just 168 00:08:11,910 --> 00:08:15,120 appending the row and the column together with a dash. 169 00:08:15,120 --> 00:08:18,360 And all three of these use the comma 170 00:08:18,360 --> 00:08:22,109 as-- it's the last character-- as the delimiter, all right? 171 00:08:22,109 --> 00:08:24,400 And now I'm going to create an associative array, which 172 00:08:24,400 --> 00:08:27,610 is the fundamental data structure in D4M. 173 00:08:27,610 --> 00:08:29,750 It's what allows us to bridge linear algebra 174 00:08:29,750 --> 00:08:31,430 and strings together. 175 00:08:31,430 --> 00:08:35,080 So this is a constructor command for an associative array-- 176 00:08:35,080 --> 00:08:36,190 so Assoc. 177 00:08:36,190 --> 00:08:41,030 And then we give it a list of row keys. 178 00:08:41,030 --> 00:08:44,080 These are called-- we often call them keys-- list of column keys 179 00:08:44,080 --> 00:08:46,710 and a list of string values. 180 00:08:46,710 --> 00:08:49,740 And this constructs the associative array a. 
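The string-list convention just described (one character vector whose final character is the delimiter) can be modeled in a few lines. This is a minimal Python sketch of the idea, not D4M code: the function names `split_string_list` and `join_string_list` and the sample keys are hypothetical, chosen only for illustration.

```python
def split_string_list(s):
    """Split a delimited string list; the final character is the delimiter."""
    if not s:
        return []
    delimiter = s[-1]
    # Drop the trailing delimiter, then split on it.
    return s[:-1].split(delimiter)


def join_string_list(items, delimiter=","):
    """Rebuild a string list, ending every item with the delimiter."""
    return "".join(item + delimiter for item in items)


# Two lists with different delimiters, as the lecture says is allowed.
row_keys = split_string_list("a,aa,b,")
col_keys = split_string_list("a;a;bb;")

# Values formed by appending row and column with a dash, as in the example.
values = [r + "-" + c for r, c in zip(row_keys, col_keys)]
triples = list(zip(row_keys, col_keys, values))
```

Because the delimiter is read from the last character, a newline-delimited list such as `"x\ny\n"` is handled by the same function without any extra argument.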
181 00:08:49,740 --> 00:08:51,660 And in MATLAB, since I haven't terminated 182 00:08:51,660 --> 00:08:54,127 this with a semicolon, it will now will print out 183 00:08:54,127 --> 00:08:54,960 the list of triples. 184 00:08:54,960 --> 00:08:57,970 So now you can see the list of triples we can construct here 185 00:08:57,970 --> 00:09:07,810 was row a, column a, value a-a, row aa, column a, value aa-a, 186 00:09:07,810 --> 00:09:10,310 and I won't read the whole list. 187 00:09:10,310 --> 00:09:10,810 Yes, Darryl. 188 00:09:10,810 --> 00:09:12,101 AUDIENCE: Question. [INAUDIBLE] 189 00:09:16,850 --> 00:09:19,850 PROFESSOR: That is an artifact of this printing out display. 190 00:09:19,850 --> 00:09:23,710 It will show you the delimiter that you actually used here 191 00:09:23,710 --> 00:09:26,707 just as-- and it's good to know that you-- 192 00:09:26,707 --> 00:09:28,040 AUDIENCE: It's not really there. 193 00:09:28,040 --> 00:09:29,150 PROFESSOR: It's not really there. 194 00:09:29,150 --> 00:09:30,810 Yeah, I mean, it's there and it's not there. 195 00:09:30,810 --> 00:09:32,240 But it's showing you the delimiter. 196 00:09:32,240 --> 00:09:34,630 And after a while, you just kind of learn to ignore that. 197 00:09:34,630 --> 00:09:36,921 The only time this does become a little bit of an issue 198 00:09:36,921 --> 00:09:40,060 is that that is the newline character. 199 00:09:40,060 --> 00:09:42,130 Then the formatting gets a little bit-- 200 00:09:42,130 --> 00:09:44,080 and we've actually built in routines 201 00:09:44,080 --> 00:09:45,913 that allow you to take an associative array, 202 00:09:45,913 --> 00:09:48,630 replace the delimiter with something nicer in one command 203 00:09:48,630 --> 00:09:52,940 so that you can print it out and doesn't look so crazy if you're 204 00:09:52,940 --> 00:09:54,587 using a new line. 
205 00:09:54,587 --> 00:09:56,670 Typically, though, you tend to be doing this print 206 00:09:56,670 --> 00:09:58,460 out command on small things. 207 00:09:58,460 --> 00:10:01,840 So it's fine. 208 00:10:01,840 --> 00:10:04,440 So that's the whole list, you see there, 209 00:10:04,440 --> 00:10:08,650 of all the different entries. 210 00:10:08,650 --> 00:10:12,590 If you use the disp command, it will actually 211 00:10:12,590 --> 00:10:17,700 display the internal structure of the associative array. 212 00:10:17,700 --> 00:10:23,110 And an associative array object in D4M has four fields total. 213 00:10:23,110 --> 00:10:24,020 That's it. 214 00:10:24,020 --> 00:10:26,970 We do everything with just four fields. 215 00:10:26,970 --> 00:10:30,190 And the four fields are a set of row keys. 216 00:10:30,190 --> 00:10:34,410 This is a lexicographically sorted list 217 00:10:34,410 --> 00:10:37,590 of the unique row keys. 218 00:10:37,590 --> 00:10:40,200 And it's stored as a string list. 219 00:10:40,200 --> 00:10:43,600 So in this case, we had six unique row keys. 220 00:10:43,600 --> 00:10:49,420 And we have six entries in this row string list. 221 00:10:49,420 --> 00:10:52,390 Likewise, the column is the same thing. 222 00:10:52,390 --> 00:10:54,360 We have these six unique entries. 223 00:10:54,360 --> 00:10:57,400 And you see they inherit the delimiter that 224 00:10:57,400 --> 00:10:58,750 was passed into them. 225 00:10:58,750 --> 00:11:02,730 If I'd used different ones, you would see different delimiters. 226 00:11:02,730 --> 00:11:09,050 And then the value, which is another list that 227 00:11:09,050 --> 00:11:14,490 shows all the different strings, OK, and then 228 00:11:14,490 --> 00:11:19,870 a matrix, which shows-- which is a six by six matrix. 229 00:11:19,870 --> 00:11:24,520 So this is a six by six associative array. 230 00:11:24,520 --> 00:11:26,490 And this is the pointer. 
231 00:11:26,490 --> 00:11:31,460 Basically, the value stored in a six by six sparse matrix points 232 00:11:31,460 --> 00:11:34,470 to the index of the value. 233 00:11:34,470 --> 00:11:36,884 So you can view that as a pointer to the values. 234 00:11:36,884 --> 00:11:38,300 And this is how we can have values 235 00:11:38,300 --> 00:11:40,180 that are actually strings. 236 00:11:40,180 --> 00:11:41,520 That may not be quite clear. 237 00:11:41,520 --> 00:11:43,430 There's an easier way to do this. 238 00:11:43,430 --> 00:11:45,430 So we have a little routine function here called 239 00:11:45,430 --> 00:11:51,840 displayFull, which produces a nice tabular view of the data. 240 00:11:51,840 --> 00:11:56,120 So here's the row keys, the column keys, and the values. 241 00:11:56,120 --> 00:11:58,900 And you see this was the matrix I constructed. 242 00:11:58,900 --> 00:12:03,480 You had a full first column and a full first row 243 00:12:03,480 --> 00:12:06,320 and then values along the diagonal. 244 00:12:09,340 --> 00:12:13,651 And then what we're going to do is we want to save this. 245 00:12:13,651 --> 00:12:14,650 We want to write it out. 246 00:12:14,650 --> 00:12:16,233 So we're going to write it out to-- we 247 00:12:16,233 --> 00:12:18,660 have a function [INAUDIBLE] assoc to CSV. 248 00:12:18,660 --> 00:12:20,490 So I pass an associative array. 249 00:12:20,490 --> 00:12:25,144 I give it the row terminator and the column separator and a file 250 00:12:25,144 --> 00:12:26,560 name and we'll write that data out 251 00:12:26,560 --> 00:12:30,750 to a CSV file, which is very convenient. 252 00:12:30,750 --> 00:12:32,230 [INAUDIBLE] actually hide that. 253 00:12:32,230 --> 00:12:37,490 You can see here is the CSV file. 254 00:12:37,490 --> 00:12:38,780 So you can look at it. 255 00:12:38,780 --> 00:12:41,030 There's the CSV file. 256 00:12:41,030 --> 00:12:44,322 You can zoom in on that for you. 257 00:12:44,322 --> 00:12:45,260 Can you see? 
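The four fields described above (sorted unique row keys, sorted unique column keys, sorted unique values, and a sparse matrix whose entries point into the value list) can be sketched as follows. This is a Python model under stated assumptions, not the D4M implementation: the dictionary layout, the names `build_assoc` and `lookup`, and the choice of 1-based pointers with 0 reserved for "empty" are all illustrative.

```python
def build_assoc(triples):
    """Build a four-field structure from (row, col, value) string triples."""
    row = sorted({r for r, _, _ in triples})  # unique row keys, sorted
    col = sorted({c for _, c, _ in triples})  # unique column keys, sorted
    val = sorted({v for _, _, v in triples})  # unique values, sorted
    row_pos = {key: i for i, key in enumerate(row)}
    col_pos = {key: j for j, key in enumerate(col)}
    # Assumption: pointers are 1-based so that 0 can mean "no entry".
    val_pos = {v: n + 1 for n, v in enumerate(val)}
    # Sparse matrix modeled as a dict: (row index, col index) -> value pointer.
    matrix = {(row_pos[r], col_pos[c]): val_pos[v] for r, c, v in triples}
    return {"row": row, "col": col, "val": val, "A": matrix}


def lookup(assoc, r, c):
    """Follow the stored pointer back to the string value, or None if absent."""
    if r not in assoc["row"] or c not in assoc["col"]:
        return None
    i = assoc["row"].index(r)
    j = assoc["col"].index(c)
    pointer = assoc["A"].get((i, j), 0)
    return assoc["val"][pointer - 1] if pointer else None


example = build_assoc([("a", "a", "a-a"), ("aa", "a", "aa-a"), ("a", "b", "a-b")])
```

Storing a numeric pointer rather than the string itself is what lets the matrix stay numeric while the values remain strings.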
258 00:12:45,260 --> 00:12:45,760 There we go. 259 00:12:45,760 --> 00:12:50,890 Those are our six rows, six columns against first row, 260 00:12:50,890 --> 00:12:54,485 against first column, and our diagonal, all right? 261 00:12:58,480 --> 00:13:00,810 All right, so let's go back here. 262 00:13:00,810 --> 00:13:04,990 Now let's go to our next example, 263 00:13:04,990 --> 00:13:07,376 which is going to be AI2. 264 00:13:07,376 --> 00:13:09,000 And now we're going to talk about how-- 265 00:13:09,000 --> 00:13:11,839 we sort of described how we put data into an associative array. 266 00:13:11,839 --> 00:13:14,380 We're now going to talk about how we query it or get data out 267 00:13:14,380 --> 00:13:16,191 of the associative array. 268 00:13:16,191 --> 00:13:18,190 And one thing that's very nice about the queries 269 00:13:18,190 --> 00:13:20,730 that I will show you as we get to the later parts 270 00:13:20,730 --> 00:13:22,399 of the [? course ?] we do databases, 271 00:13:22,399 --> 00:13:24,440 whether you're querying an associate array that's 272 00:13:24,440 --> 00:13:29,010 in memory or binding to a table, it's the same. 273 00:13:29,010 --> 00:13:31,510 We try and make it so that almost everything you 274 00:13:31,510 --> 00:13:34,580 would do in an associate array, you could also do on a table. 275 00:13:34,580 --> 00:13:36,810 So you can write your programs in associative arrays 276 00:13:36,810 --> 00:13:38,560 and then switch a couple things and now it 277 00:13:38,560 --> 00:13:41,610 should also work on tables in the database the same way. 278 00:13:41,610 --> 00:13:43,300 And so we try and preserve that concept. 279 00:13:43,300 --> 00:13:47,150 The only difference is that a table [? in ?] databases just 280 00:13:47,150 --> 00:13:50,880 can be much, much bigger than an associative array you would 281 00:13:50,880 --> 00:13:54,275 have in your memory space. 282 00:13:54,275 --> 00:13:55,900 All right, so you're going to run that. 
283 00:13:55,900 --> 00:13:57,996 That's the next example. 284 00:13:57,996 --> 00:13:59,370 So the first thing we did is when 285 00:13:59,370 --> 00:14:00,820 we wrote out the associate array, 286 00:14:00,820 --> 00:14:01,720 we're going to read it back in. 287 00:14:01,720 --> 00:14:03,560 So we have a nice function here called 288 00:14:03,560 --> 00:14:07,605 ReadCSV that reads CSV files. 289 00:14:10,720 --> 00:14:14,150 Just so you know, Microsoft Excel 290 00:14:14,150 --> 00:14:18,850 does write out a non-standard CSV file. 291 00:14:18,850 --> 00:14:21,090 Microsoft Excel [INAUDIBLE] common-- that 292 00:14:21,090 --> 00:14:30,310 is, if you have a row that is empty after a certain point, 293 00:14:30,310 --> 00:14:35,540 it won't write out those commas, which is technically probably 294 00:14:35,540 --> 00:14:37,947 not conformant with the official CSV format, 295 00:14:37,947 --> 00:14:40,530 although I don't know there is a sufficiently written down CSV 296 00:14:40,530 --> 00:14:41,120 format, right? 297 00:14:41,120 --> 00:14:43,171 You can describe it in one line. 298 00:14:43,171 --> 00:14:44,920 So you just have to be careful about that. 299 00:14:44,920 --> 00:14:48,085 If you see this issue, people write out a CSV from Excel. 300 00:14:48,085 --> 00:14:54,240 And if there's an empty last-- if it's last row is not 301 00:14:54,240 --> 00:14:59,160 fully dense or if its last column is not a dense column, 302 00:14:59,160 --> 00:15:03,440 you can get this issue and it will screw this stuff up. 303 00:15:03,440 --> 00:15:06,490 We also don't support quoted strings. 304 00:15:06,490 --> 00:15:08,835 Way to do that, though, is to create a TSV file. 305 00:15:08,835 --> 00:15:10,470 So put tabs in there. 
306 00:15:10,470 --> 00:15:15,367 And we support tab-- just, if you just call this a TSV file, 307 00:15:15,367 --> 00:15:16,825 when you write it out, you can have 308 00:15:16,825 --> 00:15:19,449 your separator be a tab instead of a comma, 309 00:15:19,449 --> 00:15:20,490 and then you're all good. 310 00:15:20,490 --> 00:15:22,590 And so that's how we support that. 311 00:15:22,590 --> 00:15:25,335 So we're going to now-- so that's how you read it in. 312 00:15:25,335 --> 00:15:26,960 And now we're going to do a whole bunch 313 00:15:26,960 --> 00:15:27,900 of different queries. 314 00:15:27,900 --> 00:15:30,840 These are all what are relatively complicated queries. 315 00:15:30,840 --> 00:15:36,430 So the first one here is get me rows a and b. 316 00:15:36,430 --> 00:15:39,920 So if I pass in a string list just like the-- whoops, 317 00:15:39,920 --> 00:15:43,000 didn't want to do that. 318 00:15:43,000 --> 00:15:43,890 Let's see here. 319 00:15:43,890 --> 00:15:47,550 We pass in a string list of the same type that I had before. 320 00:15:47,550 --> 00:15:51,640 And I just say this exact same type of indexing 321 00:15:51,640 --> 00:15:54,349 that we normally have in MATLAB where you're like, 322 00:15:54,349 --> 00:15:55,640 I want to get a set of rows. 323 00:15:55,640 --> 00:15:56,990 I give it a set of rows. 324 00:15:56,990 --> 00:16:00,890 And so this says, get me rows a and b. 325 00:16:00,890 --> 00:16:02,980 And give me the whole row-- colon-- 326 00:16:02,980 --> 00:16:06,550 use the standard MATLAB syntax that colon means full row. 327 00:16:06,550 --> 00:16:08,780 Likewise, if a was a binding to a table, 328 00:16:08,780 --> 00:16:11,610 it would deliver the exact same query to the database 329 00:16:11,610 --> 00:16:15,130 and return the exact same thing in an associative array. 330 00:16:15,130 --> 00:16:17,680 Here's another, more complicated one that says get me 331 00:16:17,680 --> 00:16:22,560 all rows containing a. 
332 00:16:22,560 --> 00:16:25,702 So our wild card character is a little liberal here. 333 00:16:25,702 --> 00:16:27,660 It will just give you anything containing an a. 334 00:16:27,660 --> 00:16:30,620 It doesn't really respect beginnings or endings 335 00:16:30,620 --> 00:16:31,860 or anything like that. 336 00:16:31,860 --> 00:16:34,100 It's kind of a regular expression. 337 00:16:34,100 --> 00:16:40,490 And this says get any row containing a and columns one 338 00:16:40,490 --> 00:16:41,440 through three. 339 00:16:41,440 --> 00:16:44,391 So we can use numerical indexes. 340 00:16:44,391 --> 00:16:45,890 Sometimes, you're just like, I don't 341 00:16:45,890 --> 00:16:47,014 care what the row keys are. 342 00:16:47,014 --> 00:16:50,220 Just give me the first 10 columns or the first 10 rows. 343 00:16:50,220 --> 00:16:55,380 And that will return this, OK? 344 00:16:55,380 --> 00:17:00,370 This is one feature that does not work on all databases. 345 00:17:00,370 --> 00:17:05,109 Some databases have a concept-- the numerical index 346 00:17:05,109 --> 00:17:05,770 of a column. 347 00:17:05,770 --> 00:17:07,240 And some databases do not. 348 00:17:07,240 --> 00:17:10,919 So it's dependent on the actual database. 349 00:17:10,919 --> 00:17:11,710 Here's another one. 350 00:17:11,710 --> 00:17:12,849 This is a range query. 351 00:17:12,849 --> 00:17:16,200 So if you remember in MATLAB, if you do colon and two values, 352 00:17:16,200 --> 00:17:17,450 it gives you a range. 353 00:17:17,450 --> 00:17:19,060 So we can do a range query here. 354 00:17:19,060 --> 00:17:25,140 So a-- give me all columns a through b. 
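The kinds of key query walked through above (an exact list of keys, keys containing a substring, numeric positions, and a key range) can be modeled on a sorted key list. This is a Python sketch of the selection logic only, not D4M syntax: the function names are hypothetical, the six sample keys are an assumption about the example data, and treating the range as inclusive at both ends is also an assumption.

```python
# Assumed sample keys, kept in sorted order like the row/column key lists.
keys = ["a", "aa", "aaa", "b", "bb", "bbb"]


def select_exact(keys, wanted):
    """Keep only keys that appear in the requested list."""
    wanted = set(wanted)
    return [k for k in keys if k in wanted]


def select_containing(keys, fragment):
    """Keep keys containing the fragment anywhere (the 'liberal' wildcard)."""
    return [k for k in keys if fragment in k]


def select_positions(keys, first, last):
    """Keep keys by 1-based position, inclusive, ignoring what the keys are."""
    return keys[first - 1:last]


def select_range(keys, low, high):
    """Keep keys that sort between low and high (assumed inclusive)."""
    return [k for k in keys if low <= k <= high]
```

With these keys, a range from `"a"` to `"b"` keeps `a`, `aa`, `aaa`, and `b`, but not `bb`, because `bb` sorts after `b`.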
355 00:17:25,140 --> 00:17:26,650 If you really wanted to just get-- 356 00:17:26,650 --> 00:17:29,100 we have something called a starts with, which is something 357 00:17:29,100 --> 00:17:30,683 like-- a lot of times, you'll be like, 358 00:17:30,683 --> 00:17:33,470 I want to just get the rows or the columns that 359 00:17:33,470 --> 00:17:35,000 begin with a certain string. 360 00:17:35,000 --> 00:17:37,280 So we have a little shorthand routine here 361 00:17:37,280 --> 00:17:38,530 that constructs that for you. 362 00:17:38,530 --> 00:17:42,572 So this says, get me all rows starting with a and c. 363 00:17:42,572 --> 00:17:44,530 Likewise, I can do the same thing with columns. 364 00:17:44,530 --> 00:17:46,835 I can say get me columns a and b. 365 00:17:46,835 --> 00:17:53,960 I can say get me all columns that contain a with columns-- 366 00:17:53,960 --> 00:17:55,520 with rows one through three. 367 00:17:55,520 --> 00:17:57,930 Like I said, I can do column ranges here. 368 00:17:57,930 --> 00:18:00,790 I can say, give me all columns a to b. 369 00:18:00,790 --> 00:18:03,180 Likewise, I can do the starts with command 370 00:18:03,180 --> 00:18:08,540 as well as give me all columns starting with a or c. 371 00:18:08,540 --> 00:18:10,910 And then finally, I think this is kind of fun, 372 00:18:10,910 --> 00:18:13,210 we can actually query the values. 373 00:18:13,210 --> 00:18:14,900 Again, this is a feature that's not 374 00:18:14,900 --> 00:18:17,100 in-- it's not supported with the database. 375 00:18:17,100 --> 00:18:19,570 But it is supported with the associative array, which 376 00:18:19,570 --> 00:18:23,850 is if I say get me all-- return an associative array where 377 00:18:23,850 --> 00:18:29,830 all the values are greater than b-- I'm sorry, less than b, OK? 378 00:18:29,830 --> 00:18:32,760 So let me just show you what that looks like. 379 00:18:32,760 --> 00:18:34,509 So this was that query. 
380 00:18:34,509 --> 00:18:35,800 And this is what it looks like. 381 00:18:35,800 --> 00:18:37,950 So I do display a-- and you see now 382 00:18:37,950 --> 00:18:41,810 we don't have any values that begin 383 00:18:41,810 --> 00:18:45,230 with a b-- all the values, OK? 384 00:18:45,230 --> 00:18:47,080 Another thing that we should notice here 385 00:18:47,080 --> 00:18:50,830 is that this is now a three by three matrix, not a six 386 00:18:50,830 --> 00:18:52,700 by six matrix. 387 00:18:52,700 --> 00:18:56,900 Associative arrays never store an empty row 388 00:18:56,900 --> 00:18:58,160 or an empty column. 389 00:18:58,160 --> 00:19:01,350 That is a big difference between traditional sparse 390 00:19:01,350 --> 00:19:04,142 linear algebra and associative arrays. 391 00:19:04,142 --> 00:19:05,600 In sparse linear algebra, you could 392 00:19:05,600 --> 00:19:09,000 have a row of all zero-- an empty row 393 00:19:09,000 --> 00:19:11,700 or an empty column, not in associative arrays. 394 00:19:11,700 --> 00:19:15,567 Associative arrays, you either have-- 395 00:19:15,567 --> 00:19:17,650 there's going to be-- if you have a row or column, 396 00:19:17,650 --> 00:19:19,200 it's going to have an entry-- yes. 397 00:19:19,200 --> 00:19:20,075 AUDIENCE: [INAUDIBLE] 398 00:19:26,280 --> 00:19:29,090 PROFESSOR: So basically, all it does-- 399 00:19:29,090 --> 00:19:33,510 so it's basically-- so if we had a value beginning with 400 00:19:33,510 --> 00:19:35,820 c, that would also not be included. 401 00:19:35,820 --> 00:19:37,800 So it's basically-- lexicographically, we 402 00:19:37,800 --> 00:19:41,070 compare the value aa with b. 403 00:19:41,070 --> 00:19:43,600 And we say, is it lexicographically before b? 404 00:19:43,600 --> 00:19:45,031 If so, it satisfies the condition. 405 00:19:45,031 --> 00:19:45,906 AUDIENCE: [INAUDIBLE] 406 00:19:50,700 --> 00:19:51,610 PROFESSOR: Yes, yes. 407 00:19:51,610 --> 00:19:55,090 So that's the policy that we [? made-- ?] 
408 00:19:55,090 --> 00:19:57,380 so strictly, the algebra of associative arrays 409 00:19:57,380 --> 00:19:59,670 doesn't require lexicographical ordering. 410 00:19:59,670 --> 00:20:01,580 That's an implementation fact. 411 00:20:01,580 --> 00:20:04,440 It's a very important implementation fact. 412 00:20:04,440 --> 00:20:07,780 We are constantly maintaining lexicographical order 413 00:20:07,780 --> 00:20:10,009 inside the data structure. 414 00:20:10,009 --> 00:20:10,550 Yeah, Darryl? 415 00:20:10,550 --> 00:20:11,425 AUDIENCE: [INAUDIBLE] 416 00:20:13,745 --> 00:20:15,620 PROFESSOR: That's right, yeah-- three by six, 417 00:20:15,620 --> 00:20:17,630 sorry, three by six. 418 00:20:17,630 --> 00:20:21,230 Yes, yes-- because these are full here, right? 419 00:20:21,230 --> 00:20:26,510 But you see that the ones-- none of the ones with b in it 420 00:20:26,510 --> 00:20:29,290 happened because they didn't satisfy the criteria. 421 00:20:29,290 --> 00:20:30,620 So therefore, they're empty. 422 00:20:30,620 --> 00:20:34,540 And so therefore-- so just very important thing to know. 423 00:20:34,540 --> 00:20:36,445 All right, moving on here to the next-- 424 00:20:36,445 --> 00:20:37,320 AUDIENCE: [INAUDIBLE] 425 00:20:40,307 --> 00:20:42,140 PROFESSOR: You define other orderings. 426 00:20:42,140 --> 00:20:45,410 The mathematics would no doubt admit that. 427 00:20:45,410 --> 00:20:48,560 It's really baked in, though, into the implementation. 428 00:20:48,560 --> 00:20:52,790 I mean, because we rely on the MATLAB sort command-- 429 00:20:52,790 --> 00:20:55,250 and as far as I know, that does not allow 430 00:20:55,250 --> 00:20:57,570 you to have other orderings. 431 00:20:57,570 --> 00:20:59,300 Now, you could obviously do rehashed. 432 00:20:59,300 --> 00:21:02,075 You could hash your rows and keys to some other thing 433 00:21:02,075 --> 00:21:05,010 and then have an associative array that maps those back and forth. 
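The value query and the "no empty rows or columns" rule discussed above can be checked with a small model. This Python sketch rebuilds data with the shape the lecture describes (a full first row, a full first column, and a diagonal, with values formed as row-dash-column), keeps only values that sort before `"b"`, and then derives the surviving row and column keys. The six key names and the helper names are assumptions for illustration; this is not D4M code.

```python
# Assumed keys; the lecture only states that there are six unique ones.
keys = ["a", "aa", "aaa", "b", "bb", "bbb"]

# Entries as {(row, col): value}: full first row, full first column, diagonal.
entries = {}
for k in keys:
    entries[(keys[0], k)] = keys[0] + "-" + k
    entries[(k, keys[0])] = k + "-" + keys[0]
    entries[(k, k)] = k + "-" + k


def filter_values_less_than(entries, bound):
    """Keep entries whose string value sorts strictly before the bound."""
    return {rc: v for rc, v in entries.items() if v < bound}


def surviving_keys(entries):
    """Rows and columns exist only if they still hold at least one entry."""
    rows = sorted({r for r, _ in entries})
    cols = sorted({c for _, c in entries})
    return rows, cols


filtered = filter_values_less_than(entries, "b")
rows, cols = surviving_keys(filtered)
```

Under these assumptions the result has three rows and six columns, matching the corrected "three by six" in the discussion: every row starting with `b` loses all its entries and disappears, while all six columns survive because row `a` is full.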
434 00:21:05,010 --> 00:21:08,254 And in fact, people do that all the time and just have-- 435 00:21:08,254 --> 00:21:09,170 and you could do that. 436 00:21:09,170 --> 00:21:10,950 But that's how-- and if you really, really want 437 00:21:10,950 --> 00:21:12,250 the order-- the main thing, though, 438 00:21:12,250 --> 00:21:13,350 is to kind of think of it, though, 439 00:21:13,350 --> 00:21:15,141 is that the ordering doesn't really matter. 440 00:21:15,141 --> 00:21:17,960 It's really a device that allows me to do fast lookups. 441 00:21:17,960 --> 00:21:23,380 And you shouldn't really care about the ordering. 442 00:21:23,380 --> 00:21:25,660 So let's move on to the next example. 443 00:21:25,660 --> 00:21:28,550 So actually-- so we're going to do AI3. 444 00:21:28,550 --> 00:21:32,640 So now we're going to do some math on this stuff, OK? 445 00:21:32,640 --> 00:21:35,050 So here we go. 446 00:21:35,050 --> 00:21:37,539 All right, so once again, I read in my data 447 00:21:37,539 --> 00:21:39,205 that I constructed in the first example. 448 00:21:42,570 --> 00:21:46,679 The values of that data are strings. 449 00:21:46,679 --> 00:21:48,220 But if I want to do math, sometimes I 450 00:21:48,220 --> 00:21:49,714 don't want to do math on strings. 451 00:21:49,714 --> 00:21:51,380 Sometimes, I want to do math on numbers. 452 00:21:51,380 --> 00:21:54,450 So in D4M, the associative array values can also 453 00:21:54,450 --> 00:21:58,250 be numbers, which can be very convenient for doing 454 00:21:58,250 --> 00:21:59,970 mathematical operations. 
455 00:21:59,970 --> 00:22:04,670 And so what we have here is we have this command 456 00:22:04,670 --> 00:22:07,990 called dblLogi, which is a shorthand for-- 457 00:22:07,990 --> 00:22:11,530 and it's kind of cut off here-- applying 458 00:22:11,530 --> 00:22:15,209 logical to the associative array, which takes all the values 459 00:22:15,209 --> 00:22:17,000 and converts them just to a zero or a one-- 460 00:22:17,000 --> 00:22:18,510 basically throws away the strings. 461 00:22:18,510 --> 00:22:21,150 Or if it was a numeric value, just [INAUDIBLE] are you there? 462 00:22:21,150 --> 00:22:21,950 You're a one. 463 00:22:21,950 --> 00:22:24,680 You're not there, you're a zero or you're empty. 464 00:22:24,680 --> 00:22:26,950 And then since we can't do arithmetic 465 00:22:26,950 --> 00:22:30,170 on logicals in MATLAB, we have to bump them back into doubles. 466 00:22:30,170 --> 00:22:32,250 So we do this so often, we've actually 467 00:22:32,250 --> 00:22:35,170 made a little shorthand here where we call dblLogi. 468 00:22:35,170 --> 00:22:37,670 And you'll see that all the time in the class because I just 469 00:22:37,670 --> 00:22:39,711 don't like to type all the characters [INAUDIBLE] 470 00:22:39,711 --> 00:22:42,770 writing double logical gets very redundant. 471 00:22:42,770 --> 00:22:44,930 Now, if we go and look at that data again, 472 00:22:44,930 --> 00:22:47,050 we see when you do displayFull that 473 00:22:47,050 --> 00:22:49,260 instead of where before we had values 474 00:22:49,260 --> 00:22:52,730 of these various strings, we just now have numbers-- 475 00:22:52,730 --> 00:22:53,630 just ones. 476 00:22:53,630 --> 00:22:54,750 So that's convenient. 477 00:22:54,750 --> 00:22:56,840 Now I can do arithmetic on them. 478 00:22:56,840 --> 00:23:00,111 So let me do the first thing here. 479 00:23:00,111 --> 00:23:01,610 One of the things I might want to do 480 00:23:01,610 --> 00:23:06,030 is sum all the rows-- so the MATLAB sum command. 
481 00:23:06,030 --> 00:23:09,150 It's basically-- you're giving it essentially the dimension 482 00:23:09,150 --> 00:23:10,090 to eliminate. 483 00:23:10,090 --> 00:23:13,320 The first dimension we want to eliminate or sum over 484 00:23:13,320 --> 00:23:14,525 is the row dimension. 485 00:23:14,525 --> 00:23:17,810 So that means it's summing up all these rows. 486 00:23:17,810 --> 00:23:20,614 It's basically squishing the matrix and summing them up. 487 00:23:20,614 --> 00:23:22,280 And now we have a new associative array, 488 00:23:22,280 --> 00:23:25,490 which is essentially a one by six associative array. 489 00:23:25,490 --> 00:23:28,240 You see now that the row key is empty. 490 00:23:28,240 --> 00:23:30,140 Because when we sum, the definitions 491 00:23:30,140 --> 00:23:32,560 of the rows sort of kind of go away. 492 00:23:32,560 --> 00:23:35,310 So we don't-- you don't have to have strings to be your row 493 00:23:35,310 --> 00:23:35,810 keys. 494 00:23:35,810 --> 00:23:40,100 You can have numeric row keys. 495 00:23:40,100 --> 00:23:44,590 Just leave the row entry empty. 496 00:23:44,590 --> 00:23:46,880 But there are some cautions with that as well. 497 00:23:46,880 --> 00:23:48,796 So when we sum that and you see [INAUDIBLE] we 498 00:23:48,796 --> 00:23:50,489 have all the columns, the values, 499 00:23:50,489 --> 00:23:52,030 because they're not strings, are just 500 00:23:52,030 --> 00:23:56,440 stored in this A matrix itself. 501 00:23:56,440 --> 00:23:59,230 And you see that we had six and two and two and two and two 502 00:23:59,230 --> 00:23:59,840 and two. 503 00:23:59,840 --> 00:24:02,530 So when we sum these up, that's exactly what you would expect. 504 00:24:05,360 --> 00:24:08,480 Moving along, we can do the columns. 505 00:24:08,480 --> 00:24:09,760 So here I'm going to sum. 506 00:24:09,760 --> 00:24:12,750 And then I'm going to do display full, which is the same as just 507 00:24:12,750 --> 00:24:13,890 kind of listing it. 
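The two steps just described, turning every stored value into a one and then summing away a dimension, can be sketched in plain Python as follows. This is an illustration under assumptions: the data and the helper names (`to_ones`, `sum_over_rows`, `sum_over_cols`) are hypothetical, and the dict representation stands in for an associative array rather than reproducing D4M.

```python
from collections import defaultdict

# Hypothetical string-valued array: (row key, column key) -> value.
A = {
    ("r1", "c1"): "x", ("r1", "c2"): "y",
    ("r2", "c1"): "z",
    ("r3", "c2"): "w", ("r3", "c3"): "v",
}


def to_ones(assoc):
    """Discard the stored values: every entry that exists becomes 1.0."""
    return {key: 1.0 for key in assoc}


def sum_over_rows(assoc):
    """Eliminate the row dimension, leaving one total per column key."""
    totals = defaultdict(float)
    for (_, col), val in assoc.items():
        totals[col] += val
    return dict(totals)


def sum_over_cols(assoc):
    """Eliminate the column dimension, leaving one total per row key."""
    totals = defaultdict(float)
    for (row, _), val in assoc.items():
        totals[row] += val
    return dict(totals)


ones = to_ones(A)
col_totals = sum_over_rows(ones)
row_totals = sum_over_cols(ones)
print(col_totals)
print(row_totals)
```

Since every value is a one, each total is simply the number of stored entries in that column or row.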
508 00:24:13,890 --> 00:24:19,425 And we see here we now sum the-- compressed all the columns. 509 00:24:19,425 --> 00:24:20,550 So we have a column vector. 510 00:24:20,550 --> 00:24:23,560 And this shows the row labels of that column vector. 511 00:24:23,560 --> 00:24:26,510 We now have a new column label, which is just one. 512 00:24:26,510 --> 00:24:28,800 And then you see the actual values there-- again, 513 00:24:28,800 --> 00:24:30,070 a very useful thing. 514 00:24:30,070 --> 00:24:32,160 People do this all the time summing 515 00:24:32,160 --> 00:24:35,620 of their rows and columns. 516 00:24:35,620 --> 00:24:37,620 Let's do a simple join. 517 00:24:37,620 --> 00:24:43,010 So I'm going to say give me a column vector A. Get me 518 00:24:43,010 --> 00:24:45,540 another column vector b, right? 519 00:24:45,540 --> 00:24:48,500 And now I'm going to join these two together. 520 00:24:48,500 --> 00:24:52,440 Now, I could just do aa and ab. 521 00:24:52,440 --> 00:24:54,720 But I'd get an empty matrix. 522 00:24:54,720 --> 00:24:57,270 The reason is because it would attempt to do the joins 523 00:24:57,270 --> 00:24:59,470 and they would have different column labels. 524 00:24:59,470 --> 00:25:02,060 And so when we do joins, we're intersecting 525 00:25:02,060 --> 00:25:06,300 the two sets of row and column keys together. 526 00:25:06,300 --> 00:25:08,740 And if they have sep-- those are just separate columns. 527 00:25:08,740 --> 00:25:11,950 They add them together, they have no intersection. 528 00:25:11,950 --> 00:25:14,440 However, if we have this function called noCol, which 529 00:25:14,440 --> 00:25:16,190 actually blows away the column 530 00:25:16,190 --> 00:25:19,272 and basically gives them all, in this case a column value one, 531 00:25:19,272 --> 00:25:20,480 we can now add them together. 532 00:25:20,480 --> 00:25:22,630 And we can actually find where these two 533 00:25:22,630 --> 00:25:26,390 things have a common value. 
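The join described above hinges on key intersection: two column vectors with different column labels share no keys until those labels are discarded. A plain-Python sketch of that idea follows. The data and the helpers `column`, `drop_col_labels`, and `intersect` are hypothetical stand-ins, not D4M functions.

```python
# Hypothetical array of ones: (row key, column key) -> 1.0.
ones = {
    ("r1", "a"): 1.0, ("r1", "b"): 1.0,
    ("r2", "a"): 1.0,
    ("r3", "b"): 1.0,
    ("r4", "a"): 1.0, ("r4", "b"): 1.0, ("r4", "c"): 1.0,
}


def column(assoc, col):
    """Select a single column, keeping its original column label."""
    return {(r, c): v for (r, c), v in assoc.items() if c == col}


def drop_col_labels(assoc):
    """Relabel every column as 1 so that keys differ only by row."""
    return {(r, 1): v for (r, _), v in assoc.items()}


def intersect(x, y):
    """Keep only the keys stored in both arrays."""
    return {key: 1.0 for key in x.keys() & y.keys()}


col_a = column(ones, "a")
col_b = column(ones, "b")

naive = intersect(col_a, col_b)  # labels "a" and "b" never match
joined = intersect(drop_col_labels(col_a), drop_col_labels(col_b))
rows_with_both = sorted(r for r, _ in joined)
print(naive)
print(rows_with_both)
```

The naive intersection is empty because `("r1", "a")` and `("r1", "b")` are different keys; once both vectors carry the same column label, the rows they have in common fall out directly.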
534 00:25:26,390 --> 00:25:28,285 So that's a fairly simple way to do a join. 535 00:25:30,830 --> 00:25:34,650 This is something called a facet search, which basically says, 536 00:25:34,650 --> 00:25:37,080 all right, I'm going to join these things together. 537 00:25:37,080 --> 00:25:42,740 So I'm going to create a column vector. 538 00:25:42,740 --> 00:25:45,680 But then I'm going to transpose that over here. 539 00:25:45,680 --> 00:25:46,610 So I transpose it. 540 00:25:46,610 --> 00:25:48,193 And then I'm going to multiply it back 541 00:25:48,193 --> 00:25:49,385 with the original matrix. 542 00:25:49,385 --> 00:25:52,840 This gives me a count of essentially all-- given 543 00:25:52,840 --> 00:25:56,630 all columns, all rows that contained-- 544 00:25:56,630 --> 00:26:01,300 had an entry in column a and B, can you now sum up their rows? 545 00:26:01,300 --> 00:26:04,270 And this is a fairly-- this is actually 546 00:26:04,270 --> 00:26:07,710 the mathematical basis of, if you ever do in Google search 547 00:26:07,710 --> 00:26:11,090 it does auto-find, it's essentially 548 00:26:11,090 --> 00:26:12,670 trying to do this type of operation. 549 00:26:12,670 --> 00:26:15,230 It's trying to guess what the next most popular topic would 550 00:26:15,230 --> 00:26:16,022 be. 551 00:26:16,022 --> 00:26:18,230 And this is essentially-- this mathematical operation 552 00:26:18,230 --> 00:26:19,370 does that kind of thing. 553 00:26:19,370 --> 00:26:21,745 We actually have a bunch of applications in lab 554 00:26:21,745 --> 00:26:23,930 that use this quite heavily. 555 00:26:23,930 --> 00:26:26,851 And I can display the transpose of that-- 556 00:26:26,851 --> 00:26:28,100 so to make it a column vector. 557 00:26:28,100 --> 00:26:31,140 And as you see here, we have a bunch of columns here 558 00:26:31,140 --> 00:26:32,570 and then values. 559 00:26:32,570 --> 00:26:35,195 And then we can actually do things like sum and normalize. 
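A rough plain-Python sketch of the facet search described above: find the rows that have an entry in both chosen columns, then count, per column, how many of those rows contain it. The data and function names are hypothetical, and the normalization shown (dividing each count by that column's total over all rows) is an assumption, since the lecture only says the counts are summed and divided.

```python
from collections import defaultdict

# Hypothetical array of ones: (row key, column key) -> 1.0.
ones = {
    ("r1", "a"): 1.0, ("r1", "b"): 1.0, ("r1", "d"): 1.0,
    ("r2", "a"): 1.0,
    ("r3", "b"): 1.0,
    ("r4", "a"): 1.0, ("r4", "b"): 1.0, ("r4", "c"): 1.0, ("r4", "d"): 1.0,
}


def column_totals(assoc):
    """Number of stored entries in each column."""
    totals = defaultdict(float)
    for (_, col), val in assoc.items():
        totals[col] += val
    return dict(totals)


def facet_counts(assoc, col_x, col_y):
    """For rows holding both col_x and col_y, count entries per column."""
    rows_x = {r for (r, c) in assoc if c == col_x}
    rows_y = {r for (r, c) in assoc if c == col_y}
    both = rows_x & rows_y
    counts = defaultdict(float)
    for (row, col), val in assoc.items():
        if row in both:
            counts[col] += val
    return dict(counts)


counts = facet_counts(ones, "a", "b")
totals = column_totals(ones)
# Assumed normalization: fraction of each column's entries that fall in the
# selected rows.
normalized = {col: counts[col] / totals[col] for col in counts}
print(counts)
print(normalized)
```

Restricting to the selected rows and summing per column gives the same numbers as multiplying the transposed indicator vector by the original matrix, which is how the lecture phrases it.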
560 00:26:35,195 --> 00:26:36,910 So we can divide. 561 00:26:36,910 --> 00:26:38,620 So I'm going to normalize them. 562 00:26:38,620 --> 00:26:39,700 You can display that. 563 00:26:39,700 --> 00:26:41,780 You see now you get the probabilities associated 564 00:26:41,780 --> 00:26:43,310 with these things. 565 00:26:43,310 --> 00:26:45,300 Just basically, you can just kind of 566 00:26:45,300 --> 00:26:48,340 go on your way doing math. 567 00:26:48,340 --> 00:26:50,870 This shows essentially the correlation of the columns 568 00:26:50,870 --> 00:26:51,530 a and b. 569 00:26:51,530 --> 00:26:54,340 But why not just do all the correlations at once? 570 00:26:54,340 --> 00:26:57,680 So I'll focus on here. 571 00:26:57,680 --> 00:26:59,190 So we have this function sqIn, 572 00:26:59,190 --> 00:27:01,440 which is the same as A transpose A but a little bit 573 00:27:01,440 --> 00:27:05,130 faster if you're just squaring something with itself. 574 00:27:05,130 --> 00:27:07,680 I'm going to get the diagonal of that. 575 00:27:07,680 --> 00:27:12,880 So if I use this function Adj, which stands for adjacency, 576 00:27:12,880 --> 00:27:15,570 it'll just pop out that A matrix by itself. 577 00:27:15,570 --> 00:27:17,130 Whenever you use Adj, you're getting 578 00:27:17,130 --> 00:27:21,340 a straight sparse MATLAB matrix. 579 00:27:21,340 --> 00:27:23,010 And you can do any operation on that 580 00:27:23,010 --> 00:27:24,550 that you want that MATLAB supports. 581 00:27:24,550 --> 00:27:26,460 So it's a really-- I highly recommend that. 582 00:27:26,460 --> 00:27:28,130 If you can't figure out how to do it 583 00:27:28,130 --> 00:27:29,505 with associative arrays [INAUDIBLE] 584 00:27:29,505 --> 00:27:31,250 just pop out that adjacency matrix. 585 00:27:31,250 --> 00:27:32,160 Do whatever you want. 
586 00:27:32,160 --> 00:27:34,790 And then you can just-- as long as you didn't change the size, 587 00:27:34,790 --> 00:27:38,500 you can just stuff it right back in at essentially no cost. 588 00:27:38,500 --> 00:27:41,261 So this basically copies that field out. 589 00:27:41,261 --> 00:27:42,760 And we're going to get the diagonal. 590 00:27:42,760 --> 00:27:43,965 The reason we want to get the diagonal 591 00:27:43,965 --> 00:27:46,810 is when we do correlations, we always get this dense diagonal. 592 00:27:46,810 --> 00:27:48,250 I want to eliminate that. 593 00:27:48,250 --> 00:27:53,370 So I'm going to then now take the adjacency matrix 594 00:27:53,370 --> 00:27:57,370 of the original correlation matrix, subtract the diagonal. 595 00:27:57,370 --> 00:27:59,490 And then we have this function putAdj, which 596 00:27:59,490 --> 00:28:03,440 is kind of the inverse of Adj, which just says 597 00:28:03,440 --> 00:28:05,060 I have an associative array. 598 00:28:05,060 --> 00:28:08,650 Replace that A with this. 599 00:28:08,650 --> 00:28:13,760 No checking, instantaneous copy, very fast, 600 00:28:13,760 --> 00:28:15,620 but make sure it's the right size. 601 00:28:15,620 --> 00:28:19,580 If it's the wrong size or you've reoriented 602 00:28:19,580 --> 00:28:21,580 the columns in some way or the rows in some way, 603 00:28:21,580 --> 00:28:23,070 then it won't make any sense. 604 00:28:23,070 --> 00:28:25,572 So it's a little bit of an advanced feature, 605 00:28:25,572 --> 00:28:26,780 but it's a very powerful one. 606 00:28:26,780 --> 00:28:30,360 You can pop stuff in and out of these structures very quickly. 607 00:28:30,360 --> 00:28:32,750 And that way, if you ever need to do math 608 00:28:32,750 --> 00:28:35,800 that we don't support, you can just do that directly. 
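The all-pairs correlation with the diagonal removed can be sketched in plain Python as below. This is not the Adj/putAdj route the lecture takes through a MATLAB sparse matrix; it assumes a dict representation in which dropping the diagonal keys has the same effect as subtracting the diagonal, because zero entries are never stored. The data and the names `square_in` and `drop_diagonal` are hypothetical.

```python
from collections import defaultdict

# Hypothetical array of ones: (row key, column key) -> 1.0.
ones = {
    ("r1", "a"): 1.0, ("r1", "b"): 1.0,
    ("r2", "a"): 1.0,
    ("r3", "b"): 1.0, ("r3", "c"): 1.0,
    ("r4", "a"): 1.0, ("r4", "b"): 1.0, ("r4", "c"): 1.0,
}


def square_in(assoc):
    """Compute A-transpose times A: column-by-column co-occurrence totals."""
    entries_by_row = defaultdict(list)
    for (row, col), val in assoc.items():
        entries_by_row[row].append((col, val))
    product = defaultdict(float)
    for entries in entries_by_row.values():
        for col_i, val_i in entries:
            for col_j, val_j in entries:
                product[(col_i, col_j)] += val_i * val_j
    return dict(product)


def drop_diagonal(assoc):
    """Remove entries whose two keys match (the dense diagonal)."""
    return {(i, j): v for (i, j), v in assoc.items() if i != j}


full = square_in(ones)
corr = drop_diagonal(full)
print(full[("a", "a")])  # diagonal entry: number of rows containing "a"
print(corr)
```

The diagonal of the product only repeats each column's own count, which is why it is dense and why removing it leaves just the pairwise co-occurrence counts.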
609 00:28:35,800 --> 00:28:38,000 And so if we look at that, we see 610 00:28:38,000 --> 00:28:41,970 what you would expect-- a nice correlation matrix here, 611 00:28:41,970 --> 00:28:46,800 symmetric with no diagonal and the counts of each one. 612 00:28:46,800 --> 00:28:50,300 Right here-- so let me move on to the next example 613 00:28:50,300 --> 00:28:54,320 here, which is AI4. 614 00:28:54,320 --> 00:28:56,330 And this just shows you different ways 615 00:28:56,330 --> 00:28:59,030 to construct associative arrays, some of the various kind 616 00:28:59,030 --> 00:29:01,770 of more degenerate cases. 617 00:29:01,770 --> 00:29:05,550 So I have a bench [INAUDIBLE] creating a string here 618 00:29:05,550 --> 00:29:07,760 and a bunch of numeric values. 619 00:29:07,760 --> 00:29:11,580 The point here is when you construct an associative array, 620 00:29:11,580 --> 00:29:13,090 one, we have full support for all 621 00:29:13,090 --> 00:29:15,370 these degenerate empty conditions. 622 00:29:15,370 --> 00:29:17,465 You give me anything with any kind of empty, 623 00:29:17,465 --> 00:29:20,020 we're going to return an empty associative array. 624 00:29:20,020 --> 00:29:22,822 MATLAB has outstanding support for empty objects. 625 00:29:22,822 --> 00:29:24,030 You can just keep on passing. 626 00:29:24,030 --> 00:29:26,040 You don't have to-- we basically-- 627 00:29:26,040 --> 00:29:28,090 what I'm saying is we do a lot of checking. 628 00:29:28,090 --> 00:29:30,310 If the thing is empty, short circuit, 629 00:29:30,310 --> 00:29:31,730 return an empty type of thing. 630 00:29:31,730 --> 00:29:33,770 So you're not having to constantly check 631 00:29:33,770 --> 00:29:37,740 if something is empty or not in order to proceed. 632 00:29:37,740 --> 00:29:38,890 So that's what we do there. 633 00:29:38,890 --> 00:29:41,290 That's just showing you all these [? supported. ?] 634 00:29:41,290 --> 00:29:44,210 In addition, you can do mixed types. 
635 00:29:44,210 --> 00:29:46,240 So when you construct an associative array, 636 00:29:46,240 --> 00:29:49,820 if one of the values is a scalar-- like, for instance, 637 00:29:49,820 --> 00:29:50,650 you can have that. 638 00:29:50,650 --> 00:29:54,680 You don't-- you know normally, as in the MATLAB sparse 639 00:29:54,680 --> 00:29:57,220 constructor, when you give it a set of triples, 640 00:29:57,220 --> 00:29:58,760 they all have to be the same length. 641 00:29:58,760 --> 00:30:01,560 Or any one of them can-- or any of them can be scalars. 642 00:30:01,560 --> 00:30:05,220 So this just says, I have a bunch of column strings, 643 00:30:05,220 --> 00:30:08,630 a bunch of numeric values. 644 00:30:08,630 --> 00:30:10,690 Oh and by the way, they all have the same row. 645 00:30:10,690 --> 00:30:15,200 So this is a very quick way to construct a row vector. 646 00:30:15,200 --> 00:30:19,200 I don't have to go and replicate this a that many times 647 00:30:19,200 --> 00:30:20,710 to make this work. 648 00:30:20,710 --> 00:30:23,710 Likewise here, I could have a numeric value 649 00:30:23,710 --> 00:30:30,070 for the actual rows, a string value for the columns. 650 00:30:30,070 --> 00:30:31,910 And I want everyone to have the same value. 651 00:30:31,910 --> 00:30:34,470 They're all going to have the value of a. 652 00:30:34,470 --> 00:30:36,680 And you can do variations on that theme. 653 00:30:36,680 --> 00:30:41,380 So you do-- these are fast ways to create row 654 00:30:41,380 --> 00:30:43,830 vectors, or column vectors or values 655 00:30:43,830 --> 00:30:45,630 with a constant numeric value 656 00:30:45,630 --> 00:30:47,930 and other types of things like that. 657 00:30:47,930 --> 00:30:50,164 And then we can just display one of these. 658 00:30:50,164 --> 00:30:52,580 And you see here-- I think it was which one we displayed-- 659 00:30:52,580 --> 00:30:55,220 this first one, which is a row. 
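The constructor behavior described above, where any of the three arguments may be a scalar that gets repeated and any empty input yields an empty array, can be sketched in plain Python as follows. `build_assoc` is a hypothetical helper written for illustration; it is not the D4M constructor, and the dict it returns is only a stand-in for an associative array.

```python
def build_assoc(rows, cols, vals):
    """Build {(row, col): val} from triples; a scalar argument is repeated."""
    args = [rows, cols, vals]
    lengths = {len(a) for a in args if isinstance(a, (list, tuple))}
    if 0 in lengths:
        return {}  # any empty input short-circuits to an empty array
    if len(lengths) > 1:
        raise ValueError("list arguments must all have the same length")
    n = lengths.pop() if lengths else 1
    expanded = [a if isinstance(a, (list, tuple)) else [a] * n for a in args]
    return dict(zip(zip(expanded[0], expanded[1]), expanded[2]))


# One shared row key "a": a quick way to get a row vector.
row_vector = build_assoc("a", ["c1", "c2", "c3"], [1, 2, 3])

# Numeric row keys, string column keys, one shared string value.
constant = build_assoc([1, 2, 3], ["c1", "c2", "c3"], "a")

empty = build_assoc([], ["c1", "c2"], 1)

print(row_vector)
print(constant)
print(empty)
```

Repeating the scalar inside the constructor is what spares the caller from replicating the shared key or value by hand.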
660 00:30:55,220 --> 00:30:57,507 And you can see when you display that, it just 661 00:30:57,507 --> 00:30:58,590 has one value for the row. 662 00:30:58,590 --> 00:30:59,923 These are the different columns. 663 00:30:59,923 --> 00:31:01,302 These are the different values. 664 00:31:01,302 --> 00:31:02,760 Then again, if we display full, you 665 00:31:02,760 --> 00:31:06,230 see in the tabular form that's what you have. 666 00:31:06,230 --> 00:31:10,640 So that brings us to the end of the first lecture. 667 00:31:10,640 --> 00:31:14,830 I'm happy to stay for questions that people might have. 668 00:31:14,830 --> 00:31:18,260 Please go and check out your LLGrid account. 669 00:31:18,260 --> 00:31:19,680 Copy the examples. 670 00:31:19,680 --> 00:31:21,807 Don't try and work out of the-- you'll 671 00:31:21,807 --> 00:31:24,140 get, like, permission denied errors and stuff like that. 672 00:31:24,140 --> 00:31:26,180 Copy the examples to your home directory. 673 00:31:26,180 --> 00:31:27,990 Try and get your D4M working. 674 00:31:27,990 --> 00:31:29,870 Try and run just this first example. 675 00:31:29,870 --> 00:31:32,730 Make sure it behaves the same way that we had here. 676 00:31:32,730 --> 00:31:36,680 And then send email to grid-help. 677 00:31:36,680 --> 00:31:38,770 And then as we get into the next week, 678 00:31:38,770 --> 00:31:41,522 we'll do homeworks that are a little bit more substantive. 679 00:31:41,522 --> 00:31:43,480 But this is just to make sure the technology is 680 00:31:43,480 --> 00:31:45,870 working for you. 681 00:31:45,870 --> 00:31:49,250 And that's-- we'll wrap it up there. 682 00:31:49,250 --> 00:31:50,930 Thank you.