1 00:00:00,040 --> 00:00:02,410 The following content is provided under a Creative 2 00:00:02,410 --> 00:00:03,790 Commons license. 3 00:00:03,790 --> 00:00:06,030 Your support will help MIT OpenCourseWare 4 00:00:06,030 --> 00:00:10,100 continue to offer high quality educational resources for free. 5 00:00:10,100 --> 00:00:12,680 To make a donation or to view additional materials 6 00:00:12,680 --> 00:00:16,590 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:16,590 --> 00:00:17,260 at ocw.mit.edu. 8 00:00:22,690 --> 00:00:23,800 PROFESSOR: All right. 9 00:00:23,800 --> 00:00:24,050 All right. 10 00:00:24,050 --> 00:00:25,800 So we're going to continue now and kind of 11 00:00:25,800 --> 00:00:31,474 get into the examples part of the course. 12 00:00:31,474 --> 00:00:32,890 And again, this is really the part 13 00:00:32,890 --> 00:00:36,680 that people should really-- I strongly encourage 14 00:00:36,680 --> 00:00:38,750 people to do the examples. 15 00:00:38,750 --> 00:00:41,250 How many people got a chance to try the examples 16 00:00:41,250 --> 00:00:43,170 in the-- very good. 17 00:00:43,170 --> 00:00:47,260 And we got a couple of questions from people. 18 00:00:47,260 --> 00:00:50,390 And most of the time we got another e-mail from them 19 00:00:50,390 --> 00:00:52,370 right away that said, no, got it to work. 20 00:00:52,370 --> 00:00:55,620 So we're very pleased about that. 21 00:00:55,620 --> 00:00:58,440 The great thing about working at Lincoln Labs 22 00:00:58,440 --> 00:01:01,630 is people are pretty good at figuring stuff out. 23 00:01:05,560 --> 00:01:07,280 We have some experience with this from before. 24 00:01:07,280 --> 00:01:09,310 Obviously, our parallel MATLAB technology, 25 00:01:09,310 --> 00:01:11,030 we've released that. 26 00:01:11,030 --> 00:01:12,220 Many, many people use it. 
27 00:01:12,220 --> 00:01:14,040 And for the most part, we're proud to say 28 00:01:14,040 --> 00:01:17,610 that whenever we get a question from the outside world, 29 00:01:17,610 --> 00:01:19,330 and it's not working, it's usually 30 00:01:19,330 --> 00:01:22,467 the person hasn't taken severely active steps to ignore 31 00:01:22,467 --> 00:01:23,217 the documentation. 32 00:01:26,670 --> 00:01:30,500 But if they follow it at all, they're generally pretty safe. 33 00:01:30,500 --> 00:01:34,260 So I'm going to get into the example code here. 34 00:01:34,260 --> 00:01:38,060 So the example code this is-- oh, that's not, 35 00:01:38,060 --> 00:01:39,980 D4Muser_share, that's incorrect. 36 00:01:39,980 --> 00:01:43,640 It's tools lower case examples. 37 00:01:43,640 --> 00:01:44,890 I should correct these slides. 38 00:02:02,750 --> 00:02:03,570 Oh, nice. 39 00:02:03,570 --> 00:02:06,270 There we go. 40 00:02:06,270 --> 00:02:08,169 Great. 41 00:02:08,169 --> 00:02:09,410 That is an x, right? 42 00:02:09,410 --> 00:02:10,161 I hope so. 43 00:02:14,450 --> 00:02:15,690 Oh yes, thank you. 44 00:02:18,580 --> 00:02:19,080 Great. 45 00:02:21,790 --> 00:02:22,750 All right. 46 00:02:22,750 --> 00:02:25,800 So this is where the example code is. 47 00:02:25,800 --> 00:02:28,630 We're going to go over here. 48 00:02:28,630 --> 00:02:33,280 Your assignment, should you choose to undertake it, 49 00:02:33,280 --> 00:02:40,650 is to pick a picture, hopefully something 50 00:02:40,650 --> 00:02:42,044 that's interesting to you. 51 00:02:42,044 --> 00:02:43,460 Doesn't have to be straight lines, 52 00:02:43,460 --> 00:02:46,635 it could be anything where you could create, draw edges, 53 00:02:46,635 --> 00:02:48,260 or something, you could take some photo 54 00:02:48,260 --> 00:02:51,567 and put it through one of those Photoshop effects 55 00:02:51,567 --> 00:02:53,400 and turn it into a line drawing or something 56 00:02:53,400 --> 00:02:54,775 and then start labeling vertices. 
57 00:02:57,680 --> 00:03:00,370 You do want to limit the number of lines though. 58 00:03:00,370 --> 00:03:03,300 I would recommend 20 as the largest 59 00:03:03,300 --> 00:03:04,300 number of lines you do. 60 00:03:04,300 --> 00:03:06,490 Because this can get fairly involved. 61 00:03:06,490 --> 00:03:08,450 You might start with something fairly small. 62 00:03:08,450 --> 00:03:09,510 So select a picture. 63 00:03:09,510 --> 00:03:10,790 Label the edges and vertices. 64 00:03:10,790 --> 00:03:14,130 Create an incidence matrix. 65 00:03:14,130 --> 00:03:16,940 And then just compute the adjacency matrix 66 00:03:16,940 --> 00:03:19,580 from the incidence matrix using this formula. 67 00:03:19,580 --> 00:03:22,280 You can certainly use D4M if you want to do that. 68 00:03:22,280 --> 00:03:24,420 But it's fine to do it by hand. 69 00:03:24,420 --> 00:03:26,680 It's fine to do it by hand, in which case 70 00:03:26,680 --> 00:03:28,720 you'd want to do a very small one. 71 00:03:28,720 --> 00:03:33,674 Doing it once by hand is useful. 72 00:03:33,674 --> 00:03:35,310 Twice marginally. 73 00:03:35,310 --> 00:03:37,767 And then the third time, not useful. 74 00:03:37,767 --> 00:03:39,100 Just like in high school, right? 75 00:03:39,100 --> 00:03:40,610 They make you do matrix multiplies 76 00:03:40,610 --> 00:03:45,121 by hand like actually a lot. 77 00:03:45,121 --> 00:03:45,620 A lot. 78 00:03:49,140 --> 00:03:50,806 So that's the example in the assignment. 79 00:03:53,540 --> 00:03:57,010 If you do this on a piece of paper 80 00:03:57,010 --> 00:04:00,850 and scan it and email it to me before the next lecture, 81 00:04:00,850 --> 00:04:03,020 I will give you some feedback on it 82 00:04:03,020 --> 00:04:06,920 before the following lecture. 83 00:04:06,920 --> 00:04:12,670 If you give it to me after that, too bad. 84 00:04:12,670 --> 00:04:14,380 I just won't give you feedback. 85 00:04:14,380 --> 00:04:15,730 So there's no credit. 
86 00:04:15,730 --> 00:04:18,890 But this gives you some incentive to get it done. 87 00:04:18,890 --> 00:04:22,250 If you want feedback from me about what you did makes sense 88 00:04:22,250 --> 00:04:24,410 or if you encountered like I picked this picture 89 00:04:24,410 --> 00:04:27,150 and now I don't understand-- I did my best 90 00:04:27,150 --> 00:04:29,460 but I have questions about how I did this. 91 00:04:29,460 --> 00:04:30,590 Does it make sense? 92 00:04:30,590 --> 00:04:31,590 I'd be happy to do that. 93 00:04:31,590 --> 00:04:35,370 So just scan it, email it to me, or if it is electronic format 94 00:04:35,370 --> 00:04:36,370 you can do that as well. 95 00:04:36,370 --> 00:04:43,850 So the lecture portion and now go back here. 96 00:04:43,850 --> 00:04:49,970 And let's see here, we go to examples. 97 00:04:49,970 --> 00:04:50,740 All right. 98 00:04:50,740 --> 00:04:56,720 So we're in the intro, EdgeArt directory, 99 00:04:56,720 --> 00:04:59,490 start my MATLAB shell. 100 00:05:05,000 --> 00:05:05,500 MATLAB. 101 00:05:11,740 --> 00:05:14,410 This takes about 15 seconds for it to do this. 102 00:05:22,702 --> 00:05:25,410 This particular thing always CDs me 103 00:05:25,410 --> 00:05:27,580 to the location of the directory. 104 00:05:27,580 --> 00:05:32,400 You need to do that in order to see the examples. 105 00:05:32,400 --> 00:05:34,490 I think most MATLAB users are familiar with that. 106 00:05:34,490 --> 00:05:36,650 But just some that are not. 107 00:05:40,040 --> 00:05:47,370 Here is the actual adjacency matrix of that painting. 108 00:05:47,370 --> 00:05:50,200 So there it is. 109 00:05:50,200 --> 00:05:51,440 Same thing we just saw. 110 00:05:51,440 --> 00:05:53,898 So this is the data set that we're going to work with here. 111 00:06:00,540 --> 00:06:02,820 So I'm going to do EA1. 
112 00:06:02,820 --> 00:06:08,200 Again, all the examples are numbered in order, 113 00:06:08,200 --> 00:06:10,650 kind of that you're supposed to proceed through them. 114 00:06:10,650 --> 00:06:13,230 And if you see a file in a directory that's 115 00:06:13,230 --> 00:06:16,220 like a couple of capital letters, which are the first two 116 00:06:16,220 --> 00:06:20,650 letters of the directory or the abbreviation of the directory 117 00:06:20,650 --> 00:06:23,740 in a test, that is the examples. 118 00:06:23,740 --> 00:06:25,500 The other files are just supporting files. 119 00:06:25,500 --> 00:06:27,606 You are not expected to run those. 120 00:06:27,606 --> 00:06:29,480 You can try running them but they'll probably 121 00:06:29,480 --> 00:06:33,710 cause you problems if you run them by themselves. 122 00:06:33,710 --> 00:06:34,850 All right, here we go. 123 00:06:34,850 --> 00:06:37,890 So we just ran that example. 124 00:06:37,890 --> 00:06:39,174 Let's see what we got. 125 00:07:01,880 --> 00:07:03,780 I think that worked. 126 00:07:03,780 --> 00:07:09,510 So the first thing we did here is we read in our data. 127 00:07:09,510 --> 00:07:12,604 So we have this ReadCSV, very useful function. 128 00:07:15,920 --> 00:07:17,590 Not necessarily the fastest. 129 00:07:17,590 --> 00:07:20,770 In some of the later examples we have a sort 130 00:07:20,770 --> 00:07:25,500 of a super fast reader, which does 131 00:07:25,500 --> 00:07:27,240 almost no formatting for you. 132 00:07:27,240 --> 00:07:30,150 Just reads in the triples as raw triples. 133 00:07:30,150 --> 00:07:34,510 Because a lot of times if you're reading a very large CSV file, 134 00:07:34,510 --> 00:07:36,750 you may not want to construct the full associative 135 00:07:36,750 --> 00:07:39,629 array the way we would do it by default. 136 00:07:39,629 --> 00:07:41,670 And you can just read in the triples very quickly 137 00:07:41,670 --> 00:07:43,479 then do some manipulations. 
138 00:07:43,479 --> 00:07:45,270 Or a lot of times the first thing you'll do 139 00:07:45,270 --> 00:07:48,097 is create an associative array and then get the triples out 140 00:07:48,097 --> 00:07:50,430 of it, because you want to work on the triples directly. 141 00:07:50,430 --> 00:07:52,750 And so we do have ways for doing that. 142 00:07:52,750 --> 00:07:57,940 So we now created the incidence matrix 143 00:07:57,940 --> 00:08:01,270 into an associative array. 144 00:08:01,270 --> 00:08:07,960 And let me just show you that. 145 00:08:07,960 --> 00:08:14,330 So if I go disp(E), so that shows you the incidence matrix. 146 00:08:14,330 --> 00:08:17,010 These are its internal structures here. 147 00:08:17,010 --> 00:08:18,300 These are the row keys. 148 00:08:18,300 --> 00:08:23,250 So those are the labels of the edges. 149 00:08:23,250 --> 00:08:29,440 They have been converted to lexicographically sorted order. 150 00:08:29,440 --> 00:08:30,580 We have the column keys. 151 00:08:30,580 --> 00:08:33,610 But there's so many that MATLAB chooses not to print out 152 00:08:33,610 --> 00:08:34,780 the full list here. 153 00:08:34,780 --> 00:08:36,780 But it's again, a lexicographically sorted 154 00:08:36,780 --> 00:08:39,630 list of the unique labels. 155 00:08:39,630 --> 00:08:43,210 These are the different values that we have. 156 00:08:43,210 --> 00:08:45,720 So you remember we had an order column, which 157 00:08:45,720 --> 00:08:50,740 is one, two, three, blue, green, orange, pink, silver, etc. 158 00:08:50,740 --> 00:08:55,790 And here and then the overall adjacency matrix, 159 00:08:55,790 --> 00:09:01,735 which connects the row keys to their value keys is 19 by 21. 160 00:09:04,410 --> 00:09:06,710 19 rows by 21 columns. 161 00:09:06,710 --> 00:09:10,870 This actually shows a mixed use. 162 00:09:10,870 --> 00:09:13,230 Some of these are essentially exploded, right? 163 00:09:13,230 --> 00:09:15,882 The fact that we have each edge. 
164 00:09:15,882 --> 00:09:17,340 But some of the values are actually 165 00:09:17,340 --> 00:09:18,490 storing real stuff here. 166 00:09:18,490 --> 00:09:20,490 Blue, green, orange, etc. 167 00:09:20,490 --> 00:09:23,044 So you don't have to go all the way with one. 168 00:09:23,044 --> 00:09:24,210 You can mix them around too. 169 00:09:24,210 --> 00:09:26,690 You can create dense matrices that look more 170 00:09:26,690 --> 00:09:27,690 like traditional tables. 171 00:09:27,690 --> 00:09:29,650 And there are times that you want to do that. 172 00:09:29,650 --> 00:09:32,180 And other times you can completely 173 00:09:32,180 --> 00:09:34,480 explode it out and do things like that as well. 174 00:09:39,490 --> 00:09:42,390 So we go back. 175 00:09:42,390 --> 00:09:45,830 We just want to work with the vertices. 176 00:09:45,830 --> 00:09:49,270 So we're going to ignore these color and order columns. 177 00:09:49,270 --> 00:09:51,450 So we're going to create our first projection is 178 00:09:51,450 --> 00:09:53,560 to just work with the vertices. 179 00:09:53,560 --> 00:09:56,640 So we have this little shorthand here called StartsWith, 180 00:09:56,640 --> 00:09:59,110 if you give it a character. 181 00:09:59,110 --> 00:10:01,700 And it will create a little query 182 00:10:01,700 --> 00:10:05,500 that will get just things that start with that. 183 00:10:05,500 --> 00:10:07,020 And very useful thing. 184 00:10:07,020 --> 00:10:08,130 It's very efficient too. 185 00:10:08,130 --> 00:10:11,650 It actually formally creates the two boundaries. 186 00:10:11,650 --> 00:10:14,850 It figures out what the maximum range is for that. 187 00:10:14,850 --> 00:10:17,337 And then just needs to look up the beginning and the end. 188 00:10:17,337 --> 00:10:19,670 It can assume lexicographical order and then 189 00:10:19,670 --> 00:10:21,930 it can just grab the middle. 
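The "two lookups and grab the middle" idea described above can be sketched in a few lines. This is only an illustration in Python, not D4M code: the function name `starts_with` and the sample keys are hypothetical, and it assumes the keys are already held in lexicographically sorted order.

```python
from bisect import bisect_left

def starts_with(sorted_keys, prefix):
    """Return every key that begins with `prefix` (assumed non-empty).

    Assumes `sorted_keys` is lexicographically sorted, so the matches
    form one contiguous block found with two binary searches.
    """
    # Lower boundary: first key that is >= the prefix itself.
    lo = bisect_left(sorted_keys, prefix)
    # Upper boundary: bump the last character of the prefix by one code
    # point; every key with the prefix sorts strictly below this string.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect_left(sorted_keys, upper)
    # Everything between the two boundaries starts with the prefix.
    return sorted_keys[lo:hi]

# Hypothetical column keys, loosely modeled on the lecture's table.
keys = sorted(["Color", "Order", "V01", "V02", "V10", "W1"])
vertex_keys = starts_with(keys, "V")
```

No key is ever compared against a pattern one by one, which is why this kind of range query stays cheap on large sorted key sets where a regular-expression scan would not.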
190 00:10:21,930 --> 00:10:24,350 We have a regular expression approach to it, 191 00:10:24,350 --> 00:10:26,550 which works fine on small associative arrays. 192 00:10:26,550 --> 00:10:28,591 But this is much more efficient because it's just 193 00:10:28,591 --> 00:10:32,310 basically two lookups and go. 194 00:10:32,310 --> 00:10:38,590 It also works if this E was actually a binding to a table 195 00:10:38,590 --> 00:10:39,910 in a database. 196 00:10:39,910 --> 00:10:42,410 You can use this same syntax, while the regular expression 197 00:10:42,410 --> 00:10:43,630 syntax doesn't really work. 198 00:10:43,630 --> 00:10:44,441 Yes? 199 00:10:44,441 --> 00:10:44,940 [INAUDIBLE] 200 00:10:49,650 --> 00:10:53,290 PROFESSOR: So that's the delimiter of the string. 201 00:10:53,290 --> 00:10:56,760 It's a string list with one entry in it. 202 00:10:56,760 --> 00:10:58,120 So this is a string list. 203 00:10:58,120 --> 00:11:00,610 All our strings are lists here. 204 00:11:00,610 --> 00:11:03,060 Well, because this also allows me-- 205 00:11:03,060 --> 00:11:06,841 could've been anything that was unique to that. 206 00:11:06,841 --> 00:11:08,840 And it didn't have to be the same delimiter that 207 00:11:08,840 --> 00:11:09,942 was the one inside. 208 00:11:09,942 --> 00:11:12,150 It could have been a new line, could have been a tab, 209 00:11:12,150 --> 00:11:13,116 could be whatever. 210 00:11:13,116 --> 00:11:14,990 I just picked comma because it was convenient 211 00:11:14,990 --> 00:11:17,130 but could have been a space. 212 00:11:17,130 --> 00:11:19,910 Because you could have multiple arguments to the StartsWith. 213 00:11:19,910 --> 00:11:22,790 If you want to say I want starts with this and starts with this 214 00:11:22,790 --> 00:11:25,120 and starts with this, you just give it a whole list 215 00:11:25,120 --> 00:11:27,531 and it will go and do that all for you. 216 00:11:27,531 --> 00:11:29,030 And so that's why we do it that way. 
217 00:11:29,030 --> 00:11:32,150 Wherever possible, we try and accept a list of strings. 218 00:11:32,150 --> 00:11:33,860 And then do the right thing. 219 00:11:33,860 --> 00:11:35,400 So basically, this is saying, get me 220 00:11:35,400 --> 00:11:37,290 all columns that start with V. So that 221 00:11:37,290 --> 00:11:38,630 would be all the vertices. 222 00:11:38,630 --> 00:11:41,000 And this is the colon, which as you know 223 00:11:41,000 --> 00:11:44,930 means I want the full row of that. 224 00:11:44,930 --> 00:11:50,020 These had these values of-- which were the order 225 00:11:50,020 --> 00:11:51,940 group that they were in. 226 00:11:51,940 --> 00:11:54,780 And so we don't care about those. 227 00:11:54,780 --> 00:11:58,084 We want to convert them back to regular numeric numbers. 228 00:11:58,084 --> 00:11:59,500 And so we have this shorthand here 229 00:11:59,500 --> 00:12:01,580 called dblLogi, which basically just takes 230 00:12:01,580 --> 00:12:05,630 the logical and then does the double thing mathematically. 231 00:12:05,630 --> 00:12:08,370 Formally, you would call this the infinity norm 232 00:12:08,370 --> 00:12:11,390 on strings, if that even makes sense. 233 00:12:11,390 --> 00:12:14,430 But it basically just says, if there's a value there convert it 234 00:12:14,430 --> 00:12:17,420 to a double precision one. 235 00:12:17,420 --> 00:12:21,642 If there's not a value there, it's just a sparse 0, not-- 236 00:12:21,642 --> 00:12:23,350 And we do this all the time because a lot 237 00:12:23,350 --> 00:12:25,760 of times we want to do math on these things. 238 00:12:25,760 --> 00:12:27,810 It's like regular math. 239 00:12:27,810 --> 00:12:30,330 We want to compute the vertex adjacency graph. 240 00:12:30,330 --> 00:12:33,370 So we do that with our sqIn function 241 00:12:33,370 --> 00:12:36,120 here, our inner square product. 
242 00:12:36,120 --> 00:12:39,440 So that basically takes our matrix 243 00:12:39,440 --> 00:12:43,340 and uses the edge as the common key to join them together. 244 00:12:43,340 --> 00:12:45,040 And we use our displayFull function 245 00:12:45,040 --> 00:12:47,380 to actually show what we got. 246 00:12:47,380 --> 00:12:54,540 And so now you see the vertex adjacency matrix here. 247 00:12:54,540 --> 00:12:56,410 And then the number is the number 248 00:12:56,410 --> 00:13:00,920 of times those vertices are on the same edge. 249 00:13:00,920 --> 00:13:03,040 So it's obviously symmetric. 250 00:13:03,040 --> 00:13:05,730 And you see here these vertices here, 251 00:13:05,730 --> 00:13:08,070 which are part of this common hyper edge, 252 00:13:08,070 --> 00:13:12,000 we have this sort of 6 9 6 structure here, 253 00:13:12,000 --> 00:13:14,450 various other types of symmetric structures 254 00:13:14,450 --> 00:13:17,080 going on here showing you the counts, 255 00:13:17,080 --> 00:13:20,360 how many times those vertices appeared on the same edge. 256 00:13:20,360 --> 00:13:22,700 These are obviously cliques, because they 257 00:13:22,700 --> 00:13:25,301 are all vertices that are on the same hyper edge line. 258 00:13:25,301 --> 00:13:26,717 So if you have the same hyper edge 259 00:13:26,717 --> 00:13:30,260 line whenever you do the squaring operation, 260 00:13:30,260 --> 00:13:35,277 you see it as this clique in this operation here. 261 00:13:39,750 --> 00:13:45,770 We can also do the sqOut function, which computes 262 00:13:45,770 --> 00:13:47,130 the edge adjacency matrix. 263 00:13:47,130 --> 00:13:53,740 So now we want to ask, which edges share common vertices? 264 00:13:53,740 --> 00:13:56,250 And so that's this structure here. 265 00:13:59,020 --> 00:14:02,417 It shows you these are the various edges. 266 00:14:02,417 --> 00:14:03,500 And again, it's symmetric. 
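The two squaring operations described above can be sketched with plain dictionaries. This is a Python illustration under stated assumptions, not the D4M implementation: `multiply`, `transpose`, and the tiny incidence data are hypothetical, and it takes the inner square to be E'E (joined on the edge key) and the outer square to be EE' (joined on the vertex key), which is what the lecture's description suggests.

```python
from collections import defaultdict

def transpose(A):
    """Swap row and column keys of a {(row, col): number} dict."""
    return {(j, i): v for (i, j), v in A.items()}

def multiply(A, B):
    """Sparse product: join A's column keys with B's row keys and sum."""
    b_rows = defaultdict(list)
    for (k, j), b in B.items():
        b_rows[k].append((j, b))
    C = defaultdict(int)
    for (i, k), a in A.items():
        for j, b in b_rows.get(k, []):
            C[(i, j)] += a * b
    return dict(C)

# Hypothetical incidence data: rows are edges, columns are vertices.
E = {("O1", "V1"): 1, ("O1", "V2"): 1, ("O1", "V3"): 1,
     ("G1", "V3"): 1, ("G1", "V4"): 1}

vertex_adj = multiply(transpose(E), E)  # inner square: vertex x vertex
edge_adj = multiply(E, transpose(E))    # outer square: edge x edge
```

In `vertex_adj` each entry counts the edges two vertices share, so the three vertices on edge `O1` form a small clique; in `edge_adj` each entry counts the vertices two edges share.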
267 00:14:03,500 --> 00:14:05,125 And you can see all kinds of structures 268 00:14:05,125 --> 00:14:08,530 in here about which vertices share-- which 269 00:14:08,530 --> 00:14:13,390 edges share common vertices. 270 00:14:13,390 --> 00:14:16,320 And that is pretty much it for this example. 271 00:14:16,320 --> 00:14:17,980 Very simple example. 272 00:14:17,980 --> 00:14:19,990 Hopefully it gets you to the point 273 00:14:19,990 --> 00:14:23,510 where you can begin to try out the homework assignment. 274 00:14:23,510 --> 00:14:28,618 And if there's any questions that people have, 275 00:14:28,618 --> 00:14:29,490 let's try this. 276 00:14:29,490 --> 00:14:31,350 Here, one thing that's fun to do is spy(E). 277 00:14:34,186 --> 00:14:35,600 See if that works. 278 00:14:50,840 --> 00:14:53,460 I think between PowerPoint, QuickTime and everything 279 00:14:53,460 --> 00:14:57,080 else it's just about-- there it is. 280 00:14:57,080 --> 00:14:58,750 Wow, that took a long time. 281 00:14:58,750 --> 00:15:02,330 So this shows that adjacency matrix. 282 00:15:02,330 --> 00:15:04,197 So spy works. 283 00:15:04,197 --> 00:15:06,530 And I don't know if we showed that in the first example, 284 00:15:06,530 --> 00:15:10,390 but the spy function, the standard MATLAB sparse plotting 285 00:15:10,390 --> 00:15:13,380 function, very useful way to just sort of look at the stuff, 286 00:15:13,380 --> 00:15:16,620 all those previous charts I showed you on the examples 287 00:15:16,620 --> 00:15:17,790 were done with this thing. 288 00:15:17,790 --> 00:15:20,150 And a little feature that sometimes 289 00:15:20,150 --> 00:15:22,620 causes MATLAB to crash because it doesn't really like 290 00:15:22,620 --> 00:15:23,870 doing things too aggressively. 291 00:15:23,870 --> 00:15:28,582 But if you click on it, it shows you the row and the column 292 00:15:28,582 --> 00:15:30,040 and the value associated with that, 293 00:15:30,040 --> 00:15:31,960 and also prints it out here. 
294 00:15:31,960 --> 00:15:33,410 Likewise here. 295 00:15:33,410 --> 00:15:36,830 So you see row, column, that gives you the order. 296 00:15:36,830 --> 00:15:41,450 And this tells you the row is G2, its column, its color, 297 00:15:41,450 --> 00:15:43,240 its value is green. 298 00:15:43,240 --> 00:15:46,257 So actually, I lied to you. 299 00:15:46,257 --> 00:15:47,590 That was just the first example. 300 00:15:50,780 --> 00:15:51,830 Senior moment there. 301 00:15:51,830 --> 00:15:53,805 We have two more to go. 302 00:15:53,805 --> 00:15:54,890 Lot more fun. 303 00:15:54,890 --> 00:15:59,460 See, you're all excited that class ended early. 304 00:15:59,460 --> 00:16:01,510 Here's the other examples. 305 00:16:01,510 --> 00:16:05,077 So again, what we do is we're reading in the data set. 306 00:16:05,077 --> 00:16:06,410 We're displaying the full thing. 307 00:16:06,410 --> 00:16:08,701 So you can see there's the full beast I just showed you 308 00:16:08,701 --> 00:16:09,655 the spy plot. 309 00:16:09,655 --> 00:16:11,530 Now, I'm going to use some analytics on this. 310 00:16:11,530 --> 00:16:14,740 So one thing I want to do is just get the orange edges. 311 00:16:14,740 --> 00:16:19,090 So I can say, get me the column color. 312 00:16:19,090 --> 00:16:21,870 And because orange is held in the value, 313 00:16:21,870 --> 00:16:23,890 I can actually compare it with the string. 314 00:16:23,890 --> 00:16:28,230 So I say, return me the matrix, the associative array 315 00:16:28,230 --> 00:16:33,440 of the row, color and everything equal to orange. 316 00:16:33,440 --> 00:16:35,490 So that returns me another associative array. 317 00:16:35,490 --> 00:16:38,230 The result of this whole logical expression 318 00:16:38,230 --> 00:16:40,330 is another associative array. 319 00:16:40,330 --> 00:16:45,000 I can then get the rows for those associative arrays. 320 00:16:45,000 --> 00:16:54,440 And I've found all edges that are orange. 
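The value-based selection being walked through here (compare the color column against a string, collect the matching row keys, then feed those rows back into the original array) can be sketched as follows. This is a Python illustration only: the helper names `rows_where` and `select_rows` and the sample entries are hypothetical, not D4M functions or the lecture's data.

```python
# Hypothetical associative array stored as {(row key, column key): value}.
E = {
    ("O1", "Color"): "orange", ("O1", "V1"): "1", ("O1", "V2"): "1",
    ("G1", "Color"): "green",  ("G1", "V2"): "2", ("G1", "V3"): "2",
    ("O2", "Color"): "orange", ("O2", "V3"): "1",
}

def rows_where(A, col, value):
    """Row keys whose entry in column `col` equals `value`."""
    return sorted({r for (r, c), v in A.items() if c == col and v == value})

def select_rows(A, rows):
    """Sub-array holding every column of the requested rows."""
    keep = set(rows)
    return {(r, c): v for (r, c), v in A.items() if r in keep}

# Step 1: which edges are orange?  Step 2: pull those full rows back out.
orange_rows = rows_where(E, "Color", "orange")
E_orange = select_rows(E, orange_rows)
```

Each step returns something the next step can consume directly, which is the composability the lecture is pointing at.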
321 00:16:54,440 --> 00:16:58,570 And I can then pass that, get those rows 322 00:16:58,570 --> 00:17:01,190 and pass that back into the original one 323 00:17:01,190 --> 00:17:05,280 to get just the orange edges. 324 00:17:05,280 --> 00:17:09,140 So this just shows you this composable ability 325 00:17:09,140 --> 00:17:12,220 is very powerful, same ability that MATLAB has. 326 00:17:12,220 --> 00:17:14,849 One does similar statements in MATLAB all the time. 327 00:17:14,849 --> 00:17:16,364 And so it's very composable. 328 00:17:16,364 --> 00:17:17,780 This would be a very complex query 329 00:17:17,780 --> 00:17:19,180 to do in other approaches. 330 00:17:19,180 --> 00:17:20,680 And you can do it very quickly here. 331 00:17:20,680 --> 00:17:20,990 Yes? 332 00:17:20,990 --> 00:17:21,489 Question? 333 00:17:21,489 --> 00:17:23,310 [INAUDIBLE] would that be called to route 334 00:17:23,310 --> 00:17:27,932 since it is in the row with the argument [INAUDIBLE]? 335 00:17:30,860 --> 00:17:31,870 No. 336 00:17:31,870 --> 00:17:34,170 It doesn't. 337 00:17:34,170 --> 00:17:35,980 It doesn't. 338 00:17:35,980 --> 00:17:40,220 What we could do is overload it so if you passed 339 00:17:40,220 --> 00:17:43,910 an associative array with just a single argument 340 00:17:43,910 --> 00:17:48,250 and have it behave like when you pass in the MATLAB logical 341 00:17:48,250 --> 00:17:50,000 and do that kind of intersection, 342 00:17:50,000 --> 00:17:51,980 we could think about that. 343 00:17:51,980 --> 00:17:55,590 But MATLAB actually relies on then converting 344 00:17:55,590 --> 00:18:00,460 the matrix to a vector and then using that singular-- 345 00:18:00,460 --> 00:18:02,335 basically, it takes that logical, converts it 346 00:18:02,335 --> 00:18:05,120 to a single [? index, ?] which kind of gets us 347 00:18:05,120 --> 00:18:08,590 into an area where it's very fuzzy and gray. 
348 00:18:08,590 --> 00:18:11,080 So we definitely thought about that 349 00:18:11,080 --> 00:18:16,910 and there are places where we line up with MATLAB very nicely 350 00:18:16,910 --> 00:18:19,790 because the mathematics of associative arrays 351 00:18:19,790 --> 00:18:23,200 and linear algebra really line up very nicely. 352 00:18:23,200 --> 00:18:25,180 And there's places where they intersect. 353 00:18:25,180 --> 00:18:28,317 There are things you can do with associative arrays 354 00:18:28,317 --> 00:18:29,900 that you can't do with linear algebra. 355 00:18:29,900 --> 00:18:33,490 For example, we can matrix multiply any two associative 356 00:18:33,490 --> 00:18:34,160 arrays. 357 00:18:34,160 --> 00:18:35,676 There's no conformance requirements. 358 00:18:35,676 --> 00:18:37,550 That is, they don't have the same-- normally, 359 00:18:37,550 --> 00:18:43,190 when you multiply two matrices-- the inner dimensions have 360 00:18:43,190 --> 00:18:44,140 to match, basically. 361 00:18:44,140 --> 00:18:47,850 The columns of the left argument and the rows 362 00:18:47,850 --> 00:18:50,160 of the right argument must be the same length. 363 00:18:50,160 --> 00:18:52,470 In associative arrays that requirement doesn't exist. 364 00:18:52,470 --> 00:18:55,990 You can multiply or add any two associative arrays. 365 00:18:55,990 --> 00:18:58,470 We'll get into that much more in the next class. 366 00:18:58,470 --> 00:19:01,370 But that's a very powerful feature that people like. 367 00:19:01,370 --> 00:19:04,960 But there's other places where it doesn't really make sense. 368 00:19:04,960 --> 00:19:09,150 So things like, what is the upper triangular section 369 00:19:09,150 --> 00:19:12,630 of an associative array gets to be a little bit-- huh, what's 370 00:19:12,630 --> 00:19:13,420 that mean? 371 00:19:13,420 --> 00:19:15,680 So excellent question though. 372 00:19:15,680 --> 00:19:16,820 Excellent question though. 
373 00:19:16,820 --> 00:19:18,560 So we can display this. 374 00:19:18,560 --> 00:19:24,170 So this just shows we found all the edges that 375 00:19:24,170 --> 00:19:27,330 had color orange. 376 00:19:27,330 --> 00:19:29,060 But we don't have to just-- that's a way 377 00:19:29,060 --> 00:19:30,980 of using the values to select. 378 00:19:30,980 --> 00:19:35,240 We could, of course, use StartsWith in the rows, 379 00:19:35,240 --> 00:19:36,870 just say StartsWith O and G. I just 380 00:19:36,870 --> 00:19:40,540 happened to label all my rows so that orange rows begin with O 381 00:19:40,540 --> 00:19:43,070 and green rows begin with G. 382 00:19:43,070 --> 00:19:44,960 So we can do StartsWith O G. So that's 383 00:19:44,960 --> 00:19:47,910 an example of those two range queries being 384 00:19:47,910 --> 00:19:49,240 done at the same time. 385 00:19:49,240 --> 00:19:51,250 And so that gives us EOG. 386 00:19:51,250 --> 00:19:59,610 And again, we can see now we get the orange and green rows. 387 00:19:59,610 --> 00:20:01,490 So that just shows you that. 388 00:20:01,490 --> 00:20:05,025 And then we'll wrap up here with the last example, which is EA3. 389 00:20:08,270 --> 00:20:10,800 Now, we get to do some really fun stuff. 390 00:20:10,800 --> 00:20:15,410 Again, we're reading in our data set, 391 00:20:15,410 --> 00:20:19,620 converting, just getting the vertices. 392 00:20:19,620 --> 00:20:23,760 And then converting it to numeric 0's and 1's. 393 00:20:23,760 --> 00:20:28,970 I'm going to now just get the orange edges. 394 00:20:28,970 --> 00:20:32,230 And I'm going to get the green edges. 395 00:20:32,230 --> 00:20:33,710 All right. 396 00:20:33,710 --> 00:20:38,710 And now I'm going to essentially do the cross correlation 397 00:20:38,710 --> 00:20:39,960 of the orange and green edges. 398 00:20:39,960 --> 00:20:42,180 So I'm going to transpose one of them 399 00:20:42,180 --> 00:20:43,954 and matrix multiply with the other one. 
400 00:20:43,954 --> 00:20:45,620 We don't do sqIn here because they 401 00:20:45,620 --> 00:20:47,210 are different matrices. 402 00:20:47,210 --> 00:20:49,110 In fact, I would say that people often 403 00:20:49,110 --> 00:20:51,210 do the cross-correlation more often 404 00:20:51,210 --> 00:20:53,260 than just square the matrix with itself. 405 00:20:53,260 --> 00:20:55,410 You have one set with another set 406 00:20:55,410 --> 00:20:58,510 and you want to cross correlate them, a very common operation. 407 00:20:58,510 --> 00:21:01,980 And so that's just transpose EVG here. 408 00:21:01,980 --> 00:21:04,680 And that gets that. 409 00:21:04,680 --> 00:21:09,330 And as you can see, the results of that inner product 410 00:21:09,330 --> 00:21:11,690 was empty. 411 00:21:11,690 --> 00:21:14,580 But the result of the outer product is full. 412 00:21:14,580 --> 00:21:18,650 So you guys can think about why that is. 413 00:21:18,650 --> 00:21:28,630 And so this just shows you the common-- how many edges these 414 00:21:28,630 --> 00:21:31,520 appeared with together. 415 00:21:31,520 --> 00:21:33,672 This shows you that there was no-- well, 416 00:21:33,672 --> 00:21:34,880 you guys can figure that out. 417 00:21:34,880 --> 00:21:36,700 I'll let you do your homework. 418 00:21:36,700 --> 00:21:41,430 Now, another thing that we often want to do is basically, 419 00:21:41,430 --> 00:21:50,470 this showed us which edges had common shared vertices 420 00:21:50,470 --> 00:21:53,160 together. 421 00:21:53,160 --> 00:21:58,500 But it didn't tell us-- we lost the actual vertices. 422 00:21:58,500 --> 00:21:59,860 We just know that there's three. 423 00:21:59,860 --> 00:22:01,020 And a lot of times we want them. 424 00:22:01,020 --> 00:22:02,603 This is sometimes called the pedigree. 425 00:22:02,603 --> 00:22:03,980 You do a correlation. 426 00:22:03,980 --> 00:22:07,040 And then you would like-- you say these two 427 00:22:07,040 --> 00:22:08,130 things are correlated. 
428 00:22:08,130 --> 00:22:11,270 I want to get to the connector back to the original record 429 00:22:11,270 --> 00:22:13,100 so I can go back to my E matrix and really 430 00:22:13,100 --> 00:22:15,720 find that out and present that as the evidence 431 00:22:15,720 --> 00:22:17,750 that these things are connected together. 432 00:22:17,750 --> 00:22:19,210 We don't want to lose that. 433 00:22:19,210 --> 00:22:23,170 So we have invented special new matrix multiplies 434 00:22:23,170 --> 00:22:25,410 in the land of associative arrays. 435 00:22:25,410 --> 00:22:30,170 One of them we call a CatKeyMul, which 436 00:22:30,170 --> 00:22:33,750 you view the matrix multiply as sort of there's 437 00:22:33,750 --> 00:22:36,930 a key, a common key, this row column key 438 00:22:36,930 --> 00:22:39,610 that's being joined together. 439 00:22:39,610 --> 00:22:42,370 Well, we can preserve that and store that in a value, 440 00:22:42,370 --> 00:22:45,570 instead of say the count or something like that. 441 00:22:45,570 --> 00:22:50,880 So we do CatKeyMul of EVO and transpose EVG. 442 00:22:50,880 --> 00:22:54,010 And then when we display that, we not only 443 00:22:54,010 --> 00:22:57,730 see which edges were connected, but we also 444 00:22:57,730 --> 00:23:00,360 have preserved the list of the vertices. 445 00:23:00,360 --> 00:23:02,610 It's essentially a concatenated list 446 00:23:02,610 --> 00:23:05,030 of those keys, very useful thing. 447 00:23:07,860 --> 00:23:11,000 So, a very powerful technique. 448 00:23:11,000 --> 00:23:14,634 One definitely has to be careful about using this technique 449 00:23:14,634 --> 00:23:15,300 if it generates-- 450 00:23:15,300 --> 00:23:19,090 it's going to generate extremely, extremely long lists 451 00:23:19,090 --> 00:23:19,690 here. 452 00:23:19,690 --> 00:23:23,310 That can really cause this to behave very slowly. 
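The idea of a multiply that keeps the joined keys instead of a count can be sketched like this. It is a Python illustration under assumptions, not the D4M `CatKeyMul` implementation: the function `cat_key_mul`, the `;` separator, and the sample edge and vertex labels are all hypothetical.

```python
from collections import defaultdict

def cat_key_mul(A, B, sep=";"):
    """Key-preserving product of two sets of (row, col) nonzero positions.

    Where an ordinary product would store how many inner keys matched,
    this stores the matched keys themselves, joined into one string.
    """
    b_rows = defaultdict(list)
    for (k, j) in B:
        b_rows[k].append(j)
    C = defaultdict(list)
    for (i, k) in A:
        for j in b_rows.get(k, []):
            C[(i, j)].append(k)
    return {ij: sep.join(sorted(ks)) for ij, ks in C.items()}

# Hypothetical orange and green incidence entries (edge, vertex).
EVO = {("O1", "V1"), ("O1", "V2"), ("O1", "V3")}
EVG = {("G1", "V2"), ("G1", "V3"), ("G1", "V4")}

EVG_t = {(v, e) for (e, v) in EVG}   # transpose: (vertex, edge)
pedigree = cat_key_mul(EVO, EVG_t)   # orange edge x green edge
```

Here the single entry for the pair `O1`, `G1` names the shared vertices rather than just counting them, so the result can be traced back to the original records.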
453 00:23:23,310 --> 00:23:25,710 In particular, one has to be careful, 454 00:23:25,710 --> 00:23:29,850 if you do a CatKeyMul on a matrix with itself, 455 00:23:29,850 --> 00:23:32,221 why do you think you would have to be careful with that? 456 00:23:32,221 --> 00:23:32,721 Anyone? 457 00:23:35,310 --> 00:23:37,890 Remember, that dense diagonal that happened 458 00:23:37,890 --> 00:23:39,656 whenever we squared matrices? 459 00:23:39,656 --> 00:23:43,410 So that means this thing will have an enormous number 460 00:23:43,410 --> 00:23:45,720 of entries along the diagonal. 461 00:23:45,720 --> 00:23:47,680 So you have to be very careful when 462 00:23:47,680 --> 00:23:50,340 you use this kind of function to correlate something 463 00:23:50,340 --> 00:23:50,920 with itself. 464 00:23:50,920 --> 00:23:52,680 It's better to correlate two distinct sets 465 00:23:52,680 --> 00:23:55,972 because they tend to not have that dense diagonal, which can 466 00:23:55,972 --> 00:23:57,180 be a little bit of a problem. 467 00:23:57,180 --> 00:23:58,660 There's ways to get around that. 468 00:23:58,660 --> 00:24:00,960 There's hacks and tricks and stuff like that. 469 00:24:00,960 --> 00:24:03,030 But at a certain level, really dense diagonal-- 470 00:24:03,030 --> 00:24:05,113 I mean, you'd essentially be listing every single vertex 471 00:24:05,113 --> 00:24:07,990 in your data set-- you could be listing every single vertex 472 00:24:07,990 --> 00:24:10,950 in your data set almost every single time, multiple times 473 00:24:10,950 --> 00:24:11,760 along the diagonal. 474 00:24:11,760 --> 00:24:13,910 That would get very, very bulky. 475 00:24:13,910 --> 00:24:16,640 So just a little word of caution there. 476 00:24:16,640 --> 00:24:18,990 And so now, I've baited you once. 477 00:24:18,990 --> 00:24:20,780 But this really is the end. 478 00:24:20,780 --> 00:24:23,220 So are there any questions on the examples 479 00:24:23,220 --> 00:24:25,370 before we cut it off here? 
480 00:24:25,370 --> 00:24:28,610 Is there a question back here somewhere? 481 00:24:28,610 --> 00:24:29,970 No? 482 00:24:29,970 --> 00:24:30,470 All right. 483 00:24:30,470 --> 00:24:30,700 Great. 484 00:24:30,700 --> 00:24:31,360 Thank you very much. 485 00:24:31,360 --> 00:24:32,026 I appreciate it. 486 00:24:32,026 --> 00:24:34,030 And we'll see you next week. 487 00:24:34,030 --> 00:24:37,440 And again, think of a picture, draw some lines, 488 00:24:37,440 --> 00:24:43,570 make an incidence matrix, have fun with it, 489 00:24:43,570 --> 00:24:45,707 compute the degree of your significant other. 490 00:24:48,430 --> 00:24:50,370 Bad idea? 491 00:24:50,370 --> 00:24:51,920 Good.