1 00:00:01 --> 00:00:03 Yes, OK, four, three, two, one, 2 00:00:03 --> 00:00:05 OK, I see you guys are in a happy mood. 3 00:00:05 --> 00:00:08 I don't know if that means 18.06 is ending, 4 00:00:08 --> 00:00:09 or the quiz was good. 5 00:00:09 --> 00:00:13 Uh, my birthday conference was going on at the time of the 6 00:00:13 --> 00:00:17 quiz, and in the conference, of course, everybody had to say 7 00:00:17 --> 00:00:21 nice things, but I was wondering, what would my 18.06 8 00:00:21 --> 00:00:24 class be saying, because it was at exactly 9 00:00:24 --> 00:00:26 the same time. 10 00:00:26 --> 00:00:31 But, what I know from the grades so far, 11 00:00:31 --> 00:00:39 they're basically close to, and maybe slightly above, the 12 00:00:39 --> 00:00:44 grades that you got on quiz two. 13 00:00:44 --> 00:00:48 So, very satisfactory. 14 00:00:48 --> 00:00:55 And, then we have a final exam coming up, and today's lecture, 15 00:00:55 --> 00:01:00.82 as I told you by email, will be a first step in the 16 00:01:00.82 --> 00:01:07.61 review, and then on Wednesday I'll do all I can in reviewing 17 00:01:07.61 --> 00:01:09 the whole course. 18 00:01:09 --> 00:01:15 So my topic today is -- actually, this is a lecture I 19 00:01:15 --> 00:01:19 have never given before in this way, and it will -- well, 20 00:01:19 --> 00:01:23 four subspaces, that's certainly fundamental, 21 00:01:23 --> 00:01:26.86 and you know that, so I want to speak about 22 00:01:26.86 --> 00:01:31 left-inverses and right-inverses and then something called 23 00:01:31 --> 00:01:33 pseudo-inverses. 24 00:01:33 --> 00:01:39 And pseudo-inverses, let me say right away, 25 00:01:39 --> 00:01:45 that comes in near the end of chapter seven, 26 00:01:45 --> 00:01:52 and that would not be expected on the final.
27 00:01:52 --> 00:01:56 But you'll see that what I'm talking about is really the 28 00:01:56 --> 00:01:59 basic stuff that, for an m-by-n matrix of rank r, 29 00:01:59 --> 00:02:03 we're going back to the most fundamental picture in linear 30 00:02:03 --> 00:02:03 algebra. 31 00:02:03 --> 00:02:06 Nobody could forget that picture, right? 32 00:02:06 --> 00:02:09.92 When you're my age, even, you'll remember the row 33 00:02:09.92 --> 00:02:12 space, and the null space. 34 00:02:12 --> 00:02:18 Orthogonal complements over there, the column space and the 35 00:02:18 --> 00:02:23 null space of A transpose, orthogonal complements 36 00:02:23 --> 00:02:24 over here. 37 00:02:24 --> 00:02:27.9 And I want to speak about inverses. 38 00:02:27.9 --> 00:02:28 OK. 39 00:02:28 --> 00:02:34 And I want to identify the different possibilities. 40 00:02:34 --> 00:02:38 So first of all, when does a matrix have just 41 00:02:38 --> 00:02:42 a perfect inverse, two-sided, you know, 42 00:02:42 --> 00:02:47 so the two-sided inverse is what we just call inverse, 43 00:02:47 --> 00:02:48 right? 44 00:02:48 --> 00:02:53 And, so that means that there's a matrix that produces the 45 00:02:53 --> 00:03:00 identity, whether we write it on the left or on the right. 46 00:03:00 --> 00:03:03 And just tell me, how are the numbers r, 47 00:03:03 --> 00:03:07.61 the rank, n the number of columns, m the number of rows, 48 00:03:07.61 --> 00:03:11 how are those numbers related when we have an invertible 49 00:03:11 --> 00:03:12 matrix? 50 00:03:12 --> 00:03:17 So this is the matrix which was -- chapter two was all about 51 00:03:17 --> 00:03:20 matrices like this, the beginning of the course, 52 00:03:20 --> 00:03:26 what was the relation of r, m, and n, for the nice case? 53 00:03:26 --> 00:03:32 They're all the same, all equal. 54 00:03:32 --> 00:03:37 So this is the case when r=m=n.
55 00:03:37 --> 00:03:43.23 Square matrix, full rank, period, 56 00:03:43.23 --> 00:03:51 just -- so I'll use the words full rank. 57 00:03:51 --> 00:03:52.04 OK, good. 58 00:03:52.04 --> 00:03:53 Everybody knows that. 59 00:03:53 --> 00:03:54 OK. 60 00:03:54 --> 00:03:55 Then chapter three. 61 00:03:55 --> 00:04:00 We began to deal with matrices that were not of full rank, 62 00:04:00 --> 00:04:05 and they could have any rank, and we learned what the rank 63 00:04:05 --> 00:04:05 was. 64 00:04:05 --> 00:04:09.96 And then we focused, if you remember on some cases 65 00:04:09.96 --> 00:04:12 like full column rank. 66 00:04:12 --> 00:04:19 Now, can you remember what was the deal with full column rank? 67 00:04:19 --> 00:04:24 So, now, I think this is the case in which we have a 68 00:04:24 --> 00:04:28 left-inverse, and I'll try to find it. 69 00:04:28 --> 00:04:34 So we have a -- what was the situation there? 70 00:04:34 --> 00:04:40 It's the case of full column rank, and that means -- what 71 00:04:40 --> 00:04:42 does that mean about r? 72 00:04:42 --> 00:04:48.76 It equals, what's the deal with r, now, if we have full column 73 00:04:48.76 --> 00:04:54 rank, I mean the columns are independent, but maybe not the 74 00:04:54 --> 00:04:55 rows. 75 00:04:55 --> 00:04:59 So what is r equal to in this case? 76 00:04:59 --> 00:04:59 n. 77 00:04:59 --> 00:05:00 Thanks. n. 78 00:05:00 --> 00:05:01 r=n. 79 00:05:01 --> 00:05:05.82 The n columns are independent, but probably, 80 00:05:05.82 --> 00:05:07 we have more rows. 81 00:05:07 --> 00:05:12 What's the picture, and then what's the null space 82 00:05:12 --> 00:05:13 for this? 83 00:05:13 --> 00:05:18 So the n columns are independent, what's the null 84 00:05:18 --> 00:05:20 space in this case? 85 00:05:20 --> 00:05:26 So of course, you know what I'm asking. 
86 00:05:26 --> 00:05:29 You're saying, why is this guy asking 87 00:05:29 --> 00:05:35 something, I know that-- I think about it in my sleep, 88 00:05:35 --> 00:05:35 right? 89 00:05:35 --> 00:05:40 So the null space of this matrix if the rank is n, 90 00:05:40 --> 00:05:45 the null space is what vectors are in the null space? 91 00:05:45 --> 00:05:47 Just the zero vector. 92 00:05:47 --> 00:05:48 Right? 93 00:05:48 --> 00:05:52 The columns are independent. 94 00:05:52 --> 00:05:55 Independent columns. 95 00:05:55 --> 00:06:03 No combination of the columns gives zero except that one. 96 00:06:03 --> 00:06:12 And what's my picture over, -- let me redraw my picture -- 97 00:06:12 --> 00:06:17 the row space is everything. 98 00:06:17 --> 00:06:17 No. 99 00:06:17 --> 00:06:18 Is that right? 100 00:06:18 --> 00:06:23 Let's see, I often get these turned around, 101 00:06:23 --> 00:06:23.88 right? 102 00:06:23.88 --> 00:06:25 So what's the deal? 103 00:06:25 --> 00:06:29 The columns are independent, right? 104 00:06:29 --> 00:06:34 So the rank should be the full number of columns, 105 00:06:34 --> 00:06:38 so what does that tell us? 106 00:06:38 --> 00:06:40 There's no null space, right. 107 00:06:40 --> 00:06:41.18 OK. 108 00:06:41.18 --> 00:06:44 The row space is the whole thing. 109 00:06:44 --> 00:06:47 Yes, I won't even draw the picture. 110 00:06:47 --> 00:06:52 And what was the deal with -- and these were very important in 111 00:06:52 --> 00:06:59 least squares problems because -- So, what more is true here? 112 00:06:59 --> 00:07:03 If we have full column rank, the null space is zero, 113 00:07:03 --> 00:07:07 we have independent columns, the unique -- so we have zero 114 00:07:07 --> 00:07:09 or one solutions to Ax=b. 
115 00:07:09 --> 00:07:14 There may not be any solutions, but if there's a solution, 116 00:07:14 --> 00:07:18.39 there's only one solution because other solutions are 117 00:07:18.39 --> 00:07:21.84 found by adding on stuff from the null space, 118 00:07:21.84 --> 00:07:25 and there's nobody there to add on. 119 00:07:25 --> 00:07:32 So the particular solution is the solution, 120 00:07:32 --> 00:07:37 if there is a particular solution. 121 00:07:37 --> 00:07:43 But of course, the rows might not be - are 122 00:07:43 --> 00:07:49 probably not independent -- and therefore, 123 00:07:49 --> 00:07:54 so right-hand sides won't end up with a zero equal zero after 124 00:07:54 --> 00:07:58 elimination, so sometimes we may have no solution, 125 00:07:58 --> 00:07:59 or one solution. 126 00:07:59 --> 00:07:59 OK. 127 00:07:59 --> 00:08:03 And what I want to say is that for this matrix A -- oh, 128 00:08:03 --> 00:08:09 yes, tell me something about A transpose A in this case. 129 00:08:09 --> 00:08:12 So this whole part of the board, now, is devoted to this 130 00:08:12 --> 00:08:13 case. 131 00:08:13 --> 00:08:15 What's the deal with A transpose A? 132 00:08:15 --> 00:08:19 I've emphasized over and over how important that combination 133 00:08:19 --> 00:08:23 is, for a rectangular matrix, A transpose A is the good thing 134 00:08:23 --> 00:08:27 to look at, and if the rank is n, if the null space has only 135 00:08:27 --> 00:08:31 zero in it, then the same is true of A transpose A. 136 00:08:31 --> 00:08:39.59 That's the beautiful fact, that if the rank of A is n, 137 00:08:39.59 --> 00:08:47 well, we know this will be an n by n symmetric matrix, 138 00:08:47 --> 00:08:52.32 and it will be full rank. 139 00:08:52.32 --> 00:08:54 So this is invertible. 140 00:08:54 --> 00:08:56 This matrix is invertible. 141 00:08:56 --> 00:08:59 That matrix is invertible. 142 00:08:59 --> 00:09:04 And now I want to show you that A itself has a one-sided 143 00:09:04 --> 00:09:04 inverse. 
144 00:09:04 --> 00:09:05 Here it is. 145 00:09:05 --> 00:09:08.85 The inverse of that, which exists, 146 00:09:08.85 --> 00:09:13 times A transpose, there is a one-sided -- shall I 147 00:09:13 --> 00:09:18 call it A inverse? -- left of the matrix A. 148 00:09:18 --> 00:09:20 Why do I say that? 149 00:09:20 --> 00:09:24 Because if I multiply this guy by A, what do I get? 150 00:09:24 --> 00:09:28 What does that multiplication give? 151 00:09:28 --> 00:09:33 Of course, you know it instantly, because I just put 152 00:09:33 --> 00:09:37.87 the parentheses there, I have A transpose A inverse 153 00:09:37.87 --> 00:09:43 times A transpose A so, of course, it's the identity. 154 00:09:43 --> 00:09:46 So it's a left inverse. 155 00:09:46 --> 00:09:51 And this was the totally crucial case for least squares, 156 00:09:51 --> 00:09:57 because you remember that least squares, the central equation of 157 00:09:57 --> 00:10:01.2 least squares had this matrix, A transpose A, 158 00:10:01.2 --> 00:10:04 as its coefficient matrix. 159 00:10:04 --> 00:10:10.3 And in the case of full column rank, that matrix is invertible, 160 00:10:10.3 --> 00:10:11 and we're go. 161 00:10:11 --> 00:10:16.13 So that's the case where there is a left-inverse. 162 00:10:16.13 --> 00:10:21.1 So A does whatever it does, we can find a matrix that 163 00:10:21.1 --> 00:10:24 brings it back to the identity. 164 00:10:24 --> 00:10:31 Now, is it true that, in the other order -- so A 165 00:10:31 --> 00:10:36 inverse left times A is the identity. 166 00:10:36 --> 00:10:37.47 Right? 167 00:10:37.47 --> 00:10:40 This matrix is m by n. 168 00:10:40 --> 00:10:43.72 This matrix is n by m. 169 00:10:43.72 --> 00:10:49 The identity matrix is n by n. 170 00:10:49 --> 00:10:50 All good. 171 00:10:50 --> 00:10:52 All good if you're n. 172 00:10:52 --> 00:10:57 But if you try to put that matrix on the other side, 173 00:10:57 --> 00:10:59 it would fail. 
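The left-inverse formula just described, (A transpose A) inverse times A transpose, can be checked on a small example. This is a minimal sketch, not anything from the lecture board: the 3 by 2 matrix `A` is an illustrative choice with independent columns, and the helpers `transpose`, `matmul`, and `inverse` are hypothetical names written here with exact fractions so that no rounding is involved.

```python
from fractions import Fraction

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Gauss-Jordan inverse of a square matrix, using exact fractions."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        # Find a nonzero pivot in column c (raises StopIteration if M is singular).
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                aug[r] = [v - aug[r][c] * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Illustrative matrix (an assumption): m = 3 rows, n = 2 independent columns, so r = n.
A = [[1, 0],
     [0, 1],
     [1, 1]]
At = transpose(A)

# Left inverse: (A^T A)^{-1} A^T, an n by m matrix.
left_inverse = matmul(inverse(matmul(At, A)), At)

left_product = matmul(left_inverse, A)    # n by n, should be the identity
right_product = matmul(A, left_inverse)   # m by m, cannot be the identity here

print(left_product)
print(right_product)
```

The first product is the 2 by 2 identity; the second is a 3 by 3 matrix that is not the identity, which is the failure "on the other side" mentioned above.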
174 00:10:59 --> 00:11:04.5 With full column rank -- if n is smaller than m, 175 00:11:04.5 --> 00:11:09 the case where they're equal is the beautiful case, 176 00:11:09 --> 00:11:12 but that's all set. 177 00:11:12 --> 00:11:16 Now, we're looking at the case where the columns are 178 00:11:16 --> 00:11:19 independent but the rows are not. 179 00:11:19 --> 00:11:23 So this is invertible, but what matrix is not 180 00:11:23 --> 00:11:23 invertible? 181 00:11:23 --> 00:11:26.62 A A transpose is bad for this case. 182 00:11:26.62 --> 00:11:28 A transpose A is good. 183 00:11:28 --> 00:11:32 So we can multiply on the left, everything good, 184 00:11:32 --> 00:11:35 we get the left inverse. 185 00:11:35 --> 00:11:40 But it would not be a two-sided inverse. 186 00:11:40 --> 00:11:48 A rectangular matrix can't have a two-sided inverse, 187 00:11:48 --> 00:11:56 because there's got to be some null space, right? 188 00:11:56 --> 00:11:59 If I have a matrix that's rectangular, then either that 189 00:11:59 --> 00:12:03 matrix or its transpose has some null space, because if n and m 190 00:12:03 --> 00:12:06 are different, then there are going to be some 191 00:12:06 --> 00:12:09 free variables around, and we'll have some null space 192 00:12:09 --> 00:12:10 in that direction. 193 00:12:10 --> 00:12:14 OK, tell me the corresponding picture for the opposite case. 194 00:12:14 --> 00:12:18.15 So now I'm going to ask you about right-inverses. 195 00:12:18.15 --> 00:12:20 A right-inverse. 196 00:12:20 --> 00:12:28 And you can fill this all out, this is going to be the case of 197 00:12:28 --> 00:12:30 full row rank. 198 00:12:30 --> 00:12:37 And then r is equal to m, now, the m rows are 199 00:12:37 --> 00:12:42 independent, but the columns are not. 200 00:12:42 --> 00:12:47 So what's the deal on that? 201 00:12:47 --> 00:12:51 Well, just exactly the flip of this one.
202 00:12:51 --> 00:12:57 The null space of A transpose contains only zero, 203 00:12:57 --> 00:13:03 because there are no combinations of the rows that 204 00:13:03 --> 00:13:05 give the zero row. 205 00:13:05 --> 00:13:08.37 We have independent rows. 206 00:13:08.37 --> 00:13:13 And in a minute, I'll give an example of all 207 00:13:13 --> 00:13:15 these. 208 00:13:15 --> 00:13:20.25 So, how many solutions to Ax=b in this case? 209 00:13:20.25 --> 00:13:23.05 The rows are independent. 210 00:13:23.05 --> 00:13:26 So we can always solve Ax=b. 211 00:13:26 --> 00:13:31 Elimination never produces a zero row, 212 00:13:31 --> 00:13:36 so we never get into that zero equal one problem, 213 00:13:36 --> 00:13:42 so Ax=b always has a solution, but too many. 214 00:13:42 --> 00:13:48 So there will be some null space, the null space of A -- 215 00:13:48 --> 00:13:53.19 what will be the dimension of A's null space? 216 00:13:53.19 --> 00:13:56 How many free variables have we got? 217 00:13:56 --> 00:14:03 How many special solutions in that null space have we got? 218 00:14:03 --> 00:14:08 So how many free variables in this setup? 219 00:14:08 --> 00:14:11 We've got n columns, so n variables, 220 00:14:11 --> 00:14:16 and this tells us how many are pivot variables, 221 00:14:16 --> 00:14:22 that tells us how many pivots there are, so there are n-m free 222 00:14:22 --> 00:14:24 variables. 223 00:14:24 --> 00:14:30 So there are infinitely many solutions to Ax=b. 224 00:14:30 --> 00:14:34 We have n-m free variables in this case. 225 00:14:34 --> 00:14:34 OK. 226 00:14:34 --> 00:14:39 Now I wanted to ask about this idea of a right-inverse. 227 00:14:39 --> 00:14:40.1 OK. 228 00:14:40.1 --> 00:14:44 So I'm going to have a matrix A, my matrix A, 229 00:14:44 --> 00:14:50 and now there's going to be some inverse on the right that 230 00:14:50 --> 00:14:53.42 will give the identity matrix.
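The count of n-m free variables in the full row rank case can be seen on a tiny system. The sketch below is only an illustration under assumed data: the 2 by 3 matrix `A`, the right-hand side `b`, the particular solution, and the special solution are all chosen by hand for this example, and `apply` is a hypothetical helper for the product A times x.

```python
def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

# Illustrative system (an assumption): m = 2 independent rows, n = 3 columns.
A = [[1, 0, 1],
     [0, 1, 1]]
b = [2, 3]

# n - m = 1 free variable, so the null space is a line spanned by one special solution.
special = [-1, -1, 1]          # A @ special is the zero vector
x_particular = [2, 3, 0]       # one solution of A x = b (free variable set to 0)

# Every x_particular + c * special solves A x = b, one solution per value of c.
solutions = [[p + c * s for p, s in zip(x_particular, special)]
             for c in range(-2, 3)]

for x in solutions:
    print(x, apply(A, x))
```

Each printed vector is different, yet each one is mapped to the same `b`, which is the "always a solution, but too many" situation.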
231 00:14:53.42 --> 00:14:57 So it will be A times A inverse on the right, 232 00:14:57 --> 00:15:00 will be I. 233 00:15:00 --> 00:15:06 And can you tell me what, just by comparing with what we 234 00:15:06 --> 00:15:11 had up there, what will be the right-inverse, 235 00:15:11 --> 00:15:15 we even have a formula for it. 236 00:15:15 --> 00:15:20 There will be other -- actually, there are other 237 00:15:20 --> 00:15:25 left-inverses, that's our favorite. 238 00:15:25 --> 00:15:29 There will be other right-inverses, 239 00:15:29 --> 00:15:34 but tell me our favorite here, what's the nice right-inverse? 240 00:15:34 --> 00:15:40 The nice right-inverse will be, well, there we had A transpose 241 00:15:40 --> 00:15:46 A was good, now it will be A A transpose that's good. 242 00:15:46 --> 00:15:51 The good matrix, the good right -- the thing we 243 00:15:51 --> 00:15:57 can invert is A A transpose, so now if I just do it that 244 00:15:57 --> 00:16:01 way, there sits the right-inverse. 245 00:16:01 --> 00:16:06 You see how completely parallel it is to the one above? 246 00:16:06 --> 00:16:07 Right. 247 00:16:07 --> 00:16:11 So that's the right-inverse. 248 00:16:11 --> 00:16:17 So that's the case when there is -- In terms of this picture, 249 00:16:17 --> 00:16:23 tell me what the null spaces are like so far for these three 250 00:16:23 --> 00:16:24.48 cases. 251 00:16:24.48 --> 00:16:28 What about case one, where we had a two-sided 252 00:16:28 --> 00:16:32 inverse, full rank, everything great. 253 00:16:32 --> 00:16:37 The null spaces were, like, gone, right? 254 00:16:37 --> 00:16:43 The null spaces were just the zero vectors. 255 00:16:43 --> 00:16:48 Then I took case two, this null space was gone. 
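The right-inverse formula, A transpose times (A A transpose) inverse, can be checked the same way as the left-inverse. This is a minimal sketch under stated assumptions: the 2 by 3 matrix `A` is an illustrative full row rank example, `other_right_inverse` is a hand-picked second right-inverse showing that the formula's answer is not the only one, and the helpers are hypothetical names using exact fractions.

```python
from fractions import Fraction

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Gauss-Jordan inverse of a square matrix, using exact fractions."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        # Find a nonzero pivot in column c (raises StopIteration if M is singular).
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                aug[r] = [v - aug[r][c] * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Illustrative matrix (an assumption): m = 2 independent rows, n = 3 columns, so r = m.
A = [[1, 0, 1],
     [0, 1, 1]]
At = transpose(A)

# The "nice" right inverse: A^T (A A^T)^{-1}, an n by m matrix.
right_inverse = matmul(At, inverse(matmul(A, At)))

# A different right inverse, picked by hand, to show the choice is not unique.
other_right_inverse = [[1, 0],
                       [0, 1],
                       [0, 0]]

print(matmul(A, right_inverse))        # m by m
print(matmul(A, other_right_inverse))  # m by m
print(matmul(right_inverse, A))        # n by n, the wrong-side product
```

Both right-side products are the 2 by 2 identity, while the wrong-side product is a 3 by 3 matrix that is not the identity.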
256 00:16:48 --> 00:16:56 Case three, this null space was gone, and then case four is, 257 00:16:56 --> 00:17:04 like, the most general case when this picture is all there 258 00:17:04 --> 00:17:11 -- when all the null spaces -- this has dimension r, 259 00:17:11 --> 00:17:18 of course, this has dimension n-r, this has dimension r, 260 00:17:18 --> 00:17:26 this has dimension m-r, and the final case will be when 261 00:17:26 --> 00:17:30 r is smaller than m and n. 262 00:17:30 --> 00:17:34 But can I just, before I leave here look a 263 00:17:34 --> 00:17:36 little more at this one? 264 00:17:36 --> 00:17:39 At this case of full column rank? 265 00:17:39 --> 00:17:43 So A inverse on the left, it has this left-inverse to 266 00:17:43 --> 00:17:45 give the identity. 267 00:17:45 --> 00:17:51 I said if we multiply it in the other order, we wouldn't get the 268 00:17:51 --> 00:17:52 identity. 269 00:17:52 --> 00:17:58 But then I just realized that I should ask you, 270 00:17:58 --> 00:17:59 what do we get? 271 00:17:59 --> 00:18:06 So if I put them in the other order -- if I continue this down 272 00:18:06 --> 00:18:13 below, but I write A times A inverse left -- so there's A 273 00:18:13 --> 00:18:18.83 times the left-inverse, but it's not on the left any 274 00:18:18.83 --> 00:18:20 more. 275 00:18:20 --> 00:18:27.38 So it's not going to come out perfectly. 276 00:18:27.38 --> 00:18:37 But everybody in this room ought to recognize that matrix, 277 00:18:37 --> 00:18:38 right? 278 00:18:38 --> 00:18:44 Let's see, is that the guy we know? 279 00:18:44 --> 00:18:46 Am I OK, here? 280 00:18:46 --> 00:18:50 What is that matrix? 281 00:18:50 --> 00:18:52 P. 282 00:18:52 --> 00:18:53 Thanks. 283 00:18:53 --> 00:18:53 P. 284 00:18:53 --> 00:18:57 That matrix -- it's a projection. 285 00:18:57 --> 00:19:01 It's the projection onto the column space. 286 00:19:01 --> 00:19:06 It's trying to be the identity matrix, right? 
287 00:19:06 --> 00:19:12 A projection matrix tries to be the identity matrix, 288 00:19:12 --> 00:19:18 but you've given it an impossible job. 289 00:19:18 --> 00:19:21 So it's the identity matrix where it can be, 290 00:19:21 --> 00:19:24 and elsewhere, it's the zero matrix. 291 00:19:24 --> 00:19:26 So this is P, right. 292 00:19:26 --> 00:19:29.39 A projection onto the column space. 293 00:19:29.39 --> 00:19:29 OK. 294 00:19:29 --> 00:19:34 And if I asked you this one, and put these in the opposite 295 00:19:34 --> 00:19:38 order -- so this came from up here. 296 00:19:38 --> 00:19:41 And similarly, if I try to put the right 297 00:19:41 --> 00:19:44 inverse on the left -- so that, like, came from above. 298 00:19:44 --> 00:19:48 This, coming from this side, what happens if I try to put 299 00:19:48 --> 00:19:50 the right inverse on the left? 300 00:19:50 --> 00:19:54 Then I would have A transpose, times A A transpose inverse, times A, 301 00:19:54 --> 00:19:58 if this matrix is now on the left, what do you figure that 302 00:19:58 --> 00:19:59 matrix is? 303 00:19:59 --> 00:20:05 It's going to be a projection, too, right? 304 00:20:05 --> 00:20:13.06 It looks very much like this guy, except the only difference 305 00:20:13.06 --> 00:20:18 is, A and A transpose have been reversed. 306 00:20:18 --> 00:20:25 So this is a projection, this is another projection, 307 00:20:25 --> 00:20:28 onto the row space. 308 00:20:28 --> 00:20:34 Again, it's trying to be the identity, but there's only so 309 00:20:34 --> 00:20:37 much the matrix can do. 310 00:20:37 --> 00:20:42 And this is the projection onto the column space. 311 00:20:42 --> 00:20:48 So let me now go back to the main picture and tell you about 312 00:20:48 --> 00:20:53 the general case, the pseudo-inverse. 313 00:20:53 --> 00:20:56 These are cases we know. 314 00:20:56 --> 00:20:59.18 So this was important review.
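The two wrong-order products can be tested for the defining properties of a projection. The sketch below is an illustration only: `A` (full column rank) and `B` (full row rank) are assumed example matrices, `P` is A times its left-inverse, `Q` is the right-inverse of B placed on the left of B, and the helper functions are hypothetical names using exact fractions.

```python
from fractions import Fraction

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Gauss-Jordan inverse of a square matrix, using exact fractions."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        # Find a nonzero pivot in column c (raises StopIteration if M is singular).
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                aug[r] = [v - aug[r][c] * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Full column rank example (an assumption): P = A (A^T A)^{-1} A^T.
A = [[1, 0],
     [0, 1],
     [1, 1]]
At = transpose(A)
P = matmul(A, matmul(inverse(matmul(At, A)), At))

# Full row rank example (an assumption): Q = B^T (B B^T)^{-1} B.
B = [[1, 1, 0],
     [0, 1, 1]]
Bt = transpose(B)
Q = matmul(Bt, matmul(inverse(matmul(B, Bt)), B))

print(matmul(P, P) == P, matmul(P, A) == A)   # P fixes the column space of A
print(matmul(Q, Q) == Q, matmul(B, Q) == B)   # Q fixes the row space of B
```

`P` leaves the columns of `A` alone and `Q` leaves the rows of `B` alone, which is the sense in which each one is "the identity where it can be".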
315 00:20:59.18 --> 00:21:04 You've got to know the business about these ranks, 316 00:21:04 --> 00:21:10 and the free variables -- really, this is linear algebra 317 00:21:10 --> 00:21:11 coming together. 318 00:21:11 --> 00:21:16 And, you know, one nice thing about teaching 319 00:21:16 --> 00:21:17 18.06, 320 00:21:17 --> 00:21:18 It's not trivial. 321 00:21:18 --> 00:21:22 But it's -- I don't know, somehow, it's nice when it 322 00:21:22 --> 00:21:23 comes out right. 323 00:21:23 --> 00:21:25 I mean -- well, I shouldn't say anything bad 324 00:21:25 --> 00:21:27 about calculus, but I will. 325 00:21:27 --> 00:21:30 I mean, like, you know, you have formulas for 326 00:21:30 --> 00:21:32 surface area, and other awful things and, 327 00:21:32 --> 00:21:37.7 you know, they do their best in calculus, but it's not elegant. 328 00:21:37.7 --> 00:21:44 And, linear algebra just is -- well, you know, 329 00:21:44 --> 00:21:52 linear algebra is about the nice part of calculus, 330 00:21:52 --> 00:21:57 where everything's, like, flat, and, 331 00:21:57 --> 00:22:03.08 the formulas come out right. 332 00:22:03.08 --> 00:22:05 And you can go into high dimensions where, 333 00:22:05 --> 00:22:07 in calculus, you're trying to visualize 334 00:22:07 --> 00:22:09.48 these things, well, two or three dimensions 335 00:22:09.48 --> 00:22:10.59 is kind of the limit. 336 00:22:10.59 --> 00:22:13 But here, we don't -- you know, I've stopped doing two-by-twos, 337 00:22:13 --> 00:22:16 I'm just talking about the general case. 338 00:22:16 --> 00:22:19 OK, now I really will speak about the general case here. 
339 00:22:19 --> 00:22:22 What could be the inverse -- what's a kind of reasonable 340 00:22:22 --> 00:22:25 inverse for a matrix for the completely general matrix where 341 00:22:25 --> 00:22:28 there's a rank r, but it's smaller than n, 342 00:22:28 --> 00:22:31 so there's some null space left, and it's smaller than m, 343 00:22:31 --> 00:22:34 so a transpose has some null space, and it's those null 344 00:22:34 --> 00:22:37 spaces that are screwing up inverses, right? 345 00:22:37 --> 00:22:46 Because if a matrix takes a vector to zero, 346 00:22:46 --> 00:22:59 well, there's no way an inverse can, like, bring it back to 347 00:22:59 --> 00:23:02 life. 348 00:23:02 --> 00:23:04.49 My topic is now the pseudo-inverse, 349 00:23:04.49 --> 00:23:08 and let's just by a picture, see what's the best inverse we 350 00:23:08 --> 00:23:09 could have? 351 00:23:09 --> 00:23:12.27 So, here's a vector x in the row space. 352 00:23:12.27 --> 00:23:13 I multiply by A. 353 00:23:13 --> 00:23:17 Now, the one thing everybody knows is you take a vector, 354 00:23:17 --> 00:23:20 you multiply by A, and you get an output, 355 00:23:20 --> 00:23:23 and where is that output? 356 00:23:23 --> 00:23:24 Where is Ax? 357 00:23:24 --> 00:23:29 Always in the column space, right? 358 00:23:29 --> 00:23:33 Ax is a combination of the columns. 359 00:23:33 --> 00:23:37 So Ax is somewhere here. 360 00:23:37 --> 00:23:43 So I could take all the vectors in the row space. 361 00:23:43 --> 00:23:49 I could multiply them all by A. 362 00:23:49 --> 00:23:55 I would get a bunch of vectors in the column space and what I 363 00:23:55 --> 00:24:00.56 think is, I'd get all the vectors in the column space just 364 00:24:00.56 --> 00:24:01 right. 365 00:24:01 --> 00:24:06 I think that this connection between an x in the row space 366 00:24:06 --> 00:24:12 and an Ax in the column space, this is one-to-one. 
367 00:24:12 --> 00:24:15 We got a chance, because they have the same 368 00:24:15 --> 00:24:15 dimension. 369 00:24:15 --> 00:24:19 That's an r-dimensional space, and that's an r-dimensional 370 00:24:19 --> 00:24:20 space. 371 00:24:20 --> 00:24:23 And somehow, the matrix A -- it's got these 372 00:24:23 --> 00:24:27.39 null spaces hanging around, where it's knocking vectors to 373 00:24:27.39 --> 00:24:27 zero. 374 00:24:27 --> 00:24:30 And then it's got all the vectors in between, 375 00:24:30 --> 00:24:33 which is almost all vectors. 376 00:24:33 --> 00:24:37 Almost all vectors have a row space component and a null space 377 00:24:37 --> 00:24:37 component. 378 00:24:37 --> 00:24:39 And it's killing the null space component. 379 00:24:39 --> 00:24:42 But if I look at the vectors that are in the row space, 380 00:24:42 --> 00:24:45 with no null space component, just in the row space, 381 00:24:45 --> 00:24:47 then they all go into the column space, 382 00:24:47 --> 00:24:49 so if I put another vector, let's say, y, 383 00:24:49 --> 00:24:52 in the row space, I posit that wherever Ay is, 384 00:24:52 --> 00:24:54 it won't hit Ax. 385 00:24:54 --> 00:24:59 Do you see what I'm saying? 386 00:24:59 --> 00:25:02 Let's see why. 387 00:25:02 --> 00:25:05 All right. 388 00:25:05 --> 00:25:09 So here's what I said. 389 00:25:09 --> 00:25:22 If x and y are in the row space, then A x is not the same 390 00:25:22 --> 00:25:25 as A y. 391 00:25:25 --> 00:25:29 They're both in the column space, of course, 392 00:25:29 --> 00:25:31 but they're different. 393 00:25:31 --> 00:25:36 That would be a perfect question on a final exam, 394 00:25:36 --> 00:25:42 because that's what I'm teaching you in that material of 395 00:25:42 --> 00:25:48 chapter three and chapter four, especially chapter three. 396 00:25:48 --> 00:25:54 If x and y are in the row space, then Ax is different from 397 00:25:54 --> 00:25:55 Ay.
398 00:25:55 --> 00:26:00 So what this means -- and we'll see why -- is that, 399 00:26:00 --> 00:26:04 in words, from the row space to the column space, 400 00:26:04 --> 00:26:08 A is perfect, it's an invertible matrix. 401 00:26:08 --> 00:26:11 If we, like, limited it to those spaces. 402 00:26:11 --> 00:26:16 And then, its inverse will be what I'll call the 403 00:26:16 --> 00:26:18 pseudo-inverse. 404 00:26:18 --> 00:26:20 So that's what the pseudo-inverse is. 405 00:26:20 --> 00:26:24 It's the inverse -- so A goes this way, from x to y -- sorry, 406 00:26:24 --> 00:26:27 x to A x, from y to A y, that's A, going that way. 407 00:26:27 --> 00:26:31 Then in the other direction, anything in the column space 408 00:26:31 --> 00:26:35 comes from somebody in the row space, and the reverse there is 409 00:26:35 --> 00:26:37 what I'll call the pseudo-inverse, 410 00:26:37 --> 00:26:40 and the accepted notation is A plus. 411 00:26:40 --> 00:26:45 So y will be A plus x. 412 00:26:45 --> 00:26:47 I'm sorry. 413 00:26:47 --> 00:26:57 No, y will be A plus times whatever it started with, 414 00:26:57 --> 00:26:58 A y. 415 00:26:58 --> 00:27:05 Do you see my picture there? 416 00:27:05 --> 00:27:07 Same, of course, for x and A x. 417 00:27:07 --> 00:27:10 This way, A does it, the other way is the 418 00:27:10 --> 00:27:13 pseudo-inverse, and the pseudo-inverse just 419 00:27:13 --> 00:27:16 kills this stuff, and the matrix just kills this 420 00:27:16 --> 00:27:16 stuff. 421 00:27:16 --> 00:27:20 So everything that's really serious here is going on in the 422 00:27:20 --> 00:27:25 row space and the column space, and now, tell me 423 00:27:25 --> 00:27:32 -- this is the fundamental fact, that between those two 424 00:27:32 --> 00:27:39 r-dimensional spaces, our matrix is perfect. 425 00:27:39 --> 00:27:39 Why? 426 00:27:39 --> 00:27:42 Suppose they weren't. 427 00:27:42 --> 00:27:46.68 Why do I get into trouble? 428 00:27:46.68 --> 00:27:50 Suppose -- so, proof.
429 00:27:50 --> 00:27:57 I haven't written down proof very much, but I'm going to use 430 00:27:57 --> 00:27:59 that word once. 431 00:27:59 --> 00:28:02 Suppose they were the same. 432 00:28:02 --> 00:28:08 Suppose these are supposed to be two different vectors. 433 00:28:08 --> 00:28:14 Maybe I'd better make the statement correctly. 434 00:28:14 --> 00:28:17 If x and y are different vectors in the row space -- 435 00:28:17 --> 00:28:20 maybe I'll better put if x is different from y, 436 00:28:20 --> 00:28:24 both in the row space -- so I'm starting with two different 437 00:28:24 --> 00:28:27.95 vectors in the row space, I'm multiplying by A -- so 438 00:28:27.95 --> 00:28:31 these guys are in the column space, everybody knows that, 439 00:28:31 --> 00:28:35 and the point is, they're different over there. 440 00:28:35 --> 00:28:38 So, suppose they weren't. 441 00:28:38 --> 00:28:40 Suppose A x=A y. 442 00:28:40 --> 00:28:45 Suppose, well, that's the same as saying 443 00:28:45 --> 00:28:47 A(x-y) is zero. 444 00:28:47 --> 00:28:49 So what? 445 00:28:49 --> 00:28:56 So, what do I know now about (x-y), what do I know about this 446 00:28:56 --> 00:28:57 vector? 447 00:28:57 --> 00:29:05 Well, I can see right away, what space is it in? 448 00:29:05 --> 00:29:09 It's sitting in the null space, right? 449 00:29:09 --> 00:29:12 So it's in the null space. 450 00:29:12 --> 00:29:16 But what else do I know about it? 451 00:29:16 --> 00:29:21 Here it was x in the row space, y in the row space, 452 00:29:21 --> 00:29:23 what about x-y? 453 00:29:23 --> 00:29:28 It's also in the row space, right? 454 00:29:28 --> 00:29:32 Heck, that thing is a vector space, and if the vector space 455 00:29:32 --> 00:29:35 is anything at all, if x is in the row space, 456 00:29:35 --> 00:29:39 and y is in the row space, then the difference is also, 457 00:29:39 --> 00:29:41 so it's also in the row space. 458 00:29:41 --> 00:29:42.07 So what? 
459 00:29:42.07 --> 00:29:45.66 Now I've got a vector x-y that's in the null space, 460 00:29:45.66 --> 00:29:50.4 and that's also in the row space, so what vector is it? 461 00:29:50.4 --> 00:29:52 It's the zero vector. 462 00:29:52 --> 00:29:57 So I would conclude from that that x-y had to be the zero 463 00:29:57 --> 00:30:00 vector, x-y, so, in other words, 464 00:30:00 --> 00:30:05 if I start from two different vectors, I get two different 465 00:30:05 --> 00:30:06 vectors. 466 00:30:06 --> 00:30:11 If these vectors are the same, then those vectors had to be 467 00:30:11 --> 00:30:13 the same. 468 00:30:13 --> 00:30:18 That's like the algebra proof, which we understand completely 469 00:30:18 --> 00:30:23 because we really understand these subspaces of what I said 470 00:30:23 --> 00:30:27 in words, that a matrix A is really a nice, 471 00:30:27 --> 00:30:31 invertible mapping from row space to column space. 472 00:30:31 --> 00:30:36 If the null spaces keep out of the way, then we have an 473 00:30:36 --> 00:30:38 inverse. 474 00:30:38 --> 00:30:41 And that inverse is called the pseudo-inverse, 475 00:30:41 --> 00:30:44.9 and it's very, very useful in applications. 476 00:30:44.9 --> 00:30:49 Statisticians discovered, oh boy, this is the thing that 477 00:30:49 --> 00:30:53 we needed all our lives, and here it finally showed up, 478 00:30:53 --> 00:30:56.73 the pseudo-inverse is the right thing. 479 00:30:56.73 --> 00:31:00 Why do statisticians need it? 480 00:31:00 --> 00:31:06 Because statisticians are like least-squares-happy. 481 00:31:06 --> 00:31:11 I mean they're always doing least squares. 482 00:31:11 --> 00:31:17 And so this is their central linear regression. 483 00:31:17 --> 00:31:20 Statisticians who may watch this on video, 484 00:31:20 --> 00:31:24.81 please forgive that description of your interests. 485 00:31:24.81 --> 00:31:29 One of your interests is linear regression and this problem.
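Written out as a short derivation, the one-to-one argument for the row space (the proof that ends with x-y being the zero vector) goes as follows. The symbols C(A^T) for the row space and N(A) for the null space are notation assumed here for the written version; the spoken proof only names the spaces in words. The last step uses the fact, from the four-subspaces picture, that the row space and the null space are orthogonal complements, so their only common vector is zero.

```latex
\begin{aligned}
&\text{Given } x,\, y \in C(A^{\mathsf T}) \text{ with } Ax = Ay: \\
&\quad A(x - y) = 0 \;\Longrightarrow\; x - y \in N(A), \\
&\quad x - y \in C(A^{\mathsf T}) \quad \text{(the row space is a subspace, so it contains differences)}, \\
&\quad x - y \in N(A) \cap C(A^{\mathsf T}) = \{0\} \;\Longrightarrow\; x = y .
\end{aligned}
```

Read in the other direction: two different vectors in the row space always give two different vectors in the column space.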
486 00:31:29 --> 00:31:34 But this problem is only OK provided we have full column 487 00:31:34 --> 00:31:34 rank. 488 00:31:34 --> 00:31:38 And statisticians have to worry all the time about, 489 00:31:38 --> 00:31:43 oh, God, maybe we just repeated an experiment. 490 00:31:43 --> 00:31:48 You know, you're taking all these measurements, 491 00:31:48 --> 00:31:52.54 maybe you just repeat them a few times. 492 00:31:52.54 --> 00:31:57 You know, maybe they're not independent. 493 00:31:57 --> 00:32:01 Well, in that case, that A transpose A matrix that 494 00:32:01 --> 00:32:03 they depend on becomes singular. 495 00:32:03 --> 00:32:06 So then that's when they needed the pseudo-inverse, 496 00:32:06 --> 00:32:10.83 it just arrived at the right moment, and it's the right 497 00:32:10.83 --> 00:32:11 quantity. 498 00:32:11 --> 00:32:11 OK. 499 00:32:11 --> 00:32:15 So now that you know what the pseudo-inverse should do, 500 00:32:15 --> 00:32:18 let me see what it is. 501 00:32:18 --> 00:32:20 Can we find it? 502 00:32:20 --> 00:32:30 So this is my -- to complete the lecture is -- how do I find 503 00:32:30 --> 00:32:35 this pseudo-inverse A plus? 504 00:32:35 --> 00:32:36 OK. 505 00:32:36 --> 00:32:36 OK. 506 00:32:36 --> 00:32:40 Well, here's one way. 507 00:32:40 --> 00:32:49 Everything I do today is to try to review stuff. 508 00:32:49 --> 00:32:54.47 One way would be to start from the SVD. 509 00:32:54.47 --> 00:32:58 The Singular Value Decomposition. 510 00:32:58 --> 00:33:04 And you remember that that factored A into an orthogonal 511 00:33:04 --> 00:33:11 matrix times this diagonal matrix times this orthogonal 512 00:33:11 --> 00:33:12 matrix. 513 00:33:12 --> 00:33:18 But what did that diagonal guy look like? 
514 00:33:18 --> 00:33:20 This diagonal guy, sigma, has some non-zeroes, 515 00:33:20 --> 00:33:23 and you remember, they came from A transpose A, 516 00:33:23 --> 00:33:26 and A A transpose, these are the good guys, 517 00:33:26 --> 00:33:28.94 and then some more zeroes, and all zeroes there, 518 00:33:28.94 --> 00:33:30 and all zeroes there. 519 00:33:30 --> 00:33:32 So you can guess what the pseudo-inverse is, 520 00:33:32 --> 00:33:35.59 I just invert stuff that's nice to invert -- well, 521 00:33:35.59 --> 00:33:38 what's the pseudo-inverse of this? 522 00:33:38 --> 00:33:43 That's what the problem comes down to. 523 00:33:43 --> 00:33:51 What's the pseudo-inverse of this beautiful diagonal matrix? 524 00:33:51 --> 00:33:55 But it's got a null space, right? 525 00:33:55 --> 00:34:00 What's the rank of this matrix? 526 00:34:00 --> 00:34:03 What's the rank of this diagonal matrix? 527 00:34:03 --> 00:34:05 r, of course. 528 00:34:05 --> 00:34:09 It's got r non-zeroes, and then it's otherwise, 529 00:34:09 --> 00:34:09 zip. 530 00:34:09 --> 00:34:13 So it's got n columns, it's got m rows, 531 00:34:13 --> 00:34:15 and it's got rank r. 532 00:34:15 --> 00:34:20 It's the best example, the simplest example we could 533 00:34:20 --> 00:34:24 ever have of our general setup. 534 00:34:24 --> 00:34:24 OK? 535 00:34:24 --> 00:34:28 So what's the pseudo-inverse? 536 00:34:28 --> 00:34:33 What's the matrix -- so I'll erase our columns, 537 00:34:33 --> 00:34:38 because right below it, I want to write the 538 00:34:38 --> 00:34:40 pseudo-inverse. 539 00:34:40 --> 00:34:46 OK, you can make a pretty darn good guess. 
540 00:34:46 --> 00:34:49 If it was a proper diagonal matrix, invertible, 541 00:34:49 --> 00:34:54 if there weren't any zeroes down here, if it was sigma one 542 00:34:54 --> 00:34:59 to sigma n, then everybody knows what the inverse would be, 543 00:34:59 --> 00:35:04 the inverse would be one over sigma one, down to one over s- 544 00:35:04 --> 00:35:08 but of course, I'll have to stop at sigma r. 545 00:35:08 --> 00:35:14 And the rest will be zeroes again, 546 00:35:14 --> 00:35:16 of course. 547 00:35:16 --> 00:35:25 And now this one was m by n, and this one is meant to have a 548 00:35:25 --> 00:35:33 slightly different, you know, transpose shape, 549 00:35:33 --> 00:35:34 n by m. 550 00:35:34 --> 00:35:40 They both have that rank r. 551 00:35:40 --> 00:35:45 My idea is that the pseudo-inverse is the best -- is 552 00:35:45 --> 00:35:49 the closest I can come to an inverse. 553 00:35:49 --> 00:35:53 So what is sigma times its pseudo-inverse? 554 00:35:53 --> 00:35:58 Can you multiply sigma by its pseudo-inverse? 555 00:35:58 --> 00:36:00 Multiply that by that? 556 00:36:00 --> 00:36:02 What matrix do you get? 557 00:36:02 --> 00:36:05 They're diagonal. 558 00:36:05 --> 00:36:08 Rectangular, of course. 559 00:36:08 --> 00:36:13 But of course, we're going to get ones, 560 00:36:13 --> 00:36:16 r ones, and all the rest, zeroes. 561 00:36:16 --> 00:36:23 And the shape of that, this whole matrix will be m by 562 00:36:23 --> 00:36:23 m. 563 00:36:23 --> 00:36:29.75 And suppose I did it in the other order. 564 00:36:29.75 --> 00:36:32 Suppose I did sigma plus sigma. 565 00:36:32 --> 00:36:35 Why don't I do it right underneath, 566 00:36:35 --> 00:36:37.17 in the opposite order? 567 00:36:37.17 --> 00:36:40 See, this matrix hasn't got a left-inverse, 568 00:36:40 --> 00:36:45 it hasn't got a right-inverse, but every matrix has got a 569 00:36:45 --> 00:36:46 pseudo-inverse. 570 00:36:46 --> 00:36:52 If I do it in the order sigma plus sigma, what do I get? 
571 00:36:52 --> 00:36:54.42 Square matrix, this is n by m, 572 00:36:54.42 --> 00:36:57 this is m by n, my result is going to be m by m -- 573 00:36:57 --> 00:37:00 is going to be n by n, and what is it? 574 00:37:00 --> 00:37:03 Those are diagonal matrices, it's going to be ones, 575 00:37:03 --> 00:37:04.61 and then zeroes. 576 00:37:04.61 --> 00:37:08 It's not the same as that, it's a different size -- it's a 577 00:37:08 --> 00:37:09.26 projection. 578 00:37:09.26 --> 00:37:12 One is a projection matrix onto the column space, 579 00:37:12 --> 00:37:17 and this one is the projection matrix onto the row space. 580 00:37:17 --> 00:37:20 That's the best that the pseudo-inverse can do. 581 00:37:20 --> 00:37:25 So what the pseudo-inverse does is, if you multiply on the left, 582 00:37:25 --> 00:37:29.89 you don't get the identity, if you multiply on the right, 583 00:37:29.89 --> 00:37:34 you don't get the identity, what you get is the projection. 584 00:37:34 --> 00:37:39 It brings you into the two good spaces, the row space and column 585 00:37:39 --> 00:37:40 space. 586 00:37:40 --> 00:37:44 And it just wipes out the null space. 587 00:37:44 --> 00:37:50.64 So that's what the pseudo-inverse of this diagonal 588 00:37:50.64 --> 00:37:56 one is, and then the pseudo-inverse of A itself -- 589 00:37:56 --> 00:38:00 this is perfectly invertible. 590 00:38:00 --> 00:38:07 What's the inverse of V transpose? 591 00:38:07 --> 00:38:13.41 Just another tiny bit of review. 592 00:38:13.41 --> 00:38:23.11 That's an orthogonal matrix, and its inverse is V, 593 00:38:23.11 --> 00:38:25 good. 594 00:38:25 --> 00:38:27 This guy has got all the trouble in it, 595 00:38:27 --> 00:38:30 everything the null space is responsible for, 596 00:38:30 --> 00:38:33 so it doesn't have a true inverse, it has a 597 00:38:33 --> 00:38:36.15 pseudo-inverse, and then the inverse of U is U 598 00:38:36.15 --> 00:38:37 transpose, thanks. 599 00:38:37 --> 00:38:39 Or, of course, I could write U inverse. 
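[Editorial sketch, not part of the lecture.] The recipe just described, A plus = V times sigma plus times U transpose, can be tried numerically. The following is a minimal sketch in Python with NumPy. The function name `pinv_via_svd`, the tolerance `tol`, and the 2 by 3 rank-one example matrix are assumptions made for illustration; they do not come from the lecture.

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """Sketch: pseudo-inverse A+ = V Sigma+ U^T built from the full SVD.

    tol is an assumed relative cutoff for deciding which singular
    values count as nonzero; it is not from the lecture.
    """
    m, n = A.shape
    U, s, Vt = np.linalg.svd(A, full_matrices=True)  # A = U Sigma V^T
    r = int(np.sum(s > tol * s[0]))                  # numerical rank
    Sigma_plus = np.zeros((n, m))                    # transpose shape, n by m
    Sigma_plus[:r, :r] = np.diag(1.0 / s[:r])        # invert only sigma_1..sigma_r
    return Vt.T @ Sigma_plus @ U.T, r

# Hypothetical example: a 2 by 3 matrix of rank 1 (second row = 2 * first row).
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
A_plus, r = pinv_via_svd(A)

P_col = A @ A_plus   # m by m, projection onto the column space
P_row = A_plus @ A   # n by n, projection onto the row space

print("rank:", r)
print("matches np.linalg.pinv:", np.allclose(A_plus, np.linalg.pinv(A)))
```

Neither product is the identity here. Each is a symmetric projection of rank r, which is the behavior described for sigma times sigma plus and sigma plus times sigma.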
600 00:38:39 --> 00:38:42 So, that's the question of, how do you find the 601 00:38:42 --> 00:38:46 pseudo-inverse -- so what statisticians do 602 00:38:46 --> 00:38:50 when they're in this -- so this is like the case where least 603 00:38:50 --> 00:38:54 squares breaks down because the rank is -- you don't have full 604 00:38:54 --> 00:38:58 rank, and the beauty of the singular value decomposition is, 605 00:38:58 --> 00:39:03 it puts all the problems into this diagonal matrix where it's 606 00:39:03 --> 00:39:04 clear what to do. 607 00:39:04 --> 00:39:10 The best inverse you could think of is clear. 608 00:39:10 --> 00:39:17 You see there could be other -- I mean, we could put some stuff 609 00:39:17 --> 00:39:22 down here, it would multiply these zeroes. 610 00:39:22 --> 00:39:28 It wouldn't have any effect, but then the good 611 00:39:28 --> 00:39:33 pseudo-inverse is the one with no extra stuff, 612 00:39:33 --> 00:39:38 it's sort of, like, as small as possible. 613 00:39:38 --> 00:39:44 It has to have those to produce the ones. 614 00:39:44 --> 00:39:48 If it had other stuff, it would just be a larger 615 00:39:48 --> 00:39:52 matrix, so this pseudo-inverse is kind of the minimal matrix 616 00:39:52 --> 00:39:55.18 that gives the best result. 617 00:39:55.18 --> 00:39:57 Sigma sigma plus being r ones. 618 00:39:57 --> 00:39:59 OK. So I guess I'm hoping -- 619 00:39:59 --> 00:40:03 pseudo-inverse, again, let me repeat what I 620 00:40:03 --> 00:40:06 said at the very beginning. 621 00:40:06 --> 00:40:13 This pseudo-inverse, which appears at the end, 622 00:40:13 --> 00:40:22 which is in section seven point four, and probably I did more 623 00:40:22 --> 00:40:29 with it here than I did in the book. 
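[Editorial sketch, not part of the lecture.] The remark about "extra stuff" that only multiplies zeroes can be checked on a small case. The sizes (m = 3, n = 4, r = 2), the singular values 3 and 2, and the filler value 5 are all assumed for illustration. The matrix `B` is a hypothetical alternative to sigma plus with nonzero entries placed in the block that meets only zeroes of sigma.

```python
import numpy as np

m, n, r = 3, 4, 2                 # assumed sizes for illustration
sigma = np.array([3.0, 2.0])      # assumed nonzero singular values

Sigma = np.zeros((m, n))          # m by n, rank r
Sigma[:r, :r] = np.diag(sigma)

Sigma_plus = np.zeros((n, m))     # n by m, rank r
Sigma_plus[:r, :r] = np.diag(1.0 / sigma)

# Hypothetical alternative: same 1/sigma entries plus extra stuff in the
# lower-right block, where it only ever multiplies zeroes of Sigma.
B = Sigma_plus.copy()
B[r:, r:] = 5.0

same_left = np.allclose(Sigma @ B, Sigma @ Sigma_plus)    # m by m products agree
same_right = np.allclose(B @ Sigma, Sigma_plus @ Sigma)   # n by n products agree
norm_plus = np.linalg.norm(Sigma_plus)                    # Frobenius norm
norm_B = np.linalg.norm(B)

print(same_left, same_right, norm_plus < norm_B)
```

Both products are unchanged by the extra entries, so `B` does the same job as sigma plus but is a larger matrix in Frobenius norm. Measuring "size" by the Frobenius norm is one reasonable reading of "as small as possible", chosen here as an assumption.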
624 00:40:29 --> 00:40:32 The word pseudo-inverse will not appear on an exam in this 625 00:40:32 --> 00:40:36 course, but I think if you see this all will appear, 626 00:40:36 --> 00:40:39 because this is all what the course was about, 627 00:40:39 --> 00:40:42 chapters one, two, three, four -- but if you 628 00:40:42 --> 00:40:44 see all that, then you probably see, 629 00:40:44 --> 00:40:48 well, OK, the general case had both null spaces around, 630 00:40:48 --> 00:40:51.3 and this is the natural thing to do. 631 00:40:51.3 --> 00:40:51 Yes. 632 00:40:51 --> 00:40:57 So, this is one way to find the pseudo-inverse. 633 00:40:57 --> 00:41:03 The point of a pseudo-inverse, of computing a pseudo-inverse 634 00:41:03 --> 00:41:10 is to get some factors where you can find the pseudo-inverse 635 00:41:10 --> 00:41:12 quickly. 636 00:41:12 --> 00:41:15 And this is, like, the champion, 637 00:41:15 --> 00:41:19 because this is where we can invert those, 638 00:41:19 --> 00:41:23 and those two, easily, just by transposing, 639 00:41:23 --> 00:41:27 and we know what to do with a diagonal. 640 00:41:27 --> 00:41:31 OK, that's as much review, maybe -- 641 00:41:31 --> 00:41:36 let's have a five-minute holiday in 18.06 and, 642 00:41:36 --> 00:41:41 I'll see you Wednesday, then, for the rest of this 643 00:41:41 --> 00:41:41 course. 644 00:41:41 --> 00:41:44 Thanks.
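[Editorial sketch, not part of the lecture.] The statisticians' situation mentioned earlier, where a repeated measurement makes A transpose A singular, can be illustrated with a small made-up regression. The data `t` and `b`, and the choice to repeat the second column, are assumptions for illustration only. The sketch uses NumPy's `np.linalg.pinv` rather than any method stated in the lecture.

```python
import numpy as np

# Made-up data: the third column repeats the second, so the columns
# of A are dependent and A^T A is singular.
t = np.array([0.0, 1.0, 2.0, 3.0])
A = np.column_stack([np.ones_like(t), t, t])
b = np.array([1.0, 2.0, 2.0, 4.0])

AtA = A.T @ A
rank_AtA = np.linalg.matrix_rank(AtA)      # less than 3, so no ordinary inverse

# The pseudo-inverse still picks out a least-squares solution.
x_plus = np.linalg.pinv(A) @ b
residual = A.T @ (A @ x_plus - b)          # normal equations A^T(Ax - b) = 0

# Adding a null-space vector gives another least-squares solution
# with the same fit but a longer x.
null_vec = np.array([0.0, 1.0, -1.0])      # A @ null_vec = 0
x_other = x_plus + null_vec

print("rank of A^T A:", rank_AtA)
print("same fit:", np.allclose(A @ x_other, A @ x_plus))
print("x_plus is shorter:", np.linalg.norm(x_plus) < np.linalg.norm(x_other))
```

The solution from the pseudo-inverse has no null-space component, so it lies in the row space. That is why it is the shortest of all the least-squares solutions.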