These video lectures of Professor Gilbert Strang teaching 18.06 were recorded in Fall 1999 and do not correspond precisely to the current edition of the textbook. However, this book is still the best reference for more information on the topics covered in each lecture.
Instructor/speaker: Prof. Gilbert Strang
I've been multiplying matrices already, but certainly time for me to discuss the rules for matrix multiplication.
And the interesting part is the many ways you can do it, and they all give the same answer.
And they're all important. So matrix multiplication, and then, come inverses. So we mentioned the inverse of a matrix. That's a big deal.
Lots to do about inverses and how to find them.
Okay, so I'll begin with how to multiply two matrices.
First way, okay, so suppose I have a matrix A multiplying a matrix B and -- giving me a result -- well, I could call it C. A times B.
Okay. So, let me just review the rule for this entry. That's the entry in row i and column j. So that's the i j entry.
Right there is C i j. We always write the row number and then the column number. So I might -- I might -- maybe I take it C 3 4, just to make it specific.
So instead of i j, let me use numbers. C 3 4. So where does that come from, the three four entry? It comes from row three, here, row three and column four, as you know.
Column four. And can I just write down, or can we write down the formula for it? If we look at the whole row and the whole column, the quick way for me to say it is row three of A -- I could use a dot for dot product. I won't often use that, actually. Dot column four of B.
But this gives us a chance to just, like, use a little matrix notation. What are the entries? What's this first entry in row three? That number that's sitting right there is...
A, so it's got two indices and what are they? 3 1. So there's an a 3 1 there.
Now what's the first guy at the top of column four? So what's sitting up there? B 1 4, right.
So that this dot product starts with A 3 1 times B 1 4. And then what's the next -- so this is like I'm accumulating this sum, then comes the next guy, A 3 2, second column, times B 2 4, second row.
So it's A 3 2, B 2 4 and so on.
Just practice with indices. Oh, let me even practice with a summation formula. So this is -- most of the course, I use whole vectors. I very seldom get down to the details of these particular entries, but here we'd better do it. So it's some kind of a sum, right? Of things in row three, column K, shall I say? Times things in row K, column four. Do you see that that's what we're seeing here? This is K is one, here K is two, on along -- so the sum goes all the way along the row and down the column, say, one to n. So that's what the C three four entry looks like. A sum of a three K b K four.
Just takes a little practice to do that.
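Written out as a formula, the sum just described looks like this. It assumes n is the number of columns of A, which has to equal the number of rows of B:

```latex
c_{34} \;=\; a_{31} b_{14} + a_{32} b_{24} + \cdots + a_{3n} b_{n4}
       \;=\; \sum_{k=1}^{n} a_{3k}\, b_{k4}
```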
Okay. And -- well, maybe I should say -- when are we allowed to multiply these matrices? What are the shapes of these things? The shapes are -- if we allow them to be not necessarily square matrices.
If they're square, they've got to be the same size. If they're rectangular, they're not the same size. If they're rectangular, this might be -- well, I always think of A as m by n.
m rows, n columns. So that sum goes to n. Now what's the point -- how many rows does B have to have? n. The number of rows in B, the number of guys that we meet coming down has to match the number of ones across. So B will have to be n by something. Whatever.
P. So the number of columns here has to match the number of rows there, and then what's the result? What's the shape of the result? What's the shape of C, the output? Well, it's got these same m rows -- it's got m rows.
And how many columns? P.
m by P. Okay.
So there are m times P little numbers in there, entries, and each one, looks like that.
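As a rough sketch of this standard rule, here is a small pure-Python version. The function name `matmul_entries` and the example matrices are made up for illustration; they are not from the lecture.

```python
def matmul_entries(A, B):
    """Multiply A (m by n) by B (n by p), one entry at a time."""
    m, n = len(A), len(A[0])
    if len(B) != n:
        # Columns of A must match rows of B, or the product is not defined.
        raise ValueError("columns of A must match rows of B")
    p = len(B[0])
    # Entry (i, j) of C is row i of A dotted with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


A = [[1, 2, 3],
     [4, 5, 6]]            # 2 by 3
B = [[1, 0],
     [0, 1],
     [2, 2]]               # 3 by 2
C = matmul_entries(A, B)   # 2 by 2: m rows from A, p columns from B
print(C)                   # [[7, 8], [16, 17]]
```

The result has m times p entries, and each one is a sum of n products.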
Okay. So that's the standard rule.
That's the way people think of multiplying matrices.
I do it too. But I want to talk about other ways to look at that same calculation, looking at whole columns and whole rows. Okay.
So can I do A B C again? A B equaling C again? But now, tell me about... I'll put it up here.
So here goes A, again, times B producing C.
And again, this is m by n. This is n by P and this is m by P. Okay.
Now I want to look at whole columns. I want to look at the columns of -- here's the second way to multiply matrices. Because I'm going to build on what I know already. How do I multiply a matrix by a column? I know how to multiply this matrix by that column. Shall I call that column one? That tells me column one of the answer.
The matrix times the first column is that first column.
Because none of this stuff entered that part of the answer.
The matrix times the second column is the second column of the answer. Do you see what I'm saying? That I could think of multiplying a matrix by a vector, which I already knew how to do, and I can think of just P columns sitting side by side, just like resting next to each other. And I multiply A times each one of those. And I get the P columns of the answer. Do you see this as -- this is quite nice, to be able to think, okay, matrix multiplication works so that I can just think of having several columns, multiplying by A and getting the columns of the answer.
So, like, here's column one shall I call that column one? And what's going in there is A times column one.
Okay. So that's the picture a column at a time. So what does that tell me? What does that tell me about these columns? These columns of C are combinations, because we've seen that before, of columns of A.
Every one of these comes from A times this, and A times a vector is a combination of the columns of A.
And it makes sense, because the columns of A have length m and the columns of C have length m.
And every column of C is some combination of the columns of A.
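The column picture can be sketched the same way. The helper names `mat_vec` and `matmul_by_columns` are illustrative assumptions, and the small matrices are invented for the example.

```python
def mat_vec(A, x):
    """A times a column x, built up as a combination of the columns of A."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for k in range(n):
        # Add x[k] times column k of A.
        for i in range(m):
            result[i] += x[k] * A[i][k]
    return result


def matmul_by_columns(A, B):
    """Column j of the product is A times column j of B."""
    p = len(B[0])
    cols_of_B = [[row[j] for row in B] for j in range(p)]
    cols_of_C = [mat_vec(A, col) for col in cols_of_B]
    # Set the p answer columns side by side to form C.
    return [[cols_of_C[j][i] for j in range(p)] for i in range(len(A))]


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [0, 1],
     [2, 2]]
C = matmul_by_columns(A, B)
print(C)   # [[7, 8], [16, 17]]
```

The numbers in each column of B are the weights that say which combination of the columns of A to take.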
And it's these numbers in here that tell me what combination it is. Do you see that? That in that answer, C, I'm seeing stuff that's combinations of these columns. Now, suppose I look at it -- that's two ways now. The third way is look at it by rows. So now let me change to rows.
Okay. So now I can think of a row of A -- a row of A multiplying all these rows here and producing a row of the product. So this row takes a combination of these rows and that's the answer.
So these rows of C are combinations of what? Tell me how to finish that. The rows of C, when I have a matrix B, it's got its rows and I multiply by A, and what does that do? It mixes the rows up. It creates combinations of the rows of B, thanks. Rows of B.
That's what I wanted to see, that this answer -- I can see where the pieces are coming from.
The rows in the answer are coming as combinations of these rows. The columns in the answer are coming as combinations of those columns.
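A matching sketch for the row picture, again with made-up names (`row_times_matrix`, `matmul_by_rows`) and invented example matrices:

```python
def row_times_matrix(r, B):
    """A row r times B: a combination of the rows of B, with weights from r."""
    p = len(B[0])
    result = [0] * p
    for k, weight in enumerate(r):
        # Add r[k] times row k of B.
        for j in range(p):
            result[j] += weight * B[k][j]
    return result


def matmul_by_rows(A, B):
    """Row i of the product is row i of A times B."""
    return [row_times_matrix(row, B) for row in A]


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [0, 1],
     [2, 2]]
C = matmul_by_rows(A, B)
print(C)   # [[7, 8], [16, 17]]
```

Here the first row of the answer is 1 times row one of B, plus 2 times row two, plus 3 times row three.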
And so that's three ways. Now you can say, okay, what's the fourth way? The fourth way -- so that's -- now we've got, like, the regular way, the column way, the row way and -- what's left? The one that I can -- well, one way is columns times rows.
What happens if I multiply -- So this was row times column, it gave a number. Okay.
Now I want to ask you about column times row.
If I multiply a column of A times a row of B, what shape am I ending up with? So if I take a column times a row, that's definitely different from taking a row times a column. So a column of A was -- what's the shape of a column of A? m by one.
A column of A is a column. It's got m entries and one column. And what's a row of B? It's got one row and P columns. So what's the shape -- what do I get if I multiply a column by a row? I get a big matrix. I get a full-sized matrix.
If I multiply a column by a row -- should we just do one? Let me take the column two three four times the row one six. That product there -- I mean, when I'm just following the rules of matrix multiplication, those rules are just looking like -- kind of petite, kind of small, because the rows here are so short and the columns there are so short, but they're the same length, one entry.
So what's the answer? What's the answer if I do two three four times one six, just for practice? Well, what's the first row of the answer? Two twelve. And the second row of the answer is three eighteen. And the third row of the answer is four twenty four. That's a very special matrix, there. Very special matrix.
What can you tell me about its columns, the columns of that matrix? They're multiples of this guy, right? They're multiples of that one.
Which follows our rule. We said that the columns of the answer were combinations, but there's only -- to take a combination of one guy, it's just a multiple.
The rows of the answer, what can you tell me about those three rows? They're all multiples of this row. They're all multiples of one six, as we expected. But I'm getting a full-sized matrix. And now, just to complete this thought, if I have -- let me write down the fourth way. A B is a sum of columns of A times rows of B.
So that, for example, if my matrix was two three four and then had another column, say, seven eight nine, and my matrix here has -- say, started with one six and then had another row like zero zero, then -- here's the fourth way, okay? I've got two columns there, I've got two rows there. So the beautiful rule is -- see, the whole thing by columns and rows is that I can take the first column times the first row and add the second column times the second row. So that's the fourth way -- that I can take columns times rows, first column times first row, second column times second row and add.
Actually, what will I get? What will the answer be for that matrix multiplication? Well, this one it's just going to give us zero, so in fact I'm back to this -- that's the answer, for that matrix multiplication. I'm happy to put up here these facts about matrix multiplication, because it gives me a chance to write down special matrices like this.
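The columns-times-rows rule can be checked on the numbers from this example. This is only a sketch; the helper names `outer` and `matmul_columns_times_rows` are assumptions, not anything from the lecture.

```python
def outer(col, row):
    """A column times a row: a full-sized matrix."""
    return [[c * r for r in row] for c in col]


def matmul_columns_times_rows(A, B):
    """Sum over k of (column k of A) times (row k of B)."""
    m, n, p = len(A), len(A[0]), len(B[0])
    C = [[0] * p for _ in range(m)]
    for k in range(n):
        col_k = [A[i][k] for i in range(m)]
        piece = outer(col_k, B[k])
        for i in range(m):
            for j in range(p):
                C[i][j] += piece[i][j]
    return C


A = [[2, 7],
     [3, 8],
     [4, 9]]
B = [[1, 6],
     [0, 0]]

first_piece = outer([2, 3, 4], [1, 6])
print(first_piece)                       # [[2, 12], [3, 18], [4, 24]]
C = matmul_columns_times_rows(A, B)
print(C)                                 # [[2, 12], [3, 18], [4, 24]]
```

The second column times the second row contributes only zeros, so the product is the same as the first piece alone.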
This is a special matrix. All those rows lie on the same line. All those rows lie on the line through one six. If I draw a picture of all these row vectors, they're all the same direction.
If I draw a picture of these two column vectors, they're in the same direction. Later, I would use this language. Not too much later, either. I would say the row space, which is like all the combinations of the rows, is just a line for this matrix. The row space is the line through the vector one six. All the rows lie on that line.
And the column space is also a line.
All the columns lie on the line through the vector two three four. So this is like a really minimal matrix. And it's because of these ones.
Okay. So that's the fourth way.
Now I want to say one more thing about matrix multiplication while we're on the subject.
And it's this. You could also multiply -- You could also cut the matrix into blocks and do the multiplication by blocks. And that's actually so useful that I want to mention it.
Block multiplication. So I could take my matrix A and I could chop it up, like, maybe just for simplicity, let me chop it into two -- into four square blocks. Suppose it's square. Let's just take a nice case. And B, suppose it's square also, same size. So these sizes don't have to be the same. What they have to do is match properly. Here they certainly will match. So here's the rule for block multiplication, that if this has blocks like, A -- so maybe A1, A2, A3, A4 are the blocks here, and these blocks are B1, B2, B3 and B4, then the answer I can find block by block. And if you tell me what's in that block, then I'm going to be quiet about matrix multiplication for the rest of the day.
What goes into that block? You see, these might be -- this matrix might be -- these matrices might be, like, twenty by twenty with blocks that are ten by ten, to take the easy case where all the blocks are the same shape.
And the point is that I could multiply those by blocks.
And what goes in here? What's that block in the answer? A1 B1, that's a matrix times a matrix, it's the right size, ten by ten.
Any more? Plus, what else goes in there? A2 B3, right? It's just like block rows times block columns. Nobody, I think, not even Gauss could see instantly that it works.
But somehow, if we check it through, all five ways we're doing the same multiplications.
So this familiar multiplication is what we're really doing when we do it by columns, by rows, by columns times rows and by blocks. Okay.
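One way to convince yourself that the block rule works is to try it on a small case. The sketch below uses a four by four example with two by two blocks; the matrices and the helper names (`matmul`, `add`, `block`) are invented for illustration.

```python
def matmul(X, Y):
    """Ordinary rows-times-columns multiplication."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def add(X, Y):
    """Entry-by-entry sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_x, row_y)] for row_x, row_y in zip(X, Y)]


def block(M, rows, cols):
    """Cut out the block of M sitting in the given rows and columns."""
    return [[M[i][j] for j in cols] for i in rows]


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
B = [[1, 0, 2, 0],
     [0, 1, 0, 2],
     [3, 0, 1, 0],
     [0, 3, 0, 1]]

top, bottom = range(0, 2), range(2, 4)
left, right = range(0, 2), range(2, 4)

A1, A2 = block(A, top, left), block(A, top, right)
B1, B3 = block(B, top, left), block(B, bottom, left)

# Top-left block of the answer: block row one of A times block column one of B.
top_left = add(matmul(A1, B1), matmul(A2, B3))

# Compare with the same block cut out of the ordinary product.
full = matmul(A, B)
print(top_left == block(full, top, left))   # True
```

The other three blocks of the answer follow the same pattern, for example A1 B2 + A2 B4 in the top right.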
I just have to, like, get the rules straight for matrix multiplication. Okay. All right, I'm ready for the second topic, which is inverses. Okay.
Ready for inverses. And let me do it for square matrices first. Okay.
So I've got a square matrix A. And it may or may not have an inverse, right? Not all matrices have inverses. In fact, that's the most important question you can ask about the matrix, is if it's -- if you know it's square, is it invertible or not? If it is invertible, then there is some other matrix, shall I call it A inverse? And what's the -- if A inverse exists -- there's a big "if" here.
If this matrix exists, and it'll be really central to figure out when does it exist? And then if it does exist, how would you find it? But what's the equation here that I haven't -- that I have to finish now? This matrix, if it exists multiplies A and produces, I think, the identity.
But a real -- an inverse for a square matrix could be on the right as well -- this is true, too, that it's -- if I have a -- yeah in fact, this is not -- this is probably the -- this is something that's not easy to prove, but it works.
That a left -- square matrices, a left inverse is also a right inverse. If I can find a matrix on the left that gets the identity, then also that matrix on the right will produce that identity.
For rectangular matrices, we'll see a left inverse that isn't a right inverse. In fact, the shapes wouldn't allow it. But for square matrices, the shapes allow it and it happens, if A has an inverse. Okay, so give me some cases -- let's see.
I hate to be negative here, but let's talk about the case with no inverse. So -- these matrices are called invertible or non-singular -- those are the good ones. And we want to be able to identify how -- if we're given a matrix, has it got an inverse? Can I talk about the singular case? No inverse.
All right. Best to start with an example.
Tell me an example -- let's get an example up here.
Let's make it two by two -- of a matrix that has not got an inverse. And let's see why.
Let me write one up. No inverse. Let's see why. Let me write up -- one three two six. Why does that matrix have no inverse? You could answer that various ways. Give me one reason.
Well, you could -- if you know about determinants, which you're not supposed to, you could take its determinant and you would get -- Zero. Okay.
Now -- all right. Let me ask you other reasons.
I mean, ask for other reasons that that matrix isn't invertible. Here, I could use what I'm saying here. Suppose A times some other matrix gave the identity. Why is that not possible? Because -- oh, yeah -- I'm thinking about columns here. If I multiply this matrix A by some other matrix, then the -- the result -- what can you tell me about the columns? They're all multiples of those columns, right? If I multiply A by another matrix that -- the product has columns that come from those columns. So can I get the identity matrix? No way. The columns of the identity matrix, like one zero -- it's not a combination of those columns, because those two columns lie on the -- both lie on the same line. Every combination is just going to be on that line and I can't get one zero.
So, do you see that sort of column picture of the matrix not being invertible. In fact, here's another reason.
This is even a more important reason.
Well, how can I say more important? All those are important. This is another way to see it.
A matrix has no inverse -- yeah -- here -- now this is important. A square matrix won't have an inverse if I can find a vector X with A times X giving zero. This is the reason I like best.
That matrix won't have an inverse.
Can you -- well, let me change "I" to "you".
So tell me a vector X that, solves A X equals zero.
I mean, this is, like, the key equation. In mathematics, all the key equations have zero on the right-hand side. So what's the X? Tell me an X here -- so now I'm going to put -- slip in the X that you tell me and I'm going to get zero.
What X would do that job? Three and negative one? Is that the one you picked, or -- yeah.
Or another -- well, if you picked zero with zero, I'm not so excited, right? Because that would always work. So it's really the fact that this vector isn't zero that's important.
It's a non-zero vector and three negative one would do it.
That just says three of this column minus one of that column is the zero column. Okay.
So now I know that A couldn't be invertible.
But what's the reasoning? If A X is zero, suppose I multiplied by A inverse.
Yeah, well here's the reason. Here -- this is why this spells disaster for an inverse. The matrix can't have an inverse if some combination of the columns gives z- it gives nothing. Because, I could take A X equals zero, I could multiply by A inverse and what would I discover? Suppose I take that equation and I multiply by -- if A inverse existed, which of course I'm going to come to the conclusion it can't because if it existed, if there was an A inverse to this dopey matrix, I would multiply that equation by that inverse and I would discover X is zero.
If I multiply by A inverse on the left side, I get X.
If I multiply by A inverse on the right side, I get zero.
So I would discover X was zero. But it -- X is not zero.
X -- this guy wasn't zero. There it is.
It's three minus one. So, conclusion -- only, it takes us some time to really work with that conclusion -- our conclusion will be that non-invertible matrices, singular matrices, some combination of their columns gives the zero column. They take some vector X into zero. And there's no way A inverse can recover, right? That's what this equation says.
This equation says I take this vector X and multiplying by A gives zero. But then when I multiply by A inverse, I can never escape from zero.
So there couldn't be an A inverse.
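The numbers in this example are easy to check. A minimal sketch, using the matrix one three two six and the vector three, negative one from above (the variable names are just for illustration):

```python
A = [[1, 3],
     [2, 6]]
x = [3, -1]   # a non-zero vector

# A times x: three of column one minus one of column two.
Ax = [sum(A[i][k] * x[k] for k in range(2)) for i in range(2)]
print(Ax)     # [0, 0]

# For those who know determinants: it comes out zero as well.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(det)    # 0

# If an inverse existed, multiplying A x = 0 by it would force x = 0,
# which contradicts x = (3, -1). So no inverse can exist.
```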
Where here -- okay, now fix -- all right.
Now let me take -- all right, back to the positive side.
Let's take a matrix that does have an inverse. And why not invert it? Okay.
Can I -- so let me take on this third board a matrix -- shall I fix that up a little? Tell me a matrix that has got an inverse. Well, let me say one three two -- what shall I put there? Well, don't put six, I guess is -- right? Do I have any favorites here? One? Or eight? I don't care. What, seven? Seven. Okay.
Seven is a lucky number. All right, seven, okay. Okay.
So -- now what's our idea? We believe that this matrix is invertible. Those who like determinants have quickly taken its determinant and found it wasn't zero. Those who like columns, and probably that -- that department is not totally popular yet -- but those who like columns will look at those two columns and say, hey, they point in different directions. So I can get anything.
Now, let me see, what do I mean? How am I going to compute A inverse? So A inverse -- here's A inverse, now, and I have to find it. And what do I get when I do this multiplication? The identity. You know, forgive me for taking two by two-s, but -- it's good to keep the computations manageable and let the ideas come out.
Okay, now what's the idea I want? I'm looking for this matrix A inverse, how am I going to find it? Right now, I've got four numbers to find.
I'm going to look at the first column.
Let me take this first column, a b.
What's up there? What -- tell me this.
What equation does the first column satisfy? The first column satisfies A times that column is one zero.
The first column of the answer. And the second column, c d, satisfies A times that second column is zero one. You see that finding the inverse is like solving two systems. One system, when the right-hand side is one zero -- I'm just going to split it into two pieces. I don't even need to rewrite it. I can take A times -- so let me put it here. A times column j of A inverse is column j of the identity. I've got n equations. I've got, well, two in this case. And they have the same matrix, A, but they have different right-hand sides. The right-hand sides are just the columns of the identity, this guy and this guy.
And these are the two solutions.
Do you see what I'm going -- I'm looking at that equation by columns. I'm looking at A times this column, giving that guy, and A times that column giving that guy. So -- Essentially -- so this is like the Gauss -- we're back to Gauss.
We're back to solving systems of equations, but we're solving -- we've got two right-hand sides instead of one. That's where Jordan comes in.
So at the very beginning of the lecture, I mentioned Gauss-Jordan, let me write it up again.
Okay. Here's the Gauss-Jordan idea.
Gauss-Jordan: solve two equations at once.
Okay. Let me show you how the mechanics go. How do I solve a single equation? So the two equations are one three two seven, multiplying a b gives one zero. And the other equation is the same one three two seven multiplying c d gives zero one. Okay. That'll tell me the two columns of the inverse. I'll have the inverse. In other words, if I can solve with this matrix A, if I can solve with that right-hand side and that right-hand side, I'm invertible. I've got it.
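Written out, the two systems look like this. The unknown entries of A inverse are written in lowercase as a, b and c, d here, so that they don't get confused with the matrix A:

```latex
\begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix}
\begin{bmatrix} a \\ b \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \end{bmatrix},
\qquad
\begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix}
\begin{bmatrix} c \\ d \end{bmatrix}
=
\begin{bmatrix} 0 \\ 1 \end{bmatrix}
```

Same matrix on the left both times; only the right-hand sides change, and they are the two columns of the identity.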
Okay. And Jordan sort of said to Gauss, solve them together, look at the matrix -- if we just solve this one, I would look at one three two seven, and how do I deal with the right-hand side? I stick it on as an extra column, right? That's this augmented matrix. That's the matrix when I'm watching the right-hand side at the same time, doing the same thing to the right side that I do to the left. So I just carry it along as an extra column. Now I'm going to carry along two extra columns. And I'm going to do whatever Gauss wants, right? I'm going to do elimination.
I'm going to get this to be simple and this thing will turn into the inverse. This is what's coming.
I'm going to do elimination steps to make this into the identity, and lo and behold, the inverse will show up here.
Okay, let's do it. Okay.
So what are the elimination steps? So you see -- here's my matrix A and here's the identity, like, stuck on, augmented on.
STUDENT: I'm sorry... STRANG: Yeah? STUDENT: -- is the two and the three supposed to be switched? STRANG: Did I -- oh, no, they weren't supposed to be switched. Sorry.
Thank you very much. And there -- I've got them right. Okay, thanks.
Okay. So let's do elimination.
All right, it's going to be simple, right? So I take two of this row away from this row.
So this row stays the same and two of those come away from this. That leaves me with a zero and a one, and two of these away from this gives minus two and one -- is that what you're getting after one elimination step? Let me sort of separate the left half from the right half.
So two of that first row got subtracted from the second row.
Now this is an upper triangular form. Gauss would quit, but Jordan says keep going.
Use elimination upwards. Subtract a multiple of equation two from equation one to get rid of the three. So let's go the whole way. So now I'm going to -- this guy is fine, but I'm going to -- what do I do now? What's my final step that produces the inverse? I multiply this by the right number to get up to there to remove that three. So I guess, I -- since this is a one, there's the pivot sitting there.
I multiply it by three and subtract from that, so what do I get? I'll have one zero -- oh, yeah that was my whole point. I'll multiply this by three and subtract from that, which will give me seven.
And I multiply this by three and subtract from that, which gives me a minus three. And what's my hope, belief? Here I started with A and the identity, and I ended up with the identity and who? That better be A inverse. That's the Gauss Jordan idea.
Start with this long matrix, double-length A I, eliminate, eliminate until this part is down to I, then this one will -- must be for some reason, and we've got to find the reason -- must be A inverse.
Shall I just check that it works? Let me just check that -- can I multiply this matrix, this part, times A? I'll carry A over here and just do that multiplication.
You'll see I'll do it the old fashioned way.
Seven minus six is a one. Twenty one minus twenty one is a zero, minus two plus two is a zero, minus six plus seven is a one. Check.
So that is the inverse. That's the Gauss-Jordan idea.
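The mechanics can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a production routine: the function name `gauss_jordan_inverse` is made up, it uses exact fractions, it clears each pivot column above and below in one sweep (for this two by two example that is the same pair of steps as on the board), and it does no row exchanges, so it simply gives up on a zero pivot.

```python
from fractions import Fraction


def gauss_jordan_inverse(A):
    """Eliminate on the long matrix [A I] until the left half is I."""
    n = len(A)
    # Tack the identity on as n extra columns.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = M[col][col]
        if pivot == 0:
            # Assumption of this sketch: no row exchanges are attempted.
            raise ValueError("zero pivot: this sketch does not exchange rows")
        # Scale the pivot row so the pivot is one.
        M[col] = [entry / pivot for entry in M[col]]
        # Subtract multiples of the pivot row from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    # The right half is now the inverse.
    return [row[n:] for row in M]


A = [[1, 3],
     [2, 7]]
A_inv = gauss_jordan_inverse(A)
print([[int(entry) for entry in row] for row in A_inv])   # [[7, -3], [-2, 1]]
```

Running it on the singular matrix one three two six hits a zero in the second pivot position and raises the error instead of returning an inverse.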
So, you'll -- one of the homework problems or more than one for Wednesday will ask you to go through those steps.
I think you just got to go through Gauss-Jordan a couple of times, but I -- yeah -- just to see the mechanics. But the important thing is, why -- is, like, what happened? Why did we -- why did we get A inverse there? Let me ask you that. We got -- so we take -- We do row reduction, we do elimination on this long matrix A I until the first half is I.
Then the second half is A inverse.
Well, how do I see that? Let me put up here how I see that. So here's my Gauss-Jordan thing, and I'm doing stuff to it.
So I'm -- well, whole lot of E's.
Remember those are those elimination matrices.
Those are the -- those are the things that we figured out last time. Yes, that's what an elimination step is: in matrix form, I'm multiplying by some Es.
And the result -- well, so I'm multiplying by a whole bunch of Es. So, I get a -- can I call the overall matrix E? That's the elimination matrix, the product of all those little pieces.
What do I mean by little pieces? Well, there was an elimination matrix that subtracted two of that away from that. Then there was an elimination matrix that subtracted three of that away from that.
I guess in this case, that was all.
So there were just two Es in this case, one that did this step and one that did this step and together they gave me an E that does both steps. And the net result was to get an I here. And you can tell me what that has to be. This is, like, the picture of what happened. If E multiplied A, whatever that E is -- we never figured it out in this way.
But whatever that E times that E is, E times A is -- What's E times A? It's I.
That E, whatever the heck it was, multiplied A and produced I. So E must be -- E A equaling I tells us what E is, namely it is -- STUDENT: It's the inverse of A. STRANG: It's the inverse of A.
Great. And therefore, in the second half, when E multiplies I, it's E -- and that's A inverse. You see the picture looking that way? E times A is the identity. It tells us what E has to be. It has to be the inverse, and therefore, on the right-hand side, where E -- where we just smartly tucked on the identity, it's turning, step by step -- it's turning into A inverse.
There is the statement of Gauss-Jordan elimination.
That's how you find the inverse.
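For the two by two example, the two elimination matrices can be written down and multiplied to see this. The names `E21` and `E12` are labels chosen here for the two steps; they are assumptions for illustration.

```python
def matmul(X, Y):
    """Ordinary rows-times-columns multiplication."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


A = [[1, 3],
     [2, 7]]

E21 = [[1, 0],
       [-2, 1]]    # subtracts two of row one from row two
E12 = [[1, -3],
       [0, 1]]     # subtracts three of row two from row one

# The overall E does the first step, then the second.
E = matmul(E12, E21)
print(E)              # [[7, -3], [-2, 1]]
print(matmul(E, A))   # [[1, 0], [0, 1]]
```

E times A is the identity, so E is A inverse, and E times I (the right half of the long matrix) is just E again.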
Where we can look at it as elimination, as solving n equations at the same time -- and tacking on n columns, solving those equations, and up go the n columns of A inverse. Okay, thanks.
See you on Wednesday.