These video lectures of Professor Gilbert Strang teaching 18.06 were recorded in Fall 1999 and do not correspond precisely to the current edition of the textbook. However, this book is still the best reference for more information on the topics covered in each lecture.
Instructor/speaker: Prof. Gilbert Strang
OK, this is the lecture on positive definite matrices.
I made a start on those briefly in a previous lecture.
One point I wanted to make was the way that this topic brings the whole course together, pivots, determinants, eigenvalues, and something new- four plot instability and then something new in this expression, x transpose Ax, actually that's the guy to watch in this lecture.
So, so the topic is positive definite matrix, and what's my goal? First, first goal is, how can I tell if a matrix is positive definite? So I would like to have tests to see if you give me a, a five by five matrix, how do I tell if it's positive definite? More important is, what does it mean? Why are we so interested in this property of positive definiteness? And then, at the end comes some geometry. Ellipses are connected with positive definite things. Hyperbolas are not connected with positive definite things, so we- it's this, we, there's a geometry too, but mostly it's linear algebra and -- this application of how do you recognize 'em when you have a minim is pretty neat. OK. I'm gonna begin with two by two.
All matrices are symmetric, right? That's understood; the matrix is symmetric, now my question is, is it positive definite? Now, here are some -- each one of these is a complete test for positive definiteness. If I know the eigenvalues, my test is are they positive? Are they all positive? If I know these -- so, A is really -- I look at that number A, here, as the, as the one by one determinant, and here's the two by two determinant.
So this is the determinant test.
This is the eigenvalue test, this is the determinant test. Are the determinants growing in s- of all, of all end, sort of, can I call them leading submatrices, they're the first ones the northwest, Seattle submatrices coming down from from there, they all, all those determinants have to be positive, and then another test is the pivots.
The pivots of a two by two matrix are the number A for sure, and, since the product is the determinant, the second pivot must be the determinant divided by A.
And then in here is gonna come my favorite and my new idea, the, the, the the one to catch, about x transpose Ax being positive. But we'll have to look at this guy. This gets, like a star, because for most, presentations, the definition of positive definiteness would be this number four and these numbers one two three would be test four. OK.
Maybe I'll tuck this, where, you know, OK. So I'll have to look at this x transpose Ax. Can you, can we just be sure, how do we know that the eigenvalue test and the determinant test, pick out the same matrices, and let me, let's just do a few examples.
Some examples. Let me pick the matrix two, six, six, tell me, what number do I have to put there for the matrix to be positive definite? Tell me a sufficiently large number that would make it positive definite? Let's just practice with these conditions in the two by two case.
Now, when I ask you that, you don't wanna find the eigenvalues, you would use the determinant test for that, so, the first or the pivot test, that, that guy is certainly positive, that had to happen, and it's OK. How large a number here -- the number had better be more than what? More than eighteen, right, because if it's eight -- no. More than what? Nineteen, is it? If I have a nineteen here, is that positive definite? I get thirty eight minus thirty six, that's OK. If I had an eighteen, let me play it really close. If I have an eighteen there, then I positive definite? Not quite. I would call this guy positive, so it's useful just to see that this the borderline. That matrix is on the borderline, I would call that matrix positive semi-definite.
And what are the eigenvalues of that matrix, just since we're given eigenvalues of two by twos, when it's semi-definite, but not definite, then the -- I'm squeezing this eigenvalue test down, -- what's the eigenvalue that I know this matrix has? What kind of a matrix have I got here? It's a singular matrix, one of its eigenvalues is zero. That has an eigenvalue zero, and the other eigenvalue is -- from the trace, twenty. OK.
So that, that matrix has eigenvalues greater than or equal to zero, and it's that "equal to" that brought this word "semi-definite" in.
And, the what are the pivots of that matrix? So the pivots, so the eigenvalues are zero and twenty, the pivots are, well, the pivot is two, and what's the next pivot? There isn't one.
We got a singular matrix here, it'll only have one pivot.
You see that that's a rank one matrix, two six is a -- six eighteen is a multiple of two six, the matrix is singular it only has one pivot, so the pivot test doesn't quite pass. The -- let me do the x transpose Ax. Now this is -- the novelty now.
OK. What I looking at when I look at this new combination, x transpose Ax.
x is any vector now, so lemme compute, so any vector, lemme call its components x1 and x2, so that's x. And I put in here A.
Let's, let's use this example two six, six eighteen, and here's x transposed, so x1 x2.
We're back to real matrices, after that last lecture that- that said what to do in the complex case, let's come back to real matrices.
So here's x transpose Ax, and what I'm interested in is, what do I get if I multiply those together? I get some function of x1 and x2, and what is it? Let's see, if I do this multiplication, so I do it, lemme, just, I'll just do it slowly, x1, x2, if I multiply that matrix, this is 2x1, and 6x2s, and the next row is 6x1s and 18x2s, and now I do this final step and what do I have? I've got 2x1 squareds, got 2X1 squareds is coming from this two, I've got this one gives me eighteen, well, shall I do the ones in the middle? How many x1 x2s do I have? Let's see, x1 times that 6x2 would be six of 'em, and then x2 times this one will be six more, I get twelve.
So, in here is going, this is the number a, this is the number 2b, and in here is -- x2 times eighteen x2 will be eighteen x2 squareds and this is the number c. So it's ax1 -- it's like ax squared. 2bxy and cy squared. For my first point that I wanted to make by just doing out a multiplication is, that is as soon as you give me the matrix, as soon as you give me the matrix, I can -- those are the numbers that appear in -- I'll call this thing a quadratic, you see, it's not linear anymore.
Ax is linear, but now I've got an x transpose coming in, I'm up to degree two, up to degree two, maybe quadratic is the -- use the word.
A quadratic form. It's purely degree two, there's no linear part, there's no constant part, there certainly no cubes or fourth powers, it's all second degree. And my question is -- is that quantity positive or not? That's -- for every x1 and x2, that is my new definition -- that's my definition of a positive definite matrix. If this quantity is positive, if, if, if, it's positive for all x's and y's, all x1 x2s, then I call them -- then that's the matrix is positive definite. Now, is this guy passing our test? Well we have, we anticipated the answer here by, by asking first about eigenvalues and pivots, and what happened? It failed. It barely failed.
If I had made this eighteen down to a seven, it would've totally failed. I do that with the eraser, and then I'll put back eighteen, because, seven is such a total disaster, but if -- I'll keep seven for a second. Is that thing in any way positive definite? No, absolutely not.
I don't know its eigenvalues, but I know for sure one of them is negative. Its pivots are two and then the next pivot would be the determinant over two, and the determinant is -- what, what's the determinant of this thing? Fourteen minus thirty six, I've got a determinant minus twenty two.
The next pivot will be -- the pivots now, of this thing are two and minus eleven or something.
Their product being minus twenty two the determinant.
This thing is not positive definite.
What would be -- let me look at the x transpose Ax for this guy.
What's -- if I do out this multiplication, this eighteen is temporarily changing to a seven. This eighteen is temporarily changing to a seven, and I know that there's some numbers x1 and x2 for which that thing, that function, is negative.
And I'm desperately trying to think what they are.
Maybe you can see. Can you tell me a value of x1 and x2 that makes this quantity negative? Oh, maybe one and minus one? Yes, that's -- in this case, that, will work, right, if I took x1 to be one, and x2 to be minus one, then I always get something positive, the two, and the seven minus one squared, but this would be minus twelve and the whole thing would be negative; I would have two minus twelve plus seven, a negative. If I drew the graph, can I get the little picture in here? If I draw the graph of this thing? So, graphs, of the function f(x,y), or f(x), so I say here f(x,y) equal this -- x transpose Ax, this, this this ax squared, 2bxy, and cy squared.
And, let's take the example, with these numbers.
OK, so here's the x axis, here's the y axis, and here's -- up is the function.
z, if you like, or f. I apologize, and let me, just once in my life here, put an arrow over these, these, axes so you see them. That's the vector and I just, see, instead of x1 and x2, I made them x- the components x and y. OK.
So, so, what's a graph of 2x squared, twelve xy, and seven y squared? I'd like to see -- I not the greatest artist, but let's -- tell me something about this graph of this function. Whoa, tell me one point that it goes through.
The origin. Right? Even this artist can get this thing to go through the origin, when these are zero, I, I certainly get zero. OK.
Some more points. If x is one and y is zero, then I'm going upwards, so I'm going up this way, and I'm, I'm going up, like, two x squared in that direction. So -- that's meant to be a parabola. And, suppose x stays zero and y increases. Well, y could be positive or negative; it's seven y squared. Is this function going upward? In the x direction it's going upward, and in the y direction it's going upwards, and if x equals y then the forty-five degree direction is certainly going upwards; because then we'd have what, about, everything would be positive, but what? This function -- what's the graph of this function? Look like? Tell me the word that describes the graph of this function. This is the non-positive definite here, everybody's with me here, for some reason got started in a negative direction, your case that isn't positive definite.
And what's the graph look like that goes up, but does it -- do we have a minimum here, does it go from, from the origin? Completely? No, because we just checked that this thing failed. It failed along the direction when x was minus y -- we have a saddle point, let me put myself, let me, to the least, tell you the word. This thing, goes up in some directions, but down in other directions, and if we actually knew what a saddle looked like or thinks saddles do that -- the way your legs go is, like, down, up, the way, you, looking like, forward, and, the, and drawing the thing is even worse than describing -- I'm just going to say in some directions we go up and in other directions, there is, a saddle -- Now I'm sorry I put that on the front board, you have no way to cover it, but it's a saddle.
OK. And, and this is a saddle point, it's the, it's the point that's at the maximum in some directions and at the minimum in other directions. And actually, the perfect directions to look are the eigenvector directions.
We'll see that. So this is, not a positive definite matrix. OK.
Now I'm coming back to this example, getting rid of this seven, let's move it up to twenty.
Let's, let's let's make the thing really positive definite.
OK. So this is, this number's now twenty. c is now twenty.
OK. Now that passes the test, which I haven't proved, of course, it passes the test for positive pivots. It passes the test for positive eigenvalues. How can you tell that the eigenvalues of that matrix are positive without actually finding them? Of course, two by two I could find them, but can you see -- how do I know they're positive? I know that their product is -- I know that lambda one times lambda two is positive, why? Because that's the determinant, right, lambda one times lambda two is the determinant, which is forty minus thirty-six is four. So the determinant is four.
And the trace, the sum down the diagonal, is twenty-two. So, they multiply to give four.
So that leaves the possibility they're either both positive or both negative. But if they're both negative, the trace couldn't be twenty-two.
So they're both positive. So both of the eigenvalues that are positive, both of the pivots are positive -- the determinants are positive, and I believe that this function is positive everywhere except at zero, zero, of course. When I write down this condition, So I believe that x transposed, let me copy, x transpose Ax is positive, except, of course, at the minimum point, at zero, of course, I don't expect miracles.
So what does its graph look like, and how do I check, and how do I check that this really is positive? So we take it's graph for a minute.
What would be the graph of that function -- it does not have a saddle point. Let me -- I'll raise the board, here, and stay with this example for a while.
So I want to do the graph of -- here's my function, two x squared, twelve xy-s, that could be positive or negative, and twenty y squared. But my point is, so you're seeing the underlying point is, that, the things are positive definite when in some way, these, these pure squares, squares we know to be positive, and when those kind of overwhelm this guy, who could be m- positive or negative, because some like or have same or have same or different signs, when these are big enough they overwhelm this guy and make the total thing positive, and what would the graph now look like? Let me draw the x - well, let me draw the x direction, the y direction, and the origin, at zero, zero, I'm there, where do I go as I move away from the origin? Where do I go as I move away from the origin? I'm sure that I go up.
The origin, the center point here, is a minim because this thing I believe, and we better see why, it's, the graph is like a bowl, the graph is like a bowl shape, it's -- here's the minimum.
And because we've got a pure quadratic, we know it sits at the origin, we know it's tangent plane, the first derivatives are zero, so, we know, first derivatives, First derivatives are all zero, but that's not enough for a minimum. It's first derivatives were zero here. So, the partial derivatives, the first derivatives, are zero.
Again, because first derivatives are gonna have an x or an a y, or a y in them, they'll be zero at the origin.
It's the second derivatives that control everything. It's the second derivatives that this matrix is telling us, and somehow -- here's my point.
You remember in Calculus, how did you decide on a minimum? First requirement was, that the derivative had to be zero.
But then you didn't know if you had a minimum or a maximum. To know that you had a minimum, you had to look at the second derivative. The second derivative had to be positive, the slope had to be increasing as you went through the minimum point.
The curvature had to go upwards, and that's what we're doing now in two dimensions, and in n dimensions. So we're doing what we did in Calculus. Second derivative positive, m- will now become that the matrix of second derivatives is positive definite. Can I just -- like a translation of -- this is how minimum are coming in, ithe beginning of Calculus -- we had a minimum was associated with second derivative, being positive.
And first derivative zero, of course.
Derivative, first derivative, but it was the second derivative that told us we had a minimum.
And now, in 18.06, in linear algebra, we'll have a minim for our function now, our function will have, for your function be a function not of just x but several variables, the way functions really are in real life, the minimum will be when the matrix of second derivatives, the matrix here was one by one, there was just one second derivative, now we've got lots. Is positive definite. So positive for a number translates into positive definite for a matrix.
And it this brings everything you check pivots, you check determinants, you check all your values, or you check this minimum stuff.
OK. Let me come back to this graph.
That graph goes upwards. And I'll have to see why.
How do I know that this, that this function is always positive? Can you look at that and tell that it's always positive? Maybe two by two, you could feel pretty sure, but what's the good way to show that this thing is always positive? If we can express it, as, in terms of squares, because that's what we know for any x and y, whatever, if we're squaring something we certainly are not negative.
So I believe that this expression, this function, could be written as a sum of squares. Can you tell me -- see, because all the problems, the headaches are coming from this xy term.
If we can get expressions -- if we can get that inside a square, so actually, what we're doing is something called, that you've seen called completing the square.
Let me start the square and you complete it.
OK, I think we have two of x plus, now I don't remember how many y-s we need, but you'll figure it out, squared. How many y-s should I put in here, to make -- what do I want to do, the two x squared-s will be correct, right? What I want to do is put in the right number of y-s to get the twelve xy correct. And what is that number of y-s? Let's see, I've got two times, and so I really want six xy-s to come out of here, I think maybe if I put three there, does that look right to you? I have two- this is, we can mentally, multiply out, that's X squared, that's right, that's six X Y, times the two gives from, right, and how many Y squareds have I now got? How many Y squareds have I now got from this term? Eighteen. Eighteen was the key number, remember? Now if I want to make it twenty, then I've got two left.
Two y squared-s. That's completing the square, and it's, now I can see that that function is positive, because it's all squares. I've got two squares, added together, I couldn't go negative. What if I went back to that seven? If instead of twenty that number was a seven, then what would happen? This would still be correct, I'd still have this square, to get the two x squared and the twelve xy, and I'd have eighteen y squared and then what would I do here? I'd have to remove eleven y squared-s, right, if I only had a seven here, then instead of -- when I had a twenty I had two more to put in, when I had an eighteen, which was the marginal case, I had no more to put in.
When I had a seven, which was the case below zero, the indefinite case, I had minus eleven. OK. Now, so, you can see now, that this thing is a bowl. It's going upwards, if I cut it at a plane, z equal to one, say, I would get, I would get a curve, what would be the equation for that curve? If I cut it at height one, the equation would be this thing equal to one. And that curve would be an ellipse. So actually, already, I've blocked into the lecture, the different pieces that we're aiming for. We're aiming for the tests, which this passed; we're aiming for the connection to a minimum, which this -- which we see in the graph, and if we chop it up, if we set this thing equal to one, if I set that thing equal to one, that -- what that gives me is, the cross-section.
It gives me this, this curve, and its equation is this thing equals one, and that's an ellipse.
Whereas if I cut through a saddle point, I get a hyperbola. OK.
But this minimum stuff is really what I'm most interested in. OK.
By -- I just have to ask, do you recognize, I mean, these numbers here, the two that appeared outside, the three that appeared inside, the two that appeared there -- actually, those numbers come from elimination.
Completing the square is our good old method of Gaussian elimination, in expressed in terms of these squares.
The -- let me show you what I mean.
I just think those numbers are no accident, If I take my matrix two, six, six, and twenty, and I do elimination, then the pivot is two and I take three, what's the multiplier? How much of row one do I take away from row two? Three. So what you're seeing in this, completing the square, is the pivots outside and the multiplier inside. Just do that again? The pivot is two, three -- three of those away from that gives me two, six, zero, and what's the second pivot? Three of this away from this, three sixes'll be eighteen, and the second pivot will also be a two.
So that's the U, this is the A, and of course the L was one, zero, one, and the multiplier was three. So, completing the square is elimination. Why I happy to see, happy to see that coming together? Because I know about elimination for m by m matrices. I just started talking about completing the square, here, for two by twos.
But now I see what's going on. Completing the square really amounts to splitting this thing into a sum of squares, so what's the critical thing -- I have a lot of squares, and inside those squares are multipliers but they're squares, and the question is, what's outside these squares? When I complete the square, what are the numbers that go outside? They're the pivots.
They're the pivots, and that's why positive pivots give me sum of squares. Positive pivots, those pivots are the numbers that go outside the squares, so positive pivots, sum of squares, everything positive, graph goes up, a minimum at the origin, it's all connected together; all connected together. And in the two by two case, you can see those connections, but linear algebra now can go up to three by three, m by m. Let's do that next.
Can I just, before I leave two by two, I've written this expression "matrix of second derivatives." What's the matrix of second derivatives? That's one second derivative now, but if I'm in two dimensions, I have a two by two matrix, it's the second x derivative, the second x derivative goes there -- shall I write it -- fxx, if you like, fxx, that means the second derivative of f in the x direction.
fyy, second derivative in the y direction.
Those are the pure derivatives, second derivatives.
They have to be positive. For a minimum.
This number has to be positive for a minimum.
That number has to be positive for a minimum. But, that's not enough. Those numbers have to somehow be big enough to overcome this cross-derivative, Why is the matrix symmetric? Because the second derivative f with respect to x and y is equal to -- I can, that's the beautiful fact about second derivatives, is I can do those in either order and I get the same thing.
So this is the same as that, and so, that's the matrix of second derivatives. And the test is, it has to be positive definite. You might remember, from, tucked in somewhere near the end of eighteen o' two or at least in the book, was the condition for a minimum, For a function of two variables. Let's -- when do you have a minimum? For a function of two variables, believe me, that's what Calculus is for. The condition is first derivatives have to be zero. And the matrix of second derivatives has to be positive definite.
So you maybe remember there was an fxx times an fyy that had to be bigger than an an fxy squared, that's just our determinant, two by two.
But now, we now know the answer for three by three, m by m, because we can do elimination by m by m matrices, we can connect eigenvalues of m by m matrices, we can do sum of squares, sum of m squares instead of only two squares; and so let's take a, let me go over here to do a three by three example.
So, three by three example. OK.
Oh, let me -- shall I use my favorite matrix? You've seen this matrix before. Yes, let's use the good matrix, four by one, oops, open.
Is that matrix positive definite? What's -- so I'm going to ask questions about this matrix, is it positive definite, first of all? What's the function associated with that matrix, what's the x transpose Ax? Is -- do we have a minimum for that function, at zero? And then even what's the geometry? OK. First of all, is the matrix positive definite, now I've given you the numbers there so you can take the determinants, maybe that's the quickest, is that what you would do mentally, if I give you all a matrix on a quiz and say is it positive definite or not? I would take that determinant and I'd give the answer two.
I would take that determinant and I would give the answer for that two by two determinant, I'd give the answer three, and anybody remember the answer for the three by three determinant? It was four, you remember for these special matrices, when we do determinants, they went up two, three, four, five, six, they just went up linearly. So that matrix has -- the determinants are two, three, and four.
Pivots. What are the pivots for that matrix? I'll tell you, they're -- the first pivot is two, the next pivot is three over two, the next pivot is four over three.
Because, the product of the pivots has to give me those determinants. The product of these two pivots gives me that determinant; the product of all the pivots gives me that determinant.
What are the eigenvalues? Oh, I don't know.
The eigenvalues I've got, what do I have a cubic equation -- a degree three equation? There are three eigenvalues to find. If I believe what I've said today, what do I know about these eigenvalues, even though I don't know the exact numbers. I -- I remember the numbers.
Because these matrices are so important that people figure them. But -- what do you believe to be true about these three eigenvalues -- you believe that they are all positive. They're all positive.
I think that they are two minus square root of two, two, and two plus the square root of two.
I think. Let me just -- I can't write those numbers down without checking the simple checks, what the first simple check is the trace, so if I add those numbers I get six and if I add those numbers I get six. The other simple test is the determinant, if I -- can you do this, can you multiply those numbers together? I guess we can multiply by two out there. What's two minus square root of two times two plus square root of two, that'll be four minus two, that'll be two, yeah, two times two, that's got the determinant, right, so it's got, it's got a chance of being correct and I think it is. Now, what's the x transpose Ax? I better give myself enough room for that. x transpose Ax for this guy.
It's two x1 squareds, and two x2 squareds, and two x3 squareds. Those come from the diagonal, those were easy. Now off the diagonal there's a minus and a minus, they come together there'll be a minus two minus two whats? Are coming from this one two and two one position, is the x1 x2. I'm doing mentally a multiplication of this matrix times a row vector on the left times a column vector on the right, and I know that these numbers show up in the answer. The diagonal is the perfect square, this off diagonal is a minus two x1 x2, and there are no x1 x3-s, and there're minus two x2 x3-s.
And I believe that that expression is always positive. I believe that that curve, that graph, really, of that function, this is my function f, and I'm in more dimensions now than I can draw, it -- but the graph of that function goes upwards.
It's a bowl. Or maybe the right word is -- just forgot, what's a long word for bowl? Hm, maybe paraboloid, I think, paraboloid comes in.
I'll edit the tape and get that word in.
Bowl, let's say, is, that, so that, and if I can -- I could complete the squares, I could write that as the sum of three squares, and those three squares would get multiplied by the three pivots. And the pivots are all positive. So I would have positive pivots times squares, the net result would be a positive function and a bowl which goes upwards.
And then, finally, if I cut -- if I slice through this bowl, if I -- now I'm asking you to stretch your visualization here, because I'm in four dimensions, I've got x1 x2 x3 in the base, and this function is z, or f, or something. And its graph is going up. But I'm in four dimensions, because I've got three in the base and then the upward direction, but now if I cut through this four-dimensional picture, at level one, so, suppose I cut through this thing at height one.
So I take all the points that are at height one.
That gives me -- it gave me an ellipse over there, in that two by two case, in this case, this will be the equation of an ellipsoid, a football in other words. Well, not quite a football.
A lopsided football. What would be, can I try to describe to you what the ellipsoid will look like, this ellipsoid, I'm sorry that the, that I've ended the matrix right -- at the point, let's -- let me be sure you've seen the equation.
Two x1 squared, two x2 squared, two x3 squared, minus two of the cross parts, equal what? That is the equation of a football, so what do I mean by a football or an ellipsoid? I mean that, well, I'll draw a few.
It's like that, it's got a center, and it's got it's got three principal directions.
This ellipsoid. So -- you see what I'm saying, if we have a sphere then all directions would be the same.
If we had a true football, or it's closer to a rugby ball, really, because it's more curved than a football, it would have one long direction and the other two would be equal. That would be like having a matrix that had one eigenvalue repeated.
And then one other different. So this sphere comes from, like, the identity matrix, all eigenvalues the same.
Our rugby ball comes from a case where -- three, the three, two of the three eigenvalues are the same. But we know how the case where -- the typical case, where the three eigenvalues were all different. So this will have -- How do I say it, if I look at this ellipsoid correctly, it'll have a major axis, it'll have a middle axis, and it'll have a minor axis. And those three axes will be in the direction of the eigenvectors. And the lengths of those axes will be determined by the eigenvalues. I can get -- turn this all into linear algebra, because we have -- the right thing we know about eigenvectors and eigenvalues, for that matrix is what? Of -- let me just tell you that, repeat the main linear algebra point.
How could we turn what I said into algebra; we would write this A as Q, the eigenvector matrix, times lambda, the eigenvalue matrix times Q transposed. The principal axis theorem, we'll call it, now.
The eigenvectors tell us the directions of the principal axes. The eigenvalues tell us the lengths of those axes, actually the lengths, or the half-lengths, or one over the eigenvalues, it turns out. And that is the matrix factorization which is the most important matrix factorization in our eigenvalue material so far.
That's diagonalization for a symmetric matrix, so instead of the inverse I can write the transposed.
OK. I've -- so what I've tried today is to tell you the -- what's going on with positive definite matrices. Ah, you see all how all these pieces are there and linear algebra connects them.
OK. See you on Friday.