Instructor: Prof. Gilbert Strang
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation, or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
PROFESSOR STRANG: Actually, two things to say about eigenvalues. One is about matrices in general and then the second is to focus on our favorites, those second derivatives and second differences. There's a lot to say about eigenvalues but then we'll have the main ideas. So the central idea of course is to find these special directions and we expect to find n directions, n eigenvectors y where this n by n matrix is acting like a number in each of those directions. So we have this for n different y's and each one has its own eigenvalue lambda. And of course the eig command in MATLAB will find the y's and the lambdas. So it finds the y's and the lambdas in a matrix. So that's what I'm going to do now, straightforward. Any time I have n vectors, so I have n of these y's, I've n y's and n lambdas. Well, if you give me n vectors, I put them into the columns of a matrix, almost without thinking. So can I just do that? So there is y_1, the first eigenvector. That's y_2 to y_n. Okay, that's my eigenvector matrix. Often I call it S. So I'll stick with that. S will be the eigenvector matrix.
Since these are eigenvectors I'm going to multiply that matrix by A. That should bring out the key point. I'm just going to repeat this, which is one at a time, by doing them all at once. So what happens if I multiply a matrix by a bunch of columns? Matrix multiplication is wonderful. It does the right thing. It multiplies A times the first column. So let's put that there. A times the first column along to A times the last column. Just column by column. But now we recognize these. They're special y's. They're special because they're eigenvectors. So this is lambda_1*y_1 along to that column, which is lambda_n*y_n. Right? Now I've used the fact that they were eigenvectors. And now, one final neat step of matrix multiplication is to factor out this same eigenvector matrix again and realize, and I'll look at it, that it's being multiplied by a diagonal matrix of eigenvalues.
So let's just look at that very last step here. Here I had the first column was lambda_1*y_1. I just want to see, did I get that right? If I'm looking at the first column where that lambda_1 is sitting, it's going to multiply y_1 and it'll be all zeroes below so I'll have none of the other eigenvectors. So I'll have lambda_1*y_1, just what I want. Got a little squeezed near the end there, but so let me write above. The result is just A times this eigenvector matrix that I'm going to call S equals what? This is Ay=lambda*y for all n at once. A times S is, what have I got here? What's this? That's S. And what's the other guy? That's the eigenvalue matrix. So it's just got n numbers. They automatically go on the diagonal and it gets called capital Lambda. Capital Lambda for the matrix, little lambda for the numbers. So this is n, this is all n at once. Straightforward.
Now I'm going to assume that I've got these n eigenvectors, that I've been able to find n independent directions. And almost always, you can. For every symmetric matrix you automatically can. So these y's are independent directions. If those are the columns of a matrix, yeah, here's a key question about matrices. What can I say about this matrix S if its n columns are independent? Whatever that means exactly, we haven't focused on it in careful detail, but we have an idea. That means sort of none of them are combinations of the others. We really have n separate directions. Then that matrix is? Invertible. A matrix that's got n columns, independent, that's what we're hoping for. That matrix has an inverse. We can produce, well, all the good facts about matrices. This is a square matrix. So I can invert it if you like. And I can write A as S*Lambda*S inverse. I'm multiplying on the right by this S inverse.
And there I have the diagonalization of a matrix. The matrix has been diagonalized. And what does that mean? Well this is, of course the diagonal that we're headed for. And what it means is that if I look at my matrix and I separate out the different eigendirections, I could say, that the matrix in those directions is just this diagonal matrix. So that's a short way of saying it.
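As a quick numerical check of the two statements above, A*S = S*Lambda and A = S*Lambda*S inverse, here is a minimal sketch in Python with NumPy, whose np.linalg.eig plays the role of MATLAB's eig. The 2 by 2 matrix A is an assumption chosen only for illustration; it is not a matrix from the lecture.

```python
import numpy as np

# Example matrix (an assumption for illustration, not from the lecture).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# eig returns the eigenvalues and a matrix S whose columns are eigenvectors.
lam, S = np.linalg.eig(A)
Lam = np.diag(lam)                      # capital Lambda: eigenvalues on the diagonal

# All n eigenvalue equations at once: A S = S Lambda.
AS_matches = np.allclose(A @ S, S @ Lam)

# Diagonalization: A = S Lambda S^{-1} (needs independent eigenvectors).
A_rebuilt = S @ Lam @ np.linalg.inv(S)
rebuilt_matches = np.allclose(A_rebuilt, A)

print(lam, AS_matches, rebuilt_matches)
```

The order in which eig lists the eigenvalues is not something to rely on; what matters is that column i of S goes with entry i of lam.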
Let me just carry one step further. What would A squared be? Well now that I have it in this cool form, S*lambda*S inverse, I would multiply two of those together and what would I learn? If I do that multiplication what comes out? First an S from here. And then what? Lambda squared. Why lambda squared? Because in the middle is the S S inverse that's giving the identity matrix. So then the lambda multiplies the lambda and now here is S inverse. Well A squared is S*lambda squared*S inverse. What does that tell me in words? That tells me that the eigenvectors of A squared are? The same. As for A. And it tells me that the eigenvalues of A squared are? The squares. So I could do this. Maybe I did it before, one at a time. Ay=lambda*y, multiply again by A. A squared*y is lambda*Ay, but Ay is lambda*y so I'm up to lambda squared*y.
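The one-at-a-time argument can be followed with actual numbers. This is a small sketch in plain Python; the 2 by 2 matrix and its eigenvector y = (1, -1) with lambda = 3 are worked out by hand as an illustration and are not taken from the lecture.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[2.0, -1.0],
     [-1.0, 2.0]]
y = [1.0, -1.0]      # eigenvector of A (hand-computed assumption)
lam = 3.0            # its eigenvalue: A y = 3 y

Ay = matvec(A, y)    # lambda * y
AAy = matvec(A, Ay)  # A squared times y, which should be lambda squared * y

print(Ay, AAy)
```

Same direction y both times; only the stretching factor changes, from lambda to lambda squared.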
You should just see that when you've got these directions then your matrix is really simple. Effectively it's a diagonal matrix in these good directions. So that just shows one way of seeing-- And of course what about A inverse? We might as well mention A inverse. Suppose A is invertible. Then what do I learn about A inverse? Can I just invert that? I'm just playing with that formula, so you'll kind of, like, get handy with it.
What would the inverse be if I have three things in a row multiplied together? What's the inverse? So I'm going to take the inverses in the opposite order, right? So the inverse of that will come first. So what's that? Just S. The lambda in the middle gets inverted and then the S at the left, its inverse comes at the right. Well what do I learn from that? I learn that the eigenvector matrix for A inverse is? Same thing, again. Same. Let me put just "Same". What's the eigenvalue matrix for A inverse? It's the inverse of this guy, so what does it look like? It's got one over lambdas. That's all it says. The eigenvalues for A inverse are just one over the eigenvalues for A.
If that is so, and it can't be difficult, we could again prove it one at a time. This is my starting point, always: Ay=lambda*y. How would I get to A inverse now and recover this fact about the eigenvalues of the inverse, that you just turn them upside down? If A has an eigenvalue seven, A inverse will have an eigenvalue 1/7. What do I do? Usually multiply both sides by something sensible. Right? What shall I multiply both sides by? A inverse sounds like a good idea, right. So I'm multiplying both sides by A inverse, so that just leaves y on the left, and on the right is that number lambda times A inverse times y. Well, maybe I should do one more thing. What else shall I do? Divide by lambda. Take that number lambda and put it over here as one over lambda. Well, that's just exactly what we're looking for: A inverse times y is (1/lambda)*y. The same y is an eigenvector of A inverse and the eigenvalue is one over lambda.
Oh, and of course, I should have said before I inverted anything, what should I have said about the lambdas? Not zero. Right? A zero eigenvalue is a signal the matrix isn't invertible. So that's a perfect test. If the matrix is invertible, all its eigenvalues are nonzero. If it's singular, it's got a zero eigenvalue. If a matrix is singular, then Ay would be 0y for some nonzero y; there'd be a vector that that matrix kills. If A is not invertible, there's a reason for it. It's because it takes some vector to zero, and of course, you can't bring it back to life. So shall I just put that up here? Lambda=0 would tell me I have a singular matrix. All lambdas not equal to zero would tell me I have an invertible matrix. These are straightforward facts. It's taken down in this row and it's just really handy to have up here.
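Both facts, eigenvalues of A inverse are one over the eigenvalues of A, and a singular matrix has a zero eigenvalue, can be checked numerically. A minimal sketch with NumPy follows; the two small matrices are assumptions picked for illustration.

```python
import numpy as np

# Invertible example (an assumption): its eigenvalues are 1 and 3.
A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
lam = np.linalg.eigvals(A)
lam_inv = np.linalg.eigvals(np.linalg.inv(A))

# Eigenvalues of A inverse should be the reciprocals 1/lambda.
reciprocals_match = np.allclose(np.sort(lam_inv), np.sort(1.0 / lam))

# Singular example (an assumption): the second row is twice the first.
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])
lam_B = np.linalg.eigvals(B)          # one of these is zero up to roundoff
killed = B @ np.array([2.0, -1.0])    # a vector that B takes to zero

print(reciprocals_match, lam_B, killed)
```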
Well now I'm ready to move toward the specific matrices, our favorites. Now, those are symmetric. So maybe before I leave this picture, we better recall what is special when the matrix is symmetric. So that's going to be the next thing. So if A is symmetric I get some extra good things. So instead of A, I'll use K. That'll be my letter for the best matrices. So symmetric. So now what's the deal with symmetric matrices? The eigenvalues, the lambdas. I'll just call them the lambdas and the y's. The lambdas are, do you remember from last time? If I have a symmetric matrix, the eigenvalues are all? Anybody remember? They're all real numbers. You can never run into complex eigenvalues if you start with a symmetric matrix. We didn't prove that but it's just a few steps like those. And what about, most important, what about the y's? The eigenvectors. They are, or can be chosen to be, or whatever, anybody remember that fact? These are, like, the golden facts. Every sort of bunch of matrices reveals itself through what its eigenvalues are like and what its eigenvectors are like. And the most important class is symmetric and that reveals itself through real eigenvalues and...? Orthogonal, good. Orthogonal eigenvectors. And in fact, since it's an eigenvector, I can adjust its length as I like. Right? If y is an eigenvector, 11y is an eigenvector because I would just multiply both sides by 11. That whole line of eigenvectors is getting stretched by lambda.
So what I want to do is make them unit vectors. MATLAB will automatically produce, eig would automatically give you vectors that have been normalized to unit length. Here's something good. So what does orthogonal mean? That means that the dot product of one of them with another one is? Now wait, I didn't write the dot product yet. What symbol do I have to write on the left-hand side? Well you could say, just put a dot. Of course. But dots are not cool, right? So maybe I should say inner product, that's the more upper-class word. But I want to use transpose. So it's y_i transpose times y_j. That's the dot product. And that's the test for perpendicular. So what's the answer then? I get a zero if i is different from j. If I'm taking two different eigenvectors and I take their dot product, that's what you told me, they're orthogonal.
And now what if i equals j? If I'm taking the dot product of each eigenvector with itself. So what does the dot product of a vector with itself give? It'll be one because I've normalized. Exactly. What it always gives, the dot product of a vector with itself, you just realize that that'll be the first component squared plus the second component squared and so on, it'll be the length squared. And here we're making the length to be one.
Well once again, if I write something down like this which is straightforward I want to express it as a matrix statement. So I want to multiply, it'll involve my good eigenvector matrix. And this will be what? I want to take all these dot products at once. I want to take the dot product of every y with every other y. Well here you go. Just put these guys in the rows, now that we see that it really was the transpose multiplying y. Do you see that that's just done it? In fact, you'll tell me what the answer is here. Don't shout it out, but let's take two or three entries and then you can shout it out. So what's the (1, 1) entry here? I guess this one is what we called S. And now this would be its transpose. And what I'm saying is if I take-- Yeah, this is important because throughout this course we're going to be taking A transpose A, S transpose S, Q transpose Q, often, often, often. So here's the first time we get at it. So why did I put a zero there? That's not it. What is it? What is that first entry? One. Because that's the row times the column, that's a one. And what's the entry next to it? Zero. Right? y_1 dot product with y_2 is, we're saying, zero.
So what matrix have I got here? I've got the identity. Because y_2 with y_2 will put a one there and all zeroes elsewhere. Zero, zero. And y_3 times y_3 will be the one. I get the identity. So this is for symmetric matrices. In general, we can't expect the eigenvectors to be orthogonal. It's these special ones that are. But they're so important that we notice. Now so this is the eigenvector matrix S and this is its transpose. So I'm saying that for a symmetric matrix, S transpose times S is I. Well that's pretty important. In fact, that's important enough that I'm going to give an extra name to S, the eigenvector matrix, when it comes from a symmetric matrix. A matrix with S transpose times S equaling the identity is really a good matrix to know. So let's just focus on those guys. I can put that up here. So here's a matrix.
Can I introduce a different letter than S? It just helps you to remember that this remarkable property is in force. That we've got it. So I'm going to call it-- When K is a symmetric matrix, I'll just repeat that, then its eigenvector matrix has this S transpose times S-- I'm going to call it Q. I'm going to call the eigenvectors, so for this special situation, A times-- So I'm going to call the eigenvector matrix Q. It's the S but it's worth giving it this special notation to remind us that this is, so Q is, an orthogonal matrix. There's a name for matrices with this important property. And there's a letter Q that everybody uses. An orthogonal matrix. And what does that mean? Means just what we said, Q transpose times Q is I. What I've done here is just giving a special, introducing a special letter Q, a special name, orthogonal matrix for what we found in the good, in this-- for eigenvectors of a symmetric matrix.
And this tells me one thing more. Look what's happening here. Q transpose times Q is giving the identity. What does that tell me about the inverse of Q? That tells me here some matrix is multiplying Q and giving I. So what is this matrix? What's another name for this Q transpose? Is also Q inverse. Because that's what defines the inverse matrix, that times Q should give I. So Q transpose is Q inverse. I'm moving along here.
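Here is a small numerical illustration of Q transpose times Q = I and Q transpose = Q inverse. It is a sketch using NumPy's np.linalg.eigh, the symmetric-matrix version of eig, which returns real eigenvalues and unit-length eigenvectors. The 3 by 3 symmetric matrix is an assumption chosen for illustration.

```python
import numpy as np

# Any symmetric matrix will do; this one is an assumption for illustration.
K = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])

# eigh is meant for symmetric matrices: real lambdas, orthonormal columns in Q.
lam, Q = np.linalg.eigh(K)

QtQ = Q.T @ Q                       # every dot product y_i^T y_j at once
is_identity = np.allclose(QtQ, np.eye(3))
transpose_is_inverse = np.allclose(Q.T, np.linalg.inv(Q))

print(np.round(QtQ, 12))
print(is_identity, transpose_is_inverse)
```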
Yes, please. The question was, shouldn't I call it an orthonormal matrix? The answer is yes, I should. But nobody does. Dammit! So I'm stuck with that name. But orthonormal is the proper name. If you call it an orthonormal matrix, I'm happy because that's really the right name for that matrix, orthonormal. Because orthogonal would just mean orthogonal columns but we've taken this extra little step to make all the lengths one. And then that gives us this great property. Q transpose is Q inverse.
Orthogonal matrices are like rotations. I better give an example of an orthogonal matrix. I'll do it right under here. Here is an orthogonal matrix. So what's the point? It's supposed to be a unit vector in the first column so I'll put cos(theta), sin(theta). And now what can go in the second column of this orthogonal matrix? It's gotta be a unit vector again because we've normalized and it's gotta be, what's the connection to the first column? Orthogonal, gotta be orthogonal. So I just wanted to put something here that sum of squares is one, so I'll think cos(theta) and sin(theta) again. But then I've got to flip them a little to make it orthogonal to this. So if I put minus sin(theta) there and plus cos(theta) there that certainly has length one, good. And the dot product, can you do the dot product of that column with that column? It's minus sine cosine, plus sine cosine, zero. So there is a two by two, actually that's a fantastic building block out of which you could build many orthogonal matrices of all sizes. That's a rotation by theta. That's a useful matrix to know. It takes every vector, swings it around by an angle theta. What do I mean? I mean that Qx, Q times a vector x, rotates x by theta. Let me put it. Qx rotates whatever vector x you give it, you multiply by Q, it rotates it around by theta, it doesn't change the length. So that would be an eigenvector matrix of a pretty typical two by two.
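The rotation example can be checked directly: the two columns are orthogonal unit vectors, and Qx has the same length as x but is turned by theta. A minimal sketch in plain Python; the angle and the test vector x are arbitrary choices made for illustration.

```python
import math

def rotation(theta):
    """2 by 2 rotation matrix with columns (cos, sin) and (-sin, cos)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s,  c]]

def apply(Q, x):
    """Multiply the 2 by 2 matrix Q by the vector x."""
    return [Q[0][0] * x[0] + Q[0][1] * x[1],
            Q[1][0] * x[0] + Q[1][1] * x[1]]

theta = math.pi / 6          # arbitrary angle (assumption)
Q = rotation(theta)

col1 = [Q[0][0], Q[1][0]]
col2 = [Q[0][1], Q[1][1]]
dot12 = col1[0] * col2[0] + col1[1] * col2[1]   # -sin*cos + sin*cos

x = [3.0, 4.0]               # arbitrary vector of length 5 (assumption)
Qx = apply(Q, x)
len_x = math.hypot(x[0], x[1])
len_Qx = math.hypot(Qx[0], Qx[1])
turned_by = math.atan2(Qx[1], Qx[0]) - math.atan2(x[1], x[0])

print(dot12, len_x, len_Qx, turned_by)
```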
I see as I talk about eigenvectors, eigenvalues there's so much to say. Because everything you know about a matrix shows up somehow in its eigenvectors and eigenvalues and we're focusing on symmetric guys. What happens to this A=S*lambda*S inverse? Let's write that again. Now we've got K. It's S*lambda*S inverse like any good diagonalization but now I'm giving S a new name, which is what? Q. Because when I use that letter K I'm thinking symmetric, so I'm in this special situation of symmetric. I have the lambda, the eigenvalue matrix, and here I have Q inverse. But there's another little way to write it and it's terrifically important in mechanics and dynamics, everywhere. It's simple now. We know everything. Q lambda what? Q transpose.
Do you see the beauty of that form? That's called the principal axis theorem in mechanics. It's called the spectral theorem in mathematics. It's diagonalization, it's quantum mechanics, everything. Any time you have a symmetric matrix there's the wonderful statement of how it breaks up when you look at its orthonormal eigenvectors and its real eigenvalues. Do you see that once again the symmetry has reappeared in the three factors? The symmetry has reappeared in the fact that this factor is the transpose of this one. We saw that for elimination when these were triangular. That makes me remember what we had in a different context, in the elimination when things were triangular we had K=L*D*L transpose. I just squeezed that in to ask you to sort of think of the two as two wonderful pieces of linear algebra in such a perfect shorthand, perfect notation. This was lower triangular times the pivot matrix times the upper triangular. This is orthogonal times the eigenvalue matrix times its transpose. And the key point here was triangular and the key point here is orthogonal. That took some time, but it had to be done. This is the right way to understand. That's the central theme, it's a highlight of a linear algebra course and we just went straight to it.
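To see K = Q*Lambda*Q transpose with numbers, here is a sketch for the 3 by 3 fixed-fixed second difference matrix, using NumPy's np.linalg.eigh. The size n = 3 is an assumption for illustration. The last lines rebuild K as a sum of the pieces lambda_i times y_i times y_i transpose, which is the same factorization written one eigendirection at a time.

```python
import numpy as np

# 3 by 3 second difference matrix: twos on the diagonal, minus ones beside it.
K = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

lam, Q = np.linalg.eigh(K)            # real eigenvalues (ascending), orthonormal Q

# Spectral factorization: K = Q Lambda Q^T, no inverse needed.
K_rebuilt = Q @ np.diag(lam) @ Q.T

# Same thing, one eigendirection at a time: sum of lambda_i * y_i * y_i^T.
K_from_pieces = sum(lam[i] * np.outer(Q[:, i], Q[:, i]) for i in range(3))

print(lam)
print(np.allclose(K_rebuilt, K), np.allclose(K_from_pieces, K))
```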
And now what I wanted to do was look now at the special K. Oh, that's an awful pun. The special matrices that we have, so those are n by n. And as I said last time, usually it's not very likely that we find all the eigenvalues and eigenvectors of this family of bigger and bigger matrices. So now I'm going to specialize to my n by n matrix K equals twos down the diagonal, minus ones above and minus ones below. What are the eigenvalues of that matrix and what are the eigenvectors? How to tackle that? The best way is the way we've done with the inverse and other ways of understanding K, was to compare it with the continuous problem. So this is a big matrix which is a second difference matrix, fixed-fixed. Everybody remembers that the boundary conditions associated with this are fixed-fixed.
I want to ask you to look at the corresponding differential equation. So you may not have thought about eigenvectors of differential equations. And maybe I have to call them eigenfunctions but the idea doesn't change one bit. So what shall I look at? K corresponds to what? Continuous differential business, what derivative, what? So I would like to look at Ky=lambda*y. I'm looking for the y's and lambdas and the way I'm going to get them is to look at, what did you say it was? K, now I'm going to write down a differential equation that's like this but we'll solve it quickly. So what will it be? K is like, tell me again. Second derivative of y with respect to x squared. And there's one more thing you have to remember. Minus. And here we have lambda*y(x). That's an eigenvalue and an eigenfunction that we're looking at for this differential equation.
Now there's another thing you have to remember. And you'll know what it is and you'll tell me. I could look for all the solutions. Well, let me momentarily do that. What functions have minus the second derivative is a multiple of the function? Can you just tell me a few? Sine and cosine. I mean this is a fantastic eigenvalue problem because its solutions are sines and cosines. And of course we could combine them into exponentials. We could have sine(omega*x) or cos(omega*x) or we could combine those into e^(i*omega*x), would be a combination of those, or e^(-i*omega*x). Those are combinations of these, so those are not new. We've gotten lots of eigenfunctions. Oh, for every frequency omega this solves the equation. What's the eigenvalue? If you guess the eigenfunction you've got the eigenvalue just by seeing what happens.
So what would the eigenvalue be? Tell me again. Omega squared. Because I take the second derivative of the sine, that'll give me the cosine back to the sine, omega squared comes out, omega comes out twice. Comes out with a minus sign from the cosine and that minus sign is just right to make it plus. Lambda is omega squared. So omega squared. All the way of course. Those are the eigenvalues.
All our differential examples had something more than just the differential equation. What's the additional thing that a differential equation comes with? Boundary conditions. With boundary conditions. Otherwise we've got too many. I mean we don't want all of these guys. What boundary conditions? If we're thinking about K, our boundary conditions should be fixed and fixed: y(0)=0 and y(1)=0. So that's the full problem. This is part of the problem, not just an afterthought. Now these conditions, that will be perfect. Instead of having all these sines and cosines we're going to narrow down to a family that satisfies the boundary conditions. First boundary condition is it has to be zero at x=0. What does that eliminate now? Cosines are gone, keeps the sines. Cosines are gone by that first boundary condition. These are the guys that are left. I won't deal with the exponentials at this point because I'm down to sines already from one boundary condition. And now, the other boundary condition. The other boundary condition has to hold at x=1. If it's going to work, sin(omega*x) has to be? Nope, what do I put now? sin(omega), right? x is one. I'm plugging in here. I'm just plugging in x=1 to satisfy it. And it has to equal zero. So that pins down omega. It doesn't give me just one omega; well, tell me one omega that's okay then. The first omega that occurs to you is? Pi. The sine comes back to zero at pi. So we've got one. y_1. Our first guy, with omega=pi, is sin(pi*x).
That's our fundamental mode. That's the number one eigenfunction. And it is an eigenfunction, it satisfies the boundary condition. Everybody would know its picture, just one arch of the sine function. And the lambda that goes with it, lambda_1, so this is the first eigenfunction, what's the first eigenvalue? Pi squared, right. Because omega, we took to be pi. So lambda_1 is pi squared. We've got one. We were able to do it because we could solve this equation in an easy way. Ready for a second one? What will the next one be? The next eigenfunction it's got to, whatever its frequency is, omega, it's got to have sin(omega)=0. What's your choice? 2pi. So the next one is going to be sin(2pi*x). And what will be the eigenvalue that goes with that guy? lambda_2 will be omega squared, which is 2pi squared, 2pi all squared, so that's four pi squared. You see the whole list. The sines with these correct frequencies are the eigenfunctions of the second derivative with fixed-fixed boundary conditions. And this is entirely typical. We don't have just n of them. The list goes on forever, right? The list goes on forever because we're talking here about a differential equation. A differential equation's somehow like a matrix of infinite size. And somehow these sines are the columns of the infinite size eigenvector matrix. And these numbers, pi squared, four pi squared, nine pi squared, 16pi squared are the eigenvalues of the infinite eigenvalue matrix. We got those answers quickly.
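A rough numerical check of this list: for y = sin(k*pi*x), minus the second derivative should equal (k*pi) squared times y, and y should vanish at both ends. The sketch below, in plain Python, approximates the second derivative with a centered difference; the sample point x0 = 0.3 and the step dx are arbitrary choices, so agreement is only up to the accuracy of that approximation.

```python
import math

def y(k, x):
    """Candidate eigenfunction sin(k*pi*x) for the fixed-fixed problem."""
    return math.sin(k * math.pi * x)

def minus_second_derivative(k, x, dx=1e-4):
    """Centered-difference approximation of -y'' at x (an approximation)."""
    return -(y(k, x + dx) - 2.0 * y(k, x) + y(k, x - dx)) / dx ** 2

x0 = 0.3                               # arbitrary interior point (assumption)
results = []
for k in (1, 2, 3, 4):
    lam = (k * math.pi) ** 2           # pi^2, 4 pi^2, 9 pi^2, 16 pi^2
    lhs = minus_second_derivative(k, x0)
    rhs = lam * y(k, x0)
    results.append((k, lam, lhs, rhs, y(k, 0.0), y(k, 1.0)))

for row in results:
    print(row)
```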
And let's just mention that if I changed to free-fixed or to free-free I could repeat. I'd get different y's. If I have different boundary conditions I expect to get different y's. In fact, what would it look like if that was y'=0 at the left end? What would you expect the eigenfunctions to look like? They'd be cosines. They'd be cosines. And then we would have to adjust the omegas to make them come out right at the right-hand end. So this y(0)=0, the fixed ones gave us sines, the free ones give us cosines, the periodic ones if I had y(0)=y(1) so that I'm just circling around, then I would expect these e^(ikx)'s -- I'm in the eigenvalue section of the textbook, of course, and the textbook lists the answers for the other possibilities. Let's go with this one. Because this is the one that corresponds to K.
We're now ready for the final moment. And it is can we guess the eigenvectors for the matrix? Now I'm going back to the matrix question. And as I say, normally the answer's no. Who could guess? But you can always hope. You can try. So what will I try? Here, let me draw sin(x), sin(pi*x). And let me remember that my matrix K was a finite difference matrix. Let's make it four by four. One, two, three, four let's say. What would be the best I could hope for, for the eigenvector, the first eigenvector? I'm hoping that the first eigenvector of K is very, very like the first eigenfunction in the differential equation, which was this sin(pi*x), so that's sin(pi*x). Well, what do you hope for? What shall I hope for as the components of y_1, the first eigenvector? It's almost too good. And as far as I know, basically it only happens with these sines and cosines example. These heights, I just picked these, what I might call samples, of the thing. Those four values and of course zero at that end and zero at that end, so because K, the matrix K is building in the fixed-fixed. These four heights, these four numbers, those four sines-- In other words, what I hope is that for Ky=lambda*y, I hope that y_1, the first eigenvector, it'll be sin(pi*x), but now what is x? So this is x here from zero to one. So what's x there, there, there and there? Instead of sin(pi*x), the whole curve, I'm picking out those four samples. So it'll be the sine of, what'll it be here? Pi. Pi divided by n+1.
Which in my picture would be, we'll make it completely explicit. Five. It's 1/5 away along. Maybe I should make these y's into column vectors since we're thinking of them as columns. So here's y_1. sin(pi/5), sin(2pi/5), sin(3pi/5), sin(4pi/5). That's the first eigenvector. And it works. And you could guess now the general one. Well when I say it works, I haven't checked that it works. I better do that. But the essential point is that it works. I may not even do it today.
So, in fact, tell me the second eigenvector. Or tell me the second eigenfunction over here. What's the second eigenfunction? Let me draw it with this green chalk. So I'm going to draw y_2. Now what does y_2 look like? sin(2pi*x). What's the new picture here? It goes up. What does it do? By here it's got back, oh no, damn. I would've been better with three points in the middle, but it's correct. It comes down here. Right? That's sin(2pi*x). That's halfway along. I'll finish this guy. This'll be sin(2pi/5), sin(4pi/5). See I'm sampling this same thing. I'm sampling 2pi*x at those same points. sin(6pi/5) and sin(8pi/5). Maybe let's accept this as correct. It really works. It's the next eigenvector.
And then there's a third one and then there's a fourth one. And how many are there? n usually. And in my case, what is n in the picture I've drawn? n here is four. One, two, three, four. n is four in that picture and that means that I'm dividing by n+1. That's really sin(pi*h). You remember I used h as the step size. So h is 1/(n+1), which is 1/5. So y_1 is sin(pi*h), sin(2pi*h), sin(3pi*h), sin(4pi*h). And y_2 is sin(2pi*h), sin(4pi*h), sin(6pi*h), sin(8pi*h).
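The claim that the sampled sines really are eigenvectors of K can be tested by multiplying them out. The sketch below, in plain Python, does that for n = 4. The eigenvalue formula lambda_k = 2 - 2cos(k*pi*h) used in it is not stated in this part of the lecture; it is the standard result for this matrix and is brought in here as an outside fact, which the code then confirms against K*y.

```python
import math

n = 4
h = 1.0 / (n + 1)                      # step size, 1/5 in the picture

def K_times(y):
    """Apply the fixed-fixed second difference matrix:
    (Ky)_j = -y_(j-1) + 2 y_j - y_(j+1), with zeros beyond both ends."""
    p = [0.0] + list(y) + [0.0]
    return [-p[j - 1] + 2.0 * p[j] - p[j + 1] for j in range(1, len(y) + 1)]

def sine_vector(k):
    """Samples of sin(k*pi*x) at x = h, 2h, ..., nh."""
    return [math.sin(k * math.pi * j * h) for j in range(1, n + 1)]

def eigenvalue(k):
    """Standard formula for this K (an outside fact, see the lead-in)."""
    return 2.0 - 2.0 * math.cos(k * math.pi * h)

max_residual = 0.0
for k in range(1, n + 1):
    y = sine_vector(k)
    Ky = K_times(y)
    lam = eigenvalue(k)
    for a, b in zip(Ky, y):
        max_residual = max(max_residual, abs(a - lam * b))

print(max_residual)                    # roundoff-sized if K y = lambda y holds
print(eigenvalue(1) / h ** 2, math.pi ** 2)
```

Dividing the first eigenvalue by h squared gives about 9.55, in the neighborhood of pi squared (about 9.87), which is the sense in which the matrix problem imitates the differential equation.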
So I have two things to do. One is to remember what is the remarkable property of these y's. So there's a y that we've guessed. Right now you're taking my word for it that it is the eigenvector and this is the next one. I copied them out of those functions. And just remind me, what is it that I'm claiming to be true about y_1 and y_2? They are orthogonal, they are orthogonal. Well to check that I'd have to do some trig stuff. But what I was going to do was come over here and say this was a symmetric differential equation. We found its eigenfunctions. What do you think's up with those? Those are orthogonal too. So this would be a key fact in any sort of advanced applied math, that the function sin(pi*x) is orthogonal to sin(2pi*x). That function is orthogonal to this one.
And actually that's what makes the whole world of Fourier series work. So that was really a wonderful fact. That this is orthogonal to this. Now you may, quite reasonably, ask what do I mean by that? What does it mean for two functions to be orthogonal? As long as we're getting all these parallels, let's get that one too. I claim that this function, which is this, is orthogonal to this function. What does that mean? What should these functions-- Could I write dot or transpose or something? But now I'm doing it for functions. I just want you to see the complete analogy. So for vectors, what did I do? If I take a dot product I multiply the first component times the first component, second component times the second, so on, so on. Now what'll I do for functions? I multiply sin(pi*x) * sin(2pi*x) at each x. Of course I've got a whole range of x's. And then what do I do? I integrate. I can't add. I integrate instead. So I integrate one function sin(pi*x) against the other function, sin(2pi*x), dx, and I integrate from zero to one and the answer comes out zero. The answer comes out zero.
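The integral can be approximated numerically to see the zero appear. This is a sketch in plain Python using the midpoint rule with 1000 subintervals; the rule and the number of subintervals are arbitrary choices, so the results are approximations to the exact values 0 and 1/2.

```python
import math

def inner_product(f, g, a=0.0, b=1.0, m=1000):
    """Midpoint-rule approximation of the integral of f(x)*g(x) from a to b."""
    dx = (b - a) / m
    total = 0.0
    for i in range(m):
        x = a + (i + 0.5) * dx
        total += f(x) * g(x)
    return total * dx

def s1(x):
    return math.sin(math.pi * x)

def s2(x):
    return math.sin(2.0 * math.pi * x)

cross = inner_product(s1, s2)   # sin(pi x) against sin(2 pi x): near zero
self1 = inner_product(s1, s1)   # sin(pi x) against itself: near 1/2

print(cross, self1)
```

The second number is the "length squared" of sin(pi*x) as a function, the analog of y transpose y for a vector.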
The sine functions are orthogonal. The sines are orthogonal functions. The sine vectors are orthogonal vectors. I normalize to length one and they go right into my Q. So if I multiply, if I did that times that, that dot product would turn out to be zero. If I had been a little less ambitious and taken n to be two or three or something we would have seen it completely. But maybe doing with four is okay.
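The "trig stuff" for the vectors can be left to the computer. The sketch below, in plain Python with n = 4, takes the dot product of y_1 with y_2, then normalizes all four sine vectors to length one and checks that every pair of the resulting columns has dot product one on the diagonal and zero off it, which is Q transpose times Q = I. The helper names are hypothetical, introduced only for this illustration.

```python
import math

n = 4
h = 1.0 / (n + 1)

def sine_vector(k):
    """Samples of sin(k*pi*x) at x = h, 2h, ..., nh."""
    return [math.sin(k * math.pi * j * h) for j in range(1, n + 1)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

y1, y2 = sine_vector(1), sine_vector(2)
d12 = dot(y1, y2)              # should vanish: y_1 is orthogonal to y_2
len_sq = dot(y1, y1)           # length squared before normalizing

# Normalize each sine vector to length one; these are the columns of Q.
columns = []
for k in range(1, n + 1):
    y = sine_vector(k)
    length = math.sqrt(dot(y, y))
    columns.append([c / length for c in y])

# Gram matrix of all dot products: this is Q^T Q.
gram = [[dot(qi, qj) for qj in columns] for qi in columns]

print(d12, len_sq)
for row in gram:
    print([round(entry, 12) for entry in row])
```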
So great lecture except for that. Didn't get there. So Wednesday's lecture is sort of the bringing all these pieces together, positive eigenvalues, positive pivots, positive definite. So come on Wednesday please. Come Wednesday. And Wednesday afternoon I'll have the review session as usual.