Topics covered: Homogeneous Linear Systems with Constant Coefficients: Solution via Matrix Eigenvalues (Real and Distinct Case)
Instructor/speaker: Prof. Arthur Mattuck
The last time I spent solving a system of equations dealing with the chilling of this hardboiled egg being put in an ice bath. We called T1 the temperature of the yolk and T2 the temperature of the white. What I am going to do is revisit that same system of equations, but basically the topic for today is to learn to solve that system of equations by a completely different method. It is the method that is normally used in practice. Elimination is used mostly by people who have forgotten how to do it any other way. Now, in order to make it a little more general, I am not going to use the dependent variables T1 and T2 because they suggest temperature a little too closely. Let's change them to neutral variables.
I will use x = T1, and for T2 I will just use y. I am not going to re-derive anything. I am not going to re-solve anything. I am not going to repeat anything of what I did last time, except to write down to remind you what the system was in terms of these variables, the system we derived using the particular conductivity constants, two and three, respectively. The system was this one, x' = -2x + 2y.
And y' = 2x - 5y. And so we solved this by elimination. We got a single second-order equation with constant coefficients, which we solved in the usual way. From that I derived what the x was, from that we derived what the y was, and then I put them all together. I will just remind you what the final solution was when written out in terms of arbitrary constants. It was x = c1 e^(-t) + c2 e^(-6t), and y = (c1/2) e^(-t) - 2 c2 e^(-6t). That was the solution we got.
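The elimination solution quoted above can be checked against the system directly. The following is only a minimal Python sketch: the function names are illustrative, and the derivatives are written out by hand rather than computed symbolically.

```python
import math

def x(t, c1, c2):
    # x = c1 e^(-t) + c2 e^(-6t)
    return c1 * math.exp(-t) + c2 * math.exp(-6 * t)

def y(t, c1, c2):
    # y = (c1/2) e^(-t) - 2 c2 e^(-6t)
    return (c1 / 2) * math.exp(-t) - 2 * c2 * math.exp(-6 * t)

def x_prime(t, c1, c2):
    # derivative of x, term by term
    return -c1 * math.exp(-t) - 6 * c2 * math.exp(-6 * t)

def y_prime(t, c1, c2):
    # derivative of y, term by term
    return -(c1 / 2) * math.exp(-t) + 12 * c2 * math.exp(-6 * t)

def residuals(t, c1, c2):
    """Return (x' - (-2x + 2y), y' - (2x - 5y)); both should be about zero."""
    xv, yv = x(t, c1, c2), y(t, c1, c2)
    return (x_prime(t, c1, c2) - (-2 * xv + 2 * yv),
            y_prime(t, c1, c2) - (2 * xv - 5 * yv))

# The constants 3.0 and -1.5 are arbitrary sample values.
print(residuals(0.7, 3.0, -1.5))
```

Both residuals come out at roundoff level for any choice of c1 and c2, which is what it means for this to be the general solution.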
And then I went on to put in initial conditions, but we are not going to explore that aspect of it today. We will in a week or so. This was the general solution because it had two arbitrary constants in it. What I want to do now is revisit this and do it by a different method, which makes heavy use of matrices. That is a prerequisite for this course, so I am assuming that you reviewed a little bit about matrices. And it is in your book. Your book puts in a nice little review section. Two-by-two and three-by-three will be good enough for 18.03 mostly because I don't want you to calculate all night on bigger matrices, bigger systems. So nothing serious, matrix multiplication, solving systems of linear equations, n-by-n systems.
I will remind you at the appropriate places today of what it is you need to remember. The very first thing we are going to do is, let's see. I haven't figured out the color coding for this lecture yet, but let's make this system in green and the solution can be in purple. Invisible purple, but I have a lot of it. Let's abbreviate, first of all, the system using matrices. I am going to make a column vector out of (x, y). Then you differentiate a column vector by differentiating each component. I can write the left-hand side of the system as (x, y)'. How about the right-hand side?
Well, I say I can just write the matrix of coefficients, (-2, 2; 2, -5), times (x, y). And I say that this matrix equation says exactly the same thing as that green equation and, therefore, it is legitimate to put it up in green, too. The top here is x'. What is the top here? After I multiply these two I get a column vector. And what is its top entry? It is -2x + 2y. There it is. And the bottom entry the same way is 2x - 5y, just as it is down there. Now, what I want to do is, well, maybe I should translate the solution. What does the solution look like?
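To see that the matrix form really says the same thing as the two scalar equations, here is a small Python sketch. The helper `mat_vec` and the sample point are illustrative choices, not something from the lecture.

```python
def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

# Coefficient matrix of the system x' = -2x + 2y, y' = 2x - 5y
A = [[-2, 2],
     [2, -5]]

def rhs_scalar(x, y):
    """Right-hand sides written out equation by equation."""
    return [-2 * x + 2 * y, 2 * x - 5 * y]

# Sample point (3, 7): both forms give the same column vector.
print(mat_vec(A, [3, 7]), rhs_scalar(3, 7))
```

The top entry of the product is -2x + 2y and the bottom entry is 2x - 5y, exactly as in the written-out system.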
We got that, too. How am I going to write this as a matrix equation? Actually, if I told you to use matrices, use vectors, the point at which you might be most hesitant is this one right here, the very next step. Because how you should write it is extremely well-concealed in this notation. But the point is, this is a column vector and I am adding together two column vectors. And what is in each one of the column vectors?
Think of these two things as a column vector. Pull out all the scalars from them that you can. Well, you see that c1 is a common factor of both entries and so is e^(-t), that function. Now, if I pull both of those out of the vector, what is left of the vector? Well, you cannot even see it. What is left is a 1 up here and a one-half there. So I am going to write that in the following form. I will put out the c1, it's the common factor in both, and put that out front. Then I will put in the guts of the vector, even though you cannot see it, the column vector (1, 1/2). And then I will put the other scalar function in back. The only reason for putting one of these in front and one in back is visual so to make it easy to read. There is no other reason.
You could put the c1 here, you could put it here, you could put the e^(-t) in front if you want to, but people will fire you. Don't do that. Write it the standard way because that is the way that it is easiest to read. The constants out front, the functions behind, and the column vector of numbers in the middle. And so the other one will be written how? Well, here, that one is a little more transparent.
c2, 1, 2 and the other thing is e^(-6t). There is our solution. That is going to need a lot of purple, but I have it. And now I want to talk about the new method of solving the equation. It is based on just the same idea as the way we solve second-order equations. Yes, question. Oh, here. Sorry. This should be negative two. Thanks very much. What I am going to use is a trial solution. Remember when we had a second-order equation with constant coefficients the very first thing I did was I said we are going to try a solution of the form e^(rt). Why that? Well, because Euler thought of it and it has been known for 200 or 300 years that that is the thing you should do.
Well, this has not been known nearly as long because matrices were only invented around 1880 or so, and people did not really use them to solve systems of differential equations until the middle of the last century, 1950-1960. If you look at books written in 1950, they won't even talk about systems of differential equations, or talk very little anyway and they won't solve them using matrices. This is only 50 years old. I mean, my God, in mathematics that is very up to date, particularly elementary mathematics. Anyway, the method of solving is going to use a trial solution. Now, if you were left to your own devices you might say, well, let's try x = a1 e^(lambda1 t) and y = a2 e^(lambda2 t).
Now, if you try that, it is a sensible thing to try, but it will turn out not to work. And that is the reason I have written out this particular solution, so we can see what solutions look like. The essential point is here is the basic solution I am trying to find. Here is another one. Their form is a column vector of constants. But they both use the same exponential factor, which is the point. In other words, I should not use here, in my trial solution, two different lambdas, I should use the same lambda. And so the way to write the trial solution is (x, y) equals two unknown numbers, that or that or whatever, times e to a single unknown exponent factor.
Let's call it lambda t. It is called lambda. It is called r. It is called m. I have never seen it called anything but one of those three things. I am using lambda. Your book uses lambda. It is a common choice. Let's stick with it. Now what is the next step? Well, we plug into the system. Substitute into the system. What are we going to get? Well, let's do it. First of all, I have to differentiate. The left-hand side asks me to differentiate this. How do I differentiate this? Column vector times a function. Well, the column vector acts as a constant. And I differentiate that. That is lambda e^(lambda t).
So the (x, y)' = (a1, a2) e^(lambda t) lambda. Now, it is ugly to put the lambda afterwards because it is a number so you should put it in front, again, to make things easier to read. But this lambda comes from differentiating e^(lambda t) and using the chain rule. This much is the left-hand side. That is the derivative (x, y) prime.
I differentiate the x and I differentiated the y. How about the right-hand side. Well, the right-hand side is (-2, 2; 2, -5) times what? Well, times (x, y), which is (a1, a2) e^(lambda t). Now, the same thing that happened a month or a month and a half ago happens now. The whole point of making that substitution is that the e^(lambda t), the function part of it drops out completely.
And one is left with what? An algebraic equation to be solved for lambda, a1, and a2. In other words, by means of that substitution, and it basically uses the fact that the coefficients are constant, what you have done is reduced the problem of calculus, of solving differential equations, to solving algebraic equations. In some sense that is the only method there is, unless you do numerical stuff. You reduce the calculus to algebra. The Laplace transform is exactly the same thing. All the work is algebra. You turn the original differential equation into an algebraic equation for Y(s), you solve it, and then you use more algebra to find out what the original little y(t) was. It is not different here.
So let's solve this system of equations. Now, the whole problem with solving this system, first of all, what is the system? Let's write it out explicitly. Well, it is really two equations, isn't it? The first one says lambda a1 = -2a1 + 2a2. That is the first one. The other one says lambda a2 = 2a1 - 5a2.
Now, purely, if you want to classify that, that is two equations and three variables, three unknowns. The a1, a2, and lambda are all unknown. And, unfortunately, if you want to classify them correctly, they are nonlinear equations because they are made nonlinear by the fact that you have multiplied two of the variables. Well, if you sit down and try to hack away at solving those without a plan, you are not going to get anywhere. It is going to be a mess. Also, two equations and three unknowns is indeterminate. You can solve three equations and three unknowns and get a definite answer, but two equations and three unknowns usually have an infinity of solutions. Well, at this point it is the only idea that is required.
Well, this was a little idea, but I assume one would think of that. And the idea that is required here is, I think, not so unnatural, it is not to view these a1, a2, and lambda as equal. Not all variables are created equal. Some are more equal than others. a1 and a2 are definitely equal to each other, and let's relegate lambda to the background. In other words, I am going to think of lambda as just a parameter. I am going to demote it from the status of variable to parameter. If I demoted it further it would just be an unknown constant. That is as bad as you can be. I am going to focus my attention on the a1, a2 and sort of view the lambda as a nuisance. Now, as soon as I do that, I see that these equations are linear if I just look at them as equations in a1 and a2.
And moreover, they are not just linear, they are homogeneous. Because if I think of lambda just as a parameter, I should rewrite the equations this way. I am going to subtract this and move the left-hand side to the right side, and it is going to look like (-2 - lambda)a1 + 2a2 = 0. And the same way for the other one. It is going to be 2a1 + (-5 - lambda)a2 = 0. That is a pair of simultaneous linear equations for determining a1 and a2, and the coefficients involve the parameter lambda. Now, what is the point of doing that?
Well, now the point is whatever you learned about linear equations, you should have learned the most fundamental theorem of linear equations. The main theorem is that you have a square system of homogeneous equations, this is a two-by-two system so it is square, it always has the trivial solution, of course, a1, a2 equals zero. Now, we don't want that trivial solution because if a1 and a2 are zero, then so are x and y zero.
Now that is a solution. Unfortunately, it is of no interest. If the solution were x, y zero, it corresponds to the fact that this is an ice bath. The yolk is at zero, the white is at zero and it stays that way for all time until the ice melts. So that is the solution we don't want. We don't want the trivial solution. Well, when does it have a nontrivial solution? Nontrivial means non-zero, in other words.
If and only if this determinant is zero. In other words, by using that theorem on linear equations, what we find is there is a condition that lambda must satisfy, an equation in lambda in order that we would be able to find non-zero values for a1 and a2. Let's write it out. I will recopy it over here. What was it? Negative 2 minus lambda, two, here it was 2 and -5 - lambda. All right. You have to expand the determinant. In other words, we are trying to find out for what values of lambda is this determinant zero.
Those will be the good values which lead to nontrivial solutions for the a's. This is the equation lambda plus 2. See, this is minus that and minus that, the product of the two minus ones is plus one. So it is (lambda + 2)(lambda + 5), which is the product of the two diagonal elements, minus the product of the two anti-diagonal elements, which is 4, is equal to zero.
And if I write that out, what is that, that is the equation lambda^2 + 7 lambda (5 lambda plus 2 lambda), and then the constant term is 10 minus 4, which is 6. So it is lambda^2 + 7 lambda + 6 = 0. How many of you have long enough memories, two-day memories that you remember that equation? When I did the method of elimination, it led to exactly the same equation except it had r's in it instead of lambda. And this equation, therefore, is given the same name and another color. Let's make it salmon.
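The determinant condition can be checked numerically. This is a minimal Python sketch for this particular matrix only; the name `det_shifted` is an illustrative choice.

```python
def det_shifted(lam):
    """Determinant of (-2 - lam, 2; 2, -5 - lam):
    product of the diagonal entries minus product of the anti-diagonal."""
    return (-2 - lam) * (-5 - lam) - 2 * 2

def expanded(lam):
    """The same thing multiplied out: lam^2 + 7 lam + 6."""
    return lam ** 2 + 7 * lam + 6

# The two forms agree for every lam, and both vanish at lam = -1 and lam = -6.
for lam in (-6, -1, 0, 1):
    print(lam, det_shifted(lam), expanded(lam))
```

For any other value of lambda the determinant is nonzero, so the homogeneous system would have only the trivial solution.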
And it is called the characteristic equation for this method. All right. Now I am going to use the word from last time. You factor this. From the factorization we get its roots easily enough. The roots are lambda = -1 and lambda = -6 by factoring the equation. Now what am I supposed to do? You have to keep the different parts of the method together. Now I have found the only values of lambda for which I will be able to find nonzero values for the a1 and a2. For each of those values of lambda, I now have to find the corresponding a1 and a2. Let's do them one at a time. Let's take first lambda = -1.
My problem is now to find a1 and a2. Where am I going to find them from? Well, from that system of equations over there. I will recopy it over here. What is the system? The hardest part of this is dealing with multiple minus signs, but you had experience with that in determinants so you know all about that. In other words, there is the system of equations over there. Let's recopy them here. Minus 2, minus minus 1 makes minus 1. What's the other coefficient? It is just plain old 2. Good. There is my first equation. And when I substitute lambda = -1 for the second equation, what do you get?
2 a1 plus negative 5 minus negative 1 makes negative 4. There is my system that will find me a1 and a2. What is the first thing you notice about it? You immediately notice that this system is fake because this second equation is minus twice the first one. Something is wrong. No, something is right. If that did not happen, if the second equation were not a constant multiple of the first one then the only solution of the system would be a1 equals zero, a2 equals zero because the determinant of the coefficients would not be zero.
The whole function of this exercise was to find the value of lambda, negative 1, for which the system would be redundant and, therefore, would have a nontrivial solution. Do you get that? In other words, calculate the system out, just as I have done here, you have an automatic check on the method. If one equation is not a constant multiple of the other you made a mistake. You don't have the right value of lambda or you substituted into the system wrong, which is frankly a more common error. Go back, recheck first the substitution, and if convinced that is right then recheck where you got lambda from. But here everything is going fine so we can now find out what the value of a1 and a2 are.
You don't have to go through a big song and dance for this since most of the time you will have two-by-two equations and now and then three-by-three. For two-by-two all you do is, since we really have the same equation twice, to get a solution I can assign one of the variables any value and then simply solve for the other. The natural thing to do is to make a2 equal one, then I won't need fractions and then a1 will be 2.
So the solution is (2, 1). I am only trying to find one solution. Any constant multiple of this would also be a solution, as long as it wasn't zero, zero which is the trivial one. And, therefore, this is a solution to this system of algebraic equations. And the solution to the whole system of differential equations is, this is only the (a1, a2) part. I have to add to it, as a factor, lambda is negative 1, therefore, e^(-t). There is our purple thing.
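The step of finding (a1, a2) for a given lambda, together with the built-in check that one equation is a constant multiple of the other, might be sketched in Python as follows. The function names are illustrative, and the sketch assumes a two-by-two matrix with exact (integer) entries.

```python
A = [[-2, 2],
     [2, -5]]

def shifted_rows(A, lam):
    """Rows of the homogeneous system: subtract lam from the main diagonal."""
    (a, b), (c, d) = A
    return [(a - lam, b), (c, d - lam)]

def rows_are_proportional(rows):
    """True exactly when the 2x2 determinant of the rows is zero."""
    (p, q), (r, s) = rows
    return p * s - q * r == 0

def eigenvector(A, lam):
    """Return one nontrivial solution (a1, a2) of the shifted system."""
    rows = shifted_rows(A, lam)
    if not rows_are_proportional(rows):
        # The automatic check: wrong lam, or a substitution mistake.
        raise ValueError("equations are not multiples of each other")
    p, q = rows[0]
    if (p, q) == (0, 0):
        p, q = rows[1]
    if (p, q) == (0, 0):
        raise ValueError("every vector is a solution; not covered here")
    # p*a1 + q*a2 = 0 is solved by (a1, a2) = (q, -p).
    return (q, -p)

print(eigenvector(A, -1))
```

For lambda = -1 this returns (2, 1), the same vector found by setting a2 equal to one. For other eigenvalues the sketch may return a different constant multiple than the one you would pick by hand, which is equally valid.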
See how I got it? Starting with the trial solution, I first found out through this procedure what the lambda's have to be. Then I took the lambda and found what the corresponding a1 and a2 that went with it and then made up my solution out of that. Now, quickly I will do the same thing for lambda = -6. Each one of these must be treated separately. They are separate problems and you are looking for separate solutions. Lambda equals negative 6. What do I do? How do my equations look now? Well, the first one is minus 2 minus negative 6 makes plus 4.
It is 4a1 + 2a2 = 0. Then I hold my breath while I calculate the second one to see if it comes out to be a constant multiple. I get 2a1 plus negative 5 minus negative 6, which makes plus 1. And, indeed, one is a constant multiple of the other. I really only have one equation there. I will just write down immediately now what the solution is to the system. Well, the (a1, a2) will be what? Now, it is more natural to make a1 equal 1 and then solve to get an integer for a2. If a1 is 1, then a2 is negative 2.
And I should multiply that by e^(-6t) because negative 6 is the corresponding value. There is my other one. And now there is a superposition principle, which if I get a chance will prove for you at the end of the hour. If not, you will have to do it yourself for homework. Since this is a linear system of equations, once you have two separate solutions, neither a constant multiple of the other, you can multiply each one of these by a constant and it will still be a solution.
You can add them together and that will still be a solution, and that gives the general solution. The general solution is the sum of these two, an arbitrary constant. I am going to change the name since I don't want to confuse it with the c1 I used before, times the first solution which is (2, 1) e to the negative t plus c2, another arbitrary constant, times 1 negative 2 e to the minus 6t.
Now you notice that is exactly the same solution I got before. The only difference is that I have renamed the arbitrary constants. The relationship between them, c1 / 2, I am now calling c1 tilde, and c2 I am calling c2 tilde. If you have an arbitrary constant, it doesn't matter whether you divide it by two. It is still just an arbitrary constant. It covers all values, in other words. Well, I think you will agree that is a different procedure, yet it has only one coincidence. It is like elimination goes this way and comes to the answer.
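The claim that the two general solutions differ only by a renaming of constants can be spot-checked. Below is a minimal Python sketch; `old_solution` is the elimination form with eigenvector-like columns (1, 1/2) and (1, -2), `new_solution` uses (2, 1) and (1, -2), and both names are illustrative.

```python
import math

def old_solution(t, c1, c2):
    """Elimination form: c1 (1, 1/2) e^(-t) + c2 (1, -2) e^(-6t)."""
    e1, e6 = math.exp(-t), math.exp(-6 * t)
    return (c1 * e1 + c2 * e6, (c1 / 2) * e1 - 2 * c2 * e6)

def new_solution(t, k1, k2):
    """Eigenvalue form: k1 (2, 1) e^(-t) + k2 (1, -2) e^(-6t)."""
    e1, e6 = math.exp(-t), math.exp(-6 * t)
    return (2 * k1 * e1 + k2 * e6, k1 * e1 - 2 * k2 * e6)

def same(t, c1, c2, tol=1e-12):
    """Compare the two forms under the renaming k1 = c1 / 2, k2 = c2."""
    xo, yo = old_solution(t, c1, c2)
    xn, yn = new_solution(t, c1 / 2, c2)
    return abs(xo - xn) < tol and abs(yo - yn) < tol

# Sample time and constants chosen arbitrarily.
print(same(0.4, 5.0, -3.0))
```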
And this method goes a completely different route and comes to the answer, except it is not quite like that. They walk like this and then they come within viewing distance of each other to check that both are using the same characteristic equation, and then they again go their separate ways and end up with the same answer. There is something special about these values. You cannot get away from those two values of lambda. Somehow they are really intrinsically connected. They occur as the exponential coefficients, and they are intrinsically connected with the problem of the egg that we started with. Now what I would like to do is very quickly sketch how this method looks when I remove all the numbers from it. In some sense, it becomes a little clearer what is going on. And that will give me a chance to introduce the terminology that you need when you talk about it.
Well, you have notes. Let me try to write it down in general. I will first write it out two-by-two. I am just going to sketch. The system looks like (x, y)' equals, I will still put it up in colors. Except now, instead of using twos and fives, I will use (a, b; c, d) times (x, y). The trial solution will look how? The trial is going to be (a1, a2) times e^(lambda t). That I don't have to change the name of. I am going to substitute in, and the result of substitution is going to be lambda (a1, a2).
I am going to skip a step and pretend that the e^(lambda t) have already been canceled out. Is equal to (a, b; c, d) times (a1, a2). What does that correspond to? That corresponds to the system as I wrote it here. And then we wrote it out in terms of two equations. And what was the resulting thing that we ended up with? Well, you write it out, you move the lambda to the other side. And then the homogeneous system will look in general how? Well, we could write it out. It is going to look like (a - lambda, b; c, d - lambda). That is just how it looks there and the general calculation is the same. Times (a1, a2) is equal to zero.
This is solvable nontrivially. In other words, it has a nontrivial solution if and only if the determinant of coefficients is zero. Let's now write that out, calculate out once and for all what that determinant is. I will write it out here. It is (a - lambda)(d - lambda) - bc = 0. And let's calculate that out.
It is lambda^2 - a lambda - d lambda + ad - bc, where have I seen that before? This equation is the general form using letters of what we calculated using the specific numbers before. Again, I will code it the same way with that color salmon. Now, most of the calculations will be for two-by-two systems.
I advise you, in the strongest possible terms, to remember this equation. You could write down this equation immediately for the matrix. You don't have to go through all this stuff. For God's sakes, don't say let the trial solution be blah, blah, blah. You don't want to do that. I don't want you to repeat the derivation of this every time you go through a particular problem. It is just like in solving second order equations.
You have a second order equation. You immediately write down its characteristic equation, then you factor it, you find its roots and you construct the solution. It takes a minute. The same thing, this takes a minute, too. What is the constant term? ad - bc, what is that? Matrix is (a, b; c, d). ad minus bc is its determinant. This is the determinant of that matrix. I didn't give the matrix a name, did I?
I will now give the matrix a name A. What is this? Well, you are not supposed to know that until now. I will tell you. This is called the trace of A. Put that down in your little books. The abbreviation is tr A, and the word is trace. The trace of a square matrix is the sum of the elements down its main diagonal. If it were a three-by-three there would be three terms in whatever you are up to. Here it is a + d, the sum of the diagonal elements. You can immediately write down this characteristic equation. Let's give it a name. This is the characteristic equation of what? Of the matrix, now. Not of the system, of the matrix.
You have a two-by-two matrix. You could immediately write down its characteristic equation. Watch out for this sign, minus. That is a very common error to leave out the minus sign because that is the way the formula comes out. Its roots. If it is a quadratic equation it will have roots; lambda1, lambda2 for the moment let's assume are real and distinct. For the enrichment of your vocabulary, those are called the eigenvalues.
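Writing the characteristic equation down immediately as lambda^2 - (tr A) lambda + det A = 0 and solving it can be sketched in Python as below. The function name is illustrative, and the sketch deliberately handles only the real and distinct case assumed at this point in the lecture.

```python
import math

def eigenvalues_2x2(A):
    """Roots of lambda^2 - (trace) lambda + (determinant) = 0."""
    (a, b), (c, d) = A
    trace = a + d            # sum of the main diagonal
    det = a * d - b * c      # constant term of the characteristic equation
    disc = trace * trace - 4 * det
    if disc <= 0:
        raise ValueError("repeated or complex eigenvalues; not covered here")
    root = math.sqrt(disc)
    # Note the minus sign on the trace term in the equation: the
    # quadratic formula therefore starts from +trace, not -trace.
    return ((trace + root) / 2, (trace - root) / 2)

print(eigenvalues_2x2([[-2, 2], [2, -5]]))
```

For the egg matrix the trace is -7 and the determinant is 6, so the equation is lambda^2 + 7 lambda + 6 = 0 and the function returns -1 and -6.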
They are something which belonged to the matrix A. They are two secret numbers. You can calculate from the coefficients a, b, and c, and d, but they are not in the coefficients. You cannot look at a matrix and see what its eigenvalues are. You have to calculate something. But they are the most important numbers in the matrix. They are hidden, but they are the things that control how this system behaves. Those are called the eigenvalues. Now, there are various purists, there are a fair number of them in the world who do not like this word because it begins German and ends English.
Eigenvalues were first introduced by a German mathematician, you know, around the time matrices came into being in 1880 or so. A little while after eigenvalues came into being, too. And since all this happened in Germany they were named eigenvalues in German, which begins eigen and ends value. But people who do not like that call them the characteristic values. Unfortunately, it is two words and takes a lot more space to write out.
An older generation even calls them something different, which you are not so likely to see nowadays, but you will in slightly older books. You can also call them the proper values. Characteristic is not a translation of eigen, but proper is, but it means it in a funny sense which has almost disappeared nowadays. It means proper in the sense of belong to. The only example I can think of is the word property. Property is something that belongs to you. That is the use of the word proper. It is something that belongs to the matrix. The matrix has its proper values. It does not mean proper in the sense of fitting and proper or I hope you will behave properly when we go to Aunt Agatha's or something like that.
But, as I say, by far the most popular thing, slowly the word eigenvalue is pretty much taking over the literature. Just because it's just one word, that is a tremendous advantage. Okay. What now is still to be done? Well, there are those vectors to be found. So the very last step would be to solve the system to find the vectors a1 and a2. For each (lambda)i, find the associated vector. The vector, we will call it (alpha)i. That is the a1 and a2. Of course it's going to be indexed. You have to put another subscript on it because there are two of them.
And a1 and a2 is stretched a little too far. By solving the system, and the system will be the system which I will write this way, (a - lambda, b; c, d - lambda). It is just that system that was over there, but I will recopy it, (a1, a2) equals zero, zero. And these are called the eigenvectors. Each of these is called the eigenvector associated with or belonging to, again, in that sense of property. Eigenvector, let's say belonging to, I see that a little more frequently, belonging to lambda i.
So we have the eigenvalues, the eigenvectors and, of course, the people who call them characteristic values also call these guys characteristic vectors. I don't think I have ever seen proper vectors, but that is because I am not old enough. I think that is what they used to be called a long time ago, but not anymore. And then, finally, the general solution will be, by the superposition principle, (x, y) equals the arbitrary constant c1 times the first eigenvector times e to the corresponding eigenvalue times t. And then the same thing for the second one, c2 times (a1, a2), but now the second index will be 2 to indicate that it goes with the eigenvalue lambda2, times e^(lambda2 t). I have done that twice.
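Putting the whole two-by-two procedure together (eigenvalues, then eigenvectors, then superposition) might look like the following Python sketch. All names are illustrative, only the real and distinct case is handled, and the eigenvectors it picks may be constant multiples of the ones chosen by hand.

```python
import math

def eigenpairs_2x2(A):
    """Return [(lambda1, alpha1), (lambda2, alpha2)] for a 2x2 matrix."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc <= 0:
        raise ValueError("this sketch handles only real, distinct eigenvalues")
    root = math.sqrt(disc)
    pairs = []
    for lam in ((trace + root) / 2, (trace - root) / 2):
        if b != 0 or a - lam != 0:
            # First row: (a - lam) a1 + b a2 = 0, solved by (b, -(a - lam)).
            alpha = (b, -(a - lam))
        else:
            # First row is all zeros; use the second row instead.
            alpha = (-(d - lam), c)
        pairs.append((lam, alpha))
    return pairs

def general_solution(A, c1, c2):
    """Return the function t -> (x, y) = c1 alpha1 e^(l1 t) + c2 alpha2 e^(l2 t)."""
    (l1, v1), (l2, v2) = eigenpairs_2x2(A)
    def xy(t):
        e1, e2 = math.exp(l1 * t), math.exp(l2 * t)
        return (c1 * v1[0] * e1 + c2 * v2[0] * e2,
                c1 * v1[1] * e1 + c2 * v2[1] * e2)
    return xy

A = [[-2, 2], [2, -5]]
sol = general_solution(A, 1.0, 1.0)
print(eigenpairs_2x2(A), sol(0.0))
```

Different choices of c1 and c2 give different solutions of the same system; fixing them from initial conditions is a separate step not shown here.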
And now in the remaining five minutes I will do it a third time because it is possible to write this in still a more condensed form. And the advantage of the more condensed form is A, it takes only that much space to write, and B, it applies to systems, not just the two-by-two systems, but to n-by-n systems. The method is exactly the same. Let's write it out as it would apply to n-by-n systems.
The vector I started with is (x, y) and so on, but I will simply abbreviate this, as is done in 18.02, by x with an arrow over it. The matrix A I will abbreviate with A, as I did before with capital A. And then the system looks like x prime is equal to -- x' is what? x' = Ax. That is all there is to it. There is our green system. Now notice in this form I did not even tell you whether this is a two-by-two matrix or an n-by-n. And in this condensed form it will look the same no matter how many equations you have.
Your book deals from the beginning with n-by-n systems. That is, in my view, one of its weaknesses because I don't think most students start with two-by-two. Fortunately, the book double-talks. The theory is n-by-n, but all the examples are two-by-two. So just read the examples. Read the notes instead, which just do two-by-two to start out with. The trial solution is x equals what? An unknown vector alpha times e^(lambda t). Alpha is what we called a1 and a2 before. Plug this into there and cancel the e^(lambda t). What do you get? Well, this is lambda alpha e^(lambda t) = A alpha e^(lambda t).
These two cancel. And the system to be solved, A alpha = lambda alpha. And now the question is how do you solve that system? Well, you can tell if a book is written by a scoundrel or not by how they go -- A book, which is in my opinion completely scoundrel, simply says you subtract one from the other, and without further ado writes A minus lambda, and they tuck a little I in there and write alpha equals zero. Why is the I put in there? Well, this is what you would like to write. What is wrong with this equation?
This is not a valid matrix equation because that is a square n-by-n matrix, a square two-by-two matrix if you like. This is a scalar. You cannot subtract the scalar from a matrix. It is not an operation. To subtract matrices they have to be the same size, the same shape. What is done is you make this a two-by-two matrix. This is a two-by-two matrix with lambdas down the main diagonal and zeros elsewhere.
And the justification is that lambda alpha is the same thing as the lambda I times alpha because I is an identity matrix. Now, in fact, jumping from here to here is not something that would occur to anybody. The way it should occur to you to do this is you do this, you write that, you realize it doesn't work, and then you say to yourself I don't understand what these matrices are all about. I think I'd better write it all out. And then you would write it all out and you would write that equation on the left-hand board there. Oh, now I see what it should look like. I should subtract lambda from the main diagonal. That is the way it will come out. And then say, hey, the way to subtract lambda from the main diagonal is put it in an identity matrix. That will do it for me. In other words, there is a little detour that goes from here to here.
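The detour described here, turning the scalar lambda into the matrix lambda I so that the subtraction makes sense, can be written out for any size of square matrix. This is a minimal Python sketch using plain lists; the helper names are illustrative.

```python
def identity(n):
    """n-by-n identity matrix: ones on the main diagonal, zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def scalar_times_matrix(s, M):
    return [[s * entry for entry in row] for row in M]

def subtract(M, N):
    """Entry-by-entry difference; M and N must have the same shape."""
    return [[m - n for m, n in zip(rm, rn)] for rm, rn in zip(M, N)]

def a_minus_lambda_i(A, lam):
    """A - lam*I: subtracts lam from each main-diagonal entry of A."""
    return subtract(A, scalar_times_matrix(lam, identity(len(A))))

# lambda I is the matrix with lambdas down the diagonal and zeros elsewhere.
print(scalar_times_matrix(3, identity(2)))
# For the egg matrix and lambda = -1 this gives the system found earlier.
print(a_minus_lambda_i([[-2, 2], [2, -5]], -1))
```

The same function works unchanged for a three-by-three matrix, which is the point of the condensed notation.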
And one of the ways I judge books is by how well they explain the passage from this to that. If they don't explain it at all and just write it down, they have never talked to students. They have just written books. Where did we get finally here? The characteristic equation from that, I had forgotten what color. That is in salmon. The characteristic equation, then, is going to be the thing which says that the determinant of that is zero.
That is the circumstances under which it is solvable. In general, this is the way the characteristic equation looks. And its roots, once again, are the eigenvalues. And from then you calculate the corresponding eigenvectors. Okay. Go home and practice. In recitation you will practice on both two-by-two and three-by-three cases, and we will talk more next time.