# Lecture 7: Finite Differences for the Heat Equation

Flash and JavaScript are required for this feature.

The following content is provided by MIT OpenCourseWare under a Creative Commons license. Additional information about our license and MIT OpenCourseWare, in general, is available at ocw.mit.edu.

PROFESSOR: OK. Last time, I started to talk about the heat equation, also known as the diffusion equation. And what we did then was to solve it. Give an analytical solution. Both for any initial value; u of x and zero, and then, specifically for the point source. So then, we're talking about the impulse response, or the fundamental solution, when we start from a delta function. And, it was that Gaussian bell shaped curve. Fantastic fundamental solution.

But the formulas you get involving infinite integrals are not the greatest for actually getting numbers for the whole temperature history. u of tnx. We have to go to finite differences. And, of course, we need finite differences also in more general cases, where non linear terms could appear. Or even just variation in x and t in the linear co-efficient, which I had just set to be 1.

So, this is the first job. Difference methods for the heat equation. And you'll see that we get pushed toward implicit methods. Because explicit method will require delta t to be that very small sized delta x squared, and that's pretty slow going. And of course, what I'm saying applies equally to-- we might be in 2D or in 3D diffusion of pollution, for example, in environmental engineering. Our fundamental solution from the delta function would easily extend to two dimensions or three. And, our difference methods will extend, but we'll see what comes up.

And then, let me point to this second problem which combines the two. We already know about one way waves coming from conviction term. Now we're getting the basic ideas for that the diffusion term and lot of very important, real problems combine the two. And then you have a situation that I'm just sort of pointing to here, that coefficient, as before, has the units of velocity. Distance over time. You see, if I bring time up here, I have time over distance there. So, I need distance over time from c it's what we've seen always. And, then the c delta t over delta x the quantity r was dimension.

So, I'm taking three minutes to speak about dimensions. Dimensional analysis is a very simple but useful thing to do at the beginning of a problem. So here we had r equals c delta t over delta x. Which is dimensionless. And, typically for explicit methods, it had to be bounded by 1, for example.

Now for the diffusion term d. What are the units of d? Well, here now I have distance squared in the denominator, so d needs to be a distance squared over time. So it has different dimensions, and the quantity capital R that will meet parallel to this, will be d delta t over delta x squared. And, again it's dimensionless. Because if d has dimensions of distance squared over time, that reverses it to give dimensional-less quantity, and we'll see that explicit methods require a bound on capital R, and that's what so forces us a into that small time step.

OK. So you'll see capital R playing a similar role to little r. And actually while we're talking about dimensional analysis, back to the differential equation, suppose we want, as we certainly do, to compare the importance of the conviction term and the diffusion term. Well that comparison is somehow the ratio of c to d, So, somehow, it's natural to think of the ratio of c:d as telling us a very important fact. Is our problem more like a wave problem? Do we expect to be on the unit circle, or close? With a term that in free a space is pure imaginary, from ik, or, if d is big, we have then a small ratio, and that tells us that the diffusion term is important, that we've got lots of this dissipation That the stability, at least in the differential equation, is much greater, because this is producing a minus k squared in free a space, and a strongly negative term.

So it's that ratio c:d but of course, my point here in the dimensional stuff is that c:d isn't quite OK as it is, because they have different units. And we need one more distant. So we need the ratio of c -- where do I need an l? I need another l up there. I need another distance there to get something that is altogether dimensional-less. And so that's the critical number. And, it's a known as the Peclet Number Pe, I'll call it.

What is l? We're talking here only about the differential equation. l is characteristic length. If our problem is set on a finite interval, then it's probably the length of the interval. It's a characteristic length of the problem.

And, then this, quantity cl over d gives us a good measure of the importance of these two. And, we'll see in the finite difference case, there will be cell peclet number which is entirely different. A peclet number for the little finite different cell, where this l is changed to like a delta x. OK. I think we've got it there.

Have I got the numerator and denominator correct there, or should they be reversed? Is it time over distance? It is, isn't it? Time over distance. That shows that I'm not a dimensional analyst. Let me fix it. And, then that will mean fixing this. So, let me just fix that. It doesn't sound right as I said it, but I read wrong. OK. So c is a time over a distance. Thanks. Is that right? It was right the first time? In that case, we'll move on. OK. All right. OK. Good. Thanks. Yeah. You're right. OK. Sorry that's on the tapes. OK.

So now let's begin with a natural explicit method for the heat equation. What everybody would think of. So, this will be explicit for the heat equations, and I'm in 1d. And, what do you think of naturally, forward difference in time. So, can I write it this way? Time difference. I'll use capital U, as always for the finite difference solution, divided by delta x, and I'm doing the heat equation with c equal to 1. Just to keep things simple, let's just do the heat equation normalized. So, here I'm going to take a second difference in the x direction of u divided by delta x squared. OK. What's the accuracy? Well, we know the accuracy of each term, the center difference, the error is the discreditization error, the local error, the truncation error, it's also called, will be proportional to -- oh I'm sorry, that's delta t of course. OK. So, the discreditization error in a forward difference, when I divide by delta t -- so it's first order in t, and standard difference because of the symmetry is second order in X. OK.

And, this would be another problem where I could have begun with semi- discreditization. I could have begun with keeping this as a derivattives du to t equal this center difference, and the discussion would have been entirely parallel to what we'll have here. So, I've jump right to the finite difference in x and t. So in other words, what's happening here? un plus 1, u at that center point, n plus 1, of course, we're our little molecule is computing this backward different, and this or rather forward in time, so this is the time direction. n plus 1, and this center difference in space. So if I just write out what that does, it's ujn coming from the time difference. Then, I'll multiply by delta t, and I have a delta x squared, so that ratio is what I'm calling capital R. Capital R, to distinguish the diffusion ratio from the wave ratio. Times what? So it's just going to be uj plus 1 at time n minus 2uj at time n, plus uj minus 1 at time n.

We're good at writing that expression down. We can follow either the ikx, of course. That's always the idea. Start with uj zero as e to ikj delta x, and follow it up. We're looking for growth factor g. The growth factor in a single step, and, of course, the point is, it depends on k.

At at every step, the exponential is always with us. It's the factor that multiplies that exponential that we want, and you see what it is, don't you? Can you see what g is here? So, if I put in that, and then ultimately factor out the exponential, it'll be multiplied by a 1, from that part, and an r from this, and now in parentheses, we have now very familiar quantity that comes from either the ik delta x, either the minus ik delta x. Those combine to give to cosigned k delta x, and the center one is minus 2. OK. That's our growth factor.

And the question is, does it stay below 1, which is the stability requirement for all k delta x. And, so you can see the dangerous frequency as often the dangerous frequency is when k delta x is pi, and that cosign becomes minus 1. And, then we have so stability. So stability at k delta x equal pi needs -- so this becomes 1 and this is negative 2, and that's negative 2. We have 1 minus 4r, and that quantity has to be, in absolute value, not exceeding 1. Well, there's no danger that it's going to go above 1. The danger is will it go below minus 1, and it's not allowed to. So we need it to be greater than or equal to minus 1. And, now, if I put the minus 1 on this side and the 4r on that side, that means 2 is greater or equal to 4r. And, I get the famous condition r less or equal to half. That's the stability requirement.

And if there was a coefficient d in heat equation, the coefficient would be there in r. OK. So, that's the stability condition, and that's the thing we would like to find a way around, because, it says that this delta t -- I'll just put that in, less or equal a half, and that means a very small delta t. OK.

What's the way around it? To go to an explicit method. So, let me go first to a completely explicit method. Fully explicit. I'm sorry. To go to an implicit method. Now, implict method -- now I'm going to do -- switch to an implicit method by computing this difference at time and plus 1. So this is now and plus 1. And, what's the change in g? A big change. OK. So now I want to recompute g for the growth factor over a single step for the exponential. When the right hand side, the second difference is taken at the new time. So this has to move over to the other side of the equation. That makes the stability better, but, of course, it's going to make the computation more difficult. Let me just think about the stability for now.

So, instead of this board, this is the explicit case. Yeah.

AUDIENCE: [INAUDIBLE]

PROFESSOR: So it has to be, let's draw a little picture here. So, there's zero. Here's 1. Here's minus 1. And this quantity is starting at 1, and subtracting 4r. So, the question is, does it stop here? If it stops there, then in this picture, these would both be negative. This would be about minus a half, but that is greater than minus 1. So this would be the OK case when this is true. And, that's what required r less or equal to half. So in this case, yeah. In that case, r is less than but if r equal to half -- everybody sees this -- if r equaled a half, this would be 1 minus 2. This would bring us all the way here. And if r is bigger than a half, then that would reverse that inequality, and we would be unstable. Yeah. I think it's OK. Thanks. It's the difference between looking at the magnitude --if I put magnitude, then I would need less or equal or just realizing that it's just taking it as a number. 1 minus 4. OK. I think that's all right.

Now the implicit case. What happened to g? What happened to g? Let me put it here. Implicit. So, what's changed? This term, which came from the second difference is moved over to the n plus 1 side of the equation. So, we have in Fourier space, we would have 1 from the ujn plus 1, and then minus 2r, and times our expression 2 cosign k delta x minus 2 that's what multiplies e to the i-- do you see that that's showing up on the new side of the equation, multiplying either the ik delta x, and on the old side, the old time step, we only have the 1. In other words, the growth factor is, this goes down into the denominator, and we can see what's going on. So the growth factor is 1 over this expression. 1 minus 2r times 2 cosign k delta x minus 2. We keep running into that quantity from the second difference. OK. So if I look at that, what's going on?

The key point of this quantity is that it's always negative, or zero, but never positive. Because this term is never as large as this one. So it's negative. And now, I have another minus sign here, so I have 1 divided by 1 plus something. And always, for all k and all ratios r it's OK. Right? 1 over 1 plus something positive. And you see that everything's real. That's because we're doing the diffusion equation, where the wave equation took us, we had some imaginary part. Absolutely stable. Stable for all delta t. I'll say even for large delta t. That's good. Of course, we're not going to take delta t equal to 1000. Even if stability allowed us, the accuracy would be hopeless. It'd be terrible. I can't take giant time step and expect to follow the true exponentials for either the ikx.

But, I can take a larger step than this very small delta one half delta x squared step. So this good. What's the price that we pay? The price is that at every time step, we have to solve. This is now on the other side, so this brings a whole matrix over. So, this implicit case -- when I bring this second difference matrix over, I've used capital K because the second difference matrix there probably the most interesting difference matrix, I'll say. One of the most interesting matrixes period. It has the minus 2s on the diagonal and 1s above the diagonal. So, actually, with signs exchanged, I called it k. So can I just remember the meaning of this matrix k. I'm going to stay with that letter because it's a standard in finite elements for the stiffness matrix. So, k by definition, is the matrix with the 2s on the diagonal, minus 1s above the diagonal and minus 1s below the diagonal.

And, what size is it? On the whole line, where we're working for simplicity, I guess k is infinite in both directions. 2s and minus 1s. On an interval where we would really compute, k would be a finite matrix. But, the main point is that this matrix k is minus the second difference matrix. So that when I bring it over to this side which changes that plus to a minus, I get this equation to solve. The identity, coming from the 1, plus r, the ratio, times is this matrix k, multiplies this vector of all the values at the new time step, and equals, in this case, all the values at the old time step. Because, at the old time step, I just have the identity. Are you OK with this?

This is the problem that we actually have to solve. I mean the main point about it is, that we have a linear system. The u's at the new time step are coupled. Coupled by the second difference. Coupled by the matrix k, so we have to deal with it. Well actually, in one dimension, no problem. Because in one dimension, k is just to tri-diagonal matrix, and we can solve tri-diagonal systems almost as fast as we could work with the identity matrix. And actually, what we're seeing this ratio are exactly the eigen values of the inverse matrix. So, if I use a formula, I would say un plus 1 is the inverse matrix times un. And, the growth factor is just the eigen values. The growth factor g is simply the eigen values of this inverse matrix. And, notice again, that all the eigen values of k are greater or equal zero. It was a positive definite matrix, because I switched sign. And, the effect of the identity is to push it a little more positive. Here. It's making -- this is a positive number there. That's the eigen values of rk, and then, I making it a little more positive with this 1, so we have a problem that is safely invertible, stable, and in one dimension, quite simple. So in one dimension we're entirely ready to go implicit.

Let me just, while I'm thinking about it, recall that we had some other matrix in the finite matrix case, we had some other matrixes in the very first section of the original notes, which had different boundary conditions. In other words, if we're on a finite interval, then this matrix k corresponds to zero boundary conditions. Zero temperature at the end. But we might have other boundary conditions, like the derivative of the temperature is zero at one end, or the other end. That would change this matrix in the first row and the last row to another nice matrix. So, we might have not k but one of the other tri-diagonal matrixes.

But, here's the real point. What happens in two dimensions? Suppose I include now a uyy term in the heat equation, or three dimensions with uzz term. What's different? We can follow through the stability test for the explicit method, and I'll have a term from delta x squared, and a term from delta y squared, and a delta z squared. It'll even slightly tighter on delta t. Or I can follow through the implicit method where now this second difference in the x direction appears also with a second difference in the y direction and a second different in the z direction.

So what's happening? From the matrix [? theory ?] point of view, I still have nice matrixes with 2s and minus 1s or maybe with -- I guess what we would, -- can we just remember what we would have in a 2 dimensional case. In the two dimensional case, instead of this second difference with a minus 1 a 2 and a minus 1 in the x direction, we also have a second difference coming from the uyy in the y direction so another minus 1 and that 2 moves up to a 4. So our matrix in the 2d 2 space dimension case, will have a 4s on the diagonal. So it grows from this one. It's called the tensor product coming out of this. Anyway it produces this famous five point molecule with 4s on the diagonal, and 4 minus 1s going down other diagonal. And the problem is that those diagonal are not -- 2 of the diagonals might be right next to the main diagonal, but the other two will be farther away. Because we can't number the nodes to keep all these 4 numbered adjacently at all the mesh points. In other words, we have a serious matrix problem. That's my comment.

We have a matrix k in 2d. So what I'm speaking about is this problem. Identity plus r k2d. u at the new time equal u at the old time. And k 2d is by no means so easy to deal with as k1d. It doesn't have a nice narrow band like this. The band stretches further, and we'll see in spades in then next part of the course which is about solving large linear systems. So we have a large system because we've got lots of mesh points. We have matrix that has a wider band, and so you need new ideas in solving it. So this problem is what will be a model -- is the connection, really, between this section of the course, difference methods for initial value problems, and the next section which is solving large linear systems. Because this will be a typical model larger linear system.

So that's a comment of that for implicit equations, the price was low in 1d, but the price is not so low in 2d. OK.

Now, have we got any other candidates between explicit and implicit? Well, we did way back for ordinary differential equations. We had a trapezoidal rule. And what did that do? That increased the accuracy by centering it at time and plus a half. Do you remember that? On the right hand side, we used half the explicit difference and half of the implicit different, so natural to do it again. It worked then, and it'll work now.

So this will be the trapezoidal method. But most people in the diffusion application name it after these authors, Crank and Nicelson. So it's the Crank Nicelson method. It's half the explicit part plus half the implicit part. And the order of accuracy is increased by centering it this way. That's the point of course. So I guess the point will be we don't have any extra work compared to a fully implicit one, and we get an extra order of accuracy by centering it at a half. So ujn plus 1 minus ujn over delta t is r. Oh sorry. We've got delta t there. So will be the second difference at the old time times a half. It would just take half of that explicit part plus half of the implicit part. This is a time and plus one over delta x squared. You would think of that. Everybody would. It's a just a natural idea. It increases the accuracy. It's still implicit. We still have to bring this term over on to this side when ujn moves over on to that side. And we can look at look at the growth factor. So that's what we have to do. So bring that term over so and bring delta t up, so its delta t, divided by delta x squared and r. OK. So I've got this term. The 1. And when I bring that term over, it's a half of what we brought over before, which was the r 2 cosign k delta x minus the 2. That multiplies g, and on the right side, we have the 1 from the ujn, plus the half, times the r times 2 cosigned k delta x minus 2. I'm certainly getting to the point where I write these expressions down faster without explicitly writing either the ik delta x and then canceling it.

So, g is a ratio again. In the explicit case g was all in the numerator. There was no denominator. In the fully implicit case, g was all denominator, with just a one in the numerator. Now we have g as the ratio. So I'm going to bring this down below. And, g is a ratio. So it's a fraction because it's got an explicit part in the numerator and an implicit part in the denominator. And, of course, the main question is when is g less or equal to one?

Well. Actually it's cool because g is simply -- look, the numerator is one, and all this stuff is negative. So this is one minus something. And, what's going down here? The sign's just reverse. It's one plus something. And this is a positive quantity, and this is the same positive quantity. So, this is one minus, do you see this, because this is negative, I'll write it as 1 minus something -- something positive. And here, this is minus and this is negative, so this is 1 plus the same thing. So what's the answer?

[UNINTELLIGLBEPHRASE]

One minus a positive number over 1 plus that positive number can't go wrong. So, for all r. OK. In a way, that has to be more attractive than fully implicit, because it didn't take more work, it wasn't any less stable, and it was more accurate. So Crank Nicelson is sort of a natural idea at this level of analysis. OK. So those are the three methods to think about. Of course actual computations will bring in new questions. Boundary conditions, non linealarities, and everything. I'm just staying at this simple level, where I'm comparing the first ideas that would occur to us. OK. We may want higher accuracy, all sorts of things. But, we're seeing the contrast with the wave equation case. But, here we move more toward the implicit side, and the reward is unconditional stability. No condition on r, but the price is more computation.

OK. Well at that same level of thinking about the most natural methods that occur to you, I want to speak now about the big issue of convection diffusion. Suppose both types of problem are with us. So convention diffusion. What would you do then?

Now I have both convection and diffusion in the equation ut equals cux, plus duxx. And I guess I want to say that this is something that the world has not come to a decision on. The computational world is still debating what to do with conviction diffusion. So it's a very appropriate equation model to experiment with. Maybe I'll ask you, as the next kind of informal homework, to begin to experiment with that. And when I say experiment, you can think immediately of the methods you might think of. The first methods to try. So the first method of course, will be explicit. Fully explicit. What do you expect to have happen there? Let's just see how the two different terms each give their little push towards the stability requirement.

So explicit is going to be the time difference, forward time difference equals c times-- what shall I chose first for the space difference? Up wind, maybe? up wind in space? So, this is going to be explicit and up wind. Of course, everybody knows that refers to the windy term in the equation. The wave term. OK. So, of course, I'll get a forward difference in the x direction of u at time n divided delta x, and then I'm going to do the natural one. A centered second different at time n for the diffusion term. OK. We'll see what comes out of that. It's going to involve not only the ratio, it will involve both ratios. Naturally. Little r, as soon as I multiply up by delta x, I have a little r there. My wave equation ratio, and I have a d delta t over delta x squared, which I'm calling capital R. That's the diffusion ratio, and multiplied by all sorts of stuff.

So actually, why don't I write down what the equation really is then? Explicitly, its ujn plus 1 is ujn so maybe I should collect all the terms here. Can I collect the terms that multiply uj plus 1n? So what's going to multiply uj plus 1n? Here I have a c delta t over delta x. I have an r from this term. And here I have a d. Right? No. Sorry. I have a capital R. Isn't that right?

When I multiply through by delta t, I have d delta t, over delta x squared. I have a capital R that's going to multiply the -- does that look right? And then what multiplies the center guy, the ujn? Well that's where the time difference comes over. There's the usual 1. Then from this difference, this upwind difference, there's a minus at the center term, because it's a forward difference. So that will be a minus r, and from this center different, well remember, that that's a second difference. So, there's a coefficient 2 in the middle term, or minus 2, and when I bring the delta t up, it's minus is 2 capital R. And then, the third one, whatever this last coefficient is, is multiplying ujn minus 1, and now where does that come? It doesn't come from here? It doesn't come from the upwind difference. I guess it only comes from there. And it's probably just r. And I guess I would check any formula like that, to be sure that the coefficients add to 1. And they do.

And why should they? Why should those coefficients add to 1? So that's my little test that I've got it right. They should add to 1, because a constant initial if uj zero was a constant, if I'm starting from a constant function, it's x derivatives would all be zero. If I had u equal constant, then the x derivatives would be zeros, so the time derivatives would be zero so that constant would stay forever. And so I would want to get the same constant out from constants going in. And so the coefficient should add to one.

Well. I can see a situation when I know stability will be OK. Stability can't fail if those three coefficients are all positive. If those three coefficients are all positive, then my growth factor can't be larger than 1. Those coefficients, they add to 1. So the growth factor is this times -- should I write it down? The growth factor will be this -- so this will be the growth factor --g will be that number times either the ik delta x, from shifting over one. It'll be this number times 1, and it'll be this number times either the minus ik delta x, and my point is just that the easiest estimate would say, well this has magnitude 1, that has magnitude 1, that has magnitude 1. If I just add up magnitudes, I'll get this one, and this one and this one. And if they're all positive, they add to 1, I'm safe. So what's the stability condition that this simple argument requires? Well. That's positive. That's positive. The danger is this one. So I guess I would like that to be positive. So I would want -- I'm certainly stable if this is positive, which I might as well write as r plus 2 big R being not larger than 1. Because if that's true, I'm taking a combination with positive coefficients adding to 1, I can't go wrong. OK. And in fact, that's pretty familiar. It combines the two conditions we know. The condition that r should be less or equal 1, small r, the wave Courant-Friedrich-Lewy condition. And the condition that capital R should be less or equal to half the diffusion condition. Now they're both. Now I'm forcing them even more by saying that the sum has to be less or 1.

OK. So let me leave the explicit up wind at this point, and ask you what other method might you want to try? Well it would be natural to think of explicit centered, where are you would center this and we could follow that one through. I guess I'll do that next time very quickly. On the notes we'll do it. And I'm not sure which of those two ways is better.

Or, you're going to say go implicit. And now I'm going to ask, all right but what part of it do you want me to make implicit? Maybe all I need to make implicit is the heat equation part. And I can keep this explicit perhaps. And I could even think of splitting the step. This is something we haven't done. I could think of taking a wave step and then a diffusion step. Whatever. Putting them together into one step where the diffusion step is changed to implicit because of the great stability that provide. OK.

I think you're seeing the reality of scientific computation even at this fundamental model level of a model with two quite different physical affects of motion with finite speed, energy preserving, magnitude of g, tending to be right at 1 and diffusion at infinite speed, giving us growth factor that tends to be real and below 1, but the price is it pushes us toward implicit methods. I think that's this one choice that I've worked through, and the others that I've mention, present a real problem. I'll say another word about this topic next time and then we'll pretty soon move to a non linear equations and more realistic, more difficult applications. OK. Thanks.