# Lecture 24: Error Estimates / Projections


INTRODUCTION: The following content is provided by MIT OpenCourseWare under a Creative Commons license. Additional information about our license and MIT OpenCourseWare in general is available at OCW.MIT.edu.

PROFESSOR: OK. Ready for today's discussion, which I'm quite happy about. I hadn't really seen this coming. First, we haven't said anything about how you estimate the error in a problem. We're often taking a continuous problem, typically coming from differential equations, discretizing it, and solving the discrete problem. And we want some idea of the error. It's not just some math question of course, because in engineering design or analysis an idea of what the error is is critical. And secondly, it gives us an idea of where we can improve it. If we see what's controlling the error then we know what's going on. So I'm speaking now about steady state problems. We discussed error for initial value problems, and we realized there that the error depended usually on finite differences. So the error we could figure out locally from the finite difference that we chose. We chose second order accurate differences frequently, but we could've moved up to fourth. And then the other ingredient you need is stability, to know that those local errors don't explode as time goes forward. OK. So we had something to say about error and accuracy at that time about that topic.

And now I'm realizing that many of our other topics fall in this category. And I put three of the topics here. They share the idea that the underlying discretization is a projection. It's somehow a projection between a true solution and the discrete computed solution. Somehow the discrete computed solution is in some finite dimensional family where we can compute, and we're projecting into that family. And here are three examples. So you'll see that actually what I'm talking about is central to the whole semester.

Well, finite elements actually appeared more in 18.085. So I'll have to recap a little bit there. But they would be the natural tool for solving that list of continuous problems from the calculus of variations that I talked about last time, minimizing some energy, where the energy is typically an integral and we're in the continuous case. And what's the finite element central idea? It is: choose these guys, choose some basis functions, and look at their combinations. Maybe choose n basis functions. I guess I better get a chalk or it's going to be a short lecture. OK. So we have a continuous problem, an ordinary differential equation or a partial differential equation, with boundary conditions as always, that we want to make discrete. And the point is I am not making it discrete by finite differences. This is a different route from finite differences.

And in Laplace's equation and many, many problems this is the preferred route. And finite elements are a particular choice of these guys. The Galerkin idea, so I'll use his name for the overall idea, is choose these, and look for the best combination, and we have to say what best means. Well, best is going to mean, in our minimization problems, the combination that gives the minimum. The combination of these n phi functions. Now the exact solution is not going to be in that little n dimensional space. Those functions for finite elements might be piecewise linear, or if you want to upgrade they might be piecewise parabolas, piecewise cubics, whatever. It was the brilliant finite element idea to choose simple functions. And then you could take n quite large, but you're still not getting the exact solution of course, and the job is to estimate how far off you are. But then also in multigrid, what was the idea in multigrid? We started with a system at level h. We started with some problem A_h u_h equal f_h, which was a big system.

If we did ordinary Jacobi or Gauss-Seidel, that was too slow. The multigrid idea was project, there's that magic word project, onto a coarse grid, where the problem is smaller. In finite elements we started with a continuous problem, and we got it down to size n. Here we start with a discrete problem and we cut its size to 1/2 or 1/4 or 1/8 again. So projection is producing a smaller problem, and the question is what are you losing? And actually conjugate gradients too. You remember these spaces? The spaces spanned by the combinations of b, Ab, A squared b and so on. That was the computationally convenient subspace to project onto. And so we've got important examples.

Now what I realized, and I'm pleased about this, is that they all fit. So here's my problem: what's the error? I'm going to use u star for the correct answer. This is the true solution. And I'm going to use capital U star for the approximate one, the one that we get by any of those key ideas in numerical analysis. And I'm trying to estimate the difference. OK. And I just put up here that I'm dealing with positive definite problems. In fact, the symmetric positive definite matrix K often has this A transpose C A form. Or in the continuous case, the A transpose might be minus the derivative, the C might be a variable or constant physical coefficient, and A would be d by dx. That would be the K thing. The equation I want to get is Ku equal f. That's the strong form. The strong form will be Ku equal f.

But the whole point of last lecture and this one is that we don't get to the strong form. This projection starts with a minimum problem, and gets to a weak form. And it's the minimum problem or the weak form that we want to think about not the strong form.

OK. Now. Up there I wrote an identity. If you just multiply that right hand side out, I believe it comes out right. And it's very valuable for our purposes here. OK. So what's on the left side? This is the quantity that we're minimizing. And by just manipulation, we wrote it this way. So now I can identify what u star is. I'm going to minimize this over all u. That certainly looks like a discrete problem, right? So think of u as a vector, f as a vector, K as a positive definite symmetric matrix. Just think about that. But I want it to apply very much to the continuous case too, to differential equations. But here's a point then: what's the winner? You can't immediately see, well maybe you can somehow see, that when I set the derivative of this thing to 0, I get that equation. You can kind of believe it, even if it's vectors and matrices.
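The identity on the board is not captured in the transcript. A reconstruction consistent with the energy $\tfrac12 u^T K u - u^T f$ and with the completed square described later in the lecture would read as follows (the exact form written on the board is an assumption):

```latex
\tfrac12\, u^T K u \;-\; u^T f
\;=\;
\tfrac12\,\bigl(u - K^{-1} f\bigr)^T K \,\bigl(u - K^{-1} f\bigr)
\;-\; \tfrac12\, f^T K^{-1} f
```

Multiplying out the right side and using the symmetry of $K$ gives back the left side; the last term does not depend on $u$.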

But here you can see right away how to make this small. This is a constant here. That's just a constant, so it doesn't depend on u. So what choice of u makes that term small? Well of course the choice of u is the one that makes this thing 0, because that's a positive definite matrix. No way is anything going to get negative here. This is a something transpose K something. You remember what positive definite means. It means that x transpose Kx, for every vector x, is never negative. So the best we could do would be to bring this to 0, and of course to bring it to 0 is to bring the x to 0, so this thing should be 0. So u minus K inverse f should be 0. And of course that leads us back to the same conclusion that we reached from the strong form. OK. Good.
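A small numerical check can make the completed-square argument concrete. The sketch below is only an illustration: the matrix `K`, the vector `f`, and the helper names `P` and `P_completed` are invented here, with `K` built to be symmetric positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
K = A.T @ A + n * np.eye(n)      # symmetric positive definite by construction
f = rng.standard_normal(n)

def P(u):
    """Quadratic energy 1/2 u^T K u - u^T f."""
    return 0.5 * u @ K @ u - u @ f

u_star = np.linalg.solve(K, f)   # strong form: K u = f
constant = -0.5 * f @ u_star     # -1/2 f^T K^{-1} f, independent of u

def P_completed(u):
    """Completed square 1/2 (u - u*)^T K (u - u*) plus the constant."""
    e = u - u_star
    return 0.5 * e @ K @ e + constant

u = rng.standard_normal(n)       # an arbitrary competitor
identity_gap = abs(P(u) - P_completed(u))
print(identity_gap, P(u) - P(u_star))
```

The first printed number should be at rounding level, and the second should be nonnegative, since the only part of the energy that depends on `u` is the positive definite square.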

So that's an identity, but now I want to use this identity to answer the question: what if I minimize only on a subspace? That's the problem for today. So that's a question. Maybe I'll write it on this fresh board. Now I minimize only over some subspace. Now can I use capital U? I used little u in here, allowing it to be any vector, and I found a winner. And let me give the winner a name, u star. It's a real headache in this subject, just the notation. What should we call the winner? Sometimes I call it u hat, that was a familiar notation in the estimation theory of least squares. Today I'm going to call it u star. So u star is the winner, small u star.

Let me take time out, one minute time out. I mean, optimization is about the problem of minimizing some function F of x. I'm just going to take one minute on the problems of an author. What do you call the winning function, the winning vector, or the winning x? I mean it could be just a scalar, we could be doing just calculus. What do I call the x that gives the minimum? You may have a favorite. You can't call it x, right? Because that's just confusing it with the variable x. So I'm saying, well, you could call it x hat, you could call it x star, you could call it capital X. So I'll just write a few of those down: x star would be a possibility, x hat would be a possibility, x minimizing would be a possibility. I'm doing this just because I want to write down the thing that you often see, which is argmin of F of x. And I write that down just so if you ever see it, you know what the heck it means. It has the same meaning as any of these. I am not a big fan of that. But what does this mean? It means the argument that gives the minimum of F of x, right. Argument is a fancy word for the variable in the function. So this is the argument that gives the minimum of F of x, and that's what we're looking to name. But I'm sure not happy about writing that name. So here it goes, it disappears. But you'll see it, and now you know what it means. OK.

So this is now the central issue here. I want to minimize my same guy, 1/2 u transpose Ku minus u transpose f, or, which is exactly the same, that same right hand side, because the two are equal. I want to minimize over some capital U's, not all u's. If I minimize over all u's, then I get the exact answer. But the idea of all these numerical methods is to minimize over some finite dimensional, smaller dimensional, subspace of trial functions. I'll say minimize over U in the trial space, just to write it out in words. OK. Well now I guess my name for the winner in the trial space is going to be capital U star. So the winner in the trial space will be U star. So that's my finite element solution, my conjugate gradient solution, my multigrid solution.

All these problems are reducing to a smaller trial space, and picking the winner there, computing the winner there. And then the question of today is how far apart are the two? OK. And now there's this little formula that gives us a handle on that. So now can I look at this formula? Maybe I'll copy the formula here. This is the minimum over all these trial guys of 1/2, oh but I'm going to write it that way: 1/2 of U, that's my trial guy, minus K inverse f, and that's my little u star now. Am I OK to give that name to K inverse f? Yeah, I already gave it I guess. Little u star is K inverse f. So that's cool. It's shorter and better. So 1/2 of U minus u star, transposed, times K, times U minus u star, plus a constant. And I can forget this constant part. That's not important to us. So this simple identity has expressed our key discretization approach here as finding the U, and we're going to call it U star, that's nearest to u star. That's the great fact that makes the whole subject pleasant. OK. So the winner U star is then, since I'm minimizing that expression, certainly the nearest in the K norm. It's the nearest to u star, weighted by K. Somehow that K is important. That K is reflecting the problem we're solving. That K is the energy or whatever. OK. That's the great conclusion. You could say that's the fundamental theorem of this projection, error estimates and stuff. So do you see that we got to that? The thing that minimizes makes this as small as we can. So nearest is simply a translation of what that says. Pick the capital U that's closest to u star. OK.
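The "nearest in the K norm" statement can be checked in a small matrix setting. This is a hedged sketch rather than any of the three methods named in the lecture: the trial space is the column space of an arbitrary random matrix `V`, and all names (`V`, `k_norm_sq`, `others`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3
A = rng.standard_normal((n, n))
K = A.T @ A + n * np.eye(n)            # symmetric positive definite
f = rng.standard_normal(n)
u_star = np.linalg.solve(K, f)         # exact winner over all of R^n

V = rng.standard_normal((n, m))        # columns span a hypothetical trial space

# Minimizing 1/2 U^T K U - U^T f over U = V c gives the small system
# (V^T K V) c = V^T f.
c = np.linalg.solve(V.T @ K @ V, V.T @ f)
U_star = V @ c                         # winner inside the trial space

def k_norm_sq(e):
    """Squared K norm e^T K e."""
    return e @ K @ e

best = k_norm_sq(U_star - u_star)
# Compare with 100 other members of the trial space.
others = [k_norm_sq(V @ (c + rng.standard_normal(m)) - u_star)
          for _ in range(100)]
# The error U* - u* is K-orthogonal to every trial vector.
orth = V.T @ K @ (U_star - u_star)
print(best, min(others), np.max(np.abs(orth)))
```

None of the randomly perturbed trial vectors should come closer to `u_star` in the K norm than `U_star` does, and `orth` should vanish up to rounding, which is the projection property the lecture is describing.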

Now how does that help? That helps because I want to estimate the difference. So now I estimate the difference U star minus u star. I'm trying to get a handle on how different those are. So let me take the case of, say, Laplace's equation or the 1D case. Again I'm speaking of a steady state boundary value problem. So capital U star is the best combination of these trial functions, and little u star is the exact solution. Let me draw a picture. So suppose I'm in 1D, 0 to 1. OK. Yeah, let me pick that model problem again. So minus the derivative of c du/dx is f of x, with boundary conditions. And I guess if I'm consistent now I should write u star, because it's the winning solution. And suppose the boundary conditions are 0 at both ends, and it does something in between. OK. So that's u star. So that's the continuous case, and I'm drawing a picture to represent what the solution might look like. OK.

Now what about this finite element stuff? Suppose the phi's are linear pieces. I'm going to do the linear finite element method. Then any combination of these linear guys is going to be piecewise linear, and there's going to be a mesh. You know this setup. It might look like that, and have a value there. It might have a value there, might have a value there, there, there, and there. OK. And let's suppose this is the winner, capital U star. It doesn't look like a winner to me, because I think it could probably do better. But remember that we're measuring in the K norm. I'm measuring the difference between these two not pointwise, which would of course be pleasant. We could say, OK, the distance is just, you know, maybe the maximum distance or the mean square error. That would be quite pleasant. But here the measure of distance involves this K, which comes with the problem. So by distance here I mean the distance between U star and u star. Can I write it with a capital K there to indicate that's the K norm? That's the norm in which this is small, as small as can be made. And what is the K norm for this particular problem? Well it's the integral, and it involves the c, and it involves U star prime minus u star prime, squared, dx, integrated from 0 to 1.
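Written out for the model problem $-(c\,u')' = f$ on $[0,1]$, the spoken formula corresponds to the following (squared) energy norm of the error. This is a transcription of the words above into symbols, not a formula taken from the board:

```latex
\| U^* - u^* \|_K^2 \;=\; \int_0^1 c(x)\,\Bigl( (U^*)'(x) - (u^*)'(x) \Bigr)^2 \, dx
```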

I'm just picking this example problem so that you get some idea how we're measuring the error. It's the natural measure for the error. It's the energy measure. We're measuring error in energy. This is an energy expression. This thing, you know, represents some kind of elastic bar or something, so we're measuring the internal energy here. And notice in particular, and maybe the most important point is not the c of x, which just comes along for the ride: our measure of the error is in the derivative. We're measuring error in slopes, because those are the stresses in the bar, and that's where energy comes from, internal strain energy. OK. So in other words, I have this function u star, this curved guy, and I have this function, which I'm taking to be the winner. And again it's the winner in the sense that it minimizes this. This is the minimum. This is the expression that we have some handle on, because we know that capital U star will make that as small as it can. It does not make small the pointwise error. It might accidentally, and we hope it does of course, get the pointwise error right. But what it is constructed to get right is the energy error. Make that small. OK.

So now the question is how do we estimate the difference? OK. So here's the key point, how do we estimate the difference. Again we're looking at the error. By the way I should have, maybe I did, put a square there. That was the norm squared. You realize why I don't want a big square root sign, it's just clumsy. So I'm looking at the square there. OK. So now how do you estimate the thing, knowing that this piecewise linear function that came out of some finite element calculation or some giant code is the best possible? Well here's the idea. Look at a convenient candidate that might not be the winner. Let me put it this way: because this was as small as possible, the distance from capital U star to u star is less than or equal to the distance from U to u star, always in the correct measure, for every trial function U. This just says what I've said now three ways, that capital U star is the best in the K norm.
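In symbols, with "trial space" standing for the finite dimensional family of candidates (the notation $\|\cdot\|_K$ for the energy norm is the only assumption here):

```latex
\| U^* - u^* \|_K \;\le\; \| U - u^* \|_K
\qquad \text{for every } U \text{ in the trial space.}
```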

Now to get some bound on this, I can take any U, and estimate its difference from u star, and that will give me a bound. In other words, I don't know what this particular guy happened to be. Let me just jump to the key idea. I know that that one is better than, for example, one piecewise linear trial function that's quite convenient. Pick the one that interpolates the exact one. OK. Now for some reason, known only to finite elements, that wasn't the finite element winner. That wasn't capital U star, that was another U. So, I'll put it in blue here, for example take U to be the function that interpolates little u star, the function that I've drawn here. Since our question is how close can we get to the curved guy by piecewise linear functions, well, one choice is how close does the interpolate come. It doesn't necessarily come the closest. Probably not. But it's in the right ballpark.

So now I just ask the question, and let me draw the same picture again. I have a function and I interpolate it by piecewise linear guys. So there's a piecewise linear function. And so this is a comparison between u star and its interpolate, which is my candidate U that I'm recommending as a trial. And now it's a pure approximation question. You see we no longer have to know all about finite elements, we don't have to know anything about finite elements. We're just asking the question: if you give me a function, and you compare it with the piecewise linear interpolate, how far apart are they? How far apart? And let me call the step size h, and of course there could be unequal steps. It could be an unstructured mesh. Everything works here. Now we come to just a basic sort of understanding of calculus. How close does a curve come to the chord? Really that's what it's come down to. How close is a curved function to a chord? I can even blow that up.

So here I have some curve going up, and I compare that with the chord. And this distance here is h. So I'm just looking for something like: is the difference of order h, is it of order h squared, is it of order e to the h? What's the distance between the two? And you could imagine it's a parabola, because I'm focusing down on just a little piece. I take a little parabola, a little h piece of it, and I compare it with a chord. What's the distance between the two? That distance there. Well your eye probably tells you that it's smaller than h, because h was this big, and I'm only looking this big vertically. So that distance there, the maximum distance, is of order h squared. But that's not the question. The question is how far apart are these in our measure? How far apart are the slopes? Because the K norm is dealing with the slopes, and not the function itself. So more important than the distance is the error in slope, and would you want to guess what that's like? What's your guess on that, the error in the slope?

Now the slopes are not going to be as good as the function, as always. Slopes are one derivative higher, you lose something, you get order h. So that's the error. Let's see, I hope I'm right here. Yeah. I think that's right. There I was speaking pointwise, and then I've got to integrate it over the whole interval here, a unit interval. So if I square it of course I'm going to get h squared, and then if I integrate it, I still have h squared, and then when I take the square root I have h. So h is the right quantity. So let me put down here what our conclusion was from this method of taking capital U to be this convenient choice, not the only choice, but a convenient choice just to get an idea what the error might be. And then the error from the actual U star has got to be at least as good. Our estimate is order h for this particular application. That's the error in the energy norm, in the slopes. So the theory of finite elements would go ahead to try to show that the error in the distance between them, in the displacement you could say, is h squared.
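The chord-versus-curve rates can be observed numerically. The sketch below is an illustration under stated assumptions: the function $u(x) = \sin(\pi x)$, the equal cells, the coefficient $c = 1$, and the helper `interpolation_errors` are all choices made here, not taken from the lecture.

```python
import numpy as np

def interpolation_errors(n, samples_per_cell=50):
    """Errors of the piecewise linear interpolant of sin(pi x) on n equal cells.

    Returns (max function error, root of the integrated squared slope error).
    """
    h = 1.0 / n
    nodes = np.linspace(0.0, 1.0, n + 1)
    u_nodes = np.sin(np.pi * nodes)
    slopes = np.diff(u_nodes) / h                  # chord slope in each cell
    # Midpoints of equal sub-intervals inside each cell (midpoint quadrature).
    t = (np.arange(samples_per_cell) + 0.5) / samples_per_cell
    x = nodes[:-1, None] + h * t[None, :]          # shape (n, samples_per_cell)
    chord = u_nodes[:-1, None] + slopes[:, None] * (x - nodes[:-1, None])
    max_err = np.max(np.abs(np.sin(np.pi * x) - chord))
    slope_diff = np.pi * np.cos(np.pi * x) - slopes[:, None]
    slope_err_sq = np.sum(slope_diff ** 2) * h / samples_per_cell
    return max_err, np.sqrt(slope_err_sq)

coarse = interpolation_errors(16)
fine = interpolation_errors(32)
ratio_function = coarse[0] / fine[0]   # halving h: an h^2 error drops by about 4
ratio_slope = coarse[1] / fine[1]      # halving h: an h error drops by about 2
print(ratio_function, ratio_slope)
```

Halving the step size is a simple way to read off the order: a ratio near 4 signals second order in the function values, and a ratio near 2 signals first order in the slopes, which is the quantity the energy norm sees.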

But that's not so easy, and I won't be able to do it here. Why is it not easy to say that the error in the displacement is of order h squared? It's certainly true for this interpolate, right. There's no question that the interpolate is, you could easily see, order h squared away from the function. But the point is that capital U star was not the best in staying near the function, it was only the best in staying near the slope. So I don't have this crutch to lean on here. This is in the K norm, and not in the mean square norm. It's in a norm that deals with slope, but not with just plain displacement distance. OK. Maybe that's made the point. And it's a part of finite element theory just to do that. OK.

If I could take a minute about notation, because the notation that I've used here with K is really a matrix notation. You don't truly see it in finite element papers. For me it's clear, right. I mean that identity was quite clear from vectors and matrices. But with finite elements I'm really dealing with functions. So it's not fair to use matrix notation, you know, when it's integrals, and derivatives, and functions that are involved here. You know, these are functions, and the actual solution is a function.

So all I want to do is mention the notation that you now see, say, in 6920. That's the engineering course that would be, you know, quite related to this one, on finite elements and Galerkin and all sorts of stuff that we'll touch in the remaining weeks. How would they write the minimization problem? The original problem now, going back to the original problem, and just saying wait a minute, we didn't really have a good notation for it. The thing I'm looking for is like this. This is the kind of thing that I want: the integral of c of x, u prime squared, dx. Well it might also have a lower order term, a d of x times u squared of x, it could have that. It could have second derivatives. It could be in two variables. For Laplace we had the minimum of du/dx squared plus du/dy squared, dx dy, well and we had the linear term too. I've just written the left hand side, the quadratic term. And all I want to say is that everybody, well not everybody, but a lot of people now would use the notation a of u,u for the quadratic term. So that a of u,u in an engineering paper represents the internal strain energy, whether it's this, whether it's this, or a combination. It somehow suggests to us it's quadratic. And there's a linear term, and that's often written l of u. So I better put down what l of u typically is. This l of u might be the integral of f of x u of x dx. That would be the linear term. And I think I'd be happier to have the 1/2 there. So really it's just a match with this, but somehow it's a little cooler. This looks so much like vectors and matrices, while that and that are kind of neutral. OK. So I'm just speaking about notation here, and I could've mentioned this last time when I was speaking about all these examples from the calculus of variations. OK.
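For the 1D model problem, the notation described above can be collected as follows. The name $P(u)$ for the total energy is an assumption introduced here for convenience; the correspondence with the matrix form is the one the lecture points to:

```latex
a(u,u) = \int_0^1 c(x)\,\bigl(u'(x)\bigr)^2\,dx ,
\qquad
l(u) = \int_0^1 f(x)\,u(x)\,dx ,
\qquad
P(u) = \tfrac12\, a(u,u) - l(u)
\;\;\longleftrightarrow\;\;
\tfrac12\, u^T K u - u^T f .
```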

So that's the minimum problem. If you give me that notation for the minimum problem, what's the weak form in this notation? I'm introducing this just because you see it elsewhere. It's exactly what we're doing all the time, so I just want you to recognize it. OK. So how do we get the weak form? Can I recap how you get the weak form? Suppose u is the winner. Right. I'll just think of u as the winner. Then if I move it by v, move it a little, this expression should go up. So what happens if I move it by v? I'm going to compare the two. I'm going to compare that with 1/2 a of u plus v, u plus v, you know, upped a little, minus l of u plus v. Maybe I'll erase min and put in less or equal to. I'm just recapping that this should be less or equal to this one for all v. You see it's the weak stuff? u is the winner. I'm now using u and not u star, and I'm using v for the delta u, for the movement away from the winner, which raises the energy. OK.

Now what do I plan to do? I plan to cancel common terms here, and see what's going on, and find that first variation. Just what I did last time. I'm just doing it in this new notation. So what's the point? This l of u is linear. So l of u plus v is the same as l of u plus l of v, right. If I put in u plus v there, it splits into two integrals, here they are, so when I subtract, these guys go. Now what about this one? Well just as last time, when I put in u plus v here and expand everything, I have something squared. So this business here is going to be 1/2 of several pieces. I'll get something from the u alone. And then I'll get two of something, canceling the 1/2, from u and v, that's the cross term. And then I'll get something from the v alone. I'm dodging a couple of bullets here, just going to the main point. OK. So the main point is I'm going to subtract, I'm going to look at the differences. So the zero order terms are all gone, and it's this quantity that has to be greater or equal 0, right. Because I was left with that greater or equal for all v. OK. Don't let me leave out minus l of v there. OK. So that's the thing that has to be greater or equal to 0.
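The expansion described in words can be written out, assuming that $a(\cdot,\cdot)$ is symmetric and linear in each argument and that $l$ is linear (the "bullets being dodged" are presumably these properties):

```latex
\tfrac12\, a(u+v,\,u+v) - l(u+v)
\;=\;
\underbrace{\tfrac12\, a(u,u) - l(u)}_{\text{energy at the winner}}
\;+\;
\underbrace{a(u,v) - l(v)}_{\text{first order in } v}
\;+\;
\underbrace{\tfrac12\, a(v,v)}_{\text{second order in } v}
```

Since $u$ is the minimizer, subtracting the energy at $u$ leaves the condition $a(u,v) - l(v) + \tfrac12\,a(v,v) \ge 0$ for all $v$.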

Now we're just going to repeat the same discussion that we had last time with different letters. If this is going to be greater or equal 0 for all v, then I'm going to think of small v's, in which case this term is going to be smaller than the others and won't help. So what has to happen? What has to happen for this to be greater or equal to 0 for all v? You know, it's like we're taking the derivative in the direction of v. We're moving in the direction of v. And this is the first order term, that's the first variation. That's what has to be 0 for all v. I guess I can bring it over here if your eye will follow it. I have a of u,v minus l of v, that's the first order term, and then the second order term. It has to be greater or equal 0 for all v. And the question is, so what? What do we get out of that? Well what I'm saying is this stuff has to be 0, because v could have either sign. So if this was positive or negative, I could switch sign if I want to. So it has to be 0. So a of u,v has to equal l of v, and that's the weak form. That's the weak form, that's the form integral of c u prime v prime dx equals integral of f v dx. That's the l of v, and this is the a of u and v. This is what you see in engineering papers, when they're launching into the finite element method, and they're planning to get some notation. That's a very familiar notation. And then the final point is what about this term? So now we know this is 0. This is the weak form of the equation: find u, so that this holds for all v. That's the weak form of the equation. Oh yeah, I better not rush by it. This is the form finite elements come from. It doesn't solve this equation exactly. This is the differential equation here. It's our Euler-Lagrange equation. The finite element method says, OK, take the trial functions, and get it right for those guys. So that gives us our N equations, where this is a continuous problem. Yeah. So the weak form is what leads you naturally to finite elements. You take this, and you only make it true on a finite dimensional space. OK.
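To show how the weak form becomes N equations, here is a minimal sketch for one special case chosen for illustration: $-u'' = 1$ on $(0,1)$ with $u(0) = u(1) = 0$, so $c = 1$ and $f = 1$, with hat functions on equal cells. The function name `linear_fem` and this choice of problem are assumptions, not taken from the lecture.

```python
import numpy as np

def linear_fem(n):
    """Linear finite elements for -u'' = 1 on (0,1), u(0) = u(1) = 0, n equal cells."""
    h = 1.0 / n
    m = n - 1                               # one hat function per interior node
    # K[i, j] = a(phi_i, phi_j) = integral of phi_i' phi_j' dx  (c = 1)
    K = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    # F[i] = l(phi_i) = integral of 1 * phi_i dx = h  (area under one hat)
    F = h * np.ones(m)
    U = np.linalg.solve(K, F)               # weak form enforced for each phi_i
    x = np.linspace(0.0, 1.0, n + 1)[1:-1]  # interior nodes
    return x, U

x, U = linear_fem(8)
exact = 0.5 * x * (1.0 - x)                 # exact solution of -u'' = 1
nodal_error = np.max(np.abs(U - exact))
print(nodal_error)
```

Each row of the small system is the weak form $a(U, \phi_i) = l(\phi_i)$ tested against one hat function. In this constant-coefficient 1D model the nodal values happen to agree with the exact solution up to rounding; that is a special feature of the model, not something the energy-norm argument promises in general.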

And the final comment is what about this guy? Well the whole point is that we assume positive definite, we assume stability, we've made our life easy by guaranteeing that this is always greater or equal 0. I don't even have to think about this one. OK. Because if u and v are the same, this is a square, and that material coefficient had better not be negative, right. OK.

Thank you for your patience in listening to that. You've seen this notation for the same thing that we did last time for the weak form, and you see how it pays off immediately to give us the finite element method. OK. Good. So the project ones are all there, and there's just two or three left here. And I hope you have a super weekend, and do give a thought to project two.