Video Description: Herb Gross shows how the chain rule is involved in finding some integrals involving parameters. He computes the derivatives of integrals with constant limits, as well as derivatives of integrals with variable limits of integration (chain rule).
Instructor/speaker: Prof. Herbert Gross
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
PROFESSOR: Hi. Our lesson, today, hopefully will serve two purposes. On the one hand, we will give a nice application of the chain rule. Now that we've had two units devoted to working with the chain rule, I thought you might enjoy seeing how it's used in places which aren't quite that obvious-- places that we wouldn't expect it to be used, at least. And secondly, I would like to pick as my application one which comes up in many, many different contexts.
And without further ado, the topic I want to cover today is called "Integrals Involving Parameters". Now, that sounds like a big mouthful. Let me motivate that for you, first of all, physically, and then in terms of a couple of geometric examples.
You all know from past experience my great ability with physical applications, so I won't even try to find anything profound here. Let me just take a pseudo example, pointing out what type of situation we're trying to deal with and leaving it to your own backgrounds to see places where the same principle could have been applied, but hopefully in a more practical, meaningful way for you.
Imagine for example, that we have a platform that we're looking along in the x direction. And we've punched holes in this platform, say, and liquid is trickling through these various holes. The holes are all on a horizontal line this way. What we're going to do is we're going to focus our attention on a particular particle of the liquid, and we're going to watch it as it falls. And what we would like to do is find how far that particle fell, say, during the first second of its flight.
Obviously, from a calculus point of view, the first thing we have to do is know what the velocity function is, because ultimately, we would like to integrate the velocity. Now, notice that we would expect the velocity to be depending on time. If this were a freely falling situation, we'd expect the usual gravitational type situation. Or whatever the situation happened to be, we would expect on the one hand that the velocity does depend on time. On the other hand, because of how the streams are flowing-- in other words, we don't know what's happening above here that's causing the water to shoot out-- we don't know what the initial velocity is coming out of each of these holes in the sense that a different opening may give rise to a different velocity of stream coming out.
All I'm trying to bring out here is that as we try to focus our attention on a particular opening, we find that the velocity of the particle that will follow it during that first second is a function both of its position x-- in other words, x sub 0 in this case, because we're focusing at x equals x0-- and the time t as t goes from 0 to 1. And we then simply integrate along the vertical direction here. We find y as a function of x0: it is the integral from 0 to 1 of v(x0, t) dt.
Once we're through integrating, you see, notice that the integration is with respect to t. So when we're through integrating, this being a definite integral, t no longer appears. We have a function of x0 alone, saying nothing more than the distance that the particle falls during the first second is a function of the position of the opening along the line here.
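As a concrete sketch of the formula just described, suppose (purely as an assumption for illustration; the lecture does not specify v) that the particle leaving the hole at x0 has some initial speed v_0(x_0) and then falls with constant acceleration g_0. Carrying out the integration over t shows how t disappears and only x0 is left:

```latex
y(x_0) = \int_0^1 v(x_0, t)\,dt,
\qquad
v(x_0, t) = v_0(x_0) + g_0 t
\;\Longrightarrow\;
y(x_0) = v_0(x_0) + \tfrac{1}{2}\, g_0 .
```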
Now, at any rate, the practical application is not so much writing down this equation; rather, the inverse is the case. Namely, in many practical applications, we are given this particular integral and, for some reason or other, we want to determine what v itself is. In other words, we often want to find the derivative given the integral. All right.
Let's just let it go at that for the time being. The important point is that I want you to see an example of an integral. Let me just write this here in more abstract form. It appears to be a definite integral, a to b. The function inside the integrand is apparently a function of two variables, x and y-- say, in this case x0 and t. One of the variables is a variable of integration-- in this case, y-- and the other variable is being treated as a constant.
And that's where the word parameter comes in. x is a parameter, meaning a variable constant, in the sense that, for this particular problem, x is chosen to be in some domain and remains fixed. In other words, this is some function of x when we're all through here. All right?
As a geometric example, imagine the following situation. We have a surface w = f(x,y). We take the plane x = x0 and intersect this surface with that particular plane. We get a curve, you see.
Now, we look at that curve corresponding to two points, p and q, where p and q are determined by the y values a and b. In other words, p corresponds to y = a. q corresponds to y = b. And now, a very natural question that might come up is that we would like to find the area of this particular plane region, in other words, the area of this slice between p and q.
Now, you know, the first thing I hope that you'll notice is that because the shape of the surface can vary in many different ways, the particular cross section that we get does depend on the choice of x0. Different slices-- different planes x = x0-- will give us different curves of intersection. The point is that once we have the curve of intersection, x is being treated as the parameter. This particular curve is given by what equation?
f is a function of x0 and y. See, x = x0 for every point on this curve. And so the area of the region R is the integral from a to b, f(x0, y) dy. And the question that very often comes up is, how do you find the derivative of A sub R with respect to x0, noticing, you see, that A sub R is a function of x0 alone, the y dropping out between the limits a and b when we perform the operation of integration.
A third place that this type of situation occurs is in solving certain differential equations. For example, suppose we're given a particular curve and that that curve determines a region R between the lines x = a and x = b. Suppose all we know about the curve is its slope at any point. We know that its slope at any point is given by dy/dx equals some function of x and y, which we don't necessarily have to go into right now.
And let's suppose, for the sake of argument, that we solve this first order differential equation. What we'll find, if we're lucky, is that y is some function of x and an arbitrary constant. Remember, once you have one solution to a differential equation, you have an infinite family in terms of a one parameter solution to a differential equation. In other words, in finding the area of the region R in this case, there are many curves that satisfy this particular differential equation. Until we know what specific points are being referred to over here, the best we know for sure is what? That these endpoints are a and b, that the integrand is f(x,c), and we're integrating that with respect to x from a to b.
I freudianly put a c in here, because I think what I was trying to emphasize for you is that when you look at this thing, observe that this integral is a function of c alone. Namely, when you integrate this thing, you integrate it with respect to x. The x drops out. All you have left here is a c. A sub R, then, is a function of c. And in many cases, what we would like to do is see how fast the area changes as a function of c. In other words, how do we change the area as a function of changing the arbitrary constant c?
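To make the dependence on c concrete, here is a small worked example. The differential equation dy/dx = 2x is an assumption chosen only for illustration; it is not one from the lecture. Its one-parameter family of solutions is y = x^2 + c, and the area under any one member between x = a and x = b is a function of c alone:

```latex
\frac{dy}{dx} = 2x \;\Longrightarrow\; y = x^2 + c,
\qquad
A_R(c) = \int_a^b \left(x^2 + c\right) dx
       = \frac{b^3 - a^3}{3} + c\,(b - a),
\qquad
\frac{dA_R}{dc} = b - a .
```

In this particular case the rate of change of the area with respect to c is just the width of the region, which is what one would expect from sliding the curve straight up.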
And I have enough exercises in the assignment to give you concrete drill on this. All I'm trying to give you here is an overview of the entire topic. And the reason I want to give you this overview is that it's hinted at in the textbook, but this topic is not covered there. In fact, the reason that I had you read that particular section of the textbook before the lecture this time-- you notice that usually we start with the lecture.
This time I had you read the textbook first, because the way the textbook covered this topic is essentially nothing more than the way we tackled a different problem last semester. And I want you to see that the problem done in the Thomas text is not a new problem-- it's one that we've done before-- but that with the tools that we now have available, we could've tackled a more significant problem. And that's the one I'm electing to do here and what I want to show you the key steps on, because they're not in the text. But I will leave the reinforcement for the exercise.
At any rate, hopefully now, when you see an integral of this type, it will not bother you too much. In other words don't worry about how do you integrate a function of two independent variables? When you see something like this, it means that there is some implicitly implied domain for x. In other words, we have some function of x. Let's say the domain of g might very well be, say, all x's between two values, say c and d.
But who cares about that right now? The important point is that g is defined on a certain set of values x. And what it says is to compute the output of the g machine. For the given x, you fix that x and integrate f(x,y) dy between a and b. Notice, you see, during the integration, x is being treated as a constant, so that for all intents and purposes, this is an ordinary integral.
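The "g machine" can be sketched in a few lines of Python. Everything here is an illustrative assumption: the integrand f(x, y) = sin(xy), the limits a = 0 and b = 1, and the helper `integrate` (a plain composite Simpson's rule) are not from the lecture. The point is only that x is frozen while y is integrated out.

```python
import math

def integrate(func, lower, upper, steps=1000):
    """Composite Simpson's rule on [lower, upper]; steps must be even."""
    width = (upper - lower) / steps
    total = func(lower) + func(upper)
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * func(lower + i * width)
    return total * width / 3

def f(x, y):
    # Hypothetical integrand, chosen only for illustration.
    return math.sin(x * y)

def g(x, a=0.0, b=1.0):
    # x is the parameter: it is held fixed while we integrate over y.
    return integrate(lambda y: f(x, y), a, b)

# Different choices of the parameter give different integrals.
for x in (0.5, 1.0, 2.0):
    print(x, g(x))
```

For this particular f the integral can also be done by hand, giving (1 - cos x)/x for x not equal to 0, which is a handy check on the numerical values.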
But because x isn't a bona fide constant, meaning what? It's a constant only in the sense that once chosen, it remains fixed for this particular integration. Different values of x will give me different integrals here. And consequently a very natural question that comes up is how does my function g-- which depends on x-- how does that vary as x varies?
In other words, the key question is simply this. First of all, given g defined this way, one, does g prime even exist? Does dg/dx exist? And two, if it does exist, what is it? In other words, the question that we're raising is if we can find g prime of x, how do we do it in terms of looking at the right hand side?
And let me not try to guess the answer here. The answer does turn out to be, in this particular case, one that you might have guessed. But I prefer to show you that we don't have to guess, point one. Point two, if you do guess, you won't always be that lucky. That's my finale for today's lecture. But let me see if I can survive to get to the finale first.
Let me see how we'll tackle a problem like this. First of all, to see if g prime exists at some value x0, what we have to do is-- way back to the very beginning of Part 1-- the same old definition for an ordinary derivative. We have to compute the limit of g of x sub 0 plus h minus g(x sub 0) over h, taking the limit as h approaches 0.
Notice what the g machine does... What the g machine does is it feeds x into the integrand here and integrates this with respect to y from a to b. So if the input of my g machine is x0 plus h, that means that the x is replaced by x0 plus h here. g of x0 plus h is simply integral from a to b f of x0 plus h comma y dy. Similarly, g(x0) is integral from a to b f(x0, y) dy. I now want to form this difference. And noticing-- again, this is all calculus of a single variable-- that the difference of two definite integrals is the definite integral of the difference, I can conclude that g(x0 + h) minus g(x0) is simply this single integral over here.
Now, my next step in determining g prime of x0 is I must divide this by h. Notice, by the way, that h is an arbitrary increment, but once chosen, remains fixed. Notice that h is a constant as far as this integration is concerned. Consequently, to divide by h, it is permissible to bring the h inside the integrand. In other words, technically speaking, the h should be here, but since h is a constant with respect to y, the integral can have the h brought in.
Why do I want to bring the h in here? Let me again telegraph what I'm leading up to. Obviously, when I'm going to compute g prime, my next step is to take the limit of this as h approaches 0. With h in here, I look at this and what I hope is that by this amount of time at least the following minimum amount of material has rubbed off on you in a second nature way-- that if I look at this expression in brackets as h approaches 0, this is precisely the definition of what we mean by the partial of f(x, y) with respect to x evaluated at (x0, y).
See, this is what? The change in f-- see, y is held constant. We're taking this over to change from x0 to x0 plus h and dividing by h. This is a partial of f with respect to x. That's why I want to bring the h inside.
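Written out, the steps described so far look like this. The symbols are the ones used in the lecture; only the layout is added here.

```latex
\frac{g(x_0 + h) - g(x_0)}{h}
  = \frac{1}{h}\left[\int_a^b f(x_0 + h, y)\,dy - \int_a^b f(x_0, y)\,dy\right]
  = \int_a^b \left[\frac{f(x_0 + h, y) - f(x_0, y)}{h}\right] dy ,
\qquad
\lim_{h \to 0} \frac{f(x_0 + h, y) - f(x_0, y)}{h} = f_x(x_0, y) .
```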
So now, I say, OK, g prime of x0, by definition, is this limit. I now want to take the limit of this expression. And by the way, notice what I'd love to do now is to jump right in here and say, aha, this is just the partial of f with respect to x evaluated at (x0, y).
But the thing I would like you to notice-- and again, going back to Part 1 of our course, this is one of the big things that we talked about under the heading of uniform convergence-- is that it is a very dangerous thing in general to interchange the order of limit and integration. This says what? First perform the integration, and then take the limit. What we would like to be able to do is first take the limit and then integrate the result.
Now, we did see that, provided the integrand was continuous, these operations were permissible. But we'll talk about that a little later. For the time being, let's simply summarize by saying if the limit operation and the integration operation can be interchanged, then the derivative-- see, this thing here is what?
This is my g(x). The derivative of g(x) with respect to x has the very delightful form that, essentially, all I have to do is take the derivative operation, come inside the integrand, and replace the derivative with respect to x by the partial derivative with respect to x. In other words, the derivative of the integral from a to b, f(x, y) dy is the integral from a to b, the partial of f with respect to x dy-- provided, of course, that the limit and the integration are interchangeable.
In particular, this will be true if f and f sub x exist and are continuous. And see, let's straighten out the range and the domain and what have you once and for all. Notice that y is allowed to exist between a and b. We've said that x is going to exist on some domain between c and d. Notice that saying that y is between a and b and x is between c and d geometrically says that f is defined on a rectangle. See? OK.
And then what we're saying is, under these conditions, to differentiate an integral with a parameter, with respect to that parameter, all we have to do is come inside the integral sign and differentiate-- take the partial derivative-- with respect to what is being used as the parameter. In this case, it's x, which is the parameter.
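A quick numerical sanity check of this rule, as a sketch under stated assumptions: the integrand f(x, y) = sin(xy), its partial derivative y cos(xy), the limits 0 and 1, and the helper names below are all chosen for illustration and are not from the lecture. The code compares a difference quotient of g with the integral of the partial of f with respect to x.

```python
import math

def integrate(func, lower, upper, steps=1000):
    """Composite Simpson's rule on [lower, upper]; steps must be even."""
    width = (upper - lower) / steps
    total = func(lower) + func(upper)
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * func(lower + i * width)
    return total * width / 3

A, B = 0.0, 1.0  # constant limits of integration (assumed values)

def f(x, y):
    # Hypothetical integrand.
    return math.sin(x * y)

def f_x(x, y):
    # Partial derivative of f with respect to the parameter x.
    return y * math.cos(x * y)

def g(x):
    return integrate(lambda y: f(x, y), A, B)

def g_prime_difference(x, h=1e-5):
    # Central difference quotient approximating g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)

def g_prime_under_integral(x):
    # Differentiate inside the integral sign, then integrate over y.
    return integrate(lambda y: f_x(x, y), A, B)

x0 = 2.0
print(g_prime_difference(x0), g_prime_under_integral(x0))
```

The two printed numbers should agree to several decimal places. That is evidence for the rule in this one smooth case, not a proof of it.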
Now, the only danger with this particular thing-- and by the way, notice, not only is there a danger here that I'm going to mention. The danger is this looks so easy, you may be saying, why did he do it the hard way? Why didn't he just tell us this was the right way of doing it? And the point is it just happens to be one of those coincidences where the rigorous way yields a logical answer which is consistent with what is probably our intuitive guess.
But it's not always going to happen that way. And the example that I have in mind now goes back to what we were talking about the beginning of the lecture. Namely, I wanted to give you an application of the chain rule. And here's where that application comes in.
I now call this-- I don't know what to call it. So let's just call it variable limits of integration. Same problem as before-- it's going to cause the chain rule to come in now. The only difference is going to be-- and that's just a forewarning. You don't have to know that right now.
I'm going to have the same problem as before. What did I have before? I had that g(x) was integral f(x, y) dy between the two constants a and b. Now, I'm going to let my limits of integration also depend on the parameter. See, all the limits of integration have to be is constants as far as y is concerned.
What I'm saying is what makes this problem differ from the previous one is suppose that it happens that instead of being given a nice rectangle to play around with, I'm given a couple of curves like this in the xy plane. See, this would be y equals a(x). This would be y equals b(x).
See, what I'm saying now is that not only does the integrand depend on what value of x I pick, but the limits of integration as I'm finding a cross-sectional area of a surface, you see. The limits of the integral themselves depend on the choice of x-- constant, as far as y is concerned, but depend on x. You see, now, what happens is that my parameter appears in the limits as well as just in the integrand.
And now, you see, also what this means is if I try the previous approach of computing g(x plus delta x), et cetera, I'm in trouble, because the only way I can combine two integrals and put them under the same integral sign is if they're between the same limits of integration. Notice here, for example, that if I replace x by x plus delta x, I not only change the integrand, but notice that the limits become what? a of x plus delta x, b(x plus delta x)-- and those in general, unless a and b happen to be constants, will vary with x.
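In symbols, the difficulty is that the two integrals in the difference quotient no longer run over the same interval, so they cannot be merged under one integral sign the way they were in the constant-limit case:

```latex
g(x) = \int_{a(x)}^{b(x)} f(x, y)\,dy ,
\qquad
g(x + \Delta x) - g(x)
  = \int_{a(x + \Delta x)}^{b(x + \Delta x)} f(x + \Delta x, y)\,dy
  \;-\; \int_{a(x)}^{b(x)} f(x, y)\,dy .
```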
In fact, let's look at it this way. This is the problem we should have started with in the sense that constant limits of integration are a special case of this. At any rate, what I wanted to show you was that this particular problem can be handled very nicely in terms of the chain rule. Namely, what we do here is we observe that, first of all, y is not really a variable here. It's integrated out.
So what we think of is let's think of a(x) as being some variable u. Let's think of b(x) as being some variable v. Let's write down the function of three independent variables u, v, and x. OK. What will that function be? Let u, v, and x be arbitrary, independent variables. Look at the integral from u to v f(x, y) dy, and call it h(u, v, x). This is obviously dependent upon u. It's dependent upon v. And it's dependent upon x.
The place that the chain rule comes in is that in our particular problem, u and v cannot be arbitrary, but rather u must be that particular function a(x), and v must be the particular function b(x). Consequently g(x) is simply what? It's h(u, v, x), where u is a(x) and v is b(x). Consequently, to find g prime of x, what we want is h prime of x and to find h prime as a function of x.
See the chain rule here? u can be expressed in terms of x. v can be expressed in terms of x. Obviously, x is already expressed in terms of x. So this is really implicitly a function of x alone. So by the chain rule, what I'd like to be able to do is to combine these three pieces of information to find h prime of x.
And remember how the chain rule works. Now, I'm not going to beat that to death. We've just had two units on that. Let's just say it rather quickly. g prime of x is the partial of h with respect to u times u prime of x plus the partial of h with respect to v times v prime of x plus the partial of h with respect to x times x prime of x, which, of course, is just 1. In other words, writing this thing out, g prime of x is simply this.
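The setup and the chain rule statement can be written compactly as follows, using the lecture's own symbols h, u, v, a, and b:

```latex
h(u, v, x) = \int_u^v f(x, y)\,dy ,
\qquad
g(x) = h\bigl(a(x),\, b(x),\, x\bigr) ,
\qquad
g'(x) = \frac{\partial h}{\partial u}\, a'(x)
      + \frac{\partial h}{\partial v}\, b'(x)
      + \frac{\partial h}{\partial x} \cdot 1 .
```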
What is the partial of h with respect to u? Let's come back here for a second and remember what h is. h is this integral. I want to take the partial of that with respect to u. That means I have to investigate this.
Now, here's the interesting point. Whereas u, v, and x are independent variables, what does it mean when you say you're taking the partial with respect to u? It means that you're treating v and x as constants.
Now, if v and x are being treated as constants, what I have is simply what? I'm taking a derivative with respect to a variable where the only place the variable appears is as the lower limit of the integral. In other words, I claim that that's nothing more than minus f(x, u). I go inside the integral. In other words, I differentiate the integral. That leaves me just the integrand-- and I replace the variable of integration by u here. And because it's the lower limit, I put in the minus sign.
Now, why did I go through that very fast? That's why I had you read this assignment first. Notice that the assignment in the textbook does not touch what I'm talking about, but rather seems to review that topic that we covered under Part 1-- that if you wanted to take an ordinary derivative with respect to u, integral from u to a, g(y) dy, the answer would just be minus g(u).
And that's exactly what I did in here. I treated v and x as constants here. In other words, the only variable in here was u. That will be emphasized again in the exercises. In a similar way, the partial of h with respect to v means this thing. Notice now that u is being treated as a constant. x is being treated as a constant. To differentiate this, my variable appears only as an upper limit on the integral.
That means I come inside the integral sign, replace the integral by just the function itself, replacing what? Replacing y-- that's the only variable of integration-- by the upper limit v. In other words, this is f(x, v). All right?
And finally, the partial of h with respect to x is this integral here. Notice now that u and v are being treated as constants. With u and v being treated as constants, that's the special case that we started our lecture with, namely, to differentiate with respect to a parameter when the parameter appears only as part of the integrand.
So how do we do that? We come inside. This is the integral from u to v, the partial of f with respect to x dy. Putting the whole thing together-- recalling, among other things, that du/dx, since u is a(x), is just a prime of x and that dv/dx is b prime of x-- what this says-- and again, I want you to see the beauty of the chain rule here, because at least to my way of thinking, I don't see anything at all intuitive about the result I'm going to show you.
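The two partial derivatives with respect to the limits, as just described, are both instances of the fundamental theorem from Part 1, with x frozen throughout:

```latex
\frac{\partial h}{\partial u}
  = \frac{\partial}{\partial u} \int_u^v f(x, y)\,dy = -\,f(x, u) ,
\qquad
\frac{\partial h}{\partial v}
  = \frac{\partial}{\partial v} \int_u^v f(x, y)\,dy = f(x, v) .
```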
And that is as soon as you make the limits of integration variable, to differentiate an integral involving a parameter-- you see again, what's the parameter here? x. We integrate it with respect to y. This is a constant as far as y is concerned. See, intuitively, you might say, gee, all you've got to do is take the derivative sign, bring it inside, and this should be the answer. See, it's the same as we did before.
The point is-- and this is where many serious mistakes are made in problems involving integrals of this type-- is that the reason that our intuitive way happened to be right in the simpler case was that these were constants with respect to x. Now, they're variables. Well, it turns out, if you just wrote this thing down, you would be wrong.
What is the correction factor? Again, come back to here. The correction factor is this here, which we've just started to compute. Again, just saying it-- if you wrote this term down, then to get the correct answer, you would have to tack on what? b prime of x times f(x, b(x)). That means what? You think of a y as being over here. For the particular value of x, you replace y by b(x). In other words, you look at f(x, y), and every place you see a y, replace it by b(x). And then subtract from that a prime of x times f(x, a(x)).
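In formula form, the result just stated is g'(x) = integral from a(x) to b(x) of f_x(x, y) dy, plus b'(x) f(x, b(x)), minus a'(x) f(x, a(x)). The sketch below checks it numerically and also shows how far off the "just bring the derivative inside" guess is. All specifics are assumptions for illustration and not from the lecture: f(x, y) = sin(xy), a(x) = x, b(x) = x^2, the point x = 1.5, and the helper names.

```python
import math

def integrate(func, lower, upper, steps=1000):
    """Composite Simpson's rule on [lower, upper]; steps must be even."""
    width = (upper - lower) / steps
    total = func(lower) + func(upper)
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * func(lower + i * width)
    return total * width / 3

def f(x, y):
    # Hypothetical integrand.
    return math.sin(x * y)

def f_x(x, y):
    # Partial derivative of f with respect to the parameter x.
    return y * math.cos(x * y)

# Hypothetical variable limits and their derivatives.
def a(x): return x
def a_prime(x): return 1.0
def b(x): return x * x
def b_prime(x): return 2.0 * x

def g(x):
    return integrate(lambda y: f(x, y), a(x), b(x))

def naive_guess(x):
    # Only differentiates under the integral sign; ignores the moving limits.
    return integrate(lambda y: f_x(x, y), a(x), b(x))

def chain_rule_result(x):
    # Naive term plus the correction terms from the upper and lower limits.
    return naive_guess(x) + b_prime(x) * f(x, b(x)) - a_prime(x) * f(x, a(x))

def difference_quotient(x, h=1e-5):
    # Central difference approximation to g'(x), used as the reference value.
    return (g(x + h) - g(x - h)) / (2 * h)

x0 = 1.5
print("difference quotient:", difference_quotient(x0))
print("chain rule result:  ", chain_rule_result(x0))
print("naive guess:        ", naive_guess(x0))
```

The first two printed values should agree closely, while the naive guess differs by exactly the two boundary terms.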
Now again, I suspect that, for many of you, it's the first time that you've seen something like this, because I say it's a topic which I believe was a natural one to occur in the textbook, but for some reason it doesn't appear there. Because of the importance of the concept, the number of times it appears in physical applications, the number of times that one has to differentiate an integral with respect to a parameter-- I don't know the physical applications well enough to lecture on them, but it does occur in probability theory, among other places, and it appears in any subject involving integral equations and the like-- I wanted to give you the experience of seeing what the concept means, to have you hear me say it. And then I will spend the exercises trying to drive home the computational know how so that you will be able to do these things at least in a mechanical way, independently of whether the theory made that much sense because of the lack of physical example motivation other than what I did at the beginning.
At any rate, keep in mind, though, that in terms of our present topic, where we're talking about the chain rule, this is a certainly noble application to show an important place in the physical world where knowledge of the chain rule plays a very important role. And with that, I might just as well conclude today's lecture. And until next time, good bye.
Funding for the publication of this video was provided by the Gabriella and Paul Rosenbaum Foundation. Help OCW continue to provide free and open access to MIT courses by making a donation at ocw.mit.edu/donate.