Topics covered: Non-independent variables
Instructor: Prof. Denis Auroux
OK, so first of all there used to be, is the sound system working? Can you hear me in the back? Can anyone hear in the back? No? Can we make it a bit louder? Oh, yeah, OK, it should be good now. So there used to be problem set solutions, new problem sets, practice exams. I'm pretty much out of everything. It will all be on the Web. The practice exams have been on the webpage for a while.
So, you can get it there. So -- OK, some information about the exam: so the next test in this class is on Tuesday, and it will be in the same place as last time. OK, so some of you go to Walker Memorial third floor. Some of you come here. The basic rule is you go to the same place as last time, namely, if your last name starts with A-R or if you're left handed, then you go to Walker. If your last name starts with S-Z and you're right handed, then you come here. And if you're confused, you go to wherever you can make it. As usual, no documents, no calculators allowed. And, as usual, we will be discussing the practice exams in lecture and recitation.
So, tomorrow basically will be a review session. We'll go over practice 2A. So, please bring it back tomorrow. And, practice 2B will be discussed in recitation on Monday. So, before that, we have one more topic to cover. So, today we'll learn about some new cool stuff about partial derivatives with constraints. OK, so this will be actually on the exam also, but it's the last topic that will be on the exam. And, the new problem set is only due next Thursday. So, in principle, you don't need to start working on it until after the exam. I would still like to encourage you to actually have a look at it because basically you can do all of it today. And, it's actually not that hard. It's actually good practice on the topic that we're going to see today. So, if you want practice on that, then you should look at the p-set.
OK, so we're going to continue looking at what happens when we have non-independent variables. So, I'm afraid we don't take deliveries during class time, sorry. Please take a seat, thanks. [LAUGHTER] [APPLAUSE] OK, so Jason, you please claim your package at the end of lecture. OK, so last time we saw how to use Lagrange multipliers to find the minimum or maximum of a function of several variables when the variables are not independent. And, today we're going to try to figure out more about relations between the variables, and how to handle functions that depend on several variables when they're related. So, just to give you an example, in physics, very often, you have functions that depend on pressure, volume, and temperature where pressure, volume, and temperature are actually not independent.
But they are related, say, by PV=nRT. So, of course, then you can substitute and express a function in terms of two of them only, but very often it's convenient to keep all three. But then we have to figure out, what are the rates of change with respect to t, with respect to each other, the rate of change of f with respect to these variables, and so on. So, we have to figure out what we mean by partial derivatives again.
So, OK, more generally, let's say just for the sake of notation, I'm going to think of a function of three variables, x, y, z, where the variables are related by some equation, but I will put in the form g of x, y, z equals some constant. OK, so that's the same kind of setup as we had last time, except now we are not just looking for minima and maxima. We are trying to understand partial derivatives.
So, the first observation is that if x, y, and z are related, then that means, in principle, we could solve for one of them, and express it as a function of the two others. So, in particular, can we understand even without solving? Maybe we cannot solve. Can we understand how the variables are related to each other? So, for example, z, you can think of z as a function of x and y. So, we can ask ourselves, what are the rates of change of z with respect to x, keeping y constant, or with respect to y keeping x constant? And, of course, if we can solve, then we know the formula for this.
And then we can compute these guys. But, what if we can't solve? So, how do we find these things without solving? Well, so let's do an example. Let's say that my relation is x^2 + yz + z^3 = 8. And, let's say that I'm looking near the point (x, y, z) equals (2, 3, 1). So, let me check 2^2 plus three times one plus 1^3 is indeed eight. OK, but now, if I change x and y a little bit, how does z change? Well, of course I could solve for z in here. It's a cubic equation. There is actually a formula. But that formula is quite complicated. We actually don't want to do that. There's an easier way.
So, how can we do it? Well, let's look at the differential -- -- of this constraint quantity. OK, so if we called this g, let's look at dg. So, what's the differential of this? So, the differential of x^2 is 2x dx plus, I think there's a zdy. There's a ydz, and there's also a 3z^2 dz. OK, you can get this either by implicit differentiation and the product rule, or you could get this just by putting here, here, and here the partial derivatives of this with respect to x, y, and z.
OK, any questions about how I got this? No? OK. So, now, what do I do with this? Well, this represents, somehow, variations of g. But, well, I've set this thing equal to eight. And, eight is a constant. So, it doesn't change. So, in fact, well, we can set this to zero because, well, we call this g. Then, g equals eight is constant. That means we set dg equal to zero. OK, so, now let's just plug in some values at this point.
That tells us, well, so if x equals two, that's 4dx, plus z is one, so dy, plus y + 3z^2 should be six, so 6dz, equals zero. That is, 4dx + dy + 6dz = 0. And now, this equation, here, tells us about a relation between the changes in x, y, and z near that point. It tells us, if you change x and y, how z will change. Or, it tells you actually anything you might want to know about the relations between these variables. So, for example, you can move dz to that side, and then express dz in terms of dx and dy. Or, you can move dy to that side and express dy in terms of dx and dz, and so on. It tells you at the level of the derivatives how each of the variables depends on the two others.
OK, so, just to clarify this: if we want to view z as a function of x and y, then what we will do is we will just move the dz's to the other side, and it will tell us dz equals minus one over six times 4dx plus dy. And, so that should tell you that partial z over partial x is minus four over six. Well, that's minus two thirds, and partial z over partial y is going to be minus one sixth. OK, another way to think about this: when we compute partial z over partial x, that means that actually we keep y constant. OK, let me actually add some subtitles here.
So, here that means we keep y constant. And so, if we keep y constant, another way to think about it is we set dy to zero. We set dy equals zero. So if we do that, we get dz equals negative four sixths dx. That tells us the rate of change of z with respect to x. Here, we set x constant. So, that means we set dx equal to zero. And, if we set dx equal to zero, then we have dz equals negative one sixth of dy. That tells us the rate of change of z with respect to y. OK, any questions about that?
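As a rough numerical sanity check of the two rates just found, here is a minimal sketch that solves the constraint x^2 + yz + z^3 = 8 for z near (2, 3, 1) and estimates the partials by finite differences. The helper names `g` and `solve_z`, the use of Newton's method, and the step size `h` are illustrative assumptions, not something from the lecture.

```python
def g(x, y, z):
    """Constraint function from the example: g(x, y, z) = x^2 + y*z + z^3."""
    return x**2 + y * z + z**3


def solve_z(x, y, z0=1.0, tol=1e-14, max_iter=50):
    """Solve g(x, y, z) = 8 for z with Newton's method, starting near z0."""
    z = z0
    for _ in range(max_iter):
        # Newton step in z; the z-derivative of g is y + 3z^2.
        step = (g(x, y, z) - 8.0) / (y + 3.0 * z**2)
        z -= step
        if abs(step) < tol:
            break
    return z


h = 1e-6  # small step for central differences (an arbitrary choice)

# Vary x with y held at 3, then vary y with x held at 2.
dz_dx = (solve_z(2 + h, 3) - solve_z(2 - h, 3)) / (2 * h)
dz_dy = (solve_z(2, 3 + h) - solve_z(2, 3 - h)) / (2 * h)

print(dz_dx, dz_dy)  # should be close to -2/3 and -1/6
```

The point of the differential method is that none of this solving is needed: the coefficients of dx and dy give the same numbers directly.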
No? What, yes? Yes, OK, let me explain that again. So we found an expression for dz in terms of dx and dy. That means that this thing, the differential, is the total differential of z viewed as a function of x and y. OK, and so the coefficients of dx and dy are the partials. Or, another way to think about it, if you want to know partial z partial x, it means you set y to be constant. Setting y to be constant means that you will put zero in the place of dy.
So, you will be left with dz equals minus four sixths dx. And, that will give you the rate of change of z with respect to x when you keep y constant, OK? So, there are various ways to think about this, but hopefully it makes sense. OK, so how do we think about this in general? Well, if we know that g of x, y, z equals a constant, then dg, which is g_x dx + g_y dy + g_z dz, should be set equal to zero. OK, and now we can solve for whichever variable we want to express in terms of the others. So, for example, if we care about z as a function of x and y --
-- we'll get that dz is negative gx over gz dx minus gy over gz dy. And, so if we want partial z over partial x, so, well, so one way is just to say that's going to be the coefficient of dx in here, or just to write down the other way. We are setting y equals constant. So, that means we set dy equal to zero. And then, we get dz equals negative gx over gz dx. So, that means partial z over partial x is minus gx over gz.
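Written out in symbols, the general computation can be sketched as follows. It assumes g_z is nonzero at the point in question so that dividing by it is allowed; that assumption is not spelled out in the lecture.

```latex
\begin{align*}
dg &= g_x\,dx + g_y\,dy + g_z\,dz = 0 \\
dz &= -\frac{g_x}{g_z}\,dx - \frac{g_y}{g_z}\,dy \\
\frac{\partial z}{\partial x} &= -\frac{g_x}{g_z}, \qquad
\frac{\partial z}{\partial y} = -\frac{g_y}{g_z}
\end{align*}
```

For the earlier example, g = x^2 + yz + z^3 at (2, 3, 1) has g_x = 4, g_y = 1, g_z = 6, which gives minus two thirds and minus one sixth again.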
And, see, that's a very counterintuitive formula because you have this minus sign that you somehow probably couldn't have seen coming if you hadn't actually derived things this way. I mean, it's pretty surprising to see that minus sign come out of nowhere the first time you see it. OK, so now we know how to find the rate of change of constrained variables with respect to each other. You can apply the same to find, if you want partial x, partial y, or any of them, you can do it. Any questions so far? No? OK, so, before we proceed further, I should probably expose some problem with the notations that we have so far.
So, let me try to get you a bit confused, OK? So, let's take a very simple example. Let's say I have a function, f of x, y equals x + y. OK, so far it doesn't sound very confusing. And then, I can write partial f over partial x. And, I think you all know how to compute it. It's going to be just one. OK, so far we are pretty happy. Now let's do a change of variables. Let's set x = u and y = u + v. It's not a very complicated change of variables. But let's do it.
Then, f in terms of u and v, well, so f, remember f was x + y, becomes u plus u plus v. That's twice u plus v. What's partial f over partial u? It's two. So, x and u are the same thing. Partial f over partial x, and partial f over partial u, well, unless you believe that one equals two, they are really not the same thing, OK? So, that's an interesting, slightly strange phenomenon. x equals u, but partial f partial x is not the same as partial f partial u.
So, how do we get rid of this contradiction? Well, we have to think a bit more about what these notations mean, OK? So, when we write partial f over partial x, it means that we are varying x, keeping y constant. When we write partial f over partial u, it means we are varying u, keeping v constant. So, varying u or varying x is the same thing. But, keeping v constant, or keeping y constant are not the same thing.
If I keep y constant, then when I change x, so when I change u, then v will also have to change so that their sum stays the same. Or, if you prefer the other way around, when I do this one I keep v constant. If I keep v constant and I change u, then y will change. It won't be constant. So, that means, well, life looked quite nice and easy with these notations. But, what's dangerous about them is they are not making explicit what it is exactly that we are keeping constant.
OK, so just to write things, so here we change u and x that are the same thing. But we keep y constant, while here we change u, which is still the same thing as x. But, what we keep constant is v, or in terms of x and y, that's y minus x constant. And, that's why they are not the same. So, whenever there's any risk of confusion, OK, so not in the cases that we had before because what we've done until now, we didn't really have a problem. But, in a situation like this, to clarify things, we'll actually say explicitly what it is that we want to keep constant.
OK, so what's going to be our new notation? Well, so it's not particularly pleasant because it uses, now, a subscript not to indicate what you are differentiating, but rather what you were holding constant. So, that's quite a conflict of notation with what we had before. I think I can safely blame it on physicists or chemists. OK, so this one means we keep y constant, and partial f over partial u with v held constant, similarly.
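In this subscript notation, the small example with f = x + y, x = u, y = u + v can be worked out as in the sketch below. The last line rewrites f in terms of x and v, using y = x + v, which is a step added here for illustration.

```latex
\begin{align*}
f &= x + y
  &&\Longrightarrow\quad \left(\frac{\partial f}{\partial x}\right)_{y} = 1 \\
f &= u + (u + v) = 2u + v
  &&\Longrightarrow\quad \left(\frac{\partial f}{\partial u}\right)_{v} = 2 \\
f &= x + (x + v) = 2x + v
  &&\Longrightarrow\quad \left(\frac{\partial f}{\partial x}\right)_{v} = 2
\end{align*}
```

Varying x and varying u are the same thing; the answers differ only because of what is held constant.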
OK, so now what happens is we no longer have any contradiction. We have partial f over partial x with y constant is different from partial f over partial x with v constant, which is the same as partial f over partial u with v constant. OK, so this guy is one. And these guys are two. So, now we can safely use the fact that x equals u if we are keeping track of what is actually held constant, OK? So now, that's going to be particularly important when we have variables that are related because, let's say now that I have a function that depends on x, y, and z. But, x, y, and z are related. Then, it means that I look at, say, x and y as my independent variables, and z as a function of x and y.
Then, it means that when I do partials, say, with respect to x, I will hold y constant. But, I will let z vary as a function of x and y. Or, I could do it the other way around. I could vary x, keep z constant, and let y be a function of x and z. And so, I will need to use this kind of notation to indicate which one I mean. OK, any questions? No? All right, so let's try to do an example where we have a function that depends on variables that are related. OK, so I don't want to do one with PV=nRT because probably, I mean, if you've seen it, then you've seen too much of it.
And, if you haven't seen it, then maybe it's not the best example. So, let's do a geometric example. So, let's look at the area of the triangle. So, let's say I have a triangle, and my variables will be the sides a and b. And the angle here, theta. OK, so what's the area of this triangle? Well, its base times height over two. So, it's one half of the base is a, and the height is b sine theta. OK, so that's a function of a, b, and theta. Now, let's say, actually, there is a relation between a, b, and theta that I didn't tell you about, namely, actually, I want to assume that it's a right triangle, OK?
So, let's now assume it's a right triangle with, let's say, the hypotenuse is b. So, we have the right angle here, actually. So, a is here. b is here. Theta is here. So, saying it's a right triangle is the same thing as saying that a equals b cosine theta, OK? So that's our constraint. That's the relation between a, b, and theta. And, this is a function of a, b, and theta. And, let's say that we want to understand how the area depends on theta. OK, what's the rate of change of the area of this triangle with respect to theta? So, I claim there's various answers. I can think of at least three possible answers.
So, what can we possibly mean by the rate of change of A with respect to theta? So, these are all things that we might want to call partial A partial theta. But of course, we'll have to actually use different notations to distinguish them. So, the first way that we actually already know about is if we just forget about the fact that the variables are related, OK? So, if we just think of little a, b, and theta as independent variables, and we just change theta, keeping a and b constant --
So, that's exactly what we meant by partial A, partial theta, right? I'm not putting any constraints. So, just to use some new notation, that would be the rate of change of A with respect to theta, keeping a and b fixed at the same time. Of course, if we are keeping a and b fixed, and we are changing theta, it means we completely ignore this property of being a right triangle. So, in fact, it corresponds to changing the area by changing the angle, keeping these lengths fixed. And, of course, we lose the right angle.
When we rotate this side here, the angle doesn't stay a right angle. And that one, we know how to compute, right, because it's the one we've been computing all along. So, that means we keep a and b fixed. And then, so let's see, what's the derivative of A with respect to theta? It's one half ab cosine theta. OK, now that one we know. Any questions? No? OK, the two other guys will be more interesting. So far, I'm not really doing anything with my constraint. Let's say that actually I do want to keep the right angle.
Then, when I change theta, there's two options. One is I keep a constant, and then of course b will have to change because if this width stays the same, then when I change theta, the height increases, and then this side length increases. The other option is to change the angle, keeping b constant. So, actually, this side stays the same length. But then, a has to become a bit shorter. And, of course, the area will change in different ways depending on what I do. So, that's why I said we have three different answers. So, the next one is keep, I forgot which one I said first. Let's say keep a constant. And, that means that b will change. b is going to be some function of a and theta.
Well, in fact here, we know what the function is because we can solve the constraint, namely, b is a over cosine theta. But we don't actually need to know that. b just changes so that we keep a right angle. And, so the name we will have for this is partial A over partial theta with a held constant, OK? And, the fact that I'm not putting b in my subscript there means that actually b will be a dependent variable. It changes in whatever way it has to change so that when theta changes, a stays the same while b changes so that we keep a right triangle.
And, the third guy is the one where we actually keep b constant, and now a, we think of a as a function of b and theta, and it changes so that we keep the right angle. So actually as a function of b and theta, it's given over there. Little a equals b cosine theta. And so, this guy is called partial A over partial theta with b held constant. OK, so we've just defined them. We don't know yet how to compute these things. That's what we're going to do now. That is the definition, and what these things mean.
Is that clear to everyone? Yes, OK. Yes? OK, so the second answer, again, so one way to ask ourselves, how does the area depend on theta, is to say, well, actually look at the area of the right triangle as a function of a and theta only by solving for b. And then, we'll change theta, keep a constant, and ask, how does the area change? So, when we do that, when we change theta and keep a the same, then b has to change so that it stays a right triangle, right, so that this relation still holds. That requires us to change b. So, when we write partial A over partial theta with a constant, it means that, actually, b will be the dependent variable.
It depends on a and theta. And so, the area depends on theta, not only because theta is in the formula, but also because b changes, and b is in the formula. Yes? No, no, we don't keep theta constant. We vary theta, right? The goal is to see how things change when I change theta by a little bit. OK, so if I change theta a little bit in this one, if I change theta a little bit and I keep a the same, then b has to change also in some way so that it's still a right triangle. And then, because theta and b change, that causes the area to change. OK, so maybe I should explain that again. So, theta changes.
Little a is constant. But, we have the constraint, a equals b cosine theta. That means that b changes. And then, the question is, how does A change? Well, it will change in part because theta changes, and in part because b changes. But, we want to know how it depends on theta in this situation. Yes? Ah, that's a very good question. So, what if I don't keep a or b constant? Well, then there's too many choices. So I have to decide actually how I'm going to change things. See, if I just say I have this relation, that means I have two independent variables left, whichever two of the three I want. But, I still have to specify two of them to say exactly which triangle I mean. So, I cannot ask myself just how will it change if I change theta and do random things with a and b?
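To keep the three candidates apart, here is a compact summary in the subscript notation. Only the first value has been computed at this point in the lecture; the other two are just definitions so far.

```latex
\begin{align*}
A &= \tfrac12\,ab\sin\theta, \qquad \text{constraint: } a = b\cos\theta \\[4pt]
\left(\frac{\partial A}{\partial\theta}\right)_{a,b} &= \tfrac12\,ab\cos\theta
  \quad\text{($a$, $b$, $\theta$ treated as independent; constraint ignored)} \\
\left(\frac{\partial A}{\partial\theta}\right)_{a} &:
  \quad\text{$a$ held fixed, $b = b(a,\theta)$ adjusts to keep the right angle} \\
\left(\frac{\partial A}{\partial\theta}\right)_{b} &:
  \quad\text{$b$ held fixed, $a = a(b,\theta)$ adjusts to keep the right angle}
\end{align*}
```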
It depends what I do with a and b. Of course, I could choose to change them simultaneously, but then I have to specify how exactly I'm going to do that. Ah, yes, if you wanted to, indeed, we could also change things in such a way that the third side remains constant. And that would be, yet, a different way to attack the problem. I mean, we don't have good notation for this, here, because we didn't give it a name. But, yeah, I mean, we could. We could call this guy c, and then we'd have a different formula, and so on. So, I mean, I'm not looking at it for simplicity. But, you could have many more.
I mean, in general, you will want, once you have a set of nice, natural variables, you will want to look mostly at situations where one of the variables changes. Some of them are held fixed, and then some dependent variable does whatever it must so that the constraint keeps holding. OK, so let's try to compute one of them. Let's say I decide that we will compute this one. OK, let's see how we can compute partial A, partial theta with a held fixed.
[APPLAUSE] OK, so let's try to compute partial A, partial theta with a held constant. So, let's see three different ways of doing that. So, let me start with method zero. OK, it's not a real method. That's why I'm not getting a positive number. So, that one is just, we solve for b, and we remove b from the formulas. OK, so here it works well because we know how to solve for b. But I'm not considering this to be a real method because in general we don't know how to do that. I mean, in the beginning I had this relation that was an equation of degree three. You don't really want to solve your equation for the dependent variable usually. Here, we can. So, solve for b and substitute.
So, how do we do that? Well, the constraint is a=b cosine theta. That means b is a over cosine theta. Some of you know that as a secant theta. That's the same. And now, if we express the area in terms of a and theta only, A is one half of ab cosine, sorry, ab sine theta is now one half of a^2 sine theta over cosine theta. Or, if you prefer, one half of a^2 tangent theta. Well, now that it's only a function of a and theta, I know what it means to take the partial derivative with respect to theta, keeping a constant. I know how to do it.
So, partial A over partial theta, a held constant, well, if a is a constant, then I get this one half a^2 coming out times, what's the derivative of tangent? Secant squared, very good. If you're European and you've never heard of secant, that's one over cosine. And, if you know the derivative as one plus tangent squared, that's the same thing. And, it's also correct. OK, so, that's one way of doing it. But, as I've already said, it doesn't get us very far if we don't know how to solve for b. We really used the fact that we could solve for b and get rid of it. So, there's two systematic methods, and let's say the basic rule is that you should give both of them a chance.
You should see which one you prefer, and you should be able to use one or the other on the exam. OK, most likely you'll actually have a choice between one or the other. It will be up to you to decide which one you want to use. But, you cannot use solving and substitution. That's not fair. OK, so the first one is to use differentials. By the way, in the notes they are called also method one and method two. I'm not promising that I have the same one, am I? I mean, I might have one and two switched. It doesn't really matter. So, how do we do things using differentials? Well, first, we know that we want to keep a fixed, and that means that we'll set da equal to zero, OK?
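The "method zero" result, one half a^2 secant squared theta, can be spot-checked numerically. The sketch below is only an illustration: the function name `area`, the sample values a = 2 and theta = 0.5, and the step `h` are arbitrary choices, not from the lecture.

```python
import math


def area(a, theta):
    """Area of the right triangle after eliminating b via a = b*cos(theta)."""
    b = a / math.cos(theta)  # constraint solved for the hypotenuse b
    return 0.5 * a * b * math.sin(theta)


a, theta, h = 2.0, 0.5, 1e-6  # sample point and step size (arbitrary)

# Central difference in theta with little a held fixed; b adjusts inside area().
numeric = (area(a, theta + h) - area(a, theta - h)) / (2 * h)

# Closed form from solving and substituting: (1/2) a^2 sec^2(theta).
formula = 0.5 * a**2 / math.cos(theta) ** 2

print(numeric, formula)  # the two values should agree closely
```

Note that b is recomputed from the constraint on every call, which is exactly what "a held constant, b dependent" means.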
The second thing that we want to do is we want to look at the constraint. The constraint is a equals b cosine theta. And, we want to differentiate that. Well, differentiate the left-hand side. You get da. And, differentiate the right-hand side as a function of b and theta. You should get, well, how many db's? Well, that's the rate of change with respect to b. That's cosine theta db minus b sine theta d theta.
That's a product rule applied to b times cosine theta. So -- Well, now, if we have a constraint that's relating da, db, and d theta, OK, so that's actually what we did, right, that's the same sort of thing as what we did at the beginning when we related dx, dy, and dz. That's really the same thing, except now our variables are a, b, and theta. Now, we know that also we are keeping a fixed. So actually, we set this equal to zero. So, we have zero equals da equals cosine theta db minus b sine theta d theta. That means that actually we know how to solve for db.
OK, so cosine theta db equals b sine theta d theta or db is b tangent theta d theta. OK, so in fact, what we found, if you want, is the rate of change of b with respect to theta. Why do we care? Well, we care because let's look, now, at dA, the function that we want to look at. OK, so the function is A equals one half ab sine theta. Well, then, dA, so we had to use the product rule carefully, or we use the partials.
So, the coefficient of d little a will be partial with respect to little a. That's one half b sine theta da plus coefficient of db will be one half a sine theta db plus coefficient of d theta will be one half ab cosine theta d theta. But now, what do I do with that? Well, first I said a is constant. So, da is zero. Second, well, actually we don't like b at all, right? We want to view A as a function of theta. So, well, maybe we actually want to use this formula for db that we found in here. OK, and then we'll be left only with d thetas, which is what we want.
So, if we plug this one into that one, we get dA equals one half a sine theta times b tangent theta d theta plus one half ab cosine theta d theta. And, if we collect these things together, we get one half of ab times sine theta times tangent theta plus cosine theta d theta. And, if you know your trig, you'll see that this is sine squared over cosine plus cosine squared over cosine. That's the same as secant theta. So, now you have expressed dA as something times d theta, namely one half ab secant theta d theta. Well, that coefficient is the rate of change of A with respect to theta with the understanding that we are keeping a fixed, and letting b vary as a dependent variable.
Not enough space: sorry. OK, in case it's clearer for you, let's think about it backwards. So, we wanted to find how A changes. To find how A changes, we write dA. But now, this tells us how A depends on little a, little b, and theta. Well, we know actually we want to keep little a constant. So, we set this to be zero. Theta, well, we are very happy because we want to express things in terms of theta.
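Collected in one place, the differentials computation can be written as in the sketch below, using capital A for the area and little a for the side.

```latex
\begin{align*}
dA &= \tfrac12\,b\sin\theta\,da + \tfrac12\,a\sin\theta\,db
      + \tfrac12\,ab\cos\theta\,d\theta \\
0 = da &= \cos\theta\,db - b\sin\theta\,d\theta
      \;\Longrightarrow\; db = b\tan\theta\,d\theta \\
dA &= \tfrac12\,ab\left(\sin\theta\tan\theta + \cos\theta\right)d\theta
    = \tfrac12\,ab\,\frac{\sin^2\theta + \cos^2\theta}{\cos\theta}\,d\theta
    = \tfrac12\,ab\sec\theta\,d\theta \\
\left(\frac{\partial A}{\partial\theta}\right)_{a} &= \tfrac12\,ab\sec\theta
\end{align*}
```

Since b = a secant theta on the constraint, this agrees with the one half a^2 secant squared theta obtained by solving and substituting.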
Db we want to get rid of. How do we get rid of db? Well, we do that by figuring out how b depends on theta when a is fixed. And, we do that by differentiating the constraint equation, and setting da equal to zero. OK, so -- I guess to summarize the method, we wrote dA in terms of da, db, d theta. Then, we say that a is constant means we set da equals zero. And, the third thing is that because, well, we differentiate the constraint.
And, we can solve for db in terms of d theta. And then, we plug into dA, and we get the answer. OK, oops. So, here's another method to do the same thing differently is to use the chain rule. So, we can use the chain rule with dependent variables, OK? So, what does the chain rule tell us? The chain rule tells us, so we will want to differentiate -- -- the formula for A with respect to theta holding little a constant. So, I claim, well, what does the chain rule tell us? It tells us that, well, when we change things, A changes because of the changes in the variables. So, part of it is that A depends on theta and theta changes. How fast does theta change?
Well, you could call that the rate of change of theta with respect to theta with a constant. But of course, how fast does theta depend on itself? The answer is one. So, that's pretty easy. Plus, then we have the partial derivative, formal partial derivative, of A with respect to little a times the rate of change of a in our situation. Well, how does little a change if a is constant? Well, it doesn't change. And then, there is Ab, the formal partial derivative times, sorry, the rate of change of b. OK, and how do we find this one? Well, here we have to use the constraint.
OK, and we can find this one from the constraint as we've seen at the beginning either by differentiating the constraint, or by using the chain rule on the constraint. So, of course the calculations are exactly the same. See, this is the same formula as the one over there, just dividing everything by partial theta and with subscripts little a. But, if it's easier to think about it this way, then that's also valid.
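The chain rule version can be sketched in the same way. Here A_theta, A_a, and A_b stand for the formal partials of A = (1/2) ab sine theta, with a, b, and theta treated as independent.

```latex
\begin{align*}
\left(\frac{\partial A}{\partial\theta}\right)_{a}
 &= A_\theta\left(\frac{\partial\theta}{\partial\theta}\right)_{a}
  + A_a\left(\frac{\partial a}{\partial\theta}\right)_{a}
  + A_b\left(\frac{\partial b}{\partial\theta}\right)_{a} \\
 &= \tfrac12\,ab\cos\theta\cdot 1 \;+\; \tfrac12\,b\sin\theta\cdot 0
  \;+\; \tfrac12\,a\sin\theta\left(\frac{\partial b}{\partial\theta}\right)_{a} \\[4pt]
a = b\cos\theta
 &\;\Longrightarrow\;
 0 = \cos\theta\left(\frac{\partial b}{\partial\theta}\right)_{a} - b\sin\theta
 \;\Longrightarrow\;
 \left(\frac{\partial b}{\partial\theta}\right)_{a} = b\tan\theta \\[4pt]
\left(\frac{\partial A}{\partial\theta}\right)_{a}
 &= \tfrac12\,ab\cos\theta + \tfrac12\,ab\sin\theta\tan\theta
  = \tfrac12\,ab\sec\theta
\end{align*}
```

Term by term this is the differentials computation divided through by d theta, so both methods give the same answer.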
OK, so tomorrow we are going to review for the test, so I'm going to tell you a bit more about this also as we go over one practice problem on that.