Topics covered: Implicit differentiation, inverses
Instructor: Prof. David Jerison
The following content was created under a Creative Commons License. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
Professor: So, we're ready to begin the fifth lecture. I'm glad to be back. Thank you for entertaining my colleague, Haynes Miller. So, today we're going to continue where he started, namely what he talked about was the chain rule, which is probably the most powerful technique for extending the kinds of functions that you can differentiate. And we're going to use the chain rule in some rather clever algebraic ways today.
So the topic for today is what's known as implicit differentiation. So implicit differentiation is a technique that allows you to differentiate a lot of functions you didn't even know how to find before. And it's a technique - let's wait for a few people to sit down here. Physics, huh? Ok, more Physics. Let's take a break. You can get those after class.
All right, so we're talking about implicit differentiation, and I'm going to illustrate it by several examples. So this is one of the most important and basic formulas that we've already covered part way. Namely, the derivative of x to a power: d/dx x^a = a x^(a - 1). Now, what we've got so far is the exponents plus or minus 1, plus or minus 2, etc. You did the positive integer powers in the first lecture, and then yesterday Professor Miller told you about the negative powers.
So what we're going to do right now, today, is we're going to consider the exponents which are rational numbers, ratios of integers. So a = m/n, where m and n are integers. All right, so that's our goal for right now, and we're going to use this method of implicit differentiation. In particular, it's important to realize that this covers the case m = 1. And those are the nth roots. So when we take the 1/n power, we're going to cover that right now, along with many other examples. So this is our first example.
So how do we get started? Well we just write down a formula for the function. The function is y = x^(m/n). That's what we're trying to deal with. And now there's really only two steps. The first step is to take this equation to the nth power, so write it y^n = x^m. Alright, so that's just the same equation re-written. And now, what we're going to do is we're going to differentiate. So we're going to apply d/dx to the equation. Now why is it that we can apply it to the second equation, not the first equation? So maybe I should call these equation 1 and equation 2. So, the point is, we can apply it to equation 2. Now, the reason is that we don't know how to differentiate x^(m/n). That's something we just don't know yet. But we do know how to differentiate integer powers. Those are the things that we took care of before.
So now we're in shape to be able to do the differentiation. So I'm going to write it out explicitly over here, without carrying it out just yet. That's d/dx of y^n = d/dx of x^m. And now you see this expression here requires us to do something we couldn't do before yesterday. Namely, this y is a function of x. So we have to apply the chain rule here. So this is the same as - this is by the chain rule now - (d/dy y^n) dy/dx. And then, on the right hand side, we can just carry it out. We know the formula. It's m x^(m - 1).
Right, now this is our scheme. And you'll see in a minute why we win with this. So, first of all, there are two factors here. One of them is unknown. In fact, it's what we're looking for. But the other one is going to be a known quantity, because we know how to differentiate y to the n with respect to y. That's the same formula, although the letter has been changed. And so this is the same as - I'll write it underneath here - n y^(n - 1) dy/dx = m x^(m - 1).
Okay, now comes, if you like, the non-calculus part of the problem. Remember the non-calculus part of the problem is always the messier part of the problem. So we want to figure out this formula. This formula, the answer over here, which maybe I'll put in a box now, has this expressed much more simply, only in terms of x. And what we have to do now is just solve for dy / dx using algebra, and then solve all the way in terms of x.
So, first of all, we solve for dy/dx. So I do that by dividing by the factor on the left hand side. So I get here m x^(m - 1) / (n y^(n - 1)). And now I'm going to plug in -- so I'll write this as m/n. This is x^(m - 1). Now over here I'm going to put in for y, (x^(m/n))^(n - 1). So now we're almost done, but unfortunately we have this mess of exponents that we have to work out. I'm going to write it one more time. So I already recognize the factor a out front. That's not going to be a problem for me, and that's what I'm aiming for here. But now I have to encode all of these powers, so let's just write it. It's m - 1, and then it's minus the quantity (n - 1) times m/n. All right, so that's the laws of exponents applied to this ratio here. And then we'll do the arithmetic over here on the next board. So we have here m - 1 - (n - 1) m/n. That starts with m - 1. And if I multiply n by m/n, I get - m. And the second term is minus minus, that's a plus. And that's + m/n. So it's m - 1 - m + m/n. Altogether the two m's cancel. I have here - 1 + m/n. And lo and behold that's the same thing as a - 1, just what we wanted. All right, so this equals a x^(a - 1).
Okay, again just a bunch of arithmetic. From this point forward, from this substitution on, it's just the arithmetic of exponents.
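As a numerical sanity check on this first example, here is a minimal Python sketch. It is not part of the lecture: the function names and the sample values x = 2, m = 2, n = 3 are chosen only for illustration. It compares the power rule a x^(a - 1), the ratio m x^(m - 1) / (n y^(n - 1)) that comes out of implicit differentiation, and a finite-difference estimate of the slope.

```python
def power_rule(x, m, n):
    """Slope of y = x**(m/n) from the rule a * x**(a - 1), with a = m/n."""
    a = m / n
    return a * x ** (a - 1)


def implicit_ratio(x, m, n):
    """Slope from differentiating y**n = x**m: m x**(m-1) / (n y**(n-1))."""
    y = x ** (m / n)
    return (m * x ** (m - 1)) / (n * y ** (n - 1))


def central_difference(f, x, h=1e-6):
    """Finite-difference estimate of f'(x), used only as an independent check."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Illustrative values (assumptions, not from the lecture): x = 2, a = 2/3.
x, m, n = 2.0, 2, 3
print(power_rule(x, m, n))
print(implicit_ratio(x, m, n))
print(central_difference(lambda t: t ** (m / n), x))
```

The three printed numbers should agree to several decimal places, which is the arithmetic of exponents above carried out by the machine.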
All right, so we've done our first example here. I want to give you a couple more examples, so let's just continue. The next example I'll keep relatively simple. So we have example two, which is going to be the function x^2 + y^2 = 1. Well, that's not really a function. It's a way of defining y as a function of x implicitly. There's the idea that I could solve for y if I wanted to. And indeed let's do that. So if you solve for y here, what happens is you get y^2 = 1 - x ^2, and y = plus or minus the square root of 1 - x ^2.
So this, if you like, is the implicit definition. And here is the explicit function y, which is a function of x. And now just for my own convenience, I'm just going to take the positive branch. This is the function. It's just really a circle in disguise. And I'm just going to take the top part of the circle, so we'll take that top hump here.
All right, so that means I'm erasing this minus sign. I'm just taking the positive branch, just for my convenience. I could do it just as well with the negative branch. Alright, so now I've taken the solution, and I can differentiate with this. So rather than using the dy / dx notation over here, I'm going to switch notations over here, because it's less writing. I'm going to write y ' and change notations. Okay, so I want to take the derivative of this. Well this is a somewhat complicated function here. It's the square root of 1 - x^2, and the right way always to look at functions like this is to rewrite them using the fractional power notation. That's the first step in computing a derivative of a square root. And then the second step here is what? Does somebody want to tell me? Chain rule, right. That's it. So we have two things. We start with one, and then we do something else to it. So whenever we do two things to something, we need to apply the chain rule. So 1 - x^2, square root.
All right, so how do we do that? Well, the first factor I claim is the derivative of this thing. So this is 1/2 (1 - x^2)^(-1/2). So I'm doing this kind of by the advanced method now, because we've already graduated. You already did the chain rule last time. So what does this mean? This is an abbreviation for the derivative with respect to blah of blah^(1/2), whatever it is. All right, so that's the first factor that we're going to use. Rather than actually write out a variable for it and pass through as I did previously with this y and x variable here, I'm just going to skip that step and let you imagine it as being a placeholder for that variable here. So this variable is now the parenthesis. And then I have to multiply that by the rate of change of what's inside with respect to x. And that is going to be - 2x. The derivative of 1 - x^2 is - 2x.
And now again, we couldn't have done this example two before example one, because we needed to know that the power rule worked not just for a an integer but also for a = 1/2. We're using the case a = 1/2 right here. It's 1/2 times, and this - 1/2 here is a - 1. So this is the case a = 1/2. a - 1 happens to be -1/2. Okay, so I'm putting all those things together. And you know within a week you have to be doing this very automatically. So we're going to do it at this speed now. You want to do it even faster, ultimately. Yes?
Professor: The question is could I have done it implicitly without the square roots. And the answer is yes. That's what I'm about to do. So this is an illustration of what's called the explicit solution. So this guy is what's called explicit. And I want to contrast it with the method that we're going to now use today. So it involves a lot of complications. It involves the chain rule. And as we'll see it can get messier and messier. And then there's the implicit method, which I claim is easier. So let's see what happens if you do it implicitly. The implicit method involves, instead of writing the function in this relatively complicated way, with the square root, it involves leaving it alone. Don't do anything to it. In this previous case, we were left with something which was complicated, say x^(1/3) or x^(1/2) or something complicated. We had to simplify it. We had an equation one, which was more complicated. We simplified it then differentiated it. And so that was a simpler case. Well here, the simplest thing for us to differentiate is the one we started with, because squares are practically the easiest thing after first powers, or maybe zeroth powers, to differentiate. So we're leaving it alone. This is the simplest possible form for it, and now we're going to differentiate.
So what happens? So again what's the method? Let me remind you. You're applying d/dx to the equation. So you have to differentiate the left side of the equation, and differentiate the right side of the equation. So it's this, and what you get is 2x + 2y y' equals what? 0. The derivative of 1 is 0. So this is the chain rule again. I did it a different way. I'm trying to get you used to many different notations at once. Well really just two. Just the prime notation and the dy/dx notation. And this is what I get.
So now all I have to do is solve for y '. So that y ', if I put the 2 x on the other side is - 2 x, and then divide by 2y, which is -x /y. So let's compare our solutions, and I'll apologize, I'm going to have to erase something to do that. So let's compare our two solutions. I'm going to put this underneath and simplify. So what was our solution over here? It was 1/2 (1 - x ^2) ^ -1/2 ( -2 x). That was what we got over here. And that is the same thing, if I cancel the 2's, and I change it back to looking like a square root, that's the same thing as - x / square root of 1 - x ^2. So this is the formula for the derivative when I do it the explicit way. And I'll just compare them, these two expressions here. And notice they are the same. They're the same, because y = square root of 1 - x^2. Yeah? Question?
Professor: The question is why did the implicit method not give the bottom half of the circle? Very good question. The answer to that is that it did. I just didn't mention it. Wait, I'll explain. So suppose I stuck in a minus sign here. I would have gotten this with the difference, so with an extra minus sign. But then when I compared it to what was over there, I would have had to have another different minus sign over here. So actually both places would get an extra minus sign. And they would still coincide. So actually the implicit method is a little better. It doesn't even notice the difference between the branches. It does the job on both the top and bottom half. Another way of saying that is that you're calculating the slopes here. So let's look at this picture. Here's a slope. Let's just take a look at a positive value of x and just check the sign to see what's happening. If you take a positive value of x over here, x is positive. This denominator is positive. The slope is negative. You can see that it's tilting down. So it's ok. Now on the bottom side, it's going to be tilting up. And similarly what's happening up here is that both x and y are positive, and this x and this y are positive. And the slope is negative. On the other hand, on the bottom side, x is still positive, but y is negative. And it's tilting up because the denominator is negative. The numerator is positive, and with this minus sign it has a positive slope. So it matches perfectly in every category. This is complicated, however, and it's easier just to keep track of one branch at a time, even in advanced math. Okay, so we only do it one branch at a time. Other questions?
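The comparison of the two methods on the circle can be checked numerically. The following Python sketch is an illustration only: the function names and the sample point x = 0.6 are assumptions, not from the lecture. It evaluates the explicit derivative on each branch and the implicit formula y' = -x/y at the matching point.

```python
import math


def explicit_slope(x, branch=1):
    """Differentiate y = branch * sqrt(1 - x^2) directly.

    branch = +1 is the top half of the circle, -1 the bottom half.
    """
    return branch * 0.5 * (1 - x * x) ** -0.5 * (-2 * x)


def implicit_slope(x, y):
    """Slope from 2x + 2y y' = 0, that is y' = -x / y."""
    return -x / y


# Illustrative point (an assumption): x = 0.6, so y = +0.8 or -0.8.
x = 0.6
for branch in (1, -1):
    y = branch * math.sqrt(1 - x * x)
    print(branch, explicit_slope(x, branch), implicit_slope(x, y))
```

On the top branch both slopes come out negative and on the bottom branch both come out positive, matching the sign check on the picture. The single formula -x/y covers both halves without being told which branch it is on.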
Okay, so now I want to give you a slightly more complicated example here. And indeed some of the, so here's a little more complicated example. It's not going to be the most complicated example, but you know it'll be a little tricky.
So this example, I'm going to give you a fourth order equation. So y ^ 4 + x y ^2 - 2 = 0. Now it just so happens that there's a trick to solving this equation, so actually you can do both the explicit method and the non-explicit method. So the explicit method would say okay well, I want to solve for this. So I'm going to use the quadratic formula, but on y ^2. This is quadratic in y ^2, because there's a fourth power and a second power, and the first and third powers are missing.
So this is y^2 = (-x plus or minus the square root of x^2 - 4(-2)) / 2. And so this x is the b. This -2 is the c, and a = 1 in the quadratic formula. And so the formula for y is plus or minus the square root of (-x plus or minus the square root of x^2 + 8) / 2.
So now you can see this problem of branches, this happens actually in a lot of cases, coming up in an elaborate way. You have two choices for the sign here. You have two choices for the sign here. Conceivably as many as four roots for this equation, because it's a fourth degree equation. It's quite a mess. You would have to check each branch separately. And this really is that level of complexity, and in general it's very difficult to figure out the formulas for quartic equations. But fortunately we're never going to use them. That is, we're never going to need those formulas. So the implicit method is far easier. The implicit method just says okay I'll leave the equation in its simplest form. And now differentiate. So when I differentiate, I get 4 y^3 y' + ... now here I have to apply the product rule. So I differentiate the x and the y^2 separately. First I differentiate the factor x, so I get y^2. Then I differentiate the other factor, the y^2 factor. And I get x (2 y y'). And then the constant - 2 gives me 0. So - 0 = 0.
So there's the implicit differentiation step. And now I just want to solve for y'. So I'm going to factor out 4 y^3 + 2 x y. That's the factor on y'. And I'm going to put the y^2 on the other side. - y^2 over here. And so the formula for y' is - y^2 / (4 y^3 + 2 x y). So that's the formula for the solution. For the slope. You have a question?
Professor: So the question is for the y would we have to put in what we solved for in the explicit equation. And the answer is absolutely yes. That's exactly the point. So this is not a complete solution to a problem. We started with an implicit equation. We differentiated. And we got in the end, also an implicit equation. It doesn't tell us what y is as a function of x. You have to go back to this formula to get the formula for y. So for example, let me give you an example here. So this hides a degree of complexity of the problem. But it's a degree of complexity that we must live with. So for example, at x = 1, you can see that y = 1 solves. That happens to solve y^4 + x y^2 - 2 = 0. That's why I picked the 2 actually, so it would be 1 + 1 - 2 = 0. I just wanted to have a convenient solution there to pull out of my hat at this point. So I did that.
And so we now know that when x = 1, y = 1. So at (1, 1) along the curve, the slope is equal to what? Well, I have to plug in here, - 1^2 / (4 * 1^3 + 2 * 1 * 1). That's just plugging in that formula over there, which turns out to be - 1/6. So I can get it.
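The plug-in at (1, 1) can be reproduced with exact rational arithmetic. This is a small Python sketch for illustration; the function names are made up here and are not from the lecture.

```python
from fractions import Fraction


def on_curve(x, y):
    """True when (x, y) satisfies y^4 + x y^2 - 2 = 0 exactly (integer inputs)."""
    return y**4 + x * y**2 - 2 == 0


def slope(x, y):
    """y' = -y^2 / (4 y^3 + 2 x y), from implicit differentiation."""
    return Fraction(-(y**2), 4 * y**3 + 2 * x * y)


print(on_curve(1, 1))  # the point (1, 1) lies on the curve
print(slope(1, 1))     # exact slope there, as a fraction
```

Using `Fraction` keeps the answer as -1/6 rather than a rounded decimal, which makes it easy to compare with the value computed on the board.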
On the other hand, at say x = 2, we're stuck using this formula star here to find y. Now, so let me just make two points about this, which are just philosophical points for you right now. The first is, when I promised you at the beginning of this class that we were going to be able to differentiate any function you know, I meant it very literally. What I meant is if you know the function, we'll be able to give a formula for the derivative. If you don't know how to find a function, you'll have a lot of trouble finding the derivative.
So we didn't make any promises that if you can't find the function you will be able to find the derivative by some magic. That will never happen. And however complex the function is, a root of a fourth degree polynomial can be a pretty complicated function of the coefficients, we're stuck with this degree of complexity in the problem.
But the big advantage of this method, notice, is that although we've had to find star, we had to find this formula star, and there are many other ways of doing these things numerically, by the way, which we'll learn later, so there's a good method for doing it numerically. Although we had to find star, we never had to differentiate it. We had a fast way of getting the slope. So we had to know what x and y were. But y' we got by an algebraic formula, in terms of the values here.
So this is very fast, for getting the slope, once you know the point. Yes?
Student: What's in the parentheses?
Professor: Sorry, this is... Well let's see if I can manage this. Is this the parentheses you're talking about? Well, so maybe I should put commas around it. But it was "s a y", comma comma, okay? Well here was at x = 1. I'm just throwing out a value. Any other value. Actually there is one value, my favorite value. Well this is easy to evaluate right? x = 0, I can do it there. That's maybe the only one. The others are a nuisance.
All right, other questions?
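To make the point about x = 2 concrete, here is a hedged Python sketch. It is not the numerical method the course teaches later. It uses plain bisection, chosen here only because it is simple, to find the positive y on the curve, and it compares the result with the explicit formula taking both signs positive. The function names, the bracket [0, 2], and the step count are assumptions for illustration.

```python
import math


def curve(x, y):
    """Left side of the implicit equation y^4 + x y^2 - 2 = 0."""
    return y**4 + x * y**2 - 2


def solve_y(x, lo=0.0, hi=2.0, steps=60):
    """Bisection for the positive root y of curve(x, y) = 0.

    Assumes curve(x, lo) <= 0 <= curve(x, hi), which holds for the
    sample values of x used below.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2
        if curve(x, mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def star(x):
    """Explicit solution with both signs positive: sqrt((-x + sqrt(x^2 + 8)) / 2)."""
    return math.sqrt((-x + math.sqrt(x * x + 8)) / 2)


def slope(x, y):
    """y' = -y^2 / (4 y^3 + 2 x y), from implicit differentiation."""
    return -(y**2) / (4 * y**3 + 2 * x * y)


for x in (0.0, 1.0, 2.0):
    y = solve_y(x)
    print(x, y, star(x), slope(x, y))
```

Once y is known at a given x, by formula or by a numerical solve, the slope is one line of algebra, and nothing ever has to differentiate the nested square roots.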
Now we have to do something more here. So I claimed to you that we could differentiate all the functions we know. But really we can learn a tremendous amount about functions which are really hard to get at. So this implicit differentiation method has one very, very important application to finding inverse functions, or finding derivatives of inverse functions. So let's talk about that next.
So first, maybe we'll just illustrate by an example. If you have the function y = square root of x, for x positive, then of course the idea is that we should simplify this equation and we should square it so we get this somewhat simpler equation here, y^2 = x. And then we have a notation for this. If we call f(x) = the square root of x, and g(y) = x, this is the reversal of this. Then the formula for g(y) is that it should be y^2.
And in general, if we start with any old y = f(x), and we just write down, this is the defining relationship for a function g, the property that we're saying is that g ( f(x)) has got to bring us back to x. And we write that in a couple of different ways. We call g the inverse of f. And also we call f the inverse of g, although I'm going to be silent about which variable I want to use, because people mix them up a little bit, as we'll be doing when we draw some pictures of this.
So let's see. Let's draw pictures of both f and f inverse on the same graph. So first of all, I'm going to draw the graph of f(x) = square root of x. That's some shape like this. And now, in order to understand what g ( y) is, so let's do the analysis in general, but then we'll draw it in this particular case.
If you have g (y) = x, that's really just the same equation right? This is the equation g (y) = x, that's y ^2 = x. This is y = square root of x, those are the same equations, it's the same curve. But suppose now that we wanted to write down what g( x) is. In other words, we wanted to switch the variables, so draw them as I said on the same graph with the same x, and the same y axes. Then that would be, in effect, trading the roles of x and y. We have to rename every point on the graph which is of the ordered pair (x, y), and trade it for the opposite one.
And when you exchange x and y, so to do this, exchange x and y, and when you do that, graphically what that looks like is the following: suppose you have a place here, and this is the x and this is the y, then you want to trade them. So you want the y here right? And the x up there. It's sort of the opposite place over there. And that is the place which is directly opposite this point across the diagonal line x = y. So you reflect across this or you flip across that. You get this other shape that looks like that. Maybe I'll draw it with a colored piece of chalk here.
So this guy here is y = f^(-1)(x). And indeed, if you look at these graphs, this one is the square roots. This one happens to be y = x^2. If you take this one, and you turn it, you reverse the roles of the x axis and the y axis, and tilt it on its side.
So that's the picture of what an inverse function is, and now I want to show you that the method of implicit differentiation allows us to compute the derivatives of inverse functions. So let me just say it in general, and then I'll carry it out in particular. So implicit differentiation allows us to find the derivative of any inverse function, provided we know the derivative of the function.
So let's do that for an example which is truly complicated and a little subtle here. It has a very pretty answer. So we'll carry out an example here, which is the function y = tan^(-1) x. So again, for the inverse tangent all of the things that we're going to do are going to be based on simplifying this equation by taking the tangent of both sides. So, let me remind you by the way, the inverse tangent is what's also known as arctangent. That's just another notation for the same thing. And what we're going to use to describe this function is the equation tan y = x. That's what happens when you take the tangent of this function. This is how we're going to figure out what the function looks like.
So first of all, I want to draw it, and then we'll do the computation. So let's make the diagram first. So I want to do something which is analogous to what I did over here with the square root function. So first of all, I remind you that the tangent function is defined between two values here, which are pi/2 and - pi/2. And it starts out at minus infinity and curves up like this. So that's the function tan x. And so the one that we have to sketch is this one which we get by reflecting this across the axis. Well not the axis, the diagonal. This slope by the way, should be less - a little lower here so that we can have it going down and up.
So let me show you what it looks like. On the front, it's going to look a lot like this one. So this one had curved down, and so the reflection across the diagonal curved up. Here this is curving up, so the reflection is going to curve down. It's going to look like this. Maybe I should, sorry, let's use a different color, because it's reversed from before. I'll just call it green.
Now, the original curve in the first quadrant eventually had an asymptote which was straight up. So this one is going to have an asymptote which is horizontal. And that level is what? What's the highest? It is just pi / 2. Now similarly, the other way, we're going to do this: and this bottom level is going to be - pi/2. So there's the picture of this function. It's defined for all x. So this green guy is y = arctan x. And it's defined all the way from minus infinity to infinity.
And to use a notation that we had from limit notation: as x goes to infinity, let's say, arctan x goes to pi/2. That's an example of one value that's of interest in addition to the finite values.
Okay, so now the first ingredient that we're going to need, is we're going to need the derivative of the tangent function. So I'm going to recall for you, and maybe you haven't worked this out yet, but I hope that many of you have, what you get if you take the derivative with respect to y of tan y. So this you do by the quotient rule. So this is of the form u / v, right? It's sin y / cos y. You use the quotient rule. So I'm going to get this. But what you get in the end is some marvelous simplification that comes out to 1 / cos^2 y. You can recognize the cosine squared from the fact that you should get v^2 in the denominator, and somehow the numerators all cancel and simplify to 1. This is also known as sec^2 y. So that's something that, if you haven't done it yet, you're going to have to do as an exercise.
So we need that ingredient, and now we're just going to differentiate our equation. And what do we get? We get, again, (d/dy tan y) dy/dx = 1. Or, if you like, (1 / cos^2 y) times, in the other notation, y' = 1. So I've just used the formulas that I just wrote down there.
Now all I have to do is solve for y '. It's cos ^2y. Unfortunately, this is not the form that we ever want to leave these things in. This is the same problem we had with that ugly square root expression, or with any of the others. We want to rewrite in terms of x. Our original question was what is d / dx of arctan x. Now so far we have the following answer to that question: it's cos ^2 ( arctan x). Now this is a correct answer, but way too complicated. Now that doesn't mean that if you took a random collection of functions, you wouldn't end up with something this complicated. But these particular functions, these beautiful circular functions involved with trigonometry all have very nice formulas associated with them. And this simplifies tremendously.
So one of the skills that you need to develop when you're dealing with trig functions is to simplify this. And so let's see now that expressions like this all simplify. So here we go. There's only one formula, one ingredient that we need to use to do this, and then we're going to draw a diagram. So the ingredient again, is the original defining relationship that tan y = x. So tan y = x can be encoded in a right triangle in the following way: here's the right triangle and tan y means that y should be represented as an angle. And then, its tangent is the ratio of this vertical to this horizontal side. So I'm just going to pick two values that work, namely x and 1. Those are the simplest ones.
So I've encoded this equation in this picture. And now all I have to do is figure out what the cos y is in this right triangle here. In order to do that, I need to figure out what the hypotenuse is, but that's just square root of 1 + x^2. And now I can read off what the cos y is. So the cos y is one divided by the hypotenuse. So it's 1 / square root, whoops, yeah, of 1 + x^2.
And so cos^2 y is just 1 / (1 + x^2). And so our answer over here, the preferred answer which is way simpler than what I wrote up there, is that d/dx of arctan x = 1 / (1 + x^2). Maybe I'll stop here for one more question. I have one more calculation which I can do even in less than a minute. So we have a whole minute for questions. Yeah?
Professor: What happens to the inverse tangent? The inverse tangent this... Ok this inverse tangent is the same as this y here. Those are the same thing. So what I did was I skipped this step here entirely. I never wrote that down. But the inverse tangent was that y. The issue was what's a good formula for cos y in terms of x? So I am evaluating that, but I'm doing it using the letter y. So in other words, what happened to the inverse tangent is that I called it y, which is what it's been all along.
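The simplification from cos^2(arctan x) to 1 / (1 + x^2) is easy to spot-check. The Python sketch below is only an illustration: the function names and sample points are assumptions. It evaluates both forms and a finite-difference estimate of the derivative of arctan, using `math.atan` from the standard library.

```python
import math


def via_cos_squared(x):
    """The first, unsimplified answer: cos^2(arctan x)."""
    return math.cos(math.atan(x)) ** 2


def simplified(x):
    """The preferred answer read off the right triangle: 1 / (1 + x^2)."""
    return 1 / (1 + x * x)


def central_difference(f, x, h=1e-6):
    """Finite-difference estimate of f'(x), used only as an independent check."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Sample points chosen for illustration only.
for x in (-3.0, 0.0, 0.5, 10.0):
    print(x, via_cos_squared(x), simplified(x), central_difference(math.atan, x))
```

All three columns should agree, including for negative x, where the triangle picture is harder to draw but the formula still holds.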
Okay, so now I'm going to do the case of the sine, the inverse sine. And I'll show you how easy this is if I don't fuss with... because this one has an easy trig identity associated with it. So if y = arcsin x, then sin y = x, and now watch how simple it is when I do the differentiation. I just differentiate. I get (cos y) y' = 1. And then, y', so that implies that y' = 1 / cos y, and now to rewrite that in terms of x, I have to just recognize that cos y is the same as the square root of 1 - sin^2 y, so this is the same as 1 / square root of 1 - x^2.
So all told, the derivative with respect to x of the arcsine function is 1 / square root of 1 - x^2. So these implicit differentiations are very convenient. However, I warn you that you do have to be careful about the range of applicability of these things. You have to draw a picture like this one to make sure you know where this makes sense. In other words, you have to pick a branch for the sine function to work that out, and there's something like that on your problem set. And it's also discussed in your text.
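The warning about branches can be seen in a few lines of Python. This is a sketch under stated assumptions: the function names and the sample value x = 0.5 are illustrative, and the second angle, pi - arcsin x, is just one other solution of sin y = x picked to show what goes wrong off the standard branch.

```python
import math


def arcsin_slope(x):
    """The formula from the lecture: 1 / sqrt(1 - x^2), valid for -1 < x < 1."""
    return 1 / math.sqrt(1 - x * x)


def slope_from_angle(y):
    """y' = 1 / cos y, straight from differentiating sin y = x."""
    return 1 / math.cos(y)


x = 0.5
y_standard = math.asin(x)        # branch with -pi/2 <= y <= pi/2, where cos y >= 0
y_other = math.pi - math.asin(x)  # another solution of sin y = x, where cos y < 0

print(arcsin_slope(x), slope_from_angle(y_standard))
print(math.sin(y_other), slope_from_angle(y_other))
```

On the standard branch 1 / cos y matches 1 / square root of 1 - x^2. On the other solution the sine is still x but the slope comes out with the opposite sign, which is why the square root formula needs a choice of branch.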
So we'll stop here.