Topics covered: Derivatives, slope, velocity, rate of change
Instructor: Prof. David Jerison
The following content is provided under a Creative Commons License. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
Professor: So, again welcome to 18.01. We're getting started today with what we're calling Unit One, a highly imaginative title. And it's differentiation. So, let me first tell you, briefly, what's in store in the next couple of weeks. The main topic today is what is a derivative. And, we're going to look at this from several different points of view, and the first one is the geometric interpretation. That's what we'll spend most of today on. And then, we'll also talk about a physical interpretation of what a derivative is.
And then there's going to be something else which I guess is maybe the reason why Calculus is so fundamental, and why we always start with it in most science and engineering schools, which is the importance of derivatives, of this, to all measurements. So that means pretty much every place. That means in science, in engineering, in economics, in political science, etc. Polling, lots of commercial applications, just about everything.
Now, that's what we'll be getting started with, and then there's another thing that we're gonna do in this unit, which is we're going to explain how to differentiate anything. So, how to differentiate any function you know. And that's kind of a tall order, but let me just give you an example. If you want to take the derivative - this we'll see today is the notation for the derivative of something - of some messy function like e ^ x arctan x. We'll work this out by the end of this unit.
All right? Anything you can think of, anything you can write down, we can differentiate it. All right, so that's what we're gonna do, and today, as I said, we're gonna spend most of our time on this geometric interpretation. So let's begin with that.
So here we go with the geometric interpretation of derivatives. And, what we're going to do is just ask the geometric problem of finding the tangent line to some graph of some function at some point. Which is to say (x_0, y_0). So that's the problem that we're addressing here. Alright, so here's our problem, and now let me show you the solution. So, well, let's graph the function. Here's its graph. Here's some point. All right, maybe I should draw it just a bit lower. So here's a point P. Maybe it's above the point x_0. x_0, by the way, this was supposed to be an x_0. That was some fixed place on the x-axis. And now, in order to perform this mighty feat, I will use another color of chalk. How about red? OK. So here it is. There's the tangent line, well, not quite straight. Close enough. All right? I did it.
That's the geometric problem. I achieved what I wanted to do, and it's kind of an interesting question, which unfortunately I can't solve for you in this class, which is, how did I do that? That is, how physically did I manage to know what to do to draw this tangent line? But that's what geometric problems are like. We visualize it. We can figure it out somewhere in our brains. It happens. And the task that we have now is to figure out how to do it analytically, to do it in a way that a machine could just as well as I did in drawing this tangent line.
So, what did we learn in high school about what a tangent line is? Well, a tangent line has an equation, and any line through a point has the equation y - y_0 is equal to m, the slope, times x - x_0.
So here's the equation for that line, and now there are two pieces of information that we're going to need to work out what the line is. The first one is the point. That's that point P there. And to specify P, given x_0, we need to know the level of y, which is of course just f(x_0). That's not a calculus problem, but anyway that's a very important part of the process. So that's the first thing we need to know. And the second thing we need to know is the slope. And that's this number m. And in calculus we have another name for it. We call it f prime of x_0. Namely, the derivative of f. So that's the calculus part. That's the tricky part, and that's the part that we have to discuss now. So just to make that explicit here, I'm going to make a definition, which is that f'(x_0), which is known as the derivative of f at x_0, is the slope of the tangent line to y = f(x) at the point, let's just call it P.
All right? So, that's what it is, but still I haven't made any progress in figuring out any better how I drew that line. So I have to say something that's more concrete, because I want to be able to cook up what these numbers are. I have to figure out what this number m is. And one way of thinking about that, let me just try this, so I certainly am taking for granted that in the sort of non-calculus part that I know what a line through a point is. So I know this equation. But another possibility might be, this line here, how do I know - well, unfortunately, I didn't draw it quite straight, but there it is - how do I know that this orange line is not a tangent line, but this other line is a tangent line? Well, it's actually not so obvious, but I'm gonna describe it a little bit. It's not really the fact-- this thing crosses at some other place, which is this point Q. But it's not really the fact that the thing crosses at two places, because the line could be wiggly, the curve could be wiggly, and it could cross back and forth a number of times. That's not what distinguishes the tangent line.
So I'm gonna have to somehow grasp this, and I'll first do it in language. And it's the following idea: it's that if you take this orange line, which is called a secant line, and you think of the point Q as getting closer and closer to P, then the slope of that line will get closer and closer to the slope of the red line. And if we draw it close enough, then that's gonna be the correct line. So that's really what I did, sort of in my brain when I drew that first line.
And so that's the way I'm going to articulate it first. Now, so the tangent line is equal to the limit of so called secant lines PQ, as Q tends to P. And here we're thinking of P as being fixed and Q as variable. All right? Again, this is still the geometric discussion, but now we're gonna be able to put symbols and formulas to this computation. And we'll be able to work out formulas in any example.
So let's do that. So first of all, I'm gonna write out these points P and Q again. So maybe we'll put P here and Q here. And I'm thinking of this line through them. I guess it was orange, so we'll leave it as orange. All right. And now I want to compute its slope. So this, gradually, we'll do this in two steps. And these steps will introduce us to the basic notations which are used throughout calculus, including multi-variable calculus, across the board. So the first notation that's used is you imagine here's the x-axis underneath, and here's the x_0, the location directly below the point P. And we're traveling here a horizontal distance which is denoted by delta x. So that's delta x, so called. And we could also call it the change in x.
So that's one thing we want to measure in order to get the slope of this line PQ. And the other thing is this height. So that's this distance here, which we denote delta f, which is the change in f. And then, the slope is just the ratio, delta f / delta x. So this is the slope of the secant. And the process I just described over here with this limit applies not just to the whole line itself, but also in particular to its slope. And the way we write that is the limit as delta x goes to 0. And that's going to be our slope. So this is the slope of the tangent line.
OK. Now, This is still a little general, and I want to work out a more usable form here, a better formula for this. And in order to do that, I'm gonna write delta f, the numerator more explicitly here. The change in f, so remember that the point P is the point (x_0, f(x_0)). All right, that's what we got for the formula for the point. And in order to compute these distances and in particular the vertical distance here, I'm gonna have to get a formula for Q as well. So if this horizontal distance is delta x, then this location is x_0 + delta x. And so the point above that point has a formula, which is x_0 plus delta x, f of - and this is a mouthful - x_0 plus delta x.
All right, so there's the formula for the point Q. Here's the formula for the point P. And now I can write a different formula for the derivative, which is the following: so this f'(x_0) , which is the same as m, is going to be the limit as delta x goes to 0 of the change in f, well the change in f is the value of f at the upper point here, which is x_0 + delta x, and minus its value at the lower point P, which is f(x_0), divided by delta x. All right, so this is the formula. I'm going to put this in a little box, because this is by far the most important formula today, which we use to derive pretty much everything else. And this is the way that we're going to be able to compute these numbers.
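For reference, the boxed formula can be written in symbols as follows. This is a transcription of the formula as it is described aloud, with \Delta x standing for "delta x"; the exact blackboard layout is an assumption.

```latex
f'(x_0) \;=\; \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x}
        \;=\; \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}
```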
So let's do an example. This example, we'll call this example one. We'll take the function f(x) , which is 1/x . That's sufficiently complicated to have an interesting answer, and sufficiently straightforward that we can compute the derivative fairly quickly. So what is it that we're gonna do here? All we're going to do is we're going to plug in this formula here for that function. That's all we're going to do, and visually what we're accomplishing is somehow to take the hyperbola, and take a point on the hyperbola, and figure out some tangent line. That's what we're accomplishing when we do that. So we're accomplishing this geometrically but we'll be doing it algebraically. So first, we consider this difference delta f / delta x and write out its formula.
So I have to have a place. So I'm gonna make it again above this point x_0, which is the general point. We'll make the general calculation. So the value of f at the top, when we move to the right by delta x, so I just read off from this, read off from here. The formula, the first thing I get here is 1 / (x_0 + delta x). That's the left hand term. Minus 1 / x_0, that's the right hand term. And then I have to divide that by delta x. OK, so here's our expression. And by the way this has a name. This thing is called a difference quotient. It's pretty complicated, because there's always a difference in the numerator. And in disguise, the denominator is a difference, because it's the difference between the value on the right side and the value on the left side here. OK, so now we're going to simplify it by some algebra.
So let's just take a look. So this is equal to, let's continue on the next level here. This is equal to 1 / delta x times... All I'm going to do is put it over a common denominator. So the common denominator is (x_0 + delta x) * x_0. And so in the numerator, for the first expression I have x_0, and for the second expression I subtract (x_0 + delta x). So this is the same thing as I had in the numerator before, factoring out this denominator. And here I put that numerator into this more amenable form.
And now there are two basic cancellations. The first one is that x_0 and x_0 cancel, so we have this. And then the second step is that these two expressions cancel, the numerator and the denominator. Now we have a cancellation that we can make use of. So we'll write that under here. And this equals -1 / ((x_0 + delta x) x_0). And then the very last step is to take the limit as delta x tends to 0, and now we can do it. Before we couldn't do it. Why? Because the numerator and the denominator gave us 0 / 0. But now that I've made this cancellation, I can pass to the limit. And all that happens is I set this delta x to 0, and I get -1/x_0^2. So that's the answer. All right, so in other words what I've shown - let me put it up here - is that f'(x_0) = -1/x_0^2.
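The chain of simplifications for Example 1 can be collected in one display. This is a reconstruction from the spoken steps (common denominator, cancel x_0, cancel delta x, pass to the limit), not a copy of the blackboard.

```latex
\frac{\Delta f}{\Delta x}
  = \frac{1}{\Delta x}\left(\frac{1}{x_0 + \Delta x} - \frac{1}{x_0}\right)
  = \frac{1}{\Delta x}\cdot\frac{x_0 - (x_0 + \Delta x)}{(x_0 + \Delta x)\,x_0}
  = \frac{1}{\Delta x}\cdot\frac{-\Delta x}{(x_0 + \Delta x)\,x_0}
  = \frac{-1}{(x_0 + \Delta x)\,x_0}
  \;\xrightarrow[\Delta x \to 0]{}\; -\frac{1}{x_0^2}
```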
Now, let's look at the graph just a little bit to check this for plausibility, all right? What's happening here is, first of all it's negative. It's less than 0, which is a good thing. You see that slope there is negative. That's the simplest check that you could make. And the second thing that I would just like to point out is that as x goes to infinity, that as we go farther to the right, it gets less and less steep. So as x_0 goes to infinity, less and less steep. So that's also consistent here, when x_0 is very large, this is a smaller and smaller number in magnitude, although it's always negative. It's always sloping down. All right, so I've managed to fill the boards. So maybe I should stop for a question or two. Yes?
Professor: So the question is to explain again this limiting process. So the formula here is we have basically two numbers. So in other words, why is it that this expression, when delta x tends to 0, is equal to -1 / x_0^2? Let me illustrate it by sticking in a number for x_0 to make it more explicit. All right, so for instance, let me stick in here for x_0 the number 3. Then it's -1 / ((3 + delta x) * 3). That's the situation that we've got. And now the question is what happens as this number gets smaller and smaller and smaller, and gets to be practically 0? Well, literally what we can do is just plug in 0 there, and you get (3 + 0) * 3 in the denominator. -1 in the numerator. So this tends to -1/9 (over 3^2). And that's what I'm saying in general with this extra number here. Other questions? Yes.
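The limiting process with x_0 = 3 can also be watched numerically. The following is a minimal sketch, not part of the lecture: the function name `secant_slope` and the particular step sizes are illustrative assumptions, and exact fractions are used so that rounding does not blur the trend.

```python
from fractions import Fraction


def secant_slope(x0, dx):
    """Slope of the secant to y = 1/x between x0 and x0 + dx."""
    return (1 / (x0 + dx) - 1 / x0) / dx


x0 = Fraction(3)
slopes = []
for k in range(1, 6):
    dx = Fraction(1, 10 ** k)          # delta x = 0.1, 0.01, ...
    slope = secant_slope(x0, dx)       # equals -1 / ((3 + dx) * 3) exactly
    slopes.append(slope)
    print(f"delta x = {float(dx):.0e}   secant slope = {float(slope):.10f}")

print("value claimed in the lecture, -1/9 =", float(Fraction(-1, 9)))
```

As delta x shrinks, the printed slopes settle toward -1/9, which is the value obtained by setting delta x to 0 in the simplified expression.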
Professor: So the question is what happened between this step and this step, right? Explain this step here. Alright, so there were two parts to that. The first is this delta x which is sitting in the denominator, I factored all the way out front. And so what's in the parentheses is supposed to be the same as what's in the numerator of this other expression. And then, at the same time as doing that, I put that expression, which is the difference of two fractions, I expressed it with a common denominator. So in the denominator here, you see the product of the denominators of the two fractions. And then I just figured out what the numerator had to be without really... Other questions? OK.
So I claim that on the whole, calculus gets a bad rap, that it's actually easier than most things. But there's a perception that it's harder. And so I really have a duty to give you the calculus made harder story here. So we have to make things harder, because that's our job. And this is actually what most people do in calculus, and it's the reason why calculus has a bad reputation. So the secret is that when people ask problems in calculus, they generally ask them in context. And there are many, many other things going on. And so the little piece of the problem which is calculus is actually fairly routine and has to be isolated and gotten through. But all the rest of it, relies on everything else you learned in mathematics up to this stage, from grade school through high school. So that's the complication. So now we're going to do a little bit of calculus made hard. By talking about a word problem.
We only have one sort of word problem that we can pose, because all we've talked about is this geometry point of view. So far those are the only kinds of word problems we can pose. So what we're gonna do is just pose such a problem. So find the areas of triangles, enclosed by the axes and the tangent to y = 1/x. OK, so that's a geometry problem. And let me draw a picture of it. It's practically the same as the picture for example one. We only consider the first quadrant. Here's our shape. All right, it's the hyperbola. And here's maybe one of our tangent lines, which is coming in like this. And then we're trying to find this area here. Right, so there's our problem. So why does it have to do with calculus? It has to do with calculus because there's a tangent line in it, so we're gonna need to do some calculus to answer this question. But as you'll see, the calculus is the easy part.
So let's get started with this problem. First of all, I'm gonna label a few things. And one important thing to remember of course, is that the curve is y = 1/x. That's perfectly reasonable to do. And also, we're gonna calculate the areas of the triangles, and you could ask yourself, in terms of what? Well, we're gonna have to pick a point and give it a name. And since we need a number, we're gonna have to do more than geometry. We're gonna have to do some of this analysis just as we've done before. So I'm gonna pick a point and, consistent with the labeling we've done before, I'm gonna call it (x_0, y_0). So that's almost half the battle, having notations, x and y for the variables, and x_0 and y_0, for the specific point.
Now, once you see that you have these labelings, I hope it's reasonable to do the following. So first of all, this is the point x_0, and over here is the point y_0. That's something that we're used to in graphs. And in order to figure out the area of this triangle, it's pretty clear that we should find the base, which is that we should find this location here. And we should find the height, so we need to find that value there. Let's go ahead and do it. So how are we going to do this? Well, so let's just take a look. So what is it that we need to do? I claim that there's only one calculus step, and I'm gonna put a star here for this tangent line. I have to understand what the tangent line is. Once I've figured out what the tangent line is, the rest of the problem is no longer calculus. It's just that slope that we need. So what's the formula for the tangent line? Put that over here. It's going to be y - y_0 is equal to, and here's the magic number, we already calculated it. It's in the box over there. It's (-1/x_0^2)(x - x_0). So this is the only bit of calculus in this problem. But now we're not done. We have to finish it. We have to figure out all the rest of these quantities so we can figure out the area.
All right. So how do we do that? Well, to find this point, this has a name. We're gonna find the so called x-intercept. That's the first thing we're going to do. So to do that, what we need to do is to find where this horizontal line meets that diagonal line. And the equation for the x-intercept is y = 0. So we plug in y = 0, that's this horizontal line, and we find this point. So let's plug that into star. We get 0 minus, oh, one other thing we need to know. We know that y_0 is f(x_0), and f(x) is 1/x, so this thing is 1/x_0. And that's equal to -1/x_0^2, and here's x, and here's x_0. So the equation is 0 - 1/x_0 = (-1/x_0^2)(x - x_0). All right, so in order to find this x value, I have to plug in one equation into the other.
So this simplifies a bit. This is -x/x_0^2. And this is plus 1/x_0 because the x_0 and x_0^2 cancel somewhat. And so if I put this on the other side, I get x / x_0^2 is equal to 2 / x_0. And if I then multiply through - so that's what this implies - and if I multiply through by x_0^2 I get x = 2x_0.
OK, so I claim that this point we've just calculated, it's 2x_0. Now, I'm almost done. I need to get the other one. I need to get this one up here. Now I'm gonna use a very big shortcut to do that. So the shortcut to the y-intercept is to use symmetry. All right, I claim I can stare at this and I can look at that, and I know the formula for the y-intercept. It's equal to 2y_0. All right. That's what that one is. So this one is 2y_0. And the reason I know this is the following: so here's the symmetry of the situation, which is not completely direct. It's a kind of mirror symmetry around the diagonal. It involves the exchange of (x, y) with (y, x); so trading the roles of x and y. So the symmetry that I'm using is that any formula I get that involves x's and y's, if I trade all the x's and replace them by y's and trade all the y's and replace them by x's, then I'll have a correct formula on the other way. So if everywhere I see a y I make it an x, and everywhere I see an x I make it a y, the switch will take place. So why is that? That's just an accident of this equation. That's because, so the symmetry explained... is that the equation is y = 1/x. But that's the same thing as xy = 1, if I multiply through by x, which is the same thing as x = 1/y. So here's where the x and the y get reversed. OK now if you don't trust this explanation, you can also get the y-intercept by plugging x = 0 into the equation star. OK? We plugged y = 0 in and we got the x-value. And you can do the same thing analogously the other way.
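For readers who do not trust the symmetry argument, the direct route mentioned at the end of this passage (plugging x = 0 into the tangent-line equation star) works out as follows. This is a worked sketch that uses only the star equation and y_0 = 1/x_0 from the curve.

```latex
y - y_0 = -\frac{1}{x_0^2}\,(0 - x_0) = \frac{1}{x_0} = y_0
\quad\Longrightarrow\quad
y = 2y_0
```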
All right so I'm almost done with the geometry problem, and let's finish it off now. Well, let me hold off for one second before I finish it off. What I'd like to say is just make one more tiny remark. And this is the hardest part of calculus in my opinion. So the hardest part of calculus is that we call it one variable calculus, but we're perfectly happy to deal with four variables at a time or five, or any number. In this problem, I had an x, a y, an x_0 and a y_0. That's already four different things that have various relationships between them. Of course the manipulations we do with them are algebraic, and when we're doing the derivatives we just consider what's known as one variable calculus. But really there are millions of variables floating around potentially. So that's what makes things complicated, and that's something that you have to get used to. Now there's something else which is more subtle, and that I think many people who teach the subject or use the subject aren't aware of, because they've already entered into the language and they're so comfortable with it that they don't even notice this confusion. There's something deliberately sloppy about the way we deal with these variables.
The reason is very simple. There are already four variables here. I don't wanna create six names for variables or eight names for variables. But really in this problem there were about eight. I just slipped them by you. So why is that? Well notice that the first time that I got a formula for y_0 here, it was this point. And so the formula for y_0, which I plugged in right here, was from the equation of the curve. y_0 = 1 / x_0. The second time I did it, I did not use y = 1/x. I used this equation here, so this is not y = 1/x. That's the wrong thing to do. It's an easy mistake to make if the formulas are all a blur to you and you're not paying attention to where they are on the diagram.
You see that x-intercept calculation there involved where this horizontal line met this diagonal line, and y = 0 represented this line here. So the sloppiness is that y means two different things. And we do this constantly because it's way, way more complicated not to do it. It's much more convenient for us to allow ourselves the flexibility to change the role that this letter plays in the middle of a computation. And similarly, later on, if I had done this by this more straightforward method, for the y-intercept, I would have set x equal to 0. That would have been this vertical line, which is x = 0. But I didn't change the letter x when I did that, because that would be a waste for us. So this is one of the main confusions that happens. If you can keep yourself straight, you're a lot better off, and as I say this is one of the complexities.
All right, so now let's finish off the problem. Let me finally get this area here. So, actually I'll just finish it off right here. So the area of the triangle is, well it's the base times the height. The base is 2x_0, the height is 2y_0, and a half of that. So it's 1/2 (2x_0) * (2y_0) , which is 2x_0 y_0, which is, lo and behold, 2. So the amusing thing in this case is that it actually didn't matter what x_0 and y_0 are. We get the same answer every time. That's just an accident of the function 1 / x. It happens to be the function with that property.
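The claim that the area does not depend on the point can be checked for a few sample points. The sketch below is illustrative only: the function name `tangent_triangle_area` and the sample values of x_0 are assumptions, and the intercepts are computed directly from the tangent-line equation rather than by the symmetry shortcut.

```python
from fractions import Fraction


def tangent_triangle_area(x0):
    """Intercepts and enclosed area for the tangent to y = 1/x at x0 > 0."""
    x0 = Fraction(x0)
    y0 = 1 / x0                       # point on the curve: y_0 = 1/x_0
    slope = -1 / x0 ** 2              # derivative from Example 1
    # Tangent line (star): y - y0 = slope * (x - x0)
    x_intercept = x0 - y0 / slope     # set y = 0 and solve for x
    y_intercept = y0 - slope * x0     # set x = 0 and solve for y
    area = Fraction(1, 2) * x_intercept * y_intercept
    return x_intercept, y_intercept, area


for x0 in (Fraction(1, 2), 1, 3, 10):
    xi, yi, area = tangent_triangle_area(x0)
    print(f"x0 = {x0}: x-intercept = {xi}, y-intercept = {yi}, area = {area}")
```

Each sample point gives base 2x_0, height 2y_0, and the same area, matching the computation in the lecture.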
All right, so we have some more business today, some serious business. So let me continue. So, first of all, I want to give you a few more notations. And these are just other notations that people use to refer to derivatives. And the first one is the following: we already wrote y = f(x). And so when we write delta y, that means the same thing as delta f. That's a typical notation. And previously we wrote f prime for the derivative, so this is Newton's notation for the derivative. But there are other notations. And one of them is df/dx, and another one is dy/dx, meaning exactly the same thing. And sometimes we let the function slip down below so that becomes d/dx of f and d/dx of y. So these are all notations that are used for the derivative, and these were initiated by Leibniz. And these notations are used interchangeably, sometimes practically together. They both turn out to be extremely useful. This one omits - notice that this thing omits - the underlying base point, x_0. That's one of the nuisances. It doesn't give you all the information. But there are lots of situations like that where people leave out some of the important information, and you have to fill it in from context. So that's another couple of notations.
So now I have one more calculation for you today. I carried out this calculation of the derivative of the function 1 / x. I wanna take care of some other powers. So let's do that.
So Example 2 is going to be the function f(x) = x^n. n = 1, 2, 3; one of these guys. And now what we're trying to figure out is the derivative with respect to x of x^n in our new notation, what this is equal to. So again, we're going to form this expression, delta f / delta x. And we're going to make some algebraic simplification. So what we plug in for delta f / delta x is ((x + delta x)^n - x^n)/delta x.
Now before, let me just stick this in then I'm gonna erase it. Before, I wrote x_0 here and x_0 there. But now I'm going to get rid of it, because in this particular calculation, it's a nuisance. I don't have an x floating around, which means something different from the x_0. And I just don't wanna have to keep on writing all those symbols. It's a waste of blackboard energy. There's a total amount of energy, and I've already filled up so many blackboards that, there's just a limited amount. Plus, I'm trying to conserve chalk. Anyway, no 0's. So think of x as fixed. In this case, delta x moves and x is fixed in this calculation.
All right now, in order to simplify this, in order to understand algebraically what's going on, I need to understand what the nth power of a sum is. And that's a famous formula. We only need a little tiny bit of it, called the binomial theorem. So, the binomial theorem which is in your text and explained in an appendix, says that if you take the sum of two guys and you take them to the nth power, that of course is (x + delta x) multiplied by itself n times. And so the first term is x^n, that's when all of the n factors come in. And then, you could have this factor of delta x and all the rest x's. So at least one term of the form (x^(n-1)) times delta x. And how many times does that happen? Well, it happens when there's a factor from here, from the next factor, and so on, and so on, and so on. There's a total of n possible times that that happens.
And now the great thing is that, with this alone, all the rest of the terms are junk that we won't have to worry about. So to be more specific, there's a very careful notation for the junk. The junk is what's called big O of (delta x)^2. What that means is that these are terms of order, so with (delta x)^2, (delta x)^3 or higher. All right, that's how. Very exciting, higher order terms. OK, so this is the only algebra that we need to do, and now we just need to combine it together to get our result. So, now I'm going to just carry out the cancellations that we need. So here we go. We have delta f / delta x, which remember was 1 / delta x times this, which is this times, now this is x^n plus nx^(n-1) delta x plus this junk term, minus x^n. So that's what we have so far based on our previous calculations. Now, I'm going to do the main cancellation, which is this. All right. So, that's 1/delta x times nx^(n-1) delta x plus this term here. And now I can divide in by delta x. So I get nx^(n-1) plus, now it's O(delta x). There's at least one factor of delta x not two factors of delta x, because I have to cancel one of them. And now I can just take the limit. In the limit this term is gonna be 0. That's why I called it junk originally, because it disappears. And in math, junk is something that goes away. So this tends to, as delta x goes to 0, nx^(n-1). And so what I've shown you is that d/dx of x to the n minus-- sorry, n, is equal to nx^(n-1).
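The split into a main term plus junk can be made concrete with exact arithmetic. In this sketch the names `difference_quotient` and `junk` and the sample values x = 2, n = 5 are assumptions chosen for illustration. The explicit binomial coefficients come from the standard binomial theorem, which the lecture cites but does not write out in full.

```python
from fractions import Fraction
from math import comb


def difference_quotient(x, n, dx):
    """Delta f / delta x for f(x) = x**n, with x held fixed."""
    return ((x + dx) ** n - x ** n) / dx


def junk(x, n, dx):
    """Binomial terms with two or more factors of dx, after dividing by dx."""
    return sum(comb(n, k) * x ** (n - k) * dx ** (k - 1) for k in range(2, n + 1))


x, n = Fraction(2), 5           # arbitrary sample values
main_term = n * x ** (n - 1)    # the claimed derivative n*x^(n-1)

for k in range(1, 5):
    dx = Fraction(1, 10 ** k)
    dq = difference_quotient(x, n, dx)
    # Exact identity: difference quotient = main term + junk
    assert dq == main_term + junk(x, n, dx)
    print(f"delta x = {float(dx):.0e}   quotient = {float(dq):.6f}   "
          f"junk = {float(junk(x, n, dx)):.6f}")
```

Every junk term keeps at least one factor of delta x, so the junk column shrinks toward 0 while the quotient approaches n*x^(n-1).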
So now this is gonna be super important to you right on your problem set in every possible way, and I want to tell you one thing, one way in which it's very important. One way that extends it immediately. So this thing extends to polynomials. We get quite a lot out of this one calculation. Namely, if I take d/dx of something like (x^3 + 5x^10) that's gonna be equal to 3x^2, that's applying this rule to x^3. And then here, I'll get 5*10 so 50x^9. So this is the type of thing that we get out of it, and we're gonna make more hay with that next time. Question. Yes. I turned myself off. Yes?
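The polynomial example can be mirrored by applying the rule d/dx x^n = n x^(n-1) to each term. This is a minimal sketch under stated assumptions: the dictionary representation and the name `differentiate_polynomial` are hypothetical, the term-by-term and constant-multiple steps are taken for granted as in the lecture's example, and dropping constant terms (n = 0) relies on the standard fact that constants have derivative 0, which this lecture does not cover.

```python
def differentiate_polynomial(coeffs):
    """Differentiate a polynomial given as {exponent: coefficient}.

    Each term c * x**n becomes (c * n) * x**(n - 1); constant terms are dropped.
    """
    return {n - 1: c * n for n, c in coeffs.items() if n != 0}


poly = {3: 1, 10: 5}                        # x^3 + 5x^10
derivative = differentiate_polynomial(poly)
print(derivative)                           # exponents map to coefficients
```

For the lecture's example the result has coefficient 3 on x^2 and coefficient 50 on x^9.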
Professor: The question was the binomial theorem only works when delta x goes to 0. No, the binomial theorem is a general formula which also specifies exactly what the junk is. It's very much more detailed. But we only needed this part. We didn't care what all these crazy terms were. It's junk for our purposes now, because we don't happen to need any more than those first two terms. Yes, because delta x goes to 0. OK, see you next time.