**Topics covered:** Work, average value, probability

**Instructor:** Prof. David Jerison

Lecture 23: Work, Probability

The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation, or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: Today we're going to hold off just a little bit on boiling water. And talk about another application of integrals, and we'll get to the witches' cauldron in the middle. The thing that I'd like to start with today is average value. This is something that I mentioned a little bit earlier, and there was a misprint on the board, so I want to make sure that we have the definitions straight. And also the reasoning straight. This is one of the most important applications of integrals, one of the most important examples.

If you take the average of a bunch of numbers, that looks like this. And we can view this as sampling a function, as we would with the Riemann sum. And what I said last week was that this tends to this expression here, which is called the continuous average. So this guy is the continuous average. Or just the average of f. And I want to explain that, just to make sure that we're all on the same page. In general, if you have a function and you want to interpret the integral, our first interpretation was that it's something like the area under the curve. But average value is another reasonable interpretation. Namely, if you take equally spaced points here, starting with x_0, x_1, x_2, all the way up to x_n, which is the right endpoint b, and then we have values y_1, which is f(x_1); y_2, which is f(x_2); all the way up to y_n, which is f(x_n). And again, the spacing here that we're talking about is (b-a) / n. So remember that spacing, that's going to be the connection that we'll draw.

Then the Riemann sum is y_1 through y_n, the sum of y_1 through y_n, multiplied by delta x. And that's what tends, as delta x goes to 0, to the integral. The only change in point of view if I want to write this limiting property, which is right above here, the only change between here and here is that I want to divide by the length of the interval. b - a. So I will divide by b - a here. And divide by b - a over here. And then I'll just check what this thing actually is. Delta x / (b-a), what is that factor? Well, if we look over here to what delta x is, if you divide by b - a, it's 1 / n. So the factor delta x / (b-a) is 1 / n. That's what I put over here, the sum of y_1 through y_n divided by n. And as this tends to 0, it's the same as n going to infinity. Those are the same things. The average value and the integral are very closely related. There's only this difference that we're dividing by the length of the interval.
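
The board formulas are not captured in the transcript. A reconstruction from the spoken description, assuming equally spaced sample points x_i with spacing delta x = (b-a)/n and y_i = f(x_i), might read:

```latex
\frac{y_1 + y_2 + \cdots + y_n}{n}
  = \frac{1}{b-a}\sum_{i=1}^{n} f(x_i)\,\Delta x
  \;\longrightarrow\;
  \frac{1}{b-a}\int_a^b f(x)\,dx
  \quad (n \to \infty),
\qquad \Delta x = \frac{b-a}{n}.
```

The middle expression is the Riemann sum divided by b - a, and the factor delta x / (b-a) is exactly the 1/n on the left.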

I want to give an example which is an incredibly simpleminded one, but it'll come into play later on. So let's take the example of a constant. And this is, I hope-- will make you slightly less confused about what I just wrote. As well as making you think that this is as simpleminded and reasonable as it should be. If I check what the average value of this constant is, it's given by this relatively complicated formula here. That is, I have to integrate the function c. Well, it's just the constant c. And however you do this, as an antiderivative or as thinking of it as a rectangle, the answer that you're going to get is c here. So work that out. The answer is c. And so the fact that the average of c is equal to c, which had better be the case for averages, explains the denominator. Explains the 1 / (b-a) there. That's cooked up exactly so that the average of a constant is what it's supposed to be. Otherwise we have the wrong normalizing factor. We've clearly got a piece of nonsense on our hands. And incidentally, it also explains the 1/n in the very first formula that I wrote down. The reason why this is called the average, or one reason why it's the right thing, is that if you took the same constant c, for y all the way across there n times, if you divide it by n, you get back c. That's what we mean by average value and that's why the n is there.

So that was an easy example. Now none of the examples that we are going to give are going to be all that complicated. But they will get sort of steadily more sophisticated. The second example is going to be the average height of a point on a semicircle. And maybe I'll draw a picture of the semicircle first here. And we'll just make it the standard circle, the unit circle. So maybe I should have called it a unit semicircle. This is the point negative 1, this is the point 1. And we're picking a point over here and we're going to take the typical, or the average, height here. Integrating with respect to dx. So sort of continuously with respect to dx. Well, what is that? Well, according to the rule, it's 1/(b-a) times - sorry, it's up here in the box. 1/(b-a), the integral from a to b, f(x) dx. That's 1 / (+1 - (-1)). The integral from - 1 to 1, square root of 1 - x^2, dx. Right, because the height is y is equal to-- this is y is equal to the square root of 1 - x^2.

And to evaluate this is not as difficult as it seems. This is 1/2 times this quantity here, which we can interpret as an area. It's the area of the semicircle. So this is the area of the semicircle, which we know to be half the area of the circle. So it's pi/2. And so the answer, here the average height, is pi/4. Now, later in the class and actually not in this unit, we'll actually be able to calculate the antiderivative of this. So in other words, we'll be able to calculate this analytically. For right now we just have the geometric reason why the value of this is pi/2. And we'll do that in the fourth unit when we do a lot of techniques of integration. So here's an example. Turns out, the average height of this is pi/4.
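
As a numerical sanity check on the geometric argument, here is a minimal Python sketch. The helper `average_value` and its midpoint Riemann sum are illustrative choices, not something from the lecture; it approximates the average of the square root of 1 - x^2 over [-1, 1] and compares it with pi/4.

```python
import math

def average_value(f, a, b, n=100_000):
    """Approximate (1/(b-a)) * integral of f over [a, b] with a midpoint Riemann sum."""
    dx = (b - a) / n
    total = sum(f(a + (i + 0.5) * dx) for i in range(n))
    return total * dx / (b - a)

# Average height of the unit semicircle, averaging with respect to x.
avg_height_x = average_value(lambda x: math.sqrt(1 - x * x), -1.0, 1.0)

print(avg_height_x)   # numerical estimate
print(math.pi / 4)    # value obtained from the area of the semicircle
```

The two printed numbers should agree to several decimal places.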

Now, the next example that I want to give introduces a little bit of confusion. And I'm not going to resolve this confusion completely, but I'm going to try to get you used to it. I'm going to take the average height again. But now, with respect to arc length. Which is usually denoted theta. Now, this brings up an extremely important feature of averages. Which is that you have to specify the variable with respect to which the average is taking place. And the answer will be different depending on the variable. So it's not going to be the same. Wow, can't spell the word length here. Just like the plural of witches the last time. We'll work on that. We'll fix all of our, that's an ancient Gaelic word, I think. Lengh.

So now, let me show you that it's not quite the same here. It's especially exaggerated if maybe I shift this little interval dx over to the right-hand end. And you can see that the little portion that corresponds to it, which is the d theta piece, has a different length from the dx piece. And indeed, as you come down here, these very short portions of dx length have much longer portions of theta length. So that the average that we're taking when we do it with respect to theta is going to emphasize the low values more. They're going to be more exaggerated. And the average should be lower than the average that we got here. So we should expect a different number. And it's not going to be pi / 4, it's going to be something else. Whatever it is, it should be smaller than pi / 4. Now, let's set up the integral. The integral follows the same rule. It's just 1 over the length of the interval times the integral over the interval of the function. That's the integral, but now where does theta range? This time, theta goes from 0 to pi. So the integral is from 0 to pi. And the thing we divide by is pi. And the integration requires us to know the formula for the height. Which is sin theta. In terms of theta, of course. It's the same as square root of 1 - x^2, but it's expressed in terms of theta. So it's this. And here's our average. I'll put this up here. So that's the formula for the height.

So let's work it out. This one, we have the advantage of being able to work out because we know the antiderivative of sin theta. It happens with this factor of pi, it's -cos theta. And so, that's -1/pi cos pi-- sorry. (cos pi - cos 0). Which is -1/pi (-2), which is 2 / pi. And sure enough, if you check it, you'll see that 2 / pi < pi / 4, because pi^2 > 8. Yeah, question.
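
The same kind of check can be run for the average with respect to theta. This is a sketch under the same assumptions as any midpoint-rule estimate (the helper `average_value` is hypothetical, not from the lecture); it averages sin theta over [0, pi] and compares the result with 2/pi and with pi/4.

```python
import math

def average_value(f, a, b, n=100_000):
    """Approximate (1/(b-a)) * integral of f over [a, b] with a midpoint Riemann sum."""
    dx = (b - a) / n
    total = sum(f(a + (i + 0.5) * dx) for i in range(n))
    return total * dx / (b - a)

# Average height of the unit semicircle, averaging with respect to arc length theta.
avg_height_theta = average_value(math.sin, 0.0, math.pi)

print(avg_height_theta)               # numerical estimate
print(2 / math.pi)                    # value from the antiderivative -cos(theta)
print(avg_height_theta < math.pi / 4) # the theta-average is the smaller one
```

The comparison 2/pi < pi/4 is the same statement as pi^2 > 8.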

STUDENT: [INAUDIBLE]

PROFESSOR: The question is how do I get sin theta. And the answer is, on this diagram, if theta is over here, then this height is this, and this is the angle theta, then the height is the sine. OK. Another question.

STUDENT: [INAUDIBLE]

PROFESSOR: The question is, what is the first one, the first one is an average of height, of a point on a semicircle and this one is with respect to x. So what this reveals is that it's ambiguous to say what the average value of something is, unless you've explained what the underlying averaging variable is.

STUDENT: [INAUDIBLE]

PROFESSOR: The next question is how should you interpret this value. That is, what came out of this calculation? And the answer is only sort of embedded in this calculation itself. So here's a way of thinking of it which is anticipating our next subject. Which is probability. Which is, suppose you picked a number at random in this interval. With equal likelihood, one place and another. And then you saw what height was above that. That would be the interpretation of this first average value. And the second one is, I picked something at random on this circle. And equally likely, any possible point on this circle according to its length. And then I ask what the height of that point is. And those are just different things. Another question.

STUDENT: [INAUDIBLE]

PROFESSOR: cos pi, shouldn't it be 0? No. cos of-- it's -1. cos pi is -1. Cosine, sorry. No, cos 0 = 1. cos pi = -1. And so they cancel. That is, they don't cancel. It's -1 - 1, which is -2. Key point. Yeah.

STUDENT: [INAUDIBLE]

PROFESSOR: All right, let me repeat. So the question was to repeat the reasoning by which I guessed in advance that probably this was going to be the relationship between the average value with respect to arc length versus the average value with respect to this horizontal distance. And it had to do with the previous way this diagram was drawn. Which is comparing an interval in dx with an interval in theta. A little section in theta. And when you're near the top, they're nearly the same. That is, it's more or less balanced. It's a little curved here, a little different. But here it becomes very exaggerated. The d theta lengths are much longer than the dx lengths. Which means that the importance given by the theta variable to these parts of the circle is larger, relative to these parts. Whereas if you look at this section versus this section for the dx, they give equal weights to these two equal lengths. But here, with respect to theta, this is relatively short and this is much larger. So, as I say, the theta variable's emphasizing the lower parts of the semicircle more. That's because this length is shorter and this length is longer. Whereas these two are the same. It's a balancing act of the relative weights.

I'm going to say that again in a different way, and maybe this will-- The lower part is more important for theta.

STUDENT: [INAUDIBLE]

PROFESSOR: So the question is, but shouldn't it have a bigger value because it's a longer length. Never with averages. Whatever the length is, we're always dividing. We're always compensating by the total. We have the integral from 0 to pi, but we're dividing by pi. Here we had the integral from -1 to 1, but we're dividing by 2. So we divide by something different each time. And this is very, very important. It's that the average of a constant is that same constant regardless of which one we did. So if it were a constant, we would always compensate for the length. So the length never matters. If it's the integral from 0 to 1,000,000, or 100, let's say, 1/100 c dx, it's just the same. It's always that, it doesn't matter how long it is. Because we compensate. That's really the difference between an integral and an average, is that we're dividing by the total.

Now I want to introduce another notion, which is actually what's underlying these two examples that I just wrote down. And this is by far the one which you should emphasize the most in your thoughts. Because it is much more flexible, and is much more typical of real life problems. So the idea of a weighted average is the following. You take the integral, say from a to b, of some function. But now you multiply by a weight. And you have to divide by the total. And what's the total going to be? It's the integral from a to b of this total weighting that we have. Now, why is this the correct notion? I'm going to explain it to you in two ways. The first is this very simpleminded thing that I wrote on the board there, with the constants. What we want is the average value of c to be c. Otherwise this makes no sense as an average. Now, let's just look at this definition here. And see that that's correct. If you integrate c, from a to b, w(x) dx, and you divide by the integral from a to b, w(x) dx, not surprisingly, the c factors out. It's a constant. So this is c times the integral a to b, w(x) dx, divided by the same thing. And that's why we picked it. We picked it so that these things would cancel. And this would give c. So in the previous case, this property explains the denominator. And again over here, it explains the denominator.
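
The weighted-average formula described here is written on the board rather than spoken. Reconstructed from the description, with f the function being averaged and w(x) the weight, it would look something like:

```latex
\text{weighted average of } f
  = \frac{\int_a^b f(x)\,w(x)\,dx}{\int_a^b w(x)\,dx},
\qquad
\frac{\int_a^b c\,w(x)\,dx}{\int_a^b w(x)\,dx}
  = \frac{c\int_a^b w(x)\,dx}{\int_a^b w(x)\,dx}
  = c.
```

Taking w(x) = 1 recovers the ordinary average, since the denominator is then b - a.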

And let me just give you one more explanation. Which is maybe a real-life-- pretend real-life example. You have a stock which you bought for $10 one year. And then six months later you bought some more for $20. And then you bought some more for $30. Now, what's the average price of your stock? Well, it depends on how many shares you bought. If you bought this many shares the first time, and this many shares the second time, and this many shares the third time, this is the total amount that you spent. And the average price is the total price divided by the total number of shares. And this is the discrete analog of this continuous averaging process here. The function f now, so I use w for weight, the function f now is the function whose values are 10, 20 and 30. And the weightings are the relative importance of the different purchases.
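
A small Python sketch of the stock example follows. The prices 10, 20 and 30 are from the lecture; the share counts are hypothetical, since the lecture only says "this many shares", and are chosen purely for illustration.

```python
def weighted_average(values, weights):
    """Discrete weighted average: sum of value * weight, divided by the total weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

prices = [10.0, 20.0, 30.0]  # purchase prices from the lecture (the values of f)
shares = [100, 50, 10]       # hypothetical share counts (the weights w_i)

avg_price = weighted_average(prices, shares)
print(avg_price)  # total spent (2300) divided by total shares (160)

# The average of a constant is that constant, whatever the weights are.
print(weighted_average([7.0, 7.0, 7.0], shares))
```

With these made-up weights the average price sits much closer to $10 than to $30, because most of the shares were bought at the low price.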

So again, these w_i's are weights. There was another question. Out in the audience, at some point. Over here, yes.

STUDENT: [INAUDIBLE]

PROFESSOR: Very, very good point. So in this numerator here, the statement is-- in this example, we factored out c. But here we cannot factor out f(x). That's extremely important and that is the whole point. So, in other words, the weighted average is very interesting - you have to do two different integrals to figure it out in general. When it happens that this is c, it's an extremely boring integral. Which in fact, because it's an average, you don't even have to calculate at all. Factor it out and cancel these things and never bother to calculate these two numbers. So these massive numbers just cancel. So it's a very special property of a constant, that it factors out.

That was our first discussion, and now with this example I'm going to go back to the heating up of the witches' cauldron and we'll use average value to illustrate the integral that we get in that context as well. I remind you, let's see. The situation with the witches' cauldron was this. The first important thing is that there were-- so this is the big cauldron here. This is the one whose height is 1 meter and whose width is 2 meters. And it's a parabola of revolution here. And it had about approximately 1600 liters in it. And this curve was y = x^2. And the situation that I described at the end of last time was that the initial temperature was T = 0 degrees Celsius. And the final temperature, instead of being a constant temperature, we were heating this guy up from the bottom. And it was hotter on the bottom than on the top. And the final temperature was given by the formula T is equal to 100 minus 30 times the height y. So at y = 0, at the bottom, it's 100. And at the top, T = 70 degrees. OK, so this is the final configuration for the temperature.

And the question was how much energy do we need. So, the first observation here, and this is the reason for giving this example, is that it's important to realize that you want to use the method of disks in this case. The reason-- So it doesn't have to do with, you shouldn't think of the disks first. But what you should think of is the horizontal. We must use horizontals because T is constant on horizontals. It's not constant on verticals. If we set things up with shells, as we did last time, to compute the volume of this, then T will vary along the shell. And we will still have an averaging problem, an integral problem along the vertical portion. But if we do it this way, T is constant on this whole level here. And so there's no more calculus involved in calculating what the contribution is of any given level. So T is constant on horizontals. Actually, in disguise, this is that same trick that we have here. We can factor constants out of integrals. You could view it as an integral, but the point is that it's more elementary than that.

Now I have to set it up for you. And in order to do that, I need to remember what the equation is. Which is y = x^2. And the formula for the total amount of energy is going to be volume times the number of degrees. That's going to be equal to the energy that we need here. And so let's add it up. It's the integral from 0 to 1, and this is with respect to y. So the y level goes from 0 to 1. This top level is y = 1, this bottom level is y = 0. And the disk that we get, this is the point (x, y) here, is rotated around. And its radius is x. So the thickness is dy, and the area of the disk is pi x^2. And the thing that we're averaging is T. Well, we're not yet averaging, we're just integrating it. We're just adding up the total.

Now I'm just going to plug in the various values for this. And what I'm going to get is T, again, is 100 - 30y. And this radius is measured up to this very end. So x^2 = y. So this is pi y dy. And this is the integral that we'll be able to evaluate. Yeah, question.

STUDENT: [INAUDIBLE]

PROFESSOR: All right. Well, let's carry this out. Let's finish off the calculation here. Let's see. This is equal to, what is it equal to? Well, I'll put it over here. It's equal to 50 pi y^2 minus-- right, because this is 100 pi y, and then there's a 30, this is 100 pi y - 30 pi y^2, and I have to take the antiderivative of that. So I get 50 pi y^2, minus 10 pi y^3, evaluated at 0 and 1. And that is 40 pi. Now, I spent a tremendous amount of time last time focusing on units. Because I want to tell you how to get a realistic number out of this. And there's a subtle point here that I pointed out last time that had to do with changing meters to centimeters. I claim that I've treated those correctly. So, what we have here is that the answer is in degrees, that is Celsius, times cubic meters. These are the correct units. And now, I can translate this into-- Celsius is spelled with a C. That's interesting. Celsius. I can translate this into units that you're more familiar with. So let's try 40 pi degrees times m^3, and then do the conversion factors.

First of all there's one calorie per degree times a milliliter. That's one conversion. And then let's see. I'm going to have to translate from centimeters so I have here (100 cm / m)^3. So these are the two conversion factors that I need. And so, I get 40 pi 10^6, that's 100^3. And this is in calories. So how much is this? Well, it's a little better, maybe, to do it in 40 pi * 1,000 kilocalories, because these are the ones that you actually see on your nutrition labels of foods. And so this number, 40 pi, is around 125 or so. Let's see, is that about right? Let's make sure I've got these numbers right. Yeah, 40 times pi is about 125, so this is about 125,000 kilocalories. And so one candy bar-- This is a Halloween example, so. One candy bar is about 250 kilocalories. So 125 is half a candy bar, and with the factor of 1,000 the answer to our question is that it takes about 500 candy bars to heat up this thing. OK, so that's our example. Now, yeah. Question.
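
The arithmetic above can be checked with a short Python sketch. The midpoint Riemann sum and the function name `energy_degree_m3` are illustrative assumptions; the integrand (100 - 30y) pi y, the conversion factors, and the 250 kilocalories per candy bar are the ones quoted in the lecture.

```python
import math

def energy_degree_m3(n=100_000):
    """Midpoint Riemann sum of T(y) * pi * y dy for y from 0 to 1, with T = 100 - 30y."""
    dy = 1.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * dy
        total += (100 - 30 * y) * math.pi * y
    return total * dy

energy = energy_degree_m3()        # in degrees Celsius times cubic meters, about 40*pi
calories = energy * 100 ** 3       # 1 calorie per degree per cm^3, and (100 cm / m)^3
kilocalories = calories / 1000
candy_bars = kilocalories / 250    # 250 kcal per candy bar, as quoted in the lecture

print(energy, 40 * math.pi)
print(round(kilocalories))
print(round(candy_bars))
```

The candy bar count comes out slightly above 500, consistent with the rounded figure of 125 for 40 pi.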

STUDENT: [INAUDIBLE]

PROFESSOR: What does the integral give us? This integral is-- the integral represents the following things. So the question is, what does this integral give us. So here's the integral. Here it is, rewritten so that it can be calculated. And what this integral is giving us is the following thing. You have to imagine the following idea. You've got a little chunk of water in here. And you're going to raise it from 0 degrees all the way up to whatever the target temperature is. And so that little milliliter of water, if you like, has to be raised from 0 to some number which is a function of the height. It's something between 70 and 100 degrees. And the one right above it also has to be raised to a temperature, although a slightly different temperature. And what we're doing with the integral is we're adding up all of those degrees and the calorie represents how much it takes, one calorie represents how much it takes to raise by 1 degree 1 milliliter of water. One cubic centimeter of water. That's the definition of a calorie. And we're adding it up. So in other words, each of these cubes is one thing. And now we have to add it up over this massive thing, which is 1600 liters. And we have a lot of different little cubes. And that's what we did. When we glommed them all together. That's what the integral is doing for us. Other questions.

Now I want to connect this with weighted averages before we go on. Because that was the reason why I did weighted averages first. I'm going to compute also the average final temperature. So, final because this is the interesting one, the average starting temperature's very boring, it's 0. The average final temperature is-- individually the temperatures are different. And the answer here is it's the integral from 0 to 1 of T pi y dy divided by the integral from 0 to 1 of pi y dy. So this is the total temperature, weighted appropriately to the volume of water that's involved at that temperature, divided by the total volume of water. And we computed these two numbers. The number in the numerator is what we call 40 pi. And the number in the denominator, actually this is easier than what we did last time with shells; you can just look at this and see that it's the area under a triangle. It's pi / 2. And so the answer here is 80 degrees. This is the average temperature. Note that this is a weighted average. The weighting here is different according to the height. The weighting factor is pi y. That's the weighting factor. And that's not surprising. When y is small, there's less volume down here. Up above, those are more important volumes, because there's more water up at the top of the cauldron than there is down at the bottom of the cauldron. If you compare this to the ordinary average, if you take the maximum temperature plus the minimum temperature, divided by 2, that would be (100 + 70) / 2. You would get 85 degrees. And that's bigger. Why? Because the cooler water is on top. And the actual average, the correct weighted average, is lower than this fake average. Which is not the true average in this context. All right so the weighting is that the thing is getting fatter near the top.
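
To see the weighted average and the "fake" average side by side, here is a minimal Python sketch. The helper names `integrate`, `temperature` and `weight` are hypothetical; the temperature profile 100 - 30y and the weighting factor pi y are taken from the lecture.

```python
import math

def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def temperature(y):
    return 100 - 30 * y   # final temperature at height y

def weight(y):
    return math.pi * y    # area of the disk at height y (x^2 = y)

# Weighted by volume: total degree-volume divided by total volume.
weighted_avg = integrate(lambda y: temperature(y) * weight(y), 0, 1) / integrate(weight, 0, 1)

# Unweighted average over the height, which equals (100 + 70) / 2 for a linear profile.
plain_avg = integrate(temperature, 0, 1) / (1 - 0)

print(weighted_avg)  # about 80
print(plain_avg)     # about 85
```

The weighted value is lower because the wide disks near the top, where the water is cooler, carry more volume.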

So now I'm going to do another example of weighted average. And this example is also very much worth your while. It's the other incredibly important one in interpreting integrals. And it's a very, very simple example of a function f. The weightings will be different, but the function f, will be of a very particular kind. Namely, the function f will be practically a constant. But not quite. It's going to be a constant on one interval, and then 0 on the rest. So we'll do those weighted averages now. And this subject is called probability.

In probability, what we do, so I'm just going to give some examples here. I'm going to pick a point - in quotation marks - at random. In the region 0 < y < 1 - x^2. That's this shape here. Well, let's draw it right down here. For now. So, somewhere in here. Some point, (x, y). And then I need to tell you what this random really means. This is proportional to area, if you like. So area inside of this section. And then the question that we're going to answer right now is, what is the chance that - or, it's usually called probability - that x > 1/2. Let me show you what's going on here.

And this is always the case with things in probability. So, first of all, we have a name for this. This is called the probability that x > 1/2. And so that's what it's called in our notation here. And what it is, is the probability is always equal to the part divided by the whole. It's a ratio just like the one over there. And which is the part and which is the whole? Well, in this picture, the whole is the whole parabola. And the part is the section x > 1/2. And it's just the ratio of those two areas. Let's write that down. That's the integral from 1/2 to 1 of (1 - x^2) dx, divided by the integral from -1 to 1, (1 - x^2) dx. And again, the weighting factor here is 1 - x^2. And to be a little bit more specific here, the starting point a = -1 and the endpoint is +1. So this is P(x > 1/2). And if you work it out, it turns out to be 5/18, we won't do it. Yeah.

STUDENT: [INAUDIBLE]

PROFESSOR: What we're trying to do with probability. So I can't repeat your question. But I can try to say-- because it was a little bit too complicated. But it was not correct, OK. What we're taking is, we have two possible things that could happen. Either, let's put it this way. Let's make it a gamble. Somebody picks a point in here at random. And we're trying to figure out what your chances are of winning. In other words, the chances the person picks something in here versus something in there. And the interesting thing is, so what percent of the time do you win. The answer is it's some fraction of 1. And in order to figure that out, I have to figure out the total area here. Versus the total of the entire, all the way from -1 to 1, the beginning to the end. So in the numerator, I put success, and in the denominator I put all possibilities. So that-- Right?

STUDENT: [INAUDIBLE]

PROFESSOR: And that's the interpretation of this. So maybe I didn't understand your question.

STUDENT: [INAUDIBLE]

PROFESSOR: Ah, why is 1 - x^2 the weighting factor. That has to do with how you compute areas under curves. The curve here is y = 1 - x^2. And so, in order to calculate how much area is between 1/2 and 1, I have to integrate. That's the interpretation of this. This is the area under that curve. This integral. And the denominator's the area under the whole thing. OK, yeah. Another question.

STUDENT: [INAUDIBLE]

PROFESSOR: Ah. Yikes. It was supposed to be the same question as over here. Thank you.

STUDENT: [INAUDIBLE]

PROFESSOR: This has something to do with weighting factors. Here's the weight factor. Well, it's the relative importance from the point of view of this probability of these places versus those. That is, so this is a weighting factor because it's telling me that in some sense this number 5/18-- actually that makes me think that this number is probably wrong. Well, I'll let you calculate it out. It looks like it should be less than 1/4 here, because this is 1/4 of the total distance and there's a little less in here than there is in the middle. So in fact it probably should be less than 1/4, the answer.
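
Carrying out the calculation confirms the suspicion. The sketch below is not from the lecture; it uses Python's exact `Fraction` arithmetic and the antiderivative x - x^3/3 to evaluate the ratio of the integral of 1 - x^2 from 1/2 to 1 to the integral from -1 to 1.

```python
from fractions import Fraction

def area_under_parabola(a, b):
    """Exact integral of 1 - x^2 from a to b, using the antiderivative x - x^3/3."""
    def antiderivative(x):
        return x - x ** 3 / 3
    return antiderivative(Fraction(b)) - antiderivative(Fraction(a))

part = area_under_parabola(Fraction(1, 2), 1)   # area with x > 1/2
whole = area_under_parabola(-1, 1)              # area of the whole region
probability = part / whole

print(part, whole, probability)  # 5/24 4/3 5/32
```

The ratio works out to 5/32 rather than 5/18, and 5/32 is indeed less than 1/4.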

STUDENT: [INAUDIBLE]

PROFESSOR: The equation of the curve is 1 - x^2. The reason why it's the weighting factor is that we're interpreting-- The question has to do with the area under that curve. And so, this is showing us how much is relatively important versus how much is not. This is-- These parts are relatively important, these parts are less important. According to area. Because we've said that area is the way we're making the choice. So I don't have quite enough time to tell you about my next example. Instead, I'm just going to tell you what the general formula is. And we'll do our example next time. I'll tell you what it's going to be.

So here's the general formula for probability here. We're going to imagine that we have a total range which is maybe going from a to b, and we have some intermediate values x_1 and x_2, and then we're going to try to compute the probability that some variable that we picked at random occurs between x_1 and x_2. And by definition, we're saying that it's an integral. It's the integral from x_1 to x_2 of the weight dx, divided by the integral all the way from a to b. Of the weight. So, again, this is the part divided by the whole. And the relationship between this and the weighted average that we had earlier was that the function f(x) is kind of a strange function. It's 0 and 1. It's just-- The picture, if you like, is that you have this weighting factor. And it's going from a to b. But then in between there, we have the part that we're interested in. Which is between x_1 and x_2. And it's the ratio of this inner part to the whole thing that we're interested in.
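
The connection between the probability formula and the weighted average can be sketched in Python. All function names here are hypothetical, and the weight 1 - x^2 on [-1, 1] is borrowed from the parabola example only as a test case; the point is that the part-over-whole ratio equals the weighted average of a function that is 1 between x_1 and x_2 and 0 elsewhere.

```python
def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def probability(w, a, b, x1, x2):
    """Part divided by whole: integral of w over [x1, x2] over integral of w over [a, b]."""
    return integrate(w, x1, x2) / integrate(w, a, b)

def weighted_average(f, w, a, b):
    """Integral of f * w over [a, b], divided by the integral of w over [a, b]."""
    return integrate(lambda x: f(x) * w(x), a, b) / integrate(w, a, b)

def weight(x):
    return 1 - x * x  # weighting factor from the parabola example

def indicator(x):
    return 1.0 if 0.5 < x < 1.0 else 0.0  # the "0 and 1" function f

p_direct = probability(weight, -1.0, 1.0, 0.5, 1.0)
p_as_average = weighted_average(indicator, weight, -1.0, 1.0)

print(p_direct, p_as_average)  # both approximate the same probability
```

Both numbers approximate the same value, which illustrates why probability is described as a special case of weighted average.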

Tomorrow I'm going to try to do a realistic example. And I'm going to tell you what it is, but we'll take it up tomorrow. I told you it was going to be tomorrow, but we still have a whole minute, so I'm going to tell you what the problem is.

So this is going to be a target practice problem. You have a target here and you're throwing darts at this target. And so you're throwing darts at this target. And somebody is standing next to the dartboard. Your little brother is standing next to the dartboard here. And the question is, how likely you are to hit your little brother. So this will, let's see. You'll see whether you like that or not. Actually, I was the little brother. So, I don't know which way you want to go. We'll go either way. We'll find out next time.
