The problem we face is that of finding the area between a curve described by the equation y = f(x) and the x-axis in a finite interval [a, b].
We use the same approach that we used to define the integral. When f is continuous on the interval, we divide it into N subintervals, each of width (b - a)/N, which we will call d (we assume b > a), and evaluate f at the endpoints of these subintervals, a + jd for j from 0 to N.
When f is only piecewise continuous, you should first break the interval into subintervals on which f is continuous and apply the procedure discussed here to each of them.
The trapezoid rule consists in approximating the area over each subinterval by its width d multiplied by the average of the values of f at its endpoints. For the j-th interval (j from 1 to N) this is
$$d \cdot \frac{f(a + (j-1)d) + f(a + jd)}{2}.$$
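A minimal Python sketch of the rule just described may help make it concrete. The function name `trapezoid` and the choice of sin x on [0, pi] with N = 32 are illustrative assumptions, not part of the original text.

```python
import math

def trapezoid(f, a, b, n):
    """Approximate the area under f on [a, b] using n equal subintervals."""
    d = (b - a) / n  # width of each subinterval
    total = 0.0
    for j in range(1, n + 1):
        left = a + (j - 1) * d   # left endpoint of the j-th subinterval
        right = a + j * d        # right endpoint of the j-th subinterval
        # width times the average of f at the two endpoints
        total += d * (f(left) + f(right)) / 2
    return total

# Illustrative use: the exact area under sin x from 0 to pi is 2.
approx = trapezoid(math.sin, 0.0, math.pi, 32)
print(approx)
```

Each interior endpoint is evaluated twice in this loop; that is wasteful but keeps the code in one-to-one correspondence with the per-interval description.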
We now address the question: why do this?
Is it better to use the trapezoid rule than, say, to choose a point x' at random in the j-th interval and use d * f(x') as the contribution to the area from that interval?
To get a partial answer to this question, consider one interval of width d centered at $x_j$, and for convenience set $x_j = 0$.
Then the interval begins at $-d/2$ and ends at $d/2$.
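Suppose now that we can expand our integrand f in a power series about x = 0, with the result (here a and b denote series coefficients, not the endpoints of the original interval)
$$f(x) = f(0) + ax + bx^2 + cx^3 + ex^4 + \cdots$$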
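The actual area under this function in this interval will be
$$\int_{-d/2}^{d/2} f(x)\,dx = f(0)\,d + \frac{b\,d^3}{12} + \frac{e\,d^5}{80} + \cdots$$
(Here the factor $\frac{1}{12}$ arises as follows: there are identical contributions from the endpoints $d/2$ and $-d/2$, which are each $\frac{1}{3}\left(\frac{d}{2}\right)^3 = \frac{d^3}{24}$.
The factor $\frac{1}{80}$ comes about similarly. Notice that the ax and other odd power terms in f do not contribute at all here because they are odd and their contributions cancel out.)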
The trapezoid rule that we have described, on the other hand, gives the following proposed answer for this area:
$$d \cdot \frac{f(-d/2) + f(d/2)}{2} = f(0)\,d + \frac{b\,d^3}{4} + \frac{e\,d^5}{16} + \cdots$$
The contribution from f(0) is exactly right, that from b is a factor of three too large, and that from e is a factor of 5 too large.
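The factors of 3 and 5 can be checked with exact rational arithmetic. In the sketch below, the function names `exact_area` and `trapezoid_area` are illustrative, and the width d = 1/2 is an arbitrary choice; any positive width gives the same ratios.

```python
from fractions import Fraction

def exact_area(power, d):
    """Exact integral of x**power over [-d/2, d/2]; valid for even powers only."""
    return 2 * (d / 2) ** (power + 1) / (power + 1)

def trapezoid_area(power, d):
    """Width d times the average of x**power at the endpoints -d/2 and d/2."""
    return d * ((-d / 2) ** power + (d / 2) ** power) / 2

d = Fraction(1, 2)  # arbitrary interval width (an assumption for illustration)

ratio_b = trapezoid_area(2, d) / exact_area(2, d)  # x**2 term
ratio_e = trapezoid_area(4, d) / exact_area(4, d)  # x**4 term
odd_term = trapezoid_area(3, d)                    # x**3 term cancels by symmetry

print(ratio_b, ratio_e, odd_term)
```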
Notice that any estimation here that is symmetric about 0 will get the odd (a, c, ...) terms right.
In particular we could use the "midpoint rule" which approximates the area as f(0)d, and this gets the a, c, ... contributions right but gets nothing at all from the b, e, ... terms.
The symmetry which causes the a and c terms to cancel out is the great advantage that the trapezoid rule possesses here and it shares it with the midpoint rule.
Consider what happens if we decrease d by a factor of 2 (or any other factor z). Because of this symmetry, the leading error in the trapezoid rule or the midpoint rule comes from the b term, which is proportional to $d^3$, so it goes down by a factor of $2^3 = 8$ for each interval; but now there are two intervals where only one was before.
This error must therefore be doubled to compare over the original interval, so the actual error in either the trapezoid rule or the midpoint rule goes down by a factor of 4 over the original interval from the b term, and by even more from the e and further terms (a factor of 16 from the e term, which is proportional to $d^5$).
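This scaling can be observed numerically. The following sketch, which assumes the test integrand sin x on [0, pi] (exact area 2) and uses illustrative function names, computes the ratio of successive errors as N doubles for both the trapezoid rule and the midpoint rule.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoid rule with n equal subintervals."""
    d = (b - a) / n
    return sum(d * (f(a + (j - 1) * d) + f(a + j * d)) / 2
               for j in range(1, n + 1))

def midpoint(f, a, b, n):
    """Midpoint rule: width times f at the center of each subinterval."""
    d = (b - a) / n
    return sum(d * f(a + (j - 0.5) * d) for j in range(1, n + 1))

def error_ratios(rule, f, a, b, exact, counts=(1, 2, 4, 8, 16, 32)):
    """Ratio of each error to the next as the number of subintervals doubles."""
    errors = [abs(rule(f, a, b, n) - exact) for n in counts]
    return [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]

# Assumed test case: the area under sin x from 0 to pi is exactly 2.
trap_ratios = error_ratios(trapezoid, math.sin, 0.0, math.pi, 2.0)
mid_ratios = error_ratios(midpoint, math.sin, 0.0, math.pi, 2.0)
print(trap_ratios)
print(mid_ratios)
```

For small d the ratios should settle near 4, since the b term dominates; the early ratios differ because the higher-order terms are not yet negligible.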
25.1 Set up a spreadsheet that divides a given interval from a to b into N equal subintervals, evaluates a given function, say sin x, at each of the N + 1 subinterval endpoints, and calculates the trapezoid rule estimate of the resulting integral.
25.2 Make a spreadsheet with the capability of computing this evaluation for N = 1, 2, 4, 8, 16, and 32 simultaneously. (You can put them next to each other by starting with 32 and entering the instruction =if(mod(j,2^k)=0,2*prev column entry,0), with j the index of the subinterval end and k the column index. Each increase in k by 1 will decrease the number of trapezoid intervals by a factor of 2.)