## 9.4 The Gradient in Polar Coordinates and other Orthogonal Coordinate Systems

Suppose we have a function given to us as $f(x, y)$ in two dimensions or as $g(x, y, z)$ in three dimensions. We can take the partial derivatives with respect to the given variables and arrange them into a vector function of the variables called the gradient of $f$ (or of $g$), namely

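$\left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \text{ or } \left( \frac{\partial g}{\partial x}, \frac{\partial g}{\partial y}, \frac{\partial g}{\partial z} \right)$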

which mean

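$\frac{\partial f}{\partial x} \hat{i} + \frac{\partial f}{\partial y} \hat{j} \text{ or } \frac{\partial g}{\partial x} \hat{i} + \frac{\partial g}{\partial y} \hat{j} + \frac{\partial g}{\partial z} \hat{k}$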

Suppose, however, that we are given $f$ as a function of $r$ and $\theta$, that is, in polar coordinates (or $g$ in spherical coordinates, as a function of $\phi$, $\theta$, and $\rho$).

For example, suppose $f = \frac{1}{r}$, or $g = \frac{1}{\rho}$, or $g = \sin\theta$.

How do we find the gradient of $f$ or $g$ ?

One way to find the gradient of such a function is to convert $r$ or $ρ$ or $θ$ into rectangular coordinates using the appropriate formulae for them, and perform the partial differentiation on the resulting expressions.

Thus we can write

$\frac{1}{r} = (x^2 + y^2)^{-1/2}$

and find, by ordinary partial differentiation,

$\vec{\nabla} \frac{1}{r} = -\frac{x}{r^3} \hat{i} - \frac{y}{r^3} \hat{j}$
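The result can be checked numerically. The following is a minimal sketch, not part of the original notes: it compares a central-difference estimate of the gradient of $1/r$ with the closed form $(-x/r^3, -y/r^3)$. The function names, the step size `h`, and the sample point are all illustrative choices.

```python
import math

def f(x, y):
    """f = 1/r written in rectangular coordinates."""
    return (x * x + y * y) ** -0.5

def numeric_gradient(func, x, y, h=1e-6):
    """Central-difference estimate of (df/dx, df/dy); h is an assumed step size."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy

def analytic_gradient(x, y):
    """Closed form from the text: (-x/r^3, -y/r^3)."""
    r = math.hypot(x, y)
    return -x / r**3, -y / r**3

# Arbitrary sample point away from the origin (where 1/r is not defined).
x0, y0 = 1.2, -0.7
print("numeric :", numeric_gradient(f, x0, y0))
print("analytic:", analytic_gradient(x0, y0))
```

The two printed pairs should agree to several decimal places at any point other than the origin.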

It is sometimes more convenient to express the gradient directly in polar or spherical coordinates, just as it is expressed in rectangular coordinates above.

We want here an expression involving partial derivatives with respect to $r$ and $\theta$, multiplied by vectors pointing in the $r$ direction and the $\theta$ direction, respectively.

So we want to know: what vectors should these partial derivatives be multiplied by in order to form the gradient?

When we find the answer, the partial derivative with respect to each polar variable will be tied to the dot product of the gradient with a unit vector in the corresponding polar direction, up to a possible scale factor.

We therefore digress to discuss what these unit vectors are so that you can recognize them.

The $r$ direction is the direction tilted by an angle $\theta$ counterclockwise from the $x$ axis. A unit vector in that direction, call it $\vec{u}_r$, can be written in any of the following three forms

$\vec{u}_r = \cos\theta \, \hat{i} + \sin\theta \, \hat{j} = \frac{x}{r} \hat{i} + \frac{y}{r} \hat{j} = \frac{\vec{r}}{r}$

The unit vector in the $\theta$ direction points $90^\circ$ counterclockwise from the $r$ direction, and is therefore given by

$\vec{u}_\theta = -\sin\theta \, \hat{i} + \cos\theta \, \hat{j} = -\frac{y}{r} \hat{i} + \frac{x}{r} \hat{j}$
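As a quick check, added here for illustration, the dot products of these two vectors confirm that each has unit length and that they are perpendicular to one another:

```latex
\vec{u}_r \cdot \vec{u}_r = \cos^2\theta + \sin^2\theta = 1, \qquad
\vec{u}_\theta \cdot \vec{u}_\theta = \sin^2\theta + \cos^2\theta = 1, \qquad
\vec{u}_r \cdot \vec{u}_\theta = -\cos\theta \sin\theta + \sin\theta \cos\theta = 0
```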

We now ask: what is $\vec{\nabla} f$ in polar coordinates?

We know that if we make differential changes in $r$ and $θ$ the resulting change in $f$ will be given by

$df(r, \theta) = \frac{\partial f}{\partial r} dr + \frac{\partial f}{\partial \theta} d\theta$

(A)

since this relation holds for any variables at all.

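But the same change must also equal the dot product of the gradient with the displacement, whose components $ds_r$ and $ds_\theta$ are the distances moved in the $r$ and $\theta$ directions: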

$df = (\vec{\nabla} f)_r \, ds_r + (\vec{\nabla} f)_\theta \, ds_\theta = \vec{\nabla} f \cdot d\vec{s}$

As we noted briefly in section 3.8, distance in polar coordinates upon making small changes in the variables $r$ and $\theta$ is described by

$ds^2 = dr^2 + r^2 \, d\theta^2$

From this we deduce that $ds_r$ is $dr$ while $ds_\theta$ is $r \, d\theta$.

Putting the two expressions for $df$ together, with these values of $ds_r$ and $ds_\theta$, we deduce:

$(\vec{\nabla} f)_r$ is the partial derivative of $f$ with respect to $r$, just as $(\vec{\nabla} f)_x$ is its partial derivative with respect to $x$.

But because $ds_\theta$ has a factor of $r$ in it, there must be a compensating factor of $r$ in the denominator of the component of $\vec{\nabla} f$ in the $\theta$ direction

$(\vec{\nabla} f)_\theta = \frac{1}{r} \frac{\partial f}{\partial \theta}$

and

$\vec{\nabla} f = \frac{1}{r} \frac{\partial f}{\partial \theta} \vec{u}_\theta + \frac{\partial f}{\partial r} \vec{u}_r$
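The polar formula can be tested against a rectangular computation. The sketch below is an illustration under stated assumptions rather than part of the original notes: the sample function $f = r^2 \sin\theta$ (which equals $y\sqrt{x^2 + y^2}$ in rectangular form), the helper names, and the step size `h` are all chosen here. It builds the polar components by central differences, expands them in the $\hat{i}$, $\hat{j}$ basis using the unit vectors above, and compares with the gradient worked out by hand in rectangular coordinates.

```python
import math

def polar_gradient(f, r, theta, h=1e-6):
    """Return ((grad f)_r, (grad f)_theta) for f(r, theta) by central differences."""
    df_dr = (f(r + h, theta) - f(r - h, theta)) / (2 * h)
    df_dtheta = (f(r, theta + h) - f(r, theta - h)) / (2 * h)
    # The theta component carries the compensating factor 1/r.
    return df_dr, df_dtheta / r

def to_rectangular(grad_r, grad_theta, theta):
    """Expand grad_r * u_r + grad_theta * u_theta in the i, j basis."""
    gx = grad_r * math.cos(theta) - grad_theta * math.sin(theta)
    gy = grad_r * math.sin(theta) + grad_theta * math.cos(theta)
    return gx, gy

def f_polar(r, theta):
    """Sample function f = r^2 sin(theta), i.e. y * sqrt(x^2 + y^2)."""
    return r * r * math.sin(theta)

# Arbitrary sample point with r > 0.
r0, theta0 = 2.0, 0.6
g_r, g_theta = polar_gradient(f_polar, r0, theta0)
gx, gy = to_rectangular(g_r, g_theta, theta0)

# Rectangular gradient of y * sqrt(x^2 + y^2): (x*y/r, r + y^2/r).
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)
expected = (x0 * y0 / r0, r0 + y0 * y0 / r0)
print("from polar formula:", (gx, gy))
print("rectangular       :", expected)
```

Dropping the `/ r` in `polar_gradient` makes the two results disagree, which is one way to see that the factor is needed.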

A similar computation can be made for any orthogonal directions in any dimension, and we can anticipate the result.

The component of $\vec{\nabla} f$ in the direction of any such variable will be the partial derivative of $f$ with respect to that variable, divided by the ratio of distance change in that direction to change in the variable itself.

Using the last equation we can immediately deduce that the gradient of $\theta$ is $\frac{\vec{u}_\theta}{r}$, except of course at $r = 0$, where $\theta$ is not differentiable. Similarly we find that the gradient of $\frac{1}{r}$ is $-\frac{1}{r^2} \vec{u}_r$.
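Spelled out, these two results follow in one step each. In the block below, the symbol $h_q$ is a label introduced here (it does not appear in the original text) for the ratio $ds_q / dq$ of distance change to change in a variable $q$, so that $h_r = 1$ and $h_\theta = r$ in polar coordinates. The last line also confirms agreement with the rectangular result for $\frac{1}{r}$ found earlier.

```latex
% General rule, with h_q = ds_q / dq (a label assumed for this illustration)
(\vec{\nabla} f)_q = \frac{1}{h_q} \frac{\partial f}{\partial q},
\qquad h_r = 1, \quad h_\theta = r

% f = \theta: \partial f / \partial r = 0 and \partial f / \partial \theta = 1
\vec{\nabla} \theta = 0 \cdot \vec{u}_r + \frac{1}{r} \cdot 1 \cdot \vec{u}_\theta
                    = \frac{\vec{u}_\theta}{r}

% f = 1/r: \partial f / \partial r = -1/r^2 and \partial f / \partial \theta = 0
\vec{\nabla} \frac{1}{r} = -\frac{1}{r^2} \vec{u}_r
  = -\frac{1}{r^2} \left( \frac{x}{r} \hat{i} + \frac{y}{r} \hat{j} \right)
  = -\frac{x}{r^3} \hat{i} - \frac{y}{r^3} \hat{j}
```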

Exercises:

9.5 Use the fact that both angular variables in spherical coordinates are polar variables to express $ds^2$ in 3 dimensions in terms of differentials of the three variables of spherical coordinates. From this deduce the formula for gradient in spherical coordinates.

9.6 Find the gradient of $ϕ$ in spherical coordinates by this method and the gradient of $θ$ in spherical coordinates also.

There is a third way to find the gradient in terms of given coordinates, and that is by using the chain rule.

We can first consider differential change of $f$ in rectangular coordinates, and then relate the differential changes in $x$ and $y$ to differential changes in the other coordinates, say $r$ and $\theta$. Combining these we can relate the change in $f$ to changes in the latter two variables.

Since we know how to write the gradient in rectangular coordinates and can recognize unit vectors, we can express the resulting expression in terms of components of the gradient in the other coordinate system.

Explicitly we can write

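$df = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy$

$dx = \frac{\partial x}{\partial r} dr + \frac{\partial x}{\partial \theta} d\theta$

$dy = \frac{\partial y}{\partial r} dr + \frac{\partial y}{\partial \theta} d\theta$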

and use the latter two equations to get rid of $dx$ and $dy$ in the first equation. The result is an expression for $df$ in terms of $dr$ and $d\theta$, the coefficients of which can be described in terms of unit vectors in the various directions, and the gradient in rectangular coordinates.

Comparing that equation with the basic formula defining partial derivatives, Equation (A) above, you can read off the components of the gradient.

This approach is useful when $f$ is given in rectangular coordinates but you want to write the gradient in your coordinate system, or if you are unsure of the relation between $ds^2$ and distance in that coordinate system.
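The substitution step can be tried numerically before carrying it out by hand. The following sketch is an illustration only: the sample function $f = x^2 y$, the function names, and the small increments are assumptions made here. It replaces $dx$ and $dy$ by their expressions in $dr$ and $d\theta$ (using $x = r\cos\theta$, $y = r\sin\theta$) and compares the resulting first-order estimate of $df$ with the actual change in $f$.

```python
import math

def f(x, y):
    """Sample function given in rectangular coordinates (chosen for illustration)."""
    return x * x * y

def grad_f(x, y):
    """Its rectangular partial derivatives (df/dx, df/dy)."""
    return 2 * x * y, x * x

def df_chain_rule(r, theta, dr, dtheta):
    """First-order df after eliminating dx and dy in favor of dr and dtheta."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    # dx and dy obtained from x = r cos(theta), y = r sin(theta).
    dx = math.cos(theta) * dr - r * math.sin(theta) * dtheta
    dy = math.sin(theta) * dr + r * math.cos(theta) * dtheta
    fx, fy = grad_f(x, y)
    return fx * dx + fy * dy

def df_actual(r, theta, dr, dtheta):
    """Actual change in f when (r, theta) moves to (r + dr, theta + dtheta)."""
    x0, y0 = r * math.cos(theta), r * math.sin(theta)
    x1 = (r + dr) * math.cos(theta + dtheta)
    y1 = (r + dr) * math.sin(theta + dtheta)
    return f(x1, y1) - f(x0, y0)

# Small, arbitrary increments at an arbitrary point.
r0, theta0, dr, dtheta = 2.0, 0.6, 1e-5, 2e-5
print("chain rule:", df_chain_rule(r0, theta0, dr, dtheta))
print("actual    :", df_actual(r0, theta0, dr, dtheta))
```

The two numbers differ only by terms of second order in the increments, which is what the differential relation asserts.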

Exercises:

9.7 Do this computation out explicitly in polar coordinates.

9.8 Do it as well in spherical coordinates.

What variables should we keep constant in taking partial derivatives?

It is worth noting that when we take the partial derivative with respect to $x$ or $y$ we always mean that we are keeping the other variable, $y$ or $x$, constant; on the other hand the partials with respect to $r$ and $\theta$ always mean keeping the other one of these, $\theta$ or $r$, constant. Any other meaning has to be described explicitly.

There are times and places where in a partial derivative one can become confused as to which variable or variables are being kept constant, and under such circumstances it is wise to modify the notation to supply this information explicitly. Thus we can write $\left( \frac{\partial f}{\partial x} \right)_y$ to mean the partial derivative with respect to $x$ keeping $y$ fixed, and then there can be no confusion as to what is kept constant.
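A small worked example, added here as an illustration, shows why the subscript matters. Differentiate $x$ with respect to $r$, once holding $\theta$ fixed and once holding $y$ fixed (assuming $x \neq 0$):

```latex
% Holding \theta fixed: x = r \cos\theta
\left( \frac{\partial x}{\partial r} \right)_\theta = \cos\theta = \frac{x}{r}

% Holding y fixed: x^2 = r^2 - y^2, so 2x \, dx = 2r \, dr
\left( \frac{\partial x}{\partial r} \right)_y = \frac{r}{x} = \frac{1}{\cos\theta}
```

The two answers agree only when $\cos^2\theta = 1$, so writing $\frac{\partial x}{\partial r}$ without saying what is held constant would be ambiguous here.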