CLP-3 Multivariable Calculus

Section 3.8 Optional— Integrals in General Coordinates

One of the most important tools used in dealing with single variable integrals is the change of variable (substitution) rule
\begin{equation} x=f(u)\qquad \dee{x} = f'(u)\,\dee{u} \tag{3.8.1} \end{equation}
See Theorems 1.4.2 and 1.4.6 in the CLP-2 text. Expressing a multivariable integral using polar or cylindrical or spherical coordinates is really a multivariable substitution. For example, switching to spherical coordinates amounts to replacing the coordinates \(x,y,z\) with the coordinates \(\rho,\theta,\varphi\) by using the substitution
\begin{equation*} \vX =\vr(\rho,\theta,\varphi)\qquad \dee{x}\,\dee{y}\,\dee{z} = \rho^2\,\sin\varphi\,\dee{\rho}\,\dee{\theta}\,\dee{\varphi} \end{equation*}
where
\begin{equation*} \vX=\llt x\,,\, y\,,\, z\rgt\qquad\text{and}\qquad \vr(\rho,\theta,\varphi)=\llt \rho\cos\theta\sin\varphi\,,\, \rho\sin\theta\sin\varphi\,,\, \rho\cos\varphi\rgt \end{equation*}
We'll now derive a generalization of the substitution rule (3.8.1) to two dimensions. It will include polar coordinates as a special case. Later, we'll state (without proof) its generalization to three dimensions. It will include cylindrical and spherical coordinates as special cases.
Suppose that we wish to integrate over a region, \(\cR\text{,}\) in \(\bbbr^2\) and that we also wish [1] to use two new coordinates, that we'll call \(u\) and \(v\text{,}\) in place of \(x\) and \(y\text{.}\) The new coordinates \(u\text{,}\) \(v\) are related to the old coordinates \(x\text{,}\) \(y\text{,}\) by the functions [2]
\begin{align*} x&=x(u,v)\\ y&=y(u,v) \end{align*}
To make formulae more compact, we'll define the vector valued function \(\vr(u,v)\) by
\begin{equation*} \vr(u,v) = \llt x(u,v) \,,\, y(u,v) \rgt \end{equation*}
As an example, if the new coordinates are polar coordinates, with \(r\) renamed to \(u\) and \(\theta\) renamed to \(v\text{,}\) then \(x(u,v) = u\cos v\) and \(y(u,v)=u\sin v\text{.}\)
Note that if we hold \(v\) fixed and vary \(u\text{,}\) then \(\vr(u,v)\) sweeps out a curve. For example, if \(x(u,v) = u\cos v\) and \(y(u,v)=u\sin v\text{,}\) then, if we hold \(v\) fixed and vary \(u\text{,}\) \(\vr(u,v)\) sweeps out a straight line (that makes the angle \(v\) with the \(x\)-axis), while, if we hold \(u \gt 0\) fixed and vary \(v\text{,}\) \(\vr(u,v)\) sweeps out a circle (of radius \(u\) centred on the origin).
We start by cutting \(\cR\) (the shaded region in the figure below) up into small pieces by drawing a bunch of curves of constant \(u\) (the blue curves in the figure below) and a bunch of curves of constant \(v\) (the red curves in the figure below).
Concentrate on any one of the small pieces. Here is a greatly magnified sketch.
For example, the lower red curve was constructed by holding \(v\) fixed at the value \(v_0\text{,}\) varying \(u\) and sketching \(\vr(u,v_0)\text{,}\) and the upper red curve was constructed by holding \(v\) fixed at the slightly larger value \(v_0+\dee{v}\text{,}\) varying \(u\) and sketching \(\vr(u,v_0+\dee{v})\text{.}\) So the four intersection points in the figure are
\begin{alignat*}{2} P_2&=\vr(u_0, v_0+\dee{v}) &\qquad P_3&=\vr(u_0+\dee{u}, v_0+\dee{v})\\ P_0&=\vr(u_0, v_0) & P_1&=\vr(u_0+\dee{u}, v_0) \end{alignat*}
Now, for any small constants \(\dee{U}\) and \(\dee{V}\text{,}\) we have the linear approximation [3]
\begin{align*} \vr(u_0+\dee{U},v_0+\dee{V}) &\approx \vr(u_0\,,\,v_0) +\pdiff{\vr}{u}(u_0\,,\,v_0)\,\dee{U} +\pdiff{\vr}{v}(u_0\,,\,v_0)\,\dee{V} \end{align*}
Applying this three times, once with \(\dee{U}=\dee{u}\text{,}\) \(\dee{V}=0\) (to approximate \(P_1\)), once with \(\dee{U}=0\text{,}\) \(\dee{V}=\dee{v}\) (to approximate \(P_2\)), and once with \(\dee{U}=\dee{u}\text{,}\) \(\dee{V}=\dee{v}\) (to approximate \(P_3\)),
\begin{alignat*}{4} P_0&=\vr(u_0\,,\,v_0)\\ P_1&=\vr(u_0+\dee{u}, v_0) &&\approx \vr(u_0\,,\,v_0) &&+\pdiff{\vr}{u}(u_0\,,\,v_0)\,\dee{u}\\ P_2&=\vr(u_0, v_0+\dee{v}) &&\approx \vr(u_0\,,\,v_0) && &&+\pdiff{\vr}{v}(u_0\,,\,v_0)\,\dee{v}\\ P_3&=\vr(u_0+\dee{u}, v_0+\dee{v}) &&\approx\vr(u_0\,,\,v_0) &&+\pdiff{\vr}{u}(u_0\,,\,v_0)\,\dee{u} &&+\pdiff{\vr}{v}(u_0\,,\,v_0)\,\dee{v} \end{alignat*}
We have dropped all Taylor expansion terms that are of degree two or higher in \(\dee{u}\text{,}\) \(\dee{v}\text{.}\) The reason is that, in defining the integral, we take the limit \(\dee{u},\dee{v}\rightarrow 0\text{.}\) Because of that limit, all of the dropped terms contribute exactly \(0\) to the integral. We shall not prove this. But we shall show, in the optional §3.8.1, why this is the case.
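To see the size of the dropped terms concretely, here is a minimal numerical sketch. It assumes the polar-style map \(\vr(u,v)=\llt u\cos v, u\sin v\rgt\) as a test case; the function names and the sample point are ours, chosen only for illustration. It measures the distance between \(\vr(u_0+h,v_0+h)\) and its linear approximation for three step sizes, each half the previous one.

```python
import math

def r(u, v):
    """Polar-style map r(u, v) = <u cos v, u sin v> used as a test case."""
    return (u * math.cos(v), u * math.sin(v))

def r_u(u, v):
    """Exact partial derivative of r with respect to u."""
    return (math.cos(v), math.sin(v))

def r_v(u, v):
    """Exact partial derivative of r with respect to v."""
    return (-u * math.sin(v), u * math.cos(v))

def linearization_error(u0, v0, h):
    """Distance between r(u0+h, v0+h) and its linear approximation at (u0, v0)."""
    exact = r(u0 + h, v0 + h)
    base = r(u0, v0)
    ru, rv = r_u(u0, v0), r_v(u0, v0)
    approx = (base[0] + ru[0] * h + rv[0] * h,
              base[1] + ru[1] * h + rv[1] * h)
    return math.hypot(exact[0] - approx[0], exact[1] - approx[1])

# Sample point and step sizes are arbitrary illustrative choices.
u0, v0 = 1.5, 0.7
errors = [linearization_error(u0, v0, h) for h in (1e-2, 5e-3, 2.5e-3)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]
print(ratios)  # halving h should divide the error by roughly 4
```

A ratio near \(4\) each time the step is halved is what we expect from an error that is of degree two in \(\dee{u}\text{,}\) \(\dee{v}\text{.}\)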
The small piece of \(\cR\) with corners \(P_0\text{,}\) \(P_1\text{,}\) \(P_2\text{,}\) \(P_3\) is approximately a parallelogram with sides
\begin{align*} \overrightarrow{P_0P_1} \approx \overrightarrow{P_2P_3} &\approx \pdiff{\vr}{u}(u_0\,,\,v_0)\,\dee{u} =\llt\pdiff{x}{u}(u_0,v_0)\,,\, \pdiff{y}{u}(u_0,v_0) \rgt \dee{u}\\ \overrightarrow{P_0P_2} \approx \overrightarrow{P_1P_3} &\approx \pdiff{\vr}{v}(u_0\,,\,v_0)\,\dee{v} =\llt\pdiff{x}{v}(u_0,v_0)\,,\, \pdiff{y}{v}(u_0,v_0) \rgt \dee{v} \qquad \end{align*}
Here the notation, for example, \(\overrightarrow{P_0P_1}\) refers to the vector whose tail is at the point \(P_0\) and whose head is at the point \(P_1\text{.}\) Recall, from 1.2.17 that
\begin{equation*} \text{area of parallelogram with sides $\llt a,b\rgt$ and $\llt c,d\rgt$} = \left|\det\left[\begin{matrix}a&b\\ c&d\end{matrix}\right]\right| =\big|ad-bc\big| \end{equation*}
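So the area of our small piece of \(\cR\) is essentially
\begin{equation} \dee{A} = \left|\det\left[\begin{matrix} \pdiff{x}{u}(u_0,v_0)&\pdiff{y}{u}(u_0,v_0) \\ \pdiff{x}{v}(u_0,v_0)&\pdiff{y}{v}(u_0,v_0) \end{matrix}\right]\right| \dee{u}\,\dee{v} \tag{3.8.2} \end{equation}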
Recall that \(\det M\) denotes the determinant of the matrix \(M\text{.}\) Also recall that we don't really need determinants for this text, though it does make for nice compact notation.
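As a sanity check on the parallelogram picture, the following sketch compares two numbers for one small piece: the area of the straight-edged quadrilateral through the four corners \(P_0\text{,}\) \(P_1\text{,}\) \(P_3\text{,}\) \(P_2\) (computed with the shoelace formula) and the determinant times \(\dee{u}\,\dee{v}\text{.}\) The map used is \(x=\frac{u^2-v^2}{2}\text{,}\) \(y=uv\text{,}\) picked purely as a test case, and the helper names and sample values are illustrative assumptions.

```python
def r(u, v):
    """Test map: x = (u^2 - v^2)/2, y = u v."""
    return ((u * u - v * v) / 2.0, u * v)

def jacobian_area(u, v, du, dv):
    """|det [[x_u, y_u], [x_v, y_v]]| du dv, using the exact partials of r."""
    x_u, y_u = u, v
    x_v, y_v = -v, u
    return abs(x_u * y_v - y_u * x_v) * du * dv

def quad_area(points):
    """Shoelace formula for the area of a polygon whose vertices are listed in order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Arbitrary sample point and small increments.
u0, v0, du, dv = 1.0, 2.0, 1e-3, 1e-3

# Corners in order around the piece: P0, P1, P3, P2.
corners = [r(u0, v0), r(u0 + du, v0), r(u0 + du, v0 + dv), r(u0, v0 + dv)]
straight_edged = quad_area(corners)
predicted = jacobian_area(u0, v0, du, dv)
print(straight_edged, predicted)
```

The two values agree to within a small relative error, and the agreement improves as \(\dee{u}\) and \(\dee{v}\) shrink.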
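The formula (3.8.2) is the heart of the following theorem, which tells us how to translate an integral in one coordinate system into an integral in another coordinate system.
Theorem 3.8.3. Let the functions \(x(u,v)\) and \(y(u,v)\) have continuous first partial derivatives and let the function \(f(x,y)\) be continuous. Assume that \(x=x(u,v)\text{,}\) \(y=y(u,v)\) provides a one-to-one correspondence between the points \((u,v)\) of the region \(\mathcal{O}\) in the \(uv\)-plane and the points \((x,y)\) of the region \(\cR\) in the \(xy\)-plane. Then
\begin{equation*} \dblInt_\cR f(x,y)\ \dee{x}\,\dee{y} = \dblInt_{\mathcal{O}} f\big(x(u,v)\,,\,y(u,v)\big)\ \left|\det\left[\begin{matrix} \pdiff{x}{u}(u,v)&\pdiff{y}{u}(u,v) \\ \pdiff{x}{v}(u,v)&\pdiff{y}{v}(u,v) \end{matrix}\right]\right|\ \dee{u}\,\dee{v} \end{equation*}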
The determinant
\begin{align*} \det\left[\begin{matrix} \pdiff{x}{u}(u,v)&\pdiff{y}{u}(u,v) \\ \pdiff{x}{v}(u,v)&\pdiff{y}{v}(u,v) \end{matrix}\right] \end{align*}
that appears in (3.8.2) and Theorem 3.8.3 is known as the Jacobian [4].
We'll start with a pretty trivial example in which we simply rename \(x\) to \(Y\) and \(y\) to \(X\text{.}\) That is
\begin{align*} x(X,Y) &= Y\\ y(X,Y) &= X \end{align*}
Since
\begin{align*} \pdiff{x}{X}&=0 &\pdiff{y}{X}&=1\\ \pdiff{x}{Y}&=1 &\pdiff{y}{Y}&=0 \end{align*}
(3.8.2), but with \(u\) renamed to \(X\) and \(v\) renamed to \(Y\text{,}\) gives
\begin{align*} \dee{A} &= \left|\det\left[\begin{matrix}0 & 1 \\ 1 & 0 \end{matrix}\right]\right| \dee{X}\,\dee{Y} = \dee{X}\,\dee{Y} \end{align*}
which should really not be a shock.
Polar coordinates have
\begin{align*} x(r,\theta) &= r\cos\theta\\ y(r,\theta) &= r\sin\theta \end{align*}
Since
\begin{align*} \pdiff{x}{r}&=\cos\theta &\pdiff{y}{r}&=\sin\theta\\ \pdiff{x}{\theta}&=-r\sin\theta &\pdiff{y}{\theta}&=r\cos\theta \end{align*}
(3.8.2), but with \(u\) renamed to \(r\) and \(v\) renamed to \(\theta\text{,}\) gives
\begin{align*} \dee{A} &= \left|\det\left[\begin{matrix}\cos\theta &\sin\theta \\ -r\sin\theta & r\cos\theta \end{matrix}\right]\right| \dee{r}\dee{\theta} =\big(r\cos^2\theta + r\sin^2\theta\big)\,\dee{r}\dee{\theta}\\ &= r\,\dee{r}\,\dee{\theta} \end{align*}
which is exactly what we found in 3.2.5.
Parabolic [5] coordinates are defined by
\begin{align*} x(u,v) &= \frac{u^2-v^2}{2}\\ y(u,v) &= uv \end{align*}
Since
\begin{align*} \pdiff{x}{u}&= u &\pdiff{y}{u}&=v\\ \pdiff{x}{v}&=-v &\pdiff{y}{v}&=u \end{align*}
(3.8.2) gives
\begin{align*} \dee{A} &= \left|\det\left[\begin{matrix} u & v \\ -v & u \end{matrix}\right]\right| \dee{u}\dee{v} = (u^2+v^2)\,\dee{u}\,\dee{v} \end{align*}
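The three Jacobians just computed can be spot-checked numerically. The sketch below is one possible way to do it, not part of the text's derivation: it estimates the four partial derivatives by central differences (the helper name `jacobian_det_2d`, the step `h` and the sample points are our own choices) and then forms the \(2\times 2\) determinant.

```python
import math

def jacobian_det_2d(x, y, u, v, h=1e-6):
    """Central-difference estimate of det [[x_u, y_u], [x_v, y_v]] at (u, v)."""
    x_u = (x(u + h, v) - x(u - h, v)) / (2 * h)
    y_u = (y(u + h, v) - y(u - h, v)) / (2 * h)
    x_v = (x(u, v + h) - x(u, v - h)) / (2 * h)
    y_v = (y(u, v + h) - y(u, v - h)) / (2 * h)
    return x_u * y_v - y_u * x_v

# Renaming x to Y and y to X: expect |det| = 1.
swap = jacobian_det_2d(lambda X, Y: Y, lambda X, Y: X, 0.3, 0.8)

# Polar coordinates at r = 2: expect |det| = r = 2.
polar = jacobian_det_2d(lambda r, t: r * math.cos(t),
                        lambda r, t: r * math.sin(t), 2.0, 0.5)

# Parabolic coordinates at (u, v) = (1, 2): expect |det| = u^2 + v^2 = 5.
parabolic = jacobian_det_2d(lambda u, v: (u * u - v * v) / 2,
                            lambda u, v: u * v, 1.0, 2.0)

print(abs(swap), abs(polar), abs(parabolic))
```

Note that the determinant for the renaming example is \(-1\) before the absolute value is taken, which is why (3.8.2) uses \(|\det|\text{.}\)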
In practice applying the change of variables Theorem 3.8.3 can be quite tricky. Here is just one simple (and rigged) example.
Evaluate
\begin{equation*} \dblInt_\cR\frac{y}{1+x}\ \dee{x}\,\dee{y}\qquad\text{where } \cR=\Set{(x,y)}{0\le x\le 1,\ 1+x\le y\le 2+2x} \end{equation*}
Solution.
We can simplify the integrand considerably by making the change of variables
\begin{align*} s&=x & x&=s\\ t&=\frac{y}{1+x} & y&=t(1+x) = t(1+s) \end{align*}
Of course to evaluate the given integral by applying Theorem 3.8.3 we also need to know
  • the domain of integration in terms of \(s\) and \(t\) and
  • \(\dee{x}\,\dee{y}\) in terms of \(\dee{s}\,\dee{t}\text{.}\)
By (3.8.2), recalling that \(x(s,t)=s\) and \(y(s,t)=t(1+s)\text{,}\)
\begin{align*} \dee{x}\,\dee{y} &= \left|\det\left[\begin{matrix}\pdiff{x}{s}&\pdiff{y}{s}\\ \pdiff{x}{t}&\pdiff{y}{t} \end{matrix}\right]\right| \dee{s}\,\dee{t} = \left|\det\left[\begin{matrix}1&t\\ 0&1+s \end{matrix}\right]\right| \dee{s}\,\dee{t} = (1+s)\,\dee{s}\,\dee{t} \end{align*}
To determine what the change of variables does to the domain of integration, we'll sketch \(\cR\) and then reexpress the boundary of \(\cR\) in terms of the new coordinates \(s\) and \(t\text{.}\) Here is the sketch of \(\cR\) in the original coordinates \((x,y)\text{.}\)
The region \(\cR\) is a quadrilateral. It has four sides.
  • The left side is part of the line \(x=0\text{.}\) Recall that \(x=s\text{.}\) So, in terms of \(s\) and \(t\text{,}\) this line is \(s=0\text{.}\)
  • The right side is part of the line \(x=1\text{.}\) In terms of \(s\) and \(t\text{,}\) this line is \(s=1\text{.}\)
  • The bottom side is part of the line \(y=1+x\text{,}\) or \(\frac{y}{1+x}=1\text{.}\) Recall that \(t=\frac{y}{1+x}\text{.}\) So, in terms of \(s\) and \(t\text{,}\) this line is \(t=1\text{.}\)
  • The top side is part of the line \(y=2(1+x)\text{,}\) or \(\frac{y}{1+x}=2\text{.}\) In terms of \(s\) and \(t\text{,}\) this line is \(t=2\text{.}\)
Here is another copy of the sketch of \(\cR\text{.}\) But this time the equations of its four sides are expressed in terms of \(s\) and \(t\text{.}\)
So, expressed in terms of \(s\) and \(t\text{,}\) the domain of integration \(\cR\) is much simpler:
\begin{equation*} \Set{(s,t)}{0\le s\le 1,\ 1\le t\le 2} \end{equation*}
As \(\dee{x}\,\dee{y} = (1+s)\,\dee{s}\,\dee{t}\) and the integrand \(\frac{y}{1+x}=t\text{,}\) the integral is, by Theorem 3.8.3,
\begin{align*} \dblInt_\cR\frac{y}{1+x}\ \dee{x}\,\dee{y} &=\int_0^1\dee{s}\int_1^2\dee{t}\ (1+s)t =\int_0^1\dee{s}\ (1+s)\ \left[\frac{t^2}{2}\right]_1^2\\ &=\frac{3}{2}\left[s+\frac{s^2}{2}\right]_0^1\\ &=\frac{3}{2}\times \frac{3}{2}\\ &=\frac{9}{4} \end{align*}
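It is reassuring to confirm the value \(\frac{9}{4}\) numerically in both coordinate systems. The following sketch uses a plain midpoint rule on the iterated integrals; the function name `midpoint_double_integral` and the grid size are illustrative choices, not anything prescribed by the theorem.

```python
def midpoint_double_integral(f, a, b, lower, upper, n=200):
    """Midpoint-rule estimate of the iterated integral
    int_a^b [ int_{lower(p)}^{upper(p)} f(p, q) dq ] dp."""
    total = 0.0
    dp = (b - a) / n
    for i in range(n):
        p = a + (i + 0.5) * dp
        lo, hi = lower(p), upper(p)
        dq = (hi - lo) / n
        for j in range(n):
            q = lo + (j + 0.5) * dq
            total += f(p, q) * dq * dp
    return total

# Original coordinates: 0 <= x <= 1, 1 + x <= y <= 2 + 2x, integrand y/(1+x).
original = midpoint_double_integral(
    lambda x, y: y / (1 + x), 0.0, 1.0,
    lambda x: 1 + x, lambda x: 2 + 2 * x)

# New coordinates: 0 <= s <= 1, 1 <= t <= 2, integrand t times the Jacobian (1+s).
transformed = midpoint_double_integral(
    lambda s, t: (1 + s) * t, 0.0, 1.0,
    lambda s: 1.0, lambda s: 2.0)

print(original, transformed)  # both should be very close to 9/4 = 2.25
```

Both estimates agree with \(\frac{9}{4}\text{.}\) Omitting the factor \((1+s)\) in the second call would give \(\frac{3}{2}\) instead, which is a quick way to see that the Jacobian cannot be left out.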
There are natural generalizations of (3.8.2) and Theorem 3.8.3 to three (and also to higher) dimensions, that are derived in precisely the same way as (3.8.2) was derived. The derivation is based on the fact, discussed in the optional Section 1.2.4, that the volume of the parallelepiped (three dimensional parallelogram) determined by the three vectors \(\va=\llt a_1,a_2,a_3\rgt ,\ \vb=\llt b_1,b_2,b_3\rgt \) and \(\vc=\llt c_1,c_2,c_3\rgt \) is given by the formula
\begin{align*} \text{volume of parallelepiped with edges } \va, \vb, \vc &= \left| \det\left[\begin{matrix}a_1&a_2&a_3 \\ b_1&b_2&b_3\\ c_1&c_2&c_3\end{matrix}\right] \right| \end{align*}
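where the determinant of a \(3\times 3\) matrix can be defined in terms of some \(2\times 2\) determinants by
\begin{align*} \det\left[\begin{matrix}a_1&a_2&a_3 \\ b_1&b_2&b_3\\ c_1&c_2&c_3\end{matrix}\right] &= a_1\det\left[\begin{matrix}b_2&b_3\\ c_2&c_3\end{matrix}\right] -a_2\det\left[\begin{matrix}b_1&b_3\\ c_1&c_3\end{matrix}\right] +a_3\det\left[\begin{matrix}b_1&b_2\\ c_1&c_2\end{matrix}\right] \end{align*}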
If we use
\begin{align*} x&=x(u,v,w)\\ y&=y(u,v,w)\\ z&=z(u,v,w) \end{align*}
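to change from old coordinates \(x,y,z\) to new coordinates \(u,v,w\text{,}\) then
\begin{equation} \dee{V} = \left|\det\left[\begin{matrix} \pdiff{x}{u}&\pdiff{y}{u}&\pdiff{z}{u}\\ \pdiff{x}{v}&\pdiff{y}{v}&\pdiff{z}{v}\\ \pdiff{x}{w}&\pdiff{y}{w}&\pdiff{z}{w} \end{matrix}\right]\right| \dee{u}\,\dee{v}\,\dee{w} \tag{3.8.8} \end{equation}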
Cylindrical coordinates have
\begin{align*} x(r,\theta,z) &= r\cos\theta\\ y(r,\theta,z) &= r\sin\theta\\ z(r,\theta,z) & = z \end{align*}
Since
\begin{align*} \pdiff{x}{r}&=\cos\theta &\pdiff{y}{r}&=\sin\theta &\pdiff{z}{r}&=0\\ \pdiff{x}{\theta}&=-r\sin\theta &\pdiff{y}{\theta}&=r\cos\theta &\pdiff{z}{\theta}&=0\\ \pdiff{x}{z}&= 0 &\pdiff{y}{z}&=0 &\pdiff{z}{z}&=1 \end{align*}
(3.8.8), but with \(u\) renamed to \(r\text{,}\) \(v\) renamed to \(\theta\) and \(w\) renamed to \(z\text{,}\) gives
\begin{align*} \dee{V} &= \left|\det\left[\begin{matrix}\cos\theta &\sin\theta&0 \\ -r\sin\theta & r\cos\theta&0 \\ 0 & 0 & 1 \end{matrix}\right]\right| \dee{r}\,\dee{\theta}\,\dee{z}\\ &= \left|\cos\theta\det\left[\begin{matrix} r\cos\theta&0 \\ 0 & 1 \end{matrix}\right] -\sin\theta\det\left[\begin{matrix} -r\sin\theta&0 \\ 0 & 1 \end{matrix}\right]\right.\\ &\hskip2.3in+0\left.\det\left[\begin{matrix} -r\sin\theta & r\cos\theta \\ 0 & 0 \end{matrix}\right] \right| \dee{r}\,\dee{\theta}\,\dee{z}\\ &=\big(r\cos^2\theta + r\sin^2\theta\big)\,\dee{r}\,\dee{\theta}\,\dee{z}\\ &= r\,\dee{r}\,\dee{\theta}\,\dee{z} \end{align*}
which is exactly what we found in (3.6.3).
Spherical coordinates have
\begin{align*} x(\rho,\theta,\varphi) &= \rho\,\cos\theta\,\sin\varphi\\ y(\rho,\theta,\varphi) &= \rho\,\sin\theta\,\sin\varphi\\ z(\rho,\theta,\varphi) & = \rho\,\cos\varphi \end{align*}
Since
\begin{align*} \pdiff{x}{\rho}&=\cos\theta\,\sin\varphi &\pdiff{y}{\rho}&=\sin\theta\,\sin\varphi &\pdiff{z}{\rho}&=\cos\varphi\\ \pdiff{x}{\theta}&=-\rho\,\sin\theta\,\sin\varphi &\pdiff{y}{\theta}&=\rho\,\cos\theta\,\sin\varphi &\pdiff{z}{\theta}&=0\\ \pdiff{x}{\varphi}&= \rho\,\cos\theta\,\cos\varphi &\pdiff{y}{\varphi}&=\rho\,\sin\theta\,\cos\varphi &\pdiff{z}{\varphi}&=-\rho\,\sin\varphi \end{align*}
(3.8.8), but with \(u\) renamed to \(\rho\text{,}\) \(v\) renamed to \(\theta\) and \(w\) renamed to \(\varphi\text{,}\) gives
\begin{align*} \dee{V} &= \left|\det\left[\begin{matrix}\cos\theta\,\sin\varphi & \sin\theta\,\sin\varphi &\cos\varphi \\ -\rho\,\sin\theta\,\sin\varphi &\rho\,\cos\theta\,\sin\varphi &0 \\ \rho\,\cos\theta\,\cos\varphi &\rho\,\sin\theta\,\cos\varphi &-\rho\,\sin\varphi \end{matrix}\right]\right| \dee{\rho}\,\dee{\theta}\,\dee{\varphi}\\ &= \left|\cos\theta\,\sin\varphi\det\left[\begin{matrix} \rho\,\cos\theta\,\sin\varphi&0 \\ \rho\,\sin\theta\,\cos\varphi &-\rho\,\sin\varphi \end{matrix}\right] \right.\\ &\hskip1in\left. -\sin\theta\,\sin\varphi\det\left[\begin{matrix} -\rho\,\sin\theta\,\sin\varphi &0 \\ \rho\,\cos\theta\,\cos\varphi &-\rho\,\sin\varphi \end{matrix}\right] \right.\\ &\hskip1in\left. +\cos\varphi\det\left[\begin{matrix} -\rho\,\sin\theta\,\sin\varphi &\rho\,\cos\theta\,\sin\varphi \\ \rho\,\cos\theta\,\cos\varphi &\rho\,\sin\theta\,\cos\varphi \end{matrix}\right] \right| \dee{\rho}\,\dee{\theta}\,\dee{\varphi}\\ &=\rho^2 \big|-\cos^2\theta \sin^3\varphi - \sin^2\theta\sin^3\varphi -\sin\varphi\cos^2\varphi \big|\,\dee{\rho}\,\dee{\theta}\,\dee{\varphi}\\ &=\rho^2 \big|-\sin\varphi \sin^2\varphi -\sin\varphi\cos^2\varphi \big|\,\dee{\rho}\,\dee{\theta}\,\dee{\varphi}\\ &= \rho^2\sin\varphi\,\dee{\rho}\,\dee{\theta}\,\dee{\varphi} \end{align*}
which is exactly what we found in (3.7.3).
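The cylindrical and spherical volume elements can also be spot-checked numerically. In the sketch below, which is only an illustration under our own naming (`det3`, `jacobian_det_3d`, and the sample points are assumptions), the partial derivatives are estimated by central differences, arranged in rows exactly as in the matrices above, and the \(3\times 3\) determinant is expanded along its first row.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))

def jacobian_det_3d(f, p, h=1e-6):
    """Central-difference estimate of the determinant whose k-th row holds the
    partial derivatives of (x, y, z) = f(p) with respect to the k-th new coordinate."""
    rows = []
    for k in range(3):
        plus, minus = list(p), list(p)
        plus[k] += h
        minus[k] -= h
        fp, fm = f(*plus), f(*minus)
        rows.append([(fp[i] - fm[i]) / (2 * h) for i in range(3)])
    return det3(rows)

def cylindrical(r, theta, z):
    return (r * math.cos(theta), r * math.sin(theta), z)

def spherical(rho, theta, phi):
    return (rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(phi))

# Arbitrary sample points; compare with r and rho^2 sin(phi) respectively.
cyl = abs(jacobian_det_3d(cylindrical, (2.0, 0.4, -1.0)))
sph = abs(jacobian_det_3d(spherical, (2.0, 0.4, 1.1)))
print(cyl, 2.0)
print(sph, 2.0 ** 2 * math.sin(1.1))
```

Each printed pair should agree to many decimal places, matching \(r\) for cylindrical coordinates and \(\rho^2\sin\varphi\) for spherical coordinates.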

Subsection 3.8.1 Optional — Dropping Higher Order Terms in \(\dee{u},\dee{v}\)

In the course of deriving (3.8.2), that is, the \(\dee{A}\) formula for the small piece of \(\cR\) with corners \(P_0\text{,}\) \(P_1\text{,}\) \(P_2\text{,}\) \(P_3\text{,}\)
we approximated, for example, the vectors
\begin{alignat*}{2} \overrightarrow{P_0P_1} &=\vr(u_0+\dee{u}, v_0) -\vr(u_0\,,\,v_0) &= \pdiff{\vr}{u}(u_0\,,\,v_0)\,\dee{u} + \vE_1 &\approx \pdiff{\vr}{u}(u_0\,,\,v_0)\,\dee{u}\\ \overrightarrow{P_0P_2} &=\vr(u_0, v_0+\dee{v})-\vr(u_0\,,\,v_0) &= \pdiff{\vr}{v}(u_0\,,\,v_0)\,\dee{v} + \vE_2 &\approx \pdiff{\vr}{v}(u_0\,,\,v_0)\,\dee{v} \end{alignat*}
where the length of \(\vE_1\) is bounded [6] by a constant times \((\dee{u})^2\) and the length of \(\vE_2\) is bounded by a constant times \((\dee{v})^2\text{.}\) That is, we assumed that we could just ignore the errors and drop \(\vE_1\) and \(\vE_2\) by setting them to zero.
So we approximated
\begin{align*} \left|\overrightarrow{P_0P_1}\times\overrightarrow{P_0P_2}\right| &=\left|\Big[\pdiff{\vr}{u}(u_0\,,\,v_0)\,\dee{u} + \vE_1\Big] \times\Big[\pdiff{\vr}{v}(u_0\,,\,v_0)\,\dee{v} + \vE_2\Big] \right|\\ &=\left|\pdiff{\vr}{u}(u_0\,,\,v_0)\,\dee{u} \times\pdiff{\vr}{v}(u_0\,,\,v_0)\,\dee{v} + \vE_3 \right|\\ &\approx \left|\pdiff{\vr}{u}(u_0\,,\,v_0)\,\dee{u} \times\pdiff{\vr}{v}(u_0\,,\,v_0)\,\dee{v} \right| \end{align*}
where the length of the vector \(\vE_3\) is bounded by a constant times \((\dee{u})^2\,\dee{v}+\dee{u}\,(\dee{v})^2\text{.}\) We'll now see why dropping terms like \(\vE_3\) does not change the value of the integral at all [7]. Suppose that our domain of integration consists of all \((u,v)\)'s in a rectangle of width \(W\) and height \(H\text{,}\) as in the figure below.
Subdivide the rectangle into a grid of \(n\times n\) small subrectangles by drawing lines of constant \(v\) (the red lines in the figure) and lines of constant \(u\) (the blue lines in the figure). Each subrectangle has width \(\dee{u} = \frac{W}{n}\) and height \(\dee{v} = \frac{H}{n}\text{.}\) Now suppose that in setting up the integral we make, for each subrectangle, an error that is bounded by some constant times
\begin{equation*} (\dee{u})^2\,\dee{v}+\dee{u}\,(\dee{v})^2 =\Big(\frac{W}{n}\Big)^2 \frac{H}{n} + \frac{W}{n}\Big(\frac{H}{n}\Big)^2 =\frac{WH(W+H)}{n^3} \end{equation*}
Because there are a total of \(n^2\) subrectangles, the total error that we have introduced, for all of these subrectangles, is no larger than a constant times
\begin{equation*} n^2 \times \frac{WH(W+H)}{n^3} = \frac{WH(W+H)}{n} \end{equation*}
When we define our integral by taking the limit \(n\rightarrow \infty\) of the Riemann sums, this error converges to exactly \(0\text{.}\) As a consequence, it was safe for us to ignore the error terms when we established the change of variables formulae.
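Here is a concrete instance of this counting argument, as a sketch under stated assumptions. We take polar coordinates, for which the exact area of the cell \(r_0\le r\le r_0+\dee{r}\text{,}\) \(\theta_0\le\theta\le\theta_0+\dee{\theta}\) is that of an annular sector, \(\frac{1}{2}\big[(r_0+\dee{r})^2-r_0^2\big]\dee{\theta}\text{,}\) and compare it with the estimate \(r_0\,\dee{r}\,\dee{\theta}\) from (3.8.2) evaluated at the corner of the cell. The rectangle \(a\le r\le a+W\text{,}\) \(0\le\theta\le H\) and the function name are our own illustrative choices.

```python
def total_dropped_area(a, W, H, n):
    """Sum, over an n-by-n grid of polar cells, of the gap between the exact
    cell area and the corner-based estimate r0 * dr * dtheta."""
    dr, dtheta = W / n, H / n
    total = 0.0
    for i in range(n):
        r0 = a + i * dr
        exact = 0.5 * ((r0 + dr) ** 2 - r0 ** 2) * dtheta  # annular sector area
        estimate = r0 * dr * dtheta                         # (3.8.2) at the cell corner
        # The gap does not depend on theta, so all n cells in this ring share it.
        total += n * abs(exact - estimate)
    return total

# Rectangle 1 <= r <= 3, 0 <= theta <= 1 (arbitrary sample values).
for n in (10, 100, 1000):
    print(n, total_dropped_area(1.0, 2.0, 1.0, n))
```

Each cell contributes a gap of \(\frac{1}{2}(\dee{r})^2\,\dee{\theta}\text{,}\) so the total over the \(n^2\) cells is \(\frac{W^2H}{2n}\text{,}\) which shrinks by a factor of ten each time \(n\) grows by a factor of ten.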
[1] We'll keep our third wish in reserve.
[2] We are abusing notation a little here by using \(x\) and \(y\) both as coordinates and as functions. We could write \(x=f(u,v)\) and \(y=g(u,v)\text{,}\) but it is easier to remember \(x=x(u,v)\) and \(y=y(u,v)\text{.}\)
[3] Recall 2.6.1.
[4] It is not named after the Jacobin Club, a political movement of the French revolution. It is not named after the Jacobite rebellions that took place in Great Britain and Ireland between 1688 and 1746. It is not named after the Jacobean era of English and Scottish history. It is named after the German mathematician Carl Gustav Jacob Jacobi (1804 – 1851). He died from smallpox.
[5] The name comes from the fact that both the curves of constant \(u\) and the curves of constant \(v\) are parabolas.
[6] Remember the error in the Taylor polynomial approximations. See 2.6.13 and 2.6.14.
[7] See the optional § 1.1.6 of the CLP-2 text for an analogous argument concerning Riemann sums.