
Why Don’t We Teach the Helmholtz Theorem?

In an earlier post, I outlined a derivation of the Helmholtz Theorem starting from the identity

\[ \nabla^2 \left( \frac{1}{|\vec r - \vec r\,'|} \right) = -4 \pi \delta( \vec r - \vec r\,' ) \, .\]

Hard as it is for me to believe, it was many years after I studied E&M in graduate school that I came across this theorem and came to appreciate its power.  The reason it is hard to believe is that almost none of the traditional texts discuss it or even have an index entry for it.

The traditional way we teach electricity and magnetism is to trace through the historical development by examining a host of 18th and 19th century experiments. The usual course is to introduce integral forms of the laws and then show how these lead to Maxwell’s equations (e.g., from Coulomb’s law to Gauss’s law).  The road here is long and tortuous, starting with static fields, then layering on the time dependence, and then finally sneaking the displacement current into Ampere’s law.  This approach obscures the unity of Maxwell’s equations.  It also leaves the student bored, confused, overwhelmed, and incapable of appreciating what follows.

To such a student is lost the wonder of realizing that the fields can take on a life of their own, independent of the things that created them.  Lost is the realization that these fields can radiate outwards; that they can reflect and refract (i.e., optics as an inherently electromagnetic phenomenon); and that they can be generated and controlled at will to form the communication network we all use on a daily basis.

A better approach starts with Maxwell’s equations in their full form and then uses the Helmholtz theorem to ‘interrogate’ them to derive the time-honored static field results of Coulomb and Biot-Savart. I believe this approach, which is nearly impossible to find in the usual textbooks, offers a clearer view into the unity of electricity and magnetism, at the cost of some slightly more mature vector calculus.  The fields are introduced early, and the equations they satisfy are complete. There are no facts to unlearn later on.  For example, the traditional textbook treatments of Coulomb’s law emphasize that the electric field is conservative and that its curl is identically zero.  Weeks or months go by with that concept firmly emphasized and entrenched and then, and only then, is the student informed that the result doesn’t hold in general.

There is also a fundamental flaw in the pedagogy of deriving Maxwell’s equations from the integral forms.  Nowhere along the line is there any explanation as to why knowing the divergence and curl of a vector field is all that is needed to uniquely specify the field. After all, why can’t a uniform field be added as a constant of integration?

At the heart of the traditional approach is the idea of a force field as a physically real object and not just a useful mathematical construct. The prototype example is the electric field, which comes from the experimental expression for Coulomb’s law, stating that the force on a charge $$q_2$$ at $$\vec r_2$$ due to a charge $$q_1$$ at $$\vec r_1$$ is given by (note that all equations are expressed in SI units):

\[ \vec F_{21} = \frac{1}{4 \pi \epsilon_0} \frac{ q_1 q_2 \left( \vec r_2 - \vec r_1 \right)}{|\vec r_2 - \vec r_1|^3} \, .\]

The usual practice is then to assume one of the charges is a test charge and that the other is smeared into an arbitrary charge distribution within a volume $$V$$ and that the resulting electric field is

\[ \vec E(\vec r) = \frac{1}{4 \pi \epsilon_0} \int_V d^3 r' \frac{ \rho(\vec r\,') \left( \vec r - \vec r\,' \right)}{|\vec r - \vec r\,'|^3} \, . \]

The last step in the traditional approach involves introducing the divergence and curl of a vector field and the integral theorems they obey, and then applying them to the electric flux to get the first of the Maxwell equations

\[ \nabla \cdot \vec E (\vec r) = \rho(\vec r) / \epsilon_0 \, .\]

As the traditional program proceeds, magnetostatics follows with the introduction of the Biot-Savart law

\[ \vec B(\vec r) = \frac{\mu_0}{4 \pi} \int_V d^3 r' \frac{ \vec J(\vec r\,') \times (\vec r - \vec r\,')}{|\vec r - \vec r\,'|^3} \, , \]

as the experimental observation for the generation of a magnetic field for a given current density within a volume $$V$$. This time the vanishing of the divergence is used to find the vector potential and the curl is related to the current density via Ampere’s law

\[ \nabla \times \vec B(\vec r) = \mu_0 \vec J(\vec r) \,.\]

The traditional approach finally gets to time-varying fields when taking up Faraday’s law, requiring the student to unlearn $$\nabla \times \vec E = 0$$ and then re-learn Ampere’s law with the introduction of the displacement current. By this time the full Maxwell equations are on display, but the linkage between the different facets of each field is highly obscured, and the basic underpinning of the theory, that the divergence and curl tell all there is to know about a field, is not to be found. The pedagogy seems to suffer from too many unconnected facts with no common framework by which to relate them.

Using the Helmholtz Theorem in conjunction with an upfront statement of the Maxwell equations offers several advantages in teaching electromagnetism. I will content myself here with just the derivation of Coulomb’s law and the Biot-Savart law.  Additional information can be found in the paper and presentation I recently wrote for the Fall meeting of the Chesapeake Section of AAPT.

Start by considering the Maxwell equations, presented here in vacuum, as

\[ \nabla \cdot \vec E(\vec r,t) = \rho(\vec r,t) / \epsilon_0 \, ,\]

\[ \nabla \cdot \vec B(\vec r,t) = 0 \, , \]

\[ \nabla \times \vec E(\vec r,t) = -\frac{\partial \vec B (\vec r,t)}{\partial t} \, , \]

and

\[ \nabla \times \vec B(\vec r,t) = \mu_0 \vec J (\vec r,t) + \epsilon_0 \mu_0 \frac{\partial \vec E(\vec r,t)}{\partial t} \, .\]

Since the Coulomb and Biot-Savart laws belong to the domain of electrostatics and magnetostatics, all terms in Maxwell’s equations involving time derivatives are set equal to zero and all $$t$$’s are eliminated to yield
\[ \nabla \cdot \vec E(\vec r) = \rho(\vec r) / \epsilon_0 \, ,\]
\[ \nabla \cdot \vec B(\vec r) = 0 \, , \]
\[ \nabla \times \vec E(\vec r) = 0 \, , \]

and
\[ \nabla \times \vec B(\vec r) = \mu_0 \vec J (\vec r) \, .\]

Now substituting the electric and magnetic field divergences and curls into the $$U(\vec r)$$ and $$\vec W(\vec r)$$ expressions in Helmholtz’s theorem yields the usual scalar potential

\[ U(\vec r) = \frac{1}{4 \pi \epsilon_0}\int_V d^3 r' \frac{\rho(\vec r\,')}{|\vec r - \vec r\,'|} \]

for the electric field and the usual vector potential

\[ \vec W(\vec r) = \frac{\mu_0}{4\pi} \int_{V} d^3 r' \frac{ \vec J(\vec r\,') }{|\vec r - \vec r\,'|} \]

for the magnetic field.
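
The remaining step can be sketched as follows, assuming the decomposition $$\vec F = -\nabla U + \nabla \times \vec W$$ from Helmholtz’s theorem and sources localized enough that the surface terms vanish. Because $$\nabla \times \vec E = 0$$, the electric field has no vector-potential piece, and because $$\nabla \cdot \vec B = 0$$, the magnetic field has no scalar-potential piece. Using $$\nabla_{\vec r} \left( 1/|\vec r - \vec r\,'| \right) = -(\vec r - \vec r\,')/|\vec r - \vec r\,'|^3$$, the electric field becomes

\[ \vec E(\vec r) = -\nabla U(\vec r) = -\frac{1}{4 \pi \epsilon_0} \int_V d^3 r' \rho(\vec r\,') \nabla_{\vec r} \frac{1}{|\vec r - \vec r\,'|} = \frac{1}{4 \pi \epsilon_0} \int_V d^3 r' \frac{ \rho(\vec r\,') \left( \vec r - \vec r\,' \right)}{|\vec r - \vec r\,'|^3} \, , \]

which is Coulomb’s law for a charge distribution. Likewise, the magnetic field becomes

\[ \vec B(\vec r) = \nabla \times \vec W(\vec r) = \frac{\mu_0}{4 \pi} \int_V d^3 r' \left( \nabla_{\vec r} \frac{1}{|\vec r - \vec r\,'|} \right) \times \vec J(\vec r\,') = \frac{\mu_0}{4 \pi} \int_V d^3 r' \frac{ \vec J(\vec r\,') \times (\vec r - \vec r\,')}{|\vec r - \vec r\,'|^3} \, , \]

which is the Biot-Savart law.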

On the whole, I think this approach enhances the physical understanding of Maxwell’s equations in ways the traditional approach can’t.  It is not without its downsides, but none of them presents much difficulty.  Further details can be found in my paper.

Deriving the Helmholtz Theorem

To derive the Helmholtz theorem, start with one representation of the delta function in three dimensions

\[ \nabla^2 \left( \frac{1}{|\vec r - \vec r\,'|} \right) = -4 \pi \delta( \vec r - \vec r\,' ) \, .\]

Next, write the identity
\[ \vec F(\vec r) = \int_{V} d^3 r' \, \delta(\vec r - \vec r\,') \, \vec F(\vec r\,') \]
for an arbitrary vector field $$\vec F(\vec r)$$ over a given volume $$V$$. Note that time will not be involved in this derivation. Also note that there is an ongoing discussion in the literature about the correct way to extend this theorem to time-varying fields. This will be discussed in a future post.
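
The delta-function representation used as the starting point can be made plausible numerically. The sketch below is an illustration under an assumption that is not part of the derivation: $$1/r$$ is replaced by the smoothed function $$1/\sqrt{r^2 + a^2}$$, whose Laplacian is $$-3a^2/(r^2+a^2)^{5/2}$$. The names `smoothed_laplacian` and `integrated_weight` are hypothetical helpers. If the identity holds, the volume integral of the smoothed Laplacian should stay close to $$-4\pi$$ however small the smoothing length $$a$$ becomes.

```python
import math

def smoothed_laplacian(r, a):
    """Laplacian of 1/sqrt(r^2 + a^2), a smoothed stand-in for 1/r (assumed regularization)."""
    return -3.0 * a**2 / (r**2 + a**2) ** 2.5

def integrated_weight(a, r_max=20.0, n_steps=200_000):
    """Midpoint-rule integral of the smoothed Laplacian over a ball of radius r_max."""
    dr = r_max / n_steps
    total = 0.0
    for i in range(n_steps):
        r = (i + 0.5) * dr
        # Spherical shell volume element 4 pi r^2 dr times the integrand.
        total += 4.0 * math.pi * r**2 * smoothed_laplacian(r, a) * dr
    return total

# Ratio of the integral to -4 pi for progressively sharper smoothing lengths.
ratios = {a: integrated_weight(a) / (-4.0 * math.pi) for a in (0.5, 0.1, 0.02)}
for a, ratio in ratios.items():
    print(f"a = {a}: integral / (-4 pi) = {ratio:.4f}")
```

The ratios should all sit near 1, with the small shortfall coming from truncating the integral at `r_max`. As $$a$$ shrinks, the integrand concentrates ever closer to the origin while its total weight stays fixed, which is the behavior expected of $$-4\pi\delta(\vec r)$$.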

Using the explicit representation of the delta-function stated above and factoring out the derivatives with respect to the field point $$\vec r$$ yields

\[ \vec F(\vec r) = \frac{ -\nabla^2_{\vec r} }{4 \pi} \int_V d^3 r' \frac{\vec F(\vec r\,')}{|\vec r - \vec r\,'|} \, . \]

Now apply the vector identity $$\nabla^2 = \nabla( \nabla \cdot ) - \nabla \times (\nabla \times)$$. Doing so allows the expression for $$\vec F(\vec r)$$ to take the form
\[ \vec F (\vec r) = \frac{1}{4 \pi} \nabla_{\vec r} \times \vec I_{vector} - \frac{1}{4 \pi} \nabla_{\vec r} I_{scalar} \]
where the integrals are
\[ \vec I_{vector} = \nabla_{\vec r} \times \int_V d^3 r' \frac{\vec F (\vec r\,')}{|\vec r - \vec r\,'|} \]
and
\[ I_{scalar} = \nabla_{\vec r} \cdot \int_V d^3 r' \frac{\vec F (\vec r\,')}{|\vec r - \vec r\,'|} \; . \]
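
Spelled out, with $$\vec G(\vec r)$$ as a shorthand (introduced here only for illustration) for the integral $$\int_V d^3 r' \, \vec F(\vec r\,')/|\vec r - \vec r\,'|$$, the step reads

\[ \vec F(\vec r) = -\frac{1}{4 \pi} \nabla^2 \vec G = -\frac{1}{4 \pi} \left[ \nabla ( \nabla \cdot \vec G ) - \nabla \times ( \nabla \times \vec G ) \right] = \frac{1}{4 \pi} \nabla \times ( \nabla \times \vec G ) - \frac{1}{4 \pi} \nabla ( \nabla \cdot \vec G ) \, , \]

so that the vector integral is $$\nabla \times \vec G$$ and the scalar integral is $$\nabla \cdot \vec G$$. Note how the overall minus sign flips the order of the two terms.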

The strategy for handling these terms is to

  1. bring the derivative operator with respect to $$\vec r$$ into the integral, where it acts only on $$1/|\vec r - \vec r\,'|$$
  2. switch the derivative from $$\vec r$$ to $$\vec r\,'$$ at the cost of a minus sign
  3. integrate by parts
  4. convert the total-derivative piece into a boundary integral with the divergence theorem (or its curl analogue) and apply the appropriate boundary conditions

Application of this strategy to the vector (first) integral gives
\[ \vec I_{vector} = \int_V d^3 r' \frac{ \nabla_{\vec r\,'} \times \vec F (\vec r\,')}{|\vec r - \vec r\,'|} - \int_{\partial V} dS' \frac{\hat n \times \vec F(\vec r\,')}{|\vec r - \vec r\,'|} \; .\]

Likewise, the same strategy applied to the scalar (second) integral gives
\[ I_{scalar} = \int_V d^3 r' \frac{ \nabla_{\vec r\,'} \cdot \vec F ( \vec r\,')}{|\vec r - \vec r\,'|} - \int_{\partial V} dS' \frac{\hat n \cdot \vec F ( \vec r\,')}{|\vec r - \vec r\,'|} \; . \]
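
As a worked illustration of the four-step strategy, here is the scalar integral written out line by line. This is a sketch that assumes $$\vec F$$ is smooth enough to differentiate under the integral sign. Steps 1 and 2 give

\[ I_{scalar} = \int_V d^3 r' \, \vec F(\vec r\,') \cdot \nabla_{\vec r} \frac{1}{|\vec r - \vec r\,'|} = -\int_V d^3 r' \, \vec F(\vec r\,') \cdot \nabla_{\vec r\,'} \frac{1}{|\vec r - \vec r\,'|} \, , \]

and step 3 uses the product rule $$\nabla_{\vec r\,'} \cdot ( f \vec F ) = \vec F \cdot \nabla_{\vec r\,'} f + f \, \nabla_{\vec r\,'} \cdot \vec F$$ to write

\[ I_{scalar} = \int_V d^3 r' \frac{ \nabla_{\vec r\,'} \cdot \vec F(\vec r\,') }{|\vec r - \vec r\,'|} - \int_V d^3 r' \, \nabla_{\vec r\,'} \cdot \left( \frac{ \vec F(\vec r\,') }{|\vec r - \vec r\,'|} \right) \, . \]

Step 4 converts the last term to a surface integral over $$\partial V$$ with the divergence theorem. The vector integral follows the same pattern, with $$\nabla_{\vec r\,'} \times ( f \vec F ) = \nabla_{\vec r\,'} f \times \vec F + f \, \nabla_{\vec r\,'} \times \vec F$$ in place of the product rule and $$\int_V d^3 r' \, \nabla_{\vec r\,'} \times \vec A = \int_{\partial V} dS' \, \hat n \times \vec A$$ in place of the divergence theorem.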

Now the usual case of interest takes the bounding volume to be all of space, which requires that the field drop off faster than $$r^{-1}$$ at large distances. If this condition is met, then the surface integrals vanish and the original field can be written as
\[ \vec F (\vec r) = -\nabla U(\vec r) + \nabla \times \vec W(\vec r) \]
where
\[ U(\vec r) = \frac{1}{4\pi} \int_V d^3 r' \frac{ \nabla_{\vec r\,'} \cdot \vec F ( \vec r\,')}{|\vec r - \vec r\,'|} \]
and
\[ \vec W(\vec r) = \frac{1}{4\pi} \int_V d^3 r' \frac{ \nabla_{\vec r\,'} \times \vec F ( \vec r\,')}{|\vec r - \vec r\,'|} \; .\]
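
The scalar-potential formula can be checked numerically on a simple test case. The sketch below assumes a field whose divergence is a normalized Gaussian of width $$\sigma$$ centered on the origin, a choice made here purely for illustration. For that source the integral has the closed form $$U(r) = \mathrm{erf}\left( r / \sqrt{2} \sigma \right) / (4 \pi r)$$. Because the Gaussian is a probability density, the integral is just the average of $$1/|\vec r - \vec r\,'|$$ over random source points, so a Monte Carlo estimate suffices. The function names are hypothetical.

```python
import math
import random

def potential_monte_carlo(r_field, sigma=1.0, n_samples=200_000, seed=0):
    """Estimate U(r) = (1/4 pi) * integral of div F(r') / |r - r'| d^3r'
    for a Gaussian source of unit total strength (an assumed test case)."""
    rng = random.Random(seed)
    x, y, z = r_field
    total = 0.0
    for _ in range(n_samples):
        # Draw the source point r' from the Gaussian density div F(r').
        dx = x - rng.gauss(0.0, sigma)
        dy = y - rng.gauss(0.0, sigma)
        dz = z - rng.gauss(0.0, sigma)
        total += 1.0 / math.sqrt(dx * dx + dy * dy + dz * dz)
    return total / (4.0 * math.pi * n_samples)

def potential_exact(r, sigma=1.0):
    """Closed-form potential of the same Gaussian source at distance r > 0."""
    return math.erf(r / (math.sqrt(2.0) * sigma)) / (4.0 * math.pi * r)

r_field = (1.5, 0.0, 0.0)
estimate = potential_monte_carlo(r_field)
exact = potential_exact(1.5)
print(f"Monte Carlo: {estimate:.5f}, closed form: {exact:.5f}")
```

The two numbers should agree to within the sampling error, which shrinks as the inverse square root of `n_samples`. Far from the source the closed form approaches $$1/(4 \pi r)$$, the potential of a point source.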

At this point it is a snap to derive Coulomb’s law and the Biot-Savart law from the Maxwell equations, but that is a post for another time.