## Dipole Field Lines and the Earth

As pointed out in an article entitled Field Pattern of a magnetic dipole (Am. J. Phys. 68 (6), June 2000), J. P. Mc Tavish, notes that while it is common to see a treatment of the magnetic dipole in essentially every electromagnetic text, it is all too rare to find a discussion on how to draw the field lines. While Mc Tavish's article is successful in remedying that shortfall, it omits any connection with specific physical situations, in particular with the most important dipole magnet to the human race: the one on which we live - the Earth.

Alternatively, in the space physics community it is all too common to see the physical connection to the Earth's dipole field and corresponding field lines but the basic physics associated with deriving these equations from Maxwell's equations to the dipole is usually missing. And those that do (e.g. the very instructive lecture by Ling-Hsaio Lyu) often cut some vital parts.

It is the intention of this post to create a single narrative that starts from first principles and ends with the standard space physics expression for the Earth's dipole field in terms of the standard L-shells. The bulk of the initial part of this post is based on the treatment of the dipole in Foundations of Electromagnetic Theory by Reitz, Milford, and Christy.

The starting point is the monopole law $\vec \nabla \cdot \vec B = 0$, which asserts that the magnetic field is purely solenoidal, and, as a result, can be written in terms of the vector potential $\vec A$ as

Since we want the description of an isolated magnetic dipole field, two immediate conditions can be place on Maxwell's equations: $\partial_t \vec B = 0$ and $\rho = 0$. The only non-trivial Maxwell equation is now the Ampere-Maxwell law, that states (subject to the earlier conditions)

Substituting in the vector potential yields

Expanding the double-curl with the standard vector identity leads to

which can be further reduced by picking the Lorentz gauge, which sets $\vec \nabla \cdot \vec A = 0$. This choice leads to the final equation

which asserts that each component of the vector potential obeys a Poisson equation with the source being $\mu_0 \vec J$. The solution of these three equations follows by analogy with electrostatics and the result is

Now for most dipoles (microscopically contained in matter or the Earth), the field is generally distant from the source point ($|\vec r| >> |\vec r \, '|$). This relation suggests a multipole expansion of the Green's function

The dipole is the first-order piece of such an expansion, so we can approximate the Green's function as

One additional approximation is needed; assume that the currents creating the dipole are bound and an be idealized as flowing through a "thin wire" (effectively a Dirac delta function is used to localize the current into a circuit whose characteristic sizes are small compared to all other lengths being considered). In this case, the volume integral becomes a closed line integral. Under this modest approximation:

The first integral vanishes since the line integral forms a close loop. The second integral, refered hereafter as the dipole integral, requires some ingenuity, all of which is supplied by Rietz, Milford, and Christy. The first step in dealing with the dipole integral comes from noting (using back-cab) that

To eliminate the first term employ the identity

that results from a change in $\vec r \,'$ keeping $\vec r$ fixed.

Now add the two previous expressions together to get

which, when divided by 2 and substituted into the dipole integral gives

The second terms is identically zero, since it an exact differential. The first terms can be suggestively rearranged to give

where $\vec m$ is defined as the magnetic dipole (or magnetic dipole moment).

Before computing $\vec B$, let's take a moment to think about the definition of $\vec m$. Consider an arbitrary current-carrying loop

where the origin is chosen near the loop. The diffential vector $\vec r \, ' \times d \vec r \,'$ is normal to a small slice of area. Summing each piece in the integral results in an expression that roughly says

but the area has each slice weighted differently, based on how far from the origin it excursion takes it. As a side note, Griffiths calls this the directed area and obtains it from the Stoke's theorem by an ingenious substitution for the vector field in question (seems like some degree of cleverness is required no matter how one does this analysis).

The next step is to calculate the magnetic field. Taking the curl gives

To expand the right-hand side, use index notation (starting with $\vec A \times \vec B = \epsilon_{ijk} A_j B_k$)

Cyclically permute the indices on $\epsilon_{k \ell m}$ to get

Now use the standard identity

to get

Expanding gives

Now focus on the term inside the square brackets. Operating with the derivatives gives

The first two terms vanish and the second two, when put into vector notation, give for the magnetic field

With this form in hand, it is possible to calculate the field lines. There are two ways to do this: in Cartesian coordinates and in spherical coordinates. Either is fine, but spherical coordinates affords some advantages in space physics and will be pursued here.

That said, both cases benefit from exploiting axial symmetry and, without loss of generality, the analysis will be confined to the $x-z$ plane.

The geometry is shown below.

To put the coordinate-free expression into polar coordinates we need to express the vectors in terms of the basis $\left\{ \hat r, \hat \lambda, \hat \phi \right\}$ (although we are supressing $\hat \phi$ by exploiting the azimuthal symmetry).

Starting with the position vector

gives

and

Inverting these relations gives

Since $\vec m = m \hat z$, the magnetic field becomes

giving

and

where $D = m \mu_0 /4 \pi$.

The final piece is to get an equation of the form $r = r(\lambda)$ so that we can get the field lines. From Section 3.2 of Introduction to Vector Analysis by David and Snider, the tangent field to the field lines is given by $\vec T = \beta \vec B$, where $\beta$ is an arbitrary, and ultimately, inconsequential, constant. In coordinates, this expression leads to the vector equation

which gives

This equation is recast as follows:

The solution is readily obtained by integrating from the equation towards the pole (symmetry again)

giving

which simplifies to

The final form is

The parameter $L$ labels the fieldline and different fieldlines intersect the Earth's surface at different latitudes.

## Top Ten Nobel Prizes in Physics

As part of the special New Years list theme issue, this month’s column is going to propose a ranking for the top 10 Nobel Prizes in physics out of the 117 that have been awarded.  The primary criterion employed is that the work had to have a profound effect on the daily human quality of life in contrast to having a deep and lasting influence on our general knowledge of the universe.  The recent buzz about the discovery of the Higgs boson captured the imaginations of the public at large and advanced our knowledge of the fundamental laws of nature but has currently had no practical impact on the day-to-day lives of most people.  The energies required to unleash the Higgs are enormous and the idea that Higgs physics will be featured in the latest tech gadget absurd.  Thus, its discovery has no place on this list, even though it was an amazing accomplishment.

Each entry in the list will be cited first with the year, followed by the text the Nobel committee used to describe the discovery followed by the person or persons to whom the prize was awarded.  For those prizes where more than one discovery was recognized, the ordering will be alphabetical.

## 10.       1918:  Discovery of elemental quanta – Max Planck

The one that started the whole enterprise of quantum mechanics.  When faced with the puzzle of how to solve the ultraviolet catastrophe in the modeling of black body radiation, Planck proposed a radical concept:  the energy in an oscillating mode couldn’t be any arbitrary value but had to be a multiple of some elementary packet of energy.  The fact that is fit the experimental data so well launched a mode of physical investigation that enabled all of the incredible technology that we all enjoy.

## 9.         1989:  Development of Methods to Isolate Atoms and Subatomic Particles for study; Development of the Atomic Clock – Hans Georg Dehmelt, Wolfgang Paul, Norman Foster Ramsey

This entry is primarily on the list due to the last cited development of the atomic clock.  It is hard to reckon how important good time keeping and synchronization is to the underlying aspects of human activity.  Knowing when to plant is vital to good harvests; knowing when to debit or credit an account is vital to good business; and knowing how to accurately keep good time opens the door for greater exploration of the universe.

## 8.         2007: Discovery of Giant Magnetoresistance – Albert Fert and Peter Grunberg

The digital age has profoundly enriched lives all over the world and this entry is the first of 3 concerning that aspect of our lives.  The discovery of giant magnetoresistance immensely increased the areal density of information that could be stored on hard drives and it is the most important reason why huge amount of data can be placed in clouds and servers.  Each of us is now endowed with the ability to store and access digital data in a way that only the richest corporations could a few decades earlier.  It is not an over-statement to conclude that the new fields of data science and machine learning would not be possible with this discovery.

## 7.         1929: Discovery of the Wave Nature of Electrons – Louis-Victor de Broglie

What Planck began with his quantization of radiation, de Broglie finished with his proposal that matter had a wave-like nature of its very own.  Arguing from symmetry and well-established physical models of classical mechanics, de Broglie’s idea of matter waves solidified quantum mechanics as the preeminent science and paved to way for the formal development of the theory by Heisenberg, Schrodinger, Born, and Dirac.

## 6.         1952:  Discovery of Nuclear Magnetic Resonance in Solids – Felix Bloch and E.M. Purcell

Nuclear magnetic resonance is one of the two non-destructive imaging techniques on this list.  Mostly known under its more politically-correct name of magnetic resonance imaging, this technique complements the use of X-rays in viewing the inner workings of the body with invasive surgery.  It also has applications beyond medical imaging in a variety of situations where non-destructive investigation is required.

## 5.         1945:  Discovery of the Exclusion Principle of Electrons – Wolfgang Pauli

Like the discoveries of Planck and de Broglie, Pauli’s idea is profound not for it intrinsic nature but for the doors it opened.  Advances in chemistry and material sciences, research fields that have opened up miracle drugs and novel new plastics, semi-conductors, and other materials would not have happened without the idea that electrons pack into an atom in a way that prevents them from having the same quantum numbers.

## 4.         2009:  Achievements Concerning the Transmission of Light in Fibres for Optical Communication;  Invention of the CCD Sensor, an Imaging Semiconductor Circuit – Willard Boyd and George Smith; Charles Kao

We all take for granted the ability to take out a phone to film or photograph something of interest to us and to use the information superstructure to post or share what we’ve captured with others.  In this way we communicate with love ones not present and memorialize for later study.  In this way, we also make up an army of citizen-journalists who have created a new type of politics not filtered through official channels.   All of this is having a profound effect on society and is enabled by the CCD and the fiber optics.  I’ve often thought that perhaps the best name for this entry is the YouTube prize.

## 3.         1901:  Discovery of X-Rays – Wilhelm Conrad Rontgen

The first Nobel Prize ever awarded, Rontgen’s discovery of X-Rays was also the first type of non-invasive imaging.  Due to their ability to penetrate matter, X-Rays allowed the first views into the workings of the human body without the need to cut it open.  Broken bones, infections, and tumors are only some of the maladies that could be treated with greater precision and with less pain for the patient.  X-Rays also proved useful in investigating the goodness of welds, the stability of structures, and the like.

## 2.         1956: Investigations on Semiconductors and Invention of the Transistor – John Bardeen, Walter H. Brattain, William B. Shockey

It is hard to underestimate the importance of the transistor.  Without it, logical gates, programming and practical computing, the entire digital age would not have come about.  The idea that millions of these devices can be packed into an area about the size of a postage stamp staggers the imagination.  And, as we are just at the beginning the road, it is hard to see just where all of this technology will lead.  But one thing is certain, the transistor has had one of the most profound influences on society of any modern human invention.

## 1.         1909: Development of Wireless Telegraphy – Ferdinand Braun and Guglielmo Marconi

And number one on the list is Braun’s and Marconi’s invention of radio.  ‘Wireless Telegraphy’, as the Nobel Committee called it, ushered in telecommunications and all that comes in its train: television, cell phones, radio, wireless routers, etc..  Man is a social creature and language is the glue that holds society together.  Telecommunications allows the reach of language to go further linking far-away with close-to-home.  No other entry in this list has done more to knit the world together as a single family.

## Dynamical Systems and Constraints: Part 4 - Bead on a Wire Revisited

In the last two posts, the role of constraints in a mechanical system and the methods for incorporating them have been explored. Among the things that were demonstrated were the facts that the Newton approach was ill-suited to the inclusion of constraints while the Lagrangian method is better equipped, and that within the Lagrangian method there are two approaches for incorporating the constraints: 1) direct elimination in the event the constraints are holonomic, and 2) the use of the Lagrange multipliers. The earlier analysis of the pendulum showed how to analytically use the Lagrange multipliers and demonstrated, in detail, that the multiplier was the force of constraint.

This post returns to the problem of a bead moving down a wire and shows that, whereas the Newtonian approach fails, the Lagrangian approach works nicely. The starting point is the Lagrangian

with both $x$ and $y$ as dynamical degrees of freedom augmented with a Lagrange multiplier that enforces the constraint equation $y= f(x)$.

The conjugate momenta are

and

The corresponding generalized forces are

where $f'$ is the derivative of the constraint function $f(x)$ with respect to $x$, and

The equations of motion are

and

Since the ratio of the Lagrange multiplier to particle mass appears in each equation (i.e., the constraint force per unit mass), it is natural and convenient to define $\lambda_m = \frac{\lambda}{m}$ and to work with that quantity going forward.

Since the $y$ degree of freedom is related to $x$ through the constraint, there is a kinematic relationship that requires

and

To determine the Lagrange multiplier, equate the dynamic expression for $\ddot y$ with the kinematic one to get

The next step eliminates the $x$ acceleration by substituting-in its equation of motion giving

which can be easily solved for

This expression of the constraint force per unit mass is similar to the expressions in the earlier analysis but differs in one key way with the inclusion of the velocity-dependent piece proportional to $\dot x^2$. This constraint acceleration is now equipped to generate additional force (not just the expected normal force on the surface) as numerical errors move the motion off of the constraint. Of course, this is not a miraculous result and the constraint had been used to link the $x$ and $y$ equation of motion, but it is cleaner than the earlier method, it provides the constraint force, and, most important, it provides a mechanism for calculating the kinematics in a more convenient way than the solve-for-$x$-and-then-deduce method of that earlier post. As a reminder, those state-space equations were

where $\tau = (1+f'^2)^{-1}$.

The Lagrange multipliers form of the equations of motion are also easy to code and couple to a standard solver. As in the earlier post, I used the Python and SciPy ecosystem with the primary function being:

def roller_coaster_multipliers(state,t,geometry,g):
#unpack the state
x  = state[0]
y  = state[1]
vx = state[2]
vy = state[3]

#evaluate the geometry
f,df,df2 = geometry(x)

#calculate the lagrange multiplier per unit mass
lam_m = ( df2*vx*vx + g )/(1.0 + df*df)

return np.array([vx,vy,-lam_m*df,lam_m - g])


where the function geometry can be any curve in the $x-y$ plane. For example, for a bead sliding down a parabolic-shaped wire $y = x^2$, the corresponding function is

def parabola(x):
return x**2, 2*x, 2


and the assignment

  geometry = parabola


provides this shape to the solver.

The results for the motion are

showing excellent agreement with the constraint for over two oscillations and good agreement with energy conservation.

This approach offers a nice alternative to the earlier 1-D approach in that all the dynamical degrees of freedom are present and, as a result, the kinematics are cleaner. In addition, the actual constraint force, taking into account deviations away from the surface, is computed. This is an important feature and one that can't be obtained otherwise, since we can only compute the normal force at the surface and not away from it on either side.

## Dynamical Systems and Constraints: Part 3 - Simple Pendulum

In the spirit of simplicity embraced by the last post, this month’s analysis also deals with motion in one-dimension. As shown in that analysis, the naive application of Newton's laws to motion of a bead along a curved wire failed except in the simplest of cases – that of motion along a straight line (i.e., essentially down an inclined plane). Instead, Lagrange’s method was used where one of the dynamical degrees of freedom was eliminated by direct substitution of the constraint into the Lagrangian. This post examines another (deceptively) simple problem, the motion of a bob at the end of an ideal pendulum, and compares and contrasts direct substitution of the constraint with the use of Lagrange multipliers. The idea is to examine how Lagrange multipliers work and what information they provide in a problem where one can easily understand all the ‘moving parts’. The content of the post is inspired by the fine write-up on Lagrange multipliers by Ted Jacobson for a course on classical mechanics. I've taken Jacobson's basic discussion on Lagrange multipliers and extended it in several directions; nonetheless, the insights on that particular topic are due to him.

To begin, assume that the pendulum bob can move in the plane subject to the forces of gravity and of an inextensible rod of length $R$ to which the bob is attached.

The Lagrangian for arbitrary two-dimensional motion, subjected to gravity along the $x$-direction, expressed in plane polar coordinates is

where the zero of the potential energy is taken to be the lowest point in the bob's motion and where the constraint $\rho = R$ has not yet been used.

Before digging into the Lagrangian method, it is worth taking a step back to see how this system is described in the Newtonian picture. The forces on the bob at any given point of the motion in the plane lead to a vector equation that is determined by the well-known form of the acceleration in terms of the radial and azimutal unit vectors

Applying Newton's laws to the force vectors shown in the figure gives

and

Because the radius is fixed at $\rho = R$, the kinematics of the pendulum require $\dot \rho = 0$ and $\ddot \rho = 0$. The first equation provides a mechanism for calculating $T$ as

The second equation gives the classic equation of motion for the pendulum

These simple equations will help in interpretting the Lagrangian analysis.

## Directly Impose the Constraint

Since the constraint is holonomic, it can be directly imposed on the Lagrangian, allowing for the radial degree of freedom to be eliminated, yielding

The conjugate momentum is

the generalized force is

and the resulting equation of motion is

exactly the result obtained from Newton's laws, where computation of $\hat e_{\rho}$ and $\hat e_{\phi}$ and slogging through a vector set of equations was needed.

It is worth noting how much easier it was to get to these equations than the path through Newton's laws.

## Using Lagrange Multipliers

The alternative way of handling the constraint is by leaving both degrees of freedom (radial and azimuthal) in the Lagrangian, augmenting it with the dynamical constraint. The key is in determining how this augmentation is to be done.

Jacobson argues from analogy with multi-variable calculus. In order to optimize a function $f(x,y,z)$ subject to a constraint $C(x,y,z) = 0$, one must limit the gradient of the function so that

Since $C=0$,

This manipulation allows for the combination $d(f + \lambda C)$ to now represent an unconstrained optimization problem. In the event there are multiple constraints, each constraint gets a corresponding Lagrange multiplier and is added to the function.

The generalization to variational problems comes by identifying $f$ as the action $S$. The interesting part is the generalization required to include the constraints. Since the constraint is active at each point of the motion, there are an infinity of constraints, and each one must be included in the action at each time by integration. The resulting unconstrained variational principle is

The new Lagrangian is

and the azimuthal equation

To these equations must be added the constaint to yield

and

Thus we can identify $\lambda = T$, the force of constraint.

This is a general feature of the Lagrange multipliers. They incorporate the constraint into the variation and we are free to ignore it if we want only the dynamics, or we can use $\lambda$ to determine if it is explicitly desired.

## Dynamical Systems and Constraints: Part 2 - Bead on a Wire

This post begins a multi-part exploration of the role of constraints in mechanical systems. It builds on the general framework discussed in the last post but does so by examining specific examples. This month the focus will be on the simplest example of a constraint: a bead sliding on a wire.

Despite its trivial appearance, this problem is surprisingly complex due to the presence of the constraint and, as will be shown below, provides significant obstacles to obtaining an accurate numerical solution.

The following figure shows a portion of the wire, the position, $\vec r$, and the velocity, $\vec v$, of the bead, and the two forces acting upon it; namely the force of gravity ${\vec F}_g$ and the normal force ${\vec F}_N$. With this information, it is, in principle, easy to solve for the motion of the bead via Newton's laws. There are several equivalent ways of carrying this out but I'll be focusing on the most geometric of these.

The position of a particle in a plane is given by

Since the bead must remain on the wire, only those values of $y$ satisfying $y = f(x)$ are permissible and the position can now be written as

The velocity is then given by

The normal to the curve, $\hat N$, is obtained from the requirement that motion consistent with the constraint must always be perpendicular to the normal to the wire at the point in question:

Using the constraint equation $y - f(x) = 0$ and then taking the differential yields the component form

from which it is easy to read off the components of the normal (the sign being chosen so that $\vec F_N$ is in opposition $\vec F_g$. Subsequent normalization gives

The forces on the bead are due to gravity

and the normal force

For notational convenience in what follows, define

Summing the forces and inserting them into Newton's second law leads to the following equations (eliminating $m$ from both sides)

Expressing Newton's equations in state-space form then yields

These equations are easy to code and couple to a standard solver. I chose to implement this system in Python and SciPy as follows:

def roller_coaster(state,t,geometry,g):
#unpack the state
x     = state[0]
y     = state[1]
vx    = state[2]
vy    = state[3]

#evaluate the geometry
f,df,df2 = geometry(x)
tau      = 1.0/(1.0+df**2)

#return the derivative
return np.array([vx,vy,-g*tau*df,-g*tau*df**2])


where the function geometry can be any curve in the $x-y$ plane. For example, for a bead sliding down a parabolic-shaped wire $y = x^2$, the corresponding function is

def parabola(x):
return x**2, 2*x, 2


and the assignment

  geometry = parabola


provides this shape to the solver.

The first case studied with this formalism was where the wire shape is a simple straight segment, parallel to the edge of an inclided plane and refered to as such for the sake simplicity. The resulting motion was consistent with the shape

and the energy

was very well conserved.

These good results failed to hold when the shape of the wire was made parabolic. The resulting motion stayed on the parabola for a short time but eventually departed from the constraint by dropping below

while simultaneously showing a large deviation

from the initial energy.

The situation was no different for other geometeries tried. The problem seems to lie in the fact that the normal force is unable to respond to numerical errors. As the constraint begins to be violated, the wire is unable to conjure additional force to enforce it. The Newton method, while attractive from its simplicity, is simply inadequate for working with all but the most trivial of constraints.

The usual way of addressing this problem is to eliminate one degree of dynamical freedom. Either $x$ or $y$ has to go and the conventional approaches seem to favor $x$ as the parameter that stays. Again, there are several ways to effect this change but in this case both are worth examining.

In the first method, Newton's equations are used to eliminate the $y$ in favor of $x$ by combining their equations of motion by first expressing

and then by inserting the left-hand side into the $y$ equation yielding

Solving this to give

and then using

and

The final equation, after substitution for $\ddot y$ has been made and $\ddot x$ has been isolated, is

In the second method, the constraint is put into the Lagrangian before forming the Euler-Lagrange equations. The resulting Lagrangian becomes

The conjugate momentum is

and the resulting Euler-Lagrange equation is

Simplifying a bit yields

which happily is the same equation as obtained from Newton's laws.

The last step is to solve this differential equation. Rewriting this equation in state-space form gives a simple system to solve numerically:

Like the original Newtonian system, this one was easily coded up into Python.

def roller_coaster_Lagrange(state,t,geometry,g):
#unpack the state
x  = state[0]
vx = state[1]

#evaluate the geometry
f,df,df2 = geometry(x)
tau      = 1.0/(1.0+df**2)

return np.array([vx,-df*tau*(g+df2*vx**2)])


The constraint is always honored since $y$ is obtained after the fact from $x^2$.

In addition, the conservation of energy

shows that there wasn't any error in either the theory or the numerical implementation.

However, the key question is, does the solution accurately capture the motion by correctly placing the bead at $(x,f(x))$ at the correct time? There is no easy way to answer this question. The plot of the $x$ component as a function of time

certainly suggests that the Lagrange implementation is correctly capturing not just the dynamics but also the kinematics. That said, a more complete justification will have to wait for another time and another post.

## Dynamical Systems and Constraints: Part 1 - General Theory

Constraints in dynamical systems are important in physics for two reasons. First, they show in a variety of real-world mechanical systems with practical applications. In addition, the notion of a constraint, usually of a more abstract variety, is an essential ingredient in modern physical systems like quantum field theory and general relativity. This brief post is meant to cover the basic theory of constraints in classical physics as a precursor to some future posts on specific systems. The approach and the notation is strongly influenced by Theoretical Mechanics by Murray Spiegel, although the arrangement, presentation, and logical flow are my own.

The proper setting for understanding constraints in classical mechanics is the Lagrangian point-of-view and the primary idea for carrying out this program is the notion of generalized coordinates. By expressing the work-energy theorem in generalized coordinates, we will not only arrive at the Euler-Lagrange equations but will find the proper place in which to put the constraints into the formalism through the use of Lagrange multipliers.

The kinetic energy for a multi-particle system is expressed in Cartesian coordinates as

where the index $n$ labels each particle.

The partial derivatives of $T$ with respect to the generalized coordinate and velocity are:

and

Lagrange's cleverness comes is recognizing two somewhat non-obvious steps. First is that one can simplify the last expression by noting that

which lead immediately to the cancellation-of-dots that says

This results then simplifies

The second of Lagrange's steps is to recognize that taking the total time-derivative of that same term yields

The second part of this expression nicely cancels the derivative of the kinetic energy with respect to the generalized coordinate and so we arrive at the identity

Defining the left-hand side compactly as

gives

Like Lagrange, we can conclude that the presence of the accelerations in the right-hand side is related to the sum of the forces. However, it would be incorrect to conclude that each individual force can be teased out this the right-hand side as will be discussed below.

Before trying to make sense of the $Y_{\alpha}$'s, examine the forces in generalized coordinates via the differential work, which is given by

Regrouping this last term gives the final expression

Forming the sum

yields an expression for the differential work as well, since from the Work-Energy theorem $dW = dT$.

Subtracting these two expressions gives

One might be tempted to conclude that $Y_\alpha = \Phi_\alpha$ but this would be erroneous. All that can be concluded is that the projections of each of the two terms along the constrained surface are the same. The following figure shows a simple geometric case where two vectors (blue and green) (denoting same the generalized accelerations and the generalized forces) are both perpendicular to the red vector (denoting the motion of the system consistent with the constraint) without being equal. If, and only if, there are no constraints in the system does $Y_\alpha = \Phi_\alpha$.

If the constraints are integrable, that is to say they can be expressed in such a way that one generalized coordinate can be eliminated by expressing it in terms of the others, then a straight substitution will eliminate the constraint and no additional steps are needed. If that step can't be carried out, then the use of Lagrange multipliers are required as follows.

Suppose that there are $m$ constraints of the form

and, that any instant, the constraints are satisfied so that the $dt$ term can be set to zero. Then these constraint equations can be added together in an undetermined, as yet, linear combination

These $m$ equations augment the original equation to yield the augmented form

These equations, when regrouped, form two systems of equations. The first system can be regrouped and rewritten, using the constraint equations, so that we can eliminate $m$ of the $dq_\alpha$ (say $dq_1,\ldots,dq_m$) and then solve that system for the Lagrange multipliers. Returning to the augmented form, we can set the coefficients of the dependent terms equal to zero. Now the augmented form reduces to

where the prime indicates that the sum is restricted to the independent terms, which allows us to immediately conclude that

for all $\alpha$, both the dependent and independent set.

Next post, I'll explore this framework with some instructive examples taken from the classical mechanics literature.

## Energy and Hamiltonians: Part 3 - Examples

This post, building on the work of the previous two posts, investigates the six cases possible in which $E=h$ or $E\neq h$ and in which either one, the other, neither, or both are conserved. The examples are generally taken from simple mechanical systems in an attempt to illustrate the underlying physics without bringing in an unwieldy set of mathematics.

As a reminder, the energy $E$ is always defined as the sum of kinetic and potential energies and the function $h$ reflects not only the underlying physics (i.e., $E$) but also the generalized coordinate system in which it is described. The power of the generalized coordinates will allow us, in certain circumstances, to find conserved energy-like quantities that are not the energy proper. This is the power and complexity of generalized coordinates and it is this feature that makes it fun.

## Case 1 - $E \neq h$, both conserved: Free Particle with Co-moving Transformation

Consider the free particle. The Lagrangian for the system is given by

The conjugate momentum and the generalized force are

and

respectively. The equation of motion is the familar

The conserved quantity $h_x$ is the energy, which is entirely kinetic, given by:

Now suppose that we define the generalized coordinate $q$ through the transformation $x = q + v_0 t$ and $\dot x = \dot q + v_0$. The corresponding Lagrangian is

with the corresponding equation of motion

The conserved quantity $h_q$ is given by

From the transformation equation, $q = v_0$ and so $h_q = 0$. The physical interpretation is that, by transforming (via a Galilean transformation) into a frame co-moving with the free particle, the conserved quantity is identically zero, a value different from the energy in the original frame.

## Case 2 - $E \neq h$, $E$ conserved: Free Particle with Accelerating Transformation

Building off of the last case, again look at the free particle and consider the transformation into an accelerating frame given by

with

The transformed Lagrangian is

The corresponding $h_q$ is

which, since it is time varying, is not conserved.

## Case 3 - $E = h$, both conserved: Planar SHO

An example where the value of the energy $E$ and the function $h_q$ can take on dramatically different forms and yet yield the same value is supplied by the planar simple harmonic oscillator. In this case, a mass is found in cyclindrically symmetric harmonic potential $V = 1/2 \, k(x^2 + y^2)$.

The corresponding Lagrangian is given by

with the equation of motion being

This equation has well known analytical solutions, but for a change of pace, I implemented numerical solutions of these equations with a Jupyter notebook. The state-space equation took the form of

Under the point transformation to plane polar coordinates

and

(where $\omega_0 = \sqrt{k/m}$), the Lagrangian becomes

The resulting Euler-Lagrange equations are

and

which are interpretted as the equation of motion of a one-dimensional radial oscillator and the conservation of the angular momentum in the $x-y$ plane. The corresponding numerical form is

The initial conditions for the simulation were

and

for Cartesian and polar forms, respectively.

After the trajectories had been computed, the ephemerides were fed into functions that computed the Cartesian energy

and

as appropriate. As expected, the values for the energy, arrived at in two independent ways, are identical

## Case 4 - $E \neq h$, $h$ conserved: Constrained Planar SHO

In constrast to the Case 3, the constrained planar simple harmonic oscillator executes its motion subject to the constraint

where $\omega$ is the driving frequency. The mechanical analog would be something like the following picture

where a mass on a spring is forced by a motor that keeps the oscillator rotating at a constant angular rate.

The Lagrangian in this case is given by

Note that there is no easy way to express the Lagrangian in Cartesian coordinates - another example of the power of generalized coordinates.

The energy of the system is

but

The minus sign makes all the difference here. Since there is a motor in the problem, the expectation is that the energy should not be conserved, even though the structure of Lagrangian mechanics guarentees the conservation of $h$ (since it is not time-dependent).

While there are analytic solutions to these equations, I again opted for a numerical simulation. The equation of motion is

where $\omega_0^2 = k/m$.

The resulting dynamics show the radial motion oscillating sinusoidally

about an equilibrium value determined by

The graph of the energy and $h_r$ shows clearly that the energy is not conserved, even though $h_r$, as expected, is

## Case 5 - $E \neq h$, neither conserved: Damped SHO

The final example is the damped simple harmonic oscillator. The Lagragian is given by

which results in the equation of motion

The energy,

is clearly not conserved and the $h$ function,

is also not conserved since it depends on time.

## Case 6 - $E = h$, neither conserved

This last case is something of a contrivance. As may have been guessed, by the above examples, finding a case where $E=h$ and yet both are not conserved is not particularly easy to do. Rotating drivers that leave a dynamical condition like $\dot \theta = constant \equiv \omega$ lead to a conserved $h$ since the transformation to a rotating frame leaves the system in a co-moving configuration, despite the time varying nature of the transformation. Equally frustrating is trying to incorporate dissipation into the problem. Dissipation generally leads to a different form for $h$ compared with $E$ as Case 5 demonstrates.

The one example that does work, is by defining the potential energy as

and the Lagrangian as

resulting in the equation of motion

Both the energy and $h$ are given by

which is clearly not conserved.

## Energy and Hamiltonians: Part 2 - Conservation Considerations

Last month, I covered the basic ground work that established under what conditions generalized coordinates can be used in the Lagrangian formulation. Generalized coordinates enable vast simplifications of the analysis of mechanical systems and provide for deep insight into the possible motion. The analysis found that the permissible transformations are the point transformations

since these leave the Euler-Lagrange equations of motion invariant.

The next step is to examine what these transformations do to the expression of the energy. Note that no transformation can 'destroy' energy conservation - if the energy is conserved in one set of coordinates it is conserved in all coordinate systems. But it is possible that a poor transformation will hide the conservation of energy and that a clever transformation will reveal the conservation of a related quantity (the function $h$ to be discussed shortly) even when the energy is not conserved.

In Newtonian mechanics, the energy is given by the sum of the kinetic and potential energy (each possibly depending on $q^i$, ${\dot q}^i$, and $t$)

The corresponding quantity in Lagrangian dynamics is the $h$ function defined as

with $L = T - V$.

At first glance, it isn't at all obvious why $h$ should ever be conserved and why it can be identified as the energy, in certain circumstances. To see the conservation properties of $h$ examine its total time derivative given by

The second and fourth terms cancel and the first and third terms can be grouped so that the expression now reads

Physical motion requires the first term, which are the Euler-Lagrange equations, to be zero, leaving the very nice expression that

Thus, under some very general circumstances, $h$ is conserved if the Lagrangian, in some coordinate system, doesn't have any time dependence. Since the allowed point transformations can be time-dependent, it is possible for $h$ to be conserved in certain generalized coordinate systems and not in others. This is the genesis of the confusion of between $E$ and $h$ discussed in the previous post.

Having answered the first question about the conservation properties of $h$, we now turn to when $h$ can be identified as the energy $E$. In order not to obscure the physical points, for the bulk of this post 1-$d$ systems will be examined with the generalized coordinates being denoted as $x$ (starting Cartesian coordinate) and $q$ (arbitrary generalized coordinate). At the end, this restriction will be relaxed.

The kinetic energy, given by

will be the key parameter, with the potential energy just coming along for the ride.

Expanding on Morin (http://www.people.fas.harvard.edu/~djmorin/chap15.pdf), we examine three cases: 1) the limited point transformation $x = x(q)$, 2) the full point transformation $x = x(q,t)$, and 3) the 'forbidden' phase-space transformation $x=x(q,{\dot q})$. In each of these cases, we exploit the fact that the transformations are invertible and so either $x$ or $q$ can be regarded as the base coordinate. Assuming the Cartesian $x$ as a function of the generalized coordinate $q$ is simply a convenient choice.

# Case 1: $x = x(q)$

This case is commonly encountered when the geometry of the system better matches cylindrical, spherical, or some other curvilinear coordinate system. The velocities are related by

leading to the kinetic energy transforming as

Defining

the expression for $h$ becomes

Simplifying yields

So in the case where the point transformation is limited to exclude time dependance, $h = E$. Since the motion is derivable from a potential, we can immediately conclude that the energy is conserved.

# Case 2: $x = x(q,t)$

This case is commonly encountered in rotating systems where a time-dependent transformation puts the viewpoint coincident with a co-rotating observer. This is the trickiest of all the physical situations since the energy may or may not be conserved independently of what $h$ is doing. These cases will be examined in detail in future posts.

The velocities are related by

leading to the kinetic energy transforming as

Expanding and simplifying yields the kinetic energy in generalized coordinates as

or, more compactly as,

Once again defining

the expression for

Simplifying gives

So, in this case, $h$ is not identified with the energy. Whether $h$ or $E$ are conserved is a matter for case-by-case analysis.

# Case 3: $x = x(q,{\dot q})$

This final case is more of a curiosity rather than a serious exploration. The case is 'forbidden' by last month's analysis since it doesn't preserve the transformation properties of the Euler-Lagrange equations. Nonetheless, it is interesting to see what would happen to $h$ should this transformation be allowed.

The velocities are related by

leading to the kinetic energy becoming

Following the same process above yields

Finally, a note on relaxing the restriction to 1-$d$. In multiple dimensions, the velocities in the most general case transform as

Following the same method as above, the kinetic energy will take the form

using the same shorthand notation as above.

The only new feature (beyond the trivial decoration of the $q$s with an index and the implied summation) is the cross-term $\mu^{ij} {\dot q}^i {\dot q}^j$. These terms will disappear (as do all the terms linear in a given $q$) from $h$ since

will yield a positive term

that is exactly cancelled by the term coming from $-L$. Everything else follows through as in the 1-$d$ case.

## Energy and Hamiltonians: Part 1 - Generalized Coordinates

One of the most confusing aspects of Lagrangian/Hamiltonian mechanics centers on the energy $E$ and its relationship to the function $h$ (or to the Hamiltonian $H$, to which it is closely related). The key questions are, to what extent these two terms can be identified with each other, and which of them is conserved and in which circumstances. The answers have deep implications into the analysis of the possible motions the system supports. The opportunity for confusion arises because there are many different possibilities that show up in practice.

The aim of this and the following columns is to delve into these cases in sufficient detail to flesh out the six logically possible cases that result from $E$ being either conserved or not, $h$ being either conserved or not, both subject to the possible constraint that $E=h$.

Before going into specific analysis of this case or that, it is important to get an overview as to how these various cases result in practice. Central to the variety of possibilities is the notion of generalize coordinates and the application of dynamical constraints.

As discussed in great detail in Classical Mechanics by Goldstein (especially Chapter 1), generalized coordinates are what provide the Lagrangian method with its power by enabling it to rise above Cartesian coordinates and into a frame better adapted to the geometry of the dynamics. For example, central force motion is best described in plane-polar coordinates that not only make the computations easier but also provide a simple method of imposing constraints like, the motion must take place at a fixed radius (e.g., a bead on a loop of wire). The number of generalized coordinates can always match the number of degrees of freedom, a condition that usually can't be fulfilled in Cartesian coordinates if there are constraints. Generalized coordinates serve as the cornerstone in understanding rigid body motion - the three Euler angles mapping directly to the three degrees of freedom such a body has to move about its center of mass.

The price for such freedom is that the universe of possible conserved quantities must be enlarged from the obvious ones evident in elementary mechanics to a broader set. This expansion in scope leads to the need to distinguish between the energy $E$ and the function $h$. One of the most important and famous examples of this is the conservation of the Jacobi constant in the circular-restricted three-body problem.

But, before launching into an analysis that compares and contrasts $E$ with $h$, it is important to establish some basic results of the Lagrangian method.

The first result is that the Lagrangian equations of motion are invariant under what Goldstein calls a point transformation

relating one set of generalized coordinates $q^i$ to another $s^j$ (note that Cartesian coordinates are a subset of generalized coordinates).

Transformation of the Lagrangian $L(q^i,t)$ to $L'(s^j,t)$ starts with determining the form of the generalized velocities

($\dot f \equiv \frac{d}{dt} f$).

From this relation comes the somewhat counterintuitive 'cancellation-of-the-dots' that states

A needed corollary of the cancellation-of-the-dots results from applying the total time derivative to both sides of the above relation. The right-hand side expands as

Switching the order of partial derivatives (always allowed for physical functions) and regrouping terms gives the right-hand side (noting along the way that $\partial {\dot s}^k / \partial s^j = 0$) as

The next step is to grind through the equations of motion, which is straightforward if a bit tedious. Start by labeling the Lagrangian that results after the substitution of the point transformation as

The coordinate-derivative piece of the equations of motion gives

The velocity derivative piece of the equations of motion gives

Combining these pieces into the Euler-Lagrange equations gives

Cancellation-of-the-dots and its corollary tells us that the second and fourth terms on the right-hand side cancel, leaving

Since the original equations of motion are zero and the Jacobian of the point transformation is not, we conclude that

which informs us that the Euler-Lagrange equations are invariant under point transformations. This result is our hunting license to use any coordinates related to Cartesian coordinates by a point transformation to describe a system's degrees of freedom.

The next result is that the Lagrangian is invariant with respect to the addition of a total time derivative. In other words, the same equations of motion result from a Lagrangian $L$ as do from one defined as

The proof is as follows. Applying the Euler-Lagrange operator to both sides of the definition of $L'$ gives

Invariance of the equations of motion requires the last two terms to cancel. The easiest way to see that they do is to first concentrate on the expansion of the total time derivative of $F$ as

The third term above then becomes

Cancellation of the last two terms now rests on the recognition that

a fact easily established by realizing that the total time derivative is a specific sum of partial derivatives and that partial derivatives commute with each other.

This result allows us to play with the form of the Lagrangian in an attempt to find the simplest or most compact way of describing the motion of the physical system.

Next month's column will use the general structure of this formalism as a guide to the connection between the energy $E$ and $h$. The columns in the following months will apply the two results derived here to various physical systems in order to illustrate the various cases for $E$ and $h$ outlined above.

## Action and Reaction

Action and reaction; Newton's 3rd law tells us that for every action there is an equal an opposite reaction. Nature, in effect, is a banker whose accounts are carefully balanced according to the principle of double-entry bookkeeping. Modern theories emphasize this point of view by saying that at the fundamental level all interactions are mediated by the exchange of 'force quanta' that transfer energy, momentum, and angular momentum. So all is well - or is it?

Often textbooks muddy the water of this 'simple' situation by engaging in nuanced discussions and citing exceptions without emphasizing two points. One is that action-reaction is always honored. The second is that it may require careful logic supported by exact experiements to rescue new cases that seem to be the exception.

Now, these nuances and exceptions, which roughly fall into three prototypical classes, are real and need to be addressed. Each is interesting in its own right, helping to illustarte and explore the richness of the action-reaction concept, but the 'answer' for each of these cases is always the same - there are degrees of freedom that are being ignored that balance the action-reaction.

Case 1 - Dissipative forces

Friction is one of the earliest examples of what is discussed in which action-reaction seems to be violated. Energy is bled out of the system and the motion eventually comes to a halt. The prototype example is the damped harmonic oscillator. Its equation of motion is given by

with the general solution

with $\gamma = \Gamma/2m$ and $\omega = \sqrt{\Gamma^2/4 m^2 - k/m}$.

It is natural to ask, where does the energy go? In this model, the energy disappears with no account as to where and how. Of course, the accepted answer is straightfoward enough. The energy goes into heat; into degrees of freedom not directly tracked by this model. There is balance on the action-reaction front. There is a reaction force of the oscillator on the environment and it is this force which transfers the energy to the surroundings.

While this accepted answer is known to virtually everybody who's taken a physics class, it is worth pausing to consider that it wasn't neccessarily obvious to those who first grappled with this problem, nor is it obvious to the beginner who is introduced to it. In fact, it is very difficult to find a detailed argument that traces through all of the evidence in a logical fashion. It is simply common knowledge supported by our belief in the balance of action and reaction.

Case 2 - Driving forces

The second case, which is actually closely related to the first, involves a driving force that maintains some aspect of the system. Energy, momentum, and/or angular momentum enter and leave the system without being explicitly tracked. The best example of such a system is the circular restricted three-body problem. There are numerous applications of this model to spaceflight, with many missions exploiting the $L_1$ and $L_2$ Lagrange points.

In the circular-restricted three-body problem, two massive bodies are in orbit around their common barycenter. This motion honors action-reaction perfectly with the gravitational force between the bodies being equal and opposite. The 'forced motion' which breaks action-reaction, takes place in the idealization where a test particle is introduced into the system. The test particle is thought to be so small that its back-reaction on the motion of the two bodies can be ignored. The equation of motion is

with the motion of the primaries given by

and

and with

That the motions of $m_1$ and $m_2$ are set by their interaction with no influence from the test particle is the source of the driving force. So, by construction, the model violates Newton's third law.

But often treatments of the circular-restricted three-body problem confuse this simple point by dwelling on the conservation of the Jacobi constant. This conservation is clearly an important point as it helps in the understanding of the global properties of the solutions. Nonetheless, discussions about its importance rarely emphasize that while the model violates action-reaction, real physical situations do not. The conservation of the Jacobi constant is a useful tool but it does not actually occur in nature. Rather a close approximation results. While this distinction may seem subtle in theory it is quite important in practice - real physical situations cannot be expected to exactly conserve the Jacobi constant.

Case 3 - Velocity-Dependent Forces

The case of velocity-dependent forces is perhaps the most confusing member of this list. There are two reasons for this. The first is that the physics literature usually employs this term to mean only one kind of velocity-dependent force: the force of magnetism. Second, the force of magnetism cannot be studied without direct analysis of the fields that act as intermediaries. This latter point should be one of clarity in the community when in fact it is often one of confusion.

Case in point, the discussion in Chapter 1 of Goldstein's Classical Mechanics concerning the weak and strong forms of action-reaction pairs. Goldstein has this interesting point to make about action and reaction

...[T]he Biot-Savart law indeed may violate both forms of the action and reaction law.*

He goes on to say that:

Usually it is then possible to find some generalization of ${\mathbf P}$ or ${\mathbf L}$ that is conserved. Thus, in an isolated system of moving charges it is the sum of the mechanical angular momentum and the electromagnetic "angular momentum" of the field is conserved.

Of course, this is eactly the well-known conservation law discussed in a previous post on Maxwell's equations. There is no reason to erode the confidence in the reader by the inclusion of quotes around the field angular momentum.

The situation is somewhat relieved by a footnote that attempts to give concrete scenarios. There are two of them and this post will deal with them in turn.

The first part states

If two charges are moving uniformly with parallel velocity vectors that are not perpendicular to the line joing the charges, then the mutual forces are equal and opposite but do not lie along the vector between the charges

which, diagrammitically, looks like

Since there are only two particles, the most economical description has their motion take place in a plane. The parallel velocity vector is

and the separation vector between them is

The combined electrostatic and magentic force on particle 1 due to particle 2 is

Substituting in the form of $\hat v$ and simplifing gives

Since the only thing that changes in computing the force on particle 2 due to particle 1 is the sign on the relative position vector it is easy to see that

and action-reaction is honored. However, this force is action-reaction in its weak form where the forces don't lie along the vector joining the two particles. This conclusion is best supported by computing the $x$-component of the force

which is zero for only a few isolated angles.

The second part of the footnote states

Consider, further, two charges moving (instantenously) so as to "cross the T", i.e., one charge moving directly at the other, which in turn is moving at right angles to the first. Then the second charge exerts a nonvanishing force on the first, without experiencing any reaction force at all.

The scenario here can be pictured as

Now the statement in this footnote is clearly gibberish and it is corrected in the 3rd edition to change 'nonvanishing force' to 'nonvanishing magnetic force', so we'll focus solely on magentic force.

From the Biot-Savart law the magnetic force on particle 1 due to particle 2 is

while the force on 2 due to 1 is

Clearly action-reaction seems violated but strictly speaking this is not true. What should be emphasized is that there is a 'third body' in the mix, the electromagnetic field and that the action-reaction can be restored (although not quite point-wise) by looking for additional degrees of freedom.