Monthly Archive: May 2015

Mass-Varying Systems

Last week, the problem of the conveyor belt accumulating some material at a steady rate $$r$$ was examined. The motor driving the belt kept it moving at a constant velocity. The power expended by the motor was double the power needed to accelerate the material when it hits the belt. The natural question is: why is it exactly double?

This question provoked a lot of discussion in ‘The Physics Teacher’, starting with an article by Wu, which tried to give a straightforward answer to why the value is exactly $$2$$ and where the other half of the energy went. For the most part, the argument by Wu, while physically intuitive, is lacking in logical rigor.

The best argument was given by Marcelo Alonso who starts with the varying-mass form of Newton’s law

\[ m \dot {\vec v} + {\dot m} (\vec v - \vec u) = \vec F_{ext} \; . \]

In this equation, $$m$$ is the system mass, which, in the case of the conveyor problem, is the mass of the belt and the material on it at any given time. The vector $$\vec v$$ is the velocity of the system. The quantities $$\dot m = r$$ and $$\vec u$$ are the rate of change of the system mass (more on this later) and the velocity of the mass that either enters or leaves the system, respectively. Justification for this equation will be given below.

The power imparted by the external force $$\vec F_{ext}$$ is

\[ {\mathcal P} = \vec F_{ext} \cdot \vec v = m \, \dot {\vec v} \cdot \vec v + \dot m (\vec v \cdot \vec v - \vec u \cdot \vec v) \; .\]

For the conveyor belt problem, the material enters with a very small velocity $$\vec u$$ that is perpendicular to the system velocity $$\vec v$$ so the last term vanishes. Since the motor keeps $$\vec v$$ constant it can be moved into and out of any total time derivative. The motor power is now

\[ {\mathcal P} = \dot m (\vec v \cdot \vec v) = \frac{d}{dt} \left( m \, \vec v \cdot \vec v \right) = \frac{d}{dt} ( 2 KE ) \; .\]

Viewed in this light, the factor of two is not so much mysterious as accidental. It is a consequence of the special circumstances that $$\vec u \cdot \vec v $$ is zero and $$\dot {\vec v} = 0$$. If the material were introduced to the belt in some other fashion, for example, thrown horizontally on the belt, the ratio of the power provided to the time rate of change of the material’s kinetic energy would have been different and its value wouldn’t attract special attention.
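The dependence on how the material arrives can be made concrete with a few lines of Python. This is only an illustrative sketch: the function name `power_ratio` and the sample velocities are made up for the example, and it assumes a constant belt velocity, so that the motor power is $$r(\vec v \cdot \vec v - \vec u \cdot \vec v)$$ while the kinetic energy on the belt grows at $$r \, \vec v \cdot \vec v / 2$$.

```python
def power_ratio(v, u):
    """Ratio of motor power to d(KE)/dt for a belt held at constant velocity v.

    v, u: 2-D velocity tuples for the belt and the incoming material.
    Uses P = r (v.v - u.v) and d(KE)/dt = r (v.v) / 2; the rate r cancels.
    """
    v_dot_v = v[0] * v[0] + v[1] * v[1]
    u_dot_v = u[0] * v[0] + u[1] * v[1]
    return (v_dot_v - u_dot_v) / (0.5 * v_dot_v)


belt = (2.0, 0.0)                          # belt moving along x (hypothetical value)
dropped = power_ratio(belt, (0.0, -0.3))   # material falling straight down
thrown = power_ratio(belt, (1.0, 0.0))     # material thrown along the belt at half its speed
print(dropped, thrown)
```

With these sample numbers the dropped material reproduces the factor of two, while material thrown along the belt at half the belt speed gives a ratio of one.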

All told, this is a mostly satisfactory explanation. The only problem is that the terse response Alonso provided in the form of a letter to the editor didn’t give enough details on how the varying-mass form of Newton’s law was derived.

Surprisingly, these details are difficult to find in most textbooks. Modern presentations don’t seem to care too much about varying mass situations although they are important in many applications. Systems in motion that suffer a mass change are almost everywhere in modern life. Examples range from cars that consume gasoline as they move, to ice accumulation on or water evaporation from industrial surfaces, to rocket motion and jet propulsion.

So in the interest of completeness, I derive the varying-mass form of Newton’s equations here.

There are two cases to consider: the system mass increases as material is accreted and the system mass decreases as material is ejected. The specific mechanisms of accretion or ejection are not important for what follows.

I will start with the accreting case just before a bit of matter $$\delta m$$ is introduced into the system mass. The initial momentum of the system and lump is

\[ \vec p_i = m \vec v + \delta m \, \vec u \; .\]

Just after this small lump has combined with the system, the final momentum is

\[ \vec p_f = ( m + \delta m)(\vec v + \delta \vec v) \; .\]

The change in momentum is the difference between these two quantities

\[ \Delta \vec p = \vec p_f - \vec p_i = ( m + \delta m)(\vec v + \delta \vec v) - m \vec v - \delta m \, \vec u \; ,\]

which simplifies to

\[ \Delta \vec p = m \, \delta \vec v + \delta m \, \vec v - \delta m \, \vec u \; .\]
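The simplification rests on the usual assumption that a product of two small quantities can be neglected. Expanding the product in the final momentum gives

\[ ( m + \delta m)(\vec v + \delta \vec v) = m \vec v + m \, \delta \vec v + \delta m \, \vec v + \delta m \, \delta \vec v \; , \]

so the $$m \vec v$$ term cancels against the initial momentum and the second-order term $$\delta m \, \delta \vec v$$ is dropped.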

This change in momentum is caused by the impulse due to the external force $$\vec F_{ext}$$

\[ \Delta \vec p = \vec F_{ext} \, \delta t \]

acting over the short time $$\delta t$$.

Equating the terms gives

\[ m \, \delta \vec v + \delta m \, \vec v - \delta m \, \vec u = \vec F_{ext} \, \delta t \; , \]

which becomes

\[ m \frac{\delta \vec v}{\delta t} + \frac{\delta m}{\delta t} ( \vec v - \vec u) = \vec F_{ext} \]

when dividing by the small time $$\delta t$$. Before this equation is taken to the limit and derivatives materialize out of deltas, let’s look at the ejection case.

For ejected mass, the system starts with the momentum

\[ \vec p_i = m \vec v \; .\]

After the lump with mass $$\delta m$$ is ejected with velocity $$\vec u$$, the final momentum takes the form

\[ \vec p_f = ( m - \delta m)(\vec v + \delta \vec v) + \delta m \, \vec u \;. \]

Following the same process above, the analogous equation

\[ m \frac {\delta \vec v}{\delta t} - \frac{\delta m}{\delta t} ( \vec v - \vec u) = \vec F_{ext} \]

results.
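For reference, the intermediate step can be written out, again neglecting the second-order term $$\delta m \, \delta \vec v$$:

\[ \Delta \vec p = ( m - \delta m)(\vec v + \delta \vec v) + \delta m \, \vec u - m \vec v = m \, \delta \vec v - \delta m \, (\vec v - \vec u) = \vec F_{ext} \, \delta t \; . \]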

The two equations look similar with only a small change in sign in front of the $$\frac{\delta m}{\delta t}$$ to hint at the physical difference of mass inflowing in the first and outflowing in the second.

Sign differences can always be problematic and more so when it comes to ‘in’ versus ‘out’. Confusion can easily insert itself in such cases. As an example where careful track of sign conventions is important, consider the presentation of mass-varying systems in Halliday and Resnick. They present the mass ejection case as a preliminary to a discussion of rocket propulsion but they arrive at an equation with the opposite sign from the one derived here. They get this difference by explicitly setting the sign of $$dm/dt$$ opposite to that of $$\delta m/\delta t$$.

A small change in one’s frame of mind and a careful attention to the difference between $$\delta m/\delta t$$ and $$dm/dt$$ is all that is needed to reconcile these differences. To go from the simple ratios of deltas to actual derivatives, note that by the initial construction $$\delta m$$ was a positive quantity, a small bit of matter. The time increment $$\delta t$$ is also positive. So the ratio of these two quantities is also positive. But the rate of change of the system mass $$m$$ can be either positive or negative. Despite the notational similarity between $$\delta m/\delta t$$ and $$dm/dt$$, the mass $$m$$ being addressed is not the same. The mass in the derivative expression is the system mass which can gain or lose mass and so $$dm$$ can be positive or negative.

For the accreted mass case, $$dm/dt = r > 0$$ and the appropriate limit is

\[ m \frac{d \vec v}{d t} + \frac{d m}{dt} ( \vec v - \vec u ) = \vec F_{ext} \; ,\]

or, with an eye towards combining the two cases,

\[ m \frac{d \vec v}{d t} + \left| \frac{d m}{dt} \right| ( \vec v - \vec u ) = \vec F_{ext} \; .\]

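For the ejected mass case, $$dm/dt = r < 0$$ and the appropriate limit is

\[ m \frac{d \vec v}{d t} - \left| \frac{d m}{dt} \right| ( \vec v - \vec u ) = \vec F_{ext} \; ,\]

with the sign on the second term now changing due to the inclusion of the absolute value. Since $$|dm/dt| = dm/dt$$ for accretion and $$|dm/dt| = -dm/dt$$ for ejection, both cases reduce to the single form $$m \, \dot {\vec v} + \dot m (\vec v - \vec u) = \vec F_{ext}$$ that Alonso starts from.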
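As a sanity check on the signs, the equation written with the signed rate $$dm/dt$$ can be integrated numerically for one accreting and one ejecting system and compared with results that are known independently. The sketch below is an illustration only: the function `integrate_velocity`, the midpoint stepping scheme, and all of the numbers are assumptions chosen for the example. The accreting case is a force-free cart sweeping up material at rest, for which momentum conservation gives $$v = m_0 v_0 / m$$; the ejecting case is a force-free rocket with exhaust speed $$v_e$$ relative to the rocket, for which the standard rocket equation gives $$v = v_e \ln(m_0/m)$$.

```python
import math


def integrate_velocity(m0, v0, mdot, u_of_v, f_ext, t_end, steps=20000):
    """Integrate m dv/dt + mdot (v - u) = F_ext in one dimension.

    mdot is the signed rate of change of the system mass (negative when
    mass is ejected) and u_of_v(v) is the velocity of the material that
    enters or leaves. Uses fixed-step midpoint (second-order) updates.
    """
    def accel(mass, vel):
        return (f_ext - mdot * (vel - u_of_v(vel))) / mass

    dt = t_end / steps
    m, v = m0, v0
    for _ in range(steps):
        k1 = accel(m, v)
        k2 = accel(m + 0.5 * mdot * dt, v + 0.5 * dt * k1)
        v += dt * k2
        m += dt * mdot
    return m, v


# Accretion: a force-free cart sweeping up material at rest (u = 0).
m_cart, v_cart = integrate_velocity(10.0, 2.0, 1.0, lambda v: 0.0, 0.0, 10.0)

# Ejection: a force-free rocket whose exhaust leaves at 3.0 relative to it.
m_rocket, v_rocket = integrate_velocity(10.0, 0.0, -1.0, lambda v: v - 3.0, 0.0, 5.0)

print(v_cart, 10.0 * 2.0 / m_cart)                 # compare with v = m0 v0 / m
print(v_rocket, 3.0 * math.log(10.0 / m_rocket))   # compare with v = v_e ln(m0 / m)
```

Both comparisons use the same function with only the sign of `mdot` changed, which is the practical payoff of writing the equation in terms of $$dm/dt$$ rather than $$\delta m / \delta t$$.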

The Conveyor Belt

The conservation of energy, which is so innocently applied in elementary applications, can actually be quite complicated when the system in question has mass flowing into or out of it. One of the most popular textbook questions for provoking thought on this topic is the conveyor belt.

Two well-known texts, ‘Mechanics’ 3rd ed., by Symon and ‘Physics’ by Halliday and Resnick, cover the conveyor belt problem, although in slightly different ways. Both start from a common setup of a belt moving at a constant velocity $$\vec v$$ onto which mass is dropped at a rate $$r = \frac{d m}{d t}$$ from a hopper above. The motor powering the conveyor applies a varying force $$\vec F$$ so that the constant velocity is maintained as the material mass $$m(t)$$ grows. For completeness, the mass of the belt is assigned a value $$M$$. Since the problem is one-dimensional, the explicit vector notation will be dropped in what follows.

The derivation starts with the mechanical momentum of the belt, defined as
\[ p(t) = [m(t) + M] v \; .\]

The time-varying motor force needed to maintain the constant velocity has to match the change in the momentum and thus
\[ F(t) = \frac{dp}{dt} = \frac{dm(t)}{dt} v = rv \; .\]

The power supplied by the motor is

\[ {\mathcal P} = F(t) v = v^2 \frac{dm(t)}{dt} = v^2 r \; , \]

which can be manipulated into a more familiar form since both $$v$$ and $$M$$ are constant, to yield

\[ {\mathcal P} = \frac{d}{dt}\left( m v^2 \right) = \frac{d}{dt} \left[ (m+M) v^2 \right] = 2 \frac{d}{dt} \left[ KE_{sys} \right ] \; . \]

In words, the motor supplies power that is twice as large as the rate of change of the kinetic energy of the system. Halliday and Resnick also derive this two-to-one relationship. Both texts then ask where the excess half of the power is going.

Of course, these authors expect that the student would infer where the energy has gone. His argument would start with the notion that the law of conservation of energy is cherished and believed to be correct in all cases. If half the supplied energy is easily found in the form of kinetic energy of the belt plus material the rest must be ‘hidden’ in a less obvious place. This missing other half must have been converted into some other form. Continuing on in this vein, the student would then conclude that a force is needed to accelerate each bit of mass introduced onto the belt and that that force can only be due to friction between the belt and the material so introduced. Once friction is introduced, it is a short and easy step to conclude that the extra energy is converted into the internal energy of the belt or the material.

Since this is the obvious path for such a venerable problem, it is hard to believe that there is any controversy surrounding this conclusion. But as Mu-Shiang Wu points out in ‘Note on a Conveyor-belt problem’, The Physics Teacher (TPT) 23, 220 (1986), the student is often puzzled why the missing energy happens to be half of the supplied power. By his presentation in that article, Wu also implies that the student is not satisfied with the broad and general conclusion that the missing half is dumped into heat; he also wants to see the explicit mechanism. Wu’s argument to explain this mechanism goes something like this.

Follow a bit of mass $$\delta m$$ as it falls from the hopper. Relative to an observer riding along the belt, this chunk is moving with velocity $$-v$$. Since this bit of mass must come to rest with respect to the belt over some period of time $$\delta t$$ there must be a frictional force $$F_f$$ that arrests the mass’s leftward motion. Wu posits the form of this frictional force to be

\[ F_f = \mu N = \mu \delta m g \; ,\]

where $$\mu$$ is the coefficient of kinetic (or sliding) friction, and $$N$$ is the normal force supplied by the belt. This kinetic friction force results in a constant acceleration $$a = \mu g$$. Using the standard kinematic relations for constant acceleration

\[ x_f = x_i + v_i \, \delta t + \frac{1}{2} a \, \delta t^2 \]

and

\[ v_f = v_i + a \delta t\]

and eliminating the time $$\delta t$$ it takes for the friction force to bring the chunk of mass to a stop (i.e. $$v_f = 0$$) leads to the total distance traveled as

\[ D = \frac{v^2}{2 a} \; .\]

The work done by the frictional force is then

\[ \delta W = F_f D = \frac{1}{2} \delta m v^2 \; , \]

and the power exerted is

\[ {\mathcal P_f} = \frac{\delta W}{\delta t} = \frac{1}{2} \frac{\delta m}{\delta t} v^2 = \frac{d}{dt} \left( \frac{1}{2} m v^2 \right) \; .\]

At this point, Wu stops with the statement

“It turns out that regardless of whether we assume 1 sec or 1/100 sec for the acceleration time, the thermal power developed by frictional forces between the belt and the sand is always exactly half of the supplied power.”
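The independence from the acceleration time can be checked with a short calculation. The sketch below is an illustration rather than Wu’s own calculation: it does the bookkeeping in the lab frame, assumes the lump lands with no horizontal velocity, and uses a hypothetical function `lump_energy_budget` with made-up values for the lump mass, belt speed, and friction coefficients.

```python
def lump_energy_budget(dm, v, mu, g=9.81):
    """Lab-frame energy bookkeeping for one lump brought up to belt speed.

    Assumes constant kinetic friction mu * dm * g and a lump that starts
    with zero horizontal velocity. Returns (motor work, kinetic energy
    gained by the lump, heat generated by sliding).
    """
    friction = mu * dm * g                       # friction force on the lump
    t_stop = v / (mu * g)                        # time for the lump to reach belt speed
    belt_travel = v * t_stop                     # distance the belt moves in that time
    lump_travel = 0.5 * mu * g * t_stop ** 2     # distance the lump moves from rest
    motor_work = friction * belt_travel          # work the motor does against friction
    kinetic_gain = friction * lump_travel        # work friction does on the lump
    heat = friction * (belt_travel - lump_travel)  # friction times slip distance
    return motor_work, kinetic_gain, heat


results = {}
for mu in (0.05, 0.5, 5.0):                      # slow, moderate, and fast acceleration
    work, kinetic, heat = lump_energy_budget(dm=0.2, v=3.0, mu=mu)
    results[mu] = (work, kinetic, heat)
    print(mu, heat / work, kinetic / work)
```

Changing $$\mu$$ by a factor of 100 changes the acceleration time by the same factor, yet under these assumptions the heat and the kinetic energy gain each remain half of the work done by the motor.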

Overall, I am suspicious of Wu’s argument. It has the attractive feature of having an explicit mechanism for the force that brings the material to rest on the conveyor belt but the use of the co-moving frame is done using a bit of a cheat. To demonstrate the roots of my suspicion, let me modify Wu’s argument starting just after the use of the constant acceleration kinematic equations. The total displacement (not distance) traveled is

\[ D = (x_f - x_i) = -\frac{v^2}{2 a} \; .\]

The work done by the frictional force is then (note the sign)

\[ \delta W = \int_{x_i}^{x_f} F_f \, dx = F_f (x_f - x_i) = -\frac{1}{2} \delta m v^2 \; .\]

Thus the chunk of mass loses energy in this frame and the power loss is

\[ \frac{\delta W}{\delta t} = -\frac{1}{2} \frac{\delta m}{\delta t} v^2 \]

or (taking the limit in the usual casual physicist style)

\[ {\mathcal P_f} = - \frac{d}{dt} \left( \frac{1}{2} m v^2 \right) \; .\]

The power lost by the object is exactly one half of the total power supplied by the belt. This loss is assumed to go into heat so the energy balance is satisfied in a hand-waving way but there is this pesky problem associated with the two different frames. So I don’t think the puzzle is resolved.

In the years after Wu’s initial article was published (1987-1990), a number of other authors published notes, critiques, and alternative ways of thinking about the conveyor belt problem. Judging by the different points of view expressed, the original unanswered question by Symon and Halliday and Resnick seems to have assumed a manifest truth that is not as obvious once one digs in as it is on the surface. Almost none of the arguments I’ve read in TPT have swayed me except for a letter by Marcelo Alonso in response to Wu’s original article.

Next week I’ll cover Alonso’s argument and some details about variable mass systems.

A Flux Transport Example

One of the more confusing things when I was learning vector calculus and classical field theory was the derivation of the flux transport theorem. The theorem relates the change in flux $$\Phi$$ due to a vector field $$\vec F$$ through a moving surface $${\mathcal S}$$ to certain integrals involving the time rate of change and the divergence of the field over the surface along with a contour integral around the surface’s boundary. Using the notation of ‘Introduction to Vector Analysis, 4th ed.’ by Davis and Snider, the flux transport theorem is given by:

\[ \frac{d \Phi}{d t} = \int_{\mathcal S} d \vec {\mathcal S} \cdot \left[ \frac{\partial \vec F}{\partial t} + (\nabla \cdot \vec F) \vec v \right] + \int_{\partial {\mathcal S}} d \vec \ell \cdot \vec F \times \vec v \; , \]

where $$d \vec {\mathcal S}$$ is the outward normal to a differential portion of surface area, $$\vec v$$ is the velocity at each point on the surface (recall it is moving), and $$d \vec \ell$$ is a differential line element along the boundary of the surface $$\partial {\mathcal S}$$.

Davis and Snider have some nice homework problems but I wanted an example where the surface changed size as well as moved in space. The example I concocted is one with a rectangular shaped surface whose four vertices are given by:
\[ \vec {\mathcal A} \doteq [t,-t,-1] \; ,\]
\[ \vec {\mathcal B} \doteq [t,t,-1] \; ,\]
\[ \vec {\mathcal C} \doteq [t,t,1] \; ,\]
and
\[ \vec {\mathcal D} \doteq [t,-t,1] \; .\]
Note that I represent ($$\doteq$$) the coordinates of the points by row arrays simply for typographical convenience, although whether they are rows or columns doesn’t matter.

The surface, which moves uniformly in the x-direction, stays parallel to the y-z plane while its edges at $$y = \pm t$$ spread outward at 45 degrees to the x-axis, giving it a time-varying area of $$4t$$. The surface is parametrized by two parameters $$u$$ and $$v$$ ranging from $$0$$ to $$1$$ such that any point on the surface (including the boundary) is given by

\[\vec {\mathcal R}(u,v) \doteq \left[ t, (2u-1)t, 2v -1 \right] \; \; u,v \in[0,1] \; .\]

In this parameterization, the outward normal to a differential patch is

\[ d \vec {\mathcal S} = \frac{\partial \vec {\mathcal R}}{\partial u} \times \frac{\partial \vec {\mathcal R}}{\partial v} du dv \doteq [4t,0,0] du dv \; , \]

from which we immediately get the total area as

\[ Area(\vec {\mathcal S}) = \int_{(u,v)} |d \vec {\mathcal S} | = \int_0^1 du \int_0^1 dv \, 4 t = 4t \]

as expected.

The form of the vector field is

\[ \vec F(\vec r) \doteq [x y^2, y z^2, z x^2] \]

where $$\vec r \doteq [x,y,z] $$.

Since the form of the surface and the vector field are fairly simple, the flux through the surface due to $$\vec F$$

\[ \Phi[\vec F,\vec {\mathcal S}] = \int_{\vec {\mathcal S}} \vec F(\vec {\mathcal R}(u,v)) \cdot d \vec {\mathcal S} \]

can be explicitly computed as
\[ \Phi[\vec F,\vec {\mathcal S}] = \int_0^1 du \int_0^1 dv \, 4 t^4 (2 u - 1)^2 = \frac{4t^4}{3} \; .\]

The time derivative is then easily obtained as

\[ \frac{d \Phi[\vec F,\vec {\mathcal S}]}{d t} = \frac{16t^3}{3} \; .\]
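The closed-form flux and its time derivative can be cross-checked numerically without any symbolic algebra. The following sketch is an illustration under stated assumptions: the function names `flux` and `flux_rate`, the 3-point Gauss-Legendre quadrature (exact here because the integrand is a low-degree polynomial in $$u$$ and $$v$$), and the central-difference step size are choices made for the example.

```python
import math

# 3-point Gauss-Legendre nodes and weights mapped to [0, 1];
# the rule is exact for polynomials up to degree 5.
NODES = [0.5 - 0.5 * math.sqrt(0.6), 0.5, 0.5 + 0.5 * math.sqrt(0.6)]
WEIGHTS = [5.0 / 18.0, 8.0 / 18.0, 5.0 / 18.0]


def flux(t):
    """Flux of F = [x y^2, y z^2, z x^2] through the moving rectangle at time t."""
    total = 0.0
    for u, wu in zip(NODES, WEIGHTS):
        for v, wv in zip(NODES, WEIGHTS):
            x, y, z = t, (2 * u - 1) * t, 2 * v - 1   # point R(u, v) on the surface
            field = (x * y ** 2, y * z ** 2, z * x ** 2)
            normal = (4 * t, 0.0, 0.0)                # dS per unit du dv
            total += wu * wv * sum(f * n for f, n in zip(field, normal))
    return total


def flux_rate(t, h=1e-4):
    """Central-difference estimate of d(flux)/dt."""
    return (flux(t + h) - flux(t - h)) / (2 * h)


t = 1.5
print(flux(t), 4 * t ** 4 / 3)         # numerical flux against 4 t^4 / 3
print(flux_rate(t), 16 * t ** 3 / 3)   # numerical rate against 16 t^3 / 3
```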

To use the flux transport theorem, we need to compute $$div(\vec F)$$ and $$\frac{\partial \vec F}{\partial t}$$ and then evaluate these terms across the expanse of the surface $$\mathcal{S}$$. Likewise we also need to compute $$\vec F \times \vec v$$, where $$\vec v$$ is the velocity of the surface, and then evaluate this term along its boundary.

The divergence of the field is given by

\[ div(\vec F) = x^2 + y^2 + z^2 \; , \]

which, in the $$u-v$$ parametrization becomes

\[ div(\vec F) = t^2 + (2 u - 1)^2 t^2 + (2 v - 1)^2 \; .\]

The time derivative of the field is

\[\frac{\partial \vec F}{\partial t} = 0 \]

since the vector field $$\vec F$$ has no explicit time dependence. All of the time rate of change is due to the surface moving within the field.

The surface integral in the flux transport equation, formally given by

\[ I_0 = \int_{\vec {\mathcal S}} d \vec {\mathcal S} \cdot \left[ div\left(\vec F(\vec {\mathcal R}(u,v))\right) \vec v + \frac{\partial \vec F}{\partial t}\left(\vec {\mathcal R}(u,v)\right) \right] \]

in the $$u-v$$ parametrization, evaluates to

\[ I_0 = \frac{ 16 t^3}{3} + \frac{4 t }{3} \; .\]
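The intermediate steps are short enough to spell out. The surface velocity is $$\vec v = \partial \vec {\mathcal R} / \partial t \doteq [1, 2u-1, 0]$$, so that $$d \vec {\mathcal S} \cdot \vec v = 4t \, du \, dv$$, and the partial time derivative of the field vanishes. The integral is then

\[ I_0 = \int_0^1 du \int_0^1 dv \, 4t \left[ t^2 + (2u-1)^2 t^2 + (2v-1)^2 \right] = 4t \left( t^2 + \frac{t^2}{3} + \frac{1}{3} \right) \; , \]

where $$\int_0^1 (2u-1)^2 du = 1/3$$ has been used for both the $$u$$ and $$v$$ integrations.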

The velocity of any point on the surface is obtained by taking the time derivative of $$\vec {\mathcal R}$$ at fixed $$u$$ and $$v$$

\[ \vec v = \frac{\partial \vec {\mathcal R} (u,v)}{\partial t} \doteq [1, 2 u - 1, 0] \; .\]

The integrand of the line integral is

\[ \vec F \times \vec v \doteq [ - z x^2 (2 u - 1), z x^2 , x y^2 (2 u - 1) - y z^2 ] \; , \]

which, in the $$u-v$$ parametrization, becomes

\[ \vec F \times \vec v \doteq [-(2 v - 1)(2 u - 1) t^2, (2 v - 1) t^2, (2 u - 1)^3 t^3 - (2 u - 1)(2 v - 1 )^2 t] \; .\]

There are four distinct lines or legs making up the boundary of the surface. These are given by

\[ \vec {\mathcal L}_1(u) = \vec {\mathcal A} + u(\vec {\mathcal B} - \vec {\mathcal A}) \; ,\]
\[ \vec {\mathcal L}_2(v) = \vec {\mathcal B} + v(\vec {\mathcal C} - \vec {\mathcal B}) \; ,\]
\[ \vec {\mathcal L}_3(u) = \vec {\mathcal C} + u(\vec {\mathcal D} - \vec {\mathcal C}) \; ,\]

and

\[ \vec {\mathcal L}_4(v) = \vec {\mathcal D} + v(\vec {\mathcal A} - \vec {\mathcal D}) \]

and there is a corresponding integral for each.

The evaluation of each is a bit tedious (particularly in taking care to make sure that $$u$$ or $$v$$ take on the appropriate value for the leg being traversed) but straightforward leading to
\[ I_1 =-2 t^3 \int_0^1 du = -2 t^3 \; , \]
\[ I_2 = 2 t^3 \int_0^1 dv - 2t \int_0^1 dv (2v-1)^2 = \frac{ 2(3 t^3 - t)}{3} \; , \]
\[ I_3 = -2t^3 \int_0^1 du = -2 t^3 \; , \]
and

\[ I_4 = 2 t^3 \int_0^1 dv - 2t\int_0^1 dv (2v-1)^2 = 2 t^3 - \frac{2}{3} t \; ,\]
respectively.

The total line integral around the boundary is then given as the sum of these terms

\[ I_c = - \frac{4}{3} t \; .\]

Adding this result to the result from surface integral gives

\[I_{tot} = \frac{16 t^3}{3} \; , \]

which is the same result as before.
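The whole bookkeeping can also be verified numerically. The sketch below is one possible check, not a general implementation of the theorem: the helper names, the 3-point Gauss-Legendre quadrature, and the sample time are assumptions made for illustration. It evaluates the surface term and the four boundary legs (traversed in the order A, B, C, D) directly from the parametrization of this example.

```python
import math

# 3-point Gauss-Legendre rule on [0, 1]; exact for the polynomials met here.
NODES = [0.5 - 0.5 * math.sqrt(0.6), 0.5, 0.5 + 0.5 * math.sqrt(0.6)]
WEIGHTS = [5.0 / 18.0, 8.0 / 18.0, 5.0 / 18.0]


def point(t, u, v):
    return (t, (2 * u - 1) * t, 2 * v - 1)          # R(u, v)


def velocity(u):
    return (1.0, 2 * u - 1, 0.0)                    # dR/dt at fixed u, v


def field(r):
    x, y, z = r
    return (x * y ** 2, y * z ** 2, z * x ** 2)


def divergence(r):
    x, y, z = r
    return x * x + y * y + z * z


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def surface_term(t):
    """Integral of dS . (div F) v over the surface (dF/dt is zero here)."""
    normal = (4 * t, 0.0, 0.0)
    total = 0.0
    for u, wu in zip(NODES, WEIGHTS):
        for v, wv in zip(NODES, WEIGHTS):
            total += wu * wv * divergence(point(t, u, v)) * dot(normal, velocity(u))
    return total


def boundary_term(t):
    """Line integral of dl . (F x v) around the legs A->B->C->D->A."""
    legs = [lambda s: (s, 0.0), lambda s: (1.0, s),
            lambda s: (1.0 - s, 1.0), lambda s: (0.0, 1.0 - s)]
    total = 0.0
    for leg in legs:
        start, end = point(t, *leg(0.0)), point(t, *leg(1.0))
        dl = tuple(e - b for b, e in zip(start, end))   # constant on a straight leg
        for s, w in zip(NODES, WEIGHTS):
            u, v = leg(s)
            total += w * dot(dl, cross(field(point(t, u, v)), velocity(u)))
    return total


t = 1.5
print(surface_term(t), boundary_term(t), surface_term(t) + boundary_term(t))
print(16 * t ** 3 / 3)
```

Writing each leg as a map from a single parameter $$s$$ to the pair $$(u, v)$$ takes care of the bookkeeping about which parameter is held fixed and in which direction the other one runs.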

While this example is a bit contrived, it does offer a simple combination of both movement and growth that seems absent within the literature. Furthermore, when worked in detail, each piece drives home the content of the flux transport theorem.