
Carnot, Clausius, and Kelvin

This month we return to our exploration of entropy after our brief detour into field theory.  The earlier posts explored the definition of entropy derived from statistical mechanics. In this installment, we return to the thermodynamic roots of entropy that originated in the analysis of the 19th century. Our key players in this drama are Sadi Carnot, Lord Kelvin, and Rudolf Clausius, who will be with us both here and for several of the following posts.

This analysis closely follows the presentation found in Enrico Fermi’s book Thermodynamics with some additional extensions in the logic and new, explanatory diagrams that attempt to provide a cleaner approach to traditional material.

Thermodynamics rests upon the idea of a system in equilibrium, one that can be characterized by a very small number of state variables compared with the overwhelmingly enormous number of degrees of freedom the system possesses.  A bottle of water is a good poster child for a system in equilibrium. The bottle can be described by the amount of water $m$ (or the number of moles $n$), the volume it occupies $V$, its temperature $T$, pressure $P$, and the like. Even if the water were not pure and we were forced to also specify the percentage of impurities by type, there would still be far, far fewer numbers to specify than the astronomical number of position and velocity components required to describe the state as Newton would.  The state variables are completely independent of how the system made it into that configuration; their values represent average quantities in which individual, finer-grained fluctuations are smeared out.  Two state variables stand above the rest in importance: the internal energy $U$ and the entropy $S$.

The internal energy is relatively familiar to us based on its analogy to the traditional energies defined in classical mechanics and electrodynamics.  That said, it took quite a long time before it was appreciated in the mid-1800s that mechanical energy and heat were equivalent.  When the dust had settled, the first law of thermodynamics had been postulated as

\[ \Delta U = Q - W \; , \]

where $Q$ is the heat that enters or leaves the system and $W$ the work done by the system on its surroundings.  The sign convention is such that heat entering and work performed are both positive quantities.  If one regards energy as the ‘currency’ for physical transactions, then the first law amounts to an accounting principle that says the books must balance and, in this regard, it is relatively easy to understand the physical content.

The entropy, on the other hand, is more difficult to summarize succinctly.  Many people can offer aphorisms stating that the principle of entropy means that there is ‘no free lunch’ or that it ‘forbids perpetual motion’, but these slogans don’t provide much in the way of physical understanding.

There are several steps in arriving at a firm understanding of entropy.  The rest of this post centers on the first step, which involves the different ways of expressing the limitations the second law places on the conversion between work and heat.

We start by looking at the isothermal expansion of an ideal gas, in which a flame provides the heat that causes the expansion.  Since the internal energy of an ideal gas depends only on temperature, as long as the temperature remains constant there is no change in the internal energy: $\Delta U = 0$.  Then from the first law $W = Q$, which means that all of the heat energy is changed into the work needed to raise the piston.
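For a reversible isothermal expansion, the work follows from integrating $P\,dV$ with $P = nRT/V$, giving $W = nRT\ln(V_2/V_1)$, which by the argument above also equals the heat absorbed.  A minimal sketch follows; the function name and the numerical values are illustrative choices of mine, not quantities from the post.

```python
import math

R = 8.314  # J/(mol K), molar gas constant

def isothermal_work(n, T, V1, V2):
    # W = integral of P dV with P = n R T / V  ->  n R T ln(V2/V1)
    return n * R * T * math.log(V2 / V1)

# one mole doubling its volume at 300 K
W = isothermal_work(n=1.0, T=300.0, V1=1.0, V2=2.0)
Q = W  # with Delta U = 0, the first law gives Q = W
```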

It is natural to ask whether all physical processes allow for a complete conversion of heat to work, as permitted by the first law whenever $\Delta U = 0$, or whether there are limitations on how efficient arbitrary physical processes can be.

After much analysis and experimentation, most of which was done in the 1800s, the second law of thermodynamics emerged with a clear set of limitations for how changes between heat and work are made.  Its modern form expresses the statement in terms of entropy but we will avoid it in favor of more macroscopic statements.

Fermi provides two postulates that capture different aspects of the second law. The first postulate, attributed to Lord Kelvin, states that:

a transformation whose only final result is to transform into work the heat extracted from a source that is at the same temperature throughout is impossible.

Graphically this forbidden process is represented on a $PVT$ diagram as follows.

The circular arc reminds us that the state of the system must remain unchanged at the end of the transformation. 

The second postulate, attributed to Clausius, states

if heat flows by conduction from a body A to another body B then a transformation whose only final result is to transfer heat from B to A is impossible.

The graphical representation of this forbidden process is as follows.

Note that both postulates rule out as impossible certain transformations that leave the state of the system otherwise unchanged (“only final result is…”).  Since the state of the system is unchanged, we will focus on cyclic processes, of the kind used in engines, in which a complete circuit returns the system to its original state ($\Delta U = 0$) with some fraction of the heat absorbed being transformed into work.

The textbook example of a cyclic process is the Carnot cycle, which operates a system between two thermal reservoirs with temperatures $T_C$ and $T_H$ (with $T_C < T_H$). 

While the details of the Carnot cycle will be explored in the next post, for our purposes the final result relating the work derived to the heat exchanged, given by

\[ W = Q_H - Q_C \; \]

will be all that is needed.

Fermi devotes a large amount of effort to showing that the Kelvin and Clausius postulates are logically equivalent and are different facets of the same underlying limitations of the second law.

The first part of the proof, showing that the falsity of the Kelvin postulate implies the falsity of the Clausius postulate, is the easier to understand.  Suppose that the Kelvin postulate were false.  Then we could extract some work $W$ from a source $A$, leaving $A$ otherwise unchanged.  We could use the work to raise a block up an inclined plane, gaining gravitational potential energy, and then let the block slide back down, using friction to transform the potential energy into heat, which we then dump into a body $B$ hotter than $A$.  The only net result would be the transfer of heat from the colder body $A$ to the hotter body $B$, in violation of the Clausius postulate.

The converse leg of the proof, showing that the falsity of the Clausius postulate implies the falsity of the Kelvin postulate, is a bit more difficult.  Suppose that the Clausius postulate were false, so that an amount of heat $Q_H$ could be transferred from the cold reservoir at temperature $T_C$ to the hot reservoir at $T_H$ with no other changes.  We could then run a Carnot cycle between the two reservoirs that absorbs $Q_H$ from the hot reservoir, produces an amount of work $W$, and rejects $Q_C = Q_H - W$ to the cold reservoir.  The hot reservoir is thereby returned to its original state, and the only net result is that heat $Q_H - Q_C$ has been extracted from the cold reservoir and converted entirely into work, in violation of the Kelvin postulate.
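The bookkeeping behind this composite process can be sketched in a few lines of arithmetic; the values of $Q_H$ and $W$ below are arbitrary illustrative choices of mine, not numbers from the post.

```python
# Illustrative numbers (arbitrary energy units)
Q_H = 100.0    # heat the hypothetical Clausius violator moves from cold to hot
W = 40.0       # work produced by a Carnot engine absorbing Q_H from the hot reservoir
Q_C = Q_H - W  # heat that engine rejects back to the cold reservoir

# hot reservoir: gains Q_H from the violator, then loses Q_H to the engine
net_hot = Q_H - Q_H
# cold reservoir: loses Q_H to the violator, gains Q_C back from the engine
net_cold = -Q_H + Q_C

assert net_hot == 0.0   # hot reservoir ends unchanged
assert net_cold == -W   # net heat W left the cold reservoir...
# ...and was converted entirely into work: a Kelvin violation
```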

The logical equivalence of the Kelvin and Clausius postulates demonstrates that these various limitations are different facets of the second law.  This logical structure serves as the launching pad for exploring the concept of entropy from the macroscopic point of view.

[Note added after publication – the equivalence between the Kelvin and Clausius postulates is nicely described here.]

A Curvilinear Mantra – Part 2

The last post introduced the curvilinear mantra for students working with field equations in such disciplines as fluid mechanics, general relativity, and electricity and magnetism.  The textbook example (see, e.g. Acheson Appendix A, pp 352-3) is Euler’s equations for ideal fluids in two spatial dimensions. 

In Cartesian coordinates these equations read

\[ \rho \left( V_x \partial_x + V_y \partial_y + \partial_t \right) V_x = -\partial_x p + f_x \;  \]

and

\[ \rho \left( V_x \partial_x + V_y \partial_y + \partial_t \right) V_y = -\partial_y p + f_y \; ,\]

whereas, in polar coordinates these equations read

\[ \rho \left( V_r \partial_r + \frac{V_\theta}{r} \partial_\theta  + \partial_t \right) V_r - \rho \frac{{V_\theta}^2}{r} = -\partial_r p + f_r \; \]

and

\[ \rho \left( V_r \partial_r + \frac{V_\theta}{r} \partial_\theta  + \partial_t \right) V_\theta + \rho \frac{V_r V_\theta}{r} = -\frac{1}{r} \partial_\theta p + f_\theta \; . \]

As discussed in the previous post, beginning students are often confused by two changes when transitioning from Cartesian to polar coordinates.  The first is the appearance of $1/r$ scale factors that decorate various terms, such as $\frac{V_\theta}{r} \partial_\theta$.  The second is the appearance of additional additive terms, such as $V_r V_\theta/r$.

The curvilinear mantra explains these changes as follows: the scale factors come from minding the units and the additive terms show up to account for how the basis unit vectors change from place to place.

The first half of the mantra was covered in the previous post.  This post finishes the exploration by demonstrating how the additive terms arise due to the spatial variations of the basis vectors. 

The first step involves writing the position vector in terms of the polar coordinates and the cartesian unit basis vectors

\[ {\vec r} = r \cos \theta {\hat x} + r \sin \theta {\hat y} \; .\]

The polar unit basis vectors are defined by taking the derivatives of the position vector with respect to the polar coordinates and then unitizing.  The radial basis vector (not unitized) is

\[ {\vec e}_r \equiv \frac{\partial {\vec r}}{\partial r} = \cos \theta {\hat x} + \sin \theta {\hat y} \; .\]

Conveniently, this vector has a unit length and we can immediately write the radial unit basis vector as

\[ {\hat r} = \cos \theta {\hat x} + \sin \theta {\hat y} \; . \]

Following the same procedure, the polar angle basis vector (not unitized) is

\[ {\vec e}_\theta \equiv \frac{\partial {\vec r}}{\partial \theta} = -r \sin \theta {\hat x} + r \cos \theta {\hat y} \; . \]

This vector has length $r$ and so the polar angle unit base vector is

\[ {\hat \theta} = -\sin \theta {\hat x} + \cos \theta {\hat y}  \; .\]

Both vectors are independent of $r$ but do depend on $\theta$ and their variations are

\[ \partial_\theta {\hat r} = {\hat \theta} \; \]

and

\[ \partial_\theta {\hat \theta} = -{\hat r} \; . \]
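These two derivative relations are easy to verify numerically with a central finite difference; the helper names below are my own and the evaluation angle is arbitrary.

```python
import numpy as np

def r_hat(theta):
    # radial unit basis vector in fixed Cartesian components
    return np.array([np.cos(theta), np.sin(theta)])

def theta_hat(theta):
    # polar-angle unit basis vector in fixed Cartesian components
    return np.array([-np.sin(theta), np.cos(theta)])

# central finite differences at an arbitrary angle
th, h = 0.7, 1e-6
d_r_hat = (r_hat(th + h) - r_hat(th - h)) / (2 * h)
d_theta_hat = (theta_hat(th + h) - theta_hat(th - h)) / (2 * h)
```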

At this point we have all the ingredients we need.  From the first part of the curvilinear mantra, the velocity in polar coordinates, expressed in physical components, is

\[ {\vec V} = V_r {\hat r} + V_\theta {\hat \theta} \;  \]

and the material (or total) time derivative is

\[ \frac{D}{Dt} = V_r \partial_r + \frac{V_\theta}{r} \partial_\theta + \partial_t \; , \]

where the scale factors on the polar angle terms are due to minding units.

Applying the material time derivative to the velocity gives

\[ \frac{D {\vec V}}{Dt} = \left( V_r \partial_r + \frac{V_\theta}{r} \partial_\theta + \partial_t \right) \left( V_r {\hat r} + V_\theta {\hat \theta} \right) \; . \]

Expanding this expression term-by-term yields

\[ V_r \partial_r \left( V_r {\hat r} \right) + V_r \partial_r \left( V_\theta {\hat \theta} \right) + \frac{V_\theta}{r} \partial_\theta \left( V_\theta {\hat \theta} \right) + \frac{V_\theta}{r} \partial_\theta \left( V_r {\hat r} \right) + \left(\partial_t V_r \right) {\hat r} + \left( \partial_t V_\theta \right) {\hat \theta} \; . \]

Expanding the derivatives, taking care to evaluate the spatial derivatives of the unit basis vectors, yields

\[ V_r \left( \partial_r V_r \right) {\hat r} + V_r \left( \partial_r V_\theta \right) {\hat \theta} + \frac{V_\theta}{r} \left( \partial_\theta V_\theta \right) {\hat \theta} - \frac{{V_\theta}^2}{r} {\hat r} + \left( \frac{V_\theta}{r} \partial_\theta V_r \right) {\hat r} + \\ \frac{V_\theta V_r}{r} {\hat \theta}  + \left(\partial_t V_r \right) {\hat r} + \left( \partial_t V_\theta \right) {\hat \theta} \; . \]

Collecting terms gives the radial term as

\[ V_r \partial_r V_r + \frac{V_\theta}{r} \partial_\theta V_r - \frac{{V_\theta}^2}{r} + \partial_t V_r \; \]

and the polar angle term as

\[ V_r \partial_r V_\theta + \frac{V_\theta}{r} \partial_\theta V_\theta + \frac{V_\theta V_r}{r} + \partial_t V_\theta \; .\]

Factoring the terms yields

\[ \left( V_r \partial_r + \frac{V_\theta}{r} \partial_\theta + \partial_t \right) V_r - \frac{{V_\theta}^2}{r} \; \]

and

\[ \left( V_r \partial_r + \frac{V_\theta}{r} \partial_\theta + \partial_t \right) V_\theta + \frac{V_\theta V_r}{r} \; .\]

Happily, these expressions match the textbook equations term for term (up to multiplication by $\rho$).  This shows the accuracy and power of the curvilinear mantra.  Hopefully it will catch on in classrooms.
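For readers who want an independent check, the whole expansion can be verified symbolically.  The sketch below uses sympy, writing the polar basis vectors in fixed Cartesian components so that their $\theta$-dependence is differentiated automatically; the variable names are my own.

```python
import sympy as sp

r, th, t = sp.symbols('r theta t')
Vr = sp.Function('V_r')(r, th, t)
Vth = sp.Function('V_theta')(r, th, t)

# polar unit basis vectors written in fixed Cartesian components
rhat = sp.Matrix([sp.cos(th), sp.sin(th)])
thhat = sp.Matrix([-sp.sin(th), sp.cos(th)])

V = Vr*rhat + Vth*thhat

# material derivative D/Dt = V_r d/dr + (V_theta/r) d/dtheta + d/dt
DV = Vr*V.diff(r) + (Vth/r)*V.diff(th) + V.diff(t)

# the radial and polar-angle components derived in the text
rad = Vr*Vr.diff(r) + (Vth/r)*Vr.diff(th) + Vr.diff(t) - Vth**2/r
pol = Vr*Vth.diff(r) + (Vth/r)*Vth.diff(th) + Vth.diff(t) + Vr*Vth/r

# the two expansions agree identically
assert sp.simplify(DV - rad*rhat - pol*thhat) == sp.zeros(2, 1)
```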

A Curvilinear Mantra – Part 1

These next two posts are a bit of a departure from the thermal physics theme that has been the central focus for the last many months.  They grew out of some discussions on classical field theory that arose in several venues with different people, and it seemed important to capture what is a clean (and perhaps new) argument for the beginning student on the best way to transform differential equations into curvilinear coordinates.

The starting point is the recasting of the Euler equation for an ideal fluid (typically a gas)

\[ \rho \frac{D {\vec V}}{Dt} = -{\vec \nabla p} + {\vec f} \; , \]

where $\rho$ and $p$ are the mass density and pressure of the fluid, ${\vec V}$ is its velocity, $\frac{D}{Dt}$ is the material derivative, and ${\vec f}$ is the body force per unit volume.

Typically, within basic discussions of fluid mechanics, Euler’s equation is expressed in Cartesian coordinates (assumed here, without loss of generality to the method, to cover a two dimensional space) where the velocity is given by

\[ {\vec V} = V_x {\hat x} + V_y {\hat y} \; ,\]

the material derivative takes on the simple form

\[ \frac{D}{Dt} = V_x \partial_x + V_y \partial_y + \partial_t \; ,\]

and Euler’s equation, in component form is

\[ \rho \frac{D}{Dt} V_x = -\partial_x p + f_x \; ,\]

and

\[ \rho \frac{D}{Dt} V_y  = -\partial_y p + f_y \; .\]

This relatively simple form allows the student to focus on the Lagrangian nature of following a fluid flow, but it typically hides a subtle complication when using curvilinear (or even rotating) coordinates.  For example, the corresponding version of Euler’s equations in cylindrical coordinates (see also Acheson’s Appendix A.6) uses

\[ \frac{D}{Dt} = \partial_t + V_r \partial_r + \frac{V_{\theta}}{r} \partial_{\theta} \; \]

for the material derivative with the component equations being

\[ \rho \frac{D}{Dt} V_r - \rho \frac{{V_{\theta}}^2}{r} = - \partial_r p + f_r \; \]

and

\[ \rho \frac{D}{Dt} V_{\theta} + \rho \frac{V_r V_{\theta}}{r} = -\frac{1}{r} \partial_{\theta} p + f_{\theta} \; . \]

Suddenly there are new multiplicative factors (e.g. the $1/r$ multiplying the derivative with respect to the polar angle $\theta$) as well as additive terms on the left-hand side of the component equations (e.g. $-\rho {V_{\theta}}^2/r$) that weren’t there in the Cartesian version.  The student is left to wonder just why they are there.

Many books and lecture notes on the internet try to justify one or the other (but rarely both) with varying degrees of success.  The aim of this note is to suggest a simple mantra:  the multiplicative terms are strictly the result of minding units and the additive terms are strictly the result of the curvilinear basis vectors changing from point to point. 

The strategy behind the mantra is that even if the students don’t fully connect all the dots the first few times, they will have an explanation that is rock solid and easy to remember to guide them in exploring on their own. 

Let’s examine each of these claims in turn. 

The first claim of the mantra is that the multiplication of the $\partial_{\theta}$ term by $1/r$ is the result of minding units.  Of the two claims of the mantra, this one is the more conceptually difficult even though it is the easier of the two to understand mathematically.  The conceptual hurdle is rooted in the arguments used to define the material derivative in terms of the partial derivatives of a scalar field, $f(x,y,t)$, expressed in Cartesian coordinates

\[ df = \partial_x f dx + \partial_y f dy + \partial_t f dt \; .\]

Dividing by $dt$ immediately gives the Cartesian form of the material derivative

\[ \frac{Df}{Dt} = V_x \partial_x f + V_y \partial_y f + \partial_t f \; .\]

The student then asks why a similar relationship doesn’t hold for curvilinear coordinates.  For example, why isn’t the material derivative in cylindrical coordinates based on the differential of $g(r,\theta,t)$,

\[ dg = \partial_r g dr + \partial_\theta g d\theta + \partial_t g dt \; ?\]

This point is most often and most clearly discussed within the realm of continuum mechanics or general relativity.  Schutz, in his book A First Course in General Relativity, notes in Section 5.5 that defining the gradient of $g$ essentially in terms of the differential given above is perfectly acceptable, but that the price paid for using it is that the basis vectors are not normalized, which he summarizes with the equation

\[ {\vec e}_{\alpha} \cdot {\vec e}_{\beta} = g_{\alpha \beta} \neq \delta_{\alpha \beta} \; .\]

While this is certainly true and quite clearly argued, the beginning student consulting Schutz (or some similar text) as a reference has to know either the definition of the metric or the difference between vectors and differential forms and the natural duality between them.  In the first case, they need to know that the metric encodes all of the possible dot products between the basis vectors.  In the second, they are confronted with notation that expresses the duality between basis forms and vectors in the coordinate version as

\[ \left<d\theta, \partial_{\theta} \right> = 1 \; \]

and in the non-coordinate version as

\[ \left< {\tilde \omega}^{\hat \theta}, {\vec e}_{\hat \theta} \right> = 1 \; .\]

These mathematical distinctions are quite beyond the beginning student who, by definition, is struggling with a host of other things.

A cleaner way of justifying the first point of the mantra is to perform a unit analysis on the differential $dg$.  It doesn’t matter what units $g$ possesses but for the sake of this argument let’s assume $g$ has units of temperature.  The idea of a temperature field is familiar and the units are well known.  We will denote the units of a physical quantity by square brackets so that in this case $[g] = T$.

The differential must also have units of temperature which means that the partial derivatives have mixed units.  The partial derivative with respect to the radius $r$ has units of temperature per length

\[ \left[ \partial_r g \right] = T/L \; \]

while the partial derivative with respect to the azimuth $\theta$ has units of temperature

\[ \left[ \partial_{\theta} g \right] = T \; .\]

Dividing by $dt$ gives a material derivative of the form

\[ \frac{Dg}{Dt} = V_r \partial_r g + U_{\theta} \partial_{\theta} g + \partial_t g \; .\]

The units on the radial velocity $V_r \equiv dr/dt$ are length per unit time as we expect of a conventional derivative but the units on the azimuthal velocity $U_{\theta} \equiv d\theta/dt$ are radians per unit time, which are quite different (hence the use of the letter $U$ in place of $V$).  The next step is to challenge the student to think about how any lab would measure this angular velocity and to then argue that a much better way to link to experiments is to multiply $U_{\theta}$ by the radius $r$. 

Once this step is done, the remaining piece involves rewriting the differential as

\[ dg = dr \partial_r g + (r d\theta) (\frac{1}{r} \partial_{\theta} g) + dt \partial_t g \; , \]

where we’ve multiplied the second term by unity in the form of $r/r$.  Dividing by $dt$ immediately gives

\[ \frac{Dg}{Dt} = V_r \partial_r g + V_{\theta} \frac{1}{r} \partial_{\theta} g + \partial_t g \; , \]

which is the accepted form of the material derivative.
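One way to convince a skeptical student that this form is consistent is to check it against the Cartesian computation for a concrete sample field.  The sketch below uses sympy with $f = x^2 y$ as an assumed example field; only the advective (spatial) part of the derivative is compared, since the $\partial_t$ term is identical in both systems.

```python
import sympy as sp

x, y = sp.symbols('x y')
r, th = sp.symbols('r theta', positive=True)
Vx, Vy = sp.symbols('V_x V_y')

f = x**2 * y                                    # assumed sample scalar field
to_polar = {x: r*sp.cos(th), y: r*sp.sin(th)}
g = f.subs(to_polar)                            # same field in polar coordinates

# physical polar velocity components (projections onto r_hat and theta_hat)
Vr = Vx*sp.cos(th) + Vy*sp.sin(th)
Vth = -Vx*sp.sin(th) + Vy*sp.cos(th)

# advective part of the material derivative in both coordinate systems
cart = (Vx*f.diff(x) + Vy*f.diff(y)).subs(to_polar)
polar = Vr*g.diff(r) + (Vth/r)*g.diff(th)

assert sp.simplify(cart - polar) == 0
```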

The next post will cover the second part of the mantra by showing that the additive terms result from how the basis vectors in curvilinear coordinates change from point to point in space.

A Binomial Gas

The last installment discussed Robert Swendsen’s critique of the common, and in his analysis erroneous, method of understanding the entropy of a classical gas of distinguishable particles.  As discussed in that post, his aim in making this analysis is to persuade the physics community to re-examine its understanding of entropy and to rediscover Boltzmann’s fundamental definition, based on probability and not on phase space volume.  To quote some of Swendsen’s closing words:

Although the identification of the entropy with the logarithm of a volume in phase space did originate with Boltzmann, it was only a special case. Boltzmann’s fundamental definition of the entropy in his 1877 paper has none of the shortcomings resulting from applying an equation for a special case beyond its range of validity.

On the question of how this special case blossomed into textbook dogma we will have to content ourselves with speculations.  It seems likely that the passion by which quantum mechanics gripped the physics community made it attractive to view the entire world through the lens of indistinguishable particles.  Furthermore, quantum mechanics also elevated the concept of phase space since various dimensions could be viewed as canonically conjugate variables subject to the uncertainty principle.  So, it is plausible that the physics community, dazzled by this new theory of the subatomic, latched onto the special case and ignored Boltzmann’s fundamental definition.  If true, this would be incredibly ironic since the key focus of Boltzmann was on probability which is arguably the most shocking and intriguing aspect of quantum mechanics.

Regardless of these finer points of physics history, since the concept of probability is key in deriving the correct formula for a classical distinguishable gas, let’s focus on the toy example Swendsen provides in order to illustrate his point.  As in the last post, we will assume that the average energy per particle $\epsilon$ remains fixed.

If we imagine a system with $N$ total distinguishable particles distributed between a volume $V$ partitioned into sub-volumes $V_1$ and $V_2$ then the probability $P(N_1,N_2)$ of having $N_1$ particles in $V_1$ and $N_2 = N – N_1$ in $V_2 = V – V_1$ is given by the binomial distribution

\[ P(N_1,N_2) = \left( \begin{array}{c} N \\ N_1 \end{array} \right) p^{N_1} (1-p)^{N_2} \; ,\]

where  $p$ is the probability of being found in $V_1$ (i.e. a ‘success’).  Since there are no constraints forcing particles to accumulate in any one section compared to the others they will distribute randomly within the entire domain.  Therefore, $p = V_1/V$ and the probability is given by

\[ P(N_1,N_2) = \left( \frac{N!}{N_1! N_2!} \right) \left( \frac{V_1}{V} \right)^{N_1} \left( \frac{V_2}{V} \right)^{N_2} \; .\]
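A quick numerical sanity check on this expression is that the probabilities sum to one over all possible values of $N_1$.  The function below is an illustrative sketch of mine, with arbitrary parameter values.

```python
from math import comb

def prob(N1, N, V1, V):
    # P(N1, N2) with N2 = N - N1 and success probability p = V1/V
    N2 = N - N1
    p = V1 / V
    return comb(N, N1) * p**N1 * (1 - p)**N2

# the distribution is normalized: summing over all N1 gives one
total = sum(prob(N1, 20, 0.3, 1.0) for N1 in range(20 + 1))
```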

This expression is Swendsen’s launching point for deriving the correct expression for a classical gas of distinguishable particles.  But before continuing with the analysis, it is worth taking a few moments to better understand the physical content of that expression (even for those who understand the binomial distribution well).

There is a very compact way to make a Monte Carlo simulation of this thought experiment using the python ecosystem.  One starts by defining a random realization of the classical gas particles placed within the volume and then reporting out the macroscopic thermodynamic state. 

import numpy as np

def particles_in_a_box(V1, N, V):
    #get random positions of the particles, uniform on [0, 1)
    pos = np.random.random(N)

    #count the number falling in the subvolume V1
    threshold = V1/V
    return np.count_nonzero(pos < threshold)

In this context, the macroscopic thermodynamic state is a measure of how many particles are found in the sub-volume $V_1$.  This is a critical point, particularly in light of the quantum interpretation that so many have embraced: two systems can be in the same thermodynamic state without the underlying microstates being the same.  For example, if $N=3$ and $N_1=2$ then each of the following lists results in the same thermodynamic state:

  • [True,True,False]
  • [True,False,True]
  • [False,True,True]

where True and False indicate whether each particle is found within $V_1$ (True) or not (False), as determined by comparing its position against the threshold.

To get the probabilities, one makes an ensemble of such systems, and this is what the following function does:

def generate_MC_estimate(V1, N, V, num_trials):
    import numpy as np

    #build an ensemble of independent realizations of the box
    results = np.zeros(num_trials)
    for i in range(num_trials):
        results[i] = particles_in_a_box(V1, N, V)
    return results

The following plot shows how well the empirical results for an ensemble with 100,000 realizations agree with the formula derived above for a simulation of 2000 particles placed in a box where $V_1 = 0.3 V$.
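A vectorized variant of the comparison behind that plot can be sketched as follows; the particle count, trial count, occupancy checked, and random seed here are smaller illustrative choices of mine rather than the values quoted in the post.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(42)

# smaller illustrative parameters: 50 particles, 100,000 ensemble members
N, V1, V, num_trials = 50, 0.3, 1.0, 100_000

# each row is one realization; count the particles landing inside V1
counts = (rng.random((num_trials, N)) < V1 / V).sum(axis=1)

# compare the empirical frequency of one occupancy with the binomial formula
N1 = 15
empirical = np.mean(counts == N1)
p = V1 / V
analytic = comb(N, N1) * p**N1 * (1 - p)**(N - N1)
```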

Following Boltzmann, the entropy is

\[ S = k \ln P + C = k \ln \left[ \left( \frac{N!}{V^{N_1}V^{N_2}}\right) \left( \frac{{V_1}^{N_1}}{N_1 !} \right) \left( \frac{{V_2}^{N_2}}{N_2 !} \right) \right] + C \; ,\]

where the previous expression has been grouped into parts dealing with the entire system $(N,V)$, the first sub-volume $(N_1,V_1)$, and the second sub-volume $(N_2,V_2)$.  The constant $C$ depends only on the whole system through $N$ and $V$ but not on the subdivisions and, for reasons that should become obvious, we will take it to be

\[ C = k \ln \left( \frac{{V}^{N}}{N !} \right) \; . \]

We first expand the entropy expression along this grouping to get

\[ S = k \ln \left( \frac{N!}{{V}^{N}} \right) + k \ln \left( \frac{{V_1}^{N_1}}{N_1 !} \right) + k \ln \left( \frac{{V_2}^{N_2}}{N_2 !} \right) + k \ln \left( \frac{{V}^{N}}{N !} \right) \; .\]

The first and last terms are inverses of each other and, under the action of the logarithm, cancel, leaving

\[ S = k \ln \left( \frac{{V_1}^{N_1}}{N_1 !} \right) + k \ln \left( \frac{{V_2}^{N_2}}{N_2 !} \right) \; .\]

As the whole is a sum of the parts, this expression is clearly extensive.

The final step is the application of Stirling’s approximation ($\ln n! \approx n \ln n - n$).  To keep things clear, we will apply it to a generic term of the form

\[ S = k \ln \left( \frac{V^N}{N!} \right) \; \]

to get

\[ S = k \left( \ln V^N - \ln N! \right) = k \left( N \ln V - N \ln N + N \right) = k N \left( \ln \frac{V}{N} + 1 \right) \; , \]

which clearly shows that $S$ scales linearly with the system size (at least in the thermodynamic limit).
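The quality of Stirling’s approximation, and hence of this linear scaling, can be checked directly by comparing against an exact evaluation of $\ln N!$ via the log-gamma function.  The function names and the choice $k = 1$, $V = 1$ below are illustrative assumptions of mine.

```python
import math

def entropy_exact(N, V, k=1.0):
    # S = k ln(V^N / N!) evaluated exactly, using lgamma(N+1) = ln N!
    return k * (N * math.log(V) - math.lgamma(N + 1))

def entropy_stirling(N, V, k=1.0):
    # S = k N (ln(V/N) + 1), the Stirling-approximated form
    return k * N * (math.log(V / N) + 1)

def rel_err(N, V=1.0):
    # relative error of the Stirling form against the exact evaluation
    return abs(entropy_exact(N, V) - entropy_stirling(N, V)) / abs(entropy_exact(N, V))
```

The relative error shrinks as $N$ grows, which is why the approximation is harmless in the thermodynamic limit.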

All told, Swendsen argues persuasively that the correct interpretation of the entropy is that it is always proportional to the logarithm of the probability, that the ‘traditional’ expression depending on the volume of phase space is a special case of the larger rule, and that by misapplying this special case large numbers of physicists have taught or have been taught incorrectly for decades.  So much for the idea of settled science.

Of Milk and Entropy

Last month’s column teased the idea that there is a challenge to the common wisdom about the traditional (T) expression for the entropy of a classical (C) gas of distinguishable (D) particles, given by

\[ S_{TCD} = k N \left[ \ln V + \frac{3}{2} \ln \frac{E}{N}  + X \right] \; , \]

where $X$ is some constant.  The common wisdom holds that this expression fails to be extensive because classical mechanics overcounts the number of possible configurations.  A division of the partition function by $N!$ yields an extensive expression

\[ S_{T} = k N \left[ \ln \frac{V}{N} + \frac{3}{2} \ln \frac{E}{N} + X \right] \; , \]

as the ‘correct’ one, and the conclusion is a philosophical one: there is no escaping quantum statistics; all gases are made up of indistinguishable (I) particles.

This conclusion seems ably rebutted by a paper entitled Statistical mechanics of colloids and Boltzmann’s definition of the entropy, by Robert H. Swendsen in 2006 in the American Journal of Physics.  Swendsen’s argument centers on looking at whole milk – the kind you can buy in any supermarket or convenience store.

I must confess that even though I purchase whole milk regularly and was well aware of the term ‘homogenized’ attached to it, I never really bothered to understand just what was made homogeneous.  The basic notion is that homogenized milk is a colloid with tiny fat and protein globules (Swendsen states characteristic sizes of the fat globules of ~0.5 microns) suspended in a water medium.  Whole milk, at roughly 4% fat, is just such a colloid.

There are two key assumptions that Swendsen makes at the core of his analysis of whole milk as a classical colloid:

  1. The globules are distinguishable
  2. The globules constitute a gas

That the globules are distinguishable is strongly supported by the fact that, at a diameter of ~0.5 microns, each globule contains approximately $10^{9}$ atoms (give or take an order of magnitude), and so it would be extremely unlikely that any two globules contain exactly the same number of atoms.  The odds of finding identical globules drop many orders of magnitude more once one considers that each globule will contain some amount of foreign contaminants, so that both the composition and the number of atoms found within any given globule will likely be unique; each globule is thus microscopically distinguishable.

That the globules can be modeled as an ideal gas takes a bit more thought.  The key features of an ideal gas are that it is a collection of similar objects that interact with each other only over a very short range and that the time between interactions is large compared to the duration of an interaction.  The fact that the globules are suspended in water, a substance which continuously jostles them, doesn’t alter the fact that they interact with other fat globules, through a short-range electrostatic repulsion, only occasionally.

With Swendsen’s two assumptions well-supported, we are now equipped to argue against the conclusion that quantum mechanics is inescapable.  Here we have a gas of distinguishable particles, all much larger than an atom so that quantum statistics can hold no sway, for which the traditional expression for entropy predicts startlingly wrong conclusions.  One, we’ve already encountered in the Gibbs paradox discussion in the last post.  The other, which is a variation, also deals with mixing and goes something like this.

Imagine that we divide a tank of total volume $V$ into two subdivisions $V  = V_1 + V_2$ subject to the constraint $V_1 > V_2$.  The larger sub-volume $V_1$ is filled with whole milk and the smaller sub-volume $V_2$ is filled with skim milk (completely devoid of fat globules).  Let $N$ be the total number of fat globules in the system, which are initially contained in $V_1$, and $E = E_1$ be their total energy.  For simplicity, we can also assume that the average energy per particle $\epsilon$ remains fixed (no heat transfer and no work done).  The initial entropy of the system given by the traditional formula is

\[ S_{TCD,initial} = k N \left[ \ln V_1 + \frac{3}{2} \ln \epsilon + X \right] \; . \]

We then imagine opening a small port for the two systems to mix and then closing it.  The final entropy is

\[ S_{TCD,final} = k N_1 \left[ \ln V_1 + \frac{3}{2} \ln \epsilon + X \right] + k N_2 \left[ \ln V_2 + \frac{3}{2} \ln \epsilon + X \right] \; . \]

The difference in the entropy is then

\[ \Delta S_{TCD} = k N_1 \ln V_1 + k N_2 \ln V_2 - k N \ln V_1 = k N_2 \left( \ln V_2 - \ln V_1 \right) = k N_2 \ln \left( \frac{V_2}{V_1} \right) \; . \]

And here is the problem: given that by construction $V_2 < V_1$, the entropy change is always negative, even though mixing is an irreversible process; it takes work to restore the system to its ‘before’ state (macroscopically, that all the globules are back in the larger volume, if not precisely at the same microstate of positions ${\vec r}_i$ and velocities ${\vec v}_i$).
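The sign of this entropy change is easy to exhibit numerically.  A minimal sketch follows; the globule count and volumes are arbitrary illustrative values of mine.

```python
import math

k_B = 1.380649e-23  # J/K, Boltzmann constant

def delta_S_TCD(N2, V1, V2):
    # entropy change k N_2 ln(V_2/V_1) predicted by the traditional formula
    return k_B * N2 * math.log(V2 / V1)

# arbitrary illustrative values with V2 < V1, as in the construction
dS = delta_S_TCD(N2=1_000, V1=2.0, V2=1.0)
```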

This second inconsistency (and likely there are others) further emphasizes that the classical expression is deeply flawed and, of course, we already knew this.  But we can’t resort to quantum mechanics to come in and save the day, as was done with simpler gases, since each object in this system is distinguishable.

Swendsen resolves this problem by arguing that the methodology that led to the classical expression is wrong because it relates entropy to the volume in phase space and not to the probability.  To quote Swendsen:

Oddly enough, Boltzmann would not have encountered these problems, because he would not have used Eq.1 [for $S_{TCD}$]. He wrote the entropy (in modern notation) as

\[ S_{dist} = kN \left[ \ln \frac{V}{N} + \frac{3}{2} \ln \frac{E}{N} + X + 1\right] \; .\] If we use [this equation], the entropy remains constant in the first experiment when the wall between the two subvolumes of milk is either removed or reinserted, as is appropriate for a reversible process. For the second experiment, it is easy to show that $S_{dist,total}$ is always positive; the entropy increases as it must for an irreversible process.
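For contrast, the same toy mixing experiment can be evaluated with $S_{dist}$.  A hedged sketch (same invented numbers as before, $k = 1$, with the final state taken as the most probable partition, in which the densities equalize):

```python
import math

def s_dist(n, volume, energy, k=1.0, x=0.0):
    """Boltzmann/Swendsen entropy S = k N [ln(V/N) + (3/2) ln(E/N) + X + 1]."""
    return k * n * (math.log(volume / n) + 1.5 * math.log(energy / n) + x + 1.0)

v1, v2 = 2.0, 1.0
n_total, eps = 1000, 1.0

# Before: all N globules in V1.
s_initial = s_dist(n_total, v1, n_total * eps)

# After: most probable partition, equal densities in V1 and V2.
n1 = n_total * v1 / (v1 + v2)
n2 = n_total - n1
s_final = s_dist(n1, v1, n1 * eps) + s_dist(n2, v2, n2 * eps)

delta_s = s_final - s_initial
print(delta_s)  # positive: k N ln((V1+V2)/V1), as an irreversible process demands
```

With the densities equalized, the change collapses to $k N \ln\left(\frac{V_1+V_2}{V_1}\right)$, which is positive for any $V_2 > 0$.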

The next blog will delve into this question about probability a bit more to show how Monte Carlo simulations dealing with microstates dovetail with entropy and these observations.

Gibbs Paradox

This month’s column builds upon the basic building blocks from last month, namely that despite the seemingly simple presentation that most textbooks afford the idea of entropy, there is an enormous amount of subtlety and nuance in an idea that is well over a hundred years old.  As discussed in that earlier post, Robert Swendsen argues in his 2011 article How physicists disagree on the meaning of entropy (American Journal of Physics 79, 342) that the primary area where things seem to break down is that different people presuppose an implicit set of assumptions not necessarily shared by anyone else.  To quote Swendsen:

When people discuss the foundations of statistical mechanics, the justification of thermodynamics, or the meaning of entropy, they tend to assume that the basic principles they hold are shared by others.  These principles often go unspoken, because they are regarded as obvious. It has occurred to me that it might be good to restart the discussion of these issues by stating basic assumptions clearly and explicitly, no matter how obvious they might seem.

The one area that has triggered this realization was his recent work on (and subsequent debate over) the Gibbs paradox.

The Gibbs Paradox, named after Josiah Willard Gibbs, is the derivation from classical statistical mechanics that leads to an entropy expression for the ideal gas that is not extensive.  The expectation that entropy is extensive amounts to saying that one expects the entropy of a system to double when the system itself doubles in size (keeping all other things equal).  Since the ideal gas is the standard textbook example of a nontrivial collection of matter perfectly designed for understanding thermodynamics, finding a result that flies in the face of this expectation casts doubt on the underpinnings of statistical physics.  The usual way that this doubt is remedied is to patch up the classical analysis by appealing to quantum mechanics and the indistinguishability of particles.  The concept of indistinguishability among the particles, of course, lies at the heart of the Fermi-Dirac and Bose-Einstein statistics for fermions and bosons, respectively.  The idea is basically that there is no way of labeling, of painting, of hanging a number on individual particles and, therefore, that this basic ignorance must be built into the way we do statistical mechanics.

Specifically, the classical analysis of an ideal gas made of distinguishable particles (using what Swendsen calls the traditional definition of entropy) leads to the following expression for the entropy (‘CD’ = classical, distinguishable)

\[  S_{CD} = k N \left[ \ln V + \frac{3}{2} \ln \frac{E}{N} + X \right] \; , \]

where $$X$$ is some constant.  The objection is that this expression is not extensive due to the $$\ln V$$ term in brackets.  For example, scaling the system by some overall factor $$\alpha$$ ($$N \rightarrow \alpha N$$, $$E \rightarrow \alpha E$$, and $$V \rightarrow \alpha V$$) gives an entropy of

\[  S_{CD,\alpha} = k \alpha N \left[ \ln ( \alpha V)  + \frac{3}{2} \ln \frac{E}{N} + X \right] \; , \]

which simplifies to

\[ S_{CD,\alpha} = \alpha S_{CD} + k N  \alpha \ln \alpha \; . \]
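This scaling identity can be verified numerically.  The following sketch (arbitrary values for $N$, $E$, $V$, with $k = 1$) compares the directly scaled entropy against $\alpha S_{CD} + k \alpha N \ln \alpha$:

```python
import math

def s_cd(n, energy, volume, k=1.0, x=0.0):
    """Traditional entropy S_CD = k N [ln V + (3/2) ln(E/N) + X]."""
    return k * n * (math.log(volume) + 1.5 * math.log(energy / n) + x)

n, e, v = 100.0, 50.0, 2.0

for alpha in (2.0, 3.0, 10.0):
    direct = s_cd(alpha * n, alpha * e, alpha * v)
    predicted = alpha * s_cd(n, e, v) + alpha * n * math.log(alpha)  # k = 1
    print(alpha, direct, predicted)  # the last two columns agree
```

Note that $E/N$ is unchanged by the scaling, so the extra $k \alpha N \ln \alpha$ comes entirely from the $\ln(\alpha V)$ term.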

On the surface, the lack of extensivity might not seem alarming, but consider the following composite system composed of two tanks of identical gas placed side-by-side.  Each collection has the same density and average energy per particle and each has the same volume.  Further suppose that there is a sliding panel at the interface between the tanks.  By removing the partition, the tank size now doubles (i.e. $$\alpha = 2$$) and the entropy change is

\[S_{CD,new} - S_{CD,old} = 2 S_{CD} + 2 k N \ln 2 - 2 S_{CD} = 2 k N \ln 2 \; . \]

At this point, Gibbs notes that something is quite wrong.  The removal of the partition is a reversible process (since the gas is thermodynamically the same on both sides the presence or absence of the partition shouldn’t make a difference), meaning that the entropy should not increase at all. 

The remedy found in most textbooks (e.g. Fundamentals of Statistical and Thermal Physics by Reif, from which the following quoted expressions come) starts by arguing that when we remove the partition and allow the gas molecules in one tank to mix with those in another, we are implicitly assuming them “individually distinguishable, as though interchanging the positions of two like molecules would lead to a physically distinct state of the gas.”   The argument concludes by directing us to correct for the overcounting that “taking classical mechanics too seriously” has foisted upon us.  The correction for over-counting involves dividing a term earlier in the derivation (the partition function) by $$N!$$, which corrects the entropy (now adapted to indistinguishable particles, hence the change from ‘D’ to ‘I’) to read

\[ S_{CI} = k N \left[ \ln \frac{V}{N} + \frac{3}{2} \ln \frac{E}{N} + X’ \right] \; , \]

which is obviously extensive, with the equally obvious implication that the problem is solved and nothing more needs to be done.
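A quick numerical check (same style of sketch as before, with $k = 1$ and arbitrary values) confirms both the extensivity of $S_{CI}$ and the disappearance of the paradox:

```python
import math

def s_ci(n, energy, volume, k=1.0, xp=0.0):
    """Corrected entropy S_CI = k N [ln(V/N) + (3/2) ln(E/N) + X']."""
    return k * n * (math.log(volume / n) + 1.5 * math.log(energy / n) + xp)

n, e, v = 100.0, 50.0, 2.0

# Gibbs's thought experiment: one doubled tank vs. two identical tanks.
alpha = 2.0
scaled = s_ci(alpha * n, alpha * e, alpha * v)
two_tanks = 2 * s_ci(n, e, v)

print(scaled, two_tanks)  # identical: removing the partition changes nothing
```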

Here the story seems to have stalled for some long period of time (decades), most likely due to the belief that quantum mechanics was the correct viewpoint (or at least a more correct one) for the world at large.  It is only fairly recently that a revived interest in putting classical statistical mechanics on firmer footing arose.  The result of this new effort has been the rediscovery of an old definition of entropy that Swendsen, who has been championing this viewpoint for nearly two decades, argues leads to more sensible results than a simple reflexive appeal to quantum mechanics.  And his most compelling argument to support this revived viewpoint is a substance that is likely to surprise:  simple homogenized milk.  However, that story, in all its glory, will have to wait until next month’s column.

An Invitation to Entropy

The subject matter over the last few months has touched upon thermodynamics in a variety of guises.  For example, the concepts of enthalpy and isentropic flow have played a key role in compressible fluid flow.  In the posts discussing the Maxwell relations, the thermodynamic square and the classic relationships between second order partial derivatives were the main tools used to eliminate pesky terms involving the entropy in favor of quantities easier to measure in the lab.

It seems that it is now prudent to put down a few notions about entropy itself.  No other physical quantity, with the possible exception of energy, is as ubiquitously used as entropy, and none is as poorly understood.  Indeed, in his 2011 article entitled How physicists disagree on the meaning of entropy, Robert Swendsen starts with the quotation from von Neumann that “nobody understands entropy”.  Chemists use entropy to determine the direction of chemical reactions, physicists use it when looking at matter in motion (e.g. compressible gas within a cylinder), electrical engineers use it when characterizing information loss on a channel, the amount by which software can compress a file depends on its entropy, and so on.

Entropy seems to be a Swiss army knife concept with lots of different built-in gadgets that can be pulled out and used on a moment’s notice.  It’s no wonder that such a multi-faceted idea is not only poorly understood but also gives rise to radically contradictory notions.  For example, Swendsen starts his article with the following list of 18 statements that he has seen or heard attributed to entropy:

  • The theory of probability has nothing to do with statistical mechanics.
  • The theory of probability is the basis of statistical mechanics.
  • The entropy of an ideal classical gas of distinguishable particles is not extensive.
  • The entropy of an ideal classical gas of distinguishable particles is extensive.
  • The properties of macroscopic classical systems with distinguishable and indistinguishable particles are different.
  • The properties of macroscopic classical systems with distinguishable and indistinguishable particles are the same.
  • The entropy of a classical ideal gas of distinguishable particles is not additive.
  • The entropy of a classical ideal gas of distinguishable particles is additive.
  • Boltzmann defined the entropy of a classical system by the logarithm of a volume in phase space.
  • Boltzmann did not define the entropy by the logarithm of a volume in phase space.
  • The symbol W in the equation S=k log W, which is inscribed on Boltzmann’s tombstone, refers to a volume in phase space.
  • The symbol W in the equation S=k log W, which is inscribed on Boltzmann’s tombstone, refers to the German word “Wahrscheinlichkeit” (probability).
  • The entropy should be defined in terms of the properties of an isolated system.
  • The entropy should be defined in terms of the properties of a composite system.
  • Thermodynamics is only valid in the “thermodynamic limit,” that is, in the limit of infinite system size.
  • Thermodynamics is valid for finite systems.
  • Extensivity is essential to thermodynamics.
  • Extensivity is not essential to thermodynamics.

This list, which is really a list of 9 pairs of contradictory statements about entropy, goes out of its way to show just how many diverging ideas scientists have about entropy.  And since it is trendy to have one’s own pet idea(s) about this fundamental concept, it seems about time that I got my own; that is the aim of this blog and the ones that follow.  As a warm up to a deeper dive, I decided to return to the basic ideas introduced in Halliday and Resnick physics.

The most intriguing aspect of the textbook discussion of entropy is that it is a state variable, that is to say, its value depends only on what the system is doing at any given time and not how the system got there.  This is a key concept because it means that we are relieved of trying to find the particular path through which the system evolved.

What is particularly remarkable about this discovery is that it came about in the 19th century.  This was a time in which the idea of smooth distributions of matter held sway, when the primary concept was that of a field, continuous in every way; a time well before the concept of discrete, microscopic states emerged from an understanding of the quantum mechanics of atoms, molecules, and other substances.

The thermodynamic relationship for entropy reads

\[ S_f – S_i = \int_{i}^{f} \frac{dQ}{T} \; , \]

where any reversible path connecting the initial state (denoted by $$i$$) with the final state (denoted by $$f$$) will do.  Nowhere in this definition can one find any clear signpost to indicate lumpy matter or the concept of the discrete.  In addition, nothing in this definition even hints at a particular substance or class of them; nor is a particular phase of matter required.  A breathtaking sweep of generality is hidden behind a few simple glyphs on a page.

As an example of the universality of the fundamental statement, consider a familiar household system, say a glass of milk.  If we do something prosaic like warm it by 10 degrees Celsius, we arrive at the same entropy change as we would have if we had boiled the milk off into a vapor, melted the glass down, reconstituted the latter, and recondensed the former, ending in the same final state.  No matter what bizarre journey we subject a material to, the resulting change in entropy will simply depend on the initial and final configurations and not on the details connecting one to the other.

The usual playground for first thinking about entropy is the ideal gas, and the usual example given to the student is the computation of the entropy change of the free expansion of a gas.  The context of this discussion usually follows upon the heels of an introduction to the kinetic theory of gases, a theory that presupposes the existence of atoms.  The free expansion of a gas is, perhaps, the most radical of all irreversible processes.  There is no orderly flow, the very concept of a continuum fails to apply; every atom goes its own way and no macroscopic evolution of thermodynamic state can even be imagined.

And yet, almost blithely, textbooks argue the ease with which the entropy change in such a process can be calculated.  The argument goes as follows.  From the kinetic theory of gases, one can show that during a free expansion, the internal energy does not change.  The reason for this is that the gas does no work (that is what ‘free’ really means) and the process happens fast enough that no heat is transferred in or out.  Since the change in internal energy is given by

\[ \Delta U = n C_V \Delta T \; \]

any ideal gas process that doesn’t change the internal energy also leaves the temperature unchanged.  The matching thermodynamic process, where reversibility and equilibrium are maintained at all times is the isothermal expansion. 

The first law

\[ dU = dQ - dW \; \]

can be specialized to any reversible ideal gas process, to yield

\[ n C_V dT = dQ - p dV = dQ - \frac{n R T}{V} dV \; .\]

Solving for $$dQ/T$$ gives

\[ \frac{dQ}{T} = n R \frac{dV}{V} + n C_V \frac{dT}{T} \; .\]

Integrating both sides from the initial to final state gives

\[ S_f - S_i = \int_i^{f} \frac{dQ}{T} = n R \ln \left( \frac{V_f}{V_i} \right) + n C_V \ln \left( \frac{T_f}{T_i} \right) \; .\]

This simplifies for an isothermal process to be

\[ S_f – S_i = n R \ln \left( \frac{V_f}{V_i} \right) \; .\]
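As a concrete sketch (assuming a monatomic ideal gas with $C_V = \frac{3}{2}R$ and a doubling of the volume; all numbers illustrative), the general expression reduces to $nR\ln 2$ for the isothermal case:

```python
import math

R = 8.314  # molar gas constant, J/(mol K)

def delta_s(n, v_i, v_f, t_i, t_f, c_v):
    """S_f - S_i = n R ln(V_f/V_i) + n C_V ln(T_f/T_i) for an ideal gas."""
    return n * R * math.log(v_f / v_i) + n * c_v * math.log(t_f / t_i)

# Free expansion of 1 mol into double the volume: T is unchanged,
# so only the volume term survives.
ds = delta_s(n=1.0, v_i=1.0, v_f=2.0, t_i=300.0, t_f=300.0, c_v=1.5 * R)
print(ds)  # n R ln 2, about 5.76 J/K
```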

So, the change in entropy for a free expansion must be exactly equal to the expression above even though, as the free expansion occurs, there is a complete absence of anything resembling a well-defined volume.  This is a subtle result that gets only more subtle when one reflects on the fact that statistical mechanics wasn’t available when the concept of entropy first appeared.

It is for this reason that the next several blogs will be looking at entropy.

Maxwell’s Relations in Action

Last month’s column introduced the notion of the thermodynamic square as a mnemonic for organizing certain second-order partial derivatives amongst the various thermodynamic potentials: the internal energy $$U$$, the Gibbs and Helmholtz free energies $$G$$ and $$F$$, and the enthalpy $$H$$.  As previously alluded to, physicists primarily use these relations (called the Maxwell relations) to eliminate terms difficult or impossible to measure experimentally in favor of parameters that are easily measured in the lab.

A practical example of the application of the Maxwell relations is the simplification of the ‘first’ and ‘second $$T dS$$ equations’ as listed in Exercise 7.3-1 on page 189 of Herbert B. Callen’s book Thermodynamics and an Introduction to Thermostatistics, 2nd edition.  The relevant physical properties in these equations are (Sec. 3.9 of Callen):

  • the number of particles $$N$$,
  • the differential heat $$dQ = T dS$$,
  • the heat capacity at constant volume $$c_V = \frac{T}{N} \left( \frac{\partial S }{\partial T} \right)_V= \frac{1}{N} \left(\frac{\partial Q }{\partial T}\right)_{V}$$,
  • the heat capacity at constant pressure $$c_P = \frac{T}{N} \left( \frac{\partial S }{\partial T} \right)_P = \frac{1}{N}\left(\frac{\partial Q}{\partial T}\right)_{P}$$,
  • the coefficient of thermal expansion $$\alpha = \frac{1}{V} \left(\frac{\partial V}{\partial T}\right)_P $$, and
  • the isothermal compressibility $$\kappa_T = – \frac{1}{V} \left( \frac{\partial V}{\partial P} \right)_T$$.

First $$TdS$$ Relation

The first relation we want to verify is

\[ T dS = N c_V dT + \frac{T \alpha}{\kappa_T} dV \; .\]

From the form of this equation, assume that the quantity in question is the entropy as a function of the temperature and volume $$S = S(T,V)$$.  Taking the first differential gives

\[ dS = \left(\frac{\partial S}{\partial T}\right)_{V} dT + \left(\frac{\partial S}{\partial V}\right)_{T} dV \; .\]

The first term is relatively easy to deal with in terms of the heat capacity at constant volume $$c_V$$:

\[ \left(\frac{\partial S}{\partial T}\right)_{V} = \frac{N c_V}{T} \; .\]

The second term requires a bit more work.  First use the Maxwell relation associated with the Helmholtz free energy $$F$$ to get

\[ \left(\frac{\partial S}{\partial V}\right)_{T} = \left(\frac{\partial P}{\partial T}\right)_{V} \; .\]

Next use the cyclic rule for partial derivatives

\[ \left(\frac{\partial P}{\partial T}\right)_{V} \left(\frac{\partial T}{\partial V}\right)_{P} \left(\frac{\partial V}{\partial P}\right)_{T} = – 1 \; ,\]

and solve for

\[ \left(\frac{\partial P}{\partial T}\right)_{V} = \frac{-1}{\left(\frac{\partial V}{\partial P}\right)_{T} \left(\frac{\partial T}{\partial V}\right)_{P} } \; . \]

Use the reciprocal rule to move the $$\left(\frac{\partial T}{\partial V}\right)_{P}$$ to the numerator to get

\[ \left(\frac{\partial P}{\partial T}\right)_{V} = - \left(\frac{\partial V}{\partial T}\right)_{P} / \left(\frac{\partial V}{\partial P}\right)_{T} \; .\]

Multiply the numerator and denominator by $$1/V$$ and use the definitions of $$\alpha$$ and $$\kappa_T$$ to get

\[ \left(\frac{\partial P}{\partial T}\right)_{V} = \frac{\alpha}{\kappa_T} \; . \]

At this point, the first differential stands as

\[ dS = \frac{N c_V}{T} dT + \frac{\alpha}{\kappa_T} dV \; . \]

Multiplying each side by $$T$$ gets us to the final form of the first $$T dS$$ equation

\[ T dS = N c_V dT + \left(\frac{T \alpha}{\kappa_T} \right) dV \; .\]
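The key coefficient $(\partial P/\partial T)_V = \alpha/\kappa_T$ can be checked numerically.  The sketch below uses the ideal gas as a test equation of state and central finite differences; the specific numbers are illustrative, not essential:

```python
R = 8.314  # molar gas constant, J/(mol K)
n = 1.0    # mol

def volume(T, P):
    """Ideal-gas equation of state solved for V."""
    return n * R * T / P

T, P = 300.0, 101325.0
V = volume(T, P)
hT, hP = T * 1e-6, P * 1e-6

# (dP/dT)_V from P(T, V) = nRT/V by central difference
dP_dT = (n * R * (T + hT) / V - n * R * (T - hT) / V) / (2 * hT)

# alpha = (1/V)(dV/dT)_P and kappa_T = -(1/V)(dV/dP)_T from their definitions
alpha = (volume(T + hT, P) - volume(T - hT, P)) / (2 * hT) / V
kappa = -(volume(T, P + hP) - volume(T, P - hP)) / (2 * hP) / V

print(dP_dT, alpha / kappa)  # both equal nR/V = P/T for an ideal gas
```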

Second $$TdS$$ Relation

The second relation is

\[ T dS = N c_P dT - T V \alpha dP \; . \]

From the form of this equation, assume that the quantity in question is the entropy as a function of the temperature and pressure $$S = S(T,P)$$.  As in the first $$T dS$$ relation, taking the first differential gives

\[ dS = \left(\frac{\partial S}{\partial T}\right)_{P} dT + \left(\frac{\partial S}{\partial P}\right)_{T} dP \; .\]

The first term, as in the case above, is also relatively easy to deal with in terms of the heat capacity, this time at constant pressure, $$c_P$$:

\[ \left(\frac{\partial S}{\partial T}\right)_{P} = \frac{N c_P}{T} \; .\]

The second term only requires a Maxwell relation in terms of the Gibbs free energy

\[ -\left(\frac{\partial S}{\partial P}\right)_{T} = \left(\frac{\partial V}{\partial T}\right)_{P} \; .\]

The first differential becomes

\[ dS = \frac{N c_P}{T} dT - \left(\frac{\partial V}{\partial T}\right)_{P} dP \; .\]

Multiplying the second term by $$V/V$$ and simplifying gives

\[ dS = \frac{N c_P}{T} dT - V \alpha dP \; ,\]

which becomes the desired relation when multiplying both sides by $$T$$

\[ T dS = N c_P dT - T V \alpha dP \; .\]
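The Maxwell relation used here, $(\partial S/\partial P)_T = -(\partial V/\partial T)_P = -V\alpha$, can be verified with the same finite-difference game.  This sketch assumes the textbook ideal-gas entropy $S(T,P) = n(c_P \ln T - R \ln P)$, valid up to an additive constant:

```python
import math

R = 8.314      # molar gas constant, J/(mol K)
n = 1.0        # mol
c_p = 2.5 * R  # monatomic ideal gas

def entropy(T, P):
    """Ideal-gas entropy S(T, P), up to an additive constant."""
    return n * (c_p * math.log(T) - R * math.log(P))

T, P = 300.0, 101325.0
V = n * R * T / P
alpha = 1.0 / T  # coefficient of thermal expansion of an ideal gas

hP = P * 1e-6
dS_dP = (entropy(T, P + hP) - entropy(T, P - hP)) / (2 * hP)

print(dS_dP, -V * alpha)  # Maxwell: (dS/dP)_T = -(dV/dT)_P = -V alpha
```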

To summarize, these two relations show how to express the heat $$T dS$$, written in terms of the entropy (which cannot be measured directly), in terms of parameters, such as temperature, pressure, and heat capacity, that are easily measured in the lab.   All that is needed is the machinery of partial derivatives.  This observation is the reason that so many textbooks on thermodynamics have specific sections devoted to these approaches.

Thermodynamic Partials – Maxwell’s Relation

Last month’s installment presented a clean derivation of the classic relations between partial derivatives and showed a simple example of how they work in the concrete.  As nice as that presentation is, the real power of these relations is only realized when dealing with systems with a large number of variables in play and in which various manipulations are required to extract meaning from the systems involved.  The prototypical example is classical thermodynamics.   

The fundamental concept in thermodynamics is the existence of a thermodynamic potential, a scalar function that encodes the state of the thermodynamic system in terms of the measurable quantities that describe the system, such as volume or temperature.  Changes in the values of these independent physical variables (sometimes called the natural variables) relate directly to changes in the potential through the corresponding partial derivatives.

The textbook example of this type of relation is defined in the first law of thermodynamics, which asserts that there exists a function, called the internal energy $$U$$, that is a function of the entropy $$S$$, the volume $$V$$, and the number of particles $$N$$ making up the physical system being modeled (assuming a single type of substance; generalizations to multiple species are straightforward but cumbersome).  Changes in the internal energy $$U$$ can be calculated by

\[ dU = T dS – P dV + \mu dN \; ,\]

where the temperature, pressure and chemical potential are defined as

\[ {T} = \left( \frac{\partial U}{\partial S} \right)_{V,N} \; , \]

\[ {P} = -\left( \frac{\partial U}{\partial V} \right)_{S,N} \; , \]

and

\[ {\mu} = \left( \frac{\partial U}{\partial N} \right)_{S,V} \; , \]

respectively.

Without dwelling on the theory, suffice it to say that laboratory conditions vary and there are many circumstances where it is preferable to work with a different set of independent variables.  For example, heating water on a stove top in an uncovered pan is better understood in terms of fixed pressure rather than fixed volume.

Thermodynamics supplies an approach for dealing with these cases using the Legendre transformation.  In the stove-top experiment mentioned above, the appropriate potential is called the enthalpy, defined as

\[ H = U + PV \; . \]

Taking the first differential gives

\[ dH = dU + P dV + V dP \\ = T dS – P dV + \mu dN + P dV + V dP \\ = T dS + V dP + \mu dN \; , \]

which demonstrates that $$H = H(S,P,N)$$.

Assuming that the order of differentiation can be exchanged, there are several relationships that exist (called Maxwell relations) between various partial derivatives.  For example:

\[ \left( \frac{\partial T}{\partial V} \right)_{S,N} = \left( \frac{\partial  }{\partial V } \left( \frac{\partial U}{\partial S} \right)_{V,N} \right)_{S,N} \\ = \left( \frac{\partial }{\partial S} \left( \frac{\partial U}{\partial V} \right)_{S,N} \right)_{V,N} \\ = -\left( \frac{\partial P}{\partial S} \right)_{V,N} \; . \]

Obviously, there is a lot of notational overhead in the above relation.  For the sake of this analysis, we will make two simplifications to improve the clarity.  First, we will assume a single species with a fixed number of moles.  This assumption removes the need to carry $$N$$ and $$\mu$$.  Second, we will forego keeping track of the other variables being held constant.  Since we will be tacitly tracking which thermodynamic potential is being used, there is little chance of confusion.

The primary purpose the Maxwell relations serve is to eliminate terms involving the entropy in favor of physical parameters that can be experimentally measured, such as temperature, volume, or pressure.
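The mixed-partial relation derived above from $U$ can be checked numerically.  A sketch, assuming the monatomic ideal gas form $U(S,V) = A V^{-2/3} e^{2S/(3nR)}$ (the constant $A$ is arbitrary and fixes the entropy zero) and central finite differences:

```python
import math

nR = 8.314  # n R for one mole, J/K
A = 1.0     # arbitrary constant fixing the entropy zero

def U(S, V):
    """Internal energy of a monatomic ideal gas as a function of S and V."""
    return A * V ** (-2.0 / 3.0) * math.exp(2.0 * S / (3.0 * nR))

h = 1e-4

def T(S, V):
    """Temperature T = (dU/dS)_V by central difference."""
    return (U(S + h, V) - U(S - h, V)) / (2 * h)

def P(S, V):
    """Pressure P = -(dU/dV)_S by central difference."""
    return -(U(S, V + h) - U(S, V - h)) / (2 * h)

S0, V0 = 10.0, 2.0
dT_dV = (T(S0, V0 + h) - T(S0, V0 - h)) / (2 * h)
dP_dS = (P(S0 + h, V0) - P(S0 - h, V0)) / (2 * h)

print(dT_dV, -dP_dS)  # equal: (dT/dV)_S = -(dP/dS)_V
```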

A useful mnemonic exists for looking up the various relations without scanning through a table.  It’s called the thermodynamic square, and it’s constructed as below

The 4 most important thermodynamic potentials, the Helmholtz free energy $$F = U - TS$$, the Gibbs free energy $$G = U + PV - TS$$, the enthalpy $$H = U + PV$$, and the internal energy $$U$$, are arranged along the edges starting at the top and going clockwise.  The thermodynamic variables (the temperature $$T$$, the pressure $$P$$, the entropy $$S$$, and the volume $$V$$) are arranged at the corners, starting just after $$F$$ and also going clockwise.  The Maxwell relations relate a partial derivative expressed in terms of three consecutive corners (shaded blue below) to the partial derivative expressed in terms of another three consecutive corners (shaded yellow).  The arrows lying along the diagonals set the sign of the partial derivative based on which variable is in the numerator:  those with the arrowhead get a positive sign and those without a negative sign.

These rules are far easier to understand with the mixed partial derivatives of $$U$$ discussed above.

The blue shaded region reads counterclockwise while the yellow region reads clockwise.  These orientations follow from the overlap they share on the left-hand side of the square (the one labeled $$U$$).  The signs are determined by the arrows.  Since the $$T$$ corner is the numerator and it has an arrowhead, the partial derivative is positive.  Likewise, since the $$P$$ corner is the numerator and it lacks an arrowhead, the partial derivative is negative.

Once the basic operations using the square are understood, it is easier to present a single square with the common side shaded in both blue and yellow, resulting in green.  An example is the Maxwell relation coming from the Helmholtz free energy $$F$$:

With these tools, we can get to the meatier topics, namely using a combination of the classical rules for partial derivatives and the Maxwell relations, as presented in the thermodynamic square, to eliminate the entropy in favor of physically measurable quantities.  But this will be the topic for next month’s post.

Partial Derivatives – Completely Done

Partial derivatives are an important mathematical tool in a number of physics disciplines, most notably field theories (e.g. electricity & magnetism and general relativity) and in thermodynamics.

However, working with partial derivatives is always a bit tricky, and teaching students about them is usually fraught with difficulties.  So it was to my pleasant surprise that I found a really nice discussion of how to derive the various ‘classic’ rules cleanly presented in Classical and Statistical Thermodynamics by Ashley H. Carter.

My presentation here is strongly influenced and closely follows her presentation in Appendix A, although I’ve added on a bit in the theoretical flow and I’ve also provided explicit examples in terms of the standard paraboloid found in freshman calculus. 

Assume a function of 3 variables can be expressed as $$f(x,y,z) = 0$$.  This equation can be viewed as a constraint equation linking the values of the variables such that only two of them are independent.  That means that we can (at least locally) solve for:

  • $$x = x(y,z)$$
  • $$y = y(x,z)$$
  • $$z = z(x,y)$$

Focus on the first and second forms (the other pairings follow from a simple relabeling of the variables).  The corresponding differentials are

\[ dx = \left( \frac{\partial x}{\partial y} \right)_z dy + \left( \frac{\partial x}{\partial z} \right)_y dz \; \]

and

\[ dy = \left( \frac{\partial y}{\partial x} \right)_z dx + \left( \frac{\partial y}{\partial z} \right)_x dz \; .\]

Now substitute the expansion of $$dy$$ into the expansion of $$dx$$

\[ dx = \left(\frac{\partial x}{\partial y}\right)_z \left[ \left(\frac{\partial y}{\partial x}\right)_z dx + \left(\frac{\partial y}{\partial z}\right)_x dz \right] + \left(\frac{\partial x}{\partial z}\right)_y dz \; ,\]

which simplifies to

\[ dx = \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z dx + \left[ \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x + \left(\frac{\partial x}{\partial z}\right)_y \right] dz \; .\]

Putting it all together gives

\[ \left[ 1 – \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z \right] dx – \left[ \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x + \left(\frac{\partial x}{\partial z}\right)_y \right] dz = 0 \; .\]

Since $$dx$$ and $$dz$$ are independent, each differential can be set to zero independently, giving one of the classic identities.

First set $$dz = 0$$ to get

\[ \left(\frac{\partial x}{\partial y}\right)_z = 1/ \left(\frac{\partial y}{\partial x}\right)_z \; , \]

which is called the reciprocal rule.

Next, setting $$dx = 0$$ yields

\[ \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x = \; - \; \left(\frac{\partial x}{\partial z}\right)_y \; , \]

which is called the fraction rule.

The manipulations are complete when we use the reciprocal rule in the fraction rule and simplify to get

\[ \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y = -1 \; , \]

which is called the cyclic rule.

Let’s take a look at these relationships in action. Consider the implicit definition of the paraboloid

\[ x^2 + y^2 – z = 0 \; . \]

As mentioned earlier, this equation can be considered as a constraint equation that selects out a value for any one of the three variables given the other two. In other words, we can imagine a lookup table where we select a value of $$x$$ and $$y$$, rummage through the table to find a row with both values, and then scan to the right to find the value of $$z$$ that satisfies the implicit equation.

How do we construct this table, not at a finite set of points but functionally, so that it works at any point? It is natural and easy to determine $$z$$ given $$x$$ and $$y$$ by simply rewriting the implicit equation as

\[ z(x,y) = x^2 + y^2 \; .\]

However, it isn’t as easy to express $$x$$ or $$y$$ as functions of the remaining two variables because of the two possible signs that result from taking the square root. We need to have four functional relationships

\[ x_p(y,z) = \sqrt{ z - y^2 } \; ,\]

\[ x_n(y,z) = \; - \; \sqrt{z - y^2} \; , \]

\[ y_p(x,z) = \sqrt{z - x^2} \; , \]

and

\[ y_n(x,z) = \; - \; \sqrt{z - x^2} \; , \]

depending on the particular combination of whether $$x$$ is positive or negative and whether $$y$$ is also positive or negative.  In the language of differential geometry, we have 5 charts in our atlas.

We are now in position to try the various relations derived above. For example, let’s examine the reciprocal relation in the first quadrant of the $$x$$-$$y$$ plane.  We need to use $$x_p$$ as our local chart.

\[ \left(\frac{\partial x_p}{\partial z}\right)_y = \frac{1}{2} \frac{1}{\sqrt{z - y^2}} \; \]

or once we recognize the denominator as $$x_p$$ 

\[\left(\frac{\partial x_p}{\partial z}\right)_y = \frac{1}{2 x_p} \; . \]

The ‘reciprocal’ partial derivative is

\[ \left( \frac{\partial z}{\partial x} \right)_y = 2 x = 2 x_p \; ,\]

where there is no need for the $$z$$-chart to distinguish between positive and negative values of $$x$$.  As expected, the derivatives are reciprocals of each other.

Next, let’s test the fraction rule.  For fun, this time let’s test it in the 2nd quadrant in the $$x$$-$$y$$ plane ($$x < 0$$ and $$y > 0$$).  Calculating the partial derivatives on the left-hand side yields

\[ \left(\frac{\partial x_n }{\partial y_p } \right)_z = \frac{y_p}{\sqrt{z - y_p^2}} = \; - \; \frac{y_p}{x_n} \; \]

and

\[ \left(\frac{\partial y_p }{\partial z} \right)_{x_n} = \frac{1}{2\sqrt{z-x_n^2}} = \frac{1}{2 y_p} \; . \]

It is a simple matter to verify that

\[ \left(\frac{\partial x_n }{\partial y_p} \right)_z \left(\frac{\partial y_p }{\partial z} \right)_{x_n} = -\frac{1}{2 x_n} \; \]

is identical to 

\[ - \left(\frac{\partial x_n }{\partial z} \right)_{y_p} = \frac{1}{2 \sqrt{z - y_p^2} } = -\frac{1}{2 x_n} \; .\]

Finally, for the cyclic rule, let’s go into the 4th quadrant in the $$x$$-$$y$$ plane ($$x>0$$ and $$y<0$$).  Taking each partial derivative in turns yields

\[ \left(\frac{\partial x_p }{\partial y_n} \right)_{z} = -\frac{y_n}{\sqrt{z-y_n^2}} = -\frac{y_n}{x_p} \; ,\]

\[ \left(\frac{\partial y_n }{\partial z} \right)_{x_p} = -\frac{1}{2 \sqrt{z-x_p^2} } = \frac{1}{2 y_n} \; ,\]

and 

\[ \left(\frac{\partial z }{\partial x_p} \right)_{y_n} = 2 x_p \; .\]

Multiplying these terms in order gives

\[ -\frac{y_n}{x_p} \frac{1}{2 y_n} 2 x_p = -1 \; .\]
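All of these checks can also be automated with finite differences.  A short sketch at the same 4th-quadrant sample point (the chart functions mirror those defined above; tolerances are loose to absorb numerical error):

```python
import math

def z_of(x, y):
    """The z-chart of the paraboloid."""
    return x * x + y * y

def x_p(y, z):
    """Chart for x > 0."""
    return math.sqrt(z - y * y)

def y_n(x, z):
    """Chart for y < 0."""
    return -math.sqrt(z - x * x)

# Sample point in the 4th quadrant: x > 0, y < 0.
x0, y0 = 0.6, -0.3
z0 = z_of(x0, y0)
h = 1e-6

dx_dy = (x_p(y0 + h, z0) - x_p(y0 - h, z0)) / (2 * h)    # (dx/dy)_z
dy_dz = (y_n(x0, z0 + h) - y_n(x0, z0 - h)) / (2 * h)    # (dy/dz)_x
dz_dx = (z_of(x0 + h, y0) - z_of(x0 - h, y0)) / (2 * h)  # (dz/dx)_y
dx_dz = (x_p(y0, z0 + h) - x_p(y0, z0 - h)) / (2 * h)    # (dx/dz)_y

print(dx_dy * dy_dz * dz_dx)  # cyclic rule: product is -1
print(dz_dx * dx_dz)          # reciprocal rule: product is 1
```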

Nice, neat, and more than partially done.