Formalist Analogies in Statistical Mechanics

July 31, 2010

I recently read Mark Steiner’s neat little book on the applicability of mathematics in physics. His main thesis is that the ways in which mathematics is successfully applied in physics are often anthropocentric. He takes this as a strike against naturalism.

One example of anthropocentric reasoning he identifies is the use of what he calls formalist analogies in the discovery/construction of new theories. A case that bugs many philosophers is “quantization”, where the quantum mechanical description of a system is derived by taking its classical description and replacing classical observables with quantum mechanical operators. This technique was introduced by Heisenberg, whose heuristics I expressed amazement at in my last post. The heuristic has its problems, such as how to handle products of classical variables given that quantum mechanical operators don’t necessarily commute, but there are standard workarounds that seem to work in most cases. Those problems are not really the issue Steiner is getting at, though. His issue is that the matrices Heisenberg uses to replace classical observables “have no independent physical meaning”; they are mere formalisms. The matrix equation one gets from a quantization of a classical equation is parasitic on the classical equation, which is itself “false” according to quantum mechanics. This lack of independent physical meaning, Steiner argues, means that we are not entitled to use induction to infer that since quantization works in certain model cases, it will work for all cases.
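To make the ordering problem with products of classical variables concrete, here is a minimal sketch in Python. It is my own illustration, not anything taken from Steiner or Heisenberg. It assumes a harmonic oscillator in units where ħ = m = ω = 1, truncated to a small matrix, and it uses symmetric ordering as one example of a standard workaround. The function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def is_hermitian(a, tol=1e-12):
    """True if the matrix equals its own conjugate transpose."""
    d = dagger(a)
    n = len(a)
    return all(abs(a[i][j] - d[i][j]) < tol for i in range(n) for j in range(n))

def oscillator_x_p(dim):
    """Position and momentum matrices in a truncated oscillator basis.

    Assumption: units with hbar = m = omega = 1, so that
    x = (a + a^dagger)/sqrt(2) and p = i(a^dagger - a)/sqrt(2).
    """
    a = [[0j] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = complex(math.sqrt(n))  # lowering operator
    ad = dagger(a)
    s = math.sqrt(2)
    x = [[(a[i][j] + ad[i][j]) / s for j in range(dim)] for i in range(dim)]
    p = [[1j * (ad[i][j] - a[i][j]) / s for j in range(dim)] for i in range(dim)]
    return x, p

DIM = 6
x, p = oscillator_x_p(DIM)
xp = matmul(x, p)
px = matmul(p, x)

# The classical product x*p has two inequivalent matrix counterparts.
commutator = [[xp[i][j] - px[i][j] for j in range(DIM)] for i in range(DIM)]

# Symmetric ordering: one conventional choice, not the only possible one.
symmetric = [[(xp[i][j] + px[i][j]) / 2 for j in range(DIM)] for i in range(DIM)]

print("[x, p] entry (0, 0):", commutator[0][0])
print("xp Hermitian?", is_hermitian(xp))
print("(xp + px)/2 Hermitian?", is_hermitian(symmetric))
```

The naive replacement xp is not even Hermitian, so it cannot serve as an observable, while the symmetrized product is. Nothing in the classical expression tells you which ordering to pick; the choice is a convention added by hand. The last diagonal entry of the commutator is distorted by the truncation, which is why only the (0, 0) entry is printed.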

I’ve always been disturbed by the approach to statistical mechanics that makes use of Gibbsian ensembles, particularly the grand canonical ensemble. Part of my discomfort may be because many textbooks introduce these ensembles using formalist analogies, in Steiner’s sense. Gibbs himself was thoroughly instrumentalist about the ensembles and did not ascribe any physical meaning to them, but modern textbooks are liable to be more cavalier about physical meaning. Thus, one often finds them treating the ensembles as more than just a calculational technique, which I think treads into formalist analogy territory.

Gibbs emphasizes throughout his classic monograph that his ensembles are purely imaginary and meant to make calculations easier. However, later textbooks have a tendency to try to justify the use of ensembles by a mixture of physical and formal analogies. For example, textbook authors often speak of the equilibrium ensemble derived from combining two grand canonical ensembles (Tolman is one example). They take the resultant ensemble as the equilibrium that would result from combining a representative from one ensemble with that from another. Taken merely as a calculational tool, this is unproblematic. The formalistic reasoning comes into play when the outcome of the interaction of ensembles is straightforwardly taken, without further justification, to represent the outcome of the interactions of actual systems. For, as in the quantization analogy, the ensemble is parasitic on the actual system for its physical relevance. When two systems interact, we do not have two ensembles interacting. So there is no physical case to be made that the outcome of the ensembles’ interaction also represents that of the systems’ interaction. Just as the success of quantization in a few cases doesn’t seem to give us reason to expect a successful induction to all cases, the success of representing a system on its own with an ensemble doesn’t seem to give us reason to expect a successful induction to cases where multiple systems interact.

Certainly there exist authors who are more careful about using ensembles. Fowler, for example, justifies their use not by claiming that the entire ensemble represents the system of interest, but rather by claiming that the system is itself a small part of some larger system that has the characteristics of an ensemble. This makes more physical sense, but it means there are more physically realistic conditions that your system of interest must fulfill before you can apply ensemble methods. In Fowler’s case, he requires that the system of interest be one of many subsystems of a large ensemble-like system, where the subsystems exchange only small amounts of energy, small compared to the total energy of the ensemble. At the same time, however, their interactions with one another must be significant enough to allow the entire ensemble to attain an equilibrium state.

Newer textbooks, however, tend to simply introduce Gibbsian ensembles without checking that they make physical sense or stating the restrictions that must accompany them, and then “justify” them with mere formal analogies. One wonders what the point of such “justifications” is. I prefer Gibbs’ honest admission that he introduces ensembles only because they give him the correct answers.

Even more annoying are cases where a “justification” for using an ensemble is introduced with reference to a realistic physical model, but the ensemble is then used for examples where the physical conditions in that model, the conditions that were relevant to the justification, do not hold! For example, Pathria introduces the grand canonical ensemble by considering a system exchanging particles and heat with a large reservoir. However, all the problems he next considers, to which the grand canonical ensemble is applied, are cases where particle number is conserved! The only exception is an example of adsorption of particles on a surface, which appears as an exercise at the end of the chapter. We know that the grand canonical ensemble gives us the right answers even for systems that have a constant number of particles because, for many-particle systems, the equilibrium ensemble is composed mostly of systems with the “equilibrium” number of particles, so when you average over the ensemble to get the equilibrium number, systems with a non-equilibrium number of particles contribute nearly nothing to the average. But this merely justifies the grand canonical ensemble as a calculational trick and is wholly separate from the physical model that was used to justify the ensemble method, a model whose salient features were then thoroughly ignored when the ensemble method was applied to other systems.
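The claim that systems with a non-equilibrium number of particles contribute almost nothing to the average can be checked numerically in the simplest case. The sketch below is my own illustration, not Pathria’s. It assumes a classical ideal gas, for which the particle number in the grand canonical ensemble follows a Poisson distribution, and the function names are hypothetical.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of n particles, evaluated in log space."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def relative_fluctuation(mean, width=12):
    """Standard deviation of N divided by its average.

    Assumption: classical ideal gas, so N is Poisson distributed in the
    grand canonical ensemble. The sum is cut off `width` standard
    deviations either side of the mean, where the weights are negligible.
    """
    sigma = math.sqrt(mean)
    lo = max(0, int(mean - width * sigma))
    hi = int(mean + width * sigma) + 1
    ns = range(lo, hi + 1)
    weights = [poisson_pmf(n, mean) for n in ns]
    total = sum(weights)
    avg = sum(n * w for n, w in zip(ns, weights)) / total
    var = sum((n - avg) ** 2 * w for n, w in zip(ns, weights)) / total
    return math.sqrt(var) / avg

for mean in (100, 10_000, 1_000_000):
    print(mean, relative_fluctuation(mean), 1 / math.sqrt(mean))
```

The relative spread tracks 1/sqrt(⟨N⟩), so for a macroscopic system with around 10^23 particles it is of order 10^-12. Note that this is exactly the “calculational trick” justification: it shows that the ensemble average agrees with the fixed-number answer, and says nothing about whether the reservoir picture applies to the system at hand.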


It feels like cheating.

July 23, 2010

I continue to be amazed at the flimsiness of the heuristics that physicists use, often successfully, to make important theoretical progress. A particularly shocking example I’ve just read is Heisenberg’s “discovery” that systems with symmetric wavefunctions correspond to those that obey Bose-Einstein statistics, and that those with anti-symmetric wavefunctions correspond to those that obey Pauli’s exclusion principle. He does not refer to Fermi-Dirac statistics since this was before Dirac “discovered” them, and Fermi’s discovery was also published in German only later (although he had published it in Italian earlier).

Why it feels like cheating:

  1. The entire paper is based on the analysis of systems of coupled harmonic oscillators. He gives a quantum mechanical treatment of them that results in two groups of possible solutions, and shows that transitions between solutions can take place only between the members of each group. He then notes that we see only one of the two possible systems of ortho- and para-helium in nature, and suggests that this one-sidedness is due to the dichotomy he’d derived from his model. He then proceeds to generalise the dichotomy to all systems in nature:

    For the helium spectrum it is an empirical fact that only one system exists… that the other systems are not realised in nature. In fact it seems to me to indicate — if we assume, that the results we have derived for two systems can be generalised to arbitrarily many systems — on the one hand, an actual connection between the highlighted quantum mechanical indeterminacy [between which type of system exists], and on the other hand, the Pauli rule and the Einstein-Bose counting.

    (Pardon my amateur translation.) No justification for the generalisation exists in the paper.

  2. Shortly after he admits:

    Grounds that this is the only system, of all the possible quantum mechanical solutions, that occurs, will scarcely be derived from the simple quantum mechanical calculation.

  3. Despite all that, he still feels justified in extending the conclusions of his model into the realm of metaphysics:

    [The symmetric/anti-symmetric restrictions on wavefunctions] mean that it makes no physical sense to speak of the movement or the matrix representing the movement of an individual electron or of the matrix of any non-symmetric function of electrons in a system of atoms… Therefore e.g. the exchange relations in their familiar form also generally contain no physical sense

Of course, there is no question about the subsequent empirical success of relating anti-symmetric wavefunctions to Fermi-Dirac statistics and symmetric wavefunctions to Bose-Einstein statistics, but it remains amazing to me that such heuristics and casual generalizations as the ones Heisenberg uses are so successful. I do not think this is a one-off occurrence, either. In fact it seems to me that most theoretical development in physics proceeds this way, particularly the “revolutionary” developments.
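For readers who want to see what the correspondence amounts to in terms of counting, here is a small sketch of my own, not something from Heisenberg’s paper. It assumes a toy system of a few particles distributed over a few single-particle levels, and the function name is hypothetical. Symmetric states are counted as unordered selections with repetition, which is Bose-Einstein counting. Anti-symmetric states are counted as unordered selections without repetition, which is what the Pauli rule allows.

```python
from itertools import combinations, combinations_with_replacement, product

def count_states(levels, particles):
    """Count many-particle states for a toy system (illustrative only)."""
    labels = range(levels)
    return {
        # Every assignment of a level to each labelled particle.
        "distinguishable": sum(1 for _ in product(labels, repeat=particles)),
        # Only occupation numbers matter; levels may be shared.
        "symmetric": sum(
            1 for _ in combinations_with_replacement(labels, particles)
        ),
        # Only occupation numbers matter; no level is shared.
        "antisymmetric": sum(1 for _ in combinations(labels, particles)),
    }

counts = count_states(levels=3, particles=2)
print(counts)  # {'distinguishable': 9, 'symmetric': 6, 'antisymmetric': 3}
```

For two particles, the symmetric and anti-symmetric counts add up to the distinguishable count, which mirrors the two groups of solutions in the two-system model. For more than two particles they no longer do, which is one way to see that the generalisation to arbitrarily many systems is a real extra step.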

Dirac, P. (1926). On the Theory of Quantum Mechanics. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character (1905-1934), 112 (762), 661-677. DOI: 10.1098/rspa.1926.0133
Fermi, E. (1926). Zur Quantelung des idealen einatomigen Gases. Zeitschrift für Physik, 36 (11-12), 902-912. DOI: 10.1007/BF01400221
Heisenberg, W. (1926). Mehrkörperproblem und Resonanz in der Quantenmechanik. Zeitschrift für Physik, 38 (6-7), 411-426. DOI: 10.1007/BF01397160

