I’ve written before on the spuriousness of the claim that one needs quantum mechanics to understand the ‘correction’ of a factor of N! to the usual expression for the entropy of an ideal gas. Van Kampen, Jaynes and Swendsen have each made this argument independently, though with different approaches [1]. The upshot is that if you take a probabilistic definition of entropy, it doesn’t matter whether you calculate the entropy of a classical ideal gas by assuming that its particles are distinguishable or by assuming that its particles are indistinguishable: you get the same, correct Sackur-Tetrode entropy either way.
But if the distinguishability of classical particles has nothing to do with the resolution of the Gibbs paradox, then what lies behind the difference between classical statistics and quantum statistics? Why does ‘classical indistinguishability’ have no effect on classical statistics, while quantum indistinguishability leads to completely different statistics? According to Simon Saunders [2], the answer lies in the discretization of the phase space of quantum systems.
When we construct a probability measure over the phase space of a classical system, the most natural choice is the volume of phase space. A probability measure over the phase space of a quantum system, though, would be a discrete measure that counts the number of distinct allowed states. Consider, then, a very simple quantum system consisting of two indistinguishable particles, each with three orthogonal single-particle states available. This system has six possible orthogonal two-particle states, because (2, 1), for example, is the same state as (1, 2) if the particles are indistinguishable.
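The count of six can be checked by brute-force enumeration. The following is a minimal sketch, not anything taken from Saunders' paper: the variable names are my own, and the single-particle states are simply labelled 1, 2 and 3.

```python
from itertools import combinations_with_replacement, product

# Labels for the three orthogonal single-particle states (illustrative).
single_particle_states = (1, 2, 3)

# Ordered pairs: (1, 2) and (2, 1) count as different two-particle states.
ordered = list(product(single_particle_states, repeat=2))

# Unordered pairs: (1, 2) and (2, 1) are one and the same state.
unordered = list(combinations_with_replacement(single_particle_states, 2))

print(len(ordered))    # 9
print(unordered)       # [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)]
```

Treating swapped pairs as identical removes three of the nine ordered pairs, leaving the six states mentioned above.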
Now consider the classical analogue of this quantum system. Since a classical system has a continuous phase space, the classical analogue is a system whose 2-D phase space is coarse-grained along each ‘particle’ dimension into three sections (all figures from Saunders’ paper):

When we consider indistinguishable classical particles, however, we have to halve the above phase space diagram, since if two states with particles swapped are to count as one and the same state, then the diagram is symmetric about its diagonal:

This halved diagram then correctly represents the phase space of a classical system of two indistinguishable particles. Note, though, that even though the two particles are indistinguishable, if we take the usual volume-of-phase-space probability measure, the ‘diagonal’ states (1, 1), (2, 2) and (3, 3) are half as likely as the non-diagonal states, because only half of each diagonal cell survives the halving. And this is exactly what we would have gotten had we considered the particles distinguishable instead and counted (1, 2) and (2, 1) as two distinct states, since there would be twice as many non-diagonal combinations as diagonal combinations. So taking the probability measure on the reduced phase space of the second diagram gives us the same results as considering permutations of distinct states on the full phase space of the first diagram. Since entropy is, in the modern formulation, a measure of the probability of a system’s macrostate, the entropy of a classical system is not affected by considerations of distinguishability.
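To make the equivalence concrete, here is a small sketch under assumptions of my own: the full phase space is taken to be a unit square with a uniform measure, cut into a 3 × 3 grid of equal cells, and the reduced phase space is the triangle on one side of the diagonal. None of the names below come from Saunders' paper.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, product

K = 3                          # coarse-grained sections per particle axis
cell = Fraction(1, K * K)      # volume of one cell of the unit square (assumed)

# Full phase space, distinguishable particles: add up the volumes of all
# cells that share the same unordered occupation, e.g. (1, 2) and (2, 1).
full = {}
for i, j in product(range(1, K + 1), repeat=2):
    key = tuple(sorted((i, j)))
    full[key] = full.get(key, 0) + cell

# Reduced phase space, indistinguishable particles: keep one triangle.
# Off-diagonal cells survive whole; diagonal cells are cut in half.
reduced_volume = {
    (i, j): cell / 2 if i == j else cell
    for i, j in combinations_with_replacement(range(1, K + 1), 2)
}
total = sum(reduced_volume.values())   # the triangle has half the total volume
reduced = {key: vol / total for key, vol in reduced_volume.items()}

print(full[(1, 1)], full[(1, 2)])        # 1/9 2/9
print(reduced[(1, 1)], reduced[(1, 2)])  # 1/9 2/9
print(full == reduced)                   # True
```

Both routes give the diagonal occupations a probability of 1/9 and the off-diagonal ones 2/9, so the normalised volume measure does not care which description is used.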
Quantum systems, on the other hand, are impervious to the classical effects of ‘halving of the phase space’, since as discrete systems they are not reliant on the volume of phase space for their probability measure. In the above example, if the system is quantum with distinguishable particles, then we have nine possible states each with equal probability. If the system has indistinguishable particles, we have only six possible states, each one equally weighted. So moving from distinguishable particles to indistinguishable particles does result in a change in statistics (and entropy) in the quantum case.
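The same toy system can be counted in the quantum way. This is only an illustrative sketch with names of my own choosing; in particular, using the logarithm of the number of equally weighted states as a stand-in for the entropy (in units of Boltzmann's constant) is my simplification, not a formula quoted from the paper.

```python
import math
from fractions import Fraction
from itertools import combinations_with_replacement, product

states = (1, 2, 3)

# Distinguishable quantum particles: nine equally weighted states.
distinguishable = list(product(states, repeat=2))
# Indistinguishable quantum particles: six equally weighted states.
indistinguishable = list(combinations_with_replacement(states, 2))

# Probability of each unordered occupation under the two countings.
p_dist = {}
for a, b in distinguishable:
    key = tuple(sorted((a, b)))
    p_dist[key] = p_dist.get(key, 0) + Fraction(1, len(distinguishable))
p_indist = {s: Fraction(1, len(indistinguishable)) for s in indistinguishable}

print(p_dist[(1, 1)], p_dist[(1, 2)])      # 1/9 2/9
print(p_indist[(1, 1)], p_indist[(1, 2)])  # 1/6 1/6

# Log of the state count as a rough entropy proxy (an assumption of this sketch).
entropy_dist = math.log(len(distinguishable))
entropy_indist = math.log(len(indistinguishable))
print(round(entropy_dist, 3), round(entropy_indist, 3))  # 2.197 1.792
```

Unlike the classical case, the two countings here assign different weights to the occupations and give different state counts, which is the change in statistics described above.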
Naturally, Saunders has a more mathematically detailed exposition than this, but I thought the simple phase space diagrams provided the most immediate and natural feel for the crux of the issue.
The phase space diagrams also allow us to see why Pauli spoke of Bose-Einstein statistics causing particles to “condense into groups” of the same kind. In the quantum indistinguishable system, the diagonal states, containing pairs of particles in the same states, are weighted just as much as the non-diagonal, heterogeneous states. In the classical system, the diagonal states are weighted less. So groups of particles in the same states are favoured in quantum statistics relative to classical statistics.
There are some other interesting points made in the paper about how distinguishability relates to the persistence of properties over time, which I might comment on later.
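A rough way to put a number on this ‘condensing’ is to ask how likely it is that both particles occupy the same single-particle state. The sketch below is my own illustration, with a hypothetical helper `same_state_probability` and the number of single-particle states `k` left as a free parameter; the classical figure uses the volume measure, which for equal cells amounts to counting ordered pairs.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, product

def same_state_probability(k):
    """Return (classical, quantum) probabilities that two particles share a
    state, given k equally sized single-particle cells or states (assumed)."""
    states = range(k)
    ordered = list(product(states, repeat=2))
    unordered = list(combinations_with_replacement(states, 2))
    # Classical volume measure: equal weight per ordered cell.
    classical = Fraction(sum(a == b for a, b in ordered), len(ordered))
    # Quantum indistinguishable counting: equal weight per unordered state.
    quantum = Fraction(sum(a == b for a, b in unordered), len(unordered))
    return classical, quantum

for k in (3, 10, 100):
    classical, quantum = same_state_probability(k)
    print(k, classical, quantum, quantum / classical)
# first line printed: 3 1/3 1/2 3/2
```

For the three-state example the chance of finding both particles in the same state rises from 1/3 to 1/2, and in this two-particle counting the enhancement factor works out to 2k/(k + 1), which approaches 2 as k grows.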
[1] E. T. Jaynes, “The Gibbs Paradox”, in Maximum-Entropy and Bayesian Methods, C. R. Smith, G. Erickson, and P. Neudorfer, eds. (Kluwer, Dordrecht, 1992), pp. 1-22; R. H. Swendsen, “Statistical mechanics of classical systems with distinguishable particles”, J. Stat. Phys. 107:1143 (2002); and N. G. van Kampen, “The Gibbs Paradox”, in Essays in Theoretical Physics, W. E. Parry, ed. (Pergamon, Oxford, 1984), pp. 303-312.
[2] S. Saunders, “On the explanation for quantum statistics”, Studies in History and Philosophy of Modern Physics 37(1):192-211 (2006).