Thursday, October 19, 2006

Fisher and Wright on Population Size: Part 2   posted by DavidB @ 10/19/2006 02:13:00 AM

Rather later than intended, here is Part 2 of this note.

Part 1 looked at R. A. Fisher's views on the effective population size of species before the publication of The Genetical Theory of Natural Selection (GTNS) in 1930. The main conclusion was that from an early date (1921) Fisher believed that there was usually enough migration between different parts of a species' range, on an evolutionary timescale, to neutralise the effect of genetic drift in causing gene frequencies in different parts of the species to diverge. Fisher therefore thought that for most purposes in population genetics a species could legitimately be treated as a single randomly interbreeding population. He did not hold the absurd view, sometimes attributed to him, that a species was literally random-breeding (panmictic) throughout its range, no matter how extensive that might be.

I said I would come back in the next instalment to consider Fisher's views in GTNS itself...

There is nothing in GTNS [1] to suggest that Fisher's views on effective population size had changed. The main novelty is that GTNS contains, in its chapter on sexual reproduction, a brief but important discussion on the nature of species and the process of speciation. But even these points had been foreshadowed in Fisher's correspondence.

First I will gather together those passages I can find in GTNS giving Fisher's views on the actual size of species. The first relevant passage, quite early in the book, is a remark that 'the number of individuals surviving to reproduce in each generation must in most species exceed a million, and in many is at least a million-fold greater...' [1, p10]

In the second (1958) edition of GTNS Fisher added an important comment shortly after this:

The circumstance that smaller numbers, even less than 100, are sometimes found to reproduce themselves locally, does not, as has been supposed, add to the frequency of random extinction [of genes], or to the importance of the so-called 'genetic drift'. For this, perfect isolation is required over a number of generations equally numerous with the population isolated. Even if perfect isolation could be postulated, which is always questionable, it is still improbable that the small isolated population would not ordinarily die out altogether before a period of evolutionary significance could elapse, or that it would not be later absorbed in other populations with a different genetic constitution.[1, p.273, Dover p.10]

This was a point he had made in correspondence to both Leonard Darwin and Sewall Wright in 1929 - and which was apparently accepted by Wright at that time - but which Fisher unfortunately did not make explicitly in the first edition of GTNS. If he had, the 'Fisher-Wright debate' might have been better understood. Note particularly that Fisher's objection is not just concerned with steady small-scale migration, but also with occasional larger changes. On a time scale of thousands of years, we cannot expect climate and topography to be constant: ponds dry up; rivers change their course; and changes in temperature and rainfall involve changes in vegetation and the associated animal species.

Another addition in 1958 is explicitly aimed at Sewall Wright, who had studied the genetics of the rare plant species Oenothera organensis. Fisher comments thus:

In the case of Oenothera organensis the existing wild population has been thought to be so small as one thousand individuals... Species having populations of less than 10,000 must, of course, be presumed to have fallen greatly in population during their recent evolutionary history... Sewall Wright has proposed that the number of alleles observed might be the sum of the number of alleles in different highly isolated groups into which the totality of this small population is divided. This should be the case if the larger population has in its recent history been subdivided. Isolation, however, of the degree required, if it now exists, would necessarily be a recent condition due doubtless to the recent large reduction in population numbers.[1, p.294-6, Dover p.109]

Returning to the first edition text, we have found a clear statement that a species population is usually greater than a million, and a somewhat less clear statement that it is often 'at least a million-fold greater'. A 'million-fold greater' does not mean 'a million greater' but literally a million times greater - that is, at least a million million! It might be supposed that Fisher was here writing carelessly, but several later passages show that he did believe some species had populations of a million million, while populations of over a thousand million were commonplace. For example, in chapter 4 he discusses the choice of a suitable scale for measuring gene ratios, and advocates a logarithmic scale:

The range of possible frequency ratios on the logarithmic scale thus depends on the number of individuals in the species, and it is easy to see that it is increased by 2log10, or 4.6, if the population in the species is increased tenfold. For example, a species of 10,000,000,000 individuals will give a range of values from about -23.7 to +23.7.[1, p72, Dover p.79]
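The figures in this passage can be checked with a few lines of arithmetic. The sketch below is only my reading of the quotation: it assumes natural logarithms, and it assumes that the extreme frequency ratio corresponds to a single gene copy among the 2n genes of a diploid species of n individuals. Both assumptions are chosen because they reproduce the quoted numbers; the function name is hypothetical.

```python
import math

def log_ratio_half_range(n):
    """Largest value of ln(p/q) for a diploid species of n individuals,
    ASSUMING the rarest state is 1 copy out of 2n, so p/q = (2n - 1)/1."""
    return math.log(2 * n - 1)

# A species of 10,000,000,000 individuals, as in the quotation.
half_range = log_ratio_half_range(10_000_000_000)

# Widening of the whole range (both ends) when the population grows tenfold.
widening = 2 * (log_ratio_half_range(10**11) - log_ratio_half_range(10**10))

print(round(half_range, 1))  # 23.7, i.e. a range from about -23.7 to +23.7
print(round(widening, 1))    # 4.6, i.e. 2 * ln(10)
```

On these assumptions the '2log10' of the quotation is twice the natural logarithm of 10, since a tenfold increase adds about 2.3 to each end of the scale.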

He goes on to recommend that the population size should be counted when individuals are reaching sexual maturity, and at the lowest point of any annual cycle: 'we shall count each generation near the maximum of its reproductive value, and when its numbers are least'. [1, p73, Dover p.80] Evidently, then, Fisher is not thinking of his populations of many millions as just gametes or fertilised eggs, most of which will die before maturity. Later in the chapter he refers to 'a species in which 1,000,000,000 come in every generation to maturity'[1, p78, Dover p.85], and to 'a mutant form [existing] in as many as 1,000 million individuals in each generation'[1, p79, Dover p.80]. He develops equations to deal with the effects of mutation and genetic drift, in which n is 'the number of individuals breeding in each generation', 'a large number of many millions or thousands of millions'[1, p84, Dover p.91]. To show the range of possible effects, he calculates the effect of taking values of n 'from a million to a billion'[1, p91, Dover p.98]. In American usage, and in recent British usage, a billion means a thousand million: 1,000,000,000 or 10^9. But in Fisher's day the usual British practice was to use a billion for a million million, or 10^12. To remove any doubt on this point, Fisher's table includes values from 10^6 to 10^12: from a million to a (traditional British) billion. Finally, at the end of the chapter he refers again explicitly to 'a population of a thousand million or a billion individuals'[1, p96, Dover p.103].

It may be wondered where Fisher acquired these views on the size of species, and whether they were generally accepted in his time, but I don't know the answer to this. It may be more interesting to consider whether they are (or were) correct.

A usual lower limit of one million for the total population of a species does not seem unreasonable. With the exception of species endemic to oceanic islands, deep lakes, and other special cases, we would expect most species, unless they are close to extinction, to have a geographical range of at least one million square kilometres. This may sound a lot, but it is less than 1 percent of the land area of the planet. For comparison, the smallest continent, Australia, has an area of about 8 million square kilometres. Most species, with the stated exceptions, range over a substantial fraction of a continent, or an equivalent part of the oceans. With a range of one million square kilometres a species population of 1 million would therefore require only an average density of 1 mature individual per sq. km., though not necessarily spread evenly and continuously across that range. For most species of animals and plants this seems plausible. The main exceptions would be large mobile predators like lions or eagles, which may require a hunting range of 10 or more sq. km. to find enough prey. In modern times many large mammals have populations of less than one million, but this is usually due to hunting and destruction of their habitat by man. In natural conditions, subject to the stated exceptions, Fisher's lower limit seems reasonable.

There might well be some incredulity about the upper end of the range. I have seen it suggested (but unfortunately cannot recall where) that Fisher's figure of 10^12 was not intended literally, but only as a theoretical illustration of his population-genetics model. I doubt this: the statement that 'the number of individuals surviving to reproduce in each generation must in most species exceed a million, and in many is at least a million-fold greater' appears quite literal. But is it in fact impossible? Certainly one would not expect any vertebrate species to have a population of a million million. Man has a population of over 6 thousand million, and several domesticated species (cow, sheep, pig, dog, cat) have populations of one or a few thousand million. I hazard a guess that the most numerous land vertebrate is the domestic chicken, with an estimated world population of 23 thousand million. The total world population of small rodents may well exceed 100 thousand million, but this is divided among many species. Some widespread bird species may also run to thousands of millions. In the sea, when not over-fished by man, some species of fish have very large populations. A type of sardine migrates in vast shoals along the coast of South Africa. A single shoal has been measured at 40 km. long by 15 km. wide and 40 metres deep, giving it a volume of 24,000 million cubic metres. It would therefore only take a density of about 1 sardine per cubic metre for a single shoal to outnumber the world population of the chicken.

Still, none of these figures come close to a million million. However, most animal species are not vertebrates, but small invertebrates. (Plants raise different issues: individuals are generally quite large but, as autotrophs, capable of high population densities.) A population of a million million would require an average density of one mature individual per square metre over 1 million sq. km., or a lower density over an appropriate larger area. For many insects and other small invertebrates this seems quite achievable. Very small insects like thrips and aphids can have immense populations. A single infestation of alfalfa aphids was once estimated at 170,000 million in an area of a few hundred square kilometres. [2, p. 266] For more typical examples, in the 1920s the agricultural research centre at Rothamsted in England carried out regular sample censuses of the invertebrates in the soil, and estimated that there were around 800,000 earthworms, nearly three million hymenoptera, one and a half million flies, and two and a third million springtails (small primitive insects) per acre of arable land (about 4000 sq. m.), giving a density of nearly 2,000 of these creatures per square metre.[3, p.107] (Of course, these would not all be breeding adults, and they would be divided among many species.) Fisher was the chief statistician at Rothamsted, so he would have been aware of these vast numbers of small invertebrates. And all this is without counting nematodes and other microscopic creatures. Provided we set aside our preoccupation with large vertebrates, a species population of a million million is by no means impossible.
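For anyone who wants to check the density arithmetic, here is a minimal sketch. The per-acre counts are the rounded figures quoted above ('nearly three million' is taken as three million, and 'two and a third million' as 2,333,333), the acre is taken as 4,047 square metres, and the function and variable names are purely illustrative.

```python
M2_PER_KM2 = 1_000_000   # square metres in a square kilometre
M2_PER_ACRE = 4047       # approximate size of an acre in square metres

def required_density_per_m2(population, range_km2):
    """Average number of individuals per square metre needed to fit
    `population` into a range of `range_km2` square kilometres."""
    return population / (range_km2 * M2_PER_KM2)

# A million million individuals spread over one million sq. km.
density_for_1e12 = required_density_per_m2(10**12, 10**6)

# Rothamsted soil census, per acre, using the rounded figures in the text.
rothamsted_per_acre = {
    "earthworms": 800_000,
    "hymenoptera": 3_000_000,
    "flies": 1_500_000,
    "springtails": 2_333_333,
}
rothamsted_per_m2 = sum(rothamsted_per_acre.values()) / M2_PER_ACRE

print(density_for_1e12)          # 1.0 individual per square metre
print(round(rothamsted_per_m2))  # a little under 2,000 per square metre
```

The same function gives one individual per square kilometre for a population of one million over the same range, which is the lower limit discussed earlier.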

Of course, as Fisher recognised, for some purposes what counts is the population of breeding adults at its lowest point. (Technically, the effective population size is somewhat larger than this, but we need not go into this.) It may therefore be pointed out (as Sewall Wright did) that many populations are subject to large fluctuations. This is true, but its significance should not be overstated. Given the assumption of free migration within a species, what matters is not the local population size but the global population, and this is likely to be less variable, as local fluctuations often cancel out. Many species also have adaptations for avoiding or resisting severe conditions, such as migration, or producing resistant eggs, larvae, or pupae. The adults of many insect species die out in the winter, but the population is preserved by eggs or larvae in sheltered places.

Overall, Fisher seems justified in assuming that total populations of most species run to millions, if not many millions. This is important in Fisher's view of evolution for two main reasons. First, the relative impact of selection and genetic drift on an allele depends on the numbers of the allele in the population, which are constrained by the population size. The change in numbers due to selection on an allele is proportional to the number of copies of the allele in the population at the time. In contrast the change in numbers due to genetic drift, in each generation, is closer to the square root of the number of copies of an allele, so it increases more slowly than the number of copies. If the number of copies is very small, as in the case of a single new mutation, genetic drift is almost always more important, until by chance the number has drifted to a high enough level for selection to become the predominant influence. If the population itself is very small (of the order of a few thousand) this point may never be reached. And if the selective advantage of the allele is very small (say, less than 1 in a million), very large populations are needed if the selective advantage is not to be swamped by drift. This is relevant, as Fisher saw, to his theory of the evolution of dominance, which requires very weak selection operating over long periods.
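The scaling argument in this paragraph can be put into numbers. The sketch below is a rough single-generation heuristic based on the wording above, not Fisher's own equations: it takes the expected change from selection as s times the number of copies, and the typical change from drift as the square root of the number of copies. The function names and the example values of s are my own assumptions.

```python
import math

def selection_change(copies, s):
    """Expected change in copy number in one generation from selection
    (ASSUMED proportional to the number of copies, as in the text)."""
    return s * copies

def drift_scale(copies):
    """Typical size of the chance change in one generation
    (ASSUMED to be the square root of the number of copies)."""
    return math.sqrt(copies)

def crossover_copies(s):
    """Copy number at which the two effects are equal in one generation:
    s * k = sqrt(k), so k = 1 / s**2."""
    return 1 / s**2

for s in (0.01, 0.001, 1e-6):
    k = crossover_copies(s)
    print(f"s = {s:g}: selection matches drift at about {k:.3g} copies")

# With 100 copies and a 1 percent advantage, drift still dominates:
print(selection_change(100, 0.01), drift_scale(100))
```

On this crude comparison a selective advantage of one in a million only matches drift, generation by generation, at around a million million copies. Selection acts in the same direction every generation while drift does not, so over many generations the threshold is lower than this; the usual textbook criterion is framed in terms of the product of population size and selective advantage. The sketch is meant only to show why the square-root scaling favours selection as numbers grow.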

The second major implication is for the effect of mutation rates. A mutation which only ever occurs once in a species is very likely to be lost by chance before it can establish itself, even if it has a substantial selective advantage. If on the other hand it recurs on many occasions, it has a high probability of eventually by chance reaching the level at which selection becomes the predominant influence. The frequency with which the mutation recurs depends on the mutation rate and the population size. If the mutation rate is reasonably high - say 1 in 100,000 per generation - it will recur frequently (over an evolutionary timescale of thousands of generations) even in small populations. But if the mutation rate is very low, it may not recur at all unless the population is large. Fisher believed that extremely rare mutations might play a significant part in long term evolution, and for this reason believed that adaptive evolution would progress faster in large populations.
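A small calculation makes the dependence on population size concrete. This is only a sketch under simple assumptions of my own: the mutation rate is treated as a rate per individual per generation (counting both gene copies of a diploid would double the figures), occurrences are treated as independent rare events, and the 'very low' rate of 1 in a thousand million is a hypothetical value chosen for illustration.

```python
import math

def expected_occurrences(mu, n, generations):
    """Expected number of times the mutation arises, ASSUMING a rate `mu`
    per individual per generation in a population of constant size `n`."""
    return mu * n * generations

def prob_at_least_one(mu, n, generations):
    """Chance the mutation arises at least once, treating occurrences
    as independent rare events (Poisson approximation)."""
    return -math.expm1(-expected_occurrences(mu, n, generations))

GENERATIONS = 1_000

# Reasonably high rate (1 in 100,000), small population of 10,000.
print(expected_occurrences(1e-5, 10_000, GENERATIONS))   # about 100

# Hypothetical very low rate (1 in a thousand million), same population.
print(prob_at_least_one(1e-9, 10_000, GENERATIONS))      # about 1 percent

# Same very low rate in a population of a thousand million.
print(expected_occurrences(1e-9, 10**9, GENERATIONS))    # about 1,000
```

On these assumptions the rare mutation is unlikely ever to appear in the small population over a thousand generations, but recurs about once per generation in the large one.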

Fisher's conclusions however all presuppose that there is sufficient migration and interbreeding among different parts of a species, over an evolutionary time scale, to offset the effects of genetic drift in causing local gene frequencies to diverge. It is therefore disappointing that GTNS contains no explicit, quantified discussion of the effects of migration, other than the brief passage added at page 10 of the Dover edition. GTNS does however contain a very interesting discussion of the nature of species, in the chapter on sexual reproduction:

The intimate manner in which the whole body of individuals of a single species are bound together by sexual reproduction has been lost sight of by some writers. Apart from the intervention of geographical barriers so recently that the races separated are not yet regarded as specifically distinct, the ancestry of each single individual, if carried back only a hundred generations, must embrace practically all of the earlier period who have contributed appreciably to the ancestry of the present population. If we carry the survey back for 200, 1,000, or 10,000 generations, which are relatively short periods in the history of most species, it is evident that the community of ancestry must be even more complete... In sexual organisms... each individual is not the final member of a single series, but of converging lines of descent which ramify comparatively rapidly throughout the entire specific group. The variations which exist within a species are like the differences in colour between different threads which have crossed and recrossed each other a thousand times in the weaving of a single uniform fabric.

The effective identity of the remote ancestry of all existing members of a single sexual species may be seen in another way, which in particular cases should be capable of some quantitative refinement. Of the heritable variance in any character in each generation a portion is due to the hereditary differences in their parents [i.e. the differences between the parents of different individuals], while the remainder, including nearly all differences between whole brothers and sisters, is due to genetic segregation. Those portions are not very unequal; the correlations observed in human statistics show that segregation must account for a little more than two-fifths, and the hereditary differences of the parents for nearly three-fifths of the whole. These hereditary differences are in their turn, if we go back a second generation, due partly to segregation and partly to hereditary differences in the grandparents. As we look farther and farther back, the proportion of the existing variance ascribable to differences of ancestry becomes rapidly smaller and smaller; taking the fraction due to segregation as only 2/5 in each generation, the fraction due to differences of ancestry 10 generations back is only about one part in 100 while at 30 generations it is less than one in four millions.[1, p124-5, Dover p.138-9]
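The arithmetic behind the last sentence of the quotation is a simple geometric decline, and can be sketched as follows. This is my own reading: if segregation accounts for 2/5 of the variance in each generation, the share traceable to differences of ancestry is multiplied by 3/5 for every generation we go back. The function name is hypothetical.

```python
def ancestry_fraction(generations, segregation_share=2 / 5):
    """Share of present heritable variance ascribable to differences of
    ancestry `generations` back, ASSUMING a constant share
    (1 - segregation_share) is carried over from each generation."""
    return (1 - segregation_share) ** generations

for g in (1, 10, 30):
    share = ancestry_fraction(g)
    print(f"{g:>2} generations back: {share:.3g} (about 1 part in {1 / share:,.0f})")
```

At 30 generations this gives a little under one part in four and a half million, matching the quoted 'less than one in four millions'. At 10 generations it gives about 0.006, which is the same order of magnitude as the figure quoted above.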

It is noteworthy here that Fisher is implicitly using the extent of interbreeding as the criterion for species identity, as in the 'biological species definition' of Ernst Mayr, though Mayr had not yet promulgated this when Fisher was writing (1930). Fisher's application of the 'analysis of variance' to the subject is also interesting, and deserves to be evaluated by a competent statistician (i.e. not me). There seems to be an assumption that the 'ramification' of ancestry is unlimited and uniform however far back we go, until it embraces the entire species. This will often be the case, but if, say, an individual is from a small, relatively isolated village, we might suppose that his ancestry would ramify more slowly if we go back more than a few generations. On a larger scale, we might wonder how Fisher would account for the differences between geographical races. In fact, Fisher goes on to say:

It is only the geographical and other barriers to sexual intercourse between different races, factors admittedly similar to those which condition the development of incipient species as geographical races, which prevent the whole of mankind from having had, apart from the last thousand years, a practically identical ancestry. The ancestry of members of the same nation can differ little beyond the last 500 years; at 2,000 years the only differences that would seem to remain would be those between different ethnographic races; these, or at least some of the elements of these, may indeed be extremely ancient; but this would only be the case if for long ages the diffusion of blood between the separated groups was almost non-existent.

In the next section of the chapter, Fisher gives a brief but important discussion of speciation, of which it has been said: 'Had Fisher developed any of his ideas mathematically the history of speciation might have been quite different' [4, p.7]. But there is still no quantitative treatment of the effects of migration to justify Fisher's confidence that a species can usually be treated as a single interbreeding population. For such a treatment we must turn to the writings of Sewall Wright. Part 3 of this note, if and when I get round to it, will therefore look at Wright's work on the subject.


[1] I will give page references to The Genetical Theory of Natural Selection: a Complete Variorum Edition, ed. J. H. Bennett, 1999, and, for convenience, to the more widely accessible Dover edition. (Where only one page reference is given, it is the same in both editions.)

[2] H. Andrewartha and L. Birch: The Ecological Web, 1984.

[3] Charles Elton: Animal Ecology, 1927, quoted from the Methuen Paperback edition.

[4] S. Berlocher, in Endless Forms: Species and Speciation (ed. D. Howard and S. Berlocher), 1998.