Friday, September 29, 2006

Fisher and Wright on Population Size   posted by DavidB @ 9/29/2006 02:01:00 AM

As part of background reading for 10 Questions for A. W. F. Edwards I re-read R. A. Fisher's Genetical Theory of Natural Selection, and noted some points of interest. Over the next few weeks I hope to explore these.

The first concerns Fisher's views on the effective population size of species. It is well known that Sewall Wright attached great importance to the internal population structure of species, while Fisher in contrast thought that for most purposes a species could be treated as a single randomly interbreeding population.

This contrast is sometimes posed in stark terms. For example, the historian William B. Provine, in a comment on one of Sewall Wright's papers, says: 'Fisher wrote to Wright to say that he believed effective population size to be a crucial variable, but that he considered most species to be panmictic (random breeding) throughout their ranges, even if their range was the entire world. Wright disagreed strongly'. [4, p. 71]

If this were an accurate representation of Fisher's views, then of course Wright would be right. Most species are obviously not panmictic throughout their ranges. But this in itself should raise suspicions. Scientists seldom espouse views that are obviously absurd. So what did Fisher actually say about population size?

To explore this will take several posts. Here is a first instalment, covering the period before the publication of GTNS in 1930.

Most of Fisher's papers and letters are available online, but there are a few exceptions. It turns out that some of Fisher's most explicit comments on population size came in a book review in 1921 [1] that is not available online. The Dutch biologists A. L. and A. C. Hagedoorn were early proponents of what is now called genetic drift. In the 1920s and 30s it was often described (by Fisher and Wright among others) as the Hagedoorn Effect. In his review of their book Fisher commented:

The present reviewer has examined this particular point, and finds that in the absence of mutation or crossing [i.e. interbreeding with individuals from outside the group], a perceptible reduction of variability will take place in a number of generations equal to four times the number of the species breeding. In the case of great seasonal fluctuations in number, it is only fair to take the number of interbreeding individuals at its lowest point, but even so the number of interbreeding individuals in any group will seldom be less than five figures...[i.e. less than 10,000].

The number of robins in a county [i.e. an English county, with an area of a few hundred to a few thousand square miles] even in winter must be many thousands, and in a thousand generations interbreeding will have taken place to an extent which will have spread the blood of any local group over the whole country [sic: it is not clear if Fisher meant 'country' or 'county'. Both would make sense in the context].

If a relatively rare animal were taken, such as the badger in this country, and attention were confined to a small isolated district such as the Isle of Man [about 300 square miles, but, alas, without badgers], it might be possible to find a population of not more than 1,000 which had been isolated from other members of the species for several thousand generations; in such a case a reduction of variability might actually be proved to have occurred.

This book review comes nearly 10 years before the publication of GTNS. It also precedes Fisher's own first published calculations on the rate at which random fluctuations would affect genetic diversity. It is however clear that Fisher had already gone far enough into the problem to say that a significant loss of diversity would take more generations than there are individuals in the relevant population. In saying that 'it is only fair to take the number of interbreeding individuals at its lowest point', Fisher also shows an understanding that the effective size of a population may well be less than the number of individuals currently in it. But Fisher makes no claim that breeding, even within a local area, is literally random (panmictic). The point he makes is that in the absence of geographical barriers (such as in his hypothetical Isle of Man example), the 'blood' of any local group would spread over a much wider area in the course of a thousand generations.

In the following year (1922), Fisher published his first major paper [2] on population genetics. In this paper he considered the rate at which the genetic variance in a population would decline as a result of random processes in the absence of selection, mutation, and migration. Under these conditions each individual gene will have a certain probability of reproducing, or of dying without descendants, in each generation. As each line of descent, once lost, cannot be revived, any original variance of gene types (alleles) will progressively be reduced. In the absence of mutation and migration every gene at a given locus in the population will ultimately be descended from the same ancestral gene, so that all variance will be extinguished. In the paper Fisher shows that the rate of decline of variance depends on the total population size, being slower with larger populations. He calculates that variance will be reduced substantially in 4n generations, where n is the diploid population size. (He later corrected this to 2n, after finding an error in the calculation. Wright, using a different method, had already reached the figure 2n and pointed out the discrepancy to Fisher.) Fisher remarks that:

This is a very slow rate of diminution; a population of n individuals would require 4n generations breeding at random to reduce its variance in the ratio 1 to e, or 2.8n generations to halve it. As few specific groups contain less than 10,000 individuals among whom interbreeding takes place, the period required for the action of the Hagedoorn effect, in the entire absence of mutation, is immense.
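The two figures in this passage are consistent with each other if the decline is treated as exponential. As a sketch (the symbols V_0 and V_t, for the variance at the start and after t generations, are introduced here for illustration and are not Fisher's notation):

```latex
V_t = V_0 \, e^{-t/4n}
\quad\Longrightarrow\quad
\frac{V_t}{V_0} = \frac{1}{2}
\ \text{ when } \
t = 4n \ln 2 \approx 2.77\,n
```

This agrees with Fisher's rounded figure of 2.8n. With the corrected time constant of 2n generations the same reasoning gives a halving time of 2n ln 2, or about 1.39n, so that a population of 10,000 would still need roughly 14,000 generations to lose half its variance.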

I think it is unfortunate that Fisher refers here to 'random breeding', as this may give the impression that the breeding structure of the population affects the outcome. Given Fisher's other assumptions, this is not the case. With the assumptions of zero migration, mutation, and selection, each individual gene can be regarded as one of 2n asexually reproducing entities, with equal reproductive prospects, and this is in fact how Fisher treats it in his calculations. This population of genes can be divided up in any way whatsoever, without affecting the rate of loss of variance in the population of genes as a whole.

Imagine that we assign each gene in the original population to a notional 'group' in an entirely arbitrary way, and then trace the descendants of each gene and group through succeeding generations. There will be two simultaneous processes going on: groups will grow, shrink, or become extinct; and within each group the number of descendants of each original gene will increase or decline. Ultimately all the members of the group will be descended from a single gene. Within any group, variance will decline more quickly than in the total population, but in the total population this will be partly offset by increasing variance between groups. Even after each group has become 'fixed' for a particular allele, variance in the overall population will continue to decline due to random-walk fluctuation in the size of the groups, including extinctions.

But the original division of the population into groups was imaginary, and the mere act of assigning a gene to an imaginary group cannot influence the course of evolution. The notional division of the population into groups therefore does not affect the outcome. But neither does the existence of real subdivisions such as geographical or breeding groups, provided this does not affect the number of descendants of each individual gene, and by Fisher's assumptions it does not. Random breeding is therefore in this case a red herring.
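The argument can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not Fisher's own method: a fixed number of gene copies, two alleles, discrete generations, and each copy in the next generation choosing its parent uniformly at random. The function name, the group labels, and the parameter values are all hypothetical. Each copy carries a notional group label that is inherited but never consulted.

```python
import random


def simulate_heterozygosity(num_copies, generations, replicates, seed=0):
    """Return mean heterozygosity 2p(1-p) at each generation, averaged over replicates."""
    rng = random.Random(seed)
    totals = [0.0] * (generations + 1)
    for _ in range(replicates):
        # Each gene copy is (allele, notional_group). Half the copies carry
        # allele 1. The group labels are arbitrary and are only carried along.
        copies = [(i % 2, i // 10) for i in range(num_copies)]
        for t in range(generations + 1):
            p = sum(allele for allele, _group in copies) / num_copies
            totals[t] += 2 * p * (1 - p)
            # Next generation: every copy descends from a parent copy chosen
            # uniformly at random; the group label plays no part in the choice.
            copies = [rng.choice(copies) for _ in range(num_copies)]
    return [total / replicates for total in totals]


NUM_COPIES = 40   # 2n gene copies, i.e. n = 20 diploid individuals (assumed)
GENERATIONS = 20

observed = simulate_heterozygosity(NUM_COPIES, GENERATIONS, replicates=2000)
# Standard expectation for this sampling scheme: a loss of 1/(2n) per generation.
expected = [0.5 * (1 - 1 / NUM_COPIES) ** t for t in range(GENERATIONS + 1)]

for t in (0, 5, 10, 20):
    print(t, round(observed[t], 3), round(expected[t], 3))
```

Because the resampling step never inspects the group label, relabelling the copies in any other way would leave the simulated trajectories unchanged. The per-generation loss of 1/(2n) corresponds to the corrected time constant of about 2n generations.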

So far as I am aware, Fisher's next relevant comment on population size came in a letter to Leonard Darwin early in 1929. This was a response by Fisher to a letter from Darwin enclosing an old letter from Francis Galton, discussing the number of ancestors of village-dwellers. Galton noted that in the case of an isolated village the number of ancestors, in any generation far enough back, would hardly exceed the present number of villagers. Fisher expressed interest in Galton's letter, but commented:

It is perfectly true that village communities may be much isolated, but I wonder if Galton ever considered (or people like Fleure, who find 'Neolithic' villages all over the place) how complete the isolation must be to be worth anything genetically.

If only one in 10 filter in from outside in each generation, in seven generations half the population comes from outside and in 70 generations all but 1 in 100. Isolation would be very extreme at this level, in the ordinary course of events, and catastrophic events, war raids, famine, plague, are not so rare as to be ignored in the case of such habitual isolation.

King Solomon lived 100 generations ago, and his line may be extinct; if not, I wager he is in the ancestry of all of us, and in nearly equal proportions, however unequally his wisdom may be distributed.[3, p.95]
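Fisher's arithmetic on immigration can be checked with a short calculation. The sketch below assumes a constant fraction m of each generation arrives from outside, and that the descendants of immigrants count as 'outside' ancestry from then on. The function name and parameters are illustrative and do not come from the letter.

```python
def outside_fraction(m, generations):
    """Fraction of a population's ancestry that comes from outside after
    the given number of generations, with immigrant fraction m per generation."""
    # The original stock is multiplied by (1 - m) in every generation.
    return 1.0 - (1.0 - m) ** generations


for g in (7, 44, 70):
    print(g, round(outside_fraction(0.1, g), 5))
```

On these assumptions Fisher's 'half in seven generations' is almost exact (about 52 per cent), while 'all but 1 in 100' after 70 generations is conservative: the 1 in 100 level is reached after about 44 generations, and after 70 the original stock is down to roughly 1 in 1,600.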

This appears to be the first time that Fisher explicitly commented on the effects of migration. The 'King Solomon' example is also interesting. First, there is the claim that if Solomon has any living descendants, he is probably in the ancestry of all of us. Fisher does not explain how he reached this view, but evidently, if all lines of descent do not die out by chance at an early stage, and allowing for an average of around two descendants per generation, the potential number of Solomon's descendants will increase explosively until it is far greater than the actual population of the human species. It does not follow that he is yet in the ancestry of all of us, because even in the absence of geographical barriers, there might not have been time for his 'blood' to have spread throughout the world. If we suppose, for example, that none of his descendants moved more than 10 miles from their place of conception, then none of his descendants after 100 generations could be more than 1000 miles from Israel. However, recent calculations of the date of the 'Most Recent Common Ancestor' tend to support Fisher's 'wager', with the possible exception of a few isolated tribes like the Andaman Islanders.

There might be more dispute about the stronger claim that Solomon would be in the ancestry of all of us 'in nearly equal proportions'. It is not clear how Fisher would have justified it in 1929. But one must remember that he had trained in physics as well as mathematics, and he may have intuitively seen an analogy with physical diffusion processes. If a substance is diffusing through a gas or fluid, in the absence of impermeable barriers, or pumps to reverse the flow, the substance will always flow from areas of greater to lesser concentration (subject to minor stochastic fluctuations), until it is evenly distributed throughout. The 'diffusion' of ancestry would obey the same principle, so it would be reasonable to wager that at some stage Solomon's share in everyone's ancestry would be approximately equal. But the speed with which this outcome was reached would be affected by geographical and behavioural barriers, and it is not obvious whether 100 generations would be long enough to reach an equilibrium state.

(To preempt an obvious comment, having Solomon in one's ancestry does not imply having any of his genes. These would be diluted by half at each stage of reproduction, so that after 100 generations most lines of descent would contain none of his genes at all. If he has any living descendants, his average expected share in their genes would only be 1/n, where n is the number of individuals in Solomon's time who have left living descendants.)
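The orders of magnitude involved can be made concrete with some simple arithmetic. This is a rough sketch using the illustrative figures already mentioned (100 generations, two descendants per generation, 10 miles of movement per generation); the variable names are hypothetical, and the halving rule treats inheritance as smooth dilution, which is only an approximation.

```python
GENERATIONS = 100
MILES_PER_GENERATION = 10  # the illustrative limit on movement used above

# Doubling every generation: the potential (not actual) number of descendants.
potential_descendants = 2 ** GENERATIONS

# Halving every generation: expected genetic share passed down a single line.
share_per_line = 0.5 ** GENERATIONS

# Furthest possible spread if every generation moves the maximum distance.
max_spread_miles = GENERATIONS * MILES_PER_GENERATION

print(f"potential descendants: {potential_descendants:.3e}")
print(f"expected share per line of descent: {share_per_line:.3e}")
print(f"maximum spread: {max_spread_miles} miles")
```

The potential number of descendants, of the order of 10^30, is vastly greater than any human population, so most lines of descent must either have died out or merged with one another. The expected share along any single line is correspondingly negligible.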

Later in the same year (1929) Fisher had a further occasion to discuss the question of migration, as part of a correspondence with Sewall Wright. In 1928 Fisher had begun to expound his theory of the evolution of dominance. Noting that harmful mutations tended to be recessive, Fisher sought to explain this by the selection of modifier genes at other loci tending to suppress the effect of harmful mutations. He recognised that for various reasons this would be a weak form of selection, but considered that if it operated over very long periods of time, in response to recurring mutations, it could have the effect he claimed. Sewall Wright had a number of objections to this. For the present purpose Wright's most important point was as follows:

Selection controls the situation if s [a measure of selective intensity] is larger than 1/2n [where n is effective population size], but is of little importance below this figure. In small inbred populations (1/2n large) even vigorous selection is ineffective in keeping injurious factors from drifting into fixation... In the case of Dr Fisher's modifiers of dominance with selection coefficients at best of the order of mutation rate, the latter must be greater than 1/2n if the gene is not to drift back and forth in the course of geologic time from one state of approximate fixation to the other and practically as freely in the face of the selection pressure as with it.

Unfortunately it is difficult to estimate n in animal and plant populations. In the calculations it [n] refers to a population breeding at random, a condition not realised in natural populations as wholes. In most cases random interbreeding is more or less restricted to small localities. These and other conditions such as violent seasonal oscillation in numbers may well reduce n to moderate size, which for the present purpose [i.e. the effectiveness of very weak selection] may be taken as anything less than a million. If mutation rate is of the order of one in a million per locus, an interbreeding group of less than a million can show little effect of selection of the type which Dr Fisher postulates even though there be no more important selection process [to override the effect of weak selection of modifiers] and time be unlimited.[4, p.75]
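Wright's criterion can be turned into a back-of-envelope check. The sketch below simply encodes the rule of thumb as quoted (selection controls the outcome when s exceeds 1/2n); it is not Wright's full analysis, and the function names are hypothetical.

```python
def selection_dominates(s, n):
    """Rule of thumb from the passage: selection outweighs drift when s > 1/(2n)."""
    return s > 1.0 / (2.0 * n)


def threshold_population_size(s):
    """Effective population size at which s equals 1/(2n)."""
    return 1.0 / (2.0 * s)


MUTATION_RATE = 1e-6  # 'of the order of one in a million per locus'

for n in (10_000, 100_000, 10_000_000):
    print(n, selection_dominates(MUTATION_RATE, n))

print(threshold_population_size(MUTATION_RATE))
```

With a selection coefficient of the order of one in a million, the threshold falls at an effective size of about 500,000, the same order of magnitude as Wright's round figure of a million. This is why the disagreement over whether n means a local district or the whole species matters so much for Fisher's theory.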

In correspondence with Wright, Fisher responded to several of Wright's criticisms. On the question of population size he wrote as follows:

I am not sure I agree with you as to the magnitude of the population number n. To reduce it [as Wright had done] to the number in a district requires that there shall be no diffusions even over the number of generations considered. For the relevant purpose I believe n must usually be the total population on the planet, enumerated at sexual maturity, and at the minimum of the annual or other periodic fluctuations. For birds twice the number of nests would be good. I am glad, however, that you stress the importance of this number.[3, p.273]

It is this passage on which William Provine bases his claim that Fisher 'considered most species to be panmictic (random breeding) throughout their ranges, even if their range was the entire world'. It should be evident that Fisher said nothing of the kind. First, as a pedantic point, Fisher did not claim that any species had a range over 'the entire world'. Very few species do, even if we treat land and sea species separately. More important, Fisher did not maintain that species were literally panmictic throughout their range, but only that 'for the relevant purpose' [the relative effects of drift and selection] the appropriate number must 'usually' be the total breeding population of the species, at its periodic minimum size. His reason for this belief was evidently related to the question of 'diffusion', including migration between districts. As he had indicated in his letter to Leonard Darwin, Fisher doubted that isolation of local groups would usually be strict enough, over a period of many generations, to prevent a significant diffusion of genes between localities, which would tend to equalise gene frequencies between them.

Fisher's letter prompted Wright himself to think about the effects of migration (apparently for the first time), and he replied to Fisher as follows:

I was much interested in your comment on the population number... Since I wrote, I have been trying to get a clearer idea of the effect of diffusion, and I see, at least, that isolation in districts must be much more nearly complete than I realized at first, to permit random fixation of strains.[5, p.256]

Wright then discussed the effect of migration into a district at random from the entire population, and concluded:

I must admit that rather strict isolation is necessary on this basis to maintain appreciable variation of q, and I recognise of course that variation in q does not interfere much with selection unless it is so great as to bring about a U-shaped piling up close to q = 0 and q = 1. I am not entirely clear as to the effects of interchange between adjacent districts with similar q's. Presumably there could be considerably more such interchange without preventing a drifting apart of the q's for more remote districts. However, it seems clear that N must be based on the entire species, unless isolation is substantially complete, in considering the interference with selection.[5, p.256]

Thus, we reach the remarkable result that far from 'disagreeing strongly' with Fisher's position on the point at issue in the correspondence, Wright explicitly agreed with him!

In the same letter Wright asked Fisher whether he had published anything on the 'diffusion problem', and Fisher replied:

I have so far published nothing on the 'diffusion problem', but have in the press a book on 'The Genetic [sic] Theory of Natural Selection', which has part of a chapter on the cohesion of species in relation to the problem of their fission. I think it must be generally true that the ancestry of all individuals of a species is practically the same except for the last 100 or perhaps 10,000 generations, and that a gene frequency gradient is maintained by selection between different parts of a species' range. So that well-marked local variations may or may not be incipient species, according as real fission, cessation of diffusion, ultimately supervenes.[5, p.258]

This appears to be Fisher's last comment on the subject before the publication of GTNS in 1930, to which I will turn in a further post.

[1] R. A. Fisher: Eugenics Review, 13, 1921, 467-70, review of A. L. and A. C. Hagedoorn, The Relative Value of the Processes Causing Evolution.
[2] R. A. Fisher: On the Dominance Ratio, 1922 (online)
[3] J. H. Bennett (ed.): Natural Selection, Heredity and Eugenics, Including selected correspondence of R. A. Fisher with Leonard Darwin and others. 1983.
[4] Sewall Wright, Evolution: Selected Papers, ed. William B. Provine, 1986.
[5] William B. Provine: Sewall Wright and Evolutionary Biology, 1986.