Thursday, July 03, 2008
Continuing my series of notes on Sewall Wright's population genetics, I come to the subject of migration. This is important in understanding the differences between Wright and R. A. Fisher on the role of genetic drift in evolution. Fisher and Wright both agreed that genetic drift would be too weak a process to be of evolutionary significance in large populations (above, say, 10,000 in effective size). [Note 1] Equally, they agreed that it would be important in small populations, provided these remained sufficiently isolated over sufficiently long periods of time. Their disagreement was over the probability that the necessary degree of isolation would occur. This depends largely on the rate of migration between populations.
Fisher's views on the subject can be pieced together from scattered remarks, as I attempted here. It seems that from an early stage - at least from his 1921 review of the 'Hagedoorn Effect' - Fisher regarded small isolated populations as unimportant in evolution. If they stayed isolated for long, they would go extinct from occasional adverse conditions (epidemic disease, drought, etc.). If they did not stay isolated, the flow of migrants from outside (whether in a steady small trickle, or occasional larger floods) would be sufficient to prevent their gene frequencies from drifting far from those of the general population of their species. But so far as I know, Fisher never made any formal quantitative estimate of the amount of migration necessary to offset genetic drift. Sewall Wright, on the other hand, did make such estimates, and developed them in published works from 1931 onwards. It is known that a first draft of Wright's major 1931 paper on 'Evolution in Mendelian Populations' was written as long ago as 1925. In this he already took the view that genetic drift in small semi-isolated populations was an important evolutionary factor. This might suggest that by that time he had already considered the role of migration in depth. The draft of 1925 has not survived (Provine p. 237), but it seems that in fact it did not yet contain a detailed treatment of migration. The evidence for this is from Wright's correspondence with Fisher in 1929. Wright told Fisher that 'since I wrote [in August 1929, sending a copy of his draft] I have been trying to get a clearer idea of the effect of diffusion [i.e. migration] and I see, at least, that isolation in districts must be much more nearly complete than I realized at first, to permit random fixation of strains' [Provine p.256]. This conclusion is presented more formally in 'Evolution in Mendelian Populations' (at ESP pp.127-9).
Here Wright develops an equation for the distribution of gene frequencies which incorporates a term for m, the rate of migration into a small semi-isolated population from a larger population with different gene frequencies. The exact meaning of this equation is difficult to interpret [see Note 2], but Wright's own conclusion is that 'Where m [the migration rate] is less than 1/2N [with N being the effective size of the receiving population] there is a tendency toward chance fixation of one or the other allelomorph [i.e. one of the alleles at a locus where there are two alleles in the population]. Greater migration prevents such fixation. How little interchange appears necessary to hold a large population together may be seen from the consideration that m = 1/2N means an interchange of only one individual every other generation, regardless of the size of the subgroup'. This conclusion has been widely restated in the population genetics literature. Unfortunately I do not know of any clear and mathematically elementary proof. (John Maynard Smith [p. 158-60] presents a proof using only basic algebra, but it combines the treatment of migration and mutation, and involves various simplifying assumptions and approximations. There are also some confusing misprints or slips of the pen.) It may be surprising that the rate of migration sufficient to prevent populations drifting apart can be stated as a constant number of migrants, regardless of the size of the population. D. S. Falconer comments that 'This conclusion, which may at first seem paradoxical, may be understood by noting that a smaller population needs a higher rate of immigration than a larger one to be held at the same state of dispersion' [Falconer p.79]. We may put this point slightly more formally by noting that the effect of migration in offsetting drift may be expected to be proportional to the rate of migration. 
The rate can be expressed as n/N, where n is the number of migrants and N is the effective size of the receiving population. Since the effect of genetic drift has previously been shown to be proportional to 1/2N, we can therefore expect the migration rate required to neutralise drift to be n/N = k/2N, where k is some constant factor of proportionality. But it follows that in equilibrium we will have n = k/2, where k is a constant. Of course, this does not tell us the size of k, but it is plausible that it is of the order of 1, as is proved by Wright and others using more rigorous methods. The conclusion that only around 1 migrant every other generation is sufficient to prevent sub-populations drifting apart might seem fatal to Wright's belief in the importance of genetic drift. As shown in his correspondence with Fisher, Wright does initially seem to have had his confidence shaken. But Wright (like Fisher) was not one to give up a cherished theory without a struggle. Immediately following the quoted passage from 'Evolution in Mendelian Populations', Wright continues: 'However, this estimate must be qualified by the consideration that the effective N [the population size] of the formula is in general much smaller than the actual size of the population or even than the breeding stock, and by the further consideration that qm ['m' is a subscript, indicating the frequency of the allele among the migrants] of the formula refers to the gene frequency of actual migrants and that a further factor must be included if qm is to refer to the species as a whole. Taking both of these into account, it would appear that an interchange of the order of thousands of individuals per generation between neighboring subgroups of a widely distributed species might well be insufficient to prevent a considerable random drifting apart in their genetic compositions' (ESP p.128). 
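Before taking up Wright's qualifications, the unqualified result quoted earlier (that the balance between migration and drift depends on the number of migrants n, not on N) can be checked with a rough simulation. The following Python sketch is only an illustration, not Wright's method. It assumes a simple Wright-Fisher scheme in which each of many independent subpopulations of N diploid individuals receives a fraction m = n/N of its genes each generation from a source with fixed allele frequency q_m, and is then resampled as 2N gene copies. The function names, the parameter values, and the use of the variance of the final frequencies as a measure of 'drifting apart' are all assumptions made for the example.

```python
import random


def simulate_demes(N, migrants_per_gen, q_m=0.5, generations=200, demes=200, seed=1):
    """Return the final allele frequency in each of `demes` subpopulations.

    Hypothetical helper for illustration: each generation migration moves q
    toward q_m at rate m = n/N, then drift resamples 2N gene copies at random.
    """
    rng = random.Random(seed)
    m = migrants_per_gen / N          # migration rate, n/N in the text
    copies = 2 * N                    # gene copies in a diploid population
    finals = []
    for _ in range(demes):
        q = q_m                       # start every subpopulation at q_m
        for _ in range(generations):
            # Migration: a fraction m of genes is replaced by migrant genes.
            q = (1 - m) * q + m * q_m
            # Drift: binomial sampling of 2N copies from the current pool.
            count = sum(1 for _ in range(copies) if rng.random() < q)
            q = count / copies
        finals.append(q)
    return finals


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


if __name__ == "__main__":
    # n = 0.5 corresponds to Wright's threshold m = 1/2N.
    for N in (10, 40):
        for n in (0.05, 0.5, 5):
            spread = variance(simulate_demes(N, n))
            print(f"N={N:3d}  migrants per generation={n:<4}  variance of q={spread:.3f}")
```

If the argument is right, the variance should fall sharply as n rises from well below 1/2 to well above it, and should be roughly the same for N = 10 and N = 40 when n is held fixed. Note that the sketch ignores both of Wright's qualifications: N here is already the effective size, and the migrants are assumed to have the gene frequency of the species as a whole.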
Wright's first point, that effective N may be lower than the apparent size of the population, is either confused or confusing, since Wright has just proved that N, the effective size of the receiving population, is irrelevant to the number of immigrants required to neutralise drift. Perhaps Wright is thinking of the effective number of migrants, rather than of the receiving population, in which case the number who succeed in contributing to the gene pool may indeed be less than the total number. The second point is valid, but not well explained. Wright's formula contains a term mqm (with the second m a subscript), where qm is the frequency of the relevant allele among the migrants. But the underlying assumption is that this is the same as in the species generally. Wright's point (made more explicitly in later papers) is that the allele frequencies in neighbouring populations are likely to be more similar than in the species generally, so that mqm will actually be less than is assumed in the derivation of the result. To adjust for this we might stipulate that the 'effective' number of migrants is smaller than the actual number, even of those who successfully breed, just as the 'effective' population size may be smaller than the actual size. This approach is clearer in later papers, for example at ESP p.236: 'Cross breeding is, however, most likely to be with neighboring populations which differ but little in value of q. In this case the coefficient m is only a small fraction of the actual amount of change [i.e. the actual observed rate of migration]'. With this adjustment of mqm, the number of actual migrants required to neutralise drift might indeed be many more than 1 per generation. This is valid as far as it goes, but it depends on the assumption that allele frequencies in neighbouring populations are likely to be relatively similar.
This is perfectly plausible, but only because we tacitly assume that migration between neighbouring subpopulations is, or recently has been, sufficient to offset genetic drift. Wright therefore seems perilously close to sawing off the branch he is sitting on. Certainly, if the allele frequencies do drift 'considerably' apart (to use Wright's word in 'Evolution in Mendelian Populations'), the assumption of similar frequencies ceases to apply, and we can no longer rely on it. A further consideration is that on an evolutionary time scale (i.e. hundreds or thousands of generations) occasional larger influxes of migrants are almost bound to occur, and undo all the slow work of genetic drift. Even if an allele is lost or fixed in a subpopulation, it can be reintroduced at any time by migration from outside, so long as it persists somewhere in the species. Wright continued to study the effect of migration after 1931, with his fullest treatment in the paper 'Isolation by Distance' in 1943 (ESP pp.401-425). Here Wright examines three different models for migration: the Island Model, in which migrants are derived at random from a number of semi-isolated subpopulations of the species, and therefore on average have the gene frequencies of the species as a whole; isolation by distance in a two-dimensional continuum, where the probability of cross-breeding declines with the distance between the birthplaces of the breeding individuals; and isolation by distance in a linear range such as a river-bank. Wright's conclusions from the Island Model are not very different from those in his 1931 paper based on the cruder assumption of random migration throughout the species. The conclusions from two-dimensional isolation by distance are only slightly more favourable.
As he summarises it in 1943: 'It is apparent that there is a great deal of local differentiation if the random breeding unit is as small as 10, even within a territory the diameter of which is only ten times that of the unit. If the unit has an effective size of 100, differentiation becomes important only at much greater relative distances. If the effective size is 1000, there is only slight differentiation at enormous distances. If it is as large as 10,000 the situation is substantially the same as if there were panmixia [random mating] throughout any conceivable range' (ESP p.411). Only for the more special linear-range model is there substantial differentiation due to drift in populations of moderate size. Wright's theoretical conclusions might seem to imply that genetic drift in subpopulations would seldom be a major factor in evolution. It seems to require rather special circumstances to be effective: either very small populations, populations sparsely scattered with long distances between them, populations with a narrow linear range, or organisms that are very immobile at all stages of their life cycle. Wright nevertheless continued to insist throughout his career that drift in subpopulations was an important, if not essential, feature of evolution. The uncharitable view of this would be that Wright was simply stubborn. Having taken up his position on the importance of this factor, before having considered in depth the effects of migration, he was determined to defend it, come what may. (There would be a parallel here with the equally stubborn position of Fisher on the evolution of dominance.) A more charitable view would be that Wright was trying to find an explanation of something that was generally accepted by biologists when he began his career: namely, that the observable differences between subspecies, and even between species, are usually selectively neutral.
Wright himself stresses this point in 'Evolution in Mendelian Populations': 'It appears, however, that the actual differences among natural geographical races and subspecies are to a large extent of the nonadaptive sort expected from random drifting apart. An interesting example, apparently nonadaptive, is the racial distribution of the 3 allelomorphs which determine human blood groups' (ESP p.128). In the years and decades following 'Evolution in Mendelian Populations', the opinion of biologists turned away from the consensus view in 1931 (really no more than a superficial assumption) that subspecific differences are selectively neutral. Much of the relevant research was carried out by the students and collaborators of Wright and Fisher themselves, notably E. B. Ford in England and Theodosius Dobzhansky in the USA. The general outcome was that even apparently minor subspecific differences often had some selective value. Human blood groups, for example, were found to be correlated with resistance to different diseases, though it remains unclear whether all such differences have a selective basis. The importance of genetic drift in subpopulations is of course an empirical matter. It is quite possible that some species are 'Wrightian' and some are 'Fisherian' in this respect. The observed amount of genetic diversity between subpopulations is usually quite modest [Maynard Smith p.160-161], suggesting that migration between them is usually sufficient to prevent them drifting far apart. There are theoretical reasons for expecting that 'Fisherian' species would be in a majority. Most species have adaptations for dispersal at some stage of their life. Plants, for example, have adaptations for spreading their seeds. Among animals, the juveniles of one or both sexes often disperse from their region of birth to find mates or territories.
With a few exceptions, organisms that just stick to one spot are doomed to extinction within a fairly short period of evolutionary time, since the conditions of life seldom stay fixed for many generations. Even in species with relatively stable environments, there are theoretical reasons for expecting that a mixture of mobility and immobility would be adaptive (W. D. Hamilton, Narrow Roads of Gene Land, vol. 1, chapter 11). But it remains possible that 'Wrightian' processes are important in some cases. A particularly interesting case is the modern human species itself. After the dispersal of modern humans out of Africa, it is likely that human populations for most of the last 100,000 years were small and scattered, with little migration between different continental groups. These are good conditions for Wrightian genetic drift. Whether the observed differences in gene frequencies between continental populations are due to drift or selection remains an active area of research [see Jobling et al., passim].
Note 1. Neither Wright nor Fisher was very interested in genetic drift among genetic variants that are selectively entirely neutral, as expounded in Kimura's theory of neutral evolution at the molecular level. Fisher died before Kimura published his theory. Wright lived long enough to take account of it, and found it plausible enough with regard to neutral mutations of nucleotides, but considered it of no evolutionary interest (see Provine p.469-77).
Note 2. As I understand it, Wright's conception of the distribution of gene frequencies is broadly as follows. We assume that two populations have evolved separately, and are fixed for different alleles at one or more loci. (For simplicity it is assumed that there are no more than two alleles at each locus.) The two populations are then combined and interbreed freely. Assuming that the populations are of equal size, the frequencies of the alleles at each locus in the combined population will initially all be 50%.
The combined population then evolves in isolation. As a result of random genetic drift, the allele frequencies will tend to drift away from 50%. Over a large number of loci (or over a large number of hypothetical populations) we can ask, what is the probability that an allele will have any particular frequency after any specified number of generations? The total of such probabilities over all possible allele frequencies, from 0 to 1, will of course add up to 1, and will have an approximately smooth (continuous) distribution, which (on the given assumptions) will be symmetrical around a frequency of 50%. Initially the probability distribution will be clumped closely around 50%, but as time goes on it will spread out. Eventually, some alleles will begin to be lost or fixed, with a probability of 1/2N per generation. Wright now assumes that beyond a certain number of generations the shape of the probability distribution of frequencies for the remaining alleles will be approximately constant, apart from the continuing occasional loss and fixation of alleles, which will affect all the remaining alleles equally. The problem is to find this constant distribution under various assumptions about mutation, migration, and selection. Much of Wright's work in the 1930s was devoted to this problem. I cannot claim to have followed Wright's derivations in detail, as his explanations are obscure even by his usual standards. The problem is not just that the mathematics is advanced (though it does involve more calculus than in most of Wright's work) but that he makes various simplifying assumptions and approximations which are not self-evidently justified. I can only take it on trust that the conclusions are correct, and that if they were not (as Dobzhansky put it) 'some mathematician would have found it out'.
References:
[Provine] William B. Provine: Sewall Wright and Evolutionary Biology, 1986.
[ESP] Sewall Wright: Evolution: Selected Papers, edited and with Introductory Materials by William B. Provine, 1986.
D. S. Falconer: Introduction to Quantitative Genetics, 3rd edn., 1989.
M. Jobling, M. Hurles, and C. Tyler-Smith: Human Evolutionary Genetics, 2004.
John Maynard Smith: Evolutionary Genetics, 1989.