Tuesday, April 29, 2008
Some Athletes' Genes Help Outwit Doping Test :
The 55 men in a drug doping study in Sweden were normal and healthy. And all agreed, for the sake of science, to be injected with testosterone and then undergo the standard urine test to screen for doping with the hormone.
The whole "Asian" angle wouldn't be as important from where I stand if China wasn't intent on becoming an athletic superpower. Specifically, from Doping Test Results Dependent on Genotype of UGT2B17, the Major Enzyme for Testosterone Glucuronidation:
We demonstrated that a deletion polymorphism in the gene coding for UGT2B17...is strongly associated with TG levels in urine...All subjects devoid of the gene had a T/E ratio below 0.4...This polymorphism was considerably more common in a Korean Asian than in a Swedish Caucasian population, with 66.7 and 9.3 % deletion/deletion (del/del) homozygotes respectively.
They don't seem to know what SNP is causing this. If you are curious, you can check out the linkage disequilibrium around UGT2B17.
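A quick back-of-the-envelope sketch of what those homozygote figures imply. This assumes Hardy-Weinberg proportions in each sample, which the abstract does not claim, and the function names are mine, for illustration only. Under that assumption the del/del fraction is the square of the deletion allele frequency, so the allele frequency is just its square root.

```python
import math

def deletion_allele_frequency(del_del_fraction):
    """Estimate the deletion allele frequency q from the del/del fraction.

    Assumes Hardy-Weinberg proportions, i.e. del/del fraction = q**2.
    """
    return math.sqrt(del_del_fraction)

def genotype_frequencies(q):
    """Expected genotype frequencies for deletion allele frequency q (HWE assumed)."""
    p = 1.0 - q  # frequency of the intact ("ins") allele
    return {"ins/ins": p * p, "ins/del": 2 * p * q, "del/del": q * q}

# del/del homozygote fractions quoted in the abstract above
samples = {"Korean": 0.667, "Swedish": 0.093}

for label, fraction in samples.items():
    q = deletion_allele_frequency(fraction)
    expected = genotype_frequencies(q)
    print(f"{label}: q = {q:.3f}, expected carriers of at least one "
          f"deletion = {1 - expected['ins/ins']:.1%}")
```

If the Hardy-Weinberg assumption is even roughly right, the deletion is the majority allele in the Korean sample and a minority allele in the Swedish one.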
John Derbyshire has a long column excoriating Ben Stein and the Discovery Institute titled A Blood Libel on Our Civilization:
And there is science, perhaps the greatest of all our achievements, because nowhere else on earth did it appear. China, India, the Muslim world, all had fine cities and systems of law, architecture and painting, poetry and prose, religion and philosophy. None of them ever accomplished what began in northwest Europe in the later 17th century, though: a scientific revolution. Thoughtful men and women came together in learned societies to compare notes on their observations of the natural world, to test their ideas in experiments, and in reasoned argument against the ideas of others, and to publish their results in learned journals. A body of common knowledge gradually accumulated. Patterns were observed, laws discerned and stated.
Via Talk Islam.
Update: John also rips David Berlinski a new one. Via Quantum Ghosts.
Sunday, April 27, 2008
For a few weeks I've been mulling over a "theory" about the nature of contemporary fiction. The quotes are because this is a theory in the way that normal people have theories; they don't know much and just make up plausible (to their mind) models that are ultimately grounded in a whole lot of ignorance. I really don't know much here, and I strongly suspect I'm wrong, but I can't help but express an opinion in public though I feel I shouldn't because of my admitted ignorance. To some extent I'm putting this post up to be enlightened by readers who do know a great deal more about letters (e.g., The Man Who is Thursday, who should also resize the little dog so his front page load doesn't go well north of 300 K).
Here's the argument: contemporary mainstream fiction is very different from the storytelling of the deep past because of a demand side shift. Women consume most fiction today, and their tastes differ, on average, from those of men. How do they differ? To be short about it, men are into plot, while women are into character. This means that modern literary fiction emphasizes psychological complexity, subtlety and finesse. In contrast, male-oriented action adventure or science fiction exhibits a tendency toward flat monochromatic characters and a reliance on interesting events and twists. Over my lifetime I've read a fair amount; but the vast majority of the fiction has been science fiction & fantasy. Many males outgrow this bias, perhaps as they become more psychologically complex and nuanced, but I haven't (though I don't read much fiction in general at this point). I know many other males who are similar; we aren't dumb, and not all of us have Asperger's. We just aren't interested in characterization or character. We are people of exotic ideas, novelty of story arc and exploration of startling landscapes. Contemporary mainstream fiction, high, middlebrow and low, does not usually satisfy these needs.
But ancient fiction (epics, myths, etc.) does fulfill these requirements. I didn't seek out fiction in any form before I was 13 or so (I was assigned books in school of course); but I had read Bulfinch's Mythology as well as translations of the Iliad and Gilgamesh. In hindsight I suspect that my interest in these works is due to the fact that they are recognizably High Fantasy. Either they are explicit myths, or they refer to peoples and places whose lack of banality is due to their distance in time & space (obviously I have never been to the Zagros mountains!). I also have read historical fiction which is sufficiently distant in time, e.g., the whole of Colleen McCullough's Masters of Rome series.
To some extent if you know me in person you can see that I'm not interested in the details of the characters of other human beings. I'm somewhat along the autism spectrum toward Asperger's. I'm not the type to lose myself in a story, and I'm not really interested in most horror films because I have a hard time getting scared or identifying with the characters (I can't forget it's just a movie and the people aren't real). It seems clear to me why I have a hard time being interested in mainstream fiction; not only am I not interested in the characters, but I'm just not like most of the people depicted in terms of their values or personality. I can't "relate," and I'm not interested in "relating."
If you read Isaac Asimov's autobiography, In Memory Yet Green, I think you get a sense of why his novels depict flat characters. Though Asimov seems to have been a gregarious individual, he was very narcissistic and self-involved. I don't get a sense that he was a socially sensitive soul (though he did resent the anti-Semitism he had experienced or slights from strangers). Asimov wrote something of an apologia for science fiction as a genre of ideas, but I think it reflects the set of values which I've expressed above and which many science fiction oriented individuals embody; plots, not people. (If you want every stereotype of science fiction readers confirmed, check out William Sims Bainbridge's Dimensions of Science Fiction, which is based on surveys at science fiction conventions.)
For whatever reason Our Kind of People don't become literary critics or arbiters of taste & sophistication. Science fiction & fantasy can never be Great Fiction. If a work of science fiction & fantasy is Great Fiction then by definition it is not science fiction & fantasy. Slaughterhouse-Five, Brave New World and 1984 are not science fiction. Within the science fiction ghetto authors such as Ursula K. Le Guin and Ray Bradbury, who admit or manifest little interest in science as such and emphasize literary values and social messages (especially Le Guin for the latter), are held up as the great authors who are acceptable. In other words, authors for whom psychological exploration just happens to involve a spaceship in the background.
Why does any of this matter? For one, I think that it is somewhat peculiar that many of us find fiction from the past more engaging than popular contemporary works. Apuleius' Golden Ass gets my attention; most contemporary fiction does not. I am arguing here that this is partly due to the fact that in the past those who read copiously were, on average, much more like me than they were like the typical human. Not only were readers by and large men (usually of some means and comfort), but they were often also disproportionately eggheads who were eccentric by their nature. How many elite scholars were there such as Claudius who were not attracted to the public life of politics and do not appear in the annals of history? With the printing press, cheaper paper, and the rise of mass literacy,1 things changed: the distribution of taste shifted. And so did the distribution of genres.
So am I full of crap?
Addendum: I also think there is a supply-side issue; female authors tend to produce a particular type of work. This is evident within science fiction; female authors are underrepresented in hard science fiction. Here is something from the Wikipedia entry for The Tale of Genji:
The Tale of Genji...is a classic work of Japanese literature attributed to the Japanese noblewoman Murasaki Shikibu in the early eleventh century, around the peak of the Heian Period. It is sometimes called the world's first novel, the first modern novel, or the first novel to still be considered a classic. This issue is a matter of debate. See Stature below.
The first psychological novel? Sounds really boring (though it seems like she makes an attempt at plot, so perhaps I should check this out. I enjoyed Musashi, whose author was influenced by The Tale of Genji).
1 - I am not convinced that even the Athenian democracy was characterized by mass literacy. See Ancient Literacy.
Saturday, April 26, 2008
Friday, April 25, 2008
The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World
A few weeks ago Tyler Cowen mentioned he was reading David W. Anthony's The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World. I ordered it on Amazon, and it was hanging around the house so I decided to check it out early this evening...I read all 466 pages in one sitting. If you are a GNXP reader interested in archeology, prehistory and Central Asia you have to read this book! I've read In Search of the Indo-Europeans: Language, Archaeology and Myth, The Coming of the Greeks, and Archaeology and Language: The Puzzle of Indo-European Origins, but David W. Anthony really achieved something here which I wasn't prepared for with all due respect to the scholars who authored the aforementioned works. Most readers are aware that I've complained about how happily pig-ignorant of other fields most archaeologists are. Frankly, when I see an academic book I'm curious about, but find out that the author is an archaeologist I get really suspicious. They're good at collecting data, but once the stamps are arranged there seems an extreme reluctance to fire up a real analytic engine.
The author of The Horse, the Wheel, and Language is well aware of these prejudices and refers to them obliquely. He obviously doesn't want to dismiss all of his colleagues as pseudo-scientific quacks, but he admits the general ignorance of archaeologists of fields such as historical linguistics which you would think they'd check in on, and alludes to their nearly fanatical adherence to the "Pots not Peoples" paradigm (a reaction to their pre-World War II love affair with migrations). Anthony's interdisciplinary scope is very impressive; he references ideas and conclusions from Albion's Seed and Y: The Descent of Men. Cutting edge research on the evolution of lactose tolerance & the phylogenetics of domesticated cattle is intelligently integrated into the narrative. Nevertheless, the bulk of the text is an extremely dense exposition of archaeological discoveries from sites on the Pontic Steppe since the fall of the Soviet Union. The central argument is that Indo-Europeans emerged from this region between the Dnieper and Volga around 3500 BCE and over the next 1500 years spread in all directions. I won't detail the how or the why, since I just finished the book about 15 minutes ago and haven't fully assimilated that much of it, but you can read chapter 1 online.
Note: I don't want to make it seem like this is a breezy popular book, about 2/3 of the material consists of detailed reportage of various sites and analysis of data on pots, cemeteries and seed-husks. Anthony is clearly talking to his fellow scholars, but I think the prose is accessible enough for an interested lay reader as he avoids obscuring jargon. Rather, if you are scared by an endless parade of facts this might not be for you. On the other hand, if your data-gullet is endless, what are you waiting for?
Related: The Inner Asian gap: the Afanasievo breakthrough.
A new paper, The Dawn of Human Matrilineal Diversity, is out in AJHG. I read too much John Hawks to really be all that excited about mtDNA based studies, and this paper is Mitochondrial Eve to the nth power. But...I do think it is indicative of a trend which suggests a rollback from the most extreme Out of Africa scenarios; i.e., that one band somewhere in Eastern Africa arose ~100,000 years ago and expanded demographically so that they were the exclusive ancestors of all human beings.1 The introgression story is one angle; but the likelihood of preexistent population substructure within Africa itself is another. If you don't read the paper (which is Open Access), just check out Figure 1 and the map. Breathless description of the study over at ScienceDaily of course....
Related: Kambiz comments in more detail.
1 - Some of this was more about public perception than reality; just like the one mtDNA ancestor was conflated with one female ancestor (the same trick of course applied to the NRY).
Thursday, April 24, 2008
One of the "debates" currently occupying evolutionary biology is whether evolution occurs primarily via changes in protein-coding sequence or via changes in gene regulation (apparently it's become so heated that battles between the two camps are now fought through t-shirts).
As understanding of the genetics of adaptation advances, this debate will likely fade away--a priori, it's easy to make the case for either, and well-studied individual examples are showing that, as one might expect, evolution isn't particularly dogmatic about the sources of variation it works with.
It's these case studies that are most interesting--take, for example, a recent study on the adaptation of Arabidopsis halleri to the heavy-metal-polluted soils it now occupies. This is quite a nice example of evolution via gene regulation--the authors map the ability to tolerate heavy metals to a particular candidate gene, then identify both a change in copy number as well as changes in the promoter region of the gene that lead to high levels of expression. To complete the story, they then pop this highly expressing version of the gene back into A. thaliana (the model organism) and show they're able to recreate the crucial aspects of the adaptation in that species.
Tuesday, April 22, 2008
Statistical Modeling, Causal Inference, and Social Science asks where all the Smiths have gone:
Sam Roberts writes, "In 1984, according to the Social Security Administration, nearly 3.4 million Smiths lived in the United States. In 1990, the census counted 2.5 million. By 2000, the Smith population had declined to fewer than 2.4 million." Where did all the Smiths go from 1984 to 1990? I can believe it flatlined after 1990, but it's hard to believe that the count could have changed so much in 6 years.
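To put numbers on how odd that drop is, here is a minimal sketch using only the three rounded figures in the quote. The helper names are mine, "nearly 3.4 million" and "fewer than 2.4 million" are treated as 3.4 and 2.4 million, and the 1984 figure comes from a different source (Social Security) than the two census counts, so this shows the size of the discrepancy rather than a real demographic rate.

```python
def relative_change(start, end):
    """Fractional change from start to end (negative means a decline)."""
    return (end - start) / start

def annualized_rate(start, end, years):
    """Constant yearly growth rate that would take start to end over `years` years."""
    return (end / start) ** (1.0 / years) - 1.0

# Rounded counts taken from the quoted passage (assumption: treated as exact)
smiths = {1984: 3.4e6, 1990: 2.5e6, 2000: 2.4e6}

for first_year, last_year in [(1984, 1990), (1990, 2000)]:
    start, end = smiths[first_year], smiths[last_year]
    overall = relative_change(start, end)
    yearly = annualized_rate(start, end, last_year - first_year)
    print(f"{first_year}-{last_year}: {overall:+.1%} overall, {yearly:+.2%} per year")
```

On these figures the 1984 to 1990 gap is a loss of about a quarter of all Smiths in six years, roughly 5% a year, against well under 1% a year for 1990 to 2000, which is why a change in how the count was made looks at least as plausible as any real shift.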
Here's another explanation, it's the inverse of the phenomenon of those claiming Native American ancestry in the United States doubling in 10 years. Many Smiths were at one point Schmidts, who knows if some of them didn't revert now that WASP surnames aren't as value-added? I strongly suspect that the number of ethnic whites in the USA is overstated because those with mixed-ancestry emphasize the most non-traditional quanta of their heritage. That means if someone is 1/4 German & 3/4 English they might declare their ethnicity as German. I'll probably have to look up some social science on this question at some point....
Note: the rank of Schmidt increased by 33 from 1990 to 2000.
Update: I took a bunch of German names and their English or Anglicized variants and compared their ranks between 1990 and 2000. I'm sure that the trend you see is the combined result of the decrease in proportion of those with very common Anglo names because of the decline of the non-Hispanic white fraction as well as a moderate stream of new German immigrants. But who knows?
Update II: Proportion of German Americans dropping faster than English Americans?
Update III: Took some Census 2000 data and produced this....
I think the ratio of First to Second ancestry is probably a pretty good gauge of admixture/outmarriage rates. Look at the Welsh; not very distinct from other British Isles groups and far less numerous, ergo lots of second ancestry.
Update IV: Median age for people of English ancestry is 44. For German it is 37. Same with Irish. What's up with that?
Monday, April 21, 2008
The first column shows the theoretical expected PC maps for a class of models in which genetic similarity decays with geographic distance (see text for details). The second column shows PC maps for population genetic data simulated with no range expansions, but constant homogeneous migration rate, in a two-dimensional habitat. The columns marked Asia, Europe and Africa are redrawn from the originals of ref. 3 [this reference is to Cavalli-Sforza's The History and Geography of Human Genes]. Each map is marked by which PC it represents. The order of maps in each of the last three columns was chosen to correspond with the shapes in the first two columns.
What does this mean? The authors say it best in the abstract:
Nearly 30 years ago, Cavalli-Sforza et al. pioneered the use of principal component analysis (PCA) in population genetics and used PCA to produce maps summarizing human genetic variation across continental regions. They interpreted gradient and wave patterns in these maps as signatures of specific migration events. These interpretations have been controversial, but influential, and the use of PCA has become widespread in analysis of population genetics data. However, the behavior of PCA for genetic data showing continuous spatial variation, such as might exist within human continental groups, has been less well characterized. Here, we find that gradients and waves observed in Cavalli-Sforza et al.'s maps resemble sinusoidal mathematical artifacts that arise generally when PCA is applied to spatial data, implying that the patterns do not necessarily reflect specific migration events.
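A toy illustration of the point in the abstract, not the authors' simulation: it assumes NumPy is available, puts populations at evenly spaced sites along a one-dimensional line instead of a two-dimensional habitat, and simply declares that genetic covariance between two sites decays exponentially with the distance between them. The function names and parameter values are mine. No migration event of any kind goes in, yet the leading principal components come out as smooth waves, each one with one more zero crossing than the last.

```python
import numpy as np

def distance_decay_pcs(n_sites, corr_length, n_pcs):
    """Leading eigenvectors of a covariance matrix that decays with distance.

    Sites sit at positions 0..n_sites-1 on a line (a 1-D simplification),
    and the covariance between sites i and j is exp(-|i - j| / corr_length).
    Returns an array of shape (n_sites, n_pcs), largest eigenvalue first.
    """
    positions = np.arange(n_sites)
    distance = np.abs(positions[:, None] - positions[None, :])
    covariance = np.exp(-distance / corr_length)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # ascending order
    order = np.argsort(eigenvalues)[::-1][:n_pcs]
    return eigenvectors[:, order]

def sign_changes(vector):
    """Count how many times consecutive entries of a vector switch sign."""
    signs = np.sign(vector)
    signs = signs[signs != 0]  # ignore exact zeros
    return int(np.sum(signs[1:] != signs[:-1]))

pcs = distance_decay_pcs(n_sites=50, corr_length=10.0, n_pcs=4)
for k in range(pcs.shape[1]):
    print(f"PC{k + 1}: {sign_changes(pcs[:, k])} sign changes along the line")
```

Plotting the columns of `pcs` against position shows a hump, then a gradient, then successively shorter waves: the same family of shapes that gets read as a migration signature on a map.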
Labels: Population genetics
Sunday, April 20, 2008
Clark has a post pointing to the obvious parallels between the practices of the Fundamentalist Church of Jesus Christ of Latter Day Saints and those of West African immigrants. The "problem" with the FLDS situation is pretty clear; they're WASPs with weird folkways. Of course the reaction to the FLDS is simply a retread of what happened with the original Mormons, a culturally heterodox group whose primary following was among the lower and lower middle class of Greater New England.1 I had friends in high school who were from the old Mormon stock whose ancestors had been driven west; many of the remembrances passed down through the generations resembled those of the Trail of Tears. My friends were proud & patriotic Americans, but I was surprised that on a deep level they seem to have never forgotten the persecution which Mormons experienced from the American government and the people which it claimed to represent.
The "problem" with the original Mormon church, and the FLDS today, is that we aren't living in a land of black & white, where good and evil are clear and distinct. In some ways the early Mormons were an admirable folk, picking themselves up by their own bootstraps and forging a new religion in the wilderness of the American continent. But they also manifested hostility toward outgroups and an exclusionary tendency which ill-suited them in their interaction with other Americans, "gentiles" as they would call them. The history of the Mormons from their original emigration down to the banning of polygyny was one of interminable conflict with the American republic; the Utah territory was defined by the clash between a Mormon theocracy and the occupational government of the United States. This enmity was only resolved by the Mormon rejection of polygyny.
This episode showed that the tolerance of the American polity had its limits. Though multiculturalism is a relatively new concept in terms of its elaboration, the United States of the 19th century was shockingly diverse when it came to religious pluralism. The Mormons themselves were an outgrowth of the Second Great Awakening, which transformed the American South into the domain of Baptists and produced many of the mainstream denominations which are still on the scene today. Joseph Smith's cult is the most exotic outlier, but it was not entirely atypical. Smith's sin was not to push Protestantism into a new direction, it was the fact that he dragged the Mormon church into a landscape which transgressed against the bourgeois norms at the heart of American society (this occurred with other religious-social groups which emerged out of the Second Great Awakening, but only the Mormons remain).
The emblematic violation of those norms was of course plural marriage, polygyny. I don't think that plural marriage is wrong like murder is wrong, but the social dynamics which emerge from its ubiquitous practice among the FLDS are well known, and I am skeptical that the practice is conducive to the perpetuation of a bourgeois republic. Even within the Muslim world modernizers are very critical of polygyny because of the familial destabilization it portends. In a world where time is finite one can make quick back of the envelope inferences about the effect upon parental inputs in a situation where one man fathers dozens of children with multiple wives. Though there are very specific principled arguments one can make against polygyny, I suspect that the consequentialist ones are at the heart of the relatively universal objection to the practice from most Americans.
The FLDS situation gets to the heart of a broader problem in any polity, and that is one of diversity of values. As WASPs without the race card to bail them out the members of the FLDS find themselves facing the reality of prejudice & discrimination at the hands of the majority. On pure moralistic grounds I think one can point to the ubiquity of debauched polyamory in much of American society, and low "paternity certainty." Why this fixation on the FLDS's practices? Aside from the formalization of a routine of statutory rape encouraged by Warren Jeffs, I suspect a bigger issue is that the FLDS legitimizes & solemnizes practices Americans want to keep marginalized and sinful (for lack of a better word). Most Americans are regularly bombarded with the message that prejudice & discrimination are bad, but the reality is that we engage in these activities every single day of our lives. Our rejection of polygyny brings into stark relief the persistence of shibboleths and unspoken norms. The non-ethnic whiteness of Fundamentalist Mormons results in our disgust not being buffered by race guilt or discounting of the practitioners of exotic behaviors as marginally human. The members of the FLDS are "All American" in their stock, so their practices are more repulsive than they would otherwise be. They are apostates from the bourgeois consensus.
And consensus is vitally important, no matter how much we wish to emphasize the value of public debate and difference. Winnifred Sullivan's book The Impossibility of Religious Freedom elucidated the charade that a world without prejudice & discrimination truly is. In Catholicism & American Freedom John T. McGreevy documents how American Catholics became part of the mainstream in large part due to their assimilation of American values and folkways. In other words, Catholicism became acceptable when it became Protestant, the apotheosis of which was John F. Kennedy.2 Because religion is so important to people we treat it differently; Americans receive exemptions and dispensations from civil expectations if their religious obligations or taboos contradict mainstream norms. But these exemptions can only go so far, and they are extended only toward particular groups who have received the acclaim necessary for public recognition.
The treatment meted out to the FLDS illustrates the limits of the tolerance of acts between consenting adults, that the circle of diversity is not without boundaries. The historical record also shows that the tolerance extended toward numerous factions such as Catholics and Jews was in large part a reflection of the fact that both of these groups subsumed themselves into the set of expectations which were normal within American Protestantism.3 For sects where the numbers are smaller, such as the Amish, heterodoxy is accepted because their impact is so marginal and their customs are in the generality inoffensive or quaint. In the past American society admitted the reality of these boundaries and the general outline of our circle of tolerance; today we are somewhat in denial, and the schizophrenic reaction to something like the FLDS controversy reflects the clash between our deep-rooted values and our notional avowal of universal multiculturalism.
Related: Jake Young blogs the economic benefits of monogamy.
1 - Greater New England included much of northern Ohio, for example.
2 - I obviously don't mean that American Catholicism is in schism from the Roman Church. Rather, in terms of the conception of their relationship to their religion of choice American Catholics bring American Protestant presuppositions. This was clear even during the early 19th century, but the massive influx of European Catholic immigrants de-Americanized the church by around 1850 and brought to the fore "Old World" values and expectations in terms of how the church would relate to the state. The result was decades of conflict which only abated when the children of the immigrants became numerically dominant and brought their own American sensibilities to the table. Simultaneously with this demographic shift the international Roman Catholic Church was shifting to a more "Americanist" perspective, culminating in Vatican II. The point is that the United States culture didn't really compromise with the Catholic Church; the church was transformed until it became acceptable.
3 - Note the popularity of non-"Orthodox" Judaism in the United States.
Thursday, April 17, 2008
A common SNP of MCPH1 is associated with cranial volume variation in Chinese population:
Microcephaly (MCPH) genes are informative in understanding the genetics and evolution of human brain volume. MCPH1 and abnormal spindle-like MCPH associated (ASPM) are the two known MCPH causing genes that were suggested undergone recent positive selection in human populations. However, previous studies focusing only on the two tag single nucleotide polymorphisms(SNPs) of MCPH1 and ASPM failed to detect any correlation between gene polymorphisms and variations of brain volume and cognitive abilities. We conducted an association study on eight common SNPs of MCPH1 and ASPM in a Chinese population of 867 unrelated individuals. We demonstrate that a non-synonymous SNP (rs1057090, V761A in BRCA1 C-terminus (BRCT) domain) of MCPH1 other than the two known tag SNPs is significantly associated with cranial volume in Chinese males. The haplotype analysis confirmed the association of rs1057090 with cranial volume, and the homozygote males containing the derived alleles of rs1057090 have larger cranial volumes compared with those containing the ancestral alleles. No recent selection signal can be detected on this SNP, suggesting that the brain volume variation in human populations is likely neutral or under very weak selection in recent human history.
They used EHH & iHS. Also, they suggest that the derived form of rs1057090 is very ancient (the SNP has a very small window of linkage disequilibrium around it).
Related: This is Bruce Lahn's brain on ASPM and MCPH1, Did Modern Humans Get a Brain Gene from Neandertals?, Microcephalin & ASPM and Selection "controversy".
Tuesday, April 15, 2008
The discussion continues in regards to the relationship of various West Eurasian and North African groups (i.e., Europeans, North Africans and Near Easterners). There have been several papers published within the last few years which shed some light on these questions. We've blogged them before, and I don't think that they radically alter what you might find in History and Geography of Human Genes, but I thought I'd point to them again, with a special focus on figures of note.
European Population Substructure: Clustering of Northern and Southern Populations. Figure 4 B:
Analysis and Application of European Genetic Substructure Using 300 K SNP Information, Figure 1 B & D:
Discerning the Ancestry of European Americans in Genetic Association Studies, Figure 3 A:
Analysis of genetic variation in Ashkenazi Jews by high density SNP genotyping, Figure 9:
Since these papers are all Open Access there's really no excuse to not read them (at least the "Discussion" sections). I hope people won't go around looking for charts to "prove" whatever pet hypothesis they want to promote; the population-level classifications we generate often have only an approximate relationship to the multi-dimensional shape of human genetic variation at the finer-grained level. Note that some of these principal component charts really don't have that many individuals typed, and you may wonder about the representativeness of the samples of their putative national populations. Though these are important points, I do think we need to be cautious about our expectations in regards to the sort of information we're going to extract on the margins as the N increases and the individuals typed come from every region of a nation. I suspect we'll get more oddities like the Etruscans as isolated or peculiar populations are included in these samples, and the exceptions to the broad patterns tell us a lot about the details of human history. But I doubt we'll overturn the general shape of the relationships and clinal gradients we see here.
Addendum: I somewhat played down the future surprises that these sorts of fine grained analyses might have for us...but I do want to note that the studies will continue. That's because they aren't done for the purposes of elucidating human genetic history as such, rather, the primary rationale is to highlight substructure which might be relevant when attempting to ascertain disease relevant alleles. In the medical context then there may be significant returns on the investment here which I don't want to underestimate. If, for example, a particular drug's efficacy within the African American population in the United States is directly proportional to the makeup of one's ancestry then identifying ancestry-informative markers is very useful.
Update: Measuring European Population Stratification with Microarray Genotype Data, Figure 1 A:
Monday, April 14, 2008
Greg's post about SNPs, Jews and evolutionary genetic parameters has been getting a lot of play around the blogs & forums. Most of it seems to be due to the persistent interest in the genetic relationship of Ashkenazi Jews to other European populations. This makes sense; since the 19th century the question of how the "Jewish race" relates to European gentiles has had some sociopolitical relevance.... But a commenter at Steve's blog pointed out that Bauchet et al. from last year had a PC chart which included Armenians, who are I think a good proxy for northern Middle Eastern populations in general. One interesting result from surveys of Y chromosomal lineages is the finding that Jews may have more affinities with northern Levantine & Anatolian Middle Eastern populations than with southern Levantine and Arabian ones. The non-trivial female mediated input of Sub-Saharan ancestry into many Arab populations since the rise of Islam is far less evident in non-Arab Muslim populations (Kurds, Persians and Turks) as well as Middle Eastern Jews, and obviously Ashkenazi ones. But another point is that recent work suggests that the impact of historical events (e.g., the Arab conquest) might have been more demographically significant than we had previously assumed, and so Jewish affinity with northern Middle Eastern populations may reflect that these groups have been less affected by exogenous genetic inputs within the last 2,000 years.
Caution about the sample sizes of course (though I assume within the next year we'll have much better data to go off of), but something to include into your list of priors when making phylogenetic background assumptions.
Note: I added geographic labels to the PC chart for clarification.
Update: Steve has another post up:
On the first two axes, Ashkenazi Jews are rather close to "Europeans" and "Russians." They are similar to Yemenites (from Southern Arabian peninsula) on the first axis, but not on the second. And they are similar to Samaritans (who currently subsist on two hilltops in Israel), good, bad or indifferent, on the second axis but not on the first. They are fairly similar to the Druze (of Lebanon and Israel) on the first two axes, but not on the third.
The Samaritans are cousins of the Jews. But:
In the past, the Samaritans are believed to have numbered several hundred thousand, but persecution and assimilation have reduced their numbers drastically. In 1919, an illustrated National Geographic report on the community stated that their numbers were less than 150.
Like the Kalash or Sardinians the Samaritans are going to be weird outliers because of their demographic history. Inbreeding and no gene flow in for that long will do that to you (many people in the Middle East are descended from Samaritans of course, but very few Samaritans are descended from non-Samaritans).
The Yemenites are also a peculiar comparison point because they are geographic outliers in relation to other Middle Eastern populations with a long and distinct history. They have a large proportion of Sub-Saharan ancestry for an Arab group. An interesting historical note is that during the Islamic expansion Yemenite tribes were prominent in Iraq and Egypt, though I doubt they left a very strong genetic imprint in these regions.
The Druze are a better point of comparison, being a more mainstream Middle Eastern group. But that's only relative to the Samaritans, who are at an advanced stage of pedigree collapse, or the Yemenites, who are on the geographic margins of the Near East (it is easy to argue that before Islam Yemen was more a part of the trans-Indian Ocean world than it was of the Near East). The Druze are an esoteric ethno-religious group which has been resident in the mountains of Lebanon and has not accepted converts since 1031, so again you have a recipe for some genetic distinctiveness developing because of social norms.
All that being said...perhaps as we explore the genetics of the Middle East further we'll find that most groups exhibit these sorts of inbred tendencies because of the prevalence of consanguinity?
Addendum: Modest levels of gene flow are very good at equilibrating and mitigating the build up of variation between groups. Islands, like Sardinia, often develop unique genetic profiles because water seems to be a powerful barrier to marriage connections. The Samaritans & Kalash have not had any gene flow in for a very long time, in both cases in part because of being embedded among Muslims who do not generally tolerate conversion to other religions, and in the case of the Kalash their geographic isolation. Some of the same issues apply to the Druze, though I suspect much more modestly (in part because Druze isolation is more recent).
If you read The Corner you know that John has been in Tucson for the Toward a Science of Consciousness conference the past week. He's now assembled his reflections.
Saturday, April 12, 2008
There was an interesting paper in BMC Genetics back in February: "Analysis of genetic variation in Ashkenazi Jews by high density SNP genotyping." They ran 500K Affy chips on 100 Ashkenazi women and on 60 CEPH-derived HapMap (CEU) individuals. They hoped to find greater levels of linkage disequilibrium and lower haplotype complexity among the Ashkenazim, as a putatively bottlenecked population. This would simplify some forms of genetic mapping. Some earlier work had suggested that this might be the case - but that earlier work had either looked at a single chromosome or at small samples from a number of chromosomes.
The expected pattern is not there. Average LD is very similar in the two populations, although it varies from chromosome to chromosome. It's slightly smaller among the Ashkenazi at short distances, slightly greater at longer distances, but overall very similar, as you can see.
There were somewhat _more_ haplotype blocks among the Ashkenazi sample, not fewer.
You would expect a bottlenecked population to have more monomorphic sites, but the Ashkenazi sample had noticeably fewer, 9.1 % versus 12.4 %.
Altogether, the paper concludes that "These data are more consistent with the AJ as an older, larger population than CEU. " Which means that there is no sign of any bottleneck in this data. The paper, obviously written by several people, _refers_ to several bottlenecks that have been discussed in earlier studies, but this measurement set contains thousands of times more data than those earlier studies. If there had been a bottleneck, they would have seen it, and if they don't see it, there must not have been one.
They see very significant gene frequency differences in a couple of fair-sized regions: LCT and HLA. Those differences were of course generated by selection. There are differences in smaller regions at a number of other positions, and long homozygous regions in the Ashkenazi sample average about 20% longer - so at least some of their long haplotypes are younger.
Fact: we find long haplotypes around the mutations causing common Ashkenazi diseases, on the order of one to ten Mb.
Bottlenecks affect the whole genome, but selection only affects a small fraction. Selection would not change genome-wide LD much, would not much increase the number of monomorphic sites, but it could generate long haplotypes around selected mutations.
The authors think that these differences "reflect the impact of both selection as well as genetic drift." - but there is, as far as I can tell, no evidence of drift in this data at all. Perhaps I'm missing something.
This SNP study (and others) also shows that Ashkenazim are genetically distinct from other Europeans, which allows fairly accurate identification of group membership. Almost perfectly distinct, if you look at Ashkenazim whose grandparents are all Ashkenazi (the violet dots). Obviously, there was low inward gene flow for a long time, but that has increased a lot in the last century. Distinct local selection pressures could have caused noticeable change when gene flow was that low.
Check out this figure, from a recent paper in PLOS Genetics ( Tian et al, Analysis and Application of European Genetic Substructure Using 300 K SNP Information):
Henry Harpending and I came to these same conclusions several years ago, using a far smaller data set: the evidence indicated low gene flow that would allow local selection, and we found no evidence for - indeed, solid evidence against - the kind of bottleneck that would explain the observed spectrum of genetic disease among the Ashkenazim. Which leaves selection as the only explanation - but selection for what?
Friday, April 11, 2008
Most people with an interest in genetics will be aware that Sewall Wright made major contributions to the theory of kinship or relatedness. Fewer people will have any direct knowledge of his work on the subject, and those who do consult his writings may find them difficult. The present note is intended to help those who want to tackle Wright at first hand. See also this evaluation by the geneticist W. G. Hill.
Most of Wright's key ideas on the subject were first presented in a 5-part paper on 'Systems of Mating' (SM) in 1921. All 5 parts can be found on the internet with a little searching. SM1, which is the most fundamental, is here, and SM5, which contains a relatively un-technical summary, is here.
Rather than go straight to Wright's own approach, I will begin by comparing and contrasting it with that of the French geneticist Gustave Malecot, based on the concept of Identity by Descent. Malecot first introduced his methods around 1940, and since then they have supplanted Wright's approach, to the extent that Wright's own methods have been almost forgotten. What is presented in textbooks as due to Wright is often in reality due to Malecot. The two approaches do have some similarities, and in simple cases they lead to the same quantitative results, but there are also some important differences.
Malecot and Identity by Descent
In Malecot's system two genes at the same locus, in the same or different individuals, are defined as Identical by Descent (IBD) if they are both descended from the very same individual ancestral gene, without either of them undergoing mutation in the interim. The relatedness between two individuals can be measured, roughly speaking, by calculating the probability that two genes at the same locus in the two individuals are IBD. To do this it is necessary first to identify all the distinct paths of descent connecting the two individuals through a common ancestor, and then to calculate the probability that the same gene will have descended to both individuals from that ancestor along any given path. Since all such paths of descent are mutually exclusive (though portions of them may overlap), the resulting probabilities can be added together to give the total probability that a given gene in the two individuals is IBD. To take a simple case, consider two individuals (full siblings) who have both parents in common. I assume that the parents are not related to each other or inbred. If we select a (diploid autosomal) gene at random from one sibling, there is a probability of one-half that it comes from the mother, and, if it does, a probability of one-half that the same gene has descended from the mother to the other sibling. This gives a compound probability of one-quarter that the second sibling has received a gene from the mother that is IBD to the selected gene in the first sibling. There is likewise a probability of one-quarter that the second sibling has received an IBD copy from the father. The total probability is therefore one-half, which is often called the Coefficient of Relationship or Relatedness between full siblings. If the parents are themselves related or inbred (i.e. descended from one of their own ancestors by more than one possible path), additional paths of descent need to be taken into account. 
Since there are two genes at the relevant locus in the second sibling, there is a probability of one-quarter (one-half times one-half) that a particular one of these genes, chosen at random, is IBD to the selected gene in the first sibling. This is usually known as their Coefficient of Kinship. If a male and female with a non-zero Coefficient of Kinship mate together, there is a non-zero probability that any offspring will inherit two genes that are IBD to each other. This is usually known as the offspring's Coefficient of Inbreeding, and a little consideration shows that it is equal to the Coefficient of Kinship of the parents.
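The path-counting arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming the standard textbook formula in which each path through a common ancestor A contributes (1/2)^(n1 + n2 + 1) x (1 + F_A) to the Coefficient of Kinship, where n1 and n2 are the numbers of generations from each individual back to A and F_A is A's own Coefficient of Inbreeding. The function name and the tuple layout are hypothetical conveniences, not anything from Malecot.

```python
from fractions import Fraction

def kinship(paths):
    """Coefficient of Kinship summed over paths through common ancestors.

    Each path is a tuple (n1, n2, f_ancestor): generations from each
    individual back to the common ancestor, and that ancestor's own
    inbreeding coefficient (0 if the ancestor is not inbred).
    """
    return sum(
        Fraction(1, 2) ** (n1 + n2 + 1) * (1 + f_ancestor)
        for n1, n2, f_ancestor in paths
    )

# Full siblings: two common ancestors (mother and father), one generation back.
full_sibs = [(1, 1, 0), (1, 1, 0)]
# Full first cousins: two common grandparents, two generations back.
first_cousins = [(2, 2, 0), (2, 2, 0)]

sib_kinship = kinship(full_sibs)          # 1/4
cousin_kinship = kinship(first_cousins)   # 1/16

# When neither individual is inbred, the Coefficient of Relationship
# is twice the Coefficient of Kinship.
sib_relationship = 2 * sib_kinship        # 1/2
cousin_relationship = 2 * cousin_kinship  # 1/8

# The inbreeding coefficient of an offspring equals the kinship of its parents,
# so the child of a full-sibling mating would have F = 1/4.
offspring_inbreeding = sib_kinship
```

Using exact fractions rather than floats keeps the results in the same form as the figures quoted in the text (one-half, one-quarter).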
A point left vague in some accounts is how far back the paths of ancestry can or should be traced. There would be little point in tracing them back so far that the gene would probably have mutated along the way to one or both descendants, but with a mutation rate of only about 1 in 100,000 per generation this is not a major constraint. In practice, ancestry is seldom traced back beyond five or six generations, as the probability of Identity by Descent along any given path going beyond this is very small (less than 1 in 1,000), and the aggregate probability along all such paths will usually be much the same for all individuals in the same population.
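A rough check of these orders of magnitude, assuming that each parent-offspring link along a path contributes a factor of one-half (the relationship-style weighting used for the sibling example) and that a path through an ancestor g generations back has 2g links. The function names are illustrative only.

```python
MUTATION_RATE = 1e-5  # per generation, the approximate figure quoted above

def path_ibd_probability(generations_back):
    """Contribution of one path through an ancestor this many generations back."""
    links = 2 * generations_back  # up to the ancestor and down again
    return 0.5 ** links

def prob_no_mutation(generations_back):
    """Chance that the gene escapes mutation on every link of the path."""
    return (1 - MUTATION_RATE) ** (2 * generations_back)

for g in range(1, 7):
    print(g, path_ibd_probability(g), prob_no_mutation(g))
```

On these assumptions a path through an ancestor five generations back contributes 1/1024, just under 1 in 1,000, while the chance of the gene surviving unmutated along the same path is still above 0.9998.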
Wright and the Correlation between Relatives
None of this is directly due to Sewall Wright. He does use path diagrams similar to those of Malecot (who was inspired by Wright's work), but the quantities measured along the paths are not probabilities of Identity by Descent but path coefficients. As discussed in my note on Wright's method of Path Analysis, the correlation between two variables can be derived from the path coefficients along the paths connecting them. The measures of relationship between two individuals in Wright's system are always in principle correlation coefficients. In simple cases (no inbreeding, no dominance, no assortative mating, and so on) they are quantitatively the same as Malecot's measures, but in principle they are quite different. Three important differences should be emphasised:
a) like all correlation coefficients, Wright's measures of relationship are valid only relative to a specified statistical population. The coefficient of relationship between two individuals may well vary according to the specified population; e.g. it may be different if the specified population is an ethnic group to which the individuals belong as compared with a population comprising several ethnic groups.
b) unlike probabilities, which are always positive, a correlation coefficient can be either positive or negative. In fact, although Wright seldom discusses negative relationships, within any specified population they are in principle as common as positive relationships.
c) relative to any specified population, the correlation between two randomly selected individuals from that population is zero (apart from sampling error). This point has sometimes been overlooked, for example in discussions of Hamilton's Rule. The 'r' in Hamilton's Rule should be a regression coefficient rather than a correlation coefficient (as Hamilton realised around 1970 - see Narrow Roads of Gene Land, vol. 1, p.179), but the same principle applies: the regression of one randomly selected individual on another randomly selected individual, relative to the population from which they are randomly selected, is approximately zero. Hamilton's Rule therefore predicts that altruistic behaviour will not be directed randomly towards all members of the relevant population, though it may be difficult to decide which population is 'relevant' for the purpose.
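Point (c) can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not Wright's own: one locus with two alleles at equal frequency, random mating, purely additive values (the number of A alleles carried), and families of two full siblings. All names and parameter values are hypothetical.

```python
import random

def pearson(xs, ys):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / (var_x * var_y) ** 0.5

def simulate(n_families=20000, freq_a=0.5, seed=1):
    rng = random.Random(seed)

    def parent():
        # Two alleles per parent; True stands for allele A.
        return (rng.random() < freq_a, rng.random() < freq_a)

    def child(mother, father):
        # Additive value: number of A alleles inherited (0, 1 or 2).
        return int(rng.choice(mother)) + int(rng.choice(father))

    sib1, sib2 = [], []
    for _ in range(n_families):
        mother, father = parent(), parent()
        sib1.append(child(mother, father))
        sib2.append(child(mother, father))

    r_siblings = pearson(sib1, sib2)
    # Pair each individual with a member of a different family:
    # effectively a random pair drawn from the same population.
    shifted = sib1[1:] + sib1[:1]
    r_random = pearson(sib1, shifted)
    return r_siblings, r_random

r_siblings, r_random = simulate()
print(r_siblings, r_random)
```

With these settings the sibling correlation should come out close to one-half and the random-pair correlation close to zero, both measured relative to the same simulated population.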
I emphasise these points partly because Wright himself does not. They are implicit in the use of correlation coefficients, but Wright seldom explicitly mentions them. An exception is in SM5, where Wright points out that the correlation between relatives within an inbred line will be small although relative to the wider population it is large. Some more general statements are made in Wright's late work on Evolution and the Genetics of Populations (EGP). In volume 2 of that work (1969) he says that 'In a panmictic [randomly mating] population, there is no correlation between homologous genes of uniting gametes relative to the gene frequencies in the whole population. On splitting up into small lines which breed within themselves, a correlation between uniting gametes is to be expected.... The relativity referred to above has sometimes been overlooked or misinterpreted. A correlation coefficient is, of course, always relative. It is a property of the population as well as the two variables....' (pp.175-77.) Wright goes on to discuss Malecot's method of Identity by Descent. He accepts that it is a useful technique and often leads to the same results as his own, but argues that his own approach is more general and in particular that his own concept of relationship allows for negative values.
Wright is often vague about the population in which the correlations are to be measured, leaving this to be inferred from the context. Sometimes the relevant population is the entire generation to which the correlated individuals belong, sometimes it is a defined sub-population, but sometimes it seems to be a 'foundation stock' from which they are descended. This is problematic, as it seems to require a correlation between individuals relative to the means and standard deviations in a population to which they do not themselves belong. I will discuss this further in dealing with Wright's work on inbreeding and genetic diversity.
Correlations between notional values
Wright was not the first person to work on the correlation between relatives. Unknown to Wright, R. A. Fisher had already treated the subject at length, by different methods, in 1918. In fact, the subject goes back at least to 1904, when Karl Pearson considered the correlations to be expected on the hypothesis of Mendelian dominant inheritance. He found that (on certain simplified assumptions) the correlation between parent and offspring would be only one-third, rather than the correlation of about one-half usually found in empirical data on human traits. Pearson considered this a serious objection to the generality of the Mendelian theory. One of the aims of Fisher's 1918 paper was to show that, when complications such as assortative mating were taken into account, the data were consistent with widespread Mendelian dominance.
The idea of a correlation between relatives is intelligible enough when the correlation involves continuous phenotypic traits such as height, but it is more obscure when the traits are purely qualitative, or when the correlation is not between phenotypes but between gametes or genotypes. If there are varying types of gametes or genotypes (e.g. different alleles at a locus) in the population, they may be said to be positively associated if the same types tend to occur together, more often than would be expected by chance, in the same individual or in certain pairs of individuals. There are several useful measures of the 'association' of qualitative variables (see any edition of G. U. Yule's Introduction to the Theory of Statistics). However, Wright (like his predecessors) preferred to use the Pearson product-moment correlation coefficient. To obtain a Pearson correlation coefficient in the case of purely qualitative variables, such as differences between alleles, it is necessary to give the correlated items notional algebraic or numerical values. Since these are to some extent arbitrary, it might be feared that this would introduce an arbitrary element into the results, but in the cases of interest the arbitrary values cancel out and leave the correlation coefficient itself unaffected.
The procedure can be illustrated by the problem of dominance, which is treated by Wright in SM1, page 117-8. If we assign the homozygotes AA and aa the arbitrary values 1 and 0 respectively, in the case of complete dominance of A, the heterozygote Aa will have the value 1, while in the case of zero dominance it will have the value 1/2. Each individual in the population will therefore have a pair of numerical values, under the assumptions of dominance and non-dominance respectively. For homozygotes the two values will be the same but for heterozygotes they will be different. If the frequencies of the various genotypes in the population are specified, the means and standard deviations of the numerical values can be calculated, and the covariance and the correlation coefficient between the pairs of values can then be derived in the usual way. The correlation coefficient will be unaffected if one or both variables are systematically multiplied by or added to a constant (see Notes on Correlation, Part 2). But this entails that we would get the same correlation if we chose any other set of arbitrary values as alternatives to 0 and 1, provided the value of the heterozygote in the absence of dominance is half-way between that of the homozygotes. We can therefore obtain a quite general result for the correlation between the values of genotypes with and without dominance. (Of course, correlations could be calculated in a similar way on different assumptions about dominance, e.g. for partial dominance.) It can be shown by this method that Wright's results at the bottom of page 117 are correct, though I do not see how Wright derived his particular formulae, which are far from obvious. [As I have mentioned elsewhere, the equation p = root-uv appears to be a printing error or slip of the pen, as under Hardy-Weinberg equilibrium it should be p = 2root-uv. 
In fact, I now find that this error was listed in the printed Corrigenda to the relevant volume of Genetics but has not been corrected in the pdf copy.]
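The claim that the arbitrary values cancel out can be checked numerically. The sketch below assumes genotype frequencies are supplied in the order (AA, Aa, aa) and uses hypothetical function names; it is not a reconstruction of Wright's own derivation, only the "usual way" of computing a correlation from frequencies and assigned values.

```python
def weighted_correlation(freqs, xs, ys):
    """Pearson correlation of two value lists weighted by genotype frequency."""
    mean_x = sum(f * x for f, x in zip(freqs, xs))
    mean_y = sum(f * y for f, y in zip(freqs, ys))
    cov = sum(f * (x - mean_x) * (y - mean_y) for f, x, y in zip(freqs, xs, ys))
    var_x = sum(f * (x - mean_x) ** 2 for f, x in zip(freqs, xs))
    var_y = sum(f * (y - mean_y) ** 2 for f, y in zip(freqs, ys))
    return cov / (var_x * var_y) ** 0.5

def dominance_correlation(freqs, low=0.0, high=1.0):
    """Correlation between genotypic values without and with dominance of A.

    low and high are the arbitrary values given to aa and AA.
    """
    midpoint = (low + high) / 2
    without_dominance = (high, midpoint, low)  # heterozygote half-way
    with_dominance = (high, high, low)         # heterozygote equals AA
    return weighted_correlation(freqs, without_dominance, with_dominance)

# Random mating with equal allele frequencies: AA 1/4, Aa 1/2, aa 1/4.
equal_freqs = (0.25, 0.5, 0.25)
r_default = dominance_correlation(equal_freqs)                     # values 0 and 1
r_rescaled = dominance_correlation(equal_freqs, low=-3.0, high=7.0)
print(r_default, r_rescaled)
```

In the equal-frequency case both calls give about 0.816 (root-2/3), whatever pair of values is chosen for the homozygotes.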
Systems of Mating I
I will conclude this note with some comments on Wright's most important paper on the subject: the first in the series on Systems of Mating (SM1).
Here Wright uses his method of path analysis to derive the correlation between relatives. In principle the ultimate result is a correlation between phenotypes, which should take account of all environmental and genetic influences, including dominance, epistasis, assortative mating, and shared environment (if any).
While the method of path analysis has some advantages for this purpose, which Wright emphasised, it also has some disadvantages. The variability among individuals is partly due to the chance effects of genetic recombination and segregation. It is therefore necessary for the path diagrams to contain an independent variable designated as 'chance' (see the diagram in SM1, p.116), which may be formally justified but still looks odd. More importantly, the method of path analysis assumes that the effects of causal influences can be simply added together. In genetics this is not always the case, as the effects of epistasis and dominance are not purely additive. Wright therefore excludes epistasis from his model 'for the present' (p.117). He does attempt to incorporate an adjustment for the effects of dominance, but this is not entirely successful. For the time being I will assume that the method is confined to the additive effects of genes.
It is not always clear what is the relevant population for the purposes of the correlations, especially as more than one generation of individuals are often involved in the correlations. Wright seems to assume (see the beginning of SM4) that in the absence of selection the proportions of different alleles in the total population will be constant, but in a finite population this cannot be strictly true, as there will be fluctuation due to genetic drift. Perhaps Wright is assuming for the purpose that the population can be regarded as indefinitely large. In this case it is legitimate to assume that gene frequencies in the absence of selection are constant. More seriously, it is not clear whether the intended reference population is the current population of each generation, the 'foundation stock' from which they are descended, or some combination of the two. Wright's reference to 'random mating' at the top of page 119 of SM1 would not make much sense if the intended reference population is the current one (of the parents), since f' would then always be zero.
Each path of descent is built up from the links between parent and offspring, so this relationship is especially important. In Wright's analysis (page 118-20) the direct relationship between parent and offspring can be analysed as a path with the following steps: parent's phenotype - parent's genotype - gamete (egg or sperm) - offspring's genotype - offspring's phenotype. (If the offspring's two parents have a non-zero correlation, an indirect path via the other parent also needs to be considered.) The path coefficients along the direct path from parent to offspring can be represented in the form hbah, where h represents the correlations between the phenotypes and genotypes of the parent and offspring (which may be different). The correlation coefficient can be considered a measure of broad heritability, that is, the extent to which the individual's phenotype is determined by the genotype. Its square, h^2, measures the proportion of phenotypic variance accounted for by genetic variance. This is historically the origin of the familiar use of h^2 to represent heritability. It should however be noted that Wright's usage is not quite the same as the modern one. In modern usage h^2 usually stands for narrow or additive heritability, measured by the extent to which the offspring predictably resemble the parents. Wright's h^2 is closer to the modern concept of broad heritability, as it measures the extent to which the phenotype of an individual is determined by its genotype. The key equation (p.116) is h^2 + d^2 + e^2 = 1, where h stands for all aspects of genetic heredity, and e and d stand for predictable effects of the environment and random fluctuations in development.
The coefficients a and b are the path coefficients representing, respectively, the contribution of the gamete (egg or sperm) to the variance in the genotype of the offspring, and the contribution of the parental genotype to the variance in the gametes. As none of these entities have a measurable phenotypic value, it is necessary to assume that they have arbitrary algebraic or numerical values, in the way discussed above. Wright's derivation of the values of a and b (SM1, pp.118-19) is particularly important, and needs to be carefully studied. Unfortunately it is not easy to follow. I would offer two tips. First, it is essential to refer frequently to the path diagram on page 116, without which the derivation would be unintelligible. Second, Wright does not explain why pG.H'' = rG.H'', which is crucial to the validity of the proof. I think it follows from the fact that the only causal path from the parental genotype to the gamete is the direct path pG.H''. [Added: having written this, I am pleased to find that Wright gives this explanation in another article.]
It should be noted that if the parents are unrelated and not inbred, a and b are both equal to root-1/2, so the product ab along the path from parent to offspring in this case equals one-half, as in Malecot's method.
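As a small worked example of the product hbah along the direct path, assuming unrelated, non-inbred parents so that a and b both equal root-1/2. The function name and the sample values of h are illustrative only.

```python
import math

def parent_offspring_correlation(h_parent, h_offspring,
                                 a=math.sqrt(0.5), b=math.sqrt(0.5)):
    """Product of path coefficients along
    parent phenotype - parent genotype - gamete - offspring genotype - offspring phenotype.
    """
    return h_parent * b * a * h_offspring

# Phenotype fully determined by genotype (h = 1): correlation of one-half.
r_full = parent_offspring_correlation(1.0, 1.0)

# h^2 = 0.64 in both generations (h = 0.8): correlation of h^2 / 2 = 0.32.
r_partial = parent_offspring_correlation(0.8, 0.8)

print(r_full, r_partial)
```

When h is the same in both generations the result is simply h^2 times one-half, which is why the phenotypic correlation between parent and offspring falls below Malecot's one-half whenever h^2 is less than 1.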
It may perhaps be felt that Wright's derivation of the path coefficient b is a trick with smoke and mirrors. It is mathematically valid, but Wright's claim that 'in a sense, it is legitimate to reverse the arrows....' invites the response that in another sense it is not legitimate, since there is no causal influence from the gametes back to the gametocyte. This part of the proof therefore goes against the spirit if not the mathematical letter of path analysis.
At the top of page 120 Wright explains, very tersely, how correlations between relatives can be derived from the path coefficients. Again, it should be noted that in simple cases, and with perfect additive heritability, the results are the same as Malecot's. Wright then attempts to take account of dominance. As noted above, on pages 117-8 of SM1 Wright gives formulae for the correlation between genotypic values with and without dominance. In the standard case of random mating the correlation comes out at root-1/(1+p), where p is the proportion of heterozygotes in the population. To adjust the correlations between relatives to allow for dominance, Wright multiplies them by 1/(1+p). He does not explain the logic behind this, but I think it is that each of the two correlated relatives has a genotypic value without dominance, which is the basis for the original correlation, and that these values can each be multiplied by root-1/(1+p) to give a typical adjusted correlation between the values with dominance. The effect is to reduce the correlation between the individuals by the factor 1/(1+p). It may perhaps be wondered why only the two individuals at each end of the chain, and not the intermediate individuals, have their values adjusted. I think the explanation is that dominance is essentially an effect on phenotypes rather than genotypes, and in calculating the correlation between the individuals at the ends of the chain we need not take account of dominance effects on intermediate phenotypes any more than we need take account of environmental effects on them, since these do not affect the path coefficients along the chain.
Unfortunately Wright discovered, after reading Fisher's 1918 paper, that except in the case of half-siblings his own treatment of dominance effects was invalid, and in a footnote to his famous 1931 paper on 'Evolution in Mendelian Populations' he withdrew it. His original method therefore never satisfactorily covered epistasis and dominance. He later attempted to incorporate a revised treatment of dominance in his method of path analysis, but the result was very complicated. [See EGP vol 2., p435-6.] In this area Fisher's Analysis of Variance has been more generally used. The method of path diagrams remains very useful for the analysis of relationships, but the paths are now usually interpreted in Malecot's fashion as probabilities of Identity by Descent, and not as correlations.
The Problem of Negative and Zero Correlations
I emphasised earlier that in Wright's system the correlations between relatives, and therefore the measures of relatedness, can be zero or even negative. Yet it seems that Wright's actual procedures for measuring relatedness, by tracing path coefficients back through common ancestors, can only produce positive figures. For example, suppose that on average two randomly chosen members of a population have a degree of relatedness, measured by Identity by Descent within, say, the last thousand years, equivalent to that of full first cousins, i.e. a Malecot Coefficient of Relationship of one-eighth. On the face of it, if we trace back the paths of descent using Wright's methods, and work out the path coefficients, assuming complete additive heritability, the result will be a correlation of one-eighth, numerically equivalent to the Malecot coefficient. But the correlation coefficient between randomly selected members of a population, relative to that population as a whole, must be approximately zero. We therefore seem to have a contradiction.
It took me a while to see how this paradox can be resolved. I think the main explanation [see Note] is that in the usual applications of Wright's methods there is a tacit assumption that only the paths leading through common ancestors need be taken into account. All other paths can be regarded merely as background noise. For example, if we trace the paths between two full first cousins, we need only take into account the paths leading through the two grandparents they have in common, and not the other four grandparents, unless some of these lead back to other common ancestors in the fairly recent past. Ordinarily this is a reasonable approach, but it breaks down if it is applied to the kind of case referred to in the last paragraph. If we trace back the entire ancestry of two randomly chosen individuals, for some large number of generations, the ancestors will have a mixture of positive and negative correlations between them. The positive and negative correlations will (approximately) cancel out. In a complete path analysis all these correlations would need to be taken into account, even if they do not involve a direct path through a common ancestor. When properly interpreted, Wright's methods therefore do not lead to a contradiction.
I had originally planned to go on to consider the extension of Wright's measures of kinship to the relations between populations, such as his well-known FST statistic. But the post is already long, so I will reserve the subject for another time.
Note: I say the main explanation, because the effect of common ancestry itself may also be reduced when we take account of negative correlations. For example, in the case of cousins with two common grandparents, these two grandparents may be negatively correlated, in which case the indirect path running through both of them would have a negative value. Or a common ancestor might have a negative coefficient of inbreeding (i.e. be less inbred than average for the population), which would reduce the path coefficient from parental genotype to gamete. But as far as I can see, these factors would never be sufficient to offset the positive correlations due to common ancestry entirely. It is therefore also necessary to take account of negative correlations between non-common ancestors.
Thursday, April 10, 2008
Pajamas Media has a post up, Muslims Leaving Islam in Droves, which seems to be getting a bit of linkage. There's a lot of weird stuff in this post, so I figured I'd offer a little quick commentary on the assertions and data. I'm not going to do detailed citations at this point of why I believe what I believe in the interests of time, but if you dig deeper into the ethnography I think you'll see that I'm not making things up.
First, there's the assertion of mass conversions from Islam to Christianity in Africa. The link provided with an Al-Jazeerah transcript (translated) suggests that Ahmad al-Qataani, leader of the Companions Lighthouse for the Science of Islamic Law in Libya, is either stupid or mendacious. There are a lot of wacko contentions, but the big picture is this: in 1900 Africa was a predominantly pagan continent. Even regions which had long been historically dominated by Muslim elites, such as Senegal, were only lightly Islamicized at the level of the populace. In other words, institutional Islam has very shallow roots in much of Sub-Saharan Africa where it has historically been the only high religion. One can infer this from the fact that in East Africa the coastal margins were dominated by Muslim entrepots, and yet the majority of the population today is Christian in states such as Mozambique, Kenya and Tanzania. Why? Because for whatever reason Muslims did not convert the interior tribes (I suspect that the fact that these peoples were a source of slaves as pagans, but would be forbidden if Muslims, might have played some role). An analogy might be Scandinavia in the late 10th century, when some warlords had converted to Christianity (e.g., Harald Bluetooth) and Christians were a presence as a minority across many regions, but paganism was still the dominant religion.
Since 1900 the proportion of Muslims has increased, but the proportion of Christians has increased far faster. Whereas the ratio of Muslims to Christians was lopsided in favor of Muslims in 1900 (with most Christians resident in Ethiopia), today there are more Christians in Sub-Saharan Africa. In Southern and interior Central and East Africa the dominance of Christianity should be no surprise; Islam never penetrated these regions except in the form of the occasional trader, slave or otherwise. In contrast, in West Africa and in the Horn of Africa Islam arrived as an elite religion of the courts, a vector for high civilization (converting Nubia, almost conquering Ethiopia). But one needs to remember that the presence of Islam in Nigeria or the Guinea coast was never equivalent to that in Algeria or Egypt; Kambiz tells me that Muslim women in Ethiopia go topless on occasion. I think that tells you all you need to know about the penetration of Islamic values into many of these societies. The arrival of European colonialism resulted in a new avenue toward assimilation into a high culture which had nothing to do with Islam, and since 1950 the "forest zone" in much of West Africa has been Christianized. The fact that a long serving president of Benin converted from Christianity to Islam to Christianity again should illustrate the fluidity of religion in Sub-Saharan Africa (I suspect American readers might appreciate the protean & personal nature of religious affiliation in much of Sub-Saharan Africa better than Europeans or Asians).
The article also has out-of-control fantasies by Christian evangelists:
Although al-Qataani points to Africa, there is another phenomenon based on repulsion from Islamist dictatorship, corruption, and terrorist violence. In Iran as many as 1 million people have surreptitiously converted to Evangelical Christianity in the last five years. Pastor Hormoz Shariat claims to have converted 50,000 of them through his U.S.-based Farsi-language satellite ministry. He contrasts the upswing to the efforts of evangelical missionaries in Iran between 1830 and 1979, whose 149 years of work built a Christian community of only 3,000. One Iranian religious scholar believes youth are abandoning Islam because it is identified with the corrupt Iranian government. Now the Iranian Majlis (parliament) is debating the death penalty for conversion.
It's not impossible that there might be 1 million crypto-Christians in Iran, but do note this is a nation of 71 million. I'm sure I have enough Iranian readers to get a sense of these sorts of claims because if there really are 1 million crypto-Christians most Iranian Americans should know of them through their extended families, right? The exuberance of Christian evangelists is understandable, but the media tends to be way too credulous. Remember that some evangelical Christians claim there are over 100 million Christians in China, though surveys suggest considerably less (though more than the Chinese government admits). There are also anecdotal accounts of how hostile to Islam some Iraqis are now that Shia clericalism has somewhat of an influence. There's a problem with this though: a disproportionate number of emigrants from Iraq today are from its ancient Christian communities. It seems rather tasteless to fan flames over likely non-existent potentials to convert Iraqi Muslims to Christianity when the indigenous Christians are being driven out, and it seems that we are seeing the last generation of Christianity in Iraq (I am very skeptical that the Chaldaean Diaspora in Sweden will flock back to Iraq once it is more stable, just as the Church of the East Diaspora in the United States did not return after the expulsions of the early 20th century).
The rest of the article alludes to apostasy and conversion to Christianity in Russia, Europe and other parts of the world. I suspect the numbers for Malaysia are a bit exaggerated, especially since the source is a mufti who likely wants to justify a more aggressive role for his office, but secularization has been attested for French citizens whose families are traditionally Muslim, and Russia has a long history of converting and assimilating "Tatars" into its population. A portion of the noble Russian boyar class were derived from the elites of Turkic peoples who were brought into the fold of the expanding Empire. In places like Albania the population is predominantly secular and Christians, Hare Krishnas and Muslims are all attempting to find converts in the population.
In any case, I suspect the article was meant as a propaganda piece. I suppose it is important to rally the troops...but I'm generally not too fond of making stuff up, since that sort of behavior tends to come back and bite you. I also think some people will take it a bit too literally so I wanted to clarify a few issues....
Note: If you are interested in a scholarly exposition of the data, Philip Jenkins' books are pretty good. He's pro-Christian, but he is pretty good about not making stuff up or deceiving readers.
This is an open thread where you can post links or pointers to books & papers you think might be interesting to those who read this weblog in comments. I generally get a lot of good pointers via comments, but if this works out I'll just post this every month or so. I'll leave the scope of the request to your discretion, though if you are a regular reader you know the purview of this weblog.
I've criticized economists for being a bit cavalier about nutritional basics before. A comment below points me to this working paper, Agricultural Specialization and Health in Ancient and Medieval Europe:
It has been argued that protein-rich milk and beef are major determinants of the biological standard of living for societies of the late 18th and 19th centuries: a high local supply of milk lead to better nutrition and taller stature (which is correlated with health and longevity), even if purchasing power is not necessarily high: The shadow price of milk was extremely low, because this food item could not be shipped, but was used for subsistence (and the butter was sold). In this paper we consider this proximity-to-protein production effect in ancient and medieval Europe. The decisive protein production can be traced quantitatively for the first time using a sample of 2,059,689 animal bones. The share of cattle bones is ceteris paribus an indicator of milk (and beef) supply, especially if controlling for population density. We compare information on cattle bone share with estimates of heights in three European regions (Mediterranean, Northeast, Centralwest) for the 1st to 17th century A.D. In an experiment, we suggest height estimates for today's Turkey, Greece, the Near East and Egypt during antiquity, based on the regression formulas we find.
A paper with a lot of data which I found fascinating. But...I have to be somewhat skeptical of this note:
Lactose intolerance was probably not a decisive limiting factor in Europe. Crotty (2001) emphasized the importance of lactose intolerance in his bold attempt to explain the evolution of capitalism based on cattle farming patterns. Crotty argued that lactose-intolerant people could not make sufficient use of cattle. Lactose intolerance means that many people in the world have digestive problems, if they do drink large quantities of milk after age 5-7, because at that age genetically lactose intolerant people loose their ability to digest fresh milk without diarrhoea and similar problems. Especially East Asians (east of Tibet and Rajasthan), American Indians and some African people have problems with lactose intolerance. For Southern Europe, the results are mixed - one study on Spain categorized the country into the highest group of lactose tolerance (70 % and more lactose intolerance) and a Greek study found a middle position (30 - 70 % lactose intolerance); whereas in Italy and Turkey less than 30 % were classified as lactose tolerant (see Mace et al., 2003). But even lactose intolerant people can digest modified milk as Kefir, Lassi and similar products. Moreover, all people can drink about a cup of milk per day if they train their intestinal bacteria to live in a milk environment. Even many South Koreans today consume some milk, using this method of permanent training. We thank Barry Bogin, Anthropology Department U. Michigan/ Dearborn, and S. Pak, Seoul National U., for hints.
I've posted lactose tolerance rates across regions before; they vary a great deal. It is certainly true that milk is not cyanide for lactose intolerant individuals, but I don't think we should soft pedal the ramifications of the development of adult lactase persistence. Its evolution was due to one of the most powerful and recent selective events in our species' genetic history. Many of the Eurasian alleles seem to be descended from a recent common ancestor. Some of the African alleles are likely to be independent. The West Asian alleles also seem derived from an independent mutational form distinct from that of the more widespread Eurasian variant.
This isn't to deny that milk is a great source of nutrition, which might be an important variable in explaining variation in height across time & space. And I don't dismiss the R² they can produce. But the geography of genes in this case strongly implies that a lot of local ecological adaptation has been at work, and it should be included in these models as opposed to brushed aside, e.g.:
...We provide two new lines of genetic evidence that this long, common haplotype arose rapidly due to recent selection: (1) by use of the traditional FST measure and a novel test based on pexcess, we demonstrate large frequency differences among populations for the persistence-associated markers and for flanking markers throughout the haplotype, and (2) we show that the haplotype is unusually long, given its high frequency-a hallmark of recent selection. We estimate that strong selection occurred within the past 5,000-10,000 years, consistent with an advantage to lactase persistence in the setting of dairy farming; the signals of selection we observe are among the strongest yet seen for any gene in the genome.
I don't know much economics or economic history, but I do know a little human population genetics, so I'm biased in hoping that everyone else gets hooked into this field. But I also believe that economic historians should be aware of the fact that the evolution of lactase persistence is one of the best case studies for recent gene-culture coevolution. One should be cautious of assuming that the maximal utilization of cattle as milk producers is purely a function of economic or social conditions (though the long term impact of those economic and social conditions does count for a great deal). Here's a salient point from A Map of Recent Positive Selection in the Human Genome:
An important type of selective pressure that has confronted modern humans is the transition to novel food sources with the advent of agriculture and the colonization of new habitats...As noted above, we see a strong signal of selection in the alcohol dehydrogenase (ADH) cluster in East Asians, including the third longest haplotype around a high frequency allele in East Asians. A variety of genes involved in carbohydrate metabolism have evidence for recent selection, including genes involved in metabolizing mannose (MAN2A1 in Yoruba and East Asians), sucrose (SI in East Asians), and lactose (LCT in Europeans). Processing of dietary fatty acids is another system with signals of strong selection, including uptake (SLC27A4 and PPARD in Europeans), oxidation (SLC25A20 in East Asians) and regulation (NCOA1 in Yoruba and LEPR in East Asians). The latter gene (LEPR) is the leptin receptor and plays an important role in regulating adipose tissue mass.
Since then there's been the CNV & amylase work....
Wednesday, April 09, 2008
Check out this story about scientists who use drugs like Ritalin to get an extra edge. I'm not too interested in the problems with the methodology of the survey. Rather, I wonder, what do you take? And why?
Tuesday, April 08, 2008
Update: Added a chart.
One of the major themes of the past few decades has been the perception that greater cultural homogenization is occurring because of globalization, which is enabled by the changes in technological and institutional parameters. Shared material culture & values may piggyback along the cresting wave of economic integration and growth. An extremely optimistic model might be that we are seeing the emergence of a vast world market unified by a common set of mediating institutions and core values. There is obviously something to this. A substantial number of Muslims defend their religion's feminist credentials and decry polygyny, while Buddhists reframe their own independent tradition as an elucidation of a universal rational spiritual tradition. These responses show the power of Western culture in setting the terms of debate. But these general trends need to be tempered by an attention to the details, the specifics of which may not entail the results in all cases which our general framework would lead us to expect.
Consider the issue of language. The consistent belly-aching over the mass extinction of obscure languages is just the latest chapter in thousands of years of linguistic winnowing. Today the Iberian peninsula is home to a group of related languages aside from Basque. 2,000 years ago it hosted tongues of disparate families; Basque, Celtic, Latin, Punic and a medley of southern Iberian languages such as Tartessian. With the extinction of most and the emergence of a few large blocks one may perhaps argue that there is more discontinuity, not less, when it comes to speech. The logic here is that a welter of dialects would tend to fade into each other, and even when there would be a "jump" across language families (e.g., Finnic to Slavic) there would be a greater number of mediating dialects sharing lexical features to facilitate cross-fertilization. With the rise of nation-states and the expansion of originally narrow dialects into lingua francas which quickly monopolize the public spaces (e.g., modern Italian and French as descendants of particular Florentine or Parisian dialects) these intermediary variants no longer play their roles. Oligopolies of languages sponsored by nation-states force bridge dialects to fade to the margins. What are bridge dialects? Catalan and Occitan are two that I have in mind. Because of the decentralized nature of the modern Spanish polity the former looks like it may have a future, but the latter is slowly being crushed by the dominance of French.
Though language is emotionally salient for many, that is really not what I had in mind. In The Clash of Civilizations and the Remaking of World Order Samuel Huntington presented a thesis which used religion as the major organizing principle around which societies cohere. I am willing to accept this more or less (though language is obviously a major fissure as well). I have argued before that communication improvements are a major reason that I believe Islam is becoming more centralized in terms of belief and practice; the ummah is realizing its unity much more concretely than in the past. Recently I was reading a history of Burma, and the author noted that in the past many Muslims who were in areas where they were a minority were difficult to distinguish from non-Muslims. Most of their practices were similar to their neighbors, and they did not dress any differently, men and women prayed in a mixed setting etc. Much the same could be said of 19th century Bengal, where the outlook of Muslim and Hindu peasants didn't differ greatly and veneration of Hindu and Sufi saints bled into each other, resulting in an operationally syncretistic milieu, the perfect matrix for groups like the Baul to operate and receive patronage. Among abangan Muslims in Java the Ramayana remains very popular. In China the Hui Muslim intellectuals of the 18th century justified the high status of their religion on Confucian principles. In Vietnam the Cham Muslims were known to syncretize their Islam with that of the Mahayana Buddhism of their Vietnamese neighbors. The examples are endless, and one can generalize beyond Islam in South and East Asia.
Things have changed a great deal. In many of these regions Islam has gone through periods of "reform" and newfound adherence to "orthodoxy." I suspect that santri Muslims in Java would assert that the spread of their form of Islam simply has to do with education; their Islam is the more authentic Islam, that of the abangan is debased weak tea. In China ties with the West enabled by modern transportation (broadly construed) resulted in a rethinking of the Hui relationship with the majority culture; instead of Confucius as the arbiter of correct thought they began to look to Muslim eminences from Southwest Asia as their authentic sages. In Kerala in South India Yemeni ulema who were reforming the Islam of that region instructed peasant women to no longer go topless as had been their custom when working in the fields. What you see here is a tightening of the ship, a purging and paring back of heterodoxy, heresy and laxity allowed and engendered by isolation.
Or do you? There aren't any black & white answers here; I don't think one can totally deny the thesis that the early texts of Islam reflect an Arab society at variance with assimilative dynamics manifest on the margins of the Muslim world. But there may be less to the texts than meets the eye. When reading about Burmese Muslims, or Hui Muslims, and so on, I was struck by the lack of rationalization they seemed to need for the fact that they were subordinated to non-Muslim rulers and populations. Their minority status was taken as a given, and they freely integrated themselves into a non-Muslim order (e.g., Burmese Muslims who served as soldiers, or Hui who entered the bureaucracy via the examination system). To some extent this contrasts with the pro forma nods to propriety near the "center" of the Muslim world; the fact that the Emirate of Granada was a vassal to Christian powers for centuries was long cause for some concern in the domain of political theory. Muslims in the Russian Empire engaged in soul searching as to whether it was acceptable to render unto the Orthodox Christian Tsarina (Catherine the Great). The logic was simply that of jihad and domination; the only peace was that which prevailed under Islamic dominion. That was the argument, but it was breached and contradicted by practice rather early on.
But why did this argument not seem to come up in some lands where Muslims were a small minority? Clearly there is the issue of practicality. There was no question that the Muslims of Burma were in no position to make demands or wage war against the non-Muslim majority. But, going back to my emphasis on communication and identification there was less of an exemplar of extensive Muslim states which expunge pluralism through a process of cultural attrition. Certainly India came close, but the reality remained that it was a primarily Hindu realm demographically, and the Muslim masses of Bengal were only notionally Islamicized during most of history. The apologia offered by the Emirate of Granada and the Tatars who remained within the Russian Empire was necessary because of the affinity & identification with polities where the dominionist narrative was taken for granted. Specifically, the Ottomans offered refuge to any Muslims who emigrated south into their lands, and the Sultan more or less saw himself as the natural lord of the Muslims of Russia. Tatars who remained within a Christian Empire and integrated did so despite the option of emigration or passive resistance and continued loyalty to the Sultan. The Emirate of Granada had successful models of the triumph of the eternal jihad across the Straits of Gibraltar in the Muslim polities of the Maghreb.
Today the information umbrella of the ummah spans the whole globe. Chinese Muslims are no longer ignorant of the currents of change and conformity in the rest of the Islamic world; rather, they are part of the discussion. But as they shift their marginal units of attention to the broader debates in the Muslim world they decrease the attention spent engaging their non-Muslim neighbors. These sorts of processes are complex; note that there is evidence that 19th century reformist Islamic movements in many parts of China succeeded when they used indigenous mythical formula. The paradox is that on the practical level Chinese means were the most efficient method to arrive at the ends of identification of Muslims as distinct from their non-Muslim Chinese neighbors! I bring this up to caution that even if there is a distinct tendency for many Muslims around the world to assert that they are concurrently moving toward a reassertion of 7th century Islamic values, that may not truly be the reality. This goes to emphasizing that despite the anti-liberal ethos of most Islamic fundamentalist movements, their origins, methods and to some extent practical outcomes, imply that substantively they are the product of dynamics of the last few centuries no matter their late antique packaging & marketing. The ubiquity of modern technology within Islamist circles may not be so aberrant or mercenary, but rather hint at structural features at sharp variance with their public propaganda and self-images.
But packaging matters. When the Muslim women of Kerala began wearing blouses some of their Hindu landlords objected that they were putting on airs. When some of these landlords forced the women to revert to their old style of dress their menfolk rebelled and killed them (these were not sui generis in this part of India, the same incidents occurred between landlords and low caste groups, but without the religious valence). Amartya Sen has objected to the emphasis on the Islamic identity of Bangladeshis in the United Kingdom to the exclusion of their Bengaliness, a dimension which they share with Sen (a culturally Hindu Bengali). I suspect though that Sen's objection may be in vain; perhaps the multi-textured demographic landscape is going to cede ground to the religious oligopolies of the future? The very rugged and chaotic nature of the phenotypic space which cultures had previously explored might have served as a buffer to massive seismic collisions which are now going to be inevitable in the world of crashing cultural plates.
The chart to the left illustrates what I'm talking about. Imagine a bounded region, and variation along a character (e.g., % of red-meat derived protein in diet). The further you go back in time the more local variation you tend to see. As you move closer to the present there is "cultural consolidation."
Monday, April 07, 2008
PULITZER WINNER: Harmon of 'NYT' Studied DNA After Birth of Child . Recall that Harmon interviewed a contributor to this weblog as well as Half Sigma for a recent article.
Via Jonathan Eisen.
Labels: human biodiversity
When it comes to association studies population substructure is something you have to keep in mind, but what about age? On the Replication of Genetic Associations: Timing Can Be Everything!:
The failure of researchers to replicate genetic-association findings is most commonly attributed to insufficient statistical power, population stratification, or various forms of between-study heterogeneity or environmental influences...Here, we illustrate another potential cause for nonreplications that has so far not received much attention in the literature. We illustrate that the strength of a genetic effect can vary by age, causing "age-varying associations." If not taken into account during the design and the analysis of a study, age-varying genetic associations can cause nonreplication. By using the 100K SNP scan of the Framingham Heart Study, we identified an age-varying association between a SNP in ROBO1 and obesity and hypothesized an age-gene interaction. This finding was followed up in eight independent samples comprising 13,584 individuals. The association was replicated in five of the eight studies, showing an age-dependent relationship...Furthermore, this study illustrates that it is difficult for cross-sectional study designs to detect age-varying associations. If the specifics of age- or time-varying genetic effects are not considered in the selection of both the follow-up samples and in the statistical analysis, important genetic associations may be missed.
More digestible summary at ScienceDaily.
Sunday, April 06, 2008
Identification of ten loci associated with height highlights new biological pathways in human growth & Genome-wide association analysis identifies 20 loci that influence adult height. ScienceDaily has a long review.
Update: I was pretty sure Genetic Future would hit this, so I didn't say much. Well, here's what he notes:
ScienceDaily puts a positive spin on the story ("Scientists are beginning to develop a clearer picture of what makes some people stand head and shoulders above the rest"), but the real story is this: despite the massive scale of these studies, they're still only capturing less than 5% of the total variance in a trait that is almost entirely (90%) genetic. This is a powerful demonstration of the inability of current GWAS technology to access the genetic variants responsible for the vast majority of heritable variation in at least some complex traits, for reasons I have previously discussed in detail.
The bolded parts are exactly right in my estimation; of course nearly a century of biometric analysis of human height should lead us to expect this. In contrast, 50 years ago there was pedigree based work which implied that skin color was going to resolve itself so that about half a dozen loci of large effect would explain most of the variance. That's what we see.
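For a rough sense of scale, here is the back-of-the-envelope arithmetic implied by the two figures in the quote (less than 5% of total variance captured, roughly 90% heritability). This is only a sketch: both numbers are taken from the quote as upper-bound and approximate values, and the function name is my own.

```python
def share_of_heritable_variance(explained_total, heritability):
    """Fraction of the heritable variance accounted for, given the fraction
    of total phenotypic variance explained and the trait's heritability."""
    return explained_total / heritability

# Figures from the quote above: < 5% of total variance, ~90% heritable.
captured = share_of_heritable_variance(0.05, 0.90)
missing = 1 - captured

print(round(captured, 3), round(missing, 3))  # 0.056 0.944
```

In other words, on those figures the loci found so far account for at most about 6% of the heritable variation in height, leaving something like 94% unaccounted for.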
But I think height is important & interesting. Our species has shrunk since the last Ice Age (even modern nutrition hasn't brought us all the way back). Why? Cross-cultural evidence seems to suggest that tall men are more reproductively fit, but the fact that there is a normal range of variation within populations tells us that strong directional selection hasn't been effective over the long term. Otherwise, variation would quickly be exhausted. But it seems likely that some of the between-population differences are due to genetic differences.
Related: Why you be short or tall (well, a little bit) & Why Asians are so short (perhaps).
Saturday, April 05, 2008
Since I know there are some nerds in the audience, I thought I would point to this radio interview with Richard Stallman. Every response Stallman is going to make to any substantive question can be derived from his ideology, but it's always funny to see him interacting with normals. The dynamic reminds me a lot of how I saw people in Bangladesh deal with my uncle who was high up in the Tablighi Jamaat. Of course, I joke about how weird Stallman is, but it seems to me that individuals with his peculiar psychological profile and cognitive talents have probably had an outsized effect on the shape of human history. Stallman is unlikely to replicate his genes, but like Gilgamesh he will live on in memory....
Related: Article about Stallman from Salon. Note that it states he has (had? The article is old) a friendship with John McCarthy, which suggests that Stallman's rigid radicalism does not seem to prevent relationships with those who are opposed to his politics.
Friday, April 04, 2008
A few weeks ago I mentioned that I was going to read The Prehistoric Origins of European Economic Integration and throw up a post on the topic. I've read it, but I don't have anything intelligent to say on it right now. Unfortunately, when it comes to economic history I'm at the left edge of the knowledge curve, and my inferential engine really isn't post-worthy most of the time. When something intelligent pops into mind I'll post it, but until then I thought this portion of the paper was interesting from a GNXP perspective:
Like specie, addictive substances have played a central role in integrating the world economy. Alcohol consumption in the European interior goes back to the third millennium, and was evidently a central element in early ritual. Until northern Europeans learned how to malt grain to brewing beer, however, alcohol could only be obtained by fermenting fruit and honey, which made it costly and rare. The arrival of a beverage having an alcoholic content upwards of ten percent worked a revolution in trans-Alpine Europe. Writing when the trade was in full swing immediately after the Roman conquest Diodorus observed that 'The Gauls are exceedingly addicted to the use of wine and fill themselves with the wine brought into their country by merchants, drinking it unmixed; and since they partake of this drink without moderation by reason of their craving for it, when they are drunken they fall into a stupor or state of madness. Consequently, many of the Italian traders, induced by the love of money that characterizes them, believe that the love of wine of these Gauls is their own Godsend.' Caesar reports that the Nerviens and the Suevians refused entry to wine traders for fear the drink would weaken their warriors.
Thursday, April 03, 2008
DNA from Pre-Clovis Human Coprolites in Oregon, North America:
The timing of the first human migration into the Americas and its relation to the appearance of the Clovis technological complex in North America ca. 11-10.8 thousand radiocarbon years before present (14C ka B.P.) remains contentious. We establish that humans were present at Paisley 5 Mile Point Caves, south-central Oregon, by 12,300 14C yr. B.P., through recovery of human mtDNA from coprolites, directly dated by accelerator mass spectrometry. The mtDNA corresponds to Native American founding haplogroups A2 and B2. The dates of the coprolites are >1000 14C years earlier than currently accepted dates for the Clovis-complex.
ScienceDaily has a really long review of the results & their implications. Also check out A Three-Stage Colonization Model for the Peopling of the Americas.
Tuesday, April 01, 2008
Just got a note from someone I trust that a massive QTL for IQ has been discovered, on the order of 10 points in effect for a substitution of the major allele for the minor (it's additive and independent, so homozygote minor allele ~ 20 points greater than homozygote major allele). The novel variant is found in an ethnic-religious minority population and no other phenotypic effects are discernible for those who carry the IQ boosting polymorphism. Everything is very preliminary at this point...but they've checked and re-checked and this seems to be real. There are two genes previously implicated in neurological pathologies in this region of the genome, so a molecular genetic & physiological story should be easy to extract.
I'm being a little vague on the details for obvious reasons; no one wants to be scooped. But word is spreading through the labs, so my friend thought it might be good to prep the public and those at GNXP who are interested in this topic. Expect a Nick Wade article as soon as possible. Exciting times....
Update: Yes, April Fool's. Obviously I wasn't going to do something like taking down the site and pretending someone was going to sue us; you might recall that several GNXP readers sent the befuddled sysop of the Gene Expression Omnibus some irate emails....