Diabetes and obesity

Update: I made a major error in the algebra of estimating “white diabetes rates” per county, so the last set of correlations was junk. I think I have fixed the issue. Thanks to “bayesian,” who noted that something was off with them.

The CDC provides data on diabetes by county as well as obesity.

Some Correlations:

Diabetes-Obesity = 0.72
Diabetes-Black = 0.65
Diabetes-Latino = -0.14

What’s going on with the last? Latinos, in particular Mexican Americans, are more susceptible to diabetes than whites. So it must be that in counties where there are many Mexican Americans, whites have a particularly low prevalence of diabetes.

Other correlations:

Diabetes-Obama Vote = -0.01
Diabetes-College Educated = -0.46
Diabetes-Median Household Income = -0.45
Diabetes-Median Home Value = -0.42

I’m struck by the fact that the correlations are higher than for obesity (if you think about it in terms of r-squared, the square of the correlation explaining the variance of Y by X, it’s even more striking). Probably has to do with the fact that only a subset of the obese are diabetic, as diabetes is a more extreme manifestation of morbidity. Let’s control for the % black in a county with partial correlations:

Diabetes-Obesity = 0.63
Diabetes-Obama Vote = -0.28
Diabetes-College Educated = -0.52
Diabetes-Median Household Income = -0.41
Diabetes-Median Home Value = -0.43
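
For readers who want to see the mechanics, here is a minimal sketch of the two quantities used above: r-squared and a first-order partial correlation. It works from pairwise correlations rather than the raw county data, and the obesity-black correlation plugged in at the end is a made-up placeholder, since that number isn't reported in this post.

```python
import math

def r_squared(r):
    """Share of the variance in Y linearly accounted for by X, given Pearson r."""
    return r * r

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y with Z held constant."""
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom

# Diabetes-Obesity correlation reported above
print(round(r_squared(0.72), 2))

# Diabetes-Obesity controlling for % black.
# 0.72 and 0.65 are from the post; 0.40 is a HYPOTHETICAL obesity-black value.
print(round(partial_corr(0.72, 0.65, 0.40), 2))
```

If Z is uncorrelated with both X and Y, the partial correlation collapses to the plain correlation, which is a quick sanity check on the formula.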

Not much change in the correlations really. The exception is that a modest correlation between political liberalism and lower levels of diabetes appears once the black proportion is controlled (the correlation with black proportion controlled for obesity and Obama vote is -0.24, same magnitude and direction).

I also tried to estimate white diabetes prevalence by county. The national data suggest that blacks are 1.7 times more likely to be diabetic, and Latinos 2 times more likely. Obviously there’s going to be some variance for these two groups, so I don’t know how useful this estimate for whites is going to be. But, it should put into stark relief the negative correlation between the proportion of Latinos and white diabetic rates (note: again, Latinos seem to vary quite a bit and there are many counties along the Mexican-American border, as well as on the East Coast, where Latinos are so far deviated from the aggregate risk that I had to dump the data).
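
One way the algebra for this estimate might look, as a rough sketch: treat the county rate as a mix of three groups and assume the national risk ratios hold in every county. The post doesn't spell out the exact formula, so the function below and the example county are assumptions for illustration only.

```python
def estimate_white_rate(total_rate, p_white, p_black, p_latino,
                        rr_black=1.7, rr_latino=2.0):
    """Solve total = w*p_white + rr_black*w*p_black + rr_latino*w*p_latino for w.

    Proportions are fractions of the county population. Other groups are
    ignored, which is an assumption of this sketch.
    """
    weight = p_white + rr_black * p_black + rr_latino * p_latino
    if weight <= 0:
        raise ValueError("population proportions must sum to a positive value")
    return total_rate / weight

# HYPOTHETICAL county: 10% diabetic overall, 70% white, 20% black, 10% Latino
print(round(estimate_white_rate(10.0, 0.70, 0.20, 0.10), 2))
```

Because the ratios are fixed, any county where blacks or Latinos are healthier than the national ratio implies will push the white estimate too low, which is the problem described above.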

Here are some correlations (again, white county proportions are estimates):

White diabetic proportion-White obesity rate (estimate from previous post) = 0.47
White diabetic proportion-College Educated = -0.46
White diabetic proportion-Obama Vote = -0.18
White diabetic proportion-Median Household Income = -0.39
White diabetic proportion-Median Home Value = -0.44

OK, enough with correlations. Maps. Diabetes for all groups:

Now, my estimates for whites:

I think the assumption of an invariant relationship between white and non-white rates (i.e., blacks = 1.7 X whites) is causing problems. The white areas underneath the median are suspiciously concentrated in the Black Belt.

So let’s just focus on counties which are 85% or more white:

Where the fat folks live

Since it’s after Thanksgiving and I’m feeling bloated, I figure a follow up to the post on obesity and diabetes might be apropos. I want to focus on obesity. I have the raw county-by-county data, but obviously it isn’t broken down by race. But, I do have the proportions for each race by county, and the CDC provides state-by-state breakdowns of the proportion of obese by race. So I decided to “estimate” the proportion of whites obese by county.

1) By “white,” I mean “Non-Hispanic white.” I’m going to say “white” from now on exclusive of Hispanics.

2) Some states, such as Vermont, do not have a large enough sample to estimate the obesity proportion of blacks. I just used a neighboring state to fill in the numbers. This guesstimate is really not much of an issue because the proportion of blacks is so low in the states I had to estimate that the estimate of obesity for whites and estimate of obesity for all races is the same in these counties anyhow.

3) Simple algebra. Total Obesity Percent In County = (Obesity Percent Whites) X (Percent Whites) + (Obesity Percent Blacks) X (Percent Blacks) + (Obesity Percent Latinos) X (Percent Latinos)

For the obesity percent of blacks and Latinos I only have state level data, so this is going to be a rough estimate. And it’s going to result in the variation exhibiting state-to-state discontinuities, since the county variable is dependent on a state level variable. Also, I discarded some counties where the usage of state level data caused really big distortions. Along the Mexican border Latinos are not nearly as obese as they are further into the United States, so I end up with numbers where whites have negative obesity percentages to make the math work out. These are counties which are 90% or more Latino with relatively low obesity numbers.
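
The algebra in step 3 can be rearranged to isolate the white obesity percent. A minimal sketch, with made-up county and state numbers standing in for the real data:

```python
def estimate_white_obesity(total_pct, p_white, p_black, p_latino,
                           black_pct, latino_pct):
    """Rearrange: total = white*p_white + black*p_black + latino*p_latino.

    p_* are county population fractions; black_pct and latino_pct are the
    state-level obesity percents used as stand-ins for the county values.
    """
    if p_white <= 0:
        return None  # nothing to solve for
    return (total_pct - black_pct * p_black - latino_pct * p_latino) / p_white

# HYPOTHETICAL ordinary county
print(estimate_white_obesity(28.0, 0.80, 0.10, 0.10, 36.0, 30.0))

# HYPOTHETICAL border county: mostly Latino, low county-wide obesity,
# but a high state-level Latino figure. The white estimate goes negative.
print(estimate_white_obesity(22.0, 0.07, 0.01, 0.92, 35.0, 30.0))
```

The second call shows why the heavily Latino border counties had to be discarded: the state-level Latino figure alone exceeds the county total, so the remainder assigned to whites is below zero.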

I did the map shading the way I normally do. Blue is above the median value, and red below the median value, with the scale being set to their max and mins respectively. Unfortunately this causes a problem in the scaling in terms of an asymmetry because one side of the distribution will tend to have a more extreme outlier (usually the above median is where the skew is).
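
A sketch of that shading rule, assuming a simple linear scale on each side of the median (the actual mapping software isn't named here, so the function is illustrative). The min, median and max are the all-population county figures reported in this post.

```python
def shade(value, vmin, median, vmax):
    """Return (colour, intensity between 0 and 1) for one county value.

    Each side of the median is scaled to its own extreme, so the same
    distance from the median can get a different intensity on each side.
    """
    if value >= median:
        return "blue", (value - median) / (vmax - median)
    return "red", (median - value) / (median - vmin)

VMIN, MEDIAN, VMAX = 12.40, 28.40, 43.70

# The two quartiles are both 1.8 points from the median
for v in (26.60, 30.20):
    colour, intensity = shade(v, VMIN, MEDIAN, VMAX)
    print(v, colour, round(intensity, 3))
```

The more lopsided the extremes, the bigger the mismatch between the two colour ramps.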

Here’s the map with all the populations:

This is basically the earlier map except shaded differently. Here are the summary statistics for obesity by county:

min = 12.40
1st quartile = 26.60
median = 28.40
mean = 28.25
3rd quartile = 30.20
max = 43.70

Now for my estimate of whites only:

As you can see, the use of state-level data is causing some distortions. Also, you see something peculiar in the summary statistics:

1st quartile = 25.54
median = 27.62
mean = 26.71
3rd quartile = 29.47
max = 58.11

These averages don’t align with the CDC values aggregated. But that’s because I’m looking at county level data, and not weighting by population. Lots of low density counties with few people have many obese people. Instead of looking at national averages, we’re looking at regional variations.
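
To make the weighting point concrete, here is a toy comparison of an unweighted county mean with a population-weighted one. The three counties are invented for illustration.

```python
def unweighted_mean(rates):
    """Every county counts equally, whatever its population."""
    return sum(rates) / len(rates)

def weighted_mean(rates, populations):
    """Every person counts equally; big counties dominate."""
    total_pop = sum(populations)
    return sum(r * p for r, p in zip(rates, populations)) / total_pop

# HYPOTHETICAL: two small rural counties and one large urban county
rates = [32.0, 31.0, 24.0]
populations = [5_000, 8_000, 1_000_000]

print(unweighted_mean(rates))
print(round(weighted_mean(rates, populations), 2))
```

The unweighted figure sits well above the weighted one because the two small, heavier counties get two of the three votes.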

On the estimates, Texas probably jumps out at you. To my surprise it turns out that whites in Texas are a touch lighter than the national average for whites! For me the big thing that sticks out is that Appalachia seems to be split in two, along the Appalachian Trail (I feel funny mentioning the Appalachian Trail….). Some areas, such as New England, Colorado and California do not surprise in terms of whites who are below the national median. But again there is a pattern of some pockets in the Upper Midwest being relatively under the norm in the proportion of obesity. Some of you might be surprised by the Pacific Northwest, but this region is characterized by urban-rural polarization.

What are the correlations by ethnicity? Here are the correlations with white obesity in terms of ancestral proportion (the proportion of ethnicity X as a proportion of whites):

English = -0.17
German = -0.02
American = 0.07
Scots Irish = -0.13
Irish = -0.19

These are very modest correlations. Probably mostly explained by geography. How about voting?

Obama vote = -0.21

Again, modest. Median Family Income? Only -0.14! That surprised me. Interestingly, Median Home Value had a -0.26 correlation with obesity. Of course the “Dirt Gap” tracks this; in places where people are thinner property values are higher, and rose higher in the past decade. The proportion who have a college degree is like home value, a correlation of -0.25.

None of this is really surprising; on the aggregate level you know that wealthier and more educated people are thinner. So I might as well do something that’s not totally predictable. Most of the variance of obesity on the county level isn’t predicted by educational levels, but a non-trivial fraction is. I decided to fit a loess curve to the plot of white obesity against the proportion who are college educated. Then I simply took the residuals above and below the line and shaded them blue and red respectively. In other words, blue areas have a lot of fat people for the number of college graduates, while red areas have relatively few fat people for the number of college graduates.
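
A rough sketch of the residual-shading idea. It uses a bare-bones loess-style smoother (local linear fits with tricube weights, no robustness iterations), not whatever software produced the map, and the county values are invented.

```python
def loess_fit(xs, ys, frac=0.5):
    """Locally weighted linear fit, evaluated at each observed x."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbours per local fit
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to k-th nearest point
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

def shade_residuals(xs, ys, frac=0.5):
    """Blue = more obesity than the curve predicts, red = less."""
    fitted = loess_fit(xs, ys, frac)
    return ["blue" if y > f else "red" for y, f in zip(ys, fitted)]

# HYPOTHETICAL counties: % college educated, % white obese
college = [10, 12, 15, 18, 20, 25, 30, 35, 40, 50]
obesity = [31, 29, 30, 27, 29, 26, 27, 23, 25, 21]
print(shade_residuals(college, obesity))
```

The `frac` parameter controls how much of the data each local fit sees; smaller values give a wigglier curve and smaller residuals.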

The wealth of politicians

Open Secrets has data on members of the House and Senate in relation to their net worth. Here are some descriptive statistics:
Democrats & Republicans:
25th percentile = $228,006
Median = $791,004
75th percentile = $2,962,519
Mean = $6,438,210
Republicans:
25th percentile = $269,007
Median = $999,381
75th percentile = $3,421,512
Mean = $6,010,456
Democrats:
25th percentile = $217,001
Median = $718,756
75th percentile = $2,516,033
Mean = $6,731,952
Let’s limit to those who have positive net worth (greater than zero) and less than $50,000,000. This is about two standard deviations above the median, so it removes the top ~2% who tend to skew the results.
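
As a small illustration of that filter, here is a sketch that keeps only net worths above zero and below the $50,000,000 cap, then compares mean and median before and after. The five values are invented, not Open Secrets figures.

```python
import statistics

def trim_net_worth(values, cap=50_000_000):
    """Keep values strictly greater than zero and strictly below the cap."""
    return [v for v in values if 0 < v < cap]

# HYPOTHETICAL net worths, including one debtor and one very rich outlier
sample = [-100_000, 200_000, 800_000, 3_000_000, 250_000_000]
kept = trim_net_worth(sample)

print(statistics.mean(sample), statistics.median(sample))
print(statistics.mean(kept), statistics.median(kept))
```

With the outlier in, the mean is far above the median, the same pattern as in the tables above; after trimming the two move much closer together.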


Marriage equality, inbreeding style

The New York Times has an article on cousin marriage that’s up. Here’s some important bits:

Shane Winters, 37, whom she now playfully refers to as her “cusband,” proposed to her at a surprise birthday party in front of family and friends, and the two are now trying to have a baby. They are not concerned about genetic defects, Ms. Spring-Winters said, and their fertility doctor told them he saw no problem with having children.
The couple — she is a second-grade teacher and he builds furniture — held their wedding last summer on a lake near this tiny town in central Pennsylvania. But their official marriage took place a month earlier in Maryland, at Annapolis City Hall, because marriage between first cousins is illegal in Pennsylvania — and in 24 other states, according to the National Conference of State Legislatures — under laws enacted mostly in the 19th century.

For the most part, scientists studying the phenomenon worldwide are finding evidence that the risk of birth defects and mortality is less significant than previously thought. A widely disseminated study published in The Journal of Genetic Counseling in 2002 said that the risk of serious genetic defects like spina bifida and cystic fibrosis in the children of first cousins indeed exists but that it is rather small, 1.7 to 2.8 percentage points higher than for children of unrelated parents, who face a 3 to 4 percent risk — or about the equivalent of that in children of women giving birth in their early 40s. The study also said the risk of mortality for children of first cousins was 4.4 percentage points higher.
More-recent studies suggest that the risks may be even lower. In September, Alan Bittles, a researcher at the Centre for Comparative Genomics at Murdoch University in Australia and one of the authors of the 2002 study, published a paper in Proceedings of the National Academy of Sciences that reported that the mortality rate was closer to 3.5 percentage points higher. He said he expected ongoing research to find the risk of defects to be lower than previously assumed as well.
“It’s never as simple as people make it out to be,” said Dr. Bittles, noting that very early studies did not account for factors like access to prenatal health care, and did not distinguish between couples like Ms. Spring-Winters and her husband, the first cousins in a family to marry, and those who are part of groups in which the practice is common over generations and has led to high rates of genetic disorders. “But the widely accepted scare stories — even within academia — and the belief that cousin marriage is inevitably harmful have declined in the face of some of the data we’ve been producing,” he said.

Diane B. Paul, a professor emerita of political science at the University of Massachusetts, Boston, and a research associate in zoology at Harvard, was an author of a paper published last year in the journal PLoS Biology that described the difficulty of generalizing about the potential for birth defects or increased mortality in the children of cousins. Each couple’s risk depends on the individuals’ particular genetic makeup, she said, which means “it’s very difficult to determine.” And even the small average risk of defects reported in the 2002 study, she added, represents nearly double the risk to children of unrelated parents.

As a religious Methodist, she said, she also worried that marrying her cousin would be wrong in the eyes of her church. But as it turned out, the Methodist Church has no official position on marriage between cousins, unlike the Roman Catholic Church, which requires cousins to obtain dispensation before marrying. And after talking to a relative who is a Baptist minister, Ms. Spring-Winters said, she discovered that the Bible does not say anything explicitly negative about cousin marriage, although it does list examples of sexual impurity, including relations with “close relatives,” like sisters, stepchildren, grandchildren, aunts and stepsisters; and those between mothers and sons, and fathers and daughters.
“If the Bible said no, we wouldn’t have done it,” she said.
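
The article quotes both an absolute framing (1.7 to 2.8 percentage points higher) and a relative one (“nearly double”). A quick sketch of how the two relate, using only the figures quoted above: a 3 to 4 percent baseline and 1.7 to 2.8 points of excess risk.

```python
def absolute_and_relative(baseline_pct, excess_points):
    """Return (risk for children of cousins in %, ratio to the baseline)."""
    total = baseline_pct + excess_points
    return total, total / baseline_pct

for baseline in (3.0, 4.0):
    for excess in (1.7, 2.8):
        total, ratio = absolute_and_relative(baseline, excess)
        print(baseline, excess, round(total, 1), round(ratio, 2))
```

Only the combination of the low baseline and the high excess gets close to a doubling; the other combinations are more like a 40 to 70 percent relative increase.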

A few salient points noted above:


Liberty or Libel?

There has been much discussion in the blogosphere (for example by Olivia Judson here) of the current libel case between the science writer Simon Singh and the British Chiropractic Association. Most of the comments have supported Singh and criticised both the BCA and the trial Judge, Sir David Eady. Science writers complain that the libel laws are stifling fair criticism of unscientific claims (which makes this at least marginally relevant to gnxp).

I have no interest, of any kind, in chiropractic, and I support freedom of speech, so you might expect me to join the chorus of Singh-lovers and Eady-haters. Unfortunately, much of the commentary has been ill-informed or self-interested (since journalists and bloggers view the libel laws much as turkeys view Christmas). The British press has other motives for attacking Judge Eady, who has extended the legal right of privacy against paparazzi and tabloid journalists. So protestations of concern for ‘free speech’ need to be taken with a hefty pinch of salt…

Some red herrings to dispose of. First, there is a legitimate debate over the practice of ‘libel tourism’ or ‘forum shopping’. But this issue does not arise in the Singh case, where a British writer made comments about a British organisation in a British newspaper. There is no question that a British Court is entitled to try the case.

Second, on libertarian grounds I would be willing to argue for complete freedom of speech, with no restrictions on libel. But that is not where we start from. Every country has some kind of libel law. The details vary, and the balance between freedom of speech and protection of individual reputation is struck in different ways. It is arguable that American law leans too far in favour of the libeller, while English law leans too far in favour of the libelled. But Eady’s critics argue that even within the general framework of English libel law his rulings are dangerous to freedom of speech. I will therefore take that general framework as given.

What then are the issues?

Here is the key passage from Singh’s article, which prompted the libel action:

The British Chiropractic Association claims that their members can help treat children with colic, sleeping and feeding problems, frequent ear infections, asthma and prolonged crying, even though there is not a jot of evidence. This organization is the respectable face of the chiropractic profession and yet it happily promotes bogus treatments.

Before going any further it is necessary to set out the various stages of a libel action under English law, since some of the critical commentary seems to misunderstand this. A case can be divided into four main stages:

Stage 1: It must be established what was said or written, who said it, and who was its ‘target’. In the present case this is straightforward.

Stage 2: It is necessary to decide whether what was said is defamatory. Roughly, this means whether or not it is damaging to the reputation of the complainant. At this stage, under English libel law, the truth or falsity of what was said is irrelevant. [Note 1] Much of the comment on the case has failed to grasp this. A true statement may be defamatory, and a false statement may be non-defamatory. The point at issue is not its truth, but whether it is damaging.

Stage 3: If it is decided that a statement is defamatory, the person responsible for the statement may then defend it. Except in certain special circumstances, the defence is either that the statement is true (the defence of ‘justification’), or that it constitutes ‘fair comment’. In the English system it is usually for a jury to decide whether the defence is convincing.

Stage 4: If the jury finds in favour of the complainant, a decision is then needed on the amount of damages or other remedial action. Damages are decided by the jury. All costs of the case are usually paid by the losing side. It has been suggested in some commentaries that it is cheap to bring a libel action, because the complainant can hire a lawyer on a no-win no-fee basis. But this is only true if the complainant has a strong case; otherwise no lawyer will touch it.

The basic complaint of the BCA is that Singh’s article accuses them of dishonesty, by promoting treatments which they know to be ‘bogus’.

Judge Eady was asked to give preliminary rulings on two issues: what Singh’s words meant; and whether they amounted to an assertion of fact or merely an expression of opinion. On the first point, he decided, agreeing with the BCA, that Singh’s article accuses them of dishonesty, saying: ‘[the quoted passage] is in my judgment the plainest allegation of dishonesty and indeed it accuses them of thoroughly disreputable conduct.’ After this, it was straightforward to take the further step of deciding that the passage is defamatory, since an accusation of dishonesty could hardly not be. On the second point, Judge Eady concluded that the passage amounts to an assertion of fact. The importance of this is that if the defamatory passage is an assertion of fact, the defence of ‘fair comment’ is not available, and the only defence (usually) is to show that the assertion is factually true, or ‘justified’. This defence remains open to Singh.

The case so far therefore raises two issues:

1. Was Eady right to conclude that Singh had accused the BCA of dishonesty?

2. Was Eady right to conclude that the accusation was an assertion of fact, rather than merely an expression of opinion?

On the first point, the matter is perhaps not as clear-cut as Eady’s ruling suggests, but on a common-sense reading of Singh’s passage it is at least a very reasonable interpretation. Singh’s words are strong: he says there is ‘not a jot of evidence’ for the BCA’s claims, and that while it is the ‘respectable face’ of chiropractic, it still ‘happily promotes bogus treatments’. Whether or not Singh intended this to be an accusation of dishonesty, it is a natural inference for the reader to draw. The word ‘bogus’ by itself usually has an implication of dishonesty; the dictionary gives synonyms such as ‘sham’, ‘spurious’, and ‘counterfeit’. To say that someone promotes bogus treatments therefore might in itself be taken as implying dishonesty. This interpretation is reinforced by the contrast Singh draws between the ‘respectable face’ of the BCA and its ‘happily’ promoting ‘bogus treatments’. The contrast between ‘respectable face’ and ‘bogus’ seems to imply that the BCA is not, after all, as respectable as it may appear. If Singh did not intend an imputation of dishonesty, he expressed himself carelessly. An alternative possibility is that he did intend to impute dishonesty, but chose his words so as to insinuate that conclusion without making it explicit. In any case, under English libel law, Singh’s intention is irrelevant: what matters is the interpretation that reasonable readers are likely to put on his words.

On the second point, namely whether the defamatory claim was a matter of fact or opinion, the issues are more technical, and I do not pretend to understand all the legal subtleties. According to Eady’s ruling:

It will have become apparent by now that I also classify the defendant’s remarks as factual assertions rather than the mere expression of opinion. Miss Rogers reminded me, by reference to Hamilton v Clifford [2004] EWHC 1542 (QB), that one is not permitted to seek shelter behind a defence of fair comment when the defamatory sting is one of verifiable fact. [Note 2] Here the allegations are plainly verifiable and that is the subject of the defence of justification. What matters is whether those responsible for the claims put out by the BCA were well aware at the time that there was simply no evidence to support them. That is an issue capable of resolution in the light of the evidence called. In other words, it is a matter of verifiable fact. That is despite the fact that the words complained of appear under a general heading “comment and debate”. It is a question of substance rather than labelling.

Given the assumption that there was an accusation of dishonesty, this seems a reasonable enough decision. The defence of ‘fair comment’ is more narrowly circumscribed than the layman might imagine. The test of whether something is ‘opinion’ depends on the substance of the alleged disreputable conduct, and not on the form in which the allegation is made. It does not become a matter of opinion just because the author uses the words ‘in my opinion’ or some other verbal dodge.

Clearly the whole case (so far) hinges on the question whether a reasonable reader would interpret Singh’s words as containing an accusation of dishonesty. Much of the commentary has either missed this point, or strained to find alternative interpretations. For example, the words are interpreted as imputing mere gullibility or ignorance, rather than dishonesty. In some circumstances that might be the most natural interpretation of the same or similar words. For example, it might be said that exorcism is a ‘bogus’ treatment for mental illness, yet that some religious sects ‘happily promote’ this bogus treatment. In this case it might plausibly be argued that the implied accusation is one of gullibility or ignorance rather than dishonesty. But this interpretation relies on the background knowledge that religious sects are commonly ill-informed and gullible. In the case of the BCA, the contrast that Singh himself makes is between the BCA’s position as the ‘respectable face’ of a medical profession, and its willingness ‘happily’ to promote ‘bogus’ treatments for which there is ‘not a jot of evidence’. It is difficult to regard this merely as an accusation of gullibility. According to Judge Eady’s ruling:

It is alleged that the claimant promotes the bogus treatments “happily”. What that means is not that they do it naively or innocently believing in their efficacy, but rather that they are quite content and, so to speak, with their eyes open to present what are known to be bogus treatments as useful and effective. That is in my judgment the plainest allegation of dishonesty and indeed it accuses them of thoroughly disreputable conduct.

The critics complain that this is reading too much into the word ‘happily’, which could have a variety of other meanings. But again the question is not what the word might conceivably mean, but what a reasonable reader is likely to take it to mean. The meaning of words often depends on their context. In this case, the word ‘happily’ does not have its literal meaning as a description of an emotional state. The word must in some way describe the collective state of mind of the BCA in promoting ‘bogus’ treatments, and in the context it does (it seems to me) have a strong suggestion of dishonesty. The alternative is to suppose that it has a weaker connotation of recklessness or irresponsibility, but not of conscious dishonesty, or that it leaves several possibilities open, meaning (roughly) ‘dishonest or gullible or reckless or irresponsible…’. These interpretations are not impossible, but Singh himself has made it more difficult to accept them by saying that there is ‘not a jot of evidence’ for the ‘bogus’ treatments. If this were true, then the BCA, as a body of specialists in the field, could hardly be unaware of it, and their promotion of such treatments would go beyond mere recklessness into conscious dishonesty. Judge Eady’s interpretation is therefore not unreasonable.

Nor does the case have the far-reaching implications for freedom of speech or scientific research that some critics claim. No-one is suggesting that it is improper to criticise chiropractic or other alternative therapies. The only lesson to be drawn is that if you wish to accuse someone of dishonesty, at least in England, you must be ready to back up your accusation with evidence; and if you do not wish to accuse someone of dishonesty, you should choose your words with care.

Note 1: This is the position in most of the Common Law world. It was also the position in the United States until a series of Supreme Court decisions shifted the burden of proof onto complainants, where they are ‘public figures’, to show that the words complained of are not only defamatory but deliberately false. A useful account of American libel law is here.

Note 2: Out of curiosity I looked up this case. British readers may recall the incident when the former MP Neil Hamilton and his wife were accused of having raped a woman. The accusation was investigated by the police and disproved. The accuser was subsequently prosecuted and jailed for making false accusations. But before this, she had sold her story to the tabloids, using the PR consultant Max Clifford as intermediary. During the police investigation Max Clifford had gone on television to defend the woman’s claims, and among other things said he personally believed the claims were true. This was what led to the libel action, as the Hamiltons claimed that by endorsing the woman’s accusations Clifford was himself in effect accusing the Hamiltons of rape. Clifford argued in his defence that he was merely expressing an opinion, but the Judge ruled that he was making an assertion of fact, and could not shield behind the defence of ‘fair comment’. And who was the Judge? – step forward, Mr Justice Eady!

Added on 27 November: it has been pointed out that Simon Singh has recently been granted leave to appeal on some of the issues raised by the case. The Appeal Court may well reverse Judge Eady’s rulings on some or all matters. In my post I did not suggest that Eady was necessarily right, just that his rulings were a lot more reasonable than some commentators have claimed. As I said at the outset I have no interest in chiropractic. I have only commented on the case because I was getting tired of misrepresentations of it, which recur in an article in the (London) Times yesterday. Two things in particular have irritated me. One is the one-sided presentation of the case by the commentators. I have not seen a single comment which recognises that the BCA might just have a legitimate complaint when they are, arguably, accused of dishonesty. You can argue about the precise meaning of the words used by Singh, but no-one can sensibly deny that they could be used to make an accusation of dishonesty. Second, I am concerned that scare-mongering about the effects of the case on free speech and scientific enquiry could be a self-fulfilling prophecy. If scientists and science writers (including bloggers) are led to believe that they cannot make strong criticisms of pseudo-science without facing a libel action, freedom of speech and enquiry really will be inhibited. For the reasons given in my post, I do not think that the Singh case has these implications, and those who claim that it does are harming the cause they wish to defend.

I am also happy to acknowledge that I obtained the text of Judge Eady’s ruling through JackOfKent’s blog, via Olivia Judson’s blog, which is linked in my post. I would also stress that my criticism of ‘ill-informed’ commentators does not include JackOfKent. I don’t agree with his assessment of the case, but he is certainly well-informed about it – far more so than me.

Added on 29 November: I hold no brief, in any sense, for the BCA, but it seems to me that in fairness one should not accuse them of ‘litigiousness’, without at least checking their own statements of position. Here is one of their press notices on the Singh case. I do not know (obviously) whether the quote they attribute to Simon Singh at the end of their statement is true, but if it is, it puts Singh in a very different light from that presented by his cheerleaders.

Egypt & evolution & the Muslim world

Last week I pointed to numbers on evolution and the Muslim world. The New York Times has an article up about the conference which inspired my investigation into that topic. The reporter focuses on rote learning and a lack of creativity as the factors behind a lack of knowledge or understanding of evolutionary theory. Plausible, but really unlikely. East Asian nations have the same issues (which they are trying to reform), but acceptance of evolution is high there. In fact, even in non-developed nations such as the Philippines acceptance of evolution can be high. It is higher than in the United States! In Russia there is a surprisingly low level of acceptance of evolution, though that might be the aftereffect of the Lysenko interlude, when conventional evolutionary biology was rejected. In other words, the reasons for skepticism of evolution are somewhat diverse, though rote learning and lack of creativity are surely neither necessary nor sufficient.
I suspect that the best analogy for what’s going on with Muslims, even elite Muslims (the samples I pointed to last week were elites), is what occurred with conservative Protestants as they faced the forces of modernism in the 20th century. Some aspects of the modern world they accepted, and others they rejected. The historical sciences, and in particular those which bear upon human nature and origins, they reject with particular vehemence. Despite the pleas of a minority, such as Francis Collins, most American Evangelicals seem to believe that rejection of evolutionary theory is necessarily entailed by their religion. Most Muslims seem to feel the same way. Even American Muslims seem to have this attitude, though not as much as American Evangelicals. While 33% of American Evangelicals accept evolution as the best explanation for the origin of human life, 45% of Muslims do (vs. 48% of all Americans). Yet 80% of Hindus accept evolution as the best explanation for the origin of life. In any case, the citizens of Muslim nations seem to assert that religion is very important in their lives, so naturally they would be skeptical of ideas which they believe contravene the precepts of their religion.
Below the fold are results where individuals were asked how important religion was in their life from the World Values Survey 2005 by country….


No support for birth order effects on personality from the GSS

In researching for a review of The Nurture Assumption, I read over the debate between Harris and Sulloway over birth order effects on personality. Sulloway’s thesis, explained in Born to Rebel, is that last-born children have more rebellious, agreeable, and open-minded/liberal personalities, and that this manifests itself in history with revolutions spearheaded by last-borns. This runs in contrast to Harris’s theory that the family environment has no lasting impact on personality, so she spends a good deal of time in her books and articles critiquing it.

The whole debate makes my head dizzy. A seemingly simple empirical question has produced years of arguing over methodology. I’m not going to go over the tedious back and forth here, except to say that you can see what both sides have to say with a Google search.

Large, controlled studies have not been kind to Sulloway’s thesis. Freese, Powell, and Steelman (1999) looked for a relationship between birth order (controlled for family size) and a variety of political measures on the nationally representative General Social Survey (GSS). They found no significant associations, contrary to Sulloway’s predictions.

I decided to look at the GSS myself, this time to see whether questions that tapped into personality characteristics outside of politics showed any relationship with birth order (SIBORDER), when sibship size (SIBS) was controlled for. I excluded only children. I used the Multiple Regressions feature on the Berkeley SDA tool. I found no significant associations between birth order and any of the four variables I looked at:

  • MEMLIT (proxy for openness/creativity) – “Here is a list of various organizations. Could you tell me whether or not you are a member of each type? m. Literary, art, discussion or study groups”
  • TRUST (proxy for agreeableness) – “Generally speaking, would you say that most people can be trusted or that you can’t be too careful in life?”
  • WORLD4 (proxy for agreeableness) – “People have different images of the world and human nature. We’d like to know the kinds of images you have. Here is a card with sets of contrasting images. On a scale of 1-7 where would you place your image of the world and human nature between the two contrasting images? 1. Human nature is basically good. 7. Human nature is fundamentally perverse and corrupt.”
  • OBEYLAW (proxy for rebelliousness) – “In general, would you say that people should obey the law without exception, or are there exceptional occasions on which people should follow their consciences even if it means breaking the law?”
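For anyone who wants to see what “controlling for sibship size” buys you, here is a minimal sketch in plain Python. It is not the SDA procedure and it does not use GSS data: the numbers are simulated, the variable names (sibs, order, trust) are stand-ins I made up, and the outcome is built to depend on family size only. The adjusted slope is computed by residualizing both the outcome and birth order on sibship size, which gives the same coefficient as a multiple regression with both predictors.

```python
import random

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares fit y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def residuals(x, y):
    """Residuals of y after removing its linear dependence on x."""
    a, b = simple_ols(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def controlled_slope(outcome, predictor, control):
    """Slope of outcome on predictor with control held fixed."""
    return simple_ols(residuals(control, predictor),
                      residuals(control, outcome))[1]

# Simulated respondents (hypothetical, NOT GSS data).
rng = random.Random(0)
sibs, order, trust = [], [], []
for _ in range(5000):
    s = rng.randint(1, 6)          # number of siblings; only children excluded
    o = rng.randint(1, s + 1)      # birth order among the s + 1 children
    t = -0.2 * s + rng.gauss(0, 1) # outcome driven by family size alone
    sibs.append(s)
    order.append(o)
    trust.append(t)

raw_slope = simple_ols(order, trust)[1]
adj_slope = controlled_slope(trust, order, sibs)
print(f"birth order slope, no control:        {raw_slope:.3f}")
print(f"birth order slope, sibship controlled: {adj_slope:.3f}")
```

In this toy setup later-borns look different only because they come from bigger families, so the apparent birth order effect should shrink toward zero once sibship size is in the model.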

I wouldn’t say that we should write off the idea of birth order influences on personality and intelligence, only that we should be very skeptical of them. To the extent that they do exist, they’re probably not very significant.

Population substructure within China

The state of China has 1/5 of humanity within its borders, so its genetic structure is of interest. It is obviously important for medical reasons to clarify issues of population structure so that disease susceptibility among the Han is well characterized, in particular with the heightened medical needs of an aging population in the coming generation. And of course, there are the nationalistic concerns. About 20 years ago L. L. Cavalli-Sforza reported in The History and Geography of Human Genes that his South Chinese samples were genetically closer to Southeast Asians than to North Chinese. This result has been somewhat muddled in the past generation with the rise of uniparental markers (NRY and mtDNA, passed through the male and female lineages respectively) along with studies which utilize hundreds of thousands of SNPs. One thing that seems to be clear is that genes vary as a function of geography in China (just as they do pretty much everywhere).
Two new articles in AJHG shed some more light on this issue, Genomic Dissection of Population Substructure of Han Chinese and Its Implication in Association Studies:

To date, most genome-wide association studies (GWAS) and studies of fine-scale population structure have been conducted primarily on Europeans. Han Chinese, the largest ethnic group in the world, composing 20% of the entire global human population, is largely underrepresented in such studies. A well-recognized challenge is the fact that population structure can cause spurious associations in GWAS. In this study, we examined population substructures in a diverse set of over 1700 Han Chinese samples collected from 26 regions across China, each genotyped at ∼160K single-nucleotide polymorphisms (SNPs). Our results showed that the Han Chinese population is intricately substructured, with the main observed clusters corresponding roughly to northern Han, central Han, and southern Han. However, simulated case-control studies showed that genetic differentiation among these clusters, although very small (FST = 0.0002∼0.0009), is sufficient to lead to an inflated rate of false-positive results even when the sample size is moderate. The top two SNPs with the greatest frequency differences between the northern Han and southern Han clusters (FST > 0.06) were found in the FADS2 gene, which associates with the fatty acid composition in phospholipids, and in the HLA complex P5 gene (HCP5), which associates with HIV infection, psoriasis, and psoriatic arthritis. Ingenuity Pathway Analysis (IPA) showed that most differentiated genes among clusters are involved in cardiac arteriopathy (p < 10^−101). These signals indicating significant differences among Han Chinese subpopulations should be carefully explained in case they are also detected in association studies, especially when sample sources are diverse.

And, Genetic Structure of the Han Chinese Population Revealed by Genome-wide SNP Variation:


GWAS, population structure and the Han Chinese

Two new articles in AJHG, Genomic Dissection of Population Substructure of Han Chinese and Its Implication in Association Studies:

To date, most genome-wide association studies (GWAS) and studies of fine-scale population structure have been conducted primarily on Europeans. Han Chinese, the largest ethnic group in the world, composing 20% of the entire global human population, is largely underrepresented in such studies. A well-recognized challenge is the fact that population structure can cause spurious associations in GWAS. In this study, we examined population substructures in a diverse set of over 1700 Han Chinese samples collected from 26 regions across China, each genotyped at ∼160K single-nucleotide polymorphisms (SNPs). Our results showed that the Han Chinese population is intricately substructured, with the main observed clusters corresponding roughly to northern Han, central Han, and southern Han. However, simulated case-control studies showed that genetic differentiation among these clusters, although very small (FST = 0.0002∼0.0009), is sufficient to lead to an inflated rate of false-positive results even when the sample size is moderate. The top two SNPs with the greatest frequency differences between the northern Han and southern Han clusters (FST > 0.06) were found in the FADS2 gene, which associates with the fatty acid composition in phospholipids, and in the HLA complex P5 gene (HCP5), which associates with HIV infection, psoriasis, and psoriatic arthritis. Ingenuity Pathway Analysis (IPA) showed that most differentiated genes among clusters are involved in cardiac arteriopathy (p < 10^−101). These signals indicating significant differences among Han Chinese subpopulations should be carefully explained in case they are also detected in association studies, especially when sample sources are diverse.

And, Genetic Structure of the Han Chinese Population Revealed by Genome-wide SNP Variation:

Population stratification is a potential problem for genome-wide association studies (GWAS), confounding results and causing spurious associations. Hence, understanding how allele frequencies vary across geographic regions or among subpopulations is an important prelude to analyzing GWAS data. Using over 350,000 genome-wide autosomal SNPs in over 6000 Han Chinese samples from ten provinces of China, our study revealed a one-dimensional “north-south” population structure and a close correlation between geography and the genetic structure of the Han Chinese. The north-south population structure is consistent with the historical migration pattern of the Han Chinese population. Metropolitan cities in China were, however, more diffused “outliers,” probably because of the impact of modern migration of peoples. At a very local scale within the Guangdong province, we observed evidence of population structure among dialect groups, probably on account of endogamy within these dialects. Via simulation, we show that empirical levels of population structure observed across modern China can cause spurious associations in GWAS if not properly handled. In the Han Chinese, geographic matching is a good proxy for genetic matching, particularly in validation and candidate-gene studies in which population stratification cannot be directly accessed and accounted for because of the lack of genome-wide data, with the exception of the metropolitan cities, where geographical location is no longer a good indicator of ancestral origin. Our findings are important for designing GWAS in the Chinese population, an activity that is expected to intensify greatly in the near future.
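To make the “spurious association” point concrete, here is a toy simulation in plain Python. It is my own sketch, not the method of either paper. The allele frequencies, sample sizes, and region labels are invented, and the north-south frequency gap is exaggerated far beyond the tiny FST values reported above so that the effect shows up with a small sample. The SNP has no effect on disease at all; cases are simply drawn mostly from the north and controls mostly from the south.

```python
import random

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def sample_alleles(rng, n_people, freq):
    """Copies of the tracked allele among 2 * n_people chromosomes."""
    return sum(rng.random() < freq for _ in range(2 * n_people))

rng = random.Random(1)

# Hypothetical numbers: same allele frequency for cases and controls
# within a region, but the two regions differ from each other.
freq = {"north": 0.25, "south": 0.45}
cases = {"north": 800, "south": 200}     # cases mostly northern
controls = {"north": 200, "south": 800}  # controls mostly southern

# Per-region table: (case allele, case other, control allele, control other)
table = {}
for region in freq:
    ca = sample_alleles(rng, cases[region], freq[region])
    co = sample_alleles(rng, controls[region], freq[region])
    table[region] = (ca, 2 * cases[region] - ca,
                     co, 2 * controls[region] - co)

# Ignore geography: add the regional tables cell by cell.
pooled = [sum(t[i] for t in table.values()) for i in range(4)]
pooled_chi2 = chi2_2x2(*pooled)

# Respect geography: test cases against controls within each region.
within_chi2 = {r: chi2_2x2(*t) for r, t in table.items()}

print(f"pooled chi-square: {pooled_chi2:.1f}")
for r, v in within_chi2.items():
    print(f"{r} only chi-square: {v:.1f}")
```

The pooled test picks up the regional frequency difference and reports it as a disease association, while the within-region tests compare like with like. That is the logic behind geographic matching as a proxy for genetic matching.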

Maps of diabetes & obesity

Hope readers have a happy Thanksgiving. I assume this is also a day when you’re not going to think too much about your diet and eat what you want to eat. But I thought this map on diabetes and obesity for those age 20 and up was interesting. These are estimates, which I think explains the rather sharp boundaries at state lines (since state-level data were probably used to predict county values, see the methods here). To my knowledge the cuisine of the Upper Midwest and New England gets about as much props as that of England (vs. “Southern home cooking”), but hotdish can’t be all that unhealthy? 🙂 H/T Ezra Klein.