The Inheritance of Inequality: Big Insight, Small Error


Gintis and Bowles have done great work cleaning up a lot of the discussion about cooperation, evolution, and economic outcomes. A Google Scholaring of their names turns up 14 items with over 100 citations, most of which would be well worth reading for GNXP regulars.

But that said, in their 2002 Journal of Economic Perspectives piece “The Inheritance of Inequality,” they appear to make a small error. It’s an error that’s all-too-easy for even good folks to make: They apparently squared the h-squared.

Their big insight and their small error are all part of answering a simple question: How much of the correlation of income between parent and child can be explained by the heritability of IQ? You might think it’s straightforward: IQ is highly heritable, so if there’s some channel linking IQ to income, then it’s all over but the shouting.

But numbers matter. And Gintis/Bowles work out the numbers, finding that there’s a weak link in that causal chain: The low correlation (0.27 according to Gintis and Bowles) between IQ and wages. The causal chain goes like this:

1. Parental earnings have a 0.27 correlation with parent’s IQ.
2. The IQ link between parent and child is a bit more than 1/2 of h-squared (why a bit more? assortative mating). They take an h-squared of 0.5 for IQ, which puts this link at about 0.3.
3. Child’s earnings have a 0.27 correlation with child’s IQ.

So the net result is 0.27*0.3*0.27 = 0.022 (page 10). A very small number, especially since the raw parent-child income correlation in U.S. data is about 0.4. So yes, knowing a parent’s income helps you predict their adult (especially male) child’s income. But only 5% (or 0.022/0.4) of the total correlation can be explained by IQ’s impact on wages. Small potatoes.
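The arithmetic above is easy to check in a few lines. This is only a sketch of the chained-correlation bookkeeping as described in this post; the variable names are mine, and the 0.3 transmission factor is the rounded "a bit more than half of h-squared" figure used in the product above.

```python
# Chained-correlation arithmetic from the post (variable names are illustrative).
r_iq_earnings = 0.27          # IQ-earnings correlation, per Gintis and Bowles
iq_transmission = 0.3         # parent-child IQ link, a bit more than h^2 / 2
r_parent_child_income = 0.4   # raw U.S. parent-child income correlation

# parent earnings -> parent IQ -> child IQ -> child earnings
chained = r_iq_earnings * iq_transmission * r_iq_earnings
share = chained / r_parent_child_income

print(f"chained correlation: {chained:.3f}")
print(f"share of the observed correlation: {share:.1%}")
```

The share prints as roughly 5.5%, which the post rounds to 5%.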

(Oh, but where’s the small error? It’s where Gintis and Bowles report that the net result is 0.01 instead of 0.022–a difference that I can most easily attribute to a mistaken squaring of the h-squared.)

If I really wanted to get that net result up from a measly 5%–if I knew in my heart that IQ really was a driving force in intergenerational income inequality–then how would I do it? Well, I might use a higher heritability of IQ, I might assume more assortative mating, or I might assume a bigger correlation between wages and IQ.

Hard to do much to budge that IQ/wage link: Zax and Rees’s paper only has a 0.3 correlation between teenage IQ and middle-aged wages, and when Cawley, Heckman et al. regress NLSY wages on the first 10 principal components of the AFQT, they get a similar result.

So you think maybe a higher heritability of IQ will save you? Well, let’s just go all the way to perfect heritability of IQ and perfect assortative mating on IQ. In other words, let’s see if “IQ clones” will have enough similarity in wages to match the 0.4 intergenerational correlation of income.

Will the IQ clones have similar incomes? Not so much. (0.3^2)*1 still equals something small: 0.09. Less than 1/4 of the intergenerational correlation in income. Medium-sized potatoes, but we had to make a ton of ridiculous assumptions to get there.
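To see how little room there is, here is a small sensitivity sketch. The grid values are my own illustrative choices (only 0.27, 0.3, and the "IQ clones" value of 1.0 come from the discussion above), and `iq_channel` is a hypothetical helper name, not anything from the paper.

```python
# Sensitivity sketch: how far can the IQ channel be pushed?
OBSERVED = 0.4  # parent-child income correlation cited in the post

def iq_channel(r_iq_wage, transmission):
    """Implied parent-child income correlation if IQ were the only link."""
    return r_iq_wage * transmission * r_iq_wage

rows = []
for r_iq_wage in (0.27, 0.3):
    for transmission in (0.3, 0.5, 1.0):  # 1.0 is the "IQ clones" case
        implied = iq_channel(r_iq_wage, transmission)
        rows.append((r_iq_wage, transmission, implied, implied / OBSERVED))

for r, t, implied, share in rows:
    print(f"r={r:.2f}  transmission={t:.1f}  implied={implied:.3f}  share={share:.0%}")
```

Even the most generous row stays under a quarter of the observed correlation, because the IQ-wage correlation enters twice.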

It’s that doggone low correlation between IQ and wages, a correlation that has to be squared because we’re comparing parent to child. So a high heritability of IQ doesn’t imply a high heritability of IQ-caused-income. Another reminder that lots of things impact your wages: Not just how smart you are.

Gintis and Bowles work through some finger exercises to argue for big environmental effects, and that’s all well and good. But to my mind, the interesting fact is that income is still highly heritable!

G/B report that MZT (identical twin) earnings correlation is 0.56, and DZT (fraternal twin) earnings correlation is 0.36, so using the crudest of approximations, the heritability of earnings is still (0.56-0.36)*2=0.4. So income apparently has a modestly high heritability, but most of it can’t be explained by the IQ-wage channel. Looks like the genetic heritability of income is being driven mostly by non-IQ channels.
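The crude approximation used here is Falconer's formula. A minimal sketch follows; the shared-environment and nonshared-environment lines are the standard companions to that formula and are my addition, not something computed in the post or attributed to G/B.

```python
# Falconer-style decomposition from twin correlations (a crude approximation).
r_mz = 0.56  # identical-twin earnings correlation reported by G/B
r_dz = 0.36  # fraternal-twin earnings correlation reported by G/B

h2 = 2 * (r_mz - r_dz)  # additive genetic share, as in the post
c2 = 2 * r_dz - r_mz    # shared-environment share (my addition)
e2 = 1 - r_mz           # nonshared environment plus measurement error (my addition)

print(f"h2={h2:.2f}  c2={c2:.2f}  e2={e2:.2f}")
```

The three shares sum to one by construction, so the formula is an accounting identity under its assumptions rather than a test of them.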



  1. Connections and cultural fluency probably play some role in earning a high income and I imagine their impact becomes more significant for higher IQ folk. I think those are associated with coming from a high SES background.

  2. half sigma asserts that once you hit a relatively low threshold of IQ (e.g., 115), cranking up your personal connections is a better bet for marginal returns than more IQ. based on what, i don’t know.

  3. time preferences

  4. Also, isn’t there a difference in the degree to which iq correlates to income for an individual and for a group? I seem to recall Sailer saying that while iq tells us little about how a single individual will turn out, it tells us much more when we are dealing with average iq’s and average incomes for a large group of people. Or am I getting this wrong?

    is probably (partially) it. Executive control and one’s degree of ADD-ness matter a lot for these things.

  6. LL: 
    If by “large group of people” you mean “large group of people,” then no, IQ is apparently still a poor predictor of group wages. Just look at the results from Jones and Schneider.  
    Each of the observations in Figure 2 of the paper (page 27) is an average wage of a “large group of people”: Immigrants to the U.S. from a particular country. It’s a U.S. census result, so each observation may be the average wage of hundreds, even thousands of immigrants (The data adjust for the impact of education, but they report that the raw data tell the same story, acc. to Section IV).  
    So yes, the R-squared of 22% is higher for groups than for individuals (which has an R-squared of 10% or so), and maybe that’s what Sailer is talking about. But the ‘effect size,’ as psychologists like to say, is almost exactly the same: One IQ point predicts a mere 1% higher wages between groups.  
    So even going from an IQ of 70 to 100–two full standard deviations–might yield about 30% higher wages. Small potatoes indeed! One good reason to be sanguine about immigration from low-average-IQ countries.  
    Speaking of Sailer, Jones and Schneider don’t take the Ross Douthat approach in their paper: They refer to Sailer’s very useful data table based on Lynn and Vanhanen’s IQ and the Wealth of Nations a couple of times.
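The back-of-envelope wage gap in this comment can be written out as follows. This is only a sketch: the 1%-per-point figure is the one quoted above, and the compounded version is my own assumption about how a percentage effect might stack, not something taken from the Jones and Schneider paper.

```python
# Wage gap implied by "1% higher wages per IQ point" over a 30-point difference.
pct_per_point = 0.01   # effect size quoted in the comment
iq_gap = 100 - 70      # 30 points, i.e. two standard deviations of 15

linear_gap = pct_per_point * iq_gap               # simple sum of 1% steps
compound_gap = (1 + pct_per_point) ** iq_gap - 1  # if each 1% compounds (assumption)

print(f"linear: {linear_gap:.0%}  compounded: {compound_gap:.0%}")
```

Either way the gap is in the neighborhood of a third, which is the "about 30%" in the comment.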

  7. This linear model tomfoolery is silly.  
    If I were talking to Gintis/Bowles, I’d say: make a table with 6 columns: mother/father IQ & income, child IQ and income. Then do pairs() in R. Then show me that pairs plot. Then we can start conditioning on variables. Chaining correlations blindly in this fashion is only done by people who don’t understand that data lives in weirdly shaped point clouds. In particular, r_{xz} != r_{xy}r_{yz} in general.

  8. PS Herrick: no personal diss intended against you. Just that this fetishization of correlations drives me up the wall. I want scatterplots!

  9. David Rowe and his colleagues objected strenuously to this chain model of causation. They argued that IQ and income should load on a common factor. 
    As sometimes occurs in path models with latent variables, the semantic interpretation is not entirely clear. 
    Regardless, the results “feel” different. 35 and 12 percent, respectively, of the variances in IQ and income are accounted for by genetic influences affecting both variables. 30 percent of the variance in income is accounted for by genetic influences that do not affect either IQ or educational attainment. Interestingly, 49 percent of the variance in income is accounted for by nonshared environmental factors that do not affect either IQ or education.

  10. gc-  
    You are absolutely correct. This paper is total nonsense. You can’t naively chain the correlations. 
    Out of curiosity I just created a python script to generate sample data. Just using very basic assumptions, I was able to create a dataset of 500 parent child pairs where the following was true: 
    Parental IQ and wage had a .23 correlation. 
    Child IQ and wage had a .51 correlation 
    Parent IQ and Child IQ had a .33 correlation 
    Parent earnings and child earnings had a .15 correlation. 
    If you simply multiply the values, you get .23*.33*.55 = .02 
    Thus naively multiplying the correlations can make the results go off by an order of magnitude! 
    Here is the code I used to generate the dataset ( note, I made IQ linear for simplicity, all I wanted to prove was that just multiplying the correlations will not work in all circumstances): 
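    import random

    def runtest():
        # Draw one parent-child pair; IQ is uniform ("linear") for simplicity.
        parent_iq = random.randint(85, 115)
        child_iq = parent_iq + random.randint(-40, 40)
        parent_earnings = parent_iq * 600 + random.randint(-40000, 40000)
        child_earnings = child_iq * 600 + random.randint(-40000, 40000)
        print(parent_iq, child_iq, parent_earnings, child_earnings, sep=",")

    # Generate 500 parent-child pairs, one comma-separated row per pair.
    for x in range(0, 500):
        runtest()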
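For anyone who wants to rerun this, here is a sketch that reuses the same generator with a much larger sample and a fixed seed, then computes the four correlations directly. The sample size, the seed, and the `pearson` and `simulate` helpers are my additions. One caveat worth stating: in this particular generator, parent and child earnings are linked only through the IQ chain and every step is linear, so the chained product and the endpoint correlation should agree up to sampling noise. With only 500 pairs the standard error of a correlation is roughly 0.045, so a single run can land far from the population value.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def simulate(n, seed=0):
    """Same data-generating process as the comment's script, returned as columns."""
    rng = random.Random(seed)
    p_iq, c_iq, p_earn, c_earn = [], [], [], []
    for _ in range(n):
        piq = rng.randint(85, 115)
        ciq = piq + rng.randint(-40, 40)
        p_iq.append(piq)
        c_iq.append(ciq)
        p_earn.append(piq * 600 + rng.randint(-40000, 40000))
        c_earn.append(ciq * 600 + rng.randint(-40000, 40000))
    return p_iq, c_iq, p_earn, c_earn

p_iq, c_iq, p_earn, c_earn = simulate(100_000)
chain = pearson(p_earn, p_iq) * pearson(p_iq, c_iq) * pearson(c_iq, c_earn)
endpoint = pearson(p_earn, c_earn)
print(f"chained product: {chain:.3f}  endpoint correlation: {endpoint:.3f}")
```

Calling `simulate(500, seed=k)` for several values of `k` gives a feel for how much the endpoint correlation bounces around at the original sample size.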

  11. One of the obvious non-genetic causes, at least in Britain, is geographical. Overpaid Londoners’ children get overpaid London jobs.

  12. Just anecdotally, looking at dozens of people I’ve known, a successful parent can just keep a lame kid in the middle class with connections, gifts, and credit, though just barely, but the grandchildren’s chances won’t be so good unless the grandparents are rich and long-lived enough to take care of them. 
    Except for the most talented, most disciplined 10% or 20%, I suspect that the success rate of my college class has a lot to do with the amount of family support they could count on. Maybe for them too.

  13. Libra — right.  
    More generally, think about a multivariate normal. There are (weak) constraints on the interrelationships of the correlations because the correlation matrix must be positive semidefinite. However, that still leaves a lot of room for flexibility.  
    Here’s a simple example in R:  
    > library(mvtnorm) 
    > N = 500; mu = c(0,0,0); sigma = matrix(c(1,.1,.8,.1,1,.2,.8,.2,1),3,3,byrow=TRUE); xx = rmvnorm(N,mu,sigma) 
    > mu 
    [1] 0 0 0 
    > sigma 
         [,1] [,2] [,3]
    [1,]  1.0  0.1  0.8
    [2,]  0.1  1.0  0.2
    [3,]  0.8  0.2  1.0
    > head(xx) 
    [,1] [,2] [,3] 
    [1,] -0.2109703 -1.45252910 -1.0041174 
    [2,] -0.9966107 -0.11467753 -0.6215200 
    [3,] -1.7997072 1.71219427 -0.6523481 
    [4,] 2.4727438 0.04898101 2.0142931 
    [5,] -0.3277056 -2.00856426 -0.0851483 
    [6,] 0.8441157 1.75704248 0.3519600 
    > cor(xx) 
    [,1] [,2] [,3] 
    [1,] 1.0000000 0.1301810 0.8124007 
    [2,] 0.1301810 1.0000000 0.2120614 
    [3,] 0.8124007 0.2120614 1.0000000 
    > cor(xx)[1,2] * cor(xx)[2,3] 
    [1] 0.02760636 
    > cor(xx)[1,3] 
    [1] 0.8124007
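The "weak constraints" from positive semidefiniteness can be made explicit for three variables: requiring a nonnegative determinant gives a closed-form interval for the third correlation. A minimal sketch in Python, with `r13_bounds` as a hypothetical helper name:

```python
from math import sqrt

def r13_bounds(r12, r23):
    """Range of r13 that keeps a 3x3 correlation matrix positive semidefinite."""
    slack = sqrt((1 - r12 ** 2) * (1 - r23 ** 2))
    return r12 * r23 - slack, r12 * r23 + slack

# Values from the sigma matrix in the R session above.
lo, hi = r13_bounds(0.1, 0.2)
print(f"r13 can lie anywhere in [{lo:.3f}, {hi:.3f}]")
```

The chained product r12*r23 is just the midpoint of that interval, and the 0.8 used above sits comfortably inside it.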

  14. btw, Libra, if you haven’t seen it, check out rpy and sagemath’s R library sometime…I’ve been using both to equip python with some serious number crunching ability. You probably also know about numpy…they (= sage and numpy) are coming together to make the ultimate python numerical/mathematical library. Fun times.

  15. I remember some economics paper showing that college-educated left-handers have higher incomes (30%?) than their right-handed counterparts. My guess is that the key driver of this difference is the ruthlessness of the left-handers (actually it’s not just a guess).  
    It might also be the case that although in childhood left-handers lag behind right-handers in IQ, they have a longer neoteny, their brains mature slightly later, so if you re-test them in their 30s maybe they beat the right-handers. I would like to see somebody doing exactly this research (I know from Stanley Coren that they mature later, but don’t have data specifically on differential IQ trajectories in early adulthood).  
    The problem with handedness is that it can be either the result of genes or of prenatal troubles, which means that it doesn’t map neatly into “heritable”.

  16. The NLSY data should be able to address these questions directly, no?

  17. … which is the source that Rowe et al. used :) … 
    Genotypes may influence the phenotypic associations among IQ, education, and income. To investigate this hypothesis, we believe that the appropriate methodology requires estimation of genetic and environmental influences using data able to separate these influences. The National Longitudinal Survey of Youth (NLSY) is a nationally representative sample that contains genetically-informative full- and half-siblings (28-35 years old in 1992; Ns=1943 full-siblings, 129 half-siblings). A biometric genetic model was fit that estimated the shared environmental and genetic variance components of IQ, years of education, and hourly income. The total heritabilities were 0.64 for IQ, 0.68 for education, and 0.42 for income. Heritabilities due to a common genetic factor were 0.35 for IQ, 0.52 for education, and 0.12 for income. Environmental influences due to a common shared environmental factor were 0.23 for IQ, 0.18 for education, and 0.08 for income. The model predicted a correlation of 0.63 between IQ and education and 0.34 between IQ and income. Sixty-eight percent of the former and 59% of the latter was genetically mediated; the remainder was mediated by common shared environment. These findings suggest that social inequality in the United States has its origin in both genetically-based traits and in different environmental backgrounds. 
    There should also be data in the NLSY to test HS’s theory.
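The predicted correlations in the abstract can be roughly reproduced from the common-factor variance shares it quotes. This sketch assumes, and it is my assumption about how the model works rather than something the abstract states, that each trait's loading on a common factor is the square root of its variance share and that all loadings are positive. The helper name `predicted_r` is hypothetical.

```python
from math import sqrt

# Variance shares due to the two common factors, as quoted in the abstract.
common_genetic = {"iq": 0.35, "education": 0.52, "income": 0.12}
common_shared_env = {"iq": 0.23, "education": 0.18, "income": 0.08}

def predicted_r(a, b):
    """Correlation implied by the two common factors: (total, genetic part)."""
    genetic = sqrt(common_genetic[a] * common_genetic[b])
    shared = sqrt(common_shared_env[a] * common_shared_env[b])
    return genetic + shared, genetic

for a, b in (("iq", "education"), ("iq", "income")):
    total, genetic = predicted_r(a, b)
    print(f"{a}-{b}: r={total:.2f}  genetically mediated={genetic / total:.0%}")
```

Under these assumptions the totals come out close to the abstract's 0.63 and 0.34. The genetic share for IQ-income lands about a point above the reported 59%, presumably because the quoted inputs are rounded.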

  18. Adoption studies should be helpful here, no?

  19. B&G NEVER claim, contrary to the assertions of gc and Libra, that chaining the correlations will lead to the same correlation as correlating the “endpoints” of the chain. In fact, that entire paper is devoted to analyzing the difference between those two quantities. B&G clearly are aware of the difference. 
    Recall that B&G start off with the parent child correlation. They then engage in the following thought experiment. Suppose the only source of parent child resemblance in income is due to the genetic transmission of IQ. Then what parent-offspring correlation in income might we expect given what we know about the effect of IQ on earnings and the heritability of IQ?  
    There are a number of things one might question about the B&G accounting framework, but I think that the points raised by gc and Libra are not consistent with the contents of the paper.

  20. I would not expect Gintis to make a major error of the type GC mentions. He seems to be a sharp guy with top math skills.

  21. Steve: 
    Adoption studies won’t be helpful: They will control for genes but not IQ per se. And part of the Gintis/Bowles point is that the heritability of IQ explains only a modest fraction of the genetic heritability of income.  
    We can debate the precise fraction but “substantially less than 1/2” seems to be a good starting point. Methinks that even with an ideal measure of g rather than IQ, the story wouldn’t change much, though no evidence springs to mind other than the fact that even if you double or triple 0.02, you’ve still got small potatoes.

  22. so, whatever the missing puzzle piece is, it’s heritable, but it doesn’t appear to be IQ. wouldn’t personality traits be a good place to look next?

  23. ben g 
    That is exactly what B&G suggest. The problem is that personality is an elusive concept to measure.

  24. No it’s not. Psychometric personality measures go back at least to the 1970s — Big Five and Eysenck’s Personality Questionnaire. 
    Time preferences are also not elusive to measure.

  25. No it’s not. Psychometric personality measures go back at least to the 1970s — Big Five and Eysenck’s Personality Questionnaire. 
    Time preferences are also not elusive to measure.
    Just because there are measures for a trait does not mean it is easily measured or measured correctly and consistently. This is really important in the case of personality traits, since personality is not something that can be observed and measured visually like height, for example. 
    This is also relevant for IQ. It is the single best measure (AFAIK) we currently have of intelligence. I have heard arguments that it should be viewed as an imperfect measure of a trait that cannot be viewed and measured directly, unlike the circumference of someone’s head – it’s an argument I find quite reasonable. And from what I read, there might be better measures of intelligence on the horizon that do not depend on one’s performance on paper-and-pencil tests.

  26. JG, 
    The Big-Five traits, like IQ, have a high degree of reliability and validity. In plain English, that means that people who take the tests multiple times get similar scores each time, and that the tests predict with a great degree of accuracy what you would expect/want them to predict. In the case of personality, people’s friends, families, peers, and teachers tend to give them ratings similar to the ones they give themselves. The personality tests also predict a variety of life outcomes pretty well– neurotic people have higher rates of depression/anxiety, extraverted people are happier, conscientious people are harder workers, etc. 
    You raise a separate, and valid, point though: 
    I have heard arguments that it should be viewed as an imperfect measure of a trait that cannot be viewed and measured directly, unlike the circumference of someone’s head – it’s an argument I find quite reasonable. And from what I read, there might be better measures of intelligence on the horizon that do not depend on one’s performance on paper-and-pencil tests. 
    this is a plausible argument (and a correct one, in my opinion). the thing is, though, no matter what advances are made in science, measures like IQ and the big-5 personality traits will remain the de-facto standard for accurately measuring and representing the differences between peoples’ behaviors. 
    I like this analogy– the IQ and personality tests are like a thermometer. The actual way that genes and environment interact to create patterns of behaviors is like thermodynamics. You don’t need a detailed theory of thermodynamics to understand that a thermometer measures heat.

  27. Kids that grow up in a high SES home spend their life seeing their parents make lots of money. Think of it as a decades-long education process in how to maximize earnings potential. I think if my dad were a wealthy entrepreneur, for example, I’d have a lot of exclusive experience that’d give me an edge when I started my own business. I’d also have the benefit of my parents encouraging me to become an entrepreneur (and the benefit of connections if I jumped in). If I’m really lucky, my parents might give me some startup capital. In comparison, an equally bright kid from the slums might not have the know-how, connections, or encouragement for entrepreneurship. So he’ll get his degree and go work for Microsoft for $70k a year.  
    You see a stronger IQ correlation on things like education because graduating HS and getting through college is fairly straightforward. A poor kid might not have anyone that can get him into the Ivy Leagues, but he can make it through CC and a regular state school on his own.

  28. B&G NEVER claim, contrary to the assertions of gc and Libra, that chaining the correlations will lead to the same correlation as correlating the “endpoints” of the chain. In fact, that entire paper is devoted to analyzing the difference between those two quantities. B&G clearly are aware of the difference. 
    Perhaps, as I’ll have to read it more closely, but Herrick does so in the post here, by comparing (0.3)^2*1 to .4 and saying that “less than 1/4 of the intergeneration correlation is income.” Similarly, talking about the “causal chain” and “only 5% (or 0.022/0.4) of the total correlation can be explained by IQ’s impact on wages” is doing exactly that. 
    Correlation measures the linear relationship between two random variables. Multiplying them in order to determine what percentage of one correlation “is explained” by another is a bad idea and bad mathematics. 
    It’s particularly awful if non-linear factors are involved. To take an example (4.22) from Counterexamples in Probability and Statistics by Romano and Siegel, if X and Y are Gaussian normal r.v. with any means and std. dev. 1 and 4, respectively, and we let W = exp(X) and Z = exp(Y), then the correlation between W and Z is constrained to fall in the range -0.000251 to 0.01372. A correlation of 0.01372 between W and Z occurs when X and Y are perfectly correlated (r = 1). (Other easier examples involve the demonstration that a set of data with a perfect quadratic fit may have a correlation of 0.)
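The range quoted in this comment can be checked with the closed-form correlation of two lognormal variables. A minimal sketch, assuming X and Y are jointly (bivariate) normal; the function name `lognormal_corr` is mine.

```python
from math import exp, sqrt

def lognormal_corr(rho, sd_x, sd_y):
    """Correlation of exp(X) and exp(Y) for bivariate normal X, Y with correlation rho."""
    num = exp(rho * sd_x * sd_y) - 1
    den = sqrt((exp(sd_x ** 2) - 1) * (exp(sd_y ** 2) - 1))
    return num / den

# Standard deviations 1 and 4, as in the cited example; the means drop out.
lo = lognormal_corr(-1.0, 1.0, 4.0)
hi = lognormal_corr(1.0, 1.0, 4.0)
print(f"{lo:.6f} to {hi:.5f}")  # prints "-0.000251 to 0.01372"
```

The underlying normals can be perfectly correlated while the exponentiated variables are almost uncorrelated, because a linear correlation only measures how well a straight line fits.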

  29. I get the impression that some of the commenters here don’t understand the method of proof by reductio ad absurdum. Surely B&G are trying to test the hypothesis that the correlation between parents’ and offspring’s income is mainly or solely due to genetically inherited IQ. On this hypothesis, given the size of the other relevant correlations, we would expect the income correlation to be much smaller than it actually is. The hypothesis is therefore (prima facie) refuted. In tracing the consequences of the hypothesis, B&G do indeed ‘chain the correlations’, but that seems quite legitimate for the purpose at hand. If anyone wants to argue that there is some fancy non-linear relationship among the variables such that the hypothesis can still be maintained, I think the onus is on them to be more specific.

  30. David B said what I was thinking, and more elegantly at that. What B & G have done is challenged the supporters of the Bell Curve argument to come up with a new argument, because the intuitive correlation-based one they’ve been using for years just got falsified.

  31. David B and ben g: 
    This is my intuition as to why the chain of correlation argument is incorrect – 
    The genetic transmission of income and IQ are each due to some set of genes [1]. *Suppose* that 100% of the (0.3) correlation between IQ and income is due to a common set of genes. Of course there are also genes which regulate IQ and not income and vice-versa, which is one reason why the IQ-income correlation isn’t 1.0. Likewise, the heritabilities of IQ and income aren’t identical because they also have their own distinct developmental pathways aside from the common variants. Yet the underlying genetic co-transmission of income and IQ is much stronger than the chain of correlation calculation permits because in this scenario many IQ boosting alleles are also income boosting alleles — they aren’t independent. 
    Per the Rowe paper [2], the actual shared heritability is less than 100%. Someone can go through the exercise of figuring out what the correct number is using the data from the Rowe paper, but it’s going to be more than 2%. 
    [1] It’s not a “set of genes” but a set of polymorphisms, but it reads better that way 

  32. lol: I may have misunderstood your example, but I don’t see how it gets round the problem. To simplify the logic to the max, suppose we are dealing with 3 traits of the same individuals, (A) IQ, (B) and (C), 2 other traits positively correlated with IQ and with each other, say, music appreciation and size of vocabulary. Suppose, if you like, that IQ is totally genetically determined, which seems to embrace your own assumption. 
    Now, we wish to test the hypothesis that the positive correlation between B and C is entirely due to their correlations with A (IQ). As an important supplementary assumption, we assume that all the correlations and regressions involved are linear. 
    With these assumptions, the hypothesis to be tested implies that the partial correlation between B and C, given A, is zero. It follows from this that the expected bivariate correlation between B and C, rBC, equals rAB.rAC. In other words, on these assumptions, it is legitimate to ‘chain the correlations’.  
    But now suppose in fact that the observed correlation between B and C is much larger than rAB.rAC. Something in the assumptions must be wrong: most obviously, the assumption that the correlation between B and C is entirely due to their correlations with A. Now, it is logically possible that the error is in the supplementary assumption that all the correlations involved are linear. But note that (a) this assumption is separately testable, and in fact most such correlations are at least approximately linear; and (b) to save the hypothesis, the non-linearity would have to be of a rather peculiar kind, which raises the (linear) correlation between B and C despite the (assumed) fact that there is no connection between them other than their connection with A. I can’t visualise what such a non-linear relationship would be, which is why I have said that I think the burden of proof is on those who wish to advocate it.
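The link between a zero partial correlation and chaining can be written down with the usual first-order partial correlation formula. This is a sketch with hypothetical numbers chosen only for illustration; `partial_corr` is my own helper name.

```python
from math import sqrt

def partial_corr(r_bc, r_ab, r_ac):
    """Partial correlation of B and C given A, from the three pairwise correlations."""
    return (r_bc - r_ab * r_ac) / sqrt((1 - r_ab ** 2) * (1 - r_ac ** 2))

# Hypothetical values for illustration only.
r_ab, r_ac = 0.3, 0.3

# If r_BC equals the chained product, nothing is left once A is held fixed.
print(partial_corr(r_ab * r_ac, r_ab, r_ac))

# If the observed r_BC is much larger, a sizeable partial correlation remains,
# so something besides A must connect B and C.
print(partial_corr(0.4, r_ab, r_ac))
```

The formula presupposes linear relationships, which is the supplementary assumption discussed above.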