December 20, 2003

In like Flynn

Deep breath...

In my last musings on the subject of IQ comparisons, I said (threatened?) that I might return to the subject of the Flynn Effect.

The political scientist James R. Flynn was the first to draw attention to the fact that average intelligence test scores in most industrialised countries have increased substantially over a period of decades (Flynn [1], [2] - see the references below). Before Flynn, psychologists had occasionally mentioned that IQs had risen (e.g. Vernon, p. 207), but showed remarkably little curiosity about the phenomenon, so Flynn deserves the credit for highlighting it.

Here I am interested in the questions: how large is the cumulative Flynn Effect? How long has it been going on? And is it still continuing?

A rise in mean IQ scores has been found in almost every period in every industrialised country where the question has been studied (for a few exceptions see Storfer p. 97). The rate of increase is usually between 2 and 4 IQ points per decade, but rates as low as 1 and as high as 8 points per decade have been found. The largest increases are usually on non-verbal tests like Raven’s Progressive Matrices, the smallest on individually administered general tests of the Binet type.

Related interjection from Razib: Matthew Yglesias has a post on the black-white IQ gap.

The largest cumulative increases I can find in the literature are as follows:

West Germany 1954-1981: a 20-point gain by children on the German form of WISC [Storfer, p. 96]
France 1949-1974: a 20-25 point gain on Raven’s among military recruits [Storfer, p. 96]
Netherlands 1952-1982: a 20-point gain on Raven’s among military recruits [Storfer, p. 97, cumulating the growth rates for three periods]
Japan post-WWII: a 20-point gain over a 24-year period on the Japanese version of WISC [Storfer, p. 99. From Storfer’s account I infer that the period was from the early 1950s to the mid-1970s.]
USA 1918-1995: Flynn, in Neisser (ed.), p. 37, gives a figure of 25 points for the increase in Binet-type (Stanford-Binet or Terman-Merrill) test scores over this period.
UK 1900-1992: Flynn, in Neisser (ed.), p. 33, estimates that most Britons born in the late 19th century would have scored below an IQ of 75 on modern norms for Raven’s, implying an increase of 25 points or more over this period (see below).

How far back does the Flynn Effect go? IQ tests only began around 1905 (Binet-Simon) and adequate national standardisation samples are not found until the 1930s. However, Flynn (Neisser, p. 36) gives evidence that the rising trend in the USA started no later than 1918. He also presents intriguing data from the Raven’s standardisation samples in the UK. The 1942 standardisation sample included adults aged up to 65 (and thus born from 1877 onwards), and Flynn concludes that even after allowing for decline of IQ with age, 70 per cent of Britons born in the late 19th century would have scored below an IQ of 75 on current (1990s) norms. This implies a mean IQ of not more than 70 on current norms [see note 1].

Is the increase still continuing? There is conflicting evidence on this. Lynn and Pagliari found that in the USA the rise continued unabated at least until 1989. However, Teasdale and Owen’s [2] study of Danish army recruits suggests that the rate of increase is slowing down, and is now ‘modest’.

If we assume that gains have continued in recent years at a rate of 2 points per decade, then the cumulative figures given above can be updated to give total gains up to 2002 as follows:

West Germany 1954-2002: 24 points
France 1949-2002: over 25 points
Netherlands 1952-2002: 24 points
Japan 1950-2002: 26 points
USA 1918-2002: 26 points
UK 1900-2002: 30 points or more.
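
The updated figures can be reproduced with a few lines of arithmetic. What follows is only a sketch of the extrapolation described above: the names are mine, the end year of 1974 for Japan is an assumption (a 24-year period starting in 1950), and France and the UK are left out because their starting figures are ranges or lower bounds rather than single numbers.

```python
# Sketch: extend each measured gain to 2002 at an assumed 2 IQ points per decade.
ASSUMED_RATE_PER_DECADE = 2.0
TARGET_YEAR = 2002

# country -> (measured gain in IQ points, last year of the measured period)
# Japan's end year (1974) is an assumption: a 24-year period starting in 1950.
measured = {
    "West Germany": (20, 1981),
    "Netherlands": (20, 1982),
    "Japan": (20, 1974),
    "USA": (25, 1995),
}


def gain_to_target(gain, end_year,
                   rate=ASSUMED_RATE_PER_DECADE, target=TARGET_YEAR):
    """Measured gain plus the assumed gain between end_year and target."""
    extra_decades = (target - end_year) / 10
    return gain + rate * extra_decades


# Round to whole IQ points, as in the list above.
updated = {
    country: round(gain_to_target(gain, end_year))
    for country, (gain, end_year) in measured.items()
}

for country, total in updated.items():
    print(f"{country}: {total} points")
```

Changing ASSUMED_RATE_PER_DECADE shows how sensitive the totals are to the assumption about recent decades.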

I’m sure you can guess where this is leading! By subtracting the cumulative ‘Flynn Effect’ from today’s IQ levels we can estimate the average IQ levels (by today’s norms) of these countries at relevant dates in the past. These levels can then be inserted for comparison into the table of national IQs provided by Lynn and Vanhanen. Here is the result, for selected industrialised and other countries (historic figures are the entries with a year in parentheses; the others are current figures as given by Lynn and Vanhanen [see note 2]).

Japan .................. 105
Germany ................ 102
Netherlands ............ 102
UK ..................... 100
USA (white) ............ 100
France ................. 98
USA (all races) ........ 98
Czech Republic ......... 97
Greece ................. 92
Malaysia ............... 92
Indonesia .............. 89
Iraq ................... 87
Mexico ................. 87
USA (blacks) ........... 85
Egypt .................. 83
India .................. 81
Guatemala .............. 79
Japan (1950) ........... 79
Netherlands (1952) ..... 78
Zambia ................. 77
Germany (1954) ......... 76
USA (white, 1918) ...... 74
Uganda ................. 73
Congo (Braz.) .......... 73
France (1949) .......... 73 or below
Jamaica ................ 72
UK (1900) .............. 70 or below
Nigeria ................ 67
Sierra Leone ........... 64
Equatorial Guinea ...... 59
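
The historic entries are nothing more than a subtraction, which the sketch below spells out using the current IQs and cumulative gains quoted in this post. The dictionary names are mine. For France and the UK the gain is a lower bound, so the result is an upper bound (‘or below’). On these inputs the subtraction gives 78 for Germany, against the 76 shown in the table; the other rows agree.

```python
# Current national IQs as listed in the table (Lynn and Vanhanen's figures).
current_iq = {
    "Japan": 105,
    "Germany": 102,
    "Netherlands": 102,
    "UK": 100,
    "USA (white)": 100,
    "France": 98,
}

# country -> (start year, cumulative gain to 2002), from the updated list.
# For France ("over 25") and the UK ("30 or more") the gain is a lower bound,
# so the historic IQ computed from it is an upper bound.
cumulative_gain = {
    "Japan": (1950, 26),
    "Germany": (1954, 24),
    "Netherlands": (1952, 24),
    "UK": (1900, 30),
    "USA (white)": (1918, 26),
    "France": (1949, 25),
}

# Historic IQ on today's norms = current IQ minus cumulative gain.
historic_iq = {
    country: current_iq[country] - gain
    for country, (_start_year, gain) in cumulative_gain.items()
}

for country, (start_year, _gain) in cumulative_gain.items():
    print(f"{country} ({start_year}): {historic_iq[country]}")
```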

I won’t consider in detail the causes of the Flynn Effect, which are discussed at length by the contributors to Neisser (ed.). Personally, I like the argument of Richard Lynn that the main factor is improved nutrition during development and early childhood. Over the last 100 years, industrialised countries have seen three striking trends: (a) average height has increased; (b) the average age of puberty has fallen; and (c) average IQ scores have risen. Improved nutrition is the major reason for (a) and (b), so it would be parsimonious if it were also responsible for (c). But whatever the causes, they are likely to be environmental factors which vary between populations separated in space as well as in time.

Some of the cumulative increases in IQ (up to 30 points, or 2 standard deviations) may seem very large. Flynn himself describes them as ‘massive gains’. This may be somewhat misleading. We usually have no basis for judging the ‘size’ of an IQ interval other than the proportions of the target population who achieve the scores defining the interval. This is quite different from the measurement of physical qualities such as height, where (barring scruples about relativity) one inch is the same as any other inch. Nor is a difference of one standard deviation necessarily ‘large’ in relation to the total range of intelligence. Consider the analogy with height. In the late C19 the average height of mature males in England was about 5 feet 7 1/2 inches, with an s.d. of about 2 1/2 inches. Since then the average height has increased to 5 feet 10 inches, i.e. by about one s.d. of the C19 level. This is a noteworthy increase, but I don’t think anyone would be tempted to call it ‘massive’, since we can see (literally) that it is only a small proportion of total height. It should also be noted that a ‘large’ increase in IQ may boil down to a small number of items passed in a test. Each correct answer on Raven’s Matrices accounts for two IQ points, so an increase of 20 points corresponds to an additional 10 correct answers, out of a maximum of 60 items. Or in the Danish military tests, the total increase of 10 IQ points between 1958 and 1998 corresponds to a raw score increase of about 6 items out of 78 (Teasdale and Owen [1]). Is this ‘massive’? What do such terms mean in this context?
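
For anyone who wants to check the arithmetic in this paragraph, here is a rough sketch using only the figures quoted above. The variable names are mine; the heights are converted to inches.

```python
# Height analogy: the gain in s.d. units versus as a share of total height.
height_then_in = 5 * 12 + 7.5   # 5 ft 7 1/2 in, late C19 average
height_now_in = 5 * 12 + 10     # 5 ft 10 in, current average
sd_then_in = 2.5                # C19 standard deviation in inches

height_gain_in_sd = (height_now_in - height_then_in) / sd_then_in
height_gain_share = (height_now_in - height_then_in) / height_then_in

# Raven's Matrices: two IQ points per correct answer, 60 items in all.
raven_points_per_item = 2
raven_total_items = 60
raven_extra_items = 20 / raven_points_per_item
raven_share_of_test = raven_extra_items / raven_total_items

# Danish military test: about 6 extra items out of 78 for 10 IQ points.
danish_share_of_test = 6 / 78

print(f"Height gain: {height_gain_in_sd:.1f} s.d., "
      f"{height_gain_share:.1%} of C19 average height")
print(f"Raven's: {raven_extra_items:.0f} extra items, "
      f"{raven_share_of_test:.1%} of the test")
print(f"Danish test: {danish_share_of_test:.1%} of the test")
```

A one-s.d. gain in height is under 4 per cent of the height itself, which is the sense in which an s.d. need not be ‘large’.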

Finally, an increase in mean scores does not necessarily mean that the whole distribution of scores has shifted upwards, retaining the same ‘shape’. Unfortunately the literature on the Flynn Effect tends not to say much about distributions. The studies by Teasdale and Owen are an exception. T & O [1] shows a marked change in the distribution. There is a reduction in the number of low scores and an increase in the number of moderately high scores, but no marked increase in very high scores. The distribution changes from symmetrical to negatively skewed, with a ‘pile-up’ of moderately high scores. T & O consider, but reject, the possibility that this is an artificial ‘ceiling’ effect. If this pattern is representative of the changing pattern of IQ scores generally, it might explain why we do not seem to be living in a golden age of genius.


Note 1: here I assume an approximately normal distribution (small departures from normality do not matter for this purpose) and a standard deviation of at least 10 points. Since the upper 30 per cent of a normal distribution lie at least half a standard deviation above the mean, on these assumptions the mean must be at least 5 points below the boundary of the upper 30 per cent, hence if this boundary is 75 points the mean is not more than 70 points.
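
The bound in Note 1 can be checked numerically. This is a sketch under the same assumptions (a normal distribution, with 70 per cent of scores below 75); it uses NormalDist from Python’s standard library, and the function name is mine.

```python
from statistics import NormalDist

# z-score below which 70 per cent of a standard normal distribution falls.
z_70 = NormalDist().inv_cdf(0.70)


def mean_given_boundary(boundary, sd, z=z_70):
    """Mean of a normal distribution whose 70th percentile is `boundary`."""
    return boundary - z * sd


print(f"70th percentile lies {z_70:.3f} s.d. above the mean")
for sd in (10, 15):
    print(f"s.d. {sd}: mean = {mean_given_boundary(75, sd):.1f}")
```

The larger the standard deviation, the lower the implied mean, so an s.d. of 10 gives the most conservative figure.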

Note 2: I have simply subtracted the cumulative Flynn Effect from the ‘current’ IQ levels shown for these countries in Lynn and Vanhanen. In fact, L & V’s data are not all recent; e.g. their IQ of 72 for Jamaica is based on data from 1962. L & V’s procedure (pp. 197-8 of their book) is to express each country’s mean IQ by reference to a mean IQ of 100 for the UK. If the test concerned was standardised for the UK substantially earlier or later than the date of the test in the other country, then L & V assume that mean IQ in the UK has increased by 2 points per decade, and adjust the data accordingly. For example, if country A had a mean IQ of 105 in 1970, on a test standardised with a UK mean of 100 in 1960, then L & V would estimate the UK mean as 102 in 1970. As this is 3 points below country A at that date, their table would state the IQ of country A as 103 against a notional UK mean of 100, preserving the differential of 3 points. Bizarre though L & V’s approach may seem, it is probably legitimate for combining data from different time periods, if you are going to do this at all. Provided that the mean IQs in country A and in the UK have increased at roughly the same rate since the date of the tests, the rank order of national IQs will be unchanged, and the numerical intervals will not be badly distorted (across the range of difference likely to be encountered in practice). If on the other hand mean IQs have increased at very different rates, then the figures in the table could be misleading with respect to current (2002) relative levels. However, this is unlikely to be the case for the industrialised countries considered here, because (a) most of the test data for these countries are quite recent, and (b) the rates of change in recent decades are unlikely to have been widely different in different industrialised countries.
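
The adjustment described in Note 2 can be written as a short function. This is my own sketch of the procedure as I have summarised it, not code from L & V; the function and parameter names are hypothetical.

```python
def iq_relative_to_uk(country_mean, test_year, uk_norm_year,
                      uk_norm_mean=100.0, uk_rate_per_decade=2.0):
    """Express a country's mean IQ against a notional UK mean of 100.

    The UK mean is assumed to drift by `uk_rate_per_decade` points per
    decade between the UK standardisation and the date of the test.
    """
    decades_elapsed = (test_year - uk_norm_year) / 10
    uk_mean_at_test = uk_norm_mean + uk_rate_per_decade * decades_elapsed
    differential = country_mean - uk_mean_at_test
    return 100.0 + differential


# Worked example from the note: country A scores 105 in 1970 on a test
# standardised in the UK (mean 100) in 1960.
print(iq_relative_to_uk(105, test_year=1970, uk_norm_year=1960))
```

If the test was taken before the UK standardisation, decades_elapsed is negative and the adjustment runs the other way.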


References:

James R. Flynn [1]. ‘The mean IQ of Americans: Massive gains 1932 to 1978’. Psychological Bulletin, 95, 1984, 29-51.
James R. Flynn [2]. ‘Massive IQ gains in 14 nations: what IQ tests really mean’. Psychological Bulletin, 101, 1987, 171-91.
R. Lynn and C. Pagliari. ‘The intelligence of American children is still rising’. J. Biosocial Science, 26, 1994, 65-67.
R. Lynn and T. Vanhanen. IQ and the wealth of nations. 2002.
Ulric Neisser (ed.). The Rising Curve: Long-term gains in IQ and related measures. 1998.
Miles D. Storfer. Intelligence and giftedness. 1990.
T. Teasdale and D. Owen [1]. ‘National secular trends in intelligence and education: a twenty year cross-sectional study’. Nature, 325, 1987, 119-21.
T. Teasdale and D. Owen [2]. ‘Forty-year secular trends in cognitive abilities’. Intelligence, 28, 2000, 115-20.
P. E. Vernon. Intelligence: Heredity and Environment. 1979.

Posted by David B at 06:50 AM