Thursday, March 05, 2009

Will information criteria replace p-values in common use? Some trends   posted by agnostic @ 3/05/2009 12:30:00 AM

P-values come from null hypothesis testing, where you test how likely your observed data (and more extreme data) are under the assumption that the null hypothesis is true. As such, they do not allow us to decide which of a variety of hypotheses or models is true. The probability they encode refers to the observed data under an assumption -- it does not refer to the hypotheses on the table.

Using information criteria allows us to decide between a variety of hypotheses or models about how the world works. They formalize Occam's Razor by rewarding models that show a good fit to the observed data, while penalizing models that have lots of parameters to estimate (i.e., those that are more complex). Whichever one best balances this trade-off wins.
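To make that trade-off concrete, here is a minimal sketch using the standard definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The data points, the two candidate models (a flat line versus a sloped line), and the helper names are all made up for illustration, and Gaussian errors are assumed.

```python
import math

def gaussian_aic(residuals, n_params):
    """AIC = 2k - 2 ln(L) for a least-squares fit, assuming Gaussian errors."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # ML estimate of the error variance
    log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    k = n_params + 1  # +1 because the error variance is estimated too
    return 2 * k - 2 * log_lik

def constant_residuals(xs, ys):
    """Residuals of the simplest model: y is just its mean (1 parameter)."""
    mean_y = sum(ys) / len(ys)
    return [y - mean_y for y in ys]

def line_residuals(xs, ys):
    """Residuals of an ordinary least-squares line (2 parameters)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data, roughly y = 2x plus a little noise.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9, 12.2, 13.8]

aic_constant = gaussian_aic(constant_residuals(xs, ys), n_params=1)
aic_line = gaussian_aic(line_residuals(xs, ys), n_params=2)

# Lower AIC wins: the line pays for its extra parameter but fits far better.
best = "line" if aic_line < aic_constant else "constant"
print(best, round(aic_constant, 1), round(aic_line, 1))
```

The extra parameter costs the sloped line 2 points of AIC, but the improvement in fit more than pays for it here. With data that really were flat, the penalty would tip the balance toward the simpler model.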

Although I'm not a stats guy -- I'm much more at home cooking up models -- I've been told that the broader academic world is becoming increasingly hip to the idea of using information criteria, rather than insisting on null hypothesis testing and the reporting of p-values. So, let's see what JSTOR has to say.

I did an advanced search of all articles for "p value" and for "Akaike Information Criterion" (the most popular one), looking at 5-year intervals just to save me some time and to smooth out the year-to-year variation. I start when the AIC is first mentioned. For the prevalence of each, I end in 2003, since there's typically a 5-year lag before articles end up in JSTOR, and estimating the prevalence requires a good guess about the population size. For the ratio of the size of one group to the other, I go up through 2008, since this ratio does not depend on an accurate estimate of the total number of articles. From 2004 to 2008, there are 4132 articles with "p value" and 927 with "Akaike Information Criterion," so the estimate of the ratio isn't going to be bad even with fewer articles available during this time.
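As a quick check on the arithmetic, the ratio for the 2004 to 2008 interval follows directly from the two counts quoted above. The variable names are just for illustration.

```python
# JSTOR hit counts for 2004-2008, as quoted in the text.
p_value_articles = 4132
aic_articles = 927

# The ratio needs no estimate of the total number of articles in JSTOR.
ratio = p_value_articles / aic_articles
print(round(ratio, 2))  # about 4.46
```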

Intervals are represented by their mid-point. Someone else can do the better job of searching year by year, perhaps restricting the search to social science journals to see if real headway is being made. (It would be uninteresting to see a rise of the popularity of information criteria in statistics journals.) Here are the trends in the use of each, as well as the ratio of p-value to AIC:


It's promising that both are increasing over the past 30-odd years, since that means more people are bothering to be quantitative. Still, fewer than 5% of articles mention p-values or information criteria -- some of that is due to the presence of arts and humanities journals, but there's still a big slice of the hard and soft sciences that needs to be converted. Also encouraging is the steady decline in the dominance of p-values over the AIC: they're still about 4.5 times as commonly used in academia at large, but that's down from about 15.5 times as common in the mid-1970s, a 71% decline. Graduate students and young professors -- the writing is on the wall. Aside from being intellectually superior, information criteria will give you a competitive edge in the job market, at least in the near future. After that, they will be required.
