One of the researchers, a biostatistician named Georgia Salanti, fired up a laptop and projector and started to take the group through a study she and a few colleagues were completing that asked this question: were drug companies manipulating published research to make their drugs look good? Salanti ticked off data that seemed to indicate they were, but the other team members almost immediately started interrupting. One noted that Salanti’s study didn’t address the fact that drug-company research wasn’t measuring critically important “hard” outcomes for patients, such as survival versus death, and instead tended to measure “softer” outcomes, such as self-reported symptoms (“my chest doesn’t hurt as much today”). Another pointed out that Salanti’s study ignored the fact that when drug-company data seemed to show patients’ health improving, the data often failed to show that the drug was responsible, or that the improvement was more than marginal.
Salanti remained poised, as if the grilling were par for the course, and gamely acknowledged that the suggestions were all good—but a single study can’t prove everything, she said. Just as I was getting the sense that the data in drug studies were endlessly malleable, Ioannidis, who had mostly been listening, delivered what felt like a coup de grâce: wasn’t it possible, he asked, that drug companies were carefully selecting the topics of their studies—for example, comparing their new drugs against those already known to be inferior to others on the market—so that they were ahead of the game even before the data juggling began? “Maybe sometimes it’s the questions that are biased, not the answers,” he said, flashing a friendly smile. Everyone nodded. Though the results of drug studies often make newspaper headlines, you have to wonder whether they prove anything at all. Indeed, given the breadth of the potential problems raised at the meeting, can any medical-research studies be trusted?
This discussion reminded me of Jim Manzi’s earlier essay. There, he argued that the social sciences lag so far behind the hard sciences because of the problem of causal density: without the benefits of randomization and controlled experimentation available in the physical sciences, it’s hard to pin down causality, or so Manzi argues.
Yet as the Atlantic article points out, having recourse to randomization isn’t sufficient to generate knowledge. The real problem there is experimenter bias: when there are large incentives to produce results that point a particular way, those are the results that tend to get published. Trial and error has given us thousands of medical papers, but it appears that the vast majority of them are simply wrong (one researcher above suggests 90 percent). In some cases, our knowledge even regresses over time. Ulcers, for instance, are caused by a bacterium, not stress, a fact that we used to know but then somehow forgot.
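The arithmetic behind that "vast majority are wrong" claim is easy to sketch. The toy simulation below is my own illustration, not anything from the article: the prior, power, and bias numbers are assumptions, picked only to show how a low base rate of true hypotheses plus a modest amount of result-massaging can make most published positive findings false.

```python
import random

def simulate_published_findings(n_hypotheses=100_000,
                                prior_true=0.10,  # assumed: 1 in 10 tested hypotheses is really true
                                power=0.80,       # assumed: chance a real effect reaches significance
                                alpha=0.05,       # conventional false-positive rate
                                bias=0.20,        # assumed: fraction of null results massaged into positives
                                seed=42):
    """Return the fraction of published positive findings that are false."""
    random.seed(seed)
    true_positives = false_positives = 0
    for _ in range(n_hypotheses):
        is_true = random.random() < prior_true
        if is_true:
            # a real effect is detected with probability = power
            significant = random.random() < power
        else:
            # a null effect is "significant" by chance at rate alpha...
            significant = random.random() < alpha
            # ...and bias turns some of the remaining nulls positive anyway
            if not significant and random.random() < bias:
                significant = True
        if significant:
            if is_true:
                true_positives += 1
            else:
                false_positives += 1
    return false_positives / (true_positives + false_positives)

print(f"{simulate_published_findings():.0%} of published positives are false")
```

With these made-up numbers, roughly three-quarters of the "significant" findings are false, even though every individual study looks procedurally sound. Crank the prior down or the bias up and the figure climbs toward the 90 percent mentioned above.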
An old joke goes that the Moon mission was a horrible idea, because it allowed every wiseguy to ask, “If the government can get us to the Moon, why can’t it do X, Y, and Z?” Similarly, it looks like the immense success of physics has obscured how difficult it actually is to generate knowledge, even in cases where randomized experiments are possible.
As a further demonstration, look at behavioral economics. This field was supposed to fix the problems of neoclassical economics by employing more realistic assumptions about human psychology. We recently got a good test of the theory in the stimulus, which Manzi already flags as an instance where we know less than we think. The 2009 ARRA stimulus delivered its tax cut through reduced payroll withholding, on the behavioral assumption that people would be more likely to spend the money if the cut was not salient, that is, if their paychecks simply got larger. I saw Bill Maher repeating this idea as if it were established fact.
One analysis suggests that was not the case: administering the stimulus as slightly higher paychecks, rather than as a lump sum, resulted in far less spending. That’s billions of dollars in “wasted” taxpayer money as the result of behavioral-economics research that turned out not to work, and we don’t know why.
I’d suggest that the reason we rank the certainty of our knowledge Math > Physics > Chemistry > Biology > Economics > Psychology has as much to do with human fudge factors and politics as with the underlying difficulty of the material. A study by Daniele Fanelli, for instance, found that the softer sciences were more likely to report positive results, which is indicative of bias, whether from greater freedom to fudge results or from publication bias.