Friday, January 05, 2007

More on "variables must vary"
posted by agnostic @ 1/05/2007 04:50:00 PM

Returning to a previous post of p-ter's: a common fallacy in popular discussions that involve statistics is to assert that "variable" X doesn't predict outcomes in Y very well, and so X is of little consequence when we're talking about Y, even though the values of X barely varied in the sample on which the conclusion was based. When X hardly varies, variation in X will, almost by construction, account for a pitiful fraction of the variation in Y or in anything else. You have to allow X to vary to see what influence it's capable of. Imagine you looked at variation in height among NBA players, and it turned out to be a poor predictor of MVP status. It would be silly to conclude that height is of little importance here. Just ask the 99.9% of adult males who are below the NBA's mean height of 6 ft 7 in (cite; assume a normal distribution with population mean 70 in and SD 3 in, which puts 6 ft 7 in, or 79 in, 3 SD above the mean). If you let height vary as it does in the population by letting anyone play in the NBA, of course it would be a good predictor of excellence in basketball.
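For anyone who wants to check the 99.9% figure, here is a quick sketch of the arithmetic in Python. It uses only the numbers given above (population mean 70 in, SD 3 in, NBA mean 6 ft 7 in) and the same normality assumption; the function and variable names are mine, chosen for illustration.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd), computed with the error function."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

POP_MEAN_IN = 70.0          # population mean height in inches, as assumed in the post
POP_SD_IN = 3.0             # population SD in inches, as assumed in the post
NBA_MEAN_IN = 6 * 12 + 7    # 6 ft 7 in expressed in inches (79)

# How many population SDs the NBA mean sits above the population mean.
z_score = (NBA_MEAN_IN - POP_MEAN_IN) / POP_SD_IN

# Share of adult males shorter than the NBA mean, under the normality assumption.
share_below = normal_cdf(NBA_MEAN_IN, POP_MEAN_IN, POP_SD_IN)

print(f"z = {z_score:.1f}, share below NBA mean = {share_below:.4%}")
```

The NBA mean comes out 3 SD above the population mean, and the share of men below it rounds to 99.9%, provided the normal approximation holds that far out in the tail.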

On that note, there's an NYT article about some new algorithms Google is developing -- not those that deliver the most relevant sites given your search query, but ones being developed to best predict success within the company. That way, managers can cut out a lot of the guesswork involved in hiring. In short, they administered a battery of questions to their employees, gathered performance data on these employees, and looked for which "biodata" variables best predicted success. As for academic performance:

Among the first results was confirmation that Google's obsession with academic performance was not always correlated with success at the company.

"Sometimes too much schooling will be a detriment to you in your job," Dr. Carlisle said, adding that not all of the more than 600 people with doctorates at Google are equally well suited to their current assignments.

Well, sure, and height is "not always" correlated with success in the NBA, and "not all" of the NBA players with collegiate MVP awards are equally well suited to their professional assignments. But at a firm like Google, the distribution of academic performance, book smarts, or g must look much like that of height in the NBA, with a mean far above the population's and little variation around it, or else Google wouldn't be the giant that it is. Thus, the fact that engineer A got an 800 on the Math GRE and a 3.8 GPA, while engineer B got a 780 and a 3.7, is not going to tell you much about who is more likely to come up with the next big idea. Presumably the behavior and personality variables would do more of the work, for example whether an engineer is curious or closed-minded. But as with socialist basketball, where anyone is allowed to play, in a Google where anyone was free to work as an engineer, g and its correlates would account for a hefty portion of the variance in performance. Linda Gottfredson has many free articles on g in the workplace at her webpage. I doubt that these subtleties are lost on the people at Google, but journalists could cure much of their innumeracy with a handful of weekend workshops on statistical fallacies common in reporting.
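The restriction-of-range point can be illustrated with a toy simulation. This is only a sketch under made-up assumptions, not a model of Google or the NBA: the population correlation of 0.6 between "ability" and "performance", the hiring cutoff of 2 SD above the mean, and all of the names are hypothetical and chosen purely for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)                     # fixed seed so the run is repeatable

TRUE_SLOPE = 0.6                           # assumed population correlation (hypothetical)
NOISE_SD = math.sqrt(1 - TRUE_SLOPE ** 2)  # keeps performance at unit variance
CUTOFF_Z = 2.0                             # assumed hiring bar: 2 SD above the mean (hypothetical)
N = 100_000                                # simulated population size

# Everyone in the population: standardized ability plus independent noise.
ability = [rng.gauss(0, 1) for _ in range(N)]
performance = [TRUE_SLOPE * a + NOISE_SD * rng.gauss(0, 1) for a in ability]

# "Socialist" firm: anyone may work there, so ability varies as in the population.
r_everyone = pearson_r(ability, performance)

# Selective firm: only people above the cutoff are hired, so ability barely varies.
hired = [(a, p) for a, p in zip(ability, performance) if a > CUTOFF_Z]
r_hired = pearson_r([a for a, _ in hired], [p for _, p in hired])

print(f"everyone: n = {N}, r = {r_everyone:.2f}")
print(f"hired:    n = {len(hired)}, r = {r_hired:.2f}")
```

Under these assumptions the correlation among the hired should come out well below the one in the full population, even though the same relationship generated both. Nothing about ability's influence changed; only its variance in the sample did.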