A Pitfall of Measuring Correlation
I saw this article in The Australian titled “So, Who Are the Smartest Scientists?”, reporting on a paper from the Interdisciplinary Journal of Research on Religion.
Naturally, my first thought was “Data scientists are the smartest!” But then my second thought was “wait, how would you even measure that?” The article just says “IQ,” but then it goes on to say that scientists in physical sciences are less religious than ones in social sciences. One of the paper’s authors is quoted as explaining “This is predicted by their high IQ.”
This raised more questions:
Does the paper claim that smarter people are less religious?
Well, here’s the first sentence of the conclusion:
“There is sound evidence of a negative correlation between intelligence and religiosity and between intelligence and political extremism.”
So, yeah. It pretty clearly makes that claim.
What about method? Does the paper have reason to claim that?
I’m not going to analyze that here. The researchers were most likely working in good faith and their data probably support their conclusion for their sample. I recognize that this is the most important question and I’m “yada yada”ing it, but that’s not what I want to talk about.
Are there other explanations?
This is the interesting bit. I’m going to depart from the paper at this point. The data below are made up to illustrate a pitfall of data analysis. I’m going to ask you to make some assumptions that may not be true. Just bear with me.
Let’s assume that religiosity and intelligence are independent and that they are in no way correlated. Let’s further assume that both religiosity and intelligence are useful in studying the sciences, or are correlated with traits that are. Finally, let’s assume that the “elite institutions” in the study have done an effective job of selecting people with an abundance of useful traits.
To illustrate the point, I’ve invented some data using a Box-Muller transform. This is not the authors’ data; I made it up. There is no correlation. Here’s the whole population of 100 pseudo-humans:
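For anyone who wants to build a similar population, here is a minimal sketch of the Box-Muller transform in Python. It is not the exact code behind the numbers in this post. The function names `box_muller` and `make_population`, the seed, and the choice to use one Box-Muller pair per pseudo-human (one value per trait) are all assumptions made for illustration.

```python
import math
import random


def box_muller(rng):
    """Turn two uniform draws into two independent standard normal draws."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def make_population(n, seed=0):
    """Hypothetical helper: n pseudo-humans as (intelligence, religiosity) pairs."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    return [box_muller(rng) for _ in range(n)]


population = make_population(100)
print(population[:3])  # first three pseudo-humans, in standard-deviation units
```

The two values returned by a single Box-Muller step are independent of each other, which is what makes the pair a convenient stand-in for two uncorrelated traits.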
But remember, our assumption is that intelligence and religiosity are useful (or tend to occur alongside useful traits), and these are elite schools. They only hire the best of the best, so here are the 11 pseudo-humans who end up working at elite schools. They’re in red.
What happens when we look at them? Here’s a plot of just the red dots with a linear fit. We find a negative correlation between intelligence and religiosity!
These data were randomly generated normal distributions on both axes. The data were totally uncorrelated by design. The apparent negative correlation was created by the sampling plan. What you’re seeing is the relative rarity of people who are outliers on both distributions compared to the outliers on just one. If you measure any two independent traits that each make membership in your sample group more likely, you should expect to see a negative correlation between them within that group.
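The effect is easy to reproduce. The sketch below is an illustration under stated assumptions, not the procedure used for the data in this post: it draws both traits with Python's built-in `random.gauss` rather than a hand-written Box-Muller step, it assumes that "hiring" means taking the people with the highest sum of the two traits, and it uses a much larger population than 100 so the result does not depend on a lucky seed. The names `pearson` and `simulate` are made up for the example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=100_000, top_fraction=0.11, seed=0):
    """Return (r for everyone, r for the hired subset)."""
    rng = random.Random(seed)
    # Two independent standard normal traits per pseudo-human.
    people = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    # Assumed selection rule: hire whoever has the largest combined score.
    ranked = sorted(people, key=lambda p: p[0] + p[1], reverse=True)
    hired = ranked[: int(n * top_fraction)]
    return pearson(*zip(*people)), pearson(*zip(*hired))


r_all, r_hired = simulate()
print(f"whole population: r = {r_all:+.3f}")
print(f"hired subset:     r = {r_hired:+.3f}")
```

Under these assumptions the correlation for the whole population should come out close to zero, while the correlation for the hired subset should be clearly negative. A different selection rule (for example, requiring a minimum on either trait alone) would change the size of the effect, so treat the exact numbers as a property of this sketch.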
So… are religion and intelligence negatively correlated?
I wouldn’t say that based on this paper. I’m willing to go along with the conclusion that their data really do show a negative correlation for this small sample, but even there I would first look to alternative explanations, of which I’ve given only one.
What’s the Lesson Here?
I’ll give you a couple:
1) Be very suspicious of simple relationships between data, especially when reported in non-science press.
2) Consult with experts before you decide to up-end your business because you see an unexpected (but crystal clear!) correlation between two factors. It could be nothing more than sampling bias.