Simpson, Eh?

Continuing my series on errors in analysis, I’m going to dig into a tricky issue of correlation analysis.

To picture the problem, let’s say I work at a university. It’s called Mock U. Like a lot of universities, Mock U is trying to foster a diverse workforce. In particular, we want to ensure that our faculty positions are filled by more women than men to close an existing gender gap.

Let’s look at the jobs offered to applicants by gender in two colleges:

                    Men                      Women
               Hired  Applied    %      Hired  Applied    %      Advantage
Engineering      2      25       8%       3      30      10%     Women
Business         5       6      83%       4       4     100%     Women

Are we meeting our objectives? In Engineering, 25 men applied and 2 were hired (8%). Compare that to the 3 women hired from a pool of 30 applicants (10%). The engineering college favors women in hiring. In business, 5 men were hired from among 6 applicants (83%) while all 4 women who applied were hired (100%). The College of Business favors women in its hiring practices too. This gender gap should be closed in no time, right?

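But what happens when we look at both colleges together?

Across the two colleges, 7 of the 31 men who applied were hired (23%), compared with 7 of the 34 women (21%). Mock U hired a higher percentage of men who applied than women. That’s odd. Can we still say the system is working?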
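
If you’d like to check the combined numbers yourself, here’s a minimal Python sketch. The counts come straight from the table above; the names `data`, `rate`, and `pooled` are just ones I made up for illustration.

```python
# (hired, applied) counts copied from the table above
data = {
    "Engineering": {"men": (2, 25), "women": (3, 30)},
    "Business": {"men": (5, 6), "women": (4, 4)},
}


def rate(hired, applied):
    """Fraction of applicants who were hired."""
    return hired / applied


def pooled(gender):
    """Sum hired and applied counts for one gender across all colleges."""
    hired = sum(groups[gender][0] for groups in data.values())
    applied = sum(groups[gender][1] for groups in data.values())
    return hired, applied


# Hiring rate within each college
for college, groups in data.items():
    men = rate(*groups["men"])
    women = rate(*groups["women"])
    print(f"{college}: men {men:.0%}, women {women:.0%}")

# Hiring rate for the university as a whole
men_rate = rate(*pooled("men"))
women_rate = rate(*pooled("women"))
print(f"Combined: men {men_rate:.1%}, women {women_rate:.1%}")
```

The pooled rate for men comes out higher than the pooled rate for women, even though women have the higher rate in each college on its own.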

This is known as Simpson’s Paradox. I haven’t found it as common as Berkson’s Paradox, but I think it’s more confusing.

It occurs when a trend shows up in each of several populations of data, but a confounding factor makes the trend weaken or reverse once the populations are combined. The two populations in this case are different colleges, and the hiring correlation for each of them is positive (our policy appears to be selecting more women from pools of job applicants).

But there’s a confounding factor that’s making the University’s hiring proportion as a whole move in the opposite direction.

Can you see the confounding factor in this example?

It could be a lot of things, but at a glance, it looks like jobs in the College of Engineering are a lot more competitive than jobs in the College of Business (Engineering only hired 9% of applicants, 5 of 55, while Business hired 90%, 9 of 10).

You have slightly more women applying to a much more competitive position, and slightly more men applying to a less competitive one. That’s a recipe for encountering this paradox, and it’s worth watching for as you slice your data.
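
One way to see why the mix of applications matters: each gender’s overall hiring rate is a weighted average of its per-college rates, where the weights are the share of that gender’s applications going to each college. The sketch below is my own illustration of that arithmetic, using the counts from the table earlier in the post; the function name `weighted_breakdown` is hypothetical.

```python
# (hired, applied) counts from the table earlier in the post
counts = {
    "men": {"Engineering": (2, 25), "Business": (5, 6)},
    "women": {"Engineering": (3, 30), "Business": (4, 4)},
}


def weighted_breakdown(by_college):
    """Return the overall rate plus each college's (weight, rate) pair.

    weight = share of this group's applications sent to the college
    rate   = fraction of those applicants who were hired
    """
    total_applied = sum(applied for _, applied in by_college.values())
    parts = {}
    overall = 0.0
    for college, (hired, applied) in by_college.items():
        weight = applied / total_applied
        college_rate = hired / applied
        parts[college] = (weight, college_rate)
        overall += weight * college_rate
    return overall, parts


results = {gender: weighted_breakdown(c) for gender, c in counts.items()}

for gender, (overall, parts) in results.items():
    for college, (weight, college_rate) in parts.items():
        print(f"{gender:5s} {college:12s} weight {weight:.1%}  rate {college_rate:.0%}")
    print(f"{gender:5s} overall rate {overall:.1%}")
```

About 88% of women’s applications (30 of 34) went to Engineering, compared with about 81% of men’s (25 of 31), so the women’s overall rate gets pulled harder toward Engineering’s low hiring rate.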

You have to frame your question carefully to avoid ambiguous answers like this. If our objective were to reshape the hiring of either college alone, we could claim success. If we want to change the whole university, we still have work to do.

This can be a pain in analysis but it can also be useful. It can be helpful to look deeper at a process to see if there’s agreement at multiple levels. Or it can help you notice factors in your data you hadn’t thought of before. But mostly it’s useful for starting arguments about whether it’s even possible for sub-groups to all have a positive correlation while the population at large has a negative one. You can try it with your friends, but don’t say I didn’t warn you.