5 Selection bias

Correlation does not equal causation! This is a mantra that statisticians around the world repeat often. But why is it so important?

In today’s lecture, we will learn about selection bias and how it can confound simple comparisons of groups. We will also learn about the potential outcomes framework, which is a way of defining causality.

Selection bias occurs when the groups that we are comparing are not actually comparable. This can happen for a variety of reasons: for example, the groups were selected in a non-random way, or they differ in the distribution of confounding variables. When selection bias is present, a difference between the groups cannot be read as the causal effect of one variable on another.

The potential outcomes framework is a way of thinking about causality that helps us to address and ideally even avoid selection bias. In this framework, we imagine two potential worlds for each unit: one in which the treatment was received and one in which it was not. The causal effect of the treatment is the difference between the outcomes in these two worlds. Because we only ever observe one of the two worlds for any given unit, this difference cannot be observed directly and has to be estimated.
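To make the link between potential outcomes and selection bias concrete, the following decomposition may help. The notation is an assumption introduced here for illustration: $D_i$ indicates whether unit $i$ received the treatment, $Y_{1i}$ and $Y_{0i}$ are the potential outcomes with and without treatment, and $Y_i$ is the outcome we actually observe.

```latex
\begin{align*}
\underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{observed difference in group means}}
  &= \underbrace{E[Y_{1i} - Y_{0i} \mid D_i = 1]}_{\text{average causal effect on the treated}} \\
  &\quad + \underbrace{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]}_{\text{selection bias}}
\end{align*}
```

The selection bias term is the difference in untreated outcomes between the two groups. It is zero only if the groups would have had the same average outcome had neither been treated.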

Estimating a causal effect is difficult, but it is not impossible. In the next lecture, we will discuss some of the methods that can be used to estimate causal effects. For now, it is important to understand the challenges that we face in estimating causal effects and why a simple comparison of group averages does not work.
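A small simulation can illustrate why a simple comparison of group averages fails. The sketch below is not taken from the lecture materials; the function name `simulate`, the confounder, and all numbers are hypothetical choices made for illustration. Every unit has an outcome without treatment (`y0`) and an outcome with treatment (`y1 = y0 + true_effect`), and we only observe one of them. When units with a higher value of the confounder are more likely to take the treatment, the naive difference in means mixes the true effect with selection bias. When treatment is assigned at random, the bias should be close to zero.

```python
import random
from statistics import mean


def simulate(n=50_000, true_effect=2.0, randomized=False, seed=0):
    """Compare a naive difference in means with the true effect.

    All quantities here are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    observed_treated, observed_control = [], []
    y0_treated, y0_control = [], []

    for _ in range(n):
        confounder = rng.gauss(0, 1)               # e.g. prior health
        y0 = 10 + 3 * confounder + rng.gauss(0, 1)  # outcome without treatment
        y1 = y0 + true_effect                       # outcome with treatment

        if randomized:
            treated = rng.random() < 0.5            # coin flip
        else:
            # Self-selection: a high confounder makes treatment more likely.
            treated = confounder + rng.gauss(0, 1) > 0

        if treated:
            observed_treated.append(y1)  # we only see y1 for the treated
            y0_treated.append(y0)        # unobservable in real data
        else:
            observed_control.append(y0)  # we only see y0 for the controls
            y0_control.append(y0)

    naive = mean(observed_treated) - mean(observed_control)
    selection_bias = mean(y0_treated) - mean(y0_control)
    return {"naive": naive, "selection_bias": selection_bias,
            "true_effect": true_effect}


if __name__ == "__main__":
    print("self-selected:", simulate(randomized=False))
    print("randomized:   ", simulate(randomized=True))
```

In the self-selected case, the naive difference equals the true effect plus the selection bias. The simulation can compute the bias only because it generates both potential outcomes for every unit, which real data never provide.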

Lecture slides: Lecture 5.

Reading: Angrist and Pischke (2014, Ch. 1: pages 1-11).

Optional (skim) reading: Schaeffer and Kas (2024).

Don’t miss out on this video by Joshua Angrist

References

Angrist, Joshua D., and Jörn-Steffen Pischke. 2014. Mastering ’Metrics: The Path from Cause to Effect. Princeton University Press.
Schaeffer, Merlin, and Judith Kas. 2024. “The Integration Paradox: Does Awareness of the Extent of Ethno-Racial Discrimination Increase Reports of Discrimination?” Political Psychology. doi: 10.1111/pops.13027.