Abstract
Simpson’s paradox has been known for years and can arise in a wide variety of settings. When data exhibit Simpson’s paradox, we are faced with the question of whether to use aggregated data or conditional data. In this article, we present ways to graph data to better understand the paradox, and we describe the use of causal diagrams as a tool for deciding whether to use conditional data.
Acknowledgments
Stan Wagon provided valuable assistance to me as I was preparing this article. He and Karl Heiner maintain a website [Citation3] that includes a variation on the Whickham smoking example, the unemployment example, the Titanic example, other examples, and some diagrams, similar to those in this article, for illustrating the paradox. Kevin Cummiskey has been a valuable guide in my exploration of Simpson’s paradox and causal inference. I am indebted to Ann Cannon for technical assistance. The reviewers of an earlier version were tremendously helpful and deserve credit for improving the article.