Dataset

Causal Inference Is Not Just a Statistics Problem

Pages 150-155 | Published online: 12 Jan 2024
 

Abstract

This article introduces a collection of four datasets, similar to Anscombe’s quartet, that highlight the challenges involved in estimating causal effects. Each of the four datasets is generated from a distinct causal mechanism: the first involves a collider, the second a confounder, the third a mediator, and the fourth the induction of M-bias by an included factor. The article includes a mathematical summary of each dataset, as well as directed acyclic graphs that depict the relationships between the variables. Although the statistical summaries and visualizations for each dataset are identical, the true causal effect differs, and estimating it correctly requires knowledge of the data-generating mechanism. These example datasets can help practitioners better understand the assumptions underlying causal inference methods and emphasize the importance of gathering information beyond what statistical tools alone can provide. The article also includes R code for reproducing all figures and provides access to the datasets themselves through an R package named “quartets.” Supplementary materials for this article are available online.
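The point that identical summaries can hide different causal effects can be sketched with a small simulation. The following is a minimal illustration in Python, not the article's R code or its actual data-generating equations: the coefficients, the sample size, and the helper names (`simulate`, `slope`, `adjusted_slope`) are all assumptions chosen so that two mechanisms produce roughly the same unadjusted slope of outcome on exposure.

```python
import random


def slope(x, y):
    """Ordinary least squares slope of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(y, z):
    """Residuals of y after regressing it on z (with intercept)."""
    b = slope(z, y)
    mz, my = sum(z) / len(z), sum(y) / len(y)
    return [yi - my - b * (zi - mz) for yi, zi in zip(y, z)]


def adjusted_slope(x, y, z):
    """Coefficient on x in y ~ x + z, via the Frisch-Waugh-Lovell theorem."""
    return slope(residuals(x, z), residuals(y, z))


def simulate(mechanism, n=20000, seed=1):
    """Draw (exposure, outcome, covariate) under an assumed toy mechanism."""
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n):
        if mechanism == "confounder":
            # Covariate causes both exposure and outcome; true effect is 0.5.
            z = rng.gauss(0, 1)
            x = z + rng.gauss(0, 1)
            y = 0.5 * x + z + rng.gauss(0, 1)
        elif mechanism == "collider":
            # Exposure and outcome both cause the covariate; true effect is 1.0.
            x = rng.gauss(0, 1)
            y = 1.0 * x + rng.gauss(0, 1)
            z = x + y + rng.gauss(0, 1)
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs


if __name__ == "__main__":
    for mechanism in ("confounder", "collider"):
        x, y, z = simulate(mechanism)
        print(mechanism,
              "unadjusted:", round(slope(x, y), 2),
              "adjusted:", round(adjusted_slope(x, y, z), 2))
```

Under these assumed equations, both mechanisms give an unadjusted slope near 1.0. Adjusting for the covariate recovers the true effect (0.5) in the confounder case but drives the estimate toward 0 in the collider case, where the unadjusted slope was already correct. The data alone do not indicate which analysis to trust; that choice depends on knowing the mechanism.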

This article is part of the following collections:
Teaching Simpson’s Paradox, Confounding, and Causal Inference

Supplementary Materials

The supplementary material includes R code to generate the tables and figures.

Data Availability Statement

The causal quartet datasets presented in this article are available in an R package titled quartets (D’Agostino McGowan 2023): https://r-causal.github.io/quartets/

Disclosure Statement

No potential conflict of interest was reported by the author(s).