ABSTRACT
This paper offers an accessible discussion of graphical causal models and how this framework can be used to help identify causal relations. A graphical causal model represents a researcher’s qualitative causal assumptions. With the credibility revolution, there is growing interest in properly estimating cause-and-effect relationships. Using several examples, we illustrate how graphical models can and cannot be used to identify causation from observational data. Further, we offer a replication of a previous study that explored college enrolment by high school seniors who were eligible for student aid. We use a graphical causal model to motivate the original study’s quantitative and qualitative modelling assumptions. Using a similar difference-in-differences approach based on propensity score matching, we estimate a smaller average treatment effect than the original study. The smaller estimated effect arguably stems from the way the graphical causal model delineates the original model specification.
Disclosure statement
No potential conflict of interest was reported by the author(s). The findings and conclusions in this publication are those of the author(s) and should not be construed to represent any official USDA or U.S. Government determination or policy. This work was supported in part by the U.S. Department of Agriculture, Economic Research Service.
Supplementary material
Supplemental data for this article can be accessed online at https://doi.org/10.1080/00036846.2023.2208856
Notes
1 Heckman and Pinto (Citation2015) seem amenable to DAG analyses, but criticize the approach for its inability to accommodate nonrecursive simultaneous equations models.
2 The term ‘conditioning’ is defined as introducing information about a variable into an analysis; e.g. through sample selection, stratification, or regression control (Elwert and Winship Citation2014).
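The three forms of conditioning named in this note can be sketched in code. The following is a minimal illustration of our own (not from the paper), using simulated data in which a binary variable z confounds the effect of x on y; all variable names and coefficients are hypothetical.

```python
# Hypothetical illustration: three ways of "conditioning" on a variable z
# when estimating the effect of x on y (true effect set to 2.0).
import numpy as np

rng = np.random.default_rng(0)
n = 2000
z = rng.integers(0, 2, n)                     # binary confounder
x = z + rng.normal(0, 1, n)                   # treatment affected by z
y = 2.0 * x + 3.0 * z + rng.normal(0, 1, n)   # outcome; true slope on x is 2.0

# (1) Sample selection: restrict the analysis to one level of z.
sel = z == 0
b_select = np.polyfit(x[sel], y[sel], 1)[0]

# (2) Stratification: estimate within each stratum of z, then average.
b_strata = np.mean([np.polyfit(x[z == v], y[z == v], 1)[0] for v in (0, 1)])

# (3) Regression control: include z as a covariate in the regression.
X = np.column_stack([np.ones(n), x, z])
b_control = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(b_select, b_strata, b_control)  # all close to the true effect, 2.0
```

A naive regression of y on x alone would overstate the effect here, since z raises both x and y; each of the three conditioning strategies removes that confounding.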
3 The online appendix offers a brief explanation of the (conditional) independence assumptions (i.e. the ignorability or conditional ignorability assumptions) required for causal inference.
4 We include additional examples and illustrations of over-control bias, confounding bias, and endogenous selection in the online appendix.
5 The question of whether and how DAGs can accommodate temporal dynamics is an ongoing topic of research (Morgan and Winship Citation2015). In principle, DAG analyses can accommodate time-series issues (provided the relationships are acyclic rather than cyclic), because the variables within a DAG are ordered by construction according to the causal factors, which by definition must temporally precede their effects. The arguable concern is that additional confounding factors, which threaten causal identification, arise as the time horizon of the problem grows. This is a valid concern that has been addressed by various means across the disciplinary literatures. Examples in the epidemiology literature include effect modification, a type of causal mediation analysis in which confounders are explored after treatment but before the outcome (VanderWeele and Robins Citation2007). Within the economics literature, DAGs have been used to help justify the specification of vector autoregressive or error-correction models (see Ji, Zhang, and Geng (Citation2018) for an example).
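The acyclicity requirement and the by-construction temporal ordering described in this note can be made concrete with a short sketch. This is our own illustration (not from the paper), using the Python standard library's topological sorter and invented variable names loosely echoing the replication's setting.

```python
# Sketch: a valid DAG admits a topological ordering in which every cause
# precedes its effects; a cyclic (nonrecursive) system does not.
from graphlib import TopologicalSorter, CycleError

# TopologicalSorter takes {node: set of predecessors}, i.e. {effect: {causes}}.
# Variable names here are hypothetical, not the paper's actual DAG.
acyclic = {
    "income": set(),
    "aid": {"policy"},
    "enrol": {"aid", "income"},
}
order = list(TopologicalSorter(acyclic).static_order())
print(order)  # causes appear before their effects

# A cyclic system (x causes y, y causes x) cannot be ordered this way.
cyclic = {"x": {"y"}, "y": {"x"}}
cycle_found = False
try:
    list(TopologicalSorter(cyclic).static_order())
except CycleError:
    cycle_found = True
print("cycle detected:", cycle_found)
```

The ordering is exactly the "variables ordered by construction" property the note appeals to: any node with no incoming arrows may come first, and no effect can ever be listed before one of its causes.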
6 Recent applications of the front-door criterion within the economics and political science literatures can be found in Acharya, Blackwell, and Sen (Citation2016), Moya (Citation2018), Assunção, Bragança, and Hemsley (Citation2019), and Bellemare, Bloem, and Wexler (Citation2020).
7 We offer conditional independence tests in this section for illustrative purposes. However, it should be acknowledged that it is never possible to test all of the relevant causal assumptions in non-experimental studies (Robins and Wasserman Citation1999).
8 The results of the conditional independence tests reported here are based on a ‘mutual information’ test statistic, with significance assessed via an asymptotic Chi-squared distribution calibrated by Monte Carlo permutation. We do not pursue these tests further and leave additional analysis to future research.
9 The replication code is available here: https://github.com/burnettwesley/graphical_causal_modeling.
10 This finding is comparable to Dynarski’s original DiD estimate of 0.182 (Dynarski Citation2003, 283).
11 Our estimate in column (2) is only marginally smaller than Dynarski’s original treatment effect estimate of 0.219 (Dynarski Citation2003, 283).