Pages 953-970 | Received 02 Mar 2022, Accepted 17 Mar 2022, Published online: 14 Jun 2022
 

Abstract

Statistical network models based on Pairwise Markov Random Fields (PMRFs) are popular tools for analyzing multivariate psychological data, in large part due to their perceived role in generating insights into causal relationships: a practice known as causal discovery in the causal modeling literature. However, since network models are not presented as causal discovery tools, the role they play in generating causal insights is poorly understood among empirical researchers. In this paper, we provide a treatment of how PMRFs such as the Gaussian Graphical Model (GGM) work as causal discovery tools, using Directed Acyclic Graphs (DAGs) and Structural Equation Models (SEMs) as causal models. We describe the key assumptions needed for causal discovery and show the equivalence class of causal models that networks identify from data. We clarify four common misconceptions found in the empirical literature relating to networks as causal skeletons; chains of relationships; collider bias; and cyclic causal models.

Notes

1 Note that we consider only the structural equation here, omitting the measurement equation. SEMs without measurement equations are sometimes referred to as path models.

2 A collider structure $X_i \rightarrow X_k \leftarrow X_j$ that does not contain an edge directly connecting the parents ($X_i \nrightarrow X_j$ and $X_i \nleftarrow X_j$) is called an open v-structure. If there are any open v-structures in the DAG, then the moral graph must contain an undirected edge between the relevant parents, $X_i - X_j$. Thus, the moral graph “marries” unmarried parents.
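The moralization step described in this note can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the function name `moralize`, the parent-set dictionary representation, and the three-variable example DAG are all assumptions made for the illustration.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edges of the moral graph of a DAG.

    `parents` maps each node to the set of its parents (hypothetical
    representation chosen for this sketch).
    """
    edges = set()
    for child, pa in parents.items():
        # Keep every directed edge, but drop its direction.
        for p in pa:
            edges.add(frozenset((p, child)))
        # "Marry" every pair of parents that share this child.
        for p, q in combinations(sorted(pa), 2):
            edges.add(frozenset((p, q)))
    return edges


# Hypothetical open v-structure: Xi -> Xk <- Xj, with no edge between Xi and Xj.
dag = {"Xi": set(), "Xj": set(), "Xk": {"Xi", "Xj"}}
moral_edges = moralize(dag)

for edge in sorted(tuple(sorted(e)) for e in moral_edges):
    print(" - ".join(edge))
```

In this example the moral graph contains the two original edges plus the added edge between Xi and Xj, even though the DAG has no arrow between those two parents.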

3 The PC algorithm identifies the Markov-equivalence set of a DAG, the set of DAGs that satisfy the same d-separation statements and which match the independence relations between variables in the data.

4 Formally, there are no unblocked back-door paths, based on d-separation rules, passing through unobserved variables, that connect any pair of observed variables (Pearl, 2009).

5 The example GGM and all of the SEM model weights matrices in the statistical-equivalence set can be found in the supplementary materials of this paper: https://osf.io/qfyx9/.

6 Notably, there are other reasons, not considered here, why relationships in a statistical network can be considered spurious or biased estimates of an (in)dependence relationship, and these are not eliminated via conditioning. For example, measurement error can result in under- or over-estimation of partial correlations relative to those present between the true constructs of interest (cf. Buonaccorsi, 2010; Schuurman & Hamaker, 2019).

7 Let $\rho_{AC}$ represent the correlation between A and C and let $\rho_{BC}$ represent the correlation between B and C. If C is a positive collider in a linear SEM, then $\rho_{AC}$ and $\rho_{BC}$ lie between zero and one in value. Take it that A and B are causally independent, as in the top right panel of Figure 6. Following Pearl (2013), the partial correlation between A and B conditional on C is given by $\frac{-\rho_{AC}\rho_{BC}}{\sqrt{(1-\rho_{AC}^2)(1-\rho_{BC}^2)}}$, which, given the aforementioned range restriction, will be negative.
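A small numerical check may make the sign argument concrete. The sketch below applies the standard first-order partial correlation formula, $(\rho_{AB} - \rho_{AC}\rho_{BC}) / \sqrt{(1-\rho_{AC}^2)(1-\rho_{BC}^2)}$, with $\rho_{AB} = 0$ because A and B are independent. The function name `partial_corr` and the path weights of 0.5 are hypothetical values chosen for illustration, not taken from the paper.

```python
import math


def partial_corr(r_ab, r_ac, r_bc):
    """First-order partial correlation of A and B given C."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))


# Hypothetical linear SEM: C = 0.5*A + 0.5*B + e, where A and B are independent
# with unit variance and Var(e) = 0.5, so that Var(C) = 0.25 + 0.25 + 0.5 = 1.
# With standardized variables the path weights equal the marginal correlations.
r_ac = 0.5
r_bc = 0.5
r_ab = 0.0  # A and B are causally (and marginally) independent

pc = partial_corr(r_ab, r_ac, r_bc)
print(pc)  # negative: conditioning on the collider C induces a dependence
```

Under these assumed weights the numerator is $-0.25$ and the denominator is $0.75$, so the partial correlation is $-1/3$: a negative edge between A and B appears in the network although no causal relationship connects them directly.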

Additional information

Funding

The work of Oisín Ryan was supported by a grant from the Netherlands Organization for Scientific Research (NWO; Onderzoekstalent Grant 406-15-128).