Review

Graphical Models for Processing Missing Data

Pages 1023-1037 | Received 09 Jan 2018, Accepted 04 Jan 2021, Published online: 16 Mar 2021
 

Abstract

This article reviews recent advances in missing data research using graphical models to represent multivariate dependencies. We first examine the limitations of traditional frameworks from three different perspectives: transparency, estimability, and testability. We then show how procedures based on graphical models can overcome these limitations and provide meaningful performance guarantees even when data are missing not at random (MNAR). In particular, we identify conditions that guarantee consistent estimation in broad categories of missing data problems, and derive procedures for implementing this estimation. Finally, we derive testable implications for missing data models in both missing at random and MNAR categories.

Notes

1 These results apply to modified versions of MAR and MNAR as defined in Section 2.2.

2 For a gentle introduction to causal graphical models, see Elwert (2013), Lauritzen (2001), and Pearl (2009b, secs. 1.2 and 11.1.2).

4 The term identifiability is sometimes used in lieu of recoverability. We prefer using recoverability over identifiability since the latter is strongly associated with causal effects, while the former is a broader concept, applicable to statistical relationships as well. See Section 3.5.

5 This definition is more operational than the standard definition of identifiability, for it states explicitly what is achievable under recoverability and, more importantly, what problems may occur under nonrecoverability.

6 A variable is a collider on the path if the path enters and leaves the variable via arrowheads (a term suggested by the collision of causal forces at the variable) (Greenland and Pearl 2011).

7 A Markov blanket Mb_X of a variable X is any set of variables such that X is conditionally independent of all the other variables in the graph given Mb_X (Pearl 1988).

8 For an introduction to do-calculus, see Pearl and Bareinboim (2014, sec. 2.5) and Koller and Friedman (2009).

9 Unless otherwise specified, nonrecoverability takes the joint distribution as the target and does not exclude recoverability of other targets such as the odds ratio (discussed in Bartlett, Harel, and Carpenter 2015).
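
The graphical notions in notes 6 and 7 can be illustrated with a minimal sketch. It assumes a toy directed acyclic graph (X -> Z, Y -> Z, Z -> W) stored as a parent map; the graph, the names `PARENTS`, `is_collider`, and `markov_blanket`, and the choice of parents, children, and the children's other parents as the blanket are all illustrative assumptions rather than anything taken from the article. In a DAG that set is one standard Markov blanket, although note 7 allows any set with the stated independence property.

```python
from typing import Dict, List, Set

# Hypothetical toy DAG, stored as node -> set of parents.
# Edges: X -> Z, Y -> Z, Z -> W
PARENTS: Dict[str, Set[str]] = {
    "X": set(),
    "Y": set(),
    "Z": {"X", "Y"},
    "W": {"Z"},
}


def is_collider(parents: Dict[str, Set[str]], path: List[str], i: int) -> bool:
    """Return True if path[i] is a collider on the path.

    A node is a collider when the path enters and leaves it via
    arrowheads, i.e. both of its neighbours on the path are its parents.
    """
    if i <= 0 or i >= len(path) - 1:
        return False  # endpoints of a path cannot be colliders
    node = path[i]
    return path[i - 1] in parents[node] and path[i + 1] in parents[node]


def markov_blanket(parents: Dict[str, Set[str]], node: str) -> Set[str]:
    """Return parents, children, and the children's other parents of `node`.

    Assumption: this is one standard Markov blanket in a DAG, not the
    only set satisfying the definition in note 7.
    """
    children = {c for c, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]} - {node}
    return parents[node] | children | co_parents


if __name__ == "__main__":
    print(is_collider(PARENTS, ["X", "Z", "Y"], 1))  # Z on X -> Z <- Y: True
    print(is_collider(PARENTS, ["X", "Z", "W"], 1))  # Z on X -> Z -> W: False
    print(sorted(markov_blanket(PARENTS, "X")))      # ['Y', 'Z']
```

In this toy graph, Y belongs to the blanket of X only because the two share the child Z, which is the collider on the path between them.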

Additional information

Funding

The authors gratefully acknowledge support of this work by grants from NSF IIS-1302448, IIS-1527490, and IIS-1704932; ONR N00014-17-1-2091; DARPA W911NF-16-1-0579.
