Review

Graphical Models for Processing Missing Data

Pages 1023-1037 | Received 09 Jan 2018, Accepted 04 Jan 2021, Published online: 16 Mar 2021

References

  • Adams, J. (2007), Researching Complementary and Alternative Medicine, London: Routledge.
  • Allison, P. D. (2002), Missing Data, Quantitative Applications in the Social Sciences, Thousand Oaks, CA: SAGE.
  • Allison, P. D. (2003), “Missing Data Techniques for Structural Equation Modeling,” Journal of Abnormal Psychology, 112, 545.
  • Balakrishnan, N. (2010), Methods and Applications of Statistics in the Life and Health Sciences, Hoboken, NJ: Wiley.
  • Bang, H., and Robins, J. M. (2005), “Doubly Robust Estimation in Missing Data and Causal Inference Models,” Biometrics, 61, 962–973. DOI: https://doi.org/10.1111/j.1541-0420.2005.00377.x.
  • Bartlett, J. W., Carpenter, J. R., Tilling, K., and Vansteelandt, S. (2014), “Improving Upon the Efficiency of Complete Case Analysis When Covariates Are MNAR,” Biostatistics, 15, 719–730. DOI: https://doi.org/10.1093/biostatistics/kxu023.
  • Bartlett, J. W., Harel, O., and Carpenter, J. R. (2015), “Asymptotically Unbiased Estimation of Exposure Odds Ratios in Complete Records Logistic Regression,” American Journal of Epidemiology, 182, 730–736. DOI: https://doi.org/10.1093/aje/kwv114.
  • Bojinov, I., Pillai, N., and Rubin, D. (2017), “Diagnosing Missing Always at Random in Multivariate Data,” arXiv no. 1710.06891.
  • Breskin, A., Cole, S. R., and Hudgens, M. G. (2018), “A Practical Example Demonstrating the Utility of Single-World Intervention Graphs,” Epidemiology, 29, e20–e21.
  • Chang, M. (2011), Modern Issues and Methods in Biostatistics, New York: Springer.
  • Cinelli, C., and Pearl, J. (2018), “On the Utility of Causal Diagrams in Modeling Attrition: A Practical Example,” Technical Report R-479, Department of Computer Science, University of California, Los Angeles, CA, available at https://ftp.cs.ucla.edu/pub/stat_ser/r479.pdf, Journal of Epidemiology (forthcoming).
  • Collins, L. M., Schafer, J. L., and Kam, C.-M. (2001), “A Comparison of Inclusive and Restrictive Strategies in Modern Missing Data Procedures,” Psychological Methods, 6, 330. DOI: https://doi.org/10.1037/1082-989X.6.4.330.
  • Cox, D. R., and Wermuth, N. (1993), “Linear Dependencies Represented by Chain Graphs,” Statistical Science, 8, 204–218. DOI: https://doi.org/10.1214/ss/1177010887.
  • Daniel, R. M., Kenward, M. G., Cousens, S. N., and De Stavola, B. L. (2012), “Using Causal Diagrams to Guide Analysis in Missing Data Problems,” Statistical Methods in Medical Research, 21, 243–256. DOI: https://doi.org/10.1177/0962280210394469.
  • Darwiche, A. (2009), Modeling and Reasoning With Bayesian Networks, New York: Cambridge University Press.
  • Dempster, A., Laird, N., and Rubin, D. (1977), “Maximum Likelihood From Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society, Series B, 39, 1–38.
  • Doretti, M., Geneletti, S., and Stanghellini, E. (2018), “Missing Data: A Unified Taxonomy Guided by Conditional Independence,” International Statistical Review, 86, 189–204. DOI: https://doi.org/10.1111/insr.12242.
  • Elwert, F. (2013), “Graphical Causal Models,” in Handbook of Causal Analysis for Social Research, ed. S. Morgan, Dordrecht: Springer, pp. 245–273.
  • Gill, R. D., and Robins, J. M. (1997), “Sequential Models for Coarsening and Missingness,” in Proceedings of the First Seattle Symposium in Biostatistics, Springer, pp. 295–305.
  • Gill, R. D., Van Der Laan, M. J., and Robins, J. M. (1997), “Coarsening at Random: Characterizations, Conjectures, Counter-Examples,” in Proceedings of the First Seattle Symposium in Biostatistics, Springer, pp. 255–294.
  • Gleason, T. C., and Staelin, R. (1975), “A Proposal for Handling Missing Data,” Psychometrika, 40, 229–252. DOI: https://doi.org/10.1007/BF02291569.
  • Graham, J. W. (2009), “Missing Data Analysis: Making It Work in the Real World,” Annual Review of Psychology, 60, 549–576. DOI: https://doi.org/10.1146/annurev.psych.58.110405.085530.
  • Graham, J. W. (2012), Missing Data: Analysis and Design, Statistics for Social and Behavioral Sciences, New York: Springer.
  • Greenland, S., and Pearl, J. (2011), “Causal Diagrams,” in International Encyclopedia of Statistical Science, ed. M. Lovric, Berlin, Heidelberg: Springer, pp. 208–216.
  • Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., and Smola, A. (2012), “A Kernel Two-Sample Test,” Journal of Machine Learning Research, 13, 723–773.
  • Haitovsky, Y. (1968), “Missing Data in Regression Analysis,” Journal of the Royal Statistical Society, Series B, 30, 67–82. DOI: https://doi.org/10.1111/j.2517-6161.1968.tb01507.x.
  • Heckman, J. J. (1976), “The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models,” in Annals of Economic and Social Measurement (Vol. 5, No. 4), ed. S. V. Berg, Cambridge, MA: NBER, pp. 475–492.
  • Holmes, C. B., Sikazwe, I., Sikombe, K., Eshun-Wilson, I., Czaicki, N., Beres, L. K., Mukamba, N., Simbeza, S., Moore, C. B., Hantuba, C., and Mwaba, P. (2018), “Estimated Mortality on HIV Treatment Among Active Patients and Patients Lost to Follow-Up in 4 Provinces of Zambia: Findings From a Multistage Sampling-Based Survey,” PLoS Medicine, 15, e1002489. DOI: https://doi.org/10.1371/journal.pmed.1002489.
  • Huang, Y., and Valtorta, M. (2006), “Identifiability in Causal Bayesian Networks: A Sound and Complete Algorithm,” in Proceedings of the National Conference on Artificial Intelligence (Vol. 21). Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999, pp. 1149–1154.
  • Koller, D., and Friedman, N. (2009), Probabilistic Graphical Models: Principles and Techniques, Cambridge, MA: Massachusetts Institute of Technology.
  • Kuroki, M., and Pearl, J. (2014), “Measurement Bias and Effect Restoration in Causal Inference,” Biometrika, 101, 423–437. DOI: https://doi.org/10.1093/biomet/ast066.
  • Lauritzen, S. L. (1996), Graphical Models (Vol. 17), Oxford: Oxford University Press.
  • Lauritzen, S. L. (2001), “Causal Inference From Graphical Models,” in Complex Stochastic Systems, eds. O. E. Barndorff-Nielsen, D. R. Cox, and C. Klüppelberg, Boca Raton, FL: Chapman and Hall/CRC Press, pp. 63–107.
  • Li, L., Shen, C., Li, X., and Robins, J. M. (2013), “On Weighting Approaches for Missing Data,” Statistical Methods in Medical Research, 22, 14–30. DOI: https://doi.org/10.1177/0962280211403597.
  • Little, R. J. (1988), “A Test of Missing Completely at Random for Multivariate Data With Missing Values,” Journal of the American Statistical Association, 83, 1198–1202. DOI: https://doi.org/10.1080/01621459.1988.10478722.
  • Little, R. J. (2008), “Selection and Pattern-Mixture Models,” in Longitudinal Data Analysis, eds. G. Fitzmaurice, M. Davidian, G. Verbeke, and G. Molenberghs, Boca Raton, FL: CRC Press, pp. 409–431.
  • Little, R. J., and Rubin, D. (2002), Statistical Analysis With Missing Data, New York: Wiley.
  • Little, R. J., and Rubin, D. (2014), Statistical Analysis With Missing Data, New York: Wiley.
  • Meyers, L. S., Gamst, G., and Guarino, A. J. (2006), Applied Multivariate Research: Design and Interpretation, Thousand Oaks, CA: SAGE.
  • Mohan, K. (2018), “On Handling Self-Masking and Other Hard Missing Data Problems,” in AAAI Symposium 2018, available at https://why19.causalai.net/papers/mohan-why19.pdf.
  • Mohan, K., and Pearl, J. (2014a), “Graphical Models for Recovering Probabilistic and Causal Queries From Missing Data,” in Advances in Neural Information Processing Systems (Vol. 27), eds. Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Weinberger, Curran Associates, Inc., pp. 1520–1528.
  • Mohan, K., and Pearl, J. (2014b), “On the Testability of Models With Missing Data,” in Proceedings of AISTAT.
  • Mohan, K., Pearl, J., and Tian, J. (2013), “Graphical Models for Inference With Missing Data,” in Advances in Neural Information Processing Systems (Vol. 26), pp. 1277–1285.
  • Mohan, K., Van den Broeck, G., Choi, A., and Pearl, J. (2014), “An Efficient Method for Bayesian Network Parameter Learning From Incomplete Data,” Technical Report, UCLA, Presented at Causal Modeling and Machine Learning Workshop, ICML-2014.
  • Osborne, J. W. (2012), Best Practices in Data Cleaning: A Complete Guide to Everything You Need to Do Before and After Collecting Your Data, Thousand Oaks, CA: SAGE.
  • Osborne, J. W. (2014), Best Practices in Logistic Regression, Thousand Oaks, CA: SAGE.
  • Pearl, J. (1988), Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, San Francisco, CA: Morgan Kaufmann.
  • Pearl, J. (1995), “Causal Diagrams for Empirical Research,” Biometrika, 82, 669–688.
  • Pearl, J. (2009a), “Causal Inference in Statistics: An Overview,” Statistics Surveys, 3, 96–146.
  • Pearl, J. (2009b), Causality: Models, Reasoning and Inference, New York: Cambridge University Press.
  • Pearl, J., and Bareinboim, E. (2014), “External Validity: From Do-Calculus to Transportability Across Populations,” Statistical Science, 29, 579–595. DOI: https://doi.org/10.1214/14-STS486.
  • Peters, C. L. O., and Enders, C. (2002), “A Primer for the Estimation of Structural Equation Models in the Presence of Missing Data: Maximum Likelihood Algorithms,” Journal of Targeting, Measurement and Analysis for Marketing, 11, 81–95. DOI: https://doi.org/10.1057/palgrave.jt.5740069.
  • Pfeffermann, D., and Sikov, A. (2011), “Imputation and Estimation Under Nonignorable Nonresponse in Household Surveys With Missing Covariate Information,” Journal of Official Statistics, 27, 181–209.
  • Potthoff, R., Tudor, G., Pieper, K., and Hasselblad, V. (2006), “Can One Assess Whether Missing Data Are Missing at Random in Medical Studies?,” Statistical Methods in Medical Research, 15, 213–234. DOI: https://doi.org/10.1191/0962280206sm448oa.
  • Resseguier, N., Giorgi, R., and Paoletti, X. (2011), “Sensitivity Analysis When Data Are Missing Not-at-Random,” Epidemiology, 22, 282. DOI: https://doi.org/10.1097/EDE.0b013e318209dec7.
  • Rhoads, C. H. (2012), “Problems With Tests of the Missingness Mechanism in Quantitative Policy Studies,” Statistics, Politics, and Policy, 3, 1–25. DOI: https://doi.org/10.1515/2151-7509.1012.
  • Robins, J. M. (1997), “Non-Response Models for the Analysis of Non-Monotone Non-Ignorable Missing Data,” Statistics in Medicine, 16, 21–37. DOI: https://doi.org/10.1002/(SICI)1097-0258(19970115)16:1<21::AID-SIM470>3.0.CO;2-F.
  • Robins, J. M. (2000), “Robust Estimation in Sequentially Ignorable Missing Data and Causal Inference Models,” in Proceedings of the American Statistical Association (Vol. 1999), Indianapolis, IN, pp. 6–10.
  • Robins, J. M., Rotnitzky, A., and Scharfstein, D. O. (2000), “Sensitivity Analysis for Selection Bias and Unmeasured Confounding in Missing Data and Causal Inference Models,” in Statistical Models in Epidemiology, the Environment, and Clinical Trials, eds. M. E. Halloran and D. Berry, New York: Springer, pp. 1–94.
  • Robins, J. M., Rotnitzky, A., and Zhao, L. P. (1994), “Estimation of Regression Coefficients When Some Regressors Are Not Always Observed,” Journal of the American Statistical Association, 89, 846–866. DOI: https://doi.org/10.1080/01621459.1994.10476818.
  • Rotnitzky, A., Robins, J. M., and Scharfstein, D. O. (1998), “Semiparametric Regression for Repeated Outcomes With Nonignorable Nonresponse,” Journal of the American Statistical Association, 93, 1321–1339. DOI: https://doi.org/10.1080/01621459.1998.10473795.
  • Rubin, D. B. (1976), “Inference and Missing Data,” Biometrika, 63, 581–592. DOI: https://doi.org/10.1093/biomet/63.3.581.
  • Rubin, D. B. (1978), “Multiple Imputations in Sample Surveys—A Phenomenological Bayesian Approach to Nonresponse,” in Proceedings of the Survey Research Methods Section of the American Statistical Association (Vol. 1), American Statistical Association, pp. 20–34.
  • Scheffer, J. (2002), “Dealing With Missing Data,” Research Letters in the Information and Mathematical Sciences, 3, 153–160.
  • Seaman, S., Galati, J., Jackson, D., and Carlin, J. (2013), “What Is Meant By ‘Missing at Random’?,” Statistical Science, 28, 257–268. DOI: https://doi.org/10.1214/13-STS415.
  • Shpitser, I., and Pearl, J. (2006), “Identification of Conditional Interventional Distributions,” in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence, pp. 437–444.
  • Sriperumbudur, B. K., Gretton, A., Fukumizu, K., Schölkopf, B., and Lanckriet, G. R. (2010), “Hilbert Space Embeddings and Metrics on Probability Measures,” Journal of Machine Learning Research, 11, 1517–1561.
  • Sverdlov, O. (2015), Modern Adaptive Randomized Clinical Trials: Statistical and Practical Aspects, Boca Raton, FL: Chapman and Hall/CRC.
  • Székely, G. J., Rizzo, M. L., and Bakirov, N. K. (2007), “Measuring and Testing Dependence by Correlation of Distances,” The Annals of Statistics, 35, 2769–2794. DOI: https://doi.org/10.1214/009053607000000505.
  • Thoemmes, F., and Mohan, K. (2015), “Graphical Representation of Missing Data Problems,” Structural Equation Modeling: A Multidisciplinary Journal, 22, 631–642. DOI: https://doi.org/10.1080/10705511.2014.937378.
  • Thoemmes, F., and Rose, N. (2013), “Selection of Auxiliary Variables in Missing Data Problems: Not All Auxiliary Variables Are Created Equal,” Technical Report R-002, Cornell University.
  • Tian, J., and Pearl, J. (2002), “On the Testable Implications of Causal Models With Hidden Variables,” in Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann Publishers Inc., pp. 519–527.
  • Van den Broeck, G., Mohan, K., Choi, A., Darwiche, A., and Pearl, J. (2015), “Efficient Algorithms for Bayesian Network Parameter Learning From Incomplete Data,” in Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, pp. 161–170.
  • Van der Laan, M., and Robins, J. (2003), Unified Methods for Censored Longitudinal Data and Causality, New York: Springer-Verlag.
  • van Stein, B., and Kowalczyk, W. (2016), “An Incremental Algorithm for Repairing Training Sets With Missing Values,” in International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, Springer, pp. 175–186.
  • Verma, T., and Pearl, J. (1991), “Equivalence and Synthesis of Causal Models,” in Proceedings of the Sixth Conference in Artificial Intelligence, Association for Uncertainty in AI, pp. 220–227.
  • White, I. R., and Carlin, J. B. (2010), “Bias and Efficiency of Multiple Imputation Compared With Complete-Case Analysis for Missing Covariate Values,” Statistics in Medicine, 29, 2920–2931. DOI: https://doi.org/10.1002/sim.3944.
  • Zhang, N., Chen, H., and Elliott, M. R. (2016), “Nonrespondent Subsample Multiple Imputation in Two-Phase Sampling for Nonresponse,” Journal of Official Statistics, 32, 769–785. DOI: https://doi.org/10.1515/jos-2016-0039.