Abstract
Causal inference with observational data is a central goal in many fields. Propensity score methods are design-based approaches that try to ensure covariate balance without using information from the outcome variables. Analysis-based approaches, such as Bayesian Additive Regression Trees and the Causal Forest, bypass the issue of covariate balance and model the outcomes directly. We use a Monte Carlo simulation to study the performance of these two types of approaches. Some of the simulation scenarios involve a large number of covariates relative to the number of observations. We find that the analysis-based approaches can perform very poorly without giving any warning that the covariate distributions of the treated and control groups overlap insufficiently. In contrast, the propensity score methods do warn about insufficient overlap, but such warnings can be overly cautious when overlap is in fact adequate.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Additional information
Notes on contributors
Junni L. Zhang
Junni L. Zhang is an Associate Professor of Statistics at the National School of Development, Peking University, People's Republic of China. She obtained her Ph.D. in statistics from Harvard University. Her research interests are causal inference, Bayesian demography, and data and text mining.