Abstract
Access to high-dimensional data has made the use of machine learning in causal inference more common in recent years. The double/debiased machine learning (DML) estimator of the treatment effect is designed to deliver valid inference when the nuisance functions in the treatment and outcome equations are estimated using machine learning methods. However, when some covariates in the treatment equation do not appear in the outcome equation, including them in the propensity score estimation increases the bias and variance of the DML estimator. To address this issue, we introduce an outcome-adaptive DML estimator, which incorporates the outcome-adaptive lasso for variable selection in the propensity score estimation. We evaluate the performance of the proposed method using Monte Carlo simulations. The results indicate that the proposed method outperforms other methods in many cases.
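The idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a doubly robust (AIPW) score with two-fold cross-fitting as the DML step, a linear outcome regression standing in for a machine learning learner, and fixed penalty settings `lam` and `gamma` that are hypothetical choices rather than tuned values. The simulated design (two confounders, two outcome-only covariates, two treatment-only covariates) and all function names are likewise assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients; position 0 holds the intercept."""
    Z = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def oal_logit(X, d, y, lam=0.01, gamma=2.0, n_iter=2000):
    """Weighted-L1 logistic regression of d on X via proximal gradient.

    Penalty weights follow the outcome-adaptive idea: |beta_j|^(-gamma),
    where beta_j is the coefficient of X_j in a regression of y on (d, X).
    lam and gamma are hypothetical fixed values, not tuned.
    """
    beta = ols(np.column_stack([d, X]), y)[2:]   # drop intercept and treatment
    w = np.abs(beta) ** (-gamma)                 # weak outcome predictors -> heavy penalty
    n, p = X.shape
    a0, a = 0.0, np.zeros(p)
    # Step size no larger than 1/L for the mean logistic loss.
    step = 4.0 / (1.0 + np.mean(np.sum(X ** 2, axis=1)))
    for _ in range(n_iter):
        r = sigmoid(a0 + X @ a) - d              # residual used by both updates
        a0 -= step * r.mean()                    # intercept is not penalised
        z = a - step * (X.T @ r) / n
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam * w, 0.0)  # soft-threshold
    return a0, a

def oa_dml_ate(X, d, y, n_folds=2, seed=0):
    """Cross-fitted AIPW estimate of the average treatment effect."""
    n = len(y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    psi = np.empty(n)
    for k in range(n_folds):
        tr, te = folds != k, folds == k
        a0, a = oal_logit(X[tr], d[tr], y[tr])
        e = np.clip(sigmoid(a0 + X[te] @ a), 0.01, 0.99)   # trimmed propensity score
        b1 = ols(X[tr & (d == 1)], y[tr & (d == 1)])       # outcome model, treated
        b0 = ols(X[tr & (d == 0)], y[tr & (d == 0)])       # outcome model, control
        m1 = b1[0] + X[te] @ b1[1:]
        m0 = b0[0] + X[te] @ b0[1:]
        psi[te] = (m1 - m0 + d[te] * (y[te] - m1) / e
                   - (1 - d[te]) * (y[te] - m0) / (1 - e))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n)

# Simulated data: X0, X1 are confounders; X2, X3 affect only the outcome;
# X4, X5 affect only the treatment.
rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 6))
index = 0.5 * X[:, 0] + 0.5 * X[:, 1] + X[:, 4] + X[:, 5]
d = (rng.uniform(size=n) < sigmoid(index)).astype(float)
y = 2.0 * d + X[:, :4].sum(axis=1) + rng.normal(size=n)

_, alpha = oal_logit(X, d, y)
ate, se = oa_dml_ate(X, d, y)
print("propensity coefficients:", np.round(alpha, 3))
print(f"ATE estimate: {ate:.3f} (s.e. {se:.3f}); true effect is 2.0")
```

In this design the treatment-only covariates have near-zero coefficients in the outcome regression, so their penalty weights are very large and the selection step tends to drop them from the propensity score, while the confounders are retained.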
Acknowledgments
The authors thank an anonymous referee and participants of the Kansai Econometric Society Meetings at Hitotsubashi University for their helpful comments and discussion.
Disclosure statement
The first author of this article received an honorarium from the Daiichi-Sankyo donation program (https://www.daiichisankyo.co.jp/corporate/ds-shougakukifu/#secI, accessed March 6, 2021).
Notes
1 In general, when confounders are omitted from the outcome equation, the regression coefficient on the treatment is biased because the error term and the treatment variable are correlated. A covariate that is included in the treatment equation but not in the outcome equation can be used as an instrumental variable, since it is correlated with the regressor (namely, the treatment variable) but uncorrelated with the error term in the regression model.
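The reasoning in this note can be written out for a simple linear model. The notation below (Y for the outcome, D for the treatment, X for an omitted confounder, Z for a covariate that enters only the treatment equation, and u for the composite error) is introduced here for illustration and is not taken from the article.

```latex
\begin{align*}
% Outcome equation with confounder X
Y &= \tau D + \beta X + \varepsilon,
  \qquad \operatorname{Cov}(D,\varepsilon) = \operatorname{Cov}(X,\varepsilon) = 0, \\
% Regressing Y on D alone leaves X in the error term u = \beta X + \varepsilon
\hat{\tau}_{\mathrm{OLS}} &\xrightarrow{\;p\;}
  \frac{\operatorname{Cov}(D,Y)}{\operatorname{Var}(D)}
  = \tau + \beta\,\frac{\operatorname{Cov}(D,X)}{\operatorname{Var}(D)}, \\
% A covariate Z with Cov(Z,D) \neq 0 and Cov(Z,u) = 0 identifies \tau
\hat{\tau}_{\mathrm{IV}} &\xrightarrow{\;p\;}
  \frac{\operatorname{Cov}(Z,Y)}{\operatorname{Cov}(Z,D)}
  = \tau + \frac{\operatorname{Cov}(Z,u)}{\operatorname{Cov}(Z,D)}
  = \tau .
\end{align*}
```

The second line shows that the bias vanishes only when the omitted confounder has no effect on the outcome or is uncorrelated with the treatment; the third line requires both instrument conditions stated in the note.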