Abstract
There is a fast-growing literature on estimating optimal treatment regimes based on randomized trials or observational studies under a key identifying condition of no unmeasured confounding. Because confounding by unmeasured factors cannot generally be ruled out with certainty in observational studies or randomized trials subject to noncompliance, we propose a general instrumental variable (IV) approach to learning optimal treatment regimes under endogeneity. Specifically, we establish identification of both the value function for a given regime and the optimal regime with the aid of a binary IV, when the assumption of no unmeasured confounding fails to hold. We also construct novel multiply robust classification-based estimators. Furthermore, we propose to identify and estimate optimal treatment regimes among those who would comply with the assigned treatment under a monotonicity assumption. In this latter case, we establish the somewhat surprising result that complier optimal regimes can be consistently estimated without directly collecting compliance information and therefore without the complier average treatment effect itself being identified. Our approach is illustrated via extensive simulation studies and a data application on the effect of child rearing on labor participation. Supplementary materials for this article are available online.
Supplementary Materials
Supplementary material available online includes lower and upper bounds under partial identification, the efficient influence function under Assumptions 2-6 and 8, proofs, and additional simulation scenarios.
NIH Blueprint for Neuroscience Research;
OWL: outcome weighted learning; RWL: residual weighted learning; IV-IW: the proposed weighted estimators; IV-MR: the proposed multiply robust estimators.