Notes
1 The derivation is also unclear. The conditional restriction considered in the appendix is equivalent to requiring that the conditional mean of the treated outcome given X be a function of Z only. The final loss function, appearing in their last display equation, also appears to me to be in error.
2 Unfortunately, a typo in the Journal version replaced the two ’s in its definition with two ’s. In private correspondence, LLL, who pointed out the typo, explained that it led them to mistake this quantity for the value diameter of the policy space (display equation above their Equation (5)) rather than the optimality gap.
3 Both statements depend on the relevant smoothness, of course. More generally and in multivariate settings, this can be phrased in terms of Lipschitz gradient (first statement) and strong convexity (second statement) of the value function. Alternatively, in the finite-policy-space case, we have the argument above using the probability of optimal choice.