Efficient and Robust Estimation of the Generalized LATE Model

Abstract

This article studies the estimation of causal parameters in the generalized local average treatment effect (GLATE) model, which extends the traditional LATE model to multivalued treatments. We derive the efficient influence function (EIF) and the semiparametric efficiency bound (SPEB) for two types of causal parameters: the local average structural function (LASF) and the local average structural function for the treated (LASFT). The moment conditions generated by the EIF satisfy two robustness properties: double robustness and Neyman orthogonality. Based on these robust moment conditions, we propose a double/debiased machine learning (DML) estimator of the LASF. The DML estimator is well suited to high-dimensional settings. We also propose null-restricted inference methods that are robust to weak identification. As an empirical application of these methods, we examine potential health outcomes across different types of health insurance plans using data from the Oregon Health Insurance Experiment.

Acknowledgments

This article is a revised version of the third chapter of my doctoral dissertation at UC San Diego. I am grateful to my advisors, Graham Elliott and Yixiao Sun, for their generous advice, guidance, and support. I thank the Editor, Ivan Canay, an associate editor, and two referees, whose valuable comments and suggestions have significantly improved this article. I also thank Wei-Lin Chen, Yu-Chang Chen, and Kaspar Wüthrich for insightful discussions.

Disclosure Statement

The author reports there are no competing interests to declare.

Notes

1 To distinguish it from the GLATE model, we refer to the LATE model studied by Imbens and Angrist (1994) as the “binary LATE model.”

2 Recently, Newey and Stouli (2021) gave a necessary and sufficient characterization for the identification of the average structural function at multiple treatment levels, where the usual unconfoundedness (full independence) assumption is relaxed to mean independence.

3 If the treatment variable were continuous instead of discrete, the analysis would differ. For instance, Flores et al. (2012) and Huber et al. (2020) provide such methodological and empirical analyses.

4 A more detailed background on the experiment is deferred to Section 6.

5 Heckman and Pinto (2018) focus on the identification analysis while keeping the conditioning covariates implicit. In other words, their analysis can be interpreted as being conditional on X = x for a specific value of x. For the estimation issues addressed in our article, however, the role of covariates is more prominent. For example, the efficiency bound is computed directly from conditional moments of various variables given X, and the smoothness of these conditional moments and the dimension of X both play a role in determining whether the causal parameter can be estimated at the √n rate. Therefore, the covariates must be explicitly modeled. For a discussion of the role of covariates in LATE models, see Chen and Xie (2022).

6 These two sets of conditioning covariates are considered in the original paper by Finkelstein et al. (2012).

7 In this article, we consider the set of covariates as given by the researcher’s field knowledge or data availability. The issue of model specification, including the selection of covariates, is beyond the scope of our study.

8 The Moore-Penrose inverse $A^{+}$ of a matrix $A$ is the unique matrix that satisfies $A A^{+} A = A$ and $A^{+} A A^{+} = A^{+}$, and for which both $A A^{+}$ and $A^{+} A$ are symmetric.
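
The four conditions in Note 8 can be checked numerically. The sketch below is illustrative only and is not taken from the article: the helper names (`matmul`, `transpose`, `pinv_full_column_rank`, `satisfies_penrose_conditions`) and the example matrix are assumptions made here, and the closed form $(A^{\top}A)^{-1}A^{\top}$ applies only to the special case of a matrix with two linearly independent columns. Exact rational arithmetic is used so that the equalities hold without rounding tolerance.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def pinv_full_column_rank(A):
    """Pseudoinverse (A^T A)^{-1} A^T for an m x 2 matrix with independent columns.

    Assumption for this sketch: A has exactly two linearly independent columns,
    so A^T A is an invertible 2 x 2 matrix.
    """
    At = transpose(A)
    (a, b), (c, d) = matmul(At, A)
    det = a * d - b * c
    if det == 0:
        raise ValueError("columns of A are linearly dependent")
    gram_inverse = [[d / det, -b / det], [-c / det, a / det]]
    return matmul(gram_inverse, At)


def satisfies_penrose_conditions(A, A_plus):
    """Check the four defining conditions of the Moore-Penrose inverse."""
    AAp = matmul(A, A_plus)
    ApA = matmul(A_plus, A)
    return (
        matmul(AAp, A) == A              # A A+ A = A
        and matmul(ApA, A_plus) == A_plus  # A+ A A+ = A+
        and AAp == transpose(AAp)        # A A+ is symmetric
        and ApA == transpose(ApA)        # A+ A is symmetric
    )


# Hypothetical 3 x 2 example matrix with exact rational entries.
A = [
    [Fraction(1), Fraction(0)],
    [Fraction(0), Fraction(1)],
    [Fraction(1), Fraction(1)],
]
A_plus = pinv_full_column_rank(A)
print(satisfies_penrose_conditions(A, A_plus))
```

The transpose of the example matrix has the right shape but fails the first condition, which shows that the four conditions together, and not the dimensions alone, pin down $A^{+}$.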

9 To clarify the notation used, the subscript 1 in $p_{no,1}$ and $\beta_{no,1}$ represents k = 1, whereas the subscript 1 in $P_{no,1}$ and $Q_{no,1}$ corresponds to z = 1.

10 As with the proofs provided in Frölich (2007) and Hong and Nekipelov (2010), our derivation of the efficiency bound does not involve the explicit construction of parametric submodels that satisfy the model’s identification assumptions. This approach, as emphasized in the recent study by Navjeevan, Pinto, and Santos (2023), could result in a loose efficiency bound when the model is locally overidentified, as defined by Chen and Santos (2018). However, similar to the binary case, the GLATE model with multivalued treatment is locally just identified in the sense of the same definition. Consequently, our efficiency bound is accurate. We appreciate the editor’s suggestion on this issue.

11 The terminology “conditional expectation projection” is adopted from Chen, Hong, and Tarozzi (2008) and Hong and Nekipelov (2010), whereas Hahn (1998) refers to these estimators as “nonparametric imputation based estimators.”

12 The full subscripts are kept in the Appendices that contain proofs.

13 In two-step semiparametric estimation, Donsker properties are usually required so that a suitable stochastic equicontinuity condition is satisfied. See, for example, Assumption 2.5 in Chen, Linton, and Van Keilegom (2003).

14 We study the DML2 estimator defined in Chernozhukov et al. (2018). Another estimator, the DML1 estimator, is also proposed in that same paper. We do not study the DML1 estimator since it is asymptotically equivalent to DML2, and the authors generally recommend DML2.
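
The difference between the two estimators in Note 14 can be sketched for a generic linear score of the form ψ = ψ_a·θ + ψ_b. This is a simplified illustration under assumptions made here, not the GLATE score or the article's implementation: the function names and the toy score values are hypothetical, and the score components are taken as already evaluated with nuisance estimates fitted on the other folds. DML1 solves the moment condition within each fold and averages the fold-level solutions, whereas DML2 averages the fold-level moment conditions first and solves once.

```python
from fractions import Fraction


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def dml1(folds):
    """Solve the linear moment condition in each fold, then average the solutions."""
    fold_estimates = [
        -mean(b for _, b in fold) / mean(a for a, _ in fold) for fold in folds
    ]
    return mean(fold_estimates)


def dml2(folds):
    """Average the fold-level moment conditions, then solve once for theta."""
    avg_a = mean(mean(a for a, _ in fold) for fold in folds)
    avg_b = mean(mean(b for _, b in fold) for fold in folds)
    return -avg_b / avg_a


# Hypothetical toy data: each fold is a list of (psi_a, psi_b) pairs that are
# assumed to have been computed with out-of-fold nuisance estimates.
folds = [
    [(Fraction(-1), Fraction(2)), (Fraction(-1), Fraction(4))],
    [(Fraction(-2), Fraction(2)), (Fraction(-2), Fraction(6))],
]
print("DML1:", dml1(folds))
print("DML2:", dml2(folds))
```

In this toy example the two estimates differ because ψ_a varies across folds; when ψ_a is the same constant in every fold, the two functions return the same value.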

15 The intercept is typically not penalized and thus not included in the matrix μ.

16 The lottery list variables consist of age; sex; whether English is the preferred language; whether the individuals signed themselves up for the lottery; whether they provided a phone number; whether the individuals gave their address as a PO box; whether they signed up the first or last day the lottery list was open; and the week they signed up for the lottery.

17 Alternative values of the elastic net parameter α yield similar results, which are omitted here for brevity.
