Research Article

Estimation of average treatment effects for massively unbalanced binary outcomes

Received 31 May 2022, Accepted 27 Nov 2023, Published online: 15 May 2024
 

Abstract

The MLE of the ATE in the logit model for binary outcomes may have a significant second-order bias if the event has a low probability, which is the case we focus on in this article. We derive the second-order bias of the logit ATE estimator, and we propose a bias-corrected estimator of the ATE. We also propose a variation on the logit model with parameters that are elasticities. Finally, we propose a computational trick that avoids numerical instability in the case of estimation for rare events.


Notes

1. See Appendix A.3 for details.

2. Our bias formula looks somewhat different from the one presented in King and Zeng (2001), which in turn is based on McCullagh and Nelder (1989). It can be shown that the two bias formulae are in fact identical. See Appendix B.

3. We assume that the treatment assignment is unconfounded given X.

4. Appendix E also notes that $\delta = 1$ (so that $p_n \propto n^{-1}$) is not compatible with asymptotic normality and that this rate is not appropriate for logit models.

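5. His Equation (2) implies that $\Lambda(\theta_n) \to 0$ and $n\Lambda(\theta_n) \to \infty$. Note that $\Lambda(\theta_n) \to 0$ means that $1/(1+\exp(\theta_n)) \to 1$, which in turn means that $\exp(\theta_n) \to 0$, or $\theta_n \to -\infty$. Also note that $n\Lambda(\theta_n) \to \infty$ rules out the Poisson approximation, because if $\Lambda(\theta_n) \propto n^{-1}$, we cannot have $n\Lambda(\theta_n) \to \infty$. On the other hand, $n\Lambda(\theta_n) \to \infty$ is satisfied as long as $\Lambda(\theta_n) \propto n^{-\delta}$ with $0 \le \delta < 1$.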
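
The rate condition in Note 5 can be illustrated numerically. The sketch below is not taken from the article: it assumes the event probability is exactly `c * n**(-delta)` (the function name `expected_events` and the constant `c` are hypothetical) and computes the expected number of events, n times the event probability.

```python
def expected_events(n: int, delta: float, c: float = 1.0) -> float:
    """Expected event count n * Lambda(theta_n), ASSUMING Lambda(theta_n) = c * n**(-delta).

    The exact power-law form and the constant c are illustrative assumptions.
    """
    return n * c * n ** (-delta)


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        # delta = 1.0 is the Poisson-type rate: the expected count stays bounded.
        # delta = 0.5 keeps events rare, yet the expected count grows with n.
        print(n, expected_events(n, 1.0), expected_events(n, 0.5))
```

Under this assumption the expected count stays at `c` when `delta = 1`, whereas any `delta < 1` makes it grow like `n**(1 - delta)`, which is the regime the note describes.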

6. In , for the $\alpha_0 = 2.5$ and $n = 500$ combination, the bias of $\hat{\beta}$ is 0.0168, where the true value of $\beta$ is 1. So, the MLE overestimates the elasticity by 1.68%, which is reduced to 0.41% by the bias-corrected estimator.

7. When α and the support of $x$ is bounded, we can guarantee that $\exp(\alpha + \beta x) < 1$. If the support of $x$ is not bounded, but if we are sure that $\alpha + \beta x < 0$ for most values of $x$, we may want to adopt a parameterization $P(x) = \Psi(\alpha + \beta x)$, where $\Psi(t) = \frac{1}{2}\exp(t)$ if $t < 0$, and $\Psi(t) = 1 - \frac{1}{2}\exp(-t)$ if $t > 0$.
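
As a minimal sketch of the parameterization in Note 7, the code below reads Ψ as the standard Laplace distribution function, that is, one half of exp(t) for negative t and one minus one half of exp(-t) for positive t. That reading is an assumption about the note's formula; the function names `psi` and `event_probability` are hypothetical, and the value at t = 0 (not specified in the note) is set to 0.5, where both branches agree.

```python
import math


def psi(t: float) -> float:
    """Laplace-type link (assumed reading of Note 7).

    0.5 * exp(t) for t < 0, and 1 - 0.5 * exp(-t) for t >= 0.
    The t = 0 case is an assumption; both branches give 0.5 there.
    """
    if t < 0:
        return 0.5 * math.exp(t)
    return 1.0 - 0.5 * math.exp(-t)


def event_probability(alpha: float, beta: float, x: float) -> float:
    """P(x) = psi(alpha + beta * x); always lies strictly between 0 and 1."""
    return psi(alpha + beta * x)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen so that alpha + beta * x < 0.
    print(event_probability(alpha=-5.0, beta=1.0, x=0.3))
```

On the negative branch the probability is a pure exponential in the index, and the positive branch keeps the value below 1 even when the index is unexpectedly large.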

8. It is in the sense that $\frac{t}{1+t} = t + O(t^2)$.
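
The approximation in Note 8 follows from an exact identity, sketched below. The second line applies it with $t = \exp(\theta)$, which is an assumption about how the note is used: it rests on the logistic function being $\Lambda(\theta) = \exp(\theta)/(1+\exp(\theta))$, in line with Note 5.

```latex
\frac{t}{1+t} - t = \frac{t - t(1+t)}{1+t} = -\frac{t^{2}}{1+t},
\qquad\text{so}\qquad
\left|\frac{t}{1+t} - t\right| \le t^{2} \quad\text{for } t \ge 0.

% Assumed application: set t = \exp(\theta), so that
\Lambda(\theta) = \frac{\exp(\theta)}{1+\exp(\theta)}
               = \exp(\theta) + O\!\left(\exp(2\theta)\right)
\quad\text{as } \theta \to -\infty.
```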

9. p denotes the probability of y = 1 for each parameter combination.

10. As was discussed in Remark 2, the second-order bias of the ATE is zero when the propensity score is constant. In order to verify this result, we will first consider the case that the distributions of x are identical over the D = 1 and D = 0 subsamples. In , , and in Appendix G, we evaluate the performance of various estimators of the ATE under random assignment.

11. Our simulation results are for the case of a single covariate. Adding covariates did not change the conclusions.

12. $p_1$ and $p_0$ denote the probabilities of $y = 1$ for $D_i = 1$ and $D_i = 0$, i.e., the treated and control sub-samples.
