Method

Using machine learning for efficient flexible regression adjustment in economic experiments

Received 20 Feb 2024, Accepted 14 Jun 2024, Published online: 01 Aug 2024
 

Abstract

This study investigates the optimal use of covariates to reduce variance when analyzing experimental data. We show that finding the variance-minimizing strategy for using pre-treatment observables is equivalent to estimating the conditional expectation function of the outcome given all available pre-randomization observables. This is a pure prediction problem, which recent advances in machine learning (ML) are well suited to tackling. Through a number of empirical examples, we show how ML-based regression adjustments can feasibly be implemented in practical settings. We compare our proposed estimator to other standard variance reduction techniques in the literature. Two important advantages of our ML-based regression adjustment estimators are that (i) they improve asymptotic efficiency relative to the alternatives and (ii) they can be implemented automatically, with relatively little tuning from the researcher, which limits the scope for data-snooping.
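As an illustrative sketch only (the paper's replication code is in Stata and R, and its exact formulas are its Equations 11 and 12), the logic of a cross-fitted regression adjustment can be written as follows. The function names are ours, and a simple OLS fit stands in for the fitted ML model; any accurate predictor can take its place.

```python
import random
import statistics

def fit_linear(xs, ys):
    """OLS fit of y ~ a + b*x; a stand-in for any ML prediction model."""
    xbar, ybar = statistics.mean(xs), statistics.mean(ys)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xs, ys)) \
        / sum((xi - xbar) ** 2 for xi in xs)
    a = ybar - b * xbar
    return lambda x: a + b * x

def cross_fit_ate(x, y, w, rho1=0.5):
    """Two-fold cross-fitted, regression-adjusted ATE estimate.
    Each unit's adjustment terms m1(x_i), m0(x_i) are fit on the *other*
    fold, so the fitted model is independent of the residual it adjusts."""
    n = len(y)
    idx = list(range(n))
    random.shuffle(idx)
    folds = [idx[: n // 2], idx[n // 2:]]
    taus = []
    for k in (0, 1):
        train, test = folds[1 - k], folds[k]
        m1 = fit_linear([x[i] for i in train if w[i] == 1],
                        [y[i] for i in train if w[i] == 1])
        m0 = fit_linear([x[i] for i in train if w[i] == 0],
                        [y[i] for i in train if w[i] == 0])
        for i in test:
            taus.append(m1(x[i]) - m0(x[i])
                        + w[i] * (y[i] - m1(x[i])) / rho1
                        - (1 - w[i]) * (y[i] - m0(x[i])) / (1 - rho1))
    # Point estimate and standard error from the per-unit terms
    return statistics.mean(taus), statistics.stdev(taus) / n ** 0.5
```

On simulated data with a true treatment effect, the point estimate recovers the effect up to sampling noise; swapping `fit_linear` for a gradient-boosting or random-forest fit leaves the rest of the logic unchanged.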

JEL Codes:

Acknowledgments

We thank Lyft Inc. for providing a large portion of the data used in this project. We additionally thank Adeline Sutton for her help in accessing and interpreting the CHECC data, as well as Brent Hickman, Michael Cuna, Atom Vayalinkal, and participants at the Advances in Field Experiments conference for helpful comments that have improved the article. Documentation of our procedures and our Stata and R code can be found here: https://github.com/gsun593/FlexibleRA

Declaration of Interest

John List was Chief Economist at Lyft when this research was carried out. He is now Chief Economist at Walmart. Ian Muir and Gregory Sun were also employed at Lyft at the time that the research was carried out. They are no longer affiliated with Lyft.

Author Contribution Statement

The authors confirm contribution to the paper as follows: study conception and design: Gregory Sun; data collection: Ian Muir, Gregory Sun; analysis and interpretation of results: John List, Ian Muir, Gregory Sun; draft manuscript preparation: John List, Gregory Sun. All authors reviewed the results and approved the final version of the manuscript.

Notes

1. Specifically, our estimators attain the asymptotic efficiency bound subject to the constraint that Pr(W_{i,g} = 1 | X_i = x) = ρ_g for fixed proportions ρ and for all x. If randomization probabilities can be made conditional on x, then for a fixed target parameter, variance can be further decreased by exploiting heteroskedasticity in Y_i(g) conditional on X_i. For instance, if the researcher is interested in estimating the average treatment effect, then the researcher could further reduce variance by over-sampling treatments for which the variance of the outcome is higher: Pr(W_{i,g} = 1 | X_i = x) ∝ √Var(Y_i(g) | X_i = x). This information is often difficult to obtain in practice, and moreover, the optimal sampling design for one target parameter may not be optimal for another.
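The allocation rule in this note is Neyman allocation: sample each arm in proportion to its outcome standard deviation. A minimal sketch (the helper name is ours, not from the paper):

```python
from math import sqrt

def neyman_proportions(variances):
    """Neyman allocation: sample each treatment arm in proportion to the
    standard deviation of its outcome, i.e. Pr(arm g) proportional to
    sqrt(Var(Y(g)))."""
    sds = [sqrt(v) for v in variances]
    total = sum(sds)
    return [s / total for s in sds]

# An arm whose outcome variance is 4x the other's gets twice the sample:
# proportions of 2/3 and 1/3.
props = neyman_proportions([4.0, 1.0])
```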

2. We provide code for doing so at https://github.com/gsun593/FlexibleRA

3. Such a choice makes Ĉ deterministically 0.

4. Note a subtle difference in the justification for this fact. In this case, Ag is uncorrelated only with linear functions of X, but because we are restricting ourselves to the class of linear in X regression adjustments, the summands of Bg and Cgβg are all restricted to be linear as well.

5. The two examples NW explicitly have in mind are logistic regression and Poisson regression. In the former case, f(·) = exp(·)/(1 + exp(·)), while in the latter case, f(·) = exp(·).

6. Here, A and B are as in the previous section.

7. As we will see in our simulations, we still prefer m̂ to be high quality, as the ability of m̂ to fit the data affects the sampling variability of the resulting estimator.

8. However, this point should not be overstated. Nonparametric estimators typically suffer from slower rates of convergence than parametric estimators, so in a finite sample, one may still prefer linear regression adjustment. Our empirical results suggest that, in general, one should pick the method that produces the highest quality out-of-sample predictions of the outcome as measured by mean squared error.
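This selection rule can be made concrete with a small sketch (hypothetical helper names, not the authors' code): fit each candidate adjustment model, then keep the one with the lowest out-of-sample MSE.

```python
def holdout_mse(predict, xs, ys):
    """Out-of-sample mean squared error of a fitted prediction function."""
    return sum((predict(xi) - yi) ** 2 for xi, yi in zip(xs, ys)) / len(ys)

def pick_adjustment(candidates, x_hold, y_hold):
    """Given fitted candidate models (label -> prediction function),
    return the label of the one with the lowest holdout MSE."""
    return min(candidates,
               key=lambda name: holdout_mse(candidates[name], x_hold, y_hold))
```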

9. Note that if the sample size is not sufficiently large, some care should be taken to ensure that each fold gets observations from each of the treatment groups g.
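One simple way to implement this safeguard (an illustrative sketch, not the authors' code) is to shuffle within each treatment group and deal observations out to folds round-robin, so every fold receives units from every group:

```python
import random
from collections import defaultdict

def stratified_folds(treatment, n_folds, seed=0):
    """Assign each observation to a cross-fitting fold, stratifying by
    treatment group so every fold contains units from every group."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for i, g in enumerate(treatment):
        by_group[g].append(i)
    fold = [None] * len(treatment)
    for g, idxs in by_group.items():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            fold[i] = j % n_folds  # deal group-g units across folds
    return fold
```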

10. R code implementing our flexible regression adjustment, along with the analyses of the three non-Lyft settings, can be found at the following link: https://github.com/gsun593/FlexibleRA. We have also included a copy of the code in Appendix B.

11. See Friedman et al. (2004) for an interpretation of this strategy as approximating the solution to a LASSO-like estimation procedure.

12. For confidentiality reasons, we cannot report the exact size of this sample. However, at the time of writing, Lyft's recorded passenger count was in the tens of millions.

13. Specifically, the x-axis in these Q-Q plots is defined by the theoretical quantiles of a standard normal distribution, while the y-axis corresponds to the empirical quantiles. If the asymptotic theory is correct, the points in these plots should lie close to the 45-degree line, and deviations from this prediction allow us to visualize departures from asymptotic normality more precisely.
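The construction described in this note can be sketched as follows (illustrative Python using the standard library's `NormalDist`; the paper's own plots are produced with its R code):

```python
from statistics import NormalDist

def qq_points(sample):
    """Coordinates of a normal Q-Q plot: theoretical standard-normal
    quantiles (x) against empirical quantiles of the standardized
    sample (y). Points near the 45-degree line suggest normality."""
    n = len(sample)
    mu = sum(sample) / n
    sd = (sum((s - mu) ** 2 for s in sample) / (n - 1)) ** 0.5
    ys = sorted((s - mu) / sd for s in sample)  # empirical quantiles
    nd = NormalDist()
    xs = [nd.inv_cdf((i + 0.5) / n) for i in range(n)]  # plotting positions
    return list(zip(xs, ys))
```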

14. This reduction is not just due to noise: the difference would be statistically significant if subjected to formal hypothesis testing.

15. If the nonparametric method being used has algorithmic complexity growing faster than linearly in dataset size (which is common), two-fold cross-fitting would be even faster than not using a split sample for sufficiently large datasets.

16. Specifically, we implemented our point estimates according to Equation 11 and our standard errors according to Equation 12, but using an OLS fit for m̂_{g,i} in place of a fitted machine learning model.

