Measurement, Statistics, and Research Design

A Comparison of Propensity Score Weighting Methods for Evaluating the Effects of Programs With Multiple Versions

