Teacher’s Corner

Propensity Score Analysis of Complex Survey Data with Structural Equation Modeling: A Tutorial with Mplus

References

  • Abadie, A., & Imbens, G. W. (2002). Simple and bias-corrected matching estimators for average treatment effects. Technical Working Paper T0283, NBER.
  • Alarcon, G. M. (2011). A meta-analysis of burnout with job demands, resources, and attitudes. Journal of Vocational Behavior, 79, 549–562. doi:10.1016/j.jvb.2011.03.007
  • American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
  • Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103(3), 411–423. doi:10.1037/0033-2909.103.3.411
  • Asparouhov, T. (2006). General multi-level modeling with sampling weights. Communications in Statistics: Theory and Methods, 35(3), 439–460. doi:10.1080/03610920500476598
  • Asparouhov, T., & Muthén, B. O. (2006). Robust chi square difference testing with mean and variance adjusted test statistics. Los Angeles, CA. Retrieved from http://www.statmodel.com/examples/webnote.shtml#web10
  • Austin, P. C. (2008). A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003. Statistics in Medicine, 27(12), 2037–2049. doi:10.1002/sim.3150
  • Austin, P. C. (2009). Some methods of propensity-score matching had superior performance to others: Results of an empirical investigation and Monte Carlo simulations. Biometrical Journal, 51(1), 171–184. doi:10.1002/bimj.200810488
  • Austin, P. C. (2011). An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46(3), 399–424. doi:10.1080/00273171.2011.568786
  • Austin, P. C. (2014). A comparison of 12 algorithms for matching on the propensity score. Statistics in Medicine, 33(6), 1057–1069. doi:10.1002/sim.6004
  • Aydin, B., Leite, W. L., & Algina, J. (2015). The effects of including observed means or latent means as covariates in multilevel models for cluster randomized trials. Educational and Psychological Measurement, 76(5), 803–823. doi:10.1177/0013164415618705
  • Bang, H., & Robins, J. M. (2005). Doubly robust estimation in missing data and causal inference models. Biometrics, 61(4), 962–973. doi:10.1111/j.1541-0420.2005.00377.x
  • Beauducel, A., & Herzberg, P. Y. (2006). On the performance of maximum likelihood versus means and variance adjusted weighted least squares estimation in CFA. Structural Equation Modeling: A Multidisciplinary Journal, 13(2), 186–203. doi:10.1207/s15328007sem1302_2
  • Bettini, E., Jones, N. D., Brownell, M. T., Conroy, M., & Leite, W. L. (2018). Relationships between novices’ social resources and workload manageability. Journal of Special Education, 52, 113–126.
  • Bettini, E., Jones, N. D., Brownell, M. T., Conroy, M., Park, Y., Leite, W. L., … Benedict, A. E. (2017). Workload manageability among novice special and general educators: Relationships with emotional exhaustion and career intentions. Remedial and Special Education, 38, 246–256. doi:10.1177/0741932517708327
  • Billingsley, B. S., Griffin, C. C., Smith, S. J., Kamman, M., & Israel, M. (2009). A review of teacher induction in special education: Research, practice, and technology solutions. Retrieved from http://www.ncipp.org
  • Bishop, C. D., Leite, W. L., & Snyder, P. (2018). Using propensity score weighting to reduce selection bias in large-scale data sets. Journal of Early Intervention. Advanced online publication. doi:10.1177/1053815118793430
  • Borman, G. D., & Dowling, N. M. (2008). Teacher attrition and retention: A meta-analytic and narrative review of the research. Review of Educational Research, 78, 367–409. doi:10.3102/0034654308321455
  • Brookhart, M. A., Schneeweiss, S., Rothman, K. J., Glynn, R. J., Avorn, J., & Sturmer, T. (2006). Variable selection for propensity score models. American Journal of Epidemiology, 163(12), 1149–1156. doi:10.1093/aje/kwj149
  • Carnegie, N. B., Harada, M., & Hill, J. L. (2016). Assessing sensitivity to unmeasured confounding using a simulated potential confounder. Journal of Research on Educational Effectiveness, 1–26. doi:10.1080/19345747.2015.1078862
  • Cepeda, M. S., Boston, R., Farrar, J. T., & Strom, B. L. (2003). Optimal matching with a variable number of controls vs. a fixed number of controls for a cohort study: Trade-offs. Journal of Clinical Epidemiology, 56, 230–237.
  • Cochran, W. (1968). The effectiveness of adjustment by subclassification in removing bias in observational studies. Biometrics, 24, 295–313. doi:10.2307/2528036
  • Collier, Z. K., & Leite, W. L. (2017, April). Teaching neural networks to estimate propensity scores. Paper presented at the annual meeting of the American Educational Research Association, San Antonio, TX.
  • Conley, S., & You, S. (2017). Key influences on special education teachers’ intentions to leave: The effect of administrative support and teacher team efficacy in a mediational model. Educational Management Administration and Leadership, 45, 521–540. doi:10.1177/1741143215608859
  • Cornfield, J., Haenszel, W., Hammond, E., Lilienfeld, A., Shimkin, M., & Wynder, E. (1959). Smoking and lung cancer: Recent evidence and a discussion of some questions. Journal of the National Cancer Institute, 22, 173–203.
  • Cuong, N. V. (2013). Which covariates should be controlled in propensity score matching? Evidence from a simulation study. Statistica Neerlandica, 67(2), 169–180. doi:10.1111/stan.12000
  • Diamond, A., & Sekhon, J. S. (2013). Genetic matching for estimating causal effects: A general multivariate matching method for achieving balance in observational studies. The Review of Economics and Statistics, 95, 932–945. doi:10.1162/REST_a_00318
  • Dorie, V., Harada, M., Carnegie, N. B., & Hill, J. (2016). A flexible, interpretable framework for assessing sensitivity to unmeasured confounding. Statistics in Medicine, 35(20), 3453–3470. doi:10.1002/sim.6973
  • Dugoff, E. H., Schuler, M., & Stuart, E. A. (2014). Generalizing observational study results: Applying propensity score methods to complex surveys. Health Services Research, 49(1), 284–303. doi:10.1111/1475-6773.12090
  • Enders, C. K. (2001). A primer on maximum likelihood algorithms available for use with missing data. Structural Equation Modeling, 8(1), 128–141. doi:10.1207/S15328007SEM0801_7
  • Enders, C. K., & Bandalos, D. L. (2001). The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Structural Equation Modeling, 8, 430–457. doi:10.1207/S15328007SEM0803_5
  • Funk, M. J., Westreich, D., Wiesen, C., Sturmer, T., Brookhart, M. A., & Davidian, M. (2011). Doubly robust estimation of causal effects. American Journal of Epidemiology, 173(7), 761–767. doi:10.1093/aje/kwq439
  • Gu, X. S., & Rosenbaum, P. R. (1993). Comparison of multivariate matching methods: Structures, distances, and algorithms. Journal of Computational and Graphical Statistics, 2, 405–420.
  • Hahs-Vaughn, D. L., & Onwuegbuzie, A. J. (2006). Estimating and using propensity score analysis with complex samples. The Journal of Experimental Education, 75, 31–65. doi:10.3200/JEXE.75.1.31-65
  • Harring, J. R., McNeish, D. M., & Hancock, G. R. (2017). Using phantom variables in structural equation modeling to assess model sensitivity to external misspecification. Psychological Methods, 22(4), 616–631. doi:10.1037/met0000103
  • Heckman, J. J. (2005). The scientific model of causality. Sociological Methodology, 35, 1–97. doi:10.1111/j.0081-1750.2006.00164.x
  • Heeringa, S. G., West, B. T., & Berglund, P. A. (2010). Applied survey data analysis. Boca Raton, FL: CRC Press.
  • Hill, J. (2004). Reducing bias in treatment effect estimation in observational studies suffering from missing data. ISERP working papers. Columbia University Academic Commons, Institute for Social and Economic Research and Policy, Columbia University. doi:10.7916/D8B85G11
  • Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2007). Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Analysis, 15(3), 199–236. doi:10.1093/pan/mpl013
  • Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945–960. doi:10.1080/01621459.1986.10478354
  • Hong, G. (2010). Marginal mean weighting through stratification: Adjustment for selection bias in multilevel data. Journal of Educational and Behavioral Statistics, 35(5), 499–531. doi:10.3102/1076998609359785
  • Hong, G. (2012). Marginal mean weighting through stratification: A generalized method for evaluating multivalued and multiple treatments with nonexperimental data. Psychological Methods, 17(1), 44–60. doi:10.1037/a0024918
  • Hong, G., & Raudenbush, S. W. (2005). Effects of kindergarten retention policy on children’s cognitive growth in reading and mathematics. Educational Evaluation and Policy Analysis, 27(3), 205–224. doi:10.3102/01623737027003205
  • Hong, G., & Raudenbush, S. W. (2006). Evaluating kindergarten retention policy: A case study of causal inference for multilevel observational data. Journal of the American Statistical Association, 101(475), 901–910. doi:10.1198/016214506000000447
  • Hoshino, T., Kurata, H., & Shigemasu, K. (2006). A propensity score adjustment for multiple group structural equation modeling. Psychometrika, 71, 691–712. doi:10.1007/S11336-005-1370-2
  • Hui, J., & Rubin, D. B. (2008). Principal stratification for causal inference with extended partial compliance. Journal of the American Statistical Association, 103(481), 101–111. doi:10.1198/016214507000000347
  • Imbens, G. W. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review. Review of Economics & Statistics, 86(1), 4–29. doi:10.1162/003465304323023651
  • Imbens, G. W., & Wooldridge, J. M. (2009). Recent developments in the econometrics of program evaluation. Journal of Economic Literature, 47(1), 5–86. doi:10.1257/jel.47.1.5
  • Jo, B., & Stuart, E. A. (2009). On the use of propensity scores in principal causal effect estimation. Statistics in Medicine, 28(23), 2857–2875. doi:10.1002/sim.3669
  • Jo, B., Stuart, E. A., MacKinnon, D. P., & Vinokur, A. D. (2011). The use of propensity scores in mediation analysis. Multivariate Behavioral Research, 46(3), 425–452. doi:10.1080/00273171.2011.576624
  • Jo, B., & Vinokur, A. D. (2011). Sensitivity analysis and bounding of causal effects with alternative identifying assumptions. Journal of Educational and Behavioral Statistics. doi:10.3102/1076998610383985
  • Jöreskog, K. G. (1971). Statistical analysis of sets of congeneric tests. Psychometrika, 36(2), 109–133. doi:10.1007/BF02291393
  • Kang, J. D. Y., & Schafer, J. L. (2007). Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science, 22(4), 523–539. doi:10.1214/07-sts227
  • Kaplan, D. (1999). An extension of the propensity score adjustment method for the analysis of group differences in MIMIC models. Multivariate Behavioral Research, 34(4), 467–492. doi:10.1207/S15327906MBR3404_4
  • Kelcey, B. M. (2011). Assessing the effects of teachers’ reading knowledge on students’ achievement using multilevel propensity score stratification. Educational Evaluation and Policy Analysis, 33(4), 458–482. doi:10.3102/0162373711415262
  • Kish, L. (1965). Survey sampling. New York, NY: Wiley.
  • Lanza, S. T., Coffman, D. L., & Xu, S. (2013). Causal inference in latent class analysis. Structural Equation Modeling: A Multidisciplinary Journal, 20(3), 361–383. doi:10.1080/10705511.2013.797816
  • Lee, B. K., Lessler, J., & Stuart, E. A. (2011). Weight trimming and propensity score weighting. PLoS ONE, 6(3), 1–6.
  • Leite, W. L. (2015). Latent growth modeling of longitudinal data with propensity score matched groups. In W. Pan & H. Bai (Eds.), Propensity score analysis: Fundamentals, developments, and extensions (pp. 191–216). New York, NY: Guilford.
  • Leite, W. L. (2017). Practical propensity score methods using R. Thousand Oaks, CA: Sage Publishing.
  • Leite, W. L., & Aydin, B. (2016, April). A comparison of methods for imputation of missing covariate data prior to propensity score analysis. Paper presented at the American Educational Research Association Conference, Washington, DC.
  • Leite, W. L., Jimenez, F., Kaya, Y., Stapleton, L. M., MacInnes, J. W., & Sandbach, R. (2015). An evaluation of weighting methods based on propensity scores to reduce selection bias in multilevel observational studies. Multivariate Behavioral Research, 50(3), 265–284. doi:10.1080/00273171.2014.991018
  • Li, L., Shen, C., Wu, A. C., & Li, X. (2011). Propensity score-based sensitivity analysis method for uncontrolled confounding. American Journal of Epidemiology. doi:10.1093/aje/kwr096
  • Lohr, S. (1999). Sampling: Design and analysis. Pacific Grove, CA: Duxbury Press.
  • Lunceford, J. K., & Davidian, M. (2004). Stratification and weighting via the propensity score in estimation of causal treatment effects: A comparative study. Statistics in Medicine, 23, 2937–2960. doi:10.1002/sim.1903
  • McCaffrey, D. F., Griffin, B. A., Almirall, D., Slaughter, M. E., Ramchand, R., & Burgette, L. F. (2013). A tutorial on propensity score estimation for multiple treatments using generalized boosted models. Statistics in Medicine, 32(19), 3388–3414. doi:10.1002/sim.5753
  • McCaffrey, D. F., Ridgeway, G., & Morral, A. R. (2004). Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychological Methods, 9, 403–425. doi:10.1037/1082-989X.9.4.403
  • McNeish, D., Stapleton, L. M., & Silverman, R. D. (2017). On the unnecessary ubiquity of hierarchical linear modeling. Psychological Methods, 22, 114–140. doi:10.1037/met0000078
  • Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525–543. doi:10.1007/BF02294825
  • Millsap, R. E., & Olivera-Aguilar, M. (2012). Investigating measurement invariance using confirmatory factor analysis. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 380–392). New York, NY: Guilford Press.
  • Mitra, R., & Reiter, J. P. (2012). A comparison of two methods of estimating propensity scores after multiple imputation. Statistical Methods in Medical Research. doi:10.1177/0962280212445945
  • Muthén & Muthén. (2017). Mplus (Version 8.0). Los Angeles, CA: Author.
  • Muthén, B., & Asparouhov, T. (2014). Causal effects in mediation modeling: an introduction with applications to latent variables. Structural Equation Modeling, 22(1), 12–23. doi:10.1080/10705511.2014.935843
  • Muthén, L. K., & Muthén, B. O. (2017). Mplus user’s guide (8th ed.). Los Angeles, CA: Author.
  • Puma, M. J., Olsen, R. B., Bell, S. H., & Price, C. (2009). What to do when data are missing in group randomized controlled trials. Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
  • Raykov, T. (2012). Propensity score analysis with fallible covariates: A note on a latent variable modeling approach. Educational and Psychological Measurement, 72, 715–733. doi:10.1177/0013164412440999
  • Ridgeway, G., Kovalchik, S. A., Griffin, B. A., & Kabeto, M. U. (2015). Propensity score analysis with survey weighted data. Journal of Causal Inference, 3(2). doi:10.1515/jci-2014-0039
  • Rodríguez de Gil, P., Bellara, A. P., Lanehart, R. E., Lee, R. S., Kim, E. S., & Kromrey, J. D. (2015). How do propensity score methods measure up in the presence of measurement error? A monte carlo study. Multivariate Behavioral Research, 50(5), 520–532. doi:10.1080/00273171.2015.1022643
  • Ronfeldt, M., Loeb, S., & Wyckoff, J. (2013). How teacher turnover harms student achievement. American Educational Research Journal, 50, 4–36. doi:10.3102/0002831212463813
  • Rosenbaum, P. R. (2002). Observational studies. New York, NY: Springer.
  • Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55. doi:10.1093/biomet/70.1.41
  • Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. doi:10.18637/jss.v048.i02
  • Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66, 688–701. doi:10.1037/h0037350
  • Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: Wiley.
  • Rubin, D. B. (2005). Causal inference using potential outcomes: design, modeling, decisions. Journal of the American Statistical Association, 100(469), 322–331. doi:10.1198/016214504000001880
  • Rubin, D. B. (2007). The design versus the analysis of observational studies for causal effects: Parallels with the design of randomized trials. Statistics in Medicine, 26(1), 20–36. doi:10.1002/sim.2739
  • Rubin, D. B. (2008). For objective causal inference, design trumps analysis. The Annals of Applied Statistics, 2, 808–840. doi:10.1214/08-AOAS187
  • Rust, K. F., & Rao, J. N. K. (1996). Variance estimation for complex surveys using replication techniques. Statistical Methods in Medical Research, 5, 283–310. doi:10.1177/096228029600500305
  • Schafer, J. L., & Kang, J. (2008). Average causal effects from nonrandomized studies: A practical guide and simulated example. Psychological Methods, 13(4), 279–313. doi:10.1037/a0014268
  • Shadish, W. R. (2010). Campbell and Rubin: A primer and comparison of their approaches to causal inference in field settings. Psychological Methods, 15(1), 3–17. doi:10.1037/a0015916
  • Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin Company.
  • Shen, C., Li, X., Li, L., & Were, M. C. (2011). Sensitivity analysis for causal inference using inverse probability weighting. Biometrical Journal, 53(5), 1–16. doi:10.1002/bimj.201100042
  • Stapleton, L. M. (2002). The incorporation of sample weights into multilevel structural equation models. Structural Equation Modeling, 9(4), 475–503. doi:10.1207/S15328007SEM0904_2
  • Stapleton, L. M. (2006). An assessment of practical solutions for structural equation modeling with complex sample data. Structural Equation Modeling, 13(1), 28–58. doi:10.1207/s15328007sem1301_2
  • Stapleton, L. M. (2008). Variance estimation using replication methods in structural equation modeling with complex sample data. Structural Equation Modeling, 15(2), 183–210. doi:10.1080/10705510801922316
  • Steiner, P. M., Cook, T. D., & Shadish, W. R. (2011). On the importance of reliable covariate measurement in selection bias adjustments using propensity scores. Journal of Educational and Behavioral Statistics, 36(2), 213–236. doi:10.3102/1076998610375835
  • Steiner, P. M., Cook, T. D., Shadish, W. R., & Clark, M. H. (2010). The importance of covariate selection in controlling for selection bias in observational studies. Psychological Methods, 15(3), 250–267. doi:10.1037/a0018719
  • Sterba, S. K. (2009). Alternative model-based and design-based frameworks for inference from samples to populations: From polarization to integration. Multivariate Behavioral Research, 44(6), 711–740. doi:10.1080/00273170903333574
  • Stuart, E. A., & Rubin, D. B. (2007). Best practices in quasi-experimental designs: Matching methods for causal inference. In J. Osborne (Ed.), Best practices in quantitative methods (pp. 155–176). Thousand Oaks, CA: Sage.
  • Stuart, E. A. (2010). Matching methods for causal inference: A review and a look forward. Statistical Science, 25(1), 1–21. doi:10.1214/09-sts313
  • Thoemmes, F. J., & Kim, E. S. (2011). A systematic review of propensity score methods in the social sciences. Multivariate Behavioral Research, 46(1), 90–118. doi:10.1080/00273171.2011.540475
  • Tourkin, S. C., Pugh, K. W., Fondelier, S. E., Parmer, R. J., Cole, C., Jackson, B., … Walter, E. (2004). 1999–2000 Schools and Staffing Survey (SASS) data file user’s manual (NCES 2004–303). (FORM SAS-4A, OMB No. 1850-0598). Washington, DC: U.S. Department of Education, National Center for Education Statistics.
  • U.S. Department of Education, Institute of Education Sciences, & What Works Clearinghouse. (2017). What works clearinghouse: Standards handbook (Version 4.0). Washington, DC. Retrieved from https://ies.ed.gov/ncee/wwc/Docs/referenceresources/wwc_standards_handbook_v4.pdf
  • Vanderweele, T. J. (2011). Principal stratification–Uses and limitations. The International Journal of Biostatistics, 7(1). doi:10.2202/1557-4679.1329
  • Wu, J.-Y., & Kwok, O.-M. (2012). Using SEM to analyze complex survey data: A comparison between design-based single-level and model-based multilevel approaches. Structural Equation Modeling, 19(1), 16–35. doi:10.1080/10705511.2012.634703
