Research Article

Leveraging Item Parameter Drift to Assess Transfer Effects in Vocabulary Learning
