Multilevel Rasch Modeling: Does Misfit to the Rasch Model Impact the Regression Model?

The Journal of Experimental Education, Vol 88, No 4

Abstract

Multilevel Rasch models are increasingly used to estimate the relationships between test scores and student and school factors. Response data were generated to follow one-, two-, and three-parameter logistic (1PL, 2PL, 3PL) models, but the Rasch model was used to estimate the latent regression parameters. When the response functions followed 2PL or 3PL models, the proportion of variance explained in test scores by the simulated student or school predictors was estimated accurately with a Rasch model. Proportion of variance within and between schools was also estimated accurately. The regression coefficients were misestimated unless they were rescaled out of logit units. However, item-level parameters, such as DIF effects, were biased when the Rasch model was violated, as is the case with single-level models.

Notes

1 Although all of the mathematical properties of the Rasch model hold for the 1PL, the Rasch model did not develop out of the same tradition as IRT models (see Wright & Stone, 1979, for an explication of Rasch’s philosophy). Additionally, the 1PL model is often identified by constraining the variance of the θs, which necessitates adding an a-parameter common to all items. With this parameterization, the units are no longer logits; for example, if the a-parameter = 0.8, each scale unit equals 0.8 logits.
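The unit conversion described in Note 1 can be illustrated with a minimal sketch. It assumes the common parameterization P = 1 / (1 + exp(−a(θ − b))); the function names and the numeric values of θ and b are hypothetical and chosen only for illustration.

```python
import math

def p_1pl(theta, b, a=1.0):
    """Probability of a correct response under a 1PL model with common slope a."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def to_logits(value, a):
    """Convert a location on the variance-constrained scale to logit units."""
    return a * value

# Hypothetical examinee and item on the scale where the common a-parameter is 0.8.
a = 0.8
theta_std, b_std = 1.0, 0.25

# The same examinee and item expressed in logits (Rasch scale, a = 1).
theta_logit = to_logits(theta_std, a)  # 0.8
b_logit = to_logits(b_std, a)          # 0.2

p_constrained = p_1pl(theta_std, b_std, a=a)
p_rasch = p_1pl(theta_logit, b_logit)  # a defaults to 1
```

Both calls describe the same response probability; only the unit of the latent scale differs, which is why one scale unit corresponds to 0.8 logits in this example.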

2 These interpretations of the covariates presuppose no collinearity among the predictors and no omitted covariates. If, for example, X_j and X_k are correlated, β₁ indicates the expected change in η_ijk for each unit change in the examinee predictor holding the school predictor constant, and β₃ indicates the expected change in η_ijk for each unit change in the school predictor given a constant distribution of X_j within schools. Similarly, if there were a contextual (compositional) effect of the school mean X_j that was omitted from the model, and X_j were not school-mean centered, β₁ would be a mixture of the expected change within schools and the contextual effect.

3 Similarly, even if the item difficulties are estimated in the studied sample, one could estimate the item difficulties by conditional maximum likelihood and then replace the ε_i with the fixed estimates (Christensen, Bjorner, Kreiner, & Petersen, 2004; Zwinderman, 1991, pp. 593, 598). With very short tests, estimating the item parameters separately from the regression parameters may lead to underestimates of the standard errors of the regression parameters, but this problem decreases with increasing test length (Christensen et al., 2004).

4 An additional complication is that one must control for overall group-mean differences, often labeled impact in the DIF literature. If one were confident that the reference item contains no DIF, then the coefficient for the DIF characteristic in the equation for β_0j represents impact. Alternatively, if no item is designated as the reference item (γ₀₀ is omitted and there are β_0j . . . β_Ij instead of β_(I−1)j), then the coefficient for the DIF characteristic in the equation for β_0j represents impact if the DIF balances to zero across items. See Cheong and Kamata (2013) for further discussion.

5 Hedges and Hedberg summarized data from standardized tests of mathematics and reading, grades K–12. The average ICC, before adding covariates to the model, was about .22 for nationally representative samples but smaller when limited to low-socioeconomic or low-achievement schools. Grade 3 reading had the highest (without covariates) ICC of .27.

6 An a-parameter of 1 in the logistic metric is equivalent to an a-parameter of 0.588 in the normal (probit) metric.
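The equivalence in Note 6 can be checked numerically. The sketch below assumes that 0.588 comes from dividing by the conventional scaling constant D = 1.7 (1/1.7 ≈ 0.588); the function names, the item difficulty of 0, and the θ grid are hypothetical choices for illustration.

```python
import math

D = 1.7  # conventional scaling constant (assumption: 0.588 in the note is 1/1.7)

def logistic_irf(theta, a, b):
    """Item response function in the logistic metric."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def normal_ogive_irf(theta, a, b):
    """Item response function in the normal (probit) metric."""
    return 0.5 * (1.0 + math.erf(a * (theta - b) / math.sqrt(2.0)))

def max_gap(a_logistic=1.0, b=0.0, lo=-4.0, hi=4.0, n=801):
    """Largest absolute difference between the two curves on an even theta grid."""
    a_normal = a_logistic / D  # 1 / 1.7 is about 0.588
    step = (hi - lo) / (n - 1)
    gaps = []
    for k in range(n):
        theta = lo + k * step
        gaps.append(abs(logistic_irf(theta, a_logistic, b)
                        - normal_ogive_irf(theta, a_normal, b)))
    return max(gaps)

gap = max_gap()
```

On this grid the two curves stay within roughly 0.01 of each other, which is the sense in which the two a-parameters are treated as equivalent: the match is a close approximation, not an identity.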

7 The term explained variance is not intended to imply causality but is simply less awkward than variance accounted for.

8 For the 3PL items, these b-differences corresponded to average log-odds differences of 0.43, 0.35, and 0.26 for the easy, middle, and hard items, respectively. The averages were calculated using 50 quadrature points evenly spaced between −4 and 4, with the difference at each quadrature point weighted by the total (focal + reference) population density at that point.
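The averaging procedure in Note 8 can be sketched as follows. The quadrature grid (50 evenly spaced points from −4 to 4) and the weighting by the summed focal and reference densities follow the note; everything else is an assumption made for illustration, including the normal ability distributions, their means, and the a-, b-, and c-values, none of which are given in this section.

```python
import math

def p3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def log_odds(p):
    return math.log(p / (1.0 - p))

def normal_pdf(x, mean=0.0, sd=1.0):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def avg_log_odds_diff(a, b_ref, b_shift, c, ref_mean=0.0, focal_mean=0.0,
                      n_points=50, lo=-4.0, hi=4.0):
    """Density-weighted average log-odds difference between reference and focal groups.

    The focal group's difficulty is b_ref + b_shift (hypothetical DIF size).
    Each quadrature point is weighted by the focal + reference density there.
    """
    step = (hi - lo) / (n_points - 1)
    num = 0.0
    den = 0.0
    for k in range(n_points):
        theta = lo + k * step
        weight = normal_pdf(theta, ref_mean) + normal_pdf(theta, focal_mean)
        diff = (log_odds(p3pl(theta, a, b_ref, c))
                - log_odds(p3pl(theta, a, b_ref + b_shift, c)))
        num += weight * diff
        den += weight
    return num / den

# Hypothetical items: same a, c, and b-shift, differing only in base difficulty.
easy = avg_log_odds_diff(a=1.0, b_ref=-1.5, b_shift=0.5, c=0.2)
hard = avg_log_odds_diff(a=1.0, b_ref=1.5, b_shift=0.5, c=0.2)
no_guessing = avg_log_odds_diff(a=1.0, b_ref=0.0, b_shift=0.5, c=0.0)
```

With c = 0 the log-odds difference equals a times the b-shift at every θ, so the average is exactly that product. With a nonzero lower asymptote the same b-shift yields a smaller average difference, and it shrinks further for harder items, which matches the direction of the easy, middle, and hard values reported in the note.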
