Original Articles

The Extent of Mismeasurement for Aberrant Examinees

Pages 42-68 | Published online: 26 Mar 2010
 

Abstract

The person-fit literature assumes that aberrant response patterns could be a sign of person mismeasurement, but this assumption has rarely, if ever, been empirically investigated. We explore the validity of test responses and measures of 10-year-old examinees whose response patterns on a commercial standardized paper-and-pencil mathematics test were flagged as aberrant. Validity evidence was collected through postexamination reflective interviews with 31 of the 80 pupils flagged as aberrant and their teachers, and through teacher assessment (TA) judgments for the whole examination cohort of 674 examinees. Analysis suggested that interview-adjusted scores were significantly better fitting than expected by chance, but that only some adjustments suggested serious mismeasurement. In addition, disagreement between TA and test scores was significantly greater for aberrant examinees, and partially predicted the interview adjustments. We conclude that person misfit statistics, when combined with TA, might be a useful antidote to mismeasurement, and we discuss the implications for assessment research and practice.

Notes

¹ In the context of this study, the term “misfit” refers to the degree to which a response pattern deviates from what is predicted by the measurement model, as indicated by fit statistics for persons or items. However, whenever a model is fitted to empirical data, some deviance from the model is expected, and it is only when the misfit becomes significantly larger than the average that one begins to regard it as a problem, that is, as “unexpectedly large” and hence suspicious. We therefore reserve the term “aberrance” for response patterns whose misfit exceeds some criterion involving a cut-off value. Thus, aberrance implies a cut-off criterion and is regarded as binary, whereas misfit takes into account information from the whole distribution of the fit statistic values (i.e., overfits, fits, and the highly misfitting patterns we call aberrant). To sum up, a response pattern is identified as aberrant if its misfit is large enough, and an aberrant examinee is defined as an individual who has provided an aberrant response pattern.

² The equations of the fit statistics presented in this section are the person-fit versions. To calculate the item-fit versions, instead of aggregating residuals across items for each person, we aggregate residuals for each item across persons.
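The distinction drawn in these two notes can be illustrated with a minimal sketch. It assumes a dichotomous Rasch model and the conventional Outfit (unweighted) and Infit (information-weighted) mean-square definitions; the equations the notes refer to are not reproduced on this page, so this is not necessarily the authors' exact formulation. The ability and difficulty values, the response matrix, the function names, and the cut-off of 1.3 are all hypothetical choices made for illustration.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def mean_squares(responses, probs):
    """Return (outfit, infit) mean squares for one set of residuals."""
    variances = [p * (1.0 - p) for p in probs]            # model variance of each response
    sq_resid = [(x - p) ** 2 for x, p in zip(responses, probs)]
    # Outfit: plain mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(sq_resid, variances)) / len(probs)
    # Infit: squared residuals weighted by their model variances.
    infit = sum(sq_resid) / sum(variances)
    return outfit, infit

def person_fit(x, thetas, bs):
    """Person-fit version: aggregate residuals across items for each person."""
    return [mean_squares(row, [rasch_prob(t, b) for b in bs])
            for row, t in zip(x, thetas)]

def item_fit(x, thetas, bs):
    """Item-fit version: aggregate residuals for each item across persons."""
    return [mean_squares([row[j] for row in x],
                         [rasch_prob(t, b) for t in thetas])
            for j, b in enumerate(bs)]

# Hypothetical data: 3 persons, 5 items ordered from easy to hard.
thetas = [-1.0, 0.0, 1.5]                 # assumed known person abilities
bs = [-1.5, -0.5, 0.0, 0.5, 1.5]          # assumed known item difficulties
x = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],                      # able pupil failing the easiest items
]

CUTOFF = 1.3  # hypothetical cut-off; the study's actual criterion is not given here

persons = person_fit(x, thetas, bs)
items = item_fit(x, thetas, bs)

# Misfit is continuous; aberrance is the binary flag obtained from the cut-off.
aberrant = [outfit > CUTOFF or infit > CUTOFF for outfit, infit in persons]
```

In practice abilities and difficulties would be estimated from the data rather than supplied, and the flagging rule would follow whatever criterion the analyst adopts. The sketch only shows that the person-fit and item-fit versions differ in the direction of aggregation, and that aberrance is a thresholded view of misfit.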

ᵃ Eighteen pupils belonged to both Infit and Outfit aberrant groups.

ᵃ Pupil has obtained zero score.

ᵇ Pupil has obtained perfect score.

ᵃ Denotes that values obtained after are smaller than values before.

ᵇ Denotes that values obtained after are larger than values before.

ᵃ Mathematics for Learning and Teaching (MaLT) 10 mark ranges and indicative National Curriculum sublevels according to MaLT developers: −9: working toward 3c; 10–14: 3c; 15–19: 3b; 20–24: 3a; 25–29: 4c; 30–34: 4b; 35–38: 4a; 39+: 4a or better.

Note. ᵃ Test level > Teacher assessment level.

ᵇ Test level < Teacher assessment level.

ᶜ Test level = Teacher assessment level.

ᵃ Teacher assessment.

ᵃ k = 45.
