INVESTIGATIONS

The Impact of Repeated Exposure to Items

Pages 404-409 | Published online: 27 Oct 2015
 

Abstract

Theory: When test developers have a limited number of test questions available, or when the equating design requires some item overlap across forms, psychometricians worry that examinees who encounter previously seen questions on subsequent test forms may be able to inflate their test scores because of their familiarity with the repeated questions.
Hypotheses: Prior exposure to test questions may lead to contamination and inflated scores. This research seeks to determine whether examinees' scores were inflated by prior exposure to test questions and, if so, whether those increases were significant.
Method: The sample for this study consisted of candidates who took the American Board of Family Medicine's certification examination twice in a single year (n = 988). Examinees were randomly assigned one of two forms for their first attempt and received the other form for their repeat test. There were 99 questions in common across both forms. The Rasch model was used to estimate examinee ability. Performance changes on the common questions and unique questions were compared, and repeated measures t tests were performed to establish whether score changes were likely to have occurred by chance.
Results: On average, the examinees increased their overall ability estimate by .187 logits on the repeat attempt. The repeated measures t test indicated that this difference was statistically significant, t(987) = −25.298, p < .001, α = .05. The mean difference between the examinees' ability estimates on common and unique items for their first attempt was not statistically significant, t(987) = .264, p = .792, α = .05; however, the mean difference between common and unique items on the second attempt (0.029 logits) was statistically significant, t(987) = 3.28, p = .001, α = .05.
Conclusions: Some of the increase in the examinees' overall ability estimate may be attributed to a general increase in the latent trait; however, there was a small but detectable increase that could be attributed to prior exposure to the questions. On average, about 15% of the repeated questions were changed from wrong to right, but about 11% of questions were changed from right to wrong, suggesting that examinees may occasionally be using prior exposure to their benefit but general guessing accounts for more of the changes. The impact of the mean difference between the common and unique item scores (0.029 logits) is trivial at the individual level; however, such a bias among the population of repeat testers could be problematic if a small subset of examinees were using a “remember–research–retest” strategy to obtain nontrivial score increases.
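
As an illustration of the repeated measures t test named in the Method, the following is a minimal sketch in Python, not the authors' analysis code. The ability estimates are simulated: only the sample size (988) and the mean gain (.187 logits) come from the abstract, while the spread of the gains (0.25 logits), the function name `paired_t`, and the normal approximation used for the p value are assumptions made for the example. The statistic is computed on first-attempt minus second-attempt differences, which is why a gain on the repeat attempt yields a negative t, as in the reported t(987) = −25.298.

```python
import math
import random
import statistics


def paired_t(first, second):
    """Repeated measures (paired) t statistic for the differences first - second.

    Returns (t, degrees_of_freedom, approximate_two_sided_p).
    The p value uses a normal approximation to the t distribution,
    which is an assumption that is only reasonable for large samples.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    # Standard error of the mean difference (sample SD with n - 1 denominator).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    t = mean_diff / std_err
    p = 2 * (1 - statistics.NormalDist().cdf(abs(t)))
    return t, n - 1, p


# Simulated logit ability estimates (hypothetical data, not the study's).
random.seed(0)
n_examinees = 988
first_attempt = [random.gauss(0.0, 1.0) for _ in range(n_examinees)]
# Assumed: each examinee gains about 0.187 logits, with SD 0.25 around that gain.
second_attempt = [x + random.gauss(0.187, 0.25) for x in first_attempt]

t_stat, df, p_value = paired_t(first_attempt, second_attempt)
print(f"t({df}) = {t_stat:.3f}, approximate p = {p_value:.3g}")
```

The same function could be applied to each examinee's common-item and unique-item ability estimates within a single attempt to mirror the second comparison described in the Results.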
