
An Application of Generalizability Theory to Evaluate the Technical Quality of an Alternate Assessment

Pages 279-297 | Published online: 27 Sep 2013
 

Abstract

Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how the technical quality of such tests should be established. Documenting the reliability of scores from alternate assessments is known to be challenging: typical measures of reliability do little to model the multiple sources of error that characterize these assessments. Generalizability theory (G-theory), in contrast, allows researchers to identify sources of error and to analyze the relative contribution of each. This study demonstrates an application of G-theory to examining the reliability of an alternate assessment. A G-study with the facets rater type, assessment attempt, and task was conducted to determine the relative contribution of each facet to observed score variance, and the results were used to estimate the reliability of scores. The assessment design was then modified to examine how changes might affect reliability. As a final step, designs deemed satisfactory were evaluated for the feasibility of adapting them into a statewide standardized assessment and accountability program.
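As a minimal illustration of the kind of analysis the abstract describes (a sketch, not the authors' code or data), the example below estimates variance components for a simpler single-facet crossed design, persons × raters, via the ANOVA expected-mean-square equations, then computes relative G coefficients for D-studies with different numbers of raters. All data and sample sizes here are hypothetical.

```python
import numpy as np

def g_study_p_x_r(scores):
    """Estimate variance components for a fully crossed
    persons x raters G-study with one observation per cell.

    scores: 2-D array, rows = persons, columns = raters.
    Returns (var_person, var_rater, var_residual).
    """
    n_p, n_r = scores.shape
    grand = scores.mean()

    # Sums of squares from the two-way ANOVA decomposition.
    ss_p = n_r * ((scores.mean(axis=1) - grand) ** 2).sum()
    ss_r = n_p * ((scores.mean(axis=0) - grand) ** 2).sum()
    ss_total = ((scores - grand) ** 2).sum()
    ss_pr = ss_total - ss_p - ss_r  # person x rater interaction + error

    # Mean squares.
    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

    # Solve the expected-mean-square equations:
    #   E[MS_p]  = sigma2_pr + n_r * sigma2_p
    #   E[MS_r]  = sigma2_pr + n_p * sigma2_r
    #   E[MS_pr] = sigma2_pr
    var_pr = ms_pr
    var_p = max((ms_p - ms_pr) / n_r, 0.0)
    var_r = max((ms_r - ms_pr) / n_p, 0.0)
    return var_p, var_r, var_pr

def g_coefficient(var_p, var_pr, n_raters):
    """Relative G coefficient for a D-study averaging over n_raters."""
    return var_p / (var_p + var_pr / n_raters)

# Hypothetical data: 6 students each scored by 3 raters.
rng = np.random.default_rng(0)
scores = rng.normal(loc=3.0, scale=1.0, size=(6, 3))

var_p, var_r, var_pr = g_study_p_x_r(scores)
for n in (1, 2, 3):
    print(f"raters = {n}: G = {g_coefficient(var_p, var_pr, n):.3f}")
```

Increasing the number of raters in the D-study shrinks the relative error term σ²_pr / n_r, which is the mechanism behind the design modifications the abstract describes; the article's own design extends this logic to three facets (rater type, task, attempt).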

Notes

1. There are limitations to treating tasks in this manner. First, the results do not support interpretations of variability due to differences in task difficulty, because task differences cannot be disentangled from task-by-student interactions in this design. In addition, the task variability that is confounded with the interaction may be influenced by the fact that task is not a truly nested facet; it might be slightly underestimated compared with a design in which each student was administered a wholly unique pair of tasks.

2. An interaction between rater type and student indicates that students are rank-ordered differently by the two types of raters. Put another way, how the ratings of the two types of raters differ depends on the student.

3. These percentages are based on the total variance that scores would have if each student's score came from a single rater type, a single task, and a single attempt. This should not be confused with the observed total variance, in which students' scores were averaged across two rater types, two tasks, and three attempts.
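To make the distinction in note 3 concrete, the sketch below (with invented variance components, not the study's estimates) compares the total variance of a single-observation score with that of a score averaged across two rater types, two tasks, and three attempts. Following standard D-study logic, each component is divided by the product of the D-study sample sizes of the facets it involves; interactions not involving students are omitted for brevity.

```python
# Hypothetical variance components (illustrative values only).
components = {
    "student":              1.00,   # universe-score variance
    "rater_type":           0.05,
    "task":                 0.20,
    "attempt":              0.02,
    "student_x_rater_type": 0.10,
    "student_x_task":       0.30,
    "student_x_attempt":    0.05,
    "residual":             0.40,
}

# Divisor for each component: the product of the D-study sample
# sizes of the facets it involves (students are the objects of
# measurement and are never averaged over).
divisors = {
    "student":              1,
    "rater_type":           2,
    "task":                 2,
    "attempt":              3,
    "student_x_rater_type": 2,
    "student_x_task":       2,
    "student_x_attempt":    3,
    "residual":             2 * 2 * 3,  # averaged over all three facets
}

single_total = sum(components.values())
averaged_total = sum(v / divisors[k] for k, v in components.items())

for name, v in components.items():
    print(f"{name:22s} {v / single_total:6.1%} of single-score variance")
print(f"total (single rater type, task, attempt):     {single_total:.3f}")
print(f"total (2 rater types x 2 tasks x 3 attempts): {averaged_total:.3f}")
```

The percentages reported in note 3 are taken against the first total; dividing the same components by the D-study sample sizes yields the smaller observed total for the averaged design.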

