ABSTRACT
There is currently a lack of instruments of sufficient technical quality that focus on the implementation of classroom formative assessment at a grain size appropriate for providing feedback to teachers and program developers. This paper details the development of a validity argument for the High-Impact Classroom Assessment Practices observation protocol (HI-CAP) and examines how to begin evaluating one inference in the interpretive argument. We present a conceptual framework for the HI-CAP and then articulate the interpretive argument. Finally, we present evidence to evaluate one part of this argument, the scoring inference, using independent ratings from pairs of observers across 65 ninth-grade ELA and mathematics lessons; the results provide modest evidence of appropriateness and consistency and preliminary support for the scoring model. We conclude with a discussion of the strengths and limitations of the current protocol and training procedures, along with implications for developing validity arguments for similar protocols.
Notes
1 Differences reported in Table 6 reflect the difference between the Developer Pilot and the Trainee Pilot. All differences are positive, indicating that agreement for the Developer Pilot was greater than agreement for the Trainee Pilot across all dimensions.