Research Articles

Benchmark rating procedure, best of both worlds? Comparing procedures to rate text quality in a reliable and valid manner

Pages 302-319 | Received 30 Mar 2021, Accepted 24 Jul 2023, Published online: 11 Aug 2023

Figures & data

Table 1. Means, SD and Cronbach’s Alpha Reliability for Holistic, Benchmark, and Analytic Ratings.

Table 2. Average Reliability Coefficients for One, Two and Three Raters.

Table 3. Correlations between Holistic, Benchmark, and Analytic Ratings.

Table 4. Variance Components as Proportions of the Total Variance for each Rating Procedure.

Figure 1. Estimated generalisability of writing scores for one task (solid lines) and four tasks (dashed lines) as a function of the number of raters. Line labels indicate the rating procedure used by the raters: b = benchmark, a = analytic, h = holistic.


Table 5. Means, SD and Reliability by Benchmark Condition and Text Sample.

Supplemental material
