
Correction

This article refers to:
The whole is more than the sum of its parts – assessing writing using the consensual assessment technique

Article title: The whole is more than the sum of its parts – assessing writing using the consensual assessment technique

Authors: Daniela Zahn, Ursula Canton, Victoria Boyd, Laura Hamilton, Josianne Mamo, Jane McKay, Linda Proudfoot, Dickson Telfer, Kim Williams, and Colin Wilson

Journal: Studies in Higher Education

DOI: https://doi.org/10.1080/03075079.2019.1711044

The article was originally published with an incorrect ICC formula and related figures and text. The correct versions are given below.

  1. Equation (2) on page 7 should read as ICC = (MSR - MSE)/MSR (2); an illustrative computational sketch of this formula is given after this list.

    where MSR = SSRows / dfRows and MSE = SSE / dfE.

    SSRows is the sum of squares of the between-text variance. SSE, the sum of squares of error, consists of two elements: the between-judge variability, i.e. how much a single rater’s mean deviates from the mean across all raters, and the within-judge variability, which is calculated from how much raters’ individual ratings of a text deviate from the mean rating of this text across all raters (Field, 2005). The ICC was calculated using average measures across all six items, because they better represent the way in which all six aspects contribute to the readers' overall impression of the text.

  2. The first line of the Results section on page 7 should read: Overall, the average measure ICC was .808 with a 95% confidence interval from .675 to .900 (F(26, 182) = 5.199, p < .001).

  3. Table 3 is removed from the article, and the second, third and fourth paragraphs are replaced with the following text: Analyses estimating the ICC for single-score consistency were run to investigate the level of consensus for single items. The results indicated a level of agreement that is below the threshold of moderate or good agreement.

  4. The fourth line of the Replication section on page 8 should read as below:

    With the following results: .810 with a 95% confidence interval from .680 to .901 (F(26, 208) = 5.253, p < .001).

  5. Tables 4, 5, 6 and 7 are removed from the article, and the second paragraph is replaced with the following text: Again, individual results by item indicated a level of agreement that is below the threshold of moderate or good agreement.

  6. The first paragraph of the Discussion section on page 9 should read as below:

    The first question we asked in this study was whether consensual assessment can be adapted to quantify tacit agreement on successful writing as a complex, socially and culturally bound concept. Overall, the nine independent, expert raters in the first study showed moderate to good consensus in their evaluation of the 27 texts analysed. Moreover, this result was replicated in the second study. This suggests that the answer to the first research question is ‘yes’: the shared tacit knowledge that allows members of a discourse community to judge how successfully a writer communicates with them is sufficiently consistent to support an operational definition and can be quantified. Importantly, the quantification relies on averaging across all six items to create an overall evaluation of the text that combines the individual aspects captured in each item.

    This consensus across all aspects of the text is, however, not reflected in the results for the individual questions. The higher level of variation at the level of the individual question could result mainly from two general sources. The first is the instrument itself: the pre-test showed that the single questions tend to be interpreted along the intended lines, but there is still room for individual differences in how the raters interpreted each question. The second general source of error could stem from individual differences between raters. Although the instrument is easy and intuitive to use, raters, especially those trained in rubric-based assessment, might find it (cognitively and epistemologically) challenging to switch from the supposed objectivity of well-defined rubrics to trusting their intuitive expertise as readers (see below). Further rater-related error could stem from rater-specific factors, such as the length or type of previous experience, but these cannot be explored further with our data. Future studies could address this by capturing more rater-specific data.

    Taken together, the studies thus offer proof of concept for the use of consensus-based assessment in the field of academic writing. They also offer quantitative confirmation of the insights that other Academic Literacies researchers have established through qualitative methods.

  7. The second paragraph under the Discussion section on page 9 is removed from the article, and the third paragraph should read as below:

    The successful replication of the first study suggests that the answer to the second research question, whether a suitable measurement tool can reliably capture such tacit agreement, is also yes. The results thus demonstrate that our tool based on consensual assessment is appropriate to capture readers’ impressions of how successfully different texts communicate. To emphasise the adaptation to writing, it will be given a new and memorable name: Technique for Writing Assessment by Consensus (TWAC).
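
The following is a minimal, self-contained sketch, not taken from the article, of how equation (2) can be computed from a texts-by-raters matrix of scores. The two-way consistency (average-measures) reading of MSE as the residual mean square, the use of Python/NumPy, and all names in the code are assumptions made for illustration; the residual degrees of freedom are used because they match the reported F statistics, F(26, 182) and F(26, 208).

import numpy as np

# Illustrative sketch of equation (2): average-measures consistency ICC,
# ICC = (MS_Rows - MS_E) / MS_Rows, computed from a texts x raters matrix.
def icc_average_measures(ratings):
    """ratings: 2-D array with one row per text and one column per rater
    (e.g. each rater's mean score across the six items)."""
    ratings = np.asarray(ratings, dtype=float)
    n_texts, n_raters = ratings.shape
    grand_mean = ratings.mean()

    # Between-text variability: SS_Rows with df_Rows = n_texts - 1
    ss_rows = n_raters * ((ratings.mean(axis=1) - grand_mean) ** 2).sum()
    ms_rows = ss_rows / (n_texts - 1)

    # Between-judge variability (rater column means vs. grand mean)
    ss_cols = n_texts * ((ratings.mean(axis=0) - grand_mean) ** 2).sum()

    # Residual error: SS_E with df_E = (n_texts - 1) * (n_raters - 1),
    # i.e. what is left after removing text and judge effects
    ss_total = ((ratings - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols
    df_error = (n_texts - 1) * (n_raters - 1)
    ms_error = ss_error / df_error

    icc = (ms_rows - ms_error) / ms_rows      # equation (2)
    f_value = ms_rows / ms_error              # F(df_Rows, df_E)
    return icc, f_value, n_texts - 1, df_error

# Toy usage with invented scores for 4 texts and 3 raters (not study data):
scores = [[3.2, 3.5, 3.0],
          [4.1, 4.4, 4.0],
          [2.0, 2.3, 1.8],
          [3.8, 3.6, 3.9]]
icc, f_value, df1, df2 = icc_average_measures(scores)
print(f"ICC = {icc:.3f}, F({df1}, {df2}) = {f_value:.3f}")

The toy data above only illustrate the arithmetic; the published values (.808 and .810) come from the authors' own rating matrices.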
