STATISTICAL PRACTICE

How Robust Are Multirater Interrater Reliability Indices to Changes in Frequency Distribution?

Pages 373-384 | Received 01 Apr 2015, Published online: 21 Nov 2016
