STATISTICAL PRACTICE

How Robust Are Multirater Interrater Reliability Indices to Changes in Frequency Distribution?

Pages 373-384 | Received 01 Apr 2015, Published online: 21 Nov 2016
