Abstract
Estimating the degree of agreement between different raters is of crucial importance in the medical and social sciences. Many approaches have been proposed in the literature for this purpose. In this article, we focus on inter-rater agreement measures for ordinal variables. The ordinal nature of the variable makes this estimation task more complicated. Although modified versions of inter-rater agreement measures exist for ordinal tables, there is no consensus on which approach to use. We conduct an extensive Monte Carlo simulation study to evaluate and compare the accuracy of mainstream inter-rater agreement measures for ordinal tables and to determine how different table structures affect the accuracy of these measures. Our results provide detailed information about which measure to use with different table structures to obtain the most reliable inferences about the degree of agreement between two raters. Based on our simulation study, we recommend using Gwet’s AC2 and Brennan-Prediger’s κ when there is high agreement among raters. However, these coefficients overstate the extent of agreement among raters when there is no agreement and the data are unbalanced.
Acknowledgment
The authors would like to acknowledge the valuable comments and suggestions of the anonymous reviewer, which have improved the quality of this paper. The authors also gratefully acknowledge the generous financial support from VIED and RMIT. Finally, the authors thank Dr. Gwet for kindly granting permission to include his functions for calculating various Kappa-like measures.
Disclosure statement
No potential conflict of interest was reported by the authors.
Funding
Duyet Tran received financial support from VIED and RMIT.