Abstract
Cohen’s kappa and the intraclass kappa are widely used for assessing the degree of agreement between two raters with binary outcomes. However, many authors have pointed out their paradoxical behavior, which arises from dependence on the prevalence of the trait under study. To overcome this limitation, Gwet (2008) proposed an alternative and more stable agreement coefficient referred to as the AC1. In this paper, we discuss likelihood-based inference for the AC1 in the case of multiple raters and binary outcomes. We focus mainly on the construction of confidence intervals. In addition, hypothesis testing, sample size estimation, and a method for assessing the effect of subject covariates on agreement are presented. The performance of the AC1 estimator and its confidence intervals is investigated in a simulation study, and an example is presented.
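For readers unfamiliar with the coefficient, the following is a minimal sketch (not taken from the paper; the function name and the subjects-by-raters input layout are illustrative assumptions) of the sample AC1 statistic for multiple raters and binary outcomes, following the definitions in Gwet (2008):

```python
import numpy as np

def gwet_ac1(ratings):
    """Sample AC1 for binary (0/1) ratings.

    ratings: 2-D array, one row per subject, one column per rater;
    np.nan marks a missing rating. (Illustrative sketch, not the
    paper's code.)
    """
    ratings = np.asarray(ratings, dtype=float)
    r_i = np.sum(~np.isnan(ratings), axis=1)    # raters per subject
    r_i1 = np.nansum(ratings, axis=1)           # raters choosing category 1
    ok = r_i >= 2                               # need >= 2 raters per subject
    r_i, r_i1 = r_i[ok], r_i1[ok]
    # observed (percent) agreement across rater pairs within each subject
    p_a = np.mean((r_i1 * (r_i1 - 1) + (r_i - r_i1) * (r_i - r_i1 - 1))
                  / (r_i * (r_i - 1)))
    # chance agreement based on the estimated prevalence of category 1
    pi_hat = np.mean(r_i1 / r_i)
    p_e = 2 * pi_hat * (1 - pi_hat)
    return (p_a - p_e) / (1 - p_e)

# example: five subjects, each rated by three raters
print(gwet_ac1([[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 1], [0, 0, 1]]))
```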