Abstract
This paper argues that currently available methods for assessing the repeatability and reproducibility of ordinal classifications are not satisfactory. It studies whether a class of models from Item Response Theory, well established in psychometrics and education for studying the reliability of categorical measurements, can be adapted for use in business and industry, and whether the resulting approach offers a satisfactory solution. The fitted models can be presented graphically, and they also allow the calculation of probabilities of correct ordering and consistent classification. In addition, the model-based approach allows refined diagnostics that give the user insight into the workings of a classification procedure, which is vital for a user seeking to improve a poor classification procedure. The approach is illustrated with a real-life example, and the proposed analysis is contrasted with two popular alternatives, based on Goodman and Kruskal’s gamma and Kendall’s coefficient of concordance. The datasets and mathematical proofs are available as online supplemental materials.