Abstract
Double sampling is usually applied to collect disease information on individuals in situations where a gold standard is available to validate a subset of a sample that has been classified by a fallible classifier. Inference procedures have been developed for this type of double-sampling data. In practice, however, such a gold standard may not exist. For the case in which both classifiers are subject to false-positive and false-negative diagnoses, we propose several likelihood-based and Bayesian methods for estimating the disease prevalence and the sensitivity and specificity of the diagnostic tests under two double-sampling models. A variety of methods, including the Wald, log-transformation, logit-transformation, inverse-hyperbolic-tangent-transformation, bootstrap-resampling, and Bayesian methods, are proposed to construct confidence intervals (CIs) for these parameters. Moreover, the method of variance estimates recovery (MOVER) is proposed to construct CIs for their difference. The performance of the proposed CIs is evaluated and compared with respect to the empirical coverage probability, coverage width, and ratio of mesial non-coverage to non-coverage probability. Empirical results show that all CIs, except for the Bayesian credible interval with the Jeffreys non-informative prior and the bootstrap CI based on the normality assumption, generally produce satisfactory results and hence can be recommended for practical applications. Data from a study of dementia disorders are used to illustrate the proposed methods.
Disclosure statement
No potential conflict of interest was reported by the authors.