Abstract
Generalized linear models are considered for describing the dependence of data on explanatory variables when the binary outcome is subject to misclassification. Both probit and t-link regressions for misclassified binary data are proposed under a Bayesian methodology. Computational difficulties are avoided by using data augmentation: a framework with two types of latent variables is exploited to derive efficient Gibbs sampling and expectation–maximization algorithms. Moreover, this formulation allows the probit model to be obtained as a particular case of the t-link model. Simulation examples are presented to illustrate the performance of the model in comparison with standard methods that do not account for misclassification. To show the potential of the proposed approaches, a real data problem arising in the study of hearing loss caused by exposure to occupational noise is analysed.
Acknowledgements
The authors thank Dr Pardo for providing the data and for helpful suggestions related to the medical context. The authors also thank an anonymous referee for comments and suggestions that have improved the content and the readability of the paper. This research has been partially funded by the Ministerio de Economía y Competitividad, Spain (Projects TIN2008-06796-C04-03 and MTM2011-28983-C03-02), the Junta de Extremadura, Spain (Project GRU10110) and the European Union (European Regional Development Funds). For partial support of this work through research grants, thanks are due to the Natural Sciences and Engineering Research Council of Canada and to the Fonds de recherche du Québec – Nature et technologies.