Abstract
This study shows how to address the problem of trait-unrelated response styles (RS) in rating scales using multidimensional item response theory. The aim is to test and correct data for RS in order to provide fair assessments of personality. Expanding on an approach presented by Böckenholt (2012), observed rating data are decomposed into multiple response processes based on a multinomial processing tree. The data come from a questionnaire consisting of 50 items of the International Personality Item Pool measuring the Big Five dimensions, administered to 2,026 U.S. students with a 5-point rating scale. It is shown that this approach can be used to test whether RS exist in the data and that RS can be differentiated from trait-related responses. Although the extreme RS appear to be unidimensional after exclusion of only 1 item, a unidimensional measure for the midpoint RS is obtained only after exclusion of 10 items. Both RS measurements show high cross-scale correlations and item response theory-based (marginal) reliabilities. Cultural differences were found in the tendency to give extreme responses. Moreover, it is shown how to score rating data to correct for RS once RS have been shown to exist in the data.
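The decomposition of a 5-point rating into separate response processes can be illustrated with a minimal sketch. It assumes the commonly used three-process tree for 5-point scales (midpoint selection, direction of agreement, extremity) and is not necessarily the exact coding used in this study. The function name `decompose_response` and the dictionary keys are hypothetical.

```python
from typing import Dict, Optional


def decompose_response(x: int) -> Dict[str, Optional[int]]:
    """Split one 5-point rating (1..5) into three binary pseudo-items.

    Assumed coding (illustrative only):
      midpoint  : 1 if the middle category (3) was chosen, else 0
      direction : 1 for agreement (4, 5), 0 for disagreement (1, 2),
                  None (missing) when the midpoint was chosen
      extreme   : 1 for an end category (1, 5), 0 for a moderate one (2, 4),
                  None (missing) when the midpoint was chosen
    """
    if x not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    if x == 3:
        return {"midpoint": 1, "direction": None, "extreme": None}
    return {
        "midpoint": 0,
        "direction": 1 if x > 3 else 0,
        "extreme": 1 if x in (1, 5) else 0,
    }


# Decompose every possible rating to inspect the coding scheme.
for rating in range(1, 6):
    print(rating, decompose_response(rating))
```

Under this assumed coding, the midpoint and extremity pseudo-items would carry the response-style information, and the direction pseudo-items would carry the trait-related information. Each set of pseudo-items can then be analyzed with its own IRT dimension.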
Notes
Detailed information about slope parameters and item difficulties for all IRT models can be requested from the authors.
In contrast to the concept of reliability in classical test theory, which uses the average error variance for all scores, the IRT-based marginal reliability can be characterized as a function of latent proficiency θ, where the integration over possible values of θ takes the place of the traditional characterization of an average error variance:
\[ \bar{\rho} = \frac{\sigma_{\theta}^{2} - \bar{\sigma}_{e^{*}}^{2}}{\sigma_{\theta}^{2}} \]
with \(\sigma_{\theta}^{2}\) as the variance of θ and \(\bar{\sigma}_{e^{*}}^{2}\) as the average of the (possibly varying) values of the expected error variance \(\sigma_{e^{*}}^{2}\). These marginal reliabilities for IRT scores parallel the internal consistency estimates of reliability for traditional test scores (Wainer et al., 2007, p. 76).
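As a rough numerical illustration, marginal reliability can be approximated as the latent variance minus the average error variance, divided by the latent variance. The sketch below assumes the average over θ is approximated by a simple mean of per-person error variances (squared standard errors of the score estimates). The function name `marginal_reliability` and the example numbers are hypothetical and are not taken from the study.

```python
from typing import Sequence


def marginal_reliability(theta_variance: float,
                         error_variances: Sequence[float]) -> float:
    """Approximate marginal reliability from per-person error variances.

    theta_variance  : variance of the latent proficiency (assumed known,
                      for example fixed to 1.0 for model identification)
    error_variances : squared standard errors of the individual score
                      estimates; their mean stands in for the integral
                      over the distribution of theta
    """
    if theta_variance <= 0:
        raise ValueError("theta_variance must be positive")
    if len(error_variances) == 0:
        raise ValueError("error_variances must not be empty")
    mean_error_variance = sum(error_variances) / len(error_variances)
    return (theta_variance - mean_error_variance) / theta_variance


# Made-up error variances for three respondents, latent variance fixed to 1.
rho = marginal_reliability(1.0, [0.20, 0.25, 0.15])
print(round(rho, 3))
```

Because the error variance may differ across respondents, the averaging step is what makes the result a single summary comparable to an internal consistency coefficient.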