Original Articles

Exploring the Relationship Between Interrater Correlations and Validity of Peer Ratings

Pages 180-197 | Published online: 23 Apr 2008

Abstract

A field study was conducted to investigate the relationship between interrater correlations and validity estimates of peer ratings. Validity coefficients and interrater correlations were calculated for 281 work units in a large law enforcement organization in Israel. The main result was a weak positive linear relationship between these two variables. Furthermore, some of the analyses revealed a nonlinear quadratic component in the relationship between these measures. Validity was low only when interrater correlation was very low (r = .4 or less); above this level, validity was stable and changed very little as interrater correlation increased. This finding, together with other studies (Borman, 1975; Buckner, 1959; Freeberg, 1969; Weekley & Gier, 1989), casts doubt on the assertion that interrater correlation is a proper measure of reliability in the field of performance rating.

Notes

1. Actually, under such conditions, the empirically derived validity cannot exceed the midpoint between the lower and the higher reliability.
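To make the bound concrete, here is a sketch in standard classical test theory notation (our illustration, not the article's derivation): observed validity is attenuated by the square root of the product of the two reliabilities, and that geometric mean never exceeds the arithmetic midpoint:

$$ r_{xy} \,=\, \rho_{xy}\,\sqrt{r_{xx}\, r_{yy}} \,\le\, \sqrt{r_{xx}\, r_{yy}} \,\le\, \frac{r_{xx} + r_{yy}}{2}, $$

where $\rho_{xy}$ is the true validity, $r_{xx}$ and $r_{yy}$ are the predictor and criterion reliabilities, and the last step is the arithmetic-geometric mean inequality.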

2. We ran additional simulations under a wide range of true validity and criterion reliability values and obtained the same functional form in all of them. Admittedly, when true validity and criterion reliability are very low, the function relating validity to reliability is quite flat; in most instances, however, this is not the case. The value of 0.3 for true validity was chosen because the mean empirical validity for the composite criterion was 0.291. Considering that true validity should be equal to or higher than empirical validity, we assumed that the theoretical functions under this condition most closely represent the minimal expected function between validity and reliability in our study.
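A rough sketch of this kind of theoretical exercise, assuming the standard attenuation model in which observed validity equals true validity times the square root of reliability (the authors' exact simulation setup is not specified here, and all names below are ours):

```python
import numpy as np

# Theoretical observed validity as a function of rating reliability,
# under the classical attenuation model: r_obs = rho * sqrt(reliability).
def observed_validity(true_validity, reliability):
    return true_validity * np.sqrt(reliability)

reliabilities = np.linspace(0.05, 1.0, 20)
for rho in (0.1, 0.3, 0.5):  # assumed true validities; 0.3 matches the note
    curve = observed_validity(rho, reliabilities)
    # The curve is concave in reliability, and the whole curve is
    # compressed toward zero (hence flatter) when rho is small.
    print(f"rho={rho:.1f}:", np.round(curve, 3))
```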

3. It is important to differentiate between interrater agreement and interrater reliability, which are two different concepts. Interrater agreement refers to the degree to which the absolute magnitudes of the ratings given by each rater are the same for the given ratees. Interrater reliability refers to the degree to which raters provide similar rank orders of the ratees (Tinsley & Weiss, 1975). Our study deals with interrater reliability.
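The distinction can be made concrete with a toy computation (hypothetical values of ours, not from the study): two raters can agree poorly in absolute terms yet rank the ratees identically.

```python
import numpy as np

# Ratings of the same five ratees by two raters (hypothetical values).
rater_a = np.array([2, 3, 4, 5, 6])
rater_b = np.array([5, 6, 7, 8, 9])  # systematically 3 points higher

# Interrater agreement: closeness of the absolute ratings (poor here).
mean_abs_diff = np.mean(np.abs(rater_a - rater_b))

# Interrater reliability: similarity of the rank orders (perfect here).
interrater_r = np.corrcoef(rater_a, rater_b)[0, 1]

print(mean_abs_diff)  # 3.0 -> low absolute agreement
print(interrater_r)   # 1.0 -> perfect rank-order consistency
```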

4. In most cases, the interrater reliability index used was the interrater correlation.

5. Buckner (1959) uses the term interrater agreement, not interrater reliability, in his article. However, in the Method section (p. 61), he notes that the interrater agreement estimates were made according to the basic equation for the coefficient of reliability. Thus, it seems that Buckner did not differentiate between agreement and reliability and that his interrater agreement measures are essentially interrater reliability measures.
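For reference, the basic classical test theory equation this appears to invoke defines reliability as the ratio of true-score variance to observed-score variance (standard notation, not quoted from Buckner):

$$ r_{xx} \,=\, \frac{\sigma_T^2}{\sigma_X^2} \,=\, \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}. $$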

6. These criteria were necessary because not all members of a group participated in the evaluation process. The evaluation in a work team was held on a certain day at a certain time, and because of work constraints, not all unit members were able to take part (their peers were permitted to evaluate them even if they were not present). In addition, unit members were instructed not to evaluate individuals with whom they were insufficiently acquainted. Criterion measures were not available for all unit members for a variety of reasons related to organizational bureaucracy and rules. In some rare cases, raters did not differentiate at all among their ratees (they gave the same grade to all their coworkers); these raters were counted as team members who did not participate in the evaluation process.

7. In small groups, a minimum of five members was required in addition to the 80% rule for all the criteria listed.

8. To assess the internal consistency of the components of the composite criterion, we calculated several alpha coefficients. First, we checked the consistency of each criterion across years. The alpha coefficients found were 0.78 for supervisor evaluations, 0.57 for absenteeism, and 0.54 for discipline. This level of consistency is reasonable given the small number of years (2 or 3). Alpha coefficients among the three criteria were also calculated, within each year and across years. The correlations between the criteria were weak (0.03–0.14), yielding low consistency coefficients (about 0.24). Low correlations between nonjudgmental and judgmental measures are not rare; Heneman (1986), for example, found a corrected correlation of only 0.27 between supervisory ratings and nonjudgmental measures. Despite this low level of consistency, we decided to use the composite criterion in addition to using its components separately. According to Schmidt and Kaplan (1971), if all criterion elements are correlated with an underlying economic construct, it is possible to weight them into a composite irrespective of their intercorrelations. In our study, all three criteria appear to correlate with what the organization conceives of as work performance (which can be translated into economic value): police officers who show a low level of absenteeism, are evaluated highly by their supervisors, and have no discipline problems are considered high performers, and vice versa.
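A minimal sketch of the consistency check described, using Cronbach's alpha on a matrix of per-officer criterion scores (the data and names below are our own illustration, not the study's):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_subjects, n_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var / total_var)

# Hypothetical standardized scores on the three criteria
# (supervisor evaluation, absenteeism, discipline) for six officers.
rng = np.random.default_rng(0)
scores = rng.standard_normal((6, 3))  # nearly uncorrelated columns

# Weakly correlated criteria yield a low alpha, as reported in the note.
print(round(cronbach_alpha(scores), 2))
```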

9. We calculated work unit norms in the standardization process, rather than overall sample norms, in order to control for possible differences between units. Note that these units were very different: some were field units (patrol, investigation), whereas others were administrative and logistics units. The unit norms were simply the mean and standard deviation of each criterion within each team.
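A minimal sketch of within-unit standardization, assuming a flat table of criterion scores keyed by work unit (column names and values are hypothetical):

```python
import pandas as pd

# Hypothetical data: criterion scores for officers in two work units.
df = pd.DataFrame({
    "unit":  ["patrol", "patrol", "patrol", "logistics", "logistics"],
    "score": [10.0, 12.0, 14.0, 30.0, 34.0],
})

# Standardize each score against its own unit's mean and standard
# deviation, so that differences between units are controlled for.
df["z"] = df.groupby("unit")["score"].transform(
    lambda s: (s - s.mean()) / s.std(ddof=1)
)
print(df)
```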

10. The analysis was conducted on the raw validity and reliability indexes calculated for each work team. The major analyses were repeated with Fisher z-transformed values of these indexes; no meaningful differences were detected.
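For completeness, the Fisher transformation applied in that reanalysis is z = atanh(r); a one-line sketch with illustrative coefficients:

```python
import numpy as np

r = np.array([0.10, 0.40, 0.80])  # raw validity/reliability coefficients
z = np.arctanh(r)                 # Fisher z = 0.5 * ln((1 + r) / (1 - r))
r_back = np.tanh(z)               # inverse transform recovers r
print(np.round(z, 3), np.round(r_back, 3))
```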

