
A literature review of multi-source feedback systems within and without health services, leading to 10 tips for their successful design

Laurence Wood, Andrew Hassell, Andrew Whitehouse, Alison Bullock & David Wall
Pages e185-e191 | Published online: 03 Jul 2009

Abstract

Multi-source feedback (MSF) has become the accepted mechanism for ensuring the appropriate professional behaviour of doctors. It is part of the mandatory assessment of doctors in training and is to be utilized as part of the revalidation of trained doctors. There is significant variation in the models of MSF currently used within the National Health Service, and new models of MSF are being designed by various specialties. No single model has been recognized as the ‘gold standard’. However, there is a large published literature concerning MSF, both in the context of health systems and, more extensively, within industry. This literature is reviewed, drawing attention to aspects of MSF systems in which there is consensus on effective approaches, as well as aspects in which there is doubt about the optimum approach. In the light of the review, 10 key principles for the development of effective MSF models have been produced.

Introduction

It has always been implicitly accepted that the medical curriculum extends beyond knowledge and skills to include values and behaviours, but these have now been made explicit by the General Medical Council (2001). Assessment drives the curriculum, and the need therefore arises to assess behaviours in the workplace. Multi-source feedback (MSF) (also called 360° assessment) has emerged as the dominant process for assessing professional attitudes and behaviour in the workplace. An extensive literature, both in healthcare and in industry, shows that such assessment can, with certain caveats, be practical, valid and reliable. All described MSF systems broadly follow a model in which a number of colleagues act as assessors of an individual. Their assessments are recorded on a pro forma. The judgements are fed back to the individual, either unadulterated or via an intermediary mentor or supervisor. Systems vary in terms of: the number of assessors; the method of assessor selection and whether or not the assessor is anonymous to the assessee; the content of the pro forma used in the assessment (areas scored and rating scale used); and the mechanism for feeding back to the assessee.

The published literature includes both ideas and empirical evidence. Definitive ‘proof’ of the unassailability of MSF as an assessment method is not possible: the observations are by their nature subjective; there is no gold standard; the quality under scrutiny changes with time; and final outcomes can only be measured after years, by which time many other influences will have intervened. It is therefore important to characterize some rules to guide the future development of MSF, based on what is already known. This paper considers a number of key issues distilled from the literature. These relate to the purpose of MSF, validity (content and predictive), the assessors (who, how many and reliability) and training implications. The paper ends by proposing 10 tips to consider when designing an MSF system.

Value of multi-source feedback

When done in the right way for the right purpose, MSF systems have been shown to enhance teamworking (Dominick et al., 1997; Garavan et al., 1997; Druskatt & Wolff, 1999), productivity (Edwards & Ewen, 1996), and communication and trust (Waldman & Bowen, 1998). MSF systems are now deeply ensconced in industry: 90% of managers find them helpful (Handy et al., 1996), and almost all top US companies now use them routinely (Atwater & Waldman, 1998; Ghorpade, 2000).

Purposes of multi-source feedback

The literature on MSF in industry provides a useful perspective on its application in healthcare. In particular, there is an important distinction to be made as to the purpose to which the assessment is put. McCarthy & Garavan (2001) provide a comprehensive monograph on the use of MSF in industry. They note that there are potentially five purposes, several of which are considered in the sections that follow.

Purpose and correct identification of low performers

In almost all of the published studies within health systems, the principal purpose of the assessment was the identification of those who may have a problem in the interpersonal domain. Some studies combined this with multi-rater assessment of various cognitive functions (such as problem-solving). Some were directly summative, and some were used to inform the possibility of a problem, such that formative action and/or further assessment could occur. In industry, however, there were often other purposes for the assessments—for instance to help determine outcomes relating to pay or career progress.

The way in which results are used has a critical bearing on how certain one must be that those identified as failing are truly below standard. If these people are to suffer a serious disadvantage or setback, then the diagnosis must be certain. However, if the suggested intervention is skilled feedback, followed by intensified scrutiny of both good and bad behaviour, then, at worst, this is embarrassing, annoying, time-consuming and unnecessary. At best, however, it rescues poorer trainees, protects patients, reduces the opportunity for disharmony and complaint, mollifies aggrieved staff and provides evidence which may be useful in the future.

It might of course be a matter of concern if not all under-performers were detected. However, the fact is that some were detected in all studies. This means that there is an assessment driver in the system to demand a minimum performance in terms of interpersonal behaviours, which would not be there in the absence of MSF.

Purpose and organizational values

London & Beatty (1993) and Hoffman (1995) report that MSF makes organizational values explicit. This has important practical implications: for instance, even if a small number of descriptors of behaviours were adequately reliable, they may not be sufficiently descriptive of what is desired if the principal purpose of the assessment is to disseminate these values. This concept is supported by London & Smither (1995).

Insight and self-assessment

Three large studies (Mabe & West, 1982; Fletcher, 1999; Van der Heijden & Nijhof, 2004) have shown that self-ratings tend to be higher than MSF ratings, particularly among less highly rated individuals. These studies suggest that a potential purpose for MSF is to examine ‘insight’, as measured by the discrepancy between self-rating and ratings by others. Bernardin et al. (1993) point out that managers may not change their behaviour when rated by subordinates. This brings out the inherent irony in MSF systems: a key purpose is to help people with blind spots to develop insight, yet this very lack of insight is the quality that makes acceptance potentially difficult, and so may lead to embitterment rather than change.

Validity of multi-source feedback

Content validity

Carline et al. (1989) described the desirable behaviours of doctors, and had these rated by professional associates. Newble et al. (1999) set out a completely new assessment methodology for trainees in Australasia, part of which was ‘physician assessment’, comprising multi-rater assessment of humanistic behaviours. They chose 12 behaviours to assess, only some of which were ‘humanistic’; the choice of descriptors was based on a major consultation exercise. Zhang et al. (2001) used a similar method to identify humanistic qualities in nurses with a view to future measurement. Ilott & Bunch (1980), on the other hand, used a modified Delphi technique, in a large-scale project, to try to identify the ‘competences’ of surgical SHOs. Wood & Campbell (2004) did the same for O&G trainees.

More recent health systems papers tend to quote the GMC's Good Medical Practice (2001) as the arbiter of content validity. However, there is very little evidence as to how broad agreement on content should be translated into ‘domain’ or ‘item’ wording.

Evans et al. (2004) discussed instruments for the peer assessment of physicians. They required such instruments to have data on the way they were developed, or on their validation. They excluded self-assessment tools, those not completed by physician peers and those used only in educational settings. These criteria would have excluded most of the multi-source feedback tools discussed in this article. They did identify three instruments, all from North America. All included the assessment of clinical performance as well as other domains such as communication skills and what they refer to as ‘humanistic qualities’.

Content validity and halo effect

Linn et al. (1975) were unable to find any record of a systematic attempt to devise a reliable and valid instrument for peer assessment in the affective domain. They devised a ‘performance rating scale’, comprising 16 items, each assessed on a four-point scale. Factor analysis demonstrated that many of the items pertained, at least in part, to a single domain that they called an ‘interpersonal or relationship’ factor. This alone accounted for 40% of the variance. A second factor, ‘knowledge or skill’, accounted for nearly one-third of the variance.

Davidge & Hull (1980) and Dielman et al. (1980) again confirmed by factor analysis the major influence of just two dominant domains on the assessment of undergraduates. Dawson-Saunders & Paiva (1986) showed that only four ‘dimensions’ were enough to encompass the totality of performance during a year of medical student clerkship. Maxim & Dielman (1987), in a large study, had medical students rated by house officers and attending teachers on 13 behaviourally anchored rating scales. Factor analysis showed that the large majority of the variance was due to two domains: interpersonal skills and problem-solving. Risucci et al. (1989) used factor analysis to show that ‘interpersonal factors’ explained the large majority of the variance, with ‘knowledge’ once again being the second dominant factor. Ramsey and colleagues, in two studies of MSF of physicians, found by factor analysis that two main qualities, ‘deep cognitive’ and ‘humanistic’, explained the majority of the variance in the ratings (Ramsey et al., 1993, 1996).

Whitehouse et al. (2002, 2005) demonstrated that four ‘descriptors’ were enough options to give raters. They showed that with only four domains consistent results could be demonstrated across different raters, and that individual raters were consistent over time in rating against these four descriptors. This was confirmed by Wood & Campbell (2004). Principal component analysis again showed that a single ‘communication’ factor accounted for the large majority of the variance.

The emerging message of these papers is highly consistent: despite an intuitive wish to break performance down into a variety of specifically worded competences, in reality just two factors dominate what is assessed. One of these is cognitive, which, it could be argued, is better tested by other means. The other is ‘interpersonal’ or ‘humanistic’. The evidence is therefore powerful that an overwhelming ‘halo effect’ occurs in multi-rater assessment in the affective domain, whereby a single ‘interpersonal’ factor is the main determinant of the outcome.
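The pattern these factor analyses report can be illustrated with a short simulation. The following is a sketch only, using synthetic data rather than any of the cited datasets, and assuming a simple one-factor model: when a single latent ‘interpersonal’ factor drives every item, the first principal component of the inter-item correlation matrix absorbs the large majority of the variance, however many items the pro forma contains.

```python
# Sketch with synthetic data: a one-factor ('halo') model of MSF ratings.
import numpy as np

rng = np.random.default_rng(0)
n_assessees, n_items = 200, 16

# One latent interpersonal factor drives all 16 items, plus item-level noise.
halo = rng.normal(size=(n_assessees, 1))
ratings = halo + 0.5 * rng.normal(size=(n_assessees, n_items))

# Principal component analysis via the inter-item correlation matrix.
corr = np.corrcoef(ratings, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]      # eigenvalues, largest first
explained = eigvals / eigvals.sum()
print(f"First component explains {explained[0]:.0%} of the variance")
```

With these (arbitrary) noise settings the first component explains roughly 80% of the variance, echoing the single dominant ‘interpersonal’ factor found by Linn et al. (1975) and the later studies.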

Predictive validity

No study has yet demonstrated that, over a significant period of time, those trainees identified as the best do indeed outperform their peers in relation to the attributes in question. Keck et al. (1979) demonstrated that a combination of cognitive and non-cognitive factors predicts future clinical performance much better than cognitive factors alone. Dawson-Saunders & Paiva (1986) again highlighted a strong non-cognitive component in performance at medical school and in residency. While MSF is the only tool available to measure such performance, it must remain a (very reasonable) assumption that attributes such as ‘interpersonal skills’ are indeed desirable, and do indeed underpin desirable performance; other tools for the measurement of such performance are needed. Church (2000), however, shows that in industry, where measurement of performance is often better developed, better (but not perfect) evidence exists for the predictive validity of MSF.

The assessors in multi-source feedback

Non-medical assessors

Risucci et al. (1989) helped establish the role of non-traditional assessors in the workplace assessment of humanistic, interpersonal qualities. They compared peer assessment and supervisor assessment of surgical trainees. Both groups of assessors agreed closely on overall rating across the 10 categories, while differing in their emphasis.

Butterfield & Pearson (1990) moved the debate on from peer assessment of residents to assessment by nurses of doctors’ ‘humanistic behaviour’. They showed differences in perception as to the definitions of desirable humanistic behaviours, which understandably created some dissent as to whether nurses were qualified to assess doctors, albeit in non-clinical domains. Undeterred, Butterfield & Mazzaferri (1991) developed a form for nurses to assess house staff in relation to humanistic behaviour. In a large study, they were able to show that the instrument was both practical and reliable. They pointed out, however, that the nurses’ picture often reflected a different perspective from the medical one.

Wenrich et al. (1993) reported the substantial transition to the use of nurses in the assessment of doctors. The study agreed with Butterfield in showing feasibility, although residents again did not universally like being assessed by non-doctors. As with Butterfield, it also showed a moderate correlation between assessment by non-doctors and by doctors, once more raising the question as to which perspective was to be taken more into account. Ramsey et al. (1993) importantly provided evidence that assessees could nominate their own assessors without biasing the outcome.

Number of assessors

Clearly, the issue of the number of assessors (‘raters’) relates in part to the potential for rater error or bias. The fact that rater error can occur has been discussed by Saavedra & Kwun (1993). Bettenhausen & Fedor (1997) point out that in peer rating there is a tendency to leniency, so as to minimize bad feeling. However, multi-rater assessment takes these issues into account by its very nature, in not relying on a single source. Thus the issue comes down to how many raters are needed to minimize the chance of an inaccurate rating.

Butterfield & Mazzaferri (1991) estimated that only five or six raters could provide a representative picture for each trainee. Wenrich et al.'s (1993) study calculated that 10 to 15 nurses were needed. Ramsey et al. (1993) suggested 11 raters. Woolliscroft et al. (1994) had more disappointing results. They agreed that only five to 10 Programme Directors would be needed to get a representative result when rating interns for ‘humanistic qualities’. However, the figure for nurses was 10 to 20, and for patients, more than 50.

Ramsey and colleagues (1996) analysed 3005 questionnaires relating to MSF of 187 physicians. Ten to 11 responses per physician were necessary to achieve a generalizability coefficient of 0.7. Wood & Campbell (2004) calculated that eight raters of O&G trainees were needed to give an intraclass correlation coefficient of 0.8.

There might be many reasons why these figures differ. First, in some studies more than just ‘interpersonal’ aspects were being measured. Second, it could be that some groups of raters have a tendency to be inconsistent in their rating: they may have had different training, milieu or natural propensities, and different roles might have afforded different experiences of the assessee. Third, expansion of the categories to be rated could conceivably add subjectivity or irrelevancy, or leave the rater in no position to judge. Finally, raters may have differed in the amount of training they received. This opens the possibility that the number of raters might be reduced further, by virtue of better, more focused assessment instruments and better-trained raters.
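The arithmetic behind such rater-number estimates can be made explicit. What follows is a minimal sketch, assuming the standard Spearman-Brown projection on which generalizability estimates of this kind are commonly based; the single-rater reliability used below is hypothetical, not a figure taken from any of the studies above.

```python
# Spearman-Brown projection, rearranged to give the number of raters (k)
# needed for the mean of k ratings to reach a target reliability R,
# given an assumed single-rater reliability r: k = R(1 - r) / (r(1 - R)).

def raters_needed(r: float, target: float) -> float:
    """Raters required for the averaged rating to reach the target reliability."""
    return target * (1 - r) / (r * (1 - target))

# Hypothetical example: a single-rater reliability of ~0.19 implies that
# about 10 raters are needed to reach 0.7, of the same order as the
# 10-11 responses reported by Ramsey and colleagues (1996).
print(round(raters_needed(0.19, 0.7), 1))  # -> 9.9
```

The formula also shows why better instruments should reduce the required numbers: any increase in single-rater reliability r lowers k sharply.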

Accepting the assessments

If MSF systems are well accepted by, and motivating for, assessees, then the worry about the lack of consistent evidence that they drive improvement is dissipated. One might reasonably assume, in that case, that if assessees like the system and find it motivating, organizational benefits will ensue. Evidence for this is abundant (McEvoy & Buller, 1987; Fedor & Bettenhausen, 1989; Riggio & Cole, 1992; Tornow, 1993; Yukl & Lepsinger, 1995; Maurer & Tarulli, 1996; Towers-Perrin, 1998; Wimer & Nowack, 1998). Additionally, there would be an upward spiral, by virtue of using the system to communicate the values and expectations of the organization in relation to professional behaviours. Alimo-Metcalfe (1998) makes the interesting point that MSF systems may highlight strengths of which the assessee is unaware.

Fedor & Bettenhausen (1989) and others (Becker & Klimoski, 1989; Bettenhausen & Fedor, 1997; Bilsbury, 1998) say that it is therefore critical to success to stress the developmental nature of the system to assessees. Barclay & Harland (1995) add that trainees should perceive their raters as competent, and that they should have the opportunity to have potential errors or biases re-examined before they go on the final record.

Training

Kaplan (1993) notes that receiving negative feedback can demotivate. Industry provides some important insights into this field. McCarthy & Garavan (2001) have reviewed the literature extensively, and conclude that training of both raters and assessees is important. Raters, in particular, need to understand the common forms of bias, such as the halo effect and centralization (Hoffman, 1995).

Many authors agree, however, that training in giving feedback is done badly, to the great detriment of the system (Holmes & McCall, 1997; Lindsey et al., 1997; Kanouse, 1998). If managers are not trained in giving negative feedback, they may hate doing it, and so reject the MSF system (Folger & Cropanzano, 1998). Such problems characteristically occur in hierarchical organizations (London et al., 1990; Murphy & Cleveland, 1991), and can create tension between raters and assessees (Hautaluoma et al., 1992). Interestingly, however, little can be found in the literature concerning the training of assessees to receive feedback.

Use of computerized systems

There is little evidence in the literature concerning this topic, but what there is, is encouraging. Penny (2003) compared pen-and-paper with electronic systems, and concluded that the method did not influence the consistency of raters. Archer et al. (2005) described a successful system in which assessments were centrally scanned into a computer for summarizing and analysis. Payne (1998), however, warned that people may have an unreasonable respect for electronic data and numerical scores, and that a danger of electronic systems is that they portray subjective data as objective.

Conclusion

It has become clear that communication and relationships need to be part of the formal medical curriculum, and that ‘humanistic’ behaviour needs to be assessed. Published studies, within and without health systems, to a greater or lesser extent, bear out the themes of validity and usefulness of multi-rater assessment in the workplace. The complete lack of a published opposing view may be due to publication bias, or to the difficulty in demonstrating such a conclusion. However, given the three decades of literature, it is most likely to be because multi-rater assessment is indeed a valid and potentially useful tool. Careful planning needs to go into the design of an MSF system to ensure that those being rated predominantly welcome the system. From a review of world literature in both industry and health, it is possible to postulate a set of rules for the future development of MSF systems.

Ten tips in the development of multi-source feedback systems

Develop a positive culture

A positive and supportive ‘culture’ or ‘climate’ of the organization will help motivation to change behaviour, but use of MSF in a negative culture can demotivate. Organizations should use the introduction of MSF as a time to address ‘climate’ issues.

Be clear about the purpose

There is a vital distinction between the developmental use of MSF, and its use in decisions that affect careers. For the latter purpose, the system must be well proven, mature and owned by all.

Clearly express any desired behaviours

Use of MSF disseminates an understanding of desired behaviours, and so the opportunity should be taken to make desired behaviours clear and explicit.

Keep the number of items to be scored few

Whatever the wording of items, assessments of behaviours in the affective domain are dominated by a single ‘halo effect’ perception of the person's interpersonal skills. Large numbers of items do not add discrimination, and may increase inconsistency of rating.

Keep the scale simple and fit-for-purpose

The construction of the MSF scale, and the addition of any option for free text, should reflect the purpose of the assessment.

Use six to 10 raters

The better the instrument, the more generalizable the results from just a few raters—but sampling multiple sources is an inherent element of the assessment.

Compare results with self-assessment

Assessees tend to be more lenient with themselves than raters are, and a large discrepancy between the two suggests a lack of insight. Discrepancy is reduced when the assessee has a clearer grasp of the desired behaviours. The implication is that MSF should be compared with self-assessment. Furthermore, the assessee's view may add an important and valid perspective that the raters have glossed over.
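As an illustration of this comparison (a sketch only; the ratings and the 0.5-point flagging threshold below are arbitrary assumptions, not taken from any published instrument):

```python
# Sketch: flag a possible lack of insight as the gap between an assessee's
# self-rating and the mean of the raters' scores, on the same scale.

def insight_gap(self_rating: float, rater_scores: list[float]) -> float:
    """Positive values mean the assessee rates themselves above the raters' mean."""
    return self_rating - sum(rater_scores) / len(rater_scores)

gap = insight_gap(3.8, [2.9, 3.1, 2.7, 3.0, 2.8])   # hypothetical 1-4 scale
if gap > 0.5:                                        # arbitrary threshold
    print(f"Self-rating exceeds raters' mean by {gap:.1f}: explore insight at feedback")
```

Equally, a strongly negative gap may flag the unrecognized strengths noted by Alimo-Metcalfe (1998).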

Train those giving feedback

The power of MSF to change behaviour depends on the quality of the feedback. MSF systems demand training of those giving feedback. Skilled, unthreatening feedback is particularly important when the assessee is learning new behaviours.

Skilled feedback needs to be backed up by educational planning, which itself implies time and training to fulfil educational goals.

Involve the assessees

Assessee involvement in the development and implementation of multi-source feedback is likely to enhance the quality of the system and maximize its potential.

Incorporate development

MSF is an opportunity to develop shared understanding in an organization. Raters and assessees should help develop the system together, and should be trained together in dealing with improving performance, in areas shown to be weak. Investigators’ initial attempts at wording will result in some domains being less consistently interpreted than others. There should be an ongoing dialogue between all parties in the organization, to help standardize understanding, and to help develop future iterations of the system.

Additional information

Notes on contributors

Laurence Wood

LAURENCE WOOD is an associate postgraduate dean for education in the West Midlands Deanery and a consultant in obstetrics and gynaecology at University Hospital Coventry and Warwickshire in Coventry.

Andrew Hassell

ANDREW HASSELL is an associate postgraduate dean for education in the West Midlands Deanery and a consultant in rheumatology at the University Hospital of North Staffordshire in Stoke on Trent.

Andrew Whitehouse

ANDREW WHITEHOUSE is a director of hospital and specialist education in the West Midlands Deanery, and a consultant physician at George Eliot Hospital in Nuneaton, Warwickshire.

Alison Bullock

ALISON BULLOCK is Reader in Medical and Dental Education in the School of Education at the University of Birmingham and a senior member of the School's Centre for Research in Medical and Dental Education.

David Wall

DAVID WALL is a deputy regional postgraduate dean in the West Midlands Deanery and a professor of medical education at Staffordshire University.

References

  • Alimo-Metcalfe B. 360° feedback and leadership development. International Journal of Selection and Assessment 1998; 6(1)35–44
  • Archer JC, Norcini J, Davies H. Use of SPRAT for peer review of paediatricians in training. British Medical Journal 2005; 330: 1251–1253
  • Atwater L, Waldman D. Accountability in 360° feedback. Human Resources Magazine 1998; 43(6)96–102
  • Barclay J, Harland L. Peer performance appraisals: the impact of rater competence, rater location and rating correctability on fairness perceptions. Group and Organization Management 1995; 20(1)39–60
  • Becker T, Klimoski R. A field study of the relationship between the organizational feedback environment and performance. Personnel Psychology 1989; 42: 343–358
  • Bernardin H, Dahmus S, Redmon G. Attitudes of first-line supervisors toward subordinate appraisals. Human Resource Management 1993; 32(2–3)315–324
  • Bettenhausen K, Fedor D. Peer and upward appraisals: a comparison of their benefits and problems. Group and Organization Management 1997; 22(2)236–263
  • Bilsbury N. Academy of Human Resource Development Conference Proceedings. A case study: employee perceptions of the efficacy of 360 degree feedback at a large Midwestern utility. Academy of Human Resources Development, Oak Brook, IL 1998
  • Bracken D, Summers L, Fleenor J. High-tech 360°. Training and Development 1998; 52(8)42–46
  • Butterfield PS, Pearson JA. Nurses in resident evaluation: a qualitative study of the participants’ perspectives. Evaluation and the Health Professions 1990; 13: 453–473
  • Butterfield PS, Mazzaferri EL. A new rating form for use by nurses in assessing residents’ humanistic behaviour. Journal of General Internal Medicine 1991; 6: 155–161
  • Cardy R, Dobbins G. Performance Appraisal: Alternative Perspectives. South-Western Publishing, Cincinnati, OH 1994
  • Carline JD, Wenrich M, Ramsey PG. Characteristics of ratings of physician competence by professional associates. Evaluation and the Health Professions 1989; 12: 409–423
  • Church A, Bracken D. Advancing the state-of-the-art of 360° feedback: guest editors’ comments on the research and practice of multi-rater assessment methods. Group and Organization Management 1997; 22(2)149–161
  • Church AH. Do higher performing managers actually receive better ratings? a validation of multirater assessment methodology. Consulting Psychology Journal: Practice and Research 2000; 52: 99–116
  • Crossley T, Taylor I. Developing competitive advantage through 360° feedback. American Journal of Management Development 1995; 1(1)6–17
  • Davidge AM, Hull A.L. A system for the evaluation of medical students’ clinical competence. Journal of Medical Education 1980; 55: 65–67
  • Dawson-Saunders B, Paiva REA. The validity of clerkship performance evaluations. Medical Education 1986; 20: 240–245
  • DeSimone R. Establishing the link: relating a 360 degree management assessment and development process to the bottom line. Health Care Supervisor 1998; 17: 31–36
  • Dielman TE, Hull AL, Davis WK. Psychometric properties of clinical performance ratings. Evaluation and the Health Professions 1980; 3: 103–117
  • Dominick P, Reilly R, McGourty J. The effects of peer feedback on team member behaviour. Group and Organization Management 1997; 22(4)508–520
  • Druskatt V, Wolff S. Effects and timing of developmental peer appraisals in self-managing work groups. Journal of Applied Psychology 1999; 84(1)58–74
  • Edwards M, Ewen A. 360° Feedback: The Powerful New Model for Employee Assessment and Performance Improvement. AMACOM, New York 1996
  • Evans R, Elwyn G, Edwards A. Review of instruments for peer assessment of physicians. British Medical Journal 2004; 328: 1240–1243
  • Facteau CL, Facteau CD, Schoel LC, Russell EA, Poteet ML. Reactions of leaders to 360-degree feedback from subordinates and peers. Leadership Quarterly 1998; 9(4)427–448
  • Farh J, Dobbins G. Effects of self-esteem on leniency bias in self-reports of performance: a structural equation model analysis. Personnel Psychology 1989; 42: 835–850
  • Fedor D, Bettenhausen K. The impact of purpose, participant preconceptions and rating level on the acceptance of peer evaluations. Group and Organization Studies 1989; 14(2)182–197
  • Fletcher C. The implications of research on gender differences in self-assessment and 360 degree appraisal. Human Resource Management 1999; 9(1)39–46
  • Folger R, Cropanzano R. Organizational justice and performance evaluation: test and trial measures. Organizational Justice and Human Resource Management. Sage Publications, Beverly Hills, CA 1998
  • Garavan T, Morley M, Flynn M. 360-degree feedback: its role in employee development. Journal of Management Development 1997; 13(2–3)134–148
  • General Medical Council. Good Medical Practice. GMC, London 2001
  • Ghorpade J. Managing five paradoxes of 360-degree feedback. Academy of Management Executive 2000; 14: 140–150
  • Handy L, Devine M, Heath L. 360° Feedback: Unguided Missile or Powerful Weapon?. Ashridge Management Research Group, Berkhamsted 1996
  • Hautaluoma J, Jobe L, Visser S, Donkersgoed W. Employee reactions to different upward feedback methods. 1992, paper presented to the 7th Annual Meeting of the Society for Industrial and Organizational Psychology, Montreal
  • Hazucha J, Hezlett S, Schneider R. The impact of 360-degree feedback on management skills development. Human Resource Management 1993; 32(2–3)325–351
  • Hoffman R. Ten reasons why you should be using 360-degree feedback. Human Resources Magazine 1995; 40: 82–86
  • Ilott I, Bunch G. Competencies of basic surgical trainees. Annals of the Royal College of Surgeons of England 1980; 1(Suppl.)14–16
  • Kanouse D. Why multi-rater feedback systems fail. Human Resources Focus 1998; 75: 3–4
  • Kaplan R. 360° feedback PLUS: boosting the power of co-worker ratings for executives. Human Resource Management 1993; 32(2–3)299–314
  • Keck JW, et al. Efficacy of cognitive/non cognitive measures in predicting resident physician performance. Journal of Medical Education 1979; 54: 759–765
  • Keeping L, Levy P, Brown D. Examining Self-Appraisal Formality and Expectations on Appraisal Reactions. Society for Industrial and Organizational Psychology, Atlanta, GA 1999
  • Lepsinger R, Lucia A. Creating champions for 360° feedback. Training and Development 1998; 52(2)49–53
  • Lindsey E, Holmes V, McCall M. Key Events in Executives’ Lives. Centre for Creative Leadership, Greensboro, NC 1997
  • Linn BS, Arostegui M, Zeppa R. Performance self assessment. British Journal of Medical Education 1975; 9: 98–101
  • London M, Beatty R. 360° feedback as a competitive advantage. Human Resource Management 1993; 32(2–3)353–372
  • London M, Larsen H, Thisted L. Relationships between feedback and self-development. Group and Organization Management 1999; 24(1)5–27
  • London M, Smither J. Can multi-source feedback change perceptions of goal accomplishment, self-evaluations and performance-related outcomes? Theory-based applications and directions for research. Personnel Psychology 1995; 48(4)803–839
  • London M, Wohlers A, Gallagher P. 360° feedback surveys: a source of feedback to guide management development. Journal of Management Development 1990; 9: 17–31
  • Mabe P, West S. Validity of self-evaluation of ability: a review and meta-analysis. Journal of Applied Psychology 1982; 67: 280–296
  • Martocchio J, Judge T. Relationship between conscientiousness and learning in employee training: mediating influences of self-deception and self-efficacy. Journal of Applied Psychology 1997; 82(5)764–773
  • Maurer T, Tarulli B. Acceptance of peer/upward performance appraisal systems: role of work context factors and beliefs about managers’ development capability. Human Resource Management 1996; 35(2)217–241
  • Maxim BR, Dielman TE. Dimensionality, internal consistency and interrater reliability of clinical performance ratings. Medical Education 1987; 21: 130–137
  • McCarthy A, Garavan T. Developing self-awareness in the managerial career development process: the value of 360° feedback and the MBTI. Journal of European Industrial Training 1999; 23(9)437–445
  • McCarthy AM, Garavan TN. 360 Degree feedback processes: performance improvement and employee career development. Journal of European Industrial Training 2001; 25(1)3–32
  • McEvoy G, Buller P. User acceptance of peer appraisals in an industrial setting. Personnel Psychology 1987; 40: 785–797
  • Murphy K, Cleveland J. Performance Appraisal: An Organizational Perspective. Allyn & Bacon, Boston, MA 1991
  • Newble D, Paget N, McLaren B. Revalidation in Australia and New Zealand: approach of Royal Australasian College of Physicians. British Medical Journal 1999; 319(7218)1185–1188, (Intl edn)
  • O’Reilly B. 360° feedback can change your life. Fortune 1994; 130(8)93–97
  • Payne T. Editorial: 360 degree assessment and feedback. International Journal of Selection and Assessment 1998; 6(1)16–18
  • Penny JA. Exploring differential item functioning in a 360-degree assessment: rater source and method of delivery. Organisational Research Methods 2003; 6(1)61–79
  • Ramsey PG, Carline JD, Blank LL, Wenrich MD. Feasibility of hospital-based use of peer ratings to evaluate the performances of practicing physicians. Academic Medicine 1996; 71(4)364–370
  • Ramsey PG, Wenrich MD, Carline JD, Inui TS, Larson B, LoGerfo JP. Use of peer ratings to evaluate physician performance. Journal of the American Medical Association 1993; 269: 1655–1660
  • Riggio R, Cole E. Agreement between subordinate and superior ratings of supervisory performance and effects on self and subordinate job satisfaction. Journal of Occupational and Organizational Psychology 1992; 65: 151–158
  • Risucci DA, Tortolani AJ, Ward RJ. Ratings of surgical residents by self, supervisors and peers. Surgery, Gynecology & Obstetrics 1989; 169: 519–526
  • Saavedra R, Kwun S. Peer evaluation in self-managing work groups. Journal of Applied Psychology 1993; 78(3)450–462
  • Southgate L, Hays RB, Norcini J, Mulholland H, Ayers B, Woolliscroft J, Cusimano M, McAvoy P, Ainsworth M, Haist S, Campbell M. Setting performance standards for medical practice: a theoretical framework. Medical Education 2001; 35(5)474–481
  • Steensma C, Gould L, Moseley C. Using a group-based 360 degree feedback process to facilitate the merger of four marketing units at Disney Networks. US Human Resource Planning 1998; 2(4)11–15
  • Tornow W. Perceptions or reality: is multi-perspective measurement a means to an end? Human Resource Management 1993; 32(2–3)221–229
  • Tornow W, London M. Maximizing the Value of 360-degree Feedback: A Process for Successful Individual and Organizational Development. Jossey-Bass, San Francisco, CA 1998
  • Towers-Perrin. 360° Feedback: The Global Perspective. Towers Perrin, London 1998
  • Van der Heijden BI, Nijhof AH. The value of subjectivity: problems and prospects for 360 degree appraisal systems. International Journal of Human Resource Management 2004; 15(3)493–511
  • Waldman D. Predictors of employee preferences for multi-rater and group-based performance appraisal. Group and Organization Management 1997; 22(2)264–287
  • Waldman D, Bowen D. The acceptability of 360° appraisals: a customer-supplier relationship perspective. Human Resource Management 1998; 37(2)117–129
  • Wenrich MD, Carline JD, Giles LM, Ramsey PG. Ratings of the performances of practising internists by hospital based registered nurses. Academic Medicine 1993; 68: 680–687
  • Whitehouse A, Hassell A, Wood L, Wall D, Walzman M, Campbell I. Development and reliability testing of a new form for 360 degree assessment of senior house officers’ professional behaviour, as specified by the General Medical Council. Medical Teacher 2005; 27: 252–258
  • Whitehouse A, Wall D, Walzman M. Pilot study of 360 degree assessment of personal skills of senior house officers. Hospital Medicine 2002; 63(3)172–175
  • Wimer S, Nowack K. Thirteen common mistakes using 360-degree feedback. Training and Development 1998; 52(5)69–79
  • Wood L, Campbell I. 360 degree assessment: encouraging results of a 6 year study. Annual Meeting of the Association for the Study of Medical Education. Liverpool, UK 2004
  • Woolliscroft JO, Howell JD, Patel BP, Swanson DB. Resident–patient interactions: the humanistic qualities of internal medicine residents assessed by patients, attending physicians, program supervisors, and nurses. Academic Medicine 1994; 69(3)216–224
  • Yukl G, Lepsinger R. How to get the most out of 360° feedback. Training and Development 1995; 32(12)45–50
  • Zhang Z, Luk W, Arthur D, Wong T. Nursing competencies: personal characteristics contributing to effective nursing performance. Journal of Advanced Nursing 2001; 33(4)467–474
