A single generic multi-source feedback tool for revalidation of all UK career-grade doctors: Does one size fit all?

Pages e75-e83 | Published online: 28 Jan 2011

Abstract

Background: The UK Department of Health is considering a single, generic multi-source feedback (MSF) questionnaire to inform revalidation.

Method: Evaluation of an implementation pilot, reporting: response rates, assessor mix, question redundancy and participants’ perceptions. Reliability was estimated using Generalisability theory.

Results: A total of 12,540 responses were received on 977 doctors. The mean time taken to complete an MSF exercise was 68.2 days. The mean number of responses received per doctor was 12.0 (range 1–17), with no significant difference between specialties. Individual question response rates and participants’ comments about questions indicate that some questions are less appropriate for some specialties. There was a significant difference in mean score between specialties. Despite guidance, there were significant differences in the mix of assessors across specialties. More favourable scores were given by progressively more junior doctors. Nurses gave the most reliable scores.

Conclusions: It is feasible to electronically administer a generic questionnaire to a large population of doctors. Generic content is appropriate for most but not all specialties. The differences in mean scores and the reliability of the MSF between specialties may be in part due to the specialty differences in assessor mix. Therefore the number and assessor mix should be standardised at specialty level and scores should not be compared across specialties.

Introduction

It is routine in some professions, for example aviation, to provide evidence of continuing competence throughout one's career. However, doctors in the UK who have finished their training are not currently subject to such regulation. In recent years there has been an increase in public and political pressure to change this as a result of several high profile cases where the performance of doctors has been found to be below acceptable levels.

In 1998 the UK's General Medical Council (GMC), which holds a register of all medical practitioners in the UK and is responsible for regulation, published a document stating ‘all doctors should be prepared to demonstrate at regular intervals that they remain up to date and fit to practice’ (Irvine 1998). This was followed by the proposition that this process should become a condition of continued registration (and therefore of the ability to practise), and the term ‘revalidation’ was introduced.

It has taken considerable time and legislative changes, but in November 2009 the first ‘licences to practise’ were issued to all doctors in the UK, and this has started the clock on revalidation. To retain their licences, all doctors will now need to participate in revalidation and demonstrate, every 5 years, that they remain up to date and fit to practise.

Doctors in training in the UK already collect and present evidence at regular intervals to progress through training, and it is therefore expected that this information will also be used for revalidation. However, for career-grade doctors, this activity is less well developed and relies on an appraisal process which is not designed for the purposes of assessment. Much work has been done to establish how career-grade doctors will obtain performance information and how it will be used as evidence to make a decision regarding revalidation. The Department of Health (DH) has recommended that one of the key pieces of evidence will come from a multi-source feedback (MSF) exercise (Department of Health 2007). MSF or 360° feedback is a questionnaire-based assessment method in which multiple co-workers (assessors) provide confidential feedback on an employee's (assessee's) key performance behaviours. In medicine it is important to include the perspectives of several discrete groups, including medical peers, medical trainees, student doctors, patients, nurses, allied health professionals (AHPs) and administrative/clerical staff. For the purpose of this article, MSF will refer to feedback from co-workers. (We developed a different questionnaire for patient feedback, which is described separately (Mackillop et al. 2006).)

MSF tools have been widely evaluated for healthcare professionals, particularly in North America (Ramsay et al. 1993; Wenrich et al. 1993; Lipner et al. 2002; Lockyer 2003; Lockyer & Violato 2004; Violato et al. 2006). In the UK, MSF is increasingly used as evidence in the portfolios of doctors in training and several studies have validated MSFs for this purpose (Thomas et al. 1999; Whitehouse et al. 2002, 2005; Davies & Archer 2003; Archer et al. 2005; Hesketh et al. 2005; Wilkinson & Wade 2005; Davies et al. 2008; Wilkinson et al. 2008). Small studies have also described using MSF for consultants (Mason et al. 2003; Bennett et al. 2004) and general practitioners (GPs; Griffin et al. 2000; Elwyn et al. 2005; Murphy et al. 2008), and two studies validating a previously developed questionnaire on consultants (Crossley et al. 2008), and on consultants and GPs (Campbell et al. 2008), have been published. More recently, specialty-specific MSFs have been developed in some specialties, for trainees (Davies et al. 2008) and for career-grade doctors (The Royal College of Radiologists 2004; Lelliot et al. 2008), suggesting a possible need for more specialised questions or processes for some specialties.

The only study to evaluate the utility of MSF across specialties is that of Lockyer and Violato (2004), who used a previously validated 36-item colleague MSF questionnaire with 304 doctors in three specialties – internal medicine (103), psychiatry (101) and paediatrics (100). They found no specialty differences in response rates or in the reliability of the MSF. Furthermore, factor analysis showed that the items clustered into the same four factors across the three specialties (patient management, clinical assessment, professional development and communication). However, communication was the most important discriminating factor in psychiatry, whereas patient management was the most important factor for the other two specialties. In other words, there is evidence that aggregate scores may hide important specialty differences; therefore, comparing scores between specialties may be inappropriate.

This study evaluates the utility of a generic questionnaire for career-grade doctors across a wide range of specialties. As a novel contribution, we focus on specialty differences in the mix of assessors as a possible additional confounding factor.

Method

Development of the questionnaire

The Royal College of Physicians (RCP), the Academy of Medical Royal Colleges (AoMRC), the GMC and the DH collaborated to develop a generic MSF questionnaire for career-grade doctors, to support revalidation. The initial content was defined by a literature review and consultation with consultants from many specialties and experts within the field of MSF. Consultation took the form of e-mail correspondence using a modified Delphi technique (a method for obtaining group consensus involving a cycle of questionnaire, feedback and modification which continues until consensus is reached) to identify relevant items. This was followed by a focus group that included representatives from all assessor groups, where the content, wording and rating scale were finalised. The instrument was required to map onto the General Medical Council's (2006) Good Medical Practice and was developed to make it as generic as possible while ensuring the domains were relevant. It has 10 domains (Appendix). A 4-point scale was chosen with 2 points either side of the borderline score. Descriptors were modified during the consultation process. The Health and Probity question was made a yes/no answer, and the instructions strongly advised the assessors to elaborate in the text box if the question was answered ‘yes’. The text box also encourages assessors to elaborate on any of the domains.

Administration

An independent provider (360 Clinical Ltd) administered the instrument electronically to reduce administration time and to achieve a consistent and secure process. Independence is stipulated by the Department of Health (2007) and the General Medical Council (2010) and was thought to be more acceptable to the profession.

Data collection and recruitment

Recruitment was undertaken in two ways. First, 1000 doctors were nominated by their employer (National Health Service Hospital or Primary Healthcare Trust) as part of the local appraisal process; written consent was obtained from the Medical Director of each employer and from each assessor to include our questionnaire and collect the anonymised data. Second, several Royal Colleges arranged for an advertisement for volunteers to be sent to all their members and fellows. The data were stored in a secure data storage unit used by the independent provider. Jim Crossley and Pirashanthie Vivekananda-Schmidt analysed anonymised data.

The process

For practical reasons, we asked the doctor being assessed to nominate his/her own assessors. However, we reduced selection bias, first, by making recommendations on the choice of assessors and, second, by using a facilitator (usually the doctor's appraiser) to scrutinise that choice.

We requested 15 assessors per doctor. We recommended that the assessors should be representative of the doctor's day-to-day clinical practice but should include at least two doctors, two nurses, two AHPs and two clerical/managerial staff. The facilitator verified the mix of assessors against these recommendations and was given the opportunity to decline assessors. The facilitator was also responsible for feeding back the results at a designated meeting at the end of the process. For most doctors, this was at their annual appraisal.

The system sent weekly electronic reminders to the assessee to nominate their assessors and, once the assessors were accepted by the facilitator, similar reminders were sent to the assessors. An electronic report was generated once 10 responses were received or 12 weeks had elapsed.
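As an illustration only, the report-trigger rule just described (10 responses received or 12 weeks elapsed) can be written as a short sketch; this is not the pilot system's actual implementation, and all names are assumptions.

```python
# Sketch of the report-trigger rule: a feedback report is generated once 10
# responses have been received or 12 weeks have elapsed, whichever comes first.
# Function and variable names are illustrative assumptions.
from datetime import date, timedelta

REQUIRED_RESPONSES = 10
MAX_DURATION = timedelta(weeks=12)

def report_due(responses_received: int, started_on: date, today: date) -> bool:
    """Return True when the electronic feedback report should be generated."""
    return (responses_received >= REQUIRED_RESPONSES
            or today - started_on >= MAX_DURATION)

# Example: only 8 responses, but 13 weeks have elapsed, so the report is generated.
print(report_due(8, date(2010, 1, 4), date(2010, 4, 5)))  # True
```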

Analysis of data

Quantitative analyses were conducted in SPSS (version 14.0) and Excel (2003). Descriptive analysis included means, standard error of measurement (SEM) and frequencies to describe the assessors, the item scores, the mean scores, the form scores, the response rates per doctor and per item, and the time taken to complete an MSF. Analyses were performed at the whole-cohort level and at the specialty level.
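For illustration, the kind of per-doctor and per-specialty descriptive summary described above could be produced as follows. The study's own analyses were run in SPSS and Excel; the data, column names and values in this sketch are invented.

```python
# Illustrative descriptive summary: mean form score and number of completed
# questionnaires per doctor, then aggregated by specialty. Invented data only.
import pandas as pd

responses = pd.DataFrame({
    "doctor_id":  [1, 1, 2, 2, 2, 3, 3],
    "specialty":  ["medicine", "medicine", "psychiatry", "psychiatry",
                   "psychiatry", "pathology", "pathology"],
    "form_score": [3.6, 3.5, 3.4, 3.3, 3.5, 3.2, 3.3],
})

# One row per doctor: mean form score and number of completed questionnaires.
per_doctor = (responses
              .groupby(["specialty", "doctor_id"])["form_score"]
              .agg(mean_score="mean", n_responses="count")
              .reset_index())

# Specialty-level summary: mean score and mean number of responses per doctor.
per_specialty = per_doctor.groupby("specialty").agg(
    mean_score=("mean_score", "mean"),
    responses_per_doctor=("n_responses", "mean"),
)
print(per_specialty)
```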

Evaluation

The utility of the questionnaire was investigated by evaluating the feasibility and the relevance of this MSF process and estimating the reliability. This was done for the whole cohort of participants, and where appropriate, specialties were compared.

Feasibility was evaluated by:

  • calculating the median response on the response scale to the question ‘How easy did you find the questionnaire to complete?’;

  • calculating the time taken to complete the MSF process;

  • calculating the response rates;

  • calculating the number of assessors in each assessor group per doctor and evaluating adherence to the recommendations;

  • identifying themes in participants’ free text comments.

Relevance of the questions was evaluated by:

  • calculating the median response on the response scale to the question ‘How worthwhile do you think the MSF process is?’;

  • calculating item response rates and item redundancy across specialties;

  • calculating item response rate and item redundancy across assessor groups;

  • identifying themes in participants’ free text comments.

Reliability of the questionnaire was estimated using Generalisability theory. For the G-study, variance components were estimated using the VARCOMP procedure (MINQUE). For the D-study, the SEM and G coefficient were calculated in Excel using Cronbach's original equations.
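To make the D-study step concrete, the sketch below applies the standard single-facet equations for the SEM and the G coefficient of a mean score over n assessors, assuming a simplified design in which raters are nested within doctors. The variance components shown are placeholders, not the study's estimates, and the function name is an assumption.

```python
# Minimal D-study sketch. The two variance components would come from the
# G-study (e.g. SPSS VARCOMP with the MINQUE estimator); values below are
# placeholders, not the study's estimates.

def d_study(var_doctor: float, var_residual: float, n_raters: int):
    """Return (SEM, G coefficient) for a mean score over n_raters assessors."""
    error_var = var_residual / n_raters                    # error variance of the mean
    sem = error_var ** 0.5                                 # standard error of measurement
    g = var_doctor / (var_doctor + error_var)              # reliability of the mean score
    return sem, g

# Illustrative projection across candidate panel sizes:
for n in (5, 10, 15):
    sem, g = d_study(var_doctor=0.04, var_residual=0.20, n_raters=n)
    print(f"n_raters={n:2d}  SEM={sem:.3f}  G={g:.2f}")
```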

Significance tests

One-way ANOVA was used to test for significant differences in mean scores between specialties.
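For illustration, the same comparison could be computed on per-doctor mean scores grouped by specialty as follows. The study's analysis was performed in SPSS; the data and column names here are invented.

```python
# Illustrative one-way ANOVA across specialties on per-doctor mean MSF scores.
# Invented data; the study's own analysis used SPSS.
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "specialty":  ["medicine"] * 3 + ["psychiatry"] * 3 + ["pathology"] * 3,
    "mean_score": [3.60, 3.55, 3.58, 3.38, 3.42, 3.35, 3.30, 3.33, 3.28],
})
groups = [g["mean_score"].to_numpy() for _, g in df.groupby("specialty")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```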

Results

Recruitment and participation

Figure 1 illustrates the recruitment and uptake of the pilot. More than 90% of participating doctors worked in England in a Hospital Trust or Primary Healthcare Trust environment. Volunteers were marginally more likely than doctors nominated by their trust (78% versus 68%) to participate in the MSF process. A broad range of specialties was represented (Table 1).

Figure 1. Recruitment and uptake of the pilot.


Table 1.  Number of career-grade doctors, mean score and projected reliability from each specialty

Overall scores

For the whole cohort, aggregate scores followed a normal distribution with a positive skew (Figure 2). All aggregate scores were above the borderline score. This is similar to other MSF questionnaires and shows that the aggregate numerical score by itself is not of great value; individual item scores and, importantly, free-text comments are more useful in the feedback discussion.

Figure 2. Doctors’ aggregate scores.


Specialty comparison

Overall mean scores were particularly high for ophthalmology and medicine and low for pathology and psychiatry (Table 1). These differences reached statistical significance using ANOVA (F = 2.1, p < 0.05). However, in the variance component analysis using a regression model, the specialty differences were negligible compared to the variance attributed to the individual doctor being rated.

Across the cohort, ‘leadership’, ‘clinical assessment’ and ‘patient management’ were scored the lowest overall. Different specialties had different strengths and weaknesses. Pathologists and GPs had lower scores for clinical assessment (3.33 and 3.34, respectively), and pathologists and radiologists had lower scores for patient management (3.31 and 3.39, respectively). Medicine and GPs scored highly (3.58) for professional development.

Feasibility

A total of 12,540 responses were received and 977 doctors had one or more questionnaires completed. The mean number of responses received per doctor was 12.0 (range 1–17). Only 76 doctors (7.9%) had more than 15 completed questionnaires, but 867 (90.0%) had more than 10 completed questionnaires. For the small subgroup of doctors where data were available (157 of 977 doctors), the mean time taken to complete an MSF exercise (i.e. to collect a mean of 12 responses) was 68.18 days (range 2–84 days). Seventy-three per cent of participants completed the questionnaire in 5 min or less.

In response to the question ‘How easy did you find the questionnaire to complete?’ (on a 4-point scale), there were 1611 responses (13% response rate). The majority were positive, with 85% of assessors responding ‘good’ and 11% responding ‘outstanding’.

Specialty comparison

The response rate per doctor was not significantly different between specialties, with the mean number of completed questionnaires ranging from 11.3 in ophthalmology to 12.7 in radiology (Table 2).

Table 2.  Average number of completed questionnaires per specialty and number of doctors with <2 responses per assessor group

Relevance

In response to the question ‘In your opinion how worthwhile is this method of assessment?’, there were 1631 responses (13% response rate). The majority of responses were positive, with 79% of assessors responding ‘good’ and 5.7% responding ‘outstanding’. A total of 5666 assessors (18.2%) and 348 assessees (25.4%) used the comment box to illustrate their ratings with examples of observable behaviour.

Specialty comparison: Item response rates and redundancy across specialties

Table 3 shows the proportion of assessors who responded to each item. Again, the pattern observed reflects the patterns of clinical practice and the limitations of the assessment method. Response rates were poorest for those activities where doctors are less often observed by their co-workers; this includes key areas such as ‘clinical assessment’ and ‘patient management’. This pattern is most obvious for doctors in specialties that do not involve direct clinical work (e.g. pathology) or whose clinical work is least often observed (e.g. general practice).

Table 3.  Percentage of assessors responding to an item

Reliability

The reliability estimated using Generalisability theory was similar to that of other validated MSFs (Table 4).

Table 4.  Estimation of reliability across the whole cohort

Specialty comparison

Reliability estimation for most specialties is highly speculative because it is based on small numbers of respondents. However, even for those specialties where the sample size is over 100 (general practice, medicine and psychiatry), there is substantial variation in reliability (Table 1).

Analysis of assessor groups

Mix of assessors

There was a high proportion of consultant assessors, particularly for pathology (43.0%), anaesthetics (38.0%) and radiology (34.8%). In contrast, GPs had a high proportion of clerical/managerial staff (35.2%) as assessors and very few assessors were junior medical staff or AHPs.

A significant number of doctors did not fulfil our recommendations for choosing assessors (Table 2). The pattern broadly reflects patterns of normal clinical practice. Fewer than half of the pathologists or radiologists were able to provide two nurse respondents because doctors in these specialties rarely work with nurses. Similarly, fewer than half of all doctors (and only a quarter of GPs) were able to provide two AHP assessors, but this was relatively easy for radiologists, for example, who work closely with radiographers.

Item response rates across assessor groups

Doctors and nurses had a high response rate for all questions. Interestingly, only approximately 50% of clerical/managerial staff felt they could accurately assess the clinical items (1 and 2) and the teaching item (5). AHPs also had low response rates to questions 1, 2 and 5. The qualitative data explain these lower response rates: ‘As I am a secretary I was unable to answer any questions regarding the Dr's clinical work although from comments from others I believe this to be good. I can only answer what I know.’ (Assessor)

Scores by assessor group

To explore whether specialty differences in the mix of assessors might affect MSF scores, we examined mean scores by assessor group. The different groups provided significantly different mean ratings, with progressively more junior medical trainees and student doctors giving more favourable scores (F = 6.8, p < 0.01; Figure 3).

Figure 3. Group mean score by assessor group.


Assessor group reliability

To explore whether specialty differences in the mix of assessors might affect the reliability of the MSF across specialties, we examined reliability by assessor group (Table 5). The data show that nurses are most likely to agree with each other about the differences between one assessee and another. However, we also want to promote truly ‘360°’ feedback; therefore, it is important that all assessor groups are represented. Not only is this intuitive, but the G-study data show that the four different assessor groups offer four slightly different perspectives on any given doctor, over and above the subjectivity of individual raters. (This ‘facet’ or condition – assessee × rater group – accounted for 4% of score variance.)

Table 5.  Reliability by assessor group

Discussion

The reason that MSF has been given such prominence in revalidation is that this method has been shown in many settings to produce reliable information on important qualities of a doctor (e.g. communication skills, professionalism and interpersonal skills) that are difficult to test by other means (Bracken et al. 2001). Moreover, these qualities are assessed in the context of the day-to-day practice of a doctor, therefore reflecting his/her true performance.

The DH has indicated that it may be desirable to have generic tools for career-grade doctors. This may aid implementation, as whole health regions and institutions would be able to use one system, thus using economies of scale to reduce the cost of commissioning and training. It also has the potential to allow comparison of performance between doctors, specialties or geographical areas, thus ensuring consistency of performance across these domains. However, there is little evidence to support or refute the use of a generic MSF tool.

This study demonstrates that it is feasible to administer an electronic MSF process across specialties. We set a target of at least 10 colleague responses for the MSF, and 90% of doctors were able to achieve this within a 3-month period. Most assessors and assessees who responded considered that the MSF questionnaire represented a relevant and useful tool to support revalidation. In addition, our results showed that a generic questionnaire is useful for the majority of specialties. However, a minority of specialties do not gain all the information required, as evidenced by poor response rates for some items in pathology and general practice. For some specialties, therefore, it may be appropriate to customise the MSF process at the specialty level so that the content of the questionnaire can be specified.

In this study, we gave strict guidance on the mix of assessors and incorporated this guidance into an electronic system. We have demonstrated that doctors of different specialties select a different mix of assessors despite this guidance.

To evaluate whether a different ‘assessor mix’ matters, we looked at the scores given by, and the reliability of, different assessor groups. A difference in scores given by different assessor groups has previously been described (Lipner et al. 2002; Crossley et al. 2006; Bullock et al. 2009) and this large study confirms those findings. Moreover, our data show that progressively more junior doctors give higher scores. We may therefore postulate that mean scores may be affected if there is a disproportionately high or low number of junior staff among the assessors.

We also found that there was a difference in reliability between different assessor groups. It is therefore possible that the observed specialty-related difference in the reliability of the assessment may be due, in part, to the mix of assessors used if a particular assessor group is disproportionately represented. For example, GPs had a much higher proportion of clerical/managerial raters (35% versus an average of 25%), with nursing staff under-represented (16% versus an average of 18%). This could be responsible, in part, for the poorer reliability of the MSF seen for GPs. We found that nurses were more reliable in their assessments, confirming a finding from a previous study in a UK healthcare setting (Whitehouse et al. 2005). Perhaps this is because they have the most valid opportunities to observe assessees in practice. Although nursing staff offer the most reliable scoring, it is also important to keep the ‘360°’ nature of an MSF exercise, as we also found that the different assessor groups offer consistently different perspectives on any given doctor.

Study strengths and limitations

The strengths of this study include the large study size, the use of a questionnaire specifically designed for generic administration, appropriate statistical evaluation, and the method of recruitment. By recruiting through trusts, we reduced the selection bias associated with using volunteers, as several large trusts nominated all their consultants, not just the motivated ones. Despite this, a significant number of recruited doctors did not contribute to the study, and therefore there is a risk that, even with such a large study, the population may be to some extent self-selected and not truly representative.

A limitation was that some demographic data known to affect scores, such as sex, the length of the assessor's relationship with the doctor being assessed (Sargeant et al. 2003) and the environment in which that relationship took place (Archer et al. 2005; Davies & Archer 2005), were not collected. Therefore, we were unable to control for these known confounders. Similarly, no data were collected on the assessor groups other than their designation, which could potentially bias our results.

Another limitation was that the reliability estimates for the colleague questionnaire indicated that 15 assessors were required for acceptable reliability. We found in this study that attaining 10 responses was possible for the vast majority of doctors, but few were able to attain 15 responses. This was in part because our system was set to conclude the process once 10 responses were received. We have since changed this to require 15 responses.

Future work

Since MSF has been given a high profile in revalidation, it is imperative that we understand how best to utilise this method to gain accurate, reliable results. Thus, further work to establish the factors that may affect scores is required. Of equal importance is the need to establish how the results of MSF should be used in the context of revalidation.

MSF was developed as a formative tool to aid personal development and there is evidence beginning to emerge from North America that, under certain conditions, MSF can induce behavioural change (Sargeant et al. 2003). Further, it is now recognised that doctors who participate in voluntary recertification in the US, which includes MSF, have better patient outcomes (Norcini et al. 2000; Holmboe et al. 2008). This is an area for further work but, if proven, will cement MSF as a key tool in appraisal.

Conclusion

While a generic instrument is feasible, it may not be the best strategy for some specialties. If a generic instrument were to be used, we have shown that certain specialty-specific conditions need to be in place, for example strict guidelines on the mix of assessors, to ensure a standardised and fair approach. The content of the questionnaire may also have to be tailored to the specialty. Given the differences in the utility of a generic questionnaire between specialties shown in this study, comparison of scores across specialties would not be appropriate. This may be a disadvantage for regulatory bodies, but if MSF is to be used as an educational tool to induce desired behavioural change, cross-specialty comparison is unnecessary.

Details of ethical approval: The Local Research and Ethics Committee (LREC) advised that ethics approval was not required.

Details of funding: Funding for this study was received from the Department of Health, the Academy of Medical Royal Colleges and the General Medical Council.

Declaration of interest: While the authors may be financed or employed by the funders, no financial incentive was given or received for writing this article.

References

  • Archer JC, Norcini J, Davies HA. Use of SPRAT for peer review of paediatricians in training. Br Med J 2005; 330: 1251–1253
  • Bennett H, Gatrell J, Packham R. Medical appraisal: Collecting evidence of performance through 360° feedback. Clin Manage 2004; 12: 1–7
  • Bracken DW, Timmreck CW, Church AH. The handbook of multisource feedback: The comprehensive resource for designing and implementing MSF processes. Jossey-Bass, San Francisco, CA 2001
  • Bullock AD, Hassell A, Markham WA, Wall DW, Whitehouse AB. How ratings vary by staff group in multi-source feedback assessment of junior doctors. Med Educ 2009; 43(6)516–520
  • Campbell JL, Richards SH, Dickens A, Greco M, Narayanan A, Brearley S. Assessing the professional performance of UK doctors: An evaluation of the utility of the General Medical Council's patient and colleague questionnaires. Qual Saf Health Care 2008; 17: 187–193
  • Crossley J, McDonnell J, Cooper C, McAvoy P, Archer J, Davies H. Can a district hospital assess its doctors for re-licensure?. Med Educ 2008; 42(4)359–363
  • Crossley J, Wilkinson J, Davies H, Wade W, Archer J, McAvoy P, Putting the 360 into multi source feedback. Abstract presented at The Association for the Study of Medical Education's conference on multi source feedback. London, UK: 13 December 2006
  • Davies HA, Archer JC. Multi-source feedback: Development and practical aspects. Clin Teach 2005; 2: 77–81
  • Davies H, Archer J, Bateman A, Dewar S, Crossley J, Grant J, Southgate L. Specialty specific multisource feedback – Assuring validity, informing training. Med Educ 2008; 42: 1014–1020
  • Department of Health 2007. Trust, assurance and safety – The regulation of health professionals in the 21st century. London, UK: Department of Health.
  • Elwyn G, Lewis M, Evans R, Hutchings H. Using a ‘Peer Assessment Questionnaire’ in primary medical care. Br J Gen Pract 2005; 55(518)690–695
  • General Medical Council. Good medical practice. GMC, London 2006
  • General Medical Council 2010. Revalidation: The way ahead. GMC consultation document. London: GMC. Available from http://www.gmc-uk.org/doctors/licensing/5794.asp.
  • Griffin E, Sanders C, Craven D, King J. A computerised 360° feedback tool for personal and organisational development in general practice. Health Inf J 2000; 6: 71–80
  • Hesketh EA, Anderson F, Bagnall GM, Driver CP, Johnston DA, Marshall D, Needham G, Orr G, Walker K. Using a 360° diagnostic screening tool to provide an evidence trail of junior doctors performance throughout their first postgraduate year. Med Teach 2005; 27(3)219–233
  • Holmboe ES, Wang Y, Meehan TP, Tate JP, Ho SY, Starkey KS, Lipner RS. Association between maintenance of certification examination scores and quality of care for medicare beneficiaries. Arch Intern Med 2008; 168(13)1396–1403
  • Irvine D. Maintaining good medical practice. General Medical Council, London, UK 1998
  • Lelliot P, Williams R, Mears A, Andiappan M, Owen H, Reading P, Coyle N, Hunter S. Questionnaires for 360-degree assessment of consultant psychiatrists: Development and psychometric properties. Br J Psychiatry 2008; 193: 156–160
  • Lipner RS, Blank LL, Leas BF, Fortna GS. The value of patient and peer ratings in recertification. Acad Med 2002; 77(10)S64–S66
  • Lockyer J. Multi-source feedback in the assessment of physicians competencies. J Contin Educ Health Prof 2003; 23: 4–12
  • Lockyer JM, Violato C. An examination of the appropriateness of using a common peer assessment instrument to assess physicians skills across specialities. Acad Med 2004; 79(10)S5–S8
  • Mackillop LH, Armitage M, Wade W. Collaborating with patients and carers to develop a patient survey to support consultant appraisal and revalidation. Clin Manage 2006; 14: 89–94
  • Mason R, Chaudhry N, Hartley E, Ayers B. Developing an effective system of 360-degree appraisal for consultants: Results of a pilot study. Clin Governance Bull 2003; 4: 11–12
  • Murphy D, Bruce D, Mercer S, Eva K. The reliability of work-based assessment in postgraduate medical education and training: A national evaluation in general practice in the United Kingdom. Adv Health Sci Educ Theory Pract 2008; 14(2)219–232
  • Norcini JJ, Kimball HR, Lipner RS. Certification and specialization: Do they matter in the outcome of acute myocardial infarction?. Acad Med 2000; 75(12)1193–1198
  • Ramsay PG, Wenrich MD, Carline JD, Inui TS, Larson EB, LoGerfo JP. Use of peer ratings to evaluate physicians performance. J Am Med Assoc 1993; 269: 1655–1660
  • Sargeant JM, Mann KV, Ferrier SN, Langille DB, Muirhead PD, Hayes VM, Sinclair DE. Responses of rural family physicians and their colleagues and co-worker raters to multi-source feedback process. Acad Med 2003; 78(10)S42–S44
  • The Royal College of Radiologists. 360° Appraisal – Good practice for radiologists. The Royal College of Radiologists, London 2004
  • Thomas P, Gebo K, Hellmann D. A pilot study of peer review in residency training. J Gen Intern Med 1999; 14: 551–554
  • Violato C, Lockyer JM, Fidler H. Assessment of pediatricians by a regulatory authority. Pediatrics 2006; 117(3)796–802
  • Wenrich MD, Carline JD, Giles LM, Ramsay PG. Ratings of the performances of practicing internists by hospital-based registered nurses. Acad Med 1993; 68: 680–687
  • Whitehouse A, Hassell A, Wood L, Wall D, Walzman M, Campbell I. Development and reliability testing of TAB a form for 360° assessment of senior house officers’ professional behaviour, as specified by the General Medical Council. Med Teach 2005; 27(3)252–258
  • Whitehouse A, Walzman M, Wall D. Pilot study of 360 degrees assessment of personal skills to inform record of in training assessments for senior house officers. Hosp Med 2002; 63: 172–175
  • Wilkinson J, Crossley J, Wragg A, Mill P, Cowan G, Wade W. Implementing workplace-based assessment across the medical specialties in the United Kingdom. Med Educ 2008; 42: 364–373
  • Wilkinson J, Wade W. New methods of performance assessment for trainees. R Coll Pathol Bull 2005; 132: 12–15

Appendix
