The development of an instrument to assess clinical teaching with linkage to CanMEDS roles: A psychometric analysis

Pages e290-e296 | Published online: 24 May 2011

Abstract

Background: Assessment of clinical teaching by learners is of value to teachers, department heads, and program directors, and must be comprehensive and feasible.

Aims: To review published evaluation instruments with psychometric evaluations and to develop and psychometrically evaluate an instrument for assessing clinical teaching with linkages to the CanMEDS roles.

Method: We developed a 19-item questionnaire to reflect 10 domains relevant to teaching and the CanMEDS roles. A total of 317 medical learners assessed 170 instructors: 14 (4.4%) clinical clerks, 229 (72.3%) residents, and 53 (16.7%) fellows; 21 (6.6%) did not specify their position.

Results: A mean of eight raters assessed each instructor. The internal consistency reliability of the 19-item instrument was Cronbach's α = 0.95, and the generalizability analysis yielded a coefficient of Eρ² = 0.95. The factor analysis showed three factors that accounted for 67.97% of the total variance. The three factors, with the variance each accounted for and their internal consistency reliability, are Teaching skills (variance = 53.25%; Cronbach's α = 0.92), Patient interaction (variance = 8.56%; Cronbach's α = 0.91), and Professionalism (variance = 6.16%; Cronbach's α = 0.86). The three factors are intercorrelated (correlations = 0.48, 0.58, 0.46; p < 0.01).

Conclusion: It is feasible to assess clinical teaching with this 19-item instrument, which has demonstrated evidence of both validity and reliability.

Introduction

The challenge of how to systematically evaluate clinical teaching has been the subject of much discussion among educators, researchers, and administrators in medical schools concerned with the quality of clinical teaching. Many evaluation instruments have been developed to provide feedback to clinical teachers, department heads, and program directors, and considerable effort has been devoted to investigating the psychometric properties of instruments for assessing clinical teaching effectiveness (Ramsbottom-Lucier et al. 1994; Solomon et al. 1997; Bardes et al. 1998; Litzelman et al. 1998, 1999; Marriott & Litzelman 1998; Copeland & Hewson 1999; Steiner et al. 2000; Williams et al. 2001, 2002; Snell et al. 2002; van der Hem-Stokroos et al. 2003; Beckman et al. 2004, 2005; Leamon & Fields 2005; Bierer & Hull 2007).

Specialty medical education in Canada has undergone a significant change since the adoption of the CanMEDS 2000 Project, first published in 1996 as the Skills for the New Millennium report of the Societal Needs Working Group. Although the CanMEDS competencies are now accepted among educators involved with residency program administration, they are still not clearly understood by many teachers in clinical medicine (Frank 2005).

The evaluation of clinical teaching is a much-studied topic in medical education, as the results may have significant implications for both clinical teachers and program administrators. Teachers, learners, program administrators, and the health care system, as well as patients, can be expected to gain from improved clinical teaching (Snell et al. 2002). Its evaluation is therefore both interesting and important.

A common type of teacher evaluation is the administration of a standardized form on which learners rate teachers at the conclusion of clinical rotations. Over the years, medical educators have reported on the development of these forms (Irby & Rakestraw 1981; Litzelman et al. 1998; Copeland & Hewson 1999, 2000; Beckman et al. 2004). Snell et al. (2002) reviewed the research on the evaluation of clinical teaching and outlined a set of guiding principles to broaden current perspectives on evaluation: validity, reliability, efficiency, and feasibility. They also indicated that evaluation instruments should be consistent with the culture of the organization, acceptable to teachers, easy to administer, and applicable to all levels of teachers.

Drawing from previous research (Irby & Rakestraw 1981; Irby 1983; Guyatt et al. 1993; Cohen et al. 1996; James & Osborne 1999; Copeland & Hewson 2000) and relevant articles on effective clinical teaching (Parsell & Bligh 2001; Stark 2003; Sutkin et al. 2008; Pratt et al. 2009), our objective in this study was to review existing clinical teaching assessment instruments and then to develop and test a new clinical teaching assessment instrument (CTAI) that could be applied broadly across all clinical departments to assess clinical teaching contributions and provide feedback to clinical teachers, their department heads, and program directors.

The pilot study imposed certain conditions on the development of the instrument: (1) the instrument must be comprehensive, with evidence of validity and reliability; (2) it must be feasible, that is, workable, practical, and achievable; and (3) it must link to the CanMEDS roles.

Methods

Our first step was to identify and review existing clinical education evaluation instruments. We identified 12 published studies of teaching assessment instruments with associated research on reliability and validity. These are summarized in Table 1.

Table 1. Summary of psychometric characteristics of 12 clinical teaching assessment instruments.

While some of the instruments had adequate reliability, each had limitations when considered for widespread use, owing either to a narrow focus (specific to one medical discipline) or to a lack of feasibility given the time required to complete the instrument. One of our goals was to create an instrument that would be useful for all postgraduate medical specialty programs, as well as family medicine, and that could be completed in less than 5 min.

As can be seen in Table 1, we summarized the relevant research features of each study (i.e., sample type and size, instrument item type) and its psychometric characteristics (i.e., reliability, validity). While most instruments had adequate internal consistency reliability, few studies reported generalizability coefficients or validity data beyond factor analysis. A notable exception was the Steiner et al. (2000) study, which employed a multitrait, multimethod approach to construct validity.

The Beckman et al. (2004) review of the published data on the reliability of instruments points to a lack of articles that focus on examining the psychometric characteristics of scales used to assess clinical teachers. Like our summary in Table 1, they found that factor analysis and Cronbach's α were the most commonly reported statistical methods, while less common methods such as test–retest reliability, interrater reliability, and convergent validity between instruments were rarely reported.

While Beckman et al. (2004) also summarized themes that could help to develop a general evaluation instrument, they argued that the unique teaching cultures at most schools limit the generalizability of even the most carefully designed instruments and that future studies should focus on more narrowly defined populations.

In subsequent work, Beckman et al. (2005) emphasized the need to use consistent validity criteria for teaching assessment instruments. They noted, as we have, that most validity studies have focused on content and internal-structure evidence (e.g., factor analysis) and less on criterion-related and construct validity (e.g., multitrait–multimethod analysis).

Instrument development

After reviewing the instruments detailed in Table 1, many of which had items in common, we considered the CanMEDS (1996) framework as well as The College of Family Physicians of Canada Standards for Accreditation of Residency Training Programs (2006), and designed a 19-item questionnaire, rated on a five-point scale (1 = never to 5 = always; Table 2), to reflect 10 teacher domains relevant to the CanMEDS framework. The 10 domains were clear/organized, enthusiastic/stimulating, establishes rapport with learners, actively involves learners, knowledgeable/analytical, provides direction and feedback, approachable, respect/professional behavior, communication, and health advocacy. The CanMEDS roles of manager, communicator, professional, medical expert, scholar, collaborator, and health advocate were linked to each question; the item linkages to the CanMEDS roles are shown in Table 2. We called this instrument the CTAI.

Table 2. Descriptive statistics for the 19 items of the clinical teaching assessment instrument and CanMEDS roles (n = 317).

Table 3. Principal components rotated to the varimax criterion with Kaiser normalization.

Procedures

All postgraduate medical education programs in the Faculty of Medicine at the University of Calgary were invited to participate. Participating departments included Adult Cardiology, Adult Hematology, Adult Infectious Diseases, Adult Respirology, Anesthesia, Community Medicine, Emergency Medicine – CCFP, Emergency Medicine – FRCP, Family Medicine, Internal Medicine, Obstetrics and Gynecology, Pediatrics, Pediatric subspecialties, Radiation Oncology, Radiology, Surgery, and Surgical subspecialties. Each clinical program director assisted in administering the instrument and was asked to distribute the questionnaires at an educational activity. Participants were assured of the anonymity of their responses, which were collected in sealed envelopes and delivered for data analysis.

Analyses

A number of statistical analyses were undertaken to address the reliability and validity of the instrument. Response rates were used to determine feasibility; therefore, for each item on the CTAI, the percentage of "not applicable" responses was computed along with the mean and standard deviation. We propose that items for which "not applicable" exceeds 20% may be in need of revision or deletion, as learners do not see them as evaluable behaviors.
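
For readers who want to reproduce this kind of item-level screen, it amounts to a simple per-item summary. The sketch below is illustrative only: it assumes a hypothetical pandas DataFrame `ratings` with one row per completed questionnaire and one column per CTAI item, with "not applicable" responses coded as missing values. The 20% cutoff mirrors the rule proposed above; the names are ours, not the authors'.

```python
import pandas as pd

def item_analysis(ratings: pd.DataFrame, na_cutoff: float = 0.20) -> pd.DataFrame:
    """Per-item mean, SD, and 'not applicable' rate on a 1-5 rating scale."""
    summary = pd.DataFrame({
        "mean": ratings.mean(),           # item mean over raters who responded
        "sd": ratings.std(),              # item standard deviation
        "pct_na": ratings.isna().mean(),  # fraction of 'not applicable' responses
    })
    # Flag items that learners may not see as evaluable behaviors.
    summary["flag_for_revision"] = summary["pct_na"] > na_cutoff
    return summary
```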

Instrument internal consistency reliability was examined by calculating the Cronbach's α coefficient. This analysis was followed by a generalizability analysis (G-study) for a single-facet, nested design to determine the generalizability coefficient (Eρ²) and thus ensure that there were sufficient numbers of items and raters to provide stable data for each individual instructor. Normally, an Eρ² ≥ 0.70 suggests that data are stable; a low Eρ² suggests that more raters or more items would be required to enhance stability.
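
A minimal sketch of these two reliability computations follows, under stated assumptions: Cronbach's α uses the standard formula, and the generalizability coefficient is estimated from one-way random-effects ANOVA variance components, one common way to implement a single-facet design with raters nested within instructors. The authors do not report their exact G-study specification or software, so this is an illustration rather than their implementation.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_raters x k_items) score matrix (complete cases)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def g_coefficient(scores_by_instructor: list[np.ndarray]) -> float:
    """Generalizability coefficient (E-rho-squared) for raters nested within
    instructors; expects one array of rater-level overall scores per instructor."""
    groups = [np.asarray(g, dtype=float) for g in scores_by_instructor]
    ns = np.array([len(g) for g in groups])
    grand = np.concatenate(groups).mean()
    ss_between = sum(n * (g.mean() - grand) ** 2 for n, g in zip(ns, groups))
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_b, df_w = len(groups) - 1, ns.sum() - len(groups)
    ms_b, ms_w = ss_between / df_b, ss_within / df_w
    n0 = (ns.sum() - (ns ** 2).sum() / ns.sum()) / df_b  # adjusted mean group size
    var_p = max((ms_b - ms_w) / n0, 0.0)                 # instructor variance component
    n_bar = ns.mean()                                    # ~8 raters per instructor here
    return var_p / (var_p + ms_w / n_bar)
```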

We used exploratory factor analysis to determine which items belonged together (i.e., formed a "factor" or "scale"). This analysis allowed us to identify the factors and to describe the relative variance accounted for by each factor and their coherence. Using individual rater data as the unit of analysis, the items were intercorrelated using Pearson product-moment correlations. The correlation matrix was then decomposed into principal components, which were subsequently rotated to the normalized varimax criterion. Items were considered to be part of a factor if their primary loading was on that factor. The number of factors to extract was determined by the Kaiser rule (i.e., eigenvalues > 1.0).
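
This pipeline can likewise be sketched in a few steps, assuming a complete (raters × 19 items) score matrix with "not applicable" responses already removed or imputed: Pearson intercorrelations, principal components retained by the Kaiser rule, and a normalized (Kaiser) varimax rotation. The rotation routine below is a textbook implementation, not the authors' code.

```python
import numpy as np

def varimax(loadings: np.ndarray, tol: float = 1e-6, max_iter: int = 100) -> np.ndarray:
    """Varimax rotation of a loading matrix with Kaiser row normalization."""
    h = np.sqrt((loadings ** 2).sum(axis=1, keepdims=True))  # sqrt communalities
    A = loadings / h                                         # Kaiser-normalize rows
    p, k = A.shape
    R = np.eye(k)
    d_old = 0.0
    for _ in range(max_iter):
        L = A @ R
        u, s, vt = np.linalg.svd(
            A.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p))
        R = u @ vt
        d = s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break
        d_old = d
    return (A @ R) * h                                       # undo normalization

def pca_factors(ratings: np.ndarray) -> np.ndarray:
    """Rotated loadings of the principal components retained by the Kaiser rule."""
    corr = np.corrcoef(ratings, rowvar=False)  # Pearson item intercorrelations
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                       # Kaiser rule: eigenvalues > 1
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return varimax(loadings)
```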

The factors or scales established through exploratory factor analysis were used to identify the key domains (e.g., CanMEDS roles), while the items within each factor provide more precise information about specific behaviors (e.g., organized time to allow for teaching and care giving, interacted effectively with patients and their families). This analysis makes it possible to determine whether the instrument items aligned with the intended constructs, CanMEDS roles, and factors.

This study received approval from the Conjoint Health Research Ethics Board of the University of Calgary.

Results

A total of 317 clinical clerks, residents, and fellows assessed 170 instructors. Of these, 14 (4.4%) were clinical clerks, 229 (72.3%) were residents, 53 (16.7%) were fellows, and 21 (6.6%) did not specify their position. When asked their degree of involvement with the clinical teacher during the current rotation, 28 (8.8%) reported slight involvement, 84 (26.5%) moderate involvement, 113 (35.7%) considerable involvement, and 73 (23%) extensive involvement; 19 (6%) did not report their degree of involvement. A mean of eight raters assessed each instructor. When the instrument was presented to postgraduate medical education program directors for acceptability, 18 program directors from a variety of medical and surgical specialties and subspecialties, including Family Medicine, found it acceptable and volunteered their residents to participate in this pilot study.

The analyses for the CTAI are reported under two headings.

Item analyses, reliability, and generalizability

The item analyses are summarized in Table 2. Item means for the 19 items ranged from 3.80 (SD = 1.07) to 4.58 (SD = 0.72), with an overall mean of 4.24 (SD = 0.83). The distributional properties of the items on the CTAI are typical of five-point items. No item had a "not applicable" level greater than 20%, though one item (Was well prepared for teaching sessions) had NA = 16.4%. When we removed the 28 (9.4%) raters who reported only slight involvement with the clinical teacher, the results in Table 2 did not change.

The Cronbach's α coefficient for the total CTAI scale was 0.95, with a mean standard error of measurement (SEM) across all items of 0.19. The CTAI, therefore, has very high internal consistency reliability and produces very small errors of measurement. The generalizability analysis provided an Eρ² = 0.95, suggesting that the data provided to individual instructors were stable.
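
As a check on the arithmetic, the reported SEM is consistent with the classical relation between a scale's SD and its reliability, using the overall SD of 0.83 reported above (the authors do not state which SD entered their computation):

$$\mathrm{SEM} = \mathrm{SD}\sqrt{1 - \alpha} \approx 0.83\sqrt{1 - 0.95} \approx 0.19$$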

Factor analysis

The factor analysis showed that the data decomposed into three factors that accounted for 67.97% of the total variance (Table 3). The varimax rotation converged in seven iterations.

The factor loadings for each item on each factor are given in Table 3, with the large loadings bolded in square brackets. From this pattern of loadings, it is evident that the three factors, together with the variance each accounted for and their internal consistency reliability, are Teaching skills (variance = 53.25%; Cronbach's α = 0.92), Patient interaction (variance = 8.56%; Cronbach's α = 0.91), and Professionalism (variance = 6.16%; Cronbach's α = 0.86). The three factors are intercorrelated (correlations = 0.48, 0.58, 0.46; p < 0.01).

Discussion

The main findings of this study are: (1) a brief, 19-item instrument was developed to assess clinical teaching based on the CanMEDS roles; (2) the CTAI had high internal consistency reliability (Cronbach's α), as did each of the factors; (3) the CTAI had a high Eρ² coefficient; and (4) factor analysis of the CTAI resulted in three theoretically meaningful, cohesive factors that accounted for nearly 70% of the total variance.

To create a brief instrument, we employed only 19 items. To select the items carefully, we surveyed a large number of published studies that reported on clinical teaching instruments. Our items were then carefully matched to the CanMEDS roles, which are considered important physician characteristics. These procedures enhanced the face and content validity of the CTAI.

Like other instruments (Irby & Rakestraw 1981; Irby 1983; Litzelman et al. 1998, 1999), the CTAI has high internal consistency reliability (Cronbach's α) and small SEMs. Additionally, we were able to show that with as few as eight raters, the scores on the CTAI are stable, with an Eρ² = 0.95. Both the internal consistency and the generalizability across raters, then, are very high.

Beyond the face and content validity evidence, we examined the internal structure of the CTAI using factor analysis and were able to derive three cohesive, theoretically meaningful factors: Teaching skills, Patient Interaction, and Professionalism.

Teaching skills are central to any teaching assessment instrument (Beran & Violato 2009). Consistent with this, the following items loaded on this factor: (1) Was well prepared for teaching sessions, (2) Organized time to allow for teaching and care giving, (3) Was stimulating, (5) Coached me on my clinical reasoning or technical skills, (6) Encouraged me to ask questions, (7) Incorporated research data or practice guidelines into teaching, (8) Emphasized a problem-solving approach rather than solutions, (9) Stimulated me to learn independently, (10) Clearly specified what I was expected to know and do during this rotation, and (11) Offered feedback. The following items instead loaded on Patient interaction: (14) Demonstrated compassionate patient-centered care, (15) Interacted effectively with patients and their families, (16) Answered questions clearly, (17) Taught effective patient/family communication skills, (18) Pointed out opportunities for health advocacy, and (19) Responded to individual patient health needs as part of patient care. Finally, the remaining three items loaded on Professionalism: (4) Fostered an environment of respect in which I felt comfortable participating, (12) Was approachable for discussion, and (13) Treated team members in a professional manner. The clustering of these items, as well as the pattern of loadings summarized in Table 3, provides both convergent and divergent validity evidence for the CTAI.

The results of this study support our goal of developing a brief instrument that is efficient and feasible and for which there is evidence of validity and reliability. The CTAI is simple to administer, takes minimal time for residents to complete, and is generally consistent with the culture of medical education at most universities. It also highlights the importance of the CanMEDS roles for teachers and is expected to contribute to the continued dissemination of the CanMEDS roles into the medical teaching community at large.

There are several limitations to this study. In order to receive forthright data, it was important to ensure the anonymity of resident responses; it was therefore not possible to link repeat evaluations of the same physicians by the same learners to evaluate intra-observer reliability. Moreover, the instrument may be of limited value for learners who have spent little time with the preceptor/teacher, although the psychometric properties of the instrument did not change when we removed these low-involvement learners from the analyses. Finally, the data from this study come from a single institution, and it will be valuable to collect data at other medical centers to study the generalizability of our findings.

Summary and conclusions

The focus of this study was to investigate the reliability and validity of the CTAI. The psychometric results provided evidence of both. Nonetheless, we were not able to assess how teachers responded to feedback from the instrument; it will be useful to create individual faculty reports and study the effect that feedback has on subsequent teacher behavior. Finally, future research may well focus on the relationship between scores on the CTAI and student learning. Meanwhile, the CTAI is a brief, simple-to-administer instrument with evidence of reliability and validity.

Declaration of interest: The authors report no declarations of interest.

References

  • Bardes CL, Hayes JG, Falcone DJ, Hajjar DP, Alonso DR. Measuring teaching: A relative value scale in teaching. Teach Learn Med 1998; 10(1)40–43
  • Beckman TJ, Cook DA, Mandrekar JN. What is the validity evidence for assessments of clinical teaching?ature J Gen Intern Med 2005; 20(12)1159–1164
  • Beckman TJ, Ghosh AK, Cook DA, Erwin PJ, Mandrekar JN. How reliable are assessments of teaching? A review of the published instruments. J Gen Intern Med 2004; 19(9)971–977
  • Beran T, Violato C. Student ratings of teaching effectiveness: Student engagement and course characteristics. Can J High Educ 2009; 39(1)1–13
  • Bierer SB, Hull AL. Examination of a clinical teaching effectiveness instrument used for summative faculty assessment. Eval Health Prof 2007; 30(4)339–361
  • CanMEDS. 1996. Skills for the new millennium: Report of the societal needs working group, CanMEDS 2000 Project. Annals Royal College of Physicians and Surgeons of Canada. 29:206–216. Available from: http://rcpsc.medical.org/canmeds.
  • Cohen R, MacRae H, Jamieson C. Teaching effectiveness of surgeons. Am J Surg 1996; 171(6)612–614
  • Copeland HL, Hewson M. Clinical teaching effectiveness instrument: Development and psychometric testing. Presented at the American Educational Research Association Annual Conference, 1999 April, Montreal
  • Copeland HL, Hewson MG. Developing and testing an instrument to measure the effectiveness of clinical teaching in an academic medical centre. Acad Med 2000; 75(2)161–166
  • Frank JR, ed. 2005. The CanMEDS 2005 physician competency framework: Better standards. Better physicians. Better care. Ottawa: The Royal College of Physicians and Surgeons of Canada
  • Guyatt GH, Nishikawa J, Willan A, McIlroy W, Cook D, Gibson J, Kerigan A, Neville A. A measurement process for evaluating clinical teachers in internal medicine. Can Med Assoc J 1993; 149(8)1097–1102
  • Irby DM. Evaluating instruction in medical education. J Med Educ 1983; 58(11)844–849
  • Irby DM, Rakestraw P. Evaluating clinical teaching in medicine. J Med Educ 1981; 56(3)181–186
  • James PA, Osborne JW. A measure of medical instructional quality in ambulatory settings: The MedIQ. Fam Med 1999; 31(4)263–269
  • Leamon MH, Fields L. Measuring teaching effectiveness in a pre-clinical multi-instructor course: A case study in the development and application of a brief instructor rating scale. Teach Learn Med 2005; 17(2)119–129
  • Litzelman DK, Stratos GA, Marriott DJ, Skeff KM. Factorial validation of a widely disseminated educational framework for evaluating clinical teachers. Acad Med 1998; 73(6)688–695
  • Litzelman DK, Westmoreland GR, Skeff KM, Stratos GA. Factorial validation of an educational framework using residents’ evaluations of clinician-educators. Acad Med 1999; 74(Suppl. 10)S25–S27
  • Magill MK, McClure C, Commerford K. A system for evaluating teaching in the ambulatory setting. Fam Med 1986; 18(3)173–174
  • Marriott DJ, Litzelman DK. Teaching the teachers: Is it effective? Students’ global assessments of clinical teachers: A reliable and valid measure of teaching effectiveness. Acad Med 1998; 73(10)S72–S74
  • McLeod PJ, James CA, Abrahamowicz M. Clinical tutor evaluation: A 5-year study by students on an in-patient service and residents in an ambulatory care clinic. Med Educ 1993; 27: 48–53
  • Parsell G, Bligh J. Recent perspectives on clinical teaching. Med Educ 2001; 35(4)409–414
  • Pratt DD, Harris P, Collins JB. The power of one: Looking beyond the teacher in clinical instruction. Med Teach 2009; 31(2)133–137
  • Ramsbottom-Lucier MT, Gillmore GM, Irby DM, Ramsey PG. Evaluation of clinical teaching by general internal medicine faculty in outpatient and inpatient settings. Acad Med 1994; 69(2)152–154
  • Snell L, Tallett S, Haist S, Hays R, Norcini J, Prince K, Rothman A, Rowe R. A review of the evaluation of clinical teaching: New perspectives and challenges. Papers from the 9th Cambridge Conference. Med Educ 2002; 34(10)862–870
  • Solomon DJ, Speer AJ, Rosebraugh CJ, DiPette DJ. The reliability of medical student ratings of clinical teaching. Eval Health Prof 1997; 20(3)343–352
  • Stark P. Teaching and learning in the clinical setting: A qualitative study of the perceptions of students and teachers. Med Educ 2003; 37(11)975–982
  • Steiner IP, Franc-Law J, Kelly KD, Rowe BH. Faculty evaluation by residents in an emergency medicine program: A new evaluation instrument. Acad Emerg Med 2000; 7(9)1015–1021
  • Sutkin G, Wagner E, Harris I, Schiffer R. What makes a good clinical teacher in medicine? A review of the literature. Acad Med 2008; 83(5)452–466
  • The College of Family Physicians of Canada. Standards for accreditation of residency training programs. (2006). ISBN 1-896014-51-8.
  • van der Hem-Stokroos HH, Daelmans HE, van der Vleuten CP, Haarman HJ, Scherpbier AJ. A qualitative study of constructive clinical learning experiences. Med Teach 2003; 25(2)120–126
  • Williams BC, Litzelman DK, Babbott SF, Lubitz RM, Hofer TP. Validation of a global measure of faculty's clinical teaching performance. Acad Med 2002; 77(2)177–180
  • Williams BC, Pillsbury MS, Stern DT, Grum CM. Comparison of resident and medical student evaluation of faculty teaching. Eval Health Prof 2001; 24(1)53–60
