
Evaluating the effectiveness of undergraduate clinical education programs

Article: 1757883 | Received 28 Jan 2020, Accepted 09 Apr 2020, Published online: 30 Apr 2020

ABSTRACT

Medical schools should use a variety of measures to evaluate the effectiveness of their clinical curricula. Both outcome measures and process measures should be included, and these can be organized according to the four-level training evaluation model developed by Donald Kirkpatrick. Managing evaluation data requires the institution to employ deliberate strategies to monitor signals in real time and to aggregate data so that informed decisions can be made. Future steps in program evaluation include an increased emphasis on patient outcomes and multi-source feedback, as well as better integration of existing data sources.

Introduction

Undergraduate medical education programs are charged with evaluating learner activities during clerkship experiences and the outcomes of their learning in order to determine program effectiveness. Program evaluation should involve a three-pronged approach that includes baseline measurements (pre-clerkship), process measurements (activities of learners during the clerkship), and outcome measurements (assessment of learning products or end points) [Citation1]. Many of these measures are defined by the Liaison Committee on Medical Education (LCME) in its various educational standards but are not aggregated in a succinct resource to facilitate this process. In addition, the LCME documents do not explicitly categorize these concepts in terms of the types of measures being used (e.g., process, outcome). We believe that by aggregating these standards and by applying existing educational frameworks, we can improve the effectiveness of program evaluation for the highly complex clinical training environment.

LCME Element 1.1 requires medical schools to engage in strategic planning and continuous quality improvement (CQI) processes that establish short- and long-term programmatic goals, result in the achievement of measurable outcomes that are used to improve programmatic quality, and ensure effective monitoring of the medical education program’s compliance with accreditation standards [Citation2]. A robust approach to program evaluation can help ensure sufficient attention is given to critical measures to avoid a ‘severe action decision’ by the LCME [Citation3]. To meet these goals, we are presenting a summary of the data to be included in the program evaluation process and a discussion of strategies to be used in collecting and reviewing that data. We believe that having a comprehensive and succinct list of process and outcome measures will allow faculty and administrators to more effectively monitor, assess, and evaluate the quality of their educational programs as part of the CQI process. In this paper, we propose a set of guidelines or best practices that can be used by all parties responsible for program evaluation to identify essential data sources, as well as mechanisms to access, monitor, and analyze data to determine program effectiveness.

Types of measures

Data to evaluate program effectiveness can broadly be grouped into process measures and outcome measures. Process measures focus on aspects of program and curriculum delivery, such as logistics of how teaching occurs, how courses are organized, and the types of patient encounters required in the curriculum. These measures may be granular (e.g., the number of duty hour violations in a clerkship) or more broad (e.g., how formative feedback is provided), but all evaluate interim steps or components in the learning process, not the result of the process. Outcome measures, in contrast, evaluate if learning occurred, particularly whether student- and program-level objectives and targets were met. These can also be granular (e.g., passing rate on standardized tests) or broad (e.g., successful transition to residency). These categories of process and outcome measures are not strictly defined, but are important constructs to consider in the design of evaluation strategies and the selection of measures for determining program effectiveness. By viewing program evaluation through this lens of process and outcome measures, stakeholders can ensure they are considering program evaluation broadly and can better prioritize different types of measures. Since outcome measures focus on the end products of learning, these measures should be weighed more heavily than process measures, though both are important.

Another model that is useful to consider is the Kirkpatrick model of evaluation [Citation4–Citation8]. First proposed by Donald Kirkpatrick in the 1950s [Citation9], this model includes four levels of outcomes for a training program: reaction, learning, behavior, and results. Since then, the framework has been expanded and revised during its extensive use. Its most recent iteration, the New World Kirkpatrick Model [Citation4], expands on the original four-level model based on the effect on learning outcomes:

  • Level 1: Reaction. This includes learner satisfaction, engagement, and relevance.

  • Level 2: Learning. This includes changes in knowledge, skills, attitudes, confidence, and commitment.

  • Level 3: Behavior. This includes the application of what was learned and change in learner behavior.

  • Level 4: Results. This includes the achievement of outcomes and indicators of progress towards those outcomes.

This model can be very useful in evaluating clinical program effectiveness. For example, course evaluation ratings would be considered Kirkpatrick level 1 and measures of medical knowledge on standardized tests would be Kirkpatrick level 2. Applying the Kirkpatrick framework to measures of program effectiveness can help stakeholders prioritize different measures. For example, measures of behavior change are more meaningful than measures which simply reflect learner satisfaction, though the former may be harder to demonstrate. In addition, this framework can indicate opportunities to improve program evaluation rigor by highlighting measures to include at higher Kirkpatrick levels.

Applying both of the above frameworks, we compiled a summary of essential measures to use in evaluating the effectiveness of undergraduate clinical education programs (Table 1).

Table 1. Essential measures to use in evaluating the effectiveness of undergraduate clinical education programs

Strategies for tracking and monitoring data

Program evaluation data are collected both within medical schools and by at least four external regulating bodies: the Liaison Committee on Medical Education (LCME), National Board of Medical Examiners (NBME), Association of American Medical Colleges (AAMC), and National Resident Matching Program (NRMP). These data are critical in decision-making at the local level (e.g., improvement in individual clerkships) and national level (e.g., LCME accreditation). However, using data effectively for decision-making requires aggregating them across different sources, developing internal processes to ensure data integrity, and enacting a deliberate strategy for data management. An inventory of data sources currently in use can be a valuable first step for organizing the process and gathering stakeholder input. Such an inventory can be structured by: a) level of data (e.g., individual student, clerkship group, graduation cohort, exam, clerkship, year, program), b) data source, c) party responsible for the review, d) data storage location, e) output/report format, f) collection/reporting cycle, and g) reviewers/data users. During the inventory process, it is important to develop protocols for managing data, including business process rules for data input and flow across systems, data definitions, and limitations. For example, annual NBME exam performance reports include data for each institution’s entire group of test takers, which might not correspond to academic year cohorts due to misalignment with the institution’s academic calendar or students delaying the exam. In addition, de-identification of data and procedures for the dissemination and sharing of data sets are necessary to safeguard student records. When the inventory is complete, data collected at the same level can be organized by identifiers such as student ID or clerkship name, and then merged manually or aggregated automatically. The use of an education data warehouse may facilitate this process [Citation10]. Maintaining data architecture, hygiene, and quality assurance processes is critical to success.
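As a minimal sketch of the merge-and-aggregate step described above, the Python example below joins a hypothetical internal clerkship grade extract with a hypothetical external exam score extract on a shared student identifier, then de-identifies and summarizes at the cohort level. The file names, column names, and groupings are illustrative assumptions, not a prescribed implementation or actual data sources.

```python
# Minimal sketch: merging two hypothetical data extracts on a shared student ID.
# File names and column layouts are illustrative assumptions only.
import pandas as pd

# Internal clerkship grades (one row per student per clerkship)
clerkship = pd.read_csv("clerkship_grades.csv")      # columns: student_id, clerkship, grade, academic_year
# External exam scores exported from a national report
exams = pd.read_csv("subject_exam_scores.csv")       # columns: student_id, clerkship, equated_pct_correct

# Merge at the same level of data (student x clerkship) before aggregating upward
merged = clerkship.merge(exams, on=["student_id", "clerkship"], how="left")

# De-identify before wider dissemination: drop the direct identifier and
# keep only the cohort-level groupings needed for program evaluation
cohort_summary = (
    merged.drop(columns=["student_id"])
          .groupby(["academic_year", "clerkship"])
          .agg(mean_exam=("equated_pct_correct", "mean"),
               n_students=("grade", "size"))
          .reset_index()
)
print(cohort_summary.head())
```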

The use of data visualization tools, such as online dashboards, allows for customizable summaries and real-time reporting, while making data more accessible and interpretable for stakeholders. The AAMC’s Curriculum Dashboard Resource [Citation11] lists four primary reasons to develop curriculum dashboards: ‘compare metrics to national standards, evaluate educational programs over time, identify trends in educational program quality, and benchmark faculty, resident and student performance.’ Stony Brook’s Drivers of Dashboard Development (3-D) approach [Citation12] is used in curricular CQI and has been linked to improvements in LCME compliance activities, including timeliness of grades, mid-clerkship feedback, and policy awareness. The most critical elements to consider in dashboard creation are who the end-users are, their level of data fluency, and how the data will be used in decision-making. It is also important to undergo a standard-setting process to determine appropriate benchmarks for each metric.
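To make the benchmarking idea concrete, the short sketch below flags dashboard metrics that fall below locally set benchmarks. The metric names and thresholds are hypothetical placeholders; in practice, each value would come from the school’s own standard-setting process rather than from this example.

```python
# Minimal sketch of benchmark flagging for a curriculum dashboard.
# Metric names and thresholds are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class Benchmark:
    metric: str
    minimum: float  # lowest acceptable value for the metric (percent)

BENCHMARKS = [
    Benchmark("grades_returned_within_6_weeks_pct", 90.0),
    Benchmark("mid_clerkship_feedback_documented_pct", 95.0),
    Benchmark("students_meeting_required_encounters_pct", 100.0),
]

def flag_metrics(observed: dict[str, float]) -> list[str]:
    """Return human-readable flags for metrics below their benchmark."""
    flags = []
    for b in BENCHMARKS:
        value = observed.get(b.metric)
        if value is not None and value < b.minimum:
            flags.append(f"{b.metric}: {value:.1f}% is below the benchmark of {b.minimum:.1f}%")
    return flags

# Example: one clerkship's current values (illustrative numbers only)
print(flag_metrics({"grades_returned_within_6_weeks_pct": 82.0,
                    "mid_clerkship_feedback_documented_pct": 97.5}))
```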

Data-driven decision-making regarding clinical education programs occurs on a variety of cycles. While some metrics can be reviewed annually, others require immediate or near-immediate action. An incident of mistreatment reported on an end-of-clerkship evaluation, for example, necessitates a rapid response, which can be activated by an automatic alert informing the responsible parties of the issue. Data alerts are important but should be used sparingly, to avoid unnecessarily fatiguing those responsible for responding. Queries of stakeholders, existing policies, and accreditation expectations will determine in which circumstances and at which time points alerts are necessary. Alerts are often associated with sensitive information, such as poor performance and problems with the learning environment; therefore, a consistent response procedure should exist and be made transparent to students, faculty, staff, and other stakeholders. Alert response procedures should specify to whom the alerts will be sent, the type of information they will include (particularly if identifying data are involved), and the action steps to be taken.
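As one way to picture such an alert, the sketch below scans hypothetical end-of-clerkship evaluation records for a mistreatment flag and routes a de-identified notice to a designated responder. The record fields, routing table, and printed output are assumptions for illustration; a real system would deliver secure notifications through the institution’s own channels.

```python
# Minimal sketch of a data alert: route mistreatment reports to a designated responder.
# Field names, the routing table, and the sample records are illustrative assumptions.

RESPONDERS = {"mistreatment": "learning.environment.office@example.edu"}

def generate_alerts(evaluations: list[dict]) -> list[dict]:
    """Build de-identified alert messages for evaluations that report mistreatment."""
    alerts = []
    for ev in evaluations:
        if ev.get("mistreatment_reported"):
            alerts.append({
                "send_to": RESPONDERS["mistreatment"],
                # Include only what the responder needs; omit student identifiers.
                "message": (f"Mistreatment reported on {ev['clerkship']} "
                            f"end-of-clerkship evaluation ({ev['period']}). "
                            "Follow the standard response procedure."),
            })
    return alerts

# Example records (illustrative only)
sample = [
    {"clerkship": "Surgery", "period": "2019-20 Block 3", "mistreatment_reported": True},
    {"clerkship": "Pediatrics", "period": "2019-20 Block 3", "mistreatment_reported": False},
]
for alert in generate_alerts(sample):
    print(alert["send_to"], "->", alert["message"])
```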

Strategies for using data for curriculum oversight

Data play an important role in determining the quality of the educational program and whether the program meets the goals and expectations of its stakeholders. This process can be used to inform the future direction of the curriculum and essential functions that support the curriculum, such as faculty development. Most LCME-accredited programs utilize standardized data provided by national sources such as the NBME and AAMC, as well as internal information. Information provided by external sources allows a program to benchmark outcomes against national percentiles. Internal sources of information can be useful in detecting and evaluating contextual features unique to a program.

The LCME requires that ‘medical schools must collect and use a variety of outcome data, including national norms of accomplishment, to demonstrate the extent to which medical students are achieving program objectives and to enhance the quality of the medical education program as a whole’ [Citation13]. Under the oversight of the institution’s curriculum committee, valid data must be collected to ensure the trustworthiness of information and to eliminate anecdotal storytelling, which can undermine the curriculum [Citation14]. Additionally, as the final authority on curricular matters, the curriculum committee must review relevant data in order to make curricular decisions and improvements. Many schools determine cut-off measures – often through a curricular dashboard – to highlight strengths which can then be replicated in other areas of the curriculum, or weaknesses that require additional resources, support, or monitoring. Outcomes of such a review can also be used to inform faculty development to address areas of weakness in the curriculum.

Future steps in program evaluation

Typically, the evaluation of clinical programs relies on a combination of learner satisfaction (Kirkpatrick Level 1 [Citation4]), measures of learning (Kirkpatrick Level 2), and changes in behavior (Kirkpatrick Level 3). However, the latter is limited by a paucity of nationally standardized measurement tools. The goal of a clinical education program is to graduate clinicians who can function effectively in their professional roles and provide high-quality care. To determine if this goal is being achieved requires measuring the care that is delivered by the program’s graduates, that is, by measuring the effects of the clinical program on patients (Kirkpatrick Level 4).

With a rapidly changing healthcare landscape and increasing public demands for accountability, the discourse on evaluation frameworks is shifting towards measures of patient outcomes [Citation15,Citation16]. After the Institute of Medicine [Citation17] highlighted the need for clinical education to fit healthcare needs, calls followed to examine the effects of educational training on the quality of care provided by health profession learners [Citation15,Citation16,Citation18–Citation22]. Early responses included recommendations for ‘evidence-guided education’, whereby medical educators monitor clinical outcomes to inform the design of medical education programs [Citation23]; others called for the development of research agendas to examine the impact of educational programs on clinical outcomes [Citation18]. Though methodological challenges and factors that confound the performance of medical professionals have been acknowledged [Citation16,Citation18,Citation20,Citation24], there has been a general consensus on the need to include population outcome measures in the evaluation of clinical teaching strategies, curricula, and programs. Recently, patient-reported outcomes (PROs) and patient-based outcomes (PBOs) have been discussed as critical indicators for program evaluation and continuous quality improvement [Citation15,Citation16]. While some studies have examined clinical outcomes as measures of education quality [Citation25–Citation30], uniform systems and efficient ways of collecting and analyzing outcome data across institutions are needed [Citation15,Citation22]. Ultimately, the primary goal of clinical education is to prepare professionals who deliver quality healthcare; hence, the goal of evaluation should be to demonstrate that clinical education programs are contributing to improved patient outcomes.

In evaluating program effectiveness, it is important to include a variety of perspectives. For example, assessment of student performance during clerkships should not only include evaluations from faculty and residents, but also from patients, clinical staff, administrative staff, peer students, and even self-evaluations. Multisource feedback (MSF) approaches, such as 360-degree evaluations, are already used in many residency training programs [Citation31–Citation36] and even in some undergraduate medical education programs [Citation37,Citation38]. MSF evaluations can provide valuable insight into the learning environment, increase stakeholder representation in the medical education program, and identify gaps in skill development that may go unrecognized in traditional evaluations.

Beyond additional types of data, future steps in program evaluation also include better data systems and more robust data-tracking mechanisms. Currently, most program measures exist in systems that do not communicate well with one another, which makes integration into a coherent database that provides real-time updates challenging. For example, Graduation Questionnaire data are initially provided only in Portable Document Format (PDF) rather than in a format that allows for integration into a data management system. A future state in which raw data, especially nationally normed data, are provided electronically in formats that integrate with other local data systems would allow for better tracking of program data and assessment of interventions in real time.
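To illustrate the integration gap, the sketch below contrasts scraping a table out of a PDF-only report (using the third-party pdfplumber library as a stopgap) with loading the same data from a hypothetical structured export that can be merged directly into a local data system. The file names, table layout, and join key are assumptions for illustration, not references to actual report formats.

```python
# Illustrative contrast: PDF-only reports require scraping, while a
# machine-readable export loads directly into local data systems.
# File names, table layout, and the join key are hypothetical assumptions.
import pandas as pd
import pdfplumber  # third-party library for extracting text and tables from PDFs

# Stopgap: pull the first table out of a PDF-only report (often needs manual cleanup)
with pdfplumber.open("questionnaire_report.pdf") as pdf:
    rows = pdf.pages[0].extract_table()          # list of rows; first row assumed to be headers
report_from_pdf = pd.DataFrame(rows[1:], columns=rows[0])

# Preferred future state: the same data delivered as a structured export
report_from_csv = pd.read_csv("questionnaire_export.csv")

# A structured export merges directly with local data on a shared key
local = pd.read_csv("local_program_metrics.csv")  # e.g., one row per survey item
combined = local.merge(report_from_csv, on="survey_item", how="left")
print(combined.head())
```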

Conclusion

As stakeholders evaluate the effectiveness of clinical education programs, it is important to understand the types of measures that must be included and how these measures relate to each other. It is also imperative that programs have a robust mechanism to track and monitor data and to use those data to inform curricular decisions. As types of data and data systems evolve, we will be better able to accomplish these goals and ensure our clinical education programs are effective in training future providers.

Acknowledgments

The authors wish to thank Loretta Jackson-Williams, MD, PhD, Vice Dean for Medical Education at the University of Mississippi School of Medicine for her leadership of the SGEA Program Evaluation Special Interest Group.

Disclosure statement

The authors report no conflict of interest.

Additional information

Funding

There was no funding for this work.

References

  • Durning SJ, Hemmer P, Pangaro LN. The structure of program evaluation: an approach for evaluating a course, clerkship, or components of a residency or fellowship training program. Teach Learn Med. 2007;19(3):308–6.
  • Liaison Committee on Medical Education. Functions and Structure of a Medical School: Standards for Accreditation of Medical Education Programs Leading to the MD Degree. [cited 2019 Nov 18]. Available from: http://lcme.org/publications/
  • Hunt D, Migdal M, Waechter DM, et al. The variables that lead to severe action decisions by the Liaison Committee on Medical Education. Acad Med. 2016;91(1):87–93.
  • Kirkpatrick DL, Kirkpatrick JD. Kirkpatrick’s four levels of training evaluation. Alexandria, VA: ATD Press; 2016.
  • Hammick M, Dornan T, Steinert Y. Conducting a best evidence systematic review. Part 1: from idea to data coding. BEME Guide No. 13. Med Teach. 2010;32(1):3–15.
  • Kirkpatrick DL. Evaluating training programs: the four levels. 1st ed. San Francisco, CA: Berrett-Koehler; 1996.
  • Kirkpatrick DL, Kirkpatrick JD. Evaluating training programs: the four levels. 3rd ed. San Francisco, CA: Berrett-Koehler; 2006.
  • Issenberg SB, McGaghie WC, Petrusa ER, et al. Features and uses of high-fidelity medical simulations that lead to effective learning: a BEME systematic review. Med Teach. 2005;27(1):10–28.
  • Kirkpatrick DL. Techniques for evaluating training programs. Am Soc Train Direct. 1959;13:3–9.
  • Triola MM, Pusic MV. The education data warehouse: a transformative tool for health education research. J Grad Med Educ. 2012;4(1):113–115.
  • AAMC. Curriculum Dashboard Resource. [cited 2019 Nov 18]. Available from: https://www.aamc.org/download/493604/data/umecurriculumdashboardresource.pdf
  • Shroyer AL, Lu WH, Chandran L. Drivers of Dashboard Development (3-D): a curricular continuous quality improvement approach. Acad Med. 2016;91(4):517–521.
  • Liaison Committee on Medical Education. Standard 8: Curricular management, evaluation, and enhancement. Functions and Structure of a Medical School: Standards for Accreditation of Medical Education Programs Leading to the MD Degree. [cited 2019 Nov 18]. Available from: http://lcme.org/publications/
  • Davis WK, White CB, Norman GR, et al. International handbook of research in medical education. Dordrecht: Springer; 2002.
  • Rosenberg ME. An outcomes-based approach across the medical education continuum. Trans Am Clin Climatol Assoc. 2018;129:325–340.
  • Dauphinee WD. Educators must consider patient outcomes when assessing the impact of clinical training. Med Educ. 2012;46(1):13–20.
  • Institute of Medicine. Crossing the quality chasm: a new health system for the 21st century. Washington, DC: National Academy Press; 2001.
  • Chen FM, Bauchner H, Burstin H. A call for outcomes research in medical education. Acad Med. 2004;79(10):955–960.
  • Whitcomb ME. Using clinical outcomes data to reform medical education. Acad Med. 2005;80(2):117.
  • Schuwirth L, Cantillon P. The need for outcome measures in medical education. BMJ. 2005;331(7523):977–978.
  • Dauphinee WD. The role of theory-based outcome frameworks in program evaluation: considering the case of contribution analysis. Med Teach. 2015;37(11):979–982.
  • Boulet JR, Durning SJ. What we measure … and what we should measure in medical education. Med Educ. 2019;53(1):86–94.
  • Glick TH. Evidence-guided education: patients’ outcome data should influence our teaching priorities. Acad Med. 2005;80(2):147–151.
  • Moreau KA, Eady K. Connecting medical education to patient outcomes: the promise of contribution analysis. Med Teach. 2015;37(11):1060–1062.
  • Norcini JJ, Kimball HR, Lipner RS. Certification and specialization: do they matter in the outcome of acute myocardial infarction? Acad Med. 2000;75(12):1193–1198.
  • Tamblyn R, Abrahamowicz M, Dauphinee WD, et al. Association between licensure examination scores and practice in primary care. JAMA. 2002;288(23):3019–3026.
  • Asch DA, Nicholson S, Srinivas S, et al. Evaluating obstetrical residency programs using patient outcomes. JAMA. 2009;302(12):1277–1283.
  • Asch DA, Nicholson S, Srinivas SK, et al. How do you deliver a good obstetrician? Outcome-based evaluation of medical education. Acad Med. 2014;89(1):24–26.
  • Sirovich BE, Lipner RS, Johnston M, et al. The association between residency training and internists’ ability to practice conservatively. JAMA Intern Med. 2014;174(10):1640–1648.
  • Bansal N, Simmons KD, Epstein AJ, et al. Using patient outcomes to evaluate general surgery residency program performance. JAMA Surg. 2016;151(2):111–119.
  • Mahoney D, Bogetz A, Hirsch A, et al. The challenges of multisource feedback: feasibility and acceptability of gathering patient feedback for pediatric residents. Acad Pediatr. 2019;19(5):555–560.
  • LaMantia J, Yarris LM, Sunga K, et al. Developing and implementing a multisource feedback tool to assess competencies of emergency medicine residents in the USA. AEM Educ Train. 2017;1(3):243–249.
  • Jani H, Narmawala W, Ganjawale J. Evaluation of Competencies Related to Personal Attributes of Resident Doctors by 360 Degree. J Clin Diagn Res. 2017;11(6):JC09–JC11.
  • Riveros R, Kimatian S, Castro P, et al. Multisource feedback in professionalism for anesthesia residents. J Clin Anesth. 2016;34:32–40.
  • Ogunyemi D, Gonzalez G, Fong A, et al. From the eye of the nurses: 360-degree evaluation of residents. J Contin Educ Health Prof. 2009;29(2):105–110.
  • Pollock RA, Donnelly MB, Plymale MA, et al. 360-degree evaluations of plastic surgery resident accreditation council for graduate medical education competencies: experience using a short form. Plast Reconstr Surg. 2008;122(2):639–649.
  • Emke AR, Cheng S, Chen L, et al. A novel approach to assessing professionalism in preclinical medical students using multisource feedback through paired self- and peer evaluations. Teach Learn Med. 2017;29(4):402–410.
  • Lai MM, Roberts N, Martin J. Effectiveness of patient feedback as an educational intervention to improve medical student consultation (PTA Feedback Study): study protocol for a randomized controlled trial. Trials. 2014;15:361.