1,747
Views
4
CrossRef citations to date
0
Altmetric
Short Communication

Promoting innovation in the objective structured teaching examination and feedback: clustering teachers to aid teaching evaluation

ORCID Icon &
Article: 1620544 | Received 22 Mar 2019, Accepted 10 May 2019, Published online: 11 Jun 2019

ABSTRACT

Problem: This study used the principles of feedback in a faculty development curriculum to enable clinical teachers to conduct objective structured teaching exercises for performance assessment. Intervention: the Flanders System of Interaction Analysis (FIA) was given to analysis of the data collected from a particular situation, to videotapes of simulated clinical teaching skills. Context: The Sparse K-Means clustering method, one-way ANOVA and post hoc tests were employed to cluster the most commonly used skills by teachers and compare the features of different clusters were then discussed. Outcome: The evaluation method employed in this study can be extended to more teaching methods and skills. Lessons Learned: that through teaching observation, clinical teaching skills and reflection teaching can be improved.

Background

Medical education emphasizes providing feedback to learners of all levels. However, how feedback is given to clinical teachers and the content of this feedback are rarely evaluated [Citation1]. Clinical teachers are crucial within medical education. If the faculty development curriculum provides teachers with understanding of their own teaching effectiveness, which makes them aware of their shortcomings and able to make modifications, their teaching effectiveness will increase.

Literature review

When a clinician decides to become a teacher, faculty development units typically enroll them in various training courses [Citation2]. However, what clinical teachers actually do at the bedside often differs greatly from the faculty training [Citation3]. Some teachers are oblivious to these differences, whereas others can identify and address the gap. Therefore, when observing clinical teaching, which is the way of learning after completing the faculty development curriculum, clinical teachers should be encouraged to evaluate their own thoughts, learning, and teaching practices, thereby clarifying their nascent beliefs [Citation4]. As teachers, they should thoroughly examine their role and their responsibilities to students but also their personal beliefs and practices in the context, becoming a model for the students, because clinical teachers are also learners [Citation5].

Teaching observation and feedback

Among all methods of collecting teaching performance data, the most common is teaching observation [Citation6]. Teaching observation prompts changes in teachers’ instruction methods and can be conducted through individual or group classroom observation [Citation7]. The strength of this approach is that it collects data directly related to teacher performance, and the observation records can be kept and used repeatedly. Its weakness, however, is that observation can disturb the teacher’s instruction [Citation8]. Additionally, it does not detect sociological, psychological, political, and organizational factors relevant to the teacher [Citation9]. Furthermore, being observed may cause the teacher to be more performative than usual [Citation10].

Social interaction analysis has been promoted in the teaching context by socio-educational scholars and can be used to record and analyze teaching behavior under real teaching conditions, ultimately providing feedback to teachers on how they can improve their teaching and thus reducing the time and effort wasted by novice teachers [Citation11]. Additionally, the analysis can serve as a reference during teacher evaluation. Among social interaction analysis methods, the most famous is the Flanders interaction analysis system (FIAS) [Citation12]. Researchers can use the FIAS to conduct quantitative social interaction analysis. The FIAS is an observational tool used to classify the verbal behaviors of teachers and pupils as they interact in the classroom. Only verbal communication in the classroom is considered; nonverbal gestures are not taken into account. The basic assumption of the system is that in the classroom, a teacher’s verbal statements are consistent with his or her nonverbal gestures and overall behavior [Citation13]. The system categorizes the verbal behaviours and the feedback of pupils in the classroom into 10 categories, each of which contains teachers’ verbal communications and nonverbal gestures, with which the system can be used to study teacher–student interaction [Citation12].

Since 1992, some medical schools in the USA have developed objective structured teaching examinations (OSTEs) [Citation14]. These are similar to the objective structured clinical examinations for evaluating medical students’ clinical skills in that an OSTE employs standardized students (SSs) to evaluate the teaching skills of clinical teachers or residents [Citation15]. In each station of an OSTE, clinical scenarios are often constructed, such as bedside teaching, ambulatory care education, chart writing, and feedback giving. Clear objectives and standards are provided for the assessment, so the teaching skills of clinical teachers and residents can be effectively assessed [Citation16]. Additionally, the SSs provide a highly realistic and low-risk simulated environment in which teachers can practice, experience new teaching techniques and behaviors, and receive direct feedback, which helps them adjust and improve their teaching [Citation16].

On the basis of the aforementioned theories and background, this study used an OSTE for teaching observation. We videotaped teachers during their simulated clinical feedback teaching. The FIAS was used for teaching behavior classification. Teaching modes were used in structural clustering Sparse K-Means. Teachers of varying backgrounds were recruited and observed. We described and classified teachers’ behaviors to provide a reference for departments or stakeholder wishing to increase faculty development effectiveness.

Methodology

This research was approved by the Institutional Review Board of Tzuchi Hospital (IRB 100–90). The research participants were 38 clinical teachers from 2011–2013. The participants received 40 hours of instruction on the Faculty Development Program in General Medicine, which included a 4-hour course on how to provide feedback. The content of the feedback section covered environment and atmosphere (creating a safe environment first), guidance and triggers (emotions are deduced), diagnoses and feedback clarification, improvement plans, application, retrospection, and confirmation (confirming messages). Lectures and both large- and small-group discussions were involved. An OSTE design was employed for summative evaluation. To attain objectivity, we analyzed one of the stations, feedback skills, which took approximately 15 minutes to complete, and analyzed the interaction of the 38 teachers with the SSs. The scenario in the teaching case regarding feedback skill was as follows. The SS (played by a resident) reported to the teacher about a case with acute gastric ulcer, bloody stool, and abdominal pain. The SS reported that the patient was being transferred to the intensive care unit and outlined the treatment procedure. Conflict had arisen because the family and medical team had different opinions. The teacher then had to give feedback and make suggestions on how to proceed.

Research tools

Introduction to the FIAS

The FIAS is a common system for observing the interactive behaviors of teachers and students in a classroom during teaching. The goal is to analyze teaching behavior and to record major teacher–student interaction events. Through understanding the effect of classroom interaction events, teachers can gain insight into and improve their teaching [Citation12].

Flanders encoded all verbal events between teachers and students into a 10-category system. Recording is performed by observing the ongoing interactions in the classroom and recording them in the appropriate categories. The recording timeline of category number continued at the rate of 20–25 recordings per minutes for the entire 15-minute observation period. The behavior categories are displayed in .

Table 1. Flanders interaction analysis system verbal interaction categories.

Analysis and statistical methods

Coding was conducted by two people. Videos of teaching sessions were recorded, and the coders then discussed the appropriate codes while watching the videos. To prevent the observers from making considerably different value judgments due to subjectivity, in addition to observer training, an interrater reliability test was conducted when the grading was performed. We randomly selected 10 videos of teaching sessions to evaluate the interrater reliability via calculating Kappa. The results of Kappa ranged from 0.73 to 0.90, showing that the scores awarded by the two observers for the 10 categories of teaching behavior were highly consistent. Because the participants who provided feedback were SSs acting according to a fixed scenario and with fixed dialogue, in the final analysis, the SSs’ reactions were excluded from the analysis. This study excluded the students’ responses and considered only 9 of the 10 FIAS verbal interaction categories for analysis. The categories were named F0–F8 ().

Table 2. Demographics (N = 38).

We employed the sparse K-Means for analysis [Citation17], it were adjusted for different FIAS categories and physician factors (age, sex, specialty, and hospital level). Clustering is the process of categorizing concrete or abstract objects into groups on the basis of their similarity. The standard k-means clustering is commonly used in data mining. For the number of clusters k, the algorithm is trained to arbitrarily select k item centroids and group each input item with its most similar centroid. The centroid in each cluster is then recalculated using the distance center of all the categorized centroids. The goal is to investigate the influence and grouping relationship of each cluster. One of the simplest unsupervised learning algorithms for solving clustering problem, K-means clustering has a simple and straightforward procedure for classifying a given data set into a certain number of clusters fixed a priori. These centroids should be selected intelligently because different locations would obtain different results. The optimal choice is to select centroids as far from each other as possible. The next step is to consider each point in the data set and identify which centroid it is nearest to [Citation18]. Doing this for all points completes the first step and obtains the first groupage. The k centroids of each cluster are then recalculated. After the new k centroids have been obtained, the data points must be reassociated with their new closest centroid. This generates a loop. As the loop proceeds, the k centroids change location until their location becomes stationary. Sparse K-Means clustering is an established method of simultaneously excluding uninformative features and clustering the observations. After identifying clusters, we further compare the characteristics of different clusters. Statistical mean differences of continuous data between different clusters were analyzed with One-way analysis of variance (ANOVA) or Kruskal-Wallis test. The Bonferroni correction was used as post-hoc analysis. The Chi-square test or Fisher’s exact test were used to evaluate the association between two categorical<0.05. All analyses were performed using R (Version 3.5.0), the R-packages sparcl (Version 1.0.4), and SPSS software version 17.0 (SPSS Inc., Chicago, IL, USA).

Results

A total of 38 clinical teachers, 28 men and 10 women, participated in this study. Those with 5 or more years of teaching experience were considered highly experienced, and those with 4 years or fewer were considered less experienced. Slightly more than half of the participants were less experienced. The participants were from different levels of hospitals and from different departments (internal medicine department, 42.1%; surgery department, 7.9%; and other departments, 50%). The participants’ backgrounds are detailed in . We employed Sparse K-Means clustering to group the participants. presents the results of this clustering into three main groups (C1–C3). C1 (n = 20) comprised the participants who principally performed F3 category behaviors, which are indirect influences (responses) in which the ideas of students are accepted or used. C2 (n = 10) comprised the participants who principally performed F4 category behaviors, which are the asking of questions, and F8 category behaviors, which are pupil-talk–response interactions. C3 (n = 8) comprised the participants who principally performed F5 category behaviors, which are lecturing under direct influence (initiation). These three clusters have notably higher percentage of the teaching behavior via FIAS (). No significant differences in the sex, department, experience, and group (hospital level) of these three groups of teachers were discovered.

Table 3. Characteristics of the three clusters.

Figure 1. Visualization of cluster obtained using the Sparse K-Means algorithm.

Figure 1. Visualization of cluster obtained using the Sparse K-Means algorithm.

Figure 2. Flanders interaction analysis of the feedback scenario: C1 (N = 20): F3 (accepting or using ideas of students, response); C2 (N = 10): F4 (asking questions); C3 (N = 8): F5 (lecturing, initiation).

Figure 2. Flanders interaction analysis of the feedback scenario: C1 (N = 20): F3 (accepting or using ideas of students, response); C2 (N = 10): F4 (asking questions); C3 (N = 8): F5 (lecturing, initiation).

Discussion

The FIAS enabled clinical teachers to determine their performance on instruction, interaction, and coaching behavior of the teaching process after watching clips of their lessons [Citation12]. If qualitative feedback can be provided together with timeline marking, more precise feedback and suggestions can be provided. Additionally, clinical teachers can objectively reflect on their teaching as well as gain inferences through comprehension, transformation, instruction, evaluation, reflection, and new comprehension, thereby positively affecting their future teaching.

This study discovered that after the teachers had finished the faculty development curriculum, when practicing the feedback skill in a simulated scenario, they could all ask questions to receive students’ opinions, listen intently, and express their own experience and thoughts, regardless of their experience and department. During the intense basic faculty development curriculum, the teaching effectiveness of the teachers improved. Through objective teaching observation, discussion, and recording, we determined how the teachers actually taught students on giving feedback. After completing the faculty development curriculum, the teachers could all demonstrate the following during practice teaching scenarios: understanding what the students were asking for help on; assisting them in setting learning tasks and goals; discussing the required skills with the students; clarifying the content of critical clinical handling principles, assessments, and evaluative discussions; and assisting students in strengthening their skills. These are the preliminary results of the faculty development curriculum.

Conclusion and suggestions

Clinical teachers should all complete the teaching principle curriculum, which involves learning about theories and general rules in lectures, participating in small group discussions, observing the actions of role models in outstanding teaching cases, and scenario writing and amendment for actual teaching units. If the curriculum is completed by teachers formulating effective methods of summative evaluation for use in teaching practice and videotaping to provide feedback, clinical teachers will be able to organize the knowledge they acquire from the curriculum, and this knowledge will positively influence their teaching performance. To train excellent clinical teachers, time and effort are necessary to help those teachers develop mature teaching behavior. Faculty development units must be supportive of this process by providing more comprehensive course planning and following up on teachers’ progress.

Limitations of the study

FIAS provides a simple way to collect and analyze teaching behavior automatically, yet it has some limitations. Firstly, some clinical interactive activities are not undertaken via interaction behavior (e.g., content expertise and interpersonal attributes), and thus they could not be perceived by FIAS so far. Secondly, the sample size was relatively small according to the most popular rule from Forman, which recommends a sample size of at least 2^k, where k is the number of variables used for clustering. Lastly, the extended dialogue between students and faculty cannot be analyzed at present. All these issues need to be addressed in future studies.

Correction Statement

This article has been republished with minor changes. These changes do not impact the academic content of the article.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by the Buddhist Tzu Chi Medical Foundation [TCMMP106-06].

References

  • Weinstein DF. Feedback in clinical education: untying the Gordian knot. Acad Med. 2015;90:559–6.
  • Christner JG, Dallaghan GB, Briscoe G, et al. The community preceptor crisis: recruiting and retaining community-based faculty to teach medical students—A shared perspective from the alliance for clinical education. Teach Learn Med. 2016;28:329–336.
  • Wong BM, Holmboe ES. Transforming the academic faculty perspective in graduate medical education to better align educational and clinical outcomes. Acad Med. 2016;91:473–479.
  • Miller JE, Seldin P. Changing practices in faculty evaluation. Academe. 2014;100:35–38.
  • Ebert-May D, Derting TL, Hodder J, et al. What we say is not what we do: effective evaluation of faculty professional development programs. BioScience. 2011;61:550–558.
  • Blackmore JA. A critical evaluation of peer review via teaching observation within higher education. Int J Educational Manage. 2005;19:218–232.
  • Dillmann R. Teaching and learning of robot tasks via observation of human performance. Rob Auton Syst. 2004;47:109–116.
  • Yiend J, Weller S, Kinchin I. Peer observation of teaching: the interaction between peer review and developmental models of practice. J Further Higher Educ. 2014;38:465–484.
  • Dybowski C, Harendza S. ”Teaching is like nightshifts … ”: a focus group study on the teaching motivations of clinicians. Teach Learn Med. 2014;26:393–400.
  • Steinberg MP, Garrett R. Classroom composition and measured teacher performance: what do teacher observation scores really measure? Educ Eval Policy Anal. 2016;38:293–317.
  • Cornelius-White J. 2007. Learner-centered teacher-student relationships are effective: A meta-analysis. Rev Educ Res. 77:113–143.
  • Flanders NA. Analyzing teaching behavior. MA: Addison-Wesley Reading; 1970.
  • Brophy J. Teacher praise: A functional analysis. Rev Educ Res. 1981;51:5–32.
  • Morrison EH, Boker JR, Hollingshead J, et al. Reliability and validity of an objective structured teaching examination for generalist resident teachers. Acad Med. 2002;77:S29–S32.
  • Julian K, Appelle N, O’Sullivan P, et al. The impact of an objective structured teaching evaluation on faculty teaching skills. Teach Learn Med. 2012;24:3–7.
  • Prislin MD, Fitzpatrick C, Giglio M, et al. Initial experience with a multi-station objective structured teaching skills evaluation. Acad Med. 1998;73:1116–1118.
  • Witten DM, Tibshirani R. A framework for feature selection in clustering. J. Amer. Statist. Assoc. 2010;105:713–726.
  • Zakharov K. Application of k-means clustering in psychological studies. Quant Methods Psycho. 2016;12:87–100.