Education and Practice

Development of a Child Abuse Checklist to Evaluate Prehospital Provider Performance

Pages 222-232 | Received 12 Jul 2016, Accepted 17 Aug 2016, Published online: 04 Oct 2016

ABSTRACT

Objectives: To develop and provide validity evidence for a performance checklist to evaluate the child abuse screening behaviors of prehospital providers. Methods: Checklist Development: We developed the first iteration of the checklist after review of the relevant literature and on the basis of the authors' clinical experience. Next, a panel of six content experts participated in three rounds of Delphi review to reach consensus on the final checklist items. Checklist Validation: Twenty-eight emergency medical services (EMS) providers (16 EMT-Basics, 12 EMT-Paramedics) participated in a standardized simulated case of physical child abuse to an infant followed by one-on-one semi-structured qualitative interviews. Three reviewers scored the videotaped performance using the final checklist. Light's kappa and Cronbach's alpha were calculated to assess inter-rater reliability (IRR) and internal consistency, respectively. The correlations of successful child abuse screening with checklist task completion and with participant characteristics were assessed using Pearson's chi-squared test to gather evidence for construct validity. Results: The Delphi review process resulted in a final checklist that included 24 items classified with trichotomous scoring (done, not done, or not applicable). The overall IRR of the three raters was 0.70 using Light's kappa, indicating substantial agreement. Internal consistency of the checklist was low, with an overall Cronbach's alpha of 0.61. Of 28 participants, only 14 (50%) successfully screened for child abuse in simulation. Participants who successfully screened for child abuse did not differ significantly from those who failed to screen in terms of training level, past experience with child abuse reporting, or self-reported confidence in detecting child abuse (all p > 0.30). Of all 24 tasks, only the task of exposing the infant significantly correlated with successful detection of child abuse (p < 0.05).
Conclusions: We developed a child abuse checklist that demonstrated strong content validity and substantial inter-rater reliability, but successful item completion did not correlate with other markers of provider experience. The validated instrument has important potential for training, continuing education, and research for prehospital providers at all levels of training.

Introduction

Many infants and children die or suffer long-term disability from physical child abuse each year in the United States.Citation1–3 The acute nature of their injuries and lack of ready access to primary care may lead child abuse victims to present to an emergency department (ED) for care, which may be the first and only opportunity for victims of child abuse to be recognized.Citation4 Approximately 30% of children who died from child abuse had been previously evaluated by health care providers, often in the ED setting, for injuries that were not recognized as abusive.Citation5–9 These missed opportunities for child abuse recognition occur despite victims encountering multiple levels of care – from prehospital or emergency medical services (EMS) providers to hospital providers such as emergency medicine physicians and nursing staff. Opportunities thus exist to improve the recognition of physical child abuse by a provider somewhere on the continuum of emergency services.

It is reasonable to postulate that prospective, real-time screening of young children who are transported by EMS providers to EDs for care might be one way to detect child abuse before serious injury or death occurs. First, EMS providers transport over 1.2 million pediatric patients under the age of 18 annually to EDs, which comprises up to 13% of their total patient transports.Citation10,11 Second, EMS providers are in a unique position to recognize and report child abuse, as they are often the only members of a child's healthcare team who can offer important initial and unadulterated information to ED providers about family interactions, the home environment, and the scene of injury. Finally, EMS providers are also mandated reporters in many states, yet they receive minimal training on child abuse and report a lack of confidence in dealing with suspected child maltreatment.Citation12–15

To our knowledge, no study to date has assessed the real-time behavior of EMS providers during child abuse screening. This study focused on the development and validation of a formative checklist to evaluate the clinical performance of EMS providers when caring for a physically abused infant in a standardized simulation-based scenario. Of note, child abuse may refer to a constellation of conditions such as physical abuse or non-accidental trauma, neglect, or sexual exploitation or abuse. However, we use the term “child abuse” throughout the paper specifically to refer to non-accidental trauma as evaluated in our simulated scenario.

Methods

Study Design and Population

This project consisted of two phases: (1) development of a performance checklist to evaluate EMS providers' screening behaviors for abuse and (2) validation of the checklist via a simulation-based abusive head trauma scenario. Validation of the checklist was framed around the unified concept of construct validity.Citation16–18 Under this unified framework of construct validity, we evaluated the checklist's content, response process, inter-rater reliability, and relation to other variables as categories of evidence contributing to the overall construct validity of the checklist rather than as different types of validity themselves.Citation16,17 Study subjects for the simulation-based scenario included EMS providers recruited from the Yale-New Haven Sponsor Hospital Program and local EMS services. Participants were encouraged to participate when free from clinical responsibilities or job-related duties; participation was not mandatory. This study was approved by the Institutional Review Board (IRB) committee at Yale University and by the New Haven Sponsor Hospital Governing committee. All participants gave informed consent prior to participating.

Study Protocol

Checklist Development

Potential items for the initial checklist were derived from a literature review of physical exam findings characteristic of abusive trauma in infants,Citation19–22 of the scope of practice of EMS providers,Citation14,23,24 and of previously published screening checklists and decision-making tools.Citation25–30 Based on this literature review and on clinical experience of two pediatric emergency medicine physicians (GT and KB) at an urban children's hospital, the authors created an initial performance checklist consisting of 25 items believed to be important and necessary to the screening of an infant for physical abuse in a clinical scenario. The initial checklist was then adapted for face validity and feasibility based on verbal feedback from two experienced paramedics who participated in pilot testing of a simulation scenario of child abuse written by the authors (Appendix 1).

Six content experts were recruited to participate in three rounds of Delphi review to determine the final checklist items.Citation31 For the first two rounds of review, the experts were prompted with the statement “Performing the task of ____ will provide important information related to the screening of an infant less than 1 year old for child abuse.” They then ranked their agreement with the statement for each checklist item according to a 5-point Likert scale (1 = strongly disagree, 2 = disagree, 3 = neither agree nor disagree, 4 = agree, and 5 = strongly agree). An a priori decision based on prior checklist work was made to include items without further modification that had a mean score greater than or equal to 4 on the 5-point Likert scale of agreement.Citation32 In addition to ranking each item in terms of importance, the experts were prompted to provide qualitative free-text feedback on each item's wording and clarity, to add items that were relevant but missing, or to make other pertinent comments. The final review round included the additional question, “Do you think this item can reasonably be expected for an EMT-Basic (EMT-B) or Paramedic (EMT-P) to perform prior to arrival?” Experts ranked each item as either “yes” (within the scope of practice) or “no” (out of scope of practice) and provided commentary on the operational definition of each item. Appendix 2 describes the scope of practice for EMS levels in detail. An a priori decision was made to include only those items endorsed by greater than 60% of reviewers as within the scope of practice for EMS providers.Citation33,34
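The two a priori inclusion rules can be expressed concisely. The following is an illustrative sketch only (not the authors' analysis code), with hypothetical expert ratings for a single candidate item:

```python
# Illustrative sketch of the two a priori Delphi inclusion rules:
# keep an item if (1) its mean Likert rating is >= 4.0 and
# (2) more than 60% of experts endorse it as within EMS scope of practice.
# All data below are hypothetical examples, not study data.

def keep_by_likert(ratings, threshold=4.0):
    """ratings: list of 1-5 Likert scores from the expert panel."""
    return sum(ratings) / len(ratings) >= threshold

def keep_by_scope(endorsements, cutoff=0.60):
    """endorsements: list of True/False 'within scope' votes."""
    return sum(endorsements) / len(endorsements) > cutoff

# Six hypothetical experts rating one candidate item:
likert = [5, 4, 4, 5, 3, 4]                     # mean = 4.17 -> retained
scope = [True, True, True, True, False, True]   # 83% endorse -> retained
print(keep_by_likert(likert), keep_by_scope(scope))  # -> True True
```

An item survives the Delphi process only if both rules hold; a mean of 3.67, for example, would exclude it regardless of the scope vote.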

Checklist Validation

Content Validity

To assess content validity, a panel of six content experts reviewed the checklist items as above. Two of the content experts were board-certified child abuse pediatricians, and two were pediatric emergency medicine specialists with research experience in simulation-based education. All four pediatricians practiced at a major regional pediatric tertiary care center, and each had more than 6 years of experience in their respective fields. The final two content experts were EMS instructors with an average of 15 years of teaching and clinical field experience. The expert panel confirmed that the final checklist items were important and necessary, identified missing elements, commented on checklist item clarity and wording, and verified that each item was within the scope of EMS practice.

Response Process

To explore evidence of how accurately the final checklist reflected the thoughts and actions of EMS providers during a case of suspected abusive trauma, we conducted a series of standardized, videotaped simulation sessions (Appendix 1) from February to April 2014 with N = 28 EMS volunteers (16 EMT-B's, 12 EMT-P's). The details of the simulation scenario were based on cases of physical abuse managed by the members of the research team as well as factors associated with abusive head trauma in the literature. Each simulation lasted approximately 10 minutes and was conducted at the Yale-New Haven Sponsor Hospital. The participants were encouraged to think aloud as much as possible during the simulation session so that real-time thought processes could be compared to checklist items. Following each simulation session, two investigators with experience conducting qualitative interviews facilitated one-on-one, semi-structured, face-to-face interviews with each EMS provider about his/her experience with the simulation scenario and in general with child abuse and/or neglect. The interview guide consisted of open-ended questions and included prompts to encourage detailed discussion. Participants were encouraged to discuss their perceived performance in the simulation, past experience(s) with child abuse cases, and perceived barriers and facilitators to screening for child abuse in the pre-hospital setting. Interviews were audiotaped and transcribed verbatim.

Inter-rater Reliability

Three raters independently scored each videotaped session using the final checklist. All raters participated in one 30-minute training session with one author (AA), who provided each rater with a copy of the final checklist and discussed each item's task group, operational definition, and description in detail until clarity about each element of the checklist was achieved. Each rater was instructed to view the simulations once, in real time, without pausing, after which they were allowed to rewind, pause, and fast-forward the videotaped simulation as needed to complete the checklist. No discussion occurred among the raters during or following their scoring of the videotaped sessions.

Relation to Other Variables

To evaluate the correlation of successful child abuse screening with participant characteristics, participants provided demographic information prior to participation (Table 1). Participants were also asked to rate their confidence in recognizing physical abuse in children on a scale of 0–100.

Table 1. Participant demographics by level of training

Data Analysis

Descriptive statistics of the Delphi responses, participant demographics, and participant performance in simulation were calculated to analyze checklist development and validity. An inter-rater reliability (IRR) analysis was performed to assess the degree to which raters consistently assigned categorical ratings to participants in the study. Kappa was computed for each coder pair and then averaged to provide a single index of IRR, with the following interpretation: κ < 0 for Light's kappa indicates no agreement; κ = 0–0.20, slight; κ = 0.21–0.40, fair; κ = 0.41–0.60, moderate; κ = 0.61–0.80, substantial; and κ = 0.81–1, almost perfect agreement.Citation35,36 A Light's kappa value above 0.61 was decided a priori to be an acceptable level of agreement for the performance checklist. Internal consistency of the checklist was assessed by calculating the Cronbach's α coefficient and item discrimination statistics (corrected item-total correlation and Cronbach's α if item deleted) for all subjects combined, based on rater scores. Consensus ratings (i.e., when two or more of the three raters gave the same score) were used when possible; mean scores were used when ratings were discordant. Analysis was done using SPSS v. 22.0 (IBM Inc.). Bivariate and multivariate analyses were conducted using Pearson's chi-squared test for proportions, the Mann-Whitney U test for continuous non-parametric variables, and the independent t-test for normally distributed continuous variables.
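The two reliability statistics can be illustrated with a short sketch. This is not the study's SPSS analysis; the ratings below are hypothetical. Light's kappa is simply the mean of Cohen's kappa over all rater pairs, and Cronbach's α relates the sum of per-item variances to the variance of subjects' total scores:

```python
# Illustrative implementations of Light's kappa and Cronbach's alpha
# (hypothetical data; the study used SPSS v. 22.0 for these analyses).
from collections import Counter
from itertools import combinations

def cohens_kappa(a, b):
    """Cohen's kappa for two raters' categorical ratings of the same items."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n                    # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in set(ca) | set(cb)) / n ** 2   # chance agreement
    return (p_o - p_e) / (1 - p_e)

def lights_kappa(raters):
    """Light's kappa: the average Cohen's kappa over all pairs of raters."""
    kappas = [cohens_kappa(a, b) for a, b in combinations(raters, 2)]
    return sum(kappas) / len(kappas)

def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of subject scores per item."""
    def var(xs):                                                   # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]               # per-subject totals
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Hypothetical example: three raters scoring six checklist items as
# done ('d') or not done ('n') for one videotaped participant.
r1 = ['d', 'd', 'n', 'd', 'n', 'd']
r2 = ['d', 'd', 'n', 'n', 'n', 'd']
r3 = ['d', 'n', 'n', 'd', 'n', 'd']
print(round(lights_kappa([r1, r2, r3]), 3))  # -> 0.556, "moderate" agreement
```

On these toy ratings the pairwise kappas are 0.67, 0.67, and 0.33, averaging to 0.556, which falls in the "moderate" band of the interpretation scale above.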

The qualitative interviews were reviewed by six investigators (GT, MG, JRK, KBi, AA, and KB) and analyzed using the constant comparative method of inductive analysis; that is, initial codes, or labels, were applied to summarize and categorize portions of the data.Citation37 The research team discussed and refined the codes during a series of meetings. When the coding process was complete, they clustered coded data into relevant themes.

Results

Checklist Development

After an extensive literature review detailed previously, the initial checklist consisted of 25 items, two of which were modified after pilot testing with two experienced paramedics. Next, all six content experts completed each of the three rounds of modified Delphi review for 100% expert participation. The final checklist included 24 items classified into 7 task groups (Appendix 3). Operational definitions for each item provided clarification of item completeness. Scoring included three categories: (1) done, (2) not done, and (3) not applicable (N/A) for those items not required for a particular scenario. The third scoring category (N/A) was included to enhance the generalizability and applicability of the checklist to multiple clinical scenarios.

Checklist Validation

Content Validity

The items unanimously agreed upon by the experts as being the most important (mean score of 5 on the 5-point Likert scale) during the screening of an infant for child abuse were: “exposes anterior torso,” “exposes posterior torso,” “exposes entire head and face,” “inspects exposed areas for signs of abuse,” and “solicits reason for call” (Table 2).

Table 2. Delphi results for final checklist items

Response Process

Overall, 14 participants (50%) detected abuse and 14 (50%) failed to detect child abuse in simulation. The majority (26/28, or 93%) of participants disclosed during the qualitative interview that their performance in the videotaped simulation was consistent with their mental models; that is, those who successfully screened for abuse were suspicious of abuse and purposefully performed tasks to gather evidence. Two participants failed to identify abuse during simulation by our checklist criteria but revealed suspicion of abuse during the qualitative interview. Of the two, one failed to report suspicion during the simulation because of uncertainty about whether the facial bruise was an artifact of the simulator or an actual skin finding. The other suspected abuse based on the simulated babysitter's behavior on presentation but failed to perform the tasks necessary to screen for it.

In the individual debriefings after each scenario, participants described barriers to recognizing child abuse and/or neglect (CAN). Reported barriers to recognition included discomfort with the care of pediatric patients, uncertainty related to the diagnosis of CAN (accepting parental story about alternative diagnosis, lack of experience with CAN, and difficulty distinguishing between accidental and intentional injuries), a focus on the chief complaint, and having a limited opportunity to do a full evaluation in the prehospital setting.

Further results and discussion of the qualitative inquiry will be published in a subsequent manuscript.

Inter-rater Reliability and Internal Consistency

The inter-rater reliability was tested for the entire performance checklist, for each task group, and for each item (Table 3). For the entire checklist (items 1–1 to 7–3), Light's kappa (κ) was 0.70, indicating substantial inter-rater reliability. The task groups with the highest kappa values were the “neurologic assessment” and “exposure” categories (both κ = 0.83), and the lowest was the “basic history” category (κ = 0.31). Cronbach's α for the overall checklist was low at 0.601. Item discrimination statistics are presented in Appendix 4.

Table 3. Inter-rater reliability and Light's Kappa

Relation to Other Variables

The characteristics of the 14 participants who detected child abuse did not differ significantly from those of the 14 who failed to detect it. Namely, detectors and non-detectors had similar levels of training (EMT-B or EMT-P), past experience with child abuse reporting, and self-reported confidence in detecting child abuse (all p > 0.3; Table 4).

Table 4. Associations between participant characteristics and screening for abuse

Of the checklist items, only items in the “exposure” task group significantly correlated with identifying abuse (all p < 0.05; Table 5). Performance in any other task group, including basic history and scene evaluation and safety, did not increase the likelihood of detecting abuse. The majority of participants exposed the infant's anterior torso (84%) and entire head and face (54%). The most commonly missed items in the exposure category were “exposes the posterior torso” (39%) and “palpates entire skeleton for signs of tenderness” (15%). However, the items in the “exposure” category most strongly associated with the detection of child abuse were “exposes posterior torso” and “inspects exposed areas for signs of abuse” (both p < 0.001). Of the participants who exposed the posterior torso, 91% reported suspicion of child abuse in simulation; of those who did not expose the posterior torso, only 24% detected child abuse.

Table 5. Associations between checklist items and successful screening for abuse

Discussion

We created a performance evaluation checklist and report evidence for its validity when used for formative assessment of EMS providers caring for a simulated infant with physical abuse. Content validity was established through repeated iterations and adaptations of the instrument, initially in real time with experienced paramedic participants and subsequently at the expert panel level. Additionally, the expert panel was varied enough to represent multiple dimensions of the knowledge and skills required to screen an infant for child abuse. The behaviors deemed most important by the expert panel during review were also found during simulation testing to be most important in distinguishing child abuse detectors from non-detectors.

Checklist performance did not distinguish between participants with high or low levels of training or years of experience, reported levels of confidence, or numbers of previous child abuse cases. Several explanations may account for the lack of correlation between provider experience and rates of child abuse screening. First, providers consistently identify pediatrics in general, and child abuse in particular, as an area of weakness regardless of level of training.Citation12–14 EMS providers report being largely uncomfortable with most aspects of pediatric care, particularly when applied to infants less than one year old – the population known to be the most vulnerable to child abuse.Citation38 Our participants, too, reported significant uncertainty related to the diagnosis of CAN, and many struggled with the level of proof needed for reporting suspected CAN to child protective services. Second, EMS providers who are experts in the domain of child abuse, who would form the basis for comparison between expert and novice cohorts, are difficult to find, given that less than one hour of EMS training is devoted to child abuse topics.Citation14,24 The training that is offered may not involve discussion of child abuse screening or practical application of skills via simulation.Citation39,40

Only one factor contributed significantly to successful screening – exposure. In agreement with current literature, full exposure alone seems to be sufficient to detect bruising, the most common manifestation of physical child abuse and a rare finding in pre-mobile infants.Citation19–22,37,38 However, a focus on the chief complaint and having a limited opportunity to do a full evaluation were perceived by participants to be barriers to recognizing CAN in the pre-hospital setting. One potential application of these findings may be to include the task of ‘skin exposure’ in current EMS procedural policies related to care of infants in the pre-hospital care setting. Skin exposure is not a task unique to EMS providers, yet detecting bruising in the prehospital setting may prompt prehospital providers to search for additional environmental cues to substantiate the mechanism of the injury. Situational factors from the home may be missed in the absence of detecting skin changes indicative of physical abuse.

Inter-rater reliability was substantial, with a Light's kappa of 0.70.Citation35,36 The IRR analysis suggested that raters agreed substantially in their ratings, although the variable of interest contained a modest amount of error variance due to differences in the raters' subjective ratings. Statistical power for subsequent analyses may therefore be modestly reduced, but the ratings were deemed adequate for use in the hypothesis testing of the present study.

There are several limitations to this study. First, we tested this performance evaluation checklist in a single simulation scenario involving an infant with bruising. Items such as “assesses caregiver behavior” and “evaluates the condition of the home,” though important for successful screening of child abuse, are difficult to apply in a simulated environment and could explain the low Cronbach's alpha for our checklist items. Iterative modification of this checklist prior to use in other abuse-related scenarios, such as an environment involving drugs and alcohol or an injury like fracture or abdominal trauma, may further improve its validity. Second, results may have differed had we recruited participants from broader gender, racial, and sociodemographic backgrounds and from a variety of geographic regions and training institutions. A third limitation is the difficulty of assessing cognitive processes in simulation. However, simulation-based performance checklists have been used effectively as a tool to evaluate clinical skills, to enhance medical education, and to train providers in a wide range of medical specialties and procedures including surgery and anesthesiology.Citation34,39–43 Participants in our study were encouraged to think aloud as much as possible during simulation, and results from the qualitative interview suggest that participants' internal thought processes were consistent with checklist performance. Finally, although the Delphi technique assures the items on the final checklist are based upon expert consensus and our study used multiple experts with varied expertise, respondent answers are perceptions and may be biased.

Conclusions

Assessing clinical performance is integral to improving patient care and providing feedback to providers regarding improvement of their own clinical skills with evaluation of pediatric patients. EMS providers are in a unique position to recognize and report child abuse but receive little training about this topic. We have developed a valid performance checklist that can be used to guide training of EMS providers in the evaluation of children suffering physical abuse. Results from our analysis of checklist performance further support the need for greater incorporation of child abuse topics in both training and protocol development of EMS providers.

References

  • U.S. Department of Health & Human Services, Administration for Children and Families, Administration on Children, Youth and Families, Children's Bureau. Child maltreatment 2013. 2015. Available at: http://www.acf.hhs.gov/programs/cb/research-data-technology/statistics-research/child-maltreatment.
  • Hussey JM, Chang JJ, Kotch JB. Child maltreatment in the United States: prevalence, risk factors, and adolescent health consequences. Pediatrics. 2006;118:933–42.
  • Leventhal JM, Gaither JR. Incidence of serious injuries due to physical abuse in the United States: 1997 to 2009. Pediatrics. 2012;130:e847–52.
  • Keshavarz R, Kawashima R, Low C. Child abuse and neglect presentations to a pediatric emergency department. J Emerg Med. 2002;23:341–5.
  • King WK, Kiesel EL, Simon HK. Child abuse fatalities: are we missing opportunities for intervention? Pediatr Emerg Care. 2006;22:211–4.
  • Jenny C, Hymel KP, Ritzen A, Reinert SE, Hay TC. Analysis of missed cases of abusive head trauma. JAMA. 1999;281:621–6.
  • Carty H, Pierce A. Non-accidental injury: a retrospective analysis of a large cohort. Eur Radiol. 2002;12:2919–25.
  • Thorpe EL, Zuckerbraun NS, Wolford JE, Berger RP. Missed opportunities to diagnose child physical abuse. Pediatr Emerg Care. 2014;30:771–6.
  • Ravichandiran N, Schuh S, Bejuk M, et al. Delayed identification of pediatric abuse-related fractures. Pediatrics. 2010;125:60–6.
  • Shah MN, Cushman JT, Davis CO, Bazarian JJ, Auinger P, Friedman B. The epidemiology of emergency medical services use by children: an analysis of the National Hospital Ambulatory Medical Care Survey. Prehosp Emerg Care. 2008;12:269–76.
  • National EMS Information System (NEMSIS) Data Report: 2010 EMS Pediatric Data Demographics (Age 0–18 years). NEDARC, 2012. Available at: http://www.nedarc.org/emsdatasystems/nemsisReports/2010Demographics.html.
  • Stevens S, Alexander J. The impact of training and experience on EMS providers' feelings toward pediatric emergencies in a rural state. Ped Emerg Care. 2005;21:12–17.
  • Glaeser P, Linzer J, Tunik MG, Henderson DP, Ball J. Survey of nationally registered emergency medical services providers: pediatric education. Ann Emerg Med. 2000;36:33–8.
  • Markenson D, Tunik M, Cooper A, et al. A national assessment of knowledge, attitudes, and confidence of prehospital providers in the assessment and management of child maltreatment. Pediatrics. 2007;119:e103–8.
  • Mandatory Reporters of Child Abuse and Neglect: Summary of State Laws. 2008. Available at: https://www.childwelfare.gov/pubPDFs/mandaall.pdf.
  • Cook DA, Beckman TJ. Current concepts in validity and reliability for psychometric instruments: theory and application. Am J Med. 2006;119:166 e7–16.
  • Messick S. Validity. In: Linn RL, ed. Educational measurement (3rd ed., pp. 13–104). New York: American Council on Education and Macmillan, 1989.
  • American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for educational and psychological testing. Washington, DC: American Educational Research Association, 1999.
  • Sheets LK, Leach ME, Koszewski IJ, Lessmeier AM, Nugent M, Simpson P. Sentinel injuries in infants evaluated for child physical abuse. Pediatrics. 2013;131:701–7.
  • Harper NS, Feldman KW, Sugar NF, Anderst JD, Lindberg DM. Examining siblings to recognize abuse I. Additional injuries in young infants with concern for abuse and apparently isolated bruises. J Ped. 2014;165:383–8.
  • Maguire S, Mann M. Systematic reviews of bruising in relation to child abuse-what have we learnt: an overview of review updates. Evid Based Child Health. 2013;8:255–63.
  • Pierce MC, Kaczor K, Aldridge S, O'Flynn J, Lorenz DJ. Bruising characteristics discriminating physical child abuse from accidental trauma. Pediatrics. 2010;125:67–74.
  • Remick K, Caffrey S, Adelgais K. Prehospital provider scope of practice and implications for pediatric prehospital care. Clin Ped Emerg Med. 2014;15:9–17.
  • Zaveri P, Agrawal D. Pediatric education and training of prehospital providers: a critical analysis. Clin Ped Emerg Med. 2006;7:114–120.
  • Benger JR, Pearce V. Simple intervention to improve detection of child abuse in emergency departments. BMJ 2002;324:780.
  • Louwers EC, Korfage IJ, Affourtit MJ, et al. Effects of systematic screening and detection of child abuse in emergency departments. Pediatrics. 2012;130:457–64.
  • Louwers EC, Korfage IJ, Affourtit MJ, et al. Detection of child abuse in emergency departments: a multi-centre study. Arch Dis Child. 2011;96:422–5.
  • Sittig JS, Uiterwaal CS, Moons KG, Nieuwenhuis EE, van de Putte EM. Child abuse inventory at emergency rooms: CHAIN-ER rationale and design. BMC Pediatr. 2011;11:91.
  • Flaherty EG, Sege RD, Griffith J, et al. From suspicion of physical child abuse to reporting: primary care clinician decision-making. Pediatrics. 2008;122:611–9.
  • Clark KD, Tepper D, Jenny C. Effect of a screening profile on the diagnosis of nonaccidental burns in children. Pediatr Emerg Care. 1997;13:259–61.
  • Clayton M. Delphi: a technique to harness expert opinion for critical decision-making tasks in education. Educ Psychol. 1997;17:373–86.
  • Conroy KM, Elliott D, Burrell AR. Developing content for a process-of-care checklist for use in intensive care units: a dual-method approach to establishing construct validity. BMC Health Serv Res. 2013;13:380.
  • Cheung JJ, Chen EW, Darani R, McCartney CJ, Dubrowski A, Awad IT. The creation of an objective assessment tool for ultrasound-guided regional anesthesia using the Delphi method. Reg Anesth Pain Med. 2012;37:329–33.
  • Morgan PJ, Lam-McCulloch J, Herold-McIlroy J, Tarshis J. Simulation performance checklist generation using the Delphi technique. Can J Anaesth. 2007;54:992–7.
  • Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–74.
  • Light R. Measures of response agreement for qualitative data: some generalizations and alternatives. Psychol Bull. 1971;76:365.
  • Pierce MC, Magana JN, Kaczor K, et al. The prevalence of bruising among infants in pediatric emergency departments. Ann Emerg Med. 2016;67:1–8.
  • Sugar NF, Taylor JA, Feldman KW. Bruises in infants and toddlers: those who don't cruise rarely bruise. Puget Sound Pediatric Research Network. Arch Pediatr Adolesc Med. 1999;153:399–403.
  • Lockyer J, Singhal N, Fidler H, Weiner G, Aziz K, Curran V. The development and testing of a performance checklist to assess neonatal resuscitation megacode skill. Pediatrics. 2006;118:e1739–44.
  • Reid J, Stone K, Brown J, et al. The Simulation Team Assessment Tool (STAT): development, reliability and validation. Resuscitation. 2012;83:879–86.
  • Scavone BM, Sproviero MT, McCarthy RJ, et al. Development of an objective scoring system for measurement of resident performance on the human patient simulator. Anesthesiology. 2006;105:260–6.
  • Hunt EA, Hohenhaus SM, Luo X, Frush KS. Simulation of pediatric trauma stabilization in 35 North Carolina emergency departments: identification of targets for performance improvement. Pediatrics. 2006;117:641–8.
  • Schmutz J, Eppich WJ, Hoffmann F, Heimberg E, Manser T. Five steps to develop checklists for evaluating clinical performance: an integrative approach. Acad Med. 2014;89:996–1005.

Appendix 1

Narrative Description of Scenario

Sessions began with scripted instructions from the study team, followed by a short preparation period in which the participant became familiar with the infant simulator, SimNewB® (Laerdal, USA), by feeling its brachial and femoral pulses. The programming of the simulator and the scripting of the caregiver were standardized for each participant. The participant was blinded to the study goals throughout the simulation session. A trained standardized actress played the role of the abusive babysitter/caregiver, and the moderator (a study team member who was also an EMS instructor) played the role of the participant's EMS partner to ensure consistency and fidelity of the simulation across sessions. A research team member videotaped each session. The participant was allowed to ask questions of the moderator/EMS partner and the babysitter, both of whom provided standardized responses (e.g., temperature, pupillary response, capillary refill times, history of present illness) but did not make suggestions or provide guidance. The participant was instructed at the beginning of the simulation to act as the lead decision-maker but was allowed to use the moderator/EMS partner as an assistant for completing physical tasks such as monitoring heart rate. The participant was given information only if he/she asked for it but was encouraged to think aloud as much as possible so that research team members could capture cognitive processes on the checklist.

In the scenario, the participant arrives at the home of a 5-month-old, afebrile, seizing infant. The mother is absent, but the babysitter is present, nervous, and holding a cigarette in her hand as a prop. The infant is lying supine with generalized tonic-clonic shaking on an exam table draped to resemble a standard bed. The infant has no significant medical problems, is taking no medications, and has no allergies. One minute after EMS arrival, the infant stops seizing. When asked about what could have caused the seizure, the babysitter continues to be nervous and insists that the baby was fine when she fed him and placed him to sleep. She states she found the baby shaking when she checked on him. Vital signs were standardized and reported to the EMS provider when requested during and after the cessation of the seizure. Physical exam reveals that the SimNewB® has a red, hand-shaped slap mark on the left temporal region of the head (covered by a hat) and three cigarette burn marks on the medial back (covered by a front-zipped onesie). The burn marks were purposefully located on the posterior torso, as paramedics during the initial pilot testing noted that an anterior lesion would allow providers to detect the abuse even as they undressed the child to listen to the heart or lungs and would not test the ability of an EMS provider to undress the child for a skin exam. If asked about the head or back skin findings, the babysitter is unable to provide a plausible reason and offers explanations such as "maybe the baby hit the side of the crib."

At the participant's prompt, the SimNewB®, participant, and moderator move from the home to the ambulance and travel to the hospital. The moderator states that the participant may perform any additional exams that he/she would normally perform during this time. Upon arrival, the moderator switches to the role of the sign-out nurse and prompts the participant to report pertinent findings from the simulation. The simulation did not have a time limit and was considered complete at the end of the participant's verbal report.

Appendix 2

Description of General EMS Provider Training by Level and Pediatric Care¹

EMT-Basic (EMT-B):

100–110 hours of training, <10 hours of pediatric care

Skills include: history and physical exam skills, basic airway management, spinal immobilization, splint application, and assisted administration of non-prescription medications

EMT-Paramedic (EMT-P):

1000–1200 hours of training, <100 hours of pediatric care

Additional skills include: advanced airway management, pleural decompression, ECG and 12-lead interpretation and management, and administration of IV fluids, medications, and blood products

¹Source: The US Department of Transportation National Highway Traffic Safety Administration and the National Council of State EMS Training Coordinators.

Appendix 3

Appendix 4: Item Discrimination Statistics
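The appendix table itself is not reproduced here, but item discrimination for a performance checklist is commonly summarized with the point-biserial correlation between each item (dichotomized as done = 1, not done = 0, with "not applicable" responses excluded) and the participant's total score. The sketch below is illustrative only; the function name and the sample scores are hypothetical, not taken from the study data.

```python
import statistics

def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a dichotomous checklist item
    (1 = done, 0 = not done) and each participant's total score.
    Values near 0 suggest the item does not discriminate between
    high- and low-performing participants."""
    n = len(item_scores)
    mean_total = statistics.fmean(total_scores)
    sd_total = statistics.pstdev(total_scores)
    done_totals = [t for i, t in zip(item_scores, total_scores) if i == 1]
    p = len(done_totals) / n  # proportion of participants who completed the item
    if sd_total == 0 or p in (0.0, 1.0):
        return 0.0  # no variance in item or totals: discrimination undefined
    mean_done = statistics.fmean(done_totals)
    return (mean_done - mean_total) / sd_total * (p / (1 - p)) ** 0.5

# Hypothetical example: 4 participants, one checklist item
print(round(point_biserial([1, 1, 0, 0], [10, 8, 4, 2]), 3))
```

An item completed mostly by high scorers yields a correlation near 1; an item completed equally often at all performance levels yields a value near 0.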
