Dysphagia management: Does structured training improve the validity and reliability of cervical auscultation?

Abstract

Purpose: Cervical auscultation (CA) uses a stethoscope or microphone to complement the clinical swallow examination by interpreting swallowing sounds and swallow-respiratory coordination. This study investigated the effects of structured CA training on CA-rating agreement with Flexible Endoscopic Evaluation of Swallowing (FEES) and CA rater reliability.

Method: Thirty-nine speech-language pathologists participated in a structured CA training course at Gothenburg University. They rated nine swallow-respiratory sound recordings which were simultaneously recorded during FEES. Swallow sounds were rated six weeks prior to the CA-workshop using two binary yes/no questions, (1) Safe, (2) Dysphagia, and a third Dysphagia Severity rating. Swallow sounds were rated again (re-randomised) one month post CA-workshop.

Result: Agreement with FEES (validity) improved significantly (p < 0.05) pre-post training for the Safe and Dysphagia questions, with post training sensitivities >90% and specificities at 76% and 85% respectively. Dysphagia severity rating improved non-significantly. Intra-rater reliability improved significantly with kappa statistics >0.90 post training. Improvements for inter-rater reliability were noted, though non-significant.

Conclusion: Results demonstrate that with structured training, the validity of CA (to detect a Safe/Dysphagic swallow) significantly improves, as does intra-rater reliability. This is congruent with literature identifying the positive effects of structured training improving instrumental dysphagia assessment.

Introduction

Cervical auscultation (CA) is an adjunct to the clinical swallow evaluation and may use a stethoscope, microphone or accelerometer to listen to and interpret swallow-respiratory sounds and coordination, in paediatric and adult populations (Almeida, Ferlin, Parente, & Goldani, Citation2008; Borr, Hielscher-Fastabend, & Lucking, Citation2007; Cichero & Murdoch, Citation2002; Frakking, Chang, David, Orbell‐Smith, & Weir, Citation2019; Frakking, Chang, O’Grady, David, & Weir, Citation2017; Lagarde, Kamalski, & van Den Engel-Hoek, Citation2016; Leslie, Drinnan, Finn, Ford, & Wilson, Citation2004; Morinière, Boiron, Alison, Makris, & Beutter, Citation2008; Nozue et al., Citation2017; Stroud, Lawrie, & Wiles, Citation2002; Youmans & Stierwalt, Citation2011). Auscultation is regularly used in cardiac and respiratory medicine. It is often the first step in the investigative process to determine normal from abnormal heart and lung function. Cervical auscultation uses a similar approach, using sound to determine normal from abnormal swallowing function. Early CA studies focussed on the cause of swallowing sounds and the characterisation of swallowing sound properties in various cohorts (Cichero & Murdoch, Citation2002; Leslie et al., Citation2007; Morinière et al., Citation2008; Yamashita et al., Citation2014; Youmans & Stierwalt, Citation2011). More recent studies confirm the utility of CA as an adjunct to a clinical evaluation of swallowing as compared against instrumental assessments such as Videofluoroscopy Swallow Study (VFSS) or Fibre-optic Endoscopic Evaluation of Swallowing (FEES) (Bergström, Svensson, & Hartelius, Citation2014; Dudik, Kurosu, Coyle & Sejdić, Citation2018; Frakking et al., Citation2016; Frakking et al., Citation2017). Bergström et al. (Citation2014) used a prospective single-blinded study to evaluate the validity and reliability of CA in isolation, or when used in conjunction with clinical swallow examination (CSE). 
The researchers found that the paired condition of CA with the CSE showed a greater correlation with the FEES findings than CA used in isolation. Frakking et al. (Citation2016) demonstrated similar results, using a randomised controlled trial that compared clinical feeding evaluation in isolation with clinical feeding evaluation + CA, both evaluated against VFSS, for children referred for feeding/swallowing assessment. Frakking et al. (Citation2016) found the sensitivity of the feeding evaluation combined with CA to be 85%, compared with 63.6% for the clinical feeding evaluation in isolation. The paired CSE + CA condition also showed a negative predictive value of 91.9% with confidence to ‘rule out aspiration’, and a higher agreement with VFSS findings than the clinical exam in isolation. Dudik et al. (Citation2018) compared the swallow sounds of healthy individuals and an equal number of individuals with mixed aetiology of dysphagia. Using VFSS as the instrumental reference test and evaluating non-aspirating swallows, they concluded that there were distinct features that differentiated healthy controls from individuals with dysphagia, even where aspiration had not occurred; these results are also supported by earlier research (Almeida et al., Citation2008; Borr et al., Citation2007; Morinière et al., Citation2008; Youmans & Stierwalt, Citation2011).

Although there is a growing body of recent, robust studies demonstrating that the addition of CA to the clinical swallow examination significantly improves agreement with instrumental assessments, it is the results from early CA validity and reliability studies that have cast doubt on CA and the benefit to dysphagia management (Leslie et al., Citation2004; Leslie et al., Citation2007; Stroud et al., Citation2002). Several authors suggested poor or variable reliability of CA compared with an instrumental assessment to detect aspiration, with kappa statistics reported between K = 0.12–0.85 (Borr et al., Citation2007; Leslie et al., Citation2004; Stroud et al., Citation2002). Agreement between CA and instrumental assessment for determining normal from dysphagic (penetration/aspiration) swallows reported sensitivity data of 80–94% and specificity data of 70–94% (Borr et al., Citation2007; Leslie et al., Citation2004), which are considered acceptable values of validity (Swan, Cordier, Brown, & Speyer, Citation2019).

While early CA research reported poor/variable reliability, two main areas of methodological flaws can be identified in these earlier studies: (1) recording of the swallow sounds and (2) CA-rater training. Swallow sounds were recorded using suboptimal stethoscope/microphone placement in two studies (Leslie et al., Citation2004; Stroud et al., Citation2002), with one study recording sounds via a locally modified stethoscope with a microphone inserted into the stethoscope bifurcation (Leslie et al., Citation2004). Studies did not report stethoscope information regarding bell or diaphragm use, tubing, use of amplification or treatment of the sounds recorded for analysis (Leslie et al., Citation2004; Stroud et al., Citation2002). For all stethoscopes, the headpiece and tubing are known to affect the transmission characteristics of sounds with selective filtering (Richardson & Moody, Citation2000).

In terms of the second methodological difference, many of these earlier CA studies reported inconsistency in the type and amount of CA training. Training ranged from ‘no training’ to being ‘self-taught’ or attending 5 hours of training (details of training not available (Borr et al., Citation2007; Leslie et al., Citation2004; Stroud et al., Citation2002)). Similarly, clinician experience with CA in the Stroud et al. (Citation2002) and Leslie et al. (Citation2004) studies ranged from ‘no experience’ to having attended a course and using CA weekly in clinical practice.

Current trends in instrumental dysphagia assessment indicate that structured training is essential for optimal rater reliability and assessment outcomes (Dziewas et al., Citation2008; Martin-Harris et al., Citation2008; Nordin et al., Citation2017). Therefore, training presents as a critical feature for improving CA validity and rater reliability, especially given the more recent literature demonstrating its utility as an adjunct to the CSE. The current study investigated the effects of structured teaching and practical skill training in CA, embedded within a university curriculum, for graduated speech-language pathologists (SLPs). The hypothesis was that structured training would improve validity (sensitivity and specificity) and rater reliability, as measured against swallowing sounds recorded simultaneously during, and assessed via, FEES.

Method

Participants

Thirty-nine SLPs with 1–20 years (mean = 5.6 years) dysphagia experience and no formal CA training participated in this study. Participants were enrolled in the 2016 Gothenburg University LP9440 course (3 university credits). Inclusion criterion was that participants were practising SLPs working with dysphagia. No exclusion criteria were applied.

Procedure

The CA course consisted of a 2-day theoretical and practical workshop, learning the skills necessary to listen to and interpret swallow-respiratory sounds, with pre and post workshop learning activities (see Figure 1, Supplementary material). Workshop material and teaching were based on established training programs developed by author JC following doctoral studies, clinical experience and the growing body of literature. Established training had a strong focus on teaching the practical skills of CA with reference to normal and abnormal swallow-respiratory sound definitions (Cichero & Murdoch, Citation2006), with evidence-based literature and teaching methodology continuously updated and improved upon following feedback from CA teaching via workshops hosted by Speech Pathology Australia (2002–2016), Singapore General Hospital (2010), and the Hong Kong Speech Therapists Association (2012, 2015). The authors LB and JC had 18 and 25 years clinical CA experience (respectively) with 5 and 21 years CA research and teaching experience.

Figure 1. Cervical Auscultation (CA) Course Outline.

Six weeks prior to the workshop, participants rated nine swallow sounds (both dysphagic and non-dysphagic swallows, as determined by FEES). Post workshop, the swallow sounds were re-randomised and rated again by CA course participants within one month of workshop completion. CA course participants were blinded to swallow sound recording information and FEES results.

Swallow sound samples

The swallow-respiratory sounds were simultaneously recorded using a Littmann e3200 electronic stethoscope from patients routinely referred to FEES, as previously described (Bergström et al., Citation2014). Although both FEES and VFSS are considered to be the gold standard of dysphagia assessment (Swan et al., Citation2019), in this study FEES was the reference test against which the swallow-respiratory sound ratings were compared. The FEES outcome measures used for this study were:

  • the Penetration-Aspiration Scale (PAS) (Rosenbek, Robbins, Roecker, Coyle, & Wood, Citation1996),

  • two binary questions of whether the swallow was ‘safe’ (denoted by a PAS of ≤2), and whether the swallow was ‘dysphagic’; and

  • dysphagia severity Likert scale rating based on the Australian Therapy Outcome Measures (AusTOMs) (Skeat & Perry, Citation2005).

These questions were considered pertinent since patients may present with a dysphagic/inefficient swallow yet still be safe (i.e. no penetration/aspiration) on the same consistency (Bergström et al., Citation2014).

Flexible endoscopic evaluation of swallowing (FEES) ratings were determined by two experienced clinicians (12 and 23 years dysphagia experience) immediately after the FEES procedure and with a review of the audio-visual recording; 100% consensus agreement was obtained on all occasions (Bergström et al., Citation2014). One healthy volunteer (author LB) provided one swallow-respiratory sound recording to ensure an equal dysphagic versus non-dysphagic swallow representation. The healthy volunteer swallow was confirmed to be normal via FEES, though, unlike the patient recordings, the healthy swallow sound and FEES recordings were not made simultaneously. Not all patients presented with dysphagic swallows (as per FEES) and some patients showed dysphagic characteristics on one consistency but not the other (see Table I).

Table I. Swallow sound characteristics as recorded via FEES (Fiberoptic Endoscopic Evaluation of Swallowing).

Five patients (3 female, 2 male, mean age 64 years, range 50–84) with a range of demographic backgrounds and medical diagnoses, reflective of the usual speech pathology caseload within a large regional hospital (Supplementary material 3), provided the acoustic and instrumental samples used in the study, as described previously (Bergström et al., Citation2014). All patients provided one or two swallow-respiratory sound recordings, consisting of (1) 10 ml milk, IDDSI-0 (International Dysphagia Diet Standardisation Initiative [IDDSI]; Steele et al., Citation2018), via spoon, and/or (2) 10 ml Swedish ‘nyponsoppa’ nectar juice, IDDSI-2, via spoon. Five IDDSI-0 (thin fluid) swallows and four IDDSI-2 (mildly thick fluid) swallows were used for pre and post CA course assessment, including two duplicate swallow sounds randomly chosen for intra-rater reliability (see Table I).

Swallow sounds were saved as .wav sound files and sent to the CA course participants via USB or the TransferBigFiles.com website, along with set instructions and a study-specific rating form. CA course participants were required to listen to the swallow sounds, in order, and rate each swallow-respiratory sound clip according to the following three questions:

  1. Would you consider the patient to be safe on this consistency? (Binary yes/no question)

  2. Is the swallow dysphagic? (Binary yes/no question)

  3. Please rate the severity of dysphagia as per rating scale adapted from AusTOMs, (Skeat & Perry, Citation2005), where 0 = no dysphagia; 1 = mild dysphagia; 2 = moderate dysphagia; 3 = moderate-severe dysphagia; 4 = severe dysphagia. See Supplementary material 4.

Pre and post training CA-ratings were compared to assessment results from the FEES reference test (Bergström et al., Citation2014).

Statistics

Statistics were calculated using SAS version 9.4. Statistically significant difference between pre-post CA training was set at p < 0.05 and Confidence Intervals (CI) calculated to 95%. Descriptive statistics were also calculated.

Agreement with flexible endoscopic evaluation of swallowing (FEES)

Percent exact agreement was calculated for all three questions. Percent close agreement (+/− 1 point from FEES rating) was calculated for the Severity question (ordinal data) and kappa statistic calculations were completed for all three questions. For the Safe (Q1) and Dysphagia (Q2) questions (binary data), validity was calculated using sensitivity, specificity and area under the Receiver Operating Characteristic curve (aROC). Significance between pre and post data was analysed using a multi-step approach: (a) for each rater (n = 39), the difference in percent agreement between pre and post training was calculated (mean difference, standard deviation [SD], median, minimum and maximum difference values) and (b) statistically significant differences were analysed across CA raters using the Wilcoxon signed-rank test (non-normally distributed data).
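As a minimal illustrative sketch (written in Python rather than the SAS used for the study analyses), the validity measures named above could be computed for a single rater as follows. The function names and the nine example ratings are hypothetical and are not study data; True denotes that the condition (e.g. dysphagia) is present according to the rater or the FEES reference.

```python
def binary_validity(reference, ratings):
    """Compare binary CA ratings with the binary reference (FEES) outcome.

    True means the condition is present (for example, 'dysphagic').
    """
    pairs = list(zip(reference, ratings))
    tp = sum(1 for ref, rat in pairs if ref and rat)          # true positives
    tn = sum(1 for ref, rat in pairs if not ref and not rat)  # true negatives
    fp = sum(1 for ref, rat in pairs if not ref and rat)      # false positives
    fn = sum(1 for ref, rat in pairs if ref and not rat)      # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "percent_exact_agreement": 100 * (tp + tn) / len(pairs),
    }


def percent_close_agreement(reference, ratings, tolerance=1):
    """Percentage of ordinal ratings within +/- tolerance points of the reference."""
    close = sum(
        1 for ref, rat in zip(reference, ratings) if abs(ref - rat) <= tolerance
    )
    return 100 * close / len(reference)


# Hypothetical ratings for one rater and nine sound clips (not study data).
fees_dysphagic = [True, True, True, True, False, False, False, False, True]
ca_dysphagic = [True, True, True, False, False, False, True, False, True]
result = binary_validity(fees_dysphagic, ca_dysphagic)

# Hypothetical 0-4 severity ratings for the same nine clips.
fees_severity = [0, 1, 2, 3, 0, 0, 4, 2, 1]
ca_severity = [0, 2, 2, 1, 0, 1, 4, 2, 1]
close = percent_close_agreement(fees_severity, ca_severity)
```

In this invented example the rater misses one of five dysphagic swallows and wrongly flags one of four non-dysphagic swallows, giving a sensitivity of 0.80 and a specificity of 0.75.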

For Q3 (Severity Rating; ordinal data), correlation with FEES was analysed using Spearman’s correlation coefficient. Statistically significant difference between Severity Rating pre and post training was analysed using the above multi-step approach: (a) for each rater (n = 39), the difference in percent agreement between pre and post training was calculated as per above, and (b) statistically significant differences were analysed across CA raters using the Wilcoxon signed-rank test.
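For readers less familiar with rank correlation, the following Python sketch shows one common way to obtain Spearman’s coefficient: ranking both sets of ratings (tied values share the mean of their ranks) and computing Pearson’s correlation on the ranks. The helper names are hypothetical and this is not the SAS procedure used in the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's correlation: Pearson's correlation applied to average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / sqrt(var_x * var_y)
```

Severity ratings on a 0–4 scale contain many ties, which is why the sketch uses average ranks rather than the shortcut formula based on squared rank differences.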

Intra-rater reliability

For the Safe and Dysphagia questions, intra-rater reliability was calculated with percent exact agreement and Cohen’s kappa. Statistically significant differences between pre and post training were analysed using the Sign test to determine whether intra-rater reliability showed worse, no change or better scores post training.
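The two statistics named in this paragraph can be sketched in Python as below. This is an illustration under stated assumptions, not the study’s SAS code: the function names and example ratings are hypothetical, the kappa is returned as 1.0 when chance agreement equals 1 (a convention chosen here, since kappa is undefined in that case), and the Sign test is the exact two-sided version with ‘no change’ cases dropped.

```python
from collections import Counter
from math import comb


def cohens_kappa(first, second):
    """Cohen's kappa for two sets of categorical ratings of the same items."""
    n = len(first)
    observed = sum(1 for a, b in zip(first, second) if a == b) / n
    first_counts, second_counts = Counter(first), Counter(second)
    categories = set(first) | set(second)
    expected = sum(first_counts[c] * second_counts[c] for c in categories) / (n * n)
    if expected == 1:
        return 1.0  # assumption: kappa is undefined here; treated as full agreement
    return (observed - expected) / (1 - expected)


def sign_test_p(differences):
    """Exact two-sided Sign test; zero differences (no change) are dropped."""
    positives = sum(1 for d in differences if d > 0)
    negatives = sum(1 for d in differences if d < 0)
    n = positives + negatives
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical first and repeat ratings of the same clips (not study data).
first_pass = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
second_pass = ["yes", "yes", "no", "yes", "yes", "no", "no", "no"]
kappa = cohens_kappa(first_pass, second_pass)

# Hypothetical post-minus-pre changes: 8 better, 1 worse, 3 unchanged.
p_value = sign_test_p([1] * 8 + [-1] + [0] * 3)
```

In the invented example, six of eight repeat ratings match (75% exact agreement) while chance agreement is 50%, so kappa is 0.5.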

For the Severity Rating question (ordinal data), intra-rater reliability was calculated using percent exact agreement and weighted kappa. Statistically significant difference between Severity Rating pre and post training was analysed via a multi-step method: (a) the weighted kappa difference between pre and post training for each participant was calculated and (b) these differences across raters were then analysed using the Wilcoxon signed-rank test.
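A weighted kappa gives partial credit to near-misses on an ordinal scale. The Python sketch below assumes ratings coded 0 to 4 and uses linear disagreement weights by default; the weighting scheme used in the study is not stated, so linear weights are an assumption here, and the function name is hypothetical.

```python
def weighted_kappa(first, second, n_categories=5, power=1):
    """Weighted kappa for ordinal ratings coded 0 .. n_categories - 1.

    power=1 gives linear disagreement weights, power=2 quadratic weights.
    The choice of linear weights as the default is an assumption.
    """
    n = len(first)
    # Observed proportion of items in each (first, second) rating cell.
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(first, second):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(n_categories)]
    col = [sum(observed[i][j] for i in range(n_categories)) for j in range(n_categories)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = (abs(i - j) / (n_categories - 1)) ** power
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    if weighted_expected == 0:
        return 1.0  # assumption: all ratings fall in one identical category
    return 1 - weighted_observed / weighted_expected


# Hypothetical severity ratings of five clips on two occasions (not study data).
occasion_one = [0, 1, 2, 3, 4]
occasion_two = [1, 1, 2, 3, 4]
kw = weighted_kappa(occasion_one, occasion_two)
```

For the pre-post comparison described above, such a function would be applied once to each rater’s pre-training ratings and once to their post-training ratings, and the per-rater differences passed to the Wilcoxon signed-rank test.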

Inter-rater reliability

Mean percent agreement between all raters was calculated for all three questions (Safe, Dysphagia and Severity Rating). Pre and post training inter-rater reliability for the Safe and Dysphagia questions was calculated using Fleiss’ kappa with 95% confidence intervals (CI).
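Fleiss’ kappa extends chance-corrected agreement to more than two raters. The following Python sketch uses the standard formulation, taking a table in which each row is a sound clip and each column the number of raters choosing a category; the function name and the small example table are hypothetical, and confidence intervals are not computed.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from counts[i][j] = number of raters placing item i in category j.

    Every item must be rated by the same number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each item must have the same number of ratings")

    # Overall proportion of ratings falling in each category.
    p_category = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Proportion of agreeing rater pairs for each item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(p_item) / n_items
    expected = sum(p * p for p in p_category)
    if expected == 1:
        return 1.0  # assumption: every rating is in one category; kappa undefined
    return (observed - expected) / (1 - expected)


# Hypothetical table: four clips, three raters, columns = (yes, no); not study data.
example_counts = [[2, 1], [1, 2], [3, 0], [0, 3]]
kf = fleiss_kappa(example_counts)
```

In the invented table two clips are rated unanimously and two are split 2:1, giving a Fleiss’ kappa of one third.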

For the Severity Rating, inter-rater reliability was calculated using the mean of all pairwise weighted kappas (n = 780). Confidence Intervals were calculated using the pre and post training mean of the lower limit and mean of the higher limits. Significant difference between pre and post training was considered with reference to CI.

Interpretation

Agreement with FEES and between raters was calculated using Cohen’s kappa, Fleiss’ kappa and weighted kappa and interpreted using the original Landis and Koch (Citation1977) guidelines. Correlation with FEES (ordinal data by multiple raters) was calculated using Spearman’s correlation coefficient and interpreted using the Hinkle, Wiersma, and Jurs (Citation2003) correlation coefficient sizes. Sensitivity and specificity values were interpreted using qualitative descriptors (Lange & Lippa, Citation2017). See Table II for details of the interpretation of statistical analyses.

Table II. Interpretation of statistical analyses.
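As an illustration of how kappa values map to the descriptors used in the Results, the short Python sketch below encodes the bands from the original Landis and Koch (1977) guidelines. The function name is hypothetical, and the descriptors for correlation, sensitivity and specificity are not reproduced here.

```python
def landis_koch_label(kappa):
    """Return the Landis and Koch (1977) descriptor for a kappa value."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Example values of the size reported in this study.
labels = [landis_koch_label(k) for k in (0.58, 0.67, 0.77, 0.91)]
```

Under these bands a kappa of 0.67 or 0.77 is ‘substantial’, and a kappa above 0.90 is ‘almost perfect’.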

Ethical considerations

This study was conducted according to the Declaration of Helsinki. All participants provided informed consent voluntarily and were able to withdraw consent, participation and/or their submitted data (until publication) without risk of any repercussions whatsoever. Written study information was given to CA course participants after the 2-day workshop and those that wished to participate were able to submit their signed consent forms up until 4 weeks post workshop. The first author (LB) was available to answer any questions (verbally or via email) during this time. No consent was withdrawn.

Results

Agreement with FEES

Results for the ‘Safe’ question (Table III) demonstrated significant improvement post training (p < 0.001). The level of agreement with FEES (K = 0.67) is considered substantial (Landis & Koch, Citation1977). Sensitivity and specificity values were ‘very high’ and ‘high’ (>90% and 75% respectively). Results for the ‘Dysphagia’ question also demonstrated significant improvements (p = 0.008) post CA-course, with substantial FEES agreement of K = 0.77. ‘Very high’ sensitivity (93%) and ‘high’ specificity (85%) values were noted post training. For the ‘Severity Rating’, non-significant improvements were noted pre to post CA training. However, the CA Severity Ratings were highly correlated with FEES ratings (rs > 0.80), with substantial kappa agreement (K > 0.60), pre and post training.

Table III. Agreement with Fibreoptic Endoscopic Evaluation of Swallowing (FEES).

Rater reliability

Intra-rater reliability for all three questions (Safe, Dysphagia, and Severity Rating) improved significantly (p < 0.05) post training (Figure 2). The kappa statistic (K > 0.90) for all three questions indicated almost perfect agreement post training. Inter-rater reliability (Figure 3) showed non-significant improvements, indicated by overlapping CIs including the number zero. Of note, the mean inter-rater agreement for the Safe and Dysphagia questions improved from moderate agreement (K < 0.60) to substantial agreement (K > 0.60) post CA-training. For the Severity rating, mean agreement between all raters, both pre and post training, was moderate.

Figure 2. Intra-rater reliability pre-post training for (1) Safe, (2) Dysphagia, (3) Severity ratings, and significance between pre-post training. K = Cohen’s Kappa. Kw = Weighted Kappa. Significance = p < 0.05, indicated by bolded font.

Figure 3. Inter-rater reliability pre-post training for (1) Safe, (2) Dysphagia, (3) Severity ratings, and significance between pre-post training. Kf = Fleiss Kappa. Kw = Weighted Kappa. a = mean of 780 Weighted Kappa. ns = not significant as indicated by overlapping Confidence Intervals.

Discussion

Cervical auscultation has previously been reported in early CA literature as having poor/variable validity and reliability, with recommendations for these aspects to be improved (Borr et al., Citation2007; Lagarde et al., Citation2016; Leslie et al., Citation2004; Stroud et al., Citation2002). To the best of the authors’ knowledge, the current study is the first to evaluate the impact of structured CA training for SLPs and its effect on rater reliability and validity of CA, as compared with FEES. Post CA training, speech-language pathologists significantly improved their ability to identify if a swallow was safe and if a swallow was dysphagic confirming the hypothesis that training is an integral part of CA utility. As compared with FEES, results demonstrated substantial (kappa) agreement (>0.61), very high sensitivities (>90%), high specificities (>75%) and a high correlation (>0.70). Rater reliability also improved post CA training, though only the intra-rater reliability improvement was significant with kappa agreements >0.90.

These results differ from earlier CA studies investigating reliability and agreement with instrumental assessments (Borr et al., Citation2007; Leslie et al., Citation2004; Stroud et al., Citation2002; Zenner, Losinski, & Mills, Citation1995). Differences between the current study and previous studies may be explained by methodological incongruencies and can be broadly grouped into four categories: (a) inconsistent/no CA training, (b) swallow sound and sampling methodology, (c) assessment parameters and type of decision making for CA versus instrumental assessment, and (d) provision of definitions.

Cervical auscultation validity: Impact of training

Early CA studies describe variability in the amount and quality of CA training that SLPs received prior to participating in CA validity and reliability rating tasks. As compared with VFSS, Leslie et al. (Citation2004) reported CA sensitivity and specificity values of 62% and 66% respectively. However, the CA-training reported ranged from: no training, to self-taught, to ‘attendance at a course’ (details not reported). Borr et al. (Citation2007) included participants that had ‘no training’ to ‘CA-workshop attendance’ with sensitivities and specificities of 94% and 70% respectively. Two studies reported more consistent CA training. Stroud et al. (Citation2002) described 5 hours of CA-training and reported 86% sensitivity and 56% specificity, whilst Zenner et al. (Citation1995) reported ‘extensive’ CA-training, though number of hours and specific training details were not described in either study. Results from the current study demonstrate that following a structured CA training program very high sensitivities (>90%) and high specificities (>75%) were achieved.

Structured training and the use of descriptors have been shown to increase rater reliability in the instrumental dysphagia literature (Martin-Harris et al., Citation2008; Nordin et al., Citation2017; Pilz et al., Citation2016). Martin-Harris et al. (Citation2008) reported high intra and inter-rater agreement (≥80%) following standardised SLP training using established dysphagia descriptors. Similarly, Nordin et al. (Citation2017) used a structured program to improve VFSS rater reliability for specific features such as pharyngeal constriction ratio measurements, airway closure and total pharyngeal transit time, amongst others. All clinicians, regardless of previous experience, were able to achieve 80% agreement following training. These studies, however, reported rater reliability using percent agreement, which does not account for chance agreement, as opposed to Pilz et al. (Citation2016) and the current study, where further analysis using the kappa statistic was reported. Pilz et al. (Citation2016) investigated the rater reliability of two final year medical students evaluating FEES parameters following 10 hours of training. Their results demonstrated variable rater reliability, depending on the different FEES assessment parameters (including piecemeal deglutition, vallecula and piriform fossa pooling, rated using a 4-point Likert scale), a categorical aspiration/penetration/none question, and the penetration aspiration scale. Intra-rater reliability in that study was reported as K = 0.76–0.93, with inter-rater reliability reported as K = 0.58–0.88; results similar to the current study.

Cervical auscultation validity: Impact of swallow sound sampling parameters

Methodology must be considered when comparing the CA literature. Previous studies focussed exclusively on the swallowing sound in isolation; however, for validity to be accurately evaluated, swallow-respiratory sounds should be assessed using simultaneous CA and instrumental assessment, with an appropriately configured recording device. Zenner et al. (Citation1995) focussed exclusively on swallow sounds and reported highly variable sensitivity (23–83%) and specificity (62–93%) ratings as compared with VFSS, with swallow sounds recorded non-simultaneously (6–22 days prior to the VFSS). Additionally, Zenner et al. (Citation1995) and other research groups (Leslie et al., Citation2004; Stroud et al., Citation2002) used a range of different stethoscopes and/or non-standardised modifications to the recording devices/stethoscopes in their studies, which may further explain the poorer sensitivity and specificity values in these earlier results. In contrast, Frakking et al. (Citation2016) used a standard recording device (direct microphone attachment) and simultaneous VFSS and CA-recordings, with both swallow sounds and pre-post swallow-respiratory sounds for SLPs to rate. Both the current study and the Frakking study show the importance of swallow-respiratory sound sampling and the simultaneous recording of CA and instrumental assessment for high validity ratings.

Cervical auscultation: Impact of outcome measures and decision making

Research design can have unintended consequences. The assessment parameters and type of decision making for CA-ratings and instrumental evaluation in earlier research were different to the current study. In earlier CA literature, different rating scales were used for CA-ratings vs. instrumental assessment ratings. Leslie et al. (Citation2004) asked CA raters to identify if swallows were normal versus abnormal, with no definitions given, whereas the VFSS clinicians were asked to identify penetration/aspiration. Notably, these ratings assess two different parameters: (a) normal/abnormal (i.e. dysphagic or not), and (b) penetration/aspiration (i.e. safe or not).

These methodological differences may have further contributed to incongruencies between studies’ validity results. Provision of definitions for normal and abnormal swallow-respiratory coordination, direct teaching, and practical tasks to identify normal from abnormal acoustic parameters (Cichero & Murdoch, Citation2006) are key features of the current study’s training methodology. High validity and reliability results from more recent research (Frakking et al., Citation2016) further support the use of standard definitions in CA. In continued work, Frakking et al. (Citation2017) (a) asked clinicians to identify suspected aspiration in a yes/no dichotomous decision-making task in addition to having CSE information available, (b) ensured structured training and (c) provided established definitions. Sensitivity and specificity in that study (94% and 95% respectively) are classified as very high (Lange & Lippa, Citation2017).

High sensitivity and specificity are demonstrated when a dichotomous (e.g. yes/no) clinical decision is required, as demonstrated by the current study and research by Frakking and colleagues. The ability to accurately characterise binary decisions is also noted in medical auscultation. Kumar and Thompson (Citation2013) indicate that simply distinguishing normal from abnormal heart sounds and murmurs rather than making specific diagnoses may be a more realistic goal of doctor training. Kumar and Thompson’s study provides support for the current study’s questions of “Is the swallow dysphagic, yes or no?” and “Is the patient safe or not safe to swallow this fluid thickness/consistency?”. With this understanding, the design and results of the current study lend strength to the argument that CA should be used as a complement to the CSE. A combined CSE + CA clinical assessment that provides highly sensitive ratings of normality or abnormality increases appropriate referrals to instrumental examinations. This approach is similar to medical screening progression to instrumental diagnostics. In hindsight, better scores on the binary decision-making questions than on the 5-point AusTOMs Likert scale of dysphagia severity were to be expected. Accurate severity rating requires access to a range of patient information that cannot be captured by either a single swallow-respiratory assessment or an isolated instrumental assessment. Holistic dysphagia management requires multifaceted patient/clinical information and high-level clinical reasoning (Doeltgen, McAllister, Murray, Ward, & Pretz, Citation2018; McAllister et al., Citation2020).

Cervical auscultation: Reliability

Results from the current study demonstrate almost perfect intra-rater agreement with kappa statistics above 0.90, post training. Inter-rater agreement, however, was moderate to substantial, with weighted kappas of 0.58–0.64. These findings are both similar and different when compared to earlier research. Studies with no, or variable CA-training, and/or methodological differences as discussed above, report poor rater reliability results, with intra-rater reliability (kappa statistic) ranging from −0.12–0.85 and inter-rater reliability ranging from 0.17–0.28 (Borr et al., Citation2007; Leslie et al., Citation2004; Stroud et al., Citation2002; Zenner et al., Citation1995). In more recent literature Frakking et al. (Citation2017) present similar results to the current study with substantial to almost perfect intra and inter-rater reliabilities (K > 0.80). It is hypothesised that these higher reliability ratings are due to robust methodology using structured training, specific swallow-respiratory definitions and simultaneous CA and instrumental recordings.

Rater reliability results from the current study are also comparable to and/or better than rater reliability results seen in the instrumental assessment literature (Dziewas et al., Citation2008; Kelly et al., Citation2007; Pilz et al., Citation2016). Within those studies, intra-rater reliability ranges from K = 0.72–0.84 and Intra-class Correlation Coefficients (ICC) from 0.80–1.0, while inter-rater reliability ranges from K = 0.01–0.89 and ICC = 0.78. A recent systematic review by Swan et al. (Citation2019) summarised the psychometric properties of rating scales used by VFSS and FEES using a 4-point Likert scale where the methodological quality per study was presented as percentage of rating (Poor = 0–25%, Fair = 25.1–50%, Good = 50.1–75%, Excellent = 75.1–100%). Only two of 39 visuoperceptual measures from 45 research articles demonstrated excellent rater reliability (>75%). Interestingly, this systematic review identified that intra-rater reliability often presents higher figures than inter-rater reliability; trends that are reflected in the current study.

Holistic dysphagia assessment: Cervical auscultation as an adjunct to the clinical swallow exam

The current study investigates the validity and reliability of swallow and/or swallow-respiratory assessment as compared with FEES. Although results are within the higher range of agreement with FEES, they are based on the assessment of dysphagia solely using cervical auscultation, without other components of the CSE. Although this methodology was appropriate for a pre-post training study, it is not in line with the principles of evidence-based medicine (Sackett, Rosenberg, Gray, Haynes, & Richardson, Citation1996), is not reflective of clinical practice, and is contrary to emerging literature (Bergström et al., Citation2014; Frakking et al., Citation2016; Frakking et al., Citation2017; Lagarde et al., Citation2016). Cervical auscultation is an adjunct to the CSE and should not be used as a stand-alone tool.

The current study supports the use of CA as part of the CSE, specifically after appropriate CA training. The CSE has been criticised for its lack of diagnostic ability, particularly in identifying aspiration. Yet it has also been argued that the CSE is an essential part of holistic dysphagia management, both in acute care, with rapidly changing patient status and short lengths of stay, and in non-acute care, where patient behaviour, co-morbidities, fatigue, and fluctuations in physical, respiratory and cognitive abilities are important considerations for safe eating (American Speech-Language-Hearing Association (ASHA), Citation2020; McAllister et al., Citation2020). This is particularly relevant when considering oral intake over an entire meal and mealtime behaviours for people with dementia, COPD exacerbations, cerebral palsy, traumatic brain injury, and others (ASHA, Citation2020; Howle, Baguley, & Brown, Citation2014; Lewis, Walterfang, Velakoulis, & Vogel, Citation2018). The CSE + CA allows for functional assessment, ongoing follow-up and regular patient contact, facilitating person-centred care. Rather than a single examination, the CSE would be better described as a Clinical Swallow Examination and Review (CSE/review).

Cervical auscultation and CSE/reviews: Complementing evidence-based practice

Cervical auscultation adds value to the CSE/review, contributing further to evidence-based medicine (EBM) and optimal dysphagia management. According to Sackett’s model, evidence-based decisions combine and consider (1) relevant scientific evidence, (2) clinical judgement and (3) patient values and preferences (Sackett et al., Citation1996). However, EBM also includes the health system (e.g. availability of services, treatments, assistive devices, health insurance coverage) and the service organisation (e.g. spectrum and training level of health professionals, availability of diagnostic and treatment devices) (Gutenbrunner & Nugraha, Citation2020). The current study, with sensitivities of 93–99%, indicates that the addition of CA to the CSE/review could assist in more appropriate referrals to instrumental assessment and use of resources. Rather than replacing instrumental assessments, CA as part of the CSE/review should be used to complement holistic dysphagia management, including improved referral to instrumental assessments, prompt follow-up after instrumental assessment, and review and investigation of patient preferences during mealtimes. It is also likely to have a positive impact on cost-effective, high-quality, patient-centred care (Gutenbrunner & Nugraha, Citation2020; McAllister et al., Citation2020).
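As an illustration of what the sensitivity and specificity figures represent, the sketch below compares binary CA ratings against a reference standard such as FEES. The function name and all values are hypothetical and are not data from this study.

```python
def sensitivity_specificity(reference, index_test):
    """Return (sensitivity, specificity) of index_test against a reference standard.

    Both arguments are equal-length lists of booleans, where True means
    the condition (for example, dysphagia) was judged to be present.
    """
    if len(reference) != len(index_test):
        raise ValueError("lists must be of equal length")
    tp = fn = tn = fp = 0
    for ref, test in zip(reference, index_test):
        if ref and test:
            tp += 1  # condition present and detected
        elif ref and not test:
            fn += 1  # condition present but missed
        elif not ref and test:
            fp += 1  # condition absent but flagged
        else:
            tn += 1  # condition absent and correctly cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both positive and negative cases")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: nine recordings, reference standard versus CA rating.
reference_dysphagia = [True, True, True, True, True, False, False, False, False]
ca_dysphagia = [True, True, True, True, False, False, False, False, True]

sens, spec = sensitivity_specificity(reference_dysphagia, ca_dysphagia)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A high sensitivity means few dysphagic swallows are missed, which is the property most relevant when CA is used to decide who should be referred on for instrumental assessment.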

It should be emphasised that (a) the CSE/review + CA and (b) instrumental assessment provide complementary information for evidence-based management (Gutenbrunner & Nugraha, Citation2020; Sackett et al., Citation1996). Instrumental investigations provide specific point-in-time diagnostic information from visual observation of penetration, aspiration and pathophysiological causes of dysphagia, which underpins effective rehabilitation (Swan et al., Citation2019). CA provides acoustic information regarding swallow-respiratory coordination not clearly captured in VFSS or FEES (Frakking et al., Citation2017; Nozue et al., Citation2017; Yamashita et al., Citation2014). The CSE/review + CA complements holistic and person-centred care by providing opportunity for regular review of patient/dysphagia progression and management adaptation, rather than just a snapshot of a swallow assessment. The benefits of CSE + CA include ongoing review of the patient’s ability to manage compensatory strategies throughout mealtimes, and of their ability (adherence) and accuracy in completing dysphagia manoeuvres/exercises. The opportunity for ongoing CSE/review + CA not only supports evidence-based intervention, but also allows for training and education of the patient and carers to ensure individualised goal-setting, rehabilitation and discharge planning (Bergström et al., Citation2014; Doeltgen et al., Citation2018; McAllister et al., Citation2020).

Finally, recent research argues for training and optimising swallow-respiratory coordination to improve swallow function (Curtis, Dakin, & Troche, Citation2020; Martin-Harris et al., Citation2015; McFarland et al., Citation2016). The added swallow-respiratory information from CA may further contribute to rehabilitation programs that use dysphagia manoeuvres/strategies (modified supraglottic swallow, chin tuck, extra swallows to clear pharyngeal residue) during a functional mealtime review. Further research in this area is warranted.

Limitations and future research

Areas for improvement of the present study and recommendations for future work include repetition of the study with (a) a larger sample size and (b) the inclusion of longer swallow sound recordings which routinely capture pre and post swallow respiratory phases. The current study was part of a university course that included 39 SLPs rating 9 swallow sounds; however, the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) recommend more than 50 participants (Swan et al., Citation2019). Post hoc sample size calculations identify that 38 swallow samples (without duplications) would have been required to achieve 80% power.

A consistent sampling of pre and post swallow respiration in addition to the swallow sound is recommended. Samples within this study ranged from 3 seconds (swallow sound only) to 22 seconds (pre-swallow respiration, swallow/s, post-swallow respiration). Investigation of reliability over a range of liquid thicknesses and solid bolus consistencies is also recommended, given the variability reported in both the CA and FEES literature (Bergström et al., Citation2014; Pilz et al., Citation2016).

Training approaches also warrant further research. This study incorporated several different teaching methodologies (pre-workshop theory, evidence, hands-on practical skills, problem-based learning with case examples, blended and peer learning, implementation, reflection and evaluation components); however, the effects of teaching format, including session frequency and face-to-face versus on-line learning, have not been investigated. Nordin et al. (Citation2017) demonstrated the positive cumulative effect of weekly structured training using VFSS. Additionally, the Modified Barium Swallow Measurement Tool for Swallow Impairment (MBS-Imp) is entirely independent of face-to-face teaching and this structured, on-line training program has shown promising results (Martin-Harris et al., Citation2008). Given the current challenges of classroom and face-to-face learning secondary to COVID-19, future CA courses should investigate the benefits of on-line training and remote access to a set of swallow sounds, allowing for weekly practice over a designated time frame.

It would also be beneficial to investigate the functional outcomes of dysphagia assessment rather than impairment-related outcomes (e.g. aspiration). It is argued that functional outcomes of dysphagia are of greater importance than the presence/absence of aspiration per se. Aspiration is only one (non-obligatory) sign of dysphagia and does not necessarily lead to aspiration pneumonia (Langmore et al., Citation1998). Other parameters might include nutrition and hydration outcomes, oral versus non-oral routes of intake, functional independence versus the need for assistance or close supervision, and confirmed aspiration pneumonia (American Speech-Language-Hearing Association, Citation2020; Lewis et al., Citation2018; Murray, Scholten, & Doeltgen, Citation2018). The outcomes of holistic dysphagia assessment and management should be evaluated considering the personal impact on activity, participation and wellbeing, not just the impairment, in keeping with the International Classification of Functioning, Disability and Health (ICF) (Skeat & Perry, Citation2005).

Conclusion

The current study demonstrates that SLPs who have undergone structured CA training, with definitions of normal and abnormal swallow-respiratory sounds, significantly improved their ability to identify if a swallow was safe and if a swallow was dysphagic. Results demonstrated substantial (kappa) agreement with FEES, very high sensitivities and high specificities. Rater reliability also improved post CA training, though only intra-rater reliability improved significantly with very high kappa agreements. This study adds to the growing body of evidence demonstrating that structured training improves dysphagia assessment, and further supports the use of CA as an adjunct to the CSE/review, which is key for holistic, person-centred dysphagia management.

Disclosure statement

The authors report no declarations/conflict of interest.

References