Original Article

The effect of auditory training on listening effort in hearing-aid users: insights from a pupillometry study

Received 28 Jun 2022, Accepted 04 Jan 2024, Published online: 30 Jan 2024

Abstract

Objective

The study investigated how auditory training affects the effort exerted by hearing-impaired listeners in a speech-in-noise task.

Design

Pupillometry was used to characterise listening effort during a hearing in noise test (HINT) before and after phoneme-in-noise identification training. Half of the study participants completed the training, while the other half formed an active control group.

Study sample

Twenty experienced hearing-aid users aged 63 to 79 years.

Results

Higher peak pupil dilations (PPDs) were obtained at the end of the study than at the beginning in both groups of participants. The analysis of pupil dilation in an extended time window revealed, however, that the magnitude of the pupillary response increased more in the training group than in the control group. The effect of training on effort was observed in pupil responses even when no improvement in HINT performance was found.

Conclusion

The results demonstrate that a listening effort metric provides insights into the effectiveness of auditory training beyond those obtained when only speech-in-noise performance is considered. Trends observed in pupil responses suggested increased effort, both after the training and after the placebo intervention, most likely reflecting the effect of the individual’s motivation.

Introduction

Recent evidence suggests that auditory training has the potential to support aural rehabilitation in individuals with hearing loss, especially when combined with the use of hearing devices (Stropahl, Besser, and Launer 2020). Training programs for hearing-impaired populations are designed to improve specific perceptual or cognitive skills relevant for speech perception in noise (e.g., Ferguson et al. 2014; Humes et al. 2009; Serman 2012; Sweetow and Sabes 2006; Whitton et al. 2017; Woods et al. 2015). The desired scenario is that the skills refined via training will contribute to better speech understanding in everyday life situations, including a variety of untrained stimuli and conditions. Typically, the generalisation of training is assessed using an untrained speech intelligibility task administered before and after the intervention (for reviews, see Henshaw and Ferguson 2013; Stropahl, Besser, and Launer 2020). Significant improvement in the training recipients’ performance (compared to a control group) is interpreted as evidence for the training’s efficacy.

Nevertheless, evaluating the success of an intervention based only on speech recognition metrics has certain limitations. The result of a speech intelligibility test does not directly quantify how difficult it is for an individual to achieve a given performance. It has been shown that hearing-impaired listeners need to invest more resources to recognise words in noise than an age-matched group with normal hearing (McCoy et al. 2005). This finding illustrates why, in the longer run, daily communication is a demanding and exhausting task for hearing-impaired listeners. Individuals with hearing disabilities indeed report higher perceived listening effort and fatigue levels than people without auditory deficits (Alhanbali et al. 2017). Increased conversation effort appears to be a highly distressing consequence of hearing loss, as it has been associated with a higher perceived handicap (Gatehouse and Noble 2004). Quantifying listening effort therefore seems essential to better describe people’s listening and communication difficulties.

Moreover, measuring listening effort adds value on top of the “traditional” intelligibility tests, e.g., when evaluating the success of hearing loss remediation strategies. It has been shown that metrics of subjective listening effort in noise represent a different dimension of hearing-aid outcome than metrics related to speech recognition (Humes 1999). Physiological correlates of listening effort can reveal the benefit of hearing-aid noise reduction even when intelligibility remains unchanged (Wendt, Hietkamp, and Lunner 2017). It is plausible that using effort-related metrics would provide valuable insights into the effectiveness of interventions other than hearing aids, such as auditory training. However, to the authors’ knowledge, only two studies on auditory training have taken this approach so far (Henshaw and Ferguson 2014; Kuchinsky et al. 2014).

Henshaw and Ferguson (2014) administered a dual auditory and memory task designed by Howard, Munro, and Plack (2010) to a group of hearing-aid users who underwent phoneme-in-noise discrimination training. The primary (auditory) task was to recognise and repeat words presented in quiet or in two noise conditions, whereas the secondary (memory) task was to remember digits displayed on a screen prior to the auditory task and to recall them after the auditory task was completed. In the dual-task paradigm, the performance in the secondary task is assumed to reflect spare cognitive resources not utilised to solve the primary task and has thus been used to characterise listening effort (McGarrigle et al. 2014). Henshaw and Ferguson (2014) reported an improvement in the dual-task score (primary and secondary task taken together) in one of the noise conditions, which was interpreted as a change in the allocation of mental resources to the task.

Kuchinsky et al. (2014) used pupillometry to investigate the impact of a frequent-word training program developed by Humes et al. (2009) on effort during a word-recognition task in older adults with hearing loss. Pupillometry is known to capture cognitive load during a behavioural task (Piquado, Isaacowitz, and Wingfield 2010) and has been widely applied in hearing science to probe listening effort (Naylor et al. 2018). Kuchinsky et al. (2014) reported that the training resulted in an increased overall dilation and a faster peak of the responses, neither of which was observed in the non-training control group. The authors interpreted these two trends as biomarkers of increased arousal and a more rapid discrimination of words in noise, respectively. The study demonstrated that combining a speech-in-noise task with pupillometry can reveal some aspects of training that are not reflected in behavioural results. Yet, the task used to obtain pupil responses was very similar to the one used in the training (closed-set monosyllabic word recognition in noise). It remains unknown whether and how training effects on effort would be reflected in pupil responses to untrained and more complex speech material.

The present investigation focused specifically on a training paradigm targeting the identification of low-level speech units: phonemes. Significant on-task improvements, i.e., better consonant identification thresholds in noise (Woods et al. 2015) and higher vowel and post-vowel consonant recognition scores (Koprowska et al. 2022), were observed in hearing-aid users after this type of training. However, neither of these studies reported generalisation of learning to more complex speech material, such as the hearing in noise test (HINT). Woods et al. (2015) explained the lack of a significant effect in HINT by the predominant role of top-down processes in the comprehension of such high-context sentence material. Reconstructing the meaning of a sub-optimally perceived sentence is possible based, e.g., on contextual cues. However, this process imposes additional demands on cognitive resources, as described, for example, by the Ease of Language Understanding (ELU) model (Rönnberg et al. 2013). Improving phoneme discrimination should lower the need to involve extra resources in compensating for ambiguities in the speech signal. Thus, the effort needed to understand the sentences might be lower after the training, even if the intelligibility remains similar.

The present study tested this hypothesis by measuring listening effort in experienced hearing-aid users before and after phoneme-in-noise training. Task-evoked pupil responses were recorded while participants performed the Danish HINT (Nielsen and Dau 2010), an untrained speech material that was more complex and realistic than the trained stimuli. Pupillometry was selected because it is a well-established method for measuring cognitive load during a speech-in-noise recognition task. Moreover, using pupillometry allowed for extending the findings by Kuchinsky et al. (2014) to a different training paradigm, speech-in-noise test and experimental group set-up. On the other hand, comparing two pupillometry measurements taken on two different days with a longer interval in between comes with certain complications. Pupil size has multiple sources of variability that are unrelated to the cognitive load. These include caffeine intake (Abokyi, Owusu-Mensah, and Osei 2017), arousal (for example, caused by a novel situation such as participating for the first time in a research study) and even the time of day (Eggert et al. 2012). While planning and executing a study, care should be taken that the potential contribution of these factors is as similar as possible across the two visits.

Three research questions were addressed. First, it was investigated whether listening effort was lower after the training than before. Two pupil metrics were considered: the peak pupil dilation (PPD), defined as the maximum of the pupil response function, and the overall time course of the response, characterised with the help of growth curve analysis (GCA). After the training, reduced PPDs, as well as reduced overall responses, were expected as indicators of lower effort.

Second, the sensitivity of pupillometry to the potential effect of training on listening effort in a situation when there is no change in intelligibility was assessed. For this purpose, the pupillary responses were measured at two signal-to-noise ratios (SNRs) corresponding to the performance levels of 50% (SNR50) and 80% (SNR80). It was expected that the effect of training on intelligibility would only be seen in the SNR50 condition since in the SNR80 condition, the performance would be close to ceiling. It was also anticipated that the release from effort would be reflected in the pupil metrics regardless of the performance level.

The third question was whether active participation without the training elicits significant changes in pupil responses. As opposed to the study by Kuchinsky et al. (2014), the present experiment involved not a passive but an active control group. Kuchinsky et al. (2014) used GCA to model pupil responses and reported no significant changes in the obtained parameters in the passive control group. The present study applied the same method to investigate whether any changes occurred in the active control group.

Materials and methods

Participants and study design

Twenty individuals aged 63 to 79 years (eight female) with at least one year of experience using hearing aids participated in the study. All of them had a mild-to-moderate, symmetrical hearing loss. Mild-to-moderate was defined as a better-ear pure-tone average (PTA) of thresholds at 500, 1000, 2000 and 4000 Hz within the range of 26-55 dB hearing level (dB HL); symmetrical was defined as an interaural difference between the PTAs no greater than 10 dB, as in Noble and Gatehouse (2004). When enrolled in the study, none of them reported eye diseases or taking medications that could affect the pupillary response, as listed in Winn et al. (2018). All individuals participating in the study attended two measurement visits with a two-week interval. Between these visits, one group of the participants (n = 10) completed six sessions of phoneme-in-noise training while the other group (n = 10) received a control intervention. The study was approved by the Science-Ethics Committee for the Capital Region of Denmark (reference H-16036391). All the participants signed an informed consent form and received financial compensation.
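
The audiometric inclusion criteria described above can be illustrated with a minimal sketch. It is not the authors' screening code: the function names and the dictionary-based audiogram format are hypothetical, and the inclusive handling of the 26 and 55 dB HL boundaries is an assumption.

```python
from statistics import mean

# The four PTA frequencies named in the text.
PTA_FREQS_HZ = (500, 1000, 2000, 4000)


def pta(thresholds_db_hl):
    """Pure-tone average (dB HL) over 500, 1000, 2000 and 4000 Hz.

    `thresholds_db_hl` maps frequency in Hz to threshold in dB HL
    (hypothetical input format).
    """
    return mean(thresholds_db_hl[f] for f in PTA_FREQS_HZ)


def meets_hearing_criteria(left, right):
    """Check the mild-to-moderate and symmetry criteria for one listener."""
    pta_left, pta_right = pta(left), pta(right)
    better_ear = min(pta_left, pta_right)          # lower PTA = better ear
    mild_to_moderate = 26 <= better_ear <= 55      # boundaries assumed inclusive
    symmetrical = abs(pta_left - pta_right) <= 10  # interaural PTA difference
    return mild_to_moderate and symmetrical
```

For example, a listener with PTAs of 41.25 and 46.25 dB HL would satisfy both criteria, whereas a listener with a 30 dB interaural PTA difference would not.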

The summary of the groups’ characteristics (age, hearing loss and working memory) is presented in Table 1. The groups were homogeneous in terms of age and hearing loss, but a significant difference between the auditory digit span scores was observed (t = −2.98, p < .01). The training group had, on average, lower working memory capacity than the control group.

Table 1. Group means, standard deviations (SD) and the results of between-group t-test (t-statistics and p-values) for age, PTA, and auditory digit span (DS) score. Significant difference between the groups is indicated in bold.

Phoneme-in-noise training and control activity

The phoneme-in-noise training method was a modification of the SchooLo program (Serman 2012), which had previously been used in studies with cochlear-implant patients (Schumann et al. 2015) and normal-hearing listeners (Schumann, Garea Garcia, and Hoppe 2017). The procedure was self-administered and self-paced, enabling multiple stimulus repetitions (both before and after the response) and providing auditory and orthographic feedback on a trial-to-trial basis. Up to three attempts were allowed to provide a correct response for each trained item (each with a reduced difficulty level). The training material consisted of logatomes from the Danish nonsense word corpus (DANOK; Nielsen and Dau 2019) spoken by two different talkers (male and female) and presented in background noise that matched the magnitude spectrum of the voices. The SNR was adjusted throughout the training based on the performance. The stimuli were delivered via a loudspeaker, and the participants wore their own hearing aids. The details of the training procedure are presented in Koprowska et al. (2022).

The control activity involved listening to audiobooks and answering multiple-choice questions, a procedure that had previously been developed to record electrophysiological responses to running speech (Bachmann, MacDonald, and Hjortkjær 2021). The stimuli were delivered via ER-3A insert phones (Etymotic Research), and audiogram-based linear amplification using the ‘Cambridge’ formula (Moore, Glasberg, and Stone 1999) was provided, except for the fifth session, in which the speech was unamplified. The participants were seated in a soundproof booth with an EEG cap on. During the listening exercises, neural responses were collected with the help of an Active Two system (BioSemi). The electrophysiological data were used to investigate the test-retest reliability of subcortical responses to running speech in older hearing-impaired adults (for details, see Bachmann 2021).

Both the training and the control intervention comprised six visits to the research facility. These visits were distributed over two weeks at two- to three-day intervals (with minor exceptions due to the participants’ and facilities’ availability). This schedule was based on previous studies that used the same training program (Schumann, Garea Garcia, and Hoppe 2017; Schumann et al. 2015). The listening part of each visit lasted for approximately 1.5 hours. The participants attended all the sessions, except for two individuals in the training group who completed five sessions out of six (due to technical problems with their hearing aids).

Experimental procedure

During the pre- and post-intervention assessment, sentence intelligibility was measured using the Danish HINT (Nielsen and Dau 2010). The five-word sentences spoken by a male talker were presented in stationary speech-shaped noise. The stimuli were played from a loudspeaker placed at a one-meter distance from the listener. The participants were instructed to repeat the sentences as precisely as possible and were encouraged to guess if they were uncertain. The responses were scored by an audiologist who was a native Danish speaker. The participants performed the task while wearing their own hearing aids.

The measurement started with an estimation of the SNRs that, for a given individual, corresponded to intelligibility scores of 50% (SNR50) and 80% (SNR80). This part of the measurement did not involve eye-tracking. After a training list was used to familiarise the participants with the task, two sentence lists of 20 sentences each were administered. The noise, fixed at 65 dB SPL, started one second before and ended one second after the sentence. The speech level was adjusted adaptively from trial to trial to achieve the desired performance. A word-scoring procedure was applied. To estimate the SNR50, the speech level adjustment varied between +2 dB and −2 dB, depending on the number of correctly repeated words in the sentence. The speech level was increased by 2 dB if none of the five words was correctly repeated and decreased by 2 dB when all five words were correctly repeated. For one, two, three and four correctly recognised words, the step size was +1.2, +0.4, −0.4 and −1.2 dB, respectively. To estimate the SNR80, the speech level adjustment varied between +3.2 dB (zero correctly repeated words) and −0.8 dB (five correctly repeated words). The step size was +2.4, +1.6, +0.8 and 0 dB for one, two, three and four correctly recognised words, respectively. For the first four sentences, the magnitude of the adjustment was multiplied by two. The order of SNR50 and SNR80 estimation was balanced across the participants.
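
The word-scored step rule described above can be written out as a short sketch. The step tables are taken directly from the text; the function names, the zero-based trial index and the starting level are illustrative assumptions, and the sketch does not cover how the final SNR estimate was derived from the track, which the text does not specify.

```python
# Speech-level change in dB, indexed by the number of correctly
# repeated words (0 to 5) in a five-word sentence.
SNR50_STEPS_DB = (2.0, 1.2, 0.4, -0.4, -1.2, -2.0)
SNR80_STEPS_DB = (3.2, 2.4, 1.6, 0.8, 0.0, -0.8)


def speech_level_step(n_correct, trial_index, steps):
    """Return the speech-level adjustment (dB) after one sentence.

    `trial_index` is zero-based; the step is doubled for the first
    four sentences of a list, as stated in the text.
    """
    step = steps[n_correct]
    if trial_index < 4:
        step *= 2
    return step


def run_track(start_level_db, words_correct_per_trial, steps):
    """Apply the rule over a list and return the speech level before each
    sentence plus the level after the last one (hypothetical helper)."""
    levels = [start_level_db]
    for i, n_correct in enumerate(words_correct_per_trial):
        levels.append(levels[-1] + speech_level_step(n_correct, i, steps))
    return levels
```

Both tables follow a constant slope of 0.8 dB per word. The SNR50 steps cancel out at an average of 2.5 correct words (50%), and the SNR80 step is zero at four correct words (80%), which is consistent with the targeted performance levels.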

In the following part of the assessment, the participants’ pupil size was registered while they performed the sentence intelligibility task at a constant SNR. The pupil size was recorded at a sampling rate of 500 Hz with an EyeLink camera (SR Research, Canada). The data were collected in a room without windows, and the luminance conditions were the same for all measurement sessions. The camera was fixed to the participant’s chair with an adjustable arm and located in front of the person. The participants were asked to fixate on a visual target located on the wall behind the loudspeaker. An adjustable headrest was used to ensure the participants’ comfort and stabilise the head position. After one training list to familiarise participants with the eye-tracker, two test conditions were administered: one corresponding to 50% intelligibility and one corresponding to 80% intelligibility. The noise level at a one-meter distance was fixed at 65 dB SPL, and the speech level was set to achieve the desired SNR (SNR50 or SNR80). The order of the two conditions was balanced across the participants.

Each trial (corresponding to one sentence) followed a specific sequence of events to obtain a reliable task-evoked pupil response (Winn et al. 2018). The trial began with two seconds of silence and three seconds of noise. Keeping the initial noise interval constant across the trials was intended to prepare the participants for the stimulus and avoid a situation where the sentence occurred unexpectedly, possibly causing a task-unrelated dilation. After the pre-stimulus phase, the target sentence was presented while the noise continued playing. The average duration of the sentence was 1.5 seconds. A three-second retention interval followed the sentence to separate the dilation evoked by listening to the stimulus from the dilation related to verbal response. The noise continued during this phase. The participants were instructed not to repeat the sentence until the noise ended. After the offset of the noise, the participant provided a verbal response. The experimenter manually triggered the beginning of the next trial a few seconds after the response to allow the pupil to return to its baseline size.
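
The trial sequence can be summarised as event times relative to the start of the trial. The sketch below only adds up the durations given in the text; the function name and dictionary keys are hypothetical, and the default sentence duration is the reported average rather than the duration of any particular sentence.

```python
def trial_timeline(sentence_duration_s=1.5, silence_s=2.0,
                   noise_lead_s=3.0, retention_s=3.0):
    """Return event times (s) relative to trial start.

    Defaults follow the text: 2 s of silence, 3 s of noise before the
    sentence, a sentence of about 1.5 s, and a 3 s retention interval.
    """
    noise_onset = silence_s
    sentence_onset = noise_onset + noise_lead_s
    sentence_offset = sentence_onset + sentence_duration_s
    noise_offset = sentence_offset + retention_s  # verbal response follows
    return {
        "noise_onset": noise_onset,
        "sentence_onset": sentence_onset,
        "sentence_offset": sentence_offset,
        "noise_offset": noise_offset,
    }
```

With the default values, the sentence starts 5 s into the trial and the noise ends at 9.5 s, after which the participant responds.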

At the post-intervention visit, the test lists were administered at the SNR50 and SNR80 defined before the intervention, such that for each individual the acoustic conditions (and not the performance level) were held constant across the visits. Nevertheless, two adaptive lists were also run during the post-intervention visits in order to keep the duration of the two measurement sessions approximately the same. A substantial difference in measurement time across the visits would most likely have resulted in different levels of within-session fatigue, which would have made the comparison of the obtained pupillometry results less reliable.

It has been shown that the pupil diameter is smaller in the afternoon than in the morning (Eggert et al. 2012). To avoid within-subject variation of the baseline pupil size, the measurements were planned in such a way that for each participant the pre-intervention and post-intervention visits took place at the same time of day. For two participants in the training group and one in the control group, the second visit had to be shifted by 4-5 hours due to logistical challenges. To limit the impact of caffeine intake on the pupillary responses, the participants were instructed not to drink coffee for at least six hours before the measurement session.

Pupillometry data processing

Raw pupil data were pre-processed in MATLAB using the PUPILS pipeline (Relaño-Iborra and Bækgaard 2020), which included blink detection, interpolation and low-pass filtering. Blinks were detected by identifying values differing by more than two standard deviations from the mean of the acquisition corresponding to one sentence list. Datapoints within 50 milliseconds before and 150 milliseconds after the blinks (Winn et al. 2018) were removed and linearly interpolated. Next, the data were denoised by applying low-pass filtering with a cut-off frequency of 10 Hz. Trials with more than 25% of the data interpolated within the relevant time interval (between the onset and the offset of the sound) were removed from further analysis (5% of all the recorded pupil traces).
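
The blink-handling steps can be sketched as follows. This is not the PUPILS pipeline, which is a MATLAB toolbox; it is a simplified, standard-library illustration of the thresholds named in the text (two standard deviations, 50 ms and 150 ms padding, 25% rejection criterion). All function names are hypothetical, the use of the population standard deviation is an assumption, and the 10 Hz low-pass filter is omitted because the filter type is not specified.

```python
from statistics import mean, pstdev


def blink_mask(samples, fs_hz=500, n_sd=2.0,
               pad_before_s=0.050, pad_after_s=0.150):
    """Flag samples deviating more than `n_sd` SDs from the mean of the
    recording (one sentence list), padded by 50 ms before and 150 ms after."""
    mu, sd = mean(samples), pstdev(samples)
    before = round(pad_before_s * fs_hz)
    after = round(pad_after_s * fs_hz)
    mask = [False] * len(samples)
    for i, x in enumerate(samples):
        if abs(x - mu) > n_sd * sd:
            for j in range(max(0, i - before), min(len(samples), i + after + 1)):
                mask[j] = True
    return mask


def interpolate_linear(samples, mask):
    """Replace masked samples by linear interpolation between valid neighbours."""
    out = list(samples)
    n, i = len(out), 0
    while i < n:
        if not mask[i]:
            i += 1
            continue
        start = i
        while i < n and mask[i]:
            i += 1
        left = out[start - 1] if start > 0 else None
        right = out[i] if i < n else None
        if left is None and right is None:
            raise ValueError("no valid samples to interpolate from")
        # At the edges of the recording, hold the nearest valid value.
        left = right if left is None else left
        right = left if right is None else right
        span = i - start + 1
        for k in range(start, i):
            out[k] = left + (k - start + 1) / span * (right - left)
    return out


def interpolated_fraction(mask):
    """Share of interpolated samples within the interval covered by `mask`."""
    return sum(mask) / len(mask)


def keep_trial(mask_in_sound_interval, max_fraction=0.25):
    """Apply the 25% criterion to the mask between sound onset and offset."""
    return interpolated_fraction(mask_in_sound_interval) <= max_fraction
```

In use, `blink_mask` would be computed on the recording of a whole sentence list, and `keep_trial` would then be applied to the portion of the mask between the sound onset and offset of each trial.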

The data within each trial were baseline-corrected by subtracting the average pupil size within one second before the sentence onset. Furthermore, all traces were visually inspected for baseline artefacts that would affect the subsequent pupil estimation and for gross distortions that were not detected by the automatic procedure. Trials in which these artefacts were found were also discarded from further analysis. The baseline-corrected data were normalised relative to the pupil range to obtain better test-retest reliability (Neagu et al. 2023). The lower and upper limits of the individual’s pupil range were defined as the 0.025 and 0.975 quantiles of the datapoints within five seconds after the sentence onset (considering only non-discarded trials across all conditions). The normalisation involved a subtraction of the lower limit and a subsequent division by the dynamic range.
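
A minimal sketch of the baseline correction and range normalisation is given below. It assumes that the quantiles are computed on the baseline-corrected samples and uses linear interpolation between order statistics, neither of which is stated in the text; the function names and the `onset_index` argument (the sample index of the sentence onset) are hypothetical.

```python
from statistics import mean


def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics
    (one of several common definitions; an assumption here)."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def baseline_correct(trace, onset_index, fs_hz=500, baseline_s=1.0):
    """Subtract the mean pupil size in the 1 s before sentence onset."""
    n = round(baseline_s * fs_hz)
    baseline = mean(trace[onset_index - n:onset_index])
    return [x - baseline for x in trace]


def pupil_range(corrected_trials, onset_index, fs_hz=500, window_s=5.0):
    """0.025 and 0.975 quantiles of all retained samples within 5 s after
    sentence onset, pooled over one participant's non-discarded trials."""
    n = round(window_s * fs_hz)
    pooled = sorted(x for trial in corrected_trials
                    for x in trial[onset_index:onset_index + n])
    return quantile(pooled, 0.025), quantile(pooled, 0.975)


def normalise(trace, lower, upper):
    """Subtract the lower limit and divide by the dynamic range."""
    return [(x - lower) / (upper - lower) for x in trace]
```

After this step, a value of 0 corresponds to the lower limit of the individual's pupil range and a value of 1 to the upper limit, so responses from different visits are expressed on a common scale.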

Four mean response curves were calculated for each participant by averaging the non-rejected trials for each of the two experimental conditions within each visit. A minimum of ten preserved trials was set as a requirement to include the resulting mean response in the analysis. Based on this criterion, one test subject in the control group was excluded from further analysis. Overall, 1268 trials were analysed: 621 for the “pre” visit (on average 32.68 per individual) and 647 for the “post” visit (on average 34.05 per individual).

Analysis

The sentence intelligibility results, expressed as the percentage of correctly repeated words within the test list, were analysed in R (R Core Team 2020) using the lmerTest package (Kuznetsova, Brockhoff, and Christensen 2017). A linear mixed-effects model with fixed effects of Visit (levels: pre, post) and Group (levels: training, control), as well as a Visit × Group interaction and a random intercept of Participant, was fitted for each condition. The interaction of the fixed effects represented the impact of the intervention on the outcome measure. A significant interaction would indicate that the impact of the intervention on speech intelligibility differed across the groups. In such a case, a post-hoc analysis including planned comparisons between the levels of Visit within the factor of Group was performed. The Tukey adjustment method was applied to account for multiple comparisons (Lenth 2016).

The next step in the analysis aimed to investigate any potential effect of the experimental factors on the pupil metrics, namely the peak pupil dilation (PPD) and the baseline pupil diameter. PPD is argued to reflect the cognitive load during the behavioural task, while baseline size is associated with global arousal status. PPDs were extracted as the maximum value of each mean response curve between 0.5 and 3 seconds after the stimulus onset (i.e., up to approximately 1.5 seconds after the end of the sentence). Based on the previous literature, the maximum of the response should be contained within this time window (Winn et al. 2018). The baseline pupil diameter for each participant within each visit and condition was calculated as the mean of the baseline pupil sizes recorded for each non-discarded sentence within a test list. The statistical analysis of baseline and PPDs employed the same linear mixed-effects model approach as applied to the intelligibility scores (apart from the PPDs in the SNR80 condition, where the random effect was removed from the model to resolve a singularity issue).
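
The PPD extraction can be illustrated with a short sketch that combines the averaging of retained trials, the minimum of ten retained trials stated earlier in the text, and the 0.5 to 3 s search window. The function names are hypothetical, and the traces are assumed to be equal-length lists whose first sample coincides with `onset_index`.

```python
def mean_response_curve(trials, min_trials=10):
    """Average equal-length retained trials sample by sample.

    Returns None when fewer than `min_trials` trials are available,
    mirroring the inclusion criterion described in the text.
    """
    if len(trials) < min_trials:
        return None
    return [sum(column) / len(column) for column in zip(*trials)]


def peak_pupil_dilation(curve, fs_hz=500, onset_index=0, window_s=(0.5, 3.0)):
    """Maximum of the mean curve between 0.5 and 3 s after stimulus onset."""
    start = onset_index + round(window_s[0] * fs_hz)
    stop = onset_index + round(window_s[1] * fs_hz)
    return max(curve[start:stop + 1])  # both window edges included (assumption)
```

Restricting the search to this window means that dilations occurring before 0.5 s or after 3 s, such as those related to preparing the verbal response, do not determine the PPD.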

To capture the effect of the training on the overall shape of the response, GCA was applied. GCA is a multilevel regression technique suitable for analysing time-course data (Mirman 2014) and is commonly used in pupillometry studies (e.g., Bianchi et al. 2019; Kuchinsky et al. 2013; Wendt et al. 2018; Winn 2016). GCA allows for modelling the shape of the response using polynomial functions. At the output of a GCA model, the intercept represents the overall area under the curve, while the linear term reflects the overall slope of the response. The quadratic term is interpreted as the symmetric rise and fall rate around the peak and the cubic term as the delay of the peak. In the present study, the GCA was used to model the shape of the response curve within 0 to 3 seconds after the onset of the sentence.

The GCA was performed in R using the lme4 package (Bates et al. 2015). Separate models were fitted for each experimental condition. Within the time window selected for the analysis, the response curve typically does not show more than two inflection points. Therefore, the selected model structure included up to the third-order time term (i.e., the intercept, linear, quadratic and cubic terms). The random effect structure of the model allowed all these terms to vary with the listener. Orthogonal polynomials were used such that each term could be interpreted independently. The impact of experimental factors (Group, Visit and Group × Visit interaction) on the curve characteristics was investigated by gradually adding interactions between these factors and the polynomial terms. After increasing the model complexity by one parameter (i.e., adding each subsequent interaction), the fit was evaluated using the Akaike Information Criterion (AIC) and a χ² test. The model tests are summarised in Supplementary Table 1 (SNR50) and Supplementary Table 2 (SNR80). A reduction of the AIC value indicated improved goodness of fit. The p-value obtained from the χ² test showed whether adding a new parameter significantly improved the model fit compared with the most complex previous model. The models were gradually built up until the most complex structure of interest was reached, i.e., pupil dilation ∼ Visit × Group × (Intercept + Linear + Quadratic + Cubic) + (Intercept + Linear + Quadratic + Cubic | Participant). The models that provided the best description of the data (i.e., involved only those effects that significantly contributed to the model fit) were used for the subsequent analysis.
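
The role of the orthogonal polynomials can be illustrated independently of the mixed-model fit. The sketch below builds orthonormal linear, quadratic and cubic time terms on an evenly sampled window using Gram-Schmidt orthogonalisation. It is not the authors' R code and does not fit a model: the function name is hypothetical, and the text does not state how the time terms were generated in R.

```python
def orthogonal_polynomials(n_samples, degree=3):
    """Return orthonormal polynomial time terms of order 1..degree.

    The terms are mutually orthogonal and orthogonal to a constant,
    so each one sums to zero over the analysis window.
    """
    t = [i / (n_samples - 1) for i in range(n_samples)]
    basis = []
    for d in range(degree + 1):
        v = [x ** d for x in t]
        # Remove the components along all lower-order terms.
        for b in basis:
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = sum(vi * vi for vi in v) ** 0.5
        basis.append([vi / norm for vi in v])
    return basis[1:]  # drop the constant; the model intercept takes its place


# Example: a 0 to 3 s window sampled at 500 Hz has 1501 samples.
linear, quadratic, cubic = orthogonal_polynomials(1501)
```

Because the terms are uncorrelated over the window, a change in, for example, the quadratic coefficient does not alter the estimate of the linear or cubic coefficient, which is what allows each term to be interpreted independently.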

After defining the best-fitting models, the parameter estimates of interest (characterising the “post” visit with reference to the “pre” visit in each group of the participants) were extracted together with their significance levels. The parameters for the training group were obtained by fitting the respective models with the “training-pre” as a reference. To obtain the parameters for the control group, the models were refitted using “control-pre” as a reference.

Results

Speech intelligibility

Figure 1 shows the pre-intervention (empty bars) and post-intervention (filled bars) speech intelligibility scores at SNR50 (left panel) and SNR80 (right panel). The training group’s results are shown on the left, and the control group’s results are displayed on the right side of each panel. The pre-intervention scores roughly corresponded to the targeted intelligibility levels (50% and 80% correct). At SNR50, there was a marginally significant Visit × Group interaction (F(1,18) = 4.24, p = .05). A post-hoc analysis revealed that the percentage of correct responses was significantly higher after the intervention only for the training group (t = −2.87, p = .01). Neither the interaction nor the main effects were significant in the SNR80 condition.

Figure 1. Pre-intervention (empty bars) and post-intervention (filled bars) word scores in percent correct for SNR50 (left panel) and SNR80 (right panel). The mean results of the training group are shown on the left of the panels, and the mean results of the control group are on the right of the panels. Error bars represent one standard error of the mean. Asterisks indicate significance level *p < .05.


Pupil baseline and PPD

Each boxplot shown in Figure 2 corresponds to the distribution of baseline values obtained for one condition and one group of the listeners during one visit. The left and the right panel display the data for the SNR50 and the SNR80 condition, respectively. Empty boxes indicate data collected at the beginning of the study (“pre”), while shaded boxes correspond to the results taken after the intervention (“post”). The pairs of “pre” and “post” boxes are shown on the left-hand side of the panels for the training group and on the right-hand side of the panels for the control group. The baseline diameter is expressed in arbitrary units, as given in the result files obtained from the EyeLink system. Neither a significant interaction nor a main effect of Visit or Group on the baseline pupil diameter was found in any of the conditions.

Figure 2. Boxplots of pupil diameters during the baseline interval obtained before (empty boxes) and after (filled boxes) the intervention for the SNR50 (left panel) and SNR80 (right panel) condition. The results of the training group are shown on the left, and the results of the control group are on the right of the panels. The upper and lower edges of the boxes indicate the 75th and 25th percentile of the data, respectively. The whiskers correspond to the most extreme observations not considered outliers. The horizontal lines represent the medians of the distributions.


Figure 3 shows PPDs obtained before (empty bars) and after (filled bars) the intervention in the two experimental conditions: SNR50 (left panel) and SNR80 (right panel). The two bars on the left side of each panel represent the results of the training group, whereas the two bars on the right side represent the results of the control group. Both groups had similar PPDs before the intervention and showed consistently higher values after the intervention in both experimental conditions. The analysis revealed neither a significant Visit × Group interaction nor a significant Group effect in either condition. The effect of Visit was significant for both SNR50 (F(1,17) = 7.27, p < .05) and SNR80 (F(1,34) = 4.24, p < .05). Together with the absence of an interaction, this indicates that PPD increased to a similar extent in both groups.

Figure 3. Normalised peak pupil dilations (PPDs) before (empty bars) and after (filled bars) the intervention shown for SNR50 (left panel) and SNR80 (right panel). The mean results of the training group are shown on the left and the mean results of the control group are on the right of the panels. Errorbars represent one standard error of the mean.
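As an illustration of the quantities plotted in Figure 3, the following minimal Python sketch computes a baseline-corrected PPD for one trial and the mean and standard error across listeners. It assumes subtractive baseline correction and a trace that begins with the baseline interval. The function names, the sampling rate and the baseline duration are hypothetical, and the normalisation actually used in the study may differ.

```python
import numpy as np


def peak_pupil_dilation(trace, fs, baseline_s=1.0):
    """Return (PPD, baseline) for a single trial.

    trace      : 1-D sequence of pupil diameters; the first `baseline_s`
                 seconds are assumed to be the pre-stimulus baseline.
    fs         : sampling rate in Hz (hypothetical value supplied by caller).
    baseline_s : baseline duration in seconds (assumed, not from the study).
    """
    trace = np.asarray(trace, dtype=float)
    n_base = int(round(baseline_s * fs))
    baseline = trace[:n_base].mean()
    # Subtractive baseline correction: dilation relative to the baseline mean.
    response = trace[n_base:] - baseline
    return response.max(), baseline


def mean_and_sem(values):
    """Mean and standard error of the mean across listeners."""
    values = np.asarray(values, dtype=float)
    sem = values.std(ddof=1) / np.sqrt(values.size)
    return values.mean(), sem
```

Calling `peak_pupil_dilation` once per trial, averaging within each listener, and then passing the per-listener values to `mean_and_sem` would yield one bar and one error bar of the kind shown in Figure 3.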

GCA output

Significant Group × Visit interactions in the GCA model for the SNR50 condition (Supplementary Table 1) indicated that the interventions administered in the training and control groups differently affected the overall area under the curve (represented by the intercept), the overall steepness of the curve (linear term), the rise and fall rate around the peak (quadratic term), and the delay (cubic term). Similarly, the GCA model output for the SNR80 condition (Supplementary Table 2) revealed a significant Group × Visit interaction for all terms, indicating that the two interventions affected all the curve parameters considered in the model differently. Additionally, there was a significant Group effect on the intercept, suggesting a training-unrelated difference in the area under the curve between the two groups.
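To make the interpretation of the polynomial terms concrete, the sketch below builds orthogonal linear, quadratic and cubic time terms and fits them to a single averaged trace with ordinary least squares. It is a simplification under stated assumptions: the GCA models in the study were presumably fitted as mixed-effects models with Group and Visit factors (cf. the cited lme4 package), whereas this sketch omits random effects and fixed factors. The function names are hypothetical.

```python
import numpy as np


def orthogonal_poly(t, degree=3):
    """Orthonormal polynomial time terms (linear ... degree), one per column.

    Built by QR decomposition of a centred Vandermonde matrix, so each
    column is orthogonal to the constant term and to the other columns.
    """
    t = np.asarray(t, dtype=float)
    vander = np.vander(t - t.mean(), degree + 1, increasing=True)
    q, r = np.linalg.qr(vander)
    # Flip signs so that each term has a positive leading coefficient.
    q = q * np.sign(np.diag(r))
    return q[:, 1:]  # drop the constant column


def fit_curve(t, y, degree=3):
    """Least-squares estimates: [intercept, linear, quadratic, cubic]."""
    terms = orthogonal_poly(t, degree)
    design = np.column_stack([np.ones(len(terms)), terms])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef
```

Because the time terms are orthogonal to the constant, the fitted intercept equals the mean of the trace over the analysis window, which is why it can be read as a measure of the overall area under the curve.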

Figure 4 shows the output of the GCA models for the conditions SNR50 (left) and SNR80 (right) within the time window 0-3 seconds after the sentence onset. The upper panels represent the outcomes for the training group, and the lower panels show the results for the control group. The modelled responses are indicated by solid lines: the pre-intervention condition is displayed in green and the post-intervention condition in blue. The data measured before (dotted lines) and after (dashed lines) the intervention are shown for comparison. In the training group, the overall magnitude of the response increased from “pre” to “post” in both conditions. The response also increased in the control group, but only in the SNR50 condition and to a somewhat smaller degree.

Figure 4. The output of the GCA models for SNR50 (left column) and SNR80 (right column). The upper panels show the results for the training group and the bottom panels represent the control group. The modelled responses for the “pre” visit are plotted in green and the modelled responses for the “post” visit are shown in blue. The shaded areas represent the models’ variability. Data obtained at pre- and post-intervention visits are shown for comparison in dotted and dashed lines, respectively.

The parameters characterising the pupil response during the “post” visit with reference to the “pre” visit for each of the groups are summarised in Table 2 for the SNR50 condition and in Table 3 for the SNR80 condition. In the SNR50 condition (Table 2), a significant increase in the overall area under the curve (intercept) and the overall steepness of the curve (linear term) was observed in both groups (p < .0001). The steepness around the peak decreased, as indicated by the positive quadratic term estimate (p < .0001), and the delay of the peak increased (p < .0001). Overall, the trends were similar across the groups, which agrees with the visual representation in Figure 4. The magnitude of the change was larger for the intercept and the cubic term in the training group. Conversely, the estimates of the linear and the quadratic term were smaller in the training group.

Table 2. Parameter estimates characterising the "post" level of the Visit factor relative to the reference "pre" level for the SNR50 condition. The extracted values and their significance levels are shown for the training group on the left and the control group on the right. Significance levels p < .05 are shown in bold.

Table 3. Parameter estimates characterising the "post" level of the Visit factor relative to the reference "pre" level for the SNR80 condition. The extracted values and their significance levels are shown for the training group on the left and the control group on the right. Significance levels p < .05 are shown in bold.

In the SNR80 condition (Table 3), the intervention was followed by an increased slope (linear term) in both groups (p < .0001). The overall area under the curve (intercept) and the cubic term also changed in both groups (p < .0001) but in opposite directions: an increased magnitude and a faster peak were seen in the training group, as opposed to a slightly reduced magnitude and a delayed peak in the control group. A significant change in the quadratic term was observed in the training group (p < .0001) but not in the control group (p = .06). The parameter estimates were generally larger for the training group than for the control group, which again corresponds well to the trends visible in Figure 4.

Discussion

The present study used pupillometry to characterise the impact of phoneme-in-noise auditory training on listening effort exerted in HINT. Task-evoked pupil responses to HINT sentences were obtained in noise conditions corresponding to 50% intelligibility (SNR50) and 80% intelligibility (SNR80), respectively. Three research questions were addressed: (i) whether the training would result in reduced effort, (ii) whether pupillometry would reveal the effect of training on effort in a condition when intelligibility remains unchanged and (iii) whether active participation in the study can elicit significant changes in pupillary responses.

The effect of training on listening effort

The hypothesised outcome of the training was that less reliance on cognitive resources would be required to understand speech, leading to decreased listening effort. Lower effort is typically associated with smaller PPDs. The results of the study showed, however, that the PPDs were higher at the post-intervention visit, suggesting the opposite. Even more surprisingly, the magnitude of the increase in PPDs was similar in both groups.

The GCA in the time window 0-3 seconds provided a more nuanced picture. First, a significant Visit × Group interaction with all polynomial terms indicated a training-related difference between the groups. A subsequent analysis revealed significant pre-to-post differences in the parameters describing the pupil response within both groups, not only in the group that received the training.

The training group showed a consistently higher overall magnitude and an increased slope of the response in both conditions. Interestingly, in the SNR50 condition, the overall trend in the change from the pre- to the post-visit was very similar for both groups, although the difference was more pronounced in the training group. This similarity was not observed in the SNR80 condition, where the control group showed a pattern opposite to that of the training group and much smaller differences between the “pre” and “post” sessions.

The increase in the pupil response after the training, although not in line with the hypothesis of the present study, is consistent with the findings of Kuchinsky et al. (Citation2014), who also found an increased average dilation in the group that completed the training. Thus, both studies suggest higher effort after the intervention, which can result from a change in the effort allocation policy. According to the Framework for Understanding Effortful Listening (FUEL; Pichora-Fuller et al. Citation2016), the effort allocation policy is modulated by the current state of an individual’s motivation or fatigue. It is plausible that motivation played an important role in both cases. After having received an intervention intended to improve their understanding of speech in noise, participants might have been more willing to allocate resources to the task, as they expected that the investment of effort would result in a higher success rate. This argument would, to some extent, explain why the pupil responses did not increase in the passive control group in Kuchinsky et al. (Citation2014) but did increase in the active control group in our study (in the SNR50 condition).

Apart from an overall larger response, Kuchinsky et al. (Citation2014) reported more rapid peaks in pupil dynamics after the training. The authors interpreted the accelerated rise and decay of the pupil response as a sign of faster discrimination of words in noise. Moreover, the peakedness of the response was interpreted as an indicator of effort related to selecting the target word from among alternatives displayed on the screen. While our study revealed an increased slope of the response in the training group, the accelerated rise and decay were less apparent. This is unsurprising given the substantial differences in the design of the outcome measure. In our study, the duration of the response’s rise and fall was constrained by the timing of the trial, specifically the length of the sentence and the retention interval. Furthermore, the participants did not have visual response alternatives available, eliminating any effort related to the selection process.

Another factor potentially contributing to the pattern of the results is a change in fatigue status between the “pre” and “post” visits. Higher self-reported fatigue levels are associated with smaller PPDs in speech-in-noise tasks (Wang et al. Citation2018). If auditory training succeeded in alleviating the challenges of everyday communication, then its recipients would be less fatigued at the “post” visit. Lower fatigue would allow them to exert more effort during the HINT, contributing to larger pupil responses than before. Since no change in fatigue levels is expected for the individuals in the control group, this could explain why the increase in the overall magnitude of the pupil response in the SNR50 condition seems to be smaller for them than for the training group. Unfortunately, the fatigue status of the listeners was not monitored in the present study. Thus, its potential impact on the recorded pupil responses remains speculative.

Ceiling effects in HINT and the value of using pupillometry

As expected, the number of correctly repeated words increased significantly after the training in the SNR50 condition but not in the SNR80 condition. The situation when a person understands 80% of the information corresponds roughly to the bottom range of the Lowest Acceptable Performance Level (Boothroyd and Schauer Citation2015), i.e., the intelligibility level at which people are willing to maintain a conversation for a short time. It is unlikely that people would engage in a conversation if they understand only 50% of what is being said. Hence, out of the two considered experimental conditions, the SNR80 resembles a real-life challenging communication scenario more closely than the SNR50. However, traditional sentence tests are not sensitive enough to detect training benefits when recognition is close to the ceiling.

The fact that changes in pupil responses were observed in both conditions confirms that pupillometry is sensitive to cognitive effects even when speech recognition is at the ceiling, which is in line with previous studies (McLaughlin and Van Engen Citation2020). This finding underlines the value of using outcome measures other than speech intelligibility tests to evaluate the effectiveness of auditory training. It also implies that an intervention should not be declared ineffective solely because a speech recognition metric shows no significant improvement. The negative results reported by studies on phoneme training in hearing-aid users (such as Koprowska et al. Citation2022; Woods et al. Citation2015) might equally stem from the low sensitivity of HINT rather than from the absence of any generalisation of learning to the untrained task.

The effect of a non-training activity on pupillary responses

An important element of the present investigation was the inclusion of an active control group. A previous study that used pupillometry to measure the effect of auditory training (Kuchinsky et al. Citation2014) involved a passive control group, which completed two measurement sessions interleaved by a period of time equal to the duration of the investigated training. No changes in pupil traces for that group were reported. Here, in contrast, the control group participated in a non-training activity, which should have resulted in the expectation that their auditory skills had improved. As such, an opportunity arose to investigate if these expectations would be reflected in the pupillometry results.

The control group showed a significant increase in PPDs (in both conditions) and in the overall pupil response (in the SNR50 condition), which might have been associated with the participants’ expectations regarding the effectiveness of the received intervention. It has previously been demonstrated that expectations have an impact on outcome measures, e.g., labelling a hearing aid as “new” or as “conventional” affected users’ preference, sound quality ratings and even speech recognition benefit (Dawes, Hopkins, and Munro Citation2013). A similar mechanism might have underlain the effect observed in the present study.

In a research setting, when the results are analysed on a group level, it is possible to control for this effect using a double-blind design. There is, however, an ongoing debate about a potential use of pupillometry as an individualised measure of effort in a clinical setting. The demonstrated sensitivity of pupillary responses to “placebo” effects should be carefully taken into consideration in the context of future clinical applications of pupillometry to assess the efficacy of hearing-loss intervention at the level of an individual listener.

Considerations regarding the study sample

The relatively small number of participants included in the analysis (10 individuals in the training group and nine in the control group) is a substantial limitation of the present study. The small sample size was not intentional but resulted from limited access to a specific population (i.e., elderly experienced hearing-aid users without any contraindications to participation, such as eye diseases and medications), partly due to the COVID-19 pandemic. Larger numbers of participants and analysed trials would be desirable to tackle the inherent noisiness of the pupillometry recordings, to obtain reliable estimates of the responses, and to achieve sufficient statistical power to detect small effect sizes. Nonetheless, even with a limited number of participants, we observed a marginally significant improvement in the more difficult HINT condition (SNR50) and a more substantial and statistically significant increase in pupil size for the training group (in the 3-s time window). These findings allow us to remain positive about the effectiveness of training. As a proof of concept, they also demonstrate that pupillometry can serve as a viable method for assessing the impact of training and could potentially be as robust as, if not more robust than, traditional measures such as the HINT.

Another aspect worth attention is that the training and the control group, although well matched in terms of age and hearing loss, differed in their working memory capacity as reflected in auditory Digit Span scores. This might be a confounding factor due to the plausible relationship between cognitive abilities and the allocation of mental resources. Although some studies did not find a link between working memory capacity and pupil responses measured during a masked speech recognition task in normal-hearing (Koelewijn et al. Citation2012; Zekveld et al. Citation2014) and hearing-impaired listeners (Koelewijn et al. Citation2014), other reports show the opposite. For example, Dingemanse and Goedegebure (Citation2022) observed lower PPDs during a speech recognition task in cochlear-implant users with better working memory. Considering that the control group in the present study had better working memory than the trained participants, the potential contribution of this difference to the pattern of the results cannot be ruled out.

Summary and conclusion

The present study investigated the impact of phoneme-in-noise identification training on listening effort in hearing-aid users. Task-evoked pupil responses were recorded while the participants performed a sentence intelligibility test before and after the training. The sentence test was administered at two performance levels, corresponding to 50% (SNR50) and 80% (SNR80) intelligibility (defined before the training). The outcomes were assessed with reference to an active control group. PPDs were significantly higher at the post- compared to the pre-intervention visit in both groups. The analysis of time-dependent effects in pupil dilation revealed that the magnitude and shape of the responses were affected differently across the groups: the increase in the responses was larger, and more consistent across conditions, in the group that received the training. A significant improvement in intelligibility was found only in the training group and only in the more challenging condition (SNR50). The pupil responses showed the impact of the training on effort in both HINT conditions, including when intelligibility remained unchanged (SNR80). The findings indicate that training led to increased listening effort, which may be linked to changes in the participants’ motivation status. The study demonstrated the sensitivity of pupillometry to the effect of training at high intelligibility levels. The significant changes in PPDs and the shape of the pupil responses observed in the active control group indicate that this method is also sensitive to placebo effects, which can be a challenge for potential use in clinical settings.

Acknowledgements

The authors thank Rikke Skovhøj Sørensen for her contribution to the data collection and Florine Bachmann for collaboration in planning and conducting the joint part of the experiment. We would also like to acknowledge Helia Relaño-Iborra, Mihaela-Beatrice Neagu and Michał Feręczkowski for their input regarding the data analysis and Oticon Medical for the pupillometry data acquisition software.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by Sivantos GmbH.

References

  • Abokyi, S., J. Owusu-Mensah, and K. A. Osei. 2017. “Caffeine Intake is Associated with Pupil Dilation and Enhanced Accommodation.” Eye (London, England) 31 (4):615–619. https://doi.org/10.1038/eye.2016.288
  • Alhanbali, S., P. Dawes, S. Lloyd, and K. J. Munro. 2017. “Self-Reported Listening-Related Effort and Fatigue in Hearing-Impaired Adults.” Ear and Hearing 38 (1):e39–e48. https://doi.org/10.1097/AUD.0000000000000361
  • Bachmann, F. L. 2021. “Subcortical Electrophysiological Measures of Running Speech.” PhD thesis, Technical University of Denmark, Kgs. Lyngby.
  • Bachmann, F. L., E. N. MacDonald, and J. Hjortkjær. 2021. “Neural Measures of Pitch Processing in EEG Responses to Running Speech.” Frontiers in Neuroscience 15:738408. https://doi.org/10.3389/fnins.2021.738408
  • Bates, D., M. Mächler, B. M. Bolker, and S. C. Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1):1–48. https://doi.org/10.18637/jss.v067.i01
  • Bianchi, F., D. Wendt, C. Wassard, P. Maas, T. Lunner, T. Rosenbom, and M. Holmberg. 2019. “Benefit of Higher Maximum Force Output on Listening Effort in Bone-Anchored Hearing System Users: A Pupillometry Study.” Ear and Hearing 40 (5):1220–1232. https://doi.org/10.1097/AUD.0000000000000699
  • Boothroyd, A., and A. Schauer. 2015. “Lowest acceptable performance level.” International Collegium of Rehabilitative Audiology, Berkeley, CA.
  • Dawes, P., R. Hopkins, and K. J. Munro. 2013. “Placebo Effects in Hearing-Aid Trials are Reliable.” International Journal of Audiology 52 (7):472–477. https://doi.org/10.3109/14992027.2013.783718
  • Dingemanse, G., and A. Goedegebure. 2022. “Listening Effort in Cochlear Implant Users: The Effect of Speech Intelligibility, Noise Reduction Processing, and Working Memory Capacity on the Pupil Dilation Response.” Journal of Speech, Language, and Hearing Research: JSLHR 65 (1):392–404. https://doi.org/10.1044/2021_JSLHR-21-00230
  • Eggert, T., C. Sauter, R. Popp, J. Zeitlhofer, H. Danker-Hopfe, and Working Group “Vigilance” of the German Society for Sleep Research and Sleep Medicine (DGSM). 2012. “The Pupillographic Sleepiness Test in Adults: Effect of Age, Gender, and Time of Day on Pupillometric Variables.” American Journal of Human Biology 24 (6):820–828. https://doi.org/10.1002/ajhb.22326
  • Ferguson, M. A., H. Henshaw, D. P. A. Clark, and D. R. Moore. 2014. “Benefits of Phoneme Discrimination Training in a Randomized Controlled Trial of 50- to 74-Year-Olds With Mild Hearing Loss.” Ear and Hearing 35 (4):e110–e121. https://doi.org/10.1097/AUD.0000000000000020
  • Gatehouse, S., and W. Noble. 2004. “The Speech, Spatial and Qualities of Hearing Scale (SSQ).” International Journal of Audiology 43 (2):85–99. https://doi.org/10.1080/14992020400050014
  • Henshaw, H., and M. Ferguson. 2014. “Assessing the benefits of auditory training to real-world listening: identifying appropriate and sensitive outcomes.” In T. Dau, S. Santurette, J. C. Dalsgaard, L. Tranebjærg, T. Andersen, & T. Poulsen (Eds.), Proceedings of ISAAR 2013: Auditory Plasticity - Listening with the Brain. 4th symposium on Auditory and Audiological Research (45–52). Nyborg, Denmark: The Danavox Jubilee Foundation. https://proceedings.isaar.eu/index.php/isaarproc/article/view/2013-05.
  • Henshaw, H., and M. A. Ferguson. 2013. “Efficacy of Individual Computer-Based Auditory Training for People with Hearing Loss: A Systematic Review of the Evidence.” PloS One 8 (5):e62836. https://doi.org/10.1371/journal.pone.0062836
  • Howard, C. S., K. J. Munro, and C. J. Plack. 2010. “Listening Effort at Signal-to-Noise Ratios that are Typical of the School Classroom.” International Journal of Audiology 49 (12):928–932. https://doi.org/10.3109/14992027.2010.520036
  • Humes, L. E. 1999. “Dimensions of Hearing Aid Outcome.” Journal of the American Academy of Audiology 10 (01):26–39. https://doi.org/10.1055/s-0042-1748328
  • Humes, L. E., M. H. Burk, L. E. Strauser, and D. L. Kinney. 2009. “Development and Efficacy of a Frequent-Word Auditory Training Protocol for Older Adults with Impaired Hearing.” Ear and Hearing 30 (5):613–627. https://doi.org/10.1097/AUD.0b013e3181b00d90
  • Koelewijn, T., A. A. Zekveld, J. M. Festen, and S. E. Kramer. 2014. “The Influence of Informational Masking on Speech Perception and Pupil Response in Adults with Hearing Impairment.” The Journal of the Acoustical Society of America 135 (3):1596–1606. https://doi.org/10.1121/1.4863198
  • Koelewijn, T., A. A. Zekveld, J. M. Festen, J. Rönnberg, and S. E. Kramer. 2012. “Processing Load Induced by Informational Masking Is Related to Linguistic Abilities.” International Journal of Otolaryngology 2012:865731. https://doi.org/10.1155/2012/865731
  • Koprowska, A., J. Marozeau, T. Dau, and M. Serman. 2022. “The Effect of Phoneme-Based Auditory Training on Speech Intelligibility in Hearing-Aid Users.” International Journal of Audiology 62 (11):1048–1058. https://doi.org/10.1080/14992027.2022.2135032
  • Kuchinsky, S. E., J. B. Ahlstrom, S. L. Cute, L. E. Humes, J. R. Dubno, and M. A. Eckert. 2014. “Speech-Perception Training for Older Adults with Hearing Loss Impacts Word Recognition and Effort.” Psychophysiology 51 (10):1046–1057. https://doi.org/10.1111/psyp.12242
  • Kuchinsky, S. E., J. B. Ahlstrom, K. I. Vaden, S. L. Cute, L. E. Humes, J. R. Dubno, and M. A. Eckert. 2013. “Pupil Size Varies with Word Listening and Response Selection Difficulty in Older Adults With Hearing Loss.” Psychophysiology 50 (1):23–34. https://doi.org/10.1111/j.1469-8986.2012.01477.x
  • Kuznetsova, A., P. B. Brockhoff, and R. H. B. Christensen. 2017. “lmerTest Package: Tests in Linear Mixed Effects Models.” Journal of Statistical Software 82 (13):1–26. https://doi.org/10.18637/jss.v082.i13
  • Lenth, R. V. 2016. “Least-Squares Means: The R Package lsmeans.” Journal of Statistical Software 69 (1):1–33. https://doi.org/10.18637/jss.v069.i01
  • McCoy, S. L., P. A. Tun, L. C. Cox, M. Colangelo, R. A. Stewart, and A. Wingfield. 2005. “Hearing Loss And Perceptual Effort: Downstream Effects on Older Adults’ Memory for Speech.” The Quarterly Journal of Experimental Psychology. A, Human Experimental Psychology 58 (1):22–33. https://doi.org/10.1080/02724980443000151
  • McGarrigle, R., K. J. Munro, P. Dawes, A. J. Stewart, D. R. Moore, J. G. Barry, and S. Amitay. 2014. “Listening Effort and Fatigue: What Exactly Are We Measuring? A British Society of Audiology Cognition in Hearing Special Interest Group ‘White Paper’.” International Journal of Audiology 53 (7):433–440. https://doi.org/10.3109/14992027.2014.890296
  • McLaughlin, D. J., and K. J. Van Engen. 2020. “Task-Evoked Pupil Response for Accurately Recognized Accented Speech.” The Journal of the Acoustical Society of America 147 (2):EL151–EL156. https://doi.org/10.1121/10.0000718
  • Mirman, D. 2014. Growth Curve Analysis and Visualization Using R. Boca Raton, FL: CRC Press. https://doi.org/10.1017/CBO9780511564345
  • Moore, B. C. J., B. R. Glasberg, and M. A. Stone. 1999. “Use of a Loudness Model for Hearing Aid Fitting: III. A General Method for Deriving Initial Fittings for Hearing Aids with Multi-Channel Compression.” British Journal of Audiology 33 (4):241–258. https://doi.org/10.3109/03005369909090105
  • Naylor, G., T. Koelewijn, A. A. Zekveld, and S. E. Kramer. 2018. “The Application of Pupillometry in Hearing Science to Assess Listening Effort.” Trends in Hearing 22:2331216518799437. https://doi.org/10.1177/2331216518799437
  • Neagu, M.-B., A. A. Kressner, H. Relaño-Iborra, P. Bækgaard, T. Dau, and D. Wendt. 2023. “Investigating the Reliability of Pupillometry as a Measure of Individualized Listening Effort.” Trends in Hearing 27:233121652311532. https://doi.org/10.1177/23312165231153288
  • Nielsen, J. B., and T. Dau. 2010. “The Danish Hearing in Noise Test.” International Journal of Audiology 50 (3):202–208. https://doi.org/10.3109/14992027.2010.524254
  • Nielsen, J. B., and T. Dau. 2019. “A Danish Nonsense Word Corpus for Phoneme Recognition Measurements.” Acta Acustica United with Acustica 105 (1):183–194. https://doi.org/10.3813/AAA.919299
  • Noble, W., and S. Gatehouse. 2004. “Interaural Asymmetry of Hearing loss, Speech, Spatial and Qualities of Hearing Scale (SSQ) Disabilities, and Handicap.” International Journal of Audiology 43 (2):100–114. https://doi.org/10.1080/14992020400050015
  • Pichora-Fuller, M. K., S. E. Kramer, M. A. Eckert, B. Edwards, B. W. Y. Hornsby, L. E. Humes, U. Lemke, T. Lunner, M. Matthen, C. L. Mackersie, et al. 2016. “Hearing Impairment and Cognitive Energy: The Framework for Understanding Effortful Listening (FUEL).” Ear and Hearing 37 Suppl 1 (1):5S–27S. https://doi.org/10.1097/AUD.0000000000000312
  • Piquado, T., D. Isaacowitz, and A. Wingfield. 2010. “Pupillometry as a Measure of Cognitive Effort in Younger and Older Adults.” Psychophysiology 47 (3):560–569. https://doi.org/10.1111/j.1469-8986.2009.00947.x
  • R Core Team 2020. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
  • Relaño-Iborra, H., and P. Bækgaard. 2020. PUPILS Pipeline: A Flexible Matlab Toolbox for Eyetracking and Pupillometry Data Processing. http://arxiv.org/abs/2011.05118
  • Rönnberg, J., T. Lunner, A. Zekveld, P. Sörqvist, H. Danielsson, B. Lyxell, Ö. Dahlström, C. Signoret, S. Stenfelt, M. K. Pichora-Fuller, et al. 2013. “The Ease of Language Understanding (ELU) Model: Theoretical, Empirical, and Clinical Advances.” Frontiers in Systems Neuroscience 7:31. https://doi.org/10.3389/fnsys.2013.00031
  • Schumann, A., L. Garea Garcia, and U. Hoppe. 2017. “Trainierbarkeit der Diskrimination von Phonemen im Störgeräusch bei normalhörenden Erwachsenen.” Laryngo- Rhino- Otologie 96 (2):98–103. https://doi.org/10.1055/s-0042-113134
  • Schumann, A., M. Serman, O. Gefeller, and U. Hoppe. 2015. “Computer-Based Auditory Phoneme Discrimination Training Improves Speech Recognition in Noise in Experienced Adult Cochlear Implant Listeners.” International Journal of Audiology 54 (3):190–198. https://doi.org/10.3109/14992027.2014.969409
  • Serman, M. 2012. “SchooLo: New Speech Training Method Based on Changes in Speech Cues.” 15th Annual Meeting of the German Society of Audiology.
  • Stropahl, M., J. Besser, and S. Launer. 2020. “Auditory Training Supports Auditory Rehabilitation.” Ear and Hearing 41 (4):697–704. https://doi.org/10.1097/aud.0000000000000806
  • Sweetow, R. W., and J. H. Sabes. 2006. “The Need for and Development of an Adaptive Listening and Communication Enhancement (LACE™) Program.” Journal of the American Academy of Audiology 17 (8):538–558. https://doi.org/10.3766/jaaa.17.8.2
  • Wang, Y., S. E. Kramer, D. Wendt, G. Naylor, T. Lunner, and A. A. Zekveld. 2018. “The Pupil Dilation Response During Speech Perception in Dark and Light: The Involvement of the Parasympathetic Nervous System in Listening Effort.” Trends in Hearing 22:233121651881660. https://doi.org/10.1177/2331216518816603
  • Wendt, D., R. K. Hietkamp, and T. Lunner. 2017. “Impact of Noise and Noise Reduction on Processing Effort: A Pupillometry Study.” Ear and Hearing 38 (6):690–700. https://doi.org/10.1097/AUD.0000000000000454
  • Wendt, D., T. Koelewijn, P. Książek, S. E. Kramer, and T. Lunner. 2018. “Toward a More Comprehensive Understanding of the Impact of Masker Type and Signal-to-Noise Ratio on the Pupillary Response While Performing a Speech-in-Noise Test.” Hearing Research 369:67–78. https://doi.org/10.1016/j.heares.2018.05.006
  • Whitton, J. P., K. E. Hancock, J. M. Shannon, and D. B. Polley. 2017. “Audiomotor Perceptual Training Enhances Speech Intelligibility in Background Noise.” Current Biology 27 (21):3237–3247.e6. https://doi.org/10.1016/j.cub.2017.09.014
  • Winn, M. B. 2016. “Rapid Release From Listening Effort Resulting From Semantic Context, and Effects of Spectral Degradation and Cochlear Implants.” Trends in Hearing 20:233121651666972. https://doi.org/10.1177/2331216516669723
  • Winn, M. B., D. Wendt, T. Koelewijn, and S. E. Kuchinsky. 2018. “Best Practices and Advice for Using Pupillometry to Measure Listening Effort: An Introduction for Those Who Want to Get Started.” Trends in Hearing 22:2331216518800869. https://doi.org/10.1177/2331216518800869
  • Woods, D. L., Z. Doss, T. J. Herron, T. Arbogast, M. Younus, M. Ettlinger, and E. W. Yund. 2015. “Speech Perception in Older Hearing Impaired Listeners: Benefits of Perceptual Training.” PloS One 10 (3):e0113965. https://doi.org/10.1371/journal.pone.0113965
  • Zekveld, A. A., M. Rudner, S. E. Kramer, J. Lyzenga, and J. Rönnberg. 2014. “Cognitive Processing Load During Listening is Reduced More by Decreasing Voice Similarity than by Increasing Spatial Separation Between Target and Masker Speech.” Frontiers in Neuroscience 8:88. https://doi.org/10.3389/fnins.2014.00088