Research Article

Perceiving utilitarian gradients: Heart rate variability and self-regulatory effort in the moral dilemma task

Pages 391-405 | Received 28 Jul 2020, Published online: 01 Jun 2021

ABSTRACT

It is not yet clear which response behavior requires self-regulatory effort in the moral dilemma task. Previous research has proposed that utilitarian responses require cognitive control, but subsequent studies have found inconsistencies with the empirical predictions of that hypothesis. In this paper, we treat participants’ sensitivity to utilitarian gradients as a measure of performance. We confronted participants (N = 82) with a set of five dilemmas evoking a gradient of mean utilitarian responses on a 4-point scale and collected data on heart rate variability (HRV) and utilitarian responses. We found positive correlations between tonic and phasic HRV and sensitivity to the utilitarian gradient in the high tonic group, but not in the low tonic group. Moreover, the low tonic group misplaced a scenario with a selfish incentive at the high end of the gradient. Results suggest that performance is represented by sensitivity correlated with HRV and accompanied by a reasonable placement of individual scenarios within the gradient.

Moral cognition from the perspective of the neurovisceral integration model

HRV and self-regulation

Heart rate variability (HRV) refers to the variation in the time interval (in milliseconds) between consecutive heartbeats. Higher variability measured in a resting condition indicates a healthier state of neurovisceral integration (Porges, 1995; Thayer & Lane, 2000) and thus more robust self-regulatory capacities during emotional or cognitive challenges. According to polyvagal (Porges, 1995) and neurovisceral (McIntosh et al., 2020; Mulcahy et al., 2019; R. Smith et al., 2017; Thayer & Lane, 2000) models, bidirectional connections between cortical, limbic and brainstem structures send inputs to the sinoatrial node via the vagus nerve. These brain structures thus influence heart rate through vagal withdrawal and reactivation, producing HRV (Berntson et al., 1997). Research has established a correlation between vagal activity and changes in cerebral blood flow in the frontal lobe, which influences affective and attentional regulation (Jennings et al., 2015; Thayer et al., 2009). The associations between self-regulatory capacity and prefrontal activity, on the one hand, and between prefrontal activity and HRV, on the other, support the inference that HRV may reflect features of self-regulatory behavior. Additionally, research employing different self-regulation tasks has shown that HRV indices can serve as reliable biomarkers of self-regulation, broadly understood as the adaptive regulation of emotion, cognition and behavior in relation to environmental demands (Bridgett et al., 2015). Recent meta-analyses have supported the robustness of the link between HRV and top-down regulation in both emotional and cognitive-behavioral contexts (Holzman & Bridgett, 2017; Zahn et al., 2016).

Low indices of tonic HRV – i.e., HRV obtained from subjects in a resting state – have been associated with emotion-regulation disorders such as panic, social anxiety, and generalized anxiety disorders (Chalmers et al., 2014; Friedman, 2007), self-regulatory failures like hypervigilance (a tendency to interpret neutral stimuli as indicators of threat; Park et al., 2013), and the inability to inhibit attention to distractors while concentrating on a perceptual task (Park et al., 2014). Conversely, high indices of tonic HRV have been linked to positive emotion and social connectedness (Kok & Fredrickson, 2010).

There is also a robust association between HRV and self-regulation in cognitive tasks. According to an influential and widely shared view, cognitive control stems from the maintenance of patterns of neural activity in prefrontal brain regions, which bias the signals of other brain structures (Miller & Cohen, 2001). Studies have found that high indices of tonic HRV are associated with better performance at executive-function tasks (Backs & Seljos, 1994; Capuana et al., 2014; Forte et al., 2019; Hansen et al., 2003; Luft et al., 2009; Middleton et al., 1999; Muthukrishnan et al., 2017; Park et al., 2014; Thayer et al., 2009).

The relationship between cognitive control and phasic HRV – i.e., HRV obtained during exposure to an emotional or cognitive challenge – is less straightforward. Studies disagree on whether self-regulatory effort induces an increase or a decrease in phasic HRV relative to tonic HRV. In some tasks, self-regulatory effort is associated with a rise of phasic relative to tonic HRV. For instance, alcoholics with good impulse control over drinking underwent a clear increase in HRV during alcohol exposure, and women displayed a significant increase in HRV relative to baseline during their efforts to regulate negative emotions in a marital discussion (Ingjaldsson, Laberg, & Thayer, 2003; T.W. Smith et al., 2011; see also Butler et al., 2006; Park et al., 2014). In contrast, in other tasks phasic HRV suppression relative to baseline or recovery is indicative of regulatory effort (Backs & Seljos, 1994; Beauchaine, 2001; Duschek et al., 2009; Hansen et al., 2003; Middleton et al., 1999; Weber et al., 2010). Studies in the latter category differ with respect to whether better performance is associated with a greater or a smaller phasic suppression (Park et al., 2014). In some tasks, mostly related to working memory and attention regulation, a smaller phasic HRV suppression is associated with better performance, whereas a greater decrease in phasic HRV correlates with worse performance (Backs & Seljos, 1994; Hansen et al., 2003; Segerstrom & Nes, 2007; Muthukrishnan et al., 2017). But other studies, related to sustained and selective attention or to stress regulation, have reported that greater phasic HRV suppression is associated with both better performance and higher tonic HRV (Beauchaine et al., 2007; Chapman et al., 2010; Duschek et al., 2009; Porges, 1995; Weber et al., 2010).

HRV and moral cognition

In research on moral cognition employing sacrificial moral dilemmas, participants are confronted with short stories where killing one person is conducive to saving more lives. In such stories, a moral norm against killing conflicts with a moral norm promoting the greater good. The paradigmatic example of sacrificial dilemmas is Footbridge, where a large man is pushed onto the tracks of a runaway trolley to stop it from running over and killing five workers. The moral dilemma task consists in deciding whether it is appropriate to override the moral norm against killing for the sake of the greater good (Greene et al., 2001, 2004). Participants must assess the consequences and weigh them against the aversion toward killing. Such a task is not appropriate to test the polyvagal or the neurovisceral integration models, because it is not possible to independently identify correct performance during the moral dilemma task. In contrast to most cognitive or emotion regulation tasks, in moral dilemmas no particular response is saliently right or wrong such that it univocally indicates self-regulation. However, tests of both the polyvagal and the neurovisceral models with other self-regulation tasks confirm that high tonic HRV predicts better performance. In the moral dilemma task, this would amount to better management of the moral conflict. Accordingly, the only published HRV study with sacrificial moral dilemmas (Park et al., 2016) opted for testing whether participants better at self-regulation – with higher tonic HRV – would reveal stronger utilitarian inclinations in responses to the moral dilemmas, as claimed by the dual-process theory of moral judgment (Greene et al., 2001, 2004). According to this theory, cognitive self-regulation implies overriding a prepotent deontological emotion against killing the one person. This theory relies on research showing that utilitarian judgment in the moral dilemma task correlates with activity in brain areas associated with cognitive control and working memory – the dorsolateral prefrontal cortex. In contrast, deontological judgment activates areas in the ventromedial prefrontal cortex and the amygdala, involved in emotional assessment (Greene et al., 2001, 2004). Park et al. (2016) found, however, that high tonic HRV predicted deontological responses, suggesting that utilitarian responders have a poorer level of neurovisceral integration. This result contradicts the claims of the dual-process model of moral cognition. The HRV data suggest, instead, that utilitarian responders poorly integrate bottom-up emotional signals – arising in the amygdala and activating the sympathetic nervous system – with top-down neural processes. This low integration suggests some form of impairment in the processes leading to moral judgment.

The conflicting results of neuroimaging vs. HRV studies call for more research into self-regulatory processes in moral judgment. Indeed, new converging results produced by two different neuroimaging labs (Shenhav & Greene, 2014; conceptually replicated by Hutcherson et al., 2015) suggest that good performance in moral dilemmas lies in the integration of emotional and rational inputs, not in the dominance of one over the other. These new results were obtained by requesting three different types of assessments from participants in a within-subjects design: how do you feel about the options (emotional assessment); which option produces more beneficial results (utilitarian assessment); which option is more morally acceptable (integrative judgment of moral acceptability). Both labs found that the emotional and the utilitarian assessments together predicted the judgments of moral acceptability. The emotional and utilitarian assessments functioned as inputs to be pondered and integrated in an “all things considered” judgment of moral acceptability, which in both studies correlated with the activation of the ventromedial prefrontal cortex (vmPFC). Thus, both studies suggested a new role for the vmPFC: it is no longer the locus of the emotional assessment – now ascribed to the amygdala – but is rather in charge of facilitating the integration of partially conflicting emotional and rational assessments. Importantly, in both studies “all things considered” judgments were spread all over a 4-point bipolar scale ranging from the appropriateness of deontological to the appropriateness of utilitarian judgment, with a mean near the midpoint. Neither study presented any claim to the effect that stronger activations in the vmPFC correlate with one particular response type (utilitarian or deontological). Absent a result of this nature, we must infer that correct performance is not tied to a utilitarian response. Good performers are, presumably, those who strive to integrate both the emotional and the “greater good” assessments, consistent with the neurovisceral integration model of HRV (Park et al., 2016). The early efforts of Greene et al. (2001, 2004, 2008) to show that utilitarian responses are outputs of cognitive control are not aligned with the new integrative role of the vmPFC, nor with the fact that “all things considered” judgments are spread all over a 4-point bipolar scale. Conversely, these new results also cast doubt on the claim, suggested by the findings of Park et al. (2016), that good self-regulation leads to more deontological responses in the moral dilemma task.

If neither type of moral response can be excluded as expression of self-regulation, it seems appropriate to introduce a new method of measurement independent of response type. In the exploratory study reported here, we propose to measure self-regulation in relation to an objective gradient of utilitarian pull in a set of personal dilemmas. Personal dilemmas have two features that make the conflict salient and trigger the need for an “all things considered” moral judgment. On the one hand, they are constructed in such a way that the agent attempting the utilitarian sacrifice must intentionally enter into physical contact with the victim (or at least perform muscular exertion) as a means to saving the others, and not merely as a side-effect of her attempt. On the other hand, if you tweak these personal scenarios in the right way, the proportion of utilitarian responses can be reliably manipulated from low to high. We explain in the overview below the meaning of these features and how they can be used to measure correct performance in participants confronting a set of dilemmas.

Overview: Sensitivity to utilitarian gradients as a measure of performance

A utilitarian gradient is realized in a set of dilemmas when the individual items always depict the same deontological cost (killing one person) and oppose it to utilitarian incentives of different magnitudes, which go from low to intermediate to high, thereby configuring a gradient in utilitarian pull. Independent studies have shown that participants, taken collectively, react rationally to utilitarian incentives, i.e., the sample mean of utilitarian response to each individual item corresponds to the item’s incentive level (Bucciarelli, 2015; Christensen et al., 2014; Huebner, Hauser, & Pettit, 2011; Moore et al., 2008; Rosas & Koenigs, 2014; Trémolière & Bonnefon, 2014; Gürçay & Baron, 2017). Participants respond to different levels of utilitarian incentives much as they do when playing a public goods game, where they rationally decide their cooperative investment depending on its rate of return (Evans et al., 2015; Krajbich et al., 2015; Rosas et al., 2019). In moral dilemmas, participants estimate what can be called a “utilitarian rate of return”. We can picture this estimation as a specification of the mental process that generates an “all things considered” judgment of moral acceptability. In this sense, measuring sensitivity to utilitarian gradients would allow us to see integrative moral judgment in action.

We thus thought it reasonable to explore whether HRV relates in the expected way to participants’ sensitivity to the gradient. A reasonable hypothesis is that differences in tonic (resting) HRV would relate to differences in gradient sensitivity. Participants with higher tonic HRV would exhibit, e.g., greater sensitivity; or alternatively, they would perceive the ordering of scenarios in the gradient in closer conformity to the ordering of costs and benefits. Similarly, we would expect that HRV would either decrease or increase during the task (phasic HRV). This would replicate what has been observed in many experiments confronting participants with emotional or cognitive challenges. Since dilemmas contain both emotional and cognitive challenges, we did not make a more specific prediction, in accordance with the exploratory nature of this study.

It is worth emphasizing that this method is not subject to the arbitrariness affecting the claim that either utilitarian or deontological responses index correct performance. It is not unreasonable to expect that participants have different thresholds at which for them the balance tips in favor of the utilitarian option. But if some participants lack such thresholds or show inconsistencies in the placement of scenarios along the gradient, it would be interesting to see whether this behavior somehow relates to tonic and/or phasic HRV levels. Attending to the latter possibility, we included in the set of scenarios one that involves a risk to the agent herself: in this scenario (labeled Flames, see Appendix), the agent (who is always identified with the participant) must sacrifice one individual to save five others, and the agent is one of them. Since low HRV is correlated with anxiety, panic, and a heightened sense of threat, high- and low-HRV participants might well respond differently to a scenario where their own (hypothetical) survival is at stake.

In this sense, sensitivity to the gradient indicates correct performance in a way that is free from the arbitrariness of considering either type of judgment to be generally correct. The method uses a purely formal feature of the cognitive process underlying the formation of “all things considered” moral judgments, namely, sensitivity to a utilitarian gradient. Note also that, although we talk of a “utilitarian gradient”, the gradient is also deontological in the opposite direction, as is natural on a bipolar scale.

Methods

Participants

We recruited 82 advanced undergraduate and graduate participants by word of mouth from universities in Bogotá, Colombia. Participants were 21–26 years of age (M = 24.8, SD = 1.45). This strict age limit was established to avoid known developmental confounds on HRV (Sosnowski, 2010). The proportion of females was 54%. The study was conducted in Spanish (see Appendix for original materials and translations). Participants refrained from alcohol and drug use for 4 h prior to their participation. They were also screened for a history of neurological disease; none reported any. Eleven participants had incomplete measurements due to errors in the digital recording of their electrophysiological data, and two participants had incomplete utilitarian responses. This yielded a final sample of n = 71 participants for the ECG measures and n = 80 participants for the utilitarian responses.

Procedure

All participants were tested individually in a well-lit room at the Interdisciplinary Laboratory of Human Sciences and Processes at the Universidad Externado de Colombia. Participants were welcomed and asked to read and sign the informed consent form, and to fill out a survey that included health and socioeconomic questions, and a question about religiosity. Participants were then fitted with the electrocardiogram (ECG) electrodes. After checking for proper equipment functioning and signal registration, participants were asked to remain in a resting position for 5 minutes, during which baseline measurements were taken.

When the baseline period was completed, we presented participants with the moral dilemma task. SuperLab was used to present the tasks and measure response times, while synchronizing these response times with electrophysiological signals. We asked participants to judge each situation presented to them from a moral point of view, imagining that the actions entailed no harm to themselves and had no legal or other consequences. Participants judged five scenarios in a within-subjects design. ECG measurements were taken for each of the five scenarios, which were presented sequentially and counterbalanced. For each scenario, the ECG measurements were taken in three self-paced phases: 1. the initial dilemma description, 2. the presentation of the utilitarian option, and 3. the request for a response. Participants moved through the three phases of each dilemma by pressing the space bar; this action was synchronized with ECG measurements, marking the beginning and the end of each phase. Phasic HRV was measured in the response phase, which spanned from the presentation of the question to the participant’s registration of an answer. To illustrate, we present here the three phases in the scenario Shark, which in our study represents the baseline dilemma with the lowest utilitarian pull:

Description: You are watching an exhibition of sharks being fed in an aquarium pool. A metal fence suddenly collapses, and a group of people fall into the pool. Their frantic movement in the water attracts the hungry sharks. A person next to you has a harmless episode of nose-bleeding.

Options: If you push this person into the water, their blood will draw the sharks away from the other five people. The bleeding person will die, but the five people will swim to safety.

Response: How right or wrong is it to cause the death of the person in order to save five people?

Participants registered their responses on a 4-point scale (1 = “Entirely wrong”, 2 = “More wrong than right”, 3 = “More right than wrong”, 4 = “Entirely right”). After participants responded to all five dilemmas, electrophysiological measurements stopped. All statistical analyses were carried out with SPSS version 25.0. The data are available at https://osf.io/vdrzm/

Materials

Sacrificial dilemmas

We designed a sequence of five dilemmas using only high-conflict personal scenarios, i.e., scenarios where the sacrifice of one person involves contact with the victim and/or exertion of muscular force and is knowingly performed as a means, not merely as a side-effect, to saving others.

Our scenarios included features that would confront participants with different utilitarian rates of return. Shark was our Footbridge-like scenario and acted as the baseline. It was constructed with four features: (1) a modest 1:5 kill-save ratio; (2) the victim is innocent of the threat of imminent death hanging over the others; (3) the victim will not die unless sacrificed to save the five; and (4) the agent performing the sacrifice has no personal stake in the sacrifice. In each of the other four scenarios we tweaked a single one of these four features to give the scenario a stronger utilitarian pull: Dam changes the kill-save ratio, i.e., one hundred thousand people would be saved by sacrificing one innocent person; in Grenade the person to be sacrificed is not innocent, because he credibly threatens five innocent others with imminent death; Boat features an innocent victim who would die along with five other persons even if no action was taken, but whose sacrifice would save the other five; and Flames features an agent who would save herself along with four others by sacrificing one innocent person. We included these differences to create a utilitarian gradient, taking into account previous research that shows their effects on the percentages or means of utilitarian response (Christensen et al., 2014; Trémolière & Bonnefon, 2014; Rosas & Koenigs, 2014; Bucciarelli, 2015; Rosas et al., 2019). The results were as expected: across the five scenarios we obtained three significantly distinct levels of mean utilitarian responses on the four-point scale.

Physiological measurements

We used a PowerLab 16/35 ADC to record electrophysiological measures, amplified with a DualBio amplifier. Recordings were sampled by PowerLab at 2000 Hz. SuperLab, installed on a Dell personal computer (Intel Core i7-6700, 3.40 GHz, 16 GB RAM), was used to show stimuli to participants. SuperLab sent stimulus markers via USB to a Cedrus StimTracker, which passed them to PowerLab via a digital port. PowerLab was connected via USB to a second Dell personal computer (Intel Core i7-6700, 3.40 GHz, 16 GB RAM) running LabChart. These devices and connections ensured fidelity and jitter-free markers in the recordings.

Using LabChart’s built-in software filters, we filtered the ECG data by applying a digital notch filter to remove power line noise in the 60 Hz frequency band. We also applied a 0.5 Hz high-pass filter (2nd-order Butterworth; internal parameters: cutoff frequency of 0.26 and transition width of 1.21). Finally, we applied a 20 Hz low-pass filter (4th-order Butterworth; internal parameters: cutoff frequency of 22.92 and transition width of 25.6). To obtain HRV measures, LabChart provided automatic beat recognition based on R wave detection and a classifier for normal and ectopic beats. The detected R waves were visually inspected by two researchers. We also used LabChart to calculate HRV measures in the time and frequency domains from the intervals of the markers based on the detected R waves.

ECG provides inter-beat interval (IBI) information that specifies the temporal distances between each R-spike and the next (R-R intervals). To derive HRV from this information, researchers can apply time-domain or frequency-domain approaches. Time-domain approaches apply statistical computations to the R-R intervals to derive information about IBI variance. Among the most common time-domain measures are the root mean square of successive differences between adjacent intervals (RMSSD) and the percentage of adjacent normal-to-normal intervals that differ from each other by more than 50 ms (pNN50). Frequency-domain approaches decompose the power spectral density of the IBI series into frequency components to provide HRV estimates within specific bands. The most commonly used frequency-domain measures are the high-frequency component (HF, between 0.15 and 0.4 Hz), the low-frequency component (LF, between 0.04 and 0.15 Hz) and the LF/HF ratio (Task Force of The European Society of Cardiology and The North American Society of Pacing and Electrophysiology, 1996; Thayer et al., 2010).
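
The two time-domain measures defined above can be illustrated with a minimal Python sketch. It is not the LabChart implementation used in the study; the function names and the R-R series are hypothetical, and the sketch assumes that ectopic beats have already been removed.

```python
import math


def rmssd(rr_ms):
    """Root mean square of successive differences between adjacent R-R intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two R-R intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def pnn50(rr_ms, threshold_ms=50.0):
    """Percentage of adjacent interval pairs that differ by more than threshold_ms."""
    diffs = [abs(b - a) for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two R-R intervals")
    return 100.0 * sum(d > threshold_ms for d in diffs) / len(diffs)


# Hypothetical R-R intervals in milliseconds (illustration only, not study data).
rr = [800.0, 860.0, 790.0, 805.0, 870.0]
print(rmssd(rr), pnn50(rr))
```

Both functions operate on successive differences, so they need at least two intervals; very short recordings yield unstable estimates.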

Among the frequency-domain components, there is general agreement that HF represents vagally-mediated parasympathetic nervous system activity. Thus, we focus our frequency-domain analyses on HF measures. The parasympathetic system’s activity registers prefrontal activity through the vagus nerve and allows us to theoretically connect HRV to self-regulatory capacity (Hansen et al., 2003; Holzman & Bridgett, 2017). In measuring HRV in the HF band (RSA), we did not control for respiration frequency, as some researchers recommend (Grossman et al., 1991). According to a review of the literature (Denver et al., 2007), there is no evidence of a lawful connection between RSA amplitude and frequency of respiration. Denver et al. (2007) also confirmed with two experiments that this holds true both in a resting condition and under manipulation of the amplitude of RSA via atropine.

Results

The utilitarian gradient in the five scenarios

Our five scenarios configured a gradient of utilitarian response. A related-samples Friedman’s analysis of variance by ranks with Bonferroni post hoc comparisons confirmed three significantly distinct levels of mean utilitarian response (all ps ≤ .01). Shark and Grenade, the low and high end of the gradient, respectively, were significantly different from each other and from Boat, Dam and Flames. The latter three items occupied the middle level and were not significantly different from each other but differed significantly from both the low and high ends of the gradient (Figure 1 and Table 1).

Figure 1. The five scenarios build a gradient of utilitarian response in a 4-point response scale

Table 1. Descriptive statistics: Mean and SD of utilitarian response

Utilitarian response correlates negatively with the HRV measures

The HRV measures of HF, RMSSD and pNN50, for both the tonic and the phasic HRV, were log10 transformed to obtain a normal distribution. We first checked for age effects on tonic HRV with Pearson correlations (2-tailed) and found none (all ps > .09). We computed the total utilitarian response (bipolar scale) to the five dilemmas for each participant. This variable originated from the ordinal 4-point scale of utilitarian response and was treated as ordinal. We then ran Spearman rho correlations (2-tailed) between total utilitarian response and HRV measures. We successfully replicated the key finding in Park et al. (2016): utilitarian response was negatively correlated with the three tonic HRV measures (HF, RMSSD and pNN50). The effect sizes of the correlations with the tonic measures were roughly of the same magnitude as those found by Park et al.: −.343 ≤ rs ≤ −.227, p < .001. We also found negative correlations between utilitarian response and the three phasic HRV measures (marginal with phasic HF, p = .052, see Table S1 in the Appendix).
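
The correlation step described above can be sketched in plain Python as follows. This is only an illustration, not the SPSS procedure used in the study: the helper names and the data are hypothetical, and Spearman's rho is computed as a Pearson correlation on average ranks, which is the standard way of handling ties.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values (illustration only, not study data).
tonic_hf = [1200.0, 300.0, 850.0, 95.0, 410.0]   # tonic HF power per participant
total_util = [8, 13, 9, 15, 13]                  # summed responses to five dilemmas
log_hf = [math.log10(v) for v in tonic_hf]       # log10 transform, as in the text
rho = spearman_rho(log_hf, total_util)
print(round(rho, 3))
```

Because log10 is strictly increasing, it leaves the ranks unchanged, so in this sketch the transform affects only the Pearson-based checks (such as the age correlations), not the rank-based coefficient.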

Effects of tonic and phasic HRV on moral sensitivity

We calculated the slope of each participant’s regression line with the SLOPE function in Excel, ordering scenarios according to their sample means (see Table 1). The average regression slope for the sample was .25, SD = .19. Roughly, the average slope represents the sample’s average moral sensitivity to the utilitarian gradient, as defined above. Since we have conceded that differences in moral sensitivity are legitimate, it is not possible to establish a unique correct or optimal slope. But plausibly, slopes lying well beyond one standard deviation above or below the average could count as incorrect performance; moreover, large divergences in the placement of one or several scenarios should be flagged for interpretation of performance, in conjunction with HRV data. Participants’ slopes ranged between −.3 and .8, including 0. Slope −.2 was not represented in any participant and some of the slopes were sparsely populated (see histogram, Figure 2a). However, we did not exclude any slope data from the following analyses. Since the participants’ slope was derived from the ordinal utilitarian response on a 4-point scale, we treated the variable as ordinal in the correlation analyses. Spearman rho correlations (two-tailed) between moral sensitivity and the three measures of log10 transformed HRV, both tonic and phasic, revealed no correlation of sensitivity with tonic HRV, and a small positive correlation of sensitivity with phasic HRV on the three measures. The correlation values with phasic HF, RMSSD and pNN50 were, respectively: rs = .153, p = .017; rs = .139, p = .008; rs = .182, p = .011. In the correlation with HF, to ensure the detection of 0.15 Hz in the high-frequency spectrum, we excluded the data points where participants responded in less than 6.7 s to any one of the 5 dilemmas (105 of 356 valid data points); one full cycle at 0.15 Hz lasts 1/0.15 ≈ 6.7 s. We also ran the analysis including those data points and the result was similar: no correlation with tonic and a small positive correlation with phasic HF (rs = .109, p = .04).
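
The per-participant slope can be sketched as an ordinary least-squares slope, which is what Excel's SLOPE function computes. The sketch assumes that the x-values are the scenario positions 1 to 5 in the sample-mean ordering; the text does not state the x-values explicitly, although the reported slopes in steps of .1 are consistent with this choice. The responses shown are hypothetical.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


# Assumed x-values: position of each scenario when ordered by sample mean (low to high).
positions = [1, 2, 3, 4, 5]

# Hypothetical responses of one participant on the 4-point scale, in that same order.
responses = [1, 2, 2, 3, 3]
slope = ols_slope(positions, responses)
print(slope)

# A participant who gives the same answer to every scenario has slope 0.
flat = ols_slope(positions, [2, 2, 2, 2, 2])
```

A positive slope means the participant's responses rise along the gradient; a slope of 0 indicates no sensitivity to it, and a negative slope a reversed ordering.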

Figure 2. a) Distribution of participant slopes. The frequency represents responses to each individual dilemma, where each participant delivered 5 responses. The slope −.2 was not represented in any participant. The slope .7 was represented in a participant with no HF data. b) Mean tonic and phasic HF as a function of participant slope and high (N = 129 responses) vs. low (N = 122 responses) tonic group. Groups were based on the median split of log10 transformed tonic HF, after subtracting 105 data points from responses where participants took <6.7 sec

As mentioned above, in some cognitive task studies neither performance nor HRV change relative to baseline is uniform across high vs. low tonic HRV (Weber et al., 2010; Park et al., 2014). Given this, we explored whether the links between performance (as measured by moral sensitivity) and HRV change could be different for participants with high vs. low tonic HRV. Thus, we repeated the correlation analyses separately for high vs. low tonic groups. The two groups were based on the median split of log10-transformed tonic HRV measures. In the high tonic group, we obtained medium-sized positive correlations between moral sensitivity and both tonic and phasic HRV. The rs values for the high tonic group were, with tonic: (HF) rs = .493, p < .001; (RMSSD) rs = .256, p = .001; (pNN50) rs = .298, p < .001; and with phasic: (HF) rs = .379, p < .001; (RMSSD) rs = .306, p < .001; (pNN50) rs = .249, p = .003. In the low tonic group, we observed tendencies toward a negative correlation between tonic HRV and moral sensitivity. In that group, the positive correlations between sensitivity and phasic HRV were not statistically significant, except for a small correlation with HF: rs = .194, p = .037.
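
A median split of this kind can be sketched as follows. The text does not say how values exactly at the median were assigned, so placing them in the low group is an assumption of this sketch; the function name and the data are hypothetical.

```python
import statistics


def median_split(values):
    """Label each value 'high' if above the sample median, otherwise 'low'.

    Assumption: values exactly equal to the median go to the 'low' group.
    """
    cut = statistics.median(values)
    return ["high" if v > cut else "low" for v in values]


# Hypothetical log10-transformed tonic HF values (illustration only, not study data).
log_tonic_hf = [2.1, 2.9, 2.5, 3.2, 1.8, 2.7]
groups = median_split(log_tonic_hf)
print(groups)
```

The correlation analyses would then be repeated within each label, which is why the two subsamples need not be of exactly equal size when ties or exclusions occur.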

Although the median split is an arbitrary method to separate low vs. high tonic groups, it reveals that the correlation between HRV and moral sensitivity is not uniform across all levels of tonic HRV. It is almost nonexistent in the low tonic group, but it is robust at higher levels. A plausible interpretation – in line with previous research on the relation between self-regulation and HRV – points to more controlled moral sensitivity at higher tonic levels. At these levels, individuals exert effort to carefully estimate the utilitarian rate of return in the moral dilemma task, while at low tonic levels responsiveness to utilitarian gradients is likely based on intuitions prone to distortions. We comment on this in the discussion. A significant change from tonic to phasic HRV would indicate this difference in effort. We focus the remaining analyses on the log10-transformed high frequency (HF) variable.

We plotted the mean tonic and phasic HF values on separate lines against the slope scores on the x-axis (Figure 2b), separating the plots into low (N = 122 responses) vs. high (N = 129 responses) tonic HF groups. Figure 2a shows that the slope range between 0 and .4 is the most populated and thus yields the most reliable data. In the segment from .2 to .4, Figure 2b visually displays the positive correlation between slope and phasic and tonic HF in the high tonic group, and the absence of such correlation in the low tonic group. In the high tonic group, a statistically significant suppression of phasic HF accompanies its positive correlation with slope. We confirmed this with a comparison of mean tonic vs. phasic HF for the high tonic group in the slope range (0, .4). Tonic HF (M = 2.75, SD = .49) was significantly higher than phasic HF (M = 2.56, SD = .69), t(224) = 5.08, p < .001.

We then ran linear regressions with slope as DV. In the high tonic group, phasic HF was a significant predictor of slope when entered as the sole IV in the regression: Beta = .244, t(123) = 2.79, p = .006, adj. R2 = .05. When entered together with tonic HF, only the latter was a significant predictor: Beta = .502, t(122) = 5.44, p < .001, adj. R2 = .23, indicating that the model accounted for 23% of the variance in moral sensitivity; the values for phasic HF were: Beta = −.017, t(122) = −.19, p = .85. The effect of phasic HF on slope in the high tonic HF group was thus completely mediated by tonic HF. In contrast, neither tonic nor phasic HF predicted slope in the low tonic group, whether entered alone or together (all ps > .23).
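As a consistency check on the single-predictor regression, the reported statistics can be reproduced from the standardized coefficient alone. The sketch below relies on two standard identities: with one predictor, the standardized Beta equals the Pearson correlation r, so t = r·sqrt(n − 2)/sqrt(1 − r²), and adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). The sample size n = 125 is an assumption inferred from the reported degrees of freedom (123 = n − 2); the function names are illustrative.

```python
import math

def t_from_r(r, n):
    """t statistic for a correlation (or a sole standardized predictor)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

beta = 0.244   # standardized coefficient of phasic HF as sole predictor
n = 125        # assumed from the reported df of 123

t_value = t_from_r(beta, n)
adj = adjusted_r2(beta ** 2, n, p=1)
print(f"|t| = {abs(t_value):.2f}, adjusted R2 = {adj:.2f}")
```

Under these assumptions the magnitude of t and the adjusted R² agree with the values reported for the sole-predictor model.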

Despite the important differences between the low vs. high tonic groups in the relationships between tonic HF, phasic HF, and slope, the low tonic group also exhibited moral sensitivity corresponding to the slopes between 0 and .4. We thus explored differences between the low vs. high tonic HF groups regarding their placement of scenarios along the utilitarian gradient. To compare the two groups in this respect, we plotted the mean UR for each scenario for each group. The groups differed in only one respect: their attitude to Flames, the only scenario in our battery where the self-preservation of the participants was hypothetically at stake. The high tonic group rated it as of low utilitarian pull (second to lowest after Shark) while the low tonic group rated it as second to highest after Grenade (Figure 3). Independent samples t-tests revealed a difference in mean UR between the low and high tonic groups only in Flames: t(51) = 2.570, p = .013. Consistently, the negative correlation found in the whole sample between tonic HF and UR (rs = −.227, p < .001) disappeared when we repeated the analysis excluding Flames (rs = −.088, p = .219). We comment on this finding in the discussion.

Figure 3. Perception of the utilitarian gradient as a function of level of tonic HF

No significant relation of total UR or reaction time to the high vs. low tonic group

Finally, we explored whether total UR and reaction time related to tonic and phasic HF in a similar way as slope does, i.e., whether these variables exhibited particularly significant relations in the high vs. the low tonic groups. Reaction time did not correlate significantly with either tonic or phasic HF. When we divided the sample into low vs. high tonic groups based on the median split of log10-transformed HF, no significant results emerged for either total UR or reaction time. Total UR’s negative correlations with both tonic and phasic HF emerged only for the whole sample (rs = −.227, p < .001, and rs = −.125, p = .052, respectively), and disappeared when looking at either group of the divided sample.

Discussion

Research into moral judgment using HRV measures and the moral dilemma task is hampered by the lack of an objective measure of performance in that task. The only study to date addressing this subject found that utilitarian judgment was negatively correlated to tonic HRV, suggesting that utilitarian responders have a poorer level of neuro-visceral integration (Park et al., Citation2016). This result contradicts the claims of the original dual process model of moral cognition (Greene et al., Citation2008, Citation2004, Citation2001), namely, that utilitarian responses express the ability to exert cognitive control over a prepotent emotional response. We do not deny that there is a reasonable level of certainty in the inference from levels of tonic HRV to performance on any cognitive task, and thus also in the moral dilemma task. However, with respect to the moral dilemma task, an independent measure of performance would provide research with a more fine-grained picture of regulatory effort in that task. Discriminating performance on the basis of moral response type is not an option, given that neither deontological nor utilitarian responses can be straightforwardly classified as correct or incorrect. Moreover, new neuroscientific results (Hutcherson et al., Citation2015; Shenhav & Greene, Citation2014) suggest that the vmPFC integrates emotional and rational assessments into “all things considered” moral judgments. Since these integrative moral judgments are spread evenly across a 4-point bipolar scale, it seems out of the question to identify correct performance, i.e., successful integration, with any of the two moral response types.

In this paper, we provide an independent measure of self-regulation for this task. To achieve this, we link the new neuroscientific evidence about integrative moral judgments (integrating emotional “harm aversion” and rational “greater good” estimates) with a feature of personal moral dilemmas that has been replicated across many studies. Although the paradigmatic personal dilemma – namely Footbridge – usually receives a low percentage of utilitarian approval, its characteristic features can be tweaked to create statistically significant increments in utilitarian responses. Those features change the utilitarian “rate of return”, i.e., the participants’ estimate of how justifiable it would be to violate a norm against killing in order to save lives (Moore et al., Citation2008; Christensen et al., Citation2014). It is reasonable to expect from participants some moral sensitivity to this “rate of return”. Thus, we confronted them with a set of five moral dilemmas that exhibit a gradient of low to high rates. The gradient is initially set by ordering the sample’s mean responses to the different scenarios from low to high. We represent the participants’ sensitivity to the utilitarian gradient as their response slope to the five scenarios, and we use the participants’ sensitivity as a measure of self-regulation.
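A participant's response slope can be illustrated with a short sketch. It assumes, as one plausible reading and not a confirmed description of the authors' procedure, that the slope is the ordinary least-squares slope of a participant's five ratings regressed on the scenarios' rank positions (1 to 5) in the sample-level gradient. With integer ratings this yields slopes in steps of .1, which matches the granularity of the slope values reported in the article. The function name and example ratings are hypothetical.

```python
def response_slope(responses):
    """OLS slope of ratings on gradient positions 1..n (assumed definition).

    `responses` must be ordered by the scenarios' position in the
    sample-level gradient, from lowest to highest mean utilitarian response.
    """
    n = len(responses)
    positions = range(1, n + 1)
    mean_x = (n + 1) / 2
    mean_y = sum(responses) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, responses))
    sxx = sum((x - mean_x) ** 2 for x in positions)
    return sxy / sxx

# Invented ratings on the 4-point scale, ordered along the gradient.
sensitive = response_slope([1, 2, 2, 3, 3])    # ratings rise with the gradient
flat = response_slope([2, 2, 2, 2, 2])         # no sensitivity to the gradient
reversed_ = response_slope([3, 2, 2, 2, 1])    # ratings fall along the gradient
print(round(sensitive, 1), round(flat, 1), round(reversed_, 1))
```

A positive slope means the participant's ratings increase along the gradient, a slope of zero means the ratings ignore it, and a negative slope means they run against it.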

In tasks where performance is positively correlated with regulatory capacities, higher tonic HRV is expected to correlate with better performance. Thus, in this study, sensitivity to the utilitarian gradient would be expected to correlate with higher tonic HRV. Non-parametric correlations showed no relation between tonic HRV and participants’ response slope to the set of dilemmas. But after dividing the sample into high vs. low tonic HRV groups, we observed medium-sized positive correlations between three measures of tonic HRV and participants’ response slope only for the high tonic group. Apparently, only high tonic HRV participants make a controlled estimate of the utilitarian rate of return. Thus, only their response slope fits our measure of performance.

Another expectation is that differences in regulatory capacity should make a difference with respect to phasic HRV reactivity measures. Regulatory effort was expressed as phasic HRV suppression in the high tonic group. A statistically significant decrease in phasic HF was observed in the slope range between 0 and .4. In the low tonic group, participants showed no consistent pattern in phasic reactivity. Moreover, although we had observed small positive correlations between slope and phasic HRV measures in the whole sample, they were medium-sized in the high tonic group, while they were absent in the low tonic group. This suggests that the ability to discriminate differences in the utilitarian pull increased with the regulatory effort exerted by participants in the high tonic HRV group, and that only high-tonic participants exerted the effort. Any responsiveness shown by low tonic HRV participants was probably intuitively guided and subject to distortions. These results present evidence for considering sensitivity to a utilitarian gradient across personal dilemmas as a measure of performance.

There are two issues that could cause concern. One is minor, namely, how to interpret the negative slopes (−.3 and −.1) observed in two high tonic participants. Adequate sensitivity to the utilitarian gradient should preclude negative slopes. We believe it is unlikely that this observation carries any significance. But larger samples and comprehension checks would be needed to decide this issue.

The second concern is critical. Although the pattern of relations between slope, tonic HF and phasic HF is consistent with HRV findings in other cognitive tasks, this pattern emerged only in the high tonic group. The low tonic group also populated the slope range that indexes correct performance without exhibiting the expected pattern. This suggests that the slope range alone cannot pin down regulatory failure. It is thus fitting that we did observe a difference between the two groups regarding the scenario ordering. These groups placed Flames on opposite sides of the gradient (Figure 3). This is no small difference, since Flames differs from the baseline scenario only by including a selfish stake in the utilitarian action. Low tonic participants who rate this scenario at the high end of the gradient are susceptible to a distortion in their moral judgment, perhaps due to greater levels of personal anxiety. Therefore, regulatory failures cannot be spotted using slope values alone. Instead, they are primarily identified in failures to discriminate between utilitarian moral motivations and other motivations (e.g., self-preservation).

This last point has interesting implications regarding the focus of future research with moral dilemmas. Research has been all too narrowly focused on the opposition between deontology and utilitarianism and has created the expectation that performance is somehow linked to it. But plausibly, low vagal tone would not produce an increase in utilitarian responses unless some scenario or scenarios were overrated, and thus misplaced along the gradient. What previous research interpreted as poor neurovisceral integration in utilitarian inclinations expressed in moral dilemmas might be more properly viewed as an excessive influence of non-moral incentives on moral judgment, caused by self-regulatory failures in participants with low vagal tone. This kind of confusion distorts the integration of conflicting moral considerations, which was postulated as the role for the vmPFC by the new neuroimaging studies reviewed in the introduction.

This result points us to a plausible conclusion: low vagal tone causes failures in the estimation of the utilitarian rate of return of some scenarios by allowing non-moral features to influence the moral integration process, rather than by simply failing to instantiate moral sensitivity. Further research should try to establish how these failures arise. One possibility, compatible with our data, is that participants initially produce an intuitive estimation of the rate of return, i.e., the “all things considered” moral judgment. This intuitive processing might yield roughly correct estimates in many cases. But some scenarios might pose a need to cross-check the results via effortful deliberation. This second process might fail to intervene when vagal tone is low and would be responsible for the misplacement of some scenarios.

Limitations and recommendations for future research

In this study, we attempted to avoid well-known developmental confounds on HRV by including only young adult participants (age range = 21 to 26 y.). We did not, however, measure other possible factors like exercise level or socioeconomic status, which could possibly affect HRV levels. Those factors function as causes of high or low levels of tonic HRV. We left out the causes to focus on whether HRV can be used as a measure of moral performance. Since utilitarian responses do not seem to index self-regulatory effort – utilitarian scores are negatively correlated with tonic HRV levels – we explored whether sensitivity to a utilitarian gradient could function as such a measure. Our study shows that participants’ slopes in a range from 0 to .4 and a morally acceptable placement of items within the utilitarian gradient correlate with high tonic HRV in a moral dilemma task.

This measure does not point to utilitarian responses as uniquely indexing self-regulatory effort, which was proposed by the original dual-process theory of moral cognition (Greene et al., Citation2004, Citation2001). If our performance measurement proposal is correct, future research will need to re-conceive the contributions of intuition (system 1) and deliberation (system 2) (Evans & Stanovich, Citation2013) to moral judgment. Moral dilemmas might, for example, activate intuitions of both utilitarian and deontological principles. Some recent research points in this direction (Bago & De Neys, Citation2019; Rosas & Aguilar-Pardo, Citation2020). After moral intuitions are activated, participants would engage deliberation to confirm whether the intuitive answer properly accommodates all the features of the situation. However, participants can and will differ in the response they reach after deliberation. This is a plausible hypothesis that could be used to guide future research.

The use of scenario-placement along a utilitarian gradient as a measure of performance introduces some additional complication in the methods, for researchers will need to agree as to what are unacceptable slopes and unacceptable item-placements. That said, in some cases, those agreements are not difficult to reach, as is the case of a scenario that is baseline in all aspects, except that the agent has a selfish stake in the sacrifice. And once agreements are reached, it is easy to measure performance. The measure proposed here is compatible with possible instances of deontological responses to a selfish stake scenario being the result of self-regulatory effort. Though it is reasonable to be flexible regarding which utilitarian incentives override a deontological norm against killing, some motivations – e.g., a self-preservation anxiety – could not be meaningfully taken to be overriding moral factors.

An important limitation in our study is that we did not implement an independent control task of self-regulation. This could be done with a self-report questionnaire, e.g., the Difficulties in Emotion Regulation Scale (DERS) (Gratz & Roemer, Citation2004) or the Self-Control Scale (Tangney et al., Citation2004), or perhaps better, by requesting participants to complete a separate cognitive task with correct and incorrect answers, which would allow researchers to measure HRV and self-regulation independently of the moral dilemma task and to compare it with the moral sensitivity to utilitarian gradients. This control task should be implemented in future research.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This project was funded by the Research Division of the Universidad Nacional de Colombia, project 37159, 2017–2018, and by the Universidad Externado de Colombia, project 370072017–2018. It was supported by the Swiss National Science Foundation’s project The Nature and Value of Efforts.

References

Appendix

English translations and original scenarios (Spanish).

The utilitarian incentives built into each scenario are indicated in parentheses in the English version.

Shark (baseline incentive: save five)

You are watching an exhibition of sharks being fed in an aquarium pool. A metal fence suddenly collapses, and a group of people fall into the pool. Their frantic movement in the water attracts the hungry sharks. A person next to you has a harmless episode of nose-bleeding. If you push this person into the water, their blood will draw the sharks away from the other five people. The bleeding person will die, but the five people will swim to safety.

How right or wrong is it to cause the death of the person in order to save five people?

Boat (baseline and doomed victim)

You escape on a lifeboat from a ship on fire. The lifeboat slowly sinks due to the weight of 2 passengers too many. You are a good swimmer, so you jump into the turbulent waters; but now with one passenger too many the boat continues to slowly sink. None of the remaining passengers wear life jackets, and none of them can swim.

Nobody is willing to get off and the boat will soon sink. If you pull the person closest to you into the water and keep clear of his attempts to drag you down with him, this person will drown, but you will save five people in the boat.

How right or wrong is it to cause the death of the person in order to save five others?

Dam (baseline and save 100K)

You have a job in the dam management of a large river. Failure in the materials has caused a leak in the dam. The leak has opened a hole and the dam will soon burst apart. A huge volume of water will flood a nearby town. There is no time to warn the inhabitants.

On the dam’s edge a worker looks concerned into the dam. If you push this person into the water, his body will stopple the hole, giving time to take the necessary actions. This person will drown, but you will save 100,000 people living in the town.

How right or wrong is it to cause the death of the person in order to save 100,000 people?

Flames (baseline and selfish incentive)

You and four other people are trapped in a burning building. There is only one emergency exit, but it is blocked by burning debris. A sixth person tries to crawl through a hole at the base of that exit. You and the other four lack the time to do the same thing.

If you push that person through the debris to unblock the exit, that person will die, but you and the other four people will exit to safety.

How right or wrong is it to cause the death of the person in order to save yourself and four others?

Grenade (baseline and guilty victim)

You stand on a rooftop and near you a man is about to throw a grenade onto a group of five people below. The group is unaware of the threat and there is no way to warn them. The five people will surely die in the explosion.

If you push the man with the grenade with a quick movement, he will fall and fail to activate the grenade. The fall will kill him, but the five people below will be saved.

How right or wrong is it to cause the death of the person in order to save five others?

Original scenarios (Spanish)

Tiburón

Visitas un acuario que exhibe tiburones mientras son alimentados. Muy cerca de ti, la reja donde se agolpan los visitantes cede y 5 personas caen al agua. Su caída atrae a los tiburones hambrientos.

Junto a ti, otro visitante sangra excesivamente por la nariz. Si lo empujas al agua, la sangre atraerá a los tiburones y los alejará de las 5 personas que están a punto de ser devoradas. El hombre que sangra morirá, pero las otras cinco personas se salvarán.

Barco

Viajas en un barco que se incendia y te pones a salvo en un bote. El bote aguanta a cinco personas, pero lleva siete y comienza a hundirse por sobrepeso. Tú nadas muy bien y saltas al agua, pero el bote con seis se sigue hundiendo. Ninguno de los seis lleva chaleco salvavidas y ninguno sabe nadar. Todos van a morir si nadie salta al agua; pero nadie lo hace.

Si desde el agua tú jalas al hombre que está más cerca y lo sacas del bote sabiendo que se ahogará por las condiciones difíciles del mar, el bote dejará de hundirse y los otros cinco se salvarán.

Represa

Trabajas en la represa de un río muy caudaloso. Por fallas en el material se produjo un escape en el dique. La concentración de la presión en ese punto romperá el dique. El río caerá con ferocidad sobre una ciudad aledaña de cien mil habitantes. Ya no hay tiempo de avisar.

En el borde de la represa hay un trabajador. Si lo empujas al agua, la corriente lo arrastrará y tapará el hueco con su cuerpo dando tiempo a tomar medidas permanentes. El trabajador se ahogará, pero se salvarán cien mil vidas.

Llamas

Tú y cuatro personas más están atrapadas en un edificio en llamas. Hay una única salida de emergencia, pero está bloqueada por escombros ardientes. Un hombre intenta arrastrarse por un hueco en la base de dicha salida, pero las otras cuatro personas y tú no tienen tiempo de hacer lo mismo.

Si empujas al hombre que está intentando salir contra los escombros ardientes para desbloquear el paso, él morirá, pero tú y las otras cuatro personas se salvarán.

Granada

Estás en una azotea y te das cuenta de que cerca de ti hay un hombre que amenaza con arrojar una granada sobre un grupo de 5 personas reunidas en un parque aledaño. El grupo desconoce la amenaza y no hay forma de avisarles. Las 5 personas morirán con seguridad en la explosión.

Si empujas al hombre con un movimiento rápido, caerá desde la azotea sin tener tiempo de activar la granada. Esto lo matará, pero así se salvarán las 5 personas inocentes.