Research Article

All mouth and trousers? Use of the Devil’s Advocate questioning protocol to determine authenticity of opinions about protester actions

Received 06 Feb 2023, Accepted 29 Jun 2023, Published online: 19 Sep 2023

Abstract

We examined the Devil’s Advocate lie detection method, which is aimed at detecting lying about opinions. In this approach, participants give reasons why they hold an opinion in response to an eliciting-opinion question and counter-arguments to their opinion in response to a devil’s advocate question. Truth tellers (n = 55) reported their true opinion about protester actions, whereas lie tellers (n = 55) reported the opposite of their true opinion. Answers were coded for number of arguments and for plausibility, immediacy, clarity and scriptedness. Data were analysed with analyses of variance with veracity as the sole factor. Supporting the hypothesis, truth tellers provided more pro-arguments than lie tellers, and their answers to all eliciting-opinion questions sounded more plausible, immediate and clear than lie tellers’ answers. The opposite pattern was predicted for the devil’s advocate question but not found, likely because the question was simplified. Nor was being scripted a diagnostic veracity indicator.

Verbal deception research overwhelmingly focuses on lying about alleged activities (Vrij, Citation2008; Vrij, Fisher, et al., Citation2023; Vrij, Granhag, et al., Citation2022). Practitioners, however, frequently tell us that they are also interested in lying about opinions. Examples of these are: ‘What is this person’s view of committing terrorist acts to achieve a goal?’, ‘What does this person think of protester actions?’, ‘What is this person’s view of our Government and country?’ and ‘What is this person’s view of the network of people s/he wants to inform us about as an informant?’. Making incorrect veracity judgements about expressed opinions can have serious consequences. Usman Khan was the London Bridge attacker who stabbed two people to death and injured three others on 29 November 2019 before being shot dead by the police. He spent eight years in jail for terrorist offences but was released on parole in December 2018. He was considered a ‘success story’ for a rehabilitation programme for terrorism convicts (Dixon et al., Citation2019).

The Devil’s Advocate approach is to our knowledge the only verbal interview protocol available to date designed to detect lying about opinions. As far as we are aware it has been examined three times with individual interviewees (Leal, Vrij, Deeb, & Dabrowna, Citation2023; Leal et al., Citation2010; Mann et al., Citation2022) and once with pairs of interviewees (Deeb et al., Citation2018). The Devil’s Advocate approach consists of three questions. First, the interviewee is asked what his/her opinion is about a specific topic (‘What do you think of the cost-of-living protest actions that occur across the UK?’). This is followed by an eliciting-opinion question that invites interviewees to report their arguments in favour of the view they just expressed – for example, ‘Provide all the reasons why you are in favour of the cost-of-living protest actions.’ The third question is the devil’s advocate question in which interviewees are asked to give arguments opposed to their expressed view: ‘Try to play devil’s advocate and imagine that you are against the cost-of-living protest actions. Provide all the reasons why you may be against these actions’.

The Devil’s Advocate interview protocol focuses on the last two questions. Truth tellers and lie tellers are thought to respond differently to them. In this context, truth telling means expressing a true opinion whereas lying means expressing an opinion opposite to the true opinion. For example, someone who is in favour of protest actions claims in the interview to be against them. When people have formed an opinion about a topic, they have thought about reasons that support their view (Ajzen, Citation2001) and those reasons are typically readily available to them (Darley & Gross, Citation1983; Jones & Sugden, Citation2001). The eliciting-opinion question invites truth tellers to express these reasons, and they should be able to do so eloquently. The devil’s advocate question invites truth tellers to report reasons that oppose their actual opinion. Such reasons are less readily available to them because, as the confirmation bias predicts, people do not seek information that opposes their views (Darley & Gross, Citation1983; Jones & Sugden, Citation2001). Truth tellers should therefore struggle more to answer the devil’s advocate question than the eliciting-opinion question, resulting in a less eloquent answer to the devil’s advocate question than to the eliciting-opinion question.

For lie tellers, the devil’s advocate question is asking for the reasons they believe. These reasons should therefore be readily available to them. The eliciting-opinion question asks for reasons they do not believe and these reasons should thus be less readily available to them. Yet, for two reasons we do not predict lie tellers to be more eloquent when responding to the devil’s advocate question than to the eliciting-opinion question. First, lie tellers prepare themselves for interviews to come across as convincing (Clemens et al., Citation2013; Deeb et al., Citation2018; Leins et al., Citation2013). They are therefore likely to think of reasons they can give that support their pretended opinion. This should improve the eloquence of their responses to the eliciting-opinion question. Second, although truth tellers often think that their credibility shines through (Kassin, Citation2014; Kassin et al., Citation2010) – referred to as the illusion of transparency (Savitsky & Gilovich, Citation2003) – lie tellers do not take their credibility for granted (DePaulo et al., Citation2003). It means that for lie tellers the way they present information matters (Granhag & Hartwig, Citation2008). Lie tellers are keen to be consistent in interviews because they think that inconsistencies indicate deceit (Deeb et al., Citation2017; Strömwall et al., Citation2004; Vrij, Deeb et al., Citation2021). In an interview about alleged past activities, consistency is typically defined as presenting the same information in multiple interviews (Fisher et al., Citation2013; Vredeveldt et al., Citation2014). This type of consistency is impossible in a Devil’s Advocate interview because the eliciting-opinion and devil’s advocate requests ask about different things. In a Devil’s Advocate interview, consistency would mean responding to the eliciting-opinion and devil’s advocate request with equal eloquence and/or in equal length. 
The latter (equal length) is how consistency is defined in comparable baseline deception research (Bogaard, Meijer, et al., Citation2022; Bogaard, Nußbaum, et al., Citation2022; Palena et al., Citation2018), whereby a truthful answer about a certain activity is compared with the answer to a target question about which the veracity status is unknown.

The three experiments published to date supported the Devil’s Advocate reasoning. The residue means (responses to the eliciting-opinion questions minus responses to the devil’s advocate questions) were larger in truth tellers than in lie tellers for eloquence in Leal, Vrij, Deeb, and Dabrowna (Citation2023) and Mann et al. (Citation2022) (Leal, Vrij, Deeb, & Dabrowna, Citation2023: plausibility d = 0.51, immediacy d = 0.38, clarity d = 0.48; Mann et al., Citation2022: plausibility d = 0.47, immediacy d = 0.48, clarity d = 0.35). Leal et al. (Citation2010) measured only plausibility and immediacy and reported their findings differently: they reported the residue means for truth tellers and lie tellers separately. The same pattern as that in Leal, Vrij, Deeb, and Dabrowna (Citation2023) and Mann et al. (Citation2022) emerged: the residue means in truth tellers (plausibility d = 1.59 and immediacy d = 2.16) were more pronounced than those in lie tellers (plausibility d = 0.04 and immediacy d = 0.36).

Unlike Leal et al. (Citation2010), Leal, Vrij, Deeb, and Dabrowna (Citation2023) and Mann et al. (Citation2022) counted the number of reasons that truth tellers and lie tellers provided. For truth tellers, the arguments that support their view are more readily available than their arguments that oppose their view. For lie tellers, the pro- and anti-arguments may be more in balance due to their preparation and keenness to be consistent. It was therefore predicted that truth tellers would provide more reasons that supported their opinion (defined as pro-arguments minus anti-arguments) than lie tellers. This was not found in previous research. In fact, no difference emerged in Leal, Vrij, Deeb, and Dabrowna (Citation2023) (d = 0.22), whereas lie tellers reported more pro-arguments than truth tellers in Mann et al. (Citation2022) (d = 0.45). We sought to examine the robustness of these unexpected effects in the present experiment.

It has been argued that it is important to measure the strategies that truth tellers and lie tellers use in interviews to appear convincing (DePaulo et al., Citation2003; Vrij & Granhag, Citation2012). Such strategies could explain the differences in speech content (as well as nonverbal behaviour) between truth tellers and lie tellers (DePaulo et al., Citation2003). Insight into strategies could also be used to develop interview protocols intended to exploit the different strategies that truth tellers and lie tellers use (Vrij & Granhag, Citation2012; Vrij, Granhag, et al., Citation2022). To date, strategies have only been investigated when people tell the truth or lie about alleged activities. Results showed that truth tellers are inclined to tell it all whereas lie tellers prefer to keep their stories simple (Hartwig et al., Citation2007; see also Colwell et al., Citation2006; Hartwig et al., Citation2010; Vrij et al., Citation2010). Another strategy reported by both truth tellers and lie tellers is to control their nonverbal demeanour (Leal, Vrij, Deeb, & Fisher, Citation2023). We explored whether these and perhaps other strategies would emerge when people lie about their opinions.

Similar to Mann et al. (Citation2022), the present experiment employed the Devil’s Advocate interview protocol in a protester-actions setting. We introduced the following six changes. First, the level of activism participants endorsed was rather low in Mann et al. (Citation2022). We attempted to raise these levels by presenting participants with vignettes whereby a protagonist has suffered a clear injustice. The vignettes may stir up empathy/anger in participants such that they may consider it appropriate to take stronger actions about the situations described.

Second, our discussions with practitioners revealed that they found the devil’s advocate question difficult to use due to the way it was phrased. Mann et al. (Citation2022) kept the original phrasing intact but added a question to the protocol that elicited devil’s advocate responses in a different way: participants were asked to take on the role of a script writer and to write a part in a film for a character who strongly holds the opposite view to the views they supported in the interview. It thus invited truth tellers to report views opposite to their real views but lie tellers to report their real views. This alternative devil’s advocate question did not reveal veracity differences. We therefore dropped it in the present experiment but simplified the original devil’s advocate question instead. See the Method section for the entire interview protocol including the new devil’s advocate question (Q3).

Third, there was only one eliciting-opinion question in Mann et al. (Citation2022) and Leal et al. (Citation2010). In the current protocol, we included three eliciting-opinion questions (Q2 is the original question, and Q4 and Q5 are new). Multiple questions about the same concept show the level of internal consistency, which is an indicator of reliability (Henson, Citation2001). Note that the result of the changes presented in Points 2 and 3 is that the current interview protocol is unbalanced, with three eliciting-opinion and one devil’s advocate question.

Fourth, Mann et al. (Citation2022) examined the Verifiability approach in addition to the Devil’s Advocate approach whereas we have focused solely on the Devil’s Advocate approach. Fifth, unlike Mann et al. (Citation2022) and Leal et al. (Citation2010), we measured the self-reported strategies that interviewees used to come across as convincing during the interview.

Finally, there were a few deviations from Mann et al. (Citation2022) regarding the dependent variables. We did not measure number of words, number of details and verifiable sources. Number of words and number of details were dropped because they did not yield significant effects in Mann et al. (Citation2022). In addition, we considered number of words to be too vague to be meaningful. Available methods to code details typically refer to references to people, location, actions, time and objects (Alison et al., Citation2013; Deeb et al., Citation2022) or to references to sensory information (details related to vision, sound, taste, touch or smell) and contextual embeddings (where and when an experience took place; Gancedo et al., Citation2021). Such detail coding makes sense when people discuss alleged experiences but less so when they discuss their opinions. Verifiable sources were deleted because they are part of the Verifiability approach rather than the Devil’s Advocate approach.

The variable being scripted is new. Scriptedness is defined as giving a response that contains mental scripts – for example, two- or three- or more-word phrases that automatically come to mind in speech (e.g. ‘to be honest’, ‘knock-on-effect’, ‘social media platform’, etc.). All other variables we measured are cues to truthfulness (truth tellers report them more than lie tellers), whereas being scripted would be a cue to deceit (lie tellers report it more than truth tellers). There is a lack of cues to deceit in verbal lie detection (Nahari et al., Citation2019), which is considered problematic (Vrij, Fisher, et al., Citation2023). If only cues to truthfulness are measured, deception can only be determined indirectly: via the absence of cues to truthfulness. In real life someone could be more confident in deciding that someone is lying if the absence of cues to truthfulness is accompanied by the presence of cues to deceit (Vrij, Fisher, et al., Citation2023).

Hypothesis

In this pre-registered experiment (pre-registration: https://osf.io/f5chx?view_only=509da9433071416490e8461a12e7c995) we tested one pre-registered hypothesis. It was similar to the hypothesis examined by Mann et al. (Citation2022). Truth tellers will provide more arguments defending their opinion, and their responses will be more plausible, immediate and clear than those of lie tellers. Lie tellers’ responses will sound more ‘scripted’.

A more direct test of the Devil’s Advocate Approach is to compare the answers to the eliciting-opinion question and the devil’s advocate questions. For this purpose, we subtracted the devil’s advocate mean score from the eliciting-opinion mean score, and, following Mann et al. (Citation2022), we label these mean scores ‘residue means’. The Devil’s Advocate approach predicts higher residue means for truth tellers than for lie tellers (exploratory hypothesis).

Method

Participants

We ran a G*Power analysis to determine the required sample size for a multivariate F-test (multivariate analysis of variance, MANOVA, global effects). Assuming an alpha level of .05, a statistical power of .85, and a medium effect size (f² = .16), we needed at least 102 participants for the experiment. We aimed for a medium effect size as our smallest effect size of interest due to the practical significance of the research. Differences between truth tellers and lie tellers should be clearly noticeable in real time, a prerequisite for making a veracity assessment method relevant for practitioners. This smallest effect size of interest also aligns with previous research on the Devil’s Advocate approach (e.g. Mann et al., Citation2022) that showed medium to large effect sizes. We recruited 112 adult participants, but two participants did not follow the instructions correctly. Their data were omitted, and the final sample included 110 participants. Most participants were female (70%); the others were male (28.2%) or non-binary (1.8%). Their mean age was 28.86 years (SD = 11.25). Most participants (64.5%) were from the United Kingdom; the others were from Eastern Europe (10.9%), Western Asia (9.1%), Western Europe (6.4%), Africa (4.5%), Eastern Asia (1.8%) or the Middle East (0.9%). Participants were recruited via the departmental database and university staff and student portals; 49.1% of participants were students at the university, 18.2% were staff, and 32.7% were not from the university. A total of 18 different native languages were spoken across the sample, but for most participants (70%) English was the native language. The experiment conformed with the principles of the Declaration of Helsinki, and ethics approval was granted by the University faculty’s ethics committee and the sponsor’s ethics committee.
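As a rough cross-check of the sample-size target (this is not the multivariate calculation G*Power performs; it is a simpler normal-approximation sketch for a two-group comparison at the same alpha, power and effect size, where f² = .16 implies Cohen's f = .40 and, for two groups, d = 2f = .80), the required n can be approximated in pure Python. The multivariate test with several dependent variables plausibly requires the larger figure of 102 reported above.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(d: float, alpha: float = 0.05, power: float = 0.85) -> int:
    """Approximate per-group n for a two-sided two-sample comparison
    with standardized effect size d (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)

f = sqrt(0.16)          # Cohen's f from f^2 = .16
d = 2 * f               # for two groups, d = 2f
print(n_per_group(d))   # 29 per group, i.e. 58 in total
```

This univariate approximation yields a smaller sample than the MANOVA power analysis, which also accounts for the number of dependent variables.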

Statistical analysis

We carried out the same analyses as those of Mann et al. (Citation2022). To test the pre-registered hypothesis, we carried out MANOVAs with veracity as the only factor and as dependent variables the number of pro- and anti-arguments and (via 7-point Likert scales) ratings for plausibility, immediacy, clarity and being scripted. Pro-arguments are arguments supporting the views expressed by the interviewee: arguments that could be expected when expressing the attitude and when answering the eliciting-opinion question. Anti-arguments are arguments opposing the views expressed by the interviewee: arguments that could be expected when answering the devil’s advocate question.

A direct test of the Devil’s Advocate approach is to compare the answers to the eliciting-opinion questions (Q2, Q4 and Q5) and the devil’s advocate question (Q3). Thus, in the second MANOVA we analysed the difference in answering the three eliciting opinion questions (by averaging the results to the three individual questions) and the devil’s advocate question. Following Mann et al. (Citation2022), we subtracted the devil’s advocate mean score from the eliciting-opinion mean score (e.g. average plausibility score in the three eliciting opinion questions minus the plausibility score in the devil’s advocate question). The pro-action arguments residue mean was computed as: (eliciting-opinion questions pro-action arguments + devil’s advocate question pro-action arguments) – (eliciting-opinion questions anti-action arguments + devil’s advocate question anti-action arguments). Following Mann et al. (Citation2022), we label these mean scores ‘residue means’.
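The residue-mean computations described above can be sketched as follows. The per-participant values below are illustrative, not taken from the study data:

```python
from statistics import mean

# Hypothetical scores for one participant.
# Q2, Q4, Q5 are the eliciting-opinion questions; Q3 is the devil's advocate question.
plausibility = {"Q2": 6, "Q3": 3, "Q4": 5, "Q5": 5}
pro_args = {"Q2": 4, "Q3": 1, "Q4": 3, "Q5": 2}    # pro-action arguments per question
anti_args = {"Q2": 0, "Q3": 3, "Q4": 1, "Q5": 2}   # anti-action arguments per question

ELICITING = ("Q2", "Q4", "Q5")

# Rating residue mean: average over the eliciting-opinion questions
# minus the devil's advocate score.
plaus_residue = mean(plausibility[q] for q in ELICITING) - plausibility["Q3"]

# Pro-action arguments residue mean, as defined in the text:
# all pro-action arguments (Q2-Q5) minus all anti-action arguments (Q2-Q5).
pro_residue = (sum(pro_args[q] for q in ELICITING) + pro_args["Q3"]) \
            - (sum(anti_args[q] for q in ELICITING) + anti_args["Q3"])

print(round(plaus_residue, 2), pro_residue)  # 2.33 4
```

The Devil's Advocate approach predicts that both residue means are larger for truth tellers than for lie tellers.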

Note that the pre-registered hypothesis and exploratory hypothesis refer to the same data. The difference is that they are analysed in different ways. The pre-registered hypothesis analyses present a complete picture of the results. The exploratory hypothesis is based on residue means. It masks the individual eliciting-opinion and devil’s advocate results but is a more direct test of the Devil’s Advocate approach. In other words, the two different sets of analyses have different strengths, which justifies presenting both.

Procedure

Participants were recruited via an advertisement entitled ‘Actions speak louder than words: opinions about protester actions’, which mentioned that participants in the experiment would be asked to either lie or tell the truth regarding their beliefs about activist behaviour.

The experiment took place online via Zoom. Those who expressed an interest in taking part were sent a Qualtrics link to the Participant Information Sheet (PIS) that outlined the experiment and the consent form. Participants were asked to read the PIS form and to sign their consent if they agreed to take part. After receiving the signed informed consent form and prior to the experiment taking place, the experimenter sent the participant a Qualtrics link to the first questionnaire. It comprised a request for the usual demographic information (gender, age, nationality, first language, etc.) followed by four vignettes (each relating to a different topic) depicting a scenario in which a protagonist had suffered an injustice. The vignettes were designed to stir up empathy/anger in participants such that they may consider it appropriate for action to be taken about the situations described. The order in which the vignettes were presented was alternated to avoid any fatigue effects. See the Appendix for the four vignettes.

The participant was asked to indicate on a 10-point scale how unfair the situation described in the vignette was (from 1 = low to 10 = high) and the extent to which they sympathised with the protagonist in the vignette (from 1 = not at all to 10 = very much so). They were then asked to rank the four vignettes in order of their perceived level of unfairness.

Once the experimenter received responses to this questionnaire, she sent the participant a link to the second questionnaire, which asked 48 questions about what actions participants would consider appropriate for the protagonist to take in the vignette they had ranked as most unjust. The aim of the questionnaire was to establish the ground truth of the actions the participant considered appropriate for someone else to take. The questionnaire was based on the Activism Orientation Scale (Corning & Myers, Citation2002) and was also used by Mann et al. (Citation2022). Unbeknown to the participant, the items in the questionnaire distinguished between six levels of activism, whereby Level 1 indicates that the participant condones only low levels of protest (e.g. displaying slogan stickers) and Level 6 indicates condoning high levels of protest (e.g. using physical violence). See Table 1 for the six levels and an example of each level. As in the Activism Orientation Scale, the participant was asked to rate their agreement with each item on a 4-point Likert scale (completely disagree/somewhat disagree/somewhat agree/completely agree).

Table 1. The six levels of activism and an example of each level.

At the agreed time for the experiment, the experimenter met with the participant and briefly explained that they would shortly be interviewed about the vignette and the actions they considered appropriate for the protagonists to take. She then posted the vignette that the participant had previously selected as the most unjust into the chat feature of Zoom and asked the participant to read it again prior to being interviewed. The experimenter then briefly discussed with the participant their responses to the second questionnaire. She then instructed the participant either to answer completely honestly in their interview (truth tellers, n = 55) or to tell the interviewer that they would only condone a much lower level of action (or no action) than they considered appropriate (lie tellers, n = 55). To increase motivation, the participant was told that, in addition to their £10 for taking part, if they successfully convinced the interviewer that they were being truthful they would be entered into a draw for £50 Amazon vouchers, but if they were not successful they would be asked to write a statement about their real beliefs. In fact, all participants were entered into the draw (and none were asked to write a statement). The participant was then given as much time as they wanted to prepare a rationale for their beliefs in response to possible questions (more relevant for lie tellers, who were giving responses that did not reflect their actual beliefs).

When the participant indicated that they were ready to be interviewed, they were put into a break-out room on Zoom with the interviewer who asked the following five questions:

Q1. What actions do you think it would be appropriate for Jane (and Matt) to take in their situation, to express how unfairly Jane has been treated? (Expressing actions question)

Q2. What has led you to feel that this action [just described in 1] is appropriate? (First eliciting-opinion question)

Q3. Can you think of any arguments for why Jane and Matt should not take those actions against those individuals? (Devil’s advocate question)

Q4. Why might someone decide to act in this situation? (Second eliciting-opinion question)

Q5. Why do you think that stronger actions than those you discussed earlier would be inappropriate? (Third eliciting-opinion question)

The interviewer notified the experimenter when the interview had concluded, after which the experimenter brought the participant back into the main meeting room. Participants were asked to complete a post-interview questionnaire where they rated on a percentage scale the extent to which they had told the truth. They were then debriefed. All participants received £10 (via a bank transfer) and were entered in the £50 Amazon vouchers draw.

Coding

All interviews were recorded. An audio file was retained at the end of the interview for transcribing. All interviews were transcribed and coded for each question. One rater, blind to the veracity status of the participant, coded the number of pro-arguments, number of anti-arguments and, on 7-point Likert scales, plausibility, immediacy, clarity and being scripted.

In the Devil’s Advocate protocol each question is considered an individual entity. Therefore, when repetitions occurred within the same question (e.g. within Q1) the information was not coded again, but if they occurred in another question (e.g. in Q1 and again in Q2) they were coded again. The way Questions 2 and 4 (both eliciting-opinion questions) were phrased implies that participants might repeat their Question 2 answer in Question 4. This did not happen frequently. Only five truth tellers and four lie tellers repeated in Question 4 a pro-argument raised in Question 2, and only three participants (all lie tellers) repeated an anti-argument mentioned in Question 2 in Question 4. Percentage-wise, 2% (SD = 8%) of the pro-arguments and 1% (SD = 7%) of the anti-arguments were repeated in Question 4, with no significant difference in repetitions between truth tellers and lie tellers for pro-arguments, F(1, 108) = 0.66, p = .418, d = 0.12, 95% confidence interval, CI [–0.25, 0.50], or anti-arguments, F(1, 108) = 3.02, p = .085, d = 0.28, 95% CI [–0.10, 0.65]. We decided to keep the repeated arguments in the dataset.

For Question 1 we coded pro-action and anti-action arguments that the participants mentioned. For example, this statement contained three pro-action arguments (in bold) and one anti-action argument (in italics).

So I think it’ll be suitable, let’s say acceptable for them to um organise uh Facebook group or some sort of campaign and advertise it online to get other people who have experienced the same thing and um share their beliefs or their the issues they’ve had just to kind of create a bigger group and express the issues they’ve had to then uh political party or um start a petition, maybe and just speak to someone who is in charge, and is able to actually influence the change they want. Um so more of um what’s the word lenient approach I would say, just because it’s not, I don’t think it’s acceptable for anyone to be, you know, uh violent, uh express their opinions, like, throw eggs at the parliament, something, I wouldn’t say that that’s appropriate.

For Questions 2 to 5 we coded the number of reasons participants came up with in favour of (pro-arguments) or against (anti-arguments) the appropriateness of the actions they described in their answers to Question 1. For example, this statement contained two pro-arguments (in bold) and two anti-arguments (in italics).

If you I believe in protest, I believe in protest in the most ethical remit. I don’t believe in obstructing other people’s day. Because I think actually you can end up inadvertently hurting people who probably agree with your cause, just because the avenue you went down to express yourself. So it also it’s getting your voice heard in the right forums, even if those forums aren’t necessarily gonna bring an immediate benefit, you lend your voice to a cause. And I think that’s the best way to bring about change, rather than drawing a lot of attention to yourself in the wrong format, if that makes sense.

Questions 1 to 5 were also rated on scales of 1 (not at all) to 7 (extremely) for plausibility (whether the response sounds reasonable and genuine and if there is enough of an answer to sound convincing), immediacy (whether the response sounds personal and not distanced), clarity (how clear the response is for the reader to understand what the participant is saying by the end of the answer) and being scripted (a response that contains mental scripts, e.g. two-, three- or more-word phrases that automatically come to mind in speech). The definitions for plausibility, immediacy and clarity were taken from Mann et al. (Citation2022), but being scripted was a new variable not coded by Mann et al. (Citation2022). The following example received a ‘6’ for plausibility, immediacy and clarity and a ‘2’ for being scripted:

uh so I think, participating in political meetings and sharing online posts, those are okay, politically, cause you’re not being a threat to anyone in society, you’re not endangering anyone, you’re not getting physically violent, or uh obstructing other people in their day to day lives.

A ‘6’ for plausibility was given because the answer was thoughtfully and carefully worded. Immediacy received a ‘6’ because it was to the point and personal, insofar as the participant gives an opinion that starts with ‘I think’ and lists what they think is acceptable or not acceptable. Clarity received a ‘6’ because the participant put effort into explaining their views, and the answer was well understood by the coders. Being scripted received a ‘2’ because the answer included phrases such as ‘physically violent’, ‘online posts’ and ‘day to day lives’.

The following answer received a ‘2’ for plausibility, immediacy and clarity and a ‘1’ for being scripted, because the person struggled to come up with a meaningful answer.

well I guess it’s a personal thing, isn’t it? So for the, for the for the person, the person has been working hard to get to that place. Um and now they feel that they’ve been they’ve been sort of, yeah, let down by the university, or, you know but yeah but yeah.

A second rater, also blind to the veracity status of the participants, coded the interviews of a random sample of 54 participants (49%). Inter-rater reliability between the two coders was measured using the two-way random effects model measuring consistency. It was good to very good for pro-action arguments (average measures intraclass correlation coefficient, ICC = .98), anti-action arguments (average measures ICC = .94), plausibility (average measures ICC = .75), immediacy (average measures ICC = .86) and clarity (average measures ICC = .81) and satisfactory for being scripted (average measures ICC = .61).
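The average-measures consistency ICC used here can be computed from the mean squares of a two-way ANOVA decomposition as ICC(C,k) = (MSR − MSE) / MSR. A minimal sketch with toy ratings (hypothetical values, not the study data):

```python
def icc_consistency_k(ratings):
    """Average-measures consistency ICC, ICC(C,k) = (MSR - MSE) / MSR,
    for a list of [subject x rater] ratings (no missing values)."""
    n = len(ratings)         # subjects (here: interviews)
    k = len(ratings[0])      # raters
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between raters
    ss_err = ss_total - ss_rows - ss_cols                   # residual
    msr = ss_rows / (n - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / msr

# Raters who differ only by a constant offset are perfectly consistent:
print(icc_consistency_k([[1, 2], [2, 3], [3, 4]]))           # 1.0
# Disagreement on the ordering of subjects lowers the ICC:
print(round(icc_consistency_k([[1, 2], [2, 1], [3, 4]]), 2)) # 0.75
```

Because the consistency definition ignores the between-raters variance (SS_cols), a rater who is systematically harsher than the other does not reduce the coefficient.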

Table 2. Level of action allocation as a function of veracity.

For coding the self-reported strategies, a bottom-up form of coding was carried out. One rater grouped similar comments together and categorised them accordingly. This resulted in the 10 categories listed in Table 3. A second rater then allocated each comment to one or more of the 10 categories. The inter-rater reliability between the two coders was very good, kappa = .82. Discrepancies between the two coders were resolved by a third coder.

Table 3. The self-reported strategies used by lie tellers and truth tellers.

Results

Manipulation check: levels of unfairness, sympathy and appropriate actions

Two analyses of variance (ANOVAs) were carried out with veracity as the only factor and the perceived levels of unfairness and sympathy related to the vignette discussed in the interview as dependent variables. Neither the effect for unfairness, F(1, 108) = 0.00, p = 1.00, d = 0.00, 95% CI [–0.37, 0.37], nor the effect for sympathy, F(1, 108) = 0.73, p = .396, d = 0.17, 95% CI [–0.21, 0.54], was significant. The total means showed that participants found the situation described in the vignette very unfair (M = 8.96, SD = 1.26) and felt a lot of empathy for the protagonists described in the vignette (M = 8.91, SD = 1.57). The topics truth tellers chose to discuss in the interview were cannabis (n = 14, 25.5%), university fees (n = 12, 21.8%), cancer/Covid (n = 12, 21.8%) and the cost of living crisis (n = 17, 30.9%). The topics lie tellers chose were cannabis (n = 11, 20.0%), university fees (n = 11, 20.0%), cancer/Covid (n = 18, 32.7%) and the cost of living crisis (n = 15, 27.3%). There was no association between veracity and the chosen vignette, χ2(3, N = 110) = 1.73, p = .631, θ = .13.

A chi-square test of independence was performed to examine the relationship between veracity and the highest level of action considered appropriate by the participants. There was no association between these two variables, χ2(4, N = 110) = 1.26, p = .868, θ = .11. See Table 2 for the chosen levels, which were generally low.
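The chi-square tests of independence reported above can be reproduced from the cell counts given in the text. Below is our own illustrative pure-Python sketch (not the authors' analysis code), applied to the veracity × vignette-topic counts:

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Topic counts from the text (cannabis, university fees, cancer/Covid,
# cost of living crisis): truth tellers then lie tellers.
chi2, dof = chi_square_independence([[14, 12, 12, 17],
                                     [11, 11, 18, 15]])
print(round(chi2, 2), dof)  # → 1.73 3
```

The statistic and degrees of freedom match the reported χ2(3, N = 110) = 1.73; the p-value of .631 then follows from the chi-square distribution with 3 degrees of freedom (e.g. via scipy.stats.chi2.sf(1.73, 3) if SciPy is available).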

Manipulation check: reported truth telling

An ANOVA with veracity (truth vs. lie) as the only factor and the percentage of reported truth telling as the dependent variable revealed a significant effect, F(1, 108) = 336.44, p < .001, d = 3.49, 95% CI [2.85, 4.03]. Truth tellers (M = 97.33%, SD = 5.52, 95% CI [92.16, 102.49]) reported having told the truth to a greater extent than lie tellers (M = 29.73%, SD = 26.80, 95% CI [24.56, 34.89]).
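The reported effect size can be recovered from the group means and standard deviations. A small sketch of our own (not the authors' code), using the pooled-SD form of Cohen's d, which is exact for equal group sizes (n = 55 per group here):

```python
import math

def cohens_d(mean1, sd1, mean2, sd2):
    """Cohen's d with a pooled SD; this simple form assumes equal group sizes."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / pooled_sd

# Reported values: truth tellers M = 97.33, SD = 5.52;
# lie tellers M = 29.73, SD = 26.80.
print(round(cohens_d(97.33, 5.52, 29.73, 26.80), 2))  # → 3.49
```

This reproduces the reported d = 3.49 exactly, confirming that the manipulation-check effect size is pooled-SD based.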

Reported strategies

More lie tellers (n = 25, 45.5%) than truth tellers (n = 13, 23.6%) reported having used a strategy in the interview to appear convincing, χ2(2, N = 110) = 5.79, p = .016, θ = .34. The reported strategies are depicted in Table 3. The most frequently reported strategy amongst lie tellers (n = 8) was improvisation, followed by a speech-related strategy that was not speech-content related (repeating questions, pretending lack of knowledge, using a conversational style), followed by paying attention to nonverbal behaviour. The most frequently reported strategy amongst truth tellers was telling the truth.

Pre-registered hypothesis testing

We tested our data using null hypothesis significance testing (NHST) and equivalence testing, the latter to determine whether any null findings reflected a genuine absence of differences between truth tellers and lie tellers (see Lakens et al., 2018). We chose a smallest effect size of interest of 0.5 because our research is applied and we were interested in observing a moderate to large effect size. Thus, the equivalence bounds ranged between −0.5 and 0.5.
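The equivalence-testing logic can be illustrated with the two one-sided tests (TOST) procedure described by Lakens et al. (2018). The sketch below is a simplified large-sample (normal-approximation) version for two independent groups, not the authors' analysis code; the ±0.5 bounds are in Cohen's d units and are converted to raw units via the pooled SD:

```python
import math
from statistics import NormalDist

def tost_two_groups(mean1, sd1, n1, mean2, sd2, n2, d_bound=0.5, alpha=0.05):
    """Two one-sided tests for equivalence (normal approximation).

    Declares equivalence when the observed difference is significantly
    above the lower bound AND significantly below the upper bound.
    """
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    delta = d_bound * pooled_sd                  # equivalence bound in raw units
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = mean1 - mean2
    z = NormalDist()
    p_lower = 1 - z.cdf((diff + delta) / se)     # H0: diff <= -delta
    p_upper = z.cdf((diff - delta) / se)         # H0: diff >= +delta
    return max(p_lower, p_upper) < alpha         # True -> statistically equivalent

# Hypothetical groups with identical means and n = 55 per group, as in
# the experiment: the difference falls significantly inside the bounds.
print(tost_two_groups(5.0, 1.0, 55, 5.0, 1.0, 55))  # → True
```

The published analysis would have used t-distributed versions of these tests; with 108 degrees of freedom the normal approximation is close, but the sketch is meant only to convey the procedure.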

A MANOVA analysing the expressing actions question (Q1) with veracity as the only factor and the six variables listed in Table 4 as dependent variables revealed a significant multivariate veracity effect, F(6, 103) = 22.52, p < .001, ηp2 = .57. Table 4 shows that our hypothesis was supported for all the dependent variables except being scripted, for which the null hypothesis was supported. Truth tellers reported more pro- and fewer anti-arguments than lie tellers about the actions that would be appropriate to take. Compared to lie tellers’ statements, truth tellers’ statements were rated as more plausible, immediate and clear.

Table 4. Statistical results for the expressing actions question, eliciting-opinion questions and devil’s advocate question.

Table 4 further shows the results for the answers to the averaged eliciting-opinion questions and the devil’s advocate question separately. The MANOVA for the averaged eliciting-opinion questions showed a significant multivariate effect, F(6, 103) = 19.47, p < .001, ηp2 = .53. All effects except those for number of anti-arguments (for which the results were inconclusive) and being scripted (for which the null hypothesis was supported) were statistically different and not equivalent. In alignment with the Devil’s Advocate approach rationale and supporting our hypothesis, truth tellers reported more pro-arguments than lie tellers, and truth tellers’ statements appeared more plausible, immediate and clear than lie tellers’ statements.

The MANOVA for the devil’s advocate question also showed a significant multivariate effect, F(6, 103) = 11.97, p < .001, ηp2 = .41. All univariate effects were statistically different and not equivalent, except the effect for number of anti-arguments, for which the null hypothesis was supported. Table 4 shows that truth tellers reported more pro-arguments than lie tellers, and that truth tellers’ statements appeared more plausible, immediate, clear and scripted. These findings contradict the Devil’s Advocate approach rationale.

Since the devil’s advocate results were unexpected, we carried out further analyses. We examined whether the unexpected results were caused by participants avoiding answering the question by including phrases such as ‘no’ or ‘not really’ in their replies. More truth tellers (n = 18) than lie tellers (n = 8) did this, χ2(2, N = 110) = 5.04, p = .025. In addition, compared to the other answers, these ‘I cannot’ answers were rated as more plausible (M = 4.19, SD = 0.89, 95% CI [3.78, 4.61] vs. M = 3.68, SD = 1.11, 95% CI [3.45, 3.91]), F(1, 108) = 4.63, p = .034, d = 0.51, 95% CI [0.12, 0.88], clearer (M = 4.15, SD = 1.01, 95% CI [3.73, 4.58] vs. M = 3.58, SD = 1.12, 95% CI [3.35, 3.82]), F(1, 108) = 5.38, p = .022, d = 0.53, 95% CI [0.15, 0.91], and more scripted (M = 2.08, SD = 0.98, 95% CI [1.76, 2.40] vs. M = 1.69, SD = 0.78, 95% CI [1.51, 1.87]), F(1, 108) = 4.34, p = .040, d = 0.44, 95% CI [0.06, 0.81].

Table 5 shows the statistical results for each of the three eliciting-opinion questions. The findings for each question were similar and showed the same patterns as those that emerged in the averaged eliciting-opinion question results.

Table 5. Statistical results for the three individual eliciting-opinion questions.

Exploratory hypothesis testing

The MANOVA for the residue means (the means for the three eliciting-opinion questions, averaged, minus the means for the devil’s advocate question) showed a significant multivariate veracity effect, F(5, 104) = 6.69, p < .001, ηp2 = .24. At a univariate level, only the effect for number of pro-arguments supported our hypothesis (see Table 6): the residue mean was significantly higher for truth tellers than for lie tellers. For all other variables there was statistical equivalence, meaning that the null hypothesis was supported. The exploratory hypothesis was thus supported for number of pro-arguments only.

Table 6. Statistical results for the residue scores.

Following Mann et al. (2022), we further examined whether the residue means differed from zero for truth tellers and lie tellers. One-sample t-tests showed that for truth tellers none of the residue scores differed from zero in the NHST analysis (all ts < 1.19, all ps > .243, all ds < 0.16). In the equivalence testing analysis, all residue scores were statistically equivalent (3.84 ≤ ts ≤ 4.73, all ps < .001, −0.12 < ds < −0.02) except for pro-arguments, t = 0.30, p = .385, d = −0.16, 95% CI [–0.39, 0.06]. This confirms that none of the residue scores differed from zero, although the result for pro-arguments was inconclusive.

For lie tellers, the residue scores for pro-arguments, t(54) = 7.58, p < .001, d = 1.02, 95% CI [1.02, 1.35], and being scripted, t(54) = 2.26, p = .028, d = 0.31, 95% CI [0.03, 0.57], differed from zero in the NHST analysis (all other ts < 1.04, all ps > .305, all ds < 0.14). These results were supported in the equivalence testing analysis for pro-arguments, t = −6.40, p = 1.00, d = −1.02, 95% CI [−1.33, −0.77], for plausibility, immediacy and clarity (5.11 ≤ ts ≤ 7.55, all ps < .001, −0.06 < ds < 0.14), and for being scripted, t = −2.92, p = .003, d = 0.31, 95% CI [0.08, 0.55].

Discussion

We predicted in the pre-registered hypothesis that truth tellers’ responses to the expressing opinion question (Q1) and eliciting-opinion questions (Q2, Q4, Q5) would be more plausible, immediate and clear than lie tellers’ responses. We found strong support for this. It replicates Leal et al. (2010), Leal, Vrij, Deeb, and Dabrowna (2023) and Mann et al. (2022), as well as the findings of an experiment outside a Devil’s Advocate context in which participants lied about opinions (Vrij, Deeb, et al., 2022). It strengthens the conclusion that lying about opinions can be detected by examining these three variables.

We further predicted in the pre-registered hypothesis that truth tellers would provide more arguments defending their opinion than lie tellers when answering Q1, Q2, Q4 and Q5. We found strong support for this hypothesis for the number of pro-arguments mentioned: truth tellers reported more pro-arguments than lie tellers. This contradicts Mann et al. (2022), who found that lie tellers reported more pro-arguments than truth tellers, whereas Leal, Vrij, Deeb, and Dabrowna (2023) and Vrij, Deeb, et al. (2022) found no significant veracity effect for the number of pro-arguments. In other words, the four available experiments to date showed all three possible effects: compared to lie tellers, truth tellers reported (a) more, (b) fewer or (c) the same number of pro-arguments. Although more research into this variable is required to determine whether it is a diagnostic veracity indicator when people lie about their opinions, its potential to become one looks bleak.

There was weak support for the pre-registered hypothesis that lie tellers would report more anti-arguments than truth tellers. Limited support was also found by Vrij, Deeb, et al. (2022), but Leal, Vrij, Deeb, and Dabrowna (2023) and Mann et al. (2022) found no veracity effect for this variable. Again, more research is required to determine the diagnostic value of this variable as a veracity indicator when people lie about their opinions.

The results contradicted the predictions for the devil’s advocate question (Q3) because truth tellers sounded more (rather than less) plausible, immediate and clear than lie tellers when answering it. A consequence of this unexpected effect is that the exploratory hypothesis largely failed to show the expected results, unlike in Leal et al. (2010), Leal, Vrij, Deeb, and Dabrowna (2023) and Mann et al. (2022), where the devil’s advocate question yielded the expected results. The exploratory hypothesis predicted higher residue means for truth tellers than for lie tellers. Only the effect for number of pro-arguments supported it: the residue mean was significantly higher for truth tellers than for lie tellers. No significant effects emerged for the remaining variables.

Being scripted, a new variable in the Devil’s Advocate approach, did not emerge as a diagnostic veracity indicator in the experiment. The ICC for this variable was lower than those for the other variables, indicating that coders found it relatively difficult to code. In other words, our operationalisation of being scripted was unsuccessful.

Strategies that truth tellers and lie tellers reported having used to appear convincing in the interview were measured for the first time in a lying-about-opinions setting. Only a minority of lie tellers (45.5%) and of truth tellers (23.6%) reported having used a strategy. Although our dependent variables were all related to speech content, many strategies that lie tellers reported were hardly, or not at all, related to speech content. The most frequently mentioned strategy amongst lie tellers – improvisation – is speech-content related but cannot be considered a strong strategy. This was followed by using a speech-related strategy that is not speech-content related (repeating a question, pretending not to be knowledgeable or using a conversational style), followed by paying attention to nonverbal behaviour. Lie tellers reported several speech-content related strategies, such as (a) basing their answers around the truth, (b) building their argument around an example, (c) providing the opposite of their own view, (d) being consistent, (e) coming up with a solution to the protagonists’ problem and (f) adopting another person’s perspective. However, these strategies were only occasionally mentioned (the six strategies combined were mentioned 16 times). The scarcity of speech-content-related strategies amongst lie tellers could perhaps explain why we obtained such large veracity effects in the current experiment.

The most frequently reported strategies for appearing convincing when people tell the truth or lie about their alleged activities are ‘telling it all’ (truth tellers), ‘keeping the story simple’ (lie tellers) and ‘controlling nonverbal demeanour’ (lie tellers; Hartwig et al., 2007; Leal, Vrij, Deeb, & Fisher, 2023). Of these three strategies, only controlling nonverbal demeanour emerged in the present experiment. This suggests that interviewees use different strategies when attempting to be convincing in opinion interviews than in interviews about alleged activities, which makes it worthwhile to further examine the strategies used in opinion interviews.

The dependent variables plausibility, immediacy and clarity can be considered quality details, as they reflect the quality of a statement. In contrast, the number of arguments can be considered a quantity detail, because it tells us nothing about the quality of the arguments. The present experiment showed more support for the pre-registered hypothesis regarding the quality details than regarding the quantity details, and the same pattern emerged in Leal, Vrij, Deeb, and Dabrowna (2023), Mann et al. (2022) and Vrij, Deeb, et al. (2022). When discussing alleged past activities, the variable total details (a quantity measure) typically emerges as a strong veracity indicator: truth tellers report more details than lie tellers (Amado et al., 2016; Gancedo et al., 2021). It thus seems that quantity measures are stronger veracity indicators when discussing past activities than when discussing opinions. We can think of two reasons why this is the case. First, lie tellers are inclined to avoid presenting incriminating evidence. Details about past activities are probably easier for investigators to check than details about expressed opinions, so lie tellers are more reluctant to provide them. Second, it is probably easier to make up details about opinions than about past activities. Expressing opinions is context free, so if someone has heard or read an opinion, they can easily claim it as their own. Past activities are embedded in time and location (they occurred at a specific moment at a specific location), and such contextual embeddings cannot easily be invented when making up a story. Lie tellers are therefore motivated to leave contextual embeddings out of their fabricated statements about past activities (Nisin et al., 2022), which contributes to lie tellers reporting fewer details than truth tellers.

Plausibility emerged as the strongest veracity indicator in the present lying-about-opinions experiment, replicating the findings of several other lying-about-opinions experiments (Leal, Vrij, Deeb, & Dabrowna, 2023; Mann et al., 2022; Vrij, Deeb, et al., 2022). Researchers are reluctant to examine plausibility due to its subjective nature: ‘How to define plausibility?’ (Vrij, Deeb, et al., 2021). However, we found good reliability between our coders for plausibility (average measures ICC = .75), which suggests that it can be measured reliably. In addition, when researchers do examine plausibility, it emerges as one of the strongest verbal veracity indicators, not only when people lie about opinions but also when they lie about their alleged activities (Gancedo et al., 2021; Sporer et al., 2021). We think that plausibility deserves more attention from verbal deception researchers than it currently receives.

Although it is always unfortunate if a variable does not emerge as a diagnostic veracity indicator, it is particularly unfortunate for the being scripted variable, because it is a cue to deceit (lie tellers report the cue more than truth tellers). All the other variables we measured were cues to truthfulness (truth tellers report the cue more than lie tellers). Verbal deception research overwhelmingly focuses on cues to truthfulness. All 19 criteria that belong to the Criteria-Based Content Analysis verbal veracity assessment tool are cues to truthfulness (Amado et al., 2016), and seven out of eight cues that belong to the Reality Monitoring (RM) tool are cues to truthfulness (Oberlader et al., 2016). The exception is cognitive operations, which was the only RM criterion that did not emerge as a significant veracity indicator in a recent meta-analysis (Gancedo et al., 2021). Besides this, there are difficulties in coding the cognitive operations variable (Vrij, 2008), as we also saw with the being scripted variable. In addition, verifiable details – a key variable in the Verifiability approach (Nahari, 2019) – is a cue to truthfulness, and so are type-token ratio – a key variable in the Reality Interview (Colwell et al., 2013) – and complications – a key variable in Cognitive Credibility Assessment (Vrij, Mann, et al., 2021).

Some cues to deceit have emerged in verbal deception research. Meta-analyses showed that common knowledge details and self-handicapping strategies (Vrij, Palena, et al., 2021) but, particularly, statement–evidence inconsistency and within-statement inconsistency – two key variables in the Strategic Use of Evidence (SUE) tool (Hartwig & Granhag, 2022; Hartwig et al., 2014) – are diagnostic cues to deceit. The problem with the two SUE variables is that the SUE tool can only be used when an investigator possesses independent evidence (witnesses, CCTV footage, fingerprints, etc.), which in many situations is not the case.

There are at least two reasons why it is important that researchers continue their search for verbal cues to deceit (Vrij, Fisher, et al., 2023). First, as already mentioned in the introduction, investigators could make their veracity judgements with much more confidence if they examined a mixture of verbal cues to truthfulness and deceit rather than cues to truthfulness alone. Second, it is more natural for people to focus on the presence of a signal (i.e. a cue to deceit) than on the absence of a signal (i.e. a cue to truthfulness). Since focusing on cues to deceit is more natural, verbal lie detection may become more popular amongst practitioners if we can advise them what these verbal cues to deceit are. Future research could focus on different cues to deceit, such as the predictability of answers. Someone who has not thought much about arguments that go against their opinion may come up with arguments that sound obvious or predictable. We thus predict that lie tellers will give more predictable answers than truth tellers.

We made two changes to the Devil’s Advocate protocol compared to Leal et al. (2010) and Mann et al. (2022). First, we asked three eliciting-opinion questions rather than just one. The three questions produced very similar results, which increases confidence that the findings are reliable. Nonetheless, the responses to the three questions showed little overlap, which means that we gathered more information about why interviewees (allegedly) held their expressed opinion than we would have obtained with a single question. We therefore recommend continuing to use the three eliciting-opinion questions protocol. Second, we simplified the devil’s advocate question. The results contradicted the predictions because truth tellers sounded more (rather than less) plausible, immediate and clear than lie tellers when answering this question. One explanation is that truth tellers did not find it difficult to come up with a devil’s advocate answer. This cannot be caused by the topic – protester actions – because in Mann et al. (2022) participants also discussed protester actions. The most logical conclusion is that the change in wording somehow caused the absence of the effect. Mann et al. (2022, Appendix) asked: ‘You must have considered your opinion. If you were to look at it from the point of view of an opposer to the actions taken by Matt and Jane, is there anything you can say against their actions? Can you think of any arguments for why Jane and Matt should not take those actions against those individuals?’ In the current experiment we simplified this to: ‘Can you think of any arguments for why Jane and Matt should not take those actions against those individuals?’ The main difference between the original question and the rephrased question is that in the current experiment we did not ask interviewees to ‘look at it from the point of view of an opposer to the actions’.
Perhaps this particular phrase is necessary to make people really think about counter-arguments. Indeed, 18 truth tellers and eight lie tellers avoided answering the question by including ‘no’ or ‘not really’ in their responses, and these answers were rated as more plausible and clearer than the other responses to the devil’s advocate question. The absence of the predicted effect means that the search for a good alternative to the original devil’s advocate question continues.

In the original Devil’s Advocate protocol (Leal et al., 2010), participants simply stated their opinion in the first question, whereas in Mann et al. (2022) and the present experiment participants were asked to elaborate and discuss why they held that opinion. As in Mann et al. (2022), this question resulted in veracity differences very similar to those resulting from the eliciting-opinion questions. Apparently, lie tellers struggle to sound like truth tellers not only when they express their reasons for holding their false opinion (the eliciting-opinion questions) but also when expressing the false opinion itself. We therefore recommend keeping this ‘expressing the opinion’ question at the beginning of the interview because it gives investigators further information about the veracity status of the interviewee.

Three limitations merit discussion. First, the level of activism participants endorsed was rather low. We attempted to raise these levels by presenting participants with vignettes in which a protagonist had suffered a clear injustice. We hoped that these vignettes would stir up empathy and anger in participants so that they would consider it appropriate to take strong actions. The situations the protagonists were in were indeed seen as very unfair and elicited very strong feelings of empathy, yet most participants were not keen on strong actions. If we assume that our participants represent the general public, this suggests that willingness for protester actions amongst the population is generally low. We think that the Devil’s Advocate interview tool works better when participants have more extreme views. The more extreme their views, the less they will consider anti-arguments to be appropriate, and so the less they will have thought about them. The more they reject the anti-arguments, the more they will struggle to express them eloquently when answering the devil’s advocate question.

Second, similar to Mann et al. (2022), we only audio recorded the interviews. Future research could videotape the interviews so that participants’ nonverbal behaviour can be analysed. Leal et al. (2010) did video-record their participants and found differences in emotional involvement between truth tellers and lie tellers. Truth tellers’ opinion-eliciting answers revealed more emotional involvement than their devil’s advocate answers, whereas no clear differences emerged between lie tellers’ answers to the two types of question. Future research could try to replicate this finding and could examine possible nonverbal cues to deceit (e.g. being ambivalent or giving the impression of trying to convince someone). The nonverbal behaviours we have in mind are holistic impressions based on multiple cues rather than the single-cue observations (e.g. gaze aversion, fidgeting, micro-expressions of emotions) popular in nonverbal lie detection research (Vrij et al., 2019). Meta-analyses have shown that holistic impressions are better veracity indicators than single-cue analyses (DePaulo & Morris, 2004; Hartwig & Bond, 2014).

Third, similar to Mann et al. (2022), for ethical and practical reasons participants were asked to discuss what actions they considered appropriate for others to take rather than actions they might take themselves in the same position. From an ethical standpoint, this avoided participants reporting a willingness to take strong or illegal actions themselves. From a practical standpoint, since we expected that most participants would not be activists themselves, we considered that they might be more agreeable to other people taking actions that carry negative consequences than to taking such risks themselves. For example, someone who holds the opinion that Putin should be stopped with lethal force would probably be more likely to condone someone else taking such action than to take it themselves. We nonetheless expect that participants’ opinions about what action is acceptable for someone else to take correlate positively with their opinions about what action is acceptable to take themselves, though this remains an empirical question.

Ethical standards

Declaration of conflicts of interest

Samantha Mann has declared no conflicts of interest.

Aldert Vrij has declared no conflicts of interest.

Haneen Deeb has declared no conflicts of interest.

Sharon Leal has declared no conflicts of interest.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional (University of Portsmouth Faculty of Science and Health Ethics Committee SHFEC 2021-014 A) and the funding (Centre for Research and Evidence on Security Threats) research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Availability of data and material

The data that support the findings of this study are available from the corresponding author upon request.

Additional information

Funding

Centre for Research and Evidence on Security Threats [Economic and Social Research Council, ESRC Award: ES/N009614/1].

References

  • Ajzen, I. (2001). Nature and operation of attitudes. Annual Review of Psychology, 52, 27–58. https://doi.org/10.1146/annurev.psych.52.1.27
  • Alison, L. J., Alison, E., Noone, G., Elntib, S., & Christiansen, P. (2013). Why tough tactics fail and rapport gets results: Observing Rapport-Based Interpersonal Techniques (ORBIT) to generate useful information from terrorists. Psychology, Public Policy, and Law, 19(4), 411–431. https://doi.org/10.1037/a0034564
  • Amado, B. G., Arce, R., Fariña, F., & Vilarino, M. (2016). Criteria-Based Content Analysis (CBCA) reality criteria in adults: A meta-analytic review. International Journal of Clinical and Health Psychology, 16(2), 201–210. https://doi.org/10.1016/j.ijchp.2016.01.002
  • Bogaard, G., Meijer, E. H., Vrij, A., & Nahari, G. (2022). Deception detecting using comparable truth baselines. Psychology, Crime, & Law, 29, 567–583. https://doi.org/10.1080/1068316X.2022.2030334
  • Bogaard, G., Nußbaum, M., Schlaudt, L. S., Meijer, E., Nahari, G., & Vrij, A. (2022). A comparable truth baseline improves truth/lie discrimination. Applied Cognitive Psychology, 36(5), 1060–1071. https://doi.org/10.1002/acp.3990
  • Clemens, F., Granhag, P. A., & Strömwall, L. A. (2013). Counter-interrogation strategies when anticipating questions on intentions. Journal of Investigative Psychology and Offender Profiling, 10(1), 125–138. https://doi.org/10.1002/jip.1387
  • Colwell, K., Hiscock-Anisman, C. K., & Fede, J. (2013). Assessment Criteria Indicative of Deception: An example of the new paradigm of differential recall enhancement. In B. S. Cooper, D. Griesel, & M. Ternes (Eds.) Applied issues in investigative interviewing, eyewitness memory, and credibility assessment (pp. 259–292). Springer. https://doi.org/10.1007/978-1-4614-5547-9_11
  • Colwell, K., Hiscock-Anisman, C., Memon, A., Woods, D., & Michlik, P. M. (2006). Strategies of impression management among deceivers and truth tellers: How liars attempt to convince. American Journal of Forensic Psychology, 24, 31–38. https://digitalcommons.daemen.edu/faculty_scholar/366.
  • Corning, A. F., & Myers, D. J. (2002). Individual orientation toward engagement in social action. Political Psychology, 23(4), 703–729.
  • Darley, J. M., & Gross, P. H. (1983). A hypothesis-confirming bias in labelling effects. Journal of Personality and Social Psychology, 44(1), 20–33. https://doi.org/10.1037/0022-3514.44.1.20
  • Deeb, H., Vrij, A., Hope, L., Mann, S., Granhag, P. A., & Lancaster, G. (2017). Suspects’ consistency in statements concerning two events when different request formats are used. Journal of Investigative Psychology and Offender Profiling, 14(1), 74–87. https://doi.org/10.1002/jip.1464
  • Deeb, H., Vrij, A., Hope, L., Mann, S., Leal, S., Granhag, P. A., & Strömwall, L. A. (2018). The Devil’s Advocate approach: An interview technique for assessing consistency among deceptive and truth-telling pairs of suspects. Legal and Criminological Psychology, 23(1), 37–52. https://doi.org/10.1111/lcrp.12114
  • Deeb, H., Vrij, A., Leal, S., Fallon, M., Mann, S., Luther, K., & Granhag, P. A. (2022). Mapping details to elicit information and cues to deceit: The effects of map richness. The European Journal of Psychology Applied to Legal Context, 14(1), 11–19. https://doi.org/10.5093/ejpalc2022a2
  • DePaulo, B. M., Lindsay, J. L., Malone, B. E., Muhlenbruck, L., Charlton, K., & Cooper, H. (2003). Cues to deception. Psychological Bulletin, 129(1), 74–118. https://doi.org/10.1037/0033-2909.129.1.74
  • DePaulo, B. M., & Morris, W. L. (2004). Discerning lies from truths: Behavioural cues to deception and the indirect pathway of intuition. In P. A. Granhag & L. A. Strömwall (Eds.), Deception detection in forensic contexts (pp. 15–40). Cambridge University Press.
  • Dixon, H., Ward, V., & Wilford, G. (2019). London Bridge attacker was poster boy for rehab scheme he targeted. The Daily Telegraph. Retrieved December 1, 2019, from https://www.telegraph.co.uk/news/2019/12/01/london-bridge-attacker-poster-boy-rehab-scheme-targeted/
  • Fisher, R. P., Vrij, A., & Leins, D. A. (2013). Does testimonial inconsistency indicate memory inaccuracy and deception? Beliefs, empirical research and theory. In B. S. Cooper, D. Griesel, & M. Ternes (Eds.) Applied issues in investigative interviewing, eyewitness memory, and credibility assessment (pp. 173–190). Springer.
  • Gancedo, Y., Fariña, F., Seijo, D., Vilariño, M., & Arce, R. (2021). Reality monitoring: A meta-analytical review for forensic practice. The European Journal of Psychology Applied to Legal Context, 13(2), 99–110. https://doi.org/10.5093/ejpalc2021a10
  • Granhag, P. A., & Hartwig, M. (2008). A new theoretical perspective on deception detection: On the psychology of instrumental mind-reading. Psychology, Crime & Law, 14(3), 189–200. https://doi.org/10.1080/10683160701645181
  • Hartwig, M., & Bond, C. F. (2014). Lie detection from multiple cues: A meta-analysis. Applied Cognitive Psychology, 28(5), 661–676. https://doi.org/10.1002/acp.3052
  • Hartwig, M., & Granhag, P. A. (2022). Strategic use of evidence (SUE): A review of the technique and its principles. In G. Oxburgh, T. Myklebust, M. Fallon, & M. Hartwig (Eds.), Interviewing and interrogation: A review of research and practice since World War II. Torkel Opsahl Academic EPublisher.
  • Hartwig, M., Granhag, P. A., & Luke, T. (2014). Strategic use of evidence during investigative interviews: The state of the science. In D. C. Raskin, C. R. Honts, & J. C. Kircher (Eds.), Credibility assessment: Scientific research and applications (pp. 1–36). Academic Press.
  • Hartwig, M., Granhag, P. A., & Strömwall, L. (2007). Guilty and innocent suspects’ strategies during police interrogations. Psychology, Crime & Law, 13(2), 213–227. https://doi.org/10.1080/10683160600750264
  • Hartwig, M., Granhag, P. A., Strömwall, L., & Doering, N. (2010). Impression and information management: On the strategic self-regulation of innocent and guilty suspects. The Open Criminology Journal, 3(1), 10–16. https://doi.org/10.2174/1874917801003020010
  • Henson, R. K. (2001). Understanding internal consistency reliability estimates: A conceptual primer on coefficient Alpha. Measurement and Evaluation in Counseling and Development, 34(3), 177–189. https://doi.org/10.1080/07481756.2002.12069034
  • Jones, M., & Sugden, R. (2001). Positive confirmation bias in the acquisition of information. Theory and Decision, 50(1), 59–99. https://doi.org/10.1023/A:1005296023424
  • Kassin, S. M. (2014). False confessions: Causes, consequences, and implications for reform. Policy Insights from the Behavioral and Brain Sciences, 1(1), 112–121. https://doi.org/10.1177/2372732214548678
  • Kassin, S. M., Appleby, S. C., & Torkildson-Perillo, J. (2010). Interviewing suspects: Practice, science, and future directions. Legal and Criminological Psychology, 15(1), 39–55. https://doi.org/10.1348/135532509X44936
  • Lakens, D., Scheel, A. M., & Isager, P. M. (2018). Equivalence testing for psychological research: A tutorial. Advances in Methods and Practices in Psychological Science, 1(2), 259–269. https://doi.org/10.1177/2515245918770963
  • Leal, S., Vrij, A., Deeb, H., Dabrowna, O., & Fisher, R. P. (2023). Combining the Devil’s Advocate Approach and Verifiability Approach to assess veracity in opinion statements. The European Journal of Psychology Applied to Legal Context, 15(2), 53–61.
  • Leal, S., Vrij, A., Deeb, H., & Fisher, R. P. (2023). Interviewing to detect omission lies. Applied Cognitive Psychology, 37(1), 26–41. https://doi.org/10.1002/acp.4020
  • Leal, S., Vrij, A., Mann, S., & Fisher, R. (2010). Detecting true and false opinions: The Devil’s Advocate approach as a lie detection aid. Acta Psychologica, 134(3), 323–329. https://doi.org/10.1016/j.actpsy.2010.03.005
  • Leins, D., Fisher, R. P., & Ross, S. J. (2013). Exploring liars’ strategies for creating deceptive reports. Legal and Criminological Psychology, 18(1), 141–151. https://doi.org/10.1111/j.2044-8333.2011.02041.x
  • Mann, S., Vrij, A., Deeb, H., & Leal, S. (2022). Actions speak louder than words: The Devil’s Advocate questioning protocol in opinions about protester actions. Applied Cognitive Psychology, 36(4), 905–918. https://doi.org/10.1002/acp.3979
  • Nahari, G. (2019). Verifiability approach: Applications in different judgmental settings. In T. Docan-Morgan (Ed.), The Palgrave handbook of deceptive communication (pp. 213–225). Palgrave Macmillan.
  • Nahari, G., Ashkenazi, T., Fisher, R. P., Granhag, P. A., Hershkowitz, I., Masip, J., Meijer, E. H., Nisin, Z., Sarid, N., Taylor, P. J., Verschuere, B., & Vrij, A. (2019). Language of lies: Urgent issues and prospects in verbal lie detection research. Legal and Criminological Psychology, 24(1), 1–23. https://doi.org/10.1111/lcrp.12148
  • Nisin, Z., Nahari, G., & Goldsmith, M. (2022). Lies divorced from context: Evidence for Context Embedded Perception (CEP) as a feasible measure for deception detection. Psychology, Crime & Law, 2022, 1–17. https://doi.org/10.1080/1068316X.2022.2078825
  • Oberlader, V. A., Naefgen, C., Koppehele-Gossel, J., Quinten, L., Banse, R., & Schmidt, A. F. (2016). Validity of content-based techniques to distinguish true and fabricated statements: A meta-analysis. Law and Human Behavior, 40(4), 440–457. https://doi.org/10.1037/lhb0000193
  • Palena, N., Caso, L., Vrij, A., & Orthey, R. (2018). Detecting deception through small talk and comparable truth baselines. Journal of Investigative Psychology and Offender Profiling, 15(2), 124–132. https://doi.org/10.1002/jip.1495
  • Savitsky, K., & Gilovich, T. (2003). The illusion of transparency and the alleviation of speech anxiety. Journal of Experimental Social Psychology, 39(6), 618–625.
  • Sporer, S., Manzanero, A. L., & Masip, J. (2021). Optimizing CBCA and RM research: Recommendations for analyzing and reporting data on content cues to deception. Psychology, Crime & Law, 27(1), 1–39. https://doi.org/10.1080/1068316X.2020.1757097
  • Strömwall, L. A., Granhag, P. A., & Hartwig, M. (2004). Practitioners’ beliefs about deception. In P. A. Granhag & L. A. Strömwall (Eds.), Deception detection in forensic contexts (pp. 229–250). Cambridge University Press.
  • Vredeveldt, A., van Koppen, P. J., & Granhag, P. A. (2014). The inconsistent suspect: A systematic review of different types of consistency in truth tellers and liars. In R. Bull (Ed.), Investigative interviewing (pp. 183–207). Springer Science + Business Media. https://doi.org/10.1007/978-1-4614-9642-7_10
  • Vrij, A. (2008). Detecting lies and deceit: Pitfalls and opportunities (2nd ed.). John Wiley and Sons. ISBN: 978-0-470-51624-9.
  • Vrij, A., Deeb, H., Leal, S., & Fisher, R. P. (2022). The effects of a secondary task on true and false opinion statements. International Journal of Psychology & Behavior Analysis, 7, 185. https://doi.org/10.15344/2455-3867/2022/185
  • Vrij, A., Deeb, H., Leal, S., Granhag, P. A., & Fisher, R. P. (2021). Plausibility: A verbal cue to veracity worth examining? The European Journal of Psychology Applied to Legal Context, 13(2), 47–53. https://doi.org/10.5093/ejpalc2021a4
  • Vrij, A., Fisher, R. P., & Leal, S. (2023). How researchers can make verbal lie detection more attractive for practitioners. Psychiatry, Psychology, & Law, 30(3), 383–396. https://doi.org/10.1080/13218719.2022.2035842
  • Vrij, A., & Granhag, P. A. (2012). Eliciting cues to deception and truth: What matters are the questions asked. Journal of Applied Research in Memory and Cognition, 1(2), 110–117. https://doi.org/10.1016/j.jarmac.2012.02.004
  • Vrij, A., Granhag, P. A., Ashkenazi, T., Ganis, G., Leal, S., & Fisher, R. P. (2022). Verbal lie detection: Its past, present and future. Brain Sciences, 12(12), 1644. https://doi.org/10.3390/brainsci12121644
  • Vrij, A., Hartwig, M., & Granhag, P. A. (2019). Reading lies: Nonverbal communication and deception. Annual Review of Psychology, 70(1), 295–317. https://doi.org/10.1146/annurev-psych-010418-103135
  • Vrij, A., Mann, S., Leal, S., & Fisher, R. P. (2021). Combining verbal veracity assessment techniques to distinguish truth tellers from lie tellers. The European Journal of Psychology Applied to Legal Context, 13(1), 9–19. https://doi.org/10.5093/ejpalc2021a2
  • Vrij, A., Mann, S., Leal, S., & Granhag, P. A. (2010). Getting into the minds of pairs of liars and truth tellers: An examination of their strategies. The Open Criminology Journal, 3(1), 17–22. https://doi.org/10.2174/1874917801003010017
  • Vrij, A., Palena, N., Leal, S., & Caso, L. (2021). The relationship between complications, common knowledge details and self-handicapping strategies and veracity: A meta-analysis. The European Journal of Psychology Applied to Legal Context, 13(2), 55–77. https://doi.org/10.5093/ejpalc2021a7

Appendix