
Generalized linear mixed models for deception research: avoiding problematic data aggregation

Pages 821-835 | Received 09 Sep 2014, Accepted 29 Apr 2015, Published online: 21 Sep 2015
 

Abstract

While the concept of sampling variation is well understood by most researchers in the field of deception detection, previous studies have failed to account for the multiple sources of sampling variation present in typical experimental designs and have used participant-level data as the dependent measure in analyses. These aggregated data, however, contain inherent biases that can mislead researchers. We argue that to appropriately test hypotheses and make inferences beyond a particular sample of participants, the decision-level data must be modelled directly. To illustrate how this can be achieved, we provide an introduction to generalized linear mixed models (GLMMs) for the analysis of deception data and present Monte Carlo simulations demonstrating both the seriousness of the inherent biases present in participant-level data and the benefits of the GLMM approach. These simulations suggest that the empirical Type 1 and Type 2 error rates associated with main effects testing in deception research may be as high as 35% when data are aggregated ‘by-judge’ and as high as 60% when data are aggregated ‘by-sender’, respectively. When decision-level data are modelled directly, however, these rates are likely to be close to nominal levels (6% and 28%, respectively). Implications for past and future research are discussed.
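To make the two aggregation schemes concrete, the following minimal sketch collapses decision-level records into ‘by-judge’ and ‘by-sender’ proportions. The records, the identifiers (`J1`, `S1`, and so on), and the helper `aggregate` are invented for illustration and are not data or code from the paper.

```python
from collections import defaultdict

# Hypothetical decision-level records: (judge, sender, correct),
# where correct is 1 if the judge's veracity judgement was right.
decisions = [
    ("J1", "S1", 1), ("J1", "S2", 0),
    ("J2", "S1", 1), ("J2", "S2", 1),
    ("J3", "S1", 1), ("J3", "S2", 0),
]

def aggregate(records, key_index):
    """Collapse decisions into a proportion correct per judge (0) or sender (1)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [n_correct, n_decisions]
    for record in records:
        key = record[key_index]
        totals[key][0] += record[2]
        totals[key][1] += 1
    return {key: n_correct / n for key, (n_correct, n) in totals.items()}

by_judge = aggregate(decisions, 0)   # one accuracy score per judge
by_sender = aggregate(decisions, 1)  # one accuracy score per sender
```

Each aggregation keeps one grouping factor and averages over the other, so the variation between senders disappears from the by-judge scores and the variation between judges disappears from the by-sender scores. Modelling the six decisions directly retains both.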

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1. In an analysis using by-sender data, the design would be specified as a 2×2 fully between-subjects factorial design in which senders either tell the truth or tell a lie (between-subjects factor) while being interviewed with one of two protocols, either the new protocol or the old protocol (between-subjects factor).

2. Also of interest would be the interaction between interview protocol and veracity, but for the sake of clarity we do not discuss this here.

3. We note that different authors use different notations. Here we have followed the notation used by Goldstein (2003).

4. Calculating precise p values for fixed effects estimates produced by GLMMs is difficult, given the uncertainty associated with identifying the correct degrees of freedom for the t and F distributions that the p values are based on. While there are numerous approximations available, for the purposes of this paper we can reasonably expect the t-distribution to approximate the normal distribution (the data set is fairly large and balanced); thus we can assume a coefficient is ‘significant’ if its absolute t-value is greater than 2 (see Baayen, Davidson, & Bates, 2008).
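The rule of thumb in Note 4 can be sketched as follows, assuming the t statistic is treated as standard normal. The function names `normal_approx_p` and `is_significant` are hypothetical, and the sketch uses only Python's standard library; it is an illustration of the approximation, not the authors' analysis code.

```python
from statistics import NormalDist

def normal_approx_p(t_value: float) -> float:
    """Two-sided p value, treating the t statistic as standard normal.

    This is an approximation that is reasonable only when the data set
    is large enough for the t-distribution to be close to the normal.
    """
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_value)))

def is_significant(t_value: float, threshold: float = 2.0) -> bool:
    """Apply the |t| > 2 rule of thumb described in Note 4."""
    return abs(t_value) > threshold

# A coefficient with |t| = 2 sits just under the conventional 5% level
# under the normal approximation (p is roughly 0.046).
p_at_cutoff = normal_approx_p(2.0)
```

The cutoff of 2 is slightly more conservative than the exact normal critical value of about 1.96, which is one reason it is a convenient rounding when the degrees of freedom are uncertain.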
