Research Article

How to measure lineup fairness: concurrent and predictive validity of lineup-fairness measures

Received 07 Feb 2023, Accepted 12 Jan 2024, Published online: 01 Feb 2024
 

ABSTRACT

The current study examined the concurrent and predictive validity of four families of lineup-fairness measures – mock-witness measures, perceptual ratings, face-similarity algorithms, and resultant assessments (assessments based on eyewitness participants’ responses) – with 40 mock crime/lineup sets. A correlation analysis demonstrated weak or non-significant correlations between the mock-witness measures and the algorithms, but the perceptual ratings correlated significantly with both the mock-witness measures and the algorithms. These findings may reflect different task characteristics – pairwise similarity ratings of two faces versus overall similarity ratings for multiple faces – and suggest how to use algorithms in future eyewitness research. The resultant assessments did not correlate with the other families, but a multilevel analysis showed that only the resultant assessments – which are based on actual eyewitness choices – predicted eyewitness performance reliably. Lineup fairness, as measured using actual eyewitnesses, differs from lineup fairness as measured using the three other approaches.

Open Scholarship

This article has earned the Center for Open Science badges for Open Data and Preregistered. The data and materials are openly accessible at https://osf.io/swdfu/ and https://bit.ly/3zaJQMk.

Author note

Much of this study occurred while the second author was at Queen Margaret University, Edinburgh, UK.

This study was preregistered (see https://bit.ly/3zaJQMk).

The data file and analysis code are available at https://osf.io/swdfu/, and materials are available from the corresponding author upon reasonable request.

Author contributions

All persons who meet authorship criteria are listed as authors. The contributions that each author made to the current study and manuscript are described below.

The first author: Conceptualization, Funding acquisition, Methodology, Formal analysis, Visualization, and Writing-original draft.

The second author: Conceptualization, Methodology, Software, Data collection, and Writing-review & editing.

The third author: Conceptualization, Methodology, Supervision, and Writing-review & editing.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 The list of searched articles published from 2020 to 2022 is provided as Supplemental Material 1.

2 We preregistered our analytic approach here: https://bit.ly/3zaJQMk.

3 The authors used the dual-video & single-lineup paradigm (Oriet & Fitzgerald, 2018). That is, they selected four versions of a videoed event, each depicting the same event with a different person as the target. The videos were selected as two pairs, such that each pair comprised similar-looking targets. A TP lineup was then created for each of the four targets, and each TP lineup also served as the TA lineup for the other member of that target’s pair (see the sketch below).
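To make the counterbalancing concrete, here is a minimal Python sketch of the lineup assignments under this paradigm. The target IDs ('A1', 'A2', 'B1', 'B2') are hypothetical placeholders, not the study's actual stimuli.

```python
# Minimal sketch of the dual-video & single-lineup counterbalancing
# (Oriet & Fitzgerald, 2018). Each pair contains two similar-looking
# targets; target IDs here are hypothetical.
similar_pairs = [("A1", "A2"), ("B1", "B2")]

lineup_assignments = {}
for t1, t2 in similar_pairs:
    # The TP lineup built for one target doubles as the TA lineup
    # for the other member of the pair, and vice versa.
    lineup_assignments[t1] = {"TP": f"lineup_{t1}", "TA": f"lineup_{t2}"}
    lineup_assignments[t2] = {"TP": f"lineup_{t2}", "TA": f"lineup_{t1}"}

print(lineup_assignments)
```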

4 For the mock-witness paradigm and perceptual ratings, we managed data quality from MTurk/CloudResearch with the following steps (steps 3–6 are illustrated in the sketch below). (1) We admitted only vetted, high-quality participants who had passed CloudResearch’s attention and engagement measures. (2) We applied CloudResearch’s additional data-quality settings to exclude anyone we had previously identified as a bad participant; we blocked ‘suspicious geocode locations’ and ‘duplicate IP addresses’ and used the ‘verify worker country and state location’ option. (3) We excluded participants who reported responding randomly, cheating, or using a mobile device, after assuring them that their payment or credit would not be affected, so they had no reason to lie. (4) We excluded bots and previews. (5) We excluded people who did not complete the study. (6) We excluded people who reported relevant technical difficulties.
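A minimal pandas sketch of how the self-report exclusions (steps 3–6) could be applied; the column names (e.g. 'responded_randomly', 'completed') are hypothetical and depend on how the survey platform labels its export.

```python
import pandas as pd

# Hypothetical export of the mock-witness/ratings data.
df = pd.read_csv("mock_witness_data.csv")

df = df[~df["responded_randomly"]]    # (3) self-reported random responding
df = df[~df["cheated"]]               # (3) self-reported cheating
df = df[~df["used_mobile_device"]]    # (3) self-reported mobile-device use
df = df[df["worker_type"] != "bot"]   # (4) bots and previews
df = df[df["completed"]]              # (5) incomplete sessions
df = df[~df["technical_difficulty"]]  # (6) relevant technical difficulties
```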

5 When no mock-witness chose a suspect, the suspect’s choosing frequency was set to 0.5 when calculating estimates of the mock-witness indices (see the sketch below).
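For illustration, here is a minimal sketch of the two most common mock-witness indices, using the standard definitions of Tredoux’s E (Tredoux, 1998) and Effective Size (Malpass, 1981) and the 0.5 adjustment described in this note; the paper’s exact computation may differ.

```python
import numpy as np

def mock_witness_indices(choices, suspect_index, nominal_size):
    """Tredoux's E and Malpass's Effective Size from mock-witness choices.

    `choices` holds the count of selections per lineup member. Following
    the note above, a never-chosen suspect's count is set to 0.5 before
    proportions are computed.
    """
    counts = np.asarray(choices, dtype=float)
    if counts[suspect_index] == 0:
        counts[suspect_index] = 0.5            # adjustment from note 5
    n = counts.sum()
    p = counts / n

    tredoux_e = 1.0 / np.sum(p ** 2)           # Tredoux (1998): E = 1 / sum(p_i^2)

    expected = n / nominal_size                # choices expected under a fair lineup
    effective_size = nominal_size - np.sum(np.abs(counts - expected) / (2 * expected))
    return tredoux_e, effective_size

# Example: 6-member lineup, 30 mock witnesses, suspect (position 0) never chosen.
print(mock_witness_indices([0, 8, 7, 6, 5, 4], suspect_index=0, nominal_size=6))
```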

6 We selected the ten features based on a preliminary examination in which we reviewed eighteen research papers on eyewitness descriptions and listed the features they commonly reported. Each of the eighteen papers is marked with an asterisk in the reference list of the present study. The features were sex/gender, age, hair, clothing, height, weight, body build/shape, race, and complexion. However, some of these features (e.g. height, weight, and clothing) cannot be rated from our stimuli, which show only faces, so we replaced them with other features that mock-witnesses frequently describe, such as hair length, facial hair, and nose size.

7 For ease of computation, we calculated the Euclidean Distance estimates assuming equal weights on all features; the computation could be elaborated by assigning a different weight to each feature (see the sketch below).
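A minimal sketch of this distance computation: equal weights reproduce the equal-weight Euclidean distance in the note, and the optional weight vector shows where unequal weights would enter. The example rating vectors are hypothetical.

```python
import numpy as np

def feature_distance(ratings_a, ratings_b, weights=None):
    """Euclidean distance between two faces' feature-rating vectors.

    With weights=None every feature counts equally, as in the note above;
    a weight vector generalizes the computation to unequal weights.
    """
    a = np.asarray(ratings_a, dtype=float)
    b = np.asarray(ratings_b, dtype=float)
    w = np.ones_like(a) if weights is None else np.asarray(weights, dtype=float)
    return np.sqrt(np.sum(w * (a - b) ** 2))

# Hypothetical ratings of two faces on the ten features (e.g. 1-7 scales).
print(feature_distance([4, 5, 3, 6, 2, 4, 5, 3, 4, 6],
                       [3, 5, 4, 6, 3, 4, 4, 3, 5, 6]))
```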

8 Although the analysis with the resultant mock-witness indices was not part of our preregistration, we included them based on the suggestions of two anonymous reviewers.

9 However, given that researchers may be interested in the degree of bias in TP lineups, and that it is possible to hold memory constant experimentally, we additionally calculated resultant assessments of TP lineup fairness based on Penrod’s (2003) classification of guessers and reliable eyewitnesses. The calculation methods are available in Supplemental Material 3 for interested readers.

10 Even when including estimates of the resultant mock-witness measures for TP lineups in the same correlation analysis, the correlation patterns did not differ considerably from those reported above (see Supplemental Material 3).

11 Because estimates of the resultant mock-witness measures were calculated only in TA lineups, those measures were excluded from the moderation analysis. However, an additional multilevel analysis including the resultant mock-witness measures of TP and TA lineups demonstrated that the resultant Lineup Size indices predicted suspect and filler ID rates only in TP lineups, whereas the resultant Lineup Bias indices predicted those DVs in both TP and TA lineups (see Supplemental Material 3).
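For readers unfamiliar with this modeling setup, here is a minimal sketch of a multilevel moderation analysis of this kind: eyewitness responses nested within lineups, with lineup type (TP vs. TA) moderating the effect of a fairness index. The column names and data file are hypothetical, and a linear-probability MixedLM from statsmodels stands in for whatever link function and software the authors actually used.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical per-witness data: one row per eyewitness response.
df = pd.read_csv("eyewitness_responses.csv")

model = smf.mixedlm(
    "chose_suspect ~ fairness_index * lineup_type",  # TP vs. TA moderation
    data=df,
    groups=df["lineup_id"],                          # random intercept per lineup
).fit()
print(model.summary())
```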

12 We would like to express thanks to an anonymous reviewer for raising this point.

13 The distribution of the major variables in the meta-analytic review is as follows: M_TP = 4.08, SD_TP = 1.62, M_TA = 3.51, SD_TA = 1.24 for Tredoux’s E; M_TP = 3.66, SD_TP = 1.24, M_TA = 5.39, SD_TA = 1.24 for Effective Size; M_TP = 0.45, SD_TP = 0.15, M_TA = 0.25, SD_TA = 0.17 for suspect ID rates; M_TP = 0.27, SD_TP = 0.16, M_TA = 0.31, SD_TA = 0.16 for filler ID rates; and M_TP = 0.28, SD_TP = 0.11, M_TA = 0.44, SD_TA = 0.13 for rejection rates.

Additional information

Funding

This research was supported by Hallym University Research Fund, HRF-202012-003.
