Research Article

Metamnemonic predictions of lineup identification

Pages 1019-1038 | Received 07 Dec 2022, Accepted 19 May 2023, Published online: 02 Jun 2023

ABSTRACT

After a crime is committed, investigators may query witnesses about whether they believe they will be able to identify the perpetrator. However, we know little about how such metacognitive judgments are related to performance on a subsequent lineup identification task. The extant research has found the strength of this relationship to be small or nonexistent, which conflicts with the large body of literature indicating a moderate relationship between predictions and performance on memory tasks. In Studies 1–3, we induce variation in encoding quality by having participants watch a mock crime video with low, medium, or high exposure quality, and then assess their future lineup performance. Calibration analysis revealed that assessments of future lineup performance were predictive of identification accuracy. This relationship was driven primarily by poor performance following low assessments. Studies 4 and 5 showed that these predictions are not based on a witness’s evaluation of their encoding experience, nor on a contemporaneous assessment of memory strength. These results reinforce the argument that variation in memory quality is needed to obtain reliable relationships between predictions and performance. An unexpected finding is that witnesses who made a prediction shortly after encoding evinced superior memory compared to those who made a prediction later.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data analysed in the current study are available in the OSF repository at https://osf.io/4ywr9/.

Notes

1 Although a calibration analysis is also sometimes used to refer to a specific type of analysis that includes all identifications in the calculation of accuracy, we use the term “calibration analysis” more generally to refer to any analytic approach in which judgments and accuracy collapsed over subsets of data are compared. In the current studies, we measure how well pre-lineup assessments are calibrated to identification accuracy for suspect identifications only.

2 “Worst-case scenario” analyses that do not rely on such an assumption are reported in the General Discussion. None of the conclusions from either Study 4 or Study 5 changed as a result.

3 It should be emphasised that the CAC analyses are our primary analyses. However, to facilitate comparison with prior research, we also report the point-biserial correlation between pre-lineup assessments and identification accuracy, as well as logistic regression analyses, collapsed across all data. These ancillary analyses are included in the Supplementary Materials.

4 Because this scenario was not fully orthogonally crossed with the variable of exposure quality, data from these two videos were not used in any of the statistical comparisons between conditions. Data from these two videos were only used when assessing the relationship between pre-lineup assessments and identification accuracy.

5 Although it would also be possible to measure calibration for non-choosers only, we decided to only examine calibration for participants who identified a suspect.

6 All parameter estimates are standardized (see MacKinnon & Dwyer, 1993).

7 The CI is bias-corrected and accelerated (BCa; Efron & Tibshirani, 1993) and was computed with a bootstrapping procedure using the boot package in R (Canty & Ripley, 2019). For each iteration, parameter estimates were standardized.

8 We repeated the ROC analysis for the data collapsed across Studies 1–3, separately for each pre-lineup assessment bin. Participants who gave a high assessment showed better discriminability (pAUC = 0.011) than participants who gave a low assessment (pAUC = 0.005), D = 2.02, p = .04, but did not differ from participants who gave a medium assessment (pAUC = 0.012), D = −0.52, p = .60. Participants who gave a medium assessment performed better than participants who gave a low assessment, D = 3.06, p = .002.

9 An unplanned post-hoc test revealed that the pAUC was greater for the immediate condition (0.23) than the delayed condition (0.07) for participants who returned for part two before the median return time (D = 2.81, p < .01). For participants who returned for part two after the median return time, performance in the immediate condition (0.06) did not differ from the delayed condition (0.15; D = −1.57, p = .12).

10 These analyses were repeated for participants who returned for part two before and after the median return time. None of the results differed from the overall analysis, with the exception that for “early returners,” the confidence interval for the difference between the low- and the medium-assessment bins overlapped with zero (95% CI = [−0.55, 0.06]), but the confidence interval for the difference between low- and high-assessment bins did not overlap with zero (95% CI = [−0.55, −0.03]).
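
As an illustration of the quantities described in Notes 1, 3 and 5, the following is a minimal Python sketch, not the authors' analysis code (the article reports using R). It computes (a) identification accuracy for suspect identifications only, collapsed within each pre-lineup assessment bin, and (b) a point-biserial correlation, which is simply the Pearson correlation between a numeric assessment and a 0/1 accuracy score. The function names, the outcome labels, and the toy records are all hypothetical and chosen for illustration; they are not taken from the study data.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical outcome labels; only suspect identifications enter the
# calibration calculation (filler identifications and rejections are skipped).
SUSPECT_OUTCOMES = ("guilty_suspect", "innocent_suspect")


def suspect_id_accuracy_by_bin(records):
    """Return {assessment_bin: proportion of suspect IDs that were correct}."""
    counts = defaultdict(lambda: [0, 0])  # bin -> [correct suspect IDs, all suspect IDs]
    for assessment_bin, outcome in records:
        if outcome not in SUSPECT_OUTCOMES:
            continue
        counts[assessment_bin][1] += 1
        if outcome == "guilty_suspect":
            counts[assessment_bin][0] += 1
    return {b: correct / total for b, (correct, total) in counts.items()}


def point_biserial(scores, correct):
    """Pearson correlation between numeric scores and 0/1 accuracy values."""
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(correct) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, correct))
    sxx = sum((x - mean_x) ** 2 for x in scores)
    syy = sum((y - mean_y) ** 2 for y in correct)
    return sxy / sqrt(sxx * syy)


# Toy data, invented purely for illustration.
toy_records = [
    ("low", "guilty_suspect"), ("low", "innocent_suspect"), ("low", "filler"),
    ("medium", "guilty_suspect"), ("medium", "guilty_suspect"),
    ("medium", "innocent_suspect"), ("medium", "reject"),
    ("high", "guilty_suspect"), ("high", "guilty_suspect"),
    ("high", "guilty_suspect"), ("high", "innocent_suspect"),
]
accuracy = suspect_id_accuracy_by_bin(toy_records)

toy_assessments = [1, 2, 3, 4, 5, 6]   # hypothetical numeric pre-lineup assessments
toy_correct = [0, 0, 1, 0, 1, 1]       # 1 = accurate identification
r_pb = point_biserial(toy_assessments, toy_correct)

print(accuracy)
print(round(r_pb, 3))
```

Binning before computing accuracy is what distinguishes the calibration approach from the single collapsed correlation: it can show, for example, that accuracy is low specifically after low assessments, a pattern a single coefficient would not reveal.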

Additional information

Funding

This publication was made possible with a Ruth L. Kirschstein National Research Service Award (NRSA) Institutional Research Training Grant from the National Institutes of Health (National Institute on Aging) grant # 5T32AG000175. We thank Stephen Lindsay for granting access to his stimuli.
