Abstract
Objective:
Logistic regression (LR) is recognized as a promising method for making decisions about neuropsychological performance validity by integrating information across multiple measures. However, this method has yet to be widely adopted in clinical practice, likely because several open questions remain about its utility relative to simpler methods, its effectiveness across different clinical contexts, and its feasibility at sample sizes common in the field. The current study addresses these questions by assessing classification performance of logistic regression and alternative methods across an array of simulated data sets.
Methods:
We simulated scores of valid and invalid performers on 6 tests designed to mimic the psychometric and distributional properties of real performance validity measures. Out-of-sample predictive performance of LR and a commonly used alternative (“vote counting”) was assessed across different base rates, validity measure properties, and sample sizes.
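The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation code (which is available at the OSF link in the data availability statement). The function names, the normally distributed scores, the 1 SD group difference, the per-test cutoff of 1.28, the "two or more failures" voting rule, and the sample sizes are all hypothetical choices made for the example.

```python
import math
import random

def simulate(n, base_rate, n_tests=6, effect=1.0, seed=0):
    """Return (X, y). y = 1 marks an invalid performer; each row has n_tests scores.
    Assumption: scores are normal, and invalid performers score `effect` SDs higher."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        invalid = 1 if rng.random() < base_rate else 0
        X.append([rng.gauss(effect * invalid, 1.0) for _ in range(n_tests)])
        y.append(invalid)
    return X, y

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by full-batch gradient descent on the log loss."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * k, 0.0
        for row, target in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - target
            gb += err
            for j in range(k):
                gw[j] += err * row[j]
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_proba(w, b, X):
    """Predicted probability that each row comes from an invalid performer."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]

def vote_count(X, cutoff=1.28, min_fails=2):
    """Flag a case as invalid when at least `min_fails` tests exceed `cutoff`."""
    return [1 if sum(x > cutoff for x in row) >= min_fails else 0 for row in X]

def accuracy(pred, y):
    return sum(p == t for p, t in zip(pred, y)) / len(y)

# Separate training and test samples, so performance is assessed out of sample.
X_train, y_train = simulate(600, base_rate=0.4, seed=1)
X_test, y_test = simulate(300, base_rate=0.4, seed=2)

w, b = fit_logistic(X_train, y_train)
probs = predict_proba(w, b, X_test)
lr_acc = accuracy([1 if p >= 0.5 else 0 for p in probs], y_test)
vote_acc = accuracy(vote_count(X_test), y_test)
print(f"LR accuracy: {lr_acc:.3f}  vote counting accuracy: {vote_acc:.3f}")
```

Changing `base_rate`, `effect`, `n_tests`, or the training sample size in this sketch mirrors, in a simplified way, the kinds of conditions the study varies.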
Results:
LR improved classification accuracy by 2%–12% across simulation conditions, primarily by improving sensitivity. False positives and negatives can be further reduced when LR predictions are interpreted as continuous, rather than binary. LR made robust predictions at sample sizes feasible for neuropsychology research (N = 307) and when as few as 2 tests with good psychometric properties were used.
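One possible reading of "continuous, rather than binary" interpretation is to act only on confident predictions and to treat mid-range probabilities as indeterminate. The sketch below illustrates that idea; the function name `interpret`, the thresholds of 0.2 and 0.8, and the example probabilities are hypothetical and are not taken from the study.

```python
def interpret(prob_invalid, low=0.2, high=0.8):
    """Map a predicted probability of invalid performance to a three-way label.
    Assumption: the thresholds `low` and `high` are illustrative, not recommended values."""
    if prob_invalid < low:
        return "likely valid"
    if prob_invalid > high:
        return "likely invalid"
    return "indeterminate"

# Made-up probabilities, for illustration only.
for p in (0.05, 0.45, 0.55, 0.93):
    binary = "invalid" if p >= 0.5 else "valid"
    print(f"p = {p:.2f}  binary: {binary:<8}  graded: {interpret(p)}")
```

Under a binary 0.5 cutoff, the cases at 0.45 and 0.55 receive opposite labels despite nearly identical evidence; a graded reading flags both as needing further information.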
Conclusions:
Although training and test data sets of at least several hundred individuals may be required to develop and evaluate LR models for use in clinical practice, LR promises to be an efficient and powerful tool for improving judgements about performance validity. We offer several recommendations for model development and LR interpretation in a clinical setting.
Disclosure statement
The authors have no conflicts of interest to declare. This work was authored as part of the contributor’s official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. law.
Data availability statement
All code used to conduct the simulation studies reported in this manuscript and all simulated data are publicly available on the Open Science Framework: osf.io/tfcnw/.