Review Article

Legal Admissibility of the Rorschach and R-PAS: A Review of Research, Practice, and Case Law

Pages 137-161 | Received 29 Apr 2021, Accepted 07 Jan 2022, Published online: 18 Feb 2022
 

Abstract

The special issue editors selected us to form an “adversarial collaboration” because our publications and teaching encompass both supportive and critical attitudes toward the Rorschach and its recently developed system for use, the Rorschach Performance Assessment System (R-PAS). We reviewed the research literature and case law to determine if the Rorschach and specifically R-PAS meet legal standards for admissibility in court. We included evidence on norms, reliability, validity, utility, general acceptance, forensic evaluator use, and response style assessment, as well as United States and selected European case law addressing challenges to mental examination motions, admissibility, and weight. Compared to other psychological tests, the Rorschach is not challenged at unusually high rates. Although the recently introduced R-PAS is not widely referenced in case law, evidence suggests that information from it is likely to be ruled admissible when used by a competent evaluator and selected variables yield scores that are sufficiently reliable and valid to evaluate psychological processes that inform functional psycholegal capacities. We identify effective and ethical but also inappropriate uses (e.g., psychological profiling) of R-PAS in criminal, civil, juvenile, and family court. We recommend specific research to clarify important aspects of R-PAS and advance its utility in forensic mental health assessment.

Data availability statement

The authors confirm that the data supporting the findings of this study are available within the article and its supplementary materials.

Disclosure statement

Donald J. Viglione and Gregory J. Meyer are members of a company that sells the R-PAS manual and associated products.

Notes

1 Previous Rorschach systems have not included this instruction, leading to extensive variability in the number of responses evaluees provide and consequent difficulty applying one set of normative expectations to protocols that may contain 14 responses (R) or may contain 70 or more responses. R-PAS sets 16 responses as a minimum and limits evaluees to no more than four responses per card. Data indicate R-PAS administration achieves its aim to produce an average range for R of 18 to 27 responses (Hosseininasab et al., Citation2019; Meyer et al., Citation2011; Pianowski et al., Citation2021).

2 There is substantial neurophysiological research on the Rorschach that we lack the space to address, but it is part of the empirical foundation supporting eight of the 21 (38%) variables in Table 2.

3 People may call the Rorschach a “projective test” or think it relates to psychoanalytic theory. However, since Exner (Citation1974), the test has been conceptualized as a problem-solving task. Further, juxtaposing the term “projective” against so-called “objective” tests (i.e., self-report inventories) is misleading (Viglione & Rivera, Citation2003). It is more accurate to call it a performance task or performance-based test (Meyer et al., Citation2011; Meyer & Kurtz, Citation2006), as used here.

4 We use the term “multimethod” to refer to this type of multiple-source context in FMHA.

5 Comparing an evaluee’s Rorschach summary scores to a normative standard that looks healthier than the population it is supposed to represent results in overpathologizing.

6 Following R-PAS, even CS users now recommend replacing the old CS norms with international norms (e.g., Weiner et al., Citation2019).

7 Viglione and Meyer (Citation2008) noted this variable had low interrater reliability in the CS literature.

8 The CS generated neither normed scores for variables nor a visual plot of its scores, making it necessary for evaluators to memorize interpretive cut-off scores for the individual variables, which increases the likelihood of error.

9 To quantify the extent to which R-PAS made use of these findings in its variable selection, we correlated being retained in R-PAS (no = 0, yes = 1) with level of empirical support (excellent = 4, good = 3, modest = 2, little to none = 1, no studies = 0), finding ρ = .65, p < .001. Thus, inclusion of variables in R-PAS was strongly determined by the evidence for their convergent validity.
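The footnote's analysis, correlating a binary retained-in-R-PAS indicator with an ordinal rating of empirical support, can be sketched as follows. The variable ratings below are invented purely for illustration (they are not the study data), and the rank-based rho is computed in plain Python rather than with a statistics library:

```python
def average_ranks(values):
    """Assign 1-based ranks, averaging ranks across ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Invented example: 0/1 = dropped/retained; 0-4 = level of empirical support
# (no studies = 0 ... excellent = 4), mirroring the footnote's coding scheme.
retained = [1, 1, 1, 0, 0, 1, 0, 1, 0, 1]
support  = [4, 3, 4, 1, 0, 3, 2, 4, 1, 2]
print(round(spearman_rho(retained, support), 2))
```

A positive rho here, as in the footnote, indicates that variables with stronger convergent-validity evidence were more likely to be retained.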

10 The validity estimates included in this review were based on the hypotheses of the original study authors, which at times were not clearly stated. Unlike Mihura et al. (Citation2013), we did not rate the coefficients for the extent to which the criterion matched the core construct for the variable, nor limit them to those that achieved that match. Thus, the coefficients we report can be considered convergent validity coefficients, as opposed to what Mihura et al. described as construct validity coefficients after limiting their analyses to criteria judged to fit the core interpretation of each variable.

11 One author (JH) reviewed all studies and identified those that were relevant. Two authors (JH and GM) then reviewed these articles, extracted information for the entries in Supplemental Tables 1 and 2, extracted information for the entries in Tables 1 and 2, and provided the ratings in Tables 1 and 2. References for all Supplemental Tables are found in Supplemental Material References.

12 The most stringent standards for reliability cited by Edens and Boccaccini (Citation2017) are those recommended by Nunnally and Bernstein (Citation1994, p. 265) for instances when “important decisions are made with respect to specific test scores.” They declare that in those circumstances, .95 is the desirable standard and .90 the minimal standard, with the guidelines applying to reliability in general, not interrater reliability specifically. We reviewed a sample of literature citing Edens and Boccaccini (Citation2017) and found neither endorsements nor criticisms of these recommended levels of reliability. We consider these ICC recommendations to be more suitable for “stand-alone” instruments, such as violence risk assessment tools, which yield scores that are often interpreted with direct implications for essential legal decisions. This view is in line with Edens and Boccaccini’s quote from Nunnally and Bernstein.

13 Erard et al. (Citation2017) did not provide separate results for each of the profiled variables. Instead, they provided specific values only for those variables having surprisingly poor reliability: FQ-%, WD-%, and WSumCog, which had ICCs of .29, .31, and .42, respectively. In our analysis of recent reliability studies (see Table 1) these variables have interrater reliability values in the adequate or good range.

14 As an example, an actuarial risk assessment instrument might be used in a stand-alone way to link a specific scale elevation to a specific forensic opinion.

15 Two publications included multiple, but separate, case studies, which were considered individual cases, whereas multiple testing administrations as part of one court-ordered evaluation for child custody or course of treatment were considered a single case evaluation.

16 The challenged tests were: (1) Abel and Becker Cognition Scale (ABCS); (2) MCMI (all versions); (3) MMPI (all versions); (4) PAI (all versions); (5) PCL (all versions); (6) Static-99 (all versions); (7) Structured Interview of Reported Symptoms (SIRS, all versions); and (8) Wechsler Adult Intelligence Scale (WAIS, all versions).

17 This search string is provided in the note field of Supplemental Table 5.

18 This search string is provided in the note field of Supplemental Table 6.

19 Indeed, a parallel Westlaw search referencing the CS but excluding R-PAS [adv: DA(bef 1/1/2020) & Rorschach Rorshach & “comprehensive system” Exner % R-PAS “perform! assess!”] returned only 26 cases. Thus, it appears that judicial opinions rarely refer to any specific Rorschach systems.

20 Limited research is defined as having either fewer than four studies (ns < 4) or fewer than 200 participants (np < 200) in Table 2. Adequate reliability is defined as reliability ratings of Adequate or above in Table 1. These 12 variables are F%, MC-PPD, WSumCog, SevCog, WD-%, YTVC’, m, MOR, SC-Comp, ODL%, MAP/MAHP, and PHR/GPHR.
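The screening rule in this footnote can be expressed as a simple filter. The study counts below are invented for illustration and do not reproduce the Table 2 data; only the threshold logic (fewer than 4 studies or fewer than 200 participants) comes from the footnote:

```python
def limited_research(n_studies, n_participants):
    """Flag a variable as having limited research per the footnote's rule."""
    return n_studies < 4 or n_participants < 200

# Invented (n_studies, n_participants) counts for three variable names
# taken from the footnote's list; the numbers are hypothetical.
variables = {
    "F%":      (3, 150),
    "WSumCog": (6, 450),
    "MOR":     (2, 520),
}

flagged = [name for name, (ns, nparts) in variables.items()
           if limited_research(ns, nparts)]
print(flagged)
```

With these invented counts, F% fails the study-count threshold on both criteria and MOR on the study count alone, while WSumCog clears both.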
