Abstract
We surveyed practicing clinicians who regularly used the Rorschach about the perceived clinical validity of specific Rorschach scores from many coding systems. The survey included quantitative feedback on the validity of specific variables as well as qualitative input in several areas, including the potentially unique information that can be obtained from specific variables, coding challenges associated with Comprehensive System (CS) codes, and recommendations for CS developments. Participants were recruited by applying a snowball sampling strategy. Based on responses from 246 experienced clinicians from 26 countries, composite judgments on rated variables were quite reliable (e.g., M α = .95 across 88 CS variables), despite limited agreement among any 2 judges. The aggregated judgments clearly differentiated among scores that were considered more and less clinically valid, and the overall results aligned with recently obtained meta-analytic conclusions from the traditional validity literature (Mihura, Meyer, Dumitrascu, & Bombel, 2012). The judges also provided guidance concerning revisions and enhancements that would facilitate Rorschach-based assessment in the future. We discuss the implications of the quantitative and qualitative findings and provide suggestions for future directions based on the results.
Acknowledgments
Portions of this article were presented at the 2008 (New Orleans, LA) and 2009 (Chicago, IL) annual meetings of the Society for Personality Assessment. We thank John Exner, Phil Erdberg, Chris Fowler, and Roger Greene for their contributions in designing this project as part of Exner's Rorschach Research Council. We are grateful to Helena Lunazzi and Manuel Esbert for translating the survey into Spanish and we thank all the clinicians who offered their input into this endeavor.
Editor's Note: Robert E. McGrath served as the Editor for this article with complete decision authority.
Notes
We use “codes” to indicate response-level classifications and “scores” for protocol-level sums; the generic term “variables” covers both.
The Rorschach Discussion List is accessible at http://tech.groups.yahoo.com/group/Rorschach_List/.
We also examined the potential benefits of converting each rater's raw scores to z scores and thereby setting each rater's mean score to 0.0 and their standard deviation to 1.0 to control for individual differences in how the rating scale was used. Using z scores in this context has no impact on interrater reliability correlations but it could affect the mean and standard deviation of the ratings for each variable. However, it had no notable impact on our results, with descriptive data remaining virtually unchanged. For instance, across the 115 rated variables, the correlation of the raw score Ms and z score Ms was .989; for the 88 CS variables, the correlation was .997. Thus, only raw score results are reported.
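The per-rater z-score conversion described above can be illustrated with a brief sketch using hypothetical rating data (the array values and dimensions are illustrative, not the survey's actual data). It shows the two properties the note relies on: Pearson interrater correlations are unchanged by the transformation, while per-variable means can in principle shift.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings: 5 raters x 10 variables on a -3 to +3 scale.
ratings = rng.integers(-3, 4, size=(5, 10)).astype(float)

# Convert each rater's ratings to z scores (mean 0, SD 1) to control
# for individual differences in how the rating scale was used.
z = (ratings - ratings.mean(axis=1, keepdims=True)) / \
    ratings.std(axis=1, keepdims=True)

# Interrater reliability correlations are unaffected: Pearson r is
# invariant to linear rescaling of either rater's scores.
r_raw = np.corrcoef(ratings[0], ratings[1])[0, 1]
r_z = np.corrcoef(z[0], z[1])[0, 1]
assert np.isclose(r_raw, r_z)

# Per-variable means of raw vs. z-scored ratings can differ, but the
# two sets of means typically correlate very highly (as in the note,
# where the correlations were .989 and .997).
mean_corr = np.corrcoef(ratings.mean(axis=0), z.mean(axis=0))[0, 1]
print(round(mean_corr, 3))
```

The same transformation is available as `scipy.stats.zscore(ratings, axis=1)`; the explicit formula is used here only to make the computation visible.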
It is possible the rating scale also contributed to all ratings having a positive mean if some raters disregarded the verbal anchors and considered a rating of zero to indicate the absence of validity.
Although it is reasonable to think that scores are likely to be relatively weak if they have had no studies published since 1974 targeting their core construct validity, we also limited analyses to just the 53 scores with some relevant data. For them, the association of perceived validity with the meta-analytic results was .35 (n = 53, p = .009).
As exceptions, Exner cautions about making inferences about dependency based on the Food content code and about making diagnostic determinations of depression from the DEPI (see Mihura et al., 2012, p. 9).
Without considering three variables that are not in R–PAS because they are redundant with other variables (i.e., Sum6, XA%, and TDI), the 54 variables that are included in R–PAS have much higher rated validity than the 58 variables that are not in R–PAS (Ms = 1.09 vs. 0.81, respectively; d = 1.48, p < .01).