Assessment, Development, and Validation

The Therapy Progress Scale: Evaluating Psychometric Properties in an Outpatient Sample of Clients in Private Practice

Matteo Bugatti, Yixiao Dong, Jesse Owen, Zachary Richardson, Wendy Rasmussen & Douglas Newton

Abstract

Measurement-based care, an evidence-based practice endorsed by the American Psychological Association, is underpinned by routine assessment supporting a data-driven approach to clinical decision making. Nonetheless, there is a need for brief, nonproprietary measures assessing non-symptom-based outcomes. The present study examined the psychometric properties of the Therapy Progress Scale (TPS), a four-item measure assessing clients’ perceived treatment progress in multiple life functioning domains. The sample included 36,420 clients (66% female, 55.5% White, 31.5% Racial/Ethnic Minority) receiving outpatient psychotherapy from a practice-research group of private practitioners. The TPS demonstrated a one-factor solution (χ2 (2) = 362.08, RMSEA = .076, CFI = .999, TLI = .996) with a high reliability estimate (coefficient α = .87). Additionally, the factor structure was consistent across client gender and race/ethnicity. There were moderate negative correlations with symptom-based measures (i.e. PHQ-9 and GAD-7). Test-retest correlation was also strong. Implications for research and practice are provided.

Measurement-based care (MBC) is an evidence-based practice that has been supported by the American Psychological Association (APA) for almost two decades (APA Presidential Task Force on Evidence-Based Practice in Psychology, 2006). APA’s commitment to this approach to care was recently strengthened by the release of APA’s Practice Guidelines for MBC (Boswell et al., 2023). MBC can be viewed as an approach to health care encompassing several procedures and systems. Within the scope of psychological services, MBC includes administering patient-reported outcome measures (PROMs), tracking measurement scores over the course of therapy, processing changes (or the lack thereof) in sessions, and making modifications to treatment as needed. These processes have been aided by measurement-feedback systems (MFSs), software facilitating the clinical interpretation of routine outcome measurement scores. Engaging in MBC has been shown to increase client retention and enhance treatment outcomes in a variety of naturalistic and experimental research designs (e.g. Anker et al., 2009; de Jong et al., 2021; Lambert et al., 2018; Reese et al., 2013).

MBC originated in the medical field and was later introduced to the field of mental health services through translational science (Boswell et al., 2015). Consequently, certain aspects of MBC in mental health services have mirrored practices that are traditionally associated with the medical field. Among these, MBC in mental health has continued, following standard medical practice, to predominantly rely on the administration and interpretation of symptom-based measures (Wampold & Imel, 2015). Nonetheless, recent research and development efforts have been underpinned by a growing interest in producing measures assessing psychological processes that are perceived by psychotherapists and clients alike as more clinically useful and relevant than symptom-based measures (Bugatti & Boswell, 2022; Jensen-Doss et al., 2018).

As an alternative and complementary approach to symptom-based assessment, several measures assessing clients’ well-being and life functioning have become an integral component of the routine assessment batteries administered by several MBC systems (e.g. Kopta et al., 2015; Miller et al., 2003). Most of these scales assess eudaimonic aspects of well-being, characterized by the assessment of distinct life domains, such as social, professional, spiritual, and familial. These measures are consistent with several trends in psychological services. First, most well-being and life functioning measures are transdiagnostic. Recent years have seen a progressive abandonment of disorder-specific approaches to treatment in favor of transdiagnostic, evidence-based psychotherapies (e.g. Barlow et al., 2017; Johansson et al., 2012). These approaches to treatment target psychological processes that cut across diagnoses (e.g. mindfulness, experiential avoidance, emotion regulation), and thus their effectiveness is better assessed by non-symptom-based measures than by symptom-based measures, which tend to be diagnosis-specific. Second, well-being and life functioning measures are transtheoretical. The majority of psychotherapists combine multiple approaches in eclectic and integrative manners (e.g. Norcross & Rogan, 2013). As such, measures assessing change on transtheoretical constructs might better capture treatment outcomes. Third, a significant portion of clients seeking psychological services either meet criteria for multiple psychiatric conditions or fall below threshold for many; yet many of these clients report significant impairment across various domains of life functioning and well-being. Psychotherapy can effectively help most clients achieve lasting improvement (Barkham et al., 2021; Wampold & Imel, 2015), and thus measures capturing those outcomes are needed. Fourth, these measures should reduce the burden placed on clients by extended batteries of measures, which has been identified as one of the primary client-level impediments to MBC (Boswell et al., 2015; Lewis et al., 2019).

The few ultra-brief measures of functioning that are currently available are limited to “point-in-time” assessment (e.g. Outcome Rating Scale; Miller et al., 2003; Rating of Outcome Scale; Seidel et al., 2017). This approach features items worded to capture individuals’ self-rating on a specific construct (e.g. social functioning) at a specified timepoint. “Point-in-time” assessment is the standard measurement approach used in medicine and psychological assessment. However, as an alternative to “point-in-time” assessment, change-oriented measures have been developed (e.g. Patient Estimate of Improvement; Hatcher & Barends, 1996). Items included in these measures are worded to prompt individuals to report change from the previous assessment timepoint. They have the potential to capture subtle change that would otherwise be missed by “point-in-time” assessment. The change-oriented approach to measurement is consistent with the response-shift literature (Howard et al., 1981; Moore & Owen, 2014; Rapkin & Schwartz, 2004). “Response shift is a combination of a testing effect and a person’s maturation that actually improves validity rather than threatens validity” (Moore & Owen, 2014, p. 183). That is, clients’ perspectives on their mental health might have shifted after gaining new insights or implementing new coping strategies.

Any measure should also have rigorous psychometric properties so that its scores can be properly interpreted. At a foundational level, measures should have a sound factor structure, evidenced by strong factor loadings and high reliability (Dumas & Dong, 2019; Wainer & Thissen, 2001). Moreover, this factor structure should be consistent (i.e. factor invariance) across client factors, such as gender and/or race/ethnicity. Factor (or factorial) invariance is key to ensuring that the factor structure of the measured latent variable is identical across groups or testing occasions (Meade & Lautenschlager, 2004; Millsap & Olivera-Aguilar, 2012), so that construct scores can be meaningfully compared. However, measures with factor invariance evidence do not fully capture the lived experiences of different cultural groups. Indeed, Owen et al. (2011) recommended that, to better assess specific cultural experiences, different measures should be employed; nonetheless, there is still an advantage in testing factor invariance for global measures. For instance, there is growing evidence that some therapists have worse therapy outcomes with their racial/ethnic minority (REM) clients as compared to their White clients (e.g. Drinane et al., 2016; Hayes et al., 2016; Imel et al., 2011; Owen et al., 2021). Similar findings have been reported for comparisons between men and women (Owen et al., 2009). Yet, for those comparisons to be fruitful, the measure would need to capture similar information for both REM and White clients. Additionally, novel, routinely administered measures should be sensitive to between-session change to capture meaningful change across therapy and, in turn, help therapists make clinical decisions (Jacobson & Truax, 1991).

The Present Study

The present study was designed to test the psychometric properties of a novel life functioning scale, which was specifically developed to overcome the limitations of currently available life functioning scales. The Therapy Progress Scale (TPS) is an ultra-brief, self-report PROM composed of four items assessing clients’ perceived change in functioning within distinct life domains over the course of the previous two weeks. It was adapted from the ECHO-Global Functioning Measure to better serve the purpose and characteristics of MBC. We hypothesized that the TPS would retain a one-factor structure (H1) with strong reliability (>.80; H2), and we posited that the TPS would be moderately correlated with the PHQ-9 and GAD-7 (H3). We also posited that the TPS would be sensitive to change, evidenced by pre- to post-treatment changes (H4). Lastly, we hypothesized that the factor structure would be identical for men and women (H5) as well as for White and racial/ethnic minority clients (H6).

Method

Inclusion and Exclusion Criteria

All clients who had received psychotherapy from the PRG SonderMind and who had completed the Therapy Progress Scale at a minimum of two time-points over the course of treatment were deemed eligible to participate.

Participant Characteristics

Clients

The analytic sample was composed of a total of 36,420 unique clients. Most participating clients identified as female (66%; 29% male, 1.7% nonbinary, 3.3% missing) and White (55.5%, 31.5% Racial/Ethnic Minority [REM], and 13% missing). Of the REM clients, 2.70% were Indigenous/Native American, 1.80% were Middle Eastern, 10.51% were Asian American, 38.01% were Black, 10.63% were biracial/ethnic, 35.48% were Latinx, 2.57% were Southeast Asian or Hawaiian/Pacific Islander. Clients’ average age was 35.15 (SD = 12.48). No other demographic information was collected. Most diagnoses were related to mood or anxiety disorders, with other diagnoses being less frequent (e.g. trauma-related disorders, eating disorders). Diagnoses were assessed by therapists as part of their routine practice.

Sampling Procedures

All study procedures were approved by the University of Denver Institutional Review Board (protocol #1776376-1).

Measurement of Constructs

Therapy Progress Scale

The Therapy Progress Scale (TPS) is a brief, four-item measure assessing changes in a range of life domains that was adapted from the ECHO Global. Specifically, the TPS was developed by extracting the “Perceived Improvement” subscale from the ECHO Global. Items from this four-item subscale were rephrased to support biweekly (every two weeks) administration. These modifications were made so that the measure could be used within the context of MBC, which is characterized by frequent assessment. The final four items are: (a) compared to two weeks ago, rate your ability to deal with daily problems now; (b) compared to two weeks ago, rate your ability to deal with social situations now; (c) compared to two weeks ago, rate your ability to accomplish the things you want to do now; and (d) rate your problems or symptoms now. These items are rated on a five-point scale: 1 = Very Poor, 2 = Poor, 3 = Fair, 4 = Good, 5 = Very Good. The scale was purposely created to offer a neutral midpoint allowing clients to describe little to no change from previous ratings on a specific item, which would provide valuable clinical information to therapists. A description of the measure’s psychometric properties is provided in the Results section.
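
To illustrate how TPS responses might be scored in practice, the following minimal Python sketch computes both a total score (sum of the four items, range 4–20, midpoint 12) and an item-level mean (range 1–5). The item anchors come from the text above; the function name, return format, and scoring direction are illustrative assumptions rather than the authors’ scoring code.

```python
from typing import Sequence

TPS_ANCHORS = {1: "Very Poor", 2: "Poor", 3: "Fair", 4: "Good", 5: "Very Good"}

def score_tps(responses: Sequence[int]) -> dict:
    """Score one TPS administration.

    `responses` holds the four item ratings (1-5). Returns the total
    (possible range 4-20, midpoint 12) and the item mean (range 1-5).
    Names and return format are illustrative, not the authors' code.
    """
    if len(responses) != 4 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("TPS requires four item ratings between 1 and 5.")
    total = sum(responses)
    return {"total": total, "item_mean": total / 4}

# Example: a client rating all four items "Fair" scores at the neutral midpoint.
print(score_tps([3, 3, 3, 3]))  # {'total': 12, 'item_mean': 3.0}
```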

Patient Health Questionnaire-9

The Patient Health Questionnaire-9 (PHQ-9; Kroenke et al., 2001) is a nine-item, self-report scale. It was designed to be administered to adult individuals to measure depressive symptom severity. In a validation study (Kroenke et al., 2001), the PHQ-9 demonstrated excellent internal reliability (Cronbach’s α = 0.89) and test-retest reliability (r = 0.84). In the present study, the PHQ-9 was administered electronically prior to the first session (and then continually for those diagnosed with a depressive disorder). In this sample, Cronbach’s α was 0.89.

Generalized Anxiety Disorder-7

The Generalized Anxiety Disorder-7 (GAD-7; Spitzer et al., 2006) is a seven-item, self-report scale assessing anxiety symptoms in adults. Clients are asked to rate the frequency of their anxiety symptoms within the last two weeks. Items are rated on a Likert-type scale of 0–3: 0 = not at all, 1 = several days, 2 = more than half of the days, 3 = nearly every day. Items are then summed to provide a total score. This measure has shown good internal reliability (Cronbach’s α = 0.92) and test-retest reliability (ICC = 0.83). In the present study, the GAD-7 was administered electronically prior to the first session (and then continually for those diagnosed with an anxiety disorder). In this sample, Cronbach’s α was 0.89.

Data Collection

Data were collected through an online practice-research group (PRG) platform for private practitioners. The platform facilitates several clinical tasks, including supporting prospective clients’ search for available therapists, scheduling, and billing. Moreover, it offers automated MBC, which comprises the administration of routine outcome and process measures that are then viewable by both therapists and clients. These outcomes are stored as part of the PRG database that was harnessed for the present study. The TPS was sent to clients prior to each of their therapy sessions. The PHQ-9 and GAD-7 were delivered at baseline and subsequently if the client was diagnosed with a depressive or anxiety condition, respectively.

Analytic Plan

Data Diagnostics

We first checked the fundamental psychometric properties of the TPS, such as dimensionality, reliability, and validity.

Primary Analysis

Multigroup confirmatory factor analysis (CFA) was conducted to investigate the measurement invariance (MI) of the scale across demographic groups. In this study, MI was tested across gender (men versus women) and race/ethnicity (White versus REM) groups. The latter comparison was limited to these two groups because the relatively small sizes of specific REM identities (e.g. Middle Eastern, Native Hawaiian) precluded separate analyses; collapsing them into a single REM group allowed these clients to be retained in the analyses. Moreover, this methodological approach was supported by several previous studies that implemented a comparable approach (e.g. Drinane et al., 2016; Hayes et al., 2016; Imel et al., 2011; Owen et al., 2021). All CFA models were estimated with the weighted least square mean and variance adjusted (WLSMV) estimator via Mplus 8.7 (Muthén & Muthén, 2017). Model fit was evaluated using chi-square statistics and alternative fit indices, including the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI), and the Root-Mean-Square Error of Approximation (RMSEA). The goodness-of-fit cutoffs followed the recommendations of Hu and Bentler (1999): RMSEA ≤ .06 (or ≤ .08) and CFI and TLI ≥ .95 (or ≥ .90). The equivalence of measurement models across demographic groups was evaluated by the magnitude of changes in these fit indices, with invariance retained when CFI (or TLI) decreased by no more than .01 (Cheung & Rensvold, 2002) and RMSEA increased by no more than .015 (Chen, 2007).
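
As a concrete illustration of how these cutoffs and change criteria can be applied, the sketch below encodes the Hu and Bentler (1999) guidelines and the ΔCFI/ΔRMSEA comparison rules cited above as simple Python checks. The thresholds are taken from the text, but the function names and the simplified decision logic are assumptions for illustration, not the authors’ analysis code (which was run in Mplus).

```python
def acceptable_fit(rmsea: float, cfi: float, tli: float,
                   strict: bool = True) -> bool:
    """Rough screen of absolute model fit per Hu & Bentler (1999).

    strict=True uses RMSEA <= .06 and CFI/TLI >= .95; strict=False relaxes
    to RMSEA <= .08 and CFI/TLI >= .90, mirroring the ranges in the text.
    """
    rmsea_cut, incr_cut = (.06, .95) if strict else (.08, .90)
    return rmsea <= rmsea_cut and cfi >= incr_cut and tli >= incr_cut


def invariance_supported(cfi_constrained: float, cfi_free: float,
                         rmsea_constrained: float, rmsea_free: float) -> bool:
    """Compare a more constrained MI model against a less constrained one.

    Invariance is retained if CFI drops by no more than .01
    (Cheung & Rensvold, 2002) and RMSEA rises by no more than .015
    (Chen, 2007). This is a simplified decision rule for illustration.
    """
    delta_cfi = cfi_constrained - cfi_free        # negative = worse fit
    delta_rmsea = rmsea_constrained - rmsea_free  # positive = worse fit
    return delta_cfi >= -0.01 and delta_rmsea <= 0.015


# Example with hypothetical fit values in the neighborhood of those reported:
print(acceptable_fit(rmsea=.076, cfi=.999, tli=.996, strict=False))  # True
print(invariance_supported(.997, .999, .054, .076))                  # True
```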

Results

Descriptive Statistics of Items

Item statistics for the TPS are summarized in Table 1. Based on the proportions of clients selecting each response category, all four items showed similar response patterns in which most clients selected the middle category and the rest tapered off symmetrically toward the extreme categories. Such a pattern is considered ideal, given that it may maximize response variability (Dumas & Dong, 2019; Wainer & Thissen, 2001). Moreover, the item-scale correlations ranged from .67 to .78, indicating that all items were able to discriminate among clients who had better or worse abilities to successfully engage in life tasks.

Table 1. Item Statistics of the Therapy Progress Scale.

In clinical measurement practice, clients’ data can be nested within therapists, meaning that a client’s responses to the scale may be more similar to those of clients with the same therapist. Checking the potential clustering effect in clients’ measurement data is necessary because the findings would impact later modeling choices (e.g. the need to employ a multilevel measurement model, Coleman et al., 2022). As shown in Table 1, the intraclass correlations of the four items ranged from 1% to 4%, indicating that the clustering effect was negligible. Therefore, the following measurement models were all conducted within a single-level framework.
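
For transparency about what these two item-level diagnostics involve, the sketch below shows one common way to compute corrected item-total correlations and a one-way random-effects ICC(1) for therapist clustering using pandas and numpy. The column names and the specific estimator (an ANOVA-based ICC(1)) are assumptions for illustration and may differ from the authors’ implementation.

```python
import numpy as np
import pandas as pd

ITEMS = ["tps1", "tps2", "tps3", "tps4"]  # hypothetical column names

def corrected_item_total(df: pd.DataFrame) -> pd.Series:
    """Correlation of each item with the sum of the remaining items."""
    total = df[ITEMS].sum(axis=1)
    return pd.Series({item: df[item].corr(total - df[item]) for item in ITEMS})

def icc1(values: pd.Series, groups: pd.Series) -> float:
    """One-way random-effects ICC(1) via the ANOVA estimator.

    `values` are item scores, `groups` are therapist IDs. Returns the share
    of variance attributable to therapist membership.
    """
    df = pd.DataFrame({"y": values, "g": groups}).dropna()
    grand_mean = df["y"].mean()
    sizes = df.groupby("g")["y"].size()
    means = df.groupby("g")["y"].mean()
    n, j = len(df), len(sizes)
    ss_between = (sizes * (means - grand_mean) ** 2).sum()
    ss_within = ((df["y"] - df["g"].map(means)) ** 2).sum()
    ms_between = ss_between / (j - 1)
    ms_within = ss_within / (n - j)
    n0 = (n - (sizes ** 2).sum() / n) / (j - 1)  # adjusted group size for unbalanced data
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)

# Usage with a hypothetical long-format data frame `data` containing the
# four item columns plus a `therapist_id` column:
# print(corrected_item_total(data))
# print({item: icc1(data[item], data["therapist_id"]) for item in ITEMS})
```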

Fundamental Psychometric Properties of the TPS

To examine whether the hypothesized unidimensional measurement model for the TPS fit the collected data, a confirmatory factor analysis with the whole sample was performed. Notably, the size of the sample was large (i.e. over 30,000 clients), so we did not interpret the overpowered chi-square significance test. The initial CFA showed excellent model fit, χ2 (2) = 362.08, RMSEA = .076, CFI = .999, TLI = .996. This finding supported the theoretical factor structure of the TPS, and thus our first hypothesis.

Table 2 displays additional psychometric properties of the TPS. The coefficient α of the scale was .87 (95% CI [.87, .87]) for this sample. Given that α is often a lower-bound estimate of internal consistency, we also calculated coefficient H as a measure of maximum reliability (Hancock & Mueller, 2001; McNeish, 2018). The coefficient H of the measure was .92, which further demonstrated that the scale was highly reliable. These findings supported our second hypothesis.
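
To make the two reliability estimates concrete, the sketch below computes coefficient α from the item scores and coefficient H from standardized factor loadings, following the standard formulas (Hancock & Mueller, 2001). The column names and example loadings are placeholders for illustration, not values from the study.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Coefficient alpha: k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def coefficient_h(std_loadings) -> float:
    """Maximal reliability H = 1 / (1 + 1 / sum(l^2 / (1 - l^2)))."""
    l2 = np.asarray(std_loadings, dtype=float) ** 2
    return 1.0 / (1.0 + 1.0 / np.sum(l2 / (1.0 - l2)))

# Hypothetical usage: `data[["tps1", ..., "tps4"]]` would hold the four TPS
# items; the loadings below are illustrative placeholders.
# print(cronbach_alpha(data[["tps1", "tps2", "tps3", "tps4"]]))
print(round(coefficient_h([0.80, 0.82, 0.85, 0.78]), 2))  # ~0.89
```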

Table 2. Correlations, Alphas, Means and Standard Deviation of Scale Scores.

Moreover, moderate correlations were found between the focal scale and the two criterion measures (PHQ-9: r = −0.46; GAD-7: r = −0.42), suggesting that the TPS was related to symptoms of depression and anxiety while capturing different aspects of life functioning (see Note 1). These findings supported our third hypothesis, providing concurrent validity evidence. To examine whether the TPS is sensitive to change, we examined pre- to post-treatment changes using a subsample of clients (n = 32,819; i.e. those who had at least two sessions, regardless of baseline score). The mean number of sessions was 6.90 (SD = 6.59). The average baseline score on the TPS was 12.18 (SD = 2.99) and the average post-treatment score was 9.41 (SD = 3.43). The change from pre- to post-treatment was characterized by a large effect, Cohen’s d = 0.93. This large effect provided initial evidence that the TPS is sensitive to change, supporting our fourth hypothesis.
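
As a small arithmetic check on the reported effect size, the sketch below computes Cohen’s d for the pre- to post-treatment change using the means and standard deviations given above. Standardizing by the baseline SD reproduces the reported value of roughly 0.93, whereas a pooled-SD version yields a somewhat smaller value; which standardizer the authors used is not stated, so the choice shown first is an assumption.

```python
import math

pre_mean, pre_sd = 12.18, 2.99    # baseline TPS total score (from the text)
post_mean, post_sd = 9.41, 3.43   # post-treatment TPS total score (from the text)

# Standardized by the baseline SD (reproduces the reported d of ~0.93).
d_baseline = (pre_mean - post_mean) / pre_sd

# Alternative: standardized by the pooled SD of the two timepoints.
d_pooled = (pre_mean - post_mean) / math.sqrt((pre_sd**2 + post_sd**2) / 2)

print(round(d_baseline, 2), round(d_pooled, 2))  # 0.93 0.86
```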

Assessing Measurement Invariance of the TPS

MI of a measurement tool generally implies that the underlying construct is measured in the same way across different groups or measurement occasions (Meade & Lautenschlager, 2004; Millsap & Olivera-Aguilar, 2012). In this study, multigroup CFA models were used to assess the MI of the scale across gender and race/ethnicity groups. We conducted sequential invariance models (i.e. configural, metric, and scalar) for each type of measurement invariance: gender invariance and race/ethnicity invariance.

Model Configuration

The configural models (i.e. models 1 and 4) concerned whether the dimensionality and the pattern of factor-item relationships in the TPS were identical between the two gender groups and the two race/ethnicity groups. While holding the general latent structure equal, the configural models allowed item loadings and thresholds to be freely estimated within each demographic group. In the metric invariance models (i.e. models 2 and 5), the loadings of each item were constrained to be equal across gender or race/ethnicity groups, but item thresholds were freely estimated within each group. The scalar invariance models (i.e. models 3 and 6) imposed additional equality constraints on the item thresholds. The Mplus code for these models is available upon request from the corresponding author.
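
For readers less familiar with these constraint patterns, the following sketch writes the ordinal-CFA measurement model in generic textbook notation and lists which parameters each invariance level equates across groups; this is a standard formulation, not the authors’ exact Mplus parameterization. For item i in group g, with latent response $y^{*}_{ig}$ underlying the observed ordinal rating $y_{ig}$:

$$
y^{*}_{ig} = \lambda_{ig}\,\eta_{g} + \varepsilon_{ig},
\qquad
y_{ig} = c \;\;\text{iff}\;\; \tau_{c-1,ig} < y^{*}_{ig} \le \tau_{c,ig}.
$$

Configural (models 1 and 4): the same one-factor pattern holds in each group, with all loadings $\lambda_{ig}$ and thresholds $\tau_{c,ig}$ estimated freely within groups. Metric (models 2 and 5): loadings are constrained equal across groups ($\lambda_{ig} = \lambda_{i}$), while thresholds remain free. Scalar (models 3 and 6): both loadings and thresholds are constrained equal across groups ($\lambda_{ig} = \lambda_{i}$ and $\tau_{c,ig} = \tau_{c,i}$).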

Evaluating Measurement Invariance of the TPS

Table 3 presents the fit indices of the sequential MI models as well as the magnitude of changes for model comparisons. An invariance level (e.g. scalar invariance) is achieved when the imposed parameter constraints (e.g. equal loadings and thresholds) do not substantively worsen model fit indices. Scalar-level invariance is usually needed to establish MI and to meaningfully compare participants’ scores across subgroups (Dong & Dumas, 2020; Meade & Lautenschlager, 2004).

Table 3. Fit Indices of the Conducted MI Models.

Gender Invariance

Due to the low number of nonbinary clients, we excluded them from these analyses (see Limitations). In evaluating the gender invariance of the TPS, we first compared the fit indices of its configural (model 1) and metric (model 2) models. Imposing equal item loadings across gender groups resulted in a better fit (Δ RMSEA = −.015 and Δ TLI = .002), which provided strong evidence of metric invariance (Marsh, 2007). We further examined scalar invariance by comparing the fit indices of the corresponding metric (model 2) and scalar (model 3) models. As shown in Table 3, the scalar model showed improved model fit (Δ RMSEA = −.022 and Δ TLI = .001), even with the extra constraints on all item thresholds. Thus, scalar-level invariance was established, and this finding supported the MI of the TPS across gender groups (supporting hypothesis 5). Additionally, the mean TPS score for men was 3.00 (SD = 0.76) and for women it was 3.05 (SD = 0.77), indicating very similar mean-level scores.

Race/Ethnicity Invariance

We followed the same procedures to evaluate the race/ethnicity invariance of the measure. Because there were small sample sizes within some racial/ethnic groups, we collapsed racial/ethnic minorities into one group (see Limitations). Compared to the configural model (model 4), the metric model (model 5) showed no decrement in fit on any index (Δ RMSEA = −.022, Δ CFI = 0, and Δ TLI = .002), so metric-level invariance was satisfied. We next compared the metric (model 5) and scalar (model 6) models and found a slight decrease in CFI (Δ = −.002), an improvement in RMSEA (Δ = −.004), and no change in TLI. We concluded that scalar invariance was achieved for two main reasons. First, the magnitude of change in CFI was within an acceptable range (i.e. a drop in CFI of less than .01; Chen, 2007; Cheung & Rensvold, 2002). Second, the more parsimonious model (i.e. the scalar model) showed the same TLI and a better RMSEA than the more complex model (i.e. the metric model), which may be perceived as a more conservative source of evidence for MI (Coleman et al., 2022; Marsh, 2007). Taken together, the race/ethnicity invariance of the TPS was demonstrated (supporting hypothesis 6). Additionally, the mean TPS score for REM clients was 3.02 (SD = 0.82) and for White clients it was 3.06 (SD = 0.74), indicating very similar mean-level scores.

Discussion

The present study sought to validate the psychometric properties of the Therapy Progress Scale (TPS), a brief, four-item measure assessing between-session change on four distinct domains of life functioning, which was adapted from the ECHO Global for routine administration in MBC. The TPS was found to feature a unidimensional factor structure, possessing excellent reliability within the examined sample. Moreover, the TPS displayed sufficient measurement invariance across gender and race/ethnicity. The clinical utility of the TPS was also corroborated by the large-sized changes captured over the course of treatment, as well as by moderate correlations with two commonly used symptom-based measures.

The generalizability of the findings reported in the present study is supported by the vast naturalistic dataset that underpinned its analyses, which included over 36,000 psychotherapy clients receiving services from private practitioners in a variety of geographical locations in the United States. The ecological validity of these results is further substantiated by the heterogeneity of clinical presentations represented in the client sample, which mirrored the comorbidities, co-occurrences, and variety of diagnostic features observed in routine outpatient care (Shadish et al., 2000). As such, the TPS demonstrated the psychometric properties described above in a sample representative of the general outpatient clinical population.

Beyond the psychometric properties corroborated in the present study, it is paramount to underscore the current need for a brief, transdiagnostic, transtheoretical, and nonproprietary measure of life functioning specifically designed for MBC. Although MBC is an evidence-based practice supported and encouraged by the APA since 2006 (APA, 2006), utilization rates in routine clinical practice have been underwhelming (Ionita & Fitzpatrick, 2014). The discrepancy between MBC’s proven effectiveness and its implementation rates has been examined by a small yet cohesive body of research that identified a number of distinct factors contributing to this phenomenon. Hatfield and Ogles (2004, 2007) found that practicality was the primary concern reported by clinicians who did not utilize MBC, which included the time burden placed on clients by the routine completion of extended assessment measures. Thus, the development of brief measures, such as the TPS, specifically designed to minimize this additional burden placed on clients is of the utmost importance for the achievement of higher MBC utilization rates. Likewise, although MBC relying predominantly on symptom-based measures has inarguably been found to enhance treatment outcomes (e.g. Lambert et al., 2018), therapists’ perceptions of the clinical utility of these measures have been lukewarm (e.g. Jensen-Doss et al., 2018). Therefore, there is a need for greater availability of non-symptom-based routine outcome measures, which may be better aligned with theoretical approaches de-emphasizing diagnostic symptomatology. However, the concurrent validity demonstrated by the TPS in relation to some of the most commonly administered symptom-based measures (i.e. PHQ-9; GAD-7) suggests that the TPS could be utilized in routine clinical practice as a complementary addition to already well-established symptom-based assessment. This approach would afford therapists an additional tool for gaining a more comprehensive understanding of client change over the course of treatment.

The findings from the present study have meaningful implications for mental health practitioners. The TPS appears to have strong initial support for its validity and reliability. Thus, clinicians could integrate the TPS as part of their MBC practice. Since the TPS captures constructs related to life functioning that differ significantly from those measured by other commonly administered measures (e.g. symptom-based measures), it could provide clinicians and clients with an additional metric to track and evaluate treatment progress. Common suggestions for MBC are to administer the measure either weekly or biweekly and to discuss the scores with clients (see Boswell et al., 2023). Therapists could also track their average outcomes across their caseload. This feedback could highlight areas for growth, such as having better outcomes with one client group than another (e.g. White clients vs. REM clients).

Limitations

The present study featured several laudable qualities, such as the reliance on a vast, naturalistic sample, the representation of master’s-level clinicians, and the inclusion of a broad range of clinical presentations in the client sample. Nevertheless, there were also several notable factors that may limit the generalizability of its findings. First, the sample was recruited through a single behavioral health technology platform. As such, unique characteristics associated with this platform may have had an impact on the findings. Second, although representative of the largest group of mental health clinicians in the US, the sample was disproportionately composed of master’s-level clinicians. It is unclear if or how therapists with higher levels of training (e.g. doctoral degrees) may have differed in the outcomes produced over the course of treatment, and whether those differences would have been equally captured by the TPS. Third, the client sample was limited to clients receiving outpatient services. MBC has become the standard of practice in other clinical settings, such as partial hospitalization programs, inpatient psychiatric centers, and behavioral medicine. Therefore, the properties of the TPS should be examined in these settings by future research. Fourth, the factor invariance tests could not be conducted with some groups (e.g. nonbinary clients, specific racial/ethnic groups). Future studies should examine whether the TPS factor structure is consistent for these and other cultural groups. Finally, although an attempt was made to optimize the balance between brevity and comprehensiveness, it is possible that the measure’s items might miss life functioning changes that could have otherwise been captured by a scale with additional items.

In conclusion, the TPS demonstrated strong psychometric properties (e.g. factor structure, factor invariance, concurrent validity, reliability) within a large sample of clients in private practice. The four-item measure has appeal as an alternative or addition to symptom-based measures and other ultra-brief measures. Moreover, the change-oriented nature of the scale might be more appealing to clinicians monitoring treatment progress. Ultimately, we hope that by promoting more ultra-brief measures, therapists might find an ideal way to engage with MBC.

Significance Statement

The Therapy Progress Scale is a four-item measure designed for routine assessment of clients’ perspectives on treatment progress in multiple life functioning domains. The psychometric properties displayed in the present study support the use of this measure in outpatient clinical settings as a routine assessment tool facilitating clinical responsiveness.

Disclosure Statement

Dr. Bugatti and Dr. Owen have a contract with SonderMind, Inc. to conduct research. Dr. Richardson and Dr. Newton are employees of SonderMind.

Additional information

Notes on contributors

Matteo Bugatti

Matteo Bugatti, Ph.D., is a Research Assistant Professor at University of Denver, Department of Counseling Psychology. His research interests include measurement-based care, and psychotherapy process and outcomes.

Yixiao Dong

Yixiao Dong, Ph.D. is an Assistant Professor at University of Denver, Department of Research Methods and Statistics.

Jesse Owen

Jesse Owen, Ph.D. is a Professor at University of Denver, Department of Counseling Psychology.

Zachary Richardson

Zachary Richardson, Ph.D. is a Staff Data Scientist at SonderMind, Inc.

Wendy Rasmussen

Wendy Rasmussen, Ph.D. is the Director of Clinical Strategy at SonderMind, Inc.

Douglas Newton

Douglas Newton, M.D. is the Chief Medical Officer at SonderMind, Inc.

Notes

1 The correlation between the TPS and PHQ-9 for those who scored 10 or higher on the PHQ-9 was r = −0.35. For those clients who scored lower than 10 on the PHQ-9, the correlation was r = −0.21. Restricting the range naturally reduces the size of the correlations.

References

  • APA Presidential Task Force on Evidence-Based Practice. (2006). Evidence-based practice in psychology. The American Psychologist, 61(4), 271–285. https://doi.org/10.1037/0003-066X.61.4.271
  • Anker, M. G., Duncan, B. L., & Sparks, J. A. (2009). Using client feedback to improve couple therapy outcomes: A randomized clinical trial in a naturalistic setting. Journal of Consulting and Clinical Psychology, 77(4), 693–704. https://doi.org/10.1037/a0016062
  • Barkham, M., Lutz, W., & Castonguay, L. G. (2021). Bergin and Garfield’s handbook of psychotherapy and behavior change. Psychotherapie, Psychosomatik, Medizinische Psychologie, 72(1), 7–8. https://doi.org/10.1055/a-1686-4682
  • Barlow, D. H., Farchione, T. J., Bullis, J. R., Gallagher, M. W., Murray-Latin, H., Sauer-Zavala, S., Bentley, K. H., Thompson-Hollands, J., Conklin, L. R., Boswell, J. F., Ametaj, A., Carl, J. R., Boettcher, H. T., & Cassiello-Robbins, C. (2017). The unified protocol for transdiagnostic treatment of emotional disorders compared with diagnosis-specific protocols for anxiety disorders: A randomized clinical trial. JAMA Psychiatry, 74(9), 875–884. https://doi.org/10.1001/jamapsychiatry.2017.2164
  • Boswell, J. F., Hepner, K. A., Lysell, K., Rothrock, N. E., Bott, N., Childs, A. W., Douglas, S., Owings-Fonner, N., Wright, C. V., Stephens, K. A., Bard, D. E., Aajmain, S., & Bobbitt, B. L. (2023). The need for a measurement-based care professional practice guideline. Psychotherapy, 60(1), 1–16. https://doi.org/10.1037/pst0000439
  • Boswell, J. F., Kraus, D. R., Castonguay, L. G., & Youn, S. J. (2015). Treatment outcome package: Measuring and facilitating multidimensional change. Psychotherapy, 52(4), 422–431. https://doi.org/10.1037/pst0000028
  • Bugatti, M., & Boswell, J. F. (2022). Clinician perceptions of nomothetic and individualized patient-reported outcome measures in measurement-based care. Psychotherapy Research: Journal of the Society for Psychotherapy Research, 32(7), 898–909. https://doi.org/10.1080/10503307.2022.2030497
  • Chen, F. F. (2007). Sensitivity of goodness of fit indices to lack of measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 14(3), 464–504. https://doi.org/10.1080/10705510701301834
  • Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 9(2), 233–255. https://doi.org/10.1207/S15328007SEM0902_5
  • Coleman, J. J., Dong, Y., Dumas, D., Owen, J., & Kopta, M. (2022). Longitudinal measurement invariance of the Behavioral Health Measure in a clinical sample. Journal of Counseling Psychology, 69(1), 100–110. https://doi.org/10.1037/cou0000524
  • de Jong, K., Conijn, J. M., Gallagher, R. A., Reshetnikova, A. S., Heij, M., & Lutz, M. C. (2021). Using progress feedback to improve outcomes and reduce drop-out, treatment duration, and deterioration: A multilevel meta-analysis. Clinical Psychology Review, 85, 102002. https://doi.org/10.1016/j.cpr.2021.102002
  • Dong, Y., & Dumas, D. (2020). Are personality measures valid for different populations? A systematic review of measurement invariance across cultures, gender, and age. Personality and Individual Differences, 160, 109956. https://doi.org/10.1016/j.paid.2020.109956
  • Drinane, J. M., Owen, J., & Kopta, S. M. (2016). Racial/ethnic disparities in psychotherapy: Does the outcome matter? Testing, Psychometrics, and Methodology in Applied Psychology, 23(4), 531–544. https://doi.org/10.4473/TPM23.4.7
  • Dumas, D., & Dong, Y. (2019). Development and calibration of the student opportunities for deeper learning instrument. Psychology in the Schools, 56(9), 1381–1412. https://doi.org/10.1002/pits.22292
  • Hancock, G. R., & Mueller, R. O. (2001). Rethinking construct reliability within latent variable systems. In R. Cudeck, S. du Toit, & D. Sörbom (Eds.), Structural equation modeling: Present and future—A festschrift in honor of Karl Jöreskog (pp. 195–216). Scientific Software International.
  • Hatcher, R. L., & Barends, A. W. (1996). Patients’ view of the alliance in psychotherapy: Exploratory factor analysis of three alliance measures. Journal of Consulting and Clinical Psychology, 64(6), 1326–1336. https://doi.org/10.1037/0022-006X.64.6.1326
  • Hatfield, D. R., & Ogles, B. M. (2004). The use of outcome measures by psychologists in clinical practice. Professional Psychology: Research and Practice, 35(5), 485–491. https://doi.org/10.1037/0735-7028.35.5.485
  • Hatfield, D. R., & Ogles, B. M. (2007). Why some clinicians use outcome measures and others do not. Administration and Policy in Mental Health, 34(3), 283–291. https://doi.org/10.1007/s10488-006-0110-y
  • Hayes, J. A., McAleavey, A. A., Castonguay, L. G., & Locke, B. D. (2016). Psychotherapists’ outcomes with White and racial/ethnic minority clients: First, the good news. Journal of Counseling Psychology, 63(3), 261–268. https://doi.org/10.1037/cou0000098
  • Howard, G. S., Millham, J., Slaten, S., & O’Donnell, L. (1981). Influence of subject response style effects on retrospective measures. Applied Psychological Measurement, 5(1), 89–100. https://doi.org/10.1177/014662168100500113
  • Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. https://doi.org/10.1080/10705519909540118
  • Imel, Z. E., Baldwin, S., Atkins, D. C., Owen, J., Baardseth, T., & Wampold, B. E. (2011). Racial/ethnic disparities in therapist effectiveness: A conceptualization and initial study of cultural competence. Journal of Counseling Psychology, 58(3), 290–298. https://doi.org/10.1037/a0023284
  • Ionita, G., & Fitzpatrick, M. (2014). Bringing science to clinical practice: A Canadian survey of psychological practice and usage of progress monitoring measures. Canadian Psychology / Psychologie Canadienne, 55(3), 187–196. https://doi.org/10.1037/a0037355
  • Jacobson, N. S., & Truax, P. (1991). Clinical significance: A statistical approach to defining meaningful change in psychotherapy research. Journal of Consulting and Clinical Psychology, 59(1), 12–19. https://doi.org/10.1037//0022-006x.59.1.12
  • Jensen-Doss, A., Haimes, E. M. B., Smith, A. M., Lyon, A. R., Lewis, C. C., Stanick, C. F., & Hawley, K. M. (2018). Monitoring treatment progress and providing feedback is viewed favorably but rarely used in practice. Administration and Policy in Mental Health, 45(1), 48–61. https://doi.org/10.1007/s10488-016-0763-0
  • Jensen-Doss, A., Smith, A. M., Becker-Haimes, E. M., Mora Ringle, V., Walsh, L. M., Nanda, M., Walsh, S. L., Maxwell, C. A., & Lyon, A. R. (2018). Individualized progress measures are more acceptable to clinicians than standardized measures: Results of a national survey. Administration and Policy in Mental Health, 45(3), 392–403. https://doi.org/10.1007/s10488-017-0833-y
  • Johansson, R., Hesser, H., Ljótsson, B., Frederick, R. J., & Andersson, G. (2012). Transdiagnostic, affect-focused, psychodynamic, guided self-help for depression and anxiety through the internet: Study protocol for a randomised controlled trial. BMJ Open, 2(6), e002167. https://doi.org/10.1136/bmjopen-2012-002167
  • Kopta, M., Owen, J., & Budge, S. (2015). Measuring psychotherapy outcomes with the behavioral health measure–20: Efficient and comprehensive. Psychotherapy, 52(4), 442–448. https://doi.org/10.1037/pst0000035
  • Kroenke, K., Spitzer, R. L., & Williams, J. B. (2001). The PHQ-9: Validity of a brief depression severity measure. Journal of General Internal Medicine, 16(9), 606–613. https://doi.org/10.1046/j.1525-1497.2001.016009606.x
  • Lambert, M. J., Whipple, J. L., & Kleinstäuber, M. (2018). Collecting and delivering progress feedback: A meta-analysis of routine outcome monitoring. Psychotherapy, 55(4), 520–537. https://doi.org/10.1037/pst0000167
  • Lewis, C. C., Boyd, M., Puspitasari, A., Navarro, E., Howard, J., Kassab, H., Hoffman, M., Scott, K., Lyon, A., Douglas, S., Simon, G., & Kroenke, K. (2019). Implementing measurement-based care in behavioral health: A review. JAMA Psychiatry, 76(3), 324–335. https://doi.org/10.1001/jamapsychiatry.2018.3329
  • Marsh, H. W. (2007). Application of confirmatory factor analysis and structural equation modeling in sport/exercise psychology. In G. Tenenbaum & R. C. Eklund (Eds.), Handbook of sport psychology (3rd ed., pp. 774–798). Wiley. https://doi.org/10.1002/9781118270011.ch35
  • McNeish, D. (2018). Thanks coefficient alpha, we’ll take it from here. Psychological Methods, 23(3), 412–433. https://doi.org/10.1037/met0000144
  • Meade, A. W., & Lautenschlager, G. J. (2004). A comparison of item response theory and confirmatory factor analytic methodologies for establishing measurement equivalence/lnvariance. Organizational Research Methods, 7(4), 361–388. https://doi.org/10.1177/1094428104268027
  • Miller, S. D., Duncan, B. L., Brown, J., Sparks, J. A., & Claud, D. A. (2003). The outcome rating scale: A preliminary study of the reliability, validity, and feasibility of a brief visual analog measure. Journal of Brief Therapy, 2(2), 91–100.
  • Millsap, R. E., & Olivera-Aguilar, M. (2012). Investigating measurement invariance using confirmatory factor analysis. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 380–392). The Guilford Press.
  • Moore, J., & Owen, J. (2014). Assessing outcomes: Practical methods and evidence. Journal of College Counseling, 17(2), 175–185. https://doi.org/10.1002/j.2161-1882.2014.00056.x
  • Muthén, L. K., & Muthén, B. O. (2017). Mplus user’s guide. Muthén & Muthén.
  • Norcross, J. C., & Rogan, J. D. (2013). Psychologists conducting psychotherapy in 2012: Current practices and historical trends among division 29 members. Psychotherapy, 50(4), 490–495. https://doi.org/10.1037/a0033512
  • Owen, J., Coleman, J., Drinane, J., Tao, K., Imel, Z., Wampold, B., & Kopta, M. (2021). Psychotherapy racial/ethnic disparities in treatment outcomes: The role of university racial/ethnic composition. Journal of Counseling Psychology, 68(4), 418–424. https://doi.org/10.1037/cou0000548
  • Owen, J., Leach, M. M., Wampold, B., & Rodolfa, E. (2011). Multicultural approaches in psychotherapy: A rejoinder. Journal of Counseling Psychology, 58(1), 22–26. https://doi.org/10.1037/a0022222
  • Owen, J., Wong, Y. J., & Rodolfa, E. (2009). Empirical search for psychotherapists’ gender competence in psychotherapy. Psychotherapy, 46(4), 448–458. https://doi.org/10.1037/a0017958
  • Rapkin, B. D., & Schwartz, C. E. (2004). Toward a theoretical model of quality-of-life appraisal: Implications of findings from studies of response shift. Health and Quality of Life Outcomes, 2(1), 14. https://doi.org/10.1186/1477-7525-2-14
  • Reese, R. J., Slone, N. C., & Miserocchi, K. M. (2013). Using client feedback in psychotherapy from an interpersonal process perspective. Psychotherapy, 50(3), 288–291. https://doi.org/10.1037/a0032522
  • Seidel, J. A., Andrews, W. P., Owen, J., Miller, S. D., & Buccino, D. L. (2017). Preliminary validation of the Rating of Outcome Scale and equivalence of ultra-brief measures of well-being. Psychological Assessment, 29(1), 65–75. https://doi.org/10.1037/pas0000311
  • Shadish, W. R., Navarro, A. M., Matt, G. E., & Phillips, G. (2000). The effects of psychological therapies under clinically representative conditions: A meta-analysis. Psychological Bulletin, 126(4), 512–529. https://doi.org/10.1037/0033-2909.126.4.512
  • Spitzer, R. L., Kroenke, K., Williams, J. B., & Löwe, B. (2006). A brief measure for assessing generalized anxiety disorder: The GAD-7. Archives of Internal Medicine, 166(10), 1092–1097. https://doi.org/10.1001/archinte.166.10.1092
  • Wainer, H., & Thissen, D. (2001). True score theory: The traditional method. In D. Thissen & H. Wainer (Eds.), Test scoring (pp. 23–72). Lawrence Erlbaum Associates.
  • Wampold, B. E., & Imel, Z. E. (2015). The great psychotherapy debate: The evidence for what makes psychotherapy work. Routledge.