Measurement

The optimal short version of the Zarit Burden Interview for dementia caregivers: diagnostic utility and externally validated cutoffs

Pages 706-710 | Received 06 Dec 2017, Accepted 26 Feb 2018, Published online: 19 Mar 2018

ABSTRACT

Objectives: Using a sample of dementia caregivers, we compared the diagnostic utility of the various short versions of the Zarit Burden Interview (ZBI) with the original scale to identify the most optimal one. Next, we established externally validated cutoffs for the various ZBI versions using probable depression cases as a reference standard.

Methods: Caregivers (N = 394; 236 males; mean age = 56 years) were administered the ZBI and a self-report depression measure. Participants who exceeded the cutoff for the latter were identified as probable depression cases. For each of the ZBI versions, a receiver operating characteristic (ROC) curve was plotted against probable depression cases. The areas under the ROC curves (AUROC) of the short versions were then compared with that of the original using a non-parametric approach.

Results: Compared to the original ZBI, the AUROC were similar for the 6-item, 7-item, and two 12-item versions, but significantly worse for the other short variants. The sensitivity and specificity of the cutoffs for all ZBI versions ranged from 77.3% to 85.2% and 60.1% to 79.8%, respectively.

Conclusions: The original ZBI had good utility in identifying probable depression in caregivers, while the 6-item variant can be a useful alternative when short versions are preferred.

Introduction

As the prevalence of dementia soars due to rapidly aging populations in developed countries, an increasing number of people will have to care for a loved one with dementia and shoulder the associated burdens. Caregiver burden has been previously defined as the financial, physical and psychological consequences of caring for an adult with a disabling condition (George & Gwyther, 1986). Caring for a person with dementia (PWD) is a long-term commitment and may span up to 20 years after the initial dementia diagnosis (Karlin, Bell, Noah, Martichuski, & Knight, 1999). As a result of such long-term burden, meta-analytic research has documented that caregivers, relative to non-caregivers, are more likely to suffer from depression and physical illnesses, as well as experience a decrease in self-efficacy and subjective well-being (Pinquart & Sörensen, 2003). Hence, the need to assess caregiving-related burden cannot be overstated. Such assessments help inform the clinician whether interventions are needed and can ultimately improve the lived experience with dementia for both the caregiver and the PWD.

The Zarit Burden Interview (ZBI; Zarit & Zarit, 1987) is one of the most widely used measures to assess caregiver burden. Since its inception, this 22-item scale has been translated into many languages and used in many countries across a diverse range of caregivers and patient populations; meta-analytic research has suggested the ZBI to be reliable across the diverse contexts in which it has been used (Bachner & O'Rourke, 2007). To facilitate quicker administration, several short versions of the ZBI have been developed using various methods, ranging from single-item to 18-item versions. Whitlatch, Zarit, and von Eye (1991) carried out an exploratory factor analysis (EFA) and identified the two factors of personal strain and role strain, consisting of 18 items in total. Arai, Tamiya, and Yano (2003) similarly obtained the two factors of personal and role strain in their EFA. However, their model consisted of eight items. In their EFA, Knight, Fox, and Chou (2000) identified the three factors of embarrassment/anger, patient's dependency and self-criticism, which collectively consisted of 14 items. Bédard et al. (2001) created the 4-item and 12-item versions by choosing four and 12 items, respectively, with the highest item-total correlation within a similar personal and role strain two-factor model that emerged as the optimal model in their EFA. Hébert, Bravo, and Préville (2000) obtained another 12-item version by carrying out an EFA based on the Whitlatch et al. (1991) two-factor model and selecting the items which constitute the most parsimonious structure of the two-factor model. These studies were carried out on dementia caregiver populations. Using data from caregivers of palliative care patients, Gort et al. (2005) produced their short version by having an expert committee select seven items. Higginson, Gao, Jackson, Murray, and Harding (2010) modified this 7-item version by removing the global item question (item 22) to obtain a 6-item version, which was validated in a mixed sample of caregivers of patients with dementia, cancer and brain injury. Within the same study, the authors also examined a 1-item (global item question) version. The list of included items of these different short versions is presented in Table 1.

Table 1. Overview of all studied ZBI short variants.

Nevertheless, the ZBI and its short variants are fraught with a few problems. First, having multiple short versions can be problematic as it makes comparison between them difficult, and it is unclear if these short versions are equivalent to the original in terms of diagnostic utility. It would also be useful to know which of these versions is optimal, that is, the version with the fewest items that has comparable diagnostic utility to the original. Furthermore, these short variants and the original lack cutoffs that are validated against a clinically significant outcome. For instance, the cutoffs in the original ZBI (Zarit & Zarit, 1987) and one of the 12-item versions (Bédard et al., 2001) were determined arbitrarily. While Higginson et al. (2010) reported the cutoffs for various short versions of the ZBI in their study, they obtained these cutoffs by using the arbitrarily determined cutoff in the original as a reference standard. As such, these cutoffs lack external validity; indeed, O'Rourke and Tuokko (2003) observed that the cutoff for the 12-item variant (Bédard et al., 2001) had low levels of sensitivity in classifying probable depression cases. The lack of validated cutoffs for the ZBI and its short variants makes it difficult to assess the clinical significance of the scores and, more importantly, the need for intervention.

Therefore, the current study aimed to address these issues and better inform clinicians on the use of the ZBI and its short versions. We compared the diagnostic utility, via the area under the receiver operating characteristic curve (AUROC), between the various ZBI short versions and the original, to identify the optimal ZBI short version. Next, we sought to determine externally validated cutoffs for the various ZBI variants. Given that a specific clinical diagnosis does not exist to characterize excessive caregiver burden, similar to a previous validation study (O'Rourke & Tuokko, 2003), we opted to use depression, a common consequence of such excessive burden (Alspaugh, Stephens, Townsend, Zarit, & Greene, 1999; Berger et al., 2005), as a reference standard. That is, if one becomes depressed, it would be reasonable to assume that such burden is excessive.

Methods

Participants and procedures

Participants were recruited from the dementia services of two tertiary hospitals in the north-eastern part of Singapore. We used a consecutive sampling method and had a response rate of 87.8%. Participants were included if they were: (1) aged 21 or above; (2) spouses or children of a PWD; and (3) caring for a PWD who was residing in the community. A total of 394 caregivers for persons at various stages of dementia were recruited. Table 2 presents the demographic characteristics of these caregivers and the persons they were caring for. The PWDs were mostly females (mean age = 79.5; SD = 8.2) and at the moderate and severe stages of dementia. Their caregivers (mean age = 53.0; SD = 10.7) were mostly Chinese, married, children of the PWD and the primary caregiver of the PWD. On average, they had cared for the PWD for 6.8 years (SD = 6.7).

Table 2. Demographic information of the caregivers and the persons with dementia (n = 394).

At the respective clinics in the hospitals, these caregivers gave their informed consent before completing, onsite, the self-administered demographic questionnaire, the ZBI, and the Center for Epidemiological Studies Depression Scale (CES-D). This study received ethical approval from the Domain Specific Review Board of the National Healthcare Group, Singapore.

Measures

The ZBI (Zarit & Zarit, 1987) measures the perceived burden of caregivers via 5-point Likert scale items. These items were summed to produce a total score ranging from 0 to 88. According to the original test instructions, a score range of 61–88 indicates high burden. The ZBI has demonstrated good reliability and validity for assessing caregiver burden in our local context (Yap, 2010). In addition to the original 22-item total score, we also calculated the total scores for the different short versions of the ZBI by summing their respective items (see Table 1).
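The scoring described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code. It assumes each of the 22 items is scored 0 to 4, which is consistent with the 0 to 88 total. Only two item facts come from the text: the 1-item version is the global item (item 22), and the 6-item version is the 7-item version without item 22. Every other item number below is a placeholder and should be replaced with the item lists in Table 1.

```python
from typing import Dict, List, Sequence

# Placeholder item numbers (1-based). Only the role of item 22 is taken from
# the text; the remaining numbers are hypothetical stand-ins for Table 1.
SEVEN_ITEM_PLACEHOLDER: List[int] = [1, 2, 3, 4, 5, 6, 22]

ZBI_VERSIONS: Dict[str, List[int]] = {
    "22-item (original)": list(range(1, 23)),
    "7-item (placeholder items)": SEVEN_ITEM_PLACEHOLDER,
    "6-item (placeholder items)": [i for i in SEVEN_ITEM_PLACEHOLDER if i != 22],
    "1-item (global item)": [22],
}


def score_zbi(responses: Sequence[int], items: Sequence[int]) -> int:
    """Sum the 0-4 responses for the given 1-based item numbers."""
    if len(responses) != 22:
        raise ValueError("expected 22 item responses")
    if any(r not in range(5) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    return sum(responses[i - 1] for i in items)


def score_all_versions(responses: Sequence[int]) -> Dict[str, int]:
    """Total score for every version defined in ZBI_VERSIONS."""
    return {name: score_zbi(responses, items) for name, items in ZBI_VERSIONS.items()}
```

Because every short version is a subset of the same 22 responses, all totals can be derived from a single administration of the full scale, which is how the short-version scores were obtained in this study.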

The CES-D is a 20-item, self-report scale which assesses depressive symptoms in the past one week (Radloff & Radloff, 1977). Each item is scored on a 4-point Likert scale to reflect the frequency of the described symptom. The total score for this scale ranges from 0 to 60. A cutoff score of ≥ 16, as suggested by the original authors, was used in the current study to classify participants as having probable depression. This cutoff has been validated in the local context and has demonstrated high levels of sensitivity (≥ 90.9%; Stahl et al., 2008), although its specificity was relatively low (≤ 67.6%) in classifying depression cases. Such levels of specificity were similar to the pooled specificity statistic obtained via a meta-analysis (Vilagut, Forero, Barbaglia, & Alonso, 2016). Furthermore, in view of the trade-off between sensitivity and specificity, it may be better to opt for higher sensitivity than specificity, especially since false-negative diagnoses may have more costly consequences than false-positive diagnoses in the clinical context.
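As a small illustration of this classification rule, the sketch below sums 20 item scores and flags totals of 16 or more. It assumes each item has already been keyed to a 0 to 3 score (which yields the 0 to 60 range stated above). The function names are hypothetical.

```python
from typing import Sequence

CESD_CUTOFF = 16  # cutoff used in the study: total score >= 16


def cesd_total(item_scores: Sequence[int]) -> int:
    """Sum 20 item scores, each 0-3 and assumed to be already keyed for scoring."""
    if len(item_scores) != 20:
        raise ValueError("expected 20 item scores")
    if any(s not in range(4) for s in item_scores):
        raise ValueError("each item score must be an integer from 0 to 3")
    return sum(item_scores)


def is_probable_depression(item_scores: Sequence[int], cutoff: int = CESD_CUTOFF) -> bool:
    """True when the total score reaches the cutoff (probable depression case)."""
    return cesd_total(item_scores) >= cutoff
```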

Analyses

For each of the studied ZBI variants, we plotted a receiver operating characteristic (ROC) curve against probable depression cases and calculated the AUROC. Next, we assessed whether the AUROC of each ZBI short version was significantly different from that of the original via a non-parametric approach (DeLong, DeLong, & Clarke-Pearson, 1988), which yields confidence intervals and standard errors of the differences in AUROC. Correlations between variables were examined using Pearson correlation coefficients. Statistical significance was set at p < 0.05. Given that multiple comparisons were carried out, we also computed Bonferroni-adjusted p-values. All analyses were performed using Stata version 14.
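For readers who want to see the mechanics, the following Python sketch shows one way to compute an AUROC, compare two correlated AUROCs with the non-parametric method for correlated ROC curves, and apply a Bonferroni adjustment. It is an illustration under stated assumptions, not the Stata routine used in the study: labels are coded 1 for probable depression and 0 otherwise, both scores come from the same participants, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def _psi(x: float, y: float) -> float:
    """Mann-Whitney kernel: 1 if the case outscores the non-case, 0.5 if tied."""
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _split(scores: Sequence[float], labels: Sequence[int]):
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    return cases, controls


def _components(cases: Sequence[float], controls: Sequence[float]):
    """AUROC plus the per-case and per-control placement values."""
    v10 = [sum(_psi(x, y) for y in controls) / len(controls) for x in cases]
    v01 = [sum(_psi(x, y) for x in cases) / len(cases) for y in controls]
    return sum(v10) / len(v10), v10, v01


def _cov(a: Sequence[float], b: Sequence[float]) -> float:
    """Sample covariance (n - 1 denominator)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case scores higher than a random non-case."""
    return _components(*_split(scores, labels))[0]


def delong_test(scores_a, scores_b, labels) -> Tuple[float, float, float, float]:
    """Compare two correlated AUROCs; returns (auc_a, auc_b, z, two-sided p)."""
    auc_a, v10a, v01a = _components(*_split(scores_a, labels))
    auc_b, v10b, v01b = _components(*_split(scores_b, labels))
    m, n = len(v10a), len(v01a)  # number of cases and non-cases
    var = ((_cov(v10a, v10a) + _cov(v10b, v10b) - 2 * _cov(v10a, v10b)) / m
           + (_cov(v01a, v01a) + _cov(v01b, v01b) - 2 * _cov(v01a, v01b)) / n)
    if var <= 0:
        raise ValueError("variance of the AUROC difference is zero; test undefined")
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each p-value by the number of comparisons, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]
```

In this setting, `delong_test` would be called once per short version against the 22-item total, and the resulting p-values passed together to `bonferroni`.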

Results

The correlations between the total scores of the various ZBI versions and the CES-D are presented in Table 3. All correlations were statistically significant (Bonferroni-adjusted ps < 0.001). The 6-item and 7-item versions had the highest correlation (r = 0.73) with CES-D scores while the 1-item version had the lowest (r = 0.56). Table 4 presents the results of the AUROC analyses. The original ZBI had an AUROC of 0.859. All ZBI variants yielded at least acceptable AUROC (ranging from 0.779 to 0.866). The AUROC of the 6-item (Higginson et al., 2010), 7-item (Gort et al., 2005) and the two 12-item (Bédard et al., 2001; Hébert et al., 2000) ZBI variants were not significantly different from that of the original ZBI (all Bonferroni-adjusted ps > 0.05), while the rest had significantly worse AUROC relative to the original ZBI.

Table 3. Correlation between scores of CES-D and the various ZBI versions

Table 4. AUROC statistics and comparison of the ZBI variants with the original 22-item version, and cutoffs related information

The derived cutoffs for all ZBI versions had acceptable levels of sensitivity and specificity (> 70%), except for the 1-item version (Higginson et al., 2010; see Table 4). At the optimal cutoff of ≥ 34, the original ZBI had a sensitivity of 85.2% and a specificity of 74.8%. Detailed results of the sensitivity and specificity statistics for all ZBI variants are reported in the Supplementary materials.
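To make the cutoff statistics concrete, the sketch below computes sensitivity and specificity for a rule of the form "score ≥ cutoff" and then searches for a cutoff. The text does not state which criterion defined the optimal cutoff, so the use of Youden's J (sensitivity + specificity − 1) here is an assumption for illustration only, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def sens_spec(scores: Sequence[float], labels: Sequence[int], cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when a score >= cutoff counts as screen-positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff_youden(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    Assumed criterion, not taken from the study. On exact ties the lowest
    cutoff is kept, which favors sensitivity.
    """
    best = None
    for c in sorted(set(scores)):
        se, sp = sens_spec(scores, labels, c)
        j = se + sp - 1
        if best is None or j > best[0]:
            best = (j, c, se, sp)
    return best[1], best[2], best[3]
```

Here `labels` would be the CES-D classification (1 for probable depression, 0 otherwise) and `scores` the total for one ZBI version, so the search is repeated once per version.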

Discussion

The aim of the current study was twofold: to identify the optimal ZBI short version and to derive externally validated cutoffs for the different versions of the ZBI. First, we compared the AUROC between the ZBI short versions and the original. The 6-item version (Higginson et al., 2010) emerged as the optimal short version, having the fewest items while demonstrating diagnostic utility comparable to that of the original 22-item version. The cutoff for this version also had relatively high specificity and sensitivity in classifying probable depression cases. Interestingly, while this 6-item version had a comparable AUROC value to the original 22-item version, a few of the longer ZBI versions (Arai et al., 2003; Knight et al., 2000; Whitlatch et al., 1991) actually had significantly lower AUROC values than the original. One possible explanation for this is that the extra items beyond the 6-item version may be less relevant in the context of depression, hence their inclusion is unlikely to increase the scale's diagnostic utility as measured via the accuracy in classifying probable depression cases. Relatedly, in showing that ZBI scores correlate significantly and highly with depression scores, we have also demonstrated the excellent convergent validity of the ZBI and provided evidence to confirm similar findings reported previously (Tang et al., 2016).

Next, using probable depression cases as a reference standard, we derived the cutoffs for the full ZBI and its shorter variants. It is interesting to note that the cutoffs reported previously (Bédard et al., 2001; Higginson et al., 2010; Zarit & Zarit, 1987) were lower than our validated cutoffs. Given that lower cutoffs generally correspond to lower specificity, this suggests that the previous cutoffs could suffer from reduced specificity in assessing caregiver burden and consequently result in false-positive errors (i.e. caregivers identified as having excessive burden when they in fact do not). Hence, one should exercise caution in interpreting the results of previous research that used these cutoffs.

These findings have two important implications. Firstly, the externally validated cutoffs for the ZBI afford clinicians greater confidence in identifying caregivers who are experiencing high levels of caregiving-related burden, to the extent that they might be suffering from depression or at risk of developing it. This will facilitate early intervention for these caregivers and efforts to improve the caregiving experience for both the caregiver and the PWD. Secondly, because the 6-item, 7-item and both 12-item versions of the ZBI showed diagnostic utility comparable to that of the original 22-item version, these shorter versions can be used to assess caregiver burden with greater convenience and confidence.

Some limitations of the study are noteworthy. First, the participants were recruited from two hospitals in one region of Singapore, hence they may not be geographically representative. Second, the participants were recruited from dementia services in tertiary centers and may therefore not be fully representative of those in the community. However, this is less likely to be a problem considering that most PWDs in Singapore receive care from tertiary centers. Third, the caregiver sample included only spouses and children of the PWD, hence these findings may not apply to other caregivers such as the PWD's children-in-law or paid caregivers. Fourth, Chinese caregivers were over-represented in the sample; as such, the results may only be applicable to the Chinese population. In relation to these generalizability issues, future studies may consider replicating these findings in other populations. Finally, given the cross-sectional nature of this study, it is possible that the depressive symptoms assessed among caregivers may not be a consequence of caregiving-related burden but rather the result of other unrelated stressors. As such, future studies may consider verifying the current findings with a longitudinal design.

Supplemental material

Sup-mat-The_optimal_short_version_of_the_Zarit_Burden_Interview-Yu.docx

Acknowledgment

The authors thank the participants and staff at the Institute of Mental Health and Khoo Teck Puat Hospital for their support of this research.

Disclosure statement

The authors have no conflicts of interest to declare.

Additional information

Funding

This work was supported by the National Medical Research Council, Ministry of Health, Singapore [grant numbers NMRC/CG/004/2013, NMRC/Fellowship/0030/2016 to T.M.L, NMRC/CSSSP/0014/2017 to T.M.L] and pilot funding from the National University of Singapore.

References

  • Alspaugh, M. E. L., Stephens, M. A. P., Townsend, A. L., Zarit, S. H., & Greene, R. (1999). Longitudinal patterns of risk for depression in dementia caregivers: Objective and subjective primary stress as predictors. Psychology and Aging, 14(1), 34–43. Retrieved from https://doi.org/10.1037/0882-7974.14.1.34
  • Arai, Y., Tamiya, N., & Yano, E. (2003). The short version of the Japanese version of the Zarit caregiver burden interview (J-ZBI_8): Its reliability and validity. Nihon Ronen Igakkai Zasshi. Japanese Journal of Geriatrics, 40(5), 497–503.
  • Bachner, Y. G., & O'Rourke, N. (2007). Reliability generalization of responses by care providers to the Zarit Burden Interview. Aging & Mental Health, 11(6), 678–685. Retrieved from https://doi.org/10.1080/13607860701529965
  • Bédard, M., Molloy, D. W., Squire, L., Dubois, S., Lever, J. A., & O'Donnell, M. (2001). The Zarit Burden Interview: A new short version and screening version. Gerontologist, 41(5), 652–657. Retrieved from https://doi.org/10.1093/geront/41.5.652
  • Berger, G., Bernhardt, T., Weimer, E., Peters, J., Kratzsch, T., & Frolich, L. (2005). Longitudinal study on the relationship between symptomatology of dementia and levels of subjective burden and depression among family caregivers in memory clinic patients. Journal of Geriatric Psychiatry and Neurology, 18(3), 119–128. Retrieved from https://doi.org/10.1177/0891988704273375
  • DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics, 44(3), 837–845.
  • George, L. K., & Gwyther, L. P. (1986). Caregiver well-being: A multidimensional examination of family caregivers of demented adults. The Gerontologist, 26(3), 253–259.
  • Gort, A. M., March, J., Gómez, X., de Miguel, M., Mazarico, S., & Ballesté, J. (2005). Short Zarit scale in palliative care. Medicina Clinica, 124(17), 651–653.
  • Hébert, R., Bravo, G., & Préville, M. (2000). Reliability, validity and reference values of the Zarit Burden Interview for assessing informal caregivers of community-dwelling older persons with dementia. Canadian Journal on Aging / La Revue Canadienne Du Vieillissement, 19(4), 494–507. Retrieved from https://doi.org/10.1017/S0714980800012484
  • Higginson, I. J., Gao, W., Jackson, D., Murray, J., & Harding, R. (2010). Short-form Zarit caregiver burden interviews were valid in advanced conditions. Journal of Clinical Epidemiology, 63(5), 535–542. Retrieved from https://doi.org/10.1016/j.jclinepi.2009.06.014
  • Karlin, N. J., Bell, P. A., Noah, J. L., Martichuski, D. K., & Knight, B. L. (1999). Assessing Alzheimer's support group participation: A retrospective follow-up. American Journal of Alzheimer's Disease, 14(6), 326–333. Retrieved from https://doi.org/10.1177/153331759901400607
  • Knight, B. G., Fox, L. S., & Chou, C.-P. (2000). Factor structure of the burden interview. Journal of Clinical Geropsychology, 6(4), 249–258. Retrieved from https://doi.org/10.1023/A:1009530711710
  • Longmire, C. V. F., & Knight, B. G. (2011). Confirmatory factor analysis of a brief version of the Zarit Burden Interview in black and white dementia caregivers. Gerontologist, 51(4), 453–462. Retrieved from https://doi.org/10.1093/geront/gnr011
  • O'Rourke, N., & Tuokko, H. A. (2003). Psychometric properties of an abridged version of the Zarit Burden Interview within a representative Canadian caregiver sample. The Gerontologist, 43(1), 121–127. Retrieved from https://doi.org/10.1093/geront/43.1.121
  • Pinquart, M., & Sörensen, S. (2003). Differences between caregivers and noncaregivers in psychological health and physical health: A meta-analysis. Psychology and Aging, 18(2), 250.
  • Radloff, L. S., & Radloff, L. S. (1977). Center for epidemiologic studies depression scale (CES-D). Applied Psychological Measurement 1(1), 385–401.
  • Stahl, D., Sum, C. F., Lum, S. S., Liow, P. S., Chan, Y. H., Verma, S.,… Chong, S. A. (2008). Screening for depressive symptoms: Validation of the CES-D scale in a multi-ethnic group of patients with diabetes in Singapore. Diabetes Care, 31, 1118–1119.
  • Tang, J. Y., Ho, A. H., Luo, H., Wong, G. H., Lau, B. H., Lum, T. Y., & Cheung, K. S. (2016). Validating a Cantonese short version of the Zarit Burden Interview (CZBI-Short) for dementia caregivers. Aging & Mental Health, 20(9), 996–1001. Retrieved from https://doi.org/10.1080/13607863.2015.1047323
  • Vilagut, G., Forero, C. G., Barbaglia, G., & Alonso, J. (2016). Screening for depression in the general population with the center for epidemiologic studies depression (CES-D): A systematic review with meta-analysis. PloS One, 11(5), e0155431.
  • Whitlatch, C. J., Zarit, S. H., & von Eye, A. (1991). Efficacy of interventions with caregivers: A reanalysis. The Gerontologist, 31(1), 9–14. Retrieved from https://doi.org/10.1093/geront/31.1.9
  • Yap, P. (2010). Validity and reliability of the Zarit Burden Interview in assessing caregiving burden. Ann Acad Med Singapore, 39, 758–763.
  • Zarit, S. H., & Zarit, J. M. (1987). Instructions for the burden interview. University Park: Pennsylvania State University.