Research Articles

Psychometric validity of the Montgomery and Åsberg Depression Rating Scale for Youths (MADRS-Y)

Pages 421-431 | Received 17 Jun 2022, Accepted 05 Oct 2022, Published online: 01 Nov 2022

Abstract

Background

Because of all the serious consequences of major depressive disorder (MDD), it is important to screen for MDD in adolescents. The aim of this study was to test the psychometric properties of the newly developed self-report depression scale MADRS-Y for adolescents in a normative Swedish sample.

Methods

The study included 620 adolescents aged 12–20 years. The normative sample was randomly split into two equal parts so that principal component analysis (PCA) could be performed on sample one and confirmatory factor analysis (CFA) on sample two. We also examined measurement invariance, internal consistency, test-retest reliability, and convergent and discriminant validity.

Results

The result from the PCA suggested that all 12 potential items should be used, and the items loaded on the same construct of depression. The CFA supported the one-factor structure with good fit indices. Measurement invariance was confirmed, allowing interpretation regardless of gender or age differences. Reliability was good, α .89, for both samples separately. Test-retest reliability was good to excellent (intraclass correlation coefficients = .87 and .91). Evidence of convergent and discriminant validity was shown.

Conclusions

The results in the current study suggest that the MADRS-Y is a brief, reliable, and valid self-report questionnaire of depressive symptoms for adolescents in the general population.

Major depressive disorder (MDD [Citation1]) in adolescence is a serious and common global health problem [Citation2,Citation3]. The probability of depression rises from around 5% in early adolescence to as high as 20% at the end of adolescence [Citation3]. Untreated depression in adolescence carries an increased risk of suicide [Citation3], substance use disorders [Citation4], and numerous negative psychosocial consequences and impairments [Citation5–7]. Because of all the serious consequences of MDD, it is important to screen for depressive symptoms and to evaluate the effect of treatment on them. In the current study, we investigated the psychometric properties of a version of the self-report questionnaire Montgomery – Åsberg Depression Rating Scale – Self report (MADRS-S) [Citation8] adapted for adolescents: the Montgomery – Åsberg Depression Rating Scale – Youth (MADRS-Y).

Self-report questionnaires for depression are time efficient and require few staff resources [Citation9]. They can be especially useful when multiple assessments are needed [Citation10]. Adolescent self-assessments yield more reported internalizing symptoms than parent ratings [Citation11], and there is agreement that adolescents can report their internal state more accurately than other informants [Citation12]. In some areas, such as the assessment of suicidality, patients’ self-ratings can outperform clinicians’ ratings [Citation13].

Many of the commonly used MDD self-report questionnaires for children and adolescents have not been sufficiently psychometrically evaluated for use in this age group [Citation14]. Questionnaires for adolescents are often inadequate because they take a childhood or adult perspective and do not consider the unique developmental period of adolescence [Citation15]. In addition, it is important that instructions and item descriptions are age-adjusted and easy to understand for adolescents [Citation16].

The criteria for MDD in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM–5) for adolescents and adults are identical, with the important exception that irritability counts as a core symptom for adolescent MDD and can replace depressed mood [Citation1]. In a study with male adolescents (n = 1004, aged 18 years), the symptom ‘irritability’ appeared to be the best indicator of male depression [Citation17], and it has also been shown to be one of the most common symptoms in girls [Citation18]. Adolescents diagnosed with MDD report symptoms of both insomnia and hypersomnia [Citation19], which are both included in the DSM-5 sleep disturbance criterion for MDD, and they report weight changes and appetite disturbances in both directions [Citation20–22], which are also included in the DSM-5 MDD criteria.

To better capture adolescent MDD [Citation17,Citation19–23] and the DSM-5 criteria [Citation1], we developed an adapted version of the adult MADRS-S. The content of the items in MADRS-Y is compatible with the DSM-5 diagnostic criteria for MDD, except for the item worry/anxiety, which is not part of the DSM-5 MDD criteria, even though the DSM-5 describes anxiety as a prominent feature of MDD [Citation1]. Another difference is that the DSM-5 has an MDD criterion describing psychomotor disturbance, which is not included in the MADRS-Y because it is much less common [Citation1].

The aim of this study was to test the psychometric properties of the newly developed self-report scale in Swedish, MADRS-Y, for adolescents in a normative sample.

Method

Study setting

Data was collected in the project called Adolescents’ experience of mental illness – psychometric properties of new Swedish versions of tests (UPOP). The purpose of the project was to develop new instruments for screening and to evaluate treatment effects of mental illness among young patients. The project was approved by the Swedish Regional Ethical Review Board in Umeå (number 2018/59-31). The study was conducted in a medium-sized town (90 000 inhabitants) and its surroundings in the northern part of Sweden. Data collection took place during the years 2018 to 2019.

Procedure

The MADRS-Y questionnaire was developed from the original version MADRS-S by our research group. Instructions and item descriptions were age-adapted throughout the questionnaire (shorter sentences, easier and modernized language) to be easily understood and administered. We also adjusted the MADRS-Y questionnaire to the DSM-5 MDD criteria [Citation1]. Four adolescents (15–18 years old) filled in the MADRS-Y questionnaire and gave feedback through interviews before we distributed the form to the participants in this study. No changes were made based on the interviews. Four schools from different socioeconomic areas were included. The students were asked to fill out the self-report questionnaire online. Verbal and written information was given to students by research assistants, and written consent was then obtained from the students who chose to participate. For the students younger than 15 years of age, informed consent was also obtained from the parents. During the time of participation, a research assistant and a teacher were present to answer questions from the participants. The procedure has been described in detail elsewhere [Citation24].

Participants

The study population was a convenience sample of students aged 12–20 years from four schools. The schools for the older age group offered different programs. Exclusion criteria were (1) lack of fluency in written Swedish and (2) inability to complete online forms. Of the 897 students who were asked to participate in the study, 620 (69%) chose to do so. All 620 respondents were included in the analyses, of whom 383 (61.8%) were girls. Most of the respondents were born in Sweden (88%).

To avoid performing principal component analysis (PCA) and confirmatory factor analysis (CFA) on the same sample, the sample was randomly split into two equal parts [Citation25]: sample one (n = 310) and sample two (n = 310). To measure socioeconomic status, a Swedish socioeconomic classification system [Citation26,Citation27] was used to estimate the households’ places in a socioeconomic ranking based on six different classes of parents’ socioeconomic status; see Table 1.

Table 1. Descriptive statistics for MADRS-Y.

The adolescents were asked to complete the same self-reports again after a period of three weeks to obtain data on test-retest reliability. In total, 173 (55.8%) adolescents from sample one and 175 (56.5%) from sample two completed the re-test. In sample one, there were 117 (67.6%) girls, and their ages ranged from 12 to 18 years old (M = 15.46, SD = 1.70). In sample two, there were 121 (69.1%) girls, and their ages ranged from 12 to 19 years old (M = 15.34, SD = 1.65).

Measures

All measures in the current study are self-report questionnaires in Swedish.

Description of the self-assessment measure MADRS-Y

Montgomery – Åsberg Depression Rating Scale – Youth (MADRS-Y)

MADRS-Y is a brief self-report questionnaire for adolescents including 12 items. It has been developed from the MADRS-S [Citation8] and measures symptoms of depression. As in MADRS-S, the items assess reported sadness, anxiety, reduced sleep, reduced appetite, concentration difficulties, lassitude, inability to feel, pessimistic thoughts, and suicidal thoughts. However, in MADRS-Y, there are also three new items: irritability, increased sleep, and increased appetite. Each item is rated on a scale from 0 (low) to 6 (high), giving a maximum total score of 72. Higher scores indicate more severe depression. MADRS-Y begins with instructions asking respondents to rate how they are feeling right now and how they have felt during the last three days.
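
As a minimal illustration of the scoring described above, the following Python sketch sums twelve item ratings into a total score. The function name `madrs_y_total` and the input checks are assumptions made for illustration and are not part of the published instrument. The sketch also assumes that the total is the plain sum of the items, which is what a maximum of 72 for 12 items rated 0 to 6 implies.

```python
N_ITEMS = 12          # number of MADRS-Y items stated in the text
MIN_RATING = 0        # lowest item rating
MAX_RATING = 6        # highest item rating


def madrs_y_total(item_ratings):
    """Return the total score as the sum of the 12 item ratings.

    Hypothetical helper: assumes the total score is the plain sum of items.
    """
    if len(item_ratings) != N_ITEMS:
        raise ValueError("expected 12 item ratings")
    for rating in item_ratings:
        if not MIN_RATING <= rating <= MAX_RATING:
            raise ValueError("each item is rated from 0 to 6")
    return sum(item_ratings)


# The highest possible total equals the maximum of 72 given in the text.
highest_total = madrs_y_total([MAX_RATING] * N_ITEMS)
```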

Convergent validity

Montgomery – Åsberg Depression Rating Scale – Self report (MADRS-S)

MADRS-S is an adult self-report questionnaire of depressive symptoms [Citation8,Citation28]. The scale includes nine items, each rated from 0 (low) to 6 (high). The maximum score is 54, and higher scores indicate more severe depression [Citation28]. MADRS-S is a reliable and valid instrument for adults with good sensitivity for assessing MDD. The internal consistency measured with Cronbach’s alpha has been shown to be between .76 and .94 [Citation29]. Only one study has validated MADRS-S in an adolescent psychiatric sample [Citation29]; it found that the psychometric properties and diagnostic accuracy of the MADRS-S (n = 105, aged 12–17 years) were good (e.g. Cronbach’s alpha .87). In the current study, Cronbach’s alpha was .84 in sample one and .87 (95% CI) in sample two.

WHO-five well-being index (WHO-5)

WHO-5 [Citation30] is a short global rating scale with five statements measuring subjective well-being [Citation31,Citation32]. The respondent is asked to rate how well the five statements apply to the last 14 days. Each item is scored from 0 (not present) to 5 (constantly present), where higher scores mean better well-being [Citation32]. The scale has shown adequate validity [Citation32]. The WHO-5 has demonstrated acceptable sensitivity and good specificity compared to other depression screeners for adolescents, as well as good psychometric qualities (Cronbach’s alpha .85) [Citation33]. Cronbach’s alpha in the current study was .86 for sample one and .88 for sample two (95% CI).

The revised child anxiety and depression scale (RCADS)

The RCADS [Citation34] is a scale for children and adolescents measuring anxiety and depression. RCADS contains 47 items, of which 37 items measure symptoms of anxiety and 10 items measure symptoms of depression. Items are scored 0 (never) to 3 (always) [Citation34]. Because some items were not collected due to a mistake when programming the web survey, we used only the anxiety scale. The mean alpha value for the anxiety scale has been found to be the same as for the total score (.93) [Citation35]. It has been shown to be a valid and reliable instrument (e.g. Cronbach’s alpha of .96) [Citation36]. In the current study, Cronbach’s alpha was .89 for sample one and .97 for sample two (95% CI).

Reynolds adolescent depression scale second edition (RADS-2)

RADS-2 [Citation37] is a self-report questionnaire measuring depressive symptoms in adolescents. The scale includes 30 items and each item scores from 0 (almost never) to 4 (most of the time). Higher scores indicate higher symptom severity. The RADS-2 is a well validated self-report questionnaire and has also been shown to be a reliable instrument to use in a Swedish community setting [Citation38]. Cronbach’s alpha in the current study was .87 for sample one and .88 for sample two (95% CI).

Divergent validity

Patient-reported outcomes measurement information system (PROMIS)

PROMIS, developed by the US National Institutes of Health, is designed to develop, validate, and standardize item banks to measure patient-reported health [Citation39]. The following PROMIS item banks [Citation39,Citation40] were used to investigate divergent validity: PROMIS Pediatric Item Bank v2.0 - Peer Relationships (15 items) [Citation41], PROMIS® Pediatric Item Bank v1.0 - Family Relationships - Short Form 8a (8 items) [Citation42,Citation43], and PROMIS® Pediatric Item Bank v1.0 - Physical Activity (10 items) [Citation44–46]. These three item banks were chosen because they measure constructs that are related to, but distinct from, depression. All PROMIS forms use a five-point Likert scale ranging from 1 (e.g. ‘never’ or ‘not at all’) to 5 (e.g. ‘always’ or ‘almost always’). In the current study, Cronbach’s alphas for sample one were .93 for peer relationships, .95 for family relationships, and .94 for physical activity; for sample two they were .91 for peer relationships, .95 for family relationships, and .94 for physical activity (95% CI).

Statistical analysis

We randomly split the sample into two equal halves (two samples with n = 310), using SPSS, version 27.0. Thereafter, we analyzed the samples separately in all analyses. In sample one, a PCA was performed to explore the factor structure. In sample two, a CFA was calculated to examine whether the same factor structure was obtained as in the PCA when using a new sample.
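
The split itself was performed in SPSS. The Python sketch below only illustrates the general idea of a random split into two equal halves. The function name `split_in_half`, the fixed seed, and the use of integers as participant identifiers are hypothetical choices made for this example.

```python
import random


def split_in_half(participant_ids, seed=0):
    """Shuffle the identifiers and cut the shuffled list into two halves.

    Hypothetical helper; the seed only makes the example reproducible.
    """
    ids = list(participant_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


# 620 placeholder identifiers, matching the number of respondents in the study.
sample_one, sample_two = split_in_half(range(620))
```

With 620 respondents, each half contains 310 participants and no participant appears in both halves, so the exploratory and confirmatory analyses are run on non-overlapping data.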

Descriptive statistics were calculated first. T-tests were used [Citation47] to test for gender, age, and socioeconomic differences, with Cohen’s d as the effect size. Cohen’s guidelines for effect size are: d ≥ .2 a small effect, d ≥ .5 a medium effect, and d ≥ .8 a large effect [Citation48]. The division into the two age groups, 12–15 years and 16–20 years, follows the division used in the Swedish school system.

We calculated corrected item-total correlations (ritc). A correlation less than .3 indicates that the item does not correlate well with the overall scale and should be removed [Citation49,Citation50]. Inter-correlations between items were calculated using Pearson’s r. Cohen’s guidelines for effect size are: r ≥ .1 a small effect, r ≥ .3 a medium effect, and r ≥ .5 a large effect [Citation48].
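
To make the corrected item-total correlation concrete, the sketch below correlates each item with the sum of the remaining items, so that an item is not correlated with itself. This is an illustrative Python version and not the SPSS procedure used in the study. The function names and the small `toy_responses` data set are invented for the example and are not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses):
    """Correlate each item with the total of all other items.

    responses: one row per respondent, each row a list of item ratings.
    """
    n_items = len(responses[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # total without item j
        correlations.append(pearson_r(item, rest))
    return correlations


# Invented ratings for four respondents on three items (not study data).
toy_responses = [[0, 1, 0], [2, 2, 1], [4, 3, 5], [6, 5, 6]]
ritc = corrected_item_total(toy_responses)
# Items with a value below .3 would be candidates for removal.
flagged = [j for j, r in enumerate(ritc) if r < 0.3]
```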

The reliability of the scale was calculated by using Cronbach’s α, and an α between .70 and .90 is considered to be good [Citation51].
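
Cronbach’s α can be written as α = k/(k − 1) × (1 − Σ item variances / variance of the total score), where k is the number of items. The sketch below is a minimal Python version of that formula, using sample variances from the standard library. The function name `cronbach_alpha` and the two-item toy data are assumptions for illustration only.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item ratings (one row per respondent)."""
    k = len(responses[0])
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented ratings for four respondents on two items (not study data).
toy_items = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(toy_items)  # 0.75 for this toy data
```

For the toy data, each item has a variance of 5/3 and the total score has a variance of 16/3, so α = 2 × (1 − (10/3)/(16/3)) = .75, which falls in the range described above as good.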

Test–retest reliability over a 3-week period of the MADRS-Y was estimated using intraclass correlation coefficients (ICCs). ICC estimates and their 95% confidence intervals were calculated with absolute agreement type and a two-way mixed model. Values less than .5 are indicative of poor reliability, values between .5 and .75 indicate moderate reliability, values between .75 and .9 indicate good reliability, and values greater than .9 indicate excellent reliability [Citation52].
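
The following sketch shows one common way to compute an absolute-agreement ICC from a two-way layout of participants by measurement occasions. It assumes the single-measures form, often written ICC(A,1); the text does not state whether single or average measures were used, so this choice, the function name, and the toy scores are assumptions and not the study’s SPSS output.

```python
def icc_absolute_agreement_single(scores):
    """Single-measures, absolute-agreement ICC from two-way mean squares.

    scores: one row per participant, one column per occasion (test, retest).
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # occasions
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Invented test and retest totals (not study data).
same_scores = [[2, 2], [10, 10], [25, 25], [40, 40]]
shifted_scores = [[2, 7], [10, 15], [25, 30], [40, 45]]
icc_same = icc_absolute_agreement_single(same_scores)
icc_shifted = icc_absolute_agreement_single(shifted_scores)
```

In the toy data, identical test and retest scores give an ICC of 1, whereas a constant shift of five points at retest lowers the coefficient below 1. This is the practical meaning of absolute agreement: a systematic change between occasions counts against reliability even when the rank order of participants is unchanged.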

To select the final items for the MADRS-Y, we analyzed all 12 potential items using PCA on sample one. PCA is used to explore which factors or items load on the same constructs and to obtain simpler, more interpretable factors [Citation53]. Whether a loading is significant depends on the sample size: with a sample of more than 300 participants, loadings of .30 should be significant [Citation54]. We used an oblique rotation (Oblimin in SPSS) in the PCA because almost all items were intercorrelated. First, the assumptions for the PCA model were tested. The Kaiser-Meyer-Olkin (KMO) measure was used to test for sampling adequacy. The KMO varies between 0 and 1, and values smaller than 0.5 indicate that the sample is too small [Citation55]. Bartlett’s test should be significant to show that correlations between variables are significantly different from zero. However, this test is dependent on sample size and is often significant. Correlations between items should not be too high: correlations above .80 [Citation50] indicate that the items are measuring the same thing.

A scree plot and eigenvalues were calculated. The criteria were that each principal component should explain at least 5% of the variance, that the cumulative variance should be at least 7%, and that eigenvalues should be greater than 1 [Citation56].

To confirm the latent structure of the MADRS-Y in sample two, we used a single-factor CFA. The internal consistency was examined using the lavaan package for structural equation modelling, version 0.6-3 (BETA) [Citation57]. Fit indices used were the Satorra–Bentler chi-square (SB χ2) and the ratio χ2/df (≤ 3.00 [Citation58]), the comparative fit index (CFI; > .90 [Citation59]), the Tucker–Lewis index (TLI; > .90 [Citation59]), the root mean square error of approximation (RMSEA; < .08 [Citation60]), and the standardized root mean square residual (SRMR; < .08 [Citation61]). A robust diagonally weighted least squares estimator with a scaled-shifted test statistic (DWLSSS) was used because we had a non-normal distribution and ordinal scale responses [Citation62]. Diagonally weighted least squares (DWLS) uses a polychoric correlation matrix, which is not sensitive to non-normal distributions.

Measurement equivalence, or invariance, across sex and age groups was tested. First, a baseline model or configural invariance model was estimated [Citation63]. Second, a test of invariant factor loadings or metric invariance was done. Third, invariant factor covariance or scalar invariance was tested. When examining goodness of fit, it is recommended not to rely only on chi-square (χ2), because it is sensitive to sample size [Citation64]. Thus, we also used ΔRMSEA, ΔCFI, and Satorra-Bentler’s test. Our interpretations followed the recommendations for small and skewed data: ΔRMSEA ≤ .01 can be interpreted as an excellent fit (close fit: .01–.05; fair fit: .05–.08; mediocre fit: .08–.10; poor fit: .10+) [Citation65]. For a summary of different goodness of fit indices, see [Citation63]. A decrease in CFI of .01 or less is taken as evidence of measurement invariance (the null hypothesis of invariance should not be rejected), even with 150 participants per group [Citation65,Citation66]. A non-significant Satorra-Bentler test indicates that there is no difference between the models [Citation63].

We calculated evidence of the convergent and divergent validity by using Pearson’s correlations (r) between MADRS-Y and the convergent and divergent validity measures.

A variable correlation plot based on principal component analysis was produced to show the relationships between the MADRS-Y and all convergent and divergent validity scales. Positively correlated variables were grouped together, and negatively correlated variables were positioned on opposite sides in the plot [Citation67].

Results

Descriptive statistics of sample one and two

Descriptive statistics were calculated for sample one and two separately; see Table 1. The two samples were equally sized (n = 310) and did not differ significantly in gender, age, SES, symptoms on MADRS-Y, or scores on any of the validity scales.

Gender, age, and socioeconomic differences

T-tests showed that girls rated significantly more depressive symptoms than boys on the MADRS-Y total scale in sample one (p = .032) and in sample two (p < .001), with small effect sizes; see Table 2.

Table 2. Internal consistency and T-test between gender, for the Swedish sample of MADRS-Y total score.

For age differences, the samples were divided into two groups: 12–15 years old (sample one N = 134, sample two N = 148) and 16–20 years old (sample one N = 176, sample two N = 162). In sample one, the 12–15-year-olds reported significantly (p < .001) fewer depressive symptoms (M = 8.81, SD = 9.54) than the 16–20-year-olds (M = 13.35, SD = 10.24), with a small effect size (d = –.46, 95% CI [–.68, –.23]). In sample two, there was no significant difference (p = .235) between the age groups 12–15 years (M = 11.57, SD = 11.99) and 16–20 years (M = 13.02, SD = 9.26), with a small effect size (d = –.137, 95% CI [–.36, .09]).

For socioeconomic differences, the samples were divided into two groups: group one comprised students, manual workers, and clerical or office workers (sample one N = 143, sample two N = 150), and group two comprised higher civil servants and executives, and the self-employed (sample one N = 124, sample two N = 128). In sample one, there was no significant difference (p = .094) in scores between group one (M = 10.23, SD = 8.98) and group two (M = 12.21, SD = 10.28), with a small effect size (d = –.21, 95% CI [–.45, .04]). In sample two, there was likewise no significant difference (p = .531) in scores between group one (M = 12.61, SD = 10.53) and group two (M = 11.82, SD = 10.28), with a small effect size (d = .08, 95% CI [–.16, .31]).
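
The effect sizes for the age and socioeconomic comparisons can be checked approximately from the reported means, standard deviations, and group sizes. The sketch below assumes Cohen’s d with a pooled standard deviation, which is a common definition but is not stated explicitly in the text; the function name `cohens_d` is hypothetical. The sign of d follows the order in which the two groups are entered.

```python
from math import sqrt


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d from summary statistics, using the pooled standard deviation."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_variance)


# Age groups in sample one: 12-15 years versus 16-20 years.
d_age_sample_one = cohens_d(8.81, 9.54, 134, 13.35, 10.24, 176)

# Socioeconomic groups: group one versus group two.
d_ses_sample_one = cohens_d(10.23, 8.98, 143, 12.21, 10.28, 124)
d_ses_sample_two = cohens_d(12.61, 10.53, 150, 11.82, 10.28, 128)
```

Under this assumption the values come out at about –.46 for age in sample one and about –.21 for socioeconomic group in sample one. For socioeconomic group in sample two the magnitude is about .08, with a positive sign because group one has the higher mean.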

Descriptive statistics and corrected item-total correlation of the items

Before calculating PCA and CFA, descriptive statistics of the items were obtained; see Table 3. The corrected item-total correlation (ritc) was higher than .3 for all items in both samples. The items that measured increased sleep and increased appetite had the lowest mean ratings and the lowest item-total correlations (.32–.43) in both samples; however, they contributed to ordinal alpha.

Table 3. Descriptive statistics for MADRS-Y.

Correlations between items in sample one

Pearson’s correlations showed that most correlations were larger than .32, supporting the use of an oblique (correlating) rotation in the PCA. In total, 47 correlations ranged from .32 to .74, and 19 correlations ranged from .14 to .31. All correlations were significant at the 0.01 level. Inter-correlations between the MADRS-Y items are shown in Table 4.

Table 4. Intercorrelations between MADRS-Y items using Pearson’s r for sample one and sample two.

Principal component analysis in sample one

A PCA with oblique rotation (direct oblimin) was conducted with the 12 items on sample one. The Kaiser-Meyer-Olkin measure verified the sampling adequacy for the analysis, KMO = .88, which is well above the acceptable limit of 0.5 [Citation55]. Bartlett’s test was significant. An initial analysis was run to obtain eigenvalues for each factor in the data. Two factors had eigenvalues over Kaiser’s criterion of 1; the first explained 44.86% of the variance and the second explained 9.11%. The scree plot justified retaining only one factor, a depression factor. Table 5 shows the factor loadings of the PCA.
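
The link between the reported percentages of explained variance and Kaiser’s criterion can be made explicit with a short calculation. The sketch below assumes that the PCA was run on the correlation matrix of the 12 items, so that the eigenvalues sum to 12 and each eigenvalue equals the proportion of variance explained multiplied by 12. This assumption and the function name are not taken from the text.

```python
N_ITEMS = 12  # number of MADRS-Y items entered into the PCA


def eigenvalue_from_percent(percent_explained, n_items=N_ITEMS):
    """Convert a percentage of explained variance into an eigenvalue.

    Assumes a PCA on a correlation matrix, where eigenvalues sum to n_items.
    """
    return percent_explained / 100 * n_items


first_eigenvalue = eigenvalue_from_percent(44.86)   # about 5.38
second_eigenvalue = eigenvalue_from_percent(9.11)   # about 1.09
```

Under this assumption the second component has an eigenvalue only slightly above 1, which is consistent with it passing Kaiser’s criterion while the scree plot still pointed to a single factor.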

Table 5. Principal Component Analysis (PCA) (Sample One) and Confirmatory Factor Analysis (CFA) (Sample Two) Results for the MADRS-Y.

Correlations between items in sample two

Pearson’s correlations ranged from .15 to .71. There was only one pair that did not correlate (increased appetite and decreased sleep, r = .08). Inter-correlations between the MADRS-Y items are shown in Table 4.

Confirmatory factor analysis in sample two

A CFA of the MADRS-Y yielded a one-factor model for the Swedish normative sample (sample two). The first analysis showed a mixed fit of the model to the data; χ2/df and RMSEA were higher than the recommended cut-offs: χ2 (54) = 205.68, χ2/df = 3.81, CFI = .96, TLI = .95, RMSEA = 0.10 (90% CI .08 to .11), SRMR = .06. The scale showed standardized factor loadings higher than .40 for all items; see Table 5.
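
The comparison of the reported fit indices with the cut-offs given in the statistical analysis section can be written out as a small check. The Python sketch below uses only the values and cut-offs stated in the text. The dictionary keys and the function name `failing_indices` are hypothetical, and the code does not fit any model.

```python
# Cut-offs as stated in the statistical analysis section.
CUTOFFS = {
    "chi2_df": lambda value: value <= 3.00,
    "cfi": lambda value: value > 0.90,
    "tli": lambda value: value > 0.90,
    "rmsea": lambda value: value < 0.08,
    "srmr": lambda value: value < 0.08,
}

# Fit indices reported for the one-factor CFA in sample two.
reported_fit = {
    "chi2_df": 205.68 / 54,  # chi-square divided by degrees of freedom
    "cfi": 0.96,
    "tli": 0.95,
    "rmsea": 0.10,
    "srmr": 0.06,
}


def failing_indices(fit):
    """Return the names of the indices that do not meet their cut-off."""
    return sorted(name for name, meets in CUTOFFS.items() if not meets(fit[name]))


not_met = failing_indices(reported_fit)
```

For the reported values, only χ2/df and RMSEA miss their cut-offs, while CFI, TLI, and SRMR meet theirs.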

Measurement invariance in sample two

In testing measurement invariance for gender, the configural invariance model demonstrated an acceptable fit to the data, indicating that the factor structure of the MADRS-Y was equivalent for boys and girls. The metric invariance model showed a good fit, as seen in Table 6, except for ΔRMSEA, which had a close fit, indicating that items load in the same way for boys and girls, despite the constraint of ‘thresholds’. The scalar invariance model showed a good fit (‘thresholds’ and ‘loadings’ were constrained), indicating that there are no observable sex differences between the intercepts in the models. Satorra-Bentler’s tests were not significant, indicating a good model.

Table 6. Measurement Invariance Goodness of Fit for the One-Factor Model of MADRS-Y.

Measurement invariance for age

When testing measurement invariance for age, two age groups were used in the calculations: 12 to 15 years old (n = 151, M = 14.13, SD = .19), and 16 to 20 years old (n = 108, M = 17.16, SD = 1.04).

The configural invariance model demonstrated a close fit to the data, as seen in Table 6; only RMSEA was too high (.097), indicating that the factor structure of the MADRS-Y was possibly equivalent for the two age groups. The metric invariance model and the scalar invariance model showed a good fit, except for ΔRMSEA, which had a close fit, indicating that the factor loadings of the items and the intercepts were the same between the two age groups. Satorra-Bentler’s tests were not significant, indicating a good model.

Internal consistency and test-retest reliability

Internal consistency for MADRS-Y was good (α = .89 in both samples separately), as seen in Table 2. Internal consistency by gender was also good.

For test-retest reliability, ICC estimates were calculated. For sample one (n = 173), the three-week test-retest ICC was good, .87 (95% CI .82, .90; p < .001), and for sample two (n = 175), the ICC was excellent, .91 (95% CI .87, .93; p < .001).

Convergent and divergent validity

Table 7 displays the convergent and divergent validity. The results for convergent validity showed high correlations between all measures and MADRS-Y in both sample one (ranging from −.61 to −.90) and sample two (ranging from −.73 to −.90). For divergent validity, the results for sample one (ranging from −.32 to −.41) and sample two (ranging from −.21 to −.43) showed small to moderate significant associations between the MADRS-Y and the three scales.

Table 7. Pearson’s correlations between MADRS-Y and validity measures.

A PCA was conducted to visualize the correlations between the MADRS-Y and the validity scales; see Figure 1. The analysis showed that MADRS-Y had a strong positive correlation with the depression scales MADRS-S and RADS-2 and with the anxiety scale RCADS anxiety. The MADRS-Y had a strong negative correlation with well-being (WHO-5). Furthermore, the analysis demonstrates clearly that the MADRS-Y measures a different construct than the PROMIS scales.

Figure 1. Principal Component Analyses of MADRS-Y total scale and all validity measures. Note. MADRS-S = Montgomery and Asberg Depression Rating Scale – Self-rated [Citation8]; RADS-2 = Reynolds Adolescent Depression Scale Version 2 [Citation32]; RCADS Anx = The Revised Child Anxiety and Depression Scale - total Anxiety scale [Citation34]; WHO-5 = WHO Well-being Index [Citation31]; PROMIS Family = PROMIS Pediatric Short Form v.1.0—Family relationships 8a; PROMIS Friends = PROMIS Pediatric Bank v2.0 – Peer relationships; PROMIS Physical Activity = PROMIS Pediatric Bank v1.0—Physical activity [Citation45].


Discussion

The present study aimed to test the psychometric properties of the newly developed self-report scale MADRS-Y for adolescents in a normative sample.

Girls rated significantly more depressive symptoms than boys in both samples, with small effect sizes according to Cohen [Citation48]. This difference in gender was expected, based on previous literature [Citation68]. For age, there was a small significant difference in sample one, but not in sample two, indicating fewer depressive symptoms in the younger group of 12–15 years old, in comparison to age group 16–20 years old. This is in line with international reports [Citation69] and literature [Citation3].

The result from the PCA suggested that all 12 items should be used in MADRS-Y. The items loaded on the same construct of depression. Correlations between items were not too high (none above .80), which supports that the items do not measure the same thing. The CFA supported the one-factor structure with good fit indices, except for RMSEA. Research has found that the model fit index SRMR is more accurate than RMSEA when data are ordinal [Citation70].

Measurement invariance was confirmed using the guidelines of Finch et al. for small and skewed data [Citation65]. We found MADRS-Y to be equivalent between girls and boys, and between age groups (12 to 15 years old, and 16 to 20 years old). It can therefore be a useful measure, allowing interpretation regardless of gender or age differences when screening adolescents for depression.

The reliability of the MADRS-Y scale calculated by Cronbach’s α was good, .89 for both samples separately. Test-retest reliability over three weeks was good to excellent (ICC = .87 and .91) in the current study.

Evidence of convergent and discriminant validity was shown. As expected, the MADRS-Y had the highest correlation with the MADRS-S indicating that MADRS-Y is measuring the same construct, but with a scale adapted for adolescents with easier and modernized language, and new items capturing irritability, increased appetite, and increased sleep.

The three new items in MADRS-Y

The new item ‘irritability’ showed a high item-total correlation (ritc = .6 in sample one and ritc = .66 in sample two). This is in line with the fact that irritability is one of the core symptoms that can be observed in adolescent MDD [Citation1,Citation17,Citation18,Citation23] and is related to general symptom severity of MDD in adolescents [Citation18].

The items that measured increased sleep and increased appetite had the lowest item-total correlation in both samples (ritc = .32–.43). However, the lowest rating was still over .3 and may therefore be included as a valid item in the self-report scale. One possible explanation for the lower ratings of increased sleep and increased appetite can be that these two items are associated with a subtype of depression described as ‘atypical depression’, in which the most important symptoms are the reverse vegetative features of overeating and oversleeping [Citation71]. These two items were significantly correlated to each other in both samples one and two (r = .23 and r = .34) in the current study. Atypical depression is associated with a younger age of depression onset, usually during adolescence, and with concurrent anxiety, and is predominant in females [Citation72–74]. It is important to capture these subtypes of depression for potential personalized treatment [Citation72].

Interestingly, we found that many adolescents reported both decreased and increased sleep and decreased and increased appetite during the last three days. Correlations between decreased and increased sleep were significant and moderate (r = .41 and r = .37), and between decreased and increased appetite significant but small (r = .14 and r = .15). Adolescent sleep is typically more variable compared to children and adults [Citation75] and eating behavior is often more irregular [Citation76]. Clinically, sleeping disorders are common in adolescents and correlated with depressive symptoms [Citation77]. Eating disorders are common in adolescent girls and about 12% may experience some form of bulimia nervosa or binge eating disorder [Citation78]. Importantly, there is a strong association between bulimia nervosa and MDD in adolescent girls [Citation79].

Compliance with DSM-5 diagnostic criteria

Including the three new items, the content of all MADRS-Y items can be found in the DSM-5 diagnostic criteria for MDD, except for the MADRS-Y anxiety item [Citation1]. In the current study, the anxiety item showed a high item-total correlation (ritc = .69 in sample one and ritc = .67 in sample two). Depressive symptoms are significantly associated with several types of anxiety in adolescents [Citation80], and depression and anxiety do not represent distinct constructs across development [Citation81].

Limitations

There are several limitations in this study. A convenience sample was used, which limits generalizability. The gender distribution was skewed, which might have biased the results. Measurement invariance was tested despite low power (100–150 in each group); however, this was corrected for in the interpretations [Citation65,Citation66]. Limits on respondent burden precluded examination of monotrait-heteromethod validation. In addition, we did not investigate the adolescents’ subjective experience of filling out the MADRS-Y.

Conclusions

The CFA in the second half of the sample confirmed that the one-factor structure of the MADRS-Y had a good fit. Measurement invariance indicates that the self-report scale can be used regardless of gender and age group (12- to 20-year-old youth). The MADRS-Y is a brief, reliable, and valid self-report questionnaire that can be used to screen for MDD among adolescents in the general population.

Future studies in clinical samples are needed to show the clinical usefulness of the scale for screening and for assessing treatment response. Furthermore, more advanced statistical methods based on modern test theory, such as item response theory, can be used to further evaluate and develop highly reliable items [Citation82,Citation83].

Acknowledgements

The authors would like to thank all the adolescents who participated in this project. Correspondence concerning this article should be addressed to Magnus Vestin, Department of Clinical Science, Child and Adolescent Psychiatry, Umeå University, SE-90185 Umeå, Sweden.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data that support the findings of this study are available from the corresponding author, M.V., upon reasonable request.

Additional information

Funding

This study was supported by clinical research funding (ALF) from Västerbotten county council.

Notes on contributors

Magnus Vestin

Magnus Vestin is a PhD student at the Department of Clinical Science, Child and Adolescent Psychiatry unit, Umeå University, Sweden. He is a licensed psychologist and a specialist in clinical child and youth psychology at the Child and Adolescent Psychiatry clinic in Umeå.

Marie Åsberg

Marie Åsberg is an emeritus professor of psychiatry at Karolinska Institutet in Stockholm. Her research has mainly focused on the assessment and treatment of depression, the psychobiology of suicide, and, more recently, stress-related exhaustion disorder. She is coauthor of the Montgomery-Åsberg Depression Rating Scale (MADRS).

Marie Wiberg

Marie Wiberg is a professor of statistics with a specialty in psychometrics at Umeå University, Sweden. Her research focuses on test score equating, international large-scale assessments, parametric and nonparametric item response theory, and psychometrics in general.

Eva Henje

Eva Henje is a senior specialist and associate professor in Child and Adolescent Psychiatry at the Department of Clinical Science at Umeå University. Her clinical work and research are mainly focused on trauma- and stress-related disorders and depression in youth.

Inga Dennhag

Inga Dennhag, PhD, is a psychologist and psychotherapist. She teaches and conducts research in Child and Adolescent Psychiatry at Umeå University in Sweden. Her main research areas are psychometrics, sexual harassment, teenage depression, and trauma. Her book Power and Psychotherapy was published in 2017.

References

  • American Psychiatric Association. 2013. Diagnostic and statistical manual of mental disorders (5th ed.). Virginia: APA.
  • Gore FM, Bloem PJ, Patton GC, et al. Global burden of disease in young people aged 10–24 years: a systematic analysis. Lancet. 2011;377(9783):2093–2102.
  • Thapar A, Collishaw S, Pine DS, et al. Depression in adolescence. Lancet. 2012;379(9820):1056–1067.
  • Birmaher B, Brent D, Bernet W, et al. Practice parameter for the assessment and treatment of children and adolescents with depressive disorders. J Am Acad Child Adolesc Psychiatry. 2007;46(11):1503–1526.
  • Clayborne ZM, Varin M, Colman I. Systematic review and meta-analysis: adolescent depression and long-term psychosocial outcomes. J Am Acad Child Adolesc Psychiatry. 2019;58(1):72–79.
  • Fletcher JM. Adolescent depression: diagnosis, treatment, and educational attainment. Health Econ. 2008;17(11):1215–1235.
  • Lewinsohn PM, Rohde P, Seeley JR. Major depressive disorder in older adolescents: prevalence, risk factors, and clinical implications. Clin Psychol Rev. 1998;18(7):765–794.
  • Svanborg P, Åsberg M. A new self-rating scale for depression and anxiety states based on the comprehensive psychopathological rating scale. Acta Psychiatr Scand. 1994;89(1):21–28.
  • Möller HJ. Rating depressed patients: observer- vs self-assessment. Eur Psychiatry. 2000;15(3):160–172.
  • Bondolfi G, Jermann F, Rouget BW, et al. Self- and clinician-rated Montgomery-Åsberg Depression Rating Scale: evaluation in clinical practice. J Affect Disord. 2010;121(3):268–272.
  • Vanwoerden S, Steinberg L, Coffman AD, et al. Evaluation of the PAI-A anxiety and depression scales: evidence of construct validity. J Pers Assess. 2018;100(3):313–320.
  • Smith S. Making sense of multiple informants in child and adolescent psychopathology. J Psychoeduc Assess. 2007;25(2):139–149.
  • Joiner TE, Jr., Walker RL, Pettit JW, et al. Evidence-based assessment of depression in adults. Psychol Assess. 2005;17(3):267–277.
  • Stockings E, Degenhardt L, Lee YY, et al. Symptom screening scales for detecting major depressive disorder in children and adolescents: a systematic review and meta-analysis of reliability, validity and diagnostic utility. J Affect Disord. 2015;174:447–463.
  • Wright AJ. Clinical applications of European adolescent assessment research. J Pers Assess. 2020;102(3):440–442.
  • Domanska OM, Firnges C, Bollweg TM, et al. Do adolescents understand the items of the European health literacy survey questionnaire (HLS-EU-Q47) – German version? Findings from cognitive interviews of the project "measurement of health literacy among adolescents" (MOHLAA) in Germany. Arch Public Health. 2018;76(1):46.
  • Möller-Leimkühler AM, Heller J, Paulus NC. Subjective well-being and ‘male depression’ in male adolescents. J Affect Disord. 2007;98(1–2):65–72.
  • Crowe M, Ward N, Dunnachie B, et al. Characteristics of adolescent depression. Int J Ment Health Nurs. 2006;15(1):10–18.
  • Lovato N, Gradisar M. A meta-analysis and model of the relationship between sleep and depression in adolescents: recommendations for future research and clinical practice. Sleep Med Rev. 2014;18(6):521–529.
  • Maxwell MA, Cole DA. Weight change and appetite disturbance as symptoms of adolescent depression: toward an integrative biopsychosocial model. Clin Psychol Rev. 2009;29(3):260–273.
  • Mills JG, Thomas SJ, Larkin TA, et al. Overeating and food addiction in major depressive disorder: links to peripheral dopamine. Appetite. 2020;148:104586.
  • Rice F, Riglin L, Lomax T, et al. Adolescent and adult differences in major depression symptom profiles. J Affect Disord. 2019;243:175–181.
  • Nardi B, Francesconi G, Catena-Dell’osso M, et al. Adolescent depression: clinical features and therapeutic strategies. Eur Rev Med Pharmacol Sci. 2013;17(11):1546–1551.
  • Ståhl S, Dennhag I. Online and offline sexual harassment associations of anxiety and depression in an adolescent sample. Nord J Psychiatry. 2021;75(5):330–335.
  • Fokkema M, Greiff S. How performing PCA and CFA on the same data equals trouble. Eur J Psychol Assess. 2017;33(6):399–402.
  • Statistics Sweden. 1984. Socioekonomisk indelning (SEI) www.scb.se/contentassets/6ffc47f46c8d4391a5798b7757af29df/ov9999_1982a01_br_x11op8204.pdf
  • Statistics Sweden. 2019. SEI yrkesförteckning version 2019-02-21. https://www.scb.se/contentassets/22544e89c6f34ce7ac2e6fefbda407ef/sei_index_webb_20190221.pdf.
  • Svanborg P, Åsberg M. A comparison between the Beck Depression Inventory (BDI) and the self-rating version of the Montgomery Åsberg Depression Rating Scale (MADRS). J Affect Disord. 2001;64(2–3):203–216.
  • Ntini I, Vadlin S, Olofsdotter S, et al. The Montgomery and Åsberg Depression Rating Scale – self-assessment for use in adolescents: an evaluation of psychometric and diagnostic accuracy. Nord J Psychiatry. 2020;74(6):415–422.
  • World Health Organization. 1998. Wellbeing measures in primary health care/the DepCare Project: report on a WHO meeting. https://www.euro.who.int/__data/assets/pdf_file/0016/130750/E60246.pdf.
  • Blom EH, Bech P, Hogberg G, et al. Screening for depressed mood in an adolescent psychiatric context by brief self-assessment scales – testing psychometric validity of WHO-5 and BDI-6 indices by latent trait analyses. Health Qual Life Outcomes. 2012;10(1):149.
  • Topp CW, Ostergaard SD, Sondergaard S, et al. The WHO-5 well-being index: a systematic review of the literature. Psychother Psychosom. 2015;84(3):167–176.
  • Allgaier AK, Pietsch K, Frühe B, et al. Depression in pediatric care: is the WHO-five well-being index a valid screening instrument for children and adolescents? Gen Hosp Psychiatry. 2012;34(3):234–241.
  • Chorpita BF, Yim L, Moffitt C, et al. Assessment of symptoms of DSM-IV anxiety and depression in children: a revised child anxiety and depression scale. Behav Res Ther. 2000;38(8):835–855.
  • Piqueras JA, Martin-Vivar M, Sandin B, et al. The revised child anxiety and depression scale: a systematic review and reliability generalization meta-analysis. J Affect Disord. 2017;218:153–169.
  • Esbjørn BH, Sømhovd MJ, Turnstedt C, et al. Assessing the revised child anxiety and depression scale (RCADS) in a national sample of Danish youth aged 8-16 years. PLoS One. 2012;7(5):e37339.
  • Reynolds W. 2002. Reynolds adolescent depression scale: professional manual (2nd ed.). Odessa, FL: Psychological Assessment Resources.
  • Blomqvist I, Ekbäck E, Dennhag I, et al. Validation of the Swedish version of the Reynolds Adolescent Depression Scale second edition (RADS-2) in a normative sample. Nord J Psychiatry. 2021;75(4):292–300.
  • Cella D, Yount S, Rothrock N, et al. The patient-reported outcomes measurement information system (PROMIS): progress of an NIH roadmap cooperative group during its first two years. Med Care. 2007;45(5 Suppl 1):S3–S11.
  • HealthMeasures. 2021. Publication Checklist. https://www.healthmeasures.net/resource-center/research-tools/publication-checklist.
  • Dewalt DA, Thissen D, Stucky BD, et al. PROMIS pediatric peer relationships scale: development of a peer relationships item bank as part of social health measurement. Health Psychol. 2013;32(10):1093–1103.
  • Bevans KB, Riley AW, Landgraf JM, et al. Children’s family experiences: development of the PROMIS(®) pediatric family relationships measures. Qual Life Res. 2017;26(11):3011–3023.
  • Cox ED, Connolly JR, Palta M, et al. Reliability and validity of PROMIS(R) pediatric family relationships short form in children 8–17 years of age with chronic disease. Qual Life Res. 2020;29(1):191–199.
  • Carlberg Rindestig F, Wiberg M, Chaplin JE, et al. Psychometrics of three Swedish physical pediatric item banks from the patient-reported outcomes measurement information system (PROMIS)®: pain interference, fatigue, and physical activity. J Patient Rep Outcomes. 2021;5(1):105.
  • Tucker CA, Bevans KB, Teneralli RE, et al. Self-reported pediatric measures of physical activity, sedentary behavior, and strength impact for PROMIS: conceptual framework. Pediatr Phys Ther. 2014;26(4):376–384.
  • Tucker CA, Bevans KB, Teneralli RE, et al. Self-reported pediatric measures of physical activity, sedentary behavior, and strength impact for PROMIS: item development. Pediatr Phys Ther. 2014;26(4):385–392.
  • Mircioiu C, Atkinson J. A comparison of parametric and non-parametric methods applied to a Likert scale. Pharmacy. 2017;5(4):26.
  • Cohen J. 1988. Statistical power analysis for the behavioral sciences (2nd ed.). New York: Lawrence Erlbaum.
  • Everitt BS. 2002. The Cambridge dictionary of statistics (2nd ed.). Cambridge: Cambridge University Press.
  • Field AP. 2018. Discovering statistics using IBM SPSS statistics (5th ed.). London: Sage Publications.
  • Nunnally JC, Bernstein IH. 1994. Psychometric theory. New York: McGraw-Hill.
  • Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–163.
  • Yaremko RM, Harari H, Harrison RC, et al. 1986. Handbook of research and quantitative methods in psychology: for students and professionals. New York: Lawrence Erlbaum Associates.
  • Stevens JP. 2002. Applied multivariate statistics for the social sciences (4th ed.). Mahwah, NJ: Erlbaum.
  • Kaiser HF, Rice J. Little jiffy, mark 4. Educ Psychol Meas. 1974;34(1):111–117.
  • Suhr DD. 2005. Principal component analysis vs. exploratory factor analysis. SUGI 30 Proceedings, 203–230. https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/203-30.pdf.
  • Rosseel Y. 2018. Latent Variable Analysis, version 0.6-3. http://lavaan.org/.
  • Kline RB. 1998. Principles and practice of structural equation modeling. New York: Guilford Press.
  • Byrne BM. 1994. Structural equation modeling with EQS and EQS/Windows: basic concepts, applications, and programming. Thousand Oaks: SAGE. https://doi.org/10.1177/014662169401800208
  • Browne MW, Cudeck R. Alternative ways of assessing model fit. Sociol Methods Res. 1992;21(2):230–258.
  • Hu LT, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Modeling. 1999;6(1):1–55.
  • Li CH. Confirmatory factor analysis with ordinal data: comparing robust maximum likelihood and diagonally weighted least squares. Behav Res Methods. 2016;48(3):936–949.
  • Svetina D, Rutkowski L, Rutkowski D. Multiple-group invariance with categorical outcomes using updated guidelines: an illustration using Mplus and the lavaan/semTools packages. Struct Equ Modeling. 2020;27(1):111–130.
  • Bentler PM, Bonett DG. Significance tests and goodness of fit in the analysis of covariance structures. Psychol Bull. 1980;88(3):588–606.
  • Finch HW, French BF, Hernández Finch ME. Comparison of methods for factor invariance testing of a 1-factor model with small samples and skewed latent traits. Front Psychol. 2018;9:332.
  • Cheung GW, Rensvold RB. Evaluating goodness-of-fit indexes for testing measurement invariance. Struct Equ Modeling. 2002;9(2):233–255.
  • Kassambara A. 2017. Practical guide to principal component methods in R. (Vol. 2). South Carolina: Create Space Independent Publishing Platform.
  • Merikangas KR, He JP, Burstein M, et al. Lifetime prevalence of mental disorders in U.S. adolescents: results from the National Comorbidity Survey replication-adolescent supplement (NCS-A). J Am Acad Child Adolesc Psychiatry. 2010;49(10):980–989.
  • World Health Organization. 2019. Spotlight on adolescent health and well-being: findings from the 2017/2018 health behaviour in school-aged children (HBSC) survey in Europe and Canada. https://apps.who.int/iris/bitstream/handle/10665/332091/9789289055000-eng.pdf.
  • Shi D, Maydeu-Olivares A, Rosseel Y. Assessing fit in ordinal factor analysis models: SRMR vs. RMSEA. Struct Equ Modeling. 2020;27(1):1–15.
  • Kendler KS, Eaves LJ, Walters EE, et al. The identification and validation of distinct depressive syndromes in a population-based sample of female twins. Arch Gen Psychiatry. 1996;53(5):391–399.
  • Łojko D, Rybakowski JK. Atypical depression: current perspectives. Neuropsychiatr Dis Treat. 2017;13:2447–2456.
  • Lyndon B, Parker G, Morris G, et al. Is atypical depression simply a typical depression with unusual symptoms? Aust N Z J Psychiatry. 2017;51(9):868–871.
  • Matza LS, Revicki DA, Davidson JR, et al. Depression with atypical features in the National Comorbidity Survey: classification, description, and consequences. Arch Gen Psychiatry. 2003;60(8):817–826.
  • Gradisar M, Gardner G, Dohnt H. Recent worldwide sleep patterns and problems during adolescence: a review and meta-analysis of age, region, and sleep. Sleep Med. 2011;12(2):110–118.
  • Zahra J, Ford T, Jodrell D. Cross-sectional survey of daily junk food consumption, irregular eating, mental and physical health and parenting style of British secondary school children. Child Care Health Dev. 2014;40(4):481–491.
  • Kansagra S. Sleep disorders in adolescents. Pediatrics. 2020;145(Suppl 2):S204–S209.
  • Stice E, Marti CN, Rohde P. Prevalence, incidence, impairment, and course of the proposed DSM-5 eating disorder diagnoses in an 8-year prospective community study of young women. J Abnorm Psychol. 2013;122(2):445–457.
  • Touchette E, Henegar A, Godart NT, et al. Subclinical eating disorders and their comorbidity with mood and anxiety disorders in adolescent girls. Psychiatry Res. 2011;185(1–2):185–192.
  • Waszczuk MA, Zavos HM, Gregory AM, et al. The phenotypic and genetic structure of depression and anxiety disorder symptoms in childhood, adolescence, and young adulthood. JAMA Psychiatry. 2014;71(8):905–916.
  • McElroy E, Fearon P, Belsky J, et al. Networks of depression and anxiety symptoms across development. J Am Acad Child Adolesc Psychiatry. 2018;57(12):964–973.
  • Irwin DE, Stucky B, Langer MM, et al. An item response analysis of the pediatric PROMIS anxiety and depressive symptoms scales. Qual Life Res. 2010;19(4):595–607.
  • Reeve BB, Hays RD, Bjorner JB, et al. Psychometric evaluation and calibration of health-related quality of life item banks: plans for the patient-reported outcomes measurement information system (PROMIS). Med Care. 2007;45(5 Suppl 1):S22–S31.