Dermatology

Psychometric evaluation of the Worst Pruritus Numerical Rating Scale (NRS), Atopic Dermatitis Symptom Scale (ADerm-SS), and Atopic Dermatitis Impact Scale (ADerm-IS)

Pages 1289-1296 | Received 01 May 2023, Accepted 22 Aug 2023, Published online: 13 Sep 2023

Abstract

Background

Atopic dermatitis (AD) is a chronic inflammatory skin disease characterized by pruritus, skin pain, and sleep impacts, which are only reportable by patients themselves. The goal of this research is to evaluate the reliability, validity, and interpretability of the scores from three patient-reported outcome measures within the context of a clinical trial for adolescents and adults with moderate to severe AD.

Methods

Data from a Phase 3 randomized, double-blind, placebo-controlled, multinational clinical trial for individuals 12–75 years of age with moderate to severe AD (AD Up [ClinicalTrials.gov NCT03568318]) were used to assess the reliability, validity, and interpretability of scores on the Worst Pruritus Numerical Rating Scale (NRS) and the Atopic Dermatitis Symptom and Impact Scales (ADerm-SS and ADerm-IS). Analyses were conducted separately for the adult and adolescent subgroups.

Results

Of the 882 participants included in the psychometric analyses, the majority were adults (n = 769, 87.2%), male (n = 536, 60.8%), and white (n = 630, 71.4%). Multi-item scores from the ADerm-SS and ADerm-IS had good internal consistency reliability, and most scores demonstrated acceptable test-retest reliability. Scores from the three questionnaires demonstrated adequate validity, exhibiting correlations with other conceptually related outcome assessments and score differences between clinically distinct subgroups. Finally, the score interpretation analyses provide estimates for meaningful within-person change and between-groups difference thresholds that may be useful for future research in adults and adolescents with moderate to severe AD.

Conclusions

These results provide evidence that the scores produced by the Worst Pruritus NRS, ADerm-SS, and ADerm-IS are reliable and construct-valid when completed by adults and adolescents with moderate to severe AD in a clinical trial setting. The results presented here expand upon the previous qualitative evidence of these tools and provide further support for their use in future clinical studies. While results are specific to clinical trials, next steps would be to evaluate the use of these questionnaires in clinical practice. This can provide clinicians and dermatologists a window into the patient’s disease experience outside of the clinic, aid in shared decision making, and support a patient-centric approach to management of moderate to severe AD.

Introduction

Atopic dermatitis (AD) is a chronic inflammatory skin disease affecting up to 20% of children and 10% of adults worldwide [1]. In addition to eczematous lesions, AD is characterized by pruritus, skin pain, and sleep impacts [2], which are only reportable by adult and adolescent patients themselves. Therefore, it is important to assess these aspects using patient-reported outcome (PRO) questionnaires that produce reliable, valid, and interpretable scores.

While there are existing questionnaires for the assessment of the signs, symptoms, and impacts of AD [3–5], they are typically completed during study visits. To also understand how participants feel and function in their daily lives outside of study visits, three novel PRO questionnaires have been developed in accordance with best practice recommendations and regulatory guidance documents [6–8]: Worst Pruritus Numerical Rating Scale (NRS), Atopic Dermatitis Symptom Scale (ADerm-SS), and Atopic Dermatitis Impact Scale (ADerm-IS) [9–11]. Specifically, these measures assess daily and weekly symptoms and impacts relevant to adults and adolescents with moderate to severe AD. The content validity of these questionnaires and guidelines on severity strata [12] have been previously described for both adults [9] and adolescents [10], and they were included in Phase 3 clinical trials of individuals 12 years of age and older with moderate to severe AD, including AD Up (NCT03568318), a randomized placebo-controlled, double-blind study to evaluate upadacitinib combined with topical corticosteroids.

The goal of this research is to evaluate the reliability, validity, and interpretability of the scores produced by these three PRO questionnaires in adolescents and adults with moderate to severe AD, using data from one of the clinical trials (AD Up; NCT03568318).

Methods

Study design and participants

Data from one Phase 3 randomized, double-blind, placebo-controlled, multinational clinical trial for individuals 12–75 years of age with moderate to severe AD (AD Up; NCT03568318) were used for the current psychometric analyses. Trial design and participant characteristics have been previously published [13]. In summary, participants were between 12 and 75 years old, had moderate to severe AD with symptom onset at least 3 years prior to the Baseline visit, had an inadequate response to topical corticosteroids, and had no prior exposure to any Janus kinase (JAK) inhibitor or dupilumab (see Supplemental Table 1 for key eligibility criteria). The trial was conducted with approval from appropriate ethics review committees, and all participants provided written informed consent. The analysis population for the psychometric analyses comprised participants in the intent-to-treat analysis population who had scores for one or more of the target questionnaires (Worst Pruritus NRS, ADerm-SS, and/or ADerm-IS) at both Baseline and at least one post-treatment timepoint up to Week 16.

Study questionnaires

The content and scoring for the three target questionnaires (Worst Pruritus NRS, ADerm-SS, and ADerm-IS) are described in Table 1. These questionnaires were completed on a handheld device that the participants had at home with them for the duration of the study. The order of administration for these questionnaires differed between daily completion (Days 1–6 of each week) and once-a-week completion (Day 7 of each week). The order of daily completion was Worst Pruritus NRS, ADerm-SS Daily Items (1–3), and ADerm-IS Daily Items (1–3). The order of the once-a-week completion was Worst Pruritus NRS, ADerm-SS Daily Items (1–3), ADerm-IS Daily Items (1–3), ADerm-SS Weekly Items (4–11), and ADerm-IS Weekly Items (4–10).

Table 1. Content and scoring for the three target questionnaires.

Weekly average scores are calculated for the Worst Pruritus NRS and ADerm-SS Skin Pain (Item 3). Two total scores can be calculated from items of the ADerm-SS: (1) the sum of Items 1–7 (symptoms only) is the ADerm-SS TSS-7; (2) the sum of all items (1–11; signs and symptoms) is the ADerm-SS TSS-11. The domain scores for the ADerm-IS are based on and supported by exploratory and confirmatory factor analyses [11] (see Supplemental Figures 1 and 2 and Supplemental Table 2). Please note: the results for all scores for the target questionnaires, with the exception of the ADerm-SS TSS-11, are presented below (the results for the ADerm-SS TSS-11 are in the supplemental materials).
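The scoring rules above can be sketched in a few lines of Python. This is an illustration only: the function names, the handling of missed diary days, and the assumption that each ADerm-SS item has already been reduced to a single value per week are ours and are not taken from the trial's scoring manual (Table 1).

```python
from statistics import mean


def weekly_average(daily_scores):
    """Average the available daily ratings for one week (None = missed day)."""
    observed = [s for s in daily_scores if s is not None]
    # Assumption: any completed day is enough to score the week; the
    # questionnaires' actual missing-data rule is not stated in the text.
    return mean(observed) if observed else None


def aderm_ss_totals(items):
    """items: dict mapping ADerm-SS item number (1-11) to its weekly value."""
    tss7 = sum(items[i] for i in range(1, 8))    # Items 1-7: symptoms only
    tss11 = sum(items[i] for i in range(1, 12))  # Items 1-11: signs and symptoms
    return tss7, tss11


# Hypothetical diary week with one missed day.
print(weekly_average([7, 6, None, 8, 5, 6, 7]))  # 6.5
```

Keeping the two totals in one function makes it explicit that TSS-7 is a subset of the items used for TSS-11.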

Additional assessments were utilized to support the psychometric evaluation and score performance of the three primary/target assessments, including patient-reported and clinician-reported questionnaires that were completed at key study visits. Specifically, patients completed several questionnaires at key study visits: Dermatology Life Quality Index (DLQI) [14,15], EQ-5D Visual Analog Scale (VAS) [16], Patient Global Impression of Change (PGIC), Patient Global Impression of Severity (PGIS), and Patient-Oriented Eczema Measure (POEM) [17]. Clinicians completed two questionnaires: Eczema Area and Severity Index (EASI) [18,19] and validated Investigator Global Assessment of AD (vIGA-AD) [20]. Finally, participants and investigators completed different items of the Scoring Atopic Dermatitis (SCORAD) [21,22] assessment. Participants completed the questionnaires before the other study-visit activities or clinician assessments. The order of the questionnaires used in the current analyses was patient-reported items for the SCORAD, POEM, PGIC, and DLQI, which were then followed by the clinician assessments completed by the investigator (vIGA-AD and EASI). Details about each supplementary assessment used for the psychometric and score interpretation analyses are presented in Supplemental Table 3.

Statistical analyses

To evaluate the reliability of scores, internal consistency (for ADerm-SS TSS-7 and ADerm-IS Sleep, Daily Activities, and Emotional State domain scores) and test-retest (for all three questionnaires) analyses were conducted. Supplemental Table 4 summarizes each set of analyses and benchmarks for determining acceptable results, where appropriate. To evaluate the validity of scores, convergent validity, known-groups, and sensitivity-to-change analyses were conducted. Finally, anchor-based analyses were used to calculate meaningful within-person change estimates, and distribution-based analyses were used to calculate between-groups clinically important difference estimates. All analyses were conducted separately for the adult (18–75 years) and adolescent (12–17 years) subgroups using SAS V9.4.

Results

Demographic characteristics of participants included in analyses

Of the 882 participants included in the psychometric analyses, the majority were adults (n = 769, 87.2%), male (n = 536, 60.8%), and white (n = 630, 71.4%). Approximately half were rated by clinicians on the vIGA-AD as having severe AD (n = 469, 53.2%), and the average POEM score was 21.2, which also indicates more severe AD at Baseline (see Supplemental Table 3 for scores at Baseline for all supportive assessments). Baseline scores for the target assessments demonstrate that both age groups were symptomatic and experiencing impacts related to AD before treatment (Table 2). Additionally, the range of responses selected on the individual items at Baseline (and follow-up) timepoints was evaluated, and no floor/ceiling effects were noted (data not presented).

Table 2. Score distribution of target assessments at Baseline (N = 882).

Reliability

The internal consistency reliability results for the ADerm-SS TSS-7 and ADerm-IS domain scores at Week 16 are presented in Tables 3 and 4, respectively. The ADerm-SS TSS-7 showed good internal consistency (Cronbach’s alpha (α) ≥ 0.96) in both adults and adolescents, and removal of any item did not improve reliability. Similar results were found for the three domain scores of the ADerm-IS, with α ≥ 0.93. Overall, removal of items did not improve internal consistency reliability for the ADerm-IS Sleep and Daily Activities domains; however, for ADerm-IS Emotional State, removal of Item 8 (self-consciousness) did result in a small increase in the α coefficient (a change of 0.01 for adults and 0.03 for adolescents). The scoring of this domain remained unchanged, as the α was high with or without this item, and the concepts assessed by the items in the domain are conceptually related.

Table 3. Internal consistency reliability for ADerm-SS TSS-7 item scores for Week 16.

Table 4. Internal consistency reliability for ADerm-IS multi-item scores for Week 16.
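As an illustration of the internal consistency analysis, the sketch below computes Cronbach's α and the "α if item deleted" values from a participant-by-item score matrix. The analyses reported here were run in SAS; this Python version, its function names, and the toy data are ours and only show the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of the total score).

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per participant (complete cases only)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows):
    """Recompute alpha with each item left out in turn (needs >= 3 items)."""
    k = len(rows[0])
    return [cronbach_alpha([r[:j] + r[j + 1:] for r in rows]) for j in range(k)]


# Hypothetical scores for four participants on a three-item domain.
scores = [[2, 3, 3], [5, 6, 4], [7, 8, 8], [9, 9, 10]]
print(round(cronbach_alpha(scores), 3))
print([round(a, 3) for a in alpha_if_item_deleted(scores)])
```

An item whose removal raises α, as reported for Item 8 of the Emotional State domain, would show up as an entry in the second list that exceeds the overall α.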

Test-retest reliability was evaluated using the Baseline and Week 2 timepoints among participants who selected “no change” on the PGIC at Week 2. Sample sizes for both the adult and adolescent analysis populations were substantially smaller than the number that had completed the target questionnaires (Table 5). Even with these sample size constraints, most scores demonstrated acceptable test-retest reliability in both populations, though some scores narrowly missed the pre-specified threshold (intraclass correlation coefficient [ICC] > 0.60).

Table 5. Test-retest reliability of the ADerm-SS, ADerm-IS and Worst Pruritus NRS scores between Baseline and Week 2 for participants who reported “no change” on PGIC.
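For readers who want to reproduce a test-retest estimate, the following sketch computes an intraclass correlation coefficient from paired Baseline and Week 2 scores. The ICC model used in the trial analyses is not specified in this section, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption on our part, as are the function name and the data.

```python
def icc_2_1(baseline, retest):
    """ICC(2,1) for two timepoints, from a two-way ANOVA decomposition."""
    n, k = len(baseline), 2
    rows = list(zip(baseline, retest))
    grand = (sum(baseline) + sum(retest)) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(baseline) / n, sum(retest) / n]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between timepoints
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical weekly-average itch scores for participants reporting "no change".
baseline = [7.0, 5.5, 8.0, 6.0, 9.0]
week2 = [6.5, 5.0, 8.5, 6.5, 8.0]
print(icc_2_1(baseline, week2) > 0.60)  # compare against the pre-specified threshold
```

Because this form measures absolute agreement, a uniform shift between the two timepoints lowers the coefficient even when the rank order of participants is unchanged.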

Validity

Pearson correlations demonstrated convergent validity between target questionnaire and supplementary assessment scores (Supplemental Table 5). For the adult population, nearly all correlations evaluated were moderate or greater. Specifically, strong correlations (r > 0.80) were observed for Worst Pruritus NRS and SCORAD Itch VAS (r = 0.85), ADerm-SS Skin Pain and SCORAD Itch VAS (r = 0.87) and POEM (r = 0.80), ADerm-IS Daily Activities and DLQI (r = 0.83), and ADerm-IS Emotional State and DLQI (r = 0.80). The ADerm-IS Sleep domain had the highest correlation with SCORAD Sleep VAS (r = 0.79). Similar patterns were seen for adolescents.

Validity according to known-groups methods was demonstrated for the target assessments, with Worst Pruritus NRS, ADerm-SS, and ADerm-IS scores able to distinguish among the clinically distinct groups (Supplemental Table 6 for the adult sample and Supplemental Table 7 for the adolescent sample). Mean Worst Pruritus NRS; ADerm-SS Skin Pain and TSS-7; and ADerm-IS Sleep, Daily Activities, and Emotional State domain scores were higher at greater AD severity levels across multiple assessments, for both adults and adolescents.

The Worst Pruritus NRS, ADerm-SS, and ADerm-IS scores are sensitive to change, as there were moderate to strong correlations between change scores for the three target questionnaires and the majority of the supportive assessments (Table 6). For adults, moderate to strong relationships were observed between the change scores of the three target questionnaires and all the supportive assessments, with the highest correlations of change scores seen between Worst Pruritus NRS and SCORAD Itch VAS (r = 0.81), ADerm-SS TSS-7 and SCORAD Itch VAS (r = 0.77), ADerm-IS Sleep and SCORAD Sleep VAS (r = 0.73), and ADerm-IS Daily Activities and DLQI Total score (r = 0.74). Similar patterns were generally observed among adolescents, though correlations were lower with EQ-5D VAS and EASI in most instances. The lower correlations between the target assessments and the EQ-5D VAS may be because the VAS assesses the broad concept of “general health.” Additionally, the lower correlations between the patient-reported target assessments and the investigator-completed EASI may be due to the difference in respondents.

Table 6. Pearson correlation on change scores of the ADerm-SS, ADerm-IS, and Worst Pruritus NRS and other supportive measures from Baseline to Week 16.
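The sensitivity-to-change analysis correlates change scores on a target questionnaire with change scores on a supportive assessment. A minimal sketch, with hypothetical data and helper names of our own choosing, could look like this:

```python
from math import sqrt


def change_scores(baseline, week16):
    """Week 16 minus Baseline, so negative values indicate improvement."""
    return [w - b for b, w in zip(baseline, week16)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical Baseline and Week 16 values for five participants.
nrs_change = change_scores([8, 7, 9, 6, 8], [3, 5, 2, 5, 7])
itch_vas_change = change_scores([8.5, 7.0, 9.0, 6.5, 7.5], [3.0, 4.5, 2.5, 5.0, 7.0])
print(round(pearson_r(nrs_change, itch_vas_change), 2))
```

Correlating change scores rather than raw scores is what distinguishes this analysis from the cross-sectional convergent validity correlations.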

Meaningful within-person change and minimal between-groups differences

Anchor-based analyses used the PGIS and vIGA-AD to estimate meaningful within-person change on the target assessments. Table 7 summarizes the results for the anchor-based analyses (Supplemental Tables 8 and 9 present the full results). Overall, estimates were higher using the vIGA-AD anchor compared to the PGIS, and this was noted for both adults and adolescents. For the Worst Pruritus NRS and ADerm-SS Skin Pain scores, meaningful within-person change scores associated with minimal improvement were approximately 3 points for both adults and adolescents. For the ADerm-IS Daily Activities and Emotional State domains, the meaningful within-person change estimates were also similar for the two age groups and were 10 points and 9 points, respectively. For the ADerm-SS TSS-7 and ADerm-IS Sleep domain scores, meaningful within-person change estimates were larger for adults compared to adolescents. Specifically, for the ADerm-SS TSS-7, the meaningful within-person change estimate was approximately 20 points for adults and 15 points for adolescents. For the ADerm-IS Sleep domain, the meaningful within-person change estimate was 9 points for adults and 5 points for adolescents.

Table 7. Summary of meaningful within-person change and between-groups differences.
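One common way to carry out an anchor-based analysis is to group participants by how much they changed on the anchor and take the mean change on the target score within the "minimal improvement" group. The sketch below assumes that approach; the exact anchor categories and summary statistics used in the trial analyses are reported in Supplemental Tables 8 and 9, and the coding of the anchor (a one-category improvement coded as -1) is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_change_by_anchor(target_change, anchor_change):
    """Mean target change score within each anchor change category."""
    groups = defaultdict(list)
    for target, anchor in zip(target_change, anchor_change):
        groups[anchor].append(target)
    return {anchor: mean(values) for anchor, values in groups.items()}


# Hypothetical data: change in weekly itch score and change in PGIS category.
itch_change = [-3, -4, -2, 0, -1, -6]
pgis_change = [-1, -1, -1, 0, 0, -2]
by_anchor = mean_change_by_anchor(itch_change, pgis_change)
print(by_anchor[-1])  # mean change among participants improving by one category
```

Comparing the "minimal improvement" group with the "no change" group helps confirm that the estimate exceeds the change seen in stable participants.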

To evaluate minimal between-groups differences that exceed measurement error, two estimates were calculated for each target score and are presented in Table 7. Between-groups difference estimates were largely consistent across the two age groups for the target assessment scores. Overall, results support a minimal between-groups difference threshold of 1 point on the Worst Pruritus NRS and ADerm-SS Skin Pain scores for both adults and adolescents. For the ADerm-SS TSS-7, results support a minimal between-groups difference threshold between 7 and 10 points for both adults and adolescents. For the ADerm-IS domains, minimal between-groups difference threshold estimates were as follows: 4 points for the Sleep domain, 5–8 points for the Daily Activities domain, and 4–5 points for the Emotional State domain.
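The two distribution-based estimates are not named in this section. Two statistics widely used for this purpose are half a standard deviation of the Baseline scores and the standard error of measurement, SEM = SD × √(1 − reliability); the sketch below assumes those two and uses hypothetical data and a hypothetical reliability value.

```python
from math import sqrt
from statistics import stdev


def half_sd(baseline_scores):
    """Half of the sample standard deviation at Baseline."""
    return 0.5 * stdev(baseline_scores)


def standard_error_of_measurement(baseline_scores, reliability):
    """SEM from the Baseline SD and a reliability coefficient (e.g. an ICC)."""
    return stdev(baseline_scores) * sqrt(1 - reliability)


# Hypothetical Baseline weekly-average scores on a 0-10 scale.
baseline = [6.0, 7.5, 8.0, 5.5, 9.0, 7.0, 6.5, 8.5]
print(round(half_sd(baseline), 2))
print(round(standard_error_of_measurement(baseline, reliability=0.70), 2))
```

Note that the two estimates coincide when the reliability coefficient equals 0.75, since √(1 − 0.75) = 0.5.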

Discussion

These results provide evidence that the scores produced by the Worst Pruritus NRS, ADerm-SS, and ADerm-IS are reliable and construct-valid when completed by adults and adolescents with moderate to severe AD in a clinical trial setting. These results build upon the previously published qualitative content validity research for the ADerm-SS and ADerm-IS questionnaires [9,23] and quantitative analyses to provide severity strata guidelines [12].

A strength of these analyses is the clinical trial setting, which provided longitudinal data to evaluate both test-retest reliability and sensitivity to change, as well as to provide estimates of within-person change for the three target assessments. Specific to score interpretation estimates, these results may prove useful when planning endpoints for future clinical trials for both meaningful within-person change and minimal between-groups differences. Specifically for the two weekly average scores (Worst Pruritus NRS and ADerm-SS Skin Pain, which range from 0 to 10), a 3-point change and 1-point difference can be considered threshold estimates of meaningful within-person change and minimal between-groups difference, respectively, for both adults and adolescents. Of note, this is similar to the estimate reported for another daily assessment of itch (Peak Pruritus NRS) [24]. For the ADerm-SS TSS-7 and ADerm-IS Sleep domain, meaningful within-person change threshold estimates differed between adults and adolescents, though minimal between-groups difference threshold estimates were consistent for the two age groups. Threshold estimates for the ADerm-IS Daily Activities and Emotional State domains were generally consistent across age groups. It is also notable that the thresholds associated with the vIGA-AD anchor were consistently higher (both within-person and between groups) than those with the PGIS. This may be related to the correlation between the target patient-reported assessments and the clinician-reported vIGA-AD. Additionally, it may be associated with the quicker efficacy of the treatment on patient-reported symptoms (e.g. pain and itch), compared to the visible improvement of the appearance of the skin as reported by the investigator using the vIGA-AD. It is important to note that the thresholds presented in these results represent the minimal meaningful within-person change in this clinical trial context for this patient population. If future studies are interested in evaluating more stringent definitions of improvement, or a different sample of patients (e.g. more or less severe AD), then the thresholds should be evaluated within that context.

While these results may be used as a starting place for endpoint planning in future clinical research, it should be noted that depending on the study objectives, researchers may select more stringent thresholds, or thresholds based on the age range of the sample (depending on the questionnaire). For example, while the within-patient change threshold for Worst Pruritus NRS and ADerm-SS Skin Pain scores that was supported by the current analyses was 3 points, the ranked endpoints in the three upadacitinib clinical trials (NCT03569293; NCT03607422; and NCT03568318) were based on a 4-point change, providing greater certainty of the meaningfulness of the change experienced by participants. Similarly, while the current analyses for the ADerm-SS TSS-7 supported different thresholds of meaningful within-person change for adults and adolescents (20 and 15 points, respectively), the upadacitinib trials used a single stringent score threshold (28 points) for both age groups for efficacy analyses.

One limitation of the current analyses is that the results are generalizable only to a clinical trial setting in which over 70% of participants self-identified as white; therefore, future research should evaluate the reliability and validity of these target assessment scores when implemented in clinical practice with a more racially diverse sample. Another limitation is the sample size for certain analyses, specifically for test-retest reliability, which required that participants self-report “no change” on a global assessment between Baseline and Week 2 to be included in the analysis population. The small number of participants self-reporting “no change” may be due to the rapid response that trial participants experienced on the treatment. Though directionally supportive, the reliability estimates should be considered preliminary and may be underestimates.

While this limitation applies to both the adult and adolescent subgroups, the adolescent subgroup (12–17 years old) in the clinical trial dataset was small relative to the adult subgroup. The difference in sample sizes between the two age groups may contribute to some of the differences in results. While the scores from the adolescent subgroup were moderately correlated with most of the supportive assessments, the correlations were numerically smaller than those in the adult subgroup. Therefore, results for the adolescents should be considered preliminary and should be replicated with a larger sample.

When selecting questionnaires for future research, it is important to develop a measurement strategy that assesses the signs, symptoms, and impacts relevant to the target patient population while minimizing patient burden and redundancy in the questionnaires. While other AD-specific assessments analyzed similar concepts, the recall period differs, which fundamentally changes the data that are collected. Aligning the measurement strategy with the aim of the study will maximize the likelihood that questionnaires selected will result in data that are relevant to the research question. Please note that while the current analyses summarize results for two scores generated by the ADerm-SS questionnaire (Skin Pain and TSS-7), this questionnaire can also generate a second total symptom score that includes all 11 items. The ADerm-SS TSS-7 excludes four items (rash, skin thickening/lichenification, bleeding, and oozing) that are signs that may be assessed by clinicians in other study assessments (e.g. SCORAD, EASI). However, if research is interested in the patient perspective on these signs, the ADerm-SS TSS-11 can be calculated as the sum of all 11 items within the questionnaire (results for the score distribution, reliability, validity, and score interpretation analyses are presented in Supplemental Tables 5, 7, 8, and 10).

Conclusion

The goal of the target PRO questionnaires was to assess the patient experience (both symptoms and impacts) in the patients’ life outside study visits. While other itch-specific daily assessments have been developed using a diary design [25,26], to our knowledge these questionnaires differ from others that have been used in prior clinical research because they assess AD-specific signs, symptoms, and impacts in patients’ daily lives, beyond itch. The results presented here expand upon the previous qualitative evidence of these tools for adults [9] and adolescents [10] and therefore provide further support for their use in future clinical studies. While these results are specific to the context of a clinical trial, next steps would be to evaluate the use of these questionnaires in clinical practice. This can provide clinicians and dermatologists a window into the patient’s disease experience outside of the clinic, aid in shared decision making, and support a patient-centric approach to management of moderate to severe AD.

Transparency

 


Acknowledgements

None.

Declaration of financial/other relationships

Jonathan I. Silverberg has received honoraria as a consultant and/or advisory board member for AbbVie, Afyx, Aobiome, Arena, Asana, Aslan, BioMX, Biosion, Bluefin, Bodewell, Boehringer-Ingelheim, Cara, Castle Biosciences, Celgene, Connect Biopharma, Dermavant, Dermira, Dermtech, Eli Lilly, Galderma, GlaxoSmithKline, Incyte, Kiniksa, LEO Pharma, Luna, Menlo, Novartis, Optum, Pfizer, RAPT, Regeneron, Sanofi-Genzyme, Shaperon, Sidekick Health, and Union. He has served as a speaker for AbbVie, Eli Lilly, LEO Pharma, Pfizer, Regeneron, and Sanofi-Genzyme, and has received institutional grants from Galderma and Pfizer. Yael A. Leshem has received honoraria or fees as a consultant from AbbVie, Sanofi, and Genentech and as an advisory board member from Sanofi and Regeneron Pharmaceuticals, Pfizer, and Dexcel Pharma, an independent research grant from AbbVie, and has, without personal compensation, provided investigator services for Eli Lilly, Pfizer, and AbbVie. Brian M. Calimlim is an employee of AbbVie and may own AbbVie stock or stock options. Jeffrey McDonald and Leighann Litcher-Kelly are employed by Adelphi Values LLC, which received payment from AbbVie Inc. to support the research activities presented in this publication. A reviewer on this manuscript has disclosed that they have received honoraria from AbbVie, Eli Lilly, Pfizer, Sanofi, LEO and AstraZeneca.

Additional information

Funding

AbbVie Inc. funded this study and participated in the study design; study research; collection, analysis, and interpretation of data; and writing, reviewing, and approving of this publication. All authors had access to the data, and participated in the development, review, and approval, and in the decision to submit this publication. No honoraria or payments were made for authorship.

References

  1. Nutten S. Atopic dermatitis: global epidemiology and risk factors. Ann Nutr Metab. 2015;66(Suppl 1):8–16. doi: 10.1159/000370220.
  2. Grant L, Seiding Larsen L, Trennery C, et al. Conceptual model to illustrate the symptom experience and humanistic burden associated with atopic dermatitis in adults and adolescents. Dermatitis. 2019;30(4):247–254. doi: 10.1097/DER.0000000000000486.
  3. Silverberg JI, Lei D, Yousaf M, et al. Comparison of Patient-Oriented Eczema Measure and Patient-Oriented Scoring Atopic Dermatitis vs. Eczema Area and Severity Index and other measures of atopic dermatitis: a validation study. Ann Allergy Asthma Immunol. 2020;125(1):78–83. doi: 10.1016/j.anai.2020.03.006.
  4. Thomas KS, Apfelbacher CA, Chalmers JR, et al. Recommended core outcome instruments for health-related quality of life, long-term control and itch intensity in atopic eczema trials: results of the HOME VII consensus meeting. Br J Dermatol. 2021;185(1):139–146. doi: 10.1111/bjd.19751.
  5. Chalmers JR, Simpson E, Apfelbacher CJ, et al. Report from the fourth international consensus meeting to harmonize core outcome measures for atopic eczema/dermatitis clinical trials (HOME initiative). Br J Dermatol. 2016;175(1):69–79. doi: 10.1111/bjd.14773.
  6. US Department of Health and Human Services, Food and Drug Administration, Center for Biologics Evaluation and Research, et al. Patient-reported outcome measures: use in medical product development to support labeling claims. Food and Drug Administration; 2009 [cited 2021 Oct 30]. Available from: http://www.fda.gov/downloads/Drugs/Guidances/UCM193282.pdf.
  7. Patrick DL, Burke LB, Gwaltney CJ, et al. Content validity – establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: part 1-Eliciting concepts for a new PRO instrument. Value Health. 2011;14(8):967–977. doi: 10.1016/j.jval.2011.06.014.
  8. Patrick DL, Burke LB, Gwaltney CJ, et al. Content validity – establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: part 2-Assessing respondent understanding. Value Health. 2011;14(8):978–988. doi: 10.1016/j.jval.2011.06.013.
  9. Foley C, Tundia N, Simpson E, et al. Development and content validity of new patient-reported outcome questionnaires to assess the signs and symptoms and impact of atopic dermatitis: the Atopic Dermatitis Symptom Scale (ADerm-SS) and the Atopic Dermatitis Impact Scale (ADerm-IS). Curr Med Res Opin. 2019;35(7):1139–1148. doi: 10.1080/03007995.2018.1560222.
  10. Silverberg JI, Simpson EL, McLafferty M, et al. Content validity of the Atopic Dermatitis Symptom Scale (ADerm-SS) and Atopic Dermatitis Impact Scale (ADerm-IS) in adolescents to assess the symptoms and impacts of atopic dermatitis. Presented at the Revolutionizing Atopic Dermatitis (RAD) Congress, 2020 Dec 13-14, Virtual.
  11. Silverberg JI, Simpson EL, Litcher-Kelly L, et al. Psychometric evaluation of three patient-reported outcome questionnaires assessing the symptoms and impacts of atopic dermatitis in adults and adolescents. Presented at the Revolutionizing Atopic Dermatitis (RAD) Congress, 2020 Dec 13-14, Virtual.
  12. Silverberg JI, Simpson E, Calimlim BM, et al. Determining severity strata for three atopic dermatitis patient-reported outcome questionnaires: defining severity score ranges for the Worst Pruritus numerical rating scale and the Atopic Dermatitis Symptom and Impact Scales (ADerm-SS and ADerm-IS). Dermatol Ther. 2022;12(12):2817–2827. doi: 10.1007/s13555-022-00836-5.
  13. Reich K, Teixeira HD, de Bruin-Weller M, et al. Safety and efficacy of upadacitinib in combination with topical corticosteroids in adolescents and adults with moderate-to-severe atopic dermatitis (AD Up): results from a randomised, double-blind, placebo-controlled, phase 3 trial. Lancet. 2021;397(10290):2169–2181. doi: 10.1016/S0140-6736(21)00589-4.
  14. Finlay AY, Khan GK. Dermatology Life Quality Index (DLQI)–a simple practical measure for routine clinical use. Clin Exp Dermatol. 1994;19(3):210–216. doi: 10.1111/j.1365-2230.1994.tb01167.x.
  15. Hongbo Y, Thomas CL, Harrison MA, et al. Translating the science of quality of life into practice: what do Dermatology Life Quality Index scores mean? J Invest Dermatol. 2005;125(4):659–664. doi: 10.1111/j.0022-202X.2005.23621.x.
  16. Devlin NJ, Shah KK, Feng Y, et al. Valuing health-related quality of life: an EQ-5D-5L value set for England. Health Econ. 2018;27(1):7–22. doi: 10.1002/hec.3564.
  17. University of Nottingham Centre of Evidence Based Dermatology. POEM – Patient Oriented Eczema Measure; 2021 [cited 2021 Nov 12]. Available from: https://www.nottingham.ac.uk/research/groups/cebd/resources/poem.aspx.
  18. Oakley A. EASI Score. DermNet New Zealand: 01/2015; 2015 [cited 2019 Sep 8]. Available from: http://www.dermnetnz.org/topics/easi-score/.
  19. Hanifin JM, Thurston M, Omoto M, et al. The Eczema Area and Severity Index (EASI): assessment of reliability in atopic dermatitis. EASI Evaluator Group. Exp Dermatol. 2001;10(1):11–18. doi: 10.1034/j.1600-0625.2001.100102.x.
  20. Simpson E, Bissonnette R, Eichenfield LF, et al. The validated Investigator Global Assessment for Atopic Dermatitis (vIGA-AD): the development and reliability testing of a novel clinical outcome measurement instrument for the severity of atopic dermatitis. J Am Acad Dermatol. 2020;83(3):839–846. doi: 10.1016/j.jaad.2020.04.104.
  21. European Task Force on Atopic Dermatitis. Severity scoring of atopic dermatitis: the SCORAD index. Consensus report of the European Task Force on atopic dermatitis. Dermatology. 1993;186:23–31.
  22. Kunz B, Oranje AP, Labrèze L, et al. Clinical validation and guidelines for the SCORAD index: consensus report of the European Task Force on atopic dermatitis. Dermatology. 1997;195(1):10–19. doi: 10.1159/000245677.
  23. Silverberg JI, Simpson EL, McLafferty M, et al. Content validity of the Atopic Dermatitis Symptom Scale (ADerm-SS) and Atopic Dermatitis Impact Scale (ADerm-IS) in adolescents to assess the symptoms and impacts of atopic dermatitis. Presented at the Revolutionizing Atopic Dermatitis (RAD) Congress, 2020 Dec 13-14, Virtual.
  24. Yosipovitch G, Reaney M, Mastey V, et al. Peak pruritus numerical rating scale: psychometric validation and responder definition for assessing itch in moderate-to-severe atopic dermatitis. Br J Dermatol. 2019;181(4):761–769. doi: 10.1111/bjd.17744.
  25. Schnitzler C, Rosen J, Szepietowski J, et al. Validation of ‘ItchApp©’ in Poland and in the USA: multicentre validation study of an electronical diary for the assessment of pruritus. J Eur Acad Dermatol Venereol. 2019;33(2):398–404. doi: 10.1111/jdv.15300.
  26. Phan NQ, Blome C, Fritz F, et al. Assessment of pruritus intensity: prospective study on validity and reliability of the Visual Analogue Scale, Numerical Rating Scale and Verbal Rating Scale in 471 patients with chronic pruritus. Acta Derm Venereol. 2012;92(5):502–507. doi: 10.2340/00015555-1246.