Original Research

Estimated risk for chronic pain determined using the generic STarT Back 5-item screening tool

Pages 461-467 | Published online: 24 Feb 2017

Abstract

Objective

The generic STarT Back 5-item screening tool (STarT-G) is used to manage chronic pain in the lower back and elsewhere. This study evaluated the validity of the Japanese version of this generic screening tool.

Materials and methods

Japanese participants between the ages of 20 and 64 years completed online surveys regarding pain. Reliability was assessed as internal consistency using Cronbach’s alpha coefficient. Spearman’s correlation coefficients were used to evaluate concurrent validity between the STarT-G score and standard reference questionnaires. Associations between STarT-G scores and the presence of a disability due to chronic pain (DCP) were analyzed using receiver operating characteristic (ROC) curves.

Results

Analyses ultimately included data obtained from 52,842 Japanese participants (54.4% male) with a mean (standard deviation) age of 47.7 (9.4) years. Approximately 1.5% of participants had DCP, and the mean STarT-G score was 1.2 (1.4). The Cronbach’s alpha coefficient was 0.71, indicating acceptable reliability. The STarT-G score moderately correlated with the pain numerical rating scale (NRS) score (Spearman’s correlation coefficient: r = 0.34). When the STarT-G threshold was set at 4, the sensitivity and specificity of the DCP predictive model were 65.8% and 82.4%, respectively, and the area under the ROC curve was 0.808.

Conclusion

The STarT-G was internally consistent and was able to distinguish between subjects with and without a DCP. Therefore, the STarT-G can reliably be used in the Japanese population to identify patients with DCP.

Introduction

Disability due to chronic pain (DCP) results in absence from work and is a major public health concern in Japan and many Western countries.Citation1–Citation4 Various screening tools have been developed to identify chronic pain subgroups and comorbid factors.Citation5–Citation7 One widely used tool is the STarT Back Tool (STarT), a 9-item screening tool that was developed as a prognostic indicator of lower back pain (LBP). Items 1–4 evaluate physical factors and items 5–9 assess psychosocial factors (Figure 1).Citation5,Citation8 The STarT score is often used by primary care physicians in England to make clinical decisions.Citation5 Specifically, the STarT results indicate the subgroup that an LBP patient falls into, which helps determine which treatment strategies may be most effective. The STarT has been shown to be particularly effective for individual patient management in the physiotherapy setting. Care that was targeted according to STarT results was more clinically effective and more cost-effective than usual care delivered without STarT testing.Citation5 We previously translated the STarT into Japanese,Citation9 and this version was linguistically validated in a general cross-cultural adaptation process.Citation10–Citation12 We also evaluated the reliability and validity of the Japanese version of the STarT in a large number of Japanese patients with LBP.Citation13

Figure 1 The Keele STarT Back screening tool (9-item).

Note: Copyright ©2007. Reprinted from Keele University. STarT Back Screening Tool Website. Available from: https://www.keele.ac.uk/sbst/startbacktool/usingandscoring/.Citation8

The lower back was the most common site of chronic pain and accounted for 65% of all cases of reported chronic pain in a Japanese epidemiological study.Citation1 However, chronic pain often originates in places other than the lower back, and a generic screening tool is needed to help effectively manage chronic pain from all sites. One such tool is the generic version of the STarT Back 5-item screening tool (STarT-G), a modified version of the 9-item STarT.Citation8 The STarT 9-item screening tool provides an easy way to stratify patients into three subgroups according to the probability of a poor prognosis or pain chronicity. These categories are defined as “low risk,” “medium risk,” and “high risk” (Figure 2).Citation8 In contrast, the usefulness of the 5-item STarT-G screening tool has not yet been established, and the STarT-G has not been validated for evaluating chronic pain in a large group of Japanese subjects. Therefore, the current study was performed to examine the validity of the STarT-G in such a population using cross-sectional data obtained from STarT-G surveys administered online.

Figure 2 The STarT Back tool scoring system.

Notes: Scores were used to stratify patients into “low risk,” “medium risk,” and “high risk” groups. Copyright ©2007. Reprinted from Keele University. STarT Back Screening Tool Website. Available from: https://www.keele.ac.uk/sbst/startbacktool/usingandscoring/.Citation8

Materials and methods

This study was reviewed and approved by the medical/ethics review board of the Japan Labour Health and Welfare Organization at Kanto Rosai Hospital (Kanagawa, Japan, approval number: 2012-22). All study procedures adhered to the tenets of the Declaration of Helsinki. Participation was voluntary, and no personal information was collected. Written informed consent was not obtained, but submitting the completed questionnaire was considered evidence of consent. Before completing the questionnaire, potential participants read an explanation of the survey’s purpose and were informed that they should proceed to the questionnaire only if they agreed to participate in the study. As an incentive, participants received online shopping reward points from the Internet research company that helped conduct this study (UNITED, Inc., Tokyo, Japan).

Study population

Subject information was collected via surveys administered online in January and February 2014. Participants were recruited from an online panel maintained by an Internet research company (UNITED, Inc.). The panel consisted of ~1.25 million registered Japanese research volunteers between the ages of 20 and 64 years. From this volunteer pool, 965,919 individuals were randomly selected and invited by e-mail to complete an online questionnaire on health problems associated with pain. We ultimately obtained 52,842 online responses by January 31, 2014.

Study measures

The 5-item STarT-G tool is a modified version of the psychosocial subscale of the 9-item STarT and is intended to identify distress in conditions other than LBP.Citation5 Questions address fear (one item from the Tampa Scale of Kinesiophobia), anxiety (one item from the Hospital Anxiety and Depression Scale), pessimistic patient expectations (one item from the Pain Catastrophizing Scale), low mood (one item from the Hospital Anxiety and Depression Scale), and how bothersome pain is.Citation7 The first four items had possible responses of “agree” or “disagree,” and the bothersomeness item had possible responses from 0 to 5 (Likert scale). We used the 5-item STarT Back screening tool that is available from the Keele University website (March 2013, Figure 3).Citation8
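
The item structure described above can be turned into a total score along the following lines. This is only a minimal sketch: the text reports totals ranging from 0 to 5, which implies one point per item, but it does not state how the 0 to 5 bothersomeness rating is reduced to a single point. The cut-off `ASSUMED_BOTHERSOME_CUTOFF` and the function name `start_g_score` are therefore hypothetical and are not taken from the study or from the Keele scoring guide.

```python
from typing import Sequence

# ASSUMPTION: the source does not say which bothersomeness ratings earn a
# point, so this cut-off is purely illustrative.
ASSUMED_BOTHERSOME_CUTOFF = 4


def start_g_score(agree_items: Sequence[bool], bothersomeness: int,
                  cutoff: int = ASSUMED_BOTHERSOME_CUTOFF) -> int:
    """Return a 0-5 total: one point per 'agree' plus one for high bothersomeness."""
    if len(agree_items) != 4:
        raise ValueError("expected four agree/disagree items")
    if not 0 <= bothersomeness <= 5:
        raise ValueError("bothersomeness must be between 0 and 5")
    # Fear, anxiety, pessimistic expectations, low mood: 'agree' scores 1.
    points = sum(1 for answer in agree_items if answer)
    # The bothersomeness rating is dichotomized at the assumed cut-off.
    points += 1 if bothersomeness >= cutoff else 0
    return points


if __name__ == "__main__":
    # Agrees with two items and rates bothersomeness as 5.
    print(start_g_score([True, True, False, False], 5))
```

Keeping the cut-off as a parameter makes it easy to substitute the rule from the official scoring instructions once it has been checked.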

Figure 3 The generic condition screening tool (5-items).

Note: Copyright ©2007. Reprinted from Keele University. STarT Back Screening Tool Website. Available from: https://www.keele.ac.uk/sbst/startbacktool/usingandscoring/.Citation8

The study questionnaire investigated pain experienced over the past month in 20 different anatomical sites. All anatomical sites were illustrated on diagrams to ensure that participants correctly identified each area. Examined sites included the head, chin, teeth/mouth, face, throat, neck, shoulder, elbow, wrist/hand, chest, abdomen, back, low back, hip, thigh, knee, lower leg, ankle/foot, genitals, and anus. The degree of chronic pain experienced over the last 4 weeks was assessed using the numerical rating scale (NRS), with scores ranging from 0 (no pain at all) to 10 (the worst pain imaginable).

Somatizing tendency was assessed using a subset of items from a linguistically validated Japanese version of the Brief Symptom Inventory (BSI).Citation14,Citation15 Seven somatic symptoms were assessed for severity, including faintness or dizziness, pain in the heart or chest, nausea or upset stomach, difficulty breathing, numbness or tingling in part of the body, weakness in part of the body, and hot or cold spells. All symptoms were assessed on a five-point scale that evaluated how much the participant was bothered by the symptom. Participants chose from the following response options: not at all (0), mildly (1), moderately (2), quite a bit (3), and extremely (4). For this test, participants were grouped by the number of somatic symptoms or pain sites. A participant was considered to have a symptom if he/she responded with a 2–4, which is indicative of somatization.Citation16,Citation17

The presence/absence of a DCP was also investigated. A DCP was considered present when the pain symptoms had continued for at least 6 months and the subject had withdrawn from social activities because of pain.

Statistical analyses

Data are presented as mean (standard deviation), where applicable. Participant demographic and clinical characteristics were summarized using descriptive statistics. To examine floor and ceiling effects, the percentages of respondents with total scores of 0 and 5 were calculated. Floor and ceiling effects were considered present when >15% of respondents had the lowest or highest possible score, respectively.Citation18 To examine STarT-G reliability, we evaluated internal consistency by calculating Cronbach’s alpha coefficients. An alpha index >0.70 indicates a satisfactory internal consistency.Citation19 Spearman’s correlation coefficients were used to evaluate concurrent validity by examining correlations between STarT-G and NRS pain scores. Correlation coefficients were interpreted using Cohen’sCitation20 criteria for correlation strength in psychometric validation (0.10 = weak, 0.30 = moderate, and 0.50 = strong).
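
As an illustration of the reliability and floor/ceiling checks described above, the sketch below computes Cronbach’s alpha as k/(k − 1) × (1 − sum of item variances / variance of the total score) and the share of respondents at the lowest and highest totals. The study itself used SPSS; the function names and the toy responses here are hypothetical and serve only to show the arithmetic.

```python
from statistics import variance
from typing import Dict, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """rows[i][j] is respondent i's score on item j."""
    k = len(rows[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Variance of the summed score across respondents.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def floor_ceiling(totals: Sequence[int], lowest: int = 0, highest: int = 5,
                  cutoff: float = 0.15) -> Dict[str, object]:
    """Flag an effect when more than `cutoff` of respondents sit at an extreme."""
    n = len(totals)
    floor_share = sum(1 for t in totals if t == lowest) / n
    ceiling_share = sum(1 for t in totals if t == highest) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > cutoff,
        "ceiling_effect": ceiling_share > cutoff,
    }


if __name__ == "__main__":
    # Toy data only: six respondents, five dichotomous items.
    toy = [
        [1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0],
        [1, 1, 0, 0, 1],
        [0, 0, 0, 1, 0],
        [1, 0, 1, 1, 1],
        [0, 0, 0, 0, 0],
    ]
    print(round(cronbach_alpha(toy), 3))
    print(floor_ceiling([sum(row) for row in toy]))
```

The same variance estimator is used for the items and for the total, so the choice between sample and population variance does not change alpha.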

The ability of STarT-G scores to differentiate between participants with known differences (known-group validity) was examined using the Jonckheere–Terpstra test. To do this, participants were categorized into the following groups according to the number of somatic symptoms present: no symptoms, one symptom, and two or more symptoms.
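
For readers unfamiliar with the Jonckheere–Terpstra test, the sketch below shows the idea: for every pair of groups taken in their hypothesized order, count how often a value in the later group exceeds a value in the earlier group (ties count one half), then compare the total with its expectation under no trend. It is not the SPSS procedure used in the study. In particular, it assumes the large-sample normal approximation without a tie correction, which is only approximate for heavily tied integer scores such as the STarT-G, and the pairwise loop is far too slow for tens of thousands of responses. All names are hypothetical.

```python
from math import erfc, sqrt
from typing import Sequence, Tuple


def jonckheere_terpstra(groups: Sequence[Sequence[float]]) -> Tuple[float, float, float]:
    """Return (J, z, two-sided p) for groups listed in their hypothesized order."""
    j_stat = 0.0
    for a in range(len(groups) - 1):
        for b in range(a + 1, len(groups)):
            for x in groups[a]:
                for y in groups[b]:
                    if x < y:
                        j_stat += 1.0
                    elif x == y:
                        j_stat += 0.5
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    # Mean and variance of J under the null hypothesis (ASSUMPTION: no ties).
    mean = (n * n - sum(s * s for s in sizes)) / 4
    var = (n * n * (2 * n + 3) - sum(s * s * (2 * s + 3) for s in sizes)) / 72
    z = (j_stat - mean) / sqrt(var)
    p_two_sided = erfc(abs(z) / sqrt(2))
    return j_stat, z, p_two_sided


if __name__ == "__main__":
    # Toy scores for groups with zero, one, and two or more symptoms.
    j, z, p = jonckheere_terpstra([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print(j, z, p)
```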

Associations between STarT-G scores and the presence of a DCP were examined using receiver operating characteristic (ROC) curves, and accuracy was summarized by the corresponding area under the curve (AUC). The following traditional academic point system for AUC values can be used as a rough guide for classifying diagnostic test accuracy: 0.90–1.00 = excellent, 0.80–0.90 = good, 0.70–0.80 = fair, 0.60–0.70 = poor, and 0.50–0.60 = fail.Citation21 Statistical analyses were performed using SPSS statistical software (version 20.0; SPSS, Inc., Chicago, IL, USA). All reported P values are two-sided, and statistical significance was defined as P < 0.05.
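
The AUC can be read as the probability that a randomly chosen participant with a DCP has a higher score than a randomly chosen participant without one, with ties counted as one half. The sketch below computes it that way and maps the value onto the rough guide quoted above. It is an illustration, not the SPSS routine used in the study. The function names are hypothetical, and because the published bands share their end points, treating each lower bound as inclusive (and anything below 0.60 as “fail”) is an assumption.

```python
from typing import Sequence


def auc_from_scores(with_dcp: Sequence[float], without_dcp: Sequence[float]) -> float:
    """Share of (DCP, non-DCP) pairs in which the DCP score is higher; ties count 0.5."""
    wins = 0.0
    for p in with_dcp:
        for q in without_dcp:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(with_dcp) * len(without_dcp))


def auc_label(auc: float) -> str:
    """Rough guide from the text; inclusive lower bounds are an ASSUMPTION."""
    if auc >= 0.90:
        return "excellent"
    if auc >= 0.80:
        return "good"
    if auc >= 0.70:
        return "fair"
    if auc >= 0.60:
        return "poor"
    return "fail"


if __name__ == "__main__":
    # Toy scores only.
    value = auc_from_scores([3, 4, 5], [0, 1, 3])
    print(round(value, 3), auc_label(value))
```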

Results

A total of 52,842 participants were ultimately included in analyses. Mean subject age was 47.7 (9.4) years, and 54.4% of participants were male. Approximately 1.5% of participants reported having a DCP. Table 1 summarizes participant demographic characteristics and overall pain survey results.

Table 1 Participant demographic and pain characteristics

Mean STarT-G score was 1.2 (1.4). No notable ceiling effect was observed: only 2.3% of participants had the highest score of 5. However, a substantial floor effect was observed, with 41.0% of participants having the lowest score of 0. The Cronbach’s alpha coefficient was 0.71, indicating acceptable internal consistency. Concurrent validity was examined by investigating the correlation between the STarT-G score and the pain NRS score. The two measures were moderately correlated (r = 0.34).

We examined the STarT-G scores among participants with known differences. As expected, participants with more somatic symptoms had significantly higher STarT-G scores. The mean score was 0.97 (1.12), 1.96 (1.42), and 2.74 (1.53) in participants with zero, one, and two or more somatic symptoms, respectively (Figure 4). This trend of increasing total STarT-G score with an increasing number of somatic symptoms was highly significant (Jonckheere–Terpstra test, P < 0.0001). Furthermore, participants with pain at a higher number of body sites had significantly higher STarT-G scores. The mean score was 0.63 (1.05), 1.05 (1.25), 1.27 (1.30), 1.50 (1.37), 1.80 (1.45), 2.23 (1.54), and 2.96 (1.57) in participants with zero, one, two, three, four-to-five, six-to-nine, and ≥10 pain sites, respectively (Figure 5). This increasing trend in STarT-G score with an increasing number of bodily pain sites was highly significant (Jonckheere–Terpstra test, P < 0.0001).

Figure 4 Mean STarT-G scores for participants with different numbers of somatic symptoms.

Notes: The linear trend was found to be highly significant (Jonckheere–Terpstra test, P < 0.0001). The STarT-G is the generic version of the STarT Back 5-item screening tool. The number of somatic symptoms was determined using the Brief Symptom Inventory somatization scale.

Figure 5 Mean STarT-G scores for participants with different numbers of pain sites.

Notes: The linear trend was found to be highly significant (Jonckheere–Terpstra test, P < 0.0001). The STarT-G is the generic version of the STarT Back 5-item screening tool. The number of pain sites represents pain experienced during the past month in the head, chin, teeth/mouth, face, throat, neck, shoulder, elbow, wrist/hand, chest, abdomen, back, low back, hip, thigh, knee, lower leg, ankle/foot, genitals, and/or anus.

The ability of the STarT-G score to identify participants with a DCP was also examined. At a threshold of 4, sensitivity and specificity for detecting a DCP were 65.8% and 82.4%, respectively. The area under the ROC curve, which summarizes performance across all possible thresholds, was 0.808, indicating good discrimination (Figure 6).
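
Sensitivity and specificity at a fixed threshold follow directly from a two-by-two table of screening result against DCP status. The sketch below shows that calculation, treating a score at or above the threshold as a positive screen. The function name and the toy data are hypothetical, and the numbers it prints are not the study’s results.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(scores: Sequence[int], has_dcp: Sequence[bool],
                            threshold: int = 4) -> Tuple[float, float]:
    """Treat score >= threshold as a positive screen and compare with DCP status."""
    tp = fn = tn = fp = 0
    for score, dcp in zip(scores, has_dcp):
        positive = score >= threshold
        if dcp and positive:
            tp += 1      # DCP present, screen positive
        elif dcp:
            fn += 1      # DCP present, screen negative
        elif positive:
            fp += 1      # DCP absent, screen positive
        else:
            tn += 1      # DCP absent, screen negative
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy data only: eight respondents, three of whom have a DCP.
    toy_scores = [5, 4, 2, 1, 0, 4, 3, 0]
    toy_dcp = [True, True, True, False, False, False, False, False]
    sens, spec = sensitivity_specificity(toy_scores, toy_dcp)
    print(round(sens, 3), round(spec, 3))
```

Repeating the calculation for every threshold from 0 to 5 yields the points of the ROC curve.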

Figure 6 Receiver operating characteristics (ROC) curve of disability due to chronic pain, as assessed using a STarT-G score threshold value of 4.

Note: The area under the ROC curve was 0.808.

Discussion

Here, we evaluated the psychometric properties of the STarT-G. We found that the survey was internally consistent and had acceptable concurrent and known-groups validity in the Japanese population. The Cronbach’s alpha coefficient for the STarT-G was 0.71, indicating acceptable internal consistency. This value was similar to that obtained for the Japanese 9-item STarT scale (0.75).Citation13 Concurrent validity was assessed by analyzing the correlation between the STarT-G and pain NRS scores, which were moderately correlated (r = 0.34). Known-groups validity was investigated by examining relationships between STarT-G scores and the number of somatic symptoms and body pain sites. These analyses showed that the STarT-G score increased as the number of somatic symptoms and pain sites increased. This suggests that the STarT-G is able to differentiate between patients with different levels of chronic pain and pain-related problems.

Yellow flags are useful in identifying patients with chronic LBP who have a poor prognosis.Citation22 The 5-item tool covers the minimal set of important psychological factors that are considered to be yellow flags for chronic LBP. The survey includes questions related to fear, anxiety, catastrophizing, depression, and bothersomeness, all of which are among the most important predictors identified as yellow flags. For patients with high STarT-G scores, specific cognitive behavioral approaches are needed in addition to pain education, motivation, encouragement, exercise, medical therapy (minimal amounts), and physical treatment. This recommendation is based on previous reports that early intervention targeting yellow flags leads to better outcomes.Citation23,Citation24

Finally, ~1.5% of participants reported having a DCP. At a STarT-G threshold value of 4 points, ROC analysis showed that the sensitivity and specificity for detecting a DCP were 65.8% and 82.4%, respectively. Additionally, the AUC was 0.808, indicating a good capacity of the STarT-G to differentiate between patients with and without a DCP.

The STarT-G is a generic, rather than diagnosis-specific, screening tool that can support communication between primary care physicians and pain specialists in the care of chronic pain patients. Using the STarT-G threshold of 4 points, participants examined here were divided into the following two groups: those at risk for a DCP and those with minimal to no risk for a DCP. We recommend that patients at or beyond this threshold consult a pain specialist. The STarT-G is now planned to be used as a tool to identify patients for referral to one of 18 core facilities in Japan that provide cognitive behavioral therapy.

Our study had several limitations. First, our study population was selected from Internet research volunteers who have chronic pain. Given that 41% of participants had a STarT-G score of 0, many participants may have had chronic pain that was not severe enough to require medical care. This may have influenced our results. Second, Internet-based surveys can introduce a selection bias and may not be representative of the general population. Because our study population was selected from Internet research volunteers who may differ from general Internet users, caution is needed when interpreting our study findings. In particular, people living in large cities are overrepresented among Internet survey company volunteers. In addition, a higher proportion of respondents had completed university or graduate level education than in the general population, particularly among older respondents.Citation25 Third, the internal consistency of the STarT-G only just exceeded the 0.70 level considered satisfactory.Citation19 However, Nunnally and BernsteinCitation26 recommend a minimum reliability of 0.90 for making clinical decisions. Therefore, the STarT-G may not be reliable enough to serve as the sole basis for clinical decisions about individual patients. Finally, this cross-sectional study did not assess the ability of the STarT-G to predict pain chronicity. Future longitudinal studies are needed to better understand potential associations between risk groups and long-term pain outcomes. These should also examine whether or not the STarT-G score is predictive of DCP.

Conclusion

The STarT-G scale had acceptable internal consistency and validity (concurrent and known groups) in Japanese patients with chronic pain. We hope that these analyses of the psychometric properties of the STarT-G will enable Japanese clinicians to use this survey as a screening tool for detecting DCP. The STarT-G is simple and fast, which suggests that it may facilitate screening for DCP in the primary care setting in Japan. We hope that using the STarT-G will ultimately ease the physical, social, and economic burdens of chronic pain in the Japanese population.

Disclosure

The authors report no conflicts of interest in this work.

References

  • Nakamura M, Nishiwaki Y, Ushida T, Toyama Y. Prevalence and characteristics of chronic musculoskeletal pain in Japan. J Orthop Sci. 2011;16(4):424–432. PMID: 21678085
  • Goldberg DS, McGee SJ. Pain as a global public health priority. BMC Public Health. 2011;11:770. PMID: 21978149
  • Guerriere DN, Choinière M, Dion D. The Canadian STOP-PAIN project - Part 2: what is the cost of pain for patients on waitlists of multidisciplinary pain treatment facilities? Can J Anaesth. 2010;57(6):549–558. PMID: 20414821
  • Lynch ME. The need for a Canadian pain strategy. Pain Res Manag. 2011;16(2):77–80. PMID: 21499581
  • Hill JC, Whitehurst DG, Lewis M. Comparison of stratified primary care management for low back pain with current best practice (STarT Back): a randomised controlled trial. Lancet. 2011;378(9802):1560–1571. PMID: 21963002
  • Leboeuf-Yde C, Gronstvedt A, Borge JA. The Nordic back pain subpopulation program: demographic and clinical predictors for outcome in patients receiving chiropractic treatment for persistent low back pain. J Manipulative Physiol Ther. 2004;27(8):493–502. PMID: 15510092
  • Dunn KM, Croft PR. Classification of low back pain in primary care: using “bothersomeness” to identify the most severe cases. Spine (Phila Pa 1976). 2005;30:1887–1892. PMID: 16103861
  • STarT Back Screening Tool Website. Available from: https://www.keele.ac.uk/sbst/startbacktool/usingandscoring/. Accessed February 17, 2017.
  • Matsudaira K, Kikuchi N, Kawaguchi M. Development of a Japanese version of the STarT (Subgrouping for Targeted Treatment) Back screening tool: translation and linguistic validation. J Musculoskel Pain Res. 2013;5:11–19. Japanese.
  • Guillemin F, Bombardier C, Beaton D. Cross-cultural adaptation of health-related quality of life measures: literature review and proposed guidelines. J Clin Epidemiol. 1993;46(12):1417–1432. PMID: 8263569
  • Suzukamo Y, Kumano H. Psychometrics. In: Ikegami N, Fukuhara S, Shimozuma K, Ikeda S, editors. QOL Evaluation Handbook for Clinical Diagnosis. Tokyo: Igaku Shoin; 2001:8–13. Japanese.
  • Wild D, Grove A, Martin M. Principles of good practice for the translation and cultural adaptation process for patient-reported outcomes (PRO) measures: report of the ISPOR Task Force for translation and cultural adaptation. Value Health. 2005;8(2):94–104. PMID: 15804318
  • Matsudaira K, Oka H, Kikuchi N, Haga Y, Sawada T, Tanaka S. Psychometric properties of the Japanese version of the STarT Back Tool in patients with low back pain. PLoS One. 2016;11(3):e0152019. PMID: 27002823
  • Derogatis LR, Melisaratos N. The Brief Symptom Inventory: an introductory report. Psychol Med. 1983;13(3):595–605. PMID: 6622612
  • Matsudaira K, Inuzuka K, Kikuchi N. Development of the Japanese version of the brief symptom inventory-somatization scale: translation and linguistic validation. Orthop Surg. 2012;63:149–153. Japanese.
  • Matsudaira K, Palmer KT, Reading I, Hirai M, Yoshimura N, Coggon D. Prevalence and correlates of regional pain and associated disability in Japanese workers. Occup Environ Med. 2011;68(3):191–196. PMID: 20833762
  • Derogatis LR, Melisaratos N. The Brief Symptom Inventory: an introductory report. Psychol Med. 1983;13(3):595–605. PMID: 6622612
  • Terwee CB, Bot SD, de Boer MR. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;60(1):34–42. PMID: 17161752
  • Nunnally JC. Psychometric Theory. 2nd ed. New York: McGraw-Hill; 1978.
  • Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Hillsdale: Lawrence Erlbaum Associates; 1988.
  • Hosmer DW, Lemeshow S. Assessing the fit of the model. In: Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York: Wiley; 2000:143–202.
  • Pincus T, McCracken LM. Psychological factors and treatment opportunities in low back pain. Best Pract Res Clin Rheumatol. 2013;27(5):625–635. PMID: 24315144
  • Nicholas MK, Linton SJ, Watson PJ. Early identification and management of psychological risk factors (“yellow flags”) in patients with low back pain: a reappraisal. Phys Ther. 2011;91(5):737–753. PMID: 21451099
  • Kendall NA, Linton SJ, Main CJ. Guide to Assessing Psychosocial Yellow Flags in Acute Low Back Pain: Risk Factors for Long-term Disability and Work Loss. Wellington, New Zealand: Accident Rehabilitation and Compensation Insurance Corporation of New Zealand and the National Health Committee; 1997.
  • Statistics Bureau, Ministry of Internal Affairs and Communication [webpage on the Internet]. Population Census and Labour Force Survey. 2011. Available from: www.stat.go.jp; http://www.stat.go.jp/data/index.htm. Accessed October 4, 2011.
  • Nunnally JC, Bernstein IH. Psychometric Theory. 3rd ed. New York: McGraw-Hill; 1994.