Original Article

A reliability generalization meta-analysis of coefficient alpha and test–retest coefficient for the aging males’ symptoms (AMS) scale

Pages 244-253 | Received 09 Sep 2016, Accepted 06 Oct 2016, Published online: 22 Nov 2016

Abstract

Purpose: The aging males’ symptoms (AMS) scale is an instrument used to determine the health-related quality of life in adult and elderly men. The purpose of this study was to synthesize internal consistency (Cronbach’s alpha) and test–retest reliability for the AMS scale and its three subscales.

Methods: Of the 123 studies reviewed, 12 provided alpha coefficients which were then used in the meta-analyses of internal consistency. Seven of the 12 included studies provided test–retest coefficients, and these were used in the meta-analyses of test–retest reliability.

Results: The AMS scale had excellent internal consistency [α = 0.89 (95% CI 0.88–0.90)]; the mean alpha estimates across the AMS subscales ranged from 0.79 to 0.82. The AMS scale also had good test–retest reliability [r = 0.85 (95% CI 0.82–0.88)]; the test–retest reliability coefficients of the AMS subscales ranged from 0.76 to 0.83. There was significant heterogeneity among the included studies.

Conclusions: The AMS scale and the three subscales had fairly good internal consistency and test–retest reliability. Future psychometric studies of the AMS scale should report important characteristics of the participants, details of item scores, and test–retest reliability.

Introduction

The aging males’ symptoms (AMS) scale is an instrument which has been used for over 15 years to determine the level of health-related quality of life in adult and elderly men [Citation1]. The scale was designed as a self-administered scale to: (a) assess symptoms of aging (independent from those which are disease-related) between groups of males under different conditions, (b) evaluate the severity of symptoms over time and (c) measure changes pre- and post-androgen therapy [Citation1].

The AMS scale has been well accepted internationally and has been translated into more than 30 languages [Citation2–11]. Because the scale is used worldwide, one potential problem is that the symptoms of aging could be affected by factors such as culture, race, religion and economic status. If the reliability statistics of the AMS scale vary internationally, then its utility would be significantly limited with regard to pooling results of clinical studies among countries. Therefore, it is important to compare the reliability of the AMS scale among countries.

Daig et al. [Citation11] demonstrated that the AMS had good reliability among different countries; however, in their report, the reliability statistics of the AMS scale were based largely on samples from the Western world, and there were only two small samples from Asia, each of which had fewer than 30 participants. Such results might not be applicable in non-Western countries. Recently, several psychometric studies of the AMS scale in non-Western countries have been published [Citation2–8,12]. Therefore, it would be relevant to update the review of the reliability statistics of the AMS scale.

The objective of this systematic review and meta-analysis was to specifically evaluate the available evidence for the reliability of the AMS scale.

Methods

Instrument

The AMS scale is a 17-item polychotomous Likert-type scale. Respondents rate each item on a five-point response scale (1–5: none = 1; mild = 2; moderate = 3; severe = 4; and extremely severe = 5), so that the possible scores range from 17 to 85. Based on the total score, the severity of the symptoms is then classified as none/little (17–26), mild (27–36), moderate (37–49) or severe (50–85) [Citation1]. The AMS scale can be divided into three subscales: (1) psychological subscale (AMS-PSY) which contains depressed mood, burned out, increased irritability, anxiety and nervousness; (2) somatovegetative subscale (AMS-SOM) which contains impaired well-being, increased joint complaints, increased sweating, need for more sleep, increased sleep disturbance, muscular weakness and physical exhaustion; and (3) sexual subscale (AMS-SEX) which contains past peak, decreased beard growth, impaired sexual potency, fewer morning erections and disturbed libido.
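The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the stated score range and severity bands, not code from the study; the function names `ams_total_score` and `ams_severity` are hypothetical.

```python
def ams_total_score(item_scores):
    """Sum the 17 item ratings (each 1-5) into an AMS total score (17-85)."""
    if len(item_scores) != 17:
        raise ValueError("the AMS scale has 17 items")
    if any(s not in (1, 2, 3, 4, 5) for s in item_scores):
        raise ValueError("each item must be rated from 1 to 5")
    return sum(item_scores)


def ams_severity(total):
    """Map a total score to the severity bands given in the text."""
    if not 17 <= total <= 85:
        raise ValueError("total score must lie between 17 and 85")
    if total <= 26:
        return "none/little"
    if total <= 36:
        return "mild"
    if total <= 49:
        return "moderate"
    return "severe"
```

For example, a respondent who rates every item as "mild" (2) has a total of 34, which falls in the mild band.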

Literature search

We followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) Statement [Citation13]. We searched two electronic databases (PubMed and Google Scholar), using the search terms “Aging Males' symptoms scale” OR “Ageing Males' symptoms scale” AND “reliability”, in the title/abstract field without language restrictions from 1 January 1993 to 31 July 2016 to identify publications about the AMS scale. In addition, we screened references in the included studies and reviews, and hand-searched The Aging Male archives. Journal articles, dissertations, theses and electronically available unpublished manuscripts were included to ensure that all the available published and unpublished studies could be located in an attempt to control for publication bias.

Study selection

Two reviewers independently assessed publications for inclusion in the review. Discrepancies were resolved through discussion by the review team.

Psychometric variables analyzed and statistical methods used

All the statistical analyses were conducted using R version 3.3.1 (R Foundation for Statistical Computing, Vienna, Austria). Meta-analyses were conducted using the metafor package [Citation14]. The p values were two-tailed, and the significance level was set at 0.05.

Three primary variables were of interest in this psychometric meta-analysis: internal consistency, test–retest reliability and descriptive statistics (i.e. means and standard deviations) of age and scale scores. If means and standard deviations were not reported in the included studies, these statistics were then derived using median and range with the formulas provided by Hozo et al. [Citation15].
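The conversion from median and range to mean and standard deviation can be sketched as follows. The formulas and sample-size thresholds are written here as they are commonly quoted from Hozo et al.; treat them as an assumption to be checked against the original paper. The function names are hypothetical.

```python
import math


def hozo_mean(low, median, high, n):
    """Estimate the mean from the minimum, median and maximum."""
    if n <= 25:
        return (low + 2 * median + high) / 4
    # For larger samples the median itself is taken as the estimate.
    return median


def hozo_sd(low, median, high, n):
    """Estimate the standard deviation from the minimum, median and maximum."""
    if n <= 15:
        # Small samples: variance formula that uses all three summary values.
        variance = ((low - 2 * median + high) ** 2 / 4 + (high - low) ** 2) / 12
        return math.sqrt(variance)
    if n <= 70:
        return (high - low) / 4  # moderate samples: range / 4
    return (high - low) / 6      # large samples: range / 6
```

With a hypothetical study reporting ages with median 30 and range 20–60 in 50 men, this sketch would take the median as the mean and one quarter of the range as the SD.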

Cronbach’s alpha (α) was used for the internal consistency analyses. The raw α values were transformed using Bonett’s method [Citation16,Citation17] for analyses, and were back-transformed to α for presentation. Pearson’s product moment correlation coefficient r was used as the effect size for test–retest reliability and convergent validity analyses. Pearson’s rs were transformed into z values using Fisher's method [Citation18] for analyses, and were back-transformed to r for presentation. Assessments of homogeneity of α and r were conducted using random-effects (RE) models with restricted maximum-likelihood estimators, which assume that the selected studies were sampled from a larger set of studies, thus allowing for greater external generalizability than fixed-effects models. The 95% confidence interval (95% CI) was chosen to facilitate a simple assessment of whether an effect size was greater than zero.
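The two transformations named above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: Bonett's transformation is written as ln(1 − α) with approximate variance 2k/((k − 1)(n − 2)), where k is the number of items and n the sample size, and Fisher's z uses the usual variance 1/(n − 3). Sign conventions for the Bonett transformation differ between implementations, and the function names are hypothetical.

```python
import math


def bonett_transform(alpha, n, k):
    """Return (L, var) with L = ln(1 - alpha) and var = 2k / ((k - 1)(n - 2))."""
    return math.log(1 - alpha), 2 * k / ((k - 1) * (n - 2))


def bonett_back(L):
    """Back-transform a Bonett-transformed value to coefficient alpha."""
    return 1 - math.exp(L)


def fisher_z(r, n):
    """Return (z, var) with z = atanh(r) and var = 1 / (n - 3)."""
    return math.atanh(r), 1 / (n - 3)


def fisher_back(z):
    """Back-transform Fisher's z to a correlation coefficient."""
    return math.tanh(z)
```

Pooling is done on the transformed values, and only the pooled estimate and its confidence limits are passed through `bonett_back` or `fisher_back` for presentation.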

To estimate the heterogeneity in effect sizes, the degree of inconsistency (I²) was calculated and interpreted as follows: 0% meant no inconsistency, 25% low inconsistency, 75% high inconsistency and 100% total inconsistency [Citation19]. I² values greater than 75% implied significant heterogeneity, indicating that meta-regression analyses should be conducted for mediator and moderator variables.
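To make the heterogeneity statistic concrete, the sketch below computes Cochran's Q, the Q-based I² and a random-effects pooled estimate. Note the swapped-in technique: it uses the DerSimonian–Laird moment estimator of the between-study variance because it has a closed form, whereas the study used restricted maximum likelihood in metafor, so the numbers it produces would not match the paper exactly. The function name and return keys are hypothetical.

```python
import math


def random_effects_dl(effects, variances):
    """Pool (transformed) effect sizes with a DerSimonian-Laird RE model."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the Q-based I^2, expressed in percent.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Method-of-moments estimate of the between-study variance tau^2.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights, pooled estimate and 95% CI.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }
```

The inputs are expected on the transformed scale (for example Bonett-transformed α or Fisher's z), with the pooled value and CI limits back-transformed afterwards.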

Five methods were used to detect publication bias in the present study: funnel plot analysis, the trim and fill procedure [Citation20], Orwin’s fail-safe N [Citation21], Egger regression [Citation22] and the Begg rank test [Citation23]. If publication bias was possible, the method of Henmi and Copas [Citation24] was used to estimate the average true effect and corresponding CI in an RE model.
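Of the five methods, Orwin's fail-safe N is the simplest to write down. The sketch below uses the usual textbook form of the formula: the number of unlocated studies with a given average effect that would pull the mean effect down to a chosen criterion value. The criterion and the example numbers are hypothetical; the paper does not state the criterion it used.

```python
def orwin_failsafe_n(mean_effect, k, criterion, missing_effect=0.0):
    """Orwin's fail-safe N.

    mean_effect    -- mean effect size of the k located studies
    k              -- number of located studies
    criterion      -- effect size regarded as negligible (hypothetical choice)
    missing_effect -- assumed mean effect of the unlocated studies
    """
    if criterion <= missing_effect:
        raise ValueError("criterion must exceed the assumed missing effect")
    return k * (mean_effect - criterion) / (criterion - missing_effect)
```

With a hypothetical mean effect of 0.6 across 10 studies and a criterion of 0.2, roughly 20 null-result studies would be needed to bring the mean down to the criterion.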

Results

Literature search

We identified 10 publications in PubMed, 121 in Google Scholar, and 22 in The Aging Male. A total of 12 studies were eligible for further analyses (Figure 1). These studies were conducted in 16 countries between 1996 and 2016. The number of participants per study varied considerably (range 21–4633, median = 210).

Figure 1. Flowchart of the search strategy and selection process for identifying papers for meta-analysis.


Internal consistency

A total of 12 studies [Citation2–9,11,12,25,26] with a combined sample size of 12 747 participants reported coefficient α results for the AMS scale (Table 1). When all the studies were weighted and then averaged, α = 0.89 (95% CI 0.88–0.90) (Figure 2). One study [Citation25] reported α results for only the AMS scale; the remaining 11 articles reported α results for all the AMS subscales, with a combined sample size of 12 135. The generalized α results for the three AMS subscales were as follows: α = 0.82 (95% CI 0.78–0.84) for the AMS-PSY (Figure 3); α = 0.81 (95% CI 0.79–0.83) for the AMS-SOM (Figure 4); and α = 0.79 (95% CI 0.77–0.82) for the AMS-SEX (Figure 5). Therefore, the pooled mean αs and the upper and lower limits of their CIs fell within the recommended range of 0.70–0.95 [Citation27]. With regard to the meta-analyses of α results, all the I² values for the AMS scale and its subscales were above 75%, indicating significant heterogeneity. We conducted a meta-regression analysis with α coefficients as dependent variables, and found that the standard deviation (SD) and coefficient of variation (CV) of the AMS total scores were statistically significant positive moderators of α for the AMS scale (Table 2). There was a trend in the association between the SD of the AMS total score and the α coefficients of the AMS scale (r = 0.55, t = 2.18, df = 11, p = 0.052). The CV of the AMS total score was correlated with the α coefficients of the AMS scale (r = 0.67, t = 2.96, df = 11, p = 0.01).
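The reported t statistics can be checked by hand. Assuming they were derived from the correlation coefficients through the standard relation between r and t (an assumption; the text does not state the formula), the value for the SD moderator works out as follows:

```latex
t = \frac{r\sqrt{df}}{\sqrt{1 - r^{2}}}
  = \frac{0.55\,\sqrt{11}}{\sqrt{1 - 0.55^{2}}}
  \approx \frac{1.824}{0.835}
  \approx 2.18
```

The same relation with r = 0.67 gives t ≈ 2.99, close to the reported 2.96; a gap of this size is what one would expect if r was rounded to two decimals before reporting.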

Figure 2. Forest plot of the alpha coefficients of the AMS scale using a RE model.


Figure 3. Forest plot of the alpha coefficients of the AMS-PSY using a RE model.


Figure 4. Forest plot of the alpha coefficients of the AMS-SOM using a RE model.


Figure 5. Forest plot of the alpha coefficients of the AMS-SEX using a RE model.


Table 1. Internal consistency of the AMS scale and its three subscales – alpha coefficients.

Table 2. Meta-regression analyses of the internal consistency of the AMS scale.

Publication bias was evaluated for the internal consistency of the AMS scale. Egger’s regression, the Begg rank test and the trim and fill procedure all suggested no publication bias (p > 0.05), and the fail-safe N was 69. Visual inspection of the funnel plots suggested no obvious causes for concern.

Test–retest reliability

A total of seven studies [Citation2–4,6,8,9,11] with a combined sample size of 594 reported test–retest r coefficients for the AMS scale (Table 3). These results were weighted and then combined to yield a test–retest reliability coefficient (rtt) of 0.85 (95% CI 0.82–0.88) for the AMS scale (Figure 6); rtt of 0.76 (95% CI 0.72–0.80) for the AMS-PSY (Figure 7); rtt of 0.83 (95% CI 0.77–0.87) for the AMS-SOM (Figure 8); and rtt of 0.83 (95% CI 0.77–0.88) for the AMS-SEX (Figure 9). The interval between the two measurements ranged from 1 week to 3 weeks, and was not associated with the test–retest reliability coefficients for the AMS scale and its three subscales (p > 0.05). With regard to the meta-analyses of test–retest reliability coefficients, all the I² values for the AMS scale and its subscales were below 75%, indicating non-significant heterogeneity.

Figure 6. Forest plot of the test–retest coefficients (r) of the AMS scale using a RE model.


Figure 7. Forest plot of the test–retest coefficients (r) of the AMS-PSY using a RE model.


Figure 8. Forest plot of the test–retest coefficients (r) of the AMS-SOM using a RE model.


Figure 9. Forest plot of the test–retest coefficients (r) of the AMS-SEX using a RE model.


Table 3. Test–retest reliability of the AMS scale and its three subscales – Pearson’s correlation coefficients.

Publication bias was evaluated for the test–retest reliability of the AMS scale. Neither Egger’s regression nor the Begg rank test suggested publication bias (p > 0.05), and the fail-safe N was 7. The trim and fill procedure suggested four missing studies on the right side of the funnel plot (p = 0.03). Based on the method of Henmi and Copas [Citation24], the estimate of rtt was 0.86 (95% CI 0.83–0.89) for the AMS scale.

Discussion

This meta-analytic review examined two types of reliability for the AMS scale. With regard to internal consistency, our meta-analysis showed that the generalized αs of the AMS scale and its subscales were good to excellent, although such estimates might have been inflated by significant heterogeneity. The results of a meta-regression indicated that the SD and CV of the AMS total scores were the most important predictors of internal consistency of the AMS. The relationships among SD, CV and internal consistency are consistent with classical test theory: as the observed score variance increases, the reliability also increases, provided that the error variance remains constant. Our α estimates should also be robust because the fail-safe number was 69, far larger than the number of included studies.
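The classical test theory argument can be written out explicitly. In the standard decomposition, with symbols introduced here for illustration, the observed score variance is the sum of true score variance and error variance, and reliability is the share of observed variance that is true score variance:

```latex
\sigma_X^{2} = \sigma_T^{2} + \sigma_E^{2},
\qquad
\rho_{XX'} = \frac{\sigma_T^{2}}{\sigma_X^{2}}
           = 1 - \frac{\sigma_E^{2}}{\sigma_X^{2}}
```

If the error variance is held constant, a larger observed variance makes the ratio of error to observed variance smaller, so the reliability moves closer to 1. This is why samples with a wider spread of AMS total scores would be expected to yield higher α values.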

Second, test–retest reliability, the consistency of measurement across time, is an important aspect of an instrument’s reliability. If an assessment instrument has low test–retest reliability, it is very difficult to distinguish changes in scores that reflect real changes in the severity of illness from variations in scores due to the instrument's low test–retest reliability. According to the criteria of Gliner et al. [Citation28], the test–retest reliability is high if it is above 0.80. Our meta-analysis demonstrated that the AMS scale, AMS-SOM and AMS-SEX had good test–retest reliability, whereas the AMS-PSY had slightly lower test–retest reliability (0.76). With regard to potential publication bias, the trim and fill procedure suggested some missing studies on the right side of the funnel plot of the AMS scale, and the fail-safe number was 7, indicating that the estimates could change as new studies are added. The test–retest reliability of the AMS scale might be under-estimated in RE models, so it would be better estimated as 0.86 (95% CI 0.83–0.89) using the method of Henmi and Copas [Citation24].

There were some difficulties in performing this meta-analysis. First, important data were often missing, i.e. many studies did not report the characteristics of the participants and other relevant information necessary for moderator analyses or transformations, e.g. the mean and standard deviation of age of participants, means and standard deviations of the AMS score and subscores, and the clinical status of participants. Second, some reliability statistics, such as item-total correlation coefficients, were generally not reported in the included studies.

Conclusions

Our meta-analyses demonstrated that the AMS scale and three subscales had fairly good internal consistency and test–retest reliability internationally. Future psychometric studies of the AMS scale should include important characteristics of the participants, details of item scores and test–retest reliability.

Ethical approval

This article contains two studies [Citation8,Citation12] with human participants performed by the authors. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed consent was obtained from all individual participants included in the two included studies that were performed by the authors.

Declaration of interest

All authors declare no conflicts of interest. The authors are responsible for the content and writing of this paper.

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Acknowledgements

We would like to thank Ms. Ya-Shan Chen for assistance with the managerial work of the present study.

References

  1. Heinemann LAJ, Zimmermann T, Vermeulen A, et al. A new 'aging males' symptoms' rating scale. Aging Male 1999;2:105–14.
  2. Yuen JW, Ng CF, Chiu PK, et al. “Aging males” symptoms and general health of adult males: a cross-sectional study. Aging Male 2016;19:71–8.
  3. Kong XB, Guan HT, Li HG, et al. The ageing males' symptoms scale for Chinese men: reliability, validation and applicability of the Chinese version. Andrology 2014;2:856–61.
  4. Ardebili HE, Khosravi S, Larijani B, et al. Psychometric evaluation of the Persian version of the 'aging male scales' questionnaire. Int J Prev Med 2014;5:1178–85.
  5. Ren YF, Wang B, Miao MH, et al. [Feasibility of the aging males' symptoms scale for the male population of Shanghai]. Zhonghua Nan Ke Xue 2013;19:418–21.
  6. Kobayashi K, Hashimoto K, Kato R, et al. The aging males' symptoms scale for Japanese men: reliability and applicability of the Japanese version. Int J Impot Res 2008;20:544–8.
  7. Akinyemi A, Bamiwuye O, Inathaniel T, et al. The Nigerian aging males' symptoms scale. Experience in elderly males. Aging Male 2008;11:89–93.
  8. Chen CY, Wang WS, Liu CY, Lee SH. Reliability and validation of a Chinese version of the aging males' symptoms scale. Psychol Rep 2007;101:27–38.
  9. Leungwattanakij S, Lersithichai P, Ratana-Olarn K. The aging males' symptoms rating scale: cultural and linguistic validation into Thai. J Med Assoc Thai 2003;86:1106–15.
  10. Heinemann LA, Saad F, Zimmermann T, et al. The aging males' symptoms (AMS) scale: update and compilation of international versions. Health Qual Life Outcomes 2003;1:15.
  11. Daig I, Heinemann LA, Kim S, et al. The aging males' symptoms (AMS) scale: review of its methodological characteristics. Health Qual Life Outcomes 2003;1:77.
  12. Lee CP, Chen Y, Jiang KH, et al. Development of a short version of the aging males' symptoms scale: Mokken scaling analysis and Rasch analysis. Aging Male 2016;19:117–23.
  13. Moher D, Liberati A, Tetzlaff J, et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 2009;6:e1000097.
  14. Viechtbauer W. Conducting meta-analyses in R with the metafor package. J Stat Softw 2010;36:48. doi: 10.18637/jss.v036.i03.
  15. Hozo SP, Djulbegovic B, Hozo I. Estimating the mean and variance from the median, range, and the size of a sample. BMC Med Res Methodol 2005;5:1–10.
  16. Bonett DG. Varying coefficient meta-analytic methods for alpha reliability. Psychol Methods 2010;15:368–85.
  17. Bonett DG. Sample size requirements for testing and estimating coefficient alpha. J Educ Behav Stat 2002;27:335–40.
  18. Fisher RA. On the “probable error” of a coefficient of correlation deduced from a small sample. Metron 1921;1:1–32.
  19. Higgins JP, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med 2002;21:1539–58.
  20. Duval SJ. The trim and fill method. In: Rothstein HR, Sutton AJ, Borenstein M, eds. Publication bias in meta-analysis: prevention, assessment and adjustments. Chichester, England: Wiley; 2005.
  21. Orwin RG. A fail-safe N for effect size in meta-analysis. J Educ Behav Stat 1983;8:157–9.
  22. Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ 1997;315:629–34.
  23. Begg CB, Mazumdar M. Operating characteristics of a rank correlation test for publication bias. Biometrics 1994;50:1088–101.
  24. Henmi M, Copas JB. Confidence intervals for random effects meta-analysis and robustness to publication bias. Stat Med 2010;29:2969–83.
  25. Mas M, Group EAS. Psychometric validation of the Spanish version of the aging males' symptoms scale (AMSS) in a population-based sample. J Sex Med 2008;5:106.
  26. Myon E, Martin N, Taieb C. The French aging males' symptoms (AMS) scale: methodological review. Health Qual Life Outcomes 2005;3:20.
  27. Bland JM, Altman DG. Cronbach's alpha. BMJ 1997;314:572.
  28. Gliner JA, Morgan GA, Leech NL. Research methods in applied settings: an integrated approach to design and analysis. 2nd ed. London: Routledge; 2011.
