METHODOLOGY

Studies on Reliability and Measurement Error of Measurements in Medicine – From Design to Statistics Explained for Medical Researchers

Pages 193-212 | Received 24 Nov 2022, Accepted 27 May 2023, Published online: 07 Jul 2023

References

  • Stenroth L, Sefa S, Arokoski J, Toyras J. Does magnetic resonance imaging provide superior reliability for Achilles and patellar tendon cross-sectional area measurements compared with ultrasound imaging? Ultrasound Med Biol. 2019;45(12):3186–3198. doi:10.1016/j.ultrasmedbio.2019.08.001
  • Mokkink LB, Boers M, van der Vleuten CPM, et al. COSMIN risk of bias tool to assess the quality of studies on reliability or measurement error of outcome measurement instruments: a Delphi study. BMC Med Res Methodol. 2020;20(1). doi:10.1186/s12874-020-01179-5
  • Mokkink LB, Terwee CB, Patrick DL, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. 2010;63(7):737–745. doi:10.1016/j.jclinepi.2010.02.006
  • de Vet HC, Terwee CB, Knol DL, Bouter LM. When to use agreement versus reliability measures. J Clin Epidemiol. 2006;59:1033–1039. doi:10.1016/j.jclinepi.2005.10.015
  • McGraw KO, Wong SP. Forming inferences about some intraclass correlation coefficients. Psychol Methods. 1996;1:30–46. doi:10.1037/1082-989X.1.1.30
  • Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86:420–428. doi:10.1037/0033-2909.86.2.420
  • Brennan RL. Generalizability theory. Statistics for Social Science and Public Policy. Springer-Verlag; 2001.
  • Shavelson RJ, Webb NM. Generalizability Theory: A Primer. Vol 1. Measurement Methods for the Social Sciences. Sage Publishing; 1991.
  • Bloch R, Norman G. Generalizability theory for the perplexed: a practical introduction and guide: AMEE Guide No. 68. Med Teach. 2012;34(11):960–992. doi:10.3109/0142159X.2012.703791
  • Eekhout I, Mokkink LB. ICC & SEM power: sample size decision assistant for studies on reliability and measurement error; 2022. Available from: https://iriseekhout.shinyapps.io/ICCpower/. Accessed June 21, 2023.
  • Mokkink LB, de Vet HCW, Diemeer S, Eekhout I. Sample size recommendations for studies on reliability and measurement error: an online application based on simulation studies. Health Serv Outcomes Res Method. 2022. doi:10.1007/s10742-022-00293-9
  • Rose M, Bjorner JB, Gandek B, Bruce B, Fries JF, Ware JE. The PROMIS physical function item bank was calibrated to a standardized metric and shown to improve measurement efficiency. J Clin Epidemiol. 2014;67(5):516–526. doi:10.1016/j.jclinepi.2013.10.024
  • Fischer JS, Jak AJ, Kniker JE, Rudick RA, Cutter G. Multiple Sclerosis Functional Composite (MSFC). Administration and Scoring Manual. National Multiple Sclerosis Society; 2001.
  • Holen JC, Saltvedt I, Fayers PM, Hjermstad MJ, Loge JH, Kaasa S. Doloplus-2, a valid tool for behavioural pain assessment? BMC Geriatr. 2007;7:29. doi:10.1186/1471-2318-7-29
  • Butland RJ, Pang J, Gross ER, Woodcock AA, Geddes DM. Two-, six-, and 12-minute walking tests in respiratory disease. Br Med J. 1982;284(6329):1607–1608. doi:10.1136/bmj.284.6329.1607
  • Ware JE, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992;30(6):473–483. doi:10.1097/00005650-199206000-00002
  • Aaronson NK, Muller M, Cohen PD, et al. Translation, validation, and norming of the Dutch language version of the SF-36 Health Survey in community and chronic disease populations. J Clin Epidemiol. 1998;51(11):1055–1068. doi:10.1016/s0895-4356(98)00097-3
  • Gellhorn AC, Carlson MJ. Inter-rater, intra-rater, and inter-machine reliability of quantitative ultrasound measurements of the patellar tendon. Ultrasound Med Biol. 2013;39(5):791–796. doi:10.1016/j.ultrasmedbio.2012.12.001
  • Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–163. doi:10.1016/j.jcm.2016.02.012
  • White E, Armstrong BK, Saracci R. Principles of Exposure Measurement in Epidemiology. Collecting, Evaluating, and Improving Measures of Disease Risk Factors. Oxford University Press; 2008.
  • Liljequist D, Elfving B, Skavberg Roaldsen K. Intraclass correlation - A discussion and demonstration of basic features. PLoS One. 2019;14(7):e0219854. doi:10.1371/journal.pone.0219854
  • Eekhout I. Agree: agreement and reliability between multiple raters. R package version 0.1.8. Available from: https://github.com/iriseekhout/Agree/. Accessed March 8, 2022.
  • Eekhout I, Mokkink LB. Estimating ICCs and SEMs with multilevel models. Available from: https://www.iriseekhout.com/r/agree/. Accessed January 20, 2022.
  • Streiner DL, Norman G. Health Measurement Scales: A Practical Guide to Their Development and Use. 4th ed. Oxford University Press; 2008.
  • de Vet HC, Terwee CB, Mokkink L, Knol DL. Measurement in Medicine. Practical Guides to Biostatistics and Epidemiology. Cambridge University Press; 2011.
  • Skeie EJ, Borge JA, Leboeuf-Yde C, Bolton J, Wedderkopp N. Reliability of diagnostic ultrasound in measuring the multifidus muscle. Chiropr Man Therap. 2015;23:15. doi:10.1186/s12998-015-0059-6
  • Kottner J, Audige L, Brorson S, et al. Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed. J Clin Epidemiol. 2011;64(1):96–106. doi:10.1016/j.jclinepi.2010.03.002
  • Gagnier JJ, Lai J, Mokkink LB, Terwee CB. COSMIN reporting guideline for studies on measurement properties of patient-reported outcome measures. Qual Life Res. 2021;30:2197–2218. doi:10.1007/s11136-021-02822-4
  • Demetrashvili N, Wit EC, van den Heuvel ER. Confidence intervals for intraclass correlation coefficients in variance components models. Stat Methods Med Res. 2016;25(5):2359–2376. doi:10.1177/0962280214522787
  • Efron B. Better bootstrap confidence intervals. J Am Stat Assoc. 1987;82(397):171–185. doi:10.1080/01621459.1987.10478410
  • Loy A, Korobova J. Bootstrapping clustered data in R using lmeresampler. arXiv. 2021;20:54.
  • de Vet HCW. Guide for the calculation of ICC in SPSS. Available from: http://www.clinimetrics.nl/images/upload/files/Chapter%205/chapter%205_5_Calculation%20of%20ICC%20in%20SPSS.pdf. Accessed July 14, 2021.
  • Brennan RL. urGENOVA. University of Iowa; 2021. Available from: https://education.uiowa.edu/research-centers/center-advanced-studies-measurement-and-assessment/computer-programs. Accessed June 21, 2023.
  • Terwee CB, Peipert JD, Chapman R, et al. Minimal important change (MIC): a conceptual clarification and systematic review of MIC estimates of PROMIS measures. Qual Life Res. 2021;30(10):2729–2754. doi:10.1007/s11136-021-02925-y
  • Zou GY. Sample size formulas for estimating intraclass correlation coefficients with precision and assurance. Stat Med. 2012;31:3972–3981. doi:10.1002/sim.5466