Original Research

Reproducibility of measurements and variability of the classification algorithm of Stratus OCT in normal, hypertensive, and glaucomatous patients

Pages 139-145 | Published online: 25 Nov 2022

Abstract

Purpose:

To assess the reproducibility of retinal nerve fiber layer (RNFL) measurements and the variability of the probabilistic classification algorithm in normal, hypertensive and glaucomatous eyes using Stratus optical coherence tomography (OCT).

Methods:

Forty-nine eyes (13 normal, 17 ocular hypertensive [OHT] and 19 glaucomatous) of 49 subjects were included in this study. RNFL thickness was measured with Stratus OCT using the standard RNFL Thickness (3.4) protocol. Three different images of each eye were taken consecutively during the same session. To evaluate OCT reproducibility, the coefficient of variation (COV) and intraclass correlation coefficient (ICC) were calculated for the average thickness (AvgT), superior average thickness (Savg), and inferior average thickness (Iavg) parameters.

The variability of the results of the probabilistic classification algorithm, based on the OCT normative database, was also analyzed. The percentage of eyes with changes in the category assigned was calculated for each group.

Results:

The 50th percentile of COV was 2.96%, 4.00%, and 4.31% for AvgT, Savg, and Iavg, respectively. The glaucoma group presented the largest COV for all three parameters (3.87%, 5.55%, 7.82%). ICCs were greater than 0.75 for almost all measures (except for the inferior thickness parameter in the normal group; ICC = 0.64, 95% CI 0.334–0.857).

Regarding the probabilistic classification algorithm for the three parameters (AvgT, Savg, Iavg), the percentage of eyes without color-code category changes among the three images was as follows: normal group, 100%, 84.6%, and 92.3%; OHT group, 82%, 70.6%, and 76.5%; and glaucoma group, 89.5%, 52.7%, and 79%, respectively. A probabilistic category switch from pathologic to normal or vice versa was observed in three eyes (15.8%) of the glaucomatous group for the Savg parameter and in two eyes of the OHT group: one eye (5.9%) for the AvgT and one eye (5.9%) for the Savg parameter.

Conclusions:

OCT RNFL measurements showed good reproducibility in normal, OHT, and glaucoma eyes. The probabilistic classification for the three main parameters showed some variability, especially in the glaucoma and OHT groups. Therefore, one isolated category result should be interpreted with caution before clinical classification of the patient.

Introduction

Primary open-angle glaucoma (POAG) is an acquired progressive optic neuropathy, characterized by damage of retinal ganglion cells leading to loss of visual function [1]. In clinical practice, the diagnosis of POAG and the determination of glaucomatous progression are based on a characteristic appearance of the optic disc [2] and typical visual field (VF) changes. There is evidence of a quantitative structure–function relationship [3], but it is not linear, and a relatively large proportion of ganglion cells must be lost before the changes exceed normal variability. In fact, a statistically significant visual field abnormality occurs only after 25% to 35% of ganglion cells have died [4]. In this regard, devices such as optical coherence tomography (OCT) have been developed to detect and quantify early retinal nerve fiber layer (RNFL) loss.

The third-generation OCT, Stratus OCT (software version A2, Carl Zeiss Meditec Inc, Dublin, CA), is a noncontact and noninvasive imaging technique that obtains cross-sectional images of the retina with a resolution of 8–10 μm [5]. Several studies have reported that the Stratus OCT with its internal normative database shows high sensitivity and specificity for diagnosing glaucoma [6–8]. Recently, the screening capability of Stratus OCT 3 for diagnosing early glaucoma has also been evaluated [9], showing moderate sensitivity with high specificity.

The next step in assessing the clinical usefulness of an imaging device consists of determining its ability to detect progression, which is strongly dependent on the reproducibility of the measurements obtained. No imaging device or functional test can detect changes that are smaller than its own variability. Different studies have already evaluated the reproducibility of RNFL measurements using the previous generations of OCT [10,11] and the Stratus OCT [12–14], demonstrating excellent and only slightly different reproducibility results for both instruments.

Moreover, the Stratus OCT software allows the comparison of the RNFL thickness with a normative database and offers an automatic classification of each parameter into four color-code categories: a white band (5% of the normal population falls inside), a green band (90% of the normal population falls inside), a yellow band (4% of the normal population falls inside) and a red band (1% of the normal population falls inside). Clinically, the white band is considered above normal limits, the green band within normal limits, the yellow band borderline, and the red band outside normal limits. Since this color-coded classification is widely used in clinical practice as a complementary tool to diagnose and follow up ocular hypertensive (OHT) patients, glaucoma suspects, and glaucoma patients, it is useful and pertinent to know the variability of this classification. The assessment of intraobserver and intrasession variability allows the evaluation of the instrument under steady conditions, diminishing the potential influence of external factors such as patient conditions, operator, or the disease itself. To the best of our knowledge, the variability of the color-code classification implemented in the currently available instrument has not yet been reported.
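
The four-band scheme described above can be pictured as a percentile lookup. The following is only a minimal sketch, not the Stratus OCT implementation: the function name and the cut-point values in microns are hypothetical placeholders, since the actual normative limits are not given in this article.

```python
from bisect import bisect_right

# Hypothetical cut-points in microns for one RNFL parameter (NOT real Stratus values).
# They stand in for the 1st, 5th and 95th percentiles of a normative database.
P1, P5, P95 = 70.0, 80.0, 120.0

# Bands ordered from thinnest to thickest:
# red = lowest 1%, yellow = next 4%, green = middle 90%, white = top 5%.
BANDS = ["red", "yellow", "green", "white"]

def color_code(thickness_um, cuts=(P1, P5, P95)):
    """Return the color band for a thickness value, given ascending cut-points."""
    return BANDS[bisect_right(cuts, thickness_um)]

# Two values only 0.2 microns apart can land in different bands.
print(color_code(79.9), color_code(80.1))
```

With fixed cut-points, a value sitting just below or just above a limit changes band with a very small change in measured thickness, which is the behavior examined in this study.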

The purpose of this study is to determine the intraobserver and intrasession reproducibility of RNFL measurements and the variability of the probabilistic classification algorithm offered by Stratus OCT for normal, OHT, and glaucoma patients.

Material and methods

Design

Cross-sectional study with prospective sampling.

Subjects

Forty-nine patients were consecutively recruited from the outpatient ophthalmic clinic. Informed consent was obtained from all participants and the study was approved by the Ethical Committee of Universidad de Valladolid.

Three groups of subjects were enrolled in our study: normal, OHT, and glaucoma. One eye per patient (the right eye) was selected for inclusion, except in cases in which only one eye met our inclusion criteria. All subjects underwent a complete ophthalmic evaluation, including visual acuity testing, intraocular pressure measured by Goldmann applanation tonometry, anterior biomicroscopy, gonioscopy, posterior segment biomicroscopy under dilation (two drops of 1% tropicamide), and optic nerve head photography (Topcon IMAGEnet™ 2000 FA/ICG System, Topcon American Corporation, NJ).

Visual fields were performed with the Humphrey Visual Field Analyzer (Carl Zeiss Meditec, CA, USA) using the 24-2 SITA Standard protocol. Inclusion criteria for normal subjects were a best-corrected visual acuity of 20/30 or better, intraocular pressure (IOP) under 21 mmHg, a normal visual field, a normal-appearing optic nerve head, and absence of any ophthalmic disease except for mild cataract. OHT patients had a best-corrected visual acuity of 20/30 or better, IOP over 21 mmHg on more than two occasions, normal slit-lamp examination, normal visual field, normal optic disc, and no evidence of other ophthalmic diseases. A normal visual field was defined as a mean deviation (MD) and pattern standard deviation (PSD) within 95% confidence limits and a glaucoma hemifield test result “within normal limits”. Glaucoma patients were included if baseline IOP was over 21 mmHg on more than two occasions and both the optic nerve and the visual field were glaucomatous. The optic nerve was considered glaucomatous if a rim notch, a cup-to-disc ratio >0.7 with alteration of the inferior-superior-nasal-temporal (ISNT) rule, a disc hemorrhage, or an RNFL defect was detected. The visual field was defined as glaucomatous according to Anderson’s criteria [15], in which at least one of the following was present: 1) a cluster of at least three points in the pattern deviation probability plot, located in areas typical of glaucoma, having a probability level of p ≤ 5%, with at least one point having a probability level of p ≤ 1%; 2) a PSD with a probability level of p ≤ 5%; or 3) glaucoma hemifield test results outside normal limits. This condition had to be present in two consecutive reliable visual fields. Reliability criteria for visual fields were false-positive and false-negative responses of <25% and fixation losses of <20%.
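
The visual field rules above can be read as a simple predicate. The sketch below is an illustration only: the `VisualField` record and its field names are hypothetical, and it assumes that the cluster, PSD, and hemifield findings have already been read from the perimeter printout.

```python
from dataclasses import dataclass

@dataclass
class VisualField:
    cluster_size_p5: int        # points in the largest cluster at p <= 5% in a typical glaucoma area
    cluster_has_p1_point: bool  # at least one point of that cluster at p <= 1%
    psd_p: float                # probability level of the PSD as a fraction (0.05 means 5%)
    ght_outside_normal: bool    # glaucoma hemifield test "outside normal limits"
    false_pos: float            # false-positive rate as a fraction
    false_neg: float            # false-negative rate as a fraction
    fixation_loss: float        # fixation-loss rate as a fraction

def is_reliable(vf):
    """Reliability: false positives and negatives < 25%, fixation losses < 20%."""
    return vf.false_pos < 0.25 and vf.false_neg < 0.25 and vf.fixation_loss < 0.20

def meets_anderson(vf):
    """At least one of the three criteria listed in the text is present."""
    cluster = vf.cluster_size_p5 >= 3 and vf.cluster_has_p1_point
    return cluster or vf.psd_p <= 0.05 or vf.ght_outside_normal

def field_is_glaucomatous(first, second):
    """The abnormality must appear in two consecutive reliable fields."""
    return all(is_reliable(vf) and meets_anderson(vf) for vf in (first, second))
```

The `any`-of-three logic combined with the two-field confirmation is what keeps a single abnormal or unreliable test from classifying an eye as glaucomatous.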

Exclusion criteria were a best-corrected visual acuity worse than 20/30, angle abnormalities on gonioscopy, other intraocular diseases (secondary glaucoma, diabetic retinopathy, age-related macular degeneration, acute anterior segment diseases, etc), other diseases affecting the visual fields, or a history of intraocular surgery (except for uncomplicated cataract surgery). Subjects with unreliable visual fields or without good-quality OCT images were not eligible.

OCT measurements

All eyes were scanned by a single operator with Stratus OCT (model 3000, software version A2, Carl Zeiss Meditec Inc, Dublin, CA) using the protocol that measures RNFL thickness along a circle 3.4 mm in diameter centered on the optic disc. Image acquisition was performed as follows: pupils were dilated with two instillations of 1% tropicamide; the subject received fixation instructions; the 3.4 mm scan circle was positioned around the disc by an experienced operator; and finally, the image was acquired and saved. Three different OCT images using the Standard RNFL Scan protocol were obtained during the same visit, without breaks between measurements. Image quality was checked by an independent observer using the following criteria: the fundus image had to be centered and clear enough to show the optic disc and the scan circle; the different retinal layers had to be present in the image, specifically the red bands of the retinal pigment epithelium (RPE) and RNFL had to be visible; and there could be no missing or blank area in the scan pattern. The OCT images were automatically analyzed with the Stratus OCT software (v. A2) to quantitatively assess retinal morphology. RNFL thickness was assessed globally (0°–359°) and in two retinal regions: superior (46°–135°) and inferior (226°–315°).

Statistical analysis

A repeated-measures analysis of variance was performed for each of the three RNFL parameters (superior average, inferior average, and average thickness). We determined the intraclass correlation coefficient (ICC) and coefficient of variation (COV) as measures of reproducibility for each variable and group. COV was calculated as the standard deviation divided by the mean thickness and compared among the groups with the Wilcoxon/Kruskal–Wallis test. The ICC was the ratio of the intersubject component of the variance to the total variance. Linear regression analysis was performed to assess the correlation between age and the reproducibility of RNFL measurements. Finally, the percentage of cases in each of the three groups with changes in the OCT probabilistic classification was calculated for each parameter.
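
The two reproducibility measures can be sketched in a few lines. This is an illustration under stated assumptions rather than the analysis actually run: the article does not say which ICC model or software was used, so a one-way random-effects ICC estimated from ANOVA mean squares is assumed here, the sample standard deviation is assumed for COV, and the thickness values are made up.

```python
from statistics import mean, stdev

def cov_percent(scans):
    """Coefficient of variation (%) of repeated scans of one eye: SD / mean * 100."""
    return stdev(scans) / mean(scans) * 100.0

def icc_oneway(eyes):
    """One-way random-effects ICC for a list of eyes, each with k repeated scans.

    Estimates (between-subject variance) / (total variance) from ANOVA mean squares.
    """
    n, k = len(eyes), len(eyes[0])
    grand = mean(x for eye in eyes for x in eye)
    ms_between = k * sum((mean(eye) - grand) ** 2 for eye in eyes) / (n - 1)
    ms_within = sum((x - mean(eye)) ** 2 for eye in eyes for x in eye) / (n * (k - 1))
    var_between = (ms_between - ms_within) / k
    return var_between / (var_between + ms_within)

# Made-up thickness values in microns, three scans per eye (illustration only).
eyes = [[90.0, 92.0, 91.0], [60.0, 63.0, 58.0], [100.0, 99.0, 102.0]]
print([round(cov_percent(eye), 2) for eye in eyes])
print(round(icc_oneway(eyes), 3))
```

COV describes the scatter within a single eye relative to its mean, whereas the ICC compares that within-eye scatter with the spread between eyes, so a homogeneous group can show a lower ICC even when its COV is small.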

Results

Forty-nine eyes of 49 patients (13 normal, 17 OHT, and 19 glaucomatous) were included in this study. There were 26 women and 23 men.

The mean (± SD) age was 54.17 ± 13.03 years in the normal group, 57.88 ± 13.44 years in the OHT group, and 64.72 ± 14.38 years in the glaucoma group (differences not statistically significant).

Mean RNFL thickness (±SD) was significantly thinner in glaucomatous eyes, 58.88 (±21.31) microns, compared with 87.13 (±9.60) microns in OHT eyes and 91.82 (±10.24) microns in normal eyes (p < 0.001). RNFL thickness COVs for normal, OHT, and glaucomatous eyes are presented in Table 1. The 50th percentile of global COV was 2.96%, 4.00%, and 4.31% for average thickness (AvgT), superior average thickness (Savg) and inferior average thickness (Iavg), respectively. When we looked at potential differences in COV among groups using the Wilcoxon/Kruskal–Wallis test, we found that COV was significantly higher in glaucoma eyes than in the other two groups for AvgT and Iavg, but not for Savg (p = 0.036, p = 0.045, p > 0.05, respectively). We also analyzed the potential relation between age and COV with the Spearman test and found a statistically significant linear correlation (p < 0.05) for Savg and Iavg (r = 0.379 and r = 0.336, weak correlation).

Table 1 Coefficient of variation by diagnosis (%)

The ICCs for normal, OHT and glaucomatous eyes are presented in Table 2. ICCs were greater than 0.75 for almost all measures (except for the Iavg in the normal group; ICC = 0.64, 95% CI 0.334–0.857) and most were in the 0.80 to 0.90 range or higher.

Table 2 Intraclass correlation coefficient by diagnosis

The distribution of the color-code classification in our three groups is shown in Tables 3a, 3b, and 3c. In the normal and hypertensive groups the most frequently observed category was green, while in the glaucoma group the color most often shown by our patients was red. In the normal group we did not find any patient with a red category in any parameter, but we found two measurements with a red category in the hypertensive group, one for the Savg parameter and one for the AvgT parameter.

Table 3a Color-code distribution for the three different measurements and parameters in the normal group

Table 3b Color-code distribution for the three different measurements and parameters in the OHT group

Table 3c Color-code distribution for the three different measurements and parameters in the glaucoma group

We also studied the variability of this probabilistic classification performed by the OCT3 algorithm. Regarding the AvgT parameter (Table 4a), in the normal group we did not find any patient with a category change. In the OHT group and in the glaucoma group we found changes of one category (green to yellow or yellow to red, or vice versa) in 11.8% and 10.6% of patients, respectively. We also found a two-category switch (green to red or vice versa) in one of the 17 patients in the OHT group. When we analyzed the Savg parameter (Table 4b) we found a high percentage of changes in the color-coded classification in the glaucoma group: 31.6% of patients had a change of one category and 15.8% of two categories. The OHT group showed a one-category change in 23.6% of patients and a two-category change in one patient. The normal group had no category changes in 84.6% of the patients. For the Iavg parameter (Table 4c), in the normal group the category classification did not change in 92.3% of the patients. We observed a one-category change in 23.6% and 15.8% of patients in the OHT group and the glaucoma group, respectively. The only change of two categories for the Iavg parameter was found in a patient from the glaucoma group (5.3%).
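
The counting behind these percentages can be sketched as follows. This is a hedged illustration: the function names are hypothetical, treating the white band as a fourth ordered step is an assumption, and the labels are made up rather than taken from the study data.

```python
from collections import Counter

# Ordering of color bands from thinnest to thickest (treating "white" as a
# fourth step is an assumption made for this sketch).
RANK = {"red": 0, "yellow": 1, "green": 2, "white": 3}

def category_shift(labels):
    """Largest number of category steps separating any two scans of one eye."""
    ranks = [RANK[label] for label in labels]
    return max(ranks) - min(ranks)

def shift_percentages(eyes):
    """Map each shift size (0, 1, 2, ...) to the percentage of eyes showing it."""
    counts = Counter(category_shift(labels) for labels in eyes)
    return {shift: round(100.0 * n / len(eyes), 1) for shift, n in sorted(counts.items())}

# Made-up labels for four eyes, three scans each (illustration only).
group = [
    ("green", "green", "green"),
    ("green", "yellow", "green"),
    ("red", "red", "yellow"),
    ("green", "red", "green"),
]
print(shift_percentages(group))
```

As a check on the arithmetic, in a group of 19 eyes a two-category change in 3 eyes corresponds to 15.8%, and a one-category change in 6 eyes to 31.6%.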

Table 4a Variability of color-coded classification for average thickness parameter: category changes (number of patients in parentheses)

Table 4b Variability of color-coded classification for superior thickness parameter: category changes (number of patients in parentheses)

Table 4c Variability of color-coded classification for inferior thickness parameter: category changes (number of patients in parentheses)

Discussion

Although OCT3 can measure RNFL thickness at a resolution of 8 to 10 microns with established validity, its role in diagnosis of early glaucoma and in detection of progression is not clearly defined.

Reproducibility of RNFL measurements is the basis for the applicability of Stratus OCT as a diagnostic tool in clinical practice. Several studies have evaluated the reproducibility of Stratus OCT [12–14]. Budenz and colleagues [12] showed high intraobserver and intrasession reproducibility of OCT RNFL thickness measurements, using a method of image acquisition very similar to the one applied in our study (three scans per patient during the same day) but with different inclusion criteria for the glaucoma group (it also included suspected glaucoma eyes). In that study, the ICC results were similar in normal and glaucomatous eyes and, although better results were obtained with the standard scan than with the fast scan protocol, even the lowest ICC (95% CI) was greater than 0.70, indicating excellent reproducibility of all measurements. We obtained similar ICC results for almost all parameters, except for the inferior sector in the normal group (ICC = 0.64, 95% CI 0.334–0.857); this difference could be attributed to the smaller sample of our study (49 versus 157 patients). In fact, Gurses-Ozden and colleagues [16] showed that the reproducibility of RNFL measurements can be improved by increasing the sampling density or the number of scans performed. In another reproducibility study performed with only ten normal eyes [13], the ICC values obtained were 0.83, which is moderately lower, possibly because of the small number of patients included and because the RNFL was measured on three occasions that could be up to five months apart. Whether five months, a relatively short period of time, is sufficient to allow significant RNFL thickness changes, or whether other factors are also responsible for the greater variability, is not clear.

Although reproducibility of any device is indispensable, clinically we are concerned about the capability of the OCT to differentiate glaucoma from normal patients, especially in early stages. OCT diagnostic accuracy depends on the comparison of the individual results with its database. Different studies have demonstrated the capability of differentiating healthy from glaucomatous eyes, analyzing sensitivity with an earlier-version OCT [8,17–19] and with Stratus OCT [6,8,9,20,21]. This sensitivity also depends on the stage of the glaucoma. In fact, Kim and colleagues [20] demonstrated that in preperimetric RNFL damage Stratus OCT has low sensitivity for almost all parameters. The highest sensitivity was only 40.8% and was achieved using the parameter of one clock hour abnormal at the 5% level. This sensitivity is low in contrast with previous studies of eyes with manifest VF defects [6].

In clinical practice we use the color-coded classification, which allows a quicker, more intuitive and more convenient RNFL evaluation than the absolute thickness in microns. Theoretically, this classification algorithm should be able to detect structural injury preceding visual field defects and should be useful for detecting structural progression, but this, of course, will depend on its own reproducibility and variability. Therefore, the next crucial step is to assess the reproducibility of the probabilistic classification, which depends on two main factors. The first is the proximity of each individual thickness value to the border of the category in which it falls. The algorithm uses specific cut-points to classify a given value as normal, borderline or glaucoma by comparing it with a normative database. This normative database was derived from measurements of 328 normal subjects comprising 205 (63%) White, 79 (24%) Hispanic, 27 (8%) Black, and 11 (3%) Asian people, and it is very important since the OCT classification algorithm is based on it [5]. This factor is equally important and present in any device with a classification algorithm based on probabilistic criteria. The second is the variability of the measurements given by the instrument, which is specific to each imaging device. If the variability is large, the range of thickness measurements obtained is wider and the chance that those values jump from one side of the cut-point to the other is correspondingly greater, so the comparison with the normative database and its probabilistic classification would be variable and clinically less useful. Ideally, clinicians would like an instrument that always assigns the same category to a given eye unless significant progression occurs.
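
How these two factors interact can be illustrated with a small calculation. The sketch below is not part of the study: it assumes normally distributed, independent measurement error around a fixed true thickness, a single cut-point, and hypothetical values in microns.

```python
from math import erf, sqrt

def prob_below(cut_um, true_um, sd_um):
    """P(a single measurement < cut), assuming normally distributed measurement error."""
    return 0.5 * (1.0 + erf((cut_um - true_um) / (sd_um * sqrt(2.0))))

def prob_stable(cut_um, true_um, sd_um, n_scans=3):
    """P(all n independent scans fall on the same side of the cut-point)."""
    p = prob_below(cut_um, true_um, sd_um)
    return p ** n_scans + (1.0 - p) ** n_scans

# Hypothetical cut-point at 80 microns and a measurement SD of 3 microns.
for true_thickness in (80.0, 83.0, 86.0, 95.0):
    print(true_thickness, round(prob_stable(80.0, true_thickness, 3.0), 3))
```

Under these assumptions, an eye whose true thickness sits exactly on the cut-point keeps the same category across three scans only one time in four, while an eye far from the cut-point is almost always stable, even though the measurement variability is identical in both cases.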

In our study, AvgT was the most stable parameter in the color-code classification for all groups. In contrast, Savg was the most variable parameter for all groups, with the highest percentage of two-category changes. When we analyzed the variability results by group, the normal group presented the fewest category changes for every parameter. The glaucoma and OHT groups were more variable: only 52.7% of glaucoma patients were stable in their color-code category for the Savg parameter, three glaucomatous patients changed two categories for the Savg parameter, and one patient changed two categories for the AvgT parameter. In the OHT group, we found one-category changes in four patients for the superior parameter and in four patients for the inferior parameter, and two-category changes in one patient for the superior parameter and in one patient for the inferior parameter. Such changes could reflect worsening, or even theoretical improvement, of the disease, or may be a consequence of instrument–algorithm variability. In the present study, since all three images were taken on the same day, the most likely explanation is the latter, although some variation in patient position and scan circle placement between scans should be considered. Category changes are relatively frequent in the short term, especially in the glaucoma and OHT groups; the OHT group may be particularly variable because of its intermediate status, which could include some cases of preperimetric glaucoma. One explanation for this variability could be the characteristic distribution of the probabilistic database, with a wide range of RNFL thickness values inside the 95% of normality, a narrow range for pathologic thickness measurements, and an even narrower range for borderline ones. So, if an individual result falls close enough to a category cut-point, a small quantitative change is more likely to produce a large qualitative change: a category switch.

In this regard, our study has two limitations: first, the fact that the three images were taken on the same day and by the same operator may underestimate the variability found in clinical practice; second, at the time of the study the signal strength algorithm was not implemented in our instrument, and for this reason it could not be used for image quality assessment. Nevertheless, quality assessment was carefully performed with the method described above.

Since there is considerable intrasession variability in the classification algorithm, we have to be cautious when, over the longer term, we find a change in the classification from normal to borderline or outside normal limits that could be interpreted as progression of the disease. For this reason, one isolated category result should be interpreted with caution before clinical classification of the patient, and a specific follow-up algorithm is needed to adequately assess RNFL thickness over time with Stratus OCT.

The variability found in the probabilistic classification algorithm is compatible with good reproducibility of RNFL thickness measurements and with the clinical usefulness of the instrument, but it warrants caution when interpreting results and supports the recommendation of confirming them with more than one image. Determining the specific number of images needed to confirm the results would require another study with a different design.

Disclosure

The authors received financial support from Asociación para la Investigación en Glaucoma. None of the authors have any proprietary interest.

References

  1. Quigley HA. Neuronal death in glaucoma. Prog Retin Eye Res. 1999;18:39–57. PMID: 9920498.
  2. Jonas JB, Budde WM, Panda-Jonas S. Ophthalmoscopic evaluation of the optic nerve head. Surv Ophthalmol. 1999;43:293–320. PMID: 10025513.
  3. Harwerth RS, Quigley HA. Visual field defects and retinal ganglion cell losses in patients with glaucoma. Arch Ophthalmol. 2006;124:853–859. PMID: 16769839.
  4. Kerrigan-Baumrind LA, Quigley HA, Pease ME. Number of ganglion cells in glaucoma eyes compared with threshold visual field tests in the same persons. Invest Ophthalmol Vis Sci. 2000;41:747–748.
  5. Stratus OCT User Manual. Dublin, CA: Carl Zeiss Meditec; 2002. PN 55648 Rev. A 4/03, 1-1–1-3.
  6. Budenz DL, Michael A, Chang RT. Sensitivity and specificity of the StratusOCT for perimetric glaucoma. Ophthalmology. 2005;112:3–9. PMID: 15629813.
  7. Hoeing JW, Park KH, Kim TW. Diagnostic ability of optical coherence tomography with a normative database to detect localized retinal nerve fiber layer defects. Ophthalmology. 2005;112:2157–2163. PMID: 16290196.
  8. Hougaard JL, Heijl A, Bengston B. Glaucoma detection using different Stratus optical coherence tomography protocols. Acta Ophthalmol Scand. 2007;85:251–256. PMID: 17343690.
  9. Parikh RS, Parikh S, Sekhar GC. Diagnostic capability of optical coherence tomography (Stratus OCT 3) in early glaucoma. Ophthalmology. 2007;114:2238–2243. PMID: 17561260.
  10. Schuman JS, Hee MR, Puliafito CA. Quantification of nerve fiber thickness in normal and glaucomatous eyes using optical coherence tomography: a pilot study. Arch Ophthalmol. 1995;113:586–596. PMID: 7748128.
  11. Blumenthal EZ, Williams JM, Weinreb RN, Girkin CA, Berry CC, Zangwill LM. Reproducibility of nerve fiber layer thickness measurements by use of optical coherence tomography in normal and glaucomatous eyes. Ophthalmology. 2003;110:190–195. PMID: 12511365.
  12. Budenz DL, Chang RT, Huang X. Reproducibility of retinal nerve fiber thickness measurements using the Stratus OCT in normal and glaucomatous eyes. Invest Ophthalmol Vis Sci. 2005;46:2440–2443.
  13. Paunescu LA, Schuman JS, Price LL. Reproducibility of nerve fiber layer thickness and optic nerve head measurements using Stratus OCT. Invest Ophthalmol Vis Sci. 2004;45:1716–1724. PMID: 15161831.
  14. Schuman JS, Pedut-Kloizman T, Hertzmark E. Reproducibility of nerve fiber layer thickness measurements using optical coherence tomography. Ophthalmology. 1996;103:1889–1898. PMID: 8942887.
  15. Anderson DR, Patella VM. Automated Static Perimetry. St. Louis, MO: Mosby; 1999:152.
  16. Gurses-Ozden, Ishikawa H, Hoh ST. Increasing sampling density improves reproducibility of optical coherence tomography measurements. J Glaucoma. 1999;8:238–241. PMID: 10464731.
  17. Bowd C, Weinreb RN, Williams JM. The retinal nerve fiber layer thickness in ocular hypertensives, normal and glaucomatous eyes with optical coherence tomography. Arch Ophthalmol. 2000;118:22–26. PMID: 10636409.
  18. Greaney MJ, Hoffman DC, Garway-Heath DF. Comparison of optic nerve imaging methods to distinguish normal eyes from those with glaucoma. Invest Ophthalmol Vis Sci. 2002;43:140–145. PMID: 11773024.
  19. Nouri-Madahvi K, Hoffman D, Tannenbaum DP, Law SK, Caprioli J. Identifying early glaucoma with optical coherence tomography. Am J Ophthalmol. 2004;137:228–235. PMID: 14962410.
  20. Kim T, Park U, Park K, Kim DM. Ability of Stratus OCT to identify localized retinal nerve fiber layer defects in patients with normal standard automated perimetry results. Invest Ophthalmol Vis Sci. 2007;48:1635–1641. PMID: 17389494.
  21. Anton A, Moreno-Montañes J, Blázquez F, Álvarez A, Martín B, Molina B. Usefulness of optical coherence tomography parameters of the optic disc and the retinal nerve fiber layer to differentiate glaucomatous, ocular hypertensive, and normal eyes. J Glaucoma. 2007;16:1–8. PMID: 17224742.