Clinical Ophthalmology, Volume 8, 2014

Open access

Original Research

Grader agreement, and sensitivity and specificity of digital photography in a community optometry-based diabetic eye screening program

Luckni Sellahewa (1, 2), Craig Simpson (2), Prema Maharajan (2), John Duffy (2), Iskandar Idris (3)

1 Diabetic Medicine Department, Nottingham University Hospitals, Nottingham, UK
2 North Nottinghamshire Eye Screening Service, Sherwood Forest Hospitals Foundation Trust, University of Nottingham, Nottingham, UK
3 Division of Medical Sciences and Graduate Entry Medicine, School of Medicine, University of Nottingham, Nottingham, UK

Correspondence: [email protected]

Pages 1345-1349 | Published online: 17 Jul 2014

Abstract

Background

Digital retinal photography with mydriasis is the preferred modality for diabetes eye screening. The purpose of this study was to evaluate agreement in grading levels between primary and secondary graders and to calculate their sensitivity and specificity for identifying sight-threatening disease in an optometry-based retinopathy screening program.

Methods

This was a retrospective study using data from 8,977 patients registered in the North Nottinghamshire retinal screening program. In all cases, the ophthalmology diagnosis was used as the arbitrator and considered to be the gold standard. Kappa statistics were used to evaluate the level of agreement between graders.

Results

Agreement between primary and secondary graders was 51.4% and 79.7% for detecting no retinopathy (R0) and background retinopathy (R1), respectively. For preproliferative (R2) and proliferative retinopathy (R3) at primary grading, agreement between the primary and secondary grader was 100%. Where there was disagreement between the primary and secondary grader for R1, only 2.6% (n=41) were upgraded by an ophthalmologist. The sensitivity and specificity for detecting R3 were 78.2% and 98.1%, respectively. None of the patients upgraded from any level of retinopathy to R3 required photocoagulation therapy. The observed kappa between the primary and secondary grader was 0.3223 (95% confidence interval 0.2937–0.3509), ie, fair agreement, and that between the primary grader and ophthalmology for R3 was 0.5667 (95% confidence interval 0.4557–0.6123), ie, moderate agreement.

Conclusion

These data provide information on the safety of a community optometry-based retinal screening program for screening as a primary and as a secondary grader. The level of agreement between the primary and secondary grader at a higher level of retinopathy (R2 and R3) was 100%. Sensitivity and specificity for R3 were 78.2% and 98.1%, respectively. None of the false-negative results required photocoagulation therapy.

Keywords:

retinopathy
screening
public health
community
optometry
diabetes

Introduction

Diabetic retinopathy is a highly specific microvascular complication of diabetes and the leading cause of blindness in people under the age of 60 years in industrialized countries.[1–4] Data from the Early Treatment of Diabetic Retinopathy Study showed that early laser treatment would be more than 90% effective in preventing blindness,[4] and as such, early detection of sight-threatening disease is crucial in preventing blindness in this group of patients. To this end, previous studies have shown the effectiveness of diabetes eye screening programs to prevent blindness in patients with diabetes.[2–9] The United Kingdom National Screening Committee therefore recommended a systematic population screening program,[10] which was implemented in 2003. As a result, the current National Health Service (NHS) Diabetic Eye Screening Programme is in place.[11]

Digital retinal photography with mydriasis is the preferred modality for diabetic eye screening based on its reported values for sensitivity and specificity,[12–15] and its ability to quality assure screening standards.[16,17] This modality of retinopathy screening fulfils the Exeter minimum standard for sensitivity and specificity of 80% and 95%, respectively, for robust and safe diabetic retinopathy screening.[18,19] Conventionally, this utilizes technicians to perform the primary grading, with secondary grading performed by more experienced screeners or clinicians, and arbitration grading performed by an ophthalmologist or a diabetologist with expertise in diabetic retinopathy screening. However, in selected screening programs, primary and secondary gradings are performed by trained opticians. Whilst data are available on the effectiveness of individual screening modalities,[10–13,17–19] there is currently only one study that has looked at the interobserver agreement between primary graders and an expert grader.[20] Information on the safety, effectiveness, and agreement between primary and secondary graders for images of patients undergoing routine diabetic eye screening in a community optometry-based retinopathy screening program has not yet been reported.

Materials and methods

The North Nottinghamshire diabetic retinopathy screening service has utilized an optometry-based model since April 2006 and involves 36 optometrists across 21 sites. Screening is undertaken by local optometrists, and two-field digital images of the retina are recorded in the database and graded. All models and makes of the retinal cameras in use, as well as their age, are approved based on criteria set by the NHS Diabetic Eye Screening Programme. Tropicamide 1% is used to dilate the pupils to an acceptable size for screening, which is performed according to a standard national screening protocol. Primary and secondary grading is carried out by optometrists on the digital retinal images, and a web-based referral to an ophthalmologist is required if there is disagreement between primary and secondary graders or if sight-threatening retinopathy is observed.

For this study, data were collected retrospectively between January 2011 and December 2011 from a cohort of 8,977 patients registered in an optometry-based retinal screening program database currently in place in North Nottinghamshire. These patients were reviewed by optometrists who carried out digital retinal photography. Images were stored in a web-based database and graded according to the national screening standard.[11] Grading levels were as follows: no retinopathy (R0), background retinopathy (R1), preproliferative retinopathy (R2), proliferative retinopathy (R3), and maculopathy (M1). Images with any retinopathy detected by a primary grader (R1, R2, M1) and 10% of images with no evidence of retinopathy (R0) were sent for secondary grading performed by another optometrist. If there was any disagreement between the primary and secondary grader, the images were sent to arbitration, which was performed by an ophthalmologist. The presence of proliferative retinopathy (R3) would require an urgent referral to ophthalmology. However, during 2011, due to an internal quality audit that was being undertaken, all patients with R1 were referred to the ophthalmologist for screening. Retinal images that were not gradable by the primary grader for reasons such as previous surgery or cataracts were referred directly to ophthalmology. Patients under ophthalmology follow-up were kept under ophthalmology review with follow-up appointments until their retinopathy was stable. The screening program also has in place a fail-safe mechanism (monitored by a fail-safe officer) whereby, on an ongoing basis, images of patients subsequently found to have R3 or to have undergone photocoagulation therapy are traced back to see whether this was missed during screening. No R3 was missed at screening during the period of this audit. Once the patients had stable retinopathy with no immediate intervention required, they were referred back into the local retinal screening recall process.

We calculated the agreement between the primary and secondary grader as well as between individual graders and ophthalmologists by means of Kappa statistics.[21] We also looked at the proportion of disagreement leading to an upgrading of the retinopathy level. Assessment of sensitivity and specificity values in this study was limited to images graded as R3, since all R3 are referred to an ophthalmologist for arbitration or a final grading. R3 grading from the primary grader was compared against the “gold standard” ophthalmological diagnosis. Sensitivity is calculated as true positives/(true positives + false negatives), while specificity is calculated as true negatives/(true negatives + false positives). This work is labeled as service evaluation. The audit work and data derived from this work are part of the program’s ongoing clinical governance exercise to maintain standards of retinopathy screening within the service. The statistical analysis was performed using SPSS version 14 software (SPSS Inc., Chicago, IL, USA).
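The two definitions above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the counts are hypothetical and are not taken from the study's SPSS analysis.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of reference-standard positives that the grader also called positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of reference-standard negatives that the grader also called negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not data from this study).
tp, fn, tn, fp = 40, 10, 180, 20

print(f"sensitivity = {sensitivity(tp, fn):.3f}")  # 40 / (40 + 10) = 0.800
print(f"specificity = {specificity(tn, fp):.3f}")  # 180 / (180 + 20) = 0.900
```

Note that the denominators are the reference-standard totals (here, the ophthalmology diagnosis), not the totals called positive or negative by the grader.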

Results

Of 8,977 patients (15,583 images), 734 patients were graded as R0 by the primary grader. Of these, 377 were graded as R0 by the secondary grader. This resulted in 51.4% agreement between the primary and secondary grader for patients graded as R0 at primary grading. The other 357 patients had no agreement between the primary and secondary grader. Of these, 4.8% (n=17) were downgraded and 3.6% (n=13) were upgraded by ophthalmology (Table 1).

Table 1 Percentage of agreement, disagreement, upgrading, and downgrading of images in the North Nottingham screening program

Background retinopathy grading (R1) was given to 7,784 patients by the primary grader and 1,448 of these were graded by ophthalmology. The level of agreement between primary and secondary graders in this group was 79.7% (n=6,204). Among these patients, agreement between the primary grader and ophthalmology was 15.5% (n=207), while agreement between the secondary grader and ophthalmology was 10.7% (n=835). For the proportion in which there was disagreement between the primary and secondary grader, 2.6% (n=41) were upgraded, of which 1% (n=16) were upgraded to R3 (Table 1). For the same proportion, 0.8% (n=13) were downgraded to a different grade by ophthalmology (Table 1). Where patients were graded R2 (n=210) at primary grading, agreement between the primary and secondary grader was 100% (Table 1); 207 of the 210 that were graded as R2 by the primary grader were graded by the secondary grader as well as ophthalmology. This was due to an internal quality assurance audit that was taking place in 2011.

Proliferative retinopathy (R3) was detected in 249 patients by the primary grader, but only 31.7% (79) of these were subsequently confirmed as R3 by ophthalmology. Of the total population screened (n=8,977), 8,728 were found not to have R3 by the primary grader, while 1,777 patients were confirmed by ophthalmology not to have R3. From these data, the sensitivity and specificity for R3 in our cohort is 78.2% and 98.1% (); 3.6% of normal (R0) and 2.6% of background retinopathy (R1) had a disagreement in grading, leading to an upgrading of retinopathy level by ophthalmology. Ten percent of images graded as R0 went through to ophthalmology for arbitration. Of these, there was no agreement between the primary and secondary grader, but there was 56.6% agreement between the primary grader and ophthalmology, and 36.6% agreement between the secondary grader and ophthalmology.
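The reported R3 figures can be cross-checked arithmetically. The text gives 249 R3 calls by the primary grader, of which 79 were confirmed (true positives), leaving 170 false positives. The false-negative and true-negative counts are not stated in the text. The values of 22 and 8,706 used below are inferred assumptions: they are the integers that sum to the 8,728 patients not graded R3 by the primary grader and reproduce the reported percentages, under the further assumption that all 8,728 are counted as screen negatives.

```latex
\begin{align*}
\text{Sensitivity} &= \frac{TP}{TP + FN} = \frac{79}{79 + 22} = \frac{79}{101} \approx 78.2\% \\
\text{Specificity} &= \frac{TN}{TN + FP} = \frac{8706}{8706 + 170} = \frac{8706}{8876} \approx 98.1\%
\end{align*}
```

Under these assumptions the four cells sum to 79 + 170 + 22 + 8,706 = 8,977, the size of the screened cohort.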

We used Kappa statistics to evaluate the level of agreement between primary and secondary graders and between primary and arbitration graders for R0–R2. There was an observed kappa of 0.3223 (95% confidence interval 0.2937–0.3509) and 0.269 (95% confidence interval 0.216–0.321), respectively (Tables 2 and 3). The level of agreement between the primary grader and ophthalmology for R3 using Kappa statistics gives an observed kappa of 0.5667 (95% confidence interval 0.4557–0.6123).
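For readers unfamiliar with the statistic, the sketch below shows how Cohen's kappa can be computed from a square agreement table such as a primary-by-secondary grader cross-tabulation. The table values are hypothetical, the function name `cohen_kappa` is illustrative, and the standard error is a simple large-sample approximation chosen here as an assumption; the source does not describe which interval method SPSS applied.

```python
import math
from typing import Sequence, Tuple


def cohen_kappa(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (kappa, approximate standard error) for a square agreement table.

    table[i][j] is the number of cases graded i by one grader and j by the other.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of cases on the diagonal.
    p_obs = sum(table[i][i] for i in range(k)) / n
    # Chance-expected agreement from the row and column marginals.
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_exp = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)
    # Simple large-sample approximation to the standard error (an assumption here).
    se = math.sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))
    return kappa, se


# Hypothetical 2x2 table for illustration only (not data from this study).
# Observed agreement is 0.70 and chance agreement is 0.50, so kappa is 0.40.
hypothetical = [[20, 5], [10, 15]]
kappa, se = cohen_kappa(hypothetical)
low, high = kappa - 1.96 * se, kappa + 1.96 * se
print(f"kappa = {kappa:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Kappa discounts the agreement that two graders would reach by chance given how often each uses each grade, which is why it can be much lower than raw percentage agreement when one grade dominates.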

Table 2 Agreement and disagreement for primary grader (horizontal axis) and secondary grader (vertical axis)

Table 3 Agreement and disagreement for primary grader (horizontal axis) and arbitration grader (vertical axis)

Discussion

For a systematic screening program to be effective, it needs a database that is robust and well maintained. The system currently in place in North Nottinghamshire uses a central call/recall center with ongoing quality assurance taking place at all stages of the process. In addition to their professional qualification registered by the General Optical Council which regulates dispensing opticians and optometrists, all screeners/graders would have undertaken a certificate for diabetic retinopathy screening by City and Guilds, as well as undergoing a test training set mandated by the NHS Diabetic Eye Screening Programme. During the period of the audit, one test training set was performed by the opticians. However, data for the intergrader agreement based on this exercise were not available. Although the national program recommended only 10% of R0 to be secondarily screened, we performed an internal audit for the year 2009–2010, where all R0 underwent secondary grading as a result of a quality assurance exercise recommended by the NHS Retinopathy Screening Programme. No sight-threatening retinopathy (R2 or higher) was identified.

The above study provides novel information on the safety and effectiveness of a community-based retinal screening program that uses optometrists at both the primary and secondary grader level compared with other optometry or nonoptometry-based programs that use senior graders, diabetologists, or ophthalmologists as secondary graders.

Evidence for the effectiveness of screening is based on evidence of treatment efficacy, especially after early detection, and on cost-effectiveness. Comparing this screening program with the Exeter standards,[18,19] ours achieved a specificity level above the expected 95% but the sensitivity level was marginally short of the recommended 80% threshold. Of note, the sensitivity data here refer to data analysis specific to R3 rather than data from the whole program. Moreover, it is conceivable that the slightly higher level of false-positives observed here reflects a slightly overcautious approach by optometrists to grading in patients with a higher likelihood of abnormalities in their eyes. In addition, image arbitration was performed by an ophthalmologist who may decide on the final “grade” based on clinical need for photocoagulation therapy rather than actual reporting of the images. Nevertheless, appropriate sensitivity and specificity for any screening modality have become more important in view of some recent evidence which may advocate for a different frequency of retinopathy screening for different individuals depending on the risk of retinopathy progression, based on baseline and/or previous screening results.[24] Despite a high false-negative rate, none of the false negatives required urgent photocoagulation therapy, which reflects a subsequent “clinical” diagnosis by the ophthalmologist rather than a misdiagnosis by the optometrist. This has been confirmed by regular audit of our data based on the governance structure currently in place in our screening program. It was also reassuring to note that the levels of agreement between primary and secondary graders for higher levels of retinopathy (R2 and R3) were both 100%. For lower levels of retinopathy, ie, R0 and R1, agreement between primary and secondary graders was lower at 51.4% and 79.7%, respectively. Of these, 3.6% of normal (R0) and 2.6% of background (R1) retinopathy showed a disagreement in grading, leading to an upgrading of retinopathy level by ophthalmology, but none required photocoagulation therapy.

Some limitations to this study need to be highlighted. To calculate sensitivity and specificity, we analyzed data specific to R3 only. This was because only 10% of R0 and some of R1 and R2 were referred to ophthalmology, whereas all R3 were referred to an independent ophthalmologist. Because of this, we were unable to look at the sensitivity and specificity for the whole cohort, which affects the results reported in our study. We used the ophthalmologist grade as the gold standard, so it would be important to have all retinopathy graded as R2 by the primary grader reviewed by ophthalmology to ensure that none of these would need to be upgraded to R3, which would mean they would need ophthalmology follow-up and potential treatment. The study was carried out by retrospective data collection, which would also be considered a limitation, due to the presence of confounding biases. We were also not able to reliably determine results for maculopathy within our program. Further, we were not able to accurately adjust results for images that were ungradable due to poor patient compliance with the screening protocol, poor mydriasis, or other factors. Interpretation of the results is limited to this program and cannot necessarily be generalized to other programs. Lastly, although the Kappa statistic is a recognized method for assessment of agreement, the magnitude of kappa reflecting adequate agreement is unclear. Arbitrary guidelines are available to indicate level of agreement, although these are not evidence-based. Generally, however, it is accepted that a kappa score >0.80 would suggest very good agreement.[25,26] Despite this, due to methodological limitations of other research in this area, and due to a lack of data and evidence of optometrists as primary and secondary graders in detecting R3 in a retinopathy screening program, we believe data from this study would enhance available knowledge concerning the safety and effectiveness of an optometry community-based retinopathy screening program.
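One widely quoted set of such arbitrary guidelines is the scale of Landis and Koch, which the paper cites. The sketch below maps a kappa value to those descriptive bands; the function name `landis_koch_label` is illustrative, and the bands are conventions rather than evidence-based thresholds, as noted above.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to the descriptive bands proposed by Landis and Koch (1977)."""
    if kappa > 1:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"  # unreachable, kept so every path returns a string


# Kappa values reported in this study.
for value in (0.3223, 0.269, 0.5667):
    print(value, landis_koch_label(value))
# 0.3223 fair
# 0.269 fair
# 0.5667 moderate
```

The labels returned for the study's values match the descriptions of "fair" and "moderate" agreement used in the abstract.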

There is no clear evidence suggesting who has the best sensitivity and specificity for detecting sight-threatening retinopathy, ie, whether it is independent graders, optometrists, diabetologists, general practitioners, or ophthalmologists. A single study showed that retinal photographs assessed by optometrists could achieve >91% sensitivity in detecting R3 or sight-threatening retinopathy.[20] Data on the effectiveness of individual screening modalities are widely available.[13,17,19,23] However, our study provides unique data on the safety, effectiveness, and agreement between primary and secondary graders for images of patients undergoing routine diabetes eye screening in a community optometry-based retinopathy screening program.

Author contributions

LS contributed to the data acquisition and analysis, and interpretation of the data, and wrote the first draft of the manuscript. CS supported the acquisition and analysis of the data. JD and PM contributed to analysis or interpretation of data. II conceptualized the study and contributed to the design, analysis, and interpretation of the data. II is the guarantor for this study. All authors contributed to the writing of the manuscript and agreed on the final draft.

Disclosure

The authors report no conflicts of interest in this work.

References

1. Owens DR, Gibbins RL, Kohner E. Diabetic retinopathy screening. Diabet Med. 2000;17(7):493–393. PMID: 10972577.
2. Stefánsson E, Bek T, Porta M. Screening and prevention of diabetic blindness. Acta Ophthalmol Scand. 2000;78(4):374–385. PMID: 10990036.
3. Garvican L, Clowes J, Gillow T. Preservation of sight in diabetes: developing a national risk reduction programme. Diabet Med. 2000;17(9):627–634. PMID: 11051281.
4. Scanlon P, Aldington S, Wilkinson C; Early Treatment Diabetic Retinopathy Study Research Group. Early photocoagulation for diabetic retinopathy, ETDRS report number 9. Ophthalmology. 1991;98(5):766–785. PMID: 2062512.
5. James M, Turner D, Broadbent D. Cost effectiveness analysis of screening for sight threatening diabetic eye disease. BMJ. 2000;320(7250):1627–1631. PMID: 10856062.
6. Buxton M, Sculpher M, Ferguson B. Screening for treatable diabetic retinopathy: a comparison of different methods. Diabet Med. 1991;8(4):371–377. PMID: 1830260.
7. Sculpher M, Buxton M, Ferguson B. A relative cost-effectiveness analysis of different methods of screening for diabetic retinopathy. Diabet Med. 1991;8(7):644–650. PMID: 1833116.
8. Bachmann MO, Nelson S. Impact of diabetic retinopathy screening on a British district population: case detection and blindness prevention in an evidence based model. J Epidemiol Community Health. 1998;52(1):45–52. PMID: 9604041.
9. Davies R, Roderick P, Canning C. The evaluation of screening policies for diabetic retinopathy using simulation. Diabet Med. 2002;19(9):762–770. PMID: 12207814.
10. UK National Screening Committee. Available from: http://www.screening.nhs.uk. Accessed May 31, 2013.
11. NHS Diabetic Eye Screening Programme. Available from: http://diabeticeye.screening.nhs.uk. Accessed May 31, 2013.
12. Ferguson BA, Humphreys JE, Altman JFB. Screening for treatable diabetic retinopathy: a comparison of different methods. Diabet Med. 1991;8(4):371–377. PMID: 1830260.
13. Hutchinson A, McIntosh A, Peters J. Effectiveness of screening and monitoring tests for diabetic retinopathy – systematic review. Diabet Med. 2000;17(7):495–506. PMID: 10972578.
14. Scanlon PH, Wilkinson CP, Aldington J. Screening for diabetic retinopathy. In: Scanlon PH, Wilkinson CP, Aldington SJ, Matthews DR, editors. A Practical Manual of Diabetic Retinopathy Management. Oxford, UK: Wiley-Blackwell; 2009.
15. Taylor D, Fisher J, Jacob J. The use of digital cameras in a mobile retinal screening environment. Diabet Med. 1999;16(8):680–686. PMID: 10477214.
16. Goatman KA, Philip S, Fleming AD. External quality assurance for image grading in the Scottish diabetic retinopathy screening programme. Diabet Med. 2012;29(6):776–783. PMID: 22023553.
17. Sallam A, Scanlon PH, Stratton IM. Agreement and reasons for disagreement between photographic and hospital biomicroscopy grading of diabetic retinopathy. Diabet Med. 2011;28(6):741–746. PMID: 21342245.
18. Harding SP, Broadbent DM, Neoh C. Sensitivity and specificity of photography and direct ophthalmoscopy in screening for sight threatening eye diseases: the Liverpool Eye Study. BMJ. 1995;311(7013):1131–1135. PMID: 7580708.
19. Harding S, Greenwood R, Aldington S. Grading and disease management in national screening for diabetic retinopathy in England and Wales. Diabet Med. 2003;20(12):965–971. PMID: 14632697.
20. Patra S, Gomm EM, Macipe M. Interobserver agreement between primary graders and an expert grader in the Bristol and Weston diabetic retinopathy screening programme: a quality assurance audit. Diabet Med. 2009;26(8):820–823. PMID: 19709153.
21. Donner A, Shoukri M, Klar N. Testing the equality of two dependent Kappa statistics. Stat Med. 2000;19(3):373–387. PMID: 10649303.
22. Gibbins RL, Owens DR, Allen JC. Practical application of the European field guide in screening for diabetic retinopathy by using ophthalmoscopy and 35 mm retinal slides. Diabetologia. 1998;41(1):59–64. PMID: 9498631.
23. Olson J, Strachan F, Hipwell J. A comparative evaluation of digital imaging, retinal photography and optometrist examination in screening for diabetic retinopathy. Diabet Med. 2003;20(7):528–534. PMID: 12823232.
24. Stratton IM, Aldington SJ, Taylor DJ, Adler AI, Scanlon PH. A simple risk stratification for time to development of sight threatening diabetic retinopathy. Diabetes Care. 2013;36:580–585. PMID: 23150285.
25. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–174. PMID: 843571.
26. Fleiss JL. Statistical Methods for Rates and Proportions. 2nd ed. New York, NY, USA: John Wiley; 1981.
