ORIGINAL RESEARCH

Advancing an Algorithm for the Identification of Patients with High Data-Continuity in Electronic Health Records

Pages 1339-1349 | Received 09 Apr 2022, Accepted 09 Jul 2022, Published online: 08 Nov 2022
 

Abstract

Background

Identifying high data-continuity patients in an electronic health record (EHR) system may facilitate selecting cohorts with a lower degree of variable misclassification and promote study validity. We updated a previously developed algorithm for identifying patients with high EHR data-completeness by adding demographic and healthcare utilization factors to improve adaptability to networks serving patients of diverse backgrounds. We also expanded the algorithm to accommodate data in the ICD-10 era.

Methods

We used Medicare claims linked with EHR data to identify individuals aged ≥65 years. EHR-continuity was defined as the proportion of encounters captured in EHR data relative to claims. We compared a model that added demographic factors, and their interaction terms with other predictors, against the original algorithm, and assessed performance by area under the receiver operating characteristic (ROC) curve (AUC) and net reclassification index (NRI).
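As an illustration of the EHR-continuity definition above, the following is a minimal sketch and not the authors' implementation. It assumes per-patient encounter counts are already available from both sources. The function name `ehr_continuity`, the handling of patients with no claims encounters, and the cap at 1.0 are all assumptions made for this example.

```python
from typing import Optional


def ehr_continuity(ehr_encounters: int, claims_encounters: int) -> Optional[float]:
    """Proportion of claims-recorded encounters that also appear in the EHR.

    Claims are treated as the reference standard for encounters that occurred.
    Returns None when the patient has no claims encounters (assumed handling:
    the proportion is undefined without a denominator).
    """
    if ehr_encounters < 0 or claims_encounters < 0:
        raise ValueError("encounter counts must be non-negative")
    if claims_encounters == 0:
        return None
    # Assumption: cap at 1.0 in case the EHR holds more records than claims.
    return min(ehr_encounters / claims_encounters, 1.0)


# Example: 3 of 4 claims-recorded encounters were captured in the EHR.
print(ehr_continuity(3, 4))
```

A patient whose EHR captures most claims-recorded encounters scores close to 1, while a patient who receives much of their care outside the EHR network scores close to 0.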

Results

The study cohort consisted of 264,099 subjects. The updated prediction model had an AUC of 0.93 in the validation set. Compared with the previous model, the new model had an NRI of 37.4% (p<0.001) for EHR-continuity classification. Interaction terms between demographic variables and other predictors did not improve performance. Patients within the top 20% of predicted EHR-continuity had four-fold lower misclassification of key variables than the remaining population.
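For readers unfamiliar with the NRI, the sketch below shows how a categorical NRI can be computed when comparing two classifiers. The abstract does not state which NRI variant was used, so the two-category form here, the function name `net_reclassification_index`, and the toy data are assumptions for illustration only.

```python
from typing import Sequence


def net_reclassification_index(
    y_true: Sequence[int],
    old_class: Sequence[int],
    new_class: Sequence[int],
) -> float:
    """Categorical NRI comparing a new classifier against an old one.

    y_true: 1 for a true event (e.g. truly high EHR-continuity), else 0.
    old_class / new_class: predicted category from each model (higher = more
    likely an event). NRI = (P(up|event) - P(down|event))
                          + (P(down|non-event) - P(up|non-event)).
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for y, old, new in zip(y_true, old_class, new_class):
        if y == 1:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Toy data (hypothetical): the new model corrects one event and one non-event.
y = [1, 1, 0, 0]
old = [0, 1, 1, 0]
new = [1, 1, 0, 0]
print(net_reclassification_index(y, old, new))
```

A positive value means the new model moves more patients into the correct category than into the wrong one, summed across events and non-events.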

Conclusion

Adding demographic and healthcare utilization variables significantly improved model performance. Patients with high predicted EHR-continuity had less misclassification of study variables than the remaining population in both the ICD-9 and ICD-10 eras.

Data Sharing Statement

Data supporting the results reported in this manuscript contain detailed, patient-level clinical information and therefore cannot be made available publicly to protect patient privacy.

Ethics Approval and Informed Consent

This research analyzed data retrospectively and was therefore exempt from review by the Institutional Review Board of Mass General Brigham.

Disclosure

Dr. Schneeweiss (ORCID# 0000-0003-2575-467X) is participating in investigator-initiated grants to the Brigham and Women’s Hospital from Boehringer Ingelheim unrelated to the topic of this study. He is a consultant to Aetion Inc., a software manufacturer of which he owns equity. His interests were declared, reviewed, and approved by the Brigham and Women’s Hospital in accordance with their institutional compliance policies. Dr. Merola owns equity in and is an employee of Aetion, Inc. The authors report no other conflicts of interest in this work.

Additional information

Funding

This project was supported by NIH Grant R01LM012594. The study sponsor did not participate in any stages of this research, including study design, execution, or composition of this manuscript for publication.