Original Article

Automated data extraction—A feasible way to construct patient registers of primary care utilization

Pages 52-56 | Received 07 Oct 2011, Accepted 20 Dec 2011, Published online: 15 Feb 2012
 

Abstract

Introduction. Electronic medical records (EMRs) enable analysis of health care data by using data mining techniques to build research databases. Although the reliability of the data extraction process is crucial for the credibility of the final analysis, few validations of this process have been published. In this paper we validate the performance of an automated data mining tool applied to EMRs in a primary care setting.

Methods. The Pygargus Customized eXtraction Program (CXP) was programmed to find and extract data from patients meeting criteria for type 2 diabetes mellitus (T2DM) at one primary health care clinic (PHC). The ability of CXP to extract relevant cases was assessed by comparing them with the cases extracted by an EMR-integrated search engine. The concordance of the extracted data with the original EMR source was checked manually.
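The case-finding step can be illustrated with a minimal sketch in Python. It is not the CXP implementation, and the abstract does not state the actual inclusion criteria. The record layout, the function names, and the three rules used here (an ATC code starting with A10, an ICD-10 code starting with E11, a fasting plasma glucose of at least 7.0 mmol/L) are assumptions chosen only to show how three independent search routes can be combined.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EMR data."""
    patient_id: str
    atc_codes: List[str] = field(default_factory=list)        # prescribed drugs
    icd10_codes: List[str] = field(default_factory=list)      # recorded diagnoses
    fasting_glucose_mmol_l: List[float] = field(default_factory=list)  # lab values


def matched_criteria(record: PatientRecord) -> Set[str]:
    """Return the names of the (assumed) search routes this record satisfies."""
    hits: Set[str] = set()
    # Assumption: ATC group A10 (drugs used in diabetes) marks a relevant prescription.
    if any(code.startswith("A10") for code in record.atc_codes):
        hits.add("prescription")
    # Assumption: ICD-10 E11 (type 2 diabetes mellitus) marks a relevant diagnosis.
    if any(code.startswith("E11") for code in record.icd10_codes):
        hits.add("diagnosis")
    # Assumption: fasting plasma glucose >= 7.0 mmol/L marks a relevant lab value.
    if any(value >= 7.0 for value in record.fasting_glucose_mmol_l):
        hits.add("laboratory")
    return hits


def extract_candidates(records: Iterable[PatientRecord]) -> Dict[str, Set[str]]:
    """Map each candidate patient ID to the routes that found it (union of routes)."""
    candidates: Dict[str, Set[str]] = {}
    for record in records:
        hits = matched_criteria(record)
        if hits:
            candidates[record.patient_id] = hits
    return candidates


# Invented example records, for illustration only.
example_records = [
    PatientRecord("p1", atc_codes=["A10BA02"], icd10_codes=["E11.9"]),
    PatientRecord("p2", fasting_glucose_mmol_l=[6.1, 7.4]),
    PatientRecord("p3", atc_codes=["C07AB02"], fasting_glucose_mmol_l=[5.2]),
]
candidates = extract_candidates(example_records)
```

Recording which route found each patient makes it possible to report the yield of every route separately, as well as the combined yield.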

Results. The prevalence of T2DM was 4.0%, which corresponds well with previous estimates. Searching by drug prescriptions, diagnosis codes, and laboratory values found 38%, 53%, and 91% of relevant cases, respectively. The sensitivity of CXP for extracting relevant cases was 100%. The specificity was 99.9%, because 12 non-T2DM cases were extracted. The congruity at the single-item level was 99.6%. The 13 incorrect data items were all located in the same structural module.
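For readers less familiar with these measures, the following Python sketch shows how sensitivity, specificity, and item-level congruity are computed from counts. The abstract reports only the 12 false-positive cases and the 13 incorrect items. Every other count below (500 true positives, 11,988 true negatives, 3,250 items checked) is a hypothetical value chosen to make the arithmetic easy to follow, not a figure from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing extracted cases against a reference standard."""
    true_pos: int   # relevant cases that were extracted
    false_pos: int  # non-relevant cases that were extracted
    true_neg: int   # non-relevant cases that were not extracted
    false_neg: int  # relevant cases that were missed


def sensitivity(c: ConfusionCounts) -> float:
    """Share of relevant cases that the extraction found: TP / (TP + FN)."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Share of non-relevant cases correctly left out: TN / (TN + FP)."""
    return c.true_neg / (c.true_neg + c.false_pos)


def item_congruity(incorrect_items: int, total_items: int) -> float:
    """Share of extracted data items that match the source record."""
    return 1 - incorrect_items / total_items


# Hypothetical counts: only false_pos=12 and incorrect_items=13 come from the abstract.
counts = ConfusionCounts(true_pos=500, false_pos=12, true_neg=11988, false_neg=0)

sens = sensitivity(counts)                                           # no missed cases
spec = specificity(counts)                                           # 11988 / 12000
congruity = item_congruity(incorrect_items=13, total_items=3250)     # 1 - 13 / 3250

print(f"sensitivity: {sens:.1%}")
print(f"specificity: {spec:.1%}")
print(f"item congruity: {congruity:.1%}")
```

With no missed cases the sensitivity is 100% regardless of how many relevant cases there are, whereas the specificity depends on how the false positives compare with the number of non-relevant patients.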

Conclusion. The CXP is a reliable and accurate data mining tool for extracting selected data from EMRs.

Acknowledgements

The RECAP-DM study is a collaboration between Uppsala University, the Karolinska Institute in Stockholm, and Merck Sharp & Dohme (Sweden) AB. Merck Sharp & Dohme did not contribute to the decision-making in building the research database, to the interpretation of data, or to the writing process.

Declaration of interest: M.M. received minor financial support from Pygargus AB for initiating this study. J.S. has received financial support from Pygargus AB for conducting epidemiological research utilizing Pygargus CXP. J.H. declares no competing interests.