Original Research

Validity of an Automated Algorithm to Identify Cirrhosis Using Electronic Health Records in Patients with Primary Biliary Cholangitis

Pages 1261-1267 | Published online: 10 Nov 2020

Abstract

Background

Biopsy remains the gold standard for determining fibrosis stage in patients with primary biliary cholangitis (PBC), but it is unavailable for most patients. We used data from the 11 US health systems in the FibrOtic Liver Disease Consortium to explore a combination of biochemical markers and electronic health record (EHR)-based diagnosis/procedure codes (DPCs) to identify the presence of cirrhosis in PBC patients.

Methods

Histological fibrosis staging data were obtained from liver biopsies. Variables considered for the model included demographics (age, gender, race, ethnicity), total bilirubin, alkaline phosphatase, albumin, aspartate aminotransferase (AST) to platelet ratio index (APRI), Fibrosis 4 (FIB4) index, AST to alanine aminotransferase (ALT) ratio, and >100 DPCs associated with cirrhosis/decompensated cirrhosis, categorized into ten clusters. Using least absolute shrinkage and selection operator regression (LASSO), we derived and validated cutoffs for identifying cirrhosis.

Results

Among 4328 PBC patients, 1350 (32%) had biopsy data; 121 (9%) were staged F4 (cirrhosis). DPC clusters (including codes related to cirrhosis and hepatocellular carcinoma diagnoses/procedures), Hispanic ethnicity, ALP, AST/ALT ratio, and total bilirubin were retained in the final model (AUROC=0.86 and 0.83 on learning and testing data, respectively); this model with two cutoffs divided patients into three categories (no cirrhosis, indeterminate, and cirrhosis) with specificities of 81.8% (for no cirrhosis) and 80.3% (for cirrhosis). A model excluding DPCs retained ALP, AST/ALT ratio, total bilirubin, Hispanic ethnicity, and gender (AUROC=0.81 and 0.78 on learning and testing data, respectively).

Conclusion

An algorithm using laboratory results and DPCs can categorize a majority of PBC patients as cirrhotic or noncirrhotic with high accuracy, leaving a small group of patients whose cirrhosis status is indeterminate. In the absence of biopsy data, this EHR-based model can be used to identify cirrhosis in cohorts of PBC patients for research and/or clinical follow-up.

Introduction

Although biopsy remains the gold standard for determining liver damage, fibrosis, and cirrhosis in patients with primary biliary cholangitis (PBC), it is invasive and performed on a relatively small subset of patients. Transient elastography has shown promise for use in PBC patientsCitation1,Citation2 but has not been universally implemented in health care systems that are not supported by specialty gastroenterology and hepatology clinics. An efficient system to identify cirrhosis in PBC patients using data from electronic health records (EHR)—such as diagnosis and procedure codes (DPCs), and laboratory results—may inform epidemiologic research and clinical trials, as well as the identification of subgroups of PBC patients who could benefit from earlier intervention.

Biomarkers for liver fibrosis calculated from results of laboratory tests, such as the Aspartate Aminotransferase to Platelet Ratio Index (APRI) and Fibrosis-4 (FIB4), have been well described and validated among patients with viral hepatitis. However, the distinct etiology and natural history of PBC mean that the ability of these biomarkers to identify cirrhosis cannot be assumed, and there are currently no studies developing or validating PBC-specific cutoffs for cirrhosis. Likewise, elevated alkaline phosphatase (ALP), total bilirubin, and the ratio of aspartate aminotransferase to alanine aminotransferase (AST/ALT) are known to be important prognostic markers for PBC progression and response to treatment with ursodeoxycholic acid (UDCA).Citation3,Citation4 It is likely that the inclusion of these variables could increase the utility of any marker for cirrhosis among patients with PBC.

The FibrOtic Liver Disease Consortium (FOLD) comprises a cohort of more than 4000 PBC patients drawn from 11 US health systems. We applied machine learning techniques to develop and validate an automated algorithm combining EHR-based data—including DPCs and routine laboratory results—for the identification of cirrhosis among patients with PBC.

Methods

The FOLD Consortium has been previously described.Citation3,Citation5 Briefly, FOLD comprises 11 geographically diverse health systems, representing four US Census Bureau-defined regions of the US (Northeast, Midwest, Northwest, and South). FOLD follows the guidelines of the US Department of Health and Human Services for the protection of human subjects. The study protocol was approved by the Institutional Review Board of each participating site (see Supplementary Table 1). All authors had access to the study results and reviewed and approved the final manuscript.

Cirrhosis Cohort Identification

FOLD PBC patient identification methods have been previously described.Citation5 All cases were confirmed with chart abstraction performed by trained medical abstractors. We used EHR data to identify FOLD PBC patients who had undergone liver biopsy. Fibrosis staging from biopsy results was collected by abstraction from pathology reports and mapped to an F0–F4 equivalency scale:Citation6 F0, no fibrosis; F1, portal fibrosis without septa; F2, portal fibrosis with few septa; F3, numerous septa without cirrhosis; and F4, cirrhosis. FOLD hepatologists adjudicated indeterminate cases. If a patient had more than one biopsy, the biopsy with the highest fibrosis stage was used. The outcome of interest was F4 (cirrhosis) staging on biopsy.
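The outcome definition above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the patient identifiers and the names `STAGE_ORDER`, `highest_stage`, and `is_cirrhotic` are hypothetical, and stages are assumed to be already mapped to the F0–F4 scale.

```python
# Ordering of the F0-F4 equivalency scale described in the text.
STAGE_ORDER = {"F0": 0, "F1": 1, "F2": 2, "F3": 3, "F4": 4}

def highest_stage(stages):
    """Return the highest fibrosis stage among a patient's biopsies."""
    return max(stages, key=STAGE_ORDER.__getitem__)

def is_cirrhotic(stages):
    """Outcome of interest: True if the highest biopsy stage is F4."""
    return highest_stage(stages) == "F4"

# Hypothetical example data: one list of staged biopsies per patient.
biopsies = {"patient_a": ["F1", "F3"], "patient_b": ["F2", "F4", "F3"]}
outcomes = {pid: is_cirrhotic(stages) for pid, stages in biopsies.items()}
print(outcomes)
```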

Possible Covariates/Classifiers

Table 1 details covariates considered for the model, including patient demographics (age, gender, race, Hispanic ethnicity); total bilirubin; ALP and albumin (classified in relation to “normal” as defined by the assay used at each site); APRI; FIB4 index; and AST/ALT ratio. We collected laboratory data and liver-related diagnosis and procedure codes (International Classification of Diseases Ninth and Tenth editions [ICD9/10] and Current Procedural Terminology [CPT] codes), all measured within six months before or after biopsy. In cases where more than one laboratory result was available, the one closest to the date of biopsy was used. ICD9/10 and CPT codes were grouped into ten clusters (C1 to C10, detailed in Table 2); these were used as dichotomized (presence/absence) variables in the classification analysis. An “unknown” category was used for all variables that had missing data.
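As an illustration of the selection rules in this paragraph, the sketch below picks the laboratory result closest to the biopsy date and dichotomizes a code cluster. It rests on assumptions not stated in the source: "six months" is approximated as 183 days, results are `(date, value)` tuples, and the function names are hypothetical.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=183)  # assumption: "six months" approximated as 183 days

def closest_lab(results, biopsy_date, window=WINDOW):
    """Pick the (date, value) result nearest the biopsy date within the window.

    Returns None when no result falls inside the window, which corresponds
    to the "unknown" category described in the text.
    """
    in_window = [r for r in results if abs(r[0] - biopsy_date) <= window]
    if not in_window:
        return None
    return min(in_window, key=lambda r: abs(r[0] - biopsy_date))

def cluster_flag(code_dates, biopsy_date, window=WINDOW):
    """Dichotomize a code cluster: True if any code date is within the window."""
    return any(abs(d - biopsy_date) <= window for d in code_dates)

# Hypothetical total bilirubin results (mg/dL) for one patient.
biopsy = date(2012, 6, 1)
bilirubin = [(date(2011, 1, 10), 0.6), (date(2012, 4, 20), 0.9), (date(2012, 9, 1), 1.2)]
picked = closest_lab(bilirubin, biopsy)
print(picked)
```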

Table 1 Laboratory Data

Table 2 ICD-9/10 and CPT Codes Comprising the Ten Cluster Variables (C1–C10)

Statistical Analysis

Data were randomly divided into two sets at a 2:1 ratio; learning data (2/3) were used to build the classification model and testing data (1/3) were used for model validation. We built the model using machine learning approaches, including Least Absolute Shrinkage and Selection Operator (LASSO) regression in SPM (Salford Predictive Modeler, version 8.0)Citation7 and several machine-learning R packages,Citation8 including Classification and Regression Tree (CART), K-Nearest-Neighbor (KNN), polynomial support vector machines (SVMs), neural networks, random forest models, and eXtreme Gradient Boosting trees (xgbTree).Citation8–Citation10
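A 2:1 random split of this kind might look like the following sketch. The seed, the simple unstratified shuffle, and the function name `split_two_to_one` are assumptions for illustration; the source does not say whether the split was stratified by outcome.

```python
import random

def split_two_to_one(patient_ids, seed=0):
    """Randomly split IDs into learning (about 2/3) and testing (about 1/3) sets."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_learning = round(len(ids) * 2 / 3)
    return ids[:n_learning], ids[n_learning:]

# 1350 is the number of patients with biopsy data reported in the text.
learning, testing = split_two_to_one(range(1350))
print(len(learning), len(testing))
```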

The model building process started with variable selection for the initial multivariable model. Highly correlated variables (eg, AST/ALT ratio, APRI, and FIB4) were fit into the model one at a time with other covariates to determine which would be selected. The same modeling approach was repeated using laboratory data without DPCs, given that FIB4 (a commonly used laboratory data-based biomarker) has been used to identify cirrhosis among patients with chronic hepatitis.Citation6 The final modeling approach and multivariable model were selected for optimal classification accuracy (defined by the highest area under the receiver operating characteristic curve [AUROC]). Final model selection was based on classification accuracy in the validation set, with estimation of model goodness-of-fit measured by AUROC. Models are considered to have “reasonable” and “good” accuracy when the AUROC is 70–80% and 80–90%, respectively. We also identified an optimal cut-off point to provide clinical utility to correctly classify patients as either cirrhotic or non-cirrhotic.
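For readers unfamiliar with the metric, AUROC can be computed directly as the probability that a randomly chosen cirrhotic patient receives a higher score than a randomly chosen non-cirrhotic patient. The sketch below uses that pairwise (Mann-Whitney) formulation with toy data; the handling of band boundaries in `accuracy_label` is an assumption, since the text gives the ranges only as 70–80% and 80–90%.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties count as one half (the Mann-Whitney U formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy_label(auc):
    """Map an AUROC to the descriptive bands quoted in the text."""
    if 0.8 <= auc < 0.9:
        return "good"
    if 0.7 <= auc < 0.8:
        return "reasonable"
    return "outside the bands defined in the text"

# Toy scores and biopsy labels (1 = F4), not study data.
toy_auc = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(toy_auc, accuracy_label(toy_auc))
```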

Results

Among 4328 confirmed PBC patients observed from 2006 to 2016, 1350 (32%) had biopsy data with F0–F4 staging; 121 (9%) were histologically staged F4 (cirrhosis). The median number of biopsies per patient was 1 (25th and 75th percentiles both 1; range 1 to 7). Table 3 presents the two-group comparison for all covariates of interest.

Table 3 Two-Group Comparison for Covariates of Interest

The LASSO approach using three laboratory variables (ALP, total bilirubin, AST/ALT), gender, and ethnicity had “good” model classification accuracy; AUROC was 0.81 (learning data) and 0.78 (testing data). The model equation is expressed as: Lscore = −2.10444 − 0.0380115 [if ALP normal] − 0.10366 [if ALP 1-<2*ULN] + 0.0703424 [if ALP 2-<3*ULN] − 0.0859862 [if ALP ≥3*ULN] − 0.0961179 [if male] + 0.0552183 [if non-Hispanic ethnicity] − 0.0557998 [if Hispanic ethnicity] − 0.0553804 [if bilirubin ≤0.4 mg/dL] − 0.0701528 [if bilirubin >0.4-0.5 mg/dL] − 0.0239056 [if bilirubin >0.5-0.7 mg/dL] − 0.0881663 [if bilirubin >0.7-1.0 mg/dL] + 0.0658567 [if bilirubin >1.0-1.5 mg/dL] + 0.115739 [if bilirubin >1.5-2.0 mg/dL] + 0.136269 [if bilirubin >2.0 mg/dL] − 0.0865529 [if AST/ALT <1.1] + 0.0898492 [if AST/ALT 1.1-<2.2] + 0.144404 [if AST/ALT ≥2.2]. At the optimal cutoff of 0.1 in this model, sensitivity was 70% and specificity was 72% based on validation results.
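Sensitivity and specificity at a given probability cutoff can be computed as in the sketch below. The data are toy values rather than study results, and treating a probability strictly above the cutoff as a positive call is an assumption about how the threshold was applied.

```python
def sensitivity_specificity(probs, labels, cutoff):
    """Classify as cirrhotic when prob > cutoff, then score against biopsy labels."""
    tp = sum(1 for p, y in zip(probs, labels) if p > cutoff and y == 1)
    fn = sum(1 for p, y in zip(probs, labels) if p <= cutoff and y == 1)
    tn = sum(1 for p, y in zip(probs, labels) if p <= cutoff and y == 0)
    fp = sum(1 for p, y in zip(probs, labels) if p > cutoff and y == 0)
    return tp / (tp + fn), tn / (tn + fp)

# Toy predicted probabilities and biopsy labels (1 = F4).
toy_probs = [0.05, 0.08, 0.12, 0.20, 0.15, 0.03]
toy_labels = [0, 0, 0, 1, 1, 1]
sens, spec = sensitivity_specificity(toy_probs, toy_labels, cutoff=0.1)
print(sens, spec)
```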

A LASSO model with three laboratory variables (ALP, total bilirubin, AST/ALT), two DPC clusters for diagnosis of hepatocellular carcinoma and cirrhosis, and ethnicity (Hispanic yes/no) demonstrated the best performance; this model reached “good” classification accuracy, with AUROCs of 0.86 on learning data and 0.83 on testing data. This model combining laboratory and DPC data had significantly better predictive ability (AUROC) than the model using laboratory data without DPCs (p=0.001). The equation for this final LASSO model is expressed as: Lscore = −2.80400 + 0.303777 [if ALP 2-<3*ULN] − 1.11856 [if Hispanic ethnicity] − 0.325175 [if bilirubin ≤0.4 mg/dL] + 0.28772 [if bilirubin >0.5-0.7 mg/dL] + 0.512881 [if bilirubin >1.0-1.5 mg/dL] + 0.9406 [if bilirubin >1.5-2.0 mg/dL] + 0.756801 [if bilirubin >2.0 mg/dL] − 0.652222 [if AST/ALT <1.1] + 0.645455 [if AST/ALT 1.1-<2.2] + 0.681193 [if AST/ALT ≥2.2] + 0.707349 [if diagnosis of hepatocellular carcinoma] + 1.3713 [if two diagnoses of cirrhosis]. A single cut-off of 0.08 (derived from the formula Prob = 1.0/(1.0 + exp(−Lscore))) yielded sensitivity of 76% and specificity of 75% based on validation results. Two optimal cut-offs (0.07 and 0.10) divided patients into three groups: non-cirrhotic (≤0.07), indeterminate (>0.07 to ≤0.10), and cirrhotic (>0.10). These cut-offs yielded improved specificities of 81.8% for absence of cirrhosis and 80.3% for presence of cirrhosis.
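The final model can be applied as in the following sketch, which transcribes the coefficients quoted above. Several points are assumptions rather than statements from the source: categories with no quoted coefficient (for example, bilirubin >0.4 to 0.5 or >0.7 to 1.0 mg/dL) are taken to contribute zero; the indeterminate band is read as above 0.07 and up to 0.10; missing ("unknown") inputs are not handled; and all function and parameter names are hypothetical.

```python
import math

INTERCEPT = -2.80400

def lscore(alp_x_uln, hispanic, bilirubin, ast_alt, hcc_dx, cirrhosis_dx):
    """Linear score from the coefficients quoted in the text.

    alp_x_uln is ALP as a multiple of the upper limit of normal; bilirubin is
    in mg/dL. cirrhosis_dx should be True only when the "two diagnoses of
    cirrhosis" condition is met. Unlisted categories add zero (assumption).
    """
    s = INTERCEPT
    if 2 <= alp_x_uln < 3:
        s += 0.303777
    if hispanic:
        s -= 1.11856
    if bilirubin <= 0.4:
        s -= 0.325175
    elif 0.5 < bilirubin <= 0.7:
        s += 0.28772
    elif 1.0 < bilirubin <= 1.5:
        s += 0.512881
    elif 1.5 < bilirubin <= 2.0:
        s += 0.9406
    elif bilirubin > 2.0:
        s += 0.756801
    if ast_alt < 1.1:
        s -= 0.652222
    elif ast_alt < 2.2:
        s += 0.645455
    else:
        s += 0.681193
    if hcc_dx:
        s += 0.707349
    if cirrhosis_dx:
        s += 1.3713
    return s

def probability(score):
    """Logistic transform: Prob = 1 / (1 + exp(-score))."""
    return 1.0 / (1.0 + math.exp(-score))

def categorize(prob, low=0.07, high=0.10):
    """Three-group classification using the two cut-offs from the text."""
    if prob <= low:
        return "non-cirrhotic"
    if prob <= high:
        return "indeterminate"
    return "cirrhotic"

# Two hypothetical patients.
low_risk = lscore(1.5, False, 0.3, 0.9, False, False)
high_risk = lscore(2.5, False, 1.8, 1.5, False, True)
print(categorize(probability(low_risk)), categorize(probability(high_risk)))
```

Because the score is a sum of indicator terms, it can be computed from categorized EHR fields without any continuous laboratory transformation.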

Other modelling approaches using the same covariates reached similar or lower model classification accuracy (Supplementary Table 2); performance of the xgbTree model was similar to the LASSO model (AUROC=0.82 on testing data) but required ten variables (age, gender, ethnicity, albumin, ALP, AST/ALT ratio, total bilirubin, platelet count, and DPCs related to hepatocellular carcinoma and cirrhosis), making this model less useful in the “real world” setting.

Discussion

Using data drawn from the FOLD consortium, we applied machine learning methods to EHR-based laboratory results and DPCs to develop and validate a method for identifying cirrhosis among patients with PBC. Our previous work has shown that cirrhosis is an important prognostic marker for poor outcomes among patients with PBC.Citation3–Citation5 However, in these analyses, cirrhosis identification was based on a limited number of patients with biopsy data (32%). Transient elastography has gradually begun to replace biopsy, but has not yet been universally implemented, especially in health systems without specialty hepatology clinics; only 12% of patients in our real-world cohort had elastography data available. Our EHR-based model could help address that gap in the identification of PBC patients with cirrhosis. The classification accuracy of our model using both laboratory data and DPCs was “good” (AUROC=0.83 on testing data) and was significantly better than an alternate model using laboratory results without DPCs. The combined model with two optimal cutoffs (0.07 and 0.10) divided patients into three groups (cirrhotic and non-cirrhotic, with a small group [<7%] as indeterminate); this model yielded 81.8% specificity for the absence of cirrhosis and 80.3% specificity for the presence of cirrhosis.

We believe this is the first validated model for use of EHR-based data for cirrhosis identification among PBC patients. Although previously developed markers for cirrhosis, such as APRI (which combines AST and platelet count) and FIB4 (which combines age, AST, ALT, and platelet count), have been validated in populations with viral hepatitis, it is not clear if they are optimized for use in patients with PBC. In a model replacing AST/ALT ratio with FIB4, classification accuracy was moderate (AUROC=0.75 on testing data). Our analysis found that a combination of total bilirubin, ALP, and AST/ALT ratio, rather than APRI, FIB4, or the individual components of those markers, provided better accuracy (AUROC=0.83 on testing data). In light of our recent study showing that total bilirubin, ALP, and AST/ALT ratio were independent risk factors for all-cause mortality in patients with PBC,Citation4 our current findings suggest that these variables may be the most appropriate biomarkers for cirrhosis and poor outcomes.
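For reference, the three laboratory-derived markers compared here can be computed as follows. The formulas are the standard published definitions of APRI and FIB-4 rather than anything specified in this article, and the function names, units (AST and ALT in U/L, platelets in 10^9/L), and example values are assumptions.

```python
import math

def apri(ast, ast_uln, platelets):
    """APRI = (AST / upper limit of normal AST) / platelets (10^9/L) * 100."""
    return (ast / ast_uln) / platelets * 100

def fib4(age_years, ast, alt, platelets):
    """FIB-4 = (age * AST) / (platelets (10^9/L) * sqrt(ALT))."""
    return (age_years * ast) / (platelets * math.sqrt(alt))

def ast_alt_ratio(ast, alt):
    """Simple ratio of AST to ALT."""
    return ast / alt

# Hypothetical patient: age 50, AST 80 U/L (ULN 40), ALT 64 U/L, platelets 200.
print(apri(80, 40, 200), fib4(50, 80, 64, 200), ast_alt_ratio(80, 64))
```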

One limitation of our analysis is that, although the overall classification accuracy of the model (AUROC) reached 0.83 on testing data, sensitivity and specificity remained only moderate (75–76%) with the use of a single cut-off (0.08). We addressed this issue with the use of two cutoffs (0.07 and 0.10), which improved specificity to >80% for determining the absence of cirrhosis and presence of cirrhosis, and left only 6.8% of patients classified as “indeterminate.” While the use of more extreme cutoffs (eg, 0.05 and 0.16) could yield specificity in the range of 85–88%, it would classify more patients (28%) as indeterminate. Limitations to this method can be further addressed by using a hierarchical approach for cirrhosis identification: 1) cirrhosis determination based on biopsy or transient elastography when available; 2) use of our model with two cutoffs. Analyses can categorize those patients who fall into the “indeterminate” group as “unknown.” We have successfully implemented a similar approach for cirrhosis identification in patients with viral hepatitis.Citation11,Citation12 An additional unavoidable limitation of classification models is that they are most accurate when applied to a sample with patient characteristics similar to those used to build the model. Accordingly, this model will need to be validated using external data from a similar patient population.
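The hierarchical approach described in this paragraph can be sketched as a simple decision function. Two details are assumptions: biopsy is given precedence over elastography when both are present (the text does not rank them), and the elastography input is taken to be an already-thresholded yes/no result. The function and parameter names are hypothetical.

```python
def cirrhosis_status(biopsy_f4=None, elastography_cirrhosis=None, model_prob=None,
                     low=0.07, high=0.10):
    """Hierarchical call: direct evidence first, then the two-cutoff model.

    Each input is None when that data source is unavailable.
    """
    # Step 1: biopsy or transient elastography when available.
    for direct in (biopsy_f4, elastography_cirrhosis):
        if direct is not None:
            return "cirrhotic" if direct else "non-cirrhotic"
    # Step 2: fall back to the model probability with two cutoffs.
    if model_prob is None:
        return "unknown"
    if model_prob <= low:
        return "non-cirrhotic"
    if model_prob > high:
        return "cirrhotic"
    return "unknown"  # the indeterminate band is reported as "unknown"

print(cirrhosis_status(biopsy_f4=True, model_prob=0.02))
print(cirrhosis_status(model_prob=0.08))
```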

In conclusion, our study showed that a model using EHR-based data can be used to efficiently identify PBC patients with cirrhosis. Using a hierarchical approach that also takes into consideration cirrhosis determination via biopsy/transient elastography data, when such data are available, we expect that this model will be useful for research in patients with PBC, and could serve as a quality improvement tool to ensure the best available care for such patients. Our model may also be useful in the identification of risk factors for decompensation in large observational studies of patients with PBC. Interventions exist that mitigate the risk of progression from compensated to decompensated cirrhosis and the poor outcomes of decompensation; this tool may help clinicians identify and monitor patients who could benefit from them.

Author Contributions

All authors made substantial contributions to conception and design, acquisition of data, or analysis and interpretation of data; took part in drafting the article or revising it critically for important intellectual content; agreed to submit to the current journal; gave final approval of the version to be published; and agree to be accountable for all aspects of the work.

Disclosure

Stuart C. Gordon receives grant/research support from AbbVie Pharmaceuticals, Conatus, CymaBay, Eiger Pharmaceuticals, Eli Lilly, Genfit, Gilead Sciences, GlaxoSmithKline, Intercept Pharmaceuticals, Merck, and Viking Therapeutics. Mei Lu, Joseph A. Boscarino, Mark A. Schmidt, Yihe G. Daida, Jia Li, Loralee B. Rupp, and Sheri Trudeau receive research grant support from Gilead Sciences and Intercept Pharmaceuticals. Carla V. Rodriguez-Watson owns stock in Gilead (<$5000). Heather Anderson receives grant/research support from Intercept Pharmaceuticals. Jeffrey J. VanWormer receives grant/research support from Retrophin. Christopher L. Bowlus receives grant/research support from AbbVie Pharmaceuticals, Bristol-Myers-Squibb, CymaBay, Gilead Biosciences, GlaxoSmithKline, Intercept Pharmaceuticals, Merck, Mirum, Shire Pharmaceuticals, Takeda Pharmaceuticals, and TARGET Pharmasolutions, and has served as an advisor for Bristol-Myers-Squibb, Gilead Biosciences, Intercept Pharmaceuticals, and Takeda. Keith Lindor is a consultant/advisor for Biopharma and has served as an ad hoc advisor for HighTide, Takeda, Shire, and Intercept Pharmaceuticals. He sits on a Data Safety Monitoring Board for Takeda. Robert J. Romanelli receives grant/research support from Pfizer Inc. and Janssen Scientific Affairs. The authors report no other conflicts of interest in this work.

Additional information

Funding

The FOLD Consortium has previously received funding from Intercept Pharmaceuticals Inc.

References

  • Wong VW-S, Chan HL-Y. Transient elastography. J Gastroenterol Hepatol. 2010;25(11):1726–1731. doi:10.1111/j.1440-1746.2010.06437.x. PMID: 21039833
  • Joshita S, Yamashita Y, Sugiura A, et al. Clinical utility of FibroScan as a non-invasive diagnostic test for primary biliary cholangitis. J Gastroenterol Hepatol. 2019.
  • Lu M, Zhou Y, Haller IV, et al. Increasing prevalence of primary biliary cholangitis and reduced mortality with treatment. Clin Gastroenterol Hepatol. 2018;16(8):1342–1350. doi:10.1016/j.cgh.2017.12.033. PMID: 29277621
  • Gordon SC, Wu KH, Lindor K, et al. Ursodeoxycholic acid treatment preferentially improves overall survival among African Americans with primary biliary cholangitis. Am J Gastroenterol. 2020;115(2):262–270. doi:10.14309/ajg.0000000000000512. PMID: 31985529
  • Lu M, Li J, Haller IV, et al. Factors associated with prevalence and treatment of primary biliary cholangitis in United States health systems. Clin Gastroenterol Hepatol. 2018;16(8):1333–1341. doi:10.1016/j.cgh.2017.10.018. PMID: 29066370
  • Li J, Gordon SC, Rupp LB, et al. The validity of serum markers for fibrosis staging in chronic hepatitis B and C. J Viral Hepat. 2014;21(12):930–937. doi:10.1111/jvh.12224. PMID: 24472062
  • CART 6.0 User’s Guide Salford Systems [computer program]. 2010.
  • Torsten H. CRAN task view: machine learning & statistical learning. 2020. Available from: https://CRAN.R-project.org/view=MachineLearning. Accessed 101, 2020.
  • Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting (With discussion and a rejoinder by the authors). Ann Statist. 2000;28(2):337–407. doi:10.1214/aos/1016218223
  • Friedman JH. Greedy function approximation: A gradient boosting machine. Ann Statist. 2001;29(5):1189–1232. doi:10.1214/aos/1013203451
  • Li J, Zhang T, Gordon SC. Does Hepatitis C eradication lead to improved glucose metabolism, renal and cardiovascular outcomes in diabetic patients? American Association for the Study of Liver Diseases (AASLD) 2017 Annual Meeting 2017:ID: 981.
  • Lu M, Wu KH, Li J, et al. Adjuvant ribavirin and longer direct-acting antiviral treatment duration improve sustained virological response among hepatitis C patients at risk of treatment failure. J Viral Hepat. 2019;26(10):1210–1217. doi:10.1111/jvh.13162. PMID: 31197910