ORIGINAL RESEARCH

Surgical Complication Risk Factor Identification Using High-Dimensional Hospital Data: An Illustrative Example in Hemostasis-Related Complications

Pages 683-689 | Received 25 Jun 2022, Accepted 18 Oct 2022, Published online: 05 Nov 2022
 

Abstract

Purpose

To describe an approach wherein high-dimensional hospital data can be used to identify generalizable risk factors for surgical complications for which there may be limited prior knowledge, as illustrated in the context of hemostasis-related complications (HRC).

Patients and Methods

This was a retrospective study of the Premier Healthcare Database. Patients included in the study underwent video-assisted thoracoscopic lobectomy (VATL), laparoscopic right colectomy (LRC), or laparoscopic sleeve gastrectomy (LSG) in an inpatient setting between October 2015 and February 2020 (first = index). The outcome, HRC, comprised hemorrhage, control of bleeding, and acute posthemorrhagic anemia. For each cohort, a high-dimensional dataset (ie, comprising thousands of candidate risk factors) was constructed using taxonomies from the Clinical Classifications Software Refined (CCSR). Candidate risk factors were fed into logistic regression models with a 70%/30% train/test split for each cohort. Clinically plausible risk factors that were consistently significant predictors of HRC across the 3 training models were then used in a final parsimonious model that also included sex, age, race, and payor. Finally, the parsimonious model was applied to the test data to compare predicted risk with the observed incidence of HRC.
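The split and the final comparison step can be illustrated with a minimal sketch. This is not the authors' code: the function names, the random seed, the equal-sized quintile grouping, and the toy data are all assumptions made for illustration, and the logistic regression fit itself is omitted (the sketch starts from predicted risks that are assumed to be already available).

```python
import random

def train_test_split(rows, train_frac=0.7, seed=0):
    """Shuffle rows and split them into train/test parts (70%/30% by default).

    The seed and the shuffling scheme are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def incidence_by_quintile(predicted, observed):
    """Observed outcome incidence in each fifth of predicted risk.

    Returns five proportions ordered from the lowest to the highest
    predicted-risk group. Assumes at least 5 patients.
    """
    pairs = sorted(zip(predicted, observed), key=lambda p: p[0])
    n = len(pairs)
    result = []
    for q in range(5):
        lo, hi = q * n // 5, (q + 1) * n // 5
        group = pairs[lo:hi]
        result.append(sum(y for _, y in group) / len(group))
    return result

# Toy data (hypothetical): 100 patients, events only among the 10 with
# the highest predicted risk.
toy_predicted = [i / 100 for i in range(100)]
toy_observed = [1 if i >= 90 else 0 for i in range(100)]

train, test = train_test_split(list(range(10)))
quintile_incidence = incidence_by_quintile(toy_predicted, toy_observed)
```

In a well-discriminating model, the last element of `quintile_incidence` (top quintile) should be clearly higher than the first (bottom quintile), which is the comparison the study reports for each cohort.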

Results

The study included 11,141 VATL, 20,156 LRC, and 121,547 LSG patients, of whom 7.5%, 7.8%, and 1.2% experienced HRC, respectively. Ultimately, 6 clinically plausible CCSR categories were identified as being statistically significant predictors across all 3 cohorts (eg, coagulation and hemorrhagic disorders, malnutrition, alcohol-related disorders, among others). In the parsimonious model applied to the test data, the observed incidence of HRC was substantially higher in the top quintile vs bottom quintile of predicted risk: LSG 2.05% vs 0.53%, LRC 13.30% vs 4.11%, VATL 12.49% vs 5.04%.
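To put the quintile contrast on a common scale, the reported percentages can be converted to top-to-bottom incidence ratios. The short calculation below uses only the figures stated above; the ratio itself is a derived quantity that the abstract does not report, and the variable names are illustrative.

```python
# Observed incidence (%) in the top vs bottom quintile of predicted risk,
# as reported in the Results for each cohort.
incidence = {
    "LSG": (2.05, 0.53),
    "LRC": (13.30, 4.11),
    "VATL": (12.49, 5.04),
}

# Ratio of top-quintile to bottom-quintile incidence, rounded to 2 decimals.
ratios = {cohort: round(top / bottom, 2) for cohort, (top, bottom) in incidence.items()}

for cohort, ratio in ratios.items():
    print(f"{cohort}: {ratio}x")
```

On these figures, incidence in the top quintile was roughly 3.9 times that of the bottom quintile for LSG, 3.2 times for LRC, and 2.5 times for VATL.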

Conclusion

High-dimensional real-world data can be useful to identify risk factors for outcomes that generalize across multiple cohorts. The risk factors identified herein should be considered for inclusion in future studies of hemostasis-related complications.

Acknowledgments

The abstract of this paper was presented as a poster at ICPE 2022, the 38th International Conference on Pharmacoepidemiology and Therapeutic Risk Management, Copenhagen, Denmark, 26–28 August 2022.

Disclosure

Stephen Johnston, Sanjoy Roy, and Esther Pollack are employees and stockholders of Johnson & Johnson. Aakash Jha is an employee of Mu Sigma, which was paid by Johnson & Johnson to provide data analytics support. The authors report no other conflicts of interest in this work.

Additional information

Funding

This work was supported by Johnson & Johnson. It received no specific grant and was conducted as routine methodological research.