Special Report

Using natural language processing to analyze unstructured patient-reported outcomes data derived from electronic health records for cancer populations: a systematic review

Pages 467-475 | Received 02 Sep 2023, Accepted 20 Feb 2024, Published online: 05 Mar 2024
 

ABSTRACT

Introduction

Patient-reported outcomes (PROs; symptoms, functional status, quality-of-life) expressed in ‘free-text’ or ‘unstructured’ format within clinical notes from electronic health records (EHRs) offer valuable insights beyond biological and clinical data for medical decision-making. However, a comprehensive assessment of how natural language processing (NLP) coupled with machine learning (ML) methods has been used to analyze unstructured PROs, and of its clinical implementation for individuals affected by cancer, remains lacking.

Areas covered

This study aimed to systematically review published studies that used NLP techniques to extract and analyze PROs in clinical narratives from EHRs for cancer populations. We examined the types of NLP techniques (with and without ML) and the platforms used for data processing, analysis, and clinical applications.

Expert opinion

Utilizing NLP methods offers a valuable approach for processing and analyzing unstructured PROs among cancer patients and survivors. These techniques encompass a broad range of applications, such as extracting or recognizing PROs, categorizing, characterizing, or grouping PROs, predicting or stratifying risk for unfavorable clinical results, and evaluating connections between PROs and adverse clinical outcomes. The employment of NLP techniques is advantageous in converting substantial volumes of unstructured PRO data within EHRs into practical clinical utilities for individuals with cancer.
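As a rough illustration of the simplest application listed above, extracting or recognizing PROs, the following Python sketch applies a dictionary lookup with a basic negation check to a clinical note. It is a minimal sketch under stated assumptions, not a method drawn from the reviewed studies: the lexicon `PRO_LEXICON`, the cue list `NEGATION_CUES`, the function `extract_pros`, and the sample note are all hypothetical.

```python
import re

# Hypothetical mini-lexicon mapping surface phrases to PRO concepts
# (illustrative only; not taken from the reviewed studies).
PRO_LEXICON = {
    "fatigue": "fatigue",
    "tired": "fatigue",
    "pain": "pain",
    "nausea": "nausea",
    "trouble sleeping": "sleep disturbance",
}

# Hypothetical negation cues, checked in a short window of words
# preceding each matched phrase.
NEGATION_CUES = ("no", "denies", "without", "not")


def extract_pros(note, window=3):
    """Return (concept, negated) pairs found in a free-text note."""
    results = []
    # Split into rough sentence-like segments so negation does not leak
    # across sentence boundaries.
    for sentence in re.split(r"[.;\n]+", note.lower()):
        for phrase, concept in PRO_LEXICON.items():
            pattern = r"\b" + re.escape(phrase) + r"\b"
            for match in re.finditer(pattern, sentence):
                # Look at the last `window` words before the match.
                preceding = re.findall(r"[a-z']+", sentence[:match.start()])[-window:]
                negated = any(cue in preceding for cue in NEGATION_CUES)
                results.append((concept, negated))
    return results


# Made-up example note, not real patient data.
note = "Patient reports fatigue and trouble sleeping. Denies nausea; no pain today."
for concept, negated in extract_pros(note):
    print(concept, "negated" if negated else "affirmed")
```

A rule-based approach like this is transparent and needs no training data, but it misses paraphrases and misspellings, which is one reason ML-based methods are often paired with NLP for this task.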

Article highlights

  • Unstructured PROs are often passively collected during the patient–clinician conversation and documented as part of routine clinical care, and a substantial volume of unstructured PROs is available in EHRs.

  • Because conducting PRO surveys in busy clinical settings is challenging, leveraging free-text PROs documented in EHRs and applying NLP to analyze them is clinically relevant for improving decision-making in cancer care.

  • Applying NLP methods can greatly enhance the efficiency and precision of examining unstructured PRO data in EHRs for individuals with cancer, which will support more effective clinical applications in oncology.

  • While still in its early stages, the implementation of large language models as a new technique holds the potential to enhance the examination of unstructured PROs and their utilization in oncology.

Declarations of interest

The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

Author contributions

Conceptualization: Jin-ah Sim, I-Chan Huang; Data curation: Jin-ah Sim; Funding acquisition: I-Chan Huang, Xiaolei Huang; Methodology: Jin-ah Sim, Xiaolei Huang, I-Chan Huang; Project administration: I-Chan Huang; Resources: I-Chan Huang; Supervision: I-Chan Huang; Visualization: Jin-ah Sim; Writing – original draft preparation: Jin-ah Sim, I-Chan Huang; Writing – review and editing: Xiaolei Huang, Madeline R. Horan, Justin N. Baker, I-Chan Huang; All authors have read and agreed to the submitted version of the manuscript.

Reviewer disclosures

Peer reviewers on this manuscript have no relevant financial or other relationships to disclose.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/14737167.2024.2322664

Additional information

Funding

The research reported in this manuscript was supported by the U.S. National Cancer Institute R01CA238368 (Huang/Baker) and the National Science Foundation IIS-2245920 (Huang). The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
