ABSTRACT
Introduction
Patient-reported outcomes (PROs; symptoms, functional status, quality-of-life) documented as free text, that is, in unstructured form, within clinical notes from electronic health records (EHRs) offer valuable insights beyond biological and clinical data for medical decision-making. However, a comprehensive assessment of how natural language processing (NLP) coupled with machine learning (ML) methods has been used to analyze unstructured PROs, and of its clinical implementation for individuals affected by cancer, remains lacking.
Areas covered
This study systematically reviewed published studies that used NLP techniques to extract and analyze PROs in clinical narratives from EHRs for cancer populations. We examined the types of NLP techniques (with and without ML) and the platforms used for data processing, analysis, and clinical applications.
Expert opinion
NLP methods offer a valuable approach for processing and analyzing unstructured PROs among cancer patients and survivors. These techniques cover a broad range of applications: extracting or recognizing PROs; categorizing, characterizing, or grouping PROs; predicting or stratifying risk for adverse clinical outcomes; and evaluating associations between PROs and adverse clinical outcomes. NLP techniques are advantageous for converting the substantial volumes of unstructured PRO data within EHRs into practical clinical tools for individuals with cancer.
Article highlights
Unstructured PROs are often passively collected during the patient–clinician conversation and documented as part of routine clinical care, so a substantial volume of unstructured PROs is already available in EHRs.
Because conducting PRO surveys in busy clinical settings is challenging, leveraging free-text PROs documented in EHRs and applying NLP to analyze them is a clinically relevant way to improve decision-making in cancer care.
Applying NLP methods can greatly enhance the efficiency and precision of examining unstructured PRO data in EHRs for individuals with cancer, which will support more effective clinical applications in oncology.
Although still in its early stages, the use of large language models holds the potential to enhance the examination of unstructured PROs and their utilization in oncology.
Declarations of interest
The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.
Author contributions
Conceptualization: Jin-ah Sim, I-Chan Huang; Data curation: Jin-ah Sim; Funding acquisition: I-Chan Huang, Xiaolei Huang; Methodology: Jin-ah Sim, Xiaolei Huang, I-Chan Huang; Project administration: I-Chan Huang; Resources: I-Chan Huang; Supervision: I-Chan Huang; Visualization: Jin-ah Sim; Writing – original draft preparation: Jin-ah Sim, I-Chan Huang; Writing – review and editing: Xiaolei Huang, Madeline R. Horan, Justin N. Baker, I-Chan Huang. All authors have read and agreed to the submitted version of the manuscript.
Reviewer disclosures
Peer reviewers on this manuscript have no relevant financial or other relationships to disclose.
Supplementary material
Supplemental data for this article can be accessed online at https://doi.org/10.1080/14737167.2024.2322664