ABSTRACT
There is growing interest in using machine learning (ML) methods with real-world data (RWD) to generate real-world evidence (RWE) in support of regulatory decisions. At the U.S. Food and Drug Administration (FDA), ML has been applied to both prediction and causal inference problems in drug safety evaluation. These applications include health outcome identification, missing data imputation, risk factor identification, drug utilization discovery, and causal inference studies. We demonstrate the present utility and future potential of ML for regulatory science. We then discuss the challenges and considerations that arise when using ML methods with RWD to generate RWE. Specifically, we focus on the transparency and reproducibility of ML-based analyses, the potential of ML and natural language processing (NLP) for handling missing data in RWD, training data issues for rare events, and the interpretability of studies using ML.
Funding
The author(s) reported there is no funding associated with the work featured in this article.