ABSTRACT
Introduction: Published drug safety data have evolved over the past decade, driven by scientific and technological advances in the relevant research fields. Because a vast body of scientific literature has been published in this area, identifying the key information is difficult. Topic modeling has emerged as a powerful tool for extracting meaningful information from large volumes of unstructured text.
Areas covered: We analyzed the titles and abstracts of 4347 articles published from 2007 to 2016 in four journals dedicated to drug safety. We applied a latent Dirichlet allocation (LDA) model to extract 50 main topics, and conducted a trend analysis to explore how the popularity of these topics changed over the years.
Expert Opinion/Commentary: We found that ‘benefit-risk assessment and communication’, ‘diabetes’ and ‘biologic therapy for autoimmune diseases’ are the three most frequently published topics. Topics relevant to the use of electronic health records/observational data for safety surveillance have become increasingly popular over time. Meanwhile, research on signal detection based on spontaneous reporting has decreased slightly, although spontaneous reporting still plays an important role in benefit-risk assessment. Topics related to medical conditions and treatment showed highly dynamic patterns over time.
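The trend analysis mentioned under Areas covered could be sketched roughly as follows. This is a minimal illustration, not the method used in the study: it assumes that each article already has a vector of topic proportions (such as the document-topic output of an LDA fit), averages those proportions by publication year, and takes the least-squares slope of each topic's yearly mean as a crude indicator of rising or falling popularity. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def yearly_topic_share(years, doc_topics):
    """Mean topic proportion per publication year.

    years      -- publication year of each document
    doc_topics -- per-document topic proportions (assumed to come from an
                  LDA fit; each row is expected to sum to 1)
    """
    totals = {}
    counts = defaultdict(int)
    for year, weights in zip(years, doc_topics):
        if year not in totals:
            totals[year] = [0.0] * len(weights)
        for k, w in enumerate(weights):
            totals[year][k] += w
        counts[year] += 1
    # Divide the summed weights by the number of documents in each year.
    return {y: [s / counts[y] for s in totals[y]] for y in sorted(totals)}


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs (needs >= 2 distinct xs)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def topic_trends(years, doc_topics):
    """Slope of each topic's yearly mean share; positive means rising."""
    shares = yearly_topic_share(years, doc_topics)
    ordered_years = list(shares)
    n_topics = len(shares[ordered_years[0]])
    return [
        linear_slope(ordered_years, [shares[y][k] for y in ordered_years])
        for k in range(n_topics)
    ]


# Toy data: 4 documents, 2 topics, 3 years (hypothetical, not study data).
toy_years = [2007, 2007, 2008, 2009]
toy_doc_topics = [[0.8, 0.2], [0.6, 0.4], [0.5, 0.5], [0.3, 0.7]]
trends = topic_trends(toy_years, toy_doc_topics)
```

In the toy data the first topic's yearly share falls and the second one's rises, so the first slope is negative and the second positive. A real analysis would likely need a more careful trend test than a straight-line fit, particularly for topics with the highly dynamic patterns noted above.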
Acknowledgement
The author would like to thank Lester Reich (Pfizer) and Manfred Hauben (Pfizer) for their critical reading of the manuscript and their suggestions.
Declaration of interest
The author is an employee of Pfizer. The author has no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed. Peer reviewers on this manuscript have no relevant financial or other relationships to disclose.