Research Articles

Machine learning and traditional QSAR modeling methods: a case study of known PXR activators

Pages 903-917 | Received 25 Oct 2022, Accepted 22 Mar 2023, Published online: 14 Apr 2023
 

Abstract

Pregnane X receptor (PXR), extensively expressed in human tissues related to digestion and metabolism, is responsible for recognizing and detoxifying diverse xenobiotics encountered by humans. Computational approaches, viz., quantitative structure–activity relationship (QSAR) models, help explain the promiscuous nature of PXR and its ability to bind a variety of ligands; they also aid in the rapid dereplication of potential toxicological agents and reduce the number of animals needed to reach a meaningful regulatory decision. Recent advancements in machine learning techniques that accommodate larger datasets are expected to aid in developing effective predictive models for complex mixtures (viz., dietary supplements) before in-depth experiments are undertaken. Five hundred structurally diverse PXR ligands were used to develop traditional two-dimensional (2D) QSAR, machine-learning-based 2D-QSAR, field-based three-dimensional (3D) QSAR, and machine-learning-based 3D-QSAR models to establish the utility of predictive machine learning methods. Additionally, the applicability domain of the agonists was established to ensure the generation of robust QSAR models. A prediction set of dietary PXR agonists was used to externally validate the generated QSAR models. QSAR data analysis revealed that machine-learning 3D-QSAR techniques were more accurate in predicting the activity of external terpenes, with an external validation squared correlation coefficient (R²) of 0.70 versus an R² of 0.52 for machine-learning 2D-QSAR. Additionally, a visual summary of the binding pocket of PXR was assembled from the field 3D-QSAR models. By developing multiple QSAR models, this study establishes a robust groundwork for assessing PXR agonism across various chemical backbones, in anticipation of identifying potential causative agents in complex mixtures.

Communicated by Ramaswamy H. Sarma

Disclosure statement

The authors declare no conflict of interest.

Data availability statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Additional information

Funding

This research is supported in part by “Science-Based Authentication of Botanical Ingredients,” funded by the Food and Drug Administration [Grant Number 2U01FD004246], and by the United States Department of Agriculture, Discovery & Development of Natural Products for Pharmaceutical & Agricultural Applications [No. 58-6060-6-015]. The authors also thank the Computational Chemistry and Bioinformatics Research Core of the US National Institute of General Medical Sciences of the National Institutes of Health (NIH) under [Award Number P20GM130460] for providing access to the Cresset software.
