Drug Design, Development and Therapy, Volume 10, 2016

Open access

Original Research

Prediction of selective estrogen receptor beta agonist using open data and machine learning approach

Ai-qin Niu¹, Department of Gynecology, the First People’s Hospital of Shangqiu, Shangqiu, Henan, People’s Republic of China

Liang-jun Xie², Department of Image Diagnoses, the Third Hospital of Jinan, Jinan, Shandong, People’s Republic of China

Hui Wang¹, Department of Gynecology, the First People’s Hospital of Shangqiu, Shangqiu, Henan, People’s Republic of China

Bing Zhu¹, Department of Gynecology, the First People’s Hospital of Shangqiu, Shangqiu, Henan, People’s Republic of China

Sheng-qi Wang³, Department of Mammary Disease, Guangdong Provincial Hospital of Chinese Medicine, the Second Clinical College of Guangzhou University of Chinese Medicine, Guangzhou, People’s Republic of China. Correspondence: [email protected]

Pages 2323-2331 | Published online: 18 Jul 2016

Figures & data

Figure 1 The data analysis and machine learning schema.

Notes: Step 1: collect ER-β agonist data from a public database. Step 2: analyze chemical diversity. Step 3: construct machine learning models. Step 4: validate the constructed models.
Abbreviations: ER, estrogen receptor; SVM, support vector machine; ROC, receiver operating characteristic; PCA, principal component analysis.

Figure 2 Principal component analysis (PCA) of the dataset.

Notes: The PCA was based on four types of fingerprints. Each dot represents a unique compound of the dataset. Black dots represent active compounds, whereas gray dots represent inactive compounds.
Abbreviations: Ext, extended; AP2D, 2D atom pairs; FP, fingerprints.

Figure 3 The heat map of distance matrix for the compounds in the collected dataset.

Note: Green represents a large distance and structural dissimilarity.
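As a rough illustration of how a pairwise distance matrix like the one in Figure 3 can be built, the sketch below computes Tanimoto (Jaccard) distances between binary fingerprints. The choice of the Tanimoto metric, the representation of each fingerprint as a set of on-bit indices, and the toy fingerprints are all assumptions made for this example; the caption does not state which metric was used.

```python
from typing import List, Set


def tanimoto_distance(a: Set[int], b: Set[int]) -> float:
    """Return 1 - |a & b| / |a | b| for two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        # Convention chosen here: two empty fingerprints count as identical.
        return 0.0
    return 1.0 - len(a & b) / union


def distance_matrix(fingerprints: List[Set[int]]) -> List[List[float]]:
    """Build the symmetric pairwise distance matrix with a zero diagonal."""
    n = len(fingerprints)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = tanimoto_distance(fingerprints[i], fingerprints[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Toy fingerprints (hypothetical, not taken from the dataset).
toy = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
dm = distance_matrix(toy)
```

A distance near 1 means two compounds share almost no fingerprint bits, which corresponds to the "large distance and structural dissimilarity" end of the heat map's color scale.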

Table 1 Model performances of 5-fold cross validation

Figure 4 The ROC curves of the 5-fold cross validation models based on four types of fingerprints (FP) and four machine learning approaches.

Note: The error bar in the curve is based on five runs of the 5-fold cross validation process.
Abbreviations: ROC, receiver operating characteristic; NB, Naïve Bayesian; KNN, k-nearest neighbor; RF, random forest; SVM, support vector machine; Ext, extended; AP2D, 2D atom pairs; TP, true positives; FPos, false positives.
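A minimal sketch of the two ingredients behind ROC curves such as those in Figure 4: splitting compounds into stratified folds, and turning held-out prediction scores into true positive and false positive rates. The function names, the stratification scheme, the fixed seed, and the toy labels and scores are assumptions for illustration only; they are not the authors' implementation or data.

```python
import random
from typing import List, Tuple


def stratified_folds(labels: List[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Assign compound indices to k folds, spreading each class evenly."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def roc_points(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, one per distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one active and one inactive compound")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all compounds tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy labels (1 = active, 0 = inactive) and scores, hypothetical values only.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_curve = roc_points(toy_labels, toy_scores)
toy_auc = auc(toy_curve)
toy_folds = stratified_folds([1] * 10 + [0] * 10, k=5)
```

Repeating the fold assignment with different seeds and recomputing the curve each time is one way to obtain the run-to-run spread that error bars summarize.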

Figure 5 Performance ranking of machine learning methods with various fingerprints (FP).

Note: For example, KNN ranked first with MACCSFP, second with AP2D, and third with ExtFP or PubChemFP.
Abbreviations: NB, Naïve Bayesian; KNN, k-nearest neighbor; RF, random forest; SVM, support vector machine; Ext, extended; AP2D, 2D atom pairs.
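The ranking summarized in Figure 5 can be thought of as sorting the methods by a performance score within each fingerprint type. The sketch below shows one way to do that, with tied scores sharing a rank. The scoring metric, the tie-handling rule, and the placeholder scores are assumptions invented for this example; they are not results from the article.

```python
from typing import Dict


def rank_methods(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank methods from best (1) to worst; equal scores share the same rank."""
    ordered = sorted(scores.items(), key=lambda kv: -kv[1])
    ranks: Dict[str, int] = {}
    for position, (name, value) in enumerate(ordered, start=1):
        if position > 1 and value == ordered[position - 2][1]:
            # Tied with the previous method: reuse its rank.
            ranks[name] = ranks[ordered[position - 2][0]]
        else:
            ranks[name] = position
    return ranks


# Placeholder scores for a single fingerprint type (hypothetical, not from the paper).
placeholder_scores = {"NB": 0.70, "KNN": 0.90, "RF": 0.85, "SVM": 0.85}
placeholder_ranks = rank_methods(placeholder_scores)
```

Applying the same function once per fingerprint type gives one rank per method and fingerprint, which is the kind of grid the figure displays.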

Figure 6 Performance ranking of fingerprints (FP) in various machine learning methods.

Note: For example, MACCSFP ranked third with NB and second with KNN, RF, and SVM.
Abbreviations: NB, Naïve Bayesian; KNN, k-nearest neighbor; RF, random forest; SVM, support vector machine; Ext, extended; AP2D, 2D atom pairs.

Table S2 Five-fold cross validation model performance using experimental inactive agonists

Table 2 Model performances of test set

Table 3 Model performances of external test set

Table S1 Ten-fold cross validation model performance
