Research Paper

A nine-hub-gene signature of metabolic syndrome identified using machine learning algorithms and integrated bioinformatics

Pages 5727-5738 | Received 14 Jun 2021, Accepted 11 Aug 2021, Published online: 13 Sep 2021
 

ABSTRACT

Early risk assessment and intervention for metabolic syndrome (MetS) are limited by a lack of effective biomarkers. In the present study, several candidate genes were selected as a blood-based transcriptomic signature for MetS. We collected the largest set of MetS-associated peripheral blood high-throughput transcriptomic data to date and put forward a novel feature selection strategy that combines weighted gene co-expression network analysis, protein-protein interaction network analysis, LASSO regression, and random forest approaches. Two gene modules, 51 hub genes, and a 9-hub-gene signature associated with MetS were identified. Based on this 9-hub-gene signature, we then performed logistic regression analysis and established a web-based nomogram calculator for MetS risk (https://xjtulgz.shinyapps.io/DynNomapp/). The 9-hub-gene signature showed excellent classification and calibration performance (AUC = 0.968 in the training set, AUC = 0.883 in the internal validation set, AUC = 0.861 in the external validation set) as well as favorable potential clinical benefit.
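As a rough illustration of the last two steps described above (a logistic model built on a gene signature, evaluated by AUC), the following minimal sketch may help. It is not the authors' pipeline: the gene names (`GENE_A`, `GENE_B`, `GENE_C`), the coefficients, the intercept, and the four toy samples are all hypothetical placeholders, and the AUC is computed with the plain rank-based (Mann-Whitney) definition rather than any particular library.

```python
import math


def logistic_risk(expression, coefficients, intercept):
    """Map signature-gene expression values to a predicted probability
    using a logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(coefficients[g] * expression[g] for g in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))


def auc(labels, scores):
    """AUC as the fraction of (case, control) pairs in which the case
    receives the higher score; tied pairs count as one half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical model parameters (assumptions for illustration only;
# these are NOT the coefficients of the published 9-hub-gene signature).
COEFS = {"GENE_A": 1.2, "GENE_B": -0.8, "GENE_C": 0.5}
INTERCEPT = -0.3

# Toy samples: (expression values, label), where 1 = MetS and 0 = control.
samples = [
    ({"GENE_A": 2.0, "GENE_B": 0.5, "GENE_C": 1.0}, 1),
    ({"GENE_A": 0.2, "GENE_B": 1.5, "GENE_C": 0.1}, 0),
    ({"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 0.4}, 1),
    ({"GENE_A": 0.5, "GENE_B": 0.2, "GENE_C": 0.9}, 0),
]

labels = [y for _, y in samples]
scores = [logistic_risk(x, COEFS, INTERCEPT) for x, _ in samples]
print(f"AUC on toy data = {auc(labels, scores):.3f}")
```

In practice the coefficients would be estimated from the training set, and the AUC would be reported separately for the training, internal validation, and external validation sets, as in the figures quoted above.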

Research highlights

  1. Combining bioinformatics analysis and machine learning algorithms

  2. Providing a novel strategy for biomarker identification

  3. A nine-hub-gene signature with high diagnostic value for MetS

Acknowledgements

We would like to thank all participants for their commitment and cooperation.

Author contributions

Conception and design: Xin Huang, Pei Yang, Kunzheng Wang, Guanzhi Liu; collection and assembly of data: Yutian Lei, Zhuo Huang, Sen Luo; analysis and interpretation of the data: Guanzhi Liu, Jianhua Wu; drafting of the article: Guanzhi Liu. All authors read, critically revised, and approved the final manuscript.

Data availability statement

The data that support the findings of this study are available from the corresponding author on reasonable request. The datasets for this study can be found in the Gene Expression Omnibus (GEO) database [https://www.ncbi.nlm.nih.gov/geo/].

Disclosure statement

No potential conflict of interest was reported by the author(s).

Supplementary material

Supplemental data for this article can be accessed here

Additional information

Funding

The research was funded by the Key Project for Science Research and Development of Shaanxi Province [2019SF‐164] and the Clinical Research Award of the First Affiliated Hospital of Xi'an Jiaotong University, China [No. XJTU1AF-CRF-2019-014].