Abstract
Marfan syndrome (MFS) is a hereditary disease with high mortality. This study aimed to explore potential peripheral blood markers and underlying mechanisms of MFS through a series of bioinformatics and machine learning analyses. First, we downloaded two MFS datasets from the GEO database. A total of 215 differentially expressed genes (DEGs) and 78 differentially expressed miRNAs (DEMs) were identified using the limma package. Sixty DEGs, mainly enriched in abnormal transport of structural and energy substances, were selected after protein-protein interaction (PPI) network construction. Of these, 20 were chosen for machine learning after filtering with three centrality algorithms (betweenness, closeness, and degree) in Cytoscape. Four overlapping DEGs (ACTN1, CFTR, GCKR, LAMA3) were finally selected as candidate markers based on three machine-learning approaches (Lasso, random forest, and support vector machine-recursive feature elimination). To validate these findings, we collected peripheral blood from MFS patients and healthy controls; qRT-PCR showed that the expression of all four DEGs differed significantly between MFS patients and controls. In addition, the area under the receiver operating characteristic curve was greater than 0.8 for each DEG. Single-sample gene-set enrichment analysis showed that the four DEGs were strongly associated with inflammation and myogenesis pathways. Finally, we constructed an mRNA-miRNA network based on the intersection of the DEMs and the miRNAs predicted to target the DEGs. In conclusion, our study proposes four potential peripheral blood markers relevant to MFS pathogenesis.
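To make the reported per-gene discrimination criterion concrete, the following is a minimal sketch of how the area under the receiver operating characteristic curve can be computed for a single marker. It is not the authors' code: the function name `roc_auc` and the expression values are hypothetical and serve only as an illustration. The sketch uses the rank-based definition of the AUC, namely the probability that a randomly chosen patient has a higher marker value than a randomly chosen control, with ties counted as one half.

```python
from typing import Sequence


def roc_auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Return the AUC as P(case value > control value), counting ties as 0.5."""
    if not cases or not controls:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    # Compare every case with every control (fine for small clinical cohorts).
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical relative expression values, for illustration only (not study data).
mfs_expr = [2.1, 1.8, 2.6, 1.4, 2.9]
control_expr = [1.0, 1.5, 0.8, 1.2, 1.9]

auc = roc_auc(mfs_expr, control_expr)
print(f"AUC = {auc:.2f}")  # prints "AUC = 0.88"
```

For a marker that is lower in patients than in controls, this function returns a value below 0.5; in that case the discriminative ability is usually reported as 1 minus the returned value.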
Communicated by Ramaswamy H. Sarma
Author contributions
Study design: YF.Z, LM.T and JJ.W. Data acquisition and processing: GH.W and CJ.L. Clinical sample collection and experimental validation: QY.W and JJ.W. Image generation and interpretation: XQ.T and ZF.W. All authors drafted the manuscript and approved the final version.
Disclosure statement
The authors declare that they have no conflicts of interest.
Data availability statement
The datasets used in this study (GSE110964, GSE110965) are available from the GEO database (https://www.ncbi.nlm.nih.gov/). The data processing procedures and relevant code are available from the corresponding author upon reasonable request.
Ethical approval and consent to participate
The clinical sample collection protocol was approved by the Ethics Committee of Shaoxing People’s Hospital (Approval number: 2022 Ethics Clearance No. 120). All participants provided informed consent.
Acknowledgment
We appreciate the editing services provided by the Home for Researchers editorial team (www.home-for-researchers.com).