1,850
Views
0
CrossRef citations to date
0
Altmetric
Research Article

Identification of prognostic biomarkers of breast cancer based on the immune-related gene module

, , , , , , & show all
Article: 2244695 | Received 20 Mar 2023, Accepted 31 Jul 2023, Published online: 16 Aug 2023

Figures & data

Table 1. Primer sequences for qRT-PCR.

Figure 1. Identification of soft threshold power in WGCNA and screening of immune-related gene modules.

(A) Scale-free index analysis of various soft threshold power β. (B) Average connectivity analysis of each soft threshold power β. (C) When β = 6, the connectivity distribution of the co-expression network. (D) Scale-free topological performance verification of co-expression network when β = 6. (E) Dynamic splicing clustering tree based on intergenic dissimilarity to measure 1-TOM clustering. Each color represents a gene module. (F) Correlation heat map between module principal components and BC immune traits. The numbers above the parentheses in each square indicate the correlation coefficient and the numbers in parentheses indicate the degree of significance

Figure 1. Identification of soft threshold power in WGCNA and screening of immune-related gene modules.(A) Scale-free index analysis of various soft threshold power β. (B) Average connectivity analysis of each soft threshold power β. (C) When β = 6, the connectivity distribution of the co-expression network. (D) Scale-free topological performance verification of co-expression network when β = 6. (E) Dynamic splicing clustering tree based on intergenic dissimilarity to measure 1-TOM clustering. Each color represents a gene module. (F) Correlation heat map between module principal components and BC immune traits. The numbers above the parentheses in each square indicate the correlation coefficient and the numbers in parentheses indicate the degree of significance

Figure 2. Enrichment analysis of target module gene.

(A) GO functional enrichment analysis. (B) KEGG analysis. The color of the dots indicates the degree of enrichment significance. The size of the dots indicates the number of enriched genes.

Figure 2. Enrichment analysis of target module gene.(A) GO functional enrichment analysis. (B) KEGG analysis. The color of the dots indicates the degree of enrichment significance. The size of the dots indicates the number of enriched genes.

Figure 3. Construction of the risk assessment model based on the target module.

(A) LASSO coefficient spectrum of LASSO regression analysis. (B) Coefficient distribution map for logarithmic (λ) sequence in LASSO model. (C) Forest plot of five genes obtained by multivariate Cox regression analysis. When hazard ratio (HR) >1, the characteristic gene is a risk factor. When HR <1, the trait gene is a protective factor. When HR = 1, the characteristic gene does not play a role in survival time.

Figure 3. Construction of the risk assessment model based on the target module.(A) LASSO coefficient spectrum of LASSO regression analysis. (B) Coefficient distribution map for logarithmic (λ) sequence in LASSO model. (C) Forest plot of five genes obtained by multivariate Cox regression analysis. When hazard ratio (HR) >1, the characteristic gene is a risk factor. When HR <1, the trait gene is a protective factor. When HR = 1, the characteristic gene does not play a role in survival time.

Figure 4. Validation of the five-feature gene risk assessment model.

(A) Expression heat map of five-feature genes in high- and low-risk groups of TCGA-BRCA training set. (B) Distribution of five-feature gene prognostic model risk scores in TCGA-BRCA training set. (C) Distribution of survival status in TCGA-BRCA training set. Red dots indicate that the patient is dead at a fixed time, and green dots indicate that the patient is alive. (D) PCA reduced-dimension map of five-feature genes in TCGA-BRCA high- and low-risk groups. (E–G) ROC curves of a prognostic model in TCGA-BRCA training set, GSE42568 validation set, and GSE20685 validation set. (H–J) K-M survival curves in high- and low-risk groups of TCGA-BRCA training set, GSE42568 validation set, and GSE20685 validation set.

Figure 4. Validation of the five-feature gene risk assessment model.(A) Expression heat map of five-feature genes in high- and low-risk groups of TCGA-BRCA training set. (B) Distribution of five-feature gene prognostic model risk scores in TCGA-BRCA training set. (C) Distribution of survival status in TCGA-BRCA training set. Red dots indicate that the patient is dead at a fixed time, and green dots indicate that the patient is alive. (D) PCA reduced-dimension map of five-feature genes in TCGA-BRCA high- and low-risk groups. (E–G) ROC curves of a prognostic model in TCGA-BRCA training set, GSE42568 validation set, and GSE20685 validation set. (H–J) K-M survival curves in high- and low-risk groups of TCGA-BRCA training set, GSE42568 validation set, and GSE20685 validation set.

Figure 5. Analysis of immune characteristics and gene mutation differences in TCGA-BRCA high- and low-risk groups.

(A) Heat map of 29 immune features analyzed by ssGSEA. (B) Violin plot of tumor purity score for samples from high- and low-risk groups. ****p < 0.00001. (C) Box-plot of HLA-related protein expression in high- and low-risk groups. ***p < 0.0001. (D) CIBERSORT was used to analyze differences in 22 immune cell infiltration scores between high- and low-risk groups. *p < 0.05, **p < 0.001, ***p < 0.0001. (E) Pearson correlation analysis of programmed cell death-Ligand 1(PD-L1) and risk score. (F) Pearson correlation analysis between programmeddeath-1(PD-1) and risk score. (G) Pearson correlation analysis between LAG3 and risk score. (H) Pearson correlation analysis between CTLA4 and risk score. (I) Waterfall map of 10 high-frequency mutation genes in the low-risk group. (J) Waterfall map of 10 high-frequency mutation genes in the high-risk group. The ordinate represents the gene, the abscissa represents the patient, and the color of the thin bars in the coordinate represents the mutation type of the gene corresponding to the abscissa

Figure 5. Analysis of immune characteristics and gene mutation differences in TCGA-BRCA high- and low-risk groups.(A) Heat map of 29 immune features analyzed by ssGSEA. (B) Violin plot of tumor purity score for samples from high- and low-risk groups. ****p < 0.00001. (C) Box-plot of HLA-related protein expression in high- and low-risk groups. ***p < 0.0001. (D) CIBERSORT was used to analyze differences in 22 immune cell infiltration scores between high- and low-risk groups. *p < 0.05, **p < 0.001, ***p < 0.0001. (E) Pearson correlation analysis of programmed cell death-Ligand 1(PD-L1) and risk score. (F) Pearson correlation analysis between programmeddeath-1(PD-1) and risk score. (G) Pearson correlation analysis between LAG3 and risk score. (H) Pearson correlation analysis between CTLA4 and risk score. (I) Waterfall map of 10 high-frequency mutation genes in the low-risk group. (J) Waterfall map of 10 high-frequency mutation genes in the high-risk group. The ordinate represents the gene, the abscissa represents the patient, and the color of the thin bars in the coordinate represents the mutation type of the gene corresponding to the abscissa

Figure 6. Enrichment results of GSEA biological pathway in high- and low-risk groups. (A-C) GSEA results showed enriched signaling pathways. ES indicates enrichment scores and FDR q-val indicates significance.

Figure 6. Enrichment results of GSEA biological pathway in high- and low-risk groups. (A-C) GSEA results showed enriched signaling pathways. ES indicates enrichment scores and FDR q-val indicates significance.

Figure 7. Construction and verification of the nomogram.

(A) Correlation between risk score and clinical tumor stage. (B) Association between risk scores and breast cancer subtypes. (C) The forest plot was constructed in line with univariate Cox regression analysis. (D) The forest plot was constructed in line with multivariate Cox regression analysis. (E) Nomogram for predicting OS in TCGA-BRCA training set. (F–H) Calibration curves for predicting 1-, 3-, and 5-year OS rates of BC patients.

Figure 7. Construction and verification of the nomogram.(A) Correlation between risk score and clinical tumor stage. (B) Association between risk scores and breast cancer subtypes. (C) The forest plot was constructed in line with univariate Cox regression analysis. (D) The forest plot was constructed in line with multivariate Cox regression analysis. (E) Nomogram for predicting OS in TCGA-BRCA training set. (F–H) Calibration curves for predicting 1-, 3-, and 5-year OS rates of BC patients.

Figure 8. Expression levels of prognostic feature genes in the high- and low-risk groups were verified by qRT-PCR.

*p < 0.05 indicates a significant difference in gene expression between the tumor and normal groups.

Figure 8. Expression levels of prognostic feature genes in the high- and low-risk groups were verified by qRT-PCR.*p < 0.05 indicates a significant difference in gene expression between the tumor and normal groups.

Data availability statement

The datasets generated and analyzed during the current study are not publicly available but are available from the corresponding author on reasonable request.