Original Research

Identification of potential core genes in triple negative breast cancer using bioinformatics analysis

Pages 4105-4112 | Published online: 18 Jul 2018

Abstract

Background

Triple-negative breast cancer (TNBC) is a subtype of breast cancer with poor clinical outcome and limited treatment options. Because TNBC lacks molecular targets, chemotherapy is the main adjuvant treatment for TNBC patients.

Materials and methods

To explore potential therapeutic targets for TNBC, we analyzed three microarray datasets (GSE38959, GSE45827, and GSE65194) derived from the Gene Expression Omnibus (GEO) database. The GEO2R tool was used to screen out differentially expressed genes (DEGs) between TNBC and normal tissue. Gene Ontology function and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses were performed using the Database for Annotation, Visualization and Integrated Discovery to identify the pathways and functional annotation of DEGs. Protein–protein interactions among these DEGs were analyzed using the Search Tool for the Retrieval of Interacting Genes database and visualized with Cytoscape software. In addition, we used the online Kaplan–Meier plotter survival analysis tool to evaluate the prognostic value of hub gene expression in breast cancer patients.

Results

A total of 278 upregulated DEGs and 173 downregulated DEGs were identified. Among them, ten hub genes with a high degree of connectivity were selected. Overexpression of these hub genes was associated with unfavorable prognosis in breast cancer. In particular, CCNB1 was overexpressed in TNBC, and its overexpression indicated poor outcome.

Conclusion

Our study suggests that CCNB1 is overexpressed in TNBC compared with normal breast tissue and that overexpression of CCNB1 is an unfavorable prognostic factor in TNBC patients. Further study is needed to explore the value of CCNB1 in the treatment of TNBC.

Introduction

Triple-negative breast cancer (TNBC) is defined as a subtype of breast cancer that lacks expression of estrogen receptor (ER) and progesterone receptor (PR) and shows no amplification of human epidermal growth factor receptor 2 (HER2). This subset accounts for ~12%–17% of all invasive breast cancers [1]. TNBC is more frequently diagnosed in younger women and behaves more aggressively. Patients with TNBC are more likely to develop relapse and visceral metastasis than patients with other subtypes of breast cancer [2–5]. Because TNBC lacks molecular targets, patients cannot be treated with endocrine therapy or HER2-targeted therapy, and chemotherapy is currently the main adjuvant treatment [1]. Unfortunately, many tumors are resistant to chemotherapy and relapse or metastasize quickly after adjuvant treatment [6,7]. To date, TNBC remains a disease with poor outcome and limited treatment options. Hence, novel therapeutic targets for TNBC are urgently needed.

In this study, we sought to detect novel indicators of poor prognosis in TNBC patients and to provide potential therapeutic targets for this challenging disease. To detect the differentially expressed genes (DEGs) between TNBC and healthy human breast tissue, bioinformatics methods were used to analyze gene expression profiling data downloaded from the Gene Expression Omnibus (GEO) database. Gene Ontology (GO) functional annotation analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed for the screened DEGs. We then established a protein–protein interaction (PPI) network to identify hub genes related to TNBC. Survival analysis of these hub genes was performed using the online database Kaplan–Meier plotter.

Materials and methods

Data source

The gene expression datasets analyzed in this study were obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). A total of 1,821 series about human breast cancer were retrieved from the database. After a careful review, three gene expression profiles (GSE38959, GSE45827, and GSE65194) were selected. Among them, GSE38959 was based on the Agilent GPL4133 platform (Agilent-014850 Whole Human Genome Microarray 4×44K G4112F), and GSE45827 and GSE65194 were based on platform GPL570 ([HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array). All of the data were freely available online, and this study did not involve any experiment on humans or animals performed by any of the authors.

Data processing of DEGs

The GEO2R online analysis tool (https://www.ncbi.nlm.nih.gov/geo/geo2r/) was used to detect the DEGs between TNBC and normal samples, and the adjusted P-value and |logFC| were calculated. Genes that met the cutoff criteria, adjusted P<0.05 and |logFC|≥2.0, were considered DEGs. Statistical analysis was carried out for each dataset separately, and the DEGs shared by all three datasets were identified using the Venn diagram webtool (bioinformatics.psb.ugent.be/webtools/Venn/).
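The screening rule above can be sketched in a few lines of Python. This is a minimal illustration, not the GEO2R or Venn webtool implementation: the record layout `(gene, adjusted P, logFC)`, the function names, and the toy values are all assumptions made for the example, and real GEO2R output is per probe rather than per gene.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical record layout: (gene_symbol, adjusted_p_value, log_fold_change)
Row = Tuple[str, float, float]


def select_degs(rows: Iterable[Row], p_cut: float = 0.05,
                lfc_cut: float = 2.0) -> Dict[str, Set[str]]:
    """Apply the cutoffs stated in the text: adjusted P < 0.05 and |logFC| >= 2.0."""
    up: Set[str] = set()
    down: Set[str] = set()
    for gene, adj_p, log_fc in rows:
        if adj_p < p_cut and abs(log_fc) >= lfc_cut:
            (up if log_fc > 0 else down).add(gene)
    return {"up": up, "down": down}


def common_degs(per_dataset: List[Dict[str, Set[str]]]) -> Dict[str, Set[str]]:
    """Keep only genes called in the same direction in every dataset."""
    return {
        "up": set.intersection(*(d["up"] for d in per_dataset)),
        "down": set.intersection(*(d["down"] for d in per_dataset)),
    }


# Made-up toy values for illustration only; they are not taken from the study.
ds_a = [("CCNB1", 0.001, 3.1), ("CDK1", 0.002, 2.4),
        ("GENE_X", 0.2, 4.0), ("GENE_Y", 0.01, -2.5)]
ds_b = [("CCNB1", 0.01, 2.2), ("CDK1", 0.03, 1.5), ("GENE_Y", 0.001, -3.0)]
ds_c = [("CCNB1", 0.04, 2.0), ("CDK1", 0.001, 2.9), ("GENE_Y", 0.02, -2.1)]

shared = common_degs([select_degs(d) for d in (ds_a, ds_b, ds_c)])
print(shared)
```

Intersecting the up and down sets separately mirrors the two panels of the Venn analysis, so a gene that changes direction between datasets is not counted as shared.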

GO and KEGG pathway analysis of DEGs

GO analysis is a commonly used method for large-scale functional enrichment research; gene functions can be classified into biological process (BP), molecular function (MF), and cellular component (CC). KEGG is a widely used database that stores data about genomes, biological pathways, diseases, chemical substances, and drugs. GO annotation analysis and KEGG pathway enrichment analysis of DEGs in this study were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) tools (https://david.ncifcrf.gov/). Terms with P<0.01 and gene count≥10 were considered statistically significant.
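To make the idea of term enrichment concrete, the sketch below computes a plain one-sided hypergeometric P-value and then applies the two cutoffs stated above. It is a generic over-representation test under assumed inputs, not a reproduction of the statistic that DAVID reports; the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the DEG list, k: DEGs annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def passes_cutoff(p_value: float, gene_count: int) -> bool:
    """Cutoffs stated in the text: P < 0.01 and gene count >= 10."""
    return p_value < 0.01 and gene_count >= 10
```

A term is kept only if both conditions hold, so a very small P-value supported by fewer than ten genes is still discarded.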

PPI network construction and hub gene identification

The Search Tool for the Retrieval of Interacting Genes (STRING) database (http://string-db.org/) is designed to analyze PPI information. To evaluate potential PPI relationships, the DEGs identified previously were mapped to the STRING database, and PPI pairs with a combined score>0.4 were extracted. Subsequently, the PPI network was visualized with Cytoscape software (www.cytoscape.org/). Nodes with a higher degree of connectivity tend to be more essential in maintaining the stability of the entire network. CytoHubba, a Cytoscape plugin, was used to calculate the degree of each protein node. In our study, the ten genes with the highest degree were identified as hub genes.
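The degree-based ranking described above amounts to counting, for each node, the edges that survive the score filter. The following sketch shows that calculation on a toy edge list; it is not CytoHubba's code, the edge tuple layout and function name are assumptions, and the scores are assumed to be on a 0–1 scale.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Hypothetical edge layout: (protein_a, protein_b, combined_score on a 0-1 scale)
Edge = Tuple[str, str, float]


def top_hubs(edges: Iterable[Edge], min_score: float = 0.4,
             k: int = 10) -> List[Tuple[str, int]]:
    """Rank nodes by degree, using only edges with combined score > min_score."""
    degree: Counter = Counter()
    seen: Set[frozenset] = set()
    for a, b, score in edges:
        pair = frozenset((a, b))
        # Skip self-loops, low-confidence edges, and duplicate (a, b)/(b, a) pairs.
        if a == b or score <= min_score or pair in seen:
            continue
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name so that ties are deterministic.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Made-up toy network for illustration only.
toy_edges = [("A", "B", 0.9), ("A", "C", 0.7), ("A", "D", 0.5),
             ("B", "C", 0.45), ("C", "D", 0.3), ("B", "A", 0.9)]
print(top_hubs(toy_edges, k=2))  # [('A', 3), ('B', 2)]
```

Breaking ties alphabetically is a choice made here for reproducibility; how tied degrees are ordered in the original analysis is not stated in the text.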

Survival analysis of hub genes

The Kaplan–Meier plotter (http://kmplot.com/analysis/) is an online tool for assessing the effect of 54,675 genes on survival using 10,461 cancer samples (5,143 breast, 1,816 ovarian, 2,437 lung, and 1,065 gastric cancer). The Kaplan–Meier plotter mRNA breast cancer database was used to evaluate the prognostic value of hub genes in breast cancer patients, especially in TNBC patients. In our study, TNBC patients were selected based on negative ER, PR, and HER2 status. Probes were selected using the “only JetSet best probe set” option, and the probe IDs used for each gene are shown in Table S1. For each gene, cancer patients were divided into two groups according to the median value of mRNA expression. P<0.01 was considered to indicate a statistically significant result.
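The two steps described above, splitting patients at the median expression value and estimating a survival curve for each group, can be sketched as follows. This is a minimal illustration and not the Kaplan–Meier plotter's implementation: the function names and data layout are assumptions, samples exactly at the median are placed in the low group by choice, and the log-rank test and hazard ratio are not computed.

```python
from statistics import median
from typing import Dict, List, Sequence, Set, Tuple


def split_by_median(expr: Dict[str, float]) -> Tuple[Set[str], Set[str]]:
    """Return (low, high) sample IDs; values equal to the median go to 'low'."""
    cut = median(expr.values())
    high = {sample for sample, value in expr.items() if value > cut}
    return set(expr) - high, high


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate: [(t, S(t))] at each distinct event time.

    events[i] is 1 if the event (e.g. relapse) was observed at times[i],
    or 0 if follow-up was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up toy follow-up data for illustration only.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
# [(1, 0.75), (3, 0.375), (4, 0.0)]
```

In practice the curve would be computed once for the low-expression group and once for the high-expression group, and the two would then be compared with a log-rank test.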

Results

Identification of DEGs

Three gene expression profiles (GSE38959, GSE45827, and GSE65194) were selected in this study. GSE38959 contained 30 TNBC samples and 13 normal samples, and GSE45827 and GSE65194 each included 41 TNBC specimens and 11 normal breast specimens (Table 1). Based on the criteria of adjusted P<0.05 and |logFC|≥2, a total of 852 DEGs were identified from GSE38959, including 515 upregulated genes and 337 downregulated genes. In dataset GSE45827, 2,995 DEGs were identified: 2,117 genes were upregulated, and 878 genes were downregulated. From GSE65194, 3,031 DEGs were identified, including 2,130 upregulated genes and 901 downregulated genes. All DEGs were identified by comparing TNBC samples with normal breast samples. Subsequently, Venn analysis was performed to obtain the intersection of the DEG profiles (Figure 1). In total, 451 DEGs were significantly differentially expressed in all three datasets, of which 278 were upregulated and 173 were downregulated.

Table 1 Statistics of the three microarray databases derived from the GEO database

Figure 1 Venn diagram of DEGs common to all three GEO datasets.

Notes: (A) Downregulated genes. (B) Upregulated genes.
Abbreviations: DEG, differentially expressed gene; GEO, Gene Expression Omnibus.

Functional enrichment analyses of DEGs

GO function and KEGG pathway enrichment analyses of the DEGs were performed using DAVID (Table 2). The enriched GO terms were divided into CC, BP, and MF ontologies. For BP, the DEGs were mainly enriched in sister chromatid cohesion, microtubule-based movement, anaphase-promoting complex-dependent catabolic process, and extracellular matrix (ECM) organization. For MF, the DEGs were significantly enriched in microtubule binding, transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding, ATPase activity, and microtubule motor activity. For CC, the DEGs were enriched in condensed chromosome kinetochore, microtubule, kinetochore, and spindle. In addition, KEGG pathway analysis showed that the DEGs were mainly enriched in pathways in cancer, small cell lung cancer, and ECM–receptor interaction.

Table 2 Significantly enriched GO terms and KEGG pathways of DEGs

PPI network construction and hub gene identification

Protein interactions among the DEGs were predicted with STRING tools. A total of 111 nodes and 1,365 edges were involved in the PPI network, as presented in Figure 2. The top ten genes ranked by connectivity degree in the PPI network were identified (Table 3). Cyclin-dependent kinase 1 (CDK1) had the highest connectivity degree (degree=64), followed by cyclin B1 (CCNB1; degree=61), baculoviral IAP repeat containing 5 (BIRC5; degree=60), aurora kinase A (AURKA; degree=58), polo-like kinase 1 (PLK1; degree=56), mitotic arrest deficient 2-like 1 (MAD2L1; degree=54), BUB1 mitotic checkpoint serine/threonine kinase B (BUB1B; degree=54), nuclear division cycle 80 (NDC80; degree=53), budding uninhibited by benzimidazoles 1 (BUB1; degree=52), and kinesin family member 11 (KIF11; degree=52). All of these hub genes were upregulated in TNBC.

Table 3 Top ten hub genes with higher degree of connectivity

Figure 2 Protein–protein interaction network constructed with the differentially expressed genes.

Note: Red nodes represent upregulated genes, and blue nodes represent downregulated genes.

Survival analysis of ten hub genes

To investigate the prognostic value of the ten potential hub genes, the Kaplan–Meier plotter bioinformatics analysis platform was used. A total of 1,402 breast cancer patients were available for the analysis of overall survival. High expression of each of these hub genes was associated with unfavorable overall survival in breast cancer patients (Figure 3).

Figure 3 Kaplan–Meier overall survival analyses for the top ten hub genes expressed in breast cancer patients.

Note: See Table 3 for gene description.

However, only overexpression of CCNB1 was an unfavorable prognostic factor for relapse-free survival in TNBC patients (HR=2.12; 95% CI: 1.2–3.72; P=0.0078; n=255). There were too few events for an overall survival analysis in TNBC patients (Figure 4).

Figure 4 Kaplan–Meier relapse-free survival analyses for CCNB1 expression in TNBC patients.

Abbreviations: CCNB1, cyclin B1; TNBC, triple-negative breast cancer.

Discussion

Breast cancer is a heterogeneous disease, and histopathological features and clinical behavior differ among subtypes. TNBC is a unique subtype of breast cancer with poor prognosis. Patients with TNBC have an increased likelihood of relapse and visceral metastasis. Because TNBC lacks a therapeutic target, patients cannot benefit from endocrine therapy or HER2-targeted therapy, and chemotherapy is currently the mainstay of adjuvant treatment. However, TNBC patients are more likely to develop chemoresistance. Hence, it is crucial to identify new targeted therapies specific to TNBC.

In the present study, gene expression and protein–protein interaction analyses based on publicly available databases were performed to identify potential key genes correlated with TNBC. DEGs between TNBC and healthy human breast tissue were screened out based on gene expression profiling data from the GEO database. In total, we identified 278 upregulated DEGs and 173 downregulated DEGs. These DEGs were associated with GO terms such as condensed chromosome kinetochore, sister chromatid cohesion, kinetochore, and microtubule binding, and were significantly enriched in the KEGG terms small cell lung cancer, pathways in cancer, and ECM–receptor interaction. A PPI network was constructed to investigate the interrelationship of the DEGs, and ten hub genes were identified: AURKA, BIRC5, BUB1B, BUB1, CCNB1, CDK1, KIF11, MAD2L1, NDC80, and PLK1. All of these genes were upregulated in TNBC. Finally, the Kaplan–Meier plotter online tool was applied to assess the relationship between the expression of hub genes and prognosis. Overexpression of all the above genes was related to unfavorable prognosis in breast cancer patients. However, only overexpression of CCNB1 was an unfavorable prognostic factor in TNBC patients.

CCNB1, also known as cyclin B1, is a key modulator of cell proliferation [8]. Some research has demonstrated that CCNB1 is involved in apoptosis, chemoresistance, and epithelial–mesenchymal transition of tumor cells [9,10]. Overexpression of cyclin B1 has been reported in many tumors, such as colorectal cancer, gastric cancer [11], pancreatic carcinoma [12], and lung carcinoma [13]. Some of these studies suggested that overexpression of cyclin B1 may be associated with poor prognosis in these malignant diseases. For breast cancer, several studies have shown that cyclin B1 overexpression was associated with aggressive clinical behavior and was an independent prognostic factor. Aaltonen et al [14] showed that cyclin B1 overexpression was correlated with an aggressive phenotype and was significantly associated with shorter overall survival and metastasis-free survival in breast cancer patients. Ding et al [15] reported that a high level of CCNB1 was closely associated with hormone therapy resistance and poor recurrence-free survival, disease-free survival, and distant metastasis-free survival in ER+ breast cancer patients. A meta-analysis by Sun et al [16] suggested that cyclin B1 overexpression might be an independent prognostic marker for disease-specific survival and disease-free survival in breast cancer. TNBCs are usually high-grade tumors with primitive features, suggesting that cyclin B1 may be overexpressed in TNBC. Agarwal et al [17] reported that cyclin B1 was expressed at a significantly higher level in TNBC cell lines than in other subtypes. In our study, cyclin B1 was overexpressed in TNBC compared with normal breast tissue, and overexpression of cyclin B1 was correlated with unfavorable relapse-free survival in TNBC patients. Therefore, cyclin B1 may be a prognostic factor and potential therapeutic target for TNBC.

In addition to CCNB1, we detected nine other hub genes associated with breast cancer: CDK1, AURKA, BIRC5, MAD2L1, BUB1B, BUB1, PLK1, KIF11, and NDC80. Most of them have been reported as essential factors in cell division and proliferation. The proteins encoded by AURKA, BUB1, BUB1B, PLK1, and CDK1 are all serine/threonine kinases involved in the regulation of the cell cycle [18], and overexpression of these genes has been detected in various human cancers and correlated with prognosis. Roylance et al [19] reported that a high AURKA expression level was significantly associated with poorer clinical outcome in breast cancer patients. In the study by Sotiriou et al [20], BUB1 was upregulated and correlated with a poor clinical prognosis in breast cancer patients. Many studies have shown an association between PLK1 overexpression and poor clinical prognosis and have suggested that inhibition of PLK1 may be a potential cancer therapy [21,22]. For CDK1, many studies have reported its overexpression in cancers and its role as an adverse prognostic factor, and many kinds of CDK inhibitors have been developed [23].

In our study, BIRC5, KIF11, MAD2L1, and NDC80 were overexpressed in breast cancer compared with normal breast tissue, and overexpression of these genes was significantly correlated with unfavorable clinical outcome in breast cancer patients. These results are consistent with other studies [24–27]. However, the role of these genes in TNBC is not clear, and further study is needed.

Conclusion

Our bioinformatics analysis identified 451 DEGs between TNBC and normal breast tissue based on gene expression datasets obtained from the GEO database. Among them, ten hub genes might be the core genes of breast cancer: AURKA, BIRC5, BUB1B, BUB1, CCNB1, CDK1, KIF11, MAD2L1, NDC80, and PLK1. All of them were upregulated in breast cancer, and overexpression of these genes was associated with unfavorable clinical outcome in breast cancer patients. In TNBC patients, CCNB1 overexpression is an unfavorable prognostic factor. Further study is needed to confirm these results. Nevertheless, CCNB1 may be a potential target for TNBC therapy.

Acknowledgments

This study was supported by the Hubei Provincial Natural Science Fund Grant No 2016CFB525.

Supplementary material

Table S1 The desired probes of hub genes in the Kaplan–Meier plotter database

Disclosure

The authors report no conflicts of interest in this work.

References

  1. Foulkes WD, Smith IE, Reis-Filho JS. Triple-negative breast cancer. N Engl J Med. 2010;363(20):1938–1948. PMID: 21067385.
  2. Hudis CA, Gianni L. Triple-negative breast cancer: an unmet medical need. Oncologist. 2011;16(Suppl 1):1–11.
  3. Dent R, Trudeau M, Pritchard KI. Triple-negative breast cancer: clinical features and patterns of recurrence. Clin Cancer Res. 2007;13(15 Pt 1):4429–4434. PMID: 17671126.
  4. Carey L, Winer E, Viale G, Cameron D, Gianni L. Triple-negative breast cancer: disease entity or title of convenience? Nat Rev Clin Oncol. 2010;7(12):683–692. PMID: 20877296.
  5. Dent R, Hanna WM, Trudeau M, Rawlinson E, Sun P, Narod SA. Pattern of metastatic spread in triple-negative breast cancer. Breast Cancer Res Treat. 2009;115(2):423–428. PMID: 18543098.
  6. Deng X, Apple S, Zhao H. CD24 expression and differential resistance to chemotherapy in triple-negative breast cancer. Oncotarget. 2017;8(24):38294–38308. PMID: 28418843.
  7. Wein L, Loi S. Mechanisms of resistance of chemotherapy in early-stage triple negative breast cancer (TNBC). Breast. 2017;34(Suppl 1):S27–S30. PMID: 28668293.
  8. Smits VA, Medema RH. Checking out the G(2)/M transition. Biochim Biophys Acta. 2001;1519(1–2):1–12. PMID: 11406266.
  9. Song Y, Zhao C, Dong L. Overexpression of cyclin B1 in human esophageal squamous cell carcinoma cells induces tumor cell invasive growth and metastasis. Carcinogenesis. 2008;29(2):307–315. PMID: 18048386.
  10. Matthess Y, Raab M, Sanhaji M, Lavrik IN, Strebhardt K. Cdk1/cyclin B1 controls Fas-mediated apoptosis by regulating caspase-8 activity. Mol Cell Biol. 2010;30(24):5726–5740. PMID: 20937773.
  11. Wen Y, Cao L, Lian WP, Li GX. The prognostic significance of high/positive expression of cyclin B1 in patients with three common digestive cancers: a systematic review and meta-analysis. Oncotarget. 2017;8(56):96373–96383. PMID: 29221213.
  12. Zhou L, Li J, Zhao YP. The prognostic value of Cyclin B1 in pancreatic cancer. Med Oncol. 2014;31(9):107. PMID: 25106528.
  13. Soria JC, Jang SJ, Khuri FR. Overexpression of cyclin B1 in early-stage non-small cell lung cancer and its clinical implication. Cancer Res. 2000;60(15):4000–4004. PMID: 10945597.
  14. Aaltonen K, Amini RM, Heikkilä P. High cyclin B1 expression is associated with poor survival in breast cancer. Br J Cancer. 2009;100(7):1055–1060. PMID: 19293801.
  15. Ding K, Li W, Zou Z, Zou X, Wang C. CCNB1 is a prognostic biomarker for ER+ breast cancer. Med Hypotheses. 2014;83(3):359–364. PMID: 25044212.
  16. Sun X, Zhangyuan G, Shi L, Wang Y, Sun B, Ding Q. Prognostic and clinicopathological significance of cyclin B expression in patients with breast cancer: a meta-analysis. Medicine. 2017;96(19):e6860. PMID: 28489780.
  17. Agarwal R, Gonzalez-Angulo AM, Myhre S. Integrative analysis of cyclin protein levels identifies cyclin B1 as a classifier and predictor of outcomes in breast cancer. Clin Cancer Res. 2009;15(11):3654–3662. PMID: 19470724.
  18. Finetti P, Cervera N, Charafe-Jauffret E. Sixteen-kinase gene expression identifies luminal breast cancers with poor prognosis. Cancer Res. 2008;68(3):767–776. PMID: 18245477.
  19. Roylance R, Endesfelder D, Jamal-Hanjani M. Expression of regulators of mitotic fidelity are associated with intercellular heterogeneity and chromosomal instability in primary breast cancer. Breast Cancer Res Treat. 2014;148(1):221–229. PMID: 25288231.
  20. Sotiriou C, Neo SY, McShane LM. Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc Natl Acad Sci U S A. 2003;100(18):10393–10398. PMID: 12917485.
  21. Gutteridge RE, Ndiaye MA, Liu X, Ahmad N. Plk1 inhibitors in cancer therapy: from laboratory to clinics. Mol Cancer Ther. 2016;15(7):1427–1435. PMID: 27330107.
  22. Liu Z, Sun Q, Wang X. PLK1, a potential target for cancer therapy. Transl Oncol. 2017;10(1):22–32. PMID: 27888710.
  23. Chae SW, Sohn JH, Kim DH. Overexpressions of Cyclin B1, cdc2, p16 and p53 in human breast cancer: the clinicopathologic correlations and prognostic implications. Yonsei Med J. 2011;52(3):445–453. PMID: 21488187.
  24. Wang SC. PCNA: a silent housekeeper or a potential therapeutic target? Trends Pharmacol Sci. 2014;35(4):178–186. PMID: 24655521.
  25. Wang Z, Katsaros D, Shen Y. Biological and clinical significance of MAD2L1 and BUB1, genes frequently appearing in expression signatures for breast cancer prognosis. PLoS One. 2015;10(8):e0136246. PMID: 26287798.
  26. Pei YY, Li GC, Ran J, Wei FX. Kinesin family member 11 contributes to the progression and prognosis of human breast cancer. Oncol Lett. 2017;14(6):6618–6626. PMID: 29181100.
  27. Hamy AS, Bieche I, Lehmann-Che J. BIRC5 (survivin): a pejorative prognostic marker in stage II/III breast cancer with no response to neoadjuvant chemotherapy. Breast Cancer Res Treat. 2016;159(3):499–511. PMID: 27592112.