Original Research

Identification of genes and analysis of prognostic values in nonsmoking females with non-small cell lung carcinoma by bioinformatics analyses

Pages 4287-4295 | Published online: 08 Oct 2018

Abstract

Background

This study was performed to identify disease-related genes and analyze prognostic values in nonsmoking females with non-small cell lung carcinoma (NSCLC).

Materials and methods

Gene expression profile GSE19804 was downloaded from the Gene Expression Omnibus (GEO) database and analyzed by using GEO2R. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes were used for the functional and pathway enrichment analysis. Then, the Search Tool for the Retrieval of Interacting Genes, Cytoscape, and Molecular Complex Detection were used to construct the protein–protein interaction (PPI) network and identify hub genes. Finally, the Kaplan–Meier plotter online tool was used for the overall survival analysis of hub genes.

Results

A total of 699 differentially expressed genes were screened; they were mainly enriched in ECM–receptor interaction, focal adhesion, and cell adhesion molecules. A PPI network was constructed, and 15 hub genes were identified based on a subset of the PPI network. Two significant modules were then detected, and several genes were found to be associated with the cell cycle pathway. Finally, the expression of nine hub genes (UBE2C, DLGAP5, TPX2, CCNB2, BIRC5, KIF20A, TOP2A, GNG11, and ANXA1) was found to be associated with patient prognosis.

Conclusion

Overall, we propose that the cell cycle pathway may play an important role in nonsmoking females with NSCLC and the nine hub genes may be further explored as potential targets for NSCLC diagnosis and treatment.

Introduction

Lung cancer is one of the leading causes of cancer mortality worldwide.Citation1 Non-small-cell lung cancer (NSCLC), the main type of lung cancer, accounts for 80%–85% of all cases, and small cell lung cancer (SCLC) accounts for around 20% of all cases. Although smoking is the major risk factor for lung cancer,Citation2 only 7% of female patients with lung cancer in Taiwan have a history of cigarette smoking.Citation3 Other factors, such as environmental exposureCitation4,Citation5 and hereditary factors,Citation6 have been reported in nonsmoking lung cancer patients. The molecular mechanisms of NSCLC among nonsmoking women remain unclear, although some genes, such as EML4-ALK,Citation7 EGFR,Citation8 TP53,Citation9 and PIK3CA,Citation10 have been associated with lung cancer in never smokers. Thus, it is important to explore the molecular mechanisms underlying the onset and progression of NSCLC among nonsmoking females and to identify effective biomarkers.

Recently, gene microarrays and bioinformatics analysis have been widely used to identify potential cancer biomarkers. For example, integrated bioinformatics methods have been used to identify genes associated with the prognosis of breast cancer.Citation11,Citation12 Similar studies have sought key genes in lung cancer.Citation13–Citation15 However, few studies have focused on nonsmoking females with NSCLC.

In 2010, Lu et alCitation3 obtained a panel of differentially expressed genes (DEGs) from nonsmoking female NSCLC patients and normal samples. Using the same data, the present study further screened the DEGs and predicted their underlying functions by functional and pathway enrichment analyses. Then, a protein–protein interaction (PPI) network and its modules were constructed to identify hub genes. Finally, the prognostic values of the hub genes were further evaluated by survival analysis in nonsmoking females with NSCLC.

Materials and methods

Microarray data

Microarray expression profiles of GSE19804 were obtained from the Gene Expression Omnibus (GEO) database. GSE19804, which was based on the platform of the GPL570 (Affymetrix Human Genome U133 Plus 2.0 Array, Santa Clara, CA, USA), consisted of 60 samples from nonsmoking female NSCLC patients and 60 normal samples.Citation3

Identification of DEGs

The DEGs between NSCLC samples and normal controls were identified using GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/), a web tool that screens DEGs by comparing two groups of samples. |log FC| >1.5 and P<0.01 were selected as the cutoff criteria.
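
The cutoff step can be illustrated with a minimal Python sketch. It assumes the GEO2R result table has been exported as rows with `logFC` and `P.Value` fields; the gene names and values below are invented for illustration, and the text does not say whether raw or adjusted P-values were used, so raw P-values are assumed here.

```python
# Hypothetical rows in the shape of an exported GEO2R table; values are invented.
rows = [
    {"gene": "GENE_A", "logFC": 2.10, "P.Value": 1e-6},
    {"gene": "GENE_B", "logFC": -1.80, "P.Value": 5e-4},
    {"gene": "GENE_C", "logFC": 1.20, "P.Value": 1e-8},   # fold change too small
    {"gene": "GENE_D", "logFC": -2.50, "P.Value": 0.03},  # P-value too large
]

LOGFC_CUTOFF = 1.5  # |log FC| > 1.5, as stated in the text
P_CUTOFF = 0.01     # P < 0.01, as stated in the text


def select_degs(table, logfc_cutoff=LOGFC_CUTOFF, p_cutoff=P_CUTOFF):
    """Keep rows passing both cutoffs and split them by direction of change."""
    up, down = [], []
    for row in table:
        if abs(row["logFC"]) > logfc_cutoff and row["P.Value"] < p_cutoff:
            (up if row["logFC"] > 0 else down).append(row["gene"])
    return up, down


up_genes, down_genes = select_degs(rows)
print("upregulated:", up_genes)
print("downregulated:", down_genes)
```

Splitting by the sign of `logFC` mirrors the upregulated and downregulated counts reported in the Results.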

Functional and pathway enrichment analysis

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the DEGs were performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID),Citation16 which is a comprehensive set of functional annotation tools. P<0.05 was set as the cutoff criterion.
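
The idea behind an enrichment P-value can be sketched with a plain hypergeometric test. This is not DAVID's own statistic (DAVID reports a modified Fisher exact P-value, the EASE score); the function name and all counts below are hypothetical and serve only to show how an overlap between a gene list and an annotated term is scored against a background.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail probability P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Invented example: 699 query genes, 12 of which fall in a 100-gene term,
# against an assumed background of 20,000 genes.
p_value = hypergeom_enrichment_p(k=12, n=699, K=100, N=20000)
print(f"enrichment P-value: {p_value:.3g}")
print("passes P<0.05 cutoff:", p_value < 0.05)
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for many terms a statistics library and a multiple-testing correction would be the more usual choice.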

PPI network analysis of DEGs and modules selection

The Search Tool for the Retrieval of Interacting Genes (STRING) databaseCitation17 was used to construct a PPI network for the DEGs. Then, the modules of the PPI network with significant gene pairs (combined score >0.8) were obtained with the Molecular Complex Detection (MCODE) pluginCitation18 in Cytoscape.Citation19 MCODE scores >14 and number of nodes >15 were used as the cutoff values. Moreover, functional and pathway enrichment analysis of the genes in each module was performed using DAVID.
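
The score filter and the node degree used to rank hub genes can be sketched as follows. The edge list is hypothetical, the function name is an assumption, and MCODE clustering itself is not reimplemented; only the thresholding and degree counting are shown.

```python
from collections import Counter

# Hypothetical STRING-style edges: (protein_a, protein_b, combined_score).
edges = [
    ("P1", "P2", 0.95),
    ("P1", "P3", 0.85),
    ("P1", "P4", 0.60),  # below the threshold, dropped
    ("P2", "P3", 0.90),
    ("P4", "P5", 0.99),
]


def node_degrees(edge_list, min_score=0.8):
    """Count, for each node, the edges whose combined score exceeds min_score."""
    degrees = Counter()
    for a, b, score in edge_list:
        if score > min_score:
            degrees[a] += 1
            degrees[b] += 1
    return degrees


degrees = node_degrees(edges)
for node, degree in degrees.most_common():
    print(node, degree)
```

In STRING's downloadable files the combined score is stored on a 0–1000 scale, so a threshold of 0.8 would correspond to 800 there.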

Survival analysis of the hub genes

Kaplan–Meier plotter is a web tool that assesses the prognostic value of genes in several cancers, including breast, ovarian, lung, and gastric cancer (http://kmplot.com/analysis/index.php?p=background). The nonsmoking female patients with NSCLC were divided into two groups according to the expression level of each gene (high vs low expression).Citation20 The overall survival (OS) of the two patient groups was then compared using the tool. The HR with 95% CIs and the log-rank P-value were calculated and displayed.
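
The grouping and survival estimate can be illustrated with a minimal product-limit (Kaplan–Meier) sketch. The patient records are invented, the median split is an assumption (the text does not state which cutoff was used), and the HR and log-rank test reported by the web tool are not computed here.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times: follow-up durations; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical patients: (expression, follow-up in months, event flag).
patients = [
    (9.1, 5, 1), (8.7, 8, 1), (8.9, 12, 0), (9.4, 20, 1),
    (4.2, 15, 0), (3.8, 30, 1), (4.9, 42, 0), (4.0, 60, 0),
]
cutoff = median(p[0] for p in patients)
high = [p for p in patients if p[0] > cutoff]
low = [p for p in patients if p[0] <= cutoff]

high_curve = kaplan_meier([p[1] for p in high], [p[2] for p in high])
low_curve = kaplan_meier([p[1] for p in low], [p[2] for p in low])
print("high expression:", high_curve)
print("low expression:", low_curve)
```

Censored patients leave the risk set without lowering the survival estimate, which is why the curve only steps down at observed deaths.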

Results

Identification of DEGs

The total number of samples included 60 NSCLC samples and 60 normal samples. A total of 699 genes were identified, including 161 upregulated and 538 downregulated genes.

GO and KEGG pathway enrichment analysis

We used DAVID for GO and KEGG pathway enrichment analysis of the DEGs. The top five GO terms of the DEGs are shown in Table 1. For biological process, the DEGs were significantly enriched in the regulation of cell proliferation, cell adhesion, biological adhesion, vasculature development, and blood vessel development. For cellular component, the DEGs were significantly enriched in the extracellular region part, extracellular region, extracellular space, cell surface, and proteinaceous extracellular matrix. The most enriched GO terms in molecular function were growth factor binding, carbohydrate binding, calcium ion binding, pattern binding, and polysaccharide binding. The most enriched KEGG pathway terms were ECM–receptor interaction, focal adhesion, cell adhesion molecules, neuroactive ligand-receptor interaction, and complement and coagulation cascades.

Table 1 Functional and pathway enrichment analysis of DEGs in nonsmoking females with NSCLC

Construction of PPI network and module identification

To predict the interactions of the identified DEGs at the protein level, a PPI network was constructed using STRING. In total, the PPI network contained 439 nodes and 1,056 edges (Figure 1). The subset of the PPI network for the DEGs with a combined score >0.8 was used to determine the hub genes. The top 15 genes were selected as hub genes: GNG11, UBE2C, CCNB2, NMU, ANXA1, FPR2, KIF20A, MELK, NUSAP1, DLGAP5, CENPF, BIRC5, TOP2A, TPX2, and HMMR (Table 2). Subsequently, two modules with MCODE scores >14 and number of nodes >15 were selected (Figure 2), and enrichment analysis showed that the genes in the modules were mainly associated with cell cycle, chemokine signaling pathway, cytokine-cytokine receptor interaction, and neuroactive ligand-receptor interaction (Table 3).

Table 2 The hub genes that had a degree >20 in PPI network

Table 3 Functional and pathway enrichment analysis of the DEGs in modules

Figure 1 PPI network of DEGs.

Abbreviations: DEGs, differentially expressed genes; PPI, protein–protein interaction.


Figure 2 The two modules identified in the PPI network of the DEGs.

Note: (A) Module 1 and (B) module 2. Blue represents upregulated DEGs; yellow represents downregulated DEGs.

Abbreviations: DEGs, differentially expressed genes; PPI, protein–protein interaction.


Survival analysis

Kaplan–Meier plotter was used to assess the prognostic value of the 15 identified hub genes. High expression of UBE2C, DLGAP5, TPX2, CCNB2, BIRC5, KIF20A, and TOP2A was associated with worse OS in NSCLC patients (P<0.05) (Figure 3). Additionally, low expression of ANXA1 and GNG11 was associated with poorer OS in NSCLC patients (P<0.05) (Figure 3).

Figure 3 Kaplan–Meier curves depicting OS in nonsmoking females with NSCLC with high and low expression of ANXA1 (A), GNG11 (B), BIRC5 (C), CCNB2 (D), DLGAP5 (E), KIF20A (F), TOP2A (G), TPX2 (H), and UBE2C (I).

Abbreviations: NSCLC, non-small cell lung carcinoma; OS, overall survival.


Discussion

In this study, a total of 699 DEGs were screened, including 161 upregulated genes and 538 downregulated genes. Moreover, we selected two significant modules with several key DEGs in the regulatory network of nonsmoking female patients with NSCLC. Survival analysis of these genes then showed that high expression of seven overexpressed genes (UBE2C, DLGAP5, TPX2, CCNB2, BIRC5, KIF20A, and TOP2A) and low expression of two downregulated genes (GNG11 and ANXA1) were significantly correlated with worse OS in nonsmoking female patients with NSCLC.

The data showed that the seven overexpressed genes belong to “module 1” of the subnetwork, which is enriched in the cell cycle pathway. Ubiquitin-conjugating enzyme E2C (UBE2C) is a key regulator of cell cycle progression. High expression of UBE2C is associated with aggressive progression and poor outcome of malignant glioma,Citation21 and UBE2C has been identified as a prognostic protein marker in bladder cancer.Citation22 Moreover, UBE2C, DLGAP5, and TPX2 are associated with the progression and prognosis of pancreatic carcinoma.Citation23 Additionally, both DLGAP5 and TPX2 are mitosis-associated genes that correlate with poor prognosis in NSCLC patients.Citation24 TPX2 overexpression is associated with poor survival in gastric cancer.Citation25 A study also found that TPX2 promotes glioma cell proliferation and invasion via activation of the AKT signaling pathway in glioblastoma multiforme.Citation26 Elevated DLGAP5 expression was negatively correlated with both OS and relapse-free survival in lung cancer.Citation27 Another study found that silencing of DLGAP5 by siRNA significantly inhibits the proliferation and invasion of hepatocellular carcinoma cells.Citation28 Cyclin B2 (CCNB2) is a member of the cyclin protein family. Using the same data (GEO:GSE19804), Qian et al further demonstrated that both mRNA and protein expression of CCNB2 were higher in NSCLC than in normal lung tissues and that CCNB2 overexpression is a poor prognostic biomarker in Chinese NSCLC patients.Citation29 BIRC5, which codes for survivin, participates in cellular survival functions, such as cell cycle progression.Citation30 Moreover, BIRC5 (survivin) is a pejorative prognostic marker in stage II/III breast cancer.Citation31 Similarly, survivin is upregulated in lung cancer tissues, and high expression of BIRC5 is associated with poor survival in lung cancer.Citation32,Citation33 Kinesin family member 20A (KIF20A), also known as RAB6KIFL, plays an important role in cytokinesis.Citation34 In addition, high expression of KIF20A has been reported to be associated with poor OS in cervical squamous cell carcinoma.Citation35 Similarly, positive expression of KIF20A indicates poor prognosis in glioma patients.Citation36 High expression of TOP2A, which is involved in DNA synthesis and transcription, is associated with recurrence and metastasis of prostate cancer.Citation37 In addition, high mRNA levels of TOP2A are independent predictors of poor outcome in renal cell carcinoma patients.Citation38 Moreover, another study found that TOP2A is associated with worse prognosis in NSCLC patients.Citation39 Taken together, these data suggest that UBE2C, DLGAP5, TPX2, CCNB2, BIRC5, KIF20A, and TOP2A are involved in the cell cycle pathway and play a significant role in cancer development, which supports our findings.

On the other hand, GNG11 and ANXA1, which are found in “module 2”, were associated with chemokine signaling pathway, cytokine-cytokine receptor interaction, and neuroactive ligand-receptor interaction. GNG11, a lipid-anchored protein, acts as a tumor suppressor in lung adenocarcinoma.Citation40 Annexin A1 (ANXA1) is a calcium- and phospholipid-binding protein. Some studies found that low ANXA1 expression was associated with better OS in gastric cancerCitation41 and esophageal squamous cell carcinoma.Citation42 However, low expression of ANXA1 was correlated with worse OS in breast cancer patients.Citation43 Downregulation of ANXA1 is correlated with radioresistance in nasopharyngeal carcinoma.Citation44 Moreover, decreased expression of ANXA1 was correlated with poor survival of pancreatic ductal adenocarcinoma (PDAC) patients, and ANXA1 knockdown inhibited cell proliferation, induced G1 phase cell cycle arrest, and increased PDAC cell migration and invasion capacity compared with controls.Citation45 Similarly, ANXA1 knockdown suppressed the proliferation, migration, and invasion of NSCLC cells.Citation46 Together, these findings lead us to speculate that GNG11 and ANXA1 may play a crucial role in NSCLC. ANXA1 could regulate gastric cancer cell invasion through the formyl peptide receptor (FPR)/extracellular signal-regulated kinase/integrin beta-1-binding protein pathway, and all three FPRs (FPR1 through FPR3) were involved in the regulation process.Citation41 FPR2 promotes invasion and metastasis of gastric cancer cells and predicts the prognosis of patients.Citation47 A study also found that FPR2 was overexpressed in epithelial ovarian cancer (EOC) tissues and that FPR2 overexpression indicated poor prognosis of EOC patients.Citation48 However, in the present study, we found that FPR2 was downregulated and that low expression indicated better prognosis of NSCLC patients. Similarly, we found that neuromedin U (NMU), a secreted neuropeptide, was upregulated and that high expression indicated better prognosis of NSCLC patients. In head and neck squamous cell carcinoma (HNSCC), however, the expression levels of NMU were higher in primary tumors with regional metastasis than in those without metastasis, and overexpression of NMU may be involved in the process of regional metastasis of HNSCC.Citation49 The roles of FPR2 and NMU therefore need to be further investigated.

In summary, the present study was intended to identify potential biomarkers and analyze their prognostic values in nonsmoking females with NSCLC by bioinformatics analysis. We found that hub genes of complex networks, such as UBE2C, DLGAP5, TPX2, CCNB2, BIRC5, KIF20A, TOP2A, GNG11, and ANXA1, may act as potential biomarkers for nonsmoking females with NSCLC. However, the current study relied on bioinformatics analysis alone, and the conclusions remain to be confirmed by corresponding experiments. Therefore, further investigation is required to verify our findings and determine the potential clinical value of these genes as biomarkers.

Disclosure

The authors report no conflicts of interest in this work.

References

  • Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. Global cancer statistics. CA Cancer J Clin. 2011;61(2):69–90. PMID: 21296855.
  • Wood ME, Kelly K, Mullineaux LG, Bunn PA. The inherited nature of lung cancer: a pilot study. Lung Cancer. 2000;30(2):135–144. PMID: 11086207.
  • Lu TP, Tsai MH, Lee JM. Identification of a novel biomarker, SEMA5A, for non-small cell lung carcinoma in nonsmoking women. Cancer Epidemiol Biomarkers Prev. 2010;19(10):2590–2597. PMID: 20802022.
  • Lam WK. Lung cancer in Asian women-the environment and genes. Respirology. 2005;10(4):408–417. PMID: 16135162.
  • Hosgood HD, Boffetta P, Greenland S. In-home coal and wood use and lung cancer risk: a pooled analysis of the International Lung Cancer Consortium. Environ Health Perspect. 2010;118(12):1743–1747. PMID: 20846923.
  • Wu PF, Lee CH, Wang MJ. Cancer aggregation and complex segregation analysis of families with female non-smoking lung cancer probands in Taiwan. Eur J Cancer. 2004;40(2):260–266. PMID: 14728941.
  • Wong DW, Leung EL, So KK; University of Hong Kong Lung Cancer Study Group. The EML4-ALK fusion gene is involved in various histologic types of lung cancers from nonsmokers with wild-type EGFR and KRAS. Cancer. 2009;115(8):1723–1733. PMID: 19170230.
  • Shigematsu H, Lin L, Takahashi T. Clinical and biological features associated with epidermal growth factor receptor gene mutations in lung cancers. J Natl Cancer Inst. 2005;97(5):339–346. PMID: 15741570.
  • Toyooka S, Tsuda T, Gazdar AF. The TP53 gene, tobacco exposure, and lung cancer. Hum Mutat. 2003;21(3):229–239. PMID: 12619108.
  • Yamamoto H, Shigematsu H, Nomura M. PIK3CA mutations and copy number gains in human lung cancers. Cancer Res. 2008;68(17):6913–6921. PMID: 18757405.
  • Fang E, Zhang X. Identification of breast cancer hub genes and analysis of prognostic values using integrated bioinformatics analysis. Cancer Biomarkers. 2018;21(2):373–381.
  • Huang Z, Duan H, Li H. Identification of Gene Expression Pattern Related to Breast Cancer Survival Using Integrated TCGA Datasets and Genomic Tools. Biomed Res Int. 2015;2015:1–10.
  • Li B-Q, You J, Chen L. Identification of Lung-Cancer-Related Genes with the Shortest Path Approach in a Protein-Protein Interaction Network. Biomed Res Int. 2013;2013:618.
  • Jin X, Liu X, Li X, Guan Y. Integrated Analysis of DNA Methylation and mRNA Expression Profiles Data to Identify Key Genes in Lung Adenocarcinoma. Biomed Res Int. 2016;2016:1–9.
  • Piao J, Sun J, Yang Y, Jin T, Chen L, Lin Z. Target gene screening and evaluation of prognostic values in non-small cell lung cancers by bioinformatics analysis. Gene. 2018;647:306–311. PMID: 29305979.
  • Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57. PMID: 19131956.
  • Szklarczyk D, Morris JH, Cook H. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 2017;45(D1):D362–D368. PMID: 27924014.
  • Bader GD, Hogue CW. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003;4:2. PMID: 12525261.
  • Shannon P, Markiel A, Ozier O. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–2504. PMID: 14597658.
  • Győrffy B, Surowiak P, Budczies J, Lánczky A. Online survival analysis software to assess the prognostic value of biomarkers using transcriptomic data in non-small-cell lung cancer. PLoS One. 2013;8(12):e82241. PMID: 24367507.
  • Ma R, Kang X, Zhang G, Fang F, Du Y, Lv H. High expression of UBE2C is associated with the aggressive progression and poor outcome of malignant glioma. Oncol Lett. 2016;11(3):2300–2304. PMID: 26998166.
  • Fristrup N, Birkenkamp-Demtröder K, Reinert T. Multicenter validation of cyclin D1, MCM7, TRIM29, and UBE2C as prognostic protein markers in non-muscle-invasive bladder cancer. Am J Pathol. 2013;182(2):339–349. PMID: 23201130.
  • Zhou Z, Cheng Y, Jiang Y. Ten hub genes associated with progression and prognosis of pancreatic carcinoma identified by co-expression analysis. Int J Biol Sci. 2018;14(2):124–136. PMID: 29483831.
  • Schneider MA, Christopoulos P, Muley T. AURKA, DLGAP5, TPX2, KIF11 and CKAP5: Five specific mitosis-associated genes correlate with poor prognosis for non-small cell lung cancer patients. Int J Oncol. 2017;50(2):365–372. PMID: 28101582.
  • Tomii C, Inokuchi M, Takagi Y. TPX2 expression is associated with poor survival in gastric cancer. World J Surg Oncol. 2017;15(1):14. PMID: 28069036.
  • Gu JJ, Zhang JH, Chen HJ, Wang SS. TPX2 promotes glioma cell proliferation and invasion via activation of the AKT signaling pathway. Oncol Lett. 2016;12(6):5015–5022. PMID: 28105208.
  • Shi YX, Yin JY, Shen Y, Zhang W, Zhou HH, Liu ZQ. Genome-scale analysis identifies NEK2, DLGAP5 and ECT2 as promising diagnostic and prognostic biomarkers in human lung cancer. Sci Rep. 2017;7(1):8072. PMID: 28808310.
  • Liao W, Liu W, Yuan Q. Silencing of DLGAP5 by siRNA significantly inhibits the proliferation and invasion of hepatocellular carcinoma cells. PLoS One. 2013;8(12):e80789. PMID: 24324629.
  • Qian X, Song X, He Y. CCNB2 overexpression is a poor prognostic biomarker in Chinese NSCLC patients. Biomed Pharmacother. 2015;74:222–227. PMID: 26349989.
  • Blanc-Brude OP, Mesri M, Wall NR, Plescia J, Dohi T, Altieri DC. Therapeutic targeting of the survivin pathway in cancer: initiation of mitochondrial apoptosis and suppression of tumor-associated angiogenesis. Clin Cancer Res. 2003;9(7):2683–2692. PMID: 12855648.
  • Hamy AS, Bieche I, Lehmann-Che J. BIRC5 (survivin): a pejorative prognostic marker in stage II/III breast cancer with no response to neoadjuvant chemotherapy. Breast Cancer Res Treat. 2016;159(3):499–511. PMID: 27592112.
  • Yu X, Zhang Y, Cavazos D. miR-195 targets cyclin D3 and survivin to modulate the tumorigenesis of non-small cell lung cancer. Cell Death Dis. 2018;9(2):193. PMID: 29416000.
  • Duan L, Hu X, Jin Y, Liu R, You Q. Survivin protein expression is involved in the progression of non-small cell lung cancer in Asians: a meta-analysis. BMC Cancer. 2016;16(1):276. PMID: 27090386.
  • Fontijn RD, Goud B, Echard A. The human kinesin-like protein RB6K is under tight cell cycle control and is essential for cytokinesis. Mol Cell Biol. 2001;21(8):2944–2955. PMID: 11283271.
  • Zhang W, He W, Shi Y. High Expression of KIF20A Is Associated with Poor Overall Survival and Tumor Progression in Early-Stage Cervical Squamous Cell Carcinoma. PLoS One. 2016;11(12):e167449.
  • Duan J, Huang W, Shi H. Positive expression of KIF20A indicates poor prognosis of glioma patients. Onco Targets Ther. 2016;9:6741–6749. PMID: 27843327.
  • Li X, Liu Y, Chen W. TOP2Ahigh is the phenotype of recurrence and metastasis whereas TOP2Aneg cells represent cancer stem cells in prostate cancer. Oncotarget. 2014;5(19):9498. PMID: 25237769.
  • Chen D, Maruschke M, Hakenberg O, Zimmermann W, Stief CG, Buchner A. TOP2A, HELLS, ATAD2, and TET3 Are Novel Prognostic Markers in Renal Cell Carcinoma. Urology. 2017;102:265.e1–265.e7.
  • Hou GX, Liu P, Yang J, Wen S. Mining expression and prognosis of topoisomerase isoforms in non-small-cell lung cancer by using Oncomine and Kaplan-Meier plotter. PLoS One. 2017;12(3):e174515.
  • Hsu YL, Hung JY, Lee YL. Identification of novel gene expression signature in lung adenocarcinoma by using next-generation sequencing data and bioinformatics analysis. Oncotarget. 2017;8(62):104831–104854. PMID: 29285217.
  • Cheng TY, Wu MS, Lin JT. Annexin A1 is associated with gastric cancer survival and promotes gastric cancer cell invasiveness through the formyl peptide receptor/extracellular signal-regulated kinase/integrin beta-1-binding protein 1 pathway. Cancer. 2012;118(23):5757–5767. PMID: 22736399.
  • Han G, Tian Y, Duan B, Sheng H, Gao H, Huang J. Association of nuclear annexin A1 with prognosis of patients with esophageal squamous cell carcinoma. Int J Clin Exp Pathol. 2014;7(2):751–759. PMID: 24551299.
  • Wang LP, Bi J, Yao C. Annexin A1 expression and its prognostic significance in human breast cancer. Neoplasma. 2010;57(3):253–259. PMID: 20353277.
  • Huang L, Liao L, Wan Y. Downregulation of Annexin A1 is correlated with radioresistance in nasopharyngeal carcinoma. Oncol Lett. 2016;12(6):5229–5234. PMID: 28101240.
  • Liu QH, Shi ML, Bai J, Zheng JN. Identification of ANXA1 as a lymphatic metastasis and poor prognostic factor in pancreatic ductal adenocarcinoma. Asian Pac J Cancer Prev. 2015;16(7):2719–2724. PMID: 25854353.
  • Fang Y, Guan X, Cai T. Knockdown of ANXA1 suppresses the biological behavior of human NSCLC cells in vitro. Mol Med Rep. 2016;13(5):3858–3866. PMID: 27035116.
  • Hou XL, Ji CD, Tang J. FPR2 promotes invasion and metastasis of gastric cancer cells and predicts the prognosis of patients. Sci Rep. 2017;7(1):3153. PMID: 28600569.
  • Xie X, Yang M, Ding Y, Yu L, Chen J. Formyl peptide receptor 2 expression predicts poor prognosis and promotes invasion and metastasis in epithelial ovarian cancer. Oncol Rep. 2017;38(6):3297–3308. PMID: 29039544.
  • Wang L, Chen C, Li F. Overexpression of neuromedin U is correlated with regional metastasis of head and neck squamous cell carcinoma. Mol Med Rep. 2016;14(2):1075–1082. PMID: 27279246.