Original Research

Identification of potential biomarkers and analysis of prognostic values in head and neck squamous cell carcinoma by bioinformatics analysis

Pages 2315-2321 | Published online: 26 Apr 2017

Abstract

The purpose of this study was to find disease-associated genes and potential mechanisms in head and neck squamous cell carcinoma (HNSCC) with deoxyribonucleic acid microarrays. The gene expression profiles of GSE6791 were downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) were identified with packages in the R language, and a protein–protein interaction (PPI) network of the DEGs was constructed with STRING (combined score >0.8). Subsequently, module analysis of the PPI network was performed with the Molecular Complex Detection plugin, and the functions and pathways of the hub genes in the subnetworks were studied. Finally, the association of the hub genes with overall survival (OS) was examined in the TCGA HNSCC cohort. A total of 811 DEGs were obtained, which were mainly enriched in terms related to extracellular matrix (ECM)–receptor interaction, ECM structural constituent, and ECM organization. The PPI network consisted of 401 nodes and 1,254 edges, and 15 hub genes with high degrees were identified in the network. High expression of 4 of the 15 genes (PSMA7, ITGA6, ITGB4, and APP) was associated with poor OS of patients with HNSCC. Two significant modules were detected from the PPI network, and the enriched functions and pathways included proteasome, ECM organization, and ECM–receptor interaction. In conclusion, we propose that PSMA7, ITGA6, ITGB4, and APP may be further explored as potential biomarkers to aid HNSCC diagnosis and treatment.

Introduction

Head and neck squamous cell carcinoma (HNSCC) is the sixth most common cancer worldwide, with ~650,000 new cases and nearly 350,000 patient deaths from HNSCC annually.Citation1 Prognosis remains poor, and the 5-year survival rates for HNSCC patients continue to be <50%. Local tumor recurrence, distant metastasis, and therapeutic resistance appear to be the major contributing factors for this low survival rate.Citation2

Previously identified biomarkers can help in predicting the prognosis of HNSCC. However, their clinical application is limited. Currently, there is no evidence-based recommendation for altering the treatment of patients with HNSCC by the expression of individual biomarkers.Citation3 Therefore, it is crucial to investigate the molecular mechanisms involved in proliferation, apoptosis, and invasion of HNSCC and discover more effective biomarkers of HNSCC to improve diagnosis and prevention of the disease.

Currently, genetic and genomics research is developing rapidly, which helps us to understand the potential mechanisms of some diseases.Citation4,Citation5 For example, microarray analysis, which can measure gene expression on a genome-wide scale simultaneously, is widely used in cancer genetics research.Citation6

In the present study, a bioinformatics approach was used to analyze gene expression profiles in HNSCC and to identify differentially expressed genes (DEGs) between HNSCC and normal control samples, followed by functional analysis of the DEGs. Subsequently, a protein–protein interaction (PPI) network was constructed for the DEGs, and we investigated whether expression of the hub genes of the subnetworks was associated with reduced overall survival (OS) in the TCGA database. By analyzing their biological functions, pathways, and association with OS, we aimed to shed light on the underlying mechanisms of HNSCC development and to identify potential candidate biomarkers for diagnosis and prognosis, as well as potential drug targets.

Materials and methods

Microarray data

Microarray expression profiles of GSE6791Citation7 were downloaded from the Gene Expression Omnibus database to identify DEGs in HNSCC. GSE6791, which is based on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array, Santa Clara, CA, USA), consisted of 42 HNSCC samples and 14 normal epithelial samples.

Data preprocessing and identification of DEGs

The raw array data were subjected to background correction and quantile normalization. Then, the DEGs between HNSCC samples and normal controls were identified using the empirical Bayes approach in the linear models for microarray data (limma) package.Citation8 |log FC| >1 and P<0.05 were selected as the cutoff criteria.
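The cutoff step can be illustrated with a minimal Python sketch. It is not the limma workflow used in the study (which runs in R); it only shows how the two thresholds partition a results table. The `GeneResult` type, the gene names, and all numeric values are hypothetical, and the fold change is assumed to be on a log2 scale.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str
    log_fc: float   # assumed log2 fold change, tumor vs. normal
    p_value: float


def select_degs(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return (upregulated, downregulated) gene lists for |logFC| > cutoff and P < cutoff."""
    up = [r.gene for r in results
          if r.p_value < p_cutoff and r.log_fc > lfc_cutoff]
    down = [r.gene for r in results
            if r.p_value < p_cutoff and r.log_fc < -lfc_cutoff]
    return up, down


# Made-up values for illustration only, not taken from GSE6791.
toy = [
    GeneResult("GENE_A", 2.3, 0.001),    # passes both cutoffs, upregulated
    GeneResult("GENE_B", -1.7, 0.020),   # passes both cutoffs, downregulated
    GeneResult("GENE_C", 0.4, 0.0005),   # significant, but fold change too small
    GeneResult("GENE_D", 1.9, 0.300),    # large fold change, but not significant
]
up, down = select_degs(toy)
print("up:", up, "down:", down)
```

A gene must pass both thresholds to count as a DEG, which is why GENE_C and GENE_D are excluded in the toy data.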

Functional and pathway enrichment analysis of DEGs

The Database for Annotation, Visualization, and Integrated Discovery (DAVID),Citation9 which is a comprehensive set of functional annotation tools, has been used for systematic and integrative analysis of large gene lists. In this work, the significant gene ontology (GO) biological process terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the identified DEGs were performed using DAVID database with the thresholds of P<0.05.
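As a rough illustration of what an enrichment P-value measures, the sketch below computes a one-sided hypergeometric tail probability in Python. This is an assumption-laden simplification: DAVID reports a modified Fisher exact statistic (the EASE score), so the numbers would not match DAVID output exactly. The function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn without replacement
    from n_background genes, n_term of which carry the annotation."""
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 20 background genes, 5 annotated to a term,
# 4 DEGs of which 3 carry the annotation.
p = hypergeom_enrichment_p(n_background=20, n_term=5, n_degs=4, n_overlap=3)
print(f"enrichment P = {p:.4f}")
```

A term would be reported as enriched under the P<0.05 threshold when this tail probability falls below 0.05.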

Modules from the PPI network

To evaluate the interactive relationships among DEGs, the DEGs were mapped to the Search Tool for the Retrieval of Interacting Genes (STRING) database.Citation10 Interactions with a combined score >0.8 were then selected to construct the PPI network, which was visualized using Cytoscape.Citation11 The Molecular Complex Detection (MCODE) pluginCitation12 in Cytoscape was used to screen for modules in the PPI network, with the following cutoff values: MCODE score >15 and number of nodes >15. Moreover, function and pathway enrichment analysis of the DEGs in each module was performed using DAVID.
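The network construction and the degree-based ranking of hub genes can be sketched as follows. This is a minimal Python illustration, not the STRING or Cytoscape implementation; the protein names and scores are hypothetical, and scores are assumed to be on a 0 to 1 scale (STRING download files may use a 0 to 1,000 scale instead).

```python
from collections import defaultdict


def build_ppi(edges, min_score=0.8):
    """Return adjacency sets for interactions whose combined score exceeds min_score."""
    adj = defaultdict(set)
    for a, b, score in edges:
        if score > min_score and a != b:   # drop weak edges and self-loops
            adj[a].add(b)
            adj[b].add(a)
    return adj


def top_hubs(adj, n=15):
    """Rank nodes by degree (number of distinct partners), highest first."""
    ranked = sorted(((g, len(nb)) for g, nb in adj.items()),
                    key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical interactions: (protein, protein, combined score).
toy_edges = [
    ("P1", "P2", 0.95),
    ("P1", "P3", 0.90),
    ("P1", "P4", 0.85),
    ("P2", "P3", 0.81),
    ("P3", "P4", 0.80),   # not strictly above 0.8, so excluded
    ("P4", "P5", 0.60),   # below the cutoff, so excluded
]
ppi = build_ppi(toy_edges)
n_edges = sum(len(nb) for nb in ppi.values()) // 2
hubs = top_hubs(ppi, n=2)
print(len(ppi), "nodes,", n_edges, "edges; top hubs:", hubs)
```

Nodes that have no interaction above the cutoff (P5 in the toy data) do not appear in the network at all, which is one reason a PPI network can contain fewer nodes than the DEG list it was built from.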

Survival analysis of the hub gene

OS analysis was performed using HNSCC samples from the TCGA dataset and mRNA Z-score data files were downloaded from the cBioPortal.Citation13 Patients were classified into high or low expression based on whether Z-score expression was > median (high) or < median (low). Based on these categories, log-rank analysis and Kaplan–Meier plots were produced using Prism Software (GraphPad Software, Inc., La Jolla, CA, USA).
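The median split and the log-rank comparison can be sketched in plain Python. This is only an illustration under stated assumptions: the study used Prism for these analyses, the patient identifiers and follow-up values below are invented, and the function names are hypothetical. Patients whose Z-score equals the median are left unclassified, following the strict ">" and "<" rule described above.

```python
from math import erfc, sqrt
from statistics import median


def median_split(z_scores):
    """Map patient id -> 'high' or 'low'; values equal to the median are dropped."""
    cut = median(z_scores.values())
    return {pid: ("high" if z > cut else "low")
            for pid, z in z_scores.items() if z != cut}


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, P) with 1 degree of freedom.
    events are 1 for death and 0 for censoring."""
    data = ([(t, e, 0) for t, e in zip(times_a, events_a)]
            + [(t, e, 1) for t, e in zip(times_b, events_b)])
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e, _ in data if e}):
        n = sum(1 for ti, _, _ in data if ti >= t)                  # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)     # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)            # deaths at time t
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    return chi2, erfc(sqrt(chi2 / 2))   # upper tail of chi-square with 1 df


# Invented Z-scores and follow-up data for illustration only.
groups = median_split({"pt1": -1.2, "pt2": 0.3, "pt3": 0.9, "pt4": -0.1})
chi2, p = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
print(groups)
print(f"chi-square = {chi2:.3f}, P = {p:.4f}")
```

The sketch covers only the group assignment and the test statistic; hazard ratios with confidence intervals would require an additional model or estimator that is not shown here.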

Results

Identification of DEGs

The total number of samples analyzed was 42 HNSCC samples, along with 14 normal epithelial samples. After data preprocessing, DEG analysis was performed using the limma software package. A total of 811 genes were identified after the analyses of GSE6791, including 550 upregulated and 261 downregulated genes.

GO and KEGG pathway enrichment analyses

We uploaded all 811 DEGs to the online software DAVID to identify overrepresented GO categories and KEGG pathways. GO analysis results showed that the most overrepresented GO terms in biological processes were extracellular matrix (ECM) organization, antigen processing and presentation of exogenous peptide antigen via major histocompatibility class I, transporter associated with antigen processing-dependent, and collagen catabolic process. In addition, the most enriched GO terms in molecular function and cellular component were threonine-type endopeptidase activity and extracellular exosome, respectively. The most enriched KEGG pathway terms were as follows: ECM–receptor interaction, amebiasis, proteasome, focal adhesion, and small cell lung cancer (Table 1).

Table 1 Functional and pathway enrichment analysis of upregulated and downregulated DEGs in HNSCC

Coexpression network analysis of DEGs

To interpret the biological meaning of the identified DEGs, we used the STRING database to construct a coexpression network of the DEGs with a combined score >0.8; the resulting network of significant interactions was composed of 401 nodes and 1,254 edges (Figure 1). From the coexpression network of the selected DEGs, the top 15 hub genes were determined according to the number of interacting edges: CDK1, PTK2, ITGAV, APP, COL1A1, MMP9, AURKA, BMP2, ITGB4, CDC20, SDC4, COL1A2, ITGA6, PSMA7, and STAT1 (Table 2). The distinct modules of the 401 DEGs and their interacting genes were further identified with MCODE in Cytoscape. Among the modules, two subnetworks with >15 nodes were selected (Figure 2), and enrichment analysis showed that the genes in the subnetworks were mainly associated with proteasome, ECM–receptor interaction, protein digestion and absorption, and focal adhesion (Table 3).

Figure 1 PPI network of differentially expressed genes.

Notes: Blue represents downregulated DEGs; red represents upregulated DEGs.
Abbreviations: PPI, protein–protein interaction; DEGs, differentially expressed genes.
Figure 2 Functional modules in the PPI network.

Notes: From PPI networks of DEGs with combined score >0.8, we clustered two functional modules, using MCODE: module 1 (A) and module 2 (B). Blue represents downregulated DEGs; red represents upregulated DEGs.
Abbreviations: PPI, protein–protein interaction; DEGs, differentially expressed genes; MCODE, Molecular Complex Detection.
Table 2 The hub genes that had a degree >22 in PPI network

Table 3 Functional and pathway enrichment analysis of the DEGs in modules

Hub genes were validated as an independent predictor for OS in the TCGA cohort

We subsequently sought to assess the prognostic significance of the expression of the 15 hub genes in HNSCC. The relation between expression of each hub gene and OS was therefore examined in the TCGA HNSCC cohort (461 patients), with patients divided into low and high expression groups according to the median expression. Of the 15 genes, high expression of only four was associated with poor OS: PSMA7 (HR: 1.60 [1.20–2.10], P=0.0009), ITGA6 (HR: 1.32 [1.00–1.75], P=0.0472), ITGB4 (HR: 1.38 [1.05–1.83], P=0.0113), and APP (HR: 1.40 [1.04–1.87], P=0.0113; Figure 3).

Figure 3 Kaplan–Meier curves depicting OS in the TCGA HNSCC cohort with high and low expression of PSMA7 (A), ITGA6 (B), ITGB4 (C) and APP (D), respectively. Abbreviations: OS, overall survival; HNSCC, head and neck squamous cell carcinoma; HR, hazard ratio; CI, confidence interval.

Conclusion

Despite advances in surgery, chemotherapy, and medical therapy, the overall mortality of HNSCC has remained virtually unchanged over the past decades. The lethality of HNSCC is mainly due to difficulties in detecting it at an early stage and the lack of effective treatments for patients in advanced stages. Bioinformatics plays a major role in the analysis and interpretation of genomic and proteomic data.Citation14 For example, some researchers focus on bioinformatics, nanogenomics, and nanoproteomics aspects of contemporary nanodentistry and summarize some proteomics and proteogenomics approaches for oral diseases.Citation15,Citation16 Therefore, in the present study, we used comprehensive bioinformatics methods to explore the potential molecular mechanisms of HNSCC, with the aim of improving survival and prevention.

In this study, a total of 811 DEGs were screened, consisting of 550 upregulated genes and 261 downregulated genes. Moreover, we selected two significant modules with several key DEGs (like PSMA7, ITGA6, and ITGB4) in HNSCC regulatory network, and functional enrichment analyses showed that these key DEGs were mainly enriched in ECM–receptor interaction, which is closely related to cancer. Finally, survival analysis of these hub genes revealed that four overexpressed genes were significantly correlated with poor OS of patients in the TCGA HNSCC cohort, and these included PSMA7, ITGA6, ITGB4, and APP.

The data showed that PSMA7 is involved in “module 1” of the gene coexpression network, which is enriched in the proteasome pathway. Many studies have suggested that the proteasome promotes the degradation of oxidatively damaged proteins that play a role in the cell cycle and transcription, which are essential for cancer development. Previously, it was reported that PSMA7 inhibits the proliferation, tumorigenicity, and invasion of human lung adenocarcinoma cells.Citation17 In contrast, another study showed that high expression of PSMA7 is associated with liver metastasis in colorectal cancer.Citation18 In addition, Hu et al found that depletion of PSMA7 inhibited cell growth, invasion, and migration in RKO cells and strongly suppressed the tumorigenic ability of RKO cells in vivo.Citation19 Taken together, we speculate that the overexpression of PSMA7 may contribute to HNSCC progression and correlate with a poor prognosis.

ITGA6 and ITGB4, which are found in “module 2” of the PPI network, were associated with the ECM–receptor interaction pathway and belong to the integrin family, which participates in cell adhesion as well as cell surface-mediated signaling. Interactions between cells and the ECM can lead to the direct or indirect control of cellular processes such as adhesion, migration, differentiation, proliferation, and apoptosis.Citation20 As previously reported, silencing of ITGA6 significantly inhibited cell migration and invasion in head and neck cancer cells and hepatocellular carcinoma cells.Citation21,Citation22 Similarly, high ITGA6 expression was shown to enhance invasion in models of metastatic breast cancer.Citation23 Moreover, Kwon et alCitation24 found that ITGA6 is a possible target for antibody-related diagnostic and therapeutic modalities in esophageal squamous cell carcinoma. Meanwhile, ITGB4 regulates migration and invasion in models of metastatic prostate cancer.Citation25 Masugi et alCitation26 found that knockdown of ITGB4 reduced migration and invasion and that upregulation of ITGB4 promoted cell scattering and motility in pancreatic ductal adenocarcinoma cells. In addition, our study shows that ITGB4 was associated with poor prognosis in HNSCC; similar results have also been shown in pancreatic ductal adenocarcinoma patients.Citation27 Together, we speculate that ITGA6 and ITGB4 in the ECM–receptor interaction signaling pathway may play a significant role in HNSCC.

Amyloid-β precursor protein (APP) is a highly conserved single transmembrane protein with a receptor-like structure that has been shown to be involved in Alzheimer disease,Citation28 but its function in normal physiology is unclear. Interestingly, APP is increased in many different cancers, such as colon cancer, pancreatic cancer, and thyroid cancer.Citation29–Citation31 Lim et alCitation32 found that APP is overexpressed both in malignant breast cancer cell lines and in human breast cancer tissues, and that APP could regulate cell growth, apoptosis, and motility of breast cancer, possibly via engagement of AKT-mediated signaling pathways. Similarly, APP could promote cell growth in pancreatic cancer cells.Citation31 In addition, Ko et alCitation33 found a significant increase of APP in oral squamous cell carcinoma (OSCC) tissue and also that OSCC patients with high mRNA levels of APP had poor prognoses. The abovementioned studies show that APP may be involved in the pathogenesis of malignant tumors by affecting cell growth or apoptosis, thereby supporting our findings.

In summary, the current study was intended to identify DEGs with comprehensive bioinformatics analysis to find the potential biomarkers and predict progression of diseases. We found that hub genes of complex networks, such as PSMA7, ITGA6, ITGB4, and APP, may be exploited as a prognostic tool for HNSCC. Finally, our results suggested that proteasome and ECM–receptor interaction may be important in the development of HNSCC. However, further experimental studies are still required to prove our findings and determine the potential clinical value of these as biomarkers.

Acknowledgments

The project was supported by the Guangdong Natural Science Foundation of China (2015A030313309).

Disclosure

The authors report no conflicts of interest in this work.

References

  • Parkin DM, Bray F, Ferlay J, Pisani P. Global cancer statistics, 2002. CA Cancer J Clin. 2005;55(2):74–108. PMID: 15761078
  • Prince ME, Sivanandan R, Kaczorowski A. Identification of a subpopulation of cells with cancer stem cell properties in head and neck squamous cell carcinoma. Proc Natl Acad Sci U S A. 2007;104(3):973–978. PMID: 17210912
  • Ang KK, Sturgis EM. Human Papillomavirus as a marker of the natural history and response to therapy of head and neck squamous cell carcinoma. Semin Radiat Oncol. 2012;22(2):128–242. PMID: 22385920
  • Orlando B, Bragazzi N, Nicolini C. Bioinformatics and systems biology analysis of genes network involved in OLP (Oral Lichen Planus) pathogenesis. Arch Oral Biol. 2013;58(6):664–673. PMID: 23347958
  • Lakhani SR, Ashworth A. Microarray and histopathological analysis of tumours: the future and the past? Nat Rev Cancer. 2001;1(2):151–157. PMID: 11905806
  • Rays M, Chen Y, Su YA. Use of a cDNA microarray to analyse gene expression patterns in human cancer. Nat Genet. 1996;14(4):457–460. PMID: 8944026
  • Pyeon D, Newton MA, Lambert PF. Fundamental differences in cell cycle deregulation in human papillomavirus-positive and human papillomavirus-negative head/neck and cervical cancers. Cancer Res. 2007;67(10):4605–4619. PMID: 17510386
  • Smyth GK. Limma: linear models for microarray data. In: Gentleman R, Carey VJ, Huber W, Irizarry RA, Dudoit S, editors. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Seattle, WA: Springer; 2005:397–420.
  • Huang DW, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57. PMID: 19131956
  • Szklarczyk D, Franceschini A, Kuhn M. The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 2011;39(Suppl 1):D561–D568. PMID: 21045058
  • Shannon P, Markiel A, Ozier O. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–2504. PMID: 14597658
  • Bader GD, Hogue CW. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003;4(1):2. PMID: 12525261
  • Gao J, Aksoy BA, Dogrusoz U. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal. 2013;6(269):pl1. PMID: 23550210
  • Wright JT, Hart TC. The genome projects: implications for dental practice and education. J Dent Educ. 2002;66(5):659–671. PMID: 12056771
  • Bragazzi NL, Pechkova E, Nicolini C. Proteomics and proteogenomics approaches for oral diseases. Adv Protein Chem Struct Biol. 2014;95:125–162. PMID: 24985771
  • Nicolini C, Bragazzi N. Nanogenomics and nanoproteomics for personalized nanotheranostics for oral and colorectal cancer. Per Med. 2015;13(1):9–11.
  • Tan JY, Huang X, Luo YL. PSMA7 inhibits the tumorigenicity of A549 human lung adenocarcinoma cells. Mol Cell Biochem. 2012;366(1–2):131–137. PMID: 22584585
  • Hu XT, Chen W, Wang D. The proteasome subunit PSMA7 located on the 20q13 amplicon is overexpressed and associated with liver metastasis in colorectal cancer. Oncol Rep. 2008;19(2):441–446. PMID: 18202793
  • Hu XT, Chen W, Zhang FB. Depletion of the proteasome subunit PSMA7 inhibits colorectal cancer cell tumorigenicity and migration. Oncol Rep. 2009;22(5):1247–1252. PMID: 19787246
  • Hansen NU, Genovese F, Leeming DJ, Karsdal MA. The importance of extracellular matrix for cell function and in vivo likeness. Exp Mol Pathol. 2015;98(2):286–294. PMID: 25595916
  • Kinoshita T, Nohata N, Hanazawa T. Tumour-suppressive microRNA-29s inhibit cancer cell migration and invasion by targeting laminin-integrin signalling in head and neck squamous cell carcinoma. Br J Cancer. 2013;109(10):2636–2645. PMID: 24091622
  • Lv G, Lv T, Qiao S. RNA interference targeting human integrin α6 suppresses the metastasis potential of hepatocellular carcinoma cells. Eur J Med Res. 2013;18:52. PMID: 24304619
  • Brooks DLP, Schwab LP, Krutilina R. ITGA6 is directly regulated by hypoxia-inducible factors and enriches for cancer stem cell activity and invasion in metastatic breast cancer models. Mol Cancer. 2016;15:26. PMID: 27001172
  • Kwon J, Lee TS, Lee HW. Integrin alpha 6: a novel therapeutic target in esophageal squamous cell carcinoma. Int J Oncol. 2013;43(5):1523–1530. PMID: 24042193
  • Banyard J, Chung I, Migliozzi M. Identification of genes regulating migration and invasion using a new model of metastatic prostate cancer. BMC Cancer. 2014;14:387. PMID: 24885350
  • Masugi Y, Yamazaki K, Emoto K. Upregulation of integrin β4 promotes epithelial-mesenchymal transition and is a novel prognostic marker in pancreatic ductal adenocarcinoma. Lab Invest. 2015;95(3):308–319. PMID: 25599535
  • Damhofer H, Medema JP, Veenstra VL. Assessment of the stromal contribution to Sonic Hedgehog-dependent pancreatic adenocarcinoma. Mol Oncol. 2013;7(6):1031–1042. PMID: 23998958
  • O’Brien RJ, Wong PC. Amyloid precursor protein processing and Alzheimer’s disease. Annu Rev Neurosci. 2011;34:185–204. PMID: 21456963
  • Meng JY, Kataoka H, Itoh H, Koono M. Amyloid β protein precursor is involved in the growth of human colon carcinoma cell in vitro and in vivo. Int J Cancer. 2001;92(1):31–39. PMID: 11279603
  • Krause K, Karger S, Sheu SY. Evidence for a role of the amyloid precursor protein in thyroid carcinogenesis. J Endocrinol. 2008;198(2):291–299. PMID: 18480379
  • Hansel DE, Rahman A, Wehner S, Herzog V, Yeo CJ, Maitra A. Increased expression and processing of the Alzheimer amyloid precursor protein in pancreatic cancer may influence cellular proliferation. Cancer Res. 2003;63(21):7032–7037. PMID: 14612490
  • Lim S, Yoo BK, Kim HS. Amyloid-beta precursor protein promotes cell proliferation and motility of advanced breast cancer. BMC Cancer. 2014;14:928. PMID: 25491510
  • Ko SY, Lin SC, Chang KW. Increased expression of amyloid precursor protein in oral squamous cell carcinoma. Int J Cancer. 2004;111(5):727–732. PMID: 15252842