Abstract
Lung cancer (LC) is the leading cause of cancer-related deaths worldwide, and smoking has been identified as the main contributing cause of its development. This study aimed to identify the key genes in small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), the two major types of LC. A meta-analysis was performed on two datasets, GSE74706 and GSE149507, obtained from the Gene Expression Omnibus (GEO). Both datasets comprised samples from cancerous and adjacent non-cancerous tissues. Differentially expressed genes (DEGs) were identified first. To understand the underlying molecular mechanisms of the identified genes, pathway enrichment, gene ontology (GO) and protein–protein interaction (PPI) analyses were performed. A total of nine hub genes were identified and subjected to mutation analysis in LC patients using cBioPortal. These genes (i.e. AURKA, AURKB, KIF23, RACGAP1, KIF2C, KIF20A, CENPE, TPX2 and PRC1) were overexpressed in LC patients and can be explored as potential candidates for prognostic biomarkers. TPX2 showed the highest mutation frequency among these genes. This was followed by high-throughput screening and docking analysis to identify potential drug candidates acting through competitive inhibition of the AURKA-TPX2 complex. Four compounds, CHEMBL431482, CHEMBL2263042, CHEMBL2385714 and CHEMBL1206617, were identified. The results indicate that the nine selected genes can be explored as biomarkers for disease prognosis and targeted therapy, and that the four identified compounds can be further analyzed as promising therapeutic candidates.
Communicated by Ramaswamy H. Sarma
Acknowledgments
The authors would like to thank Jamia Millia Islamia for providing infrastructure, journal access, and internet facilities. Dr. Ravins Dohare would like to acknowledge the Science and Engineering Research Board (SERB), Department of Science and Technology (DST), Government of India, for providing him with financial assistance (Grant Number: EEQ/2016/000509). Prithvi Singh would like to thank the Indian Council of Medical Research (ICMR) for awarding him a Senior Research Fellowship (Grant Number: BMI/11(89)/2020).
Disclosure statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Funding
The author(s) reported there is no funding associated with the work featured in this article.
Authors’ contributions
Aiman Mushtaq: Methodology, Software, Formal analysis, Data curation, Writing—Original Draft, Writing—Review & Editing, Visualization. Prithvi Singh: Conceptualization, Methodology, Software, Formal analysis, Data curation, Writing—Original Draft, Writing—Review & Editing, Visualization. Mohd Mohsin: Writing—Original Draft, Writing—Review & Editing. Gulnaz Tabassum: Writing—Review & Editing. Taj Mohammad: Software, Data curation. Md Imtaiyaz Hassan: Writing—Review & Editing. Mansoor Ali Syed: Writing—Review & Editing. Ravins Dohare: Methodology, Software, Formal analysis, Data curation, Writing—Original Draft, Writing—Review & Editing, Visualization, Supervision, Project administration.
Data availability statement
The datasets used in our study were downloaded from the National Center for Biotechnology Information–Gene Expression Omnibus under accession numbers GSE74706 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE74706) and GSE149507 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE149507).