1,391
Views
1
CrossRef citations to date
0
Altmetric
Research Paper

A DNA methylation profile of long non-coding RNAs can predict OS in prostate cancer

, ORCID Icon, , , &
Pages 3252-3262 | Received 31 Mar 2021, Accepted 16 Jun 2021, Published online: 08 Jul 2021

ABSTRACT

Prostate cancer (PCa) is the most common male reproductive tract malignant tumor, accurate evaluation of PCa characterization and prognostic prediction at diagnosis are vital for the effective administration of the disease, especially at the molecular level. In this study, 48 CpG sites with differential methylation associated with overall survival (OS) were screened out between PCa and normal adjacent tissues. 16 CpG sites were selected by the least absolute shrinkage and selection operator (LASSO) and the risk score formula for methylated-based classifier was established. For 16-lncRNAs-CpG-classifier, the area under the curve (AUC) were 0.890, 0.917, and 0.932 at 3 years, 5 years and 7 years, respectively. Kaplan–Meier curves indicated that patients with high-risk scores had worse OS than those with low-risk scores. Prognostic methylation model of lncRNAs was identified from the whole genome in patients with PCa. This novel finding provides a novel insight for screening biomarkers of a prognosis for PCa.

Graphical abstract

Introduction

Prostate cancer (PCa) is the most frequent male reproductive tract malignant tumor, which is the fifth cause of male cancer death in the world [Citation1]. Along with aging population structure and changes in diet, coupled with male hormone use undeserved, the incidence of PCa is rising globally. Although discovering and treating of PCa before symptoms happen may not promote their health or make them live longer due to usually growing very slowly, accurate evaluation of PCa characterization and prognostic prediction at diagnosis is momentous for the useful management of the cancer, especially at the molecular level [Citation2].

Over the past decade, the rapid progress of genomic techniques and their application for deciphering the genome of cancer have provided new diagnostic value and prognostic assessment for PCa patients. DNA methylation is formed by the covalent addition of methyl groups and DNA bases, typically the CpG dinucleotide cytosine, which lead to reversible alterations in gene expression of key tumor suppressor genes without permanent changes in DNA sequence in tumorigenesis [Citation3]. Furthermore, DNA methylation as a biomarker for cancers describes stable changes, which could improve individual PCa risk assessment and prognostic prediction [Citation4].

Long non-coding RNAs (lncRNAs) are crucial to regulate the gene expression and are associated with many biological processes, including tumorigenesis in mammals. Likewise, lncRNAs are frequently dysregulated in cancers, including PCa [Citation5]. Furthermore, lncRNAs with aberrant methylation patterns significantly affect the occurrence and development of tumors. For example, it has been revealed that deregulation of the DNMT1-related lncRNAs conduced to aberrant gene expression and DNA methylation levels [Citation6]. However, little is known about the DNA methylation pattern of lncRNAs in PCa [Citation7].

In this study, we explored the methylation sites of lncRNAs between PCa and para-carcinoma tissues to screen out differential methylation sites. Based on correlation analysis and LASSO regression analysis, a methylated classifier of lncRNAs was developed for assessing the prognosis of PCa. Furthermore, we also predicted the prognosis value of methylation-based classifiers, which could guide individualized clinical treatment for PCa patients.

Methods

Patient datasets

The DNA methylation data, RNA-seq data and clinical information of PCa patients were acquired from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/) [Citation8]. DNA methylation profile was performed by Infinium HumanMethylation450 BeadChip. The RNA-seq was performed on IlluminaHiSeq RNA-seq platform [Citation9]. The annotation file of lncRNA was acquired from GENCODE (https://www.gencodegenes.org/) [Citation10]. All datasets were publicly available and this research met the publication guidelines.

CpG sites of lncRNAs

The methylation of CpG sites, which were reported as the beta-value, ranged from 0 (no methylation) to 1 (complete methylation). The methylation beta-values were normalized in R software using ‘minfi’ package [Citation11]. Based on the annotation of TCGA, CpG sites in 2kb upstream of lncRNAs transcriptional start site (TSS) were filtered from 485,577 sites in HumanMethylation450 BeadChip. Differential methylation sites of lncRNA between PCa and para-carcinoma tissues were screened using ‘minfi’ package, and the q-value was less than 0.05 to be considered statistically significant. The T-test was applied to calculate the differential beta-values of CpG sites between PCa and para-carcinoma tissues, and the differential beta-values were more than 0.1 to be considered meaningful. Then, correlation analysis was performed to screen CpG sites, where the methylated levels were negatively associated with lncRNA expression levels, and the p-value < 0.05 was considered statistically significant.

Methylation-based classifier for overall survival (OS)

Univariate Cox regression analysis was performed to calculate the correlation between methylation level of CpG sites filtered in the above step and patient’s OS, and the p-value < 0.05 was considered in this study. Next, a Least Absolute Shrinkage and Selection Operator (LASSO) regression model was performed to screen the most useful prediction CpG sites, and a methylation-based classifier was constructed for predicting OS based on the sum of CpG site methylation levels weighted by the coefficients [Citation12]. LASSO regression is a variable selection and shrinkage method for the regression models to identify the variables and relevant regression coefficients, which are established to minimize the prediction error.

Predictive and prognostic analysis of methylation-based classifier

Based on the methylation-based classifier, the risk scores of patients were calculated, and the predictive effect of the classifier for OS was assessed by time-dependent receiver operating characteristic (ROC) analysis based on the risk score in R software using ‘timeROC’ package [Citation13]. The area under the curve (AUC) of ROC indicated the predictive or prognostic accuracy. Based on the median of the classifier risk score, patients were separated into high-risk group and low-risk group, and the Kaplan–Meier method was performed to estimate survival for patients, the p-value was less than 0.05 to be considered statistically significant.

Co-expression gene of lncRNAs and functional enrichment analysis

The co-expression genes of lncRNAs were identified to use MEM (Multi-Experiment Matrix). MEM is a similarity search of gene expression across numerous datasets [Citation14]. The main feature of MEM is that it collects large microarray datasets. It utilizes rank aggregation to integrate information from different datasets into a single global sorting and simultaneously estimates statistical significance. The function enrichment analysis of co-expression genes was performed through the DAVID Bioinformatics Resources (https://david.ncifcrf.gov/) [Citation15]. The FDR of enrichment terms was less than 0.05 to be considered statistically significant, and the enrichment results were visualized by the ‘ggplot2’ package in R software.

Statistical analysis

Continuous or categorical variables were compared between the two groups by the t-test (normal distribution) or χ2 test, respectively. P < 0.05 or Q < 0.05 was considered statistically significant. Differential β-value > 0.1 between two groups was considered statistically significant. FDR < 0.05 was considered significant.

Results

In this study, we aimed to develop a prognosis methylation model of lncRNAs to predict the prognosis of PCa patients. We investigated the differential methylation sites of lncRNAs between PCa and para-carcinoma tissues. Based on the correlation analysis and LASSO regression analysis, the methylation prognosis model of lncRNAs was built to predict the prognosis of PCa. In addition, the prognosis value of the methylation-based classifier was assessed, which indicated that this prognosis model could better predict the outcome of PCa patients.

Characteristics of patient datasets

There were 500 patients of PCa in TCGA, and 498 patients had both the DNA methylation data and survival data. The clinical characteristics of the 498 patients are shown in the . The methylation data were from 548 samples, including 498 PCa samples, and 50 normal adjacent tissues. The RNA-seq data were 547 samples, and 530 samples had both RNA-seq data and methylation data.

Table 1. Clinical characteristics of prostate cancer patients

Differential methylation of CpG sites and methylation-based classifier

Methylation beta-values were normalized by ‘minfi’ package in R software (). Eleven thousand two hundred and fifty-nine CpG sites situated within 2kb upstream of lncRNA TSS (excluding CpG sites on the X and Y chromosomes) were screened out based on the annotation of HumanMethylation450 BeadChip by TCGA. There were 6470 CpG sites with differential methylation levels between PCa and para-carcinoma tissues, which were screened out using ‘minfi’ package, and 1556 CpG sites had different beta-values, which were more than 0.1. Among the 1556 CpG sites, the 484 CpG sites methylation levels were found to be negatively associated with the lncRNAs expression levels. Based on univariate Cox regression analysis, 48 CpG sites were significantly associated with OS. In order to develop methylation-based classifier for predicting OS, the LASSO regression was performed using the methylated data of 48 CpG sites. Ultimately, 16 CpG sites were screened out using LASSO regression analysis (, b). The risk score formula for the methylation-based classifier was established as follows: −0.074 × beta_cg00496102 – 0.497 × beta_cg02893550 + 0.086 × beta_cg03482458 + 0.057 × beta_cg06313119 + 0.068 × beta_cg06457534 + 1.287 × beta_cg06942685 + 0.309 × beta_cg09671962 – 0.175 × beta_cg14034476 + 0.138 × beta_cg14245102 + 0.553 × beta_cg15736169 – 0.484 × beta_cg19930288 + 0.021 × beta_cg21741562 – 0.661 × beta_cg22408108 + 0.122 × beta_cg23643814 – 0.022 × beta_cg23679434 – 1.103 × eta_cg24514600. shows the features of the 16 CpG sites, which were selected by LASSO. Compared with para-carcinoma, the methylated levels of eight CpG sites were up-regulated and eight CpG sites were down-regulated in PCa (). Unsupervised hierarchical clustering analysis showed that methylated data of the 16 CpG sites could clearly discriminate between PCa and para-carcinoma samples ().

Table 2. Characteristics of CpG sites selected by LASSO

Figure 1. (a) The raw data of beta-values in methylated CpG site.

  1. The normalized beta-values of methylation of CpG site.

  2. One thousand most variable CpG sites were screened using ‘minfi’

package in R software

Figure 1. (a) The raw data of beta-values in methylated CpG site. The normalized beta-values of methylation of CpG site.One thousand most variable CpG sites were screened using ‘minfi’package in R software

Figure 2. A CpG sites were selected in LASSO analysis.B LASSO coefficient profiles of the CpG sites.C Hierarchical clustering by differential levels in methylated CpG sites

Figure 2. A CpG sites were selected in LASSO analysis.B LASSO coefficient profiles of the CpG sites.C Hierarchical clustering by differential levels in methylated CpG sites

Figure 3. The methylated levels of 8 CpG sites were up-regulated and 8 CpG sites were down-regulated in PCa compared with para-carcinoma tissues

Figure 3. The methylated levels of 8 CpG sites were up-regulated and 8 CpG sites were down-regulated in PCa compared with para-carcinoma tissues

Predictive and prognosis value of the methylation-based classifier

We calculated the risk score of each patient using a methylation-based classifier and used time-dependent ROC to evaluate the precision of the methylation-based classifier. Time-dependent ROC curve analysis showed that the classifier had a good predictive accuracy at different follow-up periods (AUC at 3 years = 0.890; AUC at 5 years = 0.917; AUC at 7 years = 0.932; ). All patients were separated into high-risk or low-risk groups based on the median cutoff point of the risk scores. Kaplan-Meier's analysis indicated that patients in the high-risk group had worse OS than those in the low-risk group (). Kaplan–Meier analysis of a single CpG site in the classifier are also shown in .

Figure 4. (a)Time-dependent ROC analysis was carried out to estimate the predictive effect for OS at varied follow-up periods.(b) SKaplan–Meier analysis was used to estimate the OS in patients. The patients were divided into high-risk group or low-risk group based on the median cutoff point of risk score

Figure 4. (a)Time-dependent ROC analysis was carried out to estimate the predictive effect for OS at varied follow-up periods.(b) SKaplan–Meier analysis was used to estimate the OS in patients. The patients were divided into high-risk group or low-risk group based on the median cutoff point of risk score

Figure 5. Kaplan-Meier analyses with high-risk or low-risk score of methylation beta-values of single CpG site in the classifier

Figure 5. Kaplan-Meier analyses with high-risk or low-risk score of methylation beta-values of single CpG site in the classifier

Identification lncRNA co-expression gene and functional evaluation

In this study, 16 lncRNAs were related with CpG sites in the methylation-based classifier (). The MEM analysis was performed to identify a total of 2241 co-expressed genes of lncRNAs. The DAVID was applied to perform functional analysis of the co-expression genes. The results indicated that Gene Oncology (GO) terms were enriched conspicuously, which were related to G-protein coupled receptor signaling (biological process), cell junction (cellular component), and calcium ion binding (molecular function). KEGG analysis showed that these co-expression genes were enriched in the calcium signaling pathway and cAMP signaling pathway ().

Figure 6. Functional enrichment analysis of lncRNA co-expression genes

  1. Gene Oncology (GO) enrichment

  2. KEGG pathway enrichment.

Figure 6. Functional enrichment analysis of lncRNA co-expression genes Gene Oncology (GO) enrichmentKEGG pathway enrichment.

Discussion

In this study, a prognostic methylation model of lncRNAs was identified from the whole genome in patients with PCa. Combined methylation analyses and LASSO Cox regression has found that the classifier based on aberrantly methylated lncRNAs conferred an OS advantage in PCa.

Although little protein potentially coded, it was indicated that some lncRNAs played crucial functions in multitudinous tumor physiological processes including regulation of cell-cyle, proliferation, differentiation, tumorigenesis as well as RNA processing and modification, nuclear remodeling, and cytoplasmic or nuclear trafficking [Citation16]. Methylation was involved in epigenetic silencing of cancer-relevant genes including individual non-coding RNAs [Citation17]. In this study, the CpG-rich regions located in lncRNAs were displayed in PCa patients with various survival rates, which mediated the loss-of-function of genes about 10 folds more common than by point mutations. Methylated aberrations of lncRNAs can specifically influence various biological activities or pathways. The resulting functional annotation showed that the methylated aberration of lncRNAs can perturb tumor-related signal pathways, containing the regulation of transcription, cell differentiation, protein kinase activation, nucleotide binding, cellular target components, RAS/RAF/MEK/MAPK, and PI3K/Akt /mTOR signaling pathways [Citation18]. PI3Ks were rich in intracellular membranes and can induce phosphatidylinositol 3,4,5-trisphosphate production [Citation19]. AKT serine/threonine kinases can be phosphorylated by phospholipids in a PI3K-dependent manner to activate numerous substrates, thereby promoting cellular growth, division cycle, and progression.

The well-documented function of methylated lncRNAs suggested that methylated lncRNAs can influence the DNA-specific binding domain and regulate various signal transductions including G-protein coupled receptor signaling, cAMP, cell junction, and Calcium [Citation20]. Activation of G-protein-coupled signaling associated with upregulated downstream signaling components has been shown in many tumors [Citation21]. The dysregulation of cAMP signaling has proven to be implicated in the pathophysiological disorders of cancer and can be utilized as a prospective preventive strategy for antitumor intervention [Citation22]. Ca2+-second-messenger plays key roles in the activation or repression of many signal transductions. The dysregulation of Ca 2+ homeostasis leads to tumorigenesis, such as gene transcription, angiogenesis, proliferation, and cellular apoptosis, and several tumors are closely associated with Ca (2+) signals [Citation23].

The prognosis resulting from patients with advanced PCa was very grave, the median duration was shorter in patients with clinically detectable metastases. Once there is metastasis, there are limited therapeutic interventions for PCa. Thus, early diagnosis of PCa is important to reduce mortality in patients, which requires reliable prognosticators [Citation24]. In our study, we applied Cox-regression techniques with LASSO to prognostically classify all patients, and based upon their own quantified methylation levels in 16 sites of lncRNAs, respectively. The PCa was subsequently divided into groups of higher and lower risk by median risk scores, indicating that a worse prognosis appeared in the high-risk subgroup rather than in the low-risk subgroup. The (ROC) analysis indicated that our prognostic model had optimal performance in predicting the overall 3,5 and 7-year survival of PCa.

Previous data have shown that various lncRNAs are differentially expressed in the progressed PCa. Aberrantly methylated DNA of cancer-relevant genes has been found to be correlated with the suppression in multiple cancer cellular types [Citation25]. Therefore, based upon the combined methylation panel of lncRNAs, we integrate aberrant methylation of lncRNAs into prognostic assessments using the LASSO-Cox regression tool with a high level of prognosis accuracy. The biological function of lncRNAs used for our classifier has been shown to be associated with other cancers in multiple previous studies. The site cg02893550 in the gene symbol of the lncRNA-CpG gene is FGF14-AS2, and the correlation between FGF14-AS2 and miR-205-5p was validated in breast cancer cells. Overexpression of FGF14-AS2 impaired the miR-205-5p induced phenotypic characteristics on proliferation, invasion, migration, and apoptosis in breast cancer cells [Citation26]. It is identified that LINC01122 (cg09671962) is closely associated with OS coupled with a high risk of gastric cancer [Citation27]. It was proved that MEG3 (cg14245102) can affect Wnt-β-catenin pathway and control epithelial-mesenchymal transition [Citation28]. As defined for lncRNA, the upregulation of PVT1 (cg24514600) is proved to promote cancer cell proliferation, invasion, metastasis, and drug resistance [Citation29].

The limitations in the current research can be further studied in the future. First, the results of prognosis based upon methylated panel analysis were obtained from the PCa database of TCGA, so the predictive panel accuracy should be tested in an alternative clinical cohort. Secondly, the sample size was limited in our study. Considering the incidence of PCa, the sample sizes need to be expanded in varied regions. Thirdly, the tissues of PCa patients were used in our research, while the blood samples are much easily obtained in the clinic. Thus, further prospective studies including groups in varied regions with consistently applied protocols would be needed to validate and confirm these biomarker candidates in the near future.

Conclusion

In summary, the differential methylation sites of lncRNAs between PCa and para-carcinoma tissues were screened out. Among these aberrant methylation sites of lncRNAs, a classifier of CpG sites in lncRNAs was established to predict the OS. Furthermore, the methylation-based classifier had admirable prediction performance, which could offer a novel insight for screening prognosis biomarkers of PCa.

Highlights

  1. Differential methylation sites of lncRNAs between PCa and para-carcinoma tissues were screened out.

  2. Sixteen CpG sites were selected by LASSO Regression and a risk score formula for methylated-based classifier was established.

  3. The time-dependent ROC was used to evaluate the precision of the classifier for predicting the OS.

Availability of data and materials

The DNA methylation data, RNA-seq data and clinical information of PCa patients were obtained from TCGA dataset (https://portal.gdc.cancer.gov/). The DNA methylation profile was performed by Infinium HumanMethylation450 BeadChip, The RNA-seq data was performed on the IlluminaHiSeq RNA-seq platform. The annotation files of lncRNAs were obtained from GENCODE (https://www.gencodegenes.org/). All datasets were publicly available and the study met the publication guidelines.

Disclosure statement

The authors have declared that no competing interest exists.

Additional information

Funding

This project was supported by the Natural Science Foundation of China (81974312 and 81501823), Zhejiang Provincial Natural Science Foundation of China (Y18H160217 and LQ19H160008), The medical scientific research of Zhejiang Province(2017KY459), Wenzhou Municipal Science and Technology Bureau (Y20190203)

References

  • Sung H, Ferlay J, Siegel RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–249.
  • Xu J, Lan Y, Yu F, et al. Transcriptome analysis reveals a long non-coding RNA signature to improve biochemical recurrence prediction in prostate cancer. Oncotarget. 2018;9(38):24936–24949.
  • Pistore C, Giannoni E, Colangelo T, et al. DNA methylation variations are required for epithelial-to-mesenchymal transition induced by cancer-associated fibroblasts in prostate cancer cells. Oncogene. 2017;36(40):5551–5566.
  • Luo H, Zhao Q, Wei W, et al. Circulating tumor DNA methylation profiles enable early diagnosis, prognosis prediction, and screening for colorectal cancer. Sci Transl Med. 2020;12(524).
  • Lemos AEG, Matos ADR, Ferreira LB, et al. The long non-coding RNA PCA3: an update of its functions and clinical applications as a biomarker in prostate cancer. Oncotarget. 2019;10(61):6589–6603.
  • Merry CR, Forrest ME, Sabers JN, et al. DNMT1-associated long non-coding RNAs regulate global gene expression and DNA methylation in colon cancer. Hum Mol Genet. 2015;24(21):6240–6253.
  • Pang X, Zhao Y, Wang J, et al. Competing endogenous RNA and coexpression network analysis for identification of potential biomarkers and therapeutics in association with metastasis risk and progression of prostate cancer. Oxid Med Cell Longev. 2019;2019:8265958.
  • Tomczak K, Czerwinska P, Wiznerowicz M. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol (Pozn). 2015;19(1A):A68–77.
  • Morris TJ, Beck S. Analysis pipelines and packages for infinium humanmethylation450 beadchip (450k) data. Methods. 2015;72:3–8.
  • Harrow J, Frankish A, Gonzalez JM, et al. GENCODE: the reference human genome annotation for the ENCODE project. Genome Res. 2012;22(9):1760–1774.
  • Aryee MJ, Jaffe AE, Corrada-Bravo H, et al. Minfi: a flexible and comprehensive bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014;30(10):1363–1369.
  • Yu P, Tong L, Song Y, et al. Systematic profiling of invasion-related gene signature predicts prognostic features of lung adenocarcinoma. J Cell Mol Med. 2021. DOI:10.1111/jcmm.16619
  • Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics. 2000;56(2):337–344.
  • Zhang J, Shi K, Huang W, et al. The DNA methylation profile of non-coding RNAs improves prognosis prediction for pancreatic adenocarcinoma. Cancer Cell Int. 2019;19(1):107.
  • Huang DW, Sherman BT, Tan Q, et al. DAVID Bioinformatics Resources: expanded annotation database and novel algorithms to better extract biology from large gene lists. Nucleic Acids Res. 2007;35(suppl_2):W169–75.
  • Hombach S, Kretz M. Non-coding RNAs: classification, biology and functioning. Adv Exp Med Biol. 2016;937:3–17.
  • O’Leary VB, Hain S, Maugg D, et al. Long non-coding RNA PARTICLE bridges histone and DNA methylation. Sci Rep. 2017;7(1):1790.
  • Peng WX, Koirala P, Mo YY. LncRNA-mediated regulation of cell signaling in cancer. Oncogene. 2017;36(41):5661–5667.
  • Martini M, De Santis MC, Braccini L, et al. PI3K/AKT signaling pathway and cancer: an updated review. Ann Med. 2014;46(6):372–383.
  • Zhi H, Li Y, Wang L. Profiling DNA methylation patterns of non-coding RNAs (ncRNAs) in human disease. Adv Exp Med Biol. 2018;1094:49–64.
  • Komolov KE, Benovic JL. G protein-coupled receptor kinases: past, present and future. Cell Signal. 2018;41:17–24.
  • Ravnskjaer K, Madiraju A, Montminy M. Role of the cAMP pathway in glucose and lipid metabolism. Handb Exp Pharmacol. 2016;233:29–49.
  • Lee JM, Davis FM, Roberts-Thomson SJ, et al. Ion channels and transporters in cancer. 4. Remodeling of Ca 2+ signaling in tumorigenesis: role of Ca 2+ transport. Am J Physiol Cell Physiol. 2011;301(5):C969–76.
  • Tradonsky A, Rubin T, Beck R, et al. A search for reliable molecular markers of prognosis in prostate cancer: a study of 240 cases. Am J Clin Pathol. 2012;137(6):918–930.
  • Cai C, Wang W, Tu Z. Aberrantly DNA methylated-differentially expressed genes and pathways in hepatocellular carcinoma. J Cancer. 2019;10(2):355–366.
  • Yang Y, Xun N, Wu JG. Long non-coding RNA FGF14-AS2 represses proliferation, migration, invasion, and induces apoptosis in breast cancer by sponging miR-205-5p. Eur Rev Med Pharmacol Sci. 2019;23(16):6971–6982.
  • Jiang Y, Zhang X, Rong L, et al. Integrative analysis of the gastric cancer long non-coding RNA-associated competing endogenous RNA network. Oncol Lett. 2021;21(6):456.
  • Ghafouri-Fard S, Taheri M. Maternally expressed gene 3 (MEG3): a tumor suppressor long non coding RNA. Biomed Pharmacother. 2019;118:109129.
  • Zhou DD, Liu XF, Lu CW, et al. Long non-coding RNA PVT1: emerging biomarker in digestive system cancer. In: Cell Prolif. 2017;.50(6).