Research Paper

Epigenome-wide scan identifies differentially methylated regions for lung cancer using pre-diagnostic peripheral blood

Pages 460-472 | Received 01 Dec 2020, Accepted 22 Apr 2021, Published online: 19 May 2021

ABSTRACT

Background: DNA methylation markers have been associated with lung cancer risk and may identify aetiologically relevant genomic regions, or alternatively, be markers of disease risk factors or biological processes associated with disease development.

Methods: In a nested case–control study, we measured blood leukocyte DNA methylation levels in pre-diagnostic samples collected from 430 participants (208 cases; 222 controls) in the 1989 CLUE II cohort. We compared DNA methylation levels by case/control status to identify novel genomic regions, both single CpG sites and differentially methylated regions (DMRs), while controlling for known DNA methylation changes associated with smoking using a previously described pack-years-based smoking methylation score. Stratified analyses were conducted by time from blood draw to diagnosis, histology, and smoking status.

Results: We identified 16 single CpG sites and 40 DMRs significantly associated with lung cancer risk (q < 0.05). The identified genomic regions were associated with genes including H19, HOXA3/HOXA4, RUNX3, BRICD5, PLXNB2, and RP13. For the single CpG sites, the strongest association was noted for cg09736286 in the DIABLO gene (OR [for 1 SD] = 2.99, 95% CI: 1.95–4.59, P-value = 4.81 × 10⁻⁷). We found that CpG sites in the HOXA3/HOXA4 region were hypermethylated in cases compared to controls.

Conclusion: The single CpG sites and DMRs that we identified represented significant measurable differences in lung cancer risk, providing potential biomarkers for lung cancer risk stratification. Future studies will need to examine whether these regions are causally related to lung cancer.

Introduction

Despite substantial reductions in lung cancer incidence and death rates over the past three decades [Citation1], lung cancer continues to be the leading cause of cancer death in the US and is projected to account for 135,720 deaths in 2020, approximately 22% of all cancer deaths [Citation1]. To reduce the lung cancer burden in the US, cancer prevention and early detection remain a top priority [Citation2]. However, while the conventional lung cancer screening method of using low-dose CT (LDCT) scans is effective in reducing lung cancer mortality [Citation3], it leads to high false positive rates (>95% of pulmonary nodules detected are benign) [Citation4], over-diagnosis [Citation5,Citation6], and radiation exposure [Citation7], and it has had poor uptake [Citation8]. Identifying DNA methylation markers associated with lung cancer risk many years prior to diagnosis may help uncover biological pathways that are important in the development of lung cancer. Separately, identifying DNA methylation markers that are strong predictors of lung cancer, irrespective of their role in aetiology, may provide new opportunities for risk stratification and screening prioritization.

Genome-wide DNA methylation profiling has identified numerous new genomic regions associated with disease risk and may provide new opportunities for risk stratification. For instance, differences in blood leukocyte DNA methylation levels (proportion of CpGs methylated) have been associated with smoking [Citation9,Citation10], elevated subclinical inflammation [Citation11,Citation12], obesity [Citation13,Citation14], type II diabetes [Citation15] and heart disease [Citation16,Citation17]. With respect to lung cancer, many of the alterations in blood leukocyte DNA methylation levels identified to date have been directly linked to smoking behaviours [Citation9,Citation10,Citation18–20]. A recent meta-analysis that included over 15,000 individuals identified 2,623 differentially methylated CpG sites that are related to cigarette smoking [Citation21]. In addition, lung cancer case–control EWAS have suggested that some of the identified smoking-related CpG sites, particularly two sites within the AHRR and F2RL3 loci, may mediate the effects of smoking on lung cancer [Citation10,Citation20]. However, these two CpG sites were no longer associated with lung cancer risk when an EWAS adopted a more stringent control for smoking [Citation22]. A Mendelian randomization study also reported little evidence to support a causal link between these two CpG sites and lung cancer [Citation23], suggesting that the observed association between DNA methylation changes and lung cancer in EWAS could be a reflection of residual confounding after adjusting for self-reported smoking status [Citation24,Citation25].

In addition to the need to control for residual confounding, identifying DNA methylation alterations that are not caused by active smoking may also reveal important biological pathways. To date, very few studies have investigated methylation changes associated with lung cancer risk that are not associated with smoking [Citation19,Citation26]. Among smokers, variations in methylation may indicate a different genetic susceptibility to lung cancer, since individuals often differ in their ability to detoxify carcinogenic compounds or to repair induced DNA damage, for example. Alternatively, variation in methylation could be a result of other environmental risk factors, such as occupational exposures, changes in the immune response, radon, or second-hand smoke. Using blood samples collected from men and women without a cancer diagnosis in 1989 in the CLUE II cohort [Citation27,Citation28], we compared DNA methylation levels (using the EPIC array) in individuals who later developed lung cancer with those who did not develop lung cancer in the same time frame. Specifically, we aimed to identify both single CpG sites, and differentially methylated regions, measured in pre-diagnostic blood samples of lung cancer cases and matched controls, that represent measurable differences in lung cancer risk independent of smoking exposures.

Materials and methods

Study population – CLUE I/II cohort

Subjects for this study were selected from among participants of the CLUE II cohort who had also participated in the CLUE I cohort study (flowchart in the Supplemental material and additional details in Supplemental Methods). Both cohorts were based in Washington County, MD, and were initially established to identify serological precursors to cancer and other chronic diseases [Citation27–29]. CLUE II was conducted from May through October 1989, during which 32,894 individuals (25,076 were Washington County residents) provided a blood sample [Citation30]. Among all participants, 98.3% were white, reflecting the population of this county at the time, and 59% were female. Participants provided health information at baseline, including the potential confounders attained education, cigarette smoking status, number of cigarettes smoked daily, cigar/pipe smoking status, and self-reported weight and height, from which body mass index (BMI) was calculated.

Figure 1. Manhattan plot for significant DNA-methylation (CpG) probes. CpGs reaching statistical significance are annotated (located above the dotted red horizontal line [p < 1 × 10⁻⁶]).

Lung cancer ascertainment

All incident lung cancer cases (ICD9 162 and ICD10 C34) were ascertained from linkage to the Washington County cancer registry (before 1992 to the present) and the Maryland Cancer Registry (since 1992 when it began to the present). We selected all 241 first primary incident lung cancer cases who had participated in CLUE I and were diagnosed after the day of blood draw in CLUE II through January 2018. Using incidence density sampling, we selected one control per case matching on age, sex, cigarette smoking status, number of cigarettes smoked daily, cigar/pipe smoking status, and date of CLUE II blood draw. Given the incidence density sampling approach, the number of case/control observations exceeded the number of samples, because some controls developed cancer at a later time point and some controls were sampled more than once.
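The logic of incidence density (risk-set) sampling can be illustrated with a minimal Python sketch. This is not the authors' selection procedure: the function name `sample_risk_set_controls`, the toy cohort, and the omission of all matching factors (age, sex, smoking variables, blood draw date) are assumptions made for illustration. It only shows why a participant who is diagnosed later can still serve as a control for an earlier case, and why the same control may be drawn more than once.

```python
import random

def sample_risk_set_controls(cohort, rng):
    """Draw one control per case from the participants still at risk.

    cohort: dict mapping participant id -> diagnosis year (None if never
            diagnosed during follow-up). Hypothetical toy structure.
    rng:    a random.Random instance, passed in for reproducibility.
    """
    pairs = []
    # Process cases in order of diagnosis time
    cases = sorted((year, pid) for pid, year in cohort.items() if year is not None)
    for year, case_id in cases:
        # Risk set: everyone else who is cancer-free at the case's diagnosis time.
        # Future cases are eligible, and controls are sampled with replacement
        # across risk sets, so one person can be selected repeatedly.
        risk_set = [pid for pid, dx in cohort.items()
                    if pid != case_id and (dx is None or dx > year)]
        if risk_set:
            pairs.append((case_id, rng.choice(risk_set)))
    return pairs

# Toy cohort (hypothetical): A and B become cases, C and D never do
toy_cohort = {"A": 1995, "B": 2005, "C": None, "D": None}
pairs = sample_risk_set_controls(toy_cohort, random.Random(0))
```

In this toy cohort, participant B is eligible as a control for case A because B is still cancer-free in 1995, even though B is diagnosed in 2005.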

DNA methylation measurements

DNA extracted from buffy coat was bisulphite-treated using the EZ DNA Methylation Kit (Zymo) and DNA methylation was measured at specific CpG sites across the genome using the 850 K Illumina Infinium MethylationEPIC BeadArray (Illumina, Inc., CA, USA) at the University of Minnesota Genomic Center (details in Supplemental Methods).

Statistical analysis

In the current nested case–control study, we aimed to identify novel genomic regions, both single CpG sites and differentially methylated regions where differences in DNA methylation levels are not explained by smoking exposures, by controlling for the known DNA methylation changes associated with smoking in the statistical analysis. To examine the association between CpG-specific DNA methylation and lung cancer risk, we conducted epigenome-wide association analysis using unconditional multivariable logistic regression to estimate odds ratios (OR) of lung cancer per 1 SD increase in methylation level at single CpG sites. To maximize power, we used unconditional logistic regression to include cases and controls without a matched pair and included participants every time they were sampled. In addition, we confirmed that the analysis results were not qualitatively different when using a conditional regression model (Supplemental Table 8 presents a side-by-side comparison). All models were adjusted for age at blood draw, sex, four surrogate variables for batch effects [Citation31,Citation32], smoking status (never, former, current), pack-years based smoking methylation score (details in Supplemental Methods), BMI, and leukocyte cell composition [Citation33,Citation34] (given the potential for confounding by cell composition) [Citation35]. Four surrogate variables were used, as they explained the largest percentage of data variation as determined by the ‘ctrlsva’ function in the ENmix package. All p-values were adjusted for multiple comparisons by calculating the false discovery rate (FDR) using the Benjamini–Hochberg (B&H) method [Citation36]. Statistically significant CpGs were required to meet the multiple testing adjustment criteria of FDR (q-value) <0.05. Analyses of single CpGs with lung cancer were also stratified by smoking status and time from blood-draw to diagnosis, and separately by non-small cell (NSCLC) and small cell (SCLC) histology. All controls were included in these three types of stratification analyses. All statistical analyses were performed in R (version 3.5.0).
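For readers unfamiliar with the FDR adjustment described above, the following is a minimal Python sketch of the Benjamini–Hochberg step-up procedure. It is an illustration only and not the authors' R code; the function name `benjamini_hochberg` and the toy p-values are hypothetical and do not come from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the minimum of
    # p * m / rank over all ranks at or above the current one (monotonicity)
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        qvalues[idx] = running_min
    return qvalues

# Toy p-values (hypothetical, not from the study)
toy_p = [0.001, 0.008, 0.039, 0.041, 0.60]
toy_q = benjamini_hochberg(toy_p)
# A CpG would be called significant at the q < 0.05 threshold used here
significant = [q < 0.05 for q in toy_q]
```

In an epigenome-wide analysis, the list of p-values would contain one entry per CpG tested, so the adjustment accounts for the full number of comparisons.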

We used the DMRcate Bioconductor R package [Citation37] to identify differentially methylated regions (DMRs) associated with lung cancer risk. Adjusting for the same covariates as in the single CpG analyses, DMRs were calculated using a parameter setting of lambda = 1,000 and kernel adjustment C = 2 (default setting) [Citation37]. Statistically significant DMRs were required to have a minimum of two statistically significant single CpGs and to meet the multiple testing adjustment criteria of FDR (q-value) <0.1 (calculated using the B&H method) [Citation36]. Associations were also examined by time from blood-draw to diagnosis and lung cancer histology. Two of the most statistically significant regions were further evaluated for patterns by time to diagnosis.
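To make the two parameters concrete, the Python sketch below shows how lambda and C are commonly described in the DMRcate documentation: the Gaussian kernel bandwidth is sigma = lambda / C, and significant CpGs separated by gaps of lambda or more nucleotides fall into separate regions. This is a deliberately simplified illustration, not the DMRcate implementation: it omits the kernel smoothing and the modelling of test statistics, and the function name `group_into_regions` and the toy positions are hypothetical.

```python
def group_into_regions(positions, lam=1000, min_cpgs=2):
    """Group significant CpG positions on one chromosome into candidate regions.

    Consecutive CpGs less than `lam` bp apart join the same region; regions
    with fewer than `min_cpgs` CpGs are discarded.
    Returns a list of (start, end, n_cpgs) tuples.
    """
    regions = []
    current = []
    for pos in sorted(positions):
        # A gap of lam or more closes the current region
        if current and pos - current[-1] >= lam:
            if len(current) >= min_cpgs:
                regions.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    # Close the final region
    if len(current) >= min_cpgs:
        regions.append((current[0], current[-1], len(current)))
    return regions

lam, C = 1000, 2
sigma = lam / C  # Gaussian kernel bandwidth in bp under these settings

# Toy positions of significant CpGs (hypothetical, not from the study)
toy_positions = [10_000, 10_300, 10_900, 15_000, 20_000, 20_400]
toy_regions = group_into_regions(toy_positions, lam=lam)
```

The isolated CpG at position 15,000 is dropped because a region requires at least two significant CpGs, mirroring the minimum stated above.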

Results

Population characteristics

Table 1 presents the characteristics of the 208 lung cancer cases and 222 controls that we included in this study. Over 99% of the participants were white. The median time to lung cancer diagnosis was 14 years. The median age at blood draw in 1989 was 59 and 57 years in cases and controls, respectively. Overall, 55% of cases and controls were women and 11% were never smokers.

Table 1. Pre-diagnostic characteristics of lung cancer cases and matched controls nested in the CLUE II cohort study

Single CpG EWAS analysis

The EWAS analysis identified 16 differentially methylated CpGs that were statistically significant after multiple comparison correction (q < 0.05). Results are presented in Table 2 (statistically significant CpGs were sorted by q-value) and Figure 1. The 16 CpGs were located in genomic regions that have previously been associated with lung cancer or other malignancies, including H19, RUNX3, BAIAP2L2, ADAM11, GPR132, CUEDC1, SSBP4, AMPD2, and RTN4R. (The top 1000 CpGs based on q-value are presented in the Supplemental material.)

Table 2. Association between single CpGs and lung cancer, CLUE II: statistically significant CpGs after multiple comparison correction (q < 0.05)

CpGs previously reported to be associated with lung cancer risk, including cg05575921 in the AHRR gene [Citation20], cg03636183 in the F2RL3 gene [Citation20], cg23387569 in the AGAP2 gene [Citation10], cg10151248 in the PC gene [Citation22], and cg13482620 in the B3GNTL1 gene [Citation22], were not statistically significant in our analyses in which we adjusted for a pack-years methylation score. Adjusting for smoking status (never, former, current) but not for the pack-years methylation score, two CpGs in the AHRR and F2RL3 genes had similar-sized associations with lung cancer risk as previously reported (AHRR: OR [for 1 SD] = 0.43, 95% CI: 0.31–0.60, P-value = 6.76 × 10⁻⁷ vs. previously reported OR [for 1 SD] = 0.39, 95% CI: 0.24–0.61, P-value = 2.55 × 10⁻⁵ for cg05575921; F2RL3: OR [for 1 SD] = 0.53, 95% CI: 0.40–0.70, P-value = 7.91 × 10⁻⁶ vs. previously reported OR [for 1 SD] = 0.51, 95% CI: 0.35–0.73, P-value = 4.19 × 10⁻⁴ for cg03636183) [Citation20]. In addition, in the EWAS analysis conducted without adjusting for the pack-years methylation score, only one CpG (cg14391737) was statistically significantly associated with lung cancer risk after adjusting for multiple comparisons. This CpG has been related to smoking in multiple studies [Citation21] (top 1000 CpGs presented in the Supplemental material).
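The relationship between a reported odds ratio, its 95% confidence interval and its P-value can be checked with a short Python sketch. It assumes a Wald-type interval that is symmetric on the log-odds scale; the text does not state how the intervals were computed, so this is an assumption, and the function name `wald_from_or_ci` is hypothetical. Because the published values are rounded, the recovered P-value is only expected to match in order of magnitude.

```python
import math

def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-OR, its standard error, the z statistic and a
    two-sided normal P-value from an odds ratio and its 95% CI.

    Assumes the CI was built as exp(beta +/- z_crit * se).
    """
    beta = math.log(odds_ratio)
    # The CI width on the log scale equals 2 * z_crit * se
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided P-value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p

# AHRR cg05575921 values as reported above (OR per 1 SD, 95% CI)
beta, se, z, p = wald_from_or_ci(0.43, 0.31, 0.60)
```

Under this assumption, the recovered P-value for the AHRR estimate falls in the same order of magnitude as the reported 6.76 × 10⁻⁷, and the negative log-odds corresponds to lower methylation being associated with higher risk.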

To further examine the associations of the significant CpGs with lung cancer risk, we stratified by time from blood draw to diagnosis (≤10, >10 years). The magnitude of risk was similar in the two strata; small differences in risk were likely due to statistical variability, and there was no evidence of effect modification by time to diagnosis. For these 16 differentially methylated CpGs, the ORs of lung cancer were in general slightly higher for former smokers and for SCLC than among current smokers or for NSCLC, respectively (Supplemental material).

Table 3. Association between differentially methylated regions (DMRs) and lung cancer in CLUE II participants

DMR analysis

Using the DMRcate package in R, we identified differentially methylated regions (DMRs) by case/control status [Citation37]. Instead of focusing on single CpG identification, the DMRcate method identifies regions of chromosomes that are differentially methylated by case–control status. After adjusting for both smoking status and pack-years methylation score, 40 DMRs were found to be statistically significantly associated with lung cancer risk (Table 3). The regions we identified using DMRcate could also be found among the results from using the R package dmrff, although these regions were split up into smaller regions when using dmrff (due to the different methodological approaches; Supplemental Table 6).

We conducted stratified analyses to examine whether our DMR results were modified by time to diagnosis or differed by histology. Among those with ≤10 years between blood draw and diagnosis, a region located on chromosome 20:36,148,699–36,149,271 (genes NNAT and BLCAP) was statistically significantly associated with lung cancer. By histology, one region located in the H19 gene and one region located in the MYEOV gene were statistically significant DMRs for NSCLC and SCLC, respectively (Supplemental material).

Table 4. Odds ratios for the top 10 CpGs (q-value <0.05) in two differentially methylated regions (H19 and HOXA4) and lung cancer risk, stratified by time to diagnosis, in CLUE II participants

We conducted further analyses for two of the most statistically significant DMRs. These two regions, located on chromosomes 11 (H19 gene) and 7 (HOXA3/HOXA4 genes), consisted of 31 and 20 CpGs, respectively, that differed between the cases and controls (all had FDR-adjusted q-values <0.1). For each of these two regions, we selected the CpGs with the strongest associations with lung cancer risk for comparison by time to cancer diagnosis (≤10, >10 years). Results for CpGs in these two regions are presented in Table 4 (top 10 statistically significant CpGs are included), with additional tables presented in Supplemental Table 5. For CpGs located in the HOXA3/HOXA4 region, 18 out of 20 were statistically significant and all sites were hypermethylated in cases compared to controls. Results for these HOXA3/HOXA4 region CpGs were similar in the ≤10 and >10 year time to diagnosis groups. In the H19 region, the strongest association was noted for cg00237904. This CpG was also identified among the top statistically significant CpGs in the single CpG EWAS (Table 2, q-value <0.05).

Discussion

In this study, we identified both single CpGs and DMRs associated with lung cancer that are not primarily driven by smoking history, by using a DNA methylation-based pack-years score to adjust for cumulative smoking. Using a prospective study design (with pre-diagnostic blood), we identified 16 single CpG sites and 40 DMRs that were associated with lung cancer risk; genes in these regions included H19, HOXA3/HOXA4, RUNX3, BRICD5, RP13, and PLXNB2.

Previous studies have used either case-control or nested case–control designs to study the association between methylation markers and lung cancer risk. Retrospective case–control studies on this topic are not comparable to our study because they either used a different methodology to measure blood leukocyte methylation [Citation38], or measured methylation biomarkers in sputum as a classifier for lung cancer risk [Citation39,Citation40]. Previously, three nested case–control lung cancer EWAS publications used pre-diagnostic, peripheral blood samples to examine DNA methylation levels associated with lung cancer risk, while adjusting for smoking using self-reported information [Citation10,Citation20,Citation22] (summarized in Supplemental Table 7). Fasanelli 2015 [Citation20] and Baglietto 2016 [Citation10] together identified six CpGs (cg05951221, cg21566642, cg05575921, cg06126421, cg23387569 and cg03636183) with significant ORs for lung cancer after adjusting for smoking using self-reported smoking status. Further stratification showed that five of the six CpGs had methylation levels strongly influenced by smoking. Sandanger 2018 found two additional CpGs, cg10151248 (PC) and cg13482620 (B3GNTL1), to be significantly associated with lung cancer risk after adjusting for smoking status, pack-years, and a comprehensive smoking index built using self-reported information [Citation22]. In our analyses, cg05575921 (AHRR), cg03636183 (F2RL3), and cg21566642 methylation levels were statistically significantly associated with lung cancer when we adjusted for self-reported smoking status, whereas cg10151248 (PC), cg13482620 (B3GNTL1), cg23387569 (AGAP2), cg05951221, and cg06126421 methylation levels did not significantly differ between the cases and controls regardless of whether we adjusted for the pack-years methylation score or not.

Our analyses identified novel genomic regions that are independent of smoking exposures. Many of the significant single CpGs we identified through EWAS were located in genomic regions that have been previously associated with lung cancer or other malignancies (RUNX3 [Citation41], H19 [Citation42], BAIAP2L2 [Citation43], GPR132 [Citation44], CUEDC1 [Citation45], SSBP4 [Citation46], AMPD2 [Citation47], ADAM11 [Citation48], and RTN4R [Citation49]). For instance, RUNX3 is a tumour suppressor gene that is implicated in lung cancer oncogenesis [Citation41]. Promoter hypermethylation of RUNX3 has been associated with NSCLC survival [Citation50]. Many of the top differentially methylated regions we identified using the DMRcate analysis included genes that have been previously linked to lung cancer in other studies (H19 [Citation42], HOXA3/HOXA4 [Citation51], PLXNB2 [Citation52], PRDM1 [Citation53], TSPAN4 [Citation54], PHPT1 [Citation55], MSI2 [Citation56], CBX5 [Citation57], RCAN1 [Citation58], CCL5 [Citation59], and BRDT [Citation60]), providing some support for our findings (although we acknowledge that risk and survival may not be driven by the same genes). Of these eleven genetic regions already linked to lung cancer, many have been connected to poor outcomes in lung cancer. For instance, decreased expression levels of PLXNB2 [Citation52] and PRDM1 [Citation53] have been found to be correlated with poor prognosis in lung cancer, while TSPAN4 [Citation54], PHPT1 [Citation55], MSI2 [Citation56], and CBX5 [Citation57] are linked to metastasis. In addition, HOXA3/HOXA4 [Citation51], RCAN1 [Citation58], and CCL5 [Citation59] are involved in the growth, development, and migration of lung cancer cells. Other top regions identified in our study include genes that have been linked to breast cancer (NNAT [Citation61], RPL7A [Citation62], and HIST1H2BO [Citation63]), colorectal cancer (RP11 [Citation64,Citation65]), endometrial cancer (HELZ2 [Citation66]), pancreatic cancer (LFNG [Citation67]), prostate cancer (MAST3 [Citation68]), renal cell carcinoma (KCNJ1 [Citation69]), and tumour progression (ZC3H12D [Citation70]).

Of all the novel genomic regions that we identified as associated with lung cancer risk, the H19 region was the only one that appeared in both the single CpG EWAS and DMRcate results. The H19 long noncoding RNA (LncRNA) has been previously implicated in lung cancer causation. Inhibition of LncRNA H19 has been found to suppress the growth, migration, and invasion of NSCLC [Citation42]. In terms of disease development, loss of imprinting of the H19 gene has been connected to a genome-wide loss of methylation and associated with the transformation from normal to NSCLC [Citation71–73]. In our analyses, we found that hypermethylation of many H19 region CpGs was associated with lung cancer risk in CLUE II. The direction of this association was unexpected since the overexpression of H19 LncRNA in lung tumours is often correlated with hypomethylation of the promoter region CpGs [Citation72]. H19 LncRNA belongs to a highly conserved imprinted gene cluster that plays important roles in embryonal development and growth control [Citation74] and H19 region methylation has been found to be influenced by early life exposures, including maternal factors during pregnancy [Citation75], suggesting the possibility that external exposures could impact H19 methylation. Since the blood samples were drawn years before cancer diagnosis in this study, the methylation patterns we observed could reflect regions that are modulated early in lung cancer development. Finally, while it should be noted that blood methylation levels may not reflect lung tissue methylation levels, they can reflect important immunological changes or epigenetic programming that could be important in lung cancer development [Citation76]. More research is needed to investigate how methylation patterns in the blood are related to subsequent cancer risk.

A major strength of the study was the use of the DMRcate analysis methods and the Illumina Infinium MethylationEPIC 850 K BeadArray, allowing us to investigate genome-wide regional methylation level differences between lung cancer cases and controls. Another strength is our strict adjustment for smoking, using both self-reported smoking status and a methylation-based pack-year smoking score.

Limitations of this study include a relatively small sample size and a lack of replication datasets. Thus, the regions we identified that have not previously been associated with lung cancer should be investigated in other populations. It is also possible that some of the single CpGs and DMRs that we identified after adjusting for the pack-year methylation score could be related to risk factors unique to CLUE II. The case–control EWAS design used for this study could not provide evidence to help elucidate whether the methylation sites and regions identified may be causally implicated in lung cancer. Further studies are needed to investigate pathways related to the novel genomic regions that we identified.

This study demonstrated the importance of carefully controlling for known DNA methylation changes associated with smoking to be able to identify novel genomic regions. We showed the potential for this approach to identify DMRs (i.e., not single CpG alterations) by case/control status using peripheral blood collected prior to lung cancer diagnosis. Future studies will need to examine whether the regions we identified are causally related to lung cancer risk. Further work in other populations should be conducted to validate regions that we observed to be associated with lung cancer risk independent of smoking exposures, especially among different ethnic and racial groups. These findings suggest that methylation changes detectable years prior to cancer diagnosis could potentially influence lung cancer risk, providing potential opportunities for risk stratification and screening prioritization.

Authors’ contributions

DSM, KTK and EAP designed the study, obtained funding, and acquired the data. JL assisted with the preparation of the dataset. DSM supervised all research activities. MR and NZ conducted the statistical analyses. DSM and NZ drafted the manuscript. DSM, EAP, KTK, DCK and CJM interpreted the data and provided critical revisions of the manuscript. All authors read and approved the final version of the manuscript.

Availability of data and material

The datasets generated during the current study are available from the corresponding author on reasonable request and will be deposited into dbGaP within 1 year.

Ethics approval

This study was approved by the Institutional Review Board at Johns Hopkins University Bloomberg School of Public Health and at Tufts University.

Acknowledgments

Cancer incidence data were provided by the Maryland Cancer Registry, Center for Cancer Surveillance and Control, Maryland Department of Health, 201 W. Preston Street, Room 400, Baltimore, MD 21201. We acknowledge the State of Maryland, the Maryland Cigarette Restitution Fund, and the National Program of Cancer Registries of the Centers for Disease Control and Prevention for the funds that helped support the availability of the cancer registry data.

Disclosure statement

Karl Kelsey is a founder and scientific advisor to Cellintec, which had no role in this research. No potential conflict of interest was reported by the other authors.

Supplementary material

Supplemental data for this article can be accessed here.

Additional information

Funding

This work was supported by the 2018 American Association for Cancer Research (AACR)-Johnson & Johnson Lung Cancer Innovation Science Grant (18-90-52-MICH). Note: The funders had no role in the design of the study; the collection, analysis, and interpretation of the data; the writing of the manuscript; or the decision to submit the manuscript for publication.

References

  • Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA Cancer J Clin. 2020;70(1):7–30.
  • Moyer VA; U.S. Preventive Services Task Force. Screening for lung cancer: U.S. preventive services task force recommendation statement. Ann Intern Med. 2014;160(5):330–338.
  • National Lung Screening Trial Research Team. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med. 2011;365(5):395–409.
  • Fabrikant MS, Wisnivesky JP, Marron T, et al. Benefits and challenges of lung cancer screening in older adults. Clin Ther. 2018;40:526–534.
  • Heleno B, Siersma V, Brodersen J. Estimation of overdiagnosis of lung cancer in low-dose computed tomography screening: a secondary analysis of the Danish lung cancer screening trial. JAMA Intern Med. 2018;178(10):1420–1422.
  • Patz EF Jr., Pinsky P, Gatsonis C, et al. Overdiagnosis in low-dose computed tomography screening for lung cancer. JAMA Intern Med. 2014;174(2):269–274.
  • McCunney RJ, Li J. Radiation risks in lung cancer screening programs. Chest. 2014;145(3):618–624.
  • Quaife SL, Ruparel M, Beeken RJ, et al. The Lung Screen Uptake Trial (LSUT): protocol for a randomised controlled demonstration lung cancer screening pilot testing a targeted invitation strategy for high risk and ‘hard-to-reach’ patients. BMC Cancer. 2016;16(1):1–9.
  • Shenker NS, Polidoro S, Van Veldhoven K, et al. Epigenome-wide association study in the European prospective investigation into cancer and nutrition (EPIC-Turin) identifies novel genetic loci associated with smoking. Hum Mol Genet. 2013;22(5):843–851.
  • Baglietto L, Ponzi E, Haycock P, et al. DNA methylation changes measured in pre-diagnostic peripheral blood samples are associated with smoking and lung cancer risk. Int J Cancer. 2017;140(1):50–61.
  • Ligthart S, Marzi C, Aslibekyan S, et al. DNA methylation signatures of chronic low-grade inflammation are associated with complex diseases. Genome Biol. 2016;17(1):255.
  • Ahsan M, Ek WE, Rask-Andersen M, et al. The relative contribution of DNA methylation and genetic variants on protein biomarkers for human diseases. PLoS Genet. 2017;13(9):e1007005.
  • Wahl S, Drong A, Lehne B, et al. Epigenome-wide association study of body mass index, and the adverse outcomes of adiposity. Nature. 2016;541:81.
  • Xu K, Zhang X, Wang Z, et al. Epigenome-wide association analysis revealed that SOCS3 methylation influences the effect of cumulative stress on obesity. Biol Psychol. 2018;131:63–71.
  • Chambers JC, Loh M, Lehne B, et al. Epigenome-wide association of DNA methylation markers in peripheral blood from Indian Asians and Europeans with incident type 2 diabetes: a nested case-control study. Lancet Diabetes Endocrinol. 2015;3(7):526–534.
  • Huan T, Joehanes R, Song C, et al. Genome-wide identification of DNA methylation QTLs in whole blood highlights pathways for cardiovascular disease. Nat Commun. 2019;10(1):4267.
  • Agha G, Mendelson MM, Ward-Caviness CK, et al. Blood Leukocyte DNA Methylation Predicts Risk of Future Myocardial Infarction and Coronary Heart Disease. Circulation. 2019;140(8):645–657.
  • Zhang Y, Elgizouli M, Schottker B, et al. Smoking-associated DNA methylation markers predict lung cancer incidence. Clin Epigenetics. 2016;8:127.
  • Zhang Y, Breitling LP, Balavarca Y, et al. Comparison and combination of blood DNA methylation at smoking-associated genes and at lung cancer-related genes in prediction of lung cancer mortality. Int J Cancer. 2016;139(11):2482–2492.
  • Fasanelli F, Baglietto L, Ponzi E, et al. Hypomethylation of smoking-related genes is associated with future lung cancer in four prospective cohorts. Nat Commun. 2015;6(1):1–9.
  • Joehanes R, Just AC, Marioni RE, et al. Epigenetic signatures of cigarette smoking. Circ Cardiovasc Genet. 2016;9(5):436–447.
  • Sandanger TM, Nøst TH, Guida F, et al. DNA methylation and associated gene expression in blood prior to lung cancer diagnosis in the Norwegian Women and Cancer cohort. Sci Rep. 2018;8(1):1–10.
  • Battram T, Richmond RC, Baglietto L, et al. Appraising the causal relevance of DNA methylation for risk of lung cancer. Int J Epidemiol. 2019;48(5):1493–1504.
  • Fewell Z, Davey Smith G, Sterne JA. The impact of residual and unmeasured confounding in epidemiologic studies: a simulation study. Am J Epidemiol. 2007;166(6):646–655.
  • Munafò MR, Timofeeva MN, Morris RW, et al. Association between genetic variants on chromosome 15q25 locus and objective measures of tobacco exposure. J Natl Cancer Inst. 2012;104(10):740–748.
  • Zhang X, Gao L, Liu Z-P, et al. Uncovering driver DNA methylation events in nonsmoking early stage lung adenocarcinoma. Biomed Res Int. 2016;2016:2090286. doi:10.1155/2016/2090286.
  • Genkinger JM, Platz EA, Hoffman SC, et al. Fruit, vegetable, and antioxidant intake and all-cause, cancer, and cardiovascular disease mortality in a community-dwelling population in Washington County, Maryland. Am J Epidemiol. 2004;160(12):1223–1233.
  • Kakourou A, Koutsioumpa C, Lopez DS, et al. Interleukin-6 and risk of colorectal cancer: results from the CLUE II cohort and a meta-analysis of prospective studies. Cancer Causes Control. 2015;26(10):1449–1460.
  • Schober SE, Comstock GW, Helsing KJ, et al. Serologic precursors of cancer. I. Prediagnostic serum nutrients and colon cancer risk. Am J Epidemiol. 1987;126(6):1033–1041.
  • Comstock GW, Helzlsouer KJ, Bush TL. Prediagnostic serum levels of carotenoids and vitamin E as related to subsequent cancer in Washington County, Maryland. Am J Clin Nutr. 1991;53(1 Suppl):260S–264S.
  • Leek JT, Johnson WE, Parker HS, et al. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012;28(6):882–883.
  • Leek JT, Scharpf RB, Bravo HC, et al. Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet. 2010;11(10):733–739.
  • Houseman EA, Accomando WP, Koestler DC, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012;13:86.
  • Salas LA, Koestler DC, Butler RA, et al. An optimized library for reference-based deconvolution of whole-blood biospecimens assayed using the Illumina HumanMethylationEPIC BeadArray. Genome Biol. 2018;19(1):64.
  • Adalsteinsson BT, Gudnason H, Aspelund T, et al. Heterogeneity in white blood cells has potential to confound DNA methylation measurements. PLoS One. 2012;7(10):e46705.
  • Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Series B Methodol. 1995;57(1):289–300.
  • Peters TJ, Buckley MJ, Statham AL, et al. De novo identification of differentially methylated regions in the human genome. Epigenetics Chromatin. 2015;8:6.
  • Wang L, Aakre JA, Jiang R, et al. Methylation markers for small cell lung cancer in peripheral blood leukocyte DNA. J Thorac Oncol. 2010;5(6):778–785.
  • Leng S, Wu G, Klinge DM, et al. Gene methylation biomarkers in sputum as a classifier for lung cancer risk. Oncotarget. 2017;8(38):63978.
  • Liu D, Peng H, Sun Q, et al. The indirect efficacy comparison of DNA methylation in sputum for early screening and auxiliary detection of lung cancer: a meta-analysis. Int J Environ Res Public Health. 2017;14(7):679.
  • Sato K, Tomizawa Y, Iijima H, et al. Epigenetic inactivation of the RUNX3 gene in lung cancer. Oncol Rep. 2005;15(1):129–135.
  • Huang Z, Lei W, Hu HB, et al. H19 promotes non‐small‐cell lung cancer (NSCLC) development through STAT3 signaling via sponging miR‐17. J Cell Physiol. 2018;233(10):6768–6776.
  • Liu J, Shangguan Y, Sun J, et al. BAIAP2L2 promotes the progression of gastric cancer via AKT/mTOR and Wnt3a/β-catenin signaling pathways. Biomed Pharmacother. 2020;129:110414.
  • Chen P, Zuo H, Xiong H, et al. Gpr132 sensing of lactate mediates tumor–macrophage interplay to promote breast cancer metastasis. Proc Natl Acad Sci U S A. 2017;114(3):580–585.
  • Lopes R, Korkmaz G, Revilla SA, et al. CUEDC1 is a primary target of ERα essential for the growth of breast cancer cells. Cancer Lett. 2018;436:87–95.
  • Guo X, Lin W, Bao J, et al. A comprehensive cis-eQTL analysis revealed target genes in breast cancer susceptibility loci identified in genome-wide association studies. Am J Hum Genet. 2018;102(5):890–903.
  • Gao Q-Z, Qin Y, Wang W-J, et al. Overexpression of AMPD2 indicates poor prognosis in colorectal cancer patients via the Notch3 signaling pathway. World J Clin Cases. 2020;8(15):3197.
  • Sieuwerts AM, Meijer-van Gelder ME, Timmermans M, et al. How ADAM-9 and ADAM-11 differentially from estrogen receptor predict response to tamoxifen treatment in patients with recurrent breast cancer: a retrospective study. Clin Cancer Res. 2005;11(20):7311–7321.
  • He Y, Ji P, Li Y, et al. Genetic variants were associated with the prognosis of head and neck squamous carcinoma. Front Oncol. 2020;10:372.
  • Yanagawa N, Tamura G, Oizumi H, et al. Promoter hypermethylation of RASSF1A and RUNX3 genes as an independent prognostic prediction marker in surgically resected non-small cell lung cancers. Lung Cancer. 2007;58(1):131–138.
  • Xu K, Liu B, Ma Y. The tumor suppressive roles of ARHGAP25 in lung cancer cells. Onco Targets Ther. 2019;12:6699.
  • Liu H, Zhao H. Prognosis related miRNAs, DNA methylation, and epigenetic interactions in lung adenocarcinoma. Neoplasma. 2019;66(3):487–493.
  • Zhu Z, Wang H, Wei Y, et al. Downregulation of PRDM1 promotes cellular invasion and lung cancer metastasis. Tumor Biol. 2017;39(4):1010428317695929.
  • Ying X, Zhu J, Zhang Y. Circular RNA circ‐TSPAN4 promotes lung adenocarcinoma metastasis by upregulating ZEB1 via sponging miR‐665. Mol Genet Genomic Med. 2019;7(12):e991.
  • Xu A-j, Xia X-h, Du S-t, et al. Clinical significance of PHPT1 protein expression in lung cancer. Chin Med J (Engl). 2010;123(22):3247–3251.
  • Kudinov AE, Deneka A, Nikonova AS, et al. Musashi-2 (MSI2) supports TGF-β signaling and inhibits claudins to promote non-small cell lung cancer (NSCLC) metastasis. Proc Natl Acad Sci U S A. 2016;113(25):6955–6960.
  • Yu Y-H, Chiou G-Y, Huang P-I, et al. Network biology of tumor stem-like cells identified a regulatory role of CBX5 in lung cancer. Sci Rep. 2012;2:584.
  • Ma N, Shen W, Pang H, et al. The effect of RCAN1 on the biological behaviors of small cell lung cancer. Tumor Biol. 2017;39(6):1010428317700405.
  • Huang C-Y, Fong Y-C, Lee C-Y, et al. CCL5 increases lung cancer migration via PI3K, Akt and NF-κB pathways. Biochem Pharmacol. 2009;77(5):794–803.
  • Grunwald C, Koslowski M, Arsiray T, et al. Expression of multiple epigenetically regulated cancer/germline genes in nonsmall cell lung cancer. Int J Cancer. 2006;118(10):2522–2528.
  • Nass N, Walter S, Jechorek D, et al. High neuronatin (NNAT) expression is associated with poor outcome in breast cancer. Virchows Arch. 2017;471(1):23–30.
  • Zhu Y, Lin H, Li Z, et al. Modulation of expression of ribosomal protein L7a (rpL7a) by ethanol in human breast cancer cells. Breast Cancer Res Treat. 2001;69(1):29–38.
  • Xie W, Zhang J, Zhong P, et al. Expression and potential prognostic value of histone family gene signature in breast cancer. Exp Ther Med. 2019;18(6):4893–4903.
  • Sun L, Jiang C, Xu C, et al. Down-regulation of long non-coding RNA RP11-708H21.4 is associated with poor prognosis for colorectal cancer and promotes tumorigenesis through regulating AKT/mTOR pathway. Oncotarget. 2017;8(17):27929.
  • Wu Y, Yang X, Chen Z, et al. m6A-induced lncRNA RP11 triggers the dissemination of colorectal cancer cells via upregulation of Zeb1. Mol Cancer. 2019;18(1):1–16.
  • Qiao Z, Jiang Y, Wang L, et al. Mutations in KIAA1109, CACNA1C, BSN, AKAP13, CELSR2, and HELZ2 are associated with the prognosis in endometrial cancer. Front Genet. 2019;10:909.
  • Liu P, Weng Y, Sui Z, et al. Quantitative secretomic analysis of pancreatic cancer cells in serum-containing conditioned medium. Sci Rep. 2016;6:37606.
  • Dahlman KB, Parker JS, Shamu T, et al. Modulators of prostate cancer cell proliferation and viability identified by short-hairpin RNA library screening. PLoS One. 2012;7(4):e34414.
  • Guo Z, Liu J, Zhang L, et al. KCNJ1 inhibits tumor proliferation and metastasis and is a prognostic factor in clear cell renal cell carcinoma. Tumor Biol. 2015;36(2):1251–1259.
  • Huang S, Qi D, Liang J, et al. The putative tumor suppressor Zc3h12d modulates toll-like receptor signaling in macrophages. Cell Signal. 2012;24(2):569–576.
  • Anisowicz A, Huang H, Braunschweiger KI, et al. A high-throughput and sensitive method to measure global DNA methylation: application in lung cancer. BMC Cancer. 2008;8(1):222.
  • Kondo M, Suzuki H, Ueda R, et al. Frequent loss of imprinting of the H19 gene is often associated with its overexpression in human lung cancers. Oncogene. 1995;10(6):1193–1198.
  • Langevin SM, Kratzke RA, Kelsey KT. Epigenetics of lung cancer. Transl Res. 2015;165(1):74–90.
  • Gabory A, Jammes H, Dandolo L. The H19 locus: role of an imprinted non‐coding RNA in growth and development. Bioessays. 2010;32(6):473–480.
  • Miyaso H, Sakurai K, Takase S, et al. The methylation levels of the H19 differentially methylated region in human umbilical cords reflect newborn parameters and changes by maternal environmental factors during early pregnancy. Environ Res. 2017;157:1–8.
  • Widschwendter M, Jones A, Evans I, et al. Epigenome-based cancer risk prediction: rationale, opportunities and challenges. Nat Rev Clin Oncol. 2018;15(5):292.