Brief report

Gaps in current methods to detect polymorphic CpGs from Illumina Infinium human methylation microarrays and exploring their potential impact in multi-EWAS analyses

Article: 2281153 | Received 14 Jul 2023, Accepted 27 Oct 2023, Published online: 20 Nov 2023
 

ABSTRACT

DNA methylation (DNAm) epigenome-wide association studies (EWAS) have been performed in populations of diverse ethnicity to discover novel biomarkers associated with various diseases, such as cancers, autoimmune diseases, and neurological disorders. However, genetic polymorphisms can influence DNAm levels, resulting in methylation quantitative trait loci (meQTL). These effects can be direct, by altering the sequence of the methylation (CpG) site itself, or, in the case of array-based measures, indirect, by altering the interaction between the detection probe and its binding site. Because the frequencies of genetic variants associated with meQTL can differ between population groups, such variants have the potential to confound EWAS observations, particularly in multi-ethnic populations. In this study, we analysed publicly available DNA methylation profiles (450K array) from 1342 individuals in 6 distinct ancestral groups. We investigated two tools (GapHunter and MethylToSNP) specifically designed to identify CpG sites that may be influenced by genetic variation. Results from this aggregated trans-ancestral epigenome-wide dataset suggest that both tools fail to consistently identify not only the effects of rarer genetic variants (MAF < 0.05) but also more than half of the sites predicted to be associated with variants of much higher allele frequency (MAF > 0.2). In addition, concordance between GapHunter and MethylToSNP in the detection of polymorphic CpGs is relatively low. Screening CpG site associations from EWAS with either tool is therefore unlikely to be a robust or comprehensive means of identifying all confounding effects of genetic variants.
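
As an illustration of what concordance between two tools means in practice, the sketch below compares two lists of flagged CpG probe IDs and reports their overlap. It is a minimal example under stated assumptions: the function name `concordance`, the use of the Jaccard index, and the probe IDs are all hypothetical and chosen for illustration; they are not the measure or the results reported in the study.

```python
def concordance(flagged_a, flagged_b):
    """Summarise the overlap between two collections of flagged probe IDs.

    The Jaccard index (shared / union) is an assumed choice of metric,
    not necessarily the one used in the article.
    """
    a, b = set(flagged_a), set(flagged_b)
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),       # probes flagged by both tools
        "only_a": len(a - b),        # probes flagged only by the first tool
        "only_b": len(b - a),        # probes flagged only by the second tool
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical probe IDs for illustration only (not data from the study).
gaphunter_hits = ["cg_example_1", "cg_example_2", "cg_example_3"]
methyltosnp_hits = ["cg_example_2", "cg_example_3", "cg_example_4", "cg_example_5"]

result = concordance(gaphunter_hits, methyltosnp_hits)
print(result)
```

A low Jaccard value on real output would indicate that the two tools largely flag different probes, which is the situation the abstract describes.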

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

Publicly available datasets from the NCBI GEO database were utilized in this study.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/15592294.2023.2281153

Additional information

Funding

This work was supported by funding from The Healthier Lives, National Science Challenge, and Genomics Aotearoa (a New Zealand Ministry of Business, Innovation and Employment-funded research platform) and project grants from the Health Research Council of New Zealand.