Research Paper

A novel principal component based method for identifying differentially methylated regions in Illumina Infinium MethylationEPIC BeadChip data

Article: 2207959 | Received 18 Sep 2022, Accepted 19 Apr 2023, Published online: 17 May 2023

Figures & data

Table 1. Summary of Representative Genomic Regions Used in Power Simulation.

Figure 1. Absolute Correlations between probes in Region 1.

Note: *Region 1: Chr6:30038712-30039600.
*PC1, PC2: absolute values of PC loadings on PC1 and PC2.
*cg03343571 and cg22184136 were the most heavily loaded probes on PC1 and PC2, respectively; cg22184136 was dropped by coMethDMR.

Table 2. Comparison of Numbers of Regions and DMRs Reported by DMRPC and coMethDMR.

Figure 2. The Genome-wide False Positive Rates on Genic Regions using 100 Discovery Cohort Subjects.

Note: *Genome-wide FP: genome-wide false positive rate.
*Minimal MAC: minimal maximum absolute pairwise correlation of a genomic region.
*80%, 90%, 95%, 99%: minimal variance explained by the PCs used.
*MetaPC: meta-analysis using multiple PCs; MultiPC: multivariate regression using multiple PCs.
*MetaPC1: meta-analysis using the 1st PC only; MultiPC1: multivariate regression using the 1st PC only.
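As an illustration of two quantities named in this legend, the following is a minimal sketch (not the DMRPC implementation) of how a region's maximum absolute pairwise correlation and the number of PCs needed to reach a variance-explained threshold might be computed. The function names, the samples-by-probes input layout, and the use of NumPy are assumptions made for this example only.

```python
import numpy as np

def max_abs_pairwise_corr(X):
    """Largest |Pearson r| over all distinct probe pairs.

    X is assumed to be a (samples x probes) matrix of methylation
    values or residuals for one genomic region.
    """
    corr = np.corrcoef(X, rowvar=False)          # probes x probes
    upper = np.triu_indices_from(corr, k=1)      # distinct pairs only
    return float(np.max(np.abs(corr[upper])))

def n_pcs_for_variance(X, threshold):
    """Smallest number of PCs whose cumulative variance explained
    reaches `threshold` (e.g. 0.80, 0.90, 0.95, 0.99)."""
    Xc = X - X.mean(axis=0)                      # center each probe
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values
    cum = np.cumsum(s**2 / np.sum(s**2))         # cumulative proportion
    k = int(np.searchsorted(cum, threshold)) + 1
    return min(k, len(cum))                      # guard against rounding
```

Under this reading, a region would be retained when `max_abs_pairwise_corr` meets the chosen minimal MAC, and the 80% to 99% settings would correspond to different `threshold` values passed to `n_pcs_for_variance`.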

Figure 3. The Genome-wide False Positive Rates on Genic Regions using All Discovery Cohort Subjects (n = 528).

Note: *Genome-wide FP: genome-wide false positive rate.
*Minimal MAC: minimal maximum absolute pairwise correlation of a genomic region.
*80%, 90%, 95%, 99%: minimal variance explained by the PCs used.
*MetaPC: meta-analysis using multiple PCs; MultiPC: multivariate regression using multiple PCs.
*MetaPC1: meta-analysis using the 1st PC only; MultiPC1: multivariate regression using the 1st PC only.

Figure 4. The True Positive Rates for Continuous Signals on Representative Regions: a. Region 1; b. Region 2; c. Region 3.

Note: Legend: *80%, 90%, 95%, 99%: minimal variance explained by the PCs used.
*MetaPC: meta-analysis using multiple PCs; MultiPC: multivariate regression using multiple PCs.
*MetaPC1: meta-analysis using the 1st PC only; MultiPC1: multivariate regression using the 1st PC only.
*CTPC1, CTPC2, CTPC1+PC2: simulated continuous true positive signals associated with PC1, PC2, and PC1+PC2, respectively.

Table 3. Replicated ‘Novel’ DMRs without EWAS Hits in the Discovery Cohort that were Identified Only by DMRPC or coMethDMR.

Table 4. Comparison of Relative Computational Burden between DMRPC and coMethDMR.

Figure 5. Visualization of Two Age-related DMRs in the Discovery Cohort: a. DMR Chr6:11044877-11044974 (PFDR=3.31×10−64); b. DMR Chr3:147125712-147127193 (PFDR=4.69×10−32).

Note: Legend: *The DMR Chr6:11044877-11044974 was the most significant age-related DMR in the Discovery cohort (PFDR=3.31×10−64), with 4 probes available and a MAC of 0.77. In this region, 2 PCs were used in the DMRPC analysis to explain 86.08% of the total variance of the methylation residuals.
*The DMR Chr3:147125712-147127193 (PFDR=4.69×10−32) had 28 probes available within the region and a MAC of 0.65. In this region, 10 PCs were used in the DMRPC analysis to explain 64.85% of the total variance of the methylation residuals.
*Only PCs with nominal p-values < 0.05 were plotted, together with the probes (at least 2 per PC) whose absolute PC loadings exceeded the estimated 50% quantile. Note that the signs (±) of the weights are arbitrary in PCA.
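The probe-selection rule in the last legend item can be sketched as follows. This is an illustrative reading only, assuming that "above the estimated 50% quantile" means a strict comparison against the median of the absolute loadings on one PC; the function name and the probe labels are hypothetical. Because the rule uses absolute values, flipping the sign of a PC (which PCA permits) leaves the selection unchanged.

```python
import numpy as np

def probes_above_quantile(loadings, probe_ids, q=0.5):
    """Return the probes whose absolute loading on one PC exceeds the
    q-quantile of the absolute loadings (q=0.5 is the median)."""
    abs_loadings = np.abs(np.asarray(loadings, dtype=float))
    cutoff = np.quantile(abs_loadings, q)
    return [p for p, a in zip(probe_ids, abs_loadings) if a > cutoff]

# Hypothetical loadings for four probes on a single PC.
ids = ["probe_a", "probe_b", "probe_c", "probe_d"]
pc = np.array([0.7, -0.6, 0.3, -0.1])

selected = probes_above_quantile(pc, ids)
selected_flipped = probes_above_quantile(-pc, ids)  # same PC, opposite sign
```

Here `selected` and `selected_flipped` contain the same probes, which is why the sign of a plotted weight carries no meaning on its own.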
Supplemental material

Data availability statement

The datasets analysed during the current study are not publicly available. Qualified investigators can apply to the PTSD Genetics and TRACTS data repositories to gain access to these data via a Data Use Agreement. Please contact Dr. MW Miller regarding access to methylation data from PTSD Genetics and TRACTS data repositories.