Research Paper

Multi-tissue DNA methylation microarray signature is predictive of gene function

Pages 1404-1418 | Received 19 Oct 2021, Accepted 25 Jan 2022, Published online: 13 Feb 2022
 

ABSTRACT

Background: Transcriptional correlation networks derived from publicly available gene expression microarrays have previously been shown to be predictive of known gene functions, but less is known about the predictive capacity of correlated DNA methylation at CpG sites. Guilt-by-association co-expression methods can be adapted for use with DNA methylation when a representative methylation value is created for each gene. We examine how methylation compares to expression in predicting Gene Ontology terms using both co-methylation and traditional machine learning approaches across different types of representative methylation values per gene.

Methods: We perform guilt-by-association gene function prediction with a suite of models called Methylation Array Network Analysis (MANA), using a network of correlated methylation values derived from over 24,000 samples. In generating the correlation matrix, we compared how different methods of collapsing probe-level data, and different regions surrounding the gene of interest, affected the resulting gene function predictions.

Results: Scoring each gene by its mean co-methylation with the genes annotated to a term gave the highest overall macro-AUC, 0.60 across all Gene Ontology terms, when mean gene body methylation was used. The logistic regression approach raised the highest macro-AUC to 0.82, also with mean gene body methylation, compared to 0.72 for the naive predictor.

Conclusion: Genes correlated in their methylation state are functionally related. Genes clustered in co-methylation space were enriched for chromatin state, PRC2, immune response, and development-related terms.
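To make the guilt-by-association idea in the abstract concrete, the following is a minimal sketch, not the MANA implementation. It assumes one representative methylation value per gene per sample, builds a gene-by-gene Pearson correlation matrix, scores each gene for a term by its mean correlation with the other genes annotated to that term, and averages the per-term AUCs into a macro-AUC. The function names (`auc`, `gba_scores`, `macro_auc`) and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney U statistic); tied scores share their mean rank."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(scores.size)
    ranks[order] = np.arange(1, scores.size + 1)
    for value in np.unique(scores):
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2
    return float(u / (n_pos * n_neg))


def gba_scores(corr, annotated):
    """Mean correlation of every gene with the genes annotated to one term.

    A gene's correlation with itself is excluded, so an annotated gene is
    scored only from the other annotated genes (a leave-one-out style score).
    """
    corr = np.array(corr, dtype=float)  # copy so the caller's matrix is untouched
    annotated = np.asarray(annotated, dtype=bool)
    np.fill_diagonal(corr, 0.0)
    sums = corr[:, annotated].sum(axis=1)
    counts = annotated.sum() - annotated.astype(int)
    return sums / counts


def macro_auc(corr, annotations):
    """Unweighted mean of per-term AUCs; annotations is a genes x terms boolean matrix."""
    annotations = np.asarray(annotations, dtype=bool)
    per_term = []
    for t in range(annotations.shape[1]):
        term = annotations[:, t]
        # Skip terms that cannot be evaluated (fewer than 2 annotated genes, or all genes).
        if term.sum() < 2 or term.all():
            continue
        per_term.append(auc(gba_scores(corr, term), term))
    return float(np.mean(per_term))


# Toy data (assumption): 10 genes x 200 samples, two co-methylated modules of 5 genes.
rng = np.random.default_rng(0)
factor_a, factor_b = rng.normal(size=(2, 200))
meth = np.vstack(
    [factor_a + 0.3 * rng.normal(size=200) for _ in range(5)]
    + [factor_b + 0.3 * rng.normal(size=200) for _ in range(5)]
)
corr = np.corrcoef(meth)  # rows are genes, so this is gene-by-gene correlation

annotations = np.zeros((10, 2), dtype=bool)
annotations[:5, 0] = True  # hypothetical term 0 annotates module A
annotations[5:, 1] = True  # hypothetical term 1 annotates module B

print(f"macro-AUC on toy data: {macro_auc(corr, annotations):.2f}")
```

Because the toy modules are cleanly separated, the macro-AUC here should be close to 1; real Gene Ontology terms are far noisier, which is consistent with the much lower values reported in the abstract.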

Acknowledgments

The authors acknowledge funding from NIH grants 5P30AG050911, 2P20GM103636, and 5U54GM104938 (to J.D.W.).

Disclosure statement

No potential conflict of interest was reported by the authors.

Data availability

Code for MANA is located at https://gitlab.com/wrenlab/mana.git

Supplementary material

Supplemental data for this article can be accessed here

Additional information

Funding

This work was supported by the National Institutes of Health [2P20GM103636]; National Institutes of Health [5U54GM104938]; National Institutes of Health [5P30AG050911].
