Abstract
Aim: To predict transcription factor (TF) binding events from gene expression and epigenetic modification data. Materials & methods: TF-binding events based on data from the ENCODE project and The Cancer Genome Atlas were analyzed with the random forest method. Results: The TF-binding predictive models performed well in the GM12878, HeLa, HepG2 and K562 cell lines and were then applied to other cell lines and tissues. The genes bound by the top TFs (MAX and MAZ) were significantly associated with cancer-related processes such as cell proliferation and DNA repair. Conclusion: We successfully constructed TF-binding predictive models in cell lines and applied them to tissues.
Supplementary data
To view the supplementary data that accompany this paper, please visit the journal website at: www.tandfonline.com/doi/suppl/10.2217/epi-2019-0321
Financial & competing interests disclosure
This work was funded by the National Natural Science Foundation of China (no. 61972116). The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
The authors thank L Bianji, Edanz Group China (www.liwenbianji.cn/ac), for editing the English text of a draft of this manuscript. Writing assistance was funded by the National Natural Science Foundation of China (no. 61972116).