Search in:

Epigenetics Volume 13, 2018 - Issue 9

Submit an article Journal homepage

Free access

3,084

Views

CrossRef citations to date

Altmetric

Listen

Research paper

Super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers

Aziz KhanCentre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo, Norway;Key Lab of Bioinformatics/Bioinformatics Division, BNRIST (Beijing National Research Center for Information Science and Technology), Department of Automation, Tsinghua University, Beijing, ChinaCorrespondence[email protected]

http://orcid.org/0000-0002-6459-6224 View further author information

Anthony MathelierCentre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo, Norway;Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, Oslo, Norway

http://orcid.org/0000-0001-5127-5459 View further author information

Xuegong ZhangKey Lab of Bioinformatics/Bioinformatics Division, BNRIST (Beijing National Research Center for Information Science and Technology), Department of Automation, Tsinghua University, Beijing, China;School of Life Sciences, Tsinghua University, Beijing, China

http://orcid.org/0000-0002-9684-5643 View further author information

Pages 910-922 | Received 03 May 2018, Accepted 10 Aug 2018, Published online: 11 Oct 2018

Cite this article
https://doi.org/10.1080/15592294.2018.1514231
CrossMark

In this article

ABSTRACT
Background
Results
Discussion
Material and methods
Supplemental material
Acknowledgements
Disclosure statement
Additional information
References

Full Article
Figures & data
References
Supplemental
Citations
Metrics
Reprints & Permissions
View PDF PDF

ABSTRACT

Super-enhancers and stretch enhancers represent classes of transcriptional enhancers that have been shown to control the expression of cell identity genes and carry disease- and trait-associated variants. Specifically, super-enhancers are clusters of enhancers defined based on the binding occupancy of master transcription factors, chromatin regulators, or chromatin marks, while stretch enhancers are large chromatin-defined regulatory regions of at least 3,000 base pairs. Several studies have characterized these regulatory regions in numerous cell types and tissues to decipher their functional importance. However, the differences and similarities between these regulatory regions have not been fully assessed. We integrated genomic, epigenomic, and transcriptomic data from ten human cell types to perform a comparative analysis of super and stretch enhancers with respect to their chromatin profiles, cell type-specificity, and ability to control gene expression. We found that stretch enhancers are more abundant, more distal to transcription start sites, cover twice as much the genome, and are significantly less conserved than super-enhancers. In contrast, super-enhancers are significantly more enriched for active chromatin marks and cohesin complex, and more transcriptionally active than stretch enhancers. Importantly, a vast majority of super-enhancers (85%) overlap with only a small subset of stretch enhancers (13%), which are enriched for cell type-specific biological functions, and control cell identity genes. These results suggest that super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers, and importantly, most of the stretch enhancers that are distinct from super-enhancers do not show an association with cell identity genes, are less active, and more likely to be poised enhancers.

KEYWORDS:

Gene regulation
genomics
epigenomics
super-enhancers
stretch enhancers

Background

The human body contains several hundred distinct cell types and most of the regulatory code that drives cell type-specific gene expression resides in cis-regulatory elements termed enhancers [Citation1]. Enhancers are noncoding regulatory regions distal to the genes they regulate where transcription factors (TFs) and the transcriptional apparatus bind and orchestrate the gene regulation [Citation1,Citation2]. While estimations predicted approximately one million potential enhancers in the human genome, only a very small fraction of these enhancers are active in a given cell [Citation3,Citation4]. These active enhancers are primarily found in regions of accessible chromatin [Citation5], marked by monomethylation of histone H3 at lysine 4 (H3K4me1) and acetylation of histone H3 at lysine 27 (H3K27ac) [Citation5–Citation8], produce enhancer RNAs (eRNAs) [Citation9], and are significantly loaded with the coactivator protein p300 [Citation10] and RNA polymerase II (RNA Pol II) [Citation8]. Additionally, considerable levels of H3K4me3 are also observed at active enhancers bound by RNA Pol II [Citation11] and some studies have linked H3K4me3 broad peaks with transcriptional elongation, enhancer activity, and cellular identity [Citation12–Citation14]. Further, poised enhancers are enriched for trimethylation of histone H3 on lysine 27 (H3K27me3) and have depleted H3K27ac and RNA Pol II signal [Citation6,Citation15].

Despite critical advances in terms of technology and methodology to identify enhancers genome-wide and to understand their molecular mechanisms and function, a clear understanding is still lacking. Defining cell type-specific enhancers and accurately assigning them to the gene(s) they regulate is of great interest but very challenging due to lack of known cell type-specific signatures. In 2013, Whyte et al. showed that clusters of enhancers, termed super-enhancers, control cell identity. These super-enhancers were defined based on their enrichment for binding of key master regulator TFs, Mediator, and chromatin regulators [Citation16]. These cluster of enhancers are cell type-specific, control the expression of cell identity genes, are sensitive to perturbation, associated with disease, distinctly methylated, and boost the processing of primary microRNA into precursors of microRNAs [Citation16–Citation20]. Concomitantly to the discovery of super-enhancers, Parker et al. showed that large genomic regions with enhancer characteristic termed stretch enhancers, and defined based on their size (>3 kb), control cell identity [Citation21]. These stretch enhancers are known to be cell type-specific and are enriched for disease-associated variants, for instance, carrying small nuclear polymorphisms SNPs associated with type 2 diabetes [Citation21,Citation22]. Further, it has been shown that super-enhancers and stretch enhancers overlap with the known locus control regions (LCR) [Citation17,Citation21,Citation23]. Significant attention has been given to these cell type-specific regulatory regions to understand their molecular mechanisms and functional importance. Since the parallel publications of these two concepts, there has been confusion among some in the research community to differentiate these two classes of regulatory regions. The concepts of super-enhancers and stretch enhancers try to capture the same underlying biological phenomenon, but it is unclear whether these regions, defined using different approaches, have different or equivalent regulatory potential. Some recent studies even referred to these regions collectively as SEs (Super/Stretch Enhancers), assuming them to be equivalent [Citation24–Citation26]. While some studies showed that super-enhancers and stretch enhancers overlap [Citation27,Citation28], a detailed comparative analysis to understand their differences and similarities is lacking.

We performed a comprehensive analysis of super-enhancers and stretch enhancers in 10 human cell lines by integrating ChIP-seq data for histone modifications (H3K27ac, H3K4me1, H3K4me3, and H3K27me3), RNA Pol II, P300, cohesin components (RAD21 and SMC3) and CTCF, chromatin accessibility data (DNase-seq), and transcriptomics data (RNA-seq, GRO-seq, and GRO-cap). Our analyzes revealed significant differences between super and stretch enhancers. Stretch enhancers are more abundant, distal to transcription start sites (TSS) and less conserved than super-enhancers. Comparatively, super-enhancers are significantly more enriched for active chromatin marks, RNA Pol II, and cohesin components, are transcriptionally more active, and more transcribed than stretch enhancers. Finally, only a small fraction of stretch enhancers overlap with super-enhancers, which we named super-stretch enhancers. These super-stretch enhancers are highly transcribed, associated with cell identity genes, and enriched for cell type-specific biological functions.

Results

Genomic distribution and conservation of super-enhancers and stretch enhancers from ten human cell types

To systematically compare super-enhancers and stretch enhancers, we obtained super-enhancers from dbSUPER [Citation29] and stretch enhancers from ten human cell types [Citation21]. These cell types included B-lymphoblastoid cells (GM128278), embryonic stem cells (H1-ES), erythrocytic leukemia cells (K562), hepatocellular carcinoma cells (HepG2), human umbilical vein endothelial cells (HUVEC), human mammary epithelial cells (HMEC), human smooth muscle myoblasts (HSMM), normal human epidermal keratinocytes (NHEK), normal human lung fibroblasts (NHLF), and pancreatic islets (Islets). We looked at the genomic distribution of these regulatory regions and found that stretch enhancers are an order of magnitude more numerous and cover twice as much the human genome than super-enhancers (). Specifically, we found an average of 745 super-enhancers with mean size 22,812 bp and 11,160 stretch enhancers with mean size 5,060 bp. Further, a majority of super-enhancers (69%) was located close to TSSs (<2 kb), while a majority of stretch enhancers (70%) was located very distal from TSSs (>10 kb) (, Supplementary Figure S1). This difference in distribution of distances from TSS to super and stretch enhancers is statistically significant (P value <2.2e-16, Wilcoxon rank sum test). Next, we investigated the evolutionary conservation of super- enhancers and stretch enhancers by using phastCons scores for 99 vertebrate genomes aligned to the human genome (hgdownload.cse.ucsc.edu/goldenpath/hg19/phastCons100way/) [Citation30]. Super-enhancers were significantly more conserved than stretch enhancers (P value <2.2e-16, Wilcoxon rank sum test) (, S2). Taken together, these results indicate that stretch enhancers are more distal to TSSs, more abundant in number, and cover twice as much of the genome as super-enhancers, which are significantly more conserved.

Figure 1. Genomic distribution and conservation of super-enhancers and stretch enhancers in 10 human cell types.

(a) Number of super- and stretch enhancers in 10 cell types. (b) Fraction of the human genome covered by super- and stretch enhancers across 10 human cell types. (c) Distribution of distances to TSS for super- and stretch enhancers (average across the 10 cell types) (P value <2.2e-16, Wilcoxon rank sum test). (d) Evolutionary conservation score, phastCons scores obtained from UCSC 100 vertebrate species (phastCons100way) at super- and stretch enhancers with 6 kb flanking regions in H1-ES, K562, NHLF, and Islets cell types.

Super-enhancers are enriched for active chromatin marks

We next sought to highlight the potentially distinct chromatin marks found at super-enhancers and stretch enhancers by using ChIP-seq data from the ENCODE project [Citation31] for H3K27ac, H3K4me1, H3K4me3, and H3K27me3. Looking at average ChIP-seq signals, we observed that super-enhancers are highly enriched for active chromatin marks, such as H3K27ac and H3K4me3, while depleted for poised marks such as H3K27me3 (; Supplementary Figure S3, S4). In contrast, stretch enhancers are highly enriched for poised chromatin mark H3K27me3 and depleted for active chromatin marks H3K27ac and H3K4me3 (). Additionally, super-enhancers and stretch enhancers are similarly marked by H3K4me1 ().

Figure 2. Chromatin modifications at super-enhancers and stretch enhancers.

(a) Genome-wide average ChIP-seq profiles for H3K27ac, H3K4me, H3K4me3, and H3K27me3 at super- and stretch enhancers in H1-ESC, GM12878, and K562. (b) Genomic browser screenshot showing super- and stretch enhancers with ChIP-seq signals for H3K27ac, H3K4me1, H3K4me3, P300, and CTCF, and open chromatin (DNaseI), RNA-seq, and conservation at the locus of SOX2 gene in H1-ES cells.

Active enhancers are primarily found in regions of accessible chromatin [Citation5]. Here, for most of the cell types, we observed significantly higher levels of DNase I hypersensitive sites (DHSs) for super-enhancers than for stretch enhancers (Supplementary Figure S5). Furthermore, we found that stretch enhancers overlap with super-enhancers at key cell identity genes, such as SOX2 (), POU5F/OCT4 (Supplementary Figure S6(a)), and NANOG (Supplementary Figure S6(b)) in embryonic stem cells, and are highly enriched for active chromatin marks. Additionally, in Pancreatic islet, most of the binding sites of islet-specific transcription factors including PDX1, NKX2-2, FOXA2, and NKX6-1, are mapped to super-enhancers, while some are mapped to stretch enhancers [Citation32] (Supplementary Figure S7).

Many type-2 diabetes SNPs from the Diabetes genetics replication and meta-analysis (DIAGRAM; red color) [Citation33], as well as fasting glycemia SNPs from the Meta-analyzes of glucose and insulin-related traits consortium (MAGIC; blue color) [Citation34] were observed on and around these enhancer regions. Interestingly, we observed higher ChIP-seq binding signal at stretch enhancers that overlap with super-enhancers. Taken together, these results highlight that super-enhancers are significantly more active and located in open regions than stretch enhancers, which are more likely to be poised.

Super-enhancers are enriched with cohesin and CTCF binding

It is known that enhancers are brought close to their target genes through chromatin looping mechanisms. These long-range enhancer–promoter interactions and DNA looping are mediated by the cohesin complex and CTCF [Citation35,Citation36]. We compared the occupancy of two cohesin components (SMC3 and RAD21) and CTCF at super-enhancers and stretch enhancers in GM12878 and K562 cells. We observed significantly higher SMC3, RAD21, and CTCF ChIP-seq binding signal at super-enhancers than at stretch enhancers (). This suggests that super-enhancers are regions with frequent interactions mediated by the cohesin complex and CTCF when compared to stretch enhancers.

Super-enhancers are transcriptionally more active than stretch enhancers

RNA Pol II plays a critical role in transcription and a majority of active enhancers recruit RNA Pol II [Citation9]. We observed significantly higher RNA Pol II binding at super-enhancers than at stretch enhancers (; Supplementary Figure S4). This was expected, since we previously highlighted a significantly higher occupancy of active chromatin marks H3K27ac and H3K4me3 at super-enhancers. To further assess the effect of enhancer activity on gene expression regulation, we associated genes with super-enhancers and stretch enhancers based on proximity and calculated their transcriptional abundance. For all the 10 cell types, we found that the genes near super-enhancers were significantly more expressed than genes near stretch-enhancers (P value <2.2e-16, Wilcoxon rank sum test) ().

Figure 3. Chromatin organization at super-enhancers and stretch enhancers.

(a-b) Spatial distribution of two Cohesin components, RAD21 and SMC3 (a) and CTCF (b) at super- and stretch enhancers from K562 and GM12878 cells.

Figure 4. Transcriptional activity at super-enhancers and stretch enhancers.

(a) Genome-wide profile of RNA Pol II at super- and stretch enhancers in H1-ESC, GM12878, K562, and HUVEC cell-lines. (b) Transcriptional abundance in reads per kilobase of transcript per million mapped reads (RPKM) of genes near (within a 50 kb window) super- and stretch enhancers across 10 cell types (P value <2.2e-16, Wilcoxon rank sum test). (c) GRO-seq profiles at the constituents of super- and stretch enhancers in K562 and GM12878 cell-lines. (d) GRO-cap profiles at the constituents of super- and stretch enhancers in K562 and GM12878 cell-lines.

Recent studies have shown that most of the active enhancers are bi-directionally transcribed to produce RNA transcripts, referred to as eRNAs [Citation9]. These transcribed enhancers exhibit higher in vitro activity, suggesting that production of eRNA is linked to functional activity [Citation37]. Recent techniques based on global run-on sequencing (GRO-seq and GRO-cap) have been developed for the detection of these unstable RNAs generated from enhancer elements [Citation38,Citation39]. We used publicly available data from GRO-seq and GRO-cap assays in K562 and GM12878 to investigate the levels of eRNAs at super- enhancers and stretch enhancers. Super-enhancers and their constituents (i.e., individual enhancers composing the clusters) harbored a significantly higher signal for both GRO-seq and GRO-cap () Supplementary Figure S8) than stretch enhancers. Taken together, these results confirm that super-enhancers are more active and transcribed and can greatly enhance the transcription of their nearby genes when compared with stretch enhancers.

A small subset of stretch enhancers overlaps with super-enhancers

Next, we investigated to what extent super-enhancers and stretch enhancers are conserved and overlapped together among cell types. We computed pairwise Jaccard statistics among cell types for super-enhancer and stretch enhancer regions, independently. We observed higher Jaccard statistics for stretch enhancers than for super-enhancers, meaning that stretch enhancers significantly shared across cell types than super-enhancers (P value = 4.311e-10, Wilcoxon signed rank test) (Supplementary Figure S9). Next, we computed the fraction of overlap between super-enhancers and stretch enhancers in each cell type (). We observed that most of the super-enhancers overlap with a small fraction of the stretch enhancers in all cell types. On average, a vast majority of super-enhancers (85%) overlap with only a small number of stretch enhancers (13%) (). These stretch enhancers that overlap with super-enhancer regions were termed super-stretch enhancers. We then compared these super-stretch enhancers with super-enhancers and stretch-only enhancers, which do not overlap super-enhancers (referred to as stretch from now on in this work) (Supplementary Figure S10).

Figure 5. Overlap analysis of super-enhancers and stretch enhancers.

(a) Fraction of overlap between super- and stretch enhancers across the ten cell types. (b) Pie chart of average overlap of super- and stretch enhancers. (c) The region length in base pairs (bp) of super-stretch and stretch enhancers across 10 cell types (p-value < 2.2e-16, Wilcoxon rank sum test).

These super-stretch enhancers are significantly smaller in size than the remaining stretch-only enhancers P value <2.2e-16, Wilcoxon rank sum test) (). In line with the fact that the vast majority of super-enhancers are super-stretch enhancers, we recapitulate that these regions were (i) enriched for active chromatin marks, such as H3K27ac and H3K4me3 (Supplementary Figure S11); (ii) enriched for cohesin and CTCF binding (Supplementary Figures S12 and S13(a)); (iii) near highly expressed genes (P value < 2.2e-16, Wilcoxon rank sum test, Supplementary Figure S13(a)); and (iv) significantly transcribed when compared to stretch enhancers (Supplementary Figure S13(b)).

To further test the ability of those super-enhancer that do not overlap with stretch, we grouped the regions into super-only, super-stretch, and stretch-only regions based on the overlap analysis (Supplementary Figure S14(a)). We investigated the average ChIP-seq profile of H3K27ac, H3K4me1, H3K4me3, and RNA Pol II at these three categories in three cell lines (H1, GM12878, and 562) from ENCODE tier-1 (Supplementary Figure S14(b)). We observed that the super-only group has significantly higher signal for active chromatin marks, such as H3K27ac, H3K4me3, and RNA Pol II, and lower signal for H3K4me1, compared to stretch-only and super-stretch groups.

Taken together, these results show that super-enhancers are more active in general and a vast majority of super-enhancers also contain a small fraction of stretch enhancers. Further, the small fraction of stretch enhancers that overlap with super-enhancers (called super-stretch enhancers) are highly enriched for active chromatin marks, highly transcribed, and can greatly enhance the transcription of their associated genes.

Super-stretch enhancers are cell type-specific and control key cell identity genes

We sought to analyze how super-stretch enhancers were associated with cell type-specific genes. We performed k-means clustering based on the H3K27ac histone modification at super, super-stretch, and stretch enhancers in five cell types (GM12878, K562, H1-ES, HepG2, and HUVEC). We observed cell type-specific clusters for all the three groups, but significantly stronger cell type-specific signal at super and super-stretch enhancers than at stretch enhancers (). It was expected to observe higher level of H3K27ac signal at super-enhancers, as these were defined solely based on H3K27ac signal while stretch enhancers were defined using several other chromatin marks, including H3K27ac. To test if the signal was coming from the way super-enhancers were identified, we redefined stretch enhancers using same candidate enhancers defined using H3K27ac peaks in H1-ES, GM12878, and K562 cell types. We then divided these stretch enhancers into super-stretch and stretch enhancers, as described above. Interestingly, we found similar patterns for H3K27ac (Supplementary Figure S15(a)) and GRO-seq signal (Supplementary Figure S15(b)). In addition, we found cell type-specific GO terms for super and super-stretch enhancers but not stretch enhancers (Supplementary Figure S16). This reinforces our observations that super-enhancers and stretch enhancers are intrinsically different.

When considering the closest genes to super-enhancers or super-stretch enhancers defined in H1-ES cells, we found that they were highly expressed in H1-ES cells but not in other cell types (). This observation holds true for the 10 considered cell types (Supplementary Figure S17). On the contrary, we did not observe such cell type-specific behavior for genes close to stretch enhancers (Supplementary Figure S17). To confirm this cell type-specific behavior associated to super-enhancers and stretch enhancers, we further assessed the biological function of these regulatory regions using the tool GREAT to perform gene ontology enrichment analysis from the genes close to super, super-stretch, and stretch enhancers. In all cell types analyzed, super-enhancers and super-stretch enhancers were found close to genes enriched for corresponding cell type-specific functions. For example, in ES cells, terms like stem cell development, stem cell differentiation, and stem cell activation were enriched for genes associated with super and super-stretch enhancer (; Supplementary Figure S18).

Figure 6. Cell type-specificity analysis of super, super-stretch and stretch enhancers.

(a) K-means clustering on the histone modification H3K27ac profile at super, super-stretch and stretch enhancers in five ENCODE cell types (GM12878, K562, H1-ES, HepG2, and HUVEC). (b) Transcriptional abundance in units of RPKM of genes associated with H1-ESC and how these genes are expressed in the other nine cell types tested as shown along the axis. (c) GO analysis of super-, super-stretch and stretch enhancers in H1 ES cell type. (d) Venn diagram shows the overlap of genes associated with super, super-stretch and stretch enhancers, and label the known key cell-identity genes in H1 ES, K562, and Islets cells.

Next, we performed the overlap of genes associated with super-, super-stretch and stretch enhancers and checked for the known key cell identity genes. We found a majority of key cell identity genes associated with either super-enhancers or super-stretch enhancers. For example, the ESC pluripotency genes SOX2, OCT4 (POU5F1), and NANOG binding were found in super-enhancers and super-stretch enhancers but not stretch enhancers (). In K562 cells, proteins such as GATA1, TAL1, and JUN bound at super-enhancers and super-stretch enhancers, while GATA2 and HBG1 bound at super-enhancers. Similarly, in Islets cells, PDX1 and NKX2-2 bound at super-enhancers and super-stretch enhancers and FOXA2 bound at super-stretch enhancers. Taken together, these results suggest that a small subset of stretch enhancers, the one overlapping with super-enhancers (super-stretch), are preferentially associated with genes that have a key role in the cell type-specific biology.

Discussion

In this study, we have performed a comprehensive analysis of super-enhancers and stretch enhancers by comparing their histone modification profiles, chromatin accessibilities, and abilities to regulate cell type-specific gene expression. At the genome scale, stretch enhancers are more abundant, cover twice the genome, and are further away from TSS than super-enhancers; super-enhancers are evolutionary more conserved. Moreover, super-enhancers are found to overlap active chromatin marks, such as H3K27ac and H3K4me3, while stretch enhances are enriched for the poised mark H3K27me3. Super-enhancers are found to be significantly more occupied by the cohesin complex, CTCF, and RNA Pol II and produce eRNAs. Further, a majority of the super-enhancers overlaps with a small fraction of stretch enhancers; the overlapping regions are more cell type-specific and found near genes involved in cell maintenance, differentiation, and development.

We observed that super-enhancers are loaded with active chromatin marks, such as H3K27ac and RNA Pol II and produce eRNAs, while stretch enhances are enriched for H3K27me3, depleted for H3K27ac, and do not produce eRNA. These properties show that super-enhancers are transcriptionally active, while stretch enhancers are poised.

As super-enhancers were originally defined using H3K27ac signal, it was expected that they were containing higher level of H3K27ac signal when compared to stretch enhancers, which were defined based on multiple chromatin marks. Redefining stretch enhancers using solely the H3K27ac mark did not change our observations of the difference between super-enhancer and stretch enhancer. This reinforces the fact that super-enhancers are overall more active than stretch enhancers, which are likely poised enhancers. Further studies will be required to find the specific function of the genomic regions defined as stretch enhancers and that do not overlap super-enhancers.

Conventionally, enhancers and promoters are regarded as distinct cis-regulatory elements and distinguished by enrichment for histone modifications (H3K4me1/2 and H3K27ac enriched in enhancers; H3K4me3 enriched in promoters); however, recent studies have shown that some of the highly transcribed enhancers can function as promoters in vivo [Citation40–Citation42]. The exceptionally high enrichment for H3K27ac and RNA Pol II combined with H3K4me3 (a mark usually associated with promoters) at super-enhancer regions suggests that super-enhancers may also work as promoters to regulate their target gene(s) or these clusters may encompass promoter sequences. This is in line with the enrichment of GRO-seq and GRO-cap signal at these super-enhancer regions, which confirms that these regions initiate transcription.

Stretch enhancers were defined as large (>3 Kb) linear genomic regions with specific chromatin marks [Citation21] while super-enhancers were defined as clusters of enhancers enriched for active chromatin marks. We observed that a small fraction of stretch enhancers that overlap super-enhancers was more active, cell type-specific, and significantly smaller than the rest of these regions. It clearly highlights that length is not an optimal feature to determine cell type-specific regulatory regions. Our analyzes and several other lines of evidence suggest that enhancers are more likely to be cell type-specific, transcriptionally active, and frequently interacting when found in clusters at the genomic scale, regardless of their sizes [Citation32,Citation43]. Whether the individual enhancers within a cluster (super-enhancer) work synergistically or additively is still in active debate [Citation44]. Some studies have shown synergy and hierarchy between the individual enhancers within the clusters [Citation25,Citation45,Citation46], but this synergy is less obvious for some developmentally regulated super-enhancers [Citation47]. Further, locus control regions [Citation23] have been known for decades and whether the newly introduced super- and stretch enhancers represent a new paradigm in transcriptional regulation is still debated [Citation28]. It has been suggested that genes that are regulated by multiple cis-regulatory elements may achieve higher levels of RNA Pol II recruitment due to higher local concentration of required factors than genes regulated by fewer regulatory elements [Citation48,Citation49]. A very recent simulation study, based on the properties of super-enhancers, proposed a conceptual phase separation model for transcriptional gene regulation [Citation50]. If that is the case, then the current approaches to identify clusters of enhancers are not optimal. The concepts of super-enhancers and stretch enhancers try to capture the same underlying biological phenomenon; however, the methods used to identify regions of the genome of particular importance for cell identity are different and imperfect. Further, the current methods do not integrate chromatin interaction data to restrict these clusters to be within the so-called topologically associated domains, where most of the enhancer-promoter interactions and regulation takes place. Hence, we still need to develop new methods to capture cell type-specific enhancers that might not be derived from clusters of enhancers at the genomic scale but rather in the 3D space of the nucleus.

Material and methods

Data description

We used the processed ChIP-seq data for histone modifications and DNase-seq data generated by the Encyclopedia of regulatory DNA elements (ENCODE) to perform all the analysis in this report (hgdownload-test.cse.ucsc.edu/goldenPath/hg19/encodeDCC/) (Table S1). We obtained the processed RNA-seq based gene expression data as RPKM for the ten cell types from [Citation21]. We also used GRO-seq and GRO-cap data in GM12878 and K562, which is listed in supplementary Table S2.

GRO-seq and GRO-cap data analysis

The GRO-seq, and GRO-cap reads were aligned to human genome-build hg19 using bowtie2 [Citation51] with default parameters. The aligned and sorted bam files were used to compute the average signal profile at super-enhancer and stretch enhancers.

Super-enhancers

Super-enhancers were downloaded as BED files for the ten human cell types GM12878, H1-ES, K562, HepG2, HUVEC, HMEC, HSMM, NHEK, NHLF, and Islets from dbSUPER (http://asntech.org/dbsuper/) [Citation29].

Stretch enhancers

We obtained the stretch enhancer annotations for the 10 human cell types (including GM12878, H1-ES, K562, HepG2, HUVEC, HMEC, HSMM, NHEK, NHLF and Islets) from [Citation21]. We also redefined stretch enhancers using only H3K27ac ChIP-seq data in cell lines H1-ES, GM12878, and K562. We used the stitched enhancer regions (typical enhancers and super-enhancers) and ranked them based on length, separating the ones that are larger than 3 kb.

Assigning genes to super-enhancers and stretch enhancers

Genes were assigned to super-enhancers and stretch enhancers based on proximity as descried in [Citation16,Citation17]. It is known that enhancers tend to loop and communicate with target genes [Citation52], and most of these enhancer-promoter interactions occur within a distance of ~50 kb [Citation53]. This approach identified a large proportion of true enhancer/promoter interactions in ESC [Citation54]. Hence, all transcriptionally active genes were assigned to super-enhancers and stretch enhancers within a 50 kb window.

Gene ontology analysis

To perform Gene Ontology analysis, we used the Genomic regions enrichment of annotations tool (GREAT) (http://bejerano.stanford.edu/great/, version 3.0.0) [Citation55] with default parameters. The top 10 GO terms with the lowest P value were reported.

Overlap analysis

We used BEDTools [Citation56] to perform intersection of genomic regions. We considered two regions to overlap if they shared at least 1 bp. To perform pairwise overlap analysis we used the Intervene tool [Citation57].

Visualization and statistical analysis

We generated box plots using the R programming language by extending the whiskers to 1.5x the interquartile range. The P values for box-plots were calculated using Wilcoxon signed-rank tests with the wilcox.test function in R. We used ngs.plot [Citation58] to generate heat maps and normalized binding profiles at the constituents of super-enhancers and stretch enhancers along with their flanking 3 kb regions. We used Intervene (https://asntech.shinyapps.io/intervene/) [Citation57] to generate pairwise heatmaps. For genome-browser screenshots, we used the Biodalliance genome browser with ENCODE data [Citation59].

Authors’ contributions

AK conceived the project and designed the experiments. XZ reviewed and approved experiment design and supervised the project. AM provided suggestions, supported to redefine stretch enhancers, and overviewed the finalization of the project. AK performed the experiments and analyzed the data. AK wrote the manuscript and XZ and AM reviewed it. All authors read and approved the final manuscript.

Supplemental material

Supplemental Material

Download PDF (5.3 MB)

Acknowledgments

We thank the ENCODE consortium and all the authors of the original studies used in this work for making their data freely available.

Disclosure statement

No potential conflict of interest was reported by the authors.

Supplementary material

Supplemental data for this article can be accessed here.

Additional information

Funding

AK and AM have been supported by the Norwegian Research Council (project number 187615), Helse Sør-Øst, and the University of Oslo through the Centre for Molecular Medicine Norway (NCMM). XZ is supported by the NSFC grants 61721003 and 61673231. Norwegian Research Council; NSFC [61721003 and 61673231]; Helse Sør-Øst

Related Research Data

Super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers

Source: Figshare

Super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers

Source: Figshare

Linking provided by

References

Heinz S, Romanoski CE, Benner C, et al. The selection and function of cell type-specific enhancers. Nat Rev Mol Cell Biol. 2015;16:144–154.
PubMed Web of Science ®Google Scholar
Barolo S. Shadow enhancers: frequently asked questions about distributed cis-regulatory information and enhancer redundancy. BioEssays. 2012;34:135–141.
PubMed Web of Science ®Google Scholar
Thurman RE, Rynes E, Humbert R, et al. The accessible chromatin landscape of the human genome. Nature. 2012;489:75–82.
PubMed Web of Science ®Google Scholar
Heintzman ND, Hon GC, Hawkins RD, et al. Histone modifications at human enhancers reflect global cell type-specific gene expression. Nature. 2009;459:108–112.
PubMed Web of Science ®Google Scholar
Shlyueva D, Stampfel G, Stark A. Transcriptional enhancers: from properties to genome-wide predictions. Nat Rev Genet. 2014;15:272–286.
PubMed Web of Science ®Google Scholar
Rada-Iglesias A, Bajpai R, Swigut T, et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature. 2011;470:279–283.
PubMed Web of Science ®Google Scholar
Arnold CD, Gerlach D, Stelzer C, et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science. 2013;339:1074–1077.
PubMed Web of Science ®Google Scholar
Bonn S, Zinzen RP, Girardot C, et al. Tissue-specific analysis of chromatin state identifies temporal signatures of enhancer activity during embryonic development. Nat Genet. 2012;44:148–156.
PubMed Web of Science ®Google Scholar
Kim T-K, Hemberg M, Gray JM, et al. Widespread transcription at neuronal activity-regulated enhancers. Nature. 2010;465:182–187.
PubMed Web of Science ®Google Scholar
Visel A, Mj B, Li Z, et al. ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature. 2009;457:854–858.
PubMed Web of Science ®Google Scholar
Calo E, Wysocka J. Modification of enhancer chromatin: what, how, and why? Mol Cell. 2013;49:825–837.
PubMed Web of Science ®Google Scholar
Chen K, Chen Z, Wu D, et al. Broad H3K4me3 is associated with increased transcription elongation and enhancer activity at tumor-suppressor genes. Nat Genet. 2015;47:1149–1157.
PubMed Web of Science ®Google Scholar
Benayoun BA, Pollina EA, Ucar D, et al. H3K4me3 breadth is linked to cell identity and transcriptional consistency. Cell. 2014;158:673–688.
PubMed Web of Science ®Google Scholar
Dincer A, Gavin DP, Xu K, et al. Deciphering H3K4me3 broad domains associated with gene-regulatory networks and conserved epigenomic landscapes in the human brain. Transl Psychiatry. 2015;5:e679.
PubMed Web of Science ®Google Scholar
Creyghton MP, Cheng AW, Welstead GG, et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc Natl Acad Sci U S A. 2010;107:21931–21936.
PubMed Web of Science ®Google Scholar
Whyte WA, Orlando DA, Hnisz D, et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013;153:307–319.
PubMed Web of Science ®Google Scholar
Hnisz D, Abraham BJ, Lee TI, et al. Super-enhancers in the control of cell identity and disease. Cell. 2013;155:934–947.
PubMed Web of Science ®Google Scholar
Lovén J, Hoke HA, Cy L, et al. Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell. 2013;153:320–334.
PubMed Web of Science ®Google Scholar
Suzuki HI, Young RA, Sharp PA. Super-enhancer-mediated RNA processing revealed by integrative microRNA network analysis. Cell. 2017;168:1000–1014.e15.
PubMed Web of Science ®Google Scholar
Heyn H, Vidal E, Ferreira HJ, et al. Epigenomic analysis detects aberrant super-enhancer DNA methylation in human cancer. Genome Biol. 2016;17:11.
PubMed Web of Science ®Google Scholar
Parker SCJ, Stitzel ML, Taylor DL, et al. Chromatin stretch enhancer states drive cell-specific gene regulation and harbor human disease risk variants. Proc Natl Acad Sci U S A. 2013;110:17921–17926.
PubMed Web of Science ®Google Scholar
Quang DX, Erdos MR, Parker SCJ, et al. Motif signatures in stretch enhancers are enriched for disease-associated genetic variants. Epigenetics Chromatin. 2015;8:23.
PubMed Web of Science ®Google Scholar
Li Q, Peterson KR, Fang X, et al. Locus control regions. Blood. 2002;100:3077–3086.
PubMed Web of Science ®Google Scholar
Vahedi G, Kanno Y, Furumoto Y, et al. Super-enhancers delineate disease-associated regulatory nodes in T cells. Nature. 2015;520:558–562.
PubMed Web of Science ®Google Scholar
Hnisz D, Schuijers J, Lin CY, et al. Convergence of developmental and oncogenic signaling pathways at transcriptional super-enhancers. Mol Cell. 2015;58:362–370.
PubMed Web of Science ®Google Scholar
Flynn RA, Do BT, Rubin AJ, et al. 7SK-BAF axis controls pervasive transcription at enhancers. Nat Struct Mol Biol. 2016;23:1–11.
PubMed Web of Science ®Google Scholar
Niederriter A, Varshney A, Parker S, et al. Super enhancers in cancers, complex disease, and developmental disorders. Genes. 2015;6:1183–1200.
PubMed Web of Science ®Google Scholar
Pott S, Lieb JD. What are super-enhancers? Nat Genet. 2014;47:8–12.
Web of Science ®Google Scholar
Khan A, Zhang X. dbSUPER: a database of super-enhancers in mouse and human genome. Nucleic Acids Res. 2016;44:D164–D171.
PubMed Web of Science ®Google Scholar
Rosenbloom KR, Armstrong J, Barber GP, et al. The UCSC Genome Browser database: 2015 update. Nucleic Acids Res. 2014;43:D670–D681.
PubMed Web of Science ®Google Scholar
Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74.
PubMed Web of Science ®Google Scholar
Pasquali L, Gaulton KJ, Rodríguez-Seguí SA, et al. Pancreatic islet enhancer clusters enriched in type 2 diabetes risk-associated variants. Nat Genet. 2014;46:136–143.
PubMed Web of Science ®Google Scholar
Morris AP, Voight BF, Teslovich TM, et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat Genet. 2012;44:981–990.
PubMed Web of Science ®Google Scholar
Scott RA, Lagou V, Welch RP, et al. Large-scale association analyses identify new loci influencing glycemic traits and provide insight into the underlying biological pathways. Nat Genet. 2012;44:991–1005.
PubMed Web of Science ®Google Scholar
Rao SSP, Huang S-C, Hilaire BGS, et al. Cohesin loss eliminates all loop domains. Cell. 2017;171:305–320.e24.
PubMed Web of Science ®Google Scholar
Zuin J, Dixon JR, Mija VDR, et al. Cohesin and CTCF differentially affect chromatin architecture and gene expression in human cells. Proc Natl Acad Sci USA. 2014;111:996–1001.
PubMed Web of Science ®Google Scholar
Andersson R, Gebhard C, Miguel-Escalada I, et al. An atlas of active enhancers across human cell types and tissues. Nature. 2014;507:455–461.
PubMed Web of Science ®Google Scholar
Core LJ, Waterfall JJ, Lis JT. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science. 2008;322:1845–1848.
PubMed Web of Science ®Google Scholar
Kruesi WS, Core LJ, Waters CT, et al. Condensin controls recruitment of RNA polymerase II to achieve nematode X-chromosome dosage compensation. eLife. 2013;2:e00808.
PubMed Web of Science ®Google Scholar
Mikhaylichenko O, Bondarenko V, Harnett D, et al. The degree of enhancer or promoter activity is reflected by the levels and directionality of eRNA transcription. Genes Dev. 2018;32:42–57.
PubMed Web of Science ®Google Scholar
Henriques T, Scruggs BS, Inouye MO, et al. Widespread transcriptional pausing and elongation control at enhancers. Genes Dev. 2018;32:26–41.
PubMed Web of Science ®Google Scholar
Rennie S, Dalby M, Lloret-Llinares M, et al. Transcription start site analysis reveals widespread divergent transcription in D. melanogaster and core promoter-encoded enhancer activities. Nucleic Acids Res. 2018;46:5455–5469.
PubMed Web of Science ®Google Scholar
Ernst J, Kheradpour P, Mikkelsen TS, et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011;473:43–49.
PubMed Web of Science ®Google Scholar
Dukler N, Gulko B, Huang Y, et al. Is a super-enhancer greater than the sum of its parts? Nat Genet. 2017;49:2–7.
Web of Science ®Google Scholar
Proudhon C, Snetkova V, Raviram R, et al. Active and inactive enhancers cooperate to exert localized and long-range control of gene regulation. Cell Rep. 2016;15:2159–2169.
PubMed Web of Science ®Google Scholar
Shin HY, Willi M, Yoo KH, et al. Hierarchy within the mammary STAT5-driven wap super-enhancer. Nat Genet. 2016;48:904–911.
PubMed Web of Science ®Google Scholar
Hay D, Hughes JR, Babbs C, et al. Genetic dissection of the α-globin super-enhancer in vivo. Nat Genet. 2016;48:1–12.
Web of Science ®Google Scholar
Andersson R. Promoter or enhancer, what’s the difference? Deconstruction of established distinctions and presentation of a unifying model. BioEssays. 2015;37:314–323.
PubMed Web of Science ®Google Scholar
Andersson R, Sandelin A, Danko CG. A unified architecture of transcriptional regulatory elements. Trends Genet. 2015;31:426–433.
PubMed Web of Science ®Google Scholar
Hnisz D, Shrinivas K, Ra Y, et al. A phase separation model for transcriptional control. Cell. 2017;169:13–23.
PubMed Web of Science ®Google Scholar
Langmead B, Trapnell C, Pop M, et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25.
PubMed Web of Science ®Google Scholar
Ong C-T, Corces VG. Enhancer function: new insights into the regulation of tissue-specific gene expression. Nat Rev Genet. 2011;12:283–293.
PubMed Web of Science ®Google Scholar
Chepelev I, Wei G, Wangsa D, et al. Characterization of genome-wide enhancer-promoter interactions reveals co-expression of interacting genes and modes of higher order chromatin organization. Cell Res. 2012;22:490–503.
PubMed Web of Science ®Google Scholar
Dixon JR, Selvaraj S, Yue F, et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485:376–380.
PubMed Web of Science ®Google Scholar
McLean CY, Bristor D, Hiller M, et al. GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol. 2010;28:495–501.
PubMed Web of Science ®Google Scholar
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842.
PubMed Web of Science ®Google Scholar
Khan A, Mathelier A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics. 2017;18:287.
PubMed Web of Science ®Google Scholar
Shen L, Shao N, Liu X, et al. ngs.plot: quick mining and visualization of next-generation sequencing data by integrating genomic databases. BMC genomics. BMC Genomics. 2014;15:284.
PubMed Web of Science ®Google Scholar
Down TA, Piipari M, Hubbard TJP. Dalliance: interactive genome viewing on the web. Bioinformatics. 2011;27:889–890.
PubMed Web of Science ®Google Scholar

Reprints and Corporate Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

To request a reprint or corporate permissions for this article, please click on the relevant link below:

Order Reprints Request Corporate Permissions

Academic Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

Obtain permissions instantly via Rightslink by clicking on the button below:

Request Academic Permissions

If you are unable to obtain permissions via Rightslink, please complete and submit this Permissions form. For more information, please visit our Permissions help page.

Download PDF

Related research

People also read lists articles that other readers of this article have read.

Recommended articles lists articles that we recommend and is powered by our AI driven recommendation engine.

Cited by lists all citing articles based on Crossref citations.
Articles with the Crossref icon will open in a new tab.

People also read
Recommended articles
Cited by

To cite this article:

Reference style: APA Chicago Harvard

Citation copied to clipboard

Reference styles above use APA (6th edition), Chicago (16th edition) & Harvard (10th edition)

Download citation

Download a citation file in RIS format that can be imported by citation management software including EndNote, ProCite, RefWorks and Reference Manager.

Choose format: RIS BibTex RefWorks Direct Export

Choose options: Citation Citation & abstract Citation & references

Your download is now in progress and you may close this window

Did you know that with a free Taylor & Francis Online account you can gain access to the following benefits?

Choose new content alerts to be informed about new research of interest to you
Easy remote access to your institution's subscriptions on any device, from any location
Save your searches and schedule alerts to send you new results
Export your search results into a .csv file to support your research

Have an account?
Login now Don't have an account?
Register for free

Login or register to access this feature

Have an account?
Login now Don't have an account?
Register for free

Choose new content alerts to be informed about new research of interest to you
Easy remote access to your institution's subscriptions on any device, from any location
Save your searches and schedule alerts to send you new results
Export your search results into a .csv file to support your research

Super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers

ABSTRACT

Background

Results

Genomic distribution and conservation of super-enhancers and stretch enhancers from ten human cell types

Super-enhancers are enriched for active chromatin marks

Super-enhancers are enriched with cohesin and CTCF binding

Super-enhancers are transcriptionally more active than stretch enhancers

A small subset of stretch enhancers overlaps with super-enhancers

Super-stretch enhancers are cell type-specific and control key cell identity genes

Discussion

Material and methods

Data description

GRO-seq and GRO-cap data analysis

Super-enhancers

Stretch enhancers

Assigning genes to super-enhancers and stretch enhancers

Gene ontology analysis

Overlap analysis

Visualization and statistical analysis

Authors’ contributions

Supplemental Material

Acknowledgments

Disclosure statement

Supplementary material

Related Research Data

References

Information for

Open access

Opportunities

Help and information

Super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers

ABSTRACT

Background

Results

Genomic distribution and conservation of super-enhancers and stretch enhancers from ten human cell types

Super-enhancers are enriched for active chromatin marks

Super-enhancers are enriched with cohesin and CTCF binding

Super-enhancers are transcriptionally more active than stretch enhancers

A small subset of stretch enhancers overlaps with super-enhancers

Super-stretch enhancers are cell type-specific and control key cell identity genes

Discussion

Material and methods

Data description

GRO-seq and GRO-cap data analysis

Super-enhancers

Stretch enhancers

Assigning genes to super-enhancers and stretch enhancers

Gene ontology analysis

Overlap analysis

Visualization and statistical analysis

Authors’ contributions

Supplemental Material

Acknowledgments

Disclosure statement

Supplementary material

Additional information

Funding

Related Research Data

References

Reprints and Corporate Permissions

Academic Permissions

Related research

To cite this article:

Download citation

Your download is now in progress and you may close this window

Login or register to access this feature

Information for

Open access

Opportunities

Help and information

Keep up to date