Technical Paper

NET-prism enables RNA polymerase-dedicated transcriptional interrogation at nucleotide resolution

Pages 1156-1165 | Received 09 Apr 2019, Accepted 14 May 2019, Published online: 03 Jun 2019

ABSTRACT

The advent of quantitative approaches that enable interrogation of transcription at single-nucleotide resolution has provided a new understanding of transcriptional regulation. However, little is known, at such high resolution, about how transcription factors directly influence RNA Pol II pausing and directionality. To map the impact of transcription/elongation factors on transcription dynamics genome-wide at base-pair resolution, we developed an adapted NET-seq protocol called NET-prism (Native Elongating Transcription by Polymerase-Regulated Immunoprecipitants in the Mammalian genome). Application of NET-prism to elongation factors (Spt6, Ssrp1), splicing factors (Sf1), and components of the pre-initiation complex (PIC) (TFIID and Mediator) reveals their influence on transcription dynamics with regard to directionality and pausing over promoters, splice sites, and enhancers/super-enhancers. NET-prism will be broadly applicable as it exposes transcription factor/Pol II-dependent topographic specificity and thus a new degree of regulatory complexity during gene expression.

Introduction

Transcription is a highly dynamic process that comprises three different stages. Initiation involves RNA Polymerase II recruitment to the promoter followed by release of RNA Pol II towards progressive elongation. Transcriptional termination is promoted when RNA transcripts are processed and RNA Pol II is released from the chromatin template [Citation1,Citation2]. This dynamic shift from one stage to another is facilitated by a compendium of regulatory processes involving phosphorylation of the Pol II C-terminal domain (CTD) and recruitment of factors that facilitate and regulate RNA Pol II activity [Citation3–Citation5].

Approaches that precisely map the position of RNA Pol II at high resolution have provided deeper insight into transcriptional regulatory mechanisms [Citation6–Citation11]. For example, the human NET-seq protocol quantitatively purifies Pol II in the presence of a strong Pol II inhibitor, hence omitting the use of an antibody [Citation9]. Although this particular approach successfully maps the 3ʹ end of nascent RNA to reveal the strand-specific position of Pol II with single-nucleotide resolution, it does not distinguish between different Pol II variants or specific protein-dependent interactions. A similar protocol, the mammalian NET-seq protocol (mNET-seq), uses an immunoprecipitation step to capture the nascent RNA produced by different C-terminal domain (CTD) phosphorylated forms of Pol II [Citation12]. Immunoprecipitation has a potential second benefit, as it would in principle allow the quantitative, genome-wide interrogation of transcription factor – RNA Pol II interactions at nucleotide resolution and with strand specificity, none of which is possible using conventional ChIP-seq. In S. cerevisiae, such an approach – coined TEF-seq – was developed to interrogate Paf1 – RNA Pol II interaction [Citation7] and allowed new insight into Paf1 requirements for gene expression.

Here, we sought to develop a mammalian counterpart to TEF-seq, which includes an immunoprecipitation step of RNA Pol II-associated factors while being efficient enough to capture sufficient amounts of nascent RNA for processing as part of NET-seq type libraries. A detailed protocol is outlined in Figure 1(a) and also in a step-by-step form as part of the Supplementary Information (‘NET-prism protocol’).

Figure 1. NET-prism as a tool to interrogate active RNA Pol II – interaction with associated proteins. (a) Schematic representation of the approach. For detailed experimental conditions, please refer to the Supplementary Information (‘NET-prism protocol’). Upon extraction of nascent RNA, libraries are made using the human NET-seq protocol [Citation9]. (b) Volcano plot of Pol II IP vs Mock (IgG) IP depicting a whole RNA Pol II – protein interactome as assessed by Mass spectrometry. Significant values (FDR < 0.05) are coloured in black. (c) Independent confirmation of identified RNA Pol II interactors using IP conditions as used in NET-prism followed by western blotting.

Results

Extraction conditions of NET-prism allow for co-purification of RNA Pol II with known associated complex members

Similarly to the original yeast NET-seq and TEF-seq protocols [Citation6,Citation7], we relied on a strong inhibitor of Pol II (α-amanitin) to prevent run-on of the polymerase during all lysis steps and on DNase I to solubilize chromatin. We optimized conditions for DNase I treatment in the absence and presence of urea, which proved to be necessary for efficient solubilization. We found that 100 U DNase I and 50 mM urea were sufficient to release a large fraction of engaged RNA Pol II from chromatin (Supplementary Figure 1). We then asked whether known and new co-factors of RNA Pol II could be immunoprecipitated under these experimental conditions. To identify such factors in a native chromatin state, we examined the total Pol II protein interactome by mass spectrometry using the same extraction conditions as for NET-prism.

We identified both positive (Supt5, Supt6, FACT, Paf1) and negative (NELF) elongation factors, as well as splicing (Srsf5, Srsf6, Sf1) and TFIID (Taf10, Taf15) components, as significantly enriched with Pol II under NET-prism conditions (Figure 1(b) and Supplementary Table 1), equipping researchers with a list to guide any follow-up experimentation. We were also able to confirm some of these interactions using immunoprecipitation followed by western blotting (Figure 1(c)).

NET-prism captures unique transcriptional patterns of RNA Pol II-associated factors

We picked one transcription factor, TFIID (antibody raised against TBP), and two elongation factors, Spt6 and Ssrp1 (a subunit of the FACT heterodimer), to validate the NET-prism approach and interrogate the impact of these factors on RNA Pol II activity. We also performed an IP for Mediator (Med14), serving as a negative control, since it did not display a significant association with Pol II under the conditions used to solubilise chromatin (Figure 1). The data were highly reproducible among biological replicates (Supplementary Figure 2A) and exhibited diverse correlations with total RNA Pol II over promoter regions (Supplementary Figure 2B, NET-seq/prism), indicating that different TFs establish unique patterns of RNA Pol II stalling. Indeed, aligned and averaged NET-prism profiles over the TSS demonstrate additional regulatory complexity during transcriptional initiation and elongation, suggesting that TF binding specificity directly affects RNA Pol II initiation and elongation dynamics. IPs for elongation factors Spt6 and Ssrp1 show strong and broad enrichment of the Pol II complex. These data are in agreement with ChIP-seq densities for both elongation factors [Citation13]. On the other hand, TFIID-bound RNA Pol II displays a sharp signal centred around the TSS, whereas an IP for Mediator (Med14) yields no nascent RNA transcripts as there is minimal interaction between Mediator and RNA Pol II under the conditions used (Figure 2(a)). Similar RNA Pol II patterns were also confirmed at a single gene level (Figure 2(b)). To test more systematically how different transcription factors might influence RNA Pol II initiation and elongation, we sought to determine whether different NET-prism libraries provide improved resolution of RNA Pol II distribution patterns.
We calculated the travelling ratio (TR) in the sense direction, which is defined as the density of RNA Pol II over the promoter (−30 to +250 bp around the TSS) versus the gene body (+300 bp downstream of the TSS to −200 bp upstream of the TES). All NET-prism libraries exhibited different TRs, indicating different pause-release dynamics of Pol II when bound by different TFs (Figure 2(c)). These data suggest that NET-prism is indeed able to resolve mechanistic and dynamic interplays between transcription factors and active RNA Pol II.

Figure 2. NET-prism application on polymerase-bound transcription factors. (a) Metaplot profiles and heatmaps over protein-coding genes (n = 4,314) for polymerase associated elongation (Spt6, Ssrp1) and initiation (TFIID, Med14) factors. A 10-bp smoothing window has been applied. Blue = Sense transcription, Red = Anti-sense transcription. (b) RNA Pol II interrogation of all NET-seq/prism libraries over a single gene (Srsf6). (c) Cumulative distribution of Pol II travelling ratio as assessed by NET-prism.

Interestingly, as all three factors examined here (Spt6, Ssrp1 and TBP, against which the TFIID antibody was raised) also interact with RNA Pol I and III, we asked whether nascent transcripts would also stem from genes transcribed by these two polymerases. Indeed, IPs against all three proteins also pull down nascent transcripts generated by RNA Pol I and Pol III (Supplementary Figure 2C), suggesting that NET-prism might be an ideal tool for the investigation of all three RNA polymerases.

Sequential NET-prism confirms that nascent RNA stems from direct interaction between active RNA Pol II and Ssrp1

Using western blotting, we showed that the conditions of NET-prism allow co-purification of RNA Pol II in the IPs against transcription factors (Figure 1(c)). However, at least one of them, Ssrp1, has been previously reported to bind RNA [Citation14–Citation16]. Therefore, we decided to test more rigorously whether the recovered nascent RNA is specifically associated with RNA Pol II and does not stem from direct binding of Ssrp1 to nascent RNA. To address this question, we performed a sequential IP as part of NET-prism, as outlined in Figure 3(a).

Figure 3. Sequential NET-prism. (a) Schematic illustration of the experiment. Peptide elution was performed using the same CTD peptide used for antibody production and was used in excess. (b) Metaplot profile comparing single and sequential IP. (c) Single gene snapshot comparing single and sequential IP.

Initially, RNA Pol II was immunoprecipitated using an anti-CTD antibody, followed by competitive elution of RNA Pol II with an excess of CTD peptide (the exact peptide used to generate the α-CTD antibody). The eluate subsequently served as input for the second round of IP using an anti-Ssrp1 antibody to capture exclusively Ssrp1-bound Pol II complexes. The isolated nascent RNA was then used for library generation. Importantly, comparing single and sequential IP by metagene profiling (Figure 3(b)) and single gene interrogation (Figure 3(c)) revealed high similarity, strongly suggesting that NET-prism captures only nascent RNA bound by RNA polymerases and not by TFs.

NET-prism reveals high resolution Pol II pausing at intron-exon boundaries

Transcriptional elongation rates can affect splicing outcomes, suggesting that transcription and splicing are tightly coupled [Citation17,Citation18]. Data generated by human NET-seq, mNET-seq, and PRO-seq are consistent with this kinetic model of splicing regulation [Citation9,Citation11,Citation12]. While mNET-seq already implicated different RNA Pol II variants in distinct roles during splicing dynamics [Citation12], it is not known whether transcription (elongation) factors facilitate RNA Pol II pausing at splice sites. We also reasoned that NET-prism might be an ideal tool to dissect splicing factor – RNA Pol II interaction at splice sites. Therefore, we generated an additional NET-prism library for Splicing factor 1 (Sf1) and included it in our splicing dynamics analysis. As splicing intermediates are known NET-seq contaminants due to the presence of 3ʹ-OH groups in these RNAs [Citation9], we removed them to avoid bias. For the splicing dynamics analysis, we assessed total RNA Pol II (NET-seq [Citation19]) and NET-prism data for Ssrp1, Spt6, TFIID and Sf1 over intron-exon boundaries. Total RNA Pol II in mouse ES cells showed increased pausing at exon boundaries, similarly to human cells [Citation9] (Figure 4(a), Total Pol II). Exploration of NET-prism datasets confirmed that only Sf1 exhibited similar pausing at exon boundaries (Figure 4(a) & Supplementary Figure 4A). In addition, components of the PIC did not associate with Pol II pausing over splice sites (Supplementary Figure 4B). It is also interesting to note that NET-prism libraries displayed higher Pol II density over exons than over introns, suggesting that transcriptional elongation is slower at exons in mouse ES cells (Figure 4(b)). In addition, Sf1-Pol II interaction clearly marks exons, indicating specificity of the approach. Taken together, these results augment the kinetic model of transcription and splicing coupling.
Our data, in combination with previously published results, therefore suggest that transcriptional splicing mechanics are facilitated differently by Pol II variants and elongation factors, and NET-prism might represent an ideal tool to address this at high resolution.

Figure 4. Association of different proteins with transcriptional splicing as assessed by NET-prism. (a) Heatmaps and metaplots assessing polymerase pausing for total Pol II and Splicing factor 1 (Sf1) over exon boundaries (n = 5,550). Solid lines indicate the mean values, whereas the shading represents the 95% confidence interval. (b) Boxplots measuring Pol II coverage over exons (n = 41,356) and introns (n = 199,172) for each NET-seq/prism library. First and last exons are removed from the analysis. (c) RNA Pol II interrogation of all NET-seq/prism libraries over a single gene (Actb). Exons are highlighted in purple.

NET-prism reveals diverse transcriptional dynamics at enhancers

Enhancers and super-enhancers have been shown to play a prominent role in the control of gene expression programs essential for cell identity across many mammalian cell types [Citation20–Citation22]. Production of enhancer RNAs (eRNAs) is bidirectional and is governed by distinctive patterns of chromatin accessibility [Citation23], but it is not well characterised whether the same transcriptional rules apply over enhancers as over promoters in terms of initiation and elongation. We therefore extended our analysis to identify high-resolution Pol II stalling at distal and super-enhancers using NET-prism. The highest correlations were identified between Total Pol II and Ssrp1, both for distal and super-enhancers (Figure 5(a)). Total Pol II and TFs exhibited significantly higher ChIP-seq density over super-enhancers than over distal enhancers. Concomitantly, increased transcriptional activity was confirmed over super-enhancers via NET-prism, suggesting that TF density is proportional to the degree of Pol II recruitment (Figure 5(b)). Strikingly, both metaplot profiling (Supplementary Figure 5) and single enhancer (Figure 5(c)) interrogation of NET-prism transcriptional activity exposed distinctive topographic Pol II stalling; Ssrp1 displayed patterns similar to transcriptional initiation whereas Spt6 imitated a trail reminiscent of transcriptional elongation. Moreover, transcriptional activity prompted by TFIID also supports, to some degree, a notion of transcriptional initiation over enhancers (Figure 5(c)).

Figure 5. Distinctive patterns of transcriptional regulation over enhancers and super-enhancers. (a) Pearson’s correlation heatmap among NET-seq/prism libraries over distal enhancers (blue – red) and super-enhancers (grey – gold). (b) Boxplots measuring either transcription factor (ChIP-seq) or Pol II (NET-prism) density over distal enhancers (red) and super-enhancers (gold). Significance was tested via the Wilcoxon rank test (** p < 1.0e−10, *** p < 2.2e−16). (c) Pol II distribution over a distal (chr1: 86,484,171–86,495,700) or super-enhancer as assessed by NET-seq/prism. H3K27Ac density is depicted in black colour. Blue and red depict RNA Pol II pausing in the positive and negative strand, respectively.

Discussion

Here, we have developed a new approach to accurately assess transcriptional topography at a high resolution. In summary, NET-prism allows the direct strand-specific investigation of the transcriptional landscape at single nucleotide resolution of any protein of interest in complex with RNA Pol II. Its robustness enables a deeper insight into the interplay of transcriptional mechanisms conferred by different Pol II variants and proteins that are bound to Pol II. The comprehensive Pol II – protein interactome that we provide here (Supplementary Table 1) facilitates the choice of the protein of interest when applying NET-prism. In addition, given the right RNA polymerase inhibitors and antibodies, NET-prism can be extended to specifically interrogate nascent transcription governed by either RNA Pol I or Pol III.

We hypothesize that NET-prism will be an ideal tool to investigate transcription/elongation factor interactions with actively travelling RNA polymerase at single nucleotide resolution and with strand specificity. An analogous approach has been previously developed in yeast [Citation7], where a variant of the yeast NET-seq protocol [Citation6], called TEF-seq, reveals distinctive patterns of Pol II when bound by diverse elongation factors (Paf1, Spt6, Spt16). Similarly to this approach, NET-prism exposes diverse Pol II signals for every immunoprecipitated TF implying the different dynamics conferred by TF-associated RNA Pol II.

Moreover, our study yields a global picture of how transcriptional elongation is affected at splice sites, and NET-prism might shed light on unresolved questions surrounding splicing catalysis. The idea of transcriptional elongation influencing alternative splicing arises from two distinct models: the recruitment model (differential recruitment of splicing factors) and the kinetic model (Pol II pausing determines the timing in which splice sites are presented) [Citation5,Citation18]. Similarly to other high-resolution approaches [Citation9,Citation11,Citation12], we show that splicing is associated with Pol II exon density and strong pauses at both the 3ʹ and 5ʹ SS, consistent with the kinetic model.

It is important to recognize though that NET-prism – similarly to ChIP-seq – greatly relies on the quality of the antibody used. Antibody cross-reactivity might result in unspecific binding and thus, generation of artefactual RNA Pol II stalling patterns. Therefore, the choice of a highly specific antibody for the protein of interest is important to achieve unique RNA Pol II footprints.

Similarly to the human NET-seq [Citation9], we expect the adaptation of NET-prism to be equally straightforward in any higher eukaryotic cell type. The use of an IP step in NET-prism makes it practical for studying a range of different Pol II – associated factors in order to improve our understanding of transcriptional elongation and its connection to transcript fate. The combination of NET-prism with a high resolution ChIP-seq technique, such as ChIP-nexus [Citation24], could illuminate how exactly in vivo binding of transcription or splicing factors correlates with transcriptional activity over different cell states and conditions. Therefore, NET-prism will become a valuable tool for unravelling transcriptional and regulatory complexity.

Material and methods

Cell culture

The E14 cell line (mESCs) was cultured at 37°C, 7.5% CO2, on 0.1% gelatin-coated plates, in DMEM + GlutaMax™ (Gibco) with 15% fetal bovine serum (Gibco), MEM non-essential amino acids (Gibco), penicillin/streptomycin (Gibco), 550 µM 2-mercaptoethanol (Gibco), and 10 ng/ml of leukaemia inhibitory factor (LIF) (eBioscience).

Antibody-bead coating

50 μl of Dynabeads G were washed twice in 200 μl of IP buffer (50 mM Tris-HCl (pH 7.0), 50 mM NaCl, 1% NP-40) and ~10 μg of antibody were added. Antibodies used in this study: Spt6 (Cell Signalling – D6J9H); Ssrp1 (BioLegend – 10D1); TFIID (Santa Cruz – sc-273); Med14 (Invitrogen – PA5-44864); Total Pol II (anti-CTD; Abcam – ab817); Sf1 (Bethyl – A303-214A).

Nuclear extraction and DNase treatment

A detailed protocol is available in the Supplementary Information (‘NET-prism protocol’). Briefly, 10⁸ ES cells were used for each IP. It is important to split cells into five batches of 2 × 10⁷ each when performing nuclei extraction. All extraction steps are performed on ice to avoid degradation of the nascent RNA. 2 × 10⁷ cells were treated with 200 μl of cytoplasmic lysis buffer (0.15% (vol/vol) NP-40, 10 mM Tris-HCl (pH 7.0), 150 mM NaCl, 25 µM α-amanitin (Epichem), 10 U RNasin Ribonuclease inhibitor (Promega) and 1× protease inhibitor mix (Thermo)) for 5 min on ice. Lysate was layered on 500 μl of sucrose buffer (10 mM Tris-HCl (pH 7.0), 150 mM NaCl, 25% (wt/vol) sucrose, 25 µM α-amanitin, 20 U RNasin Ribonuclease inhibitor and 1× protease inhibitor mix) and spun down for 5 min at 16,000g (4°C). The supernatant was carefully removed and nuclei were resuspended in 100 μl of DNase digestion buffer (1× DNase buffer (NEB), 25 µM α-amanitin, 20 U RNasin Ribonuclease inhibitor and 1× protease inhibitor mix) and further treated with 100 U of DNase I (NEB) for 20 min on ice. It is important for nuclei to be fully resuspended in the DNase digestion buffer. Non-resuspended nuclei are an indication of harsh cytoplasmic lysis conditions; in this case, reduce the volume of cytoplasmic lysis buffer.

Chromatin-solubilised nuclei were spun down at 6,000g (4°C) for 2 min and the supernatant was carefully removed. Nuclei were further treated with 200 μl of nuclei lysis buffer (1% (vol/vol) NP-40, 20 mM HEPES (pH 7.5), 125 mM NaCl, 50 mM urea, 0.2 mM EDTA, 0.625 mM DTT, 25 µM α-amanitin, 20 U RNasin Ribonuclease inhibitor and 1× protease inhibitor mix) for 5 min on ice. Nuclei lysate was spun down at 18,500g (4°C) for 2 min and supernatants from five different batches were combined. Phosphatase inhibitor mix (1×) (Thermo) was included in all the above extraction steps for batches intended for Pol II S2ph and Pol II S5ph immunoprecipitations.

Chromatin Immunoprecipitation (IP) and nascent RNA extraction

A detailed protocol is available in the Supplementary Information (‘NET-prism protocol’). Briefly, combined supernatants from the previous step were incubated in a final 1/10 dilution in IP buffer for 2 hours at 4°C. For the sequential IP, a total Pol II antibody was used for 2 hours, followed by elution twice with 100 μl 2.5 mM CTD peptide (synthesized by Peptide Specialty Laboratories, Heidelberg, Germany; identical to Abcam ab17564) for 30 min. The eluate was further incubated with Ssrp1 antibody-coated beads for an additional 2 hours. Beads were washed 4 times with 1 ml of IP buffer and 700 μl of Qiazol (Qiagen) was directly added to the beads, followed by 140 μl of chloroform. Samples were spun down and the supernatant was ethanol precipitated (0.3 M NaOAc, 2 μl Glycoblue). Concentration and size of nascent RNA were assessed by Nanodrop and TapeStation 2200, respectively. An IP from 10⁸ ES cells usually yields ~200–1000 ng of nascent RNA. Assessment of RNA size is important in order to evaluate the fragmentation time during the library preparation.

NET-prism library preparation

Two biological replicates were processed for each IP and library preparation. NET-prism libraries were prepared similarly to the human NET-seq protocol [Citation25] with a few modifications. The random barcode was ligated overnight at 16°C to maximise ligation efficiency. Alkaline fragmentation of the ligated nascent RNA varies depending on the size of the RNA fragments obtained from each IP. IPs for Pol II S5ph, Pol II S2ph, Ssrp1, and Spt6 yielded large RNA fragments and therefore the ligated nascent RNA was fragmented until all RNA transcripts were within the range of ~35–200 nucleotides. IPs for TFIID and Mediator yielded fragments < 200 nt and therefore no fragmentation was performed. Maximum recovery of ligated RNA and cDNA was achieved from 15% TBE-Urea (Invitrogen) and 10% TBE-Urea (Invitrogen) gels, respectively, by adding RNA recovery buffer (Zymo Research, R1070-1-10) to the excised gel slices and further incubating at 70°C (1500 rpm) for 15 min. Gel slurry was transferred through a Zymo-Spin IV Column (Zymo Research, C1007-50) and further precipitated for subsequent library preparation steps. cDNA containing the 3ʹ end sequences of a subset of mature and heavily sequenced snRNAs, snoRNAs, and rRNAs was specifically depleted using biotinylated DNA oligos (Mylonas et al.). Oligo-depleted circularised cDNA was amplified via PCR (9–12 cycles) and double-stranded DNA was run on an 8% TBE gel. The final NET-seq library running at ~150 bp was extracted and further purified using the ZymoClean Gel DNA recovery kit (Zymo Research). Sample purity and concentration were assessed on a 2200 TapeStation and libraries were sequenced on a HiSeq 2500 Illumina platform (Supplementary Table 2).

NET-prism analysis

All the NET-prism fastq files were processed using custom Python scripts (https://github.com/BradnerLab/netseq) to align reads (mm10 genome) and to remove PCR duplicates and reads arising from RT bias. Reads mapping exactly to the last nucleotide of each intron and exon (splicing intermediates) were further removed from the analysis. The final NET-prism BAM files were converted to bigwig (1 bp bin), separated by strand, and normalized to 1× sequencing depth using Deeptools [Citation26] (v 2.4) with ‘--Offset 1’ in order to record the position of the 5ʹ end of the sequencing read, which corresponds to the 3ʹ end of the nascent RNA. NET-seq/prism tags sharing the same or opposite orientation with the TSS were assigned as ‘sense’ and ‘anti-sense’ tags, respectively. Promoter-proximal regions were carefully selected for analysis to ensure that there is minimal contamination from transcription arising from other transcription units. Genes overlapping within a region of 2.5 kb upstream of the TSS were removed from the analysis. For the NET-seq/prism metaplots, genes underwent several rounds of k-means clustering in order to filter regions; in a 2 kb window around the TSS, rows displaying very high Pol II occupancy within a < 100 bp region were removed from the analysis as they represent non-annotated short non-coding RNAs. For ), genes that displayed an RPKM > 1 for Total Pol II (n = 6,107) were used for metaplot profiling. Average Pol II occupancy profiles were visualised using R (v 3.3.0).
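The tag-level bookkeeping described in this section can be illustrated with a minimal Python sketch. It is not the authors' pipeline (that lives in the linked repository); the function names, the 0-based half-open coordinate convention, and the assumption that each tag's strand already reflects the strand of the nascent RNA are all hypothetical choices made for illustration.

```python
def five_prime_end(start, end, strand):
    """Return the 5' end of an alignment given 0-based half-open [start, end)."""
    if strand == "+":
        return start
    if strand == "-":
        return end - 1
    raise ValueError(f"unknown strand: {strand!r}")


def last_transcribed_base(chrom, start, end, strand):
    """Key for the last nucleotide of an exon or intron in transcription order."""
    position = end - 1 if strand == "+" else start
    return (chrom, position, strand)


def drop_splicing_intermediates(tags, features):
    """Remove tags that sit exactly on the last nucleotide of any feature.

    tags: list of (chrom, position, strand) single-nucleotide tags
    features: list of (chrom, start, end, strand) exons and introns
    Assumption: tag strand already equals the strand of the nascent RNA.
    """
    blocked = {last_transcribed_base(*feature) for feature in features}
    return [tag for tag in tags if tag not in blocked]


def orientation(tag_strand, gene_strand):
    """Label a tag relative to the gene whose TSS it is compared with."""
    return "sense" if tag_strand == gene_strand else "anti-sense"
```

Keeping the positions of exon and intron ends in a set makes the filter a constant-time lookup per tag, which matters when every read is reduced to a single nucleotide.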

Travelling ratio & termination index

The travelling ratio is calculated via:

$$\text{Travelling ratio} = \frac{\text{Proximal Promoter}}{\text{Gene Body}}$$

with Proximal Promoter defined as the Pol II coverage from −30 bp to +250 bp around the TSS, and Gene Body as the Pol II coverage from +300 bp downstream of the TSS to −200 bp upstream of the TES.
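As a worked illustration of the travelling ratio defined above, the following Python sketch computes it for a single plus-strand gene from a per-base coverage list. It assumes that "coverage" means the mean signal per base in each window and that a gene with no gene-body signal is reported as undefined; both choices, and the function names, are assumptions for illustration rather than details taken from the analysis scripts.

```python
def mean_coverage(coverage, start, end):
    """Mean per-base signal over the half-open index range [start, end)."""
    if start < 0 or end > len(coverage) or end <= start:
        raise ValueError("window is empty or falls outside the coverage array")
    window = coverage[start:end]
    return sum(window) / len(window)


def travelling_ratio(coverage, tss, tes):
    """Promoter density divided by gene-body density for a plus-strand gene.

    coverage: per-base sense-strand signal, indexed by position
    tss, tes: 0-based indices of the TSS and TES within `coverage`
    Returns None when the gene body carries no signal (ratio undefined).
    """
    promoter = mean_coverage(coverage, tss - 30, tss + 250)
    gene_body = mean_coverage(coverage, tss + 300, tes - 200)
    if gene_body == 0:
        return None
    return promoter / gene_body


# Toy gene: TSS at index 100, TES at index 900.
signal = [0.0] * 1000
for i in range(70, 350):      # promoter window, -30 to +250
    signal[i] = 4.0
for i in range(400, 700):     # gene-body window, +300 to TES - 200
    signal[i] = 1.0
ratio = travelling_ratio(signal, tss=100, tes=900)
```

A higher ratio indicates relatively more polymerase near the promoter than in the gene body. Minus-strand genes would need the windows mirrored around the TSS.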

ChIP-seq data processing

All ChIP-seq fastq files were aligned to the mm10 genome using Bowtie2 (v 2.2.6) with default parameters [Citation27]. Duplicated reads were removed. All BAM files were converted to bigwig (10 bp bin) and normalised to 1× sequencing depth using Deeptools (v 2.4) [Citation26]. Blacklisted mm9 co-ordinates were converted to mm10 using the LiftOver tool from UCSC and were further removed from the analysis. Average binding profiles were visualised using R (v 3.3.0).
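Normalisation to 1× sequencing depth can be sketched in a few lines of Python. This illustrates the general idea only (scale the signal so that average genome coverage equals 1) and should not be read as the Deeptools implementation; the function names and the toy numbers are hypothetical.

```python
def one_x_scale_factor(mapped_reads, fragment_length, effective_genome_size):
    """Factor that rescales signal so that mean genome coverage becomes 1x.

    Current average coverage is taken as
    mapped_reads * fragment_length / effective_genome_size.
    """
    current_coverage = mapped_reads * fragment_length / effective_genome_size
    if current_coverage <= 0:
        raise ValueError("coverage must be positive")
    return 1.0 / current_coverage


def normalise_bins(bin_values, scale):
    """Apply the scale factor to every bin of a coverage track."""
    return [value * scale for value in bin_values]


# Toy numbers: 1 million reads of 100 bp on a 50 Mb genome give 2x coverage.
scale = one_x_scale_factor(1_000_000, 100, 50_000_000)
track = normalise_bins([2.0, 4.0], scale)
```

Scaling every library to the same nominal depth makes coverage values comparable between libraries sequenced to different depths.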

Mass spectrometry sample preparation

Independent ES cell cultures were grown in 10 cm dishes. Per IP, 20 × 10⁷ cells were extracted, lysed, and nuclei were treated with DNase I as described above. The supernatant was incubated for 2 hours with a total Pol II antibody (ab817 Abcam) or IgG (Cell Signalling) at 4°C. In total, four samples were prepared for each IP (Total Pol II, IgG). After thorough washing of beads with IP buffer, samples were incubated overnight at 37°C with Tris pH 8.8 and 300 ng Trypsin Gold (Promega). Peptides were desalted using StageTips [Citation28] and dried. The peptides were resuspended in 0.1% formic acid and analysed using liquid chromatography – mass spectrometry (LC-MS/MS).

LC-MS/MS analysis

Peptides were separated on a 25 cm, 75 μm internal diameter PicoFrit analytical column (New Objective) packed with 1.9 μm ReproSil-Pur 120 C18-AQ media (Dr. Maisch) using an EASY-nLC 1200 (Thermo Fisher Scientific). The column was maintained at 50°C. Buffer A and B were 0.1% formic acid in water and 0.1% formic acid in 80% acetonitrile. Peptides were separated on a segmented gradient from 6% to 31% buffer B for 45 min and from 31% to 50% buffer B for 5 min at 200 nl/min. Eluting peptides were analyzed on a QExactive HF mass spectrometer (Thermo Fisher Scientific). Peptide precursor m/z measurements were carried out at 60,000 resolution in the 300 to 1800 m/z range. The top ten most intense precursors with charge state from 2 to 7 only were selected for HCD fragmentation using 25% normalized collision energy. The m/z values of the peptide fragments were measured at a resolution of 30,000 using a minimum AGC target of 8e3 and 55 ms maximum injection time. Upon fragmentation, precursors were put on a dynamic exclusion list for 45 sec.

Protein identification and quantification

The raw data were analyzed with MaxQuant version 1.6.0.13 [Citation29] using the integrated Andromeda search engine [Citation30]. Peptide fragmentation spectra were searched against the canonical and isoform sequences of the mouse reference proteome (proteome ID UP000000589, downloaded December 2017 from UniProt). Methionine oxidation and protein N-terminal acetylation were set as variable modifications; cysteine carbamidomethylation was set as fixed modification. The digestion parameters were set to ‘specific’ and ‘Trypsin/P’. The minimum number of peptides and razor peptides for protein identification was 1; the minimum number of unique peptides was 0. Protein identification was performed at peptide spectrum match and protein false discovery rates of 0.01. The ‘second peptide’ option was on. Successful identifications were transferred between the different raw files using the ‘Match between runs’ option. Label-free quantification (LFQ) [Citation31] was performed using an LFQ minimum ratio count of 2. LFQ intensities were filtered for at least three valid values in at least one group and imputed from a normal distribution with a width of 0.3 and down shift of 1.8. The median value of the log2 LFQ intensities for the RNA Pol II IPs was used for the imputation of the missing values in the IgG IPs. Differential abundance analysis was performed using limma [Citation32] (Supplementary Table 1).
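The valid-value filter and the down-shifted imputation can be sketched as follows. The sketch assumes the common convention in which the width (0.3) and down shift (1.8) are expressed in units of the standard deviation of the observed log2 intensities of one sample; that convention, the function names, and the use of `None` for missing values are assumptions, and the sketch does not reproduce the median-based handling of the IgG IPs described above.

```python
import random
import statistics


def has_enough_valid(groups, min_valid=3):
    """True if at least one group has `min_valid` or more observed values.

    groups: dict mapping group name to a list of log2 LFQ values,
    with None marking a missing value.
    """
    return any(
        sum(value is not None for value in values) >= min_valid
        for values in groups.values()
    )


def impute_down_shifted(values, width=0.3, down_shift=1.8, rng=None):
    """Replace missing values with draws from a narrowed, down-shifted normal.

    The replacement distribution has mean (mean - down_shift * sd) and
    standard deviation (width * sd), where mean and sd come from the
    observed values in `values`.
    """
    rng = rng or random.Random()
    observed = [value for value in values if value is not None]
    if len(observed) < 2:
        raise ValueError("need at least two observed values to estimate spread")
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    return [
        value if value is not None
        else rng.gauss(mean - down_shift * sd, width * sd)
        for value in values
    ]


sample = [20.0, 22.0, None, 24.0]
filled = impute_down_shifted(sample, rng=random.Random(0))
```

Drawing replacements from the low tail of the observed distribution models missing values as proteins below the detection limit rather than as proteins of average abundance.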

Enhancers and super-enhancers

BED files containing typical enhancer and super-enhancer coordinates in mESCs were downloaded from Whyte et al. [Citation22]. Distal enhancers were defined as regions that are not overlapping with any annotated gene within a 2000 bp window. Only the distal enhancers that displayed an RPKM > 1 for Pol II were kept for subsequent analyses.
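A minimal sketch of the two filters described above follows. It assumes that the 2000 bp window is applied on both sides of each enhancer, that coordinates are BED-style half-open intervals, and that RPKM is computed in the standard way (reads per kilobase of region per million mapped reads). The function names and example coordinates are hypothetical.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


def is_distal(enhancer, genes, window=2000):
    """True if no annotated gene overlaps the enhancer extended by `window` bp.

    enhancer: (chrom, start, end); genes: list of (chrom, start, end).
    Intervals are half-open, as in BED files.
    """
    chrom, start, end = enhancer
    lo, hi = max(0, start - window), end + window
    return not any(
        g_chrom == chrom and g_start < hi and g_end > lo
        for g_chrom, g_start, g_end in genes
    )


def keep_enhancer(enhancer, pol2_reads, genes, total_mapped_reads,
                  min_rpkm=1.0, window=2000):
    """Apply both criteria: distal to genes and Pol II RPKM above threshold."""
    _, start, end = enhancer
    return (is_distal(enhancer, genes, window)
            and rpkm(pol2_reads, end - start, total_mapped_reads) > min_rpkm)


# Toy annotation with a single gene on chr1.
genes = [("chr1", 10000, 20000)]
near = ("chr1", 21000, 22000)   # 1 kb downstream of the gene, inside the window
far = ("chr1", 30000, 31000)    # 10 kb downstream, outside the window
```

For example, with 10 million mapped reads, a 1 kb distal enhancer covered by 50 Pol II reads has an RPKM of 5 and is kept, whereas the same region with 5 reads (RPKM 0.5) is discarded.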

Publicly available datasets

The following publicly available datasets were used: NET-seq (GSE90906; Mylonas et al.) and ChIP-seq for Pol II (GSE28247 [Citation33]), Ssrp1 (GSE90906; Mylonas et al.), Spt6 (GSE103180 [Citation13]), TFIID (GSE39237 [Citation34]), and H3K27Ac (ENCODE Consortium, E14 cell line).

Data availability

Data have been deposited in the Gene Expression Omnibus (GEO) under accession number GSE107257.

Acknowledgments

We are grateful to Ilian Attanassov of the Max Planck Institute for Biology of Ageing Proteomics Core Facility for Mass Spectrometry Analysis. Sequencing was performed at the Max Planck Genome core centre in Cologne and data analysis was done on servers of the GWDG, Göttingen and the MPI-AGE cluster. Models were prepared using Biorender. We thank members of the Tessarz laboratory for discussion and comments on the manuscript.

Disclosure statement

The authors declare that they have no conflict of interest.

Supplementary data

Supplemental data for this article can be accessed here.

Additional information

Funding

Funding has been provided by core support of the Max Planck Society.

References

  • Jonkers I, Lis JT. Getting up to speed with transcription elongation by RNA polymerase II. Nat Rev Mol Cell Biol. 2015;16:167–177.
  • Harlen KM, Churchman LS. The code and beyond: transcription regulation by the RNA polymerase II carboxy-terminal domain. Nat Rev Mol Cell Biol. 2017;18:263–273.
  • Voss TC, Hager GL. Dynamic regulation of transcriptional states by chromatin and transcription factors. Nat Rev Genet. 2013;15:69–81.
  • Jeronimo C, Watanabe S, Kaplan CD, et al. The histone chaperones FACT and Spt6 restrict H2A.Z from intragenic locations. Mol Cell. 2015;58:1–12.
  • Bentley DL. Coupling mRNA processing with transcription in time and space. Nat Rev Genet. 2014;15:163–175.
  • Churchman LS, Weissman JS. Nascent transcript sequencing visualizes transcription at nucleotide resolution. Nature. 2011;469:368–373.
  • Fischl H, Howe FS, Furger A, et al. Paf1 has distinct roles in transcription elongation and differential transcript fate. Mol Cell. 2017;65:685–698.e688.
  • Scruggs BS, Gilchrist DA, Nechaev S, et al. Bidirectional transcription arises from two distinct hubs of transcription factor binding and active chromatin. Mol Cell. 2015;58:1101–1112.
  • Mayer A, Di Iulio J, Maleri S, et al. Native elongating transcript sequencing reveals human transcriptional activity at nucleotide resolution. Cell. 2015;161:541–554.
  • Min IM, Waterfall JJ, Core LJ, et al. Regulating RNA polymerase pausing and transcription elongation in embryonic stem cells. Genes Dev. 2011;25:742–754.
  • Kwak H, Fuda NJ, Core LJ, et al. Precise maps of RNA polymerase reveal how promoters direct initiation and pausing. Science. 2013;339:950–953.
  • Nojima T, Gomes T, Grosso ARF, et al. Mammalian NET-Seq reveals genome-wide nascent transcription coupled to RNA processing. Cell. 2015;161:526–540.
  • Wang AH, Juan AH, Ko KD, et al. The elongation factor Spt6 maintains ESC pluripotency by controlling super-enhancers and counteracting polycomb proteins. Mol Cell. 2017;68:398–413.e6.
  • He C, Sidoli S, Warneford-Thomson R, et al. High-resolution mapping of RNA-binding regions in the nuclear proteome of embryonic stem cells. Mol Cell. 2016;64:416–430.
  • Beckmann BM, Horos R, Fischer B, et al. The RNA-binding proteomes from yeast to man harbour conserved enigmRBPs. Nat Commun. 2015;6:1–9.
  • Baltz AG, Munschauer M, Schwanhäusser B, et al. The mRNA-bound proteome and its global occupancy profile on protein-coding transcripts. Mol Cell. 2012;46:674–690.
  • Fong N, Kim H, Zhou Y, et al. Pre-mRNA splicing is facilitated by an optimal RNA polymerase II elongation rate. Genes Dev. 2014;28:2663–2676.
  • Dujardin G, Lafaille C, de la Mata M, et al. How slow RNA polymerase II elongation favors alternative exon skipping. Mol Cell. 2014;54:683–690.
  • Mylonas C, Tessarz P. Transcriptional repression by FACT is linked to regulation of chromatin accessibility at the promoter of ES cells. Life Sci Alliance. 2018;1:e201800085.
  • Heinz S, Romanoski CE, Benner C, et al. The selection and function of cell type-specific enhancers. Nat Rev Mol Cell Biol. 2015;16:144–154.
  • Adam RC, Fuchs E. The Yin and Yang of chromatin dynamics in stem cell fate selection. Trends Genet. 2016;32:89–100.
  • Whyte WA, Orlando DA, Hnisz D, et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013;153:307–319.
  • Core LJ, Martins AL, Danko CG, et al. Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat Genet. 2014;46:1311–1320.
  • He Q, Johnston J, Zeitlinger J. ChIP-nexus enables improved detection of in vivo transcription factor binding footprints. Nat Biotechnol. 2015;33:395–401.
  • Mayer A, Churchman LS. Genome-wide profiling of RNA polymerase transcription at nucleotide resolution in human cells with native elongating transcript sequencing. Nat Protoc. 2016;11:813–833.
  • Ramírez F, Ryan DP, Grüning B, et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44:W160–W165.
  • Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–359.
  • Rappsilber J, Ishihama Y, Mann M. Stop and go extraction tips for matrix-assisted laser desorption/ionization, nanoelectrospray, and LC/MS sample pretreatment in proteomics. Anal Chem. 2003;75:663–670.
  • Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol. 2008;26:1367–1372.
  • Cox J, Neuhauser N, Michalski A, et al. Andromeda: a peptide search engine integrated into the MaxQuant environment. J Proteome Res. 2011;10:1794–1805.
  • Cox J, Hein MY, Luber CA, et al. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol Cell Proteomics. 2014;13:2513–2526.
  • Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43:e47.
  • Handoko L, Xu H, Li G, et al. CTCF-mediated functional chromatin interactome in pluripotent cells. Nat Genet. 2011;43:630–638.
  • Ku M, Jaffe JD, Koche RP, et al. H2A.Z landscapes and dual modifications in pluripotent and multipotent stem cells underlie complex genome regulatory functions. Genome Biol. 2012;13:R85.