RNA Family

Detailed secondary structure models of invertebrate 7SK RNAs

Pages 158-164 | Received 04 Oct 2017, Accepted 30 Nov 2017, Published online: 21 Dec 2017

ABSTRACT

The 7SK RNA is a small nuclear RNA that is involved in the regulation of Pol-II transcription. It is very well conserved in vertebrates, but shows extensive variation in both sequence and structure across invertebrates. Making use of the large number of recently published invertebrate genomes, a systematic homology search extended the collection of 7SK genes in both Arthropoda and Lophotrochozoa. The extended data set made it possible to infer complete consensus structures for invertebrate 7SK RNAs. These show that not only the well-conserved 5′- and 3′-domains but also the interior Stem A domain is universally conserved. In contrast, the Stem B region exhibits substantial structural variation and does not adhere to a common structural model beyond phylum level.

1. Introduction

The 7SK snRNA is a Pol-III transcript in animals with a typical length of about 330 nt [Citation1,Citation2]. Its sequence is very well conserved in vertebrates [Citation3,Citation4], but shows a high level of variability in invertebrates [Citation5,Citation6]. Due to its abundance it has been known since the 1960s. The molecule is capped at its 5′ end by a highly specific methylase, MePCE, also known as BCDIN3 [Citation7]. Its stability is regulated by means of a highly specific interaction with LARP7 (La-related protein 7, also known as PIP7S) [Citation8-11]. LARP7 also plays an inhibitory role counteracting MePCE; to this end, its xRRM domain binds to the 3′ stem loop of the 7SK RNA [Citation12]. The Bin3 RNA methyltransferase reinforces LARP7 binding. It adds to the stability of the interaction [Citation13] and catalyzes the 5′ methylation that is protective against degradation [Citation14].

The primary function of 7SK is to mediate an inhibitory interaction of the HEXIM1 protein with the general transcription elongation factor P-TEFb, thereby repressing transcript elongation by Pol II [Citation4,Citation15-17]. To activate P-TEFb, it must be released from the complex with 7SK RNA [Citation18]. This process is facilitated by the PPM1G phosphatase, which then binds to 7SK RNA and prevents re-binding of P-TEFb [Citation19].

7SK RNA suppresses the deaminase activity of APOBEC3C and sequesters this enzyme in the nucleolus [Citation20]. It features at least two distinctive secondary structure elements: both HEXIM1 and P-TEFb bind specific sequence motifs at the 5′-terminal hairpin, while a 3′-terminal hairpin interacts with P-TEFb only. The high level of sequence conservation in gnathostomes, however, contrasts with the highly divergent 7SK sequences in invertebrates. Still, high levels of sequence conservation have been reported for the 3′- and 5′-hairpins, suggesting that the protein interaction sites are likely homologous to the ones in vertebrates.

Two highly conserved GAUC motifs located in the upper part of the 5′ hairpin form a short helix and, together with adjacent, conserved U bulges, form the HEXIM1 binding site [Citation4,Citation21]. A recent crystal structure [Citation22] of the 5′-terminal hairpin shows that the major groove of the GAUC-GAUC helix is occupied by the adjacent uridines, which form non-standard base pairs with nucleotides of the helix. This structural element is also the main interaction site for PPM1G [Citation19]. The motif is also a binding site for the HIV-1 transcriptional transactivator (Tat) protein [Citation23], which can cause chemical variations in some residues of the GAUC-GAUC helix [Citation24]. The 7SK RNA is able to form alternative structures under the influence of different factors [Citation24]. The presence of the GAUC-GAUC helix, for example, is magnesium-dependent [Citation12]. Finally, 7SK RNA plays a role in cancer by negatively regulating [Citation25,Citation26] the attenuating HMGA1 protein, which is strongly overexpressed in several cancers [Citation27-Citation29].

To date, 7SK RNAs have been identified in a wide range of bilaterian animals. While their sequence and structure are well conserved in vertebrates, the invertebrate 7SK genes identified so far show large changes in size, sequence, and secondary structure [Citation3-6,Citation30]. In these respects, invertebrate 7SK behaves similarly to the vertebrate telomerase RNA component [Citation31]. It is unclear at this point whether 7SK RNA is a bilaterian innovation, or whether homologs in diploblasts or even outside the animal kingdom have simply escaped detection. As a case in point, the putative, highly derived gene reported for Caenorhabditis species [Citation30] (also named T26A8.6 and CeN21-2) may in fact be a homolog of the U8 snoRNA [Citation32]. The main purpose of this contribution is to provide an update of the sequence and secondary structure information available for invertebrates. To this end, we conducted a comprehensive homology search. Using a greatly enlarged set of credible 7SK genes, we then proceeded to construct high-quality sequence alignments and detailed structural models that allow us to trace in full detail the evolutionary history of this RNA family.

2. Materials and methods

2.1. Initial data sources

The previously reported Arthropoda 7SK RNAs [Citation6] and detected invertebrate 7SK RNAs [Citation5] were the first sequences used as queries for the homology search. The genomic data were downloaded from NCBI databases [Citation33], the antgenomes database website [Citation34], and the VectorBase database website [Citation35] at different times during 2017. A complete list of investigated genomes can be found in Supplemental Table 1.

2.2. Homology search

Homology searches were conducted using both NCBI blast [Citation36,Citation37] and the infernal suite (release 1.1.1) [Citation38,Citation39]. For the blast searches we used the phylogenetically most closely related sequences from the supplemental material of [Citation6] as queries. We used default parameters for both blast and infernal. Although this returns a substantial number of initial false positives, these were easily weeded out by manual inspection. An initial covariance model was constructed using the 7SK sequences of Nasonia, Bombus, Solenopsis, and Pogonomyrmex based on a clustalw (version 2.1) [Citation40] alignment and the structure model of [Citation6]. In later steps, candidate sequences were added to alignments of closely related 7SK sequences using clustalw (version 2.1) [Citation40]. Initial consensus secondary structures were computed using RNAalifold [Citation41]. Alignments were then manually improved where necessary as detailed below. The expanded sequence alignments were then used to update the covariance models used for the infernal-based search. All figures were drawn with R2R [Citation42] from the final multiple sequence alignments and consensus structures that are provided in Stockholm format in the Electronic Supplement.
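
One round of the infernal-based search described above can be sketched as three command lines: build a covariance model from a structure-annotated Stockholm alignment, calibrate it, and search a genome. The sketch below only assembles the commands and does not execute them. The file names are hypothetical placeholders, and the helper `infernal_round` is an illustration, not part of the published workflow. No extra options are set, in keeping with the default parameters stated in the text.

```python
def infernal_round(alignment_sto, cm_file, genome_fasta, tblout):
    """Return the Infernal command lines for one build/calibrate/search round.

    All file names are supplied by the caller; nothing is executed here.
    """
    return [
        # Build a covariance model from a structure-annotated alignment.
        ["cmbuild", cm_file, alignment_sto],
        # Calibrate E-value parameters of the model (required before cmsearch).
        ["cmcalibrate", cm_file],
        # Search the genome and write a tabular hit list.
        ["cmsearch", "--tblout", tblout, cm_file, genome_fasta],
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in infernal_round("7sk_seed.sto", "7sk.cm", "genome.fa", "hits.tbl"):
        print(" ".join(cmd))
```

In an iterative search, accepted hits would be added to the alignment and the same three steps repeated with the updated model.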

In order to further validate the 7SK RNA candidate sequences, we searched the 100 nt of genomic sequence upstream of the predictions for sequence elements characteristic of Pol III promoters. To this end we downloaded U6 snRNA promoters from the supplementary data of [Citation44] and aligned them to the putative upstream sequences; see Fig. 1 for an example.
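
As a minimal sketch of the upstream check, the following code extracts the 100 nt preceding a candidate on either strand and reports the position of a TATA-like element relative to the predicted start. The plain substring test is a simplification assumed here for illustration; the study aligned the upstream regions to known U6 snRNA promoters instead. The function names and the toy sequence are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def upstream_region(genome, start, end, strand, length=100):
    """Return up to `length` nt upstream of a hit.

    `start` and `end` are 0-based, half-open coordinates on the forward strand.
    For minus-strand hits the upstream region lies after `end` and is
    returned as its reverse complement.
    """
    if strand == "+":
        return genome[max(0, start - length):start]
    return revcomp(genome[end:end + length])


def tata_offset(upstream):
    """Offset (negative) of the TATA box closest to the start, or None."""
    idx = upstream.upper().rfind("TATA")
    return None if idx == -1 else idx - len(upstream)


if __name__ == "__main__":
    # Toy sequence: a TATA element 31 nt upstream of a made-up candidate.
    toy = "GGG" + "TATAAA" + "C" * 25 + "GATCGATC"
    up = upstream_region(toy, 34, 42, "+")
    print(up, tata_offset(up))
```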

Figure 1. Alignment of the region upstream of the 7SK RNA candidate from the genome of the ant Metapolybia cingulata (below) and a sequence logo (above) constructed from the upstream regions of the three genomic copies of the U6 snRNA, the single U6atac snRNA, and the 7SK RNA from honey bee (Apis mellifera). The PSE and TATA elements are highlighted. The sequence logo was generated with WebLogo [Citation43].

2.3. Manual data curation

We used the emacs text editor in ralee mode [Citation45] for manipulation of alignments. The 5′ hairpin structure was annotated following [Citation6], where related but subtly different consensus structures were reported for different groups of invertebrates for this domain. The 3′-terminal hairpin was highly conserved across species in the previous studies [Citation5], even between invertebrates and vertebrates. Stem A was also annotated following the lead of [Citation5,Citation6]. No structural annotation was given in previous studies to the large, highly variable central region of the molecule. Initially, we also left this part unannotated and used only the 5′ and 3′ regions to inspect candidate sequences.

Since structure predictions started to contradict conservation relative to sequence alignments as the number and phylogenetic scope of sequences increased, we subdivided the alignment data set both “vertically” into individual components of the 7SK structure and “horizontally” into groups of sequences with likely coherent conserved secondary structures. First, we considered sequences from six subgroups of Hexapoda: Apocrita, Tenthredinidae, Coleoptera, Orthoptera, Cephidae, and Isoptera. Sequences from each group were realigned separately and then reannotated. At this point, unambiguous alignments for the 5′ stem, the 3′ stem, and Stem A (with consensus motif CCGTGC-GCACGG) were obtained.
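
The two halves of the Stem A consensus motif are reverse complements of each other, which is what allows them to form a perfect six-base-pair helix. A short check, with helper names chosen here for illustration:

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence given in upper case."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def stem_pairs(five_prime, three_prime):
    """Pair the i-th base of the 5' strand with the i-th base from the 3' end."""
    return list(zip(five_prime, reversed(three_prime)))


if __name__ == "__main__":
    five, three = "CCGTGC-GCACGG".split("-")
    print(reverse_complement(five) == three)  # True
    print(stem_pairs(five, three))
```

The motif is written in DNA letters as in the genomic alignments; in the transcript, T corresponds to U.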

For each subgroup of species we separately aligned the region between Stem A and the 3′ hairpin, which has a length between 80 and almost 200 nt in different clades. Then we computed putative consensus structures for these regions, see Fig. S1 for an example. This procedure consistently identified a new stem, which we term Stem B, in the previously unannotated regions as well as an extension of stem A, which led us to repartition the regions containing stem A and the new stem.

After constructing alignments and secondary structure models of the four domains (5′-terminal hairpin, Stem A, the newly reported Stem B between Stem A and the 3′ end, and the 3′-terminal hairpin) separately for each group of sequences, the models were merged into a single alignment for each structural region. To this end we computed covariance models from each alignment and determined the best matching sequence in other sequence groups using infernal. Using the Apocrita alignment as a starting point, we iteratively added the other groups, manually editing the alignment to improve the consistency of both sequence alignment and consensus structure in each step. The significance of the covariation in the predicted base pairs of the final models was assessed using R-scape [Citation46]. Base pairs with R-scape E-values below 2.0 are compiled in the Electronic Supplement. We note that the default cut-off of 0.05 of R-scape is almost never reached.
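
To illustrate what covariation in a predicted base pair means, the sketch below classifies a pair of alignment columns by the canonical pairs observed in them. This is a deliberately simplified, qualitative classification assumed here for illustration. It is not the statistical test implemented in R-scape and produces no E-values; the column strings in the example are made up.

```python
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def classify_pair(col_i, col_j):
    """Classify two alignment columns that are predicted to pair.

    Returns "unpaired" if no canonical pair is seen, "no variation" if only
    one kind of canonical pair occurs, "covariation" if two canonical pairs
    differ at both positions, and "compatible mutation" otherwise.
    """
    observed = {a + b for a, b in zip(col_i, col_j) if a != "-" and b != "-"}
    canonical = observed & CANONICAL
    if not canonical:
        return "unpaired"
    if len(canonical) == 1:
        return "no variation"
    for p in canonical:
        for q in canonical:
            if p[0] != q[0] and p[1] != q[1]:
                return "covariation"
    return "compatible mutation"


if __name__ == "__main__":
    print(classify_pair("GGGG", "CCCC"))  # no variation
    print(classify_pair("GGAC", "CCUG"))  # covariation
    print(classify_pair("GGGG", "CCUU"))  # compatible mutation
```

A perfectly conserved helix yields "no variation" throughout, which is one reason a well-conserved stem can lack statistical support from covariation.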

3. Results

3.1. Extended set of 7SK RNAs

In total we identified 48 candidates for 7SK genes in invertebrate genomes. The phylogenetic distribution of the sequences is summarized in Fig. 2. No ecdysozoan candidates were found outside Arthropoda. However, we substantially extended the phylogenetic scope within the group, identifying not only many additional sequences from diverse Hexapoda, but also new genes from the subphyla Crustacea and Chelicerata. We also reanalyzed the much smaller collection of known and predicted lophotrochozoan 7SK RNA genes. Not surprisingly, the genes in Hexapoda are easily recognizable using the arthropod covariance model RF01052 currently provided in Rfam 13.0. A few of the candidates from Arachnida as well as Lophotrochozoa, however, escape detection with the previously published model, see Electronic Supplement.

Figure 2. Phylogenetic distribution of the 7SK genes identified in Arthropoda. Black crosses mark clades for which 7SK RNAs are newly reported; groups for which homologs were reported in earlier studies are marked by black squares. White crosses indicate clades where homology searches remained unsuccessful. For simplicity we show only a single branch for Crustacea.

3.2. Refined model of arthropod 7SK RNA

The structure of invertebrate 7SK RNAs is best discussed by first using the Hexapoda consensus, Fig. 3, as a paradigmatic example. Overall, the molecule is well conserved in both sequence and structure. The 5′ stem is highly conserved and very similar to the corresponding, well-characterized structure of its vertebrate homolog [Citation4,Citation47], see Fig. S2. The model in particular conserves the P-TEFb binding motif including the U-bulges. The 5′ stem is also well conserved in flies, as demonstrated by a comparison of our model with the structure reported [Citation14] for the 7SK RNA of Drosophila melanogaster. Both structures feature the same P-TEFb binding motif as well as a second identical helix. Surprisingly, the 5′ hairpin is not well supported by covarying base pairs according to R-scape, possibly due to the high level of conservation of the helical region.

Figure 3. Hexapoda 7SK RNA structure drawn with R2R [Citation42]. Nucleotides marked by a letter are present in at least 65%, 90% or 97% of the sequences depending on the color as indicated in the legend. Circled positions without a letter are more variable and are present in a fraction of the sequences as indicated in the legend. Base pairs shaded in red show no variation, green indicates covariation and blue refers to compatible mutation. In addition, the circled nucleotides at the 5′ stem represent the P-TEFb and HEXIM1 binding sites and the target site of PPM1G. The circled nucleotides at the 3′ stem are nucleotides crucial for P-TEFb binding.

Our results strongly suggest that the conserved Stem A structure is more extensive than previously reported. In particular, there are at least three additional base pairs. Key features of this structure, in particular the three pairs at the base of the stem, are clearly conserved between vertebrates and invertebrates. The structure is well supported by covariation in both the Arthropoda model and the more general invertebrate model. Stem A is well supported by covarying base pairs according to R-scape.

Beyond refinements of previously described structural elements, we found clear evidence for a novel Stem B located between Stem A and the 3′ hairpin. As shown in Fig. 4, it features a highly conserved loop as well as a quite well-conserved sequence over much of its stem. In contrast to Stem A, there is no evidence of significant covariation in Stem B.

Figure 4. Comparison of Stem B in vertebrates, Hexapoda without Coleoptera, and Coleoptera alone, respectively. See caption of Fig. 3 for explanation of the color code.

Interestingly, a similar structure is part of the vertebrate 7SK RNA secondary structure model [Citation4,Citation47]. It was not recognized as a conserved element in earlier studies [Citation5,Citation6], which in particular focussed on drosophilids. Fruitflies have insertions in this region that effectively made it impossible to construct reliable alignments of this region with other arthropods. Hence drosophilid sequences were excluded from the construction of consensus models for the region between Stem A and the 3′ hairpin. While the length of the 7SK RNA of other Diptera is within the range of other arthropods, their sequences are highly divergent in this region and we did not succeed in producing a credible alignment with other arthropods. Hence separate alignments and structure models have been constructed for drosophilids and the remaining Diptera, respectively. The dipteran model, Fig. S3, shares at least an overall structural similarity as well as the P-TEFb binding sites in both the 5′ and 3′ stems with the other arthropods. In these regions the alignment clearly shows the sequence homology. The structures in the Stem A and Stem B regions, on the other hand, do not seem to conform to the arthropod consensus. Coleoptera 7SK RNAs also systematically deviate from the ancestral structure of Stem B, see Fig. 4.

The structure model derived from the nine Arachnida sequences (Fig. S4) showed less conservation in the 5′ stem region, although the long-range interaction does appear to be well preserved. There is also a clearly recognizable Stem A and a 3′ hairpin. However, a Stem B structure cannot be attested. Significantly co-varying base pairs are observed in the 5′ stem, the 3′ stem, and Stem A.

Our homology search also identified 7SK RNA genes in Crustacea, in particular in genomes of Limulus polyphemus and Triops cancriformis, and we recovered the 7SK gene in the myriapod Strigamia maritima, which was already annotated in the EnsemblMetazoa browser. These sequences, together with a few additional uncertain candidates, however, are too diverged beyond their 5′ and 3′ hairpins to be integrated with confidence into our alignments and secondary structure model. A more detailed analysis of their structures thus will require additional, more closely related genomes to be sequenced.

3.3. Invertebrates model

Superpositions of the structural models for individual phylogenetic groups (see Fig. S5) show that there is significant variation also at the level of secondary structures. On the other hand, there is also a clearly identifiable core structure common to the invertebrate structures. This core in particular covers the top 11 base pairs of the 5′ stem including the P-TEFb binding site as well as the 3′ hairpin. At least six base pairs of Stem A are also essentially immutable.
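
The idea of a common core can be illustrated as the intersection of the base-pair sets of several consensus structures that share the same alignment coordinates. The following sketch assumes pseudoknot-free dot-bracket strings of equal length; the three toy structures are invented and do not correspond to the models in the Electronic Supplement.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            pairs.add((stack.pop(), i))
    return pairs


def core_pairs(structures):
    """Base pairs shared by all structures (same alignment coordinates assumed)."""
    return set.intersection(*(base_pairs(s) for s in structures))


if __name__ == "__main__":
    # Toy consensus structures for three hypothetical groups.
    models = ["((((...))))", ".(((...))).", "(((.....)))"]
    print(sorted(core_pairs(models)))  # [(1, 9), (2, 8)]
```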

Stem B can be ascertained only in Hexapoda. At present the data remain very sparse for Lophotrochozoa, where 7SK sequences so far could be identified only for a few representatives of the Mollusca. In isolation, these data were insufficient to infer a Stem B-like structure. In particular, we found that the relevant sequence regions were too dissimilar to the Stem B sequences of Hexapoda for a credible sequence alignment. Similarly, we have at present no evidence that Stem B in Hexapoda (and possibly Lophotrochozoa) is homologous to a corresponding structure in vertebrates.

As expected, the HEXIM1 binding motif GAUC-GAUC [Citation21] is perfectly conserved across the invertebrates, as noted already in [Citation5,Citation6]. Consistent with the results of [Citation22], furthermore, the adjacent uracil residues are also highly conserved, with the notable exception of one position at which uracil is replaced by adenine. While the consensus motif for vertebrates reads UuGAUC-uGAUC, invertebrates instead feature UUGAUC-AGAUC as their consensus HEXIM1 binding motif.
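
The difference between the two consensus motifs lies in the nucleotide directly upstream of the second GAUC. A minimal sketch for locating GAUC elements and reporting this flanking nucleotide is given below; the function name and the toy sequence are hypothetical and serve only to illustrate the pattern.

```python
import re


def gauc_sites(seq):
    """Return (position of G, upstream nucleotide) for every GAUC in `seq`.

    DNA input is converted to RNA letters first. Positions are 0-based.
    A GAUC at the very start of the sequence has no upstream base and is skipped.
    """
    rna = seq.upper().replace("T", "U")
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(r"(?=([ACGU])GAUC)", rna)]


if __name__ == "__main__":
    # Toy sequence containing UUGAUC and AGAUC, as in the invertebrate consensus.
    toy = "GGUUGAUCUGCCCAGAUCCC"
    print(gauc_sites(toy))  # [(4, 'U'), (14, 'A')]
```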

The 7SK RNA appears in an open and a closed conformation. In the closed structure, a helix is formed between the 5′ end of the molecule and a complementary region just upstream of the 3′ stem [Citation22]. Mutations in this stem affect LARP7 binding [Citation47]. The alternative, open conformation has been described e.g. in [Citation4] and [Citation48].

Although the main purpose of this contribution is the presentation of a more detailed and more complete secondary structure model, it is worth noting that the covariance model corresponding to the invertebrate consensus described here also provides a moderate improvement for the purpose of homology search. It not only recognizes several candidate sequences that are invisible to the Rfam 13.0 arthropod 7SK RNA model, but also achieves systematically improved scores on 7SK RNAs from both Ecdysozoa and Lophotrochozoa.

4. Discussion and concluding remarks

We have significantly extended and refined the understanding of the 7SK RNA throughout the bilaterian animals. Based on the carefully curated results of iterative homology searches we constructed a detailed model of the secondary structure and the structural variation of 7SK throughout the Ecdysozoa, and to a lesser extent also in the Lophotrochozoa. Our analysis confirms in particular the evolutionarily well-conserved sequence and secondary structure elements at both the 5′- and 3′-ends of the molecule that are associated with P-TEFb and HEXIM binding. In addition, we found that Stem A is also a universally conserved feature of this snRNA. In contrast, we were able to establish an additional structural domain, Stem B, located between Stem A and the 3′ hairpin, that is conserved in most Hexapoda and possibly across some other major invertebrate lineages. The overall organization of the 7SK RNA secondary structure is summarized in Fig. 5, emphasizing the overall clover-leaf-like organization with the three large, conserved stem-loop structures emanating from a multiloop that encloses almost the entire molecule, with the exception of the tail containing the highly conserved 3′ stem. Despite the presence of well-conserved domains and a common overall organization, we observe that the structure of the molecule was subject to substantial changes throughout animal evolution, with highly derived homologs in Diptera and in particular in fruitflies.

Figure 5. General representation of the invertebrate 7SK RNA secondary structure. The typical length range of the major structural features is indicated, highlighting the variability of the details while the overall organization remains conserved.

The homology search step reconfirmed earlier observations [Citation5,Citation6,Citation30] regarding the difficulty of finding 7SK genes. The blastn based searches almost exclusively recognize the 5′ hairpin region as soon as phylogenetic distances reach (sub)phylum level. As expected, infernal adds at least a moderate gain in sensitivity [Citation49]. Once identified, it is surprisingly easy to recognize bona fide 7SK RNA sequences and to separate them from false positive candidates, owing to the total length of the molecule and its multiple conserved domains.

So far, no 7SK RNA has been reported for several major bilaterian clades, including Nematoda and Platyhelminthes. Within Arthropoda, the known genes are restricted to insects. The identity of a putative 7SK RNA in Caenorhabditis remains uncertain [Citation30,Citation32]. Furthermore, no 7SK homologs have been found outside the Bilateria. It remains a question for future research whether 7SK is a true innovation that appeared relatively late in animal evolution, or whether it has diverged so far from its distant ancestors that state-of-the-art homology search techniques are insufficient to trace its history back in time any further.

The variability of the 7SK RNA again highlights that ncRNAs that are subject to important constraints on both their sequence and their secondary structure can be very hard to identify with the currently available technology for homology search. Similar to telomerase RNA [Citation31], the 7SK RNA harbours rapidly evolving regions including large insertions that are difficult or sometimes impossible to align. Although substantial efforts have been expended here to overcome this issue by means of manual curation of initial alignments, the models can certainly be improved further, in particular in the variable regions. This of course also limits the sensitivity of covariance models in the homology search step.

The inspection of the 7SK genes identified here also highlights a more fundamental issue concerning the recognition of distant homologies in non-coding RNAs, namely the fact that parts of the sequence may be under very little selection pressure for both sequence and structure, thus rendering entire domains uninformative at phylogenetic ranges exceeding a few hundred million years of divergence time. For the 7SK RNA, for example, we have been unable to obtain reliable alignments of much of the region between the well-conserved 5′- and 3′-stem-loop regions. The 7SK RNA is not the only RNA family that suffers from this issue. Similar problems so far limit our ability to detect telomerase RNAs [Citation31,Citation50], RNase MRP and RNase P [Citation51]. In the latter case, pattern-based approaches have been applied quite efficiently to extract candidate sequences from genomic data [Citation52]. The development of fully automatic methods to evaluate distantly related candidates beyond considering the scoring of covariance models for individual domains, however, remains an open problem for future research.

Disclosure of potential conflicts of interest

No potential conflicts of interest were disclosed.

Additional information

Funding

Ali M. Yazbeck was funded by a doctoral stipend of the National Council for Scientific Research of Lebanon, Lebanon (CNRS-L). Machine readable data accompanying this article are available at http://www.bioinf.uni-leipzig.de/publications/supplements/17-013

References

  • Krüger W, Benecke BJ. Structural and functional analysis of a human 7 S K RNA gene. J Mol Biol. 1987;195:31–41. https://doi.org/10.1016/0022-2836(87)90325-1
  • Murphy S, Di Liegro C, Melli M. The in vitro transcription of the 7SK RNA gene by RNA polymerase III is dependent only on the presence of an upstream promoter. Cell. 1987;51:81–87. https://doi.org/10.1016/0092-8674(87)90012-2
  • Gürsoy HC, Koper D, Benecke BJ. The vertebrate 7S K RNA separates hagfish (Myxine glutinosa) and lamprey (Lampetra fluviatilis). J Mol Evol. 2000;50:456–464. https://doi.org/10.1007/s002390010048
  • Egloff S, Van Herreweghe E, Kiss T. Regulation of polymerase II transcription by 7SK snRNA: two distinct RNA elements direct P-TEFb and HEXIM1 binding. Mol Cell Biol. 2006;26:630–642. https://doi.org/10.1128/MCB.26.2.630-642.2006
  • Gruber AR, Koper-Emde D, Marz M, et al. Invertebrate 7SK snRNAs. J Mol Evol. 2008;66:107–115. https://doi.org/10.1007/s00239-007-9052-6
  • Gruber A, Kilgus C, Mosig A, et al. Arthropod 7SK RNA. Mol Biol Evol. 2008;25:1923–1930. https://doi.org/10.1093/molbev/msn140
  • Jeronimo C, Forget D, Bouchard A, et al. Systematic analysis of the protein interaction network for the human transcription machinery reveals the identity of the 7SK capping enzyme. Mol Cell. 2007;27:262–274. https://doi.org/10.1016/j.molcel.2007.06.027
  • He N, Jahchan NS, Hong E, et al. A La-related protein modulates 7SK snRNP integrity to suppress P-TEFb-dependent transcriptional elongation and tumorigenesis. Mol Cell. 2008;29:588–599. https://doi.org/10.1016/j.molcel.2008.01.003
  • Krueger BJ, Jeronimo C, Roy BB, et al. LARP7 is a stable component of the 7SK snRNP while P-TEFb, HEXIM1 and hnRNP A1 are reversibly associated. Nucleic Acids Res. 2008;36:2219–2229. https://doi.org/10.1093/nar/gkn061
  • Markert A, Grimm M, Martinez J, et al. The La-related protein LARP7 is a component of the 7SK ribonucleoprotein and affects transcription of cellular and viral polymerase II genes. EMBO Rep. 2008;9:569–575. https://doi.org/10.1038/embor.2008.72
  • Eichhorn CD, Chug R, Feigon J. hLARP7 C-terminal domain contains an xRRM that binds the 3′ hairpin of 7SK RNA. Nucleic Acids Res. 2016;44:9977–9989.
  • Brogie JE, Price DH. Reconstitution of a functional 7SK snRNP. Nucleic Acids Res. 2017;45:6864–6880. https://doi.org/10.1093/nar/gkx262
  • Xue Y, Yang Z, Chen R, et al. A capping-independent function of MePCE in stabilizing 7SK snRNA and facilitating the assembly of 7SK snRNP. Nucleic Acids Res. 2009;38:360–369. https://doi.org/10.1093/nar/gkp977
  • Cosgrove MS, Ding Y, Rennie WA, et al. The Bin3 RNA methyltransferase targets 7SK RNA to control transcription and translation. Wiley Interdiscip Rev RNA. 2012;3:633–647. https://doi.org/10.1002/wrna.1123
  • Michels AA, Fraldi A, Li Q, et al. Binding of the 7SK snRNA turns the HEXIM1 protein into a P-TEFb (CDK9/cyclin T) inhibitor. EMBO J. 2004;23:2608–2619. https://doi.org/10.1038/sj.emboj.7600275
  • Blazek D, Barboric M, Kohoutek J, et al. Oligomerization of HEXIM1 via 7SK snRNA and coiled-coil region directs the inhibition of P-TEFb. Nucleic Acids Res. 2005;33:7000–7010. https://doi.org/10.1093/nar/gki997
  • Peterlin BM, Price DH. Controlling the elongation phase of transcription with P-TEFb. Mol Cell. 2006;23:297-305.
  • Chen R, Liu M, Li H, et al. PP2B and PP1α cooperatively disrupt 7SK snRNP to release P-TEFb for transcription in response to Ca2+ signaling. Genes Dev. 2008;22:1356–1368. https://doi.org/10.1101/gad.1636008
  • Gudipaty SA, McNamara RP, Morton EL, et al. PPM1G binds 7SK RNA and Hexim1 to block P-TEFb assembly into the 7SK snRNP and sustain transcription elongation. Mol Cell Biol. 2015;35:3810–3828. https://doi.org/10.1128/MCB.00226-15
  • He WJ, Chen R, Yang Z, et al. Regulation of two key nuclear enzymatic activities by the 7SK small nuclear RNA. Cold Spring Harb Symp Quant Biol. 2006;71:301–311. https://doi.org/10.1101/sqb.2006.71.019
  • Lebars I, Martinez-Zapien D, Durand A, et al. HEXIM1 targets a repeated GAUC motif in the riboregulator of transcription 7SK and promotes base pair rearrangements. Nucleic Acids Res. 2010;38:7749–7763. https://doi.org/10.1093/nar/gkq660
  • Martinez-Zapien D, et al. The crystal structure of the 5′ functional domain of the transcription riboregulator 7SK. Nucleic Acids Res. 2017;45:3568–3579. https://doi.org/10.1093/nar/gkw1351
  • Muniz L, Egloff S, Ughy B, et al. Controlling cellular P-TEFb activity by the HIV-1 transcriptional transactivator Tat. PLoS Pathog. 2010;6:e1001152. https://doi.org/10.1371/journal.ppat.1001152
  • Bourbigot S, Dock-Bregeon AC, Eberling P, et al. Solution structure of the 5′-terminal hairpin of the 7SK small nuclear RNA. RNA. 2016;22:1844–1858.
  • Eilebrecht S, Brysbaert G, Wegert T, et al. 7SK small nuclear RNA directly affects HMGA1 function in transcription regulation. Nucleic Acids Res. 2010;39:2057–2072. https://doi.org/10.1093/nar/gkq1153
  • Eilebrecht S, Bécavin C, Léger H, et al. HMGA1-dependent and independent 7SK RNA gene regulatory activity. RNA Biol. 2011;8:143–157. https://doi.org/10.4161/rna.8.1.14261
  • Masciullo V, Baldassarre G, Pentimalli F, et al. HMGA1 protein over-expression is a frequent feature of epithelial ovarian carcinomas. Carcinogenesis. 2003;24:1191–1198. https://doi.org/10.1093/carcin/bgg075
  • Fusco A, Fedele M. Roles of HMGA proteins in cancer. Nat Rev Cancer. 2007;7:899. https://doi.org/10.1038/nrc2271
  • Eilebrecht S, Benecke BJ, Benecke A. 7SK snRNA-mediated, gene-specific cooperativity of HMGA1 and P-TEFb. RNA Biol. 2011;8:1084–1093. https://doi.org/10.4161/rna.8.6.17015
  • Marz M, Donath A, Verstraete N, et al. Evolution of 7SK RNA and its protein partners in metazoa. Mol Biol Evol. 2009;26:2821–2830. https://doi.org/10.1093/molbev/msp198
  • Xie M, Mosig A, Qi X, et al. Size variation and structural conservation of vertebrate telomerase RNA. J Biol Chem. 2008;283:2049–2059. https://doi.org/10.1074/jbc.M708032200
  • Hokii Y, Sasano Y, Sato M, et al. A small nucleolar RNA functions in rRNA processing in Caenorhabditis elegans. Nucleic Acids Res. 2010;38:5909–5918. https://doi.org/10.1093/nar/gkq335
  • Wheeler DL, Barrett T, Benson DA, et al. Database resources of the national center for biotechnology information. Nucleic Acids Res. 2007;36:D13–D21. https://doi.org/10.1093/nar/gkm1000
  • Wurm Y, Uva P, Ricci F, et al. Fourmidable: a database for ant genomics. BMC Genomics. 2009;10:5. https://doi.org/10.1186/1471-2164-10-5
  • Giraldo-Calderón GI, et al. VectorBase: an updated bioinformatics resource for invertebrate vectors and other organisms related with human diseases. Nucleic Acids Res. 2014;43:D707–D713. https://doi.org/10.1093/nar/gku1117
  • Altschul SF, Gish W, Miller W, et al. Basic local alignment search tool. J Mol Biol. 1990;215:403–410. https://doi.org/10.1016/S0022-2836(05)80360-2
  • Camacho C, Coulouris G, Avagyan V, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. https://doi.org/10.1186/1471-2105-10-421
  • Eddy SR. A memory-efficient dynamic programming algorithm for optimal alignment of a sequence to an RNA secondary structure. BMC Bioinformatics. 2002;3:18. https://doi.org/10.1186/1471-2105-3-18
  • Nawrocki EP, Eddy SR. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 2013;29:2933–2935. https://doi.org/10.1093/bioinformatics/btt509
  • Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. https://doi.org/10.1093/nar/22.22.4673
  • Lorenz R, Bernhart SH, Höner zu Siederdissen C, et al. ViennaRNA Package 2.0. Algorithms Mol Biol. 2011;6:26. https://doi.org/10.1186/1748-7188-6-26
  • Weinberg Z, Breaker RR. R2R-software to speed the depiction of aesthetic consensus RNA secondary structures. BMC Bioinformatics. 2011;12:3. https://doi.org/10.1186/1471-2105-12-3
  • Crooks GE, Hon G, Chandonia JM, et al. WebLogo: a sequence logo generator. Genome Res. 2004;14:1188–1190. https://doi.org/10.1101/gr.849004
  • Hernandez Jr G, Valafar F, Stumph WE. Insect small nuclear RNA gene promoters evolve rapidly yet retain conserved features involved in determining promoter activity and RNA polymerase specificity. Nucleic Acids Res. 2006;35:21–34. https://doi.org/10.1093/nar/gkl982
  • Griffiths-Jones S. RALEE—RNA alignment editor in Emacs. Bioinformatics. 2005;21:257–259. https://doi.org/10.1093/bioinformatics/bth489
  • Rivas E, Clements J, Eddy SR. A statistical test for conserved RNA structure shows lack of evidence for structure in lncRNAs. Nat Methods. 2017;14:45–48. https://doi.org/10.1038/nmeth.4066
  • Uchikawa E, Natchiar KS, Han X, et al. Structural insight into the mechanism of stabilization of the 7SK small nuclear RNA by LARP7. Nucleic Acids Res. 2015;43:3373–3388. https://doi.org/10.1093/nar/gkv173
  • Wassarman DA, Steitz JA. Structural analyses of the 7SK ribonucleoprotein (RNP), the most abundant human small RNP of unknown function. Mol Cell Biol. 1991;11:3432–3445. https://doi.org/10.1128/MCB.11.7.3432
  • Menzel P, Gorodkin J, Stadler PF. The tedious task of finding homologous non-coding RNA genes. RNA. 2009;15:2075–2082. https://doi.org/10.1261/rna.1556009
  • Qi X, Li Y, Honda S, et al. The common ancestral core of vertebrate and fungal telomerase RNAs. Nucleic Acids Res. 2013;41:450–462. https://doi.org/10.1093/nar/gks980
  • Woodhams MD, Stadler PF, Penny D, et al. RNase MRP and the RNA processing cascade in the eukaryotic ancestor. BMC Evol Biol. 2007;7:S13. https://doi.org/10.1186/1471-2148-7-S1-S13
  • Yusuf D, Marz M, Stadler PF, et al. Bcheck: A wrapper tool for detecting RNase P RNA genes. BMC Bioinformatics. 2010;11:432.
