Commentary

RS and RGG repeats as primitive proteins at the transition between the RNA and RNP worlds

Pages 4-5 | Published online: 01 Jan 2012

Abstract

For many experimental biologists in the field of nuclear cell biology, low complexity repeats in nuclear proteins constitute a nuisance. They are difficult to express, impossible to crystallize and have low but nearly ubiquitous unwanted affinities for many biomolecules. Examples of such nuclear protein repeats are RS (Arg-Ser) repeats in splicing factors, RGG (Arg-Gly-Gly) repeats in hnRNP proteins and FG (Phe-Gly) repeats in nuclear pore components. Here, I would like to present a more positive perspective on at least a subset of these repeats and suggest that they are excellent candidates for the first proteins to have emerged from an RNA world.

The RNA world hypothesis supposes forms of life that use RNA for both coding and catalytic activity.Citation1,Citation2 It has been proposed that primordial proteins functioned as RNA chaperones, aiding RNA folding.Citation3 RS and RGG domains are low in complexity, predicted to be unstructured,Citation4 subject to simple regulation and directly involved in RNA metabolism. Proteins that contain simple unstructured repeats have been found to act as RNA chaperones,Citation5,Citation6 and the intrinsically unstructured regions of several RNA chaperones have been proposed to possess such chaperone activity.Citation4,Citation5 RS domains consist of several (5–20) copies of the RS dipeptide, irregularly interrupted by other amino acids. RS domains can directly bind RNA and themselvesCitation7,Citation8 in a rather non-specific manner. Mutagenesis analysis indicated that the exact number and spacing of these repeats are not important for their function.Citation9 RS domain self-association is promoted by phosphorylation at serine residues.Citation7,Citation8 Initially, affinity for single-stranded RNA was found to be reduced by SR phosphorylation,Citation7,Citation8 presumably through ionic repulsion between the negatively charged RNA and the phosphoserines. More recent data, however, indicate that RS domains also possess an intrinsic affinity for double-stranded RNA and that their function in splicing requires serine phosphorylation,Citation10 possibly to discriminate properly between single-stranded and double-stranded RNA. The RS domain is predicted to be unstructuredCitation11 and is found mainly in proteins involved in pre-mRNA splicing. However, consistent with an early role in RNA metabolism, RS domains are also present in proteins that function in other sectors of RNA metabolism such as transcription elongation, polyadenylation, translationCitation12,Citation13 and microRNA processing.Citation14 Analogous to the current role of RS domains in splicing,Citation10,Citation15 a function of RS domains in RNA-based forms of life may have been to promote base-pairing between different RNAs that were subsequently transesterified by RNA-based catalysis.

Arguing against an early function is that so far RS domains have been identified mainly in (higher) eukaryotes. However, their possible absence in bacteria and archaea may reflect that r-selected prokaryotes have replaced several RNP-catalyzed processes with the more efficient protein-only catalysis,Citation16 limiting the need for this type of RNA chaperone activity. In that sense budding yeast may represent an intermediate in metabolic advancement, with only one clear RS-domain splicing factor, Npl3.Citation17 Advantages of maintaining the less efficient RNP catalysis in higher eukaryotes may include multilevel control or higher combinatorial potential. Also, the inefficiencies may not be rate-limiting and may even be trivial compared with other activities such as transcribing large introns and maintaining a largely non-protein-coding genome.

RGG domains, irregular repeats of arginine-glycine-glycine, are present in RNA binding proteins such as hnRNP proteins.Citation18 RGG domains directly interact with RNA and are also predicted to be unstructured.Citation4 They are subject to a rather unusual post-translational modification, methylation of arginine. Importantly, similar to phosphorylation of RS repeats, arginine methylation may be reversible,Citation19 and methylated RGG domains are altered in their RNA binding.Citation20
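
To make concrete how little sequence information such domains carry, the following minimal Python sketch counts RS dipeptides and RGG tripeptides in a one-letter amino acid string. The function names and the toy sequence are hypothetical and chosen only for illustration; the sequence is not a real protein, and simple motif counting is an assumption of this sketch, not a method used in the work cited here.

```python
import re


def count_motif(sequence: str, motif: str) -> int:
    """Count non-overlapping occurrences of a motif in a one-letter sequence."""
    return len(re.findall(re.escape(motif), sequence.upper()))


def repeat_summary(sequence: str) -> dict:
    """Return the number of RS dipeptides and RGG tripeptides in a sequence."""
    return {
        "RS": count_motif(sequence, "RS"),
        "RGG": count_motif(sequence, "RGG"),
    }


# Hypothetical toy sequence (not a real protein): an RS-rich stretch
# irregularly interrupted by other residues, followed by two RGG repeats.
toy_sequence = "RSRSPRSRSRDRSGGRGGFRGGY"

if __name__ == "__main__":
    print(repeat_summary(toy_sequence))
```

A count of this kind says nothing about function; it only illustrates that the motifs are short enough to be recognized without any notion of folded structure.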

Several predictions arise from this view of simple repeat proteins, one of which is that RS serine ribozyme kinases and phosphatases and RGG arginine ribozyme methyltransferases and demethylases must have existed, remnants of which may still be present in modern-day organisms. Interestingly, RNA control of serine phosphorylation has been found for the C-terminal domain (CTD) of RNA polymerase II,Citation21 where the non-coding RNA 7SK functions as an inhibitor of the CTD serine 2 kinase P-TEFb. The CTD consists of ~20–50 copies of a seven-amino-acid repeat, is intrinsically unstructured and lacks catalytic activity. The consensus repeat sequence is Tyr-Ser-Pro-Thr-Ser-Pro-Ser, but non-consensus repeats of the CTD contain two arginine and seven lysine substitutions. Notably, the arginines are modified by site-specific methylation,Citation22 which plays a role in snRNA and snoRNA expression. Possibly, the origins of the CTD also predate protein-based catalysis. Another prediction is that some RS or RGG repeats may function in RNA metabolism in prokaryotes. Lastly, these simple repeat proteins can conceivably be produced without very precise translation machinery and with an incomplete genetic code. In this light it is interesting that arginine strongly and specifically binds to its own codons in RNA.Citation23

In summary, nuclear cell biologists should view simple repeat protein domains not, or not only, as troublesome regions to be masked in database searches, but as interesting protein domains that may shed light on early cellular evolution.

Acknowledgments

I’d like to thank Karla Neugebauer for discussions and colleagues for encouragement to submit this manuscript, versions of which have been idly spinning on disks for many years.

References

  1. Crick FH. The origin of the genetic code. J Mol Biol 1968; 38:367-79; http://dx.doi.org/10.1016/0022-2836(68)90392-6; PMID: 4887876
  2. Orgel LE. Evolution of the genetic apparatus. J Mol Biol 1968; 38:381-93; http://dx.doi.org/10.1016/0022-2836(68)90393-8; PMID: 5718557
  3. Herschlag D. RNA chaperones and the RNA folding problem. J Biol Chem 1995; 270:20871-4; PMID: 7545662
  4. Tompa P, Csermely P. The role of structural disorder in the function of RNA and protein chaperones. FASEB J 2004; 18:1169-75; http://dx.doi.org/10.1096/fj.04-1584rev; PMID: 15284216
  5. Rajkowitsch L, Chen D, Stampfl S, Semrad K, Waldsich C, Mayer O, et al. RNA chaperones, RNA annealers and RNA helicases. RNA Biol 2007; 4:118-30; http://dx.doi.org/10.4161/rna.4.3.5445; PMID: 18347437
  6. Schroeder R, Barta A, Semrad K. Strategies for RNA folding and assembly. Nat Rev Mol Cell Biol 2004; 5:908-19; http://dx.doi.org/10.1038/nrm1497; PMID: 15520810
  7. Xiao SH, Manley JL. Phosphorylation of the ASF/SF2 RS domain affects both protein-protein and protein-RNA interactions and is necessary for splicing. Genes Dev 1997; 11:334-44; http://dx.doi.org/10.1101/gad.11.3.334; PMID: 9030686
  8. Tacke R, Chen Y, Manley JL. Sequence-specific RNA binding by an SR protein requires RS domain phosphorylation: creation of an SRp40-specific splicing enhancer. Proc Natl Acad Sci USA 1997; 94:1148-53; http://dx.doi.org/10.1073/pnas.94.4.1148; PMID: 9037021
  9. Valcárcel J, Green MR. The SR protein family: pleiotropic functions in pre-mRNA splicing. Trends Biochem Sci 1996; 21:296-301; PMID: 8772383
  10. Shen H, Green MR. RS domains contact splicing signals and promote splicing by a common mechanism in yeast through humans. Genes Dev 2006; 20:1755-65; http://dx.doi.org/10.1101/gad.1422106; PMID: 16766678
  11. Haynes C, Iakoucheva LM. Serine/arginine-rich splicing factors belong to a class of intrinsically disordered proteins. Nucleic Acids Res 2006; 34:305-12; http://dx.doi.org/10.1093/nar/gkj424; PMID: 16407336
  12. Zhong XY, Wang P, Han J, Rosenfeld MG, Fu XD. SR proteins in vertical integration of gene expression from transcription to RNA processing to translation. Mol Cell 2009; 35:1-10; http://dx.doi.org/10.1016/j.molcel.2009.06.016; PMID: 19595711
  13. Huang Y, Steitz JA. SRprises along a messenger's journey. Mol Cell 2005; 17:613-5; http://dx.doi.org/10.1016/j.molcel.2005.02.020; PMID: 15749011
  14. Wu H, Sun S, Tu K, Gao Y, Xie B, Krainer AR, et al. A splicing-independent function of SF2/ASF in microRNA processing. Mol Cell 2010; 38:67-77; http://dx.doi.org/10.1016/j.molcel.2010.02.021; PMID: 20385090
  15. Shen H, Green MR. RS domain-splicing signal interactions in splicing of U12-type and U2-type introns. Nat Struct Mol Biol 2007; 14:597-603; http://dx.doi.org/10.1038/nsmb1263; PMID: 17603499
  16. Poole A, Jeffares D, Penny D. Early evolution: prokaryotes, the new kids on the block. Bioessays 1999; 21:880-9; http://dx.doi.org/10.1002/(SICI)1521-1878(199910)21:10<880::AID-BIES11>3.0.CO;2-P; PMID: 10497339
  17. Gilbert W, Siebel CW, Guthrie C. Phosphorylation by Sky1p promotes Npl3p shuttling and mRNA dissociation. RNA 2001; 7:302-13; http://dx.doi.org/10.1017/S1355838201002369; PMID: 11233987
  18. Mattaj IW. RNA recognition: a family matter? Cell 1993; 73:837-40; http://dx.doi.org/10.1016/0092-8674(93)90265-R; PMID: 8500177
  19. Chang B, Chen Y, Zhao Y, Bruick RK. JMJD6 is a histone arginine demethylase. Science 2007; 318:444-7; http://dx.doi.org/10.1126/science.1145801; PMID: 17947579
  20. Blackwell E, Zhang X, Ceman S. Arginines of the RGG box regulate FMRP association with polyribosomes and mRNA. Hum Mol Genet 2010; 19:1314-23; http://dx.doi.org/10.1093/hmg/ddq007; PMID: 20064924
  21. Yang Z, Zhu Q, Luo K, Zhou Q. The 7SK small nuclear RNA inhibits the CDK9/cyclin T1 kinase to control transcription. Nature 2001; 414:317-22; http://dx.doi.org/10.1038/35104575; PMID: 11713532
  22. Sims RJ 3rd, Rojas LA, Beck D, Bonasio R, Schuller R, Drury WJ 3rd, et al. The C-terminal domain of RNA polymerase II is modified by site-specific methylation. Science 2011; 332:99-103; http://dx.doi.org/10.1126/science.1202663; PMID: 21454787
  23. Knight RD, Landweber LF. Rhyme or reason: RNA-arginine interactions and the genetic code. Chem Biol 1998; 5:R215-20; http://dx.doi.org/10.1016/S1074-5521(98)90001-1; PMID: 9751648
