Point-of-View

Poor codon optimality as a signal to degrade transcripts with frameshifts

Pages 327-333 | Received 12 Jun 2018, Accepted 07 Aug 2018, Published online: 28 Aug 2018

ABSTRACT

Frameshifting errors are common, and mRNA quality control pathways, such as nonsense-mediated decay (NMD), exist to degrade the resulting aberrant transcripts. Recent work has shown a genetic link between NMD and codon-usage-mediated mRNA decay. Here we present computational evidence that these pathways act synergistically to remove transcripts with frameshifts.

Introduction

Frameshifting errors in gene expression

All biochemical pathways are intrinsically stochastic processes. Transcription, splicing, and translation are especially error prone, with error rates 4–6 orders of magnitude higher than that of DNA polymerase [Citation1–Citation6]. Such errors can result in single-amino acid substitutions, as well as truncation of the protein due to nonsense mutations or frameshifting errors. The latter can arise from insertion and deletion events during transcription, splicing errors, and ribosomal slippage during translation (Figure 1).

Figure 1. The impact of frameshifting errors in gene expression. Gene expression can result in frameshifting errors (indicated as *) due to transcriptional insertion/deletion epimutations, errors in splicing or ribosomal slippage during translation (top). These processes potentially generate deleterious proteins, which justifies the need for mRNA quality control mechanisms in cells (bottom). In the absence of errors, mRNAs are translated, leading to physiological protein levels. The current model indicates that frameshifting errors generate Premature Termination Codons (PTC) that trigger Nonsense-Mediated Decay (NMD) of the affected transcripts, mainly because of the resulting long 3ʹUTR (in yeast). Our hypothesis is that NMD is often nonspecific for errors, so that other quality control mechanisms must exist. We note that another signal of “incorrectness” may appear in transcripts with frameshifts: a stretch of poorly optimized codons (in blue, indicating worse tRNA adaptation) between the error and the PTC. This should lead to reduced translation efficiency, mRNA decay and lower protein concentrations of the frameshifted transcript.

Frameshifts in protein-coding genes are likely to be among the most damaging events, as they result in truncated proteins which may be misfolded or form dominant negative alleles [Citation7,Citation8] (Figure 1). This justifies an evolutionary pressure for cells to maintain mRNA surveillance pathways that remove transcripts bearing frameshifts. Suppression of frameshift errors is thought to be one of the major roles of the mRNA quality control machinery [Citation9].

Nonsense-mediated decay for removing frameshifting errors

In eukaryotes, nonsense-mediated decay (NMD) is a conserved mRNA surveillance pathway that is often assumed to fulfill a frameshift-removing role [Citation10]. This follows from the observation that frameshifts generate premature termination codons (PTCs), recognition of which targets the transcript for NMD. However, the quantitative effects of NMD, when measured, are often small [Citation11,Citation12]. In addition, a large fraction of native transcripts (between 5% and 30% depending on the genome) are targeted by NMD [Citation13]. Taken together, these observations provide poor support for NMD as an effective mRNA quality control pathway.

The mechanism of NMD may be species-specific [Citation10,Citation12], and NMD has even been proposed to be a passive result of the degradation of unprotected transcripts [Citation14]. In yeast, NMD is thought to act on long 3ʹUTRs [Citation15,Citation16], so that transcripts bearing 3ʹUTRs longer than 250 nucleotides are targeted by NMD (Figure 1). Recent work has shown that this is mostly true and, importantly, that the strength of NMD depends linearly on 3ʹUTR length [Citation11] (Figure 3(b)). However, native 3ʹUTR lengths are highly variable, ranging from 0 to 1461 nucleotides [Citation17]. Frameshifts in native transcripts with short 3ʹUTRs are therefore unlikely to result in efficient NMD.

These data suggest that NMD is both inaccurate and inefficient at discriminating “correct” from “incorrect” transcripts. We propose that an efficient quality control pathway should be better able to distinguish and degrade incorrect transcripts.

Results

Codon bias and mRNA quality control

Recent work [Citation11] provides an unexpected clue towards understanding mRNA quality control. Two mechanisms of co-translational regulation, NMD and codon bias-dependent mRNA expression [Citation18,Citation19] (Figure 2(a)), are genetically linked: both pathways are regulated by the DEAD-box RNA helicase Dbp2 and by promoter architecture. A quantitative analysis of the impact of these pathways on mRNA levels gives rise to the hypothesis that they may act in a synergistic manner to remove transcripts with frameshifts. In addition to generating a PTC, frameshifts generate a second signal of “wrong transcript”: a run of normally out-of-frame codons between the frameshift and the PTC that are now translated (Figure 1). Below we provide computational support for this hypothesis.

Figure 2. The meaning of codon bias in the transcriptome. (a) Highly expressed genes are often selected to have optimized codons in agreement with the cellular tRNA pool, allowing their efficient translation (purple). This is known as “translational selection” [Citation20–Citation23]. On the other hand, genes with poor codon optimization are inefficiently translated and targeted for mRNA decay (blue) [Citation18]. (b) Top: in yeast, most native genes (purple) exhibit a tRNA Adaptation Index (tAI, as a measure of codon optimality) in the range of ORFs predicted from random transcription throughout the genome (blue). Such random ORFs simulate the absence of codon bias in terms of tRNA adaptation. A small fraction of genes have non-random tAI, which corresponds to genes “selected for translation”. Bottom: most native transcripts (purple) have high tAI, as compared to random ORFs (blue). This histogram was generated by weighting each gene by its mRNA expression level (which is exponentially distributed), and thus indicates the per-transcript distribution of tAI.

The meaning and role of codon bias

All transcriptomes exhibit imbalances in the synonymous codons used for each amino acid. Not all synonymous codons are equally abundant, a phenomenon called “codon bias” [Citation20,Citation21]. Highly expressed genes use codons translated by abundant tRNAs [Citation22] and are encoded by optimized codons (Figure 2(a)), leading to efficient protein synthesis. Highly expressed genes with efficient translation initiation but suboptimal codon usage are deleterious and affect the expression of the rest of the proteome [Citation23].

It was previously noted that the use of optimal codons increases not only protein levels, but also mRNA levels [Citation24–Citation26], suggesting that ribosome speed might regulate mRNA stability. Recently, a pathway that involves the DEAD-box RNA helicase Dhh1 was found to target transcripts with suboptimal codon usage for decay in a translation-dependent manner [Citation18,Citation27]. Even short stretches of twelve suboptimal codons reduce mRNA levels [Citation19], likely due to slower translation [Citation28].

While most genes do not have highly optimized codon usage, the majority of the yeast transcriptome is populated by highly optimized mRNAs (Figure 2(b)). The top 10% of expressed genes have highly optimized codon usage, and in yeast these genes account for 77% of the transcripts in a cell. Translational selection [Citation29] will result in optimized codon usage in constitutively highly expressed genes but will act less efficiently on genes with lower expression, on genes that are rarely expressed, and of course on out-of-frame codons.

Codon optimality for removing frameshifting errors

In addition to producing PTCs, frameshifts are likely to introduce a stretch of non-optimized codons at the 3ʹ end of the ORF (Figure 1). In genes with optimized codons, this will result in a sudden change in translation efficiency after the frameshift, which will reduce protein synthesis and target the transcript for decay. This reasoning follows from the observation that the impact of low codon optimality on translation efficiency and mRNA decay is local and can act over as few as twelve codons [Citation19,Citation28]. The magnitude of the decrease in codon optimality will be highest for transcripts with high codon optimization (most of the mRNAs in the cell (Figure 2(b))), which correspond to highly expressed genes that likely bear most of the frameshifts (assuming a uniform distribution of errors across transcripts [Citation1]). Our hypothesis is that frameshift-removing mechanisms are especially relevant for such highly expressed genes. Furthermore, the impact of low codon optimality is higher close to the 3ʹ end of the mRNA [Citation30]. In the case of a frameshift, the enrichment of non-optimal codons should be towards the end of the ORF, which predicts that the destabilizing effect will be even stronger.

Figure 3. Codon bias can implement quality control of mRNAs with frameshifts. (a) tAI follows a negative sigmoidal relationship with mRNA expression levels. Expression was calculated as the log2-ratio between mRNA and DNA abundance of a synthetic ORF library of random fragments from the yeast genome, expressed in a plasmid [Citation11]. The dashed line represents the threshold below which decreasing tAI reduces expression. (b) NMD strength follows a positive linear relationship with 3ʹUTR length. NMD was measured as the log2-ratio of expression (calculated as in (a)) between identical ORF libraries built in a Δupf1 or a wt strain [Citation11]. This ratio indicates the impact of NMD for each sequence in the library (which has variable 3ʹUTR lengths), as UPF1 is responsible for NMD [Citation10]. The dashed line represents the threshold above which increasing 3ʹUTR length generates NMD (positive values on the Y axis). (c) A pipeline for predicting the impact of NMD and codon bias on frameshift quality control. As an example of frameshift, we simulated 10⁵ random single-base deletions on native transcripts. Each gene includes a number of mutations proportional to its expression level. For each error (and corresponding native transcript) we calculated tAI between the frameshift and the PTC (local tAI) and the resulting 3ʹUTR length. We used these as measures of the impact of the error on translation efficiency and/or NMD targeting. (d) Transcripts with frameshifts (blue) have lower tAI (top) and longer 3ʹUTRs (bottom), when compared to native mRNAs (purple). The dashed lines represent the thresholds described in (a) and (b).

To compare the roles of NMD and codon bias in mRNA quality control we ran a frameshift-introducing simulation on yeast transcripts. We generated random single-base deletions in native transcripts and calculated codon optimality (tRNA adaptation index, tAI [Citation31]) and 3ʹUTR length with and without the frameshift. Because errors occur on a per-transcript basis, each gene received a number of errors proportional to its mRNA expression level (Figure 3(c)).

We found that almost all frameshifts produce a large decrease in tAI after the mutation (Figure 3(d)). The shift in tAI caused by frameshifts falls in a range that decreases mRNA levels [Citation11] (Figure 3(a)). In contrast, ~50% of errors produce 3ʹUTRs within the range of native 3ʹUTR lengths (Figure 3(d)), which are likely unaffected by NMD (Figure 3(b)). These findings indicate that selection for codon optimality (which acts on highly expressed genes) can be a robust way to define “correct transcripts” and thus to remove transcripts that contain frameshifts.

Discussion

Cells need to remove transcripts with errors; mutants with increased error rates, or that are unable to remove transcripts with errors, grow slowly [Citation1,Citation32]. Frameshift errors are likely to be deleterious, both because they generate deleterious protein isoforms and because suboptimal codons titrate away both tRNAs and ribosomes [Citation23,Citation33]. However, both the sequence features that cells recognize and the mechanisms by which they do so remain poorly understood. Many open questions remain.

NMD is weak [Citation11,Citation12] and affects 5–20% of the native transcriptome [Citation13], so it may be both inefficient and nonspecific for removing errors. Removing transcripts with low codon optimality may be more accurate and efficient. This is consistent with the fact that NMD strength follows a linear relationship with 3ʹUTR length, while codon optimality has a sigmoidal impact on expression (Figure 3(a) and (b)). Small changes in codon optimality can therefore lead to a large decrease in expression.

We observe that ~50% of frameshifts generate 3ʹUTRs within the range of native transcripts, and are thus likely unaffected by NMD. This exemplifies how a purely qualitative model (“NMD removes frameshifts because these have longer 3ʹUTRs”) can fail to predict the quantitative behavior of a system.

Our recent work suggests a genetic link between codon bias and NMD [Citation11]. Here we report a possible explanation for this interaction, but the impact of both processes on measured expression levels remains to be determined. The mechanism of this link also remains to be established.

In frameshifted mRNAs, the quantitative impact of low-tAI stretches of the ORF on expression remains elusive. It will be interesting to see whether they account for more or less quality control than NMD. In addition, the effect of codon bias on expression is expected to impact protein levels [Citation20,Citation23], not only mRNA levels. This predicts that the impact of codon bias on expression is higher than reported here (Figure 3), which is not true for NMD. This could explain why many splice isoforms with PTCs, which may arise from frameshifting splicing errors, are observed in humans. NMD does not remove them (as we can detect them), but it is likely that they have lower codon adaptation and reduced protein levels.

Finally, this work raises a possible explanation for an adaptive benefit of imbalanced tRNA repertoires [Citation22], which would confer the ability to degrade transcripts that are not supposed to be highly expressed. It is almost certain that cells avoid selecting for the expression of ORFs with a random composition of codons. Frameshifts generate such random stretches, which are likely targeted for decay. Thus, there may be an evolutionary pressure for imbalanced tRNA repertoires to ensure proper mechanisms of mRNA quality control. It will be interesting to determine whether this process has driven the evolution of codon bias and codon-usage-associated mRNA stability, or whether it is a passive result of the fact that almost any frameshift will reduce the optimality of already highly optimized genes.

Methods

Codon bias measurements

Codon bias was approximated by calculating the tRNA adaptation index (tAI) [Citation31] for each open reading frame (ORF, either native to the yeast transcriptome or simulated). To generate random ORFs we simulated random transcription start sites (TSS) across the whole genome of Saccharomyces cerevisiae [Citation32] and generated the ORF starting at the first ATG downstream of each TSS. tAI was calculated for each of these ORFs in order to measure the codon bias of random coding sequences.
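The two steps described above can be sketched as follows. This is a minimal illustration, not the code from the repository: it scans only the forward strand, ends each ORF at the first in-frame stop codon, and treats tAI as the geometric mean of per-codon weights over all codons of the ORF. The function names and the `weights` dictionary (codon to relative adaptiveness) are hypothetical; real weights would have to be derived from the tRNA gene repertoire.

```python
import math
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(genome, tss):
    """Return the ORF starting at the first ATG at or after `tss`.

    The ORF ends just before the first in-frame stop codon.
    Returns None if there is no ATG or no in-frame stop.
    """
    start = genome.find("ATG", tss)
    if start == -1:
        return None
    for i in range(start, len(genome) - 2, 3):
        if genome[i:i + 3] in STOP_CODONS:
            return genome[start:i]
    return None


def tai(orf, weights):
    """Geometric mean of the per-codon weights of an ORF (assumed tAI form)."""
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    if not codons:
        return float("nan")
    mean_log = sum(math.log(weights[c]) for c in codons) / len(codons)
    return math.exp(mean_log)


def random_orfs(genome, n, seed=0):
    """Draw n random TSS positions and collect the ORFs they give rise to."""
    rng = random.Random(seed)
    orfs = []
    for _ in range(n):
        orf = first_orf(genome, rng.randrange(len(genome)))
        if orf is not None:
            orfs.append(orf)
    return orfs
```

Computing `tai` over the output of `random_orfs` gives a background distribution against which the tAI of native ORFs can be compared.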

mRNA expression weighting

In order to approximate per-transcript distributions (of tAI and 3ʹUTR length) we weighted each gene by the sum of the TPM expression obtained from multiple RNA-seq experiments (generated in [Citation3]). This means that each gene has a weight in the distribution which is proportional to its mRNA expression.
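As a rough illustration of this weighting, the sketch below sums TPM values across experiments to obtain one weight per gene, and then compares a per-gene statistic with its per-transcript (expression-weighted) counterpart. The function names, the toy numbers and the 0.4 threshold are hypothetical and serve only to show the effect of weighting.

```python
def gene_weights(tpm_by_experiment):
    """Sum TPM across experiments: {gene: [tpm, ...]} -> {gene: weight}."""
    return {gene: sum(tpms) for gene, tpms in tpm_by_experiment.items()}


def weighted_fraction_below(values, weights, threshold):
    """Fraction of the total weight carried by items with value < threshold."""
    total = sum(weights)
    below = sum(w for v, w in zip(values, weights) if v < threshold)
    return below / total


# Toy data: one highly expressed, codon-optimized gene and two weakly
# expressed genes with low tAI (all numbers are made up).
tpm = {"geneA": [900, 1100], "geneB": [10, 30], "geneC": [5, 15]}
tai_by_gene = {"geneA": 0.6, "geneB": 0.30, "geneC": 0.35}

weights = gene_weights(tpm)
genes = list(tai_by_gene)
values = [tai_by_gene[g] for g in genes]

per_gene = weighted_fraction_below(values, [1] * len(genes), 0.4)
per_transcript = weighted_fraction_below(values, [weights[g] for g in genes], 0.4)
```

In this toy case two of three genes have low tAI, but they contribute only a small share of the transcripts, which is the distinction between the per-gene and per-transcript distributions.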

Relationship between ORF features and expression

We obtained data on the relationship between ORF features (3ʹUTR length and tAI) and mRNA expression from an existing dataset [Citation11], which includes expression measurements for a library of ~10,000 ORFs randomly generated from the yeast genome. To determine the impact of 3ʹUTR length on NMD, we generated the same library in a UPF1 deletion strain, as described before [Citation11].
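For orientation, the two quantities used here can be written out explicitly. The sketch below follows the definitions given in the legend of Figure 3: expression is the log2-ratio of mRNA to DNA abundance, and NMD strength is the difference in that log2 expression between the Δupf1 and wt libraries. The function and argument names are hypothetical, and any normalization or pseudocounts applied to the real data are not shown.

```python
import math


def expression(mrna_abundance, dna_abundance):
    """log2-ratio of mRNA to DNA abundance for one library member."""
    return math.log2(mrna_abundance / dna_abundance)


def nmd_strength(mrna_upf1, dna_upf1, mrna_wt, dna_wt):
    """log2 expression in the upf1 deletion strain minus log2 expression in wt.

    Positive values mean the sequence is more abundant when NMD is
    disabled, i.e. it is an NMD target in the wt strain.
    """
    return expression(mrna_upf1, dna_upf1) - expression(mrna_wt, dna_wt)
```

For example, a sequence that is four times more abundant (relative to its DNA) in the deletion strain than in wt has an NMD strength of 2 in these units.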

Simulating frameshifts

As an example of frameshift, we simulated 10⁵ random single-base deletions on yeast native transcripts. Each gene includes a number of mutations proportional to its expression level (as explained in mRNA expression weighting). For each error (and corresponding native transcript) we calculated tAI between the frameshift and the PTC (local tAI) and the resulting 3ʹUTR length.
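A simulation of this kind could look like the sketch below. It is an illustration under stated assumptions rather than the repository code: each transcript is given as a coding sequence (start codon through native stop) plus its 3ʹUTR, deletions are placed only within the coding sequence after the start codon, local tAI is taken as the geometric mean of hypothetical per-codon `weights`, and events with no in-frame stop before the end of the transcript are skipped. All function names are hypothetical.

```python
import math
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}


def frameshift_outcome(cds, utr3, pos, weights):
    """Delete the base at `pos` of the CDS and follow the new reading frame.

    Returns (local_tai, new_utr3_length), where local_tai covers the codons
    from the deletion to the premature stop, or None if no stop is reached.
    """
    mrna = cds[:pos] + cds[pos + 1:] + utr3
    first_shifted = (pos // 3) * 3          # codon that contains the deletion
    local = []
    for i in range(first_shifted, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            new_utr = len(mrna) - (i + 3)
            if not local:                   # stop created at the deletion itself
                return float("nan"), new_utr
            mean_log = sum(math.log(weights[c]) for c in local) / len(local)
            return math.exp(mean_log), new_utr
        local.append(codon)
    return None


def simulate_frameshifts(transcripts, expression, weights, n, seed=0):
    """Draw n deletions, choosing genes in proportion to their expression."""
    rng = random.Random(seed)
    genes = list(transcripts)
    picks = rng.choices(genes, weights=[expression[g] for g in genes], k=n)
    results = []
    for gene in picks:
        cds, utr3 = transcripts[gene]
        pos = rng.randrange(3, len(cds))    # assumption: spare the start codon
        outcome = frameshift_outcome(cds, utr3, pos, weights)
        if outcome is not None:
            results.append((gene, pos) + outcome)
    return results
```

Comparing the resulting local tAI values and 3ʹUTR lengths with those of the unmodified transcripts gives the two distributions that the analysis contrasts.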

Data availability

All code and data are at https://github.com/MikiSchikora/CodonBias_QualityControl

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by Ministerio de Economía y Competitividad (MINECO) and the Fondo Europeo de Desarrollo Regional (FEDER) BFU2015-68351-P (MINECO/ERDF EU), AGAUR [2014SGR0974 and 2017SGR1054] and the Unidad de Excelencia María de Maeztu, funded by MINECO [MDM-2014-0370].

References

  • Gout J-F, Li W, Fritsch C, et al. The landscape of transcription errors in eukaryotic cells. Sci Adv. 2017;3:e1701484.
  • Gout J-F, Thomas WK, Smith Z, et al. Large-scale detection of in vivo transcription errors. Proc Natl Acad Sci USA. 2013;110:18584–18589.
  • Carey LB. RNA polymerase errors cause splicing defects and can be regulated by differential expression of RNA polymerase subunits. Elife [Internet]. 2015;4. DOI:10.7554/eLife.09945
  • Milo R, Jorgensen P, Moran U, et al. BioNumbers–the database of key numbers in molecular and cell biology. Nucleic Acids Res. 2010;38:D750–3.
  • Imashimizu M, Oshima T, Lubkowska L, et al. Direct assessment of transcription fidelity by high-resolution RNA sequencing. Nucleic Acids Res. 2013;41:9090–9104.
  • Fox-Walsh KL, Hertel KJ. Splice-site pairing is an intrinsically high fidelity process. Proc Natl Acad Sci USA. 2009;106:1766–1771.
  • Weterman MAJ, Sorrentino V, Kasher PR, et al. A frameshift mutation in LRSAM1 is responsible for a dominant hereditary polyneuropathy. Hum Mol Genet. 2012;21:358–370.
  • Sadhu MJ, Bloom JS, Day L, et al. Highly parallel genome variant engineering with CRISPR–Cas9. Nat Genet. 2018;50:510–514.
  • Isken O, Maquat LE. Quality control of eukaryotic mRNA: safeguarding cells from abnormal mRNA function. Genes Dev. 2007;21:1833–1856.
  • Behm-Ansmant I, Kashima I, Rehwinkel J, et al. mRNA quality control: an ancient machinery recognizes and degrades mRNAs with nonsense codons. FEBS Lett. 2007;581:2845–2853.
  • Espinar L, Schikora Tamarit MÀ, Domingo J, et al. Promoter architecture determines cotranslational regulation of mRNA. Genome Res [Internet]. 2018. DOI:10.1101/gr.230458.117
  • Lindeboom RGH, Supek F, Lehner B. The rules and impact of nonsense-mediated mRNA decay in human cancers. Nat Genet [Internet]. 2016. DOI:10.1038/ng.3664
  • Peccarelli M, Kebaara BW. Regulation of natural mRNAs by the nonsense-mediated mRNA decay pathway. Eukaryot Cell. 2014;13:1126–1135.
  • Brogna S, McLeod T, Petric M. The meaning of NMD: translate or perish. Trends Genet. 2016;32:395–407.
  • Zhang J, Sun X, Qian Y, et al. At least one intron is required for the nonsense-mediated decay of triosephosphate isomerase mRNA: a possible link between nuclear splicing and cytoplasmic translation. Mol Cell Biol. 1998;18:5272–5283.
  • Amrani N, Ganesan R, Kervestin S, et al. A faux 3ʹ-UTR promotes aberrant termination and triggers nonsense-mediated mRNA decay. Nature. 2004;432:112–118.
  • Nagalakshmi U, Wang Z, Waern K, et al. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science. 2008;320:1344–1349.
  • Radhakrishnan A, et al. The DEAD-box protein Dhh1p couples mRNA decay and translation by monitoring codon optimality. Cell. 2016;167:122–132.e9.
  • Chen S, Li K, Cao W, et al. Codon-resolution analysis reveals a direct and context-dependent impact of individual synonymous mutations on mRNA level. Mol Biol Evol. 2017;34:2944–2958.
  • Ikemura T. Codon usage and tRNA content in unicellular and multicellular organisms. Mol Biol Evol. 1985;2:13–34.
  • Grantham R. Working of the genetic code. Trends Biochem Sci. 1980;5:327–331.
  • Novoa EM, Ribas de Pouplana L. Speeding with control: codon usage, tRNAs, and ribosomes. Trends Genet. 2012;28:574–581.
  • Frumkin I, et al. Codon usage of highly expressed genes affects proteome-wide translation efficiency. Proc Natl Acad Sci USA. 2018;115:E4940–E4949.
  • Te’o VS, Cziferszky AE, Bergquist PL, et al. Codon optimization of xylanase gene xynB from the thermophilic bacterium Dictyoglomus thermophilum for expression in the filamentous fungus Trichoderma reesei. FEMS Microbiol Lett. 2000;190:13–19.
  • Presnyak V, Alhusaini N, Chen Y-H, et al. Codon optimality is a major determinant of mRNA stability. Cell. 2015;160:1111–1124.
  • Boël G, Letso R, Neely H, et al. Codon influence on protein expression in E. coli correlates with mRNA levels. Nature. 2016;529:358–363.
  • Harigaya Y, Parker R. Codon optimality and mRNA decay. Cell Res. 2016;26:1269–1270.
  • Yu C-H, Dang Y, Zhou Z, et al. Codon usage influences the local rate of translation elongation to regulate co-translational protein folding. Mol Cell. 2015;59:744–754.
  • Akashi H, Eyre-Walker A. Translational selection and molecular evolution. Curr Opin Genet Dev. 1998;8:688–693.
  • Mishima Y, Tomari Y. Codon usage and 3ʹ UTR length determine maternal mRNA stability in Zebrafish. Mol Cell. 2016;61:874–885.
  • dos Reis M, Wernisch L, Savva R. Unexpected correlations between gene expression and codon usage bias from microarray data for the whole Escherichia coli K-12 genome. Nucleic Acids Res. 2003;31:6976–6985.
  • Cherry JM, Hong EL, Amundsen C, et al. Saccharomyces genome database: the genomics resource of budding yeast. Nucleic Acids Res. 2012;40:D700–5.
  • Shah P, Ding Y, Niemczyk M, et al. Rate-limiting steps in yeast protein translation. Cell. 2013;153:1589–1601.
