Research Paper

Programmed ribosomal frameshifting in the expression of the regulator of intestinal stem cell proliferation, adenomatous polyposis coli (APC)

Pages 637-647 | Received 16 Jan 2011, Accepted 04 Mar 2011, Published online: 01 Jul 2011
 

Abstract

A programmed ribosomal frameshift (PRF) in the decoding of APC (adenomatous polyposis coli) mRNA has been identified and characterized in Caenorhabditis worms, Drosophila and mosquitoes. The frameshift product lacks approximately the C-terminal one-third of the product of standard decoding and instead ends in a short sequence encoded by the −1 frame, which is just 13 residues long in C. elegans but 125 in D. melanogaster. The frameshift site is A_AA.A_AA.C in Caenorhabditids, fruit flies and the mosquitoes studied, while a variant, A_AA.A_AA.A, is found in some other nematodes. The predicted RNA secondary structure of the downstream stimulators varies considerably among the species studied. In the twelve sequenced Drosophila genomes, it is a long stem with a four-way junction in its loop. In the five sequenced Caenorhabditis species, it is a short RNA pseudoknot with an additional stem in loop 1. When tested with reporter constructs in rabbit reticulocyte lysates, the efficiency of frameshifting varies significantly depending on the particular stimulator within the frameshift cassette. Phylogenetic analysis of the distribution of APC PRF cassettes suggests that the frameshift has an ancient origin and raises the question of whether alternative protein products are synthesized during expression of APC in other organisms, such as humans. APC emerged as a PRF candidate from a prior study of evolutionary signatures derived from comparative analysis of the 12 fly genomes. Three other proposed PRF candidates (Xbp1, CG32736, CG14047) with switches in conservation of reading frames are likely explained by mechanisms other than PRF.
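As an illustration of the shift-site notation used in the abstract, the following is a minimal Python sketch, not the authors' method. It assumes that in A_AA.A_AA.C the underscores mark zero-frame codon boundaries and the dots mark −1-frame boundaries, so the heptamer AAAAAAC (or the variant AAAAAAA) must begin one nucleotide before a zero-frame codon. The function names and the toy sequence are hypothetical, and the model of decoding after the shift (reading resumes in the −1 frame at the last nucleotide of the heptamer) is a simplification.

```python
import re

STOPS = {"TAA", "TAG", "TGA"}
# Overlapping search for AAAAAAC and the AAAAAAA variant (assumed DNA alphabet).
SLIPPERY = re.compile(r"(?=(AAAAAA[AC]))")

def find_slippery_sites(cds):
    """Return 0-based starts of heptamers positioned as A_AA.A_AA.N,
    i.e. whose second nucleotide begins a zero-frame codon."""
    cds = cds.upper().replace("U", "T")
    return [m.start() for m in SLIPPERY.finditer(cds) if m.start() % 3 == 2]

def minus_one_extension(cds, site):
    """Codons read in the -1 frame after a shift at `site`,
    up to (not including) the first -1 frame stop codon."""
    cds = cds.upper().replace("U", "T")
    codons = []
    for i in range(site + 6, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in STOPS:
            break
        codons.append(codon)
    return codons

# Toy sequence (hypothetical): zero frame reads ATG GCA AAA AAC TGG ATT AAC ...
toy = "ATGGCAAAAAACTGGATTAACCC"
sites = find_slippery_sites(toy)
extension = minus_one_extension(toy, sites[0])
```

The length of the list returned by `minus_one_extension` corresponds, under these assumptions, to the length of the C-terminal extension that the abstract reports as varying between species.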

Acknowledgements

The authors would like to thank (1) Dr. Avril Coghlan for useful suggestions related to identification of APC sequences in nematodes and for her help with TreeFam and (2) Dr. Erik Jorgensen for drawing our attention to the location (with respect to the frameshift site) of the SxIP motif in C. elegans and its potential implication in binding EB1.

Financial Support

This work was supported by Science Foundation Ireland (grant numbers 06/IN.1/B81 to P.V.B. and 08/IN.1/B1889 to J.F.A.), the Wellcome Trust (grant number 088789 to A.E.F.) and the NIH (grant number HG004164 to G.M.).

Figures and Tables

Figure 1 Domain organization of APC proteins. Highlighted are conserved PFAM domains identified by a PFAM search (ref. 76). The color codes are as follows: blue-PF00514 (armadillo), brown-PF05972, yellow-PF05923 (cysteine-rich regions), purple-PF05924 (SAMP), orange-PF05956 (basic) and green-PF05937 (EB1 binding domain). Frameshift sites are shown with an L-like red shape and indicate the length of the C-terminal extension after PRF in each case.

Figure 2 Multiple alignment of fly and malaria mosquito APC gene sequences corresponding to the region translated in the −1 frame. Heptameric frameshift sites are shown in yellow. Nucleotides involved in base-pairing interactions within the same double-stranded regions of the predicted secondary structure are differentially colored, e.g., the second stem of the predicted pseudoknot in the malaria mosquito (anoGam1) is shown in red. Secondary structures are also shown in bracket format below the alignment. The first bracket row corresponds to the four-way junction in fruit flies. The second bracket row corresponds to the malaria mosquito pseudoknot.

Figure 3 Predicted stimulatory PRF secondary RNA structures in flies and mosquitoes. The sequences shown include the A AAA AAC slippery pattern (with codons in the zero frame separated by spaces) at the 5′ end and the entire mRNA regions forming the predicted stimulatory structures 3′ from the shift site. Latin names of the organisms are indicated.

Figure 4 Coding potential statistics for Caenorhabditis APC. (1) Map of the APC coding sequence showing the predicted frameshift ORF. (2–4) The positions of stop codons in each of the three forward reading frames. The +0 frame corresponds to APC and is therefore devoid of stop codons. (5 and 6) Conservation at synonymous sites within APC (see ref. 30 for details). (5) depicts the probability that the degree of conservation within a given window could be obtained under a null model of neutral evolution at synonymous sites, while (6) depicts the absolute amount of conservation as represented by the ratio of the observed number of substitutions within a given window to the number expected under the null model. (7–9) MLOGD sliding window plots (see ref. 61 for details). The null model, in each window, is that the sequence is non-coding, while the alternative model is that the sequence is coding in the given reading frame. Positive scores favor the alternative model and, as expected, in the +0 frame (7) there is a strong coding signature throughout APC. In the +1 and +2 frames (8 and 9), scores are generally negative with some random scatter into low positive scores, except in the region occupied by the predicted frameshift ORF, where there is a strong positive coding signal in the +2/−1 reading frame. (Due to the limited sequence data available, a 50-codon sliding window was used to increase the signal-to-noise ratio throughout the alignment, even though this is expected to dilute the signal from the overlapping ORF itself, which, at ∼16 codons, is much shorter than the 50-codon window.) Note that, regardless of the sign (either positive or negative), the magnitude of MLOGD scores tends to be lower within the overlap region (7–9) because there are fewer substitutions with which to discriminate the null model from the alternative model in this region of above-average nucleotide conservation.
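The simpler quantities in the Figure 4 caption can be sketched in Python as follows. This is a hedged illustration only: it does not reproduce the synonymous-site analysis or MLOGD, whose null models are described in the cited references. The function names, the per-codon input lists for `windowed_ratio` and the toy sequence are all hypothetical. The first function lists stop codon positions in the three forward frames (cf. panels 2–4); the second computes an observed-to-expected substitution ratio in a sliding window (cf. panel 6), taking the per-codon observed and expected counts as given.

```python
STOPS = {"TAA", "TAG", "TGA"}

def stop_positions(seq):
    """Map each forward frame offset (0, 1, 2) to the 0-based start
    positions of stop codons read in that frame."""
    seq = seq.upper().replace("U", "T")
    return {
        frame: [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOPS]
        for frame in range(3)
    }

def windowed_ratio(observed, expected, window):
    """Ratio of summed observed to summed expected substitutions in each
    sliding window of `window` codons (step of one codon).
    `observed` and `expected` are hypothetical per-codon counts."""
    ratios = []
    for start in range(len(observed) - window + 1):
        obs = sum(observed[start:start + window])
        exp = sum(expected[start:start + window])
        ratios.append(obs / exp if exp else float("nan"))
    return ratios

# Toy sequence (hypothetical): the +0 frame is open, +2/-1 contains one stop.
toy_stops = stop_positions("ATGGCAAAAAACTGGATTAACCC")
toy_ratios = windowed_ratio([1, 0, 1, 0], [1, 1, 1, 1], window=2)
```

A ratio below 1 in a window indicates fewer substitutions than the null model expects, which is the sense in which the caption speaks of above-average conservation in the overlap region.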

Figure 5 Multiple alignment of genomic sequences in the region of the PRF cassette from seven nematodes. Heptameric frameshift sites are shown in yellow. Nucleotides involved in predicted base-pairing interactions within the same double stranded regions of the secondary structure are differentially colored (e.g., the first stem of the pseudoknot is green, the second is red and the stem in the first loop of the pseudoknot is in blue). The secondary structure of the RNA pseudoknot in the C. elegans sequence is also shown in bracket format above the alignment and its diagram is shown in Figure 2.

Figure 6 Expression of APC frameshift cassettes in reticulocyte lysates. (A) Drosophila melanogaster APC frameshift cassette. The wild-type sequence with the predicted frameshift site underlined is shown; the sequence of the mutant frameshift site is shown above. Sequences that were mutated to introduce a termination codon in the −1 or the zero frames are boxed. Endpoints of the 3′ deletion series are shown by arrows, as well as nucleotide changes in the stem. (B) In vitro translations of Drosophila APC pDluc constructs. 1° specifies primary frameshift product. (1) The frameshift site was replaced by C AAG AAC C, such that standard translation produces the Renilla luciferase-APC-firefly luciferase fusion protein (in-frame control). (2) The wild-type APC sequence ending 104 nt 3′ of the shift site (see A). (3) A C-to-U mutation 5′ of the shift site that results in a UAA stop codon in the −1 frame. (4) CUG was changed to UAA, 3′ of the frameshift site, resulting in a stop codon in the zero reading frame. All subsequent constructs contained the CUG to UAA mutation. (5) Mutation of the frameshift site to C AAG AAC. (6–9) Deletions of the 3′ sequence; endpoints are +83, +62, +44 and +8, respectively (see A). (10) Mutation of the 5′ side of the putative stem. (11) Mutation of the 3′ side of the putative stem. (12) Combination of the 5′ and 3′ stem mutations predicted to restore base pairing (see A). (13) No template added. The positions of the products due to frameshifting at the primary site, A AAA AAC, termination in the WT construct and termination at the zero frame UAA, two codons 3′ of the frameshift site, are shown by arrows. The secondary frameshift product of unknown origin is indicated by *. (C) In vitro translations of frameshift cassettes from D. melanogaster, A. gambiae and C. elegans are labelled WT. Lanes with cassettes designed to produce the frameshift product as a result of standard translation are labelled IFC (In-Frame Control). The positions of the respective frameshift and termination products are shown by arrows. Note that with the C. elegans WT cassette, the frameshift product is the smaller of the two products due to a −1 stop codon within the sequence of the predicted 3′ stimulatory element (see Fig. 2). The * signifies the secondary frameshift product from the D. melanogaster APC sequence.

Figure 7 Phylogenetic tree of APC genes extracted from TreeFam (ref. 70). The nodes corresponding to the events of independent APC gene duplications are indicated with arrows.