Commentary

Narrowing down the candidates of beneficial A-to-I RNA editing by comparing the recoding sites with uneditable counterparts

Article: 2304503 | Received 20 Oct 2023, Accepted 08 Jan 2024, Published online: 29 Jan 2024

ABSTRACT

Adar-mediated adenosine-to-inosine (A-to-I) RNA editing mainly occurs in the nucleus and diversifies the transcriptome in a flexible manner. Identifying beneficial editing sites from the sea of total editing events has been a challenging task. The functional Ser>Gly auto-recoding site in the insect Adar gene has uneditable Ser codons in ancestral nodes, indicating a selective advantage to having an editable status. Here, we extended this case study to more metazoan species, and also looked for all Drosophila recoding events with potential uneditable synonymous codons. Interestingly, in D. melanogaster, the abundant nonsynonymous editing is enriched in the codons that have uneditable counterparts, but the Adar Ser>Gly case suggests that the editable orthologous codons in other species are not necessarily edited. Comparing editable codons with ancestral uneditable codons is a smart way to infer the selective advantage of RNA editing, and priority might be given to these editing sites for functional studies because an uneditable allele is feasible to construct. Our study proposes an idea to narrow down the candidates of beneficial recoding sites. Meanwhile, we stress that matched transcriptomes are needed to verify the conservation of editing events during evolution.

Introduction

Current knowledge on beneficial RNA editing

ADAR (adenosine deaminase acting on RNA)-mediated adenosine-to-inosine (A-to-I) RNA editing, which resembles A-to-G mutation (Figure 1a), is a highly prevalent type of RNA modification in metazoans [Citation1,Citation2]. This editing process mainly occurs co-transcriptionally in the nucleus. A-to-I editing in the coding sequence (CDS) of genes might change the genomically encoded amino acid, leading to nonsynonymous changes (termed ‘recoding’) [Citation3]. However, an essential difference between A-to-I RNA editing and A-to-G DNA mutation is that RNA editing can be flexibly regulated in a temporal-spatial manner, while a DNA mutation is hardwired into the genome and might cause pleiotropic effects.

Figure 1. A-to-I RNA editing and the evolution of Adar auto-recoding sites in insects. (a) A-to-I RNA editing mediated by ADAR. I is recognized as G. (b) Prediction of adaptive RNA editing theory. The relative fitness of A- and G-alleles under different conditions is considered. (c) Drosophila Adar has an auto-recoding site that forms a negative feedback loop. (d) Evolution of Adar S>G (site1) and I>M (site2) sites in insects. “edSer”: editable serine codon; “unSer”: uneditable serine codon; “edIle”: editable isoleucine codon; “unIle”: uneditable isoleucine codon.

Regarding the advantage of RNA editing over DNA mutation, early theories held that nonsynonymous RNA editing is a driving force for adaptive evolution [Citation4], since the editing mechanism could diversify the proteome when necessary and help organisms adapt to a changeable environment. Although many studies have reported the overrepresentation (indicating positive selection and potential benefit) of nonsynonymous RNA editing events in the transcriptomes of various species such as insects [Citation5,Citation6], mollusks [Citation3,Citation7–9], plants [Citation10,Citation11], and fungi [Citation12–14], a pivotal question remains unresolved: what is the exact benefit of RNA editing? Is ‘having editing’ better than ‘having only one version, either the A- or the G-allele’ (the diversifying hypothesis)? Or is editing beneficial and necessary because it reverses deleterious mutations and restores the ancestral allele (the restorative hypothesis) [Citation15]? In particular, under the diversifying hypothesis, one should expect that the A-allele is fitter under condition #1 (e.g., high temperature) and the G-allele is fitter under condition #2 (e.g., low temperature), and that the RNA editing level could be fine-tuned according to which allele is better under a given condition (Figure 1b).

A recent study in fungi has begun to answer this question [Citation16]. For a given A-to-I RNA editing site chosen from the various recoding sites in Fusarium graminearum, researchers revealed that the genomically uneditable mutant is fitter at the asexual stage, while the genomically fixed-G mutant is fitter at the sexual stage, and that the wild type (editable) can regulate its editing status and has a higher average fitness over the two stages [Citation16]. This is the first case study that experimentally verified the adaptiveness of an individual editing site.

Unresolved questions, aims, and scope

However, two problems remain. (1) Discovering the functional and beneficial editing sites from the sea of total editing events remains a challenging task. One needs to give priority to a particular set of RNA editing sites for functional studies, so a criterion is needed to narrow down the candidates for experimental verification. (2) In fungi, editing events have a strong sequence preference in which the upstream nucleotide is almost always U [Citation12]. This allows editing to be abolished manually by replacing the upstream U with another nucleotide. In animals, however, the editing motif is not as stringent as that in fungi, making it impossible to completely prevent editing by changing the sequence context. This largely hampers the discovery of beneficial editing sites because of the technical difficulty of obtaining mutant animals with an unedited allele. Although an ADAR knockout (ADAR-KO) is a potential approach, it affects all editing sites rather than controlling the editing of an individual site of interest.

In this article, we first introduce our recent findings on the evolutionary trajectory of the auto-recoding sites in the insect Adar gene [Citation17,Citation18], and propose that comparing current editable codons with ancestral uneditable codons might be a way to look for beneficial RNA editing sites. We then show that in D. melanogaster, nonsynonymous editing is enriched in three types of AA changes, Ser>Gly, Ile>Met, and Arg>Gly, which are the only three changes whose pre-edited codons have uneditable counterparts. We propose several hypotheses to explain this enrichment, and suggest that the recoding events on these editable Ser, Ile, and Arg codons should be prioritized for functional editing studies due to the feasibility of constructing an uneditable mutant. Experimental verification of beneficial editing should narrow its focus to these recoding sites.

Notably, a study by Popitsch et al. found that SNPs were more prevalent at editing sites than at unedited sites in D. melanogaster, suggesting that editing is adaptive because A-to-I replacements are beneficial at these nucleotide sites, but that the adaptive nature is not diversifying [Citation19]. In contrast, the editing sites used in the current study (see below) exclude all genomic polymorphic sites in the D. melanogaster population to avoid false positive editing calls, while Popitsch et al., 2020 particularly focused on editing sites that overlap with genomic polymorphisms. Given that the two studies focus on different sets of editing sites, it is not surprising that their conclusions are slightly different. As we will state in the following sections, editing at these Ser, Ile, and Arg codons might be the result of purifying selection or neutral evolution, or it might be beneficial, but the enrichment alone does not distinguish between the diversifying and restorative hypotheses.

Results and discussion

Evolution of Adar S>G auto-recoding site in insects: from uneditable to editable codons

Early studies discovered that the sole Adar gene in the insect genome, which is orthologous to mammalian ADAR2 [Citation20], has an auto-recoding site (S>G site) in Drosophila melanogaster. Editing at this S>G site changes a serine codon to a glycine codon, and the edited AdarG isoform has lower catalytic activity than the unedited AdarS isoform [Citation21], forming a negative feedback loop (Figure 1c). While the editing event on this serine codon (AGC/U, editable serine codon, denoted as edSer) is deeply conserved across all examined Drosophila species (with head/brain RNA-Seq), the common ancestor of insects had an uneditable serine codon at this position (TCN, denoted as unSer), as revealed by the sequence alignment and phylogeny of all available insect genomes plus a non-insect arthropod, Ixodes scapularis (Figure 1d) [Citation18]. This evolutionary transition from unSer to edSer indicates a potential demand for the ability to be edited (editable), compared with the less favorable state of permanently having only one protein version. We stress that, apart from the Adar S>G site itself, this logic provides only putative evidence for advantageous editing, and the final judgment of adaptiveness should rely on experimental evidence on fitness or protein function.

Moreover, the unSer-to-edSer transition at the Adar S>G site occurred independently multiple times during insect evolution. In the phenotypically complex insect order Hemiptera, both unSer and edSer codons exist at this position (Figure 1d). Species with edSer codons formed a monophyletic group in the tree, suggesting that edSer emerged independently in Hemiptera [Citation18]. Interestingly, a genome-wide investigation of all Hemipteran genomes revealed an unexpectedly high unSer-to-edSer transition rate compared with a random control, raising the tempting implication that there is a selective advantage to having the edSer codon during evolution [Citation18]. A potential question here is: if the unSer-to-edSer replacement is so beneficial, why was it not fixed early in the evolution of insects? This is an open question similar to many cases of convergent evolution, where one may ask why the adaptation did not take place earlier in the common ancestor. Our putative answer is the random nature of mutations combined with directional natural selection. The unSer-to-edSer transition was a multi-step process with a few intermediate codons observed [Citation18]. It did not take place in the common ancestor of insects, but later occurred in different insect clades. What we can say is that edSer codons are currently observed in many insect orders and that this S>G site is known to be functional in Drosophila.

Similar to the evolution from the ancestral unSer to the derived edSer codon at the S>G site, the Adar gene has another auto-recoding site in Hymenoptera species that causes an Ile-to-Met (ATA>ATG) change, termed the I>M site (Figure 1d) [Citation17]. While this editable isoleucine codon (edIle) is almost fixed in Hymenoptera, the ancestral state is likely an uneditable isoleucine codon (ATC/T, unIle) that could not be edited to methionine. This again reflects a selection pressure that promotes the emergence of an editable status [Citation17].

We believe that our recent work [Citation17,Citation18] represents a preliminary but essential step toward an experiment-free strategy for finding putatively beneficial and functional RNA editing sites. The core idea is to use the evolutionary trace of the editing site to demonstrate the necessity of having an editable status compared with an uneditable ancestral state. Nevertheless, we stress that although we proposed novel explanations for the evolutionary trajectory of the Adar S>G site, a neutral explanation cannot be completely ruled out. Although Adar S>G editing is functional (i.e., it reduces ADAR activity), it might not have any (positive or negative) impact at the fitness level. Thus, the plain observation of this edited site in insects is also compatible with a neutral explanation.

The overrepresentation of Ser>Gly, Ile>Met, and Arg>Gly recoding in Drosophila RNA editome

As we previously explained [Citation18], not all codons subject to nonsynonymous editing have a synonymous but uneditable counterpart. The comparison between edSer (AGC/U) and unSer (TCN) codons benefits from the fact that serine has six synonymous codons, and the comparison between edIle (ATA) and unIle (ATC/T) codons relies on the fact that ATA>ATG (Ile>Met) is the only nonsynonymous editing type that takes place at the 3rd codon position. Apart from Ser and Ile, only Arg has comparable editable (AGA/G) and uneditable (CGN) codons, where editing at the 1st position of AGA/G leads to an Arg>Gly change. We asked whether these three types of editable codons are truly enriched with RNA editing events in transcriptome data.
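The statement that Ser, Ile, and Arg are the only residues with comparable editable and uneditable codons can be checked by enumerating the standard genetic code. The following is a minimal Python sketch, not the authors' code. It assumes that a codon counts as editable for a change X>Y if a single A-to-G substitution converts it from X to Y, that a synonymous codon counts as uneditable for X>Y if no single A-to-G substitution does so, and that stop codons are ignored. All function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AAS)}


def recoding_targets(codon):
    """Residues reachable from `codon` by one A-to-G change that alters the residue."""
    src = CODON_TABLE[codon]
    targets = set()
    for i, base in enumerate(codon):
        if base == "A":
            edited = codon[:i] + "G" + codon[i + 1:]
            if CODON_TABLE[edited] != src:
                targets.add(CODON_TABLE[edited])
    return targets


def changes_with_uneditable_counterparts():
    """Map (source, target) residue pairs to (editable codons, uneditable synonymous codons).

    Only pairs with at least one uneditable synonymous codon are kept.
    """
    result = {}
    for aa in sorted(set(AAS) - {"*"}):  # stop codons are ignored (assumption)
        codons = [c for c, a in CODON_TABLE.items() if a == aa]
        all_targets = set().union(*(recoding_targets(c) for c in codons))
        for dst in sorted(all_targets):
            editable = sorted(c for c in codons if dst in recoding_targets(c))
            uneditable = sorted(c for c in codons if dst not in recoding_targets(c))
            if uneditable:
                result[(aa, dst)] = (editable, uneditable)
    return result


for (src, dst), (ed, un) in changes_with_uneditable_counterparts().items():
    print(f"{src}>{dst}: editable {ed}, uneditable {un}")
```

Under these assumptions the enumeration returns only Ile>Met, Arg>Gly, and Ser>Gly, with ATA, AGA/G, and AGC/T as the editable codons, which matches the three classes discussed in the text.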

We took the well-studied model insect Drosophila melanogaster to examine this issue. Among the bona fide A-to-I RNA editing sites in fly brains, 678 sites cause nonsynonymous changes. We focused on three types of nonsynonymous changes, Ser>Gly, Ile>Met, and Arg>Gly, because their pre-edited codons are the only ones that have uneditable synonymous counterparts. Among the 678 recoding sites, the numbers and proportions of Ser>Gly, Ile>Met, and Arg>Gly are 92 (13.6%), 35 (5.16%), and 60 (8.85%), respectively (Figure 2). To understand whether these fractions are overrepresented compared with random expectation, we calculated the expected numbers and fractions of each AA change caused by nonsynonymous editing. It is well established that metazoan ADARs prefer a 3-mer motif in which the −1 nucleotide of the editing site avoids G and the +1 nucleotide favors G [Citation22,Citation23], so this preference should be considered. We counted the 16 types of 3-mers centered on the 678 nonsynonymous editing sites, and found that the most abundant 3-mer is CAG (133 sites, 19.6%) and the least abundant is GAC (5 sites, 0.74%). To estimate the expected fraction of each type of AA change, we changed all adenosines to guanosines in the D. melanogaster genome (Materials and Methods) and obtained 11,862,949 nonsynonymous mutation sites. Keeping the same fractions of 3-mer motifs as observed, we randomly sampled 2,327,098 (19.6%) nonsynonymous adenosines with the CAG motif and 87,485 (0.74%) adenosines with the GAC motif, and likewise for the other 3-mer motifs. The total number of sampled nonsynonymous adenosines therefore equals 11,862,949. Then, we calculated the proportion of each AA change among these 11,862,949 background sites. By comparing the observed and expected proportions of each type of AA change, we obtained the enrichment of each AA change (Figure 2).
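The background construction described above can be sketched in two steps: listing every adenosine in a coding sequence whose change to guanosine alters the residue, together with its 3-mer context, and then computing how many background sites to sample per motif so that the motif fractions match the observed editing sites. The Python below is an illustrative sketch under stated assumptions, not the authors' pipeline: the function names and the toy CDS are hypothetical, the first and last nucleotides of the CDS are skipped because their flanking bases are not available in this simplified input, and the quota is a simple rounding of the matched fraction.

```python
from itertools import product

BASES = "TCAG"
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AAS)}


def nonsyn_a_to_g_sites(cds):
    """List (position, 3-mer motif, 'X>Y') for each A whose change to G alters the residue.

    `cds` is an in-frame coding sequence in DNA letters; positions are 0-based.
    """
    sites = []
    for pos, base in enumerate(cds):
        if base != "A" or pos == 0 or pos == len(cds) - 1:
            continue  # both neighbours are needed to define the 3-mer
        start = pos - pos % 3
        codon = cds[start:start + 3]
        if len(codon) < 3:
            continue  # incomplete trailing codon
        offset = pos - start
        edited = codon[:offset] + "G" + codon[offset + 1:]
        src, dst = CODON_TABLE[codon], CODON_TABLE[edited]
        if src != dst:
            sites.append((pos, cds[pos - 1:pos + 2], f"{src}>{dst}"))
    return sites


def motif_quota(motif_count, n_observed, n_background):
    """Number of background sites to sample for one motif, matching its observed fraction."""
    return round(n_background * motif_count / n_observed)


# Toy CDS (hypothetical): ATG AGC ATA CAG TAA
for site in nonsyn_a_to_g_sites("ATGAGCATACAGTAA"):
    print(site)

# Quotas for the two motifs whose counts are given in the text.
print(motif_quota(133, 678, 11_862_949))  # CAG
print(motif_quota(5, 678, 11_862_949))    # GAC
```

With the counts stated in the text (133 and 5 of 678 sites, 11,862,949 background sites), the quota function reproduces the reported sample sizes of 2,327,098 and 87,485.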

Figure 2. Observed and expected numbers/proportions of AA changes caused by nonsynonymous RNA editing. Total numbers of expected mutations were obtained by changing all adenosines to guanosines in the reference genome of D. melanogaster. The expected proportions of each AA change were estimated by the proportions of observed AA changes caused by nonsynonymous RNA editing. The 3-mer motif preferred by Adar is also considered. p values of the enrichments were calculated using one-tailed Fisher’s exact tests for each AA change. *, p < 0.05; ***, p < 0.001.

Strikingly, the three AA changes with the most significant enrichment were exactly Ile>Met, Ser>Gly, and Arg>Gly (Figure 2). Although another type of AA change, Lys>Arg, also showed a slight enrichment, its significance was much weaker (Figure 2). Next, we asked whether this enrichment of Ile>Met, Ser>Gly, and Arg>Gly was unique to RNA editing in Drosophila. We took advantage of the well-characterized RNA editomes and the population genomic data available for Drosophila and humans, and found that the SNPs in Drosophila (Figure 3a), the RNA recoding sites in humans (Figure 3b), and the SNPs in human populations (Figure 3c) did not exhibit enrichment at Ile>Met, Ser>Gly, and Arg>Gly sites.
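For readers who want to reproduce this kind of comparison, a one-tailed Fisher's exact test on a 2×2 table (one AA change versus all others, observed editing sites versus background sites) can be written with the standard library alone. This is a minimal sketch: how the authors arranged their contingency tables is not stated in the text, so the layout here is an assumption, and the function names are hypothetical. Log-gamma is used so that background totals in the millions do not overflow.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_greater(a, b, c, d):
    """One-tailed Fisher's exact p value, P(X >= a), for the table [[a, b], [c, d]].

    Assumed layout: a = observed sites with the AA change, b = other observed sites,
    c = background sites with the AA change, d = other background sites.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    log_denom = log_comb(total, row1)
    k_max = min(row1, col1)
    logs = [
        log_comb(col1, k) + log_comb(total - col1, row1 - k) - log_denom
        for k in range(a, k_max + 1)
    ]
    peak = max(logs)  # factor out the largest term for numerical stability
    return min(1.0, exp(peak) * sum(exp(x - peak) for x in logs))


# Small textbook-style table, for checking the implementation only.
print(fisher_greater(3, 1, 1, 3))
```

In practice the first row would hold the observed counts (for example 92 Ser>Gly sites among 678 recoding sites) and the second row the motif-matched background counts, which are not given in the text and therefore are not filled in here.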

Figure 3. Enrichment of each type of AA change. (a) SNPs in global population of D. melanogaster. (b) Nonsynonymous editing sites in humans. (c) SNPs in global human populations.

Different hypotheses explaining the overrepresentation of Ser>Gly, Ile>Met, and Arg>Gly recoding

Since the enrichment of Ile>Met, Ser>Gly, and Arg>Gly recoding sites is unique to RNA editing in Drosophila, and these three pre-edited codons are the only ones with uneditable synonymous codons, we raise several hypotheses to explain this observation (Figure 4). The first is the ‘uneditable-to-editable is beneficial’ hypothesis (Figure 4a). During evolution, the gains and losses of editing events were mostly achieved by altering the dsRNA structure or sequence context, while the edited codon itself remained unchanged. For codons containing adenosines (including the editable Ser, Ile, and Arg codons), if the edited status is fitter than the unedited status, then the surrounding cis-elements might mutate and let the focal codon become edited (Figure 4a). However, Ser, Ile, and Arg have an additional way to become edited, that is, to change from the uneditable counterparts to the editable ones. This additional route may lead to the currently observed overrepresentation of Ile>Met, Ser>Gly, and Arg>Gly recoding sites (Figure 4a).

Figure 4. Hypotheses raised to explain the enrichment of Ile>Met, Ser>Gly, Arg>Gly recoding sites in Drosophila RNA editome. (a) The “uneditable-to-editable is beneficial” hypothesis. Editable is fitter than uneditable, but the benefit can either be diversifying or restorative. (b) The “deleterious editing replaced with uneditable codons” hypothesis. Since the editable Ile, Ser, and Arg codons have another choice to mutate to uneditable ones, the observed editing after purifying selection shows enrichment on those codons.

Notably, this beneficial hypothesis does not distinguish between the diversifying and restorative scenarios, because in both cases editable is fitter than uneditable. For example, these Ser>Gly, Ile>Met, and Arg>Gly editing sites may be preferred while the edited version is still the only functional one, in which case editing is restorative rather than diversifying.

The second hypothesis is the ‘deleterious editing replaced with uneditable codons’ hypothesis (Figure 4b). Suppose that in some ancestral species there were 100 instances of each editable codon. At some point editing was introduced, and 40 of each 100 editable codons became edited, so that 40% of the editable codons of each type were edited (Figure 4b). Then, of each such 40 editing sites, suppose that 20 are deleterious and 20 are neutral or beneficial (Figure 4b). Purifying selection then depleted the deleterious editing by altering the dsRNA structures while the codon itself remained the same. This would theoretically end up with 20/100 edited codons (Figure 4b). However, the three special cases Ser>Gly, Ile>Met, and Arg>Gly had another option, in which the depletion could be achieved by replacement with an uneditable codon. For these, only 80 editable codons remain in the CDS, 20 of which are edited (Figure 4b). One then finds these three types to be enriched, as they show editing in 20/80 codons compared with 20/100 for other codons.
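The arithmetic of this thought experiment can be made explicit. The short Python sketch below uses the same illustrative numbers as the text (100 editable codons, 40 edited, 20 of them deleterious); the function name and the boolean switch are hypothetical, and the model assumes, as in the text, that every deleterious editing event is purged.

```python
def edited_fraction_after_purging(n_codons=100, n_edited=40, n_deleterious=20,
                                  purge_by_codon_replacement=False):
    """Fraction of remaining editable codons that are edited after purifying selection.

    If purging works by changing dsRNA structure, the codon stays editable and
    remains in the denominator. If purging works by replacing the codon with an
    uneditable synonymous codon, the codon leaves the denominator as well.
    """
    remaining_edited = n_edited - n_deleterious
    if purge_by_codon_replacement:
        remaining_codons = n_codons - n_deleterious
    else:
        remaining_codons = n_codons
    return remaining_edited / remaining_codons


print(edited_fraction_after_purging())                                  # 20/100
print(edited_fraction_after_purging(purge_by_codon_replacement=True))   # 20/80
```

The two calls give 0.20 and 0.25, so an apparent 1.25-fold enrichment arises purely from the shrinking denominator, without any benefit of editing at these codons. In reality only some deleterious sites would be purged by codon replacement, so the effect would be smaller than in this extreme case.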

It is worth noting that D. melanogaster is already known to have an editome with signals of beneficial recoding, in which nonsynonymous editing events are more prevalent than expected. However, beneficial editing does not exclude the possibility that previously eliminated editing sites were deleterious. Thus, both hypotheses presented (Figure 4) are compatible with our observation.

The third explanation is the neutral or nearly neutral hypothesis. These editing events may be preferred because they are well tolerated, as they have very little effect on the protein. Although no evidence currently disproves this possibility, we argue that the neutral assumption should apply to all types of AA changes caused by nonsynonymous RNA editing, and that there is no mechanism that would particularly enrich the three types Ser>Gly, Ile>Met, and Arg>Gly (given that the Adar motif has already been controlled for in our comparison). While we do not rule out this possibility, we believe that it is less likely than the two hypotheses mentioned above (Figure 4).

Ser>Gly, Ile>Met, and Arg>Gly recoding: candidates for experimental verification?

At this stage, although we cannot yet determine whether the currently observed Ile>Met, Ser>Gly, and Arg>Gly recoding sites are beneficial, neutral, or slightly deleterious, our motivation for focusing on these editing sites is that they are the only candidates for experimental verification of the fitness change caused by editing. Editing events in animals cannot be easily abolished by manipulating the sequence context unless one directly changes an editable codon to an uneditable one. Therefore, these Ile>Met, Ser>Gly, and Arg>Gly recoding sites should be given priority in functional studies. Furthermore, since genetic manipulation is not feasible in all non-model animals, looking for the evolutionary trajectory from uneditable to editable (and edited) sites, as we did previously [Citation17,Citation18], remains a smart way to narrow down the potentially beneficial editing candidates.

Furthermore, reconstructing the ancestral state at each genomic position may help distinguish between the beneficial hypothesis and the deleterious hypothesis. The challenge is to choose a proper control. If the currently edited Ser, Ile, and Arg codons came from an ancestral uneditable codon with a ‘transition rate’ higher than the neutral expectation, this would suggest a selective advantage to having the editable status (Figure 4a). Conversely, if the currently uneditable Ser, Ile, and Arg codons in the genome frequently came from an ancestral nonsynonymous editing site (given a proper control), this would indicate that the purged editing events were deleterious (Figure 4b). The neutral hypothesis would be supported or rejected according to whether one observes a significantly different result from random expectation. If either the beneficial or the deleterious hypothesis is confirmed, the neutral hypothesis is excluded.

Mapping the Adar S>G and I>M sites in metazoans: editable or edited?

Notably, compared with the many nonsynonymous editing sites of unknown function, the Adar auto-recoding S>G site in Drosophila is one of the very few editing sites with a verified function in metazoans [Citation21]. Therefore, the evolutionary transition from an uneditable Ser codon to an editable Ser codon in the insect Adar gene would indicate an advantage of RNA editing over a single genomic allele [Citation17,Citation18]. However, to reach a solid conclusion, we need to (1) extend our genome alignment to a broader range of non-insect species to confirm the ancestral state of the S>G and I>M sites; and (2) check whether the editable Ser/Ile codons in Adar (if any) are truly edited in the target species. Without RNA editing events, the editable codon alone has no advantage. To do so, we collected a set of metazoan species with systematic and genome-wide RNA editing studies (Table 1). The non-insect species include human [Citation24,Citation25], macaque [Citation26], mouse [Citation27], pig [Citation28], octopus [Citation7], and coral [Citation29], for which the lists of RNA editing sites are already known.

Table 1. Detailed information on the sequences used in this study.

First, we downloaded the Adar (ADAR2) sequences of these species and performed multiple sequence alignment (Materials and Methods). For the S>G site, all outgroups of insects (mammals, octopus, and coral) have an unSer codon (Figure 5), raising the possibility that Adar S>G recoding is an invention in insects such as Diptera, and this implication further supports the notion of adaptive evolution of RNA editing in Drosophila [Citation5]. Moreover, according to the known lists of detected RNA editing sites in three Drosophila species, the edSer codons are all edited, with levels ranging from 0.2 to 0.5 [Citation17] (Figure 5). Here, we conclude that for the Adar S>G site, all investigated species with edSer codons (Drosophila) are truly edited, while the unSer codon represents the ancestral state. This pattern reflects a selective advantage to having the editable codon and agrees with the adaptive RNA editing theory.

Figure 5. The orthologous codons of Adar S>G (site1) and I>M (site2) auto-recoding sites in metazoan species with systematic RNA editing study. “#” means that although the codon itself is editable, no RNA editing events were reported by the original study. The phylogenetic tree is unscaled.

In contrast, the evolution and implication of the Adar I>M site is much more complicated (Figure 5). We previously inferred that the most recent common ancestor of insects had a non-Ile codon at this position, and that the edIle codon in bees was possibly (but not certainly) derived from an unIle codon in the ancestor of Hymenoptera [Citation17]. Here, we further found edIle codons in the genomes of octopus, coral, and mouse, while unIle codons exist in human and macaque (Figure 5). We do not go deeper into the evolution of this I>M site because more genomes are needed to resolve this issue. The key finding we stress is that, among the species with edIle codons, only honeybee (Apis mellifera) and bumblebee (Bombus terrestris) have editing events reported at this site (Figure 5). The other species with edIle codons, including even the leaf-cutting ant (Acromyrmex echinatior) in Hymenoptera, do not have detectable editing events at this position (Figure 5).

In this part, we propose that, on the one hand, the Adar S>G site is a good example of adaptive RNA editing because of (1) its known regulatory function, (2) an ancestral state of the unSer codon, and (3) the fact that all currently tested species with edSer codons are indeed edited. On the other hand, not all comparisons between edSer and unSer, or between edIle and unIle, are informative, as many editable codons are actually unedited in the transcriptome. Although the absence of detected editing might be caused by limited detection power, we should be cautious in interpreting the evolutionary significance of an uneditable-to-editable transition, especially when matched transcriptome data are not available.
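The distinction between an editable codon and an edited site can be encoded as a small lookup that combines the codon found at the orthologous position with whether editing was reported for that species. The Python sketch below uses the codon classes defined in the text (edSer AGC/U, unSer TCN, edIle ATA, unIle ATC/T) and the “#” flag from Figure 5; the function names and the returned labels are hypothetical and chosen only for illustration.

```python
ED_SER = {"AGC", "AGT"}                # editable Ser codons (AGC/U)
UN_SER = {"TCA", "TCC", "TCG", "TCT"}  # uneditable Ser codons (TCN)
ED_ILE = {"ATA"}                       # editable Ile codon (ATA>ATG gives Met)
UN_ILE = {"ATC", "ATT"}                # Ile codons that cannot be edited to Met


def codon_state(codon):
    """Classify a codon as edSer, unSer, edIle, unIle, or other (RNA 'U' is accepted)."""
    codon = codon.upper().replace("U", "T")
    if codon in ED_SER:
        return "edSer"
    if codon in UN_SER:
        return "unSer"
    if codon in ED_ILE:
        return "edIle"
    if codon in UN_ILE:
        return "unIle"
    return "other"


def site_status(codon, editing_reported):
    """Combine the codon class with the transcriptome evidence for one species."""
    state = codon_state(codon)
    if state in ("edSer", "edIle"):
        if editing_reported:
            return f"{state}, edited"
        return f"{state}, editable but no editing reported (#)"
    return state


print(site_status("AGU", True))   # an edSer codon with reported editing
print(site_status("ATA", False))  # an edIle codon without reported editing
print(site_status("TCC", False))  # an unSer codon
```

Only the first category supports an inference about the advantage of editing; the second category is the case where matched transcriptome data are needed before any evolutionary interpretation.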

It is unexpected to observe the absence of editing at orthologous edIle codons in some species

Regarding the novel finding of the current work (Figure 5), one may object that it merely shows that an editable codon is not necessarily edited, which seems obvious given the relatively low fraction of editing sites among all editable codons in the coding sequence. Below, we provide putative evidence that this fact is not as obvious as it looks, and that the absence of editing at some edIle codons is indeed unexpected.

First, take D. melanogaster as an example. It is true that fewer than 1E–4 of the total adenosines in the CDS are edited, but this statistic is conceptually different from our statement that ‘editable does not mean edited’. The comparison between different genomic adenosines is ‘horizontal’, whereas the comparison of editing at orthologous sites is ‘vertical’ (Figure 6). For the horizontal comparison, the editing of one adenosine has nothing to do with the probability of another adenosine being edited (Figure 6a, if the sequence context is not provided). In contrast, for the vertical comparison, the orthologous sites in different species are inherited from a common ancestor, and editing in one species raises the probability of editing in another species (Figure 6b). Moreover, a known editing event conserved in two different species further raises the probability that a third species is edited. As shown in Figure 6b, if species 1 and species 2 are both edited at an orthologous site, then the probability of editing in species 3 should be relatively high, although the actual probability depends on the divergence and the evolutionary history.

Figure 6. Horizontal and vertical comparison of editing sites. (a) The fraction of edited adenosines was very low in D. melanogaster genome. (b) Orthologous adenosine sites in three species. Two sites are known editing sites. The probability of the third species being edited should be high but the exact probability depends on the divergence.

Given that the Adar I>M site is edited in both honeybee (A. mellifera) and bumblebee (B. terrestris), how unexpected is it to observe the absence of editing at this edIle codon in the leaf-cutting ant (A. echinatior) (Figure 7)? We do not give a precise probability, but we aim to show that the absence is not trivial. First, we need a phylogenetic tree with an identical topology (Figure 7).

Figure 7. It is unexpected to observe the absence of editing at the edIle codon in ant A. echinatior. The top panel shows the phylogeny of three Drosophila species. The bottom panel shows the phylogeny of honeybee, bumblebee, and leaf-cutting ant.

In the brain editomes of three Drosophila species, 494 conserved nonsynonymous editing sites were found in the two closely related species D. melanogaster and D. simulans (Figure 7). Among them, 448 sites have an adenosine in the genome of D. pseudoobscura. Among these 448 adenosines, 279 (62.3%) were edited in D. pseudoobscura and 169 (37.7%) were unedited (Figure 7). The divergence between D. pseudoobscura and (D. melanogaster, D. simulans) is much greater than the divergence between D. melanogaster and D. simulans, whereas the divergence between ant and (honeybee, bumblebee) is only slightly greater than the divergence between honeybee and bumblebee (Figure 7). It is therefore intuitive to expect that the I>M editing site conserved between honeybee and bumblebee should have a high probability of being edited in ant as well.

Admittedly, one cannot quantitatively transfer editing conservation rates from one lineage to another, as each lineage has its own history and divergence rates. However, a rough likelihood can still be seen. According to the known editomes of these three Hymenoptera species, four conserved recoding sites between honeybee and bumblebee have an orthologous adenosine in the ant genome (not including the Ile>Met site), and two of them are edited in ant (). Although this 50% fraction lacks statistical power, it is consistent with the view that the absence of the Ile>Met editing event in the ant A. echinatior is unexpected to some extent. Our overall notion is that conserved editing events are commonly observed in different species, especially (1) when the site has a known function in a particular species, or (2) when the site is already known to be conserved across different species. Nevertheless, the case of the I>M site in A. echinatior warns that one should not automatically assume that an orthologous site is edited even if the codon is editable.
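To make the statistical-power caveat concrete, the following minimal sketch computes approximate 95% Wilson score intervals for the two conservation fractions quoted above (279/448 in Drosophila and 2/4 in Hymenoptera). The choice of the Wilson interval and the function name `wilson_interval` are illustrative assumptions and are not part of the original analysis.

```python
from math import sqrt

def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion k/n."""
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Counts taken from the text.
drosophila = wilson_interval(279, 448)   # sites edited in D. pseudoobscura
hymenoptera = wilson_interval(2, 4)      # sites edited in A. echinatior

for label, (low, high) in [("Drosophila 279/448", drosophila),
                           ("Hymenoptera 2/4", hymenoptera)]:
    print(f"{label}: {low:.3f} to {high:.3f}")
```

With only four sites, the interval spans roughly 0.15 to 0.85, which is the sense in which the 50% fraction carries little statistical power, whereas the Drosophila estimate is far tighter.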

Conclusions

The auto-recoding mechanism in the Adar gene, which serves as a stabilizer to regulate global RNA editing activity, is an innovation found in some insect orders such as Diptera and Hemiptera. At this particular editing site, the comparison between editable and uneditable codons in a phylogenetic context is a smart way to infer the selective advantage of RNA editing. This notion could be generalized to the broad set of recoding events on editable Ser, Ile, and Arg codons that have uneditable counterparts. We propose that priority might be given to these editing sites for functional studies because of the feasibility of constructing mutant animals with uneditable alleles. At the multi-species level, matched transcriptomes are needed to verify the conservation of editing events during evolution.

Materials and methods

Data availability

The analysis of the Adar auto-recoding codon involves sequences from 13 species. Note that the Adar gene in insects is homologous to mammalian ADAR2 [Citation30]. The 13 representative species with systematic RNA editing studies are listed in (), including 6 from Arthropoda, 4 from Chordata, 2 from Mollusca, and Acropora millepora from Cnidaria as an outgroup. The nucleotide and protein sequences were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/; accession IDs listed in ). The coding region of each mRNA sequence was extracted based on the annotation in NCBI. The SNPs of D. melanogaster were downloaded from the global diversity lines [Citation31]. The human RNA editing sites were downloaded from REDIportal [Citation24]. The SNPs in humans were downloaded from the 1000 Genomes Project [Citation32]. The human reference genome (GRCh37/hg19) was downloaded from Ensembl (https://grch37.ensembl.org/Homo_sapiens/).

Sequence alignment and codon extraction

The amino acid sequences of the Adar gene were aligned using the G-INS-i strategy in MAFFT v7.310 [Citation33] with default parameters. Then, using TranslatorX v1.1 [Citation34] with default parameters, the nucleotide sequences of the Adar gene were aligned based on the previously aligned amino acid sequences.
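The idea behind a protein-guided nucleotide alignment, and the subsequent extraction of the codon at a given alignment column, can be sketched as follows. This is a simplified illustration and not the TranslatorX implementation; the function name `thread_codons`, the species labels, and the toy sequences are hypothetical.

```python
def thread_codons(aligned_protein, cds):
    """Place the codons of an unaligned CDS under its aligned protein sequence."""
    codons, pos = [], 0
    for residue in aligned_protein:
        if residue == "-":
            codons.append("---")            # an alignment gap spans a whole codon
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    if pos != len(cds):
        raise ValueError("CDS length does not match the ungapped protein length")
    return codons

# Hypothetical toy input: two made-up species, stop codons already removed.
aligned_proteins = {"species_A": "MS-I", "species_B": "MSGI"}
cds = {"species_A": "ATGAGCATA", "species_B": "ATGTCCGGTATT"}

codon_alignment = {name: thread_codons(aligned_proteins[name], cds[name])
                   for name in aligned_proteins}

# Codons at alignment column 1 (0-based), the Ser column in this toy example.
ser_column = {name: codons[1] for name, codons in codon_alignment.items()}
print(ser_column)
```

In this toy column, AGC is a Ser codon whose first-position adenosine could be recoded to Gly (GGC), whereas TCC encodes Ser without containing any adenosine.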

Expected proportions of AA changes caused by nonsynonymous editing sites

Nonsynonymous A-to-I RNA editing causes different types of AA changes (such as Ser>Gly and Ile>Met). For each type of AA change, we asked how many such nonsynonymous sites would arise if all adenosines in the reference genome of D. melanogaster were changed to guanosines while taking the Adar motif preference into account. This number, together with its proportion among all nonsynonymous A-to-G sites, represents the expected number/proportion of a particular type of AA change caused by A-to-I editing. Notably, A-to-G changes in negative-strand genes appear as T-to-C alterations in the reference genome, and this strand issue was accounted for in our conversion step. In total, 11,862,949 nonsynonymous sites were obtained in the genome version dmel_r6.06 downloaded from FlyBase (https://flybase.org/). Of the 678 nonsynonymous editing sites, 133 (19.6%) are located in the CAG motif, and we therefore expected 11,862,949 × 133/678 ≈ 2,327,098 sites in the CAG motif in the background. The expected numbers for the other motifs were calculated in the same way, so that the expected numbers across all motifs sum to the 11,862,949 nonsynonymous adenosines in the reference genome. Among these 11,862,949 sites, the number of each type of AA change was counted accordingly. The enrichment of each AA change was then calculated by comparing its proportion among the observed editing sites with its proportion under random expectation.
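The procedure above can be sketched as follows. This is one possible reading, not the original pipeline: it assumes that the motif is the nucleotide triplet centred on the adenosine, that each CDS is supplied in coding-strand orientation (so negative-strand genes are already reverse-complemented), and that the motif preference is applied by reweighting background sites. The names `nonsyn_a_to_g_sites` and `expected_proportions` and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def nonsyn_a_to_g_sites(cds):
    """Return (position, motif, change) for each A whose A>G change alters the AA."""
    cds = cds.upper()
    sites = []
    for start in range(0, len(cds) - 2, 3):
        codon = cds[start:start + 3]
        if codon not in CODON_TABLE:
            continue                          # skip codons with ambiguous bases
        for offset in range(3):
            if codon[offset] != "A":
                continue
            edited = codon[:offset] + "G" + codon[offset + 1:]
            before, after = CODON_TABLE[codon], CODON_TABLE[edited]
            if before == after:
                continue                      # synonymous change, not recoding
            pos = start + offset
            motif = cds[pos - 1:pos + 2] if 0 < pos < len(cds) - 1 else None
            sites.append((pos, motif, f"{before}>{after}"))
    return sites

def expected_proportions(background, edited_motif_counts):
    """Reweight background sites so that their motif mix matches the edited sites."""
    bg_motifs = Counter(motif for _, motif, _ in background)
    n_edited = sum(edited_motif_counts.values())
    weighted = Counter()
    for _, motif, change in background:
        observed = edited_motif_counts.get(motif, 0) / n_edited
        weighted[change] += observed / (bg_motifs[motif] / len(background))
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no background site carries an edited motif")
    return {change: w / total for change, w in weighted.items() if w > 0}

# Expected background count for the CAG motif, using the numbers in the text.
expected_cag = round(11_862_949 * 133 / 678)
print(expected_cag)

# Hypothetical toy CDS (coding-strand orientation) and toy edited-motif counts.
background = nonsyn_a_to_g_sites("ATGAGCATAAGA")
print(expected_proportions(background, {"GAG": 1, "TAA": 1}))
```

The enrichment of an AA change would then be its observed proportion among edited sites divided by the corresponding value returned by `expected_proportions`. Note that the figure of 2,327,098 follows from the exact fraction 133/678 rather than from the rounded 19.6%.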

Abbreviations

AA = amino acid

A-to-I = adenosine-to-inosine

ADAR = adenosine deaminase acting on RNA

CDS = coding sequence

SNP = single nucleotide polymorphism

dsRNA = double-stranded RNA

edSer = editable serine codon

unSer = uneditable serine codon

edIle = editable isoleucine codon

unIle = uneditable isoleucine codon

Authors’ contributions

Conceptualization & supervision: Y.D.

Data analysis: Y.D., L.M., S.X., and T.Z.

Writing – original draft: Y.D., T.Z., S.X., and L.M.

Writing – review & editing: T.Z., L.M., S.X., W.C., H.L., and Y.D.

Acknowledgments

We thank the members of the Cai Lab for their help and suggestions on this work.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The analysis of the Adar auto-recoding codon involves sequences from 13 species. Note that the Adar gene in insects is homologous to mammalian ADAR2 [Citation30]. The 13 representative species with systematic RNA editing studies are listed in (), including 6 from Arthropoda, 4 from Chordata, 2 from Mollusca, and Acropora millepora from Cnidaria as an outgroup. The nucleotide and protein sequences were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/; accession IDs listed in ). The coding region of each mRNA sequence was extracted based on the annotation in NCBI.

Additional information

Funding

This study is financially supported by the National Natural Science Foundation of China [no. 32300371 and 31922012].

References

  • Zhang P, Zhu Y, Guo Q, et al. On the origin and evolution of RNA editing in metazoans. Cell Rep. 2023;42(2):112112. doi: 10.1016/j.celrep.2023.112112
  • Eisenberg E, Levanon EY. A-to-I RNA editing — immune protector and transcriptome diversifier. Nat Rev Genet. 2018;19(8):473–14. doi: 10.1038/s41576-018-0006-1
  • Alon S, Garrett SC, Levanon EY, et al. The majority of transcripts in the squid nervous system are extensively recoded by A-to-I RNA editing. Elife. 2015;4:4. doi: 10.7554/eLife.05198
  • Gommans WM, Mullen SP, Maas S. RNA editing: a driving force for adaptive evolution? BioEssays. 2009;31(10):1137–1145. doi: 10.1002/bies.200900045
  • Yablonovitch AL, Deng P, Jacobson D, et al. The evolution and adaptation of A-to-I RNA editing. PloS Genet. 2017;13(11):e1007064. doi: 10.1371/journal.pgen.1007064
  • Duan Y, Xu Y, Song F, et al. Differential adaptive RNA editing signals between insects and plants revealed by a new measurement termed haplotype diversity. Biol Direct. 2023;18(1):47. doi: 10.1186/s13062-023-00404-7
  • Liscovitch-Brauer N, Alon S, Porath HT, et al. Trade-off between transcriptome plasticity and genome evolution in cephalopods. Cell. 2017;169(2):191–202 e111. doi: 10.1016/j.cell.2017.03.025
  • Duan Y, Li H, Cai W. Adaptation of A-to-I RNA editing in bacteria, fungi, and animals. Front Microbiol. 2023;14:1204080. doi: 10.3389/fmicb.2023.1204080
  • Zhan D, Zheng C, Cai W, et al. The many roles of A-to-I RNA editing in animals: functional or adaptive? Front Biosci (Landmark Ed). 2023;28(10):256. doi: 10.31083/j.fbl2810256
  • Edera AA, Gandini CL, Sanchez-Puerta MV. Towards a comprehensive picture of C-to-U RNA editing sites in angiosperm mitochondria. Plant Mol Biol. 2018;97(3):215–231. doi: 10.1007/s11103-018-0734-9
  • Duan Y, Cai W, Li H. Chloroplast C-to-U RNA editing in vascular plants is adaptive due to its restorative effect: testing the restorative hypothesis. RNA. 2023;29(2):141–152. doi: 10.1261/rna.079450.122
  • Bian Z, Ni Y, Xu JR, et al. A-to-I mRNA editing in fungi: occurrence, function, and evolution. Cell Mol Life Sci. 2019;76(2):329–340. doi: 10.1007/s00018-018-2936-3
  • Liu H, Wang Q, He Y, et al. Genome-wide A-to-I RNA editing in fungi independent of ADAR enzymes. Genome Res. 2016;26(4):499–509. doi: 10.1101/gr.199877.115
  • Liu H, Li Y, Chen D, et al. A-to-I RNA editing is developmentally regulated and generally adaptive for sexual reproduction in Neurospora crassa. Proc Natl Acad Sci USA. 2017;114(37):E7756–E7765. doi: 10.1073/pnas.1702591114
  • Jiang D, Zhang J. The preponderance of nonsynonymous A-to-I RNA editing in coleoids is nonadaptive. Nat Commun. 2019;10(1):5411. doi: 10.1038/s41467-019-13275-2
  • Xin K, Zhang Y, Fan L, et al. Experimental evidence for the functional importance and adaptive advantage of A-to-I RNA editing in fungi. Proc Natl Acad Sci USA. 2023;120(12):e2219029120. doi: 10.1073/pnas.2219029120
  • Duan Y, Ma L, Song F, et al. Autorecoding A-to-I RNA editing sites in the Adar gene underwent compensatory gains and losses in major insect clades. RNA. 2023;29(10):1509–1519. doi: 10.1261/rna.079682.123
  • Ma L, Zheng C, Xu S, et al. A full repertoire of Hemiptera genomes reveals a multi-step evolutionary trajectory of auto-RNA editing site in insect Adar gene. RNA Biol. 2023;20(1):703–714. doi: 10.1080/15476286.2023.2254985
  • Popitsch N, Huber CD, Buchumenski I, et al. A-to-I RNA editing uncovers hidden signals of adaptive genome evolution in animals. Genome Biol Evol. 2020;12(4):345–357. doi: 10.1093/gbe/evaa046
  • Palladino MJ, Keegan LP, O’Connell MA, et al. dADAR, a Drosophila double-stranded RNA-specific adenosine deaminase is highly developmentally regulated and is itself a target for RNA editing. RNA. 2000;6(7):1004–1018. doi: 10.1017/S1355838200000248
  • Savva YA, Jepson JE, Sahin A, et al. Auto-regulatory RNA editing fine-tunes mRNA re-coding and complex behaviour in Drosophila. Nat Commun. 2012;3(1):790. doi: 10.1038/ncomms1789
  • Zhang R, Deng P, Jacobson D, et al. Evolutionary analysis reveals regulatory and functional landscape of coding and non-coding RNA editing. PloS Genet. 2017;13(2):e1006563. doi: 10.1371/journal.pgen.1006563
  • Zhang Y, Duan Y. Genome-wide analysis on driver and passenger RNA editing sites suggests an underestimation of adaptive signals in insects. Genes (Basel). 2023;14(10):1951. doi: 10.3390/genes14101951
  • Picardi E, D’Erchia AM, Lo Giudice C, et al. REDIportal: a comprehensive database of A-to-I RNA editing events in humans. Nucleic Acids Res. 2017;45(D1):D750–D757. doi: 10.1093/nar/gkw767
  • Ramaswami G, Li JB. RADAR: a rigorously annotated database of A-to-I RNA editing. Nucleic Acids Res. 2014;42(Database issue):D109–113. doi: 10.1093/nar/gkt996
  • An NA, Ding W, Yang XZ, et al. Evolutionarily significant A-to-I RNA editing events originated through G-to-A mutations in primates. Genome Biol. 2019;20(1):24. doi: 10.1186/s13059-019-1638-y
  • Licht K, Kapoor U, Amman F, et al. A high resolution A-to-I editing map in the mouse identifies editing events controlled by pre-mRNA splicing. Genome Res. 2019;29(9):1453–1463. doi: 10.1101/gr.242636.118
  • Adetula AA, Fan X, Zhang Y, et al. Landscape of tissue-specific RNA editome provides insight into co-regulated and altered gene expression in pigs (Sus scrofa). RNA Biol. 2021;18(sup1):439–450. doi: 10.1080/15476286.2021.1954380
  • Porath HT, Schaffer AA, Kaniewska P, et al. A-to-I RNA editing in the earliest-diverging eumetazoan phyla. Mol Biol Evol. 2017;34(8):1890–1901. doi: 10.1093/molbev/msx125
  • Keegan LP, McGurk L, Palavicini JP, et al. Functional conservation in human and Drosophila of metazoan ADAR2 involved in RNA editing: loss of ADAR1 in insects. Nucleic Acids Res. 2011;39(16):7249–7262. doi: 10.1093/nar/gkr423
  • Grenier JK, Arguello JR, Moreira MC, et al. Global diversity lines - a five-continent reference panel of sequenced Drosophila melanogaster strains. G3 (Bethesda). 2015;5(4):593–603. doi: 10.1534/g3.114.015883
  • Kuehn BM. 1000 genomes project promises closer look at variation in human genome. JAMA. 2008;300(23):2715. doi: 10.1001/jama.2008.823
  • Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–780. doi: 10.1093/molbev/mst010
  • Abascal F, Zardoya R, Telford MJ. TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations. Nucleic Acids Res. 2010;38: W7–W13. doi: 10.1093/nar/gkq291