Addendum

Tracing common origins of Genomic Islands in prokaryotes based on genome signature analyses

Pages 247-249 | Received 11 Sep 2011, Accepted 27 Sep 2011, Published online: 01 Sep 2011

Abstract

Horizontal gene transfer constitutes a powerful and innovative force in evolution, but often little is known about the actual origins of transferred genes. Sequence alignments are generally of limited use in tracking the original donor, since only a small fraction of the total genetic diversity is thought to have been uncovered so far. Alternatively, approaches based on similarities in genome-specific relative oligonucleotide frequencies do not require alignments. Even though these compositional analyses may still not establish the exact origins of horizontally transferred genes, they do suggest that compositionally very similar regions are likely to have had a common origin. These analyses have shown that up to a third of large acquired gene clusters that reside in the same genome are compositionally very similar, indicative of a shared origin. This brings us closer to uncovering the original donors of horizontally transferred genes, and could help in elucidating possible regulatory interactions between previously unlinked sequences.

The significance of horizontal gene transfer (HGT) to the evolution of prokaryote genomes can hardly be overstated. Network analyses have revealed that the majority of all gene families have been implicated in HGT at some point in their evolutionary history,Citation1 and experimental data have suggested that few genes in fact resist transfer.Citation2 The adaptive benefits of horizontally acquired genes to their new host have been implied by their persistence over time,Citation3 their role in the diversification and expansion of protein families,Citation4 and, importantly, their association with pathogenesis.Citation5 Still, the majority of recently horizontally transferred genes have disclosed little about their specific functions, and most are therefore annotated as hypothetical protein-coding genes.

Acquired genes are frequently encountered in clusters, which occasionally allows for a generalized classification based on the functionality of the combined gene cluster, e.g., as a Pathogenicity Island, Symbiosis Island, Metabolic Island or Resistance Island,Citation6–Citation8 suggesting that these gene clusters help to assemble the host genome in a modular fashion.Citation9 Collectively, these large acquired gene clusters are termed Genomic Islands (GIs), and a number of computational tools exist to identify these GIs in fully sequenced genomes.Citation10 Such tools fall into two main categories: they are based either on identifying differentially distributed gene clusters in available genomes of closely related species or on detecting genomic regions with an atypical sequence composition in individual genomes. IslandViewer, which incorporates several GI detection methods belonging to both of the aforementioned categories, represents a vast repository of GIs in fully sequenced prokaryotic genomes,Citation11 and shows that GIs constitute a substantial fraction of bacterial genomes.

Although comparative analyses are expected to perform better at discovering putative GIs than composition-based analyses, the latter approach has an additional application besides the identification of GIs. First, compositional analyses are based on the relative frequencies of oligonucleotides, which are found to be remarkably stable within a species and are therefore termed the genome signature.Citation12,Citation13 Acquired gene clusters are often compositionally very dissimilar from their host genome and can therefore be easily identified,Citation14 and can even be selectively isolated in vitro.Citation15 Compositional analyses are especially useful if only a single representative genome of a species is available and the presence and absence of GIs across isolates cannot be assessed. Second, and more interestingly, compositional analyses allow for intragenomic comparisons of GIs with each other. The clustering of sequences based on oligonucleotide frequencies has previously been used to assign metagenomic sequences to specific taxonomic groups.Citation16,Citation17 Taking this reasoning one step further, GIs can be viewed as a special type of metagenomic data for which the original donor species designation is by and large unknown. Therefore, if strict similarity cut-offs are employed, GIs could be readily ‘phylogenetically’ classified and clustered together if they are sufficiently similar in genome signature.
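To make the notion of a genome signature concrete, the sketch below computes one widely used form of it, the dinucleotide relative abundance of Karlin and Burge (Citation13), ρ*_xy = f_xy / (f_x f_y), evaluated on a sequence together with its reverse complement, and the average absolute difference between two such profiles as a dissimilarity score. This is an illustration only: the text above does not specify the oligonucleotide length or the distance measure, and the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (non-ACGT symbols are kept)."""
    return seq.translate(COMPLEMENT)[::-1]


def dinucleotide_signature(seq):
    """Strand-symmetric dinucleotide relative abundances rho*_xy = f_xy / (f_x * f_y)."""
    seq = seq.upper()
    mono = {b: 0 for b in "ACGT"}
    di = {"".join(p): 0 for p in product("ACGT", repeat=2)}
    # Count on both strands so the profile does not depend on strand choice.
    for strand in (seq, reverse_complement(seq)):
        for base in strand:
            if base in mono:
                mono[base] += 1
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair in di:  # skips pairs containing ambiguous symbols such as N
                di[pair] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_di == 0:
        raise ValueError("sequence contains no countable dinucleotides")
    signature = {}
    for pair, count in di.items():
        fx, fy = mono[pair[0]] / n_mono, mono[pair[1]] / n_mono
        signature[pair] = (count / n_di) / (fx * fy) if fx > 0 and fy > 0 else 0.0
    return signature


def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two signatures over the 16 dinucleotides."""
    sig_a, sig_b = dinucleotide_signature(seq_a), dinucleotide_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16
```

A GI whose distance to its host chromosome is large would count as compositionally atypical, whereas two GIs with a small mutual distance would be candidates for a shared origin. Longer oligonucleotides (for example tetranucleotides, as in Citation16) could be substituted in the same scheme.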

We hypothesized that if a successful horizontal gene transfer event has already happened once from a donor to an acceptor, subsequent directional transfer events are also possible, or even likely. A comparative analysis of all GIs from a single genome could link acquired gene clusters that originated from the same donor. This hypothesis was first tested with the GIs detected in the genome of Vibrio vulnificus.Citation18 For example, in this species we found that two compositionally similar GIs were annotated as separate horizontally transferred regions, although the separation was merely an artificial split caused by the linearized annotation of the chromosome, with one GI at the start of the chromosome and one GI at the very end. This observation further supported our hypothesis that acquired gene clusters from the same source could in principle be evolutionarily linked based on the genome signature.
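The artificial split described above can be checked for mechanically. The following sketch assumes GIs are given as 1-based inclusive (start, end) coordinates on a linearized circular chromosome, and joins the first and last GI when they are adjacent across the coordinate origin. The function name and the `max_gap` parameter are hypothetical; this is not the procedure used in the original analysis.

```python
def merge_origin_spanning(islands, chrom_len, max_gap=0):
    """Join a GI at the chromosome start with a GI at the chromosome end.

    islands   : list of (start, end) tuples, 1-based inclusive coordinates
    chrom_len : length of the circular chromosome
    max_gap   : tolerated number of bases between the two GIs across the origin
    A merged GI is reported as (start, end) with start > end (wrap-around).
    """
    islands = sorted(islands)
    if len(islands) < 2:
        return islands
    first, last = islands[0], islands[-1]
    # Bases separating the two GIs when walking across the origin.
    gap = (first[0] - 1) + (chrom_len - last[1])
    if gap <= max_gap:
        wrapped = (last[0], first[1])
        return islands[1:-1] + [wrapped]
    return islands
```

For instance, GIs at (1, 5000) and (995000, 1000000) on a 1,000,000 bp chromosome would be reported as the single wrap-around interval (995000, 5000).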

Next, the approach of detecting recurrent transfer events per genome was applied to a wide range of prokaryotic genomes using the vast number of GIs annotated in IslandViewer.Citation19 Depending on the similarity threshold, up to 30% of all GIs per genome were found to originate from the same source. The thresholds were based on the compositional resemblance between a core genomic fragment that was extremely similar to the host genome, a so-called Core Island, and its corresponding chromosome. In addition, we validated our clustering accuracies by simulations. In these simulations we pooled a large number of stretches of genomic DNA from a range of prokaryotes, and tested whether the clustering approach used for the GIs preferentially grouped sequences stemming from the same genome. Because the actual host of each sequence was known, the assignments could be verified: the stricter the threshold, the more frequently sequences from the same genome were grouped together. These assignments were correct in 95–100% of the simulations, indicating that even our most lenient threshold still had relatively high accuracy. This high accuracy in the simulations suggests that the GI clustering is reliable, and that up to a third of all GIs per genome are likely to have originated from a compositionally similar source, possibly the same species or even the exact same donor.
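The thresholding and validation logic described above can be sketched as follows. The example assumes a precomputed symmetric matrix of pairwise signature distances and uses single-linkage grouping, which is a stand-in chosen for simplicity: the text does not state which clustering algorithm was used, and all function names here are hypothetical.

```python
from itertools import combinations


def cluster_by_threshold(dist, threshold):
    """Group items whose pairwise distance is <= threshold (single linkage)."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if dist[i][j] <= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def fraction_clustered(clusters):
    """Fraction of items that share a cluster with at least one other item."""
    total = sum(len(c) for c in clusters)
    shared = sum(len(c) for c in clusters if len(c) > 1)
    return shared / total if total else 0.0


def same_source_accuracy(clusters, labels):
    """Fraction of co-clustered pairs whose known source labels agree."""
    pairs = [(i, j) for c in clusters for i, j in combinations(c, 2)]
    if not pairs:
        return None  # nothing was grouped, so accuracy is undefined
    return sum(labels[i] == labels[j] for i, j in pairs) / len(pairs)
```

Applied to the GIs of one genome, `fraction_clustered` corresponds to the share of GIs with at least one compositionally similar partner. Applied to pooled fragments of known origin, `same_source_accuracy` mimics the simulation check: a stricter threshold groups fewer fragments but should group them more reliably.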

Still, it is important to note that although this composition-based approach enables the grouping of GIs based on similarities in genome signature, it does not rule out the possibility of intragenomic dispersal after a single acquisition of a larger stretch of DNA. GIs are often unstable genetic entities,Citation6,Citation8 which are prone to be mobilized throughout the genome by GI-encoded mobile genetic elements such as integrase- or transposase-encoding genes. But whether distinct GIs were acquired from the same source separately, or in a single event after which they dispersed throughout the genome, in both cases the clustered GIs likely originate from the same donor, and are therefore evolutionarily linked.

Acquired gene clusters that stem from the same donor may share a regulatory organization governed, for example, by transcription factors or small regulatory RNAs (sRNAs). sRNAs have been detected in GIs, and some of these are known to act upon targets located outside of the GI.Citation20,Citation21 High-throughput sequencing analyses in Escherichia coli have shown that these sRNAs are present in many predicted GIs,Citation22 as well as in GIs of human pathogensCitation23 or in a reduced genome.Citation24 Importantly, some sRNAs on GIs are thought to be involved in pathogenicity, as was found for Staphylococcus aureus.Citation25 Together, this means that GIs could frequently encode regulatory functions that have effects elsewhere in the genome, possibly also on other gene clusters stemming from the same origin. Combining databases such as IslandViewer with sRNAMapCitation26 could be useful to uncover the intricacies of the transferred regulatory logic of sRNAs in prokaryotes, and to see whether these non-coding regulatory elements evolve more rapidly than other horizontally acquired genes.Citation27
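Combining GI and sRNA annotations, as suggested above, comes down to a coordinate comparison once both are expressed on the same replicon. The sketch below works on plain dictionaries of inclusive (start, end) coordinates. It makes no assumption about the actual export formats of IslandViewer or sRNAMap, and the function name and example identifiers are hypothetical.

```python
def srnas_in_islands(islands, srnas):
    """Report which sRNA genes lie entirely within which genomic island.

    islands : dict mapping island name -> (start, end), inclusive coordinates
    srnas   : dict mapping sRNA name   -> (start, end), same replicon and numbering
    Returns a dict mapping island name -> sorted list of contained sRNA names;
    islands without any contained sRNA are omitted.
    """
    hits = {}
    for island, (gi_start, gi_end) in islands.items():
        inside = sorted(
            name
            for name, (start, end) in srnas.items()
            if gi_start <= start and end <= gi_end
        )
        if inside:
            hits[island] = inside
    return hits
```

Intersecting such hits with clusters of compositionally similar GIs would then indicate whether sRNAs are encoded on islands that plausibly share a donor.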

The ability to group distinct GIs based on oligonucleotide composition similarity can help to uncover a common ancestry for large acquired gene clusters in microbial genomes. For clustered GIs, the donor of both sequences would be known if the origin of one of them was known, for example due to the presence of a specific signature sequence.Citation28 This could shed light on the vast gene pool from which most GIs are thought to originate,Citation29 and on the possible interactions between different acquired regulatory elements and the resident regulatory network.

Acknowledgments

M.W.J.vP. is funded by the Netherlands Organization for Scientific Research (NWO) via a VENI grant. This work was further supported by the BioAssist program of the Netherlands Bioinformatics Centre (NBIC), which is supported by the Netherlands Genomics Initiative (NGI). Erik Roos is gratefully acknowledged for his contribution to the original manuscript discussed here, and Michiel Vos and Jon Bohlin are acknowledged for critically reading the manuscript.

References

  • Dagan T, Artzy-Randrup Y, Martin W. Modular networks and cumulative impact of lateral transfer in prokaryote genome evolution. Proc Natl Acad Sci USA 2008; 105:10039 - 10044; PMID: 18632554; http://dx.doi.org/10.1073/pnas.0800679105
  • Sorek R, Zhu Y, Creevey CJ, Francino MP, Bork P, Rubin EM. Genome-wide experimental determination of barriers to horizontal gene transfer. Science 2007; 318:1449 - 1452; PMID: 17947550; http://dx.doi.org/10.1126/science.1147112
  • van Passel MW, Marri PR, Ochman H. The emergence and fate of horizontally acquired genes in Escherichia coli. PLOS Comput Biol 2008; 4:e1000059; PMID: 18404206; http://dx.doi.org/10.1371/journal.pcbi.1000059
  • Treangen TJ, Rocha EP. Horizontal transfer, not duplication, drives the expansion of protein families in prokaryotes. PLoS Genet 2011; 7:e1001284; PMID: 21298028; http://dx.doi.org/10.1371/journal.pgen.1001284
  • Pallen MJ, Wren BW. Bacterial pathogenomics. Nature 2007; 449:835 - 842; PMID: 17943120; http://dx.doi.org/10.1038/nature06248
  • Juhas M, van der Meer JR, Gaillard M, Harding RM, Hood DW, Crook DW. Genomic islands: tools of bacterial horizontal gene transfer and evolution. FEMS Microbiol Rev 2009; 33:376 - 393; PMID: 19178566; http://dx.doi.org/10.1111/j.1574-6976.2008.00136.x
  • Coleman ML, Sullivan MB, Martiny AC, Steglich C, Barry K, Delong EF, et al. Genomic islands and the ecology and evolution of Prochlorococcus. Science 2006; 311:1768 - 1770; PMID: 16556843; http://dx.doi.org/10.1126/science.1122050
  • Dobrindt U, Hochhut B, Hentschel U, Hacker J. Genomic islands in pathogenic and environmental microorganisms. Nat Rev Microbiol 2004; 2:414 - 424; PMID: 15100694; http://dx.doi.org/10.1038/nrmicro884
  • O'Connor TJ, Adepoju Y, Boyd D, Isberg RR. Minimization of the Legionella pneumophila genome reveals chromosomal regions involved in host range expansion. Proc Natl Acad Sci USA 2011; 108:14733 - 14740; PMID: 21873199; http://dx.doi.org/10.1073/pnas.1111678108
  • Langille MG, Hsiao WW, Brinkman FS. Detecting genomic islands using bioinformatics approaches. Nat Rev Microbiol 2010; 8:373 - 382; PMID: 20395967; http://dx.doi.org/10.1038/nrmicro2350
  • Langille MG, Brinkman FS. IslandViewer: an integrated interface for computational identification and visualization of genomic islands. Bioinformatics 2009; 25:664 - 665; PMID: 19151094; http://dx.doi.org/10.1093/bioinformatics/btp030
  • Campbell A, Mrazek J, Karlin S. Genome signature comparisons among prokaryote, plasmid, and mitochondrial DNA. Proc Natl Acad Sci USA 1999; 96:9184 - 9189; PMID: 10430917; http://dx.doi.org/10.1073/pnas.96.16.9184
  • Karlin S, Burge C. Dinucleotide relative abundance extremes: a genomic signature. Trends Genet 1995; 11:283 - 290; PMID: 7482779; http://dx.doi.org/10.1016/S0168-9525(00)89076-9
  • Karlin S. Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes. Trends Microbiol 2001; 9:335 - 343; PMID: 11435108; http://dx.doi.org/10.1016/S0966-842X(01)02079-0
  • van Passel MW, Bart A, Waaijer RJ, Luyf AC, van Kampen AH, van der Ende A. An in vitro strategy for the selective isolation of anomalous DNA from prokaryotic genomes. Nucleic Acids Res 2004; 32:e114; PMID: 15304543; http://dx.doi.org/10.1093/nar/gnh115
  • Teeling H, Meyerdierks A, Bauer M, Amann R, Glockner FO. Application of tetranucleotide frequencies for the assignment of genomic fragments. Environ Microbiol 2004; 6:938 - 947; PMID: 15305919; http://dx.doi.org/10.1111/j.1462-2920.2004.00624.x
  • McHardy AC, Martin HG, Tsirigos A, Hugenholtz P, Rigoutsos I. Accurate phylogenetic classification of variable-length DNA fragments. Nat Methods 2007; 4:63 - 72; PMID: 17179938; http://dx.doi.org/10.1038/nmeth976
  • van Passel MW, Bart A, Thygesen HH, Luyf AC, van Kampen AH, van der Ende A. An acquisition account of genomic islands based on genome signature comparisons. BMC Genomics 2005; 6:163; PMID: 16297239; http://dx.doi.org/10.1186/1471-2164-6-163
  • Roos TE, van Passel MW. A quantitative account of genomic island acquisitions in prokaryotes. BMC Genomics 2011; 12:427; PMID: 21864345; http://dx.doi.org/10.1186/1471-2164-12-427
  • Padalon-Brauch G, Hershberg R, Elgrably-Weiss M, Baruch K, Rosenshine I, Margalit H, et al. Small RNAs encoded within genetic islands of Salmonella typhimurium show host-induced expression and role in virulence. Nucleic Acids Res 2008; 36:1913 - 1927; PMID: 18267966; http://dx.doi.org/10.1093/nar/gkn050
  • Pfeiffer V, Sittka A, Tomer R, Tedin K, Brinkmann V, Vogel J. A small non-coding RNA of the invasion gene island (SPI-1) represses outer membrane protein synthesis from the Salmonella core genome. Mol Microbiol 2007; 66:1174 - 1191; PMID: 17971080; http://dx.doi.org/10.1111/j.1365-2958.2007.05991.x
  • Shinhara A, Matsui M, Hiraoka K, Nomura W, Hirano R, Nakahigashi K, et al. Deep sequencing reveals as-yet-undiscovered small RNAs in Escherichia coli. BMC Genomics 2011; 12:428; PMID: 21864382; http://dx.doi.org/10.1186/1471-2164-12-428
  • Sharma CM, Hoffmann S, Darfeuille F, Reignier J, Findeiss S, Sittka A, et al. The primary transcriptome of the major human pathogen Helicobacter pylori. Nature 2010; 464:250 - 255; PMID: 20164839; http://dx.doi.org/10.1038/nature08756
  • Güell M, van Noort V, Yus E, Chen WH, Leigh-Bell J, Michalodimitrakis K, et al. Transcriptome complexity in a genome-reduced bacterium. Science 2009; 326:1268 - 1271; PMID: 19965477; http://dx.doi.org/10.1126/science.1176951
  • Pichon C, Felden B. Small RNA genes expressed from Staphylococcus aureus genomic and pathogenicity islands with specific expression among pathogenic strains. Proc Natl Acad Sci USA 2005; 102:14249 - 14254; PMID: 16183745; http://dx.doi.org/10.1073/pnas.0503838102
  • Huang HY, Chang HY, Chou CH, Tseng CP, Ho SY, Yang CD, et al. sRNAMap: genomic maps for small non-coding RNAs, their regulators and their targets in microbial genomes. Nucleic Acids Res 2009; 37:D150 - D154; PMID: 19015153; http://dx.doi.org/10.1093/nar/gkn852
  • Lercher MJ, Pal C. Integration of horizontally transferred genes into regulatory interaction networks takes many million years. Mol Biol Evol 2008; 25:559 - 567; PMID: 18158322; http://dx.doi.org/10.1093/molbev/msm283
  • Sandberg R, Winberg G, Branden CI, Kaske A, Ernberg I, Coster J. Capturing whole-genome characteristics in short sequences using a naive Bayesian classifier. Genome Res 2001; 11:1404 - 1409; PMID: 11483581; http://dx.doi.org/10.1101/gr.186401
  • Hsiao WW, Ung K, Aeschliman D, Bryan J, Finlay BB, Brinkman FS. Evidence of a large novel gene pool associated with prokaryotic genomic islands. PLoS Genet 2005; 1:e62; PMID: 16299586; http://dx.doi.org/10.1371/journal.pgen.0010062