Plastome Announcements

The complete chloroplast genome of Thermopsis lanceolata: genome structure and its phylogenetic relationships within the family Fabaceae

Pages 2076-2080 | Received 29 Aug 2022, Accepted 30 Nov 2022, Published online: 12 Dec 2022

Abstract

Thermopsis lanceolata R. Br. belongs to the genus Thermopsis, Fabaceae. The alkaloids of T. lanceolata have anti-cancer, anti-arrhythmic, and other pharmacological effects. To explore the chloroplast genome of T. lanceolata and its phylogenetic relationships, the complete chloroplast genome of T. lanceolata was sequenced and annotated, and a phylogenetic tree was constructed. The complete chloroplast genome of T. lanceolata is a circular molecule of 151,526 bp, consisting of a large single copy (LSC) region of 83,780 bp, a small single copy (SSC) region of 15,566 bp, and a pair of inverted repeats (IRa and IRb) of 26,090 bp each. The chloroplast genome contains 130 genes, including 85 protein-coding genes, 37 transfer RNA genes, and 8 ribosomal RNA genes. Phylogenetic analysis revealed a close relationship between T. lanceolata and T. turkestanica.

Introduction

Thermopsis lanceolata R. Br. (Robert Brown, 1811), a member of the genus Thermopsis, is a rhizomatous perennial herb mainly distributed in Northwestern China, Mongolia, Kazakhstan, and Uzbekistan (Minggagud and Yang 2013; Jiang et al. 2017; Figure 1). T. lanceolata has high medicinal value: its alkaloids have various pharmacological effects (Zhang et al. 2022), including anti-cancer, anti-arrhythmic, anti-microbial, and anti-ulcer properties (Shakirov and Sabirov 1970; Vinogradova et al. 1972; Gao et al. 1999; Hu et al. 2022; Zhang et al. 2022). T. lanceolata is also a poisonous plant, and animals that eat its seeds or whole plants can be poisoned or even killed (Vinogradova et al. 1971).

Figure 1. A flowering T. lanceolata.

Chloroplasts are the site of photosynthesis in plants. Chloroplast genomes are structurally stable and conserved (Neuhaus and Emes 2000) and are widely used in comparative genomics and phylogenetic studies (Dong et al. 2013; Han et al. 2022). However, the chloroplast genome of T. lanceolata has not previously been reported, and chloroplast genomes have been reported for only 2 of the 30 species in the genus Thermopsis. Analysis of the chloroplast genome of T. lanceolata is therefore important for a comprehensive understanding of the phylogenetics of the genus. Here, we sequenced, assembled, and annotated the chloroplast genome of T. lanceolata and characterized its structure. Our study provides important data for genomic studies of Thermopsis species and for phylogenetic studies of Fabaceae.

Materials

Fresh leaves of T. lanceolata were collected from Yijinhuoluo County, Ordos, Inner Mongolia Autonomous Region, China (39°33′48.21″N, 109°44′32.56″E). The voucher specimen (specimen number: 20191110-04) was deposited in the College of Life and Environmental Sciences, Minzu University of China, Beijing (Yu-Ke Gen, [email protected]).

Methods

Total genomic DNA was extracted using the CTAB method (Cota-Sánchez et al. 2006) and sequenced on the Illumina HiSeq X Ten platform. A total of 10 GB of clean paired-end reads (150 bp) were generated and used for chloroplast genome assembly. SPAdes v. 3.15.5 (Bankevich et al. 2012) and GetOrganelle v. 1.7.6.1 (Jin et al. 2018) were used to assemble the chloroplast genome. Geneious v. 8.0.2 (Kearse et al. 2012) and the Perl script Plastid Genome Annotator (PGA) (Qu et al. 2019) were used to annotate the genome. A circular genome map was drawn using the OGDRAW program (Lohse et al. 2007; https://chlorobox.mpimp-golm.mpg.de/OGDraw.html). The annotated chloroplast genome sequence was deposited in GenBank under accession number MN841458, and the raw reads were also deposited (SRA: SRR13871830; BioProject: PRJNA706880; BioSample: SAMN18131232). To determine the phylogenetic position of T. lanceolata within Fabaceae, 13 published plastome sequences were downloaded from GenBank. The sequences of 64 protein-coding genes from the 14 Fabaceae species were aligned with MAFFT v. 7.471 (Katoh and Standley 2013). A phylogenetic tree was then generated in MEGA X (Kumar et al. 2018) using the maximum likelihood (ML) method with 1000 bootstrap replicates under the General Time Reversible + Proportion Invariant + Gamma (GTR + I + G) model. Glycine max (NC_007942.1) was set as the outgroup. Images were drawn with iTOL (Letunic and Bork 2021) and Adobe Illustrator.
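Bootstrap support in an ML analysis is obtained by resampling alignment columns with replacement and re-estimating the tree on each replicate. The following Python sketch illustrates only that resampling step on a toy alignment. It is not the MEGA X implementation, and the function name `bootstrap_alignment` and the toy sequences are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns resampled with replacement.

    `alignment` maps a taxon name to its aligned sequence; all sequences
    must have the same length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have equal length")
    n_cols = lengths.pop()
    # Draw column indices with replacement; the same draw is used for every taxon
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {taxon: "".join(seq[i] for i in cols) for taxon, seq in alignment.items()}

# Toy alignment (hypothetical sequences, for illustration only)
toy = {
    "T_lanceolata": "ATGGCATTAC",
    "T_turkestanica": "ATGGCATTAT",
    "G_max": "ATGACGTTAC",
}
rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

Each replicate keeps the original alignment length and taxon set; a tree would be inferred from every replicate, and the fraction of replicate trees containing a given clade gives its bootstrap value.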

Results

General features of the chloroplast genome

The chloroplast genome of T. lanceolata had a typical quadripartite structure with a total length of 151,526 bp (Figure 2), including a large single copy (LSC) region and a small single copy (SSC) region separated by a pair of inverted repeat (IR) regions. The LSC, SSC, and IR lengths were 83,780 bp, 15,566 bp, and 26,090 bp, respectively.
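As a quick consistency check on the reported region lengths, the short Python sketch below confirms that one LSC, one SSC, and two IR copies add up to the total genome length, and computes each region's share of the genome. The variable names are illustrative; the values are those given in the text.

```python
# Region lengths reported in the text (bp)
LSC = 83_780
SSC = 15_566
IR = 26_090       # length of each inverted repeat copy
TOTAL = 151_526

# A quadripartite plastome contains one LSC, one SSC, and two IR copies
computed_total = LSC + SSC + 2 * IR
assert computed_total == TOTAL

# Share of the genome occupied by each region, in percent
shares = {
    "LSC": 100 * LSC / TOTAL,
    "SSC": 100 * SSC / TOTAL,
    "IRs (both copies)": 100 * 2 * IR / TOTAL,
}
for region, pct in shares.items():
    print(f"{region}: {pct:.1f}%")
```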

Figure 2. Chloroplast genome map of T. lanceolata. Genes drawn outside the outer circle are transcribed clockwise, and those inside are transcribed counter-clockwise. Genes belonging to different functional groups are color-coded. The dark gray area in the inner circle indicates the GC content of the chloroplast genome.

The overall GC content of the T. lanceolata chloroplast genome was 36.4%, and GC content differed among regions of the genome (Supplemental Table S1).
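GC content is the percentage of G and C bases in a sequence. The text does not say which tool was used to compute it, so the Python function below is only a minimal sketch of the calculation. The function name `gc_content` and the example sequence are hypothetical, and ambiguous bases such as N are ignored by assumption.

```python
def gc_content(seq):
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100 * gc / (gc + at)

# Toy sequence (hypothetical, not taken from the T. lanceolata plastome)
example = "ATGCGCATTA"
print(f"{gc_content(example):.1f}%")
```

Applying the same function separately to the LSC, SSC, and IR subsequences would give per-region values of the kind summarized in Supplemental Table S1.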

A total of 130 genes were annotated in the assembled chloroplast genome, including 85 protein-coding genes, 37 tRNAs, and 8 rRNAs. Among them, 100 genes were present as a single copy, and 15 genes were duplicated in the IR regions (rpl2, rpl23, ycf15, trnL-CAA, ndhB, rps7, rps12, trnV-GAC, rrn16, trnI-GAU, trnA-UGC, rrn23, rrn4.5, trnR-ACG, and trnN-GUU). There were 12 intron-containing genes (rpl2, rpoC1, rps16, trnT-GGU, trnV-UAC, trnK-UUU, trnL-UAA, trnA-UGC, trnE-UUC, ndhA, ndhB, and atpF) and 3 genes (rps12, ycf3, and clpP) with two introns (Table 1).
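The reported gene counts can be cross-checked in two ways: by gene type, and by copy number, where each IR-duplicated gene contributes two copies. The Python sketch below performs both tallies using only the numbers and gene names given in the text; the variable names are illustrative.

```python
# Counts by gene type, as reported in the text
protein_coding, trna, rrna = 85, 37, 8
total_by_type = protein_coding + trna + rrna

# Counts by copy number
single_copy = 100
duplicated_in_ir = [
    "rpl2", "rpl23", "ycf15", "trnL-CAA", "ndhB", "rps7", "rps12",
    "trnV-GAC", "rrn16", "trnI-GAU", "trnA-UGC", "rrn23", "rrn4.5",
    "trnR-ACG", "trnN-GUU",
]
# Each IR-duplicated gene is present in both IRa and IRb
total_by_copies = single_copy + 2 * len(duplicated_in_ir)

assert total_by_type == 130
assert total_by_copies == 130
```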

Table 1. Gene composition information and functional classification of T. lanceolata chloroplast genome.

Phylogenetic analysis of T. lanceolata in the family Fabaceae

Using the chloroplast genomes of 13 ingroup species and 1 outgroup species (G. max), a phylogenetic tree was constructed with the maximum likelihood method. The tree showed that species of Thermopsis and Piptanthus clustered into a single clade. T. lanceolata was sister to T. turkestanica, indicating that the two are more closely related to each other than to the other species sampled (Figure 3).

Figure 3. Phylogenetic tree of 14 Fabaceae species inferred with the maximum likelihood (ML) method, based on alignments of 64 protein-coding genes shared among the 14 species. G. max was set as the outgroup. Numbers next to the branches indicate bootstrap values from 1000 replicates.

Discussion

In the present study, the chloroplast genome of T. lanceolata was sequenced, assembled, and annotated for the first time. Its structure is similar to that of most Fabaceae species, with a typical quadripartite organization comprising an SSC, an LSC, and two IRs. The loss of an IR region reported in the Fabaceae IRLC branch (Magee et al. 2010) was not observed in T. lanceolata. In the T. lanceolata chloroplast genome, GC content was higher in the IR regions than in the LSC and SSC regions. This pattern has also been reported in other plants, and it has been proposed that the high GC content of the IR regions may be associated with the concentration of rRNA genes, which have high GC content, in these regions (Li et al. 2016).

The species in the present study belong to the tribe Sophoreae in Fabaceae, which contains 17 genera. In the APG (2016) classification system, the genera Baptisia, Anagyris, Piptanthus, and Thermopsis were classified under the subtribe Thermopsidinae. Here, 14 chloroplast genomes were used to conduct a phylogenetic analysis, and the result showed that Piptanthus and Thermopsis are more closely related to each other than to the other genera. This is consistent with a phylogenetic analysis based on nuclear ITS and four cpDNA regions (matK, rbcL, trnH-psbA, and trnL-trnF) (Shi et al. 2017).

T. lanceolata is a widely distributed herb in Fabaceae. Although studies have shown that it has medicinal value, it remains little studied. Our research provides important data for understanding the phylogenetic position of T. lanceolata within the genus Thermopsis and is of significance for the development of molecular markers for the species.

Ethical approval

This study includes no human, animal, or endangered plant samples, and the sampling site is not located in any protected area. The field study and laboratory study were conducted in accordance with guidelines provided by Minzu University of China.

Author contributions

Yijun Zhou and Fei Gao oversaw the project. Tashi Dorjee collected the sample and extracted the genomic DNA. Tashi Dorjee was responsible for assembling and annotating the genome and for analyzing the data. Tashi Dorjee and Fei Gao revised the manuscript, and Tashi Dorjee gave final approval of the version to be published. All authors agree to be accountable for all aspects of the work.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The genome sequence data that support the findings of this study are openly available in GenBank of NCBI at https://www.ncbi.nlm.nih.gov/ under accession no. MN841458. The associated BioProject, SRA, and BioSample numbers are PRJNA706880, SRR13871830, and SAMN18131232, respectively.

Additional information

Funding

This work was financially supported by the National Natural Science Foundation of China [31770363 and 31670335], the Ministry of Education of China through the Double First-rate plan [Yldxxk201819], and the Graduate Research and Practice Projects of Minzu University of China.

References

  • Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, et al. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 19(5):455–477.
  • Cota-Sánchez JH, Remarchuk K, Ubayasena K. 2006. Ready-to-use DNA extracted with a CTAB method adapted for herbarium specimens and mucilaginous plant tissue. Plant Mol Biol Rep. 24(2):161–167.
  • Dong WP, Xu C, Cheng T, Lin K, Zhou SL. 2013. Sequencing angiosperm plastid genomes made easy: a complete set of universal primers and a case study on the phylogeny of Saxifragales. Genome Biol Evol. 5(5):989–997.
  • Gao WY, Li YM, Jiang SH, Zhu DY. 1999. Alkaloids from Thermopsis lanceolata. Nat Prod Res Dev. 11(2):1–3.
  • Han C, Ding R, Zong X, Zhang L, Chen X, Qu B. 2022. Structural characterization of Platanthera ussuriensis chloroplast genome and comparative analyses with other species of Orchidaceae. BMC Genomics. 23(1):1–13.
  • Hu ZX, Zhang P, Zou JB, An Q, Yi P, Yuan CM, Zhang ZK, Zhao LH, Hao XJ. 2022. Quinolizidine alkaloids with anti-tomato spotted wilt virus and insecticidal activities from the seeds of Thermopsis lanceolata R. Br. J Agric Food Chem. 70(29):9214–9226.
  • Jiang Y, Zhang Y, Wu Y, Hu R, Zhu J, Tao J, Zhang T. 2017. Relationships between aboveground biomass and plant cover at two spatial scales and their determinants in northern Tibetan grasslands. Ecol Evol. 7(19):7954–7964.
  • Jin JJ, Yu WB, Yang JB, Song Y, Yi TS, Li DZ. 2018. GetOrganelle: a simple and fast pipeline for de novo assembly of a complete circular chloroplast genome using genome skimming data. BioRxiv. 4:256479.
  • Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 30(4):772–780.
  • Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C, et al. 2012. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 28(12):1647–1649.
  • Kumar S, Stecher G, Li M, Knyaz C, Tamura K. 2018. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 35(6):1547–1549.
  • Letunic I, Bork P. 2021. Interactive Tree of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49(W1):W293–W296.
  • Li FW, Kuo LY, Pryer KM, Rothfels CJ. 2016. Genes translocated into the plastid inverted repeat show decelerated substitution rates and elevated GC content. Genome Biol Evol. 8(8):2452–2458.
  • Lohse M, Drechsel O, Bock R. 2007. OrganellarGenomeDRAW (OGDRAW): a tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Curr Genet. 52(5–6):267–274.
  • Magee AM, Aspinall S, Rice DW, Cusack BP, Sémon M, Perry AS, Stefanović S, Milbourne D, Barth S, Palmer JD, et al. 2010. Localized hypermutation and associated gene losses in legume chloroplast genomes. Genome Res. 20(12):1700–1710.
  • Minggagud H, Yang J. 2013. Wetland plant species diversity in sandy land of a semi-arid inland region of China. Plant Biosyst. 147(1):25–32.
  • Neuhaus HE, Emes MJ. 2000. Nonphotosynthetic metabolism in plastids. Annu Rev Plant Physiol Plant Mol Biol. 51(1):11–40.
  • Qu XJ, Moore MJ, Li DZ, Yi TS. 2019. PGA: a software package for rapid, accurate, and flexible batch annotation of plastomes. Plant Methods. 15(1):1–12.
  • Shakirov TT, Sabirov KA. 1970. The production of cytisine from the seeds of Thermopsis lanceolata. Chem Nat Compd. 6(6):733–734.
  • Shi W, Liu PL, Duan L, Pan BR, Su ZH. 2017. Evolutionary response to the Qinghai-Tibetan Plateau uplift: phylogeny and biogeography of Ammopiptanthus and tribe Thermopsideae (Fabaceae). PeerJ. 5:e3607.
  • Vinogradova VI, Iskandarov S, Yunusov SY. 1971. An investigation of the alkaloids of Thermopsis lanceolata. Chem Nat Compd. 7(4):440–442.
  • Vinogradova VI, Iskandarov S, Yunusov SY. 1972. Dithermamine—a new bimolecular alkaloid from Thermopsis lanceolata. Chem Nat Compd. 8(1):82–85.
  • Zhang P, An Q, Yi P, Cui Y, Zou JB, Yuan CM, Zhang Y, Gu W, Huang LJ, Zhao LH, et al. 2022. Thermlanseedlines A-G, seven thermopsine-based alkaloids with antiviral and insecticidal activities from the seeds of Thermopsis lanceolata R. Br. Fitoterapia. 158:105140.
  • Zhang P, Zou JB, An Q, Yi P, Yuan CM, Huang LJ, Gu W, Hu ZX, Hao XJ. 2022. Two new cytisine-type alkaloids from the seeds of Thermopsis lanceolata. J Asian Nat Prod Res. 25(1):1–9.