Mitogenome Announcement

The complete chloroplast genome of high production individual tree of Coffea arabica L. (Rubiaceae)

Pages 1541-1542 | Received 06 Jan 2019, Accepted 04 Feb 2019, Published online: 18 Apr 2019

Abstract

The complete chloroplast genome of a high-production individual of Coffea arabica (HP1) is 155,191 bp long and has four subregions: a large single copy (LSC) region of 85,164 bp and a small single copy (SSC) region of 18,135 bp, separated by two inverted repeat (IR) regions of 25,946 bp each. It contains 131 genes (86 protein-coding genes, eight rRNAs, and 37 tRNAs). The overall GC content of the chloroplast genome is 37.4%, and the GC contents of the LSC, SSC, and IR regions are 35.4%, 31.3%, and 43.0%, respectively. In comparison with two other coffee chloroplast genomes, three non-synonymous SNPs are identified in ycf1 and ndhF, and two deletions in HP1 are found in ycf1.

Coffea arabica L. accounts for 70% of world coffee production (Lashermes et al. 1999; O'Brien and Kinnaird 2003). Because coffee trees usually cannot survive below 5 °C (Willson 1999), the 'Bean Belt', the region of coffee production, spans from 20°S to 20°N (Bentley and Baker 2000). To overcome this limitation, Mr. Rho tried to select coffee trees better suited to the environment of Jeju Island, Korea (33°N; Part et al., in submission). During this process, he selected one coffee tree that produces about twice the weight of coffee cherries (2–3 kg per year; Figure 1(B)) of the other trees in the same greenhouse (1–2 kg per year; Figure 1(C)). To understand its genetic background, we completed the chloroplast genome of this coffee tree (named HP1).

Figure 1. (A) Neighbor joining (10,000 bootstrap replicates) and maximum likelihood (1,000 bootstrap replicates) phylogenetic trees of five Coffea and four other Rubiaceae complete chloroplast genomes: four Coffea arabica (MK353209 (this study), NC_008535, KY085909, and MK342634), Coffea canephora (NC_030053), Mitragyna speciosa (NC_034698), Dunnia sinensis (NC_039965), Emmenopterys henryi (NC_036300), and Gynochthodes nanlingensis (NC_028614). The tree shown is drawn from the neighbor joining tree. The numbers above the branches indicate bootstrap support values from the neighbor joining and maximum likelihood trees, respectively. (B) Coffee cherries of the HP1 individual. (C) Coffee cherries of a normal coffee tree grown in the same place.

Total DNA of the HP1 individual (YS. Kim, IB-00584 in InfoBoss Cyber Herbarium (IN)) was extracted from fresh leaves using a DNeasy Plant Mini Kit (QIAGEN, Hilden, Germany). Genome sequencing was performed using HiSeqX at Macrogen Inc., Korea, and de novo assembly was done with Velvet 1.2.10 (Zerbino and Birney 2008) and SOAPGapCloser 1.12 (Zhao et al. 2011). Assembled sequences were confirmed with BWA 0.7.17 (Li 2013) and SAMtools 1.9 (Li et al. 2009). Geneious R11 11.0.5 (Biomatters Ltd, Auckland, New Zealand) was used for genome annotation based on the Coffea arabica chloroplast genome (NC_008535; Samson et al. 2007).

The chloroplast genome of HP1 (GenBank accession MK353209) is 155,191 bp long and has four subregions: a large single copy (LSC) region of 85,164 bp and a small single copy (SSC) region of 18,135 bp, separated by two inverted repeat (IR) regions of 25,946 bp each. It contains 131 genes (86 protein-coding genes, 8 rRNAs, and 37 tRNAs); 19 genes (8 protein-coding genes, 4 rRNAs, and 7 tRNAs) are duplicated in the IR regions. The overall GC content is 37.4%, and the GC contents of the LSC, SSC, and IR regions are 35.4%, 31.3%, and 43.0%, respectively.
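The reported region lengths can be checked against the total genome length with a few lines of Python. The sketch below is only an illustration, not part of the authors' workflow: the names REGIONS, COPIES, total_length, weighted_gc, and gc_content are hypothetical, and the only data used are the figures quoted in the text. The length-weighted GC value it computes from the rounded per-region percentages is close to, but need not exactly equal, the reported 37.4%.

```python
# Region lengths (bp) and GC percentages exactly as quoted in the text.
REGIONS = {
    "LSC": (85_164, 35.4),
    "SSC": (18_135, 31.3),
    "IR": (25_946, 43.0),  # length of ONE inverted repeat copy
}
# The quadripartite structure has one LSC, one SSC, and two IR copies.
COPIES = {"LSC": 1, "SSC": 1, "IR": 2}


def total_length(regions, copies):
    """Sum the region lengths, counting each region by its copy number."""
    return sum(length * copies[name] for name, (length, _) in regions.items())


def weighted_gc(regions, copies):
    """Length-weighted GC percentage computed from per-region percentages."""
    gc_bases = sum(
        length * copies[name] * gc / 100
        for name, (length, gc) in regions.items()
    )
    return 100 * gc_bases / total_length(regions, copies)


def gc_content(seq):
    """GC percentage of a sequence string, ignoring non-ACGT symbols."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100 * (seq.count("G") + seq.count("C")) / acgt


if __name__ == "__main__":
    print(total_length(REGIONS, COPIES))           # compare with 155,191 bp
    print(round(weighted_gc(REGIONS, COPIES), 2))  # compare with 37.4%
```

Counting the IR twice is what makes the four subregions add up to the full genome length; gc_content could be applied to each region once the sequence of MK353209 has been downloaded.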

Based on alignment with two C. arabica chloroplast genomes (NC_008535 and MK342634, the latter named CH3; Park et al. 2019), three single nucleotide polymorphisms (SNPs) and four insertions and deletions (INDELs) were identified. One INDEL is of the NC_008535 type, while two INDELs and two SNPs are of the CH3 type. The remaining two INDELs are HP1-specific. Moreover, one ambiguous base of NC_008535 (C/T at 111,618 bp) corresponds to different bases in CH3 and HP1 (A and C, respectively), so three different bases are observed at this position. In HP1, ycf1 contains one non-synonymous SNP (nsSNP) and two deletions, and ndhF contains two nsSNPs.
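How SNPs and INDELs can be read off a pairwise alignment is sketched below in Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names column_kind and find_variants are hypothetical, the input is assumed to be two already-aligned sequences of equal length with "-" as the gap character, and consecutive gap columns are merged into one INDEL. IUPAC ambiguity codes (such as the C/T base mentioned above) are not interpreted, so they would be reported as plain mismatches.

```python
def column_kind(ref_base, qry_base):
    """Classify one alignment column relative to the reference."""
    if ref_base == "-" and qry_base == "-":
        return "skip"
    if ref_base == "-":
        return "insertion"   # extra base(s) in the query
    if qry_base == "-":
        return "deletion"    # base(s) missing from the query
    return "snp" if ref_base.upper() != qry_base.upper() else "match"


def find_variants(ref_aln, qry_aln):
    """Return (kind, reference_position, length) tuples.

    Positions are 1-based reference coordinates. A deletion is placed at
    its first deleted reference base; an insertion is placed at the
    reference base immediately before it.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    ref_pos = 0  # number of reference bases consumed so far
    i = 0
    while i < len(ref_aln):
        kind = column_kind(ref_aln[i], qry_aln[i])
        if kind in ("insertion", "deletion"):
            start = i
            # Merge a run of same-type gap columns into a single INDEL.
            while i < len(ref_aln) and column_kind(ref_aln[i], qry_aln[i]) == kind:
                i += 1
            length = i - start
            pos = ref_pos + 1 if kind == "deletion" else ref_pos
            variants.append((kind, pos, length))
            if kind == "deletion":
                ref_pos += length
            continue
        if kind != "skip":
            ref_pos += 1
            if kind == "snp":
                variants.append(("snp", ref_pos, 1))
        i += 1
    return variants


if __name__ == "__main__":
    # Toy sequences, not real chloroplast data.
    print(find_variants("ACGTACGT-A", "ACCT--GTTA"))
```

Running the same function against each of the two reference genomes and intersecting the results is one way to separate variants shared with one reference from those found only in the new genome.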

Five Coffea and four other Rubiaceae chloroplast genomes were used to construct phylogenetic trees. Whole chloroplast genome sequences were aligned with MAFFT 7.388 (Katoh and Standley 2013), and neighbor joining (10,000 bootstrap replicates) and maximum likelihood (1,000 bootstrap replicates) trees were constructed using MEGA X (Kumar et al. 2018). The phylogenetic trees show that the four C. arabica genomes are poorly resolved because of the small number of sequence variations (Figure 1(A)). These variations are not enough to explain the high production ability of the HP1 tree, which leads us to decipher the mitochondrial or whole genome of HP1 (Unpublished, InfoBoss Co., Ltd.).
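Neighbor joining starts from a matrix of pairwise distances, and each bootstrap replicate is built by resampling alignment columns with replacement. The Python sketch below illustrates both steps on toy data. It is an assumption-laden illustration rather than a description of what MEGA X does internally: the source does not state the substitution model, so a simple p-distance (proportion of differing sites, skipping gap columns) is used here, and the names p_distance, distance_matrix, and bootstrap_replicate are hypothetical.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring columns with a gap."""
    compared = 0
    diffs = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        diffs += x != y
    return diffs / compared if compared else 0.0


def distance_matrix(alignment):
    """Pairwise p-distances for a {name: aligned_sequence} mapping."""
    names = list(alignment)
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m in names
        for n in names
    }


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    # Toy alignment, not real chloroplast data.
    toy = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "AC-TTCGA"}
    print(distance_matrix(toy)[("a", "b")])
    replicate = bootstrap_replicate(toy, random.Random(0))
    print(distance_matrix(replicate)[("a", "b")])
```

When sequences differ at only a handful of sites in roughly 155 kb, most resampled replicates contain few or none of the informative columns, which is one way to see why such closely related genomes receive weak bootstrap support.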

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by InfoBoss Research Grant [IBG-0011].

References

  • Bentley JW, Baker PS. 2000. The Colombian Coffee Growers' Federation: organised, successful smallholder farmers for 70 years. London, UK: ODI.
  • Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 30:772–780.
  • Kumar S, Stecher G, Li M, Knyaz C, Tamura K. 2018. MEGA X: Molecular Evolutionary Genetics Analysis across computing platforms. Mol Biol Evol. 35:1547–1549.
  • Lashermes P, Combes M-C, Robert J, Trouslot P, D'Hont A, Anthony F, Charrier A. 1999. Molecular characterisation and origin of the Coffea arabica L. genome. Mol Gen Genet MGG. 261:259–266.
  • Li H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. New York (NY): Cornell University. arXiv preprint arXiv:1303.3997.
  • Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The sequence alignment/map format and SAMtools. Bioinformatics. 25:2078–2079.
  • O'Brien TG, Kinnaird MF. 2003. Caffeine and conservation. Washington (DC): American Association for the Advancement of Science.
  • Park J, Xi H, Kim Y, Heo K-I, Nho M, Woo J, Seo Y, Yang JH. 2019. The complete chloroplast genome of cold hardiness individual of Coffea arabica L. (Rubiaceae). DOI:10.1080/23802359.2019.1586472
  • Samson N, Bausher MG, Lee SB, Jansen RK, Daniell H. 2007. The complete nucleotide sequence of the coffee (Coffea arabica L.) chloroplast genome: organization and implications for biotechnology and phylogenetic relationships amongst angiosperms. Plant Biotechnol J. 5:339–353.
  • Willson K. 1999. Coffee, cocoa and tea. Wallingford: CAB International.
  • Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.
  • Zhao Q-Y, Wang Y, Kong Y-M, Luo D, Li X, Hao P. 2011. Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study. BMC Bioinformatics. 12:S2.