Research Paper

tRic: a user-friendly data portal to explore the expression landscape of tRNAs in human cancers

Pages 1674-1679 | Received 26 Jun 2019, Accepted 15 Aug 2019, Published online: 25 Aug 2019

ABSTRACT

Transfer RNAs (tRNAs) play critical roles in human cancer. Currently, no database provides the expression landscape and clinical relevance of tRNAs across a variety of human cancers. Utilizing miRNA-seq data from The Cancer Genome Atlas, we quantified the relative expression of tRNA genes and merged them into the codon level and amino acid level across 31 cancer types. The expression of tRNAs is associated with clinical features, including patient smoking history and overall survival as well as disease stage, subtype, and grade. We further analysed codon frequency and amino acid frequency for each protein-coding gene and linked alterations of tRNA expression with protein translational efficiency. We include these data resources in a user-friendly data portal, tRic (tRNA in cancer, https://hanlab.uth.edu/tRic/ or http://bioinfo.life.hust.edu.cn/tRic/), which can be of significant interest to the research community.

Introduction

Transfer RNAs (tRNAs) play critical roles in protein translation by delivering amino acids to initiate and elongate peptide chains [Citation1]. Transcription of tRNAs is mediated by RNA polymerase III, and aberrant tRNA expression contributes to disease [Citation2,Citation3]. For example, overexpression of tRNAiMetCAT (initiator tRNA that identifies a methionyl translation start codon) can enhance global protein synthesis and increase endoplasmic reticulum stress to promote the development of diabetes [Citation4]. Decreased expression of tRNAGlnCTG promotes progression of Huntington’s disease in the early stage by increasing the frequency of translational frame-shifting [Citation5]. In human cancers, enhanced tRNA expression drives mRNA translation and cell growth [Citation6]. For example, expression of tRNAArg in breast cancer is positively correlated with codon frequency in oncogenic signatures, suggesting that tRNAArg overexpression may accelerate the translational efficiency of these oncogenic genes [Citation7–Citation9]. Up-regulation of tRNAGluTTC optimizes EXOSC2 expression to promote metastatic progression of tumours [Citation10].

The Cancer Genome Atlas (TCGA) project generated multi-omic data for more than 10,000 patient samples, including exome-seq, RNA-seq, miRNA-seq, and DNA methylation [Citation11]. It also collected clinical features, including disease stage, patient age, and overall survival. These rich data provide valuable opportunities to understand transcriptomic events and oncogenic pathways [Citation12–Citation16]. Several databases have been developed to help the biomedical research community use this large-scale dataset. For example, cBioPortal provides a web resource for exploring, visualizing, and analysing cancer genomic data, especially for protein-coding genes [Citation17,Citation18]. The Cancer Proteome Atlas includes expression levels of ~200 proteins for >8,000 tumour samples [Citation19]. PancanQTL was developed to explore both trans- and cis-expression quantitative trait loci (eQTLs) across 33 cancer types [Citation20]. Several other databases focus on non-coding RNAs. For example, The Atlas of Non-coding RNA In Cancer focuses on the functions and clinical relevance of long non-coding RNAs [Citation21], while SnoRNA In Cancer focuses on the expression landscape and clinical relevance of small nucleolar RNAs [Citation22]. However, there is still no tRNA database in cancer, likely due to the technical difficulty of estimating tRNA expression levels accurately from high-throughput sequencing data [Citation23]. Recent studies used miRNA-seq to quantify the relative expression level of tRNAs in multiple organisms, including E. coli, yeast, and humans [Citation24–Citation32]. In particular, we used a similar computational pipeline to quantify the relative expression levels of tRNAs from TCGA [Citation33]. We further built a user-friendly database, tRNA In Cancer (tRic), the first comprehensive database for tRNAs in cancer, which can significantly benefit cancer research.

Results and discussion

Data preparation

We collected clinical information, including stage, grade, subtype, patient survival, and smoking history, from ~10,000 patients across 31 human cancers (Figure 1a). We obtained miRNA-seq files for these samples and quantified their expression profiles at the tRNA, codon, and amino acid levels as described in our previous study (Methods and Figure 1b) [Citation33]. We also calculated the frequency of each codon and amino acid for every coding gene in the human genome (Figure 1b). These datasets were deposited in our database.

Figure 1. Data processing and web design of tRic. a. Summary of clinical information across 31 human cancer types in tRic. Full names of cancer types are listed in Table 1. b. Data collection and processing of the tRic dataset, including miRNA-seq, tRNA annotation, and human coding sequences (CDS). QC denotes quality control. c. Interface and infrastructure of tRic.

Database infrastructure

The web interface is built with HTML, CSS, and JavaScript, using libraries such as Bootstrap and jQuery. The backend of the data portal is based on R and data manipulation libraries such as the tidyverse. The Django web framework connects the backend and frontend of the database (Figure 1c). Users can browse or query items of interest on the web pages. We established two mirrored sites for tRic, at https://hanlab.uth.edu/tRic/ and http://bioinfo.life.hust.edu.cn/tRic/. We will continue to maintain and update the database.

Functional modules and examples

tRic has four functional modules: tRNA level, codon level, amino acid level, and codon usage (Figure 2a). In the ‘tRNA level’ module, users can query the expression level of tRNAs in a specific cancer type and/or subgroup. tRic returns the expression level of the tRNAs and, when more than five paired samples are available, the tRNAs that are differentially expressed between tumour and normal samples. For example, tRNA-His-GTG-1-9 is differentially expressed between tumour and normal samples in LUAD (Figure 2b). Users can also run a comprehensive analysis of tRNAs associated with clinical features. For example, tRNA-Arg-TCG-5-1 is associated with patient survival in KIRC (Figure 2c). Expression at the tRNA level was merged into the codon level and the amino acid level, and the ‘codon level’ and ‘amino acid level’ modules provide the same query functions as the ‘tRNA level’ module. For example, tRNAArg(CGT) is differentially expressed among KIRC stages (Figure 2d), tRNAArg(AGA) is differentially expressed among BRCA subtypes (Figure 2e), tRNAGlu is differentially expressed among patients with different smoking histories in LUSC (Figure 2f), and tRNALeu is differentially expressed among LIHC grades (Figure 2g).

Figure 2. Overview of tRic database. a. Four modules in tRic: expression of tRNAs at tRNA level, codon level, and amino acid level, respectively, as well as codon usage. b. Differentially expressed tRNAs between tumour and normal samples. c. Expression of tRNA associated with patient survival. d. Differentially expressed codons among different stages. e. Differentially expressed codons among different subtypes. f. Differentially expressed amino acids among patients with different smoking histories. g. Differentially expressed amino acids among different tumour grades. h. Amino acid frequency of human SRSF2 gene.

tRNAs play important roles in translation by initiating and elongating peptide chains [Citation1]. Alterations in tRNA expression may therefore affect translational efficiency. The ‘codon usage’ module aims to pinpoint potential effects of tRNA expression on protein translation. Users can search a protein-coding gene for its codon frequency and amino acid frequency. For example, the Arg frequency in SRSF2 (23.8%) is markedly higher than the genome-wide average (5.5%), suggesting that tRNAArg overexpression may increase the translational output of SRSF2 (Figure 2h). Users can also search for genes with a high frequency of specific codons or amino acids.

Data download

Expressions at tRNA, codon, and amino acid levels, as well as the codon and amino acid frequency for all protein-coding genes are available on tRic download pages (https://hanlab.uth.edu/tRic/download/ or http://bioinfo.life.hust.edu.cn/tRic/download/).

Conclusion

We have developed the first comprehensive database for tRNA expression in more than 10,000 tumour samples across 31 cancer types. We provide tRNA expression profiles and differential expression between tumour and normal samples and among different groups of samples (e.g., subtypes, stages) at the tRNA, codon, and amino acid levels. We also provide the codon frequency and amino acid frequency for all protein-coding genes in the human genome, which may unveil potential connections between tRNA expression and codon usage bias in gene translation. Our database will provide the biomedical research community with insights into the functions of tRNAs in cancer.

Materials and methods

Clinical information for TCGA samples

Clinical information for TCGA samples was obtained from the TCGA data portal (https://portal.gdc.cancer.gov/). Clinical information for each cancer type, including stage, grade, subtype, patient survival, and smoking history, is summarized in Figure 1a.

Quantification of tRNAs

We downloaded and processed 16,591 miRNA-seq datasets from the TCGA data portal (https://portal.gdc.cancer.gov/) as previously described [Citation22]. In brief, we filtered out duplicated samples and low-quality samples in which fewer than 50% of reads passed quality control or the read mapping rate was below 80%. After quality control, 10,594 samples, comprising 9,931 tumour samples and 663 normal samples, were included in our study (Table 1; Figure 1b, left panel).
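The filtering step described above can be sketched as follows. This is a minimal illustration rather than the pipeline used to build tRic: the record keys (`sample_id`, `qc_passed_fraction`, `mapped_fraction`) are hypothetical, and keeping only the first record of a duplicated sample is an assumption made for the example.

```python
from typing import Dict, List

# Thresholds from the text: drop samples with QC-passed reads < 50%
# or a read mapping rate < 80%.
MIN_QC_PASSED = 0.50
MIN_MAPPED = 0.80


def filter_samples(samples: List[Dict]) -> List[Dict]:
    """Return the records that pass QC, with at most one record per sample.

    Each record is a dict with the hypothetical keys 'sample_id',
    'qc_passed_fraction' and 'mapped_fraction' (fractions in [0, 1]).
    """
    seen = set()
    kept = []
    for record in samples:
        if record["sample_id"] in seen:
            continue  # duplicate: only the first record is considered (assumption)
        seen.add(record["sample_id"])
        if record["qc_passed_fraction"] < MIN_QC_PASSED:
            continue  # too few reads passed quality control
        if record["mapped_fraction"] < MIN_MAPPED:
            continue  # mapping rate too low
        kept.append(record)
    return kept
```

A sample that sits exactly on a threshold (50% or 80%) is retained, because the text describes the exclusion criteria with strict inequalities.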

Table 1. Summary of tRic data for each cancer type.

We quantified tRNA expression levels as previously described [Citation33]. In brief, we downloaded tRNA annotations from the UCSC Genome Browser (http://hgdownload.soe.ucsc.edu/) and filtered out those without clear anticodon and amino acid information. In total, we collected 604 tRNAs decoding 52 anticodons (codons) and 21 amino acids. We then mapped TCGA miRNA-seq reads to the tRNA annotations and normalized tRNA expression using the trimmed mean of M values (TMM) method [Citation34,Citation35]. We defined tRNAs with relatively high expression (average TMM > 1) as detectable tRNAs. These tRNAs were categorized into 52 codon groups and 21 amino acid groups according to their codon and amino acid information (Figure 1b, middle panel).
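The grouping of detectable tRNAs into codon and amino acid groups can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the text does not say how expression is merged within a group, so summing per sample is an assumption, and the function names are hypothetical. The input is assumed to be already normalized, and tRNA names are assumed to follow the pattern seen in the text (for example, tRNA-Arg-TCG-5-1).

```python
from statistics import mean
from typing import Dict, List, Tuple


def parse_trna_name(name: str) -> Tuple[str, str]:
    """Split a name such as 'tRNA-Arg-TCG-5-1' into (amino_acid, anticodon)."""
    parts = name.split("-")
    return parts[1], parts[2]


def merge_expression(
    expr: Dict[str, List[float]], min_mean: float = 1.0
) -> Tuple[Dict[str, List[float]], Dict[str, List[float]]]:
    """Aggregate per-tRNA expression (one value per sample) into groups.

    tRNAs whose mean expression is not above `min_mean` are dropped first
    (the 'detectable' filter from the text). The remaining values are then
    summed per sample within each anticodon group and each amino acid group
    (summing is an assumption of this sketch).
    """
    by_anticodon: Dict[str, List[float]] = {}
    by_amino_acid: Dict[str, List[float]] = {}
    for name, values in expr.items():
        if mean(values) <= min_mean:
            continue  # not detectable
        amino_acid, anticodon = parse_trna_name(name)
        targets = ((f"{amino_acid}({anticodon})", by_anticodon),
                   (amino_acid, by_amino_acid))
        for key, table in targets:
            if key in table:
                table[key] = [a + b for a, b in zip(table[key], values)]
            else:
                table[key] = list(values)
    return by_anticodon, by_amino_acid
```

Keying the codon-level table by both amino acid and anticodon, as in `Arg(TCG)`, mirrors the labels used in the text and keeps the amino acid visible at the codon level.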

Estimation of codon frequency and amino acid frequency

The human coding sequences with complete open reading frames were downloaded from the Ensembl database (www.ensembl.org/). For each coding gene, we estimated the frequency of each codon and each amino acid from its sequence. At the codon level, we counted the total number of codons (N) and the number of occurrences of each specific codon (n). The codon frequency was calculated as n divided by N. We used a similar approach to calculate the amino acid frequency (Figure 1b, right panel).
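As a worked illustration of this calculation, the sketch below computes the frequency of each codon as its count divided by the total number of codons, and does the same for amino acids. It is a minimal example, not the code behind tRic: the function names are hypothetical, the standard genetic code is assumed, and excluding stop codons from the amino acid total is an assumption of this sketch.

```python
from collections import Counter
from itertools import product
from typing import Dict, List

# Standard genetic code, with codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def split_codons(cds: str) -> List[str]:
    """Split a coding sequence into codons; its length must be a multiple of 3."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def codon_frequency(cds: str) -> Dict[str, float]:
    """Return n / N for each codon (n = codon count, N = total codons)."""
    codons = split_codons(cds)
    total = len(codons)
    return {codon: n / total for codon, n in Counter(codons).items()}


def amino_acid_frequency(cds: str) -> Dict[str, float]:
    """Return the fraction of each amino acid, ignoring stop codons and
    codons that are not in the standard table (assumption of this sketch)."""
    residues = [
        CODON_TABLE[c]
        for c in split_codons(cds)
        if c in CODON_TABLE and CODON_TABLE[c] != "*"
    ]
    total = len(residues)
    return {aa: n / total for aa, n in Counter(residues).items()}
```

For the toy sequence `ATGCGTCGTAGATAA` (Met, Arg, Arg, Arg, stop), the codon CGT accounts for two of five codons, and Arg accounts for three of four residues.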

Statistical analyses

All statistical tests were performed using R. We used the Student’s t-test to examine the differential expression between tumour and normal samples. The analysis of variance test was used to test differentially expressed tRNAs among different stages, subtypes, grades, and smoking history groups. The univariate Cox model was used to test if tRNA expression correlated with patient survival.
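For readers unfamiliar with the tumour versus normal comparison, the sketch below computes the statistic of an unpaired, pooled-variance Student's t-test using only the Python standard library. It is an illustration, not the analysis code: the authors performed their tests in R, the text does not state whether the test was paired or unpaired, and the function name is hypothetical. Converting the statistic to a p-value requires the t distribution and is omitted here.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def student_t_statistic(
    group_a: Sequence[float], group_b: Sequence[float]
) -> Tuple[float, int]:
    """Return (t, degrees_of_freedom) for an unpaired Student's t-test.

    Assumes equal variances in the two groups (pooled-variance form).
    Each group needs at least two observations.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    dof = n_a + n_b - 2
    # Pooled estimate of the common variance.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / dof
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / std_err
    return t, dof
```

With toy expression values of 5, 6, 7 in tumour samples and 1, 2, 3 in normal samples, the group means differ by 4, the pooled variance is 1, and the test has 4 degrees of freedom.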

Authors’ contributions

L.H. conceived and supervised the project. Z.Z., Y.Y., C-J.L., H.R., J.G., L.D., A-Y.G., and L.H. performed the analyses. Z.Z., H.R., C-J.L., and A-Y.G. developed the database. Z.Z., H.R., L.D., and L.H. wrote the manuscript with input from all other authors.


Acknowledgments

This work was supported by the Cancer Prevention & Research Institute of Texas (RR150085) to CPRIT Scholar in Cancer Research (L.H.); UTHealth Innovation for Cancer Prevention Research Training Program Post-doctoral Fellowship (Cancer Prevention and Research Institute of Texas, RP160015); China Postdoctoral Science Foundation (2019M652623 to C-J. Liu); National Natural Science Foundation of China (31822030 and 31771458 to A-Y. Guo). We gratefully acknowledge contributions from TCGA Research Network. We thank LeeAnn Chastain for editorial assistance.

Disclosure statement

No potential conflict of interest was reported by the authors.

Supplementary material

Supplemental data for this article can be accessed here

Additional information

Funding

This work was supported by the Cancer Prevention & Research Institute of Texas (RR150085) to CPRIT Scholar in Cancer Research.
