RNA Biology Volume 19, 2022 - Issue 1

Research Paper

sRNARFTarget: a fast machine-learning-based approach for transcriptome-wide sRNA target prediction

Kratika Naskulwar (a), Lourdes Peña-Castillo (a, b)

(a) Department of Computer Science, Memorial University of Newfoundland, St. John’s, Canada; (b) Department of Biology, Memorial University of Newfoundland, St. John’s, Canada

Corresponding author: Lourdes Peña-Castillo, https://orcid.org/0000-0002-0643-2547

Pages 44-54 | Received 06 Mar 2021, Accepted 22 Nov 2021, Published online: 29 Dec 2021

https://doi.org/10.1080/15476286.2021.2012058
Figures & data

Table 4. Final benchmarking dataset used for all three programs. The table lists the genome accession used, the number of sRNAs, the number of mRNAs, the number of confirmed interacting pairs (P), and the number of pairs considered non-interacting (N) per bacterial species (from top to bottom: E. coli, Synechocystis and P. multocida)

Table 5. 10-fold cross-validation (CV) AUROC for the best model per classifier trained on sequence-derived features (trinucleotide frequency difference and tetranucleotide frequency difference) of 1490 sRNA-mRNA pairs
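
The caption of Table 5 names the features but not how they are computed. The following is a minimal sketch of one plausible reading: the relative frequency of every k-mer in the sRNA minus the relative frequency of the same k-mer in the mRNA. The function names, the sign convention, the use of overlapping windows and the handling of ambiguous bases are all assumptions made for illustration, not details taken from the paper.

```python
from itertools import product


def kmer_frequencies(seq, k):
    """Relative frequency of each k-mer over ACGT, using overlapping windows.

    Assumption: U is treated as T and windows containing other
    characters (for example N) are skipped.
    """
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    # Sequences shorter than k yield an all-zero vector.
    return [counts[km] / total if total else 0.0 for km in kmers]


def frequency_difference(srna, mrna, k=3):
    """Element-wise difference (sRNA minus mRNA) of k-mer frequencies.

    Assumption: the sign convention is hypothetical.
    """
    f_srna = kmer_frequencies(srna, k)
    f_mrna = kmer_frequencies(mrna, k)
    return [a - b for a, b in zip(f_srna, f_mrna)]
```

With k=3 this gives a 64-dimensional vector per sRNA-mRNA pair and with k=4 a 256-dimensional one, which is the kind of fixed-length input a classifier such as a random forest expects.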

Figure 1. ROC curve for the three programs on Escherichia coli data. The plot shows the sensitivity (also called recall or true positive rate) as a function of the false-positive rate (FPR). The dashed line indicates random classifier performance.

Figure 2. ROC curve for the three programs on Synechocystis data. The plot shows the sensitivity (also called recall or true positive rate) as a function of the false-positive rate (FPR). The dashed line indicates random classifier performance.

Figure 3. ROC curve for the three programs on Pasteurella multocida data. The plot shows the sensitivity (also called recall or true positive rate) as a function of the false-positive rate (FPR). The dashed line indicates random classifier performance.
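
As a reminder of what these plots and the AUROC values summarize, here is a small pure-Python sketch that derives ROC points and the area under the curve from labels and prediction scores. It is illustrative only: the function names are hypothetical, and it assumes that a higher score means a more confident "interacting" prediction, which may require flipping the sign of scores for programs that report energies.

```python
def roc_points(labels, scores):
    """(FPR, TPR) points obtained by lowering the score threshold step by step.

    labels: 1 for confirmed interacting pairs, 0 for the others.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. This equals the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A random classifier scores about 0.5 under this measure, which corresponds to the diagonal reference line in the figures.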

Table 6. AUROC obtained on each bacterial species included in the benchmark for all three programs assessed

Figure 4. Rank (lower = better) distribution of 102 Escherichia coli confirmed interacting pairs. The violin plot for each program shows the data density for different rank values and the horizontal line inside each box indicates the median rank of confirmed interacting pairs.

Figure 5. Rank (lower = better) distribution of 22 Synechocystis confirmed interacting pairs. The violin plot for each program shows the data density for different rank values and the horizontal line inside each box indicates the median rank of confirmed interacting pairs.

Figure 6. Rank (lower = better) distribution of 20 Pasteurella multocida confirmed interacting pairs. The violin plot for each program shows the data density for different rank values and the horizontal line inside each box indicates the median rank of confirmed interacting pairs.
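
The ranks shown in these violin plots can be thought of as the position of each confirmed pair in a program's sorted prediction list. A minimal sketch follows, assuming each program yields one score per sRNA-mRNA pair with higher meaning more likely to interact. The names and the tie-breaking rule (input order) are hypothetical choices for illustration, not the authors' procedure.

```python
from statistics import median


def ranks_of_confirmed(scores, confirmed):
    """Rank (1 = top prediction) of each confirmed pair.

    scores: dict mapping (sRNA, mRNA) to a prediction score.
    confirmed: iterable of (sRNA, mRNA) pairs known to interact.
    Ties keep the dict's insertion order because sorted() is stable.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    position = {pair: i + 1 for i, pair in enumerate(ordered)}
    return [position[pair] for pair in confirmed]


# Toy data with made-up identifiers and scores.
toy_scores = {
    ("s1", "m1"): 0.9,
    ("s1", "m2"): 0.2,
    ("s1", "m3"): 0.6,
    ("s1", "m4"): 0.4,
}
toy_confirmed = [("s1", "m1"), ("s1", "m4")]
toy_ranks = ranks_of_confirmed(toy_scores, toy_confirmed)
toy_median_rank = median(toy_ranks)
```

The median of the returned ranks is the quantity marked by the horizontal line in each plot.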

Figure 7. Percentage of Escherichia coli confirmed interacting sRNA-mRNA pairs (recall) as a function of percentage top predicted interacting pairs.

Figure 8. Percentage of Synechocystis confirmed interacting sRNA-mRNA pairs (recall) as a function of percentage top predicted interacting pairs.

Figure 9. Percentage of Pasteurella multocida confirmed interacting sRNA-mRNA pairs (recall) as a function of percentage top predicted interacting pairs.
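
The curves in Figures 7 to 9 plot recall against the fraction of top-ranked predictions retained. One way to compute a single point on such a curve is sketched below. The function name, the rounding of the cutoff (ceiling) and the assumption that higher scores are better are illustrative choices and not taken from the paper.

```python
import math


def recall_at_top_percent(scores, confirmed, percent):
    """Percentage of confirmed pairs found among the top `percent` of predictions.

    scores: dict mapping (sRNA, mRNA) to a score, higher = more confident.
    confirmed: list of (sRNA, mRNA) pairs known to interact.
    """
    if not confirmed:
        raise ValueError("confirmed must not be empty")
    ordered = sorted(scores, key=scores.get, reverse=True)
    cutoff = math.ceil(len(ordered) * percent / 100)
    top = set(ordered[:cutoff])
    hits = sum(1 for pair in confirmed if pair in top)
    return 100.0 * hits / len(confirmed)
```

Evaluating this function over a grid of percentages (for example 1 to 100) yields the full curve for one program.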

Figure 10. ROC curve for sRNARFTarget and IntaRNA on E. coli and Salmonella data. The plot shows the sensitivity (also called recall or true positive rate) as a function of the false-positive rate (FPR). The dashed line indicates random classifier performance.

Figure 11. Rank (lower = better) distribution of 119 E. coli and Salmonella confirmed interacting pairs. The violin plot for each program shows the data density for different rank values and the horizontal line inside each box indicates the median rank of confirmed interacting pairs.

Table 7. Execution time for sRNARFTarget and IntaRNA on benchmarking data. Both programs were run on a computer with an Intel Core i7 (2.2 GHz, 4 cores) and 16 GB of RAM

Table 8. CopraRNA web server job execution time on selected sRNA for each bacterium on the benchmark data

Table 9. Sporulation-associated genes in sRNARFTarget top 10% predicted RCd1 targets. Smaller ranks indicate higher confidence of sRNARFTarget in the corresponding target prediction

Table 1. Studies from which we collected sRNA-mRNA interacting pairs

Table 2. Training and benchmarking data characteristics

Table 3. Parameters per ML method used for grid-search CV
