An efficient protein homology detection approach based on seq2seq model and ranking

Pages 633-640 | Received 12 Oct 2020, Accepted 15 Feb 2021, Published online: 25 Apr 2021
