A novel prediction method for protein DNA-binding residues based on neighboring residue correlations

Pages 865-877 | Received 28 Apr 2022, Accepted 05 Sep 2022, Published online: 27 Oct 2022
