
iDHS-DMCAC: identifying DNase I hypersensitive sites with balanced dinucleotide-based detrending moving-average cross-correlation coefficient

Pages 429-445 | Received 12 Mar 2019, Accepted 02 May 2019, Published online: 23 May 2019

References

  • C. Wu, P.M. Bingham, K.J. Livak, R. Holmgren, and S.C.R. Elgin, The chromatin structure of specific genes: I. Evidence for higher order domains of defined DNA sequence, Cell 16 (1979), pp. 797–806.
  • D.S. Gross and W.T. Garrard, Nuclease hypersensitive sites in chromatin, Annu. Rev. Biochem. 57 (1988), pp. 159–197. doi:10.1146/annurev.bi.57.070188.001111.
  • G. Felsenfeld, Chromatin as an essential part of the transcriptional mechanism, Nature 355 (1992), pp. 219–224. doi:10.1038/355219a0.
  • G. Felsenfeld and M. Groudine, Controlling the double helix, Nature 421 (2003), pp. 448–453. doi:10.1038/nature01411.
  • G.E. Crawford, I.E. Holt, J. Whittle, B.D. Webb, D. Tai, S. Davis, E.H. Margulies, Y. Chen, J.A. Bernat, and D. Ginsburg, Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS), Genome Res. 16 (2006), pp. 123–131. doi:10.1101/gr.4074106.
  • L. Song and G.E. Crawford, DNase-seq: A high-resolution technique for mapping active gene regulatory elements across the genome from mammalian cells, Cold Spring Harbor Protoc. (2010), pp. pdb.prot5384. doi:10.1101/pdb.prot5384.
  • P. Madrigal and P. Krajewski, Current bioinformatic approaches to identify DNase I hypersensitive sites and genomic footprints from DNase-seq data, Front. Genet. 3 (2012), pp. 230. doi:10.3389/fgene.2012.00230.
  • W.S. Noble, S. Kuehn, R. Thurman, M. Yu, and J. Stamatoyannopoulos, Predicting the in vivo signature of human gene regulatory sequences, Bioinformatics 21 (2005), pp. i338–i343. doi:10.1093/bioinformatics/bti1047.
  • P.M. Feng, N. Jiang, and N. Liu, Prediction of DNase I hypersensitive sites by using pseudo nucleotide compositions, Sci. World J. 2014 (2014), pp. 740506.
  • B. Liu, R. Long, and K.C. Chou, iDHS-EL: Identifying DNase I hypersensitive sites by fusing three different modes of pseudo nucleotide composition into an ensemble learning framework, Bioinformatics 32 (2016), pp. 2411–2418. doi:10.1093/bioinformatics/btw186.
  • Z.C. Xu, S.Y. Jiang, W.R. Qiu, Y.C. Liu, and X. Xiao, iDHSs-PseTNC: Identifying DNase I hypersensitive sites with pseudo trinucleotide component by deep sparse auto-encoder, Lett. Org. Chem. 14 (2017), pp. 655–664.
  • B. Manavalan, T.H. Shin, and G. Lee, DHSpred: Support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest, Oncotarget 9 (2017), pp. 1944–1956. doi:10.18632/oncotarget.23099.
  • S.X. Zhang, Z.P. Zhou, X.M. Chen, Y. Hu, and L.D. Yang, PDHS-SVM: A prediction method for plant DNase I hypersensitive sites based on support vector machine, J. Theor. Biol. 426 (2017), pp. 126–133. doi:10.1016/j.jtbi.2017.05.030.
  • S.X. Zhang, M.J. Chang, Z.P. Zhou, X.F. Dai, and Z.H. Xu, pDHS-ELM: Computational predictor for plant DNase I hypersensitive sites based on extreme learning machines, Mol. Genet. Genomics 293 (2018), pp. 1035–1049. doi:10.1007/s00438-018-1436-3.
  • S.X. Zhang, W.C. Zhuang, and Z.H. Xu, Prediction of DNase I hypersensitive sites in plant genome using multiple modes of pseudo components, Anal. Biochem. 549 (2018), pp. 149–156.
  • S.X. Zhang, J.H. Li, L. Su, and Z.P. Zhou, pDHS-DSET: Prediction of DNase I hypersensitive sites in plant genome using DS evidence theory, Anal. Biochem. 564 (2019), pp. 54–63.
  • J. Hu, X. He, D.J. Yu, X.B. Yang, J.Y. Yang, and H.B. Shen, A new supervised over-sampling algorithm with application to protein-nucleotide binding residue prediction, PLoS ONE 9 (2014), pp. e107676. doi:10.1371/journal.pone.0107676.
  • M. Kabir, M. Arif, S. Ahmad, Z. Ali, Z.N.K. Swati, and D.J. Yu, Intelligent computational method for discrimination of anticancer peptides by incorporating sequential and evolutionary profiles information, Chemom. Intell. Lab. Syst. 182 (2018), pp. 158–165. doi:10.1016/j.chemolab.2018.09.007.
  • M. Khan, M. Hayat, S.A. Khan, S. Ahmad, and N. Iqbal, Bi-PSSM: Position specific scoring matrix based intelligent computational model for identification of mycobacterial membrane proteins, J. Theor. Biol. 435 (2017), pp. 116–124. doi:10.1016/j.jtbi.2017.09.013.
  • X. Xiao, M.J. Hui, and Z. Liu, IAFP-Ense: An ensemble classifier for identifying antifreeze protein by incorporating grey model and PSSM into PseAAC, J. Membr. Biol. 249 (2016), pp. 845–854. doi:10.1007/s00232-016-9935-9.
  • M. Kabir, S. Ahmad, M. Iqbal, Z.N.K. Swati, Z. Liu, and D.J. Yu, Improving prediction of extracellular matrix proteins using evolutionary information via a grey system model and asymmetric under-sampling technique, Chemom. Intell. Lab. Syst. 174 (2018), pp. 22–32. doi:10.1016/j.chemolab.2018.01.004.
  • K.C. Chou and H.B. Shen, Recent progress in protein subcellular location prediction, Anal. Biochem. 370 (2007), pp. 1–16.
  • W. Chen, P.M. Feng, H. Lin, and K.C. Chou, iRSpot-PseDNC: Identify recombination spots with pseudo dinucleotide composition, Nucleic Acids Res. 41 (2013), pp. e68. doi:10.1093/nar/gks1450.
  • S.H. Guo, E.Z. Deng, L.Q. Xu, H. Ding, H. Lin, W. Chen, and K.C. Chou, iNuc-PseKNC: A sequence-based predictor for predicting nucleosome positioning in genomes with pseudo k-tuple nucleotide composition, Bioinformatics 30 (2014), pp. 1522–1529. doi:10.1093/bioinformatics/btu083.
  • W. Chen, P.M. Feng, H. Lin, and K.C. Chou, iSS-PseDNC: Identifying splicing sites using pseudo dinucleotide composition, BioMed. Res. Int. 2014 (2014), pp. 623149.
  • H. Lin, E.Z. Deng, H. Ding, W. Chen, and K.C. Chou, iPro54-PseKNC: A sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition, Nucleic Acids Res. 42 (2014), pp. 12961–12972. doi:10.1093/nar/gku1019.
  • W. Chen, H. Lin, and K.C. Chou, Pseudo nucleotide composition or PseKNC: An effective formulation for analyzing genomic sequences, Mol. Biosyst. 11 (2015), pp. 2620–2634. doi:10.1039/C5MB00155B.
  • B. Liu, F. Yang, D.S. Huang, and K.C. Chou, iPromoter-2L: A two-layer predictor for identifying promoters and their types by multi-window-based PseKNC, Bioinformatics 34 (2018), pp. 33–40. doi:10.1093/bioinformatics/btx579.
  • B. Liu, BioSeq-Analysis: A platform for DNA, RNA and protein sequence analysis based on machine learning approaches, Brief. Bioinf. (2017). doi:10.1093/bib/bbx165.
  • B. Liu, F. Liu, X. Wang, J. Chen, L. Fang, and K.C. Chou, Pse-in-One: A web server for generating various modes of pseudo components of DNA, RNA, and protein sequences, Nucleic Acids Res. 43 (2015), pp. W65–W71. doi:10.1093/nar/gkv458.
  • B. Liu, H. Wu, and K.C. Chou, Pse-in-One 2.0: An improved package of web servers for generating various modes of pseudo components of DNA, RNA, and protein sequences, Nat. Sci. 4 (2017), pp. 67–91.
  • Q. Dong, S. Zhou, and J. Guan, A new taxonomy-based protein fold recognition approach based on autocross-covariance transformation, Bioinformatics 25 (2009), pp. 2655–2662. doi:10.1093/bioinformatics/btp500.
  • Y. Guo, L. Yu, Z. Wen, and M. Li, Using support vector machine combined with auto covariance to predict protein–protein interactions from protein sequences, Nucleic Acids Res. 36 (2008), pp. 3025–3030. doi:10.1093/nar/gkn159.
  • B. Liu, S.Y. Wang, R. Long, and K.C. Chou, iRSpot-EL: Identify recombination spots with an ensemble learning approach, Bioinformatics 33 (2017), pp. 35–41. doi:10.1093/bioinformatics/btw539.
  • G.Q. Liu, Y.Q. Xing, and L. Cai, Using weighted features to predict recombination hotspots in Saccharomyces cerevisiae, J. Theor. Biol. 382 (2015), pp. 15–22. doi:10.1016/j.jtbi.2015.06.030.
  • M.Y. Tolstorukov, A.V. Colasanti, D. McCandlish, W.K. Olson, and V.B. Zhurkin, A novel ‘Roll-and-Slide’ mechanism of DNA folding in chromatin: Implications for nucleosome positioning, J. Mol. Biol. 371 (2007), pp. 725–738. doi:10.1016/j.jmb.2007.05.048.
  • J.R. Goni, A. Perez, D. Torrents, and M. Orozco, Determining promoter location based on DNA structure first-principles calculations, Genome Biol. 8 (2007), pp. R263. doi:10.1186/gb-2007-8-5-r81.
  • Z. Ignatova, I. Martinez-Perez, and K.H. Zimmermann, DNA Computing Models, Springer, New York, 2008.
  • L.C. Zhang and L. Kong, iRSpot-ADPM: Identify recombination spots by incorporating the associated dinucleotide product model into Chou’s pseudo components, J. Theor. Biol. 441 (2018), pp. 1–8. doi:10.1016/j.jtbi.2017.12.025.
  • N. Vandewalle and M. Ausloos, Crossing of two mobile averages: A method for measuring the roughness exponent, Phys. Rev. E 58 (1998), pp. 6832–6834.
  • E. Alessio, A. Carbone, G. Castelli, and V. Frappietro, Second-order moving average and scaling of stochastic time series, Eur. Phys. J. B 27 (2002), pp. 197–200.
  • S. Arianos and A. Carbone, Cross-correlation of long-range correlated series, J. Stat. Mech. 3 (2009), pp. P03037.
  • B. Podobnik and H.E. Stanley, Detrended cross-correlation analysis: A new method for analyzing two nonstationary time series, Phys. Rev. Lett. 100 (2008), pp. 084102. doi:10.1103/PhysRevLett.100.084102.
  • L. Kristoufek, Measuring correlations between non-stationary series with DCCA coefficient, Physica A 402 (2014), pp. 291–298. doi:10.1016/j.physa.2014.01.058.
  • G.F. Zebende, DCCA cross-correlation coefficient: Quantifying level of cross-correlation, Physica A 390 (2011), pp. 614–618. doi:10.1016/j.physa.2010.10.022.
  • Z.H. Zhou and X.Y. Liu, Training cost-sensitive neural networks with methods addressing the class imbalance problem, IEEE Trans. Knowl. Data Eng. 18 (2006), pp. 63–67. doi:10.1109/TKDE.2006.17.
  • S.L. Zhang and X. Duan, Prediction of protein subcellular localization with oversampling approach and Chou’s general PseAAC, J. Theor. Biol. 437 (2018), pp. 239–250. doi:10.1016/j.jtbi.2017.10.030.
  • V. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
  • C. Huang and J.Q. Yuan, Using radial basis function on the general form of Chou’s pseudo amino acid composition and PSSM to predict subcellular locations of proteins with both single and multiple sites, BioSystems 113 (2013), pp. 50–57. doi:10.1016/j.biosystems.2013.04.005.
  • A. Dehzangi, R. Heffernan, and A. Sharma, Gram-positive and Gram-negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou’s general PseAAC, J. Theor. Biol. 364 (2015), pp. 284–294. doi:10.1016/j.jtbi.2014.09.029.
  • L. Zhang, B. Liao, D.C. Li, and W. Zhu, A novel representation for apoptosis protein subcellular localization prediction using support vector machine, J. Theor. Biol. 259 (2009), pp. 361–365. doi:10.1016/j.jtbi.2009.03.025.
  • J.R. Wang, C. Wang, and J.J. Cao, Prediction of protein structural classes for low-similarity sequences using reduced PSSM and position-based secondary structural features, Gene 554 (2015), pp. 241–248. doi:10.1016/j.gene.2014.10.037.
  • Y. Xu, Y.X. Ding, J. Ding, Y.H. Lei, L.Y. Wu, and N.Y. Deng, iSuc-PseAAC: Predicting lysine succinylation in proteins by incorporating peptide position specific propensity, Sci. Rep. 5 (2015), pp. 10184.
  • T.G. Liu, X.Q. Zheng, and J. Wang, Prediction of protein structural class for low-similarity sequences using support vector machine and PSI-BLAST profile, Biochimie 92 (2010), pp. 1330–1334. doi:10.1016/j.biochi.2010.06.013.
  • Y.C. Dou, B. Yao, and C. Zhang, PhosphoSVM: Prediction of phosphorylation sites by integrating various protein sequence attributes with a support vector machine, Amino Acids 46 (2014), pp. 1459–1469. doi:10.1007/s00726-014-1711-5.
  • C.C. Chang and C.J. Lin, LIBSVM: A library for support vector machines, ACM Trans. Intell. Syst. Technol. 2 (2011), pp. 1–27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
  • R. Su, J. Hu, Q. Zou, B. Manavalan, and L. Wei, Empirical comparison and analysis of web-based cell-penetrating peptide prediction tools, Brief. Bioinf. (2019). doi:10.1093/bib/bby124.
  • B. Manavalan, S. Basith, T.H. Shin, L. Wei, and G. Lee, mAHTPred: A sequence-based meta-predictor for improving the prediction of anti-hypertensive peptides using effective feature representation, Bioinformatics (2018). doi:10.1093/bioinformatics/bty1047.
  • J.H. Jia, Z. Liu, X. Xiao, B.X. Liu, and K.C. Chou, iPPI-Esml: An ensemble classifier for identifying the interactions of proteins by incorporating their physicochemical properties and wavelet transforms into PseAAC, J. Theor. Biol. 377 (2015), pp. 47–56. doi:10.1016/j.jtbi.2015.04.011.
  • S. Basith, B. Manavalan, T.H. Shin, and G. Lee, iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree, Comput. Struct. Biotechnol. J. 16 (2018), pp. 412–420.
  • B. Manavalan, R.G. Govindaraj, T.H. Shin, M.O. Kim, and G. Lee, iBCE-EL: A new ensemble learning framework for improved linear B-cell epitope prediction, Front. Immunol. 9 (2018), pp. 1695.
  • G.L. Fan and Q.Z. Li, Predict mycobacterial proteins subcellular locations by incorporating pseudo-average chemical shift into the general form of Chou’s pseudo amino acid composition, J. Theor. Biol. 304 (2012), pp. 88–95. doi:10.1016/j.jtbi.2012.03.017.
  • J.Y. Yang and X. Chen, Improving taxonomy-based protein fold recognition by using global and local features, Proteins 79 (2011), pp. 2053–2064. doi:10.1002/prot.23025.
  • S.Y. Ding and S.L. Zhang, A Gram-negative bacterial secreted protein types prediction method based on PSI-BLAST profile, BioMed. Res. Int. 2016 (2016), pp. 3206741.
  • M. Kabir and D.J. Yu, Predicting DNase I hypersensitive sites via un-biased pseudo trinucleotide composition, Chemom. Intell. Lab. Syst. 167 (2017), pp. 78–84. doi:10.1016/j.chemolab.2017.05.001.
  • B.Q. Liu, Y.M. Liu, X.P. Jin, X.L. Wang, and B. Liu, iRSpot-DACC: A computational predictor for recombination hot/cold spots identification based on dinucleotide-based auto-cross covariance, Sci. Rep. 6 (2016), pp. 33483.
  • W.C. Li, E.Z. Deng, H. Ding, W. Chen, and H. Lin, iORI-PseKNC: A predictor for identifying origin of replication with pseudo k-tuple nucleotide composition, Chemom. Intell. Lab. Syst. 141 (2015), pp. 100–106. doi:10.1016/j.chemolab.2014.12.011.
  • K.C. Chou and H.B. Shen, Review: Recent advances in developing web-servers for predicting protein attributes, Nat. Sci. 1 (2009), pp. 63–92.
  • K.C. Chou, An unprecedented revolution in medicinal chemistry driven by the progress of biological science, Curr. Top. Med. Chem. 17 (2017), pp. 2337–2358. doi:10.2174/1568026617666170414145508.
